Have Hierarchical Sprite Tree, Need To Transform Screen X,Y To Determine Which Sprite

Member
Posts: 34
Joined: 2009.01
Post: #1
I'm writing a 2D sprite engine based somewhat on Flash's movieclip system. For those unfamiliar with Flash, it allows you to create a nested, hierarchical structure of sprites/movieclips. A movieclip has location (x,y), rotation, and scale properties (among others), and can be a drawing "primitive" like an image or geometric object. A movieclip can also be a container for one or more other movieclips, such that when the "parent" clip moves, all of the child clips move in concert. I'm using OpenGL ES (iPhone app) and I've got my sprite tree rendering correctly.

I now need to add my touch detection code to check if a screen (x,y) position is touching any sprite in the hierarchy. I'm not sure what the best approach should be.

I think what I want to do is take the screen touch (x,y) and transform it from screen coordinates into the coordinate space of each sprite I want to test against. Can anyone recommend a good solution? It's been so long since I first learned the maths behind 3D transformation that I'm considerably rusty (not that I ever fully understood it...).

Should I be looking at leveraging OpenGL to help me accomplish this? My drawing code obviously has a proper transformation matrix for each sprite as I draw it. Should I be looking to take the inverse of this matrix to transform the screen (X,Y) coordinate? I've read that computing an inverse matrix can be expensive, so I think I may want to avoid having to do this.

My drawing code systematically builds up the modelview matrix as I traverse my sprite tree. Is it possible to write similar code to systematically build up the inverse matrix (or whatever matrix is needed to transform the screen (x,y) into the sprite's coordinate space)?

BTW, I'm not looking for anyone to "post the codez" for me. I'm genuinely interested in learning and understanding the logic behind WHY I should take a certain approach, not just how to solve the problem.
Moderator
Posts: 3,579
Joined: 2003.06
Post: #2
What I did was implement my own 2D matrix transforms and matrix stack instead of using OpenGL, so I have instant access to the transformed sprite. You can read the matrix of your transformed sprites from the GL, but it's probably inefficient if you're reading for a lot of them (although I don't know for sure, so do not quote me on this: it might require a round-trip to the hardware which would stall the pipeline). I don't do anything tricky beyond that -- just loop through all the potential sprites on screen which could be touched.
Member
Posts: 34
Joined: 2009.01
Post: #3
AnotherJake Wrote:What I did was implement my own 2D matrix transforms and matrix stack instead of using OpenGL, so I have instant access to the transformed sprite. You can read the matrix of your transformed sprites from the GL, but it's probably inefficient if you're reading for a lot of them (although I don't know for sure, so do not quote me on this: it might require a round-trip to the hardware which would stall the pipeline). I don't do anything tricky beyond that -- just loop through all the potential sprites on screen which could be touched.
Thanks for the extremely quick feedback.

I think I understand the approach you're describing. In addition to the properties (location, rotation, etc.) that are fed into the rendering pipeline, you maintain your own transformation matrix for each of your sprites for purposes of mouse (x,y) checking. I can see how I might implement something like this.

What is still unclear is how building these matrices eventually allows you to check a screen (x,y) position against your sprite. Is it as obvious as multiplying the sprite's transformation matrix by the screen (x,y)? That doesn't sound like it would yield the correct transformed coordinates. Intuitively, it seems that I'd want to do some sort of "inverse" operation to get screen (x,y) into each sprite's coordinate space. But maybe I'm just not seeing the obvious.
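
On the "inverse" intuition: for a 2D affine transform the inverse has a cheap closed form, and the inverse of parent * child is inverse(child) * inverse(parent), so it can be built up during tree traversal much like the forward matrix. The following is only a rough sketch of that route. The Affine2D type, the function names, and the assumption that a sprite's local bounds run from (0,0) to (w,h) are all hypothetical and not taken from anyone's engine in this thread.

```c
#include <stdbool.h>

/* Hypothetical 2D affine matrix (bottom row 0 0 1 is implied):
 *   | a  c  tx |
 *   | b  d  ty |  */
typedef struct { float a, b, c, d, tx, ty; } Affine2D;

/* Writes the inverse of m into *out.
 * Returns false if m is not invertible (for example, a scale of 0). */
static bool affine_invert(Affine2D m, Affine2D *out)
{
    float det = m.a * m.d - m.b * m.c;
    if (det == 0.0f)
        return false;

    float inv = 1.0f / det;
    out->a  =  m.d * inv;
    out->b  = -m.b * inv;
    out->c  = -m.c * inv;
    out->d  =  m.a * inv;
    out->tx = (m.c * m.ty - m.d * m.tx) * inv;
    out->ty = (m.b * m.tx - m.a * m.ty) * inv;
    return true;
}

/* Maps a screen point into the sprite's local space and tests it against
 * local bounds assumed to be [0, w] x [0, h]. 'world' is the sprite's
 * accumulated local-to-screen matrix. */
static bool sprite_contains_screen_point(Affine2D world, float w, float h,
                                         float sx, float sy)
{
    Affine2D inv;
    if (!affine_invert(world, &inv))
        return false;

    float lx = inv.a * sx + inv.c * sy + inv.tx;
    float ly = inv.b * sx + inv.d * sy + inv.ty;
    return lx >= 0.0f && lx <= w && ly >= 0.0f && ly <= h;
}
```

Because the test happens in the sprite's own space, rotation and scale need no special handling; the trade-off is one inversion per sprite tested.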
Moderator
Posts: 3,579
Joined: 2003.06
Post: #4
kalimba Wrote:I think I understand the approach you're describing. In addition to the properties (location, rotation, etc.) that are fed into the rendering pipeline, you maintain your own transformation matrix for each of your sprites for purposes of mouse (x,y) checking. ...
Actually, I bypass using the GL to do any of the transforms for me entirely.

[edit] I should say, all the model transforms. The GL still does the projection. [/edit]

Just before I submit my sprite (or other geometry) for drawing, via something like glDrawArrays, I call my custom function to transform all the vertices myself. That way the GL just gets vertices and does absolutely no model transforms for me. It's more efficient than the general-purpose, built-in 3D transform "convenience" functions in OpenGL (like glRotate, glTranslate, glScale) because I can optimize things on my end -- for instance, since it's 2D, I don't have to apply the extra multiplications for the z axis. It comes at the expense of having to write your own 2D transform code, but it's actually pretty easy and can be done in fewer than 200 lines of code. Plus it is an *excellent* exercise for better understanding graphics transforms and OpenGL. The extra bonus is that future versions of OpenGL won't have transform functions, so you'll eventually have to write your own anyway.

kalimba Wrote:What is still unclear is how building these matrices eventually allows you to check a screen (x,y) position against your sprite. Is it as obvious as multiplying the sprite's transformation matrix by the screen (x,y)? That doesn't sound like it would yield the correct transformed coordinates. Intuitively, it seems that I'd want to do some sort of "inverse" operation to get screen (x,y) into each sprite's coordinate space. But maybe I'm just not seeing the obvious.
You should be able to just grab the current modelview matrix and multiply it by your point to get a location on screen, then position a box there and compare it against the touch coordinate. (This assumes your projection maps directly to screen coordinates.) Something like this:

GLfloat matrix[16];
glGetFloatv(GL_MODELVIEW_MATRIX, matrix);

Then multiply an existing point with something like this (Mat4MultFloat3 is for a 3D vector, just copied it out of my mathlib):

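float x = 0.0f, y = 0.0f, z = 0.0f; /* point in the sprite's local space, e.g. its origin */
Mat4MultFloat3(matrix, &x, &y, &z); /* x, y, z now hold the transformed point */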

Code:
/* Transforms the point (*x, *y, *z) in place by the column-major 4x4
   matrix m (the OpenGL layout), treating the point as having w = 1. */
void Mat4MultFloat3(const float *m, float *x, float *y, float *z)
{
    /* Copy the inputs first, because the outputs overwrite them. */
    const float vx = *x;
    const float vy = *y;
    const float vz = *z;

    *x = (m[0] * vx) + (m[4] * vy) + (m[8]  * vz) + m[12];
    *y = (m[1] * vx) + (m[5] * vy) + (m[9]  * vz) + m[13];
    *z = (m[2] * vx) + (m[6] * vy) + (m[10] * vz) + m[14];
}
Member
Posts: 34
Joined: 2009.01
Post: #5
AnotherJake Wrote:You should be able to just grab the current modelview matrix and multiply it by your point to get a location on screen, then position a box there and compare it against the touch coordinate.
OK, I believe I understand how you're doing this now. You never translate the touch coordinate into the sprite's coordinate space. Instead, you take the same matrix that is used to transform the sprite to the screen and transform a single point (possibly the sprite origin) to yield the sprite's screen position, then anchor the sprite's bounding box at the screen location and compare against the touch coordinate.

My situation is a bit more complicated, in that my sprites can have a rotation, which means the collision bounding box is not necessarily axis-aligned once transformed to screen coordinates. I can just pass all 4 points of the bounding box through the transformation matrix, though. Once I have the bounding box in screen coordinates, checking if the touch is inside should be trivial.
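
One way the "inside a rotated box" check could be done, sketched under the assumption that the four corners have already been passed through the transformation matrix and are listed in order around the box. The Point2D type and function names are hypothetical. The test relies on the box being convex: the touch is inside if it lies on the same side of all four edges.

```c
#include <stdbool.h>

typedef struct { float x, y; } Point2D;

/* z component of the cross product (b - a) x (p - a).
 * Its sign says which side of the edge a->b the point p is on. */
static float cross_z(Point2D a, Point2D b, Point2D p)
{
    return (b.x - a.x) * (p.y - a.y) - (b.y - a.y) * (p.x - a.x);
}

/* True if p is inside or on the edge of the convex quad. The corners
 * must be in order around the quad; either winding direction works. */
static bool point_in_quad(const Point2D quad[4], Point2D p)
{
    bool has_neg = false, has_pos = false;

    for (int i = 0; i < 4; i++) {
        float c = cross_z(quad[i], quad[(i + 1) % 4], p);
        if (c < 0.0f) has_neg = true;
        if (c > 0.0f) has_pos = true;
    }
    /* Mixed signs mean p is outside at least one edge. */
    return !(has_neg && has_pos);
}
```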

Thanks for the help on this. Not only have you solved my problem, you've given me tremendous insight into how I might optimize my own sprite rendering code down the line.
Moderator
Posts: 3,579
Joined: 2003.06
Post: #6
kalimba Wrote:OK, I believe I understand how you're doing this now. You never translate the touch coordinate into the sprite's coordinate space. Instead, you take the same matrix that is used to transform the sprite to the screen and transform a single point (possibly the sprite origin) to yield the sprite's screen position, then anchor the sprite's bounding box at the screen location and compare against the touch coordinate.
Yep. All I gotta do is make sure it matches the viewport and projection, which is simple because I always have my viewport and projection set to screen coordinates in my 2D case. YMMV on the viewport and projection though, so you might have to think about it. Plus you need to make sure you're calcing against the touch coordinates you *think* you have -- landscape or portrait can throw a major wrench in what you think you're dealing with in terms of touch coords.

The best advice I can offer here is to go through the extra trouble and have a little test mode where you can draw your hit rect on screen using a line loop (or equivalent) for each sprite you're testing against, so you can see what you're actually trying to compare your touch coordinates against, instead of assuming your math is correct (or worse: guessing). While you're at it, be sure to print out the touch coordinates too, so you don't go insane.
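
A small sketch of how such a test mode might prepare a hit rect for drawing. The HitRect type and hit_rect_outline function are hypothetical; the GL calls in the trailing comment are standard OpenGL ES 1.1 entry points, but the surrounding state (current context, identity modelview, projection in screen coordinates) is an assumption.

```c
typedef struct { float min_x, min_y, max_x, max_y; } HitRect;

/* Writes the four corners of r into verts as x,y pairs, ordered so that
 * consecutive corners share an edge (the order GL_LINE_LOOP expects). */
static void hit_rect_outline(HitRect r, float verts[8])
{
    verts[0] = r.min_x;  verts[1] = r.min_y;
    verts[2] = r.max_x;  verts[3] = r.min_y;
    verts[4] = r.max_x;  verts[5] = r.max_y;
    verts[6] = r.min_x;  verts[7] = r.max_y;
}

/* Drawing the outline might then look like this:
 *
 *   float verts[8];
 *   hit_rect_outline(rect, verts);
 *   glDisable(GL_TEXTURE_2D);
 *   glEnableClientState(GL_VERTEX_ARRAY);
 *   glVertexPointer(2, GL_FLOAT, 0, verts);
 *   glDrawArrays(GL_LINE_LOOP, 0, 4);
 */
```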

kalimba Wrote:My situation is a bit more complicated, in that my sprites can have a rotation, which means the collision bounding box is not necessarily axis-aligned once transformed to screen coordinates. I can just pass all 4 points of the bounding box through the transformation matrix, though. Once I have the bounding box in screen coordinates, checking if the touch is inside should be trivial.
Be careful not to assume that a rotated box is somehow more important or accurate for touch comparisons. I've found that touches are generally better compared against crude boxes which approximate position. There are exceptions to this of course, and you can use multiple crude boxes for strangely shaped objects, or even a simple distance calculation (a hit circle). I've done all three, but I've never needed more than that (i.e. I've never needed non-AABBs for my touch hit rects, just possibly multiple AABBs or circles).
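
The three options mentioned above (one crude box, several crude boxes, or a hit circle) could be sketched roughly as follows. The HitRect and HitCircle types and the function names are hypothetical, and all coordinates are assumed to already be in the same space as the touch.

```c
#include <stdbool.h>

typedef struct { float min_x, min_y, max_x, max_y; } HitRect;
typedef struct { float cx, cy, radius; } HitCircle;

/* Axis-aligned box test: four comparisons, no transforms needed. */
static bool hit_rect_contains(HitRect r, float x, float y)
{
    return x >= r.min_x && x <= r.max_x && y >= r.min_y && y <= r.max_y;
}

/* Several crude boxes approximating an oddly shaped object. */
static bool any_hit_rect_contains(const HitRect *rects, int count,
                                  float x, float y)
{
    for (int i = 0; i < count; i++) {
        if (hit_rect_contains(rects[i], x, y))
            return true;
    }
    return false;
}

/* Hit circle: compare squared distances to avoid a square root. */
static bool hit_circle_contains(HitCircle c, float x, float y)
{
    float dx = x - c.cx;
    float dy = y - c.cy;
    return dx * dx + dy * dy <= c.radius * c.radius;
}
```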

But yes, you are correct, you may very well need to further transform your sprite coords with the resultant matrix (which is trivial, really). I do too because my sprites do translate, rotate, *and* scale. Scale is much more important with hit rects than rotation (as I already mentioned, rotation is rather unimportant for hit calcs from what I've seen).

Just as a note: My last project using this stuff was recreating Flash animations. ;)

It sounds like you're on the right track. Good luck!