How to handle entity rendering
Frank C. Wrote:It's more math under the hood, but the implementation makes more sense (to me anyway) and I haven't run into any performance problems; there are plenty of other (bigger) bottlenecks to worry about. I just like how quaternions always seem to "know" the best route to take when interpolating/combining rotations; angles can be fragile.
Well, whatever works for you I guess. [shrugs]
As you are no doubt fully aware, the main point of quaternions is to add a fourth mathematical dimension for doing rotations in three dimensions, since the third rotation is affected by the first two, creating gimbal lock. Adding the fourth "hyper" dimension, in effect, allows the math to uncouple the third rotation from the first two. If you're just doing 2D then, practically speaking, you're only rotating on *one* axis, not even two, which, as warmi points out, makes it a simple scalar. So using four-dimensional math for a single-dimensional calculation certainly seems like overkill.
I put together a little batch of 2D matrix transform functions to share. I haven't tested it in this form, but I expect it'll work since it's essentially what I use in my own library, minus some other goodies. This should help get someone started at least.
The basic matrix is:
m[0] m[1] m[2]
m[3] m[4] m[5]
m[6] m[7] m[8]
The full multiply of two 2D matrices, m1 and m2, should be:
m[d][0] = m1[0] * m2[0] + m1[1] * m2[3] + m1[2] * m2[6];
m[d][1] = m1[0] * m2[1] + m1[1] * m2[4] + m1[2] * m2[7];
m[d][2] = m1[0] * m2[2] + m1[1] * m2[5] + m1[2] * m2[8];
m[d][3] = m1[3] * m2[0] + m1[4] * m2[3] + m1[5] * m2[6];
m[d][4] = m1[3] * m2[1] + m1[4] * m2[4] + m1[5] * m2[7];
m[d][5] = m1[3] * m2[2] + m1[4] * m2[5] + m1[5] * m2[8];
m[d][6] = m1[6] * m2[0] + m1[7] * m2[3] + m1[8] * m2[6];
m[d][7] = m1[6] * m2[1] + m1[7] * m2[4] + m1[8] * m2[7];
m[d][8] = m1[6] * m2[2] + m1[7] * m2[5] + m1[8] * m2[8];
The idea is that you multiply a row into a column and add it up to get a new term.
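As a rough sketch of that row-into-column rule, the nine lines above collapse into a pair of loops. The name Mat3Multiply and the plain float[9] arguments are my own assumptions for illustration; the functions further down don't use it, since they work with the simplified math instead.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: out = a * b for 3x3 matrices stored row-major
   as float[9], matching the m[0]..m[8] layout shown above. */
void Mat3Multiply(const float *a, const float *b, float *out)
{
    float result[9];
    int row, col, k;

    assert(a != NULL && b != NULL && out != NULL);

    for (row = 0; row < 3; row++)
    {
        for (col = 0; col < 3; col++)
        {
            /* Multiply row 'row' of a into column 'col' of b and add it up. */
            float sum = 0.0f;
            for (k = 0; k < 3; k++)
                sum += a[row * 3 + k] * b[k * 3 + col];
            result[row * 3 + col] = sum;
        }
    }

    /* Copy out last so that out may safely alias a or b. */
    for (k = 0; k < 9; k++)
        out[k] = result[k];
}
```

Because the result is built in a temporary first, you can write Mat3Multiply(a, b, a) to accumulate into an existing matrix.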
Then if you look up your basic transform formulas you can put together your own 2D transform functions with that basic knowledge. After you simplify the math by removing redundant multiplies by zero and one, you'll come up with something like this:
Code:
#include <math.h>

#define MAX_STACK_DEPTH 32

/* Matrix stack: m[stackDepth] is the current 3x3 matrix, stored row-major. */
static float m[MAX_STACK_DEPTH][9];
int stackDepth = 0;

/* Reset the current matrix to identity. */
void LoadIdentity(void)
{
    int d = stackDepth;

    m[d][0] = 1.0f; m[d][1] = 0.0f; m[d][2] = 0.0f;
    m[d][3] = 0.0f; m[d][4] = 1.0f; m[d][5] = 0.0f;
    m[d][6] = 0.0f; m[d][7] = 0.0f; m[d][8] = 1.0f;
}

/* Copy the current matrix one level up the stack and make the copy current. */
void PushMatrix(void)
{
    int c, d, i;

    /* The highest valid index is MAX_STACK_DEPTH - 1, so stop one short. */
    if (stackDepth >= MAX_STACK_DEPTH - 1)
        return;

    c = stackDepth;
    stackDepth++;
    d = stackDepth;

    for (i = 0; i < 9; i++)
        m[d][i] = m[c][i];
}

/* Discard the current matrix and go back to the one below it. */
void PopMatrix(void)
{
    if (stackDepth <= 0)
        return;

    stackDepth--;
}

/* Pre-multiply the current matrix by a translation. */
void Translate(float dx, float dy)
{
    int d = stackDepth;

    m[d][6] = dx * m[d][0] + dy * m[d][3] + m[d][6];
    m[d][7] = dx * m[d][1] + dy * m[d][4] + m[d][7];
    m[d][8] = dx * m[d][2] + dy * m[d][5] + m[d][8];
}

/* Pre-multiply the current matrix by a counter-clockwise rotation. */
void Rotate(float radians)
{
    int d = stackDepth;
    float cosTheta = cosf(radians);
    float sinTheta = sinf(radians);
    float m0 = m[d][0], m1 = m[d][1], m2 = m[d][2],
          m3 = m[d][3], m4 = m[d][4], m5 = m[d][5];

    m[d][0] = cosTheta * m0 + sinTheta * m3;
    m[d][1] = cosTheta * m1 + sinTheta * m4;
    m[d][2] = cosTheta * m2 + sinTheta * m5;

    /* The second row takes -sin; without the sign it isn't a rotation. */
    m[d][3] = -sinTheta * m0 + cosTheta * m3;
    m[d][4] = -sinTheta * m1 + cosTheta * m4;
    m[d][5] = -sinTheta * m2 + cosTheta * m5;
}

/* Pre-multiply the current matrix by a scale. */
void Scale(float sx, float sy)
{
    int d = stackDepth;

    m[d][0] *= sx;
    m[d][1] *= sx;
    m[d][2] *= sx;
    m[d][3] *= sy;
    m[d][4] *= sy;
    m[d][5] *= sy;
}

/* Transform 'count' vertices (x, y pairs) from verts into transformedVerts. */
void TransformVerts(float *verts, float *transformedVerts, int count)
{
    int d = stackDepth, i, ix, iy;
    float x, y;
    float m0 = m[d][0], m1 = m[d][1], m3 = m[d][3],
          m4 = m[d][4], m6 = m[d][6], m7 = m[d][7];

    for (i = 0; i < count; i++)
    {
        ix = i * 2;
        iy = ix + 1;
        x = verts[ix];
        y = verts[iy];
        transformedVerts[ix] = x * m0 + y * m3 + m6;
        transformedVerts[iy] = x * m1 + y * m4 + m7;
    }
}

/* Same as TransformVerts, but overwrites the source vertices. */
void TransformVertsInPlace(float *verts, int count)
{
    int d = stackDepth, i, ix, iy;
    float x, y;
    float m0 = m[d][0], m1 = m[d][1], m3 = m[d][3],
          m4 = m[d][4], m6 = m[d][6], m7 = m[d][7];

    for (i = 0; i < count; i++)
    {
        ix = i * 2;
        iy = ix + 1;
        /* Read both components before writing either one back. */
        x = verts[ix];
        y = verts[iy];
        verts[ix] = x * m0 + y * m3 + m6;
        verts[iy] = x * m1 + y * m4 + m7;
    }
}
I don't know if it is the tightest code one could have, but like I've been saying, that's about all there is to it if you want to do your own 2D transforms and skip OpenGL's. Pretty simple huh?
You use LoadIdentity/PushMatrix/PopMatrix/Translate/Rotate/Scale just like you would OpenGL's equivalents. Then, when you're ready to commit your transform(s) to your sprite's vertices, you call either TransformVerts or TransformVertsInPlace to transform the vertices you desire.
Here is a very simple GLUT demo of one way you can use it:
Code:
#include <stdlib.h>
#include <GLUT/glut.h>
#include "Transform2D.h"

void display(void)
{
    /* Unit quad centered on the origin, in triangle-strip order. */
    float vertsSource[] = { -0.5f, -0.5f,  0.5f, -0.5f,  -0.5f, 0.5f,  0.5f, 0.5f };
    float vertsDestination[8];

    glClearColor(0.0f, 0.0f, 0.0f, 1.0f);
    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);

    LoadIdentity();
    PushMatrix();
    Translate(200.0f, 240.0f);
    Rotate(34.0f);              /* Rotate takes radians */
    Scale(200.0f, 200.0f);

    /* Transform on the CPU, then hand the finished vertices to the GL. */
    TransformVerts(vertsSource, vertsDestination, 4);
    glVertexPointer(2, GL_FLOAT, 0, vertsDestination);
    glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);
    PopMatrix();

    glutSwapBuffers();
}

void reshape(int width, int height)
{
    /* One GL unit per pixel, origin at the bottom left. */
    glViewport(0, 0, width, height);
    glMatrixMode(GL_PROJECTION);
    glLoadIdentity();
    gluOrtho2D(0, width, 0, height);
    glMatrixMode(GL_MODELVIEW);
}

void idle(void)
{
    glutPostRedisplay();
}

int main(int argc, char** argv)
{
    glutInit(&argc, argv);
    glutInitDisplayMode(GLUT_RGBA | GLUT_DOUBLE);
    glutInitWindowSize(640, 480);
    glutCreateWindow("2D Transforms");
    glutDisplayFunc(display);
    glutReshapeFunc(reshape);
    glutIdleFunc(idle);
    glEnableClientState(GL_VERTEX_ARRAY);
    glutMainLoop();
    return EXIT_SUCCESS;
}
warmi Wrote:If you are dealing with 2d then there is no interpolation problem whatsoever ... I mean you are dealing with a scalar; no need for quats.
As I said, this is more a comfort thing (ported from existing 3D code) and is only used in certain cases. Early on, the game design called for sprites that could potentially rotate into the Z plane, so it made more sense at the time, but this thread has convinced me to revisit the code. I suppose I could wrap up scalar-angle transformations into functions that mimic the quaternion API to keep the implementation pretty...
Perhaps you guys can discuss how I am rendering compared to these other things. My way is way higher level and I am wondering how much performance I am giving up for this. I am able to get quite a few baddies on the screen at once but each baddy has only 12 quads at present.
Basically, in my main game loop I have the following 2 things. I will add ObjC in front of every type to show that it's an Objective-C type, but they aren't really named that.
ObjCUnit **units = /* some function that gives me the z-ordered units list */;
int unitsCount = /* also comes from the above function */;
for (int i = 0; i < unitsCount; i++)
{
    [[RendererFactory getRendererFor:[units[i] getRendererID]] renderUnit:units[i] /* , other params that might be needed */];
}
That's basically it. I have only a few renderers right now: one main renderer that knows how to render any unit, and then a few subclasses of that to handle units with slightly modified draw behaviour.
The basic renderer asks the unit for his position, his texture, his texture coordinates and his rotation. It then also asks the unit for a list of "effects" which are attached to him, for example smoke if he is on fire. So there are 12 OpenGL quad renders per unit. No combining of vertices between units. Every unit is likely to have different texture vertices, so I need to use separate texture coords per render. Doesn't that mean I can't render them all in one swath? There is also the potential that a unit might be on a separate texture, but most of them are on the same one.
Any comments on this approach? The nice thing is that the rendering is highly abstracted away from everything and it's easy to change things.
Thanks AnotherJake, I really appreciate your example and it's made things a WHOLE lot clearer. I'm going to give it a try in my code.
Thanks again for the time you put into that response.
MikeD
Cool man, glad to help. I hope it works out for you.
kendric Wrote:Every unit is likely to have different texture vertices, so I need to use separate texture coords per render. Doesn't that mean I can't render them all in one swath?
No, you can definitely render them in one swath. Remember, you're using a texture coord array along with your vertex array, so texture coords can change all you want, just like you change the vertices themselves. This is true for color too, since you can use the color array if needed. Basically, the only things that will require you to stop in between sprites is if you have to change textures or blend mode, or some other special need or effect.
That said, I don't think there's anything wrong with doing things the "inefficient" way. As long as you can maintain enough frames per second to stay happy, then that's Good Enough!™ Honestly, I don't even bother batching very often unless I have an obvious need to, like with a particle emitter puffing out lots and lots of puffs of smoke or fire per second and affecting framerate.
Yes, keeping your renderer abstracted away is definitely the best way to go.
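To make the "one swath" idea a bit more concrete, here is a rough sketch of filling a vertex array and a parallel texture coord array, one sprite at a time. All of the names and the 256-sprite cap are assumptions made up for this example, and it writes each quad as two independent triangles so that sprites don't have to be stitched together.

```c
#include <assert.h>

#define MAX_SPRITES      256   /* assumed cap, pick your own */
#define VERTS_PER_SPRITE 6     /* two independent triangles per quad */

static float batchVerts[MAX_SPRITES * VERTS_PER_SPRITE * 2];
static float batchTexCoords[MAX_SPRITES * VERTS_PER_SPRITE * 2];
static int batchVertexCount = 0;

/* Hypothetical helper: append one axis-aligned quad with its own texture
   rectangle (u0, v0)-(u1, v1). Returns 0 if the batch is already full. */
int BatchAddSprite(float x, float y, float w, float h,
                   float u0, float v0, float u1, float v1)
{
    /* Corner order: BL, BR, TL for the first triangle, BR, TR, TL for the second. */
    const float px[6] = { x,  x + w, x,     x + w, x + w, x     };
    const float py[6] = { y,  y,     y + h, y,     y + h, y + h };
    const float tu[6] = { u0, u1,    u0,    u1,    u1,    u0    };
    const float tv[6] = { v0, v0,    v1,    v0,    v1,    v1    };
    int i, base;

    assert(batchVertexCount % VERTS_PER_SPRITE == 0);
    if (batchVertexCount + VERTS_PER_SPRITE > MAX_SPRITES * VERTS_PER_SPRITE)
        return 0;

    base = batchVertexCount * 2;   /* two floats per vertex in both arrays */
    for (i = 0; i < VERTS_PER_SPRITE; i++)
    {
        batchVerts[base + i * 2]         = px[i];
        batchVerts[base + i * 2 + 1]     = py[i];
        batchTexCoords[base + i * 2]     = tu[i];
        batchTexCoords[base + i * 2 + 1] = tv[i];
    }
    batchVertexCount += VERTS_PER_SPRITE;
    return 1;
}

/* Call after drawing, or whenever the texture or blend mode has to change. */
void BatchReset(void)
{
    batchVertexCount = 0;
}
```

At draw time you'd point glVertexPointer at batchVerts and glTexCoordPointer at batchTexCoords (with GL_TEXTURE_COORD_ARRAY enabled), then issue a single glDrawArrays(GL_TRIANGLES, 0, batchVertexCount) for everything that shares a texture.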
Thanks. Until just now I had my brain stuck in a "texture array is always the same size" mentality, even though I should have known better, as I have done vertex arrays that are not the same size. I suppose I could modify my code to render nothing until the texture change happens and just queue these things up. But then you have to deal with expanding int arrays, as you will never know how many you are going to have. That in itself adds some overhead. I guess it would depend on just how much overhead there is in calling OpenGL.
kendric Wrote:But then you have to deal with expanding int arrays as you will never know how many you are going to have.
This is a good point to think about too. I had been using dynamic object arrays for years, using linked lists and all sorts of other ideas. Then one day it finally occurred to me that dynamic allocation of vertex/texCoord/etc. arrays is not actually needed. There is logically going to be a maximum number of objects that you can reasonably draw in any given frame without getting to the point where you're doing seconds per frame instead of frames per second. So now all I do is use simple static arrays and that's that (e.g. float myArray[MAX_OBJECTS_EVAR]).
In practice you'll want to copy your basic untransformed sprite into the array you will submit to the GL as you transform it. That means you're building the array every frame anyway, which means all you have to do is change the count parameter of glDrawArrays, depending on how many sprites you actually submit for any given frame. If you have a maximum of, say, 10 textures in your game and you want everything batched, then you have 10 arrays, each some arbitrary "max" size, like, say, 144000 bytes each. That's enough to handle 12k vertices per submission at three floats apiece, so z values fit if desired, which is more than you'll need on the current gen iPhone. That comes out to 1.4 MB for all ten vertex arrays. Add another MB for tex coords, and you can see there's plenty of room to spare, with absolutely no need for dynamic memory allocation, or the management code and performance overhead that comes with that.
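Here is roughly what that looks like in code, as a sketch rather than what I actually ship. The struct, the function names and the 12000-vertex cap are assumptions picked to match the numbers above: 12000 vertices at three floats each is 144000 bytes, and two floats each for tex coords is 96000 bytes.

```c
#include <assert.h>

#define MAX_TEXTURES        10
#define MAX_VERTS_PER_BATCH 12000   /* assumed cap per texture */

/* One fixed-size batch per texture; nothing is ever malloc'd or freed. */
typedef struct
{
    float verts[MAX_VERTS_PER_BATCH * 3];       /* x, y, z per vertex */
    float texCoords[MAX_VERTS_PER_BATCH * 2];   /* u, v per vertex */
    int   vertexCount;                          /* filled so far this frame */
} StaticBatch;

static StaticBatch batches[MAX_TEXTURES];

/* Hypothetical helper: start a frame by forgetting last frame's vertices. */
void StaticBatchesBeginFrame(void)
{
    int i;
    for (i = 0; i < MAX_TEXTURES; i++)
        batches[i].vertexCount = 0;
}

/* Hypothetical helper: append one vertex to a texture's batch.
   Returns 0 (and drops the vertex) when that batch is full. */
int StaticBatchAddVertex(int texture, float x, float y, float z,
                         float u, float v)
{
    StaticBatch *b;
    int n;

    assert(texture >= 0 && texture < MAX_TEXTURES);
    b = &batches[texture];
    n = b->vertexCount;
    if (n >= MAX_VERTS_PER_BATCH)
        return 0;

    b->verts[n * 3]         = x;
    b->verts[n * 3 + 1]     = y;
    b->verts[n * 3 + 2]     = z;
    b->texCoords[n * 2]     = u;
    b->texCoords[n * 2 + 1] = v;
    b->vertexCount = n + 1;
    return 1;
}
```

Each frame you call StaticBatchesBeginFrame, add whatever is visible, and then pass each batch's vertexCount as the count argument to glDrawArrays. The arrays never change size; only the count does.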
Basically, with this kind of design, the end result is that all you're doing is using OpenGL for depth sorting and rasterization.
I was thinking about this too, and I came up with something similar. You could have your array auto-expand, but retain it from game loop to game loop. That way it only ever expands as needed, but once it's expanded it remains at that size, and only if you need it bigger will it expand again. Your way sounds pretty good too.
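Something like this is what I have in mind, as a rough sketch. The GrowBuffer name, the starting size of 64 and the doubling policy are just assumptions for the example, and it holds floats here, though the same shape works for ints.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct
{
    float *data;       /* heap storage, kept between frames */
    size_t count;      /* floats used this frame */
    size_t capacity;   /* floats allocated */
} GrowBuffer;

/* Append one value, growing the storage only when it is full.
   Returns 1 on success, 0 if out of memory (buffer left unchanged). */
int GrowBufferPush(GrowBuffer *buf, float value)
{
    assert(buf != NULL);

    if (buf->count == buf->capacity)
    {
        size_t newCapacity = buf->capacity ? buf->capacity * 2 : 64;
        float *newData = realloc(buf->data, newCapacity * sizeof *newData);
        if (newData == NULL)
            return 0;
        buf->data = newData;
        buf->capacity = newCapacity;
    }

    buf->data[buf->count++] = value;
    return 1;
}

/* Start a new frame: forget the contents but keep the allocation. */
void GrowBufferReset(GrowBuffer *buf)
{
    assert(buf != NULL);
    buf->count = 0;
}

/* Release the storage when the game shuts down. */
void GrowBufferFree(GrowBuffer *buf)
{
    assert(buf != NULL);
    free(buf->data);
    buf->data = NULL;
    buf->count = 0;
    buf->capacity = 0;
}
```

A zero-initialized GrowBuffer is a valid empty buffer. After the first few busy frames the realloc path stops being hit, so the steady-state cost is just the bounds check.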
kendric Wrote:I was thinking about this too, and I came up with something similar. You could have your array auto-expand, but retain it from game loop to game loop. That way it only ever expands as needed, but once it's expanded it remains at that size, and only if you need it bigger will it expand again. Your way sounds pretty good too.
Yeah, I am doing something similar ... I have a constructor which takes an initial size value, while still making sure that things aren't hardcoded.
I actually took it even further ... essentially creating an API-like approach where all I have is a class called Renderer2d which offers methods like drawSprite(screen_coordinates, texture_coordinates, rotation, scale, color tint) etc ... but it also has a method called setMaterial(Material) (Material being a class which basically contains texture/render states) which essentially sets the internal state of Renderer2d.
Internally, vertex arrays are allocated based on materials, so in the end the process is fully automated ... you will end up with as many vertex/index arrays as there are submitted materials.