"Simple" OpenGLES (2D) efficiency question .. for OpenGLES experts!

Apprentice
Posts: 9
Joined: 2010.04
Post: #1
Hello, here is a question from an OpenGLES newb.

Without fully understanding what I am doing, I found it easy enough to get a basic system going, drawing a 2D sprite on the screen and so on. (For the record, of course I use my own totally separate physics system, etc, no confusion there.) But here's my question...


When you reach the point of actually drawing a pseudosprite to the screen, I guess you'll be doing something like this...

Code:
...
GLfloat happyVertices[] =
{
    -halfwidth + drawItHere.x,  -halfheight + drawItHere.y,  0.0f,
     halfwidth + drawItHere.x,  -halfheight + drawItHere.y,  0.0f,
    -halfwidth + drawItHere.x,   halfheight + drawItHere.y,  0.0f,
     halfwidth + drawItHere.x,   halfheight + drawItHere.y,  0.0f
};
...
glBindTexture( .. handle textures properly ..
glVertexPointer(3, GL_FLOAT, 0, happyVertices);
glTexCoordPointer( .. handle coords properly ..
glDrawArrays( .. handle drawing properly ..
...


That's fine and it works well (as far as I can see).

However, you could also take this approach:


Code:
...
GLfloat ezVertices[] =
{
    -halfwidth,  -halfheight,  0.0f,
     halfwidth,  -halfheight,  0.0f,
    -halfwidth,   halfheight,  0.0f,
     halfwidth,   halfheight,  0.0f
};
...
glTranslatef( drawItHere.x, drawItHere.y, 0.0f);
...
glBindTexture( .. handle textures properly ..
glVertexPointer(3, GL_FLOAT, 0, ezVertices); // (never changes)
glTexCoordPointer( .. handle coords properly;
glDrawArrays( .. handle drawing properly
...

Indeed, as far as I can tell from testing, that ALSO seems to work fine.


So in METHOD A you recompute the rectangle's vertices each frame so that they sit where you want the sprite to be.

In METHOD B, you just glTranslatef, and the vertices always describe the basic shape centered on the origin.

(Note that of course in METHOD B it is extremely easy to ROTATE the sprite. Just add a glRotatef after the glTranslatef.)



SO - they both seem to work. But I know nothing.

For all I know one method or the other is hopelessly inefficient or results in Flash being launched on the user's iPad or whatever. Perhaps one method only works up to five sprites, or, perhaps both methods are actually identical inside and I am too dumb to know that. I just don't know!



Sorry to bother any experts with this question, but if anyone has the answer, I can only thank you in advance!!! Cheers.
Moderator
Posts: 3,570
Joined: 2003.06
Post: #2
This is actually a good question. :)

The answer, as far as I am aware, is that it is technically faster to do METHOD A. The main reason being that OpenGL does transforms in a generic way, and always in 3D, so there are unnecessary extra calculations being done if all you're doing is 2D, in addition to the function call overhead (which is minimal, but still...). Rotations would mathematically be most expensive.

Another thing to be aware of is that glTranslate, glRotate, and glScale supposedly are not a part of newer versions of OpenGL, but I haven't left the old stuff yet, so I can't confirm that ;)

That said, using glTranslate, etc. is plenty fast enough to do just about anything I've needed, and the resulting code is certainly easier on the eyes. My opinion is that if you find it easier to use glTranslate, etc. then use them! If you run into performance problems, you can easily macro them out and override them with your own transforms if you wish, which is something I've done with great success.
Apprentice
Posts: 9
Joined: 2010.04
Post: #3
Thank you so much for the answer. (Thank goodness it is a good question. :) )

Intriguing.

(1) If I understand your second paragraph correctly, you are saying that the whole vertex system (i.e., when we use glVertexPointer ..) is relatively expensive anyway {because it's really 3D}. So, "additionally" using glTranslatef is helova wasteful. ie, the vertex system is powering along on full steam "anyway". Makes sense, THANKS.

(2) "glTranslate, glRotate, and glScale supposedly are not a part of newer versions of OpenGL.." Ah, thanks for the tip. THANKS.

(3) "[one could..] override them with your own transforms if you wish, which is something I've done with great success...."

Since I'm GLNEWB .. in that sentence, do you mean work out the transforms mathematically (no problem) and then in fact use that new data IN THE glVertexPointer statement...? I hope I understood you correctly. THANKS.


(4) Simple newbie question .. (as long as I'm on a roll..) I'm curious why: in most examples I've seen, at that point in the cycle, glBindTexture, glVertexPointer and glTexCoordPointer are ALL SET before you call glDrawArrays once again in the ongoing cycle of life.

Wouldn't you just leave glBindTexture and glTexCoordPointer alone? (Assuming they are not changing of course.) Wouldn't you just set glVertexPointer each time for your new position, rotation, size .. and then use glDrawArrays? I'm sure I am missing something obvious.


(5) My final newb question (I'll never need to ask another question about all of OpenGL, imagine) ... Should I basically set up the whole scene every time?

Thus, if you have say a background and then some big stationary sprites on it, and let's say some small moving sprites on the big stationary ones ... in fact do you draw the WHOLE SHEBANG each time, i.e. the background, the currently stationary sprites, and the animating ones? Put in the b/g every time? Is that right? THANKS :-O

{I assume it is totally ridiculous to sit an OpenGLES layer over a normal iPhone layer.}


(6) You know, this all makes me think it would be really sensible if you could just blit in 2D on the iPhone? I've noticed that on the iPhone, in 2D, when you exhaust the speed of Quartz, the next step is ... using OpenGLES 3D as a 2D mechanism. In some ways it seems pretty silly! All you would really need (I guess?) is some sort of 2D blitter that lets you handle basically backgrounds, sprites moving, and perhaps scrolling backgrounds. But I'm guessing OpenGL is the only way to access the "metal"? Is that about right? I've never seen any other way mentioned to do fast 2D on the phone than using OpenGLES "in 2D mode".

Well ...thanks and cheers!!!
Sage
Posts: 1,232
Joined: 2002.10
Post: #4
See the previous threads about this topic.
Sage
Posts: 1,482
Joined: 2002.09
Post: #5
Both of those methods are going to be pretty close in terms of performance.

The best thing that you can do for performance is to batch your sprites. That means that you create an array with the triangles for several sprites. Then you only have to set the pointers and texture once and can draw a number of sprites with a single draw call. Function calls aren't really that expensive, but the OpenGL state changes they perform are.
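
As a rough illustration of the batching idea above, here is a minimal sketch in C. The `Sprite` struct and `buildBatch` function are hypothetical names invented for this example, and it assumes every sprite in the batch shares one texture and needs no rotation. It only builds the vertex array; no GL calls are made.

```c
#include <stddef.h>

/* Hypothetical sprite description: centre position and half extents. */
typedef struct {
    float x, y;
    float halfWidth, halfHeight;
} Sprite;

enum { FLOATS_PER_VERTEX = 2, VERTICES_PER_SPRITE = 6 };

/* Writes two triangles (6 vertices, x/y only) per sprite into `out`.
 * `out` must hold count * 12 floats. Returns the number of vertices written. */
size_t buildBatch(const Sprite *sprites, size_t count, float *out)
{
    size_t v = 0;
    for (size_t i = 0; i < count; i++) {
        const Sprite *s = &sprites[i];
        float l = s->x - s->halfWidth,  r = s->x + s->halfWidth;
        float b = s->y - s->halfHeight, t = s->y + s->halfHeight;
        const float quad[VERTICES_PER_SPRITE * FLOATS_PER_VERTEX] = {
            l, b,  r, b,  l, t,   /* first triangle  */
            r, b,  r, t,  l, t    /* second triangle */
        };
        for (size_t k = 0; k < sizeof quad / sizeof quad[0]; k++)
            out[v * FLOATS_PER_VERTEX + k] = quad[k];
        v += VERTICES_PER_SPRITE;
    }
    return v;
}
```

The resulting array would then be handed to glVertexPointer(2, GL_FLOAT, 0, out) once and drawn with a single glDrawArrays(GL_TRIANGLES, 0, vertexCount) call, with a matching texture coordinate array built the same way. Separate triangles are used here rather than a strip so that the sprites do not have to be connected to each other.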

However, you shouldn't really be worried about micro-optimizations like the ones you showed in your original post until you are sure which parts of your program are causing your program to be slow. More importantly, don't bother optimizing things at that level until your program is actually running slow. There are tons of terrible, horrible, slow things that people do in their programs but nobody notices as long as the program runs smoothly on the target platform.

Scott Lembcke - Howling Moon Software
Author of Chipmunk Physics - A fast and simple rigid body physics library in C.
Member
Posts: 166
Joined: 2009.04
Post: #6
The A method is more efficient by several orders of magnitude.
It is not just the cost of calling glTranslatef or related calls but as Skorche points out, it is the draw calls that will make a lot of difference.

Now it is true that often people don't see any difference when looking at their frame rates but that's mostly because, more often than not, they use alpha-blending with relatively large sprites and their code becomes fill rate bound before anything else.

But rest assured, with the B method you are burning your CPU cycles big time, the same cycles that would come in very handy once you get into more advanced code with complicated logic, sound and physics processing.
Moderator
Posts: 3,570
Joined: 2003.06
Post: #7
fattoh Wrote:(1) If I understand your second paragraph correctly, you are saying that the whole vertex system (i.e., when we use glVertexPointer ..) is relatively expensive anyway {because it's really 3D}. So, "additionally" using glTranslatef is helova wasteful. ie, the vertex system is powering along on full steam "anyway". Makes sense, THANKS.
I'm saying that when you call glTranslate/rotate/scale, they are doing matrix transforms on 3D data regardless (x,y,z), so it is faster for you to transform your data yourself if you're just doing 2D (x,y). You can cut out a lot of multiplies, especially for rotations if you do them yourself for 2D.

But don't bother! As Skorche says, what you're showing already will work just fine for the vast majority of cases (either way: METHOD A or METHOD B). Wait until you run into performance issues, THEN consider doing other stuff like doing full transforms yourself.
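
For reference, "doing the transform yourself" in 2D could look roughly like the sketch below. `transformQuad2D` is a hypothetical helper, not something from this thread; it assumes the caller has already computed the cosine and sine of the rotation angle once for the sprite, and it emits x/y pairs only (so the size argument to glVertexPointer would be 2 rather than 3).

```c
/* Rotates a sprite's four corners about its centre, then moves the centre
 * to (cx, cy). cosAngle and sinAngle are the cosine and sine of the
 * rotation angle, computed once by the caller.
 * Output: 4 (x, y) pairs in the order bottom-left, bottom-right,
 * top-left, top-right, the same order as the vertex arrays in post #1. */
void transformQuad2D(float halfWidth, float halfHeight,
                     float cx, float cy,
                     float cosAngle, float sinAngle, float out[8])
{
    const float corners[4][2] = {
        { -halfWidth, -halfHeight }, {  halfWidth, -halfHeight },
        { -halfWidth,  halfHeight }, {  halfWidth,  halfHeight }
    };
    for (int i = 0; i < 4; i++) {
        float x = corners[i][0], y = corners[i][1];
        out[i * 2]     = cosAngle * x - sinAngle * y + cx;  /* 2 multiplies */
        out[i * 2 + 1] = sinAngle * x + cosAngle * y + cy;  /* 2 multiplies */
    }
}
```

That is four multiplies per vertex on the CPU. Whether it beats a glTranslatef/glRotatef pair in practice is exactly what should be measured before committing to it.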

fattoh Wrote:(3) "[one could..] override them with your own transforms if you wish, which is something I've done with great success...."

Since I'm GLNEWB .. in that sentence, do you mean work out the transforms mathematically (no problem) and then in fact use that new data IN THE glVertexPointer statement...? I hope I understood you correctly. THANKS.
Right, you can do the transforms yourself. You will have to do them yourself if you use the "batch" technique, where instead of calling glDrawArrays/Elements for each sprite, you put them in one big pile of vertices, transform them yourself and submit them all at once. Batching can be a pain to set up, but it can pay off with huge performance gains. Personally, I would advise not bothering with batching until you get some more experience and you run into a situation where you really need it. Like Warmi said, you'll likely run into fill rate problems with blending way before you'll need to start batching (and batching won't fix fill/blend anyway).


fattoh Wrote:(4) Simple newbie question .. (as long as I'm on a roll..) I'm curious why: in most examples I've seen, at that point in the cycle, glBindTexture, glVertexPointer and glTexCoordPointer are ALL SET before you call glDrawArrays once again in the ongoing cycle of life.

Wouldn't you just leave glBindTexture and glTexCoordPointer alone? (Assuming they are not changing of course.) Wouldn't you just set glVertexPointer each time for your new position, rotation, size .. and then use glDrawArrays? I'm sure I am missing something obvious.
You can set your pointers once and re-use them, yes. The transforms happen when the geometry is submitted during the call to glDrawArrays/elements.


fattoh Wrote:(5) My final newb question (I'll never need to ask another question about all of OpenGL, imagine) ... Should I basically set up the whole scene every time?
Yes, draw your whole scene every time.


fattoh Wrote:{I assume it is totally ridiculous to sit an OpenGLES layer over a normal iPhone layer.}
Generally speaking, mixing Cocoa layers and widgets with OpenGL views is a bad thing for performance. The only time I ever do that is pretty much just for alerts and sometimes a text field for high score entry. It's best to draw full screen with your OpenGL view on iPhone and avoid mixing it with anything else.


fattoh Wrote:(6) You know, this all makes me think it would be really sensible if you could just blit in 2D on the iPhone?
No, don't. It's terribly slow. Stick to OpenGL all the way.
Sage
Posts: 1,482
Joined: 2002.09
Post: #8
AnotherJake Wrote:I'm saying that when you call glTranslate/rotate/scale, they are doing matrix transforms on 3D data regardless (x,y,z), so it is faster for you to transform your data yourself if you're just doing 2D (x,y). You can cut out a lot of multiplies, especially for rotations if you do them yourself for 2D.

Not true. By calling any of the transformation functions, you aren't adding to a list of transformations that the GPU does. The GPU multiplies each vertex by a single matrix transformation. Calling glTranslate, glRotate, glMultMatrix, etc. just modifies that matrix. You aren't making the GPU's job easier by transforming the vertexes yourself because it's still going to multiply each vertex by the identity (do nothing) matrix. You can't stop it from doing the transformation and no matrix transformation is more or less expensive than any other. The only thing that makes a matrix "more expensive" is how it is initialized. Initializing a rotation matrix needs to use trig functions, a translation matrix is practically free to initialize.
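
A small CPU-side model of that description, as a sketch only: `Mat4`, `mat4Identity`, `mat4Translate` and `mat4TransformPoint` are hypothetical names, and this is not how any particular driver is implemented. It assumes OpenGL's column-major layout and an affine matrix (bottom row 0, 0, 0, 1).

```c
/* 4x4 matrix in column-major layout: element (row r, column c) is m[c * 4 + r]. */
typedef struct { float m[16]; } Mat4;

Mat4 mat4Identity(void)
{
    Mat4 r = {{ 1,0,0,0,  0,1,0,0,  0,0,1,0,  0,0,0,1 }};
    return r;
}

/* Post-multiplies `a` by a translation matrix (a = a * T), which is what
 * glTranslatef is specified to do to the current matrix.
 * Only the last column of `a` changes. */
void mat4Translate(Mat4 *a, float x, float y, float z)
{
    for (int r = 0; r < 4; r++)
        a->m[12 + r] += a->m[r] * x + a->m[4 + r] * y + a->m[8 + r] * z;
}

/* Transforms the point (in[0], in[1], in[2], 1) by `a`.
 * The arithmetic is identical whether `a` is the identity or not. */
void mat4TransformPoint(const Mat4 *a, const float in[3], float out[3])
{
    for (int r = 0; r < 3; r++)
        out[r] = a->m[r] * in[0] + a->m[4 + r] * in[1]
               + a->m[8 + r] * in[2] + a->m[12 + r];
}
```

In this model a translate call touches four matrix elements once, while the per-vertex cost in mat4TransformPoint stays the same regardless of what the matrix contains.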

The question, then, is whether it is cheaper to set up a transformation and push it to the GPU for each sprite, or to transform the vertexes yourself and leave the transformation matrix alone. Calculating a new transformation matrix for only 4 vertexes is almost certainly more expensive. Pushing it to the GPU may also be expensive, but in a way that is more difficult to measure.

Again, I'll say that you should not be worried about micro-optimizations like this until you have a really good reason to. Batching your sprites is pretty much guaranteed to give you the best performance, but is not nearly as simple to implement. We basically did the second method from your first post to draw all the sprites in Twilight Golf. The only difference is that we put all our sprites into a single texture atlas and reused the same vertex array so that we only had to set the pointers and texture once. Drawing the 50 or so sprites on the screen in such an inefficient way was never ever a bottleneck compared to the physics or shadow drawing.
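
To make the texture atlas idea concrete, here is a sketch of computing texture coordinates for one sprite in an atlas. The grid layout and the `atlasTexCoords` name are assumptions made for this example; the actual atlas layout used in Twilight Golf is not described in this thread.

```c
/* Computes texture coordinates for one cell of a hypothetical grid atlas
 * with `cols` x `rows` equally sized cells. Cell 0 starts at texture
 * coordinate (0, 0); cells run left to right, then row by row.
 * Output order: (u0,v0), (u1,v0), (u0,v1), (u1,v1), matching the
 * vertex order of the sprite quads in post #1. */
void atlasTexCoords(int cell, int cols, int rows, float out[8])
{
    const float cw = 1.0f / (float)cols;         /* cell width in texture space  */
    const float ch = 1.0f / (float)rows;         /* cell height in texture space */
    const float u0 = (float)(cell % cols) * cw;
    const float v0 = (float)(cell / cols) * ch;
    const float u1 = u0 + cw, v1 = v0 + ch;

    out[0] = u0; out[1] = v0;
    out[2] = u1; out[3] = v0;
    out[4] = u0; out[5] = v1;
    out[6] = u1; out[7] = v1;
}
```

With all sprites in one texture, glBindTexture only needs to be called once per frame; only the coordinates passed to glTexCoordPointer differ between sprites.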

Scott Lembcke - Howling Moon Software
Author of Chipmunk Physics - A fast and simple rigid body physics library in C.
Member
Posts: 166
Joined: 2009.04
Post: #9
Skorche Wrote:You aren't making the GPU's job easier by transforming the vertexes yourself because it's still going to at least multiply each one by the identity (do nothing) matrix. You can't stop it from doing the transformation and no matrix transformation is more or less expensive.

You don't really know that... Since glLoadIdentity() is an explicit call, I would think it is not inconceivable to eliminate the worldview matrix transformation from the equation in the vertex shader (which is what is being run even on MBX platforms; at least, tracking calls through Shark reveals references to some internal vertex shader setup code).
Moderator
Posts: 3,570
Joined: 2003.06
Post: #10
Skorche Wrote:Not true.

I'm talking about this in 3D for a rotation:

Code:
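// Combines a rotation of theta radians about the axis (x, y, z) with the
// current matrix m[d] (12 floats). The formula assumes (x, y, z) is normalized.
void tqRotate3DRad(float theta, float x, float y, float z)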
{
    int        d = stackDepth3D;
    float    c = cos(theta);
    float    oneMinusCos = 1.0f - c;
    float    s = sin(theta);
    float    m0, m1, m2, m4, m5, m6, m8, m9, m10;
    float    M2[12] = { m[d][0], m[d][1], m[d][2], m[d][3], m[d][4], m[d][5],
                        m[d][6], m[d][7], m[d][8], m[d][9], m[d][10], m[d][11] };
    float    xy = x * y;
    float    xz = x * z;
    float    yz = y * z;
    float    xs = x * s;
    float    ys = y * s;
    float    zs = z * s;
    
    m0 = x * x * oneMinusCos + c;    m1 = xy * oneMinusCos - zs;        m2 = xz * oneMinusCos + ys;
    m4 = xy * oneMinusCos + zs;        m5 = y * y * oneMinusCos + c;    m6 = yz * oneMinusCos - xs;
    m8 = xz * oneMinusCos - ys;        m9 = yz * oneMinusCos + xs;        m10 = z * z * oneMinusCos + c;
    
    m[d][0]  = m0  * M2[0] + m1  * M2[4] + m2  * M2[8];
    m[d][1]  = m0  * M2[1] + m1  * M2[5] + m2  * M2[9];
    m[d][2]  = m0  * M2[2] + m1  * M2[6] + m2  * M2[10];
    m[d][3]  = m0  * M2[3] + m1  * M2[7] + m2  * M2[11];
    m[d][4]  = m4  * M2[0] + m5  * M2[4] + m6  * M2[8];
    m[d][5]  = m4  * M2[1] + m5  * M2[5] + m6  * M2[9];
    m[d][6]  = m4  * M2[2] + m5  * M2[6] + m6  * M2[10];
    m[d][7]  = m4  * M2[3] + m5  * M2[7] + m6  * M2[11];
    m[d][8]  = m8  * M2[0] + m9  * M2[4] + m10 * M2[8];
    m[d][9]  = m8  * M2[1] + m9  * M2[5] + m10 * M2[9];
    m[d][10] = m8  * M2[2] + m9  * M2[6] + m10 * M2[10];
    m[d][11] = m8  * M2[3] + m9  * M2[7] + m10 * M2[11];
}

vs. this for 2D for a rotation:
Code:
// Combines a 2D rotation of theta radians with the current 2D matrix m[d] (6 floats).
void tqRotate2DRad(float theta)
{
    int        d = stackDepth2D;
    float    cosTheta = cos(theta);
    float    sinTheta = sin(theta);
    float    m0 = m[d][0], m1 = m[d][1], m2 = m[d][2],
            m3 = m[d][3], m4 = m[d][4], m5 = m[d][5];
    
    m[d][0] = cosTheta * m0 + sinTheta * m3;
    m[d][1] = cosTheta * m1 + sinTheta * m4;
    m[d][2] = cosTheta * m2 + sinTheta * m5;
    m[d][3] = -sinTheta * m0 + cosTheta * m3;
    m[d][4] = -sinTheta * m1 + cosTheta * m4;
    m[d][5] = -sinTheta * m2 + cosTheta * m5;
    
    texFilteringRecommended2D[d] = true;
}

I'm not talking about individual vertex multiplies as it is done in hardware. Do a lot of sprites and clearly you will save on math calcs. Now whether you transform each vertex yourself, or leave it to the hardware depends on your needs. The hardware can't help you with your batched geometry or mesh deformations, so you'll have to do those. For everything else, you can simply multiply by the GL modelview matrix and let it do the vertex transforms as usual, since as you mentioned, it's going to do it anyway ;)
Moderator
Posts: 3,570
Joined: 2003.06
Post: #11
Oh, not to mention if you do the matrix transforms yourself, you have a copy of the modelview matrix directly on hand and will not need to stall the pipeline for a round trip to read the modelview matrix from the GL, if/when you need it. Note again, I'm talking about "matrix" transforms, not "vertex" transforms. I think that's where some miscommunication is coming from here.
Apprentice
Posts: 9
Joined: 2010.04
Post: #12
JAKE2:

(1) "I'm saying that when you call glTranslate/rotate/scale, they are … 3D …" Ahhh, I completely understand now. I idiotically thought those were some sort of 2D processes. THANKS

(2) Other general questions - all understood THANKS.

(3) " 'just blit?..' ... Stick to OpenGL all the way." Fascinating. So there is no "2D equivalent" of OpenGLES. Nothing lets you control the chips like OpenGLES; even though OpenGL has the "minor overhead" of being 3D, it's still the quickest way. Intriguing. THANKS.

JAKE2 + SKORCHE

(4) "batch.." Right, I completely understand the batch concept. I was curious about the one sprite case. As you can see, I knew nothing about (say) glTranslate, and now I know much more. THANKS.

Arekkusu:

(5) "See the previous threads about this topic." I will review fully THANKS.

WARMI:

(6) ". .. more often than not, they use alpha-blending with relatively large sprites and their code becomes fill rate bound before anything else."

Now that is fascinating. I will have to investigate that fully. I am a simple person and I don't think I use alpha blending. I think I just, err, use normal transparent PNGs. It sounds like alpha blending is a big eater if you get in to it. Thanks for the pointer. THANKS.

SKORCHE:

(7) I completely understand what you explained from the TGolf example. That is really fascinating. I'm too lazy to ever do anything that complicated but I now have a powerful feeling that it is no longer mysterious! THANKS.

JAKE2

(8) "not to mention if you do the matrix transforms yourself, you have a copy of the modelview matrix directly on hand" As Rodney Brooks says "The world is its own best model…"

THANKS.
Member
Posts: 166
Joined: 2009.04
Post: #13
fattoh Wrote:WARMI:


Now that is fascinating. I will have to investigate that fully. I am a simple person and I don't think I use alpha blending. I think I just, err, use normal transparent PNGs. It sounds like alpha blending is a big eater if you get in to it. Thanks for the pointer. THANKS.


Actually ...you do use alpha-blending anytime you use transparent images.
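
To show why blending costs fill rate, here is the per-channel arithmetic as a sketch. `blendChannel` is a hypothetical name; the formula is the one selected by glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA). Which blend function the original poster's code actually uses is not shown in this thread, and some texture loaders premultiply alpha, in which case GL_ONE would take the place of GL_SRC_ALPHA.

```c
/* One colour channel of "source over" blending:
 *   result = src * srcAlpha + dst * (1 - srcAlpha)
 * `dst` is the value already in the framebuffer, so it has to be read
 * back for every pixel the sprite covers, including fully transparent ones. */
float blendChannel(float src, float dst, float srcAlpha)
{
    return src * srcAlpha + dst * (1.0f - srcAlpha);
}
```

The hardware does this for every channel of every covered pixel, which is why large, overlapping transparent sprites tend to hit the fill-rate limit before vertex or matrix work becomes a concern.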
Sage
Posts: 1,482
Joined: 2002.09
Post: #14
Sure, 3D rotation matrices are more complicated than 2D ones. You don't have to use glRotate() to make a rotation matrix. I don't. My point was that doing the transformations yourself isn't necessarily good or bad. For a very small number of vertexes, it's probably bad due to the overhead of the matrix multiply and pushing the new matrix off to the GPU. Letting the GPU do it does make the code much simpler and easier to follow.

Scott Lembcke - Howling Moon Software
Author of Chipmunk Physics - A fast and simple rigid body physics library in C.
Moderator
Posts: 3,570
Joined: 2003.06
Post: #15
Skorche Wrote:...My point was that doing the transformations yourself isn't necessarily good or bad. ... Letting the GPU do it does make the code much simpler and easier to follow.

That was my point too at the start of the conversation :P

AnotherJake Wrote:That said, using glTranslate, etc. is plenty fast enough to do just about anything I've needed, and the resulting code is certainly easier on the eyes. My opinion is that if you find it easier to use glTranslate, etc. then use them! If you run into performance problems, you can easily macro them out and override them with your own transforms if you wish, which is something I've done with great success.