OpenGL & Cocoa - Improving frame rate

Member
Posts: 35
Joined: 2009.01
Post: #1
Hello,

I am working on a game in which each level is stored in a file containing several arrays (background props, foreground props, units, and terrain), where terrain defines where the player can walk.

Currently the frame rate is driven by a timer with an interval of 0.0166666 seconds, which should give 60fps. However, sometimes the CPU is still busy when the timer fires (I think), and as a result the frame rate switches between 60fps and 30fps. It seems to stay on 60fps for a few seconds, then go to 30fps for a few seconds, back and forth - it's not just single frames that get dropped.

All the drawing is being done in immediate mode. My NSOpenGLView subclass caches state information (particularly color, bound texture, and whether textures are enabled) so that redundant state changes don't clog the pipeline.

To finally cut to the chase, there are a few areas in which I can think to optimize my code:

1. Not draw props that are offscreen. For now I don't think this is a huge deal since there are only 34 "trees" in the whole level (20 of which are onscreen at any given time) - in the future this optimization will make a bigger difference, but I'm not sure if it's really having much effect right now.

2. Optimize array iteration efficiency. My props are stored left to right, and top to bottom. So perhaps I should be using a divide and conquer search to find the first onscreen prop, then iterating until the next one is offscreen to the right. Again, it's only O(n), so... I'm really not sure if I should ever bother with this one and make it O(log n).

3. Stop using immediate mode rendering. I have a feeling this is the biggest issue. Using Shark really didn't reveal much about what was holding up the frame rate, but perhaps Shark is not so useful for evaluating frame rate issues. For each prop I draw, I get its screen & texture coordinates, then glBegin()... texcoord vertex texcoord vertex etc.
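For points 1 and 2, a minimal sketch of the binary search idea might look like the following. It assumes, purely for illustration, that props are sorted by ascending centre x and that a maximum half-width is known; the Prop struct and both function names are hypothetical and not taken from the game's code.

```c
#include <stddef.h>

/* Hypothetical prop record: x is the horizontal centre, w the width. */
typedef struct { float x, y, w, h; } Prop;

/* Lower bound: index of the first prop whose centre x is >= min_x.
   Assumes props[] is sorted by ascending x. Returns n if there is none. */
static size_t first_prop_at_or_after(const Prop *props, size_t n, float min_x)
{
    size_t lo = 0, hi = n;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (props[mid].x < min_x)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo;
}

/* Computes the half-open index range [*first, *end) of props that may
   overlap the horizontal span [view_left, view_right]. max_half_w is an
   assumed upper bound on w/2, which keeps the test conservative. */
static void visible_range(const Prop *props, size_t n,
                          float view_left, float view_right, float max_half_w,
                          size_t *first, size_t *end)
{
    size_t i = first_prop_at_or_after(props, n, view_left - max_half_w);
    *first = i;
    while (i < n && props[i].x <= view_right + max_half_w)
        i++;
    *end = i;
}
```

With only a few dozen props the linear scan is probably just as fast in practice; the search only starts to pay off once levels hold far more props than fit on screen.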

It is my understanding that in order to make use of hardware acceleration, I need to somehow pack all the relevant coordinates into an array containing a bunch of GLfloats, specify that a certain memory location be accelerated ( ? ) and ... then I'm not sure what.

Essentially, I need to know the order in which all the coordinates should be put in the array. An example of my immediate mode code is shown below:
Code:
glBegin(GL_QUADS);
glTexCoord2f(tx1, ty1);        glVertex2f(x-.5*w, y);
glTexCoord2f(tx2, ty1);        glVertex2f(x+.5*w, y);
glTexCoord2f(tx2, ty2);        glVertex2f(x+.5*w, y+h);
glTexCoord2f(tx1, ty2);        glVertex2f(x-.5*w, y+h);
glEnd();

This code, or something like it, is (at the moment) called maybe 75-100 times per frame. It's only going to get worse as I create more content. I need to make the jump away from immediate mode sooner rather than later.

Would this break my rendering into two steps, i.e., first pack the coordinates, then draw them?

Any advice or suggested reading would be greatly appreciated. An example of a hardware accelerated equivalent of the above code would make my day.

Thanks in advance!

- Dave H.
Sage
Posts: 1,199
Joined: 2004.10
Post: #2
Your biggest win right now will likely be to jump to vertex arrays. You can get some practice with them using ordinary client-side arrays first, and once you're comfortable you can use VBOs to cache the vertex data on the GPU. This will make for a big improvement.

If you want to rock it old-school, you can wrap your current code in display lists; but I'd shy away from that, simply because they are a little outdated and might not be supported in future versions of GL.

That being said, while you don't have a lot of scenery to render, I'd consider at the very least writing some sort of visibility determination system. In my engine, every object has an AABB, and when it's inserted into the world, it is placed in a leaf of an octree. When the object's AABB changes (due to manual placement, physics, etc.) the octree quickly figures out whether it needs to reparent the entity in a new leaf.

Then, when rendering, it's easy to intersect a camera frustum with the octree and build a visibility set of the entities in the visible leaf nodes.

Finally, you just render those visible entities.

In the end, the only "tough" math you'll have to worry about is planes, frustums (6 planes), and frustum/AABB intersection. Some googling will probably help you find the code you need if you don't want to roll it yourself.
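As a rough illustration of the frustum/AABB test mentioned above, here is one common formulation: for each plane, check the box corner that lies farthest along the plane normal. The Plane and AABB structs and the function name are assumptions made for this sketch, not code from the poster's engine, and plane normals are assumed to point into the frustum.

```c
#include <stddef.h>

/* Hypothetical plane: a point p is on the inside when n.p + d >= 0. */
typedef struct { float nx, ny, nz, d; } Plane;

/* Hypothetical axis-aligned bounding box. */
typedef struct { float min[3], max[3]; } AABB;

/* Returns 0 if the box lies entirely outside at least one plane,
   1 otherwise. The test is conservative: a box near a frustum corner
   can be reported as visible even though it is just outside. */
static int aabb_in_frustum(const AABB *box, const Plane *planes, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        const Plane *p = &planes[i];
        /* Pick the corner farthest along the plane normal. */
        float px = p->nx >= 0.0f ? box->max[0] : box->min[0];
        float py = p->ny >= 0.0f ? box->max[1] : box->min[1];
        float pz = p->nz >= 0.0f ? box->max[2] : box->min[2];
        if (p->nx * px + p->ny * py + p->nz * pz + p->d < 0.0f)
            return 0; /* even the best corner is outside this plane */
    }
    return 1;
}
```

The same function works for a 2D game by passing only the four side planes and leaving the z extents at zero.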
Sage
Posts: 1,232
Joined: 2002.10
Post: #3
Although you should move to vertex arrays, that's not likely to be your bottleneck here when you are only drawing 100 quads.

It's more likely that your timer setup is causing you to drop frames: your first assumption that 1.0/60 is a "good" number is faulty. See this recent Q&A for a discussion.
Member
Posts: 35
Joined: 2009.01
Post: #4
Thanks for the help guys. Using the provided sample code I've translated the following:

Code:
glBegin(GL_QUADS); {
    glTexCoord2f(tx1, ty1);        glVertex2f(x1, y1);
    glTexCoord2f(tx2, ty1);        glVertex2f(x2, y1);
    glTexCoord2f(tx2, ty2);        glVertex2f(x2, y2);
    glTexCoord2f(tx1, ty2);        glVertex2f(x1, y2);
} glEnd();

into this:

Code:
GLfloat coords[16] = { x1, y1, tx1, ty1,
                       x2, y1, tx2, ty1,
                       x2, y2, tx2, ty2,
                       x1, y2, tx1, ty2 };
                
// glEnableClientState(GL_VERTEX_ARRAY);
glVertexPointer(2, GL_FLOAT, 4*sizeof(GLfloat), &coords[0]);
    
glClientActiveTexture(GL_TEXTURE0);
// glEnableClientState(GL_TEXTURE_COORD_ARRAY);
glTexCoordPointer(2, GL_FLOAT, 4*sizeof(GLfloat), &coords[2]);
    
glDrawArrays(GL_QUADS, 0, 4);

A couple of questions:

1. Is there any reason to call glEnableClientState(..) each frame? I've actually commented them out and put them in my OpenGL initialization method, with no negative repercussions.

2. I suppose in order to properly make use of this, I should pack all the vertices into a single array, and then call glDrawArrays(GL_QUADS, 0, 4*n)?
... at least until a different texture needs to be bound, at which point the array should be reset?

Anyway, for now my frame rate is staying above 60, so thanks for the suggestions... Some day soon I will return and ask about VBO implementations.
Member
Posts: 245
Joined: 2005.11
Post: #5
1 - If you ever add code that disables a client state at some point, then you will need the corresponding call to re-enable it when needed. Ensuring that you always enable the states you need before drawing can prevent funky bugs 9 months later when you've forgotten how the old drawing code works. On the other hand, if you are going to keep all your drawing to the same set of buffers, then setting up the client state during initialisation is fine.

2 - You are correct here. Combining several images into a single texture can allow you to do a great deal of drawing with very few OpenGL calls.
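The batching confirmed in point 2 could be sketched roughly as below, using the same interleaved layout (x, y, tx, ty per vertex) as the snippet in post #4. The Sprite struct, the pack_quads name, and the capacity handling are illustrative assumptions; the GL calls are shown only in a comment so that the packing step stands on its own.

```c
#include <stddef.h>

/* Hypothetical per-sprite input: screen rectangle plus texture rectangle. */
typedef struct { float x1, y1, x2, y2, tx1, ty1, tx2, ty2; } Sprite;

enum { FLOATS_PER_VERTEX = 4, VERTS_PER_QUAD = 4 };

/* Packs one quad per sprite into out[] as interleaved x, y, tx, ty.
   Stops early if the next quad would not fit in `capacity` floats.
   Returns the number of vertices written (4 per packed sprite). */
static size_t pack_quads(const Sprite *sprites, size_t n,
                         float *out, size_t capacity)
{
    size_t verts = 0;
    for (size_t i = 0; i < n; i++) {
        if ((verts + VERTS_PER_QUAD) * FLOATS_PER_VERTEX > capacity)
            break;
        const Sprite *s = &sprites[i];
        const float quad[16] = {
            s->x1, s->y1, s->tx1, s->ty1,
            s->x2, s->y1, s->tx2, s->ty1,
            s->x2, s->y2, s->tx2, s->ty2,
            s->x1, s->y2, s->tx1, s->ty2
        };
        for (size_t k = 0; k < 16; k++)
            out[verts * FLOATS_PER_VERTEX + k] = quad[k];
        verts += VERTS_PER_QUAD;
    }
    return verts;
}

/* With a GL context current and both client states enabled, the batch
   could then be drawn with one call, as in the snippet from post #4:

     glVertexPointer(2, GL_FLOAT, 4*sizeof(GLfloat), &out[0]);
     glTexCoordPointer(2, GL_FLOAT, 4*sizeof(GLfloat), &out[2]);
     glDrawArrays(GL_QUADS, 0, (GLsizei)verts);
*/
```

One batch like this would be built per bound texture, which is why combining images into a single texture reduces the number of draw calls.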