How do YOU use OpenGL ES?

Apprentice
Posts: 14
Joined: 2010.01
Post: #1
My problems with displaying loaded textures were solved yesterday by AnotherJake. Now, after trying to display just a few textures at a time, the game runs slowly. Lag. Is there any special trick to make the game run faster? Or do I have to resort to blitting? I've seen games that have flashy graphics, and they don't run slow at all. Any tips?

Or do I have to use a tile atlas? Or whatever they are called. I know what they are and do...
Sage
Posts: 1,482
Joined: 2002.09
Post: #2
One of the worst things that you can do is to change the OpenGL state too often. In particular, binding a texture (glBindTexture()) seems very expensive, even if you are binding the same texture that is already bound. What you should be doing is setting the OpenGL state, drawing all the items that use that state, changing to the next state, drawing all the items that use that state, rinse and repeat. This is one of the big reasons to use texture atlases: you don't have to change the texture to draw multiple things.

Another thing to watch out for is making too many draw calls (glDrawArrays() or similar). Though it's generally not as big of a deal, batch your draw calls together as well if you can.
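
As a rough illustration of "set the state, draw everything that uses it, move on", here is a minimal sketch that sorts sprites by texture so that each texture only needs binding once per frame. The Sprite struct and the function names are made up for this example, and the GL call is left as a comment so the snippet runs without a GL context.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical sprite record; only the texture id matters for batching. */
typedef struct {
    unsigned texture;  /* would be a GLuint texture name in a real renderer */
    float x, y;
} Sprite;

/* qsort comparator: order sprites by texture id. */
static int by_texture(const void *a, const void *b) {
    unsigned ta = ((const Sprite *)a)->texture;
    unsigned tb = ((const Sprite *)b)->texture;
    return (ta > tb) - (ta < tb);
}

/* Sort in place so that sprites sharing a texture end up adjacent. */
void sort_sprites_by_texture(Sprite *s, size_t n) {
    assert(s != NULL && n > 0);
    qsort(s, n, sizeof *s, by_texture);
}

/* Count the binds a draw loop would issue if it only re-binds when the
   texture differs from the one used by the previous sprite. */
size_t count_texture_switches(const Sprite *s, size_t n) {
    size_t switches = 0;
    for (size_t i = 0; i < n; i++) {
        if (i == 0 || s[i].texture != s[i - 1].texture) {
            switches++;  /* glBindTexture(GL_TEXTURE_2D, s[i].texture); */
        }
    }
    return switches;
}
```

With six sprites using textures 2, 1, 2, 1, 3, 1 in that order, the unsorted loop switches texture six times and the sorted one three times. Sorting changes the draw order, so this only applies as-is where the order doesn't matter (for example, sprites that don't overlap).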

Scott Lembcke - Howling Moon Software
Author of Chipmunk Physics - A fast and simple rigid body physics library in C.
Member
Posts: 227
Joined: 2008.08
Post: #3
Binding the texture currently bound is expensive?
Interesting. So would wrapping it in something like this be worthwhile?
Code:
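// Skip the GL call when the requested texture is already bound.
// Only tracks GL_TEXTURE_2D for a single texture unit and context.
void BindTexture(GLuint id){
  static GLuint currentId=0; // 0 is GL's default texture name
  if(currentId==id) return;
  glBindTexture(GL_TEXTURE_2D,id);
  currentId=id;
}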
Moderator
Posts: 3,572
Joined: 2003.06
Post: #4
Oddity007 Wrote:Binding the texture currently bound is expensive?

This is a good question. I'd like to know the answer too. More generally, I cache/shadow (however you wanna put it) some things of the OpenGL state, but I wonder just how much of it makes sense to cache. The question came up before about whether or not to go so far as to even cache glColor. I would love to hear from someone in the know about what should be cached, and/or avoided in terms of state changes. Everything, including glColor? Or just the low-hanging fruit like texture binds?

I suspect that this doesn't really hit at what the OP's performance problems probably are though...
Member
Posts: 166
Joined: 2009.04
Post: #5
AnotherJake Wrote:This is a good question. I'd like to know the answer too. More generally, I cache/shadow (however you wanna put it) some things of the OpenGL state, but I wonder just how much of it makes sense to cache. The question came up before about whether or not to go so far as to even cache glColor. I would love to hear from someone in the know about what should be cached, and/or avoided in terms of state changes. Everything, including glColor? Or just the low-hanging fruit like texture binds?

I suspect that this doesn't really hit at what the OP's performance problems probably are though...

A lot of state changes are not much more than setting a flag or a variable, but every time you call into the GLES side of things, besides the actual work that needs to be done, there are other things going on, like locking the context and similar bookkeeping, none of which you have to do on your end. So in my case, I pretty much shadow just about everything.
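
A minimal sketch of what shadowing one piece of state could look like, here the current color. The function driver_set_color is a hypothetical stand-in for the real call (such as glColor4f); it only counts invocations so the effect is visible without a GL context. This is an illustration, not the wrapper anyone in this thread actually uses.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the real GL call (e.g. glColor4f).
   It just counts how often the "driver" is reached. */
static int driver_calls = 0;
static void driver_set_color(float r, float g, float b, float a) {
    (void)r; (void)g; (void)b; (void)a;
    driver_calls++;
}

/* Shadow copy of the current color. `valid` stays false until the first
   set, so the first call always reaches the driver. */
static struct { bool valid; float r, g, b, a; } shadow_color;

void set_color_cached(float r, float g, float b, float a) {
    /* NaN never compares equal to itself and would defeat the cache. */
    assert(r == r && g == g && b == b && a == a);
    if (shadow_color.valid &&
        shadow_color.r == r && shadow_color.g == g &&
        shadow_color.b == b && shadow_color.a == a) {
        return;  /* redundant change: skip the call entirely */
    }
    driver_set_color(r, g, b, a);
    shadow_color.valid = true;
    shadow_color.r = r; shadow_color.g = g;
    shadow_color.b = b; shadow_color.a = a;
}
```

Setting the same color 300 times in a row reaches the stand-in driver call once. The shadow is only correct while every change goes through the wrapper; anything that alters the state behind its back means the shadow has to be invalidated (here, by resetting `valid` to false).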
Sage
Posts: 1,232
Joined: 2002.10
Post: #6
wademcgillis Wrote:Any tips?

A "tip" is to gain a better understanding of how the graphics pipeline works, and what impact each of your rendering commands has.

Consider each of the following necessary stages in "display a few textures":

1) allocate memory on the CPU for texel data
2) put an image into that memory
3) pass the memory to GL
4) set rendering state
5) draw geometry
6) present results

Those are the minimal steps your application must perform. Where's the bottleneck?
You have to consider each step, and that brings up additional sub-steps.

For example, maybe step 2) is the bottleneck:
2a) open a compressed image file (.png, .jpg, .pdf) on disk
2b) create a CG rendering context from the memory allocated in 1)
2c) set CG rendering state
2d) render the image (reading it from disk, decompressing it into linear scanlines)

Accessing disk, decompressing data, and rasterizing it on the CPU can be a slow operation. If you are doing this inside your rendering loop, that's going to be a bottleneck.

There is also more happening inside the GL, between 5) and 6):

5a) evaluate state modifications
5b) prepare any resources required to render
5c) submit geometry
5d) transform, clip, (tile) geometry
5e) rasterize primitives
5f) color all (visible) fragments (sampling texels, modulating, etc)
5g) write results to framebuffer
5h) enqueue the frame to the display

In a complex application, any of these could be the bottleneck. There are various techniques (spelled out in the ES Programming Guide and in the PowerVR SDK) to optimize each stage, but that doesn't help if you don't know where the bottleneck is.

Your description, "display a few textures at a time" is too vague for anyone to make a reasonable guess at where the bottleneck is.

So here's another "tip": be specific. How many textures? How big are they? How are you loading them? How are you drawing them? What do you expect the performance to be?

Again, for example: if you're drawing 20 photo thumbnails, which are each 64x64 RGBA8, and are static content, drawn with two triangles each, then you can readily calculate the memory footprint and bandwidth requirements of this scene. You can also make a reasonable expectation for the performance, like "it should be as fast as the iPhone Photo Album browser and scroll at 60fps".

If you have 10 textures, but they are all 1024x1024, and you want to use half of them as bumpmaps and wrap them around a 20,000 vertex animated model, then your scene requirements are quite different.
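
The arithmetic for the two scenes above can be sketched as follows. The function names are made up for this example, and the model is deliberately crude: uncompressed textures, no mipmaps, 4 bytes per texel for RGBA8 (also assumed for the 1024x1024 case), and every texel read exactly once per frame.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes needed for `count` uncompressed textures without mipmaps. */
size_t texture_bytes(size_t count, size_t width, size_t height,
                     size_t bytes_per_texel) {
    assert(bytes_per_texel > 0);
    return count * width * height * bytes_per_texel;
}

/* Bytes per second if every texel is read exactly once per frame.
   This ignores texture caching, filtering and overdraw. */
size_t texel_read_bytes_per_second(size_t footprint_bytes, size_t fps) {
    return footprint_bytes * fps;
}
```

Under these assumptions, the 20 thumbnails take texture_bytes(20, 64, 64, 4) = 327,680 bytes (320 KiB), or about 19.7 MB/s of texel reads at 60fps. The ten 1024x1024 textures take texture_bytes(10, 1024, 1024, 4) = 41,943,040 bytes (40 MiB), which is 128 times as much.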
Moderator
Posts: 3,572
Joined: 2003.06
Post: #7
warmi Wrote:... I pretty much shadow just about everything.

Well, heck, since there doesn't seem to be any disagreement about that, then I guess I'm going to start doing that myself! Thanks for the input warmi :)
Member
Posts: 446
Joined: 2002.09
Post: #8
AnotherJake Wrote:I would love to hear from someone in the know about what should be cached, and/or avoided in terms of state changes. Everything, including glColor? Or just the low-hanging fruit like texture binds?
While prototyping an iPhone game last year I had a brute force "draw sprite" function that set the state (without caching) and drew a single quad. When drawing a few hundred same-coloured sprites it would drop to around 30FPS, but when I set glColor only once for the whole batch the scene rendered at 60FPS. This was just a prototype and pretty much a worst-case rendering pipeline but I was surprised that just getting rid of those redundant glColor calls made that big a difference. Since then I've been caching texture binds, colour, blending, texturing, texture environment and a few other often-used states.

I wouldn't rely just on caching to avoid redundant state changes though. You still gotta use texture atlases and render by texture and/or state as much as possible. My state-caching wrapper functions are more a catch-all for when I'm feeling paranoid or lazy.
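
For the atlas side of this, here is a minimal sketch of looking up texture coordinates for one sprite in an atlas. It assumes the simplest possible layout, a uniform grid of equally sized cells, which is an assumption for illustration only; UVRect and atlas_cell_uv are made-up names.

```c
#include <assert.h>

typedef struct { float u0, v0, u1, v1; } UVRect;

/* Texture coordinates of cell `index` in an atlas laid out as a uniform
   grid of cols x rows cells, numbered left to right starting with the
   row at v = 0. */
UVRect atlas_cell_uv(int index, int cols, int rows) {
    assert(cols > 0 && rows > 0);
    assert(index >= 0 && index < cols * rows);
    int col = index % cols;
    int row = index / cols;
    UVRect r;
    r.u0 = (float)col / (float)cols;
    r.v0 = (float)row / (float)rows;
    r.u1 = (float)(col + 1) / (float)cols;
    r.v1 = (float)(row + 1) / (float)rows;
    return r;
}
```

In a 4x4 atlas, cell 5 covers u and v from 0.25 to 0.5. Every sprite drawn from the atlas uses the same texture, so a whole batch needs only one bind and differs only in the coordinates written into the vertex data.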

AnotherJake Wrote:I suspect that this doesn't really hit at what the OP's performance problems probably are though...
Agreed. Need more details/code - there's a kajillion ways to wreck your framerate.
Moderator
Posts: 3,572
Joined: 2003.06
Post: #9
Frank C. Wrote:While prototyping an iPhone game last year I had a brute force "draw sprite" function that set the state (without caching) and drew a single quad. When drawing a few hundred same-coloured sprites it would drop to around 30FPS, but when I set glColor only once for the whole batch the scene rendered at 60FPS. This was just a prototype and pretty much a worst-case rendering pipeline but I was surprised that just getting rid of those redundant glColor calls made that big a difference. Since then I've been caching texture binds, colour, blending, texturing, texture environment and a few other often-used states. ...

Interesting. That's a pretty extreme example of what can be gained. More great input on the subject. Thanks Frank :)
Apprentice
Posts: 14
Joined: 2010.01
Post: #10
Well, blitting seems to have solved my problems, but they won't be solved for long. In many 2D games, the background moves, right? In the case of an iPhone, the entire 320x480 screen will change every step. This results in severe lag, since you are unable to blit when the entire screen changes.



Could the lag be caused because I'm not using a real device? Emulators are often slower than their real life counterparts.
Nibbie
Posts: 4
Joined: 2009.06
Post: #11
Actually, in this case it's the other way around: the emulator is much faster than the real device.

The emulator uses the resources of your computer, including its graphics card. Make sure you test on a real device, otherwise you may have some nasty surprises once you try running it on a device.
Apprentice
Posts: 14
Joined: 2010.01
Post: #12
fbronner Wrote:Actually, in this case it's the other way around: the emulator is much faster than the real device.

The emulator uses the resources of your computer, including its graphics card. Make sure you test on a real device, otherwise you may have some nasty surprises once you try running it on a device.

Wait, so if a stupid little game of mine sucks when displaying a 320x480 background and one moving 32x32 sprite, it will suck more on a real device?

How are games even playable on an iPhone?!?

edit:
Tap Tap Revolution and Doodle Jump and Super Monkey Ball all have moving graphics in every frame. How do they do it?

edit2:
http://www.wademcgillis.com/downloads/iPhoneApp.zip
In ES1Render.mm, change the value of BLITTING and you'll see what I mean. With a game that has a moving view, you can't blit. So how do iPhone games do it?
Member
Posts: 166
Joined: 2009.04
Post: #13
fbronner Wrote:Actually, in this case it's the other way around: the emulator is much faster than the real device.

The emulator uses the resources of your computer, including its graphics card. Make sure you test on a real device, otherwise you may have some nasty surprises once you try running it on a device.

Depends. For CPU-bound code it will obviously be faster, but in terms of GLES rendering the device often tends to be much faster, mostly because the emulator is not using the graphics card at all. It is a software emulator.
Apprentice
Posts: 14
Joined: 2010.01
Post: #14
warmi Wrote:Depends. For CPU-bound code it will obviously be faster, but in terms of GLES rendering the device often tends to be much faster, mostly because the emulator is not using the graphics card at all. It is a software emulator.

Yeah. My graphics card is horrible. It's an Intel GMA, and my processor is a 1.6 GHz Atom.

/\ I hope I don't get banned for that post.

Or maybe you guys won't help me anymore...
Sage
Posts: 1,482
Joined: 2002.09
Post: #15
In a similar example to Frank's, I had a simple asteroids game that I stress tested by drawing about 1000 asteroids at once. I was able to more than double the framerate by binding the texture only once, at the beginning of drawing the asteroid sprite batch, instead of once for each sprite.

I've never suspected glColor*() to be expensive, though I generally only have it set to white anyway.

Scott Lembcke - Howling Moon Software
Author of Chipmunk Physics - A fast and simple rigid body physics library in C.
Thread Closed