OpenGL performance

Jesse
Unregistered
 
Post: #1
Hi all,

I'm posting to ask for some general advice on how to set up my OpenGL engine for optimal performance. I know that the only way to know for sure what approach is fastest is to code and profile it. Unfortunately, as a hobbyist game programmer (at least for the time being), I don't really have time to try a lot of different implementations. Since what rendering method I use will influence my data structures at the lowest level, I'm trying to get an idea of which direction I should go before I do too much more work on my engine.

Currently I'm using non-interleaved vertex arrays. Each primitive (Bezier patch, mesh model, etc.) has its own set of arrays and renders itself with a call to glDrawElements(GL_TRIANGLES, etc.). I'm not sorting by renderstate yet, but will implement a render tree in the future.

Here are some of the questions I have. Any input on any of these issues would be greatly appreciated.

1. How much trouble should I go to to use triangle strips? It seems that many primitives (such as Bezier patches) would require a lot of strips to render (unless degenerate tris are used). Is it better to make one call and draw tris, or several calls and draw tristrips?

2. Should I be using an OpenGL extension for static geometry? I've read about vertex buffer objects and compiled vertex arrays, but haven't implemented them. What's the best way to go?

3. What exactly constitutes static and dynamic geometry? I assume an object which never moves or changes shape would be static, and the skin on a skeletal model, for example, would be dynamic. What about geometry which doesn't change position or shape locally (such as a spaceship model), but moves about the world? I assume that would be static and you would just call the display list/vertex buffer/whatever after setting up your transformation matrix, but I'm not sure.

4. Finally, I know it's important to batch lots of tris and send them to the card together. Correct me if I'm wrong, but I've gathered that the Quake 3 engine actually builds a vertex buffer each frame from the visible faces and renders that, rather than rendering each face individually. It obviously would take some time to build such a buffer on a per-frame basis, but is the time saved in API calls and setup worth it?
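
For question 1, the degenerate-triangle idea can be sketched roughly as follows. This is only an illustration, assuming a tessellated patch laid out as a regular cols x rows grid of vertices numbered row by row; the function names are hypothetical and not part of OpenGL. Each row of quads becomes one strip, and consecutive rows are stitched together by repeating two indices, so the whole patch can go out in a single glDrawElements(GL_TRIANGLE_STRIP, ...) call.

```c
#include <assert.h>
#include <stddef.h>

/* Number of indices needed to draw a cols x rows vertex grid as ONE strip:
   2*cols per row of quads, plus 2 stitching indices between adjacent rows. */
size_t grid_strip_index_count(size_t cols, size_t rows)
{
    assert(cols >= 2 && rows >= 2);
    return (rows - 1) * (2 * cols) + (rows - 2) * 2;
}

/* Fill `out` with strip indices for the grid; vertex (r, c) has index r*cols + c.
   `out` must have room for grid_strip_index_count(cols, rows) entries.
   Returns the number of indices written. */
size_t build_grid_strip(unsigned short *out, size_t cols, size_t rows)
{
    size_t n = 0;

    assert(out != NULL);
    assert(cols >= 2 && rows >= 2);
    assert(cols * rows <= 65536);        /* indices must fit in 16 bits */

    for (size_t r = 0; r + 1 < rows; ++r) {
        if (r > 0) {
            /* Stitch: repeat the last index of the previous row and the first
               index of this row. The resulting triangles have zero area and
               produce no pixels. Adding two indices keeps the winding parity. */
            out[n] = out[n - 1];
            n++;
            out[n++] = (unsigned short)(r * cols);
        }
        for (size_t c = 0; c < cols; ++c) {
            out[n++] = (unsigned short)(r * cols + c);        /* upper vertex */
            out[n++] = (unsigned short)((r + 1) * cols + c);  /* lower vertex */
        }
    }
    return n;
}
```

For a 3x3 grid this writes 14 indices, where a plain GL_TRIANGLES index list for the same four quads needs 24. Whether the smaller index list is actually faster than plain triangles is something only profiling on the target hardware can answer.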

I know these questions are very general, and in practice some of the answers are probably different on different systems. Unfortunately, though, I don't have time to try every option, so I'm hoping that some of you who have tried some of these methods might be able to point me down the right path.

Thanks very much for taking the time to read this.

Jesse
Luminary
Posts: 5,143
Joined: 2002.04
Post: #2
I think you understand that no matter what gets said this topic can't really be done the justice it deserves...

1. -- don't bother with triangle strips. Quake III doesn't :)

2. -- display lists are the easiest. Apple doesn't support the vendor-neutral ARB_vertex_buffer_object yet, and APPLE_vao/var/fence/whatever seem to be pretty buggy still. Display lists will probably give you a good speedup on 10.2; they're not much help at all on 10.1.

3. -- you've basically got the right idea, although there are extensions which allow you to do skinning in hardware and therefore treat even an animated model as static geometry. The big "dynamic geometry" thing is something like a level, where you don't know which parts you're going to draw from frame to frame, but can't afford to draw the whole thing.

4. -- very important to avoid immediate mode (glBegin) if at all possible.
Jesse
Unregistered
 
Post: #3
Thanks, onesadcookie - your response is very helpful.

Quote:The big "dynamic geometry" thing is something like a level, where you don't know which parts you're going to draw from frame to frame, but can't afford to draw the whole thing.
Could you clarify this? Do you mean that level geometry should be considered dynamic rather than static? If so, what would constitute static geometry? I'm sure you were clear - I'm just missing something.

Quote:-- very important to avoid immediate mode (glBegin) if at all possible.
By this you just mean

glBegin(GL_TRIANGLES);
glVertex3fv(...);
glEnd();

Right? But I assume glDrawElements() is ok?

Thanks again, and any thoughts or opinions from others are welcome. And yes, it's a big topic that can't really be covered in one thread :) But whatever knowledge I can gain here will still be very helpful.
Luminary
Posts: 5,143
Joined: 2002.04
Post: #4
Level geometry will usually be considered dynamic, since there's generally too much for it to be feasible to store in VRAM, only a very small amount of the total will be drawn each frame, and exactly what is drawn will change each frame. This does depend on your level geometry management scheme, though...

Static geometry will be smaller objects like ammo pickups, statues, torches, possibly even animated objects like players depending on use of extensions.

DrawElements is all good.
Feanor
Unregistered
 
Post: #5
Why is the definition of static/dynamic getting inverted from what the words mean in the real world? I thought static was the level, because it was static relative to the world and to static light sources, so you could insert it into a static tree, like a BSP. You can't put player models into a BSP, because they move around, and you can't light them beforehand for the same reason, so that makes them dynamic. Now "static" means what you can keep in VRAM? That's damn confusing.

Re: display lists. Geoff Stahl, I think, said that they are implemented using VAR/VAO or some other extension (which is why they are fast in 10.2), meaning that whatever bugs are in those should show up in display lists. But don't quote me on it.
Member
Posts: 177
Joined: 2002.08
Post: #6
UT2K3's definition of "static" level geometry refers to environment sections which may be instantiated multiple times in the level. Those can easily be stored in VRAM.
Luminary
Posts: 5,143
Joined: 2002.04
Post: #7
The meaning of the words hasn't changed, only the context in which they're applied.

The context we're talking about here is whether the contents of a vertex array would change each time it's submitted to draw. Objects almost certainly won't change. Characters will if skinning is done in software, won't if it's done in hardware. Arrays belonging to the level will almost certainly change each frame as a technique for dealing with the huge amounts of data.
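
As a rough illustration of level arrays that change each frame: the vertex data can stay put while only the index list is rebuilt from whatever is visible, and the result is then submitted with a single draw call. The sketch below assumes a hypothetical Face struct and a per-face visibility flag produced by some earlier culling step; neither is taken from any particular engine, and this is not a claim about how Quake 3 does it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical face record: three indices into the level's shared vertex array. */
typedef struct {
    unsigned int v[3];
} Face;

/* Append the indices of every visible face to `out`, which must have room for
   3 * face_count entries. Returns the number of indices written this frame. */
size_t batch_visible_faces(const Face *faces, const unsigned char *visible,
                           size_t face_count, unsigned int *out)
{
    size_t n = 0;

    assert(faces != NULL && visible != NULL && out != NULL);

    for (size_t i = 0; i < face_count; ++i) {
        if (!visible[i])
            continue;                 /* culled this frame: contributes nothing */
        out[n++] = faces[i].v[0];
        out[n++] = faces[i].v[1];
        out[n++] = faces[i].v[2];
    }
    return n;
}

/* With the vertex arrays already bound, the batch would then be drawn as:
       glDrawElements(GL_TRIANGLES, (GLsizei)n, GL_UNSIGNED_INT, out);
   i.e. one call per frame instead of one call per face. */
```

The per-frame cost is one pass over the face list and three writes per visible face; in a real renderer faces would also need to be grouped by texture or other render state, which this sketch ignores.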

Display lists may be implemented in terms of VAO on 10.2 but I know of no bugs with either VAO or display lists. VAR is a completely different story, and is still unusable on 10.2.6 as far as I can tell...
Member
Posts: 79
Joined: 2002.08
Post: #8
Quote:Originally posted by OneSadCookie
VAR is a completely different story, and is still unusable on 10.2.6 as far as I can tell...


I wouldn't say that. As far as I can tell it works fine. The trick is to get it set up right. It has taken some experimentation, mostly because there are no useful docs to go by, but it's working great now. Recently I included the index array in the range too and it's working fine using glDrawElementArrayAPPLE.

Using VAR is around 3 - 5 times faster in my case so it's definitely worth it.

KenD

CodeBlender Software - http://www.codeblender.com
Luminary
Posts: 5,143
Joined: 2002.04
Post: #9
It certainly seems fine on the Gf4Ti, but even Apple's vertex performance VAR sample doesn't work on the Radeon 9800.

Just be careful :)
Member
Posts: 79
Joined: 2002.08
Post: #10
Yeah, I can imagine it's broken on some cards. The 8500 seems to handle it ok though.

Doesn't it seem like it has become the norm that OpenGL is broken in one way or another on cards lately? Now we don't just have to worry about whether a certain extension is available; we have to worry about whether it actually works too.

One would think it should not have to take a year or so to get things fixed and working properly.

KenD

CodeBlender Software - http://www.codeblender.com
Mars_999
Unregistered
 
Post: #11
I guess I will throw my hat in the ring. I have just completed coding my terrain engine with VBOs and it made a huge difference. I went from 30fps to 150fps overall; going from vertex arrays to VBOs took me from around 90fps to 150fps.
(This is on a PC; the Mac doesn't have VBOs yet.)


The whole static/dynamic terminology is quite simple. Display lists will only handle static geometry. VBOs, VAR, or whatever the vendor decided to call it (VAR is nVidia-specific, I think?) are usually used for dynamic geometry but can be used for static geometry too. Static geometry is anything that will not be changed (reconfigured); dynamic geometry is anything that always changes, or can be changed, from its original form.

As of right now I have my terrain, which is static, in a VBO. I don't plan on changing anything in my terrain, but I wanted the speed.

I found display lists did help performance, but not as much as vertex arrays or VBOs. Granted, I was calling glBegin()/glEnd() a billion times, so my engine was running into CPU limitations. A map size of 256x256 with many function calls takes a toll on the CPU.

If you really want speed, VBOs, VAR, display lists, etc. are all fine, but you really need to implement octrees or quadtrees with frustum culling and other methods of determining what should and shouldn't be rendered. I haven't implemented it myself yet, but from what I hear it increases your fps by many more times than VBOs would.
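
The per-node test that frustum culling performs on an octree or quadtree can be sketched like this. It is a minimal example, assuming each node carries an axis-aligned bounding box and that the frustum is given as planes whose normals point inward; the Plane and Aabb types and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Plane a*x + b*y + c*z + d = 0; points with a*x + b*y + c*z + d >= 0 are inside. */
typedef struct { float a, b, c, d; } Plane;

/* Axis-aligned bounding box of a tree node. */
typedef struct { float min[3], max[3]; } Aabb;

/* Returns 1 if the whole box lies behind the plane (entirely outside it). */
int aabb_outside_plane(const Aabb *box, const Plane *p)
{
    /* Pick the box corner that lies furthest along the plane normal.
       If even that corner is behind the plane, every corner is. */
    float x = (p->a >= 0.0f) ? box->max[0] : box->min[0];
    float y = (p->b >= 0.0f) ? box->max[1] : box->min[1];
    float z = (p->c >= 0.0f) ? box->max[2] : box->min[2];
    return (p->a * x + p->b * y + p->c * z + p->d) < 0.0f;
}

/* Returns 0 if the box is certainly invisible, 1 if it may be visible.
   The test is conservative: a box near a frustum corner can pass even
   though it is just outside, which costs some wasted drawing but never
   drops visible geometry. */
int aabb_in_frustum(const Aabb *box, const Plane *planes, size_t plane_count)
{
    assert(box != NULL && planes != NULL);
    for (size_t i = 0; i < plane_count; ++i) {
        if (aabb_outside_plane(box, &planes[i]))
            return 0;
    }
    return 1;
}
```

Walking the tree from the root and skipping any node for which aabb_in_frustum returns 0 rejects that node's entire subtree with at most one test per plane, which is where the large savings come from.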
Member
Posts: 177
Joined: 2002.08
Post: #12
I suspect the 9800 problem is in its drivers... I'm getting artifacts on my 9800 that disappear if I so much as drag the window to my other monitor.