iDevGames Forums
Roots of Display List Overhead - Printable Version


Roots of Display List Overhead - aarku - Apr 2, 2005 08:32 PM

Why is there so much overhead in using display lists? If a display list has been created and the system has a lot of VRAM at its disposal, what is left to generate overhead for glCallList()? The graphics card should have all that it needs sitting right there, right?

Here are the only discussions I was able to find about it:
http://developer.apple.com/graphicsimaging/opengl/optimizingdata.html
http://www.idevgames.com/forum/showthread.php?t=7992

Can anyone point me to some reading material to learn about this more in depth?

Thanks
-Jon


Roots of Display List Overhead - Dan Potter - Apr 3, 2005 05:57 PM

A friend at the local IGDA group told me that there are issues in OSX with display lists where it will sometimes not actually cache the list like you expect, and it can turn out to be slower than just sending the primitives again. One of those links alludes to that. Wish I could remember exactly what it was. We've got a meeting coming up next week, so I can ask him about it again.


Roots of Display List Overhead - Dan Potter - Apr 7, 2005 05:39 PM

Ok, here is the deal... I asked my friend about that, and he said the primary thing that causes slow display lists (at least in theory) is nested display lists. Like if you make list A, and then while making list B, you call list A. He said that OpenGL display lists are implemented in OSX by using object buffers, where nesting doesn't really make a lot of sense. So.. it ends up basically replaying the raw primitives instead of using the nice cached data on the video card.

He also told me that after some tests he's not sure he believes all that. But that was the issue in theory.

HTH...


Roots of Display List Overhead - FCCovett - Apr 7, 2005 05:45 PM

I've created both display lists and VARs to hold my models. Rendering with VARs is almost twice as fast as with the display lists in my case.


Roots of Display List Overhead - Dan Potter - Apr 7, 2005 06:31 PM

My friend also said that. Sorry, forgot to mention it.. he converted their stuff from display lists to vertex/object buffers and it was a massive improvement. Apparently it's also possible to do nice things like changing only a few vertices and having the video card pull in just that new data automatically, instead of having to re-upload the whole thing.


Roots of Display List Overhead - spaceb - Apr 11, 2005 12:49 PM

Interesting! I like vertex arrays better than display lists anyway, since their contents aren't set in stone once they're created. Now I know there's absolutely no reason to use display lists...


Roots of Display List Overhead - phydeaux - Apr 11, 2005 04:49 PM

This isn't completely related, but it's something I came across recently that I thought would be useful to mention:

Something that's important to remember with large arrays of vertex positions or other vertex data is that the card will have caching problems if your triangles or triangle strips access vertices from that array in a seemingly random order. It can be worth duplicating data to make sure your arrays are accessed in more or less linear order. The reason I mention this is that I was debugging some vertex array code and wrote a version of the same thing that sent immediate mode commands to a display list, and the result was actually faster because all the data was suddenly in order.
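
A minimal sketch of the "duplicate data so access is linear" idea, assuming indexed geometry with a simple position-only vertex. The Vec3 type and the flatten_indexed name are hypothetical, made up for illustration; nothing here comes from the poster's actual code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical position-only vertex; a real model would carry more attributes. */
typedef struct { float x, y, z; } Vec3;

/* Copy vertices into `out` in the order the indices reference them, so that
   drawing can walk `out` from front to back. Vertices shared by several
   triangles are duplicated. `out` must have room for index_count entries. */
void flatten_indexed(const Vec3 *vertices, size_t vertex_count,
                     const unsigned *indices, size_t index_count,
                     Vec3 *out)
{
    for (size_t i = 0; i < index_count; ++i) {
        assert(indices[i] < vertex_count);  /* guard against bad indices */
        out[i] = vertices[indices[i]];
    }
}
```

The flattened array can then be drawn without an index list (for example with glDrawArrays). The cost is memory: every shared vertex is stored once per use, so whether it pays off depends on the model and the card.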


Roots of Display List Overhead - lightbringer - Apr 11, 2005 08:06 PM

FCCovett Wrote: I've created both display lists and VARs to hold my models. Rendering with VARs is almost twice as fast as with the display lists in my case.

A couple of standard questions pertaining to this:
- How many vertices per DL/model are we talking about here?
- This is static geometry, right? (No building a new DL every few frames.)
- How many models are you rendering per frame? A guess is fine, as long as it's close. (glCallList() can start to hurt after a while; glCallLists() is quite a bit better if you can use it.)

Thanks.


Roots of Display List Overhead - FCCovett - Apr 11, 2005 11:24 PM

I tried that with low-poly (about 100 polygons) and high-poly (about 10k) models, just a couple of models at a time. The models are made up of sub-groups (2 or 3 in some cases), and there's one display list for each group. The results were pretty consistent. I didn't build display lists while rendering. Both the vertex arrays and the display lists were created when the models were loaded.


Roots of Display List Overhead - OneSadCookie - Apr 11, 2005 11:44 PM

Did you make sure to follow Apple's tips on building efficient display lists?


Roots of Display List Overhead - FCCovett - Apr 12, 2005 12:24 PM

If you are asking me, I would say I am following 60% to 80% of what that paper says.

"First, you'll get the best performance when using more than 16 vertices per list."

The majority of the display lists have more than 16 vertices, except when the model itself is very simple and has a low poly count.


"Second, in order to make the driver's job easier, provide consistent data for each vertex in a list. For example, provide normal, color, and vertex data for each vertex rather than leaving out color or normal for some vertices."

Done.


"Third, it is very important that you provide all attributes that may be dynamic within a single draw command between glBegin and the first glVertex. For example: glBegin, glVertex, glColor, glVertex cannot be optimized but glBegin, glColor, glVertex, glColor, glVertex can be optimized."

Also done, but I am leaving color out of the display lists.
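
To make the third tip concrete, here is a rough sketch of a checker for the rule as quoted: every attribute kind used anywhere between glBegin and glEnd should already have been set before the first glVertex. The Cmd enum and the function name are hypothetical, and this only models the quoted rule; it is not how Apple's display list optimizer actually works.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the calls issued between glBegin and glEnd. */
typedef enum { CMD_VERTEX, CMD_COLOR, CMD_NORMAL, CMD_TEXCOORD } Cmd;

/* Returns 1 if every attribute kind that appears in the sequence was also
   set before the first CMD_VERTEX, and 0 otherwise. */
int attributes_declared_up_front(const Cmd *cmds, size_t count)
{
    unsigned declared = 0;  /* bitmask of attribute kinds seen before the first vertex */
    int seen_vertex = 0;

    assert(cmds != NULL || count == 0);
    for (size_t i = 0; i < count; ++i) {
        if (cmds[i] == CMD_VERTEX) {
            seen_vertex = 1;
        } else if (!seen_vertex) {
            declared |= 1u << cmds[i];
        } else if (!(declared & (1u << cmds[i]))) {
            return 0;  /* this attribute first shows up after a vertex */
        }
    }
    return 1;
}
```

With the two sequences from the quote, vertex, color, vertex fails the check, while color, vertex, color, vertex passes.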


"Fourth, as with all OpenGL drawing, it is very desirable to have as few state changes as possible within a display list. If you can group together your GL_TRIANGLES, GL_QUADS, GL_POINTS, and GL_LINES of similar mode and state, the display list optimizer may be able to batch them together for more efficient drawing."

Also done, although I am using a different order (GL_POINTS, GL_LINES, GL_TRIANGLES, GL_QUADS). I am not sure the order matters though.


"Finally, decomposing strips and fans into triangles and quads can also allow for more behind the scenes display list optimization."

My models don't use strips or fans, so I guess this tip doesn't apply.
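
For anyone whose models do use strips, a minimal sketch of decomposing a triangle strip's index list into independent triangles might look like this. The function name is hypothetical, and it assumes the strip is given as a plain array of vertex indices.

```c
#include <assert.h>
#include <stddef.h>

/* Expand a triangle strip of `strip_count` indices into independent triangles.
   Writes 3 * (strip_count - 2) indices to `out` and returns that count.
   Returns 0 when the strip has fewer than 3 indices. */
size_t strip_to_triangles(const unsigned *strip, size_t strip_count, unsigned *out)
{
    size_t n = 0;

    if (strip_count < 3)
        return 0;
    assert(strip != NULL && out != NULL);

    for (size_t i = 0; i + 2 < strip_count; ++i) {
        if (i % 2 == 0) {
            out[n++] = strip[i];
            out[n++] = strip[i + 1];
            out[n++] = strip[i + 2];
        } else {
            /* Every second triangle swaps its first two indices so the
               winding order stays consistent across the whole strip. */
            out[n++] = strip[i + 1];
            out[n++] = strip[i];
            out[n++] = strip[i + 2];
        }
    }
    return n;
}
```

The output can be drawn as GL_TRIANGLES. A strip of indices 0, 1, 2, 3 becomes the two triangles (0, 1, 2) and (2, 1, 3).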