OpenGL Cube drawing: VBOs, instanced geometry, octrees
OpenGL Cube drawing: VBOs, instanced geometry, octrees - Symbol$ - Dec 23, 2011 08:23 AM
I was considering making a game all out of cubes, I guess like Minecraft. I'd rather not get into the debate over lack of originality. You could argue Halo copied Doom, but they're both great games, each one offering something original.
Anyway, I've considered VBOs. Are they really necessary? According to Apple:
'Each time glDrawElements is called, the data is retransmitted to the graphics hardware to be rendered. If the data did not change, those additional copies are unnecessary. To avoid this, your application should store its geometry in a vertex buffer object (VBO). Data stored in a vertex buffer object is owned by OpenGL ES and may* be cached by the hardware or driver to improve performance.'
Therefore, couldn't you just bypass rendering altogether if there is no movement by the player? If the player moves, surely the VBOs _would_ need to be passed to the hardware again?
What about instanced geometry or octrees? I've also heard display lists are useful - would it be beneficial to implement them for this type of project?
It would be especially useful to hear any tips relating to OpenGL ES 2 specifically, although I'd also be keen to hear how best practice differs from both OpenGL ES 1 and fully fledged desktop Mac/PC OpenGL, particularly the latter since I'm building my level editor in the full-scale API.
RE: OpenGL Cube drawing: VBOs, instanced geometry, octrees - Skorche - Dec 23, 2011 10:38 AM
Yes, you should use VBOs (especially on ES). The terrain data will only change when a block is modified. So you don't want to resend all of it every frame. Especially since there will be a lot of it. Also, the terrain geometry doesn't change when the player/camera moves, only the matrix used for drawing it.
I dunno if anybody uses display lists anymore or if they even have good performance for that much data.
ES 1 is like old, fixed-function, pre-shader OpenGL. ES 2 is strictly shader based, basically throwing away all of the fixed-function stuff, and so it's pretty different from ES 1. Desktop GL basically provides both modes. My advice at this point is to ignore ES 1 entirely. It's easy to make something work on ES 1 and desktop GL, or ES 2 and desktop GL, but not ES 1 and 2 without writing two separate renderers.
RE: OpenGL Cube drawing: VBOs, instanced geometry, octrees - Symbol$ - Dec 23, 2011 11:14 AM
Thanks for the reply Skorche, I appreciate the advice. Just wanna add that I am well impressed with Chipmunk.
I will use VBOs once I get a chance over the next few days. I didn't really want to start benchmarking when I already knew there were some very experienced OpenGL programmers on this forum.
Although I hate to diverge from iOS, I came across a tutorial called 'Voxel Fun' written on Android here: http://sites.google.com/site/drpaulthomasandroidstuff/Home/voxel-fun
In this, the scene is divided into 3 sets of planes, one for each axis (x, y and z), so there would be, for example, 6+6+6 = 18 planes for a 6x6x6 cube world. No source code is released, so I'm struggling to see how each plane is textured.
Is this more efficient than drawing 6x6x6 cubes even for a non-static world? And if so, would VBOs still be the best way to go, better than plain glDrawElements?
I'm also interested if anyone has had any bad experiences with VBOs or could highlight any potential shortcomings of using them.
RE: OpenGL Cube drawing: VBOs, instanced geometry, octrees - Skorche - Dec 23, 2011 12:03 PM
Well, their case is really specifically optimized for a small voxel space on a ridiculously slow GPU from the G1 Android phone. If you want to make a large world you *definitely* don't want to do that.
The GPU in the iPhone uses a tile-based deferred renderer. It splits the screen into little squares (like 32x32 or something like that). Then it finds all the triangles that will draw in the tile and raycasts against them to figure out which one will show up on the screen for each pixel. This way it only needs to draw each pixel once. This is important, because the GPU only has enough fill rate to draw each pixel on the screen 2-3 times per frame at 60 fps. This varies between models, but it's a small number for all of them.
With most traditional GPUs, triangles are drawn one by one. When you draw a far away triangle, then a closer one that draws over the top of it, the GPU will have to draw some of the pixels twice. Drawing pixels is expensive, because the number of them you need to draw adds up really fast. This is why you generally want to roughly sort things by distance when drawing them. When the GPU tries to draw behind something that has already been rendered to the screen, it can abort early without doing as much work.
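A minimal sketch of the "roughly sort by distance" idea in C, for opaque geometry only. The DrawItem struct, its fields and the function names are made up for illustration; a real renderer would sort whatever units it draws (chunks, meshes) by their distance to the camera.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical draw item: one batch of opaque geometry and its centre. */
typedef struct {
    float x, y, z;     /* centre of the batch in world space */
    float dist_sq;     /* squared distance to the camera, filled in below */
    int   id;          /* whatever handle the renderer uses for the batch */
} DrawItem;

/* qsort comparator: smaller squared distance first. */
static int by_distance(const void *a, const void *b)
{
    float da = ((const DrawItem *)a)->dist_sq;
    float db = ((const DrawItem *)b)->dist_sq;
    return (da > db) - (da < db);
}

/* Sort nearest-first so that farther pixels drawn later can fail the
   depth test early instead of being shaded and then overwritten. */
void sort_front_to_back(DrawItem *items, size_t count,
                        float cam_x, float cam_y, float cam_z)
{
    assert(items != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        float dx = items[i].x - cam_x;
        float dy = items[i].y - cam_y;
        float dz = items[i].z - cam_z;
        items[i].dist_sq = dx * dx + dy * dy + dz * dz;
    }
    qsort(items, count, sizeof items[0], by_distance);
}
```

Squared distances are enough for ordering, so no square root is needed. The sort only has to be rough, which is why sorting whole batches rather than individual triangles is usually good enough.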
Getting back to your first question. The reason why you don't want to use the textured plane approach is because enabling alpha testing or blending disables the ray trace that the GPU does, and the traditional sort order trick doesn't work very well as a replacement because it wasn't designed to operate that way. So if you want to draw 100 planes, you would need to draw every pixel on the screen ~100 times. Not good!
What you want to do is to check the squares between each pair of cubes. If both cubes exist or neither cube exists, don't output any geometry. If only one cube exists, spit out a pair of triangles using its surface color/texture. If possible, you'll want to combine the surface into triangle strips for performance, but that's a bit more complicated.
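Here is a rough C sketch of that pair check, not a definitive implementation. Assumptions that are not in the post: the grid is a flat N*N*N byte array where 0 means empty, cells outside the grid count as empty, and the Face record and build_faces name are hypothetical. Each cell is compared with its neighbour on the negative side of each axis, so every square between two cubes is visited exactly once.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define N 16   /* hypothetical grid edge length */

typedef struct {
    uint8_t x, y, z;   /* cell on the positive side of the face */
    uint8_t axis;      /* 0 = x, 1 = y, 2 = z: axis the face is perpendicular to */
    uint8_t block;     /* type of the one solid cube, selects colour/texture */
    uint8_t flipped;   /* 1 if the solid cube is on the negative side */
} Face;

/* Cells outside the grid count as empty (0). */
static uint8_t cell(const uint8_t *v, int x, int y, int z)
{
    if (x < 0 || y < 0 || z < 0 || x >= N || y >= N || z >= N) return 0;
    return v[(x * N + y) * N + z];
}

/* Writes one Face per exposed square and returns how many were written. */
size_t build_faces(const uint8_t *v, Face *out, size_t max_faces)
{
    static const int step[3][3] = { {1, 0, 0}, {0, 1, 0}, {0, 0, 1} };
    size_t count = 0;
    /* Loop to N inclusive so the far boundary squares are checked too. */
    for (int x = 0; x <= N; x++)
    for (int y = 0; y <= N; y++)
    for (int z = 0; z <= N; z++)
        for (int a = 0; a < 3; a++) {
            uint8_t here = cell(v, x, y, z);
            uint8_t prev = cell(v, x - step[a][0], y - step[a][1], z - step[a][2]);
            if ((here != 0) == (prev != 0)) continue;  /* both or neither: no face */
            assert(count < max_faces);
            out[count++] = (Face){ x, y, z, a, here ? here : prev, here == 0 };
        }
    return count;
}
```

Each Face becomes two triangles when it is written into the vertex buffer. A lone cube gives 6 faces, and two cubes side by side give 10 because the shared square is skipped.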
Keep in mind that whatever you do, you'll have to keep the triangle count down somehow. Expect to be able to draw a few tens of thousands of triangles per frame. Drawing voxels this way creates a *lot* of triangles.
Anyway long post is getting long. I don't think there are really any downsides to using VBOs. Worst case scenario they add like a dozen lines of code to your project. Also, thanks for the kind words about Chipmunk.
OpenGL Cubes World: VBOs, instanced geometry, octrees - Symbol$ - Dec 23, 2011 12:28 PM
Thanks again for your help, Skorche.
Sorry, this is a long post, but there's so much I want to know.
I have also heard that 'chunking' with a separate vertex array for each chunk of say 64x64x64 blocks would be a good approach, although I have yet to see any simple to understand code explaining this concept, only obfuscated snippets or complex frameworks that I have not yet understood.
I just dug up this post by Notch again that seems to support this other theory of chunking: http://notch.tumblr.com/post/3746989361/terrain-generation-part-1
Some clear, easy-to-understand examples in OpenGL using VBOs and chunking would be very useful at this stage.
I'm really impressed with Chipmunk - I'm in awe of the complexity of a physics engine that affords collisions, joints and other physical modelling that a framework like that can provide a game developer. I've really started to get to grips with using physics engines too - are there any plans for a 3D version? I also really like Crayon Ball and your other iPhone releases. I'd love to see a voxel engine combined with physics like this: http://www.youtube.com/watch?v=eQMBGLMtdFE&feature=player_embedded
With regards to the plane based model for voxels - I appreciate what you are saying about pixels being expensive but aren't 2D sprites with transparency in OpenGL games generally very quick on iPhone? I interpreted Dr Thomas' model to mean to draw a texture on each plane, surely that can't be too much of an overhead? Is it only the ray casting/lighting that adds to the overhead due to the iPhone renderer, and if so what are the limitations: would flat shading for example alleviate this issue?
Would you not say that the model's main shortcoming is the use of OpenGL ES 1 running on a relatively old µPU and that greater processing power in ES 2 can provide better functionality with VBOs - or is it that the main drawback is the lack of scalability of that model to a larger map that you mentioned? [EDIT] I just re-read what you said about alpha blending and it makes a lot more sense now, although once again I'm sure this isn't a problem for OpenGL sprites. Would 6x6x6 be OK but e.g. 100x100x16 be too many?
It's interesting to hear that about the iPhone renderer. On the subject of dividing and conquering, is there also a good way of using octrees, or otherwise dividing up the drawing work for a large cubic world, that fits the way the hardware natively renders a scene? I have heard this mentioned elsewhere.
If I understand your advice correctly, the main form of optimisation would be the comparison of adjacent cubes that you mention. So in brief terms, if any face is obscured, do not render it? Can you provide any examples or hints for an algorithm for this routine? To optimise further when objects are destroyed or added, is there a best way of splicing the world vertex arrays?
Conceptually, is it better to combine the terrain vertex buffer object with any moving objects' VBOs or to keep them separate?
On a more general level, how would you advise culling blocks outside of the view frustum? Also, what settings would you recommend for the viewing frustum pairs, x:y, top:bottom and near:far? What kind of draw distances can I expect in ES2 and how would you recommend approaching the culling for off-screen/too distant cubes?
Thanks for this very useful advice so far, it's likely to save me a lot of time. Sorry for so many questions, I just have a game in my head and want to make it asap! Again, I'd be interested to hear other opinions on this.
RE: OpenGL Cube drawing: VBOs, instanced geometry, octrees - Skorche - Dec 23, 2011 01:43 PM
Whew. Lots more questions. Let's see if I can hit them all.
Actually it's very easy to make a 2D game that uses virtually no CPU time, but that the GPU cannot render at a smooth framerate. Fill rate (the number of pixels that can be drawn to the screen in a second) is very limited on mobile GPUs. Something as simple as drawing 3 or 4 screen-size sprites on top of each other would murder your framerate. That is basically what the plane-with-textures method does. In the game from the article, there are only 18 layers to be drawn, and their total combined area in pixels isn't ever that big. If it was drawn from inside the cube, it would be a different story as you would have a lot of them covering the entire screen.
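To put rough numbers on that, here is a small C sketch that estimates average overdraw as total drawn area divided by screen area. The Layer struct and overdraw_factor are made up for illustration, and the sizes in the usage note are example figures, not measurements.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layer: size of its on-screen rectangle in pixels, after clipping. */
typedef struct { int width, height; } Layer;

/* Average number of times each screen pixel gets drawn in one frame. */
double overdraw_factor(const Layer *layers, size_t count, int screen_w, int screen_h)
{
    assert(screen_w > 0 && screen_h > 0);
    double drawn = 0.0;
    for (size_t i = 0; i < count; i++)
        drawn += (double)layers[i].width * layers[i].height;
    return drawn / ((double)screen_w * screen_h);
}
```

Assuming a 320x480 screen, four full-screen blended sprites give a factor of 4, already over the 2-3 times per pixel budget mentioned earlier. Eighteen blended layers that each cover only 100x100 pixels come to about 1.2, which is why the small cube in the article gets away with it.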
From what I've heard from people that programmed for the G1, you could really only draw maybe 1000 triangles per frame before the GPU was busy doing nothing but working on the geometry. So that probably had a lot to do with it as well.
There isn't a lot you can do to take advantage of the tile renderer, really.
The algorithm is pretty simple really: just loop over the voxels and compare each one with its neighbours to the left, top and front. You don't have to do right, bottom and back because the voxels there would have already compared themselves to the current one. To splice the vertex arrays, I'd probably keep a buffer for each 16x16x16 chunk (or something like that, whatever the leaves in your octree are). When you add a face, add it to the end of the buffer. When you remove a face, replace it with the last one in the buffer. Keeping it in a more triangle-strip-friendly order would probably be good, but very hard. ElMonkey was working on a voxel renderer a couple of months back, maybe he has some better advice.
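A minimal C sketch of the per-chunk buffer with add-at-end and replace-with-last removal. Everything named here (ChunkBuffer, Face, MAX_FACES, the dirty flag, chunk_coord) is a hypothetical layout for illustration; the upload to the chunk's VBO is left out and only flagged.

```c
#include <assert.h>
#include <stddef.h>

#define CHUNK 16          /* chunk edge length in blocks */
#define MAX_FACES 4096    /* hypothetical capacity per chunk */

typedef struct { int x, y, z, axis; } Face;   /* hypothetical face record */

typedef struct {
    Face   faces[MAX_FACES];
    size_t count;
    int    dirty;         /* set when the data must be re-uploaded to the chunk's VBO */
} ChunkBuffer;

/* Floor division, so negative world coordinates land in the right chunk. */
int chunk_coord(int world)
{
    return (world >= 0) ? world / CHUNK : -((-world + CHUNK - 1) / CHUNK);
}

/* Append a face at the end of the buffer; returns its index. */
size_t chunk_add_face(ChunkBuffer *c, Face f)
{
    assert(c->count < MAX_FACES);
    c->faces[c->count] = f;
    c->dirty = 1;
    return c->count++;
}

/* Remove a face by overwriting it with the last one. O(1), but order is not kept. */
void chunk_remove_face(ChunkBuffer *c, size_t index)
{
    assert(index < c->count);
    c->faces[index] = c->faces[--c->count];
    c->dirty = 1;
}
```

Because removal moves the last face into the freed slot, any index stored elsewhere for that moved face has to be updated. Only chunks with the dirty flag set need their buffer sent to the GPU again, so editing one block touches one small buffer instead of the whole world.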
Frustum culling shouldn't be too hard. You can recursively check your octree nodes against the frustum. Some sort of occlusion checking would be good too, but that's way out of my expertise.
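The recursive check might look something like this in C. It is a sketch under assumptions: the frustum is given as planes with normals pointing inward, each octree node is an axis-aligned cube, and the OctreeNode layout, chunk_id payload and collect_visible name are invented for the example. Extracting the planes from the projection matrix is not shown.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { float nx, ny, nz, d; } Plane;   /* inside when n.p + d >= 0 */

typedef struct OctreeNode {
    float cx, cy, cz, half;          /* cube centre and half edge length */
    struct OctreeNode *child[8];     /* all NULL for a leaf */
    int chunk_id;                    /* hypothetical payload on leaves */
} OctreeNode;

/* Returns 0 if the cube lies entirely on the outside of any one plane. */
static int cube_touches_frustum(const OctreeNode *n, const Plane *planes, size_t plane_count)
{
    for (size_t i = 0; i < plane_count; i++) {
        const Plane *p = &planes[i];
        /* Corner of the cube that is farthest along the plane normal. */
        float px = n->cx + (p->nx >= 0 ? n->half : -n->half);
        float py = n->cy + (p->ny >= 0 ? n->half : -n->half);
        float pz = n->cz + (p->nz >= 0 ? n->half : -n->half);
        if (p->nx * px + p->ny * py + p->nz * pz + p->d < 0) return 0;
    }
    return 1;
}

/* Appends the chunk_id of every leaf that may be visible; returns the new count. */
size_t collect_visible(const OctreeNode *n, const Plane *planes, size_t plane_count,
                       int *out_ids, size_t max_ids, size_t count)
{
    assert(planes != NULL && out_ids != NULL);
    if (n == NULL || !cube_touches_frustum(n, planes, plane_count)) return count;
    int is_leaf = 1;
    for (int i = 0; i < 8; i++) {
        if (n->child[i] != NULL) {
            is_leaf = 0;
            count = collect_visible(n->child[i], planes, plane_count, out_ids, max_ids, count);
        }
    }
    if (is_leaf) {
        assert(count < max_ids);
        out_ids[count++] = n->chunk_id;
    }
    return count;
}
```

Rejecting a node skips its whole subtree, which is where the saving comes from. The test is conservative: a cube near a frustum corner can pass even though it is just outside, which costs a little extra drawing but never drops something visible.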