Optimizing my Game Engine. Help needed!

Member
Posts: 22
Joined: 2009.08
Post: #1
I've been creating a 2D Game Engine with two goals in mind:
- To be able to make games for the iPhone/iPod Touch
- To learn

While I've certainly learned a lot, there are so, so, so many questions that run through my mind as I go. My only problem is that there's nobody looking over my shoulder to say "WHOA, whoa, whoa, WAIT! There's an easier way!" That being said, I've got quite a few questions...! Right now the project simply produces an object every frame called HOGameObject. Each HOGameObject has a sprite and starts at the top of the screen. As it moves to the bottom, it reaches 100 pixels from the bottom and destroys itself. So it's basically a wave of oncoming sprites moving towards the bottom of the screen.

1. I ran my project in Instruments. About 4 seconds into the game a leak was found, and my project slows down significantly. I couldn't find the source of the leak, and after looking around, it seems as though this same "leak" is in the OpenGL template provided by Apple! Is this truly a leak, or what?

2. I have two singleton classes. One is SpriteManager and one is GameObjectManager. The SpriteManager allocates all the sprites for the game, and the GameObjectManager maintains all of the HOGameObjects. Each singleton stores its sprites or game objects in a mutable array. There are two steps in the main game loop: Draw and Logic. So each frame, GameObjectManager and SpriteManager each run a for loop over the contents of their arrays and perform the Draw or Logic step. Is this the best way to go about this? Also, I'm getting much better results out of a regular for loop than out of a for-each loop. Why is this?

3. HOGameObject contains about 18 variables: 13 floats, a string, and 4 short ints. The engine has to draw a sprite for each HOGameObject, as well as perform its logic step. I know this is a really vague question, but how many of these can I have allocated and displayed at one time before I experience considerable strain on the system? How many sprites and objects can you guys get out of OpenGL ES before it bogs down? Are there any big fundamental secrets?

Thanks in advance!
Member
Posts: 166
Joined: 2009.04
Post: #2
Couple of things ...


1. You need to batch your sprites if you want decent performance.
At some point (I can't tell you exactly when; it depends on your code) you will get bogged down by the GLES layer if you keep sending your sprites as separate draw calls.
This kind of optimization may not necessarily help you display more sprites (due to fill rate limitations), but it will leave your CPU with more time to do other stuff (logic, perhaps some physics, sound, etc.).

2. Personally I don't use Apple framework containers because they don't buy you anything over std containers and are significantly slower. A std::vector performs at around 99% of a raw C array while offering pretty much the same functionality as a mutable array.
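To make the std::vector suggestion concrete, here is a minimal sketch of a game-object list with a per-frame logic step. The `GameObject` struct is a hypothetical stand-in for HOGameObject with only a few fields, and `updateObjects` and `killY` are illustrative names, not anything from the original engine.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the thread's HOGameObject; only the fields
// needed for this sketch are shown.
struct GameObject {
    float x = 0.0f;
    float y = 0.0f;
    float speedY = 0.0f;  // pixels per second, positive is downward
    bool dead = false;
};

// Advance every object by dt seconds, then drop the ones that reached killY.
// Removal overwrites the dead object with the last element and pops the
// back, which is O(1) per removal but does not preserve order.
inline void updateObjects(std::vector<GameObject>& objects, float dt, float killY) {
    for (GameObject& obj : objects) {
        obj.y += obj.speedY * dt;
        if (obj.y >= killY) {
            obj.dead = true;
        }
    }
    for (std::size_t i = 0; i < objects.size();) {
        if (objects[i].dead) {
            objects[i] = objects.back();
            objects.pop_back();
        } else {
            ++i;
        }
    }
}
```

Objects are stored by value, so the loop walks contiguous memory instead of chasing one pointer per object. If draw order matters, the swap-and-pop removal would need to be replaced with an order-preserving erase.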

As for your last point, with my code I can display around 400-450 32x32 alpha-blended sprites while running at 30 fps on 3G devices.
The number goes up to 1800 with the latest iPhone.
This is purely due to a pixel fill rate limitation (if I disable alpha blending I can get away with 3-4 times as many sprites).
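A rough way to see why fill rate is the limit is to count the pixels those sprites touch per frame. The sketch below uses the sprite counts quoted above and assumes a 320x480 display; the function and constant names are made up for illustration.

```cpp
#include <cstdint>

// Pixels written per frame by `count` square sprites of side `size`,
// assuming every sprite is fully on screen.
constexpr std::int64_t blendedPixelsPerFrame(std::int64_t count, std::int64_t size) {
    return count * size * size;
}

// Assumption: a 320x480 display.
constexpr std::int64_t kScreenPixels = 320 * 480;  // 153,600

constexpr std::int64_t kOlderDevice = blendedPixelsPerFrame(450, 32);   // 460,800
constexpr std::int64_t kNewerDevice = blendedPixelsPerFrame(1800, 32);  // 1,843,200

// 450 sprites cover the whole screen about 3 times over per frame,
// 1800 sprites about 12 times over.
static_assert(kOlderDevice == 3 * kScreenPixels, "450 sprites = 3 screens of blending");
static_assert(kNewerDevice == 12 * kScreenPixels, "1800 sprites = 12 screens of blending");

// At 30 fps the older-device figure is roughly 13.8 million blended pixels per second.
static_assert(kOlderDevice * 30 == 13824000, "blended pixels per second at 30 fps");
```

Under these assumptions the vertex count is tiny (a few thousand vertices), while the blended pixel count is several full screens per frame, which is consistent with blending being the bottleneck.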
Member
Posts: 22
Joined: 2009.08
Post: #3
Wow! That's great information, thank you.

I thought batch drawing was only for when you were binding textures. Right now I'm only testing with one sprite, so I bind it ONCE and then let it be. Would this still apply? Is there more to batch drawing than I thought?

So pretty much "say no" to anything "NS____" because of speed reasons? Are there any exceptions?

Thanks again.
Member
Posts: 166
Joined: 2009.04
Post: #4
Any time you end up calling one of the glDrawXXX functions, it is considered a batch.
That should be your measuring stick, because all other operations, like binding a texture or changing blending modes, are in support of that final glDrawXXX call.

The whole point of batching is to limit how often you submit new geometry. Creating texture atlases and grouping sprites by blending mode are all done so that you can put as many quads (or geometry in general) as possible in a single buffer and submit it with a single call.
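As a sketch of what that can look like on the CPU side: every sprite that shares a texture and blend mode appends its quad to one array, and the array is submitted once. The `Sprite` and `SpriteVertex` layouts and the function names below are assumptions for illustration, not a required format.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Interleaved vertex layout (an assumption; any layout works as long as
// the pointers handed to GL match it).
struct SpriteVertex {
    float x, y;               // position in pixels
    float u, v;               // texture coordinates into the atlas
    std::uint8_t r, g, b, a;  // per-vertex color and alpha
};

// Hypothetical per-sprite data.
struct Sprite {
    float x, y, w, h;         // top-left corner and size in pixels
    float u0, v0, u1, v1;     // rectangle inside the texture atlas
    std::uint8_t r, g, b, a;  // tint and alpha
};

// Append one quad as two independent triangles (6 vertices).
inline void appendQuad(std::vector<SpriteVertex>& out, const Sprite& s) {
    const SpriteVertex tl = {s.x,       s.y,       s.u0, s.v0, s.r, s.g, s.b, s.a};
    const SpriteVertex bl = {s.x,       s.y + s.h, s.u0, s.v1, s.r, s.g, s.b, s.a};
    const SpriteVertex tr = {s.x + s.w, s.y,       s.u1, s.v0, s.r, s.g, s.b, s.a};
    const SpriteVertex br = {s.x + s.w, s.y + s.h, s.u1, s.v1, s.r, s.g, s.b, s.a};
    out.push_back(tl); out.push_back(bl); out.push_back(tr);  // first triangle
    out.push_back(tr); out.push_back(bl); out.push_back(br);  // second triangle
}

// Build one vertex array for every sprite sharing a texture and blend mode.
inline std::vector<SpriteVertex> buildBatch(const std::vector<Sprite>& sprites) {
    std::vector<SpriteVertex> vertices;
    vertices.reserve(sprites.size() * 6);
    for (const Sprite& s : sprites) {
        appendQuad(vertices, s);
    }
    return vertices;
}

// With a GL context, the whole batch would then go out in one call, e.g.
//   glDrawArrays(GL_TRIANGLES, 0, (GLsizei)vertices.size());
// after glVertexPointer / glTexCoordPointer / glColorPointer have been
// pointed at the interleaved array with stride sizeof(SpriteVertex).
```

For 200 sprites this produces 1200 vertices and one draw call instead of 200 draw calls. In a real engine the vector would typically be kept around and cleared each frame rather than reallocated.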

PS.
To see how fast your code is in its current state, try drawing, say, 200 sprites in a loop and see what kind of fps you get.
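For that kind of measurement, a small frame counter is enough. This is a minimal sketch; `FpsCounter` is a hypothetical helper, and where the per-frame duration comes from (a timer, a display link timestamp, `std::chrono::steady_clock`) is left to the engine.

```cpp
// Accumulates frame durations and reports the average frames per second.
class FpsCounter {
public:
    // Call once per frame with the time that frame took, in seconds.
    void addFrame(double seconds) {
        elapsed_ += seconds;
        ++frames_;
    }

    // Average fps over everything recorded since the last reset,
    // or 0 if no time has been recorded yet.
    double averageFps() const {
        return elapsed_ > 0.0 ? frames_ / elapsed_ : 0.0;
    }

    // Start a new measurement window.
    void reset() {
        elapsed_ = 0.0;
        frames_ = 0;
    }

private:
    double elapsed_ = 0.0;
    int frames_ = 0;
};
```

Averaging over a second or more gives a steadier number than timing a single frame, which makes before/after comparisons of the 200-sprite test easier to read.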
Member
Posts: 22
Joined: 2009.08
Post: #5
Okay, so I've done some research, and I found that the lecture given by Tim Omernick from ngmoco :) was a great help in giving me an outline of things to check out. However, I still have questions!

(This is all for a 2-D engine)
Let's say on the screen I have 200 sprites of the same texture, same blend, same everything, and I can submit their vertexes in a single array. I've noticed in Apple's documentation they say to draw with triangle strips whenever possible. Should I be drawing the ENTIRE array of all 200 sprites with triangle strips, using degenerate triangles in-between?

I'd think that those degenerate triangles would add up in the long run and take their toll on performance, but maybe I'm wrong. Also, if I AM using triangle strips with adjoining degenerate triangles, how should I supply the color and texture arrays to correspond with the degenerate triangles? I'm trying not to waste space in my array if possible...

Any insight would be appreciated!

EDIT: Or is it better to NOT use triangle strips and just have the redundant vertices for the corners of the two triangles that make up the quad?
Member
Posts: 446
Joined: 2002.09
Post: #6
SamBaylus Wrote:Or is it better to NOT use triangle strips and just have the redundant vertices for the corners of the two triangles that make up the quad?
Don't know what's technically faster to tell you the truth but I've had no problems using indexed triangles (reusable verts instead of redundant). With blended 2D sprites I always find the blending is the most expensive part - never ran into a vertex bottleneck.
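For reference, indexed triangles for sprites usually means 4 vertices per quad plus a shared index list. Here is a minimal sketch of generating those indices, assuming each quad stores its vertices in the order top-left, bottom-left, top-right, bottom-right; `buildQuadIndices` is an illustrative name.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Builds indices for quadCount quads, where quad i owns vertices
// 4*i .. 4*i+3 in the order top-left, bottom-left, top-right, bottom-right.
// 16-bit indices limit this to 16384 quads (65536 vertices).
inline std::vector<std::uint16_t> buildQuadIndices(std::size_t quadCount) {
    std::vector<std::uint16_t> indices;
    indices.reserve(quadCount * 6);
    for (std::size_t i = 0; i < quadCount; ++i) {
        const std::size_t base = i * 4;
        // First triangle: tl, bl, tr
        indices.push_back(static_cast<std::uint16_t>(base + 0));
        indices.push_back(static_cast<std::uint16_t>(base + 1));
        indices.push_back(static_cast<std::uint16_t>(base + 2));
        // Second triangle: tr, bl, br
        indices.push_back(static_cast<std::uint16_t>(base + 2));
        indices.push_back(static_cast<std::uint16_t>(base + 1));
        indices.push_back(static_cast<std::uint16_t>(base + 3));
    }
    return indices;
}

// With a GL context the batch would be drawn with something like
//   glDrawElements(GL_TRIANGLES, (GLsizei)indices.size(),
//                  GL_UNSIGNED_SHORT, indices.data());
```

The index pattern never changes, so it can be built once for the maximum sprite count and reused every frame. For 200 sprites that is 800 vertices plus 1200 indices, compared with 1200 full vertices for the non-indexed approach.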
Member
Posts: 166
Joined: 2009.04
Post: #7
Don't bother with triangle strips. They are a nightmare to maintain and won't buy you anything here anyway.
Use redundant vertices (which you need anyway, because you will want each sprite to have its own position/uv/alpha/color properties).

And as FrankC says, don't worry about vertex processing time. You want to use batched arrays to cut down on GLES driver overhead.
Moderator
Posts: 3,572
Joined: 2003.06
Post: #8
I've used the triangle stripper from the oolong engine, which is pretty easy to use and, compared to some other strippers I tried, does a pretty good job of figuring out what can be stripped. But I honestly didn't see much improvement, if any (which doesn't mean I wasn't doing something else wrong). This was only for 3D geometry (depth test on, lighting on, blending off). I think the idea of triangle stripping is pretty useless for sprites. As already said: better to batch sprites. Also, as Frank says, my biggest killer has almost always been blending, and there isn't much that can be done about that. The one exception is the TBR (tile-based renderer): if you can localize your blending to only parts of the screen, you should see a boost according to theory, but I have no earthly idea how that could be done in a general sense.
Luminary
Posts: 5,143
Joined: 2002.04
Post: #9
let's leave the "3D strippers on my screen" for other sites, mmkay?
Moderator
Posts: 3,572
Joined: 2003.06
Post: #10
OneSadCookie Wrote:let's leave the "3D strippers on my screen" for other sites, mmkay?

LOLLOLLOL
Member
Posts: 22
Joined: 2009.08
Post: #11
You've all been a huge help. I was also reading up on Vertex Buffer Objects. Yay or nay for a 2D Sprite engine?
Moderator
Posts: 3,572
Joined: 2003.06
Post: #12
VBOs would be great, but they only offer a performance advantage on the newest hardware, so there is no practical gain at this point in time. We'll have to wait a couple of years before the old hardware cycles out, IMHO (or until Apple comes out with a magic driver update). It won't hurt you to implement VBOs though, as the code is supported; older chipsets just don't take advantage of it.
Member
Posts: 22
Joined: 2009.08
Post: #13
So even though there's no advantage right now to use VBOs, using them will work on all versions of the devices?

It makes sense to use them anyway, simply if it means the possibility of a future bump in speed, right?
Moderator
Posts: 3,572
Joined: 2003.06
Post: #14
As far as I know, yes, that's the way I see it. You should be able to implement VBOs and not suffer any performance penalties or compatibility issues with all existing hardware, but you won't necessarily see gains right now, except on the latest hardware. So yes, theoretically, it makes sense to have the code in place if it's not too much trouble. I'm not an Apple engineer though, so I can't tell you fact from fiction on this. The salt is over by the door if you need to take some. ;)
Member
Posts: 22
Joined: 2009.08
Post: #15
Salt? Just a grain, please! =)

So I guess I'll be using VBOs, then! No harm in it, and I'll be teaching myself something new. This has all been immensely challenging, but a lot of fun. I skipped over Quartz2D drawing because I knew I would eventually have to learn OpenGL ES for in-depth games. Just when I had an engine up and running, I realized performance was lacking, and it was because the engine was very poorly optimized! You all have been a great help, thank you so much.


I need an OpenGL ES study buddy. If anyone wants to help out and talk OpenGL, please get me on AIM: Sammers102