Why is my iPad game so slow?

Member
Posts: 54
Joined: 2010.10
Post: #1
Background: We're writing http://www.greatlittlewargame.com using an in-house engine, which we're now porting to iPad and ES 2.0 generally.

Thing is, we can't get the frame rate above about 15 fps no matter what we do. I've heard there are fill rate issues, but we've pared our scene down to this list of bullet points.

We're all experienced engine tech people familiar with GL ES generally, so don't spare the horses with suggestions. Many gallons of beer to anyone who can suggest a magic bullet.

The temp test scene is as follows:

*Model has 8 batches, most of which cover the screen but not over each other so overdraw is negligible (Grass, road, cliff edge etc)
*No alpha blending at all. Simple lambert directional light - the grass itself isn't even dynamically lit - it's precalced into the verts
*Four shaders, sorted, none of which are excessive
*About 30,000 vertices in total
*No shadows

The stuff we have tried in the engine is as follows:

*Every shader has minimal precision possible on decls etc
*All state setting is cached by us
*All textures are fairly small now, square, pow2, pvr'd, mipped
*Write directly to the back buffer
*Final blend is set to glDisable for now, alpha stuff just thrown out for now
*Using compressed format vertex decl of 32 bytes apiece (Got to 24 once but made almost no difference)
*Rendering from index buffers, not strips. (8 bit indices slowed down a bit vs using shorts)
*Doing nothing "creative" with the depth buffer or anything
*No post-processing. Hell, the engine is in pieces and doing almost nothing!
*We're not even normalising the normal in our per-pixel lighting
*Specular maths disabled for now
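
For readers unfamiliar with the "all state setting is cached" item above, one common form is a thin wrapper that skips redundant calls into the driver. This is only a sketch of the general idea, not the engine's actual code: `driver_set_capability` is a hypothetical stand-in for `glEnable`/`glDisable` so the example runs without a GL context, and the `CAP_*` names are made up.

```c
/* Hypothetical stand-in for glEnable/glDisable; it only counts calls. */
static int g_driver_calls = 0;
static void driver_set_capability(int cap, int enabled)
{
    (void)cap;
    (void)enabled;
    ++g_driver_calls;
}

/* Made-up capability ids for the sketch. */
enum { CAP_BLEND, CAP_DEPTH_TEST, CAP_CULL_FACE, CAP_COUNT };

/* Last value sent to the driver per capability: -1 unknown, 0 off, 1 on. */
typedef struct { signed char state[CAP_COUNT]; } StateCache;

static void cache_init(StateCache *c)
{
    for (int i = 0; i < CAP_COUNT; ++i)
        c->state[i] = -1;
}

/* Forward the change to the driver only if it differs from the cached value. */
static void cache_set(StateCache *c, int cap, int enabled)
{
    enabled = enabled ? 1 : 0;
    if (c->state[cap] == enabled)
        return;
    driver_set_capability(cap, enabled);
    c->state[cap] = (signed char)enabled;
}
```

Setting the same capability to the same value twice reaches the driver once; only a real change gets through.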

The CPU is pretty much flat lined according to Instruments and Shark, backed up by a release build not noticeably increasing fps from its shocking 15.

Totally stumped. What are we doing wrong?

Many thanks and sorry about the long post.

Paul Johnson
Great Little War Game
Member
Posts: 45
Joined: 2006.07
Post: #2
No "magic bullet", but if I were in your shoes, I'd do some simple detective work to see where the problem is coming from...

1) If your render function is reduced to only glClear(), are you still getting only 15 FPS? If so, you almost certainly have a timing bug or some logic error in the way you've organized your render cycle.

2) If you don't draw anything at all but run all of the state management/shader setup/vertex buffer code is the FPS still just 15? If so, it might be that you are doing some extraordinarily expensive non-drawing operations.

3) If you disable the shaders but draw all the geometry, does the FPS increase? If so, the problem could be a shader executing too slowly. I don't know about the iPad, but on some graphics cards exceeding some hardware limits makes your shader run in software, which absolutely kills performance.

4) If you keep all of your shaders and state changes, but draw just one object per category/shader/state, does the FPS increase? If so, you're probably fillrate or geometry bound, and further tests can tell you where the problem is.
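
The four checks above amount to a small decision procedure. A minimal sketch in C, assuming the frame rates are measured by hand in each experiment; the enum names, `classify`, and the 1.25x "did it improve" threshold are all hypothetical and only there to make the logic explicit.

```c
/* Hypothetical labels for what each of the four tests points at. */
typedef enum {
    BOTTLENECK_FRAME_LOOP,   /* test 1: still slow with only glClear() */
    BOTTLENECK_CPU_SETUP,    /* test 2: still slow with setup but no draws */
    BOTTLENECK_SHADER,       /* test 3: faster once shaders are disabled */
    BOTTLENECK_FILL_OR_GEOM, /* test 4: faster with one object per state */
    BOTTLENECK_UNKNOWN
} Bottleneck;

/* Arbitrary threshold: count a test as "faster" above 1.25x the baseline. */
static const double kMinGain = 1.25;

static int improved(double fps, double baseline_fps)
{
    return fps > baseline_fps * kMinGain;
}

/* Apply the four tests in order and report the first one that fires. */
static Bottleneck classify(double baseline_fps,
                           double fps_clear_only,
                           double fps_setup_no_draw,
                           double fps_shaders_disabled,
                           double fps_one_object)
{
    if (!improved(fps_clear_only, baseline_fps))
        return BOTTLENECK_FRAME_LOOP;
    if (!improved(fps_setup_no_draw, baseline_fps))
        return BOTTLENECK_CPU_SETUP;
    if (improved(fps_shaders_disabled, baseline_fps))
        return BOTTLENECK_SHADER;
    if (improved(fps_one_object, baseline_fps))
        return BOTTLENECK_FILL_OR_GEOM;
    return BOTTLENECK_UNKNOWN;
}
```

The ordering matters: the later tests only mean something once the earlier ones have ruled out the frame loop and the CPU-side setup.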

Best of luck!
Moderator
Posts: 3,571
Joined: 2003.06
Post: #3
Looks like you've hit a lot of points already. mattz has some really great suggestions. I assume you have compile for thumb turned off. Is there an OpenGL query hidden in there somewhere, causing a round-trip and stalling the pipeline?
Member
Posts: 54
Joined: 2010.10
Post: #4
(Nov 11, 2010 07:29 PM)mattz Wrote:  No "magic bullet", but if I were in your shoes, I'd do some simple detective work to see where the problem is coming from...

1) If your render function is reduced to only glClear(), are you still getting only 15 FPS? If so, you almost certainly have a timing bug or some logic error in the way you've organized your render cycle.

2) If you don't draw anything at all but run all of the state management/shader setup/vertex buffer code is the FPS still just 15? If so, it might be that you are doing some extraordinarily expensive non-drawing operations.

3) If you disable the shaders but draw all the geometry, does the FPS increase? If so, the problem could be a shader executing too slowly. I don't know about the iPad, but on some graphics cards exceeding some hardware limits makes your shader run in software, which absolutely kills performance.

4) If you keep all of your shaders and state changes, but draw just one object per category/shader/state, does the FPS increase? If so, you're probably fillrate or geometry bound, and further tests can tell you where the problem is.

Best of luck!
Thanks for the suggestions. To answer your points:

1) It goes up to 60.
2) I think it went up, will double check
3) We've not actually tried that. Will do so in a bit, thanks.
4) We went down to just the first triangle. Little change

I really don't think it's fill rate. We took a lot of time to ensure that the ground mesh looks pretty much like a single extrusion. The tanks and stuff over the top aren't massive, but let's face it, it's not like I can take them out. :(

@Jake: Hmmm, I think so but I'll double-check that also. We claimed to be decent at GL but our mac and iPad SDK experience is still fairly light - we could be making any number of mistakes like this tbh. Any other more global things you can think of?

Thanks fellaz. Good stuff so far...

Paul Johnson
Great Little War Game
Member
Posts: 446
Joined: 2002.09
Post: #5
(Nov 11, 2010 10:49 PM)AnotherJake Wrote:  I assume you have compile for thumb turned off.
That only matters on old armv6 devices. Apple recommends turning thumb on for armv7 binaries (they use some fancy pants Thumb-2 that handles floating point).
Member
Posts: 54
Joined: 2010.10
Post: #6
Tried the thumb options. No change for either. :(

Which is kinda our running theme :(

Commenting out only glDrawElements raises the FPS to 60, but I don't think that proves anything as surely all the setup is done lazily anyway?

Regarding suggestion 3). How do I disable all shaders? Isn't that going to crash somewhere?

Paul Johnson
Great Little War Game
Moderator
Posts: 3,571
Joined: 2003.06
Post: #7
(Nov 12, 2010 03:47 AM)Frank C. Wrote:  
(Nov 11, 2010 10:49 PM)AnotherJake Wrote:  I assume you have compile for thumb turned off.
That only matters on old armv6 devices. Apple recommends turning thumb on for armv7 binaries (they use some fancy pants Thumb-2 that handles floating point).

Ah, I missed that. Good thing we have forums to discuss these things! :D

Quote:Commenting out only glDrawElements raises the FPS to 60, but I don't think that proves anything as surely all the setup is done lazily anyway?

Well, it proves that it's not a logic bottleneck, but rather it is indeed related to the GL usage.

I don't know how much it'd help, but you could also try ordering your verts in triangle strip order. The docs say it helps cache hits which improves performance. I've tried it myself but didn't see any improvement. There is a good triangle stripper in the oolong engine if you haven't already tried it.

That was 30k verts, meaning ~10k triangles, correct?

I think I'd focus in on the shaders next myself and see what happens if I simplify everything to bare minimums.
Member
Posts: 54
Joined: 2010.10
Post: #8
It's nearer 20,000 triangles due to the way the level is tessellated. The file is created in a PC tool which calls the D3D optimiser thingy to sort the indices out for better post transform performance.
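
For a sense of scale, the figures quoted in this thread (about 30,000 vertices at 32 bytes each, roughly 20,000 triangles, 15 fps) can be turned into per-frame and per-second totals. A small sketch; the function names are hypothetical, and it assumes every vertex is fetched exactly once per frame, which indexed drawing does not guarantee.

```c
/* Bytes of vertex data read per frame, assuming each vertex is fetched once. */
static long vertex_bytes_per_frame(long vertex_count, long bytes_per_vertex)
{
    return vertex_count * bytes_per_vertex;
}

/* Triangles submitted per second at a given frame rate. */
static long triangles_per_second(long triangles_per_frame, long fps)
{
    return triangles_per_frame * fps;
}
```

With the thread's numbers, `vertex_bytes_per_frame(30000, 32)` is 960,000 bytes per frame (720,000 with the 24-byte layout), and `triangles_per_second(20000, 15)` is 300,000 triangles per second, or 1,200,000 at 60 fps.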

We've simplified everything to bare minimums today. The level geom only, with a single small texture, shot up to 40fps. A decent number but only because it's doing nothing useful.

After another whole day with two of us trying stuff out, I'm starting to feel that the iPad just isn't fit for purpose as a gaming platform. If all I can do is render a textured quad I'd rather just pass. :(
Update. This is worse than billed. Got obsessed with the numbers. Without all the units and level fluff (i.e. just the level mesh only) it's a tenth that number of verts.

We were rendering more than this, faster than this, on the LeapFrog Didj. A kid's toy costing 40 bucks :(

Paul Johnson
Great Little War Game
Moderator
Posts: 3,571
Joined: 2003.06
Post: #9
Hmm... If you're suggesting that you're having trouble with only 2k verts, then something is definitely screwing up royally. I know for a fact I can render ~10k triangles lit, depth-tested and textured @20 FPS, even on 1st gen devices, and honestly, I am not even knowledgeable about the finer points of OpenGL performance. iPad should have no problem doing what you're describing.

How about glFlush? You aren't calling that are you? Also, you're not drawing a Cocoa view (including anything Cocoa like buttons, etc) over your OpenGL view either are you? Are you doing any GL reads?

With any luck, arekkusu or frogblast will stumble along here and have some expert advice for you.
Moderator
Posts: 1,560
Joined: 2003.10
Post: #10
You mention blending is disabled, but what about alpha testing? I haven't experienced this myself, but from what I hear enabling alpha testing kills performance on the GPUs used in the iPhone and iPad.
Member
Posts: 54
Joined: 2010.10
Post: #11
There isn't any alpha testing. You need to do it with discard in your shader (the GLSL counterpart of the clip instruction) aiui. We never got that far and have made sure there is no alpha-testing stuff in the level. The call that usually would set the fields is a stub function.

No GL reads, no flush, no Cocoa :(

Paul Johnson
Great Little War Game
Moderator
Posts: 3,571
Joined: 2003.06
Post: #12
Is the CA layer transformed? That might happen if you allow view autorotation, or you've rotated it yourself. That can be a real killer.

I think you can clear it to be sure with [self.layer setTransform:CATransform3DIdentity]; in your EAGLView
Luminary
Posts: 5,143
Joined: 2002.04
Post: #13
You said you have no CPU usage, and that eliminating all rendering gets you 60Hz, so you're probably looking at some kind of GPU bottleneck. You said you've only got 8 draw calls? In that case I'd try isolating each to see if any in particular is the culprit. If one is, post the shader and details of the state (texture formats & sizes, etc).
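
Isolating each draw call can be done mechanically with a batch bitmask. The sketch below is one possible shape for it, not the engine's code: `MeasureFn` is a hypothetical callback that would render a few frames with only the selected batches and return the average frame time, and `fake_measure` with its made-up costs exists only so the example runs without a GL context.

```c
#include <stddef.h>

/* Hypothetical callback: render with only the batches whose bit is set in
   `mask` and return the average frame time in milliseconds. */
typedef double (*MeasureFn)(unsigned mask);

/* Draw each batch on its own and return the index of the slowest one.
   If per_batch_ms is not NULL it receives the time measured for each batch. */
static int find_slowest_batch(int batch_count, MeasureFn measure,
                              double *per_batch_ms)
{
    int worst = -1;
    double worst_ms = 0.0;
    for (int i = 0; i < batch_count; ++i) {
        double ms = measure(1u << i);
        if (per_batch_ms != NULL)
            per_batch_ms[i] = ms;
        if (worst < 0 || ms > worst_ms) {
            worst = i;
            worst_ms = ms;
        }
    }
    return worst;
}

/* Made-up per-batch costs (ms), standing in for real measurements. */
static const double kFakeCostMs[8] = {2.0, 3.0, 2.5, 40.0, 1.5, 2.0, 3.5, 2.0};

/* Stand-in measurement: sums the made-up costs of the enabled batches. */
static double fake_measure(unsigned mask)
{
    double total = 0.0;
    for (int i = 0; i < 8; ++i)
        if (mask & (1u << i))
            total += kFakeCostMs[i];
    return total;
}
```

In a real build the callback would wrap the engine's own render loop and timer; the batch that stands out is the one whose shader and state are worth posting.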
Sage
Posts: 1,232
Joined: 2002.10
Post: #14
(Nov 12, 2010 04:05 AM)Applewood Wrote:  Regarding suggestion 3). How do I disable all shaders?
Replace the shader text with the simplest possible content.
E.g. vertex shader only transforms position and does nothing else.
Fragment shader only outputs constant green, does nothing else.
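
As a concrete version of the above, the stripped-down pair could look something like this, held as C string constants ready to hand to the engine's shader loader. `a_position` and `u_mvp` are hypothetical names (use whatever the engine already binds), and `source_mentions` is just a made-up helper for checking that a test shader really has no texture lookups left in it.

```c
#include <string.h>

/* GLSL ES 1.00 vertex shader: transforms the position and nothing else.
   a_position and u_mvp are hypothetical names. */
static const char *kMinimalVertexShader =
    "attribute vec4 a_position;\n"
    "uniform mat4 u_mvp;\n"
    "void main() {\n"
    "    gl_Position = u_mvp * a_position;\n"
    "}\n";

/* GLSL ES 1.00 fragment shader: outputs constant green and nothing else. */
static const char *kMinimalFragmentShader =
    "precision lowp float;\n"
    "void main() {\n"
    "    gl_FragColor = vec4(0.0, 1.0, 0.0, 1.0);\n"
    "}\n";

/* Returns 1 if `token` appears anywhere in the shader source, else 0. */
static int source_mentions(const char *src, const char *token)
{
    return strstr(src, token) != NULL;
}
```

Swapping these in for every program keeps all the draw calls and state changes intact, so any change in frame rate can be put down to shader cost.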

If this shows a signal, then binary search through shaders and/or original shader text to isolate where time is spent. The profiling tools don't make this easy. :(

(Nov 12, 2010 08:56 AM)Applewood Wrote:  I'm starting to feel that the iPad just isn't fit for purpose as a gaming platform.
Kindly see the Epic Citadel demo, or the thousands of other successful shipping games.
Member
Posts: 54
Joined: 2010.10
Post: #15
I have that Citadel demo installed. Every now and again when I'm getting pissed off, I fire that up and it pushes me over the edge and I get to stop for a bit. :) Before this week I considered myself good at this sort of thing. :s However, looking at other stuff doesn't help. Maybe it's using the FFP which is faster for example. Maybe not, but we just don't know and that's kinda the point.


We're using that tool from PowerVR to cycle count our shaders. The most complex one is now 9 cycles (iirc) reading 3 textures (for now down to 64x64). Lerp one to the other based on an alpha then mul the result with a third and the vertex colour. I'm away from the source now else I'd post it. (I can't get the iPad stuff remotely as someone forgot to set the external rights. Not a good day, lol.)
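
Since the source isn't posted, here is a CPU-side reference of the arithmetic as described: blend two texture samples by an alpha, then multiply by a third sample and the vertex colour. It is only a sketch of that description. `Color`, `mixf` and `shade` are hypothetical names, and which alpha drives the blend is not stated, so it is a plain parameter here.

```c
/* RGBA colour with float components in [0, 1]. */
typedef struct { float r, g, b, a; } Color;

/* Linear interpolation, the same formula as GLSL mix(a, b, t). */
static float mixf(float a, float b, float t)
{
    return a + (b - a) * t;
}

/* Lerp tex0 towards tex1 by alpha, then modulate by tex2 and vertex colour. */
static Color shade(Color tex0, Color tex1, float alpha,
                   Color tex2, Color vertex_colour)
{
    Color out;
    out.r = mixf(tex0.r, tex1.r, alpha) * tex2.r * vertex_colour.r;
    out.g = mixf(tex0.g, tex1.g, alpha) * tex2.g * vertex_colour.g;
    out.b = mixf(tex0.b, tex1.b, alpha) * tex2.b * vertex_colour.b;
    out.a = mixf(tex0.a, tex1.a, alpha) * tex2.a * vertex_colour.a;
    return out;
}
```

That is three texture reads, one mix and two multiplies per fragment, which is a handy baseline when comparing against the cycle count the tool reports.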

All the textures are 4 bit pvr with mips and no translucency.

When we did what you suggest with uber simple shaders we maxed out at 40 fps, which is the fastest we've ever seen anything get to. (That's actually outputting)
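
It can help to read these frame rates as frame times, since time per frame is what the GPU actually spends. A tiny sketch with hypothetical helper names:

```c
/* Milliseconds per frame at a given frame rate. */
static double frame_time_ms(double fps)
{
    return 1000.0 / fps;
}

/* How many milliseconds per frame a measured rate is over the target's budget. */
static double over_budget_ms(double measured_fps, double target_fps)
{
    return frame_time_ms(measured_fps) - frame_time_ms(target_fps);
}
```

By this arithmetic 15 fps is about 66.7 ms per frame, 40 fps is 25 ms, and 60 fps is about 16.7 ms, so even the uber simple shaders leave each frame roughly 8.3 ms over a 60 fps budget.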

Paul Johnson
Great Little War Game