iPhone Game Framerates? Chipmunk & OpenGL
Did you try -ffast-math? It seems to give a pretty good boost in speed from what I remember. (at least it did on desktop CPUs)
Scott Lembcke - Howling Moon Software
Author of Chipmunk Physics - A fast and simple rigid body physics library in C.
I say 15-20fps is good enough for most games. If you're making a cinematic movie, 24fps is the standard (video is 30fps).
So if you're close to that, the eye will not know the difference.
The difference between 30fps and 60fps is obvious.
kelvin Wrote:
Thumb (16-bit) -> ARM (32-bit)
-O0 -> -O3
For the Thumb to ARM thing, I really wish I understood more about what you are suggesting there.
For -O3, all I know is that if I compile for anything more than -O0 I lose parts of my geometry for some unknown reason. I am personally quite ignorant of what the various optimizations do. I fiddled around with a whole bunch of compiler options for quite a while and am stuck with what I have. At least I'm pretty happy with my performance right now... But, of course, more would be better.
ARM chips support two related instruction sets. ARM is a 32-bits-per-instruction, 3-operands-per-instruction set, a lot like PowerPC. Thumb is a 16-bits-per-instruction, 2-operands-per-instruction set. A program can switch (almost) at will between the two sets, though obviously Apple isn't making that easy to do. Thumb code will require more instructions to achieve the same task as ARM, since each instruction is less powerful, but each instruction is also half the size. If the increase in instruction count is less than 2x, this results in a code size reduction, which is generally a good thing; however, performance is likely to be less.
http://en.wikipedia.org/wiki/ARM_architecture has more detailed info.
-O0 in GCC is almost worse than no optimization; it keeps all locals on the stack for better debugging. -O is seldom used because it hinders debugging without doing as good a job of optimization as -O2/-Os. -O2/-Os are more or less equivalent, except that -Os prefers not to make optimizations that would increase code size. For PowerPC, I never saw significant differences between the two; for i386, -O2 is noticeably faster. -O3 does aggressive, code-size-increasing optimizations like inlining functions you didn't mark inline, and unrolling loops. It (in my experience) frequently produces slower code than -O2. I personally compile everything at -O2, profile with Shark, and hand-optimize. I never use -O3 (probably to my detriment; it certainly does help with certain code).
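One way to confirm which of these settings a build is really using is to check the macros the compiler predefines. The sketch below assumes GCC or a compatible compiler. The function names are made up for illustration; `__OPTIMIZE__`, `__OPTIMIZE_SIZE__`, `__FAST_MATH__`, `__arm__` and `__thumb__` are the compiler's own macros.

```c
#include <stdio.h>

/* Returns 1 if the compiler was optimizing (any level above -O0). */
static int built_with_optimization(void)
{
#ifdef __OPTIMIZE__
    return 1;
#else
    return 0;
#endif
}

/* Returns 1 if the compiler was optimizing for size (-Os). */
static int built_for_size(void)
{
#ifdef __OPTIMIZE_SIZE__
    return 1;
#else
    return 0;
#endif
}

/* Returns 1 if -ffast-math was in effect. */
static int built_with_fast_math(void)
{
#ifdef __FAST_MATH__
    return 1;
#else
    return 0;
#endif
}

/* Names the instruction set this file was compiled to. */
static const char *instruction_set_name(void)
{
#if defined(__thumb__)
    return "Thumb";
#elif defined(__arm__)
    return "ARM";
#else
    return "not 32-bit ARM";
#endif
}

/* Prints one line per setting, for example from the app's startup code. */
static void print_build_config(void)
{
    printf("optimizing:      %d\n", built_with_optimization());
    printf("size-optimized:  %d\n", built_for_size());
    printf("fast-math:       %d\n", built_with_fast_math());
    printf("instruction set: %s\n", instruction_set_name());
}
```

Calling `print_build_config()` once at launch shows whether the flags set in the project actually reached this source file.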
If your program does nothing undefined, none of these flags should change the meaning of your program. If you crash at -O2 but not at -O0, this is almost certainly a bug in your code that you should find and fix (as you're clearly relying on undefined behavior, it could be activated by an OS update or the like without your intervention).
http://gcc.gnu.org/onlinedocs/gcc-4.0.1/...tions.html is well worth a read, particularly what you're "breaking" when you turn on -ffast-math (likely nothing you care about, but better safe...)
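As one illustration of what -ffast-math gives up: it lets the compiler regroup floating-point arithmetic as if addition were associative, which it is not. The sketch below spells out the two groupings by hand (the function names are made up for illustration, and `volatile` is only there to pin each grouping in place), so the difference shows up regardless of compiler flags.

```c
#include <stdio.h>

/* Computes (a + b) + c, forcing the intermediate sum to be stored first. */
static double sum_left(double a, double b, double c)
{
    volatile double partial = a + b;
    return partial + c;
}

/* Computes a + (b + c), forcing the intermediate sum to be stored first. */
static double sum_right(double a, double b, double c)
{
    volatile double partial = b + c;
    return a + partial;
}

/* Prints both groupings for values where the small term gets absorbed. */
static void print_grouping_difference(void)
{
    const double a = 1e30, b = -1e30, c = 1.0;
    printf("(a + b) + c = %g\n", sum_left(a, b, c));
    printf("a + (b + c) = %g\n", sum_right(a, b, c));
}
```

The first grouping gives 1 and the second gives 0, because 1.0 is too small to survive being added to -1e30. Under -ffast-math a plain `a + b + c` may be evaluated either way, which is usually harmless for game physics but is worth knowing about.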
Wow, that is incredibly helpful information to me right now! Thank you very much Keith!
Depending on what your graphics look like, you might be able to reduce the appearance of a lower frame rate by implementing some kind of motion blur. Sounds like it would slow it down, but I think a game running at 30fps with motion blur would look smoother than one running at 60fps without.
I haven't tried this, but I will fairly soon. Because my graphics are all dark lines on a light background, I think if I just draw the body a few times, advancing it in the direction it's travelling and reducing the alpha, it might look okay.
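A minimal sketch of that idea, computing only where each copy goes and how opaque it is. The `Ghost` struct, the `make_blur_ghosts` function and the `spread` parameter are all hypothetical names; the actual drawing is left to whatever routine already draws the body, with blending enabled.

```c
#include <assert.h>
#include <stddef.h>

/* One translucent copy of the body to draw. */
typedef struct {
    float x, y;   /* position of this copy */
    float alpha;  /* opacity, 1.0 = fully opaque */
} Ghost;

/*
 * Fills out[0..count-1] for a body at (x, y) moving with velocity (vx, vy).
 * Copy 0 is the body itself at full opacity. Each later copy is advanced a
 * bit further along the velocity and made fainter. `spread` is the time
 * span, in seconds, that the whole smear covers.
 */
static void make_blur_ghosts(float x, float y, float vx, float vy,
                             float spread, Ghost *out, size_t count)
{
    assert(out != NULL || count == 0);

    for (size_t i = 0; i < count; i++) {
        const float t = (float)i / (float)count;  /* 0 for the body itself */
        out[i].x = x + vx * spread * t;
        out[i].y = y + vy * spread * t;
        out[i].alpha = 1.0f - t;
    }
}
```

With four copies the alphas come out as 1.0, 0.75, 0.5 and 0.25. Negating `spread` would trail the copies behind the body instead of ahead of it.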
There's no substitute for a high frame rate. Sensitive users will be able to tell no matter what. I'd have to see the motion blur idea in action to be sure, but I have a hard time imagining that it would make a noticeable difference...
A higher frame rate is better of course, but motion blur is the reason that people will happily accept 25fps on a TV but complain about 50fps in a game. It makes a huge amount of difference. But it's prohibitively slow to do it in 3D afaik. I don't know if 2D can handle it, but I want to try it out.
Very interesting! I didn't expect it to make that much difference. The time you'd spend computing motion blur might well be better spent on drawing at a higher frame rate, but if you can find an efficient enough way to do it, it might be worth the effort...
Sorry Kelvin, I don't mean to hijack your thread.
Maybe I should start a new one instead, but I'm hoping this will be quick...
OneSadCookie Wrote:
... If you crash at -O2 but not at -O0, this is almost certainly a bug in your code that you should find and fix ...
Well, it *appears* that the "bug" is fixed. I wasn't crashing, but some geometry wasn't being loaded at -O2, while it worked fine on -O0. (actually, it didn't load using any optimization level other than -O0) Also of note is that it has always worked fine on i386. The missing geometry was only on ARM.
But I still have no idea why it didn't work in the first place.
What was happening was that I was creating a linked list of polygon (triangle) rendering groups when I loaded the models, grouped by material for each static mesh model, and the pointer to the next polygroup got nilled out if I made any Obj-C calls in the loop where I created the linked list.
I don't know if it is comprehensible, but I'll try to illustrate.
This doesn't work as expected using anything but -O0 on ARM. I get the first polygroup to render, but the rest of the model is missing. It loads the polygroups without error, but the list is broken because of the nil left in nextGroup, which should have been set correctly with prevPolyGroup->nextGroup = newGroup;
Code:
newModel = (Mesh *)malloc(sizeof(Mesh));
newModel->polyGroup = nil;
polyGroups = [modelDict objectForKey:@"polyGroups"];
count = [polyGroups count];
for (i = 0; i < count; i++)
{
    newGroup = (PolyGroup *)malloc(sizeof(PolyGroup));
    newGroup->nextGroup = nil;
    if (newModel->polyGroup == nil)
        newModel->polyGroup = newGroup;
    else
        prevPolyGroup->nextGroup = newGroup;
    prevPolyGroup = newGroup;

    // fill in the model information here
    polyGroupDict = [polyGroups objectAtIndex:i];
    newGroup->numIndices = [[polyGroupDict objectForKey:@"numIndices"] unsignedIntValue];
    newGroup->numFaces = [[polyGroupDict objectForKey:@"numFaces"] unsignedIntValue];
    // more model loading blah blah blah
    // ...
}

But doing it this way works:
Code:
newModel = (Mesh *)malloc(sizeof(Mesh));
newModel->polyGroup = nil;
polyGroups = [modelDict objectForKey:@"polyGroups"];
count = [polyGroups count];
for (i = 0; i < count; i++)
{
    newGroup = (PolyGroup *)malloc(sizeof(PolyGroup));
    newGroup->nextGroup = nil;
    if (newModel->polyGroup == nil)
        newModel->polyGroup = newGroup;
    else
        prevPolyGroup->nextGroup = newGroup;
    prevPolyGroup = newGroup;
}

// now fill in the model information
newGroup = newModel->polyGroup;
for (i = 0; i < count; i++)
{
    polyGroupDict = [polyGroups objectAtIndex:i];
    newGroup->numIndices = [[polyGroupDict objectForKey:@"numIndices"] unsignedIntValue];
    newGroup->numFaces = [[polyGroupDict objectForKey:@"numFaces"] unsignedIntValue];
    // more model loading blah blah blah
    // ...
    newGroup = newGroup->nextGroup;
}
While I didn't look too closely at your code, generally I've gotten crashes like that from trying to use garbage values in uninitialized variables. Because -O0 always reads values from the stack, while optimized builds tend to keep them in registers, you will get different garbage.
Lesson learned: treat pointers as immutable whenever possible. Don't declare them without initializing them, and don't reuse the same local if you can avoid it by moving it to an inner block.
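A minimal sketch of that advice applied to list building in plain C. The `Node` layout and the function names are assumptions for illustration, not the poster's actual types: every pointer is initialized where it is declared, and the per-node pointer lives in the loop's inner block as a `const` local.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Node {
    struct Node *next;
    int value;
} Node;

/* Frees every node reachable from head. */
static void free_list(Node *head)
{
    while (head != NULL) {
        Node *const next = head->next;  /* read before freeing the node */
        free(head);
        head = next;
    }
}

/* Counts the nodes reachable from head. */
static int list_length(const Node *head)
{
    int length = 0;
    for (const Node *node = head; node != NULL; node = node->next)
        length++;
    return length;
}

/* Builds a list holding 0..count-1; returns NULL if count is 0 or malloc fails. */
static Node *build_list(int count)
{
    assert(count >= 0);

    Node *head = NULL;
    Node *prev = NULL;  /* initialized, so it never holds stack garbage */

    for (int i = 0; i < count; i++) {
        /* Declared in the inner block and never reassigned. */
        Node *const node = malloc(sizeof *node);
        if (node == NULL) {
            free_list(head);
            return NULL;
        }
        node->next = NULL;
        node->value = i;

        if (prev == NULL)
            head = node;
        else
            prev->next = node;
        prev = node;
    }
    return head;
}
```

Only `prev` has to change between iterations; everything else is set once, which leaves far fewer places for a stale or garbage pointer to hide.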
Bizarre. If you suspect a compiler bug, please take the time to reduce it to a tiny test-case where you can be certain nothing funny is going on, and report it to Apple.
Thanks for the tips Skorche, always appreciated. 
@OSC: Well I didn't suspect a compiler bug, but it certainly struck me as odd that it wouldn't work. If you guys don't see anything sticking out there, then I guess I'd better attempt to distill it down into a test-case as you suggest, and report it.
Rats, more diversion... At least getting it to work under -O2 did seem to buy me some FPS.

(again, my apologies Kelvin)
Well, that was easy... a little *too* easy.

I can't believe this doesn't work. Doesn't make sense. I must be blind or something. So here's the test:
Code:
typedef struct _Node
{
    struct _Node *nextNode;
} Node;

- (void)doLinkedListTest
{
    Node *rootNode, *newNode, *prevNode, *node;
    int i;

    // construct the linked list
    rootNode = nil;
    for (i = 0; i < 5; i++)
    {
        newNode = (Node *)malloc(sizeof(Node));
        newNode->nextNode = nil;
        if (rootNode == nil)
            rootNode = newNode;
        else
            prevNode->nextNode = newNode;
        prevNode = newNode;
        [self testMethod]; // < comment out this line or use -O0 and it behaves as expected
    }

    // verify the linked list
    // (should output five lines of "node", but only outputs one if not using -O0 on ARM)
    node = rootNode;
    while (node)
    {
        NSLog(@"node");
        node = node->nextNode;
    }
}

- (void)testMethod
{
    ;
}

If somebody can see what's wrong here, that would be appreciated. Otherwise, I guess I'll file it.
You can test it yourself by pasting it into the app delegate of a "Hello World" type program on your favorite ARM device. Don't forget to set your optimization level to something other than -O0. Also don't forget to add the method declarations, and add [self doLinkedListTest]; at the bottom of applicationDidFinishLaunching.
Weird.
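Following the earlier suggestion to reduce the problem to a tiny test case, one possible next step is a plain C version with the Obj-C message send swapped for an ordinary call. This is only a sketch and not a confirmed reproduction: `opaque_call` is a hypothetical stand-in for `[self testMethod]`, routed through a volatile function pointer so the compiler cannot inline it away, and `linked_list_test` is a made-up name.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct _Node
{
    struct _Node *nextNode;
} Node;

/* Hypothetical stand-in for [self testMethod]: does nothing. */
static void empty_function(void)
{
}

/* Calling through a volatile pointer keeps the call from being optimized out. */
static void (*volatile opaque_call)(void) = empty_function;

/* Builds the same five-node list as the test above and returns how many
 * nodes are reachable from the root. The expected result is 5. */
static int linked_list_test(void)
{
    /* prevNode is left uninitialized on purpose, as in the original test;
     * it is only read after the first iteration has assigned it. */
    Node *rootNode, *newNode, *prevNode, *node;
    int i, count = 0;

    /* construct the linked list */
    rootNode = NULL;
    for (i = 0; i < 5; i++)
    {
        newNode = (Node *)malloc(sizeof(Node));
        assert(newNode != NULL);
        newNode->nextNode = NULL;
        if (rootNode == NULL)
            rootNode = newNode;
        else
            prevNode->nextNode = newNode;
        prevNode = newNode;
        opaque_call();
    }

    /* count the nodes, freeing each one as it is passed */
    node = rootNode;
    while (node)
    {
        Node *next = node->nextNode;
        free(node);
        count++;
        node = next;
    }
    return count;
}
```

If this version returns 5 at every optimization level on the device while the Obj-C version still fails, that would point toward the message send; if it also fails, the report to Apple gets simpler.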