Faster OpenGL in Cocoa

Krevnik
Unregistered
 
Post: #16
jabber Wrote:Krevnik: your link is a 404. Sad
Yeah, because I linked to an area of comcast, rather than my webspace:

http://home.comcast.net/~krevnik/files/CustomGL.zip

Arekkusu: The thread approach is a valid one, but I would rather offload physics and networking onto the second thread than rendering. Those are just my thoughts, though.

The odd thing is that on reasonably fast machines, you don't get the weird lag I am seeing on my Pismo from NSTimer. What is really weird is that when I kicked my timer up from 1/20th of a second to 1/200th of a second, my framerate dropped rather than rose (mind you, this is an LCD, not a CRT system, so the refresh is listed as 0 Hz). I honestly haven't found the full cause of the lag, but it went away the moment I stopped using NSTimer to drive the rendering code, and I saw massive speedups across every NeHe tutorial I have worked with so far by switching from NSTimer to a custom event loop. I would say that isn't anything to sneeze at.

While it can be debated which should be used now, custom event loops used to be the only way to do things like this before Carbon/Cocoa, and they have the advantage of letting the developer tweak the code to minimize overhead from the APIs. I don't see how this method would be any less valid now than it was then, especially with how little code it takes to customize the event loop. Another point is that the custom Application class only needs to be written once, and a delegate superclass for OpenGL software can be written once as well. That way the extra effort does not need to be repeated for every application, which means the code is portable between Cocoa apps.
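The structure being described, drain pending events without blocking, then render, can be sketched in a few lines. This is a hypothetical illustration in Python, not the actual Cocoa API; `poll_event`, `handle_event`, `render`, and `should_quit` stand in for the NSApplication machinery the custom subclass would override.

```python
def run_custom_loop(poll_event, handle_event, render, should_quit):
    """Tight event loop: handle every pending event without blocking,
    then render immediately, instead of waiting on a timer callback."""
    frames = 0
    while not should_quit():
        event = poll_event()        # non-blocking; None when the queue is empty
        while event is not None:
            handle_event(event)
            event = poll_event()
        render()                    # render as often as the loop spins
        frames += 1
    return frames
```

In Cocoa terms, the non-blocking poll corresponds to asking for the next event with an immediate timeout, and the render call is the per-frame drawing; the loop spins as fast as the machine allows rather than at a timer's cadence.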

Oh, and looking at Shoot Things, it starts to lag a bit on my system as well when lots of bullets start hitting the screen and passing the asteroids (when the metroids show up for the first time). I am curious: what system do you use for development?
Sage
Posts: 1,232
Joined: 2002.10
Post: #17
Yes, the custom event loop is a perfectly valid way to avoid runloop overhead. But it is a bit more work to handle both fullscreen and windowed modes. Multiple threads will be more efficient, but of course they're an even bigger pain to set up.

I started with the same Omni Group GDC 2001 stuff a few years ago. But I haven't noticed the _overhead_ of NSTimer to be detrimental on my dev machines (from a 233 MHz Wallstreet to 667 and 800 MHz TiBooks to a 1.25 GHz AlBook), especially when the bottleneck in my apps is typically video card fill rate, not event processing. (Certainly, the _inaccuracy_ of NSTimer has been a problem.)

Re: Shoot Things: a 700 MHz eMac can run it at a comfortable 60 fps. Anything slower will drop frames, and anything too slow to sustain 30 fps will lag (the animation is frame-based, not time-based). A Pismo is too slow (you can hit Ctrl-D to see the fps counter).
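The frame-based vs. time-based distinction mentioned here is easy to illustrate. A minimal sketch in Python (the function names are mine, not from any of the code discussed):

```python
def frame_based_position(units_per_frame, frames_rendered):
    """Frame-based animation: the world advances a fixed amount per
    rendered frame, so a machine that drops frames also slows the game."""
    return units_per_frame * frames_rendered

def time_based_position(units_per_second, elapsed_seconds):
    """Time-based animation: movement tracks wall-clock time, so the
    apparent speed is the same regardless of frame rate."""
    return units_per_second * elapsed_seconds
```

This is why a machine that can't sustain the target frame rate doesn't just look choppy under frame-based animation; the whole game actually runs slower.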
Sage
Posts: 1,232
Joined: 2002.10
Post: #18
And, contrary to what I posted above, Apple has now posted a Tech Q&A saying that you shouldn't use NSTimers with high rates like 0.001s because it burns CPU.

Of course Apple's own sample code still does exactly that. Sigh.
Moderator
Posts: 916
Joined: 2002.10
Post: #19
arekkusu Wrote:And, contrary to what I posted above, Apple has now posted a Tech Q&A saying that you shouldn't use NSTimers with high rates like 0.001s because it burns CPU.

Of course Apple's own sample code still does exactly that. Sigh.
Well, it is still good up to 120 fps, which is faster than most games need to go.
Moderator
Posts: 3,570
Joined: 2003.06
Post: #20
arekkusu Wrote:And, contrary to what I posted above, Apple has now posted a Tech Q&A saying that you shouldn't use NSTimers with high rates like 0.001s because it burns CPU.

Of course Apple's own sample code still does exactly that. Sigh.
Well, I'm glad they finally said something about it, I guess. They said that it'll "burn up additional CPU time without providing any visible benefit" if you set a timer faster than the vertical refresh rate. I guess we all know that, but I've had the "visible" benefit of seeing my frame rate drop from 550 to 200 fps by doing something one way rather than another during development. IOW, it acts as a good indicator that I screwed up. I would still stand by the technique of running it hot at 0.001 seconds just for this purpose, but cutting it to the vertical retrace for deployment. They also say that "As a general rule, 30 to 60 frames per second is an acceptable frame rate for most applications, thus a timer value that will yield a framerate in this range would be a good place to start." While I agree with this statement, I would suggest that it's better to start by synching to VBL instead to get the best visual performance, since not all users are using an LCD as they seem to be assuming. If you set your timer for 60 frames and the user is running at 75 Hz, that doesn't make much sense either.

[edit] Oh yeah, and there's a typo in their example. It should be 0.01 instead of 0.1 for 100 ms. Their example will give you 10 fps. 0.0083 might make more sense in case a user has a refresh of 120 Hz. Still better to go VBL IMO.
Sage
Posts: 1,232
Joined: 2002.10
Post: #21
Yes, you should *ALWAYS* sync to VBL, regardless of whether the user has a CRT or LCD. There's no excuse for poking people in the eye with VBL tear. You might as well inject static into your audio.

But 1 sec = 1000 ms, so 0.1 sec = 100 ms. 0.01 would be 1/100th of a second, or 10 ms. I think that 0.005 is a good upper limit for NSTimer, as eMac CRTs top out around 120Hz.
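The arithmetic being argued over is just unit conversion, and it is easy to sanity-check. A quick sketch (helper names are mine, for illustration):

```python
def interval_to_ms(seconds):
    """An NSTimer interval is specified in seconds; convert to milliseconds."""
    return seconds * 1000.0

def max_fps(interval_seconds):
    """A timer firing every interval_seconds can fire at most 1/interval
    times per second, which caps the frame rate it can drive."""
    return 1.0 / interval_seconds
```

So 0.1 s is 100 ms (a 10 fps cap), 0.01 s is 10 ms (100 fps), and 0.005 s allows up to 200 fps, enough headroom for a 120 Hz CRT.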
Moderator
Posts: 3,570
Joined: 2003.06
Post: #22
arekkusu Wrote:But 1 sec = 1000 ms, so 0.1 sec = 100 ms. 0.01 would be 1/100th of a second, or 10 ms. I think that 0.005 is a good upper limit for NSTimer, as eMac CRTs top out around 120Hz.
Whoops! You are correct. But their example is still stupid since a 100 ms timer gives you only 10 fps!

I agree that 0.005 (200 fps) should be acceptable. 0.0083 is mathematically closer to hitting 120 but it'll still skip a frame once in a while (not that it'd be noticed).
nabobnick
Unregistered
 
Post: #23
A useful link I found today at flipCode about game loop timing: Game Loops. It gives an overview of separating logic and rendering timing so that logic runs in lockstep while rendering can speed up and slow down independently. I'm currently working on tying this into the sample code I posted the other day (the one that overrides NSApplication). When I have a good, mostly general solution I'll post that as well.
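The scheme that article describes is the classic fixed-timestep accumulator. A minimal sketch in Python, using integer milliseconds to keep the bookkeeping exact (function and parameter names are mine, not from the article or the posted sample code):

```python
def run_fixed_timestep(frame_times_ms, logic_dt_ms):
    """Run game logic in lockstep increments of logic_dt_ms while rendering
    once per (variable-length) frame. frame_times_ms lists how long each
    rendered frame took, in milliseconds.
    Returns (logic_updates, frames_rendered)."""
    accumulator = 0
    logic_updates = 0
    frames = 0
    for dt in frame_times_ms:
        accumulator += dt              # bank the real time that elapsed
        while accumulator >= logic_dt_ms:
            logic_updates += 1         # advance game state by one fixed step
            accumulator -= logic_dt_ms
        frames += 1                    # render once, however long it takes
    return logic_updates, frames
```

The payoff is that game logic runs a deterministic 1000/logic_dt_ms updates per second no matter how fast or slow the renderer is, while rendering speeds up and slows down independently.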
Moderator
Posts: 3,570
Joined: 2003.06
Post: #24
Okay, I just did some testing. First of all, there is no way in hell 0.01 is acceptable, with or without VBL. Here are some qualitative assessments on an LCD (60 Hz refresh), using my OpenGL test game, which runs at roughly 350 fps unrestrained on my machine, in a 1057 x 793 pixel window (640x480 aspect ratio).

VBL on:

0.0083 - smooth
0.0082 - smooth, with a small hiccup once every second or two
0.0081 - smooth, with more hiccups as more frames were apparently skipped
0.0085 - again smooth, but with hiccups
0.005 - smooth, but a small, even chop is noticeable
0.007 - same as 0.005
0.0001 - as smooth as 0.0083

VBL off:

0.0083 - by far the smoothest performance, but tears every second or two
0.0085 - smooth, but lots of hiccups, worse as the interval is increased
0.005 - choppy but even
0.007 - choppier than 0.005 but still even
0.0001 - very smooth with very little noticeable chop

So this tells me a few things. First, 0.0083 offers reliable performance if the display's refresh is 60 or 120 Hz (which makes sense, since 1/120 = ~0.00833). Second, 0.005 doesn't guarantee smooth performance but seems to be a compromise (and who wants to compromise?). Third, if the user's display is not refreshing at either 60 or 120 Hz, 0.0083 most likely won't look good at all. And finally, fourth, I'm not at all convinced that what that Tech Q&A said about tiny intervals like 0.0001 being bad is actually correct. On the contrary, according to the data I just gathered, setting your timer to 0.0001 and turning on VBL synch still appears to offer the best visual performance across different vertical refresh rates. I could be wrong, but I'd like to see more discussion about this now. Maybe this thread should be split.
Sage
Posts: 1,232
Joined: 2002.10
Post: #25
I agree with your conclusion about 0.0001, though I am wondering if you could add your observations on CPU usage to your findings, particularly whether you notice any difference between 0.0083, 0.005, and 0.0001 (or, for that matter, 0.00000000000001) with VBL sync on. I find no measurable difference on all the hardware I've tested (which is nearly everything Apple has shipped in the last 3 years, though that doesn't include Krevnik's Pismo).

Also, it's not safe to assume that 0.0083 works on LCD displays because the refresh rate is a multiple of 60. Some LCDs refresh at exactly 60.0 Hz, but many are slightly faster or slower which will lead to your timer drifting. And of course the accuracy of NSTimer is not guaranteed.
Moderator
Posts: 3,570
Joined: 2003.06
Post: #26
Is there an accurate way to get a CPU usage average? Looking at them through Activity Monitor, they all look pretty much the same to me: somewhere between 4 and 14%, with 7-10% being most common. I didn't see any difference between 0.001 and 0.000001f. The only one that actually *seemed* to be lower was 0.01, and even that wasn't by more than 3 or 4%. I suppose I could also test with more entities and see if heavier processing makes the results more dramatic.

arekkusu Wrote:Also, it's not safe to assume that 0.0083 works on LCD displays because the refresh rate is a multiple of 60. Some LCDs refresh at exactly 60.0 Hz, but many are slightly faster or slower which will lead to your timer drifting. And of course the accuracy of NSTimer is not guaranteed.
I agree, that's why I don't think it's a good idea to pick a number out of the sky like Apple is suggesting. It's possible that one could determine the best timer frequency based on the current refresh rate, but as you suggested, NSTimer is a bit funky about accuracy anyway.
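Deriving the interval from the display's reported refresh rate, rather than picking a constant, could look something like this. A hypothetical sketch (the function name is mine); the fallback covers displays that report 0 Hz, as LCDs were noted to do earlier in the thread:

```python
def timer_interval_for_display(refresh_hz, fallback_hz=60.0):
    """Pick a timer interval matching the display's refresh rate instead
    of a hard-coded constant. Displays reporting 0 Hz get the fallback."""
    hz = refresh_hz if refresh_hz > 0 else fallback_hz
    return 1.0 / hz
```

This sidesteps the 60 Hz vs. 75 Hz mismatch, though as noted above, NSTimer's own inaccuracy means the timer will still drift against the real retrace, which is why VBL synch remains the better anchor.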
Sage
Posts: 1,232
Joined: 2002.10
Post: #27
Activity Monitor is not very accurate, since, like top, it only samples once per second (or whatever you've set the interval to). Profiling with Shark samples every 1 ms (also configurable) and gives a clearer picture of how much CPU your app is using against a (hopefully) stable background OS. It also points out which methods in your app are using the time; that ought to be 99% your rendering and 1% NSTimer and other overhead (objc_msgSend etc.).
Moderator
Posts: 3,570
Joined: 2003.06
Post: #28
Duh... I totally forgot about Shark. By adding 200 entities on screen I was able to pound the processor harder. Here are the new results using Shark, under the same general conditions as before, with 46% of in-app time being spent in collision detection:

0.00001 - 24.0%
0.001 - 23.3%
0.005 - 23.1%
0.0083 - 23.4%
0.01 - 24.5%
0.01666667 - 16.3% (1/60 = 0.01666667)

The last one gave miserable performance because it kept skipping frames while trying to synch to VBL; don't do it that way. I thought I'd check how close the timer could stay to 60 fps with VBL synch, but it didn't work well (just as expected).
Sage
Posts: 1,232
Joined: 2002.10
Post: #29
Hmm. Well, that is pretty much what I expect; the numbers in Shark are always ±1%, so as long as 0.00000000001 doesn't jump up to 40%, I don't think it's worth worrying about. In a single-threaded app the timer callback is blocked by your rendering, so the behavior ought to be that it fires as soon as your method exits and the runloop gets control again. In other words, AFAICT, there shouldn't be any appreciable CPU burn, despite Apple's Tech Q&A.
Moderator
Posts: 3,570
Joined: 2003.06
Post: #30
arekkusu Wrote:so as long as 0.00000000001 doesn't jump up to 40% then I don't think it's worth worrying about.
Agreed. Until Apple or somebody else comes up with a better explanation or gives some numbers that say otherwise, that Tech Q&A appears to be suspiciously incorrect. I'm sticking with 0.001f for my timer interval and VBL synch on for deployment.