Occlusion queries... SLOW.
This morning I implemented some trickery with occlusion queries to render a halo around the sun in my environment. Using two queries, ( one without depth testing, and one with ) I get a ratio of the amount of the sun that's visible and use that to determine the size and opacity of the halo.
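The ratio described above can be sketched roughly as follows. This is only an illustration, not the poster's actual code: `sunVisibility`, `Halo`, and `haloFromVisibility` are hypothetical names, and the linear mapping from visibility to halo size and opacity is an assumption.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper: turns the two query results into a visibility ratio in [0, 1].
//   samplesUntested: samples passed with depth testing disabled (the on-screen part of the sun quad)
//   samplesTested:   samples passed with depth testing enabled (the unoccluded part only)
float sunVisibility(std::uint32_t samplesTested, std::uint32_t samplesUntested) {
    if (samplesUntested == 0) {
        return 0.0f;  // nothing of the sun quad was rasterized at all
    }
    const float ratio = static_cast<float>(samplesTested) /
                        static_cast<float>(samplesUntested);
    return std::max(0.0f, std::min(ratio, 1.0f));
}

// Hypothetical halo parameters derived from the ratio.
struct Halo {
    float scale;    // relative size of the halo sprite
    float opacity;  // alpha used when blending the halo
};

// Assumed mapping: opacity follows visibility directly and the halo
// shrinks to half size when the sun is fully hidden.
Halo haloFromVisibility(float visibility) {
    return Halo{0.5f + 0.5f * visibility, visibility};
}
```

For example, 50 samples passing with depth testing against 200 without gives a visibility of 0.25.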
It looks *great*. I get smooth dimming of the halo based on how much of the sun is visible.
The trouble is, the performance is simply awful. With profiling, I see that my query testing code is taking up 75% of my render time. Whereas before ( using ray tests ) my code to draw the sun took significantly less than 1%.
Is there anything I should know to make this perform acceptably? Or should I just stay away from occlusion queries?
EDIT: I have to assume the hit is because *asking* for the passed fragments causes a pipeline stall while the card finishes what I rendered. Is there some way to minimize this hit?
My best guess right now is to have my occlusion value be one frame old. E.g., I get the occlusion results *after* finishing the current frame and use them for the next frame. Sure, it'd be one frame old, but I don't think anybody would notice.
I'm sure you've read something like this:
From looking at the ARB_occlusion_query documentation, it seems that occlusion queries may be quite expensive: you have to submit rendering data twice to determine occlusion. There are situations in which it may help, e.g. when you use multitextured objects or multipass rendering.
If you are doing this for the sole purpose of the lens flare, I think you're better off with the ray intersection approach. Even if you check rays only every 2-3 frames, I think you'll be OK.
I did read it, and I also read the chapter on it in the _OpenGL SuperBible_. I was just hoping that there was some triviality I'd missed. I suppose not.
The question, then, is what's the optimal strategy for performing occlusion tests? When's the optimal time to read the results of the queries?
All that being said, my ray approach is plenty fast enough as is ( I'm using a thin wrapper/adaptation of ODE's ray collision detection ), it's just that it's binary, where occlusion testing gives me a nice gradation of value.
Optimal time to read queries?
I have not played with this extension, but from the documentation it seems that it would make no difference at what time you do it.
Are you submitting the object itself or its bounding box for the occlusion query?
Submitting the bounding box instead could make a huge difference, especially if your objects have lots of polygons.
Can you get away with submitting the bounding box of the objects for the occlusion test?
Already am -- I'm just submitting a simple square, billboarded quad.
Actually, I made headway on "perceived" correctness while using the ray approach. Instead of having a simple occluded/not-occluded toggle, now I have it set glare intensity based on the number of the last 20 frames that the ray passed the test. It actually results in a "similar" effect ( sun fades out as you walk past a tree, for example, instead of blinking out ), though it's not formally correct.
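A minimal sketch of that over-time fading, assuming a fixed ring buffer of the last 20 ray results. `RayHistory` is a hypothetical name, and dividing by the number of frames recorded so far during the first 20 frames is an assumption rather than something stated in the post.

```cpp
#include <array>
#include <cstddef>

// Hypothetical helper: remembers the last 20 ray-test results and reports
// the fraction that passed, which can be used as the glare intensity.
class RayHistory {
public:
    // Record whether this frame's ray reached the sun unobstructed.
    void push(bool visible) {
        if (count_ == kFrames) {
            // Buffer is full: forget the oldest sample before overwriting it.
            hits_ -= samples_[next_] ? 1 : 0;
        } else {
            ++count_;
        }
        samples_[next_] = visible;
        hits_ += visible ? 1 : 0;
        next_ = (next_ + 1) % kFrames;
    }

    // Fraction of recorded frames in which the ray passed, in [0, 1].
    float intensity() const {
        if (count_ == 0) {
            return 0.0f;
        }
        return static_cast<float>(hits_) / static_cast<float>(count_);
    }

private:
    static constexpr std::size_t kFrames = 20;
    std::array<bool, kFrames> samples_{};
    std::size_t next_ = 0;   // slot the next sample will be written to
    std::size_t count_ = 0;  // number of valid samples, at most kFrames
    std::size_t hits_ = 0;   // number of valid samples that passed
};
```

Calling `push()` once per frame with the ray result and multiplying the glare alpha by `intensity()` would give the fade as the sun passes behind a tree.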
If your ray tests aren't too expensive, could you perhaps do several slightly scattered rays to cover more of the area of the sun? Maybe I'm way off base, but it seems like that would be a good compromise.
- Alex Diener
That's always been my thought for a fallback. Combined with my over-time fading, it'd be pretty effective, and would probably work fine with just three rays.
I'm using ODE for all my kinematics and collision detection, and ODE's ray/mesh intersection is pretty fast. So it's probably not too big a deal.
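The scattered-ray idea might look something like the sketch below. It does not call ODE: `rayIsClear` is a hypothetical callback standing in for whatever ray/mesh test is actually used, and `Vec3` and `visibleFraction` are illustrative names as well.

```cpp
#include <array>
#include <cstddef>

struct Vec3 {
    float x, y, z;
};

// Casts one ray per offset toward a slightly displaced point on the sun and
// returns the fraction of rays that were not blocked.
// rayIsClear(target) is assumed to return true when nothing lies between
// the camera and target.
template <typename RayTest, std::size_t N>
float visibleFraction(const Vec3& sunCenter,
                      const std::array<Vec3, N>& offsets,
                      RayTest&& rayIsClear) {
    static_assert(N > 0, "need at least one ray");
    std::size_t clear = 0;
    for (const Vec3& o : offsets) {
        const Vec3 target{sunCenter.x + o.x, sunCenter.y + o.y, sunCenter.z + o.z};
        if (rayIsClear(target)) {
            ++clear;
        }
    }
    return static_cast<float>(clear) / static_cast<float>(N);
}
```

With three offsets this gives four possible intensity levels per frame, which the over-time fading could then smooth further.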
Occlusion tests are asynchronous. If you're making the query then immediately asking for the result, performance is expected to be miserable. You need to make the query, do a whole bunch of other work, then ask for the result. You may find that you need to work one frame behind in order to get enough time between making the query and asking for the result.
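A rough sketch of working one frame behind, under a few assumptions. To keep it compilable without a GL context, the GL calls are hidden behind a hypothetical `Backend` type: `begin()` would wrap `glBeginQuery(GL_SAMPLES_PASSED, id)`, `end()` would wrap `glEndQuery(GL_SAMPLES_PASSED)`, and `result()` would wrap `glGetQueryObjectuiv(id, GL_QUERY_RESULT, &value)` (or the `ARB`-suffixed equivalents). `DelayedQuery` itself is an illustrative name, not part of any library.

```cpp
#include <cstdint>

// Reads the previous frame's query result before issuing this frame's query,
// so the GPU has had a whole frame to finish the work being read back.
template <typename Backend>
class DelayedQuery {
public:
    explicit DelayedQuery(Backend& backend) : backend_(backend) {}

    // Call once per frame. draw() should render the proxy geometry
    // (for the sun, the billboarded quad).
    // Returns the sample count from the previous frame's query,
    // or 0 on the very first frame when no query has been issued yet.
    template <typename DrawFn>
    std::uint32_t update(DrawFn&& draw) {
        if (pending_) {
            // This query was issued a frame ago, so the readback is far
            // less likely to stall the pipeline.
            lastResult_ = backend_.result();
        }
        backend_.begin();
        draw();
        backend_.end();
        pending_ = true;
        return lastResult_;
    }

private:
    Backend& backend_;
    bool pending_ = false;
    std::uint32_t lastResult_ = 0;
};
```

A variation would be to poll `GL_QUERY_RESULT_AVAILABLE` first and keep the old value when the result is not ready yet, at the cost of occasionally using data more than one frame old.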
Thanks, that did it. I query the last result, then perform a new one. My results are one frame old but you'd never know. The cool part is that it went from spending 75% of render time in the query readback, to 3%. Very nice.
Thanks!