FBO rendering to GL_ALPHA texture
Right,
Short question: I'd like to render an alpha buffer to a GL_ALPHA texture. Basically, right now I'm blasting quads textured with a GL_ALPHA texture to the screen, but I'd like to cache those results in another GL_ALPHA texture.
Now, I've read that FBO attachments to alpha textures are no-go (or were in 2007) – does anyone have a better idea?
These are my ideas at the moment:
1) ditch the caching altogether and keep running quads to screen.
2) use glCopyTexSubImage2D to assemble the texture. (Assembly speed is not critical – this'll happen on average every three frames or so.)
3) suck it up and run GL_RGBA textures, thereby inflating VRAM use to 400%.
I'm leaning towards 2) here, but I thought I'd ask around and see if you 1337s have a better idea?
I have never stress-tested glCopyTexSubImage2D in this way. How slow is it really? A typical usage would be 100 calls of 50x50 quads to a 256x256 texture – it sounds expensive to me?
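For a sense of scale on option 3, here is a rough sketch of the arithmetic. It assumes tightly packed 8-bit channels (1 byte per texel for GL_ALPHA, 4 for GL_RGBA); the helper name `texture_bytes` is made up for illustration, and the driver's real internal layout may differ.

```c
#include <stddef.h>

/* Bytes needed for one mip level of a tightly packed texture.
   bytes_per_texel is assumed to be 1 for GL_ALPHA / GL_UNSIGNED_BYTE and
   4 for GL_RGBA / GL_UNSIGNED_BYTE; actual driver storage may differ. */
size_t texture_bytes(size_t width, size_t height, size_t bytes_per_texel)
{
    return width * height * bytes_per_texel;
}
```

Under those assumptions a 256x256 texture comes to 65,536 bytes as alpha and 262,144 bytes as RGBA, i.e. four times the size.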
Is that 100 256x256 textures being rendered, or just one that has 100 quads drawn into it? A 256x256 texture is tiny. I can't imagine that using glCopyTexSubImage2D() for that every few frames is going to be even measurable.
The biggest benefit of glCopyTexSubImage2D() is that it's super simple to use without needing to set up extra buffers or anything. It should be easy to hack something together to see.
Scott Lembcke - Howling Moon Software
Author of Chipmunk Physics - A fast and simple rigid body physics library in C.
Sorry, it's 100 alpha-textured quads being copied with glCopyTexSubImage2D into the texture.
I suppose I could just try it out and see what kind of performance I'm getting.
The performance lab results are back, and they're sweet.
First off, of course I meant glTexSubImage2D, since I have the source pixels in memory.
I simply ran a test where I do 3 rounds of 100 copies of 50x50 random pixel data into a texture. This would simulate doing the above, but three times a frame instead of every other frame (I aimed for the worst case).
This takes, on average, 2 ms for 100 copies. If ever I wanted to say negligible, this is the occasion.
However, it gets better: turning on DMA transfers using glPixelStorei(GL_UNPACK_CLIENT_STORAGE_APPLE, GL_TRUE); pushes that down below half a millisecond for 100 copies.
That is very, very acceptable.
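A side note for anyone repeating this test with 50x50 single-byte alpha blocks: OpenGL's GL_UNPACK_ALIGNMENT defaults to 4, so the row stride it reads from client memory is rounded up to a multiple of 4 bytes unless the alignment is set to 1 with glPixelStorei. The poster doesn't say which alignment was used, so this is only a sketch of the stride arithmetic; `unpack_row_stride` is a hypothetical helper, not a GL call.

```c
#include <stddef.h>

/* Row stride in bytes that OpenGL assumes when reading client pixel data,
   given the tightly packed row size and GL_UNPACK_ALIGNMENT (1, 2, 4 or 8).
   Hypothetical helper for illustration; it does not call into OpenGL. */
size_t unpack_row_stride(size_t row_bytes, size_t alignment)
{
    /* Round row_bytes up to the next multiple of alignment. */
    return (row_bytes + alignment - 1) / alignment * alignment;
}
```

With 50 bytes per row, an alignment of 4 gives a stride of 52, so tightly packed source data would need an alignment of 1 (or rows padded to 52 bytes) to be read correctly.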
Now I'm curious -- what are you doing with 100 GL_ALPHA textures in a scene, anyway?
I was going to hold off talking about it until I had something to show for it, but it's text related: I'm storing characters as alpha textures and assembling them into string textures. It's working hell of good.
Animating the individual characters I assume? Sounds like it could be very cool.
Scott Lembcke - Howling Moon Software
Author of Chipmunk Physics - A fast and simple rigid body physics library in C.
nifty. we await sweet movies.
I need help, mates. I'm just shocked by how incredibly slow glCopyTexSubImage2D() is on the iPhone.
What I do is just copy a 32x32 image area into a texture, and it takes about 250 ms O_O
Is that normal? =\
Instead of copying framebuffer content into a texture on the iPhone, you should just render it directly to the texture in the first place.
Thanks, just did it.
I was trying to make this feature platform-independent; now I see that it's impossible and I have to branch my code.

Progress bar: [#########-]
I hit a driver bug that put fugly artifacts into my glTexSubImage2D copies (Radar #6932125), which had me write a workaround. Instead of uploading a clear texture and copying into it, I assemble the texture in memory in a block of unsigned bytes, and upload that once it is done. This way, each newly added glyph takes nothing more than a bastard sibling of memcpy, and I only have to stall the pipeline once (when the texture is ready).
The result is ridiculously fast - fast enough that I can run most of these operations in real time, per frame, without any real performance hit.
Just wanted to share that a driver bug can inspire that mythic algorithm change that gives a factor-of-10 speedup.
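The poster's code isn't shown, so the following is only a minimal sketch of what such an in-memory assembly step might look like, assuming tightly packed, row-major 8-bit alpha buffers. The function name `blit_glyph` and its parameters are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Copy a glyph_w x glyph_h block of 8-bit alpha values into a larger
   8-bit buffer at (dst_x, dst_y). Both buffers are assumed to be tightly
   packed and row-major. Returns 0 on success, -1 if the glyph would not
   fit inside the destination. */
int blit_glyph(unsigned char *atlas, size_t atlas_w, size_t atlas_h,
               const unsigned char *glyph, size_t glyph_w, size_t glyph_h,
               size_t dst_x, size_t dst_y)
{
    if (dst_x > atlas_w || glyph_w > atlas_w - dst_x ||
        dst_y > atlas_h || glyph_h > atlas_h - dst_y)
        return -1;

    /* One memcpy per glyph row; no GL calls are involved at this stage. */
    for (size_t row = 0; row < glyph_h; ++row) {
        memcpy(atlas + (dst_y + row) * atlas_w + dst_x,
               glyph + row * glyph_w,
               glyph_w);
    }
    return 0;
}
```

Once every glyph has been placed, the finished buffer would be handed to OpenGL in a single upload call (for example glTexImage2D or glTexSubImage2D with GL_ALPHA and GL_UNSIGNED_BYTE), so the pipeline is touched once per string texture rather than once per glyph.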