A reasonable "fast path" for glReadPixels

Sage
Posts: 1,199
Joined: 2004.10
Post: #1
I know, glReadPixels and "fast path" go together like cute puppies and boa constrictors.

But here's what I'm doing: I'm rewriting my capture-to-quicktime-movie code to be as efficient as is reasonably possible. I'd like to know what's the best format and type, whether ( for example ) I can grab GL_RGB and not worry about format swizzling, and so on.


Thanks,
Luminary
Posts: 5,143
Joined: 2002.04
Post: #2
The most important thing is to get it working asynchronously. Theoretically you can do that with PBOs, though you might have better luck with the old GetTexImage path. Details for using PBOs in this way are in the extension spec; details on the GetTexImage path (Mac-only) are in the mac-opengl list archives.

Either way, you'll need to use a format that matches the read source (framebuffer or texture) exactly. That probably means BGRA, UNSIGNED_INT_8_8_8_8(_REV) as for texture uploads.
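To make that format/type pair concrete, here's a minimal sketch of how one pixel read back as BGRA with UNSIGNED_INT_8_8_8_8_REV can be pulled apart on the CPU. With the _REV type the first component of the format (blue) sits in the low 8 bits, so each pixel is a 32-bit 0xAARRGGBB value. The `Rgba8` struct and the function name are made up for illustration.

```c
#include <stdint.h>

/* Hypothetical helper type: one unpacked pixel. */
typedef struct { uint8_t r, g, b, a; } Rgba8;

/* Unpack one pixel read as GL_BGRA / GL_UNSIGNED_INT_8_8_8_8_REV.
 * The packed 32-bit value is 0xAARRGGBB: blue in bits 0-7,
 * green in 8-15, red in 16-23, alpha in 24-31. */
static Rgba8 unpack_bgra_8888_rev(uint32_t pixel)
{
    Rgba8 out;
    out.b = (uint8_t)( pixel        & 0xFFu);
    out.g = (uint8_t)((pixel >> 8)  & 0xFFu);
    out.r = (uint8_t)((pixel >> 16) & 0xFFu);
    out.a = (uint8_t)((pixel >> 24) & 0xFFu);
    return out;
}
```

Working on whole 32-bit values like this keeps the code independent of the machine's byte order.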

It suddenly occurs to me that even with PBOs, CopyTexSubImage2D + GetTexImage might still be better than ReadPixels at ensuring asynchronicity.
Sage
Posts: 1,199
Joined: 2004.10
Post: #3
That's interesting information, OSC, and I'm hoping I can pick your brain a little ( though, of course, I'm googling and will examine the mac-opengl archives ).

My plan was to have the main thread use glReadPixels to put the current frame into a buffer, and have a separate thread append that buffer to a flat file in /tmp while the main thread went on to render the next frame.

So, what it *sounds* like to me is that I should make a shared context, and use glCopyTexSubImage2D to copy the framebuffer to a texture in that context, then from a separate thread use glGetTexImage to read those pixels out. Am I correct?

Where do PBOs come in? Or are they there just to make a separate shared context?
Sage
Posts: 1,199
Joined: 2004.10
Post: #4
From the mac-opengl list I just saw this:
Code:
All you need is something like:

    // Init
    glGenBuffers(1, &pboID);
    glBindBuffer(GL_PIXEL_PACK_BUFFER_ARB, pboID);
    glBufferData(GL_PIXEL_PACK_BUFFER_ARB, IMAGE_SIZE, NULL, GL_STATIC_READ);

    // for each read back, be sure the right PBO is bound
    glBindBuffer(GL_PIXEL_PACK_BUFFER_ARB, pboID);
    glReadPixels(0, 0, width, height, GL_BGRA, GL_UNSIGNED_INT_8_8_8_8_REV, 0);


To access the pixels, then:

    GLubyte *pixels = glMapBuffer(GL_PIXEL_PACK_BUFFER_ARB, GL_READ_ONLY);

    // Do something with the pixels
    processImage(pixels);

    if (!glUnmapBuffer(GL_PIXEL_PACK_BUFFER_ARB)) {
        // Handle error case
    }

The format and type of the glReadPixels is critical - though other 4 component, 8 bits per component combinations should work as well.
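The snippet above leaves IMAGE_SIZE undefined. As a rough sketch, assuming a tightly packed 4-component, 8-bits-per-component readback (so 4 bytes per pixel, with no row padding at the default pack alignment of 4), it could be computed like this. The function name is hypothetical.

```c
#include <stddef.h>

/* Bytes needed to hold a width x height readback at 4 bytes per pixel
 * (e.g. GL_BGRA with GL_UNSIGNED_INT_8_8_8_8_REV).
 * Assumption: rows are tightly packed, which holds for 4-byte pixels
 * with the default GL_PACK_ALIGNMENT of 4 and no GL_PACK_ROW_LENGTH. */
static size_t readback_size_bytes(size_t width, size_t height)
{
    const size_t bytes_per_pixel = 4;
    return width * height * bytes_per_pixel;
}
```

For a 640x480 window that works out to 1,228,800 bytes per frame, which is worth keeping in mind when deciding how many frames to buffer.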

I found this simpler alternate approach using the shared texture storage extensions:
Code:
glEnable(GL_TEXTURE_RECTANGLE_EXT);
glBindTexture(GL_TEXTURE_RECTANGLE_EXT, texName);
glTexParameteri(GL_TEXTURE_RECTANGLE_EXT, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
glTexParameteri(GL_TEXTURE_RECTANGLE_EXT, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
glTexParameteri(GL_TEXTURE_RECTANGLE_EXT, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
glTexParameteri(GL_TEXTURE_RECTANGLE_EXT, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
glTexParameteri(GL_TEXTURE_RECTANGLE_EXT, GL_TEXTURE_STORAGE_HINT_APPLE, GL_STORAGE_SHARED_APPLE);

// initialize the texture
glTexImage2D(GL_TEXTURE_RECTANGLE_EXT, 0, GL_RGBA, imageWidth, imageHeight, 0, GL_BGRA, GL_UNSIGNED_INT_8_8_8_8_REV, pixelbuffer);

glCopyTexSubImage2D( GL_TEXTURE_RECTANGLE_EXT, 0, 0, 0, 0, 0, imageWidth, imageHeight);

glFlush();

// glFlush initiates the AGP download and glGetTexImage will block for it to complete
// one can do other interesting work here in the meantime or use 2 textures and ping pong between them

glGetTexImage(GL_TEXTURE_RECTANGLE_EXT, 0, GL_BGRA, GL_UNSIGNED_INT_8_8_8_8_REV, pixelbuffer);
glFlush();

These both look super straightforward. Your thoughts, OSC?
Moderator
Posts: 1,140
Joined: 2005.07
Post: #5
I'd imagine one way is to buffer the frames by calling glCopyTexImage2D to copy each one into a new texture object that you push onto a queue. In another thread (with a shared context), you can take those textures off the queue, use glGetTexImage to read them back, and then delete each texture. That way, if reading falls behind, the textures will build up without slowing your framerate. Of course, if it falls too far behind, it can end up building up a massive quantity of data and cause thrashing. Then again, as long as you're not saving every frame, it may stay more or less in sync.
Luminary
Posts: 5,143
Joined: 2002.04
Post: #6
The former is cross-platform; the latter is Mac-specific. You can combine the two by using CopyTexSubImage2D and GetTexImage with PBOs, as I said, which is probably better than the former since you won't prevent rendering as you read back.

Either way, make sure you double-buffer the object you're reading back to.
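Here's a minimal sketch of the double-buffering bookkeeping, with the GL calls left out: each frame you start a copy into one slot and read back the slot that was filled on the previous frame. The `PingPong` type and function names are hypothetical; the slots would be two PBOs or two textures in practice.

```c
/* Two readback slots (e.g. two PBOs or two textures). */
typedef struct {
    unsigned write_slot;   /* slot the current frame is copied into */
    int      has_previous; /* 0 until at least one frame has been queued */
} PingPong;

static void pingpong_init(PingPong *pp)
{
    pp->write_slot = 0;
    pp->has_previous = 0;
}

/* Call once per frame. Stores the slot to copy the current frame into
 * in *write_slot_out and returns the slot holding the previous frame
 * (ready to be read back), or -1 on the very first frame. */
static int pingpong_advance(PingPong *pp, unsigned *write_slot_out)
{
    int read_slot = pp->has_previous ? (int)(pp->write_slot ^ 1u) : -1;
    *write_slot_out = pp->write_slot;
    pp->write_slot ^= 1u;   /* swap roles for the next frame */
    pp->has_previous = 1;
    return read_slot;
}
```

The idea is that the readback you block on is always one frame old, so the copy has had a full frame's worth of time to complete.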
Sage
Posts: 1,199
Joined: 2004.10
Post: #7
Before I go forward then with the threaded call to glGetTexImage in a special context specifically for texture readback, I have a couple questions:

1) I was under the impression that threaded GL using multiple contexts only works on intel macs. Am I misinformed?

2) Can I use glGetTexImage with GL_TEXTURE_RECTANGLE?
Luminary
Posts: 5,143
Joined: 2002.04
Post: #8
1) yes, you're misinformed.
2) yes.

The whole point of PBOs/texture range + GetTexImage is that they are asynchronous, so you don't need a second thread. Conversely, if you have a second thread, then you don't *need* PBOs or texture range, though they'll probably still help.
Sage
Posts: 1,199
Joined: 2004.10
Post: #9
Good to hear I'm misinformed. I recall some ruckus a few months back that the "multithreaded gl" updates were only for intel macs. But, anyway, glad to hear I'm wrong.

My plan for multithreading was to have a separate thread serialize the grabbed buffer data to disk while the main thread carries on rendering the next frame.

I'll try the simplest approach first -- using the main thread with the technique you've described to pull out the texture data, and the worker thread to serialize it. I'd prefer to stay away from hairier stuff like multithreaded gl access if I can.

Thanks for the tips. I'll probably have more questions soon, of course.
Moderator
Posts: 1,140
Joined: 2005.07
Post: #10
I think you're still mistaken. Multithreaded OpenGL means that the meat of the OpenGL calls runs on a different thread than the one you make the calls from. You have always been able to make calls to OpenGL from different threads, however, as long as the threads are synchronized. Be careful, though: you need to attach the context to every thread you use it in (i.e., if you attach the context to the main thread, it's not available in other threads until you attach it there). That also means that you can attach different contexts to different threads without having to constantly attach them back and forth, since they stay attached in their respective threads.
Sage
Posts: 1,199
Joined: 2004.10
Post: #11
Clarification is good. Thanks!

So, let me think out loud. The app thread -- the "main" rendering thread -- has the primary display context ( plus another context for fullscreen ). I have a separate ( but shared ) context which is "attached" to my serialization-to-disk thread.

1) The main thread uses glCopyTexSubImage2D to copy the screen to a texture.

2) The writing thread wakes up ( insert handwaving about threading here ) and uses glGetTexImage to read that data into RAM, and writes it to disk -- meanwhile the main rendering thread is rendering the next frame.

This way, only one thread ever touches any particular context.

Does this sound reasonable?
Moderator
Posts: 1,140
Joined: 2005.07
Post: #12
Yes, that's pretty much the way I've envisioned it. You can also buffer the textures with the queue like I suggested, which lets you make the reading back asynchronous. I would also cap the capture rate (say at 30 fps, so if it's running at 60 fps, you're only capturing every other frame).
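Capping the capture rate can be sketched roughly like this, assuming you have a monotonic timestamp in microseconds for each rendered frame. The `CaptureLimiter` type and function names are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint64_t interval_us;     /* minimum spacing between captured frames */
    uint64_t next_capture_us; /* earliest timestamp at which to capture again */
} CaptureLimiter;

/* max_fps must be non-zero. */
static void capture_limiter_init(CaptureLimiter *c, unsigned max_fps, uint64_t now_us)
{
    c->interval_us = 1000000u / max_fps;
    c->next_capture_us = now_us;
}

/* Call once per rendered frame; returns true if this frame should be captured. */
static bool capture_limiter_due(CaptureLimiter *c, uint64_t now_us)
{
    if (now_us < c->next_capture_us)
        return false;
    c->next_capture_us += c->interval_us;
    /* If rendering stalled and we fell behind, resync instead of bursting. */
    if (c->next_capture_us <= now_us)
        c->next_capture_us = now_us + c->interval_us;
    return true;
}
```

Rendering at twice the cap then captures every other frame, and a long stall doesn't trigger a burst of back-to-back captures afterwards.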
Sage
Posts: 1,199
Joined: 2004.10
Post: #13
I'm working on a round-robin type queuing implementation with a fixed number of buffers. The last thing I want is to use up all my VRAM and grind to a halt.
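A rough sketch of the slot bookkeeping for that kind of fixed-size round-robin queue, with frames dropped when every buffer is still waiting to be written. This only tracks indices: SLOT_COUNT and the names are hypothetical, and the locking needed between the render thread and the writer thread (a mutex around both calls, say) is deliberately left out.

```c
#include <stdbool.h>
#include <stddef.h>

#define SLOT_COUNT 4  /* hypothetical number of readback buffers/textures */

typedef struct {
    size_t head;   /* oldest filled slot, next one the writer will drain */
    size_t count;  /* slots currently filled and waiting to be written */
} SlotRing;

static void ring_init(SlotRing *r)
{
    r->head = 0;
    r->count = 0;
}

/* Render side: claim the next slot for a new frame.
 * Returns false when all slots are in use, meaning the frame is dropped. */
static bool ring_push(SlotRing *r, size_t *slot)
{
    if (r->count == SLOT_COUNT)
        return false;
    *slot = (r->head + r->count) % SLOT_COUNT;
    r->count++;
    return true;
}

/* Writer side: take the oldest filled slot, or return false if none. */
static bool ring_pop(SlotRing *r, size_t *slot)
{
    if (r->count == 0)
        return false;
    *slot = r->head;
    r->head = (r->head + 1) % SLOT_COUNT;
    r->count--;
    return true;
}
```

Because the number of slots is fixed, VRAM use is bounded: if the disk can't keep up, frames get skipped rather than piling up.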