How to use VAR

Moderator
Posts: 608
Joined: 2002.04
Post: #1
I am trying to use VAR (the APPLE_vertex_array_range extension) because by all accounts it can provide a nice speed boost. Getting it to work correctly is quite another thing.

Code:
    glEnableClientState(GL_VERTEX_ARRAY_RANGE_APPLE);
    glVertexPointer(3, GL_FLOAT, 0, _compatibleVertices);
    glNormalPointer(GL_FLOAT, 0, _compatibleNormals);
    glTexCoordPointer(2, GL_FLOAT, 0, _compatibleTexCoords);
    glColorPointer(4, GL_FLOAT, 0, _compatibleColors);

    //glLockArraysEXT(0, _vertices.size());
    glVertexArrayRangeAPPLE(sizeof(_compatibleVertices), _compatibleVertices);
    glFlushVertexArrayRangeAPPLE(sizeof(_compatibleVertices), _compatibleVertices);
    glDrawElements(GL_QUADS, _vertices.size() / 3, GL_UNSIGNED_INT, indices);
    //glUnlockArraysEXT();

    glDisableClientState(GL_VERTEX_ARRAY_RANGE_APPLE);
    glVertexArrayRangeAPPLE(0, 0);
The above code actually prevents anything from being drawn. When I comment out the VAR stuff it works (slowly, but it works). What am I doing wrong?
Sage
Posts: 1,231
Joined: 2002.10
Post: #2
OSC will probably find something wrong, but here's roughly what I'm doing:

Code:
(...setup...)
    if (varray) free (varray);
    vsize = sizeof(Vertex) * VAR_size;
    varray = (Vertex *) valloc(vsize * VAR_bufs);
    glEnableClientState(GL_VERTEX_ARRAY_RANGE_APPLE);
    glVertexArrayRangeAPPLE(vsize * VAR_bufs, (GLvoid *)varray);
    glGenFencesAPPLE(VAR_bufs, &fence[0]);
(...depending on what's in Vertex...)
    glEnableClientState(GL_COLOR_ARRAY);
    glColorPointer(4, GL_FLOAT, sizeof(Vertex), &varray[0].color);
    glEnableClientState(GL_TEXTURE_COORD_ARRAY);
    glTexCoordPointer(2, GL_FLOAT, sizeof(Vertex), &varray[0].texture);
    glEnableClientState(GL_VERTEX_ARRAY);
    glVertexPointer(2, GL_FLOAT, sizeof(Vertex), &varray[0].vertex);

(...draw time...)
    for (j=0; j < VAR_bufs; j++){
        glFinishFenceAPPLE(fence[j]);
        for (i=VAR_size*j; i<VAR_size*(j+1); i++){
            varray[i] = (...modify a vertex...)
        }
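        // the flush range is given in bytes; cast to char * for byte arithmetic
        glFlushVertexArrayRangeAPPLE(vsize, (char *)varray + vsize*j);
        // the second argument is the index of the first vertex, not a byte offset
        glDrawArrays(GL_QUADS, VAR_size*j, VAR_size);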
        glSetFenceAPPLE(fence[j]);
    }

(...tear down...)
    glVertexArrayRangeAPPLE(0, 0);
    if (varray) free (varray);

The idea is to double buffer the VAR so the GPU can AGP-DMA and draw half while the CPU computes into the other half.
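
A minimal sketch of the bookkeeping behind that split, for anyone following along. It contains no GL calls and only shows the two units involved: the flush range is addressed in bytes, while the `first` argument of glDrawArrays is a vertex index. The names VAR_SIZE, VAR_BUFS and the three helper functions are made up for illustration, and plain float stands in for GLfloat so it compiles without the GL headers.

```c
#include <stddef.h>

#define VAR_SIZE 2048   /* vertices per buffer (illustrative name) */
#define VAR_BUFS 2      /* buffers sharing one allocation (illustrative name) */

/* Stand-in for an interleaved vertex; float replaces GLfloat here. */
typedef struct Vertex {
    float color[4];
    float texture[2];
    float vertex[2];
} Vertex;

/* Byte offset of buffer j from the start of the allocation.
   This is the unit a flush range is expressed in. */
static size_t buffer_byte_offset(size_t j)
{
    return j * VAR_SIZE * sizeof(Vertex);
}

/* Index of the first vertex in buffer j.
   This is the unit the first-vertex argument of glDrawArrays uses. */
static size_t buffer_first_vertex(size_t j)
{
    return j * VAR_SIZE;
}

/* Start address of buffer j. The arithmetic goes through char *
   because arithmetic on void * is not standard C. */
static void *buffer_address(void *base, size_t j)
{
    return (char *)base + buffer_byte_offset(j);
}
```

With these, buffer j would be flushed starting at buffer_address(varray, j) and drawn starting at vertex buffer_first_vertex(j). Mixing the two units up sends the draw call far past the end of the array.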

After fiddling a little with the buffer sizes, I went from ~200k points/sec to ~1.3M, and from ~200k quads/sec to ~600k. That still seems very slow compared to the tristrip throughput in the ADC "vertex optimization" and "vertex performance" examples.

(Edit: show pointer setup)

Update: after a bit of optimization and tweaking I get around 850k quads/sec, or 1.6M triangles/sec using VAR, which is more in line with the ADC vertex optimization number.

Of course, past a certain quad size the bottleneck becomes fill rate: the above numbers are for drawing blended 20x20 quads. At 10x10, I get 1M quads/sec (2M triangles/sec). At 200x200, I get more like 7k... :/
Moderator
Posts: 608
Joined: 2002.04
Post: #3
I think my trouble may just be the first parameter of glVertexArrayRangeAPPLE and glFlushVertexArrayRangeAPPLE.

Other related question: I read in another post that display lists use APPLE_var when possible in Mac OS X 10.2. This would be great except that what I will be rendering is not static so I would have to remake the display list every frame. I assume this would negate any speed boost I might otherwise get?
Sage
Posts: 1,231
Joined: 2002.10
Post: #4
Correct, there is no point in compiling a display list if you're only going to draw it once.
Moderator
Posts: 608
Joined: 2002.04
Post: #5
That's what I suspected...

So no one knows what exactly is the first argument of glVertexArrayRangeAPPLE and glFlushVertexArrayRangeAPPLE?
Moderator
Posts: 608
Joined: 2002.04
Post: #6
arekkusu: in your code, what are VAR_size and VAR_bufs? Vertex is a struct with x, y, z, I assume?
Sage
Posts: 1,231
Joined: 2002.10
Post: #7
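The first argument is a size in bytes. For glVertexArrayRangeAPPLE it is the length of the range to map to VAR in AGP-space; for glFlushVertexArrayRangeAPPLE it is the length of the range to flush (so the card can begin DMA). Read the spec again: http://oss.sgi.com/projects/ogl-sample/r..._range.txt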

In my example:
Code:
#define VAR_size 2048
#define VAR_bufs 2

What you put in Vertex depends entirely on what you want to draw. In my case, 2D colored textured quads:

Code:
typedef struct GLCoord2 {
    GLfloat    x;
    GLfloat    y;
} GLCoord2;

typedef struct GLColor4 {
    GLfloat    r;
    GLfloat    g;
    GLfloat    b;
    GLfloat    a;
} GLColor4;

typedef struct Vertex {
    GLColor4    color;
    GLCoord2    texture;
    GLCoord2    vertex;
} Vertex;
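
To see how that struct maps onto the pointer setup, here is a small sketch that prints the stride and the byte offset of each member. It is only an illustration: plain float replaces GLfloat so it builds without the GL headers, the helper name print_vertex_layout is made up, and the 2048 is just the VAR_size value from above.

```c
#include <stddef.h>
#include <stdio.h>

/* Same layout as the structs above, with float in place of GLfloat. */
typedef struct GLCoord2 { float x, y; } GLCoord2;
typedef struct GLColor4 { float r, g, b, a; } GLColor4;

typedef struct Vertex {
    GLColor4 color;
    GLCoord2 texture;
    GLCoord2 vertex;
} Vertex;

/* Print the stride and member offsets that the gl*Pointer calls rely on. */
static void print_vertex_layout(void)
{
    printf("stride  = %zu bytes\n", sizeof(Vertex));
    printf("color   at byte %zu\n", offsetof(Vertex, color));
    printf("texture at byte %zu\n", offsetof(Vertex, texture));
    printf("vertex  at byte %zu\n", offsetof(Vertex, vertex));
    /* Byte size of one buffer of 2048 vertices, i.e. vsize. */
    printf("vsize   = %zu bytes\n", sizeof(Vertex) * 2048);
}
```

The stride passed to each gl*Pointer call is sizeof(Vertex), and taking the address of the member in element 0 (as in &varray[0].color) is what supplies the per-attribute offset.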

Of course, you need to set the pointer for each element array as usual. Edited my initial post's setup section to make that clear.
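
One more guess about the code in the first post: if _compatibleVertices is a pointer rather than a true array (I can't tell from the snippet), then sizeof(_compatibleVertices) is the size of the pointer itself, not of the vertex data, so the range would only cover a few bytes. A rough sketch of the difference, with made-up helper names and no GL calls:

```c
#include <stddef.h>

/* Bytes occupied by `count` vertices of `components` floats each.
   This is the kind of value the byte-length argument expects. */
static size_t vertex_bytes(size_t count, size_t components)
{
    return count * components * sizeof(float);
}

/* What sizeof yields when applied to a pointer variable:
   the size of the pointer, regardless of how much data it points to. */
static size_t pointer_size(const float *p)
{
    return sizeof(p);
}
```

For 100 vertices of 3 floats each, vertex_bytes(100, 3) gives the full byte length of the data, while pointer_size() stays at the platform's pointer size no matter how large the array is.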