Multithreading for LOD adjustment?

DoG
Moderator
Posts: 869
Joined: 2003.01
Post: #1
Well, I have a dynamic LOD capable engine, but find that I am spending too much time on updating the screen detail for a ROAM implementation. Basically, 90% of CPU are spent checking whether my triangles need to be split or merged. This means the video hardware is nowhere near being fully used. Now, I figured out I cannot check just every other frame or so, since that would lead to an unsteady framerate. And I know of no way to reduce the number of triangles to be checked and updated (although, I just though of something...).

Anyhow, I would like to know if it's a good idea to seperate my current thread which does it all, into a drawing and one updating thread? I assume that the framerate would be stable then, but is it efficient? Also, how would I be able to control how much CPU time each thread gets in relation to each other? Just wondering...

- D.G
Quote this message in a reply
Luminary
Posts: 5,143
Joined: 2002.04
Post: #2
Multithreading is almost certainly more effort and annoyance than it's worth. Try optimizing your ROAM algorithm (it shouldn't be slow).
Quote this message in a reply
DoG
Moderator
Posts: 869
Joined: 2003.01
Post: #3
Quote:Originally posted by OneSadCookie
Multithreading is almost certainly more effort and annoyance than it's worth. Try optimizing your ROAM algorithm (it shouldn't be slow).

Really, I have tried every optimization I could think of, except for determining visibility, which would only improve the visual quality, as more triangles would be within view, but not triangle count.

I basically have to do 2 vector*matrix operations for every triangle, since I am taking the camera position into account, plus I have to sort all triangles into 3 lists: split, neutral, merge. I think that since I altivec-ed the vector/matrix class, it is the sorting that takes the longest time. Somehow sampler won't give me the correct stack now, only nulls, so I can't exactly measure it. But the sorting took the longest last time I checked, and I haven't changed it much.

Basically, what is done is that the actual values of the variance are updated for all triangles, and once complete, each triangle is removed and re-added to the split/merge/neutral queues. Now, I think this is not the best way to do it, but I could not come up with a better approach, yet. The triangle's contain fields next and prev, to point to the next and previous triangle in the queues, and each of the 3 queues is a seperate, double-linked list, basically.

There is several issues which have to be dealt with:
- it is inefficient to find out which list the triangle belongs too, as the lists have to be traversed, all of them in the worst case
- it is impossible to do a simple interchange of two items, like in a bubble sort type of algorithm, because the triangles may have to be inserted to a different list.

I am thinking that updating variance and relocating the triangle in the queues one by one may be more efficient, but I would like to hear a second opinion before I break my code.

- D.G
Quote this message in a reply
Member
Posts: 177
Joined: 2002.08
Post: #4
One thing I can think of is not to perform the update every single frame. If you update, say, every 10 or 20 frames, you're still updating several times a second, and the loss from a less than perfect visible set is probably small compared to what you are experiencing now.
Quote this message in a reply
Luminary
Posts: 5,143
Joined: 2002.04
Post: #5
Use Shikari to do your profiling. It's cool, and it'll tell you which lines of code are generating crap assembler. It'll even sometimes suggest ways to speed up your code.
Quote this message in a reply
DoG
Moderator
Posts: 869
Joined: 2003.01
Post: #6
Quote:Originally posted by Mark Levin
One thing I can think of is not to perform the update every single frame. If you update, say, every 10 or 20 frames, you're still updating several times a second, and the loss from a less than perfect visible set is probably small compared to what you are experiencing now.


I described why I can't do that. This would cause VERY visible chops between those frames where updating is performed.

Thanks OSC, I will try it ( I have just found it by myself, too, while downloading HeaderDoc. All roads lead to Rome). I tried to get my way with gprof before, but I couldn't find out how to properly have the profiling files generated.

- D.G
Quote this message in a reply
Member
Posts: 110
Joined: 2002.04
Post: #7
Quote:Originally posted by DoooG
\I figured out I cannot check just every other frame or so, since that would lead to an unsteady framerate.
- D.G


Did you try it?
It should feel steady.

Lost of earlier engines did things just that with software renderers.
I suggest you try it since it should be easy.

Or base it on Time instead of FPS...

- Mac Lead ZeniMax Online Studios
- Owner Plaid World Studios
- Resume: http://www.chrisdillman.com
Quote this message in a reply
Member
Posts: 145
Joined: 2002.06
Post: #8
Quote:Originally posted by DoooG
I described why I can't do that. This would cause VERY visible chops between those frames where updating is performed.
What if you divided up the work and did a little bit per frame? It would require a bit more memory (as you'd need to keep the old mesh around till the new one was done), but should resolve the uneven framerate.

"He who breaks a thing to find out what it is, has left the path of wisdom."
- Gandalf the Gray-Hat

Bring Alistair Cooke's America to DVD!
Quote this message in a reply
DoG
Moderator
Posts: 869
Joined: 2003.01
Post: #9
Quote:Originally posted by inio
What if you divided up the work and did a little bit per frame? It would require a bit more memory (as you'd need to keep the old mesh around till the new one was done), but should resolve the uneven framerate.


The problem is how to do it. The triangles are kept in a static array, so I don't have to allocate memory every time. All triangles are pinned, and the number of triangles may vary from frame to frame. The triangles are also in linked lists, but those also change all the time, and it is therefore not possible to traverse the list part by part.

I think what I will do is to make a "relocate" function, which is able to move a triangle in the lists after it is updated. Since the triangles should not have to move too much, this should be faster than removing and re-inserting all the time. I hope this speeds up the sorting significantly.

Then, I will maybe store the length of the variance vector when it's created, so i dont have to determine the length of it at every update. It is weird, as shikari and sampler show VERY different amounts of time spend on sqrt().

- D.G
Quote this message in a reply
Moderator
Posts: 916
Joined: 2002.10
Post: #10
removing and inserting takes constant time and is quick to implement in a doubly linked list, performace would be negligible
Quote this message in a reply
henryj
Unregistered
 
Post: #11
Is this the roam that you put on sourceforge you are talking about?
Quote this message in a reply
DoG
Moderator
Posts: 869
Joined: 2003.01
Post: #12
Quote:Originally posted by skyhawk
removing and inserting takes constant time and is quick to implement in a doubly linked list, performace would be negligible


It doesn't, because the lists are sorted, and hence I have to find the right position first.

Anyway, I changed my algorithm to now move each triangle one one place up or down the list each frame, but I got no improvements. It seems like I have to tackle cutting some vector multiplies.

And yes, this is what I put on sourceforge. Updated yesterday, btw. It should actually work now.

- D.G
Quote this message in a reply
Feanor
Unregistered
 
Post: #13
Did you check out the open source project called libMini? I have gotten it running on OS X, and bundled it with some other related code in a big bundle (some of which works, some doesn't). I haven't had the werewithal to study the code so I still am useless on that level, but I thought that as a resource it might be helpful. It's on my iDisk. Check out the target called "Yukon". lScape is cool but it will be a lot of work to port it because of the OpenGL procedure pointers. It's too much for me to do at this point, but I keep hoping someone else will maybe take a stab at it.
Quote this message in a reply
DoG
Moderator
Posts: 869
Joined: 2003.01
Post: #14
Quote:Originally posted by Feanor
Did you check out the open source project called libMini? I have gotten it running on OS X, and bundled it with some other related code in a big bundle (some of which works, some doesn't). I haven't had the werewithal to study the code so I still am useless on that level, but I thought that as a resource it might be helpful. It's on my iDisk. Check out the target called "Yukon". lScape is cool but it will be a lot of work to port it because of the OpenGL procedure pointers. It's too much for me to do at this point, but I keep hoping someone else will maybe take a stab at it.


I will take a look. The screen shots look promising. But I want to make the ROAM implementation as fast as possible, since its part of a library. There is no reason not to implement other terrain engines, as long as they are compatible with oop.

- D.G
Quote this message in a reply
Post Reply