Realtime Voxel Landscape Engines - Part 6 - Hardware Acceleration by (21 February 2000)
The First Approach
The simplest way of using the hardware would be to replace the software span renderer with an equivalent hardware-accelerated version. Simple textured lines could be drawn, or triangles for better performance on current cards. This is, however, still far too slow to be feasible. The whole rendering algorithm needs to be completely redesigned for triangle rendering, while still benefiting from the advantages of a 4 DOF landscape.
Taking Advantage of the Hardware
A video card can be considered a huge interpolating machine, so our goal is to make full use of this capability. For example, the interpolation of the texture coordinates will obviously be done by the texturing unit; the interpolation of the lightmap's colour will be taken care of by Gouraud shading; the interpolation of the height of the voxels will be done along the edges of the triangles; and the interpolation between the two mixed textures will be done via blending. We also have to make sure that our polygon count never gets too high: this will always be the bottleneck, since our loop can always generate more triangles than the card can render. We may also be able to take advantage of the z-buffer at some point, since the speed hit would be relatively small.
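To make the "interpolating machine" idea concrete, here is a minimal sketch of the arithmetic the card performs between two vertices. It is only an illustration: the function names and the attribute layout (height, texture coordinate, lightmap intensity) are assumptions made for this example, and it ignores the perspective correction that real hardware applies to texture coordinates.

```python
def lerp(a, b, t):
    """Linear interpolation between a and b for t in [0, 1]."""
    return a + (b - a) * t


def interpolate_vertex(v0, v1, t):
    """Interpolate every attribute of two vertices with the same factor t.

    Each vertex is a tuple of per-vertex attributes. The layout used in
    the example below (height, texture coordinate, lightmap intensity)
    is hypothetical and chosen only for illustration.
    """
    return tuple(lerp(x0, x1, t) for x0, x1 in zip(v0, v1))


# Halfway along an edge: height, texture coordinate and light all
# advance by the same fraction.
top = (0.0, 0.0, 10.0)
bottom = (1.0, 0.5, 20.0)
middle = interpolate_vertex(top, bottom, 0.5)
```

The point is that one operation, applied to every attribute at once, replaces the separate per-pixel loops of the software renderer.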
The Structure Rendering Algorithm
After taking all the previous factors into account, the algorithm I came up with turned out to be reasonably simple. We approach the rendering in a similar fashion to the software engine, but we scan along the depth lines instead of along each ray separately. Every 32 pixels or so (a smaller step generates more polygons), we project the voxels onto the screen in the same way, but we do not draw the span straight away as in software. Instead, we wait until we've projected the next column of voxels onto the screen. Then, depending on which of the two columns are visible, we take different actions. A is the leftmost point of the current strip portion, and B is the rightmost.

Case 1 - column A not visible, B visible: start a new triangle strip, and send the vertices of both columns (A then B) to the GL.
Case 2 - column A visible, B visible: continue the currently active triangle strip by sending only the vertices of column B to the GL.
Case 3 - column A visible, B not visible: send the coordinates of column B to the GL and stop the current triangle strip.
Case 4 - column A not visible, B not visible: usually do nothing, although a special case needs to be handled.

This algorithm is extremely efficient, since mostly triangle strips are drawn. There are however a few problems.
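The four cases can be sketched as a small state machine. This is a simplified illustration rather than the demo's code: `build_strips` and its `visible` argument are hypothetical names, each column is reduced to a single visibility flag, the function records column indices instead of sending vertices to the GL, and the special sub-case of case 4 is left out.

```python
def build_strips(visible):
    """Group adjacent columns into triangle strips.

    visible[i] is True when column i has a visible span on this depth
    line. Returns a list of strips, each a list of column indices in
    the order their vertices would be sent to the GL.
    """
    strips = []
    current = None  # the strip being built, or None if no strip is open
    for a in range(len(visible) - 1):
        b = a + 1
        if not visible[a] and visible[b]:
            # Case 1: start a new strip with both columns.
            current = [a, b]
        elif visible[a] and visible[b]:
            # Case 2: continue the active strip with column B only.
            if current is None:
                current = [a]  # the very first column was already visible
            current.append(b)
        elif visible[a] and not visible[b]:
            # Case 3: send column B, then close the strip.
            if current is None:
                current = [a]
            current.append(b)
            strips.append(current)
            current = None
        # Case 4: neither column visible, nothing to do in this sketch.
    if current is not None:
        strips.append(current)  # strip still open at the screen edge
    return strips


example = build_strips([False, True, True, False, False, True])
```

With the visibility pattern above, the columns fall into two strips: columns 0 to 3, then columns 4 and 5.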
Holes
As you may have noticed, there is a special sub-case of case 4 where even though A and B are not visible, we must draw a single triangle anyway. This must be done if a triangle strip was started or finished at this point during the previous pass. To determine if the triangle must be drawn or not, we just need to check that:
In some cases, this draws the triangle when it's not quite visible, but it's much more efficient than having to perform a few line intersections.
Texturing
The texturing scheme I use is not particularly efficient, since it simply tries to emulate the software algorithm. This requires exactly 4 single passes, each vertex having different alpha coefficients, which is not efficient at all. Some sort of procedural shader would probably be more appropriate and quicker for the hardware to render, but slower for the main processor to precompute. Since I'm convinced those are not the only two ways, and since I'm not sure which technique would be best, I'll let you ponder this one a bit. Feel free to email me if you have any good ideas. I'll mention them here if they have potential.
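One possible reading of the four-pass scheme, sketched below, is that each pass adds one texture scaled by its interpolated alpha, as additive blending such as `glBlendFunc(GL_SRC_ALPHA, GL_ONE)` would do. Everything here is an assumption made for illustration: the text does not say how the alpha coefficients are chosen, so the sketch uses bilinear weights that sum to one, and the function names are hypothetical.

```python
def bilinear_weights(fx, fy):
    """Four blend weights for fractional position (fx, fy) in [0, 1].

    Assumption: the four passes use bilinear weights, which always
    sum to one. The article does not specify the actual coefficients.
    """
    return [(1 - fx) * (1 - fy), fx * (1 - fy), (1 - fx) * fy, fx * fy]


def accumulate_passes(texels, alphas):
    """Emulate additive multi-pass blending for a single pixel.

    Each pass computes framebuffer = texel * alpha + framebuffer,
    which is the effect of blending with (GL_SRC_ALPHA, GL_ONE).
    """
    framebuffer = 0.0
    for texel, alpha in zip(texels, alphas):
        framebuffer += texel * alpha
    return framebuffer


weights = bilinear_weights(0.25, 0.5)
pixel = accumulate_passes([1.0, 1.0, 1.0, 1.0], weights)
```

Because the weights sum to one, four identical texels blend back to the same value, so the overall brightness is preserved across the passes.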
OpenGL Implementation
I have written (although "hacked together quickly" is probably more appropriate) a small implementation of this algorithm in OpenGL. I've made a small beta demo available, just to show you it's actually possible to get this algorithm working. You can download terraVoxGLb.zip (1,541 KB) right here. The following screenshot has the polygons outlined, to give you a better idea of how the triangles are actually rendered. This particular scene contains 8664 triangles (x4 passes) drawn in 925 strips, which is pretty good. The triangle mesh is, however, a bit too dense at the bottom of the valley, and some sort of span reduction technique would ideally need to be implemented. The code itself is nothing special, since I wrote it just to test the algorithm. This is why I won't be releasing it. But given this description of the hardware algorithm and the full source code for the software engine, you should have no trouble implementing a small accelerated demo.
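To see why those strip figures are good, recall that a strip of n triangles needs n + 2 vertices, whereas n independent triangles need 3n. The short calculation below applies this to the scene quoted above, under the assumption that every one of the 8664 triangles belongs to one of the 925 strips (single triangles from the special case would count as strips of one). The function names are made up for this example.

```python
def strip_vertex_count(triangles, strips):
    """Vertices sent when `triangles` triangles are packed into `strips`
    strips: each strip costs one vertex per triangle plus two to start."""
    return triangles + 2 * strips


def list_vertex_count(triangles):
    """Vertices sent when every triangle is drawn independently."""
    return 3 * triangles


triangles, strips = 8664, 925
with_strips = strip_vertex_count(triangles, strips)
without_strips = list_vertex_count(triangles)
average_strip_length = triangles / strips  # triangles per strip
```

For this scene, that works out to 10514 vertices per pass instead of 25992, with a little over nine triangles per strip on average.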