Submitted by Austin Appleby, posted on 09 August 2001
Image Description, by Austin Appleby
8,000 Butterflies at 100 frames per second - no assembly, no hardcoding, no
hacks.
These are three shots from a test of the particle system in my experimental
engine (currently named Pandora). The butterflies are completely dynamic and
are independently animated - no precalculation. When the butterflies flap
their wings they climb higher, and when they don't flap they glide down. The
butterflies are all different sizes, and smaller butterflies are faster than
larger ones. There's also a simple physics model (a force field) that
corrals the butterflies into a donut shape.
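To make the behavior above concrete, here is a minimal sketch of one way such an update step could look. It is not the Pandora code (which isn't shown here): the Vec3 and Butterfly types, every constant, the sine-based flap cycle, and the spring-like pull toward a ring are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical minimal vector type; the engine's real math routines are not shown.
struct Vec3 { float x, y, z; };

// Hypothetical per-butterfly state.
struct Butterfly {
    Vec3  pos;
    Vec3  vel;
    float size;       // assumed > 0; smaller butterflies move faster
    float flapPhase;  // radians; wings count as flapping while sin(phase) > 0
};

// Assumed tuning constants, chosen only for illustration.
constexpr float kRingRadius    = 10.0f; // radius of the donut's center circle
constexpr float kFieldStrength = 2.0f;  // spring-like pull toward that circle
constexpr float kLift          = 3.0f;  // upward acceleration while flapping
constexpr float kSink          = 1.0f;  // downward acceleration while gliding
constexpr float kFlapRate      = 6.0f;  // flap cycle speed, radians per second

// Nearest point on a circle of radius kRingRadius lying in the XZ plane.
inline Vec3 NearestPointOnRing(const Vec3& p) {
    const float r = std::sqrt(p.x * p.x + p.z * p.z);
    if (r < 1e-6f) return Vec3{kRingRadius, 0.0f, 0.0f}; // on the axis: any ring point works
    const float s = kRingRadius / r;
    return Vec3{p.x * s, 0.0f, p.z * s};
}

inline bool IsFlapping(const Butterfly& b) {
    return std::sin(b.flapPhase) > 0.0f;
}

// Advance one butterfly by dt seconds.
inline void UpdateButterfly(Butterfly& b, float dt) {
    assert(dt > 0.0f && b.size > 0.0f);
    const float speedScale = 1.0f / b.size; // smaller size gives a larger scale
    b.flapPhase += kFlapRate * speedScale * dt;

    // Force field: pull toward the nearest point on the ring, which keeps
    // the swarm in a donut-shaped region around it.
    const Vec3 target = NearestPointOnRing(b.pos);
    Vec3 acc{(target.x - b.pos.x) * kFieldStrength,
             (target.y - b.pos.y) * kFieldStrength,
             (target.z - b.pos.z) * kFieldStrength};

    // Flapping climbs, gliding sinks.
    acc.y += IsFlapping(b) ? kLift : -kSink;

    // Semi-implicit Euler: update velocity first, then position.
    b.vel.x += acc.x * dt;
    b.vel.y += acc.y * dt;
    b.vel.z += acc.z * dt;
    b.pos.x += b.vel.x * speedScale * dt;
    b.pos.y += b.vel.y * speedScale * dt;
    b.pos.z += b.vel.z * speedScale * dt;
}
```

Calling UpdateButterfly once per butterfly per frame would be the whole simulation step in this sketch. Each butterfly carries its own state, so nothing is precalculated.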
Each butterfly is a separate C++ object, and each uses virtual functions to
implement its behavior (they derive from a CParticle base class). Before
people complain about the performance penalties of virtual functions, I've
extensively benchmarked the virtual function overhead here and it's
practically negligible. The particles are moderately memory-inefficient, but
that can be cleaned up quite a bit with a custom allocator. There are
exactly 0 lines of assembly code used in the particle system code - the
particles themselves are straight C++, and the vector math routines
underneath them are hybrid C/C++.
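As a rough illustration of the class layout described above, the following sketch shows a base class with a virtual update and a derived butterfly. Only the name CParticle comes from this post. CButterfly, CParticleSystem, every member, and the use of std::unique_ptr (modern C++, not what a 2001 compiler offered) are assumptions, and the update body is a placeholder rather than real flight logic.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical base class: each particle type overrides Update().
class CParticle {
public:
    virtual ~CParticle() = default;

    // Advance this particle by dt seconds.
    virtual void Update(float dt) = 0;

    float Height() const { return m_height; }

protected:
    float m_height = 0.0f;
};

// Hypothetical derived particle. The body is a stand-in: it only climbs,
// and climbs faster when the butterfly is smaller.
class CButterfly : public CParticle {
public:
    explicit CButterfly(float size) : m_size(size) { assert(size > 0.0f); }

    void Update(float dt) override {
        m_height += (1.0f / m_size) * dt;
    }

private:
    float m_size;
};

// Hypothetical owner that updates every particle through the base interface.
class CParticleSystem {
public:
    void Add(std::unique_ptr<CParticle> p) { m_particles.push_back(std::move(p)); }

    void Update(float dt) {
        for (auto& p : m_particles) {
            p->Update(dt); // one virtual call per particle per frame
        }
    }

    std::size_t Count() const { return m_particles.size(); }
    const CParticle& At(std::size_t i) const { return *m_particles.at(i); }

private:
    std::vector<std::unique_ptr<CParticle>> m_particles;
};
```

In this sketch each particle is a separate heap allocation, which is the kind of memory overhead a custom allocator would reduce. At 8,000 particles and 100 frames per second the loop makes 800,000 virtual calls per second.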
The geometry for each butterfly is essentially a square folded down the
diagonal, and is built with 8 vertices and 4 triangles (4 verts and 2
triangles per side). The current version renders 8,000 butterflies at 95-100
frames a second on my development machine (a 1.7 GHz P4 + GeForce 3), or
approximately 800,000 particles per second (3.2 million triangles per
second). It doesn't use any GeForce 3 or Pentium 4 specific features (no
vertex shaders, SSE, etc.) though it does use some NVIDIA-specific OpenGL
extensions (mainly NV_vertex_array_range).
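The geometry and the throughput figures above can be sketched as follows. This is an illustrative reconstruction, not the demo's code: it assumes "per side" means the front and back faces of the folded square (the back face reuses the same positions with reversed winding), and the Vertex and ButterflyMesh types, the axis layout, and the foldAngle parameter are all hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstdint>

struct Vertex { float x, y, z; };

// 8 vertices and 4 triangles: 4 verts + 2 triangles for each face.
struct ButterflyMesh {
    std::array<Vertex, 8>         vertices;
    std::array<std::uint16_t, 12> indices; // 4 triangles * 3 indices
};

// Build a unit square folded along its diagonal. The diagonal (the body)
// lies on the Z axis; foldAngle (radians) raises both wing tips, and
// foldAngle == 0 gives a flat square.
inline ButterflyMesh BuildButterflyMesh(float foldAngle) {
    assert(std::isfinite(foldAngle));
    const float h = std::sqrt(2.0f) * 0.5f; // half the diagonal of a unit square
    const float c = std::cos(foldAngle);
    const float s = std::sin(foldAngle);

    const Vertex tail {0.0f,   0.0f,  -h};
    const Vertex right{ h * c, h * s,  0.0f};
    const Vertex head {0.0f,   0.0f,   h};
    const Vertex left {-h * c, h * s,  0.0f};

    ButterflyMesh m{};
    // Front face uses vertices 0..3, back face uses duplicates 4..7.
    m.vertices = {tail, right, head, left, tail, right, head, left};
    m.indices  = {0, 1, 2,  0, 2, 3,   // front: right wing, left wing
                  4, 6, 5,  4, 7, 6};  // back: same wings, reversed winding
    return m;
}

// Throughput arithmetic from the figures quoted above.
constexpr long long kButterflies           = 8000;
constexpr long long kFramesPerSecond       = 100;
constexpr long long kTrianglesPerButterfly = 4;
constexpr long long kParticlesPerSecond    = kButterflies * kFramesPerSecond;
constexpr long long kTrianglesPerSecond    = kParticlesPerSecond * kTrianglesPerButterfly;

static_assert(kParticlesPerSecond == 800000, "8,000 butterflies * 100 fps");
static_assert(kTrianglesPerSecond == 3200000, "800,000 particles * 4 triangles");
```

Animating the flap in this sketch would only mean varying foldAngle per butterfly each frame, which moves the two wing-tip positions and leaves the index list untouched.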
I wrote this demo just to prove that you can get excellent performance out
of a C++/OpenGL engine without any sort of hacking or assembly
optimization - as long as you keep your code efficient and benchmark every
one of your changes, you can usually avoid any performance bottlenecks. I
probably won't be releasing the full source code to the demo (as that would
require releasing huge chunks of my still-in-development engine) but I can
write up a quick overview of the techniques I used if enough people are
interested.
-Austin Appleby
aappleby@austin.rr.com