Intermediary Results / C++ Operators
Many times I have seen people write a C++ class for vector math. This is A
Good Thing: it makes your source more readable, and of course I
have my own version of such a class too. It's really nice to have "operator
+" to add vectors and "operator %" to do a cross product or the like.
But when speed is really important, don't use them. Even if they are
"inline", they have an intrinsic speed penalty. Consider this code:
vector a, b, c, d;
a = d + ( b % c );
This is meant to build the cross product of b and c, then add d to it
and store the result in a. Due to the stack-based arrangement of the
floating-point registers in the Intel FPU (and so in the AMD FPU as well,
since it must be Intel compatible), any compiler is handicapped in
optimising this piece of code. Intermediary results as big as a "vector"
will not stay in floating-point registers. In fact, the result of
"operator %" will be written back to memory, then read again by "operator +",
then written to a.
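To make the intermediary result visible, here is a minimal sketch of what such a class might look like. My own class is not shown here, so the type name Vec3 (used instead of "vector" to avoid confusion with std::vector), its float members, and the helper add_cross_with_operators are all assumptions for illustration only.

```cpp
// Hypothetical minimal vector type; not the class from the article.
struct Vec3 {
    float x, y, z;
};

// Component-wise sum; returns a new Vec3 by value.
inline Vec3 operator+(const Vec3& l, const Vec3& r) {
    return Vec3{l.x + r.x, l.y + r.y, l.z + r.z};
}

// Cross product; also returns a new Vec3 by value.
inline Vec3 operator%(const Vec3& l, const Vec3& r) {
    return Vec3{l.y * r.z - l.z * r.y,
                l.z * r.x - l.x * r.z,
                l.x * r.y - l.y * r.x};
}

// Same calculation as a = d + ( b % c ), with the temporary spelled out.
inline Vec3 add_cross_with_operators(const Vec3& d, const Vec3& b, const Vec3& c) {
    Vec3 tmp = b % c;  // intermediary result: a whole Vec3, three floats
    return d + tmp;    // consumed again by operator+
}
```

The object named tmp is the intermediary result discussed above: three floats that have to live somewhere between the two operator calls.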
So when speed is important, be sure to write out the entire calculation
component by component. This way, all intermediary results can be kept in
FPU-registers, only written to memory at the end of the calculation. For
the example above, this would translate to:
vector a, b, c, d;
a.x = d.x + b.y * c.z - b.z * c.y;
a.y = d.y + b.z * c.x - b.x * c.z;
a.z = d.z + b.x * c.y - b.y * c.x;
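If you need this calculation in more than one place, the written-out form can be wrapped in a small helper. The following is only a sketch under assumptions: the type Vec3 and the function name add_cross are hypothetical and stand in for whatever vector class you actually use.

```cpp
// Hypothetical minimal vector type; not the class from the article.
struct Vec3 {
    float x, y, z;
};

// Computes a = d + cross(b, c) component by component.
// Each component is a single scalar expression, so no Vec3-sized
// temporary is needed between the cross product and the addition.
inline void add_cross(Vec3& a, const Vec3& d, const Vec3& b, const Vec3& c) {
    // Compute into scalars first so the result stays correct even
    // when a refers to the same object as b, c or d.
    const float x = d.x + b.y * c.z - b.z * c.y;
    const float y = d.y + b.z * c.x - b.x * c.z;
    const float z = d.z + b.x * c.y - b.y * c.x;
    a.x = x;
    a.y = y;
    a.z = z;
}
```

The three scalar locals are a design choice, not part of the original example: they guard against aliasing, for instance when a call like add_cross(b, d, b, c) overwrites one of its own inputs. Note also that the written-out expression adds d before subtracting the second product, so the last bits of the result may differ from the operator version.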
That's all there is to it. I have dug through the assembly output of some
compilers, so I dare to make this a general statement for any compiler.
-- chris