8-bit Rendering With Translucency Maps (21 August 1999)
Introduction
Recently it came to be that I was hired for contract work on a port of a
Playstation game to the PC (which shall remain nameless). At the time, it
sounded like a fairly simple task. Today's computers are, after all, much faster
than the Playstation, right? Yeah, well, this is true for any computer bought
this year, but due to some bizarre twist of fate that wasn't to be our target
platform. It seems the company I'm working for is a huge fan of Deer
Hunter--not because they have ever *played* the game, but because of its
tremendous success. If you look on the back of the box, I believe you'll see
the minimum requirements for that game: P133, 16MB RAM. Note, this doesn't
include a 3d accelerator, because, well, how many people buying games at Walmart
would have one? So, of course, if Deer Hunter has those requirements, so must
the Playstation port as well. Perfectly sound reasoning, right? Well, I guess
it is from a [...]

Well, also due to a sore lack of reasoning on my part, I took the job (along with a few friends). <STUPIDITY>I mean, a P133 has got to be faster than the Playstation, right? And D3D software mode can't be all that slow, right?</STUPIDITY>

As you may have guessed, the only option was to implement a software rasterizer of our own. For speed, we decided to use 8-bit paletted mode for the renderer. We needed every little ounce of speed we could get, and none of us is by any means another Carmack or Abrash. This raised another issue, however. The Playstation supports both translucency and Gouraud shading, which we'd need to have in our port as well. How does one do these things in 8-bit paletted mode? Well, that's the topic of this article. Herein I detail the method we used to implement our rasterizer, as well as its limitations and possible alternatives.

First, you may ask why you'd ever need to implement a software rasterizer, and an 8-bit one, no less. With the exception of my present situation, I can't really think of a definite answer to that question. God forbid any of you are ever in my present situation (as it turns out, the software rasterizer is the least of our problems... all I'll say is that spaghetti C and PSX assembly commented in Japanese are not a pretty sight). However, the technique I'll describe here is very effective in creating unique effects in both 3d and 2d, allowing for things that simply cannot be done in real time in non-paletted modes. Also, it never hurts to have lower minimum requirements, as long as the visual quality stays relatively the same (which is quite possible, if implemented correctly).

A lot of the methods covered in this article you may already know, but perhaps haven't found any decent use for until now. Other things (hopefully at least some) you may not have thought of yet, but will find useful.
Let's start with the basic problem: we want to create an 8-bit game, but not have it look like it was made back in the '80s (bright, distinct, ugly colors). That's what we'd get if we went with a standard 332 palette for our game (3 bits for red, 3 for green, and 2 for blue). A 332 palette has 256 colors (duh), but most of those colors are spread across a range that just isn't very useful and/or doesn't fit together well (for instance, there are 7 shades of pure red, but only 2 shades of gray [not counting black and white]). This palette does have one primary advantage, though: all of the blend operations that can be performed in 16+ bit modes can be performed equally well in this mode (with notable gradient loss, however). This is definitely the easiest method of implementing translucency and shading in an 8-bit game, but the detail loss is prohibitively large. I would not suggest using this method for any but the simplest of games. Anyway, these days it is all about visual quality, and users just won't stand for primitive graphics, even if it does mean reduced hardware requirements.

The only alternative is to use a palette optimized for the graphics in your game. Creating an optimized palette is a fairly simple task and is handled by most graphics manipulation packages (Photoshop, Debabelizer, etc.). I believe there is also quite a bit of shareware that will allow you to do this (correct me if I am wrong). With an optimized palette, the game will suddenly look a whole lot better. Even if the textures vary drastically in color from one another, an optimized palette will allow for much better quality than any form of fixed palette (such as 332). But then, how do we implement translucency and shading using an optimized palette? Unlike 332 mode, there is no quick mathematical trick to determine what the blend of two colors will be, so there is no efficient way to do translucency and shading.
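For contrast, here is roughly what the "quick mathematical trick" looks like with a 332 palette. This is only a sketch: the function name blend332_average and the RRRGGGBB bit layout (red in the top three bits) are my assumptions, not something taken from the game's code.

```c
/* Average two 332 palette indices (assumed layout: RRRGGGBB).
   Because the index itself encodes the color, the blended index can be
   computed directly with no palette search. */
unsigned char blend332_average(unsigned char a, unsigned char b)
{
    /* Unpack each channel from both indices. */
    int r = (((a >> 5) & 7) + ((b >> 5) & 7)) / 2;
    int g = (((a >> 2) & 7) + ((b >> 2) & 7)) / 2;
    int bl = ((a & 3) + (b & 3)) / 2;

    /* Repack the averaged channels into a single index. */
    return (unsigned char)((r << 5) | (g << 2) | bl);
}
```

With only 3, 3 and 2 bits per channel, the rounding in each average is where the gradient loss comes from.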
Of course, what I just said is not completely true, otherwise I wouldn't be writing this article. There are actually two ways (that I know of) to blend two colors from an arbitrary optimized palette efficiently (i.e., without searching through the entire palette for the closest blend match during every color blend). One is to use an ordered optimized palette, and the other is to use a 2d blend look-up table.

The former entails creating the optimized palette, then sorting the entries so that two color indices can be combined via some mathematical operation (preferably a simple one, like an add or an or), the result of which is another palette index. This method is, however, quite inaccurate with most palettes (although completely accurate with some), and is inflexible. Since I decided not to use this technique, I'll not be covering it here.

The latter involves precomputing a look-up table for all of the possible blends between two color indices. This method has the benefit that it is 100% flexible (*all* possible blends between two palette indices can be described by the table), and that it is fairly fast. It is not perfect, however, since it requires 256x256 (64K) entries for each blend type, which is hard on both memory and cache efficiency. Both of these problems will be discussed as well.

The easy part -- implementation.

For every polygon or sprite we wish to draw translucently, there is a translucency map associated with it. To keep things simple, the translucency map is allocated as a single linear 64K byte array. This linear array represents a 2d 256x256 array, so in order to index the element at row r and column c we simply shift the row index left by 8 and add the column index, like so: A[(r<<8)+c] (this assumes that the array is stored such that the first row is elements 0-255, the second row is 256-511, etc.).
This is a useful operation, so let's just make a macro for it:
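A minimal sketch of such a macro follows. The name and argument order come from how INDEX_TRANSMAP is used later in this article; treating the source index as the row and the destination index as the column is my assumption.

```c
/* Look up the blend of palette indices src and dst in a 256x256 map
   stored as one linear 64K array (row = src, column = dst). */
#define INDEX_TRANSMAP(map, src, dst) ((map)[((src) << 8) + (dst)])
```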
From this point on, all a translucency blend requires is the following: *destinationPtr = INDEX_TRANSMAP(transMap, *sourcePtr, *destinationPtr);

If, for example, the translucency map transMap describes an average blend, the output value of INDEX_TRANSMAP will be the closest match in the palette to that blend. Although this does not guarantee the optimal output, with most palettes it will be pretty damn close.

The only thing left is generating the translucency map. This, too, is a fairly simple task, but it should not be done at run-time, since generation can take quite a while (it's O(n^3), where n is likely 256, the size of the palette). Basically, given a source palette and a destination palette (they need not be the same), for every source/destination index pair the blend is computed and the destination palette is searched for the closest match to the output. The only thing that needs to be specified is how each pair is to be blended. Thankfully, I already have a decent utility for creating translucency maps that should be included at the end of this article. It is capable of setting opacity percentages on a per-color basis, defining custom blend equations, and previewing the resulting blend.

Below I've listed some of the pros and cons of using a translucency map.
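Before the pros and cons, here is a rough sketch of the generation step described above, for a plain average blend. It is not the utility's code: the Rgb struct, the function names, and the use of squared RGB distance as the "closest match" measure are all assumptions made for illustration.

```c
#include <limits.h>

/* Hypothetical palette entry type. */
typedef struct { unsigned char r, g, b; } Rgb;

/* Linear search for the palette entry closest to (r, g, b), using
   squared distance in RGB space (an assumed metric). */
static int nearest_index(const Rgb *pal, int r, int g, int b)
{
    int best = 0, i;
    long bestDist = LONG_MAX;
    for (i = 0; i < 256; ++i) {
        long dr = pal[i].r - r, dg = pal[i].g - g, db = pal[i].b - b;
        long dist = dr * dr + dg * dg + db * db;
        if (dist < bestDist) { bestDist = dist; best = i; }
    }
    return best;
}

/* Fill a 64K map: for every source/destination index pair, average the
   two colors and store the closest match in the destination palette.
   This is the O(n^3) step, so run it offline rather than at run-time. */
void build_average_transmap(const Rgb *srcPal, const Rgb *dstPal,
                            unsigned char *transMap)
{
    int s, d;
    for (s = 0; s < 256; ++s) {
        for (d = 0; d < 256; ++d) {
            int r = (srcPal[s].r + dstPal[d].r) / 2;
            int g = (srcPal[s].g + dstPal[d].g) / 2;
            int b = (srcPal[s].b + dstPal[d].b) / 2;
            transMap[(s << 8) + d] =
                (unsigned char)nearest_index(dstPal, r, g, b);
        }
    }
}
```

Other blend types (additive, per-color opacity, and so on) would only change how r, g and b are computed inside the inner loop.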
Pros
|
|
Cons
|
|
Closing
Ok, now I'm getting tired of writing this... so I hope I've gotten most of the
relevant details across. If I haven't, then feel free to email me at
jd_wild@uclink4.berkeley.edu with any questions. I doubt many will use this
information since everybody seems to concentrate mostly on 3d hardware these
days, but just on the off-chance that someone out there is still trying to make
a good ole 2d game (or even a 3d game with modest hardware requirements), I
thought I'd share this.

Download article_transmaps.zip (34k)
Addendum
Well, it has been brought to my attention that a little more detailed
description of the translucency map implementation might be helpful. This is my
own fault; I should have realized this before. I'm sorry if I over-simplified
the problem, but I'll attempt to remedy that here.

First, I would really suggest downloading article_transmaps.zip, as experimenting with different translucency maps can give you an idea of what can be done with them. If you're interested in generating translucency maps yourself (without this tool), then you are pretty much on your own, because I will not be covering the topic any further here. However, if you'd like to see how it is done, just email me with a request for the utility source code -- I'll be glad to send it your way.

As for actually using translucency maps, there are a number of ways it can be done. The basic mechanism is quite simple. Whenever performing any blitting or rasterizing operation, the main purpose of the routine you're working with is to write bits (presumably the correct ones) to the destination surface. In a standard blit, this merely requires copying each scanline of a bitmap to a destination area of the same size, like so (assume that all surfaces being used are 8-bit):
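A minimal sketch of such a blit is given here. The function name blit8 and its parameter list are assumptions for illustration, and clipping is left out.

```c
#include <string.h>

/* Copy a width x height block of 8-bit pixels, one scanline at a time.
   srcPitch and dstPitch are the byte widths of the full surfaces. */
void blit8(unsigned char *dst, int dstPitch,
           const unsigned char *src, int srcPitch,
           int width, int height)
{
    int y;
    for (y = 0; y < height; ++y) {
        memcpy(dst, src, (size_t)width);  /* copy one row of pixels */
        src += srcPitch;                  /* step to the next source row */
        dst += dstPitch;                  /* step to the next destination row */
    }
}
```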
This loop iterates through the entire height of the source image, each time copying the line of pixels to the destination with memcpy. Note: pitch refers to the actual byte-width of the surfaces being used, which need not be the same as the blit width (and usually won't be).

But this function is only useful for straight, unaltered blits from one surface to another, so it is not very helpful to us. If we wish to alter the bitmap during the blit on a per-pixel level, as we would need to for color-keyed blits, we'll need to handle each pixel individually, like so:
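A sketch of a color-keyed blit along these lines follows. The names blit8_colorkey, srcSkip and dstSkip are hypothetical; the end-of-row skips are precomputed so the inner loop only walks two pointers.

```c
/* Copy a width x height block, skipping pixels equal to colorKey. */
void blit8_colorkey(unsigned char *dst, int dstPitch,
                    const unsigned char *src, int srcPitch,
                    int width, int height, unsigned char colorKey)
{
    const unsigned char *srcPtr = src;
    unsigned char *dstPtr = dst;
    /* Bytes to skip at the end of each row to reach the next one. */
    const int srcSkip = srcPitch - width;
    const int dstSkip = dstPitch - width;
    int x, y;

    for (y = height; y > 0; --y) {
        for (x = width; x > 0; --x) {
            if (*srcPtr != colorKey)   /* transparent pixels are skipped */
                *dstPtr = *srcPtr;
            ++srcPtr;
            ++dstPtr;
        }
        srcPtr += srcSkip;
        dstPtr += dstSkip;
    }
}
```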
The code above is more complicated than it needs to be because it is somewhat optimized. The basic premise is the same as before, though, except that instead of using memcpy to copy an entire row all at once, we iterate through each byte (pixel) in the row one at a time. This is slower than the previous method, since Pentium string instructions cannot be used to copy the entire line at once, but we have to do this if we wish to analyze the surface on a per-pixel basis. For color-keying we do just that. Before it is written, each pixel is first checked to see if it is the same as the color key. If it is, we skip it.

Now to the point. If we wish to do translucency map blends, all we need to change is: one, remove the color-key check because it will now be redundant, and two, replace the copy "*dstPtr = *srcPtr;" with "*dstPtr = INDEX_TRANSMAP(transMap, *srcPtr, *dstPtr);". The resulting code is:
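A sketch of the resulting blit, under the same assumptions as before, is shown here. blit8_transmap is a hypothetical name, and the macro is repeated (with source as row, destination as column) so the block stands on its own.

```c
#define INDEX_TRANSMAP(map, src, dst) ((map)[((src) << 8) + (dst)])

/* Blend a width x height block into dst through a 64K translucency map. */
void blit8_transmap(unsigned char *dst, int dstPitch,
                    const unsigned char *src, int srcPitch,
                    int width, int height,
                    const unsigned char *transMap)
{
    const unsigned char *srcPtr = src;
    unsigned char *dstPtr = dst;
    const int srcSkip = srcPitch - width;
    const int dstSkip = dstPitch - width;
    int x, y;

    for (y = height; y > 0; --y) {
        for (x = width; x > 0; --x) {
            /* No color-key test: the map decides what each pair becomes. */
            *dstPtr = INDEX_TRANSMAP(transMap, *srcPtr, *dstPtr);
            ++srcPtr;
            ++dstPtr;
        }
        srcPtr += srcSkip;
        dstPtr += dstSkip;
    }
}
```

A color key can still be expressed with this routine by building a map whose key row simply returns the destination index.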
Thus, each destination pixel will now be set to the blend (as defined by transMap) of the source pixel with the destination pixel. Depending on what you've defined transMap to be, this single blit function might be used for average blits, color-keyed blits, additive blits, whatever. The blit function now requires a couple of extra reads (one from the destination surface, and another from the translucency map), a shift and two adds. This speed loss is mostly (if not completely) offset by the speed gained from removing the conditional (which, in tight loops, can tend to muck up the pipeline). And if you're still worried about speed, you can unroll the loop to make it a bit faster still (which in my tests gained about 15% to 20% in performance).

Now, how do I apply this to my 3d game? Simple. In the polygon rasterizer there should be a line very similar to the copy in the color-keyed blit, "*dstPtr = *srcPtr;". To get the same effects available with translucent sprite blits, all we need to do is replace that line the same way we replaced it in the blit routine. dstPtr will remain the same (it is just a pointer to a byte in the destination surface), but now srcPtr may refer to a location in texture memory (or, in the case of non-textured polygons, can merely be replaced by a color). Now each pixel drawn for the polygon will be blended with the background as defined by the translucency map being used.

For effects other than translucency, we just alter the parameters being blended. For example, if we want to implement Gouraud shading, we'd use INDEX_TRANSMAP(transMap, *srcPtr, interpolatedColor), where interpolatedColor is the color (either grayscale or RGB, scaled to the range 0 to 255) interpolated from the vertex colors during Gouraud shading. You may have noticed, however, that this method has one limitation: you cannot have a translucent Gouraud shaded polygon. You can have one or the other, but not both.
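To make the two single-blend variants concrete, here is a sketch of the inner loop of a span (one horizontal run of polygon pixels) for each. Everything about it is an assumption for illustration: the function names, the texels being read linearly through srcPtr, and the 16.16 fixed-point intensity stepping. A real rasterizer would step texture coordinates instead.

```c
#define INDEX_TRANSMAP(map, src, dst) ((map)[((src) << 8) + (dst)])

/* Translucent span: blend each texel with what is already on screen. */
void span_translucent(unsigned char *dstPtr, const unsigned char *srcPtr,
                      int count, const unsigned char *transMap)
{
    while (count-- > 0) {
        *dstPtr = INDEX_TRANSMAP(transMap, *srcPtr, *dstPtr);
        ++srcPtr;
        ++dstPtr;
    }
}

/* Gouraud span: blend each texel with an interpolated 0..255 intensity.
   shade and shadeStep are 16.16 fixed-point values. */
void span_gouraud(unsigned char *dstPtr, const unsigned char *srcPtr,
                  int count, const unsigned char *shadeMap,
                  long shade, long shadeStep)
{
    while (count-- > 0) {
        unsigned char interpolatedColor =
            (unsigned char)((shade >> 16) & 0xFF);
        *dstPtr = INDEX_TRANSMAP(shadeMap, *srcPtr, interpolatedColor);
        shade += shadeStep;
        ++srcPtr;
        ++dstPtr;
    }
}
```

Each function does exactly one table lookup per pixel, which is why a single pass gives either translucency or shading.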
But, if we're willing to sacrifice a little more speed, it is quite possible to accumulate translucency map blends, like so: "INDEX_TRANSMAP(transMap1, INDEX_TRANSMAP(transMap2, srcColor, blendColor), dstColor);". This has the effect of first blending srcColor with blendColor (which could be the Gouraud texture/interpolated color blend), then feeding that output to another blend with dstColor (which could be the translucency blend). Thus, for a little speed loss, we can do two blends at once. Of course, there is no reason why more than two couldn't be done at a time.

I hope this helps explain a little more clearly the ideas I brought up in my article. The other techniques I mentioned (such as anti-aliased fonts, lightmaps, and simulated alpha) are all implemented with the standard source/destination blend detailed above. Of course, there are many other things that can be done with translucency maps, so I encourage those who are interested to experiment. As before, if you have any questions or comments, email me: jd_wild@uclink4.berkeley.edu.