Don't use GL_RGB
I commonly see GL_RGB used as a lightmap format in engine sources, and can only assume that people somehow think it "saves memory". Not only does it not save memory, it slows things down too. Read this first.
On NVIDIA, using GL_BGRA can upload textures up to 6 times faster than GL_RGB. On Intel it's something similar but subtly (or not so subtly - see below too) different. ATI, oddly enough, doesn't seem to care much, but nonetheless it makes sense to use the format that performs best on as much hardware as possible.
And if you are interested, most GPUs like chunks of 4 bytes. In other words, RGBA or BGRA is preferred. RGB and BGR are considered bizarre, since most GPUs, most CPUs and most other kinds of chip don't handle 24-bit values natively. This means the driver converts your RGB or BGR data to what the GPU prefers, which typically is BGRA, so the 3-byte format ends up occupying 4 bytes per texel anyway.
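To make the 3-byte versus 4-byte point concrete, here is a minimal sketch of the kind of expansion that has to happen somewhere when the data starts out as RGB. The function name rgb_to_bgra is hypothetical and this is not what any particular driver does internally; it only illustrates the swizzle and padding. An engine that builds its lightmaps in BGRA to begin with avoids this step entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Expand tightly packed 3-byte RGB texels into 4-byte BGRA texels.
   src holds texel_count * 3 bytes; dst must have room for texel_count * 4.
   (Hypothetical helper, for illustration only.) */
static void rgb_to_bgra(const uint8_t *src, uint8_t *dst, size_t texel_count)
{
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < texel_count; i++) {
        dst[i * 4 + 0] = src[i * 3 + 2]; /* blue  */
        dst[i * 4 + 1] = src[i * 3 + 1]; /* green */
        dst[i * 4 + 2] = src[i * 3 + 0]; /* red   */
        dst[i * 4 + 3] = 255;            /* alpha: not needed by a lightmap, set opaque */
    }
}
```

Every texel costs a read of three bytes, a reorder and a write of four, which is work the upload path can skip when the source data is already 4 bytes per texel in the order the hardware wants.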
Don't use GL_UNSIGNED_BYTE
This one really only affects Intel, but it's no harm to use it for everything. With any type other than GL_UNSIGNED_INT_8_8_8_8_REV Intel seems to pull the texture data back to system memory for modification, whereas using GL_UNSIGNED_INT_8_8_8_8_REV allows glTexSubImage2D to send it directly. A combination of GL_BGRA and GL_UNSIGNED_INT_8_8_8_8_REV will run about 40 times faster on Intel than GL_RGB/GL_UNSIGNED_BYTE.
Both of these are only available if your GL_VERSION is 1.2 or higher, but I think that's a reasonable requirement to have these days. Of course you'll need to define them in your glquake.h file, so here they are:
#define GL_BGRA 0x80E1
#define GL_UNSIGNED_INT_8_8_8_8_REV 0x8367
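As a rough sketch of how these two defines might be used for a lightmap update: the helper names pack_bgra and upload_lightmap_rect, and the LIGHTMAP_SKETCH_USE_GL switch, are all made up for this example. The GL part is only compiled if you define that switch after including your own GL header; the packing part stands alone.

```c
#include <assert.h>
#include <stdint.h>

#ifndef GL_BGRA
#define GL_BGRA 0x80E1
#endif
#ifndef GL_UNSIGNED_INT_8_8_8_8_REV
#define GL_UNSIGNED_INT_8_8_8_8_REV 0x8367
#endif

/* Pack one texel for GL_BGRA + GL_UNSIGNED_INT_8_8_8_8_REV.
   With the _REV type the first component (blue) sits in the lowest 8 bits,
   so the 32-bit value reads 0xAARRGGBB. */
static uint32_t pack_bgra(uint8_t r, uint8_t g, uint8_t b, uint8_t a)
{
    return ((uint32_t)a << 24) | ((uint32_t)r << 16) |
           ((uint32_t)g << 8)  |  (uint32_t)b;
}

#ifdef LIGHTMAP_SKETCH_USE_GL /* hypothetical switch, see the note above */
/* Update a w x h rectangle of an already created lightmap texture.
   texels must hold w * h tightly packed 32-bit values. */
static void upload_lightmap_rect(GLuint texnum, int x, int y, int w, int h,
                                 const uint32_t *texels)
{
    assert(w > 0 && h > 0 && texels != NULL);
    glBindTexture(GL_TEXTURE_2D, texnum);
    glTexSubImage2D(GL_TEXTURE_2D, 0, x, y, w, h,
                    GL_BGRA, GL_UNSIGNED_INT_8_8_8_8_REV, texels);
}
#endif
```

Note that GL_BGRA only describes the data you hand over. When the texture is first created with glTexImage2D, the internal format would normally stay GL_RGBA.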
If you just do the above changes you'll probably notice that nothing at all has changed in terms of performance, especially if your renderer is set up like GLQuake's. This is because of the dreaded R_DrawSequentialPoly function, which is one of the most evil things in GLQuake.
The single worst thing you can do is modify a resource, then use it, then modify it again, then use it again, and so on, in the same frame. This completely breaks CPU/GPU parallelism: your CPU will be constantly waiting for your GPU to be ready, and your pipeline will be constantly stalling.
This is also the reason why disabling multitexturing is sometimes used as a performance enhancer with some maps - the non-multitextured path more commonly does things the right way, avoids the stalls, and therefore seems to be the faster one, even though it's actually substantially slower than a properly designed multitexture path.
Instead set things up so that you can blast through all of your visible surfaces in a first pass, updating lightmaps as you go, then do a second pass for actually drawing them. If this first pass can do something else useful - like sorting surfaces into texture chains - all the better.
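The two-pass layout described above could be sketched roughly like this. Everything here is a stand-in: surface_t is a cut-down hypothetical struct, not GLQuake's msurface_t, and update_lightmap and draw_surface only count calls where a real renderer would call glTexSubImage2D and issue draw calls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_TEXTURES 4 /* hypothetical limit, just for the sketch */

typedef struct surface_s {
    int texture;                  /* index of the surface's diffuse texture */
    bool lightmap_dirty;          /* a dynamic light touched it this frame */
    struct surface_s *chain_next; /* next surface using the same texture */
} surface_t;

/* Stand-ins for the real GL work; they only count calls. */
static int uploads_done, draws_done, uploads_after_first_draw;

static void update_lightmap(surface_t *s)
{
    if (draws_done > 0)
        uploads_after_first_draw++; /* would be a modify-after-use stall */
    uploads_done++;
    s->lightmap_dirty = false;
}

static void draw_surface(const surface_t *s)
{
    (void)s;
    draws_done++;
}

static void render_frame(surface_t *visible, size_t count)
{
    surface_t *chains[MAX_TEXTURES] = {0};

    /* Pass 1: update every dirty lightmap and sort into texture chains. */
    for (size_t i = 0; i < count; i++) {
        surface_t *s = &visible[i];
        assert(s->texture >= 0 && s->texture < MAX_TEXTURES);
        if (s->lightmap_dirty)
            update_lightmap(s);
        s->chain_next = chains[s->texture];
        chains[s->texture] = s;
    }

    /* Pass 2: draw only; no texture is modified from here on. */
    for (int t = 0; t < MAX_TEXTURES; t++)
        for (const surface_t *s = chains[t]; s != NULL; s = s->chain_next)
            draw_surface(s);
}
```

The point of the split is that all texture modifications for the frame are finished before the first draw is issued, so the modify/use/modify/use pattern never occurs and uploads_after_first_draw stays at zero.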
Conclusion
There's frequently a reason why id Software did things the way they did in Quake, but sometimes that reason may be one of:
- Quake had to run on an MS-DOS machine with a P60 and 8MB RAM
- It worked OK on the hardware that was available in 1996 (I'm thinking 3DFX in particular).
- They were learning and experimenting, and didn't really know what they were doing.
- It wasn't noticed as a problem because there were worse bottlenecks elsewhere (fillrate, software T&L, etc).