[GLQuake] Depth Buffer Precision and Clearing Fix

Post tutorials on how to do certain tasks within game or engine code here.
Post Reply
mh
Posts: 2292
Joined: Sat Jan 12, 2008 1:38 am

[GLQuake] Depth Buffer Precision and Clearing Fix

Post by mh »

This one is depressingly common, I'm afraid. What I'm going to do here is give the Windows API version of what needs to be done; for other operating systems you should be able to figure it out.

First of all a bit of an introduction. OpenGL is great in that it shelters you from having to deal with some of the more down and dirty aspects of the hardware. Unfortunately there are places where the abstraction leaks, and when that happens you can be bitten quite hard. So you do need to roll up your sleeves and get your hands dirty after all.

Most GLQuake-based engines request a 32-bit depth buffer and leave it at that. However, these engines are most likely actually running with 16-bit depth on everyone's machines. The reason is that there is no such thing as a 32-bit depth buffer on most consumer hardware. Your ChoosePixelFormat call is silently selecting a 16-bit depth buffer instead, and unless you check what you actually got, you'll never know.

I've seen this in the current versions of 3 major engines. So let's fix it.

Assuming you agree that 16-bit depth isn't enough, the first thing to do is pick a better format. Our available formats are going to be 16-bit depth, 24-bit depth (with 8 unused) and 24-bit depth (with 8 stencil). So open gl_vidnt.c, find the bSetupPixelFormat function, and change this line:

Code: Select all

		32,						// 32-bit z-buffer
to this:

Code: Select all

		24,						// 24-bit z-buffer
The next step (in the same function) is needed because the PIXELFORMATDESCRIPTOR you pass in isn't an absolute ruling on what you get; GDI may decide to give you something different (we already saw that when we asked for 32 but got 16), so there's a possibility that we got a stencil buffer too. To find out what we really have, add this just before the "return TRUE" line:

Code: Select all

	DescribePixelFormat (hDC, pixelformat, sizeof (PIXELFORMATDESCRIPTOR), &pfd);
Every engine I've tested this on also gave us 8 bits of stencil. Why is this important? Simply, if you have a stencil buffer, even if you never use it, you should always clear it at the same time as you clear your depth buffer, otherwise performance can drop off dramatically. (The two are typically stored as a single packed depth-stencil buffer, so clearing depth alone forces the driver down a slower partial-clear path.)

So add a global qboolean called something like gl_havestencil to gl_vidnt.c, set it to true if pfd.cStencilBits is greater than 0 (after the DescribePixelFormat call above), and extern it so that it's accessible from gl_rmain.c.

Then, when clearing, check if gl_havestencil is true, and if so, add GL_STENCIL_BUFFER_BIT to your glClear call. Easy.
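The whole thing boils down to something like this (a minimal standalone sketch rather than the engine's actual clear code; R_ClearMask is a made-up name, and the bit values mirror the ones in <GL/gl.h> so it builds without the GL headers):

Code: Select all

```c
#define GL_DEPTH_BUFFER_BIT   0x00000100	/* values as defined in <GL/gl.h> */
#define GL_STENCIL_BUFFER_BIT 0x00000400

typedef int qboolean;	/* simplified stand-in for Quake's qboolean */

/* Build the mask to hand to glClear: depth always, stencil whenever we have
   one, so a packed depth-stencil buffer is always cleared in full */
static unsigned int R_ClearMask (qboolean gl_havestencil)
{
	unsigned int bits = GL_DEPTH_BUFFER_BIT;

	if (gl_havestencil)
		bits |= GL_STENCIL_BUFFER_BIT;

	return bits;
}
```

In the engine you'd pass the result straight to glClear, OR'd with whatever colour-buffer bit your existing clear already uses.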
We had the power, we had the space, we had a sense of time and place
We knew the words, we knew the score, we knew what we were fighting for
Sajt
Posts: 1215
Joined: Sat Oct 16, 2004 3:39 am

Re: [GLQuake] Depth Buffer Precision and Clearing Fix

Post by Sajt »

mh wrote:Simply, if you have a stencil buffer, even if you never use it, you should always clear it at the same time as you clear your depth buffer, otherwise performance can drop off dramatically.
Hmm, this is interesting, I never knew that. Perhaps some benchmarking is in order...
F. A. Špork, an enlightened nobleman and a great patron of art, had a stately Baroque spa complex built on the banks of the River Labe.
Baker
Posts: 3666
Joined: Tue Mar 14, 2006 5:15 am

Post by Baker »

I ran the Win32 GL build of my engine with 16, 24, 32 z buffers requested (no stencil buffer requested) using -bpp 16.

[My engine defaults to the desktop bpp unless otherwise specified, so I actually have to pass -bpp 16 to get 16-bit colour ... this is to enable switching between fullscreen and windowed mode without re-uploading 2D textures, as I don't yet have a texture manager capable of re-uploading all the 2D pics.]

timedemo demo1 results:

32 bit z buffer requested: 295 fps
24 bit z buffer requested: 204 fps
16 bit z buffer requested: 295 fps

I don't know how to interpret the results, except that at least when starting the engine and running a timedemo of demo1, I'm not seeing a frames-per-second drop-off from not clearing the stencil buffer?
The night is young. How else can I annoy the world before sunrise? 8) Inquisitive minds want to know ! And if they don't -- well like that ever has stopped me before ..
mh
Posts: 2292
Joined: Sat Jan 12, 2008 1:38 am

Post by mh »

Hmmmm. Did you check if you're actually getting a 16-bit Z-buffer when you requested a 32-bit one? A 16-bit one will always be faster (less to clear) but at the expense of less precision (dramatically so if using gl_ztrick).
We had the power, we had the space, we had a sense of time and place
We knew the words, we knew the score, we knew what we were fighting for
Spike
Posts: 2914
Joined: Fri Nov 05, 2004 3:12 am
Location: UK
Contact:

Post by Spike »

Personally I get 24-bit depth and 8-bit stencil whatever depth I request, even without requesting a stencil buffer (each time at 24-bit colour).

If you have ztrick enabled, you'll never directly clear the depth buffer anyway (you generally should not have ztrick enabled, as its slower on current cards).
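For anyone who hasn't seen it, the trick works roughly like this (a standalone paraphrase of GLQuake's R_Clear, not the literal code; the function name is made up and the GL constants are inlined so it builds without the headers):

Code: Select all

```c
#define GL_LEQUAL 0x0203	/* values as defined in <GL/gl.h> */
#define GL_GEQUAL 0x0206

static int trickframe;

/* Draw into alternating halves of the depth range with an alternating
   comparison, so last frame's values can never pass this frame's test and
   the depth buffer never needs an explicit clear. Returns the depth func
   for this frame and writes the bounds that glDepthRange would be given. */
static int ZTrick_Frame (double *depthmin, double *depthmax)
{
	trickframe++;

	if (trickframe & 1)
	{
		*depthmin = 0.0;
		*depthmax = 0.49999;
		return GL_LEQUAL;
	}
	else
	{
		*depthmin = 1.0;
		*depthmax = 0.5;
		return GL_GEQUAL;
	}
}
```

The catch is that each frame only gets half of an already-scarce depth range (which is why 16-bit depth plus ztrick looks so bad), and on modern hardware it also defeats the driver's fast depth clears.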
leileilol
Posts: 2783
Joined: Fri Oct 15, 2004 3:23 am

Post by leileilol »

ztrick's for voodoo1, it should be taken out and shot in every port in the world, i don't even get an advantage of speed with it even on a trident / powervr
i should not be here
mh
Posts: 2292
Joined: Sat Jan 12, 2008 1:38 am

Post by mh »

Curiouser and curiouser. I always get 16-bit depth unless I explicitly request 24, and then I always get 8-bit stencil too, even if I request 0. D3D is a lot clearer than OpenGL here (you get exactly what you ask for, no messing) but with the downside that you need to check that what you ask for is actually supported first, otherwise it'll blow up. Pros and cons to both approaches.
We had the power, we had the space, we had a sense of time and place
We knew the words, we knew the score, we knew what we were fighting for
Spike
Posts: 2914
Joined: Fri Nov 05, 2004 3:12 am
Location: UK
Contact:

Post by Spike »

You could walk the list of them with DescribePixelFormat until it fails, in order to enumerate the provided modes and pick the one that best matches your needs.
Which is basically what ChoosePixelFormat does anyway.
mh
Posts: 2292
Joined: Sat Jan 12, 2008 1:38 am

Post by mh »

Great minds or fools?

Here's something I very quickly whipped up:

Code: Select all

#include <windows.h>
#include <stdio.h>
#include <memory.h>
#include <conio.h>


int main (void)
{
	HDC hDC = GetDC (NULL);

	int maxpf = DescribePixelFormat (hDC, 0, 0, NULL);

	printf ("%i available pixel formats\n", maxpf);

	for (int i = 0; i < maxpf; i++)
	{
		PIXELFORMATDESCRIPTOR pfd;

		memset (&pfd, 0, sizeof (PIXELFORMATDESCRIPTOR));
		pfd.nSize = sizeof (PIXELFORMATDESCRIPTOR);

		if (DescribePixelFormat (hDC, (i + 1), sizeof (PIXELFORMATDESCRIPTOR), &pfd))
		{
			if (!(pfd.dwFlags & PFD_SUPPORT_OPENGL)) continue;
			if (!(pfd.dwFlags & PFD_DRAW_TO_WINDOW)) continue;
			if (!(pfd.dwFlags & PFD_DOUBLEBUFFER)) continue;
			if (pfd.iPixelType != PFD_TYPE_RGBA) continue;

			printf ("%3i  %2i colour %2i depth %2i stencil\n",
				(i + 1),
				pfd.cColorBits,
				pfd.cDepthBits,
				pfd.cStencilBits);
		}
	}

	ReleaseDC (NULL, hDC);

	printf ("Press any key... ");
	_getch ();	/* wait for a keypress */
}
For Windows Vista and 7 you might also need to add PFD_SUPPORT_COMPOSITION but I've never noticed any practical difference between using it and not using it in Quake.

This of course uses the desktop rather than a window, just because I didn't feel like creating a window. In a Real Program you would want to use a window.
We had the power, we had the space, we had a sense of time and place
We knew the words, we knew the score, we knew what we were fighting for
Baker
Posts: 3666
Joined: Tue Mar 14, 2006 5:15 am

Post by Baker »

leileilol wrote:ztrick's for voodoo1, it should be taken out and shot in every port in the world, i don't even get an advantage of speed with it even on a trident / powervr
Ironically, I get +20 more FPS using gl_ztrick on a GeForce4.

Go figure.
The night is young. How else can I annoy the world before sunrise? 8) Inquisitive minds want to know ! And if they don't -- well like that ever has stopped me before ..
Post Reply