Host_speeds weirdness and video driver handling crashes

Discuss programming topics for the various GPL'd game engine sources.
Knightmare
Posts: 63
Joined: Thu Feb 09, 2012 1:55 am

Host_speeds weirdness and video driver handling crashes

Post by Knightmare »

Has anyone working with the Q2 engine checked out the output from host_speeds set to 1? The ms elapsed for the client frame is often negative, while the time for the render frame (inside the client frame) is not:

all: 1 sv: 0 gm: 0 cl: -2 rf: 3

Here's the code in Qcommon_frame responsible for this:

Code: Select all

	if (host_speeds->value)
		time_before = Sys_Milliseconds ();

	SV_Frame (msec);

	if (host_speeds->value)
		time_between = Sys_Milliseconds ();		

	CL_Frame (msec);

	if (host_speeds->value)
		time_after = Sys_Milliseconds ();		


	if (host_speeds->value)
	{
		int			all, sv, gm, cl, rf;

		all = time_after - time_before;
		sv = time_between - time_before;
		cl = time_after - time_between;
		gm = time_after_game - time_before_game;
		rf = time_after_ref - time_before_ref;
		sv -= gm;
		cl -= rf;
		Com_Printf ("all:%3i sv:%3i gm:%3i cl:%3i rf:%3i\n",
			all, sv, gm, cl, rf);
	}	
Here's the code inside CL_Frame where the ref time is calculated:

Code: Select all

	// update the screen
	if (host_speeds->value)
		time_before_ref = Sys_Milliseconds ();
	SCR_UpdateScreen ();
	if (host_speeds->value)
		time_after_ref = Sys_Milliseconds ();
This indicates that the time returned by Sys_Milliseconds after CL_Frame completes is less than the time returned before it was called! This is in spite of the fact that the timer behaves as expected before and after SCR_UpdateScreen. Is anybody else getting this behavior? I strongly doubt that the client frame could be finishing before it begins.

Another problem I'm having has to do with the newer 295.73 nVidia drivers intercepting crashes with a "Display Driver Stopped Responding and has recovered" error popup. Since upgrading from the 197.45 drivers, this has been preventing me from debugging some hangs in my engine's asynchronous network/rendering path. Does anybody know how to disable this?
Spike
Posts: 2914
Joined: Fri Nov 05, 2004 3:12 am
Location: UK
Contact:

Re: Host_speeds weirdness and video driver handling crashes

Post by Spike »

QueryPerformanceCounter inside certain implementations of Sys_Milliseconds will do that, yes.
Different CPU cores have different performance counters, which can go out of sync due to idle states and other stuff like that.
Use winmm.lib's timeGetTime instead; it only has between 1ms and 10ms precision or so, but you can supposedly request higher precision from the system with some other call.
The other fix is to tie the timing to a single CPU core.
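A minimal sketch of that last fix, assuming a Win32 build (the mask value 1 means the first logical processor; the function name is just illustrative, not engine code):

```c
#include <windows.h>

/* Pin the current thread to CPU core 0 so QueryPerformanceCounter
   is always read from the same core's counter.  Sketch of the
   workaround described above, not actual Q2 engine code. */
static void Sys_PinToSingleCore (void)
{
	/* mask bit 0 = first logical processor */
	SetThreadAffinityMask (GetCurrentThread (), 1);
}
```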

regarding your nvidia crashes... Try disabling vbo use. Then any bad accesses will come from user-space which will trigger debuggable faults.
Most likely it comes from leaving attribute arrays enabled from a smaller vbo, then rendering 10000 verts with the smaller vbo attributes still active. Such things can trigger some nice DMA bus faults.
This message is from Windows telling you that the _driver_ has crashed, rather than the application. It's not nVidia saying their driver has crashed, as they're too proud to ever admit that. Any crashes nVidia's drivers detect will say 'this program is not compatible' or some other face-saving gibberish.
So are you sure it's the drivers intercepting a purely software crash?
mh
Posts: 2292
Joined: Sat Jan 12, 2008 1:38 am

Re: Host_speeds weirdness and video driver handling crashes

Post by mh »

Q2 by default uses timeGetTime unless you've modified it yourself.

timeBeginPeriod (1) is the way to set its resolution to 1ms, but beware that this is a system-wide global setting and can impact the thread scheduler, other programs, power-saving states, etc. Expect to run a little hot, in other words.

The documentation for timeBeginPeriod suggests that a resolution of 1 may not be supported on all systems, but in practice it will be. If in doubt, start a loop from 1 and call timeBeginPeriod with the loop counter until you either succeed or hit a threshold beyond which you're not going to bother (10 seems reasonable). That only needs to be done once at program startup.
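The startup loop described above might look something like this (a sketch for a Win32 build; TIMERR_NOERROR is the success code timeBeginPeriod returns, and the function names here are illustrative):

```c
#include <windows.h>
#include <mmsystem.h>	/* link with winmm.lib */

static UINT timer_period = 0;	/* remember what we set, for timeEndPeriod */

void Sys_InitTimerPeriod (void)
{
	UINT period;

	/* try 1ms first, falling back to coarser resolutions up to 10ms */
	for (period = 1; period <= 10; period++)
	{
		if (timeBeginPeriod (period) == TIMERR_NOERROR)
		{
			timer_period = period;
			break;
		}
	}
}

void Sys_ShutdownTimerPeriod (void)
{
	/* undo with the same resolution we successfully set */
	if (timer_period)
		timeEndPeriod (timer_period);
}
```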

There are also dire warnings floating around about it persisting after the program exits, although I haven't found any definitive confirmation either way. Despite that, playing safe and using timeEndPeriod (1) before exiting seems the right thing to do. If you use the loop method then be sure to call timeEndPeriod with the same resolution that you originally called timeBeginPeriod with.

Yes, timing on Windows sucks.

For the "display driver stopped responding" problem you can change a registry setting that controls the timeout period. See: http://support.microsoft.com/kb/2665946 (also seems possible to disable it entirey based on the info here: http://msdn.microsoft.com/en-us/windows/hardware/gg487368). One possible cause of this is that something you've done has thrown the driver into an infinite loop; if that's happened then increasing the timeout period is not going to help much, and the "stopped responding" behaviour is actually the best you can hope for.

I once had this all the time with an old Intel gfx chip and it consistently happened during lightmap uploads. Switching from GL_RGB to GL_BGRA was all that was needed to fix it. I'd expect that NV would do a lot better though (although I don't trust their drivers much anymore after the 2.80 fiasco)...
We had the power, we had the space, we had a sense of time and place
We knew the words, we knew the score, we knew what we were fighting for
Knightmare
Posts: 63
Joined: Thu Feb 09, 2012 1:55 am

Re: Host_speeds weirdness and video driver handling crashes

Post by Knightmare »

I'm not using VBOs, and this happens only under the async client frame routine, after several minutes of gameplay. I'm thinking that there's an infinite loop somewhere, or maybe a divide by zero, causing the application to hang.

I'll try those Registry keys, MH. They don't exist by default, so I'll have to create them.
mh
Posts: 2292
Joined: Sat Jan 12, 2008 1:38 am

Re: Host_speeds weirdness and video driver handling crashes

Post by mh »

My experience has been that any time I leave a shorter attribute array enabled, reading beyond the end of a VBO - provided it's stored in GPU memory - is quite safe as it's not subject to the same memory protection that OS-allocated memory gets. Not that it's something you should do, or behaviour you should rely on, of course.

VBOs in system memory are another matter, but to date I've always gotten a nice clean crash. As a general rule, software T&L - or a fallback of the per-vertex pipeline - can get quite hairy whereas hardware is very robust. (The counterargument, of course, is that the software behaviour at least lets you know pretty damn fast that there is something wrong.)
We had the power, we had the space, we had a sense of time and place
We knew the words, we knew the score, we knew what we were fighting for
Knightmare
Posts: 63
Joined: Thu Feb 09, 2012 1:55 am

Re: Host_speeds weirdness and video driver handling crashes

Post by Knightmare »

I took a closer look at my async client frame and input code. I noticed this function that I added, which is supposedly only used when disconnected, to handle mouse input for the menus:

Code: Select all

void CL_RefreshMove (void)
{	
	usercmd_t *cmd = &cl.cmds[ cls.netchan.outgoing_sequence & (CMD_BACKUP-1) ];

	// Get basic movement from keyboard
	CL_BaseMove (cmd);

	// Allow mice or other external controllers to add to the move
	IN_Move (cmd);
}
It calls CL_BaseMove without first setting and bounds-checking the global variable frame_msec the way CL_RefreshCmd does:

Code: Select all

void CL_RefreshCmd (void)
{	
	int			ms;
	usercmd_t	*cmd = &cl.cmds[ cls.netchan.outgoing_sequence & (CMD_BACKUP-1) ];

	// get delta for this sample.
	frame_msec = sys_frame_time - old_sys_frame_time;	

	// bounds checking
	if (frame_msec < 1)
		return;
	if (frame_msec > 200)
		frame_msec = 200;

	// Get basic movement from keyboard
	CL_BaseMove (cmd);

	// Allow mice or other external controllers to add to the move
	IN_Move (cmd);
	.
	.
	.
}
CL_BaseMove calls CL_KeyState:

Code: Select all

float CL_KeyState (kbutton_t *key)
{
	float		val;
	int			msec;

	key->state &= 1;		// clear impulses

	msec = key->msec;
	key->msec = 0;

	if (key->state)
	{	// still down
		msec += sys_frame_time - key->downtime;
		key->downtime = sys_frame_time;
	}

	val = (float)msec / frame_msec;
	if (val < 0)
		val = 0;
	if (val > 1)
		val = 1;

	return val;
}
That's a potential divide by zero right there.
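One defensive fix (not how the stock engine handles it; the engine instead guarantees frame_msec is valid before CL_BaseMove runs, as in CL_RefreshCmd above) would be a guard on the divisor. CL_KeyFraction is a hypothetical standalone helper, written here just to show the guard:

```c
/* Sketch of a divide-by-zero guard for CL_KeyState's scaling step.
   frame_msec is the frame delta in milliseconds; if it's zero the
   key can't have been held for any fraction of the frame. */
float CL_KeyFraction (int key_msec, int frame_msec)
{
	float	val;

	if (frame_msec < 1)
		return 0.0f;	/* avoid dividing by zero */

	val = (float)key_msec / (float)frame_msec;
	if (val < 0.0f)
		val = 0.0f;
	if (val > 1.0f)
		val = 1.0f;

	return val;
}
```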

So I changed my async client frame code to always call CL_RefreshCmd instead, just like EGL does. I did a debug build and ran it windowed with the registry hack that MH linked to. After over 20 minutes, no hangs or crashes, where before they always happened within 5 or 6 minutes. I think I may have fixed it. I'll need to run a release build fullscreen to be sure.
Knightmare
Posts: 63
Joined: Thu Feb 09, 2012 1:55 am

Re: Host_speeds weirdness and video driver handling crashes

Post by Knightmare »

I spoke too soon. I got another hang after 15 minutes. This instability is really intermittent.
Knightmare
Posts: 63
Joined: Thu Feb 09, 2012 1:55 am

Re: Host_speeds weirdness and video driver handling crashes

Post by Knightmare »

mh wrote:My experience has been that any time I leave a shorter attribute array enabled, reading beyond the end of a VBO - provided it's stored in GPU memory - is quite safe as it's not subject to the same memory protection that OS-allocated memory gets. Not that it's something you should do, or behaviour you should rely on, of course.
VBOs in system memory are another matter, but to date I've always gotten a nice clean crash. As a general rule, software T&L - or a fallback of the per-vertex pipeline - can get quite hairy whereas hardware is very robust. (The counterargument, of course, is that the software behaviour at least lets you know pretty damn fast that there is something wrong.)
Would any of this apply to vertex arrays that use the regular glTexCoordPointer, glVertexPointer, etc, and are drawn with glDrawElements or glDrawRangeElementsEXT? I'm not using VBOs, and have recently switched to batching multiple lightmapped surfs using such vertex arrays. I always check for the arrays overflowing before adding another surface.
Spike
Posts: 2914
Joined: Fri Nov 05, 2004 3:12 am
Location: UK
Contact:

Re: Host_speeds weirdness and video driver handling crashes

Post by Spike »

glTexCoordPointer/glVertexPointer specify built-in attributes.
It's unlikely that you want to disable the glVertexPointer attribute, but multitextured glTexCoordPointer attributes are more likely to be left enabled, even when the texture itself is disabled.
If you're not using VBOs, the driver should do some sort of memcpy in user-space as part of the glDrawElements call.

If that 'memcpy' results in poking an invalid page, your debugger will trigger a breakpoint inside some system dll without debugging info, so it might not show the right function that called glDrawElements, but should at least show its parent.
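Following the point about stale texcoord arrays being left enabled, one way to rule that out is to explicitly disable the texcoord array on every TMU after each multitextured batch. A hedged sketch using the engine's qgl-style function pointers (GL_DisableTexCoordArrays and used_tmus are hypothetical names, not existing engine code):

```c
/* Defensive cleanup after a multitextured batch: make sure no TMU is
   left with a texcoord array enabled that points at an old, smaller
   array.  Sketch only; adapt to your own state tracking. */
void GL_DisableTexCoordArrays (int used_tmus)
{
	int		i;

	for (i = used_tmus - 1; i >= 0; i--)
	{
		qglClientActiveTextureARB (GL_TEXTURE0_ARB + i);
		qglDisableClientState (GL_TEXTURE_COORD_ARRAY);
	}
	/* the loop ends with TMU 0 selected, matching the usual assumption */
}
```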
mh
Posts: 2292
Joined: Sat Jan 12, 2008 1:38 am

Re: Host_speeds weirdness and video driver handling crashes

Post by mh »

Downloading symbols helps here. At the very least it will enable you to more clearly identify your code/system code/driver code as the source of the crash. And yes - glClientActiveTexture is evil incarnate: if you're using shaders at all you should switch to generic attrib arrays as fast as you can.
We had the power, we had the space, we had a sense of time and place
We knew the words, we knew the score, we knew what we were fighting for
Knightmare
Posts: 63
Joined: Thu Feb 09, 2012 1:55 am

Re: Host_speeds weirdness and video driver handling crashes

Post by Knightmare »

Could you explain downloading symbols? Do you mean using glGetError() checks periodically?

What's so bad about glClientActiveTextureARB? I'm using it in a utility function to change TMUs:

Code: Select all

void GL_SelectTexture (unsigned tmu)
{
	if (tmu >= MAX_TEXTURE_UNITS || tmu >= glConfig.max_texunits)
		return;

	if (tmu == glState.currenttmu)
		return;

	glState.currenttmu = tmu;

	qglActiveTextureARB(GL_TEXTURE0_ARB+tmu);
	qglClientActiveTextureARB(GL_TEXTURE0_ARB+tmu);
}
Attrib arrays require OpenGL 2.0. I'm keeping the required features at 1.4 for compatibility with the old laptop that I'm currently stuck with.

I'm also using locked arrays to render:

Code: Select all

void GL_LockArrays (int numVerts)
{
	if (!glConfig.extCompiledVertArray)
		return;
	if (glState.arraysLocked)
		return;

	qglLockArraysEXT (0, numVerts);
	glState.arraysLocked = true;
}

void GL_UnlockArrays (void)
{
	if (!glConfig.extCompiledVertArray)
		return;
	if (!glState.arraysLocked)
		return;

	qglUnlockArraysEXT ();
	glState.arraysLocked = false;
}

void RB_DrawArrays (void)
{
	if (rb_vertex == 0 || rb_index == 0) // nothing to render
		return;

	GL_LockArrays (rb_vertex);
	if (glConfig.drawRangeElements)
		qglDrawRangeElementsEXT(GL_TRIANGLES, 0, rb_vertex, rb_index, GL_UNSIGNED_INT, indexArray);
	else
		qglDrawElements(GL_TRIANGLES, rb_index, GL_UNSIGNED_INT, indexArray);
	GL_UnlockArrays ();
}
mh
Posts: 2292
Joined: Sat Jan 12, 2008 1:38 am

Re: Host_speeds weirdness and video driver handling crashes

Post by mh »

You can also use attrib arrays if you have GL_ARB_vertex_program available (this extension also defines a clear mapping between generic attribs and fixed attribs). See http://oss.sgi.com/projects/ogl-sample/ ... rogram.txt (but be prepared to nose-dive into ASM shaders while searching for the relevant info).

For symbols see: http://support.microsoft.com/kb/319037

What's wrong with glClientActiveTexture is that it's a rotten API. It works, it does what it says, but it's lousy design. glMultiTexCoordPointer would have been better (and would have mapped more cleanly to the equivalent immediate mode calls too).

Lock/Unlock is only really useful if there is a portion of the arrays that you're going to reuse for subsequent draw calls. Say you're drawing an MD2 with a shell around it - the positions will be common, but one pass will add texcoords and light colours while the second pass will add shell colour. So you set your array for positions, Lock, then set for texcoords and light colours, draw, set for shell colours, draw again, and finally Unlock; if the driver does its job right, the positions will only need to be sent to the GPU (and maybe even transformed) once - not even VBOs can do that. It may also be useful if you can call it early enough to do a bunch of other work before your draw call, as the driver may be able to stream your verts to the GPU while you're doing that other work - in that way it acts as a hint to the driver that "I'm not going to be modifying this data from here on, so you can go do your thing with it while I'm doing this other work". In practice it may sometimes be difficult to find other work to be doing.
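The MD2-plus-shell pattern described above might be sketched like this (hedged GL sketch using the engine's qgl-style pointers; positions, texcoords, lightColors, shellColors, indexes and the counts are placeholder names):

```c
/* Two passes sharing locked positions via GL_EXT_compiled_vertex_array. */
qglVertexPointer (3, GL_FLOAT, 0, positions);
qglEnableClientState (GL_VERTEX_ARRAY);
qglLockArraysEXT (0, numVerts);		/* hint: positions won't change */

/* pass 1: textured, vertex-lit model */
qglEnableClientState (GL_TEXTURE_COORD_ARRAY);
qglTexCoordPointer (2, GL_FLOAT, 0, texcoords);
qglEnableClientState (GL_COLOR_ARRAY);
qglColorPointer (4, GL_FLOAT, 0, lightColors);
qglDrawElements (GL_TRIANGLES, numIndexes, GL_UNSIGNED_INT, indexes);

/* pass 2: shell pass reuses the locked positions, swaps colours */
qglDisableClientState (GL_TEXTURE_COORD_ARRAY);
qglColorPointer (4, GL_FLOAT, 0, shellColors);
qglDrawElements (GL_TRIANGLES, numIndexes, GL_UNSIGNED_INT, indexes);

qglUnlockArraysEXT ();
qglDisableClientState (GL_COLOR_ARRAY);
```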
We had the power, we had the space, we had a sense of time and place
We knew the words, we knew the score, we knew what we were fighting for
Spike
Posts: 2914
Joined: Fri Nov 05, 2004 3:12 am
Location: UK
Contact:

Re: Host_speeds weirdness and video driver handling crashes

Post by Spike »

Locking arrays is somewhat deprecated. There's too much variation in the level of support provided, resulting in games not using it properly and then being buggy on other graphics cards, which in turn resulted in fewer drivers actually supporting every (fixed) attribute.
While older drivers could cache the transforms, you won't see that with any cards with hardware transforms, so it's really only useful for Q3-era hardware, and then only for vertex coords (but like I say, beware different levels of support).
For more recent hardware, you're better off using shaders to avoid the need for multiple passes. And/or VBOs.

GLSL's attributes can do weird things with gl_Vertex. I've an ATI 9600 card that exactly aliases attribute 0 to it, including glDisableClientState(GL_VERTEX_ARRAY). Which means if you're switching between the two, you're never quite sure if it's enabled or disabled, so you have to disable and then re-enable it in an excessive way.
mh
Posts: 2292
Joined: Sat Jan 12, 2008 1:38 am

Re: Host_speeds weirdness and video driver handling crashes

Post by mh »

Speaking of which, I once saw a driver use lower precision interpolation for attrib array 3 (aliases to color). The full table is:

Code: Select all

    Generic
    Attribute   Conventional Attribute       Conventional Attribute Command
    ---------   ------------------------     ------------------------------
         0      vertex position              Vertex
         1      vertex weights 0-3           WeightARB, VertexWeightEXT
         2      normal                       Normal
         3      primary color                Color
         4      secondary color              SecondaryColorEXT
         5      fog coordinate               FogCoordEXT
         6      -                            -
         7      -                            -
         8      texture coordinate set 0     MultiTexCoord(TEXTURE0, ...)
         9      texture coordinate set 1     MultiTexCoord(TEXTURE1, ...)
        10      texture coordinate set 2     MultiTexCoord(TEXTURE2, ...)
        11      texture coordinate set 3     MultiTexCoord(TEXTURE3, ...)
        12      texture coordinate set 4     MultiTexCoord(TEXTURE4, ...)
        13      texture coordinate set 5     MultiTexCoord(TEXTURE5, ...)
        14      texture coordinate set 6     MultiTexCoord(TEXTURE6, ...)
        15      texture coordinate set 7     MultiTexCoord(TEXTURE7, ...)
       8+n      texture coordinate set n     MultiTexCoord(TEXTURE0+n, ...)
Mixing generic with conventional attribs is generally bad mojo though (although Doom 3 did it.....)
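As an illustration of the table, under ARB_vertex_program's aliasing these two blocks feed the same per-vertex data (colors is a placeholder array; per the warning above, pick one scheme and stick to it):

```c
/* conventional attribute: primary color */
qglColorPointer (4, GL_FLOAT, 0, colors);
qglEnableClientState (GL_COLOR_ARRAY);

/* generic attribute 3, which aliases primary color */
qglVertexAttribPointerARB (3, 4, GL_FLOAT, GL_FALSE, 0, colors);
qglEnableVertexAttribArrayARB (3);
```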
We had the power, we had the space, we had a sense of time and place
We knew the words, we knew the score, we knew what we were fighting for
Knightmare
Posts: 63
Joined: Thu Feb 09, 2012 1:55 am

Re: Host_speeds weirdness and video driver handling crashes

Post by Knightmare »

I induced an infinite loop in my client code to see if that hang would cause the display driver to time out. Nope. So it is indeed something in the rendering code, and not related to the async client code. Though the synchronous and async paths have different default max fps, which could affect the renderer: cl_maxfps 90 for the synchronous path, and r_maxfps 100 for the async path. I did a debug build to try to get it to hang so I could trace the call stack, but it was still running after 30 minutes on the same map where the driver timeouts were occurring within 15 minutes.

The only things in the renderer that I changed around the time the hangs started occurring were putting lightmapped surfaces into texture chains, MH's lightmap update batching, changing the lightmap format to GL_BGRA, and the batching of lightmapped surfaces. So I'll try testing with those enhancements removed.