Another SSE/x87 rendering difference (R_LightPoint)

ericw
Posts: 92
Joined: Sat Jan 18, 2014 2:11 am

Another SSE/x87 rendering difference (R_LightPoint)

Post by ericw »

Just noticed this: in my map jam3_ericw, there's a zombie at the start of the map that's partly embedded in a wall. It gets rendered as black in some engines (winquake.exe, fitzquake085.exe, the 32-bit Windows Quakespasm 0.90.0).
In others, it's lit at a medium brightness (64-bit windows QS 0.90.0).

You can see the zombie in the top middle picture here, in this case it's lit up:

Image

Here's a screenshot from winquake:

Image

(the map is available here, including source)


Even though it looks worse in winquake / fitzquake, that's the reference rendering IMHO, and it's probably my fault for sticking the zombie too far inside the wall.
The cause seems to be another x87 vs SSE difference, like this previous one that was causing lightmap corruption in some maps.

I was able to get winquake-equivalent lighting in code compiled with SSE with the following changes to RecursiveLightPoint (this is modifying the version in Quakespasm):

Add:
#define DoublePrecisionDotProduct(x,y) ((double)(x)[0]*(y)[0]+(double)(x)[1]*(y)[1]+(double)(x)[2]*(y)[2])
Change:
ds = (int) ((float) DotProduct (mid, surf->texinfo->vecs[0]) + surf->texinfo->vecs[0][3]);
dt = (int) ((float) DotProduct (mid, surf->texinfo->vecs[1]) + surf->texinfo->vecs[1][3]);
To:
ds = (int) ((double) DoublePrecisionDotProduct (mid, surf->texinfo->vecs[0]) + surf->texinfo->vecs[0][3]);
dt = (int) ((double) DoublePrecisionDotProduct (mid, surf->texinfo->vecs[1]) + surf->texinfo->vecs[1][3]);
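
For anyone who wants to try the idea outside the engine, here is a minimal self-contained sketch. The helper name texcoord_double and the bare float vecs[4] row are made-up stand-ins for one row of surf->texinfo->vecs, not the actual Quakespasm structures.

```c
/* Sketch only: vecs[] stands in for one row of surf->texinfo->vecs,
   and texcoord_double is a hypothetical helper name. */
#define DoublePrecisionDotProduct(x,y) \
	((double)(x)[0]*(y)[0] + (double)(x)[1]*(y)[1] + (double)(x)[2]*(y)[2])

/* Texture-space coordinate of point mid along one texture axis.
   vecs[0..2] is the axis, vecs[3] is the offset. */
static int texcoord_double (const float mid[3], const float vecs[4])
{
	/* products and sums are carried out in double, then truncated to int */
	return (int) (DoublePrecisionDotProduct (mid, vecs) + vecs[3]);
}
```

Because the first operand of each product is cast to double, the other operand and the float offset are promoted as well, so nothing is rounded to 32 bits before the final truncation.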
Baker
Posts: 3666
Joined: Tue Mar 14, 2006 5:15 am

Re: Another SSE/x87 rendering difference (R_LightPoint)

Post by Baker »

What compiler did you use? I would guess Visual Studio 20xx, or are the cross-compile tools for MinGW on Linux workable for 64-bit Windows now?

I know you are putting this as a 32-bit vs. 64-bit difference. But I suspect something else could be in play here.

You would think that the way two 32-bit IEEE 754 floating point numbers are multiplied in asm would be the same regardless of source code. If you see what I am suggesting. The instructions produced must be different.

Also: Isn't what the 32-bit WinQuake does the gold standard? The 32-bit WinQuake is Quake. What I am implying is that in your previous 32-bit vs. 64-bit case, the 64-bit version was acting wrong. What I am suggesting is that the behavior of the 32-bit WinQuake can't be wrong, and superseding it is a change. The changed behavior could affect tons of little hard-to-find things in a great many maps, putting things that weren't in shadows into shadows or putting things in shadows outside of them and changing lighting subtly away from what the author was testing against at the time of creation, etc.
The night is young. How else can I annoy the world before sunrise? 8) Inquisitive minds want to know ! And if they don't -- well like that ever has stopped me before ..
Spike
Posts: 2914
Joined: Fri Nov 05, 2004 3:12 am
Location: UK
Contact:

Re: Another SSE/x87 rendering difference (R_LightPoint)

Post by Spike »

if we're saying that only dosquake's precision is important, then remember that it's probably using long double (80-bit) precision internally rather than regular doubles (more precise, but since that applies only to intermediate values, the results are compiler-specific: different compilers may treat different locals as intermediates to avoid writing them to memory only to reload them). SSE code can only use 32-bit and 64-bit floats.

trying to exactly match the precision of an x87 program is basically a fool's errand if it leaves the control word of the x87 at its default setting.
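
One rough way to see which of these regimes a given build is in is the C99 macro FLT_EVAL_METHOD from <float.h>. A small sketch follows; the helper name float_eval_description is made up, and the macro only reports what the compiler claims at compile time, so it says nothing about what the x87 control word is set to at run time.

```c
#include <float.h>

/* Hypothetical helper: describe how this build evaluates float expressions,
   according to the C99 FLT_EVAL_METHOD macro. */
static const char *float_eval_description (void)
{
	switch (FLT_EVAL_METHOD)
	{
	case 0:  return "operations evaluated in the precision of their own type (typical for SSE)";
	case 1:  return "float and double operations evaluated as double";
	case 2:  return "all operations evaluated as long double (typical for x87)";
	default: return "indeterminable or implementation-defined";
	}
}
```

Printing the returned string from both a 32-bit and a 64-bit build is a quick check that the two really do evaluate float expressions differently.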
Baker
Posts: 3666
Joined: Tue Mar 14, 2006 5:15 am

Re: Another SSE/x87 rendering difference (R_LightPoint)

Post by Baker »

Spike wrote: if we're saying that only dosquake's precision is important, then remember that it's probably using long double (80-bit) precision internally rather than regular doubles (more precise, but since that applies only to intermediate values, the results are compiler-specific: different compilers may treat different locals as intermediates to avoid writing them to memory only to reload them). SSE code can only use 32-bit and 64-bit floats.

trying to exactly match the precision of an x87 program is basically a fool's errand if it leaves the control word of the x87 at its default setting.
Unwritten theme #2 of what I was saying: fiddling with the floating point to get a certain result in a single case ... I'm not sure that kind of "specific result in a very specific case" fine tuning is helpful. Especially if it fine-tunes the expected result away from what every other engine on Win32 had as a result.

I didn't say anything about DOS Quake. :D
ericw
Posts: 92
Joined: Sat Jan 18, 2014 2:11 am

Re: Another SSE/x87 rendering difference (R_LightPoint)

Post by ericw »

Baker wrote: Also: Isn't what the 32-bit WinQuake does the gold standard? The 32-bit WinQuake is Quake. What I am implying is that in your previous 32-bit vs. 64-bit case, the 64-bit version was acting wrong. What I am suggesting is that the behavior of the 32-bit WinQuake can't be wrong, and superseding it is a change. The changed behavior could affect tons of little hard-to-find things in a great many maps, putting things that weren't in shadows into shadows or putting things in shadows outside of them and changing lighting subtly away from what the author was testing against at the time of creation, etc.
Yup, I agree with all of that :). I didn't mean to argue for superseding WinQuake's behaviour; rather, the code snippet I posted will (even if compiled with SSE) produce the same results as stock 1997 WinQuake, at least on this test case.
Baker wrote:What compiler did you use? I would guess Visual Studio 20xx, or are the cross-compile tools for MinGW on Linux workable for 64-bit Windows now?
Yeah, VS2013, and also Clang on OS X.
Baker wrote: I know you are putting this as a 32-bit vs. 64-bit difference. But I suspect something else could be in play here.

You would think that the way two 32-bit IEEE 754 floating point numbers are multiplied in asm would be the same regardless of source code. If you see what I am suggesting. The instructions produced must be different.
Well not exactly 32-bit vs 64-bit, but I'm pretty sure it's x87 vs SSE.
I admit I didn't spend much time investigating this case, but I did on the lightmapping issue with mfx's map. Found the link where I wrote up more of an explanation of that one, maybe it'd be worth reposting here with some more details filled in: http://sourceforge.net/p/quakespasm/patches/15/

Both cases involve dot products like:

float a,b,c,d,e,f,g;
float result = (a*b + c*d + e*f);
This happens a fair bit in Quake. With the original executables (x87 floating point) the expression (a*b + c*d + e*f) is computed at 80-bit precision, and only then is the result rounded to 32 bits.

AFAIK, C99 allows the compiler to compile the above as if it were written like this:

float a,b,c,d,e,f,g;
float temp1 = (a*b);
float temp2 = (c*d);
float temp3 = (e*f);
float result = temp1 + temp2 + temp3;
And I think this is what you get with SSE2: each intermediate is rounded to a 32-bit float, and since the rounding errors add up, the final result is less accurate than a 32-bit float can be.
Adding the casts to double forces the compiler to keep the intermediate values at 64 bits instead of rounding them to 32 bits, which is why it helps match the results obtained from the original Quake executables.
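
To make the effect concrete, here is a small sketch that computes the same dot product both ways and truncates to int, the way ds and dt are computed. The function names and the input values are made up for illustration; they are not taken from the map or from Quakespasm.

```c
/* Sketch: the same dot product with float-rounded intermediates
   versus double intermediates. Function names are hypothetical. */

/* Every product and partial sum is stored to a float, as in an SSE build.
   volatile forces the store (and the rounding) even on x87 builds. */
static int dot_trunc_float (const float v[3], const float w[3])
{
	volatile float temp1 = v[0] * w[0];
	volatile float temp2 = v[1] * w[1];
	volatile float temp3 = v[2] * w[2];
	volatile float sum = temp1 + temp2;
	sum = sum + temp3;
	return (int) sum;
}

/* Products and sums are kept in double; only the final cast truncates. */
static int dot_trunc_double (const float v[3], const float w[3])
{
	double sum = (double) v[0] * w[0] + (double) v[1] * w[1] + (double) v[2] * w[2];
	return (int) sum;
}
```

With v = {4, -1, 0} and w = {4, 0x1p-26f, 0}, the exact result is 16 - 2^-26. That value is not representable as a float and rounds to 16, so dot_trunc_float returns 16, while dot_trunc_double keeps it just below 16 and returns 15. An off-by-one like that in ds or dt is enough to sample a different lightmap texel.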
There's also some info on wikipedia about this: http://en.wikipedia.org/wiki/SSE2#Diffe ... U_and_SSE2