Baker wrote:
Also: Isn't what the 32-bit WinQuake does the gold standard? The 32-bit WinQuake is Quake. What I am implying is that in your previous 32-bit vs. 64-bit comparison, the 64-bit version was acting wrong. What I am suggesting is that the behavior of the 32-bit WinQuake can't be wrong and superseding that is a change. The changed behavior could affect tons of little hard-to-find things in a great many maps, putting things that weren't in shadows into shadows or putting things in shadows outside of them and changing lighting subtly away from what the author was testing against at the time of creation, etc.
Yup, I agree with all of that. I didn't mean to argue for superseding WinQuake's behaviour. Rather, my point was that the code snippet I posted will (even if compiled with SSE) produce the same results as stock 1997 WinQuake, at least on this test case.
Baker wrote:
What compiler did you use? I would guess Visual Studio 20xx, or are the cross-compile tools for MinGW on Linux workable for 64-bit Windows now?
Yeah, VS2013, and also Clang on OS X.
Baker wrote:
I know you are putting this as a 32-bit vs. 64-bit difference. But I suspect something else could be in play here.
You would think that the way two 32-bit IEEE 754 floating point numbers are multiplied in asm would be the same regardless of source code, if you see what I am suggesting. The instructions produced must be different.
Well, not exactly 32-bit vs 64-bit, but I'm pretty sure it's x87 vs SSE.
I admit I didn't spend much time investigating this case, but I did on the lightmapping issue with mfx's map. I found the link where I wrote up more of an explanation of that one; maybe it'd be worth reposting here with some more details filled in:
http://sourceforge.net/p/quakespasm/patches/15/
Both cases involve dot products like:
Code:
float a,b,c,d,e,f,g;
float result = (a*b + c*d + e*f);
This happens a fair bit in Quake. With the original executables (x87 floating point) the expression (a*b + c*d + e*f) is computed in the x87 registers at extended precision (nominally 80-bit, subject to the FPU's precision-control setting), and only then is the result rounded down to 32 bits.
AFAIK, C99 allows the compiler to evaluate the above as if it were written like this:
Code:
float a,b,c,d,e,f,g;
float temp1 = (a*b);
float temp2 = (c*d);
float temp3 = (e*f);
float result = temp1 + temp2 + temp3;
And I think this is what you get with SSE2: each product and partial sum is rounded to a 32-bit float, and since those rounding errors add up, the result is less accurate than the x87 one, which is only rounded once at the end.
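If you want to see which of these evaluation modes your own compiler claims to use, C99 exposes it through the FLT_EVAL_METHOD macro in <float.h>. Here's a rough sketch of how to print it. The helper names describe_eval_method and print_eval_method are made up for this post, and I wouldn't assume every compiler (older MSVC versions in particular) defines or reports the macro reliably, so treat the output as a hint rather than proof.

```c
#include <assert.h>
#include <float.h>
#include <stdio.h>

/* Hypothetical helper: turns a C99 FLT_EVAL_METHOD value into a description. */
const char *describe_eval_method(int method)
{
    switch (method) {
    case 0:
        return "float expressions are evaluated as float (typical for SSE builds)";
    case 1:
        return "float and double expressions are evaluated as double";
    case 2:
        return "all expressions are evaluated as long double (typical for x87 builds)";
    default:
        return "indeterminable or implementation-defined";
    }
}

/* Hypothetical helper: prints what the current build reports. */
void print_eval_method(void)
{
    int method = (int)FLT_EVAL_METHOD;
    printf("FLT_EVAL_METHOD = %d: %s\n", method, describe_eval_method(method));
}
```

Calling print_eval_method() once from main in a 32-bit x87 build and once in an SSE2 build should show whether the two builds are even trying to evaluate float expressions the same way.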
When you add the casts to double, it forces the compiler to keep the intermediate values at 64 bits instead of rounding them down to 32 bits, which is why it helps match the results obtained from the original Quake executables.
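To make the difference concrete, here's a small sketch comparing the two approaches. It is not the actual patch: dot_float, dot_double and demo are names I made up for illustration, and the volatile temporaries are only there to force each intermediate to be rounded to a 32-bit float regardless of how the compiler evaluates expressions. The input is hand-picked so that 2^24 + 1 doesn't fit in a float.

```c
#include <assert.h>
#include <stdio.h>

/* Every intermediate is stored to a 32-bit float, mimicking the
   "temp1 + temp2 + temp3" form. volatile forces the stores. */
float dot_float(const float v1[3], const float v2[3])
{
    volatile float t1 = v1[0] * v2[0];
    volatile float t2 = v1[1] * v2[1];
    volatile float t3 = v1[2] * v2[2];
    volatile float s = t1 + t2;   /* partial sum rounded to float */
    return s + t3;
}

/* Casting to double keeps the intermediates at 64 bits; only the
   final result is rounded down to float. */
float dot_double(const float v1[3], const float v2[3])
{
    return (float)((double)v1[0] * v2[0] +
                   (double)v1[1] * v2[1] +
                   (double)v1[2] * v2[2]);
}

/* Hand-picked input: 4096*4096 + 1*1 - 4096*4096. The exact answer is 1,
   but 16777216 + 1 rounds back to 16777216 in a 32-bit float. */
void demo(void)
{
    const float v1[3] = { 4096.0f, 1.0f, -4096.0f };
    const float v2[3] = { 4096.0f, 1.0f,  4096.0f };

    assert(dot_float(v1, v2) != dot_double(v1, v2));
    printf("float intermediates:  %g\n", dot_float(v1, v2));
    printf("double intermediates: %g\n", dot_double(v1, v2));
}
```

This is an extreme case chosen to make the effect obvious; in a real map the differences are usually down in the last bit or two, which is still enough to flip a comparison.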
There's also some info on wikipedia about this:
http://en.wikipedia.org/wiki/SSE2#Diffe ... U_and_SSE2