Subdiv16 for sprites?

leileilol
Posts: 2783
Joined: Fri Oct 15, 2004 3:23 am

Post by leileilol »

Surely sprite rendering would be faster if it nicked the code from the BSP spans for subdivided drawing? It would help immensely with unrolled loops for additive sprites.
r00k
Posts: 1111
Joined: Sat Nov 13, 2004 10:39 pm

Re: Subdiv16 for sprites?

Post by r00k »

leileilol wrote: Surely sprite rendering would be faster if it nicked the code from the BSP spans for subdivided drawing? It would help immensely with unrolled loops for additive sprites.
SOMEONE HAD TOO MUCH SUSHI! ;)

sushi is a healthy diet that fortifies positive brain patterns.

we must acknowledge
this is a small community....
mankrip
Posts: 924
Joined: Fri Jul 04, 2008 3:02 am

Re: Subdiv16 for sprites?

Post by mankrip »

Already done:

Code:

void D_SpriteDrawSpans_Dithered_ColorKeyed (void)
{
	do
	{
		// mankrip - begin
		u = pspan->u;
		v = pspan->v;
		du = (float)u;
		dv = (float)v;
		// mankrip - end

		// calculate the initial s/z, t/z, 1/z, s, and t and clamp
		sdivz = d_sdivzorigin + dv * d_sdivzstepv + du * d_sdivzstepu;
		tdivz = d_tdivzorigin + dv * d_tdivzstepv + du * d_tdivzstepu;
		zi = d_ziorigin + dv * d_zistepv + du * d_zistepu;
		z = (float)0x10000 / zi; // prescale to 16.16 fixed-point
		// we count on FP exceptions being turned off to avoid range problems // mankrip
		izi = (int) (zi * 0x8000 * 0x10000); // mankrip

		s = (int) (sdivz * z) + sadjust;
		if (s > bbextents)
			s = bbextents;
		else if (s < 0)
			s = 0;

		t = (int) (tdivz * z) + tadjust;
		if (t > bbextentt)
			t = bbextentt;
		else if (t < 0)
			t = 0;

		pdest = (byte *)d_viewbuffer + (screenwidth * v) + u; // mankrip - edited
		pz = d_pzbuffer + (d_zwidth * v) + u; // mankrip - edited

		Y = v & 1; // mankrip
		count		= pspan->count >> 4; // mh
		// mankrip - begin
		spancount	= pspan->count % 16;
		if (count)
		{
			// prepare dither values
			X = ! ( (v + u) & 1);
			XY0a = dither_kernel[X][Y][0];
			XY1a = dither_kernel[X][Y][1];
			XY0b = dither_kernel[!X][Y][0];
			XY1b = dither_kernel[!X][Y][1];

			while (count--)
		// mankrip - end
			{
				// calculate s/z, t/z, zi->fixed s and t at far end of span,
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
				sdivz += sdivzstepu;
				tdivz += tdivzstepu;
				zi += zistepu;
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end
				z = (float)0x10000 / zi;	// prescale to 16.16 fixed-point
				// we count on FP exceptions being turned off to avoid range problems // mankrip
			//	izi = (int) (zi * 0x8000 * 0x10000); // mankrip

				snext = (int) (sdivz * z) + sadjust;
				if (snext > bbextents)
					snext = bbextents;
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
				else if (snext < 16)
					snext = 16;   // prevent round-off error on <0 steps from causing overstepping & running off the edge of the texture
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end

				tnext = (int) (tdivz * z) + tadjust;
				if (tnext > bbextentt)
					tnext = bbextentt;
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
				else if (tnext < 16)
					tnext = 16;   // guard against round-off error on <0 steps

				// calculate s and t steps across span by shifting
				sstep = (snext - s) >> 4;
				tstep = (tnext - t) >> 4;
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end

				// mankrip - begin
				pdest += 16;
				pz += 16;
				DITHERED_COLORKEY_A(-16); izi += izistep; s += sstep; t += tstep;
				DITHERED_COLORKEY_B(-15); izi += izistep; s += sstep; t += tstep;
				DITHERED_COLORKEY_A(-14); izi += izistep; s += sstep; t += tstep;
				DITHERED_COLORKEY_B(-13); izi += izistep; s += sstep; t += tstep;
				DITHERED_COLORKEY_A(-12); izi += izistep; s += sstep; t += tstep;
				DITHERED_COLORKEY_B(-11); izi += izistep; s += sstep; t += tstep;
				DITHERED_COLORKEY_A(-10); izi += izistep; s += sstep; t += tstep;
				DITHERED_COLORKEY_B( -9); izi += izistep; s += sstep; t += tstep;
				DITHERED_COLORKEY_A( -8); izi += izistep; s += sstep; t += tstep;
				DITHERED_COLORKEY_B( -7); izi += izistep; s += sstep; t += tstep;
				DITHERED_COLORKEY_A( -6); izi += izistep; s += sstep; t += tstep;
				DITHERED_COLORKEY_B( -5); izi += izistep; s += sstep; t += tstep;
				DITHERED_COLORKEY_A( -4); izi += izistep; s += sstep; t += tstep;
				DITHERED_COLORKEY_B( -3); izi += izistep; s += sstep; t += tstep;
				DITHERED_COLORKEY_A( -2); izi += izistep; s += sstep; t += tstep;
				DITHERED_COLORKEY_B( -1); izi += izistep;
				// mankrip - end

				s = snext;
				t = tnext;
				// mankrip - begin
			}
		}
		if (spancount)
		{
				// mankrip - end

			// calculate s/z, t/z, zi->fixed s and t at last pixel in span (so can't step off polygon),
			// clamp, calculate s and t steps across span by division, biasing steps low so we don't run off the texture

			spancountminus1 = (float) (spancount - 1);
			// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
			sdivz += d_sdivzstepu * spancountminus1;
			tdivz += d_tdivzstepu * spancountminus1;
			zi += d_zistepu * spancountminus1;
			// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end
			z = (float)0x10000 / zi;	// prescale to 16.16 fixed-point
			// we count on FP exceptions being turned off to avoid range problems // mankrip
		//	izi = (int) (zi * 0x8000 * 0x10000); // mankrip

			snext = (int) (sdivz * z) + sadjust;
			if (snext > bbextents)
				snext = bbextents;
			// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
			else if (snext < 16)
				snext = 16;   // prevent round-off error on <0 steps from causing overstepping & running off the edge of the texture
			// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end

			tnext = (int) (tdivz * z) + tadjust;
			if (tnext > bbextentt)
				tnext = bbextentt;
			// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
			else if (tnext < 16)
				tnext = 16;   // guard against round-off error on <0 steps
			// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end

			if (spancount > 1)
			{
				sstep = (snext - s) / (spancount - 1);
				tstep = (tnext - t) / (spancount - 1);
			}

		// mankrip - begin
			// prepare dither values
			X = (v + u) & 1;
			XY0a = dither_kernel[X][Y][0];
			XY1a = dither_kernel[X][Y][1];
			XY0b = dither_kernel[!X][Y][0];
			XY1b = dither_kernel[!X][Y][1];

			pdest += spancount;
			pz += spancount;
			switch (spancount)
			{
				case 16: DITHERED_COLORKEY_A(-16); izi += izistep; s += sstep; t += tstep;
				case 15: DITHERED_COLORKEY_B(-15); izi += izistep; s += sstep; t += tstep;
				case 14: DITHERED_COLORKEY_A(-14); izi += izistep; s += sstep; t += tstep;
				case 13: DITHERED_COLORKEY_B(-13); izi += izistep; s += sstep; t += tstep;
				case 12: DITHERED_COLORKEY_A(-12); izi += izistep; s += sstep; t += tstep;
				case 11: DITHERED_COLORKEY_B(-11); izi += izistep; s += sstep; t += tstep;
				case 10: DITHERED_COLORKEY_A(-10); izi += izistep; s += sstep; t += tstep;
				case  9: DITHERED_COLORKEY_B( -9); izi += izistep; s += sstep; t += tstep;
				case  8: DITHERED_COLORKEY_A( -8); izi += izistep; s += sstep; t += tstep;
				case  7: DITHERED_COLORKEY_B( -7); izi += izistep; s += sstep; t += tstep;
				case  6: DITHERED_COLORKEY_A( -6); izi += izistep; s += sstep; t += tstep;
				case  5: DITHERED_COLORKEY_B( -5); izi += izistep; s += sstep; t += tstep;
				case  4: DITHERED_COLORKEY_A( -4); izi += izistep; s += sstep; t += tstep;
				case  3: DITHERED_COLORKEY_B( -3); izi += izistep; s += sstep; t += tstep;
				case  2: DITHERED_COLORKEY_A( -2); izi += izistep; s += sstep; t += tstep;
				case  1: DITHERED_COLORKEY_B( -1);
				break;
			}
		}
		// mankrip - end
		pspan++;
	} while (pspan->count != DS_SPAN_LIST_END);
}
It's still possible to make it a lot faster, but I haven't bothered with it yet.
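For anyone skimming the listing, the core trick is small. What follows is only a rough sketch and not the code above: it does the exact perspective divide once every 16 pixels, steps linearly in between, and finishes the leftover pixels with a real division. The function name span_s_subdiv16 and the output array are made up for illustration, it only tracks s (t works the same way), and it leaves out the bbextents clamping that the real code does.

```c
#include <stdint.h>

typedef int32_t fixed16_t; /* 16.16 fixed point */

/* Hypothetical helper: writes the s texture coordinate of every pixel in one
   span into out[0..count-1]. sdivz and zi are s/z and 1/z at the first pixel;
   sdivzstepu and zistepu are their per-pixel steps along the span. */
void span_s_subdiv16(float sdivz, float zi, float sdivzstepu, float zistepu,
                     int count, fixed16_t *out)
{
    fixed16_t s, snext, sstep;
    int i;

    /* exact perspective divide at the start of the span */
    s = (fixed16_t)(sdivz * (65536.0f / zi));

    /* full 16-pixel runs: one divide per run, linear steps in between */
    while (count >= 16) {
        sdivz += sdivzstepu * 16.0f;
        zi += zistepu * 16.0f;
        snext = (fixed16_t)(sdivz * (65536.0f / zi));
        /* "/ 16" instead of ">> 4" so negative deltas stay well defined */
        sstep = (snext - s) / 16;
        for (i = 0; i < 16; i++) {
            *out++ = s;
            s += sstep;
        }
        s = snext; /* resync to the exact value to stop error accumulating */
        count -= 16;
    }

    /* leftover pixels: aim at the last pixel and divide by the real length */
    if (count > 0) {
        float last = (float)(count - 1);
        sdivz += sdivzstepu * last;
        zi += zistepu * last;
        snext = (fixed16_t)(sdivz * (65536.0f / zi));
        sstep = (count > 1) ? (snext - s) / (count - 1) : 0;
        for (i = 0; i < count; i++) {
            *out++ = s;
            s += sstep;
        }
    }
}
```

In the real function the inner loops are unrolled by hand (the 16 macro lines and the fall-through switch), which is where the subdivision pays off.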
frag.machine
Posts: 2126
Joined: Sat Nov 25, 2006 1:49 pm

Re: Subdiv16 for sprites?

Post by frag.machine »

Seriously guys, if I had a time machine I wouldn't bother sending a terminator back in the past to kill Hitler; I'd just throw you on it and hit the coords to 1995, somewhere near Mesquite, Texas... :)
qbism
Posts: 1236
Joined: Thu Nov 04, 2004 5:51 am

Re: Subdiv16 for sprites?

Post by qbism »

Awesome, never thought of that for sprites.
leileilol
Posts: 2783
Joined: Fri Oct 15, 2004 3:23 am

Re: Subdiv16 for sprites?

Post by leileilol »

In engoo's case, it matters because there's a shitload of sprite drawing - for the particles.
mankrip
Posts: 924
Joined: Fri Jul 04, 2008 3:02 am

Re: Subdiv16 for sprites?

Post by mankrip »

Well, I did give my idea a try, and the results weren't as significant as I expected: only 1 fps faster (from 18 to 19) at 800x480 on my slow laptop, when displaying a single zoomed sprite.

The idea is simple: for SPR_VP_PARALLEL SPR models (which is the case for all SPR models in vanilla Quake, as well as Engoo's model-based particles), the value of izi never changes across the sprite, so we can remove its per-pixel update code completely. This also allows us to bitshift its value once, in advance.
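Stripped of the texturing, the difference looks roughly like this. It is a sketch under my own assumptions, not a piece of the file: zfill_stepped and zfill_constant are made-up names, and they only do the depth test and z write so the two variants are easy to compare.

```c
#include <stdint.h>

/* General case: depth changes along the span, so izi is stepped and
   shifted down to z-buffer precision for every pixel. */
int zfill_stepped(int16_t *pz, int count, int izi, int izistep)
{
    int i, written = 0;

    for (i = 0; i < count; i++) {
        int16_t depth = (int16_t)(izi >> 16);
        if (pz[i] <= depth) {
            pz[i] = depth;
            written++;
        }
        izi += izistep;
    }
    return written;
}

/* Constant-depth case (what SPR_VP_PARALLEL allows): shift once before the
   loop and drop the per-pixel "izi += izistep" entirely. */
int zfill_constant(int16_t *pz, int count, int izi)
{
    const int16_t depth = (int16_t)(izi >> 16);
    int i, written = 0;

    for (i = 0; i < count; i++) {
        if (pz[i] <= depth) {
            pz[i] = depth;
            written++;
        }
    }
    return written;
}
```

With izistep equal to 0 both functions touch the same pixels, which is why the shortcut is safe for sprites whose depth is constant. It only saves an add and a shift per pixel, which may be part of why the gain is small.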

For SPR model-based particles, you could also check Z only once, at the particle's origin, like Makaqu's particles do. This way you could eliminate the if (pz[i] <= IZI) check for all pixels.
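A minimal sketch of that single-test idea, with everything in it assumed for illustration: draw_particle_single_ztest is a made-up name, the particle is a flat-colored rectangle, the caller is assumed to have clipped it to the screen already, and the z-buffer is left unwritten. It is not how Makaqu actually does it, only the general shape.

```c
#include <stdint.h>

/* Hypothetical particle blit: one z test at the particle's origin (taken
   here as the center pixel) decides whether the whole w*h block is drawn.
   Larger z-buffer values are nearer, matching the "pz[i] <= IZI" test.
   Returns 1 if the particle was drawn, 0 if it was rejected. */
int draw_particle_single_ztest(uint8_t *dest, const int16_t *zbuf, int pitch,
                               int x, int y, int w, int h,
                               int16_t depth, uint8_t color)
{
    int row, col;
    int cx = x + w / 2;
    int cy = y + h / 2;

    /* something nearer already covers the origin: skip the whole particle */
    if (zbuf[cy * pitch + cx] > depth)
        return 0;

    /* no per-pixel depth test from here on */
    for (row = 0; row < h; row++)
        for (col = 0; col < w; col++)
            dest[(y + row) * pitch + (x + col)] = color;

    return 1;
}
```

The obvious trade-off is that a particle half hidden behind a wall edge gets drawn either completely or not at all, which is usually tolerable for small particles but would show on large sprites.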

Here's the whole file:

Code:

/*
Copyright (C) 1996-1997 Id Software, Inc.

This program is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License
as published by the Free Software Foundation; either version 2
of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA  02111-1307, USA.

*/
// d_sprite.c: software top-level rasterization driver module for drawing
// sprites

#include "quakedef.h"
#include "r_local.h" // mankrip
#include "d_local.h"

#define COLORKEY(i)						if (pz[i] <= IZI) if ( (pcolor = *(pbase + (  s          >> 16) + (  t          >> 16) * cachewidth)) != TRANSPARENT_COLOR) { pdest[i] = translationTable[pcolor]; pz[i] = IZI; }
#define DITHERED_COLORKEY_A(i)			if (pz[i] <= IZI) if ( (pcolor = *(pbase + ( (s  + XY0a) >> 16) + ( (t  + XY1a) >> 16) * cachewidth)) != TRANSPARENT_COLOR) { pdest[i] = translationTable[pcolor]; pz[i] = IZI; }
#define DITHERED_COLORKEY_B(i)			if (pz[i] <= IZI) if ( (pcolor = *(pbase + ( (s  + XY0b) >> 16) + ( (t  + XY1b) >> 16) * cachewidth)) != TRANSPARENT_COLOR) { pdest[i] = translationTable[pcolor]; pz[i] = IZI; }
#define BLEND(i)						if (pz[i] <= IZI) { pdest[i] = colorblendingmap[  translationTable[*(pbase + (  s          >> 16) + (  t          >> 16) * cachewidth)]       + (pdest[i] << 8)]; }
#define DITHERED_BLEND_A(i)				if (pz[i] <= IZI) { pdest[i] = colorblendingmap[  translationTable[*(pbase + ( (s  + XY0a) >> 16) + ( (t  + XY1a) >> 16) * cachewidth)]       + (pdest[i] << 8)]; }
#define DITHERED_BLEND_B(i)				if (pz[i] <= IZI) { pdest[i] = colorblendingmap[  translationTable[*(pbase + ( (s  + XY0b) >> 16) + ( (t  + XY1b) >> 16) * cachewidth)]       + (pdest[i] << 8)]; }
#define BLENDBACKWARDS(i)				if (pz[i] <= IZI) { pdest[i] = colorblendingmap[ (translationTable[*(pbase + (  s          >> 16) + (  t          >> 16) * cachewidth)] << 8) +  pdest[i]      ]; }
#define DITHERED_BLENDBACKWARDS_A(i)	if (pz[i] <= IZI) { pdest[i] = colorblendingmap[ (translationTable[*(pbase + ( (s  + XY0a) >> 16) + ( (t  + XY1a) >> 16) * cachewidth)] << 8) +  pdest[i]      ]; }
#define DITHERED_BLENDBACKWARDS_B(i)	if (pz[i] <= IZI) { pdest[i] = colorblendingmap[ (translationTable[*(pbase + ( (s  + XY0b) >> 16) + ( (t  + XY1b) >> 16) * cachewidth)] << 8) +  pdest[i]      ]; }
extern int dither_kernel[2][2][2];

static msprite_t
	* psprite
	;
static int
	sprite_height
,	minindex
,	maxindex
	;
static sspan_t
	* sprite_spans
,	* pspan
	;
static int
	count
,	spancount
,	izi
,	izistep
// integers for dithering
,	u
,	v
,	idiths
,	iditht
,	idiths2
,	iditht2
,	X
,	Y
,	XY0a
,	XY1a
,	XY0b
,	XY1b
	;
static byte
	* pbase
,	pcolor
,	* pdest
,	btemp // replace with pcolor?
	;
static fixed16_t
	s
,	t
,	snext
,	tnext
,	sstep = 0 // setting an initial value to avoid compiler warning
,	tstep = 0 // setting an initial value to avoid compiler warning
	;
static float
	sdivz
,	tdivz
,	zi
,	z
,	du
,	dv
,	spancountminus1
,	sdivzstepu
,	tdivzstepu
,	zistepu
,	intensity // mankrip
	;
static short
	* pz
	;



/*
=====================
D_SpriteDrawSpans_Dithered_ColorKeyed
=====================
*/
void D_SpriteDrawSpans_Dithered_ColorKeyed (void)
{
	// mankrip - begin
	if (psprite->type == SPR_VP_PARALLEL)
	{
		zi = d_ziorigin + (float) (pspan->v) * d_zistepv + (float) (pspan->u) * d_zistepu;
		z = (float)0x10000 / zi; // prescale to 16.16 fixed-point
		// we count on FP exceptions being turned off to avoid range problems
		izi = (int) (zi * 0x8000 * 0x10000) >> 16;
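		// IZI is expanded by the preprocessor, so this definition applies textually
		// to every macro use below until the next #undef; izi is already shifted here
		#undef IZI
		#define IZI izi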
	// mankrip - end
		do
		{
			// mankrip - begin
			u = pspan->u;
			v = pspan->v;
			du = (float)u;
			dv = (float)v;
			// mankrip - end

			// calculate the initial s/z, t/z, 1/z, s, and t and clamp
			sdivz = d_sdivzorigin + dv * d_sdivzstepv + du * d_sdivzstepu;
			tdivz = d_tdivzorigin + dv * d_tdivzstepv + du * d_tdivzstepu;

			s = (int) (sdivz * z) + sadjust;
			if (s > bbextents)
				s = bbextents;
			else if (s < 0)
				s = 0;

			t = (int) (tdivz * z) + tadjust;
			if (t > bbextentt)
				t = bbextentt;
			else if (t < 0)
				t = 0;

			pdest = (byte *)d_viewbuffer + (screenwidth * v) + u; // mankrip - edited
			pz = d_pzbuffer + (d_zwidth * v) + u; // mankrip - edited

			Y = v & 1; // mankrip
			count		= pspan->count >> 4; // mh
			// mankrip - begin
			spancount	= pspan->count % 16;
			if (count)
			{
				// prepare dither values
				X = ! ( (v + u) & 1);
				XY0a = dither_kernel[X][Y][0];
				XY1a = dither_kernel[X][Y][1];
				XY0b = dither_kernel[!X][Y][0];
				XY1b = dither_kernel[!X][Y][1];

				while (count--)
			// mankrip - end
				{
					// calculate s/z, t/z, zi->fixed s and t at far end of span,
					// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
					sdivz += sdivzstepu;
					tdivz += tdivzstepu;
					// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end

					snext = (int) (sdivz * z) + sadjust;
					if (snext > bbextents)
						snext = bbextents;
					// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
					else if (snext < 16)
						snext = 16;   // prevent round-off error on <0 steps from causing overstepping & running off the edge of the texture
					// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end

					tnext = (int) (tdivz * z) + tadjust;
					if (tnext > bbextentt)
						tnext = bbextentt;
					// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
					else if (tnext < 16)
						tnext = 16;   // guard against round-off error on <0 steps

					// calculate s and t steps across span by shifting
					sstep = (snext - s) >> 4;
					tstep = (tnext - t) >> 4;
					// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end

					// mankrip - begin
					pdest += 16;
					pz += 16;
					DITHERED_COLORKEY_A(-16); s += sstep; t += tstep;
					DITHERED_COLORKEY_B(-15); s += sstep; t += tstep;
					DITHERED_COLORKEY_A(-14); s += sstep; t += tstep;
					DITHERED_COLORKEY_B(-13); s += sstep; t += tstep;
					DITHERED_COLORKEY_A(-12); s += sstep; t += tstep;
					DITHERED_COLORKEY_B(-11); s += sstep; t += tstep;
					DITHERED_COLORKEY_A(-10); s += sstep; t += tstep;
					DITHERED_COLORKEY_B( -9); s += sstep; t += tstep;
					DITHERED_COLORKEY_A( -8); s += sstep; t += tstep;
					DITHERED_COLORKEY_B( -7); s += sstep; t += tstep;
					DITHERED_COLORKEY_A( -6); s += sstep; t += tstep;
					DITHERED_COLORKEY_B( -5); s += sstep; t += tstep;
					DITHERED_COLORKEY_A( -4); s += sstep; t += tstep;
					DITHERED_COLORKEY_B( -3); s += sstep; t += tstep;
					DITHERED_COLORKEY_A( -2); s += sstep; t += tstep;
					DITHERED_COLORKEY_B( -1);
					// mankrip - end

					s = snext;
					t = tnext;
					// mankrip - begin
				}
			}
			if (spancount)
			{
					// mankrip - end

				// calculate s/z, t/z, zi->fixed s and t at last pixel in span (so can't step off polygon),
				// clamp, calculate s and t steps across span by division, biasing steps low so we don't run off the texture

				spancountminus1 = (float) (spancount - 1);
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
				sdivz += d_sdivzstepu * spancountminus1;
				tdivz += d_tdivzstepu * spancountminus1;
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end

				snext = (int) (sdivz * z) + sadjust;
				if (snext > bbextents)
					snext = bbextents;
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
				else if (snext < 16)
					snext = 16;   // prevent round-off error on <0 steps from causing overstepping & running off the edge of the texture
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end

				tnext = (int) (tdivz * z) + tadjust;
				if (tnext > bbextentt)
					tnext = bbextentt;
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
				else if (tnext < 16)
					tnext = 16;   // guard against round-off error on <0 steps
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end

				if (spancount > 1)
				{
					sstep = (snext - s) / (spancount - 1);
					tstep = (tnext - t) / (spancount - 1);
				}

			// mankrip - begin
				// prepare dither values
				X = (v + u) & 1;
				XY0a = dither_kernel[X][Y][0];
				XY1a = dither_kernel[X][Y][1];
				XY0b = dither_kernel[!X][Y][0];
				XY1b = dither_kernel[!X][Y][1];

				pdest += spancount;
				pz += spancount;
				switch (spancount)
				{
					case 16: DITHERED_COLORKEY_A(-16); s += sstep; t += tstep;
					case 15: DITHERED_COLORKEY_B(-15); s += sstep; t += tstep;
					case 14: DITHERED_COLORKEY_A(-14); s += sstep; t += tstep;
					case 13: DITHERED_COLORKEY_B(-13); s += sstep; t += tstep;
					case 12: DITHERED_COLORKEY_A(-12); s += sstep; t += tstep;
					case 11: DITHERED_COLORKEY_B(-11); s += sstep; t += tstep;
					case 10: DITHERED_COLORKEY_A(-10); s += sstep; t += tstep;
					case  9: DITHERED_COLORKEY_B( -9); s += sstep; t += tstep;
					case  8: DITHERED_COLORKEY_A( -8); s += sstep; t += tstep;
					case  7: DITHERED_COLORKEY_B( -7); s += sstep; t += tstep;
					case  6: DITHERED_COLORKEY_A( -6); s += sstep; t += tstep;
					case  5: DITHERED_COLORKEY_B( -5); s += sstep; t += tstep;
					case  4: DITHERED_COLORKEY_A( -4); s += sstep; t += tstep;
					case  3: DITHERED_COLORKEY_B( -3); s += sstep; t += tstep;
					case  2: DITHERED_COLORKEY_A( -2); s += sstep; t += tstep;
					case  1: DITHERED_COLORKEY_B( -1);
					break;
				}
			}
			// mankrip - end
			pspan++;
		} while (pspan->count != DS_SPAN_LIST_END);
	}
	else
		do
		{
			// mankrip - begin
			// other sprite types: izi is stepped per pixel, so it is shifted at each use
			#undef IZI
			#define IZI (izi >> 16)
			u = pspan->u;
			v = pspan->v;
			du = (float)u;
			dv = (float)v;
			// mankrip - end

			// calculate the initial s/z, t/z, 1/z, s, and t and clamp
			sdivz = d_sdivzorigin + dv * d_sdivzstepv + du * d_sdivzstepu;
			tdivz = d_tdivzorigin + dv * d_tdivzstepv + du * d_tdivzstepu;
			zi = d_ziorigin + dv * d_zistepv + du * d_zistepu;
			z = (float)0x10000 / zi; // prescale to 16.16 fixed-point
			// we count on FP exceptions being turned off to avoid range problems // mankrip
			izi = (int) (zi * 0x8000 * 0x10000); // mankrip

			s = (int) (sdivz * z) + sadjust;
			if (s > bbextents)
				s = bbextents;
			else if (s < 0)
				s = 0;

			t = (int) (tdivz * z) + tadjust;
			if (t > bbextentt)
				t = bbextentt;
			else if (t < 0)
				t = 0;

			pdest = (byte *)d_viewbuffer + (screenwidth * v) + u; // mankrip - edited
			pz = d_pzbuffer + (d_zwidth * v) + u; // mankrip - edited

			Y = v & 1; // mankrip
			count		= pspan->count >> 4; // mh
			// mankrip - begin
			spancount	= pspan->count % 16;
			if (count)
			{
				// prepare dither values
				X = ! ( (v + u) & 1);
				XY0a = dither_kernel[X][Y][0];
				XY1a = dither_kernel[X][Y][1];
				XY0b = dither_kernel[!X][Y][0];
				XY1b = dither_kernel[!X][Y][1];

				while (count--)
			// mankrip - end
				{
					// calculate s/z, t/z, zi->fixed s and t at far end of span,
					// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
					sdivz += sdivzstepu;
					tdivz += tdivzstepu;
					zi += zistepu;
					// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end
					z = (float)0x10000 / zi;	// prescale to 16.16 fixed-point
					// we count on FP exceptions being turned off to avoid range problems // mankrip
				//	izi = (int) (zi * 0x8000 * 0x10000); // mankrip

					snext = (int) (sdivz * z) + sadjust;
					if (snext > bbextents)
						snext = bbextents;
					// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
					else if (snext < 16)
						snext = 16;   // prevent round-off error on <0 steps from causing overstepping & running off the edge of the texture
					// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end

					tnext = (int) (tdivz * z) + tadjust;
					if (tnext > bbextentt)
						tnext = bbextentt;
					// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
					else if (tnext < 16)
						tnext = 16;   // guard against round-off error on <0 steps

					// calculate s and t steps across span by shifting
					sstep = (snext - s) >> 4;
					tstep = (tnext - t) >> 4;
					// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end

					// mankrip - begin
					pdest += 16;
					pz += 16;
					DITHERED_COLORKEY_A(-16); izi += izistep; s += sstep; t += tstep;
					DITHERED_COLORKEY_B(-15); izi += izistep; s += sstep; t += tstep;
					DITHERED_COLORKEY_A(-14); izi += izistep; s += sstep; t += tstep;
					DITHERED_COLORKEY_B(-13); izi += izistep; s += sstep; t += tstep;
					DITHERED_COLORKEY_A(-12); izi += izistep; s += sstep; t += tstep;
					DITHERED_COLORKEY_B(-11); izi += izistep; s += sstep; t += tstep;
					DITHERED_COLORKEY_A(-10); izi += izistep; s += sstep; t += tstep;
					DITHERED_COLORKEY_B( -9); izi += izistep; s += sstep; t += tstep;
					DITHERED_COLORKEY_A( -8); izi += izistep; s += sstep; t += tstep;
					DITHERED_COLORKEY_B( -7); izi += izistep; s += sstep; t += tstep;
					DITHERED_COLORKEY_A( -6); izi += izistep; s += sstep; t += tstep;
					DITHERED_COLORKEY_B( -5); izi += izistep; s += sstep; t += tstep;
					DITHERED_COLORKEY_A( -4); izi += izistep; s += sstep; t += tstep;
					DITHERED_COLORKEY_B( -3); izi += izistep; s += sstep; t += tstep;
					DITHERED_COLORKEY_A( -2); izi += izistep; s += sstep; t += tstep;
					DITHERED_COLORKEY_B( -1); izi += izistep;
					// mankrip - end

					s = snext;
					t = tnext;
					// mankrip - begin
				}
			}
			if (spancount)
			{
					// mankrip - end

				// calculate s/z, t/z, zi->fixed s and t at last pixel in span (so can't step off polygon),
				// clamp, calculate s and t steps across span by division, biasing steps low so we don't run off the texture

				spancountminus1 = (float) (spancount - 1);
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
				sdivz += d_sdivzstepu * spancountminus1;
				tdivz += d_tdivzstepu * spancountminus1;
				zi += d_zistepu * spancountminus1;
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end
				z = (float)0x10000 / zi;	// prescale to 16.16 fixed-point
				// we count on FP exceptions being turned off to avoid range problems // mankrip
			//	izi = (int) (zi * 0x8000 * 0x10000); // mankrip

				snext = (int) (sdivz * z) + sadjust;
				if (snext > bbextents)
					snext = bbextents;
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
				else if (snext < 16)
					snext = 16;   // prevent round-off error on <0 steps from causing overstepping & running off the edge of the texture
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end

				tnext = (int) (tdivz * z) + tadjust;
				if (tnext > bbextentt)
					tnext = bbextentt;
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
				else if (tnext < 16)
					tnext = 16;   // guard against round-off error on <0 steps
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end

				if (spancount > 1)
				{
					sstep = (snext - s) / (spancount - 1);
					tstep = (tnext - t) / (spancount - 1);
				}

			// mankrip - begin
				// prepare dither values
				X = (v + u) & 1;
				XY0a = dither_kernel[X][Y][0];
				XY1a = dither_kernel[X][Y][1];
				XY0b = dither_kernel[!X][Y][0];
				XY1b = dither_kernel[!X][Y][1];

				pdest += spancount;
				pz += spancount;
				switch (spancount)
				{
					case 16: DITHERED_COLORKEY_A(-16); izi += izistep; s += sstep; t += tstep;
					case 15: DITHERED_COLORKEY_B(-15); izi += izistep; s += sstep; t += tstep;
					case 14: DITHERED_COLORKEY_A(-14); izi += izistep; s += sstep; t += tstep;
					case 13: DITHERED_COLORKEY_B(-13); izi += izistep; s += sstep; t += tstep;
					case 12: DITHERED_COLORKEY_A(-12); izi += izistep; s += sstep; t += tstep;
					case 11: DITHERED_COLORKEY_B(-11); izi += izistep; s += sstep; t += tstep;
					case 10: DITHERED_COLORKEY_A(-10); izi += izistep; s += sstep; t += tstep;
					case  9: DITHERED_COLORKEY_B( -9); izi += izistep; s += sstep; t += tstep;
					case  8: DITHERED_COLORKEY_A( -8); izi += izistep; s += sstep; t += tstep;
					case  7: DITHERED_COLORKEY_B( -7); izi += izistep; s += sstep; t += tstep;
					case  6: DITHERED_COLORKEY_A( -6); izi += izistep; s += sstep; t += tstep;
					case  5: DITHERED_COLORKEY_B( -5); izi += izistep; s += sstep; t += tstep;
					case  4: DITHERED_COLORKEY_A( -4); izi += izistep; s += sstep; t += tstep;
					case  3: DITHERED_COLORKEY_B( -3); izi += izistep; s += sstep; t += tstep;
					case  2: DITHERED_COLORKEY_A( -2); izi += izistep; s += sstep; t += tstep;
					case  1: DITHERED_COLORKEY_B( -1);
					break;
				}
			}
			// mankrip - end
			pspan++;
		} while (pspan->count != DS_SPAN_LIST_END);
}

void D_SpriteDrawSpans_Dithered_Blend (void)
{
	// mankrip - begin
	if (psprite->type == SPR_VP_PARALLEL)
	{
		zi = d_ziorigin + (float) (pspan->v) * d_zistepv + (float) (pspan->u) * d_zistepu;
		z = (float)0x10000 / zi; // prescale to 16.16 fixed-point
		// we count on FP exceptions being turned off to avoid range problems
		izi = (int) (zi * 0x8000 * 0x10000) >> 16;
		#undef IZI
		#define IZI izi
	// mankrip - end
		do
		{
			// mankrip - begin
			u = pspan->u;
			v = pspan->v;
			du = (float)u;
			dv = (float)v;
			// mankrip - end

			// calculate the initial s/z, t/z, 1/z, s, and t and clamp
			sdivz = d_sdivzorigin + dv * d_sdivzstepv + du * d_sdivzstepu;
			tdivz = d_tdivzorigin + dv * d_tdivzstepv + du * d_tdivzstepu;

			s = (int) (sdivz * z) + sadjust;
			if (s > bbextents)
				s = bbextents;
			else if (s < 0)
				s = 0;

			t = (int) (tdivz * z) + tadjust;
			if (t > bbextentt)
				t = bbextentt;
			else if (t < 0)
				t = 0;

			pdest = (byte *)d_viewbuffer + (screenwidth * v) + u; // mankrip - edited
			pz = d_pzbuffer + (d_zwidth * v) + u; // mankrip - edited

			Y = v & 1; // mankrip
			count		= pspan->count >> 4; // mh
			// mankrip - begin
			spancount	= pspan->count % 16;
			if (count)
			{
				// prepare dither values
				X = ! ( (v + u) & 1);
				XY0a = dither_kernel[X][Y][0];
				XY1a = dither_kernel[X][Y][1];
				XY0b = dither_kernel[!X][Y][0];
				XY1b = dither_kernel[!X][Y][1];

				while (count--)
			// mankrip - end
				{
					// calculate s/z, t/z, zi->fixed s and t at far end of span,
					// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
					sdivz += sdivzstepu;
					tdivz += tdivzstepu;
					// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end

					snext = (int) (sdivz * z) + sadjust;
					if (snext > bbextents)
						snext = bbextents;
					// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
					else if (snext < 16)
						snext = 16;   // prevent round-off error on <0 steps from causing overstepping & running off the edge of the texture
					// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end

					tnext = (int) (tdivz * z) + tadjust;
					if (tnext > bbextentt)
						tnext = bbextentt;
					// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
					else if (tnext < 16)
						tnext = 16;   // guard against round-off error on <0 steps

					// calculate s and t steps across span by shifting
					sstep = (snext - s) >> 4;
					tstep = (tnext - t) >> 4;
					// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end

					// mankrip - begin
					pdest += 16;
					pz += 16;
					DITHERED_BLEND_A(-16); s += sstep; t += tstep;
					DITHERED_BLEND_B(-15); s += sstep; t += tstep;
					DITHERED_BLEND_A(-14); s += sstep; t += tstep;
					DITHERED_BLEND_B(-13); s += sstep; t += tstep;
					DITHERED_BLEND_A(-12); s += sstep; t += tstep;
					DITHERED_BLEND_B(-11); s += sstep; t += tstep;
					DITHERED_BLEND_A(-10); s += sstep; t += tstep;
					DITHERED_BLEND_B( -9); s += sstep; t += tstep;
					DITHERED_BLEND_A( -8); s += sstep; t += tstep;
					DITHERED_BLEND_B( -7); s += sstep; t += tstep;
					DITHERED_BLEND_A( -6); s += sstep; t += tstep;
					DITHERED_BLEND_B( -5); s += sstep; t += tstep;
					DITHERED_BLEND_A( -4); s += sstep; t += tstep;
					DITHERED_BLEND_B( -3); s += sstep; t += tstep;
					DITHERED_BLEND_A( -2); s += sstep; t += tstep;
					DITHERED_BLEND_B( -1);
					// mankrip - end

					s = snext;
					t = tnext;
					// mankrip - begin
				}
			}
			if (spancount)
			{
					// mankrip - end

				// calculate s/z, t/z, zi->fixed s and t at last pixel in span (so can't step off polygon),
				// clamp, calculate s and t steps across span by division, biasing steps low so we don't run off the texture

				spancountminus1 = (float) (spancount - 1);
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
				sdivz += d_sdivzstepu * spancountminus1;
				tdivz += d_tdivzstepu * spancountminus1;
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end

				snext = (int) (sdivz * z) + sadjust;
				if (snext > bbextents)
					snext = bbextents;
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
				else if (snext < 16)
					snext = 16;   // prevent round-off error on <0 steps from causing overstepping & running off the edge of the texture
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end

				tnext = (int) (tdivz * z) + tadjust;
				if (tnext > bbextentt)
					tnext = bbextentt;
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
				else if (tnext < 16)
					tnext = 16;   // guard against round-off error on <0 steps
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end

				if (spancount > 1)
				{
					sstep = (snext - s) / (spancount - 1);
					tstep = (tnext - t) / (spancount - 1);
				}

			// mankrip - begin
				// prepare dither values
				X = (v + u) & 1;
				XY0a = dither_kernel[X][Y][0];
				XY1a = dither_kernel[X][Y][1];
				XY0b = dither_kernel[!X][Y][0];
				XY1b = dither_kernel[!X][Y][1];

				pdest += spancount;
				pz += spancount;
				switch (spancount)
				{
					case 16: DITHERED_BLEND_A(-16); s += sstep; t += tstep;
					case 15: DITHERED_BLEND_B(-15); s += sstep; t += tstep;
					case 14: DITHERED_BLEND_A(-14); s += sstep; t += tstep;
					case 13: DITHERED_BLEND_B(-13); s += sstep; t += tstep;
					case 12: DITHERED_BLEND_A(-12); s += sstep; t += tstep;
					case 11: DITHERED_BLEND_B(-11); s += sstep; t += tstep;
					case 10: DITHERED_BLEND_A(-10); s += sstep; t += tstep;
					case  9: DITHERED_BLEND_B( -9); s += sstep; t += tstep;
					case  8: DITHERED_BLEND_A( -8); s += sstep; t += tstep;
					case  7: DITHERED_BLEND_B( -7); s += sstep; t += tstep;
					case  6: DITHERED_BLEND_A( -6); s += sstep; t += tstep;
					case  5: DITHERED_BLEND_B( -5); s += sstep; t += tstep;
					case  4: DITHERED_BLEND_A( -4); s += sstep; t += tstep;
					case  3: DITHERED_BLEND_B( -3); s += sstep; t += tstep;
					case  2: DITHERED_BLEND_A( -2); s += sstep; t += tstep;
					case  1: DITHERED_BLEND_B( -1);
					break;
				}
			}
			// mankrip - end
			pspan++;
		} while (pspan->count != DS_SPAN_LIST_END);
	}
	else
		do
		{
			// mankrip - begin
			#undef IZI
			#define IZI (izi >> 16)
			u = pspan->u;
			v = pspan->v;
			du = (float)u;
			dv = (float)v;
			// mankrip - end

			// calculate the initial s/z, t/z, 1/z, s, and t and clamp
			sdivz = d_sdivzorigin + dv * d_sdivzstepv + du * d_sdivzstepu;
			tdivz = d_tdivzorigin + dv * d_tdivzstepv + du * d_tdivzstepu;
			zi = d_ziorigin + dv * d_zistepv + du * d_zistepu;
			z = (float)0x10000 / zi; // prescale to 16.16 fixed-point
			// we count on FP exceptions being turned off to avoid range problems // mankrip
			izi = (int) (zi * 0x8000 * 0x10000); // mankrip

			s = (int) (sdivz * z) + sadjust;
			if (s > bbextents)
				s = bbextents;
			else if (s < 0)
				s = 0;

			t = (int) (tdivz * z) + tadjust;
			if (t > bbextentt)
				t = bbextentt;
			else if (t < 0)
				t = 0;

			pdest = (byte *)d_viewbuffer + (screenwidth * v) + u; // mankrip - edited
			pz = d_pzbuffer + (d_zwidth * v) + u; // mankrip - edited

			Y = v & 1; // mankrip
			count		= pspan->count >> 4; // mh
			// mankrip - begin
			spancount	= pspan->count % 16;
			if (count)
			{
				// prepare dither values
				X = ! ( (v + u) & 1);
				XY0a = dither_kernel[X][Y][0];
				XY1a = dither_kernel[X][Y][1];
				XY0b = dither_kernel[!X][Y][0];
				XY1b = dither_kernel[!X][Y][1];

				while (count--)
			// mankrip - end
				{
					// calculate s/z, t/z, zi->fixed s and t at far end of span,
					// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
					sdivz += sdivzstepu;
					tdivz += tdivzstepu;
					zi += zistepu;
					// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end
					z = (float)0x10000 / zi;	// prescale to 16.16 fixed-point
					// we count on FP exceptions being turned off to avoid range problems // mankrip
				//	izi = (int) (zi * 0x8000 * 0x10000); // mankrip

					snext = (int) (sdivz * z) + sadjust;
					if (snext > bbextents)
						snext = bbextents;
					// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
					else if (snext < 16)
						snext = 16;   // prevent round-off error on <0 steps from causing overstepping & running off the edge of the texture
					// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end

					tnext = (int) (tdivz * z) + tadjust;
					if (tnext > bbextentt)
						tnext = bbextentt;
					// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
					else if (tnext < 16)
						tnext = 16;   // guard against round-off error on <0 steps

					// calculate s and t steps across span by shifting
					sstep = (snext - s) >> 4;
					tstep = (tnext - t) >> 4;
					// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end

					// mankrip - begin
					pdest += 16;
					pz += 16;
					DITHERED_BLEND_A(-16); izi += izistep; s += sstep; t += tstep;
					DITHERED_BLEND_B(-15); izi += izistep; s += sstep; t += tstep;
					DITHERED_BLEND_A(-14); izi += izistep; s += sstep; t += tstep;
					DITHERED_BLEND_B(-13); izi += izistep; s += sstep; t += tstep;
					DITHERED_BLEND_A(-12); izi += izistep; s += sstep; t += tstep;
					DITHERED_BLEND_B(-11); izi += izistep; s += sstep; t += tstep;
					DITHERED_BLEND_A(-10); izi += izistep; s += sstep; t += tstep;
					DITHERED_BLEND_B( -9); izi += izistep; s += sstep; t += tstep;
					DITHERED_BLEND_A( -8); izi += izistep; s += sstep; t += tstep;
					DITHERED_BLEND_B( -7); izi += izistep; s += sstep; t += tstep;
					DITHERED_BLEND_A( -6); izi += izistep; s += sstep; t += tstep;
					DITHERED_BLEND_B( -5); izi += izistep; s += sstep; t += tstep;
					DITHERED_BLEND_A( -4); izi += izistep; s += sstep; t += tstep;
					DITHERED_BLEND_B( -3); izi += izistep; s += sstep; t += tstep;
					DITHERED_BLEND_A( -2); izi += izistep; s += sstep; t += tstep;
					DITHERED_BLEND_B( -1); izi += izistep;
					// mankrip - end

					s = snext;
					t = tnext;
					// mankrip - begin
				}
			}
			if (spancount)
			{
					// mankrip - end

				// calculate s/z, t/z, zi->fixed s and t at last pixel in span (so can't step off polygon),
				// clamp, calculate s and t steps across span by division, biasing steps low so we don't run off the texture

				spancountminus1 = (float) (spancount - 1);
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
				sdivz += d_sdivzstepu * spancountminus1;
				tdivz += d_tdivzstepu * spancountminus1;
				zi += d_zistepu * spancountminus1;
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end
				z = (float)0x10000 / zi;	// prescale to 16.16 fixed-point
				// we count on FP exceptions being turned off to avoid range problems // mankrip
			//	izi = (int) (zi * 0x8000 * 0x10000); // mankrip

				snext = (int) (sdivz * z) + sadjust;
				if (snext > bbextents)
					snext = bbextents;
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
				else if (snext < 16)
					snext = 16;   // prevent round-off error on <0 steps from causing overstepping & running off the edge of the texture
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end

				tnext = (int) (tdivz * z) + tadjust;
				if (tnext > bbextentt)
					tnext = bbextentt;
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
				else if (tnext < 16)
					tnext = 16;   // guard against round-off error on <0 steps
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end

				if (spancount > 1)
				{
					sstep = (snext - s) / (spancount - 1);
					tstep = (tnext - t) / (spancount - 1);
				}

			// mankrip - begin
				// prepare dither values
				X = (v + u) & 1;
				XY0a = dither_kernel[X][Y][0];
				XY1a = dither_kernel[X][Y][1];
				XY0b = dither_kernel[!X][Y][0];
				XY1b = dither_kernel[!X][Y][1];

				pdest += spancount;
				pz += spancount;
				switch (spancount)
				{
					case 16: DITHERED_BLEND_A(-16); izi += izistep; s += sstep; t += tstep;
					case 15: DITHERED_BLEND_B(-15); izi += izistep; s += sstep; t += tstep;
					case 14: DITHERED_BLEND_A(-14); izi += izistep; s += sstep; t += tstep;
					case 13: DITHERED_BLEND_B(-13); izi += izistep; s += sstep; t += tstep;
					case 12: DITHERED_BLEND_A(-12); izi += izistep; s += sstep; t += tstep;
					case 11: DITHERED_BLEND_B(-11); izi += izistep; s += sstep; t += tstep;
					case 10: DITHERED_BLEND_A(-10); izi += izistep; s += sstep; t += tstep;
					case  9: DITHERED_BLEND_B( -9); izi += izistep; s += sstep; t += tstep;
					case  8: DITHERED_BLEND_A( -8); izi += izistep; s += sstep; t += tstep;
					case  7: DITHERED_BLEND_B( -7); izi += izistep; s += sstep; t += tstep;
					case  6: DITHERED_BLEND_A( -6); izi += izistep; s += sstep; t += tstep;
					case  5: DITHERED_BLEND_B( -5); izi += izistep; s += sstep; t += tstep;
					case  4: DITHERED_BLEND_A( -4); izi += izistep; s += sstep; t += tstep;
					case  3: DITHERED_BLEND_B( -3); izi += izistep; s += sstep; t += tstep;
					case  2: DITHERED_BLEND_A( -2); izi += izistep; s += sstep; t += tstep;
					case  1: DITHERED_BLEND_B( -1);
					break;
				}
			}
			// mankrip - end
			pspan++;
		} while (pspan->count != DS_SPAN_LIST_END);
}

#ifndef _arch_dreamcast // no backwards blending on the DC
void D_SpriteDrawSpans_Dithered_BlendBackwards (void)
{
	// mankrip - begin
	if (psprite->type == SPR_VP_PARALLEL)
	{
		zi = d_ziorigin + (float) (pspan->v) * d_zistepv + (float) (pspan->u) * d_zistepu;
		z = (float)0x10000 / zi; // prescale to 16.16 fixed-point
		// we count on FP exceptions being turned off to avoid range problems
		izi = (int) (zi * 0x8000 * 0x10000) >> 16;
		#undef IZI
		#define IZI izi
	// mankrip - end
		do
		{
			// mankrip - begin
			u = pspan->u;
			v = pspan->v;
			du = (float)u;
			dv = (float)v;
			// mankrip - end

			// calculate the initial s/z, t/z, 1/z, s, and t and clamp
			sdivz = d_sdivzorigin + dv * d_sdivzstepv + du * d_sdivzstepu;
			tdivz = d_tdivzorigin + dv * d_tdivzstepv + du * d_tdivzstepu;

			s = (int) (sdivz * z) + sadjust;
			if (s > bbextents)
				s = bbextents;
			else if (s < 0)
				s = 0;

			t = (int) (tdivz * z) + tadjust;
			if (t > bbextentt)
				t = bbextentt;
			else if (t < 0)
				t = 0;

			pdest = (byte *)d_viewbuffer + (screenwidth * v) + u; // mankrip - edited
			pz = d_pzbuffer + (d_zwidth * v) + u; // mankrip - edited

			Y = v & 1; // mankrip
			count		= pspan->count >> 4; // mh
			// mankrip - begin
			spancount	= pspan->count % 16;
			if (count)
			{
				// prepare dither values
				X = ! ( (v + u) & 1);
				XY0a = dither_kernel[X][Y][0];
				XY1a = dither_kernel[X][Y][1];
				XY0b = dither_kernel[!X][Y][0];
				XY1b = dither_kernel[!X][Y][1];

				while (count--)
			// mankrip - end
				{
					// calculate s/z, t/z, zi->fixed s and t at far end of span,
					// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
					sdivz += sdivzstepu;
					tdivz += tdivzstepu;
					// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end

					snext = (int) (sdivz * z) + sadjust;
					if (snext > bbextents)
						snext = bbextents;
					// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
					else if (snext < 16)
						snext = 16;   // prevent round-off error on <0 steps from causing overstepping & running off the edge of the texture
					// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end

					tnext = (int) (tdivz * z) + tadjust;
					if (tnext > bbextentt)
						tnext = bbextentt;
					// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
					else if (tnext < 16)
						tnext = 16;   // guard against round-off error on <0 steps

					// calculate s and t steps across span by shifting
					sstep = (snext - s) >> 4;
					tstep = (tnext - t) >> 4;
					// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end

					// mankrip - begin
					pdest += 16;
					pz += 16;
					DITHERED_BLENDBACKWARDS_A(-16); s += sstep; t += tstep;
					DITHERED_BLENDBACKWARDS_B(-15); s += sstep; t += tstep;
					DITHERED_BLENDBACKWARDS_A(-14); s += sstep; t += tstep;
					DITHERED_BLENDBACKWARDS_B(-13); s += sstep; t += tstep;
					DITHERED_BLENDBACKWARDS_A(-12); s += sstep; t += tstep;
					DITHERED_BLENDBACKWARDS_B(-11); s += sstep; t += tstep;
					DITHERED_BLENDBACKWARDS_A(-10); s += sstep; t += tstep;
					DITHERED_BLENDBACKWARDS_B( -9); s += sstep; t += tstep;
					DITHERED_BLENDBACKWARDS_A( -8); s += sstep; t += tstep;
					DITHERED_BLENDBACKWARDS_B( -7); s += sstep; t += tstep;
					DITHERED_BLENDBACKWARDS_A( -6); s += sstep; t += tstep;
					DITHERED_BLENDBACKWARDS_B( -5); s += sstep; t += tstep;
					DITHERED_BLENDBACKWARDS_A( -4); s += sstep; t += tstep;
					DITHERED_BLENDBACKWARDS_B( -3); s += sstep; t += tstep;
					DITHERED_BLENDBACKWARDS_A( -2); s += sstep; t += tstep;
					DITHERED_BLENDBACKWARDS_B( -1);
					// mankrip - end

					s = snext;
					t = tnext;
					// mankrip - begin
				}
			}
			if (spancount)
			{
					// mankrip - end

				// calculate s/z, t/z, zi->fixed s and t at last pixel in span (so can't step off polygon),
				// clamp, calculate s and t steps across span by division, biasing steps low so we don't run off the texture

				spancountminus1 = (float) (spancount - 1);
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
				sdivz += d_sdivzstepu * spancountminus1;
				tdivz += d_tdivzstepu * spancountminus1;
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end

				snext = (int) (sdivz * z) + sadjust;
				if (snext > bbextents)
					snext = bbextents;
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
				else if (snext < 16)
					snext = 16;   // prevent round-off error on <0 steps from causing overstepping & running off the edge of the texture
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end

				tnext = (int) (tdivz * z) + tadjust;
				if (tnext > bbextentt)
					tnext = bbextentt;
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
				else if (tnext < 16)
					tnext = 16;   // guard against round-off error on <0 steps
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end

				if (spancount > 1)
				{
					sstep = (snext - s) / (spancount - 1);
					tstep = (tnext - t) / (spancount - 1);
				}

			// mankrip - begin
				// prepare dither values
				X = (v + u) & 1;
				XY0a = dither_kernel[X][Y][0];
				XY1a = dither_kernel[X][Y][1];
				XY0b = dither_kernel[!X][Y][0];
				XY1b = dither_kernel[!X][Y][1];

				pdest += spancount;
				pz += spancount;
				switch (spancount)
				{
					case 16: DITHERED_BLENDBACKWARDS_A(-16); s += sstep; t += tstep;
					case 15: DITHERED_BLENDBACKWARDS_B(-15); s += sstep; t += tstep;
					case 14: DITHERED_BLENDBACKWARDS_A(-14); s += sstep; t += tstep;
					case 13: DITHERED_BLENDBACKWARDS_B(-13); s += sstep; t += tstep;
					case 12: DITHERED_BLENDBACKWARDS_A(-12); s += sstep; t += tstep;
					case 11: DITHERED_BLENDBACKWARDS_B(-11); s += sstep; t += tstep;
					case 10: DITHERED_BLENDBACKWARDS_A(-10); s += sstep; t += tstep;
					case  9: DITHERED_BLENDBACKWARDS_B( -9); s += sstep; t += tstep;
					case  8: DITHERED_BLENDBACKWARDS_A( -8); s += sstep; t += tstep;
					case  7: DITHERED_BLENDBACKWARDS_B( -7); s += sstep; t += tstep;
					case  6: DITHERED_BLENDBACKWARDS_A( -6); s += sstep; t += tstep;
					case  5: DITHERED_BLENDBACKWARDS_B( -5); s += sstep; t += tstep;
					case  4: DITHERED_BLENDBACKWARDS_A( -4); s += sstep; t += tstep;
					case  3: DITHERED_BLENDBACKWARDS_B( -3); s += sstep; t += tstep;
					case  2: DITHERED_BLENDBACKWARDS_A( -2); s += sstep; t += tstep;
					case  1: DITHERED_BLENDBACKWARDS_B( -1);
					break;
				}
			}
			// mankrip - end
			pspan++;
		} while (pspan->count != DS_SPAN_LIST_END);
	}
	else
		do
		{
			// mankrip - begin
			#undef IZI
			#define IZI (izi >> 16)
			u = pspan->u;
			v = pspan->v;
			du = (float)u;
			dv = (float)v;
			// mankrip - end

			// calculate the initial s/z, t/z, 1/z, s, and t and clamp
			sdivz = d_sdivzorigin + dv * d_sdivzstepv + du * d_sdivzstepu;
			tdivz = d_tdivzorigin + dv * d_tdivzstepv + du * d_tdivzstepu;
			zi = d_ziorigin + dv * d_zistepv + du * d_zistepu;
			z = (float)0x10000 / zi; // prescale to 16.16 fixed-point
			// we count on FP exceptions being turned off to avoid range problems // mankrip
			izi = (int) (zi * 0x8000 * 0x10000); // mankrip

			s = (int) (sdivz * z) + sadjust;
			if (s > bbextents)
				s = bbextents;
			else if (s < 0)
				s = 0;

			t = (int) (tdivz * z) + tadjust;
			if (t > bbextentt)
				t = bbextentt;
			else if (t < 0)
				t = 0;

			pdest = (byte *)d_viewbuffer + (screenwidth * v) + u; // mankrip - edited
			pz = d_pzbuffer + (d_zwidth * v) + u; // mankrip - edited

			Y = v & 1; // mankrip
			count		= pspan->count >> 4; // mh
			// mankrip - begin
			spancount	= pspan->count % 16;
			if (count)
			{
				// prepare dither values
				X = ! ( (v + u) & 1);
				XY0a = dither_kernel[X][Y][0];
				XY1a = dither_kernel[X][Y][1];
				XY0b = dither_kernel[!X][Y][0];
				XY1b = dither_kernel[!X][Y][1];

				while (count--)
			// mankrip - end
				{
					// calculate s/z, t/z, zi->fixed s and t at far end of span,
					// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
					sdivz += sdivzstepu;
					tdivz += tdivzstepu;
					zi += zistepu;
					// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end
					z = (float)0x10000 / zi;	// prescale to 16.16 fixed-point
					// we count on FP exceptions being turned off to avoid range problems // mankrip
				//	izi = (int) (zi * 0x8000 * 0x10000); // mankrip

					snext = (int) (sdivz * z) + sadjust;
					if (snext > bbextents)
						snext = bbextents;
					// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
					else if (snext < 16)
						snext = 16;   // prevent round-off error on <0 steps from causing overstepping & running off the edge of the texture
					// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end

					tnext = (int) (tdivz * z) + tadjust;
					if (tnext > bbextentt)
						tnext = bbextentt;
					// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
					else if (tnext < 16)
						tnext = 16;   // guard against round-off error on <0 steps

					// calculate s and t steps across span by shifting
					sstep = (snext - s) >> 4;
					tstep = (tnext - t) >> 4;
					// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end

					// mankrip - begin
					pdest += 16;
					pz += 16;
					DITHERED_BLENDBACKWARDS_A(-16); izi += izistep; s += sstep; t += tstep;
					DITHERED_BLENDBACKWARDS_B(-15); izi += izistep; s += sstep; t += tstep;
					DITHERED_BLENDBACKWARDS_A(-14); izi += izistep; s += sstep; t += tstep;
					DITHERED_BLENDBACKWARDS_B(-13); izi += izistep; s += sstep; t += tstep;
					DITHERED_BLENDBACKWARDS_A(-12); izi += izistep; s += sstep; t += tstep;
					DITHERED_BLENDBACKWARDS_B(-11); izi += izistep; s += sstep; t += tstep;
					DITHERED_BLENDBACKWARDS_A(-10); izi += izistep; s += sstep; t += tstep;
					DITHERED_BLENDBACKWARDS_B( -9); izi += izistep; s += sstep; t += tstep;
					DITHERED_BLENDBACKWARDS_A( -8); izi += izistep; s += sstep; t += tstep;
					DITHERED_BLENDBACKWARDS_B( -7); izi += izistep; s += sstep; t += tstep;
					DITHERED_BLENDBACKWARDS_A( -6); izi += izistep; s += sstep; t += tstep;
					DITHERED_BLENDBACKWARDS_B( -5); izi += izistep; s += sstep; t += tstep;
					DITHERED_BLENDBACKWARDS_A( -4); izi += izistep; s += sstep; t += tstep;
					DITHERED_BLENDBACKWARDS_B( -3); izi += izistep; s += sstep; t += tstep;
					DITHERED_BLENDBACKWARDS_A( -2); izi += izistep; s += sstep; t += tstep;
					DITHERED_BLENDBACKWARDS_B( -1); izi += izistep;
					// mankrip - end

					s = snext;
					t = tnext;
					// mankrip - begin
				}
			}
			if (spancount)
			{
					// mankrip - end

				// calculate s/z, t/z, zi->fixed s and t at last pixel in span (so can't step off polygon),
				// clamp, calculate s and t steps across span by division, biasing steps low so we don't run off the texture

				spancountminus1 = (float) (spancount - 1);
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
				sdivz += d_sdivzstepu * spancountminus1;
				tdivz += d_tdivzstepu * spancountminus1;
				zi += d_zistepu * spancountminus1;
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end
				z = (float)0x10000 / zi;	// prescale to 16.16 fixed-point
				// we count on FP exceptions being turned off to avoid range problems // mankrip
			//	izi = (int) (zi * 0x8000 * 0x10000); // mankrip

				snext = (int) (sdivz * z) + sadjust;
				if (snext > bbextents)
					snext = bbextents;
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
				else if (snext < 16)
					snext = 16;   // prevent round-off error on <0 steps from causing overstepping & running off the edge of the texture
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end

				tnext = (int) (tdivz * z) + tadjust;
				if (tnext > bbextentt)
					tnext = bbextentt;
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
				else if (tnext < 16)
					tnext = 16;   // guard against round-off error on <0 steps
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end

				if (spancount > 1)
				{
					sstep = (snext - s) / (spancount - 1);
					tstep = (tnext - t) / (spancount - 1);
				}

			// mankrip - begin
				// prepare dither values
				X = (v + u) & 1;
				XY0a = dither_kernel[X][Y][0];
				XY1a = dither_kernel[X][Y][1];
				XY0b = dither_kernel[!X][Y][0];
				XY1b = dither_kernel[!X][Y][1];

				pdest += spancount;
				pz += spancount;
				switch (spancount)
				{
					case 16: DITHERED_BLENDBACKWARDS_A(-16); izi += izistep; s += sstep; t += tstep;
					case 15: DITHERED_BLENDBACKWARDS_B(-15); izi += izistep; s += sstep; t += tstep;
					case 14: DITHERED_BLENDBACKWARDS_A(-14); izi += izistep; s += sstep; t += tstep;
					case 13: DITHERED_BLENDBACKWARDS_B(-13); izi += izistep; s += sstep; t += tstep;
					case 12: DITHERED_BLENDBACKWARDS_A(-12); izi += izistep; s += sstep; t += tstep;
					case 11: DITHERED_BLENDBACKWARDS_B(-11); izi += izistep; s += sstep; t += tstep;
					case 10: DITHERED_BLENDBACKWARDS_A(-10); izi += izistep; s += sstep; t += tstep;
					case  9: DITHERED_BLENDBACKWARDS_B( -9); izi += izistep; s += sstep; t += tstep;
					case  8: DITHERED_BLENDBACKWARDS_A( -8); izi += izistep; s += sstep; t += tstep;
					case  7: DITHERED_BLENDBACKWARDS_B( -7); izi += izistep; s += sstep; t += tstep;
					case  6: DITHERED_BLENDBACKWARDS_A( -6); izi += izistep; s += sstep; t += tstep;
					case  5: DITHERED_BLENDBACKWARDS_B( -5); izi += izistep; s += sstep; t += tstep;
					case  4: DITHERED_BLENDBACKWARDS_A( -4); izi += izistep; s += sstep; t += tstep;
					case  3: DITHERED_BLENDBACKWARDS_B( -3); izi += izistep; s += sstep; t += tstep;
					case  2: DITHERED_BLENDBACKWARDS_A( -2); izi += izistep; s += sstep; t += tstep;
					case  1: DITHERED_BLENDBACKWARDS_B( -1);
					break;
				}
			}
			// mankrip - end
			pspan++;
		} while (pspan->count != DS_SPAN_LIST_END);
}
#endif // #ifndef _arch_dreamcast
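For anyone skimming the wall of unrolled macros above, here is a stripped-down sketch of the subdivision idea on its own. It is not code from the engine: `exact_s`, `subdiv16_s` and `SUBDIV` are made-up names for illustration, and the clamping against `bbextents`, the texture fetch and the blend macros are all left out. It only shows the part where the perspective divide happens once per 16-pixel block (and once more for the leftover pixels), with s stepped linearly in 16.16 fixed point in between.

```c
#include <assert.h>
#include <stdint.h>

#define SUBDIV 16 /* pixels per block; must match the >> 4 below */

/* Hypothetical helper: perspective-correct s in 16.16 fixed point at pixel x,
   with s/z and 1/z treated as linear functions of x along the span. */
static int32_t exact_s(float sdivz0, float sdivzstep, float zi0, float zistep, int x)
{
    float zi = zi0 + zistep * (float)x;
    float z = 65536.0f / zi; /* prescale to 16.16 fixed-point */
    return (int32_t)((sdivz0 + sdivzstep * (float)x) * z);
}

/* Hypothetical helper: writes one s value per pixel into out[0..count-1]. */
static void subdiv16_s(float sdivz0, float sdivzstep, float zi0, float zistep,
                       int count, int32_t *out)
{
    int x = 0, i, rem;
    int32_t s, snext, sstep;

    assert(count >= 0 && out != 0);
    s = exact_s(sdivz0, sdivzstep, zi0, zistep, 0);

    /* full blocks: one divide at the far end of the block, step by shifting */
    while (count - x >= SUBDIV)
    {
        snext = exact_s(sdivz0, sdivzstep, zi0, zistep, x + SUBDIV);
        /* assumes arithmetic right shift for negative values, as the engine code does */
        sstep = (snext - s) >> 4;
        for (i = 0; i < SUBDIV; i++)
        {
            out[x + i] = s;
            s += sstep;
        }
        s = snext;
        x += SUBDIV;
    }

    /* leftover pixels: evaluate at the last pixel so we can't step off the polygon,
       then get the step by division */
    rem = count - x;
    if (rem > 0)
    {
        snext = exact_s(sdivz0, sdivzstep, zi0, zistep, x + rem - 1);
        sstep = (rem > 1) ? (snext - s) / (rem - 1) : 0;
        for (i = 0; i < rem; i++)
        {
            out[x + i] = s;
            s += sstep;
        }
    }
}
```

With a constant 1/z (which is the `SPR_VP_PARALLEL` case) the linear steps reproduce the exact values, which is presumably why that branch can compute `z` once outside the loop and skip the per-block divide entirely.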
Ph'nglui mglw'nafh mankrip Hell's end wgah'nagl fhtagn.
mankrip
Posts: 924
Joined: Fri Jul 04, 2008 3:02 am

Re: Subdiv16 for sprites?

Post by mankrip »

The rest of the file:

Code: Select all

void D_SpriteDrawSpans16_ColorKeyed (void)
{
	// mankrip - begin
	#ifdef _arch_dreamcast
	D_SpriteDrawSpans_Dithered_ColorKeyed ();
	#else
	if (d_dither.value)
	{
		D_SpriteDrawSpans_Dithered_ColorKeyed ();
		return;
	}
	if (psprite->type == SPR_VP_PARALLEL)
	{
		zi = d_ziorigin + (float) (pspan->v) * d_zistepv + (float) (pspan->u) * d_zistepu;
		z = (float)0x10000 / zi; // prescale to 16.16 fixed-point
		// we count on FP exceptions being turned off to avoid range problems
		izi = (int) (zi * 0x8000 * 0x10000) >> 16;
		#undef IZI
		#define IZI izi
	// mankrip - end
		do
		{
			// mankrip - begin
			u = pspan->u;
			v = pspan->v;
			du = (float)u;
			dv = (float)v;
			// mankrip - end

			// calculate the initial s/z, t/z, 1/z, s, and t and clamp
			sdivz = d_sdivzorigin + dv * d_sdivzstepv + du * d_sdivzstepu;
			tdivz = d_tdivzorigin + dv * d_tdivzstepv + du * d_tdivzstepu;

			s = (int) (sdivz * z) + sadjust;
			if (s > bbextents)
				s = bbextents;
			else if (s < 0)
				s = 0;

			t = (int) (tdivz * z) + tadjust;
			if (t > bbextentt)
				t = bbextentt;
			else if (t < 0)
				t = 0;

			pdest = (byte *)d_viewbuffer + (screenwidth * v) + u; // mankrip - edited
			pz = d_pzbuffer + (d_zwidth * v) + u; // mankrip - edited

			count		= pspan->count >> 4; // mh
			spancount	= pspan->count % 16; // mankrip
			while (count--) // mankrip
			{
				// calculate s/z, t/z, zi->fixed s and t at far end of span,
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
				sdivz += sdivzstepu;
				tdivz += tdivzstepu;
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end

				snext = (int) (sdivz * z) + sadjust;
				if (snext > bbextents)
					snext = bbextents;
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
				else if (snext < 16)
					snext = 16;   // prevent round-off error on <0 steps from causing overstepping & running off the edge of the texture
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end

				tnext = (int) (tdivz * z) + tadjust;
				if (tnext > bbextentt)
					tnext = bbextentt;
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
				else if (tnext < 16)
					tnext = 16;   // guard against round-off error on <0 steps

				// calculate s and t steps across span by shifting
				sstep = (snext - s) >> 4;
				tstep = (tnext - t) >> 4;
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end

				// mankrip - begin
				pdest += 16;
				pz += 16;
				COLORKEY(-16); s += sstep; t += tstep;
				COLORKEY(-15); s += sstep; t += tstep;
				COLORKEY(-14); s += sstep; t += tstep;
				COLORKEY(-13); s += sstep; t += tstep;
				COLORKEY(-12); s += sstep; t += tstep;
				COLORKEY(-11); s += sstep; t += tstep;
				COLORKEY(-10); s += sstep; t += tstep;
				COLORKEY( -9); s += sstep; t += tstep;
				COLORKEY( -8); s += sstep; t += tstep;
				COLORKEY( -7); s += sstep; t += tstep;
				COLORKEY( -6); s += sstep; t += tstep;
				COLORKEY( -5); s += sstep; t += tstep;
				COLORKEY( -4); s += sstep; t += tstep;
				COLORKEY( -3); s += sstep; t += tstep;
				COLORKEY( -2); s += sstep; t += tstep;
				COLORKEY( -1);
				// mankrip - end

				s = snext;
				t = tnext;
				// mankrip - begin
			}
			if (spancount)
			{
				// mankrip - end

				// calculate s/z, t/z, zi->fixed s and t at last pixel in span (so can't step off polygon),
				// clamp, calculate s and t steps across span by division, biasing steps low so we don't run off the texture

				spancountminus1 = (float) (spancount - 1);
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
				sdivz += d_sdivzstepu * spancountminus1;
				tdivz += d_tdivzstepu * spancountminus1;
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end

				snext = (int) (sdivz * z) + sadjust;
				if (snext > bbextents)
					snext = bbextents;
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
				else if (snext < 16)
					snext = 16;   // prevent round-off error on <0 steps from causing overstepping & running off the edge of the texture
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end

				tnext = (int) (tdivz * z) + tadjust;
				if (tnext > bbextentt)
					tnext = bbextentt;
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
				else if (tnext < 16)
					tnext = 16;   // guard against round-off error on <0 steps
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end

				if (spancount > 1)
				{
					sstep = (snext - s) / (spancount - 1);
					tstep = (tnext - t) / (spancount - 1);
				}

			// mankrip - begin
				pdest += spancount;
				pz += spancount;
				switch (spancount)
				{
					case 16: COLORKEY(-16); s += sstep; t += tstep;
					case 15: COLORKEY(-15); s += sstep; t += tstep;
					case 14: COLORKEY(-14); s += sstep; t += tstep;
					case 13: COLORKEY(-13); s += sstep; t += tstep;
					case 12: COLORKEY(-12); s += sstep; t += tstep;
					case 11: COLORKEY(-11); s += sstep; t += tstep;
					case 10: COLORKEY(-10); s += sstep; t += tstep;
					case  9: COLORKEY( -9); s += sstep; t += tstep;
					case  8: COLORKEY( -8); s += sstep; t += tstep;
					case  7: COLORKEY( -7); s += sstep; t += tstep;
					case  6: COLORKEY( -6); s += sstep; t += tstep;
					case  5: COLORKEY( -5); s += sstep; t += tstep;
					case  4: COLORKEY( -4); s += sstep; t += tstep;
					case  3: COLORKEY( -3); s += sstep; t += tstep;
					case  2: COLORKEY( -2); s += sstep; t += tstep;
					case  1: COLORKEY( -1);
					break;
				}
			}
			// mankrip - end
			pspan++;
		} while (pspan->count != DS_SPAN_LIST_END);
	}
	else
		do
		{
			// mankrip - begin
			#undef IZI
			#define IZI (izi >> 16)
			u = pspan->u;
			v = pspan->v;
			du = (float)u;
			dv = (float)v;
			// mankrip - end

			// calculate the initial s/z, t/z, 1/z, s, and t and clamp
			sdivz = d_sdivzorigin + dv * d_sdivzstepv + du * d_sdivzstepu;
			tdivz = d_tdivzorigin + dv * d_tdivzstepv + du * d_tdivzstepu;
			zi = d_ziorigin + dv * d_zistepv + du * d_zistepu;
			z = (float)0x10000 / zi; // prescale to 16.16 fixed-point
			// we count on FP exceptions being turned off to avoid range problems // mankrip
			izi = (int) (zi * 0x8000 * 0x10000); // mankrip

			s = (int) (sdivz * z) + sadjust;
			if (s > bbextents)
				s = bbextents;
			else if (s < 0)
				s = 0;

			t = (int) (tdivz * z) + tadjust;
			if (t > bbextentt)
				t = bbextentt;
			else if (t < 0)
				t = 0;

			pdest = (byte *)d_viewbuffer + (screenwidth * v) + u; // mankrip - edited
			pz = d_pzbuffer + (d_zwidth * v) + u; // mankrip - edited

			count		= pspan->count >> 4; // mh
			spancount	= pspan->count % 16; // mankrip
			while (count--) // mankrip
			{
				// calculate s/z, t/z, zi->fixed s and t at far end of span,
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
				sdivz += sdivzstepu;
				tdivz += tdivzstepu;
				zi += zistepu;
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end
				z = (float)0x10000 / zi;	// prescale to 16.16 fixed-point
				// we count on FP exceptions being turned off to avoid range problems // mankrip
			//	izi = (int) (zi * 0x8000 * 0x10000); // mankrip

				snext = (int) (sdivz * z) + sadjust;
				if (snext > bbextents)
					snext = bbextents;
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
				else if (snext < 16)
					snext = 16;   // prevent round-off error on <0 steps from causing overstepping & running off the edge of the texture
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end

				tnext = (int) (tdivz * z) + tadjust;
				if (tnext > bbextentt)
					tnext = bbextentt;
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
				else if (tnext < 16)
					tnext = 16;   // guard against round-off error on <0 steps

				// calculate s and t steps across span by shifting
				sstep = (snext - s) >> 4;
				tstep = (tnext - t) >> 4;
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end

				// mankrip - begin
				pdest += 16;
				pz += 16;
				COLORKEY(-16); izi += izistep; s += sstep; t += tstep;
				COLORKEY(-15); izi += izistep; s += sstep; t += tstep;
				COLORKEY(-14); izi += izistep; s += sstep; t += tstep;
				COLORKEY(-13); izi += izistep; s += sstep; t += tstep;
				COLORKEY(-12); izi += izistep; s += sstep; t += tstep;
				COLORKEY(-11); izi += izistep; s += sstep; t += tstep;
				COLORKEY(-10); izi += izistep; s += sstep; t += tstep;
				COLORKEY( -9); izi += izistep; s += sstep; t += tstep;
				COLORKEY( -8); izi += izistep; s += sstep; t += tstep;
				COLORKEY( -7); izi += izistep; s += sstep; t += tstep;
				COLORKEY( -6); izi += izistep; s += sstep; t += tstep;
				COLORKEY( -5); izi += izistep; s += sstep; t += tstep;
				COLORKEY( -4); izi += izistep; s += sstep; t += tstep;
				COLORKEY( -3); izi += izistep; s += sstep; t += tstep;
				COLORKEY( -2); izi += izistep; s += sstep; t += tstep;
				COLORKEY( -1); izi += izistep;
				// mankrip - end

				s = snext;
				t = tnext;
				// mankrip - begin
			}
			if (spancount)
			{
				// mankrip - end

				// calculate s/z, t/z, zi->fixed s and t at last pixel in span (so can't step off polygon),
				// clamp, calculate s and t steps across span by division, biasing steps low so we don't run off the texture

				spancountminus1 = (float) (spancount - 1);
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
				sdivz += d_sdivzstepu * spancountminus1;
				tdivz += d_tdivzstepu * spancountminus1;
				zi += d_zistepu * spancountminus1;
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end
				z = (float)0x10000 / zi;	// prescale to 16.16 fixed-point
				// we count on FP exceptions being turned off to avoid range problems // mankrip
			//	izi = (int) (zi * 0x8000 * 0x10000); // mankrip

				snext = (int) (sdivz * z) + sadjust;
				if (snext > bbextents)
					snext = bbextents;
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
				else if (snext < 16)
					snext = 16;   // prevent round-off error on <0 steps from causing overstepping & running off the edge of the texture
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end

				tnext = (int) (tdivz * z) + tadjust;
				if (tnext > bbextentt)
					tnext = bbextentt;
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
				else if (tnext < 16)
					tnext = 16;   // guard against round-off error on <0 steps
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end

				if (spancount > 1)
				{
					sstep = (snext - s) / (spancount - 1);
					tstep = (tnext - t) / (spancount - 1);
				}

			// mankrip - begin
				pdest += spancount;
				pz += spancount;
				switch (spancount) // cases fall through on purpose: entering at case N draws N pixels
				{
					case 16: COLORKEY(-16); izi += izistep; s += sstep; t += tstep;
					case 15: COLORKEY(-15); izi += izistep; s += sstep; t += tstep;
					case 14: COLORKEY(-14); izi += izistep; s += sstep; t += tstep;
					case 13: COLORKEY(-13); izi += izistep; s += sstep; t += tstep;
					case 12: COLORKEY(-12); izi += izistep; s += sstep; t += tstep;
					case 11: COLORKEY(-11); izi += izistep; s += sstep; t += tstep;
					case 10: COLORKEY(-10); izi += izistep; s += sstep; t += tstep;
					case  9: COLORKEY( -9); izi += izistep; s += sstep; t += tstep;
					case  8: COLORKEY( -8); izi += izistep; s += sstep; t += tstep;
					case  7: COLORKEY( -7); izi += izistep; s += sstep; t += tstep;
					case  6: COLORKEY( -6); izi += izistep; s += sstep; t += tstep;
					case  5: COLORKEY( -5); izi += izistep; s += sstep; t += tstep;
					case  4: COLORKEY( -4); izi += izistep; s += sstep; t += tstep;
					case  3: COLORKEY( -3); izi += izistep; s += sstep; t += tstep;
					case  2: COLORKEY( -2); izi += izistep; s += sstep; t += tstep;
					case  1: COLORKEY( -1);
					break;
				}
			}
			// mankrip - end
			pspan++;
		} while (pspan->count != DS_SPAN_LIST_END);
	#endif // #ifdef _arch_dreamcast
}

void D_SpriteDrawSpans_Blend (void) // mankrip - transparencies
{
	// mankrip - begin
	#ifdef _arch_dreamcast
	D_SpriteDrawSpans_Dithered_Blend ();
	#else
	if (d_dither.value)
	{
		D_SpriteDrawSpans_Dithered_Blend ();
		return;
	}
	if (psprite->type == SPR_VP_PARALLEL)
	{
		zi = d_ziorigin + (float) (pspan->v) * d_zistepv + (float) (pspan->u) * d_zistepu;
		z = (float)0x10000 / zi; // prescale to 16.16 fixed-point
		// we count on FP exceptions being turned off to avoid range problems
		izi = (int) (zi * 0x8000 * 0x10000) >> 16;
		#undef IZI
		#define IZI izi
	// mankrip - end
		do
		{
			// mankrip - begin
			u = pspan->u;
			v = pspan->v;
			du = (float)u;
			dv = (float)v;
			// mankrip - end

			// calculate the initial s/z, t/z, 1/z, s, and t and clamp
			sdivz = d_sdivzorigin + dv * d_sdivzstepv + du * d_sdivzstepu;
			tdivz = d_tdivzorigin + dv * d_tdivzstepv + du * d_tdivzstepu;

			s = (int) (sdivz * z) + sadjust;
			if (s > bbextents)
				s = bbextents;
			else if (s < 0)
				s = 0;

			t = (int) (tdivz * z) + tadjust;
			if (t > bbextentt)
				t = bbextentt;
			else if (t < 0)
				t = 0;

			pdest = (byte *)d_viewbuffer + (screenwidth * v) + u; // mankrip - edited
			pz = d_pzbuffer + (d_zwidth * v) + u; // mankrip - edited

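			// note: both expressions below assume pspan->count >= 0 (D_SpriteDrawSpans_Shadow skips spans with count <= 0 explicitly)
			count		= pspan->count >> 4; // mh
			spancount	= pspan->count % 16; // mankrip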
			while (count--) // mankrip
			{
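				// calculate s/z, t/z and fixed-point s and t at far end of span
				// (zi and z were computed once before the loop for SPR_VP_PARALLEL sprites and are reused here)
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
				sdivz += sdivzstepu;
				tdivz += tdivzstepu;
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end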

				snext = (int) (sdivz * z) + sadjust;
				if (snext > bbextents)
					snext = bbextents;
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
				else if (snext < 16)
					snext = 16;   // prevent round-off error on <0 steps from causing overstepping & running off the edge of the texture
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end

				tnext = (int) (tdivz * z) + tadjust;
				if (tnext > bbextentt)
					tnext = bbextentt;
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
				else if (tnext < 16)
					tnext = 16;   // guard against round-off error on <0 steps

				// calculate s and t steps across span by shifting
				sstep = (snext - s) >> 4;
				tstep = (tnext - t) >> 4;
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end

				// mankrip - begin
				pdest += 16;
				pz += 16;
				BLEND(-16); s += sstep; t += tstep;
				BLEND(-15); s += sstep; t += tstep;
				BLEND(-14); s += sstep; t += tstep;
				BLEND(-13); s += sstep; t += tstep;
				BLEND(-12); s += sstep; t += tstep;
				BLEND(-11); s += sstep; t += tstep;
				BLEND(-10); s += sstep; t += tstep;
				BLEND( -9); s += sstep; t += tstep;
				BLEND( -8); s += sstep; t += tstep;
				BLEND( -7); s += sstep; t += tstep;
				BLEND( -6); s += sstep; t += tstep;
				BLEND( -5); s += sstep; t += tstep;
				BLEND( -4); s += sstep; t += tstep;
				BLEND( -3); s += sstep; t += tstep;
				BLEND( -2); s += sstep; t += tstep;
				BLEND( -1);
				// mankrip - end

				s = snext;
				t = tnext;
				// mankrip - begin
			}
			if (spancount)
			{
				// mankrip - end

				// calculate s/z, t/z, zi->fixed s and t at last pixel in span (so can't step off polygon),
				// clamp, calculate s and t steps across span by division, biasing steps low so we don't run off the texture

				spancountminus1 = (float) (spancount - 1);
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
				sdivz += d_sdivzstepu * spancountminus1;
				tdivz += d_tdivzstepu * spancountminus1;
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end

				snext = (int) (sdivz * z) + sadjust;
				if (snext > bbextents)
					snext = bbextents;
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
				else if (snext < 16)
					snext = 16;   // prevent round-off error on <0 steps from causing overstepping & running off the edge of the texture
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end

				tnext = (int) (tdivz * z) + tadjust;
				if (tnext > bbextentt)
					tnext = bbextentt;
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
				else if (tnext < 16)
					tnext = 16;   // guard against round-off error on <0 steps
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end

				if (spancount > 1)
				{
					sstep = (snext - s) / (spancount - 1);
					tstep = (tnext - t) / (spancount - 1);
				}

			// mankrip - begin
				pdest += spancount;
				pz += spancount;
				switch (spancount) // cases fall through on purpose: entering at case N draws N pixels
				{
					case 16: BLEND(-16); s += sstep; t += tstep;
					case 15: BLEND(-15); s += sstep; t += tstep;
					case 14: BLEND(-14); s += sstep; t += tstep;
					case 13: BLEND(-13); s += sstep; t += tstep;
					case 12: BLEND(-12); s += sstep; t += tstep;
					case 11: BLEND(-11); s += sstep; t += tstep;
					case 10: BLEND(-10); s += sstep; t += tstep;
					case  9: BLEND( -9); s += sstep; t += tstep;
					case  8: BLEND( -8); s += sstep; t += tstep;
					case  7: BLEND( -7); s += sstep; t += tstep;
					case  6: BLEND( -6); s += sstep; t += tstep;
					case  5: BLEND( -5); s += sstep; t += tstep;
					case  4: BLEND( -4); s += sstep; t += tstep;
					case  3: BLEND( -3); s += sstep; t += tstep;
					case  2: BLEND( -2); s += sstep; t += tstep;
					case  1: BLEND( -1);
					break;
				}
			}
			// mankrip - end
			pspan++;
		} while (pspan->count != DS_SPAN_LIST_END);
	}
	else
		do
		{
			// mankrip - begin
			#undef IZI
			#define IZI (izi >> 16)
			u = pspan->u;
			v = pspan->v;
			du = (float)u;
			dv = (float)v;
			// mankrip - end

			// calculate the initial s/z, t/z, 1/z, s, and t and clamp
			sdivz = d_sdivzorigin + dv * d_sdivzstepv + du * d_sdivzstepu;
			tdivz = d_tdivzorigin + dv * d_tdivzstepv + du * d_tdivzstepu;
			zi = d_ziorigin + dv * d_zistepv + du * d_zistepu;
			z = (float)0x10000 / zi; // prescale to 16.16 fixed-point
			// we count on FP exceptions being turned off to avoid range problems // mankrip
			izi = (int) (zi * 0x8000 * 0x10000); // mankrip

			s = (int) (sdivz * z) + sadjust;
			if (s > bbextents)
				s = bbextents;
			else if (s < 0)
				s = 0;

			t = (int) (tdivz * z) + tadjust;
			if (t > bbextentt)
				t = bbextentt;
			else if (t < 0)
				t = 0;

			pdest = (byte *)d_viewbuffer + (screenwidth * v) + u; // mankrip - edited
			pz = d_pzbuffer + (d_zwidth * v) + u; // mankrip - edited

			count		= pspan->count >> 4; // mh
			spancount	= pspan->count % 16; // mankrip
			while (count--) // mankrip
			{
				// calculate s/z, t/z, zi->fixed s and t at far end of span,
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
				sdivz += sdivzstepu;
				tdivz += tdivzstepu;
				zi += zistepu;
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end
				z = (float)0x10000 / zi;	// prescale to 16.16 fixed-point
				// we count on FP exceptions being turned off to avoid range problems // mankrip
			//	izi = (int) (zi * 0x8000 * 0x10000); // mankrip

				snext = (int) (sdivz * z) + sadjust;
				if (snext > bbextents)
					snext = bbextents;
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
				else if (snext < 16)
					snext = 16;   // prevent round-off error on <0 steps from causing overstepping & running off the edge of the texture
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end

				tnext = (int) (tdivz * z) + tadjust;
				if (tnext > bbextentt)
					tnext = bbextentt;
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
				else if (tnext < 16)
					tnext = 16;   // guard against round-off error on <0 steps

				// calculate s and t steps across span by shifting
				sstep = (snext - s) >> 4;
				tstep = (tnext - t) >> 4;
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end

				// mankrip - begin
				pdest += 16;
				pz += 16;
				BLEND(-16); izi += izistep; s += sstep; t += tstep;
				BLEND(-15); izi += izistep; s += sstep; t += tstep;
				BLEND(-14); izi += izistep; s += sstep; t += tstep;
				BLEND(-13); izi += izistep; s += sstep; t += tstep;
				BLEND(-12); izi += izistep; s += sstep; t += tstep;
				BLEND(-11); izi += izistep; s += sstep; t += tstep;
				BLEND(-10); izi += izistep; s += sstep; t += tstep;
				BLEND( -9); izi += izistep; s += sstep; t += tstep;
				BLEND( -8); izi += izistep; s += sstep; t += tstep;
				BLEND( -7); izi += izistep; s += sstep; t += tstep;
				BLEND( -6); izi += izistep; s += sstep; t += tstep;
				BLEND( -5); izi += izistep; s += sstep; t += tstep;
				BLEND( -4); izi += izistep; s += sstep; t += tstep;
				BLEND( -3); izi += izistep; s += sstep; t += tstep;
				BLEND( -2); izi += izistep; s += sstep; t += tstep;
				BLEND( -1); izi += izistep;
				// mankrip - end

				s = snext;
				t = tnext;
				// mankrip - begin
			}
			if (spancount)
			{
				// mankrip - end

				// calculate s/z, t/z, zi->fixed s and t at last pixel in span (so can't step off polygon),
				// clamp, calculate s and t steps across span by division, biasing steps low so we don't run off the texture

				spancountminus1 = (float) (spancount - 1);
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
				sdivz += d_sdivzstepu * spancountminus1;
				tdivz += d_tdivzstepu * spancountminus1;
				zi += d_zistepu * spancountminus1;
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end
				z = (float)0x10000 / zi;	// prescale to 16.16 fixed-point
				// we count on FP exceptions being turned off to avoid range problems // mankrip
			//	izi = (int) (zi * 0x8000 * 0x10000); // mankrip

				snext = (int) (sdivz * z) + sadjust;
				if (snext > bbextents)
					snext = bbextents;
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
				else if (snext < 16)
					snext = 16;   // prevent round-off error on <0 steps from causing overstepping & running off the edge of the texture
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end

				tnext = (int) (tdivz * z) + tadjust;
				if (tnext > bbextentt)
					tnext = bbextentt;
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
				else if (tnext < 16)
					tnext = 16;   // guard against round-off error on <0 steps
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end

				if (spancount > 1)
				{
					sstep = (snext - s) / (spancount - 1);
					tstep = (tnext - t) / (spancount - 1);
				}

			// mankrip - begin
				pdest += spancount;
				pz += spancount;
				switch (spancount) // cases fall through on purpose: entering at case N draws N pixels
				{
					case 16: BLEND(-16); izi += izistep; s += sstep; t += tstep;
					case 15: BLEND(-15); izi += izistep; s += sstep; t += tstep;
					case 14: BLEND(-14); izi += izistep; s += sstep; t += tstep;
					case 13: BLEND(-13); izi += izistep; s += sstep; t += tstep;
					case 12: BLEND(-12); izi += izistep; s += sstep; t += tstep;
					case 11: BLEND(-11); izi += izistep; s += sstep; t += tstep;
					case 10: BLEND(-10); izi += izistep; s += sstep; t += tstep;
					case  9: BLEND( -9); izi += izistep; s += sstep; t += tstep;
					case  8: BLEND( -8); izi += izistep; s += sstep; t += tstep;
					case  7: BLEND( -7); izi += izistep; s += sstep; t += tstep;
					case  6: BLEND( -6); izi += izistep; s += sstep; t += tstep;
					case  5: BLEND( -5); izi += izistep; s += sstep; t += tstep;
					case  4: BLEND( -4); izi += izistep; s += sstep; t += tstep;
					case  3: BLEND( -3); izi += izistep; s += sstep; t += tstep;
					case  2: BLEND( -2); izi += izistep; s += sstep; t += tstep;
					case  1: BLEND( -1);
					break;
				}
			}
			// mankrip - end
			pspan++;
		} while (pspan->count != DS_SPAN_LIST_END);
	#endif // #ifdef _arch_dreamcast
}

void D_SpriteDrawSpans_BlendBackwards (void) // mankrip - transparencies
{
	// mankrip - begin
	#ifdef _arch_dreamcast
	D_SpriteDrawSpans_Dithered_Blend (); // no backwards blending on the DC
	#else
	if (d_dither.value)
	{
		D_SpriteDrawSpans_Dithered_BlendBackwards ();
		return;
	}
	if (psprite->type == SPR_VP_PARALLEL)
	{
		zi = d_ziorigin + (float) (pspan->v) * d_zistepv + (float) (pspan->u) * d_zistepu;
		z = (float)0x10000 / zi; // prescale to 16.16 fixed-point
		// we count on FP exceptions being turned off to avoid range problems
		izi = (int) (zi * 0x8000 * 0x10000) >> 16;
		#undef IZI
		#define IZI izi
	// mankrip - end
		do
		{
			// mankrip - begin
			u = pspan->u;
			v = pspan->v;
			du = (float)u;
			dv = (float)v;
			// mankrip - end

			// calculate the initial s/z, t/z, 1/z, s, and t and clamp
			sdivz = d_sdivzorigin + dv * d_sdivzstepv + du * d_sdivzstepu;
			tdivz = d_tdivzorigin + dv * d_tdivzstepv + du * d_tdivzstepu;

			s = (int) (sdivz * z) + sadjust;
			if (s > bbextents)
				s = bbextents;
			else if (s < 0)
				s = 0;

			t = (int) (tdivz * z) + tadjust;
			if (t > bbextentt)
				t = bbextentt;
			else if (t < 0)
				t = 0;

			pdest = (byte *)d_viewbuffer + (screenwidth * v) + u; // mankrip - edited
			pz = d_pzbuffer + (d_zwidth * v) + u; // mankrip - edited

			count		= pspan->count >> 4; // mh
			spancount	= pspan->count % 16; // mankrip
			while (count--) // mankrip
			{
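				// calculate s/z, t/z and fixed-point s and t at far end of span
				// (zi and z were computed once before the loop for SPR_VP_PARALLEL sprites and are reused here)
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
				sdivz += sdivzstepu;
				tdivz += tdivzstepu;
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end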

				snext = (int) (sdivz * z) + sadjust;
				if (snext > bbextents)
					snext = bbextents;
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
				else if (snext < 16)
					snext = 16;   // prevent round-off error on <0 steps from causing overstepping & running off the edge of the texture
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end

				tnext = (int) (tdivz * z) + tadjust;
				if (tnext > bbextentt)
					tnext = bbextentt;
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
				else if (tnext < 16)
					tnext = 16;   // guard against round-off error on <0 steps

				// calculate s and t steps across span by shifting
				sstep = (snext - s) >> 4;
				tstep = (tnext - t) >> 4;
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end

				// mankrip - begin
				pdest += 16;
				pz += 16;
				BLENDBACKWARDS(-16); s += sstep; t += tstep;
				BLENDBACKWARDS(-15); s += sstep; t += tstep;
				BLENDBACKWARDS(-14); s += sstep; t += tstep;
				BLENDBACKWARDS(-13); s += sstep; t += tstep;
				BLENDBACKWARDS(-12); s += sstep; t += tstep;
				BLENDBACKWARDS(-11); s += sstep; t += tstep;
				BLENDBACKWARDS(-10); s += sstep; t += tstep;
				BLENDBACKWARDS( -9); s += sstep; t += tstep;
				BLENDBACKWARDS( -8); s += sstep; t += tstep;
				BLENDBACKWARDS( -7); s += sstep; t += tstep;
				BLENDBACKWARDS( -6); s += sstep; t += tstep;
				BLENDBACKWARDS( -5); s += sstep; t += tstep;
				BLENDBACKWARDS( -4); s += sstep; t += tstep;
				BLENDBACKWARDS( -3); s += sstep; t += tstep;
				BLENDBACKWARDS( -2); s += sstep; t += tstep;
				BLENDBACKWARDS( -1);
				// mankrip - end

				s = snext;
				t = tnext;
				// mankrip - begin
			}
			if (spancount)
			{
				// mankrip - end

				// calculate s/z, t/z, zi->fixed s and t at last pixel in span (so can't step off polygon),
				// clamp, calculate s and t steps across span by division, biasing steps low so we don't run off the texture

				spancountminus1 = (float) (spancount - 1);
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
				sdivz += d_sdivzstepu * spancountminus1;
				tdivz += d_tdivzstepu * spancountminus1;
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end

				snext = (int) (sdivz * z) + sadjust;
				if (snext > bbextents)
					snext = bbextents;
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
				else if (snext < 16)
					snext = 16;   // prevent round-off error on <0 steps from causing overstepping & running off the edge of the texture
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end

				tnext = (int) (tdivz * z) + tadjust;
				if (tnext > bbextentt)
					tnext = bbextentt;
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
				else if (tnext < 16)
					tnext = 16;   // guard against round-off error on <0 steps
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end

				if (spancount > 1)
				{
					sstep = (snext - s) / (spancount - 1);
					tstep = (tnext - t) / (spancount - 1);
				}

			// mankrip - begin
				pdest += spancount;
				pz += spancount;
				switch (spancount) // cases fall through on purpose: entering at case N draws N pixels
				{
					case 16: BLENDBACKWARDS(-16); s += sstep; t += tstep;
					case 15: BLENDBACKWARDS(-15); s += sstep; t += tstep;
					case 14: BLENDBACKWARDS(-14); s += sstep; t += tstep;
					case 13: BLENDBACKWARDS(-13); s += sstep; t += tstep;
					case 12: BLENDBACKWARDS(-12); s += sstep; t += tstep;
					case 11: BLENDBACKWARDS(-11); s += sstep; t += tstep;
					case 10: BLENDBACKWARDS(-10); s += sstep; t += tstep;
					case  9: BLENDBACKWARDS( -9); s += sstep; t += tstep;
					case  8: BLENDBACKWARDS( -8); s += sstep; t += tstep;
					case  7: BLENDBACKWARDS( -7); s += sstep; t += tstep;
					case  6: BLENDBACKWARDS( -6); s += sstep; t += tstep;
					case  5: BLENDBACKWARDS( -5); s += sstep; t += tstep;
					case  4: BLENDBACKWARDS( -4); s += sstep; t += tstep;
					case  3: BLENDBACKWARDS( -3); s += sstep; t += tstep;
					case  2: BLENDBACKWARDS( -2); s += sstep; t += tstep;
					case  1: BLENDBACKWARDS( -1);
					break;
				}
			}
			// mankrip - end
			pspan++;
		} while (pspan->count != DS_SPAN_LIST_END);
	}
	else
		do
		{
			// mankrip - begin
			#undef IZI
			#define IZI (izi >> 16)
			u = pspan->u;
			v = pspan->v;
			du = (float)u;
			dv = (float)v;
			// mankrip - end

			// calculate the initial s/z, t/z, 1/z, s, and t and clamp
			sdivz = d_sdivzorigin + dv * d_sdivzstepv + du * d_sdivzstepu;
			tdivz = d_tdivzorigin + dv * d_tdivzstepv + du * d_tdivzstepu;
			zi = d_ziorigin + dv * d_zistepv + du * d_zistepu;
			z = (float)0x10000 / zi; // prescale to 16.16 fixed-point
			// we count on FP exceptions being turned off to avoid range problems // mankrip
			izi = (int) (zi * 0x8000 * 0x10000); // mankrip

			s = (int) (sdivz * z) + sadjust;
			if (s > bbextents)
				s = bbextents;
			else if (s < 0)
				s = 0;

			t = (int) (tdivz * z) + tadjust;
			if (t > bbextentt)
				t = bbextentt;
			else if (t < 0)
				t = 0;

			pdest = (byte *)d_viewbuffer + (screenwidth * v) + u; // mankrip - edited
			pz = d_pzbuffer + (d_zwidth * v) + u; // mankrip - edited

			count		= pspan->count >> 4; // mh
			spancount	= pspan->count % 16; // mankrip
			while (count--) // mankrip
			{
				// calculate s/z, t/z, zi->fixed s and t at far end of span,
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
				sdivz += sdivzstepu;
				tdivz += tdivzstepu;
				zi += zistepu;
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end
				z = (float)0x10000 / zi;	// prescale to 16.16 fixed-point
				// we count on FP exceptions being turned off to avoid range problems // mankrip
			//	izi = (int) (zi * 0x8000 * 0x10000); // mankrip

				snext = (int) (sdivz * z) + sadjust;
				if (snext > bbextents)
					snext = bbextents;
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
				else if (snext < 16)
					snext = 16;   // prevent round-off error on <0 steps from causing overstepping & running off the edge of the texture
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end

				tnext = (int) (tdivz * z) + tadjust;
				if (tnext > bbextentt)
					tnext = bbextentt;
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
				else if (tnext < 16)
					tnext = 16;   // guard against round-off error on <0 steps

				// calculate s and t steps across span by shifting
				sstep = (snext - s) >> 4;
				tstep = (tnext - t) >> 4;
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end

				// mankrip - begin
				pdest += 16;
				pz += 16;
				BLENDBACKWARDS(-16); izi += izistep; s += sstep; t += tstep;
				BLENDBACKWARDS(-15); izi += izistep; s += sstep; t += tstep;
				BLENDBACKWARDS(-14); izi += izistep; s += sstep; t += tstep;
				BLENDBACKWARDS(-13); izi += izistep; s += sstep; t += tstep;
				BLENDBACKWARDS(-12); izi += izistep; s += sstep; t += tstep;
				BLENDBACKWARDS(-11); izi += izistep; s += sstep; t += tstep;
				BLENDBACKWARDS(-10); izi += izistep; s += sstep; t += tstep;
				BLENDBACKWARDS( -9); izi += izistep; s += sstep; t += tstep;
				BLENDBACKWARDS( -8); izi += izistep; s += sstep; t += tstep;
				BLENDBACKWARDS( -7); izi += izistep; s += sstep; t += tstep;
				BLENDBACKWARDS( -6); izi += izistep; s += sstep; t += tstep;
				BLENDBACKWARDS( -5); izi += izistep; s += sstep; t += tstep;
				BLENDBACKWARDS( -4); izi += izistep; s += sstep; t += tstep;
				BLENDBACKWARDS( -3); izi += izistep; s += sstep; t += tstep;
				BLENDBACKWARDS( -2); izi += izistep; s += sstep; t += tstep;
				BLENDBACKWARDS( -1); izi += izistep;
				// mankrip - end

				s = snext;
				t = tnext;
				// mankrip - begin
			}
			if (spancount)
			{
				// mankrip - end

				// calculate s/z, t/z, zi->fixed s and t at last pixel in span (so can't step off polygon),
				// clamp, calculate s and t steps across span by division, biasing steps low so we don't run off the texture

				spancountminus1 = (float) (spancount - 1);
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
				sdivz += d_sdivzstepu * spancountminus1;
				tdivz += d_tdivzstepu * spancountminus1;
				zi += d_zistepu * spancountminus1;
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end
				z = (float)0x10000 / zi;	// prescale to 16.16 fixed-point
				// we count on FP exceptions being turned off to avoid range problems // mankrip
			//	izi = (int) (zi * 0x8000 * 0x10000); // mankrip

				snext = (int) (sdivz * z) + sadjust;
				if (snext > bbextents)
					snext = bbextents;
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
				else if (snext < 16)
					snext = 16;   // prevent round-off error on <0 steps from causing overstepping & running off the edge of the texture
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end

				tnext = (int) (tdivz * z) + tadjust;
				if (tnext > bbextentt)
					tnext = bbextentt;
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
				else if (tnext < 16)
					tnext = 16;   // guard against round-off error on <0 steps
				// qbism ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end

				if (spancount > 1)
				{
					sstep = (snext - s) / (spancount - 1);
					tstep = (tnext - t) / (spancount - 1);
				}

			// mankrip - begin
				pdest += spancount;
				pz += spancount;
				switch (spancount) // cases fall through on purpose: entering at case N draws N pixels
				{
					case 16: BLENDBACKWARDS(-16); izi += izistep; s += sstep; t += tstep;
					case 15: BLENDBACKWARDS(-15); izi += izistep; s += sstep; t += tstep;
					case 14: BLENDBACKWARDS(-14); izi += izistep; s += sstep; t += tstep;
					case 13: BLENDBACKWARDS(-13); izi += izistep; s += sstep; t += tstep;
					case 12: BLENDBACKWARDS(-12); izi += izistep; s += sstep; t += tstep;
					case 11: BLENDBACKWARDS(-11); izi += izistep; s += sstep; t += tstep;
					case 10: BLENDBACKWARDS(-10); izi += izistep; s += sstep; t += tstep;
					case  9: BLENDBACKWARDS( -9); izi += izistep; s += sstep; t += tstep;
					case  8: BLENDBACKWARDS( -8); izi += izistep; s += sstep; t += tstep;
					case  7: BLENDBACKWARDS( -7); izi += izistep; s += sstep; t += tstep;
					case  6: BLENDBACKWARDS( -6); izi += izistep; s += sstep; t += tstep;
					case  5: BLENDBACKWARDS( -5); izi += izistep; s += sstep; t += tstep;
					case  4: BLENDBACKWARDS( -4); izi += izistep; s += sstep; t += tstep;
					case  3: BLENDBACKWARDS( -3); izi += izistep; s += sstep; t += tstep;
					case  2: BLENDBACKWARDS( -2); izi += izistep; s += sstep; t += tstep;
					case  1: BLENDBACKWARDS( -1);
					break;
				}
			}
			// mankrip - end
			pspan++;
		} while (pspan->count != DS_SPAN_LIST_END);
	#endif // #ifdef _arch_dreamcast
}



void D_SpriteDrawSpans_Shadow (void) // mankrip - transparencies
{
	do
	{
		pdest = (byte *)d_viewbuffer + (screenwidth * pspan->v) + pspan->u;
		pz = d_pzbuffer + (d_zwidth * pspan->v) + pspan->u;

		count = pspan->count;

		if (count <= 0)
			goto NextSpan;

		// calculate the initial s/z, t/z, 1/z, s, and t and clamp
		du = (float)pspan->u;
		dv = (float)pspan->v;

		sdivz = d_sdivzorigin + dv*d_sdivzstepv + du*d_sdivzstepu;
		tdivz = d_tdivzorigin + dv*d_tdivzstepv + du*d_tdivzstepu;
		zi = d_ziorigin + dv*d_zistepv + du*d_zistepu;
		z = (float)0x10000 / zi;	// prescale to 16.16 fixed-point

		s = (int)(sdivz * z) + sadjust;
		if (s > bbextents)
			s = bbextents;
		else if (s < 0)
			s = 0;

		t = (int)(tdivz * z) + tadjust;
		if (t > bbextentt)
			t = bbextentt;
		else if (t < 0)
			t = 0;

		do
		{
		// calculate s and t at the far end of the span
			if (count >= 16)
				spancount = 16;
			else
				spancount = count;

			count -= spancount;

			if (count)
			{
			// calculate s/z, t/z, zi->fixed s and t at far end of span,
			// calculate s and t steps across span by shifting
				sdivz += sdivzstepu;
				tdivz += tdivzstepu;
				zi += zistepu;
				z = (float)0x10000 / zi;	// prescale to 16.16 fixed-point
				// we count on FP exceptions being turned off to avoid range problems
				izi = (int)(zi * 0x8000 * 0x10000);

				snext = (int)(sdivz * z) + sadjust;
				if (snext > bbextents)
					snext = bbextents;
				else if (snext < 16)
					snext = 16;	// prevent round-off error on <0 steps from
								//  causing overstepping & running off the
								//  edge of the texture

				tnext = (int)(tdivz * z) + tadjust;
				if (tnext > bbextentt)
					tnext = bbextentt;
				else if (tnext < 16)
					tnext = 16;	// guard against round-off error on <0 steps

				sstep = (snext - s) >> 4;
				tstep = (tnext - t) >> 4;
			}
			else
			{
			// calculate s/z, t/z, zi->fixed s and t at last pixel in span (so
			// can't step off polygon), clamp, calculate s and t steps across
			// span by division, biasing steps low so we don't run off the
			// texture
				spancountminus1 = (float)(spancount - 1);
				sdivz += d_sdivzstepu * spancountminus1;
				tdivz += d_tdivzstepu * spancountminus1;
				zi += d_zistepu * spancountminus1;
				z = (float)0x10000 / zi;	// prescale to 16.16 fixed-point
				// we count on FP exceptions being turned off to avoid range problems
				izi = (int)(zi * 0x8000 * 0x10000);

				snext = (int)(sdivz * z) + sadjust;
				if (snext > bbextents)
					snext = bbextents;
				else if (snext < 16)
					snext = 16;	// prevent round-off error on <0 steps from
								//  causing overstepping & running off the
								//  edge of the texture

				tnext = (int)(tdivz * z) + tadjust;
				if (tnext > bbextentt)
					tnext = bbextentt;
				else if (tnext < 16)
					tnext = 16;	// guard against round-off error on <0 steps

				if (spancount > 1)
				{
					sstep = (snext - s) / (spancount - 1);
					tstep = (tnext - t) / (spancount - 1);
				}
			}

			do
			{
				btemp = *(pbase + (s >> 16) + (t >> 16) * cachewidth);
				if (btemp != TRANSPARENT_COLOR)
				// mankrip - begin
				{
					intensity = (1.0 - ((float)((izi >> 16)- *pz)/20.0));
					if (intensity == 1 && currententity->alpha == 1)
						*pdest = 0;
					else if (intensity > 0 && intensity <= 1)
						*pdest = (byte)colorshadingmap[WITH_BRIGHTS][*pdest + ((32 + (int)(31.0*intensity*currententity->alpha)) * 256)];
				}
				// mankrip - end

				izi += izistep;
				pdest++;
				pz++;
				s += sstep;
				t += tstep;
			} while (--spancount > 0);

			s = snext;
			t = tnext;

		} while (count > 0);

NextSpan:
		pspan++;

	} while (pspan->count != DS_SPAN_LIST_END);
}
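
/*
 * Illustrative sketch only, not part of the engine source: the span drawers
 * above split each span into full 16-pixel blocks plus a shorter tail, and
 * step s/t linearly inside each piece. The names sketch_split_span and
 * sketch_block_step are hypothetical helpers made up for this example, and
 * plain int stands in for the engine's 16.16 fixed-point type.
 */
static void sketch_split_span (int count, int *blocks, int *tail)
{
	*blocks = count >> 4;	// number of full 16-pixel blocks
	*tail = count % 16;		// leftover pixels, 0..15 (assumes count >= 0)
}

// per-pixel 16.16 step between two texture coordinates
static int sketch_block_step (int from, int to, int pixels)
{
	if (pixels == 16)
		return (to - from) >> 4;			// full block: divide by 16 with a shift
	if (pixels > 1)
		return (to - from) / (pixels - 1);	// tail: 'to' is the coordinate of the last pixel itself
	return 0;								// a single pixel needs no step
}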



void D_SpriteScanLeftEdge (void)
{
	int
		i = (minindex) ? minindex : r_spritedesc.nump
//	,	v
	,	ibottom
	,	lmaxindex = (maxindex) ? maxindex : r_spritedesc.nump
		;
	emitpoint_t
		* pvert
	,	* pnext
		;
	float
		vtop = ceil (r_spritedesc.pverts[i].v)
	,	vbottom
	,	slope
		;
	fixed16_t
//		u
		u_step
		;
	pspan = sprite_spans;

	do
	{
		pvert = &r_spritedesc.pverts[i];
		pnext = pvert - 1;

		vbottom = ceil (pnext->v);

		if (vtop < vbottom)
		{
			slope = (pnext->u - pvert->u) / (pnext->v - pvert->v); // mankrip - du / dv (delta u / delta v)
			u_step = (int) (slope * 0x10000);
			// adjust u to ceil the integer portion
			u = (int) ( (pvert->u + (slope * (vtop - pvert->v))) * 0x10000) + (0x10000 - 1);
			ibottom = (int)vbottom;

			for (v = (int)vtop ; v < ibottom ; v++) // mankrip - edited
			{
				pspan->u = u >> 16;
				pspan->v = v;
				u += u_step;
				pspan++;
			}
		}

		vtop = vbottom;

		if (--i == 0) // mankrip - edited
			i = r_spritedesc.nump;

	} while (i != lmaxindex);
}

void D_SpriteScanRightEdge (void)
{
	int
		i = minindex
//	,	v
	,	ibottom
		;
	emitpoint_t
		*pvert
	,	*pnext
		;
	float
		vtop
	,	vbottom
	,	slope
	,	uvert
	,	unext
	,	vvert
	,	vnext
		;
	fixed16_t
//		u
		u_step
		;
	pspan = sprite_spans;

	vvert = r_spritedesc.pverts[i].v;
	if (vvert < r_refdef.fvrecty_adj)
		vvert = r_refdef.fvrecty_adj;
	else if (vvert > r_refdef.fvrectbottom_adj)
		vvert = r_refdef.fvrectbottom_adj;

	vtop = ceil (vvert);

	do
	{
		pvert = &r_spritedesc.pverts[i];
		pnext = pvert + 1;

		vnext = pnext->v;
		if (vnext < r_refdef.fvrecty_adj)
			vnext = r_refdef.fvrecty_adj;
		else if (vnext > r_refdef.fvrectbottom_adj)
			vnext = r_refdef.fvrectbottom_adj;

		vbottom = ceil (vnext);

		if (vtop < vbottom)
		{
			uvert = pvert->u;
			if (uvert < r_refdef.fvrectx_adj)
				uvert = r_refdef.fvrectx_adj;
			else if (uvert > r_refdef.fvrectright_adj) // mankrip
				uvert = r_refdef.fvrectright_adj;

			unext = pnext->u;
			if (unext < r_refdef.fvrectx_adj)
				unext = r_refdef.fvrectx_adj;
			else if (unext > r_refdef.fvrectright_adj) // mankrip
				unext = r_refdef.fvrectright_adj;

			slope = (unext - uvert) / (vnext - vvert); // mankrip - du / dv (delta u / delta v)
			u_step = (int)(slope * 0x10000);
			// adjust u to ceil the integer portion
			u = (int)((uvert + (slope * (vtop - vvert))) * 0x10000) + (0x10000 - 1);
			ibottom = (int)vbottom;

			for (v = (int)vtop ; v < ibottom ; v++) // mankrip - edited
			{
				pspan->count = (u >> 16) - pspan->u;
				u += u_step;
				pspan++;
			}
		}

		vtop = vbottom;
		vvert = vnext;

		if (++i == r_spritedesc.nump)
			i = 0;

	} while (i != maxindex);

	pspan->count = DS_SPAN_LIST_END;	// mark the end of the span list
}

void D_SpriteCalculateGradients (void)
{
	vec3_t
		p_normal
	,	p_saxis
	,	p_taxis
	,	p_temp1
		;
	float
		distinv
	,	scale // mankrip - QC Scale
		;

	TransformVector (r_spritedesc.vpn	, p_normal);
	TransformVector (r_spritedesc.vright, p_saxis);
	TransformVector (r_spritedesc.vup	, p_taxis);
	VectorInverse (p_taxis, p_taxis);

	distinv = 1.0f / (-DotProduct (modelorg, r_spritedesc.vpn));

	// mankrip - QC Scale - begin
	scale = (currententity->scale * currententity->scalev[0]);
	p_saxis[0] /= scale;
	p_saxis[1] /= scale;
	p_saxis[2] /= scale;

	scale = (currententity->scale * currententity->scalev[1]);
	p_taxis[0] /= scale;
	p_taxis[1] /= scale;
	p_taxis[2] /= scale;
	// mankrip - QC Scale - end

	d_sdivzstepu = p_saxis[0] * xscaleinv;
	d_tdivzstepu = p_taxis[0] * xscaleinv;

	d_sdivzstepv = -p_saxis[1] * yscaleinv;
	d_tdivzstepv = -p_taxis[1] * yscaleinv;

	d_zistepu =  p_normal[0] * xscaleinv * distinv;
	d_zistepv = -p_normal[1] * yscaleinv * distinv;

	d_sdivzorigin = p_saxis[2] - xcenter * d_sdivzstepu - ycenter * d_sdivzstepv;
	d_tdivzorigin = p_taxis[2] - xcenter * d_tdivzstepu - ycenter * d_tdivzstepv;
	d_ziorigin = p_normal[2] * distinv - xcenter * d_zistepu - ycenter * d_zistepv;

	TransformVector (modelorg, p_temp1);

	sadjust = ( (fixed16_t) (DotProduct (p_temp1, p_saxis) * 0x10000 + 0.5)) - (- (cachewidth	 >> 1) << 16);
	tadjust = ( (fixed16_t) (DotProduct (p_temp1, p_taxis) * 0x10000 + 0.5)) - (- (sprite_height >> 1) << 16);

	// -1 (-epsilon) so we never wander off the edge of the texture
	bbextents = (cachewidth		<< 16) - 1;
	bbextentt = (sprite_height	<< 16) - 1;
}



void D_DrawSprite (void)
{
	sspan_t
		spans[MAXHEIGHT + 1]
		;
	emitpoint_t*
		pverts = r_spritedesc.pverts
		;
	int
		i = 0
	,	nump
		;
	float
		ymin = 999999.9f
	,	ymax = -999999.9f
		;

	// find the top and bottom vertices, and make sure there's at least one scan to draw
	for ( ; i < r_spritedesc.nump ; i++)
	{
		if (pverts->v < ymin)
		{
			ymin = pverts->v;
			minindex = i;
		}

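		// not an "else if": the first vertex must be able to set both ymin and ymax,
		// otherwise maxindex is wrong whenever vertex 0 is the bottom-most one
		if (pverts->v > ymax)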
		{
			ymax = pverts->v;
			maxindex = i;
		}

		pverts++;
	}

	ymin = ceil (ymin);
	ymax = ceil (ymax);

	if (ymin < ymax) // if it crosses any scans (front-faced?)
	{
		// mankrip - begin
		if (r_sprite_lit.value && !r_fullbright.value)
		{
			int
				r_ambientlight
			,	lnum
				;
			dlight_t
				*dl
				;
			vec3_t
				dist
			,	t
				;
			float
				add
				;
			VectorCopy (currententity->origin, t);
			r_ambientlight = R_LightPoint (t);
			// add dynamic lights
			for (lnum=0 ; lnum < MAX_DLIGHTS ; lnum++)
			{
				dl = &cl_dlights[lnum];
				if (dl->die < cl.time)
					continue;
				if (dl->dark)
					continue;
				if (dl->radius) // not viewmodel
				{
					VectorSubtract (t, dl->origin, dist);
					add = dl->radius - Length (dist);
					if (add > 0)
						r_ambientlight += (int)add;
				}
			}
			// minimum of (255 minus) 12 (12 less than in the viewmodel), maximum of (255 minus) 192 (same as MDL models)
			currententity->lightlevel = (r_ambientlight > 192) ? 1.0f : ( (r_ambientlight < 12) ? (12.0f / 192.0f) : ( (float)r_ambientlight / 192.0f));
		//	Q_memcpy (translationTable, currententity->colormap + ( ( ( (shadelight > 192) ? 63 : ( (shadelight < 6) ? 249 : 255 - shadelight)) << VID_CBITS) & 0xFF00), 256);
			Q_memcpy (translationTable, currententity->colormap + ( ( (255 - (int) (128.0f * currententity->lightlevel)) << VID_CBITS) & 0xFF00), 256);
		//	Q_memcpy (translationTable, currententity->colormap + 63 * (int) ceil ( (shadelight > 192) ? 1.0f : ( (shadelight < 6) ? (6.0f / 192.0f) : ( (float)shadelight / 192.0f))), 256);
		}
		else // fullbright
			Q_memcpy (translationTable, currentTable, 256);
		// mankrip - end

		cachewidth		= r_spritedesc.pspriteframe->width;
		sprite_height	= r_spritedesc.pspriteframe->height;
		cacheblock = (byte *)&r_spritedesc.pspriteframe->pixels[0] + cachewidth + 3; // mankrip - extra for dithering

		nump	= r_spritedesc.nump;
		pverts	= r_spritedesc.pverts;
		// copy the first vertex to the last vertex, so we don't have to deal with wrapping
		pverts[nump] = pverts[0];

		D_SpriteCalculateGradients ();
		sprite_spans = spans; // only used by the three functions below
		D_SpriteScanLeftEdge ();
		D_SpriteScanRightEdge ();
		// mankrip - begin
		// common initialization for all SPR drawing functions
		cachewidth += 2; // extra for dithering
		pbase = cacheblock;
		pspan = sprite_spans;
		sdivzstepu = d_sdivzstepu * 16.0f;
		tdivzstepu = d_tdivzstepu * 16.0f;
		zistepu = d_zistepu * 16.0f;
		izistep = (int) (d_zistepu * 0x8000 * 0x10000); // we count on FP exceptions being turned off to avoid range problems
		psprite = currententity->model->cache.data;
		// mankrip - end
		currententity->D_SpriteDrawSpans (); // mankrip
	}
}
I hate writing so many variations of the same code, by the way. If I ever implement 32-bit rendering, that file will almost double in size.
Ph'nglui mglw'nafh mankrip Hell's end wgah'nagl fhtagn.
==-=-=-=-=-=-=-=-=-=-==
Dev blog / Twitter / YouTube
qbism
Posts: 1236
Joined: Thu Nov 04, 2004 5:51 am

Re: Subdiv16 for sprites?

Post by qbism »

mankrip wrote:The idea is simple: For SPR_VP_PARALLEL SPR models (which is the case of all SPR models in vanilla Quake, as well as Engoo's model-based particles), the value of izi never changes, so we can remove its update code completely. This also allows us to bitshift its value in advance.
Here's a vanilla drop-in. [EDIT - includes both parallel and oriented types, with more optimized code for parallel sprites.] [EDIT 2 - pulled the variables out as statics, outside the function, for reuse among potential other versions of D_SpriteDrawSpans for effects.] [EDIT 3 - No need to calc z, zi, and izi every pass for parallel sprites.]

Code: Select all

/*
=====================
D_SpriteDrawSpans
=====================
*/
//qb: 'generic' version of subdiv16 sprites with code from mh and mankrip leilei post http://forums.inside3d.com/viewtopic.php?t=5268

#define PARALLELCHECK(i) { btemp = *(pbase + (s >> 16) + (t >> 16) * cachewidth); if (btemp != 255 && (pz[i] <= izi))  { pz[i] = izi; pdest[i] = btemp;} s+=sstep; t+=tstep;}
#define ORIENTEDCHECK(i) { btemp = *(pbase + (s >> 16) + (t >> 16) * cachewidth); if (btemp != 255 && pz[i] <= (izi >> 16)){ pz[i] = izi >> 16; pdest[i] = btemp;} s+=sstep; t+=tstep; izi+=izistep;}

void D_SpriteDrawSpans (sspan_t *pspan)
{

    sstep = 0;   // keep compiler happy
    tstep = 0;   // ditto

    pbase = cacheblock;

    sdivzstepu = d_sdivzstepu * 16;
    tdivzstepu = d_tdivzstepu * 16;
    zistepu = d_zistepu * 16;

    // we count on FP exceptions being turned off to avoid range problems
    izistep = (int)(d_zistepu * 0x8000 * 0x10000);

    psprite = currententity->model->cache.data;
    if (psprite->type == SPR_VP_PARALLEL || psprite->type == SPR_VP_PARALLEL_ORIENTED)
    {
        zi = d_ziorigin;   // the zi steps are zero for parallel sprites, and du/dv have not been set for this sprite yet
        z = (float)0x10000 / zi;   // prescale to 16.16 fixed-point
        // we count on FP exceptions being turned off to avoid range problems
        izi = (int) (zi * 0x8000 * 0x10000) >> 16;

        do
        {
            pdest = (byte *)d_viewbuffer + (screenwidth * pspan->v) + pspan->u;
            pz = d_pzbuffer + (d_zwidth * pspan->v) + pspan->u;

            count = pspan->count >> 4;
            spancount = pspan->count % 16;

            // calculate the initial s/z, t/z, 1/z, s, and t and clamp
            du = (float)pspan->u;
            dv = (float)pspan->v;

            sdivz = d_sdivzorigin + dv*d_sdivzstepv + du*d_sdivzstepu;
            tdivz = d_tdivzorigin + dv*d_tdivzstepv + du*d_tdivzstepu;

            s = (int)(sdivz * z) + sadjust;
            if (s > bbextents)
                s = bbextents;
            else if (s < 0)
                s = 0;

            t = (int)(tdivz * z) + tadjust;
            if (t > bbextentt)
                t = bbextentt;
            else if (t < 0)
                t = 0;

            while (count-- > 0) // Manoel Kasimier
            {

                sdivz += sdivzstepu;
                tdivz += tdivzstepu;

                snext = (int) (sdivz * z) + sadjust;
                if (snext > bbextents)
                    snext = bbextents;
                else if (snext < 16)
                    snext = 16;

                tnext = (int) (tdivz * z) + tadjust;
                if (tnext > bbextentt)
                    tnext = bbextentt;
                else if (tnext < 16)
                    tnext = 16;   // guard against round-off error on <0 steps

                sstep = (snext - s) >> 4;
                tstep = (tnext - t) >> 4;

                pdest += 16;
                pz += 16;

                PARALLELCHECK(-16);
                PARALLELCHECK(-15);
                PARALLELCHECK(-14);
                PARALLELCHECK(-13);
                PARALLELCHECK(-12);
                PARALLELCHECK(-11);
                PARALLELCHECK(-10);
                PARALLELCHECK(-9);
                PARALLELCHECK(-8);
                PARALLELCHECK(-7);
                PARALLELCHECK(-6);
                PARALLELCHECK(-5);
                PARALLELCHECK(-4);
                PARALLELCHECK(-3);
                PARALLELCHECK(-2);
                PARALLELCHECK(-1);
            }
            if (spancount > 0)
            {
                spancountminus1 = (float)(spancount - 1);
                sdivz += d_sdivzstepu * spancountminus1;
                tdivz += d_tdivzstepu * spancountminus1;

                snext = (int)(sdivz * z) + sadjust;
                if (snext > bbextents)
                    snext = bbextents;
                else if (snext < 16)
                    snext = 16;

                tnext = (int)(tdivz * z) + tadjust;
                if (tnext > bbextentt)
                    tnext = bbextentt;
                else if (tnext < 16)
                    tnext = 16;

                if (spancount > 1)
                {
                    sstep = (snext - s) / (spancount - 1);
                    tstep = (tnext - t) / (spancount - 1);
                }

                pdest += spancount;
                pz += spancount;
                switch (spancount)
                {
                case 16:
                    PARALLELCHECK(-16);
                case 15:
                    PARALLELCHECK(-15);
                case 14:
                    PARALLELCHECK(-14);
                case 13:
                    PARALLELCHECK(-13);
                case 12:
                    PARALLELCHECK(-12);
                case 11:
                    PARALLELCHECK(-11);
                case 10:
                    PARALLELCHECK(-10);
                case 9:
                    PARALLELCHECK(-9);
                case 8:
                    PARALLELCHECK(-8);
                case 7:
                    PARALLELCHECK(-7);
                case 6:
                    PARALLELCHECK(-6);
                case 5:
                    PARALLELCHECK(-5);
                case 4:
                    PARALLELCHECK(-4);
                case 3:
                    PARALLELCHECK(-3);
                case 2:
                    PARALLELCHECK(-2);
                case 1:
                    PARALLELCHECK(-1);
                    break;
                }
            }
            pspan++;
        }
        while (pspan->count != DS_SPAN_LIST_END);
    }
    else
    {
        do
        {
            pdest = (byte *)d_viewbuffer + (screenwidth * pspan->v) + pspan->u;
            pz = d_pzbuffer + (d_zwidth * pspan->v) + pspan->u;

            // Manoel Kasimier - begin
            count = pspan->count >> 4;
            spancount = pspan->count % 16;
            // Manoel Kasimier - end

            // calculate the initial s/z, t/z, 1/z, s, and t and clamp
            du = (float)pspan->u;
            dv = (float)pspan->v;

            sdivz = d_sdivzorigin + dv*d_sdivzstepv + du*d_sdivzstepu;
            tdivz = d_tdivzorigin + dv*d_tdivzstepv + du*d_tdivzstepu;
            zi = d_ziorigin + dv*d_zistepv + du*d_zistepu;
            z = (float)0x10000 / zi;   // prescale to 16.16 fixed-point
            // we count on FP exceptions being turned off to avoid range problems
            izi = (int)(zi * 0x8000 * 0x10000);

            s = (int)(sdivz * z) + sadjust;
            if (s > bbextents)
                s = bbextents;
            else if (s < 0)
                s = 0;

            t = (int)(tdivz * z) + tadjust;
            if (t > bbextentt)
                t = bbextentt;
            else if (t < 0)
                t = 0;

            while (count-- > 0) // Manoel Kasimier
            {
                // calculate s/z, t/z, zi->fixed s and t at far end of span,
                // calculate s and t steps across span by shifting
                //qb: ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - begin
                sdivz += sdivzstepu;
                tdivz += tdivzstepu;
                zi += zistepu;
                //qb: ( http://forums.inside3d.com/viewtopic.php?t=2717 ) - end
                z = (float)0x10000 / zi;   // prescale to 16.16 fixed-point

                snext = (int) (sdivz * z) + sadjust;
                if (snext > bbextents)
                    snext = bbextents;
                else if (snext < 16)
                    snext = 16;   // prevent round-off error on <0 steps causing overstepping & running off the edge of the texture

                tnext = (int) (tdivz * z) + tadjust;
                if (tnext > bbextentt)
                    tnext = bbextentt;
                else if (tnext < 16)
                    tnext = 16;   // guard against round-off error on <0 steps

                sstep = (snext - s) >> 4;
                tstep = (tnext - t) >> 4;

                pdest += 16;
                pz += 16;

                ORIENTEDCHECK(-16);
                ORIENTEDCHECK(-15);
                ORIENTEDCHECK(-14);
                ORIENTEDCHECK(-13);
                ORIENTEDCHECK(-12);
                ORIENTEDCHECK(-11);
                ORIENTEDCHECK(-10);
                ORIENTEDCHECK(-9);
                ORIENTEDCHECK(-8);
                ORIENTEDCHECK(-7);
                ORIENTEDCHECK(-6);
                ORIENTEDCHECK(-5);
                ORIENTEDCHECK(-4);
                ORIENTEDCHECK(-3);
                ORIENTEDCHECK(-2);
                ORIENTEDCHECK(-1);
            }
            if (spancount > 0)
            {
                // calculate s/z, t/z, zi->fixed s and t at last pixel in span (so can't step off polygon),
                // clamp, calculate s and t steps across span by division, biasing steps low so we don't run off the texture
                spancountminus1 = (float)(spancount - 1);
                sdivz += d_sdivzstepu * spancountminus1;
                tdivz += d_tdivzstepu * spancountminus1;
                zi += d_zistepu * spancountminus1;
                z = (float)0x10000 / zi;   // prescale to 16.16 fixed-point
                snext = (int)(sdivz * z) + sadjust;
                if (snext > bbextents)
                    snext = bbextents;
                else if (snext < 16)
                    snext = 16;   // prevent round-off error on <0 steps from causing overstepping & running off the edge of the texture

                tnext = (int)(tdivz * z) + tadjust;
                if (tnext > bbextentt)
                    tnext = bbextentt;
                else if (tnext < 16)
                    tnext = 16;   // guard against round-off error on <0 steps

                if (spancount > 1)
                {
                    sstep = (snext - s) / (spancount - 1);
                    tstep = (tnext - t) / (spancount - 1);
                }

                pdest += spancount;
                pz += spancount;
                switch (spancount)
                {
                case 16:
                    ORIENTEDCHECK(-16);
                case 15:
                    ORIENTEDCHECK(-15);
                case 14:
                    ORIENTEDCHECK(-14);
                case 13:
                    ORIENTEDCHECK(-13);
                case 12:
                    ORIENTEDCHECK(-12);
                case 11:
                    ORIENTEDCHECK(-11);
                case 10:
                    ORIENTEDCHECK(-10);
                case 9:
                    ORIENTEDCHECK(-9);
                case 8:
                    ORIENTEDCHECK(-8);
                case 7:
                    ORIENTEDCHECK(-7);
                case 6:
                    ORIENTEDCHECK(-6);
                case 5:
                    ORIENTEDCHECK(-5);
                case 4:
                    ORIENTEDCHECK(-4);
                case 3:
                    ORIENTEDCHECK(-3);
                case 2:
                    ORIENTEDCHECK(-2);
                case 1:
                    ORIENTEDCHECK(-1);
                    break;
                }
            }
            pspan++;
        }
        while (pspan->count != DS_SPAN_LIST_END);
    }
}
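For anyone skimming the function above, here is a minimal sketch of just the subdivision idea, for the s coordinate only. It is not taken from any engine: subdiv16_coords and its parameters are hypothetical names, there is no clamping against bbextents, and the step is computed with a plain division where the code above uses >> 4 for full runs. The point is that the float divide happens once per run of up to 16 pixels, and the pixels in between are stepped linearly in 16.16 fixed point.

```c
typedef int fixed16_t; /* 16.16 fixed point, same idea as the engine's type */

/* Hypothetical helper: writes one 16.16 texture coordinate per pixel into out[].
   sdivz / zi are s/z and 1/z at the first pixel, sdivzstep / zistep their change per pixel. */
static void subdiv16_coords (fixed16_t *out, int count, float sdivz, float sdivzstep, float zi, float zistep)
{
	fixed16_t s = (fixed16_t) (sdivz * (65536.0f / zi));
	int done = 0;

	while (done < count)
	{
		int left = count - done;
		int run = (left > 16) ? 16 : left;
		/* full runs aim at the first pixel of the next run,
		   the last run aims at its own last pixel */
		int steps = (left > 16) ? 16 : run - 1;
		fixed16_t snext, sstep = 0;
		int i;

		sdivz += sdivzstep * (float)steps;
		zi += zistep * (float)steps;
		snext = (fixed16_t) (sdivz * (65536.0f / zi)); /* the only divide for this run */

		if (steps > 0)
			sstep = (snext - s) / steps;

		for (i = 0; i < run; i++)
		{
			out[done + i] = s;
			s += sstep;
		}

		s = snext; /* restart from the exact value so rounding does not accumulate */
		done += run;
	}
}
```

Calling it with zistep set to 0 mimics the parallel case: the mapping is linear, so the interpolated values match the exact ones, e.g. subdiv16_coords (out, 20, 0.0f, 0.5f, 1.0f, 0.0f) advances by 0x8000 per pixel across both the full run and the 4-pixel tail.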
Last edited by qbism on Fri Sep 20, 2013 2:43 am, edited 4 times in total.
mankrip
Posts: 924
Joined: Fri Jul 04, 2008 3:02 am

Re: Subdiv16 for sprites?

Post by mankrip »

qbism wrote:
mankrip wrote:The idea is simple: For SPR_VP_PARALLEL SPR models (which is the case of all SPR models in vanilla Quake, as well as Engoo's model-based particles), the value of izi never changes, so we can remove its update code completely. This also allows us to bitshift its value in advance.
Here's a vanilla drop-in.
[...]

Oriented sprites would need to keep updating izi like this in a vanilla unroll
[...]
I meant the vanilla game data ( /id1/PAK0.PAK & /id1/PAK1.PAK ), not the engine.

And the if (psprite->type == SPR_VP_PARALLEL) checks in my code ensure that it still works for oriented sprites. Each function has been internally duplicated - one path with optimizations for SPR_VP_PARALLEL, and one path without.
By the way, you could just replace that check with an if (!(izistep || d_zistepu || d_zistepv)) instead, since for SPR_VP_PARALLEL models they're zero. A simple if (!izistep) could probably do, also.
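A minimal sketch of that check as a stand-alone helper, in case it helps to see it isolated. The function name sprite_depth_is_constant is made up for illustration, and the three parameters stand in for the engine globals of the same names.

```c
/* Hypothetical helper: returns 1 when 1/z does not change across the sprite,
   so the cheaper constant-depth path can be used, and 0 otherwise. */
static int sprite_depth_is_constant (int izistep, float d_zistepu, float d_zistepv)
{
	return !(izistep || d_zistepu || d_zistepv);
}
```

Note that in the code posted earlier izistep is computed from d_zistepu alone, so the shorter if (!izistep) form would not notice a purely vertical depth gradient. That is presumably why it is only a "probably".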

I've also noticed that this part of my code

Code: Select all

	// mankrip - begin
	if (psprite->type == SPR_VP_PARALLEL)
	{
		zi = d_ziorigin + (float) (pspan->v) * d_zistepv + (float) (pspan->u) * d_zistepu;
		z = (float)0x10000 / zi; // prescale to 16.16 fixed-point
		// we count on FP exceptions being turned off to avoid range problems
		izi = (int) (zi * 0x8000 * 0x10000) >> 16;
		#undef IZI
		#define IZI izi
	// mankrip - end
... can be simplified like this:

Code: Select all

	// mankrip - begin
	if (psprite->type == SPR_VP_PARALLEL)
	{
		z = (float)0x10000 / d_ziorigin; // prescale to 16.16 fixed-point
		// we count on FP exceptions being turned off to avoid range problems
		izi = (int) (d_ziorigin * 0x8000 * 0x10000) >> 16;
		#undef IZI
		#define IZI izi
	// mankrip - end
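To illustrate why the simplification is safe, here is a small sketch with the two steps isolated. span_zi and zi_to_izi are hypothetical names, not engine functions. They only assume what is stated above: that both zi steps are zero for SPR_VP_PARALLEL models.

```c
/* Hypothetical stand-in for the per-span 1/z setup. */
static float span_zi (float ziorigin, float zistepu, float zistepv, int u, int v)
{
	return ziorigin + (float)v * zistepv + (float)u * zistepu;
}

/* Same conversion as in the snippet: 1/z scaled by 2^31, keeping the top 16 bits.
   Only meaningful for 0 <= zi < 1.0, since larger values do not fit in an int. */
static int zi_to_izi (float zi)
{
	return (int) (zi * 0x8000 * 0x10000) >> 16;
}
```

With both steps at zero, span_zi returns ziorigin for every u and v, so the value written to the z-buffer is the same for the whole sprite and can be computed once from d_ziorigin.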
qbism
Posts: 1236
Joined: Thu Nov 04, 2004 5:51 am

Re: Subdiv16 for sprites?

Post by qbism »

mankrip wrote:I meant the vanilla game data ( /id1/PAK0.PAK & /id1/PAK1.PAK ), not the engine.
And the if (psprite->type == SPR_VP_PARALLEL) checks in my code ensure that it still works for oriented sprites. Each function has been internally duplicated - one path with optimizations for SPR_VP_PARALLEL, and one path without.
Good point, and that's a better idea than two separate functions. I plan to edit the vanilla function accordingly, so that the term 'drop-in' will be more accurate.
qbism
Posts: 1236
Joined: Thu Nov 04, 2004 5:51 am

Re: Subdiv16 for sprites?

Post by qbism »

OK, the vanilla drop-in code in the previous post has been edited and improved.

SPR_VP_PARALLEL_ORIENTED can be added to the check for the more optimized path.

Code: Select all

...
if (psprite->type == SPR_VP_PARALLEL || psprite->type == SPR_VP_PARALLEL_ORIENTED)
...
mankrip
Posts: 924
Joined: Fri Jul 04, 2008 3:02 am

Re: Subdiv16 for sprites?

Post by mankrip »

For a number of reasons, your version is still slower.
qbism
Posts: 1236
Joined: Thu Nov 04, 2004 5:51 am
Contact:

Re: Subdiv16 for sprites?

Post by qbism »

In a certain stress case, this:

Code: Select all

#define PARALLELCHECK(i) { btemp = *(pbase + (s >> 16) + (t >> 16) * cachewidth); if (btemp != 255 && (pz[i] <= IZI))  { pz[i] = IZI; pdest[i] = btemp;} s+=sstep; t+=tstep;}
#define ORIENTEDCHECK(i) { btemp = *(pbase + (s >> 16) + (t >> 16) * cachewidth); if (btemp != 255 && pz[i] <= (izi >> 16)){ pz[i] = izi >> 16; pdest[i] = btemp;} s+=sstep; t+=tstep; izi+=izistep;}
turns out to be faster than this:

Code: Select all

#define PARALLELCHECK(i) {if (pz[i] <= IZI){ btemp = *(pbase + (s >> 16) + (t >> 16) * cachewidth); if (btemp != 255)  { pz[i] = IZI; pdest[i] = btemp;}} s+=sstep; t+=tstep;}
#define ORIENTEDCHECK(i) {if (pz[i] <= (izi >> 16)){ btemp = *(pbase + (s >> 16) + (t >> 16) * cachewidth); if (btemp != 255){ pz[i] = izi >> 16; pdest[i] = btemp;}} s+=sstep; t+=tstep; izi+=izistep;}
Besides pulling the variables out as statics, it's the only change that made a dent in the fps counter. It may depend on the compiler and settings used, and on the test case. In this instance the test case was a savegame located next to a steam jet in Rubicon 2 map 2 (rub2m2), with many overlapping sprites filling the screen.
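For reference, the two orderings written out as plain functions instead of macros. This is only a sketch: put_texel_first, put_depth_first and TRANSPARENT_INDEX are names invented here, and it says nothing about speed. It only shows that both variants write the same pixels, and that the difference is whether the texture read happens before or after the depth test.

```c
#define TRANSPARENT_INDEX 255 /* colour key, as in the macros above */

/* Texel first: the texture is always read, then colour key and depth are tested together. */
static void put_texel_first (unsigned char *pdest, short *pz, int i,
                             const unsigned char *pbase, int texel, short izi)
{
	unsigned char btemp = pbase[texel];

	if (btemp != TRANSPARENT_INDEX && pz[i] <= izi)
	{
		pz[i] = izi;
		pdest[i] = btemp;
	}
}

/* Depth first: the texture is only read for pixels that pass the depth test. */
static void put_depth_first (unsigned char *pdest, short *pz, int i,
                             const unsigned char *pbase, int texel, short izi)
{
	if (pz[i] <= izi)
	{
		unsigned char btemp = pbase[texel];

		if (btemp != TRANSPARENT_INDEX)
		{
			pz[i] = izi;
			pdest[i] = btemp;
		}
	}
}
```

Which one wins would have to be measured per compiler and per scene, as noted above.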
leileilol
Posts: 2783
Joined: Fri Oct 15, 2004 3:23 am

Re: Subdiv16 for sprites?

Post by leileilol »

I just realized now I never macro and I should macro...
i should not be here