MH's Direct3D 8.1 Wrapper

r00k
Posts: 1111
Joined: Sat Nov 13, 2004 10:39 pm

Post by r00k »

I tried briefly to add the wrapper to Qrack, but things like the vertex arrays gave me either a straight crash or weird results.
Basically, if I get ~500fps on my machine I'm happy enough :D
mh
Posts: 2292
Joined: Sat Jan 12, 2008 1:38 am

Post by mh »

Me too. I had a halfway-implemented one which needed lotsa stuff commented out just to run. I reckon Qrack would benefit more from a full native port if you were going to go down that route, or at least from native D3D versions of some Qrack-specific rendering functions.
We had the power, we had the space, we had a sense of time and place
We knew the words, we knew the score, we knew what we were fighting for
revelator
Posts: 2621
Joined: Thu Jan 24, 2008 12:04 pm
Location: inside tha debugger

Post by revelator »

separate rendering DLLs as per Quake 2 (I expect my crucifixion to commence soon for even mentioning that) :P

but it's an idea nevertheless :)
mh

Post by mh »

I've actually played around with the idea of writing a ref_d3d.dll for Quake II once upon a time. I like the idea of being able to swap around renderers dynamically like that, but in reality the only differences would be the API-specific code and there's probably better mileage in abstracting just that portion of it. So instead of, for example, exporting R_DrawWorld you'd export something like R_DrawSurface (float *verts).
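A rough sketch of what abstracting just the API-specific portion could look like in C: the shared renderer calls through a table of function pointers, and each API supplies its own table. Everything here is hypothetical (`render_backend_t`, `R_SetBackend`, the vertex-count parameter, and the stub backends that only count vertices); none of it comes from Quake II or the wrapper.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical backend interface: only the API-specific calls live here. */
typedef struct render_backend_s {
    const char *name;
    void (*DrawSurface)(const float *verts, size_t numverts);
} render_backend_t;

/* Stub backends for illustration; real ones would issue GL or D3D calls. */
size_t gl_verts_drawn, d3d_verts_drawn;

void GL_DrawSurface(const float *verts, size_t numverts)
{
    (void)verts;
    gl_verts_drawn += numverts;
}

void D3D_DrawSurface(const float *verts, size_t numverts)
{
    (void)verts;
    d3d_verts_drawn += numverts;
}

const render_backend_t backend_gl = { "gl", GL_DrawSurface };
const render_backend_t backend_d3d = { "d3d", D3D_DrawSurface };

const render_backend_t *r_backend = &backend_gl;

/* Swap the active API at runtime. */
void R_SetBackend(const render_backend_t *backend)
{
    assert(backend != NULL);
    r_backend = backend;
}

/* Engine-side code: identical no matter which API is active. */
void R_DrawSurface(const float *verts, size_t numverts)
{
    r_backend->DrawSurface(verts, numverts);
}
```

The world traversal, culling and sorting would stay in shared code above `R_DrawSurface`; only the table changes per API.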
Spike
Posts: 2914
Joined: Fri Nov 05, 2004 3:12 am
Location: UK

Post by Spike »

q2's renderer modules contain a lot of duplicated code, much of which is copy+pasta. D3D and GL have a lot in common - they both accept triangles + verts + textures... Quake2's software renderer accepts spans, however, which is why they attempted to abstract at a higher level.
I don't personally see the harm in using a few #ifdefs in the current ref_gl to build both ref_gl and ref_d3d. The user doesn't notice/care, and it reduces actual lines of code. Would need to abstract inside, but at least it would work with any q2 engine that still uses the same api, although I guess few actually still do.

But hey mh, if you ever run out of ideas to play with, you could always play with FTE's d3d backend. :)
mh

Post by mh »

I suspect that most Q2 ports have by now incorporated the refresh DLL into the base engine code, but there may still be a market for such an idea in people who just want to run stock Q2 on lower-end or integrated hardware.

The biggest pain would likely be in removing all the qgl stuff, which existed for no reason other than to be able to separate 3DFX OpenGL from Default OpenGL (and for logging, but you'd use GLIntercept for that nowadays). That's just messy grunt work with nothing concrete at the end of it.

surf->polys->verts[0] can be used directly as a parameter to DrawPrimitiveUP which is kinda neat and would hugely simplify the surface refresh. It could be moved to Vertex Buffers as a later exercise (I'd only bother if it was established that using -UP was a performance problem with stock Q2 maps though), particularly if you switch to shaders and do all the texcoord manipulation there. It's easy to convert surface vertex layouts from a poly/trifan to a tristrip which might give better performance on some hardware.
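One way the trifan-to-tristrip conversion could be sketched, assuming the surface polygon is convex and its vertices are stored in fan order 0..n-1. The strip order is then 0, 1, n-1, 2, n-2, and so on, zig-zagging across the polygon. `R_FanToStripOrder` is a made-up name, and this only produces the index ordering; it does not touch Q2's actual vertex layout.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Writes the strip ordering for a convex polygon whose vertices are stored
 * in fan order 0..n-1. The output is 0, 1, n-1, 2, n-2, ... (n indices).
 * Returns the number of indices written.
 */
size_t R_FanToStripOrder(size_t n, unsigned short *out)
{
    size_t lo = 1, hi, count = 0;

    assert(n >= 3 && out != NULL);
    hi = n - 1;

    out[count++] = 0;
    while (lo <= hi) {
        out[count++] = (unsigned short)lo++;     /* next vertex from the front */
        if (lo <= hi)
            out[count++] = (unsigned short)hi--; /* next vertex from the back */
    }
    return count;
}
```

For a quad this gives 0, 1, 3, 2, i.e. triangles (0,1,3) and (1,3,2), which cover the same area as the fan and keep a consistent facing once the strip's alternating winding is taken into account.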

Lightmap updating in Q2 is total cack; even worse than Q1. That needs a total overhaul. Instead of updating changes as they happen you need to accumulate changes and update them in bulk once only per frame. D3D is neater and cleaner than OpenGL here as you can LockRect a D3DPOOL_MANAGED texture (with D3DLOCK_NO_DIRTY_UPDATE), pass pBits + offset into R_BuildLightmap, then UnlockRect it and AddDirtyRect with the rectangle that's actually changed. So you just need to do a bunch of AddDirtyRect calls at the end of the current frame or the start of the next for any lightmap that's been modified. It does mean that updated lightmaps will lag 1 frame behind, but I'd defy anyone to notice.
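The accumulate-then-flush idea can be sketched without any D3D at all. This version keeps one bounding rectangle per lightmap, which is a simplification I'm assuming rather than anything stated above (you could equally keep a list of rectangles). `lightmap_t`, `LM_MarkDirty`, `LM_FlushDirty` and the stub upload are hypothetical names; in real code the upload callback is where the AddDirtyRect call would go.

```c
#include <assert.h>

/* Rectangle with exclusive right/bottom edges, like a Win32 RECT. */
typedef struct { int left, top, right, bottom; } lm_rect_t;

typedef struct {
    int modified;     /* set when any region changed this frame */
    lm_rect_t dirty;  /* bounding box of all changed regions */
} lightmap_t;

/* Stand-in for the real upload; it only records what it was given. */
int lm_last_index = -1;
lm_rect_t lm_last_rect;

void LM_StubUpload(int index, const lm_rect_t *rect)
{
    lm_last_index = index;
    lm_last_rect = *rect;
}

/* Grow the lightmap's dirty rectangle to include a changed w x h block. */
void LM_MarkDirty(lightmap_t *lm, int x, int y, int w, int h)
{
    assert(w > 0 && h > 0);
    if (!lm->modified) {
        lm->dirty.left = x;
        lm->dirty.top = y;
        lm->dirty.right = x + w;
        lm->dirty.bottom = y + h;
        lm->modified = 1;
        return;
    }
    if (x < lm->dirty.left) lm->dirty.left = x;
    if (y < lm->dirty.top) lm->dirty.top = y;
    if (x + w > lm->dirty.right) lm->dirty.right = x + w;
    if (y + h > lm->dirty.bottom) lm->dirty.bottom = y + h;
}

/* Once per frame: upload each modified lightmap exactly once.
   Returns the number of uploads issued. */
int LM_FlushDirty(lightmap_t *lms, int count,
                  void (*upload)(int index, const lm_rect_t *rect))
{
    int i, uploads = 0;

    for (i = 0; i < count; i++) {
        if (!lms[i].modified)
            continue;
        upload(i, &lms[i].dirty);
        lms[i].modified = 0;
        uploads++;
    }
    return uploads;
}
```

However many surfaces touch a lightmap in a frame, it costs one upload, which is the point of the overhaul.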

The MD2 renderer in Q2 is hellishly messy. I've experimented in DirectQ with just loading all vertexes into a Vertex Buffer for MDLs and using stream offset to define the two frames to interpolate between (with interpolation being done in a vertex shader). VRAM usage is typically around the 1-2 MB mark (never seen it top 5 or so), but then DirectQ does compress vertex data down to 8 bytes and remove duplicate vertexes, so MDLs in DirectQ end up maybe 10% to 25% of the size of the original data. The concept could easily transfer across, and would be cleaner in some ways as you wouldn't have to mess around with cache memory in Q2. Otherwise just have a big enough system memory array, transfer the data in and DrawIndexedPrimitiveUP it. It won't be lightning fast but it will be fast enough.
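As a small illustration of the stream-offset idea: if every frame's vertexes sit back to back in one buffer, picking the two frames to blend is just arithmetic. The 8-byte stride is taken from the description above; the function names are made up, and the lerp is only a CPU reference for what a vertex shader would do per component, not DirectQ's actual code.

```c
#include <assert.h>
#include <stddef.h>

#define MDL_VERT_STRIDE 8 /* bytes per compressed vertex, as described above */

/* Byte offset of a frame's first vertex when all frames are stored
   back to back in a single vertex buffer. */
size_t MDL_FrameOffset(size_t frame, size_t numverts)
{
    return frame * numverts * MDL_VERT_STRIDE;
}

/* CPU reference for the per-component blend a vertex shader would do. */
float MDL_LerpComponent(float from, float to, float blend)
{
    assert(blend >= 0.0f && blend <= 1.0f);
    return from + (to - from) * blend;
}
```

The two offsets (current frame and previous frame) would be handed to the API as stream offsets, with the blend factor passed as a shader constant.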

2D drawing needs serious work. Just drawing all the console text as individual quads (or trifans) with one draw call per character can drop framerates down to single digits on some systems. It needs an intermediate layer to batch things up and flush batches on a state change or at the end of a frame. ID3DXSprite can do all of this automatically for you, and it's also viable for use with particles (and sprites, of course). It can be a little slow though as it AddRefs the texture, which for some reason takes far far more CPU cycles than it should (I suspect that the runtime is doing more than just incrementing a reference counter here). But so long as you're not doing too many texture changes (which you're not with the 2D stuff and particles/sprites) it's good enough.
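If you'd rather roll the intermediate layer yourself instead of using ID3DXSprite, a minimal sketch could look like this. Quads are queued and only flushed when the texture changes, the batch fills up, or the frame ends. All names (`quad_batch_t`, `Batch_AddQuad`, `Batch_Flush`) and the integer texture handle are assumptions for illustration; the flush just counts draw calls where a real renderer would submit the queued vertexes.

```c
#include <assert.h>

#define BATCH_MAX_QUADS 256

typedef struct { float x, y, w, h; } quad_t;

typedef struct {
    int texture;   /* hypothetical texture handle for the queued quads */
    int numquads;  /* quads waiting to be drawn */
    int flushes;   /* draw calls issued so far */
    quad_t quads[BATCH_MAX_QUADS];
} quad_batch_t;

/* Submit everything queued so far as a single draw call. */
void Batch_Flush(quad_batch_t *b)
{
    if (b->numquads == 0)
        return;
    /* A real renderer would draw all queued quads in one call here. */
    b->flushes++;
    b->numquads = 0;
}

/* Queue a quad, flushing first if the texture changes or the batch is full. */
void Batch_AddQuad(quad_batch_t *b, int texture,
                   float x, float y, float w, float h)
{
    quad_t *q;

    assert(b != 0);
    if (b->numquads > 0 &&
        (texture != b->texture || b->numquads == BATCH_MAX_QUADS))
        Batch_Flush(b);

    b->texture = texture;
    q = &b->quads[b->numquads++];
    q->x = x; q->y = y; q->w = w; q->h = h;
}
```

A full line of console text then costs one draw call instead of one per character, as long as all the characters come from the same charset texture.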

Hmmmm, ideas ideas.
mh

Post by mh »

Some more notes.

The easiest way to load a texture is to just allocate a buffer of width * height * 4 + 18, expand to 32 bit into &buffer[18], fill the first 18 bytes of the buffer with a TGA header, and pass into D3DXCreateTextureFromFileInMemoryEx. Alternatively build a BMP file in memory and specify a palette, although the BMP format is slightly more complex (and needs crap like row padding).
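A sketch of the TGA-in-memory trick, assuming an 8-bit source image and a 768-byte RGB palette. `Tex_BuildTGA` is a hypothetical helper; it uses malloc purely to stay self-contained, and it writes every pixel as opaque, so special handling of a transparent palette index is left out. The returned buffer is what you would then hand to D3DXCreateTextureFromFileInMemoryEx.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TGA_HEADER_SIZE 18

/*
 * Builds an uncompressed 32-bit TGA in memory from 8-bit paletted pixels.
 * palette_rgb holds 256 RGB triplets. The caller frees the result.
 */
unsigned char *Tex_BuildTGA(const unsigned char *pixels8, int width, int height,
                            const unsigned char *palette_rgb, size_t *out_size)
{
    size_t numpixels, size, i;
    unsigned char *buffer, *dst;

    assert(width > 0 && width <= 0xFFFF && height > 0 && height <= 0xFFFF);
    numpixels = (size_t)width * (size_t)height;
    size = numpixels * 4 + TGA_HEADER_SIZE;

    buffer = malloc(size);
    if (!buffer)
        return NULL;

    memset(buffer, 0, TGA_HEADER_SIZE);
    buffer[2] = 2;                               /* uncompressed true-colour */
    buffer[12] = (unsigned char)(width & 0xFF);  /* width, little-endian */
    buffer[13] = (unsigned char)(width >> 8);
    buffer[14] = (unsigned char)(height & 0xFF); /* height, little-endian */
    buffer[15] = (unsigned char)(height >> 8);
    buffer[16] = 32;                             /* bits per pixel */
    buffer[17] = 0x28;                           /* 8 alpha bits, top-left origin */

    /* Expand to 32 bit right after the header; TGA stores pixels as BGRA. */
    dst = &buffer[TGA_HEADER_SIZE];
    for (i = 0; i < numpixels; i++, dst += 4) {
        const unsigned char *rgb = &palette_rgb[pixels8[i] * 3];
        dst[0] = rgb[2];
        dst[1] = rgb[1];
        dst[2] = rgb[0];
        dst[3] = 255;
    }

    *out_size = size;
    return buffer;
}
```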

Q2 really needs a separate "utility Hunk" for memory allocations like this, so just create one. Use VirtualAlloc, specify a maximum size of 32 or 64 MB, and you'll only ever use as much memory as you actually need but have plenty of headroom nonetheless.

The palette needs to be switched around to BGRA for texture loading in D3D.

Use D3DTOP_SELECTARG1 for GL_REPLACE instead of modulating with a default colour of white.

Use D3DTOP_MODULATE, D3DTOP_MODULATE2X or D3DTOP_MODULATE4X based on the value of the intensity cvar instead of lightscaling textures.

Use D3DSAMP_MIPMAPLODBIAS or D3DSAMP_MAXMIPLEVEL instead of scaling a texture by gl_picmip at load time.
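To make the texture-stage notes above concrete, here is one possible mapping from the GL texture environment and the intensity cvar to a colour operation. The `texop_t` enum is a local stand-in for the D3DTOP_* constants so the snippet builds without the D3D headers, and the thresholds (round intensity down to 1x, 2x or 4x) are my assumption, not something the post specifies.

```c
/* Local stand-ins for D3DTOP_SELECTARG1, D3DTOP_MODULATE,
   D3DTOP_MODULATE2X and D3DTOP_MODULATE4X. */
typedef enum {
    TEXOP_SELECTARG1,
    TEXOP_MODULATE,
    TEXOP_MODULATE2X,
    TEXOP_MODULATE4X
} texop_t;

/* Pick the modulate variant that replaces lightscaling at load time.
   Assumed policy: intensity is rounded down to 1x, 2x or 4x. */
texop_t R_TexOpForIntensity(float intensity)
{
    if (intensity >= 4.0f)
        return TEXOP_MODULATE4X;
    if (intensity >= 2.0f)
        return TEXOP_MODULATE2X;
    return TEXOP_MODULATE;
}

/* GL_REPLACE maps to SELECTARG1; anything else modulates. */
texop_t R_TexOpForEnv(int gl_replace, float intensity)
{
    return gl_replace ? TEXOP_SELECTARG1 : R_TexOpForIntensity(intensity);
}
```

The result would be set as the stage's colour op, so textures stay untouched in memory and changing intensity no longer needs a reload.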

Not quite sure what the best way to handle Draw_StretchRaw is. Probably CreateOffscreenPlainSurface and StretchRect is worth a try. Cache the surface and the dimensions that it was created at so that you only need to recreate it if the dimensions change. You could create a lockable backbuffer and write directly to it, but that might impact performance in the more general case when you're not showing a .cin file.
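The caching part of that is simple enough to sketch on its own. `raw_surface_cache_t` and `Raw_EnsureSurface` are hypothetical; the comment marks where the old surface would be released and CreateOffscreenPlainSurface called, and the `creates` counter exists only so the behaviour can be checked.

```c
typedef struct {
    int valid;          /* non-zero once a surface exists */
    int width, height;  /* dimensions the surface was created at */
    int creates;        /* how many times the surface was (re)created */
} raw_surface_cache_t;

/* Make sure a surface of the requested size exists.
   Returns 1 if it had to be (re)created, 0 if the cached one was reused. */
int Raw_EnsureSurface(raw_surface_cache_t *cache, int width, int height)
{
    if (cache->valid && cache->width == width && cache->height == height)
        return 0;

    /* Real code would release the old surface here and create a new
       offscreen plain surface at width x height. */
    cache->width = width;
    cache->height = height;
    cache->valid = 1;
    cache->creates++;
    return 1;
}
```

Since a .cin keeps the same dimensions for its whole run, this should mean one creation per cinematic rather than one per frame.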

D3D code needs you to correct the half-pixel offset for 2D GUI rendering otherwise things look really fuzzy and horrible.
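A minimal sketch of the half-pixel correction, assuming pre-transformed (screen-space) vertexes and the D3D8/9 convention where pixel centres sit at integer coordinates: shifting every position by -0.5 lines texel centres up with pixel centres. `vert2d_t` and `Draw_SetQuad` are made-up names, and the vertex order is a trifan (top-left, top-right, bottom-right, bottom-left).

```c
typedef struct {
    float x, y, z, rhw; /* pre-transformed screen position */
    float u, v;         /* texture coordinates */
} vert2d_t;

/* Fill a 4-vertex trifan for a screen-space quad, applying the
   half-pixel offset so texels map 1:1 onto pixels. */
void Draw_SetQuad(vert2d_t v[4], float x, float y, float w, float h)
{
    const float ofs = -0.5f;
    const float x0 = x + ofs, y0 = y + ofs;
    const float x1 = x + w + ofs, y1 = y + h + ofs;
    int i;

    v[0].x = x0; v[0].y = y0; v[0].u = 0.0f; v[0].v = 0.0f; /* top-left */
    v[1].x = x1; v[1].y = y0; v[1].u = 1.0f; v[1].v = 0.0f; /* top-right */
    v[2].x = x1; v[2].y = y1; v[2].u = 1.0f; v[2].v = 1.0f; /* bottom-right */
    v[3].x = x0; v[3].y = y1; v[3].u = 0.0f; v[3].v = 1.0f; /* bottom-left */

    for (i = 0; i < 4; i++) {
        v[i].z = 0.0f;
        v[i].rhw = 1.0f;
    }
}
```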

Don't bother with a matrix stack; it's much easier to just load matrixes directly as needed by SetTransform. Storing a D3DMATRIX in each entity_t is damn useful.
revelator

Post by revelator »

could probably get around those nasty #ifdefs by unifying the render api

of course it would be something like the q that quake puts in front of everything to get around already-defined standards, so qrendervertex, qrenderquad, qrendertriangle etc.

only problem i could foresee with doing it that way would be things the different apis don't have the exact same functions for (vbos?).

i kinda like the dynamic render switching but i absolutely hated the client dll structure of quake2, yuuuuck.
mh

Post by mh »

Some of Quake II's OpenGL code will just not work well with D3D at all. OpenGL puts more of an abstraction layer in front of the hardware (this is neither good nor bad, it's just OpenGL philosophy), meaning that it tends to shield you from many of the messier details; D3D tends to shove the ugliness of low-level stuff in your face and forces you to deal with it yourself. There are advantages and disadvantages to both approaches, but code written to suit one does not tend to work well with the other.

D3D is extremely sensitive to number of draw calls. You absolutely must keep these down as low as you can get, so right from the very start you're looking to batch things up as much as possible. That's one reason why I suggested ID3DXSprite for the 2D stuff (and particles/sprites) - it will do the batching for you, meaning that a lot of ugly code you need to write just goes away. This means that you need to rewrite a good chunk of the 2D renderer though.

The MD2 renderer as it stands won't cut it either. The models need to go into vertex buffers and they need to be indexed in order to get good performance without hurting resource allocation. That means a total rewrite of not only the renderer but also the in-memory format.

The surface renderer is just shit. It's the worst of the lot almost; dynamic light updates are totally inefficient for D3D and there is no batching done at all. Everything needs to be put into texture chains and drawn from those, with each surface in a texture chain being batched together. That's another total rewrite. Updating lightmaps as they pass is crazy; OpenGL already makes you suffer for that on some systems, and D3D will make you suffer on all systems. Another rewrite.
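For anyone who hasn't seen texture chains, a stripped-down sketch: surfaces are linked onto a per-texture list as they are found visible, then each list is drawn as one batch. The struct layouts and function names here are simplified stand-ins, not Q2's real `msurface_t` or `image_t`, and the draw step only counts batches and surfaces.

```c
#include <assert.h>
#include <stddef.h>

typedef struct surface_s {
    int texnum;              /* index of the texture this surface uses */
    struct surface_s *chain; /* next surface using the same texture */
} surface_t;

typedef struct {
    surface_t *chain;        /* head of this texture's surface list */
} texture_t;

/* Called for each visible surface while walking the world. */
void R_ChainSurface(texture_t *textures, surface_t *surf)
{
    surf->chain = textures[surf->texnum].chain;
    textures[surf->texnum].chain = surf;
}

/* Draws every chain as one batch and clears it for the next frame.
   Returns the number of batches; *surfs_drawn receives the surface count. */
int R_DrawTextureChains(texture_t *textures, int numtextures, int *surfs_drawn)
{
    int i, batches = 0;
    surface_t *surf;

    assert(surfs_drawn != NULL);
    *surfs_drawn = 0;

    for (i = 0; i < numtextures; i++) {
        if (!textures[i].chain)
            continue;
        for (surf = textures[i].chain; surf; surf = surf->chain)
            (*surfs_drawn)++;    /* real code appends the verts to the batch */
        batches++;               /* one texture change, one draw call */
        textures[i].chain = NULL;
    }
    return batches;
}
```

Draw calls then scale with the number of visible textures rather than the number of visible surfaces.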

In the texture loader OpenGL lets you load all kinds of exciting formats that don't actually exist in hardware like GL_RGB or GL_RGBA, and it will silently convert them to BGRA at load time for you. D3D doesn't; it's BGRA all the way baby and forget about anything else (except stuff like luminance of course, but Q2 doesn't use that). More changes.
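The conversion itself is tiny if the data is already 32 bit: swap the red and blue bytes of each pixel in place. `Tex_SwapRGBAtoBGRA` is just an illustrative name.

```c
#include <assert.h>
#include <stddef.h>

/* Swap the R and B bytes of each 4-byte pixel in place (RGBA <-> BGRA). */
void Tex_SwapRGBAtoBGRA(unsigned char *data, size_t numpixels)
{
    size_t i;

    assert(data != NULL || numpixels == 0);
    for (i = 0; i < numpixels; i++, data += 4) {
        unsigned char tmp = data[0];
        data[0] = data[2];
        data[2] = tmp;
    }
}
```

Running it twice gets you back to the original order, so the same routine works in both directions.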

So abstracting the current Q2 renderer to support both APIs is going to result in something that looks ugly and performs badly. I worked around some of that in the D3D8 wrapper by making some attempts at batching stuff up, converting data to BGRA, etc, but the basic structure of the renderer prevented it from having full effect.

It's worth noting though that OpenGL actually does support the kind of code that D3D likes, and that this is the highest performing kind of code you can write with OpenGL. I guess it bypasses a lot of the abstraction and conversion layers in the driver and goes more directly (pun intended) to the hardware. So if you wanted to have the same renderer with a bunch of #ifdefs you would need to port your OpenGL code to this kind of code first.
revelator

Post by revelator »

what i thought about was more of a general wrapper which both architectures could share for common calls. it would probably end up being rather advanced though, to accommodate the pitfalls you describe, so i agree that might not be what people would look for :)

could be interesting though.
Baker
Posts: 3666
Joined: Tue Mar 14, 2006 5:15 am

Post by Baker »

I am just about done with a nifty little consolidated project.

But I notice I need to kick-start the wrapper before it will start to render.

Now this application of the wrapper isn't Quake, but is in some ways a minimal shell of Quake with many parts of the engine removed.

For some reason ... this gets it "going" ... doing a GL_Reset ()

Code:

// Capabilities that get switched off to return to a known baseline.
static const GLenum glshutdowns[] = {
	GL_TEXTURE_2D,
	GL_BLEND,
	GL_CULL_FACE,
	GL_DEPTH_TEST
};

// Puts commonly used GL state back to its defaults.
void GL_Reset (void)
{
	int num_shutdowns = sizeof(glshutdowns)/sizeof(glshutdowns[0]);
	int i;

	for (i = 0; i < num_shutdowns; i++)
		glDisable (glshutdowns[i]);

	glShadeModel (GL_SMOOTH); // Default
	glColor4f (1,1,1,1);
	glHint (GL_PERSPECTIVE_CORRECTION_HINT, GL_DONT_CARE);
}
Now I know all of that isn't required to kickstart it. Perhaps just one glDisable, or it could be the glHint.

If you happen to remember, lemme know. If not, the answer will sort itself out as I work through this, and I don't actually need the answer to keep doing what I am doing since I got it to work, and at 600 frames per second.

I'm kind of sort of making an empty shell which could be used for a 2D game that would support both OpenGL 1.1 and Direct3D 8.1 ... however ... it is more for my personal use to rapidly conduct experiments on the video code I am working on.
The night is young. How else can I annoy the world before sunsrise? 8) Inquisitive minds want to know ! And if they don't -- well like that ever has stopped me before ..
mh

Post by mh »

If memory serves any state or texture change should be enough to kick it off.
revelator

Post by revelator »

sounds interesting baker :) i'll look forward to what you might come up with.

just came home today from thailand after visiting my father, still a lot to process (38 years is a damn long time) but we had fun :)

we're both tech nerds but in regards to languages my dad takes the prize

17 languages fluent :shock: including chinese thai portuguese japanese french jeez my ol man is a walking babel :lol:
Baker

Post by Baker »

This feels a little weird saying.

The Direct3D 8.1 Wrapper is ... well ... beating OpenGL on average by about 10% frames per second. Need to confirm on a couple of other machines.

I'm not sure if it was eliminating the usage of hardware gamma entirely (not the wrapper) that did it or the TMUs or some other things I've done.
mh

Post by mh »

I thought it might get to the stage where it does that. :)