Doom 3 engine release and game code
Moderator: InsideQC Admins
Re: Doom 3 engine release and game code
Your SDL library might not have glMapBufferRange, in which case you probably need to get the import from OpenGL itself.
There's more stuff I probably forgot to mention; I've had a veritable shitstorm on my head these last few days trying to run down some elusive bug.
I'll post back if I find anything.
Productivity is a state of mind.
-

revelator - Posts: 2567
- Joined: Thu Jan 24, 2008 12:04 pm
- Location: inside tha debugger
Re: Doom 3 engine release and game code
Fixed dhewm3 source
http://code.google.com/p/realm/downloads/detail?name=dhewm3.7z&can=2&q=
I included the needed libraries; they are in the external dir (yes, it uses SDL-1.2, I just moved the missing pieces to the top of qgl.h),
and yes, it works.
I also included my own build so you can check it out.
Productivity is a state of mind.
-

revelator - Posts: 2567
- Joined: Thu Jan 24, 2008 12:04 pm
- Location: inside tha debugger
Re: Doom 3 engine release and game code
You probably forgot to include a header. Tbh, changes like that should not simply be thrown into dhewm3, because dhewm3 is somewhat different from the vanilla codebase.
- motorsep
- Posts: 231
- Joined: Wed Aug 02, 2006 11:46 pm
- Location: Texas, USA
Re: Doom 3 engine release and game code
Tbh it's not that different; only the build method and the fact that it uses SDL differ.
As for SDL, it only maps the OpenGL calls. SDL itself does not provide any OpenGL features but works a bit like GLEW when OpenGL is used, so if it's missing those calls you cannot link, and there's really no more to it.
The error on my part was lack of sleep, so I forgot some changes I made in other places.
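To illustrate the "look it up by name instead of linking it" idea, here is a minimal sketch. LoadProc, FakeLoader and FakeMapBufferRange are made-up names for this example; the only thing assumed about a real loader is the "name in, raw address out" shape that SDL_GL_GetProcAddress has. The fake loader stands in for the driver so the snippet runs without a GL context.

```cpp
#include <cassert>
#include <cstring>

// Shape of a by-name loader: takes the entry point name, returns a raw
// address, or null when the entry point is not exported.
using ProcLoader = void *(*)(const char *name);

// Hypothetical helper: fetch an entry point and cast it to the wanted
// function pointer type. A null result means "not available".
template <typename Fn>
Fn LoadProc(ProcLoader loader, const char *name) {
    return reinterpret_cast<Fn>(loader(name));
}

// Stand-ins for the driver, only here so the example runs on its own.
static int FakeMapBufferRange(int offset, int length) { return offset + length; }

static void *FakeLoader(const char *name) {
    if (std::strcmp(name, "glMapBufferRange") == 0) {
        return reinterpret_cast<void *>(&FakeMapBufferRange);
    }
    return nullptr;  // unknown entry point
}

// Usage: a present call resolves, a missing one comes back null at
// runtime instead of failing at link time.
inline void LoadProcDemo() {
    using MapFn = int (*)(int, int);
    MapFn mapRange = LoadProc<MapFn>(&FakeLoader, "glMapBufferRange");
    assert(mapRange != nullptr);
    assert(LoadProc<MapFn>(&FakeLoader, "glNoSuchCall") == nullptr);
}
```

In the engine the pointer would be checked once at init and the feature switched off when it is null, which is roughly what the glConfig availability flags are for.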
Productivity is a state of mind.
-

revelator - Posts: 2567
- Joined: Thu Jan 24, 2008 12:04 pm
- Location: inside tha debugger
Re: Doom 3 engine release and game code
Sigh. Well, one shouldn't take all the crap a static analyzer throws out literally. I just had a revelation after "fixing" some of idLib's math functions by passing arrays as references: it compiles fine and the game loads, but rofl, all models are now 2D sprites, I have skeletons flying all over the place, I got attacked by a barrel... and to top it off, when looking in a mirror in the game I had the head of an imp.
So, big warning to devs: double-check that the fix has the desired result, and make a backup of the working version before you go Tom Cruise on it, hehe.
Productivity is a state of mind.
-

revelator - Posts: 2567
- Joined: Thu Jan 24, 2008 12:04 pm
- Location: inside tha debugger
Re: Doom 3 engine release and game code
Thank you for your help, it compiles under Linux!
To find the relevant changes, I had to (http://pastebin.com/FFpJUP0G):
- convert modified files back to Unix format
- reapply commit "Fix -Wunused-variable warnings e4771f3" to VertexCache.cpp
- reapply commit "Fix -Wformat and -Wformat-extra-args warnings 04d1e91" to VertexCache.cpp
- reapply commit "Don't use alpha bits for the GL config 9e15847" to glimp.cpp
- revert the cd-key removal
- change the code style to reflect the original
- change unportable LPVOID to void*
diff -urN a/neo/renderer/qgl.h b/neo/renderer/qgl.h
--- a/neo/renderer/qgl.h 2012-09-03 23:45:08.097280095 +0200
+++ b/neo/renderer/qgl.h 2012-09-03 23:45:24.083741865 +0200
@@ -34,6 +34,27 @@
#include <SDL_opengl.h>
+/* missing in SDL-1.2 */
+#ifndef GL_ARB_map_buffer_range
+#define GL_MAP_READ_BIT 0x0001
+#define GL_MAP_WRITE_BIT 0x0002
+#define GL_MAP_INVALIDATE_RANGE_BIT 0x0004
+#define GL_MAP_INVALIDATE_BUFFER_BIT 0x0008
+#define GL_MAP_FLUSH_EXPLICIT_BIT 0x0010
+#define GL_MAP_UNSYNCHRONIZED_BIT 0x0020
+#endif
+
+#ifndef GL_ARB_map_buffer_range
+#define GL_ARB_map_buffer_range 1
+#ifdef GL_GLEXT_PROTOTYPES
+GLAPI GLvoid* APIENTRY glMapBufferRange(GLenum target, GLintptr offset, GLsizeiptr length, GLbitfield access);
+GLAPI void APIENTRY glFlushMappedBufferRange(GLenum target, GLintptr offset, GLsizeiptr length);
+#endif /* GL_GLEXT_PROTOTYPES */
+typedef GLvoid* (APIENTRYP PFNGLMAPBUFFERRANGEPROC)(GLenum target, GLintptr offset, GLsizeiptr length, GLbitfield access);
+typedef void (APIENTRYP PFNGLFLUSHMAPPEDBUFFERRANGEPROC)(GLenum target, GLintptr offset, GLsizeiptr length);
+#endif
+/* missing in SDL-1.2 */
+
typedef void (*GLExtension_t)(void);
#ifdef __cplusplus
@@ -56,6 +77,10 @@
extern void ( APIENTRY * qglActiveTextureARB )( GLenum texture );
extern void ( APIENTRY * qglClientActiveTextureARB )( GLenum texture );
+// ARB_MapBufferRange
+extern PFNGLMAPBUFFERRANGEPROC qglMapBufferRange;
+extern PFNGLFLUSHMAPPEDBUFFERRANGEPROC qglFlushMappedBufferRange;
+
// ARB_vertex_buffer_object
extern PFNGLBINDBUFFERARBPROC qglBindBufferARB;
extern PFNGLDELETEBUFFERSARBPROC qglDeleteBuffersARB;
diff -urN a/neo/renderer/RenderSystem.h b/neo/renderer/RenderSystem.h
--- a/neo/renderer/RenderSystem.h 2012-09-03 23:45:08.098280061 +0200
+++ b/neo/renderer/RenderSystem.h 2012-09-03 23:45:24.083741865 +0200
@@ -75,6 +75,7 @@
bool ARBVertexBufferObjectAvailable;
bool ARBVertexProgramAvailable;
bool ARBFragmentProgramAvailable;
+ bool ARBMapBufferRangeAvailable;
bool twoSidedStencilAvailable;
bool textureNonPowerOfTwoAvailable;
bool depthBoundsTestAvailable;
diff -urN a/neo/renderer/RenderSystem_init.cpp b/neo/renderer/RenderSystem_init.cpp
--- a/neo/renderer/RenderSystem_init.cpp 2012-09-03 23:45:08.098280061 +0200
+++ b/neo/renderer/RenderSystem_init.cpp 2012-09-03 23:45:24.084741831 +0200
@@ -237,6 +237,10 @@
// EXT_stencil_two_side
PFNGLACTIVESTENCILFACEEXTPROC qglActiveStencilFaceEXT;
+// ARB_MapBufferRange
+PFNGLMAPBUFFERRANGEPROC qglMapBufferRange;
+PFNGLFLUSHMAPPEDBUFFERRANGEPROC qglFlushMappedBufferRange;
+
// ARB_texture_compression
PFNGLCOMPRESSEDTEXIMAGE2DARBPROC qglCompressedTexImage2DARB;
PFNGLGETCOMPRESSEDTEXIMAGEARBPROC qglGetCompressedTexImageARB;
@@ -385,6 +389,13 @@
if ( glConfig.twoSidedStencilAvailable )
qglActiveStencilFaceEXT = (PFNGLACTIVESTENCILFACEEXTPROC)GLimp_ExtensionPointer( "glActiveStencilFaceEXT" );
+ // ARB_MapBufferRange
+ glConfig.ARBMapBufferRangeAvailable = R_CheckExtension("GL_ARB_map_buffer_range");
+ if (glConfig.ARBMapBufferRangeAvailable) {
+ qglMapBufferRange = (PFNGLMAPBUFFERRANGEPROC)GLimp_ExtensionPointer("glMapBufferRange");
+ qglFlushMappedBufferRange = (PFNGLFLUSHMAPPEDBUFFERRANGEPROC)GLimp_ExtensionPointer("glFlushMappedBufferRange");
+ }
+
// ARB_vertex_buffer_object
glConfig.ARBVertexBufferObjectAvailable = R_CheckExtension( "GL_ARB_vertex_buffer_object" );
if(glConfig.ARBVertexBufferObjectAvailable) {
diff -urN a/neo/renderer/VertexCache.cpp b/neo/renderer/VertexCache.cpp
--- a/neo/renderer/VertexCache.cpp 2012-09-03 23:45:08.097280095 +0200
+++ b/neo/renderer/VertexCache.cpp 2012-09-03 23:50:57.214404514 +0200
@@ -33,13 +33,17 @@
#include "renderer/VertexCache.h"
static const int FRAME_MEMORY_BYTES = 0x200000;
-static const int EXPAND_HEADERS = 1024;
+static const int EXPAND_HEADERS = 32;
-idCVar idVertexCache::r_showVertexCache( "r_showVertexCache", "0", CVAR_INTEGER|CVAR_RENDERER, "" );
-idCVar idVertexCache::r_vertexBufferMegs( "r_vertexBufferMegs", "32", CVAR_INTEGER|CVAR_RENDERER, "" );
+idCVar idVertexCache::r_showVertexCache( "r_showVertexCache", "0", CVAR_INTEGER|CVAR_RENDERER, "show vertex cache" );
+idCVar idVertexCache::r_useArbBufferRange( "r_useArbBufferRange", "1", CVAR_BOOL|CVAR_RENDERER, "use ARB_map_buffer_range for optimization" );
+idCVar idVertexCache::r_reuseVertexCacheSooner( "r_reuseVertexCacheSooner", "1", CVAR_BOOL|CVAR_RENDERER, "reuse vertex buffers as soon as possible after freeing" );
idVertexCache vertexCache;
+static GLuint gl_current_array_buffer = 0;
+static GLuint gl_current_index_buffer = 0;
+
/*
==============
R_ListVertexCache_f
@@ -51,6 +55,31 @@
/*
==============
+GL_BindBuffer
+==============
+*/
+static void GL_BindBuffer( GLenum target, GLuint buffer ) {
+ if ( target == GL_ARRAY_BUFFER ) {
+ if ( gl_current_array_buffer != buffer ) {
+ gl_current_array_buffer = buffer;
+ } else {
+ return;
+ }
+ } else if ( target == GL_ELEMENT_ARRAY_BUFFER ) {
+ if ( gl_current_index_buffer != buffer ) {
+ gl_current_index_buffer = buffer;
+ } else {
+ return;
+ }
+ } else {
+ common->Error( "GL_BindBuffer : invalid buffer target : %i\n", (int) target );
+ return;
+ }
+ qglBindBufferARB( target, buffer );
+}
+
+/*
+==============
idVertexCache::ActuallyFree
==============
*/
@@ -67,15 +96,16 @@
// temp blocks are in a shared space that won't be freed
if ( block->tag != TAG_TEMP ) {
- staticAllocTotal -= block->size;
- staticCountTotal--;
+ this->staticAllocTotal -= block->size;
+ this->staticCountTotal--;
if ( block->vbo ) {
-#if 0 // this isn't really necessary, it will be reused soon enough
- // filling with zero length data is the equivalent of freeing
- qglBindBufferARB(GL_ARRAY_BUFFER_ARB, block->vbo);
- qglBufferDataARB(GL_ARRAY_BUFFER_ARB, 0, 0, GL_DYNAMIC_DRAW_ARB);
-#endif
+ // Does not seem to hurt any and is actually used in all other
+ // implementations in some form. Changed a bit to map to NULL pointer
+ // with block size. Removing this is probably ok but you cannot remove
+ // the if ( block->vbo ) cause then it will crash.
+ GL_BindBuffer(GL_ARRAY_BUFFER_ARB, block->vbo);
+ qglBufferDataARB(GL_ARRAY_BUFFER_ARB, block->size, NULL, GL_DYNAMIC_DRAW_ARB);
} else if ( block->virtMem ) {
Mem_Free( block->virtMem );
block->virtMem = NULL;
@@ -87,16 +117,15 @@
block->next->prev = block->prev;
block->prev->next = block->next;
-#if 1
- // stick it on the front of the free list so it will be reused immediately
- block->next = freeStaticHeaders.next;
- block->prev = &freeStaticHeaders;
-#else
- // stick it on the back of the free list so it won't be reused soon (just for debugging)
- block->next = &freeStaticHeaders;
- block->prev = freeStaticHeaders.prev;
-#endif
-
+ if ( r_reuseVertexCacheSooner.GetBool() ) {
+ // stick it on the front of the free list so it will be reused immediately
+ block->next = this->freeStaticHeaders.next;
+ block->prev = &this->freeStaticHeaders;
+ } else {
+ // stick it on the back of the free list so it won't be reused soon (just for debugging)
+ block->next = &this->freeStaticHeaders;
+ block->prev = this->freeStaticHeaders.prev;
+ }
block->next->prev = block;
block->prev->next = block;
}
@@ -126,11 +155,8 @@
common->Printf( "GL_ARRAY_BUFFER_ARB = %i (%i bytes)\n", buffer->vbo, buffer->size );
}
}
- if ( buffer->indexBuffer ) {
- qglBindBufferARB( GL_ELEMENT_ARRAY_BUFFER_ARB, buffer->vbo );
- } else {
- qglBindBufferARB( GL_ARRAY_BUFFER_ARB, buffer->vbo );
- }
+ GL_BindBuffer( (buffer->indexBuffer ? GL_ELEMENT_ARRAY_BUFFER : GL_ARRAY_BUFFER), buffer->vbo );
+
return (void *)buffer->offset;
}
@@ -138,11 +164,15 @@
return (void *)((byte *)buffer->virtMem + buffer->offset);
}
+/*
+===========
+idVertexCache::UnbindIndex
+===========
+*/
void idVertexCache::UnbindIndex() {
- qglBindBufferARB( GL_ELEMENT_ARRAY_BUFFER_ARB, 0 );
+ GL_BindBuffer( GL_ELEMENT_ARRAY_BUFFER, 0 );
}
-
//================================================================================
/*
@@ -153,10 +183,6 @@
void idVertexCache::Init() {
cmdSystem->AddCommand( "listVertexCache", R_ListVertexCache_f, CMD_FL_RENDERER, "lists vertex cache" );
- if ( r_vertexBufferMegs.GetInteger() < 8 ) {
- r_vertexBufferMegs.SetInteger( 8 );
- }
-
virtualMemory = false;
// use ARB_vertex_buffer_object unless explicitly disabled
@@ -169,7 +195,7 @@
}
// initialize the cache memory blocks
- freeStaticHeaders.next = freeStaticHeaders.prev = &freeStaticHeaders;
+ this->freeStaticHeaders.next = this->freeStaticHeaders.prev = &this->freeStaticHeaders;
staticHeaders.next = staticHeaders.prev = &staticHeaders;
freeDynamicHeaders.next = freeDynamicHeaders.prev = &freeDynamicHeaders;
dynamicHeaders.next = dynamicHeaders.prev = &dynamicHeaders;
@@ -177,21 +203,22 @@
// set up the dynamic frame memory
frameBytes = FRAME_MEMORY_BYTES;
- staticAllocTotal = 0;
+ this->staticAllocTotal = 0;
- byte *junk = (byte *)Mem_Alloc( frameBytes );
+ byte *junk = (byte *)Mem_Alloc( frameBytes );
for ( int i = 0 ; i < NUM_VERTEX_FRAMES ; i++ ) {
- allocatingTempBuffer = true; // force the alloc to use GL_STREAM_DRAW_ARB
- Alloc( junk, frameBytes, &tempBuffers[i] );
- allocatingTempBuffer = false;
- tempBuffers[i]->tag = TAG_FIXED;
+ this->allocatingTempBuffer = true; // force the alloc to use GL_STREAM_DRAW_ARB
+ this->Alloc( junk, this->frameBytes, &this->tempBuffers[i] );
+ this->allocatingTempBuffer = false;
+ this->tempBuffers[i]->tag = TAG_FIXED;
+
// unlink these from the static list, so they won't ever get purged
- tempBuffers[i]->next->prev = tempBuffers[i]->prev;
- tempBuffers[i]->prev->next = tempBuffers[i]->next;
+ this->tempBuffers[i]->next->prev = this->tempBuffers[i]->prev;
+ this->tempBuffers[i]->prev->next = this->tempBuffers[i]->next;
}
Mem_Free( junk );
- EndFrame();
+ EndFrame ();
}
/*
@@ -214,8 +241,6 @@
===========
*/
void idVertexCache::Shutdown() {
-// PurgeAll(); // !@#: also purge the temp buffers
-
headerAllocator.Shutdown();
}
@@ -225,7 +250,7 @@
===========
*/
void idVertexCache::Alloc( void *data, int size, vertCache_t **buffer, bool indexBuffer ) {
- vertCache_t *block;
+ vertCache_t *block = NULL;
if ( size <= 0 ) {
common->Error( "idVertexCache::Alloc: size = %i\n", size );
@@ -235,23 +260,52 @@
*buffer = NULL;
// if we don't have any remaining unused headers, allocate some more
- if ( freeStaticHeaders.next == &freeStaticHeaders ) {
-
+ if ( this->freeStaticHeaders.next == &this->freeStaticHeaders ) {
for ( int i = 0; i < EXPAND_HEADERS; i++ ) {
- block = headerAllocator.Alloc();
- block->next = freeStaticHeaders.next;
- block->prev = &freeStaticHeaders;
- block->next->prev = block;
- block->prev->next = block;
+ block = headerAllocator.Alloc ();
if( !virtualMemory ) {
- qglGenBuffersARB( 1, & block->vbo );
+ qglGenBuffersARB( 1, &block->vbo );
+ block->size = 0;
}
+ block->next = this->freeStaticHeaders.next;
+ block->prev = &this->freeStaticHeaders;
+ block->next->prev = block;
+ block->prev->next = block;
+ }
+ }
+
+ GLenum target = ( indexBuffer ? GL_ELEMENT_ARRAY_BUFFER : GL_ARRAY_BUFFER );
+ GLenum usage = ( allocatingTempBuffer ? GL_STREAM_DRAW : GL_STATIC_DRAW );
+
+ // try to find a matching block to replace so that we're not continually respecifying vbo data each frame
+ for ( vertCache_t *findblock = this->freeStaticHeaders.next; ; findblock = findblock->next ) {
+ if ( findblock == &this->freeStaticHeaders ) {
+ block = this->freeStaticHeaders.next;
+ break;
}
+
+ if ( findblock->target != target ) continue;
+ if ( findblock->usage != usage ) continue;
+ if ( findblock->size != size ) continue;
+
+ block = findblock;
+ break;
}
// move it from the freeStaticHeaders list to the staticHeaders list
- block = freeStaticHeaders.next;
+ block->target = target;
+ block->usage = usage;
+
+ if ( block->vbo ) {
+ // orphan the buffer in case it needs respecifying (it usually will)
+ GL_BindBuffer( target, block->vbo );
+ qglBufferDataARB( target, (GLsizeiptr) size, NULL, usage );
+ qglBufferDataARB( target, (GLsizeiptr) size, data, usage );
+ } else {
+ block->virtMem = Mem_Alloc( size );
+ SIMDProcessor->Memcpy( block->virtMem, data, size );
+ }
block->next->prev = block->prev;
block->prev->next = block->next;
block->next = staticHeaders.next;
@@ -264,10 +318,10 @@
block->tag = TAG_USED;
// save data for debugging
- staticAllocThisFrame += block->size;
- staticCountThisFrame++;
- staticCountTotal++;
- staticAllocTotal += block->size;
+ this->staticAllocThisFrame += block->size;
+ this->staticCountThisFrame++;
+ this->staticCountTotal++;
+ this->staticAllocTotal += block->size;
// this will be set to zero when it is purged
block->user = buffer;
@@ -277,26 +331,7 @@
// load time lots of things may be created, but they aren't
// referenced by the GPU yet, and can be purged if needed.
block->frameUsed = currentFrame - NUM_VERTEX_FRAMES;
-
block->indexBuffer = indexBuffer;
-
- // copy the data
- if ( block->vbo ) {
- if ( indexBuffer ) {
- qglBindBufferARB( GL_ELEMENT_ARRAY_BUFFER_ARB, block->vbo );
- qglBufferDataARB( GL_ELEMENT_ARRAY_BUFFER_ARB, (GLsizeiptrARB)size, data, GL_STATIC_DRAW_ARB );
- } else {
- qglBindBufferARB( GL_ARRAY_BUFFER_ARB, block->vbo );
- if ( allocatingTempBuffer ) {
- qglBufferDataARB( GL_ARRAY_BUFFER_ARB, (GLsizeiptrARB)size, data, GL_STREAM_DRAW_ARB );
- } else {
- qglBufferDataARB( GL_ARRAY_BUFFER_ARB, (GLsizeiptrARB)size, data, GL_STATIC_DRAW_ARB );
- }
- }
- } else {
- block->virtMem = Mem_Alloc( size );
- SIMDProcessor->Memcpy( block->virtMem, data, size );
- }
}
/*
@@ -312,10 +347,10 @@
if ( block->tag == TAG_FREE ) {
common->FatalError( "idVertexCache Touch: freed pointer" );
}
+
if ( block->tag == TAG_TEMP ) {
common->FatalError( "idVertexCache Touch: temporary pointer" );
}
-
block->frameUsed = currentFrame;
// move to the head of the LRU list
@@ -324,6 +359,7 @@
block->next = staticHeaders.next;
block->prev = &staticHeaders;
+
staticHeaders.next->prev = block;
staticHeaders.next = block;
}
@@ -354,6 +390,7 @@
block->next = deferredFreeList.next;
block->prev = &deferredFreeList;
+
deferredFreeList.next->prev = block;
deferredFreeList.next = block;
}
@@ -377,9 +414,9 @@
if ( dynamicAllocThisFrame + size > frameBytes ) {
// if we don't have enough room in the temp block, allocate a static block,
// but immediately free it so it will get freed at the next frame
- tempOverflow = true;
- Alloc( data, size, &block );
- Free( block);
+ this->tempOverflow = true;
+ this->Alloc( data, size, &block );
+ this->Free( block);
return block;
}
@@ -417,15 +454,37 @@
// copy the data
block->virtMem = tempBuffers[listNum]->virtMem;
- block->vbo = tempBuffers[listNum]->vbo;
- if ( block->vbo ) {
- qglBindBufferARB( GL_ARRAY_BUFFER_ARB, block->vbo );
- qglBufferSubDataARB( GL_ARRAY_BUFFER_ARB, block->offset, (GLsizeiptrARB)size, data );
+ // mh code start
+ if ( (block->vbo = tempBuffers[listNum]->vbo) != 0 ) {
+ GL_BindBuffer( GL_ARRAY_BUFFER, block->vbo );
+
+ // try to get an unsynchronized map if at all possible
+ if ( glConfig.ARBMapBufferRangeAvailable && r_useArbBufferRange.GetBool() ) {
+ void *dst = NULL;
+ GLbitfield access = GL_MAP_WRITE_BIT|GL_MAP_UNSYNCHRONIZED_BIT|GL_MAP_INVALIDATE_RANGE_BIT;
+
+ // if the buffer has wrapped then we orphan it
+ if ( block->offset == 0 ) {
+ access = GL_MAP_WRITE_BIT|GL_MAP_INVALIDATE_BUFFER_BIT;
+ } else {
+ access = GL_MAP_WRITE_BIT|GL_MAP_UNSYNCHRONIZED_BIT|GL_MAP_INVALIDATE_RANGE_BIT;
+ }
+ if ( (dst = qglMapBufferRange(GL_ARRAY_BUFFER, block->offset, (GLsizeiptr) size, access)) != NULL ) {
+ SIMDProcessor->Memcpy( (byte *)dst, data, size );
+
+ qglUnmapBufferARB( GL_ARRAY_BUFFER );
+
+ return block;
+ } else {
+ qglBufferSubDataARB( GL_ARRAY_BUFFER, block->offset, (GLsizeiptr) size, data );
+ }
+ } else {
+ qglBufferSubDataARB( GL_ARRAY_BUFFER, block->offset, (GLsizeiptr) size, data );
+ }
} else {
SIMDProcessor->Memcpy( (byte *)block->virtMem + block->offset, data, size );
}
-
return block;
}
@@ -446,36 +505,25 @@
staticUseSize += block->size;
}
}
-
const char *frameOverflow = tempOverflow ? "(OVERFLOW)" : "";
common->Printf( "vertex dynamic:%i=%ik%s, static alloc:%i=%ik used:%i=%ik total:%i=%ik\n",
dynamicCountThisFrame, dynamicAllocThisFrame/1024, frameOverflow,
- staticCountThisFrame, staticAllocThisFrame/1024,
+ this->staticCountThisFrame, staticAllocThisFrame/1024,
staticUseCount, staticUseSize/1024,
- staticCountTotal, staticAllocTotal/1024 );
+ this->staticCountTotal, staticAllocTotal/1024 );
}
-#if 0
- // if our total static count is above our working memory limit, start purging things
- while ( staticAllocTotal > r_vertexBufferMegs.GetInteger() * 1024 * 1024 ) {
- // free the least recently used
-
- }
-#endif
-
if( !virtualMemory ) {
// unbind vertex buffers so normal virtual memory will be used in case
// r_useVertexBuffers / r_useIndexBuffers
qglBindBufferARB( GL_ARRAY_BUFFER_ARB, 0 );
qglBindBufferARB( GL_ELEMENT_ARRAY_BUFFER_ARB, 0 );
}
-
-
currentFrame = tr.frameCount;
listNum = currentFrame % NUM_VERTEX_FRAMES;
- staticAllocThisFrame = 0;
- staticCountThisFrame = 0;
+ this->staticAllocThisFrame = 0;
+ this->staticCountThisFrame = 0;
dynamicAllocThisFrame = 0;
dynamicCountThisFrame = 0;
tempOverflow = false;
@@ -516,18 +564,15 @@
frameStatic += block->size;
}
}
-
int numFreeStaticHeaders = 0;
+
for ( block = freeStaticHeaders.next ; block != &freeStaticHeaders ; block = block->next ) {
numFreeStaticHeaders++;
}
-
int numFreeDynamicHeaders = 0;
for ( block = freeDynamicHeaders.next ; block != &freeDynamicHeaders ; block = block->next ) {
numFreeDynamicHeaders++;
}
-
- common->Printf( "%i megs working set\n", r_vertexBufferMegs.GetInteger() );
common->Printf( "%i dynamic temp buffers of %ik\n", NUM_VERTEX_FRAMES, frameBytes / 1024 );
common->Printf( "%5i active static headers\n", numActive );
common->Printf( "%5i free static headers\n", numFreeStaticHeaders );
diff -urN a/neo/renderer/VertexCache.h b/neo/renderer/VertexCache.h
--- a/neo/renderer/VertexCache.h 2012-09-03 23:45:08.097280095 +0200
+++ b/neo/renderer/VertexCache.h 2012-09-03 23:45:24.085741797 +0200
@@ -42,6 +42,8 @@
typedef struct vertCache_s {
GLuint vbo;
+ GLenum target;
+ GLenum usage;
void *virtMem; // only one of vbo / virtMem will be set
bool indexBuffer; // holds indexes instead of vertexes
@@ -111,7 +113,8 @@
void ActuallyFree( vertCache_t *block );
static idCVar r_showVertexCache;
- static idCVar r_vertexBufferMegs;
+ static idCVar r_useArbBufferRange;
+ static idCVar r_reuseVertexCacheSooner;
int staticCountTotal;
int staticAllocTotal; // for end of frame purging
What does this vertex cache patch actually do, and how can I see if it works? It does feel faster, but timedemo completes with the same fps.
- tobis87
- Posts: 8
- Joined: Sat Sep 01, 2012 9:44 pm
Re: Doom 3 engine release and game code
The alpha bit commit does not work on Windows (all screens show static or black); that's why I reverted it. Nice catch though.
As for what it does, Mh might be better at explaining that, but as far as I understand from the code it limits the insane amount of VBO calls Doom 3 normally makes by checking for already allocated blocks.
I'm not quite sure what the ARB_map_buffer_range function does for speed, but Mh might have an explanation.
I get a few fps more from this, nothing earth-shattering, but as you also noticed the game feels more responsive, so there's definitely something to it.
Doom 3's biggest problem today seems to be that it actually runs worse on modern cards (go figure): I get more fps on my old GeForce 6600 than I get with my 2x GTX 560 Ti.
I guess optimizing for recent hardware will be a lengthy process, and this is not the only thing that could be done to speed it up. I got some code from jmarshall that should speed things up considerably; unfortunately he forgot a few bits and pieces, like me,
and he's been MIA for some time, but if he resurfaces I'll see if we can get it fixed and then I'll post a patch.
Edit: btw, timedemo is absolute garbage in Doom 3, I'm afraid. On my rig it shows a constant 60 fps, but when I load a map my fps drops to 35-40, so it's not really a good indicator.
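For the "fewer VBO calls" part, the GL_BindBuffer wrapper in the patch boils down to remembering what is already bound. A rough standalone sketch of that idea follows; BindCache and its members are invented for the example, and a counter stands in for the real qglBindBufferARB call so it runs without a driver.

```cpp
#include <cassert>
#include <cstdint>

// Values match GL_ARRAY_BUFFER and GL_ELEMENT_ARRAY_BUFFER, but nothing
// here talks to a real driver.
enum class BufferTarget : std::uint32_t { Array = 0x8892, ElementArray = 0x8893 };

// Hypothetical helper: remembers the last buffer bound to each target and
// only forwards a bind when the binding actually changes.
class BindCache {
public:
    // Returns true when a driver call would have been issued.
    bool Bind(BufferTarget target, std::uint32_t buffer) {
        std::uint32_t &current = (target == BufferTarget::Array) ? array_ : element_;
        if (current == buffer) {
            return false;  // already bound, skip the call
        }
        current = buffer;
        ++driverCalls_;    // the engine would call qglBindBufferARB here
        return true;
    }

    // Call this after any bind that bypasses the cache, otherwise the
    // remembered state no longer matches the driver.
    void Invalidate() { array_ = element_ = kUnknown; }

    int DriverCalls() const { return driverCalls_; }

private:
    // Assumption: this value is never handed out as a real buffer name.
    static constexpr std::uint32_t kUnknown = 0xFFFFFFFFu;
    std::uint32_t array_ = 0;
    std::uint32_t element_ = 0;
    int driverCalls_ = 0;
};

// Usage: three bind requests, only two reach the driver.
inline void BindCacheDemo() {
    BindCache cache;
    cache.Bind(BufferTarget::Array, 7);
    cache.Bind(BufferTarget::Array, 7);         // filtered out
    cache.Bind(BufferTarget::ElementArray, 7);  // different target, goes through
    assert(cache.DriverCalls() == 2);
}
```

One thing to watch: going by the posted diff, EndFrame still calls qglBindBufferARB directly to unbind, which would leave a cache like this stale. That is what Invalidate is for in the sketch.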
Productivity is a state of mind.
-

revelator - Posts: 2567
- Joined: Thu Jan 24, 2008 12:04 pm
- Location: inside tha debugger
Re: Doom 3 engine release and game code
reckless wrote: Sigh. Well, one shouldn't take all the crap a static analyzer throws out literally. I just had a revelation after "fixing" some of idLib's math functions by passing arrays as references: it compiles fine and the game loads, but rofl, all models are now 2D sprites, I have skeletons flying all over the place, I got attacked by a barrel... and to top it off, when looking in a mirror in the game I had the head of an imp.
I'd love to see a video of that.
I know FrikaC made a cgi-bin version of the quakec interpreter once and wrote part of his website in QuakeC
(LordHavoc)
-

frag.machine - Posts: 2090
- Joined: Sat Nov 25, 2006 1:49 pm
Re: Doom 3 engine release and game code
reckless wrote: Fixed dhewm3 source
This is the fastest version I've seen on a laptop ATI. Didn't notice any bugs. Worked w/ sikkmod.
-
qbism - Posts: 1236
- Joined: Thu Nov 04, 2004 5:51 am
Re: Doom 3 engine release and game code
frag.machine wrote: I'd love to see a video of that.
Hehe, yeah, that was pretty funny.
If you want to recreate it just for giggles, you need to change all operators that take idVec arguments into references, like this: idVec3 b to idVec3 &b.
Nice, qbism.
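I can't say exactly which idLib functions went wrong, but one general way a value-to-reference change alters behaviour is when the function uses its parameter as scratch space. A small sketch with a made-up Vec3 (not the real idVec3) and invented function names:

```cpp
#include <cassert>

// Hypothetical stand-in for an idVec3-like type; not the real idLib class.
struct Vec3 {
    float x, y, z;
};

// By value: 'b' is a private copy, so scaling it in place is harmless.
inline Vec3 AddScaledByValue(Vec3 a, Vec3 b, float s) {
    b.x *= s; b.y *= s; b.z *= s;
    return Vec3{a.x + b.x, a.y + b.y, a.z + b.z};
}

// By reference: identical body, but now the caller's vector is scaled too.
inline Vec3 AddScaledByRef(const Vec3 &a, Vec3 &b, float s) {
    b.x *= s; b.y *= s; b.z *= s;
    return Vec3{a.x + b.x, a.y + b.y, a.z + b.z};
}

// Usage: same return value, but only the by-value call leaves 'b' alone.
inline void ReferenceDemo() {
    Vec3 a{1.0f, 2.0f, 3.0f};
    Vec3 b{1.0f, 1.0f, 1.0f};
    AddScaledByValue(a, b, 2.0f);
    assert(b.x == 1.0f);  // caller's vector untouched
    AddScaledByRef(a, b, 2.0f);
    assert(b.x == 2.0f);  // caller's vector silently changed
}
```

Both versions compile without a warning, which fits the "compiles fine, game loads, everything is wrong" experience. Passing by const reference would at least make the compiler reject the in-place scaling.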
Productivity is a state of mind.
-

revelator - Posts: 2567
- Joined: Thu Jan 24, 2008 12:04 pm
- Location: inside tha debugger
Re: Doom 3 engine release and game code
What the glMapBufferRange stuff does is allow you to take advantage of a VBO streaming pattern that D3D has enjoyed since at least version 7 - in D3D terms it's known as the discard/no-overwrite pattern.
A VBO is a GPU resource, and normally, if you try to update a GPU resource that is currently in use for drawing (entirely possible because of the asynchronous nature of CPU/GPU operation), everything must stall and wait for drawing to complete before the update can happen. The stock Doom 3 code actually double-buffers its streaming VBOs to try to avoid this (in a slightly obfuscated way), but glMapBufferRange is a more robust way.
So, I mentioned discard/no-overwrite above. Here's what they do.
The buffer is filled in a linear manner. You've got 2 MB (or whatever) of space; vertexes are added beginning at position 0, and as new vertexes are added they get appended until the buffer fills. Then magic happens.
This standard update is no-overwrite; your code makes a promise to GL that it's not going to overwrite any region of the buffer that may be currently in use for drawing, and in return GL will let you update the buffer without blocking. In order to be able to keep this promise your code must maintain a counter indicating how much space in the buffer it has previously used, and add new verts to the buffer at this counter position.
When the buffer becomes full you "discard". This doesn't throw away anything previously added; instead GL will keep the previous block of buffer memory around for as long as is needed to satisfy any pending draw calls, but will give you a new, fresh block for any further updates. That's the "magic" I mentioned above, and it's what lets you use a streaming VBO without any blocking.
This pattern will also let you get rid of Doom 3's double buffering, thus saving you some GPU memory (I haven't yet done this in my code). Because there's no more blocking it will run faster in cases where there is a lot of dynamic buffer usage, but because Doom 3 locks at 60fps it may not be as directly measurable as if the engine was unlocked. Hence the "it feels more responsive but I can't quite put my finger on it" result.
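The bookkeeping side of discard/no-overwrite can be sketched without touching GL at all. StreamCursor, StreamSlot and Reserve are names made up for this example; the flag values are the GL_MAP_* values from the qgl.h part of the patch. The sketch only decides where to write and which access flags a glMapBufferRange call would get, it does not map anything.

```cpp
#include <cassert>
#include <cstddef>

// Same bit values as the GL_MAP_* defines in the patch.
constexpr unsigned kMapWrite            = 0x0002;
constexpr unsigned kMapInvalidateRange  = 0x0004;
constexpr unsigned kMapInvalidateBuffer = 0x0008;
constexpr unsigned kMapUnsynchronized   = 0x0020;

struct StreamSlot {
    std::size_t offset;  // where the caller should write
    unsigned    access;  // flags the caller would pass to glMapBufferRange
};

// Hypothetical bookkeeping for one streaming VBO.
class StreamCursor {
public:
    explicit StreamCursor(std::size_t capacity) : capacity_(capacity) {}

    StreamSlot Reserve(std::size_t size) {
        assert(size > 0 && size <= capacity_);
        StreamSlot slot;
        if (used_ + size > capacity_) {
            // Discard: the buffer is full, so ask for a fresh block and
            // start again from the beginning.
            used_ = 0;
            slot.access = kMapWrite | kMapInvalidateBuffer;
        } else {
            // No-overwrite: append past everything written so far, which
            // is what keeps the "I won't touch in-use data" promise.
            slot.access = kMapWrite | kMapUnsynchronized | kMapInvalidateRange;
        }
        slot.offset = used_;
        used_ += size;
        return slot;
    }

private:
    std::size_t capacity_;
    std::size_t used_ = 0;  // bytes handed out since the last discard
};
```

With a capacity of 100, reserving 60, 30 and then 20 bytes gives offsets 0, 60 and 0 again, with the third reservation carrying the invalidate-buffer flag. The patch itself takes a slightly different route and discards whenever block->offset is 0.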
There's another chunk of code in the standard Alloc call which deals with updates of non-streaming VBOs and which is implemented in quite an evil manner by the stock Doom 3 code. When updating such a VBO you can get a faster update if the glBufferData params are the same as were previously used for that VBO (the driver can just reuse the previous block of buffer memory instead of needing to fully reallocate). Doom 3 doesn't do that, so it doesn't get these faster updates, but by searching the free static headers list for a VBO that matches and using that instead of just taking the first one from it, it can. Obviously it sucks that you need to search the list in this way, and a better implementation would just store the VBO with the object that uses it, and reuse the same VBO each time. Since this mainly happens with model animations, an even better implementation would use transform feedback to animate the model instead of animating it on the CPU and needing to re-upload verts each frame, but I haven't even looked at that yet.
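The free-list search is simple enough to show on its own. This is only a sketch of the matching rule, using a std::list and a made-up FreeBlock struct instead of the engine's intrusive vertCache_t list; PickFreeBlock is an invented name.

```cpp
#include <cassert>
#include <cstddef>
#include <list>

// Hypothetical, trimmed-down header: just the fields the match looks at.
struct FreeBlock {
    unsigned    vbo;
    unsigned    target;  // e.g. GL_ARRAY_BUFFER (0x8892)
    unsigned    usage;   // e.g. GL_STATIC_DRAW (0x88E4)
    std::size_t size;
};

// Prefer a free block whose last glBufferData parameters match the new
// request, so the driver has a chance to reuse its storage. Fall back to
// the first free block when nothing matches.
inline std::list<FreeBlock>::iterator PickFreeBlock(std::list<FreeBlock> &freeList,
                                                    unsigned target,
                                                    unsigned usage,
                                                    std::size_t size) {
    assert(!freeList.empty());
    for (auto it = freeList.begin(); it != freeList.end(); ++it) {
        if (it->target == target && it->usage == usage && it->size == size) {
            return it;
        }
    }
    return freeList.begin();
}
```

The search is linear in the number of free headers, which is the cost complained about above; keeping the VBO with the object that owns it would avoid the lookup entirely.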
So all in all the stock VBO implementation is an unholy mess that needs serious work to get it functioning right, much the same way as Quake 1 lightmap updates were a mess. That code just represents the start of a process, but I personally don't think it's worth continuing with. I say - wait for the BFG edition, wait and see if that's going to get a source release (Carmack seems keen), and use that as a base for further work instead - chances are that all of this stuff will be fixed in that.
A VBO is a GPU resource, and normally, if you try to update a GPU resource that is currently in use for drawing with (entirely possible because of the asynchronous nature of CPU/GPU operation), everything must stall and wait for drawing to complete before the update can happen. The stock Doom 3 code actually double-buffers it's streaming VBOs to try avoid this (in a slightly obfuscated way) but glMapBufferRange is a more robust way.
So, I mentioned discard/no-overwrite above. Here's what they do.
The buffer is filled in a linear manner. You've got 2mb (or whatever) of space, vertexes are added beginning at position 0, as new vertexes are added they get appended until the buffer fills, then magic happens.
This standard update is no-overwrite; your code makes a promise to GL that it's not going to overwrite any region of the buffer that may be currently in use for drawing, and in return GL will let you update the buffer without blocking. In order to be able to keep this promise your code must maintain a counter indicating how much space in the buffer it has previously used, and add new verts to the buffer at this counter position.
When the buffer becomes full you "discard". This doesn't throw away anything previously added, instead GL will keep the previous block of buffer memory around for as long as is needed to satisfy any pending draw calls, but will give you a new, fresh block for any further updates. That's the "magic" I mentioned above, and it's what lets you use a streaming VBO without any blocking.
This pattern will also let you get rid of Doom 3's double buffering, thus saving you some GPU memory (I haven't yet done this in my code). Because there's no more blocking it will run faster in cases where there is a lot of dynamic buffer usage, but because Doom 3 locks at 60fps it may not be as directly measurable as if the engine was unlocked. Hence the "it feels more responsive but I can't quite put my finger on it" result.
There's another chunk of code in the standard Alloc call which deals with updates of non-streaming VBOs and which is implemented in quite an evil manner by the stock Doom 3 code. When updating such a VBO you can get a faster update if the glBufferData params are the same as were previously used for that VBO (the driver can just reuse the previous block of buffer memory instead of needing to fully reallocate). Doom 3 doesn't do that, so it doesn't get these faster updates, but by searching the free static headers list for a VBO that matches and using that instead of just taking the first one from it, it can. Obviously it sucks that you need to search the list in this way, and a better implementation would just store the VBO with the object that uses it, and reuse the same VBO each time. Since this mainly happens with model animations, an even better implementation would use transform feedback to animate the model instead of animating it on the CPU and needing to re-upload verts each frame, but I haven't even looked at that yet.
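The free-list search described above could look roughly like this. It's only a sketch under assumptions: `FreeVbo` and `PickFreeVbo` are hypothetical names, not the actual Doom 3 structures, and a "match" is assumed to mean the same size and usage hint as the last glBufferData call on that buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an entry in the free static headers list.
struct FreeVbo {
    unsigned    id;    // GL buffer object name
    std::size_t size;  // size last passed to glBufferData
    unsigned    usage; // usage hint last passed to glBufferData
};

// Returns the index of the free VBO to reuse, or -1 if the list is empty.
// Prefers an entry whose previous glBufferData params match the new request,
// so the driver can reuse the existing storage; otherwise falls back to the
// first entry, which is what taking the head of the list would give you.
int PickFreeVbo(const std::vector<FreeVbo>& freeList,
                std::size_t size, unsigned usage) {
    assert(size > 0); // a zero-sized upload makes no sense here
    for (std::size_t i = 0; i < freeList.size(); ++i) {
        if (freeList[i].size == size && freeList[i].usage == usage) {
            return static_cast<int>(i);
        }
    }
    return freeList.empty() ? -1 : 0;
}
```

The linear search is the part that sucks; keeping the VBO with the object that owns it would make the lookup unnecessary.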
So all in all the stock VBO implementation is an unholy mess that needs serious work to get it functioning right, much the same way as Quake 1 lightmap updates were a mess. That code just represents the start of a process, but I personally don't think it's worth continuing with. I say - wait for the BFG edition, wait and see if that's going to get a source release (Carmack seems keen), and use that as a base for further work instead - chances are that all of this stuff will be fixed in that.
We had the power, we had the space, we had a sense of time and place
We knew the words, we knew the score, we knew what we were fighting for
-

mh - Posts: 2292
- Joined: Sat Jan 12, 2008 1:38 am
Re: Doom 3 engine release and game code
Ok, that at least made me understand a few things.
Carmack planned on open-sourcing the BFG edition code??? Well, I'd like to see that,
but I guess it's the proper way: unless the old Doom 3 code cannot run it, it makes no sense keeping it closed source.
One thing I'd also like to see is the Quake 4 source (mostly because it has a working GLSL backend). Prey would also be nice.
Edit: Btw, if someone has a look at my Doom 3 source and spots something out of order (preferably an ATI user), I'm very interested in hearing about it, because the ATI bug I seem to have introduced is really starting to annoy the heck out of me.
Productivity is a state of mind.
-

revelator - Posts: 2567
- Joined: Thu Jan 24, 2008 12:04 pm
- Location: inside tha debugger
Re: Doom 3 engine release and game code
Read up some more on the BFG edition, and there are a few things to note.
It is not compatible with the old Doom 3 engine (it uses parts of idTech 5), so mods etc. made for Doom 3 will not work.
Carmack might not reveal all of the new code, but there's a chance he will reveal some of it.
Hmm,
a glimpse at idTech 5 code sounds rather interesting.
Productivity is a state of mind.
-

revelator - Posts: 2567
- Joined: Thu Jan 24, 2008 12:04 pm
- Location: inside tha debugger
Re: Doom 3 engine release and game code
Btw mh, I fixed up the source you sent me a while back (sort of like a clean source for Doom 3).
I converted almost all the remaining calls to use vertex attribs instead of the old client state method (only missing ac->normals atm).
Added some missing source files to the project (AAS).
Renamed DOOM3.exe to MHDoom3.exe.
Added your VBO optimizations.
Removed a few unnecessary things.
BUG:
Shadows work fine for cast shadows, but model shadows look weird as hell (split in half). It's not something I introduced; the unmodified version also does it.

The exe size is somewhat smaller now, but the above bug is rather annoying to look at.
I'll upload it to my realm site if you want it:
http://code.google.com/p/realm/downloads/detail?name=mhdoom3.7z&can=2&q=
Productivity is a state of mind.
-

revelator - Posts: 2567
- Joined: Thu Jan 24, 2008 12:04 pm
- Location: inside tha debugger
Re: Doom 3 engine release and game code
Agh, just noticed that replacing the texcoord calls with vertex attribs is not quite working (it breaks the Doom sky). It could possibly be a bug in Doom 3's code, because that's the only place it breaks.
Productivity is a state of mind.
-

revelator - Posts: 2567
- Joined: Thu Jan 24, 2008 12:04 pm
- Location: inside tha debugger