External Texture Load Speed
Maybe not the most interesting of topics, but last year I got frustrated with external textures.
Loading them is slow.
Why is it slow? Every time an engine looks for an external texture, it has to check the file system and look through every pak file. For every texture.
Checking the file system isn't slow, but checking the pak files is because it has to look through every entry for every pak file. If you support more than 1 replacement texture file extension, multiply that. If you support more than 1 place for external textures, multiply that again.
So if we DO NOT have an external replacement texture,
1) an engine has to look through at least 2 pak files to not find it.
2) it has to repeat this for each extension to not find it.
3) it has to check each supported directory (id1/textures, id1/<mapname>/textures, maybe other places) to not find it. If gamedir'd, double this. Some engines have another multiplier here (like ones that check qw/ or their own engine folder).
Maybe a map has 50 textures. Maybe it has 125. Let's say you have 5 pak files containing a total of 2000 files, support 4 image formats and have to check 6 directories for textures. How many file entries in pak files are being checked any time a texture isn't found? Maybe that is still a ton even IF the texture is found. These are some potentially large numbers.
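To put a rough number on that scenario, here is a back-of-the-envelope sketch in C. It assumes a plain linear scan, where every candidate name (one per directory and extension) is compared against every entry in the loaded paks. The function names are made up for illustration.

```c
/* Worst case for ONE missing texture with a linear pak scan:
   each candidate name (directory x extension) is compared
   against every entry in every loaded pak. */
static unsigned long compares_per_missing_texture(unsigned long pak_entries,
                                                  unsigned long extensions,
                                                  unsigned long directories)
{
    return pak_entries * extensions * directories;
}

/* Same worst case for a whole map where no texture has a replacement. */
static unsigned long compares_per_map(unsigned long textures,
                                      unsigned long pak_entries,
                                      unsigned long extensions,
                                      unsigned long directories)
{
    return textures *
           compares_per_missing_texture(pak_entries, extensions, directories);
}
```

With the figures above (2000 pak entries, 4 formats, 6 directories) that is 2000 * 4 * 6 = 48,000 string compares per missing texture, and 6,000,000 for a 125-texture map with no replacements at all.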
The speedup I decided on: I made a flag for loaded pakfiles indicating what types of files were found in the pak upon loading a pak file. Like a flag for TGA, a flag for PCX, etc. Original Quake paks will not have any of these files, so the engine can skip looking in those pak files entirely. Essentially it can just short-circuit return false very quickly.
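A minimal sketch of that flag idea in C, not the actual engine code. The struct layout, flag names and function names are assumptions for illustration; only the 56-byte name field follows the on-disk pak directory format.

```c
#include <ctype.h>
#include <string.h>

enum {                       /* one bit per replacement image type */
    PAK_HAS_TGA = 1 << 0,
    PAK_HAS_PCX = 1 << 1,
    PAK_HAS_PNG = 1 << 2,
    PAK_HAS_JPG = 1 << 3
};

typedef struct {
    char name[56];           /* size of the name field in a pak directory */
    int  filepos, filelen;
} pak_entry_t;

typedef struct {
    pak_entry_t *entries;
    int          num_entries;
    unsigned     ext_flags;  /* filled in once, when the pak is loaded */
} pak_t;

/* Map a file name to its type flag, or 0 if it is not an image type. */
static unsigned flag_for_name(const char *name)
{
    const char *dot = strrchr(name, '.');
    char ext[8];
    size_t i;

    if (!dot)
        return 0;
    dot++;
    for (i = 0; i < sizeof(ext) - 1 && dot[i]; i++)
        ext[i] = (char)tolower((unsigned char)dot[i]);
    ext[i] = '\0';

    if (!strcmp(ext, "tga")) return PAK_HAS_TGA;
    if (!strcmp(ext, "pcx")) return PAK_HAS_PCX;
    if (!strcmp(ext, "png")) return PAK_HAS_PNG;
    if (!strcmp(ext, "jpg")) return PAK_HAS_JPG;
    return 0;
}

/* Call once after reading the pak directory. */
void Pak_ScanExtensions(pak_t *pak)
{
    int i;
    pak->ext_flags = 0;
    for (i = 0; i < pak->num_entries; i++)
        pak->ext_flags |= flag_for_name(pak->entries[i].name);
}

/* Linear lookup with the short-circuit: if the pak holds no file of
   the requested image type, return without touching the entries. */
const pak_entry_t *Pak_Find(const pak_t *pak, const char *path)
{
    unsigned want = flag_for_name(path);
    int i;

    if (want && !(pak->ext_flags & want))
        return NULL;
    for (i = 0; i < pak->num_entries; i++)
        if (!strcmp(pak->entries[i].name, path))
            return &pak->entries[i];
    return NULL;
}
```

For a stock pak0/pak1 the flags stay at zero, so every replacement-texture lookup against those paks returns after a single extension check.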
The night is young. How else can I annoy the world before sunsrise?
Inquisitive minds want to know ! And if they don't -- well like that ever has stopped me before ..
Re: External Texture Load Speed
What does this slow down? Game loading time or frame rendering (FPS)?
QuakeWiki
getButterfly - WordPress Support Services
Roo Holidays
Fear not the dark, but what the dark hides.
Re: External Texture Load Speed
@chip: game loading time.
Baker wrote: "because it has to look through every entry for every pak file"
Fixing that with a hash table or std::map would be trivial, no?
Re: External Texture Load Speed
Chip wrote: "What does this slow down? Game loading time or frame rendering (FPS)?"
Yeah, it's load. I was rather irritated after converting all replacement textures to PCX --- and these textures match exactly the size of the texture they are replacing --- and found the texture load to be "slow" (from my perspective).
If I disabled external texture support, the map loaded really quickly (about instant). But with external texture support enabled, it felt like it took 3-5 seconds.
I was trying to assess where this slowdown could be. I quickly ruled out some concerns, and then I looked at the pak file stuff. I was rather astonished at the time-wasting potential.
Some people probably don't feel like an extra 3-5 seconds in map load time is a big deal, but I really don't like waiting for stuff to load.
Re: External Texture Load Speed
By loading do you mean the actual loading or the discovery?
I highly doubt that just checking for the existence of files could take so long.
A "ls -lR" of eg my Quake3 installation (I played with the Quake directory before without taking times so that is cached at the moment) just takes 0.3 seconds for ~1600 files.
Or is it just the pak files that are so slow?
I know that some file formats LOAD quicker than others (TGA is faster than PNG for example).
Improve Quaddicted, send me a pull request: https://github.com/Quaddicted/quaddicted-data
Re: External Texture Load Speed
A lot of it gets loaded in memory so it takes some time to fill up the buffers. On realm I get the same as Baker: without textures it loads almost instantly, but if I drop in a load of external textures the load time goes up seriously (my own engine with all Q1 textures as PNG takes about 10 seconds to load, ouch, and they are not even in a pak). At least it doesn't affect framerates.
Productivity is a state of mind.
Re: External Texture Load Speed
fodquake implements a binary search for loading files from pak files (sorts at load time then goes one way or the other based upon filename sort orders).
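A rough sketch of the sort-then-binary-search approach using the standard qsort and bsearch. This is not fodquake's actual code; the struct and function names are made up for illustration.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    char name[56];
    int  filepos, filelen;
} pak_entry_t;

/* qsort comparator: order two entries by name. */
static int entry_cmp(const void *a, const void *b)
{
    return strcmp(((const pak_entry_t *)a)->name,
                  ((const pak_entry_t *)b)->name);
}

/* bsearch comparator: first argument is the key (a path string),
   second is an array element. */
static int key_cmp(const void *key, const void *elem)
{
    return strcmp((const char *)key, ((const pak_entry_t *)elem)->name);
}

/* Do this once, right after the pak directory has been read. */
void Pak_SortEntries(pak_entry_t *entries, size_t count)
{
    qsort(entries, count, sizeof(*entries), entry_cmp);
}

/* About log2(count) string compares instead of count. */
const pak_entry_t *Pak_FindSorted(const pak_entry_t *entries, size_t count,
                                  const char *path)
{
    return bsearch(path, entries, count, sizeof(*entries), key_cmp);
}
```

For a 2000-entry pak that is about 11 string compares per lookup, hit or miss, instead of up to 2000.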
fte generates a hashtable for every file in paks+pk3s+system, but the problem with that is that it doesn't always detect when someone copy+pasted, say, a .bsp, so it's been flushing the hash on every map change anyway. Still comes out faster.
my experience is that windows is extremely sluggish when it comes to trying to fopen 5*6*60*4*8=57600 different files on each map change (gamepaths, extensions, textures, diffuse/luma/bump/norm, gamedirs). Linux isn't half as bad.
just to drive the point home, fopen is one of the calls that NaCl requires to be done asynchronously.
The number of files FTE attempts to load is just too insane to depend upon windows for speed.
And yes, PNG-inside-pak is slower than TGA-inside-pk3, and can have worse compression too. supposedly you're also better off using BGR over RGB for loading. A little less portable, but on pcs it'll always be faster. Assuming you don't fgetc() every single byte in the TGA, the real slowdown is indeed in finding the replacement textures rather than actually loading them.
Re: External Texture Load Speed
Spirit wrote: "By loading do you mean the actual loading or the discovery? I highly doubt that just checking for the existence of files could take so long."
Well, technically it is the map load time. But it is the discovery of files (which has to happen).
And no it is checking for the existence of files in pak files.
See with a pak file, you essentially have a table in memory of the files in a pak.
Let's say I have 8 pak files and 2000 files in those pak files. Any file that an engine wants to check for in a pak, it has to do 2000 string compares for a file that does not exist (it has to check them all in order to know a file isn't in a pak) or somewhat less than that for one that does exist.
In an engine, it might need to check for the filename with the extension .png, then .jpg, then .tga. It has to check id1/textures and id1/textures/dm6. An engine like JoeQuake also needs to check joequake/textures and joequake/textures/dm6. If playing a gamedir'd mod like Travail, it needs to check travail/textures and travail/textures/dm6. A QW engine needs to check qw/textures and qw/textures/dm6. If it finds a texture, the engine might also need to check for _bump, _glow and _gloss textures.
And each of those checks in a pak file is a string compare against every file in the pak, so in a 2000-files-in-paks scenario it starts to be a ton of checking. Like up to 32000 string compares per file. Some maps have a larger number of textures, some maps have fewer. DM6 only has 8 textures or so! Start and e1m1 have quite a few. ARWOP's first level has over 1000!
Re: External Texture Load Speed
Spike wrote: "my experience is that windows is extreemly sluggish when it comes to trying to fopen 5*6*60*4*8=57600 different files on each map change (gamepaths, extensions, textures, diffuse/luma/bump/norm, gamedirs). Linux isn't half as bad."
The math certainly starts getting big real quick.
Re: External Texture Load Speed
its quicker to enumerate once than to fopen in each gamedir 7200 times, especially if those files are in a pak/pk3. 
Trust me though, linux (utf-8 only) is muuuch faster than windows (utf-16/utf-8/case insensitivity/alternative file names/ntfs... vista...).
This is made even worse when you have id1+qw+fte+$gamedir+home/id1+home/id1+home/fte+home/$gamedir, which I guess is the real difference in our experiences. Hash tables allow FTE to combine all filesystems into a single lookup, regardless of how many paks/pk3s/gamedirs/etc are loaded.
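A minimal sketch of a single combined lookup table, assuming a fixed-size chained hash table keyed by the game-relative path. This is not FTE's implementation; all names, the bucket count and the `source` index are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

#define FS_HASH_BUCKETS 1024    /* power of two, arbitrary for this sketch */

typedef struct fs_node_s {
    struct fs_node_s *next;
    const char       *name;     /* game-relative path, not copied */
    int               source;   /* hypothetical index of the pak/pk3/dir */
} fs_node_t;

typedef struct {
    fs_node_t *buckets[FS_HASH_BUCKETS];
} fs_hash_t;

/* 32-bit FNV-1a, reduced to a bucket index. */
static unsigned fs_hash_name(const char *s)
{
    unsigned h = 2166136261u;
    while (*s) {
        h ^= (unsigned char)*s++;
        h *= 16777619u;
    }
    return h & (FS_HASH_BUCKETS - 1);
}

/* Add files in priority order, highest first: a name that is already
   present is kept, so lower-priority duplicates are ignored.
   Returns 0 only when out of memory. */
int FS_HashAdd(fs_hash_t *t, const char *name, int source)
{
    unsigned b = fs_hash_name(name);
    fs_node_t *n;

    for (n = t->buckets[b]; n; n = n->next)
        if (!strcmp(n->name, name))
            return 1;
    n = malloc(sizeof(*n));
    if (!n)
        return 0;
    n->name = name;
    n->source = source;
    n->next = t->buckets[b];
    t->buckets[b] = n;
    return 1;
}

/* One lookup, no matter how many paks and gamedirs were enumerated. */
const fs_node_t *FS_HashFind(const fs_hash_t *t, const char *name)
{
    const fs_node_t *n;
    for (n = t->buckets[fs_hash_name(name)]; n; n = n->next)
        if (!strcmp(n->name, name))
            return n;
    return NULL;
}

/* Flush everything, e.g. on a gamedir or map change. */
void FS_HashClear(fs_hash_t *t)
{
    int i;
    for (i = 0; i < FS_HASH_BUCKETS; i++) {
        while (t->buckets[i]) {
            fs_node_t *n = t->buckets[i];
            t->buckets[i] = n->next;
            free(n);
        }
    }
}
```

The cost moves to enumeration time: every file is hashed once up front, and a miss afterwards only walks one short bucket chain.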
You should probably look into a binary search for pak files - each time you double the number of files in the pak, you only add one extra iteration. See fodquake as an example, but you do need to sort the files inside the pak first, somehow.
sidenote: fte actually loads paks/pk3s within paks/pk3s. This allows embedding paks etc within the .apk of the android port, but you probably don't want to compress those files - the android port uses the apk file as its basedir. I just can't insert pak0/pak1 in there due to copyright issues.
Really though, you ought to be scanning for the filename and ignore the extension, so you find eg: $gamedir/pak0.pak/image.pcx in preference to id1/somepack.pak/image.tga - this is something that pretty much every engine fails at, including my own. Just sayin.
Note that this is especially problematic if you include your own texture package with priority above id1 as it'll take priority over any png/pcx/etc anywhere else, and even tga in id1.
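One way to read the "scan for the filename and ignore the extension" suggestion, as a hedged sketch: walk the search locations from highest to lowest priority and accept the first file whose base name matches with any supported extension. The types, the extension list and the function names are all assumptions for illustration.

```c
#include <string.h>

typedef struct {
    const char  *label;      /* e.g. "$gamedir/pak0.pak" */
    const char **files;      /* names of the files in this location */
    int          num_files;
} search_loc_t;

static const char *const image_exts[] = { "tga", "pcx", "png", "jpg" };

/* Nonzero if `file` is `base` followed by '.' and a supported extension. */
static int matches_base(const char *file, const char *base)
{
    size_t len = strlen(base);
    size_t i;

    if (strncmp(file, base, len) != 0 || file[len] != '.')
        return 0;
    for (i = 0; i < sizeof(image_exts) / sizeof(image_exts[0]); i++)
        if (!strcmp(file + len + 1, image_exts[i]))
            return 1;
    return 0;
}

/* `locs` is ordered highest priority first. The location decides the
   winner; the extension is only a tie-breaker inside one location. */
const char *FS_FindImage(const search_loc_t *locs, int num_locs,
                         const char *base, const search_loc_t **found_in)
{
    int i, j;

    for (i = 0; i < num_locs; i++) {
        for (j = 0; j < locs[i].num_files; j++) {
            if (matches_base(locs[i].files[j], base)) {
                if (found_in)
                    *found_in = &locs[i];
                return locs[i].files[j];
            }
        }
    }
    return NULL;
}
```

With a .pcx in the gamedir pak and a .tga in an id1 pak, this returns the gamedir .pcx, which is the preference order described above. An extension-first loop would have picked the id1 .tga instead.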
Knightmare (Posts: 63, Joined: Thu Feb 09, 2012 1:55 am)
Re: External Texture Load Speed
The first thing I did was to make sure that it wasn't trying to load each replacement texture more than once (this is in Q2, BTW, so there is no set of textures in the BSP, just surfaces with texture names).
If you use a hash table, you'll just end up string comparing every entry in pak files that don't have the texture you're looking for. So I made hashes of the filenames, and compared those instead. If the hashes matched, I did a string compare too, just to be sure. Further, I also added hash compares to Q2's FindImage routine.
Comparing integers is faster than strings, and I got a nice reduction in load time. Also, loading saves on the same level now takes 1 second or less.
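A small sketch of the compare-hashes-first idea, assuming a precomputed 32-bit FNV-1a hash stored next to each pak entry. The hash choice and all names here are illustrative, not the actual Q2 engine code.

```c
#include <string.h>

typedef struct {
    char     name[56];
    unsigned hash;           /* computed once when the pak is loaded */
    int      filepos, filelen;
} pak_entry_t;

/* 32-bit FNV-1a over the file name. */
unsigned hash_filename(const char *s)
{
    unsigned h = 2166136261u;
    while (*s) {
        h ^= (unsigned char)*s++;
        h *= 16777619u;
    }
    return h;
}

/* Fill in the hash for every entry, once per pak. */
void Pak_HashEntries(pak_entry_t *entries, int count)
{
    int i;
    for (i = 0; i < count; i++)
        entries[i].hash = hash_filename(entries[i].name);
}

/* Still a linear scan, but each step is an integer compare. The
   strcmp only runs when the hashes match, to rule out a collision. */
const pak_entry_t *Pak_FindHashed(const pak_entry_t *entries, int count,
                                  const char *path)
{
    unsigned h = hash_filename(path);
    int i;

    for (i = 0; i < count; i++)
        if (entries[i].hash == h && !strcmp(entries[i].name, path))
            return &entries[i];
    return NULL;
}
```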
Re: External Texture Load Speed
Spike wrote: "its quicker to enumerate once than to fopen in each gamedir 7200 times, especially if those files are in a pak/pk3.
Trust me though, linux (utf-8 only) is muuuch faster than windows (utf-16/utf-8/case insensitivity/alternative file names/ntfs... vista...).
This is made even worse when you have id1+qw+fte+$gamedir+home/id1+home/id1+home/fte+home/$gamedir, which I guess is the real difference in our experiences. Hash tables allow FTE to combine all filesystems into a single lookup, regardless of how many paks/pk3s/gamedirs/etc are loaded."
I believed you the first time.
You have more multiplication going on than I do (plus pk3). Plus you sold me on the idea that PNG sucks and that JPEG is undesirable overall. I like how PCX and TGA are easy to implement even on some device. So I have the PNG and JPEG code optional, with a file controlling the #ifdefs, and have them both disabled (but I always make sure they work in the event I change my mind).
Plus I use Quakespasm's "early termination" where "A model's replacement skin cannot be in a lower search location than the model itself".
Knightmare wrote: "The first thing I did was to make sure that it wasn't trying to load each replacement texture more than once (this is in Q2, BTW, so there is no set of textures in the BSP, just surfaces with texture names).
If you use a hash table, you'll just end up string comparing every entry in pak files that don't have the texture you're looking for. So I made hashes of the filenames, and compared those instead. If the hashes matched, I did a string compare too, just to be sure. Further, I also added hash compares to Q2's FindImage routine.
Comparing integers is faster than strings, and I got a nice reduction in load time. Also, loading saves on the same level now takes 1 second or less."
Interesting
Re: External Texture Load Speed
It's generally the case on Windows that the CRT function is implemented as a wrapper around the Windows API function, so in this case the implementation of fopen will be a wrapper around CreateFile, and will probably do a bunch of other stuff before and/or after the CreateFile call. I haven't looked at the CRT source code for a while, and don't have a copy of it any more, but can get one easily enough and confirm. Either way, the upshot is that on Windows the API call is normally preferable to the CRT version - both in terms of performance and useful extra functionality.
Building some kind of lookup table is one solution but it does slow down game changing.
My current take on it is to pre-check the paths for replacement textures, and if a path doesn't exist then don't bother trying to use it for individual textures. That doesn't deal with the fact that I support about 7 or 8 different image formats, but does knock about 75% off this element of the load time.
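A hedged sketch of that pre-check for loose directories on disk, assuming a POSIX system (stat and S_ISDIR from <sys/stat.h>). Paths inside paks would need their own check. The function names are hypothetical.

```c
#include <sys/stat.h>

/* Nonzero if `path` exists and is a directory. */
static int dir_exists(const char *path)
{
    struct stat st;
    return stat(path, &st) == 0 && S_ISDIR(st.st_mode);
}

/* Run once per gamedir/map change: copy only the candidate texture
   directories that actually exist into `out`, preserving order.
   Returns how many were kept. Per-texture lookups then loop over
   `out` instead of the full candidate list. */
int TexDirs_Filter(const char **candidates, int num_candidates,
                   const char **out, int max_out)
{
    int i, kept = 0;

    for (i = 0; i < num_candidates && kept < max_out; i++)
        if (dir_exists(candidates[i]))
            out[kept++] = candidates[i];
    return kept;
}
```

If only one of, say, four candidate directories exists, every later texture lookup skips the other three entirely, whatever the number of image formats.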
With texture caching the hit only occurs the first time a texture is used, of course. Assuming you've got a properly functioning cache, subsequent uses of the same texture should only require as long as it takes to search the cache.
Regarding format, the preferable one for load speeds is DDS. No, you don't have to use compression with DDS; it's perfectly capable of holding uncompressed formats too. The main reason why it's best is because it holds the texture in a format that can be sent directly to the GPU with no pre-processing whatsoever required, and can hold pregenerated submip levels so that's another bunch of CPU-side work gone.
It's just a well-specified binary file format so you can easily use it with OpenGL - Doom 3 used it.
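To show how little parsing is involved, here is a minimal sketch that reads the basic fields from a DDS file already loaded into memory. It assumes the standard layout of a 4-byte "DDS " magic followed by a 124-byte header, and it does not handle the optional DX10 extension header. The struct and function names are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    uint32_t             width, height;
    uint32_t             mip_count;   /* at least 1 */
    int                  has_fourcc;  /* set for compressed/custom formats */
    const unsigned char *pixels;      /* first byte after the header */
} dds_info_t;

/* DDS fields are little-endian; read them byte by byte so the code
   does not depend on host endianness or alignment. */
static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 1 on success, 0 if the buffer does not look like a DDS file. */
int DDS_ParseHeader(const unsigned char *buf, size_t len, dds_info_t *out)
{
    if (len < 128 || memcmp(buf, "DDS ", 4) != 0)
        return 0;
    if (read_le32(buf + 4) != 124)          /* dwSize of the header */
        return 0;

    out->height    = read_le32(buf + 12);   /* dwHeight */
    out->width     = read_le32(buf + 16);   /* dwWidth */
    out->mip_count = read_le32(buf + 28);   /* dwMipMapCount */
    if (out->mip_count == 0)
        out->mip_count = 1;                 /* no mip chain stored */
    /* Pixel format flags; 0x4 is DDPF_FOURCC. */
    out->has_fourcc = (read_le32(buf + 80) & 0x4u) != 0;
    out->pixels     = buf + 128;
    return 1;
}
```

From there, an uncompressed file's mip levels sit back to back starting at `pixels`, so each level can be handed to the texture upload call with a pointer into the same buffer.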
We had the power, we had the space, we had a sense of time and place
We knew the words, we knew the score, we knew what we were fighting for
Re: External Texture Load Speed
Yep dds is quite nice, though i hear it can be the victim of texture corruption.
Besides that it makes life somewhat easier since as mh says it can hold premipped textures.
Re: External Texture Load Speed
reckless wrote: "Yep dds is quite nice, though i hear it can be the victim of texture corruption."
Only if you use compression, which you don't have to use. Either way it's just a matter of reading the entire file into a buffer then glTexImaging some pointers.