(Multi-) Threading
I've messed with threading in C a little in the past, but couldn't think of anything good to use it for. That's probably changing now ...
OK, Objective-C does multi-threading and hides everything away. C won't be kind like that.
Rules as far as I know:
1. Writing to a type is atomic.
2. In C, I probably can't atomically write an int or "string" without tons of overhead.
3. Windows uses its own deal; on non-Windows, POSIX is the deal: <pthread.h>
4. I need to create mutexes for locking and unlocking, and avoid "race conditions" and anything that can cause a perma-lock (deadlock).
5. Spike said grabbing stuff from FTE was a good starting point.
The night is young. How else can I annoy the world before sunrise? Inquisitive minds want to know! And if they don't -- well, like that has ever stopped me before ..
Re: (Multi-) Threading
1. Atomicity depends on the CPU and compiler.
x86 guarantees that even byte accesses are atomic (but not long longs, as these are emulated on 32-bit chips like x86).
Other CPUs have only word (i.e. 32-bit) accesses, and need a read-modify-write to avoid changing the neighbouring parts of the data.
Atomic operations are available on most CPUs (either full bus locks on x86, or write-if-still-equal on ARM).
Compilers can steamroller over everything by rearranging everything. Calling an opaque libc/system function like a mutex lock should ensure no memory accesses get rearranged over the call.
You can also declare types as volatile if you're doing some lockless non-atomic syncs - just make SURE that your data type is a machine word, and that you're not running on some really cheap multi-processor system that has no cache coherency. Or in other words, don't depend on volatile for anything whatsoever at the hardware level.
2. https://gcc.gnu.org/onlinedocs/gcc-4.1. ... ltins.html
For Microsoft compilers, you need to depend upon the Windows API instead - InterlockedAdd for instance. Naturally, this sucks in portable code (but then so does using gcc-specifics).
C11 has threads.h and stdatomic.h but it'll be a long time before those will actually be usable... msvc users are still stuck with C89...
3. Yup, you'll need to abstract that with two sets of functions. See 5.
Note that in Windows, a 'mutex' is an inter-process thing. You'll likely want to use 'critical sections', which is what Windows calls intra-process mutexes.
4. Yup, good luck with that when it comes to pretty much every single Windows function potentially calling SendMessage. If you use mutexes inside your window handler, you are pretty much screwed.
5. FTE has had thread helpers for ages now, but didn't really do too much with them.
It now uses them for DirectSound mixing (yay, lower mixahead with rtlights), independent player physics (because hitting exactly 77fps with rtlights is hard), map relighting (who needs rtlights anyway), and of course a couple of worker threads to load textures (there are a LOT of textures when your rtlights use specular+bumps+luma+etc etc...) and so on.
The worker thread stuff allows me to queue work on a specific thread (either the main thread or the worker thread) and post things from one side to the other. The worker does its thing and posts the 'finished' product back to the main thread, which then links it in accordingly (uploading it to gl in the case of textures). Because they don't touch each other's stuff, there are no sync conditions between threads (although if your main thread expects a result too soon, you can still have race conditions, but you can test for these by bogging down the worker with some Sleep calls).
You can get away with the worker using cvar values, but don't use cvar strings as the pointer can become invalid at inopportune times.
Stuff with no return value like Con_Printf can include a little bit of code to check if it's the main thread, and if not, post and return.
You'll need to very carefully read through the functions your thread calls to make sure it's all thread safe. You WILL get hard-to-track bizarre errors if you fail to be truly thorough.
But yeah, the easy way to do it is to post function+data pointers and wait for a response (while not waiting, of course).
Cunning use of 'condition variables' can allow your worker to wake up once something is posted to it, without any busy waits.
I guess in Windows you could just use PostMessage and have your worker just block in a GetMessage message loop, and then post the result back to the main window (and use SendMessage when you want to sync). But yeah, Windows-specific solutions suck.
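For points 1 to 3, a minimal sketch of the "two sets of functions" idea might look like the following. All the names here (Sys_Mutex*, Sys_AtomicAdd, counters_t, Counter_ThreadMain) are made up for illustration and are not FTE's helpers. The non-Windows side uses pthreads and the gcc `__sync_fetch_and_add` builtin; the Windows side uses critical sections and InterlockedExchangeAdd, and should be treated as an untested assumption.

```c
#ifdef _WIN32
#include <windows.h>
/* Windows: a critical section is the intra-process lock. */
typedef CRITICAL_SECTION sys_mutex_t;
void Sys_MutexInit(sys_mutex_t *m)    { InitializeCriticalSection(m); }
void Sys_MutexLock(sys_mutex_t *m)    { EnterCriticalSection(m); }
void Sys_MutexUnlock(sys_mutex_t *m)  { LeaveCriticalSection(m); }
void Sys_MutexDestroy(sys_mutex_t *m) { DeleteCriticalSection(m); }
#else
#include <pthread.h>
/* Everything else: POSIX threads. */
typedef pthread_mutex_t sys_mutex_t;
void Sys_MutexInit(sys_mutex_t *m)    { pthread_mutex_init(m, NULL); }
void Sys_MutexLock(sys_mutex_t *m)    { pthread_mutex_lock(m); }
void Sys_MutexUnlock(sys_mutex_t *m)  { pthread_mutex_unlock(m); }
void Sys_MutexDestroy(sys_mutex_t *m) { pthread_mutex_destroy(m); }
#endif

/* Atomic add that returns the old value. */
#ifdef _MSC_VER
typedef volatile LONG sys_atomic_t;
#define Sys_AtomicAdd(ptr, n) InterlockedExchangeAdd((ptr), (n))
#else
typedef volatile int sys_atomic_t;
#define Sys_AtomicAdd(ptr, n) __sync_fetch_and_add((ptr), (n))
#endif

#define COUNTER_ITERATIONS 100000

typedef struct {
    sys_mutex_t  lock;
    int          locked_count;  /* only touched while holding lock */
    sys_atomic_t atomic_count;  /* only touched through Sys_AtomicAdd */
} counters_t;

/* Thread entry point (pthread signature): bump both counters. */
void *Counter_ThreadMain(void *arg)
{
    counters_t *c = arg;
    int i;

    for (i = 0; i < COUNTER_ITERATIONS; i++) {
        Sys_AtomicAdd(&c->atomic_count, 1);

        Sys_MutexLock(&c->lock);
        c->locked_count++;
        Sys_MutexUnlock(&c->lock);
    }
    return NULL;
}
```

Run Counter_ThreadMain on several threads against one counters_t and both counts should come out exact. A bare `count++` with neither the lock nor the atomic add would typically lose increments.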
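The "post function+data pointers" scheme from point 5 could be sketched roughly like this with pthreads. This is not FTE's code; jobqueue_t, Queue_Post, Worker_Main and the Demo_* functions are hypothetical names, and a Windows build would need its own equivalents. One queue feeds the worker, which sleeps on a condition variable instead of busy-waiting; a second queue carries finished work back to the main thread, which drains it once per frame without blocking.

```c
#include <pthread.h>
#include <stdlib.h>

typedef void (*job_func_t)(void *data);

typedef struct job_s {
    job_func_t    func;
    void         *data;
    struct job_s *next;
} job_t;

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  wake;   /* signalled whenever a job is posted */
    job_t          *head, *tail;
    int             quit;
} jobqueue_t;

void Queue_Init(jobqueue_t *q)
{
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->wake, NULL);
    q->head = q->tail = NULL;
    q->quit = 0;
}

/* Post a function+data pair. May be called from any thread. */
int Queue_Post(jobqueue_t *q, job_func_t func, void *data)
{
    job_t *j = malloc(sizeof(*j));
    if (!j)
        return 0;
    j->func = func;
    j->data = data;
    j->next = NULL;

    pthread_mutex_lock(&q->lock);
    if (q->tail) q->tail->next = j; else q->head = j;
    q->tail = j;
    pthread_cond_signal(&q->wake);
    pthread_mutex_unlock(&q->lock);
    return 1;
}

/* Take the next job, or NULL if none. With block set, sleep until
   a job arrives or the queue is shut down. */
static job_t *Queue_Take(jobqueue_t *q, int block)
{
    job_t *j;
    pthread_mutex_lock(&q->lock);
    while (block && !q->head && !q->quit)
        pthread_cond_wait(&q->wake, &q->lock);
    j = q->head;
    if (j) {
        q->head = j->next;
        if (!q->head) q->tail = NULL;
    }
    pthread_mutex_unlock(&q->lock);
    return j;
}

/* Worker thread entry: runs jobs until shut down and drained. */
void *Worker_Main(void *arg)
{
    jobqueue_t *q = arg;
    job_t *j;
    while ((j = Queue_Take(q, 1)) != NULL) {
        j->func(j->data);
        free(j);
    }
    return NULL;
}

/* Main thread, once per frame: run whatever was posted back. */
int Queue_RunPending(jobqueue_t *q)
{
    int n = 0;
    job_t *j;
    while ((j = Queue_Take(q, 0)) != NULL) {
        j->func(j->data);
        free(j);
        n++;
    }
    return n;
}

void Queue_Shutdown(jobqueue_t *q)
{
    pthread_mutex_lock(&q->lock);
    q->quit = 1;
    pthread_cond_broadcast(&q->wake);
    pthread_mutex_unlock(&q->lock);
}

/* Demo job: the worker "loads" something, then posts the finished
   product back so the main thread can link it in. */
typedef struct { jobqueue_t *mainq; int input, result, linked; } demo_t;

void Demo_Finish(void *data) { ((demo_t *)data)->linked = 1; }

void Demo_Load(void *data)
{
    demo_t *d = data;
    d->result = d->input * 2;
    Queue_Post(d->mainq, Demo_Finish, d);
}
```

The worker never touches main-thread state directly; it only posts. The same trick would cover something like Con_Printf from a worker: post the string to the main queue and return.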
Re: (Multi-) Threading
Thanks for your knowledge on this topic.
Things I may end up using threads for:
1. The maps and demos menu. The code I have currently only updates 17 maps per frame. No problem, right?
---- Well, if Windows isn't expecting me to access those files, even a mere 17 quick fopens is slow as hell. With threads, maybe I can touch those files in another thread to wake up the Windows filesystem or whatever is going on with that.
2. I'm probably going to take a stab at the server offering travail.zip or whatever game is in play (especially for LAN) for installation. A client downloading via FTP or whatever I do can sit and wait, I don't care. But the server shouldn't be impaired if 3 clients are downloading a big zip file and 1 client isn't. I'm looking to avoid the problem that, say, Quakeworld tries to address with download rate limiting, by not rate limiting. Rate limiting with an 80 MB single player zip on a LAN isn't feasible, and having the clients download from, say, Quaddicted is silly if the server has the zip. A server on a LAN can transfer to a client WAY faster.
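For idea 1, something like the sketch below is what I have in mind, assuming pthreads. The names (filescan_t, FileScan_Thread, FileScan_Progress) are invented for this post, and reading the file size is only a stand-in for whatever the menu really needs from each file. A background thread opens every file in a list while the main thread polls once per frame to see how many entries are ready.

```c
#include <pthread.h>
#include <stdio.h>

typedef struct {
    const char    **paths;  /* files to touch */
    long           *sizes;  /* filled in: size in bytes, or -1 on failure */
    int             count;
    pthread_mutex_t lock;
    int             done;   /* entries finished so far, guarded by lock */
} filescan_t;

void FileScan_Init(filescan_t *s, const char **paths, long *sizes, int count)
{
    s->paths = paths;
    s->sizes = sizes;
    s->count = count;
    s->done  = 0;
    pthread_mutex_init(&s->lock, NULL);
}

/* Background thread: open each file so the slow disk access happens here. */
void *FileScan_Thread(void *arg)
{
    filescan_t *s = arg;
    int i;

    for (i = 0; i < s->count; i++) {
        long size = -1;
        FILE *f = fopen(s->paths[i], "rb");
        if (f) {
            if (fseek(f, 0, SEEK_END) == 0)
                size = ftell(f);
            fclose(f);
        }
        /* Publish the result under the lock so the main thread
           never sees a half-written entry. */
        pthread_mutex_lock(&s->lock);
        s->sizes[i] = size;
        s->done = i + 1;
        pthread_mutex_unlock(&s->lock);
    }
    return NULL;
}

/* Main thread, once per frame: entries [0, result) are safe to read. */
int FileScan_Progress(filescan_t *s)
{
    int n;
    pthread_mutex_lock(&s->lock);
    n = s->done;
    pthread_mutex_unlock(&s->lock);
    return n;
}
```

The menu would then draw only the first FileScan_Progress() entries each frame instead of doing 17 fopens itself.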
Re: (Multi-) Threading
Baker wrote: 1. The maps and demos menu. The code I have currently only updates 17 maps per frame. No problem, right?
---- Well, if Windows isn't expecting me to access those files, even a mere 17 quick fopens is slow as hell. With threads, maybe I can touch those files in another thread to wake up the Windows filesystem or whatever is going on with that.
The advantage is that you go at the speed the system goes at. Throw a few threads at it and you won't be waiting for the disk quite so much.
Baker wrote: 2. I'm probably going to take a stab at the server offering travail.zip or whatever game is in play (especially for LAN) for installation. A client downloading via FTP or whatever I do can sit and wait, I don't care. But the server shouldn't be impaired if 3 clients are downloading a big zip file and 1 client isn't. I'm looking to avoid the problem that, say, Quakeworld tries to address with download rate limiting, by not rate limiting. Rate limiting with an 80 MB single player zip on a LAN isn't feasible, and having the clients download from, say, Quaddicted is silly if the server has the zip. A server on a LAN can transfer to a client WAY faster.
Downloads are a good use of threads. You still want to throttle though - at the end of the day, game traffic should have a higher priority than download traffic, and you might not be the only server process on that machine.
Ultimately, not throttling is simply not an option, because someone will set their cvars to aggressively flood packets from one server, and then get kicked off the net by the next server because it can actually cope with the requested bandwidth (ssd!). TCP has throttling built in due to its window sizes.
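If the download doesn't go over TCP, one common way to throttle is a token bucket. To be clear, that is a technique picked for this sketch and not something FTE or Quakeworld is claimed to do; throttle_t, Throttle_Init and Throttle_Allow are hypothetical names. The download code asks how many bytes it may send right now, and the bucket refills at a fixed rate up to a burst cap, so game traffic always keeps some headroom.

```c
#include <stddef.h>

typedef struct {
    double rate;    /* bytes per second added to the bucket */
    double burst;   /* most bytes the bucket can hold */
    double tokens;  /* bytes available right now */
    double last;    /* time of the last refill, in seconds */
} throttle_t;

void Throttle_Init(throttle_t *t, double rate, double burst, double now)
{
    t->rate   = rate;
    t->burst  = burst;
    t->tokens = 0.0;   /* start empty: nothing may be sent yet */
    t->last   = now;
}

/* Returns how many of the wanted bytes may be sent at time 'now'. */
size_t Throttle_Allow(throttle_t *t, double now, size_t want)
{
    size_t grant;

    /* Refill for the time that has passed, capped at the burst size. */
    if (now > t->last) {
        t->tokens += (now - t->last) * t->rate;
        if (t->tokens > t->burst)
            t->tokens = t->burst;
        t->last = now;
    }

    grant = (size_t)t->tokens;
    if (grant > want)
        grant = want;
    t->tokens -= (double)grant;
    return grant;
}
```

On a LAN the rate can simply be set very high; the point is that the cap is the server's decision rather than the client's cvars.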
Re: (Multi-) Threading
I suppose that BSP traversal for collision detection may benefit from multithreading, too (don't ask me how though, it's way beyond my skills). From previous benchmarks I concluded that traceline was the most CPU-intensive task in singleplayer (not a big surprise, but it is always good to have numbers supporting empirical knowledge).
I know FrikaC made a cgi-bin version of the quakec interpreter once and wrote part of his website in QuakeC (LordHavoc)
Re: (Multi-) Threading
You could use Intel's Threading Building Blocks, which is open source and works with both gcc and msvc.
Productivity is a state of mind.
Re: (Multi-) Threading
If you're using traceline for collision detection you're probably doing something crazy in QC.
Quake by default uses an "areanodes" system instead, which is not the BSP tree but a separate hierarchical tree, which subdivides the world into areas and places entities into those areas, so that only entities in the same area need to be tested for collision with each other. It's still an O(n-squared) operation but for a significantly smaller value of n.
This fails with bigger maps because it uses a static number of areas (32) with a static area depth (4) so bigger maps just have bigger areas with more entities in them, and the cost of collision detection (value of n) goes up.
A more useful optimization is to just create more areas for bigger maps. Multithreading it is brute-forcing the problem in other words; an algorithmic optimization makes more sense in that case.
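As a rough illustration of "more areas for bigger maps", here is a simplified 2D sketch in the spirit of the areanode tree. It is not the engine's actual code: Area_PickDepth, Area_Create, Area_Free and the target area size are all assumptions made for the example. The depth is picked from the map extents instead of being a compile-time constant, and the tree is allocated to match.

```c
#include <stdlib.h>

typedef struct areanode_s {
    int                axis;        /* 0 = split on x, 1 = split on y, -1 = leaf */
    float              dist;        /* position of the split plane */
    struct areanode_s *children[2];
} areanode_t;

/* Keep halving the longer side until every leaf would be no larger
   than 'target' units on either side, or 'maxdepth' is reached. */
int Area_PickDepth(float sx, float sy, float target, int maxdepth)
{
    int depth = 0;
    while (depth < maxdepth && (sx > target || sy > target)) {
        if (sx > sy) sx *= 0.5f; else sy *= 0.5f;
        depth++;
    }
    return depth;
}

/* Recursively split the box along its longer axis down to maxdepth.
   'count' is incremented once per node created. */
areanode_t *Area_Create(int depth, int maxdepth,
                        const float mins[2], const float maxs[2], int *count)
{
    float mins1[2], maxs1[2], mins2[2], maxs2[2];
    areanode_t *n = calloc(1, sizeof(*n));
    if (!n)
        return NULL;
    (*count)++;

    if (depth == maxdepth) {
        n->axis = -1;               /* leaf: entities get linked here */
        return n;
    }

    n->axis = (maxs[0] - mins[0] > maxs[1] - mins[1]) ? 0 : 1;
    n->dist = 0.5f * (maxs[n->axis] + mins[n->axis]);

    mins1[0] = mins2[0] = mins[0];  mins1[1] = mins2[1] = mins[1];
    maxs1[0] = maxs2[0] = maxs[0];  maxs1[1] = maxs2[1] = maxs[1];
    maxs1[n->axis] = mins2[n->axis] = n->dist;

    n->children[0] = Area_Create(depth + 1, maxdepth, mins2, maxs2, count);
    n->children[1] = Area_Create(depth + 1, maxdepth, mins1, maxs1, count);
    return n;
}

void Area_Free(areanode_t *n)
{
    if (!n)
        return;
    Area_Free(n->children[0]);
    Area_Free(n->children[1]);
    free(n);
}
```

In this sketch a depth of 4 produces 31 nodes, 16 of them leaves, while an 8192 x 8192 map with a 1024-unit target gets depth 6 and 64 leaves, so each leaf holds fewer entities to test against each other.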
We had the power, we had the space, we had a sense of time and place
We knew the words, we knew the score, we knew what we were fighting for
Re: (Multi-) Threading
revelator wrote: could use intels threading building blocks which are opensource works for both gcc and msvc.
C++ library only for desktop computers. What could go wrong?
Re: (Multi-) Threading
Baker wrote: C++ library only for desktop computers. What could go wrong?
Probably not much, but then again multithreading can be a bitch to get working correctly; this should at least help quite a bit.
Or use multithreading to run different parts of the engine in their own threads like Doom3 (2 threads) or BFG, which takes it even further.
Not sure how much gain it would give an engine like Quake, but on something like DarkPlaces with all the bells and whistles it should surely be noticeable, at least with rt lights etc.
Would be interesting to see the outcome.
Re: (Multi-) Threading
Baker wrote: C++ library only for desktop computers. What could go wrong?
revelator wrote: Probably not much but then again multithreading can be a bitch to get to work correctly, this should atleast help quite a bit.
Your work with the Doom 3 engine is far more complex than what I work on. And DarkPlaces is more complex than what I work on.
Setting that aside, what I meant is that if I write code that is almost entirely portable to mobile platforms, why would I want to give that up? And why would I want to switch to C++ for threading?
I have no doubt that the library is outstanding for desktop operating systems and C++. But it's 2015 and there's more than desktops out there.
With a case of Red Bull, a few 5 Hour Energy and 4 days --- I could convert my current engine works to Android and iOS.
Re: (Multi-) Threading
Baker wrote: With a case of Red Bull, a few 5 Hour Energy and 4 days --- I could convert my current engine works to Android and iOS.
Go on then.
You will have to keep it running on windows+linux at the same time, of course. Good luck with that.
By the way, browsers don't support threads in the conventional sense, so if you're aiming at an emscripten port, you'll need to retain a threadless mode (at least at compile time).
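A compile-time threadless mode could be as small as the sketch below. NO_THREADS is a made-up build flag and task_t, Task_Start, Task_Wait and Demo_Square are hypothetical names, not FTE's. With threads the task runs on a pthread; without them, Task_Start simply runs the function inline, so calling code looks the same either way.

```c
#ifndef NO_THREADS   /* hypothetical flag: define it for targets without threads */
#include <pthread.h>

typedef struct {
    pthread_t thread;
    void    (*func)(void *);
    void     *data;
} task_t;

/* Adapts the task function to the pthread entry-point signature. */
static void *Task_Trampoline(void *arg)
{
    task_t *t = arg;
    t->func(t->data);
    return NULL;
}

/* Start the task on its own thread. Returns nonzero on success. */
int Task_Start(task_t *t, void (*func)(void *), void *data)
{
    t->func = func;
    t->data = data;
    return pthread_create(&t->thread, NULL, Task_Trampoline, t) == 0;
}

/* Block until the task has finished. */
void Task_Wait(task_t *t)
{
    pthread_join(t->thread, NULL);
}

#else  /* threadless build: do the work immediately on the caller's thread */

typedef struct { int unused; } task_t;

int Task_Start(task_t *t, void (*func)(void *), void *data)
{
    (void)t;
    func(data);
    return 1;
}

void Task_Wait(task_t *t)
{
    (void)t;   /* nothing to wait for, the work is already done */
}

#endif

/* Example task: square an int in place. */
void Demo_Square(void *data)
{
    int *v = data;
    *v = *v * *v;
}
```

The threadless path loses the parallelism, of course, but it keeps the port building, and the result is only ever read after Task_Wait in both modes.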
Re: (Multi-) Threading
Baker wrote: With a case of Red Bull, a few 5 Hour Energy and 4 days --- I could convert my current engine works to Android and iOS.
Spike wrote: browsers don't support threads in the conventional sense
???
Somewhere a divide by 0 occurred. I know not where. Maybe because I used the word convert or something?
Re: (Multi-) Threading
Spike wrote: so if you're aiming at an emscripten port
Javascript. The ultimate in portability and suffering.
You can't claim you want to avoid desktop-only stuff because you want other ports, only to completely rule out the use of emscripten to give you a version that'll run in browsers on any device (well, any device that supports webgl etc, anyway).
Hmm, you DO know what emscripten is, right? Just in case, it's a C(llvm) -> javascript compiler...
Meh, I guess that's what I get for trying to return to the whole multithreading topic thing.
Re: (Multi-) Threading
Spike wrote: so if you're aiming at an emscripten port
Ah, OK, your FTEQW that runs in the browser.
I just meant that Intel's threading library doesn't work on ARM.
Spike wrote: Hmm, you DO know what emscripten is, right? Just in case, its a C(llvm) -> javascript compiler...
Nope. Had no idea. Had to Google it. Considering it has the word "script" in it, Google is lucky I did even that!
Spike wrote: meh, I guess that's what I get for trying to return to the whole multithreading topic thing.
Probably Tuesday or Wednesday, this thread should make a code-oriented turn back to the main theme.
Until then, I'm going to enjoy beer! The weekends and all that ...
Re: (Multi-) Threading
Ah OK, I had no idea you were developing it for ARM, but TBB mostly works on ARM as well now and there's ongoing work on it, so it's worth considering.
ARM's OpenCV port already uses TBB.