Best engine for lots of entities?

JasonX
Posts: 422
Joined: Tue Apr 21, 2009 2:08 pm

Best engine for lots of entities?

Post by JasonX »

DarkPlaces seemed like a good choice, but its framerate drops with about 100 monsters on screen. Is there a good engine for handling lots of enemies? FitzQuake in software mode, maybe?
Spirit
Posts: 1065
Joined: Sat Nov 20, 2004 9:00 pm

Re: Best engine for lots of entities?

Post by Spirit »

From what I know: It depends (tm) on the logic rather than the polycount.

You will want an advanced protocol and one of FTE/DP/QuakeSpasm/DirectQ, I think.
Improve Quaddicted, send me a pull request: https://github.com/SpiritQuaddicted/Quaddicted-reviews
mh
Posts: 2292
Joined: Sat Jan 12, 2008 1:38 am

Re: Best engine for lots of entities?

Post by mh »

Most Quake engines are CPU-bound.

By intentionally constraining themselves to the same set of GL calls that GLQuake used, they're not able to take advantage of more recent GPU features that can work to accelerate exactly the problem that you (the OP) are encountering. And when I say "more recent" I'm actually speaking about hardware that's up to 10 years old here.

Common issues that lead to exactly this problem include:
  • A preference for brute-forcing simple calculations on the CPU that a GPU is able to do much faster.
  • Hundreds of draw calls per-object.
  • Constantly having to stream vertex data to the GPU each frame, even if that data does not change.
  • Massive subdivision of surfaces to get acceptable results per-vertex for effects that really need to run per-pixel.
The end result is that the CPU is saturated, the PCI-E bus is saturated, and the GPU is mostly idle. This only gets worse as the object count increases. The most powerful processor in your machine, the one that scales really well for this kind of workload, ends up hardly being used at all, while the weaker CPU takes the brunt of the load, and the single-threaded nature of the Quake engine prevents today's multicore CPUs from being used to their full potential.

This is all by way of saying that moving to a software engine is not going to help you here. It's just going to put even more load on the CPU and leave the GPU even more idle.

The problem with the Quake engine is not that BSP or MDL formats can't handle high object counts or complex scenes. They're not the best formats for them for sure, but they can handle them and they can give acceptable performance while doing so (provided you don't get too silly).

The problem with the Quake engine is that the GL calls it uses are woefully inadequate and sub-optimal for this kind of drawing. Even by restricting yourself to GL 1.1 but moving from glBegin/glEnd to vertex arrays with proper batching you can remove a lot of CPU-side overhead and start getting the GPU really working for you, and framerates will soar as scenes become more complex. Move to VBOs and transfer some CPU-side work to shaders (which allows certain classes of vertex data - like MDLs - to remain static) and things only get better.
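As a rough illustration of why batching matters, here is a small counting model in Python. It makes no real GL calls: it only counts how many API calls the CPU would have to issue under two simplified assumptions, namely that immediate mode costs one begin, one end and at least one call per vertex, and that batching merges everything sharing a texture into a single draw call (state changes are ignored). The function names and the example scene are hypothetical.

```python
from collections import defaultdict


def immediate_mode_calls(surfaces):
    """Calls issued when every surface is drawn glBegin/glEnd style:
    one begin, one end, and one call per vertex (a lower bound)."""
    return sum(2 + vertex_count for _texture, vertex_count in surfaces)


def batched_calls(surfaces):
    """Draw calls issued when surfaces sharing a texture are merged
    into one vertex array and submitted together."""
    vertices_per_texture = defaultdict(int)
    for texture, vertex_count in surfaces:
        vertices_per_texture[texture] += vertex_count
    return len(vertices_per_texture)


# Hypothetical scene: 100 monsters, 500 triangles each (1500 vertices),
# alternating between two skins.
scene = [("ogre_skin" if i % 2 else "knight_skin", 1500) for i in range(100)]

print("immediate mode calls:", immediate_mode_calls(scene))
print("batched draw calls:  ", batched_calls(scene))
```

The vertex count is the same in both cases; what changes is how many times the CPU has to talk to the driver to get those vertices drawn.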

Of the currently maintained engines I'd recommend QuakeSpasm as a solution to this problem. It's (at least last time I looked) still missing the final piece of the puzzle, which is to get some instancing in and which can really help with high MDL counts (provided they're all the same MDL, of course), but despite that recent versions are still a huge stride forward from what GLQuake did.

DarkPlaces, by the way, does also solve much of this, but it also comes with some new overhead of its own.
We had the power, we had the space, we had a sense of time and place
We knew the words, we knew the score, we knew what we were fighting for
Spike
Posts: 2914
Joined: Fri Nov 05, 2004 3:12 am
Location: UK

Re: Best engine for lots of entities?

Post by Spike »

regarding network protocols:
vanilla: 600
dp: 32k
fitzquake/quakespasm: 65k
fte: 262k
these are protocol limits rather than default limits. you may need to set pr_maxedicts, max_edicts, pr_ssqc_memsize, or other cvars in order to actually utilise this many ents.

regarding rendering:
be sure to disable any realtime lighting... you don't want everything being drawn 500 times. 100 monsters all using their own muzzleflashes and dlights will destroy your framerate, while all the shadows from static lighting will do the same.
reflective water and other such effects can also result in increased rendering costs per monster.

regarding gamecode:
be aware that idle monsters are not free. they will all be doing various pvs etc checks, and firing off tracelines to see if the player is visible yet. they'll be checking their contents values constantly too. much of this cost will come in bursts, so while you might be able to attain 60fps, you'll get 10 spikes every second that make the game feel like it's running at 12 fps or so.
it can thus pay off to activate areas of monsters based upon triggers rather than leaving them all active from the start. this can reduce cpu load quite a lot.
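
To illustrate the burst problem and what staggering buys you, here is a rough Python sketch (not QuakeC, and not taken from any engine). It assumes every monster thinks once every 0.1 s and that the game renders at 60 fps; the function names and the tick-based timing are made up for the illustration.

```python
TICKS_PER_SECOND = 600                 # integer clock, avoids float rounding
FRAME_TICKS = TICKS_PER_SECOND // 60   # one rendered frame at 60 fps
THINK_TICKS = TICKS_PER_SECOND // 10   # assumed 0.1 s between monster thinks


def thinks_per_frame(offsets, frames=60):
    """Count how many monster thinks land in each frame, given the tick
    at which each monster thinks for the first time."""
    counts = [0] * frames
    horizon = frames * FRAME_TICKS
    for offset in offsets:
        t = offset
        while t < horizon:
            counts[t // FRAME_TICKS] += 1
            t += THINK_TICKS
    return counts


def synchronized(monster_count):
    """Every monster was spawned at map load, so they all think together."""
    return [0] * monster_count


def staggered(monster_count):
    """Spread the first think of each monster evenly over one interval."""
    return [i * THINK_TICKS // monster_count for i in range(monster_count)]


print("worst frame, synchronized:", max(thinks_per_frame(synchronized(100))))
print("worst frame, staggered:   ", max(thinks_per_frame(staggered(100))))
```

The total amount of work per second is identical in both cases; only the worst single frame changes. In QuakeC the same idea would amount to giving each monster a slightly different first nextthink.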

tldr:
100 monsters onscreen at 60fps should be perfectly achievable on any of the engines spirit mentioned, so long as there are no extra poorly-scaling things like rtlights, and so long as the monster thinks are staggered slightly.
Tr3B
Posts: 15
Joined: Tue May 13, 2014 2:24 pm

Re: Best engine for lots of entities?

Post by Tr3B »

There was a discussion about RBDOOM-3-BFG running 300 zombies. http://idtechforums.fuzzylogicinc.com/i ... opic=198.0

I had really big problems in XreaL with the Q3A netcode when dealing with many entities. The constant updates from the client game to the renderer completely suck with the Q3A architecture.
Doom 3 allows more entities if you run it in single player because the single player execution skips all netcode and entity delta calculations. It also holds all entities in its own renderer world so you only need to update them if needed.
The BFG engine even runs the entire game code and all physics/AI scripting in a separate thread, which gives another big performance boost.
JasonX
Posts: 422
Joined: Tue Apr 21, 2009 2:08 pm

Re: Best engine for lots of entities?

Post by JasonX »

Would it be possible to write some entity pooling in QuakeC? Or does this have to be done on the engine side? Would it help to just reuse entities instead of destroying and respawning them?

Also, I plan to use very low-poly game models (< 500 polys), both for enemies and other data... would this help the framerate, or is it irrelevant compared to 2k/3k-poly models? Pardon my ignorance guys, and thanks for all the help!
Spike
Posts: 2914
Joined: Fri Nov 05, 2004 3:12 am
Location: UK

Re: Best engine for lots of entities?

Post by Spike »

http://triptohell.info/moodles/junk/fte ... shader.png
(that's about 4.8 million indices, 250 times a second)
or in other words, if the engine avoids deprecated methods then vertex counts are pretty much irrelevant nowadays.
(I should probably take the time to point out that these are NOT entities, but rather geometry generated by the usage of a shader on the surface which they are upon, meaning it's just purely static geometry with very little overhead).

using a base like my purecsqc mod (which completely disregards ssqc and is thus single-player-only) allows you to skip all of the network protocol overhead, but you still have overheads feeding the entities to the renderer and then on to opengl, and you'll need the qc to do your interpolation - this can actually be slower (because qc is executed via an interpreter rather than a jit).

builtins that iterate through lists (like the find builtin) will always have poor scaling with respect to high entity counts, so try to avoid those.
This includes traceline of course. the more entities you have in your room, the more tracelines you're generating and the more entities your tracelines are potentially able to hit.
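
A minimal sketch of the scaling problem, in Python rather than QuakeC: a find-style lookup has to visit every entity in the world, while a list you maintain yourself only visits the entities you care about. The `World` class, its side index and the `visited` counter are all hypothetical and only exist to make the cost visible.

```python
class World:
    def __init__(self):
        self.entities = []       # every entity, in spawn order
        self.by_classname = {}   # hypothetical side index, kept up to date on spawn
        self.visited = 0         # how many entities lookups have touched

    def spawn(self, classname):
        ent = {"classname": classname}
        self.entities.append(ent)
        self.by_classname.setdefault(classname, []).append(ent)
        return ent

    def find_scan(self, classname):
        """find-style lookup: walks the whole entity list every time."""
        matches = []
        for ent in self.entities:
            self.visited += 1
            if ent["classname"] == classname:
                matches.append(ent)
        return matches

    def find_indexed(self, classname):
        """Lookup through the maintained list: touches only the matches."""
        matches = self.by_classname.get(classname, [])
        self.visited += len(matches)
        return list(matches)


world = World()
for _ in range(1000):
    world.spawn("monster_zombie")
for _ in range(3):
    world.spawn("func_door")

world.find_scan("func_door")
print("entities visited by scan: ", world.visited)
world.visited = 0
world.find_indexed("func_door")
print("entities visited by index:", world.visited)
```

In QuakeC the maintained list would more likely be a chain of entities linked through an entity field, but the cost argument is the same.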

there's a few ways to check where your overheads are:
r_norefresh 1:
this setting will skip 3d rendering entirely (you'll probably want gl_clear), leaving any 2d stuff including framerate displays on screen. if there's not much difference, then you know the renderer isn't your slowdown.
r_drawentities 0:
unlike r_norefresh, this setting will still draw the world, and probably some other things like particles. you can compare the two to determine if your slowdown is caused by an excessively complex world bsp entity.
pause:
by pausing the game, you inhibit all serverside physics and qc logic. if the framerate goes up then the server is spending far too long servicing movetypes or think functions.
profiling:
most engines support profiling in some way. both dp and fte can give wall-clock times for individual qc functions ('apropos profile' to see the commands). even vanilla quake can show you how many qc opcodes have been executed (but sadly this does not include the duration of time spent within builtins).
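
The first three checks above can be read as a small decision procedure. The Python sketch below is only an illustration of that reasoning: you measure the framerate four times (normally, with r_norefresh 1, with r_drawentities 0, and while paused) and feed the numbers in. The `diagnose` function and the 25% threshold for "a meaningful difference" are assumptions, not anything an engine provides.

```python
def diagnose(baseline, norefresh, noentities, paused, gain=1.25):
    """Interpret four framerate measurements (all in fps).

    baseline   -- normal play
    norefresh  -- with r_norefresh 1
    noentities -- with r_drawentities 0
    paused     -- with the game paused
    gain       -- assumed ratio that counts as a meaningful speedup
    """
    findings = []
    if norefresh < baseline * gain:
        # Skipping 3D rendering barely helped.
        findings.append("renderer is not the main cost")
    elif noentities >= baseline * gain:
        # Hiding entities alone recovered a lot of framerate.
        findings.append("entity rendering is a major cost")
    else:
        # Rendering is expensive, but entities are not the reason.
        findings.append("world rendering is a major cost")
    if paused >= baseline * gain:
        findings.append("server physics or QC logic is a major cost")
    return findings


# Hypothetical measurements: rendering is cheap, pausing helps a lot.
print(diagnose(baseline=30, norefresh=32, noentities=31, paused=120))
```

Once this points at the server side, the per-function profiling mentioned above is the next step.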