Learning AI

Discuss Artificial Intelligence and Bot programming.
Urre
Posts: 1109
Joined: Fri Nov 05, 2004 2:36 am
Location: Moon

Learning AI

Post by Urre »

I've been doing some theoretical research on learning algorithms - theoretical as in not code-heavy, but aimed at understanding the ideas behind existing learning approaches, where they're used, and how to put them to work in games. Genetic algorithms and neural networks, while very interesting, seem generally useless for games. They're good for all those "impossible to explain" problems like learning to interpret handwriting, learning to interpret images, or learning to climb as fast and securely as possible. Very useful for robots (oh how I wish I had money), but game worlds have very few such problems. More interesting for games is something known as reinforcement learning, which in very short terms is like teaching an animal: you give it treats when it does things you like (like rolling over), and reprimands (hrm) when it does bad things (like chewing on your shoes). Other interesting things were drive-based actions, where the agent has a key set of drives (good ones are hunger and curiosity) and acts based on how strong each drive currently is, and expectation/prediction learning, where the agent learns to expect a certain outcome from a certain action. These two combined make for some really interesting things.

Imagine this: a creature is somewhat hungry, so he needs to find food. He notices a button, and knows from previous experience that something happens when you push a button. His hunger drive isn't overly strong yet (a strong drive would restrict him to more reliable actions), so he decides to push the button. Food appears. He eats the food because he was somewhat hungry. Now only his curiosity remains, so he presses the button again, but doesn't eat the food. After a couple of presses he concludes that food appears with 100% probability when pushing this button. Now the button is boring and will no longer satisfy his curiosity, but it is noted as a reliable source of food.
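To make the drive/expectation idea concrete, here is a minimal sketch in C. Everything in it - the drive values, the thresholds, the decaying-curiosity trick - is invented for illustration, not taken from any existing codebase:

#include <stdio.h>

/* Hypothetical drive-based agent with a learned expectation of what
   pressing a button does; all names and numbers are made up. */
typedef struct {
    float hunger;      /* 0..1, rises over time */
    float curiosity;   /* 0..1, satisfied by novel outcomes */
    int   presses;     /* how often the button has been pushed */
    int   food_count;  /* how often a push produced food */
} agent_t;

static float food_prob(const agent_t *a)
{
    /* Laplace-smoothed estimate of P(food | press): 0.5 with no
       experience, approaching 1.0 after a few successful presses */
    return (a->food_count + 1.0f) / (a->presses + 2.0f);
}

static const char *choose_action(const agent_t *a)
{
    float novelty = 1.0f - food_prob(a);   /* surprise left in the button */

    /* a strong hunger drive restricts the agent to reliable actions */
    if (a->hunger > 0.6f)
        return food_prob(a) > 0.6f ? "press button (reliable food source)"
                                   : "search for food the known way";
    /* otherwise curiosity may win, but only while the outcome is uncertain */
    if (a->curiosity * novelty > a->hunger)
        return "press button (what does it do?)";
    return "wander";
}

int main(void)
{
    agent_t a = { 0.4f, 0.9f, 0, 0 };   /* somewhat hungry, very curious */
    for (int i = 0; i < 6; i++) {
        const char *act = choose_action(&a);
        printf("hunger=%.2f P(food)=%.2f -> %s\n", a.hunger, food_prob(&a), act);
        if (act[0] == 'p') {             /* the button got pressed */
            a.presses++;
            a.food_count++;              /* in this world it always pays out */
            a.curiosity *= 0.6f;         /* novelty wears off */
            a.hunger    *= 0.5f;         /* the food gets eaten */
        }
        a.hunger += 0.15f;               /* hunger creeps back up */
        if (a.hunger > 1.0f) a.hunger = 1.0f;
    }
    return 0;
}

Run for a few ticks, the agent first presses out of curiosity, loses interest as P(food) firms up, and later comes back to the button only when hunger runs high - the same story as the creature above.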

That was a very simple example of a problem (how to satisfy curiosity and hunger). For a long time I've been trying to figure out how to describe the problem of strategy in a game. This is hard for several reasons. For one, it's hard to determine what is part of a strategy and what is not. Second, it's tough to tell whether the strategy was bad, or whether it was just badly executed. The latter especially is what usually makes AI programmers hardcode strategies into their agents rather than let them try things out: there's a high chance the agents will learn the wrong lesson, because they can't be reasoned with - you can't easily tell them *why* something went wrong. That "why" is the interesting part: how to detect where a strategy failed, and whether to adapt it or just try again.

Mauve did something cool with his bots a long time ago. He made them test different ways of fighting a specific player and, depending on the rate of success, prefer that fighting technique against that particular player. It was very simple - just three different hardcoded ways of fighting to choose between - but it is still an example of learning AI. The question is which things are to be considered learnable, and which things to use as input to the learning system. You have to conserve CPU, and AI programmer sanity :)
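The mechanism could be as small as the sketch below: a per-opponent scoreboard of hardcoded techniques, picking whichever has the best success rate so far, with occasional exploration. This is a guess at the shape of the idea, not Mauve's actual code:

#include <stdlib.h>

/* Hypothetical per-opponent record of how well each hardcoded fighting
   technique has worked against that player. */
#define NUM_TECHNIQUES 3   /* e.g. rush, circle-strafe, keep-distance */

typedef struct {
    int fights[NUM_TECHNIQUES];
    int wins[NUM_TECHNIQUES];
} opponent_stats_t;

/* Pick the technique with the best observed win rate against this player;
   10% of the time try a random one so the estimates keep improving. */
static int pick_technique(const opponent_stats_t *s)
{
    if (rand() % 10 == 0)
        return rand() % NUM_TECHNIQUES;

    int best = 0;
    float best_rate = -1.0f;
    for (int i = 0; i < NUM_TECHNIQUES; i++) {
        /* +1/+2 smoothing so untried techniques look promising (0.5) */
        float rate = (s->wins[i] + 1.0f) / (s->fights[i] + 2.0f);
        if (rate > best_rate) { best_rate = rate; best = i; }
    }
    return best;
}

/* Call after each fight against that player. */
static void record_fight(opponent_stats_t *s, int technique, int won)
{
    s->fights[technique]++;
    s->wins[technique] += won;
}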

I'm personally most interested in squad-based AI, be it an encounter with an enemy squad in a singleplayer game like Halo, Half-Life, or any Ubisoft shooter, among many others, or a highly teamwork-reliant online game like, say, Counter-Strike (in which a single round really is much like a squad encounter in one of the above games). I have not yet constructed anything usable in the field of a learning squad, but I have a couple of ideas.

Comment.
I was once a Quake modder
Entar
Posts: 439
Joined: Fri Nov 05, 2004 7:27 pm
Location: At my computer

Post by Entar »

Maybe I'm missing something, but it seems to me that the more complex learning stuff you mentioned (even the reinforcement and drive learning) doesn't have much use in an environment like Quake, unless you really alter the gameplay. Not because it wouldn't work or anything, but in Quake, things get into combat and are dead within a couple to several seconds, so you don't even get to notice much of anything.

Mauve's approach was good, though, because it's in an environment where the AI respawns. Now, if the knowledge from one fight with some grunts were made global, and other grunts started fighting differently as a result, that could work. Not only would this make each group of grunts seem more unique, but they would also get somewhat harder as you progressed through a level. Either that, or you could apply it to a gameplay environment where monsters respawn, getting progressively more tenacious the more they fight.
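Mechanically that kind of sharing is cheap: the learned statistics live in one global (or per-squad) table instead of per-monster, so every grunt reads and writes the same record. A sketch in C, with invented field names:

/* Hypothetical squad-wide memory: every grunt consults and updates the
   same table, so what one grunt learns outlives that grunt. */
typedef struct {
    int player_prefers_rockets;   /* times a grunt died to splash damage */
    int player_rushes;            /* times the player closed to melee range */
    int grunts_lost;              /* how costly the fights have been so far */
} squad_memory_t;

static squad_memory_t squad;      /* global: shared by all grunts */

static void on_grunt_death(int killed_by_splash, int player_was_close)
{
    squad.grunts_lost++;
    squad.player_prefers_rockets += killed_by_splash;
    squad.player_rushes          += player_was_close;
}

/* Each surviving grunt can adjust tactics from the shared record,
   e.g. spread out if the player favors splash damage. */
static int should_spread_out(void)
{
    return squad.player_prefers_rockets * 2 > squad.grunts_lost;
}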
LordHavoc
Posts: 322
Joined: Fri Nov 05, 2004 3:12 am
Location: western Oregon, USA

Re: Learning AI

Post by LordHavoc »

Urre wrote:I'm personally most interested in squad-based AI, be it an encounter with an enemy squad in a singleplayer game like Halo, Half-Life, or any Ubisoft shooter, among many others, or a highly teamwork-reliant online game like, say, Counter-Strike (in which a single round really is much like a squad encounter in one of the above games). I have not yet constructed anything usable in the field of a learning squad, but I have a couple of ideas.
Real learning works by pattern recognition (analysis) and reproduction (synthesis), involving sequences of 'nodes', where a node may itself be a pattern of sub-nodes (as sequences of letters form a word).

Basically a hierarchical version of Markov chains, which are also a popular basis for data compression - a good hint that the approach both recognizes AND predicts patterns with high probability :)
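For the flat (non-hierarchical) case, a first-order Markov chain is nothing more than a table of transition counts. A tiny sketch in C, with a made-up action alphabet:

/* First-order Markov chain over a small action alphabet: count observed
   transitions, then predict the most likely next action. The alphabet
   is invented for illustration. */
enum { ACT_MOVE, ACT_SHOOT, ACT_RELOAD, ACT_HIDE, NUM_ACTS };

static int transitions[NUM_ACTS][NUM_ACTS];  /* counts of a -> b */

static void observe(int prev, int next)      /* analysis */
{
    transitions[prev][next]++;
}

static int predict(int current)              /* synthesis */
{
    int best = 0;
    for (int next = 1; next < NUM_ACTS; next++)
        if (transitions[current][next] > transitions[current][best])
            best = next;
    return best;
}

A hierarchical version would treat each recognized sequence as a new symbol and build the same kind of table one level up.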

But that is a hairy subject that likes to escape our grasp, so I'll focus on what has been accomplished in the world of game AI...
  • Navigation + firing as independent activities - unrealistic in every way, but it suffices to make an opponent move and shoot.
  • State-based logic - switching between one of several 'states' with associated code (monster AI is the most common example). This approach easily suffers from being 'preoccupied with one enemy' due to insufficiently advanced logic (more advanced logic CAN be implemented, of course). Overall it produces excellent results but requires much coding, so it is often not done. The best example of this was in Half-Life 1, where the AI kept several situation flags (whether it was low on health, whether it could see its enemy, whether it could throw a grenade to flush out its enemy, whether its teammates were alive, etc.) and used the flags variable as an index into a lookup table to select one of several AI states - a very viable approach, but one requiring substantial code and 'data entry' for the state selection/switching logic.
  • Weighted average choices, aka 'fuzzy logic' - in its general form this is just state-based logic where all states are run simultaneously and the 'best choice' is selected by weighted averages.
  • Neural nets - everyone's favorite 'this way I won't have to code AI!' approach. In truth this is an inappropriate method for performing actions and never works out in the end, partly because of the EXTREME amount of resources it requires, and partly because no one has found a proper 'mutation algorithm' - everyone picks genetic algorithms for this, and that never works. I believe real learning occurs while sleeping, by running simulations based on the information gathered during the day (playing virtual deathmatches in this case, not gathering data from the deathmatch that was played).
  • Scoring of random choices - simply generating random possible actions and evaluating whether they are a good idea at this time. I have not seen any examples of this, but it should produce plausible behavior even with small numbers of choices being evaluated, particularly if it is used only as a continuously reevaluated 'state selector' (evaluating one random new choice every 50ms, for example, and checking whether it seems better than the action currently being performed; see the sketch after this list). This approach needs to predict the outcome of actions with some degree of certainty, and it can reevaluate the effectiveness of its prediction routines over time, forming a sort of learning algorithm.
My preferred method at this time is the 'scoring of random choices' one, but I have not implemented it or seen any existing examples of it in game AI.
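A sketch of what that continuously reevaluated state selector might look like. The actions, situation features, and scoring terms here are all invented; the score function is where the prediction (and eventually the learning) would live:

#include <stdlib.h>

/* Hypothetical 'scoring of random choices' selector: every think tick
   (say 50ms), propose one random action, score its predicted outcome,
   and switch only if it beats what the bot is already doing. */
typedef enum {
    ACT_ATTACK, ACT_TAKE_COVER, ACT_RETREAT, ACT_GRAB_HEALTH, NUM_ACTIONS
} action_t;

typedef struct {
    float health, ammo, enemy_visible;  /* whatever the bot can sense, 0..1 */
} situation_t;

/* Predict how good an action would be right now. A learning version
   would adjust these estimates from observed outcomes over time. */
static float score_action(action_t act, const situation_t *s)
{
    switch (act) {
    case ACT_ATTACK:      return s->enemy_visible * s->ammo;
    case ACT_TAKE_COVER:  return s->enemy_visible * (1.0f - s->health);
    case ACT_RETREAT:     return (1.0f - s->health) * (1.0f - s->ammo);
    case ACT_GRAB_HEALTH: return 1.0f - s->health;
    default:              return 0.0f;
    }
}

/* Called every think tick; the +0.1 hysteresis keeps the bot from
   flip-flopping between nearly equal choices. */
static action_t think(action_t current, const situation_t *s)
{
    action_t candidate = (action_t)(rand() % NUM_ACTIONS);
    if (score_action(candidate, s) > score_action(current, s) + 0.1f)
        return candidate;
    return current;
}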

It should also be noted that any algorithm can be made to appear to learn, simply by disabling most of its choices until it sees those actions performed by another character. The game Magic Carpet did this with the enemy wizards: they would not attack your mana-gathering hot-air balloons until you attacked theirs, would generally not attack you directly until you attacked them, and would not attack your castle until you attacked theirs. This logic persisted from level to level, so you could go through quite a bit of the game 'playing nice'.
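That trick is cheap to implement: keep a bitmask of unlocked behaviors and set bits only when the AI witnesses the corresponding action. A guess at the shape of it, not Magic Carpet's actual code:

/* Hypothetical 'monkey see, monkey do' gating: an action stays disabled
   until the AI has witnessed the player perform it. */
enum {
    CAN_ATTACK_BALLOONS = 1 << 0,
    CAN_ATTACK_PLAYER   = 1 << 1,
    CAN_ATTACK_CASTLE   = 1 << 2,
};

static unsigned unlocked;  /* persists across levels for the learning feel */

static void on_player_action(unsigned action_bit)
{
    unlocked |= action_bit;  /* the player 'taught' us this is allowed */
}

static int may_perform(unsigned action_bit)
{
    return (unlocked & action_bit) != 0;
}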

P.S. The best game AI I've seen yet is in S.T.A.L.K.E.R.: Shadow of Chernobyl, where the NPC AI (every NPC can be friendly, neutral, or hostile toward each faction in the game, including you, of course) makes good use of cover, yells orders to allies to make them take up proper positions for a coordinated attack on you or anyone else (from multiple directions), and then yells another order to have them attack simultaneously. The AI has delayed reactions (it doesn't fire immediately upon seeing you, etc.) and retreats to cover when shot at - it's very clearly state-based. It's simple AI, but it works, and that's more than can be said of most game AI :)
scar3crow
InsideQC Staff
Posts: 1054
Joined: Tue Jan 18, 2005 8:54 pm
Location: Alabama

Post by scar3crow »

Reading this thread I realize that nearly all of my theoretical AI is in fact state-based, with simple fuzzy logic to keep it in the air. Scoring of random choices sounds like it would be great for a bot, particularly if it considered overall effectiveness and thus adapted to you.
Urre
Posts: 1109
Joined: Fri Nov 05, 2004 2:36 am
Location: Moon

Post by Urre »

Entar: don't forget that Quake is moddable, and I personally tend to see Quake more as a base for making cool stuff than as a game in itself, because I've grown severely tired of it.

LordHavoc: the random choice evaluation thing you described comes up fairly often in papers, but no, I haven't seen it implemented either. However, you seem to have skipped the important little detail of context: the agent must be able to effectively analyze the context in which the action it chose turned out to be effective. Otherwise no learning is going on, and the prediction of whether a new random action will be better in this situation than the current one must also be experience-based. The idea really comes down to where you limit it: which contexts, and which actions, the agent should be able to perform and take note of.
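One way to pin that down is to key the learned estimates on a coarse context descriptor rather than on the action alone, so an action can score well in one situation and badly in another. A sketch with an invented, deliberately tiny context:

/* Hypothetical context-dependent action values: the learned estimate is
   indexed by (context, action), not by action alone. The context features
   are made up and deliberately coarse to keep the table small. */
#define NUM_ACTIONS 4

typedef struct { float value; int samples; } estimate_t;

/* context = low/high health x enemy visible/not = 4 buckets */
static estimate_t table[4][NUM_ACTIONS];

static int context_index(float health, int enemy_visible)
{
    return (health < 0.5f ? 2 : 0) + (enemy_visible ? 1 : 0);
}

/* Update the running average for the context the action was tried in. */
static void learn(int ctx, int action, float outcome)
{
    estimate_t *e = &table[ctx][action];
    e->samples++;
    e->value += (outcome - e->value) / e->samples;
}

/* Experience-based prediction: how good do we expect this action to be
   in the current situation? */
static float predict_value(int ctx, int action)
{
    return table[ctx][action].value;
}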

Dammit, I wanna play S.T.A.L.K.E.R. too. On another note, I think Half-Life's AI is underrated.

scar3crow: Now design a learning squad AI, and I'll code it!
I was once a Quake modder
Entar
Posts: 439
Joined: Fri Nov 05, 2004 7:27 pm
Location: At my computer

Post by Entar »

Urre wrote:Entar: don't forget that Quake is moddable, and I personally tend to see Quake more as a base for making cool stuff than as a game in itself, because I've grown severely tired of it.
Of course. I was just making the point that if you make a mod that keeps Quake's pace, which many mods end up doing these days, you'll end up with what I was describing. But yes, Quake is very moddable, so you can still do some very interesting things.

Of course, with just a little tweaking you could still do some good learning stuff in a Quake environment. For example, you could have Enforcers be "in communication" with each other (what do you think those little helmets they wear are for?), passing information about the way the player fights to each other. Or you could do that with the grunts and make it sort of a Borg thing, since they have those brain implants already. That way, even if the grunt that was doing the learning dies, it has already passed its information on to the other grunts and benefits the group. Be creative.
Sajt
Posts: 1215
Joined: Sat Oct 16, 2004 3:39 am

Post by Sajt »

Don't go implementing this crazy AI if it's not obvious to the player.

To the player, even modest AI techniques like flanking can often just look like the enemies spawned in next to them.

Some weird learning stuff might be completely unnoticeable. Not just in Quake, but in most FPSes.
F. A. Špork, an enlightened nobleman and a great patron of art, had a stately Baroque spa complex built on the banks of the River Labe.
Entar
Posts: 439
Joined: Fri Nov 05, 2004 7:27 pm
Location: At my computer
Contact:

Post by Entar »

Sajt: that's what I was trying to say. Sorta.
LordHavoc
Posts: 322
Joined: Fri Nov 05, 2004 3:12 am
Location: western Oregon, USA

Post by LordHavoc »

Sajt wrote:Some weird learning stuff might be completely unnoticeable. Not just in Quake, but in most FPSes.
Yes, the real problem with AI is that it reveals a lack of player intelligence :P

I've seen several complaints on forums of "enemies appearing beside me" in games with flanking AI. I can only imagine what people will be saying on the S.T.A.L.K.E.R. forums soon - it has coordinated attack AI :)

Learning AI is often only recognized by AI researchers, the players instead complain "Why are the bots walking around like idiots in this map I just downloaded?", and later complain "The bots weren't too hard at first but now they're killing me!! How do I make them easy again?".
Urre
Posts: 1109
Joined: Fri Nov 05, 2004 2:36 am
Location: Moon

Post by Urre »

Interesting points indeed. The reason I'm interested in learning AI is basically for agents to learn the most effective strategies on certain maps, and for new maps to get AI support without mapper implementation. The latter part doesn't have to be related to learning, but it does if you want the first part to be true.

I must admit I never imagined players not realizing the existence of flanking AI, or learning AI, or whatever it might be that makes them feel the game is unfair ("this game always spawns monsters behind me", or "why did the difficulty go up during the game?"). Maybe they're all damaged from playing too much Half-Life, or other on-rails games. Would an RTS player say "this game always spawns units behind my base"? I doubt it.
I was once a Quake modder
scar3crow
InsideQC Staff
Posts: 1054
Joined: Tue Jan 18, 2005 8:54 pm
Location: Alabama

Post by scar3crow »

scar3crow: Now design a learning squad AI, and I'll code it!
Ask me again when school isn't overwhelming and I'll talk with you about it - or PM me on IRC (we keep being active at different times lately, I've noticed; I'm not ignoring you, I just happen to not be around my PC whenever you're online).
I never imagined players not realizing the existence of flanking AI
This is the one time someone from Crytek said something I really agreed with: you can code flanking AI, but the player won't know it unless the AI shouts "I'm flanking!" during the process, which makes it a bit redundant as well... In multiplayer you rarely think a player spawned in behind you if they appear at your side, because you trust in their basic intelligence as a human (aside from one case on e1m2 - I cornered a guy in the MH room, and then he came down the stairs behind me; it was a 1on1), but if it's the AI, it's assumed to be cheating.

What you're describing is, I believe, also a consequence of rail shooters that are static, like HL and co. And your example is excellent: an RTS player who finds the enemy at the south side of his base while he was fighting them at the north will assume that the enemy AI sent units to two different points, rather than that they just spawned there. I've got a lot of ideas for an RTS, actually... I should write some of those out.

As for learning behavior and making the game harder... increasing accuracy and reaction time, though a legitimate increase in skill - one you can witness in human players - feels like a cheat when the player encounters it from the AI. You need more visual demonstrations of AI: taking cover, ducking, jumping, circle-strafing. You need to play to the audience when coding your AI - let the players see how it is working. In the same way you've got your debug messages that reinforce what is going on, choose things that will reinforce to the player why the enemies are getting smarter.