OK, the last post was a bit of an introduction… let’s trim the fat and cut to the chase in this one.
We set ourselves a clear goal:
“Can we train a model to recognize patterns in button and joystick usage on our pimped arcade machine cabinet, deploy it in the cloud and make predictions in real time while somebody is actually playing a random game?”
Our first concern is to gather data, so we can actually see how we can build a decent model that is fit to make real-time predictions as soon as someone starts playing.
The arcade cabinet is stacked with IoT so there is no problem intercepting the button presses and capturing all that data.
We identify two types of events:
- Game Steering Events – these events have nothing to do with the game itself, for example:
  - Inserting coins
  - Selecting a game
  - Starting it
  - Stopping it
- Game Interaction Events – these are the main events, resulting from direct interaction with the actual game (or rather, the emulator emulating the game), such as:
  - Starting a new level
We publish those events on an MQTT bus and forward them to a datastore. Using the IoT infrastructure, we have two options, which we use interchangeably depending on where the arcade cabinet is actually used (since we take it along everywhere we go ;-):
- The first modus operandi: we run a Docker container on the arcade cabinet itself, together with a document-based database (MongoDB in our case) in which we store all the timestamped events in JSON format. This is ideal for cases where the arcade cabinet is on the road and the internet connection is flaky.
- The second modus operandi: we store and forward all events to a datastore in the cloud.
Both options are essentially the same, except that in the latter case our MongoDB is deployed in the cloud. This is already in place for future purposes. When we discuss the general architecture, we will go into more detail about this topic.
Now, let’s have a closer look at our bad boys:
First, the joystick. Although it can move through a full 360 degrees, electrically it merely pushes 4 micro-switches: up, down, left and right (duh). The zones in between (north-east, north-west, south-east and south-west) are simulated by pressing 2 micro-switches at the same time. So, when you push the joystick north-east, it registers as up and right at the same time.
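To make the micro-switch logic concrete, here is a minimal Java sketch of how four switch states could be mapped to eight directions. The class name `JoystickDecoder`, the `Direction` enum and the choice to flag opposite switches as invalid are assumptions for illustration, not code from the cabinet. Note that the arguments are `true` while a switch is pressed, which is the inverse of the raw payload convention.

```java
/** Hypothetical helper: maps the four joystick micro-switch states to a direction. */
final class JoystickDecoder {

    enum Direction { CENTER, N, NE, E, SE, S, SW, W, NW, INVALID }

    /** Each argument is true while the corresponding micro-switch is pressed. */
    static Direction decode(boolean up, boolean down, boolean left, boolean right) {
        // Opposite switches should not normally be closed together; flag that as invalid (assumption).
        if ((up && down) || (left && right)) {
            return Direction.INVALID;
        }
        if (up) {
            return right ? Direction.NE : left ? Direction.NW : Direction.N;
        }
        if (down) {
            return right ? Direction.SE : left ? Direction.SW : Direction.S;
        }
        if (right) {
            return Direction.E;
        }
        if (left) {
            return Direction.W;
        }
        return Direction.CENTER;
    }

    public static void main(String[] args) {
        // Up and right pressed together register as north-east.
        System.out.println(decode(true, false, false, true)); // prints NE
    }
}
```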
There are six game interaction buttons, each resulting in a game move (depending on the game you are playing). The two white buttons belong to the so-called Game Steering Event buttons, used for starting a new game, for instance.
The insert coin buttons (the yellow ones, one for player 1 and one for player 2), the start button (green) and the stop button (red) are on the front of the arcade cabinet and produce Game Steering Events.
For this project we only take player 1 into account. The assumption we make here is that there’s no difference in button usage when playing against the software AI or against another human player.
However, we do notice a big difference when there are 2 cooperating users (not fighting each other, but battling the computer together). In that case, the button usage shows an entirely different pattern: e.g. in the game Pang, there’s a big difference between playing the game alone and playing in cooperation with a second gamer.
Now, let’s dig a little deeper: what does the raw event data really look like?
This is an example of a Game Steering Event:
This is a steering event of a player starting a game called Galaxian at a certain point in time. Nothing more to say about this, it speaks for itself (I hope).
Ok let’s look at a Game Interaction Event now:
All buttons that are in default state (meaning: not being pressed) have a Boolean value of “true”. When a button gets pressed it works like a kind of circuit breaker and gets the value of “false”. Thus, a single button press constitutes two events where the Boolean value gets flipped from “true” to “false” and back.
In the above example there is a single button press for the button with ID 38.
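Since one press shows up as two events (a flip to “false” and a flip back to “true”), counting presses boils down to counting true-to-false edges. A minimal sketch, assuming the raw values for one button have already been extracted into a chronological list (the class and method names are hypothetical):

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper that counts presses for a single button. */
final class PressCounter {

    /**
     * Counts presses in a chronological list of raw values for one button.
     * Raw convention from the post: true = released (default), false = pressed.
     */
    static int countPresses(List<Boolean> rawValues) {
        int presses = 0;
        boolean previous = true; // assumption: the button starts in its default (released) state
        for (boolean current : rawValues) {
            if (previous && !current) {
                presses++; // a true -> false edge marks the start of a press
            }
            previous = current;
        }
        return presses;
    }

    public static void main(String[] args) {
        // Two complete presses: true -> false -> true, twice.
        List<Boolean> raw = Arrays.asList(true, false, true, false, true);
        System.out.println(countPresses(raw)); // prints 2
    }
}
```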
Keen readers might notice that there are more buttons in the event payload than the ones we discussed earlier, but the white buttons and player 2 buttons are included already for future episodes in our case study.
The Game Steering Events will be used in a preprocessing step for labelling the data and the Game Interaction Events form the raw data we will have to focus on in order to produce some useful features.
So basically, we will have the following preprocessing setup.
Given a chain of game events on the arcade cabinet we’ll do the following to label the datasets.
When we notice that a game has started, we start labelling the data.
When there is a game-ended event and the game is significant (that is, not a null game), we conclude the data for this game and write it to the dataset.
A special case is when the white button is used for starting a new level (the B_start_1 coded button). We’ll look at this event as a gameEndedEvent and a gameStartedEvent at the same time.
The labelling will continue but we’ll conclude the data capturing and write it to the dataset and start a new data capturing step.
What about the non-significant check, “isNullGame”? Well, we are working with human players here, and people tend to make errors: they start the wrong game, they die (too) quickly, or their manager yells at them to stop playing in the middle of a game.
We consider this data non-significant, or even noise, because it contributes nothing measurable to the identification of the game.
In statistics there is something called the null hypothesis, which is used to check whether data is significant or not. I will not go into too much detail, but let us just say that we need at least 10 interaction events per game to consider the captured game data significant enough.
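The labelling flow described above can be sketched as a small state machine. This is a simplified illustration under several assumptions: the event types, the `Event` and `LabelledSegment` classes and the method names are hypothetical, the new-level button press is modelled as its own event type, and a segment only keeps its label and interaction count rather than the full event data. The threshold of 10 interaction events is the one stated in the post.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the labelling step; not the project's actual preprocessing code. */
final class GameSegmenter {

    static final int MIN_INTERACTION_EVENTS = 10; // threshold stated in the post

    enum Type { GAME_STARTED, GAME_ENDED, NEW_LEVEL, INTERACTION }

    static final class Event {
        final Type type;
        final String game; // only meaningful for GAME_STARTED
        Event(Type type, String game) { this.type = type; this.game = game; }
    }

    static final class LabelledSegment {
        final String label;
        final int interactionCount;
        LabelledSegment(String label, int interactionCount) {
            this.label = label;
            this.interactionCount = interactionCount;
        }
    }

    /** A segment with too few interaction events is treated as noise. */
    static boolean isNullGame(int interactionCount) {
        return interactionCount < MIN_INTERACTION_EVENTS;
    }

    static List<LabelledSegment> segment(List<Event> events) {
        List<LabelledSegment> dataset = new ArrayList<>();
        String label = null; // null means no game is running
        int count = 0;
        for (Event e : events) {
            switch (e.type) {
                case GAME_STARTED:
                    label = e.game; // start labelling
                    count = 0;
                    break;
                case INTERACTION:
                    if (label != null) {
                        count++;
                    }
                    break;
                case NEW_LEVEL:
                    // Acts as game ended + game started: flush the segment, keep the label.
                    if (label != null && !isNullGame(count)) {
                        dataset.add(new LabelledSegment(label, count));
                    }
                    count = 0;
                    break;
                case GAME_ENDED:
                    if (label != null && !isNullGame(count)) {
                        dataset.add(new LabelledSegment(label, count));
                    }
                    label = null; // stop labelling
                    count = 0;
                    break;
            }
        }
        return dataset;
    }
}
```

For example, a Galaxian session with 12 interaction events, a new-level press, 3 more interaction events and then a game-ended event would yield a single labelled segment of 12 events; the trailing 3 events are dropped as a null game.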
Now that we have labelled our data, we can start looking into it to see if we can make sense of it.
Understanding the data
The type of raw data is an exact fit for the one-hot encoding trick. Each time a button or joystick is used we flip the bit value and as such we get the following layout.
Or in the case that buttons and joystick are used simultaneously we get the following:
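As a rough illustration of this encoding, the sketch below turns a set of currently pressed controls into a bit vector. The column order (joystick first, then b1 to b6) and the control names are assumptions, since the actual layout is not reproduced here. Strictly speaking, a row with several bits set is a multi-hot rather than a one-hot vector, but the idea is the same.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical encoder: one bit per joystick direction or button. */
final class InputEncoder {

    // Assumed column order; the real layout used in the project may differ.
    static final String[] CONTROLS = {
        "up", "down", "left", "right", "b1", "b2", "b3", "b4", "b5", "b6"
    };

    /** Returns a vector with a 1 for every control that is currently pressed. */
    static int[] encode(Set<String> pressedControls) {
        int[] bits = new int[CONTROLS.length];
        for (int i = 0; i < CONTROLS.length; i++) {
            bits[i] = pressedControls.contains(CONTROLS[i]) ? 1 : 0;
        }
        return bits;
    }

    public static void main(String[] args) {
        // Joystick pushed north-east while button 1 is held down.
        Set<String> pressed = new HashSet<>(Arrays.asList("up", "right", "b1"));
        System.out.println(Arrays.toString(encode(pressed)));
        // prints [1, 0, 0, 1, 1, 0, 0, 0, 0, 0]
    }
}
```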
With this as a given, we can look into our data and start diving into some real cases.
In the beginning we start with 4 games… our own little “Hello World” case, so to speak.
- Puckman – player uses the joystick only
- Gyruss – player uses the joystick (in a circle movement) and button 1
- Bomberman – player uses the joystick (in a vertical/horizontal fashion) and button 1
- Mortal Kombat – player uses all controls and even combos as well
Small note: b2 and b6 are actually the same in Mortal Kombat, since we are using the default MAME MK layout. Because only 5 distinct buttons are effectively used in the game, one could argue that we should categorize Mortal Kombat as a “5 buttons only” game.
As soon as we started to play and data came pouring in, we quickly noticed there is quite some noise in the data. Some colleagues (who still owe me a beer by the way), had their fun frantically hitting all buttons while playing Puckman.
=> Conclusion: A data cleansing step is needed for sure.
The question now arises: What do we consider noise (and should thus be cleaned), what do we consider real data?
For instance, Bomberman goes left to right, up and down, but we noticed quite a few diagonal movements in the data. And yes, in the heat of the game, you cannot expect a gamer to align the joystick perfectly to the north in order to go up in the game.
After careful consideration and following our gut feeling we define a couple of cleaning rules for each classification instance.
Here is an excerpt:
- Puckman – we clean all button presses
- Bomberman – we clean all button presses other than b1 (so b2 through b6)
- Gyruss – same ruleset as Bomberman
Our data cleansing and preprocessing step is programmed in Java, so the cleansing rules per game aren’t that hard to implement; they are included by default in the GameDefinition enum.
And yes, we are aware that there is a typo in the enum!
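To give an idea of how per-game cleaning rules can live in an enum, here is a minimal sketch covering the three rules from the excerpt. It is not the project's GameDefinition enum: the names `CleaningRulesSketch`, `Control`, `Game` and `clean` are made up for illustration, and joystick input (including diagonals) is always kept.

```java
import java.util.Arrays;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

/** Hypothetical sketch; not the GameDefinition enum from the actual project. */
final class CleaningRulesSketch {

    enum Control { UP, DOWN, LEFT, RIGHT, B1, B2, B3, B4, B5, B6 }

    enum Game {
        // Puckman: joystick only, every button press is cleaned.
        PUCKMAN(EnumSet.range(Control.UP, Control.RIGHT)),
        // Bomberman and Gyruss: joystick plus b1, presses of b2..b6 are cleaned.
        BOMBERMAN(EnumSet.range(Control.UP, Control.B1)),
        GYRUSS(EnumSet.range(Control.UP, Control.B1));

        private final Set<Control> allowed;

        Game(Set<Control> allowed) {
            this.allowed = allowed;
        }

        /** Keeps only the inputs that are meaningful for this game. */
        List<Control> clean(List<Control> inputs) {
            return inputs.stream()
                    .filter(allowed::contains)
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) {
        List<Control> raw = Arrays.asList(Control.UP, Control.B3, Control.B1, Control.LEFT);
        System.out.println(Game.BOMBERMAN.clean(raw)); // prints [UP, B1, LEFT]
    }
}
```

Keeping the allowed controls next to the enum constant means adding a game is a one-line change, which is presumably why an enum is a comfortable home for these rules.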
Now we can start thinking about our features.
In order to figure out the possibilities, we need to plot some game scenarios for closer inspection.
In the next picture, consider 2 separate Mortal Kombat game instances:
The X-axis is the time progressing in the game… The Y-axis represents an input variable per joystick/button bit value in the one-hot encoding.
It might strike you that there are not many similarities between them.
And this gets me thinking. A single game on the arcade cabinet has quite a few “versions”.
For instance, Mortal Kombat has different characters with different strengths and weaknesses, different combos, and each level has its own particularities.
And that is without even taking the different playing styles of the players themselves into account.
Looking at this kind of data we basically have a couple of options:
- We can look at it as a time series and see if we can use a k-NN approach. It basically tries to find the closest training sample(s) to the given example and uses the corresponding label as the prediction.
- We can try to extract the informative content of the time-series data into scalar quantities and use them as features, which in turn can be used in a classification algorithm to make a prediction.
- We can combine both previous approaches by using a sliding window and then extracting features from that window to do the classification. Some kind of naïve (dynamic) binning method.
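To illustrate the third option, here is a minimal sketch of a sliding-window feature extractor. It assumes the encoded input has been resampled into rows at fixed time intervals (one row per tick, one column per control), and it uses the fraction of ticks a control was active as the feature; both choices, as well as the class and parameter names, are assumptions for illustration rather than the approach chosen in this series.

```java
/** Hypothetical sketch of sliding-window feature extraction over encoded input rows. */
final class SlidingWindowFeatures {

    /**
     * For each window, computes per control the fraction of ticks it was active.
     * frames[t][c] is 1 when control c is pressed at tick t, else 0.
     */
    static double[][] extract(int[][] frames, int windowSize, int step) {
        if (windowSize <= 0 || step <= 0) {
            throw new IllegalArgumentException("windowSize and step must be positive");
        }
        if (frames.length < windowSize) {
            return new double[0][]; // not enough data for a single window
        }
        int windows = (frames.length - windowSize) / step + 1;
        int controls = frames[0].length;
        double[][] features = new double[windows][controls];
        for (int w = 0; w < windows; w++) {
            int start = w * step;
            for (int t = start; t < start + windowSize; t++) {
                for (int c = 0; c < controls; c++) {
                    features[w][c] += frames[t][c];
                }
            }
            for (int c = 0; c < controls; c++) {
                features[w][c] /= windowSize; // turn counts into activity fractions
            }
        }
        return features;
    }
}
```

With four ticks of two controls, `{{1,0},{1,0},{0,1},{0,1}}`, a window of 2 and a step of 1, this produces three feature rows: `[1.0, 0.0]`, `[0.5, 0.5]` and `[0.0, 1.0]`. Each row could then be fed to a classifier, or compared against training rows in a k-NN fashion.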
Want to know which option I go for? Well, find out next week, when I will elaborate a little further on the feature engineering part.
If you have any ideas or alternative approaches yourself, please feel free to leave a comment below!
Credits: blogpost by Kevin Smeyers, Machine Learning Master at ToThePoint