#2 Case Study Machine Learning: ToTheArcade: setting goals and determining data


Ok, the last post was a bit of an introduction… let’s cut the fat and get straight to the point in this one.

In short, we set ourselves a clear goal:

“Can we train a model to recognize patterns in button and joystick usage on our pimped arcade machine cabinet, deploy it in the cloud and make predictions in real time while somebody is actually playing a random game?”

Our first concern is to gather data, so we can actually see how we can build a decent model that is fit to make real-time predictions as soon as someone starts playing.

Game Events

The arcade cabinet is stacked with IoT so there is no problem intercepting the button presses and capturing all that data.


We identify two types of events:

  • Game Steering Events – in fact, those events have nothing to do with the game itself, for example:
    • Inserting coins
    • Selecting a game
    • Starting it and
    • Stopping it
  • Game Interaction Events – these are main events resulting from direct interaction with the actual game (actually, the emulator emulating the game) itself, such as:
    • Starting a new level
    • Kicking
    • Firing
    • Moving
    • Jumping
    • ….

We publish those events on an MQTT bus and forward them to a datastore.  The IoT infrastructure gives us 2 options, which we use interchangeably depending on where the arcade cabinet is actually used (since we take it along everywhere we go ;-):

We run a docker container on the arcade cabinet itself, as well as a document-based database – MongoDB in our case – in which we store all the timestamped events in json format.  This first modus operandi is ideal for cases where the arcade cabinet is on the road, and internet connection is flaky.

The second modus operandi is storing and forwarding all events to a datastore in the cloud.

Both options are actually the same, except that in the latter case our MongoDB is deployed in the cloud.  This is already put in place for future purposes.  When we discuss the general architecture, we will go into more detail about this topic.

Now, let’s have a closer look at our bad boys:


Firstly, the joystick: although it can move through a full 360 degrees, electrically it merely pushes 4 micro-switches: up, down, left and right (duh). The zones in between (north-east, north-west, south-east and south-west) are simulated by pressing 2 micro-switches at the same time.  So, when you push the joystick north-east, it registers as up and right at the same time.
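To make the micro-switch logic concrete, here is a minimal sketch of how four switch states could be decoded into one of the eight directions. The class and method names are hypothetical, and treating two opposite switches closed at once as "no movement on that axis" is an assumption of this sketch, not something the cabinet is known to do.

```java
// Hypothetical helper: decodes the four micro-switch states into a compass direction.
final class JoystickDecoder {
    private JoystickDecoder() {}

    /** Each argument is true when the corresponding micro-switch is pushed. */
    static String direction(boolean up, boolean down, boolean left, boolean right) {
        // Assumption: opposite switches closed together cancel each other out.
        String vertical = (up == down) ? "" : (up ? "north" : "south");
        String horizontal = (left == right) ? "" : (right ? "east" : "west");
        if (vertical.isEmpty() && horizontal.isEmpty()) {
            return "center";
        }
        if (vertical.isEmpty()) {
            return horizontal;
        }
        if (horizontal.isEmpty()) {
            return vertical;
        }
        // Two adjacent switches at once: one of the simulated diagonal zones.
        return vertical + "-" + horizontal;
    }
}
```

Calling JoystickDecoder.direction(true, false, false, true) corresponds to the north-east case described above: up and right pressed at the same time.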

There are six game interaction buttons, each resulting in a game move (depending on the game you are playing).  The white buttons are 2 of the so-called Game Steering Events buttons, used for starting a new game, for instance.

The insert coin buttons (yellow ones, 1 for player 1 and a second for player 2), start (green) and stop buttons (red) are in front of the arcade cabinet and produce Game Steering Events.

For this project we only take player 1 into account.  The assumption here is that there’s no difference in button usage when playing against the software AI or against another human player.

However, we do notice a big difference when 2 users cooperate (not fighting each other, but battling the computer together). In that case, the button usage shows an entirely different pattern:  e.g. in the game Pang, there’s a big difference between playing the game alone and playing in cooperation with a second gamer.

Now, let’s dig a little deeper:  what does the raw event data really look like?

This is an example of a Game Steering Event:

{ "_id" : { "$oid" : "5ad0d2bf24aa9a00010953f0" }, "_class" : "company.tothepoint.tothearcade.datalogger.model.DataPoint",
"timestamp" : { "$date" : "2018-04-13T17:54:39.752+0200" }, "dataPoint" : "{'type': 'GameStarted', 'value': 'galaxian'}" }

This is a steering event of a player starting a game called Galaxian at a certain point in time.  Nothing more to say about this, it speaks for itself (I hope).

Ok let’s look at a Game Interaction Event now:

{ "_id" : { "$oid" : "5ad0d2cd24aa9a000109541e" }, "_class" : "company.tothepoint.tothearcade.datalogger.model.DataPoint", "timestamp" : { "$date" : "2018-04-13T17:54:53.760+0200" }, "dataPoint" : "{\"22\":true,\"23\":true,\"24\":true,\"25\":true,\"26\":true,\"27\":true,\"28\":true,\"29\":true,\"30\":true,\"31\":true,\"32\":true,\"33\":true,\"34\":true,\"35\":true,\"36\":true,\"37\":true,\"38\":false,\"39\":true,\"40\":true,\"41\":true,\"42\":true,\"43\":true,\"44\":false,\"45\":true,\"46\":true,\"timestamp\":1285500425}" }

{ "_id" : { "$oid" : "5ad0d2cd24aa9a000109541f" }, "_class" : "company.tothepoint.tothearcade.datalogger.model.DataPoint", "timestamp" : { "$date" : "2018-04-13T17:54:53.826+0200" }, "dataPoint" : "{\"22\":true,\"23\":true,\"24\":true,\"25\":true,\"26\":true,\"27\":true,\"28\":true,\"29\":true,\"30\":true,\"31\":true,\"32\":true,\"33\":true,\"34\":true,\"35\":true,\"36\":true,\"37\":true,\"38\":true,\"39\":true,\"40\":true,\"41\":true,\"42\":true,\"43\":true,\"44\":false,\"45\":true,\"46\":true,\"timestamp\":1285500490}" }

All buttons that are in default state (meaning: not being pressed) have a Boolean value of “true”. When a button gets pressed it works like a kind of circuit breaker and gets the value of “false”.  Thus, a single button press constitutes two events where the Boolean value gets flipped from “true” to “false” and back.

In the above example there is a single button press for the button with ID 38.
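Spotting such a flip comes down to comparing two consecutive snapshots. The following is a minimal sketch, assuming the dataPoint string has already been unescaped; the class SnapshotDiff and its methods are hypothetical names, and the regex-based parsing is a shortcut for illustration rather than a real JSON parser.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SnapshotDiff {
    // Matches entries such as "38":false; the numeric "timestamp" entry is skipped on purpose.
    private static final Pattern ENTRY = Pattern.compile("\"(\\d+)\":(true|false)");

    private SnapshotDiff() {}

    /** Extracts button ID -> state from an unescaped dataPoint string. */
    static Map<Integer, Boolean> parse(String dataPoint) {
        Map<Integer, Boolean> states = new TreeMap<>();
        Matcher m = ENTRY.matcher(dataPoint);
        while (m.find()) {
            states.put(Integer.parseInt(m.group(1)), Boolean.parseBoolean(m.group(2)));
        }
        return states;
    }

    /** Returns button ID -> "PRESSED" or "RELEASED" for every button whose state changed. */
    static Map<Integer, String> transitions(Map<Integer, Boolean> before,
                                            Map<Integer, Boolean> after) {
        Map<Integer, String> changes = new TreeMap<>();
        for (Map.Entry<Integer, Boolean> e : after.entrySet()) {
            Boolean previous = before.get(e.getKey());
            if (previous != null && !previous.equals(e.getValue())) {
                // false means the circuit is broken, i.e. the button is held down.
                changes.put(e.getKey(), e.getValue() ? "RELEASED" : "PRESSED");
            }
        }
        return changes;
    }
}
```

Applied to the two snapshots above, only button 38 changes (from false back to true), while button 44 is false in both snapshots and therefore produces no transition.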

Keen readers might notice that there are more buttons in the event payload than the ones we discussed earlier. That is because the white buttons and the player 2 buttons are already included for future episodes in our case study.

The Game Steering Events will be used in a preprocessing step for labelling the data and the Game Interaction Events form the raw data we will have to focus on in order to produce some useful features.

So basically, we will have the following preprocessing setup.

Given a chain of game events on the arcade cabinet we’ll do the following to label the datasets.

if (eventDto instanceof GameSteeringEventDto) {
    GameSteeringEventDto steering = (GameSteeringEventDto) eventDto;
    if (steering.isGameStartEvent()) {
        // start a new game and label it with the game being played.
    }
    if (steering.isGameEndedEvent()) {
        if (!game.isNullGame()) {
            // end the game and write out the dataset.
        } else {
            // end the game and ignore trivial game
        }
    }
} else {
    GameInteractionEventDto interaction = (GameInteractionEventDto) eventDto;
    if (Boolean.valueOf(interaction.getB_start_1())) {
        if (!game.isNullGame()) {
            // end the game and write out the dataset.
            // start a new game and label it with the game being played.
        } else {
            // add the interaction event to the game data set.
        }
    } else {
        // add the interaction event to the game data set.
    }
}

If we notice that a game has started, we start labelling the data.

When a game ended event comes in and the game is significant, we conclude the data for this game and write it to the dataset. If the game is non-significant, we end it and ignore its data.

A special case is when the white button is used for starting a new level (the B_start_1 coded button).  We’ll look at this event as a gameEndedEvent and a gameStartedEvent at the same time.

The labelling continues, but we conclude the current data capture, write it to the dataset and start a new data capturing step.

What about the non-significant check, “isNullGame”? Well, we are working with human players here, and people tend to make errors: they start the wrong game, they die (too) quickly in the game, or their manager yells at them to stop playing in the middle of a game.

We consider this data non-significant, or even noise, because it adds nothing measurable to the identification of the game.

In statistics, a null hypothesis is used to check whether data is significant or not. I will not go into too much detail, but let us just say that we need at least 10 interaction events per game to consider the captured game data significant enough.
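Read literally, that check is just a threshold on the number of captured interaction events. A minimal sketch under that assumption: only the name isNullGame and the threshold of 10 come from the text, while the LabelledGame class, its fields and the other methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical container for the events captured during one labelled game.
final class LabelledGame {
    // Threshold from the text: fewer than 10 interaction events counts as a trivial game.
    static final int MIN_INTERACTION_EVENTS = 10;

    private final String label;
    private final List<String> interactionEvents = new ArrayList<>();

    LabelledGame(String label) {
        this.label = label;
    }

    void addInteractionEvent(String rawEvent) {
        interactionEvents.add(rawEvent);
    }

    /** True when the game is too short to be considered significant. */
    boolean isNullGame() {
        return interactionEvents.size() < MIN_INTERACTION_EVENTS;
    }

    String label() {
        return label;
    }
}
```

With this sketch, a game that was started by mistake and abandoned after a handful of joystick moves is reported as a null game and can be dropped before it reaches the dataset.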

Now that we have labelled our data, we can start exploring it to see if we can make sense of it.

Understanding the data

The type of raw data is an exact fit for the one-hot encoding trick.  Each time a button or joystick is used we flip the bit value and as such we get the following layout.

[ju, jd, jr, jl, b1, b2, b3, b4, b5, b6]
[ 0,  1,  0,  0,  0,  0,  0,  0,  0,  0]

Or, when buttons and joystick are used simultaneously, we get the following (more than one bit is set here, so strictly speaking this is a multi-hot vector):

[ 0,  1,  0,  0,  1,  0,  0,  0,  0,  0]
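Producing such a vector takes only a few lines. This is a minimal sketch; the class ControlEncoder and its encode method are made-up names, and only the column order [ju, jd, jr, jl, b1 … b6] is taken from the layout above.

```java
import java.util.Arrays;
import java.util.List;

final class ControlEncoder {
    // Column order as used in the layout above.
    static final List<String> CONTROLS =
            Arrays.asList("ju", "jd", "jr", "jl", "b1", "b2", "b3", "b4", "b5", "b6");

    private ControlEncoder() {}

    /** Sets the bit of every control that is currently active; all other bits stay 0. */
    static int[] encode(String... activeControls) {
        int[] vector = new int[CONTROLS.size()];
        for (String control : activeControls) {
            int index = CONTROLS.indexOf(control);
            if (index < 0) {
                throw new IllegalArgumentException("Unknown control: " + control);
            }
            vector[index] = 1;
        }
        return vector;
    }
}
```

ControlEncoder.encode("jd") gives the first example row, and ControlEncoder.encode("jd", "b1") gives the row where joystick and button are used at the same time.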

Now that we have this as a given, we can look into our data and dive into some real cases.

In the beginning we start with 4 games… our own little “Hello World” case, so to speak.

Puckman         – player uses the joystick only

Gyruss         – player uses the joystick (in a circle movement) and button 1

Bomberman    – player uses the joystick (in a vertical/horizontal fashion) and button 1

Mortal Kombat – player uses all controls and even combos as well

Small note: actually, b2 and b6 are the same in Mortal Kombat, since we are using the default MAME MK layout.  Because only 5 distinct buttons are really used in the game, one could argue that we should categorize Mortal Kombat as a “5 buttons only” game.

As soon as we started to play and data came pouring in, we quickly noticed there is quite some noise in the data.  Some colleagues (who still owe me a beer by the way), had their fun frantically hitting all buttons while playing Puckman.

=> Conclusion: A data cleansing step is needed for sure.

The question now arises: What do we consider noise (and should thus be cleaned), what do we consider real data?

For instance, Bomberman goes left to right, up and down, but we noticed quite a few diagonal movements in the data.  And yes, in the heat of the game, you cannot expect a gamer to align the joystick perfectly to the north in order to go up in the game.

After careful consideration and following our gut feeling we define a couple of cleaning rules for each classification instance.

Here is an excerpt:

  • Puckman – we clean all button presses
  • Bomberman – we clean all button presses other than b1 (so b2 to b6)
  • Gyruss – same ruleset as Bomberman

Our data cleansing and preprocessing step is programmed in Java, so the cleansing rules per game aren’t that hard to implement; they are included by default in the GameDefinition enum.

And yes, we are aware that there is a typo in the enum!

PUCKMAN("puckman", GameRuleSetFactory.getJoyStickOnlySet()),
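How such a rule set might act on the data can be sketched as an enum that zeroes out the button bits a game should never use. The CleaningRule enum below is hypothetical and is not the actual GameDefinition enum or GameRuleSetFactory; it assumes the 10-bit layout [ju, jd, jr, jl, b1 … b6] and only covers the three rules from the excerpt.

```java
// Hypothetical rule sets: each one keeps the joystick bits plus the first N button bits.
enum CleaningRule {
    JOYSTICK_ONLY(0),     // Puckman: drop every button press
    JOYSTICK_AND_B1(1);   // Bomberman, Gyruss: keep b1, drop b2..b6

    private static final int JOYSTICK_BITS = 4; // ju, jd, jr, jl

    private final int buttonsKept;

    CleaningRule(int buttonsKept) {
        this.buttonsKept = buttonsKept;
    }

    /** Returns a copy of the vector with all disallowed button bits set to 0. */
    int[] clean(int[] vector) {
        int[] cleaned = vector.clone();
        for (int i = JOYSTICK_BITS + buttonsKept; i < cleaned.length; i++) {
            cleaned[i] = 0;
        }
        return cleaned;
    }
}
```

A frantic Puckman row such as [0, 1, 0, 0, 1, 1, 0, 0, 0, 1] comes out of JOYSTICK_ONLY as a plain joystick-down row, while JOYSTICK_AND_B1 would keep the b1 bit as well.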

Feature Engineering

Now we can start thinking about our features.

In order to figure out the possibilities, we need to plot some game scenarios for closer inspection.

In the next picture, consider 2 separate Mortal Kombat game instances:



The X-axis is the time progressing in the game… The Y-axis represents an input variable per joystick/button bit value in the one-hot encoding.

It might strike you that there are not many similarities between them.

And this gets me thinking.  A single game on the arcade cabinet has quite some “versions”.

For instance, Mortal Kombat has different characters with different strengths and weaknesses and different combos, and each level has its own particularities.

And that is without even taking the different playing styles of the players themselves into account.

Looking at this kind of data we basically have a couple of options:

  • We can look at it as a time series and see if we can use a k-NN approach. It basically looks for the training sample that is closest to the given example and uses that sample’s label as the prediction.
  • We can try to condense the informative content of the time-series data into scalar quantities and use them as features, which in turn can be fed to a classification algorithm to make a prediction.
  • We can combine both previous approaches by sliding a window over the data and extracting features from that window to do the classification. Some kind of naïve (dynamic) binning method.
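To give an idea of what the third option could look like, here is a minimal sketch that slides a fixed-size window over the encoded rows and counts, per window, how often each control was active. The class name, the method name and the choice of plain counts as features are all assumptions made for illustration; nothing here says which option or which features end up being used.

```java
final class SlidingWindowFeatures {
    private SlidingWindowFeatures() {}

    /**
     * rows: consecutive control vectors (one per event, same width each).
     * Returns one feature row per window holding the per-control activation counts.
     */
    static int[][] countsPerWindow(int[][] rows, int windowSize, int step) {
        if (windowSize <= 0 || step <= 0) {
            throw new IllegalArgumentException("windowSize and step must be positive");
        }
        if (rows.length < windowSize) {
            return new int[0][]; // not enough events for a single full window
        }
        int windows = (rows.length - windowSize) / step + 1;
        int width = rows[0].length;
        int[][] features = new int[windows][width];
        for (int w = 0; w < windows; w++) {
            int start = w * step;
            for (int r = start; r < start + windowSize; r++) {
                for (int c = 0; c < width; c++) {
                    features[w][c] += rows[r][c];
                }
            }
        }
        return features;
    }
}
```

Each resulting row is a small scalar summary of one stretch of gameplay, so it can be handed to any ordinary classifier. A window size equal to the step gives non-overlapping bins; a smaller step gives overlapping windows.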

Want to know which option I go for?  Well, find out next week, when I will elaborate a little further on the feature engineering part.

If you have any ideas or alternative approaches yourself, please feel free to leave a comment below!

Click here to read part 3: the final feature engineering


Credits: blogpost by Kevin Smeyers, Machine Learning Master at ToThePoint

Kevin Smeyers

Kevin Smeyers is the current Machine Learning architect at ToThePoint Group. Machine Learning and AI is a specialty he picked up again since his days as a computer science student--after having gained various experiences in multiple other IT domains. Passionate about finding the fun side in things, he continued building up a track record in experimenting with a combination of IoT and Machine Learning. Always looking for the algorithm behind the algorithm, it comes as no surprise that Kevin is a speedcubing fan with a personal best of 32 seconds. As an enthusiastic hobbyist with a job, he’s eager to talk about his passion.
