The ability to make crucial decisions in real time is one of the most sought-after attributes of a head coach in any sport. Improving these decisions is thus an important problem, as doing so can improve a team’s chances of winning. In baseball, there have been numerous studies on managerial decisions such as defensive alignments, bullpen usage, bunting, and more. These studies have resulted in managers making more efficient decisions, leading directly to better play. In football, coaches face fundamental decisions on every down: the personnel, the formation, and the play their team will run. Unlike in baseball, where data is abundant, it is difficult to determine whether football coaches are making these important decisions effectively, for two reasons. First, obtaining labeled data is extremely expensive and requires hand-labeling by domain experts. Second, with a 16-game schedule and an average of only 130 plays per game, NFL football does not generate nearly enough data to reach reliable conclusions.
A lack of sufficient data is not an uncommon problem in science, and thanks to the proliferation of computing power, it is commonly remedied by simulation studies. Luckily, there is a realistic NFL simulation environment that has been developed and extensively updated for nearly 30 years: EA Sports’ Madden video game franchise. Madden games can act as a model for the underlying system dynamics of an NFL game. We use data generated from Madden 17, the most recent version of the game, to train reinforcement learning algorithms that make every play-calling decision throughout an entire game. We compare the results of these algorithms with a baseline established from the game’s built-in play-calling algorithm, an initial surrogate for real-life coaching decisions.
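To make the idea of learning play calls from simulated games concrete, here is a minimal sketch of one standard approach, tabular Q-learning. It is an illustration only: the post does not say which reinforcement learning algorithms were used, and the play list, the state tuple, the reward, and the class name QLearningPlayCaller are all assumptions made for this example.

```python
import random
from collections import defaultdict

# Hypothetical action set; the real Madden playbook is far larger.
PLAYS = ["run", "short_pass", "deep_pass", "punt", "field_goal"]


class QLearningPlayCaller:
    """Tabular Q-learning over (state, play) pairs. Illustrative only."""

    def __init__(self, alpha=0.1, gamma=0.99, epsilon=0.1, seed=0):
        self.alpha = alpha      # learning rate
        self.gamma = gamma      # discount factor
        self.epsilon = epsilon  # probability of trying a random play
        self.rng = random.Random(seed)
        self.q = defaultdict(float)  # (state, play) -> estimated value

    def choose(self, state):
        # Epsilon-greedy: usually call the best-known play, sometimes explore.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(PLAYS)
        return max(PLAYS, key=lambda p: self.q[(state, p)])

    def update(self, state, play, reward, next_state, done):
        # Move the estimate toward reward + discounted best next value.
        best_next = 0.0 if done else max(self.q[(next_state, p)] for p in PLAYS)
        target = reward + self.gamma * best_next
        self.q[(state, play)] += self.alpha * (target - self.q[(state, play)])


# Example: one update after a 4-yard run on 1st and 10 (assumed reward: yards gained).
caller = QLearningPlayCaller(alpha=0.5, gamma=1.0, epsilon=0.0)
state = (1, 10, "own_half")       # hypothetical (down, distance, field zone)
next_state = (2, 6, "own_half")
caller.update(state, "run", 4.0, next_state, done=False)
```

In practice the reward would more plausibly be tied to scoring or winning than to yards gained; yards are used here only to keep the example small.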
To generate data at rates far greater than actual NFL games allow, we constructed four controllers that could be operated through an interface with Raspberry Pi computers. We ran each of these controllers continuously on a separate Xbox, and we used optical character recognition techniques to capture the current state of the game from image data. We then used the current state as input to our reinforcement learning algorithms, which returned the play to run. The corresponding button presses were subsequently passed to the Raspberry Pi, resulting in four Madden games that could run continuously with no human input, collecting data 24 hours per day.
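The loop described above (read the screen, recover the game state, pick a play, press buttons) can be sketched roughly as follows. This is a hedged illustration, not the code that ran on the Raspberry Pis: the scoreboard text layout, the GameState fields, the BUTTONS mapping, and the function names are all hypothetical, and the OCR step itself is assumed to have already produced a plain string.

```python
import re
from typing import Callable, List, NamedTuple, Optional


class GameState(NamedTuple):
    down: int
    distance: int
    quarter: int
    seconds_left: int


# Assumed scoreboard layout after OCR, e.g. "3rd & 7  2nd 4:31".
SCOREBOARD = re.compile(
    r"(\d)(?:st|nd|rd|th)\s*&\s*(\d+)\s+(\d)(?:st|nd|rd|th)\s+(\d+):(\d{2})"
)

# Hypothetical mapping from a play to the menu button presses that select it.
BUTTONS = {
    "run": ["A", "A", "X"],
    "short_pass": ["A", "B", "X"],
    "punt": ["RB", "A"],
}


def parse_state(ocr_text: str) -> Optional[GameState]:
    """Turn OCR'd scoreboard text into a GameState, or None on a misread."""
    match = SCOREBOARD.search(ocr_text)
    if match is None:
        return None
    down, distance, quarter, minutes, seconds = map(int, match.groups())
    return GameState(down, distance, quarter, minutes * 60 + seconds)


def step(ocr_text: str, policy: Callable[[GameState], str]) -> List[str]:
    """One pass of the loop: text -> state -> play -> button presses."""
    state = parse_state(ocr_text)
    if state is None:
        return []  # unreadable frame: press nothing and wait for the next one
    return BUTTONS[policy(state)]


# Example with a trivial stand-in policy: punt on 4th down, otherwise run.
def simple_policy(state: GameState) -> str:
    return "punt" if state.down == 4 else "run"


presses = step("4th & 12  1st 10:00", simple_policy)
```

Returning an empty list on a failed parse is one way to keep an unattended loop from pressing the wrong buttons when the OCR misreads a frame.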
Our results show that the reinforcement learning algorithms outperform the built-in Madden play-calling algorithm, making better decisions and thus winning more games. These results could provide a framework for evaluating and improving play-calling in football. Additionally, they could be augmented with real data to provide a model that performs better than a model based on the real data alone. With enough evidence, football coaches may be compelled to alter strategic decisions for the better, leading to more efficiently called football games.
Nick is a PhD Candidate whose research interests include machine learning and statistical genetics. His current research focuses on pursuit-evasion and cooperative reinforcement learning. We thought this posting was a great excuse to get to know a little more about him, so we asked a fellow Laber Labs colleague to ask Nick a probing question.
Explain the countable axiom of choice with an analogy involving hot dogs.
Let’s say you really want a hotdog. You are walking down the street, and suddenly you stumble upon an infinite number of hotdog vendors who each have tubs with many hotdogs in them. You know that you are incredibly hungry right now, and that in the future you may want to go back to the best hotdog vendor. Therefore, you get out your trusty megaphone and announce to the hotdog vendors a rule (some function that allows them to choose…let’s call it a choice function) so that each of them will know exactly which of their hotdogs to give you. This way, you don’t have to go and pick out one hotdog from each of them individually. The axiom of choice has now saved you a lot of valuable time and probably doomed you to a sedentary lifestyle.
This is Nick’s second post! To learn more about his research, check out his first article here!