Spatial Analysis of College Basketball

June 23, 2017

Nick Kapur, PhD Candidate

For a few weeks each March, the country is captivated by March Madness. Brackets are filled out, bets are placed, and occasionally prayers are answered. Professional sports are wonderful, but college sports generate the purest form of passion: a passion derived from people’s lives being intricately and inextricably tied to the school they attend. At NC State, we are at the epicenter of college basketball. NC State plays in the best basketball conference in the country (the ACC), mere minutes from Duke and UNC, two of the greatest college basketball programs of all time. Competing constantly against the very best schools in the country requires the flexibility and adaptability found in any “underdog” story. I believe that this requirement can lead to the perfect union of NC State basketball and an unlikely partner: the Department of Statistics.

Since the early 2000s, professional sports organizations have slowly embraced the use of statistics and analytics to help drive performance increases. The professional equivalent of college basketball, the National Basketball Association (NBA), has even gone so far as to install special cameras in each arena that produce data including every player’s spatial location 24 times per second. College sports teams, due primarily to a lack of resources, have been far slower to embrace analytics. In college, there are no fancy cameras in place, leaving most studies to use simple statistics such as points, rebounds, and assists. Meanwhile, the most important offensive concept in the game, the ability to shoot the ball, is captured only by field goal percentage. Field goal percentage is a misleading statistic because it says nothing about where on the court shots originate. As a result, players who take easier or fewer shots tend to have higher field goal percentages. This is problematic because the statistic doesn’t truly capture the best shooters; it simply captures the most opportunistic ones.
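To make the shortcoming concrete, here is a minimal sketch in Python. The two players, the two zones, and every number are invented purely for illustration; none of it comes from the NC State data. It shows how a player can post the higher overall field goal percentage while shooting worse from every zone, simply by taking more of the easy shots.

```python
# Hypothetical shot logs: (zone, attempts, makes).
# All numbers are made up for illustration only.
shots = {
    "Player A": [("rim", 20, 12), ("three", 5, 1)],
    "Player B": [("rim", 5, 4), ("three", 20, 8)],
}


def overall_fg_pct(log):
    """Field goal percentage pooled over every zone."""
    attempts = sum(a for _, a, _ in log)
    makes = sum(m for _, _, m in log)
    return makes / attempts


def fg_pct_by_zone(log):
    """Field goal percentage computed separately for each zone."""
    return {zone: makes / attempts for zone, attempts, makes in log}


for player, log in shots.items():
    print(player, round(overall_fg_pct(log), 2), fg_pct_by_zone(log))
```

In this toy example Player A shoots 52% overall to Player B's 48%, yet Player B is the better shooter both at the rim (80% vs. 60%) and from three (40% vs. 20%). Only the per-zone view, which requires shot locations, reveals that.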

That is where the Department of Statistics can help make major strides. In a recent project, I created a web application that allows for the easy tracking of college basketball shots. This does not give all players’ locations 24 times a second like the NBA’s cameras, but it does allow easy capture of shot location, a glaring missing piece of data for most college programs. In addition, after leading a team of undergraduates to collect data for 20 NC State games from the 2016-17 season, I performed a spatial analysis of the data. This analysis led to several interesting insights. First, the data showed no evidence for the conventional wisdom that players tend to shoot more (or better) to the side of their dominant hand. Second, the belief that shooting 3-pointers is significantly better than shooting long 2-pointers was reaffirmed. And finally, likelihood comparisons could be drawn for each player. This is important, as it can be used to determine where certain players are likely to shoot, which is wonderful information for a coach trying to create a game plan.
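The arithmetic behind the 3-pointer versus long 2-pointer comparison can be sketched in a few lines of Python. This is not the spatial analysis described above, just the expected-value reasoning that usually motivates the claim, and the shooting percentages are assumed values chosen for illustration rather than results from the NC State data.

```python
def expected_points(shot_value, make_probability):
    """Expected points per attempt for a shot worth `shot_value` points."""
    return shot_value * make_probability


def break_even_two_point_pct(three_point_pct):
    """Long-2 make rate needed to match a given 3-point make rate."""
    return 3 * three_point_pct / 2


# Assumed, illustrative make rates (not measured values).
three_pct = 0.35
long_two_pct = 0.40

print(round(expected_points(3, three_pct), 2))            # points per 3-point attempt
print(round(expected_points(2, long_two_pct), 2))         # points per long-2 attempt
print(round(break_even_two_point_pct(three_pct), 3))      # long-2 rate needed to break even
```

Under these assumed rates, a 3-point attempt is worth about 1.05 points and a long 2-point attempt about 0.80, so the long 2 would need to fall more than half the time to keep pace.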

Overall, this recent project accomplished several interesting tasks in the world of college basketball, and I hope it helps statistical thinking soon become an integral part of the game. If this union is embraced by NC State (as it has been thus far), our university can be a leader in driving the field of sports statistics to a higher level while at the same time winning in front of the entire country every March.

Nick is a PhD Candidate whose research interests include machine learning and statistical genetics. His current research focuses on pursuit-evasion and cooperative reinforcement learning. We thought this posting was a great excuse to get to know a little more about him, so we asked him a few questions!

  • What do you find most interesting/compelling about your research?

    I love the ability to work on problems from a diverse set of fields. The ability to do statistical research in sports and then take that research and apply it to national security, robotics, and medicine is incredibly appealing to me.

  • What do you see as the biggest or most pressing challenges in your research area?

    I wrote this blog post on sports statistics, so I will answer about that as a research area. I think the most challenging aspect is gaining the trust of the sports community. Like many communities, it tends to be insular and resistant to change. There are still many athletes, coaches, and administrators who do not see the value in listening to people who have not played their sport at a high level. This is slowly changing for the better; however, the area of sports statistics still needs many practitioners who intimately know the sport they are studying, can communicate effectively with the people within that sport’s community, and are open to compromise.

  • Explain the benefits of Scientology.

    The founder of Scientology, L. Ron Hubbard, once said “For a Scientologist, the final test of any knowledge he has gained is, ‘did the data and the use of it in life actually improve conditions or didn’t it?’” The question posed in this quote is phenomenal. It is something a statistician should ask themselves every time they are working on a problem. While the statistical methodology of Scientologists may be less rigorous than that of trained statisticians, at least they are asking themselves the appropriate questions (something statisticians don’t always do).