NBA Point Spread Prediction | December 2020
As 2021 begins, we are greeted with new highlights from the 2021 NBA Season. Traditional highlights include improbable shots from Steph Curry, takeaways with the ensuing fireworks at the other end, and buckets that barely beat the buzzer. My highlights look a little different: how closely I can predict a given team's score, along with the corresponding spread and over/under.
The Data
I compiled game-by-game matchups by iterating over the last regular season, which began on October 22, 2019, paused on March 11, 2020 after Rudy Gobert’s positive test, and concluded in the middle of August. The approach was to gather team statistics from each box score (in accordance with the provider’s terms of service), match them to the final game score, and repeat for each matchup throughout the season.
The Model
With the data compiled, I trained two separate models: one to predict the score of the team on the road (using the team stats of both that team and its opponent in that game as inputs), and one to do the same for the team at home. With the models trained, I applied them to a separate set of data. To predict a matchup, each model takes the season-to-date cumulative averages of the team in question and of its opponent as of that night. The spread and over/under are then inferred from the two predictions: the spread is the difference between the predicted scores, and the over/under is their sum.
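A minimal sketch of that flow, under stated assumptions: the model type is not specified in this post, so each model is represented here as a plain callable that maps a feature list to a predicted score. The names `build_features`, `predict_matchup`, and the stat key `ppg` are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List

Stats = Dict[str, float]               # season-to-date averages for one team
Model = Callable[[List[float]], float] # stand-in for any trained regressor


def build_features(team: Stats, opponent: Stats, keys: List[str]) -> List[float]:
    """Concatenate the team's averages with its opponent's, in a fixed key order."""
    return [team[k] for k in keys] + [opponent[k] for k in keys]


def predict_matchup(
    road_model: Model,
    home_model: Model,
    road_avgs: Stats,
    home_avgs: Stats,
    keys: List[str],
) -> Dict[str, float]:
    """Apply both models to one matchup and infer the margin and total."""
    # Each model sees "its" team first and the opponent second.
    road_score = road_model(build_features(road_avgs, home_avgs, keys))
    home_score = home_model(build_features(home_avgs, road_avgs, keys))
    return {
        "road_score": road_score,
        "home_score": home_score,
        # Positive margin means the home team is predicted to win.
        "home_margin": home_score - road_score,
        # The over/under is compared against this predicted total.
        "total": home_score + road_score,
    }
```

Any regressor with a predict-style interface could be wrapped as a `Model`; the sign convention for the margin (home minus road) is also an assumption and would need to match however the spread is quoted.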
The Analysis
Once I was able to get the process up and running on a server, I created a cron job to run every morning at 6 AM EST. The script generates the current day's predictions and renders them in an HTML file, which is displayed through the link above. In addition, the process writes the previous night’s predictions and actual scores to a file on the server. I can then download the results and analyze the performance separately with various techniques (this was the root goal of the project: evaluating the predictive strength of the model). The history of predictions can also be viewed on my website at the second link above.
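As one illustration of what such an analysis could look like, the sketch below computes a few simple error measures from logged predictions and actual scores. The specific techniques used in this project are not listed here, so the metrics, the `evaluate` function, and the record keys (`pred_home`, `pred_road`, `actual_home`, `actual_road`) are all assumptions made for the example.

```python
from typing import Dict, List


def evaluate(results: List[Dict[str, float]]) -> Dict[str, float]:
    """Summarize prediction quality over a list of logged games."""
    n = len(results)
    if n == 0:
        raise ValueError("no results to evaluate")

    score_err = margin_err = total_err = 0.0
    winners_right = 0
    for r in results:
        pred_margin = r["pred_home"] - r["pred_road"]
        actual_margin = r["actual_home"] - r["actual_road"]
        pred_total = r["pred_home"] + r["pred_road"]
        actual_total = r["actual_home"] + r["actual_road"]

        # Absolute error on each team's score (two scores per game).
        score_err += abs(r["pred_home"] - r["actual_home"])
        score_err += abs(r["pred_road"] - r["actual_road"])
        margin_err += abs(pred_margin - actual_margin)
        total_err += abs(pred_total - actual_total)

        # Same sign means the predicted winner matched the actual winner.
        # A predicted margin of exactly zero is counted as a miss.
        if pred_margin * actual_margin > 0:
            winners_right += 1

    return {
        "score_mae": score_err / (2 * n),
        "margin_mae": margin_err / n,
        "total_mae": total_err / n,
        "winner_accuracy": winners_right / n,
    }
```

Mean absolute error keeps the result in points, which makes it easy to compare against a posted spread or total; other measures would be equally reasonable.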
Disclaimer: This project was conducted only for personal discovery and shared for community engagement. It is not intended to offer inferences or advice on outcomes of the current NBA Season. If you are interested in building similar tools, I encourage you to consider where the data is sourced and how you intend to use it.