Tag Archives: sports

sports analytics featured in the latest INFORMS Editor’s Cut

An Editor’s Cut on Sports Analytics, edited by Scott Nestler and Anne Robinson, is available. The volume is a collection of sports analytics articles published in INFORMS journals. Some of the articles are free to download for a limited time if you don’t have a subscription. But the Editor’s Cut contains more than academic papers.

Here are some of my favorite articles from the volume.

Technical Note—Operations Research on Football [pdf] by Virgil Carter and Robert E. Machol, 1971. This is my favorite. This article may be the first sports analytics paper ever, and it was written in an operations research journal (w00t!). It was written by an NFL player who used data to estimate the “value” of field position and down by watching games on film and jotting down statistics. For example, first and 10 on your opponent’s 15-yard line is worth 4.572 expected points, whereas first and 10 on your own 15-yard line is worth -0.673 expected points. This idea is used widely in sports analytics and by ESPN’s Analytics team to figure out things like win probabilities. This paper was way ahead of its time. You can listen to a podcast with Virgil Carter here (it’s my favorite sports analytics podcast).

An Analysis of a Strategic Decision in the Sport of Curling by Keith A. Willoughby and Kent J. Kostuk, 2005. This is a neat paper. I have never curled, but I can appreciate the strategy selection at the end of a game. In the latter stages of a game, the choice is between taking a single point and blanking an end. Willoughby and Kostuk use decision trees to evaluate the benefits and drawbacks associated with each strategy. Their conclusion is that blanking the end is the better alternative. North American curlers tend to make the optimal strategy choice, whereas European curlers often choose the single point.
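To make the comparison concrete, here is a tiny decision-tree-style sketch in the spirit of their analysis. The win probabilities below are hypothetical placeholders I made up for illustration, not numbers from the paper; the point is simply that each strategy is evaluated by the win probability of the game state it leads to.

```python
# Hypothetical late-game curling decision: take a single point (score +1 but give
# up the hammer) or blank the end (score unchanged, keep the hammer).
# The win probabilities are illustrative placeholders, not values from the paper.
win_prob = {
    ("up 1", "no hammer"): 0.58,  # state after taking the single point
    ("tied", "hammer"):    0.63,  # state after blanking the end
}

take_single = win_prob[("up 1", "no hammer")]
blank_end = win_prob[("tied", "hammer")]

best = "blank the end" if blank_end > take_single else "take the single point"
print(f"take single: {take_single:.2f}, blank: {blank_end:.2f} -> {best}")
```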

Scheduling Major League Baseball Umpires and the Traveling Umpire Problem by Michael A. Trick, Hakan Yildiz, and Tallys Yunes, 2011. This paper develops a new network optimization model for scheduling Major League Baseball umpires. The goal is to minimize umpire travel, but league rules are at odds with this. The rules require each umpire crew to work games for every team, but not the same team in two consecutive series. As a result, umpires typically travel more than 35,000 miles per season without having a “home base” during the season. The work here helps meet the league’s goals while making life better for the crews.

A Markov Chain Approach to Baseball by Bruce Bukiet, Elliotte Rusty Harold, and José Luis Palacios, 1997. This paper develops and fits a Markov chain model of baseball (you had me at Markov chains!). The model is then used to do a number of different things, such as optimizing the lineup and forecasting run distributions. They find that the optimal spot for the “slugger” is not fourth in the batting order and that the pitcher should not bat last, despite most teams making exactly these choices.
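For a flavor of how such a model works, here is a toy Monte Carlo sketch of a half-inning with a drastically simplified set of batter outcomes (single, home run, or out). The outcome probabilities are made up for illustration, and this is not the authors’ model, which tracks the full set of base-out states analytically.

```python
# Toy half-inning simulation: each plate appearance is a single, a home run, or an
# out, with made-up probabilities. Real Markov chain models track all 24 base-out
# states and solve for run distributions exactly.
import random

def simulate_half_inning(p_single=0.22, p_hr=0.03, rng=random):
    outs, runs = 0, 0
    bases = [False, False, False]  # runners on first, second, third
    while outs < 3:
        u = rng.random()
        if u < p_hr:                      # home run clears the bases
            runs += 1 + sum(bases)
            bases = [False, False, False]
        elif u < p_hr + p_single:         # single: every runner moves up one base
            if bases[2]:
                runs += 1
            bases = [True, bases[0], bases[1]]
        else:                             # everything else is an out here
            outs += 1
    return runs

samples = [simulate_half_inning() for _ in range(100_000)]
print("expected runs per half-inning ≈", sum(samples) / len(samples))
```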

The Loser’s Curse: Decision Making and Market Efficiency in the National Football League Draft by Cade Massey and Richard H. Thaler, 2013. Do National Football League teams overvalue the top players picked early in the draft? The answer: yes, by a wide margin.

There are a couple of dozen papers that examine topics such as decision-making within a game, recruitment and retention issues (e.g., draft preparation), bias in refereeing, and the identification of top players and their contributions. Check it out.

~~~

The Editor’s Cut isn’t just a collection of articles. There are videos, podcasts, and industry articles. A podcast with Sheldon Jacobson is included in the collection. In it, Sheldon talks about bracketology, March Madness, and the quest for the perfect bracket.

A TED talk by Rajiv Maheswaran called “The Math Behind Basketball’s Wildest Moves” is also included in the collection. It describes how to use machine learning to recognize what is happening on a basketball court at any given time (is that a pick and roll or not?).

Other sports tidbits from around the web:

Read the previous INFORMS Editor’s Cut on healthcare analytics.

Here are a few football analytics posts on Punk Rock OR:

Who do you think will win the Super Bowl? The Carolina Panthers or the Denver Broncos? Did you make this decision based on analytics?


Punk Rock OR was on the Advanced Football Analytics podcast

I am thrilled to have been a guest on the Advanced Football Analytics podcast hosted by Dave Collins (@DaveKCollins) to talk about Badger Bracketology and football analytics.

Listen here [iTunes link].

Related blog posts:

You can also read some of Badger Bracketology’s press coverage here.


Should a football team run or pass? A game theory and linear programming approach

Last week I visited Oberlin College to deliver the Fuzzy Vance Lecture in Mathematics (see post here). In addition, I gave two lectures to Bob Bosch’s undergraduate optimization course. My post about my lecture on ambulance location models is here.

My second lecture was about how to solve two-player zero-sum games using linear programming, applied to a sports analytics question: should a football team run or pass? The purpose of the lecture was to introduce zero-sum games (a new topic for most students) and to show how to solve games with two decision-makers using linear programming.
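Here is a minimal sketch of the kind of linear program I have in mind, using a made-up payoff matrix of expected yards (the numbers are illustrative, not from my lecture). The offense picks a mixed strategy over run/pass that maximizes the value it can guarantee against the worst-case defensive response.

```python
# Solve a two-player zero-sum game (run/pass vs. defend-run/defend-pass) with LP.
# Payoffs are expected yards for the offense; the numbers are made up.
import numpy as np
from scipy.optimize import linprog

A = np.array([[-1.0, 6.0],   # run vs. (run defense, pass defense)
              [ 9.0, 2.0]])  # pass vs. (run defense, pass defense)
m, n = A.shape

# Variables: x (offense mixed strategy) and v (game value). Maximize v = minimize -v.
c = np.zeros(m + 1)
c[-1] = -1.0

# For every defense column j: v <= sum_i A[i, j] * x_i, written as A_ub z <= b_ub.
A_ub = np.hstack([-A.T, np.ones((n, 1))])
b_ub = np.zeros(n)

# The strategy probabilities sum to one.
A_eq = np.hstack([np.ones((1, m)), np.zeros((1, 1))])
b_eq = np.array([1.0])

bounds = [(0, 1)] * m + [(None, None)]  # v is free
res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
x, v = res.x[:m], res.x[-1]
print(f"run {x[0]:.0%}, pass {x[1]:.0%}, guaranteed value ≈ {v:.2f} yards")
```

The defense’s optimal mix of run and pass defense comes out of the dual of the same linear program.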

This lecture tied into my Badger Bracketology work, but since I do not use optimization in my college football playoff forecasting model, I selected another football application.

 

Related reading:


the NFL football draft and the knapsack problem

In this week’s Advanced Football Analytics podcast, Brian Burke talked about the knapsack problem and the NFL draft [Link]. I enjoyed it. Brian has a blog post explaining the concept of the knapsack problem as it relates to the NFL draft here. The idea is that the draft is a capital budgeting problem for each team, where the team’s salary cap space is the knapsack budget, the potential players are the items, the players’ salaries against the cap are the item weights, and the players’ values (hard to estimate!) are the item rewards. Additional constraints are needed to ensure that all the positions are covered; otherwise, the optimal solution returned might be a team with only quarterbacks and running backs. Brian talks a bit about analytics and estimating value. I’ll let you listen to the podcast to get all the details.
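To make the analogy concrete, here is a small sketch of that capital budgeting view of the draft using the PuLP modeling package. The players, cap hits, values, and budget are all hypothetical, and the positional coverage constraints are the simplest possible version of the idea.

```python
# Draft-as-knapsack sketch: pick players to maximize value subject to a cap budget
# and minimal positional coverage. All data below are made up for illustration.
from pulp import LpProblem, LpMaximize, LpVariable, lpSum, LpBinary, value

players = {            # name: (cap hit in $M, estimated value, position)
    "QB_A": (6.0, 9.0, "QB"),
    "QB_B": (3.5, 5.0, "QB"),
    "RB_A": (2.0, 4.0, "RB"),
    "WR_A": (4.0, 6.5, "WR"),
    "OL_A": (3.0, 5.5, "OL"),
    "OL_B": (2.5, 4.0, "OL"),
}
cap_budget = 12.0  # knapsack capacity: available cap space in $M

prob = LpProblem("draft_knapsack", LpMaximize)
pick = {p: LpVariable(f"pick_{p}", cat=LpBinary) for p in players}

prob += lpSum(players[p][1] * pick[p] for p in players)                 # total value
prob += lpSum(players[p][0] * pick[p] for p in players) <= cap_budget   # cap budget
for pos in {"QB", "RB", "WR", "OL"}:                                    # cover positions
    prob += lpSum(pick[p] for p in players if players[p][2] == pos) >= 1

prob.solve()
chosen = [p for p in players if pick[p].value() == 1]
print(chosen, "total value =", value(prob.objective))
```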

During the podcast, Brian gave OR a shout out and added a side note about how knapsack problems are useful for a bunch of real applications and can be very difficult to solve in the real world (thanks!). I appreciated this aside, since cute applications of OR on small problem instances sometimes give the impression that our tools are trivial and silly. The reality is that optimization algorithms are incredibly powerful and have allowed us to solve extremely difficult real-world problems.

Optimization has gotten sub-optimal coverage in the press lately. My Wisconsin colleagues Michael Ferris and Stephen Wright wrote a defense of optimization in response to an obnoxious anti-optimization article in the New York Times Magazine (“A sucker is optimized every minute.” Really?). Bill Cook, Nathan Brixius, and JF Puget wrote nice blog posts in response to coverage of a TSP road trip application that failed to touch on the bigger picture (the TSP is useful for routing and gene sequencing, not just planning imaginary road trips!!). I didn’t write my own defense of optimization since Bill, Nathan, and JF did such a good job, but needless to say, I am with them (and with optimization) all the way. It’s frustrating when our field misses opportunities to market what we do.

If you enjoy podcasts, football, and analytics, I recommend the Advanced Football Analytics podcast that featured Virgil Carter, who published his groundbreaking football analytics research in Operations Research [Link].

Related posts:

 


Some thoughts on the College Football Playoff

After a fun year of Badger Bracketology, I wanted to reflect upon the college football playoff.

Nate Silver reflects upon the playoff in an article on FiveThirtyEight, and he touches on the two most salient issues in the playoff:

  • False negatives: leaving teams with a credible case for being named the national championship out of the playoff.
  • False positives: “undeserving” teams in the playoff.

As the number of teams in the playoff increases, the number of false negatives decreases (good – this allows us to have a chance of selecting the “right” national champion) and the number of false positives increases (bad).

One of my concerns with the old Bowl Championship Series (BCS) system with a single national championship game was that exactly two teams were invited to the national championship game. This was a critical assumption in the old system that was rarely discussed. There were rarely exactly two teams that were “deserving.” Usually, deserving is equated with being undefeated and in a major conference. Out of 16 BCS tournaments, this situation occurred only four times (25% of championship games), leading to controversy in the remaining 75%. This is not a good batting average, with most of the 12 controversial years having too many false negatives and no false positives.

The new College Football Playoff (CFP) system has a new assumption: the number of “deserving” teams does not exceed four teams.

If you look at the BCS years, we see that this assumption was never violated: there were never more than four undefeated major-conference teams, nor was there ever controversy surrounding more than three potential “deserving” teams. Controversy surrounded the third team that was left out, a team that would now be invited to the playoff. At face value, the four-team playoff seems about right.

But given the title of Nate Silver’s article (“Expand The College Football Playoff”) and the excited discussion in 2008 of an eight-team playoff after a controversial national championship game, I can safely say that most people want more than four teams in the playoff. TCU’s dominance in its bowl game supports these arguments. The fact that we’ve had one controversial seeding in the one CFP held so far is a sign that maybe four isn’t the right playoff size. What is the upper bound on the number of deserving teams?

Answering this question is tricky, because there is a relationship between the number of teams in the playoff and our definition of “deserving.” There will always be teams on the bubble, but as the playoff becomes larger, this becomes less of an issue. Thoughts on this topic are welcome in blog comments.

It’s worth mentioning the impact on academics and injuries. As a professor of operations research, I believe that every decision requires balancing tradeoffs. The tradeoffs in the college football playoff should not only be about false positives, false negatives, fan enjoyment, and ad revenue. Maybe this is trivial: it’s an extra game for a mere eight teams. But I will be disappointed if the full impact on the student-athletes and their families, including academics and injuries, is not part of the conversation.

 


introducing Badger Bracketology, a tool for forecasting the NCAA football playoff

Today I am introducing Badger Bracketology:
http://bracketology.engr.wisc.edu/

I have long been interested in football analytics, and I enjoy crunching numbers while watching the games. This year is the first season for the NCAA football playoff, where four teams will play to determine the national champion. It’s a small bracket, but it’s a step in the right direction.

The first step to becoming the national champion is to make the playoff. To do so, a team must be one of the top four ranked teams at the end of the season. A selection committee manually ranks the teams, and the committee is given a slew of information and other rankings to make its decisions.

I wanted to see if I could forecast the playoff ahead of time by simulating the rest of the season rather than waiting until all of the season’s games have been played. Plus, it’s a fun project that I can share with the undergraduate simulation course that I teach in the spring.

Here is how my simulation model works. The most critical part is the ranking method, which uses completed game results to rate and then rank the teams so that I can forecast who the top four teams will be at the end of the season. I need to do this solely using math (no humans in the loop!) in each of 10,000 replications. I start with the outcomes of the games played so far, beginning with at least 8 weeks of data, and use them to compute a rating for each team that I then rank. The ranking methodology uses a connectivity matrix based on Google’s PageRank algorithm (similar to a Markov chain). So far, I’ve considered three variants of this model that take various bits of information into account, like who a team beats, who it loses to, and the additional value provided by home wins. I used data from the 2012 and 2013 seasons to tune the parameters needed for the models.
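As a rough illustration of the ranking idea (not the exact Badger Bracketology model), here is a minimal PageRank-style rating built from a handful of hypothetical game results, where each loss acts as a “vote” from the loser to the winner.

```python
# Minimal PageRank-style rating from win/loss results. The games are hypothetical,
# and this ignores home-field value and the other model variants described above.
import numpy as np

games = [("Wisconsin", "Nebraska"), ("Ohio State", "Wisconsin"),
         ("Ohio State", "Nebraska"), ("Nebraska", "Purdue")]
teams = sorted({t for g in games for t in g})
idx = {t: i for i, t in enumerate(teams)}
n = len(teams)

# Connectivity matrix: each loss is a "vote" from the loser to the winner.
C = np.zeros((n, n))
for winner, loser in games:
    C[idx[loser], idx[winner]] += 1.0

# Row-normalize into a transition matrix; undefeated teams get a self-loop.
row_sums = C.sum(axis=1, keepdims=True)
P = np.where(row_sums > 0, C / np.where(row_sums > 0, row_sums, 1.0), np.eye(n))

# Damped power iteration (the PageRank recipe) yields the ratings.
d, r = 0.85, np.ones(n) / n
for _ in range(100):
    r = (1 - d) / n + d * r @ P

for team, rating in sorted(zip(teams, r), key=lambda x: -x[1]):
    print(f"{team:12s} {rating:.3f}")
```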

The ratings, along with the impact of home field advantage, are then used to determine a win probability for each game. From previous years, we found that the home team won 56.9% of games later in the season (week 9 or later), which corresponds to an extra boost in win probability of ~6.9% for home teams. This is important since there are home/away games as well as games on neutral sites, and we need to take this into account. The simulation selects winners in the next week of games by essentially flipping a biased coin. Then, the teams are re-ranked after each week of simulated game outcomes. This is repeated until we get to the end of the season. Finally, I identify and simulate the conference championship games (the only games not scheduled in advance), and we end up with a final ranking. Go here for more details.
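One simple way to wire these pieces together is sketched below. The ratings-ratio win probability is just one plausible choice and is not necessarily the formula my model uses; the ~6.9% home boost is the number mentioned above.

```python
# Sketch of simulating one game: convert two ratings into a home win probability,
# add the home-field boost when the game is not on a neutral site, and flip a
# biased coin. The ratings-ratio formula is an assumption for illustration.
import random

HOME_BOOST = 0.069  # extra home win probability estimated from past seasons

def simulate_game(rating_home, rating_away, neutral_site=False, rng=random):
    p_home = rating_home / (rating_home + rating_away)
    if not neutral_site:
        p_home = min(1.0, p_home + HOME_BOOST)
    return "home" if rng.random() < p_home else "away"

# Sanity check the biased coin on one matchup.
wins = sum(simulate_game(0.031, 0.027) == "home" for _ in range(10_000))
print("simulated home win fraction ≈", wins / 10_000)
```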

There are many methods for predicting the outcome of a game in advance. Most of the sophisticated methods use additional information that we could not expect to obtain weeks ahead of time (like the point spread, point outcomes, yards allowed, etc.). Additionally, some of the methods simply return win probabilities and cannot be used to identify the top four teams at the end of the season. My method is simple, but it gives us everything we need without being so complex that I would be suspicious of overfitting. The college football season is pretty short, so our matrix is really sparse. At present, teams have played 8 weeks of football in sum, but many teams have played just 6-7 games. Additional information could be used to help make better predictions, and I hope to further refine and improve the model in coming years. Suggestions for improving the model will be well-received.

Our results for our first week of predictions are here. Check back each week for more predictions.

Badger Bracketology: http://bracketology.engr.wisc.edu/

Our twitter handle is: @badgerbrackets

Your thoughts and feedback are welcome!

Additional reading:



underpowered statistical tests and the myth of the myth of the hot hand

In grad school, I learned about the hot hand fallacy in basketball. The so-called “hot hand” belongs to a player whose scoring success probability is temporarily increased and who should therefore shoot the ball more often (in the basketball context). I thought the myth of the hot hand was an amazing result: there is no such thing as a hot hand in sports; humans are simply not good at evaluating streaks of successes (hot hands) or failures (slumps).

Flash forward several years. I read a headline about how hand sanitizer doesn’t “work” in terms of preventing illness. I looked at the abstract and read off the numbers. The group that used hand sanitizer (in addition to hand washing) got sick 15-20% less often than the control group that only washed hands. The 15-20% difference wasn’t statistically significant, so it was impossible to conclude that hand sanitizing helped, but it represented a lot of illnesses averted. I wondered if this difference would have been statistically significant had the number of participants been just a bit larger.

It turns out that I was onto something.

The hot hand fallacy is like the hand sanitizer study: the study design was underpowered, meaning there was little chance of rejecting the null hypothesis and drawing the “correct” conclusion even if the hot hand effect or the hand sanitizer effect is real. In the case of the hand sanitizer, the number of participants needed to be large enough to detect a 15-20% reduction in the number of illnesses acquired. Undergraduates do this in probability and statistics courses when they estimate the sample size needed. But researchers sometimes forget to design an experiment in a way that can detect real differences.
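As a back-of-the-envelope illustration (with rates I chose for illustration, not the study’s exact numbers: a 20% illness rate in the control group versus 16% with sanitizer, a 20% relative reduction), a standard power calculation shows how many participants per group such a study would need.

```python
# Rough sample-size calculation for detecting a drop in illness rate from 20% to 16%
# with 80% power at a 5% significance level. The rates are illustrative, not the
# actual study's numbers.
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

p_control, p_sanitizer = 0.20, 0.16
effect = proportion_effectsize(p_control, p_sanitizer)  # Cohen's h

n_per_group = NormalIndPower().solve_power(
    effect_size=effect, alpha=0.05, power=0.80, alternative="two-sided"
)
print(f"need roughly {n_per_group:.0f} participants per group")
```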

My UW-Madison colleague Jordan Ellenberg has a fantastic article about the myth of the myth of the hot hand on Deadspin. He has more in his book How Not to Be Wrong, which I highly recommend. He introduced me to a research paper by Kevin Korb and Michael Stillwell that compared the statistical tests used to test for the hot hand effect on simulated data that did indeed have a hot hand. The “hot” data alternated between streaks with success probabilities of 50% and 90%. They demonstrated that the serial correlation and runs tests used in the early “hot hand fallacy” paper were unable to identify a real hot hand; these tests were underpowered and unable to reject the null hypothesis when it was indeed false. This is poor test design. If you want to answer a question using any kind of statistical test, it’s important to collect enough data and use the right tools so you can find the signal in the noise (if there is one) and reject the null hypothesis if it is false.
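Here is a small simulation in the spirit of that experiment, with a block length and sample sizes that are my own illustrative choices rather than the paper’s. Even when the shooter genuinely alternates between 50% and 90% success, the lag-1 serial correlation in a 100-shot sample is modest and noisy, so a test built on it frequently fails to detect the hot hand.

```python
# Simulate a genuinely streaky shooter (success probability alternates between 50%
# and 90% in blocks) and check how often a lag-1 serial correlation test notices.
# Block length, shot counts, and the rough threshold are illustrative choices.
import numpy as np

rng = np.random.default_rng(0)

def hot_shooter(n_shots=100, block=10):
    p = np.where((np.arange(n_shots) // block) % 2 == 0, 0.5, 0.9)
    return (rng.random(n_shots) < p).astype(int)

def lag1_corr(x):
    return np.corrcoef(x[:-1], x[1:])[0, 1]

corrs = np.array([lag1_corr(hot_shooter()) for _ in range(2000)])
threshold = 1.645 / np.sqrt(99)  # rough one-sided 5% cutoff for n = 100 shots
print("mean lag-1 correlation:", corrs.mean().round(3))
print("fraction of replications that detect the hot hand:", (corrs > threshold).mean())
```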

I learned that there appears to be no measurable hot hand in sports where a defense can easily adapt to put greater pressure on the “hot” player, like basketball and football. The player may be hot, but it doesn’t show up in the statistics because the hot player is, say, double-teamed. The hot hand is more apparent and measurable in sports where defenses are not flexible enough to put more pressure on the hot player, like baseball and volleyball.

 

 

