Tag Archives: football analytics

before Sabermetrics, there was football analytics

I enjoyed a recent Advanced NFL Stats podcast interview with Virgil Carter [Link], a former Chicago Bears quarterback who is considered to be the “father of football analytics.” During his time in the NFL, Carter enrolled in Northwestern University’s MBA program, and he started to work on a football project that was eventually published in Operations Research in 1971 (before Bill James of baseball analytics and Sabermetrics fame!). Carter even taught statistics and mathematics at Xavier University while on the Cincinnati Bengals.

The paper in Operations Research was co-written with Robert Machol and entitled “Operations Research on Football.” The paper estimates the expected value of having a First-and-10 at different yard lines on the field (see my related post here). Slate has a nice article about Virgil Carter [Link] outlining the work that went into estimating the value associated with field position:

Carter acquired the play-by-play logs for the first half of the 1969 NFL season and started the long slog of entering data: 53 variables per play, 8,373 plays. After five or six months, Carter had produced 8,373 punch cards. By today’s computing standards, Carter’s data set was minuscule and his hardware archaic. To run the numbers, he reserved time on Northwestern’s IBM 360 mainframe. Processing a half-season query would take 15 or 20 minutes—something today’s desktop computers could do in nanoseconds. In one research project, Carter started with the subset of 2,852 first-down plays. For each play, he determined which team scored next and how many points they scored. By averaging the results, he was able to learn the “expected value” of having the ball at different spots on the field.
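The averaging step described above can be sketched in a few lines. This is only an illustration: the function name, the 10-yard grouping, and the sample plays are all assumptions made up for the example, not Carter's data or code.

```python
from collections import defaultdict

def expected_points_by_zone(plays, zone_width=10):
    """Average the signed next score over first-down plays, grouped by field zone.

    Each play is (yards_to_goal, next_score). next_score is positive when the
    team with the ball scores next and negative when the opponent scores next.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for yards_to_goal, next_score in plays:
        zone = (yards_to_goal // zone_width) * zone_width  # e.g. 95 -> 90
        totals[zone] += next_score
        counts[zone] += 1
    return {zone: totals[zone] / counts[zone] for zone in sorted(totals)}

# Made-up plays for illustration only (hypothetical, not from the 1969 logs).
sample_plays = [(95, -7), (92, 3), (98, -7), (15, 7), (12, 3), (18, 7)]
zone_values = expected_points_by_zone(sample_plays)
for zone, value in zone_values.items():
    print(f"{zone}-{zone + 9} yards from the goal line: {value:+.2f} points")
```

With real play-by-play data, each zone's average is the "expected value" of a first down in that part of the field.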

They found that close to a team’s own end zone (almost 100 yards from scoring a touchdown), a team’s expected points were negative, meaning that the risk of a fumble or interception handing the opponent an easy touchdown outweighed the team’s own ability to move down the field and score. The paper discusses issues other than expected values, such as Type I and Type II errors in using timeouts. Here, a timeout is a time-management tool with implications for each team’s remaining possessions and for using too much or too little time. The rules of football were quite different 40-something years ago. For example, an incomplete pass in the end zone required the ball to be brought out to the 20 yard line (instead of a mere loss of a down with no change in field position).

Listen to the podcast here.

Read my posts on football analytics here.

why the Bears should have gone for it on fourth and inches

In last night’s Bears/Packers game, Coach Marc Trestman (of the Bears) decided to go for it on 4th and inches at the Bears’ 32 yard line in the fourth quarter, with 7:50 left and the Bears up by 4. Normally, teams decide to punt in this situation, which reflects a hyper-conservative decision-making approach adopted by most football coaches. The Bears got the first down, and the ensuing drive led to a field goal, putting the Bears up by 7 with 0:50 left in the game.

In hindsight, it was obviously a great call. But decisions aren’t made with hindsight – both good and bad outcomes are possible with different likelihoods.

An article by Chris Chase at USA Today [Link] argued that going for it on 4th down was a bad decision because the bad outweighed the good. There isn’t much analytical reasoning in the article. I prefer to base decisions on number crunching rather than feeling and intuition, so here is my attempt to argue that going for it on 4th down was a good decision.

The basic idea of football decision-making

There are a number of models that estimate the expected number of points a team would get based on their position on the field. To determine the best decision, you can:

  1. look at the set of possible outcomes associated with each decision,
  2. find the probability and expected number of points associated with each of these outcomes,
  3. then take the expected value over the outcomes of each decision, and
  4. choose the decision with the most expected points.
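The four steps can be sketched as a small helper. This is a minimal illustration, not a full model: the function names are made up, and the probabilities and point values are the ones quoted in this post (with p = 0.8 for the fourth-down attempt).

```python
def expected_points(outcomes):
    """Expected value of a list of (probability, points) outcomes."""
    return sum(prob * points for prob, points in outcomes)

def best_decision(decisions):
    """Return the decision with the most expected points, plus all the values."""
    values = {name: expected_points(outs) for name, outs in decisions.items()}
    return max(values, key=values.get), values

p = 0.8  # assumed fourth-down success probability
decisions = {
    # (probability, expected points from the Bears' point of view)
    "go for it": [(p, 0.839), (1 - p, -3.265)],
    "punt": [(0.28, 0.64), (0.25, -0.24), (0.25, -0.92), (0.22, -1.54)],
}
choice, values = best_decision(decisions)
for name, value in values.items():
    print(f"{name}: {value:+.3f} expected points")
print("best decision:", choice)
```

Any decision that can be written as a list of outcomes with probabilities and point values fits the same pattern.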

Let’s say going for it on 4th down has success probability p. Historical data suggests that p=0.8 or so. If unsuccessful, the Packers would take the ball over on the Bears’ 32 yard line with a conditional expected value of about -3.265 points. This value is negative because we are taking the Bears’ point of view. If successful, the Bears would be around their own 35 yard line with a conditional expected value of 0.839. When considering both outcomes (success and failure), we can compute an expected value associated with going for it on fourth down: 0.839 p - 3.265 (1 - p).

Let’s look at the alternative: punting. The average punt nets a team about 39 yards. This would put the ball on the Packers’ 29 yard line with an associated expected number of points of -0.51. However, this isn’t the right way to approach the problem. Since the expected number of points associated with a yard line is non-linear, we can’t average the field position first and then look up the expected number of points. Instead, we should consider several outcomes associated with field positions: Let’s assume that the Packers will get the ball back on their own 15, 25, 35, and 45 yard lines with probabilities 0.28, 0.25, 0.25, and 0.22 and with expected points 0.64, -0.24, -0.92, and -1.54, respectively. This averages out to the ball on the Packers’ 29 yard line with -0.45 points (on average).
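To see why the order of averaging matters, here is a quick check using only the numbers from the paragraph above. The variable names are made up for the example.

```python
# (probability, Packers' starting yard line, expected points from the Bears' view)
punt_outcomes = [
    (0.28, 15, 0.64),
    (0.25, 25, -0.24),
    (0.25, 35, -0.92),
    (0.22, 45, -1.54),
]

# Average the field position, and separately average the expected points.
mean_yard_line = sum(p * yard for p, yard, _ in punt_outcomes)
mean_points = sum(p * pts for p, _, pts in punt_outcomes)

# Value quoted above for a ball spotted exactly at the Packers' 29.
POINTS_AT_AVERAGE_SPOT = -0.51
gap = mean_points - POINTS_AT_AVERAGE_SPOT

print(f"average yard line: {mean_yard_line:.1f}")
print(f"average of expected points: {mean_points:+.2f}")
print(f"difference from looking up the average spot: {gap:+.2f}")
```

The two approaches land on the same average yard line but differ by about 0.06 points, which is the effect of the non-linear relationship between field position and expected points.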

Now we can compare the options of going for it (left hand side) and punting (right hand side):
0.839 p - 3.265 (1 - p) ≥ -0.45
Solving this inequality tells us that the Bears should go for it on fourth down if they have a success probability of at least 68.6%.

These values are from Wayne Winston’s book Mathletics.
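The break-even probability comes from rearranging the inequality: p (0.839 + 3.265) ≥ 3.265 - 0.45. A small sketch of that calculation follows. The function name is made up, and the three inputs are the values used in this post.

```python
def break_even_probability(success_value, failure_value, alternative_value):
    """Smallest p such that p * success + (1 - p) * failure >= alternative."""
    if success_value <= failure_value:
        raise ValueError("success must be worth more than failure")
    return (alternative_value - failure_value) / (success_value - failure_value)

# Values from the Bears' point of view: convert, fail, or punt instead.
threshold = break_even_probability(0.839, -3.265, -0.45)
print(f"go for it when p is at least {threshold:.1%}")
```

Swapping in different expected point values (for example, from another field position) gives the corresponding threshold without redoing the algebra.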

But time was running out!

The method I outlined above tends to work really well except that it ignores the actual point differential between the teams (which is often important, e.g., when deciding to go for one or two after a touchdown), the amount of time left on the clock, and the number of timeouts. It’s worth doing a different analysis during extreme situations. With 7:50 left on the clock, the situation wasn’t too extreme, but the Packers’ 3 remaining timeouts and 4 point score differential are worth discussing. Going for it on 4th down allowed the Bears to score a field goal and eat up an additional seven minutes off the clock, which was almost the perfect outcome. Let’s consider a range of outcomes.

Very close to the end of the game, it’s best to evaluate decisions based on the probability of winning instead of the expected number of points. Note that you find the probability of winning as the expected value of an indicator variable, so it uses the same method with different numbers. Making this distinction is important, since if you are down by 4 points, going for a field goal may maximize your average points but would guarantee that you’d lose the game.
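Here is a rough sketch of that distinction for a team down by 4 on the last play of the game. The success probabilities are invented for illustration, and the touchdown is counted as 6 points since no conversion is needed to win.

```python
DEFICIT = 4  # points the team trails by on the final play

# Hypothetical (probability, points) outcomes for each option.
options = {
    "field goal": [(0.90, 3), (0.10, 0)],
    "touchdown attempt": [(0.30, 6), (0.70, 0)],
}

def expected_points(outcomes):
    return sum(prob * points for prob, points in outcomes)

def win_probability(outcomes, deficit):
    # Expected value of an indicator: 1 if the points overcome the deficit, else 0.
    return sum(prob * (1 if points > deficit else 0) for prob, points in outcomes)

for name, outcomes in options.items():
    print(f"{name}: {expected_points(outcomes):.2f} expected points, "
          f"{win_probability(outcomes, DEFICIT):.0%} chance to win")
```

The field goal has the higher expected points, but its win probability is zero because 3 points cannot overcome a 4-point deficit.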

One way to address these issues is to look at how many possessions the Packers will have if the Bears punt or go for it on fourth down. Let’s say that the Packers would get one possession if the Bears go for it. They would need to score a touchdown on their single possession to win. Let’s say that the Packers would get two possessions if the Bears punt. The Packers could win by scoring two field goals or one touchdown, unless the Bears score on their possession in between the Packers’ possessions. If the Bears score an additional field goal, that would put the Bears up 7, and the Packers would need at least one touchdown to tie (assuming a PAT), and an additional score of any kind to win. If the Bears score an additional touchdown, that would put the Bears up 10-12, and the Packers would need two touchdowns to win and could possibly tie or win with a field goal and a touchdown (assuming a PAT or 2-point conversion was successful). The combination and sequence of events need to be evaluated and measured.
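One crude way to evaluate these combinations is to enumerate every sequence of drive results. The sketch below does that under loud assumptions: the per-drive probabilities are invented, every touchdown counts as 7 points, both teams have the same drive probabilities, and a tie is treated as a coin flip in overtime.

```python
from itertools import product

POINTS = {"none": 0, "field goal": 3, "touchdown": 7}
# Hypothetical chance of each result on any single drive.
DRIVE_PROBS = {"none": 0.60, "field goal": 0.15, "touchdown": 0.25}

def packers_win_prob(drive_order, bears_lead=4):
    """Enumerate all drive results and add up the Packers' winning chances."""
    total = 0.0
    for results in product(POINTS, repeat=len(drive_order)):
        prob = 1.0
        margin = bears_lead  # Bears' lead after all drives
        for team, result in zip(drive_order, results):
            prob *= DRIVE_PROBS[result]
            margin += POINTS[result] if team == "Bears" else -POINTS[result]
        if margin < 0:
            total += prob        # Packers ahead at the end
        elif margin == 0:
            total += 0.5 * prob  # tie: assume a coin flip in overtime
    return total

one_possession = packers_win_prob(["Packers"])
two_possessions = packers_win_prob(["Packers", "Bears", "Packers"])
print(f"one Packers possession: {one_possession:.3f}")
print(f"two Packers possessions: {two_possessions:.3f}")
```

A real analysis would let the drive probabilities depend on field position, time, and score, but even this simple version shows how the extra possession changes the Packers' chances.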

Without crunching numbers, we can see that punting would likely increase the Packers’ chance of winning because it would give them 2 chances to score (unless the Packers’ defense is so poor that they think the Bears would be almost certain to score again given another chance).

This is just one idea for analyzing the decision of whether to go for it on fourth down. Certainly, more details can be taken into account so long as there is data to support the modeling approach.

Brian Burke blogged about this as I was finishing up my post [Link]. He used the win probability instead of the expected number of points (which I recommend but don’t calculate). This yielded a break-even success probability of 71% for the Bears, which is close to what I found. In any case, this more or less supported the decision to go for it on fourth and inches, although punting would also be reasonable in this case since the probability of converting on fourth down is only slightly higher than the threshold. Maybe this analysis wouldn’t have supported the decision to go for it if it were fourth and 1.

More on fourth down decision-making:

What sports play have you overanalyzed?

Superbowl reading for number crunchers

Here are a few links to posts and articles about the Superbowl that will appeal to number crunchers:

Nate Silver argues that defense wins championships. Many math models show that offense is more instrumental in winning games than is defense. But defense may be better for winning titles. Silver looks at the top 20 defenses and offenses to have played in the Super Bowl according to the simple rating system at pro-football-reference.com. He finds that the top defensive teams have won 14 of 20 Super Bowls, whereas the top offensive teams have won 10 of 20.

Nate Cohn at the New Republic writes about how football is ripe for reaping the benefits from advanced statistics.

Josh Laurito has a nice post on TV ratings (as measured by Nielsen) for major league sports championships. The Super Bowl is the only championship whose ratings have been increasing over the past decade or so (shown below). The Superbowl with the highest ratings ever was the 1986 Superbowl featuring the 1985 Bears (this is probably the closest I’ll get to proving that the 1985 Bears were the best team ever).

Superbowl Nielsen Ratings

I’ve written about football in several posts. One analyzes the Patriots’ decision to let the Giants score a touchdown in last year’s Superbowl using a decision tree.

I also have three presentations on football decision making.

The first illustrates when a team should go for a two point conversion using dynamic programming:  

The second looks at when a team should go for it on fourth down using decision trees: 

The third uses game theory to find the best mix of run and pass plays. 

how to find a football team’s best mix of pass and run plays using game theory

This is my third and final post in my series of football analytics slidecasts. After this one, just enjoy the Superbowl. My first two posts are here and here.

This slidecast illustrates how to find

  • the offensive team’s best mix of run and pass plays, and
  • the defensive team’s best mix of run and pass defenses.
The best mix is, of course, a mixed strategy. We use game theory to identify the best mix (a Nash equilibrium) for a simultaneous-move, complete information, zero-sum game.
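For a two-by-two version of this game, the equilibrium mix can be found by making the opponent indifferent between its two choices. The sketch below assumes a made-up payoff matrix of expected yards gained by the offense; the numbers and the function name are illustrative only and are not taken from the slidecast.

```python
def solve_2x2_zero_sum(payoff):
    """Mixed-strategy equilibrium of a 2x2 zero-sum game.

    payoff[i][j] is the offense's gain when the offense picks row i
    (0 = run, 1 = pass) and the defense picks column j
    (0 = run defense, 1 = pass defense).
    """
    (a, b), (c, d) = payoff
    maximin = max(min(a, b), min(c, d))
    minimax = min(max(a, c), max(b, d))
    if maximin == minimax:
        raise ValueError("pure-strategy saddle point: no mixing needed")
    denom = a - b - c + d
    run_prob = (d - c) / denom          # offense runs with this probability
    run_defense_prob = (d - b) / denom  # defense plays run defense this often
    value = run_prob * a + (1 - run_prob) * c
    return run_prob, run_defense_prob, value

# Hypothetical expected yards: rows are run/pass, columns are run D/pass D.
yards = [[-5, 5],
         [10, 0]]
run_prob, run_defense_prob, value = solve_2x2_zero_sum(yards)
print(f"offense runs {run_prob:.0%} of the time")
print(f"defense plays run defense {run_defense_prob:.0%} of the time")
print(f"expected gain per play: {value:.1f} yards")
```

At these mixes neither side can improve by shifting toward run or pass, which is what makes the pair a Nash equilibrium.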

What is a football team’s best mix of running and passing plays?

View another webinar from Laura McLay

When is a two point conversion better than an extra point? A dynamic programming approach.

This post continues my series of slidecasts about football. My first slidecast is here.

Today’s topic addresses when a two point conversion is better than an extra point after a touchdown. As you may guess, it is best for a team to go for two when they are down by eight. You can see other scenarios when it is best to go for two, based on the point differential and the remaining number of possessions in the game.
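A stripped-down version of the dynamic program can be sketched as follows. It is much simpler than the model in the slidecast and rests on strong assumptions: the trailing team scores a touchdown on each of its remaining possessions, the opponent does not score again, and the conversion and overtime probabilities are invented for the example.

```python
from functools import lru_cache

P_KICK = 0.99      # hypothetical extra-point success rate
P_TWO = 0.48       # hypothetical two-point conversion success rate
P_OVERTIME = 0.50  # assumed chance of winning if the game ends tied

@lru_cache(maxsize=None)
def win_prob(deficit, touchdowns_left):
    """Return (best win probability, best choice) for the trailing team."""
    if touchdowns_left == 0:
        if deficit < 0:
            return 1.0, None
        if deficit == 0:
            return P_OVERTIME, None
        return 0.0, None

    after_td = deficit - 6  # deficit right after the touchdown, before the try

    def value(new_deficit):
        return win_prob(new_deficit, touchdowns_left - 1)[0]

    kick = P_KICK * value(after_td - 1) + (1 - P_KICK) * value(after_td)
    two = P_TWO * value(after_td - 2) + (1 - P_TWO) * value(after_td)
    return (two, "two") if two > kick else (kick, "kick")

for deficit, tds in [(8, 1), (7, 1), (14, 2)]:
    prob, choice = win_prob(deficit, tds)
    print(f"down {deficit} with {tds} touchdown(s) to come: "
          f"go for {choice}, win probability {prob:.3f}")
```

Each state is the point deficit and the number of scoring possessions left, and the recursion works backward from the end of the game, which is the same idea the full dynamic program uses with more realistic possession outcomes.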

This presentation is based on Wayne Winston’s book Mathletics, which is a fantastic introduction to sports analytics.

Related post:

should a football team go for it on fourth down?

With the Superbowl coming up, I created three sports analytics slidecasts for analyzing football strategies. I will post one per day here on the blog.

The first slidecast deals with the decision of whether a football team should go for it on fourth down or punt. The presentation is adapted from the book Scorecasting by Tobias Moskowitz and Jon Wertheim. Wayne Winston blogged about this, and his blog post went viral. Here is another look at this issue.