I enjoyed a recent Advanced NFL Stats podcast interview with Virgil Carter [Link], a former Chicago Bears quarterback who is considered to be the “father of football analytics.” During his time in the NFL, Carter enrolled in Northwestern University’s MBA program, and he started to work on a football project that was eventually published in Operations Research in 1971 (before Bill James of baseball analytics and Sabermetrics fame!). Carter even taught statistics and mathematics at Xavier University while playing for the Cincinnati Bengals.
The paper in Operations Research was co-written with Robert Machol and entitled “Operations Research on Football.” The paper estimates the expected value of having a First-and-10 at different yard lines on the field (see my related post here). Slate has a nice article about Virgil Carter [Link] outlining the work that went into estimating the value associated with field position:
Carter acquired the play-by-play logs for the first half of the 1969 NFL season and started the long slog of entering data: 53 variables per play, 8,373 plays. After five or six months, Carter had produced 8,373 punch cards. By today’s computing standards, Carter’s data set was minuscule and his hardware archaic. To run the numbers, he reserved time on Northwestern’s IBM 360 mainframe. Processing a half-season query would take 15 or 20 minutes—something today’s desktop computers could do in nanoseconds. In one research project, Carter started with the subset of 2,852 first-down plays. For each play, he determined which team scored next and how many points they scored. By averaging the results, he was able to learn the “expected value” of having the ball at different spots on the field.
They found that close to a team’s own end zone (almost 100 yards from scoring a touchdown), a team’s expected points were negative, meaning that the risk of a fumble or interception handing the opponent an easy touchdown outweighed the team’s own chance of moving down the field and scoring. The paper discusses issues other than expected values, such as Type I and Type II errors in using timeouts. Here, a timeout is a time-management tool that affects each team’s remaining possessions, and a team can err in either direction by using too much or too little time. The rules of football were quite different 40-something years ago. For example, an incomplete pass in the end zone required the ball to be brought out to the 20-yard line (instead of a mere loss of a down with no change in field position).
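The averaging step described in the Slate excerpt is simple enough to sketch in a few lines of Python. This is only an illustration, not the authors' actual procedure: the function name `expected_points`, the 10-yard bins, the sign convention for the next score, and the sample plays are all my own assumptions, and the numbers are made up rather than taken from the 1969 data.

```python
from collections import defaultdict


def expected_points(plays, bin_size=10):
    """Average the value of the next score, grouped by field position.

    plays: iterable of (yards_to_goal, next_score) pairs, one per
    first-and-10. next_score is positive if the team with the ball
    scores next, negative if the opponent does, and 0 if nobody scores.
    The bin width is an assumption made for this sketch.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for yards_to_goal, next_score in plays:
        # Bins are labeled by their nearest yard: 1 means 1-10 yards to go,
        # 91 means 91-100 yards to go (with the default bin_size).
        bin_start = ((yards_to_goal - 1) // bin_size) * bin_size + 1
        totals[bin_start] += next_score
        counts[bin_start] += 1
    return {b: totals[b] / counts[b] for b in sorted(totals)}


# Hypothetical plays for illustration only (not Carter's data).
sample_plays = [
    (95, -7), (95, 0), (95, 3),   # backed up near own end zone
    (50, 7), (50, -3), (50, 3),   # midfield
    (5, 7), (5, 7), (5, 3),       # about to score
]

ev = expected_points(sample_plays)
for bin_start, value in ev.items():
    print(f"{bin_start:>3}+ yards to goal: {value:+.2f} expected points")
```

With these invented inputs, the bin farthest from the goal line averages out negative, which is the same qualitative pattern the paper reports for a team pinned near its own end zone.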
Listen to the podcast here.
Read my posts on football analytics here.