
When should a football team attempt a two point conversion instead of an extra point? A dynamic programming approach.

On Sunday, November 10, 2019, the Carolina Panthers were down 14 points to the Packers early in the 4th quarter. They scored a touchdown, putting them down by 8, and went for a two point conversion, which did not succeed. The decision has been the subject of debate, with journalists both applauding and criticizing it.

I created a dynamic programming model to determine whether or not to go for a 2 point conversion. The dynamic programming model is based on Wayne Winston’s book Mathletics, which is a fantastic introduction to sports analytics. The state captures the team with possession, the score differential when they obtain possession, and the number of remaining possessions. Possessions alternate: when there is one remaining possession, it is the last possession of the game, and when there are three remaining possessions, the team with the ball has two scoring attempts. Each possession ends in a touchdown, a field goal, or no score. I assume a game that is tied at the end is won half of the time. The probabilities I used are based on average team statistics. I do not model other decisions, such as whether to go for it on fourth down, although these could further improve a team’s win probability.
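To make the model concrete, below is a minimal sketch of the recursion in Python. The extra point and two point conversion probabilities match the numbers used later in this post; the per-possession touchdown and field goal rates are illustrative placeholders, not the model’s exact inputs.

```python
from functools import lru_cache

# Assumed per-possession scoring rates (placeholders for illustration);
# the conversion probabilities match the numbers used in this post.
P_TD = 0.20    # possession ends in a touchdown (6 points before the try)
P_FG = 0.12    # possession ends in a field goal (3 points)
P_XP = 0.96    # extra point success
P_2PT = 0.48   # two point conversion success

@lru_cache(maxsize=None)
def win_prob(diff, possessions):
    """Win probability for the team with the ball, ahead by `diff` points,
    with `possessions` possessions left in the game (including this one)."""
    if possessions == 0:
        # Game over; a tie is assumed to be won half of the time.
        return 1.0 if diff > 0 else 0.5 if diff == 0 else 0.0

    def after(points):
        # The opponent takes over, so flip the sign of the differential.
        return 1.0 - win_prob(-(diff + points), possessions - 1)

    # After a touchdown, take the better of kicking and going for two.
    kick = P_XP * after(7) + (1 - P_XP) * after(6)
    two = P_2PT * after(8) + (1 - P_2PT) * after(6)

    return P_TD * max(kick, two) + P_FG * after(3) + (1 - P_TD - P_FG) * after(0)

# Example: down 8 with four possessions to go. (The value differs from the
# 12.9% discussed below because the scoring rates above are placeholders.)
print(round(win_prob(-8, 4), 3))
```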

The slides are below.

Bottom line: near the end of the game, teams should go for two points when they score a touchdown and are down by 10, 8, 3, or 2, or up by 1, 2, 4, or 5 (these differentials include the six points from the touchdown). These conclusions hold when there are at least two additional possessions in the game.

If you have the last possession: go for 2 when a touchdown on this last possession puts you down by 2.

If you just scored a touchdown but your opponent will have the last possession: go for 2 when a touchdown puts you down by 2 or up by 1, 4, or 5. You normally will want to go for 2 when a touchdown puts you up by 2, but not in this situation: a failed two point conversion leaves you up by only 2, so your opponent could win with a field goal, whereas a successful extra point puts you up by 3 and a field goal only ties.

Carolina went for two when down by 8 after scoring a touchdown. According to my math, Carolina made the right choice. However, the best strategy does not guarantee a win nor does it drastically improve the win probability.

We can examine the decision in more detail. When down by 8 with four possessions to go (which matches when Carolina went for the two point conversion), a team has two choices:

  1. They could kick an extra point, which succeeds with probability 0.96. This gives an 11.3% win probability if successful and a 7.9% win probability if not. Together, this yields an 11.2% win probability.
  2. They could go for a two point conversion, which succeeds with probability 0.48. This gives an 18.3% win probability if successful and a 7.9% win probability if not. Together, this yields a 12.9% win probability.
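These expected values are easy to verify from the numbers above:

```python
p_kick = 0.96 * 0.113 + 0.04 * 0.079   # extra point path
p_two = 0.48 * 0.183 + 0.52 * 0.079    # two point conversion path
print(f"{p_kick:.3f} {p_two:.3f} {p_two - p_kick:.3f}")   # 0.112 0.129 0.017
```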

There are four things to keep in mind:

  • Carolina improved their probability of winning by 1.7 percentage points by going for two.
  • A good process does not guarantee a good outcome.
  • Carolina was not likely to win using either approach.
  • Carolina could have further improved their win probability by considering other decisions (who is playing, which plays are called, and whether to go for it on fourth down).

My conclusions are summarized in the chart below. For more reading: Benjamin Morris of 538 wrote an article about when to go for two here. My analysis is consistent with his, although we make different comparisons.

When to go for a two point conversion in NFL football



data science isn’t just data wrangling and data analysis: on understanding your data

I have often heard rules of thumb, such as data science/analytics is 80% data cleaning/wrangling and 20% data analysis. I’ve seen various versions of this rule of thumb that all have two elements in common:

1) They assign various percentages of time to data cleaning and analysis, with the time allocated to cleaning greatly outweighing the time allocated to analysis.

2) Time is always partitioned into these two categories: cleaning and analysis.

I want to introduce a third category: understanding the data. This is a critically important part of doing data science and analytics. I acknowledge that many data scientists understand their data quite well, so I am not criticizing the entire data science community. Instead, I want to point out and discuss the rich tradition of understanding the data that we have fostered in operations research and highlight the importance of this often overlooked aspect of working with data and building data-driven models. As an applied optimization researcher, I believe that data collection, understanding the data, problem identification, and model formulation are critical aspects of research. These are important training topics for students in my lab. To solve a data-driven model, we need to understand the data and their limitations.

Here are a few ways I have made efforts to understand the data:

  • I have asked questions about the data to subject matter experts (who shared the data with me).
  • I have done ride-alongs and observed service providers in action to see how they collect and record data as well as how they interpret data categories.
  • I have read manuals that describe data sets.
  • I have used summary statistics and other analytical tools to shed light on the distributions and processes that produce the data.
  • Disseminating research findings often results in good questions about the data sources from audience members, which has improved my understanding of the data.
  • I have read other papers related to my problem that describe how data are collected in other settings.

Understanding the data has helped me understand the data’s limitations and apply the data meaningfully in my research:

  • Data sets often have censored data. What is not included in the data set may be critically important. There is no way to know what is not in a data set unless I understand how it was collected.
  • Some data points are nonsensical or misrecorded (e.g., 8-hour ambulance service times). Others are outliers that are important to include in an analysis. Understanding how the data were recorded helps to ensure that the data are used in a meaningful way in the application at hand.
  • Some data points are recorded automatically and others are recorded manually. Both kinds can be high quality or low quality, depending on the setting, employee training, and the tool for automatic collection.
  • Understanding the data is a first line of defense when it comes to data and algorithm bias. Most data sets are biased in that they are not fully representative of the target population or problem, and understanding these biases can help prevent building models that are discriminatory and/or not effective when it comes to the problem at hand.
  • Understanding what data are not included in a data set has resulted in me asking for additional sources of data for my research. Sometimes I have been able to get better data if I ask for it.

Without understanding the data, the analysis could be a matter of garbage in, garbage out.

This post covers just one of many issues required for avoiding algorithm bias in applying data science/analytics. Colleagues and I shared some of these additional thoughts with Cait Gibbons, who wrote an excellent article about algorithm bias for the Badger Herald. Read her article for more.



how unusual was it that the visiting team won all 7 games of the World Series?

Last night the Washington Nationals beat the Houston Astros in the seventh game of the World Series. The visiting team won all seven games of the series. This has never happened before.

Two evenly matched teams should each win about half of the time. Home field advantage gives the home team a slight edge: in Major League Baseball, the home team wins approximately 55% of its games and the visiting team wins the other 45%.

The probability that the visiting team wins all seven games in a seven game series is:

(0.45)^7 = 0.0037

This is less than 1%. While rare, we would expect the visiting team to win all seven games about once in every 268 seven-game series. There have been 114 World Series so far.

For comparison, the probability that the home team wins all seven games in a seven game series is:

(0.55)^7 = 0.0152

To put this in context, the home team is four times as likely as the visiting team to win all seven games of a seven game series. The home team has won all seven games of the World Series three times, which is about what we would expect based on the math above.

Taken together, the home team or the visiting team winning all seven games accounts for about 2% of the possible outcomes of a seven game series. The other 98% captures a mix of home and visiting team victories as well as series that end in fewer than seven games.
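These probabilities and the comparisons above take only a few lines to verify, assuming independent games and a constant 55% home win probability:

```python
p_home = 0.55
v7 = (1 - p_home) ** 7   # visiting team wins all seven games
h7 = p_home ** 7         # home team wins all seven games
print(f"{v7:.4f}, about once every {1 / v7:.0f} seven-game series")   # 0.0037, 268
print(f"{h7:.4f}, {h7 / v7:.1f} times as likely as a visitor sweep")  # 0.0152, 4.1
print(f"either sweep: {v7 + h7:.3f}")                                 # about 2%
```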



We already have an 8 team college football playoff

I commonly hear others argue for expanding the four team College Football Playoff (CFBP) to an eight team playoff. I oppose expanding the playoff to eight teams because for all practical purposes we already have an 8+ team playoff.

Hear me out. There are five major conferences, each of which has a conference championship game, plus an additional five conference championship games in the non-major conferences. The five conference championship games from the major conferences help whittle down the field so the CFBP committee can select four teams for the playoff. These five games serve as a de facto first round of the playoff, with the losing teams eliminated from advancing in the “playoff.” No losing team has ever been selected for the College Football Playoff.

As further evidence of my claim, the conference championship games are so important for selecting teams for the College Football Playoff that the Big 12 added a conference championship game after their conference missed a berth in two of the first three playoffs.

The conference championship games serve as a first round of a playoff as follows. First, as noted earlier, the teams who lose the conference championship games are (for all practical purposes) eliminated from the playoff, since the CFBP committee has never selected a losing team for the College Football Playoff (e.g., 2017 Wisconsin). Second, not all five winning teams in the conference championship games have a reasonable case for making the playoff, since some may already have two losses (e.g., 2017 The Ohio State). As a result, some of the winning teams are also eliminated from the playoff. The teams that remain are the conference championship winners; teams that do not play in a conference championship game, such as Notre Dame as well as Baylor and TCU in 2014; and other top teams that did not qualify for their conference championship game (e.g., Alabama in 2017). The College Football Playoff committee invites four of the teams who emerge from this process to participate in the College Football Playoff.

Expanding the College Football Playoff to eight teams seems redundant to me given that there are conference championship games that serve as a mechanism for selecting the teams for a four team playoff. I might support an eight team playoff if it replaced and eliminated the conference championship games from the five major conferences. I do not see a need to eliminate the other conference championship games. I hope a team from a non-major conference is someday selected for the College Football Playoff, and conference championship games from those conferences help make a case to include teams from more conferences in the College Football Playoff.

A secondary reason I oppose expanding the field of the College Football Playoff is the impact on the students’ educational plans. The football players are student-athletes, and as a professor, I bristle when the topic of education does not enter the conversation. And by and large it has not.

In the meantime, I will continue to rank college football teams and forecast the College Football Playoff every week on Badger Bracketology.



A digital device policy in the classroom

Over the years I have struggled with the issue of whether or not to ban cell phones or laptops. For me, whether or not to ban is the wrong question. A better question focuses on student learning, since I strongly believe in policies that support student learning in the classroom. I also strongly believe in treating my students like adults. 

I don’t mind cell phones in the classroom; I mind behavior that interferes with learning. So two years ago I crafted a digital device policy that focuses on behavior instead of imposing bans.

I posted the digital device policy from my syllabi below. When I introduce the policy, I tell the students I expect them to be in categories 3 and 4. During the semester, I remind students of the policy if many of them are not meeting expectations, or I acknowledge that the students are doing a great job when they are meeting expectations.

I’m more comfortable setting a clear policy that outlines expectations for behavior than imposing a blanket ban. This also puts the students in charge of themselves.

It’s been two years, and I still like it. 

My digital device policy:

Laptops and tablets should be put away and closed if we are not using them for an in-class example. Research* shows that laptop use in class leads to lower grades for those with the laptops and even lower grades for those who are sitting by the laptop users due to the distractions they provide. I ask that you respect your peers’ desire to learn and not engage in distracting behavior in class. I understand that many of you like to follow along with the lecture notes on your tablet during class. I support the use of laptops and tablets in ways that are consistent with the course’s learning goals. I discourage taking notes during class using your laptop keyboard, since students frequently tell me they find typing noises during class to be extremely distracting.

* Sana, F., Weston, T. and Cepeda, N.J., 2013. Laptop multitasking hinders classroom learning for both users and nearby peers. Computers & Education, 62, pp.24-31.

 

Cell Phone / Laptop / Tablet / Device Use

In the real world, people have their phones and devices with them at their jobs, meetings, and courses. Adults do not have their devices taken away from them. They are expected to manage their own use and conform to professional expectations in every setting.

  1. Far Below Expectations: Use is inappropriate. The device is a distraction to others. Example: A student plays games, views non-academic material, types (not for taking notes), reads non-academic articles, or has text or chat conversations.

  2. Below Expectations: Use is distracting. The device is a distraction to the student, who frequently checks the phone or device during learning. Example: A student takes out their phone to look at a text several times during a class period.

  3. Meeting Expectations: The device is not used except at designated appropriate times, or use is limited to a quick check of the phone during a transition or appropriate time. Example: If a student receives an important message from a parent, they quickly check it while still being engaged in class and with no distraction to others.

  4. Exceeding Expectations: The device is not used except as an efficient academic tool for a direct purpose. Devices are not a distraction and are used at appropriate times as an extension of work or learning. Example: A student follows along with the lecture notes on a tablet and goes back a slide to correct a misconception about the lecture material, or looks up the formula for the Binomial theorem for an in-class example, consistent with the course’s learning goals.

 

I use this meme in class, but fewer and fewer students know who Obi-Wan Kenobi is.


This updates my previous post on my preliminary digital device policy.

What is your digital device policy?


Target always thinks I am pregnant: on the costs of false positives and false negatives

In 2012, The New York Times published an article about an algorithm used by Target to identify shoppers who might be pregnant.

[A] man walked into a Target outside Minneapolis and demanded to see the manager. He was clutching coupons that had been sent to his daughter, and he was angry, according to an employee who participated in the conversation.

“My daughter got this in the mail!” he said. “She’s still in high school, and you’re sending her coupons for baby clothes and cribs? Are you trying to encourage her to get pregnant?”

The manager didn’t have any idea what the man was talking about. He looked at the mailer. Sure enough, it was addressed to the man’s daughter and contained advertisements for maternity clothing, nursery furniture and pictures of smiling infants. The manager apologized and then called a few days later to apologize again.

On the phone, though, the father was somewhat abashed. “I had a talk with my daughter,” he said. “It turns out there’s been some activities in my house I haven’t been completely aware of. She’s due in August. I owe you an apology.”

Target uses purchase history to predict which shoppers are pregnant. The article in The New York Times that broke this story implied that Target is extremely accurate in these predictions. I want to explore this a little further. Target sends shoppers who meet its pregnancy criteria coupon books for maternity and baby products. This article in Forbes outlines some of the data used by Target as well as its approach to predictive analytics.

[Target’s statistician] identified 25 products that when purchased together indicate a woman is likely pregnant. The value of this information was that Target could send coupons to the pregnant woman at an expensive and habit-forming period of her life.

I’ve always been skeptical about the accuracy of Target’s algorithm, mainly because they have found me continually pregnant since 2010-11 (the last time I was actually pregnant). Target has sent me many coupon books and ads over the years. It’s not just Target. Sometimes I receive baby formula in the mail from a formula company. Babies are expensive, and it costs Target (and other companies) very little to send me baby coupons and ads. The upside is that if Target is correct, they have a huge potential profit. If they are wrong, it only costs them a little in advertising expenses.

Target targeted me with this ad on Twitter. It’s not the first pregnancy ad that has been tailored to me. I also receive coupons in the mail.

We can unpack a procedure for identifying pregnant customers. John Foreman’s book Data Smart uses a “pregnant shopper” model to introduce linear and logistic regression, and it illustrates this point. I’ve used this model in class, and students really like it. Regression models are fit to data from fictitious shoppers. A logistic regression model produces a score for each shopper based on their purchases of many types of products (similar to how the real application works). Choosing a cutoff maps the score into a 0-1 prediction for classification, which decides who gets the coupon books: shoppers whose scores are above the cutoff get them. The lower the cutoff, the more false positives there will be. Different cutoffs lead to different values of the true positive and false positive rates (see the “ROC curve” image below from Data Smart).
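Here is a minimal sketch of that pipeline in Python. The features, labels, and cutoff are invented for illustration; the real model is fit to actual purchase histories.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Invented purchase-history features: did each of 1,000 shoppers buy each
# of 25 product categories? (A stand-in for the fictitious Data Smart data.)
X = rng.integers(0, 2, size=(1000, 25)).astype(float)
# Synthetic labels: pregnancy loosely driven by the first five products.
y = (X[:, :5].sum(axis=1) >= 3).astype(int)

model = LogisticRegression().fit(X, y)
scores = model.predict_proba(X)[:, 1]   # a pregnancy score in [0, 1]

cutoff = 0.3                      # a lower cutoff sends more coupon books
send_coupons = scores >= cutoff   # and accepts more false positives
```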

Two ways to measure accuracy include:

  1. Sensitivity: the true positive rate that measures the proportion of actual positives that are correctly identified.
  2. Specificity: the true negative rate (1 – the false positive rate) that measures the proportion of actual negatives that are correctly identified.
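In code, both measures fall out of the confusion matrix; here is a small self-contained helper:

```python
import numpy as np

def sensitivity_specificity(y_true, y_pred):
    """True positive rate and true negative rate from 0-1 labels."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    return tp / (tp + fn), tn / (tn + fp)

# Toy check: 3 of 4 actual positives flagged, 2 of 4 actual negatives cleared.
print(sensitivity_specificity([1, 1, 1, 1, 0, 0, 0, 0],
                              [1, 1, 1, 0, 1, 1, 0, 0]))   # (0.75, 0.5)
```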

What this means is that the algorithm is not necessarily accurate and that Target is not necessarily aiming for accuracy in terms of the model’s predictive ability. Instead, Target chooses a point on this ROC curve by setting a cutoff that makes sense for its business model. If the cost of sending an ad to a non-pregnant shopper (the cost of a false positive) is low and the profit from true positives is high, Target would select a cutoff corresponding to a point on the curve with a high true positive rate as well as a high false positive rate (i.e., low specificity). This is what I experience.
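A sketch of that economic logic: sweep the cutoff and keep the value that maximizes expected profit. The dollar figures and score distributions below are invented, but they show how a cheap mailer and a valuable true positive push the chosen cutoff low, toward high sensitivity and low specificity.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic scores: 5% of shoppers are pregnant; their scores skew higher.
y = (rng.random(10_000) < 0.05).astype(int)
scores = np.clip(rng.normal(0.2 + 0.4 * y, 0.15), 0.0, 1.0)

MAILER_COST = 0.50       # invented cost of sending one coupon book
TRUE_POS_PROFIT = 200.0  # invented expected profit from one pregnant shopper

def expected_profit(cutoff):
    mailed = scores >= cutoff
    return TRUE_POS_PROFIT * np.sum(mailed & (y == 1)) - MAILER_COST * mailed.sum()

best = max(np.linspace(0.0, 1.0, 101), key=expected_profit)
print(best)  # a low cutoff: mail widely and tolerate many false positives
```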

Applications where the cost of a false positive is higher may lead to a cutoff with a lower false positive rate. I have rarely received baby formula in the mail, presumably because mailing formula comes at a much higher cost to those companies.

Another example that comes to mind is modeling sports injuries. The cost of a false positive could be high: resting a star player too much at the end of the season to stave off injury means the team could lose too many games and miss the playoffs. The cost of a false negative could also be high: not resting the player means they could suffer a season-ending injury, and the team could lose in the playoffs.


The economic impact of obesity on automobile fuel consumption

Back in 2005, I was a graduate student working with my advisor Professor Sheldon Jacobson at the University of Illinois. Hurricane Katrina damaged oil refineries in the Gulf of Mexico, causing production to drastically drop. Gas prices surged as supply fell.

Sheldon and I discussed fuel economy in the wake of Hurricane Katrina. He noted that one way to improve fuel economy is to remove all the junk out of a car, because lowering the weight inside a car improves its fuel economy on the micro level. He challenged me to quantify this small change in fuel economy. I was a curious and energetic graduate student, and I threw myself into answering this question for the next couple of weeks. During this time, Sheldon pointed out that as a country, people in the United States had gotten heavier. This additional body weight (about 25 pounds for both men and women between 1960 and 2003) was not insignificant. Given that Americans drive so much, this additional weight accounted for a lot of additional fuel usage.  I felt a little guilty because I had a baby whose car seat weighed about 25 pounds (it was one of the safest on the market) and often left heavy bags of cat litter in my car trunk if I was too tired to carry them inside.

After we crunched the numbers, the results were astounding: if we put the people of 1960 into the automobiles of ~2005, approximately 938 million gallons of fuel would have been saved each year by transporting lower weight passengers, which corresponds to approximately 0.7% of the nation’s annual fuel consumption.

Our paper, entitled “The Economic Impact of Obesity on Automobile Fuel Consumption,” was published in The Engineering Economist. You can read the paper here.

Paper Abstract: Obesity has become a major public health problem in the United States. There are numerous health implications and risks associated with obesity. One socio-economic implication of obesity is that it reduces passenger vehicle fuel economy (i.e., the miles per gallon achieved by automobiles, which include cars and light trucks driven for noncommercial purposes). This article quantifies the amount of additional fuel consumed (annually) in the United States by automobiles that is attributable to higher average passenger (driver and non-driver) weights, during the period from 1960 to 2002. The analysis uses existing driving data in conjunction with historical weight data. The results indicate that, since 1988, no less than 272 million additional gallons of fuel are consumed annually due to average passenger weight increases. This number grows to approximately 938 million gallons of fuel when measured from 1960, which corresponds to approximately 0.7% of the nation’s annual fuel consumption, or almost three days of fuel consumption by automobiles. Moreover, more than 39 million gallons of fuel are estimated to be used annually for each additional pound of average passenger weight.
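The abstract’s headline numbers hang together, which a few lines of arithmetic confirm:

```python
extra_gallons = 938e6    # additional gallons per year relative to 1960 weights
per_pound = 39e6         # additional gallons per pound of average passenger weight
share_of_annual = 0.007  # 938 million gallons as a share of annual automobile fuel use

print(round(extra_gallons / per_pound, 1))  # 24.1 pounds, consistent with the ~25 lb gain
print(round(share_of_annual * 365, 1))      # 2.6 days, i.e., "almost three days" of fuel
```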

~

The paper was published in December 2006. A press release went out in October, a month into my first tenure track position. Sheldon and I anticipated that the paper would receive a lot of media attention. In preparation, we received training from the excellent University of Illinois media team. They helped us develop a series of takeaways and talking points about the paper and taught me how to stay on point during interviews. Sheldon was a lot more eager to talk to the media than I was. He agreed to do the heavy lifting when it came to media engagement.

When the press release came out, the paper received a lot of attention. I received phone calls at home at the crack of dawn asking for interviews for radio programs. I was on television. I was quoted in newspapers. I was on the radio. Journalists asked me how much I weighed during interviews (really!), and one asked me if I “hated fat people.” The Associated Press wrote an article that was published in hundreds of newspapers. In all, our research findings were reported in more than 400 online and print newspapers and magazines and were featured on several national cable news shows and regional radio shows. My interview with WDBJ7, a CBS affiliate in Roanoke, Virginia, appeared on the evening news. It was scary, but overall it was a great experience. Amazingly, I saw references to this paper in the popular press years after the press release. For example, a CNBC quiz asked how much gas is consumed annually due to Americans’ weight gains since 1960. I got the right answer (938 million gallons of gas).

The media firestorm helped me become comfortable working with the media, despite my introversion. I could see the value of media attention to scientific topics such as analytics, and now I always embrace bringing engineering, operations research, and analytics to the public. I am much better at talking with journalists, and I’m happy to say that no one since has asked me about my weight.