Category Archives: Teaching materials

Presidential election forecasting: a case study

I am sharing several of the case studies I developed for my courses. This example is a spreadsheet model that forecasts outcomes of an election using data from the 2012 Presidential election.

Presidential Election Forecasting

There are a number of mathematical models for predicting who will win the Presidential Election. Many popular forecasting models use simulation to forecast the state-level outcomes based on state polls. The most sophisticated models (like 538) incorporate phenomena such as poll biases, economic data, and momentum. However, even the most sophisticated models are often built in spreadsheets.

For this case study, we will look at state-level poll data from the 2012 Presidential election when Barack Obama ran against Mitt Romney. The spreadsheet contains realistic polling numbers from before the election. Simulation is a useful tool for translating the uncertainty in the polls to potential election outcomes.  There are 538 electoral votes: whoever gets 270 or more votes wins.


The model rests on four simplifying assumptions:

  1. Everyone votes for one of two candidates (i.e., no third party candidates – every vote that is not for Obama is for Romney).
  2. The proportion of votes that go to a candidate is normally distributed according to a known mean and standard deviation in every state. We will track Obama’s proportion of the votes since he was the incumbent in 2012.
  3. Whoever gets more than 50% of the votes in a state wins all of the state’s electoral votes. [Note: most but not all states do this].
  4. The votes cast in each state are independent, i.e., the outcome in one state does not affect the outcomes in another.

It is well known that the polls are biased, and that these biases are correlated. This means that there is dependence between state outcomes (lifting assumption #4 above). Let’s assume four of the key swing states have polling bias (Florida, Pennsylvania, Virginia, Wisconsin). A bias here means that the poll average for Obama is too high. Let’s consider biases of 0%, 0.5%, 1%, 1.5%, and 2%. For example, the mean fraction of votes for Obama in Wisconsin is 52%. This mean would change to 50% – 52% depending on the amount of bias.

Using the spreadsheet, simulate the proportion of votes in each state that are for Obama for these 5 scenarios. Run 200 iterations for each simulation. For each iteration, determine the number of electoral votes in each state that go to Obama and Romney and who won.


For each iteration, record two outputs:

  1. The total number of electoral votes for Obama
  2. An indicator variable to capture whether Obama won the election.
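
The simulation described above can also be sketched outside a spreadsheet. The Python below is a minimal illustration, not the spreadsheet model itself: the state means, standard deviations, and the split of "safe" electoral votes are hypothetical placeholders, and the function name `simulate` is made up for this example. It draws Obama's vote share in each contested state from a normal distribution, subtracts the bias in the four biased swing states, and records the two outputs for each iteration.

```python
import random

# Hypothetical inputs for illustration only (not the spreadsheet's numbers):
# state -> (electoral votes, poll mean for Obama, standard deviation)
STATES = {
    "Florida": (29, 0.500, 0.025),
    "Pennsylvania": (20, 0.525, 0.025),
    "Virginia": (13, 0.510, 0.025),
    "Wisconsin": (10, 0.520, 0.025),
    "Ohio": (18, 0.515, 0.025),
}
# Assumption: all other electoral votes are treated as already decided,
# chosen so that 237 + 211 + 90 contested votes = 538.
SAFE_OBAMA = 237
BIASED_STATES = {"Florida", "Pennsylvania", "Virginia", "Wisconsin"}


def simulate(bias, n_iter=200, seed=0):
    """Return (electoral vote totals for Obama, win indicators) per iteration."""
    rng = random.Random(seed)
    totals = []
    for _ in range(n_iter):
        votes = SAFE_OBAMA
        for name, (ev, mean, sd) in STATES.items():
            # A bias means the poll average for Obama is too high.
            shift = bias if name in BIASED_STATES else 0.0
            share = rng.gauss(mean - shift, sd)
            if share > 0.5:  # winner takes all of the state's votes
                votes += ev
        totals.append(votes)
    wins = [1 if t >= 270 else 0 for t in totals]
    return totals, wins


for bias in (0.0, 0.005, 0.01, 0.015, 0.02):
    totals, wins = simulate(bias)
    print(
        f"bias={bias:.3f} mean EV={sum(totals) / len(totals):.1f} "
        f"min={min(totals)} max={max(totals)} "
        f"P(win)={sum(wins) / len(wins):.2f}"
    )
```

With only 200 iterations the estimated win probability is noisy, so rerunning with a different seed gives a feel for the sampling error.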


(1) Create a figure showing the distribution of the total number of electoral votes that go to Obama. Report the probability that he gets 270 or more electoral votes.

(2) Paste the model outputs (the electoral vote average, min, max) and the probability that Obama wins for each of the five bias scenarios.

(3) What is the probability of a tie (exactly 269 votes)? 

Modeling questions to think about:

  1. Obama took 332 electoral votes compared to Romney’s 206. Do you think that this outcome was well-characterized in the model or was it an unexpected outcome?
  2. Look at the frequency plot of the number of electoral votes for Obama (choose any of the simulations). Why do some electoral vote totals like 307, 313, and 332 occur more frequently than the others?
  3. Why do you think a small bias in 4 states would disproportionately affect the election outcomes?
  4. How do you think the simplifying assumptions affected the model outputs?
  5. No model is perfect, but an imperfect model can still be useful. Do you think this simulation model was useful?

More reading from Punk Rock Operations Research:

How FiveThirtyEight’s forecasting model works:


  1. The assignment
  2. A shell spreadsheet with basic data to share with students
  3. A spreadsheet with the solutions

More teaching case studies

SIR models: A teaching case study to use in a course about probability models

This past summer, I created a few examples about COVID-19 to use in my course on probability models. I’ll post those materials here as I teach with them. Here is the first case study that introduces SIR models for modeling the spread of infectious disease. SIR models are widely used in epidemiology.

Infectious disease modeling: framing and modeling

Assume we have a constant population with N individuals. We can partition the population into three groups:

  1. Those who are susceptible to disease (S[n], i.e., not infected).
  2. Those who are infected (I[n]).
  3. Those who are recovered (R[n]).

We assume a discrete time model, where we are interested in how the numbers of susceptible, infected, and recovered individuals vary over time. Therefore, we start at time n=0 and index these values by n. The time between time n and n+1 could represent, say, a week.

A new strain of influenza or a novel coronavirus emerges. Susceptible individuals can become infected after exposure, and infected individuals can recover. Recovered individuals have immunity from reinfection.

New infections result from contact between susceptibles and infecteds, with contact rate beta/N, which represents the proportion of contacts an infected individual has. Infecteds recover at a rate (gamma) proportional to the number of infecteds, at which point they become recovered.

Question #1: Come up with an expression to relate N to S[n], I[n], and R[n].

Question #2: Develop recursive expressions for S[n+1] based on S[n] and perhaps other variables.

Question #3: Then, do the same for I[n+1] and R[n+1].

Question #4: What are the boundary conditions?

Question #5: How would you estimate the total number who become infected by time n? 

Discussion questions:

  1. What other diseases fit this model?
  2. What are some possible ways to reduce the infection rate?
  3. What are some possible ways to increase the recovery rate?
  4. How does a vaccine affect this model?
  5. There is an interruption in the production of the vaccine, and your state will only receive 20% of the vaccines that you need before influenza season begins. Vaccines will slowly be released after this level. What are some criteria we could use to decide how to distribute these vaccines? What else can you do?

The second part performs the computation in a spreadsheet. The assignment is here. We use the CDC 2004-5 data from a population of 157,759 samples taken from individuals with flu-like symptoms and 3 initial infections. Let n=0 represent the last week in September, the beginning of influenza season. Then, we compute these numbers in a spreadsheet to see how the disease may evolve. Next, we fit the model parameters (beta and gamma) to the collected data by minimizing the sum of squared errors (SSE). Finally, we assess the impact of a vaccine.
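
For readers who prefer code to a spreadsheet, here is a rough Python sketch of the same computation. It assumes the usual discrete-time SIR recursion (new infections beta*S[n]*I[n]/N, new recoveries gamma*I[n]), which may differ in detail from the linked solution. The function names, the coarse grid search, and the beta and gamma values are all hypothetical, and the fit assumes the observed series is the number infected each week.

```python
def simulate_sir(beta, gamma, n_pop, i0, n_weeks):
    """Iterate a standard discrete-time SIR recursion for n_weeks steps."""
    s, i, r = [float(n_pop - i0)], [float(i0)], [0.0]
    for _ in range(n_weeks):
        new_inf = beta * s[-1] * i[-1] / n_pop  # susceptibles who get infected
        new_rec = gamma * i[-1]                 # infecteds who recover
        s.append(s[-1] - new_inf)
        i.append(i[-1] + new_inf - new_rec)
        r.append(r[-1] + new_rec)
    return s, i, r


def sse(observed, predicted):
    """Sum of squared errors between two equally long series."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted))


def fit_grid(observed_infected, n_pop, i0, betas, gammas):
    """Pick the (beta, gamma) pair on a grid that minimizes the SSE."""
    n_weeks = len(observed_infected) - 1
    best = None
    for beta in betas:
        for gamma in gammas:
            _, infected, _ = simulate_sir(beta, gamma, n_pop, i0, n_weeks)
            err = sse(observed_infected, infected)
            if best is None or err < best[0]:
                best = (err, beta, gamma)
    return best


# Population size and initial infections come from the text above;
# beta and gamma are hypothetical values chosen only for illustration.
N_POP, I0 = 157_759, 3
s, i, r = simulate_sir(beta=0.5, gamma=0.3, n_pop=N_POP, i0=I0, n_weeks=60)
peak_week = max(range(len(i)), key=lambda n: i[n])
print(f"peak infections in week {peak_week}: {i[peak_week]:.0f}")
```

A spreadsheet solver plays the role of `fit_grid` in the assignment. The grid search is only a stand-in that keeps the sketch free of external libraries.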


  1. The assignment.
  2. The solution.
  3. The assignment for the computational part.
  4. A Google spreadsheet with the calculations (create a copy or download)

More examples

Pooled testing: a teaching case study to use in a course about probability models

This summer, I created a few examples about COVID-19 to use in my course on probability models. I’ll post those materials here as I teach with them. Here is the first example.

Pooled testing to expand testing capacity

In July 2020, many states struggled to process COVID-19 tests quickly, with some states taking more than a week to process tests. Many statisticians have proposed pooled testing to process tests more quickly and effectively expand testing capacity to up to four times the regular capacity. Pooled testing works when few tests come back positive.

Pooled testing came about in the 1940s, when government statisticians needed a more efficient way to screen World War II draftees for syphilis. “The Detection of Defective Members of Large Populations,” by R. Dorfman in 1943 contains a methodology for pooled testing.

Pooled testing works as follows:

  • Samples are pooled into groups of n, where each sample comes from one individual.
  • Pooled test results are either positive or negative. They come back positive if at least 1 of the n individual samples is positive.
  • For pooled tests that come back positive, tests are rerun individually with the unused portions of the original samples to see which individuals test positive, achieving the same results but faster. A total of n+1 tests are performed.
  • For pooled tests that come back negative, no further testing is needed. We conclude all individuals are negative. Only one test is performed in total, which reduces the overall number of tests.
  • When pooling is not used, one test per individual yields n tests for the group.

Consider a group of 40 asymptomatic individuals who are tested for COVID-19 in pooled groups of size n. Let g denote the number of groups tested, and let X capture the number of groups that test positive (a random variable). We assume that an individual tests positive for COVID-19 with probability q (New York data from July 2020).

  • Express g as a function of n.
  • Express X and its distribution based on g, n, and q.
  • Let the random variable T denote the total number of tests run. Derive an expression for T as a function of X as well as fixed parameters n and g.
  • Consider test groups of size n = 4, 5, 8, 10, 20. Which group size yields the fewest number of tests performed, on average? (Hint: Find E[T]).
  • How does your answer to the last question change if q = 0.02, 0.02, 0.075? (Note: Dane County had q = 0.02 and Wisconsin had q = 0.075 at the end of July 2020. At the time I wrote this in early October 2020, more than 20% of COVID tests were coming back positive in Wisconsin).
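
To check a hand calculation for the questions above, a short Python sketch can tabulate the expected number of tests. It assumes individuals test positive independently, so a pool of n is positive with probability 1 - (1 - q)^n and E[T] = g + n * g * (1 - (1 - q)^n). The function name `expected_tests` is made up for this example, and the q values are the two county and state rates quoted in the last question.

```python
def expected_tests(population, n, q):
    """Expected total tests when `population` people are pooled in groups of n.

    Assumes independent individuals, each positive with probability q.
    """
    if population % n != 0:
        raise ValueError("group size must divide the population evenly")
    g = population // n                # number of pooled tests
    p_pool_positive = 1 - (1 - q) ** n
    # Each positive pool triggers n individual retests.
    return g + n * g * p_pool_positive


for q in (0.02, 0.075):
    print(f"q = {q}")
    for n in (4, 5, 8, 10, 20):
        print(f"  n = {n:2d}: E[T] = {expected_tests(40, n, q):.2f}")
```

Testing everyone individually always takes 40 tests, so any group size with E[T] below 40 saves tests on average.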

You can read more in the New York Times article that inspired this case study.


  1. The assignment.
  2. The solution.
  3. A Google spreadsheet with the calculations (create a copy or download)

a multiobjective decision analysis model to find the best restaurant in Richmond

I taught multiobjective decision analysis (MODA) this semester. It is a lot of fun to teach. I always learn a lot when I teach it. One of the most enjoyable parts of the class (for me at least!) is to run a class project that we chip away at during class over the course of the semester. Our project is to find the best restaurant for us to celebrate at the end of the semester. “Best” here is relative to the people in the class and the .

The project is a great way to teach about the MODA process. The process not only includes the modeling, but also the craft of working with decision makers and iteratively improving the model. It’s useful for students to be exposed to the entire analysis process. I don’t do this in my other classes.

On the first day of class, we came up with our objectives hierarchy. I did this by passing out about five Post It notes to each student. They each wrote one criterion for selecting a restaurant on each Post It note. They stuck their Post It notes to the wall. Together, we regrouped and organized our criteria into an objectives hierarchy. Some of the objectives became “weed out criteria,” such as making sure that the restaurant could accommodate all of us and comply with dietary restrictions.

Our initial criteria were:

  1. Distance
  2. Quality of food
  3. Variety of food
  4. Service: Fast service
  5. Service: Waiting time for a table
  6. Service: Friendly service
  7. Atmosphere: Noise level
  8. Atmosphere: Cleanliness
  9. Cost

Our final criteria were as follows (from most to least important):

  1. Quality of food
  2. Cost (tie with #3)
  3. Distance
  4. Fast service (tie with #5)
  5. Noise level
  6. Cleanliness

We removed variety of food, waiting time, and friendly service because classroom discussions indicated that they weren’t important compared to the other criteria. Variety, for example, was less important if we were eating delicious food at an ethnic restaurant that had less “variety” (variety in quotes here, because it depends on how you measure it).

In the next few weeks, we worked on identifying how we would actually measure our criteria. Then, we came up with a list of our favorite restaurants. During this process, we removed objectives that no longer made sense.

We collaboratively scored each of the restaurants in each of the six categories by using a Google Docs spreadsheet.

  1. Quality of food = average score (1-5 scale)
  2. Cost (tie with #3) = cost of an entree, drink, tax, and tip
  3. Distance = distance from the class (in minutes walk/drive)
  4. Fast service (tie with #5) = three point scale based on fast service, OK service, or very slow service
  5. Noise level = four point scale based on ratings
  6. Cleanliness: based on the last inspection. Score = # minor violations + 4*# major violations.

A real challenge was to come up with:

  • the single dimensional value functions that translated each restaurant score for an objective into a value between 0 and 1.
  • the weights that balanced our preferences across objectives using swing weight thinking. FYI, we used an additive model.
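
As a rough illustration of how these two pieces fit together in an additive model, here is a small Python sketch. Everything in it is hypothetical: the linear value functions, the ranges, the weights, and the example scores are placeholders rather than the class's actual assessments. Only the cleanliness score formula comes from the list above.

```python
def linear_value(x, worst, best):
    """Map a raw score to [0, 1], with `worst` -> 0 and `best` -> 1.

    Works for "less is better" criteria too, since worst may exceed best.
    """
    v = (x - worst) / (best - worst)
    return min(1.0, max(0.0, v))


def cleanliness_score(minor, major):
    """Inspection score from the text: # minor + 4 * # major violations."""
    return minor + 4 * major


def additive_value(scores, ranges, weights):
    """Weighted sum of single dimensional values; weights should sum to 1."""
    return sum(
        weights[k] * linear_value(scores[k], *ranges[k]) for k in weights
    )


# Hypothetical (worst, best) ranges and weights, for illustration only.
RANGES = {
    "quality": (1, 5),        # average score on a 1-5 scale
    "cost": (30, 10),         # dollars; cheaper is better
    "distance": (26, 0),      # minutes; closer is better
    "cleanliness": (12, 0),   # violation score; lower is better
}
WEIGHTS = {"quality": 0.40, "cost": 0.25, "distance": 0.25, "cleanliness": 0.10}

example = {
    "quality": 4.2,
    "cost": 18,
    "distance": 10,
    "cleanliness": cleanliness_score(minor=2, major=1),
}
print(f"overall value: {additive_value(example, RANGES, WEIGHTS):.3f}")
```

In practice the value functions need not be linear, and the weights come from swing weighting rather than being typed in directly.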

I won’t elaborate on these parts of the process further. Ask me about these if you are interested.

When we finished our model, the “best” decision was to forego a restaurant and do a potluck instead. No one was happy with this. We examined why this happened. This was great: ending up with a bad solution was a great opportunity for learning. We concluded that we didn’t account for the hidden costs associated with a potluck. Namely, it would entail either making a trip to the grocery store or cooking, approximately a 30 minute penalty. We decided that this was equivalent to driving to a distant restaurant, a 26 minute drive in our model. It was also hard to evaluate cleanliness since the state does not inspect classrooms like it does restaurants. But since cleanliness didn’t account for much of our decision, we decided not to make adjustments there.

The final model is in a Google Docs spreadsheet.

We performed a sensitivity analysis on all of the weights. Regardless of what they were, most of the restaurants were dominated, meaning that they would not be optimal no matter what the weights were. The sensitivity analysis is not in the Google Docs spreadsheet, since we downloaded the document and performed the analysis on our own. I show the sensitivity with respect to the weight for quality below. The base weight for quality is 0.3617. When the weight is zero and quality is not important, Chipotle would have been our most preferred restaurant. The Local would be preferred only across a tiny range.
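
A one-way sensitivity analysis like this can be sketched in a few lines of Python. The sketch assumes the common convention of sweeping one weight from 0 to 1 while rescaling the other weights so that they keep their base ratios and everything still sums to 1. The restaurants "A", "B", and "C", their values, and the non-quality weights are hypothetical. Only the base quality weight of 0.3617 comes from our model.

```python
def reweight(base_weights, key, new_weight):
    """Set weights[key] = new_weight and rescale the rest proportionally."""
    rest = 1.0 - base_weights[key]
    if rest <= 0:
        raise ValueError("the other weights must have positive total weight")
    scale = (1.0 - new_weight) / rest
    return {
        k: (new_weight if k == key else w * scale)
        for k, w in base_weights.items()
    }


def best_option(values, weights):
    """Return the option with the highest additive value."""
    totals = {
        name: sum(weights[k] * v[k] for k in weights)
        for name, v in values.items()
    }
    return max(totals, key=totals.get)


# Base weights: quality is from the text, the others are hypothetical.
BASE = {"quality": 0.3617, "cost": 0.35, "distance": 0.2883}
# Hypothetical single dimensional values (already scaled to [0, 1]).
VALUES = {
    "A": {"quality": 0.9, "cost": 0.3, "distance": 0.4},
    "B": {"quality": 0.4, "cost": 0.9, "distance": 0.8},
    "C": {"quality": 0.5, "cost": 0.5, "distance": 0.5},
}

for step in range(11):
    w = step / 10
    choice = best_option(VALUES, reweight(BASE, "quality", w))
    print(f"quality weight {w:.1f}: best option {choice}")
```

An option that never appears in the output for any weight, like "C" in this made-up data, is one that the sweep never selects, which is the behavior we saw for most of the restaurants.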

We celebrated in Ipanema, a semi-vegetarian restaurant in Richmond. I think our model came up with a great restaurant. We all enjoyed a nice meal together. Interestingly, Mamma Zu scored almost identically to Ipanema (see the figure below).

I cannot claim credit for this fun class project. I shamelessly stole this idea from Dr. Don Buckshaw, who uses it in MODA short courses. We use Craig Kirkwood’s Strategic Decision Making as the textbook for the course. I also recommend Ralph Keeney’s Value Focused Thinking and John Hammond’s Smart Choices.

How do you choose a restaurant?

Sensitivity with respect to the weight for quality (0.3617 in the base case).

how to find a football team’s best mix of pass and run plays using game theory

This is my third and final post in my series of football analytics slidecasts. After this one, just enjoy the Super Bowl. My first two posts are here and here.

This slidecast illustrates how to find

  • the offensive team’s best mix of run and pass plays, and
  • the defensive team’s best mix of run and pass defenses.

The best mix is, of course, a mixed strategy. We use game theory to identify the best mix (a Nash equilibrium) for a simultaneous, complete information, zero-sum game.
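
For a two-by-two game like this one, the equilibrium mix has a closed form when no pure-strategy saddle point exists. The Python sketch below is a minimal illustration with a hypothetical payoff matrix (expected yards gained by the offense), and the function name `solve_2x2_zero_sum` is made up for this example. The slidecast's own numbers are not reproduced here.

```python
def solve_2x2_zero_sum(a, b, c, d):
    """Solve a 2x2 zero-sum game with payoffs to the row player:

                    defend run   defend pass
        run             a            b
        pass            c            d

    Returns (p, q, value): the offense runs with probability p, the
    defense plays run defense with probability q, and value is the
    expected payoff at equilibrium.
    """
    maximin = max(min(a, b), min(c, d))
    minimax = min(max(a, c), max(b, d))
    if maximin == minimax:
        raise ValueError("pure-strategy saddle point; no mixing is needed")
    denom = a - b - c + d
    p = (d - c) / denom   # makes the defense indifferent between columns
    q = (d - b) / denom   # makes the offense indifferent between rows
    value = (a * d - b * c) / denom
    return p, q, value


# Hypothetical expected yards gained by the offense.
p, q, value = solve_2x2_zero_sum(a=-5, b=5, c=10, d=0)
print(f"offense runs with probability {p:.2f}")
print(f"defense plays run defense with probability {q:.2f}")
print(f"expected yards per play: {value:.2f}")
```

Each side mixes so that the other side is indifferent between its two options, which is why neither team can gain by deviating.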

What is a football team’s best mix of running and passing plays?


When is a two point conversion better than an extra point? A dynamic programming approach.

This post continues my series of slidecasts about football. My first slidecast is here.

Today’s topic addresses when a two point conversion is better than an extra point after a touchdown. As you may guess, it is best for a team to go for two when they are down by eight. You can see other scenarios when it is best to go for two, based on the point differential and the remaining number of possessions in the game.

This presentation is based on Wayne Winston’s book Mathletics, which is a fantastic introduction to sports analytics.

Related post:

should a football team go for it on fourth down?

With the Super Bowl coming up, I created three sports analytics slidecasts for analyzing football strategies. I will post one per day here on the blog.

The first slidecast deals with the decision of whether a football team should go for it on fourth down or punt. The presentation is adapted from the book Scorecasting by Tobias Moskowitz and Jon Wertheim. Wayne Winston blogged about this, and his blog post went viral. Here is another look at this issue.

a few more thoughts on operations research and coffee: multiobjective decision analysis

I recently wrote about how I used OR to decide how to get my coffee fix in the morning. Some of you suggested that I perform MODA to consider the tradeoffs between cost, taste, and convenience. I agree!

My previous post contains the cost, taste, and convenience scores for the six coffee options:

  1. Home coffee (made at home, brought to work in a mug)
  2. Department coffee (purchased from the coffee made in the department coffee maker)
  3. Coffee shop (bought from a local coffee shop on the way to work)
  4. Dunkin Donuts (my guilty pleasure, a little out of my way)
  5. Office coffee (made in my spare coffee maker in my office)
  6. Keurig coffee (made in my office)


  • The preferential independence assumption was reasonable here.
  • I used an additive value function, since it was reasonable in this situation.
  • I used an exponential shape to assess the single dimensional value functions. My value function was linear in taste. I had a concave shape factor for cost (1.83) and a convex shape factor for convenience (-6.8). I really hate waiting.
  • My swing weights are 0.5 for cost, 0.25 for convenience, and 0.25 for taste. My rationale here is that cost adds up over the year, and as a result, it is twice as important as convenience and taste.  Convenience and taste seem about equally important to me.

Based on this, the MODA values (scaled between zero and one) lead to this ordering of my six coffee options:

  1. Home coffee (0.77)
  2. Office coffee (0.71)
  3. Department coffee (0.67)
  4. Keurig coffee (0.67)
  5. Coffee shop (0.32)
  6. Dunkin Donuts (0.25)

It looks like I naturally gravitated to my “optimal” decisions of making coffee at home or in my office. A sensitivity analysis on the weight for cost leads to the following graph. It shows that buying coffee at a coffee shop or at Dunkin Donuts would be suboptimal across all weights (so would buying the department coffee). If I care a little less about cost, buying a Keurig coffee maker for my office would become the best option.


Coffee MODA analysis. We want to maximize the value, so the line that is highest is my “optimal” choice.

If I change the weights so that taste counts the most (with a weight of 0.5) and cost and convenience have weights of 0.25, then the MODA values (scaled between zero and one) lead to this ordering of my six coffee options:

  1. Home coffee (0.73)
  2. Keurig coffee (0.69)
  3. Dunkin Donuts (0.5)
  4. Office coffee (0.49)
  5. Coffee shop (0.49)
  6. Department coffee (0.46)

The sensitivity of the results to the weight for taste is captured in the following figure. In both cases, it looks like continuing to make coffee at home is my best bet.

If you’re interested in working this example, check out my spreadsheet for this on my new “Files” page under the teaching materials heading. I’ll try to post some of my teaching materials, code, and data on this blog as I go.