Sports scheduling meets business analytics: why scheduling Major League Baseball is really hard

Mike Trick of Carnegie Mellon University came to the Industrial and Systems Engineering department at UW-Madison to give a colloquium entitled “Sports scheduling meets business analytics.”

How hard is it to schedule 162-game seasons for the 30 MLB teams? It’s really, really hard.

Mike Trick stepped us through what makes for a “good” schedule. Schedules obey many constraints, some of which include:

  • Half of each team’s games are home, half are away.
  • Teams cannot have more than three consecutive series away or at home.
  • Teams cannot have three home weekends in a row.
  • Teams in the same division play six series: two early on, two in the middle of the season, and two late, with one home and one away each time.
  • Teams play all other teams in at least two series.
  • Schedules should have a good flow, with about one week home followed by one week away.
  • Teams that fly from the west coast to the east coast have a day off in between series.

Teams can make additional scheduling requests. Every team, for example, asks for a home game on Father’s Day, and this can only be achieved for half of the teams in any given year. Mike addresses this by ensuring that no team is away on Father’s Day more than two years in a row.

Mike illustrated how hard it is to create a feasible schedule from scratch. You cannot complete a feasible schedule by trying something intuitive like scheduling the weekends first and filling out the rest of the schedule later; that approach leads to infeasible schedules 99% of the time. One of the challenges is that integer programming algorithms do not quickly identify when a model is infeasible and instead branch and bound for a long while.

Additionally, it is equally hard to change a small piece of a feasible schedule based on a new requirement and easily get another feasible schedule. For example, let’s say the pope decides to visit the United States and wants to use the baseball stadium on a day scheduled for a game. You cannot simply swap that game out with another. Changing the schedule to free up the stadium on that one day leads to a ripple of changes across the entire schedule for the other teams, because changing that one game affects the visiting team’s schedule and leads to violations of the above constraints (e.g., half of each team’s games are at home, etc.). This led to Mike’s development of a large neighborhood search algorithm that efficiently reschedules large parts of the schedule (say, a month) during the schedule generation process.
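
The destroy-and-repair loop at the heart of large neighborhood search is easy to sketch. Below is a minimal toy version in Python: the schedule is just a home/away string for one team, the cost counts flow violations (consecutive days with the same status), and the repair step is a greedy fill. Everything here is a stand-in for illustration; Mike’s actual implementation re-optimizes the freed portion of the schedule with an integer program.

```python
import random

def cost(schedule):
    # Flow proxy: penalize consecutive days with the same home/away status.
    return sum(a == b for a, b in zip(schedule, schedule[1:]))

def destroy(schedule, rng, width=30):
    # Free a month-long window of the schedule, chosen at random.
    lo = rng.randrange(0, len(schedule) - width)
    return schedule, lo, lo + width

def repair(partial):
    # Rebuild the freed window greedily; a real repair step would
    # re-solve a small integer program over these days instead.
    schedule, lo, hi = partial
    new = schedule[:]
    for d in range(lo, hi):
        prev = new[d - 1] if d > 0 else "A"
        new[d] = "A" if prev == "H" else "H"
    return new

def large_neighborhood_search(schedule, iters=200, seed=0):
    rng = random.Random(seed)
    best, best_cost = schedule, cost(schedule)
    for _ in range(iters):
        candidate = repair(destroy(best, rng))
        if cost(candidate) < best_cost:
            best, best_cost = candidate, cost(candidate)
    return best

rng = random.Random(1)
season = [rng.choice("HA") for _ in range(162)]
improved = large_neighborhood_search(season)
print(cost(season), "->", cost(improved))
```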

Mike found that how he structured his integer programming models made a big difference. He did not use the standard approach to defining variables. Instead, he borrowed an idea from branch and price and embedded more structure in the variables, which ultimately introduced many more variables (about 6 million) but allowed commercial integer programming solvers to tackle the problem more efficiently. The added structure also let him embed objectives such as travel costs directly in the variables.
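
To make that variable structure concrete, here is a toy sketch of the pattern-based idea (my illustration, not Mike’s actual formulation): instead of one variable per team-opponent-day, enumerate feasible home/away patterns, price a travel-like cost into each pattern, and attach one binary variable per team-pattern pair. The model below uses the open-source PuLP library on an invented four-team, six-slot instance.

```python
import itertools
import pulp

teams = ["A", "B", "C", "D"]
slots = range(6)

def feasible_patterns():
    # Enumerate home/away strings with three home slots, three away
    # slots, and no three consecutive slots with the same status.
    for bits in itertools.product("HA", repeat=6):
        p = "".join(bits)
        if p.count("H") == 3 and "HHH" not in p and "AAA" not in p:
            yield p

patterns = list(feasible_patterns())
# Cost priced into each pattern: the number of home/away switches,
# a crude stand-in for the travel cost a real pattern would carry.
switches = {p: sum(a != b for a, b in zip(p, p[1:])) for p in patterns}

prob = pulp.LpProblem("pattern_selection", pulp.LpMinimize)
x = {(t, p): pulp.LpVariable(f"x_{t}_{p}", cat="Binary")
     for t in teams for p in patterns}

prob += pulp.lpSum(switches[p] * x[t, p] for t in teams for p in patterns)

for t in teams:  # each team commits to exactly one pattern
    prob += pulp.lpSum(x[t, p] for p in patterns) == 1

for s in slots:  # every slot needs exactly half the teams at home
    prob += pulp.lpSum(x[t, p] for t in teams for p in patterns
                       if p[s] == "H") == len(teams) // 2

prob.solve(pulp.PULP_CBC_CMD(msg=False))
for t in teams:
    print(t, next(p for p in patterns if x[t, p].value() > 0.5))
```

In a full branch-and-price approach, the patterns would be generated on the fly by a pricing problem rather than enumerated up front.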

Mike noted that in most real-world problems there is no single natural objective function. MLB schedules are a function of travel distance and “flow,” where flow reflects the goal of alternating home and away weeks. The travel piece of the objective reflects the distance teams travel, but he cannot require each team to travel the same amount. Seattle travels a minimum of 48,000 miles per season no matter the schedule, because Seattle is far away from most other cities. Requiring the other teams to also travel 48,000 miles per season leads to schedules where teams needlessly travel from coast to coast on adjacent series just to equal Seattle’s mileage. That is bad.
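
As a concrete illustration of the travel piece (my sketch, with a made-up road trip and approximate coordinates), the travel cost of a schedule just sums great-circle distances over each team’s sequence of series locations:

```python
import math

def haversine_miles(a, b):
    # Great-circle distance between two (lat, lon) points, in miles.
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 3959 * math.asin(math.sqrt(h))

parks = {"Seattle": (47.59, -122.33), "Boston": (42.35, -71.06),
         "Chicago": (41.95, -87.66), "Oakland": (37.75, -122.20)}

def trip_miles(stops):
    return sum(haversine_miles(parks[a], parks[b])
               for a, b in zip(stops, stops[1:]))

# A coast-to-coast zigzag racks up miles quickly for a West Coast team.
print(round(trip_miles(["Seattle", "Boston", "Oakland", "Chicago", "Seattle"])))
```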

Mike ultimately included revenue in his objective, where revenue reflects attendance. He used linear regression to model attendance. He acknowledged that this is a weakness, because attendance does not equal profit. For example, teams can sell out afternoon games when they discount ticket prices: children fill the stands but do not purchase beer at the stadium, so a full house does not necessarily generate the most revenue.
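
The attendance model might look something like the ordinary-least-squares sketch below. Everything here (the features, the coefficients, the synthetic data) is invented for illustration; the post does not describe Mike’s actual regression.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
weekend = rng.integers(0, 2, n)      # 1 if a weekend game
rival = rng.integers(0, 2, n)        # 1 if a divisional rival visits
temp = rng.normal(70, 10, n)         # game-time temperature (F)

# Synthetic "true" attendance with noise, for demonstration only.
attendance = (20000 + 8000 * weekend + 5000 * rival + 50 * temp
              + rng.normal(0, 2000, n))

X = np.column_stack([np.ones(n), weekend, rival, temp])
beta, *_ = np.linalg.lstsq(X, attendance, rcond=None)
print(dict(zip(["intercept", "weekend", "rival", "temp"], beta.round(1))))
```

Predicted attendance for each candidate game could then be priced into the schedule’s objective alongside travel.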

Mike summarized the keys to his success, which included:

  1. Computing power improved over time.
  2. Commercial solvers improved.
  3. He solved the right problem.
  4. He structured the problem in an effective way.
  5. He identified a way to get quick solutions for part of the schedule (useful for when something came up and a game had to change).
  6. He developed a large neighborhood search algorithm that efficiently retools large parts of the schedule.

Three years ago I wrote a blog post about Mike Trick’s keynote talk on Major League Baseball (MLB) scheduling at the German Operations Research Conference (blog post here). That post contains some background information.

 


risk analysis and extreme uncertainty in Beijing

I attended the International Conference on Risk Analysis, Decision Analysis, and Security at Tsinghua University in Beijing on July 21-23, 2017. The conference was organized by Mavis Chen and Jun Zhuang in honor of my UW-Madison Industrial and Systems Engineering colleague Vicki Bier. The conference was attended by Vicki Bier’s collaborators and former students.

I enjoyed listening to Vicki’s keynote talk about her career in risk analysis and extreme uncertainty. Vicki talked about drawing conclusions with a sample size of one (or fewer!). In her career, Vicki has studied a variety of applications in low-probability, high consequence events such as nuclear power and security, terrorism, natural disasters, and pandemic preparedness. She stressed the importance of math modeling in applications in which the phenomenon you are studying hasn’t happened yet. In fact, you never want these phenomena to happen. Vicki told us, “I am a decision analyst by religion,” meaning that decision analysis is the lens through which she views the world and how she first starts thinking about problems, always recognizing that other methods may in the end provide the right tools for the problem.

Vicki has collaborated with many people over the years, and she shared several stories about her collaborations with her former students. I enjoyed hearing the stories about how her students challenged her and pushed her research in new directions. For example, Vicki told us, “Jun Zhuang made me confront my lifelong distaste for dynamic programming.” Vicki ended her keynote by outlining her future work, noting that she is not yet ready to retire.

Several conference attendees took a field trip to the Great Wall of China, the tomb of the thirteenth emperor of the Ming dynasty (the only Ming tomb that has been excavated), and the Summer Palace. Many thanks to Mavis Chen and Jun Zhuang for their outstanding organization of the conference!

Pictures from the conference and its side trips are below.

At the #greatwall outside of #Beijing #china

At #Tsinghua university in #beijing #China

Outside of #Tiananmen Square in #Beijing #China from Madison #OnWisconsin

More #gorgeous #lotus flowers in #Tsinghua university in #Beijing #China

Among the #lotus flowers in #Tsinghua university in #Beijing #China

At the #SummerPalace in #beijing #China enjoying #lotus flowers 🌺

#peace signs at #Tiananmen #Square in #Beijing #China

Outside the 13th #ming #dynasty #tomb in #beijing #china

I ate a #popsicle made out of #greenbeans in #Beijing #China and it was #delicious


optimization for medicine: past, present, and future

I am still at the INFORMS Healthcare Conference in Rotterdam. Brian Denton gave a plenary talk entitled “Optimization for medicine: past, present, and future,” where he made the case for applying more optimization to healthcare problems. He started his talk by presenting the number of papers in PubMed that apply operations research methodologies to healthcare problems, and he noted that optimization methodologies were the least used (but not the least valuable!).

Denton introduced several optimization problems from the 1960s that applied optimization to healthcare, including linear programming for optimizing chemotherapy and nonlinear optimization for optimizing dialysis decisions (e.g., when to change the bath). One dared to solve a linear program with 72 variables! (To be fair, that was tough using 1968 computers.) It’s always interesting to learn where operations research gained traction early on. I once learned that the US Forest Service has been using linear programming models to plan long-range harvesting and tree rotations since the 1960s (see my post here).

Denton highlighted some current optimization applications in healthcare, including Timothy Chan’s work in inverse optimization, Sommer Gentry’s work on matching problems for optimizing kidney exchanges, and his own work on Markov decision processes (MDPs) for optimizing treatment policies for type II diabetes. Denton discussed the importance of personalized medicine, since physicians may adopt many different treatment protocols to treat their patients without nearly enough guidance. “No one can ever agree what to do in medicine,” Denton told us and noted that despite so much disagreement among physicians about treatment options, we end up with non-personalized, one-size-fits-all treatment recommendations. He has been using robust MDPs to identify personalized type II diabetes treatment protocols in an adversarial setting, where the adversary chooses the transition probabilities in an uncertainty set subject to a budget.
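
To sketch the robust MDP idea (a minimal toy, not Denton’s model): the adversary may move transition probability mass, up to an L1 budget, away from the nominal estimates and toward the states that are worst for the decision maker, and robust value iteration evaluates each action against that worst case. The two-state, two-action MDP below is invented for illustration.

```python
import numpy as np

def worst_case(p0, v, budget):
    # Adversary's inner problem: minimize expected value over the L1 ball
    # ||p - p0||_1 <= budget intersected with the simplex. The optimum
    # moves up to budget/2 mass from the best states to the worst state.
    p = p0.copy()
    sink = int(np.argmin(v))
    room = budget / 2
    for s in np.argsort(v)[::-1]:
        if s == sink or room <= 0:
            continue
        moved = min(p[s], room)
        p[s] -= moved
        p[sink] += moved
        room -= moved
    return p @ v

# Invented rewards r[s, a] and nominal transitions P[s, a] (rows sum to 1).
r = np.array([[1.0, 0.5],
              [0.0, 0.8]])
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.1, 0.9]]])
gamma, budget = 0.95, 0.2

v = np.zeros(2)
for _ in range(500):  # robust value iteration
    q = np.array([[r[s, a] + gamma * worst_case(P[s, a], v, budget)
                   for a in range(2)] for s in range(2)])
    v = q.max(axis=1)

print("robust values:", v.round(3), "robust policy:", q.argmax(axis=1))
```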

Denton ended his talk with several opportunities for research, including how to make interrelated sequential decisions over time. Some research has been done in this area, but much of it assumes that we know what will happen downstream, which is then used to inform the decisions we need to make now. He also argued that medical devices provide many challenging optimal control problems for delivering dosage, such as an artificial pancreas that delivers insulin. And finally, he mentioned that most research considers a single disease and medical specialty. The next challenge is optimizing across multiple medical “silos” and multiple types of medical specialties, with perhaps a fixed “budget.”

The INFORMS Healthcare conference was really great and was filled with high-quality talks. It was a nice mixture of theory and application, with plenty of discussion about how healthcare systems work in international settings.


I’m ending my post with a couple of pictures from Rotterdam.



the evolution of aviation security

You can listen to me talk about the evolution of aviation security on Wisconsin Public Radio. Norman Gilliland interviewed me for an hour on the program “University in the Air” that aired on July 30, 2017. It was a lot of fun to chat about aviation security for an hour.

I recorded the program just before I left for an international trip. On my trip, I went through security at four airports on four continents (Chicago O’Hare, Amsterdam, Beijing, and Cairo), and I closely followed the different procedures at each airport. It was interesting to see how different countries and airports tried to achieve similar goals in different ways. Aviation security will continue to evolve and change and will certainly be different a year or two from now. I’ll continue to blog about the evolution of aviation security 🙂



Data analytics for personalized healthcare

I am at the INFORMS Healthcare Conference in Rotterdam. Dimitris Bertsimas at MIT delivered the opening plenary entitled “Personalized Medicine: A Vision for Research and Education.” He talked about research in operations research, healthcare analytics, and opportunities for analytics education. It was a great talk, where Bertsimas discussed how analytical methods could and should be used for making personalized medical decisions. He was frank and honest about some of the mistakes he made along the way, and those confessions were the best parts of the talk. I captured the talk outline in a picture.

Bertsimas claimed that data is often an afterthought in many models. I agree. His main takeaway, which generated a lot of questions, dealt with model transparency. Bertsimas stressed the need to make models transparent so that they can be adopted by physicians and healthcare service providers. He warned that models will be “dead on arrival” if they are not transparent. However, transparency can be a challenge when using many machine learning methodologies such as neural networks. He confessed that he learned the “hard way” that transparency is far more important than accuracy.
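
As a small illustration of what “transparent” can mean in practice (my example, not one from the talk): a shallow decision tree prints as a handful of rules a clinician can audit, something a neural network cannot offer.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_breast_cancer()  # stand-in clinical dataset for illustration
tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(data.data, data.target)

# The entire fitted model prints as a few human-readable rules.
print(export_text(tree, feature_names=list(data.feature_names)))
```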

Side note: transparency is not just a sticking point with physicians. The New York City Police Department’s Domain Awareness System was a 2016 Edelman finalist. Police officers also demanded model transparency. This limited the kinds of analytics that could be used within the tool, but the 30,000 police officers bought into the transparency, used the tool, and kept New York City safer.

Have you ever been required to sacrifice accuracy for transparency?


Roller derby names inspired by operations research

I’ve been reading the graphic novel Roller Girl by Victoria Jamieson with my nine-year-old daughter. It’s great! My daughter reads while I listen and fantasize about optimization-inspired roller derby names. Here is what I’ve come up with so far.

Optimization-inspired roller derby names I’m considering:

Convex Hell
Facility Laceration
Benders Deconstruction
Carnage Generation
Linear Aggression (or Logistic Aggression!)
Branch & Bomb
… or maybe Branch & Kill or Branch & Punish
Out-of-KILLter
Quadratic Assault Problem
Sublinear Rage of Convergence
Cutting Plane (this is probably OK as is)

I don’t roller derby, but I’d like to. What would you add?


evaluating systems according to their inputs vs. outputs

I am an avid runner and I am often asked how many miles I run in my shoes before getting a new pair. I don’t keep track of this because (a) it takes too much time and effort to keep a running total of mileage and (b) I prefer to evaluate the outputs by examining the wear on the shoe soles.

Rules of thumb suggest replacing running shoes every 500 miles or less. These rules of thumb make assumptions about the relationship between the inputs (mileage) and outputs (shoe wear). Running style affects how quickly a shoe wears out, so your mileage may vary. I am a hard-core heel striker with high arches, so I accumulate wear more quickly than the average runner. A back-of-the-envelope calculation suggests that I run about 400-500 miles on each pair of running shoes before replacing them. Perhaps I wait too long to replace my shoes.

I prefer to assess shoe wear directly instead of approximating it with my mileage. Ultimately, runners should replace their shoes when the shoes no longer give proper support, regardless of the mileage. Using mileage as a proxy for wear can work when evaluating inputs is easier than evaluating outputs. Some runners find that it is easy to track their mileage if they closely follow training plans or record their mileage using an app.

Alternative rules of thumb based on outputs are often less quantitative than those based on inputs. For example, Runner’s World recommends that runners “go by feel” and replace shoes when they no longer seem to provide enough support and cushioning. I also look for wear on the soles and tears in the fabric as signs that the shoes have had enough. Going by mileage is more clear-cut and involves less interpretation, which many runners seem to prefer.

Replacing running shoes is not the only time I prefer evaluating outputs to inputs. I breastfed all three of my children. Women who formula-fed their children would often ask me how I knew whether my babies were getting enough milk, since they could measure the formula they gave their babies (the inputs). I could not directly measure how much milk my babies were drinking, but the outputs helped me confirm that the babies were feeding enough. The outputs in this case were diaper changes, and they were easy to measure. I started keeping an eye on the outputs during my hospital stays after the deliveries. I continued to evaluate the outputs until I returned to work and started to pump, when I could directly measure how much milk I pumped and my babies drank.

For more reading on inputs and outputs, I recommend John D. Cook’s blog post about evaluating people in hierarchical organizations by their inputs or outputs.

When do you evaluate a system according to its inputs or its outputs? How do you decide when to replace your running shoes?

I should have replaced these shoes a bit earlier