## PhD development seminar: getting started with research

I am teaching a PhD development seminar for first-year PhD students in industrial engineering and related disciplines. The purpose of this course is to prepare students for dissertation research in industrial and systems engineering. The course helps set expectations, introduces campus resources to students, and creates a cohort of students who can connect with their peers.

I am creating a series of blog posts featuring some of the classes from the semester. Those, along with previous PhD-related posts, are tagged with the “PhD support” tag.


## PhD development seminar: first steps in writing

I am teaching a PhD development seminar for first-year PhD students in industrial engineering and related disciplines. The purpose of this course is to prepare students for dissertation research in industrial and systems engineering. The course helps set expectations, introduces campus resources to students, and creates a cohort of students who can connect with their peers.

I am creating a series of blog posts featuring some of the classes from the semester. Those, along with previous PhD-related posts, are tagged with the “PhD support” tag.


## Pooled testing: a teaching case study to use in a course about probability models

This summer, I created a few examples about COVID-19 to use in my course on probability models. I’ll post those materials here as I teach with them. Here is the first example.

## Pooled testing to expand testing capacity

In July 2020, many states struggled to process COVID-19 tests quickly, with some states taking more than a week to process tests. Many statisticians have proposed pooled testing to process tests more quickly and effectively expand testing capacity to up to four times the regular capacity. Pooled testing works when few tests come back positive.

Pooled testing came about in the 1940s, when government statisticians needed a more efficient way to screen World War II draftees for syphilis. “The Detection of Defective Members of Large Populations,” by R. Dorfman in 1943 contains a methodology for pooled testing.

Pooled testing works as follows:

• Samples are grouped into pools of n, where each sample comes from one individual.
• Pooled test results are either positive or negative. A pool comes back positive if at least 1 of the n individual samples is positive.
• For pools that come back positive, tests are rerun individually with the unused portions of the original samples to see which individuals test positive, which yields the same results as testing everyone individually. A total of n+1 tests are performed.
• For pools that come back negative, no further testing is needed. We conclude all individuals are negative. Only one test is performed, which reduces the total number of tests.
• When pooling is not used, one test per individual yields n tests for the group.

Consider a group of 40 asymptomatic individuals who are tested for COVID-19 in pooled groups of size n. Let g denote the number of groups tested, and let X capture the number of groups that test positive (a random variable). We assume that an individual tests positive for COVID-19 with probability q (New York data from July 2020).

• Express g as a function of n.
• Express X and its distribution based on g, n, and q.
• Let the random variable T denote the total number of tests run. Derive an expression for T as a function of X as well as the fixed parameters n and g.
• Consider test groups of size n = 4, 5, 8, 10, 20. Which group size yields the fewest tests performed, on average? (Hint: Find E[T]).
• How does your answer to the last question change if q = 0.02 or 0.075? (Note: Dane County had q = 0.02 and Wisconsin had q = 0.075 at the end of July 2020. At the time I wrote this in early October 2020, more than 20% of COVID-19 tests were coming back positive in Wisconsin.)
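One way to check your work on the last two questions is a short script. The sketch below is not part of the original case study. It assumes individuals test positive independently, so a pool of n samples is positive with probability 1 - (1 - q)^n, X is Binomial with g trials and that success probability, and T = g + nX. The function names `expected_tests` and `best_pool_size` are my own labels for illustration.

```python
GROUP_TOTAL = 40  # individuals tested, from the case study


def expected_tests(n, q, total=GROUP_TOTAL):
    """Expected number of tests E[T] when pooling in groups of size n.

    Assumes independent individuals, each positive with probability q.
    """
    if total % n != 0:
        raise ValueError("pool size must divide the number of individuals")
    g = total // n                      # number of pools
    p_pool_positive = 1 - (1 - q) ** n  # at least one of the n samples is positive
    # T = g + n * X with X ~ Binomial(g, p_pool_positive), so
    # E[T] = g + n * g * p_pool_positive.
    return g + n * g * p_pool_positive


def best_pool_size(q, sizes=(4, 5, 8, 10, 20)):
    """Pool size among `sizes` with the smallest expected number of tests."""
    return min(sizes, key=lambda n: expected_tests(n, q))


if __name__ == "__main__":
    for q in (0.02, 0.075):  # Dane County and Wisconsin values quoted above
        for n in (4, 5, 8, 10, 20):
            print(f"q={q:<5} n={n:<2} E[T]={expected_tests(n, q):.2f}")
        print(f"q={q}: fewest expected tests at n={best_pool_size(q)}")
```

Comparing the output against the 40 tests needed without pooling shows how quickly the benefit of pooling shrinks as q grows.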

You can read more in the New York Times article that inspired this case study.

## PhD development seminar: first steps in research

Last year, I developed a new PhD development seminar for first-year PhD students in industrial engineering and related disciplines.

The purpose of this course is to prepare students for the dissertation research in industrial and systems engineering. The main focus is on initial steps and skills required to get started with research. Topics include understanding degree requirements, first steps in research, conducting a literature review, working with citation managers, time management, research ethics, data management, technical writing, and research organization. I invite a number of guest speakers to class sessions to introduce topics, connect students with campus resources, and answer questions.

By the end of the semester, each student should achieve these learning outcomes:

• Understand requirements for a PhD in Industrial Engineering or other PhD program.
• Understand expectations for a dissertation and how to get started with research.
• Understand what campus resources are available for writing, finding resources at the library, mental health, and others.
• Understand research concepts such as research safety, research ethics, time management principles, setting expectations, meeting milestones, and plagiarism.

The course helps set expectations, introduces campus resources to students, and creates a cohort of students who can connect with their peers.

I am again offering the course in Fall 2020 but in a virtual format. So far, we are off to a great start. I will create a series of blog posts featuring some of the classes from the semester. Those, along with previous PhD-related posts, are tagged with the “PhD support” tag.

## First steps in research

This week’s class was about first steps in research, where two professors discussed how they help new PhD students get started on research. Professors Vicki Bier and Doug Wiegmann came to class and were wonderful. Some of their terrific advice was captured in my tweetstorm below.

Stay tuned for more blog posts about the course.

## Resilient voting systems during the COVID-19 pandemic: A discrete event simulation approach

Holding a Presidential election during a pandemic is not simple, and election officials are considering new procedures to support elections and minimize COVID-19 transmission risks. I became aware of these issues earlier this summer, when I had a fascinating conversation with Professor Barry Burden about queueing, location analysis, and Presidential elections. Professor Burden is a professor of Political Science at the University of Wisconsin-Madison, a founding director of the Elections Research Center, and an election expert.

I was intrigued by the relevance of location analysis and queueing theory in this important and timely problem in public sector critical infrastructure (elections are critical infrastructure). I looked into the issue further with Adam Schmidt, a PhD student in my lab. We created a detailed discrete event simulation model of in-person voting, and we analyzed it in a detailed case study.

We present an executive summary of our paper below. Read the full paper here: https://doi.org/10.6084/m9.figshare.12985436.v1

# Resilient voting systems during the COVID-19 pandemic: A discrete event simulation approach

Adam Schmidt and Laura A. Albert
Industrial and Systems Engineering
1513 University Avenue
laura@engr.wisc.edu
September 21, 2020

### Executive Summary

The 2020 General Election will occur during a global outbreak of the COVID-19 virus. Planning for an election requires months of preparation to ensure that voting is effective, equitable, accessible, and that the risk from the COVID-19 virus to voters and poll workers is minimal. Preparing for the 2020 General Election is challenging given these multiple objectives and the time required to implement mitigating strategies.

The Spring 2020 Election and Presidential Preference Primary on April 7, 2020 in Wisconsin occurred during the statewide “Stay-at-home” order associated with the COVID-19 pandemic. This election was extraordinarily challenging for election officials, poll workers, and voters. The 2020 Wisconsin Spring Primary experienced a record-setting number of ballots cast by mail, and some polling locations experienced long waiting times caused by consolidated polling locations and longer-than-typical check-in and voting times due to increased social distancing and protective measures. A number of lawsuits followed the 2020 Wisconsin Spring Primary, highlighting the need for more robust planning for the 2020 General Election on November 3, 2020.

This paper studies how to design and operate in-person voting for the 2020 General Election. We consider and evaluate different design alternatives using discrete event simulation, since this methodology captures the key facets of how voters cast their votes and has been widely used in the scientific literature to model voting systems. Through a discrete event simulation analysis, we identify election design principles that are likely to yield short wait times, pose a low risk of COVID-19 transmission for voters and poll workers, and accommodate sanitation procedures and personal protective equipment (PPE).

We analyze a case study based on Milwaukee, Wisconsin data. The analysis considers different election conditions, including different levels of voter turnout, early voting participation, the number of check-in booths, and the polling location capacity to consider a range of operating conditions. Additionally, we evaluate the impact of COVID-19 protective measures on check-in and voting times. We consider several design choices for mitigating the risks of long wait times and the risks of the COVID-19 virus, including consolidating polling locations to a small number of locations, using a National Basketball Association (NBA) arena as an alternative polling location, and implementing a priority queue for voters who are at high risk for severe illness from COVID-19.

As we look toward the General Election on November 3, 2020, we make the following observations based on the discrete event simulation results that consider a variety of voting conditions using the Milwaukee case study.

1. Many polling locations may experience unprecedented waiting times, which can be caused by at least one of three main factors: 1) a high turnout for in-person voting on Election Day, 2) not having enough poll workers to staff an adequate number of check-in booths, 3) an increased time spent checking in, marking a ballot, and submitting a ballot due to personal protective equipment (PPE) usage and other protective measures taken to reduce COVID-19 transmission. Any one of these factors is enough to result in long wait times, and as a result, election officials must implement strategies to mitigate all three of these factors.
2. The amount of time spent inside may be long enough for voters to acquire the COVID-19 virus. The risk to voters and poll workers from COVID-19 can be mitigated by adopting strategies to reduce voter wait times, especially for those who are at increased risk of severe illness from COVID-19, and encourage physical distancing through the placement and spacing of voting booths.
3. Consolidating polling locations into a few large polling locations offers the potential to use fewer poll workers and decrease average voter wait times. However, the consolidated polling locations likely cannot support the large number of check-in booths required to maintain low voter wait times without creating confusion for voters and interfering with the socially distant placement of check-in and voting booths. As a result, consolidated polling locations require high levels of staffing and could result in long voter wait times.
4. The NBA has offered the use of its basketball arenas as an alternative polling location for voters to use on Election Day as a resource to mitigate long voter wait times. An NBA arena introduces complexity into the voting process, since all voters have a choice between their standard polling location and the arena. This could create a mismatch between where voters choose to vote and where resources are allocated. As a result, some voters may face long wait times at both locations.

We recommend that entities overseeing elections make the following preparations for the 2020 General Election. Our recommendations have five main elements:

1. More poll workers are required for the 2020 General Election than for previous presidential elections. Protective measures such as sanitation of voting booths and PPE usage to reduce COVID-19 transmission will lead to slightly longer times for voters to check in and to fill out ballots, possibly causing unprecedented waiting times at many polling locations if in-person voter turnout on Election Day is high. We recommend having enough poll workers to staff one additional check-in booth per polling location (based on prior presidential elections or based on what election management toolkits recommend), to sanitize voting areas, and to manage lines outside of polling locations.
2. To reduce the transmission of COVID-19 to vulnerable populations during the voting process, election officials should consider the use of a priority queue, where voters who self-identify as being at high risk for severe illness from COVID-19 (e.g., voters with compromised immune systems) can enter the front of the check-in queue.
3. In-person voting on Election Day should occur at the standard polling locations instead of at consolidated polling locations. Consolidated polling locations require many check-in booths to ensure short voting queues, and doing so requires high staffing levels. Election officials should ensure that an adequate number of voting booths (based on prior presidential elections or based on what election management toolkits recommend) can be safely located within the voting area at the standard polling locations, placing booths outside if necessary.
4. We do not recommend using sports arenas as supplementary polling locations for in-person voting on Election Day. Alternative polling locations introduce complexity and could create a mismatch between where voters choose to go and where resources are allocated, potentially leading to longer waiting times for many voters. This drawback can be avoided by instead allocating the would-be resources at the sports arena to the standard polling locations.
5. The results emphasize the importance of high levels of early voting for preventing long voter queues (i.e., one half to three quarters of all votes being cast early). This can be achieved by expanding in-person early voting, in terms of both the timeframe and locations for in-person early voting, adding new drop box locations for voters to deposit absentee ballots on or before Election Day, and educating voters on properly completing and submitting a mail-in absentee ballot.

The results are based on a detailed case study using data from Milwaukee, Wisconsin. It is worth noting that the discrete event simulation model reflects standard voting procedures used throughout the country and can be applied to other settings. Since the data from the Milwaukee case study are reflective of many other settings, the results, observations, and recommendations can be applied to voting precincts throughout Wisconsin and in other states that hold in-person voting on Election Day.
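To give a feel for why check-in staffing and check-in times matter so much for waits, here is a toy queue simulation. It is only an illustrative sketch and not the model from the paper: it assumes voters arrive according to a Poisson process, check-in times are exponentially distributed, and voters are served first come, first served. The arrival rate, check-in times, and booth counts in the usage example are hypothetical numbers chosen for illustration, not Milwaukee data, and `average_wait` is a name I made up.

```python
import heapq
import random


def average_wait(num_booths, arrivals_per_min, mean_check_in_min,
                 num_voters=5000, seed=1):
    """Average minutes a voter waits for a check-in booth (first come, first served).

    Toy model: Poisson arrivals and exponential check-in times (both assumptions).
    """
    rng = random.Random(seed)
    # Min-heap of the times at which each booth next becomes free.
    booth_free_at = [0.0] * num_booths
    heapq.heapify(booth_free_at)

    clock = 0.0
    total_wait = 0.0
    for _ in range(num_voters):
        clock += rng.expovariate(arrivals_per_min)    # next voter arrives
        earliest_free = heapq.heappop(booth_free_at)  # booth that frees up first
        start = max(clock, earliest_free)             # wait only if all booths are busy
        total_wait += start - clock
        check_in = rng.expovariate(1.0 / mean_check_in_min)
        heapq.heappush(booth_free_at, start + check_in)
    return total_wait / num_voters


if __name__ == "__main__":
    # Hypothetical scenarios: a baseline, slower check-in (e.g., due to
    # protective measures), and slower check-in with one additional booth.
    baseline = average_wait(num_booths=3, arrivals_per_min=2.0, mean_check_in_min=1.0)
    slower = average_wait(num_booths=3, arrivals_per_min=2.0, mean_check_in_min=1.2)
    extra_booth = average_wait(num_booths=4, arrivals_per_min=2.0, mean_check_in_min=1.2)
    print(f"baseline: {baseline:.2f} min")
    print(f"slower check-in: {slower:.2f} min")
    print(f"slower check-in, one extra booth: {extra_booth:.2f} min")
```

Even in this stripped-down setting, a modest increase in check-in time lengthens waits, and adding a booth shortens them. The full model in the paper captures far more detail, including voting booths, ballot submission, and turnout scenarios.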

## Optimization with impact: my journey in public sector operations research.

Today, I gave a keynote talk at the Advances in Data Science & Operations Research Virtual Conference, presented by Universidad Galileo in collaboration with INFORMS. It’s the first INFORMS conference made for Latin America that brings together the scientific community from the areas of operations research, business intelligence, and data science. Dr. Jorge Samayoa, the General Chair, and Dr. José Ramírez, the Executive Chair, were wonderful hosts.

### References from my talk include:

#### Media Engagement

1. L.A. Albert. 2020. Engaging the media: Telling our operations research stories to the public. SN Operations Research Forum 1 (14) https://doi.org/10.1007/s43069-020-00017-0
2. Many of my media appearances are here.

#### Cyber-Security

1. Zheng, K., Albert, L., Luedtke, J.R., Towle, E. 2019. A budgeted maximum multiple coverage model for cybersecurity planning and management, IISE Transactions 51(12), 1303-1317.
2. Zheng, K., and Albert, L.A. A robust approach for mitigating risks in cyber supply chains, Risk Analysis 39(9), 2076-2092.
3. Zheng, K., and Albert, L.A. Interdiction models for delaying adversarial attacks against critical information technology infrastructure. Naval Research Logistics 66(5), 411 – 429.
4. Enayaty-Ahangar, F., Albert, L.A., DuBois, E. 2020. A survey of optimization models and methods for cyberinfrastructure security. To appear in IISE Transactions. https://doi.org/10.1080/24725854.2020.1781306

#### Aviation security

1. McLay, L. A., S. H. Jacobson, and J. E. Kobza, 2006. A Multilevel Passenger Prescreening Problem for Aviation Security, Naval Research Logistics 53 (3), 183 – 197.
2. Lee, A.J., L.A. McLay, and S.H. Jacobson, 2009. Designing Aviation Security Passenger Screening Systems using Nonlinear Control. SIAM Journal on Control and Optimization 48(4), 2085 – 2105.
3. McLay, L. A., S. H. Jacobson, and A. G. Nikolaev, 2009. A Sequential Stochastic Passenger Screening Problem for Aviation Security, IIE Transactions 41(6), 575 – 591.
4. McLay, L.A., S.H. Jacobson, A.J. Lee, 2010. Risk-Based Policies for Aviation Security Checkpoint Screening. Transportation Science 44(3), 333 – 349.
5. Albert, L.A., Nikolaev, A., Lee, A.J., Fletcher, K., and Jacobson, S.H., 2020. A Review of Risk-Based Security and Its Impact on TSA PreCheck, To appear in IISE Transactions.

#### Fire and Emergency Medical Services

1. McLay, L.A., A Maximum Expected Covering Location Model with Two Types of Servers, IIE Transactions 41(8), 730 – 741.
2. McLay, L.A. and M.E. Mayorga, 2010. Evaluating Emergency Medical Service Performance Measures. Health Care Management Science 13(2), 124 – 136.
3. McLay, L.A., Mayorga, M.E., 2011. Evaluating the Impact of Performance Goals on Dispatching Decisions in Emergency Medical Service. IIE Transactions on Healthcare Service Engineering 1, 185 – 196.
4. McLay, L.A., Moore, H. 2012. Hanover County Improves Its Response to Emergency Medical 911 Calls. Interfaces 42(4), 380-394.
5. McLay, L.A., Mayorga, M.E., 2013. A model for optimally dispatching ambulances to emergency calls with classification errors in patient priorities. IIE Transactions 45(1), 1 – 24.
6. Toro-Diaz, H., Mayorga, M.E., Chanta, S., McLay, L.A., 2013. Joint location and dispatching decisions for Emergency Medical Services. Computers & Industrial Engineering 64(4), 917 – 928.
7. Chanta, S., Mayorga, M. E., McLay, L. A., 2014. Improving Rural Emergency Services without Sacrificing Coverage: A Bi-Objective Covering Location Model for EMS Systems. Annals of Operations Research 221(1), 133 – 159.
8. Grannan, B.C., Bastian, N., McLay, L.A. A Maximum Expected Covering Problem for Locating and Dispatching Two Classes of Military Medical Evacuation Air Assets. Optimization Letters 9, 1511 – 1531.
9. McLay, L.A., Mayorga, M.E., 2013. A dispatching model for server-to-customer systems that balances efficiency and equity. Manufacturing & Service Operations Management 15(2), 205 – 220.
10. Ansari, S., McLay, L.A., Mayorga, M.E., 2015. A Maximum Expected Covering Problem for District Design, Transportation Science 51(1), 376 – 390.
11. Ansari, S., Yoon, S., Albert, L. A., 2017. An approximate Hypercube model for public service systems with co-located servers and multiple response. Transportation Research Part E: Logistics and Transportation Review. 103, 143 – 157.
12. Yoon, S., Albert, L. An Expected Coverage Model with a Cutoff Priority Queue. Health Care Management Science 21(4), 517 – 533. DOI: https://doi.org/10.1007/s10729-017-9409-3.
13. Yoon, S., and Albert, L.A. A dynamic ambulance routing model with multiple response. To appear in Transportation Research Part E: Logistics and Transportation Review. https://doi.org/10.1016/j.tre.2019.11.001
14. Yoon, S., Albert, L.A., and V.M. White 2020. A Scenario-Based Ambulance Location Model for Emergency Response with Two Types of Vehicles. To appear in Transportation Science.

## Engaging the Media: Telling Our Operations Research StORies to the Public

My article “Engaging the Media: Telling Our Operations Research Stories to the Public” has been published in SN Operations Research Forum. I wrote this article after appearing in the news many times, especially after the SARS-CoV-2 pandemic began (see my media appearances here). The paper’s abstract is as follows:

Academic research occasionally captures the attention of the media. When this happens, there is a small window of opportunity to disseminate the real-world impact of our research and the value of our operations research and analytics expertise to the public. To do so, we must package our messages for public consumption. In this article, I summarize principles for interacting with the media, describe what various media interactions are like, and offer tips for capitalizing on one’s expertise. Finally, I reflect on what we have to offer to journalists and the value of telling our stories to the public.

I feel strongly that applied operations research should be outward facing to some degree. Otherwise, the research has little chance to make an impact beyond the discipline. I wrote a brief statement about my outreach philosophy at the end of my research statement in my promotion dossier:

My research formulates new operations research models and algorithms for solving important and interesting real-world problems of national interest and concern. I take this responsibility to serve my profession and our nation quite seriously. I believe that it is essential for researchers who are working on problems in the public sector to disseminate their research findings to the public through outreach in addition to dissemination in academic journals. Translating research concepts into practical messages is critical for influencing public policy and transitioning research concepts into practice. This is a common theme that permeates all my research activities.

## Punk Rock OR Zoom backgrounds

I made a series of Punk Rock Operations Research Zoom backgrounds for upcoming fall meetings and classes held on video conferencing software. I’ve been using the black background, and I wanted some variety for the many upcoming meetings. The images are below. If you want to use one, you can right click on the image and download it.

Punk Rock OR T-shirts, laptop stickers, water bottles and mugs are available here.

## Opening the economy is not the problem — opening without a plan to control the risk is the problem

Three weeks ago, I wrote an op-ed on government plans to contain the COVID-19 pandemic. I was underwhelmed by the plans and felt that many were asking the wrong questions. The op-ed was published on Fox News today. You can read it here:

Opening the economy is not the problem — opening without a plan to control the risk is the problem

I outline three risk-management strategies that may be the difference between recovery and a second wave:

1. Regulation of gatherings with super-spreader potential
2. Face masks required in businesses, in buildings, and on public transit
3. Robust contact tracing

It’s up to the rest of us to do our part.

Right before the op-ed was published, I tweeted a thread that reflects my more recent musings on the subject of containment and risk management:

## A soccer win probability model

Last year, I tweeted about a win probability model I created for soccer (or football, depending on where you are from) and the 2019 FIFA Women’s World Cup case study. I promised to blog about this case study I developed for my probability models course. This is a long overdue blog post on this topic.

Here is a portion of the soccer analytics case study.

### Soccer win probability model

Soccer (or football, as it is called outside of the United States) is based on a 90 minute match. The data from FIFA women’s soccer indicate:

• Home teams score 2.34 goals/match (in regulation)
• Visiting teams score 1.71 goals/match (in regulation)

We assume the goals are scored according to a Poisson process with exponentially distributed interarrival times. Assume each team scores independently of the other and of the score. We can use the home team data for the team that is favored in the match.

Consider the situation when the home team is down by 1 goal with 4 minutes remaining in regulation. Find the probability that the home team wins. This is a win probability. We do not consider pulling the goalie.

In one possible solution, we divide the match into small increments of, say, a half minute in length. We recursively solve for the home team’s win probability for any score differential and any time. This way, we can answer questions like this repeatedly for different score differentials and lengths of remaining time.

#### The derivation is next. You can skip the math if you want and jump to the figures below.

Let $W_i(d)$ denote the event that the home team wins given a score differential of $d$ with $i$ increments to go.

We want to find $P(W_i(d))$, the probability that the home team wins if there is a score differential of $d$ with $i$ increments to go. In our problem, the home team is down by 1 goal with 4 minutes remaining in regulation, yielding $P(W_8(-1))$ for a win probability with a score differential of -1 with eight 30 second increments to go.

We use a recursive expression to compute the probability of scoring in small intervals of time and put the solution in a spreadsheet. The spreadsheet approach computes our solution, and it allows us to assess a variety of situations, including different score differentials with different amounts of time to go. The boundary conditions with 0 time increments to go are $P(W_0(d))=1$ if $d > 0$ (the home team is winning when time expires), $P(W_0(d))=0$ if $d < 0$ (the home team is losing when time expires), or $P(W_0(d))=1/2$ if $d = 0$ (the match ends in a tie).

There are two ways to compute the probability that a team scores in a small amount of time $\delta$, which is the length of an increment: (1) we can use exponential interarrival times, or (2) we can use the Binomial approximation. I’ll illustrate the latter approach below. We have to make the time increments small enough such that having at most one goal scored during the time interval is a reasonable assumption.

We compute $P(W_i(d))$ by conditioning on $Y$, where $Y$ captures what happened in increment $i$ with a home goal (+1), an away goal (-1), or no goals (0):

$P(W_i(d)) = \sum_y P(W_i(d) | Y=y) P(Y=y)$

After taking advantage of independent increments, we can simplify this to $P(W_i(d)) = \sum_y P(W_{i-1}(d + y) ) P(Y=y)$. Here, we recursively solve for $P(W_i(d))$ by conditioning on what happened last and formulating a new expression based on the win probability with $i-1$ time increments to go.

Here, $P(Y=1) = \lambda_H \delta / 90$, where $\lambda_H =2.34$ home goals per match and $\delta$ is measured in minutes. Likewise, $P(Y=-1) = \lambda_A \delta / 90$, where $\lambda_A =1.71$ visiting team goals per match. The remaining probability is $P(Y=0) = 1 - P(Y=1) - P(Y=-1)$.
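If you would rather not build the spreadsheet, the same recursion fits in a few lines of code. This is a minimal sketch of the recursion described above rather than the spreadsheet itself. It assumes half-minute increments ($\delta = 0.5$), treats an away goal as $Y=-1$ with probability $\lambda_A \delta / 90$, and assigns the leftover probability to no goal ($Y=0$). The function name `win_prob` and the constant names are my own.

```python
LAMBDA_HOME = 2.34    # home goals per match, from the data above
LAMBDA_AWAY = 1.71    # visiting team goals per match, from the data above
MATCH_MINUTES = 90
DELTA = 0.5           # increment length in minutes (half a minute)

P_HOME = LAMBDA_HOME * DELTA / MATCH_MINUTES  # P(Y = +1), home goal
P_AWAY = LAMBDA_AWAY * DELTA / MATCH_MINUTES  # P(Y = -1), away goal
P_NONE = 1.0 - P_HOME - P_AWAY                # P(Y = 0), no goal


def boundary(d):
    """Win probability when time expires; a tie counts as 1/2."""
    if d > 0:
        return 1.0
    if d < 0:
        return 0.0
    return 0.5


def win_prob(increments, diff):
    """P(W_i(d)): home win probability with `increments` to go and differential `diff`."""
    # Start from the boundary conditions for every differential reachable from `diff`.
    probs = {d: boundary(d) for d in range(diff - increments, diff + increments + 1)}
    # Step backward in time one increment at a time, shrinking the reachable range.
    for step in range(1, increments + 1):
        reach = increments - step
        probs = {
            d: P_HOME * probs[d + 1] + P_AWAY * probs[d - 1] + P_NONE * probs[d]
            for d in range(diff - reach, diff + reach + 1)
        }
    return probs[diff]


if __name__ == "__main__":
    print(f"Down 1 with 4 minutes to go: {win_prob(8, -1):.3f}")
    print(f"Tied with 5 minutes to go:   {win_prob(10, 0):.3f}")
    print(f"Tied at kickoff:             {win_prob(180, 0):.3f}")
```

Calling `win_prob` over a grid of increments and differentials reproduces the rows and columns of the spreadsheet, so it can also be used to trace a win probability chart as goals are scored.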

#### Let’s look at the answer

We can put this into the spreadsheet and estimate the probability of 0.048 that the home team wins when down by 1 goal with 4 minutes to go. We can see answers to other scenarios. A home team wins with a probability of 0.517 if the match is tied with 5 minutes to go.

What is more interesting is that we can move across this spreadsheet to estimate real-time win probabilities. Every time there is a goal, we jump to another row in the spreadsheet. We jump up a row if the visiting team scores and down a row if the home team scores.

I made two win probability charts for the USA v ENG and USA v FRA games in the 2019 World Cup. I set the USA Women’s National Team as the “home” team since they were favored to win in each of the matches, even though the matches were played in France. You can also see that the home team has a probability of 0.61 of winning when the game begins.

Earlier we assumed that goals are scored according to a Poisson process. Is that a good assumption? Not exactly (see this post using Premier League data), but it’s not a bad approximation except for the end-of-game situation when a team pulls their goalie. The model we built above can be easily changed to have time-specific scoring rates. Pulling a goalie is trickier but doable. When pulling a goalie, we have to consider new Poisson scoring rates that depend on time and the score differential.

On a side note, the Poisson process assumption holds up better with National Hockey League data.