Question

1 Approved Answer

Posted on Oct 13, 2024

Submitted to Management Science manuscript MS-0001-1922.65 Authors are encouraged to submit new papers to INFORMS journals by means of a style file template, which includes

Submitted to Management Science manuscript MS-0001-1922.65 Authors are encouraged to submit new papers to INFORMS journals by means of a style file template, which includes the journal title. However, use of a template does not certify that the paper has been accepted for publication in the named journal. INFORMS journal templates are for the exclusive purpose of submitting to an INFORMS journal and should not be used to distribute the papers in print or online or to submit the papers to another publication. Appointment Scheduling Under Schedule-Dependent Patient No-Show Behavior Qingxia Kong School of Business, Universidad Adolfo Ibanez, q.kong@uai.cl Shan Li Zicklin School of Business, Baruch College, City University of New York, shan.li@baruch.cuny.edu Nan Liu Department of Health Policy and Management, Mailman School of Public Health, Columbia University, nan.liu@columbia.edu Chung-Piaw Teo Department of Decision Sciences, NUS Business School, bizteocp@nus.edu.sg Zhenzhen Yan Department of Decision Sciences, NUS Business School, a0109727@nus.edu.sg This paper studies an appointment scheduling problem under schedule-dependent patient no-show behavior. The problem is motivated by our studies of independent datasets from countries in two continents which identify a significant time-of-day effect on patient show-up probabilities. We deploy a distributionally robust model, which minimizes the worst case total expected cost of patient waiting and service provider's idle and overtime, by optimizing the scheduled arrival times of patients. We show that this model under schedule-independent patient show-up behavior can be reformulated as a copositive program and then be approximated by semidefinite programs. These formulations are obtained by a new technique that uses a completely positive program to equivalently represent a linear program with uncertainties present in both the objective function and the right-hand side of the constraint sets. To tackle the case when patient no-shows are endogenous on the schedule, we construct a set of dual prices to guide the search for a good schedule and use the technique iteratively to obtain a near optimal solution. Our computational studies reveal a significant reduction in total expected cost by taking into account the time-of-day variation in patient show-up probabilities as opposed to ignoring it. Key words : Distributionally Robust Optimization; Copositive Program; Appointment Scheduling; Patient No-shows History : 1 Author: Schedule Dependent No-Show Article submitted to Management Science; manuscript no. MS-0001-1922.65 2 1. Introduction Consider the following situation: Amy has an appointment with her dentist at noon. However, she is requested to attend a business meeting announced last minute that is scheduled at 12:30pm. As a result, she cannot attend the appointment as she is supposed to. If she has had an 8am appointment, she would have been able to see her dentist first and come to work right after. Such patient nonattendance (or commonly known as \"no-show\") behavior frequently arises in clinic appointment scheduling. Due to the uncertainties patient no-show brings into the picture, its prevalence in different medical specialties and geographic regions as well as its potential detrimental impact on patient health outcomes and service provider revenues (Moore et al. 2001, Ulmer and Troxler 2004), patient no-show is a crucial factor for ambulatory care providers, such as primary care doctors, dentists and physical therapists, to consider when designing appointment templates. A provider's daily appointment template specifies the expected number of patients to be seen in a day and the scheduled arrival times of these patients. To mitigate the effects of patient no-shows, it is common that an appointment template allows over-booking appointment time slots (i.e., scheduling two or more patients into the same time slot). Over-booking will certainly reduce service provider's idle time, and therefore increase throughput by seeing more patients per day. It will, however, increase service provider's overtime and patient's waiting time, and in turn may hurt service provider's satisfaction and patient's experience. As healthcare moves towards more patient-centered, payers shift from the traditional pay-for-service scheme to pay-for-performance by linking reimbursement rates to service providers with patient satisfaction rating (Press Ganey 2008). At the same time, the booming of social media websites significantly increases information transparency in the healthcare market (McCormack 2013), and leads to a soaring competition among healthcare providers in their service quality. For outpatient care providers, it thus becomes more important than ever to adopt an appointment template that achieves the best tradeoff between capacity utilization and patient experience. A significant amount of operations research efforts have been devoted to investigating the optimal appointment scheduling under patient no-show behavior. Some recent literature on this topic includes Kaandorp and Koole (2007), Robinson and Chen (2010), Hassin and Mendel (2008), LaGanga and Lawrence (2012), Luo et al. (2012), Jiang et al. (2015). Interested readers are referred to Cayirli and Veral (2003) and Denton and Gupta (2003) for a review of the earlier literature. Most, if not all, of this prior literature on appointment scheduling assumes that patient show-up probabilities (or the distributions) are exogenously determined. As we illustrated in our earlier example, however, in many situations whether or not a patient will show up for an appointment can depend on the time-of-day of her appointment. Working professionals, like Amy, usually have less control of their availability as their work day progresses. Author: Schedule Dependent No-Show Article submitted to Management Science; manuscript no. MS-0001-1922.65 3 As a result, in that specific occasion, Amy has to miss her noon appointment, but she would have come for an 8am appointment. There is a surge of interest in using data analytics to improve no-show prediction in healthcare by incorporating more complex factors such as the time-of-day effect. For instance, Gabriel Belfort and nine teammates (at a MIT Hacking Medicine event) built a prototype for the start-up \"Smart Scheduling.\" Using hundreds of patient demographics and punctuality data, their system is able to predict cancelled or missed appointments up to 70% accuracy (c.f. The Boston Globe, July 14, 2014). This allows clinics to efficiently target reminders and double-book appointments, to provide better service availability and to improve patient experience. A few studies have also pointed out that patient no-show rates may depend on their appointment times of day, but the patterns are not uniform. Lacy et al. (2004) laid out a few reasons for patient no-shows, some of which are related to time (for example, trouble getting off work, transportation, etc.). Moore et al. (2001) reported that morning appointments are more likely to be kept than afternoon slots. LaGanga and Lawrence (2008) showed that no-show rates may vary by appointment slots. Another prospective study of nonattendance in a physiotherapy clinic in Ireland showed that late afternoon slots produce a lower no-show rate compared to morning and early afternoon slots (French et al. 2005). In order to gain more insights on this phenomenon, we use two independent large datasets from countries in two continents to systematically analyze the impact of appointment time-of-day on patient show-up probabilities. Controlling for patient-level and provider-level factors, we find significant empirical evidences that patient show-up probabilities indeed depend on their appointment times of day. Specifically, we find that patients in a US community healthcare facility are more likely to show up for their appointments at the beginning or the end of the day in weekdays. In a Chilean pediatric practice, however, show-up rates tend to be lower in the early morning. Such different temporal effects of appointment times may be explained by the differences in patient populations and culture. More importantly, the temporal effect size can be quite significant. For instance, our analysis shows that for a US patient scheduled on Wednesday, her show-up probability can increase from 56% to 81% when given an appointment at 8am rather than at noon. These interesting findings motivate our research questions in this paper: (1) how to design an appointment template when time-of-day affects patient no-show behavior; and (2) how much efficiency gain/cost reduction can be achieved by accounting for such time-of-day effect compared to ignoring it? Specifically, we consider a fixed set of patients to be scheduled in a given clinic session for a single service provider. A clinic session is referred to a consecutive time window during which a service provider serves patients without taking a break. We focus on the design of appointment template that specifies the scheduled arrival times of these patients. As an appointment template is usually Author: Schedule Dependent No-Show Article submitted to Management Science; manuscript no. MS-0001-1922.65 4 determined before appointments are actually made and without the knowledge of each potential patient's individual characteristics, we do not consider the impact of individual characteristics on patient show-up probabilities in our model. Patient show-up probabilities depend on their scheduled arrival times. All patients, if show up, arrive on time; and no walk-ins are allowed. The overall goal of our model is to design an appointment template that increases throughput by seeing more patients per day (or in other words, limits service provider's idle time), but not at the expense of overwhelming patient waiting time and staff overtime. Following the convention of the literature on this topic, our objective is to minimize the sum of service provider's idle cost and overtime cost as well as patient's waiting cost. From a practical point of view, the scheduler may not have sufficient data to confidently estimate the exact probability distribution of patient no-shows. In contrast, estimating only the first two moments of show-up rates is much less cumbersome. Therefore, we deploy a two-stage stochastic optimization framework from a distributionally robust perspective to solve the appointment scheduling problem mentioned above. In the second stage, we evaluate the total cost given an schedule and the realization of patient no-show status. In the first stage, instead of assuming a specific distribution of show-up rates, we use a set of distributions with given first and second moment information to find out the worst case optimal schedule that minimizes the maximum cost among the family of distributions. Such distributionally robust solutions guarantee the schedule to perform well under all possible distributions. This approach is also versatile enough to handle various salient features of the scheduling problems. To solve this scheduling problem, we encounter several unaddressed technical challenges in the optimization literature. First, incorporating patient no-show behavior demands solving a completely positive program in which uncertainties occur both in the objective function and the righthand side (RHS) of the constraint sets. In addition, because patient show-up probabilities depend on time of day, uncertainties in the system related to patient no-shows are actually endogenous on the schedule - our decision variables. Standard stochastic programming approach does not work here due to such schedule-dependent show-up probabilities. Specifically, we cannot generate random samples to guide the design of the schedule without knowing the schedule. To the best of our knowledge, Pflug G. (1990) was the first to address exogenous uncertainty where the underlying stochastic process depends on the optimization decisions. See Goel and Grossmann (2006) for a review of this area of research. These problems are often approached using a scenario tree representation and a mixed integer programming approach to handle the discrete number of scenarios. To tackle this challenge, we develop a new modeling technique that enables us to reformulate such a problem with patient no-show. We first solve the appointment scheduling problem with Author: Schedule Dependent No-Show Article submitted to Management Science; manuscript no. MS-0001-1922.65 5 static show-up rate (i.e., schedule-independent), and then apply the method iteratively to tackle the case with endogenous patient no-show behavior (i.e., schedule-dependent). To test our proposed methods, we carry out extensive numerical studies. We show that, comparing to the front-loading pattern observed in the optimal schedule with a static show-up rate (Zacharias and Pinedo 2014), when patient show-up probabilities increase over time (e.g., in the case presented by the Chile dataset), the optimal schedule still observes a front-loading pattern but it is postponed. However, when patient show-up probabilities decrease over time (e.g., in the case presented by the US dataset), it is better to spread out patients rather than front loading the system. In both situations, we find significant reductions in the total expected cost by explicitly taking into account the impact of schedule-dependent patient no-show probabilities. In summary, this paper makes three main contributions to the literature. First, we use two large datasets from countries in two continents to study and quantify the impact of appointment timeof-day on patient show-up probabilities, controlling for patient-level and provider-level factors. We identify significant evidences on the temporal effect of appointment times on patient show-up probabilities in both datasets. Second, comparing to the \"classic\" front-loading schedule pattern arising from assuming a constant patient show-up rate over time (Zacharias and Pinedo 2014), our model reveals an optimal schedule with different patterns in cognizance of time-varying patient no-show behavior. More importantly, we demonstrate a significant cost reduction that can result from the schedules derived from our model. Third, from a methodological perspective, we develop a general technique that uses completely positive program to equivalently represent a distributionally robust linear program (LP) with uncertainties present in both the objective function and the RHS of the constraint sets. By doing so, we are able to reformulate such a technically challenging problem as a completely positive program that can be approximated by semidefinite programs. This paper offers a general approach to solve problems with this structure. Two papers most relevant to ours are Kong et al. (2013) and Zacharias and Pinedo (2014). Compared to models developed in these two papers, ours is much more general. The modeling technique of this paper is inspired by Kong et al. (2013), which considers an appointment system with random service durations and assumes that all patients show up for appointments. They develop a linear copositive program to solve the appointment scheduling problem under the worst case distribution. From a technical point of view, our model is much more challenging, as in their formulation the uncertainty only appears in the objective function, but in our problem the uncertainties of patient show-up rate are present in both the objective function and the RHS of the constraint sets. The modeling technique developed in this work can be used to solve general problems of this kind. To derive the exact optimal schedule, Zacharias and Pinedo (2014) require constant, i.e., time-homogeneous, patient no-show behavior. In contrast, our model allows for schedule-dependent Author: Schedule Dependent No-Show Article submitted to Management Science; manuscript no. MS-0001-1922.65 6 show-up rates. We develop an iterative method to solve this problem, and our computational results suggest that significant efficiency gain can be achieved when accounting for the time-of-day variation in patient no-show behavior. Furthermore, we show that even though our model aims to solve the optimal schedule for the worse-case distribution, such robust schedules can also be nearoptimal in terms of the total expected costs when compared to the optimal schedules generated by Zacharias and Pinedo (2014). Another paper that is also relevant is LaGanga and Lawrence (2012), which is the only work that we know considering time-varying show-up probabilities in appointment scheduling. In their model, patient service times are deterministic and are equal to the length of an appointment slot. The decision is to identify the number of patients scheduled for each slot. Given the combinatorial nature of this problem, they use complete enumeration and develop a heuristic approach to solve it. Our work significantly advances theirs by providing a unified optimization framework for a more general class of the problems and by developing algorithms to solve the model efficiently. The rest of the paper is organized as follows. Section 2 presents a predictive analysis of the timeof-day effect on patient show-up probabilities while controlling for other factors. Section 3 develops the new approach needed to analyze a distributionally robust LP with uncertainties in both the objective and the RHS of constraints. Section 4 introduces the appointment scheduling model with schedule-dependent show-up probabilities, and applies the new approach developed in Section 3 to this model. Section 5 discusses our numerical results, and Section 6 draws our concluding remarks. 2. Time-of-day Effects on Patient Show-up Probabilities Previous literature has shown that patient characteristics (e.g., gender, age, new or established patient) and provider-level factors (e.g., provider type and relationship with patients) are important predictors for patient no-shows; see, e.g., Ulmer and Troxler (2004). We hypothesize that, controlling for these factors above, time-of-day also has a significant impact on patient attendance behavior, for potential reasons to be discussed soon. In this section, we use two datasets on patients appointment records, one from the US and the other from Chile, to investigate our hypothesis. 2.1. US data The US data contain patient appointment records from a large urban community health center located in New York City. This center offers comprehensive medical and dental care to the local community and has more than twenty healthcare providers including physicians, nurse practitioners, nutritionists and care managers. The annual visits to this center amount up to more than 25,000. Our data are extracted from the EMR (Electronic Medical Record) system of this center. This large dataset spans over three years ranging from January 2011 to December 2013. When analyzing Author: Schedule Dependent No-Show Article submitted to Management Science; manuscript no. MS-0001-1922.65 7 this dataset, we focus on adult primary care visits, i.e., visits to internists, family medicine doctors or nurse practitioners. We exclude walk-in patients from the analysis because they did not schedule their appointments in advance. Among scheduled visits, we exclude those mandated by school or work, e.g., visits for PPD skin tests or vaccine shots, because these visits have much higher show-up probabilities compared to other regular visits. The final dataset contains 35,094 patient visits made by 4,142 distinct patients. Because some patients made multiple visits, our data has a panel data structure, for which we develop a mixed-effects logistic regression model to account for potential within-subject correlation. Patient visit status (show-up vs. no-show or cancellation) is the dependent variable, and timeof-day is the independent variable of interest. We also control for a number of other potential factors available in the dataset, including patient age, gender, visit type (new vs. established), provider type (family medicine, internal medicine or nurse practitioner) and day of week. For the age variable, we dichotomize patients into younger and elder patients based on a median split at age 52 for ease of interpretation. We use a random intercept to capture individual patient effect. A full regression model reveals that gender effect is not significant, i.e., male and female patients have statistically the same attendance behavior. The difference between visits to family medicine and internal medicine is not significant either. This is not surprising, as these are all visits to physicians who usually practice in a similar manner. The difference in patient no-show rates due to provider practice manner, if any, should present in comparison between the visits to physicians and nurses. We also find that patient show-up probabilities are (similar and) higher in Tuesday, Wednesday and Thursday compared to other days in the week, controlling for other factors. Thus, we group days into two categories: Tuesday to Thursday, and other days in the week. In our analysis, we model appointment time as a categorical variable in the regression to explore the temporal effect of appointment time. Our final regression model, after excluding non-significant predictors, takes the following form. logit(pij ) = 0 + 1 AppHourij + 2 M idW eekij + 3 P hysicianij + 5 Y oungij + i , in which pij is the probability of patient i showing up for his/her jth appointment in the dataset ; 0 is the fixed intercept and i N (0, 2 ) represents the unobserved individual random effect with 2 being its variance to be estimated; AppHour is a categorical variable for different appointment times in a day; MidWeek=1 if it is Tuesday, Wednesday or Thursday; Physician=1 if the patient sees an physician (not a nurse practitioner); Young = 1 if patient's age is lower than the median age 52 of the sample. As we will discuss later, patient show-up patterns appear quite different in weekdays compared to Saturday, so we develop two regression models, one using full data and Author: Schedule Dependent No-Show Article submitted to Management Science; manuscript no. MS-0001-1922.65 8 the other using non-Saturday visits. The estimated model coefficients are shown in Table 7 of the Appendix A. We use the likelihood-ratio test to assess the significance of appointment times which consist of multiple levels, and this test lends strong empirical support for our hypothesis that appointment times have a significant impact on patient show-up rates, controlling for other factors (p < 0.05 for the full data model, and p < 0.01 for the model excluding Saturday visits). The impact of other factors on patient show-up probabilities is discussed in the Appendix A. To explore the effect size of time-of-day, we plot, by day of week, the average marginal showup probabilities over different hour of day in Figure 1. Specifically, for each day of week, we fit a separate mixed-effects logistic regression model; and then for each appointment hour, we use the fitted model to predict the show-up probability for each patient in our dataset for that day of week, holding his/her other characteristics unchanged. We then average the predicted show-up probabilities over all patients in the data for that day. This average is shown as the curved bold line in figures 1a to 1f, representing how the \"expected\" show-up rate of a random patient in this population changes should s/he be scheduled at different times of a day. We observe that, in weekdays, the show-up probabilities tend to be higher either in early morning (except for Tuesday) or in late afternoon. However, Saturday exhibits a different pattern: showup probabilities peak in the middle of the day (a dome-shape). This difference may be explained by people's different life schedules during weekdays and weekends. During weekdays, early morning slots may be the most \"convenient\" ones from patients' perspectives because attending these appointments slots has the least interruption to one's work/life and thus these slots are less likely to be missed. On Saturday, however, people tend to have a relaxed schedule (and a late breakfast), and thus early morning slots are associated with lower show-up probabilities. 2.2. Chile data The Chile data consist of patient appointment records from an ENT (ear, nose and throat) department of a public teaching pediatric hospital located in Santiago, Chile. This hospital offers various services including speciality consultation, emergence, and surgical/medical hospitalization. In 2013, this hospital had around 16,500 discharges, 5,500 major surgeries and 90,000 outpatient visits. Our ENT dataset covers one year period from October 2012 to October 2013. During this period, appointments are scheduled from 8am to 4pm Monday through Friday, and there are 7,352 patient visits made by 3,302 distinct patients. We explore this dataset using a similar approach as for the US data. We control for the following potential predictors in our regression model: age, gender, status (initial visit vs. follow-up visit), provider type (speech specialists, physicians and surgeons), distance from residence to hospital, 0.8 Predicted Showup Probability 9 Predicted Showup Probability Predicted Showup Probability Author: Schedule Dependent No-Show Article submitted to Management Science; manuscript no. MS-0001-1922.65 0.8 0.6 0.8 0.6 0.4 0.6 0.4 8 9 10 11 12 13 14 15 16 17 18 0.4 8 Appointment Time 9 (b) Tuesday 9 10 11 12 13 14 15 16 17 18 Appointment Time (d) Thursday Figure 1 12 13 14 15 16 0.6 0.4 8 11 Appointment Time 0.8 0.6 0.4 10 (c) Wednesday 0.8 0.6 9 Predicted Showup Probability 0.8 8 Appointment Time Predicted Showup Probability Predicted Showup Probability (a) Monday 10 11 12 13 14 15 16 17 18 0.4 8 9 10 11 12 13 14 Appointment Time (e) Friday 15 16 8 9 10 11 12 Appointment Time 13 14 (f) Saturday Sample-Average Probabilities of Show-up over Different Times of Day (the US Data). day of week and time of day. We use a random intercept to capture individual patient effect. For the age variable, we dichotomize patients into younger and elder patients based on a median split at age 6. Our dataset has some information on patients' residence, using which we are able to group patients into three categories: close to hospital (these are patients who live in counties very close to the hospital location); inside the city (these are patients who live in the same metropolitan area where the hospital is located), and outside the city (these are patients who do not live in the metropolitan area of the hospital). Our regression analysis reveals that gender and age effects are not significant. Distance, thought to have an impact on patient no-show rates, does not appear to be significant either. One possible explanation for these three factors not being significant is that parents were doing their best to bring their children to see the service provider regardless of their residence location or their children' gender and age. We find that patients who visit speech specialists and surgeons tend to have similar attendance behavior, and thus we group these two types of patients in a single category. We also find that patient show-up probabilities are higher on Thursdays but lower on Fridays compared to other days of the week. As a result, we group days into three categories: Monday to Wednesday, Thursday and Friday. Author: Schedule Dependent No-Show Article submitted to Management Science; manuscript no. MS-0001-1922.65 10 We refer the readers to the Appendix A for details of the final model (see Table 8), and the interpretation of the results. Here, we focus on the temporal effect of appointment times on patient show-up probabilities in different days of week. As we did for the US data, we plot the sampleaverage marginal show-up probabilities over different times of the day for any given day of week, as well as the aggregated marginal show-up probability over all days of week (because the daily patterns look similar in the Chile data); see Figure 2. Note that the office hour in the Chilean practice is different from that in the US, and very few patients visit at 1pm so we exclude those. We observe an interesting pattern of show-up probabilities in this Chile dataset in contrast to the US data. In general, patient show-up rates increase over time since early morning, peak at the middle of the day, and then decrease (see Figure 2f). This pattern is different from the weekday pattern in the US data, but similar to Saturday there. There may be a few explanations. First, this is a pediatric population, and parents may need extra preparation time for the visit. Thus, mid of a day appears to be the most convenient times. Second, it is possible due to the less-work-oriented Latino culture and the fact that Latinos usually have a more relaxed attitudes towards time (Flores and Vega 1998). 0.9 Predicted Showup Probability 1.0 Predicted Showup Probability 1.0 Predicted Showup Probability 1.0 0.9 0.8 0.9 0.8 0.7 0.8 0.7 0.6 0.7 0.6 0.5 0.6 0.5 8 9 10 11 Appointment Time 0.5 8 (a) Monday 9 10 11 12 13 14 Appointment Time 15 16 (b) Tuesday 10 11 12 Appointment Time (d) Thursday Figure 2 13 14 15 16 15 16 0.6 0.5 9 14 0.7 0.6 0.5 13 0.8 0.7 0.6 12 0.9 0.8 0.7 11 Appointment Time Predicted Showup Probability 0.9 0.8 10 1.0 Predicted Showup Probability 0.9 9 (c) Wednesday 1.0 Predicted Showup Probability 1.0 8 8 0.5 9 10 11 Appointment Time (e) Friday 12 8 9 10 11 12 13 14 Appointment Time (f) Aggregate Sample-Average Probabilities of Show-up over Different Times of Day (Chile Data). Author: Schedule Dependent No-Show Article submitted to Management Science; manuscript no. MS-0001-1922.65 3. 11 An Approach of Completely Positive Decomposition As we demonstrated in Section 2 that, appointment time-of-day is an important factor that affects the variation in patient show-up rates. We are interested in finding out the structure of the optimal policies to schedule patients by incorporating this new empirical evidence. We develop a copositive programming reformulation to solve the no-show problem, based on an approach first developed in Kong et al. (2013). However, our problem is more challenging due to two reasons: first, incorporating patient no-show behavior demands a completely positive model with uncertainties in both the objective function and RHS of the constraints; second, patient showup probabilities depend on our scheduling decision variables. To address the second issue, we use the dual prices associated with the moment cones in the copositive program to guide the search for the optimal appointment schedule, and apply this method iteratively to tackle the problem with endogenous patient no-show behavior (i.e., schedule-dependent). We first develop in this section the theories on how to use completely positive program to solve LPs with uncertainties in both objective and the RHS of constraints. Later we will apply these results to solve the problem of appointment scheduling with no-shows in Section 4. 3.1. Linear Optimization with Uncertainties in both Objective and Right-Hand Side We consider a general LP with uncertainties occurring in both the objective c and the RHS of the For ease of exposition, we assume the objective function c is a linear function of constraint sets b. i.e., b. = kTi b + li , i {1, 2, ..., n} , ci = ci (b) where ki Rm1 is the coefficient vector1 . We consider the following linear optimization problem: = max c(b) Tx ZP (b) Hx = d, x 0 s.t. Ax = b, > (1) > where A := (a1 a2 ... am1 ) , H := (h1 h2 ... hm2 ) . Without loss of generality, we assume that our linear optimization problem satisfies the following conditions: (1) The feasible region is bounded; (2) if Hx = 0 and x 0, then x = 0. Note that the first condition can be used to construct a redundant deterministic constraint n P xi M that we can add to the model to ensure that the second condition holds. i=1 1 to denote (b1 , b2 , ..., bm1 ). Throughout this paper, we use boldface notation to denote vectors. For example, we use b Author: Schedule Dependent No-Show Article submitted to Management Science; manuscript no. MS-0001-1922.65 12 lies in a set of multiple distributions supported on We assume further that the distribution of b + 1 Rm + with finite first moment and finite second moment , denoted as b (, ) . We solve the following distributionally robust optimization problem: (P ) : ZP = h i E ZP (b) sup (2) + b(,) 3.2. Completely Positive Decomposition Before showing the main theorem, we first introduce some necessary notation and briefly review related concepts. A completely positive cone is defined as nm CP n := {A Sn |V R+ , such that A = V V > } := {A Sn |v1 , v2 , ..., vk Rn+ , such that A = k X vi vi> }, i=1 where Sn is n n symmetric matrices. A copositive cone is defined as COn := {A Sn |v Rn+ , vT Av 0} A copositive cone is the dual of a completely positive cone. In the following sections of this paper, we use X cp 0 (resp. X co 0) to represent X CP n (resp. X COn ). For more information on completely positive cone and copositive cone, we refer interested readers to Berman A (2003). denote the optimal solution of problem (1) obtained under b. Let Let x(b) p := E[x(b)] b) T] X := E[x(b)x( b T ] Y := E[x(b) We observe that T T T 1 1 1 1 p m1 +n+1 b R+ Y T=E . b is a completely positive matrix, as b p Y X x(b) x(b) x(b) = kTi b + li , i {1, 2, ..., n}, we have Furthermore, since ci = ci (b) \u0014 \u0015 \u0014X \u0015 X\u0012 \u0013 =E + li )xi (b) = i (b)] + li E[xi (b)] , E ZP (b) (kTi b kTi E[bx i i = dj is well studied and can be According to Natarajan et al. (2011), the constraint hTj x(b) formulated as conic constraint by lifting, i.e. h i 2 = d2 E (hTj x(b)) j Author: Schedule Dependent No-Show Article submitted to Management Science; manuscript no. MS-0001-1922.65 13 = bi , i, we have For the constraint aTi x(b) h i h i h i 2 = E (aT x(b)) b i = E b 2 , i E (aTi x(b)) i i Then we can further infer X h i X h i X h i 2 = b i = 2 , E (aTi x(b)) E (aTi x(b)) E b i i i i and written in matrix form, we have h i h i h i b) T = AT E x(b) b T = I E b b T (AT A) E x(b)x( Let K denote the matrix k1 k2 ... kn \u0001 and l = (l1 , . . . , ln ). We now consider the following completely positive program ZC , obtained by reformulating the problem ZP using the variables X, Y and p: (C) : ZC = max K Y T + lT p 1 T pT Y T s.t. p Y X (AT A) X (AT A) X Hp diag(HXH T ) cp 0 = = = = AT Y I d diag(ddT ) (3) Note that diag(M ) denotes a vector of the diagonal elements of matrix M . The main result derived in this section is the following: Theorem 1. ZC = ZP To show that (C) is actually equivalent to (P ), we need to construct a (non-negative) distribution obtained from (C) that satisfies the moment conditions, with corresponding objective value ZP . The construction hinges on the following observations: consider any completely positive decomposition of matrix T 1 > pT k X k Y> = k k p Y X k k k where k R+ , k , k Rn+ . Let + = {k | k > 0}, 0 = {k | k = 0}. The constraints in (C) ensure that Ak = k , k H k = d, k + k k = 0, k 0 . Author: Schedule Dependent No-Show Article submitted to Management Science; manuscript no. MS-0001-1922.65 14 Our approach is motivated by the construction in Natarajan et al. (2011), which uses completely positive decomposition to obtain such a desired distribution in the limit. The difference is that, their approach only deals with problems with uncertainties occurring in the objective function alone, whereas in our model, uncertainties are present in both the objective function and the RHS of the constraint sets. Therefore, our model further requires that each constraint with uncertainties in the RHS has to hold in each completely positive decomposition. 4. The Model In this section, we present a stylized mathematic model to understand the effect on the optimal appointment schedule when patient no-show rates depend on the schedule. Let N = {1, 2, ..., m} be the index set of all patients, where m denote the number of patients scheduled to arrive in a day. The number of appointment slots available per day is n, each of unit length. We assume m n for the purpose of focusing on the overbooking effect. The basic assumptions of our appointment scheduling model are listed as follows: The service sequence is fixed. Patients arrive punctually at the scheduled appointment times, if they show up. There is a single service provider in the clinic. The service provider arrives at the same time with the first patient and operates with a work conserving policy (i.e., server does not idle as long as there are patients waiting in the queue). Walk-in patients are not considered. We define si as the length of time slot scheduled for ith patient in the sequence, indicating the arrival interval between the ith and i + 1'th patient. We create a dummy patient, who does not consume any consultation time, arriving at the beginning of the appointment session (i.e., time 0), and thus the time allowance for this dummy patient, denoted as s0 , indicates the first patient's arrival time (and also the arrival time of the service provider). We also add a m + 1'th patient to arrive at the end of the clinical session to capture the amount of overtime. All patients are scheduled to arrive before the m + 1'th patient. Let ui denote patient i's consultation time. We use bi (s) {0, 1} to denote the show-up status of the ith patient, with bi (s) = 1 if the ith patient shows up, and 0 otherwise. Note that the show-up state is a function of the schedule s. We consider three types of costs in our model: (i) the waiting cost of patients, (ii) the idle cost, and (iii) the overtime cost of service provider. The scheduler determines the length of time slot si for patient i. The unit waiting cost for each patient is denoted by ci , i = 1, ..., m. If the last patient is completed after the nth slot, then an overtime cost cO is charged per unit of time. If the service provider is idle sometime during the day, that incurs an idle cost of cI per unit of time. Author: Schedule Dependent No-Show Article submitted to Management Science; manuscript no. MS-0001-1922.65 15 Let Wi denote the waiting time of the ith patient in the system. Since Wm+1 denotes the amount m P bi (s)ui denotes the amount of work brought into the system, the total of overtime work, and i=1 cost can be represented as m m i=1 i=1 P P f (s, b(s)) := ci (bi (s)Wi ) + cO Wm+1 + cI (n + Wm+1 s0 bi (s)ui ), (4) in which f (s, b(s)) denotes the total cost incurred given a schedule s and the show-up state of each patient b(s). Using the recursion n o Wi = max 0, Wi1 + bi1 (s)ui1 si1 i = 2, ..., m + 1, (5) we can use a network flow approach to model the total cost function, as shown in Figure 32 . In order to capture patient no-show in a network flow representation, we change the inflow to node i to cibi (s), i = 1, ..., m. Therefore if patient i does not show up, the inflow coming into node i becomes 0. In addition, the inflow to node (m + 1) is cI + cO which corresponds to the term (cO + cI )Wm+1 in (4). Figure 3 2 Network Flow Representation We add an auxiliary patient who arrives at the end of the appointment session (i.e., node (m+1)) to represent the service provider's overtime. We also create a dummy patient arriving at the beginning of appointment session (i.e., node 0), her/his scheduled slot length determines the arrival of the first patient. In Kong et al. (2013), all the inflow to network is deterministic as patient no-show is not considered there. Author: Schedule Dependent No-Show Article submitted to Management Science; manuscript no. MS-0001-1922.65 16 In the rest of the paper, for ease of exposition we assume that each consultation duration uses exactly one appointment slot, i.e., ui = 13 Using this network structure, our problem can be reformulated as m m P P f (s, b(s)) = max (bi (s) si )yi + cI (n bi (s)) s0 y0 i=1 i=1 z0 y0 zi yi + yi1 zm+1 + ym y, z s.t. =1 = bi (s)ci , = cO + cI 0 i = 1, 2, ..., m (6) To ensure that the formula satisfies the two assumptions proposed in Section 3, we need to add a m P bi (s) m into the set of constraints in the cone. To see how this constraint redundant constraint i=1 helps to justify the two assumptions, we refer the readers to Appendix C. We deploy a two-stage stochastic optimization framework to solve the appointment scheduling problem in the case of schedule-independent show-up rate from the distributionally robust perspective. Specifically, we consider the following model: ( ) n o sup min E[f (s, b(s))] ss (7) + b(s)((s),(s)) is the expected sum of service provider's idle cost and overtime cost as well as where E[f (s, b)] patient's waiting cost in the second stage when a schedule s is given, and s is the set of constraints on the schedule in the first stage. The first step is to calculate worst case expected cost of the second stage problem. For any schedule s, we consider the maximization problem Zp (s) := sup n o E[f (s, b(s))] + b(s)((s),(s)) 4.1. (8) Reformulation We can apply our results from Section 3 to solve problem (8). We first present model (6) in a general form as follows: f (s, b(s)) = max P T + c0 (ki b(s))xi sT x + f T b(s) i s.t. Ax = cw b(s) Hx = d + sl = m 1Tm b(s) x0 To obtain an equivalent completely positive model as we showed in Section 3, we first define the following notation: T T ] X := E[x(b(s))x( p := E[x(b(s))] Y := E[x(b(s)) b(s) b(s)) 2 ss := E[sl (b(s)) ] y := E[b(s)sl (b(s))] yx := E[x(b(s))sl (b(s))] b(s) >] ps := E[sl (b(s))] (s) := E[b(s)] (s) := E[b(s) 3 Our method can be adopted to handle the case when service durations are random. Author: Schedule Dependent No-Show Article submitted to Management Science; manuscript no. MS-0001-1922.65 17 \u0001 Let K denote the matrix k1 k2 ... k2m+3 . Based on our results in Section 3, we consider the following completely positive program ZC (s) = max K Y T sT p + f T (s) + c0 1 (s)T pT ps (s) (s) Y T y s.t. p Y X yx T ps y yx T ss (AT A) X (AT A) X Hp T diag(HXH )\u0013 \u0012 \u0001 (s) 1Tm 1 ps\u0013 \u0012 \u0012 \u0013 \u0001 (s) y 1m T 1m 1 yT ss 1 cp 0 = AT ((1cw T ) Y ) = (cw cw T ) (s) = d = diag(ddT ) = m = m2 (9) We observe that our second stage problem is a maximization problem, while the first stage minimizes the total cost by making schedule decisions. In the literature, one approach to tackle such a min-max problem is to take the dual of the inner maximization problem, and as a result, reformulate the min-max problem as a min-min problem, so that the two stages of the problem can be combined into one. We use 0 R, 0 Rm , 0 Rmm to denote the dual variables corresponding to moment constraints; and 1 , 2 ,w(1) , w(2) , 1 , 2 to denote the dual variables corresponding to each of the equality constraints in (9). Let () be the operation of taking the diagonal matrix and Diag() be the operation that converts a vector to a diagonal matrix. Define 1 1 1 0 (0 + 1 1m )T w(1)T H 2 2 2 1 1 (0 + 1 1m ) 0 2 (cw cw T ) + 2 1m 1T 12 1 A (cw 1T ) 2 1m m 2 W := 1 T T T T (2) 1 H T w(1) ( A (c 1 )) A A( + ) + H Diag(w )H 0 1 w 1 2 2 2 1 T T 2 1m 0 2 2 1 0 0T 0T 12 sT 0 0 0 12 K 0 0 0 0 0 0 and C(s) := 0 1 s 1 KT 0 0 0 2 2 0 0T 0T 0T 0 Then the dual of the second stage problem can be written as ZD (s) = min 0 + (s)T 0 + (s) 0 + dT w(1) + dT Diag(w(2) )d + 1 m + 2 m2 + f T (s) + c0 s.t. W C(s) co 0 (10) Note that additional constraints on s can be added to the cone when required. The above conic programming model therefore provides a unified approach to study many different classes of appointment scheduling systems. Author: Schedule Dependent No-Show Article submitted to Management Science; manuscript no. MS-0001-1922.65 18 4.2. Iterative Procedure We use g to denote the scheduled arrival time of each patient. As we stated before, gi = s0 + s1 + ... + si1 , i = 1, .., m, where s0 represents the scheduled time slot for the dummy patient. We define L R(m+1)(m+1) and Li+1,j = 1, i = 1, ..., m; j = 1, ..., i. Assuming g0 = 0, we can then rewrite the relationship between s and g in matrix form as g = Ls Let p(t) denote a patient's show-up probability if he is scheduled to arrive at time t. For simplicity, we assume that the show-up probability depends solely on the time of arrival and does not depend on patient demographic features. In this case, for any given schedule s, > (s) = p(g), (s) = p(g)p(g) + Diag(p(g) (1m p(g))) We consider a linear function of show-up probability p(g) to illustrate our approach4 . Without loss of generality, we assume p(g) = a + bg, 0 g n, where p(g) is linearly increasing in g if b > 0 and decreasing otherwise. In this way, we can model (s) as a quadratic function in s which can be easily modelled using the conic approach. In the schedule-dependent case, the two stage problem is formulated as min 0 + (s)T 0 + (s) 0 + dT w(1) + dT Diag(w(2) )d + 1 m + 2 m2 + f T (s) + c0 s.t. W C(s) co 0 s s (11) Note that the model is non-convex due to the product term (s)T 0 and (s) 0 in the objective function. The constraint set, however, is still linear conic, which enables us to apply an iterative method to solve this problem. The main idea is to separate the two sets of decision variables ((s), (s)) and (0 , 0 ). We fix the value of one pair and solve the above linear conic programs to arrive at a local equilibrium solution in an iterative manner. Note that the objective value obtained this way decreases monotonically. To do that, our first step is to use the average show-up probability over all slots. We denote the corresponding first and second moments of this average probability as (0 , 0 ). This reduces the problem to the schedule-independent case. We can solve (12) to obtain an optimal schedule s0 and the corresponding 01 , 01 by setting ((s), (s)) = (0 , 0 ). We refer s0 as the static schedule because we use average show-up rates across time of the day. After that, we start the iteration 4 If p(g) is nonlinear in g, we can approximate p(g) by its gradient to apply this iterative method on more general show up function. Author: Schedule Dependent No-Show Article submitted to Management Science; manuscript no. MS-0001-1922.65 19 from 01 , 01 and incorporate time-dependent show-up probabilities into the model to generate a new pair of first and second moments, and then obtain the new schedule along the way. Specifically, after fixing (0 , 0 ) = (01 , 01 ), the next step is to obtain a new schedule s1 and the corresponding moments ((s1 ), (s1 )). In the case of p(g(s1 )) = a + bg(s1 ), 0 g(s1 ) n, the function of the first moment on patient's arrival time can be presented as (s1 ) = a1m + bg(s1 ), and the second moment can be simply written as (s1 ) = (s1 )(s1 )T + Diag((s1 ) (1m (s1 ))). We next solve the following quadratic programming problem \u0001T \u0001 min 0 + a1m + bg(s) 0 + (s)(s)T + Diag((s) (1m (s))\u0001 0 +dT w(1) + dT Diag(w(2) )d + 1 m + 2 m2 + f T a1m + bg(s) + c0 s.t. W C(s) co 0 s s (12) This model is still nonlinear, but the objective now has a quadratic form g(s)g(s)T and g(s) = Ls. We can reformulate this quadratic problem as a conic problem by replacing ss> = Zss , g(s)g(s)> = Zgg , and reformulate the first stage constraints in\u001aconic form. Specifically, in the \u001b case of continuous slot m P length, the first stage constraints s s = s Rm+1 | si = n, si 0 can be reformulated in + i=0 conic form as 1 sT g T T s Zss Zgs cp 0 g Zgs Zgg 1Tm+1 s = n T 2 1m+1 Zss 1m+1 \u0012 = \u0013n \u0001 s L Im+1 =0 g\u0013 \u0012 \u0012 \u0013 T \u0001 Zss Zgs \u0001T L I L I diag =0 m+1 m+1 Zgs Zgg (13) By solving (13) we can obtain a new schedule s1 and a corresponding show-up rate and moment conditions. This leads to the next iteration of our numerical procedure to obtain a new (0 , 0 ). 4.3. Fixed Slot Length In practice, healthcare appointment systems often adopt a scheduling template with fixed-length appointment slots (for example, each appointment slot is 20-minute long in many clinics). In that case, the length of \u001a appointment slot is no longer continuous. Instead, the constraint set s can be \u001b m P m+1 presented as s = s R+ | si = n, si {0, 1} , where si is a binary variable that indicates i=0 whether or not the ith patient is scheduled the same time as patient i + 1. If the ith patient is scheduled the same time as the i + 1st patient, si = 0; otherwise si = 1. By doing so, we obtain the number of patients scheduled at each appointment slot. Author: Schedule Dependent No-Show Article submitted to Management Science; manuscript no. MS-0001-1922.65 20 The binary constraint set adds more difficulties to the problem. Based on Burer (2009), which presents an equivalent completely positive representation for quadratic program with binary variables, we can capture the binary constraints using a completely positive program. Denote 1 sT sl T T Cs := cp 0 s Zss Y sl Y Zll 1Tm+1 s T 1m+1 Zss 1m+1 diag(Z ) \u0012 \u0013T \u0012ss \u0013 s ei sl ei \u0013\u0012 \u0013 \u0012 \u0013T \u0012 Zss Y T ei ei ei ei Y Zll =n = n2 =s . = 1, i = 1, i We can add Cs to constrain the set of feasible schedule s of model (10), and round the solution to the conic program to 0-1 solution for the schedule s. 5. Computational Studies We briefly introduce the choice of cost parameters used in our experiments. Three types of costs are involved in determining the optimal schedule: patient's waiting cost rate cw , service provider's idle cost rate cI , and overtime cost rate cO . What matters in the optimization is the ratios among these cost rates, but not their magnitudes. Thus we set cI = 1.0 without loss of generality. One classic way to measure the value of time is its opportunity cost, which is typically assumed to be the wage rate (Becker 1965). As a typical primary care physician's income is about $220,942 and the median personal income in the US is $24,062, we consider cw = 0.1 as a base case in our study. However, we vary cw [0.05, 0.5] to study the impact of different waiting costs. To set the overtime cost rate, we adopt the US federal government mandate that overtime salary rate should be at least 1.5 times of the regular time salary rate, and thus we set cO = 1.5. Note that the ranges of these cost parameters are also similar to those considered in the previous literature, e.g., Zacharias and Pinedo (2014). 5.1. Demonstration of the Iteration Method This section uses a simple example of a 12-slot clinic session to illustrate the iteration method developed above. We assume the consultation time is deterministic and each patient spends exactly one time slot with the service provider. We set cw = 0.1, cI = 1, cO = 1.5; and consider two patterns of show-up probabilities: the decreasing case p(t) = 0.9 0.8 t, t = 0, 1, , 12, 12 (14) in which patient show-up probabilities linearly decrease from 0.9 to 0.1 over time and the increasing case p(t) = 0.1 + 0.8 t, t = 0, 1, , 12, 12 (15) Author: Schedule Dependent No-Show Article submitted to Management Science; manuscript no. MS-0001-1922.65 Table 1 21 Performance of the Iterative Method under Decreasing Show-up Probabilities m 13 14 15 16 17 18 19 20 Static schedule 13.1657 14.4350 13.2988 12.3526 11.7105 11.0076 10.6449 10.7975 Iterative method 5.9862 5.5848 5.3458 5.2119 5.0513 5.0388 5.1173 5.2168 Improvement (%) 54.53 61.31 59.80 57.81 56.87 54.22 51.93 51.68 Table 2 Performance of the Iterative Method under Increasing Show-up Probabilities m 13 14 15 16 17 18 19 20 Static schedule 13.2002 13.2924 12.7287 12.1040 11.6465 11.8348 10.9509 10.7546 Iterative method 7.8485 7.7065 7.5828 7.4635 7.3813 7.8811 7.1959 7.1141 42.02 40.43 38.34 36.62 33.41 34.29 33.85 Improvement (%) 40.54 where patient show-up rates linearly increase from 0.1 to 0.9 over time. We vary the number of patients to be scheduled m {13, 14, . . . , 20} to test the impact of different overbooking levels. To implement the iterative method described in Section 4.2, we first use the average showup probability over all slots (which is 0.5 in both the increasing the decreasing cases) to get the corresponding optimal schedule. We call this schedule the static schedule because show-up probabilities are static over time. To iterate, we solve for the next schedule using patient show-up probabilities calculated based on the last obtained schedule. For each m, we perform 500 iterations. We report the average worse case expected cost in the last 100 iterations; see row 2 in Tables 1 and 2. We also show the worse case expected cost of the static schedule (that ignores schedule-dependent no-shows) while schedule-dependent no-shows indeed present and the percentage improvement made by the iterative method that considers schedule-dependent no-shows; see rows 1 and 3 in both tables. We observe that the iterative method converges quickly; the average coefficients of variation in the last 100 iterations is less than 0.1% for all cases we tested. In addition, the optimal schedule obtained under the iterative method makes a significant improvement (30% 60%) over the static schedule in terms of the worse case expected cost. 5.2. Analysis of Schedule Patterns In this section, we study how show-up probabilities and patient's waiting cost affect the optimal schedule. 5.2.1. Impact of Show-up Probabilities We assume that there are n = 12 slots in the clinic session and consider both increasing and decreasing show-up rates (14) and (15) described in Section 5.1. We use m = 18 to illustrate the impact of show-up probabilities on the optimal schedule. Figure 4 plots the optimal schedules under both sets of show-up probabilities as well as the optimal static schedule (assuming show-up probability is 0.5 throughout the day). The horizontal axis indicates the index of patients, and the vertical axis shows the scheduled arrival Author: Schedule Dependent No-Show Article submitted to Management Science; manuscript no. MS-0001-1922.65 22 time for each patient. As expected, the static schedule demonstrates an obvious pattern of \"front loading.\" In particular, the first four patients are scheduled to come at the beginning of the clinic session. This observation is consistent with the pattern of an optimal schedule under the total expected cost criteria when patient show-up probabilities are constant over time; see, e.g., Hassin and Mendel (2008). When show-up probabilities increase over time, we still observe a front-loading pattern in the optimal schedule but it is postponed. That is, a group of patients are asked to arrive together not at the beginning of the session but later in the day. This is likely due to the low show-up probabilities at the beginning of the session, and thus it makes more sense to have patients come later to avoid idle time of the service provider5 . This finding suggests that service providers may want to proactively delay the start of their service time if observing a high level of no-show probabilities at the beginning of the day. When show-up probabilities decrease over time, however, patients are in general scheduled to arrive later compared to the static schedule and the schedule under increasing show-up probabilities. This is because higher show-up probabilities at the beginning of the session are likely to build up wait lines of patients, if they were scheduled densely at the beginning of the session. As a result, the optimal schedule in this case tends to \"smooth\" the workload later into the day. 12 Arrival time 10 Decreasing Increasing Static 8 6 4 2 0 0 Figure 4 5.2.2. 2 4 6 8 10 Patient sequence 12 14 16 18 Optimal Schedules under Different Show-up Probabilities Schedule under Different Waiting Costs In this section, we vary the waiting costs from 0.1 to 0.5 to examine its effect on the optimal schedule. This analysis sheds light on the design of appointment templates for patient populations with different valuations of waiting. Similar to the previous section, we fix n = 12, m = 18 and consider both increasing and decreasing show-up probabilities (14) and (15). Figures 5a and 5b show the optimal schedules under these two sets 5 Note that the service provider comes together with the first patient. Author: Schedule Dependent No-Show Article submitted to Management Science; manuscript no. MS-0001-1922.65 23 of show-up probabilities, respectively. A close comparison of these two figures reveals that under decreasing show-up probabilities, a larger waiting cost rate leads to less front-loading. In contrast, with increasing show-up probabilities, a larger waiting cost rate results in more significant frontloading. One intuitive explanation is that as the waiting cost gets higher, we want to avoid patient waiting by possibly reducing the \"expected\" number of patients who show up. Thus, the optimal schedule tends to assign more patients to appointment slots with lower show-up probabilities. As a result, when waiting cost increases, fewer patients are assigned early in the session when show-up probabilities peak at the beginning the session; but more patients get assigned early in the session when show-up probabilities bottom at the beginning of the session. 12 12 cw=0.1 cw=0.2 cw=0.3 cw=0.4 cw=0.5 Arrival time 8 8 6 4 2 2 2 4 6 8 10 Patient sequence 12 14 16 (a) Decreasing Show-up Rates Figure 5 5.3. 6 4 0 0 cw=0.1 cw=0.2 cw=0.3 cw=0.4 cw=0.5 10 Arrival time 10 18 0 0 2 4 6 8 10 Patient sequence 12 14 16 18 (b) Increasing show-up rate Schedule under Different Waiting Costs Value of Incorporating Schedule-Dependent No-Shows Our optimal schedule is determined based on minimizing the worst case total cost. It is therefore of natural interest to examine its performance in terms of the total expected cost. To do so, we run simulations (10, 000 replications) to calculate the expected total costs under our robust optimal schedule (which takes into account schedule-dependent no-shows) and the static schedule (which ignores that). Similar to above, we consider two patterns of show-up probabilities (14) and (15). Table 3 shows the results associated with the decreasing show-up rates (14). The first two rows show the total expected costs under the static schedule and the robust optimal schedule over a range of overbooking levels. The third row represents the average cost reduction percentage. We observe a significant improvement (3.24% to 54.92% reduction) in total expected cost due to explicit consideration of schedule-dependent no-shows. Furthermore, when the overbooking level increases, the improvement becomes more significant. In Section 5.2.2, we note that the static schedule assigns quite a few patients early in the session when the overbooking level is high (m = 18). As this does not recognize the fact that show-up rates are higher (than average) at the beginning of the session, it leads to more waiting costs and thus larger total expected costs. Therefore, the cost difference becomes more significant when more patients need to be scheduled. Author: Schedule Dependent No-Show Article submitted to Management Science; manuscript no. MS-0001-1922.65 24 Table 4 shows the results under increasing show-up rates (15). We see a significant improvement (12.37% to 27.04%) in total expected cost by explicitly taking into account the impact of patient increasing show-up rates. In contrast to the case of decreasing show-up rates, the cost reduction is higher when the overbooking level is lower. As discussed in Section 5.2.1, patients will be postponed to come under increasing show-up probabilities. With fewer patients to be scheduled, such a postponement is likely to be more effective. If, however, there are a larger number of patients to be scheduled, postponement will lead to longer patient wait later in the day and thus has a smaller room to make improvement. Table 3 Comparison of Total Expected Costs under Decreasing Show-up Rates m 13 14 15 16 17 18 19 20 Static schedule 5.2443 5.3194 5.5416 5.8990 6.5564 7.4330 8.6695 10.0161 Iterative method 5.0742 4.8016 4.7510 4.7976 4.5542 4.1783 4.2586 4.5149 Improvement (%) 3.24 9.73 14.27 18.67 30.54 43.79 50.88 54.92 Table 4 Comparison of Total Expected Costs under Increasing Show-up Rates m 13 14 15 16 17 18 19 20 Static schedule 8.0677 7.9187 7.5349 7.1644 6.8241 6.4858 6.1691 5.9093 Iterative method 5.9578 5.7775 5.6014 5.4852 5.3661 5.2704 5.1990 5.1784 Improvement (%) 26.15 27.04 25.66 23.44 21.37 18.74 15.73 12.37 5.4. Case Studies In this section, we apply our methodology to two case studies inspired by real data. In particular, we consider designing appointment templates under patient show-up probabilities found in our US and Chile datasets. We are interested in fixed-interval-length appointment templates. Since the schedule proposed by the copositive model is fractional, we develop a rounding heuristic for generating binary schedules during the iterations, i.e., to get a fix-interval-length appointment template in each step of the iterative method (see the Appendix D). To evaluate the performance of our rounding heuristics, we benchmark our binary schedules with those presented in Zacharias and Pinedo (2014) that provide the lowest total expect cost. Similar to their setup, we let cI = 1, cO = 1.5,cw = 0.1, n = 12, m = 18; and vary patient no-show probability from 0 to 0.8.6 See Appendix E for detailed schedules. For ease of cost comparison, Figure 6 illustrates the total expected total cost, waiting cost, overtime cost and idle cost under our schedules and those in Zacharias and Pinedo (2014). We see that even though our model focuses on the worst case perspective, the resulting schedules perform 6 Note that the patient show-up probability here is constant over time of day. Author: Schedule Dependent No-Show Article submitted to Management Science; manuscript no. MS-0001-1922.65 25 fairly close to those in Zacharias and Pinedo (2014) in terms of total expected cost. Furthermore, our model replicates the key insights obtained by Zacharias and Pinedo (2014) that there exists an optimal level of no-show probability to achieve the lowest total expected cost for a given number of patients to be scheduled. These results indicate that our rounding heuristics can generate fixinterval-length appointment templates that are near optimal in the sense of total expected cost. 12 OptimalTotal OptimalWaiting OptimalOvertime OptimalIdle HeuristicTotal HeuristicWaiting HeuristicOvertime HeuristicIdle 10 Expected Cost 8 6 4 2 0 0 0.1 Figure 6 5.4.1. 0.2 0.3 0.4 No Show Probability 0.5 0.6 0.7 0.8 Cost Decomposition under Different Show-up Probabilities The US Case Instead of modeling each single day, we consider the overall range of show-up probabilities in the US dataset. From the predictive analysis in Section 2, the show-up probability ranges approximately between 0.8 to 0.4. As service providers often take a lunch break at noon, it makes sense to consider a half day. Specifically, we consider the morning sessions in which patient show-up rates decrease in most days in the US dataset. We assume that patient show-up probabilities linearly decrease from 0.8 to 0.4. In our calculations, we use the optimal schedule obtained from Zacharias and Pinedo (2014) with a static show-up probability 0.6 as the starting point of the iteration method. Assuming cw = 0.1, cI = 1, cO = 1.5, we apply the iterative method and rounding heuristics described previously to get the robust optimal schedule. We then use simulations to estimate the corresponding total expected costs, and compare these costs with those under the static schedule derived based on Zacharias and Pinedo (2014) and assuming a constant no-show probability over time. Table 5 presents the detailed results. We observe a significant reduction (7%-13%) in total expected costs when schedules are obtained by incorporating the schedule-dependent no-show behaviors. More importantly, this improvement is relatively insensitive to the waiting cost rate and overbooking level, suggesting that service Author: Schedule Dependent No-Show Article submitted to Management Science; manuscript no. MS-0001-1922.65 26 Table 5 cw = 0.05 Static Iterative cw = 0.1 Static Iterative cw = 0.15 Static Iterative cw = 0.2 Static Iterative cw = 0.25 Static Iterative cw = 0.3 Static Iterative cw = 0.4 Static Iterative cw = 0.5 Static Iterative cw = 0.6 Static Iterative Total Cost 3.3715 3.1381 Total Cost 4.0613 3.6167 Total Cost 4.2574 3.8740 Total Cost 4.7362 4.2864 Total Cost 5.2880 4.6898 Total Cost 5.0494 4.5597 Total Cost 5.7294 5.0577 Total Cost 5.6311 4.9934 Total Cost 5.5526 4.8491 Case Study Results for the US data Improvement 1 3 6.92% 1 Improvement 1 3 10.95% 1 Improvement 1 2 9.01% 1 Improvement 1 2 9.5% 1 Improvement 1 2 9.5% 1 Improvement 1 2 9.7% 1 Improvement 1 2 11.72% 1 Improvement 1 2 11.33% 1 Improvement 1 2 12.67% 1 2 2 2 2 1 1 2 2 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 3 1 1 3 2 2 3 1 2 3 2 2 3 2 2 3 1 1 3 1 1 3 1 1 3 1 1 4 2 2 4 1 1 4 1 1 4 1 1 4 1 1 4 2 2 4 2 2 4 1 1 4 1 1 5 1 1 5 2 2 5 2 2 5 2 2 5 2 2 5 1 1 5 1 1 5 1 2 5 1 1 6 2 2 6 1 1 6 1 1 6 1 1 6 1 1 6 1 1 6 1 1 6 2 1 6 1 1 7 1 2 7 1 2 7 2 1 7 1 1 7 1 1 7 2 2 7 1 2 7 1 1 7 1 2 8 2 1 8 2 1 8 1 2 8 2