Case TH

Background

You work for TH, a large, for-profit managed health care company. The division within the company that you work for specializes in pharmacy benefit management (PBM). According to the American Pharmacists Association, "PBMs are primarily responsible for developing and maintaining the formulary, contracting with pharmacies, negotiating discounts and rebates with drug manufacturers, and processing and paying prescription drug claims. For the most part, they work with self-insured companies and government programs striving to maintain or reduce the pharmacy expenditures of the plan while concurrently trying to improve health care outcomes." [1]

For this case, you will focus specifically on drug costs related to Medicare Part D. Medicare Part D (also called the Medicare prescription drug benefit) subsidizes the cost of prescription drugs and drug insurance premiums for Medicare beneficiaries in the United States.

The following diagram illustrates the role of PBM in the prescription fulfillment process. The PBM helps by working with members who have Medicare coverage and need pharmaceutical drugs, the pharmacy that fills drug prescriptions and the government that helps pay for the drugs. [2]

[Diagram: Members, PBM (also acting as an insurance provider), Pharmacy and Government, linked by numbered information flows 1 through 7.]

[1] "Pharmacy Benefit Management," American Pharmacists Association website, http://www.pharmacist.com/sites/default/files/files/Profile_26%20PBM%20Final%20071213.pdf, accessed May 30, 2017.

[2] This diagram is simplified by assuming the PBM and the insurance provider are the same entity. This is common in practice but not uniform because some providers contract with a PBM. For purposes of this case, assume that the PBM and the insurance provider are the same.

Adapted from Analytics mindset case studies 2017 Ernst & Young Foundation (US). All Rights Reserved.

Information flows:

1. Members pay a premium, usually monthly, to the PBM as part of Medicare.
2. When members need drugs, they go to their doctor, receive a prescription and take their prescription to the pharmacy to have it filled.
3. The pharmacy receives the prescription information from the member and, using the member's medical ID card, contacts the PBM to find out if the drug prescribed is covered by the member's insurance.
4. The PBM tells the pharmacy if the drug is covered by the insurance plan and how much the member must pay as a co-pay.
5. The member pays the co-pay amount to the pharmacy and receives their prescription.
6. The PBM submits claims to the government for reimbursement through a formatted claims file. The government pays the PBM.
7. Sometime later during the period, the PBM reviews all of the pharmacy's drug disbursements and sends the pharmacy a check for the amount that the PBM and the pharmacy previously agreed to contractually as payment for each drug.

It is important to note several things about this information flow. The PBM adds value in a few ways. First, since the PBM works with many different members, the PBM is able to negotiate with the pharmacy to receive discounted drug prices. That is, the PBM negotiates with all pharmacies and because of the projected volume of purchasing, the pharmacies are willing to give the PBM, and thus the member, a discount. Second, the PBM manages the information around (and helps determine) what is covered by insurance.

As already mentioned, the PBM is a for-profit entity. To make money, the PBM must have revenues that exceed the gross costs of the drugs prescribed to members, plus other business costs (e.g., salaries, administration costs). In this arrangement, on a short-term basis, the PBM has a limited ability to control the revenues and gross costs of the drugs, but it can control other business costs. Thus, it becomes very important for the PBM to understand revenues and gross drug costs so it can make informed business decisions about the other business costs.
Appendix

Data descriptions

RecordID - Primary key from the database that is a unique number for each row of data
MemberID - A unique ID for each different member
Month - The month to which the data pertains, listed in numeric format as 1 for January, 2 for February, etc.
GrossDrugCost - The total amount of drug costs incurred by a member during the corresponding month
NLISDummy - A dummy variable that takes the value of 1 if the member is listed as non-low income by the government and 0 otherwise
LISCHOSERDummy - A dummy variable that takes the value of 1 if the member chose a specific plan and 0 if the member automatically was assigned a plan, i.e., members automatically are assigned (thus, LISCHOSERDummy = 0), but some members take the initiative to choose a specific drug plan other than one assigned (thus, LISCHOSERDummy = 1)
RiskScore - A score assigned by the government based on previous government data indicating how sick someone is, higher scores indicate members are sicker
SpecialtyDummy - A dummy variable that takes the value of 1 if the member utilizes specialty drugs and 0 otherwise
AdjudicationDays - The number of non-holiday workdays in a month
Age - The age of each member during the month
Gender - A dummy variable that takes the value of 1 if the member is female and 0 if the member is male
FrailtyDummy - A dummy variable that takes the value of 1 if the government indicates the member is frail and 0 if the government indicates the member is not frail
HospiceDummy - A dummy variable that takes the value of 1 if the member is receiving hospice care and 0 if they are not
InstitutionDummy - A dummy variable that takes the value of 1 if the member is receiving institutionalized long-term care (e.g., hospital, nursing facility) and 0 if they are not
ESRDDummy - A dummy variable that takes the value of 1 if the member is receiving care for end-stage renal disease (i.e., end-stage kidney disease) and 0 if they are not

Part I (54%)

Your role at TH is to forecast the monthly gross drug costs. To do this, you have been given six months' worth of data about previous drug costs, including several factors that are expected to influence the gross drug costs. See the appendix for a detailed description of the data. Your job is to investigate which factors are important for determining the gross drug costs and create a model that predicts drug costs so you have better information to manage your other business costs.

You recognize that the best tool for this analysis is regression, which is designed to determine the relationship between an independent variable (or multiple independent variables) and a dependent variable (in this case, the monthly gross drug costs). Before asking you to forecast the gross drug costs, TH had several other employees attempt the same task. Below, you will find three models they tried, as well as corresponding questions you need to answer about the different models.

Upon receiving this assignment from your boss, you recognize that you will need to exercise an analytics mindset to respond to his request. As a reminder, an analytics mindset is the ability to:

- Ask the right questions
- Extract, transform and load relevant data
- Apply appropriate data analytics techniques
- Interpret and share the results with stakeholders

Required

Using the complete data provided (Case_TH.txt), estimate the gross drug costs using each model and answer the questions. Note that you should not change anything in the data (e.g., remove outliers).

Model 1: GrossDrugCost = B0 + B1 * RiskScore + ε

1. Provide a statistical interpretation of the coefficient, standard error, T-stat and P-value for the RiskScore variable. Provide a practical explanation of the information to senior management.
2. Provide a statistical interpretation of the coefficient and P-value for the intercept. Provide a practical explanation of the information to senior management.
3. Provide a statistical interpretation of the adjusted R-squared value. Explain what it means in a statistical way and provide a practical explanation of the information to senior management.
4. A coworker wants to know what the predicted gross drug costs would be for a new member. The new member is a 73-year-old man who the government classifies as frail and he has a risk score of 510. Using the model above, what would you predict the gross drug costs will be?
5. Based on the data in the last problem, what range of values would make you 95% confident that the range represents the actual gross drug costs of the new member?

Model 2: GrossDrugCost = B0 + B1 * RiskScore + B2 * Age + B3 * Gender + ε

6. Provide a statistical interpretation of the coefficient and P-value for the gender variable. Provide a practical explanation of the information to senior management.
7. Provide a statistical interpretation of the coefficient and P-value for the age variable. Provide a practical explanation of the information for senior management.
8. Explain why the coefficient, standard error and T-stat for the RiskScore variable are different in this model than they were in Model 1.
9. Provide a statistical interpretation of the coefficient and P-value for the intercept. Provide a practical explanation of the information to senior management.
10. Compare the adjusted R-squared values between Models 1 and 2. Are they the same or different? Why? What could you conclude about the differences (if any) in the adjusted R-squared values?
11. Senior management wants to know the expected gross drug costs of the average customer. That is, for the median value of the RiskScore, age and gender, what would you expect the average gross drug costs to be? What is the 95% confidence interval for this estimate?
12. Which independent variable has the largest effect on the gross drug costs?
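
A minimal Python sketch of the quantities the Model 1 questions ask about (coefficient, standard error, T-stat, adjusted R-squared, a point prediction and an approximate 95% range) follows. It runs on synthetic data, not Case_TH.txt. The function names, the simulated coefficients and the use of the normal critical value 1.96 in place of the exact t value are assumptions made for illustration only.

```python
import math
import random


def fit_simple_ols(x, y):
    """Fit y = b0 + b1 * x by ordinary least squares and return summary statistics."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    s2 = sse / (n - 2)  # residual variance with n - 2 degrees of freedom
    se_b1 = math.sqrt(s2 / sxx)
    r2 = 1 - sse / sst
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - 2)
    return {
        "b0": b0, "b1": b1, "se_b1": se_b1, "t_b1": b1 / se_b1,
        "adj_r2": adj_r2, "s2": s2, "n": n, "mean_x": mean_x, "sxx": sxx,
    }


def predict_with_interval(fit, x_new, z=1.96):
    """Point prediction plus an approximate 95% prediction interval for one new member.

    Assumption: z = 1.96 (normal approximation), reasonable only for large samples.
    """
    y_hat = fit["b0"] + fit["b1"] * x_new
    se_pred = math.sqrt(
        fit["s2"] * (1 + 1 / fit["n"] + (x_new - fit["mean_x"]) ** 2 / fit["sxx"])
    )
    return y_hat, y_hat - z * se_pred, y_hat + z * se_pred


# Synthetic stand-in for the RiskScore and GrossDrugCost columns (values are made up).
random.seed(0)
risk = [random.uniform(50, 600) for _ in range(500)]
cost = [40 + 1.5 * r + random.gauss(0, 60) for r in risk]

fit = fit_simple_ols(risk, cost)
y_hat, low, high = predict_with_interval(fit, 510)
print(f"b1={fit['b1']:.3f} se={fit['se_b1']:.3f} t={fit['t_b1']:.1f} adjR2={fit['adj_r2']:.3f}")
print(f"prediction at RiskScore 510: {y_hat:.1f} (approx. 95% range {low:.1f} to {high:.1f})")
```

With the real data, the two lists would be filled from the RiskScore and GrossDrugCost columns. Model 2 adds Age and Gender, so it needs a multiple-regression routine rather than this single-predictor formula.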
Model 3: GrossDrugCost = B0 + B1 * RiskScore + B2 * SpecialtyDummy + B3 * RiskScore * SpecialtyDummy + ε

13. Interpret the output for senior management.

Part II (16%)

For Part I, you assumed that regression was the appropriate analytics tool and that the data did not violate any regression assumptions or otherwise did not prove to be problematic. Now, in Part II, you will test the data to evaluate whether these assumptions were warranted. For this part, use the following model:

- GrossDrugCost = B0 + B1 * NLISDummy + B2 * LISCHOSERDummy + B3 * RiskScore + B4 * SpecialtyDummy + B5 * AdjudicationDays + B6 * Age + B7 * Gender + B8 * HospiceDummy + B9 * InstitutionDummy + B10 * ESRDDummy + ε

Required

Please examine the data/model and answer the questions below.

Potential regression assumption or problem

1. Are there outliers in the data?
2. Is multicollinearity a problem?
3. Are there violations of the homoscedasticity assumption?
4. Are the residual errors normally distributed?

Part III (30%)

Now that you have considered models developed by other employees, your task is to create the best model you can for predicting the gross drug costs. For purposes of this assignment, the best models are those that have the smallest prediction errors for future realized drug costs. That is, you will use the model you developed for the six months' worth of data provided and predict the gross drug costs for the next month using your model coefficients.

Required

Create your best regression model based on the data to predict the future gross drug costs. Clearly state the estimated model in answering this part of the case. Submit a memo that contains the standard regression output (coefficients, T-stats, P-values, adjusted R-squared, etc.).

- Explain to senior management the main insights that you learned from your model that will be valuable in operating the PBM business.
- Also include any cautions that senior management should understand about using this model.
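
For the Part II multicollinearity question and the Part III goal of small prediction errors on a future month, one possible approach is sketched below in Python: a variance inflation factor (VIF) for each predictor, and a holdout test that fits on months 1 to 5 and scores on month 6. The data are synthetic, and every function name and simulated coefficient is hypothetical. The holdout split is a suggestion, not something the case prescribes.

```python
import math
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(rows, y):
    """Return [b0, b1, ...] for y = b0 + b1 * x1 + ... via the normal equations."""
    x = [[1.0] + list(r) for r in rows]
    k = len(x[0])
    xtx = [[sum(xi[p] * xi[q] for xi in x) for q in range(k)] for p in range(k)]
    xty = [sum(xi[p] * yi for xi, yi in zip(x, y)) for p in range(k)]
    return solve_linear_system(xtx, xty)


def predict(beta, row):
    return beta[0] + sum(b * v for b, v in zip(beta[1:], row))


def r_squared(rows, y):
    beta = fit_ols(rows, y)
    mean_y = sum(y) / len(y)
    sse = sum((yi - predict(beta, r)) ** 2 for r, yi in zip(rows, y))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - sse / sst


def vif(rows, j):
    """VIF of column j: 1 / (1 - R^2 from regressing column j on the other columns)."""
    target = [r[j] for r in rows]
    others = [[v for i, v in enumerate(r) if i != j] for r in rows]
    return 1 / (1 - r_squared(others, target))


def rmse(beta, rows, y):
    """Root mean squared prediction error of a fitted model on the given rows."""
    return math.sqrt(sum((yi - predict(beta, r)) ** 2 for r, yi in zip(rows, y)) / len(y))


# Synthetic six months of data: columns are RiskScore, Age, SpecialtyDummy (values made up).
random.seed(1)
data = []
for month in range(1, 7):
    for _ in range(100):
        risk = random.uniform(50, 600)
        age = random.uniform(65, 95)
        specialty = 1 if random.random() < 0.1 else 0
        cost = 30 + 1.2 * risk + 2.0 * age + 900 * specialty + random.gauss(0, 80)
        data.append((month, [risk, age, specialty], cost))

train_rows = [r for m, r, c in data if m <= 5]
train_y = [c for m, r, c in data if m <= 5]
test_rows = [r for m, r, c in data if m == 6]
test_y = [c for m, r, c in data if m == 6]

beta = fit_ols(train_rows, train_y)
vifs = [vif(train_rows, j) for j in range(3)]
holdout_rmse = rmse(beta, test_rows, test_y)
print("coefficients:", [round(b, 2) for b in beta])
print("VIFs:", [round(v, 3) for v in vifs])
print("month-6 holdout RMSE:", round(holdout_rmse, 1))
```

A VIF near 1 suggests a predictor is nearly uncorrelated with the others; a common rule of thumb treats values above about 5 or 10 as a warning sign. Comparing candidate models by holdout RMSE rather than by in-sample fit is closer to the Part III criterion of predicting the next month.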