
Question



INSTRUCTIONS:

1. For this assignment you have to execute and interpret one regression model. The objective is to determine how much would be saved if 10% of complicated appendicitis cases were prevented, meaning they were instead "non-complicated." The dependent variable is the total charge associated with the inpatient episode (TCHGS). The "treatment variable" of interest is complicated versus uncomplicated appendicitis. Pay special attention to it when discussing the results and provide an explanation for the outcomes between the models. Is it statistically significant? Why is it positive or negative, and does this make sense? In addition to the main variable of interest, don't forget to discuss (very briefly) the other explanatory variables. In the model, in addition to the treatment variable, control for the influence of the following confounding variables (specifications are discussed in the short video): age, race, gender, ethnicity, insurance status, and severity. In your results section, provide both (a) the difference in total cost per case, e.g. complicated vs. non-complicated, and (b) the total savings if 10% of complicated cases were instead treated as non-complicated. To figure out how many cases are complicated versus non-complicated, you can use a frequency statement as shown below.

2. Which data set to use? Appendicitis.SAS7BDat.

3. Turn in a structured abstract (no more than 300 words, not including the title and section headers, so 307 in total) with the format shown below.
   a. A point will be deducted if you exceed the word limit. Adhering to the word limit is essential as it forces you to get to the point; at the same time you have to be careful not to omit anything of importance.
      i. Do not include tables or graphs in the abstract (those would go in the main text or presentation, if you were to do one)! Determine which values from your test(s) are most important to the reader and report those in the abstract.
      ii. Since the word limit applies to everyone equally, no exceptions will be made!!!

4. What counts as important or relevant information is by definition, at least partially, subjective. However, there are certain rules that have to be followed.
   i. For example, in the results section you should definitely discuss the significance of your test statistic (at minimum you should provide the p-value associated with the treatment variable). If you do not, at least one point will be deducted (yes, this is a big oversight). Given the word limit, you may want to group the other variables and report them as such.
   ii. In the methods section, provide a rationale for using the specific method.

5. Special importance is attached to the discussion section of the abstract (see below). This is where you have to apply your creativity as a storyteller. This is where you help make information out of data as it involves the reader. Try to convey why the hypothesis and the analysis are important (if you personally don't care, ask yourself why anyone would). Imagine how and what decisions could be affected by the analysis and findings and communicate this to your intended audience. Simple repetitive statements already expressed elsewhere in the abstract will result in deductions, so do not just repeat the conclusions!

Abstract

Objective: This contains a brief statement describing the general objective of the analysis. For example, "The objective of this analysis is to examine the impact of ..." Side note: each independent variable included in the model is associated with a hypothesis. For example, a hypothesis (in alternative form) is that Hispanics have a lower percentage of charge associated with labs because (you fill in the blank). Given your word limit, you should just focus on the treatment variable.

Data and methods: This contains a brief description of the particular dataset used to test the hypothesis. For example, concerning the dataset, briefly mention the unit of measurement, whether the data is publicly available, and what types of variables it includes. You should also discuss what you are controlling for (e.g. age).

Results: This contains a brief layman's description of the least squares results and whether or not each variable (or group of variables) was significant.

Conclusions: A very brief statement indicating your conclusions pertaining to the hypothesis - did your results reject the null hypothesis or not?

Discussion: This is perhaps the most important section. Here you tell the story of why you believe the analysis was important - why is it "information" and not just "data"? Why would anyone within related professional or policy circles find your analysis and results interesting? How could your analysis help such individuals in terms of decision making? You may want to select one of the listed independent variables and assume that it is your treatment variable in an experiment. This will help focus your abstract. As part of the discussion, indicate whether you would have liked to test the influence of additional variables if they had been available.

*************************Data output of SAS FOR LINEAR REGRESSION **************************

Dependent Variable is Operating room Total charges

The REG Procedure

Model: MODEL1

Dependent Variable: TCHGS TCHGS

Number of Observations Read 47460
Number of Observations Used 47460

Analysis of Variance
Source              DF    Sum of Squares    Mean Square    F Value    Pr > F
Model                8       1.842725E13    2.303406E12    2473.72    <.0001
Error            47451       4.418411E13      931152403
Corrected Total  47459       6.261136E13

Root MSE             30515    R-Square    0.2943
Dependent Mean       50896    Adj R-Sq    0.2942
Coeff Var         59.95464
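The fit statistics above can be cross-checked from the ANOVA rows. The following is a minimal sketch, assuming the standard least-squares identities (F = MS Model / MS Error, R-square = SS Model / SS Total); the variable names are illustrative and are not part of the SAS output.

```python
import math

# Values copied from the Analysis of Variance table above.
ss_model = 1.842725e13
ss_error = 4.418411e13
ss_total = 6.261136e13
df_model, df_error = 8, 47451
dependent_mean = 50896  # printed rounded in the SAS output

ms_model = ss_model / df_model          # mean square for the model
ms_error = ss_error / df_error          # mean square error
f_value = ms_model / ms_error           # overall F statistic
r_square = ss_model / ss_total          # share of variance explained
# Adjusted R-square penalizes for the number of regressors.
adj_r_square = 1 - (1 - r_square) * (df_model + df_error) / df_error
root_mse = math.sqrt(ms_error)
coeff_var = 100 * root_mse / dependent_mean

print(f"F = {f_value:.2f}, R-square = {r_square:.4f}, Adj R-sq = {adj_r_square:.4f}")
print(f"Root MSE = {root_mse:.0f}, Coeff Var = {coeff_var:.2f}")
```

Small differences in the last digits relative to the SAS listing are expected, because the inputs here are the rounded values printed in the table.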

Parameter Estimates
Variable         Label            DF  Parameter Estimate  Standard Error  t Value  Pr > |t|
Intercept        Intercept         1              271956      2177.14211   124.91    <.0001
AGE              AGE               1            91.20939         7.26987    12.55    <.0001
black                              1          5684.95352       470.91645    12.07    <.0001
otherNW                            1          6813.95711       425.26911    16.02    <.0001
female                             1           311.73308       284.31449     1.10    0.2729
hispanic                           1          2457.95591       342.49490     7.18    <.0001
uninsured                          1          1136.03849       366.21885     3.10    0.0019
complicated      complicated       1               11983       313.34904    38.24    <.0001
SeverityOverall  SeverityOverall   1             -237142      2110.40017  -112.37    <.0001
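For the treatment variable, the t value is the parameter estimate divided by its standard error. The sketch below reproduces it and adds an approximate 95% confidence interval. The 1.96 multiplier is an assumption (the normal approximation, reasonable with 47,451 error degrees of freedom); the interval is not part of the SAS output shown here.

```python
# Values copied from the "complicated" row of the Parameter Estimates table.
estimate = 11983.0       # adjusted difference in total charges (TCHGS)
std_error = 313.34904

t_value = estimate / std_error

# Approximate 95% interval using the normal critical value (assumption).
z_crit = 1.96
ci_low = estimate - z_crit * std_error
ci_high = estimate + z_crit * std_error

print(f"t = {t_value:.2f}")
print(f"approx. 95% CI: ({ci_low:.0f}, {ci_high:.0f})")
```

Under these assumptions, the estimate reads as follows: holding the other covariates fixed, a complicated case is associated with roughly 12,000 more in total charges than an uncomplicated one.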

Logistic regression with Dependent variable =Emergency

The LOGISTIC Procedure

Model Information
Data Set WORK.APPEND
Response Variable postOperative postOperative
Number of Response Levels 2
Model binary logit
Optimization Technique Fisher's scoring

Number of Observations Read 28736
Number of Observations Used 28736

Response Profile
Ordered Value    postOperative    Total Frequency
            1                1               1848
            2                0              26888

Probability modeled is postOperative='1'.

Model Convergence Status
Convergence criterion (GCONV=1E-8) satisfied.

Model Fit Statistics
Criterion    Intercept Only    Intercept and Covariates
AIC               13718.520                   10969.309
SC                13726.786                   11043.702
-2 Log L          13716.520                   10951.309

Testing Global Null Hypothesis: BETA=0
Test                Chi-Square    DF    Pr > ChiSq
Likelihood Ratio     2765.2111     8        <.0001
Score                4864.3455     8        <.0001
Wald                 1982.3999     8        <.0001
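The likelihood ratio chi-square and the information criteria follow directly from the -2 Log L values. This is a minimal sketch, assuming the usual definitions AIC = -2 Log L + 2k and SC = -2 Log L + k ln(n), where k is the number of estimated parameters and n the number of observations used; the variable names are illustrative.

```python
import math

# Values copied from the Model Fit Statistics table above.
neg2ll_null = 13716.520   # intercept only
neg2ll_full = 10951.309   # intercept and covariates
n_obs = 28736
k_null, k_full = 1, 9     # intercept; intercept plus 8 covariates

# Likelihood ratio statistic: drop in -2 Log L when covariates are added.
lr_chi_square = neg2ll_null - neg2ll_full
lr_df = k_full - k_null

aic_full = neg2ll_full + 2 * k_full
sc_full = neg2ll_full + k_full * math.log(n_obs)

print(f"LR chi-square = {lr_chi_square:.3f} on {lr_df} df")
print(f"AIC = {aic_full:.3f}, SC = {sc_full:.3f}")
```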

Analysis of Maximum Likelihood Estimates
Parameter          DF    Estimate    Standard Error    Wald Chi-Square    Pr > ChiSq
Intercept           1      7.5077            0.3847           380.8706        <.0001
AGE                 1      0.0101           0.00227            19.9878        <.0001
black               1      0.1945            0.0808             5.7930        0.0161
otherNW             1     -0.1207            0.0913             1.7481        0.1861
female              1     -0.4945            0.0558            78.4029        <.0001
hispanic            1     -0.1497            0.0716             4.3635        0.0367
uninsured           1     -0.0698            0.0639             1.1918        0.2750
SeverityOverall     1    -11.3226            0.3626           975.2230        <.0001
complicated         1      1.1893            0.0551           466.5174        <.0001

Odds Ratio Estimates
Effect             Point Estimate    95% Wald Confidence Limits
AGE                         1.010         1.006         1.015
black                       1.215         1.037         1.423
otherNW                     0.886         0.741         1.060
female                      0.610         0.547         0.680
hispanic                    0.861         0.748         0.991
uninsured                   0.933         0.823         1.057
SeverityOverall            <0.001        <0.001        <0.001
complicated                 3.285         2.949         3.659
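Each odds ratio above is the exponentiated logit coefficient, and the Wald limits are the exponentiated endpoints of estimate ± z × standard error. The sketch below illustrates this for two rows. It uses z = 1.96 and the rounded estimates printed in the listing, so the last digit may differ slightly from SAS; the dictionary and function names are hypothetical.

```python
import math

# (estimate, standard error) copied from the maximum likelihood table above.
coefficients = {
    "complicated": (1.1893, 0.0551),
    "female": (-0.4945, 0.0558),
}

def odds_ratio_with_ci(estimate, std_error, z_crit=1.96):
    """Return (odds ratio, lower limit, upper limit) on the odds scale."""
    point = math.exp(estimate)
    lower = math.exp(estimate - z_crit * std_error)
    upper = math.exp(estimate + z_crit * std_error)
    return point, lower, upper

for name, (est, se) in coefficients.items():
    point, lower, upper = odds_ratio_with_ci(est, se)
    print(f"{name}: OR = {point:.3f}, 95% Wald CI = ({lower:.3f}, {upper:.3f})")
```

Read this way, the odds of postOperative = 1 are roughly 3.3 times higher for complicated cases, other covariates held fixed.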

Association of Predicted Probabilities and Observed Responses
Percent Concordant        85.6    Somers' D    0.712
Percent Discordant        14.4    Gamma        0.712
Percent Tied               0.0    Tau-a        0.086
Pairs                 49689024    c            0.856
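The association statistics can be reconstructed from the response profile and the concordance percentages. This is a sketch under the standard rank-correlation definitions (pairs = events × non-events, Somers' D = (concordant - discordant) / pairs, c = (concordant + 0.5 × tied) / pairs, Tau-a over all N(N-1)/2 observation pairs); the variable names are illustrative.

```python
# Values copied from the Response Profile and association tables above.
n_events = 1848        # postOperative = 1
n_nonevents = 26888    # postOperative = 0
pct_concordant = 85.6
pct_discordant = 14.4
pct_tied = 0.0

n_total = n_events + n_nonevents
pairs = n_events * n_nonevents   # every event paired with every non-event

somers_d = (pct_concordant - pct_discordant) / 100
c_stat = (pct_concordant + 0.5 * pct_tied) / 100   # area under the ROC curve
# Tau-a divides by all possible pairs of observations, not just mixed pairs.
tau_a = somers_d * pairs / (n_total * (n_total - 1) / 2)

print(f"pairs = {pairs}, Somers' D = {somers_d:.3f}, c = {c_stat:.3f}, Tau-a = {tau_a:.3f}")
```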

Frequency distribution of complicated versus uncomplicated Appendicitis

The FREQ Procedure

complicated
complicated    Frequency    Percent    Cumulative Frequency    Cumulative Percent
          0        33068      69.68                   33068                 69.68
          1        14392      30.32                   47460                100.00
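The two quantities the assignment asks for, (a) the per-case difference and (b) the total savings if 10% of complicated cases were instead non-complicated, can be sketched from the frequency table and the linear regression output. This is an illustration under stated assumptions, not a definitive answer: it treats the "complicated" coefficient as the adjusted per-case difference in total charges, applies it to the cases in this dataset only, and leaves the number of prevented cases fractional.

```python
# Inputs copied from the SAS output above.
complicated_cases = 14392        # FREQ procedure, complicated = 1
charge_difference = 11983.0      # REG procedure, "complicated" coefficient (TCHGS)
share_prevented = 0.10           # scenario stated in the assignment

# (a) Adjusted difference in total charges per case.
per_case_difference = charge_difference

# (b) Total savings if 10% of complicated cases were non-complicated instead.
cases_prevented = share_prevented * complicated_cases
total_savings = cases_prevented * per_case_difference

print(f"(a) per-case difference: {per_case_difference:,.0f}")
print(f"(b) cases prevented: {cases_prevented:,.1f}, total savings: {total_savings:,.0f}")
```

Under these assumptions the total comes to roughly 17.2 million in charges. TCHGS measures charges, so this figure is not necessarily the same as cost.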
