Question
Week 2: Understanding and Exploring Assumptions
You will submit one Word document, including your SPSS output.
1. Why do we care whether the assumptions required for statistical tests are met? (Tip: You might also want to write your answer on a note card you paste to your computer.)
2. Open the data set that you corrected in Activity #1 for DownloadFestival.sav. You will use the following variables: Day1, Day2, and Day3 (hygiene variable for all three days). Create a simple histogram for each variable. Choose to display the normal curve (under Element Properties) and title your charts. Copy these plots into your Word document.
3. Now create probability-probability (p-p) plots for each variable. This output will give you additional information. Read over the Case Processing Summary. Notice that there is missing data for Day 2 and Day 3. Copy only the Normal p-p Plots into your Word document (you do not need to copy the beginning output nor the Detrended Normal p-p Plots).
4. Examining the histograms and p-p plots, describe the dataset with particular attention toward the assumption of normality. For each day, do you think the responses are reasonably normally distributed? (Just give your impression of the data.) Why or why not?
5. Using the same dataset and the Frequency command, calculate the standard descriptive measures (mean, median, mode, standard deviation, variance and range) as well as kurtosis and skew for all three hygiene variables. Paste your output into your Word document (you do not need to paste the Frequency Table). What does the output tell you? You will need to comment on: sample size, measures of central tendency and dispersion, as well as kurtosis and skewness. You will need to either calculate z scores for skewness and kurtosis or use those given in the book to provide a complete answer. Bottom line: is the assumption of normality met for these three variables? Does this match your visual observations from question #1?
6. Using the dataset SPSSExam.sav and the Frequency command, calculate: the standard descriptive statistics (mean, median, mode, standard deviation, variance and range) plus skew and kurtosis, and histograms with the normal curve on the following variables: Computer, Exam, Lecture, and Numeracy for the entire dataset. Complete the same analysis using University as a grouping variable. Paste your output into your Word document (you do not need to paste the Frequency Table). What do the results tell you with regard to whether the data is normally distributed?
7. Using the dataset SPSSExam.sav, determine whether the scores on computer literacy and percentage of lectures attended (with University as a grouping variable) meet the assumption of homogeneity of variance (use Levene's test). You must remember to unclick the "split file" option used above before conducting this test. What does the output tell you? (Be as specific as possible.)
8. Describe the assumptions of normality and homogeneity of variance. When these assumptions are violated, what are your options? Are there cases in which the assumptions may technically be violated, yet have no impact on your intended analyses? Explain.
Part A (Items #1 and #12 are required but not graded)
You will submit one file, a Word document. Please limit each response to 250 words or less. Name the file in the following format: lastnamefirstinitialBTM8107-1.doc (example: smithbBTM8107-1.doc).
1. Briefly describe your area of research interest (1-3 sentences is sufficient).
2. List 4 variables that you might assess in a research project related to your research area. List one for each type of measurement scale: Nominal, ordinal, interval, and ratio. If you cannot think of a variable for each measurement scale, explain why the task is difficult.
3. Create one alternate hypothesis and its associated null hypothesis related to your research area.
4. Briefly describe whether you think your area of interest is more conducive to experimental or correlational research. What are the costs/benefits of each as it relates to your research area?
5. Reliability vs. Validity. Considering your area of research interest, discuss the importance of reliability and validity. Can you have one without the other? Why or why not?
6. Sample vs. Population. Considering your area of research interest, describe the difference between a sample and population. Why is it important to understand the difference between a sample and population in a statistics course?
7. Measures of Central Tendency. Below is a set of data that represent weight in pounds for a particular sample. Calculate the mean, median and mode. Which measure of central tendency best describes this data and why? You may use Excel, SPSS, some other software program, or a hand calculator for this problem.
110.00 117.00 120.00 118.00 104.00 100.00 107.00 115.00 115.00 115.00 114.00 100.00 117.00 115.00 103.00 105.00 110.00 115.00 250.00 275.00
8. Measures of Dispersion. For the data set above, calculate the range, the interquartile range, the variance, and the standard deviation. What do these measures tell you about the "spread" of the data?
9. Descriptive Statistics. Why is it important to perform basic descriptive statistics prior to conducting inferential statistical tests?
10. Statistical Significance. Revisit the hypotheses you created above in #5. If you conducted a statistical test based on these hypotheses and found a statistically significant result, what would that mean from both a statistical and practical standpoint? (Be sure to use the phrases "null hypothesis" and "effect size" in your answer).
11. Type I and Type II Error. The concept of Type I and Type II Error is critical and will come into play not only with each and every statistical test you perform, but when you are asked to conduct an a priori power analysis for your Dissertation Proposal.
Considering your answer to #10, discuss the implications of making both a Type I and Type II error. 12. After completing Assignment #1, are there any areas of concern you have that you would like to share with your course instructor? Part B You will submit a total of three files: two SPSS data files and one Word document. Section A: Creating a Data File. Open a data file in SPSS and enter the data presented in Table 3.1 on page 101. Save this SPSS data file. Section B: Create a mock research project. Submit your answers to the three questions below in a Word document. 1. Considering your area of research interest, briefly state your area and a possible research project related to the area (150-500 words). 2. Pose one or more null and alternative hypotheses that follow from the possible research project. 3. List at least 10 variables that would be collected in your mock research project that would be used to answer the hypotheses. After each variable, list the variable name you will use in SPSS (Section C), the level of measurement (binary, nominal, ordinal, interval, or ratio), and the possible range of scores. Feel free to be creative. Section C: Create a mock SPSS data set. 1. Open a data file in SPSS and enter in a set of mock data for the research project you describe in Section B. (Note: It is important that you do not collect real data for this activity; you cannot collect data without IRB approval). 2. You must enter 10 rows of data for each of the 10 variables (that is, create data for 10 mock participants). Each row represents the scores of each mock participant on the ten variables. 3. Participant #1 must have missing data for Variable #3. Ensure this is coded correctly. You should now have three files for Part #2. Part C You will submit one Word document. You will create this Word document by exporting SPSS output into Word. Section A. Creating Visual Displays of Data. 
For this portion of the activity, you will export output you created while working in SPSS for Chapter 4 into a Word document. Please read the instructions below to ensure you are including the correct material in your document (This chapter has you create many charts and not all are required for Part #3). 1. Using the data set: DownloadFestival.sav, create a boxplot for males and females for the variable Day1. It is important that you change the outlier identified to 2.02 prior to creating the boxplot. Be sure to save the data set with a new name, indicating it is the corrected data set (outlier identified and corrected). Save this boxplot with an appropriate title in your Part #3 Word document. 2. Using the data set: ChickFlick.sav, create a simple bar chart for independent means. The variables you will use are: Arousal, Film, and Gender (grouping variable). Be sure to display error bars and save your chart with an appropriate title in your Part #3 Word document. 3. Using the data set: Hiccups.sav, create a clustered bar chart for related means. The variables you will use are: Baseline, Tongue Pulling, Carotid Artery Massage, Digital Rectal Massage. Be sure to display error bars, include labels for the X- and Y-axis, and save your chart with an appropriate title in your Part#3 Word document. 4. Using the data set: Text Messages.sav (Note: you may see an additional data set with the same name: TextMessages.sav - either will create the correct output), create a clustered bar chart for mixed designs. The variables you will use are: Time1, Time2, and Group. Be sure to display error bars, include labels for the X- and Y-axis, and save your chart with an appropriate title in your Part #3 Word document. 5. Using the data set: Exam Anxiety.sav, create a scatterplot that includes a regression line. The variables you will use are: Exam Performance and Exam Anxiety. 
Be sure to include the regression line and save your chart with an appropriate title in your Part #3 Word document.
Section B. Why Exploratory Data Analysis? Write a short paragraph that highlights your understanding of why exploratory data analysis is a critical part of any analytical strategy (500 word limit). This answer is worth half the assigned points for this activity. To receive full credit, you must show a high level of understanding related to the importance of exploring data visually.
Part C
Section A. Creating Visual Displays of Data.
1. Using the data set: DownloadFestival.sav, create a boxplot for males and females for the variable Day1. It is important that you change the outlier identified to 2.02 prior to creating the boxplot. Be sure to save the data set with a new name, indicating it is the corrected data set (outlier identified and corrected). Save this boxplot with an appropriate title in your Part #3 Word document.
Solution
Results from DownloadFestival.sav
[Figures: Histogram Day 1; Histogram Day 2; Histogram Day 3; Normal p-p Plot Day 1; Normal p-p Plot Day 2; Normal p-p Plot Day 3]
From the Histogram Day 1, we see that the distribution has a longer tail towards the right side of the normal curve, indicating that the distribution is skewed right.
Also, in the normal p-p plot for Day 1, only a few points move away from the line, indicating that the distribution is slightly skewed.
From the Histogram Day 2, we see that the distribution has a longer tail towards the right side of the normal curve, indicating that the distribution is skewed right. Also, in the normal p-p plot for Day 2, most of the points move away from the line, indicating that the distribution is heavily skewed.
From the Histogram Day 3, we see that the distribution has a longer tail towards the right side of the normal curve, indicating that the distribution is skewed right. Also, in the normal p-p plot for Day 3, most of the points move away from the line, indicating that the distribution is heavily skewed.
Descriptive Statistics
Statistics (Hygiene, Day 1 to Day 3 of Download Festival)
                          Day 1      Day 2      Day 3
N Valid                     810        264        123
N Missing                     0        546        687
Mean                     1.7934      .9609      .9765
Std. Error of Mean       .03319     .04436     .06404
Median                   1.7900      .7900      .7600
Mode                       2.00        .23       .44a
Std. Deviation           .94449     .72078     .71028
Variance                   .892       .520       .504
Skewness                  8.865      1.095      1.033
Std. Error of Skewness     .086       .150       .218
Kurtosis                170.450       .822       .732
Std. Error of Kurtosis     .172       .299       .433
Range                     20.00       3.44       3.39
Minimum                     .02        .00        .02
Maximum                   20.02       3.44       3.41
a. Multiple modes exist. The smallest value is shown.
The mean Hygiene score on Day 1 is 1.7934 with a standard deviation of 0.94. The median Hygiene score on Day 1 is 1.79. This indicates that about 50% of the Day 1 Hygiene scores fall below 1.79 and about 50% fall above 1.79. The minimum and maximum recorded Hygiene scores on Day 1 are 0.02 and 20.02 respectively. The skewness and kurtosis values are 8.865 (Z score = 103.08) and 170.450 (Z score = 990.99) respectively.
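The z scores quoted for skewness and kurtosis come from dividing each statistic by its standard error. The following is a minimal sketch of that arithmetic, assuming the values are copied by hand from the Statistics table for the three hygiene variables; the dictionary layout and the helper name z_score are illustrative choices, not SPSS output.

```python
# Skewness and kurtosis with their standard errors, copied from the
# Statistics table for Hygiene Day 1, Day 2 and Day 3.
table_values = {
    "Day 1": {"skew": 8.865, "se_skew": 0.086, "kurt": 170.450, "se_kurt": 0.172},
    "Day 2": {"skew": 1.095, "se_skew": 0.150, "kurt": 0.822, "se_kurt": 0.299},
    "Day 3": {"skew": 1.033, "se_skew": 0.218, "kurt": 0.732, "se_kurt": 0.433},
}


def z_score(statistic, standard_error):
    """Convert a skewness or kurtosis value to a z score."""
    return statistic / standard_error


# Build a small lookup of z scores, rounded to two decimals.
z_scores = {
    day: {
        "z_skew": round(z_score(v["skew"], v["se_skew"]), 2),
        "z_kurt": round(z_score(v["kurt"], v["se_kurt"]), 2),
    }
    for day, v in table_values.items()
}

for day, z in z_scores.items():
    print(day, "z(skewness) =", z["z_skew"], "z(kurtosis) =", z["z_kurt"])
```

An absolute z score above about 1.96 is significant at the .05 level, although with samples as large as these even small departures from normality can cross that threshold, so the plots should be read alongside the numbers.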
The Day 1 z scores are far beyond the usual cutoff of 1.96, so the Day 1 distribution is heavily, not slightly, right skewed. The maximum of 20.02 appears to be the uncorrected outlier that the assignment asks to change to 2.02, and it largely drives these extreme values.
The mean Hygiene score on Day 2 is 0.9609 with a standard deviation of 0.721. The median Hygiene score on Day 2 is 0.79. This indicates that about 50% of the Day 2 Hygiene scores fall below 0.79 and about 50% fall above 0.79. The minimum and maximum recorded Hygiene scores on Day 2 are 0.00 and 3.44 respectively. The skewness and kurtosis values are 1.095 (Z score = 1.095 / 0.150 = 7.30) and 0.822 (Z score = 2.75) respectively. Here, the distribution is skewed right.
The mean Hygiene score on Day 3 is 0.9765 with a standard deviation of 0.7103. The median Hygiene score on Day 3 is 0.76. This indicates that about 50% of the Day 3 Hygiene scores fall below 0.76 and about 50% fall above 0.76. The minimum and maximum recorded Hygiene scores on Day 3 are 0.02 and 3.41 respectively. The skewness and kurtosis values are 1.033 (Z score = 4.74) and 0.732 (Z score = 1.69) respectively. Here, the distribution is skewed right.
2. Using the data set: ChickFlick.sav, create a simple bar chart for independent means. The variables you will use are: Arousal, Film, and Gender (grouping variable). Be sure to display error bars and save your chart with an appropriate title in your Part #3 Word document.
[Figure: Bar Chart - Distribution of Psychological arousal during the film between male and female participants]
[Figure: Bar Chart - Distribution of Psychological arousal during the film between Bridget Jones's Diary and Memento]
3. Using the data set: Hiccups.sav, create a clustered bar chart for related means. The variables you will use are: Baseline, Tongue Pulling, Carotid Artery Massage, Digital Rectal Massage. Be sure to display error bars, include labels for the X- and Y-axis, and save your chart with an appropriate title in your Part #3 Word document.
[Figure: Bar Chart - Relationship between type of massage therapy and their mean relief in pain]
4.
Using the data set: Text Messages.sav (Note: you may see an additional data set with the same name: TextMessages.sav - either will create the correct output), create a clustered bar chart for mixed designs. The variables you will use are: Time1, Time2, and Group. Be sure to display error bars, include labels for the X- and Y-axis, and save your chart with an appropriate title in your Part #3 Word document.
[Figure: Bar Chart - Relationship between Time 1, Time 2 and Group]
5. Using the data set: Exam Anxiety.sav, create a scatterplot that includes a regression line. The variables you will use are: Exam Performance and Exam Anxiety. Be sure to include the regression line and save your chart with an appropriate title in your Part #3 Word document.
[Figure: Scatter Plot - Relationship between Exam Anxiety and Performance]
Section B. Why Exploratory Data Analysis? Write a short paragraph that highlights your understanding of why exploratory data analysis is a critical part of any analytical strategy (500 word limit). This answer is worth half the assigned points for this activity. To receive full credit, you must show a high level of understanding related to the importance of exploring data visually.
Exploratory Data Analysis
The box plot shows the distribution of Day 1 hygiene scores for male and female participants. After the identified outlier was corrected to 2.02 prior to creating the boxplot, there is no apparent difference in Day 1 hygiene between male and female participants, although some high-value outliers remain among the female participants.
The mean psychological arousal during the film is higher in the male group, and it is also higher for the film Memento. That is, male participants watching Memento show the highest mean psychological arousal.
The mean value is lowest for Digital Rectal Massage, followed by Carotid Artery Massage, and it appears to be highest for Baseline and Tongue Pulling.
The variation also appears to be greater for Baseline and Tongue Pulling, since their error bars are longer than those for Digital Rectal Massage and Carotid Artery Massage.
Comparing the average number of text messages between the control group and the text messages group, there is no apparent difference between the two groups at baseline, but the mean is lower in the text messages group than in the control group at the six-month follow-up. That is, the chart suggests a decrease in the text messages group relative to the control group, although a formal test would be needed to confirm that the decrease is statistically significant. On this reading, the treatment condition appears effective in decreasing the number of text messages sent.
The scatter plot represents the relationship between exam anxiety and exam performance (%). Most of the points run from the top left to the bottom right, indicating a moderate negative relationship between anxiety and performance. That is, as the exam anxiety score increases, exam performance decreases. The value of R2 is 0.194 (equivalent to r of about -0.44). This indicates that 19.4% of the variation in exam performance is explained by exam anxiety, while the remaining 80.6% is left unexplained.
Part A (Items #1 and #12 are required but not graded)
You will submit one file, a Word document. Please limit each response to 250 words or less.
Name the file in the following format: lastnamefirstinitialBTM8107-1.doc (example: smithbBTM8107-1.doc).
1. Briefly describe your area of research interest (1-3 sentences is sufficient).
The main objective of this study is to gain a clear understanding of variables and their measurement, and of the descriptive statistics used at each level of measurement.
2. List 4 variables that you might assess in a research project related to your research area. List one for each type of measurement scale: Nominal, ordinal, interval, and ratio. If you cannot think of a variable for each measurement scale, explain why the task is difficult.
The variables that can be taken into consideration are:
Community Living
Transportation
Education
Employment
Community Living describes whether a person's house has accommodations and living features for disabilities. This is a categorical variable and hence is measured on a nominal scale.
Transportation represents whether special transportation is available for these patients. This is a categorical variable and hence is measured on a nominal scale.
Education represents whether special education is available for these patients. This is a categorical variable and hence is measured on a nominal scale.
Employment represents whether employers are ready to hire a disabled person. This is a categorical variable and hence is measured on a nominal scale.
3. Create one alternate hypothesis and its associated null hypothesis related to your research area.
Null Hypothesis (H0): There is no association between education and employment of individuals with mental or physical disabilities.
Alternate Hypothesis (Ha): There is an association between education and employment of individuals with mental or physical disabilities.
4. Briefly describe whether you think your area of interest is more conducive to experimental or correlational research.
What are the costs/benefits of each as it relates to your research area?
The study under consideration is correlational research. We are trying to find the relationship between two variables, so the appropriate design is a correlational study. A correlational study determines whether or not two variables are correlated, that is, whether an increase or decrease in one variable corresponds to an increase or decrease in the other variable.
5. Reliability vs. Validity. Considering your area of research interest, discuss the importance of reliability and validity. Can you have one without the other? Why or why not?
Reliability and validity are important factors in our study. Here, we use Cronbach's alpha to determine the internal consistency of our data.
6. Sample vs. Population. Considering your area of research interest, describe the difference between a sample and population. Why is it important to understand the difference between a sample and population in a statistics course?
A sample is a subset of a population. In real situations, it is not possible to collect data from the entire population. Therefore, a sample of participants is selected, the statistical analysis is performed on the data collected from the sample respondents, and the findings are used to draw inferences about the population. The main advantages of a sample are that it is cost effective and saves manpower and time.
7. Measures of Central Tendency. Below is a set of data that represent weight in pounds for a particular sample. Calculate the mean, median and mode. Which measure of central tendency best describes this data and why? You may use Excel, SPSS, some other software program, or a hand calculator for this problem.
110.00 117.00 120.00 118.00 104.00 100.00 107.00 115.00 115.00 115.00 114.00 100.00 117.00 115.00 103.00 105.00 110.00 115.00 250.00 275.00
The table given below shows the descriptive statistics for the given dataset.
Weight
Mean                   126.25
Standard Error         10.54636
Median                 115
Mode                   115
Standard Deviation     47.16474
Sample Variance        2224.513
Kurtosis               7.062065
Skewness               2.837079
Range                  175
Minimum                100
Maximum                275
Sum                    2525
Count                  20
From the above table, we see that the mean weight in pounds for this sample is 126.25 (2525 / 20), the median is 115 and the mode is 115. Since the mean is greater than the median, the distribution of weight is skewed right, pulled upward by the two large values 250 and 275. For skewed data the appropriate measure of central tendency is the median, so the value that best describes this data is 115.
8. Measures of Dispersion. For the data set above, calculate the range, the interquartile range, the variance, and the standard deviation. What do these measures tell you about the "spread" of the data?
From the same table, we see that the sample standard deviation is 47.164 and the sample variance is 2224.513. The range is 175 and the interquartile range is 10.5.
9. Descriptive Statistics. Why is it important to perform basic descriptive statistics prior to conducting inferential statistical tests?
Inferential statistics are used to draw conclusions about the population. Therefore, before performing inferential statistics, we need to compute descriptive statistics and gather information about the distribution of the data and the presence of outliers. This will enable us to select suitable statistical tests for our data.
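The central tendency and dispersion figures reported for the weight data in items 7 and 8 can be cross-checked with a short script. This is a minimal sketch using Python's standard statistics module rather than the Excel or SPSS output the answer relies on; the variable names are illustrative, and the "inclusive" quartile method is an assumption chosen because it reproduces the interquartile range quoted in the answer (other quartile conventions give a slightly different value).

```python
import statistics

# Weight in pounds for the 20 sample members listed in item 7.
weights = [110, 117, 120, 118, 104, 100, 107, 115, 115, 115,
           114, 100, 117, 115, 103, 105, 110, 115, 250, 275]

# Measures of central tendency.
mean_weight = statistics.mean(weights)
median_weight = statistics.median(weights)
mode_weight = statistics.mode(weights)

# Measures of dispersion (sample versions, dividing by n - 1).
sample_variance = statistics.variance(weights)
sample_std_dev = statistics.stdev(weights)
value_range = max(weights) - min(weights)

# Quartiles with the "inclusive" method (assumed convention).
q1, _, q3 = statistics.quantiles(weights, n=4, method="inclusive")
iqr = q3 - q1

print("mean:", mean_weight, "median:", median_weight, "mode:", mode_weight)
print("variance:", round(sample_variance, 3), "sd:", round(sample_std_dev, 3))
print("range:", value_range, "IQR:", iqr)
```

The gap between the range and the interquartile range shows how strongly the two values 250 and 275 stretch the spread, which is the same reason the median is preferred over the mean for this sample.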
When the distribution is skewed, for example, parametric tests are not appropriate. In those situations, we need to perform non-parametric tests.
10. Statistical Significance. Revisit the hypotheses you created above in #5. If you conducted a statistical test based on these hypotheses and found a statistically significant result, what would that mean from both a statistical and practical standpoint? (Be sure to use the phrases "null hypothesis" and "effect size" in your answer).
Here, statistical significance indicates that there is an association between education and employment of individuals with mental or physical disabilities.
11. Type I and Type II Error. The concept of Type I and Type II Error is critical and will come into play not only with each and every statistical test you perform, but when you are asked to conduct an a priori power analysis for your Dissertation Proposal. Considering your answer to #10, discuss the implications of making both a Type I and Type II error.
A Type I error is rejecting the null hypothesis when it is true. Here, we would conclude that there is an association between education and employment of individuals with mental or physical disabilities when in fact there is no such association.
A Type II error is failing to reject the null hypothesis when it is not true. Here, we would conclude that there is no association between education and employment of individuals with mental or physical disabilities when in fact there is such an association.
12. After completing Assignment #1, are there any areas of concern you have that you would like to share with your course instructor?
Yes, the main concern is the sample selection and the questionnaire that needs to be framed for my objective.
Because the chosen topic is very sensitive, participants are more likely not to respond to the study properly. I am therefore unsure what sample size is required for this study and which sampling frame should be used.
*Textbook: Discovering statistics using IBM SPSS statistics (4th Ed.). Thousand Oaks, CA: Sage Publications. ISBN: 9781446249185. By Andy Field.
*Software: IBM SPSS Statistics Grad Pack 23 for Windows. ID#: 44W5923. (V19, V20, and V22 are acceptable).
Explore Correlation and Regression
You will submit one Word document. You will create this Word document by cutting and pasting SPSS output into Word. Please answer the questions first and include all output at the end of the activity in an Appendix.
Part A. SPSS Assignment
Part A of Assignment #3 has you familiarizing yourself with a set of data, providing you the opportunity to perform statistical tests and then interpret the output. You will rely on all you have learned to this point and add correlation and regression strategies to your skill set. Using the data set: Chamorro-Premuzic.sav; you will focus on the variables related to Extroversion and Agreeableness (student and lecturer).
To complete Part A
1. Exploratory Data Analysis.
a. Perform Exploratory Data Analysis on all variables in the data set. Because you are going to focus on Extroversion and Agreeableness, be sure to include scatterplots for these combinations of variables (Student Agreeableness/Lecture Agreeableness; Student Extroversion/Lecture Extroversion; Student Agreeableness/Lecture Extroversion; Student Extroversion/Lecture Agreeableness) and include the regression line within the chart.
b. Compose a one to two paragraph write up of the data.
c. Create an APA style table that presents descriptive statistics for the sample.
2. Make a decision about the missing data. How are you going to handle it and why?
3. Correlation.
Perform a correlational analysis on the following variables: Student Extroversion, Lecture Extroversion, Student Agreeableness, Lecture Agreeableness.
a. Ensure you handle missing data as you decided above.
b. State if you are using a one or two-tailed test and why.
c. Write up the results in APA style and interpret them.
4. Regression. Calculate a regression that examines whether or not you can predict if a student wants a lecturer to be extroverted using the student's extroversion score.
5. Multiple Regression. Calculate a multiple regression that examines whether age, gender, and student's extroversion predict if a student wants the lecturer to be extroverted.
a. Ensure you handle missing data as you decided above.
b. State if you are using a one or two-tailed test and why.
c. Include diagnostics.
d. Discuss assumptions: are they met?
e. Write up the results in APA style and interpret them.
f. Do these results differ from the correlation results above?
Part B. Applying Analytical Strategies to an Area of Research Interest
1. Briefly restate your research area of interest.
a. Pearson Correlation: Identify two variables for which you could calculate a Pearson correlation coefficient. Describe the variables and their scale of measurement. Now, assume you conducted a Pearson correlation and came up with a significant positive or negative value. Create a mock r value (for example, .3 or -.2). Report your mock finding in APA style (note the text does not use APA style) and interpret the statistic in terms of effect size and R2 while also taking into account the third variable problem as well as direction of causality.
b. Spearman's Correlation: Identify two variables for which you could calculate a Spearman's correlation coefficient. Describe the variables and their scale of measurement. Now, assume you conducted a correlation and came up with a significant positive or negative value. Create a mock r value (for example, .3 or -.2).
Report your mock finding in APA style and interpret the statistic in terms of effect size and R2 while also taking into account the third variable problem as well as direction of causality. c. Partial Correlation vs. Semi-Partial Correlation: Identify three variables for which you may be interested in calculating either a partial or semi-partial correlation coefficient. Compare/contrast these two types of analyses using your variables and research example. Which would you use and why? d. Simple Regression: Identify two variables for which you could calculate a simple regression. Describe the variables and their scale of measurement. Which variable would you include as the predictor variable and which as the outcome variable? Why? What would R2 tell you about the relationship between the two variables? e. Multiple Regression: Identify at least 3 variables for which you could calculate a multiple regression. Describe the variables and their scale of measurement. Which variables would you include as the predictor variable and which as the outcome variable? Why? Which regression method would you use and why? What would R2 and adjusted R2 tell you about the relationship between the variables? f. Logistic Regression: Identify at least 3 variables for which you could calculate a logistic regression. Describe the variables and their scale of measurement. Which variables would you include as the predictor variable and which as the outcome variable? Why? Which regression method would you use and why? What would the output tell you about the relationship between the variables? *Textbook: Discovering statistics using IBM SPSS statistics (4th Ed.). Thousand Oaks, CA Sage Publications. ISBN; 9781446249185. By Andy Field. *Software: IBM SPSS statistics Grad Pack 23 for windows. ID#: 44W5923. (V19,V20, and v22 are acceptable). Explore Correlation and Regression You will submit one Word document. You will create this Word document by cutting and pasting SPSS output into Word. 
Please answer the questions first and include all output at the end of the activity in an Appendix.

Part A. SPSS Assignment
Part A of Assignment #3 has you familiarize yourself with a set of data, giving you the opportunity to perform statistical tests and then interpret the output. You will rely on all you have learned to this point and add correlation and regression strategies to your skill set. Using the data set Chamorro-Premuzic.sav, you will focus on the variables related to Extroversion and Agreeableness (student and lecturer).

To complete Part A:
1. Exploratory Data Analysis.
  a. Perform Exploratory Data Analysis on all variables in the data set. Because you are going to focus on Extroversion and Agreeableness, be sure to include scatterplots for these combinations of variables (Student Agreeableness/Lecture Agreeableness; Student Extroversion/Lecture Extroversion; Student Agreeableness/Lecture Extroversion; Student Extroversion/Lecture Agreeableness) and include the regression line within the chart.
  b. Compose a one- to two-paragraph write-up of the data.
  c. Create an APA-style table that presents descriptive statistics for the sample.
2. Make a decision about the missing data. How are you going to handle it and why?
3. Correlation. Perform a correlational analysis on the following variables: Student Extroversion, Lecture Extroversion, Student Agreeableness, Lecture Agreeableness.
  a. Ensure you handle missing data as you decided above.
  b. State whether you are using a one- or two-tailed test and why.
  c. Write up the results in APA style and interpret them.
4. Regression. Calculate a regression that examines whether or not you can predict if a student wants a lecturer to be extroverted using the student's extroversion score.
5. Multiple Regression. Calculate a multiple regression that examines whether age, gender, and student's extroversion predict if a student wants the lecturer to be extroverted.
  a. Ensure you handle missing data as you decided above.
  b. State whether you are using a one- or two-tailed test and why.
  c. Include diagnostics.
  d. Discuss assumptions: are they met?
  e. Write up the results in APA style and interpret them.
  f. Do these results differ from the correlation results above?

Part B. Applying Analytical Strategies to an Area of Research Interest
1. Briefly restate your research area of interest.
  a. Pearson Correlation: Identify two variables for which you could calculate a Pearson correlation coefficient. Describe the variables and their scale of measurement. Now, assume you conducted a Pearson correlation and came up with a significant positive or negative value. Create a mock r value (for example, .3 or -.2). Report your mock finding in APA style (note the text does not use APA style) and interpret the statistic in terms of effect size and R², while also taking into account the third-variable problem and the direction of causality.
  b. Spearman's Correlation: Identify two variables for which you could calculate a Spearman's correlation coefficient. Describe the variables and their scale of measurement. Now, assume you conducted a correlation and came up with a significant positive or negative value. Create a mock r value (for example, .3 or -.2). Report your mock finding in APA style and interpret the statistic in terms of effect size and R², while also taking into account the third-variable problem and the direction of causality.
  c. Partial Correlation vs. Semi-Partial Correlation: Identify three variables for which you may be interested in calculating either a partial or semi-partial correlation coefficient. Compare and contrast these two types of analyses using your variables and research example. Which would you use and why?
  d. Simple Regression: Identify two variables for which you could calculate a simple regression. Describe the variables and their scale of measurement. Which variable would you include as the predictor variable and which as the outcome variable? Why? What would R² tell you about the relationship between the two variables?
  e. Multiple Regression: Identify at least 3 variables for which you could calculate a multiple regression. Describe the variables and their scale of measurement. Which variables would you include as the predictor variables and which as the outcome variable? Why? Which regression method would you use and why? What would R² and adjusted R² tell you about the relationship between the variables?
  f. Logistic Regression: Identify at least 3 variables for which you could calculate a logistic regression. Describe the variables and their scale of measurement. Which variables would you include as the predictor variables and which as the outcome variable? Why? Which regression method would you use and why? What would the output tell you about the relationship between the variables?
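
Part B items a and c depend on a little arithmetic that is easy to check by hand: squaring a correlation to get the proportion of shared variance, and deriving partial and semi-partial correlations from three pairwise correlations. The Python sketch below is only an illustration of those standard formulas. The assignment itself uses SPSS, and the function names and the mock correlation values are invented for this example, not taken from any data set.

```python
import math


def shared_variance(r: float) -> float:
    """Proportion of variance two variables share (R squared) for a correlation r."""
    return r ** 2


def partial_r(r_xy: float, r_xz: float, r_yz: float) -> float:
    """Correlation between x and y with z removed from BOTH x and y."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def semipartial_r(r_xy: float, r_xz: float, r_yz: float) -> float:
    """Correlation between y and x with z removed from x ONLY."""
    return (r_xy - r_xz * r_yz) / math.sqrt(1 - r_xz ** 2)


if __name__ == "__main__":
    # Mock values, made up for illustration (x, y = variables of interest; z = third variable).
    mock_r = 0.3
    print(f"r = {mock_r}: R squared = {shared_variance(mock_r):.2f}")

    r_xy, r_xz, r_yz = 0.5, 0.3, 0.4
    print(f"partial r      = {partial_r(r_xy, r_xz, r_yz):.3f}")
    print(f"semi-partial r = {semipartial_r(r_xy, r_xz, r_yz):.3f}")
```

The two functions share a numerator and differ only in the denominator: the partial correlation also removes the third variable from the outcome. For the same inputs the semi-partial value is therefore never larger in absolute size than the partial value.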