Question 1 (3 points)
The file ExamScores.csv contains sample data with the scores on the first and second exams for students in a class. The variables are called Exam1 and Exam2 respectively. The professor is interested in finding out whether the average score in the second exam is different from the average score in the first exam, treating the data as matched-pair. Which of the following Python lines can be used to perform this test?
Question 1 options:
| a) | import scipy.stats as st; import pandas as pd; scores = pd.read_csv('ExamScores.csv'); exam1_paired_score = scores[['Exam1']]; exam2_paired_score = scores[['Exam2']]; print(st.ttest_rel(exam1_paired_score, exam2_paired_score)) | |
| b) | import scipy.stats as st; import pandas as pd; scores = pd.read_csv('ExamScores.csv'); exam1_paired_score = scores[['Exam1']]; exam2_paired_score = scores[['Exam2']]; null_value = 0; alternative = 'not-equal'; print(st.ttest_rel(exam1_paired_score, exam2_paired_score, equal_var=False, null_value, alternative)) | |
| c) | from scipy.stats import ttest_ind_from_stats as ttest; import pandas as pd; scores = pd.read_csv('ExamScores.csv'); exam1_paired_score = scores[['Exam1']]; exam2_paired_score = scores[['Exam2']]; print(ttest(exam1_paired_score, exam2_paired_score)) | |
| d) | import scipy.stats as st; import pandas as pd; scores = pd.read_csv('ExamScores.csv'); exam1_paired_score = scores[['Exam1']]; exam2_paired_score = scores[['Exam2']]; null_value = 0; alternative = 'not-equal'; print(st.ttest_ind(exam1_paired_score, exam2_paired_score, equal_var=False, null_value, alternative)) | |
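
As background for the matched-pair setup, a paired t-test reduces to a one-sample t-test on the per-student differences. The sketch below computes that statistic by hand with the standard library only. The score lists are hypothetical (they are not taken from ExamScores.csv), and the differences are taken as Exam2 minus Exam1, which is an assumption about direction; reversing the order only flips the sign of the statistic.

```python
import math
import statistics

# Hypothetical paired scores: position i in both lists is the same student.
exam1 = [70, 80, 90, 60]
exam2 = [75, 82, 95, 68]

# A paired test works on the per-student differences (here Exam2 - Exam1).
diffs = [second - first for first, second in zip(exam1, exam2)]
n = len(diffs)
mean_diff = statistics.mean(diffs)
sd_diff = statistics.stdev(diffs)  # sample standard deviation (divides by n - 1)

# t statistic for H0: mean difference = 0, with n - 1 degrees of freedom.
t_stat = mean_diff / (sd_diff / math.sqrt(n))
df = n - 1
print(t_stat, df)
```

The p-value would then come from a t distribution with `df` degrees of freedom, which the standard library does not provide.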
Question 2 (3 points)
Which of the following Python functions is used to perform a hypothesis test for the difference in two population proportions?
Question 2 options:
| a) | prop_1samp_ztest(x, n, null_value, alternative) | |
| b) | prop_1samp_hypothesistest(x, n, null_value, alternative) | |
| c) | proportions_ztest(counts, n) | |
| d) | prop_hypothesis_test(x, n, null_value, alternative) | |
Question 3 (3 points)
How can we obtain a one-tailed probability value (P-Value) from Python functions that return a two-tailed probability value?
Question 3 options:
| a) | Multiply the result by 4 | |
| b) | Divide the result by 4 | |
| c) | Multiply the result by 2 | |
| d) | Divide the result by 2 | |
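
For a test statistic with a symmetric null distribution (t or z), the relationship between the two-tailed and one-tailed probability values can be sketched as follows. The helper name `one_tailed_p` and its `'larger'`/`'smaller'` labels are hypothetical, chosen only for illustration. The sketch also makes explicit a caveat: the halved value applies only when the statistic falls on the side predicted by the alternative hypothesis.

```python
def one_tailed_p(two_tailed_p, statistic, alternative):
    """Convert a two-tailed p-value to a one-tailed one.

    Assumes a symmetric null distribution (t or z).
    alternative: 'larger' (H1: parameter > null) or 'smaller' (H1: parameter < null).
    """
    if alternative not in ("larger", "smaller"):
        raise ValueError("alternative must be 'larger' or 'smaller'")
    half = two_tailed_p / 2
    # Does the observed statistic point in the direction claimed by H1?
    in_direction = statistic > 0 if alternative == "larger" else statistic < 0
    return half if in_direction else 1 - half


# Hypothetical inputs: statistic = -2.05 with a two-tailed p-value of 0.04.
p_smaller = one_tailed_p(0.04, -2.05, "smaller")  # statistic agrees with H1
p_larger = one_tailed_p(0.04, -2.05, "larger")    # statistic opposes H1
print(p_smaller, p_larger)
```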
Question 4 (3 points)
Which of the following Python functions is used to perform a hypothesis test for the difference in two population means when summary data from samples is provided for the two populations?
Question 4 options:
| a) | import scipy.stats as st; st.ttest_ind(data1, data2, equal_var=False) | |
| b) | from snhu_MAT243 import means_1samp_ttest; means_1samp_ttest(mean, std_dev, n, null_value, alternative) | |
| c) | from statsmodels.stats.proportion import proportions_ztest; proportions_ztest(counts, n) | |
| d) | from scipy.stats import ttest_ind_from_stats as ttest; ttest(mean1, stdev1, n1, mean2, stdev2, n2, equal_var=False) | |
Question 5 (3 points)
A group of 10,000 individuals was divided evenly into two groups. One group was given a vaccine and the other group was given a placebo. Of the 5,000 individuals in the first group, 95 developed a disease. In the second group, 125 individuals developed the disease. Which of the following Python lines can be used to perform the hypothesis test to investigate whether there is sufficient evidence to conclude that the proportion of individuals who developed the disease is lower among those given the vaccine than among those given the placebo?
Question 5 options:
| a) | from snhu_MAT243 import prop_1samp_ztest; n = 10000; x = 5000; null_value = 0.50; alternative = 'not-equal'; prop_1samp_ztest(x, n, null_value, alternative) | |
| b) | from statsmodels.stats.proportion import proportions_ztest; counts = [95, 125]; n = [5000, 5000]; proportions_ztest(counts, n)  # Divide the output probability value by 2 to get the one-tailed probability value | |
| c) | from statsmodels.stats.proportion import proportions_ztest; n = [95, 125]; counts = [5000, 5000]; proportions_ztest(counts, n)  # Divide the output probability value by 2 to get the one-tailed probability value | |
| d) | from snhu_MAT243 import prop_1samp_ztest; n = 5000; x = 10000; null_value = 0.50; alternative = 'not-equal'; prop_1samp_ztest(x, n, null_value, alternative) | |
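
To show what a two-proportion z-test does with the figures in this question, here is a hand computation using only the standard library. It is a sketch that assumes the pooled-proportion form of the standard error; the variable names and the `normal_cdf` helper are illustrative and not part of any library.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


counts = [95, 125]    # individuals who developed the disease (vaccine, placebo)
nobs = [5000, 5000]   # group sizes (vaccine, placebo)

p1 = counts[0] / nobs[0]
p2 = counts[1] / nobs[1]

# Pooled proportion under H0: p1 == p2.
pooled = sum(counts) / sum(nobs)
se = math.sqrt(pooled * (1 - pooled) * (1 / nobs[0] + 1 / nobs[1]))

z = (p1 - p2) / se
p_two_tailed = 2 * normal_cdf(-abs(z))
p_one_tailed = normal_cdf(z)  # lower tail, for H1: p1 < p2

print(z, p_two_tailed, p_one_tailed)
```

Because `z` is negative here (the vaccine group has the smaller sample proportion), the lower-tail probability equals half of the two-tailed value.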
Question 6 (3 points)
A professor is interested in finding out whether the average score in the second exam is the same as the average score in the first exam. Suppose two samples are collected for the two exams and saved in the ExamScores.csv file. The variables are called Exam1 and Exam2 respectively. Which of the following Python lines can be used to perform a hypothesis test to investigate whether there is sufficient evidence to conclude that the average score in the second exam is not equal to the average score in the first exam?
Question 6 options:
| a) | import scipy.stats as st; import pandas as pd; scores = pd.read_csv('ExamScores.csv'); exam1_scores = scores[['Exam1']]; exam2_scores = scores[['Exam2']]; null_value = 0; alternative = 'not-equal'; print(st.ttest_ind(exam1_scores, exam2_scores, equal_var=False, null_value, alternative)) | |
| b) | import scipy.stats as st; import pandas as pd; scores = pd.read_csv('ExamScores.csv'); exam1_scores = scores[['Exam1']]; exam2_scores = scores[['Exam2']]; print(st.ttest_ind(exam1_scores, exam2_scores, equal_var=False)) | |
| c) | import scipy.stats as st; scores = pd.read_csv('ExamScores.csv'); exam1_scores = scores[['Exam1']]; exam2_scores = scores[['Exam2']]; print(st.ttest_ind(exam1_scores, exam2_scores, equal_var=False)) | |
| d) | import scipy.stats as st; import pandas as pd; scores = pd.read_csv('ExamScores.csv'); print(st.ttest_ind(scores)) | |
Question 7 (3 points)
Commute times from town A to town B are obtained for two different highways. The sample size obtained for the first highway is 40, and it is found that the average commute time is 5.35 hours with a standard deviation of 3.1 hours. The sample size obtained for the second highway is 50, and it is found that the average commute time is 4.95 hours with a standard deviation of 5.8 hours. Which of the following Python lines along with this summary data can be used to perform a hypothesis test to conclude whether or not the population means are different?
Question 7 options:
| a) | from scipy.stats import ttest_ind_from_stats as ttest_ind; n1 = 40; mean1 = 5.35; stdev1 = 3.1; n2 = 50; mean2 = 4.95; stdev2 = 5.8; print(ttest_ind(mean1, stdev1, n1, mean2, stdev2, n2, equal_var=False)) | |
| b) | from scipy.stats import ttest_ind_from_stats as ttest; n1 = 40; mean1 = 5.35; stdev1 = 3.1; n2 = 50; mean2 = 4.95; stdev2 = 5.8; print(ttest(mean1, stdev1, n1, mean2, stdev2, n2, equal_var=False)) | |
| c) | from scipy.stats import ttest_ind_from_stats as ttest; n1 = 40; mean1 = 4.95; stdev1 = 5.8; n2 = 50; mean2 = 5.35; stdev2 = 3.1; print(ttest(mean1, stdev1, n1, mean2, stdev2, n2, equal_var=False)) | |
| d) | from scipy.stats import ttest_ind_from_stats as ttest; n1 = 40; n2 = 50; print(ttest(n1, n2, equal_var=False)) | |
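
For reference, the unequal-variance (Welch) t statistic and its approximate degrees of freedom can be computed directly from the summary figures in this question. The sketch below uses only the standard library, so it stops short of the p-value, which would require a t distribution.

```python
import math

# Summary data from the question: highway 1 and highway 2.
n1, mean1, stdev1 = 40, 5.35, 3.1
n2, mean2, stdev2 = 50, 4.95, 5.8

# Squared standard error contributed by each sample.
v1 = stdev1 ** 2 / n1
v2 = stdev2 ** 2 / n2

# Welch t statistic: difference in means over the unpooled standard error.
t_stat = (mean1 - mean2) / math.sqrt(v1 + v2)

# Welch-Satterthwaite approximation to the degrees of freedom.
df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

print(t_stat, df)
```

Swapping which highway is labelled sample 1 changes only the sign of `t_stat`, so a two-tailed conclusion does not depend on the labelling as long as each mean stays with its own standard deviation and sample size.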
Question 8 (3 points)
A professor is interested in finding out whether a higher proportion of students score above 90 in the second exam than in the first exam. Suppose two samples are collected for the two exams and saved in the ExamScores.csv file. The variables are called Exam1 and Exam2 respectively. Which of the following Python lines can be used to perform a hypothesis test to investigate whether there is sufficient evidence to conclude that the proportion of students scoring more than 90 is higher in exam 2 than in exam 1?
Question 8 options:
| a) | from statsmodels.stats.proportion import proportions_ztest; import pandas as pd; scores = pd.read_csv('ExamScores.csv'); proportions_ztest(scores)  # Divide the output probability value by 2 to get the one-tailed probability value | |
| b) | from statsmodels.stats.proportion import prop_1samp_ztest; import pandas as pd; scores = pd.read_csv('ExamScores.csv'); x1 = scores[['Exam1']].count()[0]; x2 = scores[['Exam2']].count()[0]; n1 = (scores[['Exam1']] > 90).values.sum(); n2 = (scores[['Exam2']] > 90).values.sum(); counts = [x1, x2]; n = [n1, n2]; null_value = 0.50; alternative = 'larger'; prop_1samp_ztest(counts, n, null_value, alternative) | |
| c) | from statsmodels.stats.proportion import proportions_ztest; import pandas as pd; scores = pd.read_csv('ExamScores.csv'); n1 = scores[['Exam1']].count()[0]; n2 = scores[['Exam2']].count()[0]; x1 = (scores[['Exam1']] > 90).values.sum(); x2 = (scores[['Exam2']] > 90).values.sum(); counts = [x1, x2]; n = [n1, n2]; proportions_ztest(counts, n)  # Divide the output probability value by 2 to get the one-tailed probability value | |
| d) | from statsmodels.stats.proportion import proportions_ztest; import pandas as pd; scores = pd.read_csv('ExamScores.csv'); x1 = scores[['Exam1']].count()[0]; x2 = scores[['Exam2']].count()[0]; n1 = (scores[['Exam1']] > 90).values.sum(); n2 = (scores[['Exam2']] > 90).values.sum(); counts = [x1, x2]; n = [n1, n2]; proportions_ztest(counts, n)  # Divide the output probability value by 2 to get the one-tailed probability value | |
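
A two-proportion test takes two kinds of input that are easy to confuse: the number of "successes" in each sample and the size of each sample. The sketch below separates the two using plain Python lists. The scores are hypothetical stand-ins for the Exam1 and Exam2 columns, and the name `nobs` is an illustrative choice.

```python
# Hypothetical scores standing in for the Exam1 and Exam2 columns.
exam1 = [95, 88, 91, 72, 60]
exam2 = [93, 97, 85, 92, 99]

# Successes: how many scores exceed 90 in each sample.
x1 = sum(score > 90 for score in exam1)
x2 = sum(score > 90 for score in exam2)

# Sample sizes: how many scores were observed in each sample.
n1 = len(exam1)
n2 = len(exam2)

counts = [x1, x2]
nobs = [n1, n2]

# Sanity check: a success count can never exceed its sample size.
assert all(x <= n for x, n in zip(counts, nobs))

print(counts, nobs)
```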
Question 9 (3 points)
Which of the following Python functions is used to perform a hypothesis test for the difference in two population means using data from a sample (i.e., using actual sample data and not using summary data)?
Question 9 options:
| a) | ttest_ind(data1, data2, equal_var=False) | |
| b) | means_1samp_ttest(mean, std_dev, n, null_value, alternative) | |
| c) | ttest(mean1, stdev1, n1, mean2, stdev2, n2, equal_var=False) | |
| d) | ttest_rel(data1, data2) | |
Question 10 (3 points)
Which of the following Python functions is used to perform a paired t-test?
Question 10 options:
| a) | ttest_ind(data1, data2, equal_var=False) | |
| b) | ttest(mean1, stdev1, n1, mean2, stdev2, n2, equal_var=False) | |
| c) | ttest_rel(data1, data2) | |
| d) | means_1samp_ttest(mean, std_dev, n, null_value, alternative) | |