Question
Subject: Statistical Methods and Analysis
Topic: College Registration
Instruction:
Outcomes:
The successful student will be able to:
- Describe the types of data collected in a specific enterprise process
- Identify types of data found in a typical business data set
- Create a structured data set for a sample representing a specified enterprise process
1. Create an Excel data set for a sample of people, units, or events
A. Your data set will be a sample of people, units or activities related to a specified business function or process. Each group will have a unique topic for which to build their data set.
B. A case is a single row that represents one item in your sample or population. In the example data set, the data set is a sample of countries. See the 2nd column. Each row represents the data for a country.
C. There must be at least 60 cases or rows of observations.
- Each row will represent one person, unit, or instance of an activity from your sample or population.
- Each row will be identified separately by a unique code found in the first column of each row. See column 1 in the sample.
D. There must be between 12 and 15 variables, each of which is a column of data. (If you exceed 15, only the first 15 will be considered. Having fewer than 12 columns will be penalized.)
- Each column represents a single variable describing the people, units or activities collected in the sample.
- A header in the first row of the data set identifies what the variable is.
- Please continue reading to see what kind of variables/columns are required.
E. The variables (columns) chosen for the data set must include:
i. The first column at the left as an Identifier. It contains the unique identifying code for each row. Note that in our sample data set, there is a different number for each country in Country ID, which is the first column in the data set.
ii. A column is required for a qualitative, categorical variable where each value provides a description of the cases such as name or product type or type of event.
In our sample data set, Country is an example of this type of required variable.
iii. At least two columns of qualitative, categorical variables where there is a minimum of three and a maximum of eight different values for the entire set of cases.
These are important variables as they will allow you to create charts using multiple variables required in Part B.
In our sample data set, there are three such variables. Column 3, Economic Status; Column 5, Gov't Health Spending Levels; and Column 9, Food Safety Risk Level, all have just three different values. You don't have to restrict your variable to three values but the list should be short.
iv. At least three columns of measurable, quantitative variables. Do not include the units with the numbers in the column of data. Indicate the units in the variable name. For example: Instead of entering a value of 15 kg for a case, your column header will be Weight (kg) and you will enter 15. In our example data set, Column 4, Govt Health Spending (per capita); Column 10, Birth Rate per 1000 Women; and Columns 11 through 13, showing life expectancy for males, females and the overall average, are continuous quantitative variables.
v. At least one variable that is quantitative with few (8 or fewer) values. In our example data set, Column 6, the Epidemic Control Index, is an example of this type of variable.
vi. Add more variables to complete the 12 to 15 columns of data. These can be variable types of your own choosing.
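The structural requirements in parts C through E can be checked mechanically before the data set is submitted. The Python sketch below is one possible way to do that, not part of the assignment. It assumes the worksheet has been exported as a header list plus a list of rows of text cells. The function names `is_number` and `check_structure` are hypothetical, and the rule that a numeric column with more than 8 distinct values counts as a "measurable" quantitative variable is an assumption made for illustration. The descriptive name column (item ii) is not checked.

```python
from typing import List


def is_number(value: str) -> bool:
    """Return True when the cell text parses as a number."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def check_structure(header: List[str], rows: List[List[str]]) -> List[str]:
    """Return a list of requirement violations (empty list = all checks pass)."""
    problems = []
    if len(rows) < 60:
        problems.append(f"only {len(rows)} cases; at least 60 required")
    if not 12 <= len(header) <= 15:
        problems.append(f"{len(header)} variables; 12 to 15 required")

    # Assumes every row has one cell per header entry.
    columns = [[row[i] for row in rows] for i in range(len(header))]

    # The first column must hold a unique identifier for every case.
    if columns and len(set(columns[0])) != len(rows):
        problems.append("identifier column contains duplicates")

    short_categorical = 0    # qualitative, 3 to 8 distinct values
    many_valued_numeric = 0  # quantitative, more than 8 distinct values (assumption)
    few_valued_numeric = 0   # quantitative, 8 or fewer distinct values
    for col in columns[1:]:
        distinct = len(set(col))
        if all(is_number(v) for v in col):
            if distinct <= 8:
                few_valued_numeric += 1
            else:
                many_valued_numeric += 1
        elif 3 <= distinct <= 8:
            short_categorical += 1

    if short_categorical < 2:
        problems.append("fewer than 2 qualitative variables with 3 to 8 values")
    if many_valued_numeric < 3:
        problems.append("fewer than 3 quantitative variables with many values")
    if few_valued_numeric < 1:
        problems.append("no quantitative variable with 8 or fewer values")
    return problems
```

An empty result only means the counts are right. It says nothing about whether the values are realistic, which is graded separately.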
2. Once you have your column headers in place, you must populate the data set with at least 60 cases. Please note: You will be graded on the quality of the cases in your data set:
i. The variables reflect the kind of data that would be collected in the enterprise process you are assigned.
ii. For each case (row), the values are "internally consistent", which means they make sense to be in the same case.
For example, in the sample data set, a "3rd World country" is unlikely to be able to afford High Per Capita spending on health care. Maybe one or two might, but most would not. They would not have high longevity. On the other hand, 1st world countries would have values that are different than 3rd world countries.
iii. There are very few duplicate cases. Values for continuous variables and combinations of values across a case tend to be different for most cases.
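The two quality points above, few duplicate cases and internally consistent rows, can be screened with a short script. The following Python sketch is illustrative only. The helper names `duplicate_cases` and `share_violating` are hypothetical, the identifier is assumed to sit in column 0, and the consistency rule shown (a least-developed country should rarely have High health spending) is just one example drawn from the sample data set.

```python
from collections import Counter
from typing import Callable, List, Tuple


def duplicate_cases(rows: List[List[str]]) -> List[Tuple[str, ...]]:
    """Return cases that occur more than once, ignoring the identifier in column 0."""
    counts = Counter(tuple(row[1:]) for row in rows)
    return [case for case, n in counts.items() if n > 1]


def share_violating(rows: List[List[str]], rule: Callable[[List[str]], bool]) -> float:
    """Return the fraction of cases for which the consistency rule is False."""
    if not rows:
        return 0.0
    return sum(1 for row in rows if not rule(row)) / len(rows)


def plausible_spending(row: List[str]) -> bool:
    """Example rule (assumed layout: column 2 = economic status, column 3 = spending level)."""
    return not (row[2] == "LLDC" and row[3] == "High")
```

A small share of rule violations is acceptable, since the instructions allow for "one or two" exceptions. A large share suggests the values were assigned at random.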
3. In a separate worksheet, beside the worksheet containing your data set, provide documentation for your data set. (See Appendix_B for an example format and content.) Include the following:
a. A paragraph describing the process you were assigned and the item, event or person you are collecting the data for
b. A list of your variables and information about them:
i. The name of your variable (column label)
ii. The type of variable it is: Qualitative, Quantitative Discrete or Quantitative Continuous
iii. The range of values for continuous variables or the list of values for discrete variables
iv. An explanation of what the variable represents and how it is represented by its values or range of values.
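Items i through iii of the variable list can be drafted from the data itself, leaving only the written explanation (item iv) to be added by hand. The Python sketch below shows one way to do this. The function name `describe_variable` is hypothetical, and treating a numeric column with 8 or fewer distinct values as Quantitative Discrete is an assumption borrowed from the "8 or fewer values" wording above, not a rule stated for the documentation sheet.

```python
from typing import Dict, List


def describe_variable(name: str, values: List[str]) -> Dict[str, object]:
    """Draft a documentation entry: variable name, type, and values or range."""
    try:
        numbers = [float(v) for v in values]
    except ValueError:
        # At least one cell is not numeric, so treat the column as qualitative.
        return {"name": name, "type": "Qualitative", "values": sorted(set(values))}

    distinct = sorted(set(numbers))
    if len(distinct) <= 8:  # assumed threshold for a discrete variable
        return {"name": name, "type": "Quantitative Discrete", "values": distinct}
    return {
        "name": name,
        "type": "Quantitative Continuous",
        "range": (min(numbers), max(numbers)),
    }
```

The draft should still be reviewed by a person: an identifier column made of digits, for instance, would be labelled quantitative by this sketch even though it is not a measurement.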
Sample Provided from Professor
Appendix A: Example Data Set
This is the World Health Organization 2012 Longevity Data Set. While it is not a business data set, it is structured in the way yours needs to be.
The first row displays column headers, which are the names of your variables. This data set has 13 variables. You need 15 variables.
Each row, a case, is one of the people, units or events in your sample. This data set is studying countries. There is a row for each country. Here there are 15 rows. You need 60 rows.

Country ID | Country | Economic Status | Govt Health Spending (per capita) | Govt Health Spending Level | Epidemic Control Index | Epidemic Control Preparedness | Food Safety Index | Food Safety Risk Level | Birth Rate per 1000 Women | Average Life Expectancy (Male) | Average Life Expectancy (Female) | Average Life Expectancy
01 | Afghanistan | LLDC | 10.70 | Very low | | | | Very Risky | 35.30 | 58 | 61 | 60
02 | Algeria | LDC | 234.40 | Very low | | | | Very Risky | 24.60 | 70 | 73 |
03 | Andorra | DC | 2340.60 | High | 75 | 40 | 73 | Very Risky | 9.00 | 79 | 86 | 83
04 | Antigua & Barbuda | LDC | 513.60 | Low | 100 | 60 | 80 | Risky | 16.6 | 73 | 77 | 75
05 | Argentina | LDC | 688.70 | Low | 50 | 100 | 60 | Safe | 16.90 | 73 | 79 | 76
06 | Armenia | LDC | 62.90 | Very low | 75 | 100 | 87 | Safe | 13.90 | 67 | 75 | 71
07 | Australia | DC | 4108.40 | High | 100 | 100 | 100 | Safe | 13.30 | 81 | 85 | 83
08 | Austria | DC | 4085.10 | High | 75 | 100 | 100 | Safe | 9.50 | 78 | 83 | 81
09 | Bahrain | LDC | 643.50 | Low | 100 | 70 | 93 | Risky | 15.60 | 76 | 78 | 77
10 | Bangladesh | LLDC | 9.00 | Very low | 75 | 50 | 27 | Very Risky | 20.30 | 69 | 71 | 70

(Only rows 01 to 10 are visible in the screenshot. Rows 01 and 02 each show a single index value, 13 and 10 respectively, whose column cannot be determined, and row 02 shows no average life expectancy. Those cells are left blank.)
College Registration 2021 (Excel workbook "Statistical Assignment 1"; worksheets: Registration Data, Data Set Documentation)

Student Identification | Admission Categories | Last Name | First Name | Registration Types | Genders | Age Group (Years) | Admission Dates | Awarded Credentials | Programs | Campus Locations | Bursaries / Grants ($) | Program Cost Discounts (%) | Tuition Cost and Books ($) | Overall GPA (Grade Points)
911589090 | Domestic | Aaron | Frankie | Full Time | Male | 18 | 10/5/2021 | Diploma | Accounting | | 1500.00 | 3 | 9832.00 | 2.22
913458905 | Domestic | Adiliee | Dianella | Full Time | Other | 20 | 12/16/2021 | Diploma | Marketing | | 1100.00 | | 9214.00 | 3.24
914589056 | Domestic | Andrade | Ellenla | Part Time | Female | 23 | 11/15/2021 | Diploma | Accounting | | 1100.00 | | 9832.00 | 3.5
915719207 | International | Augusta | Bill | Full Time | Other | 25 | 12/10/2021 | Diploma | Marketing | | 1500.00 | | 34320.00 | 4
917979509 | International | Ballmer | Hanna | Full Time | Other | 19 | 8/29/2021 | Diploma | Marketing | | 750.00 | | 34320.00 | 4
919109660 | Domestic | Barbara | Anne | Full Time | Female | 20 | 11/15/2021 | Advanced Diploma | Finance | | 750.00 | | 14748.00 | 3.84
920239811 | International | Batto | Ed | Full Time | Male | 21 | 10/20/2021 | Diploma | Accounting | | 2300.00 | 8 | 34940.00 | 3.51
922500113 | International | Belen | Carmel | Full Time | Female | 23 | 11/20/2021 | Bachelors Degree | Human Resourc | | 2800.00 | 10 | 85808.00 | 3.85
916849358 | Domestic | Bogley | Kenneth | Part Time | Male | 18 | 8/30/2021 | Diploma | Accounting | | 1000.00 | | 9832.00 | 3.02
921369962 | Domestic | Brenden | Maryanne | Part Time | Other | 27 | 8/19/2021 | Diploma | Marketing | | 1100.00 | 6 | | 2.06
924760415 | International | Camee | Cheryllee | Full Time | Female | 19 | 10/11/2021 | Bachelors Degree | Human Resourc | | 1500.00 | 10 | 85808.00 | 3.79
927020717 | Domestic | Canti | Delsis | Part Time | Other | 18 | 11/25/2021 | Diploma | Marketing | | 500.00 | 6 | 9214.00 | 3.38
928150868 | International | Carag | Berg | Full Time | Male | 19 | 10/28/2021 | Bachelors Degree | Human Resourc | | 2000.00 | 10 | 85808.00 | 2.13
929281019 | Domestic | Carrieg | Lorianne | Full Time | Female | 21 | 8/8/2021 | Diploma | Accounting | | 750.00 | | | 4

(Only the first 14 records are visible in the screenshot. Column headers are reassembled from wrapped header text. Campus Locations values and several discount and tuition cells are not legible and are left blank.)

Application Project Grading Sheet Part A (M Fernandez, T Whittaker, H Fernandez, C Froklage, M Ravalico)

Requirement: Each record represents one individual person or unit from the assigned topic population or sample.
Out of: 0.5. Grade: 0.5.
Comments: Each is a unique record representing a student in the data set.

Requirement: Meets the requirements in terms of the number of records and the number of variables: 12-15 variables and a minimum of 60 records.
Out of: 0.5. Grade: 0.5.
Comments: You have included sufficient records and variables.

Requirement: Given that sufficient records exist, each record is a reasonable description of an individual in the population or sample and the records together represent sufficient variability in the population or sample.
Out of: 1.5. Grade: 0.
Comments: You are using Sheridan College as an example as you are using the three campuses. The records are not representative of the population you are collecting the data from. Even if the data is fictional, the population has to be reasonably representative. For example, 28 of the 60 students are International. Yet nearly 100% of the names are Anglo. Trouble with Anglo names, they are gender specific and many names do not match the gender you have indicated. You have tuition and book fees for the year that are $34,000+ and $85,000+ that are significantly unrealistic. You put business programs on the Trafalgar campus where there are none. This is a project and your team has five members. You need to do some work among you in finding out what is reasonable for a college registration database.

Requirement: All the required variable types are represented and the number of each type has been provided.
Out of: 1.5. Grade: 0.5.
- 1st column: unique identifier.
- 2nd column: name or type of individual. Comments: Your two name columns are counted as one column as indicated in the topic selection document.
- At least 2 Qualitative variables with between 3 to 8 values. Comments: These are weak as they are randomly assigned. If you look across the column, the values aren't always consistent. Example is the gender and name indicated earlier.
- At least 3 Quantitative variables that measure distinctly different things about your population. Comments: You have problems with your Quantitative values. Discounts are not offered for tuition which is government regulated. Tuition fees are "discounted" by offering grants, scholarships, on-campus jobs and student loans. Even if there were discounts, the discount would not apply across all tuition, all books and all academic fees. So, this variable doesn't work. Take it out of your data set. Tuition fees are the same for all Business programs but different only for Diploma versus Degree and Domestic versus International. Even with books included, they don't swing by $20,000 or more. You need to fix the values for this variable. Your Part Time students would not be paying $35,000 or more for tuition. At best, 25% of students get bursaries or grants and these are matched to GPA which you didn't do. You need to fix this one. You have randomized the assignment of values to your records. You need to think through what is realistic for each student type you have.
- At least 1 Quantitative/Discrete variable with between 3 to 8 values. Comments: These are weak as in many cases, they are randomly assigned.
- No Yes/No variables.

Requirement: Documentation of the data set is complete.
Out of: 1. Grade: 0.75.
Comments: Generally, satisfactory. GPA explanation is unsatisfactory. GPA is a score that has already been earned. You need at least one other field with this score to indicate what semesters it covers or from what institution.

Total: 2.25 out of 5.
The better you do this part of the project, the easier and faster Part B will be.