Question
https://catalog.data.gov/dataset/crime-data-from-2020-to-present
1. Find a dataset and describe your motivation (10pt)
   a. Find a dataset online and provide a URL to the data (5pt)
      i. This dataset should have at least 4 variables, with at least 1 numeric variable and 1 character variable
      ii. Some places to find datasets:
          1. The UCI machine learning repository: https://archive.ics.uci.edu/
          2. Kaggle: https://www.kaggle.com
          3. data.gov
          4. data.europa.eu
   b. Explain why you chose this data (what interest it holds for you personally or professionally) (5pt)
2. Import the data into SAS OnDemand (20pt)
   a. It is recommended that you not try to read the data in from the URL and instead download the file to your computer and then upload it to your SAS OnDemand workspace.
   b. Be sure you know what type of delimiting your data uses so that you read it in correctly
   c. Use the REPLACE option in your PROC IMPORT so that old versions will be overwritten if your first PROC IMPORT does not work as intended
   d. Save this dataset to a permanent library
3. Do the following tasks (70pt):
   a. Create a new variable in your dataset using the existing variables (10pt)
      i. For example, you may add two columns together, add a Yes/No version of one column, or anything else!
   b. Give labels to variables which you are interested in (does not have to be all variables!) (10pt)
   c. Sort your data by a variable of your choosing and output the sorted data to a file in your working directory (10pt)
   d. Generate a contingency table for at least one of your character variables, also displaying the cumulative column percentages (look at SAS documentation if you are unsure how to do this), and comment on the results (15pt)
   e. Get the mean and standard deviation or the median and the interquartile range for at least one of your numeric variables across different values of one of your character variables and comment on the results (15pt)
   f. Create at least one (1) appropriate plot and explain what information it provides (10pt)
4. BONUS POINTS (15pt)
   a. Subset your data to only the variables which you use in your script. If you use all of your variables, you may write a statement that will keep all of the variables in your dataset (5pt)
   b. Change the names of some of your variables to names which are more descriptive or easier to type (5pt)
   c. For part 3(d) or 3(e), do this with only a subset of your data (5pt)
5. Save your output as a .PDF file and submit it to Moodle alongside your .SAS file. You're all done.
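A minimal sketch of how parts 2, 3(a), 3(b), 3(d), and 3(e) might start in SAS is given below. It is not a complete or official solution. The folder path, the file name, the PROJ libref, and the column names Vict_Age, Vict_Sex, and AREA_NAME are all assumptions about how the downloaded CSV imports, so check them with PROC CONTENTS before relying on them.

```sas
/* Sketch only. The paths, the PROJ libref, and the column names
   (Vict_Age, Vict_Sex, AREA_NAME) are assumptions; verify them
   against your own upload with PROC CONTENTS. */
options validvarname=v7;   /* import a header such as "Vict Age" as Vict_Age */
libname proj "/home/your_user_id/project";   /* hypothetical permanent library */

/* Part 2: import the uploaded comma-delimited file */
proc import datafile="/home/your_user_id/project/crime_data.csv"   /* hypothetical file name */
    out=proj.crime
    dbms=csv
    replace;               /* overwrite the result of an earlier attempt */
    getnames=yes;
    guessingrows=1000;     /* rows scanned when deciding column types */
run;

/* Parts 3(a) and 3(b): a Yes/No variable plus labels */
data proj.crime_new;
    set proj.crime;
    length adult_victim $3;
    if missing(Vict_Age) then adult_victim = " ";   /* keep unknown ages blank */
    else if Vict_Age >= 18 then adult_victim = "Yes";
    else adult_victim = "No";
    label adult_victim = "Victim aged 18 or older"
          Vict_Age     = "Victim age (years)";
run;

/* Part 3(d): crosstab with cumulative column percentages */
proc freq data=proj.crime_new;
    tables AREA_NAME * adult_victim / cumcol;
run;

/* Part 3(e): numeric summary across levels of a character variable */
proc means data=proj.crime_new mean std median qrange;
    class Vict_Sex;
    var Vict_Age;
run;
```

If Vict_Age imports as a character column, the comparison and the PROC MEANS step will not work as written; raising GUESSINGROWS or converting the column with INPUT() is one possible workaround.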