Question:
Subset the data set based on the location, day of the week, type of collision, and lighting condition. Compare these subsets of data to find interesting patterns. Can you identify any links between crash fatality and the aforementioned variables? Are there any missing values? Which strategy should you use to handle the missing values? Because many of the variables are categorical, you should consider transforming them into dummy variables prior to the analysis.
ID | County | City | Weekday | Severity | ViolCat | ClearWeather | Month | CrashType | Highway |
1 | SAN DIEGO | SAN DIEGO | 7 | 1 | 8 | 0 | 1 | A | 0 |
2 | HUMBOLDT | UNINCORPORATED | 4 | 1 | 8 | 1 | 1 | A | 1 |
3 | VENTURA | OXNARD | 2 | 1 | 12 | 1 | 2 | A | 0 |
4 | STANISLAUS | UNINCORPORATED | 4 | 1 | 1 | 1 | 1 | A | 0 |
5 | MENDOCINO | UNINCORPORATED | 5 | 1 | 1 | 1 | 1 | A | 1 |
6 | LOS ANGELES | LONG BEACH | 7 | 1 | 3 | 1 | 3 | A | 0 |
7 | LOS ANGELES | LOS ANGELES | 4 | 1 | 3 | 0 | 3 | A | 0 |
8 | CALAVERAS | UNINCORPORATED | 1 | 1 | 1 | 1 | 2 | A | 1 |
9 | SAN BERNARDINO | HESPERIA | 2 | 1 | 1 | 1 | 1 | A | 0 |
10 | VENTURA | OXNARD | 5 | 0 | 8 | 1 | 1 | A | 0 |
11 | VENTURA | OXNARD | 6 | 0 | 8 | 0 | 1 | A | 0 |
12 | ORANGE | FULLERTON | 4 | 0 | 9 | 1 | 2 | A | 0 |
13 | SAN DIEGO | CHULA VISTA | 1 | 0 | 3 | 1 | 1 | A | 0 |
14 | ALAMEDA | OAKLAND | 6 | 0 | 1 | 1 | 1 | A | 0 |
15 | LOS ANGELES | LOS ANGELES | 5 | 0 | 9 | 1 | 3 | A | 0 |
16 | SANTA CLARA | MORGAN HILL | 4 | 0 | 8 | 1 | 3 | A | 0 |
17 | LOS ANGELES | LOS ANGELES | 3 | 0 | 9 | 1 | 3 | A | 0 |
18 | SAN JOAQUIN | UNINCORPORATED | 3 | 0 | 8 | 1 | 3 | A | 0 |
19 | LOS ANGELES | LOS ANGELES | 5 | 0 | 9 | 1 | 4 | A | 0 |
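One possible way to start on the steps the question lists (check for missing values, subset, compare outcome rates, build dummy variables) is sketched below in plain Python. It is only a sketch under stated assumptions: it treats `Severity == 1` as the fatal or severe outcome, which the excerpt does not confirm; it uses only a handful of rows copied from the table above; and because the excerpt shows no lighting column, it uses `ClearWeather` as a stand-in grouping variable. The helper names `missing_counts`, `severity_rate_by`, and `add_dummies` are hypothetical, not taken from the book.

```python
import csv
import io
from collections import defaultdict

# A few rows copied from the table above, inlined so the sketch is self-contained.
SAMPLE = """ID|County|City|Weekday|Severity|ViolCat|ClearWeather|Month|CrashType|Highway
1|SAN DIEGO|SAN DIEGO|7|1|8|0|1|A|0
2|HUMBOLDT|UNINCORPORATED|4|1|8|1|1|A|1
3|VENTURA|OXNARD|2|1|12|1|2|A|0
10|VENTURA|OXNARD|5|0|8|1|1|A|0
11|VENTURA|OXNARD|6|0|8|0|1|A|0
15|LOS ANGELES|LOS ANGELES|5|0|9|1|3|A|0
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE), delimiter="|"))


def missing_counts(rows):
    """Count empty or absent cells per column (hypothetical helper)."""
    counts = defaultdict(int)
    for row in rows:
        for col, value in row.items():
            if value is None or value.strip() == "":
                counts[col] += 1
    return dict(counts)


def severity_rate_by(rows, key):
    """Share of rows with Severity == 1 within each level of `key`.

    ASSUMPTION: Severity == 1 marks the fatal/severe outcome.
    """
    totals, hits = defaultdict(int), defaultdict(int)
    for row in rows:
        totals[row[key]] += 1
        hits[row[key]] += int(row["Severity"] == "1")
    return {level: hits[level] / totals[level] for level in totals}


def add_dummies(rows, key):
    """Return copies of the rows with one 0/1 indicator column per level of `key`."""
    levels = sorted({row[key] for row in rows})
    return [
        {**row, **{f"{key}_{lvl}": int(row[key] == lvl) for lvl in levels}}
        for row in rows
    ]


# Subset by location, then compare outcome rates across a grouping variable.
ventura = [r for r in rows if r["County"] == "VENTURA"]

print(missing_counts(rows))
print(severity_rate_by(rows, "ClearWeather"))
print(add_dummies(rows, "County")[0])
```

The same grouping function can be called with `"Weekday"`, `"County"`, or `"CrashType"` to compare the other subsets. If the dummy variables feed a regression, one level per variable is normally dropped to serve as the reference category. On the full data set, pandas (`pandas.get_dummies`, `DataFrame.isna`) would likely be the more convenient tool.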
Related Book For
Business Analytics
ISBN: 9781265897109
2nd Edition
Authors: Sanjiv Jaggia, Alison Kelly, Kevin Lertwachara, Leida Chen