
Question

need accurate answers for all questions please
Please do your work independently and do not copy or distribute the exam sheet, thank you!

- Let x1 be the last digit of your student ID, x2 be the second-to-last digit, and so on.
- x1 = _, x2 = _, x3 = _, x4 = _, x5 = _, x6 = _, x7 = _.
- Example ID: 7654321, then x1 = 1, x2 = 2, x3 = 3, x4 = 4, x5 = 5, x6 = 6, x7 = 7.
- Use the values of x1, ..., x7 for all the hands-on assignments below.
- For all of the following questions, to receive points, please submit 1) the answers to the questions and 2) the corresponding IPython notebook files to the CANVAS assignment section "Final Hands on Q1, Q2, Q3".

1. [10 pts] Spark SQL. Input file: every (15+x1)-th sample of minute_weather.csv. If we impute the missing values in the "air pressure at 9 am" column with the column's average value, how many "air pressure at 9 am" measurements have values between (910+x2) and (920+x3)?

2. [10 pts] Spark Decision Tree Classification and Evaluation. Input file: every (15+x4)-th sample of minute_weather.csv. If we train a decision tree classifier with a train-test split ratio of 0.7 to 0.3, seed = (x5+123), and the maximum depth of the tree set to (x6+6), what is the false positive rate after classification?

3. [10 pts] Spark k-means Clustering. Input file: daily_weather.csv. If we perform clustering with (x7+8) clusters (and seed = x1+10), which cluster appears to identify Santa Ana conditions (lowest humidity and highest wind speeds)?
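The parameter arithmetic above can be sketched in plain Python before any Spark code is written. This is a minimal sketch, not part of the exam sheet: the function names `id_digits`, `assignment_parameters`, and `false_positive_rate`, and the dictionary keys, are hypothetical names chosen for illustration. The false positive rate helper assumes the usual binary definition FP / (FP + TN), which the question does not spell out.

```python
def id_digits(student_id: str) -> dict:
    """Map a 7-digit student ID to x1..x7, where x1 is the LAST digit."""
    digits = [int(ch) for ch in reversed(student_id)]  # last digit first
    if len(digits) < 7:
        raise ValueError("expected a student ID with at least 7 digits")
    return {f"x{i}": d for i, d in enumerate(digits[:7], start=1)}


def assignment_parameters(student_id: str) -> dict:
    """Derive the numeric settings named in questions 1 to 3 (key names are hypothetical)."""
    x = id_digits(student_id)
    return {
        # Q1: keep every (15 + x1)-th row; count values between the two bounds
        "q1_sample_step": 15 + x["x1"],
        "q1_lower_bound": 910 + x["x2"],
        "q1_upper_bound": 920 + x["x3"],
        # Q2: keep every (15 + x4)-th row; decision tree settings
        "q2_sample_step": 15 + x["x4"],
        "q2_seed": x["x5"] + 123,
        "q2_max_depth": x["x6"] + 6,
        # Q3: k-means settings
        "q3_num_clusters": x["x7"] + 8,
        "q3_seed": x["x1"] + 10,
    }


def false_positive_rate(fp: int, tn: int) -> float:
    """Assumed binary definition: FP / (FP + TN), using counts from a confusion matrix."""
    if fp + tn == 0:
        raise ValueError("no actual negatives: false positive rate is undefined")
    return fp / (fp + tn)
```

For the example ID 7654321, `assignment_parameters("7654321")` gives a sampling step of 16 and bounds of 912 and 923 for question 1, a sampling step of 19, seed 128, and maximum depth 12 for question 2, and 15 clusters with seed 11 for question 3.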


