
Question


This exercise covers some basic Scala and some basic Spark.

Given the training set: X (age: Double), Y (yearly visits to parks: Double)

X = (10 5 1 6 7 3 4 5 1 8)

Y = (2 4 4 2 4 5 4 5 6 4)

First, write your own file containing this data, as a CSV or text file.

You can use scala.io to read in the file in Scala, and "spark.read. .... " to read it using Spark.

See the cookbook!
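As a minimal sketch of the file step, the following writes the ten observations to a CSV file and reads them back with `scala.io.Source`. The object name `ParksData`, the `age,visits` header, and the choice of one observation per row are assumptions made for illustration; the exercise does not prescribe a layout. `scala.util.Using` requires Scala 2.13 or later.

```scala
import java.io.PrintWriter
import scala.io.Source
import scala.util.Using

// Hypothetical helper object; names and file layout are illustrative assumptions.
object ParksData {
  val xs: Vector[Double] = Vector(10.0, 5.0, 1.0, 6.0, 7.0, 3.0, 4.0, 5.0, 1.0, 8.0)
  val ys: Vector[Double] = Vector(2.0, 4.0, 4.0, 2.0, 4.0, 5.0, 4.0, 5.0, 6.0, 4.0)

  /** Writes a header line followed by one "age,visits" row per observation. */
  def write(path: String): Unit =
    Using.resource(new PrintWriter(path)) { out =>
      out.println("age,visits")
      xs.zip(ys).foreach { case (x, y) => out.println(s"$x,$y") }
    }

  /** Reads the file back into (age, visits) pairs, skipping the header line. */
  def read(path: String): Vector[(Double, Double)] =
    Using.resource(Source.fromFile(path)) { src =>
      src.getLines().drop(1).map { line =>
        val cols = line.split(",").map(_.trim)
        (cols(0).toDouble, cols(1).toDouble)
      }.toVector // force the lazy iterator before the source is closed
    }
}
```

Calling `ParksData.write("parks.csv")` followed by `ParksData.read("parks.csv")` should return the ten pairs in their original order; the file name here is only an example.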

PART I: Do a Scala analysis and compute the regression coefficient, intercept, SST, SSR, SSE, correlation coefficient R, R^2, and the angle between x and y. Draw the statistical triangle and identify the legs.
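One possible way to compute the Part I quantities in plain Scala is sketched below, using the usual least-squares formulas: slope = Sxy / Sxx, intercept = ȳ - slope · x̄, SST = Σ(y - ȳ)², SSR = Σ(ŷ - ȳ)², SSE = Σ(y - ŷ)², and R = Sxy / √(Sxx · Syy). The names `SimpleRegression` and `Fit` are hypothetical. The angle is taken between the mean-centered x and y vectors, so that its cosine equals R; that reading of "angle between x y" is an assumption.

```scala
// Hypothetical names; a sketch of ordinary least squares with one predictor.
object SimpleRegression {
  final case class Fit(slope: Double, intercept: Double, sst: Double, ssr: Double,
                       sse: Double, r: Double, r2: Double, angleDeg: Double)

  def fit(xs: Seq[Double], ys: Seq[Double]): Fit = {
    require(xs.nonEmpty && xs.length == ys.length, "x and y must be non-empty and of equal length")
    val n = xs.length
    val xBar = xs.sum / n
    val yBar = ys.sum / n

    // Mean-centered vectors and their sums of squares / cross products.
    val dx = xs.map(_ - xBar)
    val dy = ys.map(_ - yBar)
    val sxx = dx.map(d => d * d).sum
    val syy = dy.map(d => d * d).sum
    val sxy = dx.zip(dy).map { case (a, b) => a * b }.sum

    val slope = sxy / sxx
    val intercept = yBar - slope * xBar
    val yHat = xs.map(x => intercept + slope * x)

    val sst = syy                                                   // total variation
    val ssr = yHat.map(h => (h - yBar) * (h - yBar)).sum            // explained by the fit
    val sse = ys.zip(yHat).map { case (y, h) => (y - h) * (y - h) }.sum // residual

    val r = sxy / math.sqrt(sxx * syy)
    // Assumption: angle between the centered vectors, so cos(angle) = r.
    val angleDeg = math.toDegrees(math.acos(r))

    Fit(slope, intercept, sst, ssr, sse, r, r * r, angleDeg)
  }
}
```

For the data above, x̄ = 5, ȳ = 4, Sxx = 76, Syy = 14, and Sxy = -22, so this should give a slope of about -0.289, an intercept of about 5.447, SST = 14, SSR ≈ 6.368, SSE ≈ 7.632, R ≈ -0.674, and R^2 ≈ 0.455. Because SST = SSR + SSE, the statistical triangle can be read as a right triangle with legs √SSR and √SSE and hypotenuse √SST.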

PART II: Do a Spark analysis of this data set.

0. (Optional) Construct your schema to match the incoming data.

1. Read in your file, which will give you a DataFrame (DF).

2. Convert the DataFrame to a Dataset (see ch. 11 of Spark Guide).

3. Construct a VectorAssembler to convert the age column to a features vector.

(Can you use basic Spark data types to do this manually?)

4. Make sure the "visits" column is called "label" (that is, get your DS ready for regression).

5. Do a Spark regression and verify the values calculated earlier in Scala.
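The steps above could be sketched roughly as follows, assuming Spark with the `spark-sql` and `spark-mllib` modules on the classpath. The file name `parks.csv`, its `age,visits` header, the case class `Visit`, and the object name `ParksSpark` are all assumptions made for illustration, not part of the exercise.

```scala
import org.apache.spark.ml.feature.VectorAssembler
import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.ml.regression.LinearRegression
import org.apache.spark.sql.{Dataset, SparkSession}
import org.apache.spark.sql.types.{DoubleType, StructField, StructType}

// Hypothetical row type; field names must match the column names in the schema.
final case class Visit(age: Double, visits: Double)

object ParksSpark {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("parks-regression")
      .master("local[*]") // local mode, assumed for a small exercise
      .getOrCreate()
    import spark.implicits._

    // Step 0: explicit schema matching the assumed "age,visits" file layout.
    val schema = StructType(Seq(
      StructField("age", DoubleType, nullable = false),
      StructField("visits", DoubleType, nullable = false)))

    // Step 1: read the CSV file into a DataFrame.
    val df = spark.read.option("header", "true").schema(schema).csv("parks.csv")

    // Step 2: convert the DataFrame to a typed Dataset.
    val ds: Dataset[Visit] = df.as[Visit]

    // Step 3: assemble the age column into a "features" vector column.
    val assembler = new VectorAssembler()
      .setInputCols(Array("age"))
      .setOutputCol("features")

    // Step 4: rename "visits" to "label", the default label column name.
    val prepared = assembler.transform(ds).withColumnRenamed("visits", "label")

    // Manual alternative to steps 3 and 4: build the vector with Vectors.dense.
    val manual = ds.map(v => (Vectors.dense(v.age), v.visits)).toDF("features", "label")
    manual.show(3)

    // Step 5: fit an unregularized linear regression and print the fitted values.
    val model = new LinearRegression().fit(prepared)
    println(s"slope = ${model.coefficients(0)}, intercept = ${model.intercept}")
    println(s"R^2 = ${model.summary.r2}")

    spark.stop()
  }
}
```

With the default settings `LinearRegression` applies no regularization, so the printed slope, intercept, and R^2 should agree with the plain Scala values from Part I up to floating-point error.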
