Question

What are the distinct values of 'Year' within the data? What command did you use to determine this?
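One possible way to answer this is sketched below. It assumes the columns are named Year, First Name, County, Sex and Count (that is, the CSV was read with the header option), and it uses a few invented toy rows in place of the real file.

```scala
import org.apache.spark.sql.SparkSession

// In spark-shell, getOrCreate() simply reuses the session that already exists.
val spark = SparkSession.builder().appName("names-sketch").master("local[*]").getOrCreate()
import spark.implicits._

// Toy rows invented for illustration; they stand in for the real data.
val df = Seq(
  (2016, "DAVID", "Kings", "M", 231),
  (2016, "OLIVIA", "Kings", "F", 210),
  (2015, "DAVID", "Kings", "M", 200)
).toDF("Year", "First Name", "County", "Sex", "Count")

// distinct() drops duplicate rows, so selecting only Year leaves one row per year.
val years = df.select("Year").distinct().orderBy("Year")
years.show()
```

The command that answers the question is the `select("Year").distinct()` call; the `orderBy` is only there to make the output easier to read.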

 

A statement that creates a new dataframe that only has people named 'DAVID':

val davids = 
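A minimal sketch of one way to fill this in, assuming the name column is called First Name and the names are stored in upper case. The toy rows are invented for illustration.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

// In spark-shell, getOrCreate() simply reuses the session that already exists.
val spark = SparkSession.builder().appName("names-sketch").master("local[*]").getOrCreate()
import spark.implicits._

// Toy rows invented for illustration; they stand in for the real data.
val df = Seq(
  (2016, "DAVID", "Kings", "M", 231),
  (2016, "DAVID", "Queens", "M", 100),
  (2016, "JACOB", "Kings", "M", 228),
  (2015, "DAVID", "Kings", "M", 200)
).toDF("Year", "First Name", "County", "Sex", "Count")

// filter() keeps only the rows where the condition holds; === is Spark's column equality.
val davids = df.filter(col("First Name") === "DAVID")
davids.show()
```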

 

A statement that calculates the total number of people named 'DAVID' by year (across all counties). This means you need to add together the different values per county in the same year:

val davidsByYear = 
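The per-year total could be computed along these lines. This is a sketch under the assumption that the columns are named Year, First Name and Count; the output column name Total and the toy rows are invented for illustration.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, sum}

// In spark-shell, getOrCreate() simply reuses the session that already exists.
val spark = SparkSession.builder().appName("names-sketch").master("local[*]").getOrCreate()
import spark.implicits._

// Toy rows invented for illustration; they stand in for the real data.
val df = Seq(
  (2016, "DAVID", "Kings", "M", 231),
  (2016, "DAVID", "Queens", "M", 100),
  (2016, "JACOB", "Kings", "M", 228),
  (2015, "DAVID", "Kings", "M", 200)
).toDF("Year", "First Name", "County", "Sex", "Count")

// Keep only DAVID rows, then add up the county-level counts within each year.
val davidsByYear = df
  .filter(col("First Name") === "DAVID")
  .groupBy("Year")
  .agg(sum("Count").as("Total"))
davidsByYear.show()
```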

 

 

A statement that calculates the total number of people with the same name and gender by year:

val sameName =
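One way to read this step is as the previous aggregation generalised to every name: group by year, name and sex, then sum over counties. The sketch below assumes that reading; the Total column name and the toy rows are invented for illustration.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.sum

// In spark-shell, getOrCreate() simply reuses the session that already exists.
val spark = SparkSession.builder().appName("names-sketch").master("local[*]").getOrCreate()
import spark.implicits._

// Toy rows invented for illustration; they stand in for the real data.
val df = Seq(
  (2016, "DAVID", "Kings", "M", 231),
  (2016, "DAVID", "Queens", "M", 100),
  (2016, "JACOB", "Kings", "M", 228),
  (2016, "OLIVIA", "Kings", "F", 210),
  (2015, "DAVID", "Kings", "M", 200),
  (2015, "EMMA", "Queens", "F", 150)
).toDF("Year", "First Name", "County", "Sex", "Count")

// One output row per (Year, First Name, Sex), with counts summed across counties.
val sameName = df
  .groupBy("Year", "First Name", "Sex")
  .agg(sum("Count").as("Total"))
sameName.show()
```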

 

A statement that determines the maximum number of times a name was used per gender, per year:

val maxNameCounts =
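A possible sketch of this step, assuming it builds on the per-name totals from the previous step (recomputed here so the snippet runs on its own). The column names Total and MaxTotal and the toy rows are invented for illustration.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{max, sum}

// In spark-shell, getOrCreate() simply reuses the session that already exists.
val spark = SparkSession.builder().appName("names-sketch").master("local[*]").getOrCreate()
import spark.implicits._

// Toy rows invented for illustration; they stand in for the real data.
val df = Seq(
  (2016, "DAVID", "Kings", "M", 231),
  (2016, "DAVID", "Queens", "M", 100),
  (2016, "JACOB", "Kings", "M", 228),
  (2016, "OLIVIA", "Kings", "F", 210),
  (2015, "DAVID", "Kings", "M", 200),
  (2015, "EMMA", "Queens", "F", 150)
).toDF("Year", "First Name", "County", "Sex", "Count")

// Totals per name, sex and year (the previous step).
val sameName = df.groupBy("Year", "First Name", "Sex").agg(sum("Count").as("Total"))

// For each (Year, Sex) pair, keep the largest per-name total.
val maxNameCounts = sameName
  .groupBy("Year", "Sex")
  .agg(max("Total").as("MaxTotal"))
maxNameCounts.show()
```

Note that the result holds only the maximum count, not the name it belongs to, which is why the next step joins it back to sameName.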

 

A statement that joins maxNameCounts with sameName. Note that you have TWO fields to join on (Year and Sex):

val sameNameWithMaxCount =
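The two-column join could look like the sketch below. Passing a Seq of column names to join performs an inner equi-join on both fields and keeps a single copy of each join column. The column names Total and MaxTotal and the toy rows are invented for illustration.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{max, sum}

// In spark-shell, getOrCreate() simply reuses the session that already exists.
val spark = SparkSession.builder().appName("names-sketch").master("local[*]").getOrCreate()
import spark.implicits._

// Toy rows invented for illustration; they stand in for the real data.
val df = Seq(
  (2016, "DAVID", "Kings", "M", 231),
  (2016, "DAVID", "Queens", "M", 100),
  (2016, "JACOB", "Kings", "M", 228),
  (2016, "OLIVIA", "Kings", "F", 210),
  (2015, "DAVID", "Kings", "M", 200),
  (2015, "EMMA", "Queens", "F", 150)
).toDF("Year", "First Name", "County", "Sex", "Count")

// The two inputs from the earlier steps.
val sameName = df.groupBy("Year", "First Name", "Sex").agg(sum("Count").as("Total"))
val maxNameCounts = sameName.groupBy("Year", "Sex").agg(max("Total").as("MaxTotal"))

// Join on both Year and Sex so every name row carries the maximum for its group.
val sameNameWithMaxCount = sameName.join(maxNameCounts, Seq("Year", "Sex"))
sameNameWithMaxCount.show()
```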

 

A statement that filters the results such that we have the most used names per year/sex:
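A sketch of the filter, assuming the joined dataframe carries both the per-name total and the per-group maximum. The result name mostUsedNames, the column names Total and MaxTotal, and the toy rows are all invented for illustration.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, max, sum}

// In spark-shell, getOrCreate() simply reuses the session that already exists.
val spark = SparkSession.builder().appName("names-sketch").master("local[*]").getOrCreate()
import spark.implicits._

// Toy rows invented for illustration; they stand in for the real data.
val df = Seq(
  (2016, "DAVID", "Kings", "M", 231),
  (2016, "DAVID", "Queens", "M", 100),
  (2016, "JACOB", "Kings", "M", 228),
  (2016, "OLIVIA", "Kings", "F", 210),
  (2015, "DAVID", "Kings", "M", 200),
  (2015, "EMMA", "Queens", "F", 150)
).toDF("Year", "First Name", "County", "Sex", "Count")

// The earlier steps, repeated so this snippet runs on its own.
val sameName = df.groupBy("Year", "First Name", "Sex").agg(sum("Count").as("Total"))
val maxNameCounts = sameName.groupBy("Year", "Sex").agg(max("Total").as("MaxTotal"))
val sameNameWithMaxCount = sameName.join(maxNameCounts, Seq("Year", "Sex"))

// Keep only the names whose total equals the maximum for their year and sex.
val mostUsedNames = sameNameWithMaxCount
  .filter(col("Total") === col("MaxTotal"))
  .select("Year", "Sex", "First Name", "Total")
mostUsedNames.show()
```

If two names tie for the maximum in the same year and sex, this filter keeps both of them.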

 

A series of statements that, starting from reading in the CSV file, determines the most popular name per year across both sexes. Determine this not by directly comparing counts, but by the percentage of people given that name within that gender.
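One possible end-to-end sketch follows. It assumes the CSV has a header row with the columns Year, First Name, County, Sex and Count. To stay runnable it writes a small invented CSV to a temporary file; with the real data, csvPath would point at the actual file instead. The intermediate column names (Total, SexTotal, Pct, MaxPct) are also invented.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.Files
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, max, sum}

// In spark-shell, getOrCreate() simply reuses the session that already exists.
val spark = SparkSession.builder().appName("names-sketch").master("local[*]").getOrCreate()

// Invented toy data written to a temp file; replace csvPath with the real CSV location.
val csvText =
  """Year,First Name,County,Sex,Count
    |2016,DAVID,Kings,M,231
    |2016,DAVID,Queens,M,100
    |2016,JACOB,Kings,M,228
    |2016,OLIVIA,Kings,F,210
    |2016,EMMA,Queens,F,90
    |2015,DAVID,Kings,M,200
    |2015,JACOB,Kings,M,50
    |2015,EMMA,Queens,F,150
    |2015,OLIVIA,Queens,F,150
    |""".stripMargin
val csvPath = Files.createTempFile("names", ".csv")
Files.write(csvPath, csvText.getBytes(StandardCharsets.UTF_8))

// header = true uses the first line as column names; inferSchema makes Year and Count numeric.
val raw = spark.read.option("header", "true").option("inferSchema", "true").csv(csvPath.toString)

// Total per name within each year and sex, and total births per year and sex.
val byName = raw.groupBy("Year", "Sex", "First Name").agg(sum("Count").as("Total"))
val bySex = raw.groupBy("Year", "Sex").agg(sum("Count").as("SexTotal"))

// Share of each name within its own sex for that year, as a percentage.
val withPct = byName
  .join(bySex, Seq("Year", "Sex"))
  .withColumn("Pct", col("Total") / col("SexTotal") * 100)

// Highest share per year across both sexes, then keep the name(s) that reach it.
val maxPct = withPct.groupBy("Year").agg(max("Pct").as("MaxPct"))
val mostPopular = withPct
  .join(maxPct, Seq("Year"))
  .filter(col("Pct") === col("MaxPct"))
  .select("Year", "First Name", "Sex", "Pct")
mostPopular.show()
```

Comparing percentages rather than raw counts matters when the two sexes have different totals: in the toy data, DAVID has the larger 2016 count, but OLIVIA accounts for a larger share of her own group.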


 

scala> df.show()
+----+----------+------+---+-----+
| _c0|       _c1|   _c2|_c3|  _c4|
+----+----------+------+---+-----+
|Year|First Name|County|Sex|Count|
|2016|     DAVID| Kings|  M|  231|
|2016|     JACOB| Kings|  M|  228|
|2016|     ETHAN|Queens|  M|  224|
|2016|      LIAM|Queens|  M|  217|
|2016|    OLIVIA| Kings|  F|  210|
|2016|     ETHAN| Kings|  M|  209|
|2016|    DANIEL| Kings|  M|  199|
|2016|     MOSHE| Kings|  M|  198|
|2016|      LIAM| Bronx|  M|  186|
|2016|      NOAH| Kings|  M|  184|
|2016|    ESTHER| Kings|  F|  183|
|2016|    SOPHIA|Queens|  F|  182|
|2016|    RACHEL| Kings|  F|  179|
|2016|     AIDEN|Queens|  M|  177|
|2016|   MATTHEW|Queens|  M|  176|
|2016|      NOAH|Queens|  M|  175|
|2016|     SARAH| Kings|  F|  174|
|2016|      EMMA|Queens|  F|  171|
|2016|      LEAH| Kings|  F|  170|
+----+----------+------+---+-----+
only showing top 20 rows

