
Question

DSC 430: Python Programming
Assignment 0802: SpotiPy
In this assignment, you will use pandas to process a data set of artists and their tracks downloaded from the Spotify Web API. The data come in two files, artists.tsv and tracks.tsv, both of which contain tab-separated values. They contain 240K uniquely identified artists and 450K tracks performed by these artists, respectively. The former holds basic information such as each artist's number of followers and genres; the latter holds information about the popularity of the tracks and the IDs of the artists who performed them.
Note that the "id_artists" column of tracks.tsv holds one or more IDs separated by ", " (a comma and a space). These IDs can be matched with the ones in the "id" column of artists.tsv to uniquely identify the artists who performed the tracks. Another multi-value column is "genres" in artists.tsv. It shows the genre(s) assigned to each artist, and its values are also separated by ", " (a comma and a space). Below, we define a "jazz artist" as an artist who has at least one value in this column in which "jazz" is mentioned (e.g., "jazz pop", "soul jazz", etc.). Similarly, a "pop artist" is an artist who has at least one genre value containing the string "pop". The same definition applies to other genres.
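The multi-value columns and the genre definition above can be sketched as follows. This is a minimal illustration on made-up rows, not the assignment's reference solution: the helper name genre_mask and the toy data are assumptions, and only the column names "id", "genres", and "id_artists" come from the assignment text. Because the separator ", " never appears inside a single genre value, a substring test on the whole cell is equivalent to testing each value separately.

```python
import pandas as pd

# Made-up rows standing in for artists.tsv and tracks.tsv.
artists = pd.DataFrame({
    "id": ["a1", "a2", "a3"],
    "genres": ["jazz pop, soul jazz", "rock", None],
})
tracks = pd.DataFrame({"id_artists": ["a1", "a1, a2"]})

def genre_mask(frame, genre):
    """Return a boolean Series: True where some genre value mentions `genre`.

    Hypothetical helper; missing genres are treated as "no genre".
    """
    return frame["genres"].fillna("").str.contains(genre, regex=False)

jazz_mask = genre_mask(artists, "jazz")  # artists counted as "jazz artists"
pop_mask = genre_mask(artists, "pop")    # artists counted as "pop artists"

# Split the multi-value ID column into lists, one list per track.
artist_ids = tracks["id_artists"].str.split(", ")

# One row per (track, artist) pair, checked against the "id" column.
matched = artist_ids.explode().isin(artists["id"])
```

Note that a plain substring test follows the definition literally, so a short genre string also matches longer names that happen to contain it.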
Load the two .tsv files into two Pandas dataframes and use Pandas methods and functions to address the following questions:
Identify and print the name and genre of the artist with the maximum number of followers.
Identify and print the name of the most productive artist in terms of the number of tracks (s)he performed.
Write a function called summarize_genres(genres) that takes a list of genres and returns a dataframe with three columns: "genre" (the names of the input genres), "total N" (the total number of artists in each genre), and "Av. followers" (the average number of followers of the artists in each genre).
Write a function called get_genre_variants(genre) that takes a genre string and returns an array of all variants of that genre (i.e., strings in which that genre is mentioned). Try it on "jazz". How many variants of jazz can you find in this data set?
Write a function called summarize_artist_performance(name) that takes an artist's name and prints the following values: number of tracks, number of solo tracks, number of collaborative tracks, average popularity of total/solo/collaborative tracks, and number of people with whom the artist has collaborated. Try it on "Michael Jackson". Are his average total/solo/collaborative track popularities very different?
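One possible shape for the loading step and the first four tasks is sketched below. It is a hedged outline under stated assumptions, not a definitive solution: the column names "name", "followers", and "popularity" are guesses (the assignment only names "id", "id_artists", and "genres"), the rows are made up, and the functions read the module-level artists dataframe so that their signatures match the ones requested.

```python
import io
import pandas as pd

# Made-up TSV text standing in for the real files.
ARTISTS_TSV = (
    "id\tname\tfollowers\tgenres\n"
    "a1\tAnn\t100\tjazz pop, soul jazz\n"
    "a2\tBob\t300\trock\n"
    "a3\tCy\t50\tcool jazz\n"
)
TRACKS_TSV = (
    "id\tname\tpopularity\tid_artists\n"
    "t1\tSong A\t40\ta1\n"
    "t2\tSong B\t60\ta1, a2\n"
    "t3\tSong C\t20\ta3\n"
)

# With the real data, pass "artists.tsv" / "tracks.tsv" instead of StringIO.
artists = pd.read_csv(io.StringIO(ARTISTS_TSV), sep="\t")
tracks = pd.read_csv(io.StringIO(TRACKS_TSV), sep="\t")

# Artist with the most followers: idxmax returns the row label of the maximum.
top = artists.loc[artists["followers"].idxmax()]
top_name, top_genres = top["name"], top["genres"]

# Most productive artist: one row per (track, artist), then count per artist ID.
credits = tracks.assign(
    artist_id=tracks["id_artists"].str.split(", ")
).explode("artist_id")
most_productive_id = credits["artist_id"].value_counts().idxmax()
# Assumes the ID is present in artists; real data may need a guard here.
most_productive = artists.loc[artists["id"] == most_productive_id, "name"].iloc[0]

def summarize_genres(genres):
    """Return one row per genre with the artist count and mean followers."""
    rows = []
    for genre in genres:
        mask = artists["genres"].fillna("").str.contains(genre, regex=False)
        rows.append({
            "genre": genre,
            "total N": int(mask.sum()),
            "Av. followers": artists.loc[mask, "followers"].mean(),
        })
    return pd.DataFrame(rows, columns=["genre", "total N", "Av. followers"])

def get_genre_variants(genre):
    """Return an array of the distinct genre values that mention `genre`."""
    values = artists["genres"].dropna().str.split(", ").explode()
    return values[values.str.contains(genre, regex=False)].unique()
```

Calling summarize_genres(["jazz", "rock"]) on the toy rows yields one row per input genre, and len(get_genre_variants("jazz")) gives the number of variants found.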
Record a three-minute video in which you run the code. Then, present your code. Specifically, answer the following questions:
How did you identify the artists with the maximum number of followers or the maximum number of tracks?
How did you detect all artists in a given genre in the summarize_genres function? Show the output of this function for the input list ["pop", "hip hop", "rock", "metal", "jazz", "blues", "country", "folklore"].
Explain how you classify a track as either a solo or a collaborative performance.
Explain how you identified all distinct collaborators of an artist.
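For the last two questions, one reasonable approach is sketched here; the assignment does not prescribe it. A track is treated as solo when its "id_artists" value holds exactly one ID and as collaborative otherwise, and the distinct collaborators are the set of all other IDs appearing on the artist's tracks. The column names "name" and "popularity", the toy rows, the choice of the first matching artist when names repeat, and the returned dictionary (added only to make the result easy to check) are all assumptions.

```python
import pandas as pd

# Made-up rows standing in for artists.tsv and tracks.tsv.
artists = pd.DataFrame({"id": ["a1", "a2", "a3"], "name": ["Ann", "Bob", "Cy"]})
tracks = pd.DataFrame({
    "name": ["Song A", "Song B", "Song C", "Song D"],
    "popularity": [40, 60, 20, 80],
    "id_artists": ["a1", "a1, a2", "a3", "a1, a2, a3"],
})

def summarize_artist_performance(name):
    """Print (and return) track counts, mean popularities, and collaborators."""
    # Assumption: if several artists share the name, take the first match.
    artist_id = artists.loc[artists["name"] == name, "id"].iloc[0]

    # Lists of artist IDs per track; keep tracks that credit this artist.
    id_lists = tracks["id_artists"].fillna("").str.split(", ")
    mine = tracks[id_lists.map(lambda ids: artist_id in ids)]
    my_ids = id_lists[mine.index]

    # Solo: exactly one credited ID. Collaborative: more than one.
    n_ids = my_ids.map(len)
    solo, collab = mine[n_ids == 1], mine[n_ids > 1]

    # Distinct collaborators: every other ID on this artist's tracks.
    collaborators = set(my_ids.explode()) - {artist_id}

    summary = {
        "tracks": len(mine),
        "solo tracks": len(solo),
        "collaborative tracks": len(collab),
        "avg popularity (total)": mine["popularity"].mean(),
        "avg popularity (solo)": solo["popularity"].mean(),
        "avg popularity (collaborative)": collab["popularity"].mean(),
        "collaborators": len(collaborators),
    }
    for label, value in summary.items():
        print(f"{label}: {value}")
    return summary

ann_summary = summarize_artist_performance("Ann")
```

Using a set for the collaborators means that an artist who appears on several shared tracks is counted only once.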
Submission: Submit a single .py file containing all the code to D2L. Do not zip or archive the file. Your code must include comments at the top with your name, the date, the video link, and the honor statement, "I have not given or received any unauthorized assistance on this assignment." Each function must include a docstring and be commented appropriately.