Question
A web scraping program has been developed to extract the data and save it in TXT format in a text file. Here are the first five lines of the file that contains the songs shown in the screenshot:
"November : PM"After A Few","Travis Denning"
"November : PM"Better Life","Keith Urban"
"November : PM"Mercy","Brett Young"
"November : PM"What She Wants Tonight","Luke Bryan"
"November : PM"Prayed For You","Matt Stell"
The files that you are given for the assignment may contain a different list of songs. If you are given multiple files, combine them into one file. There may be duplicate records.
Your program will process the data file to answer the following questions:
Breaks for commercials and other content. After continuous play of a few songs, the station takes a break for commercials or other content, e.g. announcements, interviews, or taking calls from listeners. Your task is to identify times at which a song is followed by a break. This can be detected by examining the time pattern, e.g. a song appearing to be longer than minutes most likely includes time for a break.
Top songs. Songs on heavy rotation will be aired multiple times on a given day. Your task is to identify these songs.
Top artists. On a given day, most artists have one song aired during the day (even though the song may be aired several times); some artists may have more than one song aired.
Your task is to use two metrics to identify top artists: those who have the most distinct songs (count of distinct songs), and those who have the most airplays (all songs combined).
Solution Approach
Here is one approach (there are other approaches).
To identify commercial breaks:
Order the data by time
Calculate the time between two songs to obtain nominal time of all but the last song
If a song's nominal time is longer than minutes, a break follows the song
We choose to use minutes because most songs on radio are around minutes.
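The break-detection steps above can be sketched in Python as follows. This is only a minimal sketch under stated assumptions: the records are assumed to be already parsed into (air_time, title, artist) tuples with air_time as a datetime, the function name find_breaks is hypothetical, and the threshold is left as a parameter because the exact number of minutes is not recoverable from the text.

```python
from datetime import timedelta


def find_breaks(plays, threshold_minutes):
    """Return (air_time, title, nominal_time) for songs followed by a break.

    Assumption: `plays` is an iterable of (air_time, title, artist) tuples,
    where air_time is a datetime. Parsing the raw file is not shown here.
    """
    # Drop exact duplicate records, then order the data by time.
    ordered = sorted(set(plays), key=lambda play: play[0])
    limit = timedelta(minutes=threshold_minutes)

    breaks = []
    # Pair each song with the next one; the last song has no nominal time.
    for current, following in zip(ordered, ordered[1:]):
        nominal = following[0] - current[0]
        if nominal > limit:
            breaks.append((current[0], current[1], nominal))
    return breaks
```

Calling find_breaks(plays, threshold_minutes=n) with the chosen threshold n returns one entry per song that is followed by a break, in time order.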
To find the top songs, we need to count how many times each song gets aired. This is a simple task conceptually: for each distinct song, just count its occurrences in the dataset. In practice, we need a way of keeping track of records, something that guarantees uniqueness for each song and can scale up from only a few songs to millions of songs. Then sort by count to find the top songs.
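One structure that fits this description is a hash map keyed by song, which Python provides as collections.Counter. The sketch below is an illustration, not the assignment's required method: it assumes records are already parsed into (air_time, title, artist) tuples, treats a song as the (title, artist) pair, and uses a hypothetical function name top_songs.

```python
from collections import Counter


def top_songs(plays, n=10):
    """Return the n most-aired songs as ((title, artist), count) pairs.

    Assumption: `plays` is an iterable of (air_time, title, artist) tuples.
    """
    # set() removes exact duplicate records (same time, title, and artist).
    unique_plays = set(plays)
    # The (title, artist) key guarantees one counter entry per distinct song.
    counts = Counter((title, artist) for _, title, artist in unique_plays)
    # most_common sorts by count, highest first.
    return counts.most_common(n)
```

Keying on (title, artist) rather than the title alone keeps two different songs that share a title from being merged.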
To find top artists, we need to keep track of the airplays for each artist: which song, and how many times the song is aired. Then we can count how many distinct songs each artist has, and we can also count how many total airplays each artist has.
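Both artist metrics can be derived from one nested mapping of artist to per-song counts. The following is a minimal sketch, assuming records are already parsed into (air_time, title, artist) tuples; the names artist_stats and top_by are hypothetical helpers introduced for illustration.

```python
from collections import Counter, defaultdict


def artist_stats(plays):
    """Return (distinct_songs, total_airplays) dicts keyed by artist.

    Assumption: `plays` is an iterable of (air_time, title, artist) tuples.
    """
    songs_by_artist = defaultdict(Counter)
    # set() removes exact duplicate records before counting.
    for _, title, artist in set(plays):
        songs_by_artist[artist][title] += 1

    # Metric 1: number of distinct songs per artist.
    distinct = {a: len(songs) for a, songs in songs_by_artist.items()}
    # Metric 2: total airplays per artist, all songs combined.
    airplays = {a: sum(songs.values()) for a, songs in songs_by_artist.items()}
    return distinct, airplays


def top_by(metric, n=5):
    """Return the top n (artist, value) pairs, ties broken by artist name."""
    return sorted(metric.items(), key=lambda item: (-item[1], item[0]))[:n]
```

The two metrics can rank artists differently: an artist with a single song on heavy rotation may lead in airplays while another leads in distinct songs.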