Question

Posted on Sep 21, 2024

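from csv import reader

### The Google Play data set ###
# an explicit encoding avoids decode errors on platforms whose default is not UTF-8
opened_file = open('googleplaystore.csv', encoding='utf8')
read_file = reader(opened_file)
android = list(read_file)
android_header = android[0]  # first row holds the column names
android = android[1:]        # remaining rows hold the apps

### The App Store data set ###
opened_file = open('AppleStore.csv', encoding='utf8')
read_file = reader(opened_file)
ios = list(read_file)
ios_header = ios[0]
ios = ios[1:]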
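The two loading blocks above repeat the same steps, so they could be folded into one helper. The following is only a sketch: the name `load_csv` is hypothetical and not part of the original code, and it assumes both files are UTF-8 encoded.

```python
from csv import reader


def load_csv(path, encoding='utf8'):
    """Return (header, rows) for the CSV file at `path`.

    `load_csv` is a hypothetical helper, not part of the original code.
    """
    # The with-statement closes the file even if reading raises an error.
    # newline='' is what the csv module documentation recommends for files
    # passed to csv.reader.
    with open(path, encoding=encoding, newline='') as opened_file:
        data = list(reader(opened_file))
    # First row is the header, the rest are data rows.
    return data[0], data[1:]
```

With such a helper, each data set could be loaded in one line, for example `android_header, android = load_csv('googleplaystore.csv')`.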

To make it easier to explore the two data sets, we'll first write a function named explore_data() that we can use repeatedly to explore rows in a more readable way. We'll also add an option for our function to show the number of rows and columns for any data set.

def explore_data(dataset, start, end, rows_and_columns=False):
    dataset_slice = dataset[start:end]
    for row in dataset_slice:
        print(row)
        print()  # adds a new (empty) line between rows

    # optionally report the size of the whole data set, not just the slice
    if rows_and_columns:
        print('Number of rows:', len(dataset))
        print('Number of columns:', len(dataset[0]))

print(android_header)
print()
explore_data(android, 0, 3, True)

['App', 'Category', 'Rating', 'Reviews', 'Size', 'Installs', 'Type', 'Price', 'Content Rating', 'Genres', 'Last Updated', 'Current Ver', 'Android Ver']
['Photo Editor & Candy Camera & Grid & ScrapBook', 'ART_AND_DESIGN', '4.1', '159', '19M', '10,000+', 'Free', '0', 'Everyone', 'Art & Design', 'January 7, 2018', '1.0.0', '4.0.3 and up']
['Coloring book moana', 'ART_AND_DESIGN', '3.9', '967', '14M', '500,000+', 'Free', '0', 'Everyone', 'Art & Design;Pretend Play', 'January 15, 2018', '2.0.0', '4.0.3 and up']

Number of rows: 10841
Number of columns: 13

print(ios_header)
print()
explore_data(ios, 0, 3, True)

['id', 'track_name', 'size_bytes', 'currency', 'price', 'rating_count_tot', 'rating_count_ver', 'user_rating', 'user_rating_ver', 'ver', 'cont_rating', 'prime_genre', 'sup_devices.num', 'ipadSc_urls.num', 'lang.num', 'vpp_lic']
['284882215', 'Facebook', '389879808', 'USD', '0.0', '2974676', '212', '3.5', '3.5', '95.0', '4+', 'Social Networking', '37', '1', '29', '1']

Number of rows: 7197
Number of columns: 16

Deleting Wrong Data

print(android[10472])  # incorrect row
print()
print(android_header)  # header
print()
print(android[0])      # correct row

['Life Made WI-Fi Touchscreen Photo Frame', '1.9', '19', '3.0M', '1,000+', 'Free', '0', 'Everyone', '', 'February 11, 2018', '1.0.19', '4.0 and up']
['App', 'Category', 'Rating', 'Reviews', 'Size', 'Installs', 'Type', 'Price', 'Content Rating', 'Genres', 'Last Updated', 'Current Ver', 'Android Ver']

print(len(android))
del android[10472]  # don't run this more than once
print(len(android))

10841
10840
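The incorrect row printed above has 12 values, while the header has 13, because the category value is missing. Instead of relying on a hard-coded index, rows like this could be located by comparing each row's length with the header's. The sketch below uses a hypothetical function name, `find_malformed_rows`, and tiny made-up stand-in data rather than the real file.

```python
def find_malformed_rows(rows, header):
    """Return the indexes of rows whose length differs from the header's.

    Hypothetical helper, not part of the original code.
    """
    expected = len(header)
    return [i for i, row in enumerate(rows) if len(row) != expected]


# Made-up stand-in data; the real call would be
# find_malformed_rows(android, android_header).
sample_header = ['App', 'Category', 'Rating']
sample_rows = [
    ['App A', 'TOOLS', '4.1'],
    ['App B', '1.9'],           # category missing, so the row is one value short
    ['App C', 'GAME', '3.9'],
]
print(find_malformed_rows(sample_rows, sample_header))  # prints [1]
```

If more than one index is returned, deleting from the highest index down keeps the remaining indexes valid.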

for app in android:
    name = app[0]
    if name == 'Instagram':
        print(app)

['Instagram', 'SOCIAL', '4.5', '66577313', 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device']
['Instagram', 'SOCIAL', '4.5', '66577446', 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device']
['Instagram', 'SOCIAL', '4.5', '66577313', 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device']
['Instagram', 'SOCIAL', '4.5', '66509917', 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device']

In total, there are 1,181 cases where an app occurs more than once:

duplicate_apps = []
unique_apps = []

for app in android:
    name = app[0]
    # a name seen before is a duplicate; otherwise record it as seen
    if name in unique_apps:
        duplicate_apps.append(name)
    else:
        unique_apps.append(name)

print('Number of duplicate apps:', len(duplicate_apps))
print()
print('Examples of duplicate apps:', duplicate_apps[:15])

Number of duplicate apps: 1181
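The duplicate count above checks membership in a list, which rescans the list for every row. A set gives the same count with constant-time lookups on average. This is a sketch of that alternative, not the original approach; `find_duplicate_names` is a hypothetical name and the sample rows are made up.

```python
def find_duplicate_names(rows, name_index=0):
    """Return every repeated occurrence of a name, in order of appearance.

    Hypothetical helper, not part of the original code.
    """
    seen = set()        # names encountered so far
    duplicates = []     # one entry per extra occurrence of a name
    for row in rows:
        name = row[name_index]
        if name in seen:
            duplicates.append(name)
        else:
            seen.add(name)
    return duplicates


# Made-up stand-in data; the real call would be find_duplicate_names(android).
sample = [['App A'], ['App B'], ['App A'], ['App A'], ['App C']]
print(find_duplicate_names(sample))  # prints ['App A', 'App A']
```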

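The duplicate Instagram rows differ mainly in their number of reviews, so for each app we can keep the entry with the highest review count. Let's start by building a dictionary that maps each unique app name to its highest number of reviews.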

reviews_max = {}

for app in android:
    name = app[0]
    n_reviews = float(app[3])

    # keep the highest review count seen so far for each app name
    if name in reviews_max and reviews_max[name] < n_reviews:
        reviews_max[name] = n_reviews
    elif name not in reviews_max:
        reviews_max[name] = n_reviews

In a previous code cell, we found that there are 1,181 cases where an app occurs more than once, so the length of our dictionary (of unique apps) should be equal to the difference between the length of our data set and 1,181.

print('Expected length:', len(android) - 1181)
print('Actual length:', len(reviews_max))

Expected length: 9659
Actual length: 9659
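The posted code stops once the dictionary is built. One possible next step, sketched here as an assumption rather than as part of the original, is to use the dictionary to keep a single row per app. The function name `remove_duplicates`, its parameters, and the sample rows are all hypothetical.

```python
def remove_duplicates(rows, reviews_max, name_index=0, reviews_index=3):
    """Keep one row per app: the first row whose review count equals the maximum.

    Hypothetical helper, not part of the original code.
    """
    clean = []
    already_added = set()
    for row in rows:
        name = row[name_index]
        n_reviews = float(row[reviews_index])
        # The second condition guards against two rows that share the
        # same maximum review count.
        if reviews_max[name] == n_reviews and name not in already_added:
            clean.append(row)
            already_added.add(name)
    return clean


# Made-up stand-in data with the review count in column 3, as in the Android rows.
sample = [
    ['App A', 'X', '4.5', '100'],
    ['App A', 'X', '4.5', '250'],
    ['App A', 'X', '4.5', '250'],
    ['App B', 'Y', '4.0', '30'],
]
sample_max = {'App A': 250.0, 'App B': 30.0}
print(len(remove_duplicates(sample, sample_max)))  # prints 2
```

On the real data, the result of `remove_duplicates(android, reviews_max)` would be expected to have the same length as `reviews_max`.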

Write comments on the code.
