
Question


Can somebody help me solve this using pandas, numpy, and matplotlib.pyplot?


here is the dataset

In [4]:
    ## Load the required modules
    import pandas as pd
    import numpy as np
    import matplotlib.pyplot as plt

The dataset for this homework is based on https://www.kaggle.com/rush4ratio/video-game-sales-with-ratings. Please read the Kaggle page for the complete description of the dataset. We've replaced the column name "Platform" with "Console" to avoid a conflict due to dummy variable generation (see 3.c).

We start by loading the dataset with pandas.

In [5]:
    ## No need for modification, just run this cell
    df = pd.read_csv("HW1_dataset.csv")
    df.head(5)

Out[5]:
       Name                      Console  Year_of_Release  Genre         Publisher  NA_Sales  EU_Sales  JP_Sales  Other
    0  Wii Sports                Wii      2006.0           Sports        Nintendo   41.36     28.96     3.77
    1  Super Mario Bros.         NES      1985.0           Platform      Nintendo   29.08     3.58      6.81
    2  Mario Kart Wii            Wii      2008.0           Racing        Nintendo   15.68     12.76     3.79
    3  Wii Sports Resort         Wii      2009.0           Sports        Nintendo   15.61     10.93     3.28
    4  Pokemon Red/Pokemon Blue  GB       1996.0           Role-Playing  Nintendo   11.27     8.89      10.22

1.a Data Type Conversion

First, we examine the data type of each column. Are there any columns with unexpected data types? (You don't need to write any answers.)

In [6]:
    ## No need for modification, just run this cell
    df.dtypes

Out[6]:
    Name                object
    Console             object
    Year_of_Release    float64
    Genre               object
    Publisher           object
    NA_Sales           float64
    EU_Sales           float64
    JP_Sales           float64
    Other_Sales        float64
    Global_Sales       float64
    Critic_Score       float64
    Critic_Count       float64
    User_Score          object
    User_Count         float64
    Developer           object
    Rating              object
    dtype: object

1.a.1) According to the content, "User_Score" would be more useful if represented as "float64", but its type is "object" instead. Try to manually fix this by coercing the data type of the "User_Score" column to float64. If there are any entries that cannot be converted to numerics (e.g. you might have noticed some entries being "tbd", which is why pandas could not convert them into numerics), interpret them as missing values.

Hint: Select the right option for the "errors" argument when converting the values to numeric. Remember to store the converted column back to df["User_Score"].

In [12]:
    ## Your code here

In [ ]:
    grader.check("q1a1")

1.b Missing Values and Duplicated Entries

1.b.1) How many missing values are there for each column? Store the answer in a pandas Series (name the variable s_nacount) that is indexed by the column names of the dataset, with the corresponding values holding the NaN counts.

In [7]:
    ## Your code here
    s_nacount = ...

In [ ]:
    grader.check("q1b1")

1.b.2) Drop all the rows from df that contain missing values. After the operation, df should contain the same columns as before but have no NaN in any entry.

In [9]:
    ## Your code here

In [ ]:
    grader.check("q1b2")

1.b.3) Find out how many duplicated entries the dataset contains. Store the number of duplicated entries in the variable num_duplicated. If there are any duplicated entries, drop them from the dataframe df.

In [12]:
    ## Your code here
    print(num_duplicated)

In [ ]:
    grader.check("q1b3")
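One possible way to approach 1.a.1 through 1.b.3 is sketched below. It is not a verified solution against the grader. Because HW1_dataset.csv is not included here, the sketch builds a small made-up DataFrame (the game names and scores are hypothetical) so that it runs on its own; with the real file, df would come from pd.read_csv("HW1_dataset.csv") as in the notebook. It assumes the cells run in notebook order, so duplicates are counted after the rows with missing values have been dropped, and that "duplicated entries" means rows identical in every column.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for HW1_dataset.csv so the sketch is self-contained.
# With the real data: df = pd.read_csv("HW1_dataset.csv")
df = pd.DataFrame({
    "Name": ["Game A", "Game B", "Game B", "Game C", "Game D"],
    "Console": ["Wii", "NES", "NES", "GB", "Wii"],
    "Critic_Score": [76.0, 80.0, 80.0, np.nan, 91.0],
    "User_Score": ["8", "7.5", "7.5", "6.1", "tbd"],
})

# 1.a.1: convert User_Score to float64; entries that cannot be parsed
# (such as "tbd") become NaN because of errors="coerce".
df["User_Score"] = pd.to_numeric(df["User_Score"], errors="coerce")

# 1.b.1: number of missing values per column, indexed by column name.
s_nacount = df.isna().sum()

# 1.b.2: drop every row that has at least one missing value.
df = df.dropna()

# 1.b.3: count rows that repeat an earlier row in every column,
# then drop them, keeping the first occurrence of each.
num_duplicated = int(df.duplicated().sum())
df = df.drop_duplicates()
print(num_duplicated)
```

pd.to_numeric also accepts errors="raise" (the default), which stops on the first unparseable value, so "coerce" is the option that matches the hint about treating such entries as missing. Whether the grader expects the index to be reset after dropping rows is not stated in the question; if it does, df.reset_index(drop=True) would be the usual follow-up.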

