Question

Import the Census Income (Adult) dataset using Pandas. Use the 14 attribute names ("age", "workclass", ..., "native-country") given in the dataset description as the first 14 column names and "salary" as the last column name (5 pt). Treat the string '?', with or without surrounding whitespace (for example ' ?'), as a missing value and replace it with NaN, the default missing-value marker in Pandas (10 pt). Print the first five rows of the DataFrame (5 pt).

Dataset source file: http://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data

Dataset description: http://archive.ics.uci.edu/ml/datasets/census+income

Pay attention to the header and index_col arguments when using pandas.read_csv().
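One possible way to approach the import step is sketched below. It assumes the source file has no header row and separates fields with a comma followed by a space, and that the 14 attribute names are the ones listed in the dataset description; only "age", "workclass", "native-country", and "salary" are named in the question itself. The helper name `load_adult` and the two sample rows are illustrative, not part of the assignment.

```python
import io

import pandas as pd

# Assumed column names: the 14 attributes from the dataset description,
# followed by "salary" as required by the question.
COLUMNS = [
    "age", "workclass", "fnlwgt", "education", "education-num",
    "marital-status", "occupation", "relationship", "race", "sex",
    "capital-gain", "capital-loss", "hours-per-week", "native-country",
    "salary",
]


def load_adult(source):
    """Read adult.data-style text from a URL, a path, or a file-like object."""
    return pd.read_csv(
        source,
        header=None,             # assumption: the file has no header row
        names=COLUMNS,           # supply the column names ourselves
        index_col=False,         # do not turn the first column into the index
        skipinitialspace=True,   # drop the space that follows each comma
        na_values=["?", " ?", "? ", " ? "],  # whitespace variants of "?"
    )


# Two illustrative rows in the assumed layout, so the sketch runs offline.
SAMPLE = (
    "39, State-gov, 77516, Bachelors, 13, Never-married, Adm-clerical, "
    "Not-in-family, White, Male, 2174, 0, 40, United-States, <=50K\n"
    "54, ?, 180211, Some-college, 10, Married-civ-spouse, ?, "
    "Husband, White, Male, 0, 0, 60, ?, >50K\n"
)

df = load_adult(io.StringIO(SAMPLE))
print(df.head())  # first five rows (only two exist in this sample)
```

To run this against the real data, pass the dataset source URL from the question to `load_adult` in place of the in-memory sample.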

a. Print a concise summary of the DataFrame and use it to check whether null values exist in each column (10 pt)

b. Find the rows that contain missing values and print them (10 pt)

c. Drop the rows that contain missing values, then check the concise summary again to see whether any column still has null values (10 pt)
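Parts a to c could be sketched roughly as follows. The small DataFrame here is a hypothetical stand-in for the imported dataset, with a few NaN values placed by hand, so the variable names and values are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the imported DataFrame, with NaN already in place.
df = pd.DataFrame({
    "age": [39, 50, 38, 53],
    "workclass": ["State-gov", np.nan, "Private", "Private"],
    "occupation": ["Adm-clerical", np.nan, "Sales", np.nan],
    "salary": ["<=50K", "<=50K", ">50K", "<=50K"],
})

# a. Concise summary: a "Non-Null Count" below the total number of rows
#    means that column contains missing values.
df.info()
missing_per_column = df.isna().sum()  # explicit per-column count as a cross-check
print(missing_per_column)

# b. Rows that contain at least one missing value.
rows_with_missing = df[df.isna().any(axis=1)]
print(rows_with_missing)

# c. Drop those rows, then inspect the summary again.
cleaned = df.dropna()
cleaned.info()
```

`dropna()` returns a new DataFrame and keeps the original row labels; chaining `reset_index(drop=True)` is one option if a contiguous index is wanted afterwards.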

