Question

I need help cleaning a dataset. Please provide the code.

The dataset can be downloaded from https://www.kaggle.com/tmdb/tmdb-movie-metadata/data

What I have done so far is below. Don't mind the imports; I will use the rest once I have a clean set.

from datetime import timedelta, date
import datetime
import numpy as np
import pandas as pd
import string
import re
import csv
import requests

# data from https://www.kaggle.com/tmdb/tmdb-movie-metadata/data
df_movies = pd.read_csv('tmdb_5000_movies.csv', delimiter=',', header=0, skipinitialspace=True)

# Drop the unneeded columns in one call. Each name appears once: dropping 'id' a second time raises a KeyError.
df_movies.drop(columns=['homepage', 'popularity', 'overview', 'status', 'tagline', 'vote_average', 'vote_count', 'id'], inplace=True)

df_movies.head()

I want the 'genres' column to contain only the genre names (Action, Adventure, and so on). The same goes for 'production_company', 'production_country' and 'spoken_language'.
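One possible way to do this, as a minimal sketch: it assumes each of these columns holds a JSON string that is a list of objects with a "name" field, for example [{"id": 28, "name": "Action"}], and that the CSV headers are the plural forms `genres`, `production_companies`, `production_countries` and `spoken_languages`. Both points are assumptions to check against the actual file. The helper `extract_names` and the small `sample` frame are hypothetical names used only for illustration.

```python
import json

import pandas as pd


def extract_names(cell, key="name"):
    """Return the `key` values of a JSON list-of-objects string, comma-separated.

    Missing, empty, or unparseable cells become an empty string.
    """
    if not isinstance(cell, str) or not cell.strip():
        return ""
    try:
        items = json.loads(cell)
    except json.JSONDecodeError:
        return ""
    if not isinstance(items, list):
        return ""
    return ", ".join(
        str(item[key]) for item in items if isinstance(item, dict) and key in item
    )


# Tiny stand-in frame; the real data would come from the read_csv call above.
sample = pd.DataFrame({
    "genres": [
        '[{"id": 28, "name": "Action"}, {"id": 12, "name": "Adventure"}]',
        "[]",
    ],
    "spoken_languages": ['[{"iso_639_1": "en", "name": "English"}]', None],
})

# Assumed column names; extend the list with the production columns as needed.
for col in ["genres", "spoken_languages"]:
    sample[col] = sample[col].apply(extract_names)
```

On the real frame the same loop would run over `df_movies` instead of `sample`. Joining the names into one string keeps the column easy to read; keeping a Python list per cell would be an equally reasonable choice if the values need to be counted or exploded later.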

Then I need you to remove all rows where 'spoken_language' is not English or en, create a separate column titled 'release_year' with just the year of the movie's release, and sort by 'release_year' and then 'revenue'.
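A hedged sketch of these three steps follows. It works on the raw JSON text of the language column as loaded from the CSV, so it would run before that column is flattened to plain names. Several points are assumptions: the column is named `spoken_languages`, each entry carries an `iso_639_1` code, a row counts as English if "en" is among its languages (a stricter English-only rule is also a valid reading), and the sort is ascending. `speaks_english`, `clean_movies` and `sample` are hypothetical names.

```python
import json

import pandas as pd


def speaks_english(cell):
    """True if the JSON list in `cell` has an entry whose iso_639_1 code is 'en'."""
    try:
        items = json.loads(cell)
    except (TypeError, json.JSONDecodeError):
        return False  # missing or malformed cell
    if not isinstance(items, list):
        return False
    return any(isinstance(i, dict) and i.get("iso_639_1") == "en" for i in items)


def clean_movies(df):
    """Keep English-language rows, add release_year, sort by year then revenue."""
    out = df[df["spoken_languages"].apply(speaks_english)].copy()
    # Unparseable dates become NaT, which turns into <NA> in the nullable Int64 column.
    dates = pd.to_datetime(out["release_date"], errors="coerce")
    out["release_year"] = dates.dt.year.astype("Int64")
    return out.sort_values(["release_year", "revenue"]).reset_index(drop=True)


# Tiny stand-in frame; the real data would be df_movies.
sample = pd.DataFrame({
    "title": ["A", "B", "C", "D"],
    "spoken_languages": [
        '[{"iso_639_1": "en", "name": "English"}]',
        '[{"iso_639_1": "fr", "name": "French"}]',
        '[{"iso_639_1": "en", "name": "English"}]',
        '[{"iso_639_1": "en", "name": "English"}, {"iso_639_1": "es", "name": "Spanish"}]',
    ],
    "release_date": ["2009-12-10", "2001-04-25", "1999-03-30", "2009-05-13"],
    "revenue": [500, 100, 300, 200],
})

cleaned = clean_movies(sample)
```

For newest-first or highest-revenue-first ordering, `sort_values` accepts `ascending=[False, False]`.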

Thanks!
