Question

The question is too long to post in full, but it is a data science question using R. Please review the attached data set and the instructions at the link below, which goes to a Google Drive folder.

https://drive.google.com/drive/folders/1sGO4Lu_ZopouO9_h7W0HA5631yZoAqg1?usp=sharing

The instructions are as follows and are also available at the link.

AS#3 Movie Data Analysis

Please submit the R command and its result (or a screenshot) for each step in Word or PDF format.

In this homework assignment, you'll be working with the movies data frame. It contains hundreds of movies with four different genres, release year, rating, and votes.

1. Please download and unzip AS#2 Movies.zip from D2L Dropbox.

2. Load the movies data in RStudio.

3. Display the structure of movies. How many movies? How many variables (columns)? Hint: Use the str() function to show the structure of movies.
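
A minimal sketch of step 3 follows. The real data file is not shown in the question, so the small `movies` data frame here is a hypothetical stand-in with invented values; only the column names `genre`, `rating`, `votes`, and `runtime` come from the instructions.

```r
# Hypothetical stand-in for the assignment's movies data frame (values invented).
movies <- data.frame(
  genre   = factor(c("Action", "Comedy", "Drama", "Animation", "Comedy", "Drama")),
  rating  = c(6.5, 7.1, 8.0, 7.4, 5.9, 8.3),
  votes   = c(1200, 530, 8800, 2300, 410, 9900),
  runtime = c(110, 95, 132, 88, 101, 142)
)

# str() prints the number of observations (movies) and variables (columns),
# plus the type and first values of each column.
str(movies)

# The same two counts can be read off directly.
n_movies    <- nrow(movies)
n_variables <- ncol(movies)
```

On the real data set, the first line printed by str() reports both numbers in the form "N obs. of M variables".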

4. Plot the genre column of movies in blue. The title of the plot is "Movie by Genre". Hint: table(movies$genre) makes the column numerical, since it counts the number of movies in each category.
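
One possible approach to step 4, sketched on a hypothetical stand-in for `movies` (invented values). It assumes the counts from table() are drawn as a bar chart with barplot(); if genre is stored as a factor, plot(movies$genre, ...) should give a similar chart.

```r
# Hypothetical stand-in for the assignment's movies data frame (values invented).
movies <- data.frame(
  genre  = factor(c("Action", "Comedy", "Drama", "Animation", "Comedy", "Drama")),
  rating = c(6.5, 7.1, 8.0, 7.4, 5.9, 8.3)
)

# Count the movies in each genre, as the hint suggests.
genre_counts <- table(movies$genre)

# Draw the counts as blue bars with the requested title.
barplot(genre_counts, col = "blue", main = "Movie by Genre")
```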

5. Plot the genre column (horizontal axis) against the rating variable (vertical axis) with color of green. The title of the plot is "Movie by Genre" and the label of the Y-axis is "Rating".
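
A sketch of step 5 on a hypothetical stand-in for `movies` (invented values). It assumes genre is a factor; in that case plot() with a factor on the x-axis and a numeric variable on the y-axis draws one box plot per category.

```r
# Hypothetical stand-in for the assignment's movies data frame (values invented).
movies <- data.frame(
  genre  = factor(c("Action", "Comedy", "Drama", "Animation", "Comedy", "Drama")),
  rating = c(6.5, 7.1, 8.0, 7.4, 5.9, 8.3)
)

# If genre was read in as character, convert it first:
# movies$genre <- factor(movies$genre)

# Factor on x, numeric on y: plot() produces box plots of rating per genre.
plot(movies$genre, movies$rating,
     col  = "green",
     main = "Movie by Genre",
     ylab = "Rating")
```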

6. Plot the runtime variable of movies with color of red. The title is "Movie by Runtime". Can you tell what's on the horizontal axis and what is on the vertical one?
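
A sketch of step 6 on a hypothetical stand-in for `movies` (invented values). When plot() receives a single numeric vector, it draws each value against its position in the vector, so the horizontal axis is the index (row position) and the vertical axis is the runtime.

```r
# Hypothetical stand-in for the assignment's movies data frame (values invented).
movies <- data.frame(
  runtime = c(110, 95, 132, 88, 101, 142)
)

# Single numeric vector: x-axis is the index 1..n, y-axis is the runtime value.
plot(movies$runtime, col = "red", main = "Movie by Runtime")
```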

7. Using plot(), create a graph that shows the rating against runtime with color purple. Rating should be on the horizontal x-axis, and runtime on the vertical y-axis. The title of the plot is "Runtime & Rating" with color red. Use symbol of triangle. Is there a correlation between the two variables?
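
A sketch of step 7 on a hypothetical stand-in for `movies` (invented values). The cor() call is an addition not requested by the instructions; it is one way to back up a visual judgement about correlation with a number. The toy values say nothing about the real data set.

```r
# Hypothetical stand-in for the assignment's movies data frame (values invented).
movies <- data.frame(
  rating  = c(6.5, 7.1, 8.0, 7.4, 5.9, 8.3),
  runtime = c(110, 95, 132, 88, 101, 142)
)

# Rating on x, runtime on y; pch = 2 is an open triangle,
# col sets the point color and col.main the title color.
plot(movies$rating, movies$runtime,
     col      = "purple",
     pch      = 2,
     main     = "Runtime & Rating",
     col.main = "red")

# Pearson correlation coefficient as a numeric check (always between -1 and 1).
r <- cor(movies$rating, movies$runtime)
```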

8. Create a plot that has the following properties:

o It plots the variables votes (x-axis) against runtime (y-axis).

o The title of the plot is "Votes versus Runtime" with color blue.

o The x-axis and y-axis are labeled "Number of votes [-]" and "Runtime [s]" respectively.

o The subtitle of the plot is "No clear correlation" with color red.

o Use symbol of square with color of green.
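
The properties listed for step 8 map onto arguments of plot() roughly as follows. This is a sketch on a hypothetical stand-in for `movies` (invented values); pch = 0 gives an open square, and pch = 15 would give a filled one.

```r
# Hypothetical stand-in for the assignment's movies data frame (values invented).
movies <- data.frame(
  votes   = c(1200, 530, 8800, 2300, 410, 9900),
  runtime = c(110, 95, 132, 88, 101, 142)
)

plot(movies$votes, movies$runtime,
     main     = "Votes versus Runtime",  # title
     col.main = "blue",                  # title color
     xlab     = "Number of votes [-]",   # x-axis label
     ylab     = "Runtime [s]",           # y-axis label
     sub      = "No clear correlation",  # subtitle
     col.sub  = "red",                   # subtitle color
     pch      = 0,                       # open square symbol
     col      = "green")                 # symbol color
```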

9. List all the graphical parameters that are currently active in your session, by running par().

Next, use par() to set the mfrow parameter: R should plot figures on a 2-by-1 grid (2 rows, 1 column). Build two plots:

o A scatterplot that plots the votes (x-axis) against the rating (y-axis) variable of movies with color blue. The title of the plot is "Votes & Rating" in red color.

o A histogram of the votes variable in green color. The title of the plot is "Histogram of Movies & Votes" in red color.
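
A sketch of step 9 on a hypothetical stand-in for `movies` (invented values). Saving the value returned by par() in `old_par` is an optional extra that makes it easy to restore the previous layout later.

```r
# Hypothetical stand-in for the assignment's movies data frame (values invented).
movies <- data.frame(
  rating = c(6.5, 7.1, 8.0, 7.4, 5.9, 8.3),
  votes  = c(1200, 530, 8800, 2300, 410, 9900)
)

# par() with no arguments returns every graphical parameter currently active.
all_pars <- par()

# Switch to a 2-by-1 grid; the previous setting is returned and kept in old_par.
old_par <- par(mfrow = c(2, 1))

# Top panel: scatterplot of votes against rating.
plot(movies$votes, movies$rating,
     col = "blue", main = "Votes & Rating", col.main = "red")

# Bottom panel: histogram of votes.
hist(movies$votes,
     col = "green", main = "Histogram of Movies & Votes", col.main = "red")
```

Calling par(old_par) afterwards would restore the layout that was active before the change.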

10. Create a pie chart to show all four genres as labels. The title is "Pie Chart of Movie Genre". Hint: par(mfrow = c(1, 1)) resets to the default for a single chart display.
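
A sketch of step 10 on a hypothetical stand-in for `movies` (invented values). It assumes the slices should represent the number of movies per genre; pie() then uses the names of the count table as slice labels.

```r
# Hypothetical stand-in for the assignment's movies data frame (values invented).
movies <- data.frame(
  genre = factor(c("Action", "Comedy", "Drama", "Animation", "Comedy", "Drama"))
)

# Reset to a single chart per figure.
par(mfrow = c(1, 1))

# Count movies per genre; the table names become the pie labels.
genre_counts <- table(movies$genre)
pie(genre_counts, main = "Pie Chart of Movie Genre")
```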
