Question
The full question is too long to post here, but it is a data science question using R. Please review the data set and instructions at the link below, which goes to a Google Drive folder.
https://drive.google.com/drive/folders/1sGO4Lu_ZopouO9_h7W0HA5631yZoAqg1?usp=sharing
The instructions follow and are also available at the link.
AS#3 Movie Data Analysis
Please submit the R command and result (or a screenshot) for each step in Word or PDF format.
In this homework assignment, you'll be working with the movies data frame. It contains hundreds of movies with four different genres, release year, rating, and votes.
1. Please download and unzip AS#2 Movies.zip from D2L Dropbox.
2. Load the movies data in RStudio.
3. Display the structure of movies. How many movies? How many variables (columns)? Hint: Use the str() function to show the structure of movies.
4. Plot the genre column of movies in blue. The title of the plot is "Movie by Genre". Hint: table(movies$genre) makes it numerical, since it counts the number of movies in each category.
5. Plot the genre column (horizontal axis) against the rating variable (vertical axis) in green. The title of the plot is "Movie by Genre" and the label of the Y-axis is Rating.
6. Plot the runtime variable of movies with color of red. The title is "Movie by Runtime". Can you tell what's on the horizontal axis and what is on the vertical one?
7. Using plot(), create a graph that shows the rating against runtime in purple. Rating should be on the horizontal x-axis, and runtime on the vertical y-axis. The title of the plot is "Runtime & Rating" in red. Use a triangle as the plotting symbol. Is there a correlation between the two variables?
8. Create a plot that has the following properties:
It plots the variables votes (x-axis) against runtime (y-axis).
The title of the plot is "Votes versus Runtime" with color blue.
The x-axis and y-axis are labeled "Number of votes [-]" and "Runtime [s]" respectively.
The subtitle of the plot is "No clear correlation" with color red.
Use a square as the plotting symbol, in green.
9. List all the graphical parameters that are currently active in your session, by running par().
Next, use par() to set the mfrow parameter: R should plot figures on a 2-by-1 grid (2 rows, 1 column). Build two plots:
o A scatterplot that plots the votes (x-axis) against the rating (y-axis) variable of movies with color blue. The title of the plot is "Votes & Rating" in red color.
o A histogram of the votes variable in green color. The title of the plot is "Histogram of Movies & Votes" in red color.
10. Create a pie chart to show all four genres as labels. Title is "Pie Chart of Movie Genre".
> par(mfrow = c(1, 1))  # Reset to the default for a single chart display
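For steps 3 to 5, a minimal sketch might look like the following. It assumes the loaded data frame is called movies and has genre and rating columns, as the instructions suggest. The small data frame built here is a made-up stand-in so the code runs on its own; its genre names and values are invented and are not the course data.

```r
# Hypothetical stand-in for the assignment data; the real movies data frame
# comes from the course zip file and will have many more rows.
movies <- data.frame(
  genre   = c("Action", "Comedy", "Drama", "Animation",
              "Action", "Comedy", "Drama", "Action"),
  rating  = c(6.5, 7.1, 8.0, 7.4, 5.9, 6.8, 8.3, 7.0),
  runtime = c(110, 95, 130, 88, 120, 100, 142, 105),
  votes   = c(12000, 8500, 40000, 15000, 5000, 9000, 52000, 11000)
)

# Step 3: show the structure, then count movies (rows) and variables (columns)
str(movies)
n_movies <- nrow(movies)
n_vars   <- ncol(movies)

# Step 4: count the movies in each genre, then draw the counts as blue bars
genre_counts <- table(movies$genre)
barplot(genre_counts, col = "blue", main = "Movie by Genre")

# Step 5: with a factor on the x-axis, plot() draws one box plot per genre
plot(factor(movies$genre), movies$rating,
     col = "green", main = "Movie by Genre", ylab = "Rating")
```

On the real data, n_movies and n_vars answer the two "how many" questions in step 3. Wrapping genre in factor() is a precaution in case the column was read in as plain text rather than as a factor.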
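Steps 6 to 8 all use plot() with different graphical arguments, so they can be sketched together. The data frame below is again an invented stand-in, and the column names rating, runtime and votes are assumed from the instructions. pch = 2 and pch = 0 select the open triangle and open square symbols. The axis labels in step 7 are my own choice, since the instructions do not give any.

```r
# Hypothetical stand-in for the assignment data (values are invented)
movies <- data.frame(
  rating  = c(6.5, 7.1, 8.0, 7.4, 5.9, 6.8, 8.3, 7.0),
  runtime = c(110, 95, 130, 88, 120, 100, 142, 105),
  votes   = c(12000, 8500, 40000, 15000, 5000, 9000, 52000, 11000)
)

# Step 6: a single numeric vector is plotted against its position,
# so the x-axis is the row index and the y-axis is the runtime value
plot(movies$runtime, col = "red", main = "Movie by Runtime")

# Step 7: rating on x, runtime on y, purple triangles, red title
plot(movies$rating, movies$runtime,
     col = "purple", pch = 2,
     main = "Runtime & Rating", col.main = "red",
     xlab = "Rating", ylab = "Runtime")

# A numeric check to go with the visual impression of the scatterplot
rating_runtime_cor <- cor(movies$rating, movies$runtime)

# Step 8: votes on x, runtime on y, green squares,
# blue title, red subtitle, and the axis labels from the instructions
plot(movies$votes, movies$runtime,
     col = "green", pch = 0,
     main = "Votes versus Runtime", col.main = "blue",
     sub = "No clear correlation", col.sub = "red",
     xlab = "Number of votes [-]", ylab = "Runtime [s]")
```

The value of rating_runtime_cor here only describes the invented numbers; the answer to the correlation question in step 7 has to come from the real data set.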
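For steps 9 and 10, one possible sketch is shown below. It assumes the same column names as the instructions (genre, rating, votes) and uses an invented stand-in for movies, so the genre labels are placeholders rather than the real four genres.

```r
# Hypothetical stand-in for the assignment data (values are invented)
movies <- data.frame(
  genre  = c("Action", "Comedy", "Drama", "Animation",
             "Action", "Comedy", "Drama", "Action"),
  rating = c(6.5, 7.1, 8.0, 7.4, 5.9, 6.8, 8.3, 7.0),
  votes  = c(12000, 8500, 40000, 15000, 5000, 9000, 52000, 11000)
)

# Step 9: list the graphical parameters that are currently active
all_par <- par()
names(all_par)

# Switch to a grid of 2 rows and 1 column; the next two plots fill it
par(mfrow = c(2, 1))
plot(movies$votes, movies$rating,
     col = "blue", main = "Votes & Rating", col.main = "red")
hist(movies$votes,
     col = "green", main = "Histogram of Movies & Votes", col.main = "red")
grid_after_step9 <- par("mfrow")

# Step 10: reset to a single chart, then draw the pie chart of genre counts
par(mfrow = c(1, 1))
genre_counts <- table(movies$genre)
pie(genre_counts, labels = names(genre_counts),
    main = "Pie Chart of Movie Genre")
```

Resetting mfrow before the pie chart matters because the 2-by-1 grid stays active for every later plot until it is changed back.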