PLEASE ANSWER ALL QUESTIONS

Introduction

Wine ratings and descriptions from critics are used by wine shoppers to aid in selecting a wine. These ratings and descriptions can be posted in the description of a wine featured on the web or on a shelf tag hung with the bottles on the shelf in brick-and-mortar stores. This information is provided to increase sales in the fiercely competitive wine industry, which is worth roughly $435B.

Data Description

The file Wine.csv contains the data. The following is a description of the variables included in the dataset. You will use this data dictionary to choose the correct column (variable) when answering questions for your homework assignment.

country - The country that the wine is from

description - description of the wine

points - Wine Enthusiast rating 1-100 (but only ratings 80 or higher are reported)

price - price for a bottle of the wine

province - province or state the wine is from

region - The wine growing area in a province or state

sub region - a more specific region specified within a wine growing area

taster name

taster twitter handle

title - review title

variety - The type of grapes used to make the wine (e.g., Pinot Noir)

winery - The winery that made the wine

designation - vineyard within the winery for grapes that made the wine

Task

You work for a large chain wine retailer and your boss has tasked you with looking at the wine data for certain insights.

# Q0 Use the read.csv() function to read in the Wine.csv file. Don't forget to use the strings=T argument.
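As a minimal sketch of the reading step: the block below writes a tiny stand-in CSV to a temporary file so it can run without Wine.csv. The three rows and the choice of columns are made up for illustration and are not taken from the real file. It assumes that `strings=T` in the question is shorthand for the full argument name `stringsAsFactors = TRUE`, which is spelled out here.

```r
# Toy stand-in for Wine.csv (hypothetical rows, NOT the real data),
# written to a temporary file so the example is self-contained.
tmp <- tempfile(fileext = ".csv")
writeLines(c("country,points,price",
             "US,87,15",
             "France,92,",
             "US,95,60"), tmp)

# stringsAsFactors = TRUE converts text columns to factors on import.
wine <- read.csv(tmp, stringsAsFactors = TRUE)

class(wine$country)    # "factor"
is.na(wine$price[2])   # TRUE: the empty price field is read in as NA
```

For the assignment itself, the file path would be the location of Wine.csv on your own machine rather than a temporary file.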

# Q1a How many observations (rows) are available to you? Use the nrow function.

# Q1b How many variables (columns) are collected on each rated wine? Use the ncol function.
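A short illustration of counting rows and columns, using a made-up three-row data frame in place of the real wine data (the values are hypothetical):

```r
# Hypothetical toy data frame; the real wine data has many more rows and columns.
wine <- data.frame(points = c(87, 92, 95),
                   price  = c(15, NA, 60))

nrow(wine)   # 3  (observations)
ncol(wine)   # 2  (variables)
dim(wine)    # 3 2  (rows and columns in one call)
```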

# It is important to find the exact spellings of column names and entries within columns in the wine dataframe.

# Q2a Use the names() or str() function on the wine dataframe to view the exact spellings of the column names.

# Q2b Use the summary() function on the taster twitter handle column to view the exact spellings of the tasters' twitter handles.

# You will use this strategy of "looking" in a column to get exact spellings to answer questions on this quiz.
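The "looking" strategy can be sketched as follows. The column name `taster_twitter_handle` and the handles are assumptions made for this example; the real spellings are whatever `names()` reports for your data frame.

```r
# Hypothetical toy data; column name and handles are made up for illustration.
wine <- data.frame(
  taster_twitter_handle = c("@taster_a", "@taster_b", "@taster_a"),
  points = c(87, 92, 95),
  stringsAsFactors = TRUE
)

names(wine)   # exact column names
str(wine)     # column names plus types and a preview of the values

# For a factor column, summary() returns a count per level,
# which also shows the exact spelling of each entry.
handle_counts <- summary(wine$taster_twitter_handle)
handle_counts

# levels() lists just the distinct spellings.
levels(wine$taster_twitter_handle)
```

This only works as shown when the column was read in as a factor, which is why the import step asks for strings to be converted.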

# It is important to know whether there are missing values (NAs) in your data since NAs need to be addressed for certain functions to work correctly.

# Q3a Use the sum() and is.na() functions to find the number of missing values in the wine dataframe.

# Q3b Use the sum() and is.na() functions to count the number of missing values in the points column.

# Q3c Use the sum() and is.na() functions to count the number of missing values in the price column.

# Q3d Use the sum() and is.na() functions to count the number of observations that have a price. Hint: Since the ! means "not", putting an ! in front of is.na() counts the non-missing values.
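A sketch of the NA-counting pattern on made-up data (the values are hypothetical). `is.na()` returns TRUE or FALSE for each entry, and `sum()` counts the TRUEs because TRUE is treated as 1.

```r
# Hypothetical toy data with two missing prices.
wine <- data.frame(points = c(87, 92, 95, 90),
                   price  = c(15, NA, 60, NA))

sum(is.na(wine))          # 2: missing values in the whole data frame
sum(is.na(wine$points))   # 0: no missing points
sum(is.na(wine$price))    # 2: missing prices
sum(!is.na(wine$price))   # 2: rows that do have a price
```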

# Q4a Use the mean function to calculate the average points for the wines in the dataset.

# Q4b Use the mean function to calculate the average price for the wines in the dataset.
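One point worth illustrating here is how `mean()` behaves when a column has NAs. The sketch below uses made-up values; whether the real price column needs `na.rm = TRUE` depends on what the missing-value counts show.

```r
# Hypothetical toy data: points is complete, price has missing values.
wine <- data.frame(points = c(87, 92, 95, 90),
                   price  = c(15, NA, 60, NA))

mean(wine$points)                 # 91
mean(wine$price)                  # NA, because of the missing prices
mean(wine$price, na.rm = TRUE)    # 37.5, averaging only the observed prices
```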

# Q5a Use the mean function to calculate the average points for the Napa Sonoma sub-region. Your code should be a single line and will use the square brackets for subsetting.

# Q5b Use the mean function to calculate the average price for the Napa Sonoma sub-region, again in a single line of code.
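A minimal sketch of single-line subsetting with square brackets. The column name `sub_region` and the entry `"Napa-Sonoma"` are assumptions for this example; use the exact spellings found in your own data frame.

```r
# Hypothetical toy data; column name and entries are made up for illustration.
wine <- data.frame(
  points     = c(87, 92, 95, 90),
  price      = c(15, NA, 60, 25),
  sub_region = c("Central Coast", "Napa-Sonoma", "Napa-Sonoma", "Finger Lakes")
)

# The logical test inside [ ] keeps only the matching rows of the column.
mean(wine$points[wine$sub_region == "Napa-Sonoma"])                # 93.5
mean(wine$price[wine$sub_region == "Napa-Sonoma"], na.rm = TRUE)   # 60
```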

# Q6 Now we'll practice our counting skills again.

# The rating scale being used by these tasters originally had 91 as the highest possible score and now goes to 100 for exceedingly delicious wines.

# Use the sum() function on a logical vector that checks for whether points are greater than 91 or not to count the number of wine ratings with more than 91 points.
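The counting idea can be sketched on a small made-up vector of scores. Note that `>` is strict, so a score of exactly 91 is not counted.

```r
# Hypothetical scores for illustration.
points <- c(87, 92, 95, 91, 90)

points > 91        # FALSE TRUE TRUE FALSE FALSE
sum(points > 91)   # 2
```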

# Q7a Now we'll practice subsetting on more than one criterion.

# Use the mean function to compute the average points for wines that are chardonnay wines and from the central coast sub-region.

# Q7b More practice subsetting on more than one criterion.

# Use the mean function to compute the average price for wines that are either chardonnay wines or are from the finger lakes sub-region.
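A sketch of combining two conditions: `&` keeps rows where both are TRUE, and `|` keeps rows where at least one is TRUE. The column names `variety` and `sub_region` and the entries below are assumptions for this example, not spellings confirmed from Wine.csv.

```r
# Hypothetical toy data; names and entries are made up for illustration.
wine <- data.frame(
  points     = c(87, 92, 95, 90, 88),
  price      = c(15, 30, 60, 25, NA),
  variety    = c("Chardonnay", "Pinot Noir", "Chardonnay", "Riesling", "Chardonnay"),
  sub_region = c("Central Coast", "Central Coast", "Napa-Sonoma",
                 "Finger Lakes", "Finger Lakes")
)

is_chard <- wine$variety == "Chardonnay"

# AND: Chardonnay from the Central Coast (row 1 only).
both <- is_chard & wine$sub_region == "Central Coast"
mean(wine$points[both])                   # 87

# OR: Chardonnay, or anything from the Finger Lakes (rows 1, 3, 4, 5).
either <- is_chard | wine$sub_region == "Finger Lakes"
mean(wine$price[either], na.rm = TRUE)    # 33.33333
```

The intermediate variables are only for readability; the same tests can be written directly inside the square brackets.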

# Q8 Use the median function to find the median price for a Cabernet Sauvignon that is rated higher than 91 points.
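The same two-condition pattern works with `median()`. This sketch again uses made-up rows and an assumed `variety` column name.

```r
# Hypothetical toy data; names and entries are made up for illustration.
wine <- data.frame(
  points  = c(93, 95, 90, 92, 96),
  price   = c(50, NA, 20, 80, 40),
  variety = c("Cabernet Sauvignon", "Cabernet Sauvignon", "Cabernet Sauvignon",
              "Cabernet Sauvignon", "Merlot")
)

# Keep Cabernet Sauvignon rows rated above 91 (rows 1, 2 and 4).
keep <- wine$variety == "Cabernet Sauvignon" & wine$points > 91

# Prices for those rows are 50, NA, 80; dropping the NA leaves a median of 65.
median(wine$price[keep], na.rm = TRUE)   # 65
```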
