## Question

Open the provided file to see the question details in the script, and replace each "???" with the correct R code.

```{r}
# Load libraries
library(tidyverse)
library(tidymodels)
library(vip)    # for variable importance
library(skimr)
```
```{r}
# Load data sets
home_sales <- read_rds("E:/RDataFiles/home_sales.Rds") %>% 
  select(-selling_date)
```
```{r}
# Exploratory analysis
skim(home_sales)
```
## Creating a Machine Learning Workflow
In the previous section, we trained a linear regression model on the `advertising` data step by step. In this section, we will go over how to combine all of the modeling steps into a single workflow.
We will be using the `workflows` package, which combines a `parsnip` model with a recipe, and the `last_fit()` function to build an end-to-end model training pipeline.
Let's assume we would like to do the following with the advertising data:
1. Split our data into training and test sets
2. Feature engineer the training data by removing skewness and normalizing numeric predictors
3. Specify a linear regression model
4. Train our model on the training data
5. Transform the test data with the steps learned in step 2 and obtain predictions using our trained model
This entire machine learning workflow can be accomplished in just a few steps with tidymodels, as the sketch below illustrates.
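To make the pattern concrete, here is a minimal sketch of those five steps applied to the `advertising` data. It assumes the `advertising` data frame from the previous section is already loaded and that its numeric outcome column is named `sales`; the column name, the 0.75 split proportion, and the use of `step_YeoJohnson()` for skewness removal are illustrative assumptions, not prescribed answers.

```{r}
# Minimal end-to-end sketch (assumed outcome column `sales`; split proportion
# and skewness step are illustrative choices, not prescribed answers)

# 1. Split the data into training and test sets
ad_split    <- initial_split(advertising, prop = 0.75, strata = sales)
ad_training <- training(ad_split)
ad_test     <- testing(ad_split)

# 2. Feature engineering: remove skewness, then normalize numeric predictors
ad_recipe <- recipe(sales ~ ., data = ad_training) %>% 
  step_YeoJohnson(all_numeric(), -all_outcomes()) %>% 
  step_normalize(all_numeric(), -all_outcomes())

# 3. Specify a linear regression model
ad_model <- linear_reg() %>% 
  set_engine("lm") %>% 
  set_mode("regression")

# 4. Combine the model and recipe into a workflow
ad_workflow <- workflow() %>% 
  add_model(ad_model) %>% 
  add_recipe(ad_recipe)

# 5. Fit on the training data and evaluate on the test data
ad_fit <- ad_workflow %>% 
  last_fit(split = ad_split)

ad_fit %>% collect_metrics()
```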
## Workflow for Home Selling Price
For another example of fitting a machine learning workflow, let's use linear regression to predict the selling price of homes using the `home_sales` data.
For our feature engineering steps, we will remove skewness from the numeric predictors, normalize them, and create dummy variables for the `city` variable.
Remember that all machine learning algorithms need a numeric feature matrix. Therefore we must also transform character or factor predictor variables to dummy variables.
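As a brief illustration of dummy coding (a sketch for intuition only, not part of the assignment), `step_dummy()` converts a nominal column such as `city` into 0/1 indicator columns:

```{r}
# Illustration only: dummy-code `city` and inspect the resulting indicator columns
recipe(selling_price ~ ., data = home_sales) %>% 
  step_dummy(city) %>% 
  prep() %>% 
  bake(new_data = NULL) %>% 
  select(starts_with("city"))
```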
### Step 1. Split Our Data
First we split our data into training and test sets.
```{r}
???

# Create a split object
homes_split <- initial_split(home_sales, prop = ??, strata = selling_price)

# Build training data set
homes_training <- homes_split %>% ??()

# Build testing data set
homes_test <- homes_split %>% ??()
```
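One plausible way to complete this chunk (a sketch): the leading `???` is read here as a `set.seed()` call, and the seed value and 0.75 split proportion are illustrative choices, not prescribed answers.

```{r}
# Possible completion (sketch): seed and proportion are illustrative choices
set.seed(271)

# Create a split object
homes_split <- initial_split(home_sales, prop = 0.75, strata = selling_price)

# Build training data set
homes_training <- homes_split %>% training()

# Build testing data set
homes_test <- homes_split %>% testing()
```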
### Step 2. Feature Engineering
Next, we specify our feature engineering recipe. In this step, we do not use `prep()` or `bake()`. This recipe will be automatically applied in a later step using the `workflow()` and `last_fit()` functions.
For our model formula, we are specifying that `selling_price` is our response variable and all others are predictor variables.
```{r}
homes_recipe <- recipe(?? ~ ., data = ??) %>% 
  ???(all_numeric(), -all_outcomes()) %>% 
  ??(all_numeric(), -all_outcomes()) %>% 
  ??(all_nominal(), -all_outcomes())
```
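A plausible completion that matches the feature engineering plan above (a sketch; `step_YeoJohnson()` is one common choice for removing skewness, though the assignment may intend a different step):

```{r}
# Possible completion (sketch)
homes_recipe <- recipe(selling_price ~ ., data = homes_training) %>% 
  step_YeoJohnson(all_numeric(), -all_outcomes()) %>%  # remove skewness
  step_normalize(all_numeric(), -all_outcomes()) %>%   # center and scale
  step_dummy(all_nominal(), -all_outcomes())           # dummy-code `city`
```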
As an intermediate step, let's check our recipe by prepping it on the training data and applying it to the test data. We want to make sure that we get the correct transformations.
From the results below, things look correct.
```{r}
homes_recipe %>% 
  prep() %>% 
  ??(new_data = ??)
```
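A plausible completion of this check (a sketch): prep the recipe on the training data, then bake the test data.

```{r}
# Possible completion (sketch)
homes_recipe %>% 
  prep() %>% 
  bake(new_data = homes_test)
```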
### Step 3. Specify a Model
Next, we specify our linear regression model with `parsnip`.
```{r}
lm_model <- ??() %>% 
  set_engine(??) %>% 
  set_mode(???)
```
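One way to fill this in (a sketch, using the standard `lm` engine):

```{r}
# Possible completion (sketch)
lm_model <- linear_reg() %>% 
  set_engine("lm") %>% 
  set_mode("regression")
```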
### Step 4. Create a Workflow
Next, we combine our model and recipe into a workflow object.
```{r}
homes_workflow <- ???() %>% 
  add_model(lm_model) %>% 
  add_recipe(homes_recipe)
```
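A plausible completion (a sketch): the missing function is most likely `workflow()`.

```{r}
# Possible completion (sketch)
homes_workflow <- workflow() %>% 
  add_model(lm_model) %>% 
  add_recipe(homes_recipe)
```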
### Step 5. Execute the Workflow
Finally, we process our machine learning workflow with `last_fit()`.
```{r}
homes_fit <- homes_workflow %>% 
  last_fit(split = homes_split)
```
To obtain the performance metrics and predictions on the test set, we use the `collect_metrics()` and `collect_predictions()` functions on our `homes_fit` object.
```{r}
# Obtain performance metrics on test data
homes_fit %>% 
  ???()
```
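A plausible completion (a sketch), following the description above: `collect_metrics()` returns the test-set performance metrics, and `collect_predictions()` returns the test-set predictions.

```{r}
# Possible completion (sketch)
# Obtain performance metrics on test data
homes_fit %>% 
  collect_metrics()

# Obtain predictions on test data
homes_fit %>% 
  collect_predictions()
```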