
Question


---
title: 'Twitter Retweetability Analysis'
subtitle: 'UMaine BUA684 Module 3'
author:
  - FirstName LastName
date: "`r format(Sys.time(), '%d %B %Y')`"
output:
  pdf_document: default
  html_document: default
---

# Problem

The aim of this assignment is to understand the relationship between the retweetability of a tweet (whether the tweet is retweeted and, if it is, how many times) and some features of the tweet.

# Data

```{r message=FALSE, warning=FALSE}
# import data
# install.packages("readr")  # run once in the Console; installing while knitting can fail
library(readr)
LLBean_retweet <- read_csv("LLBean_retweet.csv")

# wrangle data
# install.packages(c("dplyr", "stringr", "stringi"))  # run once in the Console
library(dplyr)
LLBean1 <- LLBean_retweet %>%
  filter(language == "en") %>%
  mutate(hashtags = gsub("\\[|\\]", "", hashtags),
         urls = gsub("\\[|\\]", "", urls))

library(stringr)
library(stringi)
LLBean2 <- LLBean1 %>%
  mutate(
    tweet_length = str_length(tweet),
    url_ind = ifelse(str_length(urls) == 0, 0, 1),
    hashtags_count = ifelse(str_length(hashtags) == 0, 0,
                            stri_count_fixed(hashtags, ",") + 1),
    retweet_ind = as.numeric(retweets_count > 0)
  ) %>%
  select(-language)

LLBean3 <- LLBean2 %>%
  filter(retweet_ind == 1)

head(LLBean_retweet)
head(LLBean1)
head(LLBean2)
head(LLBean3)
```
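To see what the derived columns look like without loading the full CSV, the same steps can be tried on a two-row toy data frame. This is only a sketch: the toy values and the bracketed list format of `hashtags` and `urls` are assumptions, and `count_commas()` is a hypothetical base R stand-in for `stri_count_fixed()` so the example needs no extra packages.

```r
# Toy data with made-up values; the column names mirror the chunk above.
toy <- data.frame(
  tweet = c("New boots are here", "Gear up for fall #hiking #camping"),
  hashtags = c("[]", "['hiking', 'camping']"),
  urls = c("['https://example.com']", "[]"),
  retweets_count = c(0, 3),
  stringsAsFactors = FALSE
)

# Strip the square brackets, as gsub("\\[|\\]", "", ...) does in the chunk above.
toy$hashtags <- gsub("\\[|\\]", "", toy$hashtags)
toy$urls <- gsub("\\[|\\]", "", toy$urls)

# Hypothetical helper: count the commas in each string using base R only.
count_commas <- function(x) {
  lengths(regmatches(x, gregexpr(",", x, fixed = TRUE)))
}

toy$tweet_length <- nchar(toy$tweet)
toy$url_ind <- ifelse(nchar(toy$urls) == 0, 0, 1)
toy$hashtags_count <- ifelse(nchar(toy$hashtags) == 0, 0,
                             count_commas(toy$hashtags) + 1)
toy$retweet_ind <- as.numeric(toy$retweets_count > 0)

print(toy[, c("tweet_length", "url_ind", "hashtags_count", "retweet_ind")])
```

An empty string after bracket removal means "no hashtags" or "no URL"; otherwise the number of items is the number of commas plus one.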

Below, please write all your answers in **bold** font.

*Problem 1: Based on the data outputs `LLBean_retweet`, `LLBean1`, `LLBean2`, and `LLBean3`, describe how the above R code wrangles the data.* **(Note: Because the dataset is large, to get a complete view of the four datasets you must also run the above code in the Console, not just in this R Notebook. Alternatively, you can open each of the four datasets from the Environment panel by clicking on the dataset name.)**

**Your answer: ( )**

# Analysis

## Logistic regression model

*Problem 2: In the following chunk, use the business case example for the logistic model as a reference to build a logistic regression model based on the dataset `LLBean2` to predict `retweet_ind` using `tweet_length`, `url_ind`, `hashtags_count`, and `video` as predictors. Show all your R code in the submission, including the code for addressing the multicollinearity problem. If the problem appears, you need to update your model to resolve it. You should also evaluate the importance of the different predictors in the model.*

```{r message=FALSE, warning=FALSE}

```
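As a rough illustration of the workflow Problem 2 describes, the sketch below fits a logistic model, computes variance inflation factors, and ranks predictors. It runs on simulated stand-in data, not on `LLBean2`: the seed, sample size, and data-generating coefficients are all assumptions. The VIF is computed by hand from its textbook definition, and ranking by absolute z statistic is one common choice; the course's business case example may use a packaged function instead.

```r
set.seed(684)  # arbitrary seed for the simulated stand-in data
n <- 500
sim <- data.frame(
  tweet_length = round(runif(n, 20, 280)),
  url_ind = rbinom(n, 1, 0.5),
  hashtags_count = rpois(n, 1),
  video = rbinom(n, 1, 0.2)
)
# Hypothetical data-generating process, chosen only so the model has signal.
eta <- -1 + 0.004 * sim$tweet_length + 0.8 * sim$video - 0.3 * sim$url_ind
sim$retweet_ind <- rbinom(n, 1, plogis(eta))

fit <- glm(retweet_ind ~ tweet_length + url_ind + hashtags_count + video,
           data = sim, family = binomial)
print(summary(fit)$coefficients)

# Variance inflation factor per predictor: 1 / (1 - R^2), where R^2 comes from
# regressing that predictor on the remaining ones. Values near 1 suggest
# little multicollinearity.
predictors <- c("tweet_length", "url_ind", "hashtags_count", "video")
vif <- sapply(predictors, function(p) {
  others <- setdiff(predictors, p)
  r2 <- summary(lm(reformulate(others, response = p), data = sim))$r.squared
  1 / (1 - r2)
})
print(round(vif, 2))

# One rough importance ranking: absolute z statistics of the slopes.
z <- summary(fit)$coefficients[predictors, "z value"]
print(sort(abs(z), decreasing = TRUE))
```

If a predictor showed a large VIF, a usual remedy is to drop it (or one of the predictors it overlaps with) and refit the model.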

*Problem 3: Based on your final model results for Problem 2, interpret the meaning of the regression coefficient estimate for the most important predictor.*

**Your answer: ( )**

*Problem 4: In the following chunk, estimate the possible range of the regression coefficient estimate for the most important predictor at the 95% confidence level.*

```{r message=FALSE, warning=FALSE}

```
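A minimal sketch of a 95% interval for a logistic coefficient, again on simulated data with assumed coefficients and a single hypothetical predictor. It uses `confint.default()`, which gives Wald intervals (estimate plus or minus about 1.96 standard errors); calling `confint()` on a `glm` fit instead gives profile-likelihood intervals, which can differ slightly.

```r
set.seed(684)  # arbitrary seed for the simulated stand-in data
n <- 500
video <- rbinom(n, 1, 0.3)
retweet_ind <- rbinom(n, 1, plogis(-0.5 + 0.9 * video))
fit <- glm(retweet_ind ~ video, family = binomial)

# Wald 95% interval on the log-odds scale.
ci_log_odds <- confint.default(fit, parm = "video", level = 0.95)
# Exponentiating turns it into an interval for the odds ratio.
ci_odds_ratio <- exp(ci_log_odds)

print(ci_log_odds)
print(ci_odds_ratio)
```

An odds-ratio interval that lies entirely above 1 indicates that, at the 95% confidence level, the predictor is associated with higher odds of a retweet.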

*Problem 5: Based on the result for Problem 4, interpret the generalized meaning of the regression coefficient estimate for the most important predictor.*

**Your answer: ( )**

*Problem 6: In the following chunk, measure the performance of the logistic regression model you built. Show all your R code in the submission.*

```{r message=FALSE, warning=FALSE}

```
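The assignment does not say which performance measure to use, so the sketch below shows two common ones as an assumption: accuracy at a 0.5 cutoff and the area under the ROC curve (AUC), computed in base R through the rank identity. The data are simulated with assumed coefficients, and both measures are computed in-sample.

```r
set.seed(684)  # arbitrary seed for the simulated stand-in data
n <- 500
tweet_length <- round(runif(n, 20, 280))
retweet_ind <- rbinom(n, 1, plogis(-2 + 0.012 * tweet_length))
fit <- glm(retweet_ind ~ tweet_length, family = binomial)
p_hat <- fitted(fit)

# Confusion matrix and accuracy at a 0.5 cutoff (the cutoff is a choice).
pred_class <- as.numeric(p_hat > 0.5)
conf_mat <- table(actual = retweet_ind, predicted = pred_class)
accuracy <- mean(pred_class == retweet_ind)

# AUC via the rank (Mann-Whitney) identity: the probability that a randomly
# chosen retweeted tweet gets a higher fitted probability than a randomly
# chosen non-retweeted one, with ties counted as one half.
n1 <- sum(retweet_ind == 1)
n0 <- sum(retweet_ind == 0)
auc <- (sum(rank(p_hat)[retweet_ind == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)

print(conf_mat)
print(c(accuracy = accuracy, auc = auc))
```

An AUC of 0.5 corresponds to random guessing and 1 to perfect separation; measuring on held-out data would give a less optimistic figure than these in-sample values.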

*Problem 7: According to the model performance measure in Problem 6, is this model a poor/average/good/strong model?*

**Your answer: ( )**

## Least-square regression model

*Problem 8: In the following chunk, use the business case example for the least-square model as a reference to build a least-square regression model based on the dataset `LLBean3` to predict `retweets_count` using `tweet_length`, `url_ind`, `hashtags_count`, and `video` as predictors. Show all your R code in the submission, including the code for checking the assumptions for this type of regression model, detecting influential outliers, and addressing the multicollinearity problem. If the outlier problem and/or the multicollinearity problem appears, you need to update your model to resolve the problem(s). You should also evaluate the importance of the different predictors in the model.*

```{r message=FALSE, warning=FALSE}

```
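The outlier and assumption checks in Problem 8 can be sketched as follows on simulated stand-in data (not `LLBean3`). The data-generating process, the 4 / n cutoff for Cook's distance, and the use of the Shapiro-Wilk test are assumptions made for illustration; the course's business case example may use different diagnostics.

```r
set.seed(684)  # arbitrary seed for the simulated stand-in data
n <- 300
sim <- data.frame(
  tweet_length = round(runif(n, 20, 280)),
  url_ind = rbinom(n, 1, 0.5),
  hashtags_count = rpois(n, 1),
  video = rbinom(n, 1, 0.2)
)
# Hypothetical skewed count outcome, at least 1 as in a retweeted-only subset.
sim$retweets_count <- 1 + rpois(n, exp(0.5 + 0.003 * sim$tweet_length +
                                         0.6 * sim$video))

fit <- lm(retweets_count ~ tweet_length + url_ind + hashtags_count + video,
          data = sim)

# Influential observations: Cook's distance above the 4 / n rule of thumb.
cooks <- cooks.distance(fit)
influential <- which(cooks > 4 / n)

# Refit without the flagged rows and compare the coefficients.
sim_trimmed <- if (length(influential) > 0) sim[-influential, ] else sim
fit_trimmed <- lm(formula(fit), data = sim_trimmed)
print(cbind(all_rows = coef(fit), trimmed = coef(fit_trimmed)))

# Visual assumption checks (run interactively):
# par(mfrow = c(2, 2)); plot(fit_trimmed)

# Shapiro-Wilk test of residual normality; a small p-value suggests the
# residuals are not normally distributed.
print(shapiro.test(residuals(fit_trimmed)))
print(summary(fit_trimmed)$r.squared)
```

Count outcomes such as retweet counts are often right-skewed, so non-normal residuals and non-constant variance are typical things to look for in the diagnostic plots.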

*Problem 9: Are any assumptions not satisfied by this dataset? If so, which ones?*

**Your answer: ( )**

*Problem 10: Based on your final model results for Problem 8, interpret the meaning of the regression coefficient estimate for the most important predictor.*

**Your answer: ( )**

*Problem 11: In the following chunk, estimate the possible range of the regression coefficient estimate for the most important predictor at the 95% confidence level.*

```{r message=FALSE, warning=FALSE}

```

*Problem 12: Based on the result for Problem 11, interpret the generalized meaning of the regression coefficient estimate for the most important predictor.*

**Your answer: ( )**

*Problem 13: In the following chunk, measure the performance of the least-square regression model you built. Show all your R code in the submission.*

```{r message=FALSE, warning=FALSE}

```

*Problem 14: According to the model performance measure in Problem 13, do you think this model is a good model?*

**Your answer: ( )**

# Discussion

*Reflect on the ways in which the logistic and least-square regression model results could contribute to the development of an enhanced social media marketing strategy for the company. Although this analysis won't be detailed here, you will have the opportunity to collaborate with your project team, allowing you to work together with your teammates to further examine this aspect and devise comprehensive marketing approaches.*
