Question
R Code Homework
Association Rules
The data for this assignment was generated using the method of Agrawal and Srikant (random.patterns) to simulate transactions (random.transactions) that contain correlated items. There are 10,000 transactions drawn from 100 items, and the average transaction length is 10 items. Note: You will need to install and load the arules and arulesViz R packages to complete this assignment.
Read the AssociationRules.csv transaction data file.
Provide the item frequency plot for top 5 items. Determine the most frequent item bought in the store. Hint: Use itemFrequencyPlot function.
The most frequently bought item was whole milk.
How many items were bought in the largest transaction? Hint: Use max, colSums to calculate the number of items in the largest transaction.
Call the Apriori Association rule algorithm with a minimum Support of 1% and a minimum Confidence of 0%.
How many rules appear in the data?
How many rules are observed when the minimum confidence is changed to 50%?
Explain how the specified confidence impacts the number of rules.
Create a scatter plot with support on the x-axis, confidence on the y-axis, and lift as the shading.
Identify where the interesting rules sit in the plot (for example, top left, top right, bottom left, or bottom right). Explain your observation.
Create a scatter plot with support on the x-axis, lift on the y-axis, and confidence as the shading.
Identify where the interesting rules sit in the plot (for example, top left, top right, bottom left, or bottom right). Explain your observation.
Calculate the number of rules that appear in at least 10% of the transactions (10% support).
Provide any 3 rules from them.
Identify the most interesting rules by extracting the rules in which the confidence is greater than 80%. Report the total number of rules extracted.
Provide 3 rules with the highest lift.
Provide 3 rules with lowest lift.
Create a Graph-based visualization with items and rules as vertices for part (a)
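The questions above rely on three measures: support, confidence, and lift. As a minimal sketch in base R, the toy example below computes them by hand and shows why raising the minimum confidence can only shrink the rule set. The four baskets and the helper names (`support`, `confidence`, `lift`) are made up for illustration; they are not taken from AssociationRules.csv or from the arules package.

```r
# Four toy transactions (illustrative only, not the homework data)
baskets <- list(c("a", "b"), c("a", "b"), c("a", "b", "c"), c("c"))

# Support of an itemset: share of transactions containing every item in it
support <- function(items) {
  mean(sapply(baskets, function(b) all(items %in% b)))
}

# Confidence of lhs => rhs: support of lhs and rhs together, divided by support of lhs
confidence <- function(lhs, rhs) support(c(lhs, rhs)) / support(lhs)

# Lift: confidence relative to how often rhs occurs on its own
lift <- function(lhs, rhs) confidence(lhs, rhs) / support(rhs)

# Every single-item rule over the items a, b, c
items <- c("a", "b", "c")
pairs <- expand.grid(lhs = items, rhs = items, stringsAsFactors = FALSE)
pairs <- pairs[pairs$lhs != pairs$rhs, ]
pairs$confidence <- mapply(confidence, pairs$lhs, pairs$rhs, USE.NAMES = FALSE)
pairs$lift <- mapply(lift, pairs$lhs, pairs$rhs, USE.NAMES = FALSE)

# A higher confidence threshold keeps a subset of the rules kept by a lower one
n_conf0 <- sum(pairs$confidence >= 0)     # 6 rules
n_conf50 <- sum(pairs$confidence >= 0.5)  # 4 rules
print(pairs)
```

Here a => b and b => a have confidence 1, c => a and c => b have confidence 0.5, and the remaining two rules fall to 1/3, so they disappear once the threshold reaches 50%.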
# Install the packages and load libraries
install.packages('arules')
install.packages('arulesViz')
library('arules')
library('arulesViz')
# Read the CSV file as transaction data (one transaction ID/item pair per row)
txn <- read.transactions("MBAdata.csv", rm.duplicates = FALSE, format = "single", sep = ",", cols = c(1, 2))
#inspect transaction data
txn@transactionInfo
txn@itemInfo
image(txn)
itemFrequencyPlot(txn,topN=10)
# Mine association rules (minimum support 50%, minimum confidence 90%)
basket_rules <- apriori(txn, parameter = list(support = 0.5, confidence = 0.9))
inspect(basket_rules)
#Part2
#Read in Groceries data
data(Groceries)
Groceries
Groceries@itemInfo
#mine rules
rules <- apriori(Groceries, parameter=list(support=0.001, confidence=0.5))
# Print a summary of the mined rule set
rules
# Extract rules with confidence greater than 0.8
subrules <- rules[quality(rules)$confidence > 0.8]
# Equivalent, using direct slot access
subrules <- rules[rules@quality$confidence > 0.8]
inspect(subrules)
# Visualize rules as a scatter plot (with jitter to reduce occlusion)
plot (subrules)
#Extract the top three rules with high lift
rules_high_lift <- head(sort(rules, by="lift"), 3)
inspect(rules_high_lift)
plot(rules_high_lift, method="graph", control=list(type="items"))
###-------------- Homework --------------###
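# Read the homework file as transactions. read.transactions() defaults to
# format = "basket" with whitespace-separated items; add sep = "," if the
# file turns out to be comma-delimited.
txn <- read.transactions("AssociationRules.csv", rm.duplicates = FALSE)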
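The homework script stops after reading the file. One possible way to continue is sketched below, under stated assumptions: `summarize_rules` and `plot_rules` are hypothetical helpers written for this example (they are not part of arules or arulesViz), and the toy transactions stand in for AssociationRules.csv so the block runs on its own. The thresholds follow the questions: 1% support, confidence of 0% and 50%, rules with at least 10% support, and rules with confidence above 80% ranked by lift.

```r
library(arules)
library(arulesViz)

# Hypothetical helper (not part of arules): gathers the counts and rule
# subsets that the homework questions ask about.
summarize_rules <- function(txn, min_support = 0.01) {
  quiet <- list(verbose = FALSE)
  rules_conf0 <- apriori(txn, control = quiet,
                         parameter = list(support = min_support, confidence = 0))
  rules_conf50 <- apriori(txn, control = quiet,
                          parameter = list(support = min_support, confidence = 0.5))
  q <- quality(rules_conf0)
  high_conf <- rules_conf0[q$confidence > 0.8]
  by_lift <- sort(high_conf, by = "lift")  # highest lift first
  list(
    largest_transaction = max(size(txn)),   # items in the largest transaction
    n_rules_conf0 = length(rules_conf0),
    n_rules_conf50 = length(rules_conf50),
    n_support10 = sum(q$support >= 0.1),    # rules with at least 10% support
    n_conf80 = length(high_conf),           # rules with confidence above 80%
    highest_lift = head(by_lift, 3),
    lowest_lift = tail(by_lift, 3),
    rules = rules_conf0
  )
}

# Hypothetical helper: the two scatter plots and the graph-based view.
plot_rules <- function(rules, top_rules) {
  plot(rules, measure = c("support", "confidence"), shading = "lift")
  plot(rules, measure = c("support", "lift"), shading = "confidence")
  plot(top_rules, method = "graph")
}

# Toy transactions so the sketch runs without the homework file
toy <- as(list(c("a", "b"), c("a", "b"), c("a", "b", "c"), c("c")), "transactions")
res <- summarize_rules(toy)
res$largest_transaction
inspect(res$highest_lift)
```

With the homework data loaded as `txn`, the same helpers would be called as `res <- summarize_rules(txn)` and then `plot_rules(res$rules, res$highest_lift)`; `itemFrequencyPlot(txn, topN = 5)` covers the item frequency question.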