
Question

Product Recommendations: The action or practice of selling additional products or services to existing customers is called cross-selling. Giving product recommendations is one example of cross-selling that is frequently used by online retailers. One simple method is to recommend products that customers frequently browse together.

Suppose we want to recommend new products to a customer based on the products they have already browsed on the website. Write a program using the A-priori algorithm to find products that are frequently browsed together. Fix the support to s = 100 (i.e., product pairs need to occur together at least 100 times to be considered frequent) and find itemsets of size 2 and 3.

Use the online browsing behavior dataset provided with this homework. Each line represents a browsing session of a customer. On each line, each string of 8 characters represents the id of an item browsed during that session. The items are separated by spaces.

a) Identify pairs of items (X,Y) such that the support of {X,Y} is at least 100. For all such pairs, compute the confidence scores of the corresponding association rules: X ⇒ Y, Y ⇒ X. Sort the rules in decreasing order of confidence scores and list the top 5 rules in the writeup. Break ties, if any, by lexicographically increasing order on the left-hand side of the rule.

b) Identify item triples (X,Y,Z) such that the support of {X,Y,Z} is at least 100. For all such triples, compute the confidence scores of the corresponding association rules: (X,Y) ⇒ Z, (X,Z) ⇒ Y, (Y,Z) ⇒ X. Sort the rules in decreasing order of confidence scores and list the top 5 rules in the writeup. Order the left-hand-side pair lexicographically and break ties, if any, by lexicographical order of the first then the second item in the pair.

Instructions for Code Submission and Output Format

Please follow the instructions below. They will help us in grading the programming part of your homework.

- Supported programming languages: Python, Java, C++
- Store all the relevant files in a folder and submit the corresponding zip file named after your student id, e.g., 114513209.zip
- This folder should have a script file named run_code.sh
- Executing this script should do all the necessary steps required for executing the code, including compiling, linking, and execution
- Assume relative file paths in your code. Some examples: "./filename.txt" or "../hw1/filename.txt"
- The output of your program should be dumped in a file named "output.txt" in the following format:

OUTPUT A
FRO11987 FRO12685 0.4325
FRO11987 ELE11375 0.4225
FRO11987 GRO94758 0.4125
FRO11987 SNA80192 0.4025
FRO11987 FRO18919 0.4015
OUTPUT B
FRO11987 FRO12685 DAI95741 0.4325
FRO11987 ELE11375 GRO73461 0.4225
FRO11987 GRO94758 ELE26917 0.4125
FRO11987 SNA80192 ELE28189 0.4025
FRO11987 FRO18919 GRO68850 0.4015

Explanation.

- Line 1 should have "OUTPUT A"
- Next five lines should have the top five rules with decreasing confidence scores for part (a) of the programming question. Format: item1 item2 confidence, meaning {item1} ⇒ item2
- Line 7 should have "OUTPUT B"
- Next five lines should have the top five rules with decreasing confidence scores for part (b) of the programming question. Format: item1 item2 item3 confidence, meaning {item1, item2} ⇒ item3
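
One way the task above might be approached is sketched below in Python. It is not an official or graded solution, only a minimal three-pass A-priori sketch under stated assumptions: duplicate items within a session are counted once, confidence is taken as conf(X ⇒ Y) = support({X,Y}) / support({X}), and the function names (load_sessions, apriori_rules, write_output) are hypothetical names chosen for illustration. The dataset file name is not given in the question, so the caller has to supply it.

```python
from collections import Counter
from itertools import combinations


def load_sessions(path):
    """Read one browsing session per line; items are separated by spaces."""
    with open(path) as f:
        return [line.split() for line in f]


def apriori_rules(sessions, support=100):
    """Return (pair_rules, triple_rules), sorted by confidence, then LHS."""
    # Assumption: an item repeated within one session counts once.
    baskets = [sorted(set(s)) for s in sessions]

    # Pass 1: count single items and keep the frequent ones.
    item_counts = Counter(item for b in baskets for item in b)
    frequent_items = {i for i, c in item_counts.items() if c >= support}

    # Pass 2: count only pairs whose two members are both frequent.
    pair_counts = Counter()
    for b in baskets:
        kept = [i for i in b if i in frequent_items]
        pair_counts.update(combinations(kept, 2))
    frequent_pairs = {p: c for p, c in pair_counts.items() if c >= support}

    # Pass 3: count only triples whose three sub-pairs are all frequent.
    triple_counts = Counter()
    for b in baskets:
        kept = [i for i in b if i in frequent_items]
        for t in combinations(kept, 3):
            if all(p in frequent_pairs for p in combinations(t, 2)):
                triple_counts[t] += 1
    frequent_triples = {t: c for t, c in triple_counts.items() if c >= support}

    # conf(X => Y) = support({X, Y}) / support({X}), in both directions.
    pair_rules = []
    for (x, y), c in frequent_pairs.items():
        pair_rules.append((x, y, c / item_counts[x]))
        pair_rules.append((y, x, c / item_counts[y]))
    pair_rules.sort(key=lambda r: (-r[2], r[0]))

    # conf({X, Y} => Z) = support({X, Y, Z}) / support({X, Y}).
    triple_rules = []
    for t, c in frequent_triples.items():
        for z in t:
            lhs = tuple(i for i in t if i != z)  # stays in sorted order
            triple_rules.append((lhs[0], lhs[1], z, c / pair_counts[lhs]))
    triple_rules.sort(key=lambda r: (-r[3], r[0], r[1]))
    return pair_rules, triple_rules


def write_output(pair_rules, triple_rules, path="output.txt", top=5):
    """Write the top rules in the OUTPUT A / OUTPUT B layout."""
    with open(path, "w") as f:
        f.write("OUTPUT A\n")
        for x, y, conf in pair_rules[:top]:
            f.write(f"{x} {y} {conf:.4f}\n")
        f.write("OUTPUT B\n")
        for x, y, z, conf in triple_rules[:top]:
            f.write(f"{x} {y} {z} {conf:.4f}\n")
```

To produce output.txt, the three functions can be chained: load the sessions from the dataset file with a relative path, pass them to apriori_rules with support=100, and hand both rule lists to write_output. Because each basket is sorted before counting, every pair and triple key is already in lexicographic order, which keeps the tie-breaking sort keys simple.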
