Market Basket Analysis using Association Rule-Mining in R language

Association rule mining is usually performed on transaction data from a retail market or an online e-commerce store. Because most transaction datasets are large, the Apriori algorithm is used to find these patterns, or rules, efficiently. Association rules are widely used to analyze retail basket or transaction data; the aim is to identify strong rules using measures of interestingness such as support, confidence and lift. Apriori uses a "bottom-up" approach: frequent subsets are extended one item at a time (a step known as candidate generation), and groups of candidates are tested against the data. The algorithm terminates when no further successful extensions are found.
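
To make the bottom-up idea concrete, here is a minimal base-R sketch of the level-wise search. The toy transactions, the `min_support` threshold and the helper `support()` are all assumptions made for illustration. It is a simplified version that skips the subset-pruning optimisation found in production implementations.

```r
# Toy transactions (hypothetical, not taken from the Groceries dataset)
transactions <- list(
  c("whole milk", "rolls/buns"),
  c("whole milk", "yogurt"),
  c("whole milk", "rolls/buns", "yogurt"),
  c("rolls/buns"),
  c("whole milk", "rolls/buns")
)
min_support <- 0.3  # assumed threshold

# Fraction of transactions that contain every item in `items`
support <- function(items, trans) {
  mean(vapply(trans, function(t) all(items %in% t), logical(1)))
}
is_frequent <- function(s) support(s, transactions) >= min_support

# Level 1: frequent single items
frequent <- Filter(is_frequent, as.list(sort(unique(unlist(transactions)))))
all_frequent <- list()

while (length(frequent) > 0) {
  all_frequent <- c(all_frequent, frequent)
  k <- length(frequent[[1]]) + 1
  candidates <- list()
  if (length(frequent) >= 2) {
    # Candidate generation: join pairs of frequent itemsets into size-k sets
    pairs <- combn(length(frequent), 2)
    for (j in seq_len(ncol(pairs))) {
      cand <- sort(union(frequent[[pairs[1, j]]], frequent[[pairs[2, j]]]))
      if (length(cand) == k) candidates[[paste(cand, collapse = "|")]] <- cand
    }
  }
  # Test the candidates against the data; stop when none survive
  frequent <- Filter(is_frequent, unname(candidates))
}

print(all_frequent)
```

The loop ends as soon as a level produces no frequent itemsets, which mirrors the termination condition described above.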

DATASET: Groceries_dataset

Let's code and analyse the algorithm 💪

👉 Import the groceries dataset

👉 Explore the data
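
A minimal sketch of the import and exploration steps. So that it runs on its own, it writes a tiny stand-in file first. The column names (`Member_number`, `Date`, `itemDescription`) and the sample rows are assumptions, not values confirmed by the article.

```r
# Hypothetical three-row stand-in for the Groceries dataset file;
# column names and values are assumptions made for this sketch.
sample_csv <- tempfile(fileext = ".csv")
writeLines(c(
  "Member_number,Date,itemDescription",
  "1001,01-01-2015,whole milk",
  "1001,01-01-2015,rolls/buns",
  "1002,02-01-2015,yogurt"
), sample_csv)

# For the real data, pass the path of the downloaded dataset instead
groceries <- read.csv(sample_csv, stringsAsFactors = FALSE)

# Explore the structure, the first rows and the most common items
str(groceries)
head(groceries)
sort(table(groceries$itemDescription), decreasing = TRUE)
```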

👉 Perform data preparation: check for null values, normalise the data format to numeric values, and group records with similar values
👉 After the data pre-processing, the item list is modified as:
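
A minimal sketch of the preparation step in base R. The toy data frame and its column names are assumptions. Grouping by member and date is one plausible reading of "group the data of similar values", treating each member's purchases on one day as a single basket.

```r
# Toy data with one missing item (hypothetical values and column names)
groceries <- data.frame(
  Member_number   = c("1001", "1001", "1002", "1002", "1003"),
  Date            = c("01-01-2015", "01-01-2015", "02-01-2015",
                      "02-01-2015", "03-01-2015"),
  itemDescription = c("whole milk", "rolls/buns", "yogurt", "whole milk", NA),
  stringsAsFactors = FALSE
)

# Count missing values per column, then drop incomplete rows
na_counts <- colSums(is.na(groceries))
groceries <- groceries[complete.cases(groceries), ]

# Normalise the member number to a numeric type
groceries$Member_number <- as.integer(groceries$Member_number)

# One row per (member, date) basket, items joined by commas
baskets <- aggregate(
  itemDescription ~ Member_number + Date,
  data = groceries,
  FUN  = function(items) paste(items, collapse = ",")
)
print(baskets)
```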
    

👉 Remove the undesired parameters such as date and member number, and then create a new dataset called ItemList.csv
👉 A sample snippet of the item list:
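
Dropping the extra columns and writing the item list could look like the sketch below. The `baskets` data frame is a hypothetical stand-in for the grouped data, and the file is written to a temporary directory so that the example has no side effects.

```r
# Hypothetical grouped baskets (stand-in for the prepared data)
baskets <- data.frame(
  Member_number   = c(1001, 1002, 1003),
  Date            = c("01-01-2015", "02-01-2015", "03-01-2015"),
  itemDescription = c("whole milk,rolls/buns", "yogurt,whole milk", "soda"),
  stringsAsFactors = FALSE
)

# Keep only the items; date and member number are not needed for the rules
item_list <- baskets$itemDescription

# One basket per line, items separated by commas
out_path <- file.path(tempdir(), "ItemList.csv")
writeLines(item_list, out_path)

readLines(out_path)
```

`writeLines()` is used here rather than `write.csv()` because the latter would wrap each basket in quotes and add a header row, which gets in the way when the file is later read as one transaction per line.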
                    
👉 The Apriori algorithm generates the most relevant set of rules from the given transaction data. It also reports the support, confidence and lift of those rules, which are used to judge their relative strength.
Consider the rule X => Y in order to understand the metrics:
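
As a worked example of the three metrics, the sketch below uses their standard definitions on toy data: support is the fraction of transactions containing both X and Y, confidence is that support divided by the support of X, and lift is the confidence divided by the support of Y. The transactions and the helper `support()` are assumptions made for illustration.

```r
# Toy transactions (hypothetical data)
transactions <- list(
  c("milk", "bread"),
  c("milk", "butter"),
  c("milk", "bread", "butter"),
  c("bread"),
  c("milk", "bread")
)

# Fraction of transactions that contain every item in `items`
support <- function(items, trans) {
  mean(vapply(trans, function(t) all(items %in% t), logical(1)))
}

X <- "milk"
Y <- "bread"

supp_xy <- support(c(X, Y), transactions)      # support(X => Y)
conf_xy <- supp_xy / support(X, transactions)  # confidence(X => Y)
lift_xy <- conf_xy / support(Y, transactions)  # lift(X => Y)

cat("support:", supp_xy, " confidence:", conf_xy, " lift:", lift_xy, "\n")
```

A lift below 1, as in this toy case, suggests that X and Y appear together less often than would be expected if they were independent; a lift above 1 suggests the opposite.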
                

👉 Now, apply the Apriori algorithm and generate the rules.
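
A minimal sketch of this step using the `arules` package. The article does not state its thresholds, so the `support`, `confidence` and `minlen` values below are assumptions, and a small hypothetical basket file stands in for ItemList.csv.

```r
library(arules)

# Hypothetical stand-in for ItemList.csv: one basket per line
item_file <- tempfile(fileext = ".csv")
writeLines(c(
  "whole milk,rolls/buns",
  "whole milk,yogurt,soda",
  "whole milk,rolls/buns,yogurt",
  "rolls/buns,sausage",
  "whole milk,rolls/buns,other vegetables",
  "soda,sausage"
), item_file)

# Read the file as transactions (basket format, comma-separated items)
trans <- read.transactions(item_file, format = "basket", sep = ",")

# Mine rules; the thresholds are assumed values, tune them for real data
rules <- apriori(
  trans,
  parameter = list(support = 0.2, confidence = 0.5, minlen = 2)
)

# Show up to ten rules, strongest lift first
top_rules <- head(sort(rules, by = "lift"), n = 10)
inspect(top_rules)
```

On a real dataset with thousands of transactions, much lower support thresholds are usually needed before any rules appear.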
    

👉 The top 10 rules generated by the Apriori algorithm are listed as:
   

👉 Let's get a better understanding by visualizing the association rules
Support vs Confidence analysis
👉 Visualizing the top 30 rules and the parallel coordinates plot
👉 The most frequently purchased itemsets (top 5) are visualized as:
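
The plots above could be produced along these lines with the `arulesViz` package. This is a sketch under assumptions: the toy transactions and thresholds are hypothetical, and the graph method for the top-30 plot is a guess at what the article used.

```r
library(arules)
library(arulesViz)

# Hypothetical toy transactions; real code would reuse the mined rules
trans <- as(list(
  c("whole milk", "rolls/buns"),
  c("whole milk", "yogurt", "soda"),
  c("whole milk", "rolls/buns", "yogurt"),
  c("rolls/buns", "sausage"),
  c("whole milk", "rolls/buns", "other vegetables"),
  c("soda", "sausage")
), "transactions")

rules <- apriori(
  trans,
  parameter = list(support = 0.2, confidence = 0.5, minlen = 2),
  control   = list(verbose = FALSE)
)

# Scatter plot: support vs confidence, shaded by lift
plot(rules, measure = c("support", "confidence"), shading = "lift")

# Up to 30 rules with the highest lift, as a graph and as parallel coordinates
top_rules <- head(sort(rules, by = "lift"), n = 30)
plot(top_rules, method = "graph")
plot(top_rules, method = "paracoord")

# Five most frequently purchased items
itemFrequencyPlot(trans, topN = 5, type = "absolute")
```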
                                        

