Association mining is usually applied to transaction data from a retail market or an online e-commerce store. Because most transaction data sets are large, the Apriori algorithm is used to find patterns, or rules, in them efficiently. Association rules are widely used to analyse retail basket or transaction data; the aim is to identify strong rules in the data using measures of interestingness. Apriori takes a "bottom-up" approach: frequent subsets are extended one item at a time (a step known as candidate generation), and groups of candidates are tested against the data. The algorithm terminates when no further successful extensions are found.
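To make the bottom-up idea concrete, here is a minimal plain-Python sketch of the frequent-itemset search. It is an illustration only, not the code used for the results in this post: the function name `apriori_frequent_itemsets`, the toy transactions and the `min_support` value are all assumptions made for the example.

```python
from itertools import combinations

def apriori_frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item of the itemset.
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {}
    for item in items:
        s = support(frozenset([item]))
        if s >= min_support:
            current[frozenset([item])] = s

    frequent = {}
    k = 1
    while current:  # stop when no extension was successful
        frequent.update(current)
        k += 1
        # Candidate generation: join frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a, b in combinations(list(current), 2)
                      if len(a | b) == k}
        # Prune: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(sub) in current
                             for sub in combinations(c, k - 1))}
        # Test the surviving candidates against the data.
        current = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                current[c] = s
    return frequent

# Toy data, invented for illustration.
transactions = [
    {"milk", "bread"},
    {"milk", "bread", "butter"},
    {"bread", "butter"},
    {"milk", "butter"},
    {"milk", "bread", "butter"},
]
frequent = apriori_frequent_itemsets(transactions, min_support=0.6)
for itemset, s in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), s)
```

With this toy data, all single items and all pairs reach the threshold, while the three-item set does not, so the loop stops after the third level.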
DATASET: Groceries_dataset
Let's code and analyse the algorithm.
Import the groceries dataset
Explore the data
Prepare the data: check for null values, normalise the data to a numeric format, and group similar values
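The preparation step could look roughly like the sketch below. The column names `Member_number`, `Date` and `itemDescription`, the sample rows, and the exact cleaning choices (dropping incomplete rows, converting the member number to an integer, lower-casing item names) are assumptions made for illustration; adapt them to the actual file.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample in the assumed layout (not taken from the real dataset).
raw = """Member_number,Date,itemDescription
1808,21-07-2015,tropical fruit
1808,21-07-2015,Whole Milk
2552,05-01-2015,whole milk
2552,05-01-2015,
2300,19-09-2015,pip fruit
"""
rows = list(csv.DictReader(io.StringIO(raw)))

def is_complete(row):
    # A row counts as complete when no field is missing or blank.
    return all(value is not None and value.strip() for value in row.values())

# Check for null values and drop the affected rows.
clean = [row for row in rows if is_complete(row)]
n_dropped = len(rows) - len(clean)

# Normalise formats: numeric member number, consistent item spelling.
for row in clean:
    row["Member_number"] = int(row["Member_number"])
    row["itemDescription"] = row["itemDescription"].strip().lower()

# Group the items bought by the same member on the same date into one basket.
baskets = defaultdict(set)
for row in clean:
    baskets[(row["Member_number"], row["Date"])].add(row["itemDescription"])

print(n_dropped, dict(baskets))
```

Treating one member and date combination as one basket is a design choice; grouping by member alone would give larger, less specific baskets.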
After pre-processing, inspect the modified item list
Remove the unneeded columns, such as date and member number, and save the result as a new dataset called ItemList.csv
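As a rough sketch of this step, the snippet below keeps only the items of each basket and writes one basket per row. The `baskets` dictionary, the helper name `write_item_list` and the one-row-per-basket layout are assumptions for the example; the real ItemList.csv may be laid out differently.

```python
import csv
import io

# Hypothetical baskets keyed by (member number, date).
baskets = {
    (1808, "21-07-2015"): ["tropical fruit", "whole milk"],
    (2552, "05-01-2015"): ["whole milk"],
    (2300, "19-09-2015"): ["pip fruit", "rolls/buns", "soda"],
}

def write_item_list(baskets, handle):
    """Write one basket per row, leaving out the date and member number."""
    writer = csv.writer(handle, lineterminator="\n")
    for items in baskets.values():
        writer.writerow(sorted(items))

# An in-memory buffer stands in for open("ItemList.csv", "w", newline="").
buffer = io.StringIO()
write_item_list(baskets, buffer)
item_list_csv = buffer.getvalue()
print(item_list_csv)
```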
View a sample of the item list
The Apriori algorithm generates the most relevant rules from a given set of transaction data, along with the support, confidence and lift of each rule. These three measures are used to judge the relative strength of the rules.
To understand the metrics, consider a rule X => Y.
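For a rule X => Y, the three metrics are usually defined as follows. The symbols are notation introduced here: N is the total number of transactions and σ(A) is the number of transactions that contain every item in A.

```latex
\begin{aligned}
\operatorname{support}(X \Rightarrow Y) &= \frac{\sigma(X \cup Y)}{N} \\[4pt]
\operatorname{confidence}(X \Rightarrow Y) &= \frac{\sigma(X \cup Y)}{\sigma(X)}
  = \frac{\operatorname{support}(X \cup Y)}{\operatorname{support}(X)} \\[4pt]
\operatorname{lift}(X \Rightarrow Y) &= \frac{\operatorname{confidence}(X \Rightarrow Y)}{\operatorname{support}(Y)}
  = \frac{\operatorname{support}(X \cup Y)}{\operatorname{support}(X)\,\operatorname{support}(Y)}
\end{aligned}
```

A lift above 1 means X and Y appear together more often than would be expected if they were independent; a lift close to 1 suggests the rule carries little information.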
Now, apply the Apriori algorithm and generate the rules.
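Rule generation can be sketched in plain Python as below. This is not the code behind the results in this post: the function name `generate_rules`, the toy transactions and both thresholds are assumptions, and for brevity the sketch counts itemsets exhaustively instead of using Apriori's pruned search, which is only practical for small baskets.

```python
from collections import Counter
from itertools import combinations

def generate_rules(transactions, min_support, min_confidence):
    """Return rules as (X, Y, support, confidence, lift), highest lift first."""
    n = len(transactions)
    # Count every itemset that occurs in at least one transaction.
    counts = Counter()
    for t in transactions:
        for size in range(1, len(t) + 1):
            for itemset in combinations(sorted(t), size):
                counts[frozenset(itemset)] += 1
    # Keep only the frequent itemsets.
    support = {s: c / n for s, c in counts.items() if c / n >= min_support}

    rules = []
    for itemset, supp in support.items():
        if len(itemset) < 2:
            continue
        # Split each frequent itemset into antecedent X and consequent Y.
        for size in range(1, len(itemset)):
            for x in combinations(sorted(itemset), size):
                x = frozenset(x)
                y = itemset - x
                confidence = supp / support[x]
                lift = confidence / support[y]
                if confidence >= min_confidence:
                    rules.append((set(x), set(y), supp, confidence, lift))
    return sorted(rules, key=lambda rule: rule[4], reverse=True)

# Toy data, invented for illustration.
transactions = [
    {"milk", "bread"},
    {"milk", "bread", "butter"},
    {"bread", "butter"},
    {"milk"},
    {"bread", "butter"},
]
rules = generate_rules(transactions, min_support=0.4, min_confidence=0.7)
for x, y, supp, conf, lift in rules:
    print(sorted(x), "=>", sorted(y), round(supp, 2), round(conf, 2), round(lift, 2))
```

Sorting by lift is one way to rank the rules; sorting by confidence or support works the same way by changing the sort key.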
List the top 10 rules generated by the Apriori algorithm
Let's get a better understanding by visualizing the association rules
Support vs Confidence analysis
Visualize the top 30 rules and the parallel coordinates plot
Visualize the five most frequently purchased itemsets
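The numbers behind such a frequency chart can be computed with a simple counter. The sketch below ranks single items (1-itemsets) by the share of baskets they appear in; the helper name `top_items` and the sample baskets are assumptions for illustration.

```python
from collections import Counter

def top_items(transactions, k=5):
    """Return the k most frequent items with the share of baskets containing them."""
    n = len(transactions)
    # set(t) makes sure an item is counted once per basket.
    counts = Counter(item for t in transactions for item in set(t))
    return [(item, count / n) for item, count in counts.most_common(k)]

# Hypothetical baskets.
baskets = [
    ["whole milk", "rolls/buns"],
    ["whole milk", "soda"],
    ["whole milk", "other vegetables", "rolls/buns"],
    ["yogurt"],
    ["whole milk", "soda", "rolls/buns"],
    ["other vegetables"],
]
for item, share in top_items(baskets, k=5):
    print(f"{item}: {share:.2f}")
```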