Posts

Showing posts from August, 2021

⏳ Mining deep into Data Mining - Statistics - PART I ⏳

Why do we have to know statistics? 🤔

As mentioned in the previous posts, we live in a world of data from which we can derive insightful information. Thus, statistics plays a vital role in processing and analyzing data to make decisions and predictions.

What actually is statistics? 👀

Let's get more technical

Statistics is the discipline that concerns the collection, organization, analysis, interpretation, and presentation of data. There are two types or classes of statistics.

👉 Descriptive statistics
👉 Inferential statistics

Descriptive Statistics 😀

👉 Descriptive statistics focuses more on analyzing, summarizing, and organizing data in the form of numbers or graphs.
👉 Bar plots, histograms, and pie charts are used in visualizing descriptive data and in determining the PDF (probability density function), the CDF (cumulative distribution function), and the normal distribution.
👉 Measure of central tendency is determin
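
To make the idea of summarizing data as numbers a bit more concrete, here is a minimal sketch using Python's built-in `statistics` module. The `ages` list is made-up sample data, and mean, median, and mode are used here as the usual measures of central tendency; this is an illustration, not code from the post itself.

```python
import statistics

# Hypothetical sample: ages of eight survey respondents (made-up values).
ages = [23, 25, 25, 29, 31, 35, 35, 35]

# Three common measures of central tendency.
mean_age = statistics.mean(ages)      # arithmetic average
median_age = statistics.median(ages)  # middle value of the sorted data
mode_age = statistics.mode(ages)      # most frequent value

print("mean:", mean_age)
print("median:", median_age)
print("mode:", mode_age)
```

Each number summarizes the whole list from a different angle, which is what descriptive statistics is about: describing the data you have without drawing conclusions beyond it.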

💡⏳ Mining deep into Data Mining - PART II ⏳💡

Hurray 💥, we have seen the basics of data mining in Part I 😃 Let's get into the phases involved in the KDD process step by step. To start with, let's explore the Data Preprocessing phase.

What actually is DATA in data mining? 🤔

In data mining, data refers to a collection of objects and their attributes. Umm, confusing, right? 😨

👉 An object is just like an entry in a table, or an instance. It is also known as a record, point, entity, or sample.
👉 An attribute is any property or characteristic of an object.
👉 For example, if a person's eye is considered as an object, then the eye color and blink rate are regarded as its attributes.
👉 An attribute can also be called a feature, field, characteristic, or variable in data mining.
👉 Here, the data is organized in tabular form.

Let's get more technical 💻🙌

👉 Each row can be called a vector, i.e., an object vector or feature vector.
👉 The number of attributes wil
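
The objects-and-attributes picture above can be sketched in a few lines of Python. This is only an illustration: the attribute names `eye_color` and `blink_rate` follow the eye example, while the three records and their values are invented for the demo.

```python
# Each dict is one object (record); each key is an attribute (feature).
# The three records and their values are hypothetical.
records = [
    {"eye_color": "brown", "blink_rate": 15},
    {"eye_color": "blue", "blink_rate": 18},
    {"eye_color": "green", "blink_rate": 12},
]

# Column names of the table, taken from the first record.
attributes = list(records[0].keys())

# One feature vector per object: the row's values in attribute order.
feature_vectors = [[row[name] for name in attributes] for row in records]

n_objects = len(feature_vectors)  # number of rows in the table
n_attributes = len(attributes)    # length of every feature vector

print(attributes)
for vector in feature_vectors:
    print(vector)
```

Read as a table, each inner list is one row (an object) and each position in the list is one column (an attribute).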

💡⏳ Mining deep into Data Mining - PART I ⏳💡

"Necessity is the mother of invention"

The need for knowledge is the root of data collection, discovery, and analysis. To be precise, we could say that the current technological world is drowning in data but starving for knowledge. Thus, data mining comes in handy.

What is Data Mining?

It is the extraction of interesting, non-trivial, previously unknown, and potentially useful patterns or knowledge from huge amounts of data.

Want to know the alternative names of Data Mining?

👉 Knowledge Discovery in Databases (KDD)
👉 Data or Pattern analysis
👉 Data archeology
👉 Data dredging
👉 Information harvesting
👉 Business Intelligence

Data mining is indeed a confluence of multiple disciplines, mainly

👉 Statistics
👉 Algorithms
👉 Data visualization
👉 Machine learning
👉 Pattern recognition
👉 Database Technology

Why not follow traditional data analysis?

👉 Traditional analysis of data will not be able to handle terabytes of data
👉 High dimensional data add complexity to the a