Data mining steps


Author: Admin | 2025-04-28

Introduction

In this article, I will discuss what data mining is and why we need it. We will then look at a type of data mining called clustering and go over two clustering algorithms, K-means and hierarchical clustering, and how they solve data mining problems.

Table of Contents

What is data mining? Why is it needed?
Five steps involved in data mining
Data mining tasks
What is clustering?
Two types of clustering algorithms
K-means clustering algorithm
Elbow method to determine the optimal number of clusters
Hierarchical clustering

What is data mining and why do we need it?

Data is collected and stored over networks at enormous speed and volume, which makes it difficult to find and understand the relevant parts. For example, credit card transactions, demographic data, and web server logs can amount to terabytes of data. Data mining is a technique for exploring and analyzing large amounts of data to uncover meaningful insights. It helps in understanding, sorting, and selecting relevant information, and it turns hidden values in databases into useful information.

Five steps involved in the data mining process

Now, let’s delve into the five main steps of the data mining process. They are shown in Figure 1 and explained below.

Figure 1: Data mining steps

Step 1: Selection: The data is first collected and integrated from a variety of sources. We collect only the data that will help us gain meaningful insight.

Step 2: Preprocessing: The collected data may not be clean, so it needs preprocessing. This step consists of removing missing values and inconsistent data, using various techniques to eliminate such anomalies.
Step 3: Transformation: After preprocessing, we need to transform the data into forms appropriate for mining. This includes aggregation, normalization, etc.

Step 4: Data mining: Now we can apply data mining techniques to the data to uncover insights. These techniques include classification, clustering, and association.

Step 5: Evaluation: This step involves evaluating the generated patterns, for example by visualizing them and removing those that are redundant.

Next, let’s look at the two main data mining tasks and see which category clustering falls into.

Data mining tasks

Figure 2: Data mining tasks

The two main data mining tasks are:

Predictive methods: These methods use some variables to predict unknown values of other variables. They include data mining tasks such as classification.

Descriptive methods: These methods find patterns that describe the data, so that new and important information can be derived from the available data. The
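To make the five steps described earlier more concrete, here is a minimal sketch in plain Python that walks a tiny dataset through selection, preprocessing, transformation, mining, and evaluation. Everything in it is an assumption made for illustration: the records are invented, the function names (`select`, `preprocess`, `transform`, `kmeans`, `evaluate`) are hypothetical, and the mining step is a bare-bones K-means that simply uses the first k points as starting centroids. It is not a production implementation.

```python
# Hypothetical raw records: (age, yearly_spend, city). None marks a missing value.
raw_records = [
    (25, 300.0, "A"),
    (27, 320.0, "B"),
    (None, 310.0, "A"),
    (61, 2900.0, "C"),
    (58, 3100.0, "B"),
    (63, None, "C"),
    (60, 3000.0, "A"),
    (24, 280.0, "C"),
]


def select(records):
    """Step 1 (selection): keep only the fields relevant to the question."""
    return [(age, spend) for age, spend, _city in records]


def preprocess(rows):
    """Step 2 (preprocessing): drop rows that contain a missing value."""
    return [row for row in rows if None not in row]


def transform(rows):
    """Step 3 (transformation): min-max normalize each column to [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        tuple((v - lo) / (hi - lo) if hi > lo else 0.0
              for v, lo, hi in zip(row, lows, highs))
        for row in rows
    ]


def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=20):
    """Step 4 (data mining): bare-bones K-means clustering."""
    # Assumption: the first k points serve as the starting centroids.
    centroids = list(points[:k])
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Move each centroid to the mean of the points assigned to it.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(
                    sum(col) / len(members) for col in zip(*members)
                )
    return labels, centroids


def evaluate(points, labels, centroids):
    """Step 5 (evaluation): within-cluster sum of squared distances."""
    return sum(
        squared_distance(p, centroids[label])
        for p, label in zip(points, labels)
    )


selected = select(raw_records)
clean = preprocess(selected)
scaled = transform(clean)
labels, centroids = kmeans(scaled, k=2)
score = evaluate(scaled, labels, centroids)
print(labels, round(score, 4))
```

On this invented data, the three younger, low-spend records end up in one cluster and the three older, high-spend records in the other. A lower evaluation score means tighter clusters, which is the quantity the elbow method compares across different values of k.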
