Data Mining in Database Management System

What is Data Mining in Database Management System?

Data mining refers to extracting or mining knowledge from large amounts of data. It is the computational process of discovering patterns in large data sets involving methods at the intersection of artificial intelligence, machine learning, statistics, and database systems.

The overall goal of the data mining process is to extract information from a data set and transform it into an understandable structure for further use.

The key properties of data mining are:

Automatic discovery of patterns
Prediction of likely outcomes
Creation of actionable information
Focus on large datasets and databases

Given databases of sufficient size and quality, data mining technology can generate new business opportunities by providing the following capabilities:

1. Automated prediction of trends and behaviors

Data mining automates the process of finding predictive information in large databases. A typical example of a predictive problem is targeted marketing.

Data mining uses data on past promotional mailings to identify the targets most likely to maximize return on investment in future mailings.
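The targeted-marketing idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the segment names, the mailing history, and the cost and revenue figures are all hypothetical, and a real system would use a far richer predictive model than a per-segment response rate.

```python
from collections import defaultdict

# Hypothetical history of past mailings: (customer segment, responded?)
past_mailings = [
    ("students", False), ("students", False), ("students", True), ("students", False),
    ("families", True), ("families", True), ("families", False), ("families", True),
    ("retirees", False), ("retirees", True), ("retirees", False), ("retirees", False),
]

def response_rates(history):
    """Return the observed response rate for each segment."""
    sent = defaultdict(int)
    responded = defaultdict(int)
    for segment, did_respond in history:
        sent[segment] += 1
        if did_respond:
            responded[segment] += 1
    return {segment: responded[segment] / sent[segment] for segment in sent}

def profitable_segments(history, cost_per_mail, revenue_per_response):
    """Segments whose expected profit per mailing is positive, best first."""
    rates = response_rates(history)
    expected_profit = {
        segment: rate * revenue_per_response - cost_per_mail
        for segment, rate in rates.items()
    }
    ranked = sorted(expected_profit.items(), key=lambda item: item[1], reverse=True)
    return [(segment, profit) for segment, profit in ranked if profit > 0]

# Assumed economics: each mailing costs 1.0, each response earns 3.0.
targets = profitable_segments(past_mailings, cost_per_mail=1.0, revenue_per_response=3.0)
print(targets)
```

With these made-up numbers only the "families" segment is worth mailing again, because it is the only one whose past response rate covers the mailing cost.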

2. Automated discovery of previously unknown patterns

Data mining tools sweep through databases and identify previously hidden patterns in one step. An example of pattern discovery is the analysis of retail sales data to identify seemingly unrelated products that are often purchased together.

Other pattern discovery problems include detecting fraudulent credit card transactions and identifying anomalous data that could represent data entry keying errors.

Tasks of Data Mining

Data mining involves six common classes of tasks:

1. Anomaly detection (Outlier/change/deviation detection)

The identification of unusual data records that might be interesting in their own right, or that might be data errors requiring further investigation.
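One simple way to flag unusual records is a robust outlier rule based on the median. The sketch below uses the modified z-score (median and median absolute deviation) with the conventional constant 0.6745 and cutoff 3.5. The function name, the sample amounts, and the choice of this particular rule are assumptions made for illustration; many other detection methods exist.

```python
import statistics

def find_outliers(values, threshold=3.5):
    """Return values whose modified z-score exceeds the threshold."""
    median = statistics.median(values)
    # Median absolute deviation: a spread measure that outliers barely affect.
    mad = statistics.median([abs(v - median) for v in values])
    if mad == 0:
        return []  # no spread at all, so this rule cannot flag anything
    return [v for v in values if 0.6745 * abs(v - median) / mad > threshold]

# Hypothetical transaction amounts; the last one looks like a keying error.
amounts = [42.0, 39.5, 41.2, 40.8, 43.1, 40.0, 400.0]
print(find_outliers(amounts))
```

On this sample the rule flags only the value 400.0. Whether such a record is fraud, a data entry error, or a genuine but rare event still has to be decided by further investigation.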

2. Association rule learning (Dependency modelling)

It searches for relationships between variables. For example, a supermarket might gather data on customer purchasing habits.

Using association rule learning, the supermarket can determine which products are frequently bought together and use this information for marketing purposes. This is sometimes referred to as market basket analysis.
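Market basket analysis can be illustrated with the two standard rule measures, support (how often a pair of items occurs together) and confidence (how often the right-hand item appears when the left-hand item does). The baskets, thresholds, and function name below are hypothetical, and the sketch only considers rules between pairs of single items.

```python
from collections import Counter
from itertools import combinations

# Hypothetical supermarket baskets.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "eggs"},
    {"bread", "butter", "jam"},
]

def pair_rules(transactions, min_support=0.4, min_confidence=0.7):
    """Return (lhs, rhs, support, confidence) for item pairs passing both thresholds."""
    n = len(transactions)
    item_counts = Counter(item for t in transactions for item in t)
    pair_counts = Counter(
        pair for t in transactions for pair in combinations(sorted(t), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # Each frequent pair yields two candidate rules: a -> b and b -> a.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)

rules = pair_rules(baskets)
for lhs, rhs, support, confidence in rules:
    print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}")
```

With these baskets the surviving rules are bread -> butter, butter -> bread, and jam -> bread. The rule bread -> jam is dropped because only half of the bread baskets contain jam.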

3. Clustering

It is the task of discovering groups and structures in the data that are in some way or another "similar", without using known structures in the data.
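As one concrete example of clustering, the sketch below implements plain k-means on two-dimensional points. K-means is only one of many clustering methods and is chosen here as an assumption for illustration; the points are invented, and seeding the centroids with the first k points is a simplification that keeps the run deterministic.

```python
def k_means(points, k, iterations=10):
    """Plain k-means on 2-D points, seeded with the first k points."""
    centroids = list(points[:k])
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            distances = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for cluster, old in zip(clusters, centroids):
            if cluster:
                new_centroids.append((
                    sum(x for x, _ in cluster) / len(cluster),
                    sum(y for _, y in cluster) / len(cluster),
                ))
            else:
                new_centroids.append(old)  # keep the centroid of an empty cluster
        if new_centroids == centroids:
            break  # assignments have stabilised
        centroids = new_centroids
    return centroids, clusters

# Hypothetical data with two visually obvious groups.
points = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.0, 8.0), (1.0, 0.5), (9.0, 8.5)]
centroids, clusters = k_means(points, k=2)
print(clusters)
```

No labels are supplied anywhere: the two groups emerge only from the distances between points, which is what distinguishes clustering from classification.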

4. Classification

It is the task of generalizing known structure to apply to new data. For example, an e-mail program might attempt to classify an e-mail as "legitimate" or as "spam".
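The spam example can be sketched with a tiny naive Bayes classifier, which learns word frequencies from labelled messages and applies them to new ones. The training messages, function names, and the choice of naive Bayes with add-one smoothing are assumptions for illustration; real mail filters use much larger training sets and more features.

```python
import math
from collections import Counter

# Hypothetical labelled messages (the "known structure").
training = [
    ("win money now", "spam"),
    ("cheap money offer", "spam"),
    ("meeting agenda attached", "legitimate"),
    ("lunch meeting tomorrow", "legitimate"),
]

def train(examples):
    """Count words per label, messages per label, and collect the vocabulary."""
    word_counts = {}
    label_counts = Counter()
    vocabulary = set()
    for text, label in examples:
        words = text.split()
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(words)
        vocabulary.update(words)
    return word_counts, label_counts, vocabulary

def classify(text, model):
    """Return the label with the highest log-probability for the text."""
    word_counts, label_counts, vocabulary = model
    total_examples = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        counts = word_counts[label]
        total_words = sum(counts.values())
        score = math.log(label_counts[label] / total_examples)
        for word in text.split():
            if word not in vocabulary:
                continue  # ignore words never seen during training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            score += math.log((counts[word] + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

model = train(training)
print(classify("cheap money", model))
print(classify("meeting tomorrow", model))
```

The first message is labelled "spam" and the second "legitimate", because each contains words seen more often under that label in the training data.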

5. Regression

It attempts to find a function which models the data with the least error.
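A common way to make "least error" precise is ordinary least squares, which picks the line that minimises the sum of squared differences between predicted and observed values. The sketch below fits such a line to invented data; the function names and the restriction to a single input variable are assumptions for illustration.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def squared_error(xs, ys, slope, intercept):
    """Sum of squared residuals for a given line."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))

# Hypothetical measurements that lie close to the line y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.1, 4.9, 7.2, 8.8]
slope, intercept = fit_line(xs, ys)
print(slope, intercept, squared_error(xs, ys, slope, intercept))
```

The fitted slope and intercept come out at roughly 1.94 and 1.15, and any other straight line gives a larger squared error on the same data.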

6. Summarization

It provides a more compact representation of the data set, including visualization and report generation.
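In its simplest form, summarization replaces a long list of values with a handful of descriptive statistics. The sketch below does this with Python's standard `statistics` module; the daily sales figures and the choice of statistics are hypothetical.

```python
import statistics

def summarize(values):
    """Condense a list of numbers into a small report."""
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
    }

# Hypothetical daily sales figures.
daily_sales = [120, 80, 100, 90, 110]
report = summarize(daily_sales)
for name, value in report.items():
    print(f"{name}: {value}")
```

A report generator or chart would typically be built on top of such a summary rather than on the raw records.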

Architecture of Data Mining

1. Knowledge Base

This is the domain knowledge that is used to guide the search or evaluate the interestingness of resulting patterns. Such knowledge can include concept hierarchies, used to organize attributes or attribute values into different levels of abstraction.
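A concept hierarchy can be represented as a simple mapping from a low-level value to a higher-level concept. The sketch below rolls city-level records up to country level; the cities, the mapping, and the function name are hypothetical examples of domain knowledge that a knowledge base might hold.

```python
from collections import Counter

# Hypothetical concept hierarchy: city -> country.
city_to_country = {
    "Paris": "France",
    "Lyon": "France",
    "Berlin": "Germany",
    "Munich": "Germany",
}

def roll_up(records, hierarchy):
    """Count records at the higher level of abstraction."""
    return Counter(hierarchy.get(value, "unknown") for value in records)

# Hypothetical customer locations recorded at city level.
customer_cities = ["Paris", "Berlin", "Lyon", "Paris", "Munich", "Oslo"]
by_country = roll_up(customer_cities, city_to_country)
print(by_country)
```

Patterns that are too sparse to notice at city level can become visible once the data is viewed at country level. "Oslo" is counted as "unknown" because this small hierarchy does not cover it.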

2. Data Mining Engine

This is essential to the data mining system and ideally consists of a set of functional modules for tasks such as characterization, association and correlation analysis, classification, prediction, cluster analysis, outlier analysis, and evolution analysis.
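The idea of an engine built from functional modules can be sketched as a lookup table that maps a task name to the function implementing it. Everything here is hypothetical: the module names, the deliberately crude rules inside them, and the `run_task` dispatcher are illustrations, not the design of any particular product.

```python
import statistics

def characterize(values):
    """Characterization module: describe the data with a few statistics."""
    return {"mean": statistics.mean(values), "spread": max(values) - min(values)}

def outlier_analysis(values):
    """Outlier module: values above twice the median (a deliberately crude rule)."""
    median = statistics.median(values)
    return [v for v in values if v > 2 * median]

# The engine is a registry of modules keyed by task name.
MODULES = {
    "characterization": characterize,
    "outlier_analysis": outlier_analysis,
}

def run_task(task, values):
    """Dispatch a mining task to the module registered for it."""
    if task not in MODULES:
        raise ValueError(f"unknown mining task: {task}")
    return MODULES[task](values)

data = [10, 12, 11, 50]
print(run_task("characterization", data))
print(run_task("outlier_analysis", data))
```

Adding a new capability such as cluster analysis then means registering one more function, without changing how tasks are requested.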

3. Pattern Evaluation Module

This component typically employs interestingness measures and interacts with the data mining modules so as to focus the search toward interesting patterns. It may use interestingness thresholds to filter out discovered patterns.

For efficient data mining, it is highly recommended to push the evaluation of pattern interestingness as deep as possible into the mining process so as to confine the search to only the interesting patterns.
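Pushing the interestingness test into the mining process can be illustrated with an Apriori-style search for frequent itemsets. Instead of enumerating every combination and filtering afterwards, the sketch below applies a minimum-support threshold at each level and builds larger candidates only from itemsets that already passed. The baskets, the threshold, and the use of support as the sole interestingness measure are assumptions for illustration.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Level-wise search that prunes uninteresting itemsets as early as possible."""
    n = len(transactions)
    items = sorted({item for t in transactions for item in t})
    current = [frozenset([item]) for item in items]
    frequent = {}
    size = 1
    while current:
        # Count support only for the candidates that survived earlier pruning.
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        kept = {c: count / n for c, count in counts.items() if count / n >= min_support}
        frequent.update(kept)
        # Build larger candidates only from itemsets that passed the threshold,
        # and drop any candidate that has an infrequent subset.
        size += 1
        candidates = {a | b for a in kept for b in kept if len(a | b) == size}
        current = [
            c for c in candidates
            if all(frozenset(s) in kept for s in combinations(c, size - 1))
        ]
    return frequent

# Hypothetical baskets.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "eggs"},
    {"bread", "butter", "jam"},
]
result = frequent_itemsets(baskets, min_support=0.4)
for itemset, support in result.items():
    print(sorted(itemset), support)
```

Here "eggs" is discarded after the first pass, so no pair containing it is ever counted. The triple bread, butter, jam is rejected without being counted at all, because its subset butter, jam is already known to be infrequent.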

4. User interface

This module communicates between users and the data mining system, allowing the user to interact with the system by specifying a data mining query or task, providing information to help focus the search, and performing exploratory data mining based on the intermediate data mining results.

In addition, this component allows the user to browse database and data warehouse schemas or data structures, evaluate mined patterns, and visualize the patterns in different forms.
