Data Mining

Date Submitted: 11/23/2012 07:14 AM

1.1) Brief history:

The concept of data mining has its origins somewhere in the early 1900s, when economies were growing worldwide. This growth produced volumes of data that could not be handled manually because of their sheer size and increasing complexity. Traditional methods could not be applied accurately to the problem, and managing these data sets was a major concern for all organizations. The need for a solution was great, and the idea of data mining took hold. Back then it was just a concept, a project in its early stages, but as time progressed more and more research was done on the subject. With the boom in computers, the idea became a full-fledged vision of automated prediction of patterns and related trends.

1.2) Definition:

Data mining is the process of applying computer-based methods to sort through large sets of data and discover actionable information, such as patterns and trends, that simple analysis cannot reveal.
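As a rough illustration of what "discovering patterns" can mean in practice, the sketch below counts which pairs of items are bought together most often. It is only a minimal example under stated assumptions: the transactions are invented, and the function name `frequent_pairs` and the `min_count` threshold are hypothetical choices, not part of any standard method described here.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions, invented for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "jam"},
    {"bread", "milk", "jam"},
]

def frequent_pairs(baskets, min_count):
    """Return item pairs that appear together in at least min_count baskets."""
    counts = Counter()
    for basket in baskets:
        # Sort so that each pair is always counted under one ordering.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_count}

print(frequent_pairs(transactions, min_count=2))
```

With only four transactions the pattern is easy to see by eye. The point of automating the count is that the same loop still works when the data set is far too large to inspect manually.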

1.3) Difference between Data, Information and Knowledge:

What is data? Schools, business enterprises, shops, supermarkets and other organizations handle a lot of details every day. These details could be anything from customer details, books and product model numbers to the names of students and their grades, or even business transactions. All of the details collected are called data. Data can take any form, such as text or numeric values. Data can be classified as:

• Operational data – sales or transaction details, costs, etc.

• Non-operational data – predicted figures such as sales forecasts

• Metadata – data about the data collected, e.g. the data types of its fields, such as int or float
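The three categories above can be sketched with a few lines of Python. Everything in this example is an assumption made for illustration: the records are invented, and the 10% growth figure used for the forecast is arbitrary.

```python
# Hypothetical operational data: recorded sales details (invented values).
operational = [
    {"item": "notebook", "units_sold": 12, "unit_cost": 1.50},
    {"item": "pen", "units_sold": 30, "unit_cost": 0.40},
]

# Non-operational data: a predicted figure rather than a recorded one.
# The 10% growth factor is an arbitrary assumption.
forecast = {row["item"]: round(row["units_sold"] * 1.10) for row in operational}

# Metadata: data about the data, here the type of each field.
metadata = {field: type(value).__name__ for field, value in operational[0].items()}

print(forecast)
print(metadata)
```

Here `operational` holds what actually happened, `forecast` holds an estimate derived from it, and `metadata` describes the fields themselves rather than any sale.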

Information is what is obtained by skimming through the data and finding related trends that will be useful to the organization.

Knowledge is the use of the information obtained in ways beneficial to the organization.

For example, if we consider a supermarket, the list of...