Data Mining And Knowledge Discovery


02 Nov 2017

Disclaimer:
This essay has been written and submitted by students and is not an example of our work. Any opinions, findings, conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of EssayCompany.

1. INTRODUCTION:

The evolution of database technology runs from primitive file processing to database management systems that support queries and transaction processing. Further development has created demand for efficient and effective data analysis tools. This need stems from the explosive growth in data collected from applications in business and management, government administration, science and engineering, and environmental control.

Data mining refers to extracting knowledge from large amounts of data. It is the process of finding interesting patterns in large amounts of data, which can be stored in data warehouses, databases and other information repositories. It is a young field that has developed from areas such as database systems, data warehousing, statistics, machine learning, data visualisation, information retrieval and high-performance computing. Other contributing areas include neural networks, pattern recognition, spatial data analysis, image databases, signal processing and many application fields such as business, economics and bioinformatics.

1.1 DATA MINING AND KNOWLEDGE DISCOVERY:

Data mining refers to the discovery of valuable, non-obvious information from enormous collections of data. Knowledge Discovery in Databases (KDD), popularly known as data mining, refers to the process of identifying valid and useful patterns in data.

Data mining is defined as the process of discovering interesting patterns (or knowledge) from large amounts of data. The data sources can include databases, data warehouses, the Web, other information repositories, or data that are streamed into the system dynamically.

Data mining shapes the transformation of masses of information into significant knowledge. It is a procedure used to find underlying regularities in seemingly random data in order to discover new opportunities. The patterns it exposes bear on application problems and assist in more useful, proactive decision making. Important data mining techniques include decision trees, neural networks, nearest neighbour clustering, fuzzy logic, and genetic algorithms. Figure 1 depicts the steps involved in this iterative process.

Fig 1: Data Mining is the core of KDD

The Knowledge Discovery in Databases process comprises a few steps that lead from the collection of raw data to some form of new knowledge. It consists of the following steps:

Data cleaning: This is the step in which noisy and irrelevant information is removed from the collection. It is also known as data cleansing.

Data integration: At this step, multiple data sources, which are often heterogeneous, are combined to form a common source.

Data selection: The data relevant to the analysis task is retrieved from the database in this phase.

Data transformation: This phase uses summary or aggregation operations to transform the data into a form appropriate for mining. It is also known as data consolidation.

Data mining: This is the essential step in which intelligent techniques are applied to extract potentially useful patterns.

Pattern evaluation: At this stage, interestingness measures are used to identify the truly interesting patterns that represent knowledge.

Knowledge representation: This is the final phase, in which visualization and knowledge representation techniques are used to present the mined knowledge to the end user.
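The steps above can be sketched as a chain of small functions. This is only an illustrative sketch, not a reference implementation: the two store record lists, the choice of item-pair counting as the mining technique, and the use of support (the share of transactions containing a pair) as the interestingness measure are all assumptions made for the example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical raw records from two sources (illustrative data only).
store_a = [
    {"id": 1, "items": "bread, milk"},
    {"id": 2, "items": "bread, butter, milk"},
    {"id": 3, "items": None},  # noisy record: missing value
]
store_b = [
    {"id": 4, "items": "milk, butter"},
    {"id": 5, "items": "bread, milk, eggs"},
]

def clean(records):
    """Data cleaning: drop records whose item list is missing."""
    return [r for r in records if r["items"]]

def integrate(*sources):
    """Data integration: combine several sources into one collection."""
    return [r for source in sources for r in source]

def select(records):
    """Data selection: keep only the field relevant to the task."""
    return [r["items"] for r in records]

def transform(item_strings):
    """Data transformation: turn each string into a set of item names."""
    return [frozenset(s.strip() for s in text.split(","))
            for text in item_strings]

def mine(transactions):
    """Data mining: count how often each pair of items occurs together."""
    counts = Counter()
    for t in transactions:
        counts.update(combinations(sorted(t), 2))
    return counts

def evaluate(pair_counts, n_transactions, min_support):
    """Pattern evaluation: keep pairs whose support meets the threshold."""
    return {pair: c / n_transactions
            for pair, c in pair_counts.items()
            if c / n_transactions >= min_support}

def present(patterns):
    """Knowledge representation: format patterns as readable lines."""
    return [f"{a} + {b}: support {s:.2f}"
            for (a, b), s in sorted(patterns.items())]

def run_pipeline(min_support=0.5):
    records = integrate(clean(store_a), clean(store_b))
    transactions = transform(select(records))
    patterns = evaluate(mine(transactions), len(transactions), min_support)
    return present(patterns)

for line in run_pipeline():
    print(line)
```

Raising min_support and re-running the pipeline mirrors the refinement loop of KDD: the user inspects the presented patterns, adjusts the evaluation measure, and mines again.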

Some of the steps above can be combined. For instance, in the pre-processing phase data cleaning and data integration can be executed together to generate a data warehouse. Data selection and data transformation can also be combined, so that the consolidated data is the outcome of the selection or, as in the case of a data warehouse, the selection is done on the transformed data.

Knowledge discovery in databases is an iterative process. Once the discovered knowledge is presented to the user, the evaluation measures can be enhanced and the mining further refined: new data can be selected or further transformed, or new data sources integrated, in order to get more appropriate results.

Data mining derives its name from the similarity between searching for valuable information in a large database and mining rock for a vein of valuable ore. Both involve sifting through a large amount of material to pinpoint exactly where the value resides. The term is actually a misnomer: mining gold from rocks or sand is referred to as "gold mining" rather than sand mining or rock mining. Nevertheless, data mining is a vivid term that characterises the process of finding a small set of precious information in a large amount of raw material. Data mining has become the accepted customary term, and it has rapidly overshadowed more general terms such as Knowledge Discovery in Databases, which describes the complete process. Many other terms carry a similar or slightly different meaning, such as data dredging, knowledge extraction and pattern discovery.

1.2 CONCEPT OF DATA MINING:

In traditional business data processing, database technology has achieved great success. There is now increasing hope of using this technology in new application domains, and one domain that has acquired considerable importance is data mining. A growing number of organisations are creating enormous databases, measured in gigabytes and terabytes, that hold business data such as consumer data, transaction histories and sales records. This data is a potential gold mine of valuable information.

Data mining is a comparatively new and promising technology. It can be defined as the process of bringing out meaningful new correlations, patterns, and trends by mining large amounts of data stored in warehouses, using statistical, machine learning, artificial intelligence (AI), and data visualization techniques. Industries such as medicine, manufacturing, aerospace and chemicals are taking advantage of data mining. Informed observers generally agree that in-depth decision support requires new technology. This new technology supports the discovery of trends and predictive patterns in data, the creation and testing of hypotheses, and the generation of insight-provoking visualisations.

Data mining is used to extract useful information from large databases. A data warehouse generally contains large databases, sometimes called a "data mountain", which is handed to the data mining tool to extract useful information. In short, data warehousing is responsible for building the data mountain, and data mining extracts the implicit, previously unknown and potentially useful information from it. Data mining is not tied to a specific industry; it requires some well-understood technologies and a willingness to explore the possibility that hidden knowledge resides in the data.

Extracting previously hidden information from large databases is a complex activity. Data mining spans various fields of research and the development of algorithms and software environments that can handle real-life problems in which a tremendous amount of information is available for mining. The field has gained a lot of publicity, and it is viewed in different ways. From one point of view, data mining is just one smaller step in a broader overall process called Knowledge Discovery in Databases (KDD). According to this strict definition, data mining software mainly comprises tools for automated learning from data, such as machine learning and artificial neural networks, alongside traditional approaches to data analysis such as query and reporting, on-line analytical processing or relational calculus, so as to provide the maximum benefit from data to the end user.

1.3 ARCHITECTURE OF DATA MINING SYSTEM

Fig 2: Architecture of typical data mining system

The typical architecture of a data mining system consists of the following components:

Database, data warehouse, World Wide Web, other info repositories: This represents a collection of one or more databases, data warehouses, spreadsheets or other information repositories. It is usually the source of the data, which may require cleaning and integration.

Database or data warehouse server: The database or data warehouse server is responsible for fetching the relevant data, based on the user's data mining request.

Knowledge base: This is the knowledge of the domain being mined, such as concept hierarchies, which organize attributes into various levels of abstraction. It also contains user beliefs, which can be used to assess the interestingness of a pattern, and thresholds.

Data mining engine: This is an essential component of the data mining system. It performs functions such as characterization, association, classification, prediction, cluster analysis and outlier analysis.

Pattern evaluation module: It is mainly responsible for conducting tests for interestingness of a pattern.

User interface: This communicates between users and the data mining system. It visualizes results and lets users explore data and schemas.
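One way to picture how these components fit together is the small sketch below. Every class name, the toy city data, the concept hierarchy and the min_share threshold are hypothetical choices made for illustration; real data mining systems are organised in many different ways.

```python
from dataclasses import dataclass, field

@dataclass
class KnowledgeBase:
    # Concept hierarchy: maps a low-level value to a higher-level concept.
    hierarchy: dict = field(default_factory=dict)
    # Interestingness threshold: minimum share of records for a pattern.
    min_share: float = 0.5

    def generalise(self, value):
        return self.hierarchy.get(value, value)

class WarehouseServer:
    """Fetches the data relevant to a mining request from the repository."""
    def __init__(self, rows):
        self.rows = rows

    def fetch(self, column):
        return [row[column] for row in self.rows if column in row]

class MiningEngine:
    """Performs one function, characterisation: share of records per concept."""
    def __init__(self, kb):
        self.kb = kb

    def characterise(self, values):
        if not values:
            return {}
        counts = {}
        for v in values:
            concept = self.kb.generalise(v)
            counts[concept] = counts.get(concept, 0) + 1
        return {c: n / len(values) for c, n in counts.items()}

class PatternEvaluator:
    """Tests each pattern against the threshold held in the knowledge base."""
    def __init__(self, kb):
        self.kb = kb

    def interesting(self, shares):
        return {c: s for c, s in shares.items() if s >= self.kb.min_share}

def user_request(server, engine, evaluator, column):
    """User interface layer: run a request and return display-ready lines."""
    shares = evaluator.interesting(engine.characterise(server.fetch(column)))
    return [f"{c}: {s:.0%}" for c, s in sorted(shares.items())]

# Hypothetical repository contents and domain knowledge.
rows = [{"city": "Leeds"}, {"city": "York"}, {"city": "Lyon"},
        {"city": "Paris"}, {"city": "Leeds"}]
kb = KnowledgeBase(hierarchy={"Leeds": "UK", "York": "UK",
                              "Lyon": "France", "Paris": "France"})
server = WarehouseServer(rows)
result = user_request(server, MiningEngine(kb), PatternEvaluator(kb), "city")
print(result)
```

Keeping the threshold and the concept hierarchy in the knowledge base, rather than inside the engine, means the same engine can be reused with different domain knowledge.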

The two high-level objectives of data mining are prediction and description:

• Prediction of unknown or future values of selected variables

• Description in terms of (human-interpretable) patterns
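The difference between the two objectives can be illustrated with a toy example. The customer records below are invented, and the 1-nearest-neighbour rule is just one simple predictive technique chosen for brevity; the features are left unscaled, which a real analysis would normally not do.

```python
from statistics import mean

# Hypothetical customers: (age, monthly_spend, segment)
customers = [
    (22, 150.0, "student"),
    (25, 180.0, "student"),
    (41, 620.0, "professional"),
    (45, 700.0, "professional"),
]

def predict_segment(age, spend):
    """Prediction: label an unseen customer with its nearest neighbour's segment."""
    def distance(c):
        return ((c[0] - age) ** 2 + (c[1] - spend) ** 2) ** 0.5
    return min(customers, key=distance)[2]

def describe_segments():
    """Description: a human-interpretable summary of each known segment."""
    summary = {}
    for segment in sorted({c[2] for c in customers}):
        rows = [c for c in customers if c[2] == segment]
        summary[segment] = {
            "mean_age": mean(r[0] for r in rows),
            "mean_spend": mean(r[1] for r in rows),
        }
    return summary

print(predict_segment(24, 170.0))
print(describe_segments())
```

Here predict_segment estimates an unknown value for a new record, while describe_segments only summarises the data already held.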

Fig 3: Data Mining Objectives

Data mining is a powerful new technology for extracting hidden predictive information from large databases, with great potential to help companies focus on the most important information in their data warehouses. Data mining tools predict future trends and behaviours, allowing businesses to make proactive, knowledge-driven decisions. The automated, prospective analyses offered by data mining move beyond the analyses of past events provided by the retrospective tools typical of decision support systems. Data mining tools can answer business questions that were traditionally too time-consuming to resolve.


