How Could Data Mining Be Profitable


02 Nov 2017

Disclaimer:
This essay has been written and submitted by students and is not an example of our work. Any opinions, findings, conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of EssayCompany.

Today, information is a vital asset for every organization. Two technologies have been central to improving the quantitative and qualitative value of the information available to decision makers: Business Intelligence and Knowledge Management. Managers have recognized that timely, accurate knowledge can mean improved business performance, which is why both matter to an organization. However, they usually face a serious challenge: how to handle the massive amount of data that they have generated, collected and stored. "There is a need to have a technology that can access, analyse, summarize, and interpret information intelligently and automatically. Responding to this challenge, the field of data mining has emerged" (Ying, et al., 2008). Organizations use data mining tools to analyse their data, find answers to previously unanswered queries, and make better decisions. The data source for data mining is the data warehouse, a central repository that integrates data from different sources. It supports complex reporting and analysis such as quarterly or annual comparisons and trend reports. The success of data mining, however, depends critically on information quality. The following sections explain Business Intelligence, Knowledge Management, the data warehouse, data mining and information quality in more detail.

Business Intelligence

"Business intelligence (BI) is a set of methodologies, processes, architectures, and technologies that transform raw data into meaningful and useful information" (Evelson & Nicolson, 2008). With BI, an organization can handle large amounts of information to identify and develop new opportunities. It can also decide on effective strategies that help to provide competitive advantage and stability in the market.

Business intelligence grew out of decision-support technology in the 1970s. It then went through a complex and gradual evolution that included the Transaction Processing System (TPS), Executive Information System (EIS), Management Information System (MIS), Decision Support System (DSS) and other stages. In 1996, the Gartner Group defined BI as a series of systems, comprising data warehouses, data analysis and data mining, that help an organization make better decisions and keep its leading position in a competitive market. Business intelligence uses information about a company’s past performance to predict its future performance, and it can reveal emerging trends from which the company might profit.

BI technologies provide historical, current and predictive views of business operations. Common functions of BI technologies include reporting, online analytical processing, data mining, analytics, complex event processing, process mining, text mining, business performance management, benchmarking, predictive analytics and prescriptive analytics.

One of the main goals of deploying business intelligence is to support better business decisions. In other words, a BI system can be called a decision support system (DSS).

Knowledge management

Knowledge Management (KM) is a concept and a term that emerged about two decades ago, in approximately 1990. Very early in the KM movement, Davenport (1994) offered this definition:

"Knowledge management is the process of capturing, distributing, and effectively using knowledge."

This definition has the advantage of being simple, stark and to the point. A few years later, the Gartner Group created a second definition of KM, which is perhaps the most commonly cited one:

"Knowledge management is a discipline that promotes an integrated approach to identifying, capturing, evaluating, retrieving, and sharing all of an enterprise's information assets. These assets may include databases, documents, policies, procedures, and previously un-captured expertise and experience in individual workers" (Duhon, 1998).

The second definition is very organizational and corporate in orientation. KM, historically at least, is primarily about managing the knowledge of and in organizations. To attain the vision and mission of an organization, knowledge managers need to integrate information system strategies with business strategies. Various human and computer-based techniques have emerged to support this, and one of them is data mining. It helps to analyse and model the business from this perspective. For example, it can be very helpful for business process re-engineering. Put simply, data mining gives knowledge managers the hidden knowledge they need to redesign the whole business process so that it suits current business developments and challenges and keeps the organisation competitive with other business organisations (Folorunso & O. Ogunde, 2004).

Data Warehouse

An excellent source of data to locate and mine is an enterprise data warehouse (DW). Because of the nature of a data warehouse, most of the pertinent data selected by analysts and business users should be located within the warehouse structure. Data warehouses store current and historical data. Data from different operational systems, such as marketing and sales, is uploaded and stored in the warehouse, where it is organized for the explicit purpose of reporting. A data warehouse is the main source for data mining because the data within it has already undergone significant additions, revisions, modifications, and purging based on business rules and processes. During this process the data may pass through additional operations so that it is well formed and integrated before it is used in the DW for reporting. These criteria are covered in the Information Quality section of this paper.

Data Mining

In today’s world, every business, company and organization has a large amount of data of its own. Organizations usually use this data for future decisions, research and development. The data in their databases is at hand whenever they require it. The most important thing, however, is to analyse the data and find the important information in it. An organization that wants to grow rapidly must make quick and accurate decisions to seize opportunities while they are available (Arthur, n.d.).

A typical data warehouse implementation allows users to ask and answer questions such as "How many cars were sold, by area, by agency between the months of March and August in 2010?" Data mining, on the other hand, lets business decision makers ask and answer questions such as "What are the factors that increase the rate of sales in a specific region in a specific quarter?" or "What are the best times of the year to hold a sale, and what are the best areas in which to increase the number of shops and provide more service for customers?"
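The first kind of question is a plain aggregation over stored facts. The following is a minimal sketch of the car-sales query in Python; the records, field names and the function `cars_sold_by_area_and_agency` are hypothetical and stand in for a real warehouse table and query.

```python
from collections import defaultdict
from datetime import date

# Hypothetical sales records; the field names are assumptions for illustration.
sales = [
    {"area": "North", "agency": "A1", "sold_on": date(2010, 3, 15), "cars": 4},
    {"area": "North", "agency": "A1", "sold_on": date(2010, 9, 2), "cars": 7},
    {"area": "North", "agency": "A2", "sold_on": date(2010, 5, 20), "cars": 3},
    {"area": "South", "agency": "B1", "sold_on": date(2010, 8, 31), "cars": 5},
    {"area": "South", "agency": "B1", "sold_on": date(2010, 4, 1), "cars": 2},
    {"area": "South", "agency": "B1", "sold_on": date(2009, 4, 1), "cars": 9},
]


def cars_sold_by_area_and_agency(records, start, end):
    """Sum cars sold per (area, agency) for sales dated from start to end inclusive."""
    totals = defaultdict(int)
    for record in records:
        if start <= record["sold_on"] <= end:
            totals[(record["area"], record["agency"])] += record["cars"]
    return dict(totals)


# "How many cars were sold, by area, by agency, between March and August 2010?"
report = cars_sold_by_area_and_agency(sales, date(2010, 3, 1), date(2010, 8, 31))
for (area, agency), cars in sorted(report.items()):
    print(area, agency, cars)
```

A query like this reports what happened. The data mining questions ask why it happened and what to do next, which a fixed aggregation cannot answer on its own.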

Data mining allows users to sift through the data in data warehouses and obtain an enormous amount of information. This is how the gems of business intelligence are reached: data mining refines the data and extracts the valuable information it contains. Data mining is the process of extracting hidden knowledge from large volumes of raw data; it can also be defined as the process of extracting hidden predictive information from large databases (Chaterjee, n.d.). The data mining process uses the data in the enterprise data warehouse. For analysis, it is important that the data is granular enough. "Data that is characterized by significant aggregations beyond the original grain of the data will not produce significant results when used to create or test against a mining model" (Chaterjee, n.d.).

The process of data mining is mainly divided into three steps:

Pre-processing: collecting a large amount of relevant data

Mining: classifying and clustering the data, correcting errors and linking information

Validation: establishing trust in the new information
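The three steps can be sketched on toy data as follows. This is only an illustration under stated assumptions: the records (store visits per month and whether the customer bought) are invented, and the "mining" step is a deliberately simple one-threshold classification rule rather than any particular tool's algorithm.

```python
# Hypothetical records: (store visits per month, made a purchase?).
# None marks a missing value.
raw = [(1, False), (2, False), (None, True), (6, True), (7, True), (3, False),
       (8, True), (2, False), (5, True), (1, False), (9, True), (4, False)]


def preprocess(records):
    """Pre-processing: keep only complete, relevant records."""
    return [(visits, bought) for visits, bought in records if visits is not None]


def mine_threshold(train):
    """Mining: pick the visits threshold that best separates buyers from non-buyers."""
    candidates = sorted({visits for visits, _ in train})

    def accuracy(threshold):
        return sum((v >= threshold) == b for v, b in train) / len(train)

    return max(candidates, key=accuracy)


def validate(threshold, holdout):
    """Validation: measure how often the rule holds on data it has not seen."""
    return sum((v >= threshold) == b for v, b in holdout) / len(holdout)


clean = preprocess(raw)
train, holdout = clean[:7], clean[7:]   # simple split; real projects sample randomly
threshold = mine_threshold(train)
score = validate(threshold, holdout)
print(f"rule: visits >= {threshold}, holdout accuracy: {score:.2f}")
```

On this toy data the rule fits the training records perfectly but is right on only three of the four held-out records, which is the kind of gap the validation step exists to expose.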

Benefits of Data Mining for Organizations

Fast and Feasible Decisions

Searching for information in a huge amount of data by hand takes a great deal of time and frustrates the person doing it. The possibility of mistakes and incorrect decisions increases, and nobody makes accurate decisions while frustrated. With the help of data mining, information can be obtained easily and decisions can be made quickly and soundly. Data mining also helps to compare information across various factors, so decisions become more reliable.

Powerful Strategies

The information that data mining makes available can be used to form different strategies. In other words, by analysing information in various dimensions, an organisation can devise different strategies and implement them. This can help the organisation to expand its business boundaries effectively and make sound decisions.

Competitive Advantage

With this information at hand, a company should compare it across different aspects, carry out competitive analysis and make corrective decisions. This enables the company to gain competitive advantage.

How could data mining be profitable?

Data mining has been deployed successfully by a wide range of companies. The technology is applicable to any company looking to leverage a large data warehouse to better manage its customer relationships. Early adopters have tended to be in information-intensive industries such as financial services and direct mail marketing. Two factors are critical to successful data mining:

A large, well-integrated data warehouse

A well-defined understanding of the business process within which data mining is to be applied

Companies that have used data mining successfully include:

Transportation companies

Pharmaceutical companies

Credit card companies

Large consumer package goods companies (to improve the sales process to retailers)

Each of the above examples has a clear shared interest: these companies use the knowledge about customers that is implicit in a data warehouse to reduce costs and improve the value of customer relationships. Organizations of this kind can focus their efforts on the most profitable customers and prospects, and design targeted marketing strategies to best reach them.

Information Quality

In the summer of 2005, scientists reported a problem with the quality of information gathered from satellites. The satellites were collecting data at the equator and had reported a stable, cooling temperature trend. In reality there was a pattern of global warming. The discrepancy arose because the satellites had drifted off course and were reporting as daytime temperatures readings that had in fact been taken at night. This simple example shows how important trends can go unseen, unnoticed, misidentified or misinterpreted if there are information quality problems anywhere in the information value chain.

Errors that hamper the results of data mining and data analysis can be introduced from different sources. The goal here is not to examine each one in depth but to identify the cases in which data mining does not work effectively for the organization, that is, does not bring it value. Sometimes data is mismatched because it has not been properly, clearly and accurately defined. The recorded characteristics of the real-world object should also be accurate and up to date; for example, when the price of an item changes, the updated price should be recorded. For an analyst, both the validity and the accuracy of the information matter. Another cause of failure is transforming data incorrectly, so that the data mining tool cannot analyse it. Finally, the way the analysis is presented and displayed is very important. The following notes should be considered, each with a short description and example:

Correct, complete and clear information: Imagine a "Salary" attribute in a database that holds values as varied as "2000$", "Less than 1000$", "1500£" or "Not specified". This shows that data format plays a very important role in gathering data. Information producers should also receive some training. Where customers themselves are the information producers, one of the important things to implement during data entry is a set of data validation rules, which help to prevent incorrect values or, in special cases, null values from being inserted.

Exact measurement to avoid data collection errors: The earlier example of the satellites and global warming is a measurement error. To avoid similar problems, measurement devices should be verified periodically and checked or recalibrated where necessary.

Missing or inaccurate values: Sometimes values are missing because of data collection errors. Missing values also occur because of incomplete customer responses and for many other reasons.

Value synonyms: Imagine that two different values such as "Doz" and "Dz", or "Ok" and "Okay", have been used for the same thing because the data has no standardized value set. Such value synonyms cause data mining to fail.

Concurrency: Concurrency represents the age of the data. Different data mining goals may need information of different ages. For example, a company’s recruiting rules or an organization’s insurance rules change from one period to another, so the rules of 15 years ago may not be applicable to today’s data, or even to data from the past 4 years.

Outliers and anomalies: Values that do not fit the expected value, the expected set of values or the range of valid values are called outliers.

Mapping categorical data to numeric values: Where data is categorical, it is better to map it to numeric values. In trend analysis, the relative relationship between categorical values or attribute codes is not easy to interpret, and because they must be interpreted for correlation, numeric values are much better than alphabetical ones. For example, a data mining tool cannot differentiate between "Well done", "Good", "Normal" and "Not bad" as text, so such data is not useful in analysis.

Modelling errors (correlated attributes): There should not be redundant data that conveys the same information. For example, storing both "age" and "date of birth", or both "gender" and "personal title", in a database causes redundancy.
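Several of the checks above can be sketched in a few lines of Python. This is a minimal illustration, not a complete data-cleansing routine: the function names, the synonym table, the rating order and the use of the 1.5 × IQR rule for outliers are all assumptions introduced for the example.

```python
import re
import statistics


def parse_salary(raw):
    """Validation rule: accept values like '2000$' or '1500£', reject anything else."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([$£])\s*", raw)
    return (float(match.group(1)), match.group(2)) if match else None


# Hypothetical synonym table mapping each variant to one standard value.
SYNONYMS = {"doz": "dozen", "dz": "dozen", "ok": "ok", "okay": "ok"}


def canonical(value):
    """Value synonyms: collapse variants such as 'Doz' and 'Dz' to one form."""
    key = value.strip().lower()
    return SYNONYMS.get(key, key)


def iqr_outliers(values, k=1.5):
    """Outliers: flag values outside [Q1 - k*IQR, Q3 + k*IQR], a common rule of thumb."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [v for v in values if v < low or v > high]


# Categorical to numeric: the ordering below is an assumption, the essay does not rank them.
RATING_SCALE = {"Not bad": 1, "Normal": 2, "Good": 3, "Well done": 4}

salaries = ["2000$", "Less than 1000$", "1500£", "Not specified"]
rejected = [s for s in salaries if parse_salary(s) is None]
print("rejected salaries:", rejected)
print("canonical units:", {canonical(u) for u in ["Doz", "Dz"]})
print("outliers:", iqr_outliers([10, 12, 11, 13, 12, 11, 95]))
print("numeric ratings:", [RATING_SCALE[r] for r in ["Good", "Not bad", "Well done"]])
```

Note that the two accepted salaries are still in different currencies, so a real validation rule would also have to fix a single currency or convert between them before the values could be compared.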

Ignoring these factors can lead to flawed decisions. For example, in August 2004, U.S. Sen. Edward M. "Ted" Kennedy was mistakenly stopped and questioned at an airport simply because his name appeared on the government’s secret "no-fly" list (The Washington Post, 2004). The problem was that 'T. Kennedy' had been used as an alias by a terrorist suspect, and it took more than three weeks for the senator and his staff to have his name removed from the list. On the other hand, data mining successes include "Web Search", "Spam Filtering", "Recommender Systems", "Machine Translation", "Fraud Detection in Banks" and so on.

Regardless of the outcome, the data mining process always takes analysts deeper into their data than they have ever been before. Data mining may be run against data that has never been used. Quality issues may be found in data that was previously assumed to be clean. Data relationships that are unrelated to the data mining project may also reveal themselves unexpectedly. These discoveries can be valuable in their own right, and they may be considered side benefits of pursuing data mining.

Conclusion

As more and more companies and organisations move to implement data mining techniques, it is very important to have an accurate, clean and complete plan for the whole process of defining, gathering, storing, mining and analysing data. ‘Information Quality’ should be considered in every phase of that process. In other words, before any data is prepared and extracted for mining, the quality of the information should be assured: the completeness, accuracy and precision of the data should be checked. Data mining may reveal a need for data review, data cleansing and validation procedures. It might provide incremental insight into relationships between the organization and its customers or between services or products. It might trigger an entirely different path of analysis. With all of these efforts, the company can achieve its own business intelligence and knowledge discovery, and its future mining efforts will be better supported. Otherwise the effort will probably fail, or the information obtained will not bring the desired value and will not help the company to make sound decisions.


