Data mining is generally part of a larger business intelligence or knowledge management initiative. Because state governments are complex organizations that collect and process massive amounts of information, data mining can provide value to state government operations and taxpayers by extracting useful information from mountains of collected data. In addition, data mining can be predictive and uncover hidden patterns that states can use strategically to reduce costs, increase business expansion opportunities, and detect the fraud, waste, and abuse that drain away taxpayer dollars.
Data mining is the process of sorting through large amounts of data and picking out relevant information. It is normally used by large corporations whose business intelligence tools are integrated with an enterprise resource planning (ERP) system, to help make managerial decisions based on the patterns and forecasts generated from the collected data. It has been described as "the nontrivial extraction of implicit, previously unknown, and potentially useful information from data" and "the science of extracting useful information from large data sets or databases."
Data mining in relation to enterprise resource planning is the statistical and logical analysis of large sets of transaction data, looking for patterns that can aid decision making. Data mining identifies trends within data that go beyond simple analysis. Through the use of sophisticated algorithms, users who are not statisticians can identify key attributes of business processes and target opportunities. However, handing control of this process from the statistician to the machine may produce false positives or no useful results at all. Although data mining is a relatively new term, the technology is not. For many years, businesses have used powerful computers to sift through volumes of data, such as supermarket scanner data, to produce market research reports (although reporting is not always considered to be data mining). Continuous innovations in computer processing power, disk storage, and statistical software are dramatically increasing the accuracy and usefulness of data analysis.
The term data mining is often applied to two separate processes: knowledge discovery and prediction. Knowledge discovery provides explicit information that has a readable form and can be understood by a user (e.g., association rule mining). Forecasting, or predictive modeling, provides predictions of future events; it may be transparent and readable in some approaches (e.g., rule-based systems) and opaque in others, such as neural networks. Moreover, some data-mining systems, such as neural networks, are inherently geared towards prediction and pattern recognition rather than knowledge discovery. Metadata, or data about a given data set, are often expressed in a condensed data-minable format, that is, one that facilitates the practice of data mining.
Data mining relies on the use of real-world data. These data are extremely vulnerable to collinearity precisely because data from the real world may have unknown interrelations. An unavoidable weakness of data mining is that the critical data needed to expose a relationship may never have been observed. Alternatively, an experiment-based approach such as Choice Modelling may be used for human-generated data. In that case, inherent correlations are either controlled for or removed altogether through the construction of an experimental design.
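One simple way to spot collinearity in observational data is to check how strongly two candidate predictors are correlated with each other. The sketch below computes a Pearson correlation coefficient using only the standard library. The variables floor_area and staff, and their values, are hypothetical and serve only as an illustration; in practice a fuller diagnostic would likely be needed.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical observational data: two predictors recorded for five stores.
floor_area = [100, 150, 200, 250, 300]
staff = [5, 7, 10, 12, 15]

r = pearson(floor_area, staff)
# A value of r close to +1 or -1 suggests the two predictors carry nearly
# the same information, so their separate effects are hard to tell apart.
print(f"correlation between predictors: {r:.3f}")
```

A designed experiment would avoid this by varying the two factors independently, which is the point made above about experimental design.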
Data are any facts, numbers, or text that can be processed by a computer. Today, organizations are accumulating vast and growing amounts of data in different formats and different databases.
These include:
• operational or transactional data, such as sales, cost, inventory, payroll, and accounting
• nonoperational data, such as industry sales, forecast data, and macroeconomic data
• metadata: data about the data itself, such as logical database design or data dictionary definitions
The purpose of data mining is to identify patterns in the information contained in databases in order to make predictions. It allows the user to be proactive in identifying and predicting trends with that information. Common uses of data mining in government include knowledge discovery, fraud detection, analysis of research, decision support, and website personalization. Specific applications include:
• Improving service or performance
• Detecting fraud, waste, and abuse
• Analyzing scientific and research information
• Managing human resources
• Detecting criminal activities or patterns
• Analyzing intelligence and detecting terrorist activities
Data Mining Algorithms
The data mining algorithm is the mechanism that creates a data mining model. To create a model, an algorithm first analyzes a set of data and looks for specific patterns and trends. The algorithm uses the results of this analysis to define the parameters of the mining model. These parameters are then applied across the entire data set to extract actionable patterns and detailed statistics.
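The analyze, parameterize, and apply sequence described above can be sketched with one of the simplest possible models, a least-squares straight line. This is only an illustration under stated assumptions: the function names fit_line and predict and the monthly sales figures are hypothetical, and real data mining tools use far richer algorithms.

```python
def fit_line(xs, ys):
    """Analyze the data and return the model parameters (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(model, x):
    """Apply the fitted parameters to a new input."""
    slope, intercept = model
    return slope * x + intercept

# Hypothetical training data: month number and units sold.
months = [1, 2, 3, 4, 5]
sales = [10, 12, 14, 16, 18]

model = fit_line(months, sales)   # step 1 and 2: analyze, define parameters
forecast = predict(model, 6)      # step 3: apply the model
print(model, forecast)
```

Here the "mining model" is nothing more than the pair of numbers returned by fit_line; more elaborate algorithms differ mainly in what the parameters are and how they are found.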
The mining model that an algorithm creates can take various forms, including:
• A set of rules that describe how products are grouped together in a transaction.
• A decision tree that predicts whether a particular customer will buy a product.
• A mathematical model that forecasts sales.
• A set of clusters that describe how the cases in a dataset are related.
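To make the first of these forms concrete, the sketch below scores a single association rule over a handful of transactions using the two standard measures, support and confidence. The basket contents and the helper names support and confidence are hypothetical and chosen for illustration; this is not a full rule-mining algorithm, which would also have to search for candidate rules.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimated probability of `consequent` given `antecedent`."""
    both = support(transactions, set(antecedent) | set(consequent))
    return both / support(transactions, antecedent)

# Hypothetical market baskets, one set of products per transaction.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

# Rule under test: customers who buy bread also buy milk.
rule_support = support(transactions, {"bread", "milk"})
rule_confidence = confidence(transactions, {"bread"}, {"milk"})
print(rule_support, rule_confidence)
```

In this toy data the rule holds in half of all baskets and in two of the three baskets that contain bread.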
The problem of mining sensor association rules is inspired by the definition of association rules proposed in the domain of transactional databases. However, little work has been done on how to define association rules for wireless sensor networks in which the sensors themselves are the main objects in the extracted rules.
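One possible reading of this idea, offered purely as an assumption since the text does not define it, is to treat each time epoch as a transaction whose items are the sensors that reported an event in that epoch. The sketch below counts which pairs of sensors frequently report together. The sensor identifiers, the epochs, and the min_count threshold are all hypothetical.

```python
from collections import Counter
from itertools import combinations

def frequent_sensor_pairs(epochs, min_count):
    """Count sensor pairs that report in the same epoch at least min_count times."""
    counts = Counter()
    for active in epochs:
        # Sort so each pair is always recorded in the same order.
        for pair in combinations(sorted(active), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_count}

# Hypothetical epochs: the set of sensors that detected an event in each one.
epochs = [
    {"s1", "s2"},
    {"s1", "s2", "s3"},
    {"s2", "s3"},
    {"s1", "s2"},
]

pairs = frequent_sensor_pairs(epochs, min_count=2)
print(pairs)
```

Frequent pairs like these would be the starting point for rules of the form "when s1 reports, s2 tends to report too", scored with support and confidence as in the transactional case.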