Classification Rule Mining Overview

Chapter 4 discusses Classification Rule Mining, which involves predicting categorical labels and continuous values through data analysis. It covers the principles of classification, the process of building classifiers, rule-based classification, and various classification methods such as Bayesian classifiers and decision trees. Additionally, it addresses design issues, data preparation, and regression analysis as a method for modeling relationships between variables.

CHAPTER-4

Classification Rule Mining


Description
Principle
Design
Algorithm
Rule evaluation
• What is Classification Rule Mining?
• Classification and prediction are two forms of data analysis that
can be used to extract models describing important data classes
or to predict future data trends.
• Classification predicts (discrete, unordered) class labels; prediction
models continuous-valued functions.
• The following are examples of cases where the data analysis task is
Classification −
• A bank loan officer wants to analyze applicant data in order to know
which customers (loan applicants) are risky and which are safe.
• A marketing manager at a company needs to predict whether a customer
with a given profile will buy a new computer.
• In both of the above examples, a model or classifier is
constructed to predict the categorical labels.
• These labels are risky or safe for loan application data and yes or
no for marketing data.
• A predictor is constructed that predicts a continuous-valued
function, or ordered value.
• Regression analysis is a statistical methodology that is mostly
used for numeric prediction.
• Many classification and prediction methods have been proposed
by researchers in machine learning, pattern recognition, and
statistics.
• Principle of Classification

• With the help of the bank loan application that we have discussed
above, let us understand the working of classification.

• The Data Classification process includes two steps −


• Building the Classifier or Model
• Using Classifier for Classification
• Building the Classifier or Model

• This step is the learning step or the learning phase.


• In this step the classification algorithms build the classifier.
• The classifier is built from the training set, which is made up of
database tuples and their associated class labels.
• Each tuple in the training set is assumed to belong to a predefined
class, as determined by its class label attribute.
• These tuples can also be referred to as samples, objects, or data
points.
• Using Classifier for Classification
• In this step, the classifier is used for classification. Here the test
data is used to estimate the accuracy of classification rules.
• The classification rules can be applied to the new data tuples if
the accuracy is considered acceptable.
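The two-step process above can be sketched in code. This is a minimal illustration, not the chapter's method: the tuples, the labels, and the trivial per-attribute majority-vote "classifier" are all assumptions made purely to show the learning step followed by the accuracy-estimation step.

```python
# Sketch of the two-step classification process (illustrative data).
# Step 1: build a classifier from a training set of (tuple, label) pairs.
# Step 2: use held-out test tuples to estimate its accuracy.
from collections import Counter

training_set = [
    ({"income": "high"}, "safe"),
    ({"income": "high"}, "safe"),
    ({"income": "low"}, "risky"),
]
test_set = [
    ({"income": "high"}, "safe"),
    ({"income": "low"}, "risky"),
]

# Step 1 (learning): record, for each income value seen in training,
# the most frequent class label.
counts_per_value = {}
for tuple_, label in training_set:
    counts_per_value.setdefault(tuple_["income"], Counter())[label] += 1
classifier = {value: counts.most_common(1)[0][0]
              for value, counts in counts_per_value.items()}

# Step 2 (classification): apply the classifier to the test set and
# estimate accuracy as the fraction of correctly labeled test tuples.
correct = sum(classifier[t["income"]] == label for t, label in test_set)
accuracy = correct / len(test_set)
print(accuracy)  # 1.0 on this toy data
```

If the estimated accuracy is acceptable, the same lookup would then be applied to new, unlabeled tuples.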
• Data Mining - Rule Based Classification

• IF-THEN Rules : Rule-based classifier makes use of a set of IF-THEN


rules for classification.
• We can express a rule in the following form −
• IF condition THEN conclusion

• Let us consider a rule R1,


• R1: IF age = youth AND student = yes
• THEN buy_computer = yes
• Points to remember −
• The IF part of the rule is called rule antecedent or precondition.
• The THEN part of the rule is called rule consequent.
• The antecedent (condition) part consists of one or more attribute
tests, and these tests are logically ANDed.
• The consequent part consists of the class prediction.

• Note − We can also write rule R1 as follows −


• R1: (age = youth) ∧ (student = yes) ⇒ (buys_computer = yes)

• If the condition holds true for a given tuple, then the antecedent
is satisfied.
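Rule R1 can be sketched as a pair of functions: one for the ANDed antecedent tests and one that returns the consequent's class prediction when the antecedent is satisfied. The tuple representation as a dictionary is an illustrative assumption.

```python
# Rule R1: IF (age = youth) AND (student = yes) THEN buys_computer = yes.
def r1_antecedent(tuple_):
    """True when all of the rule's ANDed attribute tests hold."""
    return tuple_["age"] == "youth" and tuple_["student"] == "yes"

def apply_r1(tuple_):
    """Return the consequent's class prediction if R1 fires, else None."""
    return "yes" if r1_antecedent(tuple_) else None

print(apply_r1({"age": "youth", "student": "yes"}))       # yes
print(apply_r1({"age": "middle_aged", "student": "yes"})) # None
```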
• Design Issues Regarding Classification and Prediction:
• Preparing the Data for Classification and Prediction:
• The following preprocessing steps may be applied to the data to
help improve the accuracy, efficiency, and scalability of the
classification or prediction process.
• Data Cleaning −
• Data cleaning involves removing the noise and treatment of
missing values.
• The noise is removed by applying smoothing techniques, and the
problem of missing values is solved by replacing a missing value
with the most commonly occurring value for that attribute.
• Relevance Analysis −
• The database may also have irrelevant attributes.
• Correlation analysis is used to know whether any two given
attributes are related.
• Data Transformation and reduction −
• The data can be transformed by any of the following methods.
– Normalization − The data is transformed using normalization.
– Normalization is used when, in the learning step, neural
networks or methods involving distance measurements are used.
– Normalization involves scaling all values for a given attribute
so that they fall within a small specified range, such as -1 to +1
or 0 to 1.
– Generalization −
The data can also be transformed by generalizing it to the
higher-level concepts. For this purpose we can use the concept
hierarchies.
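The normalization step above can be sketched with min-max scaling, which maps an attribute's values into a chosen range such as [0, 1]. The income values are illustrative assumptions.

```python
# Min-max normalization: scale a given attribute's values so they fall
# within a small specified range, here [0, 1] by default.
def min_max_normalize(values, new_min=0.0, new_max=1.0):
    old_min, old_max = min(values), max(values)
    span = old_max - old_min
    return [new_min + (v - old_min) / span * (new_max - new_min)
            for v in values]

incomes = [12000, 73600, 98000]   # illustrative attribute values
print(min_max_normalize(incomes))
```

The smallest value maps to the new minimum and the largest to the new maximum; everything else falls proportionally in between.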
• Comparison of Classification and Prediction
Methods
• Here are the criteria for comparing the methods of Classification and
Prediction −
• Accuracy − The accuracy of a classifier refers to its ability to
predict the class label correctly; the accuracy of a predictor refers
to how well it predicts the value of the target attribute for new
data.
• Speed − This refers to the computational cost in generating and
using the classifier or predictor.
• Robustness − It refers to the ability of classifier or predictor to
make correct predictions from given noisy data.
• Scalability − Scalability refers to the ability to construct the
classifier or predictor efficiently for given large amount of data.
• Interpretability − It refers to the level of understanding and insight
that the classifier or predictor provides.
• Bayes Classification Methods
• “What are Bayesian classifiers?” Bayesian classifiers are statistical
classifiers.
• They can predict class membership probabilities such as the
probability that a given tuple belongs to a particular class.
• Bayesian classification is based on Bayes’ theorem.
• Studies comparing classification algorithms have found a simple
Bayesian classifier known as the naïve Bayesian classifier to be
comparable in performance with decision tree and selected neural
network classifiers.
• Bayesian classifiers have also exhibited high accuracy and speed
when applied to large databases.
• Naïve Bayesian classifiers assume that the effect of an attribute
value on a given class is independent of the values of the other
attributes. This assumption is called class conditional independence.
• Bayes’ theorem is useful in that it provides a way of calculating
the posterior probability, P(H|X), from P(H), P(X|H), and P(X).

• Bayes’ theorem is
• P(H|X) = P(X|H) P(H) / P(X)
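The theorem can be checked numerically. The probability values below are illustrative assumptions, chosen only to show the posterior calculation.

```python
# Numerical illustration of Bayes' theorem:
#   P(H|X) = P(X|H) * P(H) / P(X)
def posterior(p_x_given_h, p_h, p_x):
    """Posterior probability of hypothesis H given evidence X."""
    return p_x_given_h * p_h / p_x

p_h = 0.3            # prior probability of hypothesis H
p_x_given_h = 0.8    # likelihood of evidence X under H
p_x = 0.5            # total probability of evidence X
print(posterior(p_x_given_h, p_h, p_x))  # 0.48
```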
• Classification by Decision Tree Induction:
• Decision tree induction is the learning of decision trees from
class-labeled training tuples.
• A decision tree is a flowchart-like tree structure, where Each
internal node denotes a test on an attribute.
• Each branch represents an outcome of the test.
• Each leaf node holds a class label.
• The topmost node in a tree is the root node.
• The construction of decision tree classifiers does not require any
domain knowledge or parameter setting, and is therefore appropriate
for exploratory knowledge discovery.
• Decision trees can handle high-dimensional data.
• Their representation of acquired knowledge in tree form is intuitive
and easy to interpret.
• The learning and classification steps of decision tree induction are
simple and fast.
• In general, decision tree classifiers have good accuracy.
• Decision tree induction algorithms have been used for
classification in many application areas, such as medicine,
manufacturing and production, financial analysis, and molecular
biology.
• Algorithm For Decision Tree Induction:

• The algorithm is called with three parameters:


• Data partition, D
• Attribute list
• Attribute selection method
• The parameter attribute list is a list of attributes describing the
tuples.
• Attribute selection method specifies a heuristic procedure for
selecting the attribute that best discriminates the given tuples
according to class.
• The tree starts as a single node, N, representing the training
tuples in D.
• If the tuples in D are all of the same class, then node N becomes a
leaf and is labeled with that class.
• All of the terminating conditions are explained at the end of the
algorithm.
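The recursion described above can be sketched as follows. This is a simplified illustration, not the full algorithm: attribute selection is replaced by simply taking the next attribute in the list, the majority-vote fallback is an added assumption, and the data is invented for the example.

```python
# Sketch of recursive decision tree induction: if all tuples in the
# partition D share one class, the node becomes a leaf labeled with that
# class; otherwise the partition is split on a chosen attribute and the
# procedure recurses on each resulting subset.
from collections import Counter

def induce_tree(D, attribute_list):
    labels = [label for _, label in D]
    if len(set(labels)) == 1:        # terminating condition: pure node
        return labels[0]
    if not attribute_list:           # no attributes left: majority vote
        return Counter(labels).most_common(1)[0][0]
    attr = attribute_list[0]         # stand-in for attribute selection
    node = {}
    for value in {t[attr] for t, _ in D}:
        subset = [(t, c) for t, c in D if t[attr] == value]
        node[(attr, value)] = induce_tree(subset, attribute_list[1:])
    return node

D = [({"age": "youth"}, "no"), ({"age": "youth"}, "no"),
     ({"age": "senior"}, "yes")]
print(induce_tree(D, ["age"]))
```

Splitting on `age` yields two pure partitions, so both branches terminate immediately as leaves.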
• Genetic Algorithms
• The idea of genetic algorithm is derived from natural evolution.
• In genetic algorithm, first of all, the initial population is created.
• This initial population consists of randomly generated rules. We can
represent each rule by a string of bits.
• For example, in a given training set, the samples are described by
two Boolean attributes such as A1 and A2. And this given training
set contains two classes such as C1 and C2.
• We can encode the rule IF A1 AND NOT A2 THEN C2 as the bit
string 100. In this representation, the two leftmost bits
represent the attributes A1 and A2, respectively, and the
rightmost bit represents the class.
• Likewise, the rule IF NOT A1 AND NOT A2 THEN C1 can be encoded
as 001.
• Fuzzy Set Approaches:

• Fuzzy logic uses truth values between 0.0 and 1.0 to represent the
degree of membership that a certain value has in a given category.
Each category then represents a fuzzy set.
• Fuzzy logic systems typically provide graphical tools to assist users
in converting attribute values to fuzzy truth values.
• Fuzzy set theory is also known as possibility theory.
• It was proposed by Lotfi Zadeh in 1965 as an alternative to
traditional two-value logic and probability theory.
• Most important, fuzzy set theory allows us to deal with vague or
inexact facts.
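A degree of membership between 0.0 and 1.0 is commonly computed with a membership function. The triangular shape and the breakpoints below are illustrative assumptions, standing in for a category such as "medium income".

```python
# Triangular membership function: returns the degree (0.0 to 1.0) to
# which value x belongs to a fuzzy category that rises from `low`,
# peaks at `peak`, and falls back to zero at `high`.
def triangular_membership(x, low, peak, high):
    if x <= low or x >= high:
        return 0.0
    if x <= peak:
        return (x - low) / (peak - low)
    return (high - x) / (high - peak)

print(triangular_membership(40, 30, 50, 70))  # 0.5: partial membership
print(triangular_membership(50, 30, 50, 70))  # 1.0: full membership
print(triangular_membership(80, 30, 50, 70))  # 0.0: outside the set
```

Unlike two-value logic, a value can belong to several fuzzy sets at once, each with its own membership degree.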
• Regression Analysis:
• Regression analysis can be used to model the relationship between
one or more independent or predictor variables and a dependent or
response variable which is continuous-valued.
• In general, the values of the predictor variables are known. The
response variable is what we want to predict.

• Linear Regression:
• Straight-line regression analysis involves a response variable, y, and a
single predictor variable x.
• It is the simplest form of regression, and models y as a linear function
of x.
• That is, y = w0 + w1x, where the variance of y is assumed to be
constant, and w0 and w1 are regression coefficients specifying the
y-intercept and slope of the line.
• The regression coefficients can be estimated using the method of
least squares, with the following equations:
• w1 = Σ (xi − x̄)(yi − ȳ) / Σ (xi − x̄)²
• w0 = ȳ − w1 x̄
• where x̄ is the mean value of x1, x2, …, x|D|, and ȳ is the mean value
of y1, y2, …, y|D|.
• The coefficients w0 and w1 often provide good approximations to
otherwise complicated regression equations.
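The least-squares estimates can be computed directly. The (x, y) pairs below are illustrative, chosen to lie exactly on the line y = 1 + 2x so the recovered coefficients are easy to check.

```python
# Least-squares estimates of the straight-line coefficients:
#   w1 = sum((xi - x_bar)(yi - y_bar)) / sum((xi - x_bar)^2)
#   w0 = y_bar - w1 * x_bar
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]          # illustrative data: y = 1 + 2x

x_bar = sum(xs) / len(xs)          # mean of the predictor values
y_bar = sum(ys) / len(ys)          # mean of the response values
w1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
      / sum((x - x_bar) ** 2 for x in xs))
w0 = y_bar - w1 * x_bar
print(w0, w1)  # 1.0 2.0
```

A prediction for a new x is then simply w0 + w1 * x.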
