UNIT – 4
DECISION MAKING
DECISION-MAKING UNDER CERTAINTY
Decision theory starts with three fundamental concepts, viz. actions, conditions, and outcomes.
While making decisions under certainty, the decision maker is fully informed, able to compute
with perfect accuracy, and fully rational.
Decision-making under certainty means that each alternative leads to one and only one consequence, and a choice among alternatives is therefore equivalent to a choice among consequences.
DECISION-MAKING UNDER UNCERTAINTY
"Uncertainty" refers to situations when this randomness "cannot" be expressed in terms of
specific mathematical probabilities.
In decision-making under pure uncertainty, the decision maker does not know the likelihood of occurrence of any of the states of nature. In such situations, the decision-making depends largely on the decision maker's personality, and his or her behavior is based purely on his or her attitude toward the unknown. Some of these attitudes are optimistic, pessimistic, and least regret, among others.
With no knowledge regarding the likelihood (probability) of any of the events occurring, the decision maker must base his decision solely on the actual conditional payoff values, together with his attitude toward earning those values. Five decision criteria reflecting different attitudes will be discussed: the optimistic (Maximax), the pessimistic (Maximin), the Savage opportunity-loss (minimax regret), the equally likely (Laplace), and the Hurwicz criteria. These criteria may lead to different decisions for the same problem; thus, the decision maker must select his appropriate criterion at the outset. We shall discuss each of these uncertainty decision criteria in turn, illustrating each with a small hypothetical payoff table.
Maximax Criterion
The Maximax criterion is an optimistic approach. It suggests that the decision maker examine the maximum payoff of each alternative and choose the alternative whose best outcome is the best overall. This criterion appeals to the adventurous decision maker who is attracted by high payoffs. It may also appeal to a decision maker who likes to gamble and who is in a position to withstand any losses without substantial inconvenience. The Maximax decision rule models the optimist profile when the payoffs are positive-flow rewards, such as profits or revenue. When payoffs are given as negative-flow rewards, such as costs, the optimist decision rule is Minimin (note that negative-flow rewards are expressed with positive numbers). The Maximax decision rule is as follows:
1. For each action alternative (matrix row), determine the maximum payoff possible.
2. From these maxima, select the maximum payoff. The action alternative leading to this payoff is the chosen decision.
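As an illustration, here is a minimal sketch of the Maximax rule in Python, applied to a small hypothetical profit payoff table (the alternatives and numbers are invented for demonstration):

```python
# Maximax (optimist) rule on a hypothetical profit payoff table.
# Rows are action alternatives; columns are states of nature.
payoffs = {
    "Order small":  [40, 40, 40],
    "Order medium": [30, 60, 65],
    "Order large":  [10, 50, 90],
}

# Step 1: for each alternative, determine its maximum possible payoff.
row_maxima = {action: max(row) for action, row in payoffs.items()}

# Step 2: choose the alternative whose maximum is the largest.
best = max(row_maxima, key=row_maxima.get)
print(row_maxima)               # {'Order small': 40, 'Order medium': 65, 'Order large': 90}
print("Maximax choice:", best)  # Order large
```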
Maximin Criterion
The pessimistic decision criterion, sometimes called the maximin (for profits) or minimax (for costs) criterion, assures the decision maker that he will earn no less (or pay no more) than some specified amount. It is a very conservative approach to decision-making, in that we anticipate the worst possible outcome (minimum for profit and maximum for cost) for any strategy that we might choose. The optimal strategy chosen is then the best (maximin or minimax) of these anticipated worst outcomes.
Steps for Decision under the Pessimistic Criterion
The formal procedure for finding the pessimistic decision is as follows:
1. For each possible strategy, identify the worst payoff value. This will be the row minimum
for a profit table and the row maximum for a cost table. Record this number in a new column.
2. Select the strategy with the best anticipated payoff value (maximum for profit and minimum
for cost).
It should be pointed out that this decision rule does not consider the utility values of the various outcomes, nor does it allow for the superimposing of the decision maker's subjective feelings about the likelihood of the various events. Rather, this criterion is best suited for situations where the probabilities are not easily evaluated and the decision maker is very conservative.
When dealing with costs, the maximum cost associated with each alternative is considered, and the alternative that minimizes this maximum cost is chosen.
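A minimal sketch of the Maximin rule on the same hypothetical profit table used above (invented numbers):

```python
# Maximin (pessimist) rule on a hypothetical profit payoff table.
payoffs = {
    "Order small":  [40, 40, 40],
    "Order medium": [30, 60, 65],
    "Order large":  [10, 50, 90],
}

# Step 1: record the worst (row minimum) payoff of each alternative.
row_minima = {action: min(row) for action, row in payoffs.items()}

# Step 2: pick the alternative with the best of these worst cases.
best = max(row_minima, key=row_minima.get)
print(row_minima)               # {'Order small': 40, 'Order medium': 30, 'Order large': 10}
print("Maximin choice:", best)  # Order small
# For a cost table, take max(row) in Step 1 and min(...) in Step 2 instead.
```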
Minimax Regret Criterion
The opportunity loss decision criterion, sometimes called the Savage minimax regret decision criterion, was proposed by the statistician L. J. Savage. It assures the decision maker that the opportunities for a payoff that he has missed (or lost) because of the random occurrence of a possible unfavorable event will be as small as possible. The approach assumes that for each strategy-event pair, a regret (or opportunity loss) value can be computed, equal to the difference between what the payoff could have been (had he chosen the optimal strategy for this event) and what it is for the strategy chosen and the event that has occurred.
The decision process then anticipates the worst (maximum) opportunity loss for each possible strategy and chooses as the optimal strategy the one with the minimum anticipated opportunity loss.
Steps for Decision under the Savage Regret Criterion
The formal procedure for finding the opportunity-loss decision is as follows:
1. From the conditional payoff table, develop the conditional opportunity loss table as
follows:
(a) For the first possible event, identify the best possible payoff value.
(b) For each possible strategy, subtract the actual conditional payoff value from the best value.
These results are the regret or opportunity loss values for this event.
(c) Repeat steps (a) and (b) for all possible events.
2. For each possible strategy, identify the worst or maximum regret value. Record this
number in a new column.
3. Select the strategy with the smallest (minimum) anticipated opportunity loss value.
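A minimal sketch of these steps in Python, again on the hypothetical profit table (invented numbers):

```python
# Savage minimax-regret rule on a hypothetical profit payoff table.
payoffs = {
    "Order small":  [40, 40, 40],
    "Order medium": [30, 60, 65],
    "Order large":  [10, 50, 90],
}
actions = list(payoffs)
n_states = len(payoffs["Order small"])

# Step 1: build the regret table; for each event (column), regret =
# best payoff for that event minus the actual payoff.
col_best = [max(payoffs[a][s] for a in actions) for s in range(n_states)]
regret = {a: [col_best[s] - payoffs[a][s] for s in range(n_states)] for a in actions}

# Step 2: worst (maximum) regret for each alternative.
max_regret = {a: max(r) for a, r in regret.items()}

# Step 3: choose the alternative with the smallest maximum regret.
best = min(max_regret, key=max_regret.get)
print(max_regret)                      # {'Order small': 50, 'Order medium': 25, 'Order large': 30}
print("Minimax-regret choice:", best)  # Order medium
```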
Laplace Criterion
The equally likely decision criterion is based on the principle of insufficient reason and is attributed to Laplace; thus, it is also known as the Laplace criterion. The approach assumes that the decision maker has no knowledge as to which event will occur, and thus he considers the likelihood of the different events occurring as equal. In effect, the decision maker assigns the same probability to each event, equal to 1/(number of events). Hence, an average or expected payoff can be computed for each possible strategy, and the optimal decision is the one with the best average payoff value.
Steps for Decision under the Laplace Criterion
The formal procedure for finding the equally likely decision is as follows:
1. For each possible strategy, find the average or expected payoff by adding all the possible
payoffs and dividing by the number of possible events. Record this number in a new column.
2. Select the strategy with the best average payoff value: maximum for profit and minimum
for cost.
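A minimal sketch on the hypothetical profit table (invented numbers):

```python
# Laplace (equally likely) rule on a hypothetical profit payoff table.
payoffs = {
    "Order small":  [40, 40, 40],
    "Order medium": [30, 60, 65],
    "Order large":  [10, 50, 90],
}

# Step 1: average payoff per alternative (each event weighted 1/number of events).
averages = {a: sum(row) / len(row) for a, row in payoffs.items()}

# Step 2: best average payoff (maximum for profit; minimum for cost).
best = max(averages, key=averages.get)
print(averages)                 # Order medium has the best average, about 51.67
print("Laplace choice:", best)  # Order medium
```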
Hurwicz Criterion
This criterion is a compromise between the optimistic and pessimistic decision criteria. To start with, a coefficient of optimism α (0 ≤ α ≤ 1) is selected. When α is close to one, the decision maker is optimistic about the future, and when α is close to zero the decision maker is pessimistic about the future. According to Hurwicz, select the strategy that maximizes:
H = α (maximum payoff of the alternative) + (1 − α) (minimum payoff of the alternative)
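A minimal sketch with a hypothetical α of 0.75 on the same invented profit table:

```python
# Hurwicz rule with a coefficient of optimism alpha (0 <= alpha <= 1).
payoffs = {
    "Order small":  [40, 40, 40],
    "Order medium": [30, 60, 65],
    "Order large":  [10, 50, 90],
}
alpha = 0.75  # hypothetical value, leaning optimistic

# H = alpha * (best payoff) + (1 - alpha) * (worst payoff) for each alternative.
h = {a: alpha * max(row) + (1 - alpha) * min(row) for a, row in payoffs.items()}
best = max(h, key=h.get)
print(h)                        # {'Order small': 40.0, 'Order medium': 56.25, 'Order large': 70.0}
print("Hurwicz choice:", best)  # Order large
# alpha = 1 reduces to Maximax; alpha = 0 reduces to Maximin.
```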
DECISION-MAKING UNDER RISK
"Risk" refers to situations where the decision-maker can assign mathematical probabilities to
the randomness which he is faced with.
We now discuss decision-making under risk. In this environment, we have additional information about the occurrence of the states of nature: we have past data containing information about the occurrence of the different states of nature. We know either:
1) the directly available probabilities of occurrence of the different states of nature; or
2) the frequency data for the different states of nature, which can be converted into probabilities using the relative frequency approach of probability (we have explained the relative frequency approach of probability in Unit 2 of MST-003); or
3) subjective probabilities based on the experience of individuals (we have explained subjective probabilities in Unit 2 of MST-003).
In all cases, we have with us the probabilities of occurrence of the different states of nature. For example, suppose ABC Corporation wishes to introduce one of two products to the market this year, and the probabilities and present values (PV) of the projected cash inflows of each product are known; with the help of a decision tree, we can suggest which product the company should choose. The following criteria are used to select an optimum course of action in this environment: (i) the Expected Monetary Value (EMV) criterion and (ii) the Expected Opportunity Loss (EOL) criterion.
Expected Monetary Value (EMV) Criterion
In this criterion, we first form the payoff table or payoff matrix if it is not already given. Then, for each course of action, we find the expected value by multiplying each payoff value by the probability of the corresponding state of nature and adding the results. The resulting values are called the expected monetary values (EMVs). In the case of profit or gain, the course of action corresponding to the maximum EMV is the optimum course of action according to this criterion; in the case of loss or cost, it is the course of action corresponding to the minimum EMV. We follow the steps given below:
Step 1: If the payoff table or payoff matrix is already given, Step 1 is not needed. Otherwise, we first define the courses of action and states of nature and then obtain the payoff table or payoff matrix for the given situation. We also add one more column to the table indicating the probabilities of the different states of nature.
Step 2: To obtain the expected monetary value (EMV) for each course of action, we multiply the payoff value of each course of action by the probability of the corresponding state of nature and then add the results. For example, let $x_{1j}, x_{2j}, \ldots, x_{mj}$ be the payoff values for the $j$th course of action corresponding to the $m$ states of nature $N_1, N_2, \ldots, N_m$, and let $p_1, p_2, \ldots, p_m$ be the corresponding probabilities of these $m$ states of nature, respectively. Then the expected monetary value for the $j$th course of action is given as:
$$\mathrm{EMV}_j = p_1 x_{1j} + p_2 x_{2j} + \cdots + p_m x_{mj} = \sum_{i=1}^{m} p_i x_{ij} \qquad \ldots(1)$$
Step 3: We select the maximum expected monetary value from among the expected monetary values obtained in Step 2 if the payoff values represent profit or gain, and the minimum EMV if the values represent loss or cost.
Step 4: Under this criterion, the course of action corresponding to the maximum (or minimum) EMV selected in Step 3 is the optimum course of action.
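A minimal sketch of equation (1) in Python, using the hypothetical profit table from the earlier criteria and invented state probabilities:

```python
# Expected Monetary Value: EMV_j = sum_i p_i * x_ij (equation (1)).
probs = [0.3, 0.5, 0.2]  # hypothetical probabilities of the states of nature
payoffs = {
    "Order small":  [40, 40, 40],
    "Order medium": [30, 60, 65],
    "Order large":  [10, 50, 90],
}

emv = {a: sum(p * x for p, x in zip(probs, row)) for a, row in payoffs.items()}
best = max(emv, key=emv.get)  # maximum EMV, since these payoffs are profits
print(emv)                    # {'Order small': 40.0, 'Order medium': 52.0, 'Order large': 46.0}
print("EMV choice:", best)    # Order medium
```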
Expected Opportunity Loss (EOL) Criterion
This criterion suggests the course of action which minimizes our expected opportunity loss. The steps involved are the same as in the expected monetary value (EMV) criterion, except that instead of dealing with payoff values we deal with opportunity loss values. We follow the steps explained below:
Step 1: If the payoff table or payoff matrix is already given, then Step 1 is not needed. Otherwise, we first define the courses of action and states of nature and then obtain the payoff table. We also add one more column indicating the probabilities of the different states of nature.
Step 2: We obtain the opportunity loss values (also called regret values or conditional opportunity loss values) for each state of nature: in the case of profit or gain, by subtracting all payoff values corresponding to each state of nature from the maximum payoff value for that state; in the case of cost or loss, by subtracting the minimum payoff value corresponding to each state of nature from all other payoff values for that state. The calculation has been explained in Tables 9.4 and 9.5, respectively, in Unit 9.
Step 3: Next, we obtain the expected opportunity loss value for each course of action by summing the products of the opportunity loss values of that course of action with the probabilities of the corresponding states of nature.
Step 4: Finally, we select the minimum from among the expected opportunity loss values calculated in Step 3. The course of action corresponding to the minimum expected opportunity loss value is the optimum course of action.
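A minimal sketch in Python, reusing the hypothetical profit table and invented probabilities from the EMV example:

```python
# Expected Opportunity Loss: expected regret for each course of action.
probs = [0.3, 0.5, 0.2]  # hypothetical state-of-nature probabilities
payoffs = {
    "Order small":  [40, 40, 40],
    "Order medium": [30, 60, 65],
    "Order large":  [10, 50, 90],
}
actions = list(payoffs)
n_states = len(probs)

# Step 2: regret table (best payoff for each state minus the actual payoff).
col_best = [max(payoffs[a][s] for a in actions) for s in range(n_states)]
regret = {a: [col_best[s] - payoffs[a][s] for s in range(n_states)] for a in actions}

# Step 3: expected opportunity loss for each alternative.
eol = {a: sum(p * r for p, r in zip(probs, row)) for a, row in regret.items()}

# Step 4: minimize the EOL; this agrees with the maximum-EMV choice.
best = min(eol, key=eol.get)
print(eol)                  # {'Order small': 20.0, 'Order medium': 8.0, 'Order large': 14.0}
print("EOL choice:", best)  # Order medium
```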
DECISION TREE ANALYSIS
A decision tree is a decision support tool that uses a tree-like graph or model of decisions and
their possible consequences, including chance event outcomes, resource costs, and utility.
A decision tree is drawn only from left to right and has only burst nodes (splitting paths) but no sink nodes (converging paths). Therefore, if drawn manually, it can grow very large and become hard to draw in full.
A decision tree is used as a visual and analytical tool, where the expected values (or expected
utility) of competing alternatives are calculated.
A decision tree consists of three types of nodes:
1. Decision nodes - commonly represented by squares
2. Chance nodes - represented by circles
3. End nodes - represented by triangles
Decision trees show the possible outcomes of different choices, taking into account probabilities, costs, and returns. They enable a manager to set out the consequences of choices, ensuring that he has considered all possibilities, to assess the likelihood of each possibility, and to evaluate the result of each possibility in terms of cost and profit.
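As a small sketch of how expected values are rolled back through such a tree, consider a hypothetical decision node with two branches, each leading to a chance node (all names and numbers invented):

```python
# Rolling back a tiny decision tree: a decision node (square) whose branches
# lead to chance nodes (circles) with probabilistic payoffs at the end nodes.
tree = {
    "Launch product A": [(0.6, 120), (0.4, -30)],  # (probability, payoff) pairs
    "Launch product B": [(0.5, 80),  (0.5, 20)],
}

# At each chance node, compute the expected value of its outcomes;
# at the decision node, choose the branch with the highest expected value.
expected = {d: sum(p * v for p, v in branches) for d, branches in tree.items()}
best = max(expected, key=expected.get)
print(expected)           # {'Launch product A': 60.0, 'Launch product B': 50.0}
print("Decision:", best)  # Launch product A
```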
Decision trees are also a core tool in machine learning, with applications spanning several different areas. They can be used for classification as well as regression problems. The name itself suggests that a flowchart-like tree structure is used to show the predictions that result from a series of feature-based splits. It starts with a root node and ends with a decision made by the leaves.
Decision Tree Terminologies
Before learning more about decision trees, let's get familiar with some of the terminology:
Root Node: The initial node at the beginning of a decision tree, where the entire
population or dataset starts dividing based on various features or conditions.
Decision Nodes: Nodes resulting from the splitting of root nodes are known as
decision nodes. These nodes represent intermediate decisions or conditions within
the tree.
Leaf Nodes: Nodes where further splitting is not possible, often indicating the final
classification or outcome. Leaf nodes are also referred to as terminal nodes.
Sub-Tree / Branch: Similar to a subsection of a graph being called a sub-graph, a subsection of a decision tree is referred to as a sub-tree or branch. It represents a specific path of decisions and outcomes within the tree.
Pruning: The process of removing or cutting down specific nodes in a decision tree to prevent overfitting and simplify the model.
Parent and Child Node: In a decision tree, a node that is divided into sub-nodes is
known as a parent node, and the sub-nodes emerging from it are referred to as child
nodes. The parent node represents a decision or condition, while the child nodes
represent the potential outcomes or further decisions based on that condition.
Example of Decision Tree
Let’s understand decision trees with the help of an example:
Decision trees are drawn upside down, which means the root is at the top, and this root is then split into several nodes. In layman's terms, decision trees are nothing but a bunch of if-else statements: the tree checks whether a condition is true and, if so, moves to the next node attached to that decision.
Consider a tree for deciding whether to go out and play. The tree first asks: what is the weather? Is it sunny, cloudy, or rainy? Depending on the answer, it moves on to the next feature, such as humidity or wind. It may then check whether the wind is strong or weak; if the wind is weak and the weather is rainy, the person may go and play.
Notice that if the weather is cloudy, the decision is immediately to play. Why didn't that branch split further? Why did it stop there?
To answer this question, we need to know about a few more concepts like entropy, information gain, and the Gini index. In simple terms, the output of the training dataset is always "yes" for cloudy weather; since there is no disorder there, we don't need to split that node further.
The goal when building a decision tree is to decrease uncertainty or disorder in the dataset with each split.
Now you must be thinking: how do I know what the root node should be? What should the decision nodes be? When should I stop splitting? To decide this, there is a metric called "entropy", which measures the amount of uncertainty in the dataset.
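For a set of class labels with proportions p_i, entropy is computed as -Σ p_i log2(p_i). A minimal sketch with invented labels:

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy = -sum(p_i * log2(p_i)) over the class proportions p_i."""
    n = len(labels)
    probs = [count / n for count in Counter(labels).values()]
    return sum(-p * math.log2(p) for p in probs)

print(entropy(["yes"] * 9 + ["no"] * 5))  # about 0.940: mixed labels, high uncertainty
print(entropy(["yes"] * 4))               # 0.0: a pure node, like the cloudy branch
```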
How do decision tree algorithms work?
The decision tree algorithm works in a few simple steps (a code sketch follows the list):
1. Starting at the Root: The algorithm begins at the top, called the “root node,”
representing the entire dataset.
2. Asking the Best Questions: It looks for the most important feature or question that
splits the data into the most distinct groups. This is like asking a question at a fork
in the tree.
3. Branching Out: Based on the answer to that question, it divides the data into
smaller subsets, creating new branches. Each branch represents a possible route
through the tree.
4. Repeating the Process: The algorithm continues asking questions and splitting the
data at each branch until it reaches the final “leaf nodes,” representing the predicted
outcomes or classifications.
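These steps can be reproduced with scikit-learn's DecisionTreeClassifier; this is only a minimal sketch, assuming scikit-learn is installed, and the toy weather-style data below is invented:

```python
# Fitting a small classification tree with scikit-learn (hypothetical toy data).
from sklearn.tree import DecisionTreeClassifier, export_text

# Encoded features: [outlook, humidity, wind] as small integers.
X = [[0, 1, 0], [0, 1, 1], [1, 1, 0], [2, 1, 0], [2, 0, 0], [2, 0, 1], [1, 0, 1]]
y = ["no", "no", "yes", "yes", "yes", "no", "yes"]  # whether to play

clf = DecisionTreeClassifier(criterion="entropy", max_depth=3, random_state=0)
clf.fit(X, y)

# Print the learned splits from root to leaves.
print(export_text(clf, feature_names=["outlook", "humidity", "wind"]))
print(clf.predict([[1, 0, 0]]))  # predict for one new, invented sample
```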
Decision Tree Assumptions
Several assumptions are made to build effective models when creating decision trees.
These assumptions help guide the tree’s construction and impact its performance. Here are
some common assumptions and considerations when creating decision trees:
Binary Splits
Decision trees typically make binary splits, meaning each node divides the data into two
subsets based on a single feature or condition. This assumes that each decision can be
represented as a binary choice.
Recursive Partitioning
Decision trees use a recursive partitioning process, where each node is divided into child
nodes, and this process continues until a stopping criterion is met. This assumes that data
can be effectively subdivided into smaller, more manageable subsets.
Feature Independence
Decision trees often assume that the features used for splitting nodes are independent. In
practice, feature independence may not hold, but decision trees can still perform well if
features are correlated.
Homogeneity
Decision trees aim to create homogeneous subgroups in each node, meaning that the
samples within a node are as similar as possible regarding the target variable. This
assumption helps in achieving clear decision boundaries.
Top-Down Greedy Approach
Decision trees are constructed using a top-down, greedy approach, where each split is
chosen to maximize information gain or minimize impurity at the current node. This may
not always result in the globally optimal tree.
Categorical and Numerical Features
Decision trees can handle both categorical and numerical features. However, they may
require different splitting strategies for each type.
Overfitting
Decision trees are prone to overfitting when they capture noise in the data. Pruning and
setting appropriate stopping criteria are used to address this assumption.
Impurity Measures
Decision trees use impurity measures such as Gini impurity or entropy to evaluate how
well a split separates classes. The choice of impurity measure can impact tree construction.
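For comparison with the entropy sketch above, Gini impurity is 1 - Σ p_i²; a minimal sketch with the same invented labels:

```python
from collections import Counter

def gini(labels):
    """Gini impurity = 1 - sum(p_i^2) over the class proportions p_i."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

print(gini(["yes"] * 9 + ["no"] * 5))  # about 0.459 for a mixed node
print(gini(["yes"] * 4))               # 0.0 for a pure node
```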
No Missing Values
Decision trees assume that there are no missing values in the dataset or that missing values
have been appropriately handled through imputation or other methods.
Equal Importance of Features
Decision trees may assume equal importance for all features unless feature scaling or
weighting is applied to emphasize certain features.
No Outliers
Decision trees are sensitive to outliers, and extreme values can influence their
construction. Preprocessing or robust methods may be needed to handle outliers
effectively.
Sensitivity to Sample Size
Small datasets may lead to overfitting, and large datasets may result in overly complex
trees. The sample size and tree depth should be balanced.
Decision Tree in Machine Learning
A decision tree in machine learning is a versatile, interpretable algorithm used for predictive modeling. It structures decisions based on input data, making it suitable for both classification and regression tasks. This section delves into the components, terminology, construction, and advantages of decision trees, exploring their applications and learning algorithms.
A decision tree is a type of supervised learning algorithm that is commonly used in machine learning to model and predict outcomes based on input data. It is a tree-like structure where each internal node tests an attribute, each branch corresponds to an attribute value, and each leaf node represents the final decision or prediction. Decision trees can be used to solve both regression and classification problems.