
Lecture 2 Deterministic

The document describes a deterministic dynamic programming problem called the stagecoach problem. It involves finding the optimal route through multiple stages (stagecoach runs) to minimize the total cost of life insurance. Dynamic programming is used to solve this by starting with the final stage and working backwards, determining the optimal decision at each stage based on the previous stages. Deterministic dynamic programming problems have states and decisions at each stage that deterministically lead to the next state.

Uploaded by

Armee Justitia
Copyright
© © All Rights Reserved

Deterministic Dynamic Programming

1 / 17 Ya-Tang Chuang Dynamic Programming


THE STAGECOACH PROBLEM
The stagecoach problem is a problem specially constructed to
illustrate the features and to introduce the terminology of dynamic
programming.

A mythical fortune seeker in Missouri decided to go west to join
the gold rush in California during the mid-19th century. The journey
would require traveling by stagecoach through unsettled country
where there was serious danger of attack by marauders.

The possible routes are shown in the figure, where each state is
represented by a circled letter and the direction of travel is always
from left to right in the diagram.

Four stages were required to travel from his point of embarkation in
state A (Missouri) to his destination in state J (California).

Life insurance policies were offered to stagecoach passengers. The
cost of the policy for taking any given stagecoach run was based on a
careful evaluation of the safety of that run. Thus, the safest route
should be the one with the cheapest total life insurance policy.

We shall now focus on the question of which route minimizes the total
cost of the policy.


How should we find the route with cheapest cost?

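How about selecting the cheapest road at each stage? This greedy rule
is not guaranteed to give the cheapest overall route, because a cheap
early run can lead to expensive later runs.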

One possible approach to solving this problem is trial and error:
list every route and compare total costs. However, the number of
possible routes is large (18).
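
The two naive approaches above can be compared with a short sketch. The route figure is not reproduced in this text, so the run costs in `COST` are hypothetical values chosen only to match the described layout (A to J in four stages, 18 routes); the function names are likewise illustrative.

```python
# Hypothetical run costs (assumption: the original figure is not available here).
# COST[s][t] is the insurance cost of the stagecoach run from state s to state t.
COST = {
    "A": {"B": 2, "C": 4, "D": 3},
    "B": {"E": 7, "F": 4, "G": 6},
    "C": {"E": 3, "F": 2, "G": 4},
    "D": {"E": 4, "F": 1, "G": 5},
    "E": {"H": 1, "I": 4},
    "F": {"H": 6, "I": 3},
    "G": {"H": 3, "I": 3},
    "H": {"J": 3},
    "I": {"J": 4},
}


def all_routes(start="A", goal="J"):
    """Yield every route from start to goal as a list of states (trial and error)."""
    if start == goal:
        yield [goal]
        return
    for nxt in COST[start]:
        for rest in all_routes(nxt, goal):
            yield [start] + rest


def route_cost(route):
    """Total cost of a route: the sum of the costs of its consecutive runs."""
    return sum(COST[a][b] for a, b in zip(route, route[1:]))


def greedy_route(start="A", goal="J"):
    """Always take the cheapest next run, ignoring what comes afterwards."""
    route = [start]
    while route[-1] != goal:
        here = route[-1]
        route.append(min(COST[here], key=COST[here].get))
    return route


routes = list(all_routes())
best = min(routes, key=route_cost)
greedy = greedy_route()
print(len(routes), route_cost(best), route_cost(greedy))
```

With these assumed costs, exhaustive enumeration visits all 18 routes, and the greedy route turns out to be more expensive than the best one, which is the motivation for a more systematic method.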

Dynamic programming starts with a small portion of the original
problem and finds the optimal solution for this smaller problem.

We start with the smaller problem where the fortune seeker has
nearly completed his journey and has only one more stage
(stagecoach run) to go.

At each subsequent iteration, the problem is enlarged by
increasing by 1 the number of stages left to go to complete the
journey.

CHARACTERISTICS OF DYNAMIC PROGRAMMING PROBLEMS
These features characterize dynamic programming problems:
1. The problem can be divided into stages, with a decision required
at each stage.

2. Each stage has a number of states associated with the
beginning of that stage.

3. The effect of the decision at each stage is to transform the
current state to a state associated with the beginning of the next
stage (possibly according to a probability distribution).

4. The solution procedure is designed to find an optimal policy for
the overall problem.

5. The optimal immediate decision depends on only the current
state and not on how you got there. This is the principle of
optimality for dynamic programming.

6. The solution procedure begins by finding the optimal policy for
the last stage (backward induction).

7. A recursive relationship that identifies the optimal policy for
stage n, given the optimal policy for stage n + 1, is available:

\[
f_n^*(s_n) = \min_{x_n} \{ c(s_n, x_n) + f_{n+1}^*(s_{n+1}) \}
\]
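
The backward recursion can be sketched in a few lines of Python for the stagecoach problem. This is a minimal illustration rather than the lecture's own code: the costs in `COST` are hypothetical (the route figure is not reproduced in this text), `solve` is an illustrative name, and the boundary value of 0 at the destination J is an assumption made so that the recursion has a starting point.

```python
# Hypothetical run costs (assumption: the original figure is not available here).
COST = {
    "A": {"B": 2, "C": 4, "D": 3},
    "B": {"E": 7, "F": 4, "G": 6},
    "C": {"E": 3, "F": 2, "G": 4},
    "D": {"E": 4, "F": 1, "G": 5},
    "E": {"H": 1, "I": 4},
    "F": {"H": 6, "I": 3},
    "G": {"H": 3, "I": 3},
    "H": {"J": 3},
    "I": {"J": 4},
}
# States at the beginning of stages 1..4.
STAGES = [["A"], ["B", "C", "D"], ["E", "F", "G"], ["H", "I"]]


def solve(cost, stages, goal="J"):
    """Backward induction: f_star[s] is the minimum cost from state s to the goal."""
    f_star = {goal: 0}  # assumed boundary condition: no cost once at the destination
    x_star = {}         # x_star[s] is an optimal next state when in state s
    for states in reversed(stages):  # last stage first
        for s in states:
            # Minimise c(s, x) + f*_{n+1}(x) over the reachable next states x.
            x_best = min(cost[s], key=lambda x: cost[s][x] + f_star[x])
            f_star[s] = cost[s][x_best] + f_star[x_best]
            x_star[s] = x_best
    return f_star, x_star


f_star, x_star = solve(COST, STAGES)

# Recover one optimal route by following the stored decisions forward from A.
route = ["A"]
while route[-1] != "J":
    route.append(x_star[route[-1]])
print(f_star["A"], route)
```

Each state is evaluated once, using only the already computed values of the following stage, so no route is ever enumerated explicitly. When several decisions tie, this sketch keeps the first one it encounters; other optimal routes may exist.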

The notation used is summarized below:
N: number of stages

n: label for the current stage (n = 1, 2, . . . , N)

$s_n$: current state for stage n

$x_n$: decision variable for stage n

$x_n^*$: optimal value of $x_n$ (given $s_n$)

$f_n(s_n, x_n)$: value of the objective function if the system starts
in state $s_n$ at stage n, the immediate decision is $x_n$, and optimal
decisions are made thereafter

The recursive relationship will always be of the form

\[
f_n^*(s_n) = \min_{x_n} \{ f_n(s_n, x_n) \}
\]

or

\[
f_n^*(s_n) = \max_{x_n} \{ f_n(s_n, x_n) \},
\]

where $f_n^*(s_n) = f_n(s_n, x_n^*)$.
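
As a worked instance of this general form, the stagecoach recursion can be written out as follows. The boundary condition $f_{N+1}^* = 0$ is an assumption added here to close the recursion; the slides do not state it explicitly.

```latex
% Stagecoach problem: the objective is the immediate run cost plus the
% optimal cost of the remaining stages.
\[
f_n(s_n, x_n) = c(s_n, x_n) + f_{n+1}^*(s_{n+1}),
\qquad f_{N+1}^*(s_{N+1}) = 0 \quad \text{(assumed boundary condition)}
\]
% Minimising over the decision recovers the recursive relationship.
\[
f_n^*(s_n) = \min_{x_n} \{ f_n(s_n, x_n) \}
           = \min_{x_n} \{ c(s_n, x_n) + f_{n+1}^*(s_{n+1}) \}
\]
```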

DETERMINISTIC DYNAMIC PROGRAMMING

The state at the next stage is completely determined by the state
and decision at the current stage.

The states $s_n$ might be representable by a discrete state variable (as
for the stagecoach problem) or by a continuous state variable.

The decision variables ($x_1, x_2, \ldots, x_N$) also can be either
discrete or continuous.

DISTRIBUTING MEDICAL TEAMS TO COUNTRIES
The World Health Organization (WHO) is devoted to improving health
care in the underdeveloped countries of the world. It now has five
medical teams available to allocate among three such countries to
improve their medical care.

The WHO needs to determine how many teams (if any) to allocate to
each of these countries to maximize the total effectiveness of the five
teams. The number allocated to each country must be an integer.

The measure of performance being used is additional person-years of
life.

What are the decision variables $x_n$ and the state variables $s_n$?

THE DISTRIBUTION OF EFFORT PROBLEM
The preceding example illustrates a particularly common type of
dynamic programming problem called the distribution of effort
problem.

Stage n: activity n (n = 1, 2, . . . , N).

$x_n$: amount of resource allocated to activity n.

State $s_n$: amount of resource still available for allocation to the
remaining activities.

When the system starts at stage n in state $s_n$, the choice of $x_n$
results in the next state at stage n + 1 being $s_{n+1} = s_n - x_n$.
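
The formulation above can be sketched for the medical teams example, with countries as stages, $x_n$ the number of teams sent to country n, and $s_n$ the number of teams still unallocated. The payoff table is not reproduced in this text, so the values in `BENEFIT` are hypothetical, and `f_star` is an illustrative name. Note that the code indexes stages from 0, whereas the slides index them from 1.

```python
from functools import lru_cache

# Hypothetical payoff table (assumption: the original data table is not available here).
# BENEFIT[n][x] = additional person-years of life (in thousands) if the country
# at index n receives x medical teams.
BENEFIT = [
    [0, 45, 70, 90, 105, 120],
    [0, 20, 45, 75, 110, 150],
    [0, 50, 70, 80, 100, 130],
]
TEAMS = 5


@lru_cache(maxsize=None)
def f_star(n, s):
    """Best (value, allocation) for countries n, n+1, ... with s teams still available."""
    if n == len(BENEFIT):
        return 0, ()  # no countries left: nothing more to gain
    best_value, best_plan = float("-inf"), ()
    for x in range(s + 1):  # decision x_n: teams given to country n
        tail_value, tail_plan = f_star(n + 1, s - x)  # next state s_{n+1} = s_n - x_n
        value = BENEFIT[n][x] + tail_value
        if value > best_value:
            best_value, best_plan = value, (x,) + tail_plan
    return best_value, best_plan


value, plan = f_star(0, TEAMS)
print(value, plan)
```

The recursion is written top-down with memoisation, which evaluates the same subproblems as the backward procedure described earlier; a bottom-up table over n and s would work equally well.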

DISTRIBUTING SCIENTISTS TO RESEARCH TEAMS
A government space project is conducting research on a certain
engineering problem that must be solved before people can fly safely
to Mars.

Three research teams are currently trying three different approaches
for solving this problem. It has been estimated that, under
present circumstances, the probability that the respective teams
(call them 1, 2, and 3) will not succeed is 0.40, 0.60, and 0.80,
respectively. Thus, the current probability that all three teams will fail
is 0.40 × 0.60 × 0.80 = 0.192. Because the objective is to minimize
the probability of failure, two more top scientists have been assigned
to the project.

The following table gives the estimated probability that the respective
teams will fail when 0, 1, or 2 additional scientists are added to that
team.

The problem is to determine how to allocate the two additional
scientists to minimize the probability that all three teams will fail.
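
This is a distribution of effort problem with a multiplicative objective: the probability that all teams fail is the product of the individual failure probabilities, so the recursion multiplies instead of adding. The sketch below is illustrative only. The first column of `P_FAIL` (0.40, 0.60, 0.80) comes from the text; the entries for 1 and 2 additional scientists are hypothetical, since the table is not reproduced here, and `min_failure` is an illustrative name.

```python
# P_FAIL[n][x] = probability that the team at index n fails with x additional scientists.
# Column x = 0 is taken from the text; columns x = 1 and x = 2 are hypothetical values.
P_FAIL = [
    [0.40, 0.20, 0.15],
    [0.60, 0.40, 0.20],
    [0.80, 0.50, 0.30],
]
SCIENTISTS = 2


def min_failure(n, s):
    """Smallest joint failure probability (and allocation) for teams n, n+1, ... with s scientists left."""
    if n == len(P_FAIL):
        return 1.0, ()  # empty product: no teams left to fail
    best = None
    for x in range(s + 1):  # decision x_n: scientists added to team n
        tail_p, tail_plan = min_failure(n + 1, s - x)
        p = P_FAIL[n][x] * tail_p  # failures are multiplied, not added
        if best is None or p < best[0]:
            best = (p, (x,) + tail_plan)
    return best


baseline = P_FAIL[0][0] * P_FAIL[1][0] * P_FAIL[2][0]  # 0.192 with no extra scientists
prob, plan = min_failure(0, SCIENTISTS)
print(round(baseline, 3), round(prob, 3), plan)
```

The only structural changes from the additive case are the product in the recursion and the boundary value of 1 instead of 0.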
