Output Data Analysis
Unit II
Modeling and Simulation
Topics to be discussed
Types of Simulation
In Simulation Output Data Analysis, there are three primary types of
simulations based on how output data is generated and analyzed.
These types determine the method of data collection, analysis
technique, and the interpretation of results. Understanding these
types helps in applying the right statistical tools and making accurate
decisions.
• Finite-Horizon Simulations
• Steady-State Simulations
• Regenerative Simulations
Terminating Simulation (Finite-horizon Simulation)
A terminating simulation (also known as a finite-horizon simulation) is a type of
simulation model that:
❑ Begins at a specific, defined initial state
❑ Ends at a predetermined stopping condition — such as a fixed time, event, or
number of transactions
❑ Focuses on short-term or event-driven behavior rather than long-term averages
It is most appropriate when the system naturally starts and stops, and the user is
interested in the performance within that timeframe.
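Below is a minimal sketch of a terminating run, not a model taken from these notes: a single-server queue that starts empty at time 0 and stops accepting arrivals after a 480-minute (8-hour) day. The arrival and service rates and the function name one_day_waits are illustrative assumptions.

```python
import numpy as np

def one_day_waits(arrival_rate=1.0, service_rate=1.2, close_time=480.0, seed=None):
    """One replication of a terminating single-server queue.

    The day starts empty at t = 0 and stops accepting arrivals at
    `close_time` (e.g., an 8 AM to 4 PM shift).  Rates are per minute and
    purely illustrative.  Returns the waiting time of every customer who
    arrived during the day, computed with Lindley's recursion.
    """
    rng = np.random.default_rng(seed)
    waits = []
    t, prev_wait, prev_service = 0.0, 0.0, 0.0
    while True:
        inter = rng.exponential(1.0 / arrival_rate)
        t += inter
        if t > close_time:                 # predetermined stopping condition
            break
        # Lindley: W_{n+1} = max(0, W_n + S_n - A_{n+1})
        wait = max(0.0, prev_wait + prev_service - inter)
        waits.append(wait)
        prev_wait = wait
        prev_service = rng.exponential(1.0 / service_rate)
    return np.array(waits)

# A single run can be misleading, so average over independent replications.
rep_means = [one_day_waits(seed=r).mean() for r in range(30)]
print("estimated mean wait over 30 replications:", round(float(np.mean(rep_means)), 3))
```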
Advantages of Terminating Simulation
1. Simplicity in Analysis
o No need to identify warm-up periods or steady states
o Clear start and stop points
2. Closer to Real-World Operations
o Many systems like hospitals, schools, stores, etc., operate on fixed-time cycles
o Easy to simulate shifts, work hours, or event-based tasks
3. Efficient Resource Usage
o Less computational time compared to long, steady-state runs
o Useful for simulations that don't require continuous operation
4. Clear Output Metrics
o Directly measures what happens within the time frame of interest
o Good for comparing operational strategies during the day/shift
Disadvantages of Terminating Simulation
1. Initial Condition Bias
o Starting from an empty or unrealistic state may bias the results
o Especially problematic in the early part of the simulation
2. High Variability
o Single runs may give misleading results due to randomness
o Multiple replications are needed for reliable output
3. Limited Scope
o Not suitable for long-run behavior or systems without natural stopping points
o Doesn't help in optimizing ongoing, continuous systems
4. May Miss Rare Events
o Short simulation time may not capture infrequent but important events (e.g., failures)
Applications / Use Cases
Domain         | Usage Example
Banking        | Simulating daily operations in a bank (8 AM to 4 PM)
Retail         | Customer flow during a sale day in a store
Hospitals      | Emergency Room performance over a night shift
Education      | School classroom scheduling for one day
Events         | Simulation of a conference or seminar session
Transportation | Airport passenger handling per flight arrival
Use terminating simulation when:
• The system has clear operational boundaries (start/end)
• You are interested in short-term performance
• The system is reset or restarted regularly (daily, weekly, etc.)
• Performance varies significantly between runs
Steady-State Simulation
A steady-state simulation is a type of simulation that:
• Focuses on the long-run average behavior of a system
• Does not have a natural ending point — it operates continuously over time
• Requires the system to reach a steady operating condition, where performance metrics become
stable (i.e., the effect of initial conditions fades out)
This simulation is used when we are more interested in understanding how the system behaves in the
long run, rather than during any specific time period.
Example: a 24/7 ATM machine simulation at a bank:
• Customers arrive at all hours of the day and night
• The simulation does not stop at a fixed time — it runs for a long time to capture average performance
• Metrics of interest include average waiting time, ATM utilization, number of customers in queue, etc.
Advantages of Steady-State Simulation
1. Long-Run Insights
o Helps understand system performance over time
o Useful for decision-making in continuously running systems (e.g., networks, production
lines)
2. More Realistic for Continuous Systems
o Reflects real-world systems that don’t stop or reset periodically
o Avoids artificial stop/start points
3. Reliable Averages
o Gives more stable and generalized results
o Reduces short-term variation
4. Helps Capacity Planning
o Ideal for studying long-term resource usage, bottlenecks, and service efficiency
Disadvantages of Steady-State Simulation
1. Warm-Up Period Requirement
o Initial part of the simulation is biased due to non-steady-state conditions
o Needs identification and removal of this transient period
2. Long Simulation Time
o May need to run for a very long time to reach and observe steady-state
o Computationally expensive
3. Complex Analysis
o Requires careful statistical treatment (e.g., batch means, time series plots)
o Initial results can be misleading if warm-up is not handled properly
4. No Fixed End
o Difficult to decide when to stop the simulation run
Use Cases / Applications
Domain             | Usage Example
Manufacturing      | Continuous production line performance
Banking            | ATM or teller queue performance over weeks/months
Telecommunications | Network traffic flow in routers or base stations
Healthcare         | ICU bed usage in hospitals operating 24/7
Call Centers       | Call handling systems in round-the-clock support centers
Transportation     | Toll booth operations or metro/train systems
Use steady-state simulation when:
• The system operates continuously without a natural end
• You're interested in average long-term performance
• Initial conditions are not important — focus is on typical behavior over time
• You need to optimize throughput, resource usage, or service levels
Regenerative Simulation
A regenerative simulation is a simulation model in which the system naturally returns to a specific state
(called a regeneration point) that restarts the system in a statistically identical manner. This repeated
pattern allows the system to be divided into independent cycles, known as regenerative cycles.
Each regenerative cycle behaves like a new simulation replication, allowing analysts to collect performance
data as independent observations without needing separate runs.
Key Characteristics
• Has identifiable points (regeneration points) where the system "restarts" statistically.
• Allows data from each cycle to be treated as independent and identically distributed (i.i.d.).
• No need for multiple simulation runs — just one long run with repeated regeneration points.
Example: a single-server queueing system (like an M/M/1 queue) where:
• Customers arrive randomly, and one server handles requests.
• A regeneration point occurs whenever the system becomes empty (i.e., no customers are left in queue or service).
• From that point onward, the queue starts afresh.
So, each time the system is empty, the simulation restarts a new cycle, and performance measures like wait time, queue
length, or utilization can be collected per cycle.
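A minimal sketch of that idea, assuming an M/M/1 queue with illustrative rates: the run is cut at every instant the system empties, per-cycle totals are recorded, and the long-run mean number in system is estimated by the classical ratio of cycle areas to cycle lengths.

```python
import numpy as np

def regenerative_mm1(lam=0.8, mu=1.0, n_cycles=2000, seed=1):
    """One long M/M/1 run split into regenerative cycles.

    A new cycle starts every time the queue empties.  For each cycle we
    record its length and the time-integral of the number in system; the
    long-run mean number in system is then the ratio estimator
    sum(areas) / sum(lengths).  Rates are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    areas, lengths = [], []
    for _ in range(n_cycles):
        # idle stretch until the arrival that starts the busy period
        t = rng.exponential(1.0 / lam)
        n, area = 1, 0.0
        while n > 0:
            # fresh exponential races are valid here because of memorylessness
            to_arr = rng.exponential(1.0 / lam)
            to_dep = rng.exponential(1.0 / mu)
            dt = min(to_arr, to_dep)
            area += n * dt
            t += dt
            n += 1 if to_arr < to_dep else -1
        areas.append(area)
        lengths.append(t)
    return np.sum(areas) / np.sum(lengths)

print("regenerative estimate of L:", round(regenerative_mm1(), 3))
print("analytical M/M/1 value rho/(1-rho):", 0.8 / (1 - 0.8))
```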
Stochastic Process and Sample Path
A stochastic process is a mathematical model that represents a system evolving randomly
over time. It is defined as a collection of random variables {X(t), t ∈ T},
where each random variable X(t) represents the state of the system at time t.
• In simulation, output data such as waiting time, queue length, or resource utilization
over time is inherently random and can be modeled using a stochastic process.
• This helps us understand how the output behaves under uncertainty and enables us
to compute statistical measures like mean, variance, confidence intervals, etc.
Stochastic Process and Sample Path
A sample path is one possible realization (i.e., one outcome or trajectory) of the stochastic process. It shows how the
state of the system evolves over time during a single simulation run.
• If the stochastic process is like the "script" of possible futures, then a sample path is like watching one movie based
on that script.
In output data analysis of simulations:
• A stochastic process provides a theoretical framework to describe how system output changes randomly over time.
• A sample path is a practical realization of that process, representing one possible evolution of the system under
specific random events.
By analyzing multiple sample paths, we can extract meaningful, statistically valid performance metrics, enabling
informed decision-making based on simulation results.
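As a tiny hedged illustration (a Poisson arrival process, not a specific model from these notes), each call below produces one sample path of the same stochastic process; averaging a quantity across many paths recovers its statistical behavior.

```python
import numpy as np

def sample_path(rate=2.0, horizon=10.0, seed=None):
    """One realization (sample path) of a Poisson arrival process:
    event times and the cumulative arrival count at each event."""
    rng = np.random.default_rng(seed)
    times = np.cumsum(rng.exponential(1.0 / rate, size=200))
    times = times[times <= horizon]
    return times, np.arange(1, len(times) + 1)

# The process is the "script"; each seed plays out a different "movie".
paths = [sample_path(seed=s) for s in range(5)]
counts_at_end = [counts[-1] if len(counts) else 0 for _, counts in paths]
print("arrivals by t = 10 in five sample paths:", counts_at_end)
print("mean across paths (theory: rate * horizon = 20):", np.mean(counts_at_end))
```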
Sampling and Systematic Errors
Output data analysis in modeling and simulation is fundamentally concerned with extracting meaningful
insights from stochastic simulation results.
The quality and reliability of these insights are directly influenced by two critical types of errors: sampling
errors and systematic errors. Understanding, identifying, and mitigating these errors is essential for producing
credible simulation results and making sound decisions based on simulation studies.
Sampling error arises when a subset (sample) of all possible outputs is used to
estimate system performance. Since simulation outputs are inherently random,
each run may yield slightly different results.
Sampling error refers to the variability in the estimator (like sample mean) due to
using only a finite number of replications or observations.
Sources of Sampling Errors
Random Number Generation: The use of pseudo-random number generators introduces variability in
simulation outputs. Different random number seeds produce different output sequences, leading to
sampling variability.
Finite Run Length: Simulations are necessarily finite in duration, meaning we observe only a limited
sample of the system's behavior. This truncation introduces sampling error, particularly in steady-state
analyses where we must distinguish between transient and steady-state behavior.
Limited Replications: Conducting only a small number of simulation replications increases sampling
error. The law of large numbers suggests that more replications reduce sampling variability, but practical
constraints often limit the number of runs.
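To make the law-of-large-numbers point concrete, the hedged sketch below treats each replication's output as a draw from an illustrative exponential distribution and shows the standard error of the sample mean shrinking roughly as 1/sqrt(n).

```python
import numpy as np

rng = np.random.default_rng(0)
true_mean = 5.0   # illustrative "true" performance measure

for n in (10, 100, 1000):
    # pretend each replication of the simulation returns one output value
    outputs = rng.exponential(true_mean, size=n)
    std_error = outputs.std(ddof=1) / np.sqrt(n)
    print(f"n={n:5d}  sample mean={outputs.mean():.3f}  standard error={std_error:.3f}")
```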
Types of Sampling Errors
Initialization Bias: This occurs when the simulation's initial conditions are not representative of the
steady-state behavior being studied. The system may require a warm-up period to reach steady state,
and including transient data in steady-state analyses introduces bias.
Correlation-Induced Bias: Successive observations in a single simulation run are often correlated,
violating the independence assumption required for standard statistical analyses. This autocorrelation can
lead to underestimated confidence intervals and incorrect conclusions about system performance.
Finite Sample Bias: When estimating parameters or performance measures from finite samples, the
estimates may be biased. This is particularly problematic when estimating rare event probabilities or
extreme quantiles.
Systematic error
• Systematic error, or bias, is a consistent, repeatable error that leads to inaccurate estimates, typically
due to faulty assumptions, poor model design, or incorrect methodology.
• Unlike sampling error, systematic error does not decrease with more replications — it affects all results
consistently in the wrong direction.
Sources of Systematic Errors
Model Conceptualization Errors: Incorrect assumptions about system behavior, inappropriate level of detail, or missing important system
components can introduce systematic bias. These errors stem from inadequate understanding of the real system being modeled.
Implementation Errors: Programming bugs, incorrect parameter values, or inappropriate algorithms can systematically bias simulation
results. These errors are particularly insidious because they may not be immediately apparent.
Numerical Precision Issues: Floating-point arithmetic limitations, accumulation of round-off errors, or inappropriate numerical methods
can introduce systematic bias, especially in long simulation runs.
Data Collection Bias: Systematic errors in data collection procedures, such as consistently measuring at inappropriate times or
under non-representative conditions, can bias output analysis.
Types of Systematic Errors
Discretization Errors: When continuous processes are approximated using discrete time steps or discrete state spaces,
systematic errors can arise from the approximation. Smaller time steps generally reduce these errors but increase
computational cost.
Algorithmic Bias: The choice of simulation algorithms can introduce systematic bias. For example, using inappropriate
event scheduling algorithms or incorrect statistical distributions can systematically affect results.
Boundary Condition Errors: Incorrect specification of system boundaries or boundary conditions can systematically
bias simulation results, particularly in spatial simulations or network models.
Aggregation Errors: When detailed system behavior is aggregated into summary statistics, information loss can
introduce systematic bias, especially if the aggregation method is inappropriate for the underlying data distribution.
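As a hedged illustration of discretization error (not drawn from these notes), the fragment below integrates dx/dt = -x with Euler steps: the bias relative to the exact solution shrinks with the step size but never averages out over replications, which is what makes it a systematic rather than a sampling error.

```python
import math

def euler_decay(step, horizon=1.0, x0=1.0):
    """Euler integration of dx/dt = -x; the error is purely systematic."""
    x, t = x0, 0.0
    while t < horizon - 1e-12:
        x += step * (-x)
        t += step
    return x

exact = math.exp(-1.0)
for step in (0.5, 0.1, 0.01):
    approx = euler_decay(step)
    print(f"step={step:<5} approx={approx:.5f}  systematic error={approx - exact:+.5f}")
```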
Mean, Standard Deviation, and Confidence Interval
In the field of Modeling and Simulation, analyzing the output data accurately is essential to draw valid
conclusions about the system being studied.
Three important statistical tools widely used for this purpose are:
• Mean,
• Standard Deviation, and
• Confidence Interval (CI).
These measures help quantify the central tendency, variability, and reliability of simulation results.
•Mean - Provides central tendency but needs context from variability measures
•Standard Deviation - Quantifies system stability and predictability
•Confidence Intervals - Account for sampling uncertainty in estimates
Mean
The mean is the arithmetic average of the output values collected from a simulation. It gives a measure of the
central value or expected performance of the system.
Use in Simulation:
• Estimate average queue length
• Predict mean server utilization
• Measure average production rate
Standard Deviation (SD)
The standard deviation measures the variability or spread of the simulation output around the mean. A higher
SD indicates that the data is more spread out, while a lower SD means the data points are closer to the mean.
If the output values are more spread (e.g., 5, 10, 15), the SD will be higher than tightly clustered values (e.g., 9,
10, 11).
Use in Simulation:
•Assess reliability of results
•Understand risk or uncertainty
•Compare consistency across models
Confidence Interval (CI)
A Confidence Interval gives a range of values within which the true mean of the system's output is likely to lie
with a specified probability (typically 95% or 99%).
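For n independent output observations with sample mean $\bar{x}$ and sample standard deviation $s$, the standard 100(1 - α)% confidence interval is

$$\bar{x} \;\pm\; t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}}$$

where $t_{\alpha/2,\,n-1}$ is the Student-t critical value with n - 1 degrees of freedom; more replications or lower variability give a narrower interval.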
Mean, Standard Deviation, and Confidence Interval
In Modeling and Simulation, Mean, Standard Deviation, and Confidence Interval are critical for
analyzing and interpreting output data. They allow modelers to:
•Assess performance metrics
•Evaluate system stability
•Make informed and confident decisions
Analysis of Finite-Horizon Simulation
• A Finite-Horizon Simulation refers to a simulation run that is explicitly bounded by time or event count.
• That is, the simulation begins at a fixed point (typically time 0 or event 1) and ends at a predefined time limit,
event limit, or stage.
• This type of simulation is often used when studying systems or processes over a limited period — such as
evaluating daily patient admissions in a hospital, analyzing a batch manufacturing process, or simulating
one business day of a bank.
• Finite-horizon simulations are particularly useful when the objective is to understand short-term system
behavior.
• Since these simulations have a definite endpoint, the output data can be analyzed within that window using
well-defined statistical techniques.
• However, due to their bounded nature, variance in results can be high, and careful design of the experimental
method becomes essential to produce reliable insights.
Analysis of Finite Horizon Simulation
We would like to analyze the output of a simulation with the following properties:
– Simulation starts in a specific initial state.
– Runs until some termination event occurs.
– Life-time of process simulated is finite.
Single Run
Independent Replications
Finite Horizon Simulation – Mean, SD & Variance
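A minimal sketch of the usual across-replication calculation: each independent replication contributes one summary value, and the mean, standard deviation, and a t-based 95% confidence interval are computed from those values. The run_replication stand-in is an assumption; in practice it would be the finite-horizon model itself.

```python
import numpy as np
from scipy import stats

def run_replication(seed):
    """Stand-in for one finite-horizon replication; here it just returns an
    illustrative random daily average (replace with the real model's output)."""
    rng = np.random.default_rng(seed)
    return rng.exponential(5.0)

R = 20                                           # number of independent replications
y = np.array([run_replication(r) for r in range(R)])

mean = y.mean()
sd = y.std(ddof=1)                               # sample standard deviation
half_width = stats.t.ppf(0.975, R - 1) * sd / np.sqrt(R)

print(f"mean = {mean:.3f}, sd = {sd:.3f}")
print(f"95% CI = ({mean - half_width:.3f}, {mean + half_width:.3f})")
```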
Sequential Estimation:
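One common reading of sequential estimation is: keep adding replications until the confidence-interval half-width drops below a target precision. The sketch below assumes that stopping rule and the same kind of illustrative replication function.

```python
import numpy as np
from scipy import stats

def run_replication(seed):
    """Illustrative stand-in for one finite-horizon replication."""
    return np.random.default_rng(seed).exponential(5.0)

target_half_width = 0.5
y = [run_replication(r) for r in range(5)]       # a small pilot set first

while True:
    n = len(y)
    sd = np.std(y, ddof=1)
    half_width = stats.t.ppf(0.975, n - 1) * sd / np.sqrt(n)
    if half_width <= target_half_width or n >= 10_000:
        break
    y.append(run_replication(n))                 # add one more replication

print(f"stopped after {len(y)} replications, half-width = {half_width:.3f}")
```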
Analysis of Steady-State Simulations:
We would like to analyze the long-term behavior of the system of interest by examining its steady-state parameters.
Removal of Initial Bias (warm-up Interval)
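The goal of this step is to pick a truncation point beyond which the influence of the artificial initial state has died out. One widely used heuristic is Welch's moving-average plot; the sketch below assumes the output of R replications is stored as rows of a NumPy array and computes the averaged, smoothed series whose flattening point suggests the warm-up length.

```python
import numpy as np

def welch_curve(outputs, window=5):
    """Welch-style smoothed mean curve for choosing a warm-up point.

    `outputs` is an (R, m) array: R replications, m observations each.
    The column averages are smoothed with a centered moving average; the
    warm-up period ends roughly where this curve levels off.
    """
    col_means = outputs.mean(axis=0)                     # average across replications
    kernel = np.ones(2 * window + 1) / (2 * window + 1)
    return np.convolve(col_means, kernel, mode="valid")  # smoothed curve to inspect

# Illustrative data: a transient that decays toward a steady level of 4.0
rng = np.random.default_rng(0)
R, m = 10, 200
transient = 6.0 * np.exp(-np.arange(m) / 20.0)
outputs = 4.0 + transient + rng.normal(0, 0.5, size=(R, m))

curve = welch_curve(outputs)
print("smoothed means (every 20th point):", np.round(curve[::20], 2))
```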
Replication - Deletion Approach
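Sketch of the replication-deletion recipe under its usual textbook description: make R independent replications, delete the first l (warm-up) observations of each, average what remains in each run, and analyze those averages exactly as in the finite-horizon case. The stand-in data generator is illustrative.

```python
import numpy as np
from scipy import stats

def steady_state_replication(seed, m=1000):
    """Illustrative stand-in: m observations with a decaying transient."""
    rng = np.random.default_rng(seed)
    transient = 6.0 * np.exp(-np.arange(m) / 50.0)
    return 4.0 + transient + rng.normal(0, 1.0, size=m)

R, warmup = 10, 200                      # l = 200, chosen from a Welch plot, say
y_bars = []
for r in range(R):
    run = steady_state_replication(r)
    y_bars.append(run[warmup:].mean())   # delete warm-up, average the rest

y_bars = np.array(y_bars)
half = stats.t.ppf(0.975, R - 1) * y_bars.std(ddof=1) / np.sqrt(R)
print(f"steady-state mean ~ {y_bars.mean():.3f} +/- {half:.3f}")
```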
Batch-Means Method
• One of the approaches that tries to overcome the drawbacks of the replication-deletion method.
• Owes its popularity to its simplicity and effectiveness.
• Types:
  o Classical Batch Means Method
  o Overlapping Batch Means Method
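A minimal sketch of the classical batch-means method, assuming a single long run stored as a NumPy vector with warm-up already removed: the run is cut into k non-overlapping batches and the batch means are treated as approximately i.i.d. observations.

```python
import numpy as np
from scipy import stats

def classical_batch_means(x, k=20):
    """Classical (non-overlapping) batch means on one long output series x."""
    m = len(x) // k                       # batch size
    batches = x[: k * m].reshape(k, m)    # drop any leftover observations
    means = batches.mean(axis=1)
    grand = means.mean()
    half = stats.t.ppf(0.975, k - 1) * means.std(ddof=1) / np.sqrt(k)
    return grand, half

# Illustrative autocorrelated output: AR(1) around a mean of 4.0
rng = np.random.default_rng(0)
x = np.empty(20_000)
x[0] = 4.0
for i in range(1, len(x)):
    x[i] = 4.0 + 0.8 * (x[i - 1] - 4.0) + rng.normal(0, 1.0)

mean, half = classical_batch_means(x[1000:])   # crude warm-up deletion first
print(f"batch-means estimate: {mean:.3f} +/- {half:.3f}")
```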
Overlapping Batch-means Method
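In the overlapping variant, a batch of length m starts at every observation, so consecutive batches share data; the batch means are correlated, but the variance estimator built from them is usually more stable for the same run length. A hedged sketch of the standard OBM variance estimator (scaling constant as commonly stated in textbooks):

```python
import numpy as np

def obm_variance_of_mean(x, m):
    """Overlapping-batch-means estimate of Var(sample mean of x).

    Every sliding window of length m is a batch, giving n - m + 1 batches.
    Uses the usual OBM scaling n*m / ((n - m + 1) * (n - m)).
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    window_means = np.convolve(x, np.ones(m) / m, mode="valid")  # all length-m batch means
    s2 = np.sum((window_means - x.mean()) ** 2)
    sigma2_hat = n * m * s2 / ((n - m + 1) * (n - m))   # estimates n * Var(sample mean)
    return sigma2_hat / n

# Sanity check on i.i.d. data: the answer should be close to sigma^2 / n.
rng = np.random.default_rng(1)
x = rng.normal(4.0, 1.0, size=10_000)
print("OBM estimate of Var(sample mean):", obm_variance_of_mean(x, m=100))
print("i.i.d. reference value 1/n      :", 1.0 / 10_000)
```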