An Interactive Spreadsheet-Based Tool To Support Teaching Design of Experiments
INFORMS Transactions on Education
Vol. 8, No. 2, January 2008, pp. 55–64
ISSN 1532-0545
DOI 10.1287/ited.1080.0008
© 2008 INFORMS
Additional information, including supplemental material and rights and permission policies, is available at http://ite.pubs.informs.org.
This paper describes an interactive spreadsheet-based tool that can be used to generate data representative
of the type that might be obtained running a structured set of experiments. The purpose of this tool is to
help the user experience the iterative nature of design and analysis of experiments. The tool supports quick
and simple generation of data for one- and two-factor problems. The underlying relationships are based on
queuing approximations for a single-stage batch production environment. Factor levels are related to product
lot sizes and the response is assumed to be average lot flowtimes. Variability due to replication is emulated by
sampling from a statistical distribution. Statistical software packages can be used to generate linear or quadratic
models from the results generated. Analysis can include the examination of main and interaction effects or the
optimization of lot sizes to minimize flowtimes.
Key words: design of experiments, central composite design (CCD), response surface methods
History: Received: July 5, 2005; accepted: January 23, 2006. This paper was with the authors 3 months for
2 revisions.
strong argument for the need to teach sequential
for single or two-factor scenarios. Mock experiments can be quickly set up, evaluated and then altered for another round of design and analysis.
Figure 1 Single Stage Production with Batch Arrivals
[Figure: lots arrive at a single machine and depart; the arrival stream is labeled ca, the machine cs, and the departure stream cd.]
Qi = Lot size for part type i
ca = Lot interarrival time coefficient of variation
Q2 = ???
cs = Lot service time coefficient of variation
cd = Lot interdeparture time coefficient of variation
2. The Approach
The basic idea behind this approach is to use a set of quantitative relationships in an underlying model that exhibit the desired type of behavior. These relationships should ideally exhibit linear or nonlinear behavior, depending on the factor settings. As well, they should support the demonstration of interaction effects. Furthermore, if the relationships are not well known or understood it makes the problem context more palatable with respect to the need for an experimental solution.
The problem context implemented in this spreadsheet model is that of lot size selection in a batch production system. A big problem in manufacturing is establishing good lot (or batch) sizes for production. In batch production facilities it is common to have multiple part types processed on the same machine (or resource). These are capacity-constrained machines that can process only one part type at a time. It is common for each part type to have unique part processing time, lot size and lot setup time characteristics. The machine is typically set up for one particular part type and then a lot of parts is processed. The lot processing time is equal to the part processing time multiplied by the lot size. The lot service time is the lot processing time plus the lot setup time.
The arrival of lots of different part types is typically stochastic so lot flowtime behavior can be modeled using queuing relationships. If a lot of parts arrives and the machine is busy, the lot will have to wait in queue. It is normal to assume that lots in the queue will be processed first-come, first-served (FCFS). If the lot sizes are too small, there will be many setups incurred and the utilization, defined as the proportion of time the machine is busy being set up or processing parts, will be high. The result is that the average number of lots waiting for processing may be high. This means the average lot flowtime, defined to be the lot queue time plus lot service time, will also be very high. This drives up total manufacturing times and inventory costs. If the lot sizes are too large, the machine utilization will be lower but the machine will be committed to producing one part type for a long period of time. This will again result in more work in queue and an increase in flowtimes.
Lot flowtime behavior is convex with respect to lot sizes. A good objective is to minimize the weighted mean lot flowtimes, W, across all part types by selecting the best lot sizes, Qi, for each part type i. Since all part types have unique characteristics, the best lot size combinations are affected by their relative production characteristics as well as the variability of lot interarrivals. Therefore, the problem can be viewed as one of lot size optimization to be solved using response surface methods. The analytical relationships used to describe this problem are approximate and are not well known. Therefore, this is the type of problem that might well lend itself to experimentation.
The model embedded in this spreadsheet tool assumes a lot size selection scenario where a single machine is being used to produce batches of two part types. The configuration of interest is illustrated by Figure 1. The lot flowtime relationships embedded in the spreadsheet model are given in the appendix. These relationships may be of interest to operations management or industrial engineering students. However, it is not essential to know anything about the underlying relationships in order to use this tool if the primary interest is to learn DOE methodology. In other words, experimentation can be done in a context-free manner.
3. The Spreadsheet Model
The spreadsheet implementation is designed to be simple, transparent, and easily modified. An Excel® workbook¹ serves as the user interface for specifying inputs as well as for extracting the experimental results. The user inputs are specified in the Inputs worksheet shown as Figure 2. The colored cells are user-defined inputs.
¹ An example of such a workbook (DOE_Tools.xls) can be found at http://ite.pubs.informs.org/.
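The lot-size trade-off described in Section 2 can be made concrete with a small calculation. The following is a minimal sketch for a single part type, not the workbook's own formulas: the function names and all numerical values (demand of 40 parts per hour, 0.02 hours per part, 0.05 hours per setup) are hypothetical. It assumes that lots arrive at a rate equal to the demand rate divided by the lot size, so that utilization is the lot arrival rate multiplied by the lot service time.

```python
def lot_service_time(lot_size, part_time, setup_time):
    """Lot service time = lot processing time (part time x lot size) + setup time."""
    return part_time * lot_size + setup_time


def utilization(demand_rate, lot_size, part_time, setup_time):
    """Proportion of time the machine is busy setting up or processing.

    Assumes lots arrive at rate demand_rate / lot_size and each lot
    occupies the machine for one lot service time.
    """
    lots_per_unit_time = demand_rate / lot_size
    return lots_per_unit_time * lot_service_time(lot_size, part_time, setup_time)


# Hypothetical scenario: 40 parts/hour demand, 0.02 h per part, 0.05 h per setup.
DEMAND, PART_TIME, SETUP_TIME = 40.0, 0.02, 0.05

for q in (10, 25, 50, 100, 200):
    s = lot_service_time(q, PART_TIME, SETUP_TIME)
    u = utilization(DEMAND, q, PART_TIME, SETUP_TIME)
    print("Q=%4d  lot service time=%.2f h  utilization=%.3f" % (q, s, u))
```

Small lots push utilization toward 1 because of frequent setups, while large lots lower utilization but lengthen each lot's service time, which is the tension that makes flowtime convex in lot size.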
Enns: An Interactive Spreadsheet-Based Tool to Support Teaching Design of Experiments
INFORMS Transactions on Education 8(2), pp. 55–64, © 2008 INFORMS 57
Cells in Range D10:E13 specify the production scenario. The user must specify the demand per unit time for each of the part types, D, the time required to set up the machine for each specific part type, τ, the production rate per unit time, P, and the production lot size, Q. Note that P is the production rate when the machine is steadily processing the specified part type. Since there will be some idle time as well as time for setups, the value of P must be larger than D in order to have a stable system in which all demand is met.
The final user input describing the production scenario is in Cell D6. This value specifies the amount of variability in the stream of lot arrivals to the machine. It is expressed as a coefficient of variation, defined to be the standard deviation of interarrival times divided by the mean interarrival time. A higher value indi-
E13 can be used as inputs to estimate behavior while designing the experiment, but the values in these cells are overwritten when the experiment is run.
The cells in Range I6:J7 are used to specify the low and high factor settings for two-factor, two-level (2^2) experimental designs. These are referred to as factorial design points. To run a single factor experiment, one would set the low (−1) and high (+1) settings equal for one of the lot sizes.
The cells in Range I11:I13 are used to specify the experimental design. These inputs represent the num-
Figure 3 CCD with Coded Variables
[Figure: CCD design points in coded variables; the Q2 axis includes the axial point (0, +1.41).]
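The relationship between coded and actual factor settings in a two-factor central composite design can be sketched as follows. This is an illustrative sketch, not the workbook's VBA: the function names are hypothetical, the axial distance of sqrt(2) (about 1.41) is taken from the axial point shown in Figure 3, and the lot size ranges used in the example are assumed values chosen only for demonstration.

```python
from itertools import product
from math import sqrt


def coded_to_actual(coded, low, high):
    """Map a coded level (-1 = low, +1 = high) to an actual lot size."""
    center = (low + high) / 2.0
    half_range = (high - low) / 2.0
    return center + coded * half_range


def ccd_design(q1_range, q2_range, alpha=sqrt(2.0)):
    """Return rows (coded_q1, coded_q2, actual_q1, actual_q2) for a two-factor CCD."""
    factorial = list(product((-1.0, 1.0), repeat=2))      # 2^2 factorial points
    axial = [(-alpha, 0.0), (alpha, 0.0), (0.0, -alpha), (0.0, alpha)]
    center = [(0.0, 0.0)]
    rows = []
    for c1, c2 in factorial + axial + center:
        rows.append((c1, c2,
                     coded_to_actual(c1, *q1_range),
                     coded_to_actual(c2, *q2_range)))
    return rows


# Assumed low/high lot sizes for the two part types (illustrative only).
design = ccd_design(q1_range=(150.0, 200.0), q2_range=(200.0, 300.0))
for row in design:
    print("coded=(%+.2f, %+.2f)  Q1=%.1f  Q2=%.1f" % row)
```

Replications, which the tool lets the user specify, would simply repeat rows of this list; setting the low and high values equal for one factor collapses the design to a single-factor experiment.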
Figure 4 CCD with Actual Variables and Replications
ation of the observed lot queue times, Wq, when mul-
The macros can be accessed by going into the Visual Basic Editor, located under Tools and then Macro. Figure 6 shows a view from within the VB Editor. In this figure the Project Explorer is visible and the contents of the workbook are shown in the upper left-hand window. If this window is not visible, it can be accessed within the VB Editor by going into View and then activating the Project Explorer. The Project Explorer window should include the atpvbaen.xls file and “funcres” references. If these are not present, they can be added within the VB Editor to a list found under Tools and then References. This list should also include VBA and the Microsoft Excel Object Library as references.
All of the VBA code in the workbook is contained within a module called Experimental Design. This module contains three macros, called ExpDesign, Experiment, and Sort. Part of the code in these macros is shown in the large window of Figure 6. It is not necessary to understand this code unless the user wishes to modify it. However, a brief description is given as follows.
The “Run Experiment” button activates the ExpDesign macro. This macro systematically chooses points in the experimental design and writes the coded values for the design points into Columns A and B of the Outputs worksheet. As well, appropriate lot size values based on the experimental design point are written into Columns C and D of the Outputs worksheet. For each design point processed, the Experiment macro is also called. This macro writes the values for the current design point into Cells D13 and E13 of the Inputs worksheet. It then calculates the machine utilization rate, average lot service time, average lot queue time, and average lot flowtime for the given observation using the formulas in Range D16:D20 and writes these values sequentially into rows in the Outputs worksheet.
This macro also uses the coefficient of variation in Cell D5 of the Inputs worksheet to specify a multiplier for adjusting the calculated queue time to come up with an observed queue time. A normal distribution is used in generating this multiplier. Once data has been generated for each of the design points, the experimental output is randomized. This occurs at the end of the ExpDesign macro. The “Sort” button, which activates the Sort macro, in the Outputs sheet can then be used to put the data back in a structured form.
In order to analyze the experimental results generated, the appropriate columns from the Outputs worksheet can be copied into statistical analysis software, such as Minitab® or Design-Expert® (Montgomery 2001). In some cases the user of these statistical packages must specify the desired experimental design and then the software will automatically generate worksheet columns showing appropriate factor settings. This means the user must first generate the experimental design before the responses, W, found in column H can be copied and pasted into the analysis worksheet.
is not. Furthermore, it could be easily observed that the experimental region selected is unlikely to contain the lot size combination yielding minimum flowtimes and that the ranges should be moved so both lot sizes are reduced. Interaction or surface plots could be used to verify this.
Another set of experiments would then be run. The steepest-descent algorithm could be used in determin-
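The randomization of the experimental output, and its later restoration to a structured form by the Sort macro, can be sketched in a few lines. This is a hedged illustration rather than a transcription of the workbook's VBA: the function names and the five-point example design are hypothetical, and the approach of tagging each row with a standard-order index before shuffling is an assumption about one reasonable way to make the sort reversible.

```python
import random


def randomize_run_order(rows, seed=None):
    """Return (standard_order, row) pairs shuffled into a random run order."""
    rng = random.Random(seed)
    tagged = [(std_order, row) for std_order, row in enumerate(rows, start=1)]
    rng.shuffle(tagged)
    return tagged


def sort_to_standard_order(tagged_rows):
    """Put randomized rows back into their structured (standard) order."""
    return sorted(tagged_rows, key=lambda item: item[0])


# Hypothetical coded design: 2^2 factorial plus one center point.
design_rows = [(-1, -1), (1, -1), (-1, 1), (1, 1), (0, 0)]

run_sheet = randomize_run_order(design_rows, seed=42)
restored = sort_to_standard_order(run_sheet)

for std_order, row in run_sheet:
    print("standard order %d: coded point %s" % (std_order, row))
```

Keeping the standard-order index alongside each observation is what allows the data to be pasted into a statistical package whose design worksheet was generated in standard order.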
Downloaded from informs.org by [182.74.116.110] on 01 July 2015, at 00:51 . For personal use only, all rights reserved.
Figure 8 Minitab Interaction Plot
[Figure: interaction plot (data means) for W, with lines for Q1 = 150 and Q1 = 200.]
rerun with 10 additional center points, the curvature will be shown to be statistically significant (not shown).
A final step would be to add axial points. Figure 5 shows the Outputs worksheet obtained when rerunning the experiment as a CCD with factorial and axial points replicated twice and with 10 center point repli-
[Figure: residual plots with four panels: percent versus residual, residual versus fitted value, frequency versus residual, and residual versus observation order.]
it would be necessary to shift the design in the appropriate direction and make further attempts to fit a model that will identify a minimum.
5. Discussion and Conclusions
This spreadsheet-based tool has been used effectively in laboratory and homework exercises in DOE elective courses for engineering undergraduate and graduate students. Students are given a handout describing the software, problem and relationships. The information provided is similar to that given in this paper. They are then asked to provide a written report for a given production scenario, showing their path of analysis, final results and conclusions. As well, they are asked to explore the effects of setup time reduction if setup times are cut in half. This requires finding new optimal lot size combinations and determining the impact on performance.
The exercise has been given to students with and without providing an instructional computer laboratory. In general, a hands-on computer lab in which students are guided through the solution path for an
As well, students sometimes have difficulty initially accepting that quite different lot size combinations are selected by different individuals or groups attempting to find the optimal. This happens because the response surface may be very flat around the optimum. It is valuable for them to observe that while the lot size combinations may be different, the predicted lot flowtimes are nearly equal.
In summary, more training of students with appropriate skills in design of experiments and response surface methodologies is clearly required.
Figure 11 Contour Plot Showing Optimal
[Figure: contour plot of W vs Q2, Q1, with W contour bands < 0.20, 0.20 – 0.22, 0.22 – 0.24, 0.24 – 0.26 and 0.26 – 0.28.]
Figure 12 Surface Plot
[Figure: surface plot of W vs. Q2, Q1.]
lot setup and processing times, is given by the following:
\[ \bar{x} = \frac{\sum_{i=1}^{m} (D_i/Q_i)\,(\tau_i + Q_i/P_i)}{\sum_{i=1}^{m} D_i/Q_i} \]
where $D_i$ is the demand rate, $P_i$ is the production rate, $Q_i$ is the lot size, and $\tau_i$ is the lot setup time for part type $i$. Similarly, the utilization rate, $\rho$, is determined by the following:
\[ \rho = \sum_{i=1}^{m} \frac{D_i}{Q_i}\left(\tau_i + \frac{Q_i}{P_i}\right) = \sum_{i=1}^{m} \left(\frac{D_i}{P_i} + \frac{D_i \tau_i}{Q_i}\right), \qquad 0 \le \rho < 1 \]
Note that in analysis using rapid modeling relationships, $\rho$ is usually constrained to be 0.95 or less. Although the
obtained for any given replication is determined by the following:
\[ W' = W_q' + \bar{x} = W_q + W_q^{cp}\,\bigl[N(1, CV) - 1\bigr] + \bar{x} \]
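The appendix relationships for the mean lot service time, the utilization rate and the observed flowtime can be checked numerically. The sketch below is illustrative only. The function names and the two-part scenario values are hypothetical, the queue time values passed to the noise function are placeholders because the queue time approximation itself is not reproduced in this excerpt, and N(1, CV) is read here as a normal distribution with mean 1 and standard deviation CV, which is an assumption about the notation.

```python
import random


def utilization(D, P, Q, tau):
    """Utilization: sum over part types of (D_i / Q_i) * (tau_i + Q_i / P_i)."""
    return sum(d / q * (t + q / p) for d, p, q, t in zip(D, P, Q, tau))


def mean_lot_service_time(D, P, Q, tau):
    """Mean lot service time, weighting each part type by its lot arrival rate D_i / Q_i."""
    lot_rates = [d / q for d, q in zip(D, Q)]
    weighted = sum(r * (t + q / p) for r, p, q, t in zip(lot_rates, P, Q, tau))
    return weighted / sum(lot_rates)


def observed_flowtime(wq, wq_center, x_bar, cv, rng):
    """Observed flowtime: calculated queue time, plus noise scaled by the
    center-point queue time, plus mean lot service time."""
    multiplier = rng.gauss(1.0, cv)  # assumed reading of N(1, CV)
    return wq + wq_center * (multiplier - 1.0) + x_bar


# Hypothetical two-part scenario (rates per hour, setup times in hours).
D = [40.0, 30.0]
P = [100.0, 120.0]
Q = [150.0, 250.0]
tau = [0.5, 0.4]

rho = utilization(D, P, Q, tau)
x_bar = mean_lot_service_time(D, P, Q, tau)
print("utilization = %.4f, mean lot service time = %.4f h" % (rho, x_bar))

# Placeholder queue times: wq at the current design point, wq_center at the center point.
rng = random.Random(1)
samples = [observed_flowtime(0.9, 1.0, x_bar, 0.05, rng) for _ in range(5)]
print(["%.3f" % w for w in samples])
```

Because the noise term is scaled by the center-point queue time rather than by the queue time at the current design point, the spread of the generated observations does not depend on the factor settings within a design.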
Since the amount of expected uncertainty is always based on the center point of the design, it will be the same at all factor settings within a given design. This helps ensure the ANOVA assumption of equal variances at all combinations of factor settings will not be violated.
References
Box, G. E. P. 1999. Statistics as a catalyst to learning by scientific method part II—A discussion. J. Quality Tech. 31(1) 16–29.
Box, G. E. P., P. Y. T. Liu. 1999. Statistics as a catalyst to learning by scientific method part I—An example. J. Quality Tech. 31(1) 1–15.
Buzacott, J. A., J. G. Shanthikumar. 1993. Stochastic Models of Manufacturing Systems. Prentice-Hall, Englewood Cliffs, NJ.
Hopp, W. J., M. L. Spearman. 2001. Factory Physics. McGraw-Hill, Boston.
Montgomery, D. C. 2001. Design and Analysis of Experiments, 5th ed. John Wiley and Sons, New York.
Whitt, W. 1983. The queueing network analyzer. Bell Systems Tech. J. 62(9) 2779–2813.