Advantages of Bootstrap Forest For Yield Analysis

The white paper discusses the advantages of using the bootstrap forest method for yield analysis in the semiconductor industry, emphasizing the need for efficient manufacturing processes while maintaining quality. It highlights the challenges yield engineers face in identifying variations in yield due to multiple factors and the limitations of traditional techniques. The paper presents practical examples and case studies demonstrating the effectiveness of partitioning techniques, particularly the bootstrap forest, for root-cause analysis in yield management.

Advantages of Bootstrap Forest for Yield Analysis

White Paper
SAS White Paper

Table of Contents
Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
Description of various control stages. . . . . . . . . . . . . . . . . . . . . . . . . . . 2
Online control. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
Parametric test. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
EWS (electrical wafer sort). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
Analytical methodologies. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
Problem classification. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
Data preparation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
Statistical analysis methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
Traditional methods. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
Multivariate analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
Principal component analysis. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
Predictive modeling. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
Regression. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
Partitioning. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
The bootstrap forest. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
The boosted tree. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
Neural networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
Validation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
Case studies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
Case study 1: Root-cause identification (smile signature). . . . . . . . . . 15
Description of problem and impact . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
Data. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
Analysis. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
Conclusion. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
Case study 2: Correlation between EWS and PT parameter. . . . . . . . 22
Description of problem and impact . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
Data. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
Analysis. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
Conclusion. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
References. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28

Youssef Baltagi, Data Analyst Engineer, STMicroelectronics Rousset

Florence Kussener, JMP Senior Systems Engineer, SAS



Introduction
The world of the semiconductor industry is becoming increasingly competitive and
forcing manufacturers to achieve significant reductions in time to market. As a result,
every step in the manufacturing process needs to be completed in less time while
maintaining a high level of control and quality. This must be accomplished based on
an increasing number of requirements to satisfy customer demands, particularly for
products aimed at the automotive industry and the medical sector.

In spite of the efficiency of the tools available to engineers to control manufacturing
processes that consist of hundreds of individual steps, several additional tests are
needed to identify defective parts coming out of a production site. Yield – defined as
the number of good chips over the number of chips tested – is an important parameter
that allows yield engineers to monitor the final quality of their products. Yield also has
an impact on the cost of manufacturing a product, which is why it is essential both to
maximize performance and limit any variations.

Explaining any variation in yield is a challenge for the yield engineer insofar as there can
be multiple causes of variability, including variations in process parameters, unexpected
and unmonitored manufacturing events, defective equipment, etc. The engineer must
have access to high-performance tools and methods that make it possible to get to the
root of the problem as quickly and reliably as possible. Given the significant quantity of
data collected at every stage of a manufacturing lot and the sometimes-limited statistics
available to describe the problem, traditional techniques are not always adequate for
resolving the issues faced by the yield engineer. For some years, data mining has proved
a highly effective supplement to these techniques.

In this paper, we will use a number of practical examples to demonstrate the use of
partitioning techniques. In particular, we will focus on the “bootstrap forest” method
available in JMP® Pro for root-cause analysis. Sections two and three explain the
problem and describe the techniques used; we will then present the results obtained
using two case studies. The first involves using partitioning for root-cause identification
in the case of a loss of electrical yield that is not detected during the manufacturing
process; the second examines a variation in electrical yield detected during
manufacturing.


Problem
Description of various control stages

Online control
The volume and complexity of data in the semiconductor industry are significant.
Creating the final product involves a manufacturing process made up of several hundred
steps. These are divided into macro steps involving areas including photolithography,
etching, implantation and CMP (chemical mechanical polishing). A distinction is drawn
between FEOL (front end of line) steps, which define the active parts of transistors, and
BEOL (back end of line) steps, which establish the interconnections between different
transistors using contacts and wire lines.

All of these steps are controlled online, either in real time or after the event, with the help
of the traditional techniques used by process control: SPC (statistical process control)
and APC (advanced process control). Metrological data is gathered as part of the control
process with respect to processing equipment (chamber temperature, pressure, etc.),
along with physical data measured on a lot-by-lot basis, such as thickness, dimensions,
physicochemical characteristics, etc. All data is recorded in EDA (engineering data
analysis) databases and must comply with the specifications agreed when defining the
processes and equipment; any variation must be individually analyzed at the relevant
point. In spite of this, physical measurements are not sufficient to ensure that the
manufacturing process has been completed correctly: Defects that are not identified by
physical measurements may appear when components are electrically activated.


There are two additional steps used to sort chips at the end of the manufacturing
process: the parametric test and the electrical wafer sort, both electrical tests. The
first of these relates to elementary components (transistors, resistors, capacitors, etc.),
which are produced on cutting lines and placed on a wafer in N geographical areas
(sampling of N = 5, 9, 17 areas). In the second case there is no sampling: Functional
measurements are carried out on all chips. There is also one final sorting step carried
out at assembly sites, known as a final test, which is used to eliminate chips that are
found to be defective after they have been placed in a casing. The entire cycle can be
summarized in Figure 1.

Figure 1. An example of the manufacturing process for a semiconductor component
(from silicon to final testing).


Parametric test
The parametric test is carried out at the end of the manufacturing process and involves
electrical testing of the elementary components (resistors, transistors, capacitors, etc.) in
structures known as TEGs (test element groups). These are repeated at several points
on the wafer to measure its uniformity (see Figure 2). Several parameters are measured
for these components. They are aggregated at several different levels, including
individual values per site, average per wafer, average per lot, etc. All of this information is
recorded in databases. At this stage, wafers that do not comply with specifications are
rejected.

Figure 2. Illustration of test modules used for parametric tests.

EWS (electrical wafer sort)


Test Flow

The electrical wafer sort is an electrical sorting step to ensure that all chips are
electrically functional in accordance with the client’s specifications. Figure 3 illustrates
the EWS test process. Each chip is subjected to a sequence of tests combined in
various subtests (or subprograms) identified by a whole number known as a BIN. If
a particular subtest fails to meet the criteria, the test sequence is stopped and the
chip (identified by its X,Y coordinates on the wafer) is allocated the corresponding BIN
number. By default, a chip that passes all the tests successfully is allocated BIN 1.


Figure 3. Test sequence diagram.

Wafer map

A wafer map can be produced based on the chip coordinates and its BIN number.
Each BIN (associated with a failed chip) is represented with a color; conventionally,
BIN 1 is shown in green as in Figure 4.

Figure 4. Example of an EWS wafer map.


Yield calculation

The yield of the wafer (as a percent) is calculated based on the following formula:

Yield (%) = (number of good chips (BIN 1) / total number of chips tested) × 100

All test data (yield, BIN, test conditions, etc.) are recorded in the EDA database in
the same way as for the parametric test, with several levels of aggregation: individual
value per chip, average per wafer, average per lot, product, etc. The role of the device
engineer is to monitor product yields: They are responsible for increasing yield to its
theoretical maximum as quickly as possible at the start of production. They must
also react quickly to identify any sudden variation in yield and analyze the root causes
of the latest anomalies not identified by the SPC process. The first step in these
analyses is to categorize the problems.
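
As a minimal illustration of the yield calculation, the per-wafer yield can be computed from per-chip BIN numbers. The wafer IDs and BIN values below are invented for the example:

```python
# Illustrative sketch: per-wafer yield from per-chip BIN numbers.
# BIN 1 = chip passed every subtest; any other BIN = first failing subtest.
wafer_bins = {
    "W01": [1, 1, 5, 1],   # 3 good chips out of 4 tested
    "W02": [1, 7, 7, 1],   # 2 good chips out of 4 tested
}

# Yield (%) = number of good (BIN 1) chips / number of chips tested x 100
yield_pct = {w: 100.0 * bins.count(1) / len(bins) for w, bins in wafer_bins.items()}
print(yield_pct)  # {'W01': 75.0, 'W02': 50.0}
```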

Analytical methodologies
The first stage is to identify the anomalies using a BIN Pareto analysis, as shown in
Figure 5. This provides a means of focusing on the BINs that are most representative
in terms of loss of yield.
Figure 5. Pareto analysis of BINs with associated signatures.
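
A BIN Pareto like the one in Figure 5 amounts to ranking failed BINs by their chip counts. A minimal sketch (the BIN numbers below are invented):

```python
from collections import Counter

# First failing BIN for each rejected chip in a lot (invented values)
failed_bins = [7, 7, 7, 12, 7, 12, 33, 7, 12, 7]

# Pareto ranking: the BINs rejecting the most chips come first
pareto = Counter(failed_bins).most_common()
print(pareto)  # [(7, 6), (12, 3), (33, 1)]
```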


Once the main anomalies have been identified, further analysis is carried out based
on three key steps: classification, data extraction and the choice of relevant statistical
analysis models.

Problem classification
Classification is an essential step in narrowing down the problem, whether it involves
defining the population affected by the crisis (for comparison with a healthy population)
or describing its signature.

Population classification. The aim of this process is to select the sample to be used
for analysis. Yield loss affects populations of production lots and wafers in different
ways:
• Outliers or random data: Some losses of yield affect the population
in a random fashion. Atypical, outlying wafers can be identified using
percentiles. In general terms, the random nature of losses in this type of
scenario makes it relatively difficult to carry out statistical analyses and
identify their root cause.
• Systematic losses of yield: In general terms, these losses of yield are
seen when there is a defect in a piece of manufacturing equipment, or
where there is a deviation from the standard process; they tend to affect
a set of lots or wafers at the same time. In this case, losses of yield can
be of a variety of types:
o Lot to lot. The whole of the lot is systematically affected.
o Intralot. One or more wafers in the lot are affected by the same
signature.
o Intrawafer. One area of the wafer is affected.

Signature classification. Two categories of signature can be distinguished for
losses of yield, namely categorical and noncategorical signatures.
• Categorical signatures (classification): This type of problem generally
reflects the presence of a defect identified as such, an EWS signature,
a problem with a piece of equipment, etc. In this case it is normal
to classify populations in a number of categories based on whether
particular lots carry the signature, whether they have been processed
on the defective equipment, whether they have a physical defect, etc.
An example of this is shown in Figure 6.


Figure 6. Example of an analysis based on a GOOD/BAD category classification of lots
processed using two items of equipment at a process stage.

• Noncategorical signatures (e.g., numerical): In the case of a variation
in a PT or EWS parameter, which does not necessarily translate into a
signature on the wafer, the response is not categorical; therefore, it is
important to try to identify correlations between different data sources.
An example of this is shown in Figure 7.

Figure 7. Example of variation on a PT parameter shown using an Xbar control chart.


Data preparation
Figure 8 shows an example of a data table used for yield analyses. This represents
an aggregation at the wafer level of EWS, PT, INLINE, EQUIPMENT and CHAMBER
parameters. For a lot of 25 wafers, the table dimension is 25 × (N+1), where N is the
number of parameters; the total amount of data is this table dimension multiplied by
the number of lots and the number of sites. The main issue in data preparation is
defining the most relevant aggregation level (LOT, WAFER, SITE) for the analysis and
extracting the maximum amount of information related to the process. The number
of variables is often significantly higher than the number of observations, which can
cause a problem for traditional statistical analyses. Furthermore, some parameters
may be sampled but not available on all wafers or measurement sites, which results
in missing data; from experience, this can negatively affect the analysis when missing
data represents more than 30 percent of the observations. Finally, there can be high
levels of correlation between certain factors, so a second issue is selecting only the
most relevant factors to identify the process problem.
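
The aggregation and missing-data issues described above can be sketched with pandas; the column names and values are hypothetical, not the paper's database schema:

```python
import numpy as np
import pandas as pd

# Hypothetical site-level measurements for two wafers of one lot
df = pd.DataFrame({
    "lot":   ["A", "A", "A", "A"],
    "wafer": [1, 1, 2, 2],
    "pt_vt": [0.42, 0.44, 0.47, 0.45],        # measured on every site
    "pt_rs": [np.nan, np.nan, np.nan, 10.2],  # heavily sampled parameter
})

# Aggregate from SITE level to WAFER level (mean over sites)
wafer_level = df.groupby(["lot", "wafer"], as_index=False).mean()

# Drop parameters with more than 30 percent missing observations
keep = wafer_level.columns[wafer_level.isna().mean() <= 0.30]
wafer_level = wafer_level[keep]
print(list(wafer_level.columns))  # ['lot', 'wafer', 'pt_vt']
```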

Figure 8. Extract from a data table of a combination of yield, PT and equipment
parameters.

Statistical analysis methods


Root-cause analysis consists of trying to identify the factor responsible for a “bad”
response. We will first look at traditional methods and their limitations, and go on to
examine how two advanced techniques can help in this identification. The first concerns
multivariate statistical techniques, which will examine variables and identify factors with
high levels of correlation. This is introduced in the multivariate analysis section below.
Another way of identifying active factors is to construct a relationship between the
response and all the factors taken together, taking care to express the actual relationship
rather than noise. We will examine this further in the predictive modeling section. This will
also show the importance of validating the model to guarantee its general applicability
and strength as a model. We will examine the advantages and disadvantages of each of
these techniques, all of which are available in JMP Pro.


Traditional methods
As pointed out by Lee et al. [1], the usual techniques used in root-cause analysis are
analysis of variance (ANOVA) and the Kruskal-Wallis test. ANOVA assumes normally
distributed data; the Kruskal-Wallis test avoids this constraint. The number of factors
to be considered in our case makes this a very time-consuming task, so this approach
will be used mainly to validate the results obtained with the data mining techniques
outlined below. Moreover, these methods study just one parameter at a time and
therefore ignore the interactions that may occur in a crisis.

Multivariate analysis
Various analysis methods, in particular discriminant analysis, principal component
analysis and partial least squares (PLS) modeling can be used to better describe and
understand multivariate relationships. For illustrative purposes, in this paper we focus
entirely on principal component analysis.

Principal component analysis


Pearson [2] introduced principal component analysis (PCA) in 1901, in an article where
he uses correlation among variables not to express a response variable, but rather to
summarize the information contained among them.

When a set of variables is correlated, some of the information contained in any given
variable is redundant, due to the information already provided by the other variables.
An orthogonal (uncorrelated) set of variables, however, contains no such redundancy.
PCA uses linear combinations of the original variables to construct a set of orthogonal
variables. These orthogonal variables are termed principal components, and are ordered
such that the variance of any given component exceeds that of the next.

If the original set of variables is highly correlated, we are often able to retain the vast
majority of the information contained in the original variable set, using only the first few
principal components. For this reason, PCA is often described as a dimension-reduction
technique.

A well-established technique, PCA provides a means to describe and de-correlate
data, while suppressing noise, using only a subset of the principal components.
Further, it can be used to combine factors and identify the most representative factors
in each group obtained in a new system of axes.

As PCA is geared to continuous variables, we use it below to analyze PT parameters,
rather than to describe equipment.
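
The dimension-reduction behavior described above can be sketched on simulated correlated data (scikit-learn is used here for illustration; the data are not PT measurements):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
# Two highly correlated variables plus one independent variable
X = np.column_stack([x1, x1 + 0.05 * rng.normal(size=200), rng.normal(size=200)])

pca = PCA().fit(X)
# Principal components are ordered by decreasing variance; the first
# component alone captures almost all of the two correlated variables
ratios = pca.explained_variance_ratio_
print(ratios.round(3))
```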


Figure 9. Extract from Pearson [2].

Predictive modeling
By constructing a model, which links factors to the response (in our case, yield), we can
identify the impact of these factors, and thereby identify root causes.

Regression
A typical approach would be to construct a linear (or quadratic) model of the type:

Y = a0 + a1X1 + a2X2 + … + anXn

and to examine the highest ai coefficients (as absolute values). In our case, however,
this is impractical because the number of factors usually exceeds the number of
historic observations, making it impossible to estimate all the coefficients.

The stepwise method can be used to circumvent this type of problem, by representing
only explanatory factors in the model. This technique was used successfully by McCray
et al. [3], but suffers in settings like ours, where a significant portion of the data is
missing.
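
Forward selection, a close relative of the stepwise method, can be sketched with scikit-learn on simulated data where the factors outnumber the observations:

```python
import numpy as np
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
X = rng.normal(size=(40, 100))        # 100 factors, only 40 observations
y = 3.0 * X[:, 7] - 2.0 * X[:, 42] + 0.1 * rng.normal(size=40)

# Add factors one at a time, keeping only those that explain the response
sel = SequentialFeatureSelector(LinearRegression(), n_features_to_select=2,
                                direction="forward", cv=5).fit(X, y)
print(sorted(np.flatnonzero(sel.get_support())))  # [7, 42]
```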


Partitioning
More recently, partition- or decision-tree-based methods have been used to identify
ways of improving yield (Cheng et al. [4]). Partitioning is a way to describe the
relationship between a response and set of factors without a mathematical model;
its goal is to divide the data into groups, which differ maximally with respect to some
characteristic – in our example, yield. Partitioning is an iterative process, the visualization
of which resembles a tree – hence the term “decision tree.”

If we examine a group of data, we can identify the X that most expresses the variance
of Y, splitting the data at that value of X, which maximizes the difference in the resulting
groups. An example of this is shown in Figure 10.

All Rows: Count 874, G^2 840.97, LogWorth 34.96
  BAD rate 0.1865, GOOD rate 0.8135

Split on 80_1_CHAMBER:
  (CH01, CH02, CH04): Count 377, G^2 61.59; BAD rate 0.0159, GOOD rate 0.9841
  (CH08, CH07, CH06, CH05, CH03): Count 497, G^2 619.99; BAD rate 0.3159, GOOD rate 0.6841

Figure 10. Example of a decision tree with one split.
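
The single split of Figure 10 can be reproduced in miniature with a depth-one tree; this is an illustrative sketch in which scikit-learn stands in for JMP Pro and the chamber assignments and failure rates are simulated:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(2)
chamber = rng.integers(0, 2, size=400)  # 0 and 1 stand for two chamber groups
noise = rng.normal(size=400)            # an irrelevant continuous factor
# Wafers processed in chamber group 1 fail far more often
bad = (chamber == 1) & (rng.random(400) < 0.4)

X = np.column_stack([chamber, noise])
tree = DecisionTreeClassifier(max_depth=1).fit(X, bad)

# The tree splits on the factor that best separates GOOD from BAD wafers
print(tree.tree_.feature[0])  # 0 = the chamber column
```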


The bootstrap forest


The bootstrap forest (or random forest, a term proposed by Ho [5]) averages the
results of many trees (see Figure 11). For each of these trees, only a random sample of
the observations is considered; for each split, only a random subset of the candidate
variables is considered. In this way, it is highly probable that all of the variables useful
in predicting the response will eventually be chosen as splitting variables: By randomly
excluding rows and columns, the bootstrap forest casts light on relationships in the data
that might otherwise be missed. By viewing the columns with nonzero contributions to
the forest, the analyst can identify each of the factors that may affect the response, even
those whose impact is subtle. The most predictive factors are those with the highest
contributions, as the greater a particular variable’s correlation with the response, the
more frequently it will be chosen.

M(x) = (T1(x) + T2(x) + … + Tn(x)) / n, for n trees

Figure 11. Illustration of a bootstrap forest.
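
A bootstrap forest can be approximated with scikit-learn's RandomForestRegressor, whose feature_importances_ play the role of the column contributions discussed above. The data are simulated; only factor 5 actually drives the response:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(3)
X = rng.normal(size=(300, 20))
y = 4.0 * X[:, 5] + rng.normal(size=300)   # only column 5 matters

# Each tree sees a bootstrap sample of the rows, and each split considers
# only a random subset of the candidate columns (max_features)
forest = RandomForestRegressor(n_estimators=200, max_features="sqrt",
                               random_state=0).fit(X, y)
print(np.argmax(forest.feature_importances_))  # 5
```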

The boosted tree


Boosted trees can also be used to identify influential factors. A boosted tree is actually
a weighted sequence of simple trees, each of which fits the scaled residuals of the
previous tree (see Figure 12). Subsequent trees are fit until there is no benefit in
proceeding further. The final model is a weighted accumulation of all the layers in the
model.

Final model (M):

M = M1 + ε·M2 + ε·M3 + … + ε·Mn

where ε is the learning rate.

Although they perform slightly worse than bootstrap forests in identifying root causes,
their simplicity allows boosted trees to be estimated more quickly than bootstrap forests.


Figure 12. Illustration of a boosted tree (a sequence of layers M1, M2, …, Mn).
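
The layered structure sketched in Figure 12 corresponds to gradient boosting; an illustrative scikit-learn version on simulated data, with ε as the learning_rate parameter:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(4)
X = rng.normal(size=(300, 10))
y = 2.0 * X[:, 3] + rng.normal(scale=0.3, size=300)   # only column 3 matters

# A sequence of shallow trees; each layer fits the residuals of the
# previous layers, scaled by the learning rate (the epsilon above)
boost = GradientBoostingRegressor(n_estimators=100, learning_rate=0.1,
                                  max_depth=2, random_state=0).fit(X, y)
print(np.argmax(boost.feature_importances_))  # 3
```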

Neural networks

Neural networks are highly flexible predictive models. Based on the way in which the
brain was originally thought to function, neural networks contain one or more “hidden
layers,” each of which contains one or more transformation functions, operating on the
predictors. The relationship between the predictors and the response, described by
Sassenberg et al. [6], is usually quite complex, and generally renders interpretation of
the model coefficients impossible. For this reason, neural networks have traditionally
been better suited to making predictions than to identifying a root cause.

JMP 11, however, allows an analyst to order factors based on the significance of their
impact on the model. Using this information in the same way one might use the column
contributions from a bootstrap forest, likely root-cause candidates can be unearthed,
making the use of neural networks in this way an opportunity deserving additional study.
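
One way to rank factors by their impact on a fitted network is permutation importance, shown here with scikit-learn. This is an analogue of, not identical to, JMP's ordering, and the data are simulated:

```python
import numpy as np
from sklearn.inspection import permutation_importance
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(5)
X = rng.normal(size=(400, 8))
y = 3.0 * X[:, 2] + rng.normal(scale=0.2, size=400)   # only column 2 matters

# A single hidden layer of transformation functions on the predictors
net = MLPRegressor(hidden_layer_sizes=(10,), max_iter=2000,
                   random_state=0).fit(X, y)

# Shuffle each factor in turn and measure how much the fit degrades
imp = permutation_importance(net, X, y, random_state=0)
print(np.argmax(imp.importances_mean))  # 2
```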

Validation

It is important to verify that the model(s) ultimately constructed provide a satisfactory
approximation of the factor-to-response relationship; models that perform acceptably
on the training data, but perform poorly when scoring previously unseen data, are said
to suffer from overfitting. To combat this, we use a technique known as validation.

To validate a model, we use the model to score (make predictions based on) data that
was not used to fit the model (termed validation data). The model’s predictions are then
compared to the true responses. If overfitting is present, the model will not perform as
well as expected on the validation data.

While there are a variety of validation strategies, we will discuss two of the most
commonly used: cross-validation and holdback validation.


As indicated above, it is important to check that the model constructed provides a
satisfactory explanation of the relationship between factors and one (or more)
responses rather than noise. The aim is to avoid constructing models that appear to
perform well but may introduce errors, a problem known as overfitting.

We will now describe the two approaches to validation in more detail.

Cross-validation: The cross-validation technique divides the data set into k subsets,
or folds (this is commonly referred to as k-fold cross-validation). k models are then
estimated: Each model is fit on the data that remains after excluding a single fold, and
is then used to score the fold excluded when estimating it. The model selected is the
one that best fits its excluded fold. This technique is well suited to small data sets but
does not guarantee the model's general applicability.
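
A k-fold cross-validation loop can be sketched as follows (scikit-learn, simulated data):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold

rng = np.random.default_rng(6)
X = rng.normal(size=(50, 3))                     # a small data set
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=50)

# Fit on k-1 folds, then score the single excluded fold, k times
scores = []
for train_idx, val_idx in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    model = LinearRegression().fit(X[train_idx], y[train_idx])
    scores.append(model.score(X[val_idx], y[val_idx]))
print(len(scores), round(float(np.mean(scores)), 3))
```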

Holdback validation: Holdback validation divides the data set into three subsets: a
training set, a validation set and a test set. The training set and validation set are used
to select the best candidate model from among many. Several models are constructed:
Each is fit using the training data, and then scores the validation data. From among
these candidate models, the best performing model is selected and used to score the
test data set, providing an indication of the model’s ability to generalize to previously
unseen data.
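
Holdback validation with three subsets can be sketched as follows (scikit-learn, simulated data; the 60/20/20 split and the two candidate models are illustrative):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(7)
X = rng.normal(size=(600, 5))
y = 2.0 * X[:, 0] + rng.normal(scale=0.3, size=600)

# Split into training (60%), validation (20%) and test (20%) subsets
X_tmp, X_test, y_tmp, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X_tmp, y_tmp, test_size=0.25,
                                                  random_state=0)

# Fit candidates on training data, pick the best scorer on validation data
candidates = [RandomForestRegressor(n_estimators=n, random_state=0).fit(X_train, y_train)
              for n in (10, 100)]
best = max(candidates, key=lambda m: m.score(X_val, y_val))

# The untouched test set indicates how well the winner generalizes
print(round(best.score(X_test, y_test), 2))
```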

We will now apply these techniques to our crisis data sets, using the model comparison
tool in JMP Pro to identify the best model.

Case studies
Case study 1: Root-cause identification (smile signature)

Description of problem and impact

We observed losses of yield for a number of weeks on a critical BIN for a mature
STMicroelectronics product. The process was well understood and stable, with a high
level of baseline yield. Yield losses were as high as 20 percent on some wafers, with
a signature at the bottom of the wafer we will call a “smile” (see Figure 13). A defect
analysis identified the nature of the problem, which affected more than 100 wafers.
A detailed view of the defect, produced through scanning electron microscopy (SEM),
is shown in Figure 14. Unfortunately, online parametric analysis was unable to identify
the cause of the problem.


Figure 13. Example of a failed BIN wafer stack. These failed wafers exhibited a smile
defect signature.

Figure 14. SEM (scanning electron microscopy) photo of the defect.


Data
The data contains 874 rows, comprising one response and 802 factors. Because
the smile BIN response is positively skewed, we decided to apply a logarithmic
transformation to the data. The raw response is shown in Figure 15.


Figure 15. Smile BIN distribution.

After the logarithmic transformation has been applied to the raw data, we have a
response distribution that is close to normal.


Figure 16. Smile BIN distribution following logarithmic transformation.


As Figure 17 shows, the data set contains a significant amount of missing data.
Removing columns with more than 30 percent missing data gives us a smaller data
set (just 346 columns) that is much easier to analyze. The remaining analyses in this
case study use this subset.

Figure 17. Distribution of missing data, before and after treatment.

Analysis
We begin our analysis by fitting a validated partition model (see Figure 18).

All Rows: Count 524, Mean 39.71, Std Dev 63.57

Split on 177_1_CHAMBER (difference between group means: 72.27):
  (CH30, CH52, CH41, CH50): Count 12, Mean 110.33, Std Dev 93.60
  (all remaining chambers): Count 512, Mean 38.06, Std Dev 61.87

Figure 18. A portion of a validation decision tree model.


             RSquare        RMSE    N   Number of Splits   Imputes      AICc
Training       0.029   62.579856  524                  1        55   5828.09
Validation     0.037   59.018092  175
Test          -0.040   56.642924  175

Figure 19. Statistical characteristics of the decision tree.

Although the process step 177_1_CHAMBER (see Figure 20) seems to affect the
response in a statistical sense, it has too many levels to be useful in a practical sense –
and gives us no help in understanding our yield problem.

Column Contributions

Term             Number of Splits           SS    Portion
177_1_CHAMBER                   1   61248.1522     1.0000
175_1_CHAMBER                   0            0     0.0000
144_1_CHAMBER                   0            0     0.0000
1_1_CHAMBER                     0            0     0.0000

Figure 20. Equipment identified by the partition.

Sometimes we can gain new insights by transforming a continuous response into a
binary one. For example, here we declare a wafer "BAD" if the response is greater than
50, and "GOOD" otherwise. In this case, the partition splits first on the 80_1_CHAMBER
factor. This factor has only eight levels, so the information proves valuable. We should
note, however, that identifying the threshold value needed to correctly specify the
binary response is extremely time-consuming.

Therefore, we will base our root-cause analysis on other models, using the continuous
SMILE BIN variable. In particular, we are going to create a set of models based on
advanced partitioning techniques: bootstrap forests and boosted trees. We will then use
a model comparison tool in JMP Pro to select a model based on the statistical criterion
R2 (see Figure 21).

Measures of Fit for Log10

Predictor           Creator            RSquare   RASE     AAE      Freq
Log10 Predictor     Partition          -0.000    0.4723   0.3706   874
Log10 Predictor 2   Bootstrap Forest   0.3265    0.3876   0.3017   874
Log10 Predictor 3   Boosted Tree       0.0346    0.4641   0.3569   874

Figure 21. Comparison of different models.
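A comparison of the three model families can be sketched in Python, using scikit-learn's decision tree, random forest and gradient boosting as stand-ins for JMP's Partition, Bootstrap Forest and Boosted Tree platforms. The data below is synthetic; it merely mimics a response driven by a few of many candidate factors.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the wafer data set.
rng = np.random.default_rng(1)
X = rng.normal(size=(874, 20))
y = 2 * X[:, 0] + np.where(X[:, 1] > 0, 3.0, 0.0) + rng.normal(size=874)

X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=1)

models = {
    "Partition": DecisionTreeRegressor(max_depth=3, random_state=1),
    "Bootstrap Forest": RandomForestRegressor(n_estimators=100, random_state=1),
    "Boosted Tree": GradientBoostingRegressor(random_state=1),
}
# Fit each model and score it on held-out data, as the model
# comparison tool does with validation R2.
scores = {name: r2_score(y_val, m.fit(X_train, y_train).predict(X_val))
          for name, m in models.items()}
for name, r2 in scores.items():
    print(f"{name}: R2 = {r2:.3f}")
```

Selecting the model with the highest validation R2 is then a one-line `max(scores, key=scores.get)`.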


We can see that the bootstrap forest gives us a model that is statistically superior to
either of the other approaches. We therefore proceed by analyzing the factors proposed
by the bootstrap forest model.

If we examine the bootstrap forest’s column contribution report (Figure 22), we see that
the 80_1_CHAMBER chamber is at the top of the list. Again, this variable is much more
useful in understanding yield loss than the variable first proposed by a simple partition,
the 177_1_CHAMBER variable.

Column Contributions
Term              Number of Splits   SS           Portion
80_1_CHAMBER      23                 1654529.98   0.1078
160_2_CHAMBER     10                 864816.201   0.0563
146_1_CHAMBER     4                  801808.207   0.0522
146_1_EQUIPMENT   5                  755164.555   0.0492
146_2_CHAMBER     7                  635484.504   0.0414
160_2_EQUIPMENT   11                 574988.809   0.0375
7_1_CHAMBER       11                 555647.003   0.0362
130_1_EQUIPMENT   2                  395391.082   0.0258
203_1_CHAMBER     9                  393854.156   0.0257

Figure 22. Parameters highlighted by the bootstrap forest.

We were therefore able to identify the root cause of the problem directly, without
discretizing the response.

Conclusion
The 80_1 tool was confirmed as the culprit by various approaches. First, a Kruskal-Wallis test (Figure 23), carried out after the event, confirmed a significant difference in yield among the chambers. This confirmed our view that the bootstrap forest method offered an appropriate solution for identifying the root cause. It is important to note that the use of a bootstrap forest in this way is more generally applicable than the Kruskal-Wallis test, because the bootstrap forest can be used for any combination of continuous and categorical factors and responses, and can also succeed in the presence of interactions and other complex relationships among factors.
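A Kruskal-Wallis test of yield by chamber can be sketched with scipy. The chamber data below is simulated, with one chamber shifted to mimic the problem chamber; only the test itself corresponds to the paper's confirmation step.

```python
import numpy as np
from scipy.stats import kruskal

# Hypothetical yield values grouped by chamber of the 80_1 tool.
rng = np.random.default_rng(2)
chambers = {f"CH{i:02d}": rng.normal(loc=40, scale=5, size=60) for i in range(1, 8)}
chambers["CH08"] = rng.normal(loc=70, scale=5, size=60)  # shifted chamber

# Kruskal-Wallis: a rank-based test for a difference among groups,
# robust to non-normal responses.
stat, p_value = kruskal(*chambers.values())
print(f"H = {stat:.1f}, p = {p_value:.2e}")
# A small p-value indicates a significant yield difference among chambers.
```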


Figure 23. Confirmation of results by a Kruskal-Wallis test.

Finally, the root cause was confirmed by the process teams, using physical analyses and
analyses of data related to the equipment (see Figure 24).

[Graph Builder plot: EWS1 SMILE Zone-01 vs. 80_1_EQUIPMENT and 80_1_CHAMBER, with wafers colored GOOD/BAD, for chambers CH01-CH08 on equipment EQ01 and EQ02.]

Figure 24. Confirmation of the results by the process teams.


Case study 2: Correlation between EWS and PT parameter

Description of problem and impact


In this case study, analysis showed that BIN10 losses (representing up to a 10 percent
yield loss on some wafers) were linked to a parametric effect. Finding the associated
process parameter is easier once the PT parameter is identified, so the issue was to
find the PT parameter that correlated with BIN10 most closely. Because analyzing the
correlation of all parameters would have been prohibitively cumbersome, we used a
bootstrap forest to identify those parameters with the most impact on the response,
then used these to perform a follow-up correlation study.

Figure 25. Example of a wafer.

Data
The data set has 560 rows and 600 columns, too much information to begin directly with
principal component analysis. In this case, due to the correlation between EWS and the
PTs, missing data is minimal: Except for noncritical parameters, 100 percent of the PT
information is available.

As the BIN10 data followed a logarithmic trend and included some outliers, we applied a
logarithmic transformation to it, producing a more symmetric distribution (see Figure 26).


[Histograms of BIN10 before and after the transformation, with fitted distributions LogNormal(1.98652, 0.67515) and Normal(0.86273, 0.29348).]

Figure 26. BIN10 distribution before and after logarithmic transformation.
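The transformation can be sketched in Python with simulated data; the lognormal parameters are taken from Figure 26, and scipy's sample skewness quantifies the gain in symmetry.

```python
import numpy as np
from scipy.stats import skew

# Simulated BIN10 counts, matching the fit LogNormal(1.98652, 0.67515)
# reported in Figure 26 (natural-log parameters).
rng = np.random.default_rng(3)
bin10 = rng.lognormal(mean=1.98652, sigma=0.67515, size=560)

# Base-10 logarithmic transformation; note 1.98652/ln(10) = 0.863 and
# 0.67515/ln(10) = 0.293, the Normal fit reported after transformation.
log_bin10 = np.log10(bin10)

# The transformed data is far more symmetric (skewness near zero).
print(f"skewness before: {skew(bin10):.2f}, after: {skew(log_bin10):.2f}")
```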

Analysis
We fit a bootstrap forest with the 606 PT parameters, using the transformed BIN10
variable as Y. Our forest contained 100 trees, with 151 randomly selected columns
considered at each split.
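The forest settings above can be sketched with scikit-learn's random forest as a stand-in for JMP's bootstrap forest. The data is synthetic (606 columns over 560 rows, with the response driven by a few of them, as in the case study); only the tree count and the 151 candidate columns per split come from the text.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the 606 PT parameters over 560 wafers.
rng = np.random.default_rng(4)
X = rng.normal(size=(560, 606))
y = (0.5 * X[:, 170] + 0.4 * X[:, 450] + 0.3 * X[:, 25]
     + rng.normal(scale=0.2, size=560))

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=4)

# 100 trees, 151 randomly chosen candidate columns per split,
# matching the settings reported in the text.
forest = RandomForestRegressor(n_estimators=100, max_features=151,
                               random_state=4)
forest.fit(X_train, y_train)
print(f"train R2: {forest.score(X_train, y_train):.3f}, "
      f"test R2: {forest.score(X_test, y_test):.3f}")

# The analogue of JMP's column contributions: feature importances.
top = np.argsort(forest.feature_importances_)[::-1][:3]
print("most influential columns:", top)
```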

The model performed reasonably well, as seen from the high R2 reported in Figure
27 for the training data and the minimal reduction of R2 in the validation and test sets.
We have therefore built a well-fitting model that also predicts new data well.

             RSquare   RMSE        N
Training     0.859     0.1136393   335
Validation   0.748     0.1353828   111
Test         0.750     0.1435366   112
Figure 27. Statistical characteristics of the bootstrap forest model.

We can see that a number of columns have an impact on the response, in
particular parameters 171, 451 and 26.


Column Contributions
Term                 Number of Splits   SS           Portion
451__PARAM_AVERAGE   35                 179.294222   0.1017
26__PARAM_AVERAGE    50                 147.894805   0.0839
171__PARAM_AVERAGE   31                 120.072569   0.0681
164__PARAM_AVERAGE   31                 74.5215686   0.0423
183__PARAM_AVERAGE   22                 48.1871382   0.0273
218__PARAM_AVERAGE   12                 36.9233666   0.0209
42__PARAM_AVERAGE    29                 30.1121665   0.0171
216__PARAM_AVERAGE   13                 28.6468436   0.0162

Figure 28. PT parameters highlighted by the bootstrap forest.

The most significant PT parameters in the bootstrap forest model were used to perform
a correlation study, using principal component analysis. This technique helps to identify
and group elements that are highly correlated.
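The grouping of correlated parameters can be sketched with scikit-learn's PCA on standardized data. The data below is synthetic, with a few variables sharing a common latent factor in place of the shortlisted PT parameters and BIN10; the loading signs show which variables cluster together.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Hypothetical: correlated PT parameters plus the BIN10 response.
rng = np.random.default_rng(5)
latent = rng.normal(size=300)
data = np.column_stack([
    latent + rng.normal(scale=0.3, size=300),   # e.g., PARAM 171
    latent + rng.normal(scale=0.3, size=300),   # e.g., PARAM 451
    -latent + rng.normal(scale=0.3, size=300),  # e.g., BIN10 (anticorrelated)
    rng.normal(size=300),                       # unrelated parameter
])

pca = PCA(n_components=2)
pca.fit(StandardScaler().fit_transform(data))
print("explained variance:", np.round(pca.explained_variance_ratio_, 3))

# Loadings: variables with large loadings of the same sign on a component
# form a correlated group, as in the projection of Figure 29.
loadings = pca.components_.T * np.sqrt(pca.explained_variance_)
print(np.round(loadings, 2))
```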

[Loading plot of the shortlisted PT parameters (451, 26, 171, 164, 183, 218) and EWS1_BIN10 on the first two principal components; Component 1 explains 72.9 percent of the variance.]

Figure 29. Projection based on the first two components of PT parameters and BIN 10.

Of those parameters most highly correlated with BIN10, we selected parameter 171; it is
positively correlated with the first principal component, negatively with the second, and
measures the effects of the process directly. This is illustrated in Figure 29.


[Graph Builder plot: EWS1_BIN10 vs. 171__PARAM_AVERAGE (roughly 1720-1800), with points colored by equipment split S1-S4; 1 row excluded.]

Figure 30. Correlation between BIN 10, PT parameter and equipment.

As Figure 30 shows, there is a correlation between the PT parameter and the process
parameter. In Figure 31 the significant difference is confirmed by a post-hoc Student’s
t-test.
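The pairwise comparisons of Figure 31 can be sketched with scipy's two-sample t-test. The split data below is simulated, with S4 shifted to mimic the outlying split; JMP's "Each Pair, Student's t" report performs the analogous set of comparisons.

```python
import itertools
import numpy as np
from scipy.stats import ttest_ind

# Hypothetical PT parameter values by equipment split S1-S4.
rng = np.random.default_rng(6)
splits = {
    "S1": rng.normal(1760, 10, size=40),
    "S2": rng.normal(1762, 10, size=40),
    "S3": rng.normal(1758, 10, size=40),
    "S4": rng.normal(1730, 10, size=40),  # shifted split
}

# Pairwise Student's t-tests over all split combinations.
for a, b in itertools.combinations(splits, 2):
    t, p = ttest_ind(splits[a], splits[b])
    print(f"{a} vs {b}: p = {p:.3g}")
```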

[Box plots of 171__PARAM_AVERAGE (roughly 1700-1800) by equipment split S1-S4, with Each Pair Student's t comparison circles at 0.05.]

Figure 31. Impact of equipment on the PT parameter.


Conclusion
After this analysis was used to optimize the process parameter, the process returned to
a minimal BIN10 rate. The same result could have been realized with other techniques,
after a detailed analysis of each of the 600 parameters, but the bootstrap forest enabled
us to easily pre-screen the parameters, greatly reducing the number of parameters
subjected to detailed analysis.

Summary
As George Box famously said, “All models are wrong, but some models are useful.” In
both the examples above, the use of bootstrap forests greatly streamlined the analysis
process, but it is natural to ask how these models compare to other models we might
have fit. The JMP Pro environment allows us to quickly build, compare and identify the
most useful of several models.

[Bar chart of R2 by model type (Partition, Bootstrap Forest, Boosted Tree, Neural) for each case study response: CS_1_Log10(Smile), CS_2_Log(Bin10), CS_3_Log_EWS13_EQUIP and CS_3_Log_EWS13_PT.]

Figure 32. R2 of each model for different case studies.


As seen in Figure 32, which compares R2 values for a variety of models, each
constructed over a variety of responses, the bootstrap forests clearly outperformed
the other tree-based methods and performed on par with the neural networks.

The two case studies outlined in this paper illustrate just two examples from
STMicroelectronics. We have also frequently used bootstrap forests in other cases
where a simple partition proved inadequate. In particular, they were employed in a case
that required aspects of both of the procedures described above. Specifically, we first
used a bootstrap forest to support a principal component analysis (the purpose of which
was to identify a PT parameter), and then used it a second time to identify equipment
(Figure 33).

[Graph Builder plots of PARAMETER_132_AVERAGE, LOGBIN13 and BIN13 vs. 68_1_EQUIPMENT (EQ01-EQ07), with points colored GOOD/BAD.]

Figure 33. Relationship between equipment and BIN13, Log(BIN13) and the PT parameter.

27
SAS White Paper


About SAS and JMP®
JMP is a software solution from SAS that was first launched in 1989. John Sall, SAS co-founder and Executive Vice President, is the
chief architect of JMP. SAS is the leader in business analytics software and services, and the largest independent vendor in the business
intelligence market. Through innovative solutions, SAS helps customers at more than 65,000 sites improve performance and deliver value
by making better decisions faster. Since 1976 SAS has been giving customers around the world THE POWER TO KNOW®.

STMicroelectronics
STMicroelectronics is a global leader in the semiconductor market, with clients covering the full range of sense & power technologies,
automotive products and on-board processing solutions. From managing consumption to energy savings, confidentiality to data
security and health and well-being to intelligent devices for the general public, ST is involved wherever microelectronic technology is
making a positive and innovative contribution to day-to-day living. ST is active at the heart of professional and entertainment solutions
at home, in the office and in the car. ST is synonymous with “life.augmented” through the increasing use of technology to improve the
quality of life.

In 2012, ST generated net turnover of $8.49 billion. For further information visit the ST website: st.com.

