SW Requirements and Data Analysis in Confocal Raman Micros
4.1 Introduction
62 T. Dieing and W. Ibach
it can be displayed. For data sets originating from confocal Raman microscopy
experiments, where at each image pixel a full spectrum was recorded, this evaluation
will result in an image. This image generation can be performed using either
univariate or multivariate methods. The resulting images and masks can then
be evaluated further in combination with the multi-spectral data sets in order to
obtain, for example, average spectra originating from certain areas on the sample.
Combining the information contained in single spectra with the multi-spectral data
sets allows further enhancement of the image contrast. These steps are described in
Sects. 4.5–4.7. Combining the images generated allows the information contained
in various images to be displayed in one multi-colored image (Sect. 4.8). From these
images, phase separation and/or mixing can easily be identified.
In Sect. 4.9 it will be illustrated that even very noisy spectra acquired with
extremely little signal can be treated through the described methods to extract the
relevant information and obtain images with excellent contrast.
All the above-mentioned data processing techniques will be shown using a few
sample systems which are introduced together with the acquisition parameters in
Sect. 4.10.
The requirements for any data acquisition software for confocal Raman microscopy
are extensive. The main tasks can, however, be sorted into several groups, which
will be described in the following.
processor which at the same time needs to be able to manage all other tasks for the
control of the microscope. Additionally, the data stream must be handled in a way
that allows data acquisition without interruptions. If the software needs pauses after
a certain number of acquired spectra to process and/or store the data, the advantage
of the fast data acquisition is lost.
If the entire CCD chip of the camera must be read out (e.g., due to the necessity
to acquire a reference or calibration spectrum with each spectrum recorded), the data
stream would multiply by the vertical size of the chip (200 in the example above),
thus lowering the maximum acquisition speed significantly.
An additional challenge for the software is memory space required by the spectra
recorded. If, for example, a Raman image of 512 × 512 spectra is recorded, this
results in a data amount of
This is only the amount for one layer. Note also that once data processing starts, the
data are typically transformed from integer to double precision to increase the accu-
racy of the calculations. This almost triples the storage space needed per spectrum.
Luckily, however, such a high resolution is not always necessary and due to the time
taken per Raman spectrum is often impractical. Therefore a resolution in the range
of 150 × 150 pixels is generally sufficient.
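The memory requirement discussed above can be estimated with a few lines of code. The following is only a rough sketch: the number of spectral pixels per spectrum (1600) and the storage size per value (2 bytes, i.e., a 16-bit integer) are assumptions made for illustration, and the function name is hypothetical.

```python
def raman_image_bytes(n_x, n_y, n_spectral_pixels, bytes_per_value):
    """Return the raw storage size in bytes of one layer of a Raman image."""
    return n_x * n_y * n_spectral_pixels * bytes_per_value


# Assumed values for illustration only: 1600 spectral pixels per spectrum
# and 2 bytes (16-bit integer) per recorded value.
SPECTRAL_PIXELS = 1600
BYTES_PER_VALUE = 2

large = raman_image_bytes(512, 512, SPECTRAL_PIXELS, BYTES_PER_VALUE)
small = raman_image_bytes(150, 150, SPECTRAL_PIXELS, BYTES_PER_VALUE)

print(f"512 x 512 spectra: {large / 2**20:.0f} MiB")
print(f"150 x 150 spectra: {small / 2**20:.1f} MiB")
```

Under these assumptions a single 512 × 512 layer occupies roughly twelve times the memory of a 150 × 150 layer, which is simply the ratio of the numbers of spectra.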
Adding to the issue of memory space is that many analysis methods such as
multivariate analysis require a substantial additional amount. Programmers must
therefore balance computation time against available memory.
Apart from the above-discussed data acquisition, the software needs to establish
correlations between and among the data acquired. For example, the software should
be capable of indicating where the spectra were recorded on the bitmap acquired
from the white light image. Also, after the generation of a confocal Raman image
through, e.g., an integral filter over a certain spectral region, the software needs to
allow the display of the spectra at each position of the image by a simple mouse
click to facilitate the analysis. Additionally, if spectra were taken with different
gratings (and thus different spectral resolutions), the software needs to be able to
correlate these spectra with each other. Figure 4.1 displays these correlations using
the example of several measurements on a polymer blend sample (PS-PMMA on
glass).
Fig. 4.1 Spectral and spatial correlation within the data acquisition software. The spectra recorded
(a, c, e) are linked to the position where they are recorded as indicated by the yellow and blue
crosses. The spatial correlation between the video image (b) and the confocal Raman image (d) is
indicated by the red box. Additionally, the spectral axes are correlated for spectra recorded with
different gratings as indicated by the green bar for a spectrum recorded using a 600 g/mm grating
(a) and an 1800 g/mm grating (c). (e) One of the spectra recorded for the confocal Raman image
(d) and its position correlation (blue cross) is shown
4 Software and Data Analysis 65
Data acquired in confocal Raman measurements are generally five or even six
dimensional. The dimensions are
• The spatial X, Y, and Z coordinates of the point where the spectrum was recorded
(typically given in μm).
• The spectral position given as the wave number (cm−1), relative wave number
(rel. cm−1), or wavelength (typically given in nm).
• The intensity recorded at this spatial and spectral position (typically given in CCD
counts or counts per second [cps]).
• Time may also be present as a sixth dimension.
Such a data set is sometimes referred to as a hyperspectral data set.
Individual spectra can of course be displayed in a straightforward way (intensity
vs. spectral position) with the coordinates (and time if applicable) added in writing.
Displaying data sets containing more than one spectrum, however, becomes more
complicated. Line scans (spectra collected along a single line) as well as time series
(spectra recorded at the same position as a function of time) are sometimes displayed
in a so-called waterfall display as shown in Fig. 4.2.
For confocal Raman image scans, an entire Raman spectrum is collected at every
image point. A confocal Raman image scan consisting of 256 × 256 points will
therefore contain 256 × 256 = 65,536 individual spectra. One may distinguish
between a single image scan, in which the spectra are recorded in one layer in
three-dimensional space, and a multi-layered or stack scan, in which several parallel
layers offset by a specific distance are recorded.
In either case, the information contained within each spectrum needs to be
reduced to a single value, which will then determine the coloring of the pixel at
this position (see also Sect. 4.5).
Fig. 4.2 Display of a line scan recorded along the line represented in red in the video image (b) in
the form of a waterfall plot (a)
In the case of an image stack one can then display each layer of the stack
individually or combine them with software in order to display the distribution
in three dimensions. Some examples of this can be found in Chap. 12 by Thomas
Wermelinger and Ralph Spolenak.
Pre-processing of Raman spectra refers to the treatment of the Raman spectra before
the generation of images or final presentation of the spectra. The steps described
below apply to all recorded spectra and should generally be followed before
any further processing.
Cosmic rays are high-energy particles from outer space which interact with atoms
and molecules in the Earth’s atmosphere. Due to their high energy, a large number
(often called a shower) of particles are generated upon this impact which are mainly
charged mesons. These quickly decay into muons. Due to their relativistic speeds
(and thus the time dilation) some of these muons reach the surface of the Earth.
Despite this exact reaction path, the term “cosmic ray” is also used (even if not
100% correct) for the muons interacting with devices on the Earth’s surface and for
simplicity, this term will be used in the following as well.
Fig. 4.3 Cosmic ray removal. The red spectrum was recorded with a short integration time and
shows two cosmic rays near 3000 cm−1. The blue spectrum is the same as the red, but after having
undergone cosmic ray removal, and the black spectrum is the spectrum of this component (PMMA)
recorded with a longer integration time for a better signal to noise ratio
If such a cosmic ray hits a CCD detector it will generate a false signal in the
shape of a very sharp peak in the spectrum that is not related to the Raman signal.
An example can be seen in red in Fig. 4.3.
Cosmic rays can be filtered out as shown in Fig. 4.3 and described below.
There is, however, also the possibility to minimize the number of cosmic rays
recorded through the readout method of the CCD camera. As already described
in Sect. 4.2.1.1, one method is the full vertical binning mode, which is the fastest
readout method. In this case, all pixels are used even though typically only a few
percent of the pixels are exposed to the Raman signal. If one limits the readout to
the few lines in the detector at which the Raman photons hit the CCD camera, one
typically excludes more than 90% of the pixels from the readout. This will of course
reduce the probability of recording a cosmic ray. The disadvantages of this method
are that it is a slower readout method and that during the readout, the light hitting
the camera cannot be recorded. This method is therefore recommended for single
spectra, whereas for Raman imaging it is not very suitable.
Once the spectra are recorded, various mathematical methods can be used to
filter the cosmic rays from the spectra. In these, two principal approaches can be
distinguished. These will be discussed in the following.
This method works well when evaluating single spectra with various acquisi-
tions on the same position. It requires that there are only negligible changes from
spectrum to spectrum. If the sample changes its spectral signature in a rapid way (for
example, due to a chemical process taking place), the usage of this type of algorithm
is problematic.
For confocal Raman imaging data sets, this method can also be applied. The user
must be aware that in this case the spectra are recorded not only at different times
but also at different spatial positions. In this case it additionally depends on the
compositional variation of the sample compared to the resolution of the scan. If the
changes from spectrum to spectrum are too dramatic, one faces again the problem
that the algorithm might filter out real peaks.
4.4.2 Smoothing
1. Moving average This filter is arguably the simplest filter for smoothing. For this
filter a definable number of values to the left and right of the current value are
averaged and the current value is replaced. Then this “window” moves to the
next value and so on. For very slowly changing signals (as might be the case in
photoluminescence [PL]) this filter can be suitable.
2. Weighted average This filter differs from the Moving Average in that it does not
take each value with the same weight, but multiplies each one with a binomial
weighting factor or a Gaussian distribution. Table 4.1 shows the distribution of
the binomial coefficients for the average calculation for various filter sizes. This
filter ensures that the resulting value is closer to the real value as compared to
the Moving Average filter even if the signal is changing more rapidly.
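Table 4.1 Binomial weighting factors for the weighted average filter for various filter sizes
n   Filter size (2n + 1)   Weighting factors
1   3                      1/4 × (1, 2, 1)
2   5                      1/16 × (1, 4, 6, 4, 1)
3   7                      1/64 × (1, 6, 15, 20, 15, 6, 1)
4   9                      1/256 × (1, 8, 28, 56, 70, 56, 28, 8, 1)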
3. Median The median filter is generally less influenced by single data points that
fall out of range of the “normal” signal. For example, if a cosmic ray is within the
search window, the median filter will be less influenced by this than an average
filter. This filter is a good choice for removing spikes in a line graph without
heavy rounding of the edges of Raman peaks.
4. Savitzky–Golay The Savitzky–Golay filter (sometimes also known as the DISPO
(Digital Smoothing Polynomial) filter) essentially uses the surrounding values in
a weighted way and fits a polynomial through these points in order to determine
the “fitted” value at the current position. While the smoothing of this filter will
not be as strong as, for example, a Moving Average filter, it will smooth the
data considerably while largely maintaining the curve shape (peak width, peak
intensities, ...). A detailed discussion of the functionality as well as examples
of this filter can be found in [3]. This filter has the additional advantage of
allowing the calculation of the derivative of the spectrum, which can be useful
for peak location (see black vertical lines in Fig. 4.4). Figure 4.4 shows
an example of the usage of the Savitzky–Golay filter. The use of this filter
is especially recommended if the widths of the peaks in the spectrum are
comparable.
5. Wavelet transformation techniques Wavelet transformation is a mathematical
technique somewhat similar to Fourier transformation but with the advantage
that both time and frequency information are maintained. Wavelets consist essentially
of a family of basis functions which can be used to model the signal.
Each level of the wavelet decomposition will result in an approximation and
a detail result. The approximation result is then used as the basis for the next
decomposition and this is repeated until a defined threshold. By using the correct
combination of the detail results (one available per decomposition level) and the
approximation result (only the last one is typically used here) one can perform
the reconstruction (or inverse discrete wavelet transformation [IDWT]) to obtain
a spectrum with either a strong noise reduction, a removed background or both.
More detailed descriptions as well as some illustrative examples can be found in
[4, 5].
6. Maximum entropy filter The maximum entropy method uses the fact that certain
aspects of the instrument functionality are known. Through this, neighboring
pixels in a spectrum are not statistically independent anymore and a filtering of
these values is therefore possible.
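The first three filters of the list above can be sketched in a few lines. This is a minimal illustration and not the implementation of any particular acquisition software: the function names are hypothetical, and the treatment of the spectrum edges (the window is clipped and the weights are renormalized) is an assumption. The Savitzky–Golay, wavelet, and maximum entropy filters are not covered here; for the Savitzky–Golay filter, library routines such as `scipy.signal.savgol_filter` are one commonly used option.

```python
import math
import statistics


def _window(values, i, half_width):
    """Return the values within half_width of index i, clipped at the edges."""
    lo = max(0, i - half_width)
    hi = min(len(values), i + half_width + 1)
    return values[lo:hi]


def moving_average(values, half_width):
    """Replace each value by the unweighted mean of its window."""
    return [statistics.fmean(_window(values, i, half_width))
            for i in range(len(values))]


def median_filter(values, half_width):
    """Replace each value by the median of its window."""
    return [statistics.median(_window(values, i, half_width))
            for i in range(len(values))]


def binomial_average(values, half_width):
    """Weighted average using binomial coefficients, e.g. (1, 2, 1) / 4."""
    n = 2 * half_width
    weights = [math.comb(n, k) for k in range(n + 1)]
    smoothed = []
    for i in range(len(values)):
        acc, norm = 0.0, 0
        for k, w in enumerate(weights):
            j = i + k - half_width
            if 0 <= j < len(values):  # assumption: clip and renormalize at edges
                acc += w * values[j]
                norm += w
        smoothed.append(acc / norm)
    return smoothed


# A flat signal with a single spike, similar to a cosmic ray event.
spike = [0, 0, 0, 100, 0, 0, 0]
print(moving_average(spike, 1))
print(binomial_average(spike, 1))
print(median_filter(spike, 1))
```

With a window of three points, the median filter removes the isolated spike completely, whereas both averaging filters only spread it over the neighboring points.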
Fig. 4.4 Effect of the Savitzky–Golay filter. The black points in the top spectrum correspond to
the data points recorded from the CCD camera (background subtracted) and the red line shows the
spectrum after smoothing using the Savitzky–Golay filter. The blue curve in the bottom is the
derivative of the spectrum obtained through the Savitzky–Golay filter
This resulting curve is then subtracted. Using this method, great care must be
taken to ensure that no relevant information of the spectrum is altered.
Wavelet transformation techniques
The principles of wavelet transformation techniques have already been described
in Sect. 4.4.2. Through the appropriate combination of the detail results (one
available per decomposition level) and the approximation result (only the last one
is typically used here) one can subtract the background. In this case the approximation
result is generally omitted from the reconstruction.
As stated above, in univariate data analysis, each spectrum determines one value of
the corresponding pixel in the image or the images. The value of these pixels can be
determined by simple filters or by fitting procedures.
Fig. 4.5 Usage of an integrated intensity filter with an oil–water–alkane immersion. The spectra
(a) are integrated in three different spectral areas. The water peak (blue) is evaluated without
background subtraction and results in image (c). The oil peak is integrated as shown in green with the
pixels adjacent to the higher wave number side of the peak used as the background level and results
in image (d). The alkane peak is integrated using pixels to the left and right of the integration area
for background calculation (red) and results in image (e). Image (b) shows the combined image of (c), (d), and (e)
It should be noted that many of the filters used allow the extraction of a large
amount of information from the data. However, there is also the danger of misinter-
pretation. The list below shows some typical simple filters and their usage as well
as considerations which should be taken into account to avoid misinterpretations.
– Other materials present in the sample might also have peaks at this position.
– The amount of material between the objective and the focal point might change
and this would also have an influence on the absolute intensity of the peak.
– The polarization direction of the laser relative to the structure can also have an
influence on this peak intensity.
– If the software does not provide good background subtraction methods, then
changes in the background can influence the result.
Simple filters have the advantage of being relatively low in processor load and
thus they can be applied during ongoing data acquisition.
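A simple integrated intensity (sum) filter of the kind described above could look as follows. This is only a sketch: the function name and its parameters are hypothetical, and the background is modeled as a constant level given by the mean of user-selected pixels, which is one possible choice among several (a sloped baseline between the two sides of the peak would be another).

```python
def integrated_intensity(spectrum, start, stop, background_pixels=()):
    """Sum spectrum[start:stop] after subtracting a constant background level.

    background_pixels lists the indices whose mean is taken as the
    background level; an empty sequence means no background subtraction.
    """
    region = spectrum[start:stop]
    if background_pixels:
        level = sum(spectrum[i] for i in background_pixels) / len(background_pixels)
    else:
        level = 0.0
    return sum(region) - level * len(region)


# Toy spectrum: a peak on a constant offset of 10 counts.
spectrum = [10, 10, 10, 30, 50, 30, 10, 10, 10]

print(integrated_intensity(spectrum, 3, 6))                # no background
print(integrated_intensity(spectrum, 3, 6, (6, 7)))        # one-sided background
print(integrated_intensity(spectrum, 3, 6, (1, 2, 6, 7)))  # two-sided background
```

Applying such a function to every spectrum of a scan yields one value per pixel and thus one image. Since only a sum over a few pixels is required, the low processor load mentioned above is plausible.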
$m_u \times \mathrm{Gauss} + (1 - m_u) \times \mathrm{Lorentzian}$
Here $m_u$ is the profile shape factor. Pseudo-Voigt curves can additionally be differentiated
by the degree of freedom of the FWHM (identical for Gauss and Lorentz or
variable).
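The pseudo-Voigt profile can be written down directly from this expression. The sketch below assumes the variant in which the Gaussian and the Lorentzian share the same FWHM and are both normalized to unit height; the function names and the numerical values in the example are hypothetical.

```python
import math


def gaussian(x, center, fwhm):
    """Gaussian profile with unit height and the given full width at half maximum."""
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    return math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))


def lorentzian(x, center, fwhm):
    """Lorentzian profile with unit height and the given full width at half maximum."""
    half = fwhm / 2.0
    return half ** 2 / ((x - center) ** 2 + half ** 2)


def pseudo_voigt(x, center, fwhm, m_u, amplitude=1.0):
    """m_u * Gauss + (1 - m_u) * Lorentzian with a shared FWHM (assumption)."""
    return amplitude * (m_u * gaussian(x, center, fwhm)
                        + (1.0 - m_u) * lorentzian(x, center, fwhm))


# Hypothetical peak at 520 rel. 1/cm with a FWHM of 4 rel. 1/cm.
for x in (516.0, 518.0, 520.0, 522.0, 524.0):
    print(x, round(pseudo_voigt(x, 520.0, 4.0, m_u=0.3), 4))
```

Because both components have unit height and the same width, the mixed profile reaches exactly half of its maximum at a distance of FWHM/2 from the center for any value of the shape factor. In a fitting routine, the center, FWHM, amplitude, and shape factor would be the free parameters.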
In addition to the above-discussed mixture of Lorentzian and Gaussian line pro-
files, the Raman curves may also be distorted by the local sample environment.
Instrument functions will play an additional role. The change of the signal due to
these instrument functions is heavily dependent on the microscope and spectrometer
design [6] and it will depend on the instrument if this needs to be taken into account
for peak fitting.
For these reasons, there is not one perfect peak function which can be used to
perfectly match all Raman peaks. Table 4.2 shows some common fitting filters and
which information can be obtained through them.
As can be seen from Table 4.2, all of the filters listed deliver more than one
result per curve fitted. Therefore, the results after applying such filters to a confocal
Raman image data set are multiple images as can be seen in Fig. 4.6. In addition
to the results described in Table 4.2, the software should provide an error image in
order to allow the user to quickly determine if the fitting error in a certain region
is larger due to, for example, a line distortion. In Fig. 4.6 a confocal Raman image
was recorded of a Vickers indent in Si and the resulting spectra were fitted using a
Lorentzian function. The peak shift as well as the broadening can clearly be seen
from the spectra (Fig. 4.6a) as well as from the images (Fig. 4.6b and c).
Fig. 4.6 Lorentzian fitting of first-order Si peaks around a Vickers indent. The spectra (a) are
extracted from the points indicated by the corresponding colors in the images. (b) Shows the
position of the first-order Si peaks and (c) shows the width of the line. Both are results from a
Lorentzian fit
In the following this reduction is explained in a greatly simplified way. The user
is referred to, for example, [8, 9] for further reading and detailed explanation of the
method.
Fig. 4.7 Principal components of a typical two-dimensional data set (a) Noise-dominated data set
in which principal components can no longer be found (b)
result one gets a certain number of areas or masks which indicate where the spectra
belonging to the various clusters were acquired as well as the average spectra of
each cluster. Other applications also include the identification of bacterial strains
and even their position in their life cycle or the identification of pathogenic cells.
Cluster analysis has the advantage of being an automated and objective method
to find similar regions in spectral data sets. It can, however, require significant pro-
cessing power and time.
There are various ways of clustering the data and each has its advantages and
disadvantages. In the following, some clustering principles are briefly introduced
before two typical clustering methods as well as one variation are described. For
detailed descriptions of cluster analysis, the reader is referred to the literature, for
example, [9].
each cluster. Once the cluster tree is calculated the height, or extraction level, must
be defined and from this the masks and average spectra can be extracted.
While this method is almost completely unsupervised (with the exception of the
extraction level) it requires a huge amount of processing power and/or time.
Fig. 4.8 K-means cluster analysis of an oil–water–alkane immersion. (a) Cluster tree with
the root cluster on the left, the first level of clusters in the middle, and a further division into
sub-clusters on the right. The sub-clusters each show a mixed phase marked in black. (b) Alkane
spectrum (red) and the mixed phase (black) as extracted from the top two clusters. (c) Mixed alkane
and water spectrum (blue) and the mixed phase (black) as extracted from the middle two clusters.
The water phase does not exist as a pure phase in this sample. (d) Oil spectrum (green) and the
mixed phase (black) as extracted from the bottom two clusters
as a pure component in this sample but is always to some degree mixed with the
alkane.
Once the number of clusters (N) is defined for the K-means cluster analysis, the
algorithm first defines N centers in the 1600-dimensional space and assigns each
point (spectrum) to the center closest to it. Then the centroid (one might also call it an
average spectrum) for each group is computed. Following this, the spectra are again
sorted according to their distance to the calculated centroids and then the procedure
is repeated. The algorithm is typically stopped once the assignment of the points
(spectra) to their group ceases to change.
While this method needs somewhat more supervision than hierarchical clustering
and is heavily dependent on the selection of the N initial centers, it requires much
less processing power. For some commercial confocal Raman microscopes it can
even be applied as an online evaluation tool during confocal Raman measurements
with acquisition speeds of more than 600 spectra/s.
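The K-means procedure described above can be sketched as follows. This is a minimal illustration under several assumptions: the Euclidean distance is used as the distance measure, the initial centers are passed in explicitly rather than chosen by the algorithm, and the function name is hypothetical. The toy spectra have only two channels instead of the 1600 mentioned above.

```python
import math


def k_means(spectra, initial_centers, max_iterations=100):
    """Assign each spectrum to its nearest center and update the centroids
    until the assignment no longer changes."""
    centers = [list(c) for c in initial_centers]
    labels = None
    for _ in range(max_iterations):
        # Assignment step: index of the closest center for every spectrum.
        new_labels = [
            min(range(len(centers)), key=lambda k: math.dist(s, centers[k]))
            for s in spectra
        ]
        if new_labels == labels:
            break  # assignments ceased to change
        labels = new_labels
        # Update step: each centroid is the average spectrum of its group.
        for k in range(len(centers)):
            members = [s for s, lab in zip(spectra, labels) if lab == k]
            if members:
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


spectra = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
labels, centers = k_means(spectra, initial_centers=[[1.0, 0.0], [0.0, 1.0]])
print(labels)
print(centers)
```

The list of labels corresponds to the binary masks (one per cluster) and the final centers to the average spectra of the clusters. Passing different initial centers can lead to a different result, which reflects the dependence on the selection of the initial centers noted above.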
Fuzzy Clustering
In the hierarchical and K-means cluster algorithms, each spectrum either belongs to
a cluster or does not. This is why the image outputs of these algorithms are binary
masks (one for each cluster extracted).
In fuzzy clustering, the spectra can belong “to a certain degree” to a cluster. If
a spectrum is located inside the cluster, then it belongs more to this cluster than
one on the edge of it. Image outputs of this algorithm display this variation and are
therefore not binary, but each pixel value typically has a value between 0 and 1 (or
100%).
This method instantly shows gradients in the images due to each pixel now
having a certain probability of belonging to one cluster or another. One can also
interpret this value as a measure of how well the spectrum fits to the corresponding
cluster. However, the resulting clusters cannot be clustered further as is the case for
classical K-means clustering.
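The graded membership can be illustrated with the membership formula of the widely used fuzzy c-means algorithm. Whether a given software package uses exactly this formula is an assumption; the function name and the fuzziness parameter (commonly set to 2) are likewise chosen for illustration.

```python
import math


def fuzzy_memberships(spectrum, centers, fuzziness=2.0):
    """Return the degree of membership (0 to 1) of a spectrum in each cluster,
    using the fuzzy c-means membership formula (assumption)."""
    distances = [math.dist(spectrum, c) for c in centers]
    zeros = [d == 0.0 for d in distances]
    if any(zeros):
        # The spectrum coincides with one or more centers.
        n = sum(zeros)
        return [1.0 / n if z else 0.0 for z in zeros]
    exponent = 2.0 / (fuzziness - 1.0)
    inverse = [d ** -exponent for d in distances]
    total = sum(inverse)
    return [v / total for v in inverse]


centers = [[0.0, 0.0], [4.0, 0.0]]
print(fuzzy_memberships([1.0, 0.0], centers))  # closer to the first center
print(fuzzy_memberships([2.0, 0.0], centers))  # halfway between both centers
```

The memberships of one spectrum sum to 1, so that a spectrum close to a cluster center receives a value near 1 for this cluster, whereas a spectrum halfway between two centers receives 0.5 for each.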
than if only a small part of the spectrum is used (as is the case when using a sum
filter, for example).
Basis spectra can be the spectra of the pure components present in the sample. This
is the ideal case. Care must be taken, however, to record the spectra with exactly the
same settings as used when the confocal Raman image was recorded. Typically the
same integration time and many accumulations are chosen for this in order to obtain
basis spectra with a good S/N ratio. An additional point which should be taken into
consideration is whether the pure component can be present in the sample or if it
might have undergone a chemical reaction to form a new component.
Quite often the pure spectra cannot (easily) be acquired from any arbitrary sam-
ple. In this case the basis spectra should be extracted from the scan. The selective
averaging described in Sect. 4.6 is one method to perform this. Care must be taken
to ensure that the spectra are pure and not mixed themselves. If they are mixed spec-
tra, they need to be de-mixed because the fitting procedure will not work properly
otherwise, as described in the following.
The fitting procedure is essentially fitting each of the spectra recorded using the
basis spectra. It tries to minimize the fitting error D described by the equation
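$$D = \left( \overrightarrow{[\mathrm{RecordedSpectrum}]} - a \times \overrightarrow{BS_A} - b \times \overrightarrow{BS_B} - c \times \overrightarrow{BS_C} - \cdots \right)^2 \qquad (4.1)$$
by varying the weighting factors a, b, c, ... of the basis spectra $\overrightarrow{BS}$.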
In order to improve such a fit, it is not advisable to use the entire recorded spectral
range (e.g., from −100 to 3500 cm−1). The Rayleigh peak and common parts in
the spectra (such as a glass substrate, for example) are best excluded from the fit.
Additionally, parts of the spectra that do not contain Raman information should not
be taken into account as they only contribute noise.
Following the fit of all the tens of thousands of spectra (using (4.1) for each
spectrum recorded), the algorithm constructs one image for each basis spectrum
showing the factors a, b, c, ... plus one image for the fitting error D.
Care must be taken in order to avoid using mixed spectra as the basis spectra.
If such spectra are used the weighting factors can become negative, or if the fit is
constrained to weighting factors greater than zero, the fit will not work properly.
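The minimization of the fitting error D in (4.1) is a linear least-squares problem and can be sketched for a single spectrum as shown below. This is an illustration only: the fit is unconstrained (negative weighting factors are allowed), the normal equations are solved with a small hand-written Gaussian elimination, and all function names are hypothetical. Linearly dependent basis spectra would lead to a division by zero in this simple solver.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in reversed(range(n)):
        rest = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - rest) / aug[row][row]
    return x


def fit_basis_spectra(recorded, basis_spectra):
    """Return (weights, D) minimizing D = |recorded - sum_i w_i * basis_i|^2."""
    # Normal equations: (B B^T) w = B r, with the basis spectra as rows of B.
    gram = [[sum(p * q for p, q in zip(bi, bj)) for bj in basis_spectra]
            for bi in basis_spectra]
    projection = [sum(p * r for p, r in zip(b, recorded)) for b in basis_spectra]
    weights = solve_linear_system(gram, projection)
    fitted = [sum(w * b[i] for w, b in zip(weights, basis_spectra))
              for i in range(len(recorded))]
    error = sum((r - f) ** 2 for r, f in zip(recorded, fitted))
    return weights, error


# Toy example: the recorded spectrum is 2 x basis A + 0.5 x basis B.
basis_a = [1.0, 0.0, 2.0, 0.0]
basis_b = [0.0, 1.0, 0.0, 3.0]
recorded = [2.0, 0.5, 4.0, 1.5]
weights, error = fit_basis_spectra(recorded, [basis_a, basis_b])
print(weights, error)
```

Repeating this fit for every spectrum of a scan yields one image per basis spectrum (the weighting factors) plus one image of the fitting error D.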
Figure 4.9 shows an example of basis analysis. Here a thin layer of a PS-PMMA
polymer blend was investigated with very short integration times (4.3 ms). The indi-
vidual spectra were thus relatively noisy as can be seen from the red and blue spectra
in Fig. 4.9(c) and (e).
Fig. 4.9 Basis analysis of a PS-PMMA polymer blend. (a) Average spectra used for the fitting
procedure. (c) and (e) Original spectra (red and blue, respectively) and fitted spectra (black) for PS
(red) and PMMA (blue). The original spectra were recorded at the crosses indicated in the images
on the right with the corresponding colors. (b) and (d) Resulting image showing the distribution of
PS [b red frame] and PMMA [d blue frame] following the basis analysis. Brighter colors indicate
a higher fitting factor and thus a higher signal intensity of the basis spectrum at the corresponding
position
Using selectively averaged spectra with a good signal to noise ratio (see Fig. 4.9a)
one can fit the individual, noisy spectra using the information contained within the
entire spectral range. This results in a significant improvement of the S/N ratio and
the contrast of the resulting images (see Fig. 4.9b and d).
However, if multiple images such as the results of the basis analysis need to
be presented, the number of images can quickly become too large. It might be of
additional interest to see if certain components are present as pure or mixed phases.
In such cases the combination of images is a good way to illustrate the distribu-
tion of components. In Fig. 4.5(b) the distribution of oil, water, and alkane is shown
in green, blue, and red, respectively.
The color scales for each component are first adjusted, so that each component
has an individual color scale (red, green, and blue (RGB) in this case). The images
are then combined. One can combine the images by layering them and making the
upper layers transparent depending on the value of each pixel. Another method is
to combine the colored pixels in an additive way in order to illustrate the mixing of
phases better. The colors are then combined and where, in the example above, water
(blue) and alkane (red) are present, the resulting color is mixed (violet).
Note that the definition of the range of the color scale bar has a significant influ-
ence on the appearance of images and that care must be taken to choose appropriate
settings.
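The additive combination of images can be sketched as follows. The sketch assumes that each component is assigned to exactly one of the red, green, and blue channels and that each image is scaled linearly between a user-defined lower and upper limit of its color scale; the function names and the pixel values are hypothetical.

```python
def scale_to_unit(image, low, high):
    """Map pixel values linearly from [low, high] to [0, 1], clipping outside."""
    span = high - low
    return [[min(1.0, max(0.0, (v - low) / span)) for v in row] for row in image]


def combine_rgb(red, green, blue):
    """Combine three scaled images additively into rows of (r, g, b) tuples."""
    return [list(zip(r_row, g_row, b_row))
            for r_row, g_row, b_row in zip(red, green, blue)]


# One-row toy images (2 pixels) for three components.
alkane = [[0.0, 10.0]]   # shown in red
oil = [[0.0, 0.0]]       # shown in green
water = [[5.0, 10.0]]    # shown in blue

rgb = combine_rgb(
    scale_to_unit(alkane, 0.0, 10.0),
    scale_to_unit(oil, 0.0, 10.0),
    scale_to_unit(water, 0.0, 10.0),
)
print(rgb)
```

In the second pixel both the red and the blue channels are at their maximum, which corresponds to the mixed (violet) color described above. Changing the lower and upper limits passed to the scaling function changes the resulting colors, which illustrates the influence of the color scale range.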
In this section a very noisy example data set is evaluated to illustrate the implemen-
tation and capability of the methods described above and show that even though the
S/N ratio of the individual spectra is at first glance insufficient, the algorithms used
can produce images and spectra of high quality.
The sample investigated was a PET-PMMA polymer blend spin coated onto a
glass slide. The data were acquired in EMCCD mode and Fig. 4.10 shows an exam-
ple of three of the 22,500 spectra recorded.
Fig. 4.10 Noisy PET-PMMA Raman image. The spectra (a) on the left show the three differ-
ent components acquired at the positions marked in the image (b) with the corresponding colors
(green = glass, red = PET, blue = PMMA). The image (b) on the right displays the distribution of
PMMA in the sample as derived through a simple sum filter
The noise level in the spectra is significant and the S/N ratio is only slightly
above 1. Even with this noise level, the sum filter applied to the CH-stretching
region where PMMA shows a signal produces some image contrast as can be seen
in Fig. 4.10(b).
Following cosmic ray removal and a background subtraction, a K-means cluster
analysis was performed on the data set, resulting in three clusters. The average
spectra of these clusters as well as the color-coded cluster map are shown in
Fig. 4.11.
It can be clearly seen that the quality of the spectra as well as the spatial assign-
ment of the pixels and thus the image contrast is dramatically improved. As can be
seen from Fig. 4.11(a), all spectra still contain the glass background due to the lim-
ited depth resolution of the confocal setup and the thickness of the film (<<50 nm).
It is additionally noticeable that the glass spectrum still shows a small peak in the
CH-stretching region. This is due to edge effects of the clusters as already discussed
in Sect. 4.5.2.2.
The spectra can now be de-mixed further by correct subtraction of the spectra
from each other, which then results in the spectra shown in Fig. 4.12(c).
These spectra are used for the basis analysis and the resulting image for the
PET and the PMMA is shown in Fig. 4.12(a) and (b), respectively. The scale
bars to the right of image Fig. 4.12(a) and (b) indicate the fitting value. The
glass background image (not shown here) now shows a homogenous distribution.
Fig. 4.12(d) shows the combined image of the three components following basis
analysis.
Figure 4.13 illustrates the effect on image contrast for the PMMA phase by simply
displaying Figs. 4.10(b) and 4.12(b) next to each other. The enhancement in
image contrast due to the multivariate data methods and the de-mixing is clearly
visible. This is especially apparent in the region where PET is located, where the
contrast is strongly enhanced.
Fig. 4.11 PET-PMMA Raman image and spectra following K-means cluster analysis. The spectra
on the left (a) show the average spectra of the three clusters (green = glass, red = PET, blue =
PMMA) and the image on the right (b) shows the combined cluster map with the pixels color-coded
as the spectra according to their cluster affiliation
Fig. 4.12 The results of the basis analysis for PET (a) and PMMA (b), the de-mixed spectra used
for the basis analysis (c) and the combined image (d) of the three components (green = glass, red =
PET, blue = PMMA)
Fig. 4.13 Comparison of the image contrast before and after data evaluation. Image (a) was
obtained through the sum filter and image (b) after basis analysis.
The scale bars on the right-hand side of the images are an additional indicator
that the sum filter only uses the number of detected electrons¹ in the CH-stretching
band, whereas the fitted image uses the photons in the entire spectral range, which
is reflected in the higher scale bar values in Fig. 4.13(b).
This example demonstrates that, with a sufficiently large number
of spectra (22,500 in this case), it is still possible to obtain a great amount of
spectral and spatial information from the data sets by using multivariate methods
and advanced data analysis algorithms, even though the individual spectra have a
very low signal to noise ratio. However, the signals in the spectra still need to be
sufficiently high to allow the cluster analysis to distinguish one cluster from another.
In order to illustrate the data processing of Raman spectra, a few example data sets
were utilized throughout this chapter. The samples as well as the acquisition details
will be explained in the following.
All data presented were recorded using an alpha300R confocal Raman microscope
from WITec GmbH, a frequency-doubled Nd:YAG laser (532 nm) and a spectrometer
equipped with a 600 g/mm and an 1800 g/mm grating as well as a back-illuminated
CCD camera. The samples were
• A PS-PMMA polymer blend either dropped onto or spin coated onto a glass slide
• A Si[100] wafer with an indent produced using a nano-indenter
• An oil–alkane–water (O–A–W) mix which was placed between two cover
slips
• A PET-PMMA polymer blend spin coated on a glass slide
The objective used was either a 100× air objective with an NA of 0.9 or 0.95 or
a 100× oil immersion objective with an NA of 1.25 (oil–alkane–water sample).
Further experimental details can be found in Table 4.3.
¹ Since this measurement is performed in EMCCD mode, the signal is strongly amplified and thus
does not represent the number of photons.
References
1. WITec GmbH, Ultrafast confocal Raman imaging – application examples. http://www.witec.de/en/download/Raman/UltrafastRaman.pdf (2008)
2. L. Quintero, S. Hunt, M. Diem, Denoising of Raman spectroscopy signals. Poster presented at
the 2007 R2C Multi Spectral Discrimination Methods Conference (2007)
3. W.H. Press, S.A. Teukolsky, W.T. Vetterling, B.P. Flannery, Numerical Recipes in C, 2nd edn.,
chap. Savitzky-Golay Smoothing Filters, pp. 650–655 (The Press Syndicate of the University
of Cambridge, 1999)
4. P. Ramos, I. Ruisánchez, J. Raman Spectrosc. 36, 848 (2005)
5. G. Gaeta, C. Camerlingo, R. Ricio, G. Moro, M. Lepore, P. Indovina, Proc. SPIE 5687, 170
(2005)
6. T. Dieing, O. Hollricher, Vib. Spectrosc. 48, 22 (2008)
7. K. Pearson, Philos. Mag. 2(6), 559 (1901)
8. C. Bishop, Pattern Recognition and Machine Learning (Springer, New York, NY, 2007)
9. T. Hastie, R. Tibshirani, J. Friedman, The Elements of Statistical Learning (Springer, New York,
NY, 2009)