
COMPUTING METHODS IN HIGH ENERGY PHYSICS

LECTURE NOTES

S. Lehti and V. Karimäki


Helsinki Institute of Physics
University of Helsinki

Helsinki January 2010


Contents

1 Review of basic tools
  1.1 Unix Shells
  1.2 CVS - Concurrent Versions System
  1.3 GIT code management system
  1.4 LaTeX
  1.5 Makefile

2 FORTRAN
  2.1 Overview
  2.2 FORTRAN variables
  2.3 Control flow
  2.4 Input/Output
  2.5 Subprograms
  2.6 COMMON block
  2.7 Real world example: PYTHIA

3 C++ programming
  3.1 Expressions
  3.2 Statements
  3.3 Functions
  3.4 Pointers
  3.5 Arrays
  3.6 Abstract Data Structures
  3.7 Member functions
  3.8 Classes
  3.9 Input/Output
  3.10 Object Creation and Destruction
  3.11 Inheritance
  3.12 Templates
  3.13 Containers and Iterators
  3.14 Polymorphism

4 ROOT
  4.1 Introduction
  4.2 Some practicalities
  4.3 Conventions and types
  4.4 Global variables
  4.5 History file
  4.6 Histograms
  4.7 Canvases
  4.8 Graphs
  4.9 Trees
  4.10 Fitting

5 Combining languages
  5.1 FORTRAN and C++
  5.2 C++ and ROOT

6 Cross section and branching ratio calculations
  6.1 Introduction
  6.2 Cross sections and branching ratios in event generators
    6.2.1 PYTHIA
    6.2.2 HDECAY
    6.2.3 FeynHiggs
    6.2.4 HQQ, HIGLU etc.
    6.2.5 MCFM
    6.2.6 PDFs

7 Review of event generators
  7.1 Introduction
  7.2 PYTHIA
  7.3 HERWIG
  7.4 Other event generators

8 Detector simulations
  8.1 Introduction
  8.2 GEANT4
  8.3 Usage of Geant4
  8.4 Dataset creation for full simulation

9 CMSSW
  9.1 Framework
  9.2 Environment setup
  9.3 Dataset creation
  9.4 Analysis example

10 Reconstruction
  10.1 Event reconstruction
  10.2 Local reconstruction
  10.3 Global reconstruction
  10.4 Combined reconstruction
  10.5 b and τ tagging
  10.6 Trigger
  10.7 Event reconstruction software in CMS

11 Fast simulation
  11.1 Introduction
  11.2 Single particle performance
  11.3 Particle showers
  11.4 CMS Fast Simulation
  11.5 Tracker response
  11.6 Calorimeter response to e/γ
  11.7 Calorimeter response to hadrons
  11.8 Response to muons
  11.9 Usage

12 Error estimation
  12.1 Propagation of measurement errors
  12.2 Error propagation in case of linear transformation
  12.3 Error propagation in case of non-linear transformation

13 Fitting parameterized model with experimental data
  13.1 The method of maximum likelihood
  13.2 Least squares method as a special case of maximum likelihood

14 Least squares solutions with error estimates
  14.1 Linear χ2 solution
  14.2 Non-linear least squares fit
    14.2.1 Convergence criteria
  14.3 Least squares fit with constraints
  14.4 Fit quality tests
    14.4.1 Pull or stretch values of fitted parameters
    14.4.2 Fit probability distribution
  14.5 Maximum Likelihood and Poisson statistics
    14.5.1 Maximum Likelihood and parameter errors

15 Introduction to Monte Carlo simulation
  15.1 Generation of non-uniform distribution by inversion method
  15.2 Inversion method for discrete distribution
  15.3 Approximate inversion method using tabulation
  15.4 Hit-or-Miss method
  15.5 Hit-or-Miss method by comparison function
  15.6 Composition method

16 Generation from common continuous distributions
  16.1 Uniform distribution
  16.2 Gaussian distribution
    16.2.1 Polar or Box-Muller method
    16.2.2 Faster variation of the polar method
    16.2.3 Gaussian generation by summation method
    16.2.4 Kahn's method
  16.3 Multidimensional Gaussian distribution
  16.4 Exponential distribution
  16.5 χ2 distribution
  16.6 Cauchy or Breit-Wigner distribution
  16.7 Landau distribution

17 Monte Carlo integration
  17.1 Basics of the MC integration
  17.2 Convergence and precision of the MC integration
  17.3 Methods to accelerate the convergence and precision
    17.3.1 Stratified sampling
    17.3.2 Importance sampling
    17.3.3 Control variates
    17.3.4 Antithetic variates
  17.4 Limits of integration in the MC method
  17.5 Comparison of convergence with other methods

18 Grid computing
  18.1 Introduction
  18.2 Grid middleware
  18.3 Usage of ARC
  18.4 Job monitoring
1 Review of basic tools

1.1 Unix Shells

The Shell is a program that runs automatically when you log in to a Unix/linux system. The
Shell forms the interface between users and the rest of the system. It reads each command
you type at your terminal and interprets what you have asked for. The Shell is very much
like a programming language with features like
• variables
• control structures (if,while,..)
• subroutines
• parameter passing
• interrupt handling

These features provide you with the capability to design your own tools. Shell scripting
is ideal for small utilities that perform relatively simple tasks, where efficiency is less
important than easy configuration, maintenance and portability. You can use the shell to
organize process control, so that commands run in a predetermined sequence, dependent on
the successful completion of each stage.

Shell Name History


sh (Bourne)   The original shell
csh, tcsh     The C shell and its enhanced successor
zsh           The Z shell, a Bourne-style shell with many extensions
bash          Bourne Again Shell, from the GNU project
rc            The Plan 9 shell, with a C-like syntax

File Handling

Most Unix commands take input from the terminal keyboard and send output to the terminal
monitor. To redirect the output, use the > or >> symbol:

$ ls > myFile # output to new file myFile


$ ls -l >> myFile # append to existing file

Likewise you can redirect the input with the < symbol:

$ wc -l < myFile

The need for redirection becomes more apparent in shell programming.

Pipes

The standard output of one process (or program) can be the standard input of another
process. When this is done, a "pipeline" is formed. The pipe operator is the | symbol.
Examples:

$ ls | sort
$ ls | sort | more
$ ls | sort | grep btag | more

Here the standard output of the command on the left of | becomes the standard input of the
command on the right. Pipelines provide a flexible and powerful mechanism for doing jobs
easily and quickly, without the need to construct special-purpose tools. Existing tools can
be combined. Another useful command:

$ find . -name "*.cpp" | xargs grep btag

This searches every file ending in .cpp in this directory and its subdirectories for the
string btag.

Shell Programming

Suppose you have a file called 'test' which contains unix commands. There are different ways
to get the system to execute those commands. One is giving the filename as an argument
to the shell command; alternatively one can use the command source, in which case the current
shell is used:

$ csh test (or sh test)


$ source test

One can also give execution rights to the file and then run it:

$ chmod 755 test

$ ./test

The ./ prefix is needed because the current directory is usually not in the command search
path, and plain test would invoke the system's own test command instead of your script.

Here the number coding in the chmod command comes from rwxrwxrwx, with rwx read as a
binary number: rwx=111=7 and r-x=101=5. You can check the rights of your files with the
-l option of the ls command.
A shell script usually starts with a line telling which program is used to execute the file.
For the Bourne shell the line looks like this:

#!/bin/sh

Comments start with a # and continue to the end of the line. The shell syntax depends on
which shell is in use. At CERN people have widely used csh (tcsh), but bash, as the Linux
default, has also gained ground. It's up to you which shell you prefer.

csh

Variables are used to hold temporary values and manage changeable information. There are
two types of variables:
• Shell variables
• Environment variables

Shell variables are defined locally in the shell, whereas environment variables are defined for
the shell and all the child processes that are started from it.
The set command is used to create new local variables and assign a value to them. Some
different methods of invoking the set command are as follows:
• set
• set name
• set name = word
• set name = (wordlist)
• set name[index] = word

The first three forms are used with scalar variables, whereas the last two are used with array
variables. The setenv command is used to create new environment variables. Environment
variables are passed to shell scripts and invoked commands, which can reference the variables
without first defining them.
• setenv
• setenv name value

To obtain the value of a variable a reference ${name} is used. Example

$ mv a.out analysis_${CHANNEL}_$RUN.out

Here the variables CHANNEL and RUN contain some values. A reference ${#name} returns
the number of elements in an array, and ${?name} returns 1 if the variable is set, 0 otherwise.
In addition to ordinary variables, a set of special variables is available.

Variable Description
$0 Shorthand for $argv[0]
$1,$2,...,$9 Shorthand for $argv[n]
$* Equivalent to $argv[*]
$$ Process number of the shell
$< A line read from standard input

Conditional statements are created with

if( expression ) then


...
else
...
endif

and loops with

foreach name (wordlist)


..
end

while ( expression )
...
end

An example csh script:

#!/bin/csh
if( ! ${?ENVIRONMENT} ) setenv ENVIRONMENT INTERACTIVE
if( $ENVIRONMENT != "BATCH" ) then
if( ! ${?SCRATCH} ) then
echo Setting workdir to HOME
setenv WORKDIR $HOME
else
echo Setting workdir to SCRATCH
setenv WORKDIR $SCRATCH
endif
setenv LS_SUBCWD $PWD
endif
##########################################
setenv INPUTFILE analysis.in
#setenv DATAPATH $LS_SUBCWD/data
setenv DATAPATH /mnt/data/data
##########################################

make

cd $WORKDIR
if( -f $INPUTFILE ) rm $INPUTFILE
cp -f $LS_SUBCWD/$INPUTFILE $WORKDIR
setenv LD_LIBRARY_PATH $LS_SUBCWD/../lib:${LD_LIBRARY_PATH}
echo
echo START EXECUTION OF JOB
echo
$LS_SUBCWD/../bin/${CHANNEL}_analysis.exe >& analysis.out
if( -f $WORKDIR/analysis.out ) mv $WORKDIR/analysis.out $LS_SUBCWD
echo
echo JOB FINISHED
echo
exit

bash

In bash the standard sh assignment is used to set variables (note that there must be no
spaces around the = sign):


name=value

The array variables can be set in two ways, either by setting a single element:
name[index]=value
or multiple elements at once:
name=(value1 .. valueN)
Exporting variables for use in the environment:
export name
export name=value
Arithmetic evaluation is performed when the following form is encountered: $(( expression )).
The basic if statement syntax is

if list ; then
list
else
list
fi

Most often the list given to an if statement is one or more test commands, which can be
invoked by calling test as follows:
test expression
[ expression ]
The other form of flow control is the case-esac block. Bash supports several types of loops:
for, while, until and select. All loops in bash can be exited with the built-in break
command.

Perl

Perl has become the language of choice for many Unix-based programs, including server
support for WWW pages. Perl is a simple yet useful programming language that provides the
convenience of shell scripts and the power and flexibility of high-level programming languages.
In Perl a variable can hold a string, an integer or a floating-point number. All scalar
variables start with the dollar sign $. The following assignments are all legal in Perl:

$variable = 1;
$variable = "my string";
$variable = 3.14;

To do arithmetic in Perl, the following operators are supported: + - * / ** %. Logical


operators are divided into two classes, numeric and string. The numeric logical operators
are <, >, ==, <=, >=, !=, ||, && and !. The logical operators that work with strings are
lt, gt, eq, le, ge and ne. The most common assignment operator is =. Autoincrement ++
and autodecrement -- are also available. Strings can be combined with the . and .= operators,
for example

$a = "be" . "witched";
$a = "be"; $a .= "witched";

The if conditional statement has the following structure: if (expr) {..}. Repetition can be
achieved using while and until, and looping can be done with the for statement.
Shell (system) commands can be run with the command
system("shell command string")

More about csh(tcsh),bash and perl: man pages and www like
http://www.eng.hawaii.edu/Tutor/csh.html
http://www.faqs.org/faqs/unix-faq/shell/csh-whynot
http://www.gnu.org/software/bash
http://www.perl.com
An example Perl script:

#!/usr/local/bin/perl

##########################################################
$ibg = 2020;
$eventsPerFile = 500;
$allEvents = 100000;
$firstEvent = 1;
$BatchQueue = "1nw";
$Jobfile = "offlineAnalysis.job";
$runNumber = 1;
##########################################################

$pwd = $ENV{'PWD'};
$orca = rindex("$pwd","ORCA");
$slash = index("$pwd","/",$orca);
$ORCA_Version = substr("$pwd",$orca,$slash-$orca);
$famos = rindex("$pwd","FAMOS");
$slash = index("$pwd","/",$famos);
$FAMOS_Version = substr("$pwd",$famos,$slash-$famos);
$ORCA_VERSION = $ORCA_Version;
if($famos > $orca) { $ORCA_VERSION = $FAMOS_Version; }
print "Submitting $ORCA_VERSION job(s)\n";
###########################################################

for ($i = 1; $i <= $allEvents/$eventsPerFile; $i++) {
    $LAST = $firstEvent + $eventsPerFile - 1;
    system("bsub -q $BatchQueue $Jobfile $ORCA_VERSION $runNumber $eventsPerFile $firstEvent");
    $firstEvent = $LAST + 1;
}
if($firstEvent < $allEvents){
    $LAST = $allEvents - 1;
    system("bsub -q $BatchQueue $Jobfile $ORCA_VERSION $runNumber $eventsPerFile $firstEvent");
}

Python

Python is a dynamic programming language used in a wide variety of application domains.
It is an object-oriented, interpreted, and interactive programming language. It has modules,
classes, exceptions, very high level dynamic data types, and dynamic typing. There are
interfaces to many system calls and libraries, as well as to various windowing systems. New
built-in modules are easily written in C or C++ (or other languages).

A recommended coding style is to use 4-space indentation and no tabs. CMS software
(CMSSW) configuration files use Python. More at http://www.python.org/
An example Python script (truncated):

#!/usr/bin/env python

import sys
import os
import re
import fileinput
#import commands
#import getopt

ConfigFile = 'TTEffAnalyzer_cfg.py'
OutFileName = 'file:tteffAnalysis.root'
root_re = re.compile("(?P<file>'([^']*\.root)')")
cfg_re = re.compile("(\..*?)$")
job_re = re.compile("(\.py)")

i = 0
for dline in fileinput.input():
    match = root_re.search(dline)
    if match:
        i = i + 1
        jobNumber = '_%i' % (i)
        cfgFileName = cfg_re.sub("%s\g<1>" % jobNumber, ConfigFile)
        inFileName = match.group("file")
        outFileName = cfg_re.sub("%s\g<1>" % jobNumber, OutFileName)
        cfgFile = open(cfgFileName, 'w')
        inFile = open(ConfigFile, 'r')
        for line in inFile:
            if 'outputFileName' in line:
                cfgFile.write('    outputFileName = cms.string("' + outFileName + '")\n')
            elif 'process.source = source' in line:
                cfgFile.write('process.source = cms.Source("PoolSource",\n')
                # ... (further writes truncated in the original)
                # cfgFile.write('    fileNames = cms.untracked.vstring(\n')
                # cfgFile.write('        ' + inFileName + '\n')
                # cfgFile.write('    )\n')
            elif 'input = cms.untracked.int32' in line:
                cfgFile.write('    input = cms.untracked.int32(-1)\n')
            else:
                cfgFile.write(line)
        inFile.close()
        cfgFile.close()

        jobFileName = job_re.sub(".job", cfgFileName)
        jobFile = open(jobFileName, 'w')
        jobFile.write("#!/bin/csh\n")
        # ... (job commands truncated in the original)
        jobFile.close()

        cmd = "bsub -q 1nd " + jobFileName
        # os.system("chmod +x " + jobFileName)
        os.system(cmd)

1.2 CVS - Concurrent Versions System

In HEP experiments the software is typically written by the collaboration members. Some
part of the software can be commercial programs, but most of the code must be written from
scratch.
CVS is a tool to manage the source code.
• to prevent overwriting others’ changes
• keeps record of the versions
• allows users and developers an easy access to releases and prereleases

For example almost all of the CMS software is available via CVS. Only old software written
in FORTRAN is not necessarily in CVS, but as the programs are being rewritten in C++,
the fraction of code not in CVS is decreasing.
Documentation for CVS can be found in WWW, e.g.:
http://ximbiot.com/cvs/cvshome
Some useful commands:
How to login? Here is an example for CMS:
Set your environment variable CVSROOT

$ set cvsbase = ":pserver:anonymous@cmscvs.cern.ch:/cvs_server/repositories"

Let’s choose CMSSW, the new detector simulation and reconstruction code

$ setenv CVSROOT "${cvsbase}/CMSSW"

At lxplus.cern.ch this is done with the command 'cmscvsroot CMSSW', which is a csh script
setting the environment variable.

$ cvs login (type the password '98passwd')

How to access the head version of the code? Use cvs check-out

$ cvs co IOMC

How to access a chosen release?

$ cvs co -r CMSSW_1_6_7 IOMC

The program version above is 1_6_7; releases change quite rapidly.

1.3 GIT code management system

Git is an open source, distributed version control system designed to handle everything from
small to very large projects with speed and efficiency.
Git does not use a centralized server.
Projects using Git: Git, Linux Kernel, Perl, Ruby on Rails, Android, WINE, Fedora, X.org,
VLC, Prototype
Useful links for documentation:
http://git-scm.com/
http://jonas.nitro.dk/git/quick-reference.html
First you need to introduce yourself to git with your name and public email address before
doing any operation.

$ git config --global user.name "Your Name Comes Here"
$ git config --global user.email [email protected]

Some useful commands:

$ git clone <repo>


$ git commit -a -m "Comment text"
$ git add file
$ git rm file
$ git remote add public ${HOME}/public/html/HipProofAnalysis.git
$ git push public refs/heads/master:refs/heads/master
$ git branch -a

How to make a public repository (lxplus):

$ cd $HOME/public/html
$ mkdir MyGitRepo.git
$ cd MyGitRepo.git
$ git --bare init
$ chmod a+x hooks/post-update
$ git update-server-info
$(git config http.sslVerify false)

Taking code from others:

$ git remote add matti <repo>


$ git fetch matti
$ git merge matti/master

1.4 LaTeX

LaTeX is the tool to write your papers, and at this stage of your studies you should already
know how to use it. If not, then learn it now! LaTeX documentation can be found in the
literature and on the web; Google gives you several options for finding it. You might
try e.g.:
http://en.wikibooks.org/wiki/LaTeX/Basics
http://www.ctan.org/tex-archive/info/lshort/english/lshort.pdf
http://wwwinfo.cern.ch/asdoc/psdir/texacern.ps.gz (use wget to make a copy for you)
Here we concentrate on two tools which you might need in the future: BibTeX and feynMF.
The first is an easy way of managing your references, and the second one is a tool to draw
Feynman graphs.

BibTeX

When using BibTeX, LaTeX manages your bibliography for you. It puts your references
in the correct order, and it does not show references you have not used. This way you can
have the same custom bib file for every paper you write; you just add more references when
needed. You need two files: a bib file and a style file. Here is an example style file based
on the JHEP style
http://cmsdoc.cern.ch/~slehti/lesHouches.bst
The bib file you have to write yourself, or you can use the SPIRES automated bibliography
generator
http://www-spires.fnal.gov/spires/hep/refs/bibtex.shtml
to make the entries for you. Here is an example reference from a bib file:

@Article{L1_TDR,
title = "The Level-1 Trigger",
journal = "CERN/LHCC 2000-038",
volume = "CMS TDR 6.1",
year = "2000"
}

The citation (\cite{L1_TDR}) in the text works as usual. In your tex file you need to have
\bibliographystyle{lesHouches}. BibTeX is run like LaTeX: latex-bibtex-latex-latex-dvips.

feynMF

feynMF is a package for easy drawing of professional-quality Feynman diagrams with
METAFONT. The documentation for feynMF can be found at
http://xml.web.cern.ch/XML/textproc/feynmf.html
http://tug.ctan.org/cgi-bin/ctanPackageInformation.py?id=feynmf
http://mirror.ctan.org/macros/latex/contrib/feynmf/manual.ps
Instructing LaTeX to use feynMF is done by including the feynmf package (the file
feynmf.sty is used):

\usepackage{feynmf}

Processing your document with LaTeX will generate one or more METAFONT files, which
you will have to process with METAFONT. The METAFONT file name is given in the
document in the fmffile environment:

\begin{fmffile}{<METAFONT-file>}
...
\end{fmffile}

You would need the file feynmf.mf in your work directory. You can obtain it e.g. by the
command:
wget http://www.phys.uu.nl/~koetsier/pdf/feynmf.mf
The following code produces the figure feynmanGraph.pdf (shown below). Use the display or
acroread command to view the PDF file.
File feynmanGraph.tex:

\documentclass{report}
\usepackage{graphicx,feynmf}
\pagestyle{empty}
\begin{document}
\begin{figure}[h]
\begin{fmffile}{triangleGraph}
\centering
\parbox{50mm}{
\begin{fmfgraph*}(100,60)
\fmfleft{i1,i2}
\fmfright{o1}
\fmf{curly,tension=0.5}{i1,v1}
\fmf{curly,tension=0.5}{i2,v2}
\fmf{fermion,tension=0.1}{v2,v1}
\fmf{fermion,tension=0.3}{v1,v3}
\fmf{fermion,tension=0.3}{v3,v2}
\fmf{dashes}{v3,o1}
\fmflabel{$g$}{i1}
\fmflabel{$g$}{i2}
\fmflabel{$h,H,A$}{o1}
\end{fmfgraph*}}
\end{fmffile}
\end{figure}
\end{document}

latex feynmanGraph.tex
mf ’\mode:=localfont; input triangleGraph’
latex feynmanGraph.tex
dvipdf feynmanGraph.dvi
display feynmanGraph.pdf # or acroread feynmanGraph.pdf

METAFONT on the resulting triangleGraph.mf file is executed as follows:

mf ’\mode:=localfont; input triangleGraph;’


[Figure: the triangle diagram produced by the code above, with incoming legs labelled g, g
and the outgoing leg labelled h, H, A]

It produces the font files *.tfm and *.600gf. The whole chain is latex-mf-latex-dvipdf. The
name of the .mf file, given as an argument of the \begin{fmffile}{...} command, must not
be the same as the name of the .tex file.

1.5 Makefile

The make utility automatically determines which pieces of a large program need to be
recompiled and issues the commands to recompile them. However, make is not limited to
programs: it can be used to describe any task where some files must be updated automatically
from others whenever the others change.

To run make, you must have a makefile, which contains the rules that make executes.
The default names for makefiles are GNUmakefile, makefile and Makefile (in this order).
If some other name is preferred, it can be used with the command make -f myMakefile.

A documentation about make can be found in


www.gnu.org/software/make/manual/make.html
A makefile is an ASCII text file containing any of the four types of lines:
• target lines
• shell command lines
• macro lines
• make directive lines (such as include)

Makefile rules

Target lines tell make what can be built. Target lines consist of a list of targets, followed by
a colon (:), followed by a list of dependencies. Although the target list can contain multiple
targets, typically only one target is listed. Example:

clean:
rm -f *.o

The clean command is executed by typing

$ make clean

Notice that the shell command line following the target line has a tab in front of the command.
If this is replaced by blanks, it won't work! This applies to every command line which the
target is supposed to execute.
The dependencies are used to ensure that components are built before the overall executable
file. A target must never be older than any of its dependencies. If any of the dependencies
is newer than the current target, the dependencies are first brought up to date, and then the
current target is remade. Example:

foo.o: foo.cc
g++ -c foo.cc

If foo.o is newer than foo.cc, nothing is done, but if foo.cc has been changed so that its
timestamp is newer than that of foo.o, then the foo.o target is remade.
One of the powerful features of the make utility is its capability to specify generic targets.
Suppose you have several cc files in your program. Instead of writing a separate rule for
every object file, one can use a suffix rule:

.cc.o:
g++ -c $<

main: a.o b.o c.o d.o


g++ a.o b.o c.o d.o

Here $< is a special built-in macro, which substitutes the current source file in the body of
a rule. When make main is executed, the dependencies a.o b.o c.o d.o are first brought up to
date using the suffix rule: if the corresponding cc file is newer, a new object file is made.
When all object files needing updating have been made, and any of them is newer than the
executable main, then main is remade.

Makefile macros

In the above example the object files were typed in two places. One could use a macro
instead:

OBJECTS = a.o b.o c.o d.o

main: $(OBJECTS)
g++ $(OBJECTS)

Makefile functions

A function call syntax is $(function args). There are several functions available for various
purposes: text and file name manipulation, conditional functions etc.
Example: let's assume that the program, containing the source files a.cc, b.cc, c.cc and d.cc,
is located in a directory where there are no other cc files. So every file ending in .cc must be
compiled into the executable. One can use the functions wildcard, basename and addsuffix.

files = $(wildcard *.cc)


filebasenames = $(basename $(files))
OBJECTS = $(addsuffix .o,$(filebasenames))

Here the function wildcard gives a space separated list of any files in the local directory
matching the pattern *.cc. If you now modify your program to include a fifth file e.cc, your
Makefile will work without modifications.
An example Makefile follows (notice the usage of the macros $(CXX) and $(MAKE); what does
the @ do? If a new line is needed, one can break the line with \).

files = $(wildcard ../src/*.cc ../src/*.cpp *.cc *.cpp)


OBJS = $(addsuffix .o,$(basename $(files)))
OPT = -O -Wall -fPIC -D_REENTRANT
INC = -I$(ROOTSYS)/include -I. -I../src \
-I../src/HiggsAnalysis/MssmA2tau2l/interface
LIBS = -L$(ROOTSYS)/lib -lCore -lCint -lHist -lGraf -lGraf3d -lGpad \
-lTree -lRint -lPostscript -lMatrix -lPhysics -lpthread -lm -ldl \
-rdynamic

.cc.o:
	$(CXX) $(OPT) $(INC) -c $< -o $@

.cpp.o:
	$(CXX) $(OPT) $(INC) -c $< -o $@

all:
@$(MAKE) --no-print-directory All

All:
@$(MAKE) compile; $(MAKE) analysis.exe

compile: $(OBJS)

analysis.exe: $(OBJS)
$(CXX) $(LIBS) -O $(OBJS) -o analysis.exe

clean:
rm -f $(OBJS)

2 FORTRAN

2.1 Overview

FORTRAN (short for FORmula TRANslation) is a problem-oriented language concerned
with numerical calculations in scientific and engineering applications.
Why FORTRAN? FORTRAN as a programming language is decades old, but until recently
it has been the programming language HEP experiments have been using. Software written
in FORTRAN is still in active use, like
• PYTHIA6
• CMKIN
• TOPREX
• HDECAY

The FORTRAN version in use is FORTRAN77. There has been a transition to C++ in HEP
software: a large fraction of the older software, like Geant, CMKIN and CMSIM, has been
rewritten in C++, and new programs are mostly written in C++. Not everything, though.
There are some peculiarities in F77. The statements start at the 7th position in the line, the
first 5 positions are for labels when needed, and the 6th position signals a continuation line
for long statements. This feature is error-prone so be careful to start a new statement line
with 6 blanks.
The line must not be too long either. If the line is longer than 72 characters, the rest of the
line is truncated. This can also result in errors which are not easy to find. A statement can
be extended to the next line by putting e.g. the & character in the 6th column of the
continuation line:

COMMON/A/B,C,D,
& E,F,G

Comments can be made by starting the line with C, c or *, or with ! when commenting only
a part of the line.

2.2 FORTRAN variables

The data types in FORTRAN are


• INTEGER
• REAL
• DOUBLE PRECISION
• CHARACTER
• LOGICAL
• COMPLEX

The variable declarations are done in the beginning of the program. The variables can be
separated with commas:

INTEGER B,C,D
REAL dummy
INTEGER I

By default, variables starting with the letters I, J, K, L, M or N are implicitly INTEGER.
This can be changed with the IMPLICIT statement:

IMPLICIT REAL (A-Z)

If no implicit types are wanted, one can use:

IMPLICIT NONE

If this is used, all the variables must be declared explicitly.


In the old FORTRAN, FORTRAN66, the maximum length for a variable name was 6 characters.
Standard FORTRAN77 kept this limit, but most compilers accept longer names as an extension.
The string variables are constructed from chars:

CHARACTER*10 NAME
NAME = ’abc’

Arrays are declared by enclosing the dimensions in parentheses:

CHARACTER*10 NAME(50)
INTEGER IMATRX(3,3)

In FORTRAN the indexing starts from 1 (while in C++ it starts from 0).

IMATRX(3,3) = 1

2.3 Control flow

The basic loop in FORTRAN is a DO loop:

DO 999 I=1,10
...
999 CONTINUE

Here the number 999 is a label which must be uniquely chosen. No two lines can have the
same label. Another way to make a DO loop is

DO I=1,10
...
ENDDO

In FORTRAN programs one can also often see IF...GOTO constructs, although using GOTO
is often said to be bad programming style. Here a label is again used

IF (I.eq.1) GOTO 999

which jumps to the line 999 CONTINUE when the IF condition is fulfilled.


The conditional statements are constructed with IF..THEN..ELSE..ENDIF. The logical op-
erators are .EQ., .LT., .GT., .GE., .LE., .NE. and multiple expressions can be combined with
.AND. and .OR. operators. The . separates the operator from the variables and values.
Parenthesis (, ) can be used to change the order how the logical expression is evaluated.

IF (I.eq.1.OR.(J.GT.0.AND.K.ne.0)) THEN
...
ENDIF

The logical operators for character and string variables are the same.
.LT. comes before
.GT. comes after
.LE. comes before or is equal
.GE. comes after or is equal
.EQ. equal
.NE. not equal
One can use either " or ' for strings.

2.4 Input/Output

Printing in FORTRAN is done with keyword PRINT, followed by asterisk, comma and what
to print.

PRINT *,"Hello world!"

The asterisk in the PRINT statement tells the compiler that the format of the printing is to
be directed by the list of printed items.

PRINT *,"1+2=",1+2

Another option for printing is using the keyword WRITE. WRITE can also be used to write
directly into a file. Here the syntax is

WRITE(6,*)"Hello world!"

Here the number 6 tells that the output is the standard output, and the asterisk is as in the
PRINT statement. The format can be determined explicitly:

WRITE(6,999) "Hello world!"


999 FORMAT(A12)

The format options are Aw for char/string, Iw for integer, Fw.d for float, and Ew.d for float
in exponential notation. Gw.d uses the F format for values between 0.1 and 10^d; outside
that range the format is like E. Here w is an integer determining the width of the field, and
d the number of digits after the decimal point. X is used for skipping.

123 FORMAT(A12,2I4,G15.6,I4)

Here the format is a string with a 12-char field, 2 integers with field width 4, one real with
field width 15 and a precision of 6 digits, and one integer with field width 4.
Analogous to WRITE is READ. It can be used to read input from standard input or from a
file. File is opened with command OPEN, and closed with CLOSE.

OPEN(87,FILE=’hdecay.in’)
READ(87,101)IHIGGS
101 FORMAT(10X,I30)

2.5 Subprograms

In FORTRAN there are two kinds of subprograms: SUBROUTINE and FUNCTION. The
difference between the subroutine and function is that subroutines have no return type while
functions do have. A subprogram is defined with a line

SUBROUTINE NAME(LIST)
or
REAL FUNCTION NAME(LIST)

The LIST is the list of variables. The function return type can be any of the fortran data
types. The variable list LIST may contain both input and output variables. Therefore
functions can be written as subroutines, with one extra variable in the list returning the
function value. Subprograms end with statements RETURN and END.
Once the subroutine has been defined, it can be used via the CALL statement just like any
other FORTRAN statement

CALL NAME(LIST)

The function is used like built-in functions

VALUE = NAME(LIST)

2.6 COMMON block

There is another way for programs to communicate: it is done through special areas of storage
called COMMON. When variables are placed in COMMON blocks they may be accessible to
two or more subprograms. The syntax for COMMON block definition is

COMMON/name/list of variables

Example: part of PYTHIA COMMON block:

COMMON/PYDAT1/MSTU(200),PARU(200),MSTJ(200),PARJ(200)
COMMON/PYDAT2/KCHG(500,4),PMAS(500,4),PARF(2000),VCKM(4,4)
COMMON/PYDAT3/MDCY(500,3),MDME(8000,2),BRAT(8000),KFDP(8000,5)
COMMON/PYDAT4/CHAF(500,2)
CHARACTER CHAF*16
COMMON/PYDATR/MRPY(6),RRPY(100)
COMMON/PYSUBS/MSEL,MSELPD,MSUB(500),KFIN(2,-40:40),CKIN(200)
COMMON/PYPARS/MSTP(200),PARP(200),MSTI(200),PARI(200)

Simple example programs

PROGRAM EXAMPLE1
PRINT *,"Hello world!"
STOP
END

The program can be compiled with g77. In makefile you can use the implicit variable $(FC),
and suffix rules.

$ g77 example1.F -o example1.exe

The first and third lines are not mandatory, the same program can be written as:

PRINT *,"Hello world!"


END

PROGRAM EXAMPLE2
INTEGER i
COMMON/TEST/i
i = 1
CALL INCREMENT1(i) ! i = 2
CALL INCREMENT2 ! i = 3
i = INCREMENT3(i) ! i = 4
END

SUBROUTINE INCREMENT1(input)
INTEGER input
input = input + 1
END

SUBROUTINE INCREMENT2
INTEGER i
COMMON/TEST/i
i = i + 1
END

INTEGER FUNCTION INCREMENT3(in)


INTEGER in
INCREMENT3 = in + 1
END

2.7 Real world example: PYTHIA

PYTHIA source code (FORTRAN) and documentation can be found in


http://home.thep.lu.se/~torbjorn/pythiaaux/recent.html
http://projects.hepforge.org/pythia6
Taking the latter URL let’s download the source code pythia-6.4.22.f and compile it

$ g77 pythia-6.4.22.f -c -O4

There are a number of warnings, but they can be ignored... Then let’s create the library from
the object file:

$ ar -r libpythia-6.4.22.a *.o

With the command ar -t libmylib.a you can list the contents of a library file. For more about
the ar command, read the man pages.
Next we need the main program. There are several examples in the pythia web page, let’s
take the Z0 production at LEP1 and download it. The file seems to be named main63.f.
http://home.thep.lu.se/~torbjorn/pythia/main63.f
The main program consists of three parts, initialization, event loop, and output creation.
The output here is histograms and cross section table. Let’s compile and link:

$ g77 -w main63.f -o Z0lep.exe -L. -lpythia-6.4.22

Now the executable Z0lep.exe can be run by the command:

./Z0lep.exe

Next, let's make some modifications and have a new main program. Copy main63.f to a new
name, e.g. main63b.f. Then open the file with an editor. Let's change the machine from LEP
to LHC. For that, two lines need editing:

ECM=14000.D0
CALL PYINIT(’CMS’,’p’,’p’,ECM)

Let’s also print the Z0 transverse momentum. Inside the event loop after calling PYEVNT
add

DO J=1,N
IF(K(J,2).EQ.23) THEN
PRINT *,DSQRT(P(J,1)**2+P(J,2)**2)
ENDIF
ENDDO

Here the variables used come from the PYTHIA common blocks. All this is documented in
the PYTHIA manual.
• N number of particles in the event
• K(J,2) KF code of particle J (just a label)
• P(J,1) PX of particle J
• P(J,2) PY of particle J

The Z0 pt seems to be mostly small. It means the particle doesn’t move much in the transverse
plane. However, it can have quite sizable momentum along the beam pipe (Z-coordinate).

Let’s next look at the decay products. The KF code for b quarks is ±5, sign indicating
particle/antiparticle.

DO J=1,N
IF(IABS(K(J,2)).EQ.5.AND.K(K(J,3),2).EQ.23) THEN
PRINT*,DSQRT(P(J,1)**2+P(J,2)**2)
ENDIF
ENDDO

Here the K(J,3) returns the line of the parent particle, so parent particle KF code is
K(K(J,3),2). This selects all b and bbar quarks which originate from Z0 decays. As you
can see in the output the transverse momentum for b’s is much larger than for Z0’s, as
expected.
Normally this kind of information is put into histograms. Prints are used only for testing
and debugging. Booking histograms is done in the initialization

CALL PYBOOK(3,'b pt',100,0.D0,100.D0)

Filling histograms inside the event loop

CALL PYFILL(3,PT,1.D0)

and writing the histogram in a file in the output creation part.

OPEN(87,file='histo.txt')
CALL PYDUMP(1,87,0)
CLOSE(87)

The resulting text file contents is documented in the PYTHIA manual. See
http://home.thep.lu.se/~torbjorn/pythia/lutp0613man2.pdf

3 C++ programming
Object-oriented programming is a programming style that captures the behavior of the real
world in a way that hides detailed implementation, and it allows the problem solver to think
in terms of the problem domain. C++ was created with two goals: to make it compatible with
ordinary C, and to extend C with OO constructs.
FAQ: http://www.parashift.com/c++-faq-lite/index.html
Object-oriented programming is a data-centered view of programming, in which data and
behavior are strongly linked. Data and behavior are conceived of as classes whose instances
are objects. Objects are class variables, and object-oriented programming allows abstract
data structures to be easily created and used.
A C++ program is a collection of declarations and functions; execution begins with the
function main(). The first phase of compilation executes any preprocessor directives,
such as

#include <string>

Namespaces were introduced to provide a scope that allows different code providers to avoid
global name clashes. The namespace std is reserved for use with the standard libraries. The
declaration

using namespace std;

allows the identifiers found in the standard library to be used without being qualified.
Without this declaration the program would have to use std::string.
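As a small self-contained illustration of qualified versus unqualified use:

#include <iostream>
#include <string>

int main() {
    std::string a = "fully qualified";   // always works

    using namespace std;                 // bring std identifiers into scope
    string b = "unqualified";            // now string means std::string
    cout << a << ", " << b << endl;
    return 0;
}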
In OO terminology, a variable is called an object. A constructor is a member function whose
job is to initialize an object of its class. Constructors are invoked whenever an object of its
class is created. A destructor is a member function whose job is to finalize a variable of its
class. The destructor is called implicitly when an automatic object goes out of scope.
The simple native types in C++ are bool, double, int and char. These types have a set of
values and representation that is tied to the underlying machine architecture on which the
compiler is running.
C++ simple types can be modified by the keywords short, long, signed and unsigned to
yield further simple types.

Fundamental data types

bool
char            signed char       unsigned char
wchar_t
short           int               long
unsigned short  unsigned          unsigned long
float           double            long double

A variable declaration associates a type with a variable name. A declaration of a variable
constitutes a definition if storage is allocated for it. The definition can be thought of as
creating the object.

int i;

A definition can also initialize the value of the variable

int i = 0;
int i = 0,j;

C++ declarations are themselves statements and they can occur throughout a block.
An automatic type conversion can occur across assignments and in mixed expressions. A
promotion (widening) is usually well behaved, but demotions (narrowing) can lose information.
In mixed expressions the type conversion follows promotion rules.
In addition to implicit conversions, there are explicit conversions called casts.

static_cast<double>(i)
const_cast<double&>(constVar)

Older C++ systems allow an unrestricted form of cast with

(type)expression or type(expression)
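For instance, a short sketch of promotion, demotion and an explicit cast:

int i = 7;
double d = i;                              // promotion: d == 7.0, nothing lost
int n = 7.9;                               // demotion: n == 7, fraction discarded
double half = static_cast<double>(i) / 2;  // 3.5; without the cast the result would be 3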

3.1 Expressions

The arithmetic operators in C++ are +, -, *, /, %. To raise to a power one can use the
function pow. Arithmetic expressions are consistent with expected practice, with one important
difference: the division operator. The result of the division operator depends on its arguments:

a = 3 / 2;   // integer division: result 1
a = 3 / 2.0; // floating-point division: result 1.5
Relational (<, >, <=, >=), equality (==,!=) and logical (!,&&,||) operators work as expected.
In the evaluation of expressions that are the operands of && and ||, the evaluation process
stops as soon as the outcome is known. If expr1 is false, then in

expr1 && expr2

expr2 will not be evaluated. Similarly, if expr1 is true, then in

expr1 || expr2

expr2 will not be evaluated since the value of the logical expression is already determined.
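Short-circuit evaluation is commonly used to guard an expression that would otherwise be
invalid, as in this sketch:

int* p = 0;
if (p != 0 && *p > 0) {   // *p is never evaluated when p is 0
    // safe to use *p here
}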
The conditional operator ?: is a ternary operator, taking three expressions as operands. In a
construct such as

expr1 ? expr2 : expr3

expr1 is evaluated first. If it is true, expr2 is evaluated, and that is the value of the conditional
expression as a whole. If expr1 is false then expr3 is evaluated. For example

x = (y < z) ? y : z;

assigns the smaller of two values to x.

C++ also provides assignment operators that combine an assignment and some other operator,
like

a += b;     // a = a + b
a *= a + b; // a = a*(a+b)
i++;        // i = i + 1
i--;        // i = i - 1
j = i++;    // j = i; i = i + 1
j = ++i;    // i = i + 1; j = i

Notice that the two last examples are not equivalent.

3.2 Statements

The general forms of basic control flow statements are

if ( condition ) statement
if ( condition ) statement1 else statement2
while ( condition ) statement
for (init; condition; increment) statement
do statement while ( condition )
switch ( condition ) statement

To interrupt the normal flow of control within a loop, two special statements can be used:
break and continue.
Example: a simple for loop

for(int i = 0; i < 3; i++){


cout << i << endl;
}

Example: a while loop

while (getline(file,line)){
cout << line.length() << endl;
}
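Example: interrupting a loop with break and continue

for(int i = 0; i < 10; i++){
    if(i % 2 == 0) continue; // skip the rest of the body for even i
    if(i > 7) break;         // leave the loop entirely
    cout << i << endl;       // prints 1, 3, 5, 7
}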

3.3 Functions

A C++ program is made of one or more functions, one of which is main(). The form for a
function definition is

type name(parameter-list) {
statements
}

The type specification that precedes the function name is the return type, and the value is
returned with statement return. If the function does not return any value, the return type
of the function is void.
The parameters in the parameter list can be given default arguments

void myFunction(int i, int j = 1){...}

It is possible to overload functions

void myFunction(int i){...}


void myFunction(double d){...}
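The compiler selects the overload based on the argument type:

myFunction(1);   // calls myFunction(int)
myFunction(1.0); // calls myFunction(double)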

3.4 Pointers

Pointers are used to reference variables and machine addresses. They can be used to access
memory and manipulate addresses. Pointer variables can be declared in programs and then
used to take addresses as values. The declaration

int* p;

declares p to be of type pointer to int. The legal range of values for any pointer always
includes the special address 0.

int* p = &i;

The variable p here can be thought of as referring to i, pointing to i, or containing the
address of i.
The dereferencing or indirection operator * can be used to return the value of the variable
the pointer points to:

int j = *p;

Here *p returns the value of variable i.

3.5 Arrays

An array declaration is of the form

int a[size];

The values of an array can be accessed by a[expr], where expr is an integer expression taking
values from 0 to (size-1). Arrays can be initialized by a comma-separated list

int a[3] = { 1, 2, 3 };

or element by element: a[0] = 1; a[1] = 2; a[2] = 3;. Arrays can be thought of as constant
pointers. Pointer arithmetic provides an alternative to array indexing:

int* p = a;      // equivalent to
int* q = &a[0];  // taking the address of the first element
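With p pointing to the first element, indexing and pointer arithmetic reach the same data,
as in this sketch:

int a[3] = { 1, 2, 3 };
int* p = a;
int x = a[1];     // 2
int y = *(p + 1); // 2, the same element via pointer arithmetic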

The arrays can be of any type, including arrays of arrays. Multidimensional arrays can be
formed by adding bracket pairs:

int a[2][2]; // Not a[2,2]!


int b[2][5][200];

3.6 Abstract Data Structures

ADT implementations are used for user-defined data types. A large part of the OO program
design process involves thinking up the appropriate data structures for a problem.
The structure type allows the programmer to aggregate components into a single named
variable. A structure has components, called members, that are individually named. Since
the members of a structure can be of various types, the programmer can create aggregates
suitable for describing complicated data.
Example: complex numbers

struct complex {
double re,im;
};
complex c;
c.re = 1; c.im = 2;

3.7 Member functions

The concept of struct is augmented in C++ to allow functions to be members. The function
declaration is included in the structure declaration.

struct complex {
void reset(){
re = 0;
im = 0;
}
double re,im;
};

The member functions can be public or private. Public members are visible outside the
data structure, while private members can be accessed only within the structure. Very often
the data is hidden, and accessed only by functions built for that purpose.
Hiding data allows more easily debugged and maintained code, because errors and
modifications are localized.
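A sketch of the complex structure with its data hidden behind public access functions (the
function names set, real and imag are illustrative additions):

struct complex {
public:
    void set(double r, double i) { re = r; im = i; }
    double real() { return re; }
    double imag() { return im; }
private:
    double re, im;
};

complex c;
c.set(1, 2);
// c.re = 1;  // error: re is private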

3.8 Classes

Classes are introduced by the keyword class. They are a form of struct whose default
privacy specification is private. Thus struct and class can be used interchangeably with the
appropriate access specifications.

class complex {
public:
void reset(){
re = 0;
im = 0;
}

private:
double re,im;
};

Class adds a new set of scope rules to those of the kernel language. The scope resolution
operator :: comes in two forms:

::i // refer to external scope


foo_bar::i // refer to class scope

Classes can nest. This means that there can be classes within classes. Referencing the outer
or inner variables is done with the scope resolution operator.

class X {
int i; // X::i
public:
class Y {
int i; // X::Y::i
};
};

The definition of a purely local class is unavailable outside its local scope. Static members
are shared by all objects of that class and stored in only one place. The keyword this
denotes an implicitly declared self-referential pointer.

class X {
public:
X clone(){return (*this);}
};

The member functions can also be static and const. A static member function can exist
independent of any specific object of the class type being declared. Const member functions
cannot change the values of the data members. Writing out const member functions and const
parameter declarations is called const-correctness. It makes the code more robust.

class X {
static int Y();
int Z() const;
};

int X::Y(){}
int X::Z() const {}
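A brief usage sketch (assuming Y() and Z() were declared public):

X::Y(); // static: called through the class, no object needed
X x;
x.Z();  // const: may be called on any object, but cannot modify it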

3.9 Input/Output

Output is inserted into an object of type ostream, declared in the header file iostream.
The operator << is overloaded in this class to perform output conversions from standard
types.

std::cout << "Hello world!" << std::endl;

Input is read from an object of type istream, and the operator >> is overloaded to perform
input conversions to standard types.

std::cin >> i;

Here ”using namespace std;” would allow using these commands without the scope resolution
operator.
File I/O is handled by including fstream. To read a variable var from a file, first open the
file:

ifstream inFile("filename",ios::in);
inFile >> var;

A while loop can be made to read until EOF is reached: while(!inFile.eof()){}


It is also possible to read a whole line using getline(ifstream,string).
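For example, a minimal sketch that prints the length of each line of a file, reading until
end of file:

#include <fstream>
#include <iostream>
#include <string>
using namespace std;

int main() {
    ifstream inFile("filename");
    string line;
    while (getline(inFile, line)) { // stops cleanly at end of file
        cout << line.length() << endl;
    }
    return 0;
}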
Example: MET from high pt object
File SimpleMET.h:

#ifndef SIMPLEMET_H
#define SIMPLEMET_H

#include "MyEvent.h"

class SimpleMET {
public:
SimpleMET();
~SimpleMET();

void Add(MyTrack);
void Add(MyJet);
MyMET GetMET();
double Value();

private:
MyMET etmiss;
};
#endif

File SimpleMET.cc:

#include "SimpleMET.h"

SimpleMET::SimpleMET(){
etmiss.x = 0;
etmiss.y = 0;
}
SimpleMET::~SimpleMET(){}

void SimpleMET::Add(MyTrack track){


etmiss.x -= track.Px();
etmiss.y -= track.Py();
}

void SimpleMET::Add(MyJet jet){


etmiss.x -= jet.Px();
etmiss.y -= jet.Py();
}

MyMET SimpleMET::GetMET(){
return etmiss;
}

double SimpleMET::Value(){
return etmiss.Value();
}

3.10 Object Creation and Destruction

An object requires memory and some initial value. C++ provides this through declarations
that are definitions.

int n = 5;

The int object n gets allocated off the run-time system stack, and initialized to the value 5.
In creating complicated aggregates, the user will expect similar management of a class-defined
object. The class needs a mechanism to specify object creation and destruction, so that a
client can use objects like native types. A constructor constructs values of the class type.
It is a member function whose name is the same as the class name. This process involves
initializing data members and, frequently, allocating free store using new. A destructor is a
member function whose purpose is to destroy values of the class type. It is a member function
whose name is the class name preceded by the tilde character.
Constructors can be overloaded. A constructor is invoked when its associated type is used in a
definition. Destructors are invoked implicitly when an object goes out of scope. Constructors
and destructors do not have return types and cannot return a value.

class Example {
public:
    Example();
    Example(int);
    ~Example();
};
Example e;                      // default constructor
Example* ep = new Example;      // default constructor, object on the free store
delete ep;                      // destructor invoked
Example e5(5);                  // Example(int)
Example* ep5 = new Example(5);

A constructor requiring no arguments is called the default constructor. This can be a con-
structor with an empty argument list or a constructor where all arguments have default
values. It has the special purpose of initializing arrays of objects of its class.
If a class does not have a constructor, the system provides a default constructor. If a class
has constructors, but does not have a default constructor, array allocation causes a compilation
error.
Constructor initializer is a special syntax for initializing subelements of objects with con-
structors. Constructor initializers can be specified in a comma-separated list that follows the
parameter list and precedes the body. Assuming the class Example has a data member i:

Example::Example() : i(0) {}

Constructors of a single parameter are automatically conversion functions unless declared
with the keyword explicit.

Example::Example(int);

This is automatically a type conversion from int to Example. It is available both explicitly
and implicitly:

Example e;
int i = 123;
e = static_cast<Example>(i);
e = i;

With the explicit keyword the conversion can be disabled. The constructor would be

explicit Example::Example(int);
Example e = 123; // illegal
Example e = Example(123); // ok
Example e(123); // ok
e = 124; // illegal
e(125); // illegal

A copy constructor is called whenever a new variable is created from an object. If there is no
copy constructor defined for the class, C++ uses the default copy constructor which copies
each field.

Example e(0); // constructor to build e


Example f(e); // copy constructor to build f
Example f = e; // copy constr., init. in decl.

If shallow copies are OK, don’t write copy constructors, but use the default. A shallow copy
shares underlying data: pointer members are copied as addresses, so a change through one
object affects the other. If the copied data resides in a different memory location, it is a deep copy.

i = e.i;   // copies a single member; pointer members would be shared (shallow)

Example e,f;
f = e;     // default operator=: memberwise (shallow) copy

If you need a copy constructor, you most probably also need a destructor and operator=
(the ”rule of three”). The syntax for the copy constructor is

Example::Example(const Example&){...}
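
A minimal sketch of a class needing all three; the Buffer class is hypothetical and owns
memory allocated with new:

class Buffer {
public:
    Buffer(int n) : size(n), data(new double[n]) {}
    Buffer(const Buffer& b) : size(b.size), data(new double[b.size]) { // deep copy
        for(int i = 0; i < size; i++) data[i] = b.data[i];
    }
    Buffer& operator=(const Buffer& b) {
        if(this != &b) {                       // protect against self-assignment
            delete [] data;
            size = b.size;
            data = new double[size];
            for(int i = 0; i < size; i++) data[i] = b.data[i];
        }
        return *this;
    }
    ~Buffer() { delete [] data; }              // release the owned memory
private:
    int size;
    double* data;
};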

3.11 Inheritance

Inheritance is a mechanism of deriving a new class from an old one.
An existing class can be added to or altered to create the derived class.
C++ supports virtual member functions. They are functions declared in the base class and
redefined in the derived class. A class hierarchy that is defined by public inheritance creates
a related set of user types, all of whose objects may be pointed to by a base class pointer. The
appropriate function definition is selected at run-time.
Inheritance should be designed into software to maximize reuse and allow a natural modelling
of the problem domain.
A class can be derived from an existing class using the form

class name : (public|protected|private) bclass{};

The keywords public, protected and private are used to specify how the base class members
are accessible to the derived class. The keyword protected is introduced to allow data hiding
for members that must be available in the derived classes, but otherwise act like private
members. See also: http://www.parashift.com/c++-faq-lite/strange-inheritance.html
A derived class inherits the public and protected members of a base class. Constructors,
the destructor and the member function operator=() are not inherited.
Member functions can be overridden.
Virtual functions in the base class are often dummy functions. They have an empty body in
the base class, but they will be given specific meanings in the derived class. A pure virtual
function is a virtual member function whose body is normally undefined.

virtual function prototype = 0;
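
A minimal sketch of run-time selection through a base class pointer (the classes are
hypothetical):

class Shape {
public:
    virtual double area() const = 0;   // pure virtual: no body in the base class
    virtual ~Shape() {}
};

class Square : public Shape {
public:
    Square(double s) : side(s) {}
    double area() const { return side*side; }  // specific meaning in the derived class
private:
    double side;
};

// e.g. inside a function:
Shape* shape = new Square(2.0);
double a = shape->area();   // Square::area() is selected at run-time
delete shape;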

3.12 Templates

Templates allow the same code to be used with respect to different types.
The syntax of a template class declaration is prefaced by

template < class identifier >

Example:

template < class T > class stack {...};


stack<char*> s_ch;
stack<int> s_i;

This mechanism saves us rewriting class declaration where the only variation would be the
type declaration.
Function templates can be used for functions which have the same body regardless of type.

template<class T1, class T2>
void copy(T1 a[], T2 b[], int n){
    for(int i = 0; i < n; i++) a[i] = b[i];
}
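
For instance (a sketch), calling the template with arrays of different element types
instantiates a version with T1 = double and T2 = int; the assignment then performs the
usual implicit conversion:

double a[3];
int b[3] = {1, 2, 3};
copy(a, b, 3);   // instantiates copy<double,int>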

3.13 Containers and Iterators

Container classes are used to hold a large number of individual items. Many of the operations
on container classes involve the ability to visit individual elements conveniently.
Useful containers in C++ are e.g.
• vector
• pair
• map

• list
• queue
• stack
• set

Containers come in two major families: sequence and associative. Sequence containers include
vectors and lists; they are ordered by having a sequence of elements. Associative containers
include sets and maps; they have keys for looking up elements.
An iterator is a pointer-like object which is used for navigating over the containers.

vector<int> intVec;
intVec.push_back(123);
intVec.push_back(456);
vector<int>::const_iterator i;
for(i = intVec.begin();i != intVec.end(); i++){
cout << *i << endl;
}
map<string,int> cut;
cut["pt"] = 20;
cut["deltaPhi"] = 175;
if(track.pt() < cut["pt"] ) continue;   // look up a cut value by name
map<string,int>::const_iterator j = cut.find("pt");

In function tag() :

return pair<JetTag,Collection>(theTag,tauIp);

Using function tag() :

pair<JetTag,Collection> ipInfo = algo.tag();


JetTag tag = ipInfo.first;
Collection c = ipInfo.second;

3.14 Polymorphism
• Polymorphism is a means of giving different meanings to the same message. OO takes
advantage of polymorphism by linking behavior to the object’s type.
• Overloading a function gives the same function names different meanings.
• Overloading operators gives them new meanings.

The overloaded meaning is selected by matching the argument list of the function call to the
argument list of the function declaration.
Example: filling 1D and 2D histograms

void MyHistogram::fill(string,double);
void MyHistogram::fill(string,double,double);

When an overloaded function is invoked, the compiler selection algorithm picks the appro-
priate function. It depends on what type conversions are available. An exact match is the
best. Casts can be used to force such a match.
Overloading operators allows infix expressions of both ADT’s and built-in types to be written.
It is an important notational convenience, and in many instances leads to shorter and more
readable programs.
Operators can be overloaded as non-static member functions, or as free functions when both
operands need to be explicit arguments.
Example: An operator for matrix*vector (a free function, since the left operand is a Matrix)

Vector operator*(const Matrix& m,
                 const Vector& v){...};
Vector v; Matrix m;
Vector result = m * v;

Example: Two dimensional point addition

Point Point::operator + (const Point& p) const{
    Point point;
    point.x = x + p.x;   // data member of *this plus data member of p
    point.y = y + p.y;
    return point;
}
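
With the operator defined, the addition can be written in infix form (a sketch, assuming
Point is default constructible):

Point a, b;
Point sum = a + b;   // equivalent to a.operator+(b)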

Example: MyVertex class inheriting MyGlobalPoint

//#include <string>
//using namespace std;
class MyGlobalPoint {
public:
MyGlobalPoint();
~MyGlobalPoint();
double getX() const;
double getY() const;
double getZ() const;
double getXerror() const;
double getYerror() const;
double getZerror() const;
double value() const;
double error() const;
double significance() const;
double getPhi() const;
MyGlobalPoint operator +
(const MyGlobalPoint&) const;
MyGlobalPoint operator -
(const MyGlobalPoint&) const;
protected:
double x,y,z,
dxx,dxy,dxz,
dyy,dyz,

dzz;
// int dimension;
};

#include <vector>
#include "MyTrack.h"
using namespace std;
class MyVertex : public MyGlobalPoint{
public:
MyVertex();
~MyVertex();
double Eta() const;
double Phi() const;
double eta() const;
double phi() const;
MyVertex operator + (const MyVertex&) const;
MyVertex operator - (const MyVertex&) const;
vector<MyTrack> tracks();
private:
vector<MyTrack> associatedTracks;
};

Example: Saving events in a ROOT file


File MyRootTree.h:

#ifndef MYROOTTREE_H
#define MYROOTTREE_H
#include "TROOT.h"
#include "TFile.h"
#include "TTree.h"
#include "MyEvent.h"
class MyRootTree {
public:
MyRootTree();
~MyRootTree();
void fillTree(MyEvent*);
private:
TTree* rootTree;
MyEvent* myEvent;
TFile* rootFile;
};
#endif

File MyRootTree.cc:

#include "MyRootTree.h"
MyRootTree::MyRootTree(){
rootFile = new TFile("analysis.root","RECREATE");
rootTree = new TTree("rootTree","events");

myEvent = new MyEvent();
int bufsize = 256000;
int split = 1;
rootTree->Branch("MyEvent","MyEvent",&myEvent,bufsize,split);
}
MyRootTree::~MyRootTree(){
rootFile->cd();
rootTree->Write();
rootTree->ls();
rootFile->Close();
delete rootFile;
}

void MyRootTree::fillTree(MyEvent* event){
myEvent = event;
rootFile->cd();
rootTree->Fill();
}

class MyEventAnalyser : public edm::EDAnalyzer {


...
private:
...
MyRootTree* userRootTree;
};

void MyEventAnalyser::beginJob() {
...
userRootTree = new MyRootTree();
}

void MyEventAnalyser::analyze(const edm::Event& iEvent){


// Event reconstruction here, reconstructing primary
// vertex, electrons, muons, etc.
MyEvent* saveEvent = new MyEvent;
saveEvent->primaryVertex = PV;
saveEvent->electrons = recElectrons;
saveEvent->muons = recMuons;
saveEvent->mcParticles = mcParticles;
userRootTree->fillTree(saveEvent);
delete saveEvent;
}

void MyEventAnalyser::endJob() {
  ...
  delete userRootTree;
}

In CMSSW the reconstruction can be done e.g. using a class derived from EDAnalyzer. The
MyRootTree type pointer is added in the MyEventAnalyser class definition, in the initialization
(beginJob) the MyRootTree class object is created, in the event loop (analyze) the
TTree is filled, and when the destructor is called (delete MyRootTree class object), the TTree
is stored in a ROOT file.

4 ROOT

4.1 Introduction

ROOT is an object-oriented framework aimed at solving the data analysis challenges of high
energy physics. The commonly used components of ROOT are

• Command line interpreter


• Histogramming and fitting
• Writing a graphical user interface
• 2D graphics
• Input/output
• Collection classes
• Script processor

To find information about ROOT, use the web page http://root.cern.ch. The reference
manual is most useful, and when facing a problem, the ROOT talk digest may already
contain the answers to your questions. The ROOT developers are also very helpful, and they
gladly answer questions on how to use ROOT.
You can start a ROOT session by typing
$ root or: $ root -l
assuming ROOT is installed on your system. This will give you the system prompt
root [0]
You can execute your ROOT script (myScript.C) by typing
root [0] .x myScript.C
or directly from the shell
$ root myScript.C
One can exit from ROOT by typing
root [0] .q
It is possible to type CINT commands directly to the system prompt
root [0] int i = 0
root [1] cout << i << endl
0
root [2]
From inside ROOT you can type Linux (Unix) commands by preceding the command by .!,
for example:
root [3] .! ls -l
The command line interpreter CINT looks and feels just like C++, and it does about 95%
of what C++ does. Not everything, though.

4.2 Some practicalities

ROOT scripts are text files which contain the commands enclosed within brackets:

{
cout << "Hello world!" << endl;
}
When executing ROOT files from the command line, the function needs to be named after the file.
If the file name is abc.cxx, then
void abc(){}
is the only allowed name for the function. If the function name and file name are different,
then one can load the script first
root [] .L abc.cxx
and then use the function
root [] abc()
It is possible to include header files just like in C++. Quite a lot is already available, but if
it’s not then one can add e.g. for maps
#include <map>
ROOT is known to be sometimes unstable. Therefore it is sometimes better to run scripts
with a new ROOT session starting every time instead of using
root [] .x myScript
time after time in the same session.
A very powerful way to create and modify ROOT scripts is to save the current canvas as a
ROOT script file, and then copy-paste the relevant section into your own script. For example,
when adding text to a figure, one can add it interactively, save the canvas as a script, open the
saved file and copy the text part into the original file. Keep the original file if it contains e.g.
classes or fits; such files may contain code you need again later. The saved scripts drop all the
fancy structures, and sometimes fits don’t work in the saved scripts.
The best way to work with ROOT is neither scripting alone nor the graphical interface alone,
but a combination of both.

4.3 Conventions and types

ROOT uses a set of coding conventions.

• Classes begin with T


• Non-class types end with _t
• Member functions begin with a capital
• Constants begin with k
• Global variables begin with g
• Getters and setters begin with Get and Set

In ROOT there are machine independent types available. For example int may be 16 bits
on one machine, but 32 bits on another. To ensure the size of your variables, there are
predefined types in ROOT:
• Int_t,
• Float_t,
• Double_t,
• Bool_t etc.

You can, however, use also the C++ types int, double etc if you wish.

4.4 Global variables

ROOT has a set of global variables that apply to the session. For example the single instance
of TROOT is accessible via the global gROOT and holds information relative to the current
session. Examples:
gROOT->Reset();
gROOT->LoadMacro("ftions.cxx");
gSystem->Load("libMyEvent.so");

4.5 History file

When you do something with the command line, you may want to save what you have done. You
can use the up and down arrow keys to browse the used commands. They are recorded in the file
$HOME/.root_hist, which contains the last 100 commands. If needed, you can cut and paste
your commands from that text file.

4.6 Histograms

ROOT supports 1-D, 2-D, 3-D and profile histograms. There are several histogram classes
available. For example the class TH1D is for plotting double precision numbers in a one
dimensional histogram. Inheritance is heavily used, which means that the ROOT reference
manual is often needed to find suitable member functions.
Histograms are created with constructors:
root [] TH1F* n = new TH1F("Name","Title",10,0,5);
Then filled:
root [] n->Fill(value);
The histogram drawing is done with Draw() member function
root [] n->Draw();
Drawing option can be used to plot several histograms in the same figure
root [] n2->Draw("same");
The full list of drawing options can be found in the ROOT users guide. The histogram classes
inherit from the attribute classes, which provide member functions for setting the line, fill,
marker and text attributes.
root [] n->SetFillColor(4);
root [] n->SetLineWidth(3);
root [] n->SetLineStyle(2);
The axis titles can be given by
root [] n->GetXaxis()->SetTitle("njets");

root [] n->GetYaxis()->SetTitle("Events");
Histograms can be added, divided and multiplied. They can also be cloned:
root [] TH1F* m = (TH1F*)n->Clone();
Cloning is useful when several identical histograms are needed for different variables: the
binning and attributes are set up once and then cloned, instead of repeating the setup for each
histogram. For example: histograms for reconstructed
mass (background only and signal+background). Sometimes histograms need rescaling:
root [] double scale = 1/n->GetMaximum();
root [] n->Scale(scale);
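
Putting the above commands together, a minimal unnamed macro might look like the following
sketch; the toy data filled with gRandom is only for illustration:

{
  TH1F* n = new TH1F("Name","Title",10,0,5);
  for(int i = 0; i < 1000; i++) n->Fill(gRandom->Gaus(2.5,0.5)); // fill with toy data
  n->SetFillColor(4);
  n->GetXaxis()->SetTitle("njets");
  n->GetYaxis()->SetTitle("Events");
  n->Draw();
}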

4.7 Canvases

Canvases may be seen as windows. In ROOT a graphical entity that contains graphical objects
is called a Pad. A Canvas is a special kind of Pad. A pad can contain e.g. histograms.
Here is a simple example to draw a histogram, assuming the histogram is already created
and filled.
TCanvas* c = new TCanvas("c","",500,500);
c->SetFillColor(0);
histo->Draw();
c->Print("mygraph.C");
Here the canvas will print itself in a ROOT script file mygraph.C. One can also print in
graphical formats like .eps by changing the suffix of the output file.
c->Print("mygraph.eps");
Pads can contain pads. Canvas can be divided into subpads:
canvas->Divide(2,1);
canvas->cd(1);
histo1->Draw();
canvas->cd(2);
histo2->Draw();
The Divide member function creates two parallel figures one containing histo1 and the other
containing histo2. The arguments determine the pad matrix, e.g.:
canvas->Divide(2,3);
will create six pads, two in x direction and 3 in y. Navigation between the pads is done with
member function cd(padnumber);
The log/linear scale is switched on/off using pads, not histograms. In the current pad the y
scale is changed to log scale by:
gPad->SetLogy();

4.8 Graphs

A graph is a graphics object made of two arrays holding the coordinates of n points. The
coordinates can be arrays of doubles or floats.

const int N = 3;
double x[N] = {1,2,3}, y[N] = {1,2,3};
TGraph* g = new TGraph(N,x,y);
TCanvas* c = new TCanvas("c","",300,300);
c->cd();
hframe = new TH2F("hframe","",2,0,4,2,0,4);
hframe->Draw();
g->Draw("C");
Here the draw area is determined with a 2D histogram hframe. If an automatic draw area
is enough, one can also use the draw option ”A” for including the axis drawing. Other
draw options are e.g. ”C” for a continuous line and ”P” for polymarkers. If polymarkers are
used, the default marker is a tiny dot, so to get the graph visible, one should change the
marker style or marker size. Draw options can be combined, e.g. ”CP” will produce both a
continuous line and polymarkers.
Several graphs can be superimposed in the same figure by leaving out the axis option ”A”.
Errors can be added using class TGraphErrors. The errors are given in error arrays for x and
y errors.
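
A sketch with symmetric errors; the arrays ex and ey hold the x and y error bars:

const int N = 3;
double x[N] = {1,2,3}, y[N] = {1,2,3};
double ex[N] = {0.1,0.1,0.1}, ey[N] = {0.2,0.3,0.4};
TGraphErrors* ge = new TGraphErrors(N,x,y,ex,ey);
ge->Draw("AP");   // "A" draws the axes, "P" the points with error bars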

4.9 Trees

Trees are used to store large quantities of data for later analysis. In HEP the event recon-
struction usually takes quite some time, and if the analysis is done within the reconstruction
program, we need to run the time consuming program every time we change our analysis.
Trees (ntuples) are a way to solve this problem. The reconstructed objects can be stored in a
tree, saved in a file and then later analysed separately. That way the reconstruction is done
only once, and when changing e.g. cuts in the analysis all we have to do is to read the ROOT
files again.

ROOT tree example

Writing a Tree is done in three parts, TTree object construction, filling, and writing. The
class TTree constructor creates a tree, the TTree method Branch() defines space for the
quantities in a tree branch and the TTree method Fill() puts quantities there. A file where
the tree is stored is opened by the class TFile and finally the TTree method Write() puts
the tree in the file. The following simplified example summarizes the procedure:

void makeTree() {
TTree* t1=new TTree("t1","My simple tree");
Float_t px=999.9, py=-999.9;
t1->Branch("px",&px,"px/F");
t1->Branch("py",&py,"py/F");
// ...
t1->Fill(); // fill tree
TFile f("myTree.root","recreate");
t1->Write(); // write tree on file
t1->Delete(); // all done, delete tree
f.Close();
}

In this example only one variable and one entry is put in the tree. In reality, of course, several
quantities per event are put and a large number of events in a loop. Here the branch is called
px, with value address &px; the last argument gives the leaf list with the type of each leaf.
Here we have only one leaf of type Float_t.
In the analysis program the tree file is opened again with the class TFile constructor. The
tree with the identifier "t1" is then found from the file with the TFile member function
Get("t1") and assigned to the TTree object with the pointer *t1. The branch addresses are
set with the TTree method SetBranchAddress("var",&var). The number of entries in the
tree is given by the method GetEntries(). In summary, the procedure is shown below as a
simple example for only one variable px.

void readTree() { // simple "analysis" code


TFile* f= new TFile("myTree.root"); // open root tree file
TTree* t1 = (TTree*)f->Get("t1"); // get tree from file
Float_t px,py;
t1->SetBranchAddress("px",&px);
t1->SetBranchAddress("py",&py);
// ...
Int_t nbEntries=(Int_t)t1->GetEntries();
for ( Int_t i=0; i<nbEntries; i++ ) {
t1->GetEntry(i); // get entry number i
cout<<"px,py = "<<px<<" "<<py<<endl;
}
}

Here the GetEntry(i) gets the entry number i from the tree, together with the value of
the stored variable px. Here the stored variable was Float_t. One can also store ADT’s
(Abstract Data Types) in the ROOT trees using the same method.

4.10 Fitting

To fit a histogram, one can use the member function TH1::Fit. For example to fit a gaussian:
histo->Fit("gaus");
Often more complicated functions are needed. It is possible to create a custom formula,
which is used as the fit function:
TF1* f = new TF1("f","[0]*x+[1]",0,1);
histo->Fit("f");
Here [0] and [1] are the parameters, which will be varied to get the best fit. The third way is
to define a function which is of the form:
Double_t name(Double_t* x, Double_t* par){}
The two parameters x and par contain a pointer to the dimension array, and a pointer to the
parameters array.
Example:

Double_t func1(Double_t *x,Double_t *par) {


return par[0]/sqrt(2*TMath::Pi()*par[2]*par[2]) *
       exp(-0.5*(x[0]-par[1])*(x[0]-par[1])/(par[2]*par[2]));
}
Double_t func2(Double_t *x,Double_t *par) {
return par[0]*TMath::Landau(x[0],par[1],par[2])/par[2];
}
Double_t fitFunction(Double_t *x,Double_t *par) {
return func1(x,par) + func2(x,&par[3]);
}

TF1* theFit = new TF1("theFit", fitFunction, fitRangeMin, fitRangeMax, nPar);


histo->Fit("theFit","R");

Often, to get the fit to converge, one needs an educated guess: set the parameters
to some initial values not too far from the expected results. This can be achieved with:
theFit->SetParameter(0,1.234);
which sets the initial value of parameter par[0] to 1.234. Parameters can also be fixed to a
constant value so that they are not varied in the fitting procedure:
theFit->FixParameter(0,1.234);
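
After the fit, the fitted parameter values and their uncertainties can be read back from the
function; a sketch:

histo->Fit("theFit","R");
double p0   = theFit->GetParameter(0);  // fitted value of par[0]
double e0   = theFit->GetParError(0);   // its uncertainty
double chi2 = theFit->GetChisquare();   // chi2 of the fit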

5 Combining languages

5.1 FORTRAN and C++

To access FORTRAN programs from C++ (C) and vice versa, look at e.g.
http://arnholm.org/software/cppf77/cppf77.htm
http://cnlart.web.cern.ch/cnlart/217/node34.html
Since new HEP programs are mostly written in C++, you may need to call FORTRAN
subroutines from C++ main, but having FORTRAN main calling C++ is unlikely. Therefore
we concentrate here only on the former case.
The FORTRAN code can be accessed from C++ via COMMON blocks. Let us consider a
FORTRAN subroutine named TEST with the following common block:

INTEGER I
REAL R
DOUBLE PRECISION D
COMMON/COMX/I,R(3,3),D
CHARACTER*80 CHTEXT(10)
COMMON/COMC/CHTEXT

In C++, the function or structure corresponding to a FORTRAN subroutine or common block
has to be declared with the same name typed in lower case letters with a trailing underscore:

SUBROUTINE TEST
test_;

For the C++/FORTRAN interface we need:

extern "C" void test_();


struct fBlockCOMX {
int i;
float r[3][3];
double d;
};

extern "C" {
extern fBlockCOMX comx_;
}

The common block variables can then be accessed as the structure member data field:

comx_.i = i;
float f = comx_.r[0][0];

Same applies for the character common block:

struct fBlockCOMC {
char text[10][80];
};
extern "C" {
extern fBlockCOMC comc_;
}
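
As a hypothetical illustration, the character strings can then be written from C++; note that
FORTRAN strings are blank-padded, not null-terminated:

#include <cstring>

std::memcpy(comc_.text[0], "hello", 5);   // write into CHTEXT(1)
std::memset(comc_.text[0] + 5, ' ', 75);  // blank-pad the remaining 75 characters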

Argument passing in calling FORTRAN from C++ is very compiler dependent. The following
should work on most compilers

SUBROUTINE TEST(X,CH,Y)
float x,y;
char* ch_pointer;
int ch_length;
test_(&x,ch_pointer,&y,ch_length); // the hidden string length is appended after the arguments

To avoid problems with argument passing, one could write a FORTRAN interface, which
communicates with C++ using common blocks, and calls the actual subroutine which one
wants to link with the C++ program.
Compiling the C++/FORTRAN program is done as usual, first the object files are created
from the source after which the code is linked to an executable. However, since the FORTRAN
code is probably something which is not modified, one could create a library containing the
FORTRAN part and perhaps the interface, too. After all, this linkage procedure is practical
only for reusing old existing FORTRAN code, when rewriting the FORTRAN code in C++
is too time consuming a task.
As a real world example, you can look at the program sigmaBr6:
http://cmsdoc.cern.ch/~slehti/sigmaBr.html
which combines several FORTRAN programs, provides a common interface to them, and
makes plotting the cross section and branching ratio curves easier.
A C++/FORTRAN interface is found also in CMSSW, which is a reconstruction program
using different event generators.

5.2 C++ and ROOT

C++ and ROOT are very close to each other and therefore easy to combine. The ROOT
classes and libraries are available as soon as ROOT is installed on the system. In the Makefile
one needs to include the ROOT include path

-I$(ROOTSYS)/include

and link the (used) ROOT libraries. At run time the environment variable LD_LIBRARY_PATH
must contain a path to the ROOT libraries.
Although linking C++ and ROOT is easy, why should one do such a thing? First of all, one
can write small test programs with ROOT. Secondly, in a typical physics analysis one has to
analyse perhaps millions of events, so speed is an issue. Executing ROOT macros is much
slower than executing compiled code, so at some point compiling the ROOT analysis script
may become relevant. A third reason is that language constructs not fully supported by CINT
become available.

Example:
The MSSM h → γγ production at the LHC is very sensitive to stop mixing, especially if
the stop quark mass is very close to the top mass. The cross section times branching ratio
for this channel can be estimated with M.Spira’s programs for calculating the Higgs boson
production at the LHC and decay into the chosen final state. The following results were
obtained for stop mass mt̃1 = 200 GeV:

        5 fb                    10 fb
  mA (GeV)   tanβ         mA (GeV)   tanβ
   188.8      1.7          500.0      3.27
   206.3      1.8          428.7      4
   214.9      1.9          393.3      5
   220.1      2            369.6     10
   227.6      3            404.0     20
   226.1      5            450.9     30
   233.9     10            499.2     40
   257.9     20
   285.9     30
   313.9     40

The task is to write a ROOT script for plotting the σ×BR curves for h → γγ in the (mA ,tanβ)
parameter space. We use the OO-style and write the
code as a class.

Solution: (file figure.C, try and execute as root -l figure.C)

class Curve {
public:
    Curve(double xx[], double yy[], int NN){
        if (NN>N) { cout<<"Too many entries"<<endl; NN = N; } // check before copying
        for (int i=0; i<NN; i++) {
            x[i]=xx[i];
            y[i]=yy[i];
        }
        Nentries=NN;
    }
void Plot(){
graph = new TGraph(Nentries,x,y);
graph->SetLineWidth(2);
graph->DrawClone("C");
delete graph;
}
private:
static const int N = 20;
int Nentries;
double x[N],y[N];
TGraph* graph;
};
void figure(){
TCanvas* canvas = new TCanvas("canvas","",500,500);
canvas->SetFillColor(0);
canvas->SetLogy();

TH2F* frame = new TH2F("frame","",2,50,500,2,1,100);


frame->SetStats(0);
frame->GetXaxis()->SetTitle("m_{A} (GeV)");
frame->GetYaxis()->SetTitle("tan#beta");
frame->GetYaxis()->SetTitleOffset(1.2);
frame->Draw();

double x5fb[]={188.8,206.3,214.9,220.1,227.6,226.1,233.9,257.9,285.9,313.9};

double y5fb[]={ 1.7, 1.8, 1.9, 2.0, 3.0, 5.0, 10.0, 20.0, 30.0, 40.0};
Curve sigmaBr5fb(x5fb,y5fb,10);
sigmaBr5fb.Plot();

double x10fb[]={500.0,428.7,393.3,369.6,404.0,450.9,499.2};
double y10fb[]={ 3.27, 4.0, 5.0, 10.0, 20.0, 30.0, 40.0};
Curve sigmaBr10fb(x10fb,y10fb,7);
sigmaBr10fb.Plot();

TLatex * tex = new TLatex(163.307,58.5823,"m_{#tilde{t}_{1}} = 200 GeV");


tex->SetLineWidth(2);
tex->Draw();

canvas->Print("figure.eps");
}

Let us see how to compile this example script. The first modification is to change ”void
figure” into ”int main”. Then the used class headers and namespace should be included at the
beginning of the code:

#include "TGraph.h"
#include "TCanvas.h"
#include "TH2F.h"
#include "TLatex.h"
#include <iostream>
using namespace std;

A Makefile should be added. The ROOT include paths need to be included in the compilation,
as well as a list of ROOT libraries ($(LIBS) below), which can be obtained by the following
sequence of commands in the Makefile:

ROOTLIBDIR = $(shell root-config --libdir)
ROOTLIBS   = $(shell root-config --glibs)
LIBS       = -L$(ROOTLIBDIR) $(ROOTLIBS)
OPT        = -O -m32 -Wall -fPIC -D_REENTRANT # g++ options needed on korundi

So then you can put the compilation phase in the Makefile e.g. as (figure.cc being the
renamed source file):
g++ $(OPT) -I$(ROOTSYS)/include figure.cc $(LIBS) -o figure.exe
Notice that the figure is not printed on the screen by the executable, but only into a file.
Another way to compile the example is to use ACLiC, the AutomatiC Library Compiler
for CINT. When compiling the code into a library this way, no Makefile is needed, and the
function name doesn’t need to be changed. However, the header files need to be included. The
script is compiled by typing:

root [] .L myScript.C+

This + option creates a shared library myScript C.so in your directory. The library can be
loaded later for use, and after loading the function is available:

root [] myScript()

Notice that now the ROOT command line is used, and the Canvas is available interactively.
To make your scripts easy to move between the interpreter and compiler, it is recommended
to always write the include statements in your scripts. Also do not use the CINT extensions
and program around the CINT limitations.
Sometimes one wants to include C++ classes in ROOT. That can be done by ”rootifying”
the class. For example in order to save an object of a user defined class, the class needs to
be rootified. This includes
• The class must inherit TObject
• The ClassDef(MyClass,1) macro is added in the class definition
• The ClassImp(MyClass) macro is added in the class implementation.

The TObject class provides the default behavior and protocol for the objects in the ROOT
system. It is the primary interface to classes providing I/O, error handling, drawing etc.
Classes can be added in ROOT also without the ClassDef and ClassImp macros, but then
the object I/O features of ROOT will not be available for these classes.
Note that you must provide a default constructor for your classes or you will get a compiling
error.
In order to get your rootified code to compile, a dictionary is needed. The dictionary gives
the interpreter access to one’s classes. Dictionaries can be created with the
rootcint program. The rootcint program also generates the

Streamer(),
TBuffer &operator>>()
and
ShowMembers()

methods for ROOT classes, i.e. classes using the ClassDef and ClassImp macros.
To tell rootcint for which classes the method interface stubs should be generated, a file
LinkDef.h is used. The LinkDef file name MUST contain the string: LinkDef.h or linkdef.h,
e.g. MyLinkDef.h.
LinkDef.h must be the last argument on the rootcint command line.
A LinkDef file looks like the following:

#ifdef __CINT__
#pragma link off all globals;
#pragma link off all classes;
#pragma link off all functions;
#pragma link C++ class MyJet+;
#pragma link C++ class vector<MyJet>;
#pragma link C++ class MyEvent+;
#endif

The trailing + tells rootcint to use the new I/O system.


The order of pragma statements matters.
The rootcint call looks like the following:

rootcint -f eventdict.cc -c -I. -p MyEvent.h MyJet.h LinkDef.h

Here eventdict.cc is the name of the dictionary file rootcint generates, -I. sets the include
path, then the class headers are listed and finally LinkDef.h. This command can be added
e.g. in a Makefile.
The library is then compiled with the code describing the class, and with the event dictionary.
The library can then be loaded in ROOT

root [] .L MyEvent.so

or used as a normal C++ library and linked with the analysis program.
A second way to add a class is with ACLiC:

root [] gROOT->Macro("MyCode.C++")

Example: rootifying MyTrack class


In file MyTrack.h:

#include "TROOT.h"
#include "TLorentzVector.h"

class MyTrack : public TLorentzVector {


public:
MyTrack();
...
ClassDef(MyTrack,1) // macro
};

In file MyTrack.cc:

#include "MyTrack.h"
ClassImp(MyTrack) // macro
...

In file LinkDef.h

#ifdef __CINT__
#pragma link off all globals;
#pragma link off all classes;
#pragma link off all functions;
#pragma link C++ class MyTrack+;
#pragma link C++ class vector<MyTrack>;
#endif

In Makefile

eventdict.cc: MyTrack.h
rootcint -f eventdict.cc -c -p \
MyTrack.h LinkDef.h

libMyTrack.so: $(OBJS)
$(GCC) -shared -O *.o -o libMyTrack.so

6 Cross section and branching ratio calculations
6.1 Introduction

In particle physics, the concept of a cross section is used to express the likelihood of interaction
between particles. It can therefore characterize the probability that a particular reaction will
take place, or the statistical nature of scattering events. The cross section is expressed in
units of area, usually in barns. The cross section of two particles (i.e. observed when the two
particles are colliding with each other) is a measure of the probability of an interaction between
the two particles.
In HEP experiments the measured variable related to cross section is the number of measured
events. One can measure e.g. the total number of events, or events found per unit space angle
to measure the angular distributions. The number of events measured depends on the process
cross section σ, integrated luminosity L and the detector efficiency ε:

N = εσL.

The branching ratio (branching fraction) is the fraction of events for a chosen particle
measured to decay in a certain way. The sum of branching ratios for a particle is one. The
branching ratio is defined as the partial decay width divided by the total width.
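In symbols, if Γi is the partial width of decay mode i and Γ = Σi Γi the total width, then
BRi = Γi /Γ, and Σi BRi = 1.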
The measurements of particle properties and interactions in HEP experiments have led to
a theoretical model, the so called Standard Model (SM). The measured particle production
and decay give estimates for the SM coupling constants. With these coupling constants cross
sections for rare reactions can be predicted, and the predictions are experimentally verified,
when possible. In the LHC era, when the CM energies are increased by about an order of
magnitude, the SM can give predictions for SM particle production at the LHC.
In addition to the SM, there are several theoretical models which extend the SM. One of those
models is the Minimal Supersymmetric Standard Model (MSSM). The MSSM predicts a set
of new particles, their production cross sections and branching ratios. As those extensions
are not experimentally verified, there are several open parameters. Since studying all the
possible values for dozens or hundreds of parameters is not feasible, one usually chooses
some key scenarios for the studies. To predict the particle production cross sections and
decay widths one needs a theoretical model and a scenario for calculating them. To have the
results available also for non-experts, theorists have in many occasions written a program or
programs to make the actual calculations. The user has to set the scenario, i.e. select the
input parameters.
We will not go here through every program available for calculating the cross sections or
branching ratios. New programs are being written and old ones are updated. Some programs
overlap each other, therefore providing a possibility to double-check results with independent
calculations.

6.2 Cross sections and branching ratios in event generators

6.2.1 PYTHIA

Event generators, like PYTHIA, have built-in the theoretical models for calculating the cross
sections and decay widths. In many occasions the cross sections and branching ratios are de-
termined with adequate accuracy using the event generators. One should remember, though,
that the accuracy e.g. in PYTHIA depends on the number of simulated events.

In PYTHIA the cross section is stored in a common block

COMMON/PYPARS/ MSTP(200),PARP(200),MSTI(200),PARI(200)

where PARI(1) gives the total integrated cross section, obtained as a by-product of the
selection of hard-process kinematics, and PARI(2) gives the ratio of total integrated cross
section and the number of events generated.
In case one is not satisfied with the cross sections and branching ratios coming from the event
generators, then one has to use other methods. A cross section can be given by theorists,
based on the best knowledge available, or one can use the special programs provided for
making these calculations.

6.2.2 HDECAY

HDECAY is a self-contained and fast program for calculating the decay widths and branching
ratios of the SM Higgs boson, and the Higgs bosons of the MSSM according to the current
theoretical knowledge. The motivation for the program according to the authors is:
”The search of the Higgs boson is one of the major goals of future colliders, such as LHC and
LC. Once the Higgs boson is found, it will be of utmost importance to perform a detailed
investigation of its fundamental properties. To this end, a very precise prediction of the pro-
duction cross section and of the branching ratios for the main decay channels is mandatory.”
The url for HDECAY is: http://people.web.psi.ch/spira/proglist.html
HDECAY includes
• All kinematically allowed decay channels which have branching ratios greater than 10^-4.
• All relevant higher-order QCD corrections.
• Double off-shell decays into massive gauge bosons
• All important below-threshold three-body decays
• Complete radiative corrections in the effective potential approach with full mixing in
the stop and sbottom sectors
• Loop mediated channels including mixing

HDECAY is used via input parameter file. The basic input parameters are the fermion and
gauge boson masses and their total widths, coupling constants, and in the MSSM soft SUSY
breaking parameters. The full list of input parameters can be found in the program manual
hep-ph/9704448 (http://arxiv.org/abs/hep-ph/9704448). The output is produced in several
text files.
The SUSY parameters can be chosen e.g. according to selected benchmark scenarios, like
presented in hep-ph/0202167 (http://arxiv.org/abs/hep-ph/0202167). A typical scenario is
the max-mh scenario, which maximizes the h mass:

MU = 200.D0
M2 = 200.D0
MSUSY = 1000.D0
AL = 0.D0
AU = 2450.D0
AD = 2450.D0
OFF-SUSY = 0

Here MSUSY is the common mass for all input masses between MSL1 - MDR. In the scenario
selection and choosing parameters, one has to be careful that no parameter is set to value
which is already experimentally excluded. For example the non-observation of neutralinos
and charginos at LEP limits the value of MU and M2.

6.2.3 FeynHiggs

FeynHiggs is a Fortran code for diagrammatic calculation of the masses and mixing angles
of the Higgs bosons in the MSSM at two-loop level. FeynHiggs provides a convenient way
of invoking the subroutines of the FeynHiggs library from the command line. The input
parameters are read from a file and the output is displayed in a human-readable form on
screen.
Url: http://www.feynhiggs.de
FeynHiggs documentation is in manual pages. Those pages are available after installing the
package, or they can be found in the FeynHiggs web page.
FeynHiggs can be used via mathematica, as a FORTRAN library, or using an interactive
web page. It also provides a C/C++ interface. The web page cannot do everything, but for
simple usage it is an easy start (FHUCC=Feyn Higgs User Control Center: obtain results
online http://wwwth.mppmu.mpg.de/members/heinemey/feynhiggs/cgi-bin/fhucc.cgi).
The examples in FeynHiggs are not particularly well documented or easy to use. In FORTRAN
usage the user should provide the main program (like example/demo.F), which calls the
subroutines available in the FeynHiggs library libFH.a. The bin, library and include files can
be found under iX86-linux/ directory X being e.g. 3, 4 or 5. Another possibility is to use the
binary FeynHiggs with input file (like example/var.in). Linking FeynHiggs with your own
code should be straightforward due to the FeynHiggs library being available.

6.2.4 HQQ, HIGLU etc.

In addition to HDECAY, M.Spira has written programs also for the Higgs cross section
calculations. There are several programs, which calculate the production cross sections for
different channels. Also PROSPINO, a code for SUSY particle production at hadron colliders
is available. The programs can be found at http://people.web.psi.ch/spira/proglist.html
The program HQQ calculates the production cross section of Higgs bosons via qqbar,gg →
h,H,A + tt̄/bb̄ at hadron colliders at LO QCD (Leading Order QCD). The program allows
to calculate the total production cross sections for the neutral Higgs bosons of the SM and
MSSM. The parton densities are defined in the subroutine STRUC at the end of the program
and can be changed there. The default is the use of the PDFLIB (PDF=Parton Distribution
Functions). The input parameters can be chosen as in HDECAY.
The program PPHQQ calculates the production cross section of Higgs bosons via vector
boson fusion WW/ZZ → h,H at hadron colliders at NLO QCD (Next to Leading Order QCD).
Interference effects between W- and Z-boson fusion are neglected, since they are less than 1%.
The program allows to calculate the total production cross section in the DIS (Deep Inelastic
Scattering) as well as the MSbar (Mass factorization by Subtracting the counter term) scheme
for the parton densities for the scalar Higgs bosons of the SM and MSSM. The MSSM Higgs
sector is implemented in the approximate two-loop RGE approach (RGE=Renormalization
Group Equations). The parton densities are defined in the subroutine STRUC at the end of
the program and can be changed there. The default is the use of the PDFLIB.

HIGLU is a program for calculating the total Higgs production cross section at hadron col-
liders via gluon fusion in the SM and MSSM. NLO QCD corrections and contributions of
virtual top and bottom loops are included. The program allows also to calculate the decay
widths of Higgs bosons into gluons. Parton distributions can be attached to the program in
any desirable way by adjusting the corresponding subroutine STRUC. The default PDF’s the
program uses are GRV sets (GRV=Glück-Reya-Vogt).

6.2.5 MCFM

MCFM is a parton-level Monte Carlo program which gives NLO predictions for processes at
hadron colliders. This program is useful for calculating the SM cross sections. MCFM allows
the user to choose between a number of schemes for defining the electroweak couplings.
Some of the parameters are set in the file src/User/mdata.f, others in an input
file (Bin/input.DAT). A number of parton distributions are included, and the parameters to
choose the PDF set are specified in the input file. The available processes and PDF’s are
listed in the program documentation. The url to MCFM package is http://mcfm.fnal.gov

[Figure 1: Parton distributions for bottom quarks with CTEQ5L, CTEQ5M and MRST98_5
structure functions, shown as x^2 times the parton distribution versus x for mu = 100 GeV.
Graphics by P. Anandam; made with the http://zebu.uoregon.edu/~parton parton distribution
plotter.]

6.2.6 PDF’s

Parton distribution functions give the probability to find partons (quarks and gluons) in a
hadron as a function of the fraction x of the proton’s momentum carried by the parton. They
are determined from experimental results on short distance scattering of the partons.
The programs which calculate the cross sections for hadron collisions, need parton distribution
functions, PDF’s. They are also called structure functions. In the example programs above

the PDF’s are included in the program code, or a library like PDFLIB or LHAPDF is linked
with the program. Almost all PDF’s are included in the libraries, which makes the PDF
selection very easy. It is also possible to download the PDF source codes, but for them there
is not necessarily any easy-to-use interface available like in the libraries. If one wants to use
a PDF set which is not included in the program, then it is necessary to download the PDF
set in question and modify the program to include it. PDF sets available are e.g.
• MRS (Martin-Roberts-Stirling)
• CTEQ (Coordinated Theoretical-Experimental project on QCD)
• GRV (Glück-Reya-Vogt)

In addition to selecting the PDF, one must remember to set the parameters like αS (MW ) to
match the values used in the PDF. One should also use the correct order for the PDF, like
LO PDF for LO cross section calculations.

7 Review of event generators
7.1 Introduction

Event generators generate simulated high-energy physics events. Despite the simple structure
of the tree-level perturbative quantum field theory description of the collision and decay
processes in an event, the observed high-energy process usually contains a significant amount of
modifications, like photon and gluon bremsstrahlung or loop diagram corrections, that usually
are too complex to be easily evaluated in real calculations directly on the diagrammatic level.
Any realistic test of the underlying physical process in a particle accelerator experiment,
therefore, requires an adequate inclusion of these complex behaviors surrounding the actual
process. Based on the fact that in most processes, a factorization of the full process into
individual problems is possible (which means a negligible effect from interference), these
individual processes are calculated separately, and the probabilistic branching between them
is performed using Monte Carlo methods.
The final-state particles generated by event generators can be fed into a detector simulation,
allowing a precise prediction and verification for the entire system of experimental setup.
However, as the detector simulation is usually a complex and computationally expensive
task, simple event analysis techniques are also performed directly on event generator results.
A typical hadronic event generator simulates the following subprocesses:
• Initial-state composition and substructure
• Initial-state radiation
• The hard process
• Resonance decay
• Final-state radiation
• Accompanying semi-hard processes
• Hadronization and further decay

A typical heavy ion event generator usually can be less strict in simulating the rare and
rather negligible processes found in a hadronic generator, but it would need to simulate the
following subprocesses, in addition to those in a hadronic generator:
• Nuclear initial-state
• High multiplicity, soft processes
• In-medium (nuclear) energy loss
• Collective behavior of the medium (not handled properly by any generators so far)

The major event generators that are used by current experiments are e.g.:
• Pythia
• Herwig
• Isajet
• Alpgen
• MC@NLO (Next to Leading Order)
• CompHEP
• TopReX

As the name indicates, the output of an event generator should be in the form of events, with
the same average behavior and the same fluctuations as real data.
Event generators can be used in many different ways. The five main applications are probably
the following:

• To give a feeling for the kind of events one may expect/hope to find, and at what rates
• As a help in the planning of a new detector, so that detector performance is optimized,
within other constraints, for the study of interesting physics scenarios.
• As a tool for devising the analysis strategies that should be used with real data, to
optimize signal-to-background conditions
• As a method for estimating detector acceptance corrections that have to be applied to
raw data
• As a convenient framework within which to interpret the observed phenomena in terms
of a more fundamental underlying theory

7.2 PYTHIA

The PYTHIA program is frequently used for event generation in high-energy physics. The
emphasis is on multiparticle production in collisions between elementary particles. This in
particular means hard interactions in e+ e− , pp and ep colliders, although also other applica-
tions are envisaged. The program is intended to generate complete events, in as much detail
as experimentally observable ones, within the bounds of our current understanding of the
underlying physics.
The documentation is extensive, and extremely useful, and it can be found in:
http://www.thep.lu.se/~torbjorn/Pythia.html
The PYTHIA package is completely self-contained. In addition interfaces to externally defined
subprocesses, PDF libraries, τ -lepton decay libraries and a time routine are provided, plus a
few optional interfaces.
All default settings and particle and process data are stored in BLOCK DATA PYDATA. This
subprogram must be linked at the beginning of the main program for the proper functioning
of other routines

EXTERNAL PYDATA

The program philosophy in PYTHIA is that the Monte Carlo program is built as a slave system,
and the user has to supply the main program. The details how the program should perform
specific tasks is done by setting common block variables. By calling subroutines the program
is told to generate events according to rules established by the user. To extract information
on the generated events, one can look at common block PYJETS, or to call various functions
and subroutines to analyse the events.
In the core process generation machinery, like selection of hard process kinematics, a sizable
amount of initialization is performed in the PYINIT call, and thereafter the events generated
by PYEVNT all obey rules established at that point. The common block variables specifying
methods and constraints to be used have to be set before calling PYINIT.
PYTHIA is extremely versatile, but the price to be paid for this is having a large number
of adjustable parameters and switches for alternative modes of operation. Since all these
parameters and switches are assigned sensible default values, there is no reason to worry
about them until the need arises. Unless explicitly stated, all switches and parameters can
be changed independently of each other.
The normal usage of PYTHIA can be subdivided into three steps:
• initialization
• event generation loop

• printing and histogramming

In the initialization step all basic characteristics of the coming generation are specified. This
includes the selection of required processes, the selection of kinematics, definition of the un-
derlying physics scenario, e.g. the Higgs mass, selection of PDF’s, switching off of generation
parts not needed, and initialization of the event generation procedure.
In the event loop the events are generated and studied. The generation of next event is done
with call (for example see main63.f, page 20):

CALL PYEVNT

The events are then saved on disk, or interfaced to a detector simulation.


In the final step, in addition to producing output, the events can be normalized to expected
rates as the number of events simulated is not always the number of events expected. For
example a process with a very large cross section may be expected to give millions of events,
while simulating that many events may not always be feasible. In that case the events must
be given weights.

Example 1: How to get particle listing


In PYTHIA every particle is given a unique code, a KF code. This code is needed e.g. when
forcing decays, or when you want to select a particle in your analysis for some reason. To get
a full list of the KF codes, one can call PYLIST:
Main program:

CALL PYLIST(12)
END

This list is so useful that you might want to save the output on your disk or even print it.
In the listing the decay modes (IDC) for each particle are also included. These can be used
for switching decay modes on and off. Note that while the KF code is not changed between
PYTHIA versions, the IDC code for decays may change (and have changed) when new potential
decay modes are included.

Example 2: Simulation of Z boson production


Let’s assume we want to study the Z boson in µ+ µ− final state. For that we need to initialize
PYTHIA to produce Z bosons, and then force them to decay into muons, as there is no need
to include the other decay modes producing very different signatures in the detector. The
initialization step looks like following:

MSEL=0 ! full user control


MSUB(1)=1 ! Z-production with sub-processes
CKIN(1) = 80 ! s-hat cut
DO IDC=MDCY(22,2),MDCY(22,2)+MDCY(22,3)-1
MDME(IDC,1)=MIN(0,MDME(IDC,1)) ! switch off all ...
ENDDO
DO IDC=MDCY(23,2),MDCY(23,2)+MDCY(23,3)-1
MDME(IDC,1)=MIN(0,MDME(IDC,1)) ! ... decays

ENDDO
MDME(171,1)=1 ! switch on decays ...
MDME(184,1)=1 ! ... to mu+mu-
CALL PYINIT(’CMS’,’P’,’P’,14000.D0)

The first line MSEL=0 switches all channels off giving full user control.
MSUB(1)=1 sets the process 1 on. The processes are listed in the PYTHIA manual, here we
select the subprocess fi f̄i → γ ∗ /Z

CKIN(1) sets the lower limit for m̂ = √ŝ. We are not interested in the Z/γ ∗ production well
below the Z mass, it has huge cross section and the kinematical selection excludes it anyway.
The Z boson KF code is 23, the γ KF code is 22, the DO loops switch all the decay channels
off. Once they are all off, we set the muon channel open using MDME(IDC,1)=1. The particle
listing (Example 1) gives IDC=171 for γ ∗ → µµ and IDC=184 for Z→ µµ.
Finally we initialize PYTHIA with the chosen parameter values for production and decay, with
CM energy of 14 TeV for colliding proton beams. The event loop consists of a loop over as
many events as we want. Choosing too low a value for the number of simulated events results
in poor statistics, but choosing too large a value will make the simulation computationally
expensive and very time consuming. Within the event loop the events are generated by calling
subroutine PYEVNT.
The final step consists of event normalization. If we assume an integrated luminosity of 30
fb^-1, the normalization (the expected number of events) will be

CNOR = PARI(2)*10.**12 *NEV*LUMI

where PARI(2) is the total integrated cross section divided by the number of generated events,
NEV is the number of events and LUMI the chosen luminosity. The factor 10^12 comes from unit
conversion: in PYTHIA the cross sections are in mb, while the luminosity is in fb^-1 (1 mb = 10^12 fb).

Example 3: How to set PDF’s


In the previous example we used PYTHIA’s default parton distribution function (PDF), which
is CTEQ5L. If some other PDF is preferred, it can be selected in the initialization.
The complete set of PDF’s available in PYTHIA are listed in the PYTHIA manual section
7: Process Generation.
http://home.thep.lu.se/~torbjorn/pythia/lutp0613man2.pdf
More can be found in external libraries of parton distributions, PDFLIB and LHAPDF.
PDFLIB is the older one, and no new parton distributions have been added to it for a couple
of years. The newer LHAPDF (the Les Houches Accord PDF interface) contains all new sets
of the last five years or so. LHAPDF has an interface LHAGLUE, which is identical to the
PDFLIB interface.
To change PDF to one already available in PYTHIA, use MSTP(51) to select the set. Use
PYTHIA manual to check which value of MSTP(51) selects the chosen PDF. To change the
PDF used in example 2 to a PDF from an external library, we need to add PDF calls in the
program initialization, and to disable the dummy routines in the PYTHIA code. As PDFLIB is
becoming obsolete, LHAPDF is used as an example. The procedure is the following:
Read LHAPDF docs.

Edit PYTHIA source code and rename dummy subroutines PDFSET and STRUCTM to disable
them. These routines are available in the library interface.
In your initialization call SETPDFPATH to set the path to the PDF data file you are using.
Add a call (and argument type definitions) to PDFSET with arguments

PARM(1) = ’DEFAULT’
VAL(1) = PDF_set_number

The PDF set number depends on the chosen PDF, and it can be found in the LHAPDF manual
appendix A:
http://projects.hepforge.org/lhapdf/manual
(PDF set numbers and names). Choose the scale, and call STRUCTM. For the LHC e.g.

X = 0.002
CALL STRUCTM(X,X,X,X,X,X,X,X,X,X,X)

Compile and link the LHAPDF library with the rest of your code. The LHAPDF code can be
found in
http://hepforge.cedar.ac.uk/lhapdf
The initialization code looks like following, assuming CTEQ6LL PDF is chosen and the
corresponding data file cteq6ll.LHpdf is found in the local directory:

CALL PYINIT(’CMS’,’P’,’P’,14000.D0)
CALL SETPDFPATH(’.’)
PARM(1) = ’DEFAULT’
VAL(1) = 10042 ! CTEQ6LL
CALL PDFSET(PARM,VAL)
X = 0.002
CALL STRUCTM(X,X,X,X,X,X,X,X,X,X,X)

7.3 HERWIG

HERWIG is another general purpose Monte Carlo event generator. It includes the simulation
of hard lepton-lepton, lepton-hadron and hadron-hadron scattering and soft hadron-hadron
collisions. It uses the parton-shower approach for initial and final state QCD radiation, including
colour coherence effects and azimuthal corrections both within and between jets.
A generic hard process, i.e. a process with high momentum transfer, is divided into four
components in HERWIG. These are
• Elementary hard subprocess
• Initial and final state parton showers
• Heavy object decays
• Hadronization process

The official HERWIG homepage for documentation and source code is


http://hepwww.rl.ac.uk/theory/seymour/herwig

Example 4: Simulation of Z boson production

Let's repeat example 2, now using HERWIG instead of PYTHIA. The source code can be
downloaded from the HERWIG web page, and it includes the Fortran source herwig6510.f and
the header files HERWIG65.INC and herwig6510.inc. As in PYTHIA, the user must write the main
program from which the HERWIG routines are called. A good starting point is e.g. the example
program available in the HERWIG manual chapter 8.

PART1='p'
PART2='p'
PBEAM1=7000.
PBEAM2=PBEAM1
IPROC = 1353
CALL HWIGIN
EMMIN = 80 ! min lepton-lepton mass

Here we selected LHC beams with 14 TeV CM energy, colliding protons, and producing Z
bosons (and γ ∗ ) decaying into muons, process 1353.
SUSY particles are not needed, so the line reading the SUSY particle data can be commented
out. The rest of the code can remain as is, but to make sure the program works, some test
prints are added:

SUBROUTINE HWANAL
INTEGER NEVHEP,NHEP,ISTHEP,JMOHEP,JDAHEP
DOUBLE PRECISION PHEP,VHEP
PARAMETER (NMXHEP=4000)
COMMON/HEPEVT/NEVHEP,NHEP,ISTHEP(NMXHEP),IDHEP(NMXHEP),
& JMOHEP(2,NMXHEP),JDAHEP(2,NMXHEP),PHEP(5,NMXHEP),VHEP(5,NMXHEP)
DO i = 1,NHEP
C select muons (ID 13) whose mother is a Z boson (ID 23)
IF(ABS(IDHEP(i)).EQ.13.AND.ABS(IDHEP(JMOHEP(1,i))).EQ.23) THEN
print *,"ev, muon pt ",NEVHEP,sqrt(PHEP(1,i)**2+PHEP(2,i)**2)
ENDIF
ENDDO
END

The particle ID codes are the revised Particle Data Group ones. The only exception is the
lightest MSSM Higgs boson h with id 26 to distinguish it from the SM Higgs boson (id=25).
HERWIG gives the output in the LEP standard common block /HEPEVT/.
For the normalization in the final step, the process cross section (times the branching ratio)
is needed. The cross section is stored in common /HWEVNT/ with variable name AVWGT. The
cross section is given in nb.
The PDF’s are included using variables AUTPDF and MODPDF for the two beams:

CALL SETPDFPATH('.')
AUTPDF(1) = 'HWLHAPDF' !beam 1
MODPDF(1) = 10042 !beam 1 (CTEQ6LL PDF)
AUTPDF(2) = AUTPDF(1) !beam 2
MODPDF(2) = MODPDF(1) !beam 2

As in PYTHIA, one must comment out the dummy routines PDFSET and STRUCTM in the HERWIG
source code.

7.4 Other event generators

The number of event generators is increasing. Often the general purpose event generators can
be used, or they can serve as a good starting point, but in many cases there are special event
generators available giving more precise predictions. The event generators can be specialized
in e.g. top physics, or vector boson fusion, or associated production with n jets etc. It is often
very healthy to compare the results from different event generators, and to try to understand
the differences. Which event generator to use is up to the user, but when presenting results
the choice of the event generator may be criticized.
Many of the specialized event generators include code only for the core calculations, and use
general purpose event generators like PYTHIA and HERWIG as a user interface. This simplifies
the HEP simulation, as the event generators can be easily switched from one to another.

8 Detector simulations

8.1 Introduction

In high energy physics, experimental data are simulated in two stages,
event generation and detector simulation. Event generators describe the particle reactions
and produce the momentum vectors of the generated particles. The output of an event
generator is used as input for a detector simulation program.
The response of a detector to the passage of the scattered particles involves random processes
such as ionization, multiple Coulomb scattering etc. Therefore the transport of particles
through an experimental setup can be effectively implemented using the Monte Carlo tech-
nique. Programming packages such as GEANT can be used to describe complicated detector
configurations. Many GEANT based simulation packages have been developed to offer central
facilities for Monte Carlo studies in different experiments.

8.2 GEANT4

Geant4 is a free software package composed of tools which can be used to accurately simulate
the passage of particles through matter. The aspects of the simulation covered by the toolkit are

• geometry of the system


• materials involved
• particles
• generation of events
• tracking through matter and EM fields
• response of sensitive detector components
• generation of event data
• storage of events and tracks
• visualization
• analysis of data

Users may construct stand-alone applications, or applications built upon another object-
oriented framework.
The Geant4 physics models handle the interactions of particles with matter across a very wide
energy range. It incorporates a large part of all that is known about particle interactions.
The Geant4 web page url is http://cern.ch/geant4
For the detector simulation one needs a detector description. The detector definition requires
the representation of its geometrical elements, their materials and electronics properties. The
geometrical representation of detector elements focuses on the definition of solid models and
their spatial position.
Geant4 is capable of describing and propagating particles in a variety of electromagnetic fields:
magnetic fields, electric fields, and uniform and non-uniform electromagnetic fields can be specified. In
order to propagate a track inside a field, the equation of motion of the particle in the field is
integrated in the toolkit.
A physical interaction of a track in the sensitive region of a detector is called a hit. Hits sim-
ulate the physical signal in the detector element produced by the particle-matter interaction.

The physical signal in the sensitive detector element, the detector response, is transformed
with the help of electronics into a digital signal, which can be stored on tape or hard drive.
The simulation of this step is called digitization, and the digitized data are called raw data
("digis"). Typical usages of the digitizer are
• Simulation of ADC and/or TDC (Analogue to Digital or Time to Digital Converter)
• Simulation of readout scheme
• Generation of raw data
• Simulation of trigger logics
• Simulation of pile-up

At high luminosities several separate events are likely to occur at the same bunch crossing.
These so called minimum bias events or pile-up add a non-negligible level of background to
the response of the detector system. In the tracker the pile-up events generate a number of
hits which can affect the track finding efficiency and fit accuracy.
In the calorimeters these events add pile-up noise affecting calorimeter and missing energy
resolution.
Physics processes describe how particles interact with the material. Seven major categories
are provided by Geant4:
• electromagnetic
• hadronic
• decay
• photolepton-hadron
• optical
• parametrization
• transportation

Electromagnetic physics processes include


• Photon processes, like Compton scattering, gamma conversion, photo-electric effect and
muon pair production
• Electron/positron processes, like ionization and delta rays, bremsstrahlung and positron
annihilation

• Muon processes, like ionization and delta rays, bremsstrahlung and e+e- pair production
• Hadron/ion processes, like ionization
• Multiple scattering
• Low energy electromagnetic processes, like Rayleigh scattering

Hadronic interactions include


• lepton-hadron interactions
• photonuclear and electronuclear reactions
• nucleus-nucleus reactions
• elastic scattering
• interactions of stopping particles
• nuclear cascades
• fission, evaporation, break-up models
• low energy neutron interactions
• radioactive decay

The optical effects include processes like Cerenkov radiation, scintillation, wavelength shifting
and absorption.
The probability of having an interaction within a path length L due to the physics process i is

pi (L) = 1 − exp(−L/λi ) (1)

where λi is the mean free path length.


Monte Carlo algorithm:
1. set properties for incident particle (momentum..)
2. get values for λi for all relevant processes
3. sample the free path length L from the distributions of all participating processes
4. select the smallest path length
5. move the particle by this step
6. simulate the interaction
7. if particle still exists, goto 1
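
A minimal sketch of steps 2–5 for two competing processes (not Geant4 code; the mean free
paths are assumed, illustrative values):

#include <cmath>
#include <cstdio>
#include <random>

int main() {
  std::mt19937 gen(42);
  std::uniform_real_distribution<double> flat(0.0, 1.0);

  const double lambda[2] = {5.0, 12.0};   // mean free paths in cm (assumed)
  double x = 0.0;                         // position along the track

  for (int step = 0; step < 5; ++step) {
    // sample L_i = -lambda_i * ln(u), u uniform in (0,1], for each process
    double Lmin = 1e30;
    int winner = -1;
    for (int i = 0; i < 2; ++i) {
      const double L = -lambda[i]*std::log(1.0 - flat(gen));
      if (L < Lmin) { Lmin = L; winner = i; }
    }
    x += Lmin;                            // move the particle by this step
    std::printf("step %d: process %d interacts after %.2f cm (x = %.2f)\n",
                step, winner, Lmin, x);
    // here the interaction of the winning process would be simulated
  }
  return 0;
}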

It is to be noted that:
• free path length sampling produces steps
• if a material boundary is hit, a new step is always created
• physics processes limit the step size according to the sampled free path length
• e.g. a zig-zag path due to elastic Coulomb scattering
• e.g. energy loss and possible creation of other particles
• a physics process is a microscopic description of the particle interaction with another
particle of the material or with the external field, according to quantum theory

The Detector Description must be as realistic as possible for the physics to be studied.
Particles tracked through the detector must see and feel a realistic environment in order to
deliver realistic simulation results.
The Detector Description needs lots of experience and expertise: one has to know which details
of the detector are essential and which can be safely ignored. Often the detector cannot be
modelled 1:1, e.g. the geometry must fit into the computer memory.

8.3 Usage of Geant4

Since the content of the program varies according to the needs of a given simulation, the
Geant4 toolkit does not provide the main program; it must be written by the user. Examples
of simple main programs are, however, provided. The main() method is implemented using
two toolkit classes:
• G4RunManager
• G4UImanager

The first thing main() must do is create an instance of the G4RunManager class. It controls
the flow of the program and manages the event loops within a run. The run manager is
responsible for initialization procedures, and it must be given all the information necessary
to build and run the simulation. This includes
• detector description
• physics processes
• how the primary particles are produced

• any additional requirements

Example N01:

G4RunManager* runManager = new G4RunManager;

runManager->SetUserInitialization(new ExN01DetectorConstruction);
runManager->SetUserInitialization(new ExN01PhysicsList);
runManager->SetUserAction(new ExN01PrimaryGeneratorAction);
runManager->Initialize();

G4UImanager* UI = G4UImanager::GetUIpointer();
UI->ApplyCommand("/run/verbose 1");
UI->ApplyCommand("/event/verbose 1");
UI->ApplyCommand("/tracking/verbose 1");

int numberOfEvents = 3;
runManager->BeamOn(numberOfEvents);
delete runManager;

The detector description specifies the detector geometry and the materials used, and defines
the sensitive regions and readout schemes.
The geometry defines the size and shape of the detector elements: boxes, tubes and their
sections, cones and their sections, spheres, wedges, and toruses. For example:

G4Tubs(const G4String& pName,
       G4double pRMin,
       G4double pRMax,
       G4double pDz,
       G4double pSPhi,
       G4double pDPhi)

G4Material* Al = new G4Material("Aluminum",
       z= 13., a= 26.98*g/mole, density= 2.7*g/cm3);

Every quantity, such as mass, energy or density, is represented by multiplying the numerical
value by the unit. For reading returned quantities, the numerical value should be divided
by the unit:

double speed = getSpeed()/(km/h);

The Logical Volume manages the information associated with detector elements represented
by a given Solid and Material, independently from its physical position in the detector.
Physical volumes represent the spatial positioning of the volumes describing the detector
elements. Several techniques can be used. They range from the simple placement of a single
copy to the repeated positioning using either a simple linear formula or a user specified
function.
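
For illustration, the full chain solid → logical volume → physical volume might look like the
following sketch (not taken from the Geant4 examples; the mother volume worldLog and the
aluminium material Al are assumed to be defined already, as above):

#include "G4Tubs.hh"
#include "G4LogicalVolume.hh"
#include "G4PVPlacement.hh"
#include "G4ThreeVector.hh"

// Solid: the quarter tube segment of Figure 2
G4Tubs* tubeSolid = new G4Tubs("Tube", 10.*cm, 15.*cm, 20.*cm, 0.*deg, 90.*deg);

// Logical volume: ties the solid to a material
G4LogicalVolume* tubeLog = new G4LogicalVolume(tubeSolid, Al, "TubeLV");

// Physical volume: one copy placed at the origin of the mother volume
new G4PVPlacement(0,                 // no rotation
                  G4ThreeVector(),   // translation (0,0,0)
                  tubeLog,           // the logical volume being placed
                  "TubePV",          // name
                  worldLog,          // mother logical volume (assumed)
                  false,             // no boolean operation
                  0);                // copy number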
SensitiveDetector represents the parts of the detector that will do the actual measurement.
Sensitivity is handled via logical volumes, and the principal mandate of a sensitive detector
is the construction of hit objects using information from steps along a particle track.
Readout geometry is a virtual, parallel geometry for obtaining the channel number; a
ReadoutGeometry object is associated with a sensitive detector.

Figure 2: pRMin = 10, pRMax = 15, pDz = 20 pSPhi = 0*Degree, pDPhi = 90*Degree

The primary particle production can be done with particle guns, or one can use an interface to
external event generators. For Geant4 it was decided not to link any FORTRAN code into the
program. Instead, Geant4 provides an ASCII file interface for such event generators.
In the example N01 the PhysicsList defines a geantino and assigns the transportation process
to it. The PrimaryGeneratorAction constructs a particle gun and generates events by shooting
geantinos with the gun. A geantino is a non-physical test particle which exercises the particle
transportation; it does not interact with material.
Geant4 is designed to transport long-lived particles over macroscopic distances. We have to
tell Geant4 the initial conditions of the particles we want to have simulated. Short-lived
particles like the Z will not be tracked.
The user has to instantiate the primary vertex. The G4UImanager is used for giving optional
UI (or GUI) commands via the member function ApplyCommand. The commands are listed
in the user guide section 7.1. The command

/run/verbose 1

is selected to display main topics, the command

/event/verbose 1

to give stacking information, and command

/tracking/verbose 1

gives minimum information of each Step. These all have default values 0, which give silent
operation.
Summary:
The various stages during a simulation run are
• Initialization
- detector description

- physics processes selection and configuration
• Run
- contains several Events under the same conditions

• Event
- generation of primary particles
- tracking of all particles

We do not go deeper into the usage of Geant4, as normally one does not use Geant directly
but via an auxiliary program where the geometry etc. is already defined, and the end user
only provides the primary event for the detector simulation to get a hit collection. Very
often experiments do not use Geant4 digitization but implement their own digitization
machineries, as in CMS.
For further reading, Geant4 provides extensive documentation like
• User’s guide for application developer (basic usage of Geant4)
• Physics reference manual
• Software reference manual

and most of all working examples for beginners, extended examples and examples for advanced
users.

8.4 Dataset creation for full simulation

Instead of using standalone GEANT simulations, the HEP experiments often create their
own applications and simulation frameworks. The full GEANT simulation is used, but via
a customized interface, different for each experiment as the code is written from scratch by
the physicists doing the experiment. Therefore an average user does not have to worry about
e.g. a correct detector geometry description, as the geometry is modelled by experts and
included in the simulation framework.
CMS is no exception; it has its own programs written for end users. The first full simulation
program, CMSIM, was written in Fortran, and it used the Fortran based GEANT3 to simulate the
particle-matter interactions. Later CMSIM was replaced with the OO software ORCA, OSCAR
and CMSSW, but some part of the Fortran code still lives on, as the event generators are written
in Fortran. In CMSSW the Fortran part is hidden by linking Fortran with C++, so that the OO
framework is used. We use CMS here as an example to get a general idea of how the full
simulation works in practice.

9 CMSSW
The dataset production for CMSSW simulations can be done in steps. The steps for a full
simulation study are
• Event generation
• Hits
• Digis
• High level object reconstruction
• Analysis

Event generation is done using the available Monte Carlo event generators. The most common
event generators are included in the program as libraries, and the user can select the event
generator and what kind of events are to be simulated. The output is a ROOT file containing
the data collections the user has selected.

9.1 Framework

The CMSSW framework implements a software model wherein there is one executable, called
cmsRun, and many plug-in modules which run algorithms. The same executable is used for
both detector and Monte Carlo data.
The CMSSW executable, cmsRun, is configured at run time by the user’s job-specific config-
uration file. This file tells cmsRun which data to use, which modules to run, which parameter
settings to use for each module, and in what order to run the modules. Required modules
are dynamically loaded at the beginning of the job.
The CMS Event Data Model (EDM) is centered around the concept of an Event as a C++
object container for all RAW and reconstructed data pertaining to a physics event. During
processing, data are passed from one module to the next via the Event, and are accessed
only through the Event. All objects in the Event may be individually or collectively stored
in ROOT files, and are thus directly browsable in ROOT.
The CMSSW code is contained in a CVS repository. The repository can be accessed by
setting the CVSROOT environment variable

setenv CVSROOT :pserver:[email protected]:/cvs_server/repositories/CMSSW

The code is browsable (WEBcvs) at


http://cmsdoc.cern.ch/cms/cpt/Software/html/General/
https://cms-cpt-software.web.cern.ch/cms-cpt-software/General/
where one can find also the CMSSW WorkBook and the reference manual.

9.2 Environment setup

The environment needed for CMSSW is set by

setenv SCRAM_ARCH slc5_ia32_gcc434


setenv VO_CMS_SW_DIR /home/opt/cms/cmssw
source $VO_CMS_SW_DIR/cmsset_default.csh

9.3 Dataset creation

An example of how to create a dataset is given in the WorkBook


https://twiki.cern.ch/twiki/bin/view/CMS/WorkBook, Chapter 6, "Event Generation and
Simulation".
First one needs to create a working area. This is done with an auxiliary program called
SCRAM, short for Source Configuration, Release, And Management tool. With the command
"scram list" you can see all the software and versions available via scram.

$ scram project CMSSW CMSSW_3_5_6


$ cd CMSSW_3_5_6/src

The source code is in CVS; check out a configuration file for the generation of "Higgs to ZZ to
4 leptons":

$ cvs login
(98passwd)
$ cvs co -r CMSSW_3_5_6 Configuration/Generator/python/H200ZZ4L_cfi.py

Define the environment:

$ cmsenv

The command cmsenv is an alias for eval `scram runtime -csh`, and it defines the environment,
for example for the system to find the CMSSW executables like cmsRun.
Next, using cmsDriver.py, create a configuration file for the generation of the events in which
the Higgs particle decays to two Z 0 bosons, which in turn decay to four leptons:

$ cmsDriver.py Configuration/Generator/python/H200ZZ4L_cfi.py -s \
GEN,SIM,DIGI,L1,DIGI2RAW,HLT --eventcontent RAWSIM --conditions \
auto:mc --datatier GEN-SIM-RAW -n 1 --no_exec

Then execute CMSSW using the resulting configuration file and produce the event file:

$ cmsRun H200ZZ4L_cfi_py_GEN_SIM_DIGI_L1_DIGI2RAW_HLT.py

The execution takes about 5 minutes (only one event!). The resulting file is
H200ZZ4L_cfi_py_GEN_SIM_DIGI_L1_DIGI2RAW_HLT.root. Then the configuration file for raw
data reconstruction is created by:

cmsDriver.py Configuration/Generator/python/H200ZZ4L_cfi.py -s \
RAW2DIGI,L1Reco,RECO --eventcontent RECOSIM --conditions auto:mc \
--datatier GEN-SIM-RECO --eventcontent RECOSIM -n 1 --no_exec

Edit the resulting H200ZZ4L_cfi_py_RAW2DIGI_L1Reco_RECO.py file and change the input
source to

fileNames =
cms.untracked.vstring(’file:H200ZZ4L_cfi_py_GEN_SIM_DIGI_L1_DIGI2RAW_HLT.root’)

Then the reconstruction is executed by:

$ cmsRun H200ZZ4L_cfi_py_RAW2DIGI_L1Reco_RECO.py

The execution takes about 1 minute for the single event. The resulting file is
H200ZZ4L_cfi_py_RAW2DIGI_L1Reco_RECO.root. One can visualize the event by cmsShow
as:

$ cmsShow H200ZZ4L_cfi_py_RAW2DIGI_L1Reco_RECO.root

Unfortunately this does not work yet on korundi. The picture was produced on another
system and it looks like this:

Figure 3: Higgs decaying to ZZ and each Z decaying to two leptons

This year (2010) the LHC is running at 7 TeV. Therefore we are interested in the simulation
of 7 TeV events. Let us take an example config file to produce events where the Higgs boson
decays to two τ leptons. One of the important SM Higgs production processes is
Vector Boson Fusion (VBF). The Higgs boson decays mainly to bb̄ pairs, but the b jet final
state is difficult to extract from the QCD background. The next most probable final state
is H → τ τ , which is experimentally easier. The PYTHIA manual gives the correct PYTHIA
switches for this kind of events in section 8.5, Higgs production: processes ISUB = 123 (ZZ
fusion) and ISUB = 124 (WW fusion). The Higgs decay final states can be listed by running
stand-alone PYTHIA with CALL PYLIST(12), as described in lecture 8. The SM Higgs KF code
is 25, and the decay table lists decay labels from 210 to 226 (288 with non-SM decays), with
the di-tau final state having label 220. So we set all other decay modes to zero (Fortran array
MDME, see below) except for τ + τ − .
So now we check out the sample configuration file for Higgs to τ + τ − from the repository (you
are in the directory CMSSW_3_5_6/src):

$ cvs co Configuration/Generator/python/QQH1352T_Tauola_7TeV_cfi.py

The file you get looks like as follows:

import FWCore.ParameterSet.Config as cms

from Configuration.Generator.PythiaUESettings_cfi import *


from GeneratorInterface.ExternalDecays.TauolaSettings_cff import *
generator = cms.EDFilter("Pythia6GeneratorFilter",
pythiaPylistVerbosity = cms.untracked.int32(1),
# put here the efficiency of your filter (1. if no filter)
filterEfficiency = cms.untracked.double(1.0),
comEnergy = cms.double(7000.0),
pythiaHepMCVerbosity = cms.untracked.bool(False),
ExternalDecays = cms.PSet(
Tauola = cms.untracked.PSet(
TauolaPolar,
TauolaDefaultInputCards
),
parameterSets = cms.vstring(’Tauola’)
),
# put here the cross section of your process (in pb)
crossSection = cms.untracked.double(0.388),
maxEventsToPrint = cms.untracked.int32(3),
PythiaParameters = cms.PSet(
pythiaUESettingsBlock,
processParameters = cms.vstring(’PMAS(25,1)=135.0 !mass of Higgs’,
’MSEL=0 !user selection for process’,
’MSUB(123)=1 !ZZ fusion to H’,
’MSUB(124)=1 !WW fusion to H’,
’MDME(210,1)=0 !Higgs decay into dd’,
’MDME(211,1)=0 !Higgs decay into uu’,
’MDME(212,1)=0 !Higgs decay into ss’,
’MDME(213,1)=0 !Higgs decay into cc’,
’MDME(214,1)=0 !Higgs decay into bb’,
’MDME(215,1)=0 !Higgs decay into tt’,
’MDME(216,1)=0 !Higgs decay into’,
’MDME(217,1)=0 !Higgs decay into Higgs decay’,
’MDME(218,1)=0 !Higgs decay into e nu e’,
’MDME(219,1)=0 !Higgs decay into mu nu mu’,
’MDME(220,1)=1 !Higgs decay into tau tau’,
’MDME(221,1)=0 !Higgs decay into Higgs decay’,
’MDME(222,1)=0 !Higgs decay into g g’,
’MDME(223,1)=0 !Higgs decay into gam gam’,
’MDME(224,1)=0 !Higgs decay into gam Z’,
’MDME(225,1)=0 !Higgs decay into Z Z’,
’MDME(226,1)=0 !Higgs decay into W W’),
parameterSets = cms.vstring(’pythiaUESettings’,
’processParameters’)

)
)

The procedure is the same as in section 9.3 above. First, to produce the configuration file for
event generation, detector simulation, and raw data digitization, and to execute it, do:

$ cmsDriver.py Configuration/Generator/python/QQH1352T_Tauola_7TeV_cfi.py -s \
GEN,SIM,DIGI,L1,DIGI2RAW,HLT --eventcontent RAWSIM --conditions \
auto:mc --datatier GEN-SIM-RAW -n 1 --no_exec

$ cmsRun QQH1352T_Tauola_7TeV_cfi_py_GEN_SIM_DIGI_L1_DIGI2RAW_HLT.py

Then, when the job has finished after a few minutes, to produce the config file for reconstruction
('RECO'), do:

$ cmsDriver.py Configuration/Generator/python/QQH1352T_Tauola_7TeV_cfi.py -s \
RAW2DIGI,L1Reco,RECO --eventcontent RECOSIM --conditions auto:mc \
--datatier GEN-SIM-RECO --eventcontent RECOSIM -n 1 --no_exec

and you get a configuration file named as:


QQH1352T_Tauola_7TeV_cfi_py_RAW2DIGI_L1Reco_RECO.py
As above, edit this file and change the input file name to:
file:QQH1352T_Tauola_7TeV_cfi_py_GEN_SIM_DIGI_L1_DIGI2RAW_HLT.root
and execute again as:
$ cmsRun QQH1352T_Tauola_7TeV_cfi_py_RAW2DIGI_L1Reco_RECO.py
and you get the ROOT output file:
QQH1352T_Tauola_7TeV_cfi_py_RAW2DIGI_L1Reco_RECO.root

There are a number of sample driver files which you can get by checking out the whole python
directory (in the directory CMSSW_3_5_6/src):
$ cvs co Configuration/Generator/python
All examples can be executed in a manner similar to those described above. In summary the
steps are roughly as follows:
1. setting up the work area (if not yet done) and environment
2. cmsDriver.py driverFile.py → simuConfiguration.py
3. cmsRun simuConfiguration.py → simulatedEvents.root
4. cmsDriver.py driverFile.py → recoConfiguration.py
5. edit recoConfiguration.py to have file:simulatedEvents.root as input
6. cmsRun recoConfiguration.py → reconstructedEvents.root

Fast simulation - FAMOS

A full simulation by Geant4 often takes a large amount of time. In CMSSW there is an
option for fast simulation, which is much faster (a factor of 10 or so) but not always as precise.
For many applications it is, however, precise enough. In order to obtain the FAMOS option
one has to check out an extra settings file (you must be in the directory CMSSW_3_5_6/src):
$ cvs co Configuration/Generator/python/PythiaUESettings_cfi.py
Furthermore, the reconstruction step need not be executed separately: the simulation
and reconstruction can go in one step by creating the configuration file as follows for 100
events:

$ cmsDriver.py Configuration/Generator/python/ZEE_cfi.py \
-s GEN,FASTSIM --eventcontent RECOSIM \
--conditions auto:mc --datatier GEN-SIM-RECO -n 100 --no_exec

Then the cmsRun program will do both the (fast) simulation and the reconstruction. Here we
have taken the sample file ZEE_cfi.py discussed below.

9.4 Analysis example

For CMSSW analysis one needs data and an analysis program. Let us produce a dataset
containing di-electron events. We choose events where intermediate vector bosons Z 0 are
produced and then decay as Z 0 →e+ e− . So first we have to produce the data file. We take
the configuration file
Configuration/Generator/python/ZEE_cfi.py
Follow the procedures discussed above. Make the production for example with 5 events. You
define the number of events to be processed in the configuration file for generation which you
created by cmsDriver.py.
As a result of the chain of jobs, we get the file ZEE_cfi_py_RAW2DIGI_L1Reco_RECO.root
which is the input for our analysis. Next we need to write the analysis program (module).
Since our data consists of electrons, we start with the example code GsfElectronMCAnalyzer.cc
which is found in RecoEgamma/Examples, so let us check that out from the CVS repository (you
should be in the directory CMSSW_3_5_6/src) and compile:

$ cvs co -r CMSSW_3_5_6 RecoEgamma/Examples


$ cd RecoEgamma/Examples/
$ scram b

where b stands for build. First let us try to run the analyzer without modifications.

$ cd test

Try grepping to find the config file for the module name GsfElectronMCAnalyzer. The file is
GsfElectronMCAnalyzer_cfg.py. Edit the file to contain the correct input file, and choose
the number of events = 5:

process.maxEvents = cms.untracked.PSet(
input = cms.untracked.int32(5)
)

process.source = cms.Source ("PoolSource",


fileNames = cms.untracked.vstring("file:ZEE_cfi_py_RAW2DIGI_L1Reco_RECO.root")
)

Next set the environment

$ eval `scram runtime -csh`

Notice that the character ` here is the grave accent (you could also use the cmsenv command).
Move the ROOT file which you produced above to the test directory, and run the program:

$ cmsRun GsfElectronMCAnalyzer_cfg.py

Let us next modify the code to print the electron pt, eta and phi. Edit the member function
GsfElectronMCAnalyzer::analyze in the file ../plugins/GsfElectronMCAnalyzer.cc.
One can see there a loop over electrons/positrons, where the print should be added.
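A minimal version of the print might look like the following sketch (the collection and
iterator names are illustrative; use the ones actually present in the analyzer):

for (reco::GsfElectronCollection::const_iterator el = gsfElectrons->begin();
     el != gsfElectrons->end(); ++el) {
  std::cout << "electron pt = " << el->pt()
            << ", eta = " << el->eta()
            << ", phi = " << el->phi() << std::endl;
}

Recompile and rerun: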

$ scram b
$ cmsRun GsfElectronMCAnalyzer_cfg.py

How to access other reco::GsfElectron member functions can be found by browsing the
CMSSW reference manual.

10 Reconstruction

10.1 Event reconstruction

Reconstruction is the operation of constructing physics quantities from the raw data collected
in the experiment. As a software process, reconstruction is therefore the procedure of data
reduction whose main client is data analysis.
The reconstruction process is seen as a collection of independent units, each one providing a
set of corresponding objects as output. The reconstruction process can be divided into three
steps, corresponding to local reconstruction within an individual detector module, global
reconstruction within a whole detector, and combination of these to produce higher level
objects.
The reconstruction units providing local reconstruction in a detector module use as input real
data from the DAQ system or simulated data representing the real data. These data are in
either case called digis. The output from the reconstruction units are RecHits, reconstructed
hits which are typically position measurements in tracking type detectors and calorimetric
clusters in calorimeter systems. The RecHits are used as the input for global reconstruction.
In the global reconstruction step information from the different modules of a subdetector
are combined, although information from different subdetectors is not. For example, tracker
RecHits are used to produce reconstructed charged particle tracks and muon system RecHits
are used to produce candidate muon tracks.
The final reconstruction step combines reconstructed objects from individual subdetectors
to produce higher level reconstructed objects suitable for high-level triggering or for physics
analysis. For example, tracks in the tracker system and tracks in the muon system are
combined to provide final muon candidates, and electron candidates from the calorimeter
system are matched to tracks in the tracker system.

10.2 Local reconstruction

Local reconstruction leads to RecHits which contain information about the energy deposition
and positions.
In the tracker detectors local reconstruction algorithms search for strips and pixels with a
signal exceeding a threshold, and use them as seeds for clusters. Clusters are built by adding
neighboring strips/pixels.
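
To illustrate the idea, a toy one-dimensional version of such clustering might look as follows
(a sketch, not the CMS algorithm; thresholds and input are arbitrary):

#include <utility>
#include <vector>

// Toy 1D strip clustering: a strip above seedThr starts a cluster, and
// neighbouring strips above addThr are attached to it.
std::vector<std::pair<int,int> >
findClusters(const std::vector<double>& adc, double seedThr, double addThr)
{
  std::vector<std::pair<int,int> > clusters;  // (first strip, last strip)
  const int n = static_cast<int>(adc.size());
  for (int i = 0; i < n; ++i) {
    if (adc[i] < seedThr) continue;           // not a seed
    int lo = i, hi = i;
    while (lo > 0   && adc[lo-1] > addThr) --lo;  // extend to the left
    while (hi < n-1 && adc[hi+1] > addThr) ++hi;  // extend to the right
    clusters.push_back(std::make_pair(lo, hi));
    i = hi;                                   // continue after this cluster
  }
  return clusters;
}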
In the muon drift chambers, local reconstruction provides the position of a muon hit in a
drift cell, determined from the drift time measurement and the effective drift velocity. Three-
dimensional track segments within a superlayer are built from hits in each component layer.
In the muon cathode strip chambers, local reconstruction provides position and time of arrival
of a muon hit from the distribution of charge induced on the cathode strips. Two dimensional
hits are obtained in each layer, and these can be combined to create three-dimensional track
segments within each chamber.
In the muon resistive plate chambers, local reconstruction gives the position of a muon hit
from the position of clusters of hit strips.
In the electromagnetic calorimeter (ECAL), local reconstruction identifies the position, time
of arrival, and energy of localized electromagnetic energy depositions.

In the hadron calorimeter (HCAL), local reconstruction identifies the position, time of arrival,
and energy of localized hadronic energy depositions.

10.3 Global reconstruction

The global reconstruction algorithms use the objects created in the local reconstruction within
a single detector module, combining them with objects arising from other modules of the
same subdetector to produce further objects which represent the best measurement from
that subdetector.

• Reconstruction in the tracker system: high/low pT tracks, displaced vertices etc.


• Reconstruction in the calorimeter system: ECAL+HCAL tower linking to be used as a
basis for jet reconstruction
• Reconstruction in the muon system: reconstruction of muons without inner tracker
information in an inhomogeneous and nonuniform magnetic field.

Track reconstruction in a dense environment needs an efficient search for hits during
the pattern recognition stage and a fast propagation of trajectory candidates. The track
reconstruction is decomposed into five logical parts:

• hit reconstruction, which in turn consists of clustering of strips or pixels and estimating
a position and its uncertainty (”local reconstruction”)
• seed generation providing the initial trajectory candidates for the full track reconstruc-
tion
• pattern recognition or trajectory building
• ambiguity resolution
• final track fit

Ambiguities in track finding arise because a given track may be reconstructed starting from
different seeds, or because a given seed may result in more than one trajectory candidate.
Vertex reconstruction (interaction points) usually involves two steps, vertex finding and
vertex fitting. Vertex finding involves grouping of tracks into vertex candidates. The vertex
finding algorithms can be very different depending on the physics case (primary or secondary
vertex, reconstruction of exclusive decays, etc.). Vertex fitting involves determining the best
estimate of the vertex parameters (position, covariance matrix, track parameters constrained
by the vertex position and their covariances) for a given set of tracks, as well as indicators of
the fit quality.

10.4 Combined reconstruction

The final stage of reconstruction combines input objects created in the global reconstruction
within each subdetector, creating objects based on the complete detector.
Photon and electron identification. The global selection of electrons and photons pro-
ceeds in three steps. The first step uses the calorimeter information only. The second step
requires hits in the pixel detectors, consistent with an electron candidate. The success of
matching ECAL supercluster to hits in the pixel detector flags the candidate as an electron,
otherwise the candidate is flagged as a photon. In the final step, the selection of electrons
uses full track reconstruction, seeded from the pixel hits obtained by the matching step. The
selection of photons can instead use isolation cuts and rejection of π 0 ’s based on lateral shower
shape and the reconstruction of converted photons.
Muon identification. Global muon identification starts from the stand-alone muon (reconstructed
from the hits in the muon system alone), adding associated silicon tracker hits and
performing a final fit to the track. Isolation criteria can be applied to the muon candidates to
provide additional rejection against nonprompt muons from b, c and K decays.
Jet reconstruction. Jet reconstruction aims to reconstruct and identify jets arising from
the hadronization of a scattered parton, in order to reconstruct its direction and energy. Many
reconstruction algorithms exist in the literature, varying in speed, efficiency and resolution.
Most algorithms use a clustering technique, in which calorimetric towers close in (η, ϕ) to a
high ET tower are summed together, subject to some constraints. For example in the cone
algorithm, a seed tower is selected and then all objects sufficiently close in (η, ϕ) are used to
form a proto-jet. The process of association is iterated until the parameters of the proto-jet
have stabilized. The procedure is repeated with the remaining unassociated towers, until no
seeding tower with sufficiently high ET remains.
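
A toy version of a single iterative-cone step might look as follows (a sketch, not the CMS
implementation; splitting/merging and the φ wrap-around are ignored):

#include <cmath>
#include <cstddef>
#include <vector>

struct Tower { double et, eta, phi; };

// Sum towers within radius R of the axis in (eta,phi) and recompute the
// ET-weighted axis until it is stable.
Tower protoJet(const std::vector<Tower>& towers, Tower axis, double R)
{
  for (int iter = 0; iter < 100; ++iter) {
    Tower next = {0., 0., 0.};
    for (std::size_t i = 0; i < towers.size(); ++i) {
      const double dEta = towers[i].eta - axis.eta;
      const double dPhi = towers[i].phi - axis.phi;
      if (std::sqrt(dEta*dEta + dPhi*dPhi) < R) {
        next.et  += towers[i].et;                  // associate the tower
        next.eta += towers[i].et * towers[i].eta;  // ET-weighted position
        next.phi += towers[i].et * towers[i].phi;
      }
    }
    if (next.et == 0.) break;                      // empty cone
    next.eta /= next.et;
    next.phi /= next.et;
    const bool stable = std::fabs(next.eta - axis.eta)
                      + std::fabs(next.phi - axis.phi) < 1e-4;
    axis = next;
    if (stable) break;                             // parameters have stabilized
  }
  return axis;
}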
MET reconstruction (missing ET ). Many discovery channels at the LHC have a large
missing transverse energy as a clear signature for new physics. Such events typically contain
large-pT neutrinos or possibly even the lightest supersymmetric particles (if supersymmetry
exists).
Missing transverse energy is typically reconstructed by summing the energy depositions times
sin θ over the whole calorimeter system, and taking the transverse momentum carried by the
muons into account.
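Schematically (a common convention; the exact corrections are experiment specific) the
missing transverse energy vector is

ETmiss = − ( Σi Ei sin θi n̂T,i + Σµ pT,µ )

where the first sum runs over the calorimeter towers, n̂T,i being the unit vector in the
transverse plane pointing to tower i, and the second sum runs over the reconstructed muons.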

10.5 b and τ tagging

Many physics channels produce b jets and τ jets in the final state. These need to be distin-
guished from more copious backgrounds containing only light flavoured jets. The top quark,
for example, decays almost exclusively into a W boson and a b quark.
Inclusive tagging of b jets, as opposed to lighter flavour jets, mainly relies upon relatively
distinct properties of b-hadrons such as proper lifetime (lifetime at rest), large mass, decays
to final states with high charged track multiplicity, relatively large semileptonic branching
ratios and a hard fragmentation function.
The b-tagging algorithms rely on the reconstruction of lower level physics objects. At LHC b
tagging typically will be applied to jets. Most of the b hadron properties used for b tagging
are exploited using charged particle tracks. To increase the signal-to-noise ratio, only tracks
fulfilling certain quality criteria are selected.
For b tagging the track measurement precision close to the interaction point is most relevant.
The simplest b tagging algorithm counts high quality tracks in a cone around the jet axis
with high enough impact parameter significance. More sophisticated algorithms are described
in the literature.
Tau identification is based on τ jet properties such as lifetime, mass, small charged track
multiplicity, collimation and isolation of τ decay products. Tau jet identification is important,
since τ lepton decays hadronically about 65% of the time producing a τ jet. In general, the
primary requirement for τ jet identification is the isolation of a collimated group of charged
particles in the tracker.

10.6 Trigger

Trigger is the start of the physics event selection process. For the nominal LHC design
luminosity of 10³⁴ cm⁻²s⁻¹, an average of 17 events occurs at each beam crossing, the bunch
crossing interval being 25 ns. This input rate of 10⁹ interactions every second must be reduced
by a factor of at least 10⁷ to 100 Hz, the maximum rate that can be absorbed by the on-line
computer farm.
The CMS Level-1 (L1) trigger system is based on custom electronics. The High Level Trigger
(HLT) system relies on software.
The CMS L1 trigger is based on the presence of local objects such as muons, electrons,
photons and jets, and it employs global sums of ET and MET. The requirements on the L1
trigger are chosen to provide a high efficiency for the hard scattering physics to be studied
at the LHC. The L1 trigger operates without the benefit of full detector granularity and
full-resolution data.
The HLT is used to provide the selection of 1:1000 from the maximum average Level-1 output
rate of 10⁵ Hz to the final storage rate of 100 Hz. The HLT has a total processing time of up
to ∼1 s, during which time interval the data are stored in memory.
To minimize the CPU requirement by the HLT, a key feature of the algorithms is to recon-
struct the information in the CMS detector only partially. The sets of thresholds for each
trigger object are shown in Table 1.
The trigger algorithms can be simulated and their efficiencies studied by simulation. For a
realistic physics analysis, the trigger simulation needs to be included in the event selection.

Table 1: CMS trigger thresholds

Trigger Threshold (GeV)


Inclusive electron 29
Di-electrons 17
Inclusive photons 80
Di-photons 40,25
Inclusive muon 19
Di-muons 7
Inclusive τ -jets 86
Di-τ -jets 59
1-jet + MET 180 + 123
1-jet OR 3-jets OR 4-jets 657,247,113
Electron+τ -jet 19 + 45
Inclusive b-jets 237

10.7 Event reconstruction software in CMS

Although the used reconstruction algorithms and methods are general and used widely in
HEP, the reconstruction software is highly experiment dependent. The reconstruction soft-
ware is written by the physicists for their own use. The analysis software is even more
specific: as the reconstruction is done the same way within an experiment, the analysis using
the reconstructed objects varies from physics channel to physics channel.
Here we do not go very deeply into the CMS reconstruction software, but describe only the
basic principles of how to reconstruct the physics objects for your own analysis.
The first place to start is the documentation. Although it may be in many cases almost
non-existent, it is an obvious place to start.
Use tutorials and examples as working examples from which you can take the relevant part
or enlarge to fulfill your needs.
Go to the source code and look for validation code and examples available. This is often a
very fruitful approach.
And most of all, ask help from the experts responsible for writing/maintaining the software
in your experiment, or from other users who may have already solved your problem and who
may have code available which you can easily adopt to your study.
How to browse the source code in CMS? The first way is using the CVS browser. The source
can also be accessed directly via scram,

scram list

Here you see the paths to the source code, go to the source code directory and start digging
with ls, grep, find | xargs grep etc. If you find something interesting, take the code from CVS,
and test it. Sometimes it does not work, and in such a case you did not find a *working*
example, and you can keep searching . . .
Good places to start looking at the CMS reconstruction code are e.g. HiggsAnalysis (in
CVS), containing many different analyses. Sami Lehti/HIP is responsible for the
HiggsAnalysis/HeavyChHiggsToTauNu code, which is written in such a way that it should be
easily adopted for any physics channel.

One can also start from the WorkBook:
https://twiki.cern.ch/twiki/bin/view/CMS

->CMS Offline WorkBook

11 Fast simulation

11.1 Introduction

Many simulation tasks, such as extracting rare signal events from the expected background,
require a generation of millions of background events, of which only a handful survives after
a proper selection. If a detailed detector simulation is too time consuming to be applied, but
reasonable estimates for the detector and trigger response are still needed, the solution
can be found by parametrizing the detector response. The parametrized detector response
simulation is called fast simulation.
Parametrizing and smearing provide an approximate but very fast method to simulate the
detector response. It is a useful tool for quick first inspection of properties of a given physics
channel with large statistics.

11.2 Single particle performance

The key parameters for tracker resolution for single particles are
• transverse momentum pT
• azimuthal angle φ0
• polar angle θ
• transverse impact parameter d0
• longitudinal impact parameter zip

The transverse impact parameter is defined as the distance of closest approach to the
reference point (primary vertex) in the transverse plane.
The resolution is studied with full simulation as a function of track pT and η (= − ln tan(θ/2)).
The resolutions are fitted as a function of η for each value of pT using polynomial functions
up to power five, which is needed to describe the most irregularly shaped η-dependencies.
These functions are used to give the resolution for fixed values of η in order to have points
of resolution as a function of pT .
An approximate parametrization of the pT dependence of the impact parameter and angular
resolution is expressed as (NB notation: a ⊕ b = √(a² + b²))

σip,ang(pT) = (a1 ⊕ a2/pT) + a3

Likewise, a first approximation of the pT resolution dependence on pT is of the form

σpT (pT ) = b1 ⊕ b2 pT

where the first term corresponds to the resolution due to the finite detector precision and the
second term the uncertainty due to multiple scattering.
In order to improve the fits, one may use a "covariance" or cross term, e.g. for the impact
parameter

σd0(pT) = √(a1² + a4/pT + a2²/pT²) + a3

In these functions the coefficients are calculated for discrete values of ηj . The resolution for
arbitrary η is found by linear interpolation (extrapolation) from the two points σ(ηj ) with ηj
closest to η, and is then used in smearing the impact parameter.
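
A sketch of how such a parametrization can be used for smearing (the coefficients are
placeholders, not fitted CMS values):

#include <cmath>
#include <random>

// sigma_pT(pT) = b1 (+) b2*pT, with (+) denoting addition in quadrature,
// following the form quoted in the text.
double sigmaPt(double pt) {
  const double b1 = 0.7;     // finite detector precision term, GeV (assumed)
  const double b2 = 0.015;   // multiple scattering term (assumed)
  return std::sqrt(b1*b1 + (b2*pt)*(b2*pt));
}

// Smear a generated pT with a gaussian of width sigma_pT(pT).
double smearPt(double pt, std::mt19937& gen) {
  std::normal_distribution<double> gauss(pt, sigmaPt(pt));
  return gauss(gen);
}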

11.3 Particle showers

Calorimeters are described in terms of characteristic parameters such as energy resolution, po-
sition resolution, shower separation, dynamical range and spatial coverage. A calorimeter
can be homogeneous (like a crystal calorimeter) or a sampling calorimeter (a sandwich
structure with active and inactive layers).
The energy resolution of calorimeters is usually expressed in terms of ∆E/E, ∆E being the
overall statistical error including all error sources. At high energies, statistical contributions
dominate. The relative resolution improves with increasing E. The main tendency for ∆E/E
is to follow the form k/√E, where k depends on the calorimeter and readout characteristics.
A general formula to get an approximate resolution is

∆E/E = √(A² + B²)    (2)

Figure 4: CMSSW (solid), and parameterized and smeared (dashed) ip.

where A is related to photon statistics in the photomultipliers, and B deals with the sampling
fluctuations.
Containment describes the fraction of energy measured in the calorimeter. Energy not con-
tained in the calorimeter may result in severe biases or simply be detrimental to the energy
resolution.
A detailed simulation of electromagnetic showers is computationally intensive. A parameterization
of the spatial energy distribution of an electromagnetic shower, based on probability
density functions, allows speeding up the simulation without compromising the simulation
accuracy. The parameterized simulation is tuned to give the same results as the full detector
response simulation.

11.4 CMS Fast Simulation

CMS Fast Simulation is a CMSSW-integrated tool to simulate and reconstruct events with
the CMS detector, in view of doing physics analysis without being penalized by either CPU
time or disk space considerations, while still benefiting from an accurate simulation of the
detector effects. The input of Fast Simulation is a list of particles from an event generator,
propagated in the magnetic field to different layers of the various subdetectors, with which
the particles may interact. The quasi-stable particles are allowed to decay.
The interactions simulated with Fast Simulation are
• electron bremsstrahlung
• photon conversion
• charged particle energy loss by ionization
• multiple scattering
• electron, photon and hadron showering

The first four are applied to particles traversing the thin layers of the tracker, while the last one
is parameterized in the electromagnetic and hadron calorimeters.

The (current) muon simulation is mostly based on parameterization of resolutions and effi-
ciencies.
As output, Fast Simulation delivers a series of "high-level objects", such as reconstructed hits
for charged particles in the tracker layers and energy deposits in calorimeter cells, which can
be used as inputs for the reconstruction algorithms.

11.5 Tracker response

The computing time needed to simulate an event in Fast Simulation is about 3 orders of
magnitude smaller than that needed in the full simulation.
Fast Simulation uses a simplified tracker geometry, adequate for the required level of accu-
racy. The thickness of the active layers is tuned to reproduce the fully simulated number of
bremsstrahlung photons above a certain energy threshold. After tuning, the total number of
radiation lengths traversed in the tracker is in agreement with the full geometry.
While propagating in the magnetic field through tracker layers, charged particles experience
multiple scattering and energy loss by ionization. The intersections between the modified
trajectories and each tracker layer define the position of ”simulated hits”. Each simulated hit
is turned with a certain efficiency to a ”reconstructed hit”. The position is obtained from a
gaussian smearing of the simulated hit position. The gaussian resolution used in the smearing
process is obtained from the full simulation for each silicon strip and pixel layer. The detailed
procedure was developed in view of reproducing the b-tagging performance with requested
level of accuracy.

11.6 Calorimeter response to e/γ

The showers of electrons and photons which impinge on the ECAL are simulated in 2 steps.
In the first step, the shower is developed following a parameterization as if ECAL were a
homogeneous medium. This approximation is realistic since the CMS ECAL is made of
contiguous crystals. The deposited energy is integrated over longitudinal slices, including
uncertainties due to the limited photo-statistics and the longitudinal non-uniformity of the
crystals.
In the second step, the energy spots in each slice are distributed in space and placed into
the actual crystal geometry. The time used in this step is kept to a reasonable value by
limiting the 2D spot-crystal assignment to a small 7x7 crystal grid in a plane perpendicular
to the shower longitudinal component. The energy collection is then simulated taking into
account energy leakage, losses due to gaps in the geometry, and shower enlargement due to
the magnetic field.
When all electrons and photons are processed, the electronic noise is simulated, zero
suppression is applied, and a list of reconstructed hits is built. The fast and full simulations
agree at the per mil level in the barrel region, and at the per cent level in the endcaps,
for energies ranging from 1 GeV to 1 TeV.

11.7 Calorimeter response to hadrons

Charged and neutral hadrons are propagated to the ECAL, HCAL and forward calorimeter
entrance after their interactions with the tracker layers. Their energy response is derived
from the full simulation. Single charged pions are fully simulated for pT values of 2, 5, 10,
30, 50, 100 and 300 GeV, uniformly distributed in pseudorapidity between -5 and 5. The
reconstructed energy is collected in 5x5 HCAL tower matrices and in the corresponding 25x25
ECAL crystals. The energy distributions are then sliced into η bins. These distributions are
fitted to a gaussian, the mean value and the sigma of which are tabulated as a function of
energy and pseudorapidity; these are used in turn to smear the hadron energy response in the
fast simulation. Linear interpolation and extrapolation are used.
Other hadrons are simulated as if they were pions with the same transverse energy. The
agreement between the fast and full simulations is satisfactory.

11.8 Response to muons

The response of the muon chambers is simply parameterized to reproduce the efficiencies and
resolutions of the full simulation. At the moment, the detailed particle propagation stops
at the entrance of the calorimeters. The muon interactions with calorimeters are simulated
much as the pion interactions. The fast muon simulation results agree within 1.5% with the
full simulation results.

11.9 Usage

Fast Simulation source code is available in CVS as part of the CMSSW code. The documen-
tation can be found under the link
https://cms-cpt-software.web.cern.ch/cms-cpt-software/General/
-> Workbook
-> 6.5 Simulating and Reconstructing events with Fast Simulation
Unfortunately the above link needs a password. There is a copy of those pages in
http://www.helsinki.fi/~karimaki/CMHEP/CMSSW/swproject.html
Start Fast Simulation by creating the working area. First the environment needs to be set to
get access to scram (see sections 9.1 and 9.2). We can take the latest version available on
korundi, CMSSW_3_6_0 (in April 2010), and do:

scram p CMSSW CMSSW_3_6_0


cd CMSSW_3_6_0/src

Next one needs to log in to CVS and take code from there. You can take the example
FastSimulation/Configuration from CVS and use that as a starting point. One can also use
the program addpkg as in the example below. In the example we produce TTbar Tauola events
with the following recipe:

cmsenv (eval `scram runtime -csh`)


addpkg FastSimulation/Configuration
cmsDriver.py TTbar_Tauola_cfi.py -s GEN,FASTSIM --pileup NoPileUp \
--conditions auto:mc --eventcontent RECOSIM \
--beamspot=Early10TeVCollision --datatier GEN-SIM-RECO -n 10 --no_exec
cmsRun TTbar_Tauola_cfi_py_GEN_FASTSIM.py

In Fast Simulation the event generator input can be generated and fed to the analysis on
the fly or one can store the events in ROOT files for later analysis with a separate analysis
program.

In the test directory there are several example configuration files which can readily be given
as input to cmsRun. For example:

$ cmsRun FastSimulation/Configuration/test/Example_cfg.py

The example config files in FastSimulation/Configuration/test can be customized for the
user's needs by making some modifications. One can change, for example, the number of
events, the average number of pile-up events, simulate (or not) hits in the tracking system
and calorimetry, etc.:

...
process.maxEvents = cms.untracked.PSet( input = cms.untracked.int32(100) )
...
# If you want to turn on/off pile-up
process.famosPileUp.PileUpSimulator.averageNumber = 5.0
# You may not want to simulate everything for your study
process.famosSimHits.SimulateCalorimetry = True
process.famosSimHits.SimulateTracking = True
# process.famosSimHits.SimulateMuons = False
...

If one wants to change the name of the output ROOT file, an output module and a path
must be defined.

...
process.o1 = cms.OutputModule(
"PoolOutputModule",
fileName = cms.untracked.string("MySecondFamosFile_2.root"),
outputCommands = cms.untracked.vstring("keep *",
"drop *_mix_*_*")
)

For the event generation one can use the existing examples in the
FastSimulation/Configuration/python directory as a starting point.
Remember that fast simulation is not a full simulation: it gives more approximate results,
and it cannot always be used. Sometimes it is, however, the only practical option, and often a
very good starting point for a "first look" at a new physics channel.

12 Error estimation
The notes in this section are a sub-selection from the document "Selected Topics on Data
Analysis in Particle Physics", http://www.helsinki.fi/~karimaki/CMHEP/Lectures/statLectures.pdf

12.1 Propagation of measurement errors

Error propagation is needed when quantities to be studied must be calculated from directly
measurable quantities with given covariance matrix. Let us denote a set of measurements as
follows:

• Set of measurements: x = (x1 , x2 , . . . , xn )

• Covariance matrix: Vx , with vij = cov(xi , xj )

The components xi represent individual measurements which may be correlated. In case of
uncorrelated xi the covariance matrix is diagonal. Furthermore, let us suppose that we are
interested in a set of transformed variables which are derived from the measured variables x
by some transformation formula:

x′ = x′(x)    (3)

or:

x′ = (x′1, x′2, . . . , x′m),   m ≤ n    (4)

where the components x′i are functions of x.
For example: the measured variables could be the Cartesian coordinates x = (x, y, z) and
the transformed variables could be the spherical coordinates x′ = (r, ϕ, θ).
Now the question is: what is the covariance matrix Vx′ of the new variables x′? This is the
problem of error propagation.
The answer is derived by recalling first the definition of the covariance matrix:
(Vx)ij ≡ cov(xi, xj) ≡ cov(x)ij ≡ σij ≡ vij = ∫ (xi − ⟨xi⟩)(xj − ⟨xj⟩) f(x) dx    (5)

where f(x) is the probability density and where we have listed various notations for the
covariance. From the definition it readily follows that

cov(a xi, b xj) = ab cov(xi, xj)    (6)

or more generally:

cov(Σi ai xi, Σj bj xj) = Σi,j ai bj cov(xi, xj)    (7)

12.2 Error propagation in case of linear transformation

We first consider linear transformations x → x′. A linear transformation can be written in
matrix notation:

x′ = J x    (8)

where J is an m × n matrix independent of x. According to the definition (5) we can derive
the expression for the covariance matrix element i, j:

(Vx′)ij = cov(Σk Jik xk, Σl Jjl xl) = Σk,l Jik Jjl cov(xk, xl) = (J Vx Jᵀ)ij

where we have used equation (7). The above result implies the following transformation law
for the covariance matrix:

Vx′ = J Vx Jᵀ    (9)

This is the error propagation formula in the case of a linear transformation of variables x′ = Jx.

Examples
Example 1: Error estimate of the sum s = Σi xi of n measured quantities. Error ∆s = ?
In matrix notation we write s = Jx, where J = (1 1 · · · 1), and it follows that

(∆s)² ≡ σs² = (1 1 · · · 1) Vx (1 1 · · · 1)ᵀ = Σi σii + 2 Σi>j σij    (10)

where we use the fact that the covariance matrix is symmetric, and the notation (∆xi)² = σii.
If Vx is diagonal (uncorrelated measurements xi), we get:

σs² = Σi σi²    (11)

where our notation is: σii = σi².

Example 2: Error estimate of the weighted mean

    µ = Σ_i w_i x_i / Σ_i w_i

of N quantities. Error estimate σ_µ = ?


Here we assume that the x_i are independent of each other so that cov(x_i, x_j) = 0 for i ≠ j and

    V_x = diag(σ_1², σ_2², . . . , σ_N²)    (12)

where all the off-diagonal elements are zero. Now the J matrix is 1 by N:

    J = (Σ_i w_i)⁻¹ (w_1 w_2 · · · w_N)

and we get

    σ_µ² = (Σ_i w_i)⁻² (w_1 w_2 · · · w_N) V_x (w_1 w_2 · · · w_N)ᵀ = (Σ_i w_i)⁻² Σ_i (w_i σ_i)²    (13)

Normally the weights are inverse variances, i.e. w_i = σ_i⁻², and inserting into the above result we
obtain for the error of the weighted mean:

    1/σ_µ² = Σ_i 1/σ_i²    (14)

A special case is that the weights are equal: σ_i = Δx for all i. In this case we get for the
error of the unweighted mean

    µ = (1/N) Σ_{i=1}^N x_i ;  σ_µ = Δx/√N    (15)

which are the formulae for a simple (unweighted) mean and its error estimation.
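As a concrete illustration, the following C++ sketch (our own code, with an interface chosen for this example) computes the weighted mean and its error estimate according to formulas (14) and (15):

#include <cmath>
#include <cstdio>

// Weighted mean of n values x[i] with errors sigma[i] (weights w_i = 1/sigma_i^2).
// On return *mean holds the weighted mean and *error its error estimate,
// from 1/sigma_mu^2 = sum_i 1/sigma_i^2 as in formula (14).
void weightedMean(int n, const double* x, const double* sigma,
                  double* mean, double* error) {
    double sumw = 0, sumwx = 0;
    for (int i = 0; i < n; i++) {
        double w = 1.0/(sigma[i]*sigma[i]);
        sumw  += w;
        sumwx += w*x[i];
    }
    *mean  = sumwx/sumw;
    *error = std::sqrt(1.0/sumw);
}

int main() {
    double x[3]     = {1.1, 0.9, 1.3};
    double sigma[3] = {0.1, 0.2, 0.3};
    double mu, err;
    weightedMean(3, x, sigma, &mu, &err);
    std::printf("mean = %f +- %f\n", mu, err);
}

With equal errors σ_i = Δx this reproduces the unweighted mean and the error Δx/√N of formula (15).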

Example 3: Error estimate of simple sums of three quantities x1 , x2 and x3 :

u1 = x1 + x2 (16)
u2 = x2 + x3 . (17)

Here the transformation is from 3D to 2D so that n = 3 and m = 2. The transformation
matrix is:

    J = \begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix}

so that the covariance matrix of u = (u_1, u_2) is:

    V_u = \begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix} \begin{pmatrix} σ_{11} & σ_{12} & σ_{13} \\ σ_{12} & σ_{22} & σ_{23} \\ σ_{13} & σ_{23} & σ_{33} \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 1 & 1 \\ 0 & 1 \end{pmatrix}    (18)

where the middle matrix is the covariance matrix V_x of the variables x = (x_1, x_2, x_3).

12.3 Error propagation in case of non-linear transformation

We define again a transformation of a set of measurements x as:

    x′ = x′(x)
    x = (x_1, . . . , x_n)
    x′ = (x′_1, . . . , x′_m)

where x′ is under-constrained, i.e. m ≤ n. Here x is again an array of measurements with
known covariance matrix V_x, which is a symmetric n × n matrix. The problem is now how
to calculate the covariance matrix of the quantities x′, i.e. the covariance matrix V_{x′} of the m
quantities x′_1, . . . , x′_m.

We expand x′ in Taylor series around the expectation value of x: x = ⟨x⟩. Each component
x′_i is expanded as:

    x′_i = x′_i(⟨x⟩) + ∇_x x′_i(x = ⟨x⟩)(x − ⟨x⟩) + · · ·    (19)

so that the expansion in matrix form reads:

    x′ = x′(⟨x⟩) + J(x − ⟨x⟩) + · · ·    (20)

where J is now the Jacobian derivative matrix of the transformation:

    J = \begin{pmatrix} ∇x′_1 \\ \vdots \\ ∇x′_m \end{pmatrix} = \begin{pmatrix} ∂x′_1/∂x_1 & \cdots & ∂x′_1/∂x_n \\ \vdots & \ddots & \vdots \\ ∂x′_m/∂x_1 & \cdots & ∂x′_m/∂x_n \end{pmatrix}    (21)

computed at x = ⟨x⟩. Neglecting higher order terms we have the expansion:

    x′ = x′(⟨x⟩) + Jx − J⟨x⟩.    (22)

The first and the third terms are constant, because they are calculated at a fixed point
x = ⟨x⟩, so that their covariances vanish and we have

    cov(x′) = cov(Jx) = J cov(x) Jᵀ.

Using the V notation for the covariance matrix, the error propagation formula in case of a
non-linear transformation reads:

    V_{x′} = J V_x Jᵀ    (23)

which is the same formula as in the linear case, except that the matrix J is now
the Jacobian derivative matrix of the non-linear transformation computed at x = ⟨x⟩.

Examples

Example 1: Error estimate of a product u = x_1 x_2. Given the covariance matrix of (x_1, x_2),
what is the error estimate Δu?

The Jacobian is J = (∂u/∂x_1  ∂u/∂x_2) = (x_2  x_1) so that

    (Δu)² ≡ σ_u² = (x_2  x_1) \begin{pmatrix} σ_1² & σ_{12} \\ σ_{12} & σ_2² \end{pmatrix} \begin{pmatrix} x_2 \\ x_1 \end{pmatrix} = x_2² σ_1² + 2 x_1 x_2 σ_{12} + x_1² σ_2².    (24)

In general a single valued function f = f(x, y) has the error estimate:

    σ_f² = (∂f/∂x)² σ_x² + 2 (∂f/∂x)(∂f/∂y) σ_{xy} + (∂f/∂y)² σ_y²    (25)

where σ_{xy} is the covariance of x and y. Notice that the formula

    Δf = (∂f/∂x) Δx + (∂f/∂y) Δy,

which is usually offered as an error formula for a function in elementary courses, is valid only
if the correlation between x and y is +100%: σ_{xy} = σ_x σ_y.

Example 2: Transformation from Cartesian coordinates (x, y) to polar coordinates (r, ϕ).
The transformation is

    r = √(x² + y²)
    ϕ = arctan(y/x)

and the Jacobian:

    J = \begin{pmatrix} ∂r/∂x & ∂r/∂y \\ ∂ϕ/∂x & ∂ϕ/∂y \end{pmatrix} = \begin{pmatrix} x/r & y/r \\ −y/r² & x/r² \end{pmatrix} = (1/r) \begin{pmatrix} x & y \\ −y/r & x/r \end{pmatrix}.

The transformed covariance matrix is then

    V′ ≡ \begin{pmatrix} σ_{rr} & σ_{rϕ} \\ σ_{rϕ} & σ_{ϕϕ} \end{pmatrix} = J V Jᵀ = (1/r²) \begin{pmatrix} x & y \\ −y/r & x/r \end{pmatrix} \begin{pmatrix} σ_{xx} & σ_{xy} \\ σ_{xy} & σ_{yy} \end{pmatrix} \begin{pmatrix} x & −y/r \\ y & x/r \end{pmatrix}.
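For illustration, the product J V Jᵀ of this 2×2 example is easy to code directly; the following sketch (our own, not from any library) propagates a Cartesian covariance matrix to polar coordinates:

#include <cmath>

// Propagate the covariance of (x,y) to (r,phi) via V' = J V J^T,
// with J the Jacobian of the transformation evaluated at (x,y).
void propagatePolar(double x, double y,
                    double vxx, double vxy, double vyy,
                    double* vrr, double* vrphi, double* vphiphi) {
    double r = std::sqrt(x*x + y*y);
    double jr[2]   = {  x/r,      y/r     };  // (dr/dx,   dr/dy)
    double jphi[2] = { -y/(r*r),  x/(r*r) };  // (dphi/dx, dphi/dy)
    double V[2][2] = { {vxx, vxy}, {vxy, vyy} };
    double srr = 0, srp = 0, spp = 0;
    for (int k = 0; k < 2; k++)
        for (int l = 0; l < 2; l++) {
            srr += jr[k]  *V[k][l]*jr[l];
            srp += jr[k]  *V[k][l]*jphi[l];
            spp += jphi[k]*V[k][l]*jphi[l];
        }
    *vrr = srr; *vrphi = srp; *vphiphi = spp;
}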

Exercises

Exercise 1: Transformation of momentum variables. Suppose that the momentum of a


charged particle is measured in 3D in terms of the curvature ρ and the angular variables
λ and ϕ. The curvature ρ is the curvature of the projected trajectory in the xy plane and
it is due to a magnetic field B = (0, 0, Bz ). The covariance matrix of the measurement is
known:

    V = \begin{pmatrix} σ_{ρρ} & σ_{ρλ} & σ_{ρϕ} \\ σ_{ρλ} & σ_{λλ} & σ_{λϕ} \\ σ_{ρϕ} & σ_{λϕ} & σ_{ϕϕ} \end{pmatrix}
Calculate the covariance matrix in terms of the Cartesian momentum components p =
(px , py , pz ). Hints: The angle λ is related to the polar angle θ as λ = π/2 − θ. The re-
lation between the curvature and momentum is ρ = 0.3qBz /pT where pT is the transverse
momentum i.e. the momentum projected in the xy plane, q = ±1 is the particle charge and
the units of ρ, Bz and pT are [m−1 ], [T] and [GeV], respectively.

Exercise 2: Error estimate of the invariant mass. A particle decays into two particles with
masses m1 and m2 . The momenta of the decaying particles p1 = (px1 , py1 , pz1 ) and p2 =
(px2 , py2 , pz2 ) are measured together with the corresponding covariance matrices V1 and V2 .
The invariant mass squared of the two-particle system is calculated as:

    M² = (E_1 + E_2)² − (p_1 + p_2)²

where E_i = √(m_i² + p_i²). What is the error estimate of the invariant mass M? The particle
measurements 1 and 2 are assumed independent.

13 Fitting parameterized model with experimental data

13.1 The method of maximum likelihood

Let us suppose that we have N measurements of a quantity y:

y = (y1 , y2 , . . . , yN )

which may depend on a vector variable x. We want to fit these measurements with a modeling
function:
y = y(α1 , . . . , αm ; x)
where α1 , . . . , αm are the parameters of the model.
The measurements y_1, y_2, . . . , y_N represent a random sample from a probability density g(y)
in the N-dimensional measurement space.
The a priori probability density g(y) is parameterized by the model y = y(α; x):

    g(y) → g ◦ y(α; x; y).

Now we define the likelihood function:

Ly (α; x; y) = g ◦ y(α; x; y) (26)

subject to the normalization:

    ∫ L_y(α; x; y) dy = 1 for all α; x.    (27)

The quantity Ly (α; x; y)dy is the probability of the sample y = (y1 , y2 , . . . , yN ) with param-
eters α defining a probability density in measurement space.
With these introductory preparations we introduce the method of Maximum Likelihood: The
principle is to determine the model parameters α = (α_1, . . . , α_m) such that the likelihood
function becomes maximal. Or, equivalently, the logarithm of the likelihood function:

    log L = log[g ◦ y(α; x; y)]    (28)

becomes maximal under the constraint

    ∫ L dy = 1.

The joint probability density g(y) can be for example:

1. A Gaussian:

    g(y) = (1/√((2π)^N det V)) exp[−½ (y − ⟨y⟩)ᵀ V⁻¹ (y − ⟨y⟩)]

with V = cov(y) (an N × N symmetric matrix) and ⟨y⟩ represented by the model: ⟨y⟩ =
y(α_1, . . . , α_m; x).

2. A Poisson distribution, when the y_i = n_i are integers:

    p_i = (λ_i^{n_i} / n_i!) e^{−λ_i}

where the Poisson expectation values λ_i represent the model: λ_i = λ_i(α_1, . . . , α_m; x).

The likelihood function is a product of the probabilities pi : L = Πpi .


Figure 5: Spectrum of a quantity x in a histogram presentation and as a plot with error bars. The
solid line in the second picture is a fit result by LSQ.

13.2 Least squares method as a special case of maximum likelihood

The least squares method (LSQ) follows from the principle of maximum likelihood when
the a priori probability density is Gaussian. We consider the spectrum of a 1D quantity x
which can be presented as a histogram showing the frequency of x values, or as a plot with
measurement errors (Fig. 5). The measured frequencies in a histogram (histogram ’bars’) are
subject to statistical fluctuations which follow a Poisson or Gaussian distribution. Suppose
the fluctuations are Gaussian, i.e. we have a large statistics histogram. Then the likelihood
function is:

    L_y(α; y; x) = (1/√((2π)^N det V)) exp{−½ [y − y(α; x)]ᵀ V⁻¹ [y − y(α; x)]}    (29)

The array y holds the measurements (histogram values) and the array

    y(α; x) = (y(α; x_1), . . . , y(α; x_N))

holds the corresponding model values. Taking the logarithm of eq. (29) we get:

    log L_y(α; y; x) = const − ½ [y − y(α; x)]ᵀ V⁻¹ [y − y(α; x)]

From this it follows that maximizing the Gaussian likelihood is the same as minimizing the
so called least squares function:

χ2 = [y − y(α; x)]T V−1 [y − y(α; x)] (30)

where V_{ij} = cov(y_i, y_j), i, j = 1, . . . , N, y is an array of measurements and y(α; x)
represents the model. The matrix V⁻¹ is usually called the weight matrix. In case the
measurements y_i are uncorrelated, the covariance matrix is diagonal: V_{ij} = δ_{ij} σ_{ij}. In this
case eq. (30) reduces to the following sum:

    χ² = Σ_{i=1}^N ((y_i − f(α; x_i)) / σ_i)²    (31)

where we denote the model function as f and the diagonal elements of V as σ_i². The quantity
σ_i is normally the error estimate of the measurement y_i. For example in Fig. 5 we have a
spectrum which follows the behavior f(x) = α_1 + α_2 x. The curve in the figure is fitted with
this parameterization by minimizing (31) with respect to α_1 and α_2.
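In ROOT such a straight-line χ² fit can be performed directly on a histogram. A minimal example (assuming his is a pointer to the histogram, as in the fit examples later in these notes) is:

his->Fit("pol1");                     // chi2 fit of f(x) = p0 + p1*x
TF1 *f = his->GetFunction("pol1");
double alpha1 = f->GetParameter(0);   // fitted constant term
double alpha2 = f->GetParameter(1);   // fitted slope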

14 Least squares solutions with error estimates


In the following we consider these cases:

• linear χ2 minimization

• non-linear or general χ2 minimization

• constrained χ2 minimization

The χ2 minimization problem is called linear, if the model function is linear in terms of its
parameters α.

14.1 Linear χ2 solution

Our linear model is now:


    f(x) = Σ_{j=1}^m α_j f_j(x),  m < N    (32)

i.e. a measured value of the quantity y is modeled by this linear expression at a space point
x. For N measurements of y: y = (y1 , y2 , . . . , yN ) we write the model as follows:

f (x) = (f (x1 ), . . . , f (xN ))

Denoting the matrix elements H_{ij} ≡ f_j(x_i) we can express the model in the form of a matrix
equation:

    f(x) = H(x) α    (33)

where H is an N × m matrix as defined above. With these notations the linear χ² function is
expressed as:

    χ² = (y − Hα)ᵀ V⁻¹ (y − Hα)    (34)
where V is the covariance matrix (N×N ) of the measurements y (y is a column vector and
yT a row vector).
To minimize, we differentiate (34) with respect to α and get:

    ∇_α χ² = −2(H ∇_α α)ᵀ V⁻¹ (y − Hα) = −2(H I)ᵀ V⁻¹ (y − Hα) = −2Hᵀ V⁻¹ (y − Hα) = 0

where I = ∇_α α is a unit matrix. It follows that the linear LSQ solution for the parameters
α minimizing the χ² equation (34) is:

    α = (Hᵀ V⁻¹ H)⁻¹ Hᵀ V⁻¹ y    (35)

The solved parameters α = (α_1, . . . , α_m) are subject to statistical error depending on the
precision of the measurements y, i.e. on the covariance matrix V (and on the modeling matrix
H). The solution (35) is of the form α = Jy with J = (Hᵀ V⁻¹ H)⁻¹ Hᵀ V⁻¹, so that the
error propagation techniques described in section 12.2 apply. Using eq. (9) we obtain the
error estimate (covariance matrix) of the parameters α:

    V_α = J V Jᵀ = (Hᵀ V⁻¹ H)⁻¹ Hᵀ V⁻¹ V V⁻¹ H (Hᵀ V⁻¹ H)⁻¹

where we have used the symmetry of the matrices V and Hᵀ V⁻¹ H. The above equation
reduces trivially to:

    V_α = (Hᵀ V⁻¹ H)⁻¹    (36)
As an example we consider a special case where the measurements yi are independent of each
other.

Example: independent measurements yi

In this case the covariance matrix V of the measurements yi is diagonal and the diagonal
elements are variances of yi which are the squared error estimates of yi : Vij = δij σij . We use
the notation var(yi ) ≡ σii ≡ σi2 . Then the χ2 function becomes:
    χ² = Σ_{i=1}^N ((y_i − Σ_{j=1}^m α_j f_j(x_i)) / σ_i)².    (37)

In the simplest case the functions fj are powers of a single variable x and the fitting model
is a polynomial. We define a column vector c and an m×m matrix B as follows:

c = HT V−1 y
B = HT V−1 H

Then the ’best fit’ parameters α are:

α = B−1 c

where the elements of c and B are:

    c_i = Σ_{k=1}^N σ_k⁻² H_{ki} y_k = Σ_{k=1}^N σ_k⁻² f_i(x_k) y_k

    B_{ij} = Σ_k H_{ki} (V⁻¹H)_{kj} = Σ_k Σ_l H_{ki} σ_k⁻² δ_{kl} H_{lj} = Σ_k σ_k⁻² H_{ki} H_{kj}
          = Σ_k σ_k⁻² f_i(x_k) f_j(x_k)    (38)

Notice that in case the model is a 1D polynomial, the B matrix elements are B_{ij} = Σ_k x_k^{i+j}/σ_k².
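As a concrete sketch of the linear LSQ solution (our own code), consider the straight-line model f(x) = α_1 + α_2 x with f_1(x) = 1 and f_2(x) = x; then B is a 2×2 matrix and the system Bα = c can be solved in closed form:

// Linear LSQ fit of y = a1 + a2*x to n points with errors sigma[i]:
// build c = H^T V^-1 y and B = H^T V^-1 H for f1(x)=1, f2(x)=x,
// then solve the 2x2 system B*alpha = c by Cramer's rule.
void fitLine(int n, const double* x, const double* y, const double* sigma,
             double* a1, double* a2) {
    double B11=0, B12=0, B22=0, c1=0, c2=0;
    for (int k = 0; k < n; k++) {
        double w = 1.0/(sigma[k]*sigma[k]);
        B11 += w;             // sum 1/sigma^2
        B12 += w*x[k];        // sum x/sigma^2
        B22 += w*x[k]*x[k];   // sum x^2/sigma^2
        c1  += w*y[k];
        c2  += w*x[k]*y[k];
    }
    double det = B11*B22 - B12*B12;
    *a1 = ( B22*c1 - B12*c2)/det;
    *a2 = (-B12*c1 + B11*c2)/det;
    // The covariance matrix of (a1,a2) is B^-1 (eq. 36):
    // var(a1) = B22/det, var(a2) = B11/det, cov(a1,a2) = -B12/det.
}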

14.2 Non-linear least squares fit

In case of a model which is non-linear in terms of its parameters α, the model expression Hα
is replaced by values of a function f:

    f(α; x) = f_α = (f(α; x_1), . . . , f(α; x_N))

where α is an array of m model parameters α = (α_1, . . . , α_m). Then the χ² function reads:

    χ² = (y − f_α)ᵀ V⁻¹ (y − f_α)    (39)

where again y is an array of N measurements and V is their covariance matrix. This χ²
function must be minimized by iteration. The iteration method needs an initial value which
we denote by α* = (α*_1, . . . , α*_m). We expand each component of the model function in Taylor
series around the initial value α* at each space point x_i, i = 1, . . . , N:

    f(α; x_i) = f(α*; x_i) + ∇_α f(α; x_i)|_{α=α*} · (α − α*) + · · ·    (40)
This is a set of N equations. We linearize by neglecting the higher order terms of the model
function f and write the equations (40) in matrix form as:

    f_α = f_{α*} + JΔα    (41)

where Δα ≡ α − α* and J is the Jacobian derivative matrix of the model function:

    J = \begin{pmatrix} ∇_α f(α; x_1) \\ \vdots \\ ∇_α f(α; x_N) \end{pmatrix} = \begin{pmatrix} ∂f(α; x_1)/∂α_1 & \cdots & ∂f(α; x_1)/∂α_m \\ \vdots & \ddots & \vdots \\ ∂f(α; x_N)/∂α_1 & \cdots & ∂f(α; x_N)/∂α_m \end{pmatrix}    (42)

computed at α = α*. We denote by ε* the departure of the measurements from the model
at α = α*:

    ε* = y − f_{α*}

With this notation and using the linearized form of the model (41) we get a linearized
expression for the χ² function:

    χ² = (ε* − JΔα)ᵀ V⁻¹ (ε* − JΔα).    (43)

We find the minimum by setting the differential to zero:

    ∇_α χ² = −2(J ∇_α Δα)ᵀ V⁻¹ (ε* − JΔα) = −2Jᵀ V⁻¹ (ε* − JΔα) = 0

and solving for Δα:

    Δα = (Jᵀ V⁻¹ J)⁻¹ Jᵀ V⁻¹ ε*    (44)

This is a correction to the parameters (α*_1, . . . , α*_m) at a new iteration stage:

    α_next = α* + Δα.

The iteration is continued until some convergence criteria, discussed below, are satisfied.
The covariance matrix of the converged solution α = (α_1, . . . , α_m) is:

    V_α = (Jᵀ V⁻¹ J)⁻¹    (45)

It is the same expression as in case of linear LSQ (36), except that the matrix H of the linear
mapping is replaced by the Jacobian J (42) of the non-linear transformation computed at
the χ² minimum: H ↔ J.

Example: independent measurements yi

In this case the covariance matrix V is diagonal and the diagonal elements are variances of
y_i which are the squared error estimates of y_i: V_{ij} = δ_{ij} σ_{ij}. Then the χ² function becomes:

    χ² = Σ_{i=1}^N ((y_i − f(α; x_i)) / σ_i)².    (46)

Similarly to the previous section we define an m-element column vector c and an m×m matrix
B as follows:

c = JT V−1 ε∗
B = JT V−1 J.

B and c are calculated at α = α∗ at each iteration stage and the iterated correction is

∆α = B−1 c.

The elements of c and B are (in case of uncorrelated measurements):

    c_i = Σ_{k=1}^N σ_k⁻² ε*_k (∂f(α; x_k)/∂α_i)|_{α=α*}

    B_{ij} = Σ_k J_{ki} (V⁻¹J)_{kj} = Σ_k Σ_l J_{ki} σ_k⁻² δ_{kl} J_{lj} = Σ_k σ_k⁻² J_{ki} J_{kj}
          = Σ_k σ_k⁻² (∂f(α; x_k)/∂α_i)(∂f(α; x_k)/∂α_j)|_{α=α*}

where ε*_k = y_k − f(α*; x_k). The inverse of B gives the covariance matrix V_α.
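To make the bookkeeping concrete, here is a minimal sketch of the iteration for a one-parameter model f(a; x) = e^{−ax} (our own code; with a single parameter, B and c reduce to scalars and the correction is simply Δa = c/B):

#include <cmath>

// Non-linear LSQ by iteration for the one-parameter model f(a;x) = exp(-a*x).
// Returns the fitted a; 1/B at convergence estimates its variance (eq. 45).
double fitExpSlope(int n, const double* x, const double* y,
                   const double* sigma, double a /* initial value */) {
    for (int iter = 0; iter < 100; iter++) {
        double B = 0, c = 0;
        for (int k = 0; k < n; k++) {
            double f    = std::exp(-a*x[k]);
            double dfda = -x[k]*f;               // df/da: the Jacobian "row"
            double w    = 1.0/(sigma[k]*sigma[k]);
            double eps  = y[k] - f;              // departure of data from model
            B += w*dfda*dfda;
            c += w*dfda*eps;
        }
        double delta = c/B;                      // delta_a = B^-1 c
        a += delta;
        if (delta*delta < 0.01/B) break;         // |delta_a| << sigma_a = 1/sqrt(B)
    }
    return a;
}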

14.2.1 Convergence criteria

We denote by ε = y − f_α the deviations of the measurements y from the model f = f(α; x)
at an iteration stage. Then Δχ² = 2Δεᵀ V⁻¹ε = −2Δfᵀ V⁻¹ε, and with Δf = JΔα we have the
estimated distance from minimum:

    Δχ² = −2Δαᵀ Jᵀ V⁻¹ ε    (47)

The equation Δχ² = 1 is a 1 STD (hyper)ellipsoid in the error space. The condition Δχ² < 0.1
is usually a sufficient convergence criterion, since then Δα_i ≪ √(σ²_{α_i}) and the parameter
changes at iteration i are much smaller than their statistical error.

14.3 Least squares fit with constraints

We skip this section. Those interested can have a look at the original document:
http://www.helsinki.fi/~karimaki/CMHEP/Lectures/statLectures.pdf

14.4 Fit quality tests

14.4.1 Pull or stretch values of fitted parameters

A pull value for the parameter i is defined as

    p_i = Δy_i / √(V(Δy)_{ii}) = Δy_i / √(V_{ii} − V(y_fit)_{ii})    (48)

i.e. the difference between the fitted and measured values divided by the error estimate.
The pull values p_i are randomly distributed and they should follow the standard normal
distribution N(0, 1). Plotting the pull values and checking if they follow N(0, 1) is used to
control the fit quality. If this test fails, the reason might be that the used model does not
describe the experimental measurements, or the measurements are biased, or perhaps the error
estimates, which are used for the weights, are not properly estimated.

Figure 6: Random Standard Normal distribution and its fit

14.4.2 Fit probability distribution

When fitted repeatedly several times with variable input, the resulting χ² values should follow
the theoretical χ² distribution with the parameter N_D, the number of degrees of freedom.
Inspecting the χ² distribution is not an easy way to see if the fits are OK, especially when the
number N_D varies from one fit to another. A better way is to make the so called χ² probability
distribution, which should follow a uniform distribution in the range (0,1) independent of N_D.
If a variable is a random variate, like the χ² values, then any function of it is also a random
variate. It can be shown that if a variate x follows the probability distribution f(x), then the
values of its cumulative distribution function

    F(x) = ∫_{−∞}^x f(t) dt

follow a uniform distribution in the range (0,1). Let us denote the χ² distribution function as
f_{N_D}(x). Then the corresponding cumulative distribution function reads:

    F_{N_D}(χ²) = ∫_0^{χ²} f_{N_D}(t) dt.

It is not integrable in closed form, but implementations can be found in certain libraries. The
ROOT TMath class offers one, the Prob() function. The code

double chiProb=TMath::Prob(chi,ndf);

computes the χ² probability function for a given χ² value (chi) and for the number of degrees
of freedom ndf.

Figure 7: χ2 distribution with NDF=3 and the corresponding χ2 probability distribution.

In real life there are often outliers in the measured data, which do not fit well the assumed
model. The outliers tend to create a ”forward peak” in the fits’ χ² probability distribution.
In the above figures we have an ideal case with no outliers.

14.5 Maximum Likelihood and Poisson statistics

Fitting histograms with low statistics is not ideal with the χ² method, because the bin
fluctuations are not Gaussian. In contrast, large statistics histograms can be fitted with the
least squares method assuming that the bin error is √n_i, where n_i is the number of entries in
bin number i.
For small statistics histograms one chooses a Maximum Likelihood fit with Poisson statistics:

    P(n_i) ≡ p_i = (⟨n_i⟩^{n_i} / n_i!) e^{−⟨n_i⟩},  i = 1, . . . , N_bins,    (49)

i.e. we have N_bins Poisson distributions which describe the probability of fluctuations in each
bin i.

Figure 8: Example of a small statistics histogram - exponential distribution. The true mean of the
generated histogram should be = 1 and the slope should be = -1. The curve is fitted with the Least
Squares method. For this the Poisson likelihood fit gives the estimate slope=-1.09.

Suppose we have a model f(x; ā) to describe the spectrum dN/dx. The model should be such
that

    ⟨n_i⟩ = ∫_{x_i}^{x_{i+1}} f(x; ā) dx ≡ f_i(ā),  i = 1, . . . , N_bins

(f_i(ā) is a notation for the integral). Bins are independent, so the likelihood function is

    L = Π_{i=1}^{N_bins} p_i = Π_{i=1}^{N_bins} (f_i(ā)^{n_i} / n_i!) e^{−f_i(ā)}    (50)

The parameters ā can be determined by minimizing − log L:

    − log L = Σ_{i=1}^{N_bins} [log n_i! + f_i(ā) − n_i log f_i(ā)]    (51)
            = const + Σ_{i=1}^{N_bins} [f_i(ā) − n_i log f_i(ā)]    (52)

The minimization is performed by finding the zeros of the derivatives of − log L with respect
to the parameters ā:

    −∂(log L)/∂a_j = Σ_{i=1}^{N_bins} (∂f_i(ā)/∂a_j) (1 − n_i/f_i(ā)) = 0,  j = 1, . . . , N_p,    (53)

where N_p is the number of parameters ā. Normally this group of N_p equations must be solved
numerically by iteration.
In ROOT, for example, a loglikelihood fit to an exponential can be performed as follows:

his->Fit("expo","LO");

where his is a pointer to the exponential histogram.

Figure 9: Comparison of small statistics histograms fitted with Least Squares and Poisson Likelihood
method. Exponential histograms with only 20 entries were generated and fitted 10000 times. The
fitted slopes (absolute values) are plotted. The LSQ method largely and systematically underestimates
the slope values, whereas the Poisson Likelihood gives correct estimates on the average.

14.5.1 Maximum Likelihood and parameter errors

Let us denote:

    W(ā) ≡ log L(ā)    (54)

We expand W(ā) in Taylor series around ā = ā*, where ā* maximizes W:

    W(ā) = W(ā*) + J(ā − ā*) − ½ (ā − ā*)ᵀ H (ā − ā*) + . . .

where J is the Jacobian of W and H is the Hessian of −W:

    H_{ij} = −∂² log L / ∂a_i ∂a_j

Now at the maximum ā = ā* the second term in the series vanishes. By comparison with the
series expansion of the exponential we conclude that:

    L(ā) ≃ const exp(−½ Δāᵀ H Δā)

which is a general (multidimensional) Gaussian distribution with the weight matrix H. The
covariance matrix is its inverse:

    cov(a_i, a_j) ≡ V_{ij} = (H⁻¹)_{ij}    (55)

This is the error matrix of the parameters fitted with the Maximum Likelihood method.

15 Introduction to Monte Carlo simulation
Monte Carlo simulation is sometimes called stochastic simulation. The basic requirement of
Monte Carlo simulation is to have means to generate random numbers which follow density
functions typical in the physical processes in question. For example γ-spectra have peaks
which can be simulated with Gaussians.

15.1 Generation of non-uniform distribution by inversion method

This method can also be called method of inverted cumulative distribution function. It is
generally valid and useful always, when the inverse of the cumulative distribution can be
evaluated. Here we use the notations: f (x) = density function of the random variate x and
F (x) the corresponding cumulative distribution function:
    F(x) = ∫_{−∞}^x f(t) dt

Figure 10: A probability density function and its cumulative distribution

In all its simplicity the inversion method works as follows:

1◦ Generate u from uniform distribution (0, 1)

2◦ Calculate x = F −1 (u)

Then the random variates x follow the density f(x). It can be proved as follows. The
cumulative probability (= F) of the variate x:

    P[x ≤ t] = P[F⁻¹(u) ≤ t] = P[F(F⁻¹(u)) ≤ F(t)] = P[u ≤ F(t)] = F(t)

The second equality follows from the fact that cumulative distribution functions are mono-
tonically increasing and the last equality follows from u being uniform in the range (0, 1).

Example: Particle lifetime, or the intensity of a radiative source, as a function of time: an
exponential distribution in time:

    f(t) = λ e^{−λt}    (t ≥ 0)
    F(t) = 1 − e^{−λt}    (0 ≤ F < 1)
    F⁻¹(u) = −λ⁻¹ log(1 − u)    (0 ≤ u < 1)

Here λ is a parameter. The mean lifetime is τ = λ⁻¹ and the half-life is t_{1/2} = τ log 2. Now
we can write a C++ function which generates variates from an exponential with mean tau:

#include <cstdlib>   // rand, RAND_MAX
#include <cmath>     // log

// Generate an exponentially distributed random number with mean tau
double rnexpo(double tau){
    double rnduni = (double)rand()/(double)RAND_MAX; // uniform (0,1)
    return -tau*log(rnduni);
}

Notice: Instead of 1 − u one can use u, because both are uniform (0,1). We use ROOT to
make an example plot with this code using tau=2:

Figure 11: Random exponential distribution generated with the above code. The straight line in
logarithmic scale shows that the plot is exponential.

If one is working with ROOT, the class TRandom provides a good many utilities for random
number generation. For example the code

rnexp=gRandom->Exp(Tau);

provides an exponential random number with mean Tau.

15.2 Inversion method for discrete distribution

In order to generate from a discrete probability p_i, i = 1, 2, . . ., one tabulates the cumulative
probability:

    F_j = Σ_{i=1}^j p_i,  j = 1, . . . , N    (56)

where N is large enough so that F_N ≃ 1. The procedure is then:

1◦ Generate u from uniform [0,1)

2◦ Determine k such that Fk−1 ≤ u < Fk

Then the generated values k follow the wanted discrete distribution.

Example: Suppose the discrete probabilities are p_i = const/i, i = 1, . . . , 10. The task is to
write a code for the generation.

#include <iostream>
#include <cstdlib>

void generateK(int *k) {
    static int done=0;
    static float cons=0;
    const int nProb=10;
    static float iProbCumul[nProb];
    if (!done) { // initializations: tabulate the cumulative distribution
        done = 1;
        float iProb[nProb];
        for (int i=0; i<nProb; i++) {
            iProb[i]=1.0/(i+1);
            cons+=iProb[i]; // find normalization
        }
        iProbCumul[0]=iProb[0]/cons;
        for (int i=1; i<nProb; i++)
            iProbCumul[i]=iProbCumul[i-1]+(iProb[i]/cons);
    }
    // generation: find k such that F(k-1) <= u < F(k)
    *k=1;
    double rnduni = (double)rand()/(double)RAND_MAX;
    for (int i=1; i<nProb; i++)
        if (rnduni>=iProbCumul[i-1]) *k=i+1;
}

int main() { // testing
    for (int i=0; i<100; i++) {
        int k;
        generateK(&k);
        std::cout<<k<<std::endl;
    }
}

15.3 Approximate inversion method using tabulation

This method can be used for a continuous density distribution when a closed form is not
available for the inversion method. One tabulates the pairs of values (x_i, F(x_i)), x_i < x_{i+1}.
Then the generation algorithm goes as follows:

1◦ Generate u from uniform [0,1)

2◦ Find x_i such that F(x_i) ≤ u < F(x_{i+1})

3◦ Compute x by interpolation:

    x = ([F(x_{i+1}) − u] x_i + [u − F(x_i)] x_{i+1}) / (F(x_{i+1}) − F(x_i))
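A minimal C++ sketch of this tabulated inversion (our own illustration; the table of nTab pairs is assumed to be prepared by the caller, with FTab increasing from 0 to 1) could be:

#include <cstdlib>

// Approximate inversion: generate x following the tabulated density.
double rnTabulated(int nTab, const double* xTab, const double* FTab) {
    double u = (double)rand()/(double)RAND_MAX;   // uniform [0,1)
    int i = 0;
    while (i < nTab-2 && FTab[i+1] <= u) i++;     // F(x_i) <= u < F(x_{i+1})
    // linear interpolation between x_i and x_{i+1}
    return ((FTab[i+1]-u)*xTab[i] + (u-FTab[i])*xTab[i+1])
           / (FTab[i+1]-FTab[i]);
}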

15.4 Hit-or-Miss method

Suppose a density distribution f = f(x) is defined in a finite range:

    x ∈ [a, b] and M = max{f(x) | a ≤ x ≤ b}.

If the inversion method is not applicable (i.e. the inverse cumulative density function cannot
be computed in closed form), the following method is always valid (von Neumann 1951):

1◦ Generate a random number r1 uniform in [a,b)

2◦ Generate a random number r2 uniform in [0,M)

3◦ If r2 > f (r1 ), go back to 1◦ (miss)

4◦ If r2 ≤ f (r1 ), take r1 (hit)

5◦ Go back to 1◦ for the next random number

The random numbers r1 then follow the density distribution f (x).
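A minimal C++ sketch of the method (our own; the density f, the range [a, b) and the bound M are supplied by the caller) could read:

#include <cstdlib>

// von Neumann hit-or-miss: generate x in [a,b) following the density f,
// where M is an upper bound of f on [a,b).
double rnHitOrMiss(double (*f)(double), double a, double b, double M) {
    while (true) {
        double r1 = a + (b-a)*(double)rand()/(double)RAND_MAX; // uniform [a,b)
        double r2 = M*(double)rand()/(double)RAND_MAX;         // uniform [0,M)
        if (r2 <= f(r1)) return r1;                            // hit
    }                                                          // miss: retry
}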


Efficiency of the method goes like the ratio of the areas:

    Eff = (∫_a^b f(x) dx) / (M(b − a)) = 1 / (M(b − a))

The efficiency can be bad for distributions which make a narrow peak. Note: If the
maximum M cannot be calculated, one can use a value M′ which is known to be larger than
the maximum in the range [a, b) (which would make the efficiency smaller).

15.5 Hit-or-Miss method by comparison function

Let f be a density function from which we should generate random variates. Now we do not
restrict the range of f’s definition to a finite range (in the figure below the range is (−∞, ∞)).
Let g be another density function for which we have a generation method available (e.g. the
inversion method) and which satisfies the condition:

    ∃ constant a such that a g(x) ≥ f(x) ∀x.

We thus require that we can find such a constant a that the graph of the product function
a g(x) is everywhere above the graph of the function f(x) (and is defined in the same range).

[Figure: the curve a g(x) lying everywhere above f(x).]

The distribution f can be generated then using the following Hit-or-Miss method which uses
the inversion method for g:

1◦ Generate a random number r1 uniform (0, 1)

2◦ Compute G−1 (r1 ) = x

3◦ Generate a random number r2 uniform (0, a g(x))

4◦ If r2 > f (x) → 1◦ (miss)

5◦ If r2 ≤ f (x) take x (hit)

6◦ → 1◦

15.6 Composition method

The composition method can be used when the density function f can be expressed as the
following integral composition:

    f(x) = ∫ g_y(x) dH(y) = ∫ g_y(x) h(y) dy    (57)

where h and gy are density functions. The distribution f is then generated with the following
algorithm:

1◦ Generate a random number y from the h distribution

2◦ Generate x from gy distribution using the generated y value

The density distribution h can also be discrete. Then the probability element h(y)dy in the
formula (57) is replaced by the discrete probability p_i and the integral by a sum:

    f(x) = Σ_i p_i g_i(x)

The functions g_i can be for example pieces of the function f in suitable intervals i:

    g_i(x) = f(x)/p_i,  p_i = ∫_i f(x) dx

where the integration is taken over the interval i (p_i is a normalization factor). Then the
generation algorithm reads:

1◦ Generate a random i from the discrete distribution p_i.

2◦ Generate x in the interval i from the distribution g_i.

In each interval one uses a generation method applicable in the interval in question. This is
a method used often in practical applications.

16 Generation from common continuous distributions

16.1 Uniform distribution

From now on we use the notation Unif(a, b) for a uniform distribution in the interval (a, b).
The density function is:

    f(x) = (b − a)⁻¹ for a ≤ x ≤ b, and 0 otherwise.

Or by using the step function:

    f(x) = Θ(x − a)Θ(b − x) / (b − a)
The mean and variance are:

    x̄ = ½(a + b),  σ_x² = (b − a)²/12
The generation from the distribution Unif(a, b) is based on the generation from the basic
distribution Unif(0, 1), for which there is a large number of options in various program libraries.
This generator forms in fact the basis of generation for almost all other density distributions.
The generation of the general uniform distribution Unif(a, b) is based on the transformation:

    x → x′ = a + (b − a)x,

which is a linear mapping from the interval (0, 1) to the interval (a, b). Hence the algorithm
is simply:

1◦ Generate r1 Unif(0, 1)

2◦ Compute r2 = a + (b − a)r1

Then the random numbers r2 are distributed as Unif(a, b).

16.2 Gaussian distribution

In what follows we denote the general normal (Gaussian) distribution as N(µ, σ), where µ is
the mean of the distribution and σ its standard deviation. The Gaussian density function is

    f(x) = (1/(√(2π) σ)) e^{−½((x−µ)/σ)²},    (58)

which is defined on the whole real axis. The Gaussian N(0, 1) is called the standard normal
distribution. The Gaussian cumulative distribution function is:

    F(x) = (1/√(2π)) ∫_{−∞}^{(x−µ)/σ} e^{−½t²} dt ≡ Ψ((x−µ)/σ).    (59)

Figure 12: Standard normal distribution

The function Ψ is the cumulative distribution function of the standard normal N(0, 1). It
cannot be computed in closed form. Ψ is found for example in the ROOT TMath class as
TMath::Freq(double x). The even moments of N(µ, σ) are:

    µ_{2k} = ((2k)! / (2^k k!)) σ^{2k},  k = 1, 2, 3, ...

All odd central moments vanish. It follows that the skewness and kurtosis are

    γ_1 = µ_3/σ³ = 0,  γ_2 = µ_4/σ⁴ − 3 = 0.
The characteristic function is Φ(t) = exp(itµ − ½t²σ²). The half width at half maximum
(HWHM) of N(µ, σ) is √(2 log 2) σ ≃ 1.18σ.
The tails of the Gaussian decrease fast: as can be seen from the table below, only about 2.7‰
of the Gaussian variates are farther than three standard deviations (three σ’s) from the mean.

    P(|x − µ| ≤ σ) = 2Ψ(1) − 1 = 68.27%
    P(|x − µ| ≤ 2σ) = 2Ψ(2) − 1 = 95.45%
    P(|x − µ| ≤ 3σ) = 2Ψ(3) − 1 = 99.73%.

Let us consider the following example in which we apply the generation method presented
in Appendix A to generate random numbers of a function which depends on two random
variates of given density.

Example: Let x and y be independent of each other with the density N(0, 1). Find out
what is the density distribution of the ratio z = y/x. Thus we have to calculate the integral:

    f(z) = ∫_{−∞}^{∞} dx ∫_{−∞}^{∞} dy (1/√(2π)) e^{−½x²} (1/√(2π)) e^{−½y²} δ(z − y/x)

Using the formula

    δ(g(y)) = Σ_i δ(y − y_i) / |g′(y_i)|,

where the y_i’s are the zeros of the function g, we get the δ-function in the form: δ(z − y/x) = |x| δ(y − zx).
It follows that the density function of the variate z is:

    f(z) = (1/2π) ∫_{−∞}^{∞} dx |x| e^{−½x²} ∫_{−∞}^{∞} dy e^{−½y²} δ(y − zx)
         = (2/2π) ∫_0^{∞} dx x e^{−½x²} e^{−½z²x²} = (1/π) ∫_0^{∞} dx x e^{−½x²(1+z²)}
         = (1/π) · 1/(1 + z²).

This is the so called Cauchy distribution which we shall discuss more closely later.
Let us consider the transformations between the standard normal and the general Gaussian
distribution:

    N(µ, σ) ↔ N(0, 1).

We can easily show that

    if x is N(µ, σ), then t = (x − µ)/σ is N(0, 1)
    if t is N(0, 1), then x = σt + µ is N(µ, σ)

It follows that it is sufficient to master the generation of N(0, 1) in order to be able to generate
variates which follow the general Gaussian. Neither the cumulative distribution function Ψ (59)
nor its inverse can be calculated in closed form, so the inversion method cannot be applied
directly, but possibly via the tabulation method. However, the total integral ∫ e^{−½t²} dt can
be calculated by a trick in which one uses polar coordinates:

    [∫_{−∞}^{∞} e^{−½x²} dx]² = ∫_{−∞}^{∞} e^{−½x²} dx ∫_{−∞}^{∞} e^{−½y²} dy = ∫∫ e^{−½(x²+y²)} dx dy
                              = ∫_{r=0}^{∞} ∫_{ϕ=0}^{2π} e^{−½r²} r dr dϕ = 2π ∫_0^{∞} e^{−½r²} d(½r²) = 2π.

Maybe the most important method of generation of N(0, 1) is based on a similar trick,
presented in the following.

16.2.1 Polar or Box-Muller method

Let x and y be independent standard normal variates. Their common 2D distribution function
in the plane is:

    f(x, y) = (1/√(2π)) e^{−½x²} (1/√(2π)) e^{−½y²} = (1/2π) e^{−½(x²+y²)}

We transform this to the polar coordinate system which is defined as:

    x = r cos ϕ
    y = r sin ϕ    (60)

The surface element in the xy plane transforms as dx dy → r dr dϕ, where the factor r is the
Jacobian determinant of the transformation (60). The density of the new variates r and ϕ is
then:

    g(r, ϕ) = f(x, y) r = (1/2π) r e^{−½r²}.

We observe that the combined density of r and ϕ is separable:

    g(r, ϕ) = f_ϕ(ϕ) f_r(r), where f_ϕ(ϕ) = 1/2π and f_r(r) = r e^{−½r²}.

Hence the new variates r and ϕ are independent of each other, so they can be generated
independently. The generation of ϕ is trivial: Unif(0, 2π). The generation of r can be made
with the inversion method, because the cumulative density is:

    F_r(r) = ∫_0^r f_r(t) dt = 1 − e^{−½r²},

and its inverse:

    F_r⁻¹(u) = √(−2 log(1 − u)).
So we conclude: generating the variates r and ϕ and inserting them into the transformation
formula (60) gives two normal variates x and y. We have derived the polar or Box-Muller
algorithm:

1◦ Generate u_1 Unif(0, 1) and compute ϕ = 2πu_1

2◦ Generate u_2 Unif(0, 1) and compute r = √(−2 log u_2)

3◦ Inserting into the formula (60) we get

    x = √(−2 log u_2) cos(2πu_1)
    y = √(−2 log u_2) sin(2πu_1)    (61)

two independent normal variates N(0, 1).

The algorithm is explicit and precise, but rather slow, because it involves the computation of
four fairly CPU-costly functions: √, log, cos and sin.
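A direct C++ implementation of formula (61) could look as follows (our own sketch; it caches the second variate so that one N(0, 1) number is returned per call):

#include <cstdlib>
#include <cmath>

// Box-Muller (61): two N(0,1) variates per pair of uniform random numbers.
double rngauss() {
    static bool cached = false;
    static double ycache;
    if (cached) { cached = false; return ycache; }
    double u1 = (double)rand()/((double)RAND_MAX+1.0);
    double u2 = ((double)rand()+1.0)/((double)RAND_MAX+1.0); // avoid log(0)
    double r   = std::sqrt(-2.0*std::log(u2));
    double phi = 2.0*M_PI*u1;                                // M_PI from <cmath>
    ycache = r*std::sin(phi);
    cached = true;
    return r*std::cos(phi);
}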

16.2.2 Faster variation of the polar method

1◦ Generate u_1 and u_2 Unif(0, 1)

2◦ v_1 ← 2u_1 − 1; v_2 ← 2u_2 − 1. The points (v_1, v_2) distribute uniformly in a square.

3◦ w ← v_1² + v_2². If w > 1, return to 1◦. The points (v_1, v_2) are now uniform in the unit
circle and they are used to compute a random sin and cos:

    cos(ϕ) = v_1/√w
    sin(ϕ) = v_2/√w

Moreover, w is Unif(0, 1) (show), and it replaces the random number u_2 in the formula (61).

4◦ Inserting into the formula (61) we get:

    x = v_1 √(−2 log w / w)
    y = v_2 √(−2 log w / w)

The method is faster, because one avoids the computation of sin and cos. A negative effect
is a small loss in the generation of random numbers. The relative loss is 1 − π/4 ≃ 21% of the
random numbers, due to the cut at point 3◦.

16.2.3 Gaussian generation by summation method

This method is based on the central limit theorem, which states that the density of the sum
variate S_n of n independent random numbers x_1, x_2, ..., x_n

    S_n = Σ_{i=1}^n x_i

approaches N(µ, σ²) in the limit n → ∞, where

    µ = Σ_{i=1}^n E(x_i) and σ² = Σ_{i=1}^n σ²_{x_i}

Example: Let the x_i be random variates Unif(0, 1). Then for the sum variate we have:

    µ = n · ½ = n/2
    σ² = n · 1/12 = n/12.

We define a new random variate:

    y_n = √(12/n) (S_n − n/2) = √(12/n) S_n − √(3n).

The mean and variance of this variate are:

    ȳ_n = √(12/n) S̄_n − √(3n) = √(12/n) · n/2 − √(3n) = 0

    σ²_{y_n} = (√(12/n))² var(S_n − n/2) = (12/n) · (n/12) = 1.

According to the central limit theorem the behaviour of the variate y_n approaches N(0, 1)
in the limit n → ∞. For practical applications the convergence is fast and n ≈ 10 is large
enough for many applications. The relative precision is poorest in the tails, because they get
truncated:

    |y_n| ≤ √(3n).

Often one uses the value n = 12. Then the expression of y_n is simple:

    y_12 = −6 + Σ_{i=1}^{12} x_i

The summation method is especially suitable for calculation in parallel processors: the terms
of the sum can be computed simultaneously, when one uses the multiplicative congruential
method for the generation of Unif(0, 1). This is based on the following property:

    If x_{i+1} = a x_i (mod m) then x_{i+k} = a^k x_i (mod m)

It follows that a sequence of k (e.g. k = 64) consecutive random numbers Unif(0, 1) can
be computed directly from the same seed using simply the different multiplicative coefficients
a, a², ..., a^k. In parallel processor computers this can be done simultaneously.

16.2.4 Kahn’s method

The Kahn’s method is based on an approximate formula of the normal distribution:

    (1/√(2π)) e^{−½x²} ≃ k e^{−kx} / (1 + e^{−kx})² = f(x),

where the constant is k = √(8/π). The cumulative density and its inverse can be easily
calculated:

    F(x) = ∫_{−∞}^x f(t) dt = (1 + e^{−kx})⁻¹
    F⁻¹(u) = k⁻¹ log(u/(1 − u))

The generation algorithm is in all its simplicity as follows:

1◦ Generate u Unif(0, 1)

2◦ Compute x = √(π/8) log(u/(1 − u))
The method is fast and well applicable for example in the simulation of measurement errors
of physical quantities, if one does not require a precise Gaussian distribution.
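A C++ sketch of Kahn's generator (our own) is just a couple of lines:

#include <cstdlib>
#include <cmath>

// Kahn's approximate N(0,1): x = sqrt(pi/8) * log(u/(1-u)), u ~ Unif(0,1)
double rngaussKahn() {
    double u = ((double)rand()+1.0)/((double)RAND_MAX+2.0); // u strictly in (0,1)
    return std::sqrt(M_PI/8.0)*std::log(u/(1.0-u));
}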

16.3 Multidimensional Gaussian distribution
Variate: x⃗ = (x_1, ..., x_k), a vector in k-dimensional space
Parameters: µ⃗ = (µ_1, ..., µ_k), a vector in k-dimensional space, and V, a symmetric k × k
positive definite matrix (det V > 0)

Density function:

    f(x⃗) = (1/√((2π)^k det V)) exp[−½ (x⃗ − µ⃗)ᵀ V⁻¹ (x⃗ − µ⃗)]

Characteristic function: Φ(t⃗) = exp(i t⃗ · µ⃗ − ½ t⃗ · V t⃗)

The connection of the parameters µ⃗ and V to the characteristics of the density function is as
follows:

    µ⃗ = E(x⃗)    expectation value of x⃗
    v_{ij} = cov(x_i, x_j)    covariance matrix elements
    v_{ii} = σ²_{x_i}    diagonal elements are variances

The inverse V⁻¹ is often called the weight matrix.

Example 1: Gaussian distribution in the plane (k = 2)

According to the definition of the correlation ρ, one has v_12 = v_21 = cov(x_1, x_2) = ρσ_1σ_2, so
that V can be expressed as

    V = \begin{pmatrix} σ_1² & ρσ_1σ_2 \\ ρσ_1σ_2 & σ_2² \end{pmatrix}

from which we get the weight matrix:

    V⁻¹ = ((σ_1σ_2)⁻² / (1 − ρ²)) \begin{pmatrix} σ_2² & −ρσ_1σ_2 \\ −ρσ_1σ_2 & σ_1² \end{pmatrix}

A general 2-dimensional Gaussian density distribution expressed as a function of σ_1, σ_2 and
the correlation coefficient ρ is then:

Figure 13: Two-dimensional Gaussian density with correlation coefficient ρ = −0.7.

    f(x_1, x_2) = (1/(2πσ_1σ_2√(1−ρ²))) exp{ −(1/(2(1−ρ²))) [((x_1−µ_1)/σ_1)² − 2ρ((x_1−µ_1)/σ_1)((x_2−µ_2)/σ_2) + ((x_2−µ_2)/σ_2)²] }    (62)

Example 2: If the component variates x_1, x_2, ..., x_k are mutually independent, all off-diagonal
elements of the matrix V vanish, so V and V⁻¹ are diagonal matrices:

    v_{ij} = (V⁻¹)_{ij} = 0 when i ≠ j, and v_{ii} = σ_i²; (V⁻¹)_{ii} = σ_i⁻²

It follows that the corresponding density function factorizes for the components x_i:

    f(x⃗) = Π_{i=1}^k (1/(√(2π)σ_i)) e^{−½((x_i−µ_i)/σ_i)²}

Generation of multidimensional Gaussian

a) Uncorrelated case
Each component of the vector x⃗ is generated independently of the others: one generates u_i
from N(0, 1) and computes x_i = σ_i u_i + µ_i, i = 1, ..., k, resulting in a random vector x⃗.

b) Two-dimensional distribution with correlation

The density given in equation (62) can be generated with the following modified Box-Muller
method:

    x_1 = √(−2 log u_1) sin(2πu_2)
    x_2 = √(−2 log u_1) sin(2πu_2 − φ)

where the phase angle φ defines the correlation: cos φ = ρ. The validity of the method can
be proved e.g. using the following theorem:
Theorem: If x and y are independent Gaussian random variates:

    x : N(µ_x, σ_x²)
    y : N(µ_y, σ_y²)

then the variate z = ax + by is Gaussian N(aµ_x + bµ_y, a²σ_x² + b²σ_y²).

Proof: The theorem is proved with the aid of the characteristic function:

    Φ_z(t) = Φ_{ax+by}(t) = Φ_{ax}(t)Φ_{by}(t) = Φ_x(at)Φ_y(bt)
           = exp(iatµ_x − ½a²t²σ_x²) exp(ibtµ_y − ½b²t²σ_y²)
           = exp[it(aµ_x + bµ_y) − ½t²(a²σ_x² + b²σ_y²)].

We see that the characteristic function of z is that of a Gaussian whose mean and variance
are aµ_x + bµ_y and a²σ_x² + b²σ_y².

c) General case

Let the vector x⃗ be a general k-dimensional Gaussian variate and µ⃗ its mean vector:

    x⃗ = (x_1, ..., x_k) ;  µ⃗ = (µ_1, ..., µ_k)

Furthermore, let us assume that in addition to µ⃗ we know the covariance matrix V:

    V = E[(x⃗ − µ⃗)(x⃗ − µ⃗)ᵀ] = \begin{pmatrix} v_{11} & v_{12} & \cdots & v_{1k} \\ v_{21} & v_{22} & \cdots & v_{2k} \\ \vdots & \vdots & \ddots & \vdots \\ v_{k1} & v_{k2} & \cdots & v_{kk} \end{pmatrix}

The question is: How to generate vectors x⃗? In the following we present a method which is
based on the fact that the covariance matrix V can always be expressed as a product

    V = C Cᵀ.

This C-matrix is utilized in the generation algorithm, which goes as follows:

1◦ Generate u⃗ = (u_1, · · · , u_k), where the u_i are N(0, 1)

2◦ Compute x⃗ = µ⃗ + C u⃗

Then the vectors x⃗ are Gaussian N(µ⃗, V). The algorithm is proved as follows: At point
2◦ one computes x⃗ as a matrix product where every component of x⃗ is a linear combination of
standard normal variates u_i. From the above theorem it follows that the vector x⃗ is a normal
variate. Moreover, we see easily that these variates have the right mean µ⃗ and covariance
matrix V:

    E(x⃗) = E(µ⃗ + C u⃗) = E(µ⃗) + C E(u⃗) = µ⃗ + C 0⃗ = µ⃗
    E[(x⃗ − µ⃗)(x⃗ − µ⃗)ᵀ] = E[C u⃗ (C u⃗)ᵀ] = E[C u⃗ u⃗ᵀ Cᵀ] = C E(u⃗ u⃗ᵀ) Cᵀ = C Cᵀ = V.    (QED)

The expectation value of u⃗ is a null vector and the expectation value of u⃗ u⃗ᵀ a unit matrix,
because the u_i are independent N(0, 1).
A matrix C can be computed with the following recursive algorithm (so called Cholesky
factorization):

    do i = 1, k
        do j = 1, i − 1
            c_ij = (v_ij − Σ_{n=1}^{j−1} c_in c_jn) / c_jj
            c_ji = 0
        enddo
        c_ii = (v_ii − Σ_{n=1}^{i−1} c_in²)^{1/2}
    enddo

The resulting C-matrix is a triangular matrix: all the elements above the diagonal are zero.
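The following C++ sketch (our own; array dimensions fixed at compile time to k ≤ 10 for simplicity, and rngauss() assumed to return N(0, 1) variates, e.g. from the Box-Muller routine above) implements the Cholesky factorization and the generation steps 1◦-2◦:

#include <cmath>

const int KMAX = 10;
double rngauss();  // assumed standard normal generator

// Cholesky factorization V = C C^T of a positive definite k x k matrix;
// C comes out lower triangular.
void cholesky(int k, const double V[KMAX][KMAX], double C[KMAX][KMAX]) {
    for (int i = 0; i < k; i++) {
        for (int j = 0; j < i; j++) {
            double s = V[i][j];
            for (int n = 0; n < j; n++) s -= C[i][n]*C[j][n];
            C[i][j] = s/C[j][j];
            C[j][i] = 0;
        }
        double s = V[i][i];
        for (int n = 0; n < i; n++) s -= C[i][n]*C[i][n];
        C[i][i] = std::sqrt(s);
    }
}

// Generate x ~ N(mu, V): x = mu + C u with u a vector of N(0,1) variates.
void rnMultiGauss(int k, const double* mu, const double C[KMAX][KMAX], double* x) {
    double u[KMAX];
    for (int i = 0; i < k; i++) u[i] = rngauss();
    for (int i = 0; i < k; i++) {
        x[i] = mu[i];
        for (int j = 0; j <= i; j++) x[i] += C[i][j]*u[j];
    }
}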

16.4 Exponential distribution

The exponential distribution is important in physics phenomena. The density function is:

f (x) = λe−λx , x > 0.

The most important characterizing parameters are:

    Expectation: ⟨x⟩ = λ⁻¹    Skewness: γ_1 = 2
    Variance: σ_x² = λ⁻²      Kurtosis: γ_2 = 6

The characteristic function is: φ(t) = (1 − itλ⁻¹)⁻¹

In the following we consider a few examples related to exponential probability.

Example 1: Decay of a radioactive sample. The number of nuclear decays in a short time
interval is proportional to the length of the interval and to the number of nuclei:

    dN = −λN dt ⇒ ∫_{N_0}^N dN/N = −∫_0^t λ dt ⇒ log(N/N_0) = −λt

    N(t) = N_0 e^{−λt}

The parameter λ can be called the (mean) decay rate.
Example 2: Particle lifetime. The probability distribution of the lifetime is:

    f(t) = (1/τ_0) exp(−t/τ_0),

where τ_0 is the mean lifetime of the particle in its rest frame. The mean lifetime of a moving
particle is affected by the time dilatation, according to which its lifetime, measured in the
laboratory frame, gets longer: τ_0 → γτ_0. Here γ is the so called Lorentz factor:

    γ = 1/√(1 − (v/c)²)

The total energy of the particle is E = m = γm_0 (m = ’relativistic mass’) in units where the
speed of light is unity (c ≡ 1). Furthermore, the relationship between the particle relativistic
mass, momentum and energy is:

    p = mv = Ev ⇒ v = p/E

With these formulae we can derive the probability density function of the particle flight path
length ℓ:

    t = ℓ/v = (ℓ/c) · (E/p) ⟹ f(ℓ) = (p c τ_0/m_0)⁻¹ exp(−m_0 ℓ/(p c τ_0))

Generation of exponential

The cumulative distribution function and its inverse can be easily calculated:

    f(x) = λe^{−λx} ⇒ F(x) = ∫_0^x λe^{−λt} dt = 1 − e^{−λx}

    ⇒ F⁻¹(u) = −λ⁻¹ log(1 − u)

Hence one can use the method of the inverse distribution function, according to which the
generation algorithm is as follows:

1◦ Generate u from Unif(0, 1)

2◦ Compute x = −λ−1 log u

Simulation applications of the exponential are useful especially in radiative physics. A
free path of a particle, i.e. the distance traveled before any interaction, is a random variable
which follows the exponential independently of the type of interaction: decay, elastic or
inelastic collision with atoms, Compton scattering etc.

16.5 χ2 distribution

Variate: χ²_N, a positive real number
Parameter: N, a positive integer (’degrees of freedom’)

Density function:

    f(χ²_N) = ½ (χ²/2)^{N/2−1} e^{−χ²/2} / Γ(N/2)

Characteristic function: φ(t) = (1 − 2it)^{−N/2}

Figure 14: Chi-squared distribution for N=3

Notice that χ²_N does not mean the square of χ_N; it is just a notation for a variate which
follows the χ² distribution. Often one omits the parameter N in the notation: χ²_N → χ².
The Γ function, which appears in the normalization factor, has the following properties:

    Γ(n) = (n − 1)!    Γ(½) = √π    Γ(x) = (x − 1)Γ(x − 1)
    Γ(x) ≃ √(2π) x^{x−½} e^{−x} (1 + 0.0833/x)

The last (approximate) expression is the so called Stirling’s formula. Other characteristic
properties of the χ² distribution are:

    Mean: ⟨χ²⟩ = N          Skewness: γ_1 = √(8/N)
    Variance: σ²_{χ²} = 2N   Kurtosis: γ_2 = 12/N

Lemma: The sum of the squares of N independent standard normal variates x_i : N(0, 1)
follows the χ²_N distribution:

    S_N = Σ_{i=1}^N x_i²

Proof: We prove the lemma by using characteristic functions. Let x : N(0, 1). Then the
characteristic functions of x² and S_N are:

    φ_{x²}(t) = ∫_{−∞}^{∞} e^{itx²} (1/√(2π)) e^{−½x²} dx = (1/√(2π)) ∫_{−∞}^{∞} e^{−½x²(1−2it)} dx = (1 − 2it)^{−½}

    φ_{S_N}(t) = φ_{Σx_i²}(t) = Π_{i=1}^N φ_{x²}(t) = (1 − 2it)^{−N/2}

The characteristic function of the random variate S_N equals the characteristic function of χ²_N,
so the probability densities of S_N and χ²_N are the same. ∎
More generally: If x_i follows N(µ_i, σ_i²), then (x_i − µ_i)/σ_i follows N(0, 1). This implies that
the sum variate:

    χ²_N = Σ_{i=1}^N ((x_i − µ_i)/σ_i)²    (63)

follows the χ2 distribution. From this we can readily conclude on the basis of the central
limit theorem that the χ2N distribution behaves as

lim f (χ2 ) = N (N, 2N )


N →∞

for large N . It means that√the χ2N density function approaches Gaussian with the mean and
stanfard deviation N and 2N , respectively. The convergence is rather slow. Another vari-
ate, whose density distribution approaches the Gaussian faster, is obtained by the following
transformation of the χ2N variable:
p √
aχ2 = 2χ2 − 2N − 1
N

The density dostribution of this variate approaches N (0, 1).

Another way to get variates χ²_N is by summing up independent exponential variates. In the
following the y_i’s are exponentially distributed and x is N(0, 1):

1◦ N even:

    χ²_N = 2 Σ_{i=1}^{N/2} y_i ;  f(y_i) = e^{−y_i}    (64)

2◦ N odd:

    χ²_N = 2 Σ_{i=1}^{[N/2]} y_i + x² ;  f(x) = (1/√(2π)) e^{−½x²}    (65)

The notation [N/2] means the nearest integer below N/2.
Notice: f(χ²_2) = ½ e^{−½χ²}, so for the parameter N = 2 the χ² distribution is exponential.

Generation of the χ2 distribution

a) If N is large (≥ 20): from Gaussian


b) N ≤ 20: using the formulas (64-65):

1◦ Generate [N/2] random numbers from e−x .

2◦ Compute the sum (64).

3◦ If N is odd, generate x : N (0, 1) and add x2 to the sum.
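A C++ sketch of this recipe for N ≤ 20 (our own; rngauss() again assumed to return N(0, 1) variates) could be:

#include <cstdlib>
#include <cmath>

double rngauss();  // assumed standard normal generator (e.g. Box-Muller)

// Chi-squared variate with N degrees of freedom via formulas (64)-(65):
// a sum of [N/2] exponential variates, plus the square of one N(0,1)
// variate if N is odd.
double rnchi2(int N) {
    double sum = 0;
    for (int i = 0; i < N/2; i++) {
        double u = ((double)rand()+1.0)/((double)RAND_MAX+1.0); // (0,1]
        sum += -std::log(u);                                    // y ~ e^{-y}
    }
    sum *= 2.0;
    if (N % 2 == 1) { double g = rngauss(); sum += g*g; }
    return sum;
}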

The χ2 distribution is important in:

• statistical tests (e.g. random number generator tests)

• optimization of the parameters (’model fitting’) when checking the applicability of a


parameterized model

Example: Confidence Level (CL)

    CL = ∫_{χ²_fit}^{∞} f(χ²) dχ²

χ²_fit versus CL can be obtained from curves or tables. A very powerful way of testing is to
plot the so called χ² probability (see section 14.4).

16.6 Cauchy or Breit-Wigner distribution

The density function of the Cauchy distribution is of the form:

    f(x) = (1/(πb)) · 1/(1 + ((x − a)/b)²) = (b/π) · 1/(b² + (x − a)²),  b > 0.

The density function has two parameters, of which one (a) sets the mean and the other (b)
sets the width. In the following the Cauchy distribution is denoted as C(a, b). The
characteristic figures of the Cauchy distribution are:

    Mean: ⟨x⟩ = a    Variance: σ_x² = ∞ (!)

The variance is not defined, because the integral of the variance formula diverges. The same
is true for all higher order even central moments. All odd central moments vanish (=0),
which follows from the fact that the Cauchy distribution is symmetric around its mean a.
The characteristic function is:

φ(t) = exp(iat − b | t |).

Figure 15: Cauchy distribution (blue), a=0, b=1; standard normal for comparison

Though the standard deviation of the Cauchy distribution is = ∞, its half width at half
maximum (HWHM) can be calculated: HWHM = b.
The Cauchy distribution has the following property (same as the Gaussian N(µ, σ²)):

    x is C(a_1, b_1) and y is C(a_2, b_2)  ⇒  z = x + y is C(a_1 + a_2, b_1 + b_2)

In physics the Cauchy distribution appears in the resonance peak (Breit-Wigner):

    f(E) = (1/π) (Γ/2)/((Γ/2)² + (E − E_0)²) = C(E_0, Γ/2)

Here Γ is the width of the resonance and E_0 its energy at rest. When it comes to a resonance
particle, E_0 = ⟨E⟩ is the particle mass.

Example 1: The lifetime distribution of the resonance is given by the Fourier transformation
of the Breit-Wigner, E → t:

    f(t) ∝ |φ(t)|² = |exp(iat)|² · |exp(−b|t|)|² = exp(−Γ|t|).

It follows that the mean lifetime τ of the resonance particle is inversely proportional to the
width of the resonance peak: τ ∝ 1/Γ.

Example 2: A rotating straight line going through the point (a, b) (b > 0). When the angle φ
between the line and the x axis is a uniformly distributed random variate, then the coordinates
x of the intercept of the line and the x axis are distributed like Cauchy C(a, b) (exercise).
One of the applications of the Cauchy distribution is to look for errors in a simulation
program: using Cauchy (long tails) instead of Gaussian in the error generation, one gets
plenty of situations which would otherwise be rare. Then there is a larger chance for a
programming bug to show up.

Generation of Cauchy distribution

There are several methods:

a) As in Example 2 above: when the angle φ of a line through the point (a, b) is
Unif(−π/2, +π/2), the intercept x of the line with the x axis is C(a, b). Hence the algorithm
reads:

1◦ Generate u Unif(0, 1)

2◦ Compute x = a + b tan[π(u − ½)]

b) If u_1 and u_2 are independent and normal N(0, 1), then the ratio u = u_1/u_2 is C(0, 1). The
algorithm reads:

1◦ Generate u_1 and u_2 from N(0, 1)

2◦ Compute x = a + b u_1/u_2

c) If the points (x_1, x_2) in the plane are distributed uniformly in the unit circle, then the
random variate x = x_1/x_2 is C(0, 1), and the algorithm reads:

1◦ Generate u_1 and u_2 Unif(0, 1)

2◦ If (2u_1 − 1)² + (2u_2 − 1)² ≤ 1, then

3◦ x = a + b(u_1 − ½)/(u_2 − ½)
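Method a) is the shortest to code; a C++ sketch (our own) reads:

#include <cstdlib>
#include <cmath>

// Cauchy C(a,b) by method a): x = a + b*tan(pi*(u - 1/2))
double rncauchy(double a, double b) {
    double u = (double)rand()/((double)RAND_MAX+1.0);
    return a + b*std::tan(M_PI*(u-0.5));
}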

16.7 Landau distribution

In physics the Landau distribution is related to the motion of charged particles in a medium.
When a particle undergoes electro-magnetic interaction in the medium and ionizes atoms, it
loses energy. The energy loss, the so called ionization energy, per traveled unit distance
(keV/cm) is a random variate which follows the Landau distribution. The exact definition of
the distribution reads:

    Φ(λ) = (1/2πi) ∫_{c−i∞}^{c+i∞} exp(λs + s log s) ds    (c > 0).

This is a Laplace transformation of the function s^s. In practical applications the following
approximate expression is precise enough:

    Φ(λ) ≃ (1/√(2π)) e^{−½(λ + e^{−λ})}

where

    λ = λ̄ (E − E_p)/(E_m − E_p)
    E_p = most probable ionization energy
    E_m = average ionization energy ≃ 153.5 (z/β)² (Z/A) ρ X keV (the constant 153.5 is in units keV cm²/g)
    λ̄ ≃ 1.270

where z is the charge of the ionizing particle and β = v/c its velocity. The quantities Z, A
and ρ are the atomic number, the mass number and the density of the medium. X is the
distance over which the energy loss is measured.
A typical feature of the Landau distribution is its very long tail. In other words, random
fluctuations with E > E_p are highly probable. There is more information on the subject in
F. Sauli: Principles of operation of multiwire proportional and drift chambers, CERN Yellow
Report 77–09 (May 1977) (http://cdsweb.cern.ch/record/117989/files/Chapter01.pdf).

Generation of Landau distribution

We do not discuss here the theory of generation. If needed, one can use
existing generation programs. In the ROOT analysis toolkit the function is
TRandom::Landau(peak,halfWidth). In the picture below we use this function to produce
a typical histogram of the Landau distribution.

Figure 16: A random generated Landau distribution, peak=10, halfWidth=2

17 Monte Carlo integration

17.1 Basics of the MC integration

The MC method is simple and also efficient compared to other numerical methods when the
volume of integration is multi-dimensional.
Let us consider 1-dimensional integration. The conclusions drawn below are valid also for
n-dimensional integration:

    I = ∫_a^b g(x) dx    (66)

On the other hand, the expectation value of the function g(x) over the uniform distribution
Unif(a, b) is:

    E(g) = (1/(b − a)) ∫_a^b g(x) dx,    (67)

from which it follows that

    I = (b − a) E(g).
Let us assume that the values of the function g(x) are computed at random points x_i which
are distributed uniformly Unif(a, b). An estimate of g’s expectation value is then:

    E(g) ≃ (1/N) Σ_{i=1}^N g(x_i)

It follows that the integral (66) has the estimate:

    I ≃ I_N = ((b − a)/N) Σ_{i=1}^N g(x_i)    (68)

where x_i, i = 1, · · · , N, are random numbers whose distribution is Unif(a, b). The formula
(68) can immediately be generalized to multi-dimensional space:

    I = ∫···∫_A g(x⃗) dx⃗ ≃ I_N = (V_A/N) Σ_{i=1}^N g(x⃗_i)    (69)

where the x⃗_i are random vectors (points in space) uniformly distributed in the space of
integration A and V_A is the (hyper)volume of A. A is a (hyper)rectangle if the limits of
integration are constant.
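As a simple illustration, the following C++ sketch (our own) estimates a 1D integral with formula (68), together with the error estimate (b − a)σ_f/√N whose derivation follows in section 17.2:

#include <cstdlib>
#include <cmath>
#include <cstdio>

double myf(double x) { return std::sin(x); }   // integrand for the test below

// Crude MC estimate of I = int_a^b g(x) dx and its error estimate.
void mcIntegrate(double (*g)(double), double a, double b, int N,
                 double* I, double* dI) {
    double sum = 0, sum2 = 0;
    for (int i = 0; i < N; i++) {
        double x = a + (b-a)*(double)rand()/(double)RAND_MAX;
        double f = g(x);
        sum += f; sum2 += f*f;
    }
    double mean = sum/N;
    double var  = sum2/N - mean*mean;   // empirical variance of f
    *I  = (b-a)*mean;
    *dI = (b-a)*std::sqrt(var/N);
}

int main() {
    double I, dI;
    mcIntegrate(myf, 0.0, M_PI, 100000, &I, &dI);
    std::printf("I = %f +- %f (exact value 2)\n", I, dI);
}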

Example: Consider integration in the plane. In the formula (69) we have the volume (area)
V_A. A question arises: how to know or estimate V_A when the area has an arbitrary shape
whose value cannot be computed in closed form? In this case one can apply the Hit or Miss
method in the following way:
Generate uniformly pairs (x_1, x_2) in a rectangle [a_1, b_1] × [a_2, b_2] which contains the area A
(see figure). Out of N generations, M pairs hit the area A. The size of the area, V_A, is then
estimated as:

    V_A ≃ (M/N)(b_2 − a_2)(b_1 − a_1)

If the volume of the surrounding region is denoted as V̂_A, one gets more generally:

    ∫···∫_A g(x⃗) dx⃗ ≃ (V̂_A/N) Σ_{i=1}^M g(x⃗_i).    (70)

Often a suitable variable transformation

    r_k = r_k(x_1, · · · , x_n),  k = 1, · · · , n,

can transform the integration region A to a hypercube:

    0 ≤ r_k ≤ 1,  k = 1, · · · , n,

and the equation (69) transforms to the normal form:

    ∫_0^1 ··· ∫_0^1 f(r⃗) dr⃗ = lim_{N→∞} (1/N) Σ_{i=1}^N f(r⃗_i)    (71)

where f is the transformation of g:

    f(r⃗) = g(x⃗) |∂(x_1, · · · , x_n)/∂(r_1, · · · , r_n)| evaluated at x_k = x_k(r_1, · · · , r_n)

(see Appendix A.10: Variable transformation).

Example: General constant integration limits:

    I = ∫_{a_1}^{b_1} ··· ∫_{a_n}^{b_n} g(x⃗) dx⃗.

The required transformation is obtained by the equations:

    x_k = a_k + (b_k − a_k) r_k,  k = 1, · · · , n.

The Jacobian determinant of this transformation is simply Π_{k=1}^n (b_k − a_k), so we get

    ∫_{a_1}^{b_1} ··· ∫_{a_n}^{b_n} g(x⃗) dx⃗ = Π_{k=1}^n (b_k − a_k) ∫_0^1 ··· ∫_0^1 g[a⃗ + (b⃗ − a⃗) · r⃗] dr⃗.

In what follows we restrict ourselves to integrals of normal form in order to simplify the
notation.

17.2 Convergence and precision of the MC integration

Let us consider a 1-variable integral (for simplicity of notation; the formulae are valid for
multi-variable integrals):

    I = ∫_0^1 f(x) dx ≃ I_N = (1/N) Σ_{i=1}^N f(x_i),  0 < x_i < 1.    (72)

I_N is the sum of random variables and therefore approaches a Gaussian distribution
according to the central limit theorem. The variance of this distribution is:

    σ²_{I_N} = σ²((1/N) Σ_{i=1}^N f(x_i)) = (1/N²) Σ_{i=1}^N σ²(f(x_i)) = (1/N) σ_f².

We have derived an important result for the precision and the rate of convergence of the
(crude) Monte Carlo integration:

    var(I_N) = (V²/N) var(f),    (73)

where V is the total volume of the integration region. This means that I_N converges as N^{−1/2}:

    I = I_N ± (V/√N) σ_f.    (74)

We emphasize that the convergence (74) is independent of the dimension of the integration.
This is the basic reason for the good efficiency of the MC integration in multi-dimensional
space.
In the formula (74) there appears the notion of the variance of the function f within the range
of integration. In the following we explain this notion in more detail. Without loss of
generality we consider again the unit range [0,1].
The variance of the function f means the variance with respect to the uniform density. The
definition of variance readily gives the theoretical expression of var(f):

    var(f) ≡ σ_f² = ∫_0^1 (f(x) − ⟨f⟩)² · 1 · dx = ∫_0^1 (f(x) − I)² dx = ⟨f²⟩ − ⟨f⟩².    (75)

We see that the expression contains the value of the integral I, so this equation has only a
theoretical meaning from the point of view of MC integration. In practice the integral (75)
must also be computed with the MC method. According to the equation (75) we obtain the
following empirical expression for var(f):

    var(f) ≡ σ_f² ≃ (1/N) Σ_{i=1}^N f(x_i)² − [(1/N) Σ_{i=1}^N f(x_i)]².    (76)

If we do not know the volume of the integration region, we need to use the hit or miss method
as in the formula (70). The error estimate of I_N is then:

    ΔI_N = (V̂/N) [ Σ_{i=1}^M f(x_i)² − (1/N) (Σ_{i=1}^M f(x_i))² ]^{1/2},

where V̂ is the volume of the surrounding region, M the number of generations which hit the
integration region and N the number of generations in the whole surrounding volume V̂.
Example: Constant function f(x) = a, 0 < x < 1. Then ⟨f⟩ = a and the variance of f is:

\[ \sigma_f^2 = \int_0^1 (f - a)^2\,dx = \int_0^1 0 \cdot dx = 0. \]

The example shows that the precision of MC integration,

\[ \Delta I_N = \frac{\sigma_f}{\sqrt{N}}, \]

is the better the closer f is to a constant function. When var(f) is large (figure: a
strongly varying f(x) on [0,1] with mean I = ⟨f⟩), the largest contribution to the
integral comes from the region where f is large. Despite this, in crude MC the number of
generations in the region where the function is small is the same as in the region where
the function is large. Therefore points are 'wasted' in crude Monte Carlo.
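To make the preceding formulas concrete, the following standalone C++ sketch implements
crude Monte Carlo with the empirical error estimate of equations (74) and (76). The
integrand f(x) = eˣ on [0,1] (exact value e − 1) and the seed are arbitrary choices made
for this example only.

#include <cmath>
#include <iostream>
#include <random>

int main() {
    std::mt19937 rng(42);                               // fixed seed, arbitrary choice
    std::uniform_real_distribution<double> u(0.0, 1.0);

    const long N = 1000000;
    double sum = 0.0, sum2 = 0.0;
    for (long i = 0; i < N; ++i) {
        double f = std::exp(u(rng));                    // f(x_i), x_i ~ Unif(0,1)
        sum  += f;
        sum2 += f * f;
    }
    double IN   = sum / N;                              // crude MC estimate, eq. (72)
    double varf = sum2 / N - IN * IN;                   // empirical var(f), eq. (76)
    double err  = std::sqrt(varf / N);                  // sigma_f/sqrt(N), eq. (74), V = 1
    std::cout << "I = " << IN << " +- " << err
              << " (exact e-1 = " << std::exp(1.0) - 1.0 << ")" << std::endl;
    return 0;
}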

17.3 Methods to accelerate the convergence and precision

The method of MC integration in which the random points are generated uniformly in the
region of integration is called crude Monte Carlo. According to the formula

\[ \Delta I_N = \frac{\sigma_f}{\sqrt{N}} \]

the precision can be improved by

- accelerating the N^{-1/2} convergence
- reducing the variance.

Convergence can be accelerated by using quasi-random numbers which are customized for a
given problem. It can be shown that with quasi-random numbers it is possible to attain a
convergence which goes at best like ∝ N^{-1}. We do not discuss this option in detail, but
rather concentrate on methods of variance reduction. The most commonly used variance
reduction methods are:
methods are:
- stratified sampling (’kerrostettu otanta’ in Finnish)
- importance sampling
- using control variates
- using antithetic (negatively correlated) variates
which we consider in the following.

17.3.1 Stratified sampling

The stratified sampling ('kerrostettu otanta' in Finnish) is in a sense analogous to the
composition method (15.6). Let us consider the basic form (1D case):

\[ I = \int_0^1 f(x)\,dx. \]

In the method the integration region is divided into smaller intervals. Let the dividing
points be

\[ 0 = \alpha_0 < \alpha_1 < \cdots < \alpha_k = 1. \]

Then the integral I is written as a sum (figure: f(x) on [0,1] with the dividing points
α_0, α_1, ..., α_k on the x-axis):

\[ I = \sum_{j=1}^{k} \int_{\alpha_{j-1}}^{\alpha_j} f(x)\,dx. \]

When crude Monte Carlo is applied in each interval, the formula for stratified sampling
reads:

\[ I_N' = \sum_{j=1}^{k} \frac{\alpha_j - \alpha_{j-1}}{n_j} \sum_{i=1}^{n_j} f[\alpha_{j-1} + (\alpha_j - \alpha_{j-1}) r_{ij}], \tag{77} \]

where N = Σ_j n_j and the r_{ij} are random numbers Unif(0,1). In the following we study
the variance of the MC estimate I_N':

\[ \mathrm{var}(I_N') = \sum_{j=1}^{k} \left( \frac{\alpha_j - \alpha_{j-1}}{n_j} \right)^2 \sum_{i=1}^{n_j} \mathrm{var}_{\alpha_{j-1} < x < \alpha_j}(f) \]
\[ = \sum_{j=1}^{k} \left( \frac{\alpha_j - \alpha_{j-1}}{n_j} \right)^2 n_j \cdot \mathrm{var}_{\alpha_{j-1} < x < \alpha_j}(f) = \sum_{j=1}^{k} \frac{\alpha_j - \alpha_{j-1}}{n_j} \int_{\alpha_{j-1}}^{\alpha_j} (f - \langle f \rangle)^2\,dx. \]

It follows that the variance of I_N' is

\[ \mathrm{var}(I_N') = \sum_{j=1}^{k} \left\{ \frac{\alpha_j - \alpha_{j-1}}{n_j} \int_{\alpha_{j-1}}^{\alpha_j} f^2(x)\,dx - \frac{1}{n_j} \left[ \int_{\alpha_{j-1}}^{\alpha_j} f(x)\,dx \right]^2 \right\}. \tag{78} \]

With a suitable choice of n_j and α_j one has var(I_N) − var(I_N') > 0, which means that
the variance of I_N' is reduced in comparison with the crude MC estimate I_N.

Example: Uniform intervals α_j − α_{j−1} = 1/k and n_j = N/k. Then (see formula (73))

\[ \mathrm{var}(I_N) - \mathrm{var}(I_N') = \frac{k}{N} \sum_{j=1}^{k} \left[ \int_{\alpha_{j-1}}^{\alpha_j} f(x)\,dx \right]^2 - \frac{1}{N} \left[ \sum_{j=1}^{k} \int_{\alpha_{j-1}}^{\alpha_j} f(x)\,dx \right]^2. \]

One can show that this is always ≥ 0. For example, in the case where the range of
integration is divided in two, one gets:

\[ \mathrm{var}(I_N) - \mathrm{var}(I_N') = \frac{1}{N} \left[ \int_0^{1/2} f(x)\,dx - \int_{1/2}^{1} f(x)\,dx \right]^2 \ge 0. \]

An optimal division depends on the shape of the function. If the shape of the function is
not known, division into uniform intervals already gives a smaller variance. In the
previous example the numbers of generations n_j were all chosen equal. Obviously a better
result is obtained when the numbers n_j are set proportional to the integral of |f| over
each interval:

\[ n_j \propto \int_{\alpha_{j-1}}^{\alpha_j} |f(x)|\,dx. \]

This method can be used only iteratively, because in the beginning the integral of the
function |f| is unknown. In another method one adjusts the intervals α_j such that the
contribution to the total integral from each interval is roughly equal:

\[ \int_{\alpha_{j-1}}^{\alpha_j} |f(x)|\,dx \simeq \mathrm{const}. \]

In this case the numbers of generations n_j are set the same in each interval. The
stratified sampling is used for example in the Monte Carlo integration code Vegas
(http://en.wikipedia.org/wiki/VEGAS_algorithm).
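A minimal C++ sketch of formula (77) with uniform intervals α_j − α_{j−1} = 1/k and
n_j = N/k; the integrand f(x) = eˣ and the stratum counts are again chosen purely for
illustration.

#include <cmath>
#include <iostream>
#include <random>

int main() {
    std::mt19937 rng(42);
    std::uniform_real_distribution<double> u(0.0, 1.0);

    const int  k  = 100;                       // number of strata
    const long nj = 10000;                     // generations per stratum (N = k*nj)
    double IN = 0.0;
    for (int j = 0; j < k; ++j) {
        double a = double(j) / k, b = double(j + 1) / k;   // stratum [a, b]
        double sum = 0.0;
        for (long i = 0; i < nj; ++i)
            sum += std::exp(a + (b - a) * u(rng));         // f at a stratified point
        IN += (b - a) * sum / nj;              // contribution of stratum j, eq. (77)
    }
    std::cout << "I = " << IN << " (exact e-1 = " << std::exp(1.0) - 1.0 << ")"
              << std::endl;
    return 0;
}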

17.3.2 Importance sampling

The basic principle in importance sampling is to weight the integrand so that its
variance is as small as possible (figure: f(x) on [a, b], and below it the much flatter
ratio f(x)/g(x)). For this purpose let us consider the following identities:

\[ f(x) = \frac{f(x)}{g(x)} \cdot g(x) \quad \Rightarrow \quad f(x)\,dx = \frac{f(x)}{g(x)}\,dG(x), \]

where g is a density function which satisfies the conditions listed below and G is the
corresponding cumulative distribution.

The importance sampling corresponds to the change of variable r = G(x), so the integral of
the function f can be written:

\[ I = \int_a^b f(x)\,dx = \int_0^1 \frac{f(G^{-1}(r))}{g(G^{-1}(r))}\,dr. \tag{79} \]

The function g must satisfy the conditions:

• g(x) = dG/dx is a density function in the range (a, b) of integration

• G(x) can be inverted, or there is some other method available to generate from g

• the ratio f(x)/g(x) is as uniform as possible in the range (a, b).

The MC estimate of the integral (79) with importance sampling is then:

\[ I_N = \frac{1}{N} \sum_{i=1}^{N} \frac{f(x_i)}{g(x_i)}, \tag{80} \]

where the random points x_i must be generated from the density g. The non-uniform sampling
is compensated by weighting the terms of the sum by g^{-1}. If the inversion method can be
used, the formula (80) takes the form:

\[ I_N = \frac{1}{N} \sum_{i=1}^{N} \frac{f(G^{-1}(r_i))}{g(G^{-1}(r_i))}, \tag{81} \]

where the random numbers r_i are generated from Unif(0,1).


Example 1: If f(x) > 0 ∀x one could choose g(x) = c f(x) (c = constant), so that
f/g = 1/c is constant and var(f/g) = 0. This way the MC integration would be absolutely
precise. This is a useless observation, however, because in order to know G(x) we should
already know the integral ∫ f(x)dx!

Example 2: If f(x) resembles a Gaussian N(0, σ), one gets:

\[ \int_{-\infty}^{\infty} f(x)\,dx \simeq \frac{\sqrt{2\pi}\,\sigma}{N} \sum_{i=1}^{N} f(x_i)\,e^{+\frac{1}{2}\left(\frac{x_i}{\sigma}\right)^2}, \]

where the random points x_i are generated from the distribution N(0, σ).
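As a further illustration, the following sketch applies formula (81) to ∫₀¹ eˣ dx with the
density g(x) = (2/3)(1 + x). This choice of g is an assumption of the example, made
because eˣ ≈ 1 + x on [0,1] and because G(x) = (2x + x²)/3 is easily inverted:
G⁻¹(r) = √(1 + 3r) − 1. The quoted error is now the variance of f/g, which is much
smaller than the variance of f itself.

#include <cmath>
#include <iostream>
#include <random>

int main() {
    std::mt19937 rng(42);
    std::uniform_real_distribution<double> u(0.0, 1.0);

    const long N = 1000000;
    double sum = 0.0, sum2 = 0.0;
    for (long i = 0; i < N; ++i) {
        double r = u(rng);
        double x = std::sqrt(1.0 + 3.0 * r) - 1.0;        // x = G^{-1}(r), so x ~ g
        double w = std::exp(x) / ((2.0/3.0) * (1.0 + x)); // f(x)/g(x), eq. (80)
        sum  += w;
        sum2 += w * w;
    }
    double IN  = sum / N;
    double err = std::sqrt((sum2 / N - IN * IN) / N);     // error from var(f/g)
    std::cout << "I = " << IN << " +- " << err
              << " (exact e-1 = " << std::exp(1.0) - 1.0 << ")" << std::endl;
    return 0;
}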

17.3.3 Control variates

In this method the integral is divided into two terms:

\[ I = \int_0^1 f(x)\,dx = \int_0^1 [f(x) - g(x)]\,dx + \int_0^1 g(x)\,dx. \tag{82} \]

The problem is to find a suitable function g which fulfils the following conditions:

• var(f − g) < var(f)

• ∫₀¹ g(x) dx can be computed analytically.

The integral ∫(f − g) dx is computed with crude Monte Carlo. In comparison with
importance sampling this method has a number of advantages:

- zeros of g are not a problem

- no method to generate from g is needed

- g can have negative values.

The variance of the function f − g is

\[ \mathrm{var}(f - g) = \mathrm{var}(f) + \mathrm{var}(g) - 2\,\mathrm{cov}(f, g), \]

which is smaller than the variance of the original integrand var(f) if

\[ \mathrm{cov}(f, g) > \tfrac{1}{2}\,\mathrm{var}(g). \]

The condition for the method to work is a sufficiently large positive correlation between
the functions f and g.
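A sketch of the decomposition (82) for ∫₀¹ eˣ dx with the control variate g(x) = 1 + x,
whose integral 3/2 is known analytically; f and g are illustrative choices only. Note
that no generation method for g is required, only the pointwise difference f − g.

#include <cmath>
#include <iostream>
#include <random>

int main() {
    std::mt19937 rng(42);
    std::uniform_real_distribution<double> u(0.0, 1.0);

    const long N = 1000000;
    const double Ig = 1.5;                         // int_0^1 (1+x) dx, analytic
    double sum = 0.0, sum2 = 0.0;
    for (long i = 0; i < N; ++i) {
        double x = u(rng);
        double d = std::exp(x) - (1.0 + x);        // f(x) - g(x), small by construction
        sum  += d;
        sum2 += d * d;
    }
    double Id  = sum / N;                          // crude MC for int (f-g) dx
    double err = std::sqrt((sum2 / N - Id * Id) / N); // error comes from f-g only
    std::cout << "I = " << Id + Ig << " +- " << err
              << " (exact e-1 = " << std::exp(1.0) - 1.0 << ")" << std::endl;
    return 0;
}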

17.3.4 Antithetic variates

The crude Monte Carlo estimate of the integral I_N is divided into two parts:

\[ I_N = I_N' + I_N'' \tag{83} \]

where the covariance of the two parts is negative: cov(I_N', I_N'') < 0. Then the variance
of I_N becomes smaller, because

\[ \mathrm{var}(I_N) = \mathrm{var}(I_N') + \mathrm{var}(I_N'') + 2\,\mathrm{cov}(I_N', I_N''). \]

The negative correlation of the terms I_N' and I_N'' is obtained by using different
variates in their generation - so-called antithetic variates ('vastakkaisvariaatit' in
Finnish) u, ϕ(u):

1° u and ϕ(u) are Unif(0,1)

2° cov(u, ϕ(u)) < 0.

Example: Consider the variates u and 1 − u, where u is generated from Unif(0,1). These
variates provide maximal negative correlation.

Then it follows from the equation (83):

\[ I_N = \frac{1}{N} \sum_{i=1}^{N} \left[ \frac{1}{2} f(u_i) + \frac{1}{2} f(1 - u_i) \right]. \]

This gives a reduced variance provided that the integrand f is not symmetric
(f(u) ≠ f(1 − u)). The variance gets reduced, since if u_i hits a region where f is
small, then 1 − u_i hits a region where f is large (figure: f(x) on [0,1] with the
antithetic pair u and 1 − u marked on the x-axis). More generally, consider the estimate

\[ I_N = \frac{1}{N} \sum_{i=1}^{N} \{ \alpha f(\alpha u_i) + (1 - \alpha) f[1 - (1 - \alpha) u_i] \} \equiv \frac{1}{N} \sum_{i=1}^{N} F_\alpha(u_i). \tag{84} \]

The variance of (84) becomes minimized with a choice of α which minimizes the variance of
the function F_α:

\[ \mathrm{var}(F_\alpha) = \int_0^1 F_\alpha^2\,dx - \left[ \int_0^1 F_\alpha\,dx \right]^2 = \int_0^1 F_\alpha^2\,dx - I^2 \]
\[ = \alpha \int_0^\alpha f^2(x)\,dx + (1 - \alpha) \int_\alpha^1 f^2(x)\,dx + 2(1 - \alpha) \int_0^\alpha f(x)\,f[1 - (\alpha^{-1} - 1)x]\,dx - I^2. \]

This expression has a minimum in the range 0 < α < 1. The precise value of α is difficult
to determine, but a good approximation is given by the equation
f(α) = (1 − α)f(1) + αf(0). Notice that (84) is also an application of the stratified
sampling with divisions [0, α] and [α, 1], because 0 ≤ αu_i ≤ α and
α ≤ 1 − (1 − α)u_i ≤ 1.
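A sketch of the simple antithetic pairing u, 1 − u for ∫₀¹ eˣ dx; since this integrand is
monotonic it is far from symmetric, f(u) ≠ f(1 − u), so the pairing reduces the variance.
The integrand and seed are assumptions of the example.

#include <cmath>
#include <iostream>
#include <random>

int main() {
    std::mt19937 rng(42);
    std::uniform_real_distribution<double> u(0.0, 1.0);

    const long N = 500000;                          // N antithetic pairs
    double sum = 0.0, sum2 = 0.0;
    for (long i = 0; i < N; ++i) {
        double x = u(rng);
        double t = 0.5 * (std::exp(x) + std::exp(1.0 - x)); // average over u and 1-u
        sum  += t;
        sum2 += t * t;
    }
    double IN  = sum / N;
    double err = std::sqrt((sum2 / N - IN * IN) / N);   // pairs are independent samples
    std::cout << "I = " << IN << " +- " << err
              << " (exact e-1 = " << std::exp(1.0) - 1.0 << ")" << std::endl;
    return 0;
}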

17.4 Limits of integration in the MC method

Non-constant limits of integration can often cause difficulties in MC integration. As an
example let us consider the integral

\[ I = \int_{x=0}^{1} \int_{y=0}^{x} f(x, y)\,dy\,dx, \]

whose region of integration is the triangle below the diagonal of the unit square
(figure). A flowchart in which

1° generate a random number x_i from Unif[0,1],

2° generate a random number y_i from Unif[0,x_i] and

3° compute I_N = (1/N) Σ_{i=1}^{N} f(x_i, y_i)

gives a wrong answer, because the number of points is the same in each vertical band
(figure) and hence the density of points in the region of integration is not uniform.
Three different methods are presented below to compute the integral by the MC method. In
the formulae, a is the area of the triangle.

A. Rejection

1° generate 0 < x_i < 1
2° generate 0 < y_i < 1
3° if y_i > x_i, go back to 1°
4° I_N = (a/N) Σ_i f(x_i, y_i)

Choosing the points at random in the square and rejecting those above the diagonal.

B. Folding

1° generate 0 < r_1 < 1
2° generate 0 < r_2 < 1
3° x_i = max(r_1, r_2), y_i = min(r_1, r_2)
4° I_N = (a/N) Σ_i f(x_i, y_i)

Folding the points above the diagonal below the diagonal.

C. Weighting

1° generate 0 < x_i < 1
2° generate 0 < y_i < x_i
3° I_N = (1/N) Σ_i x_i f(x_i, y_i)

Compensating the non-uniformity of the points by weighting with the inverse of the
density (= 1/x).

The rejection method is inefficient, because 50% of the generations are lost. As a matter
of fact, method C is the same as importance sampling with the weighting function
g(x, y) = 1/x.
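The following sketch compares the folding method B and the weighting method C on the
triangle, using the illustrative integrand f(x, y) = x + y, for which the exact value of
the integral is 1/2; the integrand and seed are assumptions of this example.

#include <algorithm>
#include <iostream>
#include <random>

int main() {
    std::mt19937 rng(42);
    std::uniform_real_distribution<double> u(0.0, 1.0);
    auto f = [](double x, double y) { return x + y; };

    const long N = 1000000;
    const double a = 0.5;                    // area of the triangle
    double sumB = 0.0, sumC = 0.0;
    for (long i = 0; i < N; ++i) {
        // B. Folding: reflect points above the diagonal below it
        double r1 = u(rng), r2 = u(rng);
        sumB += f(std::max(r1, r2), std::min(r1, r2));
        // C. Weighting: y ~ Unif(0,x); compensate the density 1/x by the weight x
        double x = u(rng), y = x * u(rng);
        sumC += x * f(x, y);
    }
    std::cout << "Folding:   I = " << a * sumB / N << std::endl;
    std::cout << "Weighting: I = " << sumC / N << "  (exact 1/2)" << std::endl;
    return 0;
}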

17.5 Comparison of convergence with other methods

When the efficiency of MC integration is compared with other numerical methods, let us
first consider the situation in the 1D case. Other frequently used methods are the
trapezoid rule, Simpson's rule and Gauss's m-point rule. The precision of these methods
depends on the number of divisions.

Trapezoid rule (figure: one division of width h with sagitta s):

\[ \mathrm{Error} \propto s \propto h^2 \propto n^{-2} \;\Rightarrow\; \mathrm{convergence} \propto n^{-2}, \]

where h is the division width and n the number of divisions.

Simpson's rule:

\[ \int_a^b f(x)\,dx = \frac{h}{3}(f_0 + 4f_1 + 2f_2 + \cdots + 4f_{2n-1} + f_{2n}), \]

where h = (b − a)/(2n) and f_i = f(a + ih). Convergence is ∝ n^{-4}.

Gauss rule:

\[ \int_a^b f(x)\,dx = \sum_{i=1}^{k} g_m(x_{i-1}, x_i), \]

where g_m is the m-point rule estimate in the interval [x_{i−1}, x_i]. In this method the
convergence is ∝ n^{−(2m−1)} where n = k · m.

Comparison in N-dimensional space:

    Method        Convergence      Nb. of points
    Monte Carlo   n^{-1/2}         n
    Trapezoid     n^{-2/N}         n^N
    Simpson       n^{-4/N}         n^N
    Gauss         n^{-(2m-1)/N}    n^N

One can see that for numerical methods other than Monte Carlo, both the convergence and
the precision deteriorate quickly as the dimension N grows. Also the number of points
needed grows quickly as a function of N in the other numerical methods.
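As a quick 1D illustration of the table, the following sketch compares crude MC with
Simpson's rule on ∫₀¹ eˣ dx using the same number of function evaluations; the integrand
and the point counts are arbitrary choices of this example. In 1D Simpson wins easily;
the advantage of MC appears only in many dimensions.

#include <cmath>
#include <iostream>
#include <random>

int main() {
    const double exact = std::exp(1.0) - 1.0;

    // Simpson's rule with 2n divisions on [0,1]
    const int n = 50;
    const double h = 1.0 / (2 * n);
    double s = std::exp(0.0) + std::exp(1.0);          // endpoint terms f_0 + f_{2n}
    for (int i = 1; i < 2 * n; ++i)
        s += (i % 2 ? 4.0 : 2.0) * std::exp(i * h);    // interior weights 4,2,4,...
    double simpson = h / 3.0 * s;

    // Crude MC with the same number of function evaluations (2n+1)
    std::mt19937 rng(42);
    std::uniform_real_distribution<double> u(0.0, 1.0);
    const int N = 2 * n + 1;
    double sum = 0.0;
    for (int i = 0; i < N; ++i) sum += std::exp(u(rng));
    double mc = sum / N;

    std::cout << "Simpson error: " << std::fabs(simpson - exact)
              << ", MC error: " << std::fabs(mc - exact) << std::endl;
    return 0;
}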

18 Grid computing

18.1 Introduction

The Grid is a distributed architecture for delivering computing and data resources over the
internet. The word Grid is chosen by analogy with the electric power grid. According to this
analogy, the end user does not have to know or care where the computing is performed or the
data stored. The software tools (middleware) developed for the Grid are suitable for remote
instrumentation and virtual environments. Rapid development of new technologies enabled
the forming of the Grid: ”Scientific computing is getting more and more data intensive.
Simultaneously, the demand for applications that are easy to use grows as new branches of
science are taking computing methods into use. The development of networks has made it
possible to offer versatile information services regardless of their physical locations. In the
intersection of these trends a new concept has been created: the Grid.”
The Grid is applicable to tasks where widely distributed resources are available on a Wide Area
Network (WAN). The Grid problem can be defined as flexible, secure and manageable sharing
of the computing resources in a virtual organization (VO). This kind of an organization
consists of a dynamic group of individuals, institutions and resources. Grid projects are
multi-organizational.
A metacomputer is a collection of computers in different physical locations that can be used
like a local computer. In essence, the Grid concept is a logical continuation of the meta-
computer. The main difference between the metacomputer and Grid concepts is that, in a
metacomputer, the resources are usually connected using a Local Area Network. For a Grid,
the resources are distributed on a Wide Area Network. As a consequence, the data processing
and saving for a Grid (or at least their manageability) is genuinely distributed on a national,
international, and even global level.
The Grid empire is expanding fast, particularly across all borders - a lot of new Grid research
programs and initiatives are being established all over the world and many of those are grow-
ing rapidly. The overall goal of all these programs is to make Grid computing as universally
accepted and standardized as electricity is now.
In HEP one of the most challenging tasks is to build and maintain a data storage and analysis
infrastructure for the LHC. The LHC will produce roughly 15 Petabytes (1.5 × 10⁷ GBytes) of
data annually. Thousands of scientists around the world will access and analyse those data.
In addition, all the data needs to be available over the estimated 15-year lifetime of LHC.
The analysis of the data, including comparison with theoretical simulations, requires of the
order of 100,000 CPUs at 2004 measures of processing power. A novel globally distributed
model for data storage and analysis was chosen - a computing Grid.

18.2 Grid middleware

The Grid relies on advanced software called middleware, which ensures seamless communi-
cation between different computers and different parts of the world. Grid middleware refers
to the security, resource management, data access, instrumentation, policy, accounting, and
other services required for applications, users, and resource providers to operate effectively in
a Grid environment. Middleware acts as a sort of ’glue’ which binds these services together.
To use Grid middleware as a user, a valid Grid user certificate is needed.

The middleware can be in general categorized into site services and Virtual Organization (VO)
services. The site services consist of security, computing element providing Grid interfaces,
storage element providing Grid interfaces to site storage, monitoring and accounting services
for inspecting the status of Grid services, VO membership service, workload management
providing facilities to manage jobs, file catalogues for locating and accessing data, information
services for publishing and maintaining data, and file transfer services.
During the past few years, numerous Grid and Grid-like middleware products have emerged.
Examples include UNICORE, ARC, EDG/LCG-2/gLite, Globus, Condor, VDT and SRB.
They are capable of providing some of the fundamental Grid services, such as Grid job submis-
sion and management, Grid data management and Grid information services. Unfortunately
no widely accepted, implemented and usable standards exist.
The LHC computing Grid project (LCG) develops middleware based on Globus, Condor,
Virtual Data Toolkit and gLite; the middleware is EDG/LCG-2/gLite. The middleware of
the NorduGrid is ARC.

18.3 Usage of ARC

NorduGrid is the Grid of the Nordic countries. The web page for NorduGrid and ARC
middleware is http://www.nordugrid.org, where one can find for example the user guide
(NORDUGRID-MANUAL-6).
The client software is pre-installed on the korundi cluster. First one needs to log in to
the Grid: type

grid-proxy-init

and give the password. This creates a temporary token called a proxy. Grid services can
act only as long as the proxy is valid. You can print information about your proxy, such
as how long it is valid, by typing

grid-proxy-info
Submitting a job. Let us create a small job script which prints "Hello World!!", and
submit that to the Grid. To do that we need the job script and a job description file. In
the job description file one has the following lines (xRSL language):
File test.xrsl:

& (executable=test.csh)
(stdout=test.out)
(stderr=test.err)
(gmlog=gridlog)
(cputime=10)
(memory=32)
(disk=1)

To submit the job, type

ngsub -f test.xrsl

This will give you a jobid, which you can use for checking the job status:

ngstat gsiftp://...

If you want to specify the cluster where the job should run, submit your job with the
option -c:

ngsub -f test.xrsl -c korundi.grid.helsinki.fi

You can kill jobs with the command ngkill; for cleaning use ngclean.

18.4 Job monitoring

To monitor the progress of the job, use

ngcat gsiftp://...

This command works like the cat command. To retrieve the output files, use

ngget gsiftp://...

Files to be retrieved by ngget should be described in the xRSL file with the line
outputfiles. Cancelling a job is done with the ngkill command. A job can be killed at
practically any stage of processing through the Grid:

ngkill gsiftp://...
ngkill -all

Jobs can be resubmitted with the ngresub command. Only jobs where the gmlog attribute was
given in the xRSL file can be resubmitted, e.g.

(gmlog=myjob.log)

If a job fails, or you are not willing to retrieve the results for some reason, good
practice is not to wait for the Grid manager to clean up the job leftovers, but to use
ngclean to release the disk space:

ngclean gsiftp://...

If the user proxy expires while the job is still running, a new proxy can be uploaded
with the command

ngrenew gsiftp://...
Example: let us make a program which produces a rootfile, submit the job to the Grid and
retrieve the output (the rootfile).
The program used is "writing.C" in Exercises5.2. It writes Gaussian-distributed random
numbers into a tree, which is written into a rootfile.

#include "TTree.h"
#include "TFile.h"
#include "TRandom.h"

void writing(){
// Initialize random number generator with a seed
gRandom->SetSeed(123454321);
// Define the number of "events"
int N = 1000;

// Open new root file on disk
TFile* myRootFile = new TFile("tree.root","RECREATE");
if (!myRootFile) {
cout << "Error: cannot open file tree.root!" << endl;
return;
}

// Open root tree


TTree* myRootTree = new TTree("ExampleTree","tree");
if (!myRootTree) {
cout << "Error: cannot create a root tree!" << endl;
return;
}

// Create a branch for storing the values from the memory address
// of the variable f
float f;
myRootTree->Branch("gaussian",&f,"value/F");

// Fill random variables of default gaussian distribution to the root tree


for(int i = 0; i < N; i++){
f = gRandom->Gaus(); // mean=0, sigma=1
myRootTree->Fill();
}

// Write root tree, delete it from memory, and close it


myRootTree->Write();
myRootTree->Delete();
myRootFile->Close();
}

The problem now is the runtime libraries, which either need to be present on the Grid
worker node, or we must send the libraries along with the executable.
Since we can't be sure that the ROOT version we are using is available on the worker
node, let us choose the latter option and send the libraries with the executable.
The libraries are copied into a new directory libs, which is then tarred and gzipped.
Some files from $ROOTSYS/etc may also be needed, depending on the ROOT version used.
The job script needs to unpack the tarball. File test.job:

#!/bin/sh
export ROOTSYS=.
export LD_LIBRARY_PATH=${ROOTSYS}/libs
tar xfvz libs.tar.gz
./writing

For some reason the worker node accepted only bash/sh, so let us use that. The environment
variables ROOTSYS and LD_LIBRARY_PATH need to be set; without them the program crashes.
One should always test that the job script works before submitting it to the Grid.
The job xRSL file:

&(executable=test.job)

(executables=writing libCint.so libGpad.so


libHist.so libPhysics.so libRIO.so libCore.so
libGraf3d.so libMatrix.so libPostscript.so libTree.so
libdl.so libGraf.so libNet.so libRint.so)

(inputfiles=(test.job "")
(writing "")
(libs.tar.gz ""))
(outputfiles=("tree.root" ""))
(notify="e [email protected]")

The last line will send email when the program ends (the e in the string). The executable
"writing" could have been included in the tarball too, or all the files could have been
sent separately without any tarring.
So, after creating a Grid proxy, the program is submitted with

ngsub -f test.xrsl

the status is checked with

ngstat -a

and the result is retrieved with

ngget -a
File transfers. Simple file transfers can be made using interactive tools, such as
gsincftp. File transfers can also be initiated from the client side by ngsub or from the
server side by the grid-manager, as described in the job description.
The session directories are kept on the computing resources for a limited time only,
usually at least 24 h. Client machines are not necessarily connected to the Grid when the
jobs finish, so the grid-manager does not transfer the results directly back to the
client machine.
The job result files can also be transferred to a storage element. Storage elements are
persistent storage areas (file servers) which are continuously connected to the Grid. The
storage element can be set by the outputfiles xRSL command, like

(outputfiles=("test.out" "gsiftp://...))

If you would like to move a number of files, it is best to make a tarball of them and
transfer that.
As an example Storage Element we use the dCache server madhatter.csc.fi.
Some useful commands:

ls

• ngls gsiftp://madhatter.csc.fi/pnfs/csc.fi/data/cms/test

• srmls srm://madhatter.csc.fi:8443/pnfs/csc.fi/data/cms/test

• /opt/d-cache/srm/bin/srmls -srm_protocol_version=2 -server_mode=passive -streams_num=1
  srm://madhatter.csc.fi:8443/pnfs/csc.fi/data/cms/test

cp
Notice that any nonexistent subdirectory in the given SE path is automatically created,
and wildcards (like the star in test.*) are not supported. The output file name must be
given explicitly.

• srmcp file://$PWD/test.root srm://madhatter.csc.fi:8443/pnfs/csc.fi/data/cms/test/test.root

• ngcp file://$PWD/test.root srm://madhatter.csc.fi:8443/pnfs/csc.fi/data/cms/test/test.root

rm

• srmrm srm://madhatter.csc.fi:8443/pnfs/csc.fi/data/cms/test/test.root

• ngrm srm://madhatter.csc.fi:8443/pnfs/csc.fi/data/cms/test/test.root

ROOT (the data are readable without authentication only from korundi and jade):

• TFile* file = new TXNetFile("root://madhatter.csc.fi/pnfs/csc.fi/data/cms/test/test.root");
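For illustration, a hypothetical ROOT macro sketch for reading such a file back; the URL
is the example path above, and the tree and branch names are those of the earlier
writing.C example, so adjust them to whatever the remote file actually contains.

#include <iostream>
#include "TFile.h"
#include "TTree.h"
#include "TXNetFile.h"

void reading() {
    // Open the remote file via xrootd (example URL from above, assumed to exist)
    TFile* file = new TXNetFile("root://madhatter.csc.fi/pnfs/csc.fi/data/cms/test/test.root");
    if (!file || file->IsZombie()) {
        std::cout << "Error: cannot open remote file!" << std::endl;
        return;
    }
    // Assume the file holds the tree written by writing.C above
    TTree* tree = (TTree*)file->Get("ExampleTree");
    if (!tree) {
        std::cout << "Error: tree not found!" << std::endl;
        return;
    }
    float f;
    tree->SetBranchAddress("gaussian", &f);
    for (Long64_t i = 0; i < tree->GetEntries(); i++) {
        tree->GetEntry(i);
        // ... use the value f ...
    }
    file->Close();
}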

Index

Acroread, 11
Branching ratio, 51
C++, 22
    arrays, 26
    casts, 23
    classes, 27
    constructor, 30
    containers, 32
    data structures, 26
    destructor, 30
    expressions, 23
    functions, 25
    i/o, 28
    iterators, 32
    member functions, 26
    objects, 30
    pointers, 25
    polymorfism, 33
    statements, 24
Central limit theorem, 119
CMSSW, 69
    analysis example, 74
    dataset creation, 70
    environment setup, 69
    framework, 69
Covariance matrix, 88
    maximum likelihood, 102
Cross section, 51
    feynHiggs, 53
    hdecay, 52
    hqq, higlu etc, 53
    mcfm, 54
    number of events, 51
    pdf's, 54
    pythia, 51
CVS, 8
Display, 11
Error, 88
    matrix, 88
    propagation, 88
Event generators, 56
    herwig, 60
    others, 62
    pythia, 57
Event reconstruction, 76
    combined, 77
    global, 77
    local, 76
    particle identification, 78
    software, 80
    tagging, 79
    tracks, 77
    vertices, 77
Famos, 74
Fast simulation, 74, 82
    ecal response, 85
    hcal response, 85
    in CMS, 84
    muon response, 86
    performance, 82
    showers, 83
    tracker response, 85
    usage, 86
Fit quality, 99
    fit probability, 99
    pulls, 99
Fitting, 92
    least squares, 94
    maximum likelihood, 92, 100
Fortran, 15
    common block, 18
    examples, 19
    flow control, 16
    i/o, 17
    pythia, 19
    subprograms, 18
    variables, 15
Git, 9
Grid, 134
    ARC, 135
    certificate, 134
    commands, 138
    file transfer, 138
    gLite, 135
    proxy, 135
    submitting, 135
Jacobian, 91
LaTeX, 10
    BibTeX, 10
    feynMF, 10
    metafont, 11
Least Squares, 94
    constraints, 98
    convergence, 98
    error estimates, 95
Life time, 104
    half-life, 104
Makefile, 12
    functions, 13
    macros, 13
    make -f, 12
    rules, 12
Maximum Likelihood, 92
    gaussian statistics, 94
    poisson statistics, 100
Metacomputer, 134
Middleware, 134
Monte Carlo, 103
PDF, 53, 54
Perl, 5
Propagation, 88
    linear transformations, 88
    non-linear transformations, 90
Python, 6
ROOT, 38, 100
    canvas, 41
    graphs, 41
    histograms, 40
    likelihood, 101
    trees, 42
    types, 39
Trigger, 79
Unix, 1
    bash, 4
    csh, 3
