0% found this document useful (0 votes)

57 views93 pages

PAST: Free Palaeontological Stats Tool

This document summarizes a publication about the software program PAST (PAlaeontological STatistics). PAST is a free statistical program designed for use in paleontology and paleoecology. It includes functions commonly used in these fields such as ordination, morphometrics, cladistics, and biostratigraphy. The document provides information on installing PAST, entering and manipulating data in its spreadsheet interface, and selecting/editing data. It also briefly describes some of PAST's statistical functions and exporting graphics.

Uploaded by

alissonzkv

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

57 views93 pages

PAST: Free Palaeontological Stats Tool

Uploaded by

alissonzkv

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 93

See discussions, stats, and author profiles for this publication at: https://www.researchgate.

net/publication/228393561

PAST–Palaeontological statistics, ver. 1.89

Article in Palaeontologia Electronica · March 2009

CITATIONS READS

1,049 5,102

3 authors:

Oyvind Hammer David A.T. Harper

University of Oslo Durham University
220 PUBLICATIONS 29,659 CITATIONS 416 PUBLICATIONS 34,273 CITATIONS

SEE PROFILE SEE PROFILE

Paul D Ryan
National University of Ireland, Galway
77 PUBLICATIONS 28,310 CITATIONS

SEE PROFILE

Some of the authors of this publication are also working on these related projects:

NORLEX View project

Tectonic evolution of western Ireland View project

All content following this page was uploaded by Paul D Ryan on 03 June 2014.

The user has requested enhancement of the downloaded file.

PAST - PAlaeontological STatistics, ver. 1.89
Øyvind Hammer, D.A.T. Harper and P.D. Ryan

January 29, 2009

1 Introduction
Welcome to the PAST! This program is designed as a follow-up to PALSTAT, an
extensive package written by P.D. Ryan, D.A.T. Harper and J.S. Whalley (Ryan et
al. 1995). It includes many of the functions which are commonly used in palaeon-
tology and palaeoecology.
These days, a number of large and very good statistical systems exist, including
SPSS, SAS and extensions to Excel. Why yet another statistics program?

• PAST is free.

• PAST is tailor-made for palaeontology. This means that it includes func-

tions which are not found in off-the-shelf programs (for example cladistics,
ordination, morphometry and biostratigraphy), and that it does not include
functions which are of little use to palaeontologists and that only make the
user interface more confusing.

• PAST is easy to use, and therefore well suited for introductory courses in
quantitative palaeontology.

• PAST comes with a number of example data sets, case studies and exercises,
making it a complete educational package.

Further explanations of many of the techniques implemented together with case

histories are located in Harper (1999).
If you have questions, bug reports, suggestions for improvements or other com-
ments, we would be happy to hear from you. Contact us at [email protected].
The PAST home page is

http://folk.uio.no/ohammer/past

1
2 Installation
The basic installation of PAST is easy: Just download the file ’Past.exe’ and put
it anywhere on your hard disk. Double-clicking the file will start the program.
The data files for the case studies can be downloaded separately, or together in
the packed file ’casefiles.zip’. This file must be unpacked with a program such as
WinZip.

We suggest you make a folder called ’past’ anywhere on your hard disk, and
put all the files in this folder.

Please note: Problems have been reported for some combinations of screen
resolution and default font size in Windows - the layout becomes ugly and it may
be necessary for the user to increase the sizes of windows in order to see all the text
and buttons. If this happens, please set the font size to ’Small fonts’ in the Screen
control panel in Windows. We are working on solving this problem.
PAST also seems to have problems with some printers. Postscript printers work
fine.
When you exit PAST, a file called ’pastsetup’will be automatically placed in
your personal folder (for example ’My Documents’ in Windows 95/98), containing
the last used file directories.

2
3 Entering and manipulating data
PAST has a spreadsheet-like user interface. Data are entered as an array of cells,
organized in rows (horizontally) and columns (vertically).

Entering data
To input data in a cell, click on the cell with the mouse and type in the data. This
can only be done when the program is in the ’Edit mode’. To select edit mode, tick
the box above the array. When edit mode is off, the array is locked and the data
cannot be changed. The cells can also be navigated using the arrow keys.
Any text can be entered in the cells, but almost all functions will expect num-
bers. Both comma (,) and decimal point (.) are accepted as decimal separators.
Genetic sequence data are coded using C, A, G, T and U (lowercase also ac-
cepted).
Absence/presence data are coded as 0 or 1, respectively. Any other positive
number will be interpreted as presence. Absence/presence-matrices can be shown
with black squares for presences by ticking the ’Square mode’ box above the array.
Missing data are coded with question marks (’?’) or the value -1. Unless
support for missing data is specifically stated in the documentation for a function,
the function will not handle missing data correctly, so be careful.
The convention in PAST is that items occupy rows, and variables columns.
Three brachiopod individuals might therefore occupy rows 1, 2 and 3, with their
lengths and widths in columns A and B. Cluster analysis will always cluster items,
that is rows. For Q-mode analysis of associations, samples (sites) should there-
fore be entered in rows, while taxa (species) are in columns. For switching be-
tween Q-mode and R-mode, rows and columns can easily be interchanged using
the Transpose operation.

Counter
A counter function is available in the Edit menu for use e.g. at the microscope when
counting microfossils of different taxa. A single row (sample) must be selected.
The counter window will open with a number of counters, one for each selected
column (taxon). The counters will be initialized with the column labels and any
counts already present in the spreadsheet. When closing the counter window, the
spreadsheet values will be updated.
Count up (+) or down (-) with the mouse, or up with the keys 1-9, 0 and a-z
(only the first 36 counters). The bars represent relative abundance. A log of events
is given at the far right - scroll up and down with mouse or arrow keys inside the
text. An optional auditive feedback has a specific pitch for each counter.

3
Selecting areas
Most operations in PAST are carried only out on the area of the array which you
have selected (marked). If you try to run a function which expects data, and no
area has been selected, you will get an error message.

• A row is selected by clicking on the row label (leftmost column).

• A column is selected by clicking on the column label (top row).

• Multiple rows are selected by selecting the first row label, then shift-clicking
(clicking with the Shift key down) on the additional row labels. Note that
you can not ’drag out’ multiple rows - this will instead move the first row
(see below).

• Multiple columns are similarly marked by shift-clicking the additional col-

umn labels.

• The whole array can be selected by clicking the upper left corner of the array
(the empty grey cell) or by choosing ’Select all’ in the Edit menu.

• Smaller areas within the array can be selected by ’dragging out’ the area, but
this only works when ’Edit mode’ is off.

Renaming rows and columns

When PAST starts, rows are numbered from 1 to 99 and columns are labelled A to
Z. For your own reference, and for proper labelling of graphs, you should give the
rows and columns more descriptive but short names. Choose ’Rename columns’
or ’Rename rows’ in the Edit menu. You must select the whole array, or a smaller
area as appropriate.
Another way is to select the ’Edit labels’ option above the spreadsheet. The
first row and column are now editable in the same way as the rest of the cells.

Increasing the size of the array

By default, PAST has 99 rows and 26 columns. If you should need more, you
can add rows or columns by choosing ’Insert more rows’ or ’Insert more columns’
in the Edit menu. Rows/columns will be inserted after the marked area, or at the
bottom/right if no area is selected. When loading large data files, rows and/or
columns are added automatically as needed.

Moving a row or a column

A row or a column (including its label) can be moved simply by clicking on the
label and dragging to the new position.

4
Cut, copy, paste
The cut, copy and paste functions are found in the Edit menu. Note that you can
cut/copy data from the PAST spreadsheet and paste into other programs, for ex-
ample Word and Excel. Likewise, data from other programs can be pasted into
PAST.
Remember that local blocks of data (not all rows or columns) can only be
marked when ’Edit mode’ is off.
All modules giving graphic output have a ’Copy graphic’ button. This will
place the graphical image into the paste buffer for pasting into other programs, such
as a drawing program for editing the image. Note that graphics are copied using
the ’Enhanced Metafile Format’ in Windows. This allows editing of individual
image elements in other programs. When pasting into Coreldraw, you have to
choose ’Paste special’ in the Edit menu, and then choose ’Enhanced metafile’.
Some programs may have idiosyncratic ways of interpreting EMF images - beware
of strange things happening.

Remove
The remove function (Edit menu) allows you to remove selected row(s) or col-
umn(s) from the spreadsheet. The removed area is not copied to the paste buffer.

Grouping (colouring) rows

Selected rows (data points) can be tagged with one of 16 attractive colors using the
’Row color/symbol’ option in the Edit menu. Each group is also associated with a
symbol (dot, cross, square, diamond, plus, circle, triangle, line, bar, filled square,
star, oval, filled triangle, inverted triangle, filled inverted triangle, filled diamond).
This is useful for showing different groups of data in plots, and is also required by
a number of analysis methods.
Important: For methods that require groupings of rows using colors, rows be-
longing to one group must be consecutive. If more than 16 groups are required,
colors can be re-used.
The ’Numbers to colors’ option in the Edit menu allows the numbers 1-16 in
one selected column to set corresponding colours (symbols) for the rows.

Selecting datatypes for columns

Selected columns can be tagged with a datatype (continuous/unspecified, ordinal,
nominal or binary) using the ’Column data types’ option in the Edit menu. This
is only required if you wish to use mixed similarity/distance measures (see Multi-
variate statistics).

5
Transpose
The Transpose function, in the Edit menu, will interchange rows and columns. This
is used for switching between R mode and Q mode in cluster analysis, principal
components analysis and seriation.

Grouped columns to multivar

Converts from a format with multivariate items presented in consecutive groups of
N columns to the PAST format with one item per row and all variates along the
columns. For N = 2, two specimens and four variables a − d, the conversion is
from
a1 b1 a2 b2
c1 d1 c2 d2
to
a1 b1 c1 d1
a2 b2 c2 d2

Grouped rows to multivar

Converts from a format with multivariate items presented in consecutive groups
of N rows to the PAST format with one item per row and all variates along the
columns. For N = 2, two specimens and four variables a − d), the conversion is
from
a1 b1
c1 d1
a2 b2
c2 d2
to
a1 b1 c1 d1
a2 b2 c2 d2

Stack colored rows into columns

Stacks colored groups horizontally along columns. This can be useful e.g. for
performing univariate statistics on pairs of columns across groups.

Samples to events (UA to RASC)

Given a data matrix of occurrences of taxa in a number of samples in a number
of sections, as used by the Unitary Associations module, this function will convert
each section to a single row with orders of events (FADs, LADs or both) as ex-
pected by the Ranking-Scaling module. Tied events (in the same sample) will be
given equal ranking.

6
Events to samples (RASC to UA)
Expects a data matrix with sections/wells in rows, and taxa in columns, with FAD
and LAD values in alternating columns (i.e. two columns per taxon). Converts to
the Unitary Associations presence/absence format with sections in groups of rows,
samples in rows and taxa in columns.

Loading and saving data

The ’Open’ function is found in the File menu. You can also drag a file from
the desktop onto the PAST window. PAST uses an ASCII file format, for easy
importing from other programs (e.g. Word) and easy editing in a text editor. The
format is as follows:

. columnlabel columnlabel columnlabel

rowlabel data data data
rowlabel data data data
rowlabel data data data

Empty cells (like the top left cell) are coded with a full stop (.). Cells are
separated by white space, which means that you must never use spaces in row or
column labels. ’Oxford Clay’ is thus an illegal column label which would confuse
the program.
If any rows have been assigned a colour other than black, the row labels in
the file will start with an underscore, a number from 0 to 15 identifying the colour
(symbol), and another underscore.
If any columns have been assigned a datatype other than continuous/unspecified,
the column labels in the file will similarly start with an underscore, a number
from 0-3 identifying the datatype (0=continuous/unspecified, 1=ordinal, 2=nom-
inal, 3=binary), and another underscore.
In addition to this format, PAST can also detect and open files in the following
formats:

• Excel format (only the first worksheet).

• Nexus format (see below), popular in systematics.

• TPS format developed by Rohlf (the landmark, outlines, curves, id, scale and
comment fields are supported, other fields are ignored).

• BioGraph format for biostratigraphy (SAMPLES or DATUM format). If

a second file with the same name but extension ".dct" is found, it will be
included as a BioGraph dictionary.

• RASC format for biostratigraphy. You must open the .DAT file, and the
program expects corresponding .DIC and .DEP files in the same directory.

7
• CONOP format for biostratigraphy. You must open the .DAT file (log file),
and the program expects corresponding .EVT (event) and .SCT (section) files
in the same directory.

The ’Insert from file’ function is useful for concatenating data sets. The loaded
file will be inserted into your existing spreadsheet at the selected position (upper
left). Other data sets can thus be inserted both to the right of and below your
existing data.

Data from Excel

Data from Excel can be imported in two ways:

• Copy from Excel and paste into PAST. Note that if you want the first row
and column to be copied into the label cells in PAST, you need to switch on
the "Edit labels" option.

• Open the Excel file from PAST. The "Edit labels" option operates in the same
way.

• Make sure that the top left cell in Excel contains a single dot (.) and save as
tab-separated text in Excel. The resulting text file can be opened directly in
PAST.

Reading and writing Nexus files

The Nexus file format is used by many cladistics programs. PAST can read and
write the Data (character matrix) block of the Nexus format. Interleaved data are
supported. Also, if you have performed a parsimony analysis and the ’Parsimony
analysis’ window is open, all shortest trees will be written to the Nexus file for
further processing in other programs (e.g. MacClade or Paup). Note that not all
Nexus options are presently supported.

8
4 Transforming your data
These routines subject your data to mathematical operations. This can be useful for
bringing out features in your data, or as a necessary preprocessing step for some
types of analysis.

Logarithm
The Log function in the Transform menu log-transforms your data using the base-
10 logarithm. If the data contain zero or negative values, it may be necessary to
add a constant (e.g. 1) before log-transforming (use Evaluate Expression x + 1).
This is useful, for example, to compare your sample to a log-normal distribu-
tion or for fitting to an exponential model. Also, abundance data with a few very
dominant taxa may be log-transformed in order to downweight those taxa.
Missing data (?) supported.

Subtract mean
This function subtracts the column mean from each of the selected columns. The
means cannot be computed row-wise.

Remove trend
This function removes any linear trend from a data set (two columns with X-Y
pairs, or one column with Y values). This is done by subtraction of a linear re-
gression line from the Y values. Removing the trend can sometimes be a useful
operation prior to spectral analysis.

Row percentage
All values converted to the percentage of the row sum.
Missing values (?) supported.

Row normalize length

All values divided by the Euclidean length of the row vector.
Missing values (?) supported.

Abundance to presence/absence
All positive (non-zero) numbers are replaced with ones.
Missing values (?) supported.

9
Procrustes and Bookstein coordinates, Normalize size, Burnaby size
removal
For description of these functions, see ’Geometrical analysis’.

Sort ascending and descending

Sorts the rows in the marked area, based on values in the first data column.
The ’Sort descending’ function is useful, for example, to plot taxon abundances
against their ranks (this can also be done with the Abundance Model module).

Sort on color
Sorts the rows in the marked area on color.

Column difference
Simply subtracts two selected columns, and places the result in the next column.

Evaluate expression
This powerful feature allows flexible mathematical operations on the selected ar-
ray of data. Each selected cell is evaluated, and the result replaces the previous
contents. A mathematical expression must be entered, which can include any of
the operators +, -, *, /, ˆ(power), and mod (modulo). Also supported are brackets (),
and the functions abs, atan, cos, sin, exp, ln, sqrt, sqr, round and trunc.
The following variables can also be used:

• x (the contents of the current cell)

• l (the cell to the left if it exists, otherwise 0)

• r (the cell to the right)

• u (the cell above, or up)

• d (the cell below, or down)

• mean (the mean value of the current column)

• min (the minimum value)

• max (the maximum value)

• n (the number of cells in the column)

• i (the row index)

• j (the column index)

10
• random (uniform random number from 0 to 1)

• normal (Gaussian random number with mean 0 and variance 1)

• integral (running sum of the current column)

• stdev (standard deviation of the current column)

• sum (total sum of the current column)

Examples:
sqrt(x) Replaces all numbers with their square roots
(x-mean)/stdev Mean and standard deviation normalization, column-wise
x-0.5*(max+min) Centers the values around zero
(u+x+d)/3 Three-point moving average smoothing
x-u First-order difference
i Fills the column with the row numbers (requires non-empty cells, such as all zeros
sin(2*3.14159*i/n) Generates one period of a sine function down a column (requires non-empty cells)
5*normal+10 Normally distributed random number, mean 10 and standard deviation 5
Missing values (?) supported.

11
5 Plotting functions
Graph
Plots one or more columns as separate graphs. The x coordinates are set auto-
matically to 1,2,3,... There are three plot styles available: Graph (lines), bars and
points. The ’X labels’ options sets the x axis labels to the appropriate row names.

XY graph
Plots one or more pairs of columns containing x/y coordinate pairs. The ’log Y’
option log-transforms your Y values (if necessary, a constant is added to make the
minimum log value equal to 0). The curve can also be smoothed using 3-point
moving average.
95 percent confidence ellipses can be plotted in most scatter plots in PAST,
such as scores for PCA, CA, DCA, PCO, NMDS, and relative and partial warps.
The calculation of these ellipses assumes a bivariate normal distribution.
Convex hulls can also be drawn in the scatter plots, in order to show the areas
occupied by points of different ’colours’. The convex hull is the smallest convex
polygon containing all points.
The minimal spanning tree is the set of lines with minimal total length, con-
necting all points. In the XY graph module, Euclidean lengths in 2D are used.
Hold the mouse cursor over a point to see its row label.

XY graph with error bars

As XY graph, but expects four columns (or a multiple), with x, y, x error and y
error values. Symmetric error bars are drawn around each point. If an error value
is set to zero, the corresponding error bar is not drawn.

Histogram
Plots histograms (frequency distributions) for one or more columns. The number
of bins is set to an "optimal" number (the zero-stage rule of Wand 1996), but can
be changed by the user. The "Fit normal" option draws a graph with a fitted normal
distribution (Parametric estimation, not Least Squares).
Kernel Density Estimation is a smooth estimator of the histogram. PAST uses
a Gaussian kernel with range according to the rule given by Silverman (1986):
0.9 min(s, IQ/1.34)n−1/5 .

Percentiles
Plots the (cumulative) number of points smaller than each percentage of points.
Two popular methods are included. For a percentile p, the rank is computed ac-
cording to k = p(n + 1)/100, and the value that corresponds to that rank taken. In

12
the rounding method, k is rounded to the nearest integer, while in the interpolation
method, non-integer ranks are handled by interpolation between the two nearest
ranks.

Box plot
Box plot for one or several columns (samples) of univariate data. For each sample,
the 25-75 percent quartiles are drawn using a box. The median is shown with a
horizontal line inside the box. The minimal and maximal values are shown with
short horizontal lines (’whiskers’).
If the "Outliers" box is ticked, another box plot convention is used. The whiskers
are drawn from the top of the box up to the largest data point less than 1.5 times
the box height from the box (the "upper inner fence"), and similarly below the box.
Values outside the inner fences are shown as circles, values further than 3 times the
box height from the box (the "outer fences") are shown as stars.

Ternary
Ternary plot for three columns of data, normally containing proportions of compo-
sitions.

Bubble plot
An attempt at plotting 3D data (three columns) by showing the third axis as size
of disks. Negative values are not shown. Select "Subtract min" to subtract the
smallest third axis value from all values - this will force the data to be positive.
The "Size" slider scales the bubbles relative to unit radius on the x axis scale.

Survivorship
Survivorship curves for one or more columns of data. The data will normally con-
sist of age or size values. A survivorship plot shows the number of individuals
which survived to different ages. Assuming exponential growth (highly question-
able!), size should be log-transformed to age. This can be done either in the Trans-
form menu, or directly in the Survivorship dialogue.

Landmark plot
This function is very similar to the ’XY graph’, the only difference being that all
XY pairs on each row are plotted with the appropriate row colour and symbol. It is
well suited for plotting landmark data.

13
Landmarks 3D
Plotting of points in 3D (XYZ triples). Especially suited for 3D landmark data, but
can also be used e.g. for PCA scatter plots along three principal components. The
point cloud can be rotated around the x and the y axes (note: left-handed coordinate
system). The ’Perspective’ slider is normally not used. The ’Stems’ option draws
a line from each point down to a bottom plane, which can sometimes enhance
3D information. ’Lines’ draws lines between consecutive landmarks within each
separate specimen (row). ’Axes’ shows the three coordinate axes with the centroid
of the points as the origin.

Normal probability plot

Plots a normal probability (normal QQ) plot for one column of data. A normal
distribution will plot on a straight line. For comparison, an RMA regression line is
given, together with the Probability Plot Correlation Coefficient.

Matrix
Two-dimensional plot of the data matrix, using a grayscale with white for lowest
value, black for highest, or a colour scale with red at maximum. Can be useful to
get an overview over a large data matrix.

Surface
Three-dimensional landscape plot of a data matrix containing elevation values.
Colors can be assigned according to height, or gray-shaded using a fixed light
source.

14
6 Basic statistics
Univariate statistics
Typical application Assumptions Data needed
Quick statistical description None, but variance and One or more columns of
of a univariate sample standard deviation are most measured or counted data
meaningful for normally
distributed data

Displays the following statistics: Number of entries (N), smallest value (Min),
largest value (Max), sum, mean value (Mean), standard error of the estimate of the
mean (Std. error), population variance (that is, the variance of the population es-
timated from the sample), population standard deviation (square root of variance),
median, skewness (positive for a tail to the right), kurtosis (positive for a peaked
distribution) and geometric mean.
Missing data (?) is supported.

Comparing data sets

There are many different standard tests available for comparing two distributions.
Here is the standard disclaimer: You can never prove that two distributions are
the same. A high probability value is only consistent with a similar distribution,
but does of course give an indication of the similarity between the two sample
distributions. On the other hand, a very low probability value does show, to the
given level of significance, that the distributions are different.

Normality
Typical application Assumptions Data needed
Testing for normal distribu- Minimum 3, maximum Single column of measured
tion of a sample 5000 data points or counted data

Testing a null hypothesis of univariate normal distribution, using two methods.

The Shapiro-Wilk test is designed for samples with 3≤N ≤5000. The Jarque-Bera
test uses a test statistic JB combining skewness and kurtosis measures. It is proba-
bly inferior to Shapiro-Wilk, especially for small samples.
The chi-squared test (using four bins) is generally inferior to the Shapiro-Wilk
test, and should only be used for large samples (N>30). See Brown & Rothery
(1993) or Davis (1986) for details.
Missing data (?) is supported.

15
F and t tests (two samples)
Typical application Assumptions Data needed
Testing for equality of the Normal or almost normal Two columns of measured
variances and means of two distribution (apart from the or counted data
samples permutation test)

Two columns must be selected. The F test compares the variances of two
distributions, while the t test compares their means. The F and t statistics, and the
probabilities that the variances and means of the parent populations are the same,
are given. The F and t tests should only be used if you have reason to believe that
the parent populations are close to normally distributed. The Shapiro-Wilk test for
one distribution against a normal distribution can give you an idea about this.
Also, the t test is really only applicable when the variances are the same. So
if the F test says otherwise, you should be cautious about the t test. An unequal
variance t statistic (Welch test) is also given, which should be used in this case.
The 95 percent confidence intervals for the means are calculated using the t
distribution.
The permutation t test compares the observed t statistic (normalized difference
between means) with the t statistics from 10,000 (can be changed by the user)
random pairs of replicates from the pooled data set. This test will be more accurate
than the normal t test for non-normal distributions and small samples.
Sometimes publications give not the data, but values for sample size, mean and
variance for two populations. These can be entered manually using the ’F and t
from parameters’ option in the menu.
See Brown & Rothery (1993) or Davis (1986) for details.
Missing data (?) are supported by deletion.

How do I test lognormal distributions?

All of the above tests apply to lognormal distributions as well. All you need to do
is to transform your data first, by taking the log transform in the Transform menu.
You might want to ’backup’ your data column first, using Copy, and then get your
original column back using Paste.

t test (one sample)

Typical application Assumptions Data needed
Testing whether the mean of Normal or almost normal One column of measured
a sample is equal to a given distribution data
value

The one-sample t test is used to investigate whether the sample is likely to have
been taken from a population with a given (theoretical) mean.

16
Paired t test. Say that a measurement such as length of claw has been taken
on the left and right side of a number of crab specimens, and we want to test for
directed asymmetry (difference between left and right). A two-sample t test is not
appropriate, because the values are not independent. Instead, we can perform a
one-sample t test of left minus right against the value zero.
The 95 percent confidence interval for the mean is calculated using the t distri-
bution.
Missing data (?) are supported by deletion.

Paired tests (two samples)

Typical application Assumptions Data needed
Testing for equality of mean t test: Normal distribution of Two columns of paired data
or median in two samples of differences.
paired values

Say that a measurement such as length of claw has been taken on the left and
right side of a number of crab specimens, and we want to test for directed asym-
metry (difference between left and right). A two-sample t test is not appropriate,
because the values are not independent. Instead, we can perform a one-sample t
test of left minus right against the value zero.
The program also carries out two non-parametric tests: The sign test and the
Wilcoxon signed-rank test. The Wilcoxon test skips equal pairs and takes ties
into account. The p value using the normal approximation should not be used for
N < 6. The Monte Carlo estimate for p uses 100000 random permutations.
The sign test should be less powerful than the Wilcoxon test. If the sign test
gives a lower p value than the Wilcoxon test, there may be a problem with the
assumptions.
Missing data (?) supported by deletion.

Chi-square
Typical application Assumptions Data needed
Testing for equal distribu- Each compartment contain- Two columns of counted
tion of compartmentalized, ing at least five individu- data in different compart-
counted data als (not required for Monte ments (rows)
Carlo and Fisher’s exact
tests)

The Chi-square test is the one to use if your data consist of the numbers of
elements in different bins (compartments). For example, this test can be used to
compare two associations (columns) with the number of individuals in each taxon
organized in the rows. You should be a little cautious about such comparisons if
any of the bins contain less than five individuals.

17
There are two options that you should select or not for correct results. ’Sample
vs. expected’ should be ticked if your second column consists of values from a
theoretical distribution (expected values) with zero error bars. In this case, non-
integers are allowed in the second column. If your data are from two counted
samples each with error bars, leave this box open. This is not a small-sample
correction.
’One constraint’ should be ticked if your expected values have been normal-
ized in order to fit the total observed number of events, or if two counted samples
necessarily have the same totals (for example because they are percentages). This
will reduce the number of degrees of freedom by one.
When "one constraint" is selected, a permutation test is available, with 10000
randomly permutated replicates. For ’Sample vs. expected" these replicates are
generated by keeping the expected values fixed, while the values in the first column
are random with relative probabilities as specified by the expected values, and with
constant sum. For two samples, all cells are random but with constant row and
column sums.
With one constraint and a 2x2 table, the Fisher’s exact test is also given (two-
tailed). When available, the Fisher’s exact test is far superior to the chi-square.
See Brown & Rothery (1993) or Davis (1986) for details.
Missing data (?) are supported by deletion.

Coefficient of variation
Typical application Assumptions Data needed
Testing for equal coefficient Unknown Two columns of measured
of variation in two samples data

The coefficient of variation is defined as standard deviation divided by the

mean. This module implements the Fligner-Killeen test for equal coefficient of
variation, as recommended and described by Donnelly & Kramer (1999). This test
has been used to compare variation within fossils with variation within a closely
related modern species, to test for multiple fossil species.

Mann-Whitney U (two samples)

Typical application Assumptions Data needed
Comparing the medians of Both samples have N > Two columns of measured
two samples 7, and similar distribution or counted data
shapes.

Two columns must be selected. The two-tailed (Wilcoxon) Mann-Whitney U

test can be used to test whether the medians of two independent distributions are
different. This test is non-parametric, which means that the distributions can be of

18
any shape. PAST uses an approximation based on a z test, which is only valid for
N > 7. It includes a continuity correction.
See Brown & Rothery (1993) or Davis (1986) for details.
Missing data (?) are supported.

Kolmogorov-Smirnov (two samples)

Typical application Assumptions Data needed
Comparing the distributions None Two columns of measured
of two samples data

Two columns must be selected. The K-S test can be used to test whether two in-
dependent distributions of continuous, unbinned numerical data are different. The
K-S test is non-parametric, which means that the distributions can be of any shape.
If you want to test just the locations of the distribution (medians), you should use
instead the Mann-Whitney U test.
See Davis (1986) for details.
Missing data (?) are supported.

Spearman’s rho and Kendall’s tau (two samples)

Typical application Assumptions Data needed
Testing whether two vari- None Two columns of measured
ables are correlated or counted paired data (such
as x/y pairs)

These non-parametric rank-order tests are used to test for correlation between
two variables. The permutation test is based on 1000 replicates.
Missing data (?) are supported.

Correlation matrix
Typical application Assumptions Data needed
Quantifying correlation be- The linear (Pearson) correla- Two or more columns of
tween two or more variables tion assumes normal distri- measured or counted vari-
bution ables

A matrix is presented with the correlations between all pairs of columns. Cor-
relation values are given in the lower triangle of the matrix, and the probabilities
that the columns are uncorrelated are given in the upper. Both parametric (Pearson)
and non-parametric (Spearman and Kendall) coefficients and tests are available.

19
Variance/covariance matrix
Typical application Assumptions Data needed
Quantifying covariance be- None Two or more columns of
tween two or more variables measured or counted vari-
ables

A symmetric matrix is presented with the variances and covariances between

all pairs of columns.
Missing data are supported by pairwise deletion.

Contingency table analysis

Typical application Assumptions Data needed
Testing for dependence be- None Matrix of counted data in
tween two variables compartments

A contingency table is input to this routine. Rows represent the different states
of one nominal variable, columns represent the states of another nominal variable,
and cells contain the counts of occurrences of that specific state (row, column) of
the two variables. A measure and probability of association of the two variables
(based on Chi-square) is then given.
For example, rows may represent taxa and columns samples as usual (with
specimen counts in the cells). The contingency table analysis then gives informa-
tion on whether the two nominal variables "taxon" and "locality" are associated. If
not, the data matrix is not very informative. For details, see Press et al. (1992).

One-way ANOVA
Typical application Assumptions Data needed
Testing for equality of the Normal distribution and Two or more columns of
means of several univariate similar variances and measured or counted data
samples sample sizes

One-way ANOVA (analysis of variance) is a statistical procedure for testing the

null hypothesis that several univariate data sets (in columns) have the same mean.
The data sets are required to be close to normally distributed.
See Brown & Rothery (1993) or Davis (1986) for details.
Levene’s test for homogeneity of variance (homoskedasticity), that is, whether
variances are equal as assumed by ANOVA, is also given. Two versions of the
test are included. The original Levene’s test was based on means. This version has
more power if the distributions are normal or at least symmetric. The version based
on medians has less power, but is more robust to non-normal distributions.

20
If Levene’s test is significant, meaning that you have unequal variances, you
can use the unequal-variance (Welch) version of ANOVA, with the F , df and p
values given.
If the ANOVA shows significant inequality of the means (small p), you can go
on to study the given table of "post-hoc" pairwise comparisons, based on Tukey’s
HSD test. The Studentized Range Statistic Q is given in the lower left triangle of
the array, and the probabilities p(equal) in the upper right. Sample sizes do not
have to be equal for the version of Tukey’s test used.

Two-way ANOVA
Typical application Assumptions Data needed
Testing for equality of the Normal distribution and Three columns needed.
means of several univariate similar variances and First, a column with the
samples, taken across two sample sizes levels for the first factor
sets of factors (e.g. species (coded as 1, 2, 3 etc.), then
and soil type), and for inter- a column with the levels
action between the factors for the second factor, and
finally the column of the
corresponding measured
values.

Two-way ANOVA (analysis of variance) is a statistical procedure for testing

the null hypotheses that several univariate samples have the same mean across
each of the two factors, and that there are no dependencies (interactions) between
factors. The samples are assumed to be close to normally distributed and have
similar variances. If the sample sizes are equal, these two assumptions are not
critical.
The algorithm uses weighted means for unbalanced designs.

Kruskal-Wallis test
Typical application Assumptions Data needed
Testing for equality of the None Two or more columns of
medians of several univari- measured or counted data
ate samples

The Kruskal-Wallis test can be regarded as a non-parametric alternative to

ANOVA (Zar 1996). The H statistic and the H statistic corrected for ties (Hc)
are given, together with a p value for equality (assuming a chi-squared distribution
of Hc).
PAST also includes a simple non-parametric post hoc test, based on Bonferroni-
corrected pairwise Mann-Whitney.

21
Similarity/distance indices
Typical application Assumptions Data needed
Comparing two or more Equal sampling conditions Two or more columns of
samples data with items in rows and
variables in columns.

14 similarity and distance measures, as described under Cluster Analysis are

available. Note that some of these are similarity indices, while others are distance
indices (in cluster analysis, these are all converted to similarities). All pairs of rows
are compared, and the results given in a matrix.
Missing data are supported as described under Cluster Analysis.

Mixture analysis
Typical application Assumptions Data needed
Fitting a univariate data set Sampling from a mixture of One column of measured
to a mixture of two or more two or more normally dis- data
Gaussian (normal) distribu- tributed populations
tions

Mixture analysis is an advanced maximum-likelihood method for estimating

the parameters (mean, standard deviation and proportion) of two or more univari-
ate normal distributions, based on a pooled univariate sample. For example, the
method can be used to study differences between sexes (two groups), or several
species, or size classes, when no independent information about group member-
ship is available.
PAST uses the EM algorithm, which can get stuck on a local optimum. The
procedure is therefore automatically run 20 times, each time with new, random
starting positions for the means. The starting values for standard deviation are set
to s/G, where s is the pooled standard deviation and G is the number of groups.
The starting values for proportions are set to 1/G. The user is still recommended to
run the program a few times to check for stability of the solution ("better" solutions
have less negative log likelihood values).
The Akaike Information Criterion (AIC) is defined as 2k − 2loglikelihood,
where k is the number of parameters. A minimal value for AIC indicates that you
have chosen the number of groups that produces the best fit without overfitting.

Genetic sequence stats

Typical application Assumptions Data needed
Selection of optimal dis- None One or more rows con-
tance measure for genetic taining sequences with nu-
sequence data cleotides coded as numbers
1-4 or atcg/ATCG

22
The following statistics are given: Total sequence length, total gap length av-
eraged over rows, total number of each nucleotide averaged over rows, p-distance
(Hamming) and Jukes-Cantor distance averaged over all possible pairs of rows,
maximal Jukes-Cantor, transition and transversion frequency averaged over all
rows, and transition/transversion ratio.

23
7 Multivariate statistics
Principal components analysis
Typical application Assumptions Data needed
Reduction and interpretation Debated Two or more rows of mea-
of large multivariate data sured data with three or
sets with some underlying more variables
linear structure

Principal components analysis (PCA) is a procedure for finding hypothetical

variables (components) which account for as much of the variance in your multi-
dimensional data as possible (Davis 1986, Harper 1999). These new variables are
linear combinations of the original variables. PCA has several applications, two of
them are:
• Simple reduction of the data set to only two variables (the two most impor-
tant components), for plotting and clustering purposes.
• More interestingly, you might try to hypothesize that the most important
components are correlated with some other underlying variables. For mor-
phometric data, this might be simply age, while for associations it might be
a physical or chemical gradient (e.g. latitude or position across the shelf).
The PCA routine finds the eigenvalues and eigenvectors of the variance-covariance
matrix or the correlation matrix. Choose var-covar if all your variables are mea-
sured in the same unit (e.g. centimetres). Choose correlation (normalized var-
covar) if your variables are measured in different units; this implies normalizing
all variables using division by their standard deviations. The eigenvalues, giving
a measure of the variance accounted for by the corresponding eigenvectors (com-
ponents) are given for all components. The percentages of variance accounted for
by these components are also given. If most of the variance is accounted for by
the first one or two components, you have scored a success, but if the variance is
spread more or less evenly among the components, the PCA has in a sense not been
very successful.
The Jolliffe cut-off value gives an informal indication of how many principal
components should be considered significant (Jolliffe, 1986). Components with
eigenvalues smaller than the Jolliffe cut-off may be considered insignificant, but
too much weight should not be put on this criterion.
Row-wise bootstrapping is carried out if a non-zero number of bootstrap repli-
cates (e.g. 1000) is given in the ’Boot N’ box. The bootstrapped components
are re-ordered and reversed according to Peres-Neto et al. (2003) to ensure corre-
spondence with the original axes. 95 percent bootstrapped confidence intervals are
given for the eigenvalues.
The ’Scree plot’ (simple plot of eigenvalues) can also be used to informally in-
dicate the number of significant components. After this curve starts to flatten out,

24
the corresponding components may be regarded as insignificant. 95 percent confi-
dence intervals are shown if bootstrapping has been carried out. The eigenvalues
expected under a random model (Broken Stick) are optionally plotted - eigenvalues
under this curve represent non-significant components (Jackson 1993).
The ’View scatter’ option allows you to see all your data points (rows) plotted
in the coordinate system given by the two most important components. If you have
tagged (grouped) rows, the different groups will be shown using different symbols
and colours. You can also plot the Minimal Spanning Tree, which is the shortest
possible set of connected lines connecting all points. This may be used as a visual
aid in grouping close points. The MST is based on an Euclidean distance measure
of the original data points, so it is most meaningful when all your variables use
the same unit. The ’Biplot’ option will show a projection of the original axes
(variables) onto the scattergram. This is another visualisation of the component
loadings (coefficients) - see below. √
is ticked, the data points will be scaled by 1/ dk , and
If the ’Eigenval scale’ √
the biplot eigenvectors by dk - this is the correlation biplot of Legendre & Leg-
endre (1998). If not ticked, the data points are not scaled, while the biplot eigen-
vectors are normalized to equal length (but not to unity, for graphical reasons) -
this is the distance biplot.
The ’View loadings’ option shows to what degree your different original vari-
ables (given in the original order along the x axis) enter into the different compo-
nents (as chosen in the radio button panel). These component loadings are impor-
tant when you try to interpret the ’meaning’ of the components. The ’Coefficients’
option gives the PC coefficients, while ’Correlation’ gives the correlation between
a variable and the PC scores. Do not use the latter if you are doing PCA on the
correlation matrix. If bootstrapping has been carried out, 95 percent confidence
intervals are shown (only for the Coefficients option).
The ’SVD’ option will enforce use of the supposedly superior Singular Value
Decomposition algorithm instead of "classical" eigenanalysis. The two algorithms
will normally give almost identical results, but axes may be flipped.
For the ’Shape PCA’ and ’Shape deform’ options, see the section on Geomet-
rical Analysis.
Bruton & Owen (1988) describe a typical morphometrical application of PCA.
Missing data are supported by column average substitution.

Principal coordinates
Typical application Assumptions Data needed
Reduction and interpretation Unknown Two or more rows of
of large multivariate data measured, counted or pres-
sets with some underlying ence/absence data with
linear structure three or more variables, or
a symmetric similarity or
distance matrix

25
Principal coordinates analysis (PCO) is another ordination method, somewhat
similar to PCA. The algorithm is taken from Davis (1986).
The PCO routine finds the eigenvalues and eigenvectors of a matrix contain-
ing the distances between all data points. The Gower measure will normally be
used instead of Euclidean distance, which gives results similar to PCA. An addi-
tional eleven distance measures are available - these are explained under Cluster
Analysis. The eigenvalues, giving a measure of the variance accounted for by the
corresponding eigenvectors (coordinates) are given for the first four most important
coordinates (or fewer if there are fewer than four data points). The percentages of
variance accounted for by these components are also given.
The similarity/distance values are raised to the power of c (the "Transformation
exponent") before eigenanalysis. The standard value is c = 2. Higher values (4 or
6) may decrease the "horseshoe" effect (Podani & Miklos 2002).
The ’View scatter’ option allows you to see all your data points (rows) plotted
in the coordinate system given by the PCO. If you have tagged (grouped) rows, the
different groups will be shown using different symbols and colours. The "Eigen-
value scaling" option scales each axis using the square root of the eigenvalue (rec-
ommended). The minimal spanning tree option is based on the selected similarity
or distance index in the original space.
Missing data are supported by pairwise deletion (not for the Raup-Crick, rho
or user-defined indices).

Non-metric multidimensional scaling

Typical application Assumptions Data needed
Reduction and interpretation None Two or more rows of
of large multivariate ecolog- measured, counted or pres-
ical data sets ence/absence data with
two or more variables, or
a symmetric similarity or
distance matrix.

Non-metric multidimensional scaling is based on a distance matrix computed

with any of 14 supported distance measures, as explained under Cluster Analysis
below. The algorithm then attempts to place the data points in a two- or three-
dimensional coordinate system such that the ranked differences are preserved. For
example, if the original distance between points 4 and 7 is the ninth largest of all
distances between any two points, points 4 and 7 will ideally be placed such that
their euclidean distance in the 2D plane or 3D space is still the ninth largest. Non-
metric multidimensional scaling intentionally does not take absolute distances into
account.
The program may converge on a different solution in each run, depending upon
the random initial conditions. Each run is actually a sequence of 11 trials, from
which the one with smallest stress is chosen. One of these trials uses PCO as

26
the initial condition, but this rarely gives the best solution. The scatter plot is
automatically rotated to the major axes (2D or 3D).
The algorithm implemented in PAST, which seems to work very well, is based
on a new approach developed by Taguchi & Oono (in press).
The minimal spanning tree option is based on the selected similarity or distance
index in the original space.
Shepard plot: This plot of obtained versus observed (target) ranks indicates the
quality of the result. Ideally, all points should be placed on a straight ascending
line (x = y).
Missing data are supported by pairwise deletion (not for the Raup-Crick and
rho indices).

Correspondence analysis
Typical application Assumptions Data needed
Reduction and interpretation Unknown Two or more rows of
of large multivariate ecolog- counted data in three or
ical data sets with environ- more compartments
mental or other gradients

Correspondence analysis (CA) is yet another ordination method, somewhat

similar to PCA but for counted data. For comparing associations (columns) con-
taining counts of taxa, or counted taxa (rows) across associations, CA is the more
appropriate algorithm. Also, CA is more suitable if you expect that species have
unimodal responses to the underlying parameters, that is they favour a certain range
of the parameter, becoming rare for lower and higher values (this is in contrast to
PCA, which assumes a linear response).
The CA routine finds the eigenvalues and eigenvectors for a matrix containing
the Chi-squared distances between all data points. The eigenvalue, giving a mea-
sure of the similarity accounted for by the corresponding eigenvector, is given for
each eigenvector. The percentages of similarity accounted for by these components
are also given.
The ’View scatter’ option allows you to see all your data points (rows) plotted
in the coordinate system given by the CA. If you have tagged (grouped) rows, the
different groups will be shown using different symbols and colours.
In addition, the variables (columns, associations) can be plotted in the same
coordinate system (Q mode), optionally including the column labels. If your data
are ’well behaved’, taxa typical for an association should plot in the vicinity of that
association.
PAST presently uses a symmetric scaling ("Benzecri scaling").
If you have more than two columns in your data set, you can choose to view a
scatter plot on the second and third axes.
Relay plot: This is a composite diagram with one plot per column. The plots
are ordered according to CA column scores. Each data point is plotted with CA

27
first-axis row scores on the vertical axis, and the original data point value (abun-
dance) in the given column on the horizontal axis. This may be most useful when
samples are in rows and taxa in columns. The relay plot will then show the taxa
ordered according to their positions along the gradients, and for each taxon the
corresponding plot should ideally show a unimodal peak, partly overlapping with
the peak of the next taxon along the gradient (see Hennebert & Lees 1991 for an
example from sedimentology).
Missing data are supported by column average substitution.

Detrended correspondence analysis

Typical application Assumptions Data needed
Reduction and interpretation Unknown Two or more rows of
of large multivariate ecolog- counted data in three or
ical data sets with environ- more compartments
mental or other gradients

The Detrended Correspondence (DCA) module uses the same algorithm as

Decorana (Hill & Gauch 1980), with modifications according to Oxanen & Minchin
(1997). It is specialized for use on ’ecological’ data sets with abundance data; sam-
ples in rows, taxa in columns (vice versa prior to v. 1.79). When the ’Detrending’
option is switched off, a basic Reciprocal Averaging will be carried out. The result
should be similar to Correspondence Analysis (see above) plotted on the first and
second axes.
Eigenvalues for the first three ordination axes are given as in CA, indicating
their relative importance in explaining the spread in the data.
Detrending is a sort of normalization procedure in two steps. The first step
involves an attempt to ’straighten out’ points lying in an arch, which is a common
occurrence. The second step involves ’spreading out’ the points to avoid clustering
of the points at the edges of the plot. Detrending may seem an arbitrary procedure,
but can be a useful aid in interpretation.
Missing data are supported by column average substitution.

Canonical Correspondence Analysis

Typical application Assumptions Data needed
Reduction and interpretation None Two or more rows of sites,
of large multivariate ecolog- with taxa (species) in
ical data sets with environ- columns. The first columns
mental or other gradients contain environmental
variables.

Canonical Correspondence Analysis (Legendre & Legendre 1998) is corre-

spondence analysis of a site/species matrix where each site has given values for

28
one or more environmental variables (temperature, depth, grain size etc.). The
ordination axes are linear combinations of the environmental variables. CCA is
thus an example of direct gradient analysis, where the gradient in environmental
variables is known a priori and the species abundances (or presence/absences) are
considered to be a response to this gradient.
The implementation in PAST follows the eigenanalysis algorithm given in Leg-
endre & Legendre (1998). The ordinations are given as site scores - fitted site
scores are presently not available. Environmental variables are plotted as correla-
tions with site scores. Both scalings (type 1 and 2) of Legendre & Legendre (1998)
are available. Scaling 2 emphasizes relationships between species.

Two-block Partial Least Squares (PLS)

Typical application Assumptions Data needed
Studying the structure of co- None Two or more rows of multi-
variation between two sets variate continuous data. The
of variates on the same rows columns should be first all
variates of first block, then
all variates of second block.

Two-block Partial Least Squares can be seen as an ordination method that can
be compared with PCA, but with the objective of maximizing covariance between
two sets of variates on the same rows (specimens, sites). For example, morphome-
tric and environmental data can be ordinated in order to study covariation between
the two.
The program will ask for the number of columns belonging to the first block.
The remaining columns will be assigned to the second block. There are options for
plotting PLS scores both within and across blocks, and PLS loadings.
The algorithm follows Rohlf & Corti (2000). Permutation tests and biplots are
not yet implemented.

Cluster analysis
Typical application Assumptions Data needed
Finding hierarchical group- None Two or more rows of
ings in multivariate data sets counted, measured or pres-
ence/absence data in one or
more variables or categories,
or a symmetric similarity or
distance matrix.

The hierarchical clustering routine produces a ’dendrogram’ showing how data

points (rows) can be clustered. For ’R’ mode clustering, putting weight on group-
ings of taxa, taxa should be in rows. It is also possible to find groupings of variables

29
or associations (Q mode), by entering taxa in columns. Switching between the two
is done by transposing the matrix (in the Edit menu).
Three different algorithms are available:

• Unweighted pair-group average (UPGMA). Clusters are joined based on the

average distance between all members in the two groups.

• Single linkage (nearest neighbour). Clusters are joined based on the smallest
distance between the two groups.

• Ward’s method. Clusters are joined such that increase in within-group vari-
ance is minimized.

One method is not necessarily better than the other, though single linkage is not
recommended by some. It can be useful to compare the dendrograms given by the
different algorithms in order to informally assess the robustness of the groupings. If
a grouping is changed when trying another algorithm, that grouping should perhaps
not be trusted.
For Ward’s method, a Euclidean distance measure is inherent to the algorithm.
For UPGMA and single linkage, the distance matrix can be computed using 13
different measures:

• The Euclidean distance (between rows) is a robust and widely applicable

measure. Distance is converted to similarity by changing the sign.
v
u s
uX
Euclideanjk = t (xij − xik )2
i=1

• Correlation (of the variables along rows) using Pearson’s r. A little mean-
ingless if you have only two variables.

• Correlation using Spearman’s rho (basically the r value of the ranks). Will
often give the same result as correlation using r.

• Dice (Sorensen) coefficient for absence-presence (coded as 0 or positive

numbers). Puts more weight on joint occurences than on mismatches.
When comparing two columns (associations), a match is counted for all taxa
with presences in both columns. Using ’M’ for the number of matches and
’N’ for the the total number of taxa with presences in just one column, we
have
Dice similarity = 2M/(2M + N )

• Jaccard similarity for absence-presence data: M/(M + N )

30
• The Simpson index is defined as M/Nmin , where Nmin is the smaller of
the numbers of presences in the two associations. This index treats two as-
sociations as identical if one is a subset of the other, making it useful for
fragmentary data.

• Kulczynski similarity for presence-absence data:

M/(M + N1 ) + M/(M + N2 )
2

• Ochiai similarity for presence-absence data (binary form of the cosine):

q
[M/(M + N1 )][M/(M + N2 )]

• Bray-Curtis measure for abundance data.

Ps
|xij − xik |
Bray − Curtisjk = Psi=1
i=1 (xij + xik )

• Cosine distance for abundance data - one minus the inner product of abun-
dances each normalised to unit norm.

• Chord distance for abundance data (converted to similarity by changing the

sign). Recommended!
v Ps
u (xij xik )
Chordjk = t2 − 2 qP i=1 P
u
s 2 s 2
i=1 xij i=1 xik

• Morisita’s index for abundance data. Recommended!

Ps
i=1 (xij (xij − 1))
λ1 = Ps Ps (1)
i=1 xij ( i=1 xij − 1)
Ps
(xik (xik − 1))
λ2 = Ps i=1 Ps
i=1 ik ( i=1 xik − 1)
x
2 si=1 (xij xik )
P
M orisitajk =
(λ1 + λ2 ) si=1 xij si=1 xik
P P

• Raup-Crick index for absence-presence data. Recommended! This index

(Raup & Crick 1979) uses a randomization ("Monte Carlo") procedure, com-
paring the observed number of species ocurring in both associations with the
distribution of co-occurrences from 200 random replicates.

31
• Horn’s overlap index for abundance data (Horn 1966).

s
X
Nj = xij (2)
i=1
Xs
Nk = xik
i=1
Ps
i=1 [(xij + xik )ln(xij + xik )] − si=1 [xij lnxij ] − si=1 [xik lnxik ]
P P
Rojk =
(Nj + Nk )ln(Nj + Nk ) − Nj lnNj − Nk lnNk

• Hamming distance for categorical data as coded with integers (or sequence
data coded as CAGT). The Hamming distance is the number of differences
(mismatches), so that the distance between (3,5,1,2) and (3,7,0,2) equals 2.
In PAST, this is normalised to the range (0,1), which is known to geneticists
as "p-distance".

• Jukes-Cantor distance for genetic sequence data (CAGT). Similar to Ham-

ming distance, but takes into account probability of reversals.

• Kimura distance for genetic sequence data (CAGT). Similar to Jukes-Cantor

distance, but takes into account different probabilities of nucleotide transi-
tions vs. transversals.

• Tajima-Nei distance for genetic sequence data (CAGT). Similar to Jukes-

Cantor distance, but does not assume equal nucleotide frequencies.

• Manhattan distance: The sum of differences in each variable (converted to

similarity by changing the sign).

• User-defined similarity: Expects a symmetric similarity matrix rather than

original data. No error checking!

• User-defined distance: Expects a symmetric distance matrix rather than orig-

inal data. No error checking!

• Mixed: This option requires that data types have been assigned to columns
(see Entering and manipulating data). A pop-up window will ask for the
similarity/distance measure to use for each datatype. These will be com-
bined using an average weighted by the number of variates of each type.
The default choices correspond to those suggested by Gower, but other com-
binations may well work better. The "Gower" option is a range-normalised
Manhattan distance.

See Harper (1999) or Davis (1986) for details.

32
Missing data: The cluster analysis algorithm can handle missing data, coded
as -1 or question mark (?). This is done using pairwise deletion, meaning that
when distance is calculated between two points, any variables that are missing are
ignored in the calculation. For Raup-Crick, missing data are treated as absence.
Missing data are not supported for Ward’s method, nor for the Rho similarity mea-
sure.
Two-way clustering: The two-way option allows simultaneous clustering in
R-mode and Q-mode.
Stratigraphically constrained clustering: This option will allow only adjacent
rows (or groups of rows) to be joined during the agglomerative clustering proce-
dure. May produce strange-looking (but correct) dendrograms.
Bootstrapping: If a number of bootstrap replicates is given (e.g. 100), the
columns are subjected to resampling. The percentage of replicates where each
node is still supported is given on the dendrogram.
All-zeros rows: Some similarity measures (Dice, Jaccard, Simpson etc.) are
undefined when comparing two all-zero rows. To avoid errors, especially when
bootstrapping sparse data sets, the similarity is set to zero in such cases.

Neighbour joining cluster analysis

Typical application Assumptions Data needed
Finding hierarchical group- None Two or more rows of
ings in multivariate data sets counted, measured or pres-
ence/absence data in one or
more variables or categories,
or a symmetric similarity or
distance matrix.

Neigbour joining clustering (Saitou & Nei 1987) is an alternative method for
hierarchical cluster analysis. The method was originally developed for phyloge-
netic analysis, but may be superior to UPGMA also for ecological data. In con-
trast with UPGMA, two branches from the same internal node do not need to
have equal branch lengths. A phylogram (unrooted dendrogram with proportional
branch lengths) is given.
Distance indices and bootstrapping are as for other cluster analysis (above).
Negative branch lengths are forced to zero, and transferred to the adjacent
branch according to Kuhner & Felsenstein (1994).
The tree is by default rooted on the last branch added during tree construction
(this is not midpoint rooting). Optionally, the tree can be rooted on the first row in
the data matrix (outgroup).

33
K-means clustering
Typical application Assumptions Data needed
Non-hierarchical clustering None Two or more rows of
of multivariate data into a counted or measured data in
specified number of groups one or more variables

K-means clustering (e.g. Bow 1984) is a non-hierarchical clustering method.

The number of clusters to use is specified by the user, usually according to some
hypothesis such as there being two sexes, four geographical regions or three species
in the data set
The cluster assignments are initially random. In an iterative procedure, items
are then moved to the cluster which has the closest cluster mean, and the cluster
means are updated accordingly. This continues until items are no longer "jumping"
to other clusters. The result of the clustering is to some extent dependent upon the
initial, random ordering, and cluster assignments may therefore differ from run to
run. This is not a bug, but normal behaviour in k-means clustering.
The cluster assignments may be copied and pasted back into the main spread-
sheet, and corresponding colors (symbols) assigned to the items using the ’Num-
bers to colors’ option in the Edit menu.
Missing data are supported by column average substitution.

Seriation
Typical application Assumptions Data needed
Stratigraphical or environ- None Presence/absence (1/0) ma-
mental ordering of taxa and trix with taxa in rows
localities

Seriation of an absence-presence matrix using the algorithm described by Brower

and Kyle (1988). This method is typically applied to an association matrix with
taxa (species) in the rows and populations in the columns. For constrained seri-
ation (see below), columns should be ordered according to some criterion, normally
stratigraphic level or position along a presumed faunal gradient.
The seriation routines attempt to reorganize the data matrix such that the pres-
ences are concentrated along the diagonal. There are two algorithms: Constrained
and unconstrained optimization. In constrained optimization, only the rows (taxa)
are free to move. Given an ordering of the columns, this procedure finds the ’op-
timal’ biozonation, that is, the ordering of taxa which gives the prettiest range
plot. Also, in the constrained mode, the program runs a ’Monte Carlo’ simulation,
generating and seriating 30 random matrices with the same number of occurences
within each taxon, and compares these to the original matrix to see if it is more
informative than a random one (this procedure is time-consuming for large data
sets).

34
In the unconstrained mode, both rows and columns are free to move.

Discriminant analysis and Hotelling’s T 2

Typical application Assumptions Data needed
Testing for separation and Multivariate normality. Two multivariate data sets of
equal means of two multi- Hotelling’s test assumes measured data, marked with
variate data sets equal covariances. different colors

Given two sets of multivariate data, an axis is constructed which maximizes

the difference between the sets. The two sets are then plotted along this axis using
a histogram.
This module expects the rows in the two data sets to be tagged with dots (black)
and crosses (red), respectively.
Equality of the means of the two groups is tested by a multivariate analogue to
the t test, called Hotelling’s T-squared, and a p value for this test is given. Normal
distribution of the variables is required, and also that the number of cases is at least
two more than the number of variables.
Number of constraints: For correct calculation of the Hotelling’s p value, the
number of dependent variables(constraints) must be specified. It should normally
be left at 0, but for Procrustes fitted landmark data use 4 (for 2D) or 6 (for 3D).
Discriminant analysis is a standard method for visually confirming or rejecting
the hypothesis that two species are morphologically distinct. Using a cutoff point
at zero (the midpoint between the means of the discriminant scores of the two
groups), a classification into two groups is shown in the "view numbers" option.
The percentage of correctly classified items is also given.
Discriminant function: New specimens can be classified according to the dis-
criminant function. Take the inner product between the measurements on the new
specimen and the given discriminant function factors, and then subtract the given
offset value.
Leave one out (cross-evaluation): An option is available for leaving out one
row (specimen) at a time, re-computing the discriminant analysis with the remain-
ing specimens, and classifying the left-out row accordingly (as given by the Score
value).
Beware: The combination of discriminant analysis and Hotelling’s T 2 test is
sometimes misused. One should not be surprised to find a statistically significant
difference between two samples which have been chosen with the objective of
maximizing distance in the first place! The division into two groups should ideally
be based on independent evidence.
See Davis (1986) for details.
Missing data are supported by column average substitution.

35
Paired Hotelling’s T 2
Typical application Assumptions Data needed
Testing for equal means of a Multivariate normality. A multivariate data set
paired multivariate data set of paired measured data,
marked with different colors

The paired Hotelling’s test expects two groups of multivariate data, marked
with different colours. Rows within each group must be consecutive. The first row
of the first group is paired with the first row of the second group, the second row is
paired with the second, etc.
Missing data are supported by column average substitution.

Permutation test for two multivariate groups

Typical application Assumptions Data needed
Testing for qual means of The two groups have similar Two multivariate data sets of
two multivariate data sets distributions (variances) measured data, marked with
different colors

This module expects the rows in the two data sets to be grouped into two sets
by colouring the rows, e.g. with black (dots) and red (crosses). Rows within each
group must be consecutive.
Equality of the means of the two groups is tested using permutation with 2000
replicates (can be changed by the user), and the Mahalanobis squared distance
measure. The permutation test is an alternative to Hotelling’s test when the as-
sumptions of multivariate normal distributions and equal covariance matrices do
not hold.
Missing data are supported by column average substitution.

Multivariate normality test

Typical application Assumptions Data needed
Testing for multivariate nor- Departures from multivari- One multivariate sample of
mality ate normality detectable as measured data, with vari-
departure from multivariate ables in columns
skewness or kurtosis

Multivariate normality is assumed by a number of multivariate tests. PAST

computes Mardia’s multivariate skewness and kurtosis, with tests based on chi-
squared (skewness) and normal (kurtosis) distributions. A powerful omnibus (over-
all) test due to Doornik & Hansen (1994) is also given. If at least one of these tests
show departure from normality (small p value), the distribution is significantly non-
normal. Sample size should be reasonably large (>50), although a small-sample

36
correction is also attempted for the skewness test.

Box’s M test
Typical application Assumptions Data needed
Testing for equivalence of Multivariate normality Two multivariate data
the covariance matrices for sets of measured data, or
two data sets two (square) variance-
covariance matrices, marked
with different colors

This test is rather specialized, testing for the equivalence of the covariance
matrices for two multivariate data sets. You can use either two original multivariate
data sets from which the covariance matrices are automatically computed, or two
specified variance-covariance matrices. In the latter case, you must also specify the
sizes (number of individuals) of the two samples.
The Box’s M statistic is given, together with a significance value based on a
chi-square approximation. Note that this test is supposedly very sensitive. This
means that a high p value will be a good, although informal, indicator of equality,
while a highly significant result (low p value) may in practical terms be a somewhat
too sensitive indicator of inequality.

One-way MANOVA and Canonical Variates Analysis

Typical application Assumptions Data needed
Testing for equality of the Multivariate normal distri- Two or more samples of
means of several multivari- bution, similar variances- multivariate measured data,
ate samples, and ordination covariances marked with different col-
based on maximal separa- ors. The number of cases
tion (multigroup discrimi- must exceed the number of
nant analysis) variables.

One-way MANOVA (Multivariate ANalysis Of VAriance) is the multivariate

version of the univariate ANOVA, testing whether several samples have the same
mean. If you have only two samples, you would perhaps rather use the two-sample
Hotelling’s T 2 test.
Two statistics are provided: Wilk’s lambda with it’s associated Rao’s F and the
Pillai trace with it’s approximated F. Wilk’s lambda is probably more commonly
used, but the Pillai trace may be more robust.
Number of constraints: For correct calculation of the p values, the number of
dependent variables(constraints) must be specified. It should normally be left at 0,
but for Procrustes fitted landmark data use 4 (for 2D) or 6 (for 3D).
Pairwise comparisons (post-hoc): If the MANOVA shows significant overall
difference between groups, the analysis can proceed by pairwise comparisons. In
PAST, the post-hoc analysis is quite simple, by pairwise Hotelling’s tests. In the

37
post-hoc table, groups are named according to the row label of the first item in
the group. Hotelling’s p values are given above the diagonal, while Bonferroni
corrected values (multiplied by the number of pairwise comparisons) are given
below the diagonal. This Bonferroni corrected test has very little power.

Canonical Variates Analysis

An option under MANOVA, CVA produces a scatter plot of specimens along the
two first canonical axes, producing maximal and second to maximal separation
between all groups (multigroup discriminant analysis). The axes are linear com-
binations of the original variables as in PCA, and eigenvalues indicate amount of
variation explained by these axes.
Missing data are supported by column average substitution.

One-way ANOSIM
Typical application Assumptions Data needed
Testing for difference be- Ranked dissimilarities Two or more groups of mul-
tween two or more multi- within groups have similar tivariate data, marked with
variate groups, based on any median and range. different colours, or a sym-
distance measure metric similarity or distance
matrix with similar groups.

ANOSIM (ANalysis Of Similarities) is a non-parametric test of significant dif-

ference between two or more groups, based on any distance measure (Clarke 1993).
The distances are converted to ranks. ANOSIM is normally used for ecological
taxa-in-samples data, where groups of samples are to be compared.
In a rough analogy with ANOVA, the test is based on comparing distances
between groups with distances within groups. Let rb be the mean rank of all dis-
tances between groups, and rw the mean rank of all distances within groups. The
test statistic R is then defined as
rb − rw
R= .
N (N − 1)/4

Large positive R (up to 1) signifies dissimilarity between groups. The signif-

icance is computed by permutation of group membership, with 10,000 replicates
(can be changed by the user).
Pairwise ANOSIMs between all pairs of groups are provided as a post-hoc test.
The Bonferroni corrected p values are very conservative.
Missing data are supported by pairwise deletion (not for the Raup-Crick, Rho
and user-defined indices).

38
Two-way ANOSIM
Typical application Assumptions Data needed
Testing for difference be- Ranked dissimilarities First two columns: Levels
tween multivariate groups, within groups have similar of the two factors, coded
based on any distance mea- median and range. with integers. Consecutive
sure. The groups are orga- columns: Multivariate data,
nized into two factors of at or a symmetric similarity or
least two levels each. distance matrix.

The two-way ANOSIM in PAST uses the crossed design (Clarke 1993). For
more information see one-way ANOSIM, but note that groups (levels) are not
coded with colors but with integer numbers in the first two columns.

One-way NPMANOVA
Typical application Assumptions Data needed
Testing for difference be- The groups have similar Two or more groups of mul-
tween two or more multi- distributions (similar vari- tivariate data, marked with
variate groups, based on any ances) different colors, or a sym-
distance measure metric similarity or distance
matrix with similar groups.

NPMANOVA (Non-Parametric MANOVA) is a non-parametric test of signifi-

cant difference between two or more groups, based on any distance measure (An-
derson 2001). NPMANOVA is normally used for ecological taxa-in-samples data,
where groups of samples are to be compared, but may also be used as a general
non-parametric MANOVA
NPMANOVA calculates an F value in analogy with ANOVA. In fact, for uni-
variate data sets and the Euclidean distance measure, NPMANOVA is equivalent
to ANOVA and gives the same F value.
The significance is computed by permutation of group membership, with 10,000
replicates (can be changed by the user).
Pairwise NPMANOVAs between all pairs of groups are provided as a post-hoc
test. The Bonferroni corrected p values are very conservative.
Missing data is supported by pairwise deletion (not for the Raup-Crick, Rho
and user-defined indices).

39
Mantel test
Typical application Assumptions Data needed
Testing for correlation be- None Two groups of multivariate
tween two distance ma- data, marked with different
trices, typically geograph- colors, or two symmetric
ical or stratigraphic dis- distance or similarity matri-
tance and e.g. distance be- ces.
tween species compositions
of samples (is there a spa-
tial structure in a multivari-
ate data set?)

The Mantel test is a permutation test for correlation between two distance or
similarity matrices. In PAST, these matrices can also be computed automatically
from two sets of original data. The first matrix must be above the second matrix
in the spreadsheet, and the rows be marked with two different colors. The two
matrices must have the same number of rows. If they are distance or similarity
matrices, they must also have the same number of columns.

SIMPER
Typical application Assumptions Data needed
Finding which taxa are pri- Independent samples Two or more groups of mul-
marily responsible for dif- tivariate abundance samples
ferences between two or (taxa in columns) marked
more groups of ecological with different colors.
samples (abundances)

SIMPER (Similarity Percentage) is a simple method for assessing which taxa

are primarily responsible for an observed difference between groups of samples
(Clarke 1993). The overall significance of the difference is often assessed by
ANOSIM. The Bray-Curtis similarity measure is implicit to SIMPER.
If more than two groups are selected, you can either compare two groups (pair-
wise) by choosing from the lists of groups, or you can pool all samples to perform
one overall multi-group SIMPER.

CABFAC factor analysis

Typical application Assumptions Data needed
Factor analysis of abun- None Counted data with samples
dance data, optionally with in rows, taxa in columns.
associated environmental The first column can option-
data ally contain associated envi-
ronmental data (e.g. temper-
ature)

40
This module implements the classical Imbrie & Kipp (1971) method of factor
analysis and environmental regression (CABFAC and REGRESS).
The program asks whether the first column contains environmental data. If not,
a simple factor analysis with Varimax rotation will be computed on row-normalized
data. If environmental data are included, the factors will be regressed onto the en-
vironmental variable using the second-order (parabolic) method of Imbrie & Kipp,
with cross terms. PAST then reports the RMA regression of original environmen-
tal values against values reconstructed from the transfer function. You can also
save the transfer function as a text file that can later be used for reconstruction of
palaeoenvironment (see below). This file contains:

• Number of taxa

• Number of factors

• Factor scores for each taxon

• Number of regression coefficients

• Regression coefficients (second- and first-order terms, and intercept)

Calibration from CABFAC

Typical application Assumptions Data needed
Reconstructing environmen- Faunal responses to the Samples in rows, taxa in
tal parameters (e.g. tem- environment are constant columns. The program will
perature) from a CABFAC through time and space also ask for a previously
transfer function and fossil constructed transfer function
species abiundances file

This module will reconstruct a (single) environmental parameter from taxa-

in-samples abundance data. The program will also ask for a CABFAC transfer
function file, as previously made by CABFAC factor analysis. The set of taxa
(columns) must be identical in the spreadsheet and the transfer function file. (There
seems to be a numerical instability that adds some noise to the solution - this is
being looked into).

Calibration from optima

Typical application Assumptions Data needed
Reconstructing environmen- None Samples in rows, taxa in
tal parameters (e.g. tem- columns. First three rows:
perature) from species abun- Optima, tolerances, peak
dance and species optimum abundances. Consequent
data. rows: Abundance counts.

41
The first three rows can be generated from known (Recent) abundance and
environmental data by the "Species packing" option in the Model menu. The third
row (peak abundance) is not used, and the second row (tolerance) is used only
when the "Equal tolerances" box is not ticked.
The algorithm is weighted averaging, optionally with tolerance weighting, ac-
cording to ter Braak & van Dam (1989).

Modern Analog Technique

Typical application Assumptions Data needed
Reconstructing paleoenvi- Fossil associations are Samples in rows, taxa in
ronmental parameters (e.g. ecologically comparable to columns. First column
temperature) from fossil modern ones contains environmental data
counts and modern data. - downcore samples have
question marks in this col-
umn. All modern samples in
first rows, then all downcore
samples.

The Modern Analog Technique works by finding modern samples with faunal
associations close to modern ones.

Parameters to set
• Distance measure: Several distance measures commonly used in MAT are
available. "Squared chord" has become the standard choice in the literature.

• Weighting: When several modern analogs are linked to one downcore sam-
ple, their environmental values can be weighted equally, inversely propor-
tional to faunal distance, or inversely proportional to ranked faunal distance.

• Distance threshold: Only modern analogs closer than this threshold are used.
A default value is given, which is the tenth percentile of distances between all
sample pairs in the modern data. The "Dissimilarity distribution" histogram
may be useful when selecting this threshold.

• N analogs: This is the maximum number of modern analogs used for each
downcore sample.

• Jump method (on/off): For each downcore sample, modern samples are
sorted by ascending distance. When the distance increases by more than
the selected percentage, the subsequent modern analogs are discarded.

Note that one or more of these options can be disabled by entering a large value.
For example, a very large distance threshold will never apply, so the number of
analogs is decided only by the "N analogs" value and optionally the jump method.

42
Cross validation
The scatter plot and R2 value show the results of a leave-one-out (jackknifing)
cross-validation within the modern data. The y = x line is shown in red. This only
partly reflects the "quality" of the method, as it gives little information about the
accuracy of downcore estimation.

43
8 Fitting data to functions
Linear
Typical application Assumptions Data needed
Fitting data to a straight None One or two columns of
line, or exponential or power counted or measured data
function

If two columns are selected, they represent x and y values, respectively. If one
column is selected, it represents y values, and x values are taken to be the sequence
of positive integers (1,2,...). A straight line y = ax+b is fitted to the data. There are
two different algorithms available: Standard regression and Reduced Major Axis
(the latter is selected by ticking the box). Standard regression keeps the x values
fixed, and finds the line which minimizes the squared errors in the y values. Use
this if your x values have very small errors associated with them. Reduced Major
Axis tries to minimize both the x and the y errors. RMA fitting and standard error
estimation is according to Miller & Kahn (1962), not Davis (1986)!
Also, both x and y values can be log-transformed (base 10), in effect fitting
your data to the ’allometric’ function y = 10b xa . An a value around 1 indicates
that a straight-line (’isometric’) fit may be more applicable.
The values for a and b, their errors, a Chi-square correlation value (not for
RMA), Pearson’s r correlation, and the probability that the columns are not corre-
lated are given.
The calculation of standard errors for slope and intercept assumes normal dis-
tribution of residuals and independence between the variables and the variance of
residuals. If these assumptions are strongly broken, it is preferable to use the boot-
strapped 95 percent confidence intervals (2000 replicates). The number of random
points selected for each replicate should normally be kept as N , but may be reduced
for special applications.
The permutation test on correlation (R2 ) uses 10,000 replicates.
In Standard regression (not RMA), a 95 percent "Working-Hotelling" confi-
dence band for the fitted line (not for the data points!) is available.

Residuals
The Residuals window reports the distances from each data point to the regression
line, in the x and y directions. Only the latter is of interest when using ordinary
linear regression rather than RMA. The residuals can be copied back to the spread-
sheet and inspected for normal distribution and independence between independent
variable and residual variance (homoskedasticity).

44
Exponential functions
Your data can be fitted to an exponential function y = eb eax by first log-transforming
just your y column (in the Transform menu) and then performing a straight-line fit.

Linear, one independent, n dependent

Typical application Assumptions Data needed
Linear fitting of several de- As for simple linear fitting Two or more columns of
pendent variates to one inde- measured data
pendent, continuous variate.

When you have one independent variate and several dependent variates, you
can fit each dependent variate separately to the independent variate using simple
linear regression. This module makes the process more convenient by having a
scroll button going through each dependent variate.
In addition, an overall test of multivariate regression significance is provided
(MANOVA with Wilks’ lambda).

Polynomial
Typical application Assumptions Data needed
Fitting data to a polynomial None (except for the calcu- Two columns of counted or
lation of p) measured data

Two columns must be selected (x and y values). A polynomial of up to the

fourth order is fitted to the data. The algorithm is based on a least-squares criterion
and singular value decomposition (Press et al. 1992), with mean and variance
standardization for improved numerical stability.
The chi-squared value is a measure of fitting error - larger values mean poorer
fit. The Akaike Information Criterion has a penalty for the number of terms. The
AIC should be as low as possible to maximize fit but avoid overfitting.
R2 is the coefficient of determination, or proportion of variance explained by
the model. Finally, a p value, based on an F test, gives the significance of the fit.

Sinusoidal
Typical application Assumptions Data needed
Fitting data to a set of peri- None Two columns of counted or
odic, sinusoidal functions measured data

Two columns must be selected (x and y values). A sum of up to eight sinusoids

with periods specified by the user, but with unknown amplitudes and phases, is
fitted to the data. This can be useful for modelling periodicities in time series, such

45
as annual growth cycles or climatic cycles, usually in combination with spectral
analysis. The algorithm is based on a least-squares criterion and singular value
decomposition (Press et al. 1992). By default, the periods are set to the range of
the x values, and harmonics (1/2, 1/3, 1/4, 1/5, 1/6, 1/7 and 1/8 of the fundamental
period). These values can be changed, and need not be in harmonic proportion.
The chi-squared value is a measure of fitting error - larger values mean poorer
fit. The Akaike Information Criterion has a penalty for the number of sinusoids
(the equation used assumes that the periods are estimated from the data). The AIC
should be as low as possible to maximize fit but avoid overfitting.
R2 is the coefficient of determination, or proportion of variance explained by
the model. Finally, a p value, based on an F test, gives the significance of the fit.
A "search" function for each sinusoid will optimize the frequency of that si-
nusoid (over the full meaningful range from one period to the Nyquist frequency),
holding all other selected sinusoid frequencies constant. The algorithm is slow but
very robust and almost guaranteed to find the global optimum.
For a "blind" spectral analysis, finding all parameters of an optimal number of
sinusoids, follow this procedure: Start with only the first sinusoid selected. Click
"search" to optimize period, amplitude and phase. This will find the strongest
sinusoid in the data. Note the AIC. Add (select) the second sinusoid, and click its
search button to optimize all parameters of both sinusoids except the period of the
first sinusoid. This will find the second strongest sinusoid. Continue until the AIC
no longer decreases.
It is not meaningful to specify periodicities that are smaller than two times the
typical spacing of data points.
Each sinusoid is given by y = a cos(2π(x − x0 )/T − φ), where a is the
amplitude, T is the period and φ is the phase. x0 is the first (smallest) x value.

Logistic
Typical application Assumptions Data needed
Fitting data to a logistic None Two columns of counted or
or von Bertalanffy growth measured data
model

Attempts to fit the data to the logistic equation y = a/(1 + b ∗ e−cx ). For
numerical reasons, the x axis is normalized. The algorithm is a little complicated.
The value of a is first estimated to be the maximal value of y. The values of b and
c are then estimated using a straight-line fit to a linearized model.
Though acceptable, this estimate can optionally be improved by using the esti-
mated values as an initial guess for a Levenberg-Marquardt nonlinear optimization
(tick the box). This procedure can sometimes improve the fit, but due to the nu-
merical instability of the logistic model it often fails with an error message.
The logistic equation can model growth with saturation, and was used by Sep-
koski (1984) to describe the proposed stabilization of marine diversity in the late

46
Palaeozoic.
The 95 percent confidence intervals are based on 2000 bootstrap replicates, not
using the Levenberg-Marquardt optimization step.

Von Bertalanffy
An option in the ’Logistic fit’ window. Uses the same algorithm as above, but fits
to the von Bertalanffy equation y = a ∗ (1 − b ∗ e−cx ). This equation is used for
modelling growth of multi-celled animals (in units of length or width, not volume).

Smoothing splines
Typical application Assumptions Data needed
Smoothing noisy data Smooth underlying function Two columns X/Y of
counted or measured data.
A third column can contain
standard deviations on
Y. A fourth column can
contain extra X values for
interpolation.

Two columns must be selected (X and Y values). The data are fitted to a
smoothing spline, which is a sequence of third-order polynomials, continuous up to
the second derivative. A typical application of this is the construction of a smooth
curve going through a noisy data set. The algorithm is due to de Boor (2001).
An optional third columns specifies standard deviations on the data points.
These are used for weighting the data. If unspecified, these values are all set to
10 percent of the standard deviation of the Y values.
A smoothing value is set by the user. This is a normalized version of the
smoothing factor of de Boor (default 1). Larger values give smoother curves. A
value of 0 will start a spline segment at every point. Clicking ’Optimize smooth-
ing’ will calculate an ’optimal’ smoothing factor by a crossvalidation (jackknife)
procedure.
’View given points’ gives a table of the given data points X, Y and stdev(Y), the
corresponding Y values on the spline curve (ys) and the residuals. The chi-squared
test for each point may be used to identify outliers. The final column suggests an
stdev(Y) value to use if forcing the p value to 0.5.
An optional fourth column (if used then the third column must also be filled
with stdev values) may not contain the same number of values as the previous
columns. It contains extra X values to be used for interpolation between the data
points.
Note that sharp jumps in your data can give rise to oscillations in the curve, and
that you can also get large excursions in regions with few data points.

47
Multiple data points at the same X value are handled by collapsing them to a
single point by weighted averaging and calculation of a combined standard devia-
tion.

Abundance models
Typical application Assumptions Data needed
Fitting taxon abundance dis- None One column of abundance
tribution to one of four mod- counts for a number of taxa
els in a sample

This module can be used for plotting taxon abundances in descending rank
order on a linear or logarithmic(Whittaker plot) scale, or number of species in
abundance octave classes (as shown when fitting to log-normal distribution). It can
also fit the data to one of four different standard abundance models:

• Geometric, where the 2nd most abundant species should have a taxon count
of k<1 times the most abundant, the 3rd most abundant a taxon count of k
times the 2nd most abundant etc. for a constant k. This will give a straight
descending line in the Whittaker plot. Fitting is by simple linear regression
of the log abundances.

• Log-series, with two parameters α and x. The fitting algorithm is from Krebs
(1989).

• Broken stick. There are no free parameters to be fitted in this model.

• Log-normal. The fitting algorithm is from Krebs (1989). The logarithm

(base 10) of the fitted mean and variance are given. The octaves refer to
power-of-2 abundance classes:
Octave Abundance
1 1
2 2-3
3 4-7
4 8-15
5 16-31
6 32-63
7 64-127
... ...

A significance value based on chi-squared is given for each of these models,

but the power of the test is not the same for the four models and the significance
values should therefore not be compared. It is important, as always, to remember
that a high p value can not be taken to imply a good fit. A low value does however
imply a bad fit.

48
Species packing (Gaussian)
Typical application Assumptions Data needed
Fitting species abundance Unimodal species responses One column of environmen-
data along a gradient to an tal measurements in sam-
environmental parameter ples (e.g. temperature),
and one or more columns
of abundance data (taxa in
columns).

This module fits Gaussian response models to species abundances along a gra-
dient, for one or more species. The fitted parameters are optimum (average), toler-
ance (standard deviation) and maximum.
Two algorithms are available: Least-squares Gaussian fitting by internal trans-
formation to a parabolic form, and weighted averaging according to ter Braak &
van Dam (1989).

49
9 Diversity
Diversity statistics
Typical application Assumptions Data needed
Quantifying taxonomical di- Representative samples One or more columns, each
versity in samples containing counts of individ-
uals of different taxa down
the rows

These statistics apply to association data, where numbers of individuals are

tabulated in rows (taxa) and possibly several columns (associations). The available
statistics are as follows, for each association:

• Number of taxa (S)

• Total number of individuals (n)
• Dominance=1-Simpson index. Ranges from 0 (all taxa are equallypresent)
P ni 2
to 1 (one taxon dominates the community completely). D = n where
ni is number of individuals of taxon i.
• Simpson index=1-dominance. Measures ’evenness’ of the community from
0 to 1. Note the confusion in the literature: Dominance and Simpson indices
are often interchanged!
• Shannon index (entropy). A diversity index, taking into account the number
of individuals as well as number of taxa. Varies from 0 for communities with
only a single taxon to high values for communities with many taxa, each with
few individuals. H = − nni ln nni
P

• Buzas and Gibson’s evenness: eH /S

• Menhinick’s richness index - the ratio of the number of taxa to the square
root of sample size.
• Margalef’s richness index: (S − 1)/ ln(n), where S is the number of taxa,
and n is the number of individuals.
• Equitability. Shannon diversity divided by the logarithm of number of taxa.
This measures the evenness with which individuals are divided among the
taxa present.
• Fisher’s alpha - a diversity index, defined implicitly by the formula S =
α ln(1 + n/α) where S is number of taxa, n is number of individuals and α
is the Fisher’s alpha.
• Berger-Parker dominance: simply the number of individuals in the dominant
taxon divided by n.

50
Most of these indices are explained in Harper (1999).
Approximate confidence intervals for all the indices can be computed with a
bootstrap procedure. 1000 random samples are produced (200 prior to version
0.87b), each with the same total number of individuals as in the original sam-
ple. The random samples are taken from the total, pooled data set (all columns).
For each individual in the random sample, the taxon is chosen with probabilities
according to the original abundances. A 95 percent confidence interval is then cal-
culated. Note that the diversity in the replicates will often be less than, and never
larger than, the pooled diversity in the total data set.
Since these confidence intervals are all computed with respect to the pooled
data set, they do not represent confidence intervals for the individual samples. They
are mainly useful for identifying samples where the given diversity index falls out-
side the confidence interval. Bootstrapped comparison of diversity indices in two
samples is provided in the "Compare diversities" module.

Quadrat richness
Typical application Assumptions Data needed
Estimating species richness Representative, random Two or more columns, each
from several quadrat sam- quadrats of equal size containing presence/absence
ples (1/0) of different taxa down
the rows

Four non-parametric species richness estimators are included in PAST: Chao

2, first- and second-order jackknife, and bootstrap. All of these require presence-
absence data in two or more sampled quadrats of equal size. Colwell & Coddington
(1994) reviewed these estimators, and found that the Chao2 and the second-order
jackknife performed best.

Taxonomic distinctness
Typical application Assumptions Data needed
Quantifying taxonomical Representative samples One or more columns, each
distinctness in samples containing counts of indi-
viduals of different taxa
down the rows. In ad-
dition, the leftmost row(s)
must contain names of gen-
era/families etc. (see be-
low).

Taxonomic diversity and taxonomic distinctness as defined by Clarke & War-

wick (1998), including confidence intervals computed from 200 random replicates
taken from the pooled data set (all columns). Note that the "global list" of Clarke &

51
Warwick is not entered directly, but is calculated internally by pooling (summing)
the given samples.
These indices depend on taxonomic information also above the species level,
which has to be entered for each species as follows. Species names go in the name
column (leftmost, fixed column), genus names in column 1, family in column 2
etc. Species counts follow in the columns thereafter. The program will ask for the
number of columns containing taxonomic information above the species level.
For presence-absence data, taxonomic diversity and distinctness will be valid
but equal to each other.

Compare diversities
Typical application Assumptions Data needed
Comparing diversities in Equal sampling conditions Two columns of abundance
two samples of abundance data with taxa down the
data rows

This module computes a number of diversity indices for two samples, and then
compares the diversities using two different randomization procedures as follows.

Bootstrapping
The two samples A and B are pooled. 1000 random pairs of samples (Ai , Bi ) are
then taken from this pool (200 prior to version 0.87b), with the same numbers of
individuals as in the original two samples. For each replicate pair, the diversity in-
dices div(Ai ) and div(Bi ) are computed. The number of times |div(Ai )−div(Bi )|
exceeds or equals |div(A) − div(B)| indicates the probability that the observed
difference could have occurred by random sampling from one parent population as
estimated by the pooled sample.
A small probability value p(equal) then indicates a significant difference in
diversity index between the two samples.

Permutation
1000 random matrices with two columns (samples) are generated, each with the
same row and column totals as in the original data matrix. The p value is computed
as for the boostrap test.

Diversity t test
Typical application Assumptions Data needed
Comparing Shannon diver- Equal sampling conditions Two columns of abundance
sities in two samples of data with taxa down the
abundance data rows

52
Comparison of the Shannon diversities (entropies) in two samples, using a t
test described by Poole (1974). This is an alternative to the randomization test
available in the Compare diversities module.
Note that the Shannon indices here include a bias correction term (Poole 1974),
and may diverge slightly from the uncorected estimates calculated elsewhere in
PAST, at least for small samples.

Diversity profiles
Typical application Assumptions Data needed
Comparing diversities in Equal sampling conditions Two columns of abundance
two samples of abundance data with taxa down the
data rows

The validity of comparing diversities in two samples can be criticized because

of arbitrary choice of diversity index. One sample may for example contain a
larger number of taxa, while the other has a larger Shannon index. It may therefore
be a good idea to try a number of diversity indices in order to make sure that
the diversity ordering is robust. A formal way of doing this is to define a family
of diversity indices, dependent upon a single continuous parameter (Tothmeresz
1995).
PAST uses the exponential of the so-called Renyi index, which depends upon a
parameter alpha. For alpha=0, this function gives the total species number; alpha=1
gives an index proportional to the Shannon index, while alpha=2 gives an index
which behaves like the Simpson index.
Ps α
ln i=1 pi
H= ,
1−α
where pi are proportional abundances of individual taxa and s is the number of
species.
The program plots two such diversity profiles together. If the profiles cross, the
diversities are non-comparable.

Individual rarefaction
Typical application Assumptions Data needed
Comparing taxonomical di- When comparing samples: One or more columns of
versity in samples of differ- Samples should be taxo- counts of individuals of dif-
ent sizes nomically similar, obtained ferent taxa (each column
using standardised sampling must have the same number
and taken from a similar of values)
’habitat’.

Given one or more columns of abundance data for a number of taxa, this mod-
ule estimates how many taxa you would expect to find in a sample with a smaller

53
total number of individuals. With this method, you can compare the number of
taxa in samples of different size. Using rarefaction analysis on your largest sam-
ple, you can read out the number of expected taxa for any smaller sample size. The
algorithm is from Krebs (1989). An example application in palaeontology can be
found in Adrain et al. (2000).
Let N be the total number of individuals in the sample, s the total number
of species, and Ni the number of individuals of species number i. The expected
number of species E(Sn ) in a sample of size n and the variance V (Sn ) are then
given by
s N −Ni
" #
X
n
E(Sn ) = 1− N
i=1 n

s N −Ni N −Ni
" !#
X
n n
V (Sn ) = N
1− N
i=1 n n
s j−1 " N −N −N
i j N −Ni N −Nj
#
n n n
XX
+ 2 N
− N N
(3)
j=2 i=1 n n n

Standard errors (square roots of variances) are given by the program. In the
graphical plot, these standard errors are converted to 95 percent confidence inter-
vals.

Sample rarefaction (Mao tau)

Typical application Assumptions Data needed
Computing species accumu- Similar to individual-based A matrix of presence-
lation curves as a function of rarefaction absence data (abundances
number of samples treated as presences), with
taxa in rows and samples in
columns.

Sample-based rarefaction (also known as the species accumulation curve) is

applicable when a number of samples are available, from which species richness is
to be estimated as a function of number of samples. PAST implements the analyt-
ical solution known as "Mao tau", with standard deviation. In the graphical plot,
the standard errors are converted to 95 percent confidence intervals.
See Colwell et al. (2004) for details.

54
Diversity curves
Typical application Assumptions Data needed
Plotting diversity curves None Abundance or pres-
from occurrence data ence/absence matrix with
samples in rows (lowest
sample at bottom) and taxa
in columns

Found in the ’Strat’ menu, this simple tool allows plotting of diversity curves
from occurrence data in a stratigraphical column. Note that samples should be
in stratigraphical order, with the uppermost (youngest) sample in the uppermost
row. Data are subjected to the range-through assumption (absences between first
and last appearance are treated as presences). Originations and extinctions are in
absolute numbers, not percentages.
The ’Endpoint correction’ option counts a FAD or LAD in a sample as 0.5
instead of 1 in that sample. Both FAD and LAD in the sample counts as 0.33.

55
10 Time series analysis
Spectral analysis
Typical application Assumptions Data needed
Finding periodicities in Time series long enough to One or two columns of
counted or measured data contain at least four cycles counted or measured data

Two columns must be selected (x and y values). Since palaeontological data

are often unevenly sampled, the FFT algorithm can be difficult to use. PAST there-
fore includes the Lomb periodogram algorithm for unevenly sampled data, with
time values given in the first column.
The frequency axis is in units of 1/(x unit). If for example, your x values are
given in millions of years, a frequency of 0.1 corresponds to a period of 10 million
years. The power axis is in units proportional to the square of the amplitudes of the
sinusoids present in the data.
Also note that the frequency axis extends to very high values. If your data are
evenly sampled, the upper half of the spectrum is a mirror image of the lower half,
and is of little use. If some of your regions are closely sampled, the algorithm may
be able to find useful information even above the half-point (Nyquist frequency).
The highest peak in the spectrum is presented with its frequency and power
value, together with a probability that the peak could occur from random data.
The 0.01 and 0.05 significance levels (’white noise lines’) are shown as red dashed
lines.
You may want to remove any linear trend in the data (Edit menu) before ap-
plying the Lomb periodogram. Failing to do so can produce annoying peaks at low
frequencies.

Autocorrelation
Typical application Assumptions Data needed
Finding periodicities in Time series long enough to One column of counted or
counted or measured data contain at least two cycles. measured data
Even spacing of data points.

Autocorrelation (Davis 1986) is carried out on separate column(s) of evenly

sampled temporal/stratigraphic data. Lag times up to N/2, where N is the num-
ber of values in the vector, are shown along the x axis (positive lag times only -
the autocorrelation function is symmetrical around zero). A predominantly zero
autocorrelation signifies random data - periodicities turn up as peaks. q
The "95 percent confidence interval" option will draw lines at plus/minus 1.76 n−τ1 +3 ,
after Davis (1986). This is the confidence interval for random, independent points.
This module handles missing data, coded with question marks (’?’).

56
Cross-correlation
Typical application Assumptions Data needed
Finding an optimal align- Even spacing of data points. Two columns of counted or
ment of two time series measured data

Cross-correlation (Davis 1986) is carried out on two column(s) of evenly sam-

pled temporal/stratigraphic data. The x axis shows the displacement of the second
column with respect to the first, the y axis the correlation between the two time
series for a given displacement. The "p values" option will draw the significance
of the correlation, after Davis (1986).

Wavelet transform
Typical application Assumptions Data needed
Inspection of time series at Even spacing of data points One column of counted or
different scales measured data

The continuous wavelet transform (CWT) is an analysis method where a data

set can be inspected at small, intermediate and large scales simultaneously. It can
be useful for detecting periodicities at different wavelengths, self-similarity and
other features. The vertical axis in the plot is a logarithmic size scale, with the
signal observed at a scale of only two consecutive data points at the bottom, and
at a scale of one fourth of the whole sequence at the top. One unit on this axis
corresponds to a doubling of the size scale. The bottom of the figure thus represents
a detailed, fine-grained view, while the top represents a smoothed overview of
longer trends. Signal energy (or more correctly correlation strength with the scaled
mother wavelet) is shown with a grayscale.
The shape of the mother wavelet can be set to Morlet, Gauss or Sombrero. The
Morlet wavelet usually performs best.
The algorithm is based on fast convolution of the signal with the wavelet at
different scales, using the FFT.
The wavelet transform was used by Prokoph et al. (2000) for illustrating cycles
in diversity curves for planktic foraminiferans.

Walsh transform
Typical application Assumptions Data needed
Spectral analysis (finding Even spacing of data points One column of binary (0/1)
periodicities) of binary or or ordinal (integer) data
ordinal data

The normal methods for spectral analysis are perhaps not optimal for binary
data, because they decompose the time series into sinusoids rather than "square

57
waves". The Walsh transform may then be a better choice, using basis functions
that flip between -1 and +1. These basis functions have different "frequencies"
(number of transitions divided by two), known as sequencies. In PAST, each pair
of even ("cal") and odd ("sal") basis functions (one pair for each integer-valued
sequency) is combined into a power value using cal2 + sal2 , producing a "power
spectrum" that is comparable to the Lomb periodogram.
Note that the Walsh transform is slightly "exotic" compared with the Fourier
transform, and its interpretation must be done cautiously. For example, the ef-
fects of the duty cycle (percentage of ones versus zeros) are somewhat difficult to
understand.
In PAST, the data values are pre-processed by multiplying with two and sub-
tracting one, bringing 0/1 binary values into the -1/+1 range optimal for the Walsh
transform.

Runs test
Typical application Assumptions Data needed
Testing for randomness in a None One column containing a
time series time series. The values are
converted to 0 (x≤0) or 1
(x > 0).

The runs test is a non-parametric test for randomness in a sequence of values.

Non-randomness may include such effects as autocorrelation, trend and periodicity.
The test is based on a dichotomy between two values (x≤0 or x > 0). It counts
the number of runs (groups of consecutive equal values) and compares this to a the-
oretical value. The runs test can therefore be used directly for sequences of binary
data. Continuous data can be converted to an appropriate form by subtracting the
mean (Transform menu), or taking the difference from one value to the next (use
"x-u" in the Evaluate Expression function).

Mantel correlogram (and periodogram)

Typical application Assumptions Data needed
"Autocorrelation" and spec- None Several rows of multivariate
tral analysis of a multivariate data, one row for each sam-
time series ple. Samples are assumed to
be evenly spaced in time.

The Mantel correlogram (e.g. Legendre & Legendre 1998) is a multivariate

extension to autocorrelation, based on any similarity or distance measure. The
Mantel correlogram in PAST shows the average similarity between the time series
and a time lagged copy, for different lags.

58
The Mantel periodogram is a power spectrum of the multivariate time series,
computed from the Mantel correlogram (Hammer 2007).
The Mantel scalogram is an experimental plotting of similarities between all
pairs of points along the time series. The apex of the triangle is the similarity
between the first and last point. The base of the triangle shows similarities between
pairs of consecutive points.

ARMA and intervention analysis

Typical application Assumptions Data needed
Analysis and removal of se- Stationary time series, ex- One column of equally
rial correlations in time se- cept for a single intervention spaced data
ries, and analysis of the im-
pact of an external distur-
bance ("intervention") at a
particular point in time.

This powerful but somewhat complicated module implements maximum-likelihood

ARMA analysis, and a minimal version of Box-Jenkins intervention analysis (e.g.
for investigating whether a climate change had an impact on biodiversity).
By default, a simple ARMA analysis without interventions is computed. The
user selects the number of AR (autoregressive) and MA (moving-average) terms
to include in the ARMA difference equation. The log-likelihood and Akaike infor-
mation criterion are given. Select the numbers of terms that minimize the Akaike
criterion, but be aware that AR terms are more "powerful" than MA terms. Two
AR terms can model a periodicity, for example.
The main aim of ARMA analysis is to remove serial correlations, which oth-
erwise cause problems for model fitting and statistics. The residual should be in-
spected for signs of autocorrelation, e.g. by copying the residual from the numer-
ical output window back to the spreadsheet and using the autocorrelation module.
Note that for many paleontological data sets with sparse data and confounding
effects, proper ARMA analysis (and therefore intervention analysis) will be im-
possible.
The program is based on the likelihood algorithm of Melard (1983), combined
with nonlinear multivariate optimization using simplex search.
Intervention analysis proceeds as follows. First, carry out ARMA analysis
on only the samples preceding the intervention, by typing the last pre-intervention
sample number in the "last samp" box. It is also possible to run the ARMA analysis
only on the samples following the intervention, by typing the first post-intervention
sample in the "first samp" box, but this is not recommended because of the post-
intervention disturbance. Also tick the "Intervention" box to see the optimized
intervention model.
The analysis follows Box and Tiao (1976) in assuming an "indicator function"
u(i) that is either a unit step or a unit pulse, as selected by the user. The indicator

59
function is transformed by an AR(1) process with a parameter δ, and then scaled
by a magnitude w (note that the magnitude given by PAST is the coefficient on the
transformed indicator function: first do y(i) = δy(i − 1) + u(i), then scale y by
the magnitude). The algorithm is based on ARMA transformation of the complete
sequence, then a corresponding ARMA transformation of y, and finally linear re-
gression to find the magnitude. The parameter delta is optimized by exhaustive
search over [0, 1].
For small impacts in noisy data, delta may end up on a sub-optimum. Try
both the step and pulse options, and see what gives smallest standard error on the
magnitude. Also, inspect the "delta optimization" data, where standard error of the
estimate is plotted as a function of δ, to see if the optimized value may be unstable.
The Box-Jenkins model can model changes that are abrupt and permanent (step
function with δ = 0, or pulse with δ = 1), abrupt and non-permanent (pulse with
δ < 1), or gradual and permanent (step with δ < 0).
Be careful with the standard error on the magnitude - it will often be underes-
timated, especially if the ARMA model does not fit well. For this reason, a p value
is deliberately not computed (Murtaugh 2002).

Insolation (solar forcing) model

Typical application Assumptions Data needed
Computation of solar forc- NA None
ing through time

This module computes solar insolation at any latitude and any time from 100
Ma to the Recent (the results are less accurate before 50 Ma). The calculation can
be done for a "true" orbital longitude, "mean" orbital longitude (corresponding to a
certain date in the year), averaged over a certain month in each year, or integrated
over a whole year.
The implementation in PAST is ported from the code by Laskar et al. (2004),
by courtesy of these authors. Please reference Laskar et al. (2004) in any publica-
tions.
It is necessary to specify a data file containing orbital parameters. Download
the file INSOLN.LA2004.BTL.100.ASC from
http://www.imcce.fr/Equipes/ASD/insola/earth/La2004 and
put in anywhere on your computer. The first time you run the calculation, PAST
will ask for the position of the file.
The amount of data can become excessive for long time spans and short step
sizes!

60
11 Geometrical analysis
Directions (one sample)
Typical application Assumptions Data needed
Displaying and testing for See below One column of directional
random distribution of di- (0-360) or orientational (0-
rectional data 180) data in degrees

Plots a rose diagram (polar histogram) of directions given in a column of degree

values (0 to 360). Used for plotting current-oriented specimens, orientations of
trackways, orientations of morphological features (e.g. terrace lines), etc.
By default, the ’mathematical’ angle convention of anticlockwise from east is
chosen. If you use the ’geographical’ convention of clockwise from north, tick the
box.
You can also choose whether to have the abundances proportional to radius in
the rose diagram, or proportional to area (equal area).
The "Kernel density" option plots a circular kernel density estimate.
The mean angle takes circularity into account. The 95 percent confidence inter-
val on the mean is estimated according to Fisher (1983). It assumes circular normal
distribution, and is not accurate for very large variances (confidence interval larger
than 45 degrees).
The R̄ value (Rayleigh’s spread) is given by:
v !2 !2
u n n
u X X
R̄ = t cos θi + sin θi (4)
i=1 i=1

R̄ is further tested against a random distribution using Rayleigh’s test for direc-
tional data (Davis 1986). Note that this procedure assumes evenly or unimodally
distributed data - the test is not appropriate for bidirectional data. The p values are
computed using an approximation given by Mardia (1972).
The Rao’s spacing test for uniform distribution uses probability tables pub-
lished by Russell & Levitin (1996). A Chi-square test for uniform distribution is
also available, with a user-defined number of bins (default 4).
The ’Orientations’ option allows analysis of linear orientations (0-180 degrees).
The Rayleigh test is then carried out by a directional test on doubled angles (this
trick is described by Davis 1986). The Chi-square uses four bins from 0-180 de-
grees. The rose diagram mirrors the histogram around the origin.

61
Directions (two samples)
Typical application Assumptions Data needed
Testing for equal mean angle Concentration value kappa Two columns of directional
in two directional or orienta- >1.0, unimodal (von Mises) (0-360) or orientational (0-
tional samples distribution, similar R val- 180) data in degrees
ues.

Watson-Williams test for equal mean angle in two samples. With corrections
due to Mardia (1972). The concentration parameter kappa is maximum-likelihood,
computed analytically. It should be larger than 1.0 for accurate testing. In addition,
the test assumes similar angular variances (R values).

Circular correlation
Typical application Assumptions Data needed
Testing for correlation be- ’Large N ’ Two columns of directional
tween two directional or ori- (0-360) or orientational (0-
entational variates 180) data in degrees

This module uses the circular correlation procedure and parametric significance
test of Jammalamadaka & Sengupta (2001).

Nearest neighbour point pattern analysis

Typical application Assumptions Data needed
Testing for clustering or Elements small compared to Two columns of x/y posi-
overdispersion of two- their distances, mainly con- tions
dimensional position values vex domain, N>50.

Point distribution statistics using nearest neighbour analysis (modified from

Davis 1986). The area is estimated either by the smallest enclosing rectangle or
using the convex hull, which is the smallest convex polygon enclosing the points.
Both are inappropriate for points in very concave domains. Two different edge ef-
fect adjustment methods are available: wrap-around (’torus’) and Donnelly’s cor-
rection.
The probability that the distribution is random (Poisson process, giving an ex-
ponential nearest neighbour distribution) is presented, together with the R value:

2d¯
R= p ,
A/N

where d¯ is the observed mean distance between nearest neighbours, A is the

area of the convex hull, and N is the number of points. Clustered points give R<1,
Poisson patterns give R 1, while overdispersed points give R>1.

62
The orientations (0-180 degrees) and lengths of lines between nearest neigh-
bours, are also included. The orientations can be subjected to directional analysis
to test whether the points are organised along lineaments.
Applications of this module include spatial ecology (are in-situ brachiopods
clustered) and morphology (are trilobite tubercles overdispersed).

Ripley’s K point pattern analysis

Typical application Assumptions Data needed
Testing for clustering or Rectangular domain Two columns of x/y posi-
overdispersion of two- tions
dimensional position values

Ripley’s K (Ripley 1979) is the average point density as a function of distance

from every point. It is useful when point pattern characteristics change with scale,
e.g. overdispersion over small distances but clustering over large distances. For
complete spatial randomness (CSR), R(d) is expected to increase as the square of
distance. The L(d) function is the square root of R(d)/π. For CSR, L(d) = d, and
L(d) − d = 0. An approximate
√ 95 percent confidence interval for L(d) − d under
CSR is given by 1.42 A/N . Ripley’s edge correction is included.

Area
For the correct calculation of Ripley’s K, the area must be known. In the first
run, the area is computed using the smallest bounding rectangle, but this can often
overestimate the real area, so the area can then be adjusted by the user. An over-
estimated area will typically show up as a strong overall linear trend with positive
slope for L(d) − d.

Fractal dimension
The fractal dimension (if any) can be estimated as the asymptotic linear slope in
a log-log plot of R(d). For CSR, the log-log slope should be 2.0. Fractals should
have slopes less than 2.

Multivariate allometry
Typical application Assumptions Data needed
Finding and testing for al- None A multivariate data set with
lometry in a multivariate variables (distance measure-
morphometric data set ments) in columns, speci-
mens in rows.

This advanced method for investigating allometry in a multivariate data set

is based on Jolicoeur (1963) with extensions by Kowalewski et al. (1997). The

63
data are (automatically) log-transformed and subjected to PCA. The first principal
component (PC1) is then regarded as a size axis (this is only valid if the variation
accounted for by PC1 is large, say more than 80 percent). The allometric coefficient
for each original variable is estimated by dividing the PC1 loading for that variable
by the mean PC1 loading over all variables.
95 percent confidence intervals for the allometric coefficients are estimated by
bootstrapping specimens. 2000 bootstrap replicates are made.
Missing data are supported by column average substitution.

Fourier shape analysis

Typical application Assumptions Data needed
Analysis of fossil outline Shape expressible in polar Digitized x/y coordinates
shape (2D) coordinates, sufficient num- around an outline. Speci-
ber of digitized points to mens in rows, coordinates of
capture features. alternating x and y values in
columns (see Procrustes fit-
ting below).

Accepts X − Y coordinates digitized around an outline. More than one shape

(row) can be simultaneously analyzed. Points do not need to be totally evenly
spaced. The shape must be expressible as a unique function in polar co-ordinates,
that is, any straight line radiating from the centre of the shape must cross the outline
only once.
The algorithm follows Davis (1986). The origin for the polar coordinate system
is found by numerical approximation to the centroid. 128 points are then produced
at equal angular increments around the outline, through linear interpolation. The
centroid is then re-computed, and the radii normalized (size is thus removed from
the analysis). The cosine and sine components are given for the first twenty har-
monics, but note that only N/2 harmonics are ’valid’, where N is the number of
digitized points. The coefficients can be copied to the main spreadsheet for further
analysis (e.g. by PCA).
The ’Shape view’ window allows graphical viewing of the Fourier shape ap-
proximation(s).

Elliptic Fourier shape analysis

Typical application Assumptions Data needed
Analysis of fossil outline Sufficient number of digi- Digitized x/y coordinates
shape tized points to capture fea- around an outline. Speci-
tures. mens in rows, coordinates of
alternating x and y values in
columns (see Procrustes fit-
ting below).

64
More than one shape (row) can be simultaneously analyzed.
Elliptic Fourier shape analysis is in some respects superior to simple Fourier
shape analysis. One advantage is that the algorithm can handle complicated shapes
which may not be expressible as a unique function in polar co-ordinates. Elliptic
Fourier shapes is now a standard method of outline analysis. The algorithm used
in PAST is described in Ferson et al. (1985).
Cosine and sine components of x and y increments along the outline for the
first 20 harmonics are given, but only the first N/2 harmonics should be used,
where N is the number of digitized points. Size and positional translation are
normalized away, and do not enter in the coefficients. However, no attempt is made
to standardize rotation or starting point, so all specimens should be measured in a
standard orientation. The coefficients can be copied to the main spreadsheet for
further analysis (e.g. by PCA).
The ’Shape view’ window allows graphical viewing of the elliptic Fourier
shape approximation(s).

Eigenshape analysis
Typical application Assumptions Data needed
Analysis of fossil outline Sufficient number of digi- Digitized x/y coordinates
shape tized points to capture fea- around several outlines.
tures. Specimens in rows, coordi-
nates of alternating x and
y values in columns (see
Procrustes fitting below).

Eigenshapes are principal components of outlines. The scatter plot of outlines

in principal component space can be shown, and linear combinations of the eigen-
shapes themselves can be visualized.
The implementation in PAST is partly based on MacLeod (1999). It finds
the optimal number of equally spaced points around the outline using an itera-
tive search, so the original points need not be equally spaced. The eigenanaly-
sis is based on the covariance matrix of the non-normalized turning angle incre-
ments around the outlines. The algorithm does not assume a closed curve, and
the endpoints are therefore not constrained to coincide in the reconstructed shapes.
Landmark-registered eigenshape analysis is not included. All outlines must start at
the ’same’ point.

65
Procrustes and Bookstein fitting (2D or 3D)
Typical application Assumptions Data needed
Standardization of morpho- None Digitized x/y or x/y/z
metrical landmark coordi- landmark coordinates.
nates Specimens in rows, co-
ordinates of alternating x
and y (or x/y/z) values in
columns.

The Procrustes option in the Transform menu will transform your measured
coordinates to Procrustes coordinates. There is also a menu choice for Bookstein
coordinates. Specimens go in different rows and landmarks along each row. If you
have three specimens with four landmarks, your data should look as follows:
x1 y1 x2 y2 x3 y3 x4 y4
x1 y1 x2 y2 x3 y3 x4 y4
x1 y1 x2 y2 x3 y3 x4 y4
For 3D the data will be similar, but with additional columns for z.
Landmark data in this format could be analyzed directly with the multivariate
methods in PAST, but it is recommended to standardize to so-called Procrustes co-
ordinates by removing position, size and rotation. A further transformation to Pro-
crustes residuals (approximate tangent space coordinates) is achieved by selecting
’Subtract mean’ in the Edit menu. Note: You must always convert to Procrustes
coordinates first, then to Procrustes residuals.
Here is a typical sequence of operations for landmark analysis:

• Conversion of measured coordinates to Procrustes coordinates

• Conversion of Procrustes coordinates to Procrustes residuals (this must not

be done before Thin-plate Spline Transformation or Shape PCA analysis, see
below).

• Multivariate analysis of tangent space coordinates, with e.g. PCA or cluster

analysis.

A thorough description of Procrustes and tangent space coordinates is given by

Dryden & Mardia (1998). The algorithms for Procrustes fitting are from Rohlf &
Slice (1990) (2D) and Dryden & Mardia (1998) (3D).
Bookstein fitting has a similar function as Procrustes fitting, but simply stan-
dardizes size, rotation and scale by forcing the two first landmarks onto the coor-
dinates (0,0) and (1,0). It is not in common use today. Bookstein fitting is only
implemented for 2D.
Missing data are supported by column average substitution.

66
Shape PCA
This is an option in the Principal Components module (Multivar menu). PCA on
landmark data can be carried out as normal PCA analysis on Procrustes coordinates
for 2D or 3D (see above), but for 2D landmark data some extra functionality is
available in the PCA module by choosing Shape PCA. The var-covar option is
enforced, and the ’Shape deform (2D)’ button enabled. This allows you to view
the displacement of landmarks from the mean shape (plotted as points or symbols)
in the direction of the different principal components, allowing interpretation of
the components. The displacements are plotted as lines (vectors).

Grid
The "Grid" option visualizes the deformations as thin-plate splines. These splines
are the relative warps with the uniform (affine) component included, and with al-
pha=0. Relative warps are also available separately in the "Geomet" menu, but
there the uniform component is not included.

Thin-plate spline transformation grids

Typical application Assumptions Data needed
Visualization of shape None Digitized x/y landmark co-
change ordinates. Specimens in
rows, coordinates of alter-
nating x and y values in
columns. Procrustes stan-
dardization recommended.

The first specimen (first row) is taken as a reference, with an associated square
grid. The warps from this to all other specimens can be viewed. You can also
choose the mean shape as the reference.
The ’Expansion factors’ option will display the area expansion (or contraction)
factor around each landmark in yellow numbers, indicating the degree of local
growth. This is computed using the Jacobian of the warp. Also, the expansions
are colour-coded for all grid elements, with green for expansion and purple for
contraction.
At each landmark, the principal strains can also be shown, with the major strain
in black and minor strain in brown. These vectors indicate directional stretching.
A description of thin-plate spline transformation grids is given by Dryden &
Mardia (1998).

Partial warps
From the thin-plate spline window, you can choose to see the partial warps for a
particular spline deformation. The first partial warp will represent some long-range

67
(large scale) deformation of the grid, while higher-order warps will normally be
connected with more local deformations. The affine component of the warp (also
known as zeroth warp) represents linear translation, scaling, rotation and shearing.
In the present version of PAST you can not view the principal warps.
When you increase the magnification factor from zero, the original landmark
configuration and a grid will be progressively deformed according to the selected
partial warp.

Partial warp scores

From the thin-plate spline window, you can also choose to see the partial warp
scores of all the specimens. Each partial warp score has two components (x and
y), and the scores are therefore presented in scatter plots.

Relative warps
Typical application Assumptions Data needed
Ordination of a set of shapes None Digitized x/y landmark co-
ordinates. Specimens in
rows, coordinates of alter-
nating x and y values in
columns. Procrustes stan-
dardization recommended.

The relative warps can be viewed as the principal components of the set of
thin-plate transformations from the mean shape to each of the shapes under study.
It provides an alternative to direct PCA of the landmarks (see Shape PCA above).
The parameter alpha can be set to one of three values:

• alpha=-1 emphasizes small-scale variation.

• alpha=0 is PCA of the landmarks directly, and is equivalent to Shape PCA

(see above) but without including the affine (uniform) component.

• alpha=1 emphasizes large-scale variation.

The relative warps are ordered according to importance, and the first and sec-
ond warps are usually the most informative. Note that the percentage values of the
eigenvalues are relative to the total non-affine part of the transformation - the affine
part is not included (see Shape PCA for relative warps with the affine component
included).
The relative warps are visualized with thin-plate spline transformation grids.
When you increase or decrease the amplitude factor away from zero, the original
landmark configuration and grid will be progressively deformed according to the
selected relative warp.

68
The relative warp scores of pairs of consecutive relative warps can shown in
scatter plots, and all scores can be shown in a numerical matrix.
The algorithm for computing the relative warps is taken from Dryden & Mardia
(1998).

Size from landmarks (2D or 3D)

Typical application Assumptions Data needed
Size estimation from land- None Digitized x/y or x/y/z
marks landmark coordinates.
Specimens in rows, co-
ordinates with alternating
x and y (and z for 3D)
values in columns. Must
not be Procrustes fitted or
normalized for size!

Calculates the centroid size for each specimen (Euclidean norm of the distances
from all landmarks to the centroid).
The values in the ’Normalized’ column are centroid sizes divided by the square
root of the number of landmarks - this might be useful for comparing specimens
with different numbers of landmarks.

Normalize size
The ’Normalize size’ option in the Transform menu allows you to remove size
by dividing all coordinate values by the centroid size for each specimen. For 2D
data you may instead use Procrustes coordinates, which are also normalized with
respect to size.
See Dryden & Mardia (1998), p. 23-26.

Distance from landmarks (2D or 3D)

Typical application Assumptions Data needed
Calculating distances None Digitized x/y or x/y/z
between two landmarks landmark coordinates.
Specimens in rows, coor-
dinates with alternating x
and y (and z for 3D) values
in columns. May or may
not be Procrustes fitted or
normalized for size.

Calculates the Euclidean distances between two fixed landmarks for one or
many specimens. You must choose two landmarks - these are named according to

69
the name of the first column for the landmark (x value).

All distances from landmarks (EDMA)

Typical application Assumptions Data needed
Calculating distances be- None Digitized x/y or x/y/z
tween all pairs of landmarks landmark coordinates.
Specimens in rows, coor-
dinates with alternating x
and y (and z for 3D) values
in columns. May or may
not be Procrustes fitted or
normalized for size.

This function will replace the landmark data in the data matrix with a data
set consisting of distances between all pairs of landmarks, with one specimen per
row. The number of pairs is N(N-1)/2 for N landmarks. This transformation will
allow multivariate analysis of distance data, which are not sensitive to rotation or
translation of the original specimens, so a Procrustes fitting is not mandatory before
such analysis. Using distance data also allows log-transformation, and analysis of
fit to the allometric equation for pairs of distances.
Missing data are supported by column average substitution.

Landmark linking
This function in the Geomet menu allows the selection of any pairs of landmarks
to be linked with lines in the morphometric plots (thin-plate splines, partial and
relative warps, etc.), to improve readability. The landmarks must be present in the
main spreadsheet before links can be defined.
Pairs of landmarks are selected or deselected by clicking in the symmetric ma-
trix. The set of links can also be saved in a text file. Note that there is little error
checking in this module.

Burnaby size removal

This function in the Transform menu will project your multivariate data set of mea-
sured distances onto a space orthogonal to the first principal component. Burnaby’s
method may (or may not!) remove isometric size from the data, for further "size-
free" data analysis. The "Allometric" option will log-transform the data prior to
projection, thus conceivably removing also allometric size-dependent shape varia-
tion from the data. Note that the implementation in PAST does not center the data
within groups - it assumes that all specimens (rows) belong to one group.

70
Gridding (spatial interpolation)
Typical application Assumptions Data needed
Spatial interpolation of scat- Some degree of smoothness Three columns with position
tered data points onto a reg- (x,y) and corresponding data
ular grid values

Gridding (spatial interpolation) allows the production of a map showing a con-

tinuous spatial estimate of some variate such as fossil abundance or thickness of a
rock unit, based on scattered data points. The user can specify the size of the grid
(number of rows and columns), but in the present version the spatial coverage of
the map is generated automatically based on the positions of data points (the map
will always be square).
A least-squares linear surface (trend) is automatically fitted to the data, re-
moved prior to gridding and finally added back in. This is primarily useful for the
semivariogram modelling and the kriging method.
Three algorithms are available:

Moving average
The value at a grid node is simply the average of the N closest data points, as
specified by the user (the default is to use all data points). The points are given
weight in inverse proportion to distance. This algorithm is simple and will not
always give good (smooth) results. One advantage is that the interpolated values
will never go outside the range of the data points.

Thin-plate spline
Maximally smooth interpolator. Can overshoot in the presence of sharp bends in
the surface.

Kriging
The user is required to specify a model for the semivariogram, by choosing one of
three models (spherical, exponential or Gaussian) and corresponding parameters to
fit the empirical semivariances as well as possible. The semivariogram is computed
within each of a number of bins. Using the histogram option, choose a number
of bins so that each bin (except possibly the rightmost ones) contains at least 30
distances See e.g. Davis (1986) for more information.
The kriging procedure also provides an estimate of standard errors across the
map (this depends on the semivariogram model being accurate). Kriging in PAST
does not provide for anisotropic semivariance.

71
12 Cladistics
Typical application Assumptions Data needed
Semi-objective analysis of Many! See Kitchin et al. Character matrix with taxa
relationships between taxa (1998) in rows, outgroup in first
from morphological or ge- row. For calculation of
netic evidence stratigraphic congruence in-
dices, first and last appear-
ance datums must be given
in the first two columns.

Warning: the cladistics package in PAST is fully operational, but lacking in

comprehensive functionality. The heuristic algorithms seem not to perform quite
as well as in some other programs (this is being looked into). The PAST cladistics
package is adequate for education and initial data exploration, but for more ’se-
rious’ work we recommend a specialized program such as PAUP. Algorithms are
from Kitchin et al. (1998).

Parsimony analysis
Character states should be coded using integers in the range 0 to 255. The first
taxon is treated as the outgroup, and will be placed at the root of the tree.
Missing values are coded with a question mark (?) or the value -1. Please note
that PAST does not collapse zero-length branches. Because of this, missing values
can lead to a proliferation of equally shortest trees ad nauseam, many of which are
in fact equivalent.
There are four algorithms available for finding short trees:

Branch-and-bound
The branch-and-bound algorithm is guaranteed to find all shortest trees. The total
number of shortest trees is reported, but a maximum of 1000 trees are saved. You
should not use the branch-and-bound algorithm for data sets with more than 12
taxa.

Exhaustive
The exhaustive algorithm evaluates all possible trees. Like the branch-and-bound
algorithm it will necessarily find all shortest trees, but it is very slow. For 12 taxa,
more than 600 million trees are evaluated! The only advantage over branch-and-
bound is the plotting of tree length distribution. This histogram may indicate the
’quality’ of your matrix, in the sense that there should be a tail to the left such that
few short trees are ’isolated’ from the greater mass of longer trees (but see Kitchin
et al. 1998 for critical comments on this). For more than 8 taxa, the histogram is
based on a subset of tree lengths and may not be accurate.

72
Heuristic, nearest neighbour interchange
This heuristic algorithm adds taxa sequentially in the order they are given in the
matrix, to the branch where they will give least increase in tree length. After each
taxon is added, all nearest neighbour trees are swapped to try to find an even shorter
tree.
Like all heuristic searches, this one is much faster than the algorithms above
and can be used for large numbers of taxa, but is not guaranteed to find all or any of
the most parsimonious trees. To decrease the likelihood of ending up on a subopti-
mal local minimum, a number of reorderings can be specified. For each reordering,
the order of input taxa will be randomly permutated and another heuristic search
attempted.
Please note: Because of the random reordering, the trees found by the heuristic
searches will normally be different each time. To reproduce a search exactly, you
will have to start the parsimony module again from the menu, using the same value
for "Random seed". This will reset the random number generator to the seed value.

Heuristic, subtree pruning and regrafting

This algorithm (SPR) is similar to the one above (NNI), but with a more elaborate
branch swapping scheme: A subtree is cut off the tree, and regrafting onto all other
branches in the tree is attempted in order to find a shorter tree. This is done after
each taxon has been added, and for all possible subtrees. While slower than NNI,
SPR will often find shorter trees.

Heuristic, tree bisection and reconnection

This algorithm (TBR) is similar to the one above (SPR), but with an even more
complete branch swapping scheme. The tree is divided into two parts, and these
are reconnected through every possible pair of branches in order to find a shorter
tree. This is done after each taxon is added, and for all possible divisions of the
tree. TBR will often find shorter trees than SPR and NNI, at the cost of longer
computation time.

Character optimization criteria

Three different optimization criteria are availiable:

Wagner
Characters are reversible and ordered, meaning that 0->2 costs more than 0->1, but
has the same cost as 2->0.

73
Fitch
Characters are reversible and unordered, meaning that all changes have equal cost.
This is the criterion with fewest assumptions, and is therefore generally preferable.

Dollo
Characters are ordered, but acquistition of a character state (from lower to higher
value) can happen only once. All homoplasy is accounted for by secondary rever-
sals. Hence, 0->1 can only happen once, normally relatively close to the root of
the tree, but 1->0 can happen any number of times further up in the tree. (This
definition has been debated on the PAST mailing list, especially whether Dollo
characters need to be ordered).

Bootstrap
Bootstrapping is performed when the ’Bootstrap replicates’ value is set to non-zero.
The specified number of replicates (typically 100 or even 1000) of your character
matrix are made, each with randomly weighted characters. The bootstrap value for
a group is the percentage of replicates supporting that group. A replicate supports
the group if the group exists in the majority rule consensus tree of the shortest trees
made from the replicate.
Warning: Specifying 1000 bootstrap replicates will clearly give a thousand
times longer computation time than no bootstrap! Exhaustive search with boot-
strapping is unrealistic and is not allowed.

Cladogram plotting
All shortest (most parsimonious) trees can be viewed, up to a maximum of 1000
trees. If bootstrapping has been performed, a bootstrap value in percents is given
at the root of the subtree specifying each group.
Character states can also be plotted onto the tree, as selected by the ’Character’
buttons. This character reconstruction is unique only in the absence of homoplasy.
In case of homoplasy, character changes are placed as close to the root as possible,
favouring one-time acquisition and later reversal of a character state over several
independent gains (so-called accelerated transformation).
The ’Phylogram’ option allows plotting of trees where the length of vertical
lines (joining clades) is proportional to branch length.

Consistency index
The per-character consistency index (ci) is defined as m/s, where m is the mini-
mum possible number of character changes (steps) on any tree, and s is the actual

74
number of steps on the current tree. This index hence varies from one (no homo-
plasy) and down towards zero (a lot of homoplasy). The ensemble consistency
index CI is a similar index summed over all characters.

Retention index
The per-character retention index (ri) is defined as (g − s)/(g − m), where m
and s are as for the consistency index, while g is the maximal number of steps for
the character on any cladogram (Farris 1989). The retention index measures the
amount of synapomorphy on the tree, and varies from 0 to 1.

Consensus tree
The consensus tree of all shortest (most parsimonious) trees can also be viewed.
Two consensus rules are implemented: Strict (groups must be supported by all
trees) and majority (groups must be supported by more than 50 percent of the
trees).

Bremer support (decay index)

The Bremer support for a clade is the number of extra steps you need to construct a
tree (consistent with the characters) where that clade is no longer present. There are
reasons to prefer this index rather than the bootstrap value. PAST does not compute
Bremer supports directly, but for smaller data sets it can be done ’manually’ as
follows:

• Perform parsimony analysis with exhaustive search or branch-and-bound.

Take note of the clades and the length N of the shortest tree(s) (for example
42). If there are more than one shortest tree, look at the strict consensus
tree. Clades which are no longer found in the consensus tree have a Bremer
support value of 0.

• In the box for ’Longest tree kept’, enter the number N +1 (43 in our example)
and perform a new search.

• Additional clades which are no longer found in the strict consensus tree have
a Bremer support value of 1.

• For ’Longest tree kept’, enter the number N + 2 (44) and perform a new
search. Clades which now disappear in the consensus tree have a Bremer
support value of 2.

• Continue until all clades have disappeared.

75
Stratigraphic congruence indices
For calculation of stratigraphic congruence indices, the first two columns in the
data matrix must contain the first and last appearance datums, respectively, for
each taxon. These datums must be given such that the youngest age (or highest
stratigraphic level) has the highest numerical value. You may need to use negative
values to achieve this (e.g. 400 million years before present is coded as -400.0).
The box "FADs/LADs in first columns" in the Parsimony dialogue must be ticked.
The Stratigraphic Congruence Index (SCI) of Huelsenbeck (1994) is defined as
the proportion of stratigraphically consistent nodes on the cladogram, and varies
from 0 to 1. A node is stratigraphically consistent when the oldest first occurrence
above the node is the same age or younger than the first occurrence of its sister
taxon (node).
The Relative Completeness Index (RCI) of Benton & Storrs (1994) is defined
as (1−M IG/SRL)x100 percent, where MIG (Minimum Implied Gap) is the sum
of the durations of ghost ranges and SRL is the sum of the durations of observed
ranges. The RCI can become negative, but will normally vary from 0 to 100.
The Gap Excess Ratio (GER) of Wills (1999) is defined as 1 − (M IG −
Gmin )/(Gmax − Gmin ) where Gmin is the minimum possible sum of ghost ranges
on any tree (that is, the sum of distances between successive FADs), and Gmax is
the maximum (that is, the sum of distances from first FAD to all other FADs).
These indices are further subjected to a permutation test, where all dates are
randomly redistributed across the different taxa 1000 times. The proportion of
permutations where the recalculated index exceeds the original index is given. If
small (e.g. p<0.05), this indicates a statistically significant departure from the null
hypothesis of no congruency between cladogram and stratigraphy (in other words,
you have significant congruency). The permutation probabilities of RCI and GER
are equal for any given set of permutations, because they are based on the same
value for MIG.

76
13 Biostratigraphy
Unitary associations
Typical application Assumptions Data needed
Quantitative biostratigraphi- None Presence/absence (1/0) ma-
cal correlation trix with horizons in rows,
taxa in columns

Unitary Associations analysis (Guex 1991) is a method for biostratigraphical

correlation (see Angiolini & Bucher 1999 for a typical application). The data input
consists of a presence/absence matrix with samples in rows and taxa in columns.
Samples belonging to the same section (locality) are tagged with the same color,
and ordered stratigraphically within each section such that the lowermost sample is
entered in the lowest row. Colours can be re-used in data sets with large numbers
of sections (see alveolinid.dat for an example).

Overview of the method

The method of Unitary Associations is logical, but rather complicated, consisting
of a number of steps. For details, see Guex 1991. The implementation in PAST in-
cludes most of the features found in the standard program, called BioGraph (Savary
& Guex 1999), and thanks to a fruitful co-operation with Jean Guex it also includes
a number of options and improvements not found in the present version of that pro-
gram.
The basic idea is to generate a number of assemblage zones (similar to ’Oppel
zones’) which are optimal in the sense that they give maximal stratigraphic reso-
lution with a minimum of superpositional contradictions. One example of such a
contradiction would be a section containing a species A above a species B, while
assemblage 1 (containing species A) is placed below assemblage 2 (containing
species B). PAST (and BioGraph) carries out the following steps:
1. Residual maximal horizons
The method makes the range-through assumption, meaning that taxa are con-
sidered to have been present at all levels between the first and last appearance in
any section. Then, any samples with a set of taxa that is contained in another sam-
ple are discarded. The remaining samples are called residual maximal horizons.
The idea behind this throwing away of data is that the absent taxa in the discarded
samples may simply not have been found even though they originally existed. Ab-
sences are therefore not as informative as presences.
2. Superposition and co-occurrence of taxa
Next, all pairs (A,B) of taxa are inspected for their superpositional relation-
ships: A below B, B below A, A together with B, or unknown. If A occurs below
B in one locality and B below A in another, they are considered to be co-occurring
although they have never actually been found together.

77
The superpositions and co-occurrences of taxa can be viewed in the biostrati-
graphic graph. In this graph, taxa are coded as numbers. Co-occurrences between
pairs of taxa are shown as solid blue lines. Superpositions are shown as dashed red
lines, with long dashes from the above-occurring taxon and short dashes from the
below-occurring taxon.
Some taxa may occur in so-called forbidden sub-graphs, which indicate incon-
sistencies in their superpositional relationships. Two of the several types of such
sub-graphs can be plotted in PAST: Cn cycles, which are superpositional cycles (A-
>B->C->A), and S3 circuits, which are inconsistencies of the type ’A co-occurring
with B, C above A, and C below B’. Interpretation of such forbidden sub-graphs is
described by Guex (1991).
3. Maximal cliques
Maximal cliques are groups of co-occurring taxa not contained in any larger
group of co-occurring taxa. The maximal cliques are candidates for the status of
unitary associations, but will be further processed below. In PAST, maximal cliques
receive a number and are also named after a maximal horizon in the original data
set which is identical to, or contained in (marked with asterisk), the maximal clique.
4. Superposition of maximal cliques
The superpositional relationships between maximal cliques are decided by in-
specting the superpositional relationships between their constituent taxa, as com-
puted in step 2. Contradictions (some taxa in clique A occur below some taxa in
clique B, and vice versa) are resolved by a ’majority vote’. The contradictions
between cliques can be viewed in PAST.
The superpositions and co-occurrences of cliques can be viewed in the maximal
clique graph. In this graph, cliques are coded as numbers. Co-occurrences between
pairs of cliques are shown as solid blue lines. Superpositions are shown as dashed
red lines, with long dashes from the above-occurring clique and short dashes from
the below-occurring clique. Also, cycles between maximal cliques (see below) can
be viewed as green lines.
5. Resolving cycles
It will sometimes be the case that maximal cliques are now ordered in cycles: A
is below B, which is below C, which is below A again. This is clearly contradictory.
The ’weakest link’ (superpositional relationship supported by fewest taxa) in such
cycles is destroyed.
6. Reduction to unique path
At this stage, we should ideally have a single path (chain) of superpositional
relationships between maximal cliques, from bottom to top. This is however often
not the case, for example if A and B are below C, which is below D, or if we have
isolated paths without any relationships (A below B and C below D). To produce a
single path, it is necessary to merge cliques according to special rules.
7. Post-processing of maximal cliques
Finally, a number of minor manipulations are carried out to ’polish’ the result:
Generation of the ’consecutive ones’ property, reinsertion of residual virtual co-
occurrences and superpositions, and compaction to remove any generated non-

78
maximal cliques. For details on these procedures, see Guex 1991. At last, we now
have the Unitary Associations, which can be viewed in PAST.
The unitary associations have associated with them an index of similarity from
one UA to the next, called D:

Di = |U Ai − U Ai−1 |/|U Ai | + |U Ai−1 − U Ai |/|U Ai−1 |

8. Correlation using the Unitary Associations

The original samples are now correlated using the unitary associations. A sam-
ple may contain taxa which uniquely places it in a unitary association, or it may
lack key taxa which could differentiate between two or more unitary associations,
in which case only a range can be given. These correlations can be viewed in PAST.
9. Reproducibility matrix
Some unitary associations may be identified in only one or a few sections, in
which case one may consider to merge unitary associations to improve the geo-
graphical reproducibility (see below). The reproducibility matrix should be in-
spected to identify such unitary associations. A UA which is uniquely identified in
a section is shown as a black square, while ranges of UAs (as given in the correla-
tion list) are shown in gray.
10. Reproducibility graph and suggested UA merges (biozonation)
The reproducibility graph (Gk’ in Guex 1991) shows superpositions of unitary
associations that are actually observed in the sections. PAST will internally reduce
this graph to a unique maximal path (Guex 1991, section 5.6.3), and in the process
of doing so it may merge some UAs. These mergers are shown as red lines in the
reproducibility graph. The sequence of single and merged UAs can be viewed as a
suggested biozonation.

Special functionality
The implementation of the Unitary Associations method in PAST includes a num-
ber of options and functions which have not yet been described in the literature.
For questions about these, please contact us.

Ranking and Scaling

Typical application Assumptions Data needed
Quantitative biostratigraphi- None Table of depths, with wells
cal correlation in rows and events in
columns

Ranking-Scaling (Agterberg & Gradstein 1999) is a method for quantitative

biostratigraphy based on events in a number of wells or sections. The data input
consists of wells in rows with one well per row, and events (e.g. FADs and/or
LADs) in columns. The values in the matrix are depths of each event in each

79
well, increasing upwards (you may want to use negative values to achieve this).
Absences are coded as zero. If only the order of events is known, this can be coded
as increasing whole numbers (ranks, with possible ties for co-occurring events)
within each well.
The implementation of ranking-scaling in PAST is not comprehensive, and
advanced users are referred to the RASC and CASC programs of Agterberg and
Gradstein.

Overview of the method

The method of Ranking-Scaling proceeds in two steps:
1. Ranking
The first step of Ranking-Scaling is to produce a single, comprehensive strati-
graphic ordering of events, even if the data contains contradictions (event A over
B in one well, but B over A in another), or longer cycles (A over B over C over A).
This is done by ’majority vote’, counting the number of times each event occurs
above, below or together with all others. Technically, this is achieved by "presort-
ing" followed by the Modified Hay Method (Agterberg & Gradstein 1999).
2. Scaling
The biostratigraphic analysis may end with ranking, but additional insight may
be gained by estimating stratigraphic distances between the consecutive events.
This is done by counting the number of observed superpositional relationships (A
above or below B) between each pair (A,B) of consecutive events. A low number
of contradictions implies long distance.
Some computed distances may turn out to be negative, indicating that the or-
dering given by the ranking step was not optimal. If this happens, the events are
re-ordered and the distances re-computed in order to ensure only positive inter-
event distances.

RASC in PAST
Parameters
Well threshold: The minimum number of wells in which an event must occur
in order to be included in the analysis
Pair threshold: The minimum number of times a relationship between events
A and B must be observed in order for the pair (A,B) to be included in the ranking
step
Scaling threshold: Pair threshold for the scaling step
Tolerance: Used in the ranking step (see Agterberg & Gradstein)
Ranking
The ordering of events after the ranking step is given, with the first event at the
bottom of the list. The "Range" column indicates uncertainty in the position.
Scaling

80
The ordering of the events after the scaling step is given, with the first event
at the bottom of the list. For an explanation of all the columns, see Agterberg &
Gradstein (1999).
Event distribution
A plot showing the number of events in each well, with the wells ordered ac-
cording to number of events.
Scattergrams
For each well, the depth of each event in the well is plotted against the optimum
sequence (after scaling). Ideally, the events should plot in an ascending sequence.
Dendrogram
Plot of the distances between events in the scaled sequence, including a den-
drogram which may aid in zonation.

Constrained Optimization (CONOP)

Typical application Assumptions Data needed
Quantitative biostratigraphi- None Table of depths/levels, with
cal correlation wells/sections in rows and
event pairs in columns:
FADs in odd columns and
LADs in even columns.
Missing events are coded
with zeros.

PAST includes a simple version of Constrained Optimization (Kemple et al.

1989). Both FAD and LAD of each taxon must be specified in alternate columns.
Using so-called Simulated Annealing, the program searches for a global (com-
posite) sequence of events that implies a minimal total amount of range extension
(penalty) in the individual wells/sections. The parameters for the optimization pro-
cedure include an initial annealing temperature, the number of cooling steps, the
cooling ratio (percentage lower than 100), and the number of trials per step. For
explanation and recommendations, see Kemple et al. 1989.
Output windows include the optimization history with the temperature and
penalty as function of cooling step, the global composite solution and the implied
ranges in each individual section.
The implementation of CONOP in PAST is based on a FORTRAN optimiza-
tion core provided by Kemple and Sadler.

Appearance Event Ordination

Typical application Assumptions Data needed
Quantitative biostratigraphi- None Presence/absence (1/0) ma-
cal correlation and seriation trix with horizons in rows,
taxa in columns

81
Appearance Event Ordination (Alroy 1994, 2000) is a method for biostrati-
graphical seriation and correlation. The data input is in the same format as for Uni-
tary Associations, consisting of a presence/absence matrix with samples in rows
and taxa in columns. Samples belonging to the same section (locality) must be as-
signed the same color, and ordered stratigraphically within each section such that
the lowermost sample enters in the lowest row. Colors can be re-used in data sets
with large numbers of sections (see alveolinid.dat for an example).
The implementation in PAST is based on code provided by John Alroy. It
includes Maximum Likelihood AEO (Alroy 2000).

Range confidence intervals

Typical application Assumptions Data needed
Estimation of confidence in- Random distribution of fos- The number of horizons
tervals for first or last ap- siliferous horizons through containing the taxon, and
pearances and total range, the stratigraphic column levels or dates of first and
for one taxon. or through time. Section last occurrences of the
should be continuously taxon.
sampled.

Assuming a random (Poisson) distribution of fossiliferous horizons, confidence

intervals for the stratigraphic range of one taxon can be calculated given the first oc-
currence datum (level), last occurrence datum, and total number of horizons where
the taxon is found (Strauss & Sadler 1989, Marshall 1990).
No data are needed in the spreadsheet. The program will ask for the number of
horizons where the taxon is found, and levels or dates for the first and last appear-
ances. If necessary, use negative values to ensure that the last appearance datum
has a higher numerical value than the first appearance datum. 80, 95 and 99 percent
confidence intervals are calculated for the FAD considered in isolation, the LAD
considered in isolation, and the total range. The value alpha is the length of the
confidence interval divided by the length of the observed range.
Be aware that the assumption of random distribution will not hold in many real
situations.

Distribution free range confidence intervals

Typical application Assumptions Data needed
Estimation of confidence in- No correlation between One column per taxon, with
tervals for first or last ap- stratigraphic position and levels or dates of all hori-
pearances. gap size. Section should be zons where the taxon is
continuously sampled. found.

This method (Marshall 1994) does not assume random distribution of fossil-
iferous horizons. It requires that the levels or dates of all horizons containing the

82
taxon are given.
The program outputs upper and lower bounds on the lengths of the confidence
intervals, using a 95 percent confidence probability, for confidence levels of 50, 80
and 95 percent. Values which can not be calculated are marked with an asterisk
(see Marshall 1994).

14 Scripting
PAST includes a simple scripting (programming) language that allows you to make
your own algorithms within the PAST framework. The language is a simplified
Pascal, but with full matrix support and a library of mathematical, statistical and
user-interface functions.
Important: As of the present version, this new scripting capability is rudimen-
tary and not entirely stable. It will be rapidly improved in coming versions!

Language structure
Variables
Variable names can be of any length. There is no declaration of variables. All
variables have global scope. The assignment operator is :=.

Comments
Upon encountering the hash character in the source, the rest of the line is skipped.

begin ... end

Blocks are marked with begin ... end. The entire script must be contained within a
begin .... end pair.

if ... else
Example:

if fad>0 begin
lad:=fad+1;
fad:=0;
end else lad:=0;

for ... to
Example:

for i:=1 to 10 a[i]:=0;

83
while
Example:

while a[i]=0 i:=i+1;

procedure
Procedures have no arguments, return values or local variables. All communication
with the procedure must go through global variables. A procedure call must contain
a dummy argument. The procedure definition must be inside the outermost begin
... end pair. Example:

begin
procedure writehello
begin
message("Hello");
end;

writehello(0);
end;

Types
Four types are available: Double-precision numbers, vectors, arrays and strings.
Types are implicitly defined at first usage. "Sensible" type conversions are per-
formed automatically.
Examples:

a:=14.3;
b:="Hi!";
c:=array(5,10); c[3,7]:=a;
d:=vector(10); d[7]:=17;

84
Operators
Operator Supported types
+ double+double, vector+vector, array+array, ar-
ray+double, vector+double
- double-double, vector-vector, array-array, array-
double, vector-double
double*double, array*array (array multiplication),
array*double, vector*double
/ double/double, array/double, vector/double
(hat) double(hat)double (power)
& string&string, string& double, double&string (con-
catenation)
= double=double, string=string
> double>double, string>string
< double
>= double>=double, string>=string
<= double<=double, string<=string
<> double<>double, string<>string
and double and double
or double or double

Mathematical functions
Most of these functions support double, vector and array types.
Function Comments
cos(x) Cosine
sin(x) Sine
tan(x) Tangent
exp(x) Exponential, e to the power of x
log(x) Natural logarithm
sqrt(x) Square root
int(x) Round down to integer
rnd(x) Random, uniformly distributed number in the range
[0..x
fprob(f, df1, df2) p value for F distribution
tprob(t, df) p value for Student’s t distribution
zprob(z) p value for standardized normal distribution (z test)
chi2prob(chi2, df) p value for chi2 distribution

85
Array and vector operations
Function Comments
nrows(x) Number of rows in vector or array x
ncols(x) Number of columns in array x
array(m,n) Allocate an array of size m,n
vector(m) Allocate a vector of length m
row(x,m) Row m in array x, as a vector
column(x,n) Column n in array x, as a vector
inv(x) Inverse of double, or square array
mean(x) Mean value of vector or array
eigen(x) For a square NxN matrix x, returns a Nx(N+1)
matrix with N sorted eigenvectors in the first N
columns, and the eigenvalues in the last column.
cov(x) Variance-covariance matrix of array x
sort2(x1, x2) Sort the vectors x1 and x2, on x1. Returns an array
with the sorted vectors in the two columns.

Distance matrices
Note that all these functions return symmetric distance matrices. For similarity
indices such as Jaccard, the complement 1 − x is returned. A value -1 in the input
matrix is treated as missing value.
Function Comments
eucliddist(x) Symmetric Euclidean distance matrix of array x
chorddist(x) Chord distance
cosdist(x) Complement of cosine similarity
dicedist(x) Complement of Dice similarity
jaccarddist(x) Complement of Jaccard similarity
morisitadist(x) Complement of Morisita similarity
horndist(x) Complement of Horn similarity
manhattandist(x) Manhattan distance
corrdist(x) Complement of correlation similarity
rhodist(x) Complement of non-parametric correlation
raupcrickdist(x) Complement of Raup-Crick similarity
hammingdist(x1, x2) Hamming distance
simpsondist(x1, x2) Complement of Simpson similarity

User interface functions and procedures

Functions:

86
spreadsheetarray(0|1) Returns the selected array of numbers in the spread-
sheet. Call with argument 1 for replacing missing
values with column averages.
spreadsheetcolumn(n) Returns column n from the spreadsheet, as a vector
spreadsheetsymbols(0) Returns a vector of integer values for symbols (col-
ors) in the selected array.
selectedrowlabel(m) Returns a string containing the row label of row m
in the selected array.
selectedrowlabel(n) Returns a string containing the column label of col-
umn n in the selected array.
Procedures:
message(x) Displays a message dialogue showing x (string or
double)
setspreadsheet(m, n, x) Set the contents of cell m,n in spreadsheet to x
(string or double)
opennumwindow(m, n) Opens a numerical output (table) window with m
rows and n columns.
setnumwindow(m, n, x) Set the contents of cell m,n in the numerical output
window to x (string or double)

Graphics
Only one graphics window is available at any one time. Before displaying graphics,
the window must be opened using the procedure openwindow. Coordinates can be
in any unit, as the window is automatically scaled to accomodate its contents.
Colors are defined as 24-bit integers (R, G, B has 8 bits each), or using the
pre-defined constants colred, colblack, colblue, colyellow, colgreen, colpurple.
Procedures:
openwindow Opens the graphics window
drawline(x1, y1, x2, y2) Draws a blue line.
drawstring(x1, y1, string, Draws a text string.
color)
drawvectorpoints(x, Draws a set of points contained in the vector x, with
color) indices on the horizontal axis.
drawxypoints(x, y, color) Draws a set of points with x,y coordinates contained
in the vectors x and y.
drawxysymbols(x, y, Draws a set of points with x,y coordinates contained
symbols) in the vectors x and y. The symbols vector contains
integer values for symbols (colors).
drawhistogram(x, nbins, Draws a histogram of the values contained in the
color) vector x, with the number of bins given.

87
Examples
1. Take the first two columns of the spreadsheet, and display a log-log scatter
diagram:

begin
x:=spreadsheetcolumn(1);
y:=spreadsheetcolumn(2);
openwindow;
drawxypoints(log(x), log(y), colblack);
end;

2. Carry out a Principal Components Analysis, with graphical and numerical

output:

begin
data:=spreadsheetarray(1);
eig:=eigen(cov(data));
eigvectors:=array(nrows(eig), ncols(eig)-1);
for i:=1 to nrows(eig) for j:=1 to ncols(eig)-1
eigvectors[i,j]:=eig[i,j];
scores:=data*eigvectors;

openwindow;
drawxysymbols(column(scores,1), column(scores,2), spreadsheetsymbols(0)
for i:=1 to nrows(scores)
drawstring(scores[i,1], scores[i,2], spreadsheetrowlabel(i), colblack

opennumwindow(nrows(scores)+1, ncols(scores)+1);
for i:=1 to nrows(scores) for j:=1 to ncols(scores)
setnumwindow(i+1, j+1, scores[i,j]);

for i:=1 to nrows(scores) setnumwindow(i+1, 1, spreadsheetrowlabel(i));

for i:=1 to ncols(scores) setnumwindow(1, i+1, "Axis "&i);
end;

88
15 Acknowledgments
PAST was inspired by and includes many functions found in PALSTAT, which was
programmed by P.D. Ryan with assistance from J.S. Whalley. Harper thanks the
Danish Natural Science Research Council (SNF) for support. Frits Agterberg and
Felix Gradstein allowed OH access to source code for RASC, and Peter Sadler pro-
vided source code for CONOP. Jean Guex provided a series of ideas for improve-
ment and extension of the Unitary Associations module, and tested it intensively.
John Alroy provided source code for AEO.
Many users of PAST have given us ideas for improvement and reported bugs.
Among these are Charles Galea Bonavia, Hans Arne Nakrem, Mikael Fortelius,
Knut Rognes, Julian Overnell, Kirsty Brown, Paolo Tomassetti, Jose Luis Navarrete-
Heredia, Wally Woolfenden, Erik Telie, Fernando Archuby, Ian J. Slipper, James
Gallagher, Marcio Pie, Hugo Bucher, Alexey Tesakov, Craig Macfarlane, José
Camilo Hurtado Guerrero, Wolfgang Kiessling and Bastien Wauthoz.

16 References
Adrain, J.M., S.R. Westrop & D.E. Chatterton 2000. Silurian trilobite alpha diver-
sity and the end-Ordovician mass extinction. Paleobiology 26:625-646.
Agterberg, F.P. & F.M. Gradstein. 1999. The RASC method for Ranking and
Scaling of Biostratigraphic Events. In: Proceedings Conference 75th Birthday
C.W. Drooger, Utrecht, November 1997. Earth Science Review 46(1-4):1-25.
Alroy, J. 1994. Appearance event ordination: a new biochronologic method. Pale-
obiology 20:191-207.
Alroy, J. 2000. New methods for quantifying macroevolutionary patterns and pro-
cesses. Paleobiology 26:707-733.
Anderson, M.J. 2001. A new method for non-parametric multivariate analysis of
variance. Austral Ecology 26:32-46.
Angiolini, L. & H. Bucher 1999. Taxonomy and quantitative biochronology of
Guadalupian brachiopods from the Khuff Formation, Southeastern Oman. Geobios
32:665-699.
Benton, M.J. & G.W. Storrs. 1994. Testing the quality of the fossil record: paleon-
tological knowledge is improving. Geology 22:111-114.
Bow, S.-T. 1984. Pattern recognition. Marcel Dekker, New York.
Brower, J.C. & K.M. Kyle 1988. Seriation of an original data matrix as applied to
palaeoecology. Lethaia 21:79-93.
Brown, D. & P. Rothery 1993. Models in biology: mathematics, statistics and
computing. John Wiley & Sons, New York.
Bruton, D.L. & A.W. Owen 1988. The Norwegian Upper Ordovician illaenid trilo-
bites. Norsk Geologisk Tidsskrift 68:241-258.
Clarke, K.R. 1993. Non-parametric multivariate analysis of changes in community
structure. Australian Journal of Ecology 18:117-143.

89
Clarke, K.R. & Warwick, R.M. 1998. A taxonomic distinctness index and its sta-
tistical properties. Journal of Applied Ecology 35:523-531.
Colwell, R.K. & J.A. Coddington. 1994. Estimating terrestrial biodiversity through
extrapolation. Philosophical Transactions of the Royal Society (Series B) 345:101-
118.
Colwell, R.K., C.X. Mao & J. Chang. 2004. Interpolating, extrapolating, and
comparing incidence-based species accumulation curves. Ecology 85:2717-2727.
Davis, J.C. 1986. Statistics and Data Analysis in Geology. John Wiley & Sons,
New York.
De Boor, C. 2001. A Practical Guide to Splines. Springer.
Donnelly, S.M. & A. Kramer. 1999. Testing for multiple species in fossil sam-
ples: an evaluation and comparison of tests for equal relative variation. American
Journal of Physical Anthropology 108:507-529.
Dryden, I.L. & K.V. Mardia 1998. Statistical Shape Analysis. Wiley.
Farris, J.S. 1989. The retention index and the rescaled consistency index. Cladis-
tics 5:417-419.
Ferson, S.F., F.J. Rohlf & R.K. Koehn 1985. Measuring shape variation of two-
dimensional outlines. Systematic Zoology 34:59-68.
Fisher, N.I. 1983. Comment on "A Method for Estimating the Standard Deviation
of Wind Directions". Journal of Applied Meteorology 22:1971.
Guex, J. 1991. Biochronological Correlations. Springer Verlag, Berlin.
Harper, D.A.T. (ed.). 1999. Numerical Palaeobiology. John Wiley & Sons, Chich-
ester.
Hennebert, M. & A. Lees. 1991. Environmental gradients in carbonate sediments
and rocks detected by correspondence analysis: examples from the Recent of Nor-
way and the Dinantian of southwest England. Sedimentology 38:623-642.
Hill, M.O. & H.G. Gauch Jr. 1980. Detrended Correspondence analysis: an im-
proved ordination technique. Vegetatio 42:47-58.
Horn, H.S. 1966. Measurement of overlap in comparative ecological studies. Amer-
ican Naturalist 100:419-424.
Huelsenbeck, J.P. Comparing the stratigraphic record to estimates of phylogeny.
Paleobiology 20:470-483.
Jackson, D. A. 1993. Stopping rules in principal components analysis: a compari-
son of heuristical and statistical approaches. Ecology 74:2204-2214.
Jammalamadaka, S.R. & A. Sengupta. 2001. Topics in Circular Statistics. World
Scientific.
Jolicoeur, P. 1963. The multivariate generalization of the allometry equation. Bio-
metrics 19:497-499.
Jolliffe, I.T. 1986. Principal Component Analysis. Springer-Verlag, Berlin.
Kemple, W.G., P.M. Sadler & D.J. Strauss. 1989. A prototype constrained op-
timization solution to the time correlation problem. In Agterberg, F.P. & G.F.
Bonham-Carter (eds), Statistical Applications in the Earth Sciences. Geological
Survey of Canada Paper 89-9:417-425.

90
Kitchin, I.J., P.L. Forey, C.J. Humphries & D.M. Williams 1998. Cladistics. Ox-
ford University Press, Oxford.
Kowalewski, M., E. Dyreson, J.D. Marcot, J.A. Vargas, K.W. Flessa & D.P. Hall-
mann. 1997. Phenetic discrimination of biometric simpletons: paleobiological
implications of morphospecies in the lingulide brachiopod Glottidia. Paleobiology
23:444-469.
Krebs, C.J. 1989. Ecological Methodology. Harper & Row, New York.
Laskar, J., P. Robutel, F. Joutel, M. Gastineau, A.C.M. Correia & B. Levrard. 2004.
A long-term numerical solution for the insolation quantities of the Earth. Astron-
omy & Astrophysics 428:261-285.
Legendre, P. & L. Legendre. 1998. Numerical Ecology, 2nd English ed. Elsevier,
853 pp.
MacLeod, N. 1999. Generalizing and extending the eigenshape method of shape
space visualization and analysis. Paleobiology 25:107-138.
Marshall, C.R. 1990. Confidence intervals on stratigraphic ranges. Paleobiology
16:1-10.
Marshall, C.R. 1994. Confidence intervals on stratigraphic ranges: partial re-
laxation of the assumption of randomly distributed fossil horizons. Paleobiology
20:459-469.
Miller, R.L. & Kahn, J.S. 1962. Statistical Analysis in the Geological Sciences.
John Wiley & Sons, New York.
Oxanen, J. & P.R. Minchin. 1997. Instability of ordination results under changes in
input data order: explanations and remedies. Journal of Vegetation Science 8:447-
454.
Peres-Neto, P.R., D.A. Jackson & K.M. Somers. 2003. Giving meaningful inter-
pretation to ordination axes: assessing loading significance in principal component
analysis. Ecology 84:2347-2363.
Podani, J. & I. Miklos. 2002. Resemblance coefficients and the horseshoe effect in
principal coordinates analysis. Ecology 83:3331-3343.
Poole, R.W. 1974. An introduction to quantitative ecology. McGraw-Hill, New
York.
Press, W.H., S.A. Teukolsky, W.T. Vetterling & B.P. Flannery 1992. Numerical
Recipes in C. Cambridge University Press, Cambridge.
Prokoph, A., A.D. Fowler & R.T. Patterson. 2000. Evidence for periodicity and
nonlinearity in a high-resolution fossil record of long-term evolution. Geology
28:867-870.
Raup, D. & R.E. Crick. 1979. Measurement of faunal similarity in paleontology.
Journal of Paleontology 53:1213-1227.
Ripley, B.D. 1979. Tests of ’randomness’ for spatial point patterns. Journal of the
Royal Statistical Society, ser. B 41:368-374.
Rohlf, F.J. & M. Corti. 2000. Use of two-block partial least squares to study
covariation in shape. Systematic Biology 49:740-753.
Russell, G.S. & D.J. Levitin. 1996. An expanded table of probability values for
Rao’s Spacing Test. Communications in Statistics: Simulation and Computation

91
24:879-888.
Ryan, P.D., Harper, D.A.T. & Whalley, J.S. 1995. PALSTAT, Statistics for palaeon-
tologists. Chapman & Hall (now Kluwer Academic Publishers).
Saitou, N. & M. Nei. 1987. The neighbor-joining method: a new method for
reconstructing phylogenetic trees. Molecular Biology and Evolution 4:406-425.
Savary, J. & J. Guex. 1999. Discrete Biochronological Scales and Unitary Asso-
ciations: Description of the BioGraph Computer Program. Memoires de Geologie
(Lausanne) 34.
Sepkoski, J.J. 1984. A kinetic model of Phanerozoic taxonomic diversity. Paleobi-
ology 10:246-267.
Strauss, D. & P.M. Sadler. 1989. Classical confidence intervals and Bayesian prob-
ability estimates for ends of local taxon ranges. Mathematical Geology 21:411-
427.
Taguchi, Y-H. & Oono, Y. In press. Novel non-metric MDS algorithm with confi-
dence level test.
Tothmeresz, B. 1995. Comparison of different methods for diversity ordering.
Journal of Vegetation Science 6:283-290.
Wills, M.A. 1999. The gap excess ratio, randomization tests, and the goodness of
fit of trees to stratigraphy. Systematic Biology 48:559-580.
Zar, J.H. 1996. Biostatistical Analysis. 3rd ed. Prentice Hall, New York.

View publication stats

Hammer Harper Ryan 2001
No ratings yet
Hammer Harper Ryan 2001
10 pages
Gabbettetal 2021 Progressingtrainingloadsinhealthyinjuredathletes BJSM
No ratings yet
Gabbettetal 2021 Progressingtrainingloadsinhealthyinjuredathletes BJSM
4 pages
Organizational Diagnosis: An Evidence-Based Approach: Journal of Change Management January 2012
No ratings yet
Organizational Diagnosis: An Evidence-Based Approach: Journal of Change Management January 2012
26 pages
BarnettBengtsenDavidsPeters PhilosophyofHigherED
No ratings yet
BarnettBengtsenDavidsPeters PhilosophyofHigherED
9 pages
Art Introd 108 Shalock CV
No ratings yet
Art Introd 108 Shalock CV
15 pages
Fitting Statistical Models With PROC MCMC: Conference Paper
No ratings yet
Fitting Statistical Models With PROC MCMC: Conference Paper
27 pages
133 Full
No ratings yet
133 Full
12 pages
Acute Workload Are Associated With Increased Injury Hulin 2014
No ratings yet
Acute Workload Are Associated With Increased Injury Hulin 2014
7 pages
Application of Basic Science To Clinical Problems: Traditional vs. Hybrid Problem-Based Learning
No ratings yet
Application of Basic Science To Clinical Problems: Traditional vs. Hybrid Problem-Based Learning
13 pages
Gabbettetal 2021 Progressingtrainingloadsinhealthyinjuredathletes BJSM
No ratings yet
Gabbettetal 2021 Progressingtrainingloadsinhealthyinjuredathletes BJSM
4 pages
Assessment of Personality Disorder
0% (1)
Assessment of Personality Disorder
19 pages
Human Competence Revisited: 40 Years of Impact: Journal of Organizational Behavior Management April 2019
No ratings yet
Human Competence Revisited: 40 Years of Impact: Journal of Organizational Behavior Management April 2019
8 pages
The Ability of Work Life Balance Policies To Influence Key Social Organizational Issues
No ratings yet
The Ability of Work Life Balance Policies To Influence Key Social Organizational Issues
16 pages
Human Geophagia Calabash Chalk and Undongo Mineral
No ratings yet
Human Geophagia Calabash Chalk and Undongo Mineral
12 pages
Expert Credibility in Climate Change
No ratings yet
Expert Credibility in Climate Change
4 pages
Synthesis of 1,6-Hexanediol From Cellulose Derived Tetrahydrofuran Dimethanol (THFDM) With Pt-Wox/Tio2 Catalysts
No ratings yet
Synthesis of 1,6-Hexanediol From Cellulose Derived Tetrahydrofuran Dimethanol (THFDM) With Pt-Wox/Tio2 Catalysts
13 pages
Systemic Family Therapy: Applying Psychological Theory To Clinical Practice
No ratings yet
Systemic Family Therapy: Applying Psychological Theory To Clinical Practice
20 pages
Oreinted Antibody Immobilization Strategies
No ratings yet
Oreinted Antibody Immobilization Strategies
15 pages
Outcomes Research What Is It and Why Does It Matte
No ratings yet
Outcomes Research What Is It and Why Does It Matte
10 pages
Estuarine Shoreline Change
No ratings yet
Estuarine Shoreline Change
16 pages
Tavares 2017 Gymaware Female
No ratings yet
Tavares 2017 Gymaware Female
9 pages
When Does Mens Hostile Sexism Predict Relationship Aggression? The Moderating Role of Partner Commitment
No ratings yet
When Does Mens Hostile Sexism Predict Relationship Aggression? The Moderating Role of Partner Commitment
11 pages
OShaughnessyetal.2016Site-Specific VRI 38-152140448
No ratings yet
OShaughnessyetal.2016Site-Specific VRI 38-152140448
12 pages
2010-Wang-Burgess-et-al-The Safety of Common Steel BeamColumn Connections
No ratings yet
2010-Wang-Burgess-et-al-The Safety of Common Steel BeamColumn Connections
11 pages
Adaptive Artificial Limb
No ratings yet
Adaptive Artificial Limb
11 pages
Gabbettetal 2021 Progressingtrainingloadsinhealthyinjuredathletes BJSM
No ratings yet
Gabbettetal 2021 Progressingtrainingloadsinhealthyinjuredathletes BJSM
4 pages
1 s2.0 S1441358220300306 Main
No ratings yet
1 s2.0 S1441358220300306 Main
12 pages
Pyke (2002) Rangeland Health Attributes and Indicators For Qualitative Assessment
No ratings yet
Pyke (2002) Rangeland Health Attributes and Indicators For Qualitative Assessment
15 pages
Open Relationship Prevalence, Characteristics, and Correlates in A Nationally Representative Sample of Canadian Adults
No ratings yet
Open Relationship Prevalence, Characteristics, and Correlates in A Nationally Representative Sample of Canadian Adults
12 pages
Jamieson Etal
No ratings yet
Jamieson Etal
10 pages
Jurnal 1
No ratings yet
Jurnal 1
11 pages
Political Control via Admin Procedures
No ratings yet
Political Control via Admin Procedures
37 pages
Physiotherapy-Led Functional Exercise Programme After THR 2016
No ratings yet
Physiotherapy-Led Functional Exercise Programme After THR 2016
7 pages
Decision Nodes
No ratings yet
Decision Nodes
25 pages
Beavers 2013
No ratings yet
Beavers 2013
22 pages
Crossing International Boundaries Through Doctoral Partnerships: Learnings From A Chinese-Australian Forum
No ratings yet
Crossing International Boundaries Through Doctoral Partnerships: Learnings From A Chinese-Australian Forum
21 pages
Nutritional Status of Honey Bee (Apis Mellifera L.) Workers Across An Agricultural Land-Use Gradient
No ratings yet
Nutritional Status of Honey Bee (Apis Mellifera L.) Workers Across An Agricultural Land-Use Gradient
11 pages
Effectsof Early Neglectonlanguageandcognition
No ratings yet
Effectsof Early Neglectonlanguageandcognition
9 pages
Fratura Tipo II Odontoide Com Pseudoatrose
No ratings yet
Fratura Tipo II Odontoide Com Pseudoatrose
7 pages
Raman Spectroscopy for Breast Cancer
No ratings yet
Raman Spectroscopy for Breast Cancer
7 pages
Luke Vail Ayres 2014
No ratings yet
Luke Vail Ayres 2014
16 pages
Disaggregated Memory For Expansion and Sharing in
No ratings yet
Disaggregated Memory For Expansion and Sharing in
13 pages
High-Performance Fume Hood Field Test Results and Research Agenda
No ratings yet
High-Performance Fume Hood Field Test Results and Research Agenda
63 pages
Managing Change Across Boundaries: Boundary Shaking Practices1
No ratings yet
Managing Change Across Boundaries: Boundary Shaking Practices1
20 pages
Paper 1
No ratings yet
Paper 1
28 pages
250 Koenigswald, Widga, Göhlich 2023 Oregon Mammutidae
No ratings yet
250 Koenigswald, Widga, Göhlich 2023 Oregon Mammutidae
73 pages
Groundwater's Role in Climate Change
No ratings yet
Groundwater's Role in Climate Change
10 pages
Prince Et Al FB 2005
No ratings yet
Prince Et Al FB 2005
12 pages
2015 - RO-DBT RefraMED (Art) Lynch
No ratings yet
2015 - RO-DBT RefraMED (Art) Lynch
12 pages
Stress Dependent Regulation of FOXO Transcription Factors by The SIRT1 Deacetylase
No ratings yet
Stress Dependent Regulation of FOXO Transcription Factors by The SIRT1 Deacetylase
7 pages
Faunal and Environmental Change in The Late Miocen
No ratings yet
Faunal and Environmental Change in The Late Miocen
73 pages
MarinePollBulletin Maleetal2024
No ratings yet
MarinePollBulletin Maleetal2024
9 pages
Steinhaus (1970) A Bridged Glossary of Terms in Invertebrate Pathology
No ratings yet
Steinhaus (1970) A Bridged Glossary of Terms in Invertebrate Pathology
71 pages
Measuring Intelligence With The Sandia Matrices Ps
No ratings yet
Measuring Intelligence With The Sandia Matrices Ps
12 pages
Pharmacodynamics of Echinocandins Against Candida
No ratings yet
Pharmacodynamics of Echinocandins Against Candida
9 pages
Elite Youth Football Talent Selection
No ratings yet
Elite Youth Football Talent Selection
10 pages
Amp 562128
No ratings yet
Amp 562128
39 pages
Development of The Conformity To Masculine Norms I
No ratings yet
Development of The Conformity To Masculine Norms I
24 pages
Polanin Et Al 2024 Effects of The 5e Instructional Model A Systematic Review and Meta Analysis
No ratings yet
Polanin Et Al 2024 Effects of The 5e Instructional Model A Systematic Review and Meta Analysis
17 pages
An Interesting Topic Name Here
No ratings yet
An Interesting Topic Name Here
4 pages
9 One Way Repeated Anova Jasp
No ratings yet
9 One Way Repeated Anova Jasp
20 pages
Optimal Data Rate Selection For Vehicle Safety Communications
No ratings yet
Optimal Data Rate Selection For Vehicle Safety Communications
9 pages
Wireless Networks For 4G/5G: Instructors: Sumit Roy and Tom Henderson TA: Collin Brady
No ratings yet
Wireless Networks For 4G/5G: Instructors: Sumit Roy and Tom Henderson TA: Collin Brady
15 pages
Prior To 1 Lecture: Students Should Review The Online ns-3 Tutorial and The ns-3
No ratings yet
Prior To 1 Lecture: Students Should Review The Online ns-3 Tutorial and The ns-3
3 pages
Iot Project Powerpoint
100% (1)
Iot Project Powerpoint
17 pages
Session 1 Session 2 Session 3 Session 4
No ratings yet
Session 1 Session 2 Session 3 Session 4
3 pages
Probability Exam Prep Questions
No ratings yet
Probability Exam Prep Questions
17 pages
XII STD - Statistics English Medium
No ratings yet
XII STD - Statistics English Medium
280 pages
Chapter 2 - Basic Tools For Forecasting
100% (1)
Chapter 2 - Basic Tools For Forecasting
66 pages
Research Methods: Foundation Course On
No ratings yet
Research Methods: Foundation Course On
31 pages
Jicable19-19 Treeing Asset Management Final
No ratings yet
Jicable19-19 Treeing Asset Management Final
6 pages
Carver Solutions
No ratings yet
Carver Solutions
109 pages
Chapter 4
No ratings yet
Chapter 4
27 pages
Virtual University of Pakistan: Statistics and Probability
No ratings yet
Virtual University of Pakistan: Statistics and Probability
5 pages
ESTIMATION (One Population) : CHAPTER - 8
100% (1)
ESTIMATION (One Population) : CHAPTER - 8
14 pages
Introduction To Data Analysis
No ratings yet
Introduction To Data Analysis
21 pages
Probability Distributions Quiz
No ratings yet
Probability Distributions Quiz
15 pages
Interpreting Data Collection and Presentation Problems
No ratings yet
Interpreting Data Collection and Presentation Problems
303 pages
General Education 2024 Vol 7 Questionnaire
No ratings yet
General Education 2024 Vol 7 Questionnaire
45 pages
Measure of Central Tendency of Ungrouped Data Exemplar
No ratings yet
Measure of Central Tendency of Ungrouped Data Exemplar
7 pages
P (74 - X - 78) - Learn Math and Stats With Dr. G
No ratings yet
P (74 - X - 78) - Learn Math and Stats With Dr. G
3 pages
Math 142 Co2 2.11
No ratings yet
Math 142 Co2 2.11
2 pages
Greg 4171: Statistical Applications in The Garment Industry
No ratings yet
Greg 4171: Statistical Applications in The Garment Industry
4 pages
QT Assignment
No ratings yet
QT Assignment
14 pages
Manual Del Raptor Ver - 7
No ratings yet
Manual Del Raptor Ver - 7
405 pages
Ala'a+Muhammed+Hussien+1,+Asst +prof +Manal+Omar+Mousa1
No ratings yet
Ala'a+Muhammed+Hussien+1,+Asst +prof +Manal+Omar+Mousa1
20 pages
Maths Workbook Year 6
100% (1)
Maths Workbook Year 6
256 pages
Power Transformer Lifespan Guide
100% (1)
Power Transformer Lifespan Guide
9 pages
05 S1 Silver 1
No ratings yet
05 S1 Silver 1
17 pages
Impact of Interest Rate Spread On Profitability of Commercial Banks in Nepal
No ratings yet
Impact of Interest Rate Spread On Profitability of Commercial Banks in Nepal
44 pages
Math Olympiad: Max/Min Solutions
No ratings yet
Math Olympiad: Max/Min Solutions
5 pages
Mock Spelling Bee Test 1
No ratings yet
Mock Spelling Bee Test 1
6 pages
Worksheet October 24 Solutions
No ratings yet
Worksheet October 24 Solutions
12 pages
English Beamer Powerlaw
No ratings yet
English Beamer Powerlaw
23 pages
Department of Statistics Module Code: STA1142: School of Mathematical and Natural Sciences
No ratings yet
Department of Statistics Module Code: STA1142: School of Mathematical and Natural Sciences
19 pages