Laplace (1813-1825) Théorie Analytique des Probabilités

Review

of
Théorie Analytique des Probabilités ∗
No author indicated.

Connaissance des Temps pour l’année 1815 (1812), pp. 215–221

THÉORIE ANALYTIQUE

DES PROBABILITÉS,
DEDICATED
A S. M. L’EMPEREUR ET ROI
By M. le comte LAPLACE,

Chancelier du Sénat-Conservateur, Grand-Officier de la Légion-d’Honneur,


Member de l’Institut impérial, et du Bureau des Longitudes, etc.

We will be no better able to render account of this Work, than by transcribing


the presentation that the author made of it at the beginning.
“I myself propose, he says, to give here the analysis and the principles nec-
essary in order to resolve the problems concerning probabilities. This analysis
is composed of two theories that I have given, thirty years ago, in the Mémoires
de l’Académie des Sciences. The one of them is the Théorie des Fonctions
génératrices: the other is the Théorie de l’Approximation des formules fonc-
tions de très-grands nombres. They are the object of the first book, in which I
present them in a manner yet more general than in the Memoirs cited. Their
reconciliation shows with evidence, that the second is only an extension of the
first, and that they are able to be considered as two branches of one same cal-
culus, that I designate by the name Calculus of generating Functions. This
calculus is the foundation of my Théorie des Probabilités, which makes the ob-
ject of the second book. The questions relative to the events due to chance,
are brought back most often with facility, to some linear equations in simple or
∗ Translated by Richard J. Pulskamp, Department of Mathematics & Computer Science,

Xavier University, Cincinnati, OH. December 10, 2011

partial differences: the first branch of the calculus of generating functions gives
the most general method in order to integrate this kind of equations. But when
the events that one considers are in great number, the expressions to which one
is led, are composed of one so great multitude of terms and of factors, that their
numerical calculation becomes impracticable; it is therefore then indispensable
to have a method which transforms them into convergent series. This is that
which the second branch of the calculus of the generating functions makes with
so much more advantage, as the method becomes more necessary.
“My object being to present the methods and the general results of the
theory of probabilities, I treat specially the most delicate, most difficult, and
at the same time most useful questions of this theory. I am attached especially
to determine the probabilities of the causes and of the results indicated by the
events considered in large number, and to seek the laws according to which this
probability approaches some limits in measure as the events are multiplied. This
research merits the attention of the geometers, by the analysis that it requires:
it is there principally that the theory of the approximation of formulas functions
of great numbers, finds its most important applications. This research interests
observers, by indicating to them the means that they must choose among the
results of their observations, and the probability of the errors that they have
yet to fear. Finally, it merits the attention of the philosophers, by showing
how regularity ends by being established in the same things which would
appear to us entirely delivered up to chance, and by revealing the hidden, but
constant, causes of them, on which this regularity depends. It is on the
regularity of the mean results of the events considered in great number that
diverse establishments are raised, such as annuities, tontines, assurances, etc. The
questions which are related to them, such as to the inoculation of vaccine and
to the decisions of assemblies, offer no difficulty according to my theory. I limit
myself here to resolve the most general; but the importance of these objects in
civil life, the moral considerations of which they complicate themselves, and the
numerous observations that they suppose, require a work apart.
“If one considers the analytic methods to which the series of probabilities
has already given birth, and those that it is able to yet give birth; the justice of
the principles which serve it at base; the rigorous and delicate logic which their
use requires in the solution of the problems; the establishment of public utility
which is supported on it: if one observes next that in the same things which are
able to be submitted to the calculus, this theory gives the most certain outlines
which are able to guide us in our judgments, and that it teaches to preserve
oneself from the illusions which often mislead us; one will see that there is no
science at all more worthy of our meditations, and of which the results are more
useful. It owes birth to two French geometers of the seventeenth century, so
fecund in great men and in great discoveries, and perhaps of all the centuries
the one which gives the most honor to the human spirit. Pascal and Fermat
themselves proposed and resolved some problems on the probabilities: Huygens
reunited these solutions, and extended them in a small treatise on the same
matter which next had been considered in a more general manner by Bernoulli,
Montmort, Moivre, and by many celebrated geometers of these last times.”

The work that we announce contains all that which has been done of im-
portance on this branch of human knowledge, that the author appears to us
to have perfected, either by the generality of his analysis, or by the novelty
and the difficulty of the problems that he has resolved. Among these numerous
problems, those which concern the means that it is appropriate to choose among
the results of observations, in the same way the probability of phenomena, of
their causes, and of future events, deduced from observed events, seems to us
should fix particularly the attention of the geometers. After having exposed
how the observations had often rebuked the analysts, by making them sense
the necessity to rectify their observations, and as he is arrived himself by the
considerations of the probabilities, to the great periodic and secular inequalities
of the celestial movements; the author adds:
“One sees thence how it is necessary to be attentive to the indications of
nature, when they are the result of a great number of observations, although
besides, they appear inexplicable by the known means. I engage thus the as-
tronomers, to follow with a particular attention, the lunar inequality in long
period, which depends on the longitude of the perigee of the moon, added to
the double of the longitude of its nodes; and that already the observations indi-
cate with much likelihood. If the sequence of observations continues to verify it,
it will force the geometers to return again onto the lunar theory, by making enter
the consideration of the difference which is able to exist between the northern
and southern hemispheres of the earth, a difference on which this inequality ap-
pears to me principally to depend. Thus one is able to say that nature itself has
concurred in the perfection of the theories founded on the principle of universal
gravity, and it is, in my sense, one of the strongest proofs of the truth of this
admirable principle.
“One is able still, by the analysis of probabilities, to verify the existence
or the influence of certain causes of which one has believed to note the action
on other organic beings. Of all the instruments that we are able to employ in
order to understand the imperceptible agents of nature, the most sensible are the
nerves, especially when their sensibility is magnified by particular circumstances.
It is by their means, that one has discovered the feeble electricity that the
contact of two heterogeneous metals develops; that which has opened a vast field
to the researches of physicists and chemists. The singular phenomena which
result from the extreme sensibility of the nerves in some individuals, have given
birth to diverse opinions on the existence of a new agent that one has named animal
magnetism, on the action of ordinary magnetism and the influence of the sun
and of the moon, in some nervous affections; finally, on the impressions that
the proximity of metals or of running water is able to give birth. It is natural
to think that the action of these causes is very weak, and is able to easily be
troubled by a great number of accidental circumstances; thus, from this that, in
any case, it is not at all manifested, one must not conclude that it never exists.
We are so distant from knowing all the agents of nature, that it would be little
philosophical to deny the existence of phenomena, uniquely because they are
inexplicable in the actual state of our knowledge. Only we must examine them
with an attention so much more scrupulous, as it appears more difficult to admit

them; and it is here that the analysis of the probabilities becomes indispensable
in order to determine to what point it is necessary to multiply the observations
or the experiences, in order to have, in favor of the agents that they seem to
indicate, a probability superior to all the reasons that one has besides to reject
the existence of them.
“The same analysis is able to be extended to the diverse results of medicine
and political economy, and even to the influence of moral causes; because the
action of these causes, when it is repeated a great number of times, offers in its
results as much regularity as that of physical causes.”
One of the most remarkable phenomena of the system of the World, is the
one of the nearly circular movements in the same sense and very nearly in the
same plane, of the planets and of their satellites, while the comets move in some
very eccentric orbits, and indifferently in all the senses and all the inclinations
to the ecliptic. Mr. Count Laplace submits to the analysis of probabilities, the
existence of this singular phenomenon, by supposing it the effect of chance; and
he finds for its probability, a fraction excessively small, whence he concludes this
phenomenon indicates a particular cause, with a probability superior to those
of the greatest number of historical facts, on which one is permitted no doubt.
He has shown in his Exposition du Système du Monde, that this cause has been
able to be only the solar atmosphere originally extended beyond the orbits of
the planets, and that the cooling and the attraction of the sun has successively
condensed. Seen at the distance of the stars, this star would appear to us now
to shine as they; but in an original state where the author supposes it, it would
resemble at this distance, the nebulas that the telescopes show us composed of
a core more or less shining, surrounded by a nebulosity which, being condensed
by a sequence of times to the surface of the core, will end by transforming it
into a star. By conceiving by analogy, all the stars formed in this manner;
one is able to imagine their former state of nebulosity, preceded itself by some
successive states in which the nebulous matter would be more or less diffuse,
the core being less luminous: one arrives thus, by ascending as far as it is
possible, to a nebulosity so diffuse that one is able only with pain to suspect the
existence of it. Such is, in fact, the first state of the nebulas that Mr. Herschell
has observed with a particular care, by means of his powerful telescopes, and in
which he has followed the progress of the condensation, not on one alone, this
progress being able to become sensible for us only after some centuries, but on
their assemblage; very nearly as one is able to follow in a vast forest, the increase
of the trees, on the individuals of diverse ages, that it contains. He has observed
first the nebulous matter spread in diverse clusters, in the different parts of the
sky of which it occupies a great extent. He has seen in some of these clusters,
this matter weakly condensed about one or many faint cores. In other nebulas,
these cores shine more relative to the nebulosity which surround them. The
atmospheres of each core, being separated by an ulterior condensation, there
results from it some multiple nebulas formed by a shining core, surrounded
by an atmosphere. Sometimes the nebulous matter, by being condensed in a
uniform manner, has produced the nebulas which one names planetary. Finally
a greater degree of condensation transforms all these nebulas into stars. It is

necessary to follow in the same Memoir that Mr. Herschell just published, the
progressions of condensation of the nebulas which, classified according to this
very philosophical view, indicate with an extreme likelihood, the transformation
of the nebulas into stars, and the former state of nebulosity of the existing stars.
We will confirm the proofs drawn from these analogies, by the following remark:
For a long time the particular disposition of some visible stars to the simple
view, has struck some philosophical observers. Mr. Michell has already noted
how little it is probable that the six stars of the Pleiades, for example, had
been tightened in the narrow space which contains them, by the sole chances of
hazard; and he has concluded from it that this group of stars and the similar
groups that the sky presents to us, are the effects of an original cause, or of a
general law of nature. Now, these effects are a necessary continuation of the
condensation of these nebulas to many cores, that Mr. Herschell has described;
because it is visible that nebulous matter being attracted without ceasing by
these diverse cores, they must form at length, a group of stars, parallel to the
one of the Pleiades. The condensation of the nebulas to two cores, will form
similarly from the stars turning very nearly about one another, parallels to
that of which Mr. Herschell has already considered the respective movements.
Such are further, the 61st of Cygnus and its following, in which Mr. Bessel just
recognized the proper movements, so considerable and so little different, that the
proximity of these stars among them, and their movement about their common center
of gravity, must leave no doubt. Thus Mr. Count Laplace and Mr. Herschell
are arrived by some opposite routes, to the consideration of the sun surrounded
formerly by a vast atmosphere; the first, by ascending to this state of the sun,
by the consideration of the singular phenomena of the solar system; the second,
by descending through the progress of the condensation of the nebulous matter.
This encounter, by making the proofs agree that they have both produced, from
their ideas, gives to them together, a probability quite near to certitude.
By rendering to the good researches of Mr. Herschell, the justice which is
due to them; we will modify in some regards, his opinion on the cause of the
movements of rotation of the sun and of the stars. A cluster of molecules, all
originally immobile, are not able by being condensed, to produce as he seems to
believe, a star endowed with a movement of rotation. Mr. Count Laplace has
demonstrated in his Mécanique Céleste, that if all these molecules, by being
reunited, come to form a body endowed with a movement of rotation; the axis
of rotation will be necessarily the straight line perpendicular to the invariable
plane of the maximum of the areas, and passing through the center of gravity
of the entire mass; and the movement of rotation will be such, that the sum
of the areas described by each molecule projected onto this plane, will always
remain the same as in the original; whence it follows that this movement will
be null, if all the molecules have been originally in repose. One is able to see in
the Work cited, that this constancy of the areas maintains the uniformity of the
movement of rotation of the earth and of the duration of the day which, from
Hipparchus to us, has not varied by a hundredth of a second, despite the winds,
the currents of the Ocean, and all the interior convulsions of the globe. But in
a nebula with many cores, nothing is opposed to this that the stars which result

from it, have movements of rotation, provided that they turn in some different
sense; because it is not true, as many celebrated philosophers have advanced,
that the universal attraction is not able to produce in a system of originally
immobile bodies, any permanent movement, and that it must at length, reunite
them all to their common center of gravity.
This work of Mr. Herschell gives to him new rights to the recognition of the
astronomers, as so many important discoveries have merited to him a long time.
One of the principal is the discovery of the planet Uranus and of six satellites
that the power of his telescopes has made him perceive about it. Two alone of
among them have been able to be recognized by some other observers. It is well
to desire that this celebrated astronomer publish the observations that, without
doubt, he has made in order to note the existence of these stars and in order to
determine their movements.

BOOK I
CALCULUS OF GENERATING FUNCTIONS

Pierre Simon Laplace∗


1820, 3rd edition

FIRST PART
GENERAL CONSIDERATIONS ON THE ELEMENTS OF MAGNITUDES

The notation of exponents, imagined by Descartes, has led Wallis and Newton to the
consideration of fractional exponents, positive and negative, and to the interpolation of
series. Leibnitz has rendered these exponents variables, that which has given birth to
the exponential calculus and has completed the system of elements of finite functions.
These functions are formed of exponential, algebraic and logarithmic quantities; quan-
tities essentially distinct from one another. Integrations are not often reducible to finite
functions. Leibnitz, having adapted to his differential characteristic some exponents
in order to express the repeated differentiations, has been led by the analogy of the
powers and of the differences, an analogy that Lagrange has followed by way of induc-
tion, in all his developments. The theory of generating functions extends this analogy
to some unspecified characteristics and indicates it evidently. All theory of series and
the integration of the equations in the differences result with an extreme facility from
this theory. No 1.

Chapter I. — CONCERNING GENERATING FUNCTIONS IN ONE VARIABLE

$u$ being any function of a variable $t$ and $y_x$ being the coefficient of $t^x$ in the development of this function, $u$ is the generating function of $y_x$. If we multiply $u$ by any function $s$ of $\frac{1}{t}$, we will have a new generating function which will be that of a function of $y_x$, $y_{x+1}$, etc. By designating by $\nabla y_x$ this last function, $us^i$ will be the generating function of $\nabla^i y_x$, so that the exponent of $s$, in the generating function, becomes the one of the characteristic $\nabla$ in the engendered function. No 2.

On the interpolation of the sequences in one variable, and on the integration of the
linear differential equations.

Interpolation is reduced to determining the coefficient $y_{x+i}$ of $t^x$ in the development of $\frac{u}{t^i}$. We are able to give to $\frac{1}{t}$ an infinity of different forms: by elevating it to the power $i$ under these forms and passing again next from the generating functions to the coefficients, we have, under an infinity of corresponding forms, the expression of $y_{x+i}$. Application of this method to the series of which the successive differences of the terms decrease. No 3.

∗ Translated by Richard J. Pulskamp, Department of Mathematics & Computer Science, Xavier University, Cincinnati, OH. July 29, 2013
Formulas in order to interpolate between an odd or even number of equidistant quan-
tities. No 4.
General formula of interpolation of series of which the last ratio of the terms is that of
a series of which the general term is given by a linear equation in the differences,
with constant coefficients. No 5.
The formula is arrested when the relation of the terms is that of a similar series, and
then it gives the integral of the linear equations in finite differences, of which
the coefficients are constants. General integration of these equations, in the case
even where they have a last term a function of the index. No 6.
Formula of interpolation of the same series, ordered with respect to the successive
differences of the principal variable. No 7.
Passage of this formula, from the finite to the infinitely small. Interpolation of the
series of which the last ratio of the terms is that of an equation in the infinitely
small linear differences, with constant coefficients. Integration of this kind of
equations, when also they have a last term. No 8.
On the transformation of series. No 9.
Theorems on the development of functions and of their differences into series.
We deduce from the calculus of generating functions the formulas
$$ {}'\Delta^n y_x = \left[(1 + \Delta y_x)^i - 1\right]^n, \qquad {}'\Sigma^n y_x = \left[(1 + \Delta y_x)^i - 1\right]^{-n}, $$
$\Delta$ and $\Sigma$ corresponding to the case where $x$ varies by unity, and ${}'\Delta$ and ${}'\Sigma$ corresponding to the case where $x$ varies by $i$. We deduce from these formulas the following:
$$ {}'\Delta^n y_x = \left(c^{\alpha \frac{dy_x}{dx}} - 1\right)^n, \qquad {}'\Sigma^n y_x = \left(c^{\alpha \frac{dy_x}{dx}} - 1\right)^{-n}, $$
in which $c$ designates the number of which the hyperbolic logarithm is unity, and ${}'\Delta$ and ${}'\Sigma$ correspond to the variation $\alpha$ of $x$. We transform the expression of ${}'\Delta^n y_x$ into this here
$$ \left(c^{\frac{\alpha}{2}\,\frac{dy_{x+\frac{n\alpha}{2}}}{dx}} - c^{-\frac{\alpha}{2}\,\frac{dy_{x+\frac{n\alpha}{2}}}{dx}}\right)^{n}. $$
We arrive to these formulas
$$ \frac{d^n y_x}{dx^n} = [\log(1 + \Delta y_x)]^n, \qquad \int^n y_x\, dx^n = [\log(1 + \Delta y_x)]^{-n}. $$
Analogy between the positive powers and the differences and between the negative
powers and the integrals, based on this that the exponents of the powers, in the
generating functions, are transported to the characteristics corresponding to the
variable $y_x$. Generalization of the preceding results. No 10.
Theorem analogous to the previous on the products of the many functions of one same
variable and especially with respect to the product $p_x y_x$. No 11.
Chapter II. — CONCERNING GENERATING FUNCTIONS IN TWO VARIABLES

$u$ being a function of two variables $t$ and $t'$, and $y_{x,x'}$ being the coefficient of $t^x t'^{x'}$ in the development of this function, $u$ is the generating function of $y_{x,x'}$. If we multiply $u$ by a function $s$ of $\frac{1}{t}$ and $\frac{1}{t'}$, the coefficient of $t^x t'^{x'}$ in the development of this product will be a function of $y_{x,x'}$, $y_{x+1,x'}$, $y_{x,x'+1}$, etc.; by designating it by $\nabla y_{x,x'}$, $us^i$ will be the generating function of $\nabla^i y_{x,x'}$. No 12.
On the interpolation of the series in two variables and on the integration of linear
equations in partial differences.
General formula of the interpolation of series of which the last ratio of the terms is
that of a series of which the general term is given by a linear equation in partial
differences, with constant coefficients. No 13.
The formula is arrested when the relation of the terms is that of a similar series, and
then it gives the integral of the linear equations in the partial finite differences, of
which the coefficients are constants. This integral supposes that we know or that
we can deduce from the conditions of the problem $n$ arbitrary values of $y_{x,x'}$, by
giving, for example, to $x'$ the $n$ values $0, 1, 2, \ldots, n-1$, $x$ being moreover un-
specified. A very simple expression of $y_{x,x'}$, when these arbitrary functions in $x$
are given by some linear equations in the differences, with constant coefficients.
No 14.
General expression of $y_{x,x'}$ under the form of definite integral; important remark on
the number of arbitrary functions which the integral of the equations in partial
differences contains. No 15.
Examination of some cases which escape from the general formula of integration
given in that which precedes; in this case, the characteristics of the finite differ-
ences which the integrals contain have for exponents the variable indices of the
equations in the partial differences. No 16.
Integration of the equation
$$ 0 = \Delta^n y_{x,x'} + \frac{a}{\alpha}\,\Delta^{n-1}\,{}'\Delta\,y_{x,x'} + \frac{b}{\alpha^2}\,\Delta^{n-2}\,{}'\Delta^2 y_{x,x'} + \cdots, $$
$\Delta$ corresponding to the variability of $x$ of which the unit is the difference, and ${}'\Delta$ corresponding to the variability of $x'$ of which $\alpha$ is the difference. We deduce from it the integral of the equation in the infinitely small and finite partial differences, that we obtain by changing, in the preceding, $\alpha$ into $dx'$, and the characteristic ${}'\Delta$ into $d$. No 17.

Theorems on the development into series of the functions of many variables.

These theorems are analogous to those which have been given previously with respect to
the functions in one variable alone, and we recover the observed analogy between
the positive powers and the differences, and between the negative powers and the
integrals. No 18.

Considerations on the passages from the finite to the infinitely small.

The consideration of these passages is very proper to clarify the most delicate points
of the infinitesimal Calculus. It shows evidently that the quantities neglected in
this Calculus remove nothing from its rigor. By applying it to the problem of the
vibrating cords, it proves the possibility to introduce some arbitrary discontinu-
ous functions into the integrals of the equations in the finite and infinitely small
partial differences, and it gives the conditions of this discontinuity. No 19.
General considerations on the generating functions.

To find the generating function of a given quantity by a linear equation in the finite
differences, of which the coefficients are some rational and entire functions of
the index. No 20.
Expressions of the integrals of these equations as definite integrals. The functions
under the integral sign are of the same nature as the generating functions of the
quantities given by these equations. Thus all the theorems deduced previously
from the analogy of the powers and the differences are applied to these integrals.
Their principal advantage is to furnish an approximation as handy as convergent
of these quantities, when their index is a very great number. This method of
approximation acquires a great extension by the passages from the positive to
the negative and from the real to the imaginary, passages of which I have given
the first traces in the Mémoires de l’Académie des Sciences of 1782. It seems, by
the posthumous Works of Euler, that, toward the same time, this great geometer
occupied himself with the same object. No 21.

SECOND PART
THEORY OF THE APPROXIMATION OF FORMULAS WHICH ARE FUNCTIONS OF
VERY LARGE NUMBERS

Chapter I. — ON THE INTEGRATION BY APPROXIMATION OF THE DIFFERENTIALS


WHICH CONTAIN SOME FACTORS RAISED TO GREAT POWERS

Expression, in convergent series, of their integral taken between two given limits: the
series ceases to be convergent near to the maximum of the function under the
integral sign. No 22.
Expression, in convergent series, of the integral in this last case. No 23.

That which this series becomes when the integral is taken between two limits which render null the function under the integral sign. Its value depends then on integrals of the form $\int t^r\,dt\,c^{-t^n}$, taken from $t$ null to $t$ infinity. We establish this theorem
$$ \int t^{n-r}\,dt\,c^{-t^n} \int t^{r-2}\,dt\,c^{-t^n} = \frac{\pi}{n^2 \sin\frac{r-1}{n}\pi}, $$
$\pi$ being the semi-circumference of which the radius is unity. We deduce from it this remarkable result
$$ \int dt\,c^{-t^2} = \tfrac{1}{2}\sqrt{\pi}. $$
No 24.

This last result gives, by the passage from the real to the imaginary,
$$ \int dx\,\cos rx\; c^{-a^2x^2} = \frac{\sqrt{\pi}}{2a}\,c^{-\frac{r^2}{4a^2}}, $$
the integral being taken from $x$ null to $x$ infinity; a direct method which leads to this equation and from which we deduce the value of the integral when the quantity under the $\int$ sign is multiplied by $x^{2n}$: value of the integral
$$ \int x^{2n\pm 1}\,dx\,\sin rx\; c^{-a^2x^2}. $$
No 25.
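The theorem of No 24 above lends itself to a quick numerical check. The sketch below, a minimal verification of the identity as reconstructed here (with $c$ the base of the natural logarithms, and an arbitrary choice of $n$ and $r$), evaluates both sides by ordinary quadrature, together with the remarkable result $\int dt\,c^{-t^2} = \tfrac{1}{2}\sqrt{\pi}$.

```python
# Numerical check of the No 24 identities as reconstructed above.
import math
from scipy.integrate import quad

n, r = 5.0, 3.0                                  # any n > r - 1 > 0 will do
f1 = lambda t: t**(n - r) * math.exp(-t**n)      # t^(n-r) c^(-t^n)
f2 = lambda t: t**(r - 2) * math.exp(-t**n)      # t^(r-2) c^(-t^n)
lhs = quad(f1, 0, math.inf)[0] * quad(f2, 0, math.inf)[0]
rhs = math.pi / (n**2 * math.sin((r - 1) / n * math.pi))
print(lhs, rhs)                                  # both approximately 0.1321

# The "remarkable result": the integral of c^(-t^2) from 0 to infinity equals sqrt(pi)/2.
print(quad(lambda t: math.exp(-t**2), 0, math.inf)[0], math.sqrt(math.pi) / 2)
```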
We arrive to the formulas
$$ \int \frac{dx\,\cos rx}{1 + x^2} = \int \frac{x\,dx\,\sin rx}{1 + x^2} = \pi\,c^{-r}, $$
the integrals being taken from $x = -\infty$ to $x = +\infty$; and we deduce from it the integrals $\int \frac{M}{N}\,dx\,\cos rx$ and $\int \frac{M}{N}\,dx\,\sin rx$, taken within the same limits, $N$ being a rational and entire function of $x$, of a degree superior to $M$, and not having a real factor of the first degree. No 26.

Expression of the integral $\int dt\,c^{-t^2}$, taken between the given limits, either as series, or as continued fraction. No 27.
Approximation of the double, triple, etc. integrals of the differentials multiplied by some factors raised to high powers. Formulas in convergent series in order to integrate, within some given limits, the double integral $\int\int y\,dx\,dx'$, $y$ being a function of $x$ and of $x'$. Examination of the case where the integral is taken very near the maximum of $y$. Expression of the integral as convergent series. No 28.
Chapter II. — ON THE INTEGRATION BY APPROXIMATION OF LINEAR EQUATIONS
IN FINITE AND INFINITELY SMALL DIFFERENCES

Integration of the equation in the finite differences
$$ S = A\,y_s + B\,\Delta y_s + C\,\Delta^2 y_s + \cdots, $$
$A, B, C$ being some rational and entire functions of $s$. If the variable $y_s$ is expressed by the definite integral $\int x^s \varphi\,dx$ or by this here $\int c^{-sx} \varphi\,dx$, $\varphi$ being a function of $x$, we have, by the formulas of the preceding Chapter, the value of $y_s$ in very convergent series, when the index $s$ is a large number. In order to determine $\varphi$, we substitute for $y_s$ its expression as definite integral in the equation in the differences in $y_s$, which is partitioned into two others, of which the one is a differential equation in $\varphi$, which serves to determine this unknown; the other equation gives the limits of the definite integral. No 29.
Integration of any number of linear equations in one index alone and having a last
term, the coefficients of these equations being some rational and entire functions
of this index. This method is able to be extended to the linear equations in
differences either infinitely small, or into finite parts and into infinitely small
parts. No 30.
The principal difficulty of this analysis consists in integrating the differential equa-
tion in φ, which is integrable generally only in the case where the index s is only
to the first power in the equation in the differences in $y_s$, which then is of the
form $0 = V + sT$, $V$ and $T$ being some linear functions of $y_s$ and of its differ-
ences, either finite, or infinitely small. Integral of this last equation, by a very
convergent series, when s is a large number. Important remark on the extent of
this series, which is independent of the limits of the definite integral by which
ys is expressed, and which subsists in the same case where the equation in the
limits has only imaginary roots. When, in the equation in ys , s surpasses the
first degree, we can sometimes decompose it into many equations which contain
only the first power of s. We can further, in many cases, integrate, by a very
convergent approximation, the differential equation in φ. No 31.
Integration of the equation
$$ 0 = V + sT + s'R, $$
$V, T, R$ being unspecified linear functions of $y_{s,s'}$ and of its ordinary and partial differences, finite and infinitely small. No 32.

Chapter III. — APPLICATION OF THE PRECEDING METHODS TO THE APPROXIMATION OF DIVERSE FUNCTIONS OF VERY GREAT NUMBERS

On the approximation of the products composed of a great number of factors and of


the terms of polynomials raised to great powers.

The integral of the equation $0 = (s + 1)y_s - y_{s+1}$, approximated by the methods of the preceding Chapter and compared to its finite integral, gives, by a very convergent series, the product $(\mu + 1)(\mu + 2)\ldots s$. By making $s$ negative and passing from the positive to the negative and from the real to the imaginary, we arrive to this remarkable equation
$$ \frac{2\pi(-1)^{\frac{1}{2}-\mu}}{\int x^{\mu-1}\,dx\,c^{-x}} = \int \frac{dx\,c^{-x}}{x^{\mu}}, $$
the first integral being taken from $x$ null to $x$ infinity, and the last integral being taken between the imaginary limits of $x$ which render null the function $\frac{c^{-x}}{x^{\mu}}$; that which gives an easy means to have the integrals $\int \frac{dx\,\cos x}{x^{\mu}}$ and $\int \frac{dx\,\sin x}{x^{\mu}}$, taken from $x$ null to $x$ infinity. This equation gives further the value of the integrals
$$ \int \frac{d\varpi\,\cos\varpi}{1+\varpi^2}, \qquad \int \frac{\varpi\,d\varpi\,\sin\varpi}{1+\varpi^2}, $$
taken from $\varpi$ null to $\varpi$ infinity. One finds $\frac{\pi}{2c}$ for these integrals; their accord with the results of the No 26 proves the justice of these passages from the positive to the negative and from the real to the imaginary: these diverse results have been given in the Mémoires de l'Académie des Sciences for the year 1782. No 33.
The approximate integral of the equation $0 = (a' + b's)y_{s+1} - (a + bs)y_s$, whence we deduce, by a simple and very convergent series, the middle term or term independent of $a$ of the binomial $\left(a + \frac{1}{a}\right)^{2s}$. No 34.
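The middle, $a$-independent term of $\left(a + \frac{1}{a}\right)^{2s}$ is the central binomial coefficient $\binom{2s}{s}$. Laplace's convergent series for it is not reproduced here; as a rough plausibility check of the statement, a minimal sketch compares the exact coefficient with its familiar leading approximation $2^{2s}/\sqrt{\pi s}$ for one arbitrary value of $s$.

```python
# Central term of (a + 1/a)^(2s): exact value versus the leading approximation.
from math import comb, pi, sqrt

s = 50
exact = comb(2 * s, s)                 # middle coefficient, independent of a
approx = 2**(2 * s) / sqrt(pi * s)     # leading-order approximation
print(exact, approx, approx / exact)   # ratio close to 1 (about 1.0025 for s = 50)
```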

General method in order to have, by a convergent series, the middle term or term independent of $a$, in the development of the polynomial
$$ a^{-n} + a^{-n+1} + a^{-n+2} + \cdots + a^{n-1} + a^{n} $$
raised to a very high power. No 35.

Expressions, in convergent series, of the coefficient of $a^{\pm l}$, in the development of this power, and of the sum of its coefficients, from the one of $a^{-l}$ to the one of $a^{l}$. No 36.
Integration by approximation of the equation in the differences $p_s = s\,y_s + (s - i)\,y_{s+i}$. One deduces from it the expression of the sum of the terms of the very high power of a binomial, by arresting its development at any term quite distant from the first. No 37.

On the approximation of the very elevated differences infinitely small and finite of
functions

Approximation of the very elevated infinitely small differences of the powers of a


polynomial. Very approximate expression of the very elevated differential of an
angle, taken with respect to its sine. No 38.

Expressions by definite integrals of the finite and infinitely small differences of $y_s$, when we are arrived to give to it either of the forms $\int x^s \varphi\,dx$, $\int c^{-sx} \varphi\,dx$. No 39.
Approximation by very convergent series of $\Delta^n \frac{1}{s}$, $n$ being a large number. We deduce, by means of the passages from the positive to the negative and from the real to the imaginary, the approximation of $\Delta^n s^i$. The convergence of the series requires that $i$ surpass $n$ and that the difference $i - n$ not be too small with respect to $s + \frac{n}{2}$. Expression in series of $\Delta^n s^i$, in the last case. No 40.
Expression of the difference $\Delta^n s^i$ when $i$ is smaller than $n$. No 41.
Expression of the sum of the terms of $\Delta^n s^i$, by arresting its development at the term in which the quantity raised to the power $i$ commences to become negative. Approximation, by very convergent series, of the function
$$ (n + r\sqrt{n})^{n\pm l} - n\,(n + r\sqrt{n} - 2)^{n\pm l} + \frac{n(n-1)}{1.2}\,(n + r\sqrt{n} - 4)^{n\pm l} - \cdots $$
in which we reject the terms where the quantity raised to the power $n \pm l$ is negative, $l$ being a very considerable whole number with respect to $n$. No 42.
Extension of the preceding methods to the very elevated finite differences of the form
$$ \Delta^n (s + p)^i (s + p')^{i'} (s + p'')^{i''} \cdots $$
No 43.
General remarks on the convergence of the series. No 44.

BOOK II
CHAPTER I
PRINCIPES GÉNÉRAUX DE CETTE THÉORIE

Pierre Simon Laplace∗


Théorie Analytique des Probabilités OC 7 §§1–2, pp. 181–190

GENERAL PRINCIPLES OF THIS THEORY


Definition of probability. Its measure is the ratio of the number of favorable cases to the one of
all possible cases.
The probability of an event composed of two simple events is the product of the probability of
one of these events, by the probability that, this event having arrived, the other event will
take place.
The probability of a future event, deduced from an observed event, is the quotient of the division
of the probability of the event composed of these two events and determined a priori by
the probability of the observed event, determined similarly a priori.
If an observed event is able to result from n different causes, their probabilities are respectively,
as the probabilities of the event, deduced from their existence, and the probability of each
of them is a fraction of which the numerator is the probability of the event under the
hypothesis of the existence of the cause, and of which the denominator is the sum of the
similar probabilities, relative to all the causes. If these diverse causes considered a priori
are unequally probable, it is necessary, instead of the probability of the event, resulting
from each cause, to employ the product of this probability by that of the cause itself.
The probability of a future event is the sum of the products of the probability of each cause,
deduced from the observed event, by the probability that this cause existing, the future
event will take place.
On the influence that the unknown difference which is able to exist among some simple events
that we suppose equally possible must have on the results of the Calculus of Probabilities.
This difference increases the probability of the events composed of the repetition of one
same event. No 1.
On the mathematical and moral expectations. The first is the product of the expected good by
the probability to obtain it; the second depends on the relative value of the expected good.
The most natural and simplest rule in order to estimate this value consists in supposing
the relative value of an infinitely small sum in direct ratio of its absolute value and in
inverse ratio to the total good of the interested person. No 2.

∗ Translated by Richard J. Pulskamp, Department of Mathematics & Computer Science, Xavier Univer-

sity, Cincinnati, OH. July 17, 2016

§1. We have seen in the Introduction that the probability of an event is the ratio of [181]
the number of cases which are favorable to it to the number of all possible cases, when
nothing supports belief that one of these cases must arrive rather than the others, that
which renders them, for us, equally possible. The just estimation of these diverse cases
is one of the most delicate points of the Analysis of chances.
If all the cases are not equally possible, we will determine their respective possi-
bilities, and then the probability of the event will be the sum of the probabilities of
each favorable case. In fact, let us name p the probability of the first of these cases.
This probability is relative to the subdivision of all the cases into some others equally
possible. Let N be the sum of all the cases thus subdivided, and n the sum of those
cases which are favorable to the first case; we will have
$$ p = \frac{n}{N}. $$
We will have similarly
$$ p' = \frac{n'}{N}, \quad p'' = \frac{n''}{N}, \quad \ldots, $$
by marking with one stroke, with two strokes, . . . the letters $p$ and $n$, relatively to
the second case, to the third, . . .. Now the probability of the event of which there is [182]
question is, by the same definition of probability, equal to
$$ \frac{n + n' + n'' + \cdots}{N}; $$
it is therefore equal to $p + p' + p'' + \cdots$.
When an event is composed of two simple events, the one independent of the other,
it is clear that the number of all possible cases is the product of the two numbers which
express all the possible cases relative to each simple event, because each of the cases
relative to one of these events is able to be combined with all the cases relative to the
other event. By the same reason, the number of cases favorable to the composite event
is the product of the two numbers which express the cases favorable to each simple
event; the probability of the composite event is therefore then the product of the prob-
abilities of each simple event. Thus the probability to bring forth twice consecutively
one ace with one die is one thirty-sixth, when we suppose the faces of the die perfectly
equal, because the number of all possible cases in two trials is thirty-six, each case of
the first cast being able to be combined with the six cases of the second, and among all
these cases one alone gives two aces consecutively.
In general, if $p, p', p'', \ldots$ are the respective possibilities of any number of simple
events independent of one another, the product $p \cdot p' \cdot p'' \cdots$ will be the probability of
an event composed of these events.
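The thirty-six case count and the product rule above can be confirmed by direct enumeration; a minimal sketch:

```python
# Enumerate the thirty-six equally possible cases of two casts of a die and
# compare the frequency of a double ace with the product of the simple probabilities.
from fractions import Fraction
from itertools import product

cases = list(product(range(1, 7), repeat=2))
double_ace = sum(1 for c in cases if c == (1, 1))
print(Fraction(double_ace, len(cases)))   # 1/36
print(Fraction(1, 6) * Fraction(1, 6))    # 1/36, the product of the simple probabilities
```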
If the simple events are linked among them in a manner that the supposition of the
arrival of the first influences the probability of the arrival of the second, we will have
the probability of the composite event, by determining: 1◦ the probability of the first
event; 2◦ the probability that, this event having arrived, the second will take place.
In order to demonstrate this principle in a general manner, let us name $p$ the number of all the possible cases, and let us suppose that in this number there are $p'$ of them favorable to the first event. Let us suppose next that, in the number $p'$, there are $q$ favorable to the second event; it is clear that $\frac{q}{p}$ will be the probability of the composite [183] event. But the probability of the first event is $\frac{p'}{p}$, and the probability that, this event having arrived, the second will take place is $\frac{q}{p'}$; because then, one of the cases $p'$ needing to exist, we must consider only these cases. Now we have
$$ \frac{q}{p} = \frac{p'}{p}\cdot\frac{q}{p'}, $$
that which is the translation into Analysis of the principle enunciated above.
In considering as a composite event the observed event joined to a future event, the
probability of this last event, deduced from the observed event, is evidently the prob-
ability that, the observed event taking place, the future event will take place similarly;
now, by the principle that we have just exposed, this probability multiplied by that of
the observed event, determined a priori or independently from that which is already
arrived, is equal to that of the composite event determined a priori; we have therefore
this new principle, relative to the probability of future events, deduced from observed
events:
The probability of a future event, deduced from an observed event, is the quotient
of the division of the probability of the event composed of these two events, and deter-
mined a priori, by the probability of the observed event, determined similarly a priori.
Thence results further this other principle relative to the probability of causes, de-
duced from observed events.
If an observed event is able to result from n different causes, their probabilities
are respectively as the probabilities of the event, deduced from their existence; and the
probability of each of them is a fraction of which the numerator is the probability of the
event, under the hypothesis of the existence of the cause, and of which the denominator
is the sum of the similar probabilities, relative to all the causes.
Let us consider, in fact, as a composite event the observed event, resulting from
one of these causes. The probability of this composite event, a probability that we will [184]
designate by E, will be, by that which precedes, equal to the product of the probability
of the observed event, determined a priori and that we will name F , by the probability
that, this event taking place, the cause of which there is concern exists, a probability
which is that of the cause, deduced from the observed event, and that we will name P .
We will have therefore
$$ P = \frac{E}{F}. $$
The probability of the composite event is the product of the probability of the cause
by the probability that, this cause taking place, the event will arrive, a probability that
we will designate by H. All the causes being supposed a priori equally possible, the
probability of each of them is $\frac{1}{n}$; we have therefore
$$ E = \frac{H}{n}. $$
The probability of the observed event is the sum of all the $E$ relative to each cause; by designating therefore by $S\,\frac{H}{n}$ the sum of all the values of $\frac{H}{n}$, we will have
$$ F = S\,\frac{H}{n}; $$
the equation $P = \frac{E}{F}$ will become therefore
$$ P = \frac{H}{S\,H}, $$
that which is the principle enunciated above, when all the causes are a priori equally
possible. If this is not, by naming p the probability a priori of the cause that we have
just considered, we will have
$$ E = Hp, $$
and, by following the preceding reasoning, we will find
$$ P = \frac{Hp}{S\,Hp}; $$
that which gives the probabilities of the diverse causes, when they are not all equally [185]
possible a priori.
In order to apply the preceding principle to an example, let us suppose that an urn
contains three balls of which each is able to be only white or black; that after having
drawn one ball, we restore it to the urn in order to proceed to a new drawing, and that
after m drawings, we have brought forth only some white balls. It is clear that we are
able to make a priori only four hypotheses; because the balls are able to be either all
white, or two whites and one black, or two blacks and one white, or finally all black. If
we consider these hypotheses as so many causes of the observed event, the probabilities
of the event relative to these causes will be
$$ 1, \quad \frac{2^m}{3^m}, \quad \frac{1}{3^m}, \quad 0. $$
The respective probabilities of these hypotheses, deduced from the observed event, will
be therefore, by the third principle,
$$ \frac{3^m}{3^m + 2^m + 1}, \quad \frac{2^m}{3^m + 2^m + 1}, \quad \frac{1}{3^m + 2^m + 1}, \quad 0. $$
We see, besides, that it is useless to have regard to the hypotheses which exclude
the event, because, the probability resulting from these hypotheses being null, their
omission changes not at all the expressions of the other probabilities.
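The posterior probabilities just obtained can be reproduced by a direct computation; the following sketch (with an arbitrary choice of $m$, assuming the formulas as printed above) applies the third principle to the four equally possible compositions of the urn.

```python
# Three balls, each white or black; four equally likely compositions; m white
# draws observed (drawing with replacement). Posteriors by the third principle.
from fractions import Fraction

m = 4
likelihoods = [Fraction(w, 3)**m for w in (3, 2, 1, 0)]      # P(m whites | composition)
total = sum(Fraction(1, 4) * L for L in likelihoods)
posteriors = [Fraction(1, 4) * L / total for L in likelihoods]
denom = 3**m + 2**m + 1
expected = [Fraction(3**m, denom), Fraction(2**m, denom), Fraction(1, denom), Fraction(0)]
print(posteriors == expected)                                # True
```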
If we wish to have the probability to bring forth only some black balls in the following $m'$ drawings, we will determine a priori the probabilities to bring forth first $m$ white balls, next $m'$ black balls. These probabilities are, relatively to the preceding hypotheses,
$$ 0, \quad \frac{2^m}{3^{m+m'}}, \quad \frac{2^{m'}}{3^{m+m'}}, \quad 0, $$
and as, a priori, the four hypotheses are equally possible, the probability of the composite event will be the quarter of the sum of the four preceding probabilities, or [186]
$$ \frac{1}{4}\,\frac{2^m + 2^{m'}}{3^{m+m'}}. $$
The probabilities of the observed event, determined a priori, under the preceding four hypotheses, being respectively
$$ \frac{3^m}{3^m}, \quad \frac{2^m}{3^m}, \quad \frac{1}{3^m}, \quad 0, $$
the quarter of their sum, or
$$ \frac{1}{4}\,\frac{3^m + 2^m + 1}{3^m}, $$
will be the probability of the observed event, determined a priori; by dividing therefore the probability of the composite event by this probability, we will have, by the second principle,
$$ \frac{2^m + 2^{m'}}{3^{m'}\,(3^m + 2^m + 1)}, $$
for the probability to bring forth $m'$ black balls in the $m'$ drawings following.
We are able further to determine this probability by the following principle:

The probability of a future event is the sum of the products of the probability of each
cause, deduced from the observed event, by the probability that, this cause existing, the
future event will take place.

Here the probabilities of each cause, deduced from the observed event, are, as we
have seen,
$$ \frac{3^m}{3^m + 2^m + 1}, \quad \frac{2^m}{3^m + 2^m + 1}, \quad \frac{1}{3^m + 2^m + 1}, \quad 0; $$
the probabilities of the future event, relative to these causes, are respectively
$$ 0, \quad \frac{1}{3^{m'}}, \quad \frac{2^{m'}}{3^{m'}}, \quad 1; $$
the sum of their respective products, or [187]
$$ \frac{2^m + 2^{m'}}{3^{m'}\,(3^m + 2^m + 1)}, $$

will be the probability of the future event, deduced from the observed event, that which
is conformed to that which precedes.
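Both routes to the probability of the future event can be cross-checked numerically; a short sketch, assuming the formulas as reconstructed above, for one arbitrary choice of $m$ and $m'$:

```python
# Predictive probability of m' all-black draws after m all-white draws from the
# three-ball urn: sum of posterior * P(future | composition), against the closed form.
from fractions import Fraction

m, mp = 3, 4
denom = 3**m + 2**m + 1
posteriors = [Fraction(3**m, denom), Fraction(2**m, denom), Fraction(1, denom), Fraction(0)]
future = [Fraction(b, 3)**mp for b in (0, 1, 2, 3)]          # P(m' blacks | composition)
predictive = sum(p * f for p, f in zip(posteriors, future))
closed_form = Fraction(2**m + 2**mp, 3**mp * denom)
print(predictive == closed_form)                             # True
```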
If we suppose four balls in the urn, and that having brought forth a white ball at the first drawing, we seek the probability to bring forth only some black balls in the following $m'$ drawings, we will find, by the principles exposed above, this probability equal to
$$ \frac{3 + 2^{m'+1} + 3^{m'}}{10 \cdot 4^{m'}}. $$
If the number of white balls equals the one of the black, the probability to bring forth only some black balls in $m'$ drawings is $\frac{1}{2^{m'}}$. It surpasses the preceding when $m'$ is equal or less than 5; but it becomes inferior to it when $m'$ surpasses 5, although the white ball extracted first from the urn indicates a superiority in the number of white balls. The explication of this paradox holds in this that this indication excludes not at all the superiority of the number of black balls; it renders it only less probable, whereas the supposition of a perfect equality between the number of whites and the one of the blacks excludes this superiority; now this superiority, however small its probability may be, must render the probability to bring forth consecutively $m'$ black balls greater than in the case of equality of the colors, when $m'$ is considerable.
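The crossover at $m' = 5$ asserted in this paragraph is easy to verify numerically; a minimal sketch, assuming the two expressions as printed above:

```python
# Four-ball urn after one observed white draw, versus an urn known to hold two
# white and two black balls: probability of m' consecutive black draws.
from fractions import Fraction

for mp in range(1, 9):
    one_white_seen = Fraction(3 + 2**(mp + 1) + 3**mp, 10 * 4**mp)
    colors_equal = Fraction(1, 2**mp)
    larger = "equal-colors" if colors_equal > one_white_seen else "one-white-seen"
    print(mp, float(one_white_seen), float(colors_equal), larger)
# equal-colors is larger through m' = 5 and smaller from m' = 6 onward
```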
The inequality which is able to exist between some things that we suppose per-
fectly similar is able to have on the results of the Calculus of Probabilities a sensible
influence which merits a particular attention. Let us consider the game of heads and
tails, and let us suppose that it is equally easy to bring forth heads as tails; then the
probability to bring forth heads at the first trial is $\frac{1}{2}$, and that to bring it forth two times
consecutively is $\frac{1}{4}$. But if there exists in the coin an inequality which makes one of [188]
the faces appear rather than the other, without us knowing the face that this inequality
favors, the probability to bring forth heads at the first trial will remain always $\frac{1}{2}$, be-
cause, in the ignorance where one is of the face that this inequality favors, as much as
the probability of the simple event is increased if this inequality is favorable to it, so
much is it diminished if this inequality is contrary to it. But the probability to bring
forth heads two times consecutively is increased, notwithstanding this ignorance; be-
cause this probability is equal to that to bring forth heads at the first trial, multiplied by
the probability that, having brought it forth at the first trial, we will bring it forth at the
second; now its arrival at the first trial is a motive to believe that the inequality of the
coin favors it; it increases therefore the probability to bring it forth at the second; thus
the product of the two probabilities is increased by this inequality. In order to submit
this object to calculation, let us suppose that the inequality of the coin increases by the
quantity α the probability of the simple event that it favors. If this event is heads, the
probability will be $\frac{1}{2} + \alpha$, and the probability to bring it forth two times consecutively
will be $\left(\frac{1}{2} + \alpha\right)^2$. If the event favored is tails, the probability of heads will be $\frac{1}{2} - \alpha$,
and the probability to bring it forth two times consecutively will be $\left(\frac{1}{2} - \alpha\right)^2$. As we
have no reason in advance to believe that the inequality favors the one rather than the
other of the simple events, it is clear that, in order to have the probability of the com-
posite event heads-heads, it is necessary to add the two preceding probabilities and to
take the half of their sum, that which gives $\frac{1}{4} + \alpha^2$ for this probability: it is also the
probability of tails-tails. We will find by the same reasoning that the probability of the
composite event heads-tails or tails-heads is $\frac{1}{4} - \alpha^2$; consequently, it is less than that
of the repetition of the same simple event.
The preceding considerations are able to be extended to any events whatsoever, p
representing the probability of a simple event, and 1 − p that of the other event; if we
designate by P the probability of a result relative to these events, and if we suppose
that $p$ is really $p \pm \alpha$, $\alpha$ being an unknown quantity, in the same way as the sign which [189]
affects it, the probability $P$ of the result will be
$$ P + \frac{1}{1.2}\,\alpha^2\,\frac{d^2P}{dp^2} + \frac{1}{1.2.3.4}\,\alpha^4\,\frac{d^4P}{dp^4} + \cdots $$
By making $P = p^n$, that is to say by supposing that the result relative to the events is
$n$ times the repetition of the first, the probability $P$ will become
$$ p^n + \frac{n(n-1)}{1.2}\,\alpha^2\,p^{n-2} + \frac{n(n-1)(n-2)(n-3)}{1.2.3.4}\,\alpha^4\,p^{n-4} + \cdots $$
Thus the unknown error, that we are able to suppose in the probability of the simple
events, increases always the probability of the composite events of the repetition of the
same event.
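The expansion above is the even part of the Taylor series of $P(p \pm \alpha)$ averaged over the two signs; a small symbolic sketch (using sympy, with one arbitrary exponent $n$) confirms it for $P = p^n$.

```python
# Average of P(p + a) and P(p - a) for P = p**n, compared with the even-order series.
import sympy as sp

p, a = sp.symbols('p a', positive=True)
n = 5
P = p**n
averaged = sp.expand((P.subs(p, p + a) + P.subs(p, p - a)) / 2)
series = P + sum(a**(2 * k) / sp.factorial(2 * k) * sp.diff(P, p, 2 * k)
                 for k in range(1, n // 2 + 1))
print(sp.simplify(averaged - sp.expand(series)))   # 0: the expressions agree
```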

§2. The probability of events serves to determine the hope1 and the fear of the
persons interested in their existence. The word espérance has diverse meanings; it ex-
presses generally the advantage of the one who awaits any good, under a supposition
that is only probable. In the theory of chances, this advantage is the product of the ex-
pected sum by the probability to obtain it; it is the partial sum which must return when
we no longer wish to incur the risks of the event, by supposing that the apportionment
of the entire sum is made proportional to the probabilities. This manner to apportion it
is alone equitable, when we set aside all strange circumstance, because with an equal
degree of probability we have an equal right with respect to the expected sum. We will
name this advantage mathematical expectation, in order to distinguish it from moral
expectation which depends, as it does, on the expected good and on the probability to
obtain it, but which is regulated further on a thousand variable circumstances that it is
nearly always impossible to define, and further yet to subject to the calculus. These cir-
cumstances, it is true, making only to increase or to decrease the value of the expected
good, we are able to consider the moral expectation itself as the product of this value
by the probability to obtain it; but we must then distinguish, in the expected good, its
relative value from its absolute value: the latter is independent of the motives which make [190]
it desired, whereas the first increases with these motives.
We are not able to give a general rule in order to estimate this relative value; how-
ever it is natural to suppose the relative value of an infinitely small sum in direct ratio
to its absolute value, and in inverse ratio of the total good of the interested person. In fact,
it is clear that a franc has very little value for the one who possesses a great number of
them, and that the most natural manner to estimate its relative value is to suppose it in
inverse ratio to this number.
Such are the general principles of the Analysis of Probabilities. We will now apply
them to the most delicate and the most difficult questions of this analysis. But, in order
to put order into this matter, we will treat first the questions in which the probabilities
of the simple events are given; we will consider next those in which these probabilities
are unknown and must be determined by the observed events.

1 espérance: Expectation, here translated as hope.

BOOK II
CHAPTER II
DE LA PROBABILITÉ DES ÉVÉNEMENTS COMPOSÉS D'ÉVÉNEMENTS
SIMPLES DONT LES POSSIBILITÉS RESPECTIVES SONT DONNÉES

Pierre Simon Laplace∗


Théorie Analytique des Probabilités OC 7 §§3–15, pp. 191–279.

ON THE PROBABILITY OF EVENTS COMPOSED OF SIMPLE EVENTS


OF WHICH THE RESPECTIVE POSSIBILITIES ARE GIVEN
Expression of the number of combinations of n letters taken r by r when we have regard or not
to their respective situation. Application to the lotteries. No 3.
A lottery being composed of n tickets of which r exit at each drawing, we demand the probabil-
ity that after i drawings all the tickets will have exited. General solution of the problem. A
very simple and very near expression of the probability when n and i are great numbers.
Application to the case where n = 10000 and r = 1. There is, in this case, odds a little
less than one against one that all the tickets will exit in 95767 drawings and odds a little
more than one against one that they will exit in 95768 drawings. In the case of the lottery
of France, where n = 90 and r = 5, there is odds a little less than one against one that
all the numbers will exit in 85 drawings, and odds a little more than one against one that
they will exit in 86 drawings. No 4.
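The figures quoted in the preceding entry can be checked against the standard inclusion-exclusion expression for the probability that every ticket has appeared after $i$ drawings of $r$ distinct tickets; the sketch below is a modern restatement used only as a check, not Laplace's own convergent approximation.

```python
# Probability that all n tickets have exited after i drawings of r distinct tickets.
from math import comb

def prob_all_seen(n, r, i, kmax=60):
    # inclusion-exclusion, truncated after kmax terms (ample for the cases below)
    q = comb(n, r)
    return sum((-1)**k * comb(n, k) * (comb(n - k, r) / q)**i
               for k in range(0, min(kmax, n - r) + 1))

print(prob_all_seen(10000, 1, 95767), prob_all_seen(10000, 1, 95768))  # both extremely close to 1/2
print(prob_all_seen(90, 5, 85), prob_all_seen(90, 5, 86))              # about 0.49 and 0.51
```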
An urn being supposed to contain the number x of balls, we draw from it a part or the totality,
and we demand the probability that the number of extracted balls will be even. Solution
of the problem. There is advantage to wager for an odd number. No 5.
Expression of the probability to bring forth $x$ white balls, $x'$ black balls, $x''$ red balls, etc., by drawing a ball from each of the urns of which the number is
$$ x + x' + x'' + \cdots, $$
and which contain each $p$ white balls, $q$ black balls, $r$ red balls, etc. No 6.
To determine the probability to draw thus from the preceding urns x white balls, before bringing
forth either x′ black balls, or x′′ red balls, or, etc. Solution of the problem by the method
of combinations. Identity of this problem with the one which consists in determining
the lots of a number n of players of whom the respective skills are known when there is
lacking, in order to win the game, x trials to the first, x′ to the second, x′′ to the third, etc.
No 7.
∗ Translated by Richard J. Pulskamp, Department of Mathematics & Computer Science, Xavier University, Cincinnati, OH. July 31, 2016

General solution of the preceding problem by the analysis of generating functions. In the case
of two players A and B of whom the respective skills are equal, the problem is the one
that Pascal proposed to Fermat and that these two great geometers resolved. It reverts to
imagining an urn which contains two balls, one white and the other black, bearing each
the no 1, the white ball being for player A, the black ball for player B. We draw from
the urn one ball that we return next there in order to proceed to a new drawing, and we
continue thus until the sum of the drawn values, favorable to one of the players, attains
a given number. After a certain number of drawings, there is lacking yet to player A the
number x and to player B the number x′. The two players agree then to be retired from
the game by dividing the stake that they have set in beginning: the concern is to know
how this division must be made. That which returns to the players must be evidently
proportional to their respective probabilities to win the game. Generalization and solution
of the problem: 1◦ by supposing in the urn one white ball favorable to A and bearing the
no 1 and two black balls favorable to B and the one bearing, the no 1, and the other, the no
2; each ball diminishing with its number the number of points which lack to the player to
whom it is favorable; 2◦ by supposing in the urn two white balls bearing the nos 1 and 2
and two black balls bearing the same numbers. No 8.
Conceiving in an urn r balls marked with the no 1, r balls marked with the no 2, and thus con-
secutively until no n; these balls being well mixed in the urn and each drawn successively,
we demand the probability that there will exit at least s balls at the rank indicated by their
number. General solution of the problem and of the one in which, having i urns each con-
taining the number n of balls, all of different colors and that we draw each successively
from each urn by completing the drawing from one urn before passing to another urn, we
demand the probability that one or many balls of the same color will exit at the same rank
in the complete drawings from the urns. No 9.
Two players A and B, of whom the respective skills are p and q and of whom the first has the
number a of tokens and the second the number b, play with this condition that the one who
loses gives a token to his adversary and that the game ends only when one of the players
will have lost all his tokens; we demand the probability that one of the players will win
the game before or at the nth trial. Generating function of this probability, whence we
deduce the general expression of the probability. The expression of the probability that
the game will end before or at the nth trial. That which it becomes when we suppose a
infinite. Very close value of the same expression, when we suppose moreover p and q
equals and when b is a considerable number. If b = 100, there is disadvantage to wager
one against one that A will win the game in 23780 trials; but there is advantage to wager
that he will win it in 23781 trials. No 10.
A number n + 1 of players play together on the following conditions: two of among them play
first, and the one who loses is retired after having set a franc into the game, in order to
return only after all the other players have played; that which holds generally for all the
players who lose and who thence become the last. The one of the first two players who
has won plays with the third, and, if he beats him, he continues to play with the fourth,
and thus consecutively, until he loses, or until he has beat successively all the players. In
this last case, the game is ended. But, if the player winning on the first trial is vanquished
by one of the other players, the vanquisher plays with the player following and continues
to play until either he is vanquished or until he has beaten consecutively all the players.
The game continues thus until one of the players beats consecutively all the others, that
which ends the game, and then the player who wins it carries away all that which has
been set into the game. This premised, we demand: 1◦ the probability that the game will
end before or at the number x of trials; 2◦ the probability that any one of the players will

win the game in this number of trials; 3◦ his advantage. General solution of the problem.
Generating functions of these three quantities, whence we deduce their values. Quite
simple expressions of these quantities, when x is infinite or when the game is continued
indefinitely. No 11.
q being the probability of a simple event at each trial, we demand the probability to bring it
forth i times consecutively in the number x of trials. Solution of the problem. Generating
function of this probability, whence we deduce the expression of the probability.
Two players A and B, of whom the respective skills are q and 1 − q, play with this condition
that the one of the two who will have vanquished first i times consecutively his adversary
will win the game; we demand the respective probabilities of the players to win the game,
before or at the trial x. Solution of the problem by means of the generating functions.
Expressions of these probabilities in the case of x infinite. Respective lots of the players,
by supposing that at each trial that they lose, they deposit a franc into the game. No 12.
One urn being supposed to contain n+1 balls, distinguished by the nos 0, 1, 2, 3, . . ., n, we draw
from it one ball that we return into the urn after the drawing; we demand the probability
that after i drawings the sum of the numbers drawn will be equal to s. Solution of the
problem based on a singular artifice, which consists in the use of a characteristic proper
to make known the successive diminution that it is necessary to subject to the variable, in
each term of the final result of the successive integrations, when they are discontinuous.
Application of the solution to the problem which consists in determining the probability
to bring forth a given number, by projecting i dice, each of a number of faces n + 1,
and to the problem where we seek the probability that the sum of the inclinations to
the ecliptic of a number s of orbits will be comprehended within some given limits, by
supposing all the inclinations, from zero to the right angle, equally possible. We show
that the existence of a common cause which has directed the movements of rotation and
of revolution of the planets and of the satellites, in the sense of the rotation of the Sun, is
indicated with a probability excessively near to certitude, and quite superior to that of the
greatest number of historical facts, with respect to which we are permitted no doubt. The
same solution, applied to the movement and to the orbits of one hundred comets observed
to this day, proves that nothing indicates, in these stars, a first cause which has tended to
make them move in one sense rather than in another, or under one inclination rather than
under another, in the plane of the ecliptic. No 13.
Solution of the problem exposed at the beginning of the preceding section, in the case where
the number of balls which bear the same number is not equal to unity and varies according
to any one law. No 14.
Application of the artifice exposed in no 13 to the solution of this problem. Let there be i
variable quantities of which the sum is s and of which the laws of possibility are known
and able to be discontinuous; one proposes to find the sum of the products of each value
that any function of these variables is able to receive, multiplied by the probability corre-
sponding to this value. Application of this solution to the investigation on the probability
that the error of the result of any number of observations of which the laws of facility
of the errors are expressed by some rational and entire functions of these errors will be
comprehended within some given limits.
Application of the same solution to the investigation of a rule proper to make known the most
probable result of the opinions uttered by the diverse members of a tribunal; this rule
is not at all applicable to the choices of the electoral assemblies. Rule relative to these
choices, when we set aside the passions of the electors and of the strange considerations
in merit, which are able to determine them. These diverse causes render this rule subject
to some grave inconveniences which have caused to abandoning it.

Investigation on the law of probability of the errors of observations, mean among all those
which satisfy the conditions that the positive errors are the same as the negative errors,
and that their probability diminishes when they increase. No 15.

§3. If we develop the product (1 + p)(1 + p′)(1 + p′′) · · · , composed of n factors, this [191]
development will contain all the possible combinations of the n letters p, p′, p′′, . . . ,
p^(n−1), taken one by one, two by two, three by three, . . . to n,1 and each combination
will have for coefficient unity. Thus, the combination pp′p′′ resulting from the product
(1 + p)(1 + p′)(1 + p′′), multiplied by the term 1 of the development of the other factors,
its coefficient is evidently unity. Now, in order to have the total number of combinations
of n letters taken x by x, we will observe that each of these combinations becomes p^x,
when we suppose p′, p′′, . . . equal to p. Then the product of the n preceding factors is
changed into the binomial (1 + p)^n; now the coefficient of p^x in the development of
this binomial is
\[
\frac{n(n-1)(n-2)\cdots(n-x+1)}{1.2.3\ldots x};
\]
this quantity expresses therefore the number of combinations of n letters taken x by x.
We will have the total number of combinations of these letters, taken one by one, two
by two, . . . to n by n, by making p = 1, in the binomial (1 + p)n , and by subtracting
unity from it, that which gives 2n − 1 for this number.
Let us suppose that in each combination we have regard not only to the number of
letters, but further to their situation; we will determine the number of combinations, [192]
by observing that, in the combination of two letters pp′, we are able to put p′ in the
second place, and next in the first, that which gives the two combinations pp′, p′p.
By introducing next a new letter p′′ in each of these combinations, we are able to put
it in the first, in the second or in the third place, that which gives 2.3 combinations.
By continuing thus, we see that, in a combination of x letters, we are able to give
1.2.3 . . . x different situations, whence it follows that the total number of combinations
of n letters, taken x by x, being, by that which precedes,
\[
\frac{n(n-1)(n-2)\cdots(n-x+1)}{1.2.3\ldots x},
\]
the total number of combinations, when we have regard to the different situation of the
letters, will be this same function, by suppressing its denominator.
We are able easily, by means of these formulas, to determine the benefits of lotter-
ies. Let us suppose that the number of tickets2 of a lottery is n, and that there exits r of
them at each drawing; we wish to have the probability that a combination of s of these
tickets will exit in the first drawing.
The total number of combinations of tickets, taken r by r, is, by that which pre-
cedes,
\[
\frac{n(n-1)(n-2)\cdots(n-r+1)}{1.2.3\ldots r}.
\]
In order to have, among these combinations, the number of those in which the s tickets
are comprehended, we will observe that, if we subtract these tickets from the total of
the tickets, and if we combine r − s by r − s the remaining n − s, the number of these
combinations will be the sought number; because it is clear that by adding the s tickets
1 Translator’s note: By taken one by one, two by two, etc., we understand one at a time, two at a time, etc. Thus the binomial coefficient.
2 Translator’s note: The word is “numéro,” or number used in the sense of a label. I have therefore chosen to render it as ticket.
to each of these combinations, we will have the combinations r by r of the tickets, in
which are these s tickets. This number is therefore
\[
\frac{(n-s)(n-s-1)\cdots(n-r+1)}{1.2.3\ldots(r-s)};
\]

by dividing it by the total number of combinations r by r of the n tickets, we will have [193]
for the sought probability

\[
\frac{r(r-1)(r-2)\cdots(r-s+1)}{n(n-1)(n-2)\cdots(n-s+1)}.
\]

By dividing this quantity by 1.2.3 . . . s, we will have, by that which precedes, the prob-
ability that the s tickets will exit in a determined order among them. We will have the
probability that the first s tickets of the drawing will be those of the proposed combina-
tion, by observing that this probability reverts to that to bring forth this combination, by
supposing that there exits only s tickets at each drawing, that which reverts to making
r = s in the preceding function, which becomes thus
\[
\frac{1.2.3\ldots s}{n(n-1)(n-2)\cdots(n-s+1)}.
\]

Finally we will have the probability that the s chosen tickets will exit first in a deter-
mined order, by reducing the numerator of this fraction to unity.
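The counting rules above translate directly into a short computation. The sketch below, in Python, evaluates the probability that s chosen tickets are all among the r tickets of one drawing, the probability that they exit in a prescribed order, and the fair return discussed next; the function name and the sample numbers (the n = 90, r = 5 lottery treated later in this chapter) are mine, not Laplace's.

```python
from math import factorial

def prob_combination_drawn(n, r, s):
    """r(r-1)...(r-s+1) / [n(n-1)...(n-s+1)]: the s chosen tickets
    are all among the r tickets that exit at one drawing."""
    num = den = 1
    for j in range(s):
        num *= r - j
        den *= n - j
    return num / den

n, r, s = 90, 5, 2
p = prob_combination_drawn(n, r, s)
print(p)                  # about 0.0025: two chosen numbers among the five drawn
print(p / factorial(s))   # the same two tickets, in a prescribed order among themselves
print(1 / p)              # what the lottery must return per unit staked for a fair game (x = m/p)
```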
The quotients of the stakes divided by these probabilities are those which the lottery
must render to the players: the excess of these quotients over that which it gives is its
benefit. In fact, if we name p the probability of the player, m his stake and x that which
the lottery must render to him for equality of the game, x − m will be the stake of
the lottery; because, having received the stake m and rendering x to the player, it puts
into the game only x − m. Now, for equality of the game, the mathematical hope3 of
each player must be equal to his fear; his hope is the product of the stake x − m of his
adversary by the probability p to obtain it; his fear is the product of his stake by the
probability 1 − p of the loss. We have therefore

p(x − m) = (1 − p)m,

that is that, for the equality of the game, the stakes must be reciprocal to the probabili-
ties of winning. This equation gives
\[
x = \frac{m}{p};
\]
thus that which the lottery must render is the quotient of the stake divided by the prob- [194]
ability of the player to win.

§4. A lottery being composed of n numbered tickets of which r exit at each draw-
ing, we require the probability that after i drawings all the tickets will have exited.
3 Translator’s note: The word is espérance.

Let us name zn,q the number of cases in which, after i drawings, the totality of
the tickets 1, 2, 3, . . . q will have exited. It is clear that this number is equal to the
number zn,q−1 of cases in which the tickets 1, 2, 3, . . . , q − 1 have exited, less the
number of cases in which, these tickets being exited, the ticket q is not drawn; now
this last number is evidently the same as the one of the cases in which the tickets
1, 2, 3, . . . , q − 1 would be extracted, if we remove the ticket q from the n tickets of
the lottery, and this number is zn−1,q−1 ; we have therefore
\[
(i)\qquad z_{n,q} = z_{n,q-1} - z_{n-1,q-1}.
\]
Now the number of all possible cases in a single drawing being \(\frac{n(n-1)(n-2)\cdots(n-r+1)}{1.2.3\ldots r}\),
the one of all possible cases in i drawings is
\[
\left[\frac{n(n-1)(n-2)\cdots(n-r+1)}{1.2.3\ldots r}\right]^{i}.
\]
The number of all the cases in which the ticket 1 will not exit in these i drawings is
the number of all possible cases, when we subtract this ticket from the n tickets of the
lottery, and this number is
\[
\left[\frac{(n-1)(n-2)\cdots(n-r)}{1.2.3\ldots r}\right]^{i};
\]
the number of cases in which the ticket 1 will exit in i drawings is therefore
\[
\left[\frac{n(n-1)(n-2)\cdots(n-r+1)}{1.2.3\ldots r}\right]^{i}
- \left[\frac{(n-1)(n-2)\cdots(n-r)}{1.2.3\ldots r}\right]^{i},
\]
or
\[
\Delta\left[\frac{(n-1)(n-2)\cdots(n-r)}{1.2.3\ldots r}\right]^{i};
\]
this is the value of zn,1 . This premised, equation (i) will give, by making successively [195]
q = 2, q = 3, . . . ,
\[
z_{n,2} = \Delta^{2}\left[\frac{(n-2)(n-3)\cdots(n-r-1)}{1.2.3\ldots r}\right]^{i},
\]
\[
z_{n,3} = \Delta^{3}\left[\frac{(n-3)(n-4)\cdots(n-r-2)}{1.2.3\ldots r}\right]^{i},
\]
.........................................
and generally
\[
z_{n,q} = \Delta^{q}\left[\frac{(n-q)(n-q-1)\cdots(n-r-q+1)}{1.2.3\ldots r}\right]^{i}.
\]
Thus the probability that the tickets 1, 2, 3, . . . q will exit in i drawings being equal to
z_{n,q} divided by the number of all possible cases, it will be
\[
\frac{\Delta^{q}\,[(n-q)(n-q-1)\cdots(n-r-q+1)]^{i}}{[n(n-1)(n-2)\cdots(n-r+1)]^{i}}.
\]

If we make in this expression q = n, we will have, s being here the variable which
must be supposed null in the result,
\[
\frac{\Delta^{n}\,[s(s-1)\cdots(s-r+1)]^{i}}{[n(n-1)\cdots(n-r+1)]^{i}}
\]

for the expression of the probability that all the tickets of the lottery will exit in i
drawings.
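At s = 0 the finite difference above is the alternating sum over the number of tickets left aside, so the probability can be checked as an inclusion–exclusion sum. The Python sketch below does this for the lottery of France discussed a little further on (n = 90, r = 5); the function name is mine, and the computation is only a floating-point check of the statements in the text.

```python
from math import comb

def prob_all_tickets_exit(n, r, i):
    """P(all n tickets appear in i drawings of r tickets), written as the
    inclusion-exclusion sum equivalent to the finite difference above."""
    return sum((-1) ** j * comb(n, j) * (comb(n - j, r) / comb(n, r)) ** i
               for j in range(n - r + 1))

for i in (85, 86):
    print(i, prob_all_tickets_exit(90, 5, i))
# i = 85 gives a value a little below 1/2 and i = 86 a little above it,
# matching the odds announced later in this section.
```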
If n and i are very great numbers, we will have, by the formulas of §40 of Book I,
the value of this probability by means of a highly convergent series. Let us suppose, for
example, that only one ticket exits at each drawing; the preceding probability becomes
\[
\frac{\Delta^{n} s^{i}}{n^{i}}.
\]
Let us propose to determine the number i of drawings in which this probability is \(\frac{1}{k}\), n
and i being very great numbers. By following the analysis of the section cited, we will [196]
determine first a by the equation
\[
0 = \frac{i+1}{a} - s - \frac{n\,c^{a}}{c^{a}-1},
\]
that which gives
\[
a = \frac{i+1}{n+s}\,\frac{1-c^{-a}}{1-\dfrac{s\,c^{-a}}{n+s}}.
\]

We have next, by §40 of Book I, when \(c^{-a}\) is a very small quantity of the order \(\frac{1}{i}\), as
that takes place in the present question, we have, I say, to the quantities nearly of order
\(\frac{1}{i^{2}}\), s being supposed null in the result of the calculation,
\[
\frac{\Delta^{n} s^{i}}{n^{i}} = \left(\frac{i}{i+1}\right)^{i+\frac{1}{2}}
\frac{c^{\,na-i}\,(1-c^{-a})^{n-i}}{\sqrt{1-\dfrac{i+1}{n}\,c^{-a}}}.
\]

Now we have, to the quantities nearly of the order \(\frac{1}{i^{2}}\),
\[
\left(\frac{i}{i+1}\right)^{i+\frac{1}{2}} = c^{-1};
\]

by supposing next \(c^{-a} = z\), we have
\[
(1-c^{-a})^{n-i} = c^{(i-n)z}\left(1 + \frac{i-n}{2}\,z^{2}\right);
\]

moreover, the equation which determines a gives
\[
i + 1 - na = (i+1)z,
\]
whence we deduce
\[
c^{\,na-i-1} = c^{-iz}(1-z);
\]
we will have therefore, to the quantities nearly of order \(\frac{1}{i^{2}}\),
\[
\frac{\Delta^{n} s^{i}}{n^{i}} = c^{-nz}\left(1 + \frac{i-2n+1}{2n}\,z + \frac{i-n}{2}\,z^{2}\right).
\]

In order to determine z, let us take up again the equation [197]
\[
a = \frac{i+1}{n} - \frac{i+1}{n}\,c^{-a};
\]
we will have, by formula (p) of §21 of Book II of the Mécanique céleste,
\[
z = c^{-a} = q + \frac{i+1}{n}\,q^{2} + \frac{3}{1.2}\left(\frac{i+1}{n}\right)^{2} q^{3}
+ \frac{4^{2}}{1.2.3}\left(\frac{i+1}{n}\right)^{3} q^{4} + \cdots,
\]
q being supposed equal to \(c^{-\frac{i+1}{n}}\). This value of z gives

\[
c^{-nz} = c^{-nq}\,[1 - (i+1)q^{2}];
\]

consequently,
\[
\frac{\Delta^{n} s^{i}}{n^{i}} = c^{-nq}\left(1 + \frac{i+1-2n}{2n}\,q - \frac{n+i+2}{2}\,q^{2}\right).
\]

By equating this quantity to the fraction \(\frac{1}{k}\), we will have
\[
q = \frac{\log k}{n}\left(1 + \frac{i+1-2n}{2n^{2}} - \frac{n+i+2}{2n^{2}}\,\log k\right);
\]

now we have
\[
i + 1 = -n\log q;
\]
we will have therefore very nearly, for the expression of the number i of drawings,
according to which the probability that all the tickets have exited is \(\frac{1}{k}\),
\[
i = (\log n - \log\log k)\left(n - \tfrac{1}{2} + \tfrac{1}{2}\log k\right) + \tfrac{1}{2}\log k;
\]
we must observe that all these logarithms are hyperbolic.
Let us suppose the lottery composed of 10000 tickets, or n = 10000, and k = 2,
this formula gives
i = 95767.4
for the expression of the number of drawings, in which we can wager one against one,
that the ten thousand tickets of the lottery will exit; it is therefore odds a little less than
one against one that they will exit in 95767 drawings, and odds a little more than one [198]
against one that they will exit in 95768 drawings.
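These statements are easy to test directly. The sketch below, again mine and only a numerical check, evaluates the exact inclusion–exclusion probability for n = 10000, r = 1 (combining the terms in logarithmic form so the binomial coefficients never overflow a float), together with the closed expression for i as reconstructed just above.

```python
from math import comb, exp, log

def prob_all_exit(n, r, i, tol=1e-15):
    """Inclusion-exclusion value of P(all n tickets exit in i drawings of r),
    each term formed as exp(log ...) to keep the numbers small."""
    total, j = 0.0, 0
    while j <= n - r:
        term = exp(log(comb(n, j)) + i * (log(comb(n - j, r)) - log(comb(n, r))))
        total += term if j % 2 == 0 else -term
        if j > 0 and term < tol:
            break
        j += 1
    return total

n, k = 10000, 2
for i in (95767, 95768):
    print(i, prob_all_exit(n, 1, i))      # just below 1/2, then just above 1/2
i_formula = (log(n) - log(log(k))) * (n - 0.5 + 0.5 * log(k)) + 0.5 * log(k)
print(i_formula)                          # about 95767.4, the value of the text
```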

We will determine by a similar analysis the number of drawings in which we are
able to wager one against one that all the tickets of the lottery of France will exit. This
lottery is, as we know, composed of 90 tickets of which five exit at each drawing. The
probability that all the tickets will exit in i drawings is then, by that which precedes,
\[
\frac{\Delta^{n}\,[s'(s'-1)(s'-2)(s'-3)(s'-4)]^{i}}{[n(n-1)(n-2)(n-3)(n-4)]^{i}},
\]
n being here equal to 90, and s′ must be supposed null in the result of the calculation.
If we make s = s′ − 2, this function becomes
\[
\frac{\Delta^{n}\,[s(s^{2}-1)(s^{2}-4)]^{i}}{\{(n-2)[(n-2)^{2}-1][(n-2)^{2}-4]\}^{i}},
\]
or, by developing in series,
\[
\frac{\Delta^{n} s^{5i} - 5i\,\Delta^{n} s^{5i-2} + \cdots}{(n-2)^{5i}}
\left[1 + \frac{5i}{(n-2)^{2}} + \cdots\right],
\]
s having to be supposed equal to −2 in the result of the calculation.
We have, by §40 of Book I, by neglecting the terms of order i12 and supposing c−a
very small of order 1i ,
5i+1 5i  5i 5i (n−2)a−5i
n 5i
 s a 5i+1 c (1 − c−a )n
=  ,
(n − 2)5i (n − 2)5i 1 + 1 − na c
2 −a
5i 5i(1−c−a )2

a being given by the equation


(5i + 1)(1 − c−a )
a=  .
−a
(n − 2) 1 + 2cn−2

We have thus, by neglecting the terms of order i12 ,


 5i
1 + 2c
−a

5i

n s5i n−2 −a n 1−(5i+1)c−a − 10ic


−a 5i 1 na 2 −a
c
= (1−c ) c n−2 1− + ;
(n − 2)5i (1 − c−a )5i 5i + 1 10i 10i
now we have
5i [199]
2c−a 10ic−a
1+ =c n−2 ,
n−2

−a −5i 5ic−a 5i −2a


(1 − c ) =c 1+ c ,
2

5i

5i 1
= c−1 1 + ;
5i + 1 10i
we will have therefore, to the quantities nearly of order i12 ,

n s5i −a n −a 5i −2a na2 c−a


= (1 − c ) 1 − c + c + .
(n − 2)5i 2 10i

By substituting for a its value and observing that i is very little different from n − 2 in
the present case, as we will see hereafter, we have, very nearly,

na2 c−a 5i + 12 −a
= c .
10i 2(n − 2)
−a
12c
I keep, for greater exactitude, the term 2(n−2) , although of order i12 , because of the size
of its factor 12; we will have therefore
 
n s5i −a n 5i − 2n + 16 −a 5i −2a
= (1 − c ) 1 + c + c .
(n − 2)5i 2(n − 2) 2
n 5i−2
 s
If we change in this equation 5i into 5i − 2, we will have that of (n−2) 5i−2 ; but the

value of a will no longer be the same. Let a be this new value, we will have

(5i − 1)(1 − c−a )
a = 
−a
,
(n − 2) 1 + 2cn−2

that which gives, very nearly,


2
a = a − .
n−2
Now we have
 2c−a
1 − c−a = 1 − c−a − ,
n−2
whence we deduce, by neglecting the quantities of order 1i , [200]

(1 − c−a )n = (1 − c−a )n ;

consequently we have, by neglecting the quantities of order 1i ,

n s5i−2
= (1 − c−a )n .
(n − 2)5i−2
1
We will have therefore, to the quantities nearly of order i2 ,

n [s(s2 − 1)(s2 − 4)]i


[n(n − 1)(n − 2)(n − 3)(n − 4)]i
 
−a n 5i − 2n + 16 −a 5i −2a
= (1 − c ) 1 + c + c .
2(n − 2) 2

This quantity must, by the condition of the problem, be equal to 12 , that which gives
 
n 1 5i − 2n + 16 −a 5i −2a
1 − c−a = 1− c − c ,
2 2n(n − 2) 2n

whence we deduce
  
n 1 5i − 2n + 16 5i −a
c−a = 1− 1+ + c ;
2 2n(n − 2) 2n

consequently we have, by hyperbolic logarithms,


 √ 
n
2 5i − 2n + 16 5i −a
a = log √ − − c ;
n
2−1 2n(n − 2) 2n
1
now we have, to the quantities nearly of order i2 ,

5i + 1
a= √ ;
(n − 2) n 2
we will have therefore
   √ 
n−2√ 1 16 1 √ n
2
2 1− − − ( 2 − 1) log √
n n
i= .
5 2n 10in 2 n
2−1
By substituting for n its value 90, we find
i = 85.53,
so that there is odds a little less than one to one that all the tickets will exit in 85 [201]
drawings, and odds a little more than one to one that they will exit in 86 drawings.
A quite simple and very close way to obtain the value of i is to suppose \(\frac{\Delta^{n} s^{i}}{n^{i}}\), or
the series
\[
1 - n\left(\frac{n-1}{n}\right)^{i} + \frac{n(n-1)}{2}\left(\frac{n-2}{n}\right)^{i} - \cdots,
\]
equal to the development
\[
1 - n\left(\frac{n-1}{n}\right)^{i} + \frac{n(n-1)}{1.2}\left(\frac{n-1}{n}\right)^{2i} - \cdots
\]
of the binomial \(\left[1 - \left(\frac{n-1}{n}\right)^{i}\right]^{n}\). In fact the two series have the first two terms equal respectively.
Their third terms are also, very nearly, equal among themselves; for we have
quite nearly \(\left(\frac{n-2}{n}\right)^{i}\) equal to \(\left(\frac{n-1}{n}\right)^{2i}\). In fact, their hyperbolic logarithms are, by
neglecting the terms of order \(\frac{i}{n^{2}}\), both equal to \(-\frac{2i}{n}\). We will see in the same way that the
fourth terms, the fifth, . . . are very little different, when n and i are very great numbers;
but the difference increases without ceasing in measure as the terms move away from
the first, that which must in the end produce in them an evident <difference> between
the series themselves. In order to estimate it, let us determine the value of i concluded
from the equality of the two series. By equating to \(\frac{1}{k}\) the binomial \(\left[1 - \left(\frac{n-1}{n}\right)^{i}\right]^{n}\), we
will have
\[
i = \frac{\log\left(1 - \sqrt[n]{\dfrac{1}{k}}\right)}{\log\dfrac{n-1}{n}},
\]


these logarithms may be, at will, hyperbolic or tabulated. Let \(\sqrt[n]{\frac{1}{k}} = 1 - z\). We will
have, by taking the hyperbolic logarithms of each member of this equation [202]
\[
\frac{1}{n}\log k = -\log(1-z) = z + \frac{z^{2}}{2} + \cdots,
\]
that which gives, very nearly,
\[
z = \frac{\log k}{n}\left(1 - \frac{\log k}{2n}\right);
\]
we will have therefore, in hyperbolic logarithms,
\[
\log\left(1 - \sqrt[n]{\frac{1}{k}}\right) = \log z = \log\log k - \log n - \frac{\log k}{2n}.
\]
We have next
\[
\log\frac{n-1}{n} = -\frac{1}{n} - \frac{1}{2n^{2}} - \cdots
\]
The preceding expression for i becomes in this way, very nearly,
\[
i = n(\log n - \log\log k)\left(1 - \frac{1}{2n}\right) + \tfrac{1}{2}\log k;
\]
the excess of the value found previously for i over this one is
\[
\frac{\log k}{2}\,(\log n - \log\log k);
\]
this excess becomes infinite, when n is infinite; but a very great number is necessary
in order to render it very evident; and in the case of n = 10000 and of k = 2, it is still
only three units.
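Both approximations, and the stated excess, can be evaluated in a few lines; the sketch below assumes the two formulas as reconstructed above and prints their values for n = 10000 and k = 2.

```python
from math import log

n, k = 10000, 2
d = log(n) - log(log(k))
i_exact  = d * (n - 0.5 + 0.5 * log(k)) + 0.5 * log(k)   # earlier, more exact expression
i_simple = n * d * (1 - 1 / (2 * n)) + 0.5 * log(k)      # binomial-based expression just derived
print(i_exact, i_simple, i_exact - i_simple)             # difference close to 3, the "three units"
```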
If we consider likewise the development
\[
1 - n\left(\frac{n-5}{n}\right)^{i} + \cdots
\]
of the expression \(\frac{\Delta^{n}[s'(s'-1)(s'-2)(s'-3)(s'-4)]^{i}}{[n(n-1)(n-2)(n-3)(n-4)]^{i}}\), as the one \(\left[1 - \left(\frac{n-5}{n}\right)^{i}\right]^{n}\) of the binomial,
we will have, in order to determine the number i of trials in which we can
wager one against one that all the tickets will exit, the equation
\[
\left[1 - \left(\frac{n-5}{n}\right)^{i}\right]^{n} = \frac{1}{2};
\]
n 2

that which gives  √  [203]


n
√ 2
log n
2−1
i=   .
n
log n−5

These logarithms can be tabulated. By making n = 90, we find
\[
i = 85.204,
\]
that which differs very little from the value i = 85.53 that we have found above.

§5. An urn being supposed to contain the number x of balls, we draw from it a part
or the totality, and we demand the probability that the number of extracted balls will
be even.
The sum of the cases in which this number is unity equals evidently x, since each
of the balls can equally be extracted. The sum of the cases in which this number equals
2 is the sum of the combinations of x balls taken two by two, and this sum is, by
§3, equal to \(\frac{x(x-1)}{1.2}\). The sum of the cases in which the same number equals 3 is the
sum of the combinations of balls taken three by three, and this sum is \(\frac{x(x-1)(x-2)}{1.2.3}\),
and thus consecutively. Thus the successive terms of the development of the function
(1 + 1)^x − 1 will represent all the cases in which the number of extracted balls is
successively 1, 2, 3, . . . to x; whence it is easy to conclude that the sum of all the cases
relative to the odd numbers is \(\frac{1}{2}(1+1)^{x} - \frac{1}{2}(1-1)^{x}\), or \(2^{x-1}\), and that the sum of
all the cases relative to the even numbers is \(\frac{1}{2}(1+1)^{x} + \frac{1}{2}(1-1)^{x} - 1\), or \(2^{x-1} - 1\).
The union of these two sums is the number of all the possible cases; this number is
therefore \(2^{x} - 1\); thus the probability that the number of extracted balls will be even
is \(\frac{2^{x-1}-1}{2^{x}-1}\), and the probability that this number will be odd is \(\frac{2^{x-1}}{2^{x}-1}\); there is therefore
advantage to wager with equality on an odd number.
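A direct enumeration of the equally possible cases confirms these two fractions; the short Python check below (names mine) counts the non-empty subsets of an urn of x balls by size.

```python
from math import comb

def parity_probabilities(x):
    """Chance that a non-empty, equally likely subset of x balls has even size,
    compared with the fraction (2^(x-1) - 1)/(2^x - 1) of the text."""
    even = sum(comb(x, k) for k in range(2, x + 1, 2))
    total = 2**x - 1
    return even / total, (2**(x - 1) - 1) / (2**x - 1)

print(parity_probabilities(10))   # the two entries coincide; the odd sizes keep a slight advantage
```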


If the number x is unknown, and if we know only that it can not exceed n, and that [204]
this number and all the lesser are equally possible, we will have the number of all the
possible cases relative to the odd numbers by making the sum of all the values of \(2^{x-1}\),
from x = 1 to x = n, and it is easy to see that this sum is \(2^{n} - 1\). We will likewise
have the sum of all the possible cases relative to the even numbers, by summing the
function \(2^{x-1} - 1\), from x = 1 to x = n, and we find this sum equal to \(2^{n} - n - 1\); the
probability of an even number is therefore then \(\frac{2^{n}-n-1}{2^{n+1}-n-2}\), and that of an odd number
is \(\frac{2^{n}-1}{2^{n+1}-n-2}\).
Let us suppose now that the urn contains the number x of white balls, and the same
number of black balls; we demand the probability that by drawing any even number of
balls, we will bring forth as many white balls as black balls, all the even numbers being
able to be brought forth equally.
The number of cases in which one white ball from the urn can be combined with
a black ball is evidently x.x. The number of cases in which two white balls can be
combined with two black balls is \(\frac{x(x-1)}{1.2}\cdot\frac{x(x-1)}{1.2}\), and thus consecutively. The number
of cases in which we will bring forth as many white balls as black balls is therefore the
sum of the squares of the terms of the development of the binomial (1 + 1)x , less unity.
In order to have this sum, we will observe that it is equal to a term independent of a,
in the development of \(\left(1+\frac{1}{a}\right)^{x}(1+a)^{x}\). This function is equal to \(\frac{(1+a)^{2x}}{a^{x}}\). The term
independent of a, in its development, is thus the coefficient of the middle term of the
binomial \((1+a)^{2x}\); this coefficient is \(\frac{1.2.3\ldots 2x}{(1.2.3\ldots x)^{2}}\); the number of cases in which we can
draw from the urn as many white balls as black balls is therefore
\[
\frac{1.2.3\ldots 2x}{(1.2.3\ldots x)^{2}} - 1.
\]
The number of all possible cases is the sum of the odd terms in the development of the [205]
binomial \((1+1)^{2x}\), less the first, or unity. This sum is \(\frac{1}{2}(1+1)^{2x} + \frac{1}{2}(1-1)^{2x}\); the
number of possible cases is therefore \(2^{2x-1} - 1\), which gives for the expression of the
probability sought
\[
\frac{\dfrac{1.2.3\ldots 2x}{(1.2.3\ldots x)^{2}} - 1}{2^{2x-1} - 1}.
\]
In the case where x is a large number, this probability is reduced by §33 of Book I to
\(\frac{2}{\sqrt{\pi x}}\), π being the semi-circumference of which 1 is the radius.
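The exact fraction and its limiting form are easy to compare numerically; the sketch below (mine) prints both for a few values of x.

```python
from math import comb, pi, sqrt

def prob_equal_colors(x):
    """[C(2x, x) - 1] / (2^(2x-1) - 1): as many white as black balls among the
    equally likely even-sized draws from x white and x black balls."""
    return (comb(2 * x, x) - 1) / (2**(2 * x - 1) - 1)

for x in (10, 100, 1000):
    print(x, prob_equal_colors(x), 2 / sqrt(pi * x))   # the two columns draw together as x grows
```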

§6. We consider a number x + x′ of urns, of which the first contains p white balls
and q black balls, the second p′ white balls and q′ black balls, the third p′′ white balls
and q′′ black balls, and thus consecutively. Let us suppose that we draw successively
one ball from each urn. It is clear that the number of all the possible cases in the first
drawing is p + q; in the second drawing, each of the cases of the first being able to be
combined with the p + q  balls of the second urn, we will have (p + q)(p + q  ) for the
number of all the possible cases relative to the first two drawings. In the third drawing,
each of these cases can be combined with the p + q  balls of the third urn; that which
gives (p + q)(p + q  )(p + q  ) for the number of all the possible cases relative to the
three drawings, and thus of the rest. This product for the totality of the urns will be
composed of x + x factors, and the sum of all the terms of its development in which
the letter p, with or without accent, is repeated x times, and consequently the letter q,
x times, will express the number of cases in which we can draw from the urns x white
balls and x black balls.
If p , p , . . . are equal to p, and if q  , q  , . . . are equal to q, the preceding product
 
becomes (p+q)x+x . The term multiplied by px q x in the development of this binomial
is
(x + x )(x + x + 1) · · · (x + 1) x x 1.2.3 . . . (x + x ) 
p q or px q x .
1.2.3 . . . x 1.2.3 . . . x.1.2.3 . . . x 

Thus this quantity expresses the number of cases in which we can bring forth x white [206]

balls and x black balls. The number of all the possible cases being (p + q)x+x , the
probability to bring forth x white balls and x black balls is

x
x 
1.2.3 . . . (x + x ) p q
,
1.2.3 . . . x.1.2.3 . . . x p+q p+q
p
where we must observe that p+q is the probability of drawing a white ball from one of
q
the urns, and that p+q is the probability of drawing from it a black ball.
It is clear that it is perfectly equal to draw x white balls and x black balls from
x + x urns which each contain p white balls and q black balls, or one alone of these
urns, provided that we replace into the urn the ball extracted at each drawing.

Let us consider now a number x + x + x of urns of which the first contains p
white balls, q black balls and r red balls, of which the second contains p white balls,
q  black balls and r red balls, and thus consecutively. Let us suppose that we draw one
ball from each of these urns. The number of all the possible cases will be the product
of the x + x + x factors,

(p + q + r)(p + q  + r )(p + q  + r ) · · ·

The number of cases in which we will bring forth x white balls, x black balls and x
red balls will be the sum of all the terms of the development of this product, in which
the letter p will be repeated x times, the letter q, x times and the letter r, x times.
If all the accented letters p , q  , . . . are equal to their non-accented correspondents, the
 
preceding product is changed into the trinomial (p + q + r)x+x +x . The term of its
x x x
development, which has for factor p q r , is
1.2.3 . . . (x + x + x )  
px q x r x ;
1.2.3 . . . x.1.2.3 . . . x .1.2.3 . . . x
 
thus, the number of all the possible cases being (p + q + r)x+x +x , the probability to [207]
bring forth x white balls, x black balls and x red balls will be

x
x 
x
1.2.3 . . . (x + x + x ) p q r
,
1.2.3 . . . x.1.2.3 . . . x .1.2.3 . . . x p+q+r p+q+r p+q+r
p q r
whence we must observe that p+q+r , p+q+r , p+q+r are the respective probabilities of
drawing from each urn a white ball, a black ball and a red ball.
We see generally that, if the urns contain each the same number of colors, p being
the number of the balls of the first color, q the one of the balls of the second color,
r, s, . . . those of the balls of the third, the fourth, . . ., x + x + x + x + · · · being
the number of urns, the probability to bring forth x balls of the first color, x balls of
the second, x of the third, x of the fourth, . . . will be

x
1.2.3 . . . (x + x + x + x + · · · ) p
1.2.3 . . . x.1.2.3 . . . x .1.2.3 . . . x .1.2.3 . . . x . . . p + q + r + s + · · ·

x 
x
x
q r s
× ···
p + q + r + s + ··· p + q + r + s + ··· p + q + r + s + ···

§7. Let us determine now the probability to draw from the preceding urns x white
balls, before bringing forth x′ black balls, or x′′ red balls, . . . . It is clear that, n express-
ing the number of the colors, this must happen at the latest after x + x′ + x′′ + · · · − n + 1
drawings; because, when the number of white balls extracted is equal or less than x,
the one of the extracted black balls less than x′, the one of the extracted red balls less
than x′′, . . ., the total number of the extracted balls, and consequently the number of
drawings, is equal or less than x + x′ + x′′ + · · · − n + 1; we can therefore consider
here only x + x′ + x′′ + · · · − n + 1 urns.
In order to have the number of cases in which we can bring forth x white balls at
the (x + i)st drawing, it is necessary to determine all the cases in which x − 1 white

balls will have exited at the drawing x + i − 1. This number is the term multiplied by [208]
px−1 in the development of the polynomial (p + q + r + · · · )x+i−1 , and this term is
1.2.3 . . . (x + i − 1)
px−1 (q + r + · · · )i .
1.2.3 . . . (x − 1)1.2.3 . . . i
By combining it with the p white balls of the urn x + i, we will have a product which
it will be necessary further to multiply by the number of all the possible cases relative
to the x + x + · · · − n − i + 1 following drawings, and this number is
 
(p + q + r + · · · )x +x +···−n−i+1
;

we will have therefore


1.2.3 . . . (x + i − 1)  
(a) px (q + r + · · · )i (p + q + r + · · · )x +x +···−n+1 ,
1.2.3 . . . (x − 1)1.2.3 . . . i
for the number of cases in which the event can happen precisely at the drawing x + i.
It is necessary however to exclude from it the case in which q is raised to the power x ,
those in which r is raised to the power x , etc.; because in all these cases it has already
happened in the drawing x + i − 1, either x black balls, or x red balls, or etc. Thus in
the development of the polynomial (q + r + · · · )i , it is necessary to have regard only
 
to the terms multiplied by q f rf sf . . . in which f is less than x , f  is less than x ,
 
f  is less than x , . . . The term multiplied by q f rf sf . . .in this development is
1.2.3 . . . i  
q f r f sf . . .
1.2.3 . . . f.1.2.3 . . . f  .1.2.3 . . . f 
All the terms that we must consider in the function (a) are therefore represented by


⎨ 1.2.3 . . . (x + f + f  + · · · − 1) 

...
px q f r f · · ·
(b) 1.2.3 . . . (x − 1).1.2.3 . . . f.1.2.3 . . . f

⎩   
× (p + q + r + · · · )x +x +···−f −f −···−n+1 ,

because i is equal to f + f  + · · · . Thus, by giving, in this last function, to f all


the whole values from f = 0 to f = x − 1, to f  all the values from f  = 0 to
f  = x − 1, and thus consecutively, the sum of all these terms will express the number [209]
of cases in which the proposed event can happen in x + x + · · · − n + 1 drawings.
It is necessary to divide this sum by the number of all the possible cases, that is by
 
(p + q + r + · · · )x+x +x +···−n+1 . If we designate by p the probability of drawing
a white ball from any one of the urns, by q  that of drawing from it a black ball, by r
that of drawing a red ball, . . ., we will have
p q r
p = , q = , r = , ··· ;
p + q + r + ··· p + q + r + ··· p + q + r + ···
 
the function (b), divided by (p + q + r + · · · )x+x +x +···−n+1
, will become thus
1.2.3 . . . (x + f + f  + · · · − 1) 
px q f rf · · · .
1.2.3 . . . (x − 1).1.2.3 . . . f.1.2.3 . . . f  . . .

The sum of the terms which we will obtain by giving to f all the values from f = 0
to f = x − 1, to f  all the values from f  = 0 to f  = x − 1, . . . will be the sought
probability to bring forth x white balls before x black balls, or x red balls, or, etc.
We can, according to this analysis, determine the lot of a number n of players A, B,
C,. . ., of whom p , q  , r , . . . represent the respective skills, that is their probabilities
to win a trial4 when, in order to win the game, there is lacking x trials to player A, x
trials to player B, x trials to player C, and thus consecutively; because it is clear that,
relatively to player A, this reverts to determine the probability to bring forth x white
balls before x black balls, or x red balls, . . ., by drawing successively a ball from a
number x+x +x +· · ·−n+1 of urns which contain each p white balls, q black balls,
r red balls, · · · , p, q, r, . . . being respectively equal to the numerators of the fractions
p , q  , r , . . . reduced to the same denominator.
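The rule just described (sum, over every admissible set of trials conceded to the other players, a multinomial coefficient times p′^x q′^f r′^{f′} · · ·) can be written out directly. The Python sketch below is my own rendering of that rule for any number of players; the function name and the example values are illustrative only.

```python
from math import factorial
from itertools import product

def prob_first_to_finish(lacking, skills):
    """Probability that player 0, lacking `lacking[0]` trials, completes them
    before any other player completes theirs; `skills` are per-trial win
    probabilities summing to 1 (the sum over f, f', ... described above)."""
    x, others = lacking[0], lacking[1:]
    p, qs = skills[0], skills[1:]
    total = 0.0
    for fs in product(*(range(xj) for xj in others)):
        coef = factorial(x - 1 + sum(fs)) // factorial(x - 1)
        for f in fs:
            coef //= factorial(f)
        term = coef * p**x
        for q, f in zip(qs, fs):
            term *= q**f
        total += term
    return total

# Two equally skilled players, A lacking 2 trials and B lacking 5:
print(prob_first_to_finish((2, 5), (0.5, 0.5)))   # 0.890625 = 57/64
# Three players with unequal skills:
print(prob_first_to_finish((2, 3, 3), (0.5, 0.3, 0.2)))
```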

§8. The preceding problem can be resolved in a quite simple manner by the analysis
of the generating functions. Let us name yx,x ,x ,... the probability of player A to win [210]
the game. At the following trial, this probability is changed into yx−1,x ,x ,... , if A
wins this trial, and the probability for this is p . The same probability is changed into
yx,x −1,x ,... , if the trial is won by player B, and the probability for this is q  ; it is
changed into yx,x ,x −1,... if the trial is won by player C, and the probability for this is
r , and thus consecutively; we have therefore the equation in the partial differences

\[
y_{x,x',x'',\ldots} = p'\,y_{x-1,x',x'',\ldots} + q'\,y_{x,x'-1,x'',\ldots} + r'\,y_{x,x',x''-1,\ldots} + \cdots
\]
 
Let u be a function of t, t , t , . . . , such that yx,x ,x ,... is the coefficient of tx tx tx . . .
in its development; the preceding equation in the partial differences will give, by pass-
ing from the coefficients to the generating functions,

\[
u = u\,(p't + q't' + r't'' + \cdots),
\]

whence we deduce
\[
1 = p't + q't' + r't'' + \cdots;
\]
consequently,
1 p
= ,
t 1 − q  t − r t − · · ·
that which gives
⎧ ⎫

⎪ 1 + x(q  t + r t + · · · ) ⎪


⎪ ⎪


⎪ x(x + 1) ⎪


⎨ + (q 
t 
+ r 
t 
+ · · · ) 2 ⎪

u upx x 1.2
= = up .
tx (1 − q  t − r t − · · · )x ⎪ + x(x + 1)(x + 2) (q  t + r t + · · · )3 ⎪
⎪ ⎪

⎪ ⎪


⎪ 1.2.3 ⎪


⎩ ⎪

+ ................................
 
Now the coefficient of t0 tx tx . . . in tux is yx,x ,x ,... , and the same coefficient in
 
any term of the last member of the preceding equation, such as kupx tl tl . . ., is
4 Translator’s note: Throughout the word coup is translated as trial.

kpx y0,x −l ,x −l ,... ; the quantity y0,x −l ,x −l ,... is equal to unity, since then player
A lacks no trial. Moreover, it is necessary to reject all the values of y0,x −l ,x −l ,...
in which l is equal or greater than x , l is equal or greater than x , and thus con-
secutively, because these terms are not able to be given by the equation in the partial
differences, the game being finite, when any one of the players B, C, . . . have no more [211]
trials to play; it is necessary therefore to consider in the last member of the preceding
equation only the powers of t less than x , only the powers of t less than x , . . ..
The preceding expression of tux will give thus, by passing again from the generating
functions to the coefficients,
⎧ ⎫

⎪ 1 + x(q  + r + · · · ) ⎪


⎪ ⎪


⎪ x(x + 1) ⎪


⎨ +  
(q + r + · · · ) 2 ⎪

yx,x ,x ,... = p
 
x 1.2 ,

⎪ x(x + 1)(x + 2)   3⎪

⎪ +
⎪ (q + r + · · · ) ⎪


⎪ 1.2.3 ⎪


⎩ ⎪

+ ............................

provided that we reject the terms in which the power of q  surpasses x − 1, those in
which the power of r surpasses x − 1, etc. The second member of this equation is
developed in one sequence of terms comprehended in the general formula

1.2.3 . . . (x + f + f  + · · · − 1) 


px q f rf · · ·
1.2.3 . . . (x − 1).1.2.3 . . . f.1.2.3 . . . f . . .

The sum of these terms relative to all the values of f from f null to f = x − 1, to all
the values of f  from f  null to f  = x − 1, . . ., will be the probability yx,x ,x ,... , that
which is conformed to that which precedes.
In the case of two players A and B, we will have, for the probability of player A,
 
x  x(x + 1) 2 x(x + 1)(x + 2) · · · (x + x − 2) x −1
p 1 + xq + q + ··· + q .
1.2 1.2.3 . . . (x − 1)

By changing p into q  and x into x , and reciprocally, we will have


 
x   x(x + 1) 2 x (x + 1)(x + 2) · · · (x + x − 2) x−1
q 1+xp + p + ··· + p
1.2 1.2.3 . . . (x − 1)

for the probability that player B will win the game. The sum of these two expressions [212]
must be equal to unity, that which we see evidently by giving them the following forms.
The first expression can, by §37 of Book I, is transformed into this one
⎧ ⎫

⎪ x + x − 1 q  (x + x − 1)(x + x − 2) q 2 ⎪
⎪ 1+
⎨ 
+ 2
+ · · ·⎪


 1 p 1.2 p
px+x −1  ,

⎪ (x + x − 1) · · · (x + 1) q x −1 ⎪


⎩ + ⎪

1.2.3 · · · (x − 1) px −1

and the second can be transformed into this one
⎧ ⎫
⎪ x + x  − 1 p (x + x − 1)(x + x − 2) p2

⎨ 1+ + + · · ·⎪


x+x −1 1 q 1.2 q 2
q .

⎪ (x + x − 1) · · · (x + 1) px−1 ⎪

⎩ + ⎭
1.2.3 · · · (x − 1) q x−1

The sum of these expressions is the development of the binomial (p + q  )x+x −1 ,
and consequently it is equal to unity, because, A or B needing to win each trial, the
sum p + q  of their probabilities for this is unity.
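The complementarity just asserted is easy to verify numerically. The sketch below (names and sample numbers mine) evaluates the first of the two series for each player, with q′ = 1 − p′, and checks that the two lots add to unity.

```python
from math import comb

def lot_of_A(p, x, xp):
    """Player A, lacking x trials with per-trial probability p of winning one,
    beats a player lacking xp trials: sum of C(x-1+f, f) p^x (1-p)^f for f < xp."""
    q = 1 - p
    return sum(comb(x - 1 + f, f) * p**x * q**f for f in range(xp))

p, x, xp = 0.6, 3, 4
a = lot_of_A(p, x, xp)
b = lot_of_A(1 - p, xp, x)
print(a, b, a + b)   # the last value is 1, as the binomial (p' + q')^(x+x'-1) shows
```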
The problem which we just resolved is the one which we name the problem of
points in the Analysis of chances. The chevalier de Méré proposed it to Pascal, with
some other problems on the game of dice. Two players of whom the skills are equal
have put into the game the same sum; they must play until one of them has beat a given
number of times his adversary; but they agree to quit the game, when there is lacking
yet x points to the first player in order to attain this given number, and when there
is lacking x′ points to the second player. We demand in what way they must share
the sum put into the game. Such is the problem that Pascal resolved by means of his
arithmetic triangle. He proposed it to Fermat who gave the solution to it by way of
combinations, that which occasioned between these two great geometers a discussion,
after which Pascal recognized the goodness of the method of Fermat, for any number
of players. Unhappily we have only one part of their correspondence, in which we see
the first elements of the theory of probabilities and their application to one of the most [213]
curious problems of this theory.
The problem proposed by Pascal to Fermat reverts to determining the respective
probabilities of the players winning the game; because it is clear that the stake must be
shared between the players proportionally to their probabilities. These probabilities are
the same as those of two players A and B, who must attain a given number of points,
x being the number of those which is lacking to player A, and x being the number of
those which is lacking to player B, by imagining an urn containing two balls of which
one is white and the other black, both bearing the no. 1, the white ball being for player
A, and the black ball for player B. We draw successively one of these balls, and we
return it into the urn after each drawing. By naming yx,x the probability that player A
will attain, the first, the given number of points, or, that which reverts to the same, that
he will have x points before B has x′, we will have
\[
y_{x,x'} = \tfrac{1}{2}\,y_{x-1,x'} + \tfrac{1}{2}\,y_{x,x'-1};
\]
because, if the ball that we extract is white, yx,x is changed into yx−1,x , and if the
extracted ball is black, yx,x is changed into yx,x −1 , and the probability of each of
these events is 12 ; we have therefore the preceding equation.
The generating function of yx,x in this equation in the partial differences is, by §20
of Book I,
M
,
1 − 12 t − 12 t

M being an arbitrary function of t . In order to determine it, we will observe that y0,0
can not take place, since the game ceases when one or the other of the variables x and x
is null; M must therefore have for factor t . Moreover y0,x is unity, whatever be x , the
probability of player A is changing then into certitude: now the generating function of
ti 
unity is generally 1−t  , because the coefficients of the powers of t in the development [214]
of this function are all equal to unity; in the present case, y0,x being able to hold when
x is either 1, or 2, or 3, etc., i must be equal to unity; the generating function of y0,x is
t 0
therefore equal to 1−t  ; this is the coefficient of t in the development of the generating

function of yx,x or in
M
;
1 − 12 t − 12 t
we have therefore
M t
= ,
1 − 12 t 1 − t
that which gives
t (1 − 12 t )
M= ,
(1 − t )
consequently the generating function of yx,x is

t (1 − 21 t )
.
(1 − t )(1 − 12 t − 12 t )
By developing it with respect to the powers of t, we have

t 1 t 1 t2 1 t3
1+ + 2 + 3 + ··· .
1 − t 2 1 − 12 t 2 (1 − 12 t )2 2 (1 − 12 t )3
The coefficient of tx in this series is
1 t
;
2 (1 − t )(1 − 12 t )x
x 


yx,x is therefore the coefficient of tx in this last quantity; now we have
t
(1 − t )(1 − 12 t )x
x(x+1)(x+2)···(x+x −2) x
t + 12 x t2 + 1 x(x+1) 3
22 2 t + ··· + 1
2x −1 1.2.3...(x −1) t + ···
= 
.
1 −t
By reducing into series the denominator of this last fraction and multiplying the nu-

merator by this series, we see that the coefficient of tx in this product is that which

this numerator becomes when we make t = 1; we have therefore [215]
⎧ ⎫

⎪ 1 x(x + 1) 1 x(x + 1)(x + 2) 1
· · · ⎪

⎨ 1 + x + + + ⎬
2 1.2 22 1.2.3 23
yx,x =  ,

⎪ x(x + 1) · · · (x + x − 2) 1 ⎪

⎩ + ⎭
1.2.3 . . . (x − 1) 2x −1

a result conformed to that which precedes.
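As a numerical cross-check of the closed expression just obtained, the recursion and the series can be compared in exact arithmetic; the short Python sketch below is mine and only restates the two computations.

```python
from functools import lru_cache
from fractions import Fraction

@lru_cache(maxsize=None)
def y(x, xp):
    """Recursion of the text: y(0, x') = 1, y(x, 0) = 0, and otherwise
    y = 1/2 y(x-1, x') + 1/2 y(x, x'-1)."""
    if x == 0:
        return Fraction(1)
    if xp == 0:
        return Fraction(0)
    return Fraction(1, 2) * (y(x - 1, xp) + y(x, xp - 1))

def y_series(x, xp):
    """1/2^x times the first x' terms 1 + x/2 + x(x+1)/(1.2) 1/2^2 + ..."""
    s, term = Fraction(0), Fraction(1)
    for k in range(xp):
        s += term / 2**k
        term = term * (x + k) / (k + 1)
    return s / 2**x

print(y(2, 5), y_series(2, 5))   # both give 57/64
```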
Let us imagine presently that there is in the urn a white ball bearing the no. 1, and
two black balls, of which one bears the no. 1, and the other bears the no. 2, the white
ball being favorable to A, and the black balls to his adversary, each ball diminishing by
its value the number of points which is lacking to the player to whom it is favorable.
yx,x being always the probability that player A will attain first the given number, we
will have the equation in the partial differences
\[
y_{x,x'} = \tfrac{1}{3}\,y_{x-1,x'} + \tfrac{1}{3}\,y_{x,x'-1} + \tfrac{1}{3}\,y_{x,x'-2};
\]
because, in the following drawing, if the white balls exits, yx,x becomes yx−1,x ; if
the black ball numbered 1 exits, yx,x becomes yx,x −1 , and if the black ball numbered
2 exits, yx,x becomes yx,x −2 , and the probability of each of these events is 13 .
The generating function of yx,x is

M
,
1 − 13 t − 13 t − 13 t2

M being an arbitrary function of t , which must, by that which precedes, have for factor
t , and in the present case is equal to

t 1 1
(1 − t − t2 ),
1 − t 3 3
so that the generating function of yx,x is

t (1 − 13 t − 13 t2 )
.
(1 − t )(1 − 13 t − 13 t − 13 t2 )

The coefficient of tx in the development of this function is

1 t 1
,
3 1 − t (1 − 13 t − 13 t2 )x
x


and there results from this that we just said that the coefficient of tx in the development [216]
of this last quantity is equal to
⎧ ⎫

⎪  xt2 (1 + t ) x(x + 1) t3 (1 + t )2 ⎪⎪
⎨ t + + ⎬
1 3 1.2 32 ;
3x ⎪⎪
4  3 ⎪
⎩ + x(x + 1)(x + 2) t (1 + t ) + · · · ⎪ ⎭
1.2.3 33

by rejecting from the development in this series all the powers of t superior to tx , and
supposing in this that we conserve t = 1, this will be the expression of yx,x .
It is easy to translate this process into a formula. Thus, by supposing x even and

equal to 2r + 2, we find

2
r 
1 2 x(x + 1) 2 x(x + 1) · · · (x + r − 1) 2
yx,x = x 1 + x + + ··· +
3 3 1.2 3 1.2.3 . . . r 3
 
x(x + 1) · · · (x + r) (r + 1)r (r + 1)r . . . 2
+ 1 + (r + 1) + + · · · +
1.2.3 . . . (r + 1) 3x+r+1 1.2 1.2.3 . . . r
 
x(x + 1) · · · (x + r + 1) (r + 2)(r + 1) . . . 4
+ 1 + (r + 2) + · · · +
1.2.3 . . . (r + 2) 3x+r+2 1.2.3 . . . (r − 1)
+ .................................................................
x(x + 1) · · · (x + 2r)
+ .
1.2.3 . . . (2r + 1) 3x+2r+1
If we suppose x odd and equal to 2r + 1, we will have

2
r 
1 2 x(x + 1) 2 x(x + 1) · · · (x + r − 1) 2
yx,x = x 1 + x + + ··· +
3 3 1.2 3 1.2.3 . . . r 3
 
x(x + 1) · · · (x + r) (r + 1)r (r + 1)r . . . 3
+ 1 + (r + 1) + + · · · +
1.2.3 . . . (r + 1) 3x+r+1 1.2 1.2.3 . . . (r − 1)
 
x(x + 1) · · · (x + r + 1) (r + 2)(r + 1) (r + 2)(r + 1) . . . 5
+ 1 + (r + 2) + + · · · +
1.2.3 . . . (r + 2) 3x+r+2 1.2 1.2.3 . . . (r − 2)
+ ..............................................................................
x(x + 1) · · · (x + 2r − 1)
+ .
1.2.3 . . . 2r 3x+2r
Thus, in the case of x = 2 and x = 5, we have
350
y2,5 = .
729
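This value can be reproduced from the partial difference equation of this case alone; the sketch below (mine) runs that recursion in exact arithmetic.

```python
from functools import lru_cache
from fractions import Fraction

@lru_cache(maxsize=None)
def y3(x, xp):
    """One white ball (worth 1 to A) and two black balls (worth 1 and 2 to B),
    each drawn with probability 1/3: y = 1/3 [y(x-1,x') + y(x,x'-1) + y(x,x'-2)],
    with y = 1 once A lacks nothing and y = 0 once B lacks nothing."""
    if x <= 0:
        return Fraction(1)
    if xp <= 0:
        return Fraction(0)
    return Fraction(1, 3) * (y3(x - 1, xp) + y3(x, xp - 1) + y3(x, xp - 2))

print(y3(2, 5))   # 350/729, the value just quoted
```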
Let us imagine further that there are in the urn two white balls distinguished, as the [217]
two black balls, by the nos. 1 and 2; the probability of player A will be given by the
equation in the partial differences
\[
y_{x,x'} = \tfrac{1}{4}\,y_{x-1,x'} + \tfrac{1}{4}\,y_{x-2,x'} + \tfrac{1}{4}\,y_{x,x'-1} + \tfrac{1}{4}\,y_{x,x'-2}.
\]
The generating function of yx,x is then, by §20 of Book I,
M + Nt
,
1 − 14 t − 14 t − 14 t2 − 14 t2

M and N being two arbitrary functions of t . In order to determine them, we will


observe that y0,x is always equal to unity, and that it is necessary to exclude in M the
null power of t ; we have therefore

t 1  1 2
M= 1− t − t .
1 − t 4 4

In order to determine N , let us seek the generating function of y1,x . If we observe
that y0,x is equal to unity, and that, player A having no more need but of one point, he
wins the game, either that he brings forth the white ball numbered 1 or the white ball
numbered 2, the preceding equation in the partial differences will give
1 1 1
y1,x = + y1,x −1 + y1,x −2 .
2 4 4
Let us suppose y1,x = 1 − yx  ; we will have
1  1
yx  = y  + y  .
4 x −1 4 x −2
The generating function of this equation is

m + nt
,
1 − 14 t − 14 t2

m and n being two constants. In order to determine them, we will observe that y1,0 =
0, and that consequently y0 = 1, that which gives m = 1. The generating function of
yx  is therefore [218]
1 + nt
.
1 − 14 t − 14 t2
We have next evidently y1,1 = 12 , that which gives y1 = 12 ; y1 is the coefficient of t
in the development of the preceding function, and this coefficient is n + 14 ; we have
therefore n + 14 = 12 , or n = 14 . The generating function of unity is 1−t
1
 , because here

all the powers of t can be admitted; we have thus

1 1 + 14 t 1 
2t
− , or ,
1 − t 1 − 14 t − 14 t2 (1 − t )(1 − 14 t − 14 t2 )

for the generating function of y1,x . This same function is the coefficient of t in the
development of the generating function of yx,x , a function which, by that which pre-
cedes, is
t 1  1 2
1−t (1 − 4 t − 4 t ) + N t
;
1 − 14 t − 14 t − 14 t2 − 14 t2
this coefficient is
1 
4t N
+ ;
(1 − t )(1 − 14 t − 1 2
4t ) 1− 1 
− 14 t2
4t

by equating it to
1 
2t
,
(1 − t )(1 − 14 t − 14 t2 )
we will have
1 
4t
N= .
1 − t

The generating function of yx,x is thus

t (1 − 14 t − 14 t2 ) + 14 tt
.
(1 − t )(1 − 14 t − 14 t − 14 t2 − 14 t2 )
If we develop into series the function

t (1 − 14 t − 14 t2 ) + 14 tt
− t ,
1 − 14 t − 14 t − 14 t2 − 14 t2
we will have [219]
⎧ ⎫
⎪ 1  1 1 ⎪

⎪ 1+ t (1 + t ) + 2 t2 (1 + t )2 + 3 t3 (1 + t )3 + · · · ⎪


⎪ 4 4 4 ⎪


⎪   ⎪


⎪ t(1 + t) 2  3 4 ⎪


⎪ +  2  2 3
1 + t (1 + t ) + 2 t (1 + t ) + 8 t (1 + t ) + · · ·  3 ⎪


⎪ 4 4 4 4 ⎪


 ⎨   ⎪

(2 + t)tt 2
t (1 + t) 2
3  3.4 3.4.5
4 ⎪ + 1 + t (1 + t ) + 2
t (1 + t ) +  2
t (1 + t ) + · · · ⎪ .
3  3

⎪ 42 4 1.2.42 1.2.3.43 ⎪


⎪  ⎪⎪

⎪ 3 3 ⎪


⎪ t (1 + t) 4   4.5 2  2 4.5.6 3  3
· · · ⎪


⎪ + 3
1 + t (1 + t ) + 2
t (1 + t ) + 3
t (1 + t ) + ⎪


⎪ 4 4 1.2.4 1.2.3.4 ⎪


⎩ ⎪

+ .........................................................................

If we reject from this series all the powers of t other than tx and all the powers of t

superior to tx , and if in that which remains we make t = 1, t = 1, we will have
the expression of yx,x when x is equal or greater than unity; when x is null, we have
y0,x = 1. It is easy to translate this process into a formula, as we have done for the
preceding case.
Let us name zx,x the probability of player B; the generating function of zx,x will
be that which the generating function of yx,x becomes when we change in it t into t ,
and reciprocally, that which gives, for this function,

t(1 − 14 t − 14 t2 ) + 14 tt
.
(1 − t)(1 − 14 t − 14 t − 14 t2 − 14 t2 )
By adding the two generating functions, their sum is reduced to
t t tt
+ + ,
1 − t 1 − t (1 − t)(1 − t )

in which the coefficient of tx tx is unity; thus we have

yx,x + zx,x = 1,

that which is clear besides, since the game must be necessarily won by one of the
players.

§9. Let us imagine in an urn r balls marked with the n◦ 1, r balls marked with n◦
2, r balls marked with n◦ 3, and thus consecutively to the n◦ n. These balls being well

mixed in the urn, we draw them successively; we demand the probability that there will [220]
exit at least one of these balls at the rank5 indicated by its label6 , or that there will exit
at least two of them, or at least three, etc.
Let us seek first the probability that there will exit at least one of them. For this,
we will observe that each ball can exit at its rank only in the first n drawings; we can
therefore here set aside the following drawings; now the total number of balls being
rn, the number of their combinations n by n, by having regard for the order that they
observe among themselves, is, by that which precedes,

rn(rn − 1)(rn − 2) · · · (rn − n + 1);

this is therefore the number of all possible cases in the first n drawings.
Let us consider one of the balls marked with the n◦ 1, and let us suppose that it
exits at its rank, or the first. The number of combinations of the rn − 1 other balls
taken n − 1 by n − 1 will be

(rn − 1)(rn − 2) · · · (rn − n + 1);

this is the number of cases relative to the assumption that we just made, and, as this
assumption can be applied to r balls marked with n◦ 1, we will have

r(rn − 1)(rn − 2) · · · (rn − n + 1)

for the number of cases relative to the hypothesis that one of the balls marked with the
n◦ 1 will exit at its rank. The same result holds for the hypothesis that any one of the
n−1 other kinds of balls will exit at the rank indicated by its label. By adding therefore
all the results relative to these diverse hypotheses, we will have

(a) rn(rn − 1)(rn − 2) · · · (rn − n + 1);

for the number of cases in which one ball at least will exit at its rank, provided however
that we remove from them the cases which are repeated.
In order to determine these cases, let us consider one of the balls of the n◦ 1, exiting
first, and one of the balls of the n◦ 2, exiting second. This case is comprehended twice [221]
in the preceding number; for it is comprehended one time in the number of the cases
relative to the assumption that one of the balls labeled7 1 will exit at its rank, and a
second time in the number of cases relative to the assumption that one of the balls
labeled 2 will exit at its rank; and, as this extends to any two balls exiting at their
rank, we see that it is necessary to subtract from the number of the cases preceding the
number of all the cases in which two balls exit at their rank.
2
The number of combinations of two balls of different labels is n(n−1) 1.2 r ; for the
number of the labels being n, their combinations two by two are in number n(n−1) 1.2 ,
5 Translator’s note: This means that a ball marked with 1 will be drawn first, a ball marked with 2 will be

drawn second, and so on. In other words, balls will be drawn consecutively by number.
6 Translator’s note: The word here is numéro, number. However, this refers to the use of a number as a

label. In order to distinguish it from nombre, number or quantity, I choose to render it as such.
7 Translator’s note: The word is numérotées, numbered. I have chosen to render it as labeled for the same

reason as above.

and in each of these combinations we can combine the r balls marked with one of the
labels with the r balls marked with the other label. The number of combinations of the
rn − 2 remaining balls, taken n − 2 by n − 2, by having regard for the order that they
observe among themselves, is

(rn − 2)(rn − 3) · · · (rn − n + 1);

thus the number of cases relative to the assumption that two balls exit at their rank is

n(n − 1) 2
r (rn − 2)(rn − 3) · · · (rn − n + 1);
1.2
by subtracting it from the number (a), we will have

⎨ rn(rn − 1)(rn − 2) · · · (rn − n + 1)
(a )
⎩ − n(n − 1) r2 (rn − 2)(rn − 3) · · · (rn − n + 1),
1.2
for the number of all the cases in which one ball at least will exit at its rank, provided
that we subtract again from this function the repeated cases, and that we add to them
those which are lacking.
These cases are those in which three balls exit at their rank. By naming k this
number, it is repeated three times in the first term of the function (a ); for it can result, [222]
in this term, from the three assumptions of each of the three balls exiting at its rank. The
number k is likewise comprehended three times in the second term of the function; for
it can result from each of the assumptions relative to any two of the three balls exiting
at their rank. Thus, this second term being affected with the − sign, the number k in not
found in the function (a ); it is necessary therefore to add it to it in order that it contain
all the cases in which one ball at least exits at its rank. The number of combinations of n
labels taken three by three is n(n−1)(n−2)
1.2.3 , and, as we can combine the r balls of one of
these labels of each combination with the r balls of the second label and with the r balls
of the third label, we will have the total number of combinations in which three balls
exit at their rank, by multiplying n(n−1)(n−2)
1.2.3 r3 by (rn − 3)(rn − 4) · · · (rn − n + 1), a
number which expresses that of the combinations of the rn − 3 remaining balls, taken
n − 3 by n − 3, by having regard for the order that they observe among themselves. If
we add this product to the function (a ), we will have



⎪ rn(rn − 1)(rn − 2) · · · (rn − n + 1)


⎨ n(n − 1) 2
(a ) − r (rn − 2)(rn − 3) · · · (rn − n + 1),
⎪ 1.2


⎩ + n(n − 1)(n − 2) r3 (rn − 3)(rn − 4) · · · (rn − n + 1).

1.2.3
This function expresses the number of all cases in which one ball at least exits at its
rank, provided that we subtract from it again the repeated cases. These cases are those
in which four balls exit at their rank. By applying here the preceding reasonings, we
will see that it is necessary again to subtract from the function (a ) the term

27
n(n − 1)(n − 2)(n − 3) 4
r (rn − 4)(rn − 5) · · · (rn − n + 1).
1.2.3.4
By continuing thus, we will have, for the expression of the cases in which one ball at [223]
least exits at its rank,



⎪ rn(rn − 1)(rn − 2) · · · (rn − n + 1)



⎪ − n(n − 1) r2 (rn − 2)(rn − 3) · · · (rn − n + 1)



⎪ 1.2


n(n − 1)(n − 2) 3
(A) + r (rn − 3)(rn − 4) · · · (rn − n + 1)

⎪ 1.2.3



⎪ n(n − 1)(n − 2)(n − 3) 4

⎪ − r (rn − 4)(rn − 5) · · · (rn − n + 1)

⎪ 1.2.3.4


+ .........................................................

the series being continued as far at it can be. In this function, each combination is not
repeated: thus the combination of s balls exiting at their rank is found here only one
time; for this combination is comprehended s times in the first term of the function,
since it can result from each of the s balls exiting at its rank; it is subtracted s(s−1)
1.2
times in the second term, since it can result from two by two combinations of the s
balls exiting at their rank; it is added s(s−1)(s−2)
1.2.3 times in the third term, since it can
result from the combinations of s letters taken three by three, and thus consecutively;
it is therefore, in the function (A), comprehended a number of times equal to

s(s − 1) s(s − 1)(s − 2)


s− + − ··· ,
1.2 1.2.3
and consequently equal to 1 − (1 − 1)s , or to unity. By dividing the function (A) by
the number rn(rn − 1)(rn − 2) · · · (rn − n + 1) of all possible cases, we will have,
for the expression of the probability that one ball at least will exit at its rank,


⎪ (n − 1)r (n − 1)(n − 2)r2

⎨ 1 − 1.2(rn − 1) + 1.2.3(rn − 1)(rn − 2)
(B)

⎪ (n − 1)(n − 2)(n − 3)r3
⎩ − + ···
1.2.3.4(rn − 1)(rn − 2)(rn − 3)
Let us seek now the probability that s balls at least will exit at their rank. The number [224]
of cases in which s balls exit at their rank is, by that which precedes,

n(n − 1)(n − 2) · · · (n − s + 1) s
(b) r (rn − s)(rn − s − 1) · · · (rn − n + 1),
1.2.3 . . . s
provided that we subtract from this function the cases which are repeated. These cases
are those in which s + 1 balls exit at their rank, for they can result, in the function,
from s + 1 balls taken s by s; these cases are therefore repeated s + 1 times in this

28
function; consequently it is necessary to subtract them s times. Now the number of
cases in which s + 1 balls exit at their rank is

n(n − 1)(n − 2) · · · (n − s) s+1


r (rn − s − 1)(rn − s − 2) · · · (rn − n + 1).
1.2.3 . . . (s + 1)

By multiplying it by s and subtracting it from the function (b), we will have


⎪ n(n − 1)(n − 2) · · · (n − s + 2) s

⎨ r (rn − s)(rn − s − 1) · · · (rn − n + 1)
 1.2.3 . . . s 
(b )

⎪ s(n − s)r
⎩ × 1− .
(s + 1)(rn − s)

In this function, many cases are again repeated, namely, those in which s + 2 balls
exit at their rank; for they result, in the first term, from s + 2 balls exiting at their
rank and taken s by s; they result, in the second term, from s + 2 balls exiting at their
rank and taken s + 1 by s + 1, and moreover multiplied by the factor s, by which we
have multiplied the second term. They are therefore comprehended in this function the
number of times (s+2)(s+1)
1.2 − s(s + 2); thus it is necessary to multiply by unity, less
this number of times, the number of cases in which s + 2 balls exit at their rank. This
last number is

n(n − 1)(n − 2) · · · (n − s − 1) s+2


r (rn − s − 2)(rn − s − 3) · · · (rn − n + 1);
1.2.3 . . . (s + 2)

the product in question will be therefore

n(n − 1)(n − 2) · · · (n − s − 1) s+2 s(s + 1)


r (rn − s − 2) · · · (rn − n + 1) .
1.2.3 . . . (s + 2) 1.2

By adding it to the function (b ), we will have [225]




⎪ n(n − 1)(n − 2) · · · (n − s + 1) s

⎪ r (rn − s)(rn − s − 1) · · · (rn − n + 1)

⎪ 1.2.3 . . . s

⎨ ⎧ ⎫
⎪ s (n − s)r ⎪
(b ) ⎪
⎨ 1 − ⎪


⎪ (s + 1) (rn − s)

⎪ × .

⎪ ⎪
⎪ s (n − s)(n − s − 1)r ⎪
2


⎩ ⎩ + ⎭
s + 2 1.2(rn − s)(rn − s − 1)

This is the number of all possible cases in which s balls exit at their rank, provided that
we subtract from it again the cases which are repeated. By continuing to reason so, and
by dividing the final function by the number of all possible cases, we will have, for the

29
expression of the probability that s balls at least will exit at their rank,

⎪ (n − 1)(n − 2) · · · (n − s + 1)rs−1



⎪ 1.2.3 . . . s(rn − 1)(rn − 2) · · · (rn − s + 1)


⎨ ⎧ ⎫
⎪ s (n − s)r s (n − s)(n − s − 1)r2 ⎪
(C) ⎪
⎨ 1 − + ⎪


⎪ s + 1 rn − s s + 2 1.2.(rn − s)(rn − s − 1)

⎪ × .

⎪ ⎪ (n − s)(n − s − 1)(n − s − 2)r3 ⎪

⎩ ⎪
⎩ − s + · · ·⎪

s + 3 1.2.3(rn − s)(rn − s − 1)(rn − s − 2)

We will have the probability that none of the balls will exit at its rank by subtracting
the formula (B) from unity, and we will find, for its expression,
2
(1.2.3 . . . rn) − nr[1.2.3 . . . (rn − 1)] + n(n−1)
1.2 r [1.2.3 . . . (rn − 2)] − · · ·
.
1.2.3 . . . rn
We have, by §33 of Book I, whatever be i,

1.2.3 . . . i = xi dxc−x ,

the integral8 being taken from x null to x infinity. The preceding expression can there-
fore be put under this form
 rn−n
x dx(x − r)n c−x
(o)  .
xrn dxc−x

Let us suppose the number rn of balls in the urn very great; then, by applying to [226]
the preceding integrals the method of §24 of Book I, we will find very nearly for the
integral of the numerator,
√ n+1 −X
2πX rn+2 1 − Xr c
√ ,
nX + n(r − 1)(X − r)2
2

X being the value of x which renders a maximum the function xrn−n (x − r)n c−x .
The equation relative to this maximum gives for X the two values

rn + r r2 (n − 1)2 + 4rn
X= ± .
2 2
We can consider here only the greatest of these values which is, to the quantities nearly
1 n
of the order rn , equal to rn + n−1 ; then the integral of the numerator of the function
(o) becomes nearly
√ 1 n+1 √
2π(rn)rn+ 2 c−rn 1 − n1 r
 .
(r − 1)(1 − n1 )2 + 1
8 Translator’s note: The constant c denotes e, the base of the natural logarithm.

30
The integral of the denominator of the same function is, by §33, quite nearly,
√ 1
2π(rn)rn+ 2 c−rn ;

the function (o) becomes thus



1 n+1 √
1− r
 n
.
(r − 1)(1 − n1 )2 + 1

We can put it under the form



1 n+1
1−
 n
;
(1 − n1 )2 + 2
rn − 1
rn2

rn being supposed a very great number, this function is reduced quite nearly to this [227]
very simple form
n
n−1
.
n
This is therefore the quite close expression of the probability that none of the balls of
the urn will exit at its rank, when there is a great number of balls. The hyperbolic
logarithm of this expression being
1 1
−1 − − − ··· ,
2n 3n2
We see that it always increases in measure as n increases; that it is null, when n = 1,
and that it becomes 1c , when n is infinity, c being always the number of which the
hyperbolic logarithm is unity.
Let us imagine now a number i of urns each containing the number n of balls, all of
different colors, and that we draw successively all the balls from each urn. We can, by
the preceding reasonings, determine the probability that one or many balls of the same
color will exit at the same rank in the i drawings. In fact, let us suppose that the ranks
of the colors are settled after the complete drawing of the first urn, and let us consider
first the first color; let us suppose that it exits first in the drawings of the i − 1 other
urns. The total number of combinations of the n − 1 other colors from each urn is, by
having regard for their situation among them, 1.2.3 . . . (n − 1); thus the total number
of these combinations relative to i − 1 urns is [1.2.3 . . . (n − 1)]i−1 ; this is the number
of cases in which the first color is drawn the first altogether from all these urns, and, as
there are n colors, we will have

n[1.2.3 . . . (n − 1)]i−1

for the number of cases in which one color at least will arrive at its rank in the drawings
from the i−1 urns. But there are in this number repeated cases; thus the case where two
colors arrive at their rank in these drawings are comprehended twice in this number; it [228]

31
is necessary therefore to subtract them from it. The number of these cases is, by that
which precedes,
n(n − 1)
[1.2.3 . . . (n − 2)]i−1 ;
1.2
by subtracting it from the preceding number, we will have the function

n(n − 1)
n[1.2.3 . . . (n − 1)]i−1 − [1.2.3 . . . (n − 2)]i−1 .
1.2
But this function contains itself repeated cases. By continuing to exclude them as we
have done above relatively to a single urn, by dividing next the final function by the
number of all possible cases, and which is here (1.2.3 . . . n)i−1 , we will have, for the
probability that one of the n − 1 colors at least will exit at its rank in the i − 1 drawings
which follow the first,
1 1 1
− + − ··· ,
ni−2 1.2[n(n − 1)] i−2 1.2.3[n(n − 1)(n − 2)]i−2

an expression in which it is necessary to take as many terms as there are units in n.


This expression is therefore the probability that at least one of the colors will exit at the
same rank in the drawings from the i urns.

§10. Let us consider two players A and B, of whom the skills are p and q, and of
whom the first has a tokens and the second b tokens. Let us suppose that at each trial
the one who loses gives a token to his adversary, and that the game ends only when one
of the players will have lost all his tokens; we demand the probability that one of the
players, A for example, will win the game before or at the nth trial.
This problem can be resolved with ease by the following process, which is, in
some way, mechanical. Let us suppose b equal or less than a, and let us consider the
development of the binomial (p + q)b . The first term pb of this development will be
the probability of A winning the game at trial b. We will subtract this term from the
development, and we will subtract similarly the last term q b , if b = a, because then this
term expresses the probability of B winning the game at trial b. Next we will multiply [229]
the rest by p + q. The first term of this product will have for factor pb q, and, as the
exponent b surpasses only by b − 1 the exponent of q, there results from it that the game
can not be won by player A, at the trial b + 1, that which is clear besides; because, if A
has lost a token in the first b trials, he must, for winning the game, win this token plus
the b tokens of player B, that which requires b + 2 trials. But, if a = b + 1, we will
subtract from the product its last term, which expresses the probability of the player B
winning the game at the trial b + 1.
We will multiply anew this second remainder by p+ q. The first term of the product
will have for factor pb+1 q, and, as the exponent of p surpasses by b the one of q, this
term will express the probability of A winning the game at the trial b + 2. We will
subtract similarly from the product the last term, if the exponent of q surpasses by a
the one of p.
We will multiply anew this third remainder by p + q, and we will continue these
multiplications up to the number of times n − b, by subtracting at each multiplication

32
the first term, if the exponent of p surpasses it by b the one of q, and the last term, if
the exponent of q surpasses by a the one of p. This premised, the sum of the first terms
thus subtracted will be the probability of A winning the game before or at trial n, and
the sum of the last terms subtracted will be the similar probability relative to player B.
In order to have an analytic solution of the problem, let yx,x be the probability of
player A winning the game, when he has x tokens, and when he has no more than x
trials to play in order to attain the n trials. This probability becomes, at the following
trial, either yx+1,x −1 , or yx−1,x −1 , according as player A wins or loses the trial; now
the respective probabilities of these two events are p and q: we have therefore the
equation in the partial differences
yx,x = pyx+1,x −1 + qyx−1,x −1.
In order to integrate this equation, we will consider, as previously, a function u of t

and of t generator of yx,x , so that yx,x is the coefficient of tx tx in the development [230]
of this function. In passing again from the coefficients to the generating functions, the
preceding equation will give


pt
u = u. + qtt ,
t
whence we deduce
pt
1= + qtt ;
t
consequently, 
1
1 1 t2 − 4pq
= ± ,
t 2pt 2p
that which gives  x

1 1 1 1
= ± − 4pq ;
t x (2p)x t t2
therefore  x
u u 1 1
 = ± − 4pq .
x
t t x (2p)x tx t t2
This equation can be put under the following form:
⎧ x  x

⎪ 1 1 1 1

⎪ + − 4pq + − − 4pq

⎪ t t2 t t2
u u ⎨
  x   x
x  =
2(2p)x tx ⎪ 1 1 1 1
x
t t ⎪ + − 4pq − − − 4pq

⎪ 1 t  t 2

t  t 2

⎪ ± − 4pq .
⎩ t2 1
− 4pq
t 2

1
The preceding expression of t gives

1 2p 1
± 2 − 4pq = − ;
t t t

33
we have therefore
 x  x 
u u 1 1 1 1
= + − 4pq + − − 4pq
tx tx 2(2p)x tx t t2 t t2
    x   x
u 1t − 2pt1

1
t  + t
1
2 − 4pq − 1
t  − t
1
2 − 4pq
+  ;
2(2p)x−1 tx 1
− 4pq t2

under this form, the ambiguity of the ± sign disappears.


Now, if we pass again from the generating functions to their coefficients, and if we [231]
observe that y0,x is null, because player A loses the game necessarily when he has no
more tokens, the preceding equation will give, by passing again from the generating
functions to the coefficients,
1
yx,x = [X (x−1) y1,x+x −1 +X (x−3) y1,x+x −3 +· · ·+X (x−2r−1) y1,x+x −2r−1 +· · · ],
2x px−1
the series of the second member being arrested when x − 2r − 1 has a negative value.
1 1
X (x−1) , X (x−3) , . . . are the coefficients of tx−1 , tx−3 , . . .in the development of the
function
  x   x
1 1 1 1
t + t2 − 4pq − t − t2 − 4pq
(i) 
1
t2 − 4pq

If we name u the coefficient of tx in the development of u, u will be a function of


t and of x, generator of yx,x . If we name similarly T  the coefficient of t in the

development of u, the product of 2x pTx−1 by the function (i) will be the generating
function of the second member of the preceding equation; this function is therefore
equal to u . Let us suppose x = a + b, then yx,x becomes ya+b,x , and this quantity is
equal to unity; because it is certain that A has won the game, when he has won all the
tokens of B; u is therefore then the generating function of unity; now x is here zero or
an even number, because the number of trials in which A can win the game is equal to
b plus an even number: indeed, A must for this win all the tokens of B, and moreover
he must win again each token that he has lost, that which requires two trials. Next, n
expressing a number of trials in which A can win the game, it is equal to b plus an even
number; x , being the number of trials which is lacking to player A in order to arrive to
n, is therefore zero or an even number. Thence it follows that, in the case of x = a + b,
1
u becomes 1−t 2 ; we have therefore [232]
  a+b   a+b
1 1 1 1
T  t + t2 − 4pq − t − t2 − 4pq 1
 = ,
2 pa+b−1
a+b 1
− 4pq 1 − t2
t2

that which gives the value of T  . By multiplying it by the function (i) divided by
2a pa−1 and in which we make x = a, we will have the generating function of ya,x

34
equal to
 
2b pb tb [(1 + 1 − 4pqt2 )a − (1 − 1 − 4pqt2 )a ]
(o)   .
(1 − t2 )[(1 + 1 − 4pqt2 )a+b − (1 − 1 − 4pqt2 )a+b ]
In the case of a = b, it becomes
2a pa ta
  .
(1 − t2 )[(1 + 1 − 4pqt2 )a + (1 − 1 − 4pqt2 )a ]
By developing the function
 
(q) (1 + 1 − 4pqt2 )a − (1 − 1 − 4pqt2 )a

according to the powers of t2 , the radical disappears, and the highest exponent
 of t in
this development is equal to or smaller than a. But, if we develop (1 − 1 − 4pqt2 )a
according to the powers of t2 , the least exponent
 of t will be 2a; the function (q) is
therefore equal to the development of (1 + 1 − 4pqt2 )a , by rejecting the powers of
t superior to a.
Now we have, by §3 of Book I,
a(a − 3) 2 a(a − 4)(a − 5) 3
z a = 1 − aα + α − α + ··· ,
1.2 1.2.3
z being one of the roots of the equation
α
z =1−
z
which is reduced to unity when α is null. This root is [233]

1 + 1 − 4α
;
2
by supposing therefore α = pqt2 , we will have
  a
1 + 1 − 4pqt2
 
2 a(a − 3) 2 2 4 a(a − 4)(a − 5) 3 3 6
= 2 1 − apqt +
a
p q t − p q t + ··· ;
1.2 1.2.3
we will have thus
2a pa ta
  a   a
1+ 1 − 4pqt2 + 1 − 1 − 4pqt2
pa ta
= a(a−3) 2 2 4 a(a−4)(a−5) 3 3 6
,
1 − apqt2 + 1.2 p q t − 1.2.3 p q t + ···

the series of the denominator being continued exclusively to the powers of t superior
to a. This second member must be, by that which precedes, divided by 1 − t2 , in

35
order to have the generating function of ya,x ; the quantity ya,x is therefore the sum of
the coefficients of the powers of t , by considering in the development of this member
with respect to the powers of t only the powers equal or inferior to x . Each of these
coefficients will express the probability that A will win the game at the trial indicated
by the exponent of the power of t .
If we name zi the coefficient corresponding to ta+2i , we will have generally

a(a − 3) 2 3
0 = zi − apqzi−1 + p q zi−2 − · · · ;
1.2
whence it is easy to conclude the values of z1 , z2 , . . ., by observing that z−1 , z−2 ,
. . . are nulls, and that z0 = pa . The value of zi being equal to ya,a+2i − ya,a+2i−2 ,
will have those of ya,a , ya,a+2 , ya,a+4 , etc. The equation in the partial differences to
which we are immediately led is found thus restored to one equation in the ordinary
differences, which determines, by integrating it, the value of ya,x . But we can obtain
this value by the following process, which is applied in the general case where a and b [234]
are equal or different between them.
Let us resume the generating function of ya,x found above; ya,x is the coefficient

of tx −b in the development of the function
P
2b pb ,
Q(1 − t2 )

by supposing
  a   a
1 − 4pqt2 − 1 − 1 − 4pqt2
1+
P = 
1 − 4pqt2
   a+b   a+b
1 + 1 − 4pqt2 − 1 − 1 − 4pqt2
Q=  .
1 − 4pqt2

There results from §5 of Book I that, if we consider the two terms


P P
, − ,
2t2i Q (1 − t2 )t2i+1 dQ
dt

that we make next successively t = 1 and t = −1 in the first term, and t equal
successively to all the roots of the equation Q = 0 in the second term, the sum of all the
terms which we obtain in this manner will be the coefficient of t2i in the development
of the fraction
P
.
Q(1 − t2 )
That which the first term produces in this sum is
pa − q a
.
2b (pa+b − q a+b )

36
In order to have the roots of the equation Q = 0, we make
1
t = √ ,
2 pq cos 

that which gives


√ √
(cos  + −1 sin )a+b − (cos  − −1 sin )a+b
Q= √ ,
−1 sin  (cos )a+b−1
or [235]
2 sin(a + b)
Q= .
sin  (cos )a+b−1
The roots of the equation Q = 0 are therefore represented by

(r + 1)π
= ,
a+b
r being a positive whole number which be extended from r = 0 to r = a+b−2. When
a + b is an even number, 12 π is one of the values of ; it is necessary to exclude it,
because, cos  becoming null then, this value of  does not render Q null. In this case,
the equation Q = 0 has only a + b − 2 roots; but, as the term depending on the value
 = 12 π is multiplied in the expression of ya,x by a positive power of cos (r+1)π
a+b , we
1
can conserve the value of r which gives  = 2 π, since the term which corresponds to
it in the expression of ya,x disappears.
Now we have
dQ dQ d
= ,
dt d dt
whence we deduce, by virtue of the equation sin(a + b) = 0,
√ √
dQ 4(a + b) pq cos(r + 1)π 4(a + b) pq (−1)r+1
= = .
dt sin2  (cos )a+b−3 sin2  (cos )a+b−3

The term
−P
,
(1 − t2 )t2i+1 dQ
dt
by observing that
2 sin a
P = ,
sin  (cos )a−1
becomes thus
 b+2i+1
(−1)r+1 22i+2 (pq)i+1 sin (r+1)π
a+b sin (r+1)aπ
a+b cos (r+1)π
a+b
(h)   ;
2(r+1)π
(a + b) p2 − 2pq cos a+b + q 2

the sum of all the terms which we obtain, by giving to r all the whole and positive [236]

37
values, from r = 0 to r = a + b − 2, will be that which produces the function
−P
:
(1 − t2 )t2i+1 dQ
dt

we will designate this sum by the characteristic S placed before the function (h).
If we make r + 1 = a + b − (r + 1), we will have

(r + 1)π (r + 1)π


sin = sin ,
a+b a+b
(r + 1)π (r + 1)π
cos = − cos ,
a+b a+b
2(r + 1)π 2(r + 1)π
cos = cos ,
a+b a+b
(r + 1)aπ (r + 1)aπ
sin = (−1)a+1 sin .
a+b a+b
Thence it is easy to conclude that, in the function (h), the term relative to r + 1 is the
same as the term relative to r + 1; we can therefore double this term, and extend then
the characteristic S only to the values of r comprehended from r = 0 to r = a+b−2 2 , if
a + b is even, or r = a+b−1
2 , if a + b is odd. This premised, by observing that

(r + 1)aπ (r + 1)bπ
sin = (−1)r sin ,
a+b a+b
we will have


⎪ pb (pa − q a )

⎪ ya,b+2i = a+b

⎨ p − q a+b
(H)  b+2i
2(r+1)π (r+1)bπ

⎪ b+2i+2 b i+1 sin a+b sin a+b cos (r+1)π

⎪ −
2 p (pq) a+b

⎩ S .
a+b p2 − 2pq cos 2(r+1)π + q 2 a+b

By changing a into b, p into q, and reciprocally, we will have the probability that player
B will win the game before the trial a + 2i or at this trial.
Let us suppose a = b; sin (r+1)aπ
a+b will become sin 12 (r + 1)π. This sine is null, [237]
when r + 1 is even; therefore it suffices then to consider, in the expression of ya,a+2i ,
the odd values of r +1. By expressing them as 2s+1, and observing that sin (2s+1)π 2 =
(−1)s , we will have
pa
ya,a+2i =
pa + q a
 a+2i
2 a+2i+2 a
p (pq) i+1 (−1)s sin (2s+1)π
a cos (2s+1)π
2a
− S ,
a p2 − 2pq cos (2s+1)π
a + q2

2s + 1 must comprehend all the odd values contained in a − 1.

38
If we change, in this expression, p into q, and reciprocally, we will have the prob-
ability of player B for winning the game in a + 2i trials. The sum of these two prob-
abilities will be the probability that the game will be ended after this number of trials;
this last probability is therefore
 a+2i
s (2s+1)π (2s+1)π
2 a+2i+1 (−1) sin a cos 2a
1− (pa + q a )(pq)i+1 S (2s+1)π
.
a 2
p − 2pq cos + q2
a

If the skills p and q are equal, this expression becomes


 a+2i+1
s (2s+1)π
2 (−1) cos 2a
1− S (2s+1)π
.
a sin a

When a + 2i is a large number, we are able to conclude from it in a quite near manner
the number of trials necessary in order that the probability that the game will end in
this number of trials is equal to a given fraction k1 . We will have then
 a+2i+1
s (2s+1)π
2 (−1) cos 2a k−1
S (2s+1)π
= ;
a sin k
a

a + 2i being supposed a very great number, quite superior to the number a, it suffices [238]
to consider the term of the first member which corresponds to s null, and then we have
 
log a(k−1)
2k sin π
2a
a + 2i + 1 = π ,
log(cos 2a )
these logarithms can be at will hyperbolic or tabular.
If, in the preceding formulas, we suppose a infinity, b remaining a finite number,
we will have the case in which player A plays against player B who has originally the
number b of tokens, until he has won all the tokens of B, without that ever the latter
is able to beat A, whatever be the number of tokens that he has won from him. In this
case, the generating function (o) of ya,x is reduced to
2b pb tb
  b ,
(1 − t2 ) 1 + 1 − 4pqt2
  a   a+b
because then 1 − 1 − 4pqt2 and 1 − 1 − 4pqt2 , developed, contain
only some infinite powers of t , powers which we must neglect, when we consider only
a finite number of trials. We have by that which precedes
  −b
1 + 1 − 4pqt2
⎧ ⎫

⎪ b(b − 3) 2 2 4 b(b + 4)(b + 5) 3 3 6 ⎪
2
1 ⎨ 1+bpqt + 1.2 p q t + p q t + · · ·⎪

= b 1.2.3 .
2 ⎪
⎪ b(b + i − 1)(b + i + 2) · · · (b + 2i − 1)pi q i t2i ⎪

⎩ + + ··· ⎭
1.2.3 . . . i

39
2b pb tb
By multiplying this second member by 1−t2 , the coefficient of tb+2i will be
 
b(b + 3) 2 2 b(b + i − 1)(b + i + 2) · · · (b + 2i − 1)pi q i
p b
1 + bpq + p q + ··· + :
1.2 1.2.3 . . . i

this is the value of ya,b+2i , or the probability that A will win the game before or at the
trial b + 2i.
This value will be very painful to reduce into numbers, if b and 2i were large [239]
numbers; it will be especially very difficult to obtain by its means the number of trials
in which A can wager one against one to win the game; but we can attain it easily in
this manner.
Let us resume formula (H) found above. In the case of a infinite, and p being
supposed equal or greater than q, if we suppose r+1 π
a π = φ and a = dφ, it becomes

2b+2i+2 pb (pq)i+1 dφ sin 2φ sin bφ (cos φ)b+2i
ya,b+2i = 1 − ,
π p2 − 2pq cos 2φ + q 2

the integral needing to be taken from φ = 0 to φ = 12 π. In the case of p less than q, the
pb
same expression holds, provided that we change the first term 1 into qb
.
If p = q, this expression becomes

2 dφ sin bφ (cos φ)b+2i+1
1− ,
π sin φ

the integral being taken from φ null to φ = 12 π. Let us suppose now that b and i are
great numbers. The maximum of the function

φ (cos φ)b+2i+1
sin φ
corresponds to φ = 0, that which gives 1 for this maximum. The function decreases
next with an extreme rapidity, and, in the interval where it has a sensible value, we can
suppose
log sin φ = log φ + log(1 − 16 φ2 ) = log φ − 16 φ2 ,
log(cos φ)b+2i+1 =(b + 2i + 1) log(1 − 12 φ2 + 24 1 4
φ )
b + 2i + 1 2 b + 2i + 1 4
=− φ − φ ,
2 12
that which gives, by neglecting the sixth powers of φ and its fourth powers which are [240]
not multiplied by b + 2i + 1,

(cos φ)b+2i+1 b + 2i + 23 2 b + 2i + 23 4
log = − log φ − φ − φ .
sin φ 2 12
By making therefore
b + 2i + 23
a2 = ,
2

40
we will have 2
(cos φ)b+2i+1 1 − a6 φ4 −a2 φ2
= c ;
sin φ φ
hence,
 
  dφ 1 − a2 4
dφ sin bφ (cos φ)b+2i+1 6 φ 2
φ2
= sin bφ c−a .
sin φ φ
This last integral is able to be taken from φ = 0 to φ infinity; because it must be
2 2
taken from φ = 0 to φ = 12 π; now, a2 being a considerable number, c−a φ becomes
1
excessively small, when we make φ = 2 π, so that we can suppose it null, seeing the
extreme rapidity with which this exponential diminishes, when φ increases. Now we
have
 
 dφ 1 − a2 φ4 

d 6 2 2 a2 2 2
sin bφ c−a φ = dφ 1 − φ4 cos bφ c−a φ ;
db φ 6
we have besides, by No. 25 of Book I,
 √
−a2 φ2 π − b22
dφ cos bφ c = c 4a ,
2a
 √ 4 − b22
4 −a2 φ2 π d c 4a
φ dφ cos bφ c = ,
2a db4

3 π − b22 b2 b4
= c 4a 1− 2 + ,
8a5 a 12.a4
b2
whence we deduce, by making 4a2 = t2 ,
  2

dφ sin bφ (cos φ)b+2i+1 √ −t2 tc−t 2 2

= π dt c − 1 − 3t .
sin φ 8a2

Thus the probability that A will win the game in the number b + 2i trials is [241]
 2

2 −t2 T c−T 2 2

1− √ dt c − 1 − 3T ,
π 8a2
2
the integral being taken from t null to t = T , T 2 being equal to 4ab
2.

If we seek the number of trials in which we can wager one against one that this will
take place, we will make this probability equal to 12 , that which gives
 √ 2
2 π T c−T
dt c−t = + 2
1 − 23 T 2 .
4 8a
Let us name T  the value of t, which corresponds to
 √
−t2 π
dt c = ,
4

41
and let us suppose
T = T  + q,
1
 2 2
q being of order a2 . The integral dt c−t will be increased very nearly by qc−T , that
which gives
2
−T 2 T  c−T
qc = 1 − 23 T 2 ;
8a2
we will have therefore
T 2
T 2 = T 2 + 1 − 23 T 2 .
4a2
1
Having therefore T 2 to the quantities near the order a4 , the equation

b2
2a2 = b + 2i + 2
3 =
2T 2
1
will give, to the quantities near the order a2 ,

b2 7 1
b + 2i = 2
− + T 2 .
2T 6 3
In order to determine the value of T 2 , we will observe that here T  is smaller than 12 ; [242]
thus the transcendent and integral equation
 √
2 π
dt c−t =
4
can be transformed into the following:

1 1 1 5 1 1 7 π
T  − T 3 + T − T + ··· = .
3 1.2 5 1.2.3 7 4
By resolving this equation, we find

T 2 = 0.2102497.

By supposing b = 100, we will have

b + 2i = 23780.14.

There is therefore then disadvantage to wager one against one that A will win the game
in 23780 trials, but there is advantage to wager that he will win it in 23781 trials.

§11. A number n + 1 of players play together with the following conditions. Two
among them play first, and the one who loses is retired after having put a franc in
the game, in order to return only after all the other players have played; that which
holds generally for all the players who lose, and who thence become the last. The
one of the first two players who has won plays with the third, and, if he wins it, he
continues to play with the fourth, and thus consecutively until he loses, or until he has
beat successively all the players. In this last case, the game is ended. But, if the player

42
winning at the first trial is vanquished by one of the other players, the vanquisher plays
with the following player, and continues to play until he is vanquished, or until he has
beat in sequence all the players; the game continues thus until there is one player who
beats in sequence all the others, that which ends the game, and then the player who
wins it takes away all that which has been set into the game.
This premised, let us determine first the probability that the game will end precisely
at trial x; let us name zx this probability. In order that the game finish at trial x, if is [243]
necessary that the player who enters into the game at trial x − n + 1 wins this trial and
the n − 1 trials following; now he is able to enter against a player who has won only
a single trial: by naming P the probability of this event, 2Pn will be the corresponding
probability that the game will end at trial x. But the probability zx−1 that the game
will end at trial x − 1 is evidently 2n−1
P
. Because it is necessary for this that there is
a player who has won a trial, at trial x − n + 1, and who, playing at this trial, wins
it and the following n − 2 trials; and the probability of each of these events being P
1 P
and 2n−1 , the probability of the composite event will be 2n−1 ; we will have therefore
P
zx−1 = 2n−1 , and, consequently,

P 1
n
= zx−1 ;
2 2
1
2 zx−1 is therefore the probability that the game will end at trial x, relative to this case.
If the player who enters into the game at trial x − n + 1 plays at this trial against

a player who has already won two trials, by naming P  the probability of this case, 2Pn
will be the probability relative to this case that the game will end at trial x. But we
have
P
n−2
= zx−2 ;
2
because, in order that the game end at trial x − 2, it is necessary that at trial x − n + 1
one of the players has already won two trials, and that he wins this trial and the n − 3
following trials. We have therefore
P 1
= 2 zx−2 ;
2n 2
1
22 zx−2 is therefore the probability that the game will end at trial x, relative to this case;
and thus consecutively.
By reassembling all these partial probabilities, we will have [244]
1 1 1 1
zx = zx−1 + 2 zx−2 + 3 zx−3 + · · · + n−1 zx−n+1 .
2 2 2 2
The generating function of zx is, by Book I,
ψ(t)
1 1 2 1
1− 2t − 22 t − ··· − 2n−1 t
n−1

or
1
2 ψ(t)(2 − t)
.
1 − t + 21n tn

43
In order to determine ψ(t), we will observe that the game is able to end no earlier
1
than at trial n, and that the probability for this is 2n−1 ; because it is necessary that the
vanquisher at the first trial win the n − 1 following trials; ψ(t) must therefore contain
1
only the power n of t, and 2n−1 must be the coefficient of this power, that which gives
n
t
ψ(t) = 2n−1 ; thus the generating function of zx is
1 n
2n t (2 − t)
.
1 − t + 21n tn

The sum of the coefficients of the powers of t to infinity, in the development of this
function, is the probability that the game must end after an infinity of trials; now we
have this sum by making t = 1 in the function, that which reduces it to unity; it is
therefore certain that the game must end.
We will have the probability that the game will be ended at trial x or before this
trial, by determining the coefficient of tx in the development of the preceding function,
divided by 1 − t; the generating function of this probability is therefore
1 n
− t)
2n t (2 .
(1 − t) 1 − t + 21n tn

Let us give to the generating function of zx this form [245]


 
1 tn (2 − t) 1 tn 1 t2n
1 − + − · · · ;
2n 1 − t 2n 1 − t 22n (1 − t)2
trn (2−t)
the coefficient of tx in 2rn (1−t)r is

1 (x − rn + 1)(x − rn + 2) · · · (x − rn + r − 2)
(x − rn + 2r − 2);
2rn 1.2.3 . . . (r − 1)
we have therefore
1 x − 2n + 1 x − 3n + 1
zx = − + (x − 3n + 4)
2n 22n 1.2.23n
(x − 4n + 1)(x − 4n + 2)
− (x − 4n + 6) + · · · ,
1.2.3.24n
an expression which is relative only to x greater than n, and in which it is necessary to
take only as many terms as there are whole units in the quotient nx . When x = n, we
1
have zx = 2n−1 .
By developing in the same manner the generating function of the probability that
the game will end before or at trial x, we will find for the expression of this probability
x − n + 2 x − 2n + 1
− (x − 2n + 4)
2n 1.2.22n
(x − 3n + 1)(x − 3n + 2)
+ (x − 3n + 6) − · · ·
1.2.3.23n
this expression holding even in the case x = n.

44
Let us determine now the respective probabilities of the players winning the game at
trial x. Let y0,x, be that of the player who has won the first trial. Let y1,x , y2,x , . . . , yn−1,x
be those of the following players, and yn,x that of the player who has lost at the first
trial, and who thence became the last. Let us designate the players by (0), (1), (2), . . .,
(n − 1), (n). This premised, the probability yr,x of player (r) becomes yr−1,x−1 , if
at the second trial player (0) is vanquished by player (1); because it is clear that (r) is [246]
found then, with respect to the vanquisher (1), in the same position where (r − 1) was
with respect to the vanquisher (0); only, there is one less trial to play in order to arrive at
trial x, that which changes x into x − 1. Presently the probability that player (0) will be
vanquished by (1) is 12 ; thus 12 yr−1, x−1 is the probability of player (r) for winning the
game at trial x relative to the case where (0) is vanquished by (1). If (0) is vanquished
only by (2), yr,x becomes yr−2, x−2 , and the probability of this event being 14 , we have
1
4 yr−2, x−2 for the probability of player (r) winning the game at trial x, relative to
this case. If player (0) is vanquished only by player (r), yr,x becomes y0, x−r , and the
probability of this event is 21r ; thus 21r y0,x−r is the probability of player (r) winning the
game at trial x, relative to this case. If player (0) is vanquished only by player (r + 1),
yr,x is changed into yn−1, x−r−1 ; because then player (r) is found, with respect to the
vanquisher, in the original position of player (n − 1) with respect to player (0); only
there remains only x − r − 1 trials to play in order to arrive at trial x. Now the prob-
1 1
ability that (0) will be vanquished only by player (r + 1) is 2r+1 ; 2r+1 yn−1, x−r−1 is
therefore the probability of (r) for winning the game at trial x, relative to this case. By
continuing thus, and reassembling all these partial probabilities, we will have the en-
tire probability yr,x of player (r) for winning the game, that which gives the following
equation:
1 1 1 1
yr,x = yr−1,x−1 + 2 yr−2, x−2 + · · · + r y0,x−r + r+1 yn−1, x−r−1
2 2 2 2
1 1
+ r+2 yn−2, x−r−2 + · · · + n−1 yr+1, x−n+1 .
2 2
This expression holds from r = 1 to r = n − 2. It gives
1 1 1 1
yr−1,x−1 = 2 yr−1,x−2 + 3 yr−3, x−3 + · · · + n yr,x−n .
2 2 2 2
By subtracting this equation from the preceding, we will have that here in the partial [247]
differences,
1
(1) yr,x − yr−1,x−1 + yr,x−n = 0;
2n
this equation is extended from r = 2 to r = n − 2.
We have, by the preceding reasoning, the following equation:
1 1 1
yn−1,x = yn−2, x−1 + 2 yn−3, x−2 + · · · + n−1 y0,x−n+1 .
2 2 2
But the preceding expression of yr,x gives
1 1 1 1
yn−2,x−1 = 2 yn−3,x−2 + · · · + n−1 y0, x−n+1 + n yn−1,x−n .
2 2 2 2

45
By subtracting this equation from the preceding, we will have
1
yn−1,x − yn−2,x−1 + yn−1,x−n = 0;
2n
thus equation (1) subsists in the case of r = n − 1.
The preceding reasoning leads further to this equation
1 1 1
yn,x = yn−1,x−1 + 2 yn−2, x−2 + · · · + n−1 y1, x−n+1 ,
2 2 2
that which gives
1 1 1
yn, x−1 = 2 yn−1, x−2 + · · · + n y1,x−n .
2 2 2
By subtracting this equation from that here, which gives the general expression of yr,x ,

1 1 1
y1,x = y0,x−1 + 2 yn−1, x−2 + · · · + n+1 y2, x−n+1 ,
2 2 2
and making 12 (y0,x + yn,x ) = ȳ0,x , we will have

1
y1,x − ȳ0,x−1 + y1,x−n = 0.
2n
Equation (1) subsists therefore yet even in the case of r = 1, provided that we change
y0,x into ȳ0,x . We must observe that ȳ0,x is the probability to win the game at trial x
of each of the first two players, at the moment where the game commences; because [248]
this probability becomes, after the first trial, y0,x or yn,x according as the player win
or lose, and the probability of each of these events is 12 .
Now, the generating function of equation (1) is, by §20 of Book I,

φ(t)
(a) ,
1 − tt + 21n tn

t being relative to the variable x, and t being relative to the variable r, so that yr,x is
the coefficient of tr tx in the development of this function; φ(t) is a function of t that
there is concern to determine.
For this, we will make
1
T = ;
1 + 21n tn
the generating function of yr,x will be the coefficient of tr in the development of the
function (a); it will be therefore
φ(t)tr T r+1 .
The probability that the game will end precisely at trial x is evidently the sum of the
probabilities of each player for winning at this trial; it is therefore

2ȳ0,x + y1,x + y2,x + · · · + yn−1,x ;

46
consequently the generating function of this probability is

T φ(t)(2 + tT + t2 T 2 + · · · + tn−1 T n−1 )

or
2 − tT − tn T n
T φ(t) .
1 − tT
By equating it to the generating function of this probability, that we have found above
and which is
1 n
2n t (2 − t)
,
1 − t + 21n tn
we will have [249]
1 n
2n t (2 − t)(1

− tT )
.
φ(t) =
T (2 − tT − t T ) 1 − t + 21n tn
n n

Thus the generating function of equation (1) in the partial differences is


1 n
2n t (2 − t)(1 − tT

)
;
1 n 1 n
T (2 − tT − t T ) 1 − t + 2n t
n n 1 − tt + 2n t

the generating function of yr,x is therefore


1 n+r
2n t (2 − t)(1 − tT )T r
.
(2 − tT − tn T n ) 1 − t + 21n tn

The coefficient of tx in the development of this function is the probability of player (r)
winning the game at trial x. We will thus be able to determine this probability through
this development. The sum of all these coefficients to x infinity is the probability of
player (r) winning the game; now we have this sum by making t = 1 in the preceding
2n
function, that which gives T = 1+2 n ; let us name p this last quantity, and let us

designate by yr the probability of (r) to win the game; we will have

(1 − p)pr
yr = .
2 − p − pn
This expression is extended from r = 0 to r = n − 1, provided that we change y0
into ȳ0 , ȳ0 expressing the probability of the first two players winning the game at the
moment where they enter the game.
Now, each losing player depositing a franc into the game, let us determine the
advantage of the different players. It is clear that after x trials, there were x tokens in
the game; the advantage of player (r) relative to these x tokens is the product of these
tokens by the probability yr,x winning the game at trial x; this advantage is therefore
xyr,x . The value of xyr,x is the coefficient of tx−1 dt in the differential of the generating [250]
function yr,x ; by dividing therefore this differential by dt and by supposing next t = 1,
we will have the sum of all the values of xyr,x to x infinity; this is the advantage of
player (r). But it is necessary to subtract the tokens that he put into the game at each
trial that he loses; now yr,x being his probability to win the game at trial x, 2n yr,x−n+1

47
will be his probability to enter into the game at trial x−n+1, since this last probability,
multiplied by the probability 21n that he will win this trial and the n − 1 following trials
is his probability to win the game at trial x. By supposing therefore that he loses as
many times as he enters into the game, the sum of all the values of 2n yr,x−n+1 , to x
infinity, would be the disadvantage of player (r); and as the sum of all the values of
yr,x−n+1 is equal to the sum of all the values of yr,x or yr , we would have 2n yr or
2n (1−p)pr
2−p−pn for the disadvantage of player (r). But he does not lose each time that he
enters into the game, because he is able to enter into the game and to win the game; it
is necessary therefore to remove from 2n yr the sum of all the values of yx or yr , and
then the disadvantage of (r) is (2 −1)(1−p)p
n r

2−p−pn . In order to have the entire advantage of


(r), it is necessary to subtract this last quantity from the sum of the values of xyr,x ; by
designating therefore by S this sum, the advantage of player (r) will be

(2n − 1)(1 − p)pr


S− ,
2 − p − pn
S being, as we have just seen, the differential of the generating function of yr,x divided
by dt, and in which we suppose next t = 1. Under this supposition, we have
dT
T = p, = −np(1 − p).
dt
Let us designate by Yr the advantage of (r), we will find
 
np + 1 − n r pn+1 + n(1 − p)pn − p
Yr = p (1 − p)r + .
2 − p − pn 2 − p − pn

This equation will serve from r = 0 to r = n − 1, provided that we change Y0 into [251]
Ȳ0 , Ȳ0 being the advantage of the first two players, at the moment where they enter
into the game.
If, at the commencement of the game, each of the players deposits into the game
a sum a, the advantage of player (r) will be increased from it by (n + 1)a, multiplied
by the probability yr , that this player will win the game; but it is necessary to remove
from it the stake a from this player; it is necessary therefore, in order to have then his
advantage, to increase the preceding expression of Yr by the quantity

(n + 1)a(1 − p)pr
− a.
2 − p − pn
When the advantage of (r) becomes negative, it is changed into disadvantage.

§12. Let q be the probability of a simple event at each trial; we demand the proba-
bility to bring it forth i times in sequence in the number x of trials.
Let us name zx the probability that this composite event will take place precisely at
trial x. For this, it is necessary that the simple event not arrive at trial x − i, and that it
arrive in the i trials following, the composite event being not at all arrived previously.
Let then P be the probability that the simple event will not arrive at all at trial x − i − 1.
The corresponding probability that it will not arrive at all at trial x − i will be (1 − q)P ,

48
and the corresponding probability that the composite event will take place precisely at
trial x will be (1 − q)P q i . This will be the part of zx corresponding to this case. But
the probability that the composite event will arrive at trial x − 1 is evidently P q i ; we
have therefore
zx−1
P = i ;
q
thus the partial value of zx , relative to this case, is (1 − q)zx−1 .
Let us consider now the cases where the simple event will arrive at trial x − i − 1.
Let us name P  the probability that it will not arrive at trial x − i − 2; the probability [252]
that it will arrive in this case at trial x − i − 1 will be qP  , and the probability that it
will not arrive at trial x − i will be (1 − q)qP  ; the partial value of zx relative to this
case will be therefore (1 − q)qP  q i . But the probability that the composite event will
arrive precisely at trial x − 2 is P  q i : it is the value of zx−2 , that which gives
zx−2
P = ;
qi

(1 − q)qzx−2 is therefore the partial value of zx relative to the case where the simple
event will arrive at trial x − i − 1, without arriving at trial x − i − 2.
We will find in the same manner that (1 − q)q 2 zx−3 is the partial value of zx ,
relative to the case where the simple event will arrive at trials x − i − 1 and x − i − 2,
without arriving at trial x − i − 3; and thus consecutively.
By uniting all these partial values of zx , we will have

zx = (1 − q)(zx−1 + qzx−2 + q 2 zx−3 + · · · + q i−1 zx−i ).

It is easy to conclude from it that the generating function of zx is

q i (1 − qt)ti
,
1 − t + (1 − q)q i ti+1

because this generating function is

φ(t)
.
1 − (1 − q)(t + qt2 + · · · + q i−1 ti )
or
φ(t)(1 − qt)
,
1 − t + (1 − q)q i ti+1
The function φ(t) must be determined by the condition that it must contain only the
power i of t, since the composite event is able to commence to be possible only at trial
i; moreover, the coefficient of this power is the probability q i that this event will take
place precisely at this trial.
By dividing the preceding generating function by 1 − t, we will have [253]

q i (1 − qt)ti
 
(1 − t)2 1 + (1−q)q
i ti+1

1−t

49
for the generating function of the probability that the composite event will take place
before or at trial x.
By developing this function, we will have, for the coefficient of tx+i , the series

(x − i)
q i [(1 − q)x + 1] − (1 − q)q 2i [(1 − q)(x − i − 1) + 2]
1.2
(x − 2i)(x − 2i − 1)
+ (1 − q)2 q 3i [(1 − q)(x − 2i − 2) + 3]
1.2.3
(x − 3i)(x − 3i − 1)(x − 3i − 2)
− (1 − q)3 q 4i [(1 − q)(x − 3i − 3) + 4]
1.2.3.4
+ ................................................................. ,

the series being continued until we arrive to some negative factors. This is the expres-
sion of the probability that the composite event will take place at trial x + i or before
this trial.
Let us suppose further that two players A and B, of whom the respective skills for
winning a trial are q and 1−q, play with this condition, that the one of the two who will
have first vanquished i times in sequence his adversary will win the game; we demand
the respective probabilities of the two players for winning the game precisely at trial x.
Let yx be the probability of A, and yx that of B. Player A is able to win the game
at trial x, only as long as he commences or recommences to beat B at trial x − i + 1,
and that he continues to win the following i − 1 trials. Now, before commencing trial
x − i + 1, B will have already beat A either one time, or two times, . . ., or i − 1 times.
In the first case, if we name P the probability of this case, P (1 − q)i−1 will be the

probability yx−1 of B for winning the game at trial x − 1, that which gives

yx−1
P = .
(1 − q)i−1

But if B loses at trial x − i + 1 and at the i − 1 following trials, A will win the game [254]
qi y
at trial x, and the probability of this is P q i ; (1−q)
x−1
i−1 is therefore the part of yx relative

to the first case.


In the second case, if we name P  his probability, P  (1 − q)i−2 will be the proba-

bility yx−2 of B for winning the game at trial x − 2. The probability of A for winning

q i yx−2
the game at trial x, relative to this case, is P  q i ; we have therefore (1−q)i−2 for this
probability.
By continuing thus, we will have

qi 
yx = [(1 − q)yx−1 + (1 − q)2 yx−2
 
+ · · · + (1 − q)i−1 yx−i+1 ].
(1 − q)i

If we change q into 1 − q, yx into yx and reciprocally, we will have

(1 − q)i
yx = (qyx−1 + q 2 yx−2 + · · · + q i−1 yx−i+1 ).
qi

50
Now, u being the generating function of yx , that of yx will be, by all that which pre-
cedes,
kqut(1 + qt + qt2 + · · · + q i−2 ti−2 ),

k being equal to (1−q)


i

q i . But the preceding expression of yx commencing to hold only
when x = i + 1, because for the smaller values of x, yx−1 , yx−2 , . . . are nulls, it
is necessary, in order to complete the preceding expression of the generating function
of yx , to add to it a rational and entire function of t, of order i, and of which the
coefficients of the powers of t are the values of yx , when x is equal or smaller than i.
Now yx is null, when x is less than i; and when it is equal to i, yx is (1 − q)i , because
it expresses then the probability of B for winingn the game after i trials; the function
to add is therefore (1 − q)i ti ; thus the generating function of yx is

kqut(1 + qt + qt2 + · · · + q i−2 ti−2 ) + (1 − q)i ti .

If we name u this function, the expression of yx in yx−1 


, yx−2 , . . ., will give for the [255]
 1
generating function of yx , by changing in that of yx , k into k , q into 1 − q,

1
(1 − q)u t[1 + (1 − q)t + · · · + (1 − q)i−2 ti−2 ] + q i ti .
k
This quantity is therefore equal to u, whence we deduce, by substituting in it for u its
preceding value,

q i ti (1 − qt)[1 − (1 − q)i ti ]
u= .
1 − t + q(1 − q)i ti+1 + (1 − q)q i ti+1 − q i (1 − q)i t2i

By changing q into 1 − q, we will have the generating function u of yx . If we di-


vide these functions by 1 − t, we will have the generating functions of the respective
probabilities of A and of B, for winning the game before or at trial x.
If we suppose t = 1 in u, we will have the probability that A will win the game;
because it is clear that by developing u according to the powers of t, and by supposing
next t = 1, the sum of all the terms of this development will be that of all the values of
yx . We find thus the probability of A winning the game equal to

[1 − (1 − q)i ]q i−1
;
(1 − q)i−1 + q i−1 − q i−1 (1 − q)i−1
the probability of B is therefore

(1 − q)i−1 [1 − q i ]
.
(1 − q)i−1+ q i−1 − q i−1 (1 − q)i−1
Let us suppose now that the players, at each trial that they lose, deposit a franc into
the game, and let us determine their respective lot. It is clear that the gain of player
A will be x, if he wins the game at trial x, since there will be x francs deposited into
the game; thus the probability of this event being yx by that which precedes. Sxyx
will be the expression of the advantage of A, the sign S extending to all the possible

values of x. The generating function of yx being u or TT , T  being the numerator of

51
the preceding expression of u, and T being its denominator, it is easy to see that we [256]

will have Sxyx by differentiating TT , and by supposing next t = 1 in this differential,
that which gives, with this condition,
dT  T  dT
Sxyx = − 2 .
T dt T dt
In order to have the disadvantage of A, we will observe that at each trial that he plays,
the probability that he will lose, and consequently that he will deposit a franc into the
game, is 1 − q; his loss is therefore the product of 1 − q by the probability that the

trial will be played; now the probability that trial x will be played is 1−Syx−1 −Syx−1 ;
t  T  t+T  t 
the generating function of unity is here 1−t , and that of Syx+1 +Syx+1 is T (1−t) ; T
being that which T  becomes when we change q into 1 − q and reciprocally; thus the
generating function of the disadvantage of A is
(1 − q)t(T − T  − T  )
.
(1 − t)T
The numerator and the denominator of this function are divisible by 1−t; moreover, we
will have the sum of all the disadvantages of A, or his total disadvantage, by making
t = 1 in this generating function; the total disadvantage is therefore, by the known
methods, and by observing that T  + T  = T when t = 1,
(1 − q)(dT − dT  − dT  )
− ,
T dt
t being supposed equal to unity after the differentiations. If we subtract this expression
from that of the total advantage of A, we will have, for the expression of the lot of this
player,
qdT  + (1 − q)(dT − dT  ) T  dT
− 2 .
T dt T dt
The lot of B will be
(1 − q)dT  + q(dT − dT  ) T  dT
− 2 ,
T dt T dt
t being supposed unity after the differentiations, that which gives [257]
T = q(1 − q)[q i−1
+ (1 − q) i−1
−q i−1
(1 − q) i−1
],
dT
= (i + 1)q(1 − q)[q i−1 + (1 − q)i−1 ] − 2i q i (1 − q)i − 1,
dt
T = (1 − q)q i [1 − (1 − q)i ],
dT 
= i(1 − q)q i [1 − 2(1 − q)i ] − qq i [1 − (1 − q)i ].
dt
T 
We will have T  and dt by changing, in these last two expressions, q into 1 − q.

§13. An urn being supposed to contain n + 1 balls, distinguished by the numbers


0, 1, 2, 3, . . . , n, we draw from it a ball which we replace into the urn after the draw-
ing. We demand the probability that after i drawings the sum of the numbers brought
forth will be equal to s.

52
Let t1 , t2 , t3 , . . . , ti be the numbers brought forth at the first drawing, at the sec-
ond, at the third, . . .; we must have

(1) t1 + t2 + t3 + · · · + ti = s.

t2 , t3 , . . . , ti being supposed not to vary, this equation is susceptible only of one com-
bination. But, if we make vary at the same time t1 and t2 , and if we suppose that these
variables can be extended indefinitely from zero, then the number of combinations
which give the preceding equation will be

s + 1 − t 3 − t 4 − · · · − ti ;

because t1 can be extended from zero, that which gives

t 2 = s − t 3 − t 4 − · · · − ti ,

to s − t3 − t4 − · · · − ti , that which gives t2 = 0, the negative values of the variables


t1 , t2 must to be excluded.
Now, the number s + 1 − t3 − t4 − · · · − ti is susceptible of many values, by virtue
of the variations of t3 , t4 , . . .. Let us suppose first t4 , t5 , . . . invariables, and that t3 [258]
being able to be extended indefinitely from zero; then we make

s + 1 − t3 − t4 − · · · − ti = x,

by integrating this variable of which the finite difference is unity, we will have x(x−1)
1.2
for its integral; but, in order to have the sum of all the values of x, it is necessary,
as we know, to add x to this integral; this sum is therefore x(x+1)
1.2 . It is necessary to
make x equal to its greater value, which we obtain by making t3 null in the function
s+1−t3 −t4 −· · ·−ti : thus the total number of combinations relative to the variations
of t1 , t2 and t3 is

(s + 2 − t4 − t5 − · · · − ti )(s + 1 − t4 − t5 − · · · − ti )
.
1.2
By making next in this function

s + 2 − t4 − t5 − · · · − ti = x,
x(x−1)
it becomes 1.2 ; by integrating it from x = 0 and by adding the function itself
to this integral, we will have (x+1)x(x−1)
1.2.3 ; the value of x null corresponds to t4 =
s + 2 − t5 − · · · − ti , and it greater value corresponds to t4 null, and consequently
it is equal to s + 2 − t5 − · · · − ti ; by substituting therefore for x this value into the
preceding integral, we will have

(s + 3 − t5 − t6 − · · · − ti )(s + 2 − t5 − t6 − · · · − ti )(s + 1 − t5 − t6 − · · · − ti )
1.2.3
for the sum of all the combinations relative to the variations of t1 , t2 , t3 , t4 . By contin-
uing thus, we will find generally that the total number of the combinations which give

53
equation (1), under the supposition where the variables t1 , t2 , . . . , ti can be extended
indefinitely from zero, is

(s + i − 1)(s + i − 2)(s + i − 3) · · · (s + 1)
(a)
1.2.3 . . . (i − 1)
but, in the present question, these variables can not be extended beyond n. In order [259]
to express this condition, we will observe that, the urn containing n + 1 balls, the
1
probability to extract any one among them is n+1 ; thus the probability of each of the
1
values of t1 , from zero to n, is n+1 . The probability of the values of t1 , equals or
superiors to n + 1, is null; we can therefore represent it by 1−l
n+1

n+1 , provided that we


make l = 1 in the result of the calculation; then the probability of any value of t1 can
be generally expressed by 1−l
n+1

n+1 , provided that we make l to begin, only when t1 will


have attained n + 1, and that we suppose it at the end equal to unity; it is likewise
of the probabilities of the other variables. Now, the probability of equation (1) is the
product of the probabilities of the values of t1 , t2 , t3 , . . .; this probability is therefore
 n+1 i
1−l
n+1 ; the number of combinations which give this equation, multiplied by their
 n+1 i
respective probabilities, is thus the product of the fraction (a) by 1−l n+1 , or


i
(s + 1)(s + 2)(s + i − 1) 1 − ln+1
(b) ;
1.2.3 . . . (i − 1) n+1

but it is necessary, in the development of this function, to apply ln+1 only to the com-
binations in which one of the variables begins to surpass n; it is necessary to apply
l2n+2 only to the combinations in which two of the variables begin to surpass n, and
thus of the rest. If in equation (1) we suppose that one of the variables, t1 , for example,
surpasses n, by making t1 = n + 1 + t , this equation becomes

s − n − 1 = t1 + t2 + t3 + · · · ,
the variable t1 can be extended indefinitely. If two of the variables such as t1 and t2
surpass n, by making

t1 = n + 1 + t1 , t2 = n + 1 + t2 ,
the equation becomes [260]

s − 2n − 2 = t1 + t2 + t3 + · · · ,
and thus consecutively. We must therefore, in the function (a) which we have de-
rived from equation (1), diminish s by n + 1, relatively to the system of variables
t1 , t2 , t3 , . . . We must diminish it by 2n + 2, relatively to the variables t1 , t2 , t3 , . . .,
and thus of the rest. It is necessary consequently, in the development of the function
(b) with respect to the powers of l, to diminish, in each term, s from the exponent of
the power of l; by making next l = 1, this function becomes

54

⎪ (s + 1)(s + 2) · · · (s + i − 1) i(s − n)(s − n + 1) · · · (s + i − n − 2)

⎨ 1.2.3 . . . (i − 1)(n + 1)i − 1.2.3 . . . (i − 1)(n + 1)i
(c)

⎪ i(i − 1) (s − 2n − 1)(s − 2n) · · · (s + i − 2n − 3)
⎩ + − ··· ,
1.2 1.2.3 . . . (i − 1)(n + 1)i
the series must be continued until one of the factors s − n, s − 2n − 1, s − 3n − 2, . . .
becomes null or negative.
This formula gives the probability to bring forth a given number s, by projecting
i dice with a number n + 1 faces on each, the smallest number marked on the faces
being 1. It is clear that this reverts to supposing in the preceding urn all the number of
the balls increased by unity, and then the probability to bring forth the number s + i in
i drawings is the same as that of bringing forth the number s in the case that we just
considered; now, by making s + i = s , we have s = s − i; formula (c) will give
therefore, for the probability to bring forth the number s by projecting the i dice,
(s − 1)(s − 2) · · · (s − i + 1) i(s − n − 2)(s − n − 3) · · · (s − i − n)

1.2.3 . . . (i − 1)(n + 1)i 1.2.3 . . . (i − 1)(n + 1)i
i(i − 1) (s − 2n − 3)(s − 2n − 4) · · · (s − i − 2n − 1)
+ − ···
1.2 1.2.3 . . . (i − 1)(n + 1)i
Formula (c), applied to the case where s and n are infinite numbers, is transformed
into the following
  s i−1 i(i − 1)  s i−1 
1 s i−1
−i −1 + −2 − ··· .
1.2.3 . . . (i − 1)n n n 1.2 n
This expression can serve to determine the probability that the sum of the inclinations [261]
to the ecliptic of a number i of orbits will be comprehended within some given limits,
by supposing that, for each orbit, all the inclinations from zero to the right angle are
equally possible. Indeed, if we imagine that the right angle 12 π is divided into an infinite
number n of equal parts, and if s contains an infinite number of these parts, by naming
φ the sum of the inclinations of the orbits, we will have
s φ
= 1 .
n 2π

By multiplying therefore the preceding expression by ds or by n1dφ , and by integrating



it from φ −  to φ + , we will have

i
i
i ⎫

⎪ φ+ φ+ i(i − 1) φ +  ⎪


⎨ 1 − i 1 − 1 + 1 − 2 − · · · ⎪

1 2 π 2 π 1.2 2 π
(o)
1.2.3 . . . i ⎪
i
i
i ⎪
;

⎪ φ− φ− i(i − 1) φ −  ⎪

⎩− 1 +i 1 −1 − 1 − 2 + · · ·⎭
2π 2π
1.2 2π

this is the expression of the probability that the sum of the inclinations of the orbits will
be comprehended within the limits φ −  to φ + .

55
Let us apply this formula to the orbits of the planets. The sum of the inclinations
of the orbits of the planets to that of the Earth was 91.4187◦ at the beginning of 1801:
there are ten orbits, without including the ecliptic; we have therefore here i = 10. We
make next
φ −  = 0,
φ +  = 91.4187◦ .
The preceding formula becomes thus, by observing that 12 π or the quarter of the cir-
cumference is 100◦ ,
1
(0.914187)10 .
1.2.3 . . . 10
This is the expression of the probability that the sum of the inclinations of the orbits
will be comprehended within the limits zero and 91.4187◦ , if all the inclinations were
equally possible. This probability is therefore 0.0000001235. It is already very small; [262]
but it is necessary next to combine it with the probability of a very remarkable cir-
cumstance in the system of the world, and which consists in this that all the planets
are moved in the same sense as the Earth. If the direct and retrograde movements are
10
supposed equally possible, this last probability is 12 ; it is necessary therefore to
10
multiply 0.0000001235 by 12 , in order to have the probability that all the move-
ments of the planets and of the Earth will be directed in the same sense, and that the
sum of their inclinations to the orbit of the Earth will be comprehended within the lim-
its zero and 91.4187◦ ; we will have thus 1.0972 (10)10 for this probability, that which gives
1 − 1.0972
(10)10 for the probability that this had not ought to take place, if all the inclina-
tions, in the same way the direct and retrograde movements, have been equally facile.
This probability approaches so to certainty, that the observed result becomes unlikely
under this hypothesis; this result indicates therefore, with a very great probability, the
existence of an original cause which has determined the movements of the planets to
bring themselves together to the plane of the ecliptic or, more naturally, to the plane of
the solar equator and to be moved in the sense of the rotation of the Sun. If we consider
next that the eighteen satellites observed until now make their revolution in the same
sense, and that the observed rotations in the number of thirteen in the planets, the satel-
lites and the ring of Saturn, are yet directed in the same sense; finally, if we consider
that the mean of the inclinations of the orbits of these stars and of their equators to the
solar equator is quite removed from reaching a half right angle, we will see that the
existence of a common cause, which has directed all these movements in the sense of
the rotation of the Sun and onto some planes slightly inclined to the one of its equator,
is indicated with a probability quite superior to the one of the greatest number of the
historical facts on which we permit no doubt.
Let us see now if this cause has influence on the movement of the comets. The
number of these which we have observed until the end of 1811, by counting for the [263]
same the diverse apparitions of the one of 1759, is raised to one hundred, of which fifty-
three are direct, and forty-seven are retrograde. The sum of the inclinations of the orbits
of the first is 2657.993◦ , and that of the inclinations of the other orbits is 2515.684◦ :
the mean inclination of all these orbits is therefore 51.73677◦ ; consequently the sum of

all the inclinations is i.π
4 + i.1.73677 , i being here equal to 100. We see already that

56
the mean inclination surpassing the half right angle, the comets, far from participating
in the tendency of the bodies of planetary system, in order to be moved in some planes
slightly inclined to the ecliptic, appear to have a contrary tendency. But the probability
of this tendency is very small. Indeed, if we suppose, in formula (o),
i.π
φ= ,  = i.1.73677◦ ,
4
it becomes

i
i ⎫

⎪ 4i.1.73677◦ 4i.1.73677◦ ⎪


⎪ i+ −i i+ −2 ⎪


⎪ π π ⎪






⎪ ◦ i ⎪


⎪ i(i − 1) 4i.1.73677 ⎪


⎨ + i + − 4 − · · · ⎪

1 1.2 π
(p)
1.2.3 . . . i.2i ⎪
i
i⎪
,

⎪ 4i.1.73677◦ 4i.1.73677◦ ⎪


⎪ − i− +i i− −2 ⎪ ⎪

⎪ π π ⎪






⎪ ◦ i ⎪


⎪ i(i − 1) 4i.1.73677 ⎪

⎩ − i− − 4 + ··· ⎭
1.2 π

π being 200◦ . This is the expression of the probability that the sum of the inclinations
of the orbits of the i comets must be comprehended within the limits ±i.1, 73677◦ . The
number of terms of this formula and the precision with which it would be necessary
to have each of them renders the calculation impractical; it is necessary to recur to the
methods of approximation developed in the second part of Book I. We have, by §42 of
the same Book,
√ √ √
(i + r i)i − i(i + r i − 2)i + i(i−1) 1.2 (i + r i − 4) − · · ·
i

1.2.3 . . . i.2i

1 3 3 2 3 3 3 2
= + dr c− 2 r − r(1 − r2 )c− 2 r ,
2 2π 20.i 2π
the powers of the negative quantities being here excluded, as they are in the preceding [264]
formula; by making therefore
√ 4i.1.73677◦
r i= ,
200◦
formula (p) becomes

3 − 32 r 2 3 3 3 2
2 dr c − r(1 − r2 )c− 2 r
2π 10.i 2π
the integral being taken from r null. We find thus 0.474 for the probability that the
inclination of the 100 orbits must fall within the limits 50◦ ± 1.17377◦ ; the probability
that the mean inclination must be inferior to the observed inclination is therefore 0.737.
This probability is not great enough in order that the observed result makes a rejection
of the hypothesis of an equal facility of the inclinations of the orbits, and in order to

57
indicate the existence of an original cause which has influence on these inclinations, a
cause which we cannot forbid to admit in the inclinations of the orbits of the planetary
system.
The same thing holds with respect to the sense of the movement. The probability
that, out of 100 comets, 47 moreover will be retrogrades, is the sum of the 48 first terms
of the binomial (p + q)100 , by making in the result of the calculation p = q = 12 . But
the sum of the 50 first terms, plus the half of the 51st or the middle term, is the half of
100
the entire binomial, or of 12 + 12 , that is 12 ; the sought probability is therefore

1 100.99 . . . 51 1 50 50.49 1 1.2.3 . . . 100.1594


− + + or = − .
2 1.2.3 . . . 50.2100 2 51 51.52 2 (1.2.3 . . .)2 .2100 .663

By virtue of the theorem


1 1 √
1.2.3 . . . s = ss+ 2 c−s 1 + + ··· 2π,
12s

we have, very nearly,


1 1 √
1.2.3 . . . 100 = 100100+ 2 c−100 1 + 2π,
1200

1
2100 (1.2.3 . . . 50)2 = 100100+1 c−100 1 + π.
300

The preceding probability becomes thus [265]


1 1 1197.1594
−√ = 0.3046.
2 50π 1200.663
This probability is much too great to indicate a cause which has favored, at the origin,
the direct movements. Thus the cause which has determined the sense of the move-
ments of the revolution and of the rotation of the planets and of the satellites seems to
have no influence on the movement of the comets.

§14. The method of the preceding section has the advantage to be extended to the
case where the number of balls of the urn which bear the same label is not equal to
unity, but varies according to any law whatsoever. Let us imagine, for example, that
there is only one ball bearing the no 0, that one ball bearing only the no 1, and thus
consecutively until no r inclusively. Let us suppose moreover that there are two balls
bearing the no r + 1, two balls bearing the no r + 2, and thus consecutively until no
n inclusively. The total number of balls in the urn will be 2n − r + 1, the probability
1
to extract from it one of the labels inferior to r + 1 will be therefore 2n−r+1 , and
o
the probability to extract from it the n r + 1 or one of the superior labels will be
2 1+lr+1
2n−r+1 : we will represent it by 2n−r+1 ; but we will make l = 1 in the result of the
calculation. Although there are no labels beyond no n, we will be able however to
consider in the urn some labels superior to n, to infinity, provided that we will give to
their extraction a null probability; we will be able therefore to represent this probability

58
by 1+l2n−r+1
−2l
r+1 n+1
, by making l = 1 in the result of the calculation. By this artifice,
we will be able to represent generally the probability of any label whatsoever by the
preceding expression, provided that we will make lr+1 commence only when the labels
will commence to surpass r, and that we will make ln+1 commence only when one of
the labels will commence to surpass n. This premised, we will find, by applying here [266]
the reasonings of the previous section, that the probability to bring forth the number s
in i drawings is equal to

(s + i − 1)(s + i − 2)(s + i − 3) · · · (s + 1)
(1 + lr+1 − 2ln+1 )i ,
1.2.3 . . . (i − 1)(2n − r + 1)i

provided that, in the development of this function according to the powers of l, we


diminish in each term s from the exponent of the power of l, that we suppose next
l = 1 and that we arrest the series when we arrive to some negative factors.

§15. Let us apply now this method to the investigation on the mean result that any
number of observations of which the laws of facility of the errors are known must give.
For this, we will resolve the following problem:
Let i variable and positive quantities be t, t1 , t2 , . . . , ti−1 , of which the sum is
s, and of which the law of possibility is known; we propose to find the sum of the
products of each value that a given function ψ(t, t1 , t2 , . . .) of these variables is able
to receive, multiplied by the probability corresponding to this value.
Let us suppose, for more generality, that the functions which express the possi-
bilities of the variables t, t1 , . . . are discontinuous, and let us represent by φ(t) the
possibility of t, from t = 0 to t = q, by φ (t) + φ(t) its possibility from t = q to
t = q  , by φ (t) + φ (t) + φ(t) its possibility from t = q  to t = q  , and thus con-
secutively to infinity. Let us designate next the same quantities relative to the variables
t1 , t2 , . . . by the same letters, by writing respectively at the base the numbers 1, ,2, 3,
. . . , so that q1 , q1 , . . . ; φ1 (t1 ), φ1 (t1 ), . . . correspond, relatively to t1 , to that which q,
q  , . . . , φ(t), φ (t), . . . are respectively to t, and thus consecutively. In this manner to
represent the possibilities of the variables, it is clear that the function φ(t) holds from
t = 0 to t infinity; that the function φ (t) holds from t = q to t infinity, and thus con-
secutively. In order to recognize the values of t, t1 , t2 , . . . when these diverse functions
begin to hold, we will multiply, conformably to the method exposed in the preceding [267]
 .
sections, φ(t) by l0 or unity, φ (t) by lq , φ (t) by lq , . . ; we will multiply similarly
φ1 (t1 ) by unity, φ1 (t1 ) by lq1 , and thus consecutively; the exponents of the powers of
l will indicate then these values. It will suffice next to make l = 1 in the last result of
the calculation. By means of these very simple artifices, we are able to easily resolve
the proposed problem.
The probability of the function ψ(t, t1 , t2 , . . .) is evidently equal to the product of
the probabilities of t, t1 , t2 , . . ., so that, if we substitute for t its values s−t1 −t2 −· · · ,
that the equation gives
t + t1 + t2 + · · · + ti−1 = s,

59
the product of the proposed function by its probability will be


⎪ ψ(s − t1 − t2 − · · · , t1 , t2 , . . .)



⎪ q  q  
⎨ × [φ(s − t1 − t2 − · · · ) + l φ (s − t1 − t2 − · · · ) + l φ (s − t1 − t2 − · · · ) + · · · ]


(A) × [φ1 (t1 ) + lq1 φ1 (t1 ) + lq1 φ1 (t1 ) + · · · ]



⎪ × [φ2 (t2 ) + lq2 φ (t2 ) + lq2 φ (t2 ) + · · · ]

⎪ 2 2


× .............................................

We will have therefore the sum of all these products: 1◦ by multiplying the preceding
quantity by dt1 , and by integrating for all the values of which t1 is susceptible; 2◦
by multiplying this integral by dt2 and by integrating for all the values of which t2
is susceptible, and thus consecutively to the last variable ti−1 ; but these successive
integrations require some particular attentions.
Let us consider any term whatsoever of the quantity (A), such as

lq+q1 +q2 +··· ψ(s − t1 − t2 − · · · , t1 , t2 , . . .)φ (s − t1 − t2 − · · · )φ1 (t1 )φ2 (t2 ) · · · ;

by multiplying it by dt1 , it is necessary to integrate for all the possible values of t1 ;


now the function φ (s − t1 − t2 − · · · ) holds only when t, of which the value is s − t1 −
t2 − · · · , equals or surpasses q; the greatest value that t1 is able to receive is therefore
s − q − t2 − t3 − · · · Moreover, φ1 (t1 ) holding only when t1 is equal or greater than [268]
q1 , this quantity is the smallest value that t1 is able to receive; it is necessary therefore
to take the integral of which there is concern from t1 = q1 to

t 1 = s − q − q 1 − t2 − t 3 − · · · ;

or, that which reverts to the same, from t1 − q1 = 0 to

t1 − q1 = s − q − q1 − t2 − t3 − · · ·

We will find in the same manner that by multiplying this new integral by dt2 , it will be
necessary to integrate it from t2 − q2 = 0 to

t2 − q2 = s − q − q1 − q2 − t3 − · · ·

By continuing to operate thus, we will arrive to a function of

s − q − q1 − q2 − · · · ,

in which there will remain none of the variables t, t1 , t2 , . . . This function must be
rejected, if s − q − q1 − q2 − · · · is null or negative; because it is clear that, in this
case, the system of functions φ (t), φ1 (t1 ), φ2 (t2 ), . . . are not able to be employed. In
fact, the smallest values of t1 , t2 , . . . being, by the nature of these functions, equals
to q1 , q2 , . . ., the greatest value that t is able to receive is s − q1 − q2 − · · · ; thus the
greatest value of t − q is
s − q − q1 − q2 − · · · ;
now the function φ (t) is able to be employed only as long as t − q is positive.

60
Thence results a very simple solution of the proposed problem. Let us substitute:
1◦ q + t instead of t in φ (t), q  + t instead of t in φ (t), q  + t instead of t in φ (t)
and thus consecutively; 2◦ q1 + t1 instead of t1 in φ1 (t1 ), q1 + t1 instead of t1 in
φ1 (t1 ), . . .; 3◦ q2 + t2 instead of t2 in φ2 (t2 ), q2 + t2 instead of t2 in φ2 (t2 ), . . . and
thus consecutively; 4◦ finally, k + t instead of t, k1 + t1 instead of t1 , and thus of the [269]
remainder, into ψ(t, t1 , t2 , . . .); the function (A) will become

⎪ ψ(k + s − t1 − t2 − t3 − · · · , k1 +tq1 , k 2 + t2 , . . .)



⎨ ×[φ(s − t1 − t2 − t3 − · · · )+l φ (s + q − t1 − t2 − · · · )

(A ) 
+lq φ (s + q  − t2 − t3 − · · · ) + · · · ]

⎪ 

⎪ ×[φ1 (t1 ) − lq1 φ1 (q1 + t1 )+lq1 φ1 (q1 + t1 ) + · · · ]

×[φ2 (t2 ) − lq2 φ2 (q2 + t2 )+ · · · ].

By multiplying this function by dt1 , we will integrate it from t1 null to t1 = s −


t2 − t3 − · · · . We will multiply next this first integral by dt2 , and we will integrate
it from t2 null to t2 = s − t3 − t4 − · · · . By continuing thus, we will arrive to a
last integral, which will be a function of s, and which we will designate by Π(s), and
this function will be the sought sum of all the values of ψ(t, t1 , t2 , . . .), multiplied by
their respective probabilities. But for this it is necessary to take care to change in any
term whatsoever, multiplied by a power of l, such as lq+q1 +q2 +··· , k in the part of the
exponent of the power relative to the variable t, and which in this case is q; and, if this
part is lacking, it is necessary to suppose k equal to zero. It is similarly necessary to
change k1 in the part of the exponent relative to the variable t1 , and thus consecutively;
it is necessary to diminish s from the entire exponent of the power of l, and to write
thus, in the present case, s − q − q1 − q2 − · · · , instead of s, and to reject the term, if
s, thus diminished, becomes negative. Finally it is necessary to suppose l = 1.
If ψ(t, t1 , t2 , . . .), φ(t), φ (t), . . . , φ1 (t1 ), . . . are rational and entire functions of
the variables t, t1 , t2 , . . . of their exponentials and of sines and cosines, all the suc-
cessive integrations will be possible, because it is of the nature of these functions to
reproduce themselves by the integrations. In the other cases, the integrations would not
be able to be possible; but the preceding analysis reduces then the problem to quadra-
tures. The case of the rational and entire functions offer some simplifications that we
are going to expose.
Let us suppose that we have [270]

φ(t) + lq φ (q + t) + lq φ (q  + t) + · · · = A + Bt + Ct2 + · · ·
q1  
φ1 (t1 ) + l φ1 (q1 + t1 ) + l φ1 (q1 + t1 ) + · · ·= A1 + B1 t1 + C1 t21 + · · ·
q1 

φ2 (t2 ) + lq2 φ2 (q2 + t2 ) + lq2 φ2 (q2 + t2 ) + · · ·= A2 + B2 t2 + C2 t22 + · · ·
......................................... .....................,

and let us designate by Htn tn1 1 tn2 2 · · · any term whatsoever of

ψ(k + t, k1 + t1 , k2 + t2 , . . .);

61
it is easy to be assured that the part of Π(s) corresponding to this term is

⎪ 1.2.3 . . . n.1.2.3 . . . n1 .1.2.3 . . . n2 . . . Hsi+n+n1 +n2 +···−1



⎪ 2
⎨ × [A + (n + 1)Bs + (n + 1)(n + 2)Cs + · · · ]

(B) × [A1 + (n1 + 1)B1 s + (n1 + 1)(n1 + 2)C1 s2 + · · · ]



⎪ × [A2 + (n2 + 1)B2 s + (n2 + 1)(n2 + 2)C2 s2 + · · · ]



× . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .,

provided that, in the development of this quantity, instead of any power whatsoever a of
sa
s, we write 1.2.3...a . We will have next the corresponding part of the entire sum of the
values of ψ(t, t1 , t2 , . . .), multiplied by their respective probabilities, by changing any
term of this development, such as Hλtμ sa into Hλ(s − μ)a , and by substituting into
H, instead of k, the part of the exponent μ which is relative to the variable t, instead of
k1 the part relative to t1 , and thus of the remainder.
If in formula (B) we suppose H = 1, and n, n1 , n2 , . . . nulls, we will have the sum
of the values of unity multiplied by their respective probability; now it is clear that this
sum is nothing other than the sum of all the combinations in which the equation

t + t1 + t2 + · · · + ti−1 = s

holds, multiplied by their probability; it expresses consequently the probability of this [271]
equation. If, under the preceding hypotheses, we suppose moreover that the law of
probability is the same for the first r variables t, t1 , t2 , . . . , tr−1 , and that, for the last
i − r, it is again the same, but different, than for the first, we will have

A = A1 = A2 = · · · = Ar−1 ,
B = B1 = B2 = · · · = Br−1 ,
............................ ,
Ar = Ar+1 = · · · = Ai−1 ,
Br = Br−1 = · · · = Bi−1 ,
............................ ,

and formula (B) will be changed into the following:

(C) si−1 (A + Bs + 2Cs2 + · · · )r (Ar + Br s + 2Cr s2 + · · · )i−r .

This formula will serve to determine the probability that the sum of the errors of any
number of observations whatsoever, of which the law of facility of errors is known,
will be comprehended within some given limits.
Let us suppose, for example, that we have i − 1 observations of which the errors
for each observation are able to be extended from −h to +g, and that by naming z the
error of the first of these observations, the law of facility of this error is expressed by
a+bz+cz 2 . Let us suppose next that this law is the same for the errors z1 , z2 , . . . , zi−2
of the other observations, and let us seek the probability that the sum of these errors
will be comprehended within the limits p and p + e.

62
If we make

z = t − h, z1 = t1 − h, z2 = t2 − h, ...,

it is clear that t, t1 , t2 , . . . will be positive and will be able to be extended from zero to
h + g; moreover, we will have

z + z1 + z2 + · · · + zi−2 = t + t1 + t2 + · · · + ti−2 − (i − 1)h;

therefore the greatest value of the sum z + z1 + z2 + · · · + zi−2 being, by assumption,


equal to p + e, and the smallest being equal to p, the greatest value of t + t1 + t2 + [272]
· · · + ti−2 will be (i − 1)h + p + e, and the smallest will be (i − 1)h + p; by making
thus
(i − 1)h + p + e = s
and
t + t1 + t2 + · · · + ti−2 = s − ti−1 ,
ti−1 will always be positive and will be able to be extended from zero to e. This
premised, if we apply in this case formula (C), we will have q = h + g. Besides,
the law of facility of errors z being a + bz + cz 2 , we will conclude from it the law of
facility of t, by changing z into t − h. Let

a = a − bh + ch2 , b = b − 2ch;

we will have a + b t + ct2 for this law; this will be therefore the function φ(t). But,
as, from t = h + g to t infinity, the facility of the values of t is null by hypothesis, we
will have
φ (t) + φ(t) = 0,
that which gives
φ (t) = −(a + b t + ct2 );
therefore, if we make

a = a + b (h + g) + c(h + g)2 ,


b = b + 2c(h + g),
we will have

φ(t) + lq φ (q + t) = a + b t + ct2 − lh+g (a + b t + ct2 ),

and this equation will hold further by changing t into t1 , t2 , . . ., since the law of facility
of the errors is supposed the same for all the observations.
As for the variable ti−1 , we will observe that the probability of the equation

z + z1 + · · · + zi−2 = μ

being, whatever be μ, equal to the product of the probabilities of z, z1 , z2 , . . . , the


probability of the equation

t + t1 + t2 + · · · + ti−2 = s − ti−1

63
will be equal to the product of the probabilities of t, t1 , t2 , . . . ; the law of probability
of ti−1 is therefore constant and equal to unity, and, as this variable must be extended [273]
only from ti−1 = 0 to ti−1 = e, we will have
qi−1 = e, φi−1 (ti−1 ) = 1, φi−1 (ti−1 ) + φi−1 (ti−1 ) = 0
and, consequently,
φi−1 (ti−1 ) = −1,
that which gives
φi−1 (ti−1 ) + lqi−1 φi−1 (qi−1 + ti−1 ) = 1 − le ;
formula (C) will become therefore
(C ) si−1 [a + b s + 2cs2 − lh+g (a + b s + 2cs2 )]i−1 (1 − le ).
Let
(a + b s + 2cs2 )i−1 = a(1) + b(1) s + c(1) s2 + f (1) s3 + · · · ,
(a + b s + 2cs2 )i−2 (a + b s + 2cs2 ) = a(2) + b(2) s + c(2) s2 + · · · ,
(a + b s + 2cs2 )i−3 (a + b s + 2cs2 ) = a(3) + b(3) s + c(3) s2 + · · · ,
.................................... ................................
The preceding formula (C ) will give, by changing any term whatsoever, such as λlμ sa ,
a
into λ(s−μ)
1.2.3...a ,
⎧ (1)  i−1  ⎫

⎪ a s − (s − e)i−1 ⎪


⎪ ⎪


⎪ (1)   ⎪


⎪ b i
− − i ⎪


⎪ + s (s e) ⎪


⎪ i ⎪


⎪ ⎪


⎪ c (1)  i+1  ⎪


⎪ + s − (s − e) i+1 ⎪


⎪ i(i + 1) ⎪


⎪ ⎪


⎪ ⎪


⎪ + . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ⎪


⎪ ⎧   ⎫ ⎪


⎪ ⎪ (2)
− − i−1
− − − − i−1
⎪ ⎪


⎪ ⎪

a (s h g) (s h g e) ⎪
⎪ ⎪


⎪ ⎪
⎪ ⎪
⎪ ⎪


⎪ ⎪
⎪ b (2)   ⎪
⎪ ⎪


⎪ ⎪
⎨ + (s − h − g) i
− (s − h − g − e) i ⎪
⎬ ⎪


⎪ ⎪

1 ⎨ −(i − 1)
i ⎬

⎪ c (2)   ⎪


1.2.3 . . . (i − 1) ⎪ ⎪
⎪ + (s − h − g)i+1 − (s − h − g − e)i+1 ⎪ ⎪ ⎪


⎪ ⎪
⎪ i(i + 1) ⎪
⎪ ⎪


⎪ ⎪
⎪ ⎪
⎪ ⎪


⎪ ⎩ ⎭ ⎪

⎪ + . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ⎪

⎪ ⎧   ⎫ ⎪


⎪ (3) ⎪


⎪ ⎪
⎪ a (s − 2h − 2g) i−1
− (s − 2h − 2g − e) i−1

⎪ ⎪


⎪ ⎪
⎪ ⎪
⎪ ⎪


⎪ ⎪
⎪ (3)   ⎪
⎪ ⎪


⎪ ⎪
⎪ b ⎪
⎪ ⎪


⎪ ⎨ + (s − 2h − 2g) i
− (s − 2h − 2g − e) i
⎬ ⎪


⎪ (i − 1)(i − 2) i ⎪


⎪ + ⎪


⎪ 1.2 ⎪
⎪ c (3)   ⎪
i+1 ⎪⎪
⎪ ⎪

⎪ ⎪
⎪ + (s − 2h − 2g) i+1
− (s − 2h − 2g − e) ⎪ ⎪


⎪ ⎪ i(i + 1)
⎪ ⎪
⎪ ⎪


⎪ ⎪
⎪ ⎪
⎪ ⎪


⎪ ⎩ ⎭ ⎪


⎪ + . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ⎪


⎩ ⎪

−.....................................................

64
It is necessary to reject from this expression the terms in which the quantity raised [274]
under the sign of the powers is negative.
Let us suppose now that, z, z1 , z2 , . . . representing always the errors of i − 1
observations, the law of facility, so much of the error z as of the negative error −z, is
β(h − z), and that h and −h are the limits of the errors. Let us suppose moreover that
this law is the same for all the observations, and let us seek the probability that the sum
of the errors will be comprehended within the limits p and p + e.
If we make z = t − h, z1 = t1 − h, . . . , it is clear that t, t1 , . . . will be always
positive and will be able to be extended from zero to 2h; but here the law of facility is
discontinuous at two points. From t = 0 to t = h, it is expressed by βt. From t = h
to t = 2h, it is expressed by β(2h − t); finally it is null from t = 2h to t infinity. We
have therefore
q = h, q  = 2h;
we have next
φ(t) = βt,

φ (t) + φ(t) = (2h − t)β,
φ (t) + φ (t) + φ(t) = 0,
that which gives
φ (t) = (2h − 2t)β, φ (t) = (t − 2h)β.
Thus we have in this case

φ(t) + lq φ (q + t) + lq φ (q  + t) = βt(1 − lh )2 ,
an equation which holds further by changing t into t1 , t2 , . . . Presently we have
z + z1 + z2 + · · · + zi−2 = t + t1 + t2 + · · · + ti−2 − (i − 1)h;
therefore the sum of the errors z, z1 , . . . needing to be, by hypothesis, contained within
the limits p and p + e, the sum of the values of t, t1 , . . . , ti−2 will be comprehended
within the limits (i − 1)h + p and (i − 1)h + p + e; so that, if we make
t + t1 + t2 + · · · + ti−2 = s − ti−1 ,
s being supposed equal to (t − 1)h + p + e, ti−1 will be able to be extended from zero [275]
to e, and we will see, as in the preceding example, that its facility must be supposed
equal to unity in this interval, and that it must be supposed null beyond this interval;
thus we have
qi−1 = e and φi−1 (ti−1 ) + lqi−1 φi−1 (ti−1 ) = 1 − le .

This premised, if we observe that, 2β dz(h − z) being the probability that the error
of an observation is comprehended within the limits −h and +h, that which is certain,
we have β = h12 ; formula (C) will give, for the expression of the sought probability,
⎧ 2i−2 ⎫
⎪ s − (s − e)2i−2 ⎪

⎪   ⎪


⎪ 2i−2 2i−2 ⎪

⎨ − (2i − 2) (s − h) − (s − h − e) ⎬
1
(2i − 2)(2i − 3)   ,
1.2.3 . . . (2i − 2)h2i−2 ⎪
⎪ + (s − 2h)2i−2 − (s − 2h − e)2i−2 ⎪⎪

⎪ ⎪



1.2 ⎪

− .............................................

65
by taking care to reject all the terms in which the quantity elevated to the power 2i − 2
is negative.
We will next apply this analysis to the following problem. If we imagine a number
i of points ranked in a straight line, and on these points some ordinates, of which the
first is at least equal to the second, the latter at least equal to the third, and thus consec-
utively, and that the sum of these i ordinates are constantly equal to s, by supposing s
partitioned into an infinity of parts, we are able to satisfy the preceding conditions, in
an infinity of ways. We propose to determine the value of each of the ordinates, a mean
among all the values that it is able to receive.
Let z be the smallest ordinate, or the ith ordinate; let z + z1 be the (i − 1)st ordinate;
let z + z1 + z2 be the (i − 2)nd ordinate, and thus consecutively to the first ordinate
which will be z + z1 + · · · + zi−1 . The quantities z, z1 , z2 , . . . will be either nulls or
positives, and their sum iz +(i−1)z1 +(i−2)z2 +· · ·+zi−1 will be, by the conditions
of the problem, equal to s. Let [276]

iz = t, (i − 1)z1 = t1 , (i − 2)z2 = t2 , ..., zi−1 = ti−1 ;

we will have
t + t1 + t2 + · · · + ti−1 = s;
the variables t, t1 , t2 , . . . will be able to be extended to s. The rth ordinate will be
t t1 ti−r
+ + ··· + .
i i−1 r
It is necessary to determine the sum of all the variations that this quantity is able to
receive, and to divide it by the total number of these variations, in order have the mean
ordinate. Formula (B) gives very easily this sum, by observing that here
t t1 ti−r
ψ(t, t1 , t2 , . . .) = + + ··· + ,
i i−1 r
and we find it equal to

si 1 1 1
+ + ··· + .
1.2.3 . . . i i i−1 r

By dividing this quantity by the total number of combinations, which is able to be only
a function of i and of s and which we will designate by N , we will have, for the mean
value of the rth ordinate,

si 1 1 1
+ + ··· + .
1.2.3 . . . iN i i−1 r

In order to determine N , we will observe that all the mean values must together equal
s, that which gives
si−1
N= ;
1.2.3 . . . (i − 1)

66
the mean value of the rth ordinate is therefore

s 1 1 1
() + + ··· + .
i i i−1 r

Let us suppose that an observed effect has been able to be produced only by one of
the i causes A, B, C, . . ., and that a person, after having estimated their respective [277]
probabilities, writes on a ticket the letters which indicate these causes, in the order of
the probabilities that he attributes to them, by writing the first the letter indicating the
cause which seems to him most probable. It is clear that we will have, by the preceding
formula, the mean value of the probabilities that he is able to suppose to each of them,
by observing that here the quantity s, that we must apportion on each of the causes,
is certitude or unity, since the person is assured that the effect must result from one
of them. The mean value of the probability that he attributes to the cause that he has
placed on his ticket at the rth rank is therefore

1 1 1 1
+ + ··· + .
i i i−1 r

Thence it follows that, if a tribunal is summoned to decide on this object, and if each
member expresses his opinion by a ticket similar to the preceding, then, by writing
on each ticket, beside the letters which indicate the causes, the mean values which
correspond to the rank that they have on the ticket, by making next a sum of all the
values which correspond to each cause on the diverse tickets, the cause to which will
correspond the greatest sum will be that which the tribunal will judge most probable.
This rule is not at all applicable to the choice of the electoral assemblies, because
the electors are not at all obliged, as the judges, to apportion one same sum taken
for unity on the diverse parts among which they must be determined; they are able
to suppose to each candidate all the nuances of merit comprehended between the null
merit and the maximum of merit, which we will designate by a; the order of the names
on each ticket does only to indicate that the elector prefers the first to the second, the
second to the third, etc. We will determine thus the numbers that it is necessary to write
on the ticket beside the names of the candidates.
Let t1 , t2 , t3 , . . . , ti be the respective merits of the i candidates in the opinion of
the elector, t1 being the merit that he supposes to the one of the candidates who he [278]
has set at the first rank, t2 being  the merit that he supposes at the second, and thus
consecutively. The integral tr dt1 dt2 . . . dti will express the sum of the merits that
the elector is able to attribute to candidate r, provided that we integrate first with respect
to ti , from ti = 0 to ti = ti−1 , next with respect to ti−1 , from ti−1 to ti−2 , and thus
consecutively, to the integral relative to t1 , which we will take from t1 null to t1 = a.
Because it is clear that then ti never surpasses  ti−1 , ti−1 never surpasses ti−2 , . . . . By
dividing the preceding integral by this here dt1 dt2 . . . dti which expresses the total
sum of the combinations in which the preceding condition is fulfilled, we will have the
mean expression of the merit which the elector is able to attribute to the rth candidate.
In executing the integrations, we find i−r+1 i+1 a for this expression.
Thence it follows that we are able to write on the ticket of each elector i beside
the first name, i − 1 beside the second, i − 2 beside the third, . . . By uniting next all

67
the numbers relative to each candidate on the diverse tickets, the one of the candidates
who will have the greatest sum must be presumed the candidate who, in the eyes of the
electoral Assembly, has the greatest merit, and must consequently be chosen.
This mode of election would be without doubt the better, if some strange considera-
tions in the merit did not influence at all often with respect to the choice of the electors,
even the most honest, and did not determine them at all to place in the last ranks the
most formidable candidates to the one who they prefer, that which gives a great advan-
tage to the candidates of a mediocre merit. Also experience has caused to abandon it
in the establishments which have adopted it.
Let us suppose that the errors of an observation are able to be extended within the
limits +a and −a, but that ignoring the law of probability of these errors we subject it
only to the condition to give to them a probability so much smaller as they are greater,
the probability of the positive errors being supposed the same as that of the correspond-
ing negative errors, all things that it is natural to admit. Formula () will give again the [279]
mean law of the errors. For this we will imagine the interval a partitioned into an infi-
a x
nite number i of parts represented by dx, so that i = dx ; we will make next r = dx ;
formula () becomes thus 
s dx dx
,
a x
the integral being taken from x = x to x = a. In the present question s = 12 ; because
the error must fall within the limits −a and +a, the probability that it will fall within
the limits zero and a is 12 ; it is the quantity s that it is necessary to apportion on all the
points of the interval a; formula () becomes then

dx a
log .
2a x
Thus the mean law of the probabilities of the positive errors x, or negatives −x, is
1 a
log .
2a x

68
BOOK II
CHAPTER III
DES LOIS DE LA PROBABILITÉ QUI RÉSULTENT DE LA MULTIPLICATION
INDÉFINIE DES ÉVÉNEMENTS

Pierre Simon Laplace∗


Théorie Analytique des Probabilités OC §§16–17, pp. 280–308

ON THE LAWS OF PROBABILITY WHICH RESULT FROM THE INDEFINITE


MULTIPLICATION OF EVENTS

p being the probability of the arrival of a simple event at each trial and 1 − p that of its
non-arrival, to determine the probability that, out of a very great number n of trials, the
number of times that the event will take place will be comprehended within some given
limits. Solution of the problem. The the most probable number of times is np. Expression
of the probability that this number of times will be comprehended within the limits np±l.
The limits ±l remaining the same, this probability increases with the number n of trials:
the probability remaining the same, the ratio of the interval 2l of the limits to the number
n is tightened when n increases, and, in the case of n infinite, this ratio becomes null
and the probability is changed into certitude. The solution of the preceding problem
serves further to determine the probability that the value of p, supposed unknown, is
comprehended within some given limits, when, out of a very great number of trials n, we
know the number i of events corresponding to p which are arrived: p is very nearly ni , and
generally when, in a trial, there must arrive any one of many simple events, the respective
probabilities of these events are very nearly proportionals to the number of times that they
will arrive in a very great number n of trials. P being the probability of the arrival of an
event composed of two simple events, of which p and 1−p are the respective probabilities
and 1 − P being the probability of the non-arrival of this composite event, if out of a very
great number n of arrivals and of non-arrivals of the same event, we know the number i
of these arrivals, we have the probability that the value of P will be comprehended within
some given limits, and, as P is a known function of p, we conclude from it the probability
that the value of p will be comprehended within some given limits. No 16.
An urn A containing a very great number n of white and black balls, at each drawing, we
extract one from it that we replace with a black ball; we demand the probability that,
after r drawings, the number of white balls will be x.
∗ Translated by Richard J. Pulskamp, Department of Mathematics & Computer Science, Xavier Univer-

sity, Cincinnati, OH. January 6, 2014

1
The solution of the problem depends on a linear equation in partial finite differences of the first
order, with variable coefficients. Reduction of this equation to an equation in the infinitely
small partial differences. Integration of this last equation. Application of the solution to
the case where the urn is originally filled in this manner: we project a right prism of which
the base, being a regular polygon of p + q sides, is narrow enough in order that the prism
never falls on it; on the p + q lateral faces, p are white and q are black and we put, into
urn A, at each projection, a ball of the color of the face on which the prism falls again.
Two urns A and B each contain a very great number n of white and black balls, the number of
whites being equal to the one of the blacks in the totality 2n of balls; we draw at the same
time a ball from each urn, and we place again into one urn the ball extracted from the
other. By repeating this operation any number r times, we demand the probability that
there will be x white balls in urn A.
The problem depends on a linear equation in the partial finite differences of the second order,
with variable coefficients. Reduction of this equation to an equation in the infinitely
small partial differences of the second order. Integration of this last equation by means
of a definite integral. Development of this integral into series. Determination of the
constants of the series by means of its initial value. Analytic theorems relative to this
object. Application of the solution in the case where urn A is originally filled, as in the
preceding problem. Mean value of the white balls in each urn, after r drawings. General
expression of this value, in the case where we have a number e of urns disposed circularly
and each containing a great number n of balls, some white and the others black, each
drawing consisting in extracting at the same time one ball from each urn and placing it
again into the following, by departing from one of them, in a determined sense. No 17.

2
§16. In measure as the events are multiplied, their respective probabilities are de- [280]
veloped more and more; their mean results and the profits or the losses which depend
on them converge toward some limits to which they bring together with the probabili-
ties always increasing. The determination of these increases and of these limits is one
of the most interesting and most delicate parts of the analysis of chances.
Let us consider first the manner in which the possibilities of two simple events, of
which one alone must arrive at each trial, is developed when we multiply the number of
trials. It is clear that the event of which the facility is greatest must probably arrive more
often in a given number of trials, and we are carried naturally to think that by repeating
the trials a very great number of times, each of these events will arrive proportionally
to its facility, that we will be able thus to discover by experience. We will demonstrate
analytically this important theorem.
We have seen in §6 that, if p and 1 − p are the respective probabilities of two events
a and b, the probability that in x + x trials the event a will arrive x times and the event
b, x times, is equal to

1.2.3 . . . (x + x ) 


px (1 − p)x ;
1.2.3 . . . x.1.2.3 . . . x

this is the (x + 1)st term of the binomial [p + (1 − p)]x+x . Let us consider the greatest [281]
kp x
of these terms that we will designate by k. The anterior term will be 1−p x+1 , and the
1−p x
following term will be k p x +1 . In order that k be the greatest term, it is necessary
that we have
x p x+1
< < ;
x + 1 1−p x
it is easy to conclude from it that, if we make x + x = n, we will have

(n + 1)p − 1 < x < (n + 1)p;

thus x is the greatest whole number comprehended within (n + 1)p; by making there-
fore
x = (n + 1)p − s,
that which gives

x+s x + 1 − s p x+s
p= , 1−p= , =  ,
n+1 n+1 1−p x +1−s

s will be less than unity. If x and x are very great numbers, we will have, very nearly,
p x
= ,
1−p x
that is to say that the exponents of p and of 1 − p in the greatest term of the binomial
are quite nearly in the ratio of these quantities; so that, of all the combinations which
are able to take place in a very great number n of trials, the most probable is that in
which each event is repeated proportionally to its probability.

3
The lth term, after the greatest, is
1.2.3 . . . n 
px−l (1 − p)x +l .
1.2.3 . . . (x − l).1.2.3 . . . (x + l)
We have, by §33 of Book I,
 
n+ 12 −n
√ 1
1.2.3 . . . n = n c 2π 1 + + ··· ,
12n
that which gives [282]
x−l
 
1 1 c 1
= (x − l)l−x− 2 √ 1− − ··· ,
1.2.3 . . . (x − l) 2π 12(x − l)
x 
+l
 
1  −x −l− 12 c 1
= (x + l) √ 1 − − · · · .
1.2.3 . . . (x + l) 2π 12(x + l)
1
Let us develop the term (x − l)l−x− 2 . Its hyperbolic logarithm is
   
1 l
l−x− log x + log 1 − ;
2 x
now we have
 
l l l2 l3 l4
log 1 − = − − 2 − 3 − 4 − ··· ;
x x 2x 3x 4x

we will neglect the quantities of order n1 , and we will suppose that l2 does not surpass
4
at all the order n; then we will be able to neglect the terms of order xl 3 , because x and
x are of order n. We will have thus
   
1 l
l−x− log x + log 1 −
2 x
 
1 l l2 l3
= l−x− log x + l + − − 2,
2 2x 2x 6x
that which gives, by passing again from the logarithms to the numbers,
 
1 l2 1 l l3
(x − l)l−x− 2 = cl− 2x xl−x− 2 1 + − 2 ;
2x 6x
we will have similarly
 
 −l−x − 21 −l− 2x
l 2
−l−x − 12 l l3
(x + l) =c 
x 1 −  + 2 .
2x 6x
x+s
We have next, by that which precedes, p = n+1 , s being less than unity; by making
x−z x
therefore p = n , z will be contained within the limits n+1 and − n−x
n+1 , and conse-
x +z
quently it will be, setting aside the sign, below unity. The value of p gives 1−p = n ; [283]

4
we will have, by the preceding analysis,
  
x +l xx−l xx +l nzl
p x−l
(1 − p) = 1+  ;
nn xx

thence we deduce
1.2.3 . . . n 
px−l (1 − p)x +l
1.2.3 . . . (x − l).1.2.3 . . . (x + l)
√ − nl2  
nc 2xx nzl l(x − x) l3 l3
=√ √ 1+  + − + .
π 2xx xx 2xx 6x2 6x2

We will have the term anterior to the greatest term and which is extended from it at the
distance l, by making l negative in this equation; by uniting next these two terms, their
sum will be √
2 n nl2
√ √ c− 2xx .
π 2xx
The finite integral √
 2 n nl2
√ √ c− 2xx ,
π 2xx 

taken from l = 0 inclusively, will express therefore the sum of all the terms of the
binomial [p + (1 − p)]n , comprehended between the two terms, of which the we have
px+l for factor, and the other has px−l for factor, and which are thus equidistant from
the greatest term; but it is necessary to subtract from this sum the greatest term which
is evidently contained twice.
Now, in order to have this finite integral, we will observe that we have, by §10 of
Book I, y being function of l,

  −1  0
1 dy 1 dy 1 dy
y= dy = − + + ··· ,
c dl −1 dl 2 dl 12 dl

whence we deduce, by the preceding section,


 
1 1 dy
y = y dl − y + + · · · + const.;
2 12 dl
√ nl2
2 n
y being here equal to √ √
π 2xx
c− 2xx , the successive differentials of y acquire for [284]
nl
factor √2xx and its powers. Thus, l not being supposed to be able to be more than
,

order n, this factor is of order √1n , and consequently its differentials, divided by the
respective powers of dl, decrease more and more; by neglecting therefore, as we have
done previously, the terms of order n1 , we will have, by starting with l the two finite
and infinitely small integrals, and designating by Y the greatest term of the binomial,
 
1 1
y = ydl − y + Y.
2 2

5
The sum of all the terms of the binomial [p + (1 − p)]n contained between
 the two
terms equidistant from the greatest term by the number l being equal to y − 12 Y , it
will be 
1
ydl − y,
2
and if we add there the sum of these extreme terms, we will have, for the sum of all
these terms, 
1
ydl + y.
2
If we make √
l n
t= √ ,
2xx
this sum becomes
 √
2 2 n 2
(o) √ dt c−t + √ √ c−t .
π π 2xx 

The terms that we have neglected being of the order n1 , this expression is so much more
exact as n is greater; it is rigorous when n is infinity. It would be easy, by the preceding
analysis, to have regard to the terms of order n1 and of the superior orders.
We have, by that which precedes, x = np + z, z being a number smaller than unity; [285]
we have therefore √
x+l l+z t 2xx z
−p= = √ + ;
n n n n n
thus formula (o) expresses the probability that the difference between the ratio of the
number of times that the event a must arrive to the total number of trials, and the facility
p of this event, is comprehended within the limits

t 2xx z
(l) ± √ + .
n n n

2xx being equal to

2z 2z 2
n 2p(1 − p) + (1 − 2p) − 2 ,
n n
we see that the interval comprehended between the preceding limits is of order √1n .
If the limit of t, that we will designate by T , is supposed invariable, the probability
determined by the function (o) remains very nearly the same; but the interval compre-
hended between the limits (l) diminishes without ceasing in measure as the trials are
repeated, and it becomes null, when their number is infinite.
This interval being supposed invariable, when the events are multiplied, T increases
without ceasing, and quite nearly as the square root of the number of trials. But, when

6
T is considerable, formula (o) becomes, by §27 of Book I,
2 2
c−T 1 c−T
1− √ +
,
2T π q 2nπ p(1 − p) + nz (1 − 2p) − z2
1+ n2
2q
1+
3q
1+
.
1 + ..
2
q being equal to 2T1 2 . When we make T increase, c−T diminishes with an extreme
rapidity, and the preceding probability approaches rapidly to unity, to which it becomes [286]
equal, when the number of trials is infinite.
There are here two sorts of approximations: the one of them is relative to the limits
taken on both sides of the facility of the event a; the other approximation is related to
the probability that the ratio of the arrivals of this event to the total number of trials will
be contained within these limits. The indefinite repetition of the trials increases more
and more this probability, the limits remaining the same; it narrows more and more the
interval of these limits, the probability remaining the same. Into infinity, this interval
becomes null, and the probability is changed into certitude.
The preceding analysis unites with the advantage to demonstrate this theorem the
one to assign the probability that, in a great number n of trials, the ratio of the arrivals
of each event will be comprehended within some given limits. Let us suppose, for
example, that the facilities of the births of boys and of girls are in the ratio of 18 to 17,
and that there are born in one year 14000 infants; we demand the probability that the
number of boys will not surpass 7363, and will not be less than 7037.
In this case, we have
18
p= , x = 7200, x = 6800, n = 14000, l = 163;
35
formula (o) gives quite nearly 0.994303 for the sought probability.
If we know the number of times that out of n trials the event a is arrived, formula (o)
will give the probability that its facility p, supposed unknown, will be comprehended
within the given limits. In fact, if we name i this number of times, we will have, by
that which precedes, the probability that the difference ni − p will be comprehended


within the limits ± T n√2xx
n
+ nz ; consequently, we will have the probability that p will
be comprehended within the limits [287]

i T 2xx z
∓ √ − .
n n n n


The function T n√2xxn
being of the order √1n , we are able, by neglecting the quantities
1
of order n , to substitute there i instead of x and n − i instead of x ; the preceding limits
become thus, by neglecting the terms of order n1 ,

i T 2i(n − i)
∓ √ ,
n n n

7
and the probability that the facility of the event a is contained within these limits is
equal to
 √ −T 2
2 2 nc
(o ) √ dt c−t + √ .
π π 2i(n − i)
We see thus that, in measure as the events are multiplied, the interval of the limits
is narrowed more and more, and the probability that the value of p falls within these
limits approaches more and more unity or certitude. It is thus that the events, in being
developed, make known their respective probabilities.
We arrive directly to these results, by considering p as a variable which is able to
be extended from zero to unity, and by determining, according to the observed events,
the probability of its diverse values, as we will see it when we will treat the probability
of causes deduced from observed events.
If we have three or a greater number of events a, b, c, . . ., of which one alone must
arrive at each trial, we will have, by that which precedes, the probability that, in a very
great number n of trials, the ratio of the number x of times that one of these events,
a for example, will arrive, to the number n, will be comprehended within the limits
p ± α, α being a very small fraction, and we see that, in the extreme case of the number
n infinite, the interval 2α of these limits is able to be supposed null, and the probability
is able to be supposed equal to certitude, so that the numbers of arrivals at each event [288]
will be proportional to their respective facilities.
Sometimes the events, instead of making known directly the limits of the value of
p, give those of a function of this value; then we conclude from it the limits of p, by the
resolution of equations. In order to give a quite simple example of it, let us consider
two players A and B, of whom the respective skills are p and 1−p, and playing together
on this condition, that the game is won by the one of the two players who, out of three
trials, will have vanquished twice his adversary, the third trial being not played, as
useless, when one of the players is vanquished in the first two trials.
The probability of A to win the game is the sum of the first two terms of the bino-
mial [p + (1 − p)]3 ; it is consequently equal to p3 + 3p2 (1 − p). Let P be this function;
by raising the binomial P + (1 − P ) to the power n, we will have, by the preceding
analysis, the probability that, out of the number n of games, the number of games won
by A will be comprehended within the given limits. It suffices for that to change p into
P in formula (o).
If we name i the number of games won by A, formula (o ) will give the probability
that P will be comprehended within the limits

i T 2i(n − i)
± √ .
n n n
Let therefore p be the real and positive root of the equation
i
p3 + 3p2 (1 − p) = ;
n
by designating by p ∓ δp the limits of p, the corresponding limits of P will be very
nearly 3p2 − 2p3 ∓ 6p (1 − p )δp; by equating these limits to the preceding, we will

8
have
T 2i(n − i)
δp =  √ ;
6p (1 − p )n n
thus formula (o ) will give the probability that p will be comprehended within the limits [289]

 T 2i(n − i)
p ∓  √ .
6p (1 − p )n n
The number n of games does not determine the number of trials, since there are able
to be some games of two trials, and others of three trials. We will have the probability
that the number of games of two trials will be comprehended within the given limits,
by observing that the probability of a game with two trials is p2 + (1 − p)2 ; let us
designate this function by P  . By elevating the binomial P  + (1 − P  ) to the power
n, formula (o) will give the probability that the number of games of two trials will be
comprehended within the limits nP  ± l; now the number of games of two trials being
nP  ± l, the number of games with three trials will be n(1 − P  ) ∓ l; the total number
of trials will be therefore 3n − nP  ∓ l; formula (o) will give therefore the probability
that the number of trials will be comprehended within the limits

2n(1 + p − p2 ) ∓ T 2nP  (1 − P  ).

§17. Let us consider an urn A containing a very great number n of white and black
balls, and let us suppose that at each drawing we draw one ball from the urn, and that
we replace it with a black ball. We demand the probability that after r drawings the
number of white balls will be x.
Let us name yx,r this probability. After a new drawing, it becomes yx,r+1 . But, in
order that there are x white balls after r+1 drawings, it is necessary that there are either
x + 1 white balls after the drawing r and that the following drawing makes a white ball
exit, or x white balls after the drawing r and that the following drawing makes a black
ball exit. The probability that there will be x + 1 white balls after r drawings is yx+1,r
and the probability that then the following drawing will make a white ball exit is x+1 n ;
the probability of the composite event is therefore x+1 n y x+1,r ; this is the first part of
yx,r+1 . The probability that there will be x white balls after the drawing r is yx,r , and [290]
the probability that then there will exit a black ball is n−x
n , because the number of black
balls in the urn is n − x; the probability of the composite event is therefore n−x n yx,r ;
this is the second part of yx,r+1 . Thus we have
x+1 n−x
yx,r+1 = yx+1,r + yx,r .
n n
If we make
x = nx , r = nr , 
yx,r = yx,r ,
this equation becomes
 
  1
yx,r + 1 = x + yx  + 1 ,r + (1 − x )yx  ,r ;
n n n

9
n being supposed a very great number, we are able to reduce into convergent series
yx,r + n1 and yx + n1 ,r ; we will have therefore, by neglecting the squares and the supe-
rior powers of n1 ,
1 ∂yx  ,r x ∂yx  ,r 1 
= + yx,r ;
n ∂r n ∂x n
the integral of this equation in partial differences is
 
 r  r
yx,r  = c φ(x c ),

 
φ(x cr ) being an arbitrary function of x cr , that it is necessary to determine through

the value of yx,0 .
Let us suppose that urn A has been replenished in this manner. We project a right
prism of which the base, being a regular polygon of p + q sides, is rather narrow so
that the prism never falls on it. On the p + q lateral faces, p are white and q are black,
and we put into urn A, at each projection, a ball of the color of the face on which the
prism falls. After n projections, the number of white balls will be quite nearly, by
np np
the preceding section, p+q , and the probability that it will be p+q + l is, by the same [291]
section,
2 l2
p + q − (p+q)
√ c 2npq .
2npqπ
If we make
np (p + q)2
x= + l, = i2 ,
p+q 2pq
this function becomes
i − in2 (x− p+q
np 2
) ;
√ c
πn
this is the value of $y_{x,0}$ or of $y'_{x',0}$; but the preceding value of $y'_{x',r'}$ gives
\[
y_{x,0} = \varphi\!\left(\frac{x}{n}\right);
\]
we have therefore
\[
\varphi\!\left(\frac{x}{n}\right) = \frac{i}{\sqrt{n\pi}}\; c^{-i^2 n\left(\frac{x}{n} - \frac{p}{p+q}\right)^2};
\]
hence
\[
y'_{x',r'} = \frac{i\,c^{r'}}{\sqrt{n\pi}}\; c^{-i^2 n\left(x'c^{r'} - \frac{p}{p+q}\right)^2};
\]

whence we deduce
\[
y_{x,r} = \frac{i\,c^{\frac{r}{n}}}{\sqrt{n\pi}}\; c^{-\frac{i^2}{n}\left(x c^{\frac{r}{n}} - \frac{np}{p+q}\right)^2}.
\]
The most probable value of $x$ is that which renders $x c^{\frac{r}{n}} - \frac{np}{p+q}$ null, and consequently
it is equal to
\[
\frac{np}{(p+q)\,c^{\frac{r}{n}}};
\]

the probability that the value of $x$ will be contained within the limits
\[
\frac{np}{(p+q)\,c^{\frac{r}{n}}} \pm \frac{\mu\sqrt{n}}{c^{\frac{r}{n}}}
\]
is
\[
2\int \frac{i\,d\mu}{\sqrt{\pi}}\; c^{-i^2\mu^2},
\]
the integral being taken from μ = 0.
Let us seek now the mean value of the number of white balls contained within urn [292]
A, after r drawings. This value is the sum of all the possible numbers of white balls,
multiplied by their respective probabilities; it is therefore equal to
\[
\frac{2np}{(p+q)\,c^{\frac{r}{n}}}\int \frac{i\,d\mu}{\sqrt{\pi}}\; c^{-i^2\mu^2},
\]
the integral being taken from $\mu = 0$ to $\mu = \infty$. This value is thus
\[
\frac{np}{(p+q)\,c^{\frac{r}{n}}};
\]

consequently, it is the same as the most probable value of x.
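A small numerical check of this mean value is easy to write; the sketch below is not part of the original text and its urn size and prism faces are assumed values. The expected number of white balls satisfies $E_{r+1} = E_r(1 - \frac{1}{n})$, which is closely approximated by $\frac{np}{p+q}\,c^{-\frac{r}{n}}$.

```python
# Sketch: decay of the mean number of white balls under the replacement drawings.
import math

n, p, q = 1000, 3, 2          # assumed urn size and prism faces
mean = n * p / (p + q)        # initial mean number of white balls
for r in range(1, 2001):
    mean *= (1 - 1 / n)       # each drawing removes a white ball with probability mean/n
    if r % 500 == 0:
        approx = n * p / (p + q) * math.exp(-r / n)
        print(r, round(mean, 2), round(approx, 2))
```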


Let us consider now two urns A and B containing each the number n of balls, and
let us suppose that, in the total number 2n of balls, there are as many white as black.
Let us imagine that we draw at the same time one ball from each urn, and that next one
puts into one urn the ball extracted from the other. Let us suppose that we repeat this
operation any number r times, by agitating at each time the urns, in order to well mix
the balls; and let us seek the probability that after this number r of operations, there
will be x white balls in urn A.
Let zx,r be this probability. The number of possible combinations in r operations
is n2r ; because at each operation the n balls of urn A are able to be combined with
each of n balls from urn B, that which produces n2 combinations; n2r zx,r is therefore
the number of combinations in which it is possible to have x white balls in urn A after
these operations. Now, it is able to happen that the (r + 1)st operation makes a white
ball exit from urn A, and makes a white ball return; the number of cases in which this
is able to arrive is the product of n2r zx,r , by the number x of white balls of urn A, and
by the number n − x of white balls which must be then in urn B, since the total number
of white balls of the two urns is n. In all these cases, there remains x white balls in urn
A; the product x(n − x)n2r zx,r is therefore one of the parts of n2r+2 zx,r+1 . [293]
It is able to happen further that the (r + 1)st operation makes a black ball exit and
return into urn A, that which conserves in this urn x white balls. Thus n − x being,
after the rth operation, the number of black balls of urn A, and x being the one of black
balls of urn B, $(n-x)\,x\,n^{2r} z_{x,r}$ is further a part of $n^{2r+2} z_{x,r+1}$.
If there are x − 1 white balls in urn A after the rth operation and if the operation
following makes a black ball exit from it and makes a white ball return there, there will
be x white balls in urn A after the (r + 1)st operation. The number of cases in which
that is able to arrive is the product of $n^{2r} z_{x-1,r}$, by the number $n - x + 1$ of the black

balls of urn A after the rth drawing, and by the number n − x + 1 of white balls of
urn B, after the same operation; (n − x + 1)2 n2r zx−1,r is therefore further a part of
n2r+2 zx,r+1 .
Finally, if there are x + 1 white balls in urn A after the rth operation, and if the
operation following makes a white ball exit from it and makes a black ball return there,
there will be again, after this last operation, x white balls in the urn. The number of
cases in which that is able to arrive is the product of n2r zx+1,r , by the number x + 1
of white balls of urn A, and by the number x + 1 of black balls of urn B after the rth
operation; (x + 1)2 n2r zx+1,r is therefore again part of n2r+2 zx,r+1 .
By reuniting all these parts and by equating their sum to $n^{2r+2} z_{x,r+1}$, we will have
the equation in partial finite differences
\[
z_{x,r+1} = \left(\frac{x+1}{n}\right)^2 z_{x+1,r} + \frac{2x}{n}\left(1 - \frac{x}{n}\right) z_{x,r} + \left(1 - \frac{x-1}{n}\right)^2 z_{x-1,r}.
\]
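To make the transition law concrete, the following sketch (illustrative parameters, not in the original) simulates the exchange between the two urns and compares the empirical distribution of the number of white balls in urn A with the probabilities obtained by iterating this finite-difference equation.

```python
# Sketch of the two-urn exchange; n, the number of operations and the initial state are assumed.
import random

n, r_ops, trials = 10, 40, 50_000
start = n // 2                       # urn A begins with n/2 white balls

# Exact distribution by iterating the finite-difference equation above
z = [0.0] * (n + 1)
z[start] = 1.0
for _ in range(r_ops):
    new = [0.0] * (n + 1)
    for x in range(n + 1):
        if z[x] == 0.0:
            continue
        p_up = ((n - x) / n) ** 2            # black leaves A and a white ball comes from B
        p_down = (x / n) ** 2                # white leaves A and a black ball comes from B
        p_stay = 2 * x * (n - x) / n ** 2    # the exchanged balls have the same color
        if x + 1 <= n: new[x + 1] += p_up * z[x]
        if x - 1 >= 0: new[x - 1] += p_down * z[x]
        new[x] += p_stay * z[x]
    z = new

# Monte Carlo estimate of the same probabilities
counts = [0] * (n + 1)
for _ in range(trials):
    x = start
    for _ in range(r_ops):
        out_white = random.random() < x / n          # ball drawn from urn A
        in_white = random.random() < (n - x) / n     # ball drawn from urn B
        x += int(in_white) - int(out_white)
    counts[x] += 1

for x in range(n + 1):
    print(x, round(z[x], 4), round(counts[x] / trials, 4))
```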

Although this equation is in differences of the second order with respect to the
variable x, however its integral contains only one arbitrary function which depends on
the probability of the diverse values of x in the initial state of urn A. In fact, it is clear
that, if we knew the values of zx,0 corresponding to all the values of x from x = 0 to [294]
x = n, the preceding equation will give all the values of zx,1 , zx,2 , . . . , by observing
that, the negative values of x being impossible, zx,r is null when x is negative.
If n is a very great number, this equation is transformed into an equation in partial
differences, that we obtain thus. We have then, very nearly,
\[
\begin{aligned}
z_{x+1,r} &= z_{x,r} + \frac{\partial z_{x,r}}{\partial x} + \frac{1}{2}\,\frac{\partial^2 z_{x,r}}{\partial x^2},\\
z_{x-1,r} &= z_{x,r} - \frac{\partial z_{x,r}}{\partial x} + \frac{1}{2}\,\frac{\partial^2 z_{x,r}}{\partial x^2},\\
z_{x,r+1} &= z_{x,r} + \frac{\partial z_{x,r}}{\partial r}.
\end{aligned}
\]
Let
\[
x = \frac{n + \mu\sqrt{n}}{2}, \qquad r = nr', \qquad z_{x,r} = U;
\]
the preceding equation in the partial finite differences will become, by neglecting the terms of order $\frac{1}{n^2}$,
\[
\frac{\partial U}{\partial r'} = 2U + 2\mu\,\frac{\partial U}{\partial \mu} + \frac{\partial^2 U}{\partial \mu^2}.
\]
In order to integrate this equation, which, as we are able to be assured by the method
that I have given for this object, in the Mémoires de l’Académie des Sciences of the
year 1773, is integrable in finite terms only in the manner of definite integrals, let us
make
\[
U = \int \varphi\, dt\, c^{-\mu t},
\]

$\varphi$ being a function of $t$ and of $r'$. We will have
\[
2\mu\,\frac{\partial U}{\partial \mu} = 2c^{-\mu t}\,t\varphi - 2\int c^{-\mu t}\left(\varphi\,dt + t\,d\varphi\right),
\]
\[
\frac{\partial^2 U}{\partial \mu^2} = \int c^{-\mu t}\,t^2\varphi\,dt;
\]
the equation in the partial differentials in $U$ becomes thus
\[
\int c^{-\mu t}\,dt\,\frac{\partial \varphi}{\partial r'} = 2c^{-\mu t}\,t\varphi + \int c^{-\mu t}\,dt\left(t^2\varphi - 2t\,\frac{d\varphi}{dt}\right).
\]

By equating between them the terms affected of the sign $\int$, we will have the equation in the partial differentials
\[
\frac{\partial \varphi}{\partial r'} = t^2\varphi - 2t\,\frac{\partial \varphi}{\partial t}.
\]
The term outside the sign $\int$, equated to zero, will give, for the equation in the limits of the integral,
\[
0 = t\varphi\,c^{-\mu t}.
\]
The integral of the preceding equation in the partial differentials of $\varphi$ is
\[
\varphi = c^{\frac{1}{4}t^2}\,\psi\!\left(\frac{t}{c^{2r'}}\right),
\]
$\psi\!\left(\frac{t}{c^{2r'}}\right)$ being an arbitrary function of $\frac{t}{c^{2r'}}$; we have therefore
\[
U = \int dt\; c^{-\mu t + \frac{1}{4}t^2}\,\psi\!\left(\frac{t}{c^{2r'}}\right).
\]
Let there be
\[
t = 2\mu + 2s\sqrt{-1};
\]
the expression of $U$ will take this form
\[
(A)\qquad U = c^{-\mu^2}\int ds\; c^{-s^2}\,\Gamma\!\left(\frac{s - \mu\sqrt{-1}}{c^{2r'}}\right).
\]
It is easy to see that the preceding equation, to the limits of the integral, requires that the limits of the integral relative to $s$ are taken from $s = -\infty$ to $s = \infty$. By taking the radical $\sqrt{-1}$ with the $-$ sign, we will have for $U$ an expression of this form
\[
U = c^{-\mu^2}\int ds\; c^{-s^2}\,\Pi\!\left(\frac{s + \mu\sqrt{-1}}{c^{2r'}}\right),
\]

the arbitrary function Π(s) being able to be different from Γ(s). The sum of these
two expressions of U will be its complete value. But it is easy to be assured that, the
integrals being taken from s = −∞ to s = ∞, the addition of this new expression of
U adds nothing to the generality of the first, in which it is comprehended.

Let us develop now the second member of equation (A), according to the powers [296] of $\frac{1}{c^{2r'}}$, and let us consider one of the terms of this development, such as
\[
\frac{H^{(i)}\,c^{-\mu^2}}{c^{4ir'}}\int ds\; c^{-s^2}\left(s - \mu\sqrt{-1}\right)^{2i};
\]
this term becoming, after the integrations,
\[
\frac{1\cdot 3\cdot 5\cdots(2i-1)}{2^i}\,\sqrt{\pi}\,\frac{H^{(i)}\,c^{-\mu^2}}{c^{4ir'}}
\left[1 - \frac{i(2\mu)^2}{1\cdot 2} + \frac{i(i-1)(2\mu)^4}{1\cdot 2\cdot 3\cdot 4} - \frac{i(i-1)(i-2)(2\mu)^6}{1\cdot 2\cdot 3\cdot 4\cdot 5\cdot 6} + \cdots\right].
\]
Let us consider further one term of this development, relative to the odd powers of $\frac{1}{c^{2r'}}$, such as
\[
\frac{L^{(i)}\sqrt{-1}\,c^{-\mu^2}}{c^{(4i+2)r'}}\int ds\; c^{-s^2}\left(s - \mu\sqrt{-1}\right)^{2i+1}.
\]
This term becomes, after the integrations,
\[
\frac{1\cdot 3\cdot 5\cdots(2i+1)\,L^{(i)}\sqrt{\pi}\,\mu\,c^{-\mu^2}}{2^i\,c^{(4i+2)r'}}
\left[1 - \frac{i(2\mu)^2}{1\cdot 2\cdot 3} + \frac{i(i-1)(2\mu)^4}{1\cdot 2\cdot 3\cdot 4\cdot 5} - \cdots\right].
\]

We will have therefore thus the general expression of the probability $U$, developed into a series ordered according to the powers of $\frac{1}{c^{2r'}}$, a series which becomes very convergent when $r'$ is a considerable number. This expression must be such that $\int U\,dx$ or $\frac{1}{2}\int U\,d\mu\,\sqrt{n}$ is equal to unity, the integrals being extended to all the values of $x$ and of $\mu$, that is to say from $x$ null to $x = n$, and from $\mu = -\sqrt{n}$ to $\mu = \sqrt{n}$; because it is certain that, one of the values of $x$ must take place, the sum of the probabilities of all these values must be equal to unity. By taking the integral $\int c^{-\mu^2}d\mu$ within the limits of $\mu$, we have the same result, to very nearly, as by taking it from $\mu = -\infty$ to $\mu = \infty$; the difference is only of the order $\frac{c^{-n}}{\sqrt{n}}$, and seeing the extreme rapidity with which $c^{-n}$ diminishes in measure as $n$ increases, we see that this difference is insensible when $n$ [297] is a great number. This premised, let us consider in the integral $\frac{1}{2}\int U\,d\mu\,\sqrt{n}$ the term
\[
\frac{1\cdot 3\cdot 5\cdots(2i-1)\,H^{(i)}\sqrt{n\pi}}{2^i\,c^{4ir'}}\int d\mu\; c^{-\mu^2}\left[1 - \frac{i(2\mu)^2}{1\cdot 2} + \frac{i(i-1)(2\mu)^4}{1\cdot 2\cdot 3\cdot 4} - \cdots\right].
\]
By extending the integral from $\mu = -\infty$ to $\mu = \infty$, this term becomes
\[
\frac{1\cdot 3\cdot 5\cdots(2i-1)\,H^{(i)}\,\pi\sqrt{n}}{2^i\,c^{4ir'}}\left[1 - i + \frac{i(i-1)}{1\cdot 2} - \frac{i(i-1)(i-2)}{1\cdot 2\cdot 3} + \cdots\right].
\]

The factor $1 - i + \frac{i(i-1)}{1\cdot 2} - \cdots$ is equal to $(1-1)^i$; it is therefore null, except in the case of $i = 0$, where it is reduced to unity. It is clear that the terms of the expression of $U$ which contain the odd powers of $\mu$ give a null result in the integral $\frac{1}{2}\int U\,d\mu\,\sqrt{n}$, extended from $\mu = -\infty$ to $\mu = \infty$; because these terms have for factor $c^{-\mu^2}$, and we
have generally within these limits
\[
\int \mu^{2i+1}\,d\mu\; c^{-\mu^2} = 0.
\]

There is therefore only the first term of the expression of $U$, a term that we will represent by $Hc^{-\mu^2}$, which is able to give a result in the integral $\frac{1}{2}\int U\,d\mu\,\sqrt{n}$, and this result is $\frac{1}{2}H\sqrt{n\pi}$; we have therefore
\[
\frac{1}{2}H\sqrt{n\pi} = 1;
\]
consequently,
\[
H = \frac{2}{\sqrt{n\pi}}.
\]

The general expression of $U$ has thus the following form
\[
(k)\qquad U = \frac{2c^{-\mu^2}}{\sqrt{n\pi}}
\left\{
\begin{aligned}
&1 + \frac{Q^{(1)}\left(1 - 2\mu^2\right)}{c^{4r'}} + \frac{Q^{(2)}\left(1 - 4\mu^2 + \tfrac{4}{3}\mu^4\right)}{c^{8r'}} + \cdots\\
&+ \frac{L^{(0)}\mu}{c^{2r'}} + \frac{L^{(1)}\mu\left(1 - \tfrac{2}{3}\mu^2\right)}{c^{6r'}} + \frac{L^{(2)}\mu\left(1 - \tfrac{4}{3}\mu^2 + \tfrac{4}{15}\mu^4\right)}{c^{10r'}} + \cdots
\end{aligned}
\right\}
\]
Q(1) , Q(2) , . . . , L(0) , L(1) , . . . being indeterminate constants, which depend on the
initial value of U .
Let us suppose that $U$ becomes $X$ when $r'$ is null, $X$ being a given function of $\mu$. [298]
We have generally these two theorems,
\[
0 = Q^{(i)}\int \mu^{2q}\,d\mu\; U_i\,c^{-\mu^2},
\]
\[
0 = L^{(i)}\int \mu^{2q+1}\,d\mu\; U_i'\,c^{-\mu^2},
\]
when $q$ is less than $i$; $U_i$ and $U_i'$ being the functions of $\mu$ by which $\frac{2Q^{(i)}c^{-\mu^2}}{\sqrt{n\pi}\,c^{4ir'}}$ and $\frac{2L^{(i)}c^{-\mu^2}}{\sqrt{n\pi}\,c^{(4i+2)r'}}$
are multiplied in the expression of $U$. In order to demonstrate these theorems, we will observe that, by that which precedes, $\frac{2Q^{(i)}c^{-\mu^2}U_i}{\sqrt{n\pi}}$ is equal to
\[
\left(\sqrt{-1}\right)^{2i} H^{(i)}\,c^{-\mu^2}\int ds\; c^{-s^2}\left(\mu + s\sqrt{-1}\right)^{2i};
\]
it is necessary therefore to show that we have
\[
0 = \int \mu^{2q}\,ds\,d\mu\; c^{-\mu^2 - s^2}\left(\mu + s\sqrt{-1}\right)^{2i},
\]

the integrals being taken from μ and s equal to −∞ to μ and s equal to +∞. By
integrating first with respect to μ, this term becomes

2q − 1 2 2 √
μ2q−2 dμ ds c−μ −s (μ + s −1)2i
2

2 2 √
+i μ2q−1 dμ ds c−μ −s (μ + s −1)2i−1 .

By continuing to integrate thus by parts relatively to μ, we arrive finally to some terms
of the form
\[
k\int d\mu\,ds\; c^{-\mu^2 - s^2}\left(\mu + s\sqrt{-1}\right)^{2e},
\]

e not being zero, and, by that which precedes, these terms are null.
We will prove in the same manner that we have
\[
0 = L^{(i)}\int \mu^{2q+1}\,d\mu\; U_i'\,c^{-\mu^2}.
\]

Thence it follows that we have generally
\[
0 = \int U_i\,U_{i'}\,d\mu\; c^{-\mu^2}, \qquad 0 = \int U_i'\,U_{i'}'\,d\mu\; c^{-\mu^2},
\]
$i$ and $i'$ being different numbers. Because if, for example, $i'$ is greater than $i$, all the [299]
powers of $\mu$ in $U_i$ are less than $2i'$; each of the terms of $U_i$ will give therefore, by that which precedes, a result null in the integral $\int U_i\,U_{i'}\,d\mu\,c^{-\mu^2}$. The same reasoning holds for the integral $\int U_i'\,U_{i'}'\,d\mu\,c^{-\mu^2}$.
But these integrals are not nulls, when $i = i'$. We will obtain them in this case in this manner. We have, by that which precedes,
\[
U_i = \frac{2^i\left(\sqrt{-1}\right)^{2i}\int ds\; c^{-s^2}\left(\mu + s\sqrt{-1}\right)^{2i}}{1\cdot 3\cdot 5\cdots(2i-1)\,\sqrt{\pi}}.
\]
The term which has for factor $\mu^{2i}$ in this expression is
\[
\frac{2^i\left(\sqrt{-1}\right)^{2i}\mu^{2i}}{1\cdot 3\cdot 5\cdots(2i-1)};
\]
now we are able to consider only this term in the first factor $U_i$ of the integral $\int U_i\,U_i\,d\mu\,c^{-\mu^2}$, because the inferior powers of $\mu$, in this factor, give a null result in the integral. We have therefore
\[
\int U_i\,U_i\,d\mu\; c^{-\mu^2} = \frac{2^{2i}}{\left[1\cdot 3\cdot 5\cdots(2i-1)\right]^2\sqrt{\pi}}\int \mu^{2i}\,d\mu\,ds\; c^{-\mu^2 - s^2}\left(\mu + s\sqrt{-1}\right)^{2i}.
\]

We have, by integrating with respect to $\mu$, from $\mu = -\infty$ to $\mu = \infty$,
\[
\begin{aligned}
\int \mu^{2i}\,d\mu\,ds\; c^{-\mu^2 - s^2}\left(\mu + s\sqrt{-1}\right)^{2i}
&= \frac{2i-1}{2}\int \mu^{2i-2}\,d\mu\,ds\; c^{-\mu^2 - s^2}\left(\mu + s\sqrt{-1}\right)^{2i}\\
&\quad + \frac{2i}{2}\int \mu^{2i-1}\,d\mu\,ds\; c^{-\mu^2 - s^2}\left(\mu + s\sqrt{-1}\right)^{2i-1}.
\end{aligned}
\]
The first term of the second member of this equation is null by that which precedes;
this member is reduced therefore to its second term. We find in the same manner that

we have
\[
\int \mu^{2i-1}\,d\mu\,ds\; c^{-\mu^2 - s^2}\left(\mu + s\sqrt{-1}\right)^{2i-1}
= \frac{2i-1}{2}\int \mu^{2i-2}\,d\mu\,ds\; c^{-\mu^2 - s^2}\left(\mu + s\sqrt{-1}\right)^{2i-2},
\]
and thus consecutively; we have therefore [300]
\[
\int \mu^{2i}\,d\mu\,ds\; c^{-\mu^2 - s^2}\left(\mu + s\sqrt{-1}\right)^{2i} = \frac{1\cdot 2\cdot 3\cdots 2i\;\pi}{2^{2i}};
\]
consequently,
\[
\int U_i\,U_i\,d\mu\; c^{-\mu^2} = \frac{2\cdot 4\cdot 6\cdots 2i\;\sqrt{\pi}}{1\cdot 3\cdot 5\cdots(2i-1)}.
\]
We will find in the same manner
\[
\int U_i'\,U_i'\,d\mu\; c^{-\mu^2} = \frac{1}{2}\;\frac{2\cdot 4\cdot 6\cdots 2i\;\sqrt{\pi}}{1\cdot 3\cdot 5\cdots(2i+1)}.
\]

We have evidently
\[
\int U_i\,U_{i'}'\,d\mu\; c^{-\mu^2} = 0,
\]
in the same case where $i$ and $i'$ are equal, because the product $U_i\,U_{i'}'$ contains only some odd powers of $\mu$.
This premised, the general expression of $U$ gives, for its initial value, that we have designated by $X$,
\[
X = \frac{2c^{-\mu^2}}{\sqrt{n\pi}}\left[1 + Q^{(1)}\left(1 - 2\mu^2\right) + \cdots + L^{(0)}\mu + L^{(1)}\mu\left(1 - \tfrac{2}{3}\mu^2\right) + \cdots\right].
\]

If we multiply this equation by $U_i\,d\mu$, and if we take the integrals from $\mu = -\infty$ to $\mu = \infty$, we will have, by virtue of the preceding theorems,
\[
\int X U_i\,d\mu = \frac{2}{\sqrt{n\pi}}\,Q^{(i)}\int U_i\,U_i\,d\mu\; c^{-\mu^2},
\]
whence we deduce
\[
Q^{(i)} = \frac{1\cdot 3\cdot 5\cdots(2i-1)}{2\cdot 4\cdot 6\cdots 2i}\;\frac{\sqrt{n}}{2}\int X U_i\,d\mu;
\]
we will find, in the same manner,
\[
L^{(i)} = \frac{1\cdot 3\cdot 5\cdots(2i+1)}{2\cdot 4\cdot 6\cdots 2i}\;\sqrt{n}\int X U_i'\,d\mu.
\]

We will have therefore thus the successive values of Q(1) , Q(2) , . . . , L(0) , L(1) , . . . by
means of definite integrals, when X or the initial value of U will be given.
In the case where $X$ is equal to $\frac{2i}{\sqrt{n\pi}}\,c^{-i^2\mu^2}$, the general expression of $U$ takes a very [301]
simple form. Then the arbitrary function $\Gamma\!\left(\frac{s-\mu\sqrt{-1}}{c^{2r'}}\right)$ from formula (A) is of the form $k\,c^{-\beta\left(\frac{s-\mu\sqrt{-1}}{c^{2r'}}\right)^2}$. In order to determine the constants $\beta$ and $k$, we will observe that by supposing
\[
\beta' = \frac{\beta}{c^{4r'}},
\]
we will have
\[
U = k\,c^{-\frac{\mu^2}{1+\beta'}}\int ds\; c^{-(1+\beta')\left(s - \frac{\beta'\mu\sqrt{-1}}{1+\beta'}\right)^2}.
\]
By making next
\[
\sqrt{1+\beta'}\left(s - \frac{\beta'\mu\sqrt{-1}}{1+\beta'}\right) = s',
\]
and observing that the integral relative to $s$ must be taken from $s = -\infty$ to $s = \infty$, the integral relative to $s'$ must be taken within the same limits, we will have
\[
U = \frac{k\sqrt{\pi}}{\sqrt{1+\beta'}}\; c^{-\frac{\mu^2}{1+\beta'}}.
\]
By comparing this expression to the initial value of $U$, which is
\[
U = \frac{2i}{\sqrt{n\pi}}\; c^{-i^2\mu^2},
\]
and observing that $\beta$ is the initial value of $\beta'$, we will have
\[
i^2 = \frac{1}{1+\beta},
\]
whence we deduce
\[
\beta = \frac{1-i^2}{i^2}, \qquad \beta' = \frac{1-i^2}{i^2 c^{4r'}}.
\]
We must have next
\[
\frac{k\sqrt{\pi}}{\sqrt{1+\beta}} = \frac{2i}{\sqrt{n\pi}},
\]
that which gives
\[
k\sqrt{\pi} = \frac{2}{\sqrt{n\pi}},
\]
a value that we obtain next by the condition that $\frac{1}{2}\int U\,d\mu\,\sqrt{n} = 1$, the integral being [302]
taken from $\mu = -\infty$ to $\mu = \infty$; we will have, for the expression of $U$, whatever be $r'$,
\[
U = \frac{2}{\sqrt{n\pi(1+\beta')}}\; c^{-\frac{\mu^2}{1+\beta'}}.
\]

We find, in fact, that this value of U , substituted into the equation in the partial differ-
entials in U , satisfies it.

$\beta'$ diminishing without ceasing when $r'$ increases, the value of $U$ varies without ceasing and becomes at its limit, when $r'$ is infinity,
\[
U = \frac{2}{\sqrt{n\pi}}\; c^{-\mu^2}.
\]

In order to give an application of these formulas, let us imagine, in an urn C, a very


great number $m$ of white balls and a like number of black balls. These balls having
been mixed, let us suppose that we draw from the urn n balls, that we put into urn A.
Let us suppose next that we put into urn B as many white balls as there are black balls
in urn A, and as many black balls as there are white balls in the same urn. It is clear
that the number of cases in which there will be x white balls, and consequently n − x
black balls in urn A, is equal to the product of the number of combinations of the m
white balls of urn C, taken x by x, by the number of combinations of the m black balls
of the same urn, taken $n-x$ by $n-x$. This product is, by §3, equal to
\[
\frac{m(m-1)(m-2)\cdots(m-x+1)}{1\cdot 2\cdot 3\cdots x}\;\frac{m(m-1)(m-2)\cdots(m-n+x+1)}{1\cdot 2\cdot 3\cdots(n-x)}
\]
or to
\[
\frac{(1\cdot 2\cdot 3\cdots m)^2}{1\cdot 2\cdot 3\cdots x\;.\;1\cdot 2\cdot 3\cdots(n-x)\;.\;1\cdot 2\cdot 3\cdots(m-x)\;.\;1\cdot 2\cdot 3\cdots(m-n+x)}.
\]

The number of all possible cases is the number of combinations of the 2m balls from
urn C, taken $n$ by $n$; this number is
\[
\frac{1\cdot 2\cdot 3\cdots 2m}{1\cdot 2\cdot 3\cdots n\;.\;1\cdot 2\cdot 3\cdots(2m-n)};
\]

by dividing the preceding fraction by that here, we will have, for the probability of x [303]
or for the initial value of $U$,
\[
\frac{(1\cdot 2\cdot 3\cdots m)^2\;.\;1\cdot 2\cdot 3\cdots n\;.\;1\cdot 2\cdot 3\cdots(2m-n)}{1\cdot 2\cdot 3\cdots x\;.\;1\cdot 2\cdot 3\cdots(m-x)\;.\;1\cdot 2\cdot 3\cdots(n-x)\;.\;1\cdot 2\cdot 3\cdots(m-n+x)\;.\;1\cdot 2\cdot 3\cdots 2m}.
\]

Now, if we observe that we have very nearly, when $s$ is a great number,
\[
1\cdot 2\cdot 3\cdots s = s^{s+\frac{1}{2}}\,c^{-s}\sqrt{2\pi},
\]

we will find easily, after all the reductions, by making
\[
x = \frac{n + \mu\sqrt{n}}{2},
\]
and by neglecting the quantities of order $\frac{1}{n}$ which are not multiplied by $\mu^2$,
\[
U = \frac{2}{\sqrt{n\pi}}\,\sqrt{\frac{m}{2m-n}}\; c^{-\frac{m\mu^2}{2m-n}};
\]
by making therefore
\[
i^2 = \frac{m}{2m-n},
\]
we will have
\[
U = \frac{2i}{\sqrt{n\pi}}\; c^{-i^2\mu^2}.
\]
If the number $m$ is infinite, then $i^2 = \frac{1}{2}$, and the initial value of $U$ is
\[
U = \frac{2}{\sqrt{2n\pi}}\; c^{-\frac{1}{2}\mu^2}.
\]
Its value, after any number of drawings, is
\[
U = \frac{2}{\sqrt{n\pi\left(1 + c^{-\frac{4r}{n}}\right)}}\; c^{-\frac{\mu^2}{1 + c^{-\frac{4r}{n}}}}.
\]

The case of m infinite returns to the one in which the urns A and B would be filled,
by projecting n times a coin which would bring forth indifferently heads or tails, and [304]
putting into urn A a white ball each time that heads would arrive, and a black ball each
time that tails would arrive, and making the inverse for urn B. Because it is clear that
the probability to draw a white ball from urn C is then 12 , as that to bring forth heads
or tails.   √
By taking the integral $\int U\,dx$ or $\frac{1}{2}\int U\,d\mu\,\sqrt{n}$ from $\mu = -a$ to $\mu = a$, we will have the probability that the number of white balls of urn A will be comprehended within the limits $\frac{n}{2} \pm \frac{a\sqrt{n}}{2}$.
We are able to generalize the preceding result, by supposing the urn A filled, as
at the beginning of this section, by the projection of a prism of p + q lateral faces, of
which $p$ are white and $q$ are black. We have seen that then, if we make
\[
i^2 = \frac{(p+q)^2}{2pq},
\]
we have, at this origin or when $r$ is null,
\[
U = \frac{i}{\sqrt{n\pi}}\; c^{-\frac{i^2}{n}\left(x - \frac{np}{p+q}\right)^2}.
\]

Let us suppose $p$ and $q$ very little different, so that we have
\[
p = \frac{p+q}{2}\left(1 + \frac{a}{\sqrt{n}}\right), \qquad
q = \frac{p+q}{2}\left(1 - \frac{a}{\sqrt{n}}\right);
\]
we will have
\[
i^2 = \frac{2}{1 - \frac{a^2}{n}},
\]

or, very nearly, $i^2 = 2$; therefore
\[
U = \frac{2}{\sqrt{2n\pi}}\; c^{-\frac{2}{n}\left(x - \frac{n}{2} - \frac{a\sqrt{n}}{2}\right)^2}.
\]
By making therefore
\[
x = \frac{n + \mu\sqrt{n}}{2},
\]
we will have
\[
U = \frac{2}{\sqrt{2n\pi}}\; c^{-\frac{1}{2}(\mu - a)^2}.
\]
Let us suppose now that after any number whatsoever of drawings we have
\[
U = \frac{2}{\sqrt{n\beta\pi}}\; c^{-\frac{(\mu-\alpha)^2}{\beta}},
\]
$\beta$ and $\alpha$ being functions of $r'$. If we substitute this value into the equation in the partial differences in $U$, we will have
\[
-\frac{d\beta}{dr'}\left(1 - \frac{2(\mu-\alpha)^2}{\beta}\right) + 4\,\frac{d\alpha}{dr'}(\mu - \alpha)
= 4(\beta - 1)\left(1 - \frac{2(\mu-\alpha)^2}{\beta}\right) - 8\alpha(\mu - \alpha),
\]
whence we deduce the two following equations:
\[
\frac{d\beta}{(\beta - 1)\,dr'} = -4, \qquad \frac{d\alpha}{dr'} = -2\alpha.
\]
By integrating them and observing that at the origin of $r'$, $\alpha = a$ and $\beta = 2$, we will have
\[
\beta = 1 + c^{-4r'}, \qquad \alpha = ac^{-2r'},
\]
that which gives
\[
U = \frac{2}{\sqrt{n\pi\left(1 + c^{-4r'}\right)}}\; c^{-\frac{\left(\mu - ac^{-2r'}\right)^2}{1 + c^{-4r'}}}.
\]
Let us seek now the mean value of the number of white balls contained in urn A,
after r drawings. This value is the sum of the products of the diverse numbers of white
balls, multiplied by their respective probabilities; it is therefore equal to the integral
\[
\int U\,\frac{n + \mu\sqrt{n}}{2}\;\frac{d\mu\,\sqrt{n}}{2},
\]
taken from $\mu = -\infty$ to $\mu = \infty$. By substituting for $U$ its value given by formula (k), [306]
we will have, by virtue of the preceding theorems, for this integral,
\[
\frac{1}{2}n + \frac{\sqrt{n}}{4}\,L^{(0)}\,c^{-\frac{2r}{n}}.
\]


At the origin where $r$ is null, this value is $\frac{1}{2}n + \frac{\sqrt{n}}{4}L^{(0)}$; thus we will have $\frac{\sqrt{n}}{4}L^{(0)}$ by the mean of the number of white balls that urn A contains at this origin.
We are able to obtain quite simply, in the following manner, the mean value of the
number of white balls, after r drawings. Let us imagine that each white ball has a value
that we will represent by unity, the black balls being supposed to have no value. It is
clear that the value of urn A will be the sum of the products of all the possible numbers
of white balls which are able to exist in the urn, multiplied by their respective prob-
abilities; this value is therefore that which we have named mean value of the number
of white balls. Let us name it z, after the rth drawing. At the following drawing, if
there exits a white ball, this value diminishes by one unit; now, if we suppose that $x$ is the number of white balls contained in the urn after the $r$th drawing, the probability of extracting a white ball from it will be $\frac{x}{n}$; by naming therefore $U$ the probability of this supposition, the integral $\int U\,\frac{x\,dx}{n}$, extended from $x = 0$ to $x = n$, will be the diminution of $z$, resulting from the probability to extract a white ball from the urn. If we make, as above, $\frac{r}{n} = r'$, and if we designate the very small fraction $\frac{1}{n}$ by $dr'$, this diminution will be equal to $z\,dr'$; because $z$ is equal to $\int U x\,dx$, a sum of the products of the num-
bers of white balls by their respective probabilities. The value of urn A is increased, if
we extract a white ball from urn B, in order to put it into urn A; now, x being supposed
the number of white balls of urn A, n − x will be the one of the white balls of urn B,
and the probability to extract a white ball from this last urn will be $\frac{n-x}{n}$; by multiplying this probability by the probability $U$ of $x$, the integral $\int U\,\frac{n-x}{n}\,dx$, taken from $x$ null [307]
to $x = n$, will be the increase of $z$. $\int U(n - x)\,dx$ is the value of urn B; by naming therefore $z'$ this value, $z'\,dr'$ will be the increase of $z$; we will have therefore
\[
dz = z'\,dr' - z\,dr'.
\]

The sum of the values of the two urns is evidently equal to n, a number of white balls
that they contain, that which gives z  = n − z; substituting this value of z  into the
preceding equation, it becomes
\[
dz = (n - 2z)\,dr';
\]
whence we deduce, by integrating,
\[
z = \frac{1}{2}n + \frac{\sqrt{n}\,L^{(0)}}{4\,c^{2r'}},
\]
L(0) being an arbitrary constant, that which is conformed to that which precedes.
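The differential equation just integrated is easy to verify numerically; the sketch below is not part of the original, and its urn size and initial state are assumed. The exact mean of urn A obeys $E_{r+1} = E_r + \frac{n - 2E_r}{n}$, which tends to $\frac{n}{2} + C\,c^{-\frac{2r}{n}}$.

```python
# Sketch: the mean value of urn A approaches n/2 exponentially.
import math

n = 100
z = 80.0                       # assumed initial number of white balls in urn A
C = z - n / 2                  # constant fixed by the initial state
for r in range(1, 301):
    z += (n - 2 * z) / n       # one exchange: expected gain (n-z)/n, expected loss z/n
    if r % 100 == 0:
        print(r, round(z, 3), round(n / 2 + C * math.exp(-2 * r / n), 3))
```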
We are able to extend all this analysis to the case of any number whatsoever of
urns; we will limit ourselves here to seek the mean value of the number of white balls
that each urn contains after n drawings.
Let us consider a number e of urns, disposed circularly, and each containing the
number n of balls, the ones white, and the others black, n being supposed a very great
number. Let us suppose that after r drawings, z0 , z1 , z2 , . . . , ze−1 are the respective
values of the diverse urns. Each drawing consists of extracting at the same time a ball
from each urn and to putting it into the following, by departing from one of them in a

determined sense. If we make $\frac{r}{n} = r'$ and $\frac{1}{n} = dr'$, we will have, by the reasoning that we have just made relatively to two urns,
\[
dz_i = (z_{i-1} - z_i)\,dr';
\]

this equation holds from $i = 1$ to $i = e - 1$. In the case of $i = e$, we have
\[
dz_0 = (z_{e-1} - z_0)\,dr'.
\]

By integrating these equations, and supposing that at the origin the respective values of [308]
each urn, or the numbers of white balls that they contain, are

λ0 , λ1 , . . . , λe−1 ,

we arrive to this result, which holds from $i = 0$ to $i = e - 1$,
\[
z_i = \frac{1}{e}\,S\;c^{-\left(1 - \cos\frac{2s\pi}{e}\right)r'}
\left\{
\begin{aligned}
&\lambda_0\cos\!\left(\frac{2si\pi}{e} - ar'\right)
+ \lambda_1\cos\!\left(\frac{2s(i-1)\pi}{e} - ar'\right)\\
&+ \lambda_2\cos\!\left(\frac{2s(i-2)\pi}{e} - ar'\right)
+ \cdots
+ \lambda_{e-1}\cos\!\left(\frac{2s(i-e+1)\pi}{e} - ar'\right)
\end{aligned}
\right\}
\]

the sign $S$ extending to all the values of $s$, from $s = 1$ to $s = e$, and $a$ being equal to $\sin\frac{2s\pi}{e}$. The term of this expression, corresponding to $s = e$, is independent of $r'$,


and equal to 1e (λ0 + λ1 + · · · + λe−1 ), that is the entire sum of the white balls of the
urns divided by their number. This term is the limit of the expression of zi , whence
it follows that after an infinite number of drawings the values of each urn are equal
among them.
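This equalization is simple to observe numerically. The following sketch is not in the original; the number of urns, the number of balls and the initial loads are assumptions. It iterates the mean-value equations $dz_i = (z_{i-1} - z_i)\,dr'$ and shows every urn approaching the common average.

```python
# Sketch: circular system of urns; each drawing passes one ball from every urn to the next.
n = 50
loads = [40.0, 10.0, 25.0, 5.0]          # assumed initial numbers of white balls
e = len(loads)
average = sum(loads) / e

for r in range(1, 5 * n + 1):
    # expected change of urn i in one drawing: gain z_{i-1}/n, lose z_i/n
    loads = [loads[i] + (loads[i - 1] - loads[i]) / n for i in range(e)]
    if r % n == 0:
        print(r, [round(z, 2) for z in loads], "limit:", round(average, 2))
```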

BOOK II
CHAPTER IV
DE LA PROBABILITÉ DES ERREURS DES RÉSULTES MOYENS D’UN
GRAND NOMBRE D’OBSERVATIONS ET DES RÉSULTATS MOYENS LES
PLUS AVANTAGEUX.

Pierre Simon Laplace∗


Théorie Analytique des Probabilités OC 7 §§18–24, pp. 309–354

ON THE PROBABILITY OF THE ERRORS OF THE MEAN RESULTS OF A


GREAT NUMBER OF OBSERVATIONS AND ON THE MOST ADVANTAGEOUS
MEAN RESULTS

To determine the probability that the sum of the errors of a great number of observations will
be comprehended within some given limits, by supposing that the law of possibility of the
errors is known, and the same for each observation, and that the negative errors are as
possible as the corresponding positive errors. General expression of this probability. No
18.
To determine, under the preceding suppositions, the probability that the sum of the errors of
a great number of observations or the sum of their squares, of their cubes, etc., will be
comprehended within some given limits, setting aside the sign. General expression of this
probability and of the most probable sum. No 19.
An element being known quite nearly, to determine its correction by the collection of a great
number of observations. Formation of the equations of condition. By disposing them in
a manner that, in each of them, the coefficient of the correction of the element has the
same sign, and adding them, we form a final equation which gives a mean correction. Ex-
pression of the probability that the error of this mean correction is comprehended within
some given limits. The most general manner to form the final equation is to multiply each
equation of condition by an indeterminate factor and to add all these products. Expression
of the probability that the error of correction given by this final equation is comprehended
within some given limits. Expression of the mean error that we are able to fear more or
less. Determination of the system of factors which render this error a minimum. We are
led then to the result that the method of least squares of the errors of observations gives.
Mean error of its result. Its expression depends on the law of facility of the errors of the
observations. Means to render it independent. No 20.
To correct, by the collection of a great number of observations, many elements already known
quite nearly. Formation of the equations of condition. By multiplying them each by an
∗ Translated by Richard J. Pulskamp, Department of Mathematics & Computer Science, Xavier University, Cincinnati, OH. January 6, 2014

indeterminate factor and adding the products, we form a first final equation: a second
system of factors gives a second final equation, and thus consecutively until we have as
many final equations as there are elements to correct. Expression of the mean errors that
we are able to fear with respect to each element corrected by these final equations. Deter-
mination of the systems of factors by the condition that these mean errors are minima. We
fall again into the method of least squares of the errors of observations; whence it follows
that this method is that which the Calculus of probabilities indicates as being the most
advantageous. Expression of the mean errors that it leaves yet to fear, more or less, with
respect to each element. These expressions are independent of the law of facility of the
errors of each observation and contain only the data of the observations. Simple means to
compare among them, on the side of precision, diverse astronomical Tables of one same
star. No 21.
Examination of the case where the possibility of the negative errors is not the same as that of
the positive errors. Mean result toward which the sum of the products of the errors of a
great number of observations converge, by any factors; probability of this convergence.
No 22.
Examination of the case where we consider the observations already made. Then the error of
the first gives the errors of all the others. The probability of this error, taken a posteriori or
according to the observations already made, is the product of the respective probabilities
a priori of the errors of each observation. By imagining therefore a curve of which the
abscissa is the error of the first observation, and of which this product is the ordinate, this
curve will be that of the probabilities a posteriori of the errors of the first observation. The
error that it is necessary to suppose to it is the abscissa corresponding to the ordinate which
divides the area of the curve into two equal parts. The value of this abscissa depends on
the unknown law of the probabilities a priori of the errors of the observations, and in this
ignorance, it is convenient to rest content with the most advantageous result, determined
a priori by the preceding articles. Investigation of the law of probabilities a priori of the
errors, which give constantly the sum of the errors null for the result that it is necessary to
choose a posteriori. This law gives generally the rule of the minimum of the squares of
the errors of the observations. This last rule becomes necessary when we must choose a
mean result among many results, each given by a great number of observations of diverse
kinds. No 23.
Investigation of the system of corrections of many elements by a great number of observations,
which render a minimum, setting aside the sign, the greatest of the errors that it supposes
to them. This system is the one which renders a minimum the sum of similar powers,
very elevated and even, of each error. It differs little from the system given by the method
of least squares of the errors of the observations. Historical notice on the methods of
correction of the elements by the observations. No 24.

§18. Let us consider now the mean results of a great number of observations of which we [309]
know the law of the facility of errors. Let us suppose first that, for each observation, the errors
are able to be equally

−n, −n + 1, −n + 2, . . . , −1, 0, 1, 2, . . . , n − 2, n − 1, n.
1
The probability of each error will be 2n+1
. If we name s the number of observations, the

l −1
coefficient of c in the development of the polynomial
√ √ √ √ √ √
−n −1
(c + c−(n−1) −1
+ c−(n−2) −1
+ · · · + c− −1
+ 1 + c −1
+ · · · + cn −1 s
)

will be the number√of combinations in which the sum of the errors is l. This coefficient is the term
independent
√ of c −1 and of its powers in the development of the same polynomial multiplied
−l −1
by c , and√ it is clearly

equal to the term independent of  in the same development
l −1
+c−l −1
multiplied by c 2
or by cos l; we will have therefore, for the expression of this
coefficient, 
1
d cos l(1 + 2 cos  + 2 cos 2 + · · · + 2 cos n)s ,
π
the integral being taken from  = 0 to  = π.
We have seen, in §36 of Book I, that this integral is [310]
\[
(2n+1)^s\,\sqrt{\frac{3}{n(n+1)2s\pi}}\;\; c^{-\frac{3l^2}{2n(n+1)s}};
\]
the total number of combinations of the errors is $(2n+1)^s$; by dividing the preceding quantity by that here, we will have
\[
\sqrt{\frac{3}{n(n+1)2s\pi}}\;\; c^{-\frac{3l^2}{2n(n+1)s}}
\]
n(n + 1)2sπ
for the probability that the sum of the errors of the s observations will be l.
If we make
\[
l = 2t\,\sqrt{\frac{n(n+1)s}{6}},
\]
the probability that the sum of the errors will be comprehended within the limits $+2T\sqrt{\frac{n(n+1)s}{6}}$ and $-2T\sqrt{\frac{n(n+1)s}{6}}$ will be equal to
\[
\frac{2}{\sqrt{\pi}}\int dt\; c^{-t^2},
\]
dtc−t ,
π
the integral being taken from t = 0 to t = T . This expression holds further in the case of n
infinite. Then, by naming 2a the interval comprehended between the limits of the errors

of each
observation, we will have n = a, and the preceding limits would become ± 2T√a6 s : thus the

probability that the sum of the errors will be comprehended within the limits ±ar s is
 
3 3 2
2 drc− 2 r ;

ar
it is also the probability that the mean error will be comprehended within the limits ± √ s
; be-
cause we have the mean error by dividing by s the sum of the errors.
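As a numerical illustration (not part of the original text; the error law and the sample sizes are assumptions), this limit law can be checked by simulation: for errors uniform on $(-a, a)$, the sum of $s$ errors falls within $\pm ar\sqrt{s}$ with probability close to $2\int_0^r \sqrt{\frac{3}{2\pi}}\,c^{-\frac{3}{2}u^2}\,du$.

```python
# Sketch: empirical check of the limit law for the sum of s uniform errors.
import math, random

a, s, r, trials = 1.0, 400, 0.5, 20_000
hits = 0
for _ in range(trials):
    total = sum(random.uniform(-a, a) for _ in range(s))
    if abs(total) <= a * r * math.sqrt(s):
        hits += 1

# Theoretical value 2 * integral_0^r sqrt(3/(2 pi)) exp(-3 u^2 / 2) du, written with erf
theory = math.erf(r * math.sqrt(3 / 2))
print(hits / trials, theory)
```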
The probability that the sum of the inclination of the orbits of s comets will be comprehended [311]
within some given limits, by supposing all the inclinations equally possible, from zero to a right

angle, is evidently the same as the preceding probability; the interval 2a of the limits of the errors
of each observation is, in this case, the interval π2 of the limits of the possible inclinations: then

the probability that the sum of the inclinations must be comprehended within the limits ± πr4 s
  3 2
is 2 2π 3
drc− 2 r , that which accords with that which we have found in §13.
Let
 us  suppose generally that the probability of each positive or negative error is expressed
by φ nx , x and n being infinite numbers. Then, in the function

1 + 2 cos  + 2 cos 2 + 2 cos 3 + · · · + 2 cos n,


 
each term, such as 2 cos x, must be multiplied by φ nx ; now we have
x
x
x2 x
2 2
2φ cos x = 2φ − φ n  + ···
n n n2 n
By making therefore
x 1
x = , dx = ,
n n
the function
n

0 1 2
φ + 2φ cos  + 2φ cos 2 + · · · + 2φ cos n
n n n n
becomes  
2n dx φ(x ) − n2 2 x2 dx φ(x ) + · · ·

the integrals must be extended from x = 0 to x = 1. Let then


 
k = 2 dx φ(x ), k = x2 dx φ(x ), ...

The preceding series becomes



k 2 2
nk 1 − n  + ··· .
k
Now the probability that the sum of the errors of the s observations will be comprehended within [312]
the limits ±l is, as it easy to be assured of it by the preceding reasonings,
⎧ ⎫s

⎪ 0 1 2 ⎪
2
 ⎨φ + 2φ cos +2φ cos 2 + · · ·⎪

d dl cos l n n n ,
π ⎪ n


⎩ +2φ cos n ⎪

n
the integral being taken from  null to  = π; this probability is therefore
 s
(nk)s k 2 2
(u) 2 d dl cos l 1 − n  + ··· .
π k
Let us suppose s
k 2 2 2
1− n  + ··· = c−t ;
k
by taking the hyperbolic logarithms, we will have, very nearly, when s is a great number,

k 2 2
s n  = t2 ,
k

that which gives 
t k
= .
n k s
 x
If we observe next that, nk or 2 dx φ n expressing the probability that the error of an obser-
vation is comprehended within the limits ±n, this quantity must be equal to unity, the function
(u) will become    

2 k −t2 lt k
dl dt c cos ,
nπ k s n k s


the integral relative to t must be taken from t null to t = πn kk s , or to t = ∞, n being
supposed infinite. Now we have, by §25 of Book I,
    √
lt k −t2 π − l22 kk s
dt cos c = c 4n ,
n k s 2
by making therefore  [313]
l k s
= 2t ,
n k
the function (u) becomes 
2 2
√ dt c−t
π
Thus, by naming, as above, 2a the interval comprehended between the limits of the errors of each
observation, the probability that the sum of the errors of the s observations will be comprehended

within the limits ±ar s is  
k kr 2
dr c− 4k .
k s
 
If φ nx is constant, then kk = 6, and this probability becomes
 
3 3 2
2 drc− 2 r ,

that which is conformed to that which we have found above.
If φ nx or φ(x ) is a rational and entire function of x , we will have, by the method of

§15, the probability that the sum of the errors will be comprehended within the limits ±ar s,

expressed by a sequence of powers s, 2s, . . . of quantities of the form s − μ ± r s, in which μ
increases in arithmetic progression, these quantities being continued until they become negatives.
By comparing this sequence to the preceding expression of the same probability, we will obtain
in a quite close manner the value of the sequence, and we will arrive thus with respect to this
kind of sequence to some theorems analogous to those that we have given in §42 of Book I, on
the finite differences of the powers of a variable.
If the law of facility of the errors is expressed by a negative exponential which is able to be
extended to infinity, and generally if the errors are able to be extended to infinity, then a becomes
infinite, and the application of the preceding method is able to offer some difficulties. In all these [314]
cases, we will make
x 1
= x , = dx ,
h h
h being any finite quantity whatsoever, and by following exactly the preceding analysis, we will
find, for the probability that the sum of the errors of the s observations is comprehended within

the limits ±hr s,  
k kr 2
dr c− 4k .
k s


 
an expression in which we must observe that φ hx or φ(x ) expresses the probability of the
error ±x, and that we have
 
k = 2 dx φ(x ), k = x2 dx φ(x ),

the integrals being taken from x = 0 to x = ∞.

§19. Let us determine presently the probability that the sum of the errors of a very great
number of observations will be comprehended within some given limits, setting aside the sign of
these errors, that is, by taking them all positively. For this, let us consider the series
n



n−1 0
φ c−n −1 +φ c−(n−1) −1 + · · · + φ + ···
n n n
√ n

n−1
+φ c(n−1) −1 + φ cn −1 ,
n n
x
φ n being the ordinate on the curve of probability of errors, corresponding to the error ±x,
and x being, in the same way as n, considered as formed of an infinite number of units. If we
raise this series to the power s, after having changed √ the sign of the negative exponentials, the
coefficient of any one exponential, such as c(l+μs) −1 , will be the probability that the sum of
the errors, taken setting aside the sign, is l + μs; this probability is therefore [315]
⎧ √ ⎫s


⎪ 0 1 2 ⎪
1
 √ ⎨φ + 2φ c  −1
+2φ c 2 −1
+ · · ·⎪

d c −(l+μs) −1 n n n ,
2π ⎪

n n −1 √ ⎪

⎩ +2φ c ⎪

n
the integral
 relative to
√  being taken from  = −π to  = π; because, in this interval, the
integral d c−r −1 or 

d(cos r − −1 sin r)
disappears, whatever be r, provided that it is not null.
We have, by developing with respect to the powers of ,
⎧  √
 √ n

s 

⎪ −μs −1 0 1  −1 n −1
⎪ log c
⎪ φ + 2φ c + · · · + 2φ c

⎪ n n n

⎪ ⎧





⎪ ⎪ φ 0 + 2φ 1 + 2φ 2 + · · · + 2φ n






⎪ ⎪
⎪ n n n n ⎪

⎨ ⎪
⎪   ⎪


⎪ √


(1) ⎪
⎨ 1 2 n ⎪


⎪ + 2 −1 φ + 2φ + · · · + nφ √

⎪ = s log n n n − μs −1

⎪ ⎪  n
 ⎪ ⎪

⎪ ⎪
⎪ 1 2 ⎪

⎪ ⎪
⎪ −  2
φ + 2 2
φ + · · · + n 2
φ ⎪


⎪ ⎪
⎪ ⎪


⎪ ⎪
⎪ n n n ⎪


⎩ ⎪
⎩ ⎪

− .............................................
By making therefore
x 1
= x , = dx ,
n n
we have
  
2 dx φ(x ) = k, x dx φ(x ) = k , x2 dx φ(x ) = k ,
 
x3 dx φ(x ) = k , x4 dx φ(x ) = kiv , .............,

the integrals being taken from x null to x = 1; the second member of equation (1) becomes

2k √ k 2 2 √
s log nk + s log 1 + n −1 − n  − · · · − μs −1.
k k
The error of each observation must fall necessarily within the limits ±n, we have nk = 1; the [316]
preceding quantity becomes thus

2k μ √ (kk − 2k )2 sn2 2
s − n −1 − − ··· ;
k n k2
by making therefore
μ 2k
=
n k
and neglecting the powers of  superior to the square, this quantity is reduced to its second term,
and the preceding probability becomes
 √
1 
−lw −1− kk −2k
2
sn2  2
dc k2 .

Let
k βt l √
β= √ , = √ , = r s.
kk − 2k2 n s n
The preceding integral becomes
β 2 r2   √ 
1 c− 4 − t+ lβ √−1 2
√ βdtc 2n s .
2π n s
This integral must be taken from t = −∞ to t = ∞, and then the preceding quantity becomes
β β 2 r2
√ √ c− 4 .
2 πn s

By multiplying it by dl or by ndr s, the integral

1 β 2 r2
√ βdrc− 4
2 π
will be the half-probability that the value of l, and, consequently, the sum of the errors of the
 √
observations, is comprehended within the limits 2kk as±ar s, ±a being the limits of the errors
of each observation, limits that we designate by ±n, when we imagine them partitioned into an
infinite number of parts.
We see thus that the sum of the most probable errors, setting aside the sign, is that which
 
[317]
corresponds to r = 0. This sum is 2kk as. In the case where φ(x) is constant, 2kk = 12 ; the sum
of the most probable errors is therefore then the half of the greatest sum possible, a sum which
is equal to sa. But, if φ(x) is not constant and diminishes in measure as the error x increases,

then 2kk is less than 12 , and the sum of the errors, setting aside the sign, is below the half of the
greatest sum possible.
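A quick simulation makes the uniform case concrete; the values below are illustrative assumptions and the code is not part of the original text. The sum of the errors taken without regard to sign concentrates near half of its greatest possible value $sa$.

```python
# Sketch: the sum of |errors| for s observations with errors uniform on (-a, a)
# concentrates near s*a/2, half of the greatest possible sum s*a.
import random

a, s, trials = 1.0, 500, 5_000
sums = []
for _ in range(trials):
    sums.append(sum(abs(random.uniform(-a, a)) for _ in range(s)))

mean_sum = sum(sums) / trials
print(mean_sum, s * a / 2)   # the two values should be close
```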
We are able, by the same analysis, to determine the probability that the sum of the squares
of the errors will be l + μs; it is easy to see that this probability has for expression the integral
⎧ √ ⎫s


⎪ 0 1 2 22  −1 ⎪
1
 √ ⎨φ + 2φ c  −1
+2φ c + · · ·⎪

d c −(l+μs) −1 n n n ,
2π ⎪

n n2  −1 √ ⎪

⎩ +2φ c ⎪

n

taken from  = −π to  = π. By following exactly the preceding analysis, we will have

2n2 k
μ= ,
k
and by making
k
β = √ ,
kkiv − 2k2
the probability that the sum of the squares of the errors of the s observations will be compre-
 √
hended within the limits 2kk a2 s ± a2 r s will be

1 β 2 r 2
√ β  dr c− 4 .
π

The most probable sum is that which corresponds to r null; it is therefore 2kk a2 s. If s is a
very great number, the result of the observations will deviate very little from this value, and
2 
consequently it will make known very nearly the factor a kk .

§20. When we wish to correct an element already known quite nearly, with the collection of [318]
a great number of observations, we form some equations of condition in the following manner.
Let z be the correction of the element, and β the observation; the analytic expression of the latter
will be a function of the element. By substituting, instead of the element, its approximate value,
plus the correction z; by reducing into series with respect to z and neglecting the square of z,
this function will take the form h + pz; by equating it to the observed quantity β, we will have

β = h + pz.

z would be therefore determined, if the observation was rigorous; but, as it is susceptible of error,
by naming  this error, we have exactly, to the quantities near of order z 2 ,

β +  = h + pz;

and by making β − h = α, we have


 = pz − α.
Each observation furnishes a similar equation, that we are able to represent for the (i + 1)st
observation by this one
(i) = p(i) z − α(i) .
By reuniting all these equations, we have

(1) S(i) = zSp(i) − Sα(i) ,

the sign S relating to all the values of i, from i = 0 to i = s − 1, s being the total number of
observations. By supposing null the sum of the errors, this equation gives

Sα(i)
z= ;
Sp(i)
it is that which we name ordinarily mean result of the observations.
We have seen, in §18, that the probability that the sum of the errors of s observations will be [319]

comprehended within the limits ±ar s is
 
k kr 2
dr c− 4k .
k π



Let us name ±u the error of the result z; by substituting, into equation (1), ±ar s instead of
(i) (i)
S , and Sα
Sp(i)
± u instead of z, it gives

uSp(i)
r= √ ;
a s
the probability that the error of the result z will be comprehended within the limits ±u is there-
fore  
k (i) du − ku2 (Sp (i) )2
Sp c 4k a2 s .
k π
 a
Instead of supposing null the sum of the errors, we are able to suppose null any linear func-
tion of these errors, that we will represent thus,
(m) m + m(1) (1) + m(2) (2) + · · · + m(s−1) (s−1) ,
m, m(1) , m(2) , . . . being positive or negative whole numbers. By substituting into this function
(m), instead of , (1) , (2) , . . ., their values given by the equations of condition, it becomes
zSm(i) p(i) − Sm(i) α(i) ;
by equating therefore to zero the function (m), we have
Sm(i) α(i)
z= .
Sm(i) p(i)
Let u be the error of this result, so that we have
Sm(i) α(i)
z= + u;
Sm(i) p(i)
the function (m) becomes
uSm(i) p(i) .
Let us determine the probability of the error u, when the observations are in great number.
For this, let us consider the product [320]




x mx√−1 x −m(1) x√−1 x m(s−1) xn√−1


φ c × φ c × ··· × φ c ,
a a a

the sign extending   to all the values of x, from the extreme negative value of x to its positive
extreme value. φ xa is, as in the preceding sections, the probability of an error x in each obser-
vation; x being supposed, in the same way as a, formed √ from an infinity of parts taken for unity.
It is clear that the coefficient of any exponential cl −1 , in the development of this product,
will be the probability that the sum of the errors of the observations, multiplied respectively by
m, m(1) , . . . , that√is the function (m), will be equal√to l; by multiplying therefore the preceding
product by c−l −1 , the term independent of c −1 and of its powers, in this new product,
will express this probability. If we suppose, as we will do it here, the probability
  of the positive

errors the same as that of the negative errors, we will have, for the sum φ xa cmx −1 , to
√ √
reunite the terms  xmultiplied,
 one by cmx −1 , and the other by c−mx −1 , then this sum takes
the form 2 φ a cos mx. It is likewise of it of all the similar sums. Thence it follows that
the probability that the function (m) will be equal to l is equal to
⎧ √


⎪ −l −1 x ⎪
 ⎪
⎨c ×2 φ cos mx ⎪

1 a
(i) d 

,
2π ⎪
⎪ x x ⎪
⎩ ×2 φ cos m(i) x × · · · × 2 φ cos m(s−1) x⎪ ⎭
a a

the integral being taken from  = −π to  = π. We have, by reducing the cosines into series,


 2

x x 1 2 2 2 x x
φ cos mx = φ − m a  φ + ···
a a 2 a2 a
If we make x
a
= x and if we observe that, the variation of x being unity, we have dx = 1
a
, we [321]
will have
 x

φ dx φ(x ).
=a
a

Let us name, as in the preceding sections, k the integral 2 dx φ(x ), taken from x null to its
extreme positive value; let us name similarly k the integral x2 dx φ(x ), taken within the


same limits, and thus consecutively; we will have




x k 2 2 2 kiv 4 4 4
2 φ cos mx = ak 1 − m a  + m a  − ··· .
a k 12k
The logarithm of the second member of this equation is

k 2 2 2 kkiv − 6k2 4 4 4


m a  +
− m a  − · · · + log ak;
k 12k2
  
ak or 2a dx φ(x ) expresses the probability that the error of each observation will be compre-
hended within its limits, that which is certain; we have therefore ak = 1; that which reduces the
preceding logarithm to

k 2 2 2 kkiv − 6k2 4 4 4


m a  +− m a  − ···
k 12k2
Thence it is easy to conclude that the product

 x
 x
 x

2 φ cos mx × 2 φ cos m(i) x × · · · × ×2 φ cos m(s−1) x


a a a
is

kkiv − 6k2 4 4 k a2  2 Sm(i)2
1+ a  Sm(i)4 + · · · c− k ;
12k2
the preceding integral (i) is reduced therefore to


1 kkiv − 6k2 4 4 (i)4
√ k 2 2 (i)2
d 1 + a  Sm + · · · c−lw −1− k a  Sm .
2π 12k 2

By making sa2 2 = t2 , this integral becomes


 √
1 kkiv − 6k2 Sm(i)4 4 
− lw √−1 − kk Sm(i)2 t2
√ dt 1 + t + · · · c a s s
;
2aπ s 12k 2 s 2

(i)4
Sm(i)2 , Sm(i)4 , . . . are evidently quantities of order s; thus Sms2 is of order 1s ; by neglecting [322]
therefore the terms of this last order vis-à-vis of unity, the last integral is reduced to
 √
1 
− lt √−1 − kk Sms
(i)2 2
t
√ dtc a s .
2aπ s

The integral relative to  must be taken from  = −π to  = π, the integral relative to  t must
√ √
be taken from t = −aπ s to t = aπ s, and in these cases the exponential under the sign is
insensible to these two limits, either because s is a great number, or because a is here supposed
divided into an infinity of parts taken for unity; we are able therefore to take the integral from
t = −∞ to t = ∞. Let us make
 √ √
 k Sm(i)2 l −1k s
t = t+ ;
ks 2ak Sm(i)2

the preceding integral function becomes

− kl2 
c 4k a2 Sm(i)2 2
 dt c−t .
k
2aπ k
Sm(i)2

The integral relative to t must be taken, as the integral relative to t, from t = −∞ to t = ∞,


that which reduces the preceding quantity to this one,

− kl2
c 4k a2 Sm(i)2
√  
2a π kk Sm(i)2

If we make l = ar s and if we observe that, the variation of l being unity, we have adr = 1,
we will have √  kr 2 s
s −
 dr c 4k Sm(i)2 ,

2 k k π Sm(i)2

for the probability that the function (m) will be comprehended within the limits zero and ar s, [323]
the integral being taken from r null.
We have need here to know the probability of the error u of the element determined by

making null the function (m). This function being supposed equal to l or to ar s, we will have,
by that which precedes,

uSm(i) p(i) = ar s;
by substituting this value into the preceding integral function, it becomes
 ku2 (Sm(i) p(i) )2
Sm(i) p(i) −
 du c 4k a2 Sm(i)2 ;

2a k k π Sm(i)2
this is the expression of the probability that the value of u will be comprehended within the limits
zero and u, it is also the expression of the probability that u will be comprehended within the
limits zero and −u. If we make
 √
k Sm(i)2
u = 2at ,
k Sm(i) p(i)
the preceding probability becomes 
1 2
√ dt c−t .
π
 t remains
Now, the probability remaining the same, √ the same, and the interval of the two limits
k Sm(i)2
of u are tightened so much more as a k Sm(i) p(i)
is smaller. This interval remaining the

same, the value of t, and consequently the probability 
that the
√ error of the element falls within
k Sm(i)2
this interval, is so much greater as the same quantity a k Sm(i) p(i)
is smaller; it is necessary
(i)
therefore to choose the system of factors m , which renders this quantity a minimum; and as

a,
√k, k are the same in all these systems, it is necessary to choose the system which renders
Sm(i)2
Sm(i) p(i)
a minimum.
We are able to arrive to the same result in this manner. Let us resume the expression of [324]
the probability that u will be comprehended within the limits zero and u. The coefficient of du
in the differential of this expression is the ordinate of the curve of probabilities of the errors u
of the element, errors represented by the abscissa u of this curve, that we are able to extend to
infinity on each side of the ordinate which corresponds to u null. This premised, each error,
either positive, or negative, must be considered as a disadvantage or a real loss, in any game
whatsoever; now, by the principles of the theory of probabilities, exposed at the beginning of this
Book, we evaluate this disadvantage by taking the sum of all the products of each disadvantage
by its probability; the mean value of the error to fear more is therefore the sum of the products
of each error by its probability; it is consequently equal to the integral
 −
ku2 (Sm(i) p(i) )2
u du Sm(i) p(i) c 4k a2 Sm(i)2
 ,

2a k k π Sm(i)2

taken from u null to u infinity; thus this error is


 √
k Sm(i)2
a .
kπ Sm(i) p(i)
This quantity, taken with the − sign, gives the mean error to fear to less. It is clear that the
system of factors m(i) that it
√is necessary to choose must be such that these errors are minima
Sm(i)2
and consequently such that Sm(i) p(i)
is a minimum.
If we differentiate this function with respect to m(i) , we will have, by equating its differential
to zero, by the condition of the minimum,

m(i) p(i)
= .
Sm (i)2 Sm(i) p(i)
This equation holds, whatever be i, and, as the variation of i does not change the fraction
Sm(i)2
Sm(i) p(i)
at all, by naming μ this fraction, we will have [325]

m = μp, m(1) = μp(1) , ..., m(s−1) = μp(s−1) ,

and we are able, whatever be p, p(1) , . . . , to take μ such that the numbers m, m(1) , . . . are whole
numbers, as the preceding analysis supposes it. Then we have

Sp(i) α(i)
z= ,
Sp(i)2
and the mean error to fear becomes 
k
a kπ
± .
Sp(i)2
It is, under all the hypotheses that we are able to make on the factors m, m(1) , . . . , the smallest
possible mean error.

If we make the values of m, m(1) , . . . equal to ±1, the mean error to fear will be smaller
when the sign ± will be determined, in a manner that m(i) p(i) is positive, that which returns
to supposing 1 = m = m(1) = · · · , and to prepare the equations of condition, so that the
coefficient of z in each of them is positive; this is that which we do in the ordinary method. Then
the mean result of the observations is
Sα(i)
z= ,
Sp(i)
and the mean error to fear more or less is

ks
a kπ
± ;
Sp(i)
but this error surpasses the preceding, which, as we have seen is the smallest possible. We are
able to be convinced of it besides in this manner. It suffices to show that we have the inequality
\[
\frac{\sqrt{s}}{Sp^{(i)}} > \frac{1}{\sqrt{Sp^{(i)2}}}
\]
or [326]
\[
s\,Sp^{(i)2} > \left(Sp^{(i)}\right)^2.
\]
In fact, 2pp(1) is less than p2 +p(1)2 , since (p(1) −p)2 is a positive quantity; we are able therefore,
in the second member of the preceding inequality, to substitute, for 2pp(1) , p2 + p(1)2 − f , f
being a positive quantity. By making similar substitutions for all the similar products, this second
member will be equal to the first, less a positive quantity.
The result
Sp(i) α(i)
z= ,
Sp(i)2
to which corresponds the minimum of mean error to fear, is the one which the method of least
squares of the errors of the observations gives; because, the sum of these squares being

(pz − α)2 + (p(1) z − α(1) )2 + · · · + (p(s−1) z − α(s−1) )2 ,

the condition of the minimum of this function, by making z vary, gives for this variable the
preceding expression; this method must therefore be employed in preference, whatever be the

law of facility of the errors, a law on which depends the ratio kk .
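To make the rule concrete, here is a minimal sketch on synthetic data (the true correction, the coefficients and the error law are assumptions, not taken from the text): the result $z = \frac{Sp^{(i)}\alpha^{(i)}}{Sp^{(i)2}}$ is exactly what minimizing the sum of squared residuals gives, and it can be compared with the ordinary mean result $\frac{S\alpha^{(i)}}{Sp^{(i)}}$.

```python
# Sketch: the correction z that minimizes sum (p_i z - alpha_i)^2 is
# z = sum(p_i * alpha_i) / sum(p_i^2).
import random

true_z = 0.37                        # assumed true correction of the element
s = 1000
p = [random.uniform(0.5, 2.0) for _ in range(s)]
alpha = [pi * true_z + random.uniform(-0.1, 0.1) for pi in p]   # observations with bounded errors

z_hat = sum(pi * ai for pi, ai in zip(p, alpha)) / sum(pi * pi for pi in p)
z_mean = sum(alpha) / sum(p)         # the "ordinary" mean result S(alpha)/S(p)
print(z_hat, z_mean, true_z)
```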
This ratio is 16 , if φ(x) is a constant; it is less than 16 , if φ(x) is variable, and such that it
diminishes in measure as x increases, as it is natural to suppose. By adopting the mean law of
1
errors, that we have given in §15 and according to which φ(x) is equal to 2a log xa , we have

k 1
= .
k 18
As for the limits ±a, we are able to take for these limits the deviations of the mean result, which
would make to reject an observation. 

But we are able, by the same observations, to determine the factor a kk of the expression
of the mean error. In fact, we have seen, in the preceding section, that the sum of the squares [327]
2 
of the errors of the observations is very nearly 2s a kk , and that, if they are in great number, it
becomes extremely probable that the observed sum will not deviate from this value by a sensible

quantity; we are able therefore to equate them; now the observed sum is equal to S(i)2 or to
(i) (i)
α
S(p(i) z − α(i) )2 , by substituting for z its value SpSp(i)2 ; we find thus

a2 k Sp(i)2 .Sα(i)2 − (Sp(i) α(i) )2


2s = .
k Sp(i)2
The preceding expression of the mean error to fear respecting the result z becomes then

Sp(i)2 .Sα(i)2 − (Sp(i) α(i) )2
± √ ,
Sp(i)2 2sπ

an expression in which there is nothing which is not given by the observations and by the coeffi-
cients of the equations of condition.
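The following lines continue the synthetic example above (a sketch; nothing here beyond the displayed formula comes from the text) and evaluate that expression of the mean error to fear from the observations alone.

```python
# Sketch: mean error to fear on z, computed only from the data, as
# sqrt(Sp^2 * Salpha^2 - (Sp*alpha)^2) / (Sp^2 * sqrt(2*s*pi)).
import math, random

s = 1000
true_z = 0.37
p = [random.uniform(0.5, 2.0) for _ in range(s)]
alpha = [pi * true_z + random.uniform(-0.1, 0.1) for pi in p]

Sp2 = sum(pi * pi for pi in p)
Sa2 = sum(ai * ai for ai in alpha)
Spa = sum(pi * ai for pi, ai in zip(p, alpha))

mean_error = math.sqrt(Sp2 * Sa2 - Spa ** 2) / (Sp2 * math.sqrt(2 * s * math.pi))
print(mean_error)
```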

§21. Let us suppose now that we have two elements to correct by the collection of a great
number of observations. By naming z and z  the respective corrections of these elements, we
will form, as in the preceding section, some equations of condition, which will be comprehended
under this general form
(i) = p(i) z + q (i) z  − α(i) ,
(i) being, as in that section, the error of the (i + 1)st observation. If we multiply respectively by
m, m(1) , . . . , m(s−1) these equations, and if we add together these products, we will have a first
final equation
Sm(i) (i) = z.Sm(i) p(i) + z  .Sm(i) q (i) − Sm(i) α(i) .
By multiplying further the same equations respectively by n, n(1) , . . . , n(s−1) and adding these
products, we will have a second final equation

Sn(i) (i) = z.Sn(i) p(i) + z  .Sn(i) q (i) − Sn(i) α(i) ,

the sign S extending here, as in the preceding section, to all the values of i, from i = 0 to [328]
i = s − 1.
If we suppose null the two functions Sm(i) (i) , Sn(i) (i) , functions which we will designate
respectively by (m) and (n), the two preceding final equations will give the corrections z and
z  of the two elements. But these corrections are susceptible of errors, relative to that of which
the supposition that we have just made is itself susceptible. Let us imagine therefore that the
functions (m) and (n), instead of being nulls, are respectively l and l , and let us name u and u
the errors corresponding to the corrections z and z  , determined by that which precedes; the two
final equations will become

l = u.Sm(i) p(i) + u .Sm(i) q (i) ,


l = u.Sn(i) p(i) + u .Sn(i) q (i) .

It is necessary now to determine the factors m, m(1) , . . . , n, n(1) , . . . , in a manner that the mean
error to fear respecting each element is a minimum. For this, let us consider the product



x −(m+n )x√−1 x −(m(1) +n(1)  )x√−1


φ c × φ c × ···
a a


x −(m(s−1) +n(s−1)  )x√−1


× φ c ,
a

  
the sign referring to all the values of x, from x = −a to x = a; φ xa being, as in the
preceding section, the probability of the error x, in the same way as of the error −x. The
preceding function becomes, by reuniting the two exponentials relative to x and to −x,



x x
2 φ cos(mx + nx ) × 2 φ cos(m(1) x + n(1) x ) × · · ·
a a


x
×2 φ cos(m(s−1) x + n(s−1) x ),
a

the sign extending here to all the values of x, from x = 0 to x = a, x being supposed, in the [329]
same way as a, divided into an infinity of parts taken for unity. Presently, it is clear√
that the √term
 
independent of the exponentials, in the product of the preceding function by c−l −1+l  −1 ,
is the probability that the sum of the errors of each observation, multiplied respectively by m,
m(1) , . . . , or the function (m), will be equal to l, at the same time as the function (n), sum of
the errors of each observation, multiplied respectively by n, n(1) , . . . , will be equal to l ; this
probability is therefore
⎧ 

⎪ x  ⎪
 √ √

⎨ 2 φ cos(m + n )x × · · · ⎪

1  −l −1−l   −1 a
dd c 
,
4π 2 ⎪
⎪ x ⎪
⎩ ×2 φ cos(m(s−1)  + n(s−1)  )x,⎪ ⎭
a
the integrals being taken from  and  equal to −π, to  and  equal to π. This premised:
By following exactly the analysis of the preceding section, we find that the preceding func-
tion is reduced to very nearly
 √
1  √ k 2 2 (i)2  (i) (i) 2 (i)2
dd c−l −1−l  −1− k a [ Sm +2 .Sm n + .Sn ] ,
4π 2

k and k having here the same signification as in the section cited. We see further, by the same
section, that the integrals are able to be extended from a = −∞, a = −∞, to a = ∞
and a = ∞. If we make

a .Sm(i) n(i) kl −1
t = a + + 
Sm(i)2 2k a.Sm(i)2

  k (lSm n − l Sm(i)2 ) −1
(i) (i)
t = a −  ;
2k a Sm(i)2 .Sn(i)2 − (Sm(i) n(i) )2
if we make next
E = Sm(i)2 .Sn(i)2 − (Sm(i) n(i) )2 ,
the preceding double integral becomes

− k [l2 Sn(i)2 −2ll Sm(i) n(i) +l2 Sm(i)2 ] dt dt − kkl2 Sm(i)2 − k t2 E
c 4k a2 E × c kSm(i)2 .
4π 2 a2
By taking the integrals within the positive and negative infinite limits, as those relative to a [330]
and a , we will have
1 k l2 Sn(i)2 −2ll Sm(i) n(i) +l2 Sm(i)2
(o) 4k π 2
√ c− 4k a2 E .
k
a E

It is necessary now, in order to have the probability that the values of l and of l will be compre-
hended within the given limits, to multiply this quantity by dl dl , and to integrate next within

15
theselimits. By naming X this quantity, the probability of which there is concern will be there-
fore X dl dl . But, in order to have the probability that the errors u and u of the corrections
of the elements will be comprehended within the given limits, it is necessary to substitute within
this integral, instead of l and l , their values in u and u . Now, if we differentiate the expressions
of l and of l , by supposing l constant, we have

dl = duSm(i) p(i) + du Sm(i) q (i) ,


0 = duSn(i) p(i) + du Sn(i) q (i) ,

that which gives

du(Sm(i) p(i) .Sn(i) q (i) − Sn(i) p(i) .Sm(i) q (i) )


dl = .
Sn(i) q (i)

If we differentiate next the expression of l , by supposing u constant, we have

dl = du Sn(i) q (i) ;

we will have therefore

dl dl = (Sm(i) p(i) .Sn(i) q (i) − Sn(i) p(i) .Sm(i) q (i) )du du .

By making next

F = Sn(i)2 (Sm(i) p(i) )2 − 2Sm(i) n(i) .Sm(i) p(i) .Sn(i) p(i) + Sm(i)2 .(Sn(i) p(i) )2 ,
G = Sn(i)2 .Sm(i) p(i) .Sm(i) q (i) + Sm(i)2 .Sn(i) p(i) .Sn(i) q (i)
− Sm(i) n(i) .Sn(i) p(i) .Sm(i) q (i) + Sm(i) p(i) .Sn(i) q (i) .
H = Sn(i)2 .(Sm(i) q (i) )2 − 2Sm(i) n(i) .Sm(i) q (i) .Sn(i) q (i) + Sm(i)2 .(Sn(i) q (i) )2 ,
I = Sm(i) p(i) .Sn(i) q (i) − Sn(i) p(i) .Sm(i) q (i) ,

the function (o) becomes [331]


  k(F u2 +2Guu +Hu2 )
k 1 du du −
√ c 4k a2 E .
4k π E a2

Let us integrate first this function from u = −∞ to u = ∞. If we make


   Gu 
kH
4k
u + H
t= √ ,
a E
and if we take the integral from t = −∞ to t = ∞, we will have, by considering of it only the
variation of u ,
 
k du 1 − ku 2 F H−G2
√ c 4k a2 EH .
4k π a H


Now we have
F H − G2
= I 2;
E
the preceding integral becomes therefore
 
1 du k − 4kk I 22u2
√ c a H .
H a 4k  π

16
We will have, by the preceding section, the mean error to fear, more or less, respecting the
correction of the first element, by multiplying the quantity under the sign by ±u, and taking
the integral from u = 0 to u = ∞, that which gives, for this error,

a H
±  ,
I kkπ 

the + sign indicating the mean error to fear to more, and the − sign the mean error to fear less.
Let us determine presently the factors m(i) and n(i) , in a manner that this error is a minimum. [332]
By making m(i) alone vary, we have

H −p(i) Sn(i) q (i) + q (i) Sn(i) p(i)
d log =dm(i)
I I
 (i) (i)2 
q Sn .Sm(i) q (i) − n(i) .Sm(i) q (i) .Sn(i) q (i)
− q (i) .Sm(i) n(i) .Sn(i) q (i) + m(i) (Sn(i) q (i) )2
+ dm(i)
H
It is easy to see that this differential disappears, if we suppose, in the coefficients of dm(i) ,

m(i) = μp(i) , n(i) = μq (i) ,

μ being an arbitrary coefficient independent of i, and by means of which we are able to render
m(i)√
and n(i) whole numbers; the preceding supposition renders therefore null the differential
of I , taken with respect to m(i) . We will see in the same manner that this supposition renders
H

null the differential of the same quantity, taken with respect to n(i) . Thus this supposition renders
a minimum the mean error to fear respecting the correction of the first element; and we will see
in the same manner that it renders further a minimum mean error to fear respecting the correction
of the second element, an error that we obtain by changing in the expression of the preceding H
into F . Under this supposition, the corrections of the two elements are

Sq (i)2 .Sp(i) α(i) − Sp(i) q (i) .Sq (i) α(i)


z= ,
Sp(i)2 .Sq (i)2 − (Sp(i) q (i) )2
Sp(i)2 .Sq (i) α(i) − Sp(i) q (i) .Sp(i) α(i)
z = .
Sp(i)2 .Sq (i)2 − (Sp(i) q (i) )2
It is easy to see that these corrections are those that the method of least squares of the errors of
the observations gives, or of the minimum of the function

S(p(i) z + q (i) z  − α(i) )2 ;

whence it follows that this method holds generally, whatever be the number of elements to deter-
mine; because it is clear that
 the previous analysis
 can be extended to any number of elements.
k S(i)2
By substituting for a kπ
the quantity 2sπ
, to which we are able, by §20, to suppose it [333]
equal, , (1) , . . . being that which remains in the equations of condition after having substituted
there the corrections given by the method of least squares of the errors, the mean error to fear
respecting the first element is
 
S(i)2
2sπ
Sq (i)2
± .
Sp(i)2 .Sq (i)2 − (Sp(i) q (i) )2

17
The mean error to fear more or less respecting the second element is
 
S(i)2
2sπ
Sp(i)2
± .
Sp(i)2 .Sq (i)2 − (Sp(i) q (i) )2

whence we see that the first element is more or less well determined as the second, according as
Sq (i)2 is smaller or greater than Sp(i)2 .
If the r first equations of condition do not contain q at all, and if the s − r last do not contain
p at all, then Sp(i) q (i) = 0, and the preceding formulas coincide with that of the preceding
section.
We are able to obtain thus the mean error to fear respecting each element determined by
the method of least squares of the errors, whatever be the number of elements, provided that
we consider a great number of observations. Let z, z  , z  , z  , . . . be the corrections of each
element, and let us represent generally the equations of condition by the following:

(i) = p(i) z + q (i) z  + r(i) z  + t(i) z  + · · · − α(i) .

In the case of a single element, the mean error to fear is, as we have seen,

S(i)2 1
(a) ±  .
2sπ Sp(i)2
When there are two elements, we will have the mean error to fear respecting the first element by [334]
(i)2 (i)2 (Sp(i) q (i) )2
changing, in the function (a), Sp into Sp − Sq (i)2
, that which gives, for this error,
 
S(i)2
 2sπ
Sq (i)2
(a ) ± .
Sp(i)2 . Sq (i)2 − (Sp(i) q (i) )2

When there are three elements, we will have the error to fear respecting the first element, by
(Sp(i) r (i) )2
changing, in this expression (a ), Sp(i)2 into Sp(i)2 − Sr (i)2
, Sp(i) q (i) into Sp(i) q (i) −
(i) (i) (i) (i) (i) (i) 2
r .Sq r (Sq r )
Sp
Sr (i)2
, and Sq (i)2 into Sq (i)2 − Sr (i)2
; that which gives for this error
 
S(i)2
 2sπ
Sq (i)2 .Sr(i)2 − (Sq (i) r(i) )2
(a ) ±  .

 Sp(i)2 .Sq (i)2 .Sr(i)2 − Sp(i)2 (Sq (i) r(i) )2 − Sq (i)2 (Sp(i) r(i) )2

− Sr(i)2 (Sp(i) q (i) )2 + 2Sp(i) q (i) .Sp(i) r(i) .Sq (i) r(i)

In the case of four elements, we will have the mean error to fear respecting the first element,
(Sp(i) t(i) )2
by changing in this expression (a ), Sp(i)2 into Sp(i)2 − St(i)2
, Sp(i) q (i) into Sp(i) q (i) −
Sp(i) t(i) .Sq (i) t(i)
St(i)2
,
etc. By continuing thus, we will have the mean error to fear respecting the first
element, whatever be the number of elements. By changing, in the expression of this error, that
which is relative to the first element, into that which is relative to the second and reciprocally,
we will have the mean error to fear respecting the second element, and thus of the others.
Thence results a simple way to compare among them diverse astronomical Tables, on the
side of precision. These Tables are able always to be supposed reduced to the same form, and
then they differ only by the epochs, the mean movements and the coefficients of their arguments;
because, if one of them, for example, contains an argument which is not found at all in the others,
it is clear that that returns to supposing, in the latter, this coefficient null. Now, if we compared

18
these Tables to the totality of the good observations, by rectifying them through this comparison,
these Tables, thus rectified, would satisfy, by that which precedes, the condition that the sum [335]
of the squares of the errors that they would permit to subsist yet be a minimum. The Tables
which would approach most to fulfill this condition would merit therefore preference, whence
it follows that by comparing these diverse Tables to a considerable number of observations, the
presumption of exactitude must be in favor of that in which the sum of the squares of the errors
is smaller than in the others.

§22. To here we have supposed the facilities of the positive errors the same as those of
the negative errors. Let us consider now the general case in which these facilities are able to
be different. Let us name a the interval in which the errors of each observation are able to be
extended, and let us suppose it divided into an infinite number n + n of equal parts and taken
for unity, n being the number of the parts which correspond to the negative errors, and n being
the number of the parts which correspond to the positive errors. On each point of the interval
a let us raise an ordinate

which expresses the probability of the corresponding error, and let us
x
designate by φ n+n  the ordinate corresponding to the error x. This premised, let us consider
the series

 
−n −qn −1 −(n − 1) −q(n−1)√−1
φ c + φ c + ···
n + n n + n


−1 −q −1 0 1
+φ c + φ + φ cq −1 + · · ·
n + n n + n n + n

n −1 √
q(n −1) −1 n  √
+φ c + φ cqn  −1 .
n+n  n+n 

 x
qx√−1 
Let us represent this series by φ n+n c , the sign extending to all the values of x,

from x = −n to x = n . The term independent of c −1 and of its powers, in the development
of the function

 √
 √
 √
x x q (1) x −1 x (s−1)
c−(l+μ) −1 φ c qx −1
φ c · · · φ cq x −1
,
n+n  n+n  n+n 

will be, by §21, the probability that the function [336]


(1) (1) (s−1) (s−1)
(m) q + q  + ··· + q 

will be equal to l + μ; this probability is therefore


 √ √
 √
1 x
(1) d c−l −1 c−μ −1 φ cqx −1 × · · · ,
2π n + n
the integral being taken from  = −π to  = π. The logarithm of the function

 √
 √
x x (1)
(2) c−μ −1 φ c qx −1
× φ cq x −1 · · ·
n+n  n+n 

is  
√ x √
−μ −1 + log φ cqx −1
+ ···
n + n
n and n being supposed infinite numbers, if we make
x 1
= x , = dx ;
n + n n + n

19
if, moreover, we suppose
  
k = dx φ(x ), k = x dx φ(x ), k = x2 dx φ(x ), ...,

the integrals being taken from x = − n+n


n  n
 to x = n+n , we will have

⎧ ⎫
⎪ k √ ⎪
 √

⎨ 1 + q(n + n ) −1 ⎪

x qx −1  k
φ c = (n + n )k .
n+n  ⎪  ⎪
⎩ − k q 2 (n + n )2 2 + · · ·⎪
⎪ ⎭
2k
The error of each observation must fall within the limits −n and +n , and the probability that
 x

this will hold being φ n+n or (n + n )k, this quantity must be equal to unity. Thence it is
μ
easy to conclude that the logarithm of the function (2) is, by making μ = n+n
,

k (i) √ kk − k2 (i)2
Sq − μ (n + n ) −1 − Sq (n + n )2 2 + · · · ,
k 2k2

the sign S embracing all the values of i, from i null to i = s − 1. We will make the first power [337]
of  vanish, by making
k
μ = Sq (i) ,
k
and if we consider only its second power, that which we are able to do by that which precedes,
when s is a very great number, we will have, for the logarithm of the function (2),

kk − k2 (i)2


− Sq (n + n )2 2 .
2k2
By passing again from the logarithms to the numbers, the function (2) is transformed into the
following
 2
− kk −k (n+n )2  2 Sq (i)2
c 2
2k ;
the integral (1) becomes thus
 √
1 
− kk −k
2
(n+n )2  2 Sq (i)2
d c−lw −1 c 2k2 .

Let us suppose

l = (n + n )r Sq (i)2 ,
 √ 
(kk − k2 )Sq (i)2  r −1 2k2
t= (n + n ) − .
2k 2 2 kk − k2


The variation of l being unity, we will have



1 = (n + n )dr Sq (i)2 ;

the preceding integral becomes thus, after having integrated it from t = −∞ to t = ∞,

kdr − k2 r 2
 c 2(kk −k2 )
2(kk − k )π
 2

20
Thus the probability that the function (m) will be comprehended within the limits

ak (i) 
Sq ± ar Sq (i)2 ,
k
is equal to  [338]
2 kdr − k2 r 2
√  c 2(kk −k2 ) ,
π 2(kk − k2 )
the integral being taken from r null.
ak
k
is the abscissa of the ordinate which passes through the center of gravity of the area of
the curve of the probabilities of the errors of each observation; the product of this abscissa by
Sq (i) is therefore the mean result toward which the function (m) converges without ceasing. If
we suppose 1 = q = q (1) = · · · the function (m) becomes the sum of the errors, and then Sq (i)
becomes s; therefore by dividing the sum of the errors by s, in order to have the mean error, this
error converges without ceasing toward the abscissa of the center of gravity, in a manner that by
taking on both sides any interval whatsoever as small as we will wish, the probability that the
mean error will fall within this interval will finish, by multiplying indefinitely the observations,
by differing from certainty only by a quantity less than every given magnitude.

§23. We come to investigate the mean result that observations numerous and not yet made
must indicate with most advantage, and the law of probability of the errors of this result. Let
us consider presently the mean result of observations already made and of which we know the
respective deviations. For this, let us imagine a number s of observations of the same kind, that
is such that the law of errors is the same for all. Let us name A the result of the first, A + q
the one of the second, A + q (1) the one of the third, and thus consecutively; q, q (1) , q (2) , . . .
being positive and increasing quantities, that which we are always able to obtain by a convenient
disposition of the observations. Let us designate further by φ(z) the probability of the error
z for each observation, and let us suppose that A + x is the true result. The error of the first
observation is then −x; q − x, q (1) − x, . . . are the errors of the second, of the third, etc. The
probability of the simultaneous existence of all these errors is the product of their respective
probabilities; it is therefore [339]
(1)
φ(−x)φ(q − x)φ(q − x) · · ·

Now, x being susceptible of an infinity of values, by considering them as so many causes of the
observed event, the probability of each of them will be, by §1,

dx φ(−x)φ(q − x)φ(q (1) − x) · · ·


 ,
dx φ(−x)φ(q − x)φ(q (1) − x) · · ·
the integral of the denominator being taken for all the values of which x is susceptible. Let us
1
name H this denominator. This premised, let us imagine a curve of which x is the abscissa, and
of which the ordinate y is

Hφ(−x)φ(q − x)φ(q (1) − x) · · · ,


this curve will be that of the probabilities of the values of x. The value that it is necessary to
choose for the mean result is that which renders the mean error to fear a minimum. Each error,
either positive, or negative, must be considered as a disadvantage, or a real loss in the game,
we have the mean disadvantage, by taking the sum of the products of each disadvantage by its
probability; the mean value of the error to fear is therefore the sum of the products of each error,
setting aside the sign, by its probability. Let us determine the abscissa that it is necessary to

21
choose in order that this sum is a minimum. For this, let us give to the abscissas for origin the
first extremity of the preceding curve, and let us name x and y  the coordinates of the curve, by
departing from this origin. Let l be the value that it is necessary to choose. It is clear that, if the
true result were x , the error of the result l would be, setting aside the sign, l − x , as much as
x would be less than l; now y  is the probability that x is the true result; the sum of the errors
to fear, settingaside the sign, multiplied by their probability, is therefore for all the values of
x less than l, (l − x )y  dx , the integral being taken from x = 0 to x = l. We will see in
the same manner that, for the values of x superior to l, the sum of the errors to fear, multiplied [340]
by their probability, is (x − l)y  dx , the integral being taken from x = l to the abscissa x


corresponding to the last extremity of the curve; the entire sum of the errors to fear, setting aside
the sign, multiplied by their respective probabilities, is therefore
 
(l − x )y  dx + (x − l)y  dx .

The differential of this function, taken with respect to l, is


 
dl y  dx − dl y  dx ;
   
 we have the differential of (l − x )y dx , by differentiating first the value of l under
because
the sign, and by adding to this differential the increase which results from the variation of the
limit of the integral, a limit which is changed into l + dl. This increase  is equal to the element
(l − x )y  dx , to the limit where x = l; it is therefore null, and dl y dx is the differential of
the integral (l − x )y  dx . We will see in the same manner that −dl y  dx is the differential
of the integral (x − l)y  dx . The sum of these differentials is null relatively to the abscissa l,
for which the mean error to fear is a minimum; we have therefore, relatively to this abscissa,
 
y  dx = y  dx ,

the first integral being taken from x = 0 to x = l, and the second being taken from x = l to
the extreme value of x .
It follows thence that the abscissa which renders the mean error to fear a minimum is that of
which the ordinate divides the area of the curve into two equal parts. This point enjoys further
the property to be the one on the side of which it is as probable that the true result falls, as
the other side, and by this reason it is able further to be named middle of probability. Some
celebrated geometers have taken for the middle that it is necessary to choose the one which
renders the observed result the most probable, and consequently the abscissa which corresponds
to the greatest ordinate of the curve; but the middle that we adopt is evidently indicated by the
theory of probabilities.

If we put φ(x) under the form of an exponential, and if we designate it by c−ψ(x ) , so that it [341]
is able equally to agree to the positive and negative errors, we will have

2
)−ψ(x−q)2 −ψ(x−q (1) )2 −···
(1) y = Hc−ψ(x

If we make x = a + z, and if we develop the exponent of c with respect to the powers of z, y


will take this form
2
−Qz 3 −···
y = Hc−M −2N z−P z ,
an expression in which we have

22
M =ψ(a2 ) + ψ(a − q)2 + ψ(a − q (1) )2 + · · · ,
N =aψ  (a2 ) + (a − q)ψ  (a − q)2 + (a − q (1) )ψ  (a − q (1) )2 + · · · ,
P =ψ  (a2 ) + ψ  (a − q)2 + ψ  (a − q (1) )2 + · · · + 2a2 ψ  (a2 )
+ 2(a − q)2 ψ  (a − q)2 + a(a − q (1) )2 ψ  (a − q (1) )2 + · · · ,
............................................................... ,
ψ (t) being the coefficient of dt in the differential of ψ(t), ψ  (t) being the coefficient of dt in


the differential of ψ  (t), and thus consecutively.


Let us suppose the number s of observations very great, and let us determine a by the equa-
tion N = 0 which gives the condition of the maximum of y; then we have
2
−Qz 3 −···
y = Hc−M −P z .
M , P , Q, . . . are of order s; now, if z is very small of order 1

s
, Qz 3 becomes of order 1

s
,
−Qz 3 −···
and the exponential c is able to be reduced to unity. Thus, in the interval from z = 0 to
z = √rs , we are able to suppose
2
r = Hc−M −P z .
m
Farther on, and when z is of order s− 2 , m being smaller than unity, P z 2 becomes of order
2
s1−m ; consequently c−P z becomes, in the same way as y, insensible; so that we are able, in all
extent of the curve, to suppose [342]
2
y = Hc−M −P z .
The value of a given by the equation N = 0, or

0 = aψ  (a2 ) + (a − q)ψ  (a − q)2 + (a − q (1) )ψ  (a − q (1) )2 + · · · ,


is then the abscissa x corresponding to the ordinate which divides the area of the curve into equal
parts. The condition that the entire area of the curve must represent certitude or unity gives

1 2
= dz c−M −P z ,
H
the integral being taken from z = −∞ to z = ∞, that which gives

cM P
H= √ .
π
The mean error to fear more or less, by taking a for the mean result of the observations, is
± zydz, the integral being taken from z null to z infinity, that which gives for this error
1
± √ .
2 πP
2
But the entire ignorance where we are of the law c−ψ(x )
of the errors of each observation does
not permit forming the equation

0 = aψ  (a2 ) + (a − q)ψ  (a − q)2 + · · · ,

Thus, knowledge of the values of q, q (1) , . . . shedding a posteriori no light on the mean result
a of the observations, it is necessary to be held to the most advantageous result determined a

23
priori, and that we have seen to be the one which furnishes the method of least squares of the
errors.
Let us seek the function ψ(x2 ) which gives constantly the rule of the arithmetic means,
admired by the observers. For this, let us imagine that, out of the s observations, the first i
coincide, in the same way as the s − i last. The equation N = 0 becomes then

0 = iaψ  (a2 ) + (s − i)(a − q)ψ  (a − q)2 .

The rule of the arithmetic mean gives [343]


s−i
a= q;
s
the preceding equation becomes thus
2 !
 s−i 2 i2 2
ψ q =ψ q .
s s2

This equation needing to hold whatever be si and q, it is necessary that ψ  (t) be independent of
t, that which gives
ψ  (t) = k,
k being a constant. By integrating, we have

ψ(t) = kt − L,

L being an arbitrary constant; hence,


2 2
c−ψ(x )
= cL−kx .

Such is therefore the function which is able alone to give generally the rule of the arithmetic
 2
means. The constant L must be determined in a manner that the integral dx cL−kx , taken
from x = −∞ to x = ∞, is equal to unity; because it is certain that the error x of an observation
must fall within these limits; we have therefore

k
cL = ;
π
 2
consequently the probability of the error is πk c−kx .
In truth, this expression gives infinity for the limit of the errors, that which is not admissible;
but, seeing the rapidity with which this kind of exponential diminishes in measure as x increases,
we are able to take k great enough in order that beyond the admissible limit of the errors their
probabilities are insensible and able to be supposed null.
The preceding law of errors gives, for the general expression (1) of y, [344]

sk −ksu2
y= e ,
π

by determining H in a manner that the entire integral ydx is unity, and making

Sq (i)
x= + u.
s

24
The ordinate which divides the area of the curve into two equal parts is that which corresponds
to u = 0 and consequently to
Sq (i)
x= ;
s
this is therefore the value of x that it is necessary to choose for the mean result of the observa-
tions; now this value is that which the rule of the arithmetic means gives; the preceding law of
errors of each observation gives therefore constantly the same results as this rule, and we have
seen that it is the only law which enjoys this property.
By adopting this law, the probability of the error (i) of the (i + 1)st observation is

k −k(i)2
e ;
π
now we have seen in §20 that, z being the correction of an element, this observation furnishes
the equation of condition
(i) = p(i) z − α(i) .
The probability of the value of p(i) z − α(i) is therefore

k −k(p(i) z−α(i) )2
e ;
π

the probability of the simultaneous existence of the s values pz − α, p(1) z − α(1) , . . . , p(s−1) z −
α(s−1) will be therefore  s−1
k (i) (i) 2
e−kS(p z−α ) .
π
This probability varies with z; we will have therefore the probability of any value whatsoever [345]
of z by multiplying this quantity by dz and dividing the product by the integral of this product;
taken from z = −∞ to z = ∞. Let

Sp(i) q (i)
z= + u;
Sp(i)2
this probability becomes 
kSp(i)2 −ku2 Sp(i)2
du e ,
π
so that, if we describe a curve of which the coefficient of du is the ordinate and of which u is the
abscissa, this curve, extended from u = −∞ to u = ∞, is able to be considered as the curve of
the probabilities of the errors u, of which the result

Sp(i) α(i)
z=
Sp(i)2
is susceptible. The ordinate which divides the area of the curve into two equal parts is that which
(i) (i)
α
corresponds to u = 0, and consequently to z equal to SpSp(i)2 ; this result is therefore the one
that it is necessary to choose; now it is the same as the one which the method of least squares of
errors of observations gives; the preceding law of errors of each observation leads therefore to
the same results as this method.
The method of least squares of errors becomes necessary when there is concern to take
a mean among many given results, each, by the collection of a great number of observations
of diverse kinds. Let us suppose that one same element is given: 1◦ by the mean result of s

25
observations of a first kind and that it is, by these observations, equal to A; 2◦ by the mean
result of s observations of a second kind and that it is equal to A + q; 3◦ by the mean result of
s observations of a third kind and that it is equal to A + q  , and thus of the remaining. If we
represent by A + x the true element, the error of the result of the observations s will be −x; by [346]
supposing therefore β equal to  
k Sp(i)2
,
k 2a
if we make use of the method of least squares of errors in order to determine the mean result, or
to 
k Sp(i)2
√ ,
k 2a s
if we make use of the ordinary method; the probability of this error will be, by §20,
β 2 2
√ c−β x .
π

The error of the result of the s observations will be q − x, and, by designating by β  for these
observations that which we have named β for the s observations, the probability of this error will
be
β 2 2
√ c−β (x−q) .
π
Similarly, the error of the result of the s observations will be q  − x, and by naming for them
β  that which we have named β for the s observations, the probability of this error will be

β  2  2
√ c−β (x−q ) .
π
and thus consecutively. The product of all these probabilities will be the probability that −x,
q − x, q  − x, . . . will be the errors of the mean results of the observations s, s , s , . . . .
By multiplying it by dx and taking the integral from x = −∞ to x = ∞, we will have the
probability that the mean results of the observations s , s , . . . will surpass respectively by q, q  ,
. . . the mean result of the s observations.
If we take the integral within the determined limits, we will have the probability that, the
preceding condition being fulfilled, the error of the first result will be comprehended within these
limits; by dividing this probability by that of the condition itself, we will have the probability [347]
that the error of the first result will be comprehended within some given limits, when we are
certain that the condition effectively holds; this probability is therefore
 2 2 2 2 2  2
dxc−β x −β (x−q) −β (x−q ) −···
 ,
dxc−β 2 x2 −β 2 (x−q)2 −β 2 (x−q )2 −···

the integral of the numerator being taken within the given limits, and that of the denominator
being taken from x = −∞ to x = ∞. We have

β 2 x2 + β 2 (x − q)2 + β 2 (x − q  )2 + · · ·
= (β 2 + β 2 + β 2 + · · · )x2 − 2x(β 2 q + β 2 q  + · · · ) + β 2 q 2 + β 2 q 2 + · · ·

Let
β 2 q + β 2 q  + · · ·
x= + t;
β 2 + β 2 + β 2 + · · ·

26
the preceding probability will become
 2 2 2 2
dt c−(β +β +β +··· )t
 ,
dt c−(β 2 +β 2 +β 2 +··· )t2

the integral of the numerator being taken within some given limits, and that of the denominator
being taken from t = −∞ to t = ∞. This last integral is

π
 .
β 2 + β 2 + β 2 + · · ·

By making therefore 
t = t β 2 + β 2 + β 2 + · · ·,
the preceding probability becomes

1 2
√ dt c−t .
π

The most probable value of t is that which corresponds to t null, whence it follows that the
most probable value of x is that which corresponds to t = 0; thus the correction of the first
result, that the collection of all the observations s, s , s , . . . give with most probability, is [348]
2 2 
β q + β q + ···
.
β 2 + β 2 + β 2 + · · ·
This correction, added to the result A, gives, for the result that it is necessary to choose,

Aβ 2 + (A + q)β 2 + (A + q  )β 2 + · · ·
.
β 2 + β 2 + β 2 + · · ·
The preceding correction is that which renders a minimum the function

(βx)2 + [β  (x − q)]2 + [β  (x − q  )]2 + · · ·

Now the greatest ordinate of the curve of probabilities of the first result is, as we have just seen

it, √βπ ; that of the curve of probabilities of the second result is √βπ , and thus consecutively; the
mean that it is necessary to choose among the diverse results is therefore the one which renders
a minimum the sum of the squares of the error of each result multiplied by the greatest ordinate
of its curve of probability. Thus the law of the minimum of the squares of the errors becomes
necessary, when we must take a mean among some results given each by a great number of
observations.

§24. We have seen previously that, in all the manners to combine the equations of condition
in order to form linear final equations, necessary to the determination of the elements, the most
advantageous is that which results from the method of least squares of errors of the observations,
at least when the observations are in great number. If, instead of considering the minimum of the
squares of the errors, we considered the minimum of other powers of the errors, or even of every
other function of the errors, the final equations would cease to be linear, and their resolution
would become impractical, if the observations were in great number. However there is a case
which merits a particular attention, in this that it gives the system in which the greatest error, [349]
setting aside the sign, is less than in every other system. This case is the one of the minimum
of the infinite and even powers of the errors. Let us consider here only the correction of a

27
single element, and, z expressing this correction, let us represent, as previously, the equations of
condition by the following,

(i) = p(i) z − α(i) ,


i being able to vary from zero to s − 1, s being the number of observations. The sum of the
powers 2n of the errors will be S(α(i) − p(i) z)2n , the sign S extending to all the values of i.
We are able to suppose in this sum all the values of p(i) positive; because, if one of them was
negative, it would become positive by changing, as we are able to do, the signs of the two terms
of the binomial raised to the power 2n, to which it corresponds. We will suppose therefore the
quantities α − pz, α(1) − p(1) z, α(2) − p(2) z, . . . disposed in a manner that the quantities p,
p(1) , p(2) ,. . . are positive and increasing. This premised, if 2n is infinite, it is clear that the
greatest term of the sum S(α(i) − p(i) z)2n will be the entire sum, unless there was one or many
other terms which were equal to it, and this is that which must hold in the case of the minimum
of the sum. In fact, if there was only a single greatest quantity, setting aside the sign, such as
α(i) − p(i) z, we would be able to diminish it by making z vary conveniently, and then the sum
S(α(i) − p(i) z)2n would diminish and would not be a minimum. It is necessary moreover that,
 
if α(i) − p(i) z and α(i ) − p(i ) z are, setting aside the sign, the two greatest quantities and equal
between them, they are of contrary sign. In fact, the sum
 
(α(i) − p(i) z)2n + (α(i ) − p(i ) z)2n
needing to be then a minimum, its differential
  
−2ndz[p(i) (α(i) − p(i) z)2n−1 + p(i ) (α(i ) − p(i ) z)2n−1 ]
must be null, that which is able to be, when n is infinite, only in the case where α(i) − p(i) z
 
and α(i ) − p(i ) z are infinitely little different and of contrary sign. If there are three greatest [350]
quantities, and equals among them, setting aside the sign, we will see in the same manner that
their signs are not able to be the same.
Now, let us consider the sequence

α(s−1) − p(s−1) z, α(s−2) − p(s−2) z, α(s−3) − p(s−3) z, . . . , α − pz,
(o)
− α + pz, . . . , −α(s−3) + p(s−3) z, −α(s−2) + p(s−2) z, −α(s−1) + p(s−1) z.
If we suppose x = −∞, the first term of the sequence surpasses the following, and continues to
surpass them by making z increase, to the moment where it becomes equal to one of them. Then
this one, by the increase of z, becomes greatest of all, and in measure as we make z increase,
it continues always to surpass those which precede it. In order to determine this term, we will
form the sequence of quotients

α(s−1) − α(s−2) α(s−1) − α(s−3) α(s−1) − α α(s−1) + α α(s−1) + α(s−1)


, (s−1) , . . . , (s−1) , (s−1) , . . . , (s−1) .
p (s−1) −p (s−2) p −p (s−3) p −p p +p p + p(s−1)
(s−1) (r)
Let us suppose that αp(s−1) −α
−p(r)
is the smallest of these quotients by having regard to the sign,
that is by regarding a greater negative quantity as smaller than another lesser negative quantity.
If there are many least and equal quotients, we will consider the one which is related to the most
distant term from the first in the sequence (o); this term will be the greatest of all, to the moment
where, by the increase of z, it becomes equal to one of the following, which begins then to be
the greatest. In order to determine this new term, we will form a new sequence of quotients

α(r) − α(r−1) α(r) − α(r−2) α(r) − α p(r) + α


, , . . . , (r) , , ...,
p(r) − p(r−1) p(r) − p(r−2) p − p p(r) + p

28
the term of the sequence (o) to which the least of these quotients correspond will be the new
term. We will continue thus to the one of the two terms which become equal and the greatest
is in the first half of the sequence (o), and the other in the second half. Let α(i) − p(i) z and
 
−α(i ) + p(i ) z be these two terms; then the value of z which corresponds to the system of the [351]
minimum of the greatest of the errors, setting aside the sign, is

α(i) + α(i )
z= .
p(i) − p(i )
If there are many elements to correct, the equations of condition which determine their cor-
rections contain many unknowns, and the investigation of the system of correction, in which the
greatest error is, setting aside the sign, smaller than in every other system, becomes more com-
plicated. I have considered this case in a general manner in Book III of the Mécanique céleste. I
will observe only here that then the sum of the 2n powers of the errors of the observations is, as
in the case of a single unknown, a minimum when 2n is infinite; whence it is easy to conclude
that, in the system of which there is concern, it must have as many errors, plus one, equal, and
greatest, setting aside the sign, as there are elements to correct. We imagine that the results cor-
responding to 2n equal to a great number must differ little from those which 2n infinite gives. It
is not even necessary for this that the 2n power is quite elevated, and I have recognized through
many examples that, in the same case where this power does not surpass the square, the results
differ little from those that the system of the minimum of the greatest squares gives, that which
is a new advantage of the method of least squares of the errors of observations.
For a long time, geometers take an arithmetic mean among their observations, and, in order
to determine the elements that they wish to know, they choose the most favorable circumstances
for this object, namely, those in which the errors of the observations alter the least that it is
possible the value of these elements. But Cotes is, if I do not deceive myself, the first who has
given a general rule in order to make many observations agree in the determination of an element,
proportionally to their influence. By considering each observation as a function of the element
and regarding the error of the observation as an infinitely small differential, it will be equal to
the differential of the function, taken with respect to that element. The more the coefficient of [352]
the differential of the element will be considerable, the less it will be necessary to make the
element vary, in order that the product of its variation by this coefficient is equal to the error
of the observation; this coefficient will express therefore the influence of the observation on the
value of the element. This premised, Cotes represents all the values of the element, given by each
observation, by the parts of an indefinite straight line, all these parts having a common origin. He
imagines next, at their other extremities, some weights proportional to the influences respective
of the observations. The distance from the common origin of the parts to the common center of
gravity of all these weights is the value that he chose for the element.
Let us take the equation of condition of §20,

(i) = p(i) z − α(i) ,

(i) being the error of the (i + 1)st observation, and z being the correction of the element already
known quite nearly; p(i) , that we are always able to suppose positive, will express the influence
(i)
of the corresponding observation. αp(i) being the value of z resulting from the observation, the
rule of Cotes reverts to multiplying this value by p(i) , to make a sum of all the products relative
to the diverse values, and to divide it by the sum of all the p(i) , that which gives

Sα(i)
z= .
Sp(i)

29
This was in fact the correction adopted by the observers, before the usage of the method of least
squares of the errors of the observations.
However we do not see that, since this excellent geometer, we have employed his rule, until
Euler, who in his first piece on Jupiter and Saturn, appears to me to serve himself first of the
equations of the condition in order to determine the elements of the elliptic movement of these
two planets. Near the same time, Tobie Mayer made use of it in his beautiful researches on the
libration of the Moon, and next in order to form his lunar Tables. Since, the best astronomers
have followed this method, and the success of the Tables which they have constructed by his [353]
means has verified the advantage of it.
When we have only one element to determine, this method leaves no difficulty; but, when
we must correct at the same time many elements, it is necessary to have as many final equations
formed by the reunion of many equations of condition, and by means of which we determine
by elimination the corrections of the elements. But what is the most advantageous manner to
combine the equations of condition, in order to form the final equations? It is here that the ob-
servers abandoned themselves to some arbitrary gropings, which must have led them to some
different results, although deduced from the same observations. In order to avoid these gropings,
Mr. Legendre had the simple idea to consider the sum of the squares of the errors of the observa-
tions, and to render it a minimum, that which furnishes directly as many final equations, as there
are elements to correct. This scholarly geometer is the first who has published this method; but
we owe to Mr. Gauss the justice to observe that he had had, many years before this publication,
the same idea of which he made a habitual usage, and that he had communicated to many as-
tronomers. Mr. Gauss, in his Theory of elliptic movement, has sought to connect this method to
the Theory of Probabilities, by showing that the same law of errors of the observations, which
gives generally the rule of the arithmetic mean among many observations, admitted by the ob-
servers, gives similarly the rule of the least squares of the errors of the observations, and it is
that which we have seen in §23. But, as nothing proves that the first of these rules gives the most
advantageous result, the same uncertainty exists with respect to the second. The research on the
most advantageous manner to form the final equations is without doubt one of the most useful
of the Theory of Probabilities: its importance in Physics and Astronomy carries me to occupy
myself with it. For this, I will consider that all the ways to combine the equations of condition,
in order to form a linear final equation, returns to multiplying them respectively by some factors
which were null relatively to the equations that we employed not at all, and to make a sum of [354]
all these products, that which gives a first final equation. A second system of factors gives a
second final equation, and thus consecutively, until we have as many final equations as elements
to correct. Now it is clear that it is necessary to choose the system of factors, such that the mean
error to fear more or less respecting each element is a minimum; the mean error being the sum
of the products of each error by its probability. When the observations are in small number, the
choice of these systems depends on the law of errors of each observation. But, if we consider a
great number of observations, that which holds most often in the astronomical researches, this
choice becomes independent of this law, and we have seen, in that which precedes, that Analysis
leads then directly to the results of the method of least squares of the errors of the observations.
Thus this method which offered first only the advantage to furnish, without groping, the final
equations necessary to the correction of the elements, gives at the same time the most precise
corrections, at least when we wish to employ only final equations which are linear, an indispens-
able condition, when we consider at the same time a great number of observations; otherwise,
the elimination of the unknowns and their determination would be impractical.

30
BOOK II
CHAPTER V
APPLICATION DU CALCUL DES PROBABILITÉS A LA RECHERCHE DES
PHÉNOMÉNES ET DE LEURS CAUSES

Pierre Simon Laplace∗


Théorie Analytique des Probabilités OC 7 §25, pp. 355–369

APPLICATION OF THE CALCULUS OF PROBABILITIES TO THE


INVESTIGATION OF PHENOMENA AND OF THEIR CAUSES

We are able, by the analyses of the preceding Chapters, applied to a great number of obser-
vations, to determine the probability of the existence of the phenomena of which the extent is
small enough in order to be comprehended within the limits of the errors of each observation.
Formulas which express that the probabilities of the existence of the phenomena and of its extent
are comprehended within some given limits. Application to the diurnal variation of the barom-
eter and to the rotation of the Earth, deduced from the experiments on the fall of bodies. The
same analysis is applicable to the most delicate questions of Astronomy, of political Economics,
of Medicine, etc., and to the solution of the problems on chances, too complicated in order to
be resolved directly by analysis. A floor being divided into small rectangular squares by some
lines parallel and perpendicular among them, to determine the probability that, by projecting at
random a needle, it will fall again on a joint of these squares. No 25.

∗ Translated by Richard J. Pulskamp, Department of Mathematics & Computer Science, Xavier Univer-

sity, Cincinnati, OH. January 6, 2014

1
§25. The phenomena of nature present themselves most often accompanied with so many [355]
strange circumstances, so great a number of perturbing causes mix their influence that it is very
difficult, when they are very small, to recognize them. We are able then to arrive there only by
multiplying the observations, so that, the strange effects coming to destroy themselves, the mean
result of the observations no longer permit perceiving but these phenomena. Let us imagine,
by that which precedes, that that holds rigorously only in the case of an infinite number of
observations. In every other case, the phenomena are indicated by the mean results only in
a probable manner, but which is so much more as the observations are in greater number. The
investigation of this probability is therefore very important for Physics, Astronomy and generally
for all the natural sciences. We will see that it returns to the methods that we have just exposed.
In the preceding Chapter, the existence of a phenomenon was certain; its extent alone has been
the object of the Calculus of Probabilities: here the existence of a phenomenon and its extent are
the object of this calculus.
Let us take for example the diurnal variation of the barometer, that we observe between
the tropics, and which becomes sensible even in our climates, when we choose and when we
multiply the observations conveniently. We have recognized that in general, toward 9h in the
morning, the barometer is more elevated than toward 4h in the evening; next it rises again toward [356]
the 11h in the evening, and it descends again toward 4h of the morning, it order to arrive again to
its maximum height toward 9h . Let us suppose that we have observed the height of the barometer
toward 9h in the morning and toward 4h in the evening, during the number s of days, and, in order
to avoid the too great influence of the perturbing causes, let us choose these days in a manner
that, in the interval from 9h to 4h , the barometer has not varied beyond 4mm . Let us suppose
next that by making the sum of the s heights of the morning and the sum of the s heights of
the evening, the first of these sums surpasses the second by the quantity q; this difference will
indicate a constant cause which tends to raise the barometer toward 9h in the morning, and to
lower it toward 4h in the evening. In order to determine with what probability this cause is
indicated, let us imagine that this cause not exist at all, and that the observed difference q results
from accidental perturbing causes and from the errors of the observations. The probability that
then the observed difference between the sums of the heights of the morning and of the evening
must be below q is, by §18, equal to
 
k kr 2
dr c− 4k ,
4k π


the integral being taken from r = −∞ to r = r√ a


s
, k and k being constants dependent on
the law of probability of the differences between the heights of the morning and of the evening,
and ±a being the limits of these differences, a being here equal to 4mm , kk being at least equal
to 6, as we have seen in §20, k4k is not able to be supposed less than 32 ; by making therefore
s = 400, and supposing the extent of the diurnal variation of 1mm , that which is very nearly
that which Mr. Ramond has found in our climates, by the comparison of a very great number of
kr 2
observations, we will have q = 400mm . Thus r = 5, and 4k  is at least equal to 37.5; by making
therefore
kr2
t2 =  ,
4k
the preceding probability becomes at least [357]
 2
dt c−t
1− √ ,
π

the integral being taken from t = 37, 5 to t = ∞. This integral is, quite nearly, by §27 of

2
Book I,
c−37.5
1− √ ,
2 37, 5π
and it approaches so to unity or to certitude, that it is extremely probable that, if there existed no
constant cause at all of the observed excess of the sum of the barometric heights of the morning
over those of the heights of the evening, this excess would be smaller than 400mm ; it indicates
therefore with an extreme probability the existence of a constant cause which has produced it.
The phenomenon of a diurnal variation being thus well established, let us determine the
most probable value of its extent, and the error that we are able to commit with respect to its
evaluation. Let us suppose for this that this value is qs ± √ ar
s
; the probability that the extent of
the diurnal variation from the morning to the evening will be comprehended within these limits
is, by §18,  
k kr 2
2 dr c− 4k ,
4k π


the integral being taken from r = 0.


 (i)2
We are able to eliminate kk by observing that, by §20, this fraction is nearly equal to S 2as2
,
(i)2 q
± being the difference from s to the observed extent the (i + 1) day, and the sign S
st

extending to all the values of i, from i = 0 to i = s − 1; by making therefore



2S(i)2
ar = t ,
s
the probability that the extent of the
 diurnal variation from the morning to the evening is compre- [358]
q (i)2  2
hended within the limits s ± √s t 2S
s
will be √π dt c−t , the integral being taken from t
2

null.
The diurnal variation of the heights of the barometer depends uniquely on the Sun; but these
heights are still affected by the aerial tides that the attraction of the Sun and of the Moon produce
on our atmosphere, and of which I have given the theory in Book IV of the Mécanique céleste.
It is therefore necessary to consider at the same time these two variations, and to determine
their magnitudes and their respective epochs, by forming some equations of condition analogous
to those of which the astronomers make use, in order to correct the elements of the celestial
movements. These variations being principally sensible at the equator, and the perturbing causes
being extremely small, we will be able, by means of excellent barometers, to determine them
with a great precision, and I do not doubt at all that we recognize then, in the collection of a very
great number of observations, the laws that the theory of the gravity in the atmospheric tides
indicates, and that manifests itself in a manner so remarkable in the observations of the tides of
the Ocean, that I have discussed with extent, in the Book cited of the Mécanique céleste.
We see, by that which precedes, that we are able to recognize the very small effect of a
constant cause, by a long sequence of observations of which the errors are able to exceed this
effect itself. But then it is necessary to take care to vary the circumstances of each observation,
in a manner that the mean result of their collection is not at all altered sensibly by it and is
nearly entirely the effect of the cause of which there is concern; it is necessary to multiply the
observations, until the analysis indicates a very great probability that the error of this result will
be comprehended within some very narrowed limits.
Let us suppose, for example, that we wish to recognize by observation the small deviation
to the east, produced by the rotation of the Earth, in the fall of a body. I have shown, in Book
X of the Mécanique céleste, that if, from the summit of a quite high tower, we abandon a body [359]
to its weight, it will fall onto a horizontal plane passing through the foot of the tower, at a small
distance to the east from the point of contact of this plane with a ball suspended by a wire of

3
which the point of suspension is the one of the departure of the body. I have given, in the Book
cited, the expression of this deviation, and there results from it that by setting aside the resistance
of the air, it is uniquely toward the east; that it is proportional to the cosine of the latitude and
to the square root of the cube of the height, and that at the latitude of Paris it is raised to 5.1mm ,
when the height of the tower is 50m . The resistance of the air changes this last result; I have
given similarly the expression of it in this case, in the Book cited.
We have already made a great number of experiments in order to confirm, by this means, the
movement of the rotation of the Earth, which besides is demonstrated by so many other phenom-
ena that this confirmation becomes useless. The small errors of these very delicate experiments
have often exceeded the effect that we would wish to determine, and it is only by multiplying
considerably the experiments that we are able thus to establish its existence and to fix its value.
We will submit this object to the analysis of probabilities.
If we take for origin of the coordinates the point of contact of the plane and of the ball
suspended by a wire of which the summit of the suspension is the one of the departure of a ball
that we make fall; if we next mark on this plane the diverse points where the ball will touch the
plane in each experiment; by determining the common center of gravity of these points, the line
drawn from the origin of the coordinates to this center will determine the sense and the mean
quantity by which the ball deviated from this origin, and both will be determined with so much
more exactitude as the experiments will be more numerous and more precise.
Let us consider now as axis of the abscissas the line drawn from the origin of the coordinates
to the east, and let us designate by x, x(1) , x(2) , . . . , x(s−1) , y, y (1) , y (2) , . . . , y (s−1) the
respective coordinates of the points determined by the experiments of which the number is s. In [360]
expressing by X and Y the coordinates of the center of gravity of all these points, we will have

x(i) y (i)
X=S , Y =S ,
s s
the sign S extending to all the values of i, from i = 0 to i = s−1. This premised, by designating
by ±a the limits of the errors of each experiment, in the sense of x, the probability that the mean
deviation of the ball, from the point of origin of the coordinates, is comprehended within the
ar
limits X ± √ s
, will be, by §18,
 
k kr 2
2 dr c− 4k ,
4k π
k and k being some constants which depend on the law of facility of the errors of each experi-
ment in the sense of x.
Similarly, ±a being the limits of the errors of each experiment in the sense of y, the prob-
ability that the mean value of the deviation in the sense of y is comprehended within the limits

Y ± a√sr will be


k − kr
2
2  dr c 4k ,
4k π

k and k being some constants depending on the law of errors of the experiments in the sense
of y. The fractions 4kk and 4kk being, by that which precedes, greater than 32 , we will be
able to judge the degree of approximation and of probability of the values of X and of Y , and
to determine the probability of the deviation to the south and to the north, indicated by the
observations.
The preceding analysis is able further to be applied to the investigation of the small inequal-
ities of the celestial movements, of which the extent is comprehended within the limits either
of the errors of observations, or of the perturbations produced by accidental causes. It is nearly

4
thus that Tycho Brahe recognized that the equation of the times, relative to the Sun and to the
planets, was not at all applicable to the Moon, and that it was necessary to subtract the part de- [361]
pending on the anomaly of the Sun, and even a much greater quantity, that which led Flamsteed
to the discovery of the lunar inequality that we name annual equation. It is further in the re-
sults of a great number of observations that Mayer recognized that the equation of precession,
relative to the planets and to the stars, was not at all applicable to the Moon; he evaluated to
around 12 decimals the quantity by which it was necessary then to diminish it, a quantity that
Mason raised next to nearly 24 , by the comparison of all the observations of Bradley, and that
Mr. Bürg has reduced to 21 , by means of a much greater number of observations of Maskelyne.
This inequality, although indicated by the observations, was neglected by the greatest number
of astronomers, because it did not appear to result from the theory of universal gravitation. But,
having submitted its existence to the Calculus of Probabilities, it appeared to me indicated with
a probability so strong, that I believed a duty to seek the cause of it. I saw well that it was able
to result only from the ellipticity of the terrestrial spheroid, that we had neglected until then in
the theory of the lunar movement, as needing to produce only some insensible terms, and I con-
cluded from it that it was extremely probable that these terms became sensible by the successive
integrations of the differential equations. Having determined these terms by a particular analy-
sis, that I have exposed in Book VII of the Mécanique céleste, I discovered first the inequality
of the movement of the Moon in latitude, and which is proportional to the sine of its longitude:
by its means, I recognized that the theory of gravity gives effectively the diminution observed
by the astronomers cited, in the inequality of the precession, applicable to the lunar movement
in longitude. The quantity of this diminution and the coefficient of the inequality in latitude of
which I just spoke are therefore very proper to determine the flatness of the Earth. Having made
part of my investigations to Mr. Bürg who occupied himself then with his Tables of the Moon,
I prayed him to determine the coefficients of the two inequalities with a particular care. By a
remarkable concurrence, the coefficients that he has determined accord to give to the Earth the [362]
1
flatness 305 , a flatness which differs little from the mean concluded from the measures of the
degrees of the meridian and from the pendulum, but which, seeing the influence of the errors
of the observations and of the perturbing causes on these measures, appears to me more exactly
determined by the lunar inequalities. Mr. Burckhardt, who has just formed new Tables of the
Moon, very precise, on the collection of observations of Bradley and of Maskelyne, has found
1
the same coefficient as Mr. Bürg for the lunar inequality in latitude: he finds 34 to add to the co-
1
efficient of the inequality in longitude, that which reduces the flatness to 301 , by this inequality.
1
The very slight difference of these results proves that by fixing at 304 this flatness, the error is
insensible.
The analysis of Probabilities has led me similarly to the cause of the great irregularities of
Jupiter and of Saturn. The difficulty to recognize the law of it and to restore them to the theory
of universal attraction had made conjecture that they were due to the momentary actions of the
comets; but a theorem to which I was arrived on the mutual attraction of the planets made me
reject this hypothesis, indicating to me the mutual attraction of the two planets as the true cause
of these irregularities. According to this theorem, if the movement of Jupiter accelerated by
virtue of some great inequality with very long period, the one of Saturn must be decelerated in
the same manner, and this deceleration is to the acceleration of Jupiter as the product of the mass
of this last planet by the square root of the great axis of its orbit is to the similar product relative
to Saturn. Thus, by taking for unity the deceleration of Saturn, the corresponding acceleration of
Jupiter must be 0, 40884; now Halley had found, by the comparison of the modern observations
to the old, that the acceleration of Jupiter corresponded to the deceleration of Saturn, and that
it was 0, 44823 of this deceleration. These results, so well in accord with the theory, led me to
think that there exists, in the movement of these planets, two great inequalities corresponding
and of contrary sign, which produced these phenomena. I have recognized that the mutual action

5
of the planets was not able to occasion in their mean movements some variations always increas- [363]
ing or periodic, but of a period independent of their mutual configuration; it was therefore in the
relation of the mean movements of Jupiter and of Saturn that I had to seek that of which there
is concern. Now, by examining this relation, it is easy to recognize that twice the mean move-
ment of Jupiter surpasses only by a very small quantity five times the one of Saturn; thus the
inequalities which depend on this difference, and of which the period is around nine centuries,
are able to become very great by the successive integrations which give to them for divisor the
square of the very small coefficient of time in the argument of these inequalities. By fixing to-
ward the epoch of Tycho Brahe the origin of this argument, I saw that Halley had had to find, by
the comparison of the modern observations to the old, the alterations that he had observed, while
the comparison of the modern observations among them must have presented some contrary and
parallel alterations to those that Lambert had noted. The existence of the inequalities of which
I just spoke appeared to me therefore extremely probable, and I hesitated not at all to undertake
the long and painful calculation, necessary in order to assure myself of it completely. The result
of this calculation, not only confirmed them, but it made known to me many other inequalities,
of which the collection has carried the Tables of Jupiter and Saturn to the degree of precision of
the same observations.
We see thence how it is necessary to be attentive to the indications of nature, when they are
the result of a great number of observations, although besides they be inexplicable by known
means. I engage thus the astronomers to follow with a particular attention the lunar inequality
with long period, which depends principally on the movement of the perigee of the Moon, added
to the double of the mean movement of its nodes; an inequality of which I have spoken in Book
VII of the Mécanique céleste, and that already the observations indicate with much likelihood.
The preceding cases are not the only ones in which the observations have redressed the analysts.
The movement of the lunar perigee and the acceleration of the movement of the Moon, which
was not given at all at first by the approximations, has made felt the necessity to rectify these [364]
approximations. Thus we are able to say that nature itself has concurred with the analytic per-
fection of the theories based on the principle of universal gravitation, and it is, in my sense, one
of the strongest proofs of the truth of this admirable principle.
We are able further, by the Analysis of Probabilities, to verify the existence or the influence
of certain causes of which we have believed to notice the action on organized beings. Of all the
instruments that we are able to use in order to understand the imperceptible agents of nature,
the most sensible are the nerves, especially when their sensibility is enhanced by some particular
circumstances. It is by their means that we have discovered the weak electricity that the contact of
two heterogeneous metals develop, that which has opened a vast field to the investigations of the
physicians and the chemists. The singular phenomena, which result from the extreme sensitivity
of the nerves in some individuals, have given birth to diverse opinions on the existence of a new
agent that we have named animal magnetism, with respect to the action of ordinary magnetism
and the influence of the Sun and of the Moon in some nervous affections; finally with respect
to the impressions that the proximity of metals or of running water are able to give birth. It
is natural to think that the action of these causes is very weak, and able to be troubled easily
by a great number of accidental circumstances; thus, of that which, in some cases, it is not at
all manifested, we must not conclude that it never exists. We are so averted from knowing all
the agents of nature that it would be less philosophical to deny the existence of the phenomena,
uniquely because they are inexplicable in the actual state of our knowledge. Alone we must
examine them with an attention so much more scrupulous as it appears more difficult to admit
them, and it is here that the Analysis of the Probabilities becomes indispensable in order to
determine to what point it is necessary to multiply the observations or the experiments in order
to have in favor of the existence of the agents that they seem to indicate a probability superior to
all the reasons that we are able to have besides to reject it.

6
The same analysis is able to be extended to the diverse results of Medicine and of political [365]
Economy, and even to the influence of moral causes; because the action of these causes, when it
is repeated a great number of times, offers in its results so much regularity as physical causes.
We are able to determine further by the Analysis of Probabilities, compared to a great num-
ber of experiences, the advantage and the disadvantage of the players, in the cases of which
complication renders impossible their direct investigation. Such is the advantage to the hand, in
the game of piquet: such are further the respective probabilities to bring forth the different faces
of a right rectangular prism, of which the length, the width and the height are unequal, when the
prism projected into the air falls again on a horizontal plane.
Finally we would be able to make use of the Calculus of Probabilities in order to rectify
curves or to square their surfaces. Without doubt, the geometers will not employ this means; but,
as it gives me place to speak of a particular kind of combinations of chance, I will expose it in a
few words.
Let us imagine a plane divided by parallel lines, equidistant by the quantity a; let us imagine
moreover a very narrow cylinder, of which 2r is the length, supposed equal or less than a. We
require the probability that in casting it on it, it will encounter one of the divisions of the plane.
Let us erect on any point of one of these divisions a perpendicular extended to the following
division. Let us suppose that the center of the cylinder be on this perpendicular and at the
height y above the first of these two divisions. By making the cylinder rotate about its center
and naming φ the angle that the cylinder makes with the perpendicular, at the moment where it
encounters this division, 2φ will be the part of the circumference described by each extremity
ofthe cylinder, in which
 it encounters the division; the sum of all these parts will be therefore
4 φdy, or 4φy − 4 ydφ; now we have y = r cos φ; this sum is therefore

4φy − 4r sin φ + const.

In order to determine this constant, we will observe that the integral must be extended from y [366]
nul to y = r, and consequently from φ = π2 to φ = 0, that which gives

const. = 4r;

thus the sum of which there is concern is 4r. From y = a − r to y = a, the cylinder is
able to encounter the following division, and it is clear that the sum of all the parts relative to
this encounter is again 4r; 8r is therefore the sum of all the parts relative to the encounter of
one or of the other of the divisions by the cylinder, in the movement of its center the length of
the perpendicular. But the number of all the arcs that it describes in rotating in entirety with
respect to itself, at each point of this perpendicular, is 2aπ; this is the number of all the possible
combinations; the probability of the encounter of one of the divisions of the plane by the cylinder
4r
is therefore aπ . If we cast this cylinder a great number of times, the ratio of the number of times
where the cylinder will encounter one of the divisions of the plane to the total number of casts
4r
will be, by §16, very nearly, the value of aπ , that which will make known the value of the
circumference 2π. We will have, by the same section, the probability that the error of this value
8r
will be comprehended within some given limits, and it is easy to see that the ratio aπ which, for
a given number of projections, renders the error to fear least, is unity, that which gives the length
of the cylinder equal to the interval of the divisions, multiplied by the ratio of the circumference
to four diameters.
Let us imagine now the preceding plane divided again by some lines perpendicular to the
preceding, and equidistant by a quantity b equal or greater than the length 2r of the cylinder. All
these lines will form with the first a sequence of rectangles of which b will be the length and a
the height. Let us consider one of these rectangles; let us suppose that in its interior we draw
at the distance r from each side some lines which are parallel to them. They will form first an

7
interior rectangle, of which b − 2r will be the length, and a − 2r the height; next two small [367]
rectangles, of which r will be the height, and b − 2r the length; then two other small rectangles
of which r will be the length and a − 2r the height; finally, four small squares of which the sides
will be equal to r.
As long as the center of the cylinder will be placed in the interior rectangle, the cylinder, in
rotating on its center, will never encounter the sides of the large rectangle.
When the center of the cylinder will be placed in the interior of one of the rectangles of which
r is the height and b − 2r the length, it is easy to see, by that which precedes, that the product of
8r by the length b − 2r will be the number of corresponding combinations, in which the cylinder
will encounter one or the other of the sides b of the great rectangle. Thus 8r(b − 2r) will be
the total number of combinations corresponding to the cases in which, the center of the cylinder
being placed in one or the other of these small rectangles, the cylinder encounters the outline of
the great rectangle. By the same reason, 8r(a − 2r) will be the total number of combinations in
which, the center of the cylinder being placed in the interior of the small rectangles of which r
and a − 2r are the dimensions, the cylinder encounters the outline of the great rectangle.
There remains for us to consider the four small squares. Let ABCD be one of them. From
the angle A common to this square and to the great rectangle, as center, and from the radius
r, let us describe a quarter circumference terminating itself at the points B and D. As long as
the center of the cylinder will be comprehended within the quarter circle formed by this arc, the
cylinder, in turning, will encounter the outline of the rectangle in all its positions; the number
of combinations in which this will take place is therefore equal to the product of 2π by the area
2 2
of the quarter circle, and consequently it is equal to π 2r . If the center of the cylinder is in the
part of the square which is outside of the quarter circle, the cylinder, in turning around its center,
will be able to encounter one or the other of the two sides AB and AD extended, without ever
encountering both at the same time. In order to determine the number of combinations relative [368]
to this encounter, I conceive on any point of side AB, distant by x from point A, a perpendicular
y of which the extremity is beyond the quarter circle. I place the center of the cylinder on this
extremity, from which I let down four straight lines equal to r, and of which two descend onto
the side AB extended, if that is necessary, and two others onto the side AD similarly prolonged. I
name 2φ the angle comprehended between the first two lines, and 2φ the angle coomprehended
between the second two. It is clear that the cylinder, in turning on its center, will encounter
the side AB extended as often as one of its halves will be within the angle 2φ, and that it will
encounter the side AD extended as often as one of its halves will be within the angle 2φ ; the
total number of all combinations in which the cylinder will encounter one or the other of these
sides is therefore 4(φ + φ ); thus this number, relatively to the part of the square exterior to the
quarter circle, is 
4 (φ + φ )dx dy;

now we have evidently


x = r cos φ , y = r cos φ;
the preceding integral becomes thus

4r2 (φ + φ )dφ dφ sin φ sin φ ,

and it is easy to see that the integral relative to φ must be taken from φ = 0 to φ = π2 − φ, and
that the integral relative to φ must be taken from φ = 0 to φ = π2 , that which gives 12 r2 (12−π 2 )
2 2
for this integral. In adding to it π 2r , we will have the number of combinations relative to the
square, and in quadrupling this number and joining it to the preceding numbers of combinations

8
relative to the encounter of the outline of the great rectangle by the cylinder, we will have, for
the total number of combinations,
8(a + b)r − 8r2 .
But the total number of possible combinations is evidently equal to 2π multiplied by the area [369]
ab of the great rectangle; the probability of the encounter of the divisions of the plane by the
cylinder is therefore
4(a + b)r − 4r2
.
abπ

9
BOOK II
CHAPTER VI
DE LA PROBABILITÉ DES CAUSES ET DES ÉVÈNEMENTS FUTURS, TIRÉE
DES ÉVÈNEMENTS OBSERVÉS

Pierre Simon Laplace


Théorie Analytique des Probabilités OC 7 §§26–33, pp. 370–409

ON THE PROBABILITY OF CAUSES AND OF FUTURE EVENTS, DEDUCED


FROM OBSERVED EVENTS

An observed event being composed of simple events of the same kind and of which
the possibility is unknown, to determine the probability that this possibility is
comprehended within some given limits. Expression of this probability. Formula
in order to determine it by a very convergent series, when the observed event is
composed of a great number of these simple events. Extension of this formula to
the case where the observed event is composed of many different kinds of simple
events No 26.

Application of these formulas to the following problems: Two players A and B play
together with this condition that the one who out of three trials will have won two
of them will win the game, the third trial not being played as useless, if the same
player wins the first two trials. Out of a great number n of won games, A has
won the number i of them; we demand the probability that his skill, respectively
to player B, is comprehended within some given limits.
We demand the probability that the number of trials played is comprehended within
some determined limits. Finally, this last number being supposed known, we
demand the probability that the number of games is comprehended within some
given limits.

Solutions of these diverse problems. No 27.

Application of the formulas of no 26 to the births observed in the principal places of


Europe. Everywhere the number of births of boys is superior to the one of the
births of girls. To determine the probability that there exists a constant cause of
this superiority, according to the births observed in a given place. Solution of
the problem. This probability for Paris differs excessively little from certitude.
No 28.

1
At Paris, the ratio of the baptisms of the boys to those of the girls is 25 24 , while at
London this ratio is 19
18 . To determine the probability that there exists a constant
cause of this difference. Solution of the problem. This probability is very great.
Probable conjecture with respect to this cause. No 29.
Investigation of the probability of the results based on the Tables of mortality or of
assurance, constructed out of a great number of observations.
Supposing that, out of a great number p of individuals of age A, we have observed
that there exists q of them at age A + a, r at age A + a + a , . . ., to determine
the probability that, out of a great number p of individuals of the same age A,
 
there will exist of them ppq ± z at age A + a, ppr ± z  at age A + a + a , . . .
Solution of the problem. There results from it that by increasing the number p
we approach without ceasing the true law of mortality, with which the results of
the observations would coincide, if p was infinite. No 30.
To evaluate, by means of annual births, the population of a vast empire. Solution of
the problem. Application to France. Probability that the error of this evaluation
will be comprehended within some given limits. No 31.
Expression of the probability of a future event, deduced from an observed event.
When the future event is composed of a number of simple events, much smaller
than the one of the simple events which enter into the observed event, we are
able, without sensible error, to determine the possibility of the future event, by
supposing to each simple event the possibility which renders the observed event
most probable. No 32.
From the epoch where we have distinguished at Paris, out of the registers, the births
of each sex, we have observed that the number of masculine births surpasses the
one of the feminine births; to determine the probability that this annual superi-
ority will be maintained within a given interval of time, for example, in the space
of a century. No 33.

2
§26. The probability of the greater part of simple events is unknown: by consid- [370]
ering it a priori, it appears to us susceptible of all the values comprehended between
zero and unity; but, if we have observed a result composed of many of these events,
the manner by which they enter there renders some of these values more probable than
the others. Thus, in measure as the observed result is composed by the development of
the simple events, their true possibility is made more and more known, and it becomes
more and more probable that it falls within some limits which, being tightened without
ceasing, would end by coinciding, if the number of simple events became infinite. In
order to determine the laws according to which this possibility is discovered, we will
name it x. The theory exposed in the preceding Chapters will give the probability of
the observed result, as a function of x. Let y be this function; if we consider the dif-
ferent values of x as so many causes of this result, the probability of x will be, by the
third principal of §1, equal to a fraction of which the numerator is y, and of which the
denominator is the sum of all the values of y; by multiplying therefore the numerator
and the denominator of this fraction by dx, this probability will be
y dx
 ,
y dx

the integral of the denominator being taken from x = 0 to x = 1. The probability that [371]
the value of x is comprehended within the limits x = θ and x = θ is consequently
equal to
y dx
(1)  ,
y dx

the integral of the numerator being taken from x = θ to x = θ , and that of the
denominator being taken from x = 0 to x = 1.
The most probable value of x is that which renders y a maximum. We will designate
it by a. If at the limits of x, y is null, then each value of y has an corresponding equal
value on the other side of the maximum.
When the values of x, considered independently of the observed result, are not
equally possible, by naming z the function of x which expresses their probability, it is
easy to see, by that which has been said in Chapter I of this Book, that by changing in
formula (1), y into yz, we will have the probability that the value of x is comprehended
within the limits x = θ and x = θ . This reverts to supposing all the values of x
equally possible a priori, and by considering the observed result as being formed of
two independent results, of which the probabilities are y and z. We are able to restore
thus all the cases to the one where we suppose a priori, before the event, an equal
possibility to the different values of x, and, by this reason, we will adopt this hypothesis
in that which will follow.
We have given in §22 and the following of Book I the formulas necessary in order
to determine, by some convergent approximations, the integrals of the numerator and
of the denominator of formula (1), when the simple events of which the observed event
is composed are repeated a very great number of times; because then y has for factors
functions of x raised to very great powers. We will, by means of these formulas,
determine the law of probability of the values of x, in measure as they deviate from

3
the value a, the most probable, or which renders y a maximum. For that, let us resume
formula (c) of §27 of Book I, [372]
⎧  
⎪ 1 d2 U 3 1.3 d4 U 5 2

⎪ y dx =Y U + + + + · · · dt c−t

⎪ 2 1.2 dx 2 2 2 1.2.3.4 dx 4


2
⎨ Y 2 dU d2 U 3 d3 U 4
(2) + c−T −T + (T 2
+ 1) − · · ·

⎪ 2 dx 1.2 dx2 1.2.3 dx3




⎪ Y −T 2 dU 2 2 3
d3 U 4

⎩  d U 2
− c +T + (T + 1) + ··· ;
2 dx 1.2 dx2 1.2.3 dx3
2 2 3 2 2 3
ν is equal to √logx−a
Y −log y
, and U, dU d U dν d ν
dx , dx2 , . . . are that which ν, dx , dx2 , . . . ,
become when we change, after the differentiations, x into a, a being√the value of x
which renders y a maximum: T is equal to that which the function log Y − log y
becomes, when we change x into a − θ in y, and T  is that which the same function
becomes, when we change x into a + θ . The preceding expression of y dx gives the
 2
value of this integral, within the limits x = a − θ and x = a + θ , the integral dt c−t

being taken from t = −T to t = T . 
Most often, at the limits of the integral y dx, extended from x = 0 to x = 1, y
is null; now, when y is not null, it becomes so small at these limits, that we are able to
suppose it null. Then,we are able to make at these limits T and T  infinite, that which
gives for the integral y dx, extended from x = 0 to x = 1,
 
1 d2 U 3 1.3 d4 U 5 √
y dx = Y U + + 2 + + ··· π;
2 1.2 dx2 2 1.2.3.4 dx4
thus the probability that the value of x is comprehended within the limits x = a − θ
and x = a + θ is equal to


⎪ 1 −T 2 dU 2 d2 U 3 2 d3 U 4 ⎪

⎨ 2c −T + (T + 1) − ··· ⎪

dx 1.2 dx2 1.2.3 dx3

2


2 3 3 4 ⎪
⎩ − Y c−T + ··· ⎪
2 dU d U d U ⎭
 −t2 + T + (T 2 + 1)
dt c 2 dx 1.2 dx 2 1.2.3 dx 3
(3) √ +  d2 U 3 d4 U 5
 √
π U + 12 1.2 1.3
dx2 + 22 1.2.3.4 dx4 + · · · π
We see, by §23 of Book I, that, in the case where y has for factors functions of x raised [373]
to great powers of order α1 , α being an extremely small fraction, then U is most often of
√ 2
d2 U 3
order α, so that its successive differences; U, dU dx , dx2 , . . . are respectively of the
√ 3
orders α, α, α 2 , . . . ; whence it follows that the convergence of the series of formula
(3) requires that T and T  are not of an order superior to √1α .
If we suppose θ = θ , then we have very nearly T = T  , and formula (3) is reduced,
 2
dt c−t
by neglecting the terms of order α, to the integral √π , taken from t = −T to
t = T  ; that which reverts, to neglecting the square of the difference T 2 − T 2 , to
doubling the preceding integral and to taking it from t null to

T 2 + T 2
t= .
2

4
Now we have
T 2 = log Y − log y,
and we are able to suppose
1
log y = log φ,
α
φ being a function of x or of a − θ, which no longer contains factors raised to great
d2 Φ dφ d2 φ
powers. By naming therefore Φ, dΦ dx , dx2 , . . . that which φ, dx , dx2 , . . . become,
when θ is null, by observing next that the condition of Y or Φ a maximum gives dΦ dx =
0, we will have
  2 4 
2 3
2 2 d Φ 3 d Φ θ4 d4 Φ d Φ
αT = −θ 2
+θ 3
− 4
− + ···
2Φ dx 6Φ dx 8 3Φ dx Φ dx2

By changing θ into −θ, we will have the value of αT 2 ; we will have therefore, by
neglecting the terms of order α2 ,
α(T 2 + T 2 ) d2 Φ
= −θ2 ;
2 2Φ dx2
hence,   [374]
T 2 + T 2 θ d2 Φ
=√ − ;
2 α 2Φ dx2
Let us make  
d2 Φ αd2 Y
k= − = − ,
2Φ dx2 2Y dx2

t α
θ= ;
k

t α
the probability that the value of x is comprehended within the limits a ± k will be
 2
2 dt c−t
√ ,
π
the integral being taken from t = 0, and being able to be obtained in a very near way
from the formulas of §27 from Book I.
There results from this expression that the most probable value of x is a, or that
which renders the observed event the most probable, and that by multiplying to infinity
the simple events of which the observed

event is composed, we are able at the same
time to narrow the limits a ± t kα , and to increase the probability that the value of
x will fall between these limits; so that at infinity, this interval becomes null, and the
probability is confounded with certitude.
If the observed event depends on simple events of two different kinds, by naming
x and x the possibilities of these two kinds of events, we will see, by the preceding
reasonings, that, y being then the probability of the composite event, the fraction
y dx dx
(4) 
y dx dx

5
will be the probability of the simultaneous values of x and of x , the integrals of the
denominator being taken from x = 0 to x = 1, and from x = 0 to x = 1. By
naming a and a the values of x and x which render y a maximum, and making x = [375]
a + θ, x = a + θ , we will find, by the analysis of §27 from Book I, that if we suppose
 
∂2Y
θ ∂2Y  ∂x∂x −2Y
√ − 2 −θ ∂2Y
=t,
2Y ∂x 2Y ∂x2
  2 2
θ ∂2Y ∂2Y ∂ Y
 2 2
− 
=t ,
−2Y
2
∂ Y ∂x ∂x ∂x∂x
∂x2

the fraction (4) will take form


2 2
dt dt c−t −t
 .
dt dt c−t2 −t2

The integrals of the denominator must be taken from t = −∞ to t = ∞, and from


t = −∞ to t = ∞; because the integrals relative to x and x of the fraction (4) being
taken from x = 0 and x = 0 to x and x equal to unity, and at these limits, the values
of θ and of θ being −a and 1 − a, −a and 1 − a , the limits of t and of t are equal
to these last limits multiplied by some quantities of order √1α : thus the exponential
2 2
c−t −t is excessively small at these limits, and we are able, without sensible error,
to extend the integrals of the denominator of the preceding fraction to the positive and
negative infinite values of the variables t and t . This denominator becomes thus equal
to π; and the probability that the values of θ and of θ are comprehended within the
limits 
2
t −2Y ∂∂xY2
 
θ =0, θ =  ∂ 2 Y 2 ,
∂2Y ∂2Y
2
∂x ∂x 2 − ∂x∂x
√ 
∂2Y
t 2Y t ∂x∂x  −2Y
θ =0, θ = +   ∂2Y
∂2Y 2 2 2 2
− ∂x2 ∂x2 ∂x2 − ∂x∂x
∂ Y ∂ Y ∂ Y ∂x2

is equal to  [376]
1  −t2 −t2
dt dt c ,
π
the integrals being taken from t and t nulls.
We see by this formula that, in the case of two different kinds of simple events,
the probability that their respective possibilities are those which render the composite
event most probable becomes more and more great, and ends by being confounded with
certitude; that which holds generally for any number whatsoever of different kinds of
simple events, which enter into the observed event.
If we imagine an urn containing an infinity of balls of many different colors, and if
after having drawn from it a great number n, p out of this number had been of the first
color, q of the second, r of the third, etc.; by designating by x, x , x , . . . the respective

6
probabilities to bring forth in a single drawing one of these colors, the probability of the
observed event will be the term which has for factor xp xq xr · · · , in the development
of the polynomial
(x + x + x + · · · )n ,
where we have
x + x + x + · · · = 1,
p + q + r + · · · = n;
we will be able therefore to suppose here y = xp xq xr · · · , and then we have for the
values of x, x , x , . . . which render the observed event the most probable
p q r
x= , x = , x = , ...
n n n
Thus the most probable values are proportionals to the numbers of the arrivaled of the
colors, and when the number n is a great number, the respective probabilities of the
colors are very nearly equals to the numbers of times that they are arrived divided by
the number of drawings.

§27. In order to give an application of the preceding formula, let us consider the [377]
case where two players A and B play together with this condition, that the one who
out of three trials will have won two of them wins the game, and let us suppose that,
out of a very great number n of games, A has won a number i of them. By naming x
the probability of A in order to win a trial, and consequently 1 − x the corresponding
probability of B, the probability of A in order to win a game will be the sum of the
first two terms of the binomial (x + 1 − x)3 , and the corresponding probability of B
will be the sum of the last two terms. These probabilities are therefore x2 (3 − 2x) and
(1 − x)2 (1 + 2x); thus the probability that, out of n games, A will win i of them, and B,
n−i, will be proportional to x2i (3−2x)i (1−x)2n−2i (1+2x)n−i . By naming therefore
y this function, and a the value of x which renders it a maximum, the probability that
the value of x is comprehended within the limits a − θ and a + θ will be

y dx
 ,
y dx
the integral of the numerator being taken from x = a − θ to x = a + θ, and that of the
denominator being taken from x = 0 to x = 1. If we make
1 i
= α, = i ,
n n
we will have, by the preceding section,
   
φ = x2i (3 − 2x)i (1 − x)2−2i (1 + 2x)1−i .

The condition of the maximum of y or of φ gives dφ = 0; consequently, a being the


value of x corresponding to this maximum, we will have
2i 2i 2(1 − i ) 2(1 − i )
0= − − + ,
a 3 − 2a 1−a 1 + 2a

7
whence we deduce

i = a2 (3 − 2a), 1 − i = (1 − a)2 (1 + 2a);

next we have
−d2 Φ 18
2
= = k2 .
2Φdx (3 − 2a)(1 + 2a)
The probability that the value of x is comprehended within the limits a ± √r
n
will be [378]
therefore, by the preceding section, equal to
√ 
6 2 −18r 2
 dr c (3−2a)(1+2a) .
π(3 − 2a)(1 + 2a)

We will see easily that this result agrees with the one that we have found in §16, by an
analysis less direct than this one.
The game ends in two trials, if A or B wins the first two trials, the third trial not
being played, because it becomes useless. Thus the numbers of games won by one and
the other of the players do not indicate the number of games played; but they indicate
that this last number is contained within some given limits, with a probability that
increases without ceasing, in measure as the games are multiplied. The investigation of
this number and of this probability being very proper to clarify the preceding analysis,
we will occupy ourselves with it.
The probability that A will win a game in two trials is x2 , x expressing, as above,
his probability to win at each trial. The probability that he will win the game in three
trials is 2x2 (1 − x). The sum x2 (3 − 2x) of these two probabilities is the probability
that A will win the game. Thus, in order to have the probability that, out of i games
won by player A, s will be of two trials, it is necessary to raise to the power i the
binomial
x2 2x2 (1 − x)
+
x2 (3 − 2x) x2 (3 − 2x)
or
1 2(1 − x)
+ ,
3 − 2x 3 − 2x
and the term i − s + 1 of the development of this power will be that probability which
is thus equal to
1.2.3 . . . i.2i−s (1 − x)i−s
.
1.2.3 . . . s.1.2.3 . . . (i − s)(3 − 2x)i
The greatest term of this development is, by §16, the one in which the exponents s and [379]
i − s of the first and of the second term of the binomial are very nearly in the ratio of
these terms, that which gives
i
s= .
3 − 2x
We will name s this quantity, and we will make

s = s + l.

8
We will have, by §16, 
i −il2
dl c 2s (i−s )
2s π(i − s )
for the probability of s, corresponding to the skill x of player A.
We will find similarly that, if we name z the number of the games of two trials,
won by player B, out of the number n − i of games that he has won, the most probable
n−i
value of z will be 1+2x , and that by designating by z  this quantity and making

z = z  + l ,
the probability of z corresponding to x will be

n−i −(n−i)l2
 2z (n−i−z )
dl c .
2z  (n − i − z  )π
The product of these two probabilities is therefore the probability corresponding to x,
that the number of games of two trials, won by player A, will be s + l, while the
number of games of two trials, won by player B, will be z  + l . Let
i n−i
q= , q = ;
2s (i − s ) 2z  (n − i − z)
we will have, for this composite probability,
√ 
qq 2  2
dl dl c−ql −q l .
π
It is necessary to multiply this probability by that of x, which, as we have seen in the
preceding section, is yydx
dx
; the product is [380]


qq  y dx 2  2
()  dl dl c−ql −q l .
π y dx

The integral of the denominator must be taken from x = 0 to x = 1, and by §27 of


Book I, this integral is, very nearly,

√ −2Y
Y π d2 Y .
dx2

If we name X the function


 2
−q  l2
qq  c−ql
and if we designate by a the value of x which renders Xy a maximum, and by X 
and Y  that which X and y become when we change x into a , we will have, by the
preceding section, by making x = a + θ,
 2  2 θ 2 d2 (X  Y  )
ydx qq  c−ql −q l = Y  X  dθ c 2X  Y  dx2 .

9
It is easy to see that a differs from the value a of x which renders y a maximum, only
by a quantity of order α, which we will designate by f α; by substituting into Y , a+f α
instead of a , in order to form Y  , and developing with respect to the powers of α, we

will see that dYda being null, because Y is the maximum of y, Y differs from Y only
by quantities of order α; thus we have, to the quantities near of an order inferior to the

d2 X 
one that we conserve, and by observing that XdX  dx and X  dx2 are able to be neglected

with respect to YdY dx ,

d2 X  Y  d2 Y
= ;
2X  Y  dx2 2Y dx2
the function () becomes thence

√ 
qq d2 Y 2  2 θ 2 d2 Y
( ) √ − 2
dl dl dθ c−ql −q l + 2Y dx2 .
π π 2Y dx
We must, in this function, suppose x = a, that which gives, by substituting for i its [381]
value na2 (3 − 2a),
3 − 2a 1 + 2a
q= , q = .
4na2 (1 − a) 4na(1 − a)2
Next, x being equal to a + θ, it is equal to a + f α + θ; by neglecting therefore the
quantities of order α, we will have

x = a + θ.
Now the number of games of two trials being
i n−i
+ + l + l ,
3 − 2x 1 + 2x
this number will be


i n−i 2i 2(n − i)
+ + − θ + l + l .
3 − 2a 1 + 2a (3 − 2a)2 (1 + 2a)2
Let us make


2i 2(n − i)
t= − θ + l + l ,
(3 − 2a)2 (1 + 2a)2
and let us designate by q  the quantity
d2 Y
−  2 ,
2i 2(n−i)
2Y dx2 (3−2a)2 − (1+2a)2

which, after all the reductions, is reduced to


9(3 − 2a)(1 + 2a)
;
2n(1 − 2a)2 (3 − 2a + 2a2 )2

10
the function ( ) will become
√  
qq q 2  2   2
( ) √ dt dl dl dθ c−ql −q l −q (t−l−l ) .
π π

By integrating it from l = −∞ to l = ∞, and from l = −∞ to l = ∞, we will have


the probability that the number of games of two trials will be equal to [382]
i n−i
+ + t;
3 − 2a 1 + 2a
now we have
   
q 
2
2  2   2 − qq (t−l )2 −q  l2 −(q+q  ) l− q+q 
 (t−l )
dl c−ql −q l −q (t−l−l ) = dl c q+q .

This last integral, taken from l = −∞ to l = ∞, is, by that which precedes,



π qq   2  2
− q+q  (t−l ) −q l
√ 
c .
q+q

By multiplying it by dl and by putting it under this form


√  2
π dl − qq +qq
qq  q  t2
 +q  q  −
qq  +qq  +q  q  qq  t
l − qq +qq
√ c q+q   +q  q 
,
q + q 

and integrating from l = −∞ to l = ∞, we will have


π qq  q  t2

√ c qq +qq +q q
qq   
+ qq + q q 

The function ( ) integrated with respect to l and l , within the positive and negative
infinite limits of these variables, becomes thus

1 qq  q  qq  q  t2
− qq +qq
√ c  +q  q 
.
π qq  + qq  + q  q 

Thus the probability that the number of games of two trials will be comprehended
within the limits
i n−i
+ ± t = n[a2 + (1 − a)2 ] ± t
3 − 2a 1 + 2a
is equal to the double of the integral of the preceding differential, taken from t null. We
qq  q 
must observe that q, q  , q  are of order n1 , so that the quantity qq +qq  +q  q  is of the

k2 √
same order. Let us represent it by n , and let us make t = r n; we will have [383]

2 2 2
( ) √ k  dr c−k r ,
π

11
for the expression of the probability that the number of games of two trials will be
comprehended within the limits

n[a2 + (1 − a)2 ] ± r n,

the integral being taken from r null. The interval of these two limits is 2r n, and the
ratio of this interval to the number n of games is √2rn . This ratio diminishes without
ceasing in measure as n increases, and r is able to increase at the same time indefinitely,
so that the preceding integral approaches indefinitely unity.
The total number of trials is the triple of the number of games of three trials, plus
the double of the number of games of two trials, or the triple of the total number n of
games, less the number of games of two trials; it is therefore

2n(1 + a − a2 ) ∓ r n.

The integral ( ) is therefore the expression of the probability that the number of trials
will be comprehended within these limits.
If, instead of knowing the number i of games won by player A and the total number
n of games, we know the number i and the total number of trials, the same analysis
will be able to serve to determine the unknown number n of games. For this, let us
designate by h the total number of trials; we will have, by that which precedes, the two
equations
i n−i √
3n − − = h ± r n,
3 − 2a 1 + 2a
i i n−i n−i
− = − .
n 3 − 2a 1 − a 1 + 2a

These equations give a and n as functions of h ± r n. Let us suppose [384]
 √  √
h±r n h±r n
n = iψ , a=Γ ;
i i

we will have, by reducing into series,


  
h √ dψ hi
n = iψ ± ir n + ··· ;
i dh
   
we will substitute into k  , instead of n and of a, iψ hi and Γ hi : the integral ( ) is
then the probability that the number n of games is comprehended within the limits
    
h h dψ hi
iψ ± ir iψ .
i i dh

§28. It is principally in the births that the preceding analysis is applicable, and
we are able to deduce from it, not only for the human race, but for all the kinds of
organized beings, some interesting results. Until here the observations of this kind

12
have been made in great number only on the human race; we will submit the principals
to the calculus.
Let us consider first the births observed at Paris, at London and in the realm of
Naples. In the space of forty years elapsed from the commencement of 1745, an epoch
where we have begun to distinguish at Paris, out of the registers, the births of two sexes,
to the end of 1784, we have baptized in this capital 393386 boys and 377555 girls, the
found infants being comprehended in this number: this gives nearly 25 24 for the ratio of
the baptisms of the boys to those of the girls.
In the space of ninety-five years elapsed from the commencement of 1664 to the
end of 1758, there was born at London 737629 boys and 698958 girls, that which gives
19
18 nearly, for the ratio of the births of boys to those of girls.
Finally, in the space of nine years elapsed, from the commencement of 1774 to the [385]
end of 1782, there was born in the realm of Naples, Sicily not included, 782352 boys
and 746821 girls, that which gives 22 21 for the ratio of the births of the boys to those of
the girls.
The smallest of these numbers of births are relative to Paris; besides it is in this
city that the births of the boys and of the girls approach more to equality. For these
two reasons, the probability that the possibility of the birth of a boy surpasses 12 must
be less than at London and in the realm of Naples. Let us determine numerically this
probability.
Let us name p the number of masculine births observed at Paris, q the one of the
feminine births, and x the possibility of a masculine birth, that is the probability that
an infant who must be born will be a boy; 1 − x will be the possibility of a feminine
birth, and we will have the probability that, out of p − q births, p will be masculine,
and q will be feminine, equal to

1.2.3 . . . (p + q)
xp (1 − x)q ,
1.2.3 . . . p.1.2.3 . . . q
By making therefore
y = xp (1 − x)q
the probability that the value of x is comprehended within some given limits will by,
by §26, equal to 
y dx
 ,
y dx
the integral of the denominator being taken from x = 0 to x = 1, and that of the
numerator being taken within the given limits. If we take zero and 12 for these limits, we
will have the probability that the value of x not surpass 12 . The value which corresponds
p
to the maximum of y is p+q , and, seeing the magnitude of the numbers p and q, the
p 1
excess of p+q over 2 is too considerable in order to employ here formula (c) from §27

of Book I, in the approximation of the integral y dx, taken from x = 0 to x = 12 ; it [386]
is necessary therefore, in this case, to make use of formula (A) from §22 of the same
Book. Here we have
y dx x(1 − x)
ν=− =− ;
dy p − (p − q)x

13

the formula cited (A) gives thus, for the integral y dx taken from x = 0 to x = 12 ,


1 p+q
1 − + · · · .
2p+q+1 (p − q) (p − q)2

As for the integral y dx taken from x = 0 to x = 1, we have, by §26,
 
1 d2 U 3 √
y dx = Y U + + ··· π,
2 1.2 dx2
p
Y being that which y becomes at its maximum, or when we substitute p+q for x; ν is
p
x− p+q 2 3 2 3
here equal to √log Y −log y , and U, ddxU2 , . . . are that which ν, ddxν2 , . . . become, when
p

we make, after the differentiations, x = p+q . We find thus, for the integral y dx
taken from x null to x = 1,
 1 1√

pp+ 2 q q+ 2 2π (p + q)2 − 13pq
y dx = 3 1 + + · · · ;
(p + q)p+q+ 2 12pq(p + q)
1
the probability that the value of x does not surpass 2 is therefore equal to
3

(p + q)p+q+ 2 p+q (p + q)2 − 13pq
(o) √ 1− − − ··· .
3 1 1
(p − q) π 2p+q+ 2 pp+ 2 q q+ 2 (p − q)2 12pq(p + q)

In order to apply large numbers to this formula, it would be necessary to have the
logarithms of p, q and p − q, with twelve decimals at least: we are able to supply it in
this manner. We have
    
p+q p+q
2 p−q p−q
log = −p log 1 + − q log 1 − .
pq q q p+q p+q

When the logarithms are hyperbolic, the second member of this equation, reduced to [387]
series, becomes
⎡ 2  4  6  8 ⎤
p−q p−q p−q p−q
⎢ p+q p+q p+q p+q ⎥
−(p + q) ⎣ + + + + ···⎦
1.2 3.4 5.6 7.8

We will have therefore, by this very convergent series, the hyperbolic logarithm of
(p+q)p+q
2p+q pp q q . In multiplying it by 0, 43429448, we will convert it into tabular logarithm,
3
(p+q) 2
and, by adding to it the tabular logarithm of √ , we will have the tabular
2(p−q) 2pqπ
1
logarithm of the factor which multiplies series (o). If we name μ this factor and if we
make
p = 393386, q = 377555,
we find by tabular logarithm

log μ = 72, 2511780,

14
and the series (o) becomes
1
(1 − 0, 0030761 + · · · ).
μ
This quantity of excessive smallness, subtracted from unity, will give the probability
that at Paris the possibility of the births of the boys surpasses that of girls; whence we
see that we must regard this probability as being equal, at least, to that of the most
authenticated historical facts.
If we apply formula (o) to the births observed in the principal cities of Europe,
we find that the superiority of the births of boys over the births of girls, observed
everywhere from Naples to Petersburg, indicates a greater possibility of the births of
boys, with a probability extremely near to certitude. This result appears therefore to be
a general law, at least in Europe, and if, in some small cities, where we have observed
only a not very considerable number of births, nature seems to deviate from it, there [388]
is everywhere to believe that this deviation is only apparent, and that at length the
births observed in these cities would offer, in being multiplied, a result similar to the
one of the great cities. Many philosophers, deceived by these anomalies, have sought
the cause of phenomena which are only the effect of chance; that which proves the
necessity to make precede from parallel researches for that of the probability with
which the observations indicate the phenomena of which we wish to determine the
cause. I take for example the small city of Vitteaux, in which, out of 415 observed
births during five years, there is born 203 boys and 212 girls; p being here less than q,
the natural order appears reversed. Let us see what is, according to these observations,
the probability that the facilities of
 the births of boys surpasses in this city that of the
y dx
births of girls. This probability is  y dx , the integral of the numerator being taken from
x = 12 to x = 1, and that of the denominator being taken from x = 0 to x = 1.
Formula (o), which, subtracted from unity, gives this fraction, becomes here divergent;
we will employ then formula (3) from §26, which is reduced very nearly to its first term
 2
dt c−t 1
√ , the integral being taken from the value of t which corresponds to x = to
π 2
the value of t which corresponds to x = 1. Now we have, by the section cited,
t2 = log Y − log y,
y being xp (1 − x)q , and Y being the value of y corresponding to the maximum of
y, which holds when x = p+q p
; the value of t2 which corresponds to x = 12 is

p+q p+q
( 2 )
− log pq q q , this logarithm being hyperbolic, and being given, by that which
precedes, by a very convergent series. The value of t2 which corresponds
 to x = 1
2
is t2 = ∞; we have therefore thus the two limits of the integral dt c−t , an integral
which it will be easy to obtain by the formulas which we have given for this object. We
find thus the probability that at Vitteaux the facilities of the births of boys surpasses [389]
over those of girls equal to 0, 33; the superiority of the facility of the births of girls is
therefore indicated by these observations, with a probability equal to 0, 67, a probabil-
ity much too weak to balance the analogy which carried us to think that at Vitteaux, as
in all the cities where we have observed a considerable number of births, the possibility
of the births of boys surpasses that of the births of girls.

15
§29. We have seen at London the observed ratio of the births of boys to those of
girls is equal to 19
18 , while at Paris the one of the baptisms of boys to those of girls is
only 25
24 . This seems to indicate a constant cause of this difference. Let us determine
the probability of this cause.
Let p and q be the numbers of baptisms of boys and girls made at Paris in the inter-
val from the beginning of 1745 to the end of 1784; by designating by x the possibility
of the baptism of a boy, and making, as in the preceding section,

y = xp (1 − x)q ,

the most probable value of x will be that which renders y a maximum: it is therefore
p
p+q ; by supposing next
p
x= + θ,
p+q
the probability of the value of θ will be, by §26, equal to

dθ (p + q)3 − (p+q) 3
2
√ c 2pq θ .
π 2pq

By designating by p , q  and θ that which p, q and θ become for London, we will have

dθ (p + q  )3 − (p2p
 +q  )3
 q θ
2
√  
c
π 2p q

for the probability of θ ; the product [390]



dθ dθ (p + q)3 (p + q  )3 − (p+q) 3
2 (p +q  )3 2
2pq θ − 2p q  θ
c
π 4pqp q 

of these two probabilities will be therefore the probability of the simultaneous existence
of θ and θ . Let us make
p p
+ θ = + θ + t;
p + q p+q
the preceding differential function becomes
  2
dθ dt (p + q)3 (p + q  )3 − (p+q) 3
2 (p +q  )3
2pq θ − 2p q 
p q−pq 
θ+t− (p+q)(p  +q  )
c .
π 4pqp q 

By integrating it for all the possible values of θ and next for all the positive values of
t, we will have the probability that the possibility of the baptisms of boys is greater at
p
London than at Paris. The values of θ are able to be extended from θ equal to − p+q to
(p+q)3 2
p
θ equal to 1 − p+q ; but, when p and q are very great numbers, the factor c− 2pq θ is
so small at these two limits that we are able to regard it as null; we are able therefore to
extend the integral relative to θ, from θ = −∞ to θ = ∞. We see, for the same reason,

16
that the integral relative to t is able to be extended from t = 0 to t = ∞. By following
the process of §27 for these multiple integrations, we will find easily that, if we make
(p + q)3 (p + q  )3
k2 = ,
2p q  (p + q)3 + 2pq(p + q  )3
p q − pq 
h= ,
(p + q)(p + q  )
2pqk 2
θ+ (t − h) = t ,
(p + q)3
that which gives dθ = dt , the preceding differential, integrated first with respect to t [391]
from t = −∞ to t = ∞, and next from t = 0 to t infinity, will give

k dt −k2 (t−h)2
√ c
π
for the probability that at London the possibility of the baptisms of boys is greater than
at Paris. If we make
k(t − h) = t ,
this integral becomes 
dt 2
√ c−t ,
π
the integral being taken from t = −kh to t = ∞, and it is clear that it is equal to

dt 2
1− √ c−t ,
π
the integral being taken from t = kh to t = ∞. Thence it follows, by §27 of Book
I, that, if we suppose
p q  (p + q)3 + pq(p + q  )3
i2 = ,
(p + q)(p + q  )(p q − pq  )2
the probability that the possibility of the baptisms of boys is greater at London than at
Paris has for expression
1
ic− 2i2 1
(μ) 1− √
2π i2
1+
2i2
1+
3i2
1+
4i2
1+
1 + ···
By making in this formula

p =393386, q =377555,
p =737629, q  =698958,

17
it becomes
1
1− .
328269
There is therefore odds of 328268 against one that at London the possibility of the [392]
baptisms of boys was greater than at Paris. This probability approaches so much to
certitude, that there is place to investigate the cause of this superiority.
Among the causes which are able to produce it, it has appeared to me that the
baptisms of the found infants, who are part of the annual list of the baptisms at Paris,
must have a sensible influence on the ratio of the baptisms of the boys to those of the
girls, and that they must diminish this ratio, if, as it is natural to believe, the parents
in the surrounding country, finding advantage to retain near to them the male infants,
have sent them to the hospice of the Enfants-Trouvés1 of Paris, in a ratio less than the
one of the births of the two sexes. This is that which the summary from the registers of
this hospice has made me see with a very great probability. From the commencement
of 1745 to the end of 1809, we have baptized 163499 boys and 159405 girls, a number
of which the ratio is 39 25
38 , and differs too much from the ratio 24 of the baptisms of the
boys and the girls at Paris, in order to be attributed to simple chance.

§30. Let us determine, according to the preceding principles, the probabilities of


the results founded on the Tables of mortality or of assurance, constructed on a great
number of observations. Let us suppose first that, with respect to a number p of indi-
viduals of a given age A, we have observed that there exists yet the number q at the age
A + a; we demand the probability that, out of p individuals of age A, there will exist
q  + z of them at the age A + a, the ratio of p and q  being the same as that of p to q.
Let x be the probability of an individual of age A, in order to survive to age A + a;
the probability of the observed event is then the term of the binomial x + (1 − x)p
which has xq for factor; this probability is therefore
1.2.3 . . . p
xq (1 − x)p−q ;
1.2.3 . . . (p − q)1.2.3 . . . q

thus the probability of the value of x, taken from the observed event, is [393]

xq dx (1 − x)p−q
 ,
xq dx(1 − x)p−q

the integral of the denominator being taken from x = 0 to x = 1.


The probability that, out of the p individuals of age A, q  + z will live to age A + a
is

1.2.3 . . . p   
xq +z (1 − x)p −q −z .
1.2.3 . . . (q  + z)1.2.3 . . . (p − q  − z)
In multiplying this probability by the preceding probability of the value of x, the prod-
uct integrated from x = 0 to x = 1 will be the probability of the existence of q  + z
persons to age A + a. By naming therefore P this probability, we will have
1 Translator’s note: Foundling Hospital of Paris.

18
   
1.2.3 . . . p xq+q +z dx(1 − x)p+p −q−q −z
P =  ,
1.2.3 . . . (q  + z)1.2.3 . . . (p − q  − z) xq dx(1 − x)p−q
the integrals of the numerator and of the denominator being taken from x = 0 to x = 1.
We have, by §28, very nearly,

  
xq+q +z dx(1 − x)p+p −q−q −z

 q+q +z+ 12
√  z
= 2π (q + q ) 1 +
q + q
  p+p −q−q −z+ 12
(p + p − q − q  ) 1 − p+p −q−q
z

×  3 ,
(p + p )p+p + 2
 √ q q+ 12 (p − q)p−q+ 12
xq dx(1 − x)p−q = 2π 3 .
pp+ 2
Next, by §33 of Book I, we have

 1 √
1.2.3 . . . p = pp + 2 c−p 2π,
 q +z+ 12
 q  +z+ 12 z  √
1.2.3 . . . (q + z) = q 1+  c−q −z 2π,
q
 p −q −z+ 12
  1 z   √
1.2.3 . . . (p − q  − z) = (p − q  )p −q −z+ 2 1 −  
c−p +q +z 2π;
p −q
qp
finally we have q  = p . This premised, we find, after all the reductions, [394]

  q+q +z+ 12  p+p −q−q −z+ 12


p3 1+ z
q+q  1 − p+p −q−q
z

P =  q +z+ 12  p −q −z+ 12 .
qp (p − q)(p + p )2π
1 + qz 1 − p −q
z


If we take the hyperbolic logarithm of the second member of this equation, if we


reduce this logarithm into series ordered with respect to the powers of z, and if we
neglect the powers superior to the square, we will have, by passing again from the
logarithm to the function,



p3 (2q − p)p2 z −p3 z 2
P =  
1 +  
c 2qp (p−q)(p+p ) .
qp (p − q)(p + p )2π 2qp (p − q)(p + p )

p, q, p being supposed very great numbers of order α1 , the coefficient of z is very


2
√ of −z is very small and of the same order. But, if we
small of order α; the one
z
suppose p of the order α, we will be able to neglect, in the preceding expression, the

19

term depending on the first power of z, as very small of order α. Moreover, this term
is itself destroyed, when we have regard at the same time to the positive and negative
values of z. By neglecting it therefore, we will have
 
p3 p3 z 2
− 2qp (p−q)(p+p )
2 dz c
qp (p − q)(p + p )2π
for the expression of the probability that, out of p individuals of age A, the number of
those who will arrive to age A + a will be comprehended within the limits q ± z, the
integral being taken from z null.
Let us suppose now that we have found by observation that, out of p individuals of
age A, q lived yet to age A + a, and r to age A + a + a ; we demand the probability
 
that, out of p individuals of the same age A, qpp + z will live to age A + a, and rpp + z 
will live to age A + a + a .

The probability that, out of p individuals of age A, qpp + z will live to age A + a [395]
is, by that which precedes,

p3 − p3 z 2

 
c 2qp (p−q)(p+p ) .
2qp (p − q)(p + p )π

  
We will have the probability that, out of qpp +z individuals of age A+a, qpp + z rq +
qp
u will live to age A + a + a , by changing in the preceding function p into p + z, p

into q, q into r and z into u; that which gives, by neglecting z with respect to qpp ,

qp2 − qp2 u2

 
c 2rp (p−r)(p+p ) .
2rp (q − r)(p + p )π
The product of these two probabilities is the probability of the simultaneous existence
of z and of u. Now we have
 
qp r rp
+z +u= + z,
p q p
that which gives
rz
u = z − ;
q
by making therefore
p3
β2 = ,
2qp (p − q)(p + p )
qp2
β 2 = ,
2rp (q − r)(p + p )
the probability P of the simultaneous existence of the values of z and of z  will be

β dz β  dz  −β 2 z2 −β 2 (z − rz 2
q ) .
P = √ √ c
π π

20
By following this analysis, we find generally that, if we make [396]

rp2
β 2 = ,
2sp (r
− s)(p + p )
sp2
β 2 = ,
2tp (s − t)(p + p )


....................

the probability P that, out of p individuals of age A, the numbers of those who will
live to ages A + a, A + a + a , A + a + a + a , . . . will be comprehended within the
respective limits

qp qp rp rp sp sp tp tp


, + z; , + z; , + z  ; , + z  ; ...,
p p p p p p p p
is   2
β dz β  dz  β  dz  −β 2 z 2 −β 2 (z  − rz
2
q ) −β
2
z  − sz

−···
P = √ √ √ ··· c r
.
π π π
We are able to estimate by this formula the respective probabilities of the numbers of
a Table of mortality, constructed on a great number of observations. The manner to
form these Tables is very simple. We take out of the registers of births and of deaths a
great number of infants who we follow during the course of their life, by determining
how many there remain of them at the end of each year of their age, and we write this
number vis-à-vis dying each year. But, as in the first two or three years of life mortality
is very rapid, it is necessary, for more exactitude, to indicate in this first age the number
of the surviving at the end of each half-year. If the number p of infants were infinite,
we would have thus exact Tables which would represent the true law of mortality in
the place and at the epoch of their formation. But the number of infants that we choose
being finite, however great it be, the numbers of the Table are susceptible of errors.
Let us represent by p , q  , r , s , t , . . . these diverse numbers. The true numbers, for
    
a number p of births, are qpp , rpp , spp , tpp , . . . If we make q  = qpp + z, z will be [397]
rp
the error of q  ; similarly, if we suppose r = + z  , z  will be the error of r , and
p
thus consecutively. The preceding expression of P is therefore the probability that the
errors of q  , r , s , . . .are comprehended within the limits zero and z, zero and z  , zero
and z  , etc. The values of β, β  , . . . depend on p, q, r, . . . which are unknowns; but
the supposition of p infinite gives

p2
β2 = .
2qp (p − q)
q
We are able to substitute, without sensible error, p instead of pq , that which gives

p
β2 = .
2q  (p − q  )

21
We will have in the same manner
q
β 2 = ,
2r (q  − r )
r
β 2 =   ,
2s (r − s )
·····················

If we wish to consider only the error of one of the numbers of the Table, such as s ,
then we will integrate the expression of P , relatively to z  , z iv , . . ., from the infinite
negative values of these variables to their infinite positive values, and then we have
    
β dz β  dz  β  dz  −β 2 z2 −β 2 z − rqz 2 −β 2 z − sr z  2
P = √ √ √ c .
π π π

The integrals relative to z and z  must be taken from their negative infinite values to
their positive infinite values; we will find thus, by the process of which we have often
made use for this kind of integration, that, if we suppose

p
γ2 = ,
2s (p − s )

we will have  [398]


γ dz  −γ 2 z2
P = √ c .
π
The probability that the error of any number from the Table will be comprehended
within the limits zero and any quantity is therefore independent, either of the interme-
diate numbers, or of the subsequent numbers.
If we make γz  = t, we will have

z  2(p − s )

=t ,
s p s 

and the probability P that the ratio of the error of the number
 s from the Table to this
2(p −s )
number itself will be comprehended within the limits ±t p  s is

dt 2
P =2 √ c−t ,
π

the integral being taken from t null. We see thus that, the value of t and consequently
the probability P remaining the same, this ratio increases when s diminishes; thus
the numbers from the Table are so much less certain as they are more extended from
the first p . We see further that this ratio diminishes in measure as p increases, or in
measure as we multiply the observations; in a manner that we are able, by this multi-
plication, to diminish at the same time this ratio and to increase t, this ratio becoming
null when p is infinite, and P becoming then equal to unity.

22
§31. Let us apply the preceding analysis to the investigation of the population of a
great empire. One of the simplest and most proper ways to determine this population is
the observation of the annual births of which we are obliged to take account in order to
determine the civil state of the infants. But this way supposes that we know, very nearly,
the ratio of the population to the annual births, a ratio that we obtain by making at many
points of the empire the exact denumeration of the inhabitants, and by comparing it to
the corresponding births observed during some consecutive years; we conclude from [399]
it next, by a simple proportion, the population of all the empire. The Government
has well wished, at my prayer, to give orders to have, with precision, these data. In
thirty departments, distributed over the area of France, in a manner to compensate the
effects of the variety of climates, we have made a choice of the townships of which the
mayors, by their zeal and their intelligence, would be able to furnish the most precise
information. The exact denumeration of the inhabitants of these townships, for 22
September 1802, is totaled to 2037615 individuals. The summary of the births, of the
marriages and of the deaths, from 22 September 1799 to 22 September 1802, has given,
for these three years,
Births Marriages Deaths
110312 boys 46037 103659 males
105287 girls 99443 females.
The ratio of the births of boys to those of girls, that this summary presents, is the one
of 22 to 21, and the marriages are to the births as 3 to 14; the ratio of the population to
the annual births is 28, 352845. In supposing therefore the number of annual births in
France equal to one million, that which deviates little from the truth, we will have, by
multiplying by the preceding ratio, this last number, the population of France equal to
28352845 individuals. Let us see the error that we are able to fear in this evaluation.
For this, let us imagine an urn which contains an infinity of white and black balls
in an unknown ratio. Let us suppose next that having drawn at random a great number
p of these balls, q have been white, and that, in a second drawing, out of an unknown
number of extracted balls, there are q  white of them. In order to deduce from it this
unknown number, we suppose its ratio to q  , the same as the one of p to q, that which

gives pqq for this number. Let us seek the probability that the number of balls extracted

in the second drawing is comprehended within the limits pqq ± z. Let us name x the
unknown ratio of the number of white balls to the total number of balls in the urn. The [400]
probability of the observed event in the first drawing will be expressed by the term
which has for factor xq (1 − x)p−q in the development of the binomial [x + (1 − x)]p ,
whence it is easy to conclude, as in the preceding section, that the probability of x is

xq dx (1 − x)p−q
 ,
xq dx (1 − x)p−q

the integral of the denominator being taken from x = 0 to x = 1. Let us imagine



now that, in the second drawing, the total number of balls extracted is pqq + z; the
probability of the observed number q  of white balls will be the term of the binomial

23
pq   pq  
+z
[x+(1−x)] q , which has for factor xq (1−x) q +z−q ; this probability is therefore
  
1.2.3 . . . pqq + z  pq  
   xq (1 − x) q +z−q .
 pq
1.2.3 . . . q .1.2.3 . . . q + z − q 

By multiplying it by the preceding probability of x, by integrating the product from


x = 0 to x = 1, and by dividing it by this same product multiplied by dz and integrated
for all the positive and negative values of z, we will have the probability that the total

number of balls extracted is pqq + z . We will find thus, by the analysis of the preceding
section, this probability equal to

q3 − q3 z2

 
c 2pq (p−q)(q+q ) .
2pq (p − q)(q + q )π

By naming therefore P the probability that the number of balls extracted in the second

drawing is comprehended within the limits pqq ± z, we will have

 
q3 q3 z2
− 2pq (p−q)(q+q
P =1−2 dz c )
,
2pq  (p − q)(q + q  )π

the integral being taken from z = z to z infinity.


Now, the number p of balls extracted in the first drawing is able to represent a denu- [401]
meration, and the number q of white balls which are comprehended is able to express
the number of women who, in this denumeration, must become mothers in the year,
or the number of annual births, corresponding to the denumeration. Then q  expresses
the number of annual births observed in all the empire, and whence we conclude the

population pqq . In this case, the preceding value of P expresses the probability that

this population is comprehended within the limits pqq ± z.
We will suppose, conformably to the preceding data,
110313 + 105287
p = 2037615, q= ;
3
we will suppose next
q  = 1500000, z = 500000;
the preceding formula gives then
1
P =1− .
1162
There is odds therefore around 1161 against one that in fixing at 42529267 the popu-
lation corresponding to fifteen hundred thousand births, we will not be deceived by a
half-million.
The difference between certitude and the probability P diminishes with a very great
rapidity when z increases; it would be insensible if we suppose z = 700000.

24
§32. Let us consider now the probability of future events, deduced from observed
events, and let us suppose that having observed an event composed of any number of
simple events, we seek the probability of a future result, composed of similar events.
Let us name x the probability of each simple event, y the corresponding probability [402]
of the observed result, and z the one of the future result; the probability of x will be, as
we have seen,
y dx
 ,
y dx
the integral being taken from x = 0 to x = 1; yzydxdx
is therefore the probability of the
future result, taken from the value of x, considered as cause of the simple event. Thus,
by naming P the entire probability of the future event, we will have

yz dx
P =  ,
y dx

the integrals of the numerator and of the denominator being taken from x = 0 to x = 1.
Let us suppose, for example, that an event being arrived m times consecutively,
we demand the probability that it will arrive the following n times. In this case, x
being supposed to represent the possibility of the simple event, xm will be that of the
observed event, and xn that of the future event, that which gives

y = xm , z = xn ,

whence we deduce
m+1
P = .
m+n+1
Let us suppose the observed event, composed of a very great number of simple events;
let a be the value of x which renders y a maximum, and Y this maximum; let a be the
value of x which renders yz a maximum, and Y  and Z  that which y and z become at
this maximum. We will have, by §27 of Book I, very nearly
 3√
Y 2 2π
y dx =  ,
2
− ddxY2
 3√
(Y  Z  ) 2 2π
yz dx =  .
2 Z)
− d (Y dx 2

The observed result being composed of a very great number of simple events, let us [403]
suppose that the future event is much less composite. The equation which gives the
value of a of x, corresponding to the maximum of yz, is
dy dz
0= + ;
y dx z dx
dy
y dxis a very great quantity, of order α1 , and, since the future result is much less
composite with respect to the observed result, zdzdx will be of a lesser order, which

25
1
we will designate by α1−λ . Thus, a being the value of x which satisfies the equation
0 = y dx , the difference between a and a will be very small of order αλ , and we will
dy

be able to suppose
a = a + αλ μ.
This supposition gives
dY α2λ μ2 d2 Y
Y  = Y + αλ μ + + ···
dx 1.2 dx2
dY dn Y
But we have dx = 0, and it is easy to conclude from it that Y dxn is of an order equal or
1
αnλ μn dn Y
less than 1
n ; the term 1.2.3...n Y dxn will be consequently more than order αn(λ− 2 ) .
α2
Thus the convergence of the expression of Y  in series requires that λ surpass 12 , and
in this case Y  differs from Y only by quantities of order α2λ−1 .
If we name Z that which z becomes when we make x = a, we will be assured in
the same manner that Z  is able to be reduced to Z. Finally we will prove by a similar
2   2
reasoning that d (Y dx2
Z )
is reduced to very nearly Z ddxY2 . By substituting these values
into the expression of P , we will have
P = Z,
that is that we are able then to determine the probability of the future result, by sup-
posing x equal to the value which renders the observed result most probable. But if [404]
it is necessary for that that the future result is rather not very composite so that the
exponents of the factors of z are of an order of magnitude smaller than the square root
of the factors of y; otherwise, the preceding supposition would expose some sensible
errors.
If the future result is a function of the observed result, z will be a function of y,
which we will represent by φ(y). The value of x which renders zy a maximum is,
in this case, the same which renders y a maximum; thus we have a = a, and if we

designate dφ(y) dY
dy by φ (y), the expression of P will become, by observing that dx = 0,

φ(Y )
P = .
Y φ (y)
1+ φ(Y )

If φ(Y ) = y n , so that the future event is n times the repetition of the observed event,
we will have
Yn
P =√ .
n+1
The probability P , calculated under the supposition that the possibility of the simple
events is equal to that which renders the observed result most probable, is Y n ; we see
thus that the small errors which result from this supposition are accumulated at the rate
of the simple events which enter into the future result, and become very sensible when
these events are in great number.

§33. Since 1745, an epoch where we have commenced to distinguish at Paris upon
the registers the baptisms of boys from those of girls, we have constantly observed that

26
the number of the first has been superior to the one of the second. Let us determine the
probability that this superiority will be maintained during a given time, for example, in
the space of a century.
Let p be the observed number of baptisms of boys, q the one of girls, 2n the number
of annual baptisms, x the probability that the infant who will be born and be baptized [405]
will be a boy. By raising x + (1 − x) to the power 2n and developing this power, we
will have
2n(n − 1) 2n−2
x2n + 2nx2n−1 (1 − x) + x (1 − x)2 + · · ·
1.2
The sum of the n first terms of this development will be the probability that each year
the number of baptisms of boys will surpass the one of the baptisms of girls. Let us
name z this sum; z i will be the probability that this superiority will be maintained
during the number i of consecutive years; therefore, if we designate by P the entire
probability that this will take place, we will have, by the preceding section,
 p
x dx z i (1 − x)q
P =  p ,
x dx (1 − x)q

the integrals of the numerator and of the denominator being taken from x = 0 to x = 1.
If we name a the value of x which renders xp z i (1 − x)q a maximum and if we
d2 Z dz d2 z
designate by Z, dZ dx , dx2 that which z, dx , dx2 becomes, when we change x into a,
we will have, by §26,
 √
ap+1 (1 − a)q+1 Z i 2π
x dx z (1 − x) = 
p i q
.
2 2Z
p(1 − a)2 + qa2 + ia2 (1 − a)2 dZZ−Zd
2 dx2

z being the sum of the first n terms of the function




2n 1 − x 2n(2n − 1) (1 − x)2
x 1 + 2n + + ··· ,
x 1.2 x2

we have, by §37 of Book I,


 un−1 du
(1+u)2n+1
z= un−1 du
,
(1+u)2n+1

the integral of the numerator being taken from u = 1−xx to u = ∞, and that of the [406]
denominator being taken from u = 0 to u = ∞. Let there be u = 1−s
s ; this value of z
will become  n
s ds (1 − s)n−1
z= n ,
s ds (1 − s)n−1
the integral of the numerator being taken from s = 0 to s = x, and that of the denomi-
nator taken from s = 0 to s = 1. Thence we deduce
dz xn (1 − x)n−1
= n ,
z dx s ds (1 − s)n−1

27
the integral of the denominator being taken from x = 0 to s = x. We will have next

d2 z dz n − (2n − 1)x
2
= = .
z dx z dx x(1 − x)
2
By changing x into a in these expressions, we will have those of Z, ZdZ d Z
dx , Z dx2 .
In order to determine a, we will observe that the condition of the maximum of
xp z i (1 − x)q gives
p q dZ
0= − +i ,
a 1−a Z dx
dZ
whence we deduce, by substituting for Z dx its preceding value,

p ian+1 (1 − a)n
a= +  ,
p + q (p + q) sn ds (1 − s)n−1
the integral of the denominator being taken from x = 0 to s = a. In order to conclude
a from this equation, we will observe that the value of s which renders sn (1 − s)n−1
a maximum is very nearly 12 , and consequently less than p+qp
, which itself is smaller
than a. Thus, n being supposed a large number, we are able, without sensible error,
to extend the integral of this expression of a, from s = 0 to s = 1, the term which [407]
depends on it being very small. This gives, by §28,
 1 1√ √
nn+ 2 (n − 1)n− 2 2π π
s ds (1 − s)
n n−1
= 2n+ 1 = 2n √ .
(2n − 1) 2 2 n

The equation which determines a becomes thus, quite nearly,



p ian+1 (1 − a)n 22n n
a= + √ .
p+q (p + q) π
p
In order to resolve it, we will observe that a differs very little from p+q , so that, if we
make
p
a= + μ,
p+q
μ will be quite small, and we will have, in a very close manner,

 2 n
p−q
p 1 − p+q
√ nμ(p+q)(p−q) (p+q)2 nμ2
(1) μ=i n 2
√ c− pq − pq ;
(p + q) π
we will have next, very nearly,
 p  q
p q (p+q)3
μ2
ap (1 − a)q = c− 2pq .
p+q p+q
By substituting into the radical

dZ 2 − Zd2 Z
p(1 − a)2 + qa2 + ia2 (1 − a)2 ,
Z 2 dx2

28
p dZ (p+q)a−p (p+q)μ d2 Z
for a its value
p+q + μ, for Z dx its value ia(1−a) or ia(1−a) , and for Z dx2 its value
dZ n−(2n−1)a
Z dx a(1−a) , this radical becomes very nearly

  
pq (p + q)μ (p + q)2 2 p+q
1+ [n(p − q) − p] + μ 2n + .
p+q pq pq i

Finally we have, by §28, [408]


  p  q  √
p q pq 2π
xp dx (1 − x)q = .
p+q p+q p+q p+q

This premised, the expression of P will become very nearly


(p+q)3
μ2
Z i c− 2pq
(2) P =  .
(p+q)μ (p+q)2 μ2 p+q
1+ pq [n(p − q) − p] + pq 2n + i

The concern is therefore no longer but to determine Z. We have


 n
s ds (1 − s)n−1
Z= n ,
s ds (1 − s)n−1

the integral of the numerator being taken from s = 0 to s = a, and that of the de-
nominator being taken from s = 0 to s = 1. It is easy to conclude from it that we
have  n
s ds (1 − s)n−1
Z =1−  ,
sn ds (1 − s)n−1
the integral of the numerator being taken from s = a to s = 1 and that of the denomi-
nator being taken from s = 0 to s = 1; we will have thus, quite nearly, by §29,
 2
dt c−t
(3) Z =1− √ ,
π
the integral relative to t being taken from

2
2 2n − 1 n(p − q) p
t = − + (2n − 1)μ ,
2n(n − 1) p+q p+q

to t2 = ∞.
In order to apply some numbers to these formulas, we will observe that, by that
which precedes, in the interval from the commencement of 1745 to the end of 1784,
we have by §28, relatively to Paris,

p = 393386, q = 377555.

By dividing by 40 the sum of these two numbers, we will have 19273,5 for the mean [409]
number of annual baptisms, that which gives n = 9636, 75; we will suppose moreover

29
i = 100. By means of these values we will determine that of μ by equation (1); we will
determine next the value of Z by equation (3); finally equation (2) will give the value
of P . We will find thus
P = 0, 782.
There was therefore at the end of 1784, according to these data, odds nearly four against
one that, in the space of a century, the baptisms of boys at Paris will surpass, each year,
those of girls.

30
BOOK II
CHAPTER VII
DE L’INFLUENCE DES INÉGALITÉS INCONNUES QUI PEUVENT EXISTER
ENTRE DES CHANCES QUE L’ON SUPPOSE PARFAITMENT ÉGALES

Pierre Simon Laplace∗


Théorie Analytique des Probabilités OC 7 §34, pp. 410–415

ON THE INFLUENCE OF THE UNKNOWN INEQUALITIES WHICH ARE ABLE


TO EXIST AMONG THE CHANCES THAT ONE SUPPOSES PERFECTLY
EQUAL

Examination of the cases in which this influence is favorable or contrary. It is contrary


to the one who, in the game of heads and tails, wagers to bring forth heads an odd
number of times, in an even number of trials. Means to correct this influence. No
34.

∗ Translated by Richard J. Pulskamp, Department of Mathematics & Computer Science, Xavier Univer-

sity, Cincinnati, OH. January 6, 2014

1
§34. I have already considered this influence in §1, where we have seen that these [410]
inequalities increase the probability of the events composed of the repetition of simple
events. I will resume here this important object in the applications of the analysis of
probabilities.
There results from the section cited that if, in the game of heads and tails, there
exists an unknown difference between the possibilities to bring forth one or the other,
by naming α this difference, so that 1±α 2 is the possibility to bring forth heads, and
consequently 1∓α 2 that to bring forth tails, the one of the two signs + and − that we
must adopt being unknown, the probability to bring forth heads n times consecutively
will be
(1 + α)n + (1 − α)n
2n+1
or
 
1 n(n − 1) 2 n(n − 1)(n − 2)(n − 3) 4
(1) 1+ α + α + ··· .
2n 1.2 1.2.3.4
The game of heads and tails consists, as we know, in casting into the air a very thin
coin, which falls again necessarily on one of its two opposite faces that we name heads
and tails. We are able to diminish the value of α, by rendering these two faces the
most equal as it is possible. But it is physically impossible to obtain a perfect equality,
and then the one who wagers to bring forth heads twice consecutively or tails twice [411]
consecutively has the advantage over the one who wagers that, in two trials, heads and
2
tails will alternate, its probability being 1+α
2 .
We are able to diminish the influence of the inequality of the two faces of the coin,
by submitting them themselves to the chances of hazard. Let us designate by A this
coin, and let us imagine a second coin B similar to the first. Let us suppose that after
having projected this second coin, we project the coin A in order to form a first trial,
and let us determine the probability that in n consecutive similar trials, the coin A will
present the same faces as the coin B. If we name p the probability to bring forth heads
with the coin A and q the probability to bring forth tails; if we designate next by p0 and
q 0 the same probabilities for the coin B, pp0 + qq 0 will be the probability that in one
trial the coin A will present the same faces as the coin B. Thus (pp0 + qq 0 )n will be the
probability that that will take place constantly in n trials. Let
1+α 1−α
p= , q= ,
2 2
1 + α0 1 − α0
p0 = , q0 = ;
2 2
we will have
1
(pp0 + qq 0 )n = (1 + αα0 )n .
2n
But, as we are ignorant of what the faces are that the inequalities α and α0 favor, the
preceding probability is able to be equally either 21n (1 + αα0 )n or 21n (1 − αα0 )n , ac-
cording as α or α0 are of like sign or of contrary signs. The true value of this probability
is therefore, α and α0 being supposed positives,
 
1 0 n 1 0 n
(1 + αα ) + n (1 − αα )
2n+1 2

2
or  
1 n(n − 1) 2 02 n(n − 1)(n − 2)(n − 3) 4 04
1 + α α + α α + · · · .
2n 1.2 1.2.3.4
If we compare this formula to formula (1), we see that it approaches 21n more than it, [412]
or of the probability which would hold if the faces of the coins were perfectly equal.
Thus the inequality of these faces is thence corrected in great part; it would be it even
in totality, if α0 were null, or if the two faces of the coin B were perfectly equal.
p representing the probability of heads with the coin A, and q that of tails, the
probability to bring forth heads an odd number of times in n trials will be
1
[(p + q)n ∓ (p − q)n ] ,
2
the − sign holding if n is even, and the + sign holding if n is odd. Making p =
1+α 1−α
2 , q = 2 , the preceding function becomes

1
(1 ∓ α)n .
2
If n is odd and equal to 2i + 1, this function is
1
(1 + α2i+1 );
2
but, as we are able to suppose equally α positive or negative, it is necessary to take
the half of the sum of its two values relative to these suppositions, that which gives 12
for its true value; the inequality of the faces of the coin changes therefore not at all the
probability 12 to bring forth heads an odd number of times. But, if n is even and equal
to 2i, this probability becomes
1
(2) (1 − α2i ),
2
±α being the unknown inequality of the probability between heads and tails; there is
therefore disadvantage to wager to bring forth heads or tails an odd number of times in
2i trials, and consequently there is advantage to wager to bring forth one or the other
an even number of times.
We are able to diminish this advantage by changing the wager to bring forth heads
an odd number of times in 2i trials, into the wager to bring forth in the same number
of trials an odd number of resemblances between the faces of the two coins A and B, [413]
projected as we have said above. In fact, the probability of a resemblance at each trial
is, as we have seen, pp0 + qq 0 , and the probability of a dissemblance is pq 0 + p0 q. Let
us name P the first of these two quantities and Q the second; the probability to bring
forth an odd number of resemblances in 2i trials will be
1
(P + Q)2i − (P − Q)2i .

2
If we make, as previously,
1+α 1−α 1 + α0 1 − α0
p= , q= , p0 = , q0 = ,
2 2 2 2

3
we will have
1 + αα0 1 − αα0
P = , Q= ;
2 2
the preceding function becomes thus
1
(1 − α2i α02i ).
2
This function remains the same, whatever change that we make in the signs of α and
of α0 ; it is therefore the true probability to bring forth an odd number of resemblances;
but, α and α0 being small fractions, we see that it is nearer 21 more than formula (2);
the disadvantage of an odd number is therefore thence diminished.
We see by that which precedes that we are able to diminish the influence of the
unknown inequalities among the chances that we suppose equals, by submitting them
themselves to chance. For example, if we put into an urn the tickets 1, 2, 3, . . . , n
following this order, and if next, after having agitated the urn in order to mix well the
tickets, we draw one from it; if there is among the probabilities of tickets to exit a small
difference depending on the order according to which they have been placed in the urn,
we will diminish it considerably by putting these tickets into a second urn, according
to their order of exit from the first urn, and by agitating next this second urn, in order
to well mix the tickets. Then the order according to which we have placed the tickets
in the first urn will have extremely little influence on the extraction of the first ticket
which will exit from the second urn. We would diminish further this influence, by [414]
considering in the same manner a third urn, a fourth, etc.
Let us consider two players A and B playing together, in a manner that at each trial
the one who loses gives a token to his adversary, and that the game endures until one
of them has won all the tokens of the other. Let p and q be their respective skills, a and
b their numbers of tokens at commencement. There results from formula (H) of §10,
by making i infinity, that the probability of A to win the game is

pb (pa − q a )
.
pa+b − q a+b
If we make in this expression
1±α 1∓α
p= , q= ,
2 2
we will have, by taking the superior sign, the probability relative to the case where A
is stronger than B, and, by taking the inferior sign, we will have the probability relative
to the case where A is less strong than B. If we are ignorant of who is the strongest of
the players, the half-sum of these two probabilities will be the probability of A, that we
find thus equal to
1 a a
 b b

2 [(1 + α) − (1 − α) ] (1 + α) + (1 − α)
(3) ;
(1 + α)a+b − (1 − α)a+b
by changing a into b and reciprocally, we will have the probability of B. If we suppose
a b
α infinitely small or null, these probabilities become a+b and a+b ; they are therefore

4
proportionals to the numbers of tokens of the players; thus, for equality of the game,
their stakes must be in this ratio. But then the inequality which is able to exist between
them is favorable to the player who has the smallest number of tokens; because, if
a
we suppose a less than b, it is easy to see that expression (3) is greater than a+b . If
the players agree to double, to triple, etc. their tokens, the advantage of A increases [415]
without ceasing, and, in the case of a and b infinite, its probability becomes 12 or the
same as that of B.
P being the probability of an event composed of two simple events of which p and
1 − p are the respective probabilities, if we suppose that the value of p is susceptible
of an unknown inequality z which is able to be extended from −α to +α, by naming φ
the probability of p + z, φ being a function of z, we will have, for the true probability
of the composite event, R 0
P φ dz
R ,
φ dz
P 0 being that which P becomes when we change p into p + z, and the integrals being
taken from z = −α to z = α.
If we have no other data in order to determine z but one observed event, formed
from the same simple events, by naming Q the probability of this event, p + z and
1 − p − z being the probabilities of the simple events, the preceding expression gives,
by changing φ into Q, for the probability of the composite event,
R 0
P Q dz
R ,
Q dz

the integrals being taken here from z = −p to z = 1 − p; that which is conformed to


that which we have found in the preceding Chapter.

5
BOOK II
CHAPTER VIII
DES DURÉES MOYENNES DE LA VIE, DES MARIAGES ET DES
ASSOCIATIONS QUELCONQUE

Pierre Simon Laplace∗


Théorie Analytique des Probabilités OC 7 §§35–37, pp. 416–427

ON THE MEAN DURATION OF LIFE, OF MARRIAGES AND


OF ANY ASSOCIATIONS WHATSOEVER

Expression of the probability that the mean duration of life of a great number n of
infants will be comprehended within these limits, true mean duration of life,
more or less a very small given quantity. There results from it that this probability
increases without ceasing in measure as the number of infants increases and that,
in the case of an infinite number, this probability is confounded with certitude,
the interval of the limits becoming infinitely small or null. Expression of the
mean error that we are able to fear by taking for mean duration of life that of a
great number of infants. Rule in order to conclude from the Tables of mortality
the mean duration of that which remains to live to a person of a given age. No
35.
Expression of the mean duration of life, if one of the causes of mortality comes to
be extinguished. Particular expression in the case where we happen to destroy
a malady that we are able to contract only one time in life. The extinction of
the small pox, by means of vaccine, would increase by more than three years the
mean duration of life, if the increase of population which would result from it
would not be arrested by the deficiency of subsistances. No 36.
On the mean duration of marriages. Expression of their most probable mean duration
and of the probability that the error of this expression is comprehended within
some given limits. On the mean duration of the associations formed by any
number of individuals. No 37.

∗ Translated by Richard J. Pulskamp, Department of Mathematics & Computer Science, Xavier Univer-

sity, Cincinnati, OH. January 6, 2014

1
§ 35. Let us suppose that we have followed with respect to a very great number n [416]
of infants the law of mortality, from their birth to their total extinction; we will have
their mean life, by making a sum of the durations of all their lives and by dividing it
by the number n. If this number were infinite, we would have exactly the duration of
mean life. Let us seek the probability that the mean life of n infants will deviate from
this one only within some given
 limits.
Let us designate by φ xa the probability to die at age x, a being the limit of x, a
and x being supposed to contain an infinite number of parts taken for unity. We will
consider the power
n
1 −$√−1 2 −2$√−1
      
0
φ +φ c +φ c + ···
a a a
 .
 
 x √ a √
−x$ −1 −a$ −1
+φ c + ··· + φ c
a a

It is clear that the coefficient of c−(l+nµ)$ −1 , in the development of this power, is the
probability that the sum of the ages to√which the n infants will arrive will be l + nµ;
(l+nµ)$ −1
by multiplying therefore√ by c the preceding power, the term independent
±$ −1
of the powers of c in the product will be this probability, which, consequently, [417]
is equal to
n
1 −$√−1 √
    
0 x

−x$ −1 
1
Z √

 √ φ + φ c + · · · + φ c 
(1) d$cl$ −1 cµ$ −1 
 a a a 
,
2π a √ 
c−a$ −1 
 
 + ··· + φ
a
the integral being taken from $ = −π to $ = π. R
If we take in this integral the hyperbolic logarithm of the quantity under the sign,
raised to the power n, we will have, by developing the exponentials into series, this
logarithm equal to
√ √  x  $2 Z
Z   Z 
x 2
x
(2) nµ$ −1 + n log φ − $ −1 xφ − x φ + ··· ;
a a 2 a
R
the sign referring here to all the values of x, from x = 0 to x = a. If we make
x 0 0
a = x , and if we observe that, the variation of x being unity, we have adx = 1, we
will have Z   Z
x
φ = a dx0 φ(x0 ),
a
Z x Z
xφ =a 2
x0 dx0 φ(x0 ),
a
Z x Z
x2 φ = a3 x02 dx0 φ(x0 ),
a
··························· ,
the integrals relative to x0 being taken from x0 = 0 to x0 = 1. Let us name k, k 0 ,
k 00 , . . . these successive integrals; the probability that the duration of life of an infant

2
will be comprehended within the limits zero and a is φ xa or a dx0 φ(x0 ); now this
R  R

probability is certitude itself; we have therefore ak = 1. This premised, the function


(2) becomes
√ k0 √ k 00 a2 $2
 
nµ$ −1 + n log 1 − a$ −1 − + ···
k k 2
or
nµ nk 0 √ kk 00 − k 02 2 2
 
− a$ −1 − n a $ − ···
a k 2k 2
If we make [418]
ak 0 a2 k 0
µ= = = a2 k 0 .
k ak
the first power of $ disappears, and moreover, n being supposed a very great number,
we are able to be arrested at the second power of $; the function (1) becomes thus, by
passing again from the logarithms to the numbers,

Z
1 kk00 −k02 2 2
d$ cl$ −1−n 2k2 a $ .

If we make √ √
k2 a$ n βl −1
β2 = , t= − √ ,
2(kk − k 02 )
00 2β a n
this integral becomes, by taking it from t = −∞ to t = ∞,
β β 2 l2
√ c− a2 n .
a nπ

By multiplying it by dl, and making l = ar n, we will have
Z
2 2 2
√ βdr c−β r
π
for the probability that the sum of the ages √ to which the n infants will arrive will be
2 0
comprehended within theRlimits na k ± ar n.
The quantity a2 k 0 or xφ xa is the sum of the products of each age by the proba-


bility of arriving there; it is therefore the true duration of mean life; thus the probability
that the sum of the ages to which the n infants cease to live, divided by their number,
is comprehended within these limits
ar
True duration of mean life, more or less √
n
,

has for expression Z


2 2 2
√ βdr c−β r
.
π
The mean value of r, more or less, is, by § 20, [419]
Z
1 2 2
±√ βr dr c−β r .
π

3
the integral being taken from r = 0 to r infinity. By multiplying it by √an , we will have
the mean error to fear more or less, when we take for mean duration of life the sum of
the ages that the n infants considered above have lived, divided by n, a quotient that
we will designate by G; this error is therefore
a
± √ .
2β nπ

We have, very nearly,


a2 k 0 = G,
and, as ak = 1, we will have
k0 G
= .
k a
If we name next H the sum of the squares of the ages that the n infants have lived,
divided by n, we will find, by the analysis of § 19,

k 00 2
a = H;
k
these values give
a2
β2 = :
2(H − G2 )
the mean error to fear more or less with respect to the duration of life becomes thus

H − G2
± √ .
2nπ
It is clear that these results hold equally relatively to the mean duration of that which
remains to live, when, instead of departing from the epoch of birth, we depart from any
epoch of life.
We are able to determine easily, by means of the Tables of mortality, formed from
year to year, the mean duration of that which remains to live to a person of whom [420]
the age is an entire number of years A. For that, we will add all the numbers of the
Table which follow the one which corresponds to age A; we will divide the sum by
this last number, and we will add 12 to the quotient. In fact, if we designate by (1),
(2), (3), . . . the numbers of the Table, corresponding to the year A and to the following
years, the number of individuals who die in the first year, in departing from year A,
will be (1)−(2); but, in this short interval, the mortality is able to be supposed constant;
1
2 [(1) − (2)] is therefore the sum of the durations of their life, in departing from age
A. Similarly 32 [(2) − (3)], 52 [(3) − (4)], . . . are the sums of the durations of life, by
departing from the same age, of those who die in the second, third, etc., years counted
from year A. The reunion of all these sums is (1) 2 +(2)+(3)+(4)+ · · · ; and, by dividing
it by (1), we will have the mean duration of that which remains to live to the person
of age A. We will form thus a Table of the mean durations of that which remains to
live at the different ages. We will be able likewise to conclude these durations from

4
one another, by observing that, if F designates this duration for age A, and F 0 the
corresponding duration at age A + 1, we have
 
(2) 1 1
F = F0 + + .
(1) 2 2

§ 36. Let us determine now the mean duration of life which would hold, if one of
the causes of mortality were to be extinguished. Let U be the number of infants who,
out of the number n of births, would survive yet to the age x under this hypothesis, u
being the one of the infants living at this age out of the same number of births, in the
case where that cause of mortality subsists. Let us name z4x the probability that one
individual of age x will perish of this malady in the very short interval of time 4x;
uz4x will be, very nearly, by § 25, the number of individuals u who will perish of
this malady in the interval of time 4x, if this number is considerable. Similarly, if we
designate by φ4x the probability that an individual of age x will perish by the other
causes of mortality in the interval 4x, uφ4x will be the number of individuals who
will perish by these causes, in the interval of time 4x; this will be therefore the value [421]
of −4u; I affect 4u with the − sign, because u diminishes in measure as x increases;
we have therefore
−4u = u4x(φ + z).
We will have similarly
−4U = U φ4x.
By eliminating φ from these two equations, we will have

4U 4u
= + z4x.
U u
4x being a very small quantity, we are able to transform the characteristic 4 into the
differential characteristic d, and then the preceding equation becomes

dU du
= + z dx;
U u
whence we deduce, by integrating and observing that at age zero U = u = n,
R
z dx
(3) U = uc ,

the integral being taken from x null. We are able to obtain this integral, by means of
the registers of mortality, in which we take account of the age of the dead individuals
and of the causes of their death. In fact, uz4x being, by that which precedes, the
number of those who, arrived to age x, have perished in the interval of time R 4x, by
the malady of which there is concern, we will have very nearly the integral z dx, by
supposing 4x equal to one year, and by taking from the birth of the n infants that we
have considered, until the year x, the sum of the fractions which have for numerator the
number of individuals who the malady has made perish each year, and for denominator,
the number n of the infants who survive yet to the middle of the same year. Thus we

5
will be able to transform, by means of equation (3), a Table of ordinary mortality, into
that which would hold if the malady of which there is concern did not exist.
Smallpox has this in particular, namely, that the same individual is never twice
attacked, or at least this case is so rare, that, if it exists, we are able to set it aside. Let [422]
us imagine that, out of a very great number n of infants, u arrive to age x, and that, in
the number u, y have not had smallpox at all. Let us imagine further that out of this
number y, iy dx take this malady in the instant dx, and that, out of this number, iry dx
perish from this malady. By designating, as above, by φ the probability to perish at age
x by some other causes, we will have evidently

du = −uφ dx − iry dx.

We will have next


dy = −yφ dx − iy dx.
In fact, y diminishes by the number of those who, in the instant dx, take smallpox,
and this number is, by the supposition, iy dx; y diminishes further by the number of
individuals comprehended in y, who perish by some other causes, and this number is
yφ dx.
Now, if from the first of the two preceding equations, multiplied by y, we subtract
the second multiplied by u, and if we divide the difference by y 2 , we will have
u u
d = i dx − ir dx,
y y
that which gives, by integrating from x null, and observing that at this origin u = y =
n.
 Z  R
u R
(4) = 1 − ir dx c− i dx c i dx ;
y
this equation will make known the number of individuals of age x who have not at all
yet had smallpox. We have next
iry dx
z dx = ,
u
uz dx being, as above, the number of those who perish within the time dx, of the
malady that we consider. By substituting, instead of ny , its preceding value, we will [423]
have, after having integrated,
R
z dx 1
c = R R ;
1− ir dx c− i dx
equation (3) will give therefore
u
(5) U= R R .
1− ir dx c− i dx
This value of U supposes that we knew by observation i and r. If these numbers were
constants, it would be easy to determine them; but, as they are able to vary from age to

6
age, the elements of formula (3) are easier to know, and this formula seems to me more
proper to determine the law of mortality which would hold, if smallpox was extinct. By
applying to it the data that we have been able to procure with respect to the mortality
caused by this malady, at the diverse ages of life, we find that its extinction by means of
the vaccine would increase more than three years the duration of mean life, if besides
this duration was not at all restrained by the diminution related to the subsistances, due
to a greater increase of population.

§ 37. Let us consider presently the mean duration of marriages. For that let us
imagine a great number n of marriages among n young men of age a, and n young
women of age a0 ; and let us determine the number of these marriages subsisting after
x years elapsed from their origin. Let us name φ the probability that a young man
who is married at age a will arrive to age a + x; and ψ the probability that a young
woman who is married at age a0 will arrive to age a0 + x. The probability that their
marriage will subsist after its xth year will be φψ; therefore, if we develop the binomial
[φψ + (1 − φψ)]n , the term H(φψ)i (1 − φψ)n−i of this development will express the
probability that, out of n marriages, i will subsist after x years. The greatest term of
the development is, by § 16, the one in which i is equal to the greatest whole number
contained in (n + 1)φψ; and, by the same section, it is extremely probable that the
number of the marriages subsisting will deviate only very little more or less from this
number. Thus, by designating by i the number of subsisting marriages, we will be able [424]
to suppose, very nearly,
i = nφψ.
nφ is quite near the number of the n husbands surviving to the age a + x. The Tables
of mortality will make it known in a quite near manner, if they have been formed out
of the numerous lists of mortality; because if we designate by p0 the number of men
surviving to age a, out of the collection of these lists, and by q 0 the number of the
surviving to age a + x, we will have, to quite nearly, by § 29,
nq 0
nφ = .
p0
If we name similarly p00 the number of women surviving to age a0 and by q 00 the number
of the surviviors to age a0 + x, we will have, to very nearly,
nq 00
nψ = ;
p00
therefore
nq 0 q 00
i= .
p0 p00
We will form thus, from year to year, a Table of values of i. By making next a sum of
all the numbers of this Table, and by dividing it by n, we will have the mean duration of
the marriages made at age a for the young men and at the age a0 for the young women.
Let us seek now the probability that the error of the preceding value of i will be
comprehended within some given limits. Let us suppose, in order to simplify the cal-
culation, that the two spouses are of the same age, and that the probability of the life of

7
the men is the same as that of the women; then we have

a0 = a, q 00 = q 0 , p00 = p0 , φ = ψ;

and the preceding expression of i becomes

nq 02
i= .
p02
02
Let us imagine that the value of i is nqp02 + s; s will be the error of this expression of [425]
i. We have seen, in § 30, that if we have observed that, out of a very great number p of
individuals of age a, q are arrived to the age a + x, the probability that, out of p0 other
0
individuals of the age a, ppq + z will arrive to the age a + x, is
s
p3 p3 z 2
− 2qp0 (p−q)(p+p 0)
c .
2qp0 (p − q)(p + p0 )π

If we suppose p and q infinite, we will have evidently


q
φ= ,
p
and, if we make
p0 q
+ z = q0 ,
p
we will have
q0 z
φ= − 0,
p0 p
nz 2
that which gives very nearly, by neglecting the square p02 ,

nq 02 2nq 0 z
nφ2 = 02
− 02 ;
p p
thus the preceding probability of z is at the same time the probability of this expression
of nφ2 . Let us suppose now i = nφ2 + l; by considering the binomial [φ2 + (1 − φ2 )]n ,
the probability of this expression of i is, by § 16,
2
1 l
− 2nφ2 (1−φ 2)
p c .
2nπφ2 (1 − φ2 )

But the preceding value of i becomes, by substituting for nφ2 its value,

nq 02 2nq 0 z
i= 02
− 02 + l;
p p
the probability of this last expression of i is equal to the product of those of i and of z, [426]

8
found above; it is therefore equal to
z2 l2
− −
c 2p0 φ(1−φ) 2nφ2 (1−φ2 )
p .
2π np0 φ3 (1 − φ)2 (1 + φ)
02 0
Having supposed previously i = nq 2nq z
p02 + s, we will have s = l − p02 ; by substi-
tuting therefore for l its value deduced from this equation and observing that we have
0
to very nearly pq 0 = φ, we will have, for the probability that the value of s will be
comprehended within some given limits, the integral expression
2nq 0 z
 
s+
z2 p02
RR − 2φ(1−φ)p0 − 2nφ2 (1−φ2 )
dz ds c
p ,
2π np0 φ3 (1 − φ)2 (1 + φ)

the integral relative to z being able to be taken from z = −∞ to z = ∞. Thence, it is


easy to conclude, by the methods exposed previously, that, if we make

p0
k2 = ,
2nφ2 (1 − φ)[p0 + (p0 + 4n)φ]

the preceding integral becomes


Z
k ds −k2 s2
√ c ;
π
nq 02
thus the probability that the error of the expression i = p02 will be ±s is
Z
2 2 2
√ k ds c−k s
,
π

the integral being taken from s null.


The preceding analysis is applied equally to the mean duration of a great number
of associations formed of three individuals or of four individuals, etc. Let n be this
number, and let us suppose that all the associates are of the same age a at the moment of
association; let us designate by p the number of individuals from the Table of mortality
of the age a, and by q the number of individuals of the age a + x; the number i of the [427]
associations existing after x years elapsed from the origin of the associations will be to
quite nearly
nq r
i= r ,
p
r being the number of individual of each association. We will find by the same analysis
the probability that this number will be contained within the given limits. The sum of
the values of i corresponding to all the values of x, divided by n, will be the mean
duration of this kind of associations.

9
BOOK
CHAPTER IX
DES BÉNÉFICES DÉPENDANTS DE LA PROBABILITÉ DES ÉVÉNEMENTS
FUTURS.

Pierre Simon Laplace∗


Théorie Analytique des Probabilités OC 7 §§38–40, pp. 428–440

ON THE BENEFITS DEPENDING ON THE PROBABILITY OF FUTURE


EVENTS.

If we await any number of simple events of which the probabilities are known and
of which the arrival procures an advantage, their non-arrival causing a loss, to
determine the mathematical benefit resulting from their awaiting. Expression
of the probability that the real benefit will be comprehended within some given
limits, when the number of events awaited is very great. However little advantage
that each awaited event produces, the benefit becomes infinitely great and certain
when the number of events is supposed infinite. No 38.
If the diverse chances of an awaited event produce the advantages and the losses
of which the respective probabilities are given, to determine the mathematical
benefit resulting from the awaiting of any number of similar events. Expression
of the probability that the real benefit will be comprehended within some given
limits, when this number is very great. No 39.
On the benefits of the establishments based on the probabilities of life. Expression of
the capital that it is necessary to give in order to constitute a life pension on one
or many heads. Expression of the rent that one individual must give to an estab-
lishment in order to assure to his heirs a capital payable at his death. Expression
of the probability that the real benefit of the establishment will be comprehended
within some given limits, by supposing that a great number of individuals, in
constituting each a pension on his head, each deposits a determined sum into the
funds of the establishment, in order to defray his expenses. No 40.

∗ Translated by Richard J. Pulskamp, Department of Mathematics & Computer Science, Xavier Univer-

sity, Cincinnati, OH. January 6, 2014

1
§38. Let us imagine that the arrival of an event procures the benefit ν, and that its [428]
non-arrival causes the loss µ. A person A awaits the arrival of a number s of similar
events, all equally probable, but independent of one another; we demand what is his
advantage.
Let q be the probability of the arrival of each event and consequently 1 − q that of
its non-arrival; if we develop the binomial [q + (1 − q)]s , the term
1.2.3 . . . s
q i (1 − q)s−i
1.2.3 . . . i.1.2.3 . . . (s − i)
of this development will be the probability that out of s events i will arrive. In this
case, the benefit of A is iν, and his loss is (s − i)µ; the difference is i(ν + µ) − sµ;
by multiplying it by its probability expressed by the preceding term and taking the
sum of these products for all the values of i, we will have the advantage of A, which,
consequently, is equal to
1.2.3 . . . s
−sµ[q + (1 − q)]s + (ν + µ)S q i (1 − q)s−i ,
1.2.3 . . . i.1.2.3 . . . (s − i)
the sign S extending to all the values of i. We have
i.1.2.3 . . . s
S q i (1 − q)s−i
1.2.3 . . . i.1.2.3 . . . (s − i)
d 1.2.3 . . . s d
= S q i ti (1 − q)s−i = [qt + (1 − q)]s ,
dt 1.2.3 . . . i.1.2.3 . . . (s − i) dt
provided that we suppose t = 1, after the differentiation, that which reduces this last [429]
member to qs; the advantage of A is therefore s[qν − (1 − q)µ]. This advantage is null,
if νq = µ(1 − q), that is, if the benefit of the arrival of the event, multiplied by its prob-
ability, is equal to the loss caused by its non-arrival, multiplied by its probability. The
advantage becomes negative and is changed into disadvantage, if the second product
surpasses the first. In all cases, the advantage or the disadvantage of A is proportional
to the number s of the events.
We will determine by the analysis of §16 the probability that the real benefit of A
will be comprehended within the given limits, if s is a large number. Following this
analysis, the sum of the diverse terms of the binomial [q + (1 − q)]s comprehended
between the two terms distant by l + 1, on both sides of the greatest, is
Z
2 2 1 l2
√ dt c−t p c− 2sq(1−q) ,
π 2sπq(1 − q)
l
the integral being taken from t = 0 to t = √ . The exponent of q in the
2sq(1−q)
greatest term is very nearly, by the same section, equal to sq, and the exponents of q,
corresponding to the extreme terms comprehended within the preceding interval, are
respectively sq − l and sq + l. The benefits corresponding to these three terms are
s[qν − (1 − q)µ] − l(ν + µ),
s[qν − (1 − q)µ],
s[qν − (1 − q)µ] + l(ν + µ);

2

by making therefore l = r s, the
√ probability that the real benefit of A will not exceed
the limits s[qν − (1 − q)µ] ± r s(ν + µ) is equal to
r2
dr c− 2q(1−q)
R
2 1 r2
√ p +p c− 2q(1−q) ,
π 2q(1 − q) 2sπq(1 − q)
the integral being taken from r = 0, and the last term being able to be neglected. We
see by this formula that if qν − (1 − q)µ is not null, the real benefit increases without [430]
ceasing and becomes infinitely great and certain in the case of an infinite number of
events.
We are able to extend, by the following analysis, this result to the case where the
probability of the s events are different, in the same way as the benefits and the losses
which are attached. Let q, q (1) , q (2) , . . . , q (s−1) be the respective probabilities of
these events; ν, ν (1) , ν (2) ,. . . , ν (s−1) the benefits which their arrivals procure. We
are able, in order to simplify, to set aside the losses which their non-arrivals cause, by
comprehending in the benefit which the arrival of each event procures the quantity that
A would lose by its non-arrival, and by subtracting next from the total advantage of A
the sum of these last quantities; because it is easy to see that that changes not at all the
position of A.
This premised, let us consider the product
 √  (1) √
  (s−1) √

1 − q + qcν$ −1 1 − q (1) + q (1) cν $ −1 · · · 1 − q (s−1) + q (s−1) cν $ −1
.
0
√ that the sum of the benefits will be f + l will be equal to
It is clear that the probability
0
the coefficient of c(f +l )$ −1 in the development of this product; it is therefore equal
to
√ √ (s−1) √
Z
1 0
   
(a) d$c−(f +l )$ −1 1 − q + qcν$ −1 · · · 1 − q (s−1) + q (s−1) cν $ −1
.

the integral being taken from $ = −π to $ = π, and the numbers ν, ν (1) , . . . being
supposed, as we are able to make it, whole numbers. We take the logarithm of the
product
√  √   (s−1) √

(b) c−f $ −1 1 − q + qcν$ −1 · · · 1 − q (s−1) + q (s−1) cν $ −1
;

by developing it according to the powers of $, it becomes


√ $2 (i)
(Sq (i) ν (i) − f )$ −1 − Sq (1 − q (i) )ν (i)2 − · · ·
2
the sign S corresponding to all the values of i, from i = 0 to i = s − 1. The supposition
of f equal to Sq (i) ν (i) makes the first power of $ disappear, and the size of s, a very
great number, renders insensible the terms depending on the powers of $, superior [431]
to the square. By passing again therefore from the logarithms to the numbers in the
preceding development, the product (b) becomes very nearly
$2
Sq (i) (1−q (i) )ν (i)2
c− 2 ,

3
that which changes the integral (a) into this one
0 √
Z
1 $2 (i) (i) (i)2
d$ c−l $ −1− 2 Sq (1−q )ν .

The integral must be taken from $ = −π to $ = π, and Sq (i) (1 − q (i) )ν (i)2 being
a great number of order s, it is clear that this integral is able to be extended without
sensible error to the positive and negative infinite values of $. By making therefore
r √
Sq (i) (1 − q (i) )ν (i)2 l0 −1
$ + p =t
2 2Sq (i) (1 − q (i) )ν (i)2

and integrating from t = −∞ to t = ∞, the integral (a) becomes


l02
1 −
2Sq (i) (1−q (i) )ν (i)2
p c .
2πSq (i) (1 − q (i) )ν (i)2

If we multiply this quantity by 2dl0 , and if next we integrate it from l0 = 0, this integral
will be the expression of the probability that the benefit of A will be comprehended
within the limits f ± l0 , or Sq (i) ν (i) ± l0 ; by making thus
q
l0 = 2Sq (i) (1 − q (i) )ν (i)2 ,

the probability that the benefit of A will be comprehended within the limits
q
Sq (i) ν (i) ± r 2Sq (i) (1 − q (i) )ν (i)2

is Z
2 2
√ dr c−r ,
π
the integral being taken from r = 0.
Now it is necessary, by that which precedes, to change within the preceding limits [432]
ν (i) into ν (i) + µ(i) and to subtract from it Sµ(i) ; the probability that the real benefit of
A will be comprehended within the limits
q
S[q (i) ν (i) − (1 − q (i) )µ(i) ] ± r 2Sq (i) (1 − q (i) )(ν (i) + µ(i) )2

is therefore Z
2 2
√ dr c−r .
π
We see by this formula that, as little as the mathematical expectation of each event
surpasses zero, by multiplying the events to infinity, the first term
√ of the expression of
the limits being of order s, while the second is only of order s, the real benefit is
increased without ceasing and becomes at the same time infinitely great and certain, in
the case of an infinite number of events.

4
§39. Let us consider now the case where, at each event, the person A has any
number of chances to hope or to fear. Let us suppose, for example, that an urn contains
some balls of diverse colors, that we draw a ball from this urn, by replacing it into the
urn after the drawing, and that the benefit of A is ν if the extracted ball is of the first
color, that it is ν 0 if the extracted ball is of the second color, that it is ν 00 if the extracted
ball is of the third color, and thus consecutively, the benefits becoming negative when
A is forced to give instead of receive. Let us name a, a0 , a00 , . . . the probabilities that
the ball extracted at each drawing will be of the first, or of the second, or of the third,
etc. color, and let us suppose that we have thus s drawings; we will have first
a + a0 + a00 + · · · = 1.
By multiplying
√ next the √terms of the first member of this equation, respectively by
0 √ 00 √
cν$ −1 , cν $ −1 , cν $ −1 , . . ., the term independent of the powers of c$ −1 , in
the development of the function
√  √ 0 √ 00 √
s
c−(l+sµ)$ −1 acν$ −1 + a0 cν $ −1 + a00 cν $ −1 + · · · ,

will be, by that which precedes, the probability that, in s drawings, the benefit of A [433]
will be sµ + l; this probability is therefore equal to
√ √ √ 0 √
Z is
1 h 
(c) d$ c−l$ −1 c−µ$ −1 acν$ −1 + a0 cν $ −1 + · · · ,

the integral relative to $ being taken from $ = −π to $ = π. If we develop with
R powers of $ the hyperbolic logarithm of the quantity raised to the power
respect to the
s under the sign, and if we observe that a + a0 + a00 + · · · = 1, we will have, for this
logarithm,

$ −1(aν + a0 ν 0 + a00 ν 00 + · · · − µ)
$2
− [aν 2 + a0 ν 02 + a00 ν 002 + · · · − (aν + a0 ν 0 + a00 ν 00 + · · · )2 − · · ·
2
We will make the first power of $ disappear, by making
µ = aν + a0 ν 0 + a00 ν 00 + · · · ;
if we suppose next
2k 2 = aν 2 + a0 ν 02 + a00 ν 002 + · · · − (aν + a0 ν 0 + a00 ν 00 + · · · )2 ,
and if we observe that, s being supposed a great number, we are able to neglect the
powers of $ superior to the square, we will have, by passing again from the logarithms
to the numbers,
h √  √ 0 √ 00 √
is 2 2
c−µ$ −1 acν$ −1 + a0 cν $ −1 + a00 cν $ −1 + · · · = c−sk $ ,

that which changes the integral (c) into this one



Z
1 2 2
d$ c−l$ −1−sk $ ,

5
which becomes, by integrating as in the preceding section,
1 l2
√ c− 4sk2 .
2k sπ
By multiplying it by 2dl and integrating the product from l = 0, we will have the
probability that the real benefit of A will be comprehended within the limits [434]

s(aν + a0 ν 0 + a00 ν 00 + · · · ) ± l;

by making therefore √
l = 2kr0 s,
this probability will be, by taking the integral from r0 = 0,
Z
2 02
√ dr0 c−r .
π
We have supposed in that which precedes the probabilities of the events known; let
us examine the case where they are unknown. Let us suppose that, out of m similar
events awaited, n arrived, and that A awaits s similar events, of which each procures
to him by its arrival the benefit ν, the non-arrival causing to him the loss µ. If we
n
represent by m s + z the number of events which will arrive out of the s events awaited,
the probability that z will be contained within the limits ±kt will by §30
Z
2 2
√ dt c−t ,
π

the integral being taken from t = 0, k 2 being equal to


2ns(m − n)(m + s)
.
m3
n
But, ms + z being the number of arrived events, the real benefit of A is
 
nν (m − n)µ
− s + z(ν + µ);
m m
the preceding integral is therefore the probability that the real benefit of A will be
comprehended within the limits
 
nν (m − n)µ
− s ± kt(ν + µ).
m m

k is of order s, if m and n are of order equal or greater than s; thus, howsoever small [435]
that the mathematical expectation be relative to each event, the real benefit becomes
at infinity, certain and infinitely great, when the number of past events is supposed
infinite, as the one of future events.

§40. We will now determine the benefits of the establishments founded on the
probabilities of human life. The most simple way to calculate these benefits is to reduce

6
them to actual capitals. Let us take for example life pensions. A person of age A wishes
to constitute on his head a life pension h; we demand the capital that he must give for
it to the funds of the establishment which makes this pension for him.
If we name y0 the number of individuals of age A in the Table of mortality of which
we make use, and yx the number of individuals of age A + x, the probability to pay the
pension at the end of the year A + x will be yyx0 ; consequently, the value of the payment
will by hy
y0 . But, if we designate by r the annual interest on unity, so that the capital 1
x

becomes 1 + r after one year, it will become (1 + r)x after x years; thus, the payment
(1 + r)x made at the end of the xth year, reduced to actual capital, becomes unity, or
this same payment divided by (1 + r)x ; the payment hy x
y0 reduced to actual capital is
hyx
therefore y0 (1+r) x . The sum of all the payments made during the duration of life of

the person who constitutes the pension and multiplied by their probability is equivalent
therefore to an actual capital represented by the finite integral
X hyx
,
y0 (1 + r)x
P
the characteristic must embrace all the values of the function that it affects.
We are able to determine this integral by forming all these values according to the
Table of mortality, and by adding them together; we will deduce next the capitals from
one another, by observing that, if we name F the capital relative to the age A and F 0 [436]
the capital relative to age A + 1, we have
y1 F 0 + h
F = .
y0 1 + r
But this process is simplified when the law of mortality is known, and especially when
it is given by a rational and entire function of x, that which is always possible, by
considering the numbers of the Table of mortality as some ordinates of which the cor-
responding ages are the abscissas, and by making a parabolic curve pass through the
extremities of the two extreme ordinates and of many intermediate ordinates. The dif-
ferences which exist among the diverse Tables of mortality permit regarding this mean
as exact also as these Tables, and even to be satisfied with a small number of ordinates.
Let us make
1 yx
= p, = u;
1+r y0
let us resume formula (16) of §11 of Book I which gives
X px
px u = du + f,
pc dx − 1
f being an arbitrary constant. It is necessary, in the development of the first term of the
second member of this equation with respect to the ratio to the powers of du dx , to change
i di u
any power dudx into dxi , and to multiply by u the first term, which is independent of
du
dx . We have thus

X px u px+1 du (p + 1)px+1 d2 u
px u = f − − dx
− + ···
1 − p (1 − p)2 1.2.(1 − p)3 dx2

7
P x
In order to determine f , we will observe that the integral p u is null when x = 1,
and that it is terminated when x = n + 1, A + n being the limit of life; because then it
embraces the terms corresponding  the numbers 1, 2, 3, . . . , n. Let us designate
 to0 all
0
therefore by (u), dx , . . . , u , dx , . . . the values of u, du
du du

dx , . . . , corresponding to [437]
x = 1 and to x = n + 1; we will have
 p n 0

 1 − p [(u) − p (u )]






 

2 0
     

 p du n du 

X hpx yx 
 + 2
− p 

=h (1 − p) dx dx
(o)
y0 2
 2   2 0   .
(p + 1)p d u d u

 
 + 1.2.(1 − p)3 − pn

 

2 2
 

 dx dx 


 
+····································
 

If u or yyx0 is constant and equal to unity, from x = 1 to x = n, then the life pension
must be paid certainly during the number n of years, and it becomes an annuity. In this
hp(1−pn )
case, du
dx is null, and the preceding formula gives 1−p for the capital equivalent to
the annuity h.
If u = 1 − nx , then the probability of life decreases in arithmetic progression, and
the preceding formula gives
1 − pn
 
hp
1−
1−p n(1 − p)
for the capital equivalent to the life annuity h, and thus consecutively.
Let us suppose now that we wish to constitute a life pension h on many individuals
of ages A, A + a, A + a + a0 , . . . , so that the pension remains to the survivors. Let
us designate by yx , yx+a , yx+a+a0 , . . . the numbers of the Table of mortality, corre-
sponding to the ages A, A + a, A + a + a0 , . . . , the probability that the first individual
of living to the age A + x being yyx0 , the probability that at this age he will have ceased
to live is 1 − yyx0 . Similarly, the probability that the second individual of living to age
A + a + x or to the end of the xth year of the constitution of the pension being yx+a ya ,
the probability that he will have ceased to live then is 1 − yx+a ya ; the probability that
the third individual will have ceased to live, at the same epoch of the constitution of
y 0
the pension, is 1 − x+a+aya+a0 , and thus consecutively. The probability that none of these [438]
individuals will exist at this epoch is
   
yx yx+a yx+a+a0
1− 1− 1− ···
y0 ya ya+a0
By subtracting this product from unity, the difference will be the probability that one
of these individuals at least will be living at the endPof the xth year of the constitution
of the pension. Let us name u this probability; hpx u will be the actual capital
equivalent to the life annuity h. But we must observe, by taking this integral, that the
quantities yx , yx+a , . . . are nulls, when their indices x, x + a, . . . surpass the number
n, A + n being the limit of the life.

8
x x
If yx is a rational and entire function of x, and of exponentials P xsuch as q , r , . . .
we will have easily, by the formulas of Book I, the integral h p u; but we are able
in all cases to form, by means of a Table of mortality, all the terms of this integral,
by taking the sum and constructing thus some Tables of life annuities on one or many
heads.
The preceding analysis similarly serves to determine the life annuity that one must
make at an establishment in order to assure to his heirs a capital after his death. The
capital equivalent
x
to the life annuity h, made on a person of age A, is, by that which
precedes, hS p yy0 x , the sign S comprehending all the terms inclusively, from x = 1 to
the limit of the life of the person. Let us name hq this integral, and let us imagine that
the establishment receives from this person the rent h, and gives to him in exchange
the capital hq. Let us imagine next that the same person places this capital at perpetual
interest with the establishment itself, the annual interest of unity being r or 1−p p . It
is clear that the establishment must render the capital hq to the heirs of the person.
But the person has made during his life the rent h to the establishment, and the person
has received from it the pension hq(1−p) p ; the rent that the person has made really is
h i
q(1−p)
therefore h 1 − p ; it is this is therefore what the person must give annually to [439]
the establishment in order to assure to his heirs the capital hq.
I will not insist further on these objects, in the same way with respect to those which
are relative to the establishments of assurance of each kind, because they present no
difficulties. I will observe only that all these establishments must, in order to prosper,
reserve themselves a benefit and multiply considerably their affairs, so that, their real
benefit becoming near certain, they are exposed the least as it is possible to some great
losses which would be able to destroy them. In fact, if the number of the affairs is s and
if the advantage of the establishment in each of them is b, then it becomes extremely
probable that the real benefit of the establishment will be sb, s being supposed a very
great number.
In order to show it, let us suppose that s persons of age A constitute, each on his
head, a life annuity h, and let us consider one of these persons who we will designate
by C. If C dies in the interval at the end of the year x elapsed from the constitution of
his pension to the end of the year x + 1, the establishment will have paid to him the
pension h during x years, and the Psumx+1 of these payments, reduced to actual capital,
will be h(p + p2 + · · · + px ) or hp ; now the probability that C will die within
this interval is yx −y
y0
x+1
or − 4yx
y0 ; the value of the loss that the establishment must then
4yx P x+1
support is therefore − y0 hp . The sum of all these losses is

X  4yx X 
(r) − hpx+1 ;
y0

this is the capital that C must deposit into the funds of the establishment in order to
receive from it the life pension h. We are able to observe here that we have
X X X
−4yx px+1 = −yx+1 px+1 + yx px + yx px ;

9
by integrating the second member of this equation, the function (r) is reduced to

hyx px
P
yx X x
− hp + const.;
y0 y0
P x
now p is reduced to zero, when x = 1, and when x = n + 1 yx is null by that [440]
which precedes; the function (r) or the capital that C must pay to the establishment is
P x
therefore hy y0
xp
, that which is conformed to that which precedes. But under the form
of the function (r) we are able to apply to the benefit of the establishment the analysis
of §39. In fact, we have in this case, by the section cited,
X  4yx X 
0 0 00 00 x+1
aν + a ν + a ν + · · · = − hp ;
y0

next a, a0 , . . . being the successive values of − 4y x


y0 , we will have

X  4yx X 2 
0 02 x+1
aν + a ν + · · · = − hp ,
y0

so that
X  4yx X 2  X 4y X 2
2 x+1 x x+1
2k = − hp − hp .
y0 y0

In supposing that each of s persons who constitute the pension h on his head deposit
to the funds of the establishment, beyond the capital corresponding to this pension, a
sum b, in order to defray the expense of the establishment, we will have, by §39,
Z
2 02
√ dr0 c−r ,
π

for the probability that the real benefit of the establishment will be comprehended
within the limits √
sb ± 2kr0 s.
Thus, in the case of an infinite number of affairs, the real benefit of the establishment
becomes certain and infinite. But then those who treat with it have a mathematical dis-
advantage which must be compensated by a moral advantage, of which the estimation
will be the object of the following Chapter.

10
BOOK II
CHAPTER X
DE L’ESPÉRANCE MORALE.

Pierre Simon Laplace∗


Théorie Analytique des Probabilités OC 7 §§41–43, pp. 441–454

ON MORAL EXPECTATION

Expression of moral fortune, in departing from this principle that the moral good pro-
cured to an individual, by an infinitely small sum, is proportional to this sum di-
vided by the physical fortune of that individual. Expression of the moral fortune
resulting from the expectation of any number of events which procure benefits
of which the respective probabilities are known. Expression of the physical for-
tune corresponding to this moral fortune. The increase of this physical fortune,
resulting from the awaited events, is that which I name moral advantage relative
to these events. Consequences which result from these expressions. The game
mathematically most equal is always disadvantageous. It is worth more to ex-
pose his fortune by parts to some dangers independent of one another, than to
expose it all entire to the same danger. By thus dividing his fortune, the moral
advantage approaches without ceasing to the mathematical advantage and ends
by coinciding with it, when the division is supposed infinite. The moral advan-
tage is able to be increased by means of the funds of assurance, at the same time
as these funds produce to the assurers a certain benefit. No 41.
Explication, by means of the previous theory, of a paradox that the Calculus of Prob-
abilities presents. No 42.
Comparison of the moral advantage of the placement of one same capital on one head
with the one of the placement on two heads. One is able at the same time, by
some similar placements, to increase his proper advantage and to assure in the
future the lot of the persons who interest us. No 43.

∗ Translated by Richard J. Pulskamp, Department of Mathematics & Computer Science, Xavier Univer-

sity, Cincinnati, OH. January 6, 2014

1
§41. We have seen, in §2, the difference which exists between mathematical expec- [441]
tation and moral expectation. Mathematical expectation resulting from the probable
awaiting of one or many goods being the product of these goods by the probability to
obtain them, it is able to be evaluated by the analysis exposed in that which precedes.
Moral expectation is ruled on a thousand circumstances which it is nearly impossible to
evaluate well. But we have given in the section cited a principle, which, being applied
to the most common cases, leads to some often useful results, and of which we are
going to develop the principals.
According to this principle, x being the physical fortune of an individual, the in-
crease dx that he receives produces in the individual a moral good reciprocal to this
fortune; the increase of his moral fortune is able therefore to be expressed by k xdx , k
being a constant. Thus, by designating by y the moral fortune corresponding to the
physical fortune x, we will have

y = k log x + log h,

h being an arbitrary constant that we will determine by means of a value of y corre-


sponding to a given value of x. With respect to that, we will observe that we are never
able to suppose x and y nulls or negatives in the natural order of things; because a man
who possesses nothing regards his existence as a moral good which is able to be com-
pared to the advantage that a physical fortune of which it is quite difficult to assign the [442]
value would procure to him, but that we are not able to fix below that which it would be
for him rigorously necessary in order to exist; because we imagine that he would not
agree at all to receive a moderate sum, such as one hundred francs, with the condition
to claim nothing, when he would have spent it.
Let us suppose now that the physical fortune of an individual is a, and that the
expectation of one of the increases α, β, γ, . . . occurs to him, these quantities being
able to be nulls or even negatives, that which changes the increases to diminutions. Let
us represent by p, q, r, . . . the respective probabilities of these increases, the sum of
these probabilities being supposed equal to unity. The corresponding moral fortunes of
the individual will be able to be

k log(a + α) + log h, k log(a + β) + log h, k log(a + γ) + log h, ...

By multiplying these fortunes respectively by their probabilities p, q, r, . . . the sum of


their products will be the moral fortune of the individual by virtue of his expectation;
by naming therefore Y this fortune, we will have

Y = kp log(a + α) + kq log(a + β) + kr log(a + γ) + · · · + log h.

Let X be the physical fortune which corresponds to this moral fortune, we will have

Y = k log X + log h.

The comparison of these two values of Y gives

X = (a + α)p (a + β)q (a + γ)r · · ·

2
If we subtract the original fortune a from this value from X, the difference will be
the increase of the physical fortune which would procure to the individual the same
moral advantage which results for him from his expectation. This difference is there-
fore the expression of this advantage, instead that the mathematical advantage has for
expression
pα + qβ + rγ + · · ·
Thence result many important consequences. One of them is that the mathematically [443]
most equal game is always disadvantageous. In fact, if we designate by a the physical
fortune of the player before commencing the game; by p his probability to win, and by
µ his stake, that of his adversary must be, for equality of the game, (1−p)µ
p ; thus the
player winning the game, his physical fortune becomes a + 1−p p µ, and the probability
of that is p. If he loses the game, his physical fortune becomes a−µ, and the probability
of that is 1−p; by naming therefore X his physical fortune, by virtue of his expectation,
we will have, by that which precedes,
 p
1−p
X = a+ µ (a − µ)1−p ;
p
now this quantity is smaller than a, that is that we have
 p
1−p µ  µ 1−p
1+ 1− <1
p a a
or, by taking the hyperbolic logarithms,
 
1−p µ  µ
p log 1 + + (1 − p) log 1 − < 0.
p a a
The first member of this equation is able to be put under the form
Z !
dµ 1 1
(1 − p) µ − 1− µ ,
a 1 + 1−p p a a

a quantity which is evidently negative.


There results further from the preceding analysis that it is worth more to expose
his fortune by parts to some dangers independent from one another, than to expose all
entire to the same danger. In order to show it, let us suppose that one merchant, having
to make come by sea a sum , exposes it on a single vessel, and that observation has
made known the probability p to the arrival of a vessel of the same kind in the port;
the mathematical advantage of the merchant, resulting from his expectation, will be p. [444]
But, if we represent by unity his physical fortune, independently of his expectation, his
moral fortune will be, by that which precedes,

kp log(1 + ) + log h,

and his moral advantage will be, by virtue of his expectation,

(1 + )p − 1,

3
a quantity smaller than p: because we have
(1 + )p < 1 + p,
since log(1 + )p or p log(1 + ) is less thanR log(1 + p), R pthat which is evident, when
p d d
we put these two logarithms under the form 1+ and 1+p .
Let us suppose now that the merchant exposes the sum , by equal parts, on r ves-
sels. His physical fortune will become 1+, if all the vessels arrive, and the probability
of this event is pr . If r − 1 vessels arrive, the physical fortune of the merchant becomes
1 + (r−1)
r , and the probability of this event is rpr−1 (1 − p). If r − 2 vessels arrive,
the physical fortune of the merchant becomes 1 + (r−2) r , and the probability of this
r(r−1) r−2 2
event is 2 p (1 − p) , and thus consecutively; the moral fortune of the merchant
is therefore, by that which precedes,
 
r−1
 
r r−1
 p log(1 + ) + rp (1 − p) log 1 +
  

 r 
k   + log h,
 r(r − 1) r−2 2 r−2 
 +
 p (1 − p) log 1 +  + ··· 

2 r
an expression that we are able to put under this form
" #
pr−1 (r − 1)pr−2 (1 − p) (r − 1)(r − 2)pr−3 (1 − p)2
Z
(a) kp d + + + · · · +log h.
1 + r−1 1.2. 1 + r−2

1+ r  r 

If we subtract from this expression that of the moral fortune of the merchant when [445]
he exposes the sum  on a single vessel, and if we obtain by makingR d r = 1 in the
preceding, that which, setting aside log h, reduces that here to kp 1+ which is equal
to
 r−1
(r − 1)pr−2 (1 − p) (r − 1)(r − 2)pr−3 (1 − p)2
Z 
p
kp d + + + · · · +log h,
1+ 1+ 1.2. (1 + )
the difference will be
" #
r−1 pr−2 (r − 2)pr−3 (1 − p)
Z
d
kp(1 − p) +  + ··· ;
r 1 +  1 + r−1
r  1.2. 1 + r−2
r 

this difference being positive, we see that there is morally the advantage to partition
the sum  on several vessels. This advantage is increased in measure as we increase the
number r of vessels, and, if this number is very great, the moral advantage becomes
nearly equal to the mathematical advantage.
In order to see this, we take formula (a) and give to it this form
ZZ
r−1
dx d c−(1+ r )x pc− r + 1 − p
 x
(a0 ) kp + log h,

the integral relative to x being


RR taken from x null to x infinity. In this interval, the
coefficient of dx under the signs has neither maximum nor minimum; because its
differential taken with respect to x is
r−2 h  e i
−c(1+ r )x dx pc− r + 1 − p
 x x
p(1 + )c− r + (1 − p) 1 + ;
r

4
this differential is constantly negative from x = 0 to x infinity: thus the coefficient
itself diminishes constantly in this interval. This is therefore here the case to make use
R (A) of §22 of Book I, in order to have, by a convergent approximation, the
of formula
integral y dx, y being equal to
r−1
c(1+ r )x pc− r + 1 − p
 x
.

The quantity that we have named ν in the section cited becomes then [446]
− x
y dx pc +1−p r
ν=− = x ,
dy p(1 + )c− r + (1 − p) 1 + re

that which gives


1
U= ,
1 + p + (1 − p) re
p(1 − p)2 1 − 1r

dU
=  2 ,
dx r 1 + p + (1 − p) re
····················· ,
dU dν
U, . . . being that which ν,
dx , dx , . . . become when x is null. This premised, formula
(A) cited will give
Z
r−1
dx c−(1+ r )x pc− r + 1 − p
 x

( )
p(1 − p)2 (1 − 1r

1
= 1+  2 + · · · .
1 + p + (1 − p) re r 1 + p + (1 − p) e
r

0
Formula (a ) becomes thus, very nearly, when r is a great number,
Z
p d
k + log h
1 + p
or
k log(1 + p) + log h.
Now let X be the physical fortune corresponding to this moral fortune; we have,
by that which precedes,
k log X + log h,
for the moral fortune corresponding to X; by comparing therefore these two expres-
sions, we will have
X = 1 + p.
In this case, the moral advantage is p; it is therefore equal to the mathematical advan- [447]
tage.
Often the moral advantage of the individuals is increased by the mean of the funds
of assurance, at the same time as these funds produce to the assurers a certain benefit.
Let us suppose, for example, that a merchant has a part  of his fortune on a vessel of

5
which the probability of the arrival is p, and that he assures this part, by given a sum
to the assurance company. For perfect equality between the mathematical lots of the
company and of the merchant, the latter must give (1 − p) for price of assurance. By
representing by unity the fortune of the merchant, independently of his expectation ,
his moral fortune will be, by that which precedes,

kp log(1 + ) + log h,

in the case where one does not assure, and in the case where he assures, it will be

k log(1 + ) + log h;

now we have
log(1 + p) > p log(1 + )
or, that which reverts to the same,
Z Z
p d p d
>k ,
1 + p 1+
p being less than unity; the moral fortune of the merchant is therefore increased, by
means of his assurance. He is able thus to make to the assurance company a sacrifice
proper to defray the expense of the establishment and to the benefit that it must make. If
we name α this sacrifice, that is if we suppose that the merchant gives to the company,
for the price of his assurance, the sum (1−p)+α, we will have, in the case of equality
of the moral fortunes, when the merchant assures, and when he does not assure at all,

log(1 − α + p) = p log(1 + ),

that which gives


α = 1 + p − (1 + )p .
This is all that which the merchant is able to give to the company, without moral disad- [448]
vantage; he will have therefore a moral advantage, by making a sacrifice less than this
value of α, and at the same time, the company will have a benefit which, as we have
seen it, becomes certain, when its relations are very numerous. We see thence how
some establishments of this kind, well designed and sagely administered, are able to be
assured a real benefit, by procuring advantages to the persons who negotiate with them.
This is in general the end of all the exchanges; but here, by a particular combination,
the exchange holds between two objects of like nature, of which one is only probable,
while the other is certain.

§42. The principle of which we just made use in order to calculate the moral ex-
pectation has been proposed by Daniel Bernoulli, in order to explicate the difference
between the result of the Calculus of Probabilities and the indication of common sense
in the following problem. Two players A and B play at heads and tails, with the condi-
tion that A pays to B two francs, if heads arrives at the first trial; four francs, if it arrives
at the second trial; eight francs if it arrives at the third trial, and thus consecutively to
the nth trial. We demand that which B must give to A in commencing the game.

6
It is clear that the advantage of B, relative to the first trial, is one franc; because he
has 12 of probability to win two francs at this trial. His advantage relative to the second
trial is similarly one franc; because he has 14 of probability to win four francs at this
trial, and thus consecutively, so that the sum of all his advantages relative to the n trials
is n francs. He must therefore, for the mathematical equality of the game, give to A
this sum which becomes infinite, if we suppose that the game continues to infinity.
However a person, in this game, will not risk with prudence an even rather moderate
sum, such as one hundred francs. If we reflect in the least at this kind of contradiction
between the calculus, and that which common sense indicates, we see easily that it
depends on this that, if we suppose, for example, n = 50, that which gives 250 for the
sum that B is able to hope at the fiftieth trial, this immense sum produces to B not at
all a moral advantage proportional to its magnitude, in a manner that there is for him [449]
a moral disadvantage to expose a franc in order to obtain it, with the excessively small
probability 2150 to succeed. But the moral advantage that an expected sum is able to
procure depends on an infinity of circumstances proper to each individual and that it
is impossible to evaluate. The only general consideration that we are able to employ
in this regard is that, the more one is rich, the less the very small sum is able to be
advantageous, all things equal besides. Thus the most natural supposition that we are
able to make is that of a moral advantage reciprocal to the wealth of the interested per-
son. It is to that which the principle of Daniel Bernoulli is reduced, a principle which,
as we have just seen, makes the results of the calculus coincide with the indications
of common sense, and which gives the means to estimate with some exactitude these
always vague indications. His application to the problem of which we have just spoke
will furnish us a new example of it.
Let us name a the fortune of B before the game, and x that which he gives to
player A. His fortune becomes a − x + 2, if heads arrives at the first trial; it becomes
a − x + 22 , if heads arrives at the second trial, and thus consecutively to trial n, where
it becomes a − x + 2n , if heads arrives only at the nth trial. The fortune of B becomes
a − x, if heads arrives not at all in the n trials, after which the game is supposed to
end; but the probability of this last event is 21n . By multiplying the logarithms of these
diverse fortunes by their respective probabilities and by k, we will have, by that which
precedes, the moral fortune of B, by virtue of the conditions of the game, equal to
1 1
k log(a − x + 2) + 2 k log(a − x + 22 ) + · · ·
2 2
1 1
+ n k log(a − x + 2n ) + n k log(a − x) + log h.
2 2
But, before the game, his moral fortune was k log a+log h; by equating therefore these
two fortunes, provided that B always conserves the same moral fortune, and passing [450]
again from the logarithms to the numbers, we will have, a − x being supposed equal to
a0 , and making a10 = α,
1 1 1
(o) 1 + αx = (1 + 2α) 2 (1 + 22 α) 22 · · · (1 + 2n α) 2n ;
1 1
the factors (1+2α) 2 , (1+22 α) 22 diminishing without ceasing, and their limit is unity;
because we have 1 1
(1 + 2i α) 2i > (1 + 2i+1 α) 2i+1 .

7
In fact, if we raise to the power 2i+1 the two members of this inequality, it becomes

1 + 2i+1 α + 22i α2 > 1 + 2i+1 α,

and under this form the equality becomes evident. Moreover, the logarithm of (1 +
1
2i α) 2i is equal to i log 2
+ 21i log α + 21i , and it is clear that this function is null in

2i
1
the case of i infinite, that which requires that in this case (1 + 2i α) 2i be unity.
If we suppose n infinite in equation (o), we have the case where the game is able
to be prolonged to infinity, that which is the most advantageous case to B. a0 and
consequently α being supposed known, we will take the sum of the tabular logarithms
of a rather great number i − 1 of the first factors of the second member, in order that
2i α is at least equal to ten. The sum of the tabular logarithms of the following factors,
to infinity, will be, to very nearly, equal to

log α (i + 1) log 2 0, 4342945


+ + .
2i−1 2i−1 3α2i−2
The addition of these two sums will give the tabular logarithm of a0 +x or of a. Thus we
will have for a physical fortune a, supposed B has before the game, the value of x which
he must give to A at the beginning of the game, in order to conserve the same moral
fortune. By supposing, for example, a0 equal to one hundred, we find a = 107fr , 89,
whence it follows that, the physical fortune of B being originally 107fr , 89, he must [451]
then risk prudently in this game only 7fr , 89, instead of the infinite sum that the result
of the calculus indicates, when we set aside all moral considerations. Having thus the
value of a relative to a0 = 100, it is easy to conclude from it in the following manner
its value relative to a0 = 200; in fact we have, in this last case,
1 1 1 1 1
a = (100 + 2) 2 (100 + 22 ) 22 · · · = 2(100 + 1) 2 (100 + 2) 4 (100 + 4) 8 · · ·

But we have just found


1 1 1
(100 + 2) 4 (100 + 4) 8 · · · = (107, 89) 2 ;

therefore p
a = 2 101.107, 89 = 208, 78.
Thus the physical fortune of B being originally 208, 78, he is not able to risk prudently
in this game beyond 8fr , 78.

§43. We will now extend the principle exposed above to the things of which the
existence is distant and uncertain. For this, let us consider two persons A and B, who
wish to each invest, in a life annuity, a capital q. They are able to make it separately;
they are able to partner and to constitute a life annuity on their heads, in a manner that
the pension is reversible to the one who survives the other. Let us examine what is the
most advantageous part.
Let us suppose the two persons of the same age and having the same annual fortune
that we will represent by unity, independently of the capital that they wish to place.
Let β be the life pension that this capital would produce to each of them, if they placed

8
their capitals separately, so that their annual fortune becomes 1 + β. We will express,
conformably to the principle of which there is concern, their corresponding annual
moral fortune by k log(1 + β) + log h. But this fortune will take place only probably
in the xth year; thus, by designating by yx the probability that A will survive to the
end of the xth year, we must multiply his annual moral fortune relative to this year
by yx ; by adding therefore all these products, their sum, that we will designate by [452]
[k log(1 + β) + log h]Σyx , will be that which I name here life moral fortune.
Let us suppose now that A and B place the sum 2q of their capitals on their heads,
and that that produces a life pension β 0 , reversible to the survivor. So long as A and
B will live, each of them will touch only 21 β 0 of life annuity, and their annual moral
fortune will be k log(1+ 12 β 0 )+log h. By multiplying it by the probability that they both
will live to the end of year x, a probability equal to (yx )2 , the sum of these products
for all the values of x will be the life moral fortune of A, relative to the supposition of
their simultaneous existence; this fortune is therefore
β0
   X
k log 1 + + log h (yx )2 .
2
The probability that A will exist alone to the end of the xth year is yx − (yx )2 ; his life
moral fortune relative to his existence after the death of B, which renders his annual
moral fortune equal to 1 + β 0 , is therefore
X
[k log (1 + β 0 ) + log h] yx − (yx )2 .


The sum of these two functions


β0 X
  hX X i X
k log 1 + (yx )2 + k log (1 + β 0 ) yx − (yx )2 + log h yx
2
will be the life moral fortune of A under the hypothesis where A and B place conjointly
their capital.
If we compare this fortune to that which we have just found in the case where they
place their capitals separately, we see that there will be for A advantage or disadvantage
to place conjointly, according as
β0 X
  hX X i
log 1 + (yx )2 + log (1 + β 0 ) yx − (yx )2
2
will be greater or lesser than log (1 + β 0 ) yx . In order to know it, it is necessary to
P
determine the ratio of β 0 to β; now we have, by §40, [453]
X
q=β px yx ,
1−p
p being the annual interest on the money. We have next, by the same section,
X 
2q = β 0 px 2yx − (yx )2 ;


we have therefore
2β px yx
P
0
β =P .
px [2yx − (yx )2 ]

9
yx , (yx )2 ,
P x P x
p (yx )2 ;
P P
The Tables of mortality will give the values of p yx ,
we will be able thus to judge which of the two placements of which there is concern is
most advantageous.
0
P
P β and β very small fractions; the quantity log(1+β) yx becomes
Let us suppose
very nearly β yx . The quantity

β0 X
  hX X i
log 1 + (yx )2 + log (1 + β 0 ) yx − (yx )2
2

becomes
β0 h X X i
2 yx − (yx )2 ,
2
0
and, by substituting for β its preceding value, it becomes
 P P x
2 yx − (yx )2
P
p yx
β P x P x ;
2 p yx − p (yx )2

there is therefore advantage to place conjointly, if


h X X iX
2 yx − (yx )2 px yx

surpasses h X iX
X
2 px yx − px (yx )2 yx ,

or if we have P x
p (y )2 (y )2
P
P xx > Px ;
p yx yx
it is in fact that which holds generally, p being smaller than unity.
The advantage to place conjointly the capitals increases by the consideration that
0
the increase β2 of revenue arrives to the survivor, at an ordinarily advanced age, in [454]
which the greatest needs which are sensed render it much more useful. This advantage
increases yet on all the affections which are able to attach the two individuals to one
another, and which make them desire the well being of the one who must survive. The
establishments in which one is able thus to place his capitals and, by a slight sacrifice
of his revenue, to assure the existence of his family for a time where one must fear
no longer to be sufficient to its needs, are therefore very advantageous to the dead,
by favoring the softest penchants of nature. They offer not at all the inconvenience
that we have noted in even the most equitable games, the one to render the loss more
sensible than the gain, since on the contrary they offer the means to exchange the super-
fluous against some assured resources in the future. The Government must therefore
encourage these establishments and to respect them in their vicissitudes; because the
expectations that they present carrying onto an extended future, they are able to prosper
only with shelter from all anxiety on their duration.

10
BOOK II
CHAPTER XI
DE LA PROBABILITÉ DES TÉMOIGNAGES.

Pierre Simon Laplace∗


Théorie Analytique des Probabilités 7 §§44–50, pp. 455–470

ON THE PROBABILITY OF TESTIMONIES

We have extracted a ticket from an urn which contains the number n of them; a wit-
ness of this drawing, of whom the veracity and the probability that he is not
mistaken at all are supposed known, announces the exit of the no i; we demand
the probability of this exit. No 44.
We have extracted a ball from an urn which contains n − 1 black balls and one white
ball. A witness to the drawing announces that the extracted ball is white; we
demand the probability of this exit. If the number n is very great, that which
renders extraordinary the exit of the white ball, the probability of the error or of
the falsehood of the witness becomes quite near to certitude, that which shows
how the extraordinary facts weaken the belief due to the testimonies. No 45.
Urn A contains n white balls, urn B contains the same number of black balls; we
have extracted a ball from one of these urns and we have put it into the other urn
from which we have next extracted a ball. A witness of the first drawing has seen
a white ball exit. A witness of the second drawing announces that he has seen
similarly a white ball extracted. We demand the probability of this double exit.
In order that this double exit take place, it is necessary that a white ball extracted
from urn A in the first drawing, put next into urn B, has been extracted from it in
the second drawing, this which is a quite extraordinary event, when the number n
of black balls with which we have mixed it is very considerable. The probability
of this event becomes then very small; whence it follows that the probability of
the fact, resulting from the collection of many testimonies, decreases in measure
as this fact becomes more extraordinary. No 46.
Two witnesses attest to the exit of the no i from an urn which contains the number
n of them, and from which we have extracted only one ticket. We demand the
probability of this exit.
∗ Translated by Richard J. Pulskamp, Department of Mathematics & Computer Science, Xavier Univer-

sity, Cincinnati, OH. January 6, 2014

1
One of the witnesses attests to the exit of the no i and the other attests to the exit of the
no i0 ; to determine the probability of the exit of the no i. No 47.
One or many traditional successions of r witnesses transmit the exit of the no i from
an urn which contains the number n of them; to determine the probability of this
exit. No 48.

We know the respective veracities of two witnesses, of whom at least one, and perhaps
both, attest to the exit of the no i from one urn which contains the number n of
them; to determine the probability of this exit. No 49.
The judgments of the tribunals are able to be assimilated to the witnesses. To deter-
mine the probability of the goodness of these judgments. No 50.

2
§44. I will first consider a single witness. The probability of his testimony is [455]
composed of his veracity, of the possibility of his error and of the possibility of the fact
in itself. In order to fix the ideas, let us imagine that we have extracted a ticket from an
urn which contains the number n of them, and that a witness of the drawing announces
that the no i exited. The observed event is here the witness announcing the exit of the
no i. Let p be the veracity of the witness, or the probability that he will not at all seek
to deceive; let further r be the probability that he is not deceived at all. This premised:
We are able to form the following four hypotheses. Either the witness does not
deceive at all and is not deceived at all; or he deceives not at all and is deceived; or he
deceives and is not deceived at all; finally, or he deceives and is deceived at the same
time. Let us see what is, a priori, under each of these hypotheses, the probability that
the witness will announce the exit of the no i.
If the witness does not deceive at all and is not deceived at all, the no i will be
exited: but the probability of this exit is a priori n1 ; by multiplying it by the probability
pr of the hypothesis, we will have pr n for the entire probability of the observed event
under this first hypothesis.
If the witness does not deceive at all and is deceived, the no i must not be exited at
all, in order that he announces its exit; the probability of that is n−1n . But the error of
the witness must carry over one of the non-exited tickets. Let us suppose that it is able
1
to carry equally over all: the probability that it will carry over the no i will be n−1 ; the [456]
probability that the witness not deceiving at all and being deceived will announce the
no i is therefore n−1 1 1
n n−1 or n . By multiplying it by the probability p(1 − r) of the
hypothesis itself, we will have p(1−r)
n for the probability of the observed event under
this second hypothesis.
If the witness deceives and is not deceived at all, the no i will not be exited at all,
and the probability of that is n−1
n ; but the witness must choose, among the n − 1 tickets
not exited, the no i. If we suppose that his choice is able equally to carry over each of
1
them, n−1 will be the probability that his choice will be fixed on the no i; n−1 1
n n−1 or
1 o
n is therefore the probability that the witness will announce the n i. By multiplying
it by the probability (1 − p)r of the hypothesis, we will have (1−p)r n for the entire
probability of the observed event under this third hypothesis.
Finally, if the witness deceives and is deceived, the probability that he will not
believe the no i exited will be n−1n , and the probability that he will choose it among
1
the n − 1 tickets that he will not believe exited will be n−1 ; n−1 1 1
n n−1 or n will be
o
therefore the probability that he will announce the exit of n i. By multiplying it by the
probability (1−p)(1−r) of the hypothesis, we will have (1−p)(1−r) n for the probability
of the observed event under this fourth hypothesis.
This hypothesis contains one case in which the no i exited, namely the case in
which, the no i being exited, the witness believes it not exited, and he chooses it among
the n − 1 tickets that he believes not exited. The probability of that is the product of n1
1
by n−1 . By multiplying this product by the probability (1−p)(1−r) of the hypothesis,
we will have (1−p)(1−r)
n(n−1) for the probability of the case of which there is concern.
We are able to arrive to the same results in this manner. Let a, b, c, d, i, . . . be [457]
the n tickets. Since the witness is deceived, he must not believe exited at all the exited

3
ticket, and since he deceives, he must not announce at all as exited the ticket that he
believes exited. Let us put therefore, in the first place the exited ticket, in the second
the ticket that the witness believes exited, and in the third the ticket that he announces.
Among all the possible combinations of the tickets three by three, without excluding
those where they are repeated, there are compatibles with the present hypothesis only
those where the ticket which occupies the second place occupies neither the first, nor
the third; such are the combinations aba, abc, . . . Now it is easy to see that the number
of combinations which satisfy the two preceding conditions is n(n − 1)2 ; because the
combination ab is able to be combined with the n − 1 tickets other than b, and the num-
ber of combinations ab, ba, ac is n(n − 1). Now the combinations in which the no i is
announced without being exited are of the form abi, bai, aci, . . ., and the number of
these combinations is (n − 1)(n − 2); thus the probability that one of these combina-
n−2
tions will take place is n(n−1) . The combinations in which the no i being exited, it is
announced, are of the form iai, ibi, . . . and the number of these combinations is clearly
1
n−1; the probability that one of these combinations will take place is therefore n(n−1) .
It is necessary to multiply all these combinations by the probability (1 − p)(1 − r) of
the hypothesis, and then we will have the preceding results.
Now, in order to have the probability of the exit of the no i , we must make a sum
of all the preceding probabilities, relative to this exit, and to divide it by the sum of all
these probabilities, that which gives, for this probability,
pr (1−p)(1−r)
n + n(n−1) (1 − p)(1 − r)
p(1−r) (1−p)r
or pr + .
pr
n + n + n + (1−p)(1−r)
n
n−1

If r is equal to unity, or if the witness is not deceived at all, the probability of the exit [458]
of the no i will be p, that is the probability of the veracity of the witness.
If n is a very great number, this probability will be very nearly pr or the probability
of the veracity of the witness, multiplied by the probability that he is not deceived at
all.
We have supposed that the error of the witness, when he is deceived, is able equally
to fall on all the tickets non-exited; but this supposition ceases to hold, if some of them
have more resemblance than the others with the exited ticket, because the mistake in
this regard is easier. We have further supposed that the witness, when he deceives, has
no motive in order to choose one ticket rather than another, that which is not able to take
place. But it will be very difficult to make enter into one formula all these particular
considerations.

§45. Let us suppose now that the urn contains n − 1 black balls and one white ball,
and that by having extracted one ball, a witness of the drawing announces the exit of
a white ball. Let us determine the probability of this exit. We will form the same hy-
potheses as we have just made. In the first, the probability of the exit of the white ball
is, as above, pr
n . Under the second hypothesis, the witness is being deceived without
deceiving, a black ball must be exited, and the probability of that is n−1
n , and as the wit-
ness, supposed truthful, must announce the exit of a white ball, by that alone that he is
mistaken, the probability of this announcement will be therefore n−1 n , a probability that
it is necessary to multiply by the probability p(1−r) of the hypothesis, that which gives

4
p(1−r)(n−1)
n for the probability of the event observed under this hypothesis. Under the
third hypothesis, the witness being supposed deceiving and not at all being deceived, a
black ball must be exited, and the probability of that is n−1
n . By multiplying it by the
(1−p)r(n−1)
probability (1 − p)r of this hypothesis, we will have n for the probability [459]
of the observed event, under this hypothesis. Finally, under the fourth hypothesis, the
witness, deceiving and being deceived, is able to announce the exit of the white ball
only as long as it will be exited. The probability of this exit is n1 . By multiplying it
by the probability (1 − p)(1 − r) of the hypothesis, we will have (1−p)(1−r)n for the
probability of the observed event, under this hypothesis.
Presently, if we unite among the preceding probabilities those in which the white
ball exited, we will have the probability of this exit, by dividing their sum by the sum
of all the probabilities, that which gives

pr + (1 − p)(1 − r)
pr + (1 − p)(1 − r) + [p(1 − r) + (1 − p)r](n − 1)

for the probability of the exit of the white ball; consequently

[p(1 − r) + (1 − p)r](n − 1)
pr + (1 − p)(1 − r) + [p(1 − r) + (1 − p)r](n − 1)

is the probability that the fact attested by the witness of the drawing has not taken place.
We are able to observe here that, if we name q the probability that the witness
announces the truth, we will have

q = pr + (1 − p)(1 − r);

because it is clear that he spoke true, in the case of which there is concern, either that
he deceives not at all and is not deceived at all, or that he deceives and is deceived.
This expression of q gives

1 − q = p(1 − r) + (1 − p)r.

In fact, the probability 1 − q that he does not enunciate the truth is the probability that
he deceives not at all and is deceived, plus the probability that he deceives and is not [460]
deceived at all. The preceding expression of the probability that the attested fact is
false becomes thus
(1 − q)(n − 1)
.
q + (1 − q)(n − 1)
If the number n − 1 of black balls is very great, this probability becomes very
nearly equal to unity or to certitude, if the error or the mistake of the witness is in the
least probable. Then the fact that he attests becomes extraordinary. Thus we see how
the extraordinary facts weaken the belief due to the witnesses, the mistake or the error
becoming so much more possible as the attested fact is the more extraordinary in itself.

§46. Let us consider presently two urns A and B, of which the first contains a great
number n of white balls, and the second the same number of black balls. We draw

5
from one of these urns a ball that we replace into the other urn; next we draw a ball
from this last urn. A witness of the first drawing attests that one white ball exited; a
witness of the second drawing attests similarly that he has seen a white ball extracted.
Each of these testimonies, considered isolated, offers nothing of the unlikely. But the
consequence which results from their collection is that the same ball, exited from the
first drawing, has reappeared in the second, that which is a phenomenon so much more
extraordinary as n is a greater number. Let us see how the value of these testimonies is
weakened from it.
Let us name q the probability that the first witness enunciates the truth. We see,
by the preceding section, that in the present case this probability is composed of the
probability that the witness deceives not at all and is not deceived at all, added to the
probability that he deceives and is deceived at the same time; because the witness,
in these two cases, enunciates the truth. Let q 0 be the same probability relative to
the second witness. We are able to form these four hypotheses: either the first and the
second witness say the truth; or the first says the truth, the second does not say it; or the
second witness says the truth, the first not saying it at all; or finally neither of the two [461]
say the truth. We determine a priori, under each of these hypotheses, the probability of
the observed event.
This event is the announcement by the exit of one white ball at each drawing. The
probability that one white ball exited at the first drawing is 12 , since the ball extracted
is able to be equally exited from urn A or from urn B. In the case where it has been
extracted from urn A and put into urn B, n + 1 balls are contained in this last urn, and
1
the probability to extract from it one white ball is n+1 ; the product of 21 by n+1
1
is
therefore the probability a priori of the extraction of one white ball in the two consec-
utive drawings. In multiplying it by the probability qq 0 that the two witnesses say the
truth, we will have
qq 0
2(n + 1)
for the probability of the observed event, under the first hypothesis.
Under the second hypothesis, the ball has been extracted from urn A and put into
urn B: the probability of this extraction is 12 . Moreover, since the second witness does
not say the truth, a black ball has been extracted from urn B, and the probability of
n
this extraction is n+1 . By multiplying therefore 21 by n+1 n
, and the product by the
probability q(1 − q 0 ) that the first witness says the truth while the second does not say
it, we will have
q(1 − q 0 )n
2(n + 1)
for the probability of the observed event under the second hypothesis.
Under the third hypothesis, a black ball has been extracted from urn B and put into
urn A: the probability of this extraction is 12 . Moreover, a white ball has been further
n
extracted from urn A, and the probability of this extraction is n+1 ; by multiplying [462]
1 n 0
therefore 2 by n+1 , and the product by the probability (1−q)q that the second witness
says the truth, while the first does not say it, we will have
(1 − q)q 0 n
,
2(n + 1)

6
for the probability relative to the third hypothesis.
Finally, under the fourth hypothesis, a black ball has first been extracted from urn
B, and the probability of this extraction is 21 . Next this black ball, put into urn A, has
been extracted from it in the second drawing, and the probability of this extraction is
1
n+1 ; by multiplying therefore the product of these two probabilities by the probability
(1 − q)(1 − q 0 ) that none of the witnesses says the truth, we will have

(1 − q)(1 − q 0 )
2(n + 1)

for the probability relative to the fourth hypothesis.


Now the probability of the fact which results from the collection of the two testi-
monies, namely, that a white ball extracted in the first drawing has reappeared in the
second drawing, is clearly equal to the probability relative to the first hypothesis di-
vided by the sum of the probabilities relative to the four hypotheses; this probability is
therefore
qq 0
.
qq 0 + (1 − q)(1 − q 0 ) + [q(1 − q 0 ) + q 0 (1 − q)]n
The phenomenon of the reappearance of a white ball in the second drawing becomes
so much more extraordinary as the number n of balls of each urn is more consider-
able, and then the preceding probability becomes very small. We see therefore that the
probability of the fact resulting from the collection of the witnesses is extremely weak,
when it is extraordinary.

§47. Let us consider simultaneous witnesses: let us suppose two witnesses in ac- [463]
cord on a fact, and let us determine its probability. In order to fix the ideas, let us sup-
pose that the fact is the extraction of the no i from an urn which contains the number
n of them, such that the observed event is the accord of two witnesses of the drawing
to enunciate the exit of the no i. Let us name p and p0 their respective veracities, and
let us suppose, in order to simplify, that they are not deceived at all. This premised, we
are able to form only these two hypotheses: the witnesses say the truth; the witnesses
deceive.
Under the first hypothesis, the no i exited, and the probability of this event is n1 . By
multiplying it by the product of the veracities p and p0 of the witnesses, we will have
pp0
n for the probability of the observed event, under this hypothesis.
In the second, the no i is not exited, and the probability of this event is n−1 n ; but
the two witnesses are agreed to choose the no i among the n − 1 non-exited tickets.
Now the number of different combinations which are able to result from their choice
is (n − 1)2 , and in this number they must choose that where the no i is combined
1
with itself; the probability of this choice is therefore (n−1) 2 . By multiplying it by the

preceding probability n , and by the products of the probabilities 1 − p and 1 − p0


n−1
0
that the witnesses deceive, we will have (1−p)(1−p
n(n−1)
)
for the probability of the observed
event, under the second hypothesis.
Now, we will have the probability of the exit of the no i by dividing the probabil-
ity relative to the first hypothesis by the sum of the probabilities relative to the two

7
hypotheses; we will have therefore, for this probability,
pp0
(o) (1−p)(1−p0 )
.
pp0 + n−1

If n = 2, then the exit of the no i is as probable as its non-exit, and the probability of [464]
its exit, resulting from the accord of the testimonies, is
pp0
.
pp0 + (1 − p)(1 − p0 )
This is generally the probability of a fact attested by two witnesses, when the existence
of the fact is as probable as its nonexistence. If the two witnesses are equally truthful,
that which gives p0 = p, this probability becomes
p2
.
p2 + (1 − p)2
In general, if the number r of the equally truthful witnesses affirm the existence of a
fact of this kind, its probability resulting from the testimonies will be
pr
.
pr + (1 − p)r
But this formula is applicable only in the case where the existence of the fact and its
nonexistence are in themselves equally probable.
If the number n of the tickets of the urn is very great, formula (o) becomes very
nearly unity, and consequently the exit of no i is extremely probable. That holds to
this that it is not very credible that the witnesses, wishing to deceive, are agreed to
enunciate the same ticket, when the urn contains a great number of them. Simple
good sense indicates this result from the calculus; but we see at the same time that the
probability of the exit of the no i is much diminished, if the two witnesses, seeking to
deceive, have been able to hear one another.
Let us suppose now that the first witness affirms the exit of the no i, and that the
second witness affirms the exit of the no i0 . We are able to form then the following
three hypotheses: the first witness says the truth and the second deceives; in this case
the no i exited, and the probability of this event is n1 ; moreover, the second witness, who
deceives, must choose among the other non-exited tickets the no i0 , and the probability
1
of this choice is n−1 . The product of these two probabilities by the product of the
probabilities p and 1 − p0 , that the first witness not deceive and that the second deceive, [465]
will be the probability of the observed event or of the enunciation of the exit of the nos i
0
and i0 , under this hypothesis, a probability which is thus p(1−p
n(n−1) .
)

Under the second hypothesis, the first witness deceives and the second does not
deceive. Then the no i0 exited, and the probability of this event is n1 . Moreover, the first
witness chooses the no i out of the n − 1 non-exited tickets, and the probability of this
1
choice is n−1 . By multiplying the product of these two probabilities by the product of
the probabilities 1 − p and p0 , that the first witness deceives and that the second does
0
not deceive, we will have (1−p)p
n(n−1) .

8
Finally, under the third hypothesis, the two witnesses deceive at the same time.
Then none of the two tickets i and i0 exited. The probability of this event is n−2 n .
Moreover, the first witness must choose the no i, and the second must choose the no i0 ,
among the n − 1 non-exited tickets, and the probability of this composite event is
1
(n−1)2 . By multiplying the product of these two probabilities by the product of the
probabilities 1 − p and 1 − p0 that the first and the second witness deceive, we will have
(n−2)(1−p)(1−p0 )
n(n−1)2 for the probability of the observed event, under this hypothesis.
Now we will have the probability of the exit of the no i, by dividing the probabil-
ity relative to the first hypothesis by the sum of the probabilities relative to the three
hypotheses; the probability of this exit is therefore
p(1 − p0 )
(1−p)(1−p0 )
.
1 − pp0 − n−1

If n = 2, that is if the existence of each fact attested by the two witnesses is a priori as
probable as its nonexistence, then the preceding probability becomes 21 , when p = p0 , [466]
that which is clear besides, the two testimonies destroying themselves reciprocally. In
general, if a fact of this kind is attested by r witnesses and denied by r0 witnesses, all
equally truthful, it is easy to see that its probability will be
0
pr−r
,
pr−r0 + (1 − p)r−r0
that is the same as if the fact were attested by r − r0 witnesses.

§48. Let us consider presently a traditional chain of r witnesses, and let us suppose
that the fact transmitted is the exit of the no i from an urn which contains n tickets.
Let us designate by yr its probability. The addition of a new witness will change this
probability into yr+1 , a probability which will be formed: 1◦ from the product of yr by
the veracity of the new witness, a veracity that we will designate by pr+1 ; 2◦ from the
product of the probability 1 − pr+1 that this new witness deceives, by the probability
1
1 − yr that the preceding witness has not said the truth, and by the probability n−1 that
the new witness will choose the exited ticket, in the number of the n − 1 other tickets
than the one which has been indicated to him by the preceding witness; therefore we
will have
1
yr+1 = pr+1 yr + (1 − pr+1 )(1 − yr ),
n−1
an equation of which the integral is
1 (np1 − 1)(np2 − 1) · · · (npr − 1)
yr = +C ,
n (n − 1)r
C being an arbitrary constant. In order to determine it, we will observe that the prob-
ability of the fact, after the first testimony, is, by that which precedes, equal to p1 ; we
have therefore y1 = p1 , that which gives C = n−1 n ; hence

1 n − 1 (np1 − 1)(np2 − 1) · · · (npr − 1)


yr = + .
n n (n − 1)r

9
If n is infinite, we have [467]
yr = p1 p2 . . . pr .
If n = 2, that is if the existence of the fact is as probable as its nonexistence, we have
1
yr = 2 + 12 (2p1 − 1)(2p2 − 1) · · · (2pr − 1).

In general, in measure as the traditional chain is prolonged, yr approaches indefinitely


to its limit n1 , a limit which is the probability, a priori, of the exit of the no i. The term
n−1 np1 −1
n n−1 · · · of the expression of yr is therefore that which the chain of witnesses
adds to this probability. We see thus how the probability is weakened in measure as the
tradition is prolonged. In truth, the monuments, printings and other causes are able to
diminish this inevitable effect of time; but they are never able to entirely destroy it.
If we have two traditional chains, each of r witnesses, if we suppose the witnesses
of these chains equally truthful and if the last witness of the one of the chains accords
with the last of the other to affirm the exit of the no i, we will have the probability of
this exit, by substituting yr for p and p0 in formula (o) of the preceding section, which
becomes thence
yr2
2 .
yr2 + (1−y r)
n−1

§49. Let us consider two witnesses of whom p and p0 are the respective veracities.
We know that both or at least one of the them, without being contradicted by the other
who, in this case, has not pronounced at all, affirm that the no i exited from one urn
which contains the number n of them. By supposing always that we have extracted
only a single ticket, we demand the probability of the exit of the no i.
Let r and r0 be the respective probabilities that the witnesses pronounce. We are
able to make here only the following four hypotheses: 1◦ the two witnesses pronounce
and say the truth; 2◦ the two witnesses pronounce and deceive; 3◦ one of the witnesses [468]
pronounces and says the truth, and the other witness does not pronounce; 4◦ one of the
witnesses pronounces and deceives, and the other does not pronounce.
Under the first hypothesis, the no i exited, and the probability of this event is n1 .
It is necessary to multiply it by the product of the probabilities r and r0 that the two
witnesses have pronounced, and by the product of the probabilities p and p0 that they
say the truth; we will have thus
pp0 .rr0
n
for the probability of the observed event, under this hypothesis.
In the second, the no i is not exited, and the probability of this event is n−1
n . But,
if the two witnesses deceive without hearing one another, the probability that they will
1
agree to enunciate the same no i is (n−1) 2 . It is necessary to multiply the product of

these probabilities by the probability rr0 that the two witnesses pronounce at the same
time, and by the probability (1 − p)(1 − p0 ) that they both deceive. We will have thus

(1 − p)(1 − p0 )rr0
n(n − 1)

10
for the probability of the observed event, under the second hypothesis.
Under the third, the no i exited, and the probability of this event is n1 . It is neces-
sary to multiply by the probability pr(1 − r0 ) + p0 r0 (1 − r) that one of the witnesses
pronounces by saying the truth, while the other witness does not pronounce it at all.
We will have thus
pr(1 − r0 ) + p0 r0 (1 − r)
n
for the probability of the observed event under this hypothesis.
Finally, under the fourth, the no i is not exited, and the probability of this event is [469]
n−1
n ; but the witness who deceives must choose it in the n − 1 non-exited tickets, and
1
the probability of this choice is n−1 . It is necessary to multiply the product of these
probabilities by the probability (1 − p)r(1 − r0 ) + (1 − p0 )r0 (1 − r) that one of the
witnesses pronouncing deceives, while the other witness does not pronounce it at all.
We have thus
(1 − p)r(1 − r0 ) + (1 − p0 )r0 (1 − r)
n
for the probability corresponding to the fourth hypothesis.
Now we will have the probability of the exit of the no i, by dividing the sum of the
probabilities relative to the first and to the third hypothesis by the sum of the probabil-
ities relative to all the hypotheses, that which gives, for this probability,
pp0 rr0 + pr(1 − r0 ) + p0 r0 (1 − r)
(1−p)(1−p0 )rr 0
.
pp0 .rr0 + r(1 − r0 ) + r0 (1 − r) + n−1

These examples indicate sufficiently the method to subject to calculation of the proba-
bility of the testimonies.
§50. We are able to assimilate the judgment of a tribunal which pronounces between
two contradictory opinions to the result of the testimonies of many witnesses of the
extraction of a ticket from one urn which contains only two tickets. In expressing by p
the probability that the judge pronounces the truth, the probability of the goodness of a
judgment rendered by unanimity will be, by that which precedes,
pr
,
pr + (1 − p)r
r being the number of judges. We are able to determine p by the observation of the
ratio of the judgments rendered by unanimity by the tribunal to the total number of
judgments. When this number is very great, by designating it by n, and by i the number
of judgments rendered by unanimity, we will have very nearly [470]
i
pr + (1 − p)r = ;
n
p of the judges. This equation is
the resolution of this equation will give the veracity √
reduced to a degree less by half, by making p = 1 + u. It becomes thus
√ √ i
(1 + u)r + (1 − u)r = ,
n

11
an equation which, developed, is of degree 2r or r−1
2 , according as r is even or odd.
The probability of the goodness of a new judgment rendered by unanimity will be
n
1− (1 − p)r .
i
If we suppose the tribunal formed of three judges, we will have
r
1 4i − n
p= ± .
2 12n
We will adopt the + sign; because it is natural to suppose to each judge a greater
probability for the truth than for error. If the half of the judgments rendered by the
tribunal have been rendered by unanimity, then ni = 12 , and we find p = 0, 789. The
probability of a new judgment rendered by unanimity will be 0, 981. If this judgment
is rendered only by plurality, its probability will be p or 0, 789.
In general, we see that the probability 1 − ni (1 − p)r of the goodness of a new
judgment rendered by unanimity is so much greater as r is a greater number and as
the values of p and of ni are greater, that which depends on the wisdom of the judges.
There is therefore a great advantage to form the tribunals of appeal, composed of a
great number of judges chosen among the most enlightened persons.

12
Additions
P. S. Laplace∗
OC 7 pp. 471–493.

I One deduces from the analysis of no 34 of Book I the expression of the ratio of
the circumference to the radius, given by Wallis, in infinite product. Analysis of
the remarkable method by which this great geometer is arrived there, a method
which contains the germs of the theories of the interpolations and of the definite
integrals.
II Direct demonstration of the expression of ∆n si , found in no 40 of Book I, by the
passages from the postive to the negative and from the real to the imaginary.

III Demonstration of the formula (p) from no 42 of Book I or of the expression of


the finite differences of the powers, when one stops this expression at the term
where the quantity raised to the power becomes negative.

∗ Translated by Richard J. Pulskamp, Department of Mathematics & Computer Science, Xavier Univer-

sity, Cincinnati, OH. September 25, 2010

1
I.

We have integrated, by a very convergent approximation, in no 34 of Book I, the


equation in the finite differences

0 = (n0 + s + 1)ys+1 − (n + s)ys .

It is easy to conclude from our analysis the expression of the ratio of the circumference
to the radius, in infinite products, given by Wallis. In fact, this analysis has led us, in
the section cited, to the general expression
R 2n0 −2n+1
(n + µ)(n + µ + 1) · · · (n + s + 1) u du(1 − u2 )n+s−1
(a) = R 0 ,
(n + µ + 1)(n0 + µ + 2) · · · (n0 + s) u2n −2n+1 du(1 − u2 )n+µ−1

the integrals being taken from u = 0 to u = 1. By making first n0 = 0, n = 21 , µ = 1


1
and observing that du(1 − u2 ) 2 = 14 π, π being the ratio of the semi-circumference
R

to the radius, one will have


4 3.5 . . . (2s − 1)
= R 1 .
π 4.6 . . . 2s du(1 − u2 )s− 2

By supposing therefore generally


1
R = ys ,
du(1 − u2 )s

one will have


4 3.5 . . . (2s − 1) 3.5 . . . (2s + 1)
= ys− 12 = y 1 = ··· ,
π 4.6 . . . 2s 4.6 . . . (2s + 2) s+ 2

this which gives


2s + 1
ys− 12 = y 1
2s + 2 s+ 2
If one makes next, in formula (a), n0 = − 21 , n = 0 and µ = 1, it gives

3.5 . . . (2s − 1)
= ys−1 ;
2.4 . . . (2s − 2)

whence one draws


2s
ys−1 = yx ,
2s + 1
an equation which coincides with the preceding between ys− 12 and ys+ 12 by changing
s into s + 12 , so that this equation holds, s being entire or equal to an entire plus 12 .
The two expressions of ys−1 and of π4 give

4 3.3 5.5 (2s − 1)(2s − 1) ys− 21


= · ··· ;
π 2.4 4.6 (2s − 2)2s ys−1

2
the equations in the differences in ys and ys− 21 give

ys− 12 (2s + 1)2 ys+ 12 (2s + 1)2 (2s + 3)2 ys+ 32


= = = ···
ys−1 2s(2s + 2) ys 2s(2s + 2) (2s + 2)(2s + 4) ys+1
ys− 1
The ratio ys−12 is greater than unity; it diminishes without ceasing, in measure as s
increases, and, in the case of s infinite, it becomes unity. In fact, this ratio is equal to
du(1 − u2 )s−1
R
R 1 .
du(1 − u2 )s− 2
1
Now the element du(1 − u2 )s−1 is greater than the element du(1 − u2 )s− 2 , or du(1 −
1
u2 )s−1 (1 − u2 ) 2 ; the integral of the numerator of the preceding fraction surpasses
therefore that of the denominator; this fraction is therefore greater than unity. When s
is infinite, these integrals have a sensible value only when u is infinitely small; because,
u being finite, the factor (1 − u2 )s−1 becomes a fraction having an infinitely great
1
exponent; one can therefore then suppose (1 − u2 ) 2 = 1, this which renders the ratio
1
s− 2
ys−1 equal to unity.
This ratio is equal to the product of an infinite sequence of fractions, of which the
(2s+1)2
first is 2s(2s+2) , and of which the others are deduced from it, by increasing successively
ys (2s+1)2
s by one unit; it becomes s− 12
, by changing s into s + 21 , and the fraction 2s(2s+2)
(2s+2)2
becomes (2s+1)(2s+3) ; now one has, whatever be s,

(2s + 1)2 (2s + 2)2


> ;
2s(2s + 2) (2s + 1)(2s + 3)
one has therefore this inequality
ys− 12 ys
> .
ys−1 ys− 12

By changing s into s − 12 , one will have


ys−1 ys− 12
> .
ys− 32 ys−1

The two inequalities give


s
ys− 12 r
ys ys− 12
> < .
ys−1 ys−1 ys− 32
ys− 1
ys
Substituting instead of the ratios ys−1 and ys− 3
2
their values given by the equations in
2
the differences in ys , one will have
r
ys− 21
r
1 1
> 1+ < 1+ ;
ys−1 2s 2s − 1

3
one will have
 q
 4
π > 3.3
2.4 · 5.5
4.6 · · · (2s−1)(2s−1)
(2s−2)(2s 1+ 1
2s ,
(A) q
 4
π < 3.3
2.4 · 5.5
4.6 · · · (2s−1)(2s−1)
(2s−2)(2s 1+ 1
2s−1 .

Wallis published in 1657, in his Arithmetica infinitorum, this beautiful theorem, one
of the most curious in Analysis, by itself and by the manner in which the inventor is
arrived there. His method containing the principles of the theory of definite integrals,
that the geometers have specially cultivated in these last times, I think that they will see
with pleasure a succinct exposition in the actual language of Analysis.
Wallis considers the series of fractions of which the general term is R  1 1 s ,
dx 1−x n
n and s being some entire numbers, by commencing with zero. By expanding the
binomial contained under the integral sign and integrating each term of the expansion,
he obtains, for one same value of n, the numerical values of the preceding fraction,
corresponding to s = 0, s = 1, s = 2, . . . , this which gives to him a horizontal series,
of which s is the index. By supposing successively n = 0, n = 1, n = 2, . . . , he has
so many horizontal series. Thence, he forms a Table in double entry, of which s is the
horizontal index and n the vertical index.
In this Table, the horizontal and vertical series are the same, so that, by designat-
ing by yn,s the term corresponding to the indices n and s, one has this fundamental
equation
yn,s = ys,n .
Wallis observes next that the first series is unity; that the second is formed of the natural
numbers; that the third is formed of the triangular numbers, and so forth; in a manner
that the general term yn,s of the horizontal series corresponding to n is
(s + 1)(s + 2) · · · (s + n)
;
1.2.3 . . . n
this fraction being equal to
(n + 1)(n + 2) · · · (s + n)
,
1.2.3 . . . s
one sees clearly that yn,s is equal to ys,n .
Now, if one arrives to interpolate in the preceding Table the term corresponding
to n and s equal to 12 , one will have the ratio of the square of the diameter to the
1 4
surface of the circle; because the term of which there is concern is R 2
1 , or π .
dx(1−x ) 2
Wallis seeks therefore to make this interpolation. It is easy in the case where one of
the two numbers n and s is an entire number. Thus, by making successively s equal to
an entire number less 21 in the function (s+1)(s+2)···(s+n)
1.2.3...n , he obtains all the terms of
the horizontal series, corresponding to the values of s, − 12 , 32 , 52 , . . . ; and by making
n equal to an entire number less 12 in the function (n+1)(n+2)···(n+s)
1.2.3...s , he obtains all
the terms of the vertical series, corresponding to the values of n, − 12 , 32 , . . . But the
difficulty consists in finding the terms corresponding to n and s, both equal to some
entire numbers less 21 .

4
Wallis observes for this that the equation
(s + 1)(s + 2) · · · (s + n)
yn,s =
1.2.3 . . . n
gives
s(s + 1) · · · (s + n − 1)
yn,s−1 = ,
1.2.3 . . . n
and that thus one has
s+n
(a) yn,s = yn,s−1 ;
s
so that each term of a horizontal series is equal to the preceding, multiplied by the
fraction s+ns ; whence it follows that all the terms of a horizontal series, departing from
s = − 12 , s increasing successively by unity, are the products of yn,− 21 by the fractions
2n+1 2n+3 2n+5
1 , 3 , 5 , . . . , and, departing from s = 1, these terms are the products of yn,0
by the fractions n+1 n+2 n+3
1 , 2 , 3 , . . . He supposes that the same laws subsist in the case
of n fractional and equal to 21 , so that one has all the terms, departing from s = − 12 ,
by multiplying y 21 ,− 12 by the series of fractions 12 , 43 , 65 ,. . . by designating therefore by
 the term corresponding to n = 12 and s = 12 , a term which, as one has seen, is equal
to π4 , one has
2
 = y 12 ,− 12 ,
1
this which gives
1
y 12 ,− 21 = .
2
Departing from y 12 ,0 , or from unity, he obtains the successive terms of the series, corre-
sponding to s entire, by multiplying successively unity by the fractions 23 , 54 , 76 , . . . He
forms thus the horizontal series according to which correspond to n = 12 , and to s
successively equal to − 12 , 0, 12 , 1, 23 , . . . ,
1 3 4 3 3 4 6
(i) , 1, , , , · , · , ··· ,
2 2 3 2 4 3 5
a series which represents this here,
1 1 1
1 , R , 1 , ...
x2 ) − 2 dx(1 − x2 )0
R R
dx(1 − dx(1 − x2 ) 2
The series (i) gives generally, s being an entire number,
4 6 2s
y 21 ,s− 21 = · ··· ,
3 5 2s − 1
3 5 2s − 1
y 21 ,s−1 = · ··· ;
2 4 2s − 2
whence one draws
3.3 5.5 (2s − 1)(2s − 1) y 12 ,s− 12
(B) = · ··· .
2.4 4.6 (2s − 2)2s y 12 ,s−1

5
Wallis considers next that, in the series (i), the ratio of each term to the one which
precedes it by one unit is greater than unity and diminished without ceasing, so that
one has
y 12 ,s y 1 ,s+1
> 2 .
y 12 ,s−1 y 12 ,s
This results in fact with the equation
2s + 1
y 12 ,s = y 21 ,s−1 .
2s
He supposes that this holds equally for all the consecutive terms of the series, so that
one has the two inequalities
y 12 ,s− 21 y 12 ,s y 12 ,s−1
> < ;
y 21 ,s−1 y 12 ,s− 12 y 12 ,s− 23

whence it follows, as one has done above,


r
y 12 ,s− 12
r
1 1
> 1+ < 1+ ;
y 12 ,s−1 2s 2s − 1

thence, it changes formula (B) into formula (A).


This manner to proceed by way of induction must appear and appeared, in fact, ex-
traordinary to the geometers accustomed to the rigor of the ancients. Thus we see that
some great contemporary geometers of Wallis were not very satisfied with it, and Fer-
mat, in his correspondence with Digby, made some objections not very worthy of him
against this method which he had not studied sufficiently deeply. It must be, without
doubt, employed with an extreme circumspection: Wallis himself said, in responding to
Fermat, that it is thus that he is served by it, and, in order to confirm the exactitude, he
supported it on a calculation by which lord Brouncker had found, by means of formula
(A), the ratio of the circumference to the diameter, contained between the limits

3.141592653569,
3.141592653696,

limits which coincide in the first ten digits with this ratio that one has carried beyond
one hundred decimals. Notwithstanding these confirmations, it is always useful to
demonstrate in rigor that which one obtains by these means of invention. Wallis ob-
serves that the ancients had, without doubt, similar that they had not made known at all,
being content to give their results supported on synthetic demonstrations. He regrets,
with reason, that they had concealed from us their ways to arrive there, and he said to
Fermat that one must be thankful to him not to have imitated them, and to not have
destroyed the bridge after the flood having passed. It is worthy of remark that Newton,
who had profited from this method of induction of Wallis and of his results in order to
discover his theorem on the binomial, has merited the reproaches that Wallis made to
the ancients geometers, in seeking the means which had led to their discoveries.

6
Let us resume formula (B) of Wallis. If one supposes
y 12 ,s− 21
= us ,
y 12 ,s−1

this formula will give


(2s − 1)2
us−1 = us
(2s − 2)2s
or

(l) 0 = 2s(2s − 2)(us − us−1 ) + us .

Let there be
A(1) A(2) A(3)
us = A(0) + + + + ··· ,
s + 1 (s + 1)(s + 2) (s + 1)(s + 2)(s + 3)
and let us consider that which produces, in the second member of equation (l), the term

A(r)
.
(s + 1) · · · (s + r)
By having regard only to this term in us , one will have

−rA(r)
us − us−1 = ;
s(s + 1)(s + 2) · · · (s + r)
the term 2s(2s − 2)(us − us−1 ) of the equation (l) becomes thus

−4rA(r) (s − 1)
,
(s + 1) · · · (s + r)
or
−4rA(r) 4r(r + 1)A(r)
+ .
(s + 1) · · · (s + r − 1) (s + 1) · · · (s + r)
The term of us depending on A(r+1) will produce some similar terms, and thus of
the others. By comparing therefore in equation (l) the terms which have the same
denominator (s + 1) · · · (s + r), one will have

0 = 4r(r + 1)A(r) − 4(r + 1)A(r+1) + A(r) ,

this which gives


(2r + 1)2 A(r)
A(r+1) = .
4(r + 1)
It is clear, by that which precedes, that us is reduced to unity when s is infinite, this
which gives A(0) = 1. Thence one draws

12 12 .32 12 .32 .52 ys− 21


us = 1+ + 2 + 3 +· · · = .
4(s + 1) 4 /1/2(s + 1)(s + 2) 4 .1.2.3(s + 1)(s + 2)(s + 3) ys−1

7
The ratio of the mean term of the binomial (1 + 1)2s to the entire binomial is

(s + 1)(s + 2) · · · 2s
22s .1.2.3 . . . s
or
1.3.5 . . . (2s − 1)
.
2.4.6 . . . 2s
By naming therefore T this mean term, formula (B) will give
1
T2 = .
sπus
This theorem and the preceding expression of us in series are due to Stirling, and one
sees how they are attached to the theorem and to the analysis of Wallis. This value of
T2 is able to serve to determine by approximation the ratio of the circumference to the
diameter, this which was the object of Wallis; or, this ratio being supposed known, it
gives the mean term of the binomial, this which was the object of Stirling.

II. (pp. 480–485) (omitted)

III. (pp. 485–493) (omitted)

8
SUR LES COMÈTES
Pierre Simon Laplace∗

Connaissance des Temps for the year 1816. OC 13 pp. 88–97.


Published November 1813.

Among the hypotheses which one has proposed on the origin of the comets, the
most probable appears to me to be that of Mr. Herschell, which consists in regarding
them as small nebulae formed by the condensation of the nebulous material spread with
so much profusion in the universe. The comets would be thus, relative to the solar sys-
tem, that which the aerolites are with respect to the Earth, to which they seem strangers.
When these stars become visible for us, they offer a resemblance so perfect with the
nebula that one often confounds them with them, and it is only by their movement, or
by the knowledge of all the nebulae contained in the part of the sky where they are
indicated, that one succeeds in distinguishing them in it. This hypothesis demonstrates
in a happy manner the great extension that the heads and the tails of the comets take,
in measure as they approach the Sun, and the extreme rarity of these tails which, de-
spite their immense profusion, never weaken sensibly the radiance of the stars that one
sees to intersect, so that it is very probable that many have enveloped the Earth without
having been perceived.
When the nebulae arrive in this part of space where the attraction of the Sun is pre-
dominant, and what we will call sphere of activity of this star, it forces them to describe
some elliptical or hyperbolic orbits. But their speed being equally possible following
all directions, they must be moved indifferently in all senses and in all inclinations to
the ecliptic, this which is conformed to that which one observes. If their orbits are el-
liptic, they are very elongated; since their major axes are at least equal to the radius of
the sphere of activity of the Sun; but these orbits can be hyberbolic, and if the axes of
these hyperboles are not very great with respect to the mean distance of the Sun to the
Earth, the movement of the comets which describe them will seem sensibly hyperbolic.
However, out of one hundred comets of which one has the elements already, none has
appeared to move itself in an hyperbola, this which forms a plausible objection against
the preceding hypothesis, at least that the chances which give a sensible hyperbola are
extremely rare with respect to the contrary chances. The conformity of this hypothesis
with the phenomena which the comets offer to us has made me suspect that this is so,
and, in order to assure myself of it, I have applied to this object the Calculus of prob-
abilities. I have found that in effect there are odds a great number against unity that
a nebula which penetrates into the sphere of solar activity, in a manner to be able to
be observed, will describe either a very elongated ellipse or a hyperbola which, by the
∗ Translated by Richard J. Pulskamp, Department of Mathematics & Computer Science, Xavier Univer-

sity, Cincinnati, OH. September 17, 2010

1
magnitude of its axis, will be confounded sensibly with a parabola in the part which
one observes. This application of the analysis of probabilities being able to interest
geometers and astronomers, I am going to expose here.
The comets are so small that they become visible only if their perihelion distance
is not very considerable. Until the present, this distance has surpassed only two times
the diameter of the terrestrial orbit, and most often it has been under the radius of
this orbit. One imagines that, in order to approach so near to the Sun, their speed
at the moment of their entry into its sphere of activity must have a magnitude and a
direction contained in some narrow limits. It is necessary therefore to determine what
is, within these limits, the ratio of the chances which give a sensible hyperbola to the
chances which give an orbit which one can confound with a parabola. It is clear that
this ratio depends on the law of possibility of the perihelion distances of the observable
comets, and the examination of the table of the elements of the cometary orbits already
calculated demonstrate to us that, beyond a perihelion distance equal to the radius of
the terrestrial orbit, the possibilities of the perihelion distances diminish with a great
rapidity in measure as these distances increase. The law of these possibilities must
therefore be subject to this condition; but being, to this nearly, unknown, we are able
only to determine the limit of the ratio of which there is question, or its value in the
case most favorable to the sensible hyperbolas. If one supposes the radius of the sphere
of activity of the Sun equal to one hundred thousand times its distance to the Earth,
this which appears to be yet beneath that which indicates the smallness of the parallax
of the stars, the analysis gives, in the most favorable case, 57135714 for the probability
that a nebula which penetrates into the sphere of solar activity, in a manner to be able
to be observed, will describe a hyperbola of which the major axis will equal at least
one hundred times the distance of the Sun to the Earth. A similar hyperbola will be
confounded sensibly with a parabola; there is thus, in the case most favorable to the
sensible hyperbolas, by very nearly odds fifty-six against one that, out of one hundred
comets, none must have a sensible hyperbolic movement; it is therefore not surprising
that, until here, one has not observed at all similar movement.
The attraction of the planets, and perhaps further the resistance of the ether, ought
to change many cometary orbits, into some ellipses of which the major axis is much
less than the radius of the sphere of activity of the Sun. One can believe that this change
has taken place for the orbit of the comet of 1682, of which the major axis surpasses
only thirty-five times the distance of the Sun to the Earth. A change greater still is
arrived in the orbit of the comet of 1770, of which the major axis equals only around
six times this distance.
A comet loses, at each return to its perihelion, a part of its substance, as the heat
and the light of the Sun raise vapors from it and they disperse into space, to a distance
from the comet such that its attraction can not make them fall back to its surface.
This star must therefore, after many returns, dissipate itself in whole or reduce itself
to a fixed core which will present some phases as the planets. The comet of 1682,
the only one in which one has until now observed some phases, appears to approach
from this state of stability. If this core is too small in order to be perceived, or if the
evaporative substances which remain on its surface are too small in quantity to form,
by their evaporation, a sensible comet head, the star will disappear forever. Perhaps is
this one of the causes which render so rare the reappearances of the comets; perhaps

2
yet has this cause made vanish more often than one had ought expect, many comets
of which one was able to follow the trace in space, by means of the elements of their
orbits; perhaps finally the same cause has rendered invisible the comet of 1770, which,
if it has continued to be moved in an ellipse which it has described during its apparition,
is returned, since this period, at least seven times to its perihelion.
Let

V be the speed of a comet at the instant where it penetrates into the sphere of activity
of the Sun;
r be the radius vector of the comet at the same instant;
a the semi-major axis of the orbit which it is going to describe around the Sun;
e the eccentricity of this orbit;
D its perihelion distance.

In taking for unity of mass that of the Sun and for unity of distance its mean distance
to the Earth, and, moreover, neglecting the masses of the comets and of the planets
relative to this star, one will have, as one knows1 :
1 2
= − V 2,
a pr
rV sin $ = a(1 − e2 ),
D = a(1 − e);

$ being the angle which the direction of the speed V makes with the radius vector r.
These equations give, in eliminating a and e,
2D 2
2D − r + D2 V 2
sin2 $ = ,
r2 V 2
whence one draws
q
D
s
1− r

D

1 − cos $ = 1 − r2 V 2 1 + − 2D.
rV r

Now, if one imagines a sphere of which the center is the one of the comet and of
which the radius is equal to the speed V , this speed will be able to be equally directed
towards all the points of the half of this sphere contained in the sphere of activity of
the Sun. The probability of a direction forming the angle $ with the radius vector will
be 2π sin $, π being the Rsemi-circumference of which the radius is unity; by dividing
therefore the integral 2π d$ sin $ by the surface of the half-sphere, one will have
the probability that the direction of the speed V will be contained within the limits zero
and $; this probability is thus 1 − cos $. The limits of the perihelion distance which
corresponds to these limits of $ are zero and D; by supposing therefore all the values
1 Oeuvres de Laplace, T. I., Book II, Chapter IV.

3
of D equally possible, one has for the probability that the perihelion distance will be
contained between zero and D
q
1− D
s  
r 2 2
D
1− r V 1+ − 2D.
rV r

It is necessary to multiply this value by dV ; by integrating it next within some deter-


mined limits and dividing the integral by the greatest value of V , a value which we
will designate by U , one will have the probability that the value of V will be contained
within these limits. This put, the smallest value of V is that which renders null the
quantity contained under the preceding radical, this which gives

2D
rV = q .
1+ D r

We suppose next to the other limit



rV = i r;
and we seek, within these limits, the value of the integral
 q 
1− D
Z s  
r D
(a) dV 1 − r2 V 2 1 + − 2D
rV r

Let s r !
 
D D
r2 V 2 1+ − 2D = rV 1+ −z ;
r r
we will have
2D + z 2
rV = q ;
2z 1 + D r

formula (a) becomes thus


q
D
1− r
Z 
1 4D D

V + dz − 2
+ 2 .
r 2 2D + z z
By integrating, it becomes
q
D
1− r

z √ z D

(b) V + − 2 2D arctan √ − + C.
r 2 2D z
C being an arbitrary constant. In order to determine it, we will observe that the two
limits of rV being, by that which precedes,

2D √
q , i r,
D
1+ r

4
the corresponding limits of z are
" s #

r
√ D 2D
2D, i r 1 + 1− 1− 2 D

r i r 1+ r

This last limit is    


D D 1
√ 1− 1 − 2 + ··· .
i r 2r i
In determining therefore C in a manner that formula (b) is null at the first limit and
extends to the second, this formula becomes

(π − 2) 2D D
− √ .
2r ir r

If one divides this function by U , we will have



(π − 2) 2D D
− √ .
2rU iU r r

for the probability that the perihelion distance of a star which enters into the sphere of
activity of the Sun will be contained within the limits zero and D, the value of V 2 not
2
exceeding ir . This value is 2r − a1 ; one has therefore

1 2 − i2
= ;
a r
the orbit is elliptical or parabolic when i2 is inferior or equal to 2, it is hyperbolic when
i2 surpasses 2. If one supposes, for example, a = −100, one will have
r + 200
i2 = ,
100
and the probability that the perihelion distance being contained between zero and D,
the orbit will be either elliptical, or parabolic, or a hyperbola of which the semi-major
axis will be at least equal to 100, is

(π − 2) 2D 10D
− p .
2rU rU r(r + 200)

The probability of a value of i more considerable, or of a hyperbola of which the


semi-major axis will be less than 100, is equal to
10D
p ,
rU r(r + 200)

because, by supposing i infinite, one will have (π−2)
2rU
2D
for the probability that the
perihelion distance will be contained between zero and D. If one subtracts from it

5
the probability that the orbits will be ellipses, or parabolas, or hyperbolas with a semi-
major axis equal or superior to 100, one will have
10D
p ,
rU r(r + 200)
for the probability of the hyperbolas with a major axis below this value. Thus the
perihelion distance being supposed contained between zero and D, the probability that
the orbit will be either an ellipse, or a parabola, or a hyperbola with a semi-major axis
at least equal to 100, is to the probability that it will be a hyperbola with a semi-major
axis inferior, as r
(π − 2) r
(r + 200) − 1 : 1.
10 2D
If one supposes r = 100000 and D = 2, of the greater perihelion distances being
so rare that one can set them aside, this ratio becomes the one of 5712, 7 to unity;
there is therefore by very nearly odds of fifty-six against one that, out of one hundred
observable cometary orbits, none must be a hyperbola with a semi-major axis inferior
to 100.
The preceding analysis supposes all the values of D contained between 0 and 2,
equally possible relative to the comets that one can perceive. However, the examination
of the table of the elements of the cometary orbits already calculated shows that the
perihelion distances which surpass unity are in much smaller number than those which
are below. We name φ(D) the probability of a perihelion distance D relative to an
observable comet. One have just seen that the probability that the perihelion distance
of an observable comet will be contained between zero and D, D being quite small
with respect to r, is, in the case where all these distances are equally possible, equal to

(π − 2) 2D
;
2rU
r
and that the probability that the semi-major axis will be inferior to 2−i 2 is

D
√ .
iU r r
In order to have the ratio of these probabilities, in the case where these distances are
not equally possible, it is necessary, according to the analysis of the probabilities, to
differentiate these two quantities with respect to D and to multiply the differentials by
φ(D); then, according to this analysis, the preceding probabilities will be respectively
as the integrals of these products, or as
(π − 2)
Z √ 1
Z
φ(D)d 2D; √ dDφ(D);
2rU iU r r
the integrals being taken from D = 0 to its limit, that one here supposes infinity,
because φ(D) is null when D surpasses 5. Thus the probability that the semi-major
r
axis of the orbit will be inferior to 2−i 2 , is
R
2 dDφ(D)
(q) √ R
(π − 2)i r dD√φ(D)
2D

6
In the case of φ(D) constant, the preceding function becomes

2D
√ ,
(π − 2)i r

this which is conformed to that which precedes; but, if φ(D) diminishes when D in-
creases, then the formula (q) diminishes. In order to show it, it suffices to prove that,
in this case, one has
2 dDφ(D) √
R
R dD φ(D) < 2D

2D
or Z √ Z
dD φ(D)
2 dDφ(D) < 2D √ ,
2D
and, by differentiating,

1
Z
dD φ(D) 1
Z √ dφ(D)
φ(D) < √ √ = φ(D) − √ dD 2D .
2D 2D 2D dD

Now this inequality is evident, because, φ(D) diminishes when D increases, dφ(D)
dD is
a negative quantity.
In examining the Table of elements of the cometary orbits already calculated, one
2
sees that one will depart little from the truth by making φ(D) = kc−D , c being the
number of which the hyperbolic logarithm is unity. Then the formula (q) becomes

π
√ R 1
(π − 2)i 2r s 4 ds e−s .

In supposing, as above, r = 100000, and observing that one has


Z
1
log 10 s 4 ds e−s = 0, 9573211

the preceding fraction becomes


1
;
8264, 3
there is then, quite nearly, odds 8263 against one that one nebula which penetrates into
the sphere of activity of the Sun will describe an orbit of which the semi-major axis
will be at least equal to 100. Thus one can regard the assumption of φ(D) constant,
and extending only to D = 2, as the limit of the assumptions favorable to the sensible
hyperbolic movements, so that there are odds at least 56 against unity that, out of one
hundred observable comets, none will have a sensible movement.

7
SUR
L’APPLICATION DU CALCUL DES
PROBABILITÉS
A LA PHILOSOPHIE NATURELLE

Pierre Simon Laplace∗


Connaissance des Temps for the year 1818 (1815). pp. 361–377.
OC 13 pp. 98–116
Read to the first class of the Institut, 18 September 1815.

When we wish to know the laws of phenomena, and to attain to a great exactitude,
we combine the observations or the experiences in a manner to bring out the unknown
elements, and we take the mean among them. The more observations are numerous,
and the less they depart from their mean result, the more this result approaches to the
truth. We fulfill this last condition by the choice of the methods, by the precision
of the instruments, and by the care that we put to observe well. Next, we determine
by the theory of probabilities the most advantageous mean result, or the one which
gives the least taken to error. But this does not suffice; it is yet necessary to estimate
the probability that the error of this result is comprehended within some given limits;
without this, we have only an imperfect knowledge of the degree of exactitude obtained.
Formulas proper to this object are therefore a true perfection of the method of natural
philosophy, that it is quite important to add to this method. It is one of the things
that I have had principally in view in my Théorie analytique des Probabilités, where I
am arrived to some formulas of this kind which have the remarkable advantage to be
independent of the law of the probability of errors, and to contain only quantities given
by the same observations and by their analytic expressions. I am going to recall here
the principles.
Each observation has for analytic expression a function of the elements which we
wish to determine; and if these elements are nearly known, this function becomes a
linear function of their corrections. By equating it to the observation itself, we form
that which we name the equation of condition. If we have a great number of similar
observations, we combine them in a manner to form as many final equations as there
are elements; and by resolving these equations, we determine the corrections of the el-
ements. The art consists therefore in combining the equations of condition in the most
advantageous manner. For this we must observe that the formation of a final equation,
∗ Translated by Richard J. Pulskamp, Department of Mathematics & Computer Science, Xavier Univer-
sity, Cincinnati, OH. June 20, 2012

1
by means of the equations of condition, reverts to multiplying each of these by an inde-
terminate factor, and to reunite these products; but it is necessary to choose the system
of factors which give the smallest error to fear. Now it is clear that if we multiply each
error of which an element determined by a system is yet susceptible, by the probability
of this error, the most advantageous system will be the one in which the sum of these
products, all taken positively, is a minimum; because a positive or negative error can
be considered as a loss. By forming therefore this sum of products, the condition of
minimum will determine the system of most advantageous factors, and the minimum
error to fear respecting each element. I have shown, in the Work cited, that this system
is the one of the coefficients of the elements in each equation of condition; so that we
form a first final equation by multiplying respectively each equation of condition by
its coefficient of the first element, and by reuniting all these equations thus multiplied.
We form a second final equation by employing the coefficients of the second element,
and thus in succession. I have given in the same Work the expression of the minimum
of error, whatever be the number of elements. This minimum gives the probability of
the errors of which the corrections of these elements are yet susceptible, and which
is proportional to the number of which the hyperbolic logarithm is unity, raised to a
power of which the exponent is the square of the error taken to less, and divided by
the square of the minimum of error, multiplied by the ratio of the circumference to the
diameter. The coefficient of the negative square of the error, in this exponent, is able
therefore to be considered as the modulus of the probability of the errors, since, the
error remaining the same, the probability decreases with rapidity when it increases; so
that the result obtained weighs, if I may thus say, towards the truth, so much more as
this modulus is greater. I will name, for this reason, this modulus, weight of the result.
By a remarkable analogy of these weights with those of the bodies, compared to their
common center of gravity, it happens that, if one same element is given by diverse sys-
tems composed each of a great number of observations, the most advantageous mean
result of them altogether is the sum of the products of each partial result by its weight,
this sum being divided by the sum of all the weights. Moreover, the total weight of the
diverse systems is the sum of their partial weights; so that the probability of the errors
of the mean result of them altogether is proportional to the number which has the unit
for hyperbolic logarithm, raised to a power of which the exponent is the square of the
error, taken to less, and multiplied by the sum of all the weights. Each weight depends,
in truth, on the law of probability of the errors in each system, and nearly always this
law is unknown; but I am happily arrived to eliminate the factor which contains it, by
means of the sum of the squares of the deviations of the observations of the system,
from their mean result. It will be therefore to wish for, in order to complete our knowl-
edge from the results obtained by the collection of a great number of observations, that
we wrote, beside each result, the weight which corresponds to it. In order to facili-
tate the calculation, I develop its analytic expression when we have no more than four
elements to determine. But this expression becoming more and more complicated in
measure as the number of elements increases, I give a quite simple way to determine
the weight of the result, whatever be the number of elements. Then, a regular process
to arrive to that which we seek is preferable to the use of analytic formulas. When
we have thus obtained the exponential which represents the law of probability of the
errors of a result, the integral of the product of this exponential, by the differential of

2
the error, being taken within some determined limits, it will give the probability that
the error of the result is contained within these limits, by multiplying it by the square
root of the weight of the result, divided by the circumference of which the diameter is
unity. We find, in the Work1 cited, some very simple formulas in order to obtain this
integral, and Mr. Kramp, in his Traité des Réfractions astronomiques, has reduced this
genre of integrals into quite convenient Tables.
In order to apply this method with success, it is necessary to vary the circumstances
of the observations in a manner to avoid the constant causes of error. It is necessary
that the observations be reported faithfully and without bias, by separating only those
which contain some evident causes of error. It is necessary that they be numerous, and
that they be so many more as there are more elements to determine; because the weight
of the mean result increases as the number of observations divided by the number of
elements. It is yet necessary that the elements follow, in these observations, a different
march; because if the march of two elements were rigorously the same, that which
renders their coefficients proportionals in the equations of condition, these elements
would form only a single unknown, and it would be impossible to distinguish them by
these observations. Finally, it is necessary that the observations be precise, so that their
deviations from the mean result are not very considerable. The weight of the result
is, thence, much increased, its expression having for divisor the sum of the squares of
these deviations. With these precautions we will be able to make use of the preceding
method, and to determine the degree of confidence that the results deduced from a great
number of observations merit.
In the Researches which I have read last to the Class on the phenomena of the
seas, I have applied this method to the observations of these phenomena. I give here
two new applications of them: one is related to the values of the masses of Jupiter, of
Saturn and of Uranus; the other is related to the law of variation of gravity. For the first
object, I have profited from the immense work that Mr. Bouvard had just finished on
the movements of Jupiter and Saturn, from which he has constructed new very precise
Tables. He has made use of all the oppositions and all the quadratures observed since
Bradley, and which he has discussed anew with the greatest care, that which has given
to him for the movement of Jupiter, in longitude, 126 equations of condition. They
contain five elements, namely: the mean movement of Jupiter, its mean longitude at
a fixed epoch, the longitude of its perihelion to the same epoch, the eccentricity of
its orbit; finally the mass of Saturn, of which the action is the principle source of the
inequalities of Jupiter. These equations have been reduced, by the most advantageous
method, to five final equations of which the resolution has given the value of the five
elements. Mr. Bouvard finds thus the mass of Saturn equal to the 3512th part of that
of the Sun. We must observe that this mass is the sum of the masses of Saturn, of
its satellites and of its ring. My formulas of probability show that there are odds of
11000 against one that the error of this result is not a hundredth of its value, or, that
which reverts to very nearly the same, that after a century of new observations added to
the preceding and discussed in the same manner, the new result will not differ by one
hundredth from the one of Mr. Bouvard. There are odds of many billions against one
that this last result is not in error of a fiftieth, because the odds against one increases,
1 TAP, page 109.

3
by the nature of its analytic expression, with a great rapidity when the interval of the
limits of the error increases.
Newton had found, by the observations of Pound out of the greatest elongation of
the fourth satellite of Saturn, the mass of this planet equal to the 3012th part of that of
the Sun, that which surpasses by a sixth the result of Mr. Bouvard. There are odds of
millions of billions against one that the one of Newton is in error, and we will not be
surprised at all if we consider the extreme difficulty to observe the greatest elongations
of the satellites of Saturn. The ease to observe those of the satellites of Jupiter has
rendered much more exact the value of the mass of this planet, that Newton has fixed
by the observations of Pound to the 1067th part of that of the Sun. Mr. Bouvard, by
the set of 129 oppositions and quadratures of Saturn, finds it a 1071th of this star, that
which differs very little from the value of Newton. My method of probability, applied
to the 129 equations of condition of Mr. Bouvard, gives odds 1000000 against one that
his result is not in error of one hundredth of its value; there are odds 900 against one
that his error is not one hundred fiftieth.
Mr. Bouvard has made the mass of Uranus enter into his equations as indeterminate;
he has deduced from them this mass equal to the 17918th part of that of the Sun. The
perturbations which it produces in the movement of Saturn being not very considerable,
we must not yet expect from the observations of this movement a great precision in this
value. But it is so difficult to observe the elongations of the satellites of Uranus, that
we are able to justly fear a considerable error in the value of the mass which results
from the observations of Mr. Hershel. It was therefore interesting to see that which, in
this regard, the perturbations of the movement of Saturn give. I find that there are odds
213 against one that the error of the result of Mr. Bouvard is not a fiftieth; there are
odds 2456 against one that it is not a fourth. After a century of new observations added
to the preceding, and discussed in the same manner, these odds numbers will increase
further by their squares; we will have therefore then the value of the mass of Uranus,
with a great probability that it will be contained within some narrow limits.
I come now to the law of gravity. Since Richer who recognized, first, the diminution
of this force at the equator by the deceleration of his clock transported from Paris to
Cayenne, we have determined the intensity of gravity, in a great number of places, ei-
ther by the number of diurnal oscillations one same pendulum, or by measuring directly
the length of the pendulum in seconds. The observations which have to me seemed to
merit the most confidence are in number of thirty-seven and extend from 67 ˚ of north-
ern latitude to 51 ˚ of southern latitude. Although their march is quite regular, they
leave however to desire a greater precision still. The length of the isochronous pen-
dulum which results from it follows very nearly the most simple law of variation, that
of the square of the sine of the latitude, and the two hemispheres present not at all, in
this regard, sensible difference, or at least what can not be attributed to the errors of
the observations. But, if there exists among them a slight difference, the observations
of the pendulum, by their facility and the precision which we can bring there now,
are very proper to demonstrate it. Mr. Mathieu has well wished to discuss, at my re-
quest, the observations of which I just spoke, and he has found that, the length of the
pendulum in seconds at the equator being taken for unity, the coefficient of the term
proportional to the square of the sine of the latitude is 551 hundred thousandths. My
formulas of probability, applied to these observations, give odds 2127 against one that

4
the true coefficient is contained within the limits 5 thousandths and 6 thousandths.
If the Earth is an ellipsoid of revolution, we have its flatness by subtracting from
it the coefficient of the law of gravity of 868 hundred thousandths. The coefficient 5
1
thousandths corresponds thus to the flatness 272 ; there are therefore odds 4254 against
one that the flatness of the Earth is below. There are odds some millions of billions
against one that this flatness is less than the one which corresponds to the homogeneity
of the Earth, and that the terrestrial layers increase with density in measure as they ap-
proach the center of this planet. The great regularity of gravity at its surface proves that
they are disposed symmetrically around this point. These two conditions, necessarily
following from the fluid state, could not evidently subsist for the Earth, if it had been
not at all originally this state, that an excessive heat has been able to give alone to the
whole Earth.

§ 1. Suppose that we have a sequence of equations of condition of the form

(1) (i) = p(i) z + q (i) z 0 + r(i) z 00 + t(i) z 000 + ν (i) z iv + λ(i) z v + · · · − ω (i) ,

z, z 0 , z 00 , . . . being some m elements of the corrections of the elements which we seek


to determine by the whole of these equations, of which the number is supposed very
great; p(i) , q (i) , . . . being some quantities given by the analytic expressions of the ob-
servations; ω (i) being the quantity given by the same observation, and (i) being the
error of the observation. I have shown in no. 21 of the second Book of my Théorie
analytique des probabilités,2 that if n is the number of elements, we will have n final
equations the most proper to determine the elements: 1 ˚ by multiplying each final
equation by its coefficient of z, and by reuniting all the resulting equations with these
products, that which gives
2
Sp(i) (i) = zSp(i) + z 0 Sp(i) q (i) + z 00 Sp(i) r(i) + · · · − Sp(i) ω (i) ,

the sign S indicating the sum of the quantities which it affects, from i = 0 to i = s − 1,
s being the number of observations or of equations of condition; 2 ˚ by multiplying
each equation of condition by its coefficient of z 0 ; that which gives, by reuniting these
products,
2
Sq (i) (i) = zSp(i) q (i) + z 0 Sq (i) + z 00 Sq (i) r(i) + · · · − Sq (i) ω (i) ,

and thus consecutively. We will resolve these equations by supposing

Sp(i) (i) = 0, Sq (i) (i) = 0, Sr(i) (i) = 0, ...,

and we will have the most advantageous values of z, z 0 , z 00 , . . . There results from the
section cited, that the probability of error u of the value of z thus determined, is of the
√ −P u2
form P √ c
π
, c being the number of which the hyperbolic logarithm is unity, and π
being the ratio of the circumference to the diameter. By multiplying this probability
by udu, and taking the integral from u = 0 to u infinity, we will have, by the section
2 TAP, page 327.

5
cited, that which I have named in this section the minimum error to fear; this minimum
is therefore 2√1πP . I have given in the same section the expression of this minimum
error; this expression will give therefore the value of P , or of the weight of the result;
and we find that if there is only one correction or element z, we have
2
sSp(i)
P = .
2S(i)2
If there are two elements z and z 0 , we will have the value of P , relative to the first
2 2 (i) (i) 2
element, by changing Sp(i) into Sp(i) − (SpSq(i) q )
2 , by making therefore generally

s A
P = ,
2S(i)2 B
2 2
and designating, for brevity, Sp(i) by p(2) , Sp(i) q (i) by pq, Sq (i) by q (2) , we will have

A = p(2) q (2) − pq 2 ,
B = q (2) .

If there are three elements z, z 0 , z 00 , we will have A by changing, in the value


2 2
preceding A, p(2) into p(2) − rpr(2) , pq into pq − pr qr
r (2)
, and q (2) into q (2) − rqr(2) , and
multiplying the whole by r(2) . We will have B by making the same substitutions and
the same multiplication relative to the preceding value of B; we have thus

A = p(2) q (2) r(2) − p(2) qr2 − q (2) pr2 − r(2) pq 2 + 2pq pr qr,
B = q (2) r(2) − qr2 .

If there are four elements, we will have the values of A and of B by changing,
2
in the two preceding, p(2) into p(2) − tpt(2) , pq into pq − pt qt
t(2)
, . . . and multiplying the
(2)
whole by t , that which gives
2 2
A = p(2) q (2) r(2) t(2) −p(2) q (2) rt − p(2) r(2) qt − p(2) t(2) qr2
2
−q (2) r(2) pt − q (2) t(2) pr2 − r(2) t(2) pq 2
2 2 2
+pq 2 rt + pr2 qt + pt qr2
+2p(2) qr qt rt + 2q (2) pr pt rt
+2r(2) pq pt qt + 2t(2) pq pr qr
−2pq pr qt rt − 2pq pt qr rt − 2pr pt qr qt,
2 2
B = q (2) r(2) t(2) − q (2) rt − r(2) qt − t(2) qr2 + 2qr qt rt.

In continuing thus, we will have the value of P relative to the first element, what-
ever be the number of elements. By changing p into q and q into p, we will have the
value of P relative to the second element; p into r and r into p, we will have the value
of P relative to the third element, and thus consecutively.

6
The value of A becomes more complicated in measure as the number of elements
increases; its expression for six elements is of an excessive length, and its numeric
calculation would be impractical. It is worth more then to have a simple and regular
process in order to arrive there; this is that which we obtain in the following manner:
Suppose that there are six elements, and that thus the equation of condition (1) is
of the form

(2) (i) = λ(i) z v + ν (i) z iv + t(i) z 000 + r(i) z 00 + q (i) z 0 + p(i) z − ω (i) .

By multiplying this equation by λ(i) , and reuniting the similar products, relative to
all the equations of condition that equation (2) represents, we will have
2
Sλ(i) (i) = z v Sλ(i) + z iv Sλ(i) ν (i) + z 000 Sλ(i) t(i) + · · · − Sλ(i) ω (i) .

By the conditions of the most advantageous method, we have

Sλ(i) (i) = 0,

the preceding equation will give therefore

Sλ(i) ν (i) Sλ(i) t(i) Sλ(i) ω (i)


z v = −z iv 2 − z 000 2 − ··· + .
Sλ (i) Sλ (i) Sλ(i)2
By substituting this value of z v into equation (2), we will have this here

(i) (i)
 
(i) iv (i) (i) Sλ ν
  =z ν −λ


 Sλ(i)2
(3)
Sλ(i) t(i) (i) (i)
 
(i) Sλ ω
+ z 000 t(i) − λ(i)
 (i)
+ · · · − ω + λ .


Sλ(i)2 Sλ(i)2

We have thus, by making successively $i = 0$, $i = 1$, . . . , $i = s - 1$, a new system of equations of condition, which contains no more than five elements, $z^{iv}$, $z'''$, . . . Making, for brevity,
\[ \nu_1^{(i)} = \nu^{(i)} - \lambda^{(i)}\frac{S\lambda^{(i)}\nu^{(i)}}{S\lambda^{(i)2}},\qquad t_1^{(i)} = t^{(i)} - \lambda^{(i)}\frac{S\lambda^{(i)}t^{(i)}}{S\lambda^{(i)2}},\qquad \ldots,\qquad \omega_1^{(i)} = \omega^{(i)} - \lambda^{(i)}\frac{S\lambda^{(i)}\omega^{(i)}}{S\lambda^{(i)2}}, \]
equation (3) will become
\[ (4)\qquad \epsilon^{(i)} = \nu_1^{(i)}z^{iv} + t_1^{(i)}z''' + r_1^{(i)}z'' + q_1^{(i)}z' + p_1^{(i)}z - \omega_1^{(i)}. \]
By multiplying this equation by $\nu_1^{(i)}$, and reuniting the similar products, relative to all the equations which this represents, by observing next that we have $S\nu_1^{(i)}\epsilon^{(i)} = 0$, by virtue of the two equations $S\lambda^{(i)}\epsilon^{(i)} = 0$, $S\nu^{(i)}\epsilon^{(i)} = 0$, which the conditions of the most advantageous method give, we will have
\[ 0 = z^{iv}S\nu_1^{(i)2} + z'''S\nu_1^{(i)}t_1^{(i)} + \cdots \]
If we deduce from this equation the value of $z^{iv}$, we will have, in substituting it into equation (4),
\[ (5)\qquad \epsilon^{(i)} = t_2^{(i)}z''' + r_2^{(i)}z'' + q_2^{(i)}z' + p_2^{(i)}z - \omega_2^{(i)}, \]
by making
\[ t_2^{(i)} = t_1^{(i)} - \nu_1^{(i)}\frac{S\nu_1^{(i)}t_1^{(i)}}{S\nu_1^{(i)2}},\qquad r_2^{(i)} = r_1^{(i)} - \nu_1^{(i)}\frac{S\nu_1^{(i)}r_1^{(i)}}{S\nu_1^{(i)2}},\qquad \ldots \]
By multiplying further equation (5) by $t_2^{(i)}$, and reuniting the similar products relative to all the equations of condition represented by equation (5), by observing next that we have $St_2^{(i)}\epsilon^{(i)} = 0$, by virtue of the equations
\[ S\lambda^{(i)}\epsilon^{(i)} = 0,\qquad S\nu^{(i)}\epsilon^{(i)} = 0,\qquad St^{(i)}\epsilon^{(i)} = 0, \]
we will have an equation whence we will deduce the value of $z'''$, which, substituted into equation (5), will give³
\[ (6)\qquad \epsilon^{(i)} = r_3^{(i)}z'' + q_3^{(i)}z' + p_3^{(i)}z - \omega_3^{(i)}, \]

by making
\[ r_3^{(i)} = r_2^{(i)} - t_2^{(i)}\frac{St_2^{(i)}r_2^{(i)}}{St_2^{(i)2}},\qquad \cdots \]
By continuing thus, we arrive to an equation of the form
\[ (7)\qquad \epsilon^{(i)} = p_5^{(i)}z - \omega_5^{(i)}. \]
There results from n° 20 of the second Book of my Théorie analytique des probabilités⁴ that if the value of $z$ is determined by equation (7) and if $u$ is the error of this value, the probability of this error is
\[ \sqrt{\frac{sSp_5^{(i)2}}{2\pi\,S\epsilon^{(i)2}}}\; c^{-\frac{sSp_5^{(i)2}}{2S\epsilon^{(i)2}}u^2}; \]
we have therefore
\[ P = \frac{sSp_5^{(i)2}}{2S\epsilon^{(i)2}}. \]

³ Translator's note: The original lacks superscripts on p, q, r and t in equation (6) and the following displayed equation. These have been inserted.
⁴ TAP, page 318.
Now the question is to form the quantity $Sp_5^{(i)2}$. For this, I observe that the equations of
condition, represented by equation (2), give the following six equations, by multiplying
them first by their coefficient of z v and adding them, next by multiplying them by their
coefficient of z iv and adding them, and thus consecutively:

\[ (A)\qquad\left\{\begin{aligned}
\lambda\omega &= \lambda^{(2)}z^{v} + \lambda\nu\,z^{iv} + \lambda t\,z''' + \lambda r\,z'' + \lambda q\,z' + \lambda p\,z,\\
\nu\omega &= \lambda\nu\,z^{v} + \nu^{(2)}z^{iv} + \nu t\,z''' + \nu r\,z'' + \nu q\,z' + \nu p\,z,\\
t\omega &= \lambda t\,z^{v} + \nu t\,z^{iv} + t^{(2)}z''' + tr\,z'' + tq\,z' + tp\,z,\\
r\omega &= \lambda r\,z^{v} + r\nu\,z^{iv} + rt\,z''' + r^{(2)}z'' + rq\,z' + rp\,z,\\
q\omega &= \lambda q\,z^{v} + q\nu\,z^{iv} + qt\,z''' + qr\,z'' + q^{(2)}z' + qp\,z,\\
p\omega &= \lambda p\,z^{v} + p\nu\,z^{iv} + pt\,z''' + pr\,z'' + pq\,z' + p^{(2)}z.
\end{aligned}\right. \]
We must observe that, in these equations, we have
\[ \lambda^{(2)} = S\lambda^{(i)2},\qquad \lambda\nu = S\lambda^{(i)}\nu^{(i)},\qquad \ldots, \]
and thus of the rest.


We will form in the same manner the following five equations:
\[ (B)\qquad\left\{\begin{aligned}
\nu_1\omega_1 &= \nu_1^{(2)}z^{iv} + \nu_1 t_1\,z''' + \nu_1 r_1\,z'' + \nu_1 q_1\,z' + \nu_1 p_1\,z,\\
t_1\omega_1 &= t_1\nu_1\,z^{iv} + t_1^{(2)}z''' + t_1 r_1\,z'' + t_1 q_1\,z' + t_1 p_1\,z,\\
r_1\omega_1 &= r_1\nu_1\,z^{iv} + r_1 t_1\,z''' + r_1^{(2)}z'' + r_1 q_1\,z' + r_1 p_1\,z,\\
q_1\omega_1 &= q_1\nu_1\,z^{iv} + q_1 t_1\,z''' + q_1 r_1\,z'' + q_1^{(2)}z' + q_1 p_1\,z,\\
p_1\omega_1 &= p_1\nu_1\,z^{iv} + p_1 t_1\,z''' + p_1 r_1\,z'' + p_1 q_1\,z' + p_1^{(2)}z.
\end{aligned}\right. \]
We will have the values of $\nu_1^{(2)}$, $\nu_1 t_1$, . . ., by means of the coefficients of equations (A), by observing that
\[ \nu_1^{(2)} = \nu^{(2)} - \frac{(\lambda\nu)^2}{\lambda^{(2)}},\qquad \nu_1 t_1 = \nu t - \frac{\lambda\nu\cdot\lambda t}{\lambda^{(2)}},\qquad \nu_1 r_1 = \nu r - \frac{\lambda\nu\cdot\lambda r}{\lambda^{(2)}},\qquad \ldots, \]
\[ t_1^{(2)} = t^{(2)} - \frac{(\lambda t)^2}{\lambda^{(2)}},\qquad \ldots,\qquad \nu_1\omega_1 = \nu\omega - \frac{\lambda\nu\cdot\lambda\omega}{\lambda^{(2)}},\qquad \ldots \]
We will form in the same manner the following four equations:
\[ (C)\qquad\left\{\begin{aligned}
t_2\omega_2 &= t_2^{(2)}z''' + t_2 r_2\,z'' + t_2 q_2\,z' + t_2 p_2\,z,\\
r_2\omega_2 &= r_2 t_2\,z''' + r_2^{(2)}z'' + r_2 q_2\,z' + r_2 p_2\,z,\\
q_2\omega_2 &= q_2 t_2\,z''' + q_2 r_2\,z'' + q_2^{(2)}z' + q_2 p_2\,z,\\
p_2\omega_2 &= p_2 t_2\,z''' + p_2 r_2\,z'' + p_2 q_2\,z' + p_2^{(2)}z,
\end{aligned}\right. \]
whence we have
\[ t_2^{(2)} = t_1^{(2)} - \frac{(\nu_1 t_1)^2}{\nu_1^{(2)}},\qquad t_2 r_2 = t_1 r_1 - \frac{\nu_1 t_1\cdot\nu_1 r_1}{\nu_1^{(2)}},\qquad t_2\omega_2 = t_1\omega_1 - \frac{\nu_1 t_1\cdot\nu_1\omega_1}{\nu_1^{(2)}},\qquad \ldots \]

As we have no more here than four elements, we can apply to these equations the formulas of n° 1; but we can continue to eliminate and to form thus the value of $p_5^{(2)}$.
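The successive elimination just described amounts, in modern terms, to a symmetric Gaussian elimination of the final equations: the pivot left after the other five unknowns have been removed is $p_5^{(2)}$. The sketch below is hypothetical code (not in the original; names and data are illustrative) that performs this reduction and checks it against the closed form $A/B$ of n° 1.

```python
import numpy as np

def last_pivot(M):
    """Eliminate the unknowns one at a time: at each step remove the first
    remaining unknown and correct the other coefficients by
    x_new = x - (row term)(column term)/pivot; return the pivot that is left."""
    M = M.astype(float).copy()
    while M.shape[0] > 1:
        M = M[1:, 1:] - np.outer(M[1:, 0], M[0, 1:]) / M[0, 0]
    return M[0, 0]

rng = np.random.default_rng(1)
a = rng.normal(size=(40, 6))       # coefficients of six elements, z placed last
M = a.T @ a                        # matrix of the final equations

p5_2 = last_pivot(M)               # p_5^(2), the coefficient of z that remains
A_over_B = np.linalg.det(M) / np.linalg.det(M[:-1, :-1])
print(p5_2, A_over_B)              # the two values agree
```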

§ 2. In order to apply this method to an example, I take the following six equations:

129z v +46, 310z iv + 1, 1128z 000 + 1, 3371z 00 + 5722z 0 + 2602z=−1002, 900.


46, 310z v +21, 543z iv + 3, 6213z 000 + 1, 2484z 00 − 5459z 0 + 696, 13z= −343, 455.
1, 1128z v +3, 6213z iv +57, 1911z 000 − 3, 2252z 00 − 39749, 1z 0 − 1959, 0z= −40, 335.
1, 3371z v +1, 2484z iv − 3, 2252z 000 + 71, 8720z 00 − 153106, 5z 0 + 6788, 2z= 237, 782.
5722z v − 5459z iv −39749, 1z 000 −153106, 5z 00 +424865729z 0 −12729398z=−738297, 8.
2602z v +696, 13z iv − 1959, 0z 000 + 6788, 2z 00 − 12729398z 0 + 795938z= 7212, 6.

These equations are those at which Mr. Bouvard has arrived from 129 observations of Saturn, as many oppositions as quadratures, and from which he has concluded the corrections of the
elements of the movement of this planet. z v is the correction of the mean longitude,
in 1750; z iv is the secular correction of the mean movement; z 000 is the correction of
the equation of the center; z 00 is the product of the equation of the center with the cor-
rection of the perihelion; z 0 is the mass of Jupiter and z is that of Uranus. The second
decimal is unity.
By means of these equations, which are contained in the system (A), I have formed
the following five, contained in the system (B):

4, 9181z iv + 3, 2217z 000 + 0, 7684z 00 − 7513, 2z 0 − 237, 97z=· · · ,


3, 2217z iv +57, 1815z 000 − 3, 2367z 00 − 39798, 5z 0 − 1981, 4z=· · · ,
0, 7684z iv − 3, 2367z 000 + 71, 8581z 00 − 153165, 8z 0 + 6761, 2z=· · · ,
−7513, 2z iv −39798, 5z 000 −153165, 8z 00 +424611921z 0 −12844814z=· · · ,
−237, 97z iv − 1981, 4z 000 + 6761, 2z 00 − 12844814z 0 + 743454z=· · · ,

From these equations I have deduced the following four, contained in the system
(C):
55, 071z 000 − 3, 7401z 00 − 34876, 8z 0 − 1825, 5z=· · · ,
−3, 7401z 000 + 71, 7380z 00 − 151992, 0z 0 + 6798, 4z=· · · ,
−34876, 8z 000 −151992, 0z 00 +413134287z 0 −13208352z=· · · ,
−1825, 5z 000 + 6798, 4z 00 − 13208352z 0 + 731939z=· · · ,
These last equations have led me to the following three:

71, 4840z 00 − 154360, 6z 0 + 6674, 4z=· · · ,


−154360, 6z 00 +391046641z 0 −14364450z=· · · ,
6674, 4z 00 − 14364450z 0 + 671427z=· · · ,

Finally, I have deduced from this last system of equations the following two:
57724487z 0 +48067z=· · · ,
48067z 0 +48244z=· · · ,
I am myself stopped at this system, since it is easy to conclude from it the values of $P$, relative to the two elements $z'$ and $z$, which I wished particularly to know, and I have found by the formulas of n° 1, for $z'$,
\[ P = \frac{s}{2S\epsilon^{(i)2}}\left(57724487 - \frac{(48067)^2}{48244}\right), \]
and for $z$,
\[ P = \frac{s}{2S\epsilon^{(i)2}}\left(48244 - \frac{(48067)^2}{57724487}\right). \]
The number $s$ of observations is here 129, and Mr. Bouvard has found
\[ S\epsilon^{(i)2} = 31096; \]
we have therefore, for $z'$,
\[ \log P = 5,0778548, \]
and, for $z$,
\[ \log P = 1,9999383. \]
The mass of Jupiter is
\[ \frac{1}{1067,09}(1 + z'), \]
and Mr. Bouvard has found $z' = -0,00332$, that which gives the mass of Jupiter equal to $\frac{1}{1070,5}$.
The probability that the error of $z'$ is comprehended within the limits $\pm U$ equals
\[ \frac{\sqrt{P}}{\sqrt{\pi}}\int du\, c^{-Pu^2}, \]
the integral being taken within the limits $u = \pm U$. We find thus the probability that the error of the value of the mass of Jupiter, determined by Mr. Bouvard, is comprehended within the limits $\pm\frac{1}{150}$ of $\frac{1}{1067,09}$, equal to $\frac{900}{901}$, and the probability that this error is contained within the limits $\pm\frac{1}{100}$ of $\frac{1}{1067,09}$, equal to $\frac{999307}{999308}$. The mass of Uranus is
\[ \frac{1}{19504}(1 + z), \]
and Mr. Bouvard has found $z = 0,08848$, that which gives $\frac{1}{17918}$ for the mass of Uranus. The probability that the error of the mass of Uranus thus determined is contained within the limits $\pm\frac{1}{5}$ of $\frac{1}{19504}$, is $\frac{212,8}{213,8}$.
Relative to the mass of Saturn, Mr. Bouvard has supposed it, in his equations of condition of the movement of Jupiter in longitude, equal to
\[ \frac{1+z}{3534,08}, \]
and he has found $z = 0,00633$, that which gives $\frac{1}{3512}$ for the mass of Saturn. In applying my formulas to these equations of condition, I find
\[ \log P = 4,8851146. \]
The probability that the mass of Saturn thus determined is within the limits $\pm\frac{1}{100}$ of $\frac{1}{3534,08}$ equals $\frac{11170}{11171}$.
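In the formula just used, $c$ denotes the number whose hyperbolic logarithm is unity, so that $\frac{\sqrt{P}}{\sqrt{\pi}}\int_{-U}^{U}du\,c^{-Pu^2}$ is the error function of $\sqrt{P}\,U$. The following hypothetical check (not in the original; it assumes the logarithms quoted are common logarithms) recomputes the odds stated for the mass of Jupiter from $\log P = 5,0778548$.

```python
import math

# Limits of +/- 1/150 and +/- 1/100 of 1/1067,09 on the mass correspond to
# limits of +/- 1/150 and +/- 1/100 on the error u of z'.
P = 10 ** 5.0778548
for U, quoted in [(1 / 150, "900/901"), (1 / 100, "999307/999308")]:
    prob = math.erf(math.sqrt(P) * U)
    print(f"U = 1/{round(1 / U)}: probability {prob:.7f}  (text: {quoted})")
```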

§ 3. We apply again the formulas of probability to the observations of the pendulum in seconds.
In representing by $z'$ the length of the pendulum at the equator, by $p^{(i)}$ the square of the sine of latitude, and by $z$ its coefficient in the law of gravity, Mr. Mathieu has formed, by comparing to this law the thirty-seven observations of which I have spoken above, thirty-seven equations of condition of the form
\[ \epsilon^{(i)} = zp^{(i)} + z' - \omega^{(i)}. \]
In resolving them by the most advantageous method, he has deduced from them two final equations which have given to him the values of $z$ and $z'$, and he has deduced from them, for the expression of the length of the pendulum,
\[ (a)\qquad 1,0000043162 + 0,0055188\,p^{(i)}. \]

In this expression, the length of the pendulum is compared to none of our linear
measures, because the observations, such as Mr. Mathieu has considered them, are,
properly speaking, only those of the number of diurnal oscillations which one same
pendulum has made in the diverse places. It is necessary therefore, in order to have in
linear measures the length of the pendulum in decimal seconds, to compare this length
to these measures, in a given place. This is that which Borda has executed with a care
and an extreme precision, at the Observatory of Paris, where he has found this length
equal to 0m , 741887. Thence I have concluded, for the general expression of the length
of this pendulum,
\[ 0^{m},739505 + 0^{m},0040780\,p^{(i)}. \]
Now, in order to have the probability that the coefficient of $p^{(i)}$, or of the law of gravity, is contained within the given limits, it is necessary to know the values of $Sp^{(i)}$, $Sp^{(i)2}$ and $S\epsilon^{(i)2}$. Mr. Mathieu has found
\[ Sp^{(i)} = 14,255136,\qquad Sp^{(i)2} = 7,9569564,\qquad S\epsilon^{(i)2} = 0,00000093890182. \]

We have besides here $q^{(i)} = 1$; that which gives $Sq^{(i)} = s$, $s$ being the number of observations which, in the present case, is equal to thirty-seven. This put, I observe that if we name $u$ and $u'$ the simultaneous errors of the values of $z$ and $z'$, determined by the most advantageous method, the probability of these errors is, by n° 21 of the second Book of my Théorie analytique des probabilités,⁵ proportional to the exponential
\[ c^{-\frac{\left(Fu^2 + 2Guu' + Hu'^2\right)s}{E\cdot 2S\epsilon^{(i)2}}}, \]
and we have, by the same section,
\[ F = Sp^{(i)2}\left[sSp^{(i)2} - (Sp^{(i)})^2\right],\qquad G = Sp^{(i)}\left[sSp^{(i)2} - (Sp^{(i)})^2\right],\qquad H = s\left[sSp^{(i)2} - (Sp^{(i)})^2\right],\qquad E = sSp^{(i)2} - (Sp^{(i)})^2, \]
that which changes the preceding exponential into this
\[ c^{-\frac{s\left(u^2 Sp^{(i)2} + 2uu'Sp^{(i)} + su'^2\right)}{2S\epsilon^{(i)2}}}. \]

But, if we take for unity the length of the pendulum at the equator, it will be necessary to divide the formula (a) by its first term, and then it becomes quite nearly
\[ (b)\qquad 1 + 0,0055145\,p^{(i)}. \]
We see also that the error of this new coefficient of $p^{(i)}$ is $u - u'$; we will designate it by $t$, so that $u - u' = t$. In making moreover
\[ P = \frac{\left[sSp^{(i)2} - (Sp^{(i)})^2\right]s}{\left(Sp^{(i)2} + 2Sp^{(i)} + s\right)2S\epsilon^{(i)2}},\qquad t' = u - \frac{t\left(Sp^{(i)} + s\right)}{Sp^{(i)2} + 2Sp^{(i)} + s}, \]
the preceding exponential becomes
\[ c^{-Pt^2 - \frac{s\left(Sp^{(i)2} + 2Sp^{(i)} + s\right)t'^2}{2S\epsilon^{(i)2}}}. \]
By multiplying this exponential by $dt\,dt'$, by integrating it with respect to $t'$, from $t' = -\infty$ to $t' = \infty$, and relatively to $t$, within the given limits; finally, by dividing this double integral by the same double integral, taken relative to $t$ and $t'$ from $-\infty$ to $+\infty$, we will have the probability that the value of $t$ is comprehended within the given limits. The expression of this probability will be thus
\[ \frac{\sqrt{P}\int dt\, c^{-Pt^2}}{\sqrt{\pi}}. \]
⁵ TAP, page 327.

The preceding values of $s$, $Sp^{(i)}$, $Sp^{(i)2}$ and $S\epsilon^{(i)2}$ give
\[ \log P = 7,3884431. \]

By means of this value of log P , we can determine the probability that the true
coefficient of p(i) , in formula (b), is comprehended within some given limits. I find
thus that the probability that it is contained between 0,0050145 and 0,0060145 is $\frac{2127,1}{2128,1}$.
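As a numerical check, the weight of the coefficient of $p^{(i)}$ follows directly from the three sums quoted above. The sketch below is hypothetical code (not in the original) that reproduces $\log P = 7,3884431$ and the order of magnitude of the probability just stated.

```python
import math

s = 37                      # number of pendulum observations
Sp = 14.255136              # S p^(i)
Sp2 = 7.9569564             # S p^(i)2
Se2 = 0.00000093890182      # S eps^(i)2

# Weight of the coefficient of p^(i) in formula (b), as defined above.
P = (s * Sp2 - Sp ** 2) * s / ((Sp2 + 2 * Sp + s) * 2 * Se2)
print(math.log10(P))                    # ~ 7.388, as in the text

# Probability that the coefficient lies within +/- 0,0005 of its value.
prob = math.erf(math.sqrt(P) * 0.0005)
print(prob, 1 - prob)                   # the complement is of order 1/2000
```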

SUR
LE CALCUL DES PROBABILITÉS
APPLIQUÉ A LA PHILOSOPHIE NATURELLE

Pierre Simon Laplace∗


Connaissance des Temps for the year 1818 (1815) pp. 378–381.
OC 13 pp. 117–120.

I have given, in my Théorie analytique des probabilités and in that which preceded,
some general formulas in order to have the probability that the errors of the results ob-
tained by the whole of a great number of observations, and determined by the most
advantageous method, are comprehended within some given limits. The advantage of
these formulas, is to be independent of the law of probability of the errors of the ob-
servations, a law always unknown, and which does not permit to reduce into numbers,
the expressions which contain it. I am happy to succeed to eliminate from my for-
00
mulas, the factor 2kk a2 s, which depends on this law, by observing that the number s
of observations being quite great, this factor is very probably equal to the sum of the
squares of the errors of the observations, and that this sum is very probably the sum
of the squares of the residuals of the equations of condition, when we have substituted
the elements determined by the most advantageous method. I suppose that we have
before the eyes §§ 19, 20 and 21 of the second Book of my Théorie analytique des
probabilités. The importance of these formulas in natural philosophy, requires that the
uncertainty that they can leave is dissipated, and the only one which remains yet, is
relative to the equalities of which I just spoke. I myself propose here, to clarify this
delicate point of the theory of probabilities, and to show that these equalities can be
employed without sensible errors.
The sum of the squares of the errors of the observations being supposed equal to $\frac{2k''}{k}a^2 s + a^2 r\sqrt{s}$, the probability that the value of $r$ is comprehended within the given
limits is, by the § 19 cited, Z
1 β 0 2 r2
√ β 0 dr c− 2 ,

the integral being taken within the given limits. Let us represent the general equation
of condition of the elements z, z 0 , etc. by this one,

\[ \epsilon^{(i)} = p^{(i)}z + q^{(i)}z' + \text{etc.} - \alpha^{(i)}, \]


∗ Translated by Richard J. Pulskamp, Department of Mathematics & Computer Science, Xavier Univer-

sity, Cincinnati, OH. June 20, 2012

$\epsilon^{(i)}$ being the error of the observation. The elements $z$, $z'$, etc. being determined by the most advantageous method, let us designate by $u$, $u'$, etc. their errors; we will have, by naming $\epsilon'^{(i)}$ the rest of the function
\[ p^{(i)}z + q^{(i)}z' + \text{etc.} - \alpha^{(i)}, \]
when we have substituted for $z$, $z'$, etc. their values thus determined,
\[ \epsilon^{(i)} = \epsilon'^{(i)} + p^{(i)}u + q^{(i)}u' + \text{etc.}, \]
that which gives
\[ S\epsilon^{(i)2} = S\epsilon'^{(i)2} + 2S\epsilon'^{(i)}\left(p^{(i)}u + q^{(i)}u' + \text{etc.}\right) + S\left(p^{(i)}u + q^{(i)}u' + \text{etc.}\right)^2, \]

the integral sign S extending to all the values of i, from i = 0 to i = s − 1. But by the
conditions of the most advantageous method, we have Sp(i) 0(i) = 0, Sq (i) 0(i) = 0,
etc.; we have therefore
2 2
S(i) = S0(i) + S(p(i) u + q (i) u0 + etc.)2 .
2
By comparing this value of S(i) to its preceding value

2k 00 2 √
a s + a2 r s,
k
we will have
√ 2 2k 00 2
a2 r s = S0(i) − a s + S(p(i) u + q (i) u0 + etc.)2 .
k
Let us make
2k 00 2
2 √
S0(i) − a s = t s,
k
ν ν0
u = √ , u0 = √ , etc.,
s s
we will have
S(p(i) ν + q (i) ν 0 + etc.)2
a2 r = t + √ ;
s s
β 02 r 2
the exponential c− 4 becomes thus
 2
β0 2 S(p(i) ν+q (i) ν 0 +etc.)2
− 4a 2 t+

s s
c ;

thus the probability of t is proportional to this exponential.


The probability of the simultaneous existence of the quantities u, u0 , etc. is, by
§ 21 of the second Book of the Théorie analytique des probabilités, proportional to the
exponential
k (i) (i) 0 2
c− 4k00 a2 s S(p ν+q ν +etc.) ,

the probability of the simultaneous existence of t, ν, ν 0 , etc. is therefore proportional
to  2
02 S(p(i) ν+q (i) ν 0 +etc.)2
β
− 4a 2 t+

s s
− 4k00ka2 s S(p(i) ν+q (i) ν 0 +etc.)2
c .
00 2 2 √
By substituting for 4k ka s its value 2S0(i) − 2t s, this exponential is reduced, by
neglecting the terms of order 1s , to the following function:
 √ 
t s (i) 2
1− S(p ν + etc.) .
2(S0(i)2 )2
 2
β0 2 S(p(i) ν+q (i) ν 0 +etc.)2 S(p(i) ν+q (i) ν 0 +etc.)2
− 4a 4 t+

s s
− 0(i)2
c 2S

Now, in order to have the probability that the value of ν is comprehended within
some given limits, it is necessary: 1 ˚ to multiply this function by dt dν dν 0 etc.; 2 ˚
to take the integral of the product, for all the possible values of t, ν 0 , etc. and, with
respect to ν, to integrate only within the given limits; 3 ˚ to divide the whole by this
same integral, taken with respect to all the possible values of t, ν, ν 0 , etc. The unknown
00 2
value of 2k ka s being able to vary from zero to infinity, the value of t is able to vary
0(i)2
from S√s to negative infinity, and as S0(i)2 is of the order of s, t is able to vary from

negative infinity to a positive value of order s; the preceding exponential will become
2
therefore, at the extremity of the integral taken with respect to t, of the form c−Q s ,
and will be able to be neglected, because of the magnitude supposed to s; thus we can
take the integral relative to t, from t = −∞ to t = ∞. Similarly the integrals relative
to ν, ν 0 , etc. can be taken within the same limits. If we make

S(p(i) ν + q (i) ν 0 + etc.)2


t+ √ = z;
s s

the integral relative to z will be able to be taken with respect to z, from z = −∞ to


z = ∞.
Thence it is easy to conclude that the probability that ν is contained within the
given limits, is equal to the integral
( 2 )
S(p ν + q (i) ν 0 + etc.)2
 (i)
S(p(i) ν+q (i) ν 0 +etc.)2
Z
0 −
dνdν etc.c 2S0(i)2
1+ ,
(2S0(i)2 )2 s

the integral being taken from ν 0 , ν 00 , etc. equal to −∞, to their infinite values and, with
respect to ν, within the given limits, and being divided by the same integral extended
to the positive and negative infinite values of ν, ν 0 , ν 00 , etc.
The consideration of the difference between $\frac{2k''}{k}a^2 s$ and $S\epsilon'^{(i)2}$ introduces therefore, into the expression of the probability of which there is concern, only a term of order $\frac{1}{s}$, an order that I myself am permitted to neglect in the work cited, seeing the magnitude supposed to $s$.
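The argument can be illustrated numerically: the residual mean square fluctuates about the true mean square of the errors only by a relative amount of order $\frac{1}{\sqrt{s}}$, which is why its use affects the final probability only at order $\frac{1}{s}$. The following Monte Carlo sketch is hypothetical (not in the original; the distributions and numbers are invented for the illustration).

```python
import numpy as np

rng = np.random.default_rng(2)
true_ms = 1.0                         # true mean square of the observation errors
for s in (50, 500, 5000):
    diffs = []
    for _ in range(200):
        a = rng.normal(size=(s, 2))               # coefficients of two elements
        w = a @ np.array([0.3, -1.2]) + rng.normal(scale=1.0, size=s)
        z = np.linalg.lstsq(a, w, rcond=None)[0]  # most advantageous method
        resid = w - a @ z
        diffs.append(abs(resid @ resid / s - true_ms) / true_ms)
    print(s, np.mean(diffs))          # shrinks roughly like 1/sqrt(s)
```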

PREMIER SUPPLEMENT.
SUR L’APPLICATION DU CALCUL DES PROBABILITÉS A LA
PHILOSOPHIE NATURELLE

Pierre Simon Laplace∗


1816. OC 7 pp. 497–530

The phenomena of nature are most often enveloped in so many strange circum- [497]
stances, so great a number of perturbing causes mix their influence, that it is very
difficult to recognize them. We are able to arrive there only by multiplying the obser-
vations or experiments, so that the strange effects coming to be destroyed reciprocally,
the mean results set into evidence these phenomena and their diverse elements. The
more the observations are numerous and the less they deviate among themselves, the
more their results approach to the truth. We fulfill this last condition by the choice of
methods, by the precision of the instruments and by the care that we take to observe
well. Next we determine through the theory of probabilities the most advantageous
mean results or those which lay us less open to error. But this does not suffice; it is
more necessary to estimate the probability that the errors of these results are compre-
hended within some given limits. Without this, we have only an imperfect knowledge
of the degree of exactitude obtained. Some formulas proper to this object are therefore
a true perfection of the method of the sciences, and so it is quite important to add to
this method. The analysis that they require is the most delicate and the most difficult
of the theory of probabilities. This is one of the things that I have had principally in
view in my Work, in which I am arrived to some formulas of this kind, which have the
remarkable advantage to be independent of the law of probabilities of the errors and to
contain only quantities given by the observations themselves and by their expressions. [498]
I am going to recall here the principles.
Each observation has for analytic expression a function of the elements that we
wish to determine; and, if these elements are nearly known, this function becomes a
linear function of their corrections. By equating it to the same observation, we form
that which we name the equation of condition. If we have a great number of similar
equations, we combine them, in a manner to obtain as many final equations as there
are elements of which we determine next the corrections, by resolving these equations.
But what is the most advantageous manner to combine the equations of condition in
order to obtain the final equations? What is the law of the errors of which the elements
that we deduce from it are yet susceptible? It is this that the theory of probabilities
makes known. The formation of a final equation, by means of the equations of con-
∗ Translated by Richard J. Pulskamp, Department of Mathematics & Computer Science, Xavier Univer-

sity, Cincinnati, OH. January 6, 2014

dition, revert to multiplying each of these by an indeterminate factor and to reunite
these products; but it is necessary to choose the system of factors which give the least
error to fear. Now it is clear that, if we multiply the possible errors of an element by
their respective probabilities, the most advantageous system will be the one in which
the sum of these products, all taken positively, is a minimum; because a positive or
negative error must be considered as a loss. By forming therefore this sum of products,
the condition of the minimum will determine the system of factors that it is necessary
to choose. We find thus that this system is the one of the coefficients of the elements in
each equation of condition, so that we form a first final equation by multiplying respec-
tively each equation of condition by its coefficient of the first element and by reuniting
all these equations thus multiplied. We form a second final equation by employing
likewise the coefficients of the second element, and thus consecutively. In this manner,
the elements and the laws of the phenomena, contained in the compilation of a great
number of observations, are developed with the most evidence. I have given, in § 21
of Book II of my Théorie analytique des Probabilités, the expression of the mean error
to fear respecting each element. This expression gives the probability of the errors of [499]
which the element is further susceptible, and which is proportional to the number of
which the hyperbolic logarithm is unity, raised to a power equal to the square of the
error taken to less and divided by the square of the double of this expression and by
the ratio of the circumference to the diameter. The coefficient of the negative square
of the error in this exponent is able therefore to be considered as the modulus of the
probability of the errors, since the error remaining the same, the probability decreases
with rapidity when it increases, so that the result obtained weighs, if I am able to say
so, toward the truth, so much more as this modulus is greater. I will name, for this
reason, this modulus weight of the result. By a remarkable analogy of these weights
with those of bodies compared to their common center of gravity, it happens that, if
one same element is given by diverse systems, each composed of a great number of
observations, the most advantageous mean result of them altogether is the sum of the
products of each partial result by its weight, this sum being divided by the sum of all
the weights. Moreover the total weight of the diverse systems is the sum of their partial
weights, so that the probability of the mean result of them altogether is proportional to
the number which has unity for hyperbolic logarithm, raised to a power equal to the
square of the error, taken to less and multiplied by the sum of all the weights. Each
weight depends, in truth, on the law of probability of the errors in each system, and
nearly always this law is unknown; but I am happily arrived to eliminate the factor
which contains it, by means of the sum of the squares of the deviations of the observa-
tions of the system, from their mean result. It would be therefore to desire, in order to
complete our understandings on the results obtained by the totality of a great number
of observations, that we wrote beside each result the weight which corresponds to it.
In order to facilitate the calculation of these weights, I develop its analytic expression,
when we have no more than three elements to determine. But, this expression becom-
ing more and more complicated in measure as the number of elements increase, I give
a quite simple way in order to determine the weight of a result, whatever be the number [500]
of elements. When we have obtained thus the exponential which represents the law
of probability of the errors, we will have the probability that the error of the result is
comprehended within some given limits, by taking, within these limits, the integral of the product of this exponential by the differential of the error and by multiplying it by
the square root of the weight of the result divided by the circumference of which the
diameter is unity. Thence it follows that, for one same probability, the errors of the
results are reciprocals to the square roots of their weights, that which is able to serve to
compare their respective precision.
In order to apply this method with success, it is necessary to vary the circumstances
of the observations or of the experiments, in a manner to avoid the constant causes of
the error. It is necessary that the observations be numerous and that they be so many
more as there are more elements to determine: because the weight of the mean result
increases as the number of the observations divided by the number of elements. It is
further necessary that the elements follow, in these observations, a different march; be-
cause, if the march of the two elements were rigorously the same, that which would
render their coefficients proportionals in the equations of condition, these elements
would form only a single unknown, and it would be impossible to distinguish them by
these observations. Finally it is necessary that the observations be precise. This con-
dition, the first of all, increases much the weight of the result, of which the expression
has for divisor the sum of the squares of their deviations from this result. With these
precautions, we will be able to make use of the preceding method and to measure the
degree of confidence which the results deduced from a great number of observations
merit.
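The rule just stated (multiply each equation of condition by its coefficient of the element considered and reunite the products, once for each element) is what is now called forming the normal equations of least squares. A minimal hypothetical sketch (not part of the original; the numbers are invented):

```python
import numpy as np

# Equations of condition eps_i = a_i . x - b_i for two elements.
a = np.array([[1.0, 0.2], [1.0, 0.5], [1.0, 0.9], [1.0, 1.3]])
b = np.array([0.21, 0.48, 0.95, 1.28])

# One final equation per element: multiply each equation of condition by its
# coefficient of that element and sum.
final_matrix = a.T @ a
final_rhs = a.T @ b
print(np.linalg.solve(final_matrix, final_rhs))  # the corrections of the elements
```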

§ 1. A great advantage of this method, which permits evaluating from it the expres-
sions numerically, is, as we have said, to be independent of the law of probability of
the errors of the observations. The factor $\frac{2k''}{k}a^2 s$, which depends on this law, has been
eliminated from the formulas of §§ 19 and 21 of Book II, by observing that this factor
which is the sum of the squares of all the possible errors of the observations, multi- [501]
plied by their respective probabilities, and which expresses thus the true mean of these
squares, is very probably equal to the sum of the squares of the rest of the equations
of condition, when we have substituted the elements determined by the most advanta-
geous method. The importance of this method in natural philosophy requires that the
uncertainty that it is able to permit is dissipated, and the only one which remains yet is
relative to the equality of which I just spoke. I will first clarify this delicate point of the
theory of the probabilities and show that the preceding equality is able to be employed
without sensible error.
The sum of the squares of the errors of the observations, of which the number is $s$, being supposed equal to $\frac{2k''}{k}a^2 s + a^2 r\sqrt{s}$, the probability that the value of $r$ is comprehended within the given limits is, by § 19 cited,
\[ \frac{1}{2\sqrt{\pi}}\int \beta'\,dr\, c^{-\frac{\beta'^2 r^2}{4}}, \]
the integral being taken within these limits. Let us represent the general equation of condition of the elements $z$, $z'$, . . . by this one
\[ \epsilon^{(i)} = p^{(i)}z + q^{(i)}z' + \cdots - \alpha^{(i)}, \]
$\epsilon^{(i)}$ being the error of the observation. The elements $z$, $z'$, . . . being determined by the most advantageous method, let us designate by $u$, $u'$, . . . their errors; we will have, by naming $\epsilon'^{(i)}$ the remainder of the function
\[ p^{(i)}z + q^{(i)}z' + \cdots - \alpha^{(i)} \]
when we have substituted for $z$, $z'$, . . . their values thus determined,
\[ \epsilon^{(i)} = \epsilon'^{(i)} + p^{(i)}u + q^{(i)}u' + \cdots, \]
that which gives
\[ S\epsilon^{(i)2} = S\epsilon'^{(i)2} + 2S\epsilon'^{(i)}\left(p^{(i)}u + q^{(i)}u' + \cdots\right) + S\left(p^{(i)}u + q^{(i)}u' + \cdots\right)^2, \]
the integral sign $S$ being extended to all the values of $i$, from $i = 0$ to $i = s - 1$. But, by the conditions of the most advantageous method, we have [502]
\[ Sp^{(i)}\epsilon'^{(i)} = 0,\qquad Sq^{(i)}\epsilon'^{(i)} = 0,\qquad \ldots; \]
we have therefore
\[ S\epsilon^{(i)2} = S\epsilon'^{(i)2} + S\left(p^{(i)}u + q^{(i)}u' + \cdots\right)^2; \]
by comparing this value of $S\epsilon^{(i)2}$ to its preceding value $\frac{2k''}{k}a^2 s + a^2 r\sqrt{s}$, we will have
\[ a^2 r\sqrt{s} = S\epsilon'^{(i)2} - \frac{2k''}{k}a^2 s + S\left(p^{(i)}u + q^{(i)}u' + \cdots\right)^2. \]
Let us make
\[ S\epsilon'^{(i)2} - \frac{2k''}{k}a^2 s = t\sqrt{s},\qquad u = \frac{\nu}{\sqrt{s}},\quad u' = \frac{\nu'}{\sqrt{s}},\quad u'' = \frac{\nu''}{\sqrt{s}},\quad \ldots, \]
we will have
\[ a^2 r = t + \frac{S\left(p^{(i)}\nu + q^{(i)}\nu' + \cdots\right)^2}{s\sqrt{s}}; \]
the exponential $c^{-\frac{\beta'^2 r^2}{4}}$ becomes thus
\[ c^{-\frac{\beta'^2}{4a^4}\left(t + \frac{S\left(p^{(i)}\nu + q^{(i)}\nu' + \cdots\right)^2}{s\sqrt{s}}\right)^2}; \]

thus the probability of t is proportional to this exponential.


The analysis of § 21 of Book II leads to this general theorem, namely that the
probability of the simultaneous existence of the quantities u, u0 , u00 , . . . is proportional
to the exponential
k (i) (i) 0 2
c− 4k00 a2 s S(p ν+q ν +··· ) ;
the probability of the simultaneous existence of t, ν, ν 0 , ν 00 , . . . is therefore propor-
tional to  2
02 S(p(i) ν+q (i) ν 0 +··· )2
β
− 4a 4 t+

s s
− 4k00ka2 s S(p(i) ν+q (i) ν 0 +··· )2
c .

00 2 √
By substituting for 4k ka s its value 2S0(i) − 2t s, this exponential is reduced, by [503]
neglecting the terms of order 1s , in the following function:
 √  β02  S(p(i) ν+q(i) ν 0 +··· )2 2 S(p(i) ν+q(i) ν 0 +··· )2
t s (i) (i) 0 2
− 2 t+ √
s s

2S0(i)
2
1− 0(i)2 2
S(p ν + q ν + · · · ) c 4a
2(S )

Now, in order to have the probability that the value of ν is comprehended within some
given limits, it is necessary: 1◦ to multiply this function by dt dν dν 0 . . .; 2◦ to take the
integral of the product for all the possible values of t, ν 0 , ν 00 , . . . and, with respect to
ν, to integrate only within the given limits; 3◦ to divide the whole by this same integral
taken with respect to all the possible values of t, ν, ν 0 , . . . By regarding S0(i)2 as a
00 2
datum from observation, t varies only at the rate of the unknown value of 2k ka s , and
0(i)2
S√
this value is able to vary from zero to infinity; t is therefore able to vary from s
to negative infinity; and, as S0(i)2 is√of the order of s, t is able to vary from negative
infinity to a positive value of order s. The preceding exponential becomes, at that
2
limit of the integral taken with respect to t, of the form c−Q s , and will be able to be
supposed null, because of the magnitude of s. Thus we are able to take the integral
relative to t, from t = −∞ to t = ∞. Similarly the integrals relative to ν 0 ν 00 , . . . are
able to be taken within the same limits. If we make

S(p(i) ν + q (i) ν 0 + · · · )2
t+ √ = t0 ,
s s

the integral relative to t0 will be able to be taken with respect to t0 from t0 = −∞ to


t0 = ∞.
Thence it is easy to conclude that the probability that ν is comprehended within the
given limits is proportional to the integral
( 2 )
S(p ν + q (i) ν 0 + · · · )2
 (i)
S(p(i) ν+q (i) ν 0 +··· )2
Z
0 −
dνdν . . . 1 + c 2S(i)2 ,
(2S0(i)2 )2 s

the integral being taken from ν 0 , ν 00 , . . . equal to −∞ to their positive infinite values [504]
and with respect to ν within the given limits, and being divided by the same integral
extended to the positive and negative infinite values of ν, ν 0 , ν 00 , . . .
00
The consideration of the difference which is able to exist between 2kk a2 s and
S0(i)2 introduces therefore into the expression of the probability of which there is
concern only one term of order 1s , an order that I myself am permitted to neglect in my
Work. Thence, the preceding integral becomes
S(p(i) ν+q (i) ν 0 +...)2
Z

dνdν 0 . . . c 2S0(i)2 .

If we make
(i) q (i) Sp(i) q (i)
p1 = p(i) − ,
Sq (i)2
(i) q (i) Sr(i) q (i)
r1 = r(i) − ,
Sq (i)2
(i) q (i) St(i) q (i)
t1 = t(i) − ,
Sq (i)2
·····················
the exponential
S(p(i) ν+q (i) ν 0 +r (i) ν 00 ...)2

c 2S0(i)2

will be able to be set under this form


(i) (i) 2
S(p1 ν+r1 ν 00 ...)2

Sq (i)2 (i) q (i) +ν 00 Sr (i) q (i) +···
− − ν 0 + νSp .
2S0(i)2 2S0(i)2 Sq (i)2
c

By multiplying this quantity by dν 0 , and by integrating it from ν 0 = −∞ to ν 0 = ∞,


we will have a quantity proportional to
(i) (i)
S(p1 ν+r1 ν 00 ...)2

c 2S0(i)2

and in which the variable ν 0 has disappeared. By following the same process, we will
make the variables $\nu''$, $\nu'''$, . . . vanish. We will arrive thus to an exponential of the form
\[ c^{-\frac{\nu^2 Sp_{n-1}^{(i)2}}{2S\epsilon'^{(i)2}}}, \]
$n$ being the number of elements. If we restore, instead of $\nu$, its value $u\sqrt{s}$, this exponential becomes [505]
\[ c^{-Pu^2}, \]
by making
\[ P = \frac{sSp_{n-1}^{(i)2}}{2S\epsilon'^{(i)2}}. \]
$u$ being the error of the value of $z$, $P$ is that which I name the weight of this value. The probability that this error is comprehended within some given limits is therefore
\[ \frac{\int du\,\sqrt{P}\,c^{-Pu^2}}{\sqrt{\pi}}, \]
the integral being taken within these limits, and π being the circumference of which
the diameter is unity. But it is simpler to apply the process of which we have just made
use to the final equations which determine the elements, in order to reduce them to one
alone, that which gives an easy method to resolve these equations.
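The successive disappearance of $\nu'$, $\nu''$, . . . described above can be checked numerically: integrating the exponential over $\nu'$ leaves an exponential in $\nu$ whose coefficient is built from $p_1^{(i)} = p^{(i)} - q^{(i)}\frac{Sp^{(i)}q^{(i)}}{Sq^{(i)2}}$. The sketch below is hypothetical code (not in the original; the data and the value standing for $S\epsilon'^{(i)2}$ are invented).

```python
import numpy as np

rng = np.random.default_rng(3)
p, q = rng.normal(size=12), rng.normal(size=12)
Se2 = 0.7                                   # stands for S eps'^(i)2
p1 = p - q * (p @ q) / (q @ q)

def marginal(nu):
    """Integrate exp(-S(p nu + q nu')^2 / (2 Se2)) over nu' on a wide grid."""
    nu_p = np.linspace(-30.0, 30.0, 20001)
    quad = ((p * nu) ** 2).sum() + 2 * nu * (p @ q) * nu_p + (q @ q) * nu_p ** 2
    return np.exp(-quad / (2 * Se2)).sum() * (nu_p[1] - nu_p[0])

for nu in (0.5, 1.0, 2.0):
    print(nu, marginal(nu) / np.exp(-(p1 @ p1) * nu ** 2 / (2 * Se2)))
    # the ratio is the same constant for every nu
```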

§ 2. Let us take the general equation of condition, and, for more simplicity, let us
limit it to the six elements z, z 0 , z 00 , z 000 , z iv , z v ; it becomes then

\[ (1)\qquad \epsilon^{(i)} = p^{(i)}z + q^{(i)}z' + r^{(i)}z'' + t^{(i)}z''' + \gamma^{(i)}z^{iv} + \lambda^{(i)}z^{v} - \alpha^{(i)}. \]

By multiplying it by λ(i) and reuniting all the products together, we will have
Sλ(i) (i) = zSλ(i) p(i) + z 0 Sλ(i) q (i) + · · · − Sλ(i) α(i) ,
the integral sign S extending to all the values of i, from i = 0 to i = s − 1, s being
the number of observations employed. By the conditions of the most advantageous
method, we have Sλ(i) (i) = 0; the preceding equation will give therefore
Sλ(i) γ (i) (i) (i)
000 Sλ t
(i) (i)
00 Sλ r
z v = −z iv − z − z
Sλ(i)2 Sλ(i)2 Sλ(i)2
(i) (i) (i) (i) (i) (i)
Sλ q Sλ p Sλ α
−z 0 −z +
Sλ(i)2 Sλ(i)2 Sλ(i)2
If we substitute this value into equation (1) and if we make [506]

(i) Sλ(i) γ (i)


γ1 = γ (i) − λ(i) ,
Sλ(i)2
(i) Sλ(i) t(i)
t1 = t(i) − λ(i) ,
Sλ(i)2
(i) Sλ(i) r(i)
r1 = r(i) − λ(i) ,
Sλ(i)2
(i) Sλ(i) q (i)
q1 = q (i) − λ(i) ,
Sλ(i)2
(i) Sλ(i) p(i)
p1 = p(i) − λ(i) ,
Sλ(i)2
(i) Sλ(i) α(i)
α1 = α(i) − λ(i) ,
Sλ(i)2
we will have
(i) (i) (i) (i) (i) (i)
(2) (i) = p1 z + q1 z 0 + r1 z 00 + t1 z 000 + γ1 z iv − α1 ;
by this means, the element z v has disappeared from the equations of condition that
(i)
equation (2) represents. By multiplying this equation by γ1 and reuniting all the
products together, by observing next that we have
(i)
Sγ1 (i) = 0
by virtue of the equations
0 = Sλ(i) (i) , 0 = Sγ (i) (i)
that the conditions of the most advantageous method give, we will have
(i) (i) (i) (i) (i) (i) (i) (i) (i)2 (i) (i)
0 = zSγ1 p1 + z 0 Sγ1 q1 + z 00 Sγ1 r1 + z 000 Sγ1 t1 + z iv Sγ1 − Sγ1 α1 ;
whence we deduce
(i) (i) (i) (i) (i) (i) (i) (i) (i) (i)
Sγ1 t1 Sγ1 r1 Sγ1 q1 Sγ1 p1 Sγ1 α1
z iv = −z 000 (i)2
− z 00 (i)2
− z0 (i)2
−z (i)2
+ (i)2
.
Sγ1 Sγ1 Sγ1 Sγ1 Sγ1

If we substitute this value into equation (2) and if we make [507]
(i) (i)
(i) (i) (i) Sγ1 t
t 2 = t 1 − γ1 (i)2
,
Sγ1
(i) (i)
(i) (i) (i) Sγ1 r
r2 = r1 − γ1 (i)2
,
Sγ1
(i) (i)
(i) (i) (i) Sγ1 q1
q2 = q1 − γ1 (i)2
,
Sγ1
(i) (i)
(i) (i) (i) Sγ1 p1
p2 = p1 − γ1 (i)2
,
Sγ1
(i) (i)
(i) (i) (i) Sγ1 α1
α2 = α1 − γ1 (i)2
,
Sγ1
we will have
(i) (i) (i) (i) (i)
(3) (i) = p2 z + q2 z 0 + r2 z 00 + t2 z 000 − α2 .

By continuing thus, we will arrive to an equation of the form


(i) (i)
(4) (i) = p5 z − α5 .

There results from § 20 of Book II that, if the value of z is determined by this equation
and if u is the error of this value, the probability of this error is
s
(i)2 (i)2
sSp5 sSp5
− 0(i)2 u2
0(i)2
c 2S ,
2S π

S0(i)2 being the sum of the squares of the remainders of the equations of condition,
when we have substituted there the elements determined by the most advantageous
(i)2
sSp5
method. The weight P of this error is therefore equal to 2S0(i)2
.
(i)2
The concern now is to determine Sp5 . For this, we will multiply respectively
each of the equations of condition represented by equation (1) first by the coefficient
of the first element, and we will take the sum of these products; next by the coefficient
of the second element, and we will take the sum of these products, and thus of the rest.
We will have, by observing that by the conditions of the most advantageous method [508]
Sp(i) (i) = 0, Sq (i) (i) = 0, . . . , the six equations following:

pα=p(2) z+ pqz 0 + prz 00 + ptz 000 + pγz iv + pλz v ,




qα = pqz +q (2) z 0 + qrz 00 + qtz 000 + qγz iv + qλz v ,




rα = rpz + rqz 0 +r(2) z 00 + rtz 000 + rγz iv + rλz v ,


(A)

 tα = tpz + tqz 0 + trz 00 +t(2) z 000 + tγz iv + tλz v ,
γα= γpz + γqz 0 + γrz 00 + γtz 000 +γ (2) z iv + γλz v ,




λα= λpz + λqz 0 + λrz 00 + λtz 000 + λγz iv +λ(2) z v ,

whence we must observe that we suppose
p(2) = Sp(i)2 , pq = Sp(i) q (i) , q (2) = Sq (i)2 , qr = Sq (i) r(i) , ...
If we multiply similarly the equations of condition represented by equation (2) re-
spectively by the coefficients of z and if we add these products, next by the coefficients
of z 0 by adding again these products, and thus in succession, we will have the follow-
(i) (i)
ing system of equations, by observing that Sp1 (i) = 0, Sq1 (i) = 0, . . ., by the
conditions of the most advantageous method.
(2)
p1 α1 = p1 z +p1 q1 z 0 +p1 r1 z 00 +p1 t1 z 000 +p1 γ 1 z iv ,



 (2) 0 00 000 iv
 q1 α1 =p1 q1 z+ q1 z +q1 r1 z +q1 t1 z +q1 γ1 z ,


0 (2) 00 000
(B) r1 α1 =p1 r1 z+q1 r1 z + r1 z +r1 t1 z +r1 γ1 z iv ,
(2)
 t1 α1 =p1 t1 z +q1 t1 z 0 +r1 t1 z 00 + t1 z 000 +t1 γ 1 z iv ,




(2)
γ1 α1 =p1 γ1 z+q1 γ1 z 0 +r1 γ1 z 00 +t1 γ1 z 000 + γ1 z iv ,

whence we must observe that


(i) (i) (2) (i)2
p1 q1 = Sp1 q1 , p1 = Sp1 , ...
(i) (i)
By substituting, instead of p1 , q1 , . . ., their preceding values, we have
(i) (i) Sλ(i) p(i) Sλ(i) q (i)
p1 q1 = Sp1 q1 −
Sλ(i)2
or
λp λq
p1 q1 = pq − ;
λ(2)
we have similarly [509]
2
(2) λp
p1 = p(2) − (2) ,
λ
2
(2) λq
q1 = q (2) − (2) ,
λ
λp λr
p1 r1 = pr − (2) ,
λ
············
λp λα
p1 α1 = pα − ,
λ(2)
············
Thus the coefficients of the system of equations (B) are deduced easily from the coef-
ficients of the system of equations (A).
The equations of condition represented by equation (3) will give similarly the fol-
lowing system of equations
(2)


 p2 α2 = p2 z +p2 q2 z 0 +p2 r2 z 00 +p2 t2 z 000 ,
(2)
q2 α2 =p2 q2 z+ q2 z 0 +q2 r2 z 00 +q2 t2 z 000 ,


(C) (2)

 r2 α2 =p2 r2 z+q2 r2 z 0 + r2 z 00 +r2 t2 z 000 ,
(2)

t2 α2 =p2 t2 z+q2 t2 z 0 +r2 t2 z 00 + t2 z 000 ,

and we have
(2) (2) γ1 p 1 2
p2 = p1 − (2)
,
γ1
γ1 p1 q1 γ1
p2 q2 = p1 q1 − (2)
,
γ1
············
γ1 p1 γ1 α1
p2 α2 = p1 α1 − (2)
,
γ1
············
We will have similarly the system of equations

(2) 0 00
 p3 α3 = p3 z +p3 q3 z +p3 r3 z ,

(2) 0
(D) q3 α3 =p3 q3 z+ q3 z +q3 r3 z 00 ,
(2)
r3 α3 =p3 r3 z+q3 r3 z 0 + r3 z 00 ,

by making [510]
2
(2) (2) p2 t2
p3 = p2 − (2)
,
t2
p2 t2 q21 t2
p3 q3 = p2 q2 − (2)
,
t2
t2 p2 t2 α2
p3 α3 = p2 α2 − (2)
,
t2
··············· ;
we will have further
(
(2)
p4 α4 = p4 z +p4 q4 z 0 ,
(E) (2)
q4 α4 =p4 q4 z+ q4 z 0 ,

by making
(2) (2) p3 r3 2
p4 = p3 − (2)
,
r3
p3 r3 q3 r3
p4 q4 = p3 q3 − (2)
,
r3
p3 r3 α3 r3
p4 α4 = p3 α3 − (2)
,
r3
···············
Finally we will have
(2)
(F) p5 α5 = p5 z,

by making
(2) (2) p4 q4 2 p4 q4 q4 α4
p5 = p4 − (2)
, p5 α5 = p4 α4 − (2)
;
q4 q4
$p_5^{(2)}$ is the value of $Sp_5^{(i)2}$, and the weight $P$ will be
\[ \frac{s\,p_5^{(2)}}{2S\epsilon'^{(i)2}}. \]
We see by the sequence of the values of $p^{(2)}$, $p_1^{(2)}$, $p_2^{(2)}$, . . . that they diminish without ceasing, and that thus, for the same number of observations, the weight $P$ diminishes when the number of elements increases.
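This remark can be illustrated with a short hypothetical sketch (not in the original; random data): the successive quantities $p^{(2)}, p_1^{(2)}, p_2^{(2)}, \ldots$ produced by the eliminations can only diminish, so that, $S\epsilon'^{(i)2}$ and $s$ remaining the same, the weight of an element is smaller the more other elements accompany it.

```python
import numpy as np

rng = np.random.default_rng(4)
a = rng.normal(size=(60, 6))        # columns: coefficients p, q, r, t, gamma, lambda
pivots = [a[:, 0] @ a[:, 0]]        # p^(2)
for k in range(1, 6):
    c = a[:, k]
    # eliminate one element, as in the text: x1^(i) = x^(i) - c^(i) (S c x / S c^2)
    a = a - np.outer(c, (c @ a) / (c @ c))
    pivots.append(a[:, 0] @ a[:, 0])   # p_1^(2), p_2^(2), ...
print(pivots)                          # a decreasing sequence
```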
If we consider the sequence of equations which determine p5 α5 , we see that this [511]
function, developed according to the coefficients of the system of equations (A), is of
the form
pα + M qα + N rα + · · · ,
the coefficient of pα being unity. It follows thence that if we resolve equations (A), by
1
leaving pα, qα, rα, . . . as indeterminates, (2) will be, by virtue of equation (F), the
p5
1
coefficient of pα in the expression of z. Similarly, (2) will be the coefficient of qα in
q5
0
the expression of z ; 1
(2) will be the coefficient of rα in the expression of z 00 ; and thus
r5
(2) (2)
of the rest; that which gives a simple means to obtain p5 , q5 , . . .; but it is simpler
yet to determine them thus.
(2)
First equation (F) gives the value of p5 and of z. If in the system of equations (E)
we eliminate z instead of z , we will have a single equation in z 0 , of the form
0

(2)
q5 α5 = q5 z 0 ;
by making
(2) (2) p4 q4 2 p4 q4 p4 α4
q5 = q4 − (2)
, q5 α5 = q4 α4 −
(2)
.
p4 p4
If in the system of equations (D) we eliminate z instead of z 00 , in order to conserve
(2)
at the end of the calculation only z 00 , we will have r5 by changing in the sequence of
(2)
equations which, departing from this system, determine p5 , the letter p into the letter
r, and reciprocally. We will have thus
(2) (2) p3 r3 2
r4 = r3 − (2)
,
p3
p3 q3 p3 r3
r4 q4 = r3 q3 − (2)
,
p3
(2) (2) p3 q3 2
q4 = q3 − (2)
,
p3
(2) (2) p4 q4 2
r5 = r4 − (2)
,
q4
···············

(2)
In order to have t5 , we will depart from the system of equations (C), by changing, in [512]
(2) (2)
the sequence of the values p3 , p3 q3 , . . . , r3 , q3 r3 , . . . , the letter p into the letter t,
and reciprocally.
(2)
We will have similarly the value of γ5 , by departing from the system of equations
(2) (2)
(B) and changing in the sequence of values of p2 , p3 , . . . , the letter p into the letter
γ, and reciprocally.
(2)
Finally, we will have the value of λ5 by changing, in the sequence of values of
(2) (2)
p1 , p2 , . . . , the letter p into the letter λ, and reciprocally.

§ 3. The error of which the value of $z$ is susceptible being $u$, its probability is, as we have seen,
\[ \frac{\sqrt{P}\,c^{-Pu^2}}{\sqrt{\pi}}. \]
By multiplying it by $u\,du$ and taking the integral from $u$ null to $u$ infinity, we will have
\[ \frac{1}{2\sqrt{\pi}\sqrt{P}} \]
for the mean error to fear more respecting the value of z. This expression affected with
the sign − will be the mean error to fear to less respecting this value. I have given in
§ 21 of Book II the analytic expression of these mean errors, whatever be the number
of elements. We will have therefore, by comparing it to the preceding, the value of P ,
and it is easy to recognize the identity of these expressions. We find thus, in the case
of a single element
sp(2)
P = .
2S0(i)2
If we make generally, for any number of elements whatsoever,
s A
P = ,
2S0(i)2 B
we find, for two elements,
A = p(2) q (2) − pq 2 ,
B = q (2) .
By applying these results to the equations (E), we will have the value of P relative to [513]
the element z.
We find, for three elements,

A = p(2) q (2) r(2) − p(2) qr2 − q (2) pr2 − r(2) pq 2 + pq pr qr,


B = q (2) r(2) − qr2 .

These results applied to the equations (D) will give the value of P relative to the ele-
ment z.
By continuing thus, we will have, whatever be the number of elements, the weight
relative to the first element z. By changing in its expression p into q and q into p, we

will have the weight relative to the second element z 0 . By changing, in the expression
of the weight of the first element, p into r and r into p, we will have the weight relative
to the third element z 00 , and thus in succession. But, when the number of elements
surpasses three, it is much simpler to make use of the method of the previous section.
We will observe here that the mean error to fear respecting each element being,
by §§ 20 and 21 of Book II, smaller in the system of factors which constitute the most
advantageous method than in every other system, the value of P is the greatest possible.
Thus, for one same error of an element in this method, the probability is smaller than
in every other method, that which assures its superiority.

§ 4. All my analysis rests on the hypothesis that the facility of the errors is the same
for the positive errors and for the negative errors; that which renders null the integral
of the product of the error by its probability and by its differential, the integral being
taken in all the extent of the limits of the errors, and the origin of the errors being in
the middle of the interval which separates these limits. But, if the law of facility is
different for the positive errors and for the negative errors, then the preceding integral [514]
becomes null only in the case where this origin is at the point of the abscissa through
where passes the ordinate of the center of gravity of the curve, of which the ordinates
represent the law of facility of the errors represented themselves by the abscissas. For
every other point, the mean error of the observation is this integral divided by the
interval of the limits; and, if we have a great number of observations, the mean of
the errors of these observations will be, by that which we have seen in Book II, equal
very nearly to this quotient. By making therefore so that the sum of the errors is null,
we will be able to suppose null the integral of which we just spoke, and then all my
analysis subsists and becomes independent of the hypothesis of an equal facility of
positive errors and of negative errors. We are able always to obtain this advantage by
adding to the equations of condition an indeterminate element of which the coefficient
is unity. It is that which takes place of itself in the equations of condition relative
to the movement of the planets in longitude; because the correction of the epoch has
unity there for coefficient. But, the addition of an element weakening, as we have said,
the probability of the errors of the other elements, a probability which, for the same
number of observations, diminishes when the number of elements which are supported
on them is greater, it is necessary to recur to this addition only when we are able to fear
that a constant cause favors the errors of one sign rather than those of a contrary sign.
Besides, we will be assured of it easily, by making the sum of the positive remainders
and that of the negative remainders of the equations of condition, when we will have
substituted the values of the elements determined by the most advantageous method,
without the addition of which we just spoke and by seeing if the excess of one of these
sums over the other indicates a constant cause.
In order to leave no doubt on this object, I am going to apply the calculus. There
results from § 22 of Book II that the probability that the sum of the errors of the obser-
vations equals
ak 0 √
s + ar s
k
is proportional to the exponential [515]

2 2
− 2(kkk00 r−k02 ) .
c

This sum is S(i) , and, by § 1, we have

S(i) = S0(i) + S(p(i) u + q (i) u0 + · · · ).

By the nature of the final equations, we have S0(i) = 0; we have therefore

ak 0 √
s + ar s = S(p(i) u + q (i) u0 + · · · ).
k
ν0
If we make, as in this section, u = ν

s
, u0 = √
s
, . . . , we will have thus

k0 √ 1
r=− s + S(p(i) ν + q (i) ν 0 + · · · ).
k as
k0 √1 ,
Thus k is of order s
and its square is of order 1s ; we are able therefore to neglect it,
00
having regard to kk . The probability of the simultaneous existence of r, ν, ν 0 , . . . is
thus proportional to the exponential
S(p(i) ν+q (i) ν 0 +··· )2
− 2kk00 r 2 −
c 2S0(i)2 .

By multiplying it by dr, dν 0 , . . . , and integrating it with respect to r, ν 0 , ν 00 , . . . , from


negative infinity to positive infinity, we will have a quantity proportional to the prob-
ability of ν. By multiplying therefore this quantity by dν and by taking the integral
within some given limits, by dividing next by this same integral taken from ν = −∞
to ν = +∞, we will have the probability that the value of ν is contained within these
limits. We see thus that the consideration of the values that k 0 is able to have and on
which depends the difference of probability of the positive and negative errors has no
sensible influence on the results of the general method exposed here above.
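The device recommended here (adjoin to the equations of condition an element whose coefficient is unity) is, in modern terms, the addition of an intercept column, which forces the sum of the residuals to vanish and protects the other elements from a constant cause of error. A hypothetical sketch (not in the original; the numbers are invented):

```python
import numpy as np

rng = np.random.default_rng(5)
s = 200
p = rng.normal(size=s)
w = 1.5 * p + 0.3 + 0.05 * rng.normal(size=s)   # a constant error of 0.3

# Without the added element, the residuals keep a systematic excess of one sign.
z = (p @ w) / (p @ p)
print((w - z * p).sum())

# With an element of coefficient unity, the sum of the residuals is null and the
# element of interest is freed from the constant cause.
a = np.column_stack([p, np.ones(s)])
z2 = np.linalg.lstsq(a, w, rcond=None)[0]
print((w - a @ z2).sum(), z2)           # sum ~ 0; z2 ~ (1.5, 0.3)
```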

§ 5. Let us apply now this method to an example. For this, I have profited from [516]
the immense work that Bouvard has just finished on the movements of Jupiter and of
Saturn, from which he has constructed very precise Tables. He has made use of all the
oppositions observed by Bradley and by the astronomers who have followed him: he
has discussed them anew and with the greatest care, that which has given to him 126
equations of condition for the movement of Jupiter in longitude and 129 equations for
the movement of Saturn. In these last equations, Bouvard has made the mass of Uranus
enter as indeterminate. Here are the final equations that he has concluded by the most

advantageous method:

721200 , 600 =795938z − 12729398z 0


+ 6788, 2z 00 − 1959, 0z 000 + 696, 13z iv + 2602z v ,
−738297 , 800 = − 12729398z + 424865729z 0
00

− 153106, 5z 00 − 39749, 1z 000 − 5459z iv + 5722z v ,


237 , 782 =6788, 2z − 153106, 5z 0
00

+ 71, 8720z 00 − 3, 2252z 000 + 1, 2484z iv + 1, 3371z v ,


−4000 , 335 = − 1959, 0z − 39749, 1z 0
− 3, 2252z 00 + 57, 1911z 000 + 3, 6213z iv + 1, 1128z v ,
−34300 , 455 =696, 13z − 5459z 0
+ 1, 2484z 00 + 3, 6213z 000 + 21, 543z iv + 46, 310z v ,
−100200 , 900 =2602z + 5722z 0
+ 1, 3371z 00 + 1, 1128z 000 + 46, 310z iv + 129z v .
1+z
In these equations, the mass of Uranus is supposed 19504 ; the mass of Jupiter is
1+z 0 00
supposed 1067,09 ; z is the product of the equation of the center by the correction of
the perihelion employed first by Bouvard; z 000 is the correction of the equation of the
center; z iv is the secular correction of the mean movement; z v is the correction of the
epoch of the longitude at the beginning of 1750. The second of the decimal degree is
taken for unity.
By means of the preceding equations contained in the system (A), I have concluded [517]
the following, contained in the system (B):

2744100 , 68 =743454z − 12844814z 0


+ 6761, 23z 00 − 1981, 45z 000 − 237, 97z iv ,
−69381200 , 58 = − 12844814z + 424611920z 0
− 153165, 81z 00 − 39798, 46z 000 − 7513, 15z iv ,
24800 , 1772 =6761, 23z − 153165, 81z 0
+ 71, 8581z 00 − 3, 2367z 000 + 0, 7684z iv ,
−3100 , 6836 = − 1981, 45z − 39798, 46z 0
− 3, 2367z 00 + 57, 1815z 000 + 3, 2218z iv ,
16 , 5783 = − 237, 97z − 7513, 15z 0
00

+ 0, 7684z 00 + 3, 2218z 000 + 4, 9181z iv .

From these equations, I have deduced the following four, contained in the system

(C),

2824300 , 85 =731939, 5z − 1328350z 0 + 6798, 41z 00 − 1825, 56z 000 ,


−66848600 , 70 = − 13208350z + 413134432z 0 − 1519920z 00 − 34876, 7z 000 ,
24500 , 5870 =6798, 41z − 151992, 0z 0 + 71, 7381z 00 − 3, 7401z 000 ,
−4200 , 5434 = − 1825, 56z − 34876, 7z 0 − 3, 7401z 00 + 55, 0710z 000 ;

these last equations give the following, contained in the system (D),

2683300 , 55 =671414, 7z − 14364541z 0 + 6674, 43z 00 ,


−69543000 , 0 = − 14364541z + 391046861z 0 − 154360, 6z 00 ,
24200 , 6977 =6674, 43z − 154360, 6z 0 + 71, 4841z 00 .

Finally I have concluded thence the following two equations, contained in the system
(E):

417200 , 95 = 48442z + 48020z 0 , −17145500 , 2 = 48020z + 57725227z 0 .

I stop myself at this system, because it is easy to conclude from it the values of the
weight P relative to the two elements z and z 0 that I desired particularly to know. The [518]
formulas of § 3 give, for $z$,
\[ P = \frac{s}{2S\epsilon'^{(i)2}}\left(48442 - \frac{(48020)^2}{57725227}\right) \]
and, for $z'$,
\[ P = \frac{s}{2S\epsilon'^{(i)2}}\left(57725227 - \frac{(48020)^2}{48442}\right). \]
The number $s$ of the observations is here 129 and Bouvard has found
\[ S\epsilon'^{(i)2} = 31096; \]
we have therefore, for $z$,
\[ \log P = 2,0013595 \]
and, for $z'$,
\[ \log P = 5,0778624. \]
The preceding equations give
z 0 = − 0, 00305,
z =0, 08916.
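The whole computation can be repeated directly from the six final equations quoted at the beginning of this section. The sketch below is hypothetical code (not in the original); the matrix is a transcription of those equations (unknowns ordered $z, z', z'', z''', z^{iv}, z^{v}$, decimal points instead of commas), and the printed values should be compared with $z = 0,08916$, $z' = -0,00305$ and the logarithms of the weights given in the text.

```python
import numpy as np

M = np.array([
    [   795938, -12729398,    6788.2,  -1959.0,   696.13,   2602.0],
    [-12729398, 424865729, -153106.5, -39749.1, -5459.0,    5722.0],
    [   6788.2, -153106.5,    71.8720,   -3.2252,   1.2484,    1.3371],
    [  -1959.0,  -39749.1,    -3.2252,   57.1911,   3.6213,    1.1128],
    [   696.13,   -5459.0,     1.2484,    3.6213,  21.543,    46.310],
    [   2602.0,    5722.0,     1.3371,    1.1128,  46.310,   129.0],
])
b = np.array([7212.600, -738297.800, 237.782, -40.335, -343.455, -1002.900])

x = np.linalg.solve(M, b)
print(x[:2])                      # compare with z and z' in the text

s, Se2 = 129, 31096               # number of observations and S eps'^(i)2
P = s / (2 * Se2 * np.diag(np.linalg.inv(M)))
print(np.log10(P[:2]))            # compare with log P for z and for z'
```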
The mass of Jupiter is $\frac{1}{1067,09}(1 + z')$. By substituting for $z'$ its preceding value, this mass becomes $\frac{1}{1070,35}$. The mass of the Sun is taken for unity. The probability that the error of $z'$ is comprehended within the limits $\pm U$ is, by § 1,
\[ \frac{\sqrt{P}}{\sqrt{\pi}}\int du\, c^{-Pu^2}, \]
the integral being taken from $u = -U$ to $u = U$. We find thus the probability that the mass of Jupiter is comprehended within the limits
\[ \frac{1}{1070,35} \pm \frac{1}{100}\,\frac{1}{1067,09}, \]
equal to $\frac{1000000}{1000001}$; so that there are odds one million very nearly against one that the value $\frac{1}{1070,35}$ is not in error of a hundredth of its value; or, that which reverts to quite
nearly the same, that after a century of new observations, added to the previous, and [519]
discussed in the same manner, the new result will not differ from the previous by a
hundredth of its value.
Newton had found, by the observations of Pound, on the elongations of the satellites
of Jupiter, the mass of this planet equal to the 1067th part of that of the Sun, that which
differs very little from the result of Bouvard.
The mass of Uranus is $\frac{1+z}{19504}$. By substituting for $z$ its previous value, this mass becomes $\frac{1}{17907}$. The probability that this value is comprehended within the limits
\[ \frac{1}{17907} \pm \frac{1}{4}\,\frac{1}{19504} \]
is equal to $\frac{2508}{2509}$, and the probability that this mass is comprehended within the limits
\[ \frac{1}{17907} \pm \frac{1}{5}\,\frac{1}{19504} \]
is equal to $\frac{215,6}{216,6}$.
The perturbations that Uranus produces in the movement of Saturn being of little
importance, we must not yet expect from the observations of this movement a great
precision in the value of its mass. But, after a century of new observations, added to
the previous and discussed in the same manner, the value of P will increase in a manner
to give this mass with a great probability that its value will be contained within some
narrow limits; that which will be much preferable to the use than the elongations of the
satellites of Uranus, because of the difficulty to observe these elongations.
Bouvard, by applying the previous method to the 126 equations of condition which the observations of Jupiter have given to him and by supposing the mass of Saturn equal to $\frac{1+z}{3534,08}$, has found
\[ z = 0,00620 \]
and
\[ \log P = 4,8856829. \]
These values give the mass of Saturn equal to $\frac{1}{3512,3}$, and the probability that this mass [520] is comprehended within the limits
\[ \frac{1}{3512,3} \pm \frac{1}{100}\,\frac{1}{3534,08} \]
is equal to $\frac{11327}{11328}$.
Newton had found, by the observations of Pound on the greatest elongation of the
1
fourth satellite of Saturn, the mass of this planet equal to 3012 , that which surpasses

by a sixth the preceding result. There are odds of millions of billions against one that
the one of Newton is in error, and we will not at all be surprised if we consider the
difficulty to observe the greatest elongations of the satellites of Saturn. The facility to
observe those of the satellites of Jupiter has rendered, as we have seen, the value much
more exact than Newton has concluded from the observations of Pound.

On the probability of judgments.


1816
I have compared, in § 50 of Book II, the judgment of a tribunal which pronounces
between two contradictory opinions to the result of the testimonies of many witnesses
of the extraction of a ticket from an urn which contains only two tickets. There is
however between these two cases this difference, namely, that the probability of the
testimony is independent of the nature of the thing attested, because we suppose that
the witness has not been able to be deceived on this thing; instead an object in litigation
is able to be surrounded by such obscurities, that the judges, in their supposing all
the good faith desirable, are able to be however of contrary opinions. The nature of
the affair which is subject to them must therefore influence on their judgment. I will
make this consideration enter into the following investigations, by applying it to the
judgments in criminal matter.
In order to condemn an accused, without doubt the strongest proof of his offense
is necessary to the judges. But a moral proof is never but a probability, and experience
has only too well made known the errors of which the criminal judgments, even those [521]
which appear to be most just, are yet susceptible. The possibility to repair these errors
is the most solid argument of the philosophers who have wished to proscribe the pain
of death. We should therefore abstain ourselves from judging, if it was necessary we
await mathematical evidence. But, when the proofs have a force such that the product
of the error to fear by its feeble probability is inferior to the danger which would result
from the impunity of the crime, judgment is commanded by the interest of society.
This judgment is reduced, if I do not deceive myself, to the solution of the following
question: Has the proof of the offense of the accused the high degree of probability
necessary in order that the citizens have less to fear the errors of the tribunals, if he is
innocent and condemned, than his new attempts and those of the unfortunate persons
who the example of his impunity would embolden, if he was culpable and absolved?
The solution of this question depends on many elements very difficult to know. Such is
the imminence of danger which would menace society if the accused criminal remained
unpunished. Sometimes, this danger is so great that the magistrate sees himself obliged
to renounce the prudent forms established for the certainty of innocence. But that
which renders nearly always the question of which there is concern insoluble is the
impossibility to estimate exactly the probability of the offense, and to fix that which is
necessary for the condemnation of the accused. Each judge, in this regard, is forced
to bring himself back to his proper feeling. He forms his opinion by comparing the
diverse witnesses and the circumstances of which the offense is accompanied to the
results of his reflections and of his experience; and, under this relation, a long habit of
interrogating and judging the accused gives much advantage in order to know the truth
in the midst of often contradictory indices.

The preceding question depends further on the magnitude of the punishment ap-
plied to the offense; because we require naturally, in order to pronounce death, proofs
much stronger than to inflict a detention of some months. This is a reason to propor-
tion the punishment to the offense, a grave punishment applied to a light offense must
inevitably render absolved many a guilty person. The product of the probability of the
offense by its gravity being the measure of the danger that absolution of the accused is [522]
able to make society experience, we would be able to think that the punishment must
depend on this probability. This is that which we do indirectly in the tribunals where
we retain for some time the accused against whom are raised some very strong
proofs, but insufficient to condemn. With the view to acquire new understanding, we deliver
him not at all immediately into the midst of his fellow citizens, who would not see him
again without lively alarms. But the arbitrariness of this measure and the abuse that we
are able to make of it has caused to reject it in the country where we attach a very great
price to individual liberty.
Now, what is the probability that the decision of a tribunal which is able to condemn
only by a given majority will be just, that is to say, conformed to the true solution of
the question posed above? This important problem well resolved will give the means
to compare the diverse tribunals among themselves. The majority of a single vote in a
numerous tribunal indicates that the affair of which there is concern is nearly doubtful;
the condemnation of the accused would be therefore then contrary to the principles of
humanity, protectors of innocence. The unanimity of the judges would give a very great
probability of a just decision; but, by being obliged, too many guilty persons would be
absolved. It is necessary therefore either to limit the number of judges, if we wish
that they be unanimous, or to increase the majority necessary to condemn, when the
tribunal becomes more numerous. I will attempt to apply the calculus to this object,
persuaded that the applications of this kind, when they are well conducted and based
on some data that good sense suggests to us, are always preferable to the most specious
reasonings.
The probability that the opinion of each judge is just enters as principal element in
this calculation. This probability is evidently relative to each affair. If, in a tribunal of
one thousand and one judges, five hundred one are of one opinion, and five hundred
are of a contrary opinion, it is clear that the probability of the opinion of each judge
surpasses quite little $\frac{1}{2}$; because, by supposing it sensibly greater, a single vote of
difference would be an unlikely event. But, if the judges are unanimous, this indicates [523]
in the proofs that degree of force which carries away the conviction. The probability
of the opinion of each judge is therefore then very near to unity or of certitude; not
unless some passions or some common prejudgments mislead all the judges. Beyond
these cases, the ratio of the votes for or against the accused must alone determine this
probability. I suppose thus that it is able to vary from $\frac{1}{2}$ to unity, but that it is not able to
be below $\frac{1}{2}$. If this were not so, the decision of the tribunal would be as insignificant as a
drawing of lots: it has value only insofar as the opinion of the judge has more tendency to the truth
than to the error. It is next by the ratio of the numbers of votes favorable or contrary to
the accused that I determine the probability of this opinion.
These data suffice in order to have the general expression of the probability that the
decision of the tribunal judging in a given majority is just. In our special tribunals com-
posed of eight judges, five votes are necessary for the condemnation of an accused: the
probability of the error to fear respecting the justness of the decision surpasses then $\frac{1}{4}$.
If the tribunal were reduced to six members which would be able to condemn only with
the plurality of four votes, the probability of the error to fear would be then below $\frac{1}{4}$;
there would be therefore for the accused an advantage to this reduction of the tribunal.
In both cases, the majority required is the same and equal to two. Thus, this majority
remaining constant, the probability of the error increases with the number of judges.
This is general, whatever be the majority required, provided that it remains the same.
By taking therefore for rule the arithmetic relation, the accused is found in a position
less and less advantageous in measure as the tribunal becomes more numerous. This
relation is followed in the Chamber of the peers of England. One requires for the con-
demnation a majority of twelve votes, whatever be the number of judges. If we believe
that, the opposed votes destroying one another reciprocally, the twelve remaining votes
represent the unanimity of a jury of twelve members, required in the same country for
the condemnation of an accused, we would be in a great error. Good sense shows that
there is a difference between the tribunal of two hundred twelve judges, of whom one
hundred twelve condemn the accused, while one hundred absolve him, and that of a tri- [524]
bunal of twelve judges unanimous for condemnation. In the first case, the one hundred
votes favorable to the accused permit thinking that the proofs are far from attaining the
degree of force which draw the conviction. In the second case, the unanimity of the
judges carry belief that they have attained this degree. But simple good sense does not
suffice to estimate the extreme difference of the probability of error in these two cases.
It is necessary then to recur to the calculus, and we find very nearly $\frac{1}{5}$ for the probability
of the error in the first case, and only $\frac{1}{8192}$ for this probability in the second case, a
probability which is not $\frac{1}{1000}$ of the first. This is a confirmation of the principle that the
arithmetic relation is unfavorable to the accused when the number of judges increases.
To the contrary, if we take for rule the geometric relation, the probability of the error
of the decision diminishes when the number of the judges is increased. For example,
in the tribunals which would be able to condemn only in the plurality of the two thirds
of votes, the probability of the error to fear is nearly $\frac{1}{4}$ if the number of judges is six;
it is below $\frac{1}{7}$ if this number is raised to twelve. Thus we must be regulated neither on
the arithmetic relation, nor on the geometric relation, if we wish that the probability of
error is never above nor below a determined fraction.
But to what fraction must we be fixed? It is here that the arbitrary commences,
and the tribunals offer in this regard great varieties. In the special tribunals, where
five votes out of eight suffice for the condemnation of the accused, the probability
of the error to fear respecting the goodness of the judgment is $\frac{65}{256}$, or above $\frac{1}{4}$. The
magnitude of this fraction is frightening; but that which must reassure a little is the
consideration that, most often, the judge who absolves an accused regards him not as
innocent. He pronounces only that it is not attained by some sufficient proofs in order
that he be condemned. We are especially reassured by the pity that nature has put into
the heart of man, and which disposes the mind to see with difficulty a guilty person in
the accused submitted to his judgment. This sentiment, more quick in those who have
not at all the habit of criminal judgments, outweighs the inconvenience attached to the [525]
inexperience of juries. In a jury of twelve members, if the plurality required for the
condemnation is of eight votes out of twelve, the probability of error to fear is $\frac{1093}{8192}$,
or a little less than $\frac{1}{8}$; it is nearly $\frac{1}{22}$ if this plurality is of nine votes. In the case of
unanimity, the probability of error to fear is $\frac{1}{8192}$, that is more than one thousand times
less than in our juries.
The solution of the problem that we just considered does not suffice to fix the
convenient majority, in a tribunal of any number of judges whatsoever. It is necessary,
for this, to know the probability of the offense below which an accused is not able to be
condemned, without that the citizens having to dread more the errors of the tribunals,
than the attacks which would be born from the impunity of a guilty person absolved. It
is necessary next to determine the probability of the offense resulting from the decision
of the tribunal and to fix the majority in a manner that these probabilities are equals.
But it is impossible to obtain them. The first is, as we have said, relative to the position
in which society is found, a variable position, very difficult to define well and always
too complicated in order to be submitted to the calculus. The second depends on a thing
entirely unknown, the law of probability of the opinion of each judge in the estimation
that he makes of the probability of the offense. Seeing our ignorance of these two
elements of the calculus, what is more reasonable than to depart from the solution of
the single problem that we may resolve in this manner, the one of the probability of
the error of the decision of a tribunal? This probability appears to me too high in our
tribunals, and I think that in this regard it is acceptable to approach to the English
jury where it is only $\frac{1}{8192}$. In fixing it at the fraction $\frac{1}{1024}$ and in determining the
majority necessary to attain it, we place the accused in the position where he would
be vis-à-vis of a jury of nine members, of which we would require unanimity; that
which appears to me to guarantee sufficiently the innocent ones from the errors of the
tribunals, and society from the pains that impunity of the guilty persons would produce.
It must be extremely rare then that an accused is condemned with a probability less than
that which is necessary to his condemnation; because the majority who condemn him [526]
declare that the probability of his offense is at least equal to this necessary probability:
the minority who absolve him declare that the first of these probabilities appears to
it inferior to the second; but it is natural to believe that this inferiority is not very
considerable. It must rarely happen that the mean probability which results from the
totality of the judgments of the members of the tribunal is inferior to the probability
required for the condemnation of the accused, if we reduce, by a convenient majority,
the probability of the error to fear respecting the justice of the decision, to the fraction
$\frac{1}{1024}$. The analysis furnishes, in order to have this majority, some formulas which I
will expose here and that it is easy to reduce into a Table dependent on the number of
the judges. But a parallel Table will appear too arbitrary to the common men who will
prefer always one or the other of the arithmetic and geometric relations which they are
able to imagine easily.

§ 1. A judge must not, in order to condemn an accused, expect the mathematical


evidence that it is impossible to attain in moral things. But, when the probability of the
offense is such that the citizens had more to dread the attempts which would be able to
be born of his impunity than the errors of the tribunals, the interest of society requires
the condemnation of the accused. I name a this degree of probability, and I suppose
that the judge who condemns an accused pronounces thence that the probability of his
offense is at least a. I name x the probability of this opinion of the judge, a probability
that I will suppose equal or superior to $\frac{1}{2}$, and varying by some infinitely small degrees,
equal to x and equally probable a priori. I suppose further that the tribunal is composed
of p + q judges, of whom p condemn the accused and q absolve him. The probability
that the opinion of the tribunal is just will be proportional to $x^p(1-x)^q$, and the
probability that it is not will be proportional to $(1-x)^p x^q$; the probability of the
goodness of the judgment will be therefore, by § 1 of Book II,
$$\frac{x^p(1-x)^q}{x^p(1-x)^q + (1-x)^p x^q}. \qquad (a)$$
It is necessary to multiply this quantity by the probability of the value of x, taken from [527]
the observed event. This event is that the tribunal is itself divided into two parts of
which the one, composed of p judges, condemn the accused, and of which the other,
formed of q judges, absolve him. The probability of x is therefore the function
$x^p(1-x)^q + (1-x)^p x^q$ divided by the sum of all the similar functions relative to all the values
of x, from $x = \frac{1}{2}$ to $x = 1$; it is consequently
$$\frac{\left[x^p(1-x)^q + (1-x)^p x^q\right]dx}{\int x^p\,dx\,(1-x)^q},$$
the integral of the denominator being taken from $x = 0$ to $x = 1$. By multiplying this
function by the function (a), we will have
$$\frac{x^p(1-x)^q\,dx}{\int x^p\,dx\,(1-x)^q}$$
for the probability of the goodness of the judgment relative to x. The same probability
relative to all the values of x is therefore
$$\frac{\int x^p\,dx\,(1-x)^q}{\int x^p\,dx\,(1-x)^q}, \qquad (b)$$

the integral of the numerator being taken from $x = \frac{1}{2}$ to $x = 1$, and that of the denominator
being taken from $x = 0$ to $x = 1$. It follows thence that the probability of the
error to fear respecting the goodness of the judgment is further expressed by formula
(b), provided that we take the integral of the numerator from $x = 0$ to $x = \frac{1}{2}$. We find
thus this last probability equal to
$$\frac{1}{2^{p+q+1}}\left[1 + \frac{p+q+1}{1} + \frac{(p+q+1)(p+q)}{1\cdot 2} + \frac{(p+q+1)(p+q)(p+q-1)}{1\cdot 2\cdot 3} + \cdots + \frac{(p+q+1)(p+q)(p+q-1)\cdots(p+2)}{1\cdot 2\cdot 3\cdots q}\right]. \qquad (c)$$
If we require unanimity, q is null, and this expression becomes $\frac{1}{2^{p+1}}$.
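Term by term, the series in formula (c) is the partial sum of binomial coefficients $\sum_{k=0}^{q}\binom{p+q+1}{k}$ divided by $2^{p+q+1}$. The following short Python sketch (added here, not part of the original; the function name is arbitrary) evaluates it exactly and reproduces the fractions quoted earlier in the text.

```python
from fractions import Fraction
from math import comb

def error_probability(p, q):
    """Formula (c): probability that a decision carried by p condemning votes
    against q absolving votes is in error; the series equals
    (1 / 2**(p+q+1)) * sum_{k=0}^{q} C(p+q+1, k)."""
    n = p + q + 1
    return Fraction(sum(comb(n, k) for k in range(q + 1)), 2 ** n)

print(error_probability(5, 3))    # special tribunal of 8, five votes: 65/256
print(error_probability(8, 4))    # jury of 12, eight votes: 1093/8192
print(error_probability(9, 3))    # jury of 12, nine votes: 189/4096, nearly 1/22
print(error_probability(12, 0))   # unanimity of 12: 1/8192
```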

§ 2. Let us determine presently the probability of the error to fear respecting the
justice of the decision of the tribunal, when p and q are large numbers; that which ren- [528]
ders the formula (c) very difficult to evaluate in numbers. It is necessary to distinguish
here two cases, one in which p − q is considerable, the other in which p − q is rather

small. In the first case, we will make use of formula (o) of § 28 of Book II which gives,
for the probability of error,
$$\frac{(p+q)^{p+q+\frac{3}{2}}}{2^{p+q+\frac{3}{2}}\,p^{p+\frac{1}{2}}\,q^{q+\frac{1}{2}}\,(p-q)\sqrt{\pi}}\left[1 - \frac{p+q}{(p-q)^2} - \frac{(p+q)^2 - 13pq}{12pq(p+q)}\right], \qquad (e)$$

π being the circumference of which the diameter is unity.


In the second case, where p − q is a small number relative to p, we will find easily,
by the analysis of § 19 of Book II, the probability of error to fear equal to
$$\frac{\int dt\,c^{-t^2}}{\sqrt{\pi}}, \qquad (f)$$
the integral being taken from
$$t^2 = \frac{(p-q)^2(p+q)}{8pq}$$
to infinity.
In order to give an example of each of these formulas, we suppose a tribunal formed
of 144 judges, and that 56 is necessary for condemnation of the accused. Then we have
$$p = 90, \qquad q = 54,$$
and formula (e) gives $\frac{1}{773}$ for the probability of the error to fear respecting the goodness
of the decision of the tribunal. In the case of unanimity of a jury of eight members, the
probability of the error to fear is $\frac{1}{512}$; the accused is therefore then in a more favorable
position than vis-à-vis a similar jury.
Let us suppose the tribunal formed of 212 judges and that a majority of twelve votes
suffices for condemnation. In this case

$$p + q = 212, \qquad p - q = 12,$$
and formula (f) gives $\frac{1}{4,889}$ for the probability of the error to fear.
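For the large-tribunal case, formula (f) is the tail of the Gaussian integral and can be evaluated with the complementary error function. The sketch below (not in the original; the helper name is mine) checks the Chamber of Peers example, where 112 votes against 100 give a probability of error of about 1/4,889, in accord with the "very nearly 1/5" stated earlier.

```python
import math

def error_probability_large(p, q):
    """Formula (f): for p condemning and q absolving votes, with p - q small
    relative to p, the probability of error is (1/sqrt(pi)) * integral of
    exp(-t^2) from t0 to infinity, i.e. erfc(t0)/2, where
    t0^2 = (p - q)^2 * (p + q) / (8 * p * q)."""
    t0 = math.sqrt((p - q) ** 2 * (p + q) / (8.0 * p * q))
    return 0.5 * math.erfc(t0)

# 212 judges, majority of twelve: p = 112, q = 100
print(error_probability_large(112, 100))   # about 0.2046, i.e. 1/4.889
```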

On a disposition of the Code of criminal instruction.
15 November 1816 [529]

Article 351 of the Code of criminal instruction is thus conceived:

“If nevertheless the accused is declared guilty only by a simple majority,


the judges will deliberate among them on the same point: and if the opin-
ion of the minority of the jurors is adopted by the majority of the judges,
of such sort that by reuniting the number of votes, this number exceeds
the one of the majority of the jurors and of the minority of the judges, the
opinion favorable to the accused will prevail.”

After this article, seven jurors declaring the accused guilty and five declaring him
not guilty, the accused is condemned when three alone of the five judges of the Assize
Court are reunited to the minority of the jurors. This appears to shock at the same time
the rules of common sense and the principles of humanity, protectors of innocence.
The Assize Court intervenes then with justice, because the offense of the accused is not
sufficiently established by a simple majority of the jury, that which the Calculus of
Probabilities renders indubitable. But, when the opinion of the Assize Court annuls the
one of the majority of the jurors, far from confirming it, when the difference of the two
votes, which gave to this majority only an insufficient preponderance, is reduced to a
single vote, by the addition of the judges whose station and wisdom must inspire
confidence¹, is it not unjust to condemn the accused?
I propose therefore to reform thus the article cited: [530]

“If nevertheless the accused is declared culpable only by a simple majority,


the judges will deliberate among them on the same point; and if the opinion
of the minority of jurors is adopted by the majority of the judges, this
opinion will prevail.”

If this reform appeared just, the indispensable duty to abrogate promptly all that
which is able to compromise the innocence does not permit to await, in order to convert
by law, the general revision of the criminal Code, a revision which demands much
reflection and time. It is in order to fulfill this duty as much as it is possible to me that
I publish this writing.

15 November 1816
1 The difference of one vote gives to the majority a preponderance so much less as the number of judges is

more considerable: simple good sense shows it without the help of the calculus. In the present question, the
preponderance of the majority of the jurors diminishes therefore not only by the reduction of the two votes to
one, but further by the increase of the number of voters, which is raised from twelve to seventeen. Generally,
a constant difference between the majority and the minority below of which the accused is not able to be
condemned is so much less favorable to him as the number of judges is greater: on the contrary, the constant
ratio of the votes of the majority to those of the minority becomes more favorable to him, in measure as the
number of judges increases. The ratio $\frac{5}{8}$, adopted by the Chamber of Peers of France, is very favorable to
the accused before a tribunal so numerous. (We are able to see, on this object, the Supplement to my Théorie
analytique des Probabilités, and the third edition of my Essai philosophique sur les Probabilités.)

SUR
APPLICATION DU CALCUL DES
PROBABILITÉS
AUX OPÉRATIONS GÉODÉSIQUES

Pierre Simon Laplace∗


Connaissance des Temps for the year 1820 (1818) pp. 422–440.

We determine the length of a great arc, on the surface of the Earth, by a chain [422]
of triangles which are supported on a base measured with exactitude. But whatever
precision that we bring into the measure of the angles, their inevitable errors can, by
accumulating, deviate sensibly from the truth, the value of the arc that we have con-
cluded from a great number of triangles. We know therefore only imperfectly this
value, if we are not able to assign the probability that its error is comprehended within
some given limits. The desire to extend the application of the Calculus of Probabilities
to natural Philosophy, has made me seek the formulas proper to this object.
This application consists in deducing from the observations, the most probable re-
sults and to determine the probability of the errors of which they are always suscep-
tible. When, these results being known very nearly, we wish to correct them with a
great number of observations, the problem is reduced to determining the probability of
one or many linear functions of the partial errors of the observations, the law of prob-
ability of these errors being supposed known. I have given, in my Théorie analytique
des Probabilités, a method and some general formulas for this object, and I have ap-
plied them, to some interesting points of the System of the world, in the Connaissance
des Temps of 1818, and in a supplement to the work that I just cited. In questions of
Astronomy, each observation furnishes, in order to correct the elements, an equation
of condition: when these equations are very manifold, my formulas give at the same
time the most advantageous corrections, and the probability that the errors after these
corrections, will be contained within some assigned limits, whatever be moreover the
law of probability of the errors of each observation. It is so much more necessary to
be rendered independent of this law, that the simplest laws are always infinitely less
probable, seeing the infinite number of those which are able to exist in nature. But the
unknown law which the observations follow of which we make use, introduces into
the formulas an indeterminate which would permit not at all to reduce them to num-
bers, if we did not succeed to eliminate it. This is that which I have done by means [423]
∗ Translated by Richard J. Pulskamp, Department of Mathematics & Computer Science, Xavier Univer-

sity, Cincinnati, OH. June 20, 2012

of the sum of the squares of the remainders, when we have substituted into each equa-
tion of condition, the most probable corrections. The geodesic questions offering not
at all similar equations; it was necessary to seek another means to eliminate from the
formulas of probability, the indeterminate dependent on the law of probability of the
errors of each partial observation. The quantity by which the sum of the angles of each
observed triangle surpasses two right angles plus the spherical excess, has furnished
me this means; and I have replaced by the sum of the squares of these quantities, the
sum of the squares of the remainders of the equations of condition. Thence, we are
able to determine numerically the probability that the final result of a long sequence of
geodesic operations, does not exceed a given quantity. It will be easy to apply these
formulas, to the part of our meridian which extends from the base of Perpignan to the
isle of Formentera; that which is so much more useful, that any base of verification
having been measured toward the south part of this meridian, the exactitude of that
part reposes entirely on the precision with which the angles of the triangles have been
measured.
A perpendicular to the meridian of France, will soon be measured from Strasbourg
to Brest. These formulas will make an estimate of the errors, not only of the total arc,
but further the difference in longitude of its extreme points, concluded from a chain
of the triangles which unite them, and of the azimuths of the first and of the last side
of this chain. If we diminish, as much as it is possible, the number of triangles and if
we give a great precision to the measure of their angles, two advantages that the use
of the repetitive circle and of the reflectors procure, this way to have the difference in
longitude of the extreme points of the perpendicular, will be one of the better of which
we are able to make use.
In order to be assured of the exactitude of a great arc which is supported on a base
measured toward one of its extremities, we measure a second base toward the other
extremity, and we conclude from one of these bases the length of the other. If the length
thus calculated deviates very little from observation, there is everywhere to believe that
the chain of triangles is quite nearly exact, likewise the value of the great arc which
results from it. We correct next this value, by modifying the angles of the triangles,
in a manner that the bases calculated accord themselves with the measured bases, that
which is able to be made in an infinity of ways. Those that we have until the present
employed are based on some vague and uncertain considerations. The methods exposed [424]
in my Théorie analytique des Probabilités lead to some very simple formulas in order
to have directly the correction of the total arc, which results from the measures of
many bases. These measures have not only the advantage to correct the arc, but further
to increase that which I have named the weight of the errors, that is to say to render
the probability of the errors, more rapidly decreasing; so that the same errors become
less probable with the multiplicity of the bases. I expose here the laws of probability
of the errors of the total arc, that the addition of new bases gives birth to. After we
brought in the observations and in the calculations, the exactitude that we require now;
we considered the sides of the geodesic triangles, as rectilinear, and we supposed the
sum of their angles, equal to two right angles. Next we corrected the observed angles,
by subtracting from each of them, the third of the quantity of which the sum of the
three observed angles, surpassed two right angles. Legendre has noted first, that the
two errors that we commit thus, compensate themselves mutually, that is to say that by

subtracting from each angle of a triangle, the third of the spherical excess, we are able
to neglect the curvature of its sides, and to regard them as rectilinear. But the excess
of the three observed angles over two right angles, is composed of the spherical excess
and the sum of the errors of the measure of each of the angles. The analysis of the
probabilities shows that we must yet subtract from each angle, the third of this sum, in
order to have the law of probability of the errors of the results, most rapidly decreasing.
Thus, by the equal apportionment of the error of the sum observed of the three angles of
the triangle considered as rectilinear, we correct at the same time the spherical excess,
and the errors of the observations. The weight of the angles thus corrected, increases;
so that the same errors become by this correction, less probable. There is therefore
advantage to observe the three angles of each triangle, and to correct them as we have
just said it. Simple good sense makes us recognize this advantage; but the Calculus of
probabilities is able alone to estimate, and to show that by this correction it becomes
the greatest possible.
In order to apply with success the formulas of probability, to the observations, it
is necessary to return faithfully all those that we would admit if they were isolated,
and to reject none of them by the sole consideration that it is extended a little from the
others. Each angle must be uniquely determined by its measures, without regard to the [425]
two other angles of the triangle in which it belongs; otherwise, the error of the sum of
the three angles would not be the simple result of the observations, as the formulas of
probability suppose it. This remark seems to me important in order to disentangle the
truth in the middle from the slight uncertainties that the observations present.
I dare to hope that these researches interest the Geometers at a time where we are
occupied to measure the diverse countries of Europe, and where the King just ordered
the execution of a new map of France, by making the operations of the cadastre contribute
to its details, operations which thence will become better and more useful yet. Thus the magnitude and
the curvature of the surface of Europe will be known in all directions; and our meridian,
extended to the north to the parallel of the Shetland isles by its junction with the geodesic
operations made in England, and terminated to the south at the isle Formentera
in the Mediterranean, will embrace nearly a quarter of the distance from the pole to the
equator.

§ 1. Let us conceive, on a sphere, an arc of great circle A, A′, A″, etc., and
suppose that we have formed upon it the chain of triangles CAC′, CC′C″, C″C′C‴,
C″C‴Cⁱᵛ, etc., of which the sides CC′, C′C″, C″C‴, etc. cut this arc at A′, A″, A‴,
etc. I do not give at all the figure, because it is easy to trace it after these indications.
Let A be the angle CAA′, A(1) the angle CA′A, A(2) the angle C′A″A‴, etc. Let
further C be the angle ACC′, C(1) the angle CC′C″, C(2) the angle C′C″C‴, etc.;
we will have
A + A(1) + C − α = π + t,
α being the error of the observed angle C, t being the excess of the angles of the spher-
ical triangle ACA0 over π which expresses two right angles. We will have similarly
A(1) + A(2) + C (1) − α(1) = π + t(1) ,
α(1) being the error of the observed angle CC 0 C 00 , and t(1) being the excess of the
angles of the spherical triangle A0 C 0 A00 over two right angles. We will form similarly

the equations
A(2) + A(3) + C (2) − α(2) = π + t(2) ,
A(3) + A(4) + C (3) − α(3) = π + t(3) ,
etc.;
whence we deduce easily

A(2n) = A +C −C (1) +C (2) −C (3) · · · +C (2n−2) −C (2n−1)


−α +α(1) −α(2) · · · +α(2n−1)
−t +t(1) −t(2) · · · +t(2n−1) ,

and [426]
A(2n−1) = π −A −C +C (1) −C (2) +C (3) · · · −C (2n−2)
+α −α(1) +α(2) · · · +α(2n−2)
+t −t(1) +t(2) · · · +t(2n−2) ;
by supposing therefore A well known, the error of the angle A(2n) is

−α −α(2) . . . − α(2n−2) ,
+α(1) +α(3) . . . + α(2n−1) ;

because the values of t, t(1) , etc., are quite small, and able to be determined with
precision. The concern now is to have the probability that this error will be contained
within some given limits.
For this, I will suppose that the probability of any error α is proportional to $c^{-h\alpha^2}$,
c being the number of which the hyperbolic logarithm is unity. This supposition, the
most natural and the most simple of all, results from the use of the repeating circle in
the measure of the angles of the triangles. In fact, let us name φ(q) the probability of an
error q in the measure of a simple angle, and let s be the number of the simple angles
observed in all the series that we have made, in order to determine the same angle. The
probability that the error of the mean result, or of the angle concluded from this series,
is $\pm\frac{r}{\sqrt{s}}$, will be, by § 18 of the second book of my Théorie analytique des Probabilités,
proportional to
$$c^{-\frac{kr^2}{2k''}},$$
$\pm a$ being the limits of the errors; k is, by the same section, equal to $2\int\frac{dq}{a}\,\phi\!\left(\frac{q}{a}\right)$, and
$k''$ is equal to $2\int\frac{q^2\,dq}{a^2}\,\phi\!\left(\frac{q}{a}\right)$, the integrals being taken from q null to q = a; and the
negative errors being supposed as probable as the positive errors. By making therefore
$$r = \frac{\alpha\sqrt{s}}{a}, \qquad h = \frac{ks}{4k''a^2},$$
$c^{-h\alpha^2}$ will be the probability of the error α. But we will see, at the end of this Memoir,
that the following results always hold, whatever be the probability of α.
Let β and γ be the errors of the two angles AC′C and CAC′ of the first triangle
ACC′; the probability of the three errors α, β and γ will be proportional to
$c^{-h\alpha^2 - h\beta^2 - h\gamma^2}$; but the observation of these angles gives the sum α + β + γ of the
three errors; because, the sum of the three angles having to be equal to two right angles [427]
plus the surface of the triangle ACC′, if we name T the excess of the three angles
observed over this quantity, we will have
$$\alpha + \beta + \gamma = T;$$
the preceding exponential becomes thus
$$c^{-h\alpha^2 - h\beta^2 - h(T - \alpha - \beta)^2},$$
or
$$c^{-2h\left(\beta + \frac{1}{2}\alpha - \frac{1}{2}T\right)^2 - \frac{3h}{2}\left(\alpha - \frac{1}{3}T\right)^2 - \frac{h}{3}T^2},$$
β being susceptible to all the values from −∞ to ∞; it is necessary to multiply this
exponential by dβ and take the integral within these limits, that which gives an integral
which has for factor
$$c^{-\frac{3h}{2}\left(\alpha - \frac{1}{3}T\right)^2 - \frac{h}{3}T^2};$$
the probability of α is therefore proportional to this factor. The most probable value
of α is evidently that which renders null the quantity $\alpha - \frac{1}{3}T$; it is necessary therefore
to correct the three angles of each triangle by the third of the excess T of their sum
observed, over two right angles plus the spherical excess. This is that which we do
commonly.
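As a small illustration of this rule (my own sketch, with invented numbers, not taken from the text), the correction subtracts one third of the observed closure error T from each angle:

```python
def correct_triangle(observed_angles, spherical_excess, straight_angle=180.0):
    """Subtract from each observed angle one third of T, the excess of the
    observed sum over two right angles plus the spherical excess (all in degrees)."""
    T = sum(observed_angles) - straight_angle - spherical_excess
    return [a - T / 3.0 for a in observed_angles]

# Invented example: the three observed angles close 0.0002 degrees too high
print(correct_triangle([50.0001, 60.0002, 70.0000], 0.0001))
```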
Let us name ᾱ and β̄ the quantities $\alpha - \frac{1}{3}T$ and $\beta - \frac{1}{3}T$; the probability of ᾱ will
be proportional therefore to
$$c^{-\frac{3}{2}h\bar\alpha^2}.$$
If we diminish, in the preceding expression of A(2n), the angle C by $\frac{1}{3}T$, that is to
say if we employ the corrected angles of each triangle, naming C̄, C̄(1), etc. that
which the angles C, C(1), etc. become by these corrections, we will have

A(2n) =A + C̄ − C̄ (1) + C̄ (2) − etc.


− ᾱ + ᾱ(1) − ᾱ(2) + etc.
− t + t(1) − t(2) + etc.

The probability that the quantity $-\bar\alpha + \bar\alpha^{(1)} - \text{etc.}$, which is the error of the angle A(2n),
will be comprehended within the limits $\pm r\sqrt{2n}$, will be, by § 18 cited,
$$\frac{2\sqrt{\frac{3}{2}h}}{\sqrt{\pi}}\int dr\, c^{-\frac{3}{2}hr^2}.$$
§ 2. The concern is no longer but to have the value of h. For this, I will take as [428]
given from the observations, the sums T , T (1) , etc. of the errors of the angles of each
triangle, and I will determine the value of h, which renders most probable, the observed
value of the sum of their squares. By that which precedes, the probability of the values
of ᾱ and of T , is proportional to
$$c^{-\frac{3h}{2}\bar\alpha^2 - \frac{h}{3}T^2};$$

by multiplying this exponential by dᾱ, and taking the integral from ᾱ = −∞ to ᾱ =
∞, the integral will have for factor $c^{-\frac{h}{3}T^2}$, and this factor will be proportional to the
probability of T. This probability will be therefore
$$\frac{dT\, c^{-\frac{h}{3}T^2}}{\int dT\, c^{-\frac{h}{3}T^2}},$$
the integral being taken from $T = -\infty$ to $T = \infty$; it will be thus
$$\frac{\sqrt{\frac{1}{3}h}}{\sqrt{\pi}}\,dT\,c^{-\frac{h}{3}T^2}.$$

If we multiply this function by $T^2$, the integral taken from $T = -\infty$ to $T = \infty$, and
multiplied by 2n, will be the most probable value of the sum $T^2 + T^{(1)2} + \text{etc.}$ By
naming $\theta^2$ the observed sum, and by equating it to this product, we will have
$$h = \frac{3}{2}\cdot\frac{2n}{\theta^2};$$
the probability that the error of A(2n) is comprehended within the limits $\pm\frac{2}{3}r'\theta$ is thus
$$\frac{2\int dr'\,c^{-r'^2}}{\sqrt{\pi}},$$
the integral being taken from r′ null.
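Numerically, this result lets one attach a probability to any assigned limit on the final angle error once the triangle closure errors are known. The sketch below is mine (the closure errors are invented); it forms θ² as the sum of the squared T and evaluates $\frac{2}{\sqrt{\pi}}\int_0^{r'} dr'\,c^{-r'^2} = \operatorname{erf}(r')$ for the limits $\pm\frac{2}{3}r'\theta$.

```python
import math

def angle_error_bound(T_values, r_prime):
    """Given the closure errors T, T', ... of the 2n triangles of the chain,
    theta^2 = sum of T^2; the error of A^(2n) lies within +/- (2/3) r' theta
    with probability erf(r')."""
    theta = math.sqrt(sum(T * T for T in T_values))
    return (2.0 / 3.0) * r_prime * theta, math.erf(r_prime)

# Invented closure errors, in seconds of arc, for a chain of 12 triangles:
closures = [1.2, -0.8, 0.5, -1.5, 0.9, -0.3, 1.1, -0.7, 0.4, -1.0, 0.6, -0.2]
print(angle_error_bound(closures, 1.0))   # limits in seconds, probability erf(1) ~ 0.843
```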
Let us suppose the line AA0 is perpendicular to the meridian of the point A, and
that we have observed with exactitude, the angle that the last side C (2n−1) A(2n) C (2n)
forms at the point A(2n) with the meridian of this point. By naming E this angle,
π − E − A(2n) will be the angle that this meridian forms with that perpendicular. Let
φ be the angle formed by the meridians of the points A and A(2n), or the difference in [429]
longitude of these points, and l the latitude of the point A; we will have
$$\sin\phi = \frac{\cos(\pi - E - A^{(2n)})}{\sin l};$$
designating therefore by δφ and $\delta A^{(2n)}$ the errors of the angles φ and A(2n), we will have
$$\delta\phi = -\frac{\delta A^{(2n)}\,\sin(E + A^{(2n)})}{\sin l\,\cos\phi};$$
the preceding integral will give therefore the probability that the error respecting the
longitude, concluded from the observed azimuths at A and A(2n), will be comprehended
within the limits
$$\pm\frac{2}{3}\,\theta r'\,\frac{\sin(E + A^{(2n)})}{\sin l\,\cos\phi}.$$
§ 3. Let us determine presently the probability that the error of the measure of
the line AA0 A00 . . . will be comprehended within some given limits. For this, let us

suppose that in the triangles CAC′, C′CC″, etc. we had corrected the angles as we
do ordinarily, that is to say by subtracting from each the third of the quantity by which the
three observed angles surpass two right angles plus the spherical excess; that we lower
from the vertices C, C′, C″, etc. the perpendiculars CI, C′I′, C″I″, etc. onto the line
AA′; we will have, very nearly,
$$AI = AC\cos IAC;$$
we will have next quite nearly
$$II' = CC'\sin A^{(1)},$$
and, generally,
$$I^{(i)}I^{(i+1)} = C^{(i)}C^{(i+1)}\sin A^{(i+1)}.$$
by supposing therefore that δ is the characteristic of the errors, we will have

$$\frac{\delta.I^{(i)}I^{(i+1)}}{I^{(i)}I^{(i+1)}} = \frac{\delta.C^{(i)}C^{(i+1)}}{C^{(i)}C^{(i+1)}} - \delta A^{(i+1)}\cot A^{(i+1)}.$$
We have, by that which precedes,

δA(i+1) = ±{ᾱ − ᾱ(1) + ᾱ(2) · · · ± ᾱ(i) };

the + sign having place if i is even, and the − sign if it is odd; next we have, in the [430]
(i + 1)st triangle,

$$C^{(i+1)}C^{(i)} = \frac{C^{(i)}C^{(i-1)}\,\sin C^{(i+1)}C^{(i-1)}C^{(i)}}{\sin C^{(i-1)}C^{(i+1)}C^{(i)}},$$
that which gives
$$\frac{\delta.C^{(i)}C^{(i+1)}}{C^{(i)}C^{(i+1)}} = \frac{\delta.C^{(i)}C^{(i-1)}}{C^{(i)}C^{(i-1)}} + \delta C^{(i+1)}C^{(i-1)}C^{(i)}\,\cot C^{(i+1)}C^{(i-1)}C^{(i)} - \delta C^{(i-1)}C^{(i+1)}C^{(i)}\,\cot C^{(i-1)}C^{(i+1)}C^{(i)};$$

but ᾱ(i) is the error of the angle C (i) or C (i−1) C (i) C (i+1) ; let β̄ (i) be the error of the an-
gle C (i−1) C (i+1) C (i) ; −(ᾱ(i) +β̄ (i) ) will be the error of the angle . . . C (i+1) C (i−1) C (i) ;
we will have therefore

$$\frac{\delta.C^{(i)}C^{(i+1)}}{C^{(i)}C^{(i+1)}} = \frac{\delta.C^{(i)}C^{(i-1)}}{C^{(i)}C^{(i-1)}} - \left(\bar\alpha^{(i)} + \bar\beta^{(i)}\right)\cot C^{(i+1)}C^{(i-1)}C^{(i)} - \bar\beta^{(i)}\cot C^{(i-1)}C^{(i+1)}C^{(i)};$$
that which gives, by observing that, in the first triangle, the side C(i−1)C is AC,
$$\frac{\delta.C^{(i)}C^{(i+1)}}{C^{(i)}C^{(i+1)}} = -S\left\{\left(\bar\alpha^{(i)} + \bar\beta^{(i)}\right)\cot C^{(i+1)}C^{(i-1)}C^{(i)} + \bar\beta^{(i)}\cot C^{(i-1)}C^{(i+1)}C^{(i)}\right\},$$

the finite integral S expressing the sum of all the quantities that this sign contains, from
i = 0 inclusively to i inclusively. We will have therefore the value of δ.I (i) I (i+1) .
By reuniting all these values, we will have for the entire error of the measured line,
an expression of this form

pᾱ + q β̄ + p(1) ᾱ(1) + q (1) β̄ (1) + etc. (o)


The probability of the simultaneous values of ᾱ and of β̄ is, by that which precedes,
proportional to
$$c^{-2h\left(\bar\beta + \frac{1}{2}\bar\alpha\right)^2 - \frac{3}{2}h\bar\alpha^2};$$
By making
$$\bar\beta + \tfrac{1}{2}\bar\alpha = \tfrac{1}{2}\alpha\sqrt{3},$$
the preceding exponential becomes
$$c^{-\frac{3}{2}h\alpha^2 - \frac{3}{2}h\bar\alpha^2},$$

rα + r(1) ᾱ + r(2) α(1) + r(3) ᾱ(1) + etc.


The probability that the error of the function (o) is comprehended within the limits ±s,
is, by § 20 of the second book of my Théorie analytique des Probabilités,
$$\frac{2\int dt\,c^{-t^2}}{\sqrt{\pi}},$$
the integral being taken from t null to t being
$$s\sqrt{\frac{\frac{3}{2}h}{r^2 + r^{(1)2} + r^{(2)2} + \text{etc.}}};$$
now we have evidently
$$r = \tfrac{1}{2}q\sqrt{3}, \qquad r^{(1)} = p - \tfrac{1}{2}q;$$
the value of t will be therefore
$$s\sqrt{\frac{\frac{3}{2}h}{p^2 - pq + q^2 + p^{(1)2} - p^{(1)}q^{(1)} + q^{(1)2} + \text{etc.}}}.$$
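Under the same assumptions, the probability that the error of the measured line stays within ±s follows at once from this value of t. A brief sketch (mine, with invented coefficients; units must simply be consistent, e.g. radians) evaluates it with h = (3/2)(2n)/θ² from § 2 and the sum S(p² − pq + q²):

```python
import math

def line_error_probability(p, q, theta, num_triangles, s):
    """Probability that the error of the measured line is within +/- s:
    with h = (3/2) * (number of triangles) / theta^2 and
    S = sum over triangles of (p^2 - p*q + q^2),
    t = s * sqrt((3/2) * h / S) and the probability is erf(t)."""
    S = sum(pi * pi - pi * qi + qi * qi for pi, qi in zip(p, q))
    h = 1.5 * num_triangles / theta ** 2
    t = s * math.sqrt(1.5 * h / S)
    return math.erf(t)

# Invented coefficients p, q (one pair per triangle), closure spread theta, limit s:
print(line_error_probability([0.9, -1.1, 0.8], [0.2, -0.3, 0.1],
                             theta=1.0e-5, num_triangles=3, s=1.0e-5))
```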

§ 4. Let us suppose that in order to verify the operations, we have measured the last
part I (2n) I (2n+1) of the line AA0 A00 , etc.; the expression of the error of this part, will
be by that which precedes, of the form

lᾱ + mβ̄ + l(1) ᾱ(1) + m(1) β̄ (1) + etc. (p)

Let λ be this error, or the quantity of which the line I (2n) I (2n+1) , concluded from the
value of the side AC measured with care, surpasses the direct measure of this line; we
will equate the function (p) to λ. If in this function we make $\bar\beta + \frac{1}{2}\bar\alpha = \frac{1}{2}\alpha\sqrt{3}$, it will
take the form

f α + f (1) ᾱ + f (2) α(1) + f (3) ᾱ(1) + etc.


and the probability of that function, will be by that which precedes, proportional to
$$c^{-\frac{3}{2}h\left(\alpha^2 + \bar\alpha^2 + \alpha^{(1)2} + \bar\alpha^{(1)2} + \text{etc.}\right)}.$$

by substituting for α, its value

$$\frac{\lambda - f^{(1)}\bar\alpha - f^{(2)}\alpha^{(1)} - \text{etc.}}{f},$$
the preceding exponential becomes [432]
$$c^{-\frac{3}{2}h\left(\bar\alpha^2 + \alpha^{(1)2} + \bar\alpha^{(1)2} + \text{etc.}\right) - \frac{3h}{2}\cdot\frac{\left(\lambda - f^{(1)}\bar\alpha - f^{(2)}\alpha^{(1)} - \text{etc.}\right)^2}{f^2}}.$$
the values of ᾱ, α(1) , etc. the most probable, are those which render the exponent of
c a minimum. We will differentiate therefore this exponent and we will equate to zero
the coefficients of dᾱ, dᾱ(1) , dα(1) , etc.; that which gives

f 2 ᾱ = f (1) {λ − f (1) ᾱ − f (2) α(1) − etc.},


f 2 ᾱ(1) = f (2) {λ − f (1) ᾱ − f (2) α(1) − etc.},
etc.

From these diverse equations, we deduce

$$\alpha = \frac{\lambda f}{f^2 + f^{(1)2} + f^{(2)2} + \text{etc.}}, \qquad \bar\alpha = \frac{\lambda f^{(1)}}{f^2 + f^{(1)2} + f^{(2)2} + \text{etc.}}, \qquad \alpha^{(1)} = \frac{\lambda f^{(2)}}{f^2 + f^{(1)2} + f^{(2)2} + \text{etc.}}, \qquad \text{etc.},$$

that which gives, by observing that

f α + f (1) ᾱ = lᾱ + mβ̄,

and that
$$\alpha = \frac{2\bar\beta + \bar\alpha}{\sqrt{3}},$$
$$\bar\alpha = \frac{\lambda\left(l - \frac{1}{2}m\right)}{l^2 - ml + m^2 + l^{(1)2} - m^{(1)}l^{(1)} + m^{(1)2} + \text{etc.}}, \qquad \bar\beta = \frac{\lambda\left(m - \frac{1}{2}l\right)}{l^2 - ml + m^2 + l^{(1)2} + \text{etc.}}, \qquad \bar\alpha^{(1)} = \frac{\lambda\left(l^{(1)} - \frac{1}{2}m^{(1)}\right)}{l^2 - ml + m^2 + l^{(1)2} + \text{etc.}}, \qquad \text{etc.}$$

By substituting therefore these values into the function (o), we will have the correction
resulting from the measure of the part I (2n) I (2n+1) .
But we can arrive by the following method, to this result, and have at the same
time the new law of the errors of the measure of the entire arc, which results from the
measure of a second base.
$c^{-\frac{3}{2}h\alpha^2}$ and $c^{-\frac{3}{2}h\bar\alpha^2}$ being proportional to the probabilities of α and of ᾱ, it is [433]
easy to conclude from § 21 of the second book of my Théorie analytique des Probabilités,
that, by supposing the function (o) equal to e, the probability of e will be
proportional to
$$c^{-\frac{3}{2}h\,\frac{\left(e - \lambda\,\frac{Sr^{(i)}f^{(i)}}{Sf^{(i)2}}\right)^2}{Sr^{(i)2} - \frac{\left(Sr^{(i)}f^{(i)}\right)^2}{Sf^{(i)2}}}},$$
the sign S extending to all the values of i, from i = 0, inclusively; by making therefore

$$e - \lambda\,\frac{Sr^{(i)}f^{(i)}}{Sf^{(i)2}} = \pm u,$$
the probability that the function (o) will be comprehended within the limits
$$\lambda\,\frac{Sr^{(i)}f^{(i)}}{Sf^{(i)2}} \pm u$$
will be proportional to
$$c^{-\frac{3}{2}h\,\frac{u^2}{Sr^{(i)2} - \frac{\left(Sr^{(i)}f^{(i)}\right)^2}{Sf^{(i)2}}}}.$$
thus we see that it is necessary to diminish the arc measured AI (2n+1) , by the quantity

$$\frac{\lambda\, Sr^{(i)}f^{(i)}}{Sf^{(i)2}};$$
we see next that, by this correction, the weight of the error to fear is increased. Because,
before the measure of the second base, it was
$$\frac{\frac{3}{2}h}{Sr^{(i)2}},$$
and by this measure it becomes
$$\frac{\frac{3}{2}h}{Sr^{(i)2} - \dfrac{\left(Sr^{(i)}f^{(i)}\right)^2}{Sf^{(i)2}}}.$$

we are able to observe that

$$r^2 + r^{(1)2} = p^2 - pq + q^2; \qquad f^2 + f^{(1)2} = l^2 - ml + m^2; \qquad rf + r^{(1)}f^{(1)} = l\left(p - \tfrac{1}{2}q\right) + m\left(q - \tfrac{1}{2}p\right);$$
we will be able therefore to form easily Sr(i)2 and Sr(i) f (i) by means of the coefficients [434]
of ᾱ, β̄, ᾱ(1) , . . . in the functions of (o) and (p).
If we had measured some other bases, we would have, by the method of the section
cited, the corrections which it would be necessary to make to the measured arc, and the
law of its errors.
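As a compact illustration of this § 4 correction (a sketch of my own, not the original's notation, with invented coefficients), the correction of the arc and the gain in weight follow directly from the three sums above:

```python
def base_correction(p, q, l, m, lam):
    """Given the coefficients of the error of the whole arc (function (o)) and of
    the re-measured part (function (p)), and the discrepancy lam between the
    computed and the measured second base, the arc is diminished by
    lam * S(rf) / S(f^2), and the weight grows from (3h/2)/S(r^2) to
    (3h/2)/(S(r^2) - S(rf)^2/S(f^2))."""
    Sr2 = sum(pi * pi - pi * qi + qi * qi for pi, qi in zip(p, q))
    Sf2 = sum(li * li - li * mi + mi * mi for li, mi in zip(l, m))
    Srf = sum(li * (pi - qi / 2.0) + mi * (qi - pi / 2.0)
              for pi, qi, li, mi in zip(p, q, l, m))
    correction = lam * Srf / Sf2
    weight_gain = Sr2 / (Sr2 - Srf ** 2 / Sf2)   # factor by which the weight increases
    return correction, weight_gain

# Invented coefficients for a three-triangle chain, only to show the call:
print(base_correction(p=[0.9, -1.1, 0.8], q=[0.2, -0.3, 0.1],
                      l=[0.4, 0.2, -0.3], m=[0.1, -0.2, 0.05], lam=0.02))
```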
We will have similarly the correction that we must make to the angle A(2n) which
gives the difference in longitude of the extreme points from a perpendicular to the
meridian; because the correction of A(2n) , or that which it is necessary to remove from
it, being by that which precedes

−ᾱ + ᾱ(1) − ᾱ(2) + etc.,


it will suffice to substitute, instead of the function (o), the function

−ᾱ + ᾱ(1) − ᾱ(2) + etc.,

that which gives

p = −1, q = 0, p(1) = 1, q (1) = 0, p(2) = −1, etc.,


thence it is easy to conclude that we must, in order to correct the value of A(2n), add to it
the quantity
$$\frac{\lambda\left(l - l^{(1)} + l^{(2)} - \text{etc.} - \frac{1}{2}m + \frac{1}{2}m^{(1)} - \text{etc.}\right)}{l^2 - ml + m^2 + l^{(1)2} - \text{etc.}};$$
the probability that the error of this value of A(2n) , thus corrected, is within the limits
±u will be
$$\frac{2\int dt\,c^{-t^2}}{\sqrt{\pi}},$$
the integral being taken from t null to
$$t = \frac{u\sqrt{\frac{3}{2}h}}{\sqrt{2n - \dfrac{\left(l - l^{(1)} + l^{(2)} - \text{etc.} - \frac{1}{2}m + \frac{1}{2}m^{(1)} - \text{etc.}\right)^2}{l^2 - ml + m^2 + l^{(1)2} - \text{etc.}}}}.$$

§ 5. We are arrived to the preceding results, by supposing that the law of probability
of the error α in the measure of the angle is proportional to $c^{-h\alpha^2}$, and we have proved
that this law of probability is able to be admitted in regard to the angles measured with
the repeating circle. We will show here that these results hold generally whatever be the
law of probability of error α. Let φ(α) be this law, the positive errors being supposed [435]
to have the same probability as the negative errors; let us make α − qT = ᾱ, and let us
seek the probability of the errors of the function

$$-\bar\alpha + \bar\alpha^{(1)} - \bar\alpha^{(2)} + \text{etc.} + \bar\alpha^{(2n-1)}; \qquad (i)$$

If we name α, and β the errors of the two angles of a triangle, and T the excess of their
sum over two right angles plus the spherical excess; T − α − β will be the error of the third
angle; and the probability of the simultaneous existence of these three errors, will be

φ(α).φ(β).φ(T − α − β).

The probability of ᾱ, will be therefore


 
$$\phi(\bar\alpha + qT)\,.\,\phi(\beta)\,.\,\phi\!\left[(1-q)T - \beta - \bar\alpha\right],$$

by multiplying this product by dβ.dT , and by taking the integrals from β and T equal
to −∞, to β and T equal to +∞; we will have a function which will be proportional
to the probability of ᾱ. We will suppose here, that which we are able to make, that
the function φ(α) is able to be extended to these infinite limits. Let us designate by
ψ(ᾱ) the function resulting from these integrations. The probability that the error of
the function (i) is $\pm r\sqrt{2n}$, will be, by § 18 of my Théorie analytique des Probabilités,
proportional to
$$c^{-\frac{Hr^2}{4H''}},$$
by making
$$H = 2\int d\bar\alpha\,.\,\psi(\bar\alpha); \qquad H'' = 2\int \bar\alpha^2\,d\bar\alpha\,.\,\psi(\bar\alpha),$$

the integrals being taken within the positive and negative infinite limits. Now we have
by integrating within these limits,
$$H = 2\int d\bar\alpha\,.\,d\beta\,.\,dT\,.\,\phi(\bar\alpha + qT)\,.\,\phi(\beta)\,.\,\phi(T - qT - \bar\alpha - \beta),$$

and by the theory of double integrals, this second member is equal to


$$2\int d\alpha\,.\,d\beta\,.\,dT'\,.\,\phi(\alpha)\,.\,\phi(\beta)\,.\,\phi(T'),$$
by making
$$T' = T - qT - \bar\alpha - \beta;$$

by designating therefore by K the integral $\int d\alpha\,.\,\phi(\alpha)$ taken between the infinite limits,
we will have evidently
$$H = 2K^3.$$
We will have next [436]
$$H'' = \int \bar\alpha^2\,.\,d\bar\alpha\,.\,d\beta\,.\,dT\,.\,\phi(\bar\alpha + qT)\,.\,\phi(\beta)\,.\,\phi(T - qT - \bar\alpha - \beta),$$

now we have
ᾱ = α − qT = α.(1 − q) − qβ − qT 0 ;
we will have therefore
$$H'' = \int \left[(1-q)\alpha - q\beta - qT'\right]^2\,.\,d\alpha\,.\,d\beta\,.\,dT'\,.\,\phi(\alpha)\,.\,\phi(\beta)\,.\,\phi(T').$$

If we observe now that $\int \alpha\,d\alpha\,.\,\phi(\alpha)$ is null, because, φ(α) being supposed the same for
the two errors α and −α, the two elements $-\alpha\,d\alpha\,.\,\phi(-\alpha)$ and $\alpha\,d\alpha\,.\,\phi(\alpha)$ destroy each other
in the preceding integral taken between the infinite limits; if we designate next by K''
the integral $\int \alpha^2\,.\,d\alpha\,.\,\phi(\alpha)$, we will have evidently
$$H'' = K''\,.\,K^2\,.\,\left[(1-q)^2 + 2q^2\right];$$

thus the probability of r will be proportional to


$$c^{-\frac{Kr^2}{2K''\left[(1-q)^2 + 2q^2\right]}}.$$

The value of q which renders this probability most rapidly decreasing is that which
renders $(1-q)^2 + 2q^2$ a minimum, and that value is $\frac{1}{3}$; it is necessary therefore, to have
the probability of error most rapidly decreasing, to diminish each angle of the triangle by
a third of T; and then the probability of r becomes
$$c^{-\frac{3Kr^2}{4K''}}.$$
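As a check on this minimum (an added remark, not in the original), differentiating the factor in the exponent gives
$$\frac{d}{dq}\left[(1-q)^2 + 2q^2\right] = -2(1-q) + 4q = 0 \;\Longrightarrow\; q = \frac{1}{3}, \qquad \left[(1-q)^2 + 2q^2\right]_{q=\frac{1}{3}} = \frac{2}{3},$$
so that the exponent $-\frac{Kr^2}{2K''\left[(1-q)^2+2q^2\right]}$ reduces, at $q = \frac{1}{3}$, to $-\frac{3Kr^2}{4K''}$.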

It is necessary now to determine by the observations the value of $\frac{3K}{4K''}$. For this, we
will observe that the probability of T , will be proportional to the integral
$$\int d\alpha\,.\,d\beta\,.\,\phi(\alpha)\,.\,\phi(\beta)\,.\,\phi(T - \alpha - \beta),$$

taken within the infinite limits. Let Π(T ) be this integral. The most probable sum of
the values of T 2 in the 2n observed triangles, will be by § 19 of the second book of the
work cited,
$$\frac{Q''}{Q}\,.\,2n,$$
by supposing
$$Q = \int dT\,.\,\Pi(T); \qquad Q'' = \int T^2\,dT\,.\,\Pi(T),$$

the integrals being taken between the infinite limits. Now we have [437]
$$\int dT\,.\,\Pi(T) = \int d\alpha\,.\,d\beta\,.\,dT\,.\,\phi(\alpha)\,.\,\phi(\beta)\,.\,\phi(T - \alpha - \beta);$$
and by that which precedes, this second member is equal to $K^3$; next we have
$$\int T^2\,dT\,.\,\Pi(T) = \int (T' + \alpha + \beta)^2\,.\,d\alpha\,.\,d\beta\,.\,dT'\,.\,\phi(\alpha)\,.\,\phi(\beta)\,.\,\phi(T') = \int \left(T'^2 + \alpha^2 + \beta^2\right)\,.\,d\alpha\,.\,d\beta\,.\,dT'\,.\,\phi(\alpha)\,.\,\phi(\beta)\,.\,\phi(T').$$
This second member is evidently $3K^2\,.\,K''$; we will have therefore


$$\frac{K}{K''} = \frac{6n}{\theta^2},$$
and then the probability of r becomes proportional to
$$c^{-\frac{9}{4}\,.\,2n\,.\,\frac{r^2}{\theta^2}};$$

thus the probability of the error $\pm\frac{2}{3}r'\theta$ of the function (i) will be, as previously,
$$\frac{2\int dr'\,c^{-r'^2}}{\sqrt{\pi}}.$$
It is easy to see that we are able to extend the same reasoning, to all the results to
which we are arrived, by departing from the law of probability of the error α, propor-
tional to $c^{-h\alpha^2}$. Thence these results become independent of this law, and are extended
to all the laws that are able to exist in nature.
Let us consider in fact, the function

pᾱ + q β̄ + p(1) α(1) + q (1) β̄ (1) + etc., (o)

and let us seek the probability that the value of this function will be s. In designating
by ψ(ᾱ, β̄) the probability of the coexistence of the values of ᾱ and of β̄; we will have
$$\psi(\bar\alpha, \bar\beta) = \int dT\,.\,\phi\!\left(\bar\alpha + \tfrac{1}{3}T\right).\,\phi\!\left(\bar\beta + \tfrac{1}{3}T\right).\,\phi\!\left(\tfrac{1}{3}T - \bar\alpha - \bar\beta\right),$$

the integral being taken within the infinite limits, T = −∞, and T = ∞. Next we see
by § 20 of the second book of my Théorie analytique des Probabilités, that the sought
probability will be
$$\frac{1}{\pi}\int d\omega\,.\,c^{-s\omega\sqrt{-1}}\left\{\int d\bar\alpha\,.\,d\bar\beta\,.\,\psi(\bar\alpha,\bar\beta)\cos(p\bar\alpha + q\bar\beta)\omega \times \int d\bar\alpha^{(1)}\,.\,d\bar\beta^{(1)}\,.\,\psi(\bar\alpha^{(1)},\bar\beta^{(1)})\cos(p^{(1)}\bar\alpha^{(1)} + q^{(1)}\bar\beta^{(1)})\omega \times \text{etc.}\right\}$$

The integral relative to ω being taken from ω = −π to ω = π and the integrals relative [438]
to ᾱ and β̄ being taken within their infinite limits. Let us develop into a series, ordered
with respect to the powers of ω, the logarithm of the factor of $d\omega\,.\,c^{-s\omega\sqrt{-1}}$ under the
$\int$ sign. We have
$$\log\int d\bar\alpha\,.\,d\bar\beta\,.\,\psi(\bar\alpha,\bar\beta)\cos(p\bar\alpha + q\bar\beta)\omega = \log\int d\bar\alpha\,.\,d\bar\beta\,.\,\psi(\bar\alpha,\bar\beta) - \frac{\omega^2}{2}\cdot\frac{\int d\bar\alpha\,.\,d\bar\beta\,.\,\psi(\bar\alpha,\bar\beta)\,(p\bar\alpha + q\bar\beta)^2}{\int d\bar\alpha\,.\,d\bar\beta\,.\,\psi(\bar\alpha,\bar\beta)} - \text{etc.},$$
the integral $\int d\bar\alpha\,.\,d\bar\beta\,.\,\psi(\bar\alpha,\bar\beta)$ is equal to
$$\int d\bar\alpha\,.\,d\bar\beta\,.\,dT\,.\,\phi\!\left(\bar\alpha + \tfrac{1}{3}T\right).\,\phi\!\left(\bar\beta + \tfrac{1}{3}T\right).\,\phi\!\left(\tfrac{1}{3}T - \bar\alpha - \bar\beta\right),$$
or to
$$\int d\alpha\,.\,d\beta\,.\,dT'\,.\,\phi(\alpha)\,.\,\phi(\beta)\,.\,\phi(T'),$$

all these integrals being taken within the infinite limits. This last integral is, by that
which precedes, equal to $K^3$. The integral
$$\int d\bar\alpha\,.\,d\bar\beta\,.\,\psi(\bar\alpha,\bar\beta)\,(p\bar\alpha + q\bar\beta)^2$$
is equal to
$$\int d\alpha\,.\,d\beta\,.\,dT'\,.\,\phi(\alpha)\,.\,\phi(\beta)\,.\,\phi(T')\,.\left[p\left(\tfrac{2}{3}\alpha - \tfrac{1}{3}\beta + \tfrac{1}{3}T'\right) + q\left(\tfrac{2}{3}\beta - \tfrac{1}{3}\alpha + \tfrac{1}{3}T'\right)\right]^2,$$
In the squared factor under the $\int$ sign, we are able to neglect the products of α, β, and
T′, because they produce nothing, as we have seen, in the integral; then, it is easy to see
that this integral is reduced to $\frac{2}{3}K^2K''\,.\,(p^2 - pq + q^2)$. Thence, it is easy to conclude
that the sought probability of the value s is proportional to
$$\int d\omega\,.\,c^{-s\omega\sqrt{-1} - \frac{K''\omega^2}{3K}\,.\,S\,.\,(p^2 - pq + q^2) - \text{etc.}},$$
3K
by designating by $S(p^2 - pq + q^2)$ the sum $p^2 - pq + q^2 + p^{(1)2} - p^{(1)}q^{(1)} + q^{(1)2} + \text{etc.}$
We are able, by § 20 cited, to consider only the square of ω and to neglect its superior
powers; by setting next the preceding integral under this form
$$\int d\omega\,.\,c^{-\frac{K''}{3K}\,.\,S(p^2-pq+q^2)\left[\omega - s\sqrt{-1}\,.\,\frac{3K}{2K''\,.\,S(p^2-pq+q^2)}\right]^2 - \frac{3}{4}\cdot\frac{Ks^2}{K''\,.\,S(p^2-pq+q^2)}},$$

the integral is able, by this same section, to be taken from ω = −∞, to ω = ∞; the
probability of the value of s, is thus proportional to
$$c^{-\frac{3}{4}\cdot\frac{Ks^2}{K''\,.\,S(p^2-pq+q^2)}},$$

and consequently to [439]
$$c^{-\frac{9}{4}\cdot\frac{2n}{\theta^2}\cdot\frac{s^2}{S(p^2-pq+q^2)}};$$
thus the probability that the value of s is comprehended within the limits
$$\pm\frac{2}{3}\,t\theta\,\sqrt{\frac{S(p^2 - pq + q^2)}{2n}},$$
will be
$$\frac{2\int dt\,.\,c^{-t^2}}{\sqrt{\pi}},$$
the integral being taken from t null; that which is conformed to that which precedes.
We have supposed in that which precedes,
$$\bar\alpha = \alpha - \tfrac{1}{3}T; \qquad \bar\beta = \beta - \tfrac{1}{3}T; \qquad \text{etc.};$$
that is, that we correct each of the three angles of each triangle, by one third of the
observed error of the sum of its three angles. But is this correction here the most
advantageous? This is that which we will examine. Let us suppose generally in the
function (o)

ᾱ = α + iT ; β̄ = β + lT ; ᾱ(1) = α(1) + i(1) T ; etc.


1 1
Then ᾱ being α − 3T; β̄ being β − 3T; we will have
α = α + (1 + 31 ).T ; β = β + (1 + 13 ).T ; etc.,
and the function (o) will become
pα + pβ + p(1) α(1) + etc. + S.(pi + ql).T ;
and the correction of the calculated arc AI (2n) will be −S.(pi + ql)T ; and its error will
become
pα + pβ + p(1) α(1) + etc.
By applying to this function, the preceding analysis; we will find easily that the proba-
bility of the value s of this function, is proportional to
$$c^{-\frac{9}{4}\,.\,2n\,.\,\frac{s^2}{\theta^2\,.\,S\left(p^2 - pq + q^2 + \frac{9}{2}(pi+ql)^2\right)}},$$
2n being the number of triangles employed.
It is clear that the values of i and of l which render the coefficient of s2 a maximum, [440]
are those which render pi + ql, null. Then the preceding correction of the arc AI (2n)
is null, and the law of probability of its errors is the same as in the case of i and of l
nulls. This case gives therefore the law of probability of the errors, the most rapidly
decreasing, a law which must be evidently adopted.
We will note here that in the calculation of the function (o), we are able to apply at
will, the errors α and β to two of the angles of the triangle. Similarly in the calculation
of the arc AI (2n) , we are able at will, in the series of the triangles which serve to this
calculation, to name them first, second, etc., triangles; only we will observe that α and
β belong to the first triangle, α(1) and β (1) to the second and thus of the rest. But it
will be simpler to enumerate them according to the order in which we use them.

DEUXIÈME SUPPLÉMENT.
APPLICATION DU CALCUL DES PROBABILITÉS AUX OPÉRATIONS
GÉODÉSIQUES

Pierre Simon Laplace∗


February 1818,
OC 7, pp. 531–580

We determine the length of a great arc, on the surface of the Earth, by a chain [531]
of triangles which are supported on a base measured with exactitude. But, whatever
precision that we bring into the measure of the angles, their inevitable errors can, by ac-
cumulating, deviate sensibly from the truth the value of the arc that we have concluded
from a great number of triangles. We know therefore only imperfectly this value, if we
are not able to assign the probability that its error is comprehended within some given
limits. The desire to extend the application of the Calculus of Probabilities to natural
Philosophy has made me seek the formulas proper to this object.
This application consists in deducing from the observations the most probable re-
sults and to determine the probability of the errors of which they are always suscep-
tible. When, these results being known very nearly, we wish to correct them from a
great number of observations, the problem is reduced to determine the probability of
one or many linear functions of the partial errors of the observations, the law of prob-
ability of these errors being supposed known. I have given, in Book II of my Théorie
analytique des Probabilités, a method and some general formulas for this object, and I
have applied them, in the first Supplement, to some interesting points of the System of
the world. In questions of Astronomy, each observation furnishes, in order to correct
the elements, an equation of condition: when these equations are very manifold, my
formulas give, at the same time, the most advantageous corrections and the probability [532]
that the errors, after these corrections, will be contained within some assigned limits,
whatever be moreover the law of probability of the errors of each observation. It is so
much the more necessary to be rendered independent of this law, as the simplest laws
are always infinitely less probable, seeing the infinite number of those which are able
to exist in nature. But the unknown law which the observations of which we make use
follow introduces into the formulas an indeterminate which would permit not at all to
reduce them in numbers, if we did not succeed to eliminate it. This is that which I have
done, by means of the sum of the squares of the remainders, when we have substituted,
into each equation of condition, the most probable corrections. The geodesic questions
offering not at all similar equations, it was necessary to seek another means to eliminate
from the formulas of probability the indeterminate dependent on the law of probability
∗ Translated by Richard J. Pulskamp, Department of Mathematics & Computer Science, Xavier Univer-

sity, Cincinnati, OH. January 8, 2014

of the errors of each partial observation. The quantity by which the sum of the angles
of each observed triangle surpasses two right angles plus the spherical excess has fur-
nished me this means, and I have replaced by the sum of the squares of these quantities
the sum of the squares of the remainders of the equations of condition. Thence, we are
able to determine numerically the probability that the final result of a long sequence
of geodesic operations does not exceed a given quantity. By applying these formulas
to the measure of a perpendicular to the meridian, they will make estimate the errors,
not only of the total arc, but also of the difference in longitude of its extreme points,
concluded from the chain of triangles which unite them and from the azimuths of the
first and of the last side of this chain. If we diminish, as much as it is possible, the
number of triangles and if we give a great precision to the measure of their angles, two
advantages that the use of the repetitive circle and of the reflectors procure, this way to
have the difference in longitude of the extreme points of the perpendicular will be one
of the better of which we are able to make use.
In order to be assured of the exactitude of a great arc which is supported on a base
measured toward one of its extremities, we measure a second base toward the other
extremity, and we conclude from one of these bases the length of the other. If the length [533]
thus calculated deviates very little from observation, there is everywhere to believe that
the chain of triangles is quite nearly exact, just as the value of the great arc which results
from it. We correct next this value, by modifying the angles of the triangles, in a manner
that the bases calculated accord themselves with the measured bases, that which is able
to be made in an infinity of ways. Those that we have until the present employed are
based on some vague and uncertain considerations. The methods exposed in Book II
lead to some very simple formulas, in order to have directly the correction of the total
arc which results from the measures of many bases. These measures have not only the
advantage to correct the arc, but further to increase that which I have named the weight
of a result, that is to render the probability of its errors more rapidly decreasing, so that
the same errors become less probable with the multiplicity of the bases. I expose here
the laws of probability of the errors that the addition of new bases give birth to. The
measure of a second base serves similarly to correct the difference in longitude from
the extreme points of a perpendicular to the meridian and to increase the weight of the
value of this difference.
After we brought, in the observations and in the calculations, the exactitude that
we require now, we considered the sides of the geodesic triangles as rectilinear, and we
supposed the sum of their angles equal to two right angles. Legendre has remarked first
that the two errors that we commit thus compensate themselves mutually, that is that by
subtracting from each angle of a triangle the third of the spherical excess, we are able
to neglect the curvature of its sides and to regard them as rectilinear. But the excess
of the three observed angles over two right angles is composed of the spherical excess
and the sum of the errors of the measure of each of the angles. The analysis of the
probabilities shows that we must yet subtract from each angle the third of this sum, in
order to have the law of probability of the errors of the results most rapidly decreasing.
Thus, by the equal apportionment of the error of the sum observed of the three angles of [534]
the triangle considered as rectilinear, we correct at the same time the spherical excess
and the errors of the observations. The weight of the angles thus corrected increases,
so that the same errors become, by this correction, less probable. There is therefore

2
advantage to observe the three angles of each triangle, and to correct them as we have
just said. Simple good sense makes to have a presentiment of this advantage; but the
Calculus of probabilities is able alone to estimate and to show that, by this correction,
it becomes the greatest that it is possible.
The formulas of which I just spoke are related to some future observations: thus,
when we apply them to some past observations, we set aside all the data that the com-
parison of these observations are able to furnish out of the errors, data of which we
are able to make use when we know the law of probability of the errors of the partial
observations. If this law is expressed by a constant less than unity, of which the expo-
nent is the square of the error, then my formulas agree to the past observations as to
the future observations, and they satisfy to all the data of these observations, as I have
shown in § 25 of Book II. In the case where the angles are measured by means of a
repeating circle, each simple angle is the mean result of a great number of measures of
the same angle contained in the total arc observed; the error of the angle is therefore
the mean of the errors of all these measures; and, by § 18 of Book II, the probability of
this error is expressed by a constant, of which the exponent is equal to the square of the
error. The employment of the repeating circle unites therefore to the benefit of giving
a precise measure of the angles the one to establish a law of probability of the errors
which satisfies all the data of the observations.
In order to apply with success the formulas of probability to the geodesic obser-
vations, it is necessary to return faithfully all those that we would admit if they were
isolated, and to reject none of them by the sole consideration that it extended a little
from the others. Each angle must be uniquely determined by its measures, without
regard to the two other angles of the triangle in which it belongs; otherwise, the error [535]
of the sum of the three angles would not be the simple result of the observations, as
the formulas of probability supposes it. This remark seems to me important, in order
to disentangle the truth in the middle from the slight uncertainties that the observations
present.

§ 1. Let us conceive, on a sphere, an arc of great circle A, A0 , A00 , . . . and sup-


pose that we have formed about the chain of triangles ACC 0 , CC 0 C 00 , C 0 C 00 C 000 ,
C 00 C 000 C iv , . . . , of which the sides CC 0 , C 0 C 00 , C 00 C 000 , . . . cut this arc at A0 , A00 ,
A000 , . . . I do not give at all the figure, because it is easy to trace it according to these in-
dications. Let A be the angle CAA0 , A(1) the angle C 0 A0 A00 , A(2) the angle C 00 A00 A000 ,
. . . Let further C be the angle ACC 0 , C (1) the angle CC 0 C 00 , C (2) the angle C 0 C 00 C 000 ,
. . . We will have
A + A(1) + C − α = π + t,
α being the error of the observed angle C, t being the excess of the angles of the spher-
ical triangle ACA0 over π which expresses two right angles or the semi-circumference
of which the radius is unity. We will have similarly

A(1) + A(2) + C (1) − α(1) = π + t(1) ,

α(1) being the error of the observed angle CC 0 C 00 , and t(1) being the excess of the
angles of the spherical triangle A0 C 0 A00 over two right angles. We will form similarly

3
the equations
A(2) + A(3) + C (2) − α(2) = π + t(2) ,
A(3) + A(4) + C (3) − α(3) = π + t(3) ,
························ ;
whence we deduce easily

A(2i) = A +C −C (1) +C (2) −C (3) + · · · +C (2i−2) −C (2i−1)


−α +α(1) −α(2) +α(3) − · · · −α(2i−2) +α(2i−1)
−t +t(1) −t(2) +t(3) − · · · −t(2i−2) +t(2i−1) ,
(2i−1)
A = π −A −C +C (1) −C (2) +C (3) − · · · −C (2i−2)
+α −α(1) +α(2) −α(3) + · · · +α(2i−2)
+t −t(1) +t(2) −t(3) + · · · +t(2i−2) ;

by supposing therefore A well known, the error of the angle A(n) is [536]

α(n−1) − α(n−2) + α(n−3) − · · · ± α

the superior sign having place if n is odd, and the inferior sign having place if n is even.
The values of t, t(1) , . . . are quite small and are able to be determined with precision.
The concern is now to have the probability that this error will be comprehended
within given limits. For this, I will suppose first that the probability of any error α is
2
proportional to c−hα , c being the number of which the hyperbolic logarithm is unity.
This supposition, the most natural and the most simple of all, results from the use of the
repeating circle in the measure of the angles of the triangles. In fact, let us name φ(q)
the probability of an error q in the measure of a simple angle, this probability being
supposed the same for the positive and for the negative errors. Let us suppose further
that s is the number of simple angles contained in all the series that we have made in
order to determine this angle. The probability that the error of the mean result or of the
angle concluded from these series will be ± √rs , by § 18 of Book II, proportional to

kr 2
c− 2k00
R
k being equal to dq φ(q), the integral being taken from q null to q equal to its greatest
value, that we are always able to supposeR infinite; by making φ(q) discontinuous and
null beyond the limit of q, k 00 is equal to q 2 dq φ(q). By supposing therefore

√ ks
r = α s, h= ,
2k 00
2
c−hα will be the probability of the error α. We will see, at the end of this article, that
the following results always hold, whatever be the probability of α.
Let β and γ be the errors of the two angles AC 0 C and CAC 0 of the first triangle
ACC 0 ; the probability of the three errors α, β and γ will be proportional to [537]
2
−hβ 2 −hγ 2
c−hα ;

4
but the observation of these angles give the sum α + β + γ of the three errors; because
the sum of the three angles must be equal to two right angles plus the surface of the
triangle ACC 0 , if we name T the excess of the three angles observed on this quantity,
we will have
α + β + γ = T;
the preceding exponential becomes thus
1 1 2 2 h 2
− 3h 1
c−2h(β+ 2 α− 2 T ) 2 (α− 3 T ) − 3 T ,

β being susceptible to all the values from −∞ to ∞; it is necessary to multiply this


exponential by dβ and take the integral within these limits, that which gives an integral
which has for factor 2
3h 1 h 2
c− 2 (α− 3 T ) − 3 T ;
the probability of α is therefore proportional to this factor. The value of α most prob-
able is evidently that which renders null the quantity α − 13 T ; it is necessary therefore
to correct the three angles of each triangle by the third of the excess T of their sum
observed over two right angles plus the spherical excess. This is that which we do
commonly.
Let us name ᾱ and β̄ the quantities α − 13 T and β − 13 T ; the probability of ᾱ will
be proportional therefore to
3 2
c− 2 hα .
If we diminish the angle C by 31 T , that is if we employ the corrected angles of each
triangle, by naming C̄, C̄ (1) , . . . that which the angles C, C (1) , . . . become, by these
corrections, we will have

A(2i) =A + C̄ − C̄ (1) + C̄ (2) − · · · − ᾱ + ᾱ(1) − ᾱ(2) + · · · − t + t(1) − · · ·


A(2i−1) =π − A − C̄ + C̄ (1) − · · · + ᾱ − ᾱ(1) + · · · + t − t(1) + · · ·
The probability that the quantity [538]

ᾱ(n−1) − ᾱ(n−2) − · · · ± ᾱ

or the error of the angle A(n) will be comprehended within the limits ±r n, will be,
by § 18 cited, q
2 32 h Z 3 2
√ dr c− 2 hr .
π
We are able to observe here the advantage that the observation of the three angles of
each triangle produces, by the correction of these angles. Without this correction, the
error of the angle A(n) would be

α(n−1) − α(n−2) − · · · ± α,

and the probability that this error is comprehended within the limits ±r n would be
√ Z
2 h 2
√ dr c−hr .
π

5
a probability less than the preceding in which the weight of the result is 32 h, instead as
it is here h.
Let us determine now the value of h. Among the data of the observations, the
quantities by which the sums of the angles of each triangle surpass two right angles
plus the spherical excess appear to be the most proper to make known this value. By
that which precedes, the probability of the simultaneous existence of ᾱ and of T is
proportional to
h 2 3h 2
c− 3 T − 2 ᾱ .
By multiplying this exponential by dᾱ, and taking the integral from ᾱ = −∞ to ᾱ =
h 2
∞, the integral will have for factor c− 3 T , and this factor will be proportional to the
probability of T ; this probability will be therefore
h 2
dT c− 3 T
h 2
,
dT c− 3 T
R

the integral of the denominator being taken from T = −∞ to T = ∞. It will be thus [539]
proportional to q
1
3h h 2
√ c− 3 T .
π
Here the observed event is that the sum of the angles of the first triangle, of the second,
of the third, etc. surpass two right angles plus the spherical excess, respectively, by the
quantities T, T (1) , . . . , T (n−1) , n being the number of triangles; the probability of this
event will be therefore proportional to
 1  n2
3h h 2
c− 3 θ ,
π
by making
θ2 = T 2 + T (1)2 + · · · + T (n−1)2 .
Now, if we consider the diverse values of h as causes of the observed event, the proba-
bility of h will be, by the principle of the probability of the causes drawn from observed
events, equal to
n h 2
h 2 dh c− 3 θ
R n h 2 ,
h 2 dh c− 3 θ
the integral of the denominator being taken for all the values of h, that is from h = 0
to h = ∞. The value of h that it is necessary to choose is evidently the integral of the
products of the values of h multiplied by their probabilities; this value is therefore
R n+2 h 2
h 2 dh c− 3 θ
R n h 2 ,
h 2 dh c− 3 θ
the integrals being taken from h = 0 to h = ∞. The integral of the numerator is equal
to Z
3(n + 2) n h 2
h 2 dh c− 3 θ .
2θ2

6
The preceding fraction becomes thus 3(n+2) 2θ 2 ; this is therefore the value of h that it is
necessary to adopt. If we suppose n a great number, this value become very nearly [540]
3n
2θ 2 . This quantity is the value of h which renders the observed event most probable,
n h 2
the probability of this event, a priori, being proportional to h 2 c− 3 θ . By taking for h
3n (n)
the quantity 2θ 2 , the probability that the error of the angle A will be comprehended

within the limits ±r n is √ Z
3 n 9 nr 2
√ dr c− 4 θ2 ;
θ π
the probability that it will be comprehended within the limits ± 32 θr0 is therefore
2
dr0 c−r
R
2
√ ,
π

the integral being taken from r0 null.

§ 2. Let us suppose the arc AA0 A00 . . . perpendicular to the meridian of the point A.
Let φ be the angle formed by this meridian and by the one of the extreme point A(n) ,
and V the smallest of the angles that this last meridian makes with the arc AA0 . . .; we
will have
cos V
sin φ = ,
sin l
l being the latitude of point A. By designating therefore by δφ and δV the errors of the
angles φ and V , we will have

δV sin V
δφ = − .
sin l cos φ
If we have measured with a great exactitude the angle that the last side of the chain of
triangles forms at A(n) with the meridian of this point, it is easy to see that

δV = ±δA(n) ,

δA(n) being the error of A(n) ; the preceding integral in r0 is therefore the probability
that the error δφ of the longitude φ concluded from the azimuths observed at A and [541]
A(n) will be comprehended within the limits ± 32 θr0 sinsin V
l cos φ .
There results from the analysis exposed in Chapter V of Book III of the Mécanique
céleste that, if there exists an eccentricity in the terrestrial parallels, it has no sensible
influence on the value of φ concluded in this manner, provided that the measured arc
is not very considerable. In measuring therefore, with a great precision, the angles
of the diverse triangles and the amplitudes of the extreme points, we will have quite
exactly the difference in longitude of these points, and we will be able, by the preceding
formula, to estimate the probability of the small errors to fear respecting this difference.
Let us determine presently the probability that the error of the measure of the line
AA0 A00 . . . will be comprehended within some given limits. For this, let us suppose
that in the triangles CAC 0 , C 0 CC 00 , . . . we had corrected the angles as one does or-
dinarily, that is by subtracting from each the third of the quantity bywhich the sum of

7
the three observed angles surpasses two right angles plus the spherical excess. If we
lower the vertices C, C 0 , C 00 . . . . of the perpendiculars CI, C 0 I 0 , C 00 I 00 , . . . onto the
line AA0 A00 . . .; we will have, very nearly,

AI = AC cos IAC.

We will have next, quite nearly,

II 0 = CC 0 cos A(1)

and, generally,
I (i) I (i+1) = C (i) C (i+1) cos A(i+1) .
By supposing therefore that δ is the characteristic of the errors, we will have

δ.I (i) I (i+1) δ.C (i) C (i+1)


= − δA(i+1) tan A(i+1) .
I (i) I (i+1) C (i) C (i+1)
We have, by that which precedes,

δA(i+1) = ᾱ(i) − ᾱ(i−1) + ᾱ(i−2) − · · · ± ᾱ;

next, we have, in the (i + 1)st triangle, [542]

C (i) C (i−1) sin C (i+1) C (i−1) C (i)


C (i) C (i+1) = ,
sin C (i−1) C (i+1) C (i)
that which gives

δ.C (i) C (i+1) δ.C (i) C (i−1)


(i) (i+1)
= +δC (i+1) C (i−1) C (i) cot C (i+1) C (i−1) C (i)
C C C (i) C (i−1)
−δC (i−1) C (i+1) C (i) cot C (i−1) C (i+1) C (i) ;

but ᾱ(i) is, by that which precedes, the error of the angle C (i) or C (i−1) C (i) C (i+1) ,
corrected by subtracting from it the third of the excess of the sum of the three ob-
served angles of the triangle over two right angles. Let β̄ (i) be the error of the angle
C (i−1) C (i+1) C (i) , thus corrected; −(ᾱ(i) + β̄ (i) ) will be the error of the third angle
C (i+1) C (i−1) C (i) . We will have therefore

δ.C (i) C (i+1) δ.C (i) C (i−1)


(i) (i+1)
= +(ᾱ(i) + β̄ (i) ) cot C (i+1) C (i−1) C (i)
C C C (i) C (i−1)
−β̄ (i) cot C (i−1) C (i+1) C (i) ;
that which gives, by observing that, in the first triangle, the side C (i−1) C is AC that I
supposed measured very exactly.

δ.C (i) C (i+1)


= −S[(ᾱ(i) + β̄ (i) ) cot C (i+1) C (i−1) C (i) + β̄ (i) cot C (i−1) C (i+1) C (i) ],
C (i) C (i+1)

8
the sign S serving to express the sum of all the quantities that it contains from i = 0
to i inclusively. We will have therefore thus the value of δ.I (i) I (i+1) . By reuniting all
these values, we will have, for the entire error of their sum or of the measured line, an
expression of this form

(o) pᾱ + q β̄ + p(1) ᾱ(1) + q (1) β̄ (1) + · · ·

The probability of the simultaneous values of ᾱ and of β̄ is, by that which precedes,
proportional to
2
− 23 hα2
c−2h(β̄+ 2 ᾱ)
1
.
By making
1 1 √
β̄ + ᾱ = α 3,
2 2
the preceding exponential becomes [543]
2
3
− 32 hᾱ2
c− 2 hα ;
thus the laws of probability of the values of α and of ᾱ are the same. The function (o)
takes then this form

(o0 ) rα + r(1) ᾱ + r(2) α(1) + r(3) ᾱ(1) + · · ·

The probability that the error of this function, and consequently of the function (o), is
comprehended within the limits ±s, by § 20 of Book II,
2
2 dt c−t
R
√ ,
π
the integral being taken from t null to t equal to
s
3
2h
s 2 (1)2
r +r + r(2)2 + · · ·
We have evidently

1 √
 
1
pᾱ + q β̄ = p − q ᾱ + qα 3;
2 2
that which gives, by equating it to rα + r(1) ᾱ,
1 √ 1
r= q 3, r(1) = p − q;
2 2
3n
the value of t will be therefore, by substituting for h its value 2θ 2 ,

9
r
3s n
.
2θ p2 − pq + q2 + p(1)2 − p(1) q (1) + q (1)2 + · · ·
The length of the measured arc makes known that of the osculating radius of the
surface at the point A of departure. Let 1 + u be the radius drawn from the center
of gravity of the Earth to its surface, u being a function of the longitude and of the
latitude, the semi-axis of the Earth being taken for unity; if we name R the osculating
radius of this point, in the sense AA0 , we will have, by the Chapter cited from Book III [544]
of the Mécanique céleste,
 
  d du
du dφ 2
R=1+u− tan l + ;
dl cos2 l
and if the name  the length of the measured arc AA(1) , we will have, quite nearly,

R= (1 − 13 2 tan2 l);
φ cos l
that which gives, quite nearly,
δ  δφ
δR = − 2 ;
φ cos l φ cos l
but we have, by that which precedes,

δ =pᾱ + q β̄ + · · · ,
∓δA(n) ±(ᾱ − ᾱ(1) + ᾱ(2) − · · · )
δφ = = ,
sin l sin l
the inferior sign having place if n is even, and the superior sign if n is odd. By making
therefore
p  q
p̄ = ∓ , q̄ = ,
φ cos l φ2 sin l cos l φ cos l
p(1)  q (1)
p̄(1) = ∓ , q̄ (1) = ,
φ cos l φ2 sin l cos l φ cos l
····················· , ······ ,
the probability that the error δR will be comprehended within the limits ±s will be
2
2 dt c−t
R
√ ,
π
the integral being taken from t null to
r
3s n
t= .
2θ p̄2 − p̄q̄ + q̄ 2 + p̄(1)2 − p̄(1) q̄ (1) + · · ·
The difference in latitude of the extreme points of the perpendicular depends, by the [545]

10
Chapter cited from the Mécanique céleste, on the eccentricity of the terrestrial parallels,
which introduce into its expression the quantity

   
du d du
(u) −φ tan l + ;
dφ dφ dl

the part of this expression which is independent of this eccentricity is proportional to


φ2 ; thus the small error of which φ is susceptible has no sensible influence at all on
the difference in latitude. By observing therefore with a great care this difference, the
eccentricity of the terrestrial parallels must be manifest, as little as it is sensible.
If the geodesic line has been traced in the sense of the meridian, the azimuth, at
the extremity of the measured arc, will make known the eccentricity of the terrestrial
parallels, and it is remarkable that this azimuth is the function (u), by changing φ into
the difference in latitude of the extreme points of the measured arc and by multiplying
it by the sine of the latitude divided by the square of the cosine of the latitude at the
origin of the arc.
The arc measured in the sense of the meridian will make known the osculating
radius of the Earth in this sense, and, by the preceding formulas, we will have the
probability of the errors of which its value is susceptible.
We will obtain more precision in all the results by fixing toward the middle of the
measured arc the origin of the angles; because then the superior powers of these angles,
that we neglect, becomes much smaller.

§ 3. Let us suppose that, in order to verify the operations, we measure, toward the
extremity A(n) of the arc AA0 A00 . . ., a second base. The expression of the error of this
base, concluded from the chain of the triangles and from the base measured at the point
A, will be, by that which precedes, of the form

(p) lᾱ + mβ̄ + l(1) ᾱ(1) + m(1) β̄ (1) + · · · ;

let λ be this error which will be known by the direct measure of the second base. If in
the function (p) we make, as previously
1 1 √
β + ᾱ = α 3,
2 2
it takes this form [546]

f α + f (1) ᾱ + f (2) α(1) + f (3) ᾱ(1) + · · ·


By designating by s the value of the function (o) or of its equivalent (o0 ) and observing
3 2
that the probabilities of α and of ᾱ follow the same law and are proportionals to c− 2 hα
3 2
and c− 2 hᾱ , the probability of the preceding function will be proportional to
2
3
+ ᾱ2 + α(1)2 +ᾱ(1)2 +··· )
c− 2 h(α .
By supposing the function equal to λ, this exponential becomes

11
  2 
2 (1) 2
− 23 h (α− fFλ ) + ᾱ− f F λ +···+ λF
c ,
F expressing the sum of the squares f 2 + f (1)2 + f (2)2 + · · · The most probable values
of α, ᾱ, α(1) , . . . are evidently those which render the exponent of this exponential a
minimum, that which gives

fλ f (1) λ f (2) λ
α= , ᾱ = , α(1) = , ,...
F F F
If we observe next that we have, by that which precedes,
1 √ 1
f= m 3, f (1) = l − m,
2 2
1 √ 1
β̄ = α 3 − ᾱ,
2 2
we will have

(l − 12 m)λ (m − 12 l)λ
α= , β̄ = ,
F F
1 λ (m(1) − 21 l(1) )λ
ᾱ(1) = (l(1) − m(1) ) , β̄ (1) = ,
2 F F
····················· , ····················· ,
and F will become

l2 − ml + m2 + l(1)2 − m(1) l(1) + m(1)2 + · · ·


If we substitute these values into the function (o), we will have the correction resulting
from the measure of a second base, by affecting it with a contrary sign. But we are able [547]
to arrive to this result directly, by § 21 of Book II, according to which we see that, s
being the value of the function (o), its probability is proportional to
!
(i) f (i)
3 h s−λ Sr
2 Sf (i)2

(Sr (i) f (i) )2
Sr (i)2 −
c Sf (i)2 ,
the sign S extending to all the values of i, from i = 0 inclusively. The most probable
value of s is that which renders null the exponent of c, that which gives

Sr(i) f (i)
s=λ ;
Sf (i)2
it is necessary therefore to subtract from the measured arc AA(1) . . . A(n) this value of
s; and, if we name u the error of the arc thus corrected, the probability of u will be
proportional to
3 hu2
− 2
(Sr(i) f (i) )2
Sr (i)2 −
c Sf (i)2 .

12
We see by this expression that the weight of the result is increased by virtue of the
measure of the second base; because, before this measure, the coefficient of −s2 was,
by the preceding section,
3
2h
,
Sr(i)2
and, by this measure, the coefficient of −u2 becomes
3
2h
(i) f (i) )2
.
Sr(i)2 − (SrSf (i)2
The same error becomes therefore less probable by this measure and by the preceding
correction of this arc.
We are able to observe here that the preceding values of r, r(1) , f and f (1) give

r2 + r(1) = p2 − pq + q 2 ,
f 2 + f (1)2 = l2 − ml + m2 ,
1 1
rf + r(1) f (1) = l(p − q) + m(q − p).
2 2
We will be able therefore to form easily Sr(i)2 and Sr(i) f (i) by means of the coeffi- [548]
cients of ᾱ, β̄, ᾱ(1) , . . . in the functions of (o) and (p).
If we had measured some other bases, we would have, by the analysis of § 21 of
Book II, the corrections which it would be necessary to make to the measured arc, and
the law of its errors.
The measure of a new base is able to serve to correct, not only the measured arc, but
also the difference in longitude of its extreme points or the angle A(n) . It will suffice
to substitute into the function (o) this one

±(ᾱ − ᾱ(1) + ᾱ(2) − · · · )


which expresses the error of A(n) , the superior sign having place if n is odd, and the
inferior if n is even. Then we have

p = ±1, q = 0, p(1) = ∓1, q (1) = 0, ...;


thence it is easy to conclude that, in order to correct the angle A(n) , it is necessary to
add to it the quantity

∓λ(l − l(1) + l(2) − · · · − 21 m + 12 m(1) − · · · )


.
l2 − ml + m2 + l(1)2 − m(1) l(1) + m(1)2 + · · ·
The probability that the error of A(n) thus corrected is within the limits ±u will be
2
2 dt c−t
R
√ ,
π
the integral being taken from t null to

13
q
3
u 2h
t= q
(l−l(1) +l(2) −···− 12 m+ 21 m(1) −··· )2
n− l2 −ml+m2 +l(1)2 −···

§ 4. We are arrived to the preceding results by starting from the law of probability
2
of the error α proportional to c−hα , and we have proved that this law of probability
is able to be admitted in regard to the angles measured with the repeating circle. We
will show here that these results hold generally, whatever be the law of probability of
error α. Let φ(α) be this law. We will suppose it such that the same positive and [549]
negative errors are equally probable. We will suppose, moreover, that φ(α) extends
from α = −∞ to α = +∞: this supposition is always permitted; because, if the
probability becomes null beyond certain limits, the function φ(α) is then discontinued
and null beyond these limits. Let us seek now the probability of the values of the
function (o) of § 1. This function has been calculated by correcting the angles of each
triangle by a third of the observed sum of their errors. Let us suppose generally that,
in the first triangle, we correct the error α by (i + 31 )T , the error β by (i1 + 31 )T , and
consequently the third error by ( 13 − i − i1 )T , by designating by α and β the errors α
and β thus corrected, we will have

α = α + (i + 31 )T, β = β + (i1 + 13 )T.

By designating similarly by α(1) and β (1) the errors α(1) and β (1) respectively cor-
(1)
rected from (i(1) + 31 )T (1) , (i1 + 13 )T (1) , we will have
(1)
α(1) = α(1) + (i(1) + 13 )T (1) , β (1) = β (1) + (i1 + 31 )T (1) ,
and thus consecutively. The function (o) is, by § 1, equal to

pᾱ + q β̄ + p(1) ᾱ(1) + q (1) β̄ (1) + · · · ;


next, we have

α = ᾱ + 31 T = α + (i + 31 )T ;
that which gives

ᾱ = α + iT ;
we have similarly

β̄ = β + i1 T, ᾱ(1) = α(1) + iT, ...


The function (o) becomes thus

pα + qβ + p(1) α(1) + q (1) β (1) + · · · + S(pi + qi1 )T,


S(pi + qi1 )T designating the sum

14
(1)
(pi + qi1 )T + (p(1) i(1) + q (1) i1 )T (1) + · · ·
The correction of the function (o) relative to the values of i, i1 , i(1) , . . . is therefore [550]

−S(pi + qi1 )T,


and then this function thus corrected becomes

() pα + qβ + p(1) α(1) + q (1) β (1) + p(2) α(2) + · · · ,

In order to have the probability of the values of this last function, we will observe that
the probability of the simultaneous existence of the values of α, β and T is

dα dβ dT φ(α)φ(β)φ(T − α − β)
RRR ,
dα dβ dT φ(α)φ(β)φ(T − α − β)
the integrals of the denominator being taken
R within their positive and negative infinite
limits. Let us designate by k the integral dαφ(α), taken within these limits; it is easy
to see that this denominator will be equal to k 3 . The preceding fraction becomes thus
dα dβ dT
φ(α)φ(β)φ(T − α − β);
k3
the probability of the simultaneous existence of the values of α, β and T will be there-
fore
dα dβ dT
φ[α + (i + 13 T )]φ[β + (i1 + 31 T )]φ[( 13 − i − i1 )T − α − β]
k3
T being supposed to be able to be varied from −∞ to +∞, we will have the probability
of the simultaneous values of α and β by integrating the preceding function with respect
dα dβ
to T , within the infinite limits. Let us name k3 ψ(α, β) this integral. We see, by § 20
of Book II, that by designating by s the value of the function (), the probability of s
will be proportional to

 ZZ 


 dα dβ ψ(α, β) cos(pα + qβ)w 


Z √

 

−sw −1
Z Z
(H) dw c (1) (1) (1) (1) (1) (1) (1) (1) ,

 × dα dβ ψ(α , β ) cos(p α + q β )w 

 

 
× .................................................
the integral relative to w being taken from w = −π to w = π and the integrals relative [551]
to α and β being taken within their infinite limits. Let us develop into a series, ordered
to the powers of w, the function contained within the parenthesis. The
with respect RR
logarithm of dα dβ ψ(α, β) cos(pα + qβ)w is equal to
w2
dα dβ ψ(α, β)(pα + qβ)2
ZZ RR
2
log dα dβ ψ(α, β) − RR − ···
dα dβ ψ(α, β)

15
Now we have
ZZ
dα dβ ψ(α, β)
ZZZ
= dα dβ dT φ[α + (i + 13 T )]φ[β + (i1 + 13 T )]φ[( 13 − i − i1 )T − α − β].

The integrals being taken within their infinite limits, it is easy to see, by the known
theory of multiple integrals, that the second member of this equation is equal to
ZZZ
dα dβ dT 0 φ(α)φ(β)φ(T 0 ),

T 0 being equal to T − α − β; it is therefore equal to k 3 .


We have next
 ZZ


 dα dβ ψ(α, β)(pα + qβ)2
(u) ZZZ
dα dβ dT 0 φ(α)φ(β)φ(T 0 )(pα + qβ)2 ,

 =

by substituting for α and β their values in α, β, and T 0 in the quantity (pα+qβ)2 . Now
it follows from that which precedes that we have

α = ( 23 − i)α − (i + 31 )β − (i + 13 )T 0 ,
β = ( 23 − i)β − (i + 31 )α − (i1 + 31 )T 0 .

By substituting these values into the quantity (pα + qβ)2 , we will be able, in its de-
velopment, to neglect the terms dependent on the products αβ, αT 0 , βT 0 , because the
triple integral
ZZZ
(u) dα dβ dT 0 φ(α)φ(β)φ(T 0 )(pα + qβ)2

being taken within its infinite limits, and the function φ(α) being supposed the same [552]
for the values +α and −α, it is clear that the elements of this integral depending on
+αβ will be destroyed by R the negative elements depending on −αβ. If we observe
next that by designating α2 dα φ(α) by k 00 , we have
ZZZ
α2 dα dβ dT 0 φ(α)φ(β)φ(T 0 ) = k 2 k 00 ,

the function (u) will become


2
k 2 k 00 [ (p2 − pq + q 2 ) + 3(pi + qi1 )2 ];
3
the logarithm of ZZ
dα dβ ψ(α, β) cos(pα + qβ)w

16
being thus
k00 2 2 2
log k 3 − 2k w [ 3 (p − pq + q 2 ) + 3(pi + qi1 )2 ] − · · ·

By passing again from logarithms to the numbers and neglecting, consistently with the
analysis of § 20 of Book II, the powers of w superior to the square, the integral (H) will
take this form

Z
k00 w2 2 2 2 2
k 3n dw c−sw −1− 2k [ 3 S(p −pq+q )+3S(pi+qi1 ) ] ,

S(p2 − pq + q 2 ) representing the sum of the quantities

p2 − pq + q 2 + p(1)2 − p(1) q (1) + · · · ;

S(pi + qi1 )2 representing the sum of the quantities

(pi + qi1 )2 + (p(1) i(1) + q (1) i(1) )2 + · · · ,

and n being the number of triangles. Let us give to the preceding integral this form

−1 2
Z
s2
 
3n −Q w+ s 2Q − 4Q
k dw c ,

Q being equal to
k 00 2
[ S(p2 − pq + q 2 ) + 3S(pi + qi1 )2 ]
2k 3
The integral must be taken from w = −π to w = π, and we have seen, in the section
cited from Book II, that it is able to be extended from w = −∞ to w = ∞; then the [553]
s2
preceding integral, or the probability of s, becomes proportional to c− 4Q or to
3ks2

4k00 [S(p2 −pq+q 2 )+ 9 S(pi+qi1 )2 ]
c 2

It is necessary now to determine the value of kk00 . For this we will make, as above,
use of the observed values of T, T (1) , T (2) , . . . When these values are in great number,
the sum of their squares divided by their number will be, quite nearly, by that which
we have established in Book II, the mean value of T 2 ; by making therefore

θ2 = T 2 + T (1)2 + T (2)2 + · · · ,
θ2
n will be this mean value. Now we have this value by multiplying each possible value
of T 2 by its probability and by taking the sum of all these products; the expression of
the mean value of T 2 will be therefore
dα dβ dT.T 2 φ(α)φ(β)φ(T − α − β)
RRR
RRR ,
dα dβ dT φ(α)φ(β)φ(T − α − β)

the integrals being taken within their infinite limits. Let there be, as above,

T 0 = T − α − β;

17
the preceding fraction will become
RRR 0
(T − α − β)2 dα dβ dT 0 φ(α)φ(β)φ(T 0 )
RRR ,
dα dβ dT 0 φ(α)φ(β)φ(T 0 )

all these integrals being taken again within their infinite limits. It is easy to see, by the
preceding analysis, that the numerator of this fraction is equal to 3k 2 k 00 , and that its
00 2
denominator is equal to k 3 ; the fraction becomes thus 3kk ; by equating it to θn , we
will have
k 00 θ2
= ;
k 3n
the probability of s is therefore proportional to [554]
9ns2

4θ 2 [S(p2 −pq+q 2 )+ 9 S(pi+qi1 )2 ]
c 2

It is clear that the values of i and of i1 , which render this probability the most rapidly
decreasing are those which give pi + qi1 = 0; and then the preceding correction of
the measured arc becomes null. The case of i and i1 nulls give therefore the law of
probability of the geodesic errors, the most rapidly decreasing, a law which must be
evidently adopted.
Thence, it is easy to conclude that the probability that the value of s will be com-
prehended within the limits ±s is equal to
2
2 dt c−t
R
√ ,
π
the integral being taken from t null to
r
3s n
t= ,
2θ S(p − pq + q 2 )
2

that which is conformed to that which we have deduced in § 1 from the particular law
2
of probability of the errors α proportional to c−hα .
Let us express, as in § 2, the error of a new base concluded from the first by the
function
lᾱ + mβ̄ + l(1) ᾱ(1) + m(1) β̄ (1) + · · ·
By making, as previously,

α = ᾱ − iT, β = β̄ − i1 T, α = ᾱ(1) − i(1) T (1) , ··· ,

the correction of this function, relative to the values of i, i1 , i(1) , . . . will be −S(li +
mi1 )T , and the error of the new base thus corrected will be

(λ) lα + mβ + l(1) α(1) + m(1) β (1) + · · ·

Let s0 be the value of this function; the probability of the simultaneous existence of the
values of s and s0 of the functions () and (λ) will be, by § 21 of Book II, proportional [555]

18
to
√ √
ZZ
−1−s0 w0 −1−Qw2 −2Q1 ww0 −Q2 w02
dw dw0 c−sw ,

the integrals being taken from w and w0 equal to −∞ to w and w0 equal to +∞. We
see next, by the analysis of the section cited, that we have

Qw2 + 2Q1 ww0 + Q2 w02


1
dα dβ dT 0 φ(α)φ(β)φ(T 0 )[(pα + qβ)w + (lα + mβ)w0 ]2
RRR
2S
= RRR ,
dα dβ dT 0 φ(α)φ(β)φ(T 0 )

the integrals relative to α, β and T 0 being taken within their infinite limits; that which
gives, by substituting for α and β their previous values,

1 k 00
Q= [S(p2 − pq + q 2 ) + 29 S(pi + qi1 )2 ],
3 k 
1 k 00

h q  p i 9
Q1 = S p− l+ q− m + S((pi + qi1 )(li + mi1 ) ,
3 k 2 2 2
1 k 00
Q2 = [S(l2 − ml + m2 ) + 92 S(li + mi1 )2 ];
3 k
whence we conclude, by the analysis of the section cited, that the probability of the
simultaneous existence of the values of s and of s0 is proportional to
(Q2 s2 −2Q1 ss0 +Qs02 )

4(QQ2 −Q2 1)
c

or Q1 2
Q2 (s−s0 ) 02
Q2 s
− − 4Q
4(QQ2 −Q21)
c 2

The measure of the second base determines the value of s0 ; and, by naming it λ as
above, the probability of s will be proportional to
λQ1 2
Q2 (s− )
Q2

4(QQ2 −Q2 1)
c .

The most probable value of s is that which renders null the exponent of c; that which [556]
gives
Q1
s=λ ;
Q2
by making therefore
Q1
s=λ + u,
Q2
λQ1
u will be the error of the arc measured and diminished by Q2 ; and the probability of
this error will be proportional to
Q2 u 2

4(QQ2 −Q2 1)
c .

19
The values of i, i1 , i(1) , . . . must be determined by the condition that the coefficient of
u2 , in this exponential, is a maximum; let us see therefore what are the values of these
quantities of these quantities which render the fraction
Q2
QQ2 − Q21

a maximum. If we name Q0 that which the expression of Q becomes when we diminish


the finite integral S(pi + qi1 )2 by the element (pi + qi1 )2 , we will have

3 k 00
Q0 = Q − (pi + qi1 )2 .
2 k
If we name similarly Q01 that which the expression of Q1 becomes when we diminish
the finite integral S(pi + qi1 )(li + mi1 ) by the element (pi + qi1 )(li + mi1 ) , we will
have
3 k 00
Q01 = Q1 − (pi + qi1 )(li + mi1 ).
2 k
Finally, if we name Q02 that which Q2 becomes when we diminish the finite integral
S(li + mi1 )2 by the element (li + mi1 )2 , we will have

3 k 00
Q02 = Q2 − (li + mi1 )2 .
2 k
The fraction
Q02
Q0 Q02 − Q01 2
surpasses the fraction [557]
Q2
;
QQ2 − Q21
because, by substituting into the first, instead of Q0 , Q01 and Q02 , their values, and
reducing to the same denominator its excess over the second, the numerator of this
excess becomes
3 k 00
[Q2 (pi + qi1 ) − Q1 (li + mi1 )]2 .
2 k
00
Let us name further Q00 that which Q0 becomes when we subtract 23 kk (p(1) i(1) +
(1)
q (1) i1 )2 from it; and, consequently, that which the expression of Q becomes when we
(1)
diminish the integral S(pi+qi1 )2 by the two elements (pi+qi1 )2 +(p(1) i(1) +q (1) i1 )2 .
00 0
Let us name similarly Q1 that which Q1 becomes when we subtract from it

3 k 00 (1) (1) (1) (1)


(p i + q (1) i1 )(l(1) i(1) + m(1) i1 );
2 k
finally, let us name Q002 that which Q02 becomes when we subtract from it

3 k 00 (1) (1) (1)


(l i + m(1) i1 )2 ;
2 k

20
we will see, by the same process, that the fraction

Q002
Q00 Q002 − Q002
1

surpasses the fraction


Q02
;
Q0 Q02 − Q02
1
and, consequently, the fraction
Q2
.
QQ2 − Q21
By continuing thus, we see that this last fraction arrives to its maximum when the finite
integrals S(pi+qi1 )2 , S(pi+qi1 )(li+mi1 )2 and S(li+mi1 )2 are null in the expressions
of Q, Q1 and Q2 , that which reverts to supposing null the values of i, i1 , i(1) , . . .; this [558]
supposition gives therefore the law of probability of the most rapidly decreasing values
of Q, and then we have

θ2
Q= S(p2 − pq + q 2 ),
9n
θ2 h q  p i
Q1 = S p− l+ q− m ,
9n 2 2
θ2
Q2 = S(l2 − ml + m2 ).
9n
The weight of the error u becomes thus
9n
− 4θ 2
2
[S (p− q2 )l+(q− p2 )m]
S(p2 − pq + q 2 ) − S(l2 −ml+m2 )

It is easy to see that this result coincides with the analogous result of § 3.

21
On the probability of the results deduced, by any processes whatsoever, from a great
number of observations.

The true march of the natural sciences consists in showing, through the path of in-
duction, from the phenomena to the laws and from the laws to the forces. We come
down next from these forces to the complete explication of the phenomena as far as
into their smallest details. The attentive inspection of a great assembly of observations
and their comparisons multiplied make a presentiment of the laws that it conceals. The
analytic expression of these laws depends on constant coefficients that we name ele-
ments. We determine, by the theory of probabilities, the most probable values of these
elements, and if, by substituting them into the analytic expressions, these expressions
satisfy all the observations, within the limits of the possible errors, we will be sure that
these laws are those of nature, or at least they are very little different from them. We
see thence how much the application of the Calculus of Probabilities is useful to natural [559]
Philosophy, and how much it is essential to have methods in order to deduce from ob-
servations the most advantageous results. These results are evidently those with which
one same error is less probable than with each other result. Thus the condition that it
is necessary to fulfill in the choice of a result is that the law of probability of its errors
is most rapidly decreasing. Before the application of the Calculus of Probabilities to
this object, each calculator subjected the results of the observations to the conditions
which to him appeared to be most natural. Now if we have certain formulas in order to
obtain the most advantageous result, he is no longer able to have uncertainty in this re-
gard, at least when we make use of the factors. We are able, not only to determine this
result, but further to assign the probability of the errors of the results obtained by some
other processes and to compare these processes to the most advantageous method. The
excessive length of the calculations that this method requires, when we employ a very
great number of observations, does not permit then to make use of it. But, by grouping
conveniently the equations of condition and by applying this method to the equations
which result from each of these groups, we are able at the same time to simplify con-
siderably the calculations and to conserve a part of the advantages which are attached
to them, as we will see it in the following. Whatever be the process of which we make
use, it is very useful to have a means to determine the probability of the results to which
we arrive, especially when there is a question of the important elements. We will have
easily this probability by the following method.

§ 1. Let us consider first a quite simple case, the one of the angles measured by
means of a repeating circle. Let us suppose that at the end of each partial observa-
tion we read the corresponding division of the circle; we will have, by departing from
the point of departure, a sequence of terms of which the first will be the angle it-
self, the second will be the double of this angle, the third will be the triple of it, and
thus consecutively. Let us designate by A1 , A2 , . . . , An these different terms, and by [560]

22
a1 , a2 , . . . , an the n partial angles successively measured. We will have

A1 = a1 ,
A2 = a2 + a1 ,
A3 = a3 + a2 + a1 ,
··· ········· ;

and, if we name y the true simple angle, we will have this sequence of equations

 y − a1 + x1 =0,


 y − a2 + x2 =0,



(a) y − a3 + x3 =0,

··············· ;





y − an + xn =0,

x1 , x2 , x3 , . . . being the errors of the angles a1 , a2 , a3 , . . . We will have, by § 20 of


Book II, the most advantageous result by multiplying by unity each of the preceding
equations and by adding them, that which gives
a1 + a2 + · · · + an x1 + x2 + · · · + xn
y= + .
n n
By supposing x1 , x2 , . . . null, we will have the result of the most advantageous method,
and the error of this result will be x1 +x2 +···+x
n
n
. By designating by u this error, we see,
knu2
by Rthe section cited, that the probabilityR of u is proportional to c− 2k00 , k being equal
00 2
to dx φ(x) and k being equal to x dx φ(x), φ(x) being the law of probability
of the errors x of the partial observations, this law being supposed the same for the
positive and negative errors and being able to be extended to infinity; c is always the
number of which the hyperbolic logarithm is unity.
Svangerg, in his excellent Work on the degree of Lapland, exposes, in order to
determine y, a new process founded on the following considerations. Each term of the
sequence A1 , A2 , . . . is able to give its value, which is able to be equally determined by
the difference As0 − As of any two terms whatsoever of this sequence, s0 being greater
than s. This difference, divided by s0 − s, gives a value of y so much more exact as [561]
this divisor is greater. By multiplying it therefore by this divisor, we will render it
preponderant by reason of its exactitude. If we make next a sum of these products and
if we divide it by the number of simple angles that it contains, we will have a value of y
which, concluded from all the combinations of the quantities A1 , A2 , . . . by giving to
each of these combinations the influence that it must have, seems ought to approach to
the truth the nearest that it is possible. This would be just, in fact, if all these values of
y were independent. But their mutual dependence makes that the same simple angles
are employed many times and in a different manner for each of them, that which must
change the respective probabilities of the values of y and, consequently, the probability
of the mean value. This is a new example of the illusions to which we are exposed in
these delicate researches.

23
The process of which there is question reverts to forming the sum of the differences
As0 − As , s0 being greater than s and having with this condition to be extended from
s0 = 1 to s0 = n; s must be extended from s = 0 to s = n − 1, and we must make
A0 = 0. By dividing next this sum by the number of simple angles that it contains, we
have the value of y. It is easy to see that this value is
nSAn − 2SSAn−1
γ= n(n+1)(n+2)
,
1.2.3

SAn expressing the sum of the quantities A1 , A2 , . . . , An ; SSAn−1 is the sum of the
quantities
A1 ,
A1 + A2 ,
A1 + A2 + A3 ,
············ ,
A1 + A2 + · · · + An−1 ;

the angle a1 is contained n − i + 1 times in SAn , it is contained (n−1)(n−i+1)


1.2 times [562]
i(n−i+1)
in the function SSAn−1 ; it is therefore contained n(n+1)(n+2) times in the preceding
1.2.3
expression of y. Thence it follows that this process reverts to multiplying the equations
(a) respectively by the factors

n 2(n − 1) 3(n − 2)
n(n+1)(n+2)
, n(n+1)(n+2)
, n(n+1)(n+2)
, ;
6 6 6

and then we find, by § 20 from Book II, that the probability of the error u in the
preceding expression of y is proportional to
u2
− kk00
SM 2
c i ,
i(n−i+1)
Mi being here equal to n(n+1)(n+2) ; the integral SMi2 must comprehend all the values
6
of Mi2 from i = 1 to i = n inclusively. We have thus

6 n2 + 2n + 2
SMi2 = .
5 n(n + 1)(n + 2
6
n being supposed very great, this value of SMi2 is reduced very nearly to 5n ; the
probability of the error u is therefore proportional to
5 k 2
c− 6 k00 nu .

We have just seen that, in the most advantageous method, the probability of a sim-
ilar error of the result is proportional to
knu2
c− 2k00 .

24
Thus, in order that the same errors become equally probable, the observations must be,
in the process of Svanberg, more numerous than in the ordinary process, according to
the ratio of six to five.
We would be able to believe that, the result obtained by the process of Svanberg be- [563]
ing a new datum from the observations, its combination with the result of the ordinary
method must give a more exact result, and of which the law of probability of the errors
is more rapidly decreasing. But the analysis proves that this is not. Let us consider, in
fact, the system of equations

p y − a1 + x1 =0,

 1

 p y − a + x =0,

2 2 2
(b)

 ············ ;

pn y − an + xn =0,

x1, x2 , . . . being, as above, the errors of the observations. The most advantageous
method prescribes to multiply these equations, respectively, by p1 , p2 , . . . and to add
them, that which gives
Spi ai Spi xi
y= 2 − ,
Spi Sp2i
the sign S comprehending, as above, all the values that it precedes, from i = 1 to i = n
inclusively. The first term of this expression will be the value of y given by the most
advantageous method, and its error will be SpSpi x2 i ; in designating it by u, its probability
i
will be, by § 20 of Book II, proportional to
2
k
Sp2i
c− 2k00 u .

If we multiply the equations (b) respectively by m1 , m2 , m3 , . . . , their sum will give


Smi ai Smi xi
y= − .
Smi pi Smi pi
The first term of this expression will be the value of y relative to the system of factors
i xi
m1 , m2 , . . ., and Sm
Smi pi will be the error of this value, an error that we will designate
0
by u . If we make
l = Spi xi , l0 = Smi xi ,
the probability of the simultaneous existence of l and of l0 will be, by § 21 of Book II, [564]
proportional to
k 2 2 0 02 2
c− 2k00 E (l Smi −2ll Smi pi +l Spi ) ,
E being equal to Sm2i Sp2i − (Smi pi )2 . Now we have

l = uSp2i , l0 = u0 Smi pi ;

the simultaneous existence of u and of u0 is therefore proportional to


Sp2
k i [u2 E+(u0 −u)2 (Smi pi )2 ]
c− 2k00 E .

25
Let e be the difference of the preceding values from y; we have
Spi ai Smi ai
e= − ;
Sp2i Smi pi
the equality of these values, corrected respectively of their errors u and u0 , give
e = u − u0 ;
the preceding exponential becomes thus
 
(Smi pi )2
− 2kk00 Sp2i u2 +e2 E
c .
e is a quantity given by the observations; the value of u which renders this exponential
a maximum is evidently u = 0; thus the consideration of the result given by the sys-
tem of factors m1 , m2 , . . . add no correction to the result of the most advantageous
method and changes not at all the law of probability of its error u, which remains
always proportional to
k 2 2
c− 2k00 u Spi .
If the very great number of equations of condition do not permit applying this
method to them, there will be always advantage to apply it to some equations resulting
from groups of these equations. Let us suppose that we have r groups, each formed of
s equations, so that n = rs; we will have the following r equations [565]
P1 y − A1 + X1 = 0,



 P y − A + X = 0,

2 2 2
(V)

 . . . . . . . . . . . . . . ..

Pr y − Ar + Xr = 0,

and we have
P1 = p1 + p2 + · · · + ps ,
A1 = a1 + a2 + · · · + as ,
X1 = x1 + x2 + · · · + xs ,

P2 = ps+1 + ps+2 + · · · + p2s ,


........................
By applying to the equations (V) the process of the most advantageous method, we
have
SPt At SPt Xt
y= − ;
SPt2 SPt2
the sign S embraces all the quantities which it precedes, from t = 1 to t = r inclusively.
SPt Xt t At
SPt2
is the error of the value SPSPt2
taken for y; by designating this error by u, its
probability will be, by § 20 of Book II, proportional to
u2
− 2kk00
Sm2
c i ;

26
m1 , m2 , . . . being the coefficients of x1 , x2 , . . . in the expression of u; and the integral
Sm2i being extended from i = 0 to i = n inclusively. Now it is easy to see that we
have
P1 P1 P1
m1 = 2 , m2 = 2 , ··· , ms = 2 ,
SPt SPt SPt
P2 P2
ms+1 = 2 , ············ , ··· , m2s = 2 ,
SPt SPt
P3
m2s+1 = 2 , ············ , ··· , ······ ;
SPt
thence it is easy to conclude that we have [566]
s n
Sm2i = 2 = ;
SPt rSPt2
the probability of u is therefore proportional to
2
k r
SPt2
c− 2k00 n u .

If we reunited all the equations into a single group, the probability of u would be
proportional to
u2
k
(Spi )2
c− 2k00 n ;
because then r would become unity, P1 would become Spi , P2 , P3 , . . . would be nulls.
The weight of the result or the coefficient of −u2 would be therefore, in the first case,
k r
SP 2 ,
2k 00 n t
and, in the second case, it would be
k
(Spi )2
2k 00 n
Now the first of these quantities surpasses the second; in fact,

(Spi )2 = (P1 + P2 + · · · + Pr )2 .

If, in the development of this last square, we substitute, instead of the product 2P1 P2 ,
its value P21 + P22 − (P1 − P2 )2 , and thus of the other products, we see that this square
is equal to rSPt2 , less a positive quantity; there is therefore advantage to partition the
equations of condition into many groups to which we apply the most advantageous
method.
We see further that there is advantage to augment the number of groups; because,
if we suppose r even and equal to 2r0 , the weight of the result relative to the number r0
of groups will be proportional to

r0 [(P1 + P2 )2 + (P3 + P4 )2 + · · · + (P2r0 +1 + P2r0 )2 ];

and the weight of the result relative to 2r0 groups will be proportional to [567]

27
2r0 (P12 + P22 + · · · + P2r
2
0 ).

This last quantity surpasses the preceding, as we see it by observing that

2(P12 + P22 ) > (P1 + P2 )2 .

If the equations of condition contain many unknown elements, y, y 0 , . . . there will


be always advantage to partition them into groups in order to apply to the equations
resulting from these groups the most advantageous method. The more we will multiply
these groups, the more we will augment the weight of the results.
But, from whatever manner that we have obtained these results, we will be able
always to determine, by the following theorem, the probability of their errors. If we
have, by any process whatsoever, deduced from the equations of condition the equation
y − a = 0, it is clear that we have multiplied the equations of condition, respectively,
by some factors M1 , M2 , M3 , . . . such that the unknowns have disappeared, with the
exception of y which has unity for factor. The error u of the result y = a is evidently
M1 x1 + M2 x2 + · · · ; the probability of this error will be therefore, by § 20 of Book II,
proportional to
u2
− 2kk00
SM 2
c i ,
the sign S being extended to all the values of i from i = 1 to i = n, n being the
number of observations. All is reduced therefore to determine, in the process that we
have followed, the factors M1 , M2 , . . .
If, for example, the equations of condition contain two unknowns y and y 0 and if,
in order to form the final two equations, we add together all these equations: 1 ˚ by
changing the signs of the equations in which y has the sign −; 2 ˚ by changing the
signs of the equations in which y 0 has the sign −, we will obtain, by this process of
which we have often made use, two equations that we will represent by the following: [568]

P y + Ry 0 − A = 0,
P1 y + R1 y 0 − A1 = 0.

In multiplying the first of these equations by


R1
P R1 − P1 R
and the second by
−R
,
P R1 − P1 R
we will have, by adding them,
AR1 − A1 R
γ− = 0.
P R1 − P1 R
In the equations of condition, xi has been multiplied by ±1; the sign − having place
if, in order to form the final equations, we have changed the signs of the ith equation.

28
Thence it is easy to conclude that, if we designate by s the number of equations of
condition in which the coefficients of y and of y 0 have the same sign, we will have

s(R1 − R)2 + (n − s)(R1 + R)2


SMi2 = .
(P R1 − P1 R)2

We will simplify the calculation by preparing the equations of condition in a manner


that in each the coefficient of y has the sign +. We will form next a first final equation
by adding the s equations in which the coefficient of y 0 has the sign +. We will form a
second final equation by adding the n − s equations in which the coefficient of y 0 has
the sign −. Let
f y + gy 0 − h = 0,
f1 y + g1 y 0 − h1 = 0
be these two equations. By multiplying the first by f g1g+f
1
1g
and the second by g
f g1 +f1 g , [569]
we will have
hg1 + h1 g
y− = 0,
f g1 + f1 g
and it is easy to see that
sg12 + (n − s)g 2
SMi2 = .
(f g1 + f1 g)2
These values of y and of SMi2 coincide with the preceding, as it is easy to see it by
observing that we have

P = f + f1 , R = g − g1 , A = h + h1 ,
P1 = f − f 1 , R1 = g + g1 , A1 = h − h1 .

The equations of condition being represented generally by the following

0 = xi − ai + pi y + qi y 0 ,

if we multiply them respectively by m1 , m2 , . . . and if we add them, we will have the


final equation
0 = Smi xi − Smi ai + ySmi pi + y 0 Smi qi ;
if we multiply next the same equations, respectively by n1 , n2 , . . . , we will have, by
adding them, the final equation

0 = Sni xi − Sni ai + ySni pi + y 0 Sni qi .

By multiplying the first of these equations by SnIi qi and the second by − SmIi qi , I being
equal to
Smi pi Sni qi − Sni pi Smi qi ,
we will have
Smi ai Sni qi − Sni ai Smi qi Smi xi Sni qi − Sni xi Smi qi
0=y− + .
I I

29
This last term is the error of the value that we obtain for y, by supposing nulls x1 , x2 , . . .: [570]
we have therefore then
mi Sni qi − ni Smi qi
Mi = ;
I
whence it is easy to conclude
u2
− 2kk00 k 2 I2
c SM 2
i = c− 2k00 u H ,

by making

H = Sm2i (Sni qi )2 − 2Smi ni Smi qi Sni qi + Sn2i (Smi qi )2 ,

a result which coincides with the one of § 21 of Book II, in which we have proved that
the maximum of the coefficient of −u2 in this exponential takes place when we suppose
generally mi = pi , ni = qi ; this supposition gives therefore the most advantageous
result or the one of which the weight is a maximum.
We will determine the value of 2kk00 by means of the squares of the remainders
which take place when we substitute into the equations of condition the values deter-
mined for y and y 0 . By designating by i this remainder in the ith equation of condition

0 = xi − ai + pi y + qi y 0 ,

and designating by u and u0 the errors of these values, we will have

0 = xi + i − pi u − qi u0 ;

that which gives

S2i = Sx2i − 2uSpi xi − 2u0 Sqi xi + u2 Sp2i + 2uu0 Spi qi + u02 Sqi2 .

We have, by § 19 of Book II,


k 00
Sx2i =
n;
k
next, the values u and u0 cease to be probable, when they surpass the quantities of order
√1 . The values of Spi xi and Sqi xi cease to be probable when they surpass quantities of
n √
order n; the values of −2uSpi xi and −2u0 Sqi xi cease therefore to be probable when [571]
they cease to be of a finite order, n being supposed infinitely great. Sp2i , Spi qi and Sqi2
being of order n, the values of u2 Sp2i , 2uu0 Spi qi , u02 Sqi2 cease to be probable when
they cease to be finite quantities. We are able therefore to neglect all these quantities
and to suppose, whatever be the process of which we make use,

k 00
S2i = n,
k
that which gives
k n
= .
2k 00 2S2i

30
§ 2. The preceding methods are reduced to multiplying each equation of condition
by a factor and to adding all these products in order to form a final equation. But we
are able to employ some other considerations in order to obtain the result sought: for
example, we are able to choose that of the equations of condition which must most ap-
proach to the truth. The process that I have given in § 40 of Book III of the Mécanique
céleste is of this kind. By supposing the equations (b) of the previous section prepared in a manner that p_1, p_2, p_3, ... are positive and that the values \frac{a_1}{p_1}, \frac{a_2}{p_2}, ... of y, given by these equations under the supposition of x_1, x_2, ... nulls, form a decreasing sequence, the process of which there is question consists in choosing the rth equation of condition, such that we have

p_1 + p_2 + \cdots + p_{r-1} < p_r + p_{r+1} + \cdots + p_n,
p_1 + p_2 + \cdots + p_r > p_{r+1} + p_{r+2} + \cdots + p_n,

and in supposing

y = \frac{a_r}{p_r}.
This value of y renders a minimum the sum of all the deviations from the other values,
taken positively; because by naming x1 , x2 , . . . these deviations, x1 , x2 , . . . , xr−1
will be positive and xr+1 , xr+2 , . . . , xn will be negative. If we increase the preced-
ing value of y by the infinitely small quantity δy, the sum of the positive deviations [572]
x1 , x2 , . . . , xr−1 will diminish by the quantity

δy(p1 + p2 + · · · + pr−1 );

but the sum of the negative deviations, taken with the sign +, will increase by the
quantity
δy(pr+1 + pr+2 + · · · + pn );
the deviation xr will become pr δy. The sum of the deviations, taken all positively, will
be therefore increased by the quantity

δy(pr + pr+1 + · · · + pn − p1 − p2 − · · · − pr−1 );

by the conditions to which the choice of the rth equation is subject, this quantity is
positive. We will see, in the same manner, that if we diminish aprr by δy, the sum of the
deviations taken positively will be increased by the positive quantity

δy(p1 + p2 + · · · + pr − pr+1 − pr+2 − · · · − pn ).

Thus, in the two cases of an increase and of a diminution of the value \frac{a_r}{p_r} by δy, the
sum of the deviations, taken positively, is increased. This consideration seems to give a
great advantage to the preceding value of y, which, when there is a question to choose a
middle among the results of an odd number of observations, becomes the result equidis-
tant from the extremes. But the Calculus of probabilities is able alone to estimate this
advantage: I will therefore apply it to this delicate question.
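The rule above amounts to taking a weighted median of the quotients a_i/p_i. A minimal Python sketch of this selection follows (purely illustrative; the function name and the sample numbers are assumptions, not Laplace's).

import numpy as np

# Hypothetical sketch of the "method of situation": with the equations written
# 0 = x_i - a_i + p_i*y (all p_i > 0) and the quotients a_i/p_i arranged in
# decreasing order, choose the r-th equation so that
#   p_1 + ... + p_(r-1) < p_r + ... + p_n   and   p_1 + ... + p_r > p_(r+1) + ... + p_n,
# then take y = a_r / p_r.  This is a weighted median of the quotients.
def method_of_situation(a, p):
    a, p = np.asarray(a, float), np.asarray(p, float)
    order = np.argsort(-a / p)        # decreasing sequence of a_i/p_i
    a, p = a[order], p[order]
    total = p.sum()
    partial = np.cumsum(p)            # p_1 + ... + p_r
    # first index r with p_1 + ... + p_r greater than the sum of the remaining p's
    r = np.argmax(partial > total - partial)
    return a[r] / p[r]

# Tiny example: three observations of the same quantity with equal weights.
print(method_of_situation(a=[2.1, 1.0, 2.9], p=[1.0, 1.0, 1.0]))  # the middle value, 2.1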
The sole data of which we will make use are that the equation of condition

0 = xr − ar + pr y

gives, setting aside the errors, a value of y smaller than the r − 1 anterior equations and
greater than the n − r posterior equations; and that we have

p_1 + p_2 + \cdots + p_{r-1} < p_r + p_{r+1} + \cdots + p_n,
p_1 + p_2 + \cdots + p_r > p_{r+1} + p_{r+2} + \cdots + p_n.

We have [573]

y = \frac{a_1}{p_1} - \frac{x_1}{p_1} = \frac{a_r}{p_r} - \frac{x_r}{p_r};

that which gives

\frac{x_1}{p_1} = \frac{a_1}{p_1} - \frac{a_r}{p_r} + \frac{x_r}{p_r}.

Thus, \frac{a_1}{p_1} surpassing \frac{a_r}{p_r}, \frac{x_1}{p_1} surpasses \frac{x_r}{p_r}. It is the same for \frac{x_2}{p_2}, \frac{x_3}{p_3}, ..., to \frac{x_{r-1}}{p_{r-1}}. We will see in the same manner that \frac{x_{r+1}}{p_{r+1}}, \frac{x_{r+2}}{p_{r+2}}, ..., \frac{x_n}{p_n} are less than \frac{x_r}{p_r}. Thus, the
sole conditions to which we will subject the errors and the equations of condition are
the following:

(c)  for s > r, \frac{x_s}{p_s} < \frac{x_r}{p_r};  for s < r, \frac{x_s}{p_s} > \frac{x_r}{p_r};

p_1 + p_2 + \cdots + p_{r-1} < p_r + p_{r+1} + \cdots + p_n,
p_1 + p_2 + \cdots + p_r > p_{r+1} + p_{r+2} + \cdots + p_n.
It is uniquely according to these data from the observations that we will determine the
probability of the error xr . We will have besides no regard to the order that the first
r − 1 equations of condition and the n − r last observe among them, nor to the values
of the quantities a1 , a2 , . . . , an .
Let us represent, as above, by φ(x) the law of probability of the error x of the
observations and, in order to express that this probability is the same for the positive
and negative errors, let us suppose φ(x) a function of x2 .
Now, if we suppose x_r positive, the probability that x_1 will surpass p_1\frac{x_r}{p_r} will be

\frac{1}{2} - \frac{\int dx\,φ(x)}{2k},

the integral \int dx\,φ(x) being taken from x = 0 to x = p_1\frac{x_r}{p_r} and k being, as above, this integral taken from x null to x infinity. The probability that the quantities \frac{x_1}{p_1}, \frac{x_2}{p_2}, ..., \frac{x_{r-1}}{p_{r-1}} will be all greater than \frac{x_r}{p_r} is therefore proportional to the product of the r - 1 [574]
factors

1 - \frac{\int dx\,φ(x)}{k}, \quad 1 - \frac{\int dx\,φ(x)}{k}, \quad \ldots;

the integral of the first factor being taken from x = 0 to x = p_1\frac{x_r}{p_r}; the integral of the second factor being taken from x = 0 to x = p_2\frac{x_r}{p_r}; and thus consecutively.

Similarly, all the quantities \frac{x_{r+1}}{p_{r+1}}, \frac{x_{r+2}}{p_{r+2}}, ..., \frac{x_n}{p_n} being supposed smaller than \frac{x_r}{p_r}, we see, by the same reasoning, that the probability of this supposition is proportional to the product of the n - r factors

1 + \frac{\int dx\,φ(x)}{k}, \quad 1 + \frac{\int dx\,φ(x)}{k}, \quad \ldots;

the integral of the first factor being taken from x = 0 to x = p_{r+1}\frac{x_r}{p_r}, that of the second factor being taken from x = 0 to x = p_{r+2}\frac{x_r}{p_r}, and thus consecutively. The probability
of the error xr is φ(xr ); thus the probability that the error of the rth observation will be
xr and that the value of y given by the rth equation will be smaller than the values given
by the preceding equations, and will surpass the values given by the following equa-
tions, this probability, I say, will be proportional to the product of the n − 1 preceding
factors and of φ(xr ).
x being supposed very small, we have, to the quantities near of order x^3,

\int dx\,φ(x) = xφ(0) + \frac{1}{2}x^2φ'(0),

φ'(0) being that which \frac{dφ(x)}{dx} becomes when x is null. In the present question, φ(x) being a function of x^2, we have φ'(0) = 0, and then we have

\int dx\,φ(x) = xφ(0).

The preceding factors will become thus, by making \frac{x_r}{p_r} = ζ, [575]

1 - p_1ζ\,\frac{φ(0)}{k},
1 - p_2ζ\,\frac{φ(0)}{k},
\cdots\cdots\cdots,
1 - p_{r-1}ζ\,\frac{φ(0)}{k},
1 + p_{r+1}ζ\,\frac{φ(0)}{k},
\cdots\cdots\cdots,
1 + p_nζ\,\frac{φ(0)}{k}.

If we designate by φ''(0) the value of \frac{d^2φ(x)}{dx^2} when x is null, φ(x_r) becomes

φ(0) + \frac{1}{2}p_r^2ζ^2φ''(0).
The sum of the hyperbolic logarithms of all these factors is, to the quantities near of order ζ^3, by dividing the factor φ(x_r) by φ(0),

-ζ\,\frac{φ(0)}{k}(p_1 + p_2 + \cdots + p_{r-1} - p_{r+1} - p_{r+2} - \cdots - p_n)
- \frac{ζ^2}{2}[\frac{φ(0)}{k}]^2(p_1^2 + p_2^2 + \cdots + p_r^2 + p_{r+1}^2 + \cdots + p_n^2)
+ \frac{1}{2}p_r^2ζ^2\{[\frac{φ(0)}{k}]^2 + \frac{φ''(0)}{k}\}.

The probability of ζ is therefore proportional to the base c of the hyperbolic logarithms,


elevated to a power of which the exponent is the preceding function. We must observe
that by virtue of the conditions to which the choice of the rth equation is subject, the
quantity
p1 + p2 + · · · + pr−1 − pr+1 − pr+2 − · · · − pn
is, setting aside the sign, a quantity less than p_r, and that thus, by supposing ζ of order [576]
\frac{1}{\sqrt{n}}, the number n of the observations being supposed quite great, the term depending on the first power of ζ, in the preceding function, is of order \frac{1}{\sqrt{n}}; we are able therefore to neglect it, as well as the last term of this function. By designating therefore by Sp_i^2 the entire sum

p_1^2 + p_2^2 + \cdots + p_n^2,

the probability of ζ will be proportional to

c^{-\frac{ζ^2}{2}[\frac{φ(0)}{k}]^2 Sp_i^2},
ζ or \frac{x_r}{p_r} being the error of the value \frac{a_r}{p_r} given for y by the rth equation. The value given by the most advantageous method is, by the preceding section,

y = \frac{Sp_ia_i}{Sp_i^2},

and the probability of an error ζ in this result is proportional to

c^{-\frac{k}{2k''}ζ^2 Sp_i^2},

k'' being always the integral \int x^2dx\,φ(x), taken from x null to x infinity. The result of the method that we just examined, and that we will name method of situation, will be preferable to the one of the most advantageous method, if the coefficient of -ζ^2, which is relative to it, surpasses the coefficient relative to the most advantageous method, because then the law of probability of the errors will be more rapidly decreasing there. Thus, the method of situation must be preferred if we have

[\frac{φ(0)}{k}]^2 > \frac{k}{k''};

in the contrary case, the most advantageous method is preferable. If we have, for example,

φ(x) = c^{-hx^2},

k becomes \frac{\sqrt{π}}{2\sqrt{h}} and k'' becomes \frac{\sqrt{π}}{4h\sqrt{h}}; that which gives \frac{k}{k''} = 2h. The quantity [577]
[\frac{φ(0)}{k}]^2 becomes \frac{4h}{π}; now we have 2h > \frac{4h}{π}; the most advantageous method must therefore then be preferred.
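A short numerical check of this comparison (Python; the particular value h = 1 and the closed-form integrals are the only assumptions, introduced for illustration) reproduces the two coefficients 4h/π and 2h.

import math

# Hypothetical check for phi(x) = c^(-h x^2), taking h = 1.
h = 1.0
k = math.sqrt(math.pi) / (2 * math.sqrt(h))        # integral of phi from 0 to infinity
k2 = math.sqrt(math.pi) / (4 * h * math.sqrt(h))   # integral of x^2 * phi from 0 to infinity
phi0 = 1.0                                          # phi(0)
print((phi0 / k) ** 2)   # 4h/pi ~ 1.2732   (method of situation)
print(k / k2)            # 2h    = 2.0      (most advantageous method)
# Since 2h exceeds 4h/pi, the most advantageous (least-squares) result has the larger weight.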
By combining the results of these two methods, we are able to obtain a result of
which the law of probability of the errors is more rapidly decreasing. Let us name
always ζ the error of the result of the method of situation, and let us designate by ζ 0 the
error of the result of the most advantageous method. The first of these results is, as we have seen, \frac{a_r}{p_r}, and the second is \frac{Sp_ia_i}{Sp_i^2}. If we designate Sp_ix_i by l, \frac{l}{Sp_i^2} will be the error of this last result; thus we will have l = ζ'Sp_i^2. The probability of the simultaneous existence of l and of ζ is, by § 21 of Book II, proportional to

\int dw\,c^{-lw\sqrt{-1}}\,φ(p_rζ)c^{p_rζw\sqrt{-1}}\int dx\,φ(x)c^{p_1xw\sqrt{-1}}\int dx\,φ(x)c^{p_2xw\sqrt{-1}}\cdots,

the integral relative to w being taken from w = -π to w = π. The integral relative to x, in the factor \int dx\,φ(x)c^{p_1xw\sqrt{-1}}, must be taken, by that which precedes, from x = p_1ζ to x = ∞. In developing this factor according to the powers of w, it becomes

\int dx\,φ(x) + p_1w\sqrt{-1}\int x\,dx\,φ(x) - \frac{w^2}{2}p_1^2\int x^2dx\,φ(x) + \cdots

By taking the integral within the preceding limits, we have, to the quantities near of order ζ^3,

\int dx\,φ(x) = k - p_1ζφ(0).
dxφ(x) = k − p1 ζφ(0).

By neglecting similarly the quantities of the orders ζ^2w, ζ^3w^2, ..., we have

p_1w\sqrt{-1}\int x\,dx\,φ(x) = k'p_1w\sqrt{-1}, \qquad -\frac{p_1^2}{2}w^2\int x^2dx\,φ(x) = -\frac{k''}{2}p_1^2w^2,

k' being the integral \int x\,dx\,φ(x) taken from x = 0 to x infinity. The factor of which [578]
there is question becomes therefore, by neglecting w^3, conformably to the analysis of the section cited from Book II,

k - p_1ζφ(0) + k'p_1w\sqrt{-1} - \frac{k''}{2}p_1^2w^2.

Its hyperbolic logarithm is

-\frac{φ(0)}{k}p_1ζ + \frac{k'}{k}p_1w\sqrt{-1} - \frac{k''}{2k}p_1^2w^2 - \frac{p_1^2}{2}[\frac{φ(0)}{k}ζ - \frac{k'}{k}w\sqrt{-1}]^2 + \log k.

By changing p_1 successively into p_2, p_3, ..., p_{r-1}, we will have the logarithms of the factors following, to the factor relative to p_{r-1}.
In the factor \int dx\,φ(x)c^{p_{r+1}xw\sqrt{-1}}, the integral must be taken from x = -∞ to x = p_{r+1}ζ; then \int x\,dx\,φ(x) becoming -k', the logarithm of this factor is

\frac{φ(0)}{k}p_{r+1}ζ - \frac{k'}{k}p_{r+1}w\sqrt{-1} - \frac{k''}{2k}p_{r+1}^2w^2 - \frac{p_{r+1}^2}{2}[\frac{φ(0)}{k}ζ - \frac{k'}{k}w\sqrt{-1}]^2 + \log k.

We will have the logarithms of the factors following by changing p_{r+1} successively into p_{r+2}, p_{r+3}, ..., p_n. The factor φ(p_rζ)c^{p_rζw\sqrt{-1}} is equal to

[φ(0) + \frac{p_r^2ζ^2}{2}φ''(0)]c^{p_rζw\sqrt{-1}},

and its logarithm is

\frac{p_r^2}{2}ζ^2\frac{φ''(0)}{φ(0)} + p_rζw\sqrt{-1} + \log φ(0).

Now, if we reassemble all these logarithms, if we consider next the conditions (c) to which the rth equation is subject, finally if we pass again from the logarithms to the numbers, we find, by neglecting that which it is permissible to neglect, that the probability of the simultaneous existence of l and of ζ is proportional to

\int dw\,c^{-lw\sqrt{-1} - \{[ζ\frac{φ(0)}{k} - \frac{k'}{k}w\sqrt{-1}]^2 + \frac{k''}{k}w^2\}\frac{Sp_i^2}{2}}.

By making therefore [579]

F = (\frac{k''}{k} - \frac{k'^2}{k^2})\frac{Sp_i^2}{2},

the probability of the simultaneous existence of ζ and of ζ' will be proportional to

c^{-\frac{ζ^2}{2}[\frac{φ(0)}{k}]^2Sp_i^2 - [ζ' - ζ\frac{k'}{k}\frac{φ(0)}{k}]^2\frac{(Sp_i^2)^2}{4F}}\int dw\,c^{-F[w + \frac{(ζ' - ζ\frac{k'}{k}\frac{φ(0)}{k})Sp_i^2\sqrt{-1}}{2F}]^2}.

By the analysis of § 21 of Book II, the integral relative to w is able to be taken from w = -∞ to w = ∞, and then the preceding probability becomes proportional to

c^{-\frac{ζ^2}{2}Sp_i^2[\frac{φ(0)}{k}]^2 - \frac{[ζ' - ζ\frac{k'}{k}\frac{φ(0)}{k}]^2Sp_i^2}{2(\frac{k''}{k} - \frac{k'^2}{k^2})}},

an expression that we are able to set yet under this form

c^{-\frac{k}{2k''}ζ'^2Sp_i^2 - \frac{k''}{k}\,\frac{[ζ\frac{φ(0)}{k} - ζ'\frac{k'}{k''}]^2Sp_i^2}{2(\frac{k''}{k} - \frac{k'^2}{k^2})}}.
If we name e the excess of the value of y given by the most advantageous method over that which the method of situation gives, we will have ζ = ζ' - e. Let us suppose

ζ' = u + \frac{e\,\frac{φ(0)}{k}[\frac{φ(0)}{k} - \frac{k'}{k''}]}{\frac{k}{k''} - \frac{k'^2}{k''^2} + [\frac{φ(0)}{k} - \frac{k'}{k''}]^2};

the probability of u will be proportional to

c^{-\frac{u^2}{2}Sp_i^2\{\frac{k}{k''} + \frac{\frac{k''}{k}[\frac{φ(0)}{k} - \frac{k'}{k''}]^2}{\frac{k''}{k} - \frac{k'^2}{k^2}}\}};

the result of the most advantageous method must therefore be diminished by the quantity

\frac{e\,\frac{φ(0)}{k}[\frac{φ(0)}{k} - \frac{k'}{k''}]}{\frac{k}{k''} - \frac{k'^2}{k''^2} + [\frac{φ(0)}{k} - \frac{k'}{k''}]^2};

and the probability of the error u, in this result thus corrected, will be proportional to [580]
the preceding exponential. The weight of the new result will be augmented, if \frac{φ(0)}{k} - \frac{k'}{k''} is not null; there is therefore advantage to correct thus the result of the most advantageous method. Ignorance where one is of the law of probability of the errors of the observations renders this correction impractical; but it is remarkable that, in the case where this probability is proportional to c^{-hx^2}, that is where we have φ(x) = c^{-hx^2}, the quantity \frac{φ(0)}{k} - \frac{k'}{k''} is null. Then the result of the most advantageous method receives no correction from the result of the method of situation, and the law of probability of errors remains the same.
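A brief sketch (Python; the double-exponential law is chosen only as an illustrative non-Gaussian example, and the closed-form values of k, k', k'' are the assumptions) shows the quantity φ(0)/k − k'/k'' vanishing for the Gaussian law and not otherwise.

import math

# Hypothetical check of phi(0)/k - k'/k'' for two symmetric error laws.
# Gaussian  phi(x) = exp(-x^2):        k = sqrt(pi)/2, k' = 1/2, k'' = sqrt(pi)/4
g = 1.0 / (math.sqrt(math.pi) / 2) - 0.5 / (math.sqrt(math.pi) / 4)
# Double exponential  phi(x) = exp(-|x|):  k = 1, k' = 1, k'' = 2
d = 1.0 / 1.0 - 1.0 / 2.0
print(g, d)   # 0.0 for the Gaussian law, 0.5 for the other: only the first needs no correction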

(Feb. 1818)

APPLICATION DU CALCUL DES
PROBABILITÉS
AUX OPÉRATIONS GÉODÉSIQUES DE LA MÉRIDIENNE

Pierre Simon Laplace∗


Connaissance des Temps for the year 1822 (1820) pp. 346–348.

The part of the meridian, which extends from Perpignan to Formentera, rests on
the base measured near Perpignan. Its length is around 460 thousand meters, and it
is joined to the base by a chain of twenty-six triangles. We can fear that such a great
length which has not been verified at all by the measure of a second base toward its
other extremity, is susceptible of a sensible error arising from the errors of the twenty-
six triangles employed in measuring it. It is therefore interesting to determine the
probability that this error not exceed forty or fifty meters. Mr. Damoiseau, leutenant-
colonel of the artillery, who has just gained the prize proposed by the Academy of
Turin, on the return of the comet of 1759, has well wished, at my request, to apply to
this part of the meridian, the formulas that I have given for this object, in the second
Supplement to my Théorie analytique des Probabilités. He has found that by departing
from the latitude of the signal of Burgarach, some minutes more to the north than
Perpignan, to Formentera, that which comprehends an arc of the meridian of around
466006 meters, the probability of an error s, is proportional to the exponential
c^{-\frac{9ns^2}{4θ^2\cdot 48350,606}},

c is the number of which the logarithm is unity; n is the number of triangles employed,
θ2 is the sum of the squares of the errors observed in the sum of the three angles of
each triangle; finally s is the error of the total arc, the base of Perpignan being taken
for unity. Here n is equal to 26. By taking for unity of angle the sexagesimal second,
we have
θ2 = 118, 178.
But the number of triangles employed being only 26, it is preferable to determine by a
great number of triangles, the constant θ2 which depends on the unknown law of the
errors of the partial observations. For that, we have made use of the one hundred seven
triangles which have served to measure the meridian from Dunkirk, to Formentera. The
collection of the errors of the observed sums of the three angles of each triangle is, by taking them all positively, 173,82: the sum of the squares of these errors is 445,217. By multiplying it by \frac{26}{107}, we will have for the value of θ^2

θ^2 = 108,134.

∗ Translated by Richard J. Pulskamp, Department of Mathematics & Computer Science, Xavier University, Cincinnati, OH. June 20, 2012
This value which differs little from the preceding, must be employed in preference. It is
necessary to reduce it into parts of the radius of the circle, by dividing it by the square
of the number of sexagesimal seconds that this radius contains; then the preceding
exponential becomes
c^{-(689,797)^2 s^2};

so that the base of Perpignan being taken for unity, (689,797)^2 is that which I name the weight of the result or of the arc measured from the signal of Burgarach, to Formentera. This base is of 11706,40 m; we have concluded from it for the respective probabilities that the errors of the arc of which there is concern, are comprehended within the limits ±60 m, ±50 m, ±40 m, the following fractions which approach quite nearly unity,

\frac{1743695}{1743696}, \quad \frac{32345}{32346}, \quad \frac{1164}{1165}.
We must therefore have no reasonable doubt on the exactitude of the measured arc.
The limits between which there are odds one against one, that the error falls, are
±8, 0937 m.
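These fractions can be re-derived with a few lines of modern code. The sketch below (Python; it presumes only the weight (689,797)² and the base 11706,40 m quoted above, everything else being illustrative) evaluates the corresponding error-function probabilities and the even-odds limit.

import math

# The probability that the error lies within +/- s metres is erf(689.797 * s / 11706.40),
# the Perpignan base being taken as unity.
weight_root, base = 689.797, 11706.40
for s in (60.0, 50.0, 40.0):
    p = math.erf(weight_root * s / base)
    print(s, p, 1.0 / (1.0 - p))   # odds of the contrary case, e.g. about 1 740 000 to 1 for 60 m
# Even odds (probability 1/2) correspond to erf(t) = 1/2, i.e. t = 0.476936,
# giving s = 0.476936 * 11706.40 / 689.797, about 8.09 m.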
If we measured on the side of Spain, a base for verification, equal to the base
of Perpignan, and if we joined it by two triangles, to the chain of the triangles of
the meridian; we find by the calculation, that we can wager one against one, that the
difference between the measure of this base, and its value concluded from the base of
Perpignan, would not surpass a third of a meter: this is nearly the difference in the
measure of the base of Perpignan, to its value concluded from the base of Melun.
We have seen in the Supplement cited, that the angles having been measured by
means of a repeating circle; we are able to suppose the probability of an error x in
the sum observed of the three angles of each triangle, proportional to the exponential

c^{-kx^2},

k being a constant; whence it follows that the probability of this error is

\frac{dx\,\sqrt{k}\,c^{-kx^2}}{\sqrt{π}},

π designating the ratio of the circumference to the diameter.
By multiplying it by x, taking the integral from x null to x infinity, and doubling this integral, we will have clearly the mean error, by taking positively the negative errors. This mean error being therefore designated by ε, we will have

ε = \frac{1}{\sqrt{kπ}}.

We will have the mean value of the squares of these errors, by multiplying by x^2 the preceding differential, and integrating from x = -∞ to x infinity; by naming therefore ε' this value, we will have

ε' = \frac{1}{2k}.

Thence, we deduce

ε' = \frac{ε^2π}{2}.

We can thus obtain θ^2, by means of the errors taken all to plus, of the sum observed of the angles of each triangle. In the one hundred seven triangles of the meridian, this sum is, by that which precedes, 173,82; we can thus take for ε, \frac{173,82}{107}; that which gives, for 26ε' or for θ^2,

θ^2 = \frac{26π}{2}\cdot(\frac{173,82}{107})^2 = 107,78.
This differs very little from the value 108,134 given by the sum of the squares of the
errors of the sum observed of the angles of each of the one hundred seven triangles.
This accord is remarkable.
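The accord may be verified numerically. The following sketch (Python; it uses only the two totals 173,82 and 445,217 quoted above) computes θ² by both routes.

import math

# theta^2 from the mean absolute error (via eps' = eps^2 * pi / 2) and from the mean squared error.
eps = 173.82 / 107                              # mean error of the sum of the three angles, in seconds
theta2_mean = 26 * (math.pi / 2) * eps ** 2     # 26 * eps'
theta2_square = 26 * 445.217 / 107              # 26 times the mean squared error
print(theta2_mean, theta2_square)               # roughly 107.8 and 108.2: the "remarkable accord"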
By supposing the angle of intersection of the base of Perpignan, with the meridian
which passes through one of the extremities of this base, well determined; we would
have exactly the angle of intersection of the meridian with the last side of the chain of
the triangles which unite this base to the isle of Formentera, if the earth were a spheroid
of revolution, and if the angles of the triangle were exactly measured. The error coming
from this second cause, in the last angle of intersection, is, by the formulas of the second supplement cited, proportional to the exponential c^{-r^2}, by expressing this error by \frac{2}{3}θr, which in the present case becomes 6″,8997·r. Thence it follows that the limits between which we can wager one against one, that the error falls, are ±3″,2908. If the
azimuthal observations were made with a very great precision; we would determine by
this formula, the probability that they indicate an ellipticity in the terrestrial parallels.
We can estimate the relative exactitude of the instruments of which we make use
in the geodesic operations, by the value of ε' concluded from a great number of triangles. That value concluded from one hundred seven triangles of the meridian, is \frac{445,217}{107}. The same value concluded from forty-three triangles employed by La Condamine, in his measure of the three degrees of the equator, is \frac{1718}{43}, or near ten times greater than the preceding. The equally probable errors, relative to the instruments employed in these two operations, are proportionals to the square roots of the values of ε'. Thence it follows that the limits ±8,0937 m, between which we just saw that it is
equally probable that the error of the measured arc from Perpignan to Formentera falls,
would have been ±25, 022 m with the instruments employed by La Condamine. These
limits would have surpassed ±40 m, with the instruments employed by LaCaille and
Cassini, in their measure of the meridian. We see thus how much the introduction of
the repeating circle in the geodesic operations, has been advantageous.

MÉMOIRE SUR L’APPLICATION
DU

CALCUL DES PROBABILITÉS AUX


OBSERVATIONS
ET SPÉCIALEMENT

AUX OPÉRATIONS DU NIVELLEMENT

Pierre Simon Laplace∗

Annales de Chimie et de Physique, t. XII; 1819. OC 13, pp. 301–304.


Read to the Academy of Sciences, 20 December 1819

In the great triangulations that one has executed for the measure of the Earth, one
has observed with care the zenithal distances of the signals, either in order to reduce
the angles to the horizon, or in order to determine the respective heights of the diverse
stations. Terrestrial refraction has a great influence on these heights, and its variability
renders them quite uncertain. I myself propose here to estimate the probability of the
errors of which they are susceptible.
The theory of refraction shows us that, in a constant atmosphere, the terrestrial
refraction is a fraction of the celestial arc contained between the zeniths of the observer
and of the observed signal; so that, in order to obtain it, it suffices to multiply this
arc by a factor which would be constant if the atmosphere were always the same, but
which varies without ceasing, by reason of the continual changes of the temperature
and of the density of the air. A great number of observations are able to give the mean
value of this factor and the law of probability of its variations. I have concluded both
from the observations of Mr. Delambre, published in the second Volume of his Work
entitled: Base du Système métrique. By departing from these data, I have determined
the probability of the errors of the height of Paris above the sea, under the hypothesis
of a chain of twenty-five equilateral triangles which would join Dunkirk and Paris, this
which supposes around 20000m of length of each of their sides. One is able to obtain
this height by diverse processes; but the one in which the law of probability of the errors
is the most rapidly decreasing must be preferred as being the most advantageous. Its
research is an easy corollary of the analysis that I have given besides for all these
objects, and there results from it that there are odds nine against one that the error with respect to the height of Paris above the sea would not then exceed 8m.

∗ Translated by Richard J. Pulskamp, Department of Mathematics & Computer Science, Xavier University, Cincinnati, OH. September 18, 2010

The process
that Mr. Delambre has followed, in order to conclude this height from a nearly equal
number of triangles, is a little less exact than the previous; but it is principally the
magnitude of the sides of several of his triangles which distribute with respect to his
result the uncertainty and which do not permit to respond, with a sufficient probability,
that it is not in error of 16m or 18m ; this which forms a considerable part of it.
The equally probable errors diminish much when one brings together the stations,
and it is indispensable to make it when one wishes to obtain an exact leveling. The
great triangles, very proper in the measure of terrestrial degrees, are not convenient at
all to the measure of heights, and it is necessary to separate these two kinds of measures.
But, by multiplying the stations, the error which holds to the observation of the zenithal
angles increases with their number and becomes comparable to the error which depends
on the variability of terrestrial refraction. This has caused me to research the law of
probability of the errors of the results when there are many sources of errors. Such
are the greater part of astronomical results; because one observes the stars by means
of two instruments, the meridian lunette and the circle, both susceptible of errors of
which the law of probability must not be supposed the same. The analysis that I have
given in the analytic theory of probabilities is applied easily in this case, whatever be
the number of sources of error. It determines the most advantageous results and the
laws of probability of the errors of which they are susceptible. In order to apply to
the operations of leveling, it is necessary to know the law of probability of the errors
due to astronomical refraction; and one has just seen that it results from the great
triangulations of the meridian. It is necessary moreover to know the law of probability
of the errors of the zenithal angles. We lack observations in this regard; but one will
deviate little from the truth by supposing this law the same as for horizontal angles,
and which is deduced from the errors observed in the sum of the three angles of each
triangle of the meridian. By departing from these laws, I find that, if one divides the
distance from Paris to Dunkirk into equidistant stations of an interval of 1200m , there
are odds one thousand against one that the error in the height of Paris above the sea
will not exceed four-tenths meter. One would diminish this error by bringing together
the stations; but the precision that one would obtain by this bringing together will not
compensate the length of the operations that it requires.
The equations of condition that one forms in order to have the astronomical ele-
ments contain implicitly the errors of the two instruments which serve to determine the
position of the stars. These errors are affected of different coefficients in each equation.
Then the most advantageous system of factors by which one must multiply respectively
these equations in order to obtain, by the reunion of the products, as many final equa-
tions as there are elements to determine; this system, I say, is no longer the one of the
coefficients of the elements in each equation of condition. The analysis has led me to
the general expression of this system of factors and thence to the results for which the
same error to fear is less probable than in each other system. The same analysis gives
the laws of probability of the errors of these results. These formulas contain as many
constants as there are sources of errors and that depend on the laws of probability of
these errors. In the case of a unique source, I have given, in my theory of probabilities,
the means to eliminate the constant, by forming the sum of the squares of the remain-
ders of each equation of condition, when one has substituted the values found for the

elements. A similar process gives generally the values of these constants, whatever be
their number: this which completes the application of the calculus of probabilities to
the results of the observations.
I will finish with a remark that appears to me important. The small uncertainty that
the observations, when they are not very multiplied, leave with respect to the values of
the constants of which I have just spoken, renders a little uncertain those probabilities
determined by the analysis; but it suffices nearly always to know if the probability that
the errors of the results obtained are contained within narrow limits brought together
extremely from unity, and, when this is not, it suffices to know to what point it is
necessary to multiply the observations in order to acquire a probability such that there
remains no reasonable doubt on the good quality of the results. The analytical formulas
of probabilities fulfill perfectly this object, and, under this point of view, they are able
to be envisioned as the necessary complement of the method of the sciences, founded
on the collection of a great number of observations susceptible of errors. Thus, when
one would reduce to 15m the error from 18m that one is able to fear in the height of
Paris above the sea, concluded from the great triangles of the meridian, it would not
be less true of it that this height is uncertain and that it is necessary to determine it
by some more precise means. Similarly, the analytic formulas, applied to the triangles
of the meridian from the base measured near to Perpignan to Formentera, give odds
around seventeen hundred thousand against one that the error of the corresponding
arc of the meridian, of which the length surpasses 460000m , is not 60m in error. This
must dissipate the fears of inexactitude that the omission of a base of verification on
the side of Spain was able to inspire. One would be further reassured in this regard
when likewise the probability of an equal error, or greater than 60m , would surpass the
fraction given by the formulas and would be raised to a millionth.

TROISIÈME SUPPLÉMENT.
APPLICATION DES FORMULES GÉODÉSIQUES DE PROBABILITÉ A LA
MÉRIDIENNE DE FRANCE

Pierre Simon Laplace∗


Sequel to Second Supplement
OC 7 pp. 581–616

§1. The part of the meridian which extends from Perpignan to Formentera is sup- [581]
ported on a base measured near Perpignan. Its length is around 466 km, and its last
extremity is joined to the base of Perpignan by a chain of twenty-six triangles. We are
able to fear that so great a length, which has not been verified at all by the measure
of a second base toward its other extremity, is susceptible to a sensible error arising
from the errors of the twenty-six triangles employed to measure it. It is therefore inter-
esting to determine the probability that this error not exceed 40m or 50m . Mr. Damoi-
seau, lieutenant-colonel of the Artillery, who has just gained the prize proposed by the
Academy of Turin, on the return of the comet of 1759, has well wished, at my request,
to apply to this part of the meridian my formulas of probability. Here the meridian
does not cut all the triangles, as we have proposed for more simplicity; but it is easy to
see that we are able to apply, to the angles formed by the prolongations of the sides of
the triangles with the meridian, that which I have said respecting the angles that these
sides would form if they were cut by the meridian. Mr. Damoiseau has found thus that
departing from the latitude of the signal of Busgarach, a little more to the north than
Perpignan, to Formentera, that which comprehends an arc of the meridian of about
466,006m , and by taking for unity the base of Perpignan, we have (second Supplement,
§1)
p^2 - pq + q^2 + p^{(1)2} - p^{(1)}q^{(1)} + q^{(1)2} + \cdots + p^{(25)2} - p^{(25)}q^{(25)} + q^{(25)2} = 48350,606.
The probability that an error in the measure of this arc is comprehended within the [582]
limits ±s becomes, by the formulas of the same section,

\frac{2\int dt\,c^{-t^2}}{\sqrt{π}},

the integral being taken from t null to the value of t equal to

\frac{3s}{2θ}\sqrt{\frac{n+1}{48350,606}},
∗ Translated by Richard J. Pulskamp, Department of Mathematics & Computer Science, Xavier University, Cincinnati, OH. January 10, 2014

n + 1 being the number of triangles employed, and θ2 being the sum of the squares of
the errors observed in the sum of the three angles of each triangle; π is the ratio of the
circumference to the diameter. By taking for unity the sexagesimal second, we find

θ2 = 118, 178

But, the number of triangles employed being only 26, it is preferable to determine by
a great number of triangles this constant θ2 which depends on the unknown law of the
partial observations. For this, we have made use of the one hundred seven triangles
which have served to measure the meridian from Dunkirk to Formentera. The set of
the sums of observed errors of the three angles of each triangle is, in taking all of them
positively, equal to 173, 82. The sum of the squares of these errors is 445, 217. By
26
multiplying it by 107 , we will have, for the value of θ2 ,

θ2 = 108, 184.

This value, which differs little from the preceding, must be preferred. It is necessary
to reduce θ into parts of the radius taken for unity, that which we will make by dividing
it by the number of sexagesimal seconds that this radius contains. We will have thus

t = s 689, 797;

s is a fraction of the base of Perpignan taken for unity. This base is 11706m , 40. By [583]
supposing therefore the error of 60m, we will have

t = \frac{60 \times 689,797}{11706,40}.
This put, we find, for the probabilities of the errors of the arc of the meridian of which
there is question are comprehended within the limits ±60m , ±50m , ±40m , the following
fractions:
\frac{1743695}{1743696}, \quad \frac{32345}{32436}, \quad \frac{1164}{1165}.
There are odds one against one that the error falls within the limits 8m , 0757.1
If the Earth were a spheroid of revolution and if the angles of all the triangles
were exact, we would have exactly the inclination of the lasts side of the chain of the
triangle on its meridian, by supposing given this inclination relative to the base. The
probability that the error of the first of these inclinations, proceeding from the errors
1 The comparison of this Memoir with the one which was published in the Connaissance des Temps for

1822 permits the identification of errors in this supplement. In the Connaissance des Temps, Laplace gives
2
±8m ,0937. In reality, it would be ±8m ,0940. We have π2 c−t dt = 12 for t = 0, 476936 thereby giving
R
even odds. From the previous relation here,
s 689, 797
t= ,
11706, 40
where s = ±8m ,0940. The number given by Laplace differs slightly, without doubt, because the value of
t, deduced from the formulas of Laplace, has not been calculated with as much precision as that employed
later.

of the observed angles of the triangles, is comprehended within the limits ±\frac{2}{3}θt is, by that which precedes,

\frac{2\int dt\,c^{-t^2}}{\sqrt{π}},

the integral being taken from t null: these limits become, by substituting for θ its preceding value, ±t\,6″,8997, the seconds being sexagesimal. Thence it follows that there are odds one against one that the error falls within the limits ±3″,2908. If the az-
imuthal observations were made with a great precision, we would determine by this
means the probability that they indicate an eccentricity in the terrestrial parallels. If we
measured, on the side of Spain, a base of verification equal to the base of Perpignan,
and if we joined it by two triangles to the chain of triangles of the meridian, we find,
by the calculation, that there are odds one against one that the difference, between this
base and its value concluded from the base of Perpignan, will not surpass a third of a
meter: that is, to quite nearly, the difference of the measure of the base of Perpignan to
its value concluded from the base of Melun.
We have seen, in the section cited, that, the angles of the triangles having been [584]
measured by means of the repeating circle, we are able to suppose the probability of
an error x in the observed sum of the three angles of each triangle proportional to the exponential c^{-kx^2}, k being a constant. Thence it follows that the probability of this error is

\frac{dx\,\sqrt{k}\,c^{-kx^2}}{\sqrt{π}}.
By multiplying this differential by x and integrating from x null to x infinity, the double
of this integral will be the mean of all the errors taken positively. By designating
therefore by ε this mean error, we will have

ε = \frac{1}{\sqrt{kπ}}.

We will have the mean value of the squares of these errors by multiplying by x^2 the preceding differential and by integrating it from x = -∞ to x infinity. By naming therefore ε' this value, we will have

ε' = \frac{1}{2k};

thence we deduce

ε' = \frac{ε^2π}{2}.
We are able thus to obtain θ2 by means of the errors, taken all to plus, of the observed
sums of the angles of each triangle. In the one hundred seven triangles of the meridian,
the sum of the errors is 173,82; we are able thus to take, for ε, \frac{173,82}{107}; that which gives, for 26ε' or for θ^2,

θ^2 = 13π(\frac{173,82}{107})^2 = 107,78;

this differs very little from the value 108, 134 given by the sum of the squares of the
errors of the observed sum of the angles of each of the one hundred seven triangles.
This accord is remarkable.
We are able to estimate the relative exactitude of the instruments of which we make [585]
use in the geodesic observations, by the value of ε' concluded from a great number of triangles. This value, concluded from the one hundred seven triangles of the meridian, is \frac{445,217}{107} or 4,1609. The same value, concluded from the forty-three triangles employed by La Condamine in the measure of the three degrees of the equator, is \frac{1718}{43} or
39, 953, and, consequently, nearly ten times greater than the preceding. The equally
probable errors, relative to the instruments employed in these two operations, are pro-
portionals to the square roots of the values of 0 . Thence it follows that the limits
±8m , 0937, between which we have just seen that there are odds one against one that
the error of the arc measured from Perpignan to Formentera falls, would have been
±25m , 022 with the instruments employed by La Condamine. These limits would have
surpassed ±40m with the instruments employed by La Caille and Cassini in their mea-
sure of the meridian. We see thus how the introduction of the repeating circle in the
geodesic operations has been advantageous.

§2. In order to give a very simple example of the application of the geodesic for-
mulas, I will consider the straight line AA(5) , of which we have determined the length
by a chain of triangles CC (1) C (2) , C (1) C (2) C (3) , . . .

[Figure: the chain of equal isosceles triangles CC^{(1)}C^{(2)}, C^{(1)}C^{(2)}C^{(3)}, ..., with vertices C, C^{(1)}, ..., C^{(4)}, the straight line AA^{(5)} through the points A, A^{(1)}, ..., A^{(5)}, and the feet I, I^{(1)}, ..., I^{(4)} of the perpendiculars lowered from the vertices onto that line.]
I will suppose all these triangles equal and isosceles, and such that their bases CC (2) ,
C (1) C (3) , . . . are parallels to the line AA(5) . We will have, by lowering onto this line [586]

the perpendiculars CI, C (1) I (1) , . . .

II (1) = CC (1) cos A(1) ,


CC (1) sin C (1) CC (2)
C (1) C (2) = ,
sin C (1) C (2) C
I (1) I (2) = C (1) C (2) cos A(2) ,
C (1) C (2) sin C (2) C (1) C (3)
C (2) C (3) = ,
sin C (2) C (3) C (1)
and generally

I (i) I (i+1) = C (i) C (i+1) cos A(i+1)


C (i) C (i+1) sin C (i+1) C (i) C (i+2)
C (i+1) C (i+2) = .
sin C (i+1) C (i+2) C (i)
Let α(1) and β (1) be the errors of the angles opposed to the sides CC (1) and
C C (2) in the first triangle. Let α(2) and β (2) be the errors of the angles opposed
(1)

to the sides C (1) C (2) and C (1) C (3) of the second triangle, and thus consecutively. By
designating by δ a variation relative to these errors, we will have

δI (i) I (i+1) δC (i) C (i+1)


= − δA(i+1) tan A(i+1) ,
I (i) I (i+1) C (i) C (i+1)
δC (i) C (i+1) δC (i) C (i−1)
= + β (i) cot C (i+1) C (i−1) C (i)
C (i) C (i+1) C (i) C (i−1)
− α(i) cot C (i) C (i+1) C (i−1) .

We have further, by supposing the angles A(i) relative to the acute angles that the sides
of the triangles form with the line AA(1) , . . . ,

δA(i+1) + δA(i) + δC (i−1) C (i) C (i+1) = 0;

we will suppose here that the errors α(i) and β (i) of the angles C (i+1) C (i−1) C (i) ,
C (i) C (i+1) C (i−1) of the triangle C (i−1) C (i) C (i+1) are those which remain, when we
have subtracted from each angle of the triangle the third of the sum of the errors of the
three angles. Then we have

δC (i−1) C (i) C (i+1) = −α(i) − β (i) ,

that which gives


δA(i+1) = −δA(i) + α(i) + β (i) ;
we will have therefore [587]

δA(i+1) = α(i) − α(i−1) + α(i−2) − · · · ∓ α(1)


+ β (i) − β (i−1) + β (i−2) − · · · ± δA(1) ,

the superior sign having place if i is even, and the inferior if i is odd.

We will have next, by observing that

cot C (i) C (i−1) C (i+1) = cot C (i) C (i+1) C (i−1) = cot A(i)

and that A(i) = A(1) .


δC (i) C (i+1) δCC (1)
= +(β (i) +β (i−1) +· · ·+β (1) −α(i) −α(i−1) −· · ·−α(1) ) cot A(1) ;
C (i) C (i+1) CC (1)
we will have therefore

δI (i) I (i+1) δCC (1)


= + (β (i) + β (i−1) + · · · + β (1) − α(i) − α(i−1) − · · · − α(1) ) cot A(1)
I (i) I (i+1) CC (1)
− (α(i) − α(i−1) + · · · ∓ α(1) + β (i) − β (i−1) + · · · ∓ β (1) ± δA(i) ) tan A(1) .

Let us suppose now that we have measured a base AC situated in a manner that
the angle CAC (1) is equal to the angle CA(1) A. The first of these angles determines
the position of the line AA(1) with respect to the base, and it is supposed known. By
naming α and β the errors of the angles CC (1) A and CAC (1) , we will have

δA(1) = α + β,
δCC (1)
= β cot CAC (1) − α cot CC (1) A.
CC (1)
Let us make
cot CAC (1) = cot A + h,
cot CC (1) A = cot A + h0 ;
we will have, by designating by b the base AC and by a the straight line II (i) ,

h = \frac{b}{2a\sin A} - \frac{1}{\sin 2A},
h' = \frac{a}{2b\sin A\cos^2 A} - \frac{1}{\sin 2A};
we will have next [588]

δA(i+1) =α(i) − α(i−1) + · · · ± α + β (i) − β (i−1) + · · · ± β,


δI (i) I (i+1)
=(β (i) + β (i−1) + · · · + β − α(i) − α(i−1) − · · · − α) cot A
I (i) I (i+1)
− (α(i) − α(i−1) + · · · ± α + β (i) − β (i−1) + · · · ± β) tan A + hβ − h0 α.

The variation of the total length II (i+1) will be therefore

δII (i+1) =[(i + 1)(β − α) + i(β (i) − α(i) + · · · + (β (i) − α(i) )]a cot A
+ (i + 1)haβ − (i + 1)h0 aα
− (α(i) + α(i−2) + α(i−4) + · · · + β (i) + β (i−2) + β (i−4) + · · · )a tan A.

The quantity

p2 − pq + q 2 + p(1)2 − p(1) q (1) + q (1)2 + · · · + p(i)2 − p(i) q (i) + q (i)2

becomes thus, by neglecting the terms of order i,


\frac{(i+1)(i+2)(2i+3)}{2}a^2\cot^2 A + 3(h + h')(i+1)^2a^2\cot A + (h^2 + hh' + h'^2)(i+1)^2a^2.

Let us name Q this quantity; the probability that the error of the line II^{(i+1)} is comprehended within the limits ±s will be, by that which precedes,

\frac{2\int dt\,c^{-t^2}}{\sqrt{π}},

the integral being taken from t null to

t = \frac{3s}{2θ}\sqrt{\frac{i+1}{Q}},

θ2 being the sum of the squares of the errors of the sum of the three angles of the i + 1
triangles.
Let us suppose that we have, as for the part of the meridian of which we have spoken
previously, twenty-six triangles, that which gives i = 25. Let us suppose further that
the length II (i+1) is that of this part of the meridian or of 466006m ; then we will have
a = \frac{466006}{26}.
By taking for unity the base measured near to Perpignan, which is of 11706m , 40 and [589]
by supposing right-angled the isosceles triangles CC (1) C (2) , C (1) C (2) C (3) , . . . that
which gives tan A = cot A = 1, we find

Q = 48207, 6.

We have seen previously that the twenty-six triangles which join the base of Per-
pignan to Formentera give
Q = 48350, 6;
these two values of Q are not very different, and as the equally probable errors are
proportionals to the square roots of these values, we see that we are able to wager one
against one that the errors of the entire measure are contained within the limits ±8m , 1.
Under this relation, the case that we examine represents perfectly the measure of the
arc of the meridian from the base of Perpignan to Formentera.
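The value Q = 48207,6 can be reproduced from the expression above. The sketch below (Python; the unit conventions follow the text, and everything else is illustrative) evaluates Q for i = 25, a = 466006/26 and the Perpignan base taken as unity.

import math

# Hypothetical recomputation of Q for the chain of 26 equal right-angled isosceles
# triangles (i = 25), the base b = 11706.40 m being the unit and tan A = cot A = 1.
i = 25
b = 1.0
a = (466006.0 / 26) / 11706.40
A = math.pi / 4
cot = 1.0 / math.tan(A)
h = b / (2 * a * math.sin(A)) - 1 / math.sin(2 * A)
hp = a / (2 * b * math.sin(A) * math.cos(A) ** 2) - 1 / math.sin(2 * A)
Q = ((i + 1) * (i + 2) * (2 * i + 3) / 2) * a**2 * cot**2 \
    + 3 * (h + hp) * (i + 1) ** 2 * a**2 * cot \
    + (h**2 + h * hp + hp**2) * (i + 1) ** 2 * a**2
print(Q)   # close to the value 48207.6 found in the text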

§3. Let us suppose now that we measure, toward the last extremity of the line
II (i+1) , a base C (i+1) A(i+2) equal to the base CA, and put in a manner that the angle
C (i+1) C (i) A(i+2) is equal to the angle CC (1) A, and that the angle C (i) A(i+2) C (i+1)

is equal to the angle CAC (1) . In designating by α(i+1) and β (i+1) the errors of the
angles C (i+1) C (i) A(i+2) and C (i) A(i+2) C (i+1) , the equation

sin C (i+1) C (i) A(i+2)


C (i+1) A(i+2) = C (i+1) C (i)
sin C (i) A(i+2) C (i+1)
will give

δC (i+1) A(i+2) δC (i) C (i+1)


(i+1) (i+2)
= (i) (i+1) + α(i+1) cot CC (1) A − β (i+1) cot CAC (1) ,
C A C C
that which gives

δC (i+1) A(i+2)
=(β (1) + β (2) + · · · + β (i) − α(1) − α(2) − · · · − α(i) cot A
C (i+1) A(i+2)
+ β(h + cot A) − α(h0 + cot A)
+ α(i+1) (h0 + cot A) − β (i+1) (h + cot A).
That which we have designated in §2 of the second Supplement by
l, l(1) , . . . , m, m(1) , . . . becomes [590]

l = −(1 + h0 )b, m = (1 + h)b,


(i) (i)
l = −b, m = b,
······ , ······ ,
(i) (i)
l = −b, m = b,
(i+1) 0 (i+1)
l = (1 + h )b, m = −(1 + h)b;

the quantity that we have designated by Sf (i)2 in the section cited or by

l2 − ml + m2 + l(1)2 − m(1) l(1) + m(1)2 + · · ·

becomes here

3(i + 2)b2 + 6(h + h0 )b2 + 2(h2 + hh0 + h02 )b2 .

The quantity that we have named Sr(i) f (i) in the same section, or

l(p − 21 q) + m(q − 12 p) + l(1) (p(1) − 21 q (1) ) + m(1) (q (1) − 21 p(1) ) + · · · ,

becomes, by neglecting the terms which do not have i for coefficient,


3(i + 1)(i + 2) 0
ab + 3(i + 1)(h + h0 )ab + (i + 1)(h2 + hh0 + h2 )ab;
2
by representing therefore, as above, by λ the excess of the measured base C (i+1) A(i+2)
on the calculated base, and by s the excess of the true length of the line II (i+1) over
that calculated length, we will have

s = \frac{λ\,Sr^{(i)}f^{(i)}}{Sf^{(i)2}} = \frac{(i+1)aλ}{2b};

it is necessary, consequently, to add to the calculated length of the line II (i+1) the
product of λ by the ratio of the half of this line to the base b; that which reverts to
calculating the first half of the line II (i+1) with the base AC, and the second half
with the base A(i+2) C (i+1) . This process would be generally exact, whatever was the
magnitude and the disposition of the triangles which unite the two bases, if the parts
of Sr(i) f (i) and of Sf (i)2 corresponding to these halves were respectively equal. This
is the process that we adopted in the Commission which fixed the length of the meter; [591]
and, in the ignorance where we were then of the true theory of these corrections, it
was most convenient; but it did not make known the correction of the diverse parts of
the total arc II (i+1) . For this, it is necessary to correct the angles of each triangle, or
to determine the corrections α, β, α(1) , β (1) , . . . which result from the excess λ of the
second base observed over that base calculated after the first. I have given, in the second
Supplement, these corrections, by supposing the law of errors of the observations of
2
the angles proportional to the exponential c−k(α+ 3 T ) , k being a constant, T being the
1

sum of the errors of the three angles of the triangle, α + 13 T , β + 31 T and 13 T − α − β


being the errors of each of the angles. We have seen, in the Supplement cited, that
the supposition of this law of probability must be admitted when the angles have been
measured with the repeating circle, and that then we have
l(s) − 12 m(s) m(s) − 12 l(s)
α(s) = λ, β (s) = λ,
F F
by designating by F the sum of all the quantities l2 − ml + m2 , l(1)2 − m(1) l(1) +
m(1)2 , . . . I will demonstrate here that these corrections have place, whatever be the
law of probability of the errors.
For this, I designate this law by φ(α+ 31 T )2 : by supposing it the same for the pos-
itive errors and for the negative errors, its expression must contain only some even
powers of these errors. The law of probability of the simultaneous values of α and β
will be thus proportional to the product
φ(α+ 13 T )2 φ(β + 31 T )2 φ( 13 T − α − β)2 .
If we develop this product, with respect to the powers of α and of β, by arresting
ourselves at the squares and at the products of these quantities, we will have
[φ( 91 T )]3 + (α2 + αβ + β 2 )φ( 19 T 2 )
× 2φ( 19 T 2 )φ0 ( 19 T 2 ) − 49 T 2 [φ0 ( 19 T 2 )]2 + 94 T 2 φ( 19 T 2 )φ00 ( 19 T 2 )


0
φ0 (x) expressing dφ(x) 00
dx , and φ (x) expressing
dφ (x)
dx . T being able to be supposed to [592]
vary from −∞ to T = ∞, we will multiply the preceding function by dT and we will
integrate within these limits; we will have thus for the probability of the simultaneous
values of α and β a quantity of the form
H − H 0 (α2 + αβ + β 2 ).
This probability will be therefore proportional to
H0 2
1− (α + αβ + β 2 ).
H

The probability of the simultaneous existence of α, β, α(1) , β (1) , . . . will be propor-
tional to the product of the quantities

H0 2
1− (α + αβ + β 2 ),
H
H 0 (1)2
1− (α + α(1) β (1) + β (1)2 ),
H
······························

The logarithm of this product is, s being an indeterminate number,

H0
− S(α(s)2 + α(s) β (s) + β (s)2 ) − · · · ;
H
this product is at its maximum if the preceding term is at its minimum, or if the function

S(α(s)2 + α(s) β (s) + β (s)2 )

is the smallest possible, the quantities α, β, α(1) , . . . satisfying besides the equation

λ = lα + mβ + l(1) α(1) + m(1) β (1) + · · ·

We are able to give to this function the form


 #2 
(s)
2 "
(s) 1 (s)  3 λ2


1  3m λ 3 (l m )λ
S 2β (s) + α(s) − + α(s) − 2
+ ;
4  2F 4 F  4F

this function is evidently at its minimum if we suppose [593]

3m(s) λ (l(s) − 21 m(s) )λ


2β (s) + α(s) − = 0, α(s) − = 0;
2F F
whence we deduce generally

1 λ 1 λ
α(s) = (l(s) − m(s) ) , β (s) = (m(s) − l(s) ) .
2 F 2 F
In the case that we just considered, we have

λb 3 λb 3
α=− ( + h0 + 12 h), β= ( + h + 12 h0 ),
F 2 F 2
3
2 bλ
3

α(1) = α(2) = · · · = α(i) = − 2F , β (1) = β (2) = · · · = β (i) = ,
F
λb 3 λb 3
α(i+1) = ( + h0 + 12 h), β (i+1) = − ( + h + 12 h0 );
F 2 F 2
thus by these corrections all the triangles other than those which have one of the bases
for one of their sides will remain right-angled.

The probability of the error ±u of the line II (i+1) , corrected by the second base,
will be, by the section cited in the second Supplement,
2
2 dt c−t
R
√ ,
π
the integral being taken from t null to
v
3u u i+1
u
t= ,
2θ Q (Sr(i) f (i) )2
t
2
Sf (i)

which becomes here s


3u i+1
t= ,
2θ Q0
by designating by Q0 the function
(i + 1)(i + 2)(i + 3) 2 3
a + 2 (i + 1)2 (h + h0 )a2 + 21 (i + 1)2 (h2 + hh0 + h02 )a2 .
4
The equally probable errors being proportionals to the square roots of Q and of Q0 , [594]
we see that they are diminished and nearly reduced to half by the measure of a second
base.
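A short computation (Python; it uses the expressions for Q and Q' given in the text for this right-angled isosceles chain, with the base taken as unity) confirms that the equally probable errors are nearly halved by the second base.

import math

# Hypothetical comparison of Q (one base) with Q' (a second base at the far end),
# for the same chain of 26 right-angled isosceles triangles as above (b = 1).
i = 25
a = (466006.0 / 26) / 11706.40
A = math.pi / 4
h = 1 / (2 * a * math.sin(A)) - 1 / math.sin(2 * A)
hp = a / (2 * math.sin(A) * math.cos(A) ** 2) - 1 / math.sin(2 * A)
Q = ((i + 1) * (i + 2) * (2 * i + 3) / 2 + 3 * (h + hp) * (i + 1) ** 2
     + (h**2 + h * hp + hp**2) * (i + 1) ** 2) * a**2
Qp = ((i + 1) * (i + 2) * (i + 3) / 4 + 1.5 * (h + hp) * (i + 1) ** 2
      + 0.5 * (h**2 + h * hp + hp**2) * (i + 1) ** 2) * a**2
print(Qp / Q, math.sqrt(Qp / Q))   # the equally probable errors shrink roughly by half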
The probability of an error ±λ in the measure of a second base is, by the second
Supplement,
2
2 dt c−t
R
√ ,
π
the integral being taken from t null to
s
3u i+1
t= ;
2θ Sf (i)2
2
and f (i) is equal to

3(i + 1)b2 + 6(h + h0 )b2 + 2(h2 + hh0 + h02 )b2 .

In the present case where i = 25, this quantity becomes

86, 8030b2 ;

the equally probable errors in the measures


√ of √the arc II (i+1) and of a new base equal
to the first are therefore in the ratio of Q to 86, 8030; whence it follows that there
are odds one against one that the error of a new base will be comprehended within
m
the limits ±0m , 34236, or to very nearly ± 31 . These are the same limits which result
from the angles of the twenty-six triangles which reunite the base of Perpignan to
Formentera. Thus, under this relation again, the hypothetical case, which we have just
examined, accords with that which this chain of triangles gives.

§4. I will consider now the zenithal distances of the vertices of the triangles and
the leveling which results from it. From one same vertex such as C (2) , we are able to
observe the four points C, C (1) , C (3) , C (4) . Let us name f the distance CC (1) and
h the base CC (2) of the isosceles triangle; all the triangles being supposed equal, if
we name x(i) the height of C (i) above of the level of the sea, the observed distance
from C (i−2) to the zenith of C (i) being designated by θ, the true distance will be quite
nearly, the triangles being able to be supposed horizontal, [595]

θ + \frac{hu}{R} + \frac{hε}{R},

u being the factor by which we must multiply the angle \frac{h}{R} in order to have the terrestrial refraction at the point C^{(i)}, R being the radius of the Earth and ε being the error of u. I take account here only of this error, as being much greater than that of θ. If we name similarly θ' the zenithal distance of C^{(i)}, observed from C^{(i-2)}, the true distance will be

θ' + \frac{hu}{R} + \frac{hε'}{R},

ε' being the error of u in this observation. We will have

θ + θ' + \frac{2hu}{R} + \frac{h}{R}(ε + ε') = π + \frac{h}{R};

we will have next

x^{(i)} - x^{(i-2)} = \frac{h}{2}(θ - θ') + \frac{h^2}{2R}(ε - ε').
If we name similarly θ00 the zenithal distance of C (i−1) observed from C (i) , the true
distance will be
f u f 00
θ00 + + ,
R R
00 being the error of u in this observation. By naming further θ000 and 000 the same
quantities relative to the zenithal distance of C (i) , observed from C (i−1) , we will have

2f u f f
θ00 + θ000 + + (00 + 000 ) = π + ,
R R R
f f 2 00
x(i) − x(i−1) = (θ00 − θ000 ) + ( − 000 ).
2 2R
As I myself propose here only to examine what degree of confidence we must accord
to this kind of leveling, I will make h = f , that which reverts to supposing all the
triangles equilateral. I will take, moreover, \frac{h^2}{2R} for unit of distance: by making next ε - ε' = λ^{(i)}, ε'' - ε''' = γ^{(i)}, one will have two equations of the form

(A)   x^{(i)} - x^{(i-1)} = γ^{(i)} + p^{(i)},
      x^{(i)} - x^{(i-2)} = λ^{(i)} + q^{(i)}.

The first of these equations extends from i = 1 to i = n + 1, n being the number of
triangles. The second equation extends from i = 2 to i = n + 1. It is necessary now to
conclude from this system of equations the most advantageous value of x(n+1) − x(0) ,
the elevation x(0) of the point C above the sea being supposed known. For this, we
will multiply the first of the equations (A) by f (i) and the second by g (i) , f (i) and g (i)
being indeterminate constants. In the system of these equations added all together, the
coefficient of x(i) will be f (i) − f (i+1) + g (i) − g (i+2) . By equating it to zero and
observing that g (i+2) − g (i) = ∆g (i+1) + ∆g (i) , ∆ being the characteristic of finite
differences, we will have, by integrating,

f (i) = a − g (i) − g (i+1) ,


a being a constant. But, the values of g (i) beginning to take place only when i = 2,
this expression of f (i) is able to serve only when i = 2. In order to have the value of
f (1) , we will observe that the equating to zero of the coefficient of x(1) gives

f (1) = f (2) + g (3) ;


substituting, instead of f (2) , a − g (2) − g (3) , we will have

f (1) = a − g (2) .
Next, the preceding expression of f (i) extends only to i = n; but, relatively to i = n+1,
we must observe that the coefficient of x(n+1) must be unity, that which gives

f (n+1) + g (n+1) = 1
or

f (n+1) = 1 − g (n+1) ;
the equality to zero of the coefficient of x(n) gives f (n) = f (n+1) − g (n) , or f (n) = [597]
1 − g (n) − g (n+1) . By comparing this expression to this one f (n) = a − g (n) − g (n+1) ,
we will have a = 1. The error of the value of x(n+1) will be thus

f (1) γ (1) + f (2) γ (2) + · · · + f (n+1) γ (n+1)


+ g (2) λ(2) + g (3) λ(3) + · · · + g (n+1) λ(n+1)
The values of γ (1) , γ (2) , . . ., λ(1) , λ(2) , . . . being evidently subject to the same law of
probability, if we name s this error and if we make

H = f (1)2 + f (2)2 + · · · + f (n+1)2


+ g (1)2 + g (3)2 + · · · + g (n+1)2 ,
the probability of the error s will be proportional, by §20 of Book II, to an exponential
of the form
Ks2
c− H ,

K being a constant dependent on the law of probability of γ (i) and λ(i) .
It is necessary to determine the constants of H, in a manner that H is a minimum.
Now we have

H = (1 − g (2) )2 + (1 − g (2) − g (3) )2 + · · · + (1 − g (n) − g (n+1) )2


+ (1 − g (n+1) )2 + g (2)2 + g (3)2 + · · · + g (n+1)2 ;

by equating to zero the coefficient of the differential of g (i) , we have

(1) g (i+1) + 3g (i) + g (i−1) = 2.

This equation holds from i = 3 to i = n. The equality to zero of the coefficient of


dg (2) gives

g (3) + 3g (2) = 2,
and the equating to zero of the coefficient of dg (n+1) gives

3g (n+1) + g (n) = 2,
that which reverts to considering the general equation (1) as holding from i = 2 to
i = n + 1, and to supposing null g (1) and g (n+2) . The integration of equation (1) in the [598]
finite differences gives

g^{(i)} = \frac{2}{5} + A\,l^{\,i-1} + A'\,l'^{\,i-1},

l and l' being the two roots -\frac{3}{2} - \frac{1}{2}\sqrt{5}, -\frac{3}{2} + \frac{1}{2}\sqrt{5} of the equation

y^2 + 3y + 1 = 0;

A and A' are two arbitraries such that g^{(i)} becomes null when i = 1 and when i = n+2.
We have therefore

A\,l^{\,n+1} + A'\,l'^{\,n+1} = -\frac{2}{5},
A + A' = -\frac{2}{5}.

l^{\,n+1} is an extremely great quantity when n is a great number and, l'^{\,n+1} being \frac{1}{l^{\,n+1}}, we see that A is then an excessively small quantity and that thus A' = -\frac{2}{5}. We have next

f^{(i)} = \frac{1}{5} - A\,l^{\,i-1}(1 + l) - A'\,l'^{\,i-1}(1 + l').

Thence it is easy to conclude that we have, very nearly and without fear of \frac{1}{25} of error,
H = \frac{n+1}{5},

and that thus the exponential proportional to the probability of error s is
c^{-\frac{5Ks^2}{n+1}};
we are able therefore thus to determine this probability.
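The statement H = (n+1)/5 is easy to confirm numerically: the sketch below (Python; the value of n is an arbitrary illustration) solves the system of equations (1) for the factors g^(i) with g^(1) = g^(n+2) = 0, forms the f^(i), and compares the resulting H with (n+1)/5.

import numpy as np

# Hypothetical check of the most advantageous levelling combination.
n = 200
m = n                                  # unknowns g^(2), ..., g^(n+1)
M = np.diag(np.full(m, 3.0)) + np.diag(np.ones(m - 1), 1) + np.diag(np.ones(m - 1), -1)
g_inner = np.linalg.solve(M, np.full(m, 2.0))      # g^(i+1) + 3 g^(i) + g^(i-1) = 2
g = np.concatenate(([0.0], g_inner, [0.0]))        # g^(1), ..., g^(n+2)
f = 1.0 - g[:-1] - g[1:]                           # f^(i) = 1 - g^(i) - g^(i+1)
H = np.sum(f**2) + np.sum(g_inner**2)
print(H, (n + 1) / 5)    # the two numbers nearly coincide for large n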
We have concluded the value of x(n+1) of the system of equations (A) by the fol-
lowing process.
The system of equations (A) gives

x(1) − x(0) = p(1) + γ (1) ;


whence we deduce

x(1) = p(1) + x(0) + γ (1) .


We have next the two equations [599]

x(2) − x(1) = p(2) + γ (2) ,


x(2) − x(0) = q (2) + λ(2) ;
that which gives

x(2) = 21 x(1) + 12 x(0) + 21 (p(2) + q (2) ) + 12 γ (2) + 21 λ(2) .


We have the two equations

x(3) − x(2) = p(3) + γ (3) ,


x(3) − x(1) = q (3) + λ(3) ;
that which gives

x(3) = 21 x(2) + 12 x(1) + 21 (p(3) + q (3) ) + 12 γ (3) + 21 λ(3) .


By continuing thus, we will have x(n+1) . The quantities γ (m) and λ(m) commence to
be introduced into this expression only with the two values of x(m) − x(m−1) and of
x(m) − x(m−2) . Let us designate by k (r) the coefficient of γ (m) in the expression of
x(m+r) ; this expression is

x(m+r) = 12 x(m+r−1) + 12 x(m+r−2) + 12 (p(m+r) + q (m+r) ) + 12 γ (m+r) + 12 λ(m+r) ;

by substituting for x(m+r) , x(m+r−1) , x(m+r−2) the parts of their values relative to
γ (m) , the comparison of the coefficients of this quantity will give

k^{(r)} = \frac{1}{2}k^{(r-1)} + \frac{1}{2}k^{(r-2)};


whence we deduce, by integrating,
k^{(r)} = A + A'(-\frac{1}{2})^{r-1},

A and A' being two arbitraries. In order to determine them, we will observe that, r being null, we have k^{(0)} = \frac{1}{2}, and that, r being 1, we have

k^{(1)} = \frac{1}{2}k^{(0)} = \frac{1}{4};

thence we deduce

A = \frac{1}{3}, \qquad A' = -\frac{1}{12};

thus, in the value of x^{(n+1)}, where r = n + 1 - m, we will have, for the coefficient k^{(n+1-m)} of γ^{(m)},

k^{(n+1-m)} = \frac{1}{3} - \frac{1}{12}(-\frac{1}{2})^{n-m};
the coefficient of λ(m) in the same value will be evidently the same. Thus the expres- [600]
sion of x(n+1) will be a known quantity, plus the series

k (n) γ (1) + k (n−1) (γ (2) + λ(2) ) + · · · + k (0) (γ (n+1) + λ(n+1) ).

Let us designate by s this error and by H the sum of the squares of the coefficients of
γ^{(1)}, γ^{(2)}, ..., λ^{(2)}, λ^{(3)}, ...; the probability of s will be proportional to c^{-\frac{Ks^2}{H}}. We have, very nearly,

H = \frac{2}{9}(n + 1);

thus the probability of s is very nearly proportional to c^{-\frac{9Ks^2}{2(n+1)}}; the equally probable errors are therefore greater in this process than according to the most advantageous method, and nearly in the ratio of \sqrt{5} to \sqrt{\frac{9}{2}}; this process approaches therefore much
the exactitude of the most advantageous method, and, as the calculation of it is quite
simple, we will determine the probability of the errors to which it exhibits, in the gen-
eral case where the diverse triangles are neither equal nor equilateral.
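As a numerical check on the comparison just stated (a sketch, not part of the original; $n$ is arbitrary), the coefficients $k^{(r)}$ of the simple chaining process can be generated from their recurrence, the sum of their squares compared with $\frac{2}{9}(n+1)$, and the error ratio with $\sqrt{5} : \sqrt{\frac{9}{2}}$.

```python
import numpy as np

def chaining_H(n):
    # k(0) = 1/2, k(1) = 1/4, then k(r) = (k(r-1) + k(r-2)) / 2.
    k = [0.5, 0.25]
    for _ in range(2, n + 1):
        k.append((k[-1] + k[-2]) / 2)
    # In x(n+1): gamma(1) enters with coefficient k(n); gamma(m) and lambda(m),
    # for m = 2, ..., n+1, each enter with coefficient k(n+1-m).
    coeffs = [k[n]] + [k[n + 1 - m] for m in range(2, n + 2) for _ in (0, 1)]
    return sum(c * c for c in coeffs)

n = 100
H_simple = chaining_H(n)
print(H_simple, 2 * (n + 1) / 9)                            # nearly equal
print(np.sqrt(H_simple / ((n + 1) / 5)), np.sqrt(5 / 4.5))  # error ratio, about 1.054
```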
If we represent by m(i) the square of C (i−1) C (i) divided by 2R, and by n(i) the
square of C (i−2) C (i) divided similarly by 2R, the system of equations (A) will be
changed into the following:
$$(\mathrm{A}')\qquad
\begin{cases}
x^{(i)} - x^{(i-1)} = p^{(i)} + m^{(i)}\gamma^{(i)},\\
x^{(i)} - x^{(i-2)} = q^{(i)} + n^{(i)}\lambda^{(i)}.
\end{cases}$$

The process that we have just examined gives, by following the preceding analysis, the
coefficient of γ (i) in the expression of x(n+1) equal to
$$\tfrac{1}{3}m^{(i)} - \tfrac{1}{12}m^{(i)}\left(-\tfrac{1}{2}\right)^{n-i}.$$

Similarly, the coefficient of $\lambda^{(i)}$, in the same expression, is
$$\tfrac{1}{3}n^{(i)} - \tfrac{1}{12}n^{(i)}\left(-\tfrac{1}{2}\right)^{n-i};$$

thence it follows that the value of $H$ is, very nearly,
$$\tfrac{1}{9}\,S(m^{(i)2} + n^{(i)2}),$$
the integral sign $S$ extending to all the values of $i$ to $i = n + 1$; the probability of an [601] error $s$, in the expression of $x^{(n+1)}$, is therefore proportional to
$$c^{-\frac{9Ks^2}{S(m^{(i)2} + n^{(i)2})}}.$$

If we apply to the equations (A0 ) the analysis that we have given above for the case
of the most advantageous method, we will find, by multiplying them respectively by
f (i) and g (i) , the following equation

$$f^{(i)} = 1 - g^{(i)} - g^{(i+1)},$$
and this equation will hold from $i = 1$ to $i = n + 1$, by supposing $g^{(1)}$ and $g^{(n+2)}$ null.
We will have next the general equation

m(i)2 g (i+1) + (n(i)2 + m(i)2 + n(i−1)2 )g (i) + m(i−1)2 g (i−1) = m(i)2 + m(i−1)2 .

This equation holds from i = 2 to i = n + 1. By combining it with the equations


g (1) = 0, g (n+2) = 0, we will have the values of f (1) , f (2) , . . . , f (n+1) ; g (1) , g (2) , . . . ,
g (n+2) ; we will have next

$$H = S(f^{(i)2}m^{(i)2} + g^{(i)2}n^{(i)2}),$$
the sign $S$ comprehending all the values of $f^{(i)2}m^{(i)2}$ and of $g^{(i)2}n^{(i)2}$; the probability of an error $s$ in the value of $x^{(n+1)}$ will be proportional to
$$c^{-\frac{Ks^2}{H}}.$$

§5. It is necessary now to determine the value of $K$. For this, we will observe that the factor $u$ is determined, by that which precedes, by means of the equation
$$u = \frac{\pi - \theta - \theta' + \frac{h}{R}}{\frac{2h}{R}},$$
and that the error of this expression is $\frac{\epsilon + \epsilon'}{2}$. Each double station furnishes a value of $u$, and the mean of these values is the value that it is necessary to adopt. If we name $i$ the [602] number of these values, the error to fear will be $S\,\frac{\epsilon + \epsilon'}{2i}$, the sign $S$ corresponding to the $i$ quantities $\frac{\epsilon + \epsilon'}{2i}$ related to each double station. Let $s$ be the sum $S\,\frac{\epsilon + \epsilon'}{2i}$; the probability of $s$ will be, by §20 of Book II, proportional to an exponential of the form
$$c^{-\frac{K's^2}{i}},$$
and, if we name $q$ the sum of the squares of the differences of each partial value to its mean value, we will have
$$K' = \frac{i}{2q}.$$

We have, by that which precedes, the probability of the error of a value $s'$ of the function $S\,\frac{\epsilon - \epsilon'}{2i}$ proportional to the exponential
$$c^{-\frac{4Ks'^2}{i}},$$
the sign $S$ extending to $i$ quantities of the form $\frac{\epsilon - \epsilon'}{2}$. Now, the errors $\epsilon$ and $-\epsilon$ being supposed equally probable, it is clear that the same values of $S\,\frac{\epsilon + \epsilon'}{2i}$ and of $S\,\frac{\epsilon - \epsilon'}{2i}$ are equally probable; we have therefore
$$4K = K',$$
that which gives
$$K = \frac{i}{8q}.$$
The forty-five first values of u, given in the second Volume of the Base du Système
métrique (p. 771), and which are founded on some observations made in the month of
the year where we observe most often, give, for its mean value,

u = 0, 07818,

and the sum q of the squares of the differences of these values to the mean is 0, 04900629; [603]
i being here equal to 45, we have
$$K = \frac{45}{0,39205032} = 114,781.$$
If we suppose the number n of triangles equal to 25 and if we make all the sides
equal to 20000m , we will have 240000m for the distance from x(26) to x(0) ; this is
nearly the distance from Paris to Dunkirk. In this case, the quantity $\frac{f^2}{2R}$, taken for unit
of distance, is 31m , 416. Thence we conclude that the odds are one against one that the
error on the height x(26) is comprehended within the limits ±3m , 1839. There are odds
nine against one that it is comprehended within the limits ±7m , 761; we are not able
therefore then to respond with a sufficient probability that this error will not exceed
±8m .
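These limits can be recovered, approximately, from the exponential law just stated; the following sketch (illustrative, not part of the original) treats that law as a Gaussian density, uses the constants quoted in the text, and relies on scipy's inverse error function.

```python
import numpy as np
from scipy.special import erfinv

# Error law taken as proportional to exp(-5*K*s^2/(n+1)), with s measured in
# units of f^2/(2R) = 31.416 m; K and n as quoted in the text.
K, n, unit = 114.781, 25, 31.416
sigma = np.sqrt((n + 1) / (10 * K))        # standard deviation in those units

def limit(p):                              # half-width containing probability p
    return np.sqrt(2) * sigma * erfinv(p) * unit

print(limit(0.5))   # odds 1 to 1  -> about 3.19 m (text: 3.1839 m)
print(limit(0.9))   # odds 9 to 1  -> about 7.78 m (text: 7.761 m)
```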
The chain of triangles that we have just considered is much more favorable to the
determination of the height of its last point than that of which Delambre has made use,
in the Work cited, in order to determine the height of the Pantheon above the level of
the sea. By considering this last chain, we see that we are not able to respond, with a
sufficient probability, that the error respecting this height will not exceed ±16m .

§6. We see, by that which precedes, that the great triangles, which are very proper
to the measure of terrestrial degrees, are too small in order to determine the respective
heights of the diverse stations. Thus, in the case of a chain of equilateral triangles of
which f is the length of each side, the equally probable

errors of the difference of level
f 2 n+1
of two extreme stations being proportional to 2R , n being the number of triangles,
if we name a the distance of these two stations, we will have, by supposing n + 1 even,

a = 12 (n + 1)f ;

18

f2 n+1 1
2R will be therefore proportional to 3 ; the equally probable errors will be [604]
(n+1) 2
therefore proportional to this fraction. Thus, by quadrupling the number of triangles,
they will become eight times smaller; but then the errors due to the observations of
the angles become comparable to the errors due to the variability of the terrestrial
refractions. Let us examine how we are able to have regard at the same time to these
two kinds of errors.
Let us consider a sequence of points C, C (1) , C (2) , . . . Let h(0) be the distance
from C to C (1) ; h(1) the distance from C (1) to C (2) ; h(2) the distance from C (2) to
C (3) , and thus consecutively. Let us imagine that from the point C (i) we observe
C (i+1) , and reciprocally. The zenithal distance from C (i+1) , observed from C (i) will
be, by that which precedes,

$$\theta + \frac{h^{(i)}u}{R} + \frac{h^{(i)}\epsilon}{R} + \alpha,$$
$\epsilon$ being the error of $u$ and $\alpha$ being that of the observed angle $\theta$. The zenithal distance of $C^{(i)}$, observed from $C^{(i+1)}$, will be
$$\theta' + \frac{h^{(i)}u}{R} + \frac{h^{(i)}\epsilon'}{R} + \alpha',$$
$\epsilon'$ and $\alpha'$ being the errors of $u$ and of $\theta'$ in the observation made at the point $C^{(i+1)}$. We will have therefore the two equations
$$\theta + \theta' + \frac{2h^{(i)}u}{R} + \frac{h^{(i)}}{R}(\epsilon + \epsilon') + \alpha + \alpha' = \pi + \frac{h^{(i)}}{R},$$
$$x^{(i+1)} - x^{(i)} = \frac{\theta - \theta'}{2}\,h^{(i)} + \frac{h^{(i)2}}{2R}(\epsilon - \epsilon') + \frac{1}{2}h^{(i)}(\alpha - \alpha').$$
Let us designate as above $\epsilon - \epsilon'$ by $\gamma^{(i)}$, and let us make $\alpha - \alpha'$ equal to $\lambda^{(i)}$; we will have, for the elevation $x^{(n+1)} - x^{(0)}$ of the point $C^{(n+1)}$ above $C$, an expression of the form
$$x^{(n+1)} - x^{(0)} = M + S\,\frac{h^{(i)2}}{2R}\gamma^{(i)} + \frac{1}{2}S\,h^{(i)}\lambda^{(i)},$$
the integral sign $S$ corresponding to all the values of $i$, from $i = 0$ to $i = n$. The error [605] of this value of $x^{(n+1)}$ is
$$S\,\frac{h^{(i)2}\gamma^{(i)}}{2R} + S\,\frac{h^{(i)}}{2}\lambda^{(i)}.$$
It is necessary now to determine the probability of this error that we will designate by
s. Let there be generally
s = Sm(i) γ (i) + Sn(i) λ(i) ;
the probability of s will be, by the analysis of §20 of Book II of the Théorie analytique
des Probabilités, proportional to

$$\int d\varpi\,dx\,dy\,\phi(x)\,\psi(y)\,c^{-s\varpi\sqrt{-1}}\times\left[\cos(m^{(0)}x + n^{(0)}y)\varpi\,\cos(m^{(1)}x + n^{(1)}y)\varpi\,\cos(m^{(2)}x + n^{(2)}y)\varpi\cdots\right];$$

φ(x) is the law of probability of a value x of γ (0) ; ψ(y) is the law of probability of a
value y of λ(0) . The negative and positive errors are supposed equally probable: the
integrals relative to x and y are taken from negative infinity to positive infinity, and the
integral relative to $\varpi = -\pi$ to $\varpi = \pi$. By making
$$2\int dx\,\phi(x) = k, \qquad \int x^2\,dx\,\phi(x) = k'',$$
$$2\int dy\,\psi(y) = \bar{k}, \qquad \int y^2\,dy\,\psi(y) = \bar{k}'',$$

the integrals being taken from x and y null to x and y equal to infinity, the analysis of
the section cited will give the probability of s proportional to
$$c^{-\frac{s^2}{\frac{4k''}{k}Sm^{(i)2} + \frac{4\bar{k}''}{\bar{k}}Sn^{(i)2}}}.$$

It is easy to conclude generally from the same analysis that, if we make

s = Sm(i) γ (i) + Sn(i) λ(i) + Sr(i) δ (i) + · · · ,

γ (i) , λ(i) , δ (i) , . . . being of the errors deriving from different sources, the probability [606]
of s is proportional to the exponential
$$c^{-\frac{s^2}{\frac{4k''}{k}Sm^{(i)2} + \frac{4\bar{k}''}{\bar{k}}Sn^{(i)2} + \frac{4\bar{\bar{k}}''}{\bar{\bar{k}}}Sr^{(i)2} + \cdots}},$$
by designating by $\pi(x)$ the probability of an error $x$ due to the third source of error, and making
$$2\int dx\,\pi(x) = \bar{\bar{k}}, \qquad \int x^2\,dx\,\pi(x) = \bar{\bar{k}}'',$$

the integrals being taken from x null to x infinity; and thus of the other errors.
In order to determine, in the present question, the constants $\frac{4k''}{k}$ and $\frac{4\bar{k}''}{\bar{k}}$, I will suppose first the second null or very small, relatively to the first, as we are able to do in the great triangulations of the meridian. In this case, the probability of an error $s$ will be, by making $m^{(i)} = 1$, proportional to
$$c^{-\frac{s^2}{\frac{4k''}{k}n}},$$

$n$ being the number of intervals which separate the stations. The probability of a value $s'$ of $S\,\frac{\epsilon - \epsilon'}{2}$ or of $S\,\frac{\epsilon + \epsilon'}{2}$, that which corresponds to an error $2s'$ in the value of $S(\epsilon - \epsilon')$, will be proportional to
$$c^{-\frac{4s'^2}{\frac{4k''}{k}n}};$$
but, by that which precedes, this probability is proportional to
$$c^{-\frac{i\,s'^2}{2qn}};$$

we have therefore
$$\frac{2q}{i} = \frac{k''}{k} \quad\text{or}\quad \frac{4k''}{k} = \frac{8q}{i} = \frac{1}{114,781}.$$
If we suppose now $\frac{k''}{k}$ null and $n^{(i)} = 1$, the probability of a value $s'$ of the sum [607] $S\,\frac{\alpha' - \alpha}{2}$ will be proportional to
$$c^{-\frac{4s'^2}{\frac{4\bar{k}''}{\bar{k}}n}},$$
and the probability of a same value $s'$ of $S(\alpha + \alpha' + \alpha'')$ will be proportional to
$$c^{-\frac{2s'^2}{\frac{12\bar{k}''}{\bar{k}}n}}.$$

If we suppose this law of probability the same as for the errors of the sum of the three
angles of a spherical triangle, in the geodesic measures, and which, by §1 of the second
Supplement, is able to be supposed proportional to
$$c^{-\frac{(i+2)\,s'^2}{2\theta^2 n}},$$
$\theta^2$ being the sum of the squares of the excess observed in the sum of the errors of the three angles in $i$ triangles, we will have
$$\frac{4\bar{k}''}{\bar{k}} = \frac{4\theta^2}{3(i + 2)}.$$

We have, by that which we have seen,


$$\frac{i + 2}{\theta^2} = \frac{109}{445,217};$$
hence,
$$\frac{4\bar{k}''}{\bar{k}} = \frac{4}{3}\cdot\frac{445,217}{109},$$
a quantity that it is necessary to divide by the square of the number of sexagesimal seconds that this radius contains, and then we have
$$\frac{4\bar{k}''}{\bar{k}} = \frac{1,2801}{10^{10}}.$$
Let us suppose the distances of the consecutive stations equal to 1200 m; we will find [608]
that there are odds one against one that the error respecting the value of x(n+1) is not
above ±0m , 08555 when n = 200. There are odds one thousand against one that the
error is not above ±0m , 413.
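These two limits can be checked with the same kind of sketch as before (illustrative only; the value of $R$ is assumed here, so small differences from the text's figures are to be expected).

```python
import numpy as np
from scipy.special import erfinv

# Two sources per interval: refraction (gamma) with factor h^2/(2R), and the
# observed angles (lambda) with factor h/2.  The constants 1/114.781 and
# 1.2801e-10 are those obtained above; R is an assumed value.
h, R, n = 1200.0, 6366198.0, 200
D = n * ((1 / 114.781) * (h**2 / (2 * R))**2 + 1.2801e-10 * (h / 2)**2)
sigma = np.sqrt(D / 2)            # error law taken as exp(-s^2 / D)

def limit(p):
    return np.sqrt(2) * sigma * erfinv(p)

print(limit(0.5))     # about 0.085 m (text: 0.08555 m, odds 1 to 1)
print(limit(0.999))   # about 0.41 m  (text: 0.413 m, odds 1000 to 1)
```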

General method of the Calculation of probabilities, when there are many sources of
errors.

The consideration of the two independent sources of error which exist in the opera-
tions of the leveling has led me to examine the general case of the observations subject
to many sources of errors. Such are astronomical observations. The greater part are
made by means of two instruments, the meridian lunette and the circle, of which the
errors must not be supposed to have the same law of probability. In the equations of
condition that we deduce from these observations, in order to obtain the elements of
the celestial movements, these errors are multiplied by some different coefficients for
each source of error and for each equation. The most advantageous systems of factors
by which it is necessary to multiply these equations, in order to have the final equations
which determine the elements, are no longer, as in the case of a unique source of errors,
the coefficients of each element in the equations of condition. The facility with which
the analysis that I have given in Book II of my Théorie des Probabilités is applied to
this general case will show the advantages of this analysis.
Let us suppose first that we have a system of equations of condition represented by
this here
p(i) y = a(i) + m(i) γ (i) + n(i) λ(i) + · · · ,
y being an element of which we seek the most advantageous value. If we multiply the
preceding equation by a factor f (i) , the reunion of all these products will give for y the
expression
$$y = \frac{Sa^{(i)}f^{(i)}}{Sp^{(i)}f^{(i)}} + \frac{Sm^{(i)}f^{(i)}\gamma^{(i)} + Sn^{(i)}f^{(i)}\lambda^{(i)} + \cdots}{Sp^{(i)}f^{(i)}}.$$
The error of $y$ will be [609]
$$\frac{Sm^{(i)}f^{(i)}\gamma^{(i)} + Sn^{(i)}f^{(i)}\lambda^{(i)} + \cdots}{Sp^{(i)}f^{(i)}}.$$
By designating by $s$ this error, its probability will be proportional, by the preceding section, to the exponential
$$c^{-\frac{s^2(Sp^{(i)}f^{(i)})^2}{\frac{4k''}{k}Sm^{(i)2}f^{(i)2} + \frac{4\bar{k}''}{\bar{k}}Sn^{(i)2}f^{(i)2} + \cdots}}.$$

It is necessary to determine $f^{(i)}$ in a manner that
$$\frac{\frac{4k''}{k}Sm^{(i)2}f^{(i)2} + \frac{4\bar{k}''}{\bar{k}}Sn^{(i)2}f^{(i)2} + \cdots}{(Sp^{(i)}f^{(i)})^2}$$
is a minimum, because it is clear that then the same error $s$ becomes less probable than in each other system of factors. If we name $A$ the numerator of this fraction, and if we make $f^{(i)}$ vary by a quantity $dq$, we will have, through the condition of the minimum, by equating to zero the differential of this fraction,
$$0 = \frac{\frac{k''}{k}m^{(i)2}f^{(i)} + \frac{\bar{k}''}{\bar{k}}n^{(i)2}f^{(i)} + \cdots}{A} - \frac{p^{(i)}}{Sp^{(i)}f^{(i)}},$$

that which gives for $f^{(i)}$ an expression of this form
$$f^{(i)} = \frac{\mu p^{(i)}}{\frac{k''}{k}m^{(i)2} + \frac{\bar{k}''}{\bar{k}}n^{(i)2} + \cdots}.$$

We are able to make here µ = 1, because, this quantity being independent of i, it affects
equally all the multipliers f (i) ; thus the quantity f (i) , by which we must multiply each
equation of condition in order to have the most advantageous result, is [610]

$$\frac{p^{(i)}}{\frac{k''}{k}m^{(i)2} + \frac{\bar{k}''}{\bar{k}}n^{(i)2} + \cdots},$$
and the probability of an error $s$ of this result is proportional to the exponential
$$c^{-\frac{s^2}{4}\,S\,\frac{p^{(i)2}}{\frac{k''}{k}m^{(i)2} + \frac{\bar{k}''}{\bar{k}}n^{(i)2} + \cdots}}.$$

We will have, by the same analysis and by §22 of Book II, the factors by which we must
multiply the equations of condition in order to have the most advantageous results,
whatever be the number of elements to determine and the number of kinds of errors;
we will have similarly the laws of probability of the errors of these results.
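The rule just obtained amounts to weighting each equation of condition by the reciprocal of the variance of its combined error. A small simulated illustration (all names and numbers below are invented for the example, not taken from the text):

```python
import numpy as np

rng = np.random.default_rng(0)

# Equations of condition: p_i * y = a_i + m_i*gamma_i + n_i*lambda_i,
# gamma and lambda coming from two sources with variances v1 and v2.
# The most advantageous factor is f_i = p_i / (v1*m_i**2 + v2*n_i**2),
# i.e. each equation is weighted inversely to the variance of its error.
N, y_true, v1, v2 = 500, 2.5, 0.04, 0.25
p = rng.uniform(0.5, 2.0, N)
m = rng.uniform(0.5, 2.0, N)
n = rng.uniform(0.5, 2.0, N)
a = p * y_true - m * rng.normal(0, np.sqrt(v1), N) - n * rng.normal(0, np.sqrt(v2), N)

f = p / (v1 * m**2 + v2 * n**2)          # most advantageous multipliers
y_best = np.sum(a * f) / np.sum(p * f)
y_plain = np.sum(a * p) / np.sum(p * p)  # factors p_i, as for a single source
print(y_best, y_plain)                   # both near 2.5; y_best varies less
```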
Let us suppose that we have, between two elements x and y, the equation of condi-
tion
l(i) x + p(i) y = a(i) + m(i) γ (i) + n(i) λ(i) + r(i) δ (i) + · · · ,
γ (i) , λ(i) , δ (i) , . . . being some errors of which the sources are different. By multiplying
first this equation by a system f (i) of factors, the reunion of these products will give
the final equation

xSl(i) f (i) + ySp(i) f (i) = Sa(i) f (i) + Sm(i) f (i) γ (i) + Sn(i) f (i) λ(i) + · · · ,

By multiplying next the equation of condition by another system g (i) of factors, the
reunion of the products will give a second final equation

xSl(i) g (i) + ySp(i) g (i) = Sa(i) g (i) + Sm(i) g (i) γ (i) + · · · .

We deduce from these two final equations

$$x = \frac{Sa^{(i)}f^{(i)}\,Sp^{(i)}g^{(i)} - Sa^{(i)}g^{(i)}\,Sp^{(i)}f^{(i)}}{L} + \frac{Sm^{(i)}f^{(i)}\gamma^{(i)}\,Sp^{(i)}g^{(i)} - Sm^{(i)}g^{(i)}\gamma^{(i)}\,Sp^{(i)}f^{(i)} + \cdots}{L},$$
$L$ being equal to [611]
$$Sl^{(i)}f^{(i)}\,Sp^{(i)}g^{(i)} - Sl^{(i)}g^{(i)}\,Sp^{(i)}f^{(i)}.$$
The coefficient of γ (i) in this value is

$$\frac{m^{(i)}f^{(i)}\,Sp^{(i)}g^{(i)} - m^{(i)}g^{(i)}\,Sp^{(i)}f^{(i)}}{L}.$$

By changing m(i) into n(i) , r(i) , . . ., we will have the coefficients corresponding to
λ(i) , δ (i) , . . . By naming therefore s the value of the part of x dependent on the er-
rors γ (i) , λ(i) , δ (i) , . . ., the probability of this value will be, by that which precedes,
proportional to the exponential
$$c^{-\frac{s^2}{H}},$$
by making
$$H = \frac{SM^{(i)}f^{(i)2}(Sp^{(i)}g^{(i)})^2 - 2SM^{(i)}f^{(i)}g^{(i)}\,Sp^{(i)}f^{(i)}\,Sp^{(i)}g^{(i)} + SM^{(i)}g^{(i)2}(Sp^{(i)}f^{(i)})^2}{L^2},$$
$M^{(i)}$ being equal to
$$\frac{4k''}{k}m^{(i)2} + \frac{4\bar{k}''}{\bar{k}}n^{(i)2} + \frac{4\bar{\bar{k}}''}{\bar{\bar{k}}}r^{(i)2} + \cdots.$$
It is necessary now to determine f (i) and g (i) in a manner that H is a minimum. For
this, we will make f (i) vary, and we will equate to zero the coefficient of its differential;
that which will give, by naming P the numerator of the expression of H,

$$0 = M^{(i)}f^{(i)}(Sp^{(i)}g^{(i)})^2 - M^{(i)}g^{(i)}\,Sp^{(i)}f^{(i)}\,Sp^{(i)}g^{(i)} - p^{(i)}SM^{(i)}f^{(i)}g^{(i)}\,Sp^{(i)}g^{(i)} + p^{(i)}SM^{(i)}g^{(i)2}\,Sp^{(i)}f^{(i)} - \frac{P}{L}\left(l^{(i)}Sp^{(i)}g^{(i)} - p^{(i)}Sl^{(i)}g^{(i)}\right).$$
It is easy to see that we satisfy this equation by supposing
$$f^{(i)} = \frac{l^{(i)}}{M^{(i)}}, \qquad g^{(i)} = \frac{p^{(i)}}{M^{(i)}};$$
and we must conclude from it that we would satisfy, by the same supposition, the
corresponding equation that would give dH = 0, by making g (i) vary. We see that [612]
the same values of f (i) and g (i) satisfy the similar equations which result from the
consideration of the element y.
If we have, among the elements x, y, z, . . ., some equations of condition repre-
sented by the general equation

l(i) x + p(i) y + q (i) z + · · · = a(i) + m(i) γ (i) + n(i) λ(i) + r(i) δ (i) + · · · ,

γ (i) , λ(i) , δ (i) , . . . being the errors of diverse kinds, we will find by the preceding
analysis that the factors by which we must multiply respectively this equation, in order
to form the final equations which give the values of the most advantageous elements,
are, for the first final equation, represented by

$$\frac{l^{(i)}}{\frac{k''}{k}m^{(i)2} + \frac{\bar{k}''}{\bar{k}}n^{(i)2} + \frac{\bar{\bar{k}}''}{\bar{\bar{k}}}r^{(i)2} + \cdots}.$$
They are represented, for the second final equation, by
$$\frac{p^{(i)}}{\frac{k''}{k}m^{(i)2} + \frac{\bar{k}''}{\bar{k}}n^{(i)2} + \frac{\bar{\bar{k}}''}{\bar{\bar{k}}}r^{(i)2} + \cdots},$$

and thus consecutively. By applying therefore to the equations thus multiplied the
analysis of §2 of the first Supplement, we will have the values of the most advantageous
elements and the laws of probabilities of their errors.
In order to give an example of this application, let us consider only two elements x
and y. If we make

$$M^{(i)} = \frac{k''}{k}m^{(i)2} + \frac{\bar{k}''}{\bar{k}}n^{(i)2} + \frac{\bar{\bar{k}}''}{\bar{\bar{k}}}r^{(i)2} + \cdots.$$
We will multiply the previous equation of condition by $\frac{p^{(i)}}{M^{(i)}}$, and we will deduce from [613] it
$$xS\frac{l^{(i)}p^{(i)}}{M^{(i)}} + yS\frac{p^{(i)2}}{M^{(i)}} = S\frac{a^{(i)}p^{(i)}}{M^{(i)}} + S\frac{p^{(i)}}{M^{(i)}}\left(m^{(i)}\gamma^{(i)} + n^{(i)}\lambda^{(i)} + \cdots\right);$$
but the condition of the most advantageous method gives

$$0 = S\frac{l^{(i)}}{M^{(i)}}\left(m^{(i)}\gamma^{(i)} + n^{(i)}\lambda^{(i)} + \cdots\right),$$
$$0 = S\frac{p^{(i)}}{M^{(i)}}\left(m^{(i)}\gamma^{(i)} + n^{(i)}\lambda^{(i)} + \cdots\right);$$
we will have therefore
$$y = \frac{S\frac{a^{(i)}p^{(i)}}{M^{(i)}} - xS\frac{p^{(i)}l^{(i)}}{M^{(i)}}}{S\frac{p^{(i)2}}{M^{(i)}}}.$$

Substituting this value of y into the general equation of condition, and making
$$l_1^{(i)} = l^{(i)} - p^{(i)}\frac{S\frac{l^{(i)}p^{(i)}}{M^{(i)}}}{S\frac{p^{(i)2}}{M^{(i)}}},$$
$$a_1^{(i)} = a^{(i)} - p^{(i)}\frac{S\frac{a^{(i)}p^{(i)}}{M^{(i)}}}{S\frac{p^{(i)2}}{M^{(i)}}},$$
we will have
$$x = \frac{S\frac{a_1^{(i)}l_1^{(i)}}{M^{(i)}}}{S\frac{l_1^{(i)2}}{M^{(i)}}};$$
and the probability of an error $s$ of this value will be proportional to
$$c^{-\frac{s^2}{4}S\frac{l_1^{(i)2}}{M^{(i)}}}.$$
This analysis supposes knowledge of the constants $\frac{k''}{k}$ and $\frac{\bar{k}''}{\bar{k}}$. But we are able to
obtain from it, by the same observations, some very close values, in the following [614]

manner.
Let us imagine that we have determined the elements x, y, z, . . . by the method
according to which we form the final equations, by multiplying each equation of con-
dition successively by the corresponding coefficient of each element. If we substitute
the values of the elements thus determined into the equation of condition
l(i) x + p(i) y + · · · − a(i) = m(i) γ (i) + n(i) λ(i) + · · · ,
we will have an equation of this form
R(i) = m(i) γ (i) + n(i) λ(i) + · · ·
Let us suppose, for greater simplicity, that we have only two kinds of errors γ (i)
and λ(i) : we will multiply first the preceding equation by m(i) . By raising next each
member to the square and taking the sum of all the equations thus formed, we will have
Sm(i)2 R(i)2 = S(m(i)4 γ (i)2 + 2m(i)3 n(i) γ (i) λ(i) + n(i)2 m(i)2 λ(i)2 ).
The mean value of $m^{(i)4}\gamma^{(i)2}$ is evidently
$$\frac{m^{(i)4}\int\gamma^2\,d\gamma\,\phi(\gamma)}{\int d\gamma\,\phi(\gamma)},$$
the integrals being taken from $\gamma = -\infty$ to $\gamma$ infinity, that which gives $\frac{2k''}{k}m^{(i)4}$. We have similarly $\frac{2\bar{k}''}{\bar{k}}m^{(i)2}n^{(i)2}$ for the mean value of $m^{(i)2}n^{(i)2}\lambda^{(i)2}$. We find in the same manner that the mean value of $2m^{(i)3}n^{(i)}\gamma^{(i)}\lambda^{(i)}$ is null; we have therefore, by substituting instead of the quantities their mean values, that which we are able to make with so much more precision as the number of observations is greater,
$$Sm^{(i)2}R^{(i)2} = \frac{2k''}{k}Sm^{(i)4} + \frac{2\bar{k}''}{\bar{k}}Sm^{(i)2}n^{(i)2}.$$
We will have similarly [615]
$$Sn^{(i)2}R^{(i)2} = \frac{2k''}{k}Sm^{(i)2}n^{(i)2} + \frac{2\bar{k}''}{\bar{k}}Sn^{(i)4};$$
whence we deduce
$$\frac{4k''}{k} = \frac{2Sn^{(i)4}\,Sm^{(i)2}R^{(i)2} - 2Sm^{(i)2}n^{(i)2}\,Sn^{(i)2}R^{(i)2}}{Sm^{(i)4}\,Sn^{(i)4} - (Sm^{(i)2}n^{(i)2})^2},$$
$$\frac{4\bar{k}''}{\bar{k}} = \frac{2Sm^{(i)4}\,Sn^{(i)2}R^{(i)2} - 2Sm^{(i)2}n^{(i)2}\,Sm^{(i)2}R^{(i)2}}{Sm^{(i)4}\,Sn^{(i)4} - (Sm^{(i)2}n^{(i)2})^2};$$
by designating therefore by $2P$ and $2Q$ the numerators of these two expressions, the factors by which we must multiply the equation of condition will be
$$\frac{l^{(i)}}{m^{(i)2}P + n^{(i)2}Q}, \qquad \frac{p^{(i)}}{m^{(i)2}P + n^{(i)2}Q}, \qquad \ldots
$$

The concern now is to show that these values of $\frac{4k''}{k}$, $\frac{4\bar{k}''}{\bar{k}}$ are quite close. For this, let
us consider only one element x: the equation of condition

l(i) x = a(i) + m(i) γ (i) + n(i) λ(i)

will give
$$x = \frac{Sa^{(i)}l^{(i)}}{Sl^{(i)2}} + \frac{Sl^{(i)}m^{(i)}\gamma^{(i)} + Sl^{(i)}n^{(i)}\lambda^{(i)}}{Sl^{(i)2}}.$$
Substituting this value in the equation of condition, we will have
$$R^{(i)} = \frac{l^{(i)}\,Sa^{(i)}l^{(i)} - a^{(i)}\,Sl^{(i)2}}{Sl^{(i)2}},$$
$$R^{(i)} + l^{(i)}\,\frac{S(l^{(i)}m^{(i)}\gamma^{(i)} + l^{(i)}n^{(i)}\lambda^{(i)})}{Sl^{(i)2}} = m^{(i)}\gamma^{(i)} + n^{(i)}\lambda^{(i)};$$

but it is easy to see that the values of Sl(i) m(i) γ (i) and of Sl(i) n(i) λ(i) are nulls by [616]
the supposition of the negative errors as probable as the positive errors: we are able
therefore to make, as above,

R(i) = m(i) γ (i) + n(i) λ(i) ,

that which it was necessary to establish.
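The estimate of the two constants described above can be sketched numerically as follows (a simulation with invented values, not data from the text): the two mean squares are recovered from the residuals by solving the same pair of linear equations.

```python
import numpy as np

rng = np.random.default_rng(1)

# Residuals R_i = m_i*gamma_i + n_i*lambda_i with unknown mean squares
# vg = E[gamma^2] and vl = E[lambda^2] (these play the roles of 2k''/k and
# 2kbar''/kbar).  They are recovered from
#   S m^2 R^2 = vg * S m^4     + vl * S m^2 n^2,
#   S n^2 R^2 = vg * S m^2 n^2 + vl * S n^4.
N, vg, vl = 20000, 0.09, 0.25
m = rng.uniform(0.5, 2.0, N)
n = rng.uniform(0.5, 2.0, N)
R = m * rng.normal(0, np.sqrt(vg), N) + n * rng.normal(0, np.sqrt(vl), N)

A = np.array([[np.sum(m**4),        np.sum(m**2 * n**2)],
              [np.sum(m**2 * n**2), np.sum(n**4)]])
b = np.array([np.sum(m**2 * R**2), np.sum(n**2 * R**2)])
print(np.linalg.solve(A, b), (vg, vl))   # estimated vs true mean squares
```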

QUATRIÈME SUPPLÉMENT.
Pierre Simon Laplace∗

1825
OC 7 617–645

§1. U being any function whatever of a variable t, if we develop it according to [617]


the powers of t, the coefficient of tx , in this development, will be a function of x that
I will designate by yx ; U is that which I have named generating function of yx . If we
multiply U by a function T of t, similarly developed according to the ascending powers
of t, the product U T will be a new generating function of a function of x, derived from
the function yx according to a law which will depend on the function T . If T is equal
to $\frac{1}{t} - 1$, it is easy to see that the derived will be $y_{x+1} - y_x$, or the finite difference
of yx . Let us designate generally, whatever be T , this derived by δyx . If we multiply
the product U T by T , the derived of the product U T 2 will be a derived of δyx similar
to the derived of δyx in yx ; we will be able therefore to designate by δ 2 yx this second
derived; whence it is clear generally that U T n will be the generating function of δ n yx .
If we multiply U by another function Z of t, similarly developed according to the
ascending powers of t, and if we designate by the characteristic 4 that which we have
named δ relative to the function T , U Z n will be the generating function of 4n yx .
We are able to imagine T as a function of Z. By developing this function into series
with respect to the ascending powers of Z, we will have an expression of T of this form

T = A(0) + A(1) Z + A(2) Z 2 + · · ·

By multiplying this equation by U and passing again from the generating functions
to the coefficients, we will have [618]

δyx = A(0) yx + A(1) 4yx + A(2) 42 yx + · · ·

We see thus that the same equation, which holds between T and Z, holds between their
characteristics δ and 4, provided that, in the development of this equation according
to the powers of δ and of 4, we substitute, instead of any power δ r , δ r yx ; instead
0 0 0 0
of a power 4r , 4r yx ; instead of a product such as δ r 4r , δ r 4r yx ; and that we
multiply by yx the terms independent of δ and 4. Thus, by supposing T equal to
$\frac{1}{t} - 1$, $Z = \frac{1}{t^i} - 1$, δyx will be the finite difference of yx , x varying by unity; 4yx will
be the finite difference of yx , x varying with i; we have next

Z = (1 + T )i − 1,
∗ Translated by Richard J. Pulskamp, Department of Mathematics & Computer Science, Xavier Univer-

sity, Cincinnati, OH. August 2, 2013

and, consequently,
Z n = [(1 + T )i − 1]n ;
that which gives
4n = [(1 + δ)i − 1]n ,
provided that after the development we place yx after the powers of the characteristics.
This equation will hold furthermore by making n negative, but then the differences are
changed into integrals. The consideration of the generating functions show thus, in the
most natural and most simple manner, the analogy of the powers and of the differences.
We are able to consider this theory as the calculus of characteristics.
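As a small illustration of this calculus (not in the original), one may verify with sympy that multiplying a generating function $U$ by $T = \frac{1}{t} - 1$ produces the generating function of the finite differences of its coefficients, and multiplying by $T^2$ that of the second differences.

```python
import sympy as sp

t = sp.symbols('t')

# Take y_x = x^2 and let U = sum y_x t^x be its (truncated) generating function.
# Multiplying U by T = 1/t - 1 gives the generating function of y_{x+1} - y_x,
# and by T^2 that of the second finite difference, as stated above.
N = 8
y = [k**2 for k in range(N + 3)]
U = sum(y[k] * t**k for k in range(N + 3))
T = 1 / t - 1

def coeffs(expr, upto):
    s = sp.expand(expr)
    return [s.coeff(t, k) for k in range(upto)]

print(coeffs(U * T, N))      # 1, 3, 5, 7, ...  (first differences of x^2)
print(coeffs(U * T**2, N))   # 2, 2, 2, ...     (second differences)
```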
If we have 0 = δyx , we will have an equation in the finite differences: U T becomes
then a polynomial which contains only powers of t smaller than the highest of t in T .
Let us designate by Q the polynomial in t the most general of this nature; we will have
$$U = \frac{Q}{T}.$$
The coefficient of tx in the development of U will be the integral yx of the equation
0 = δyx ; by this reason, I name U generating function of this equation.
If we imagine U a function of two variables t and t0 , the coefficient of the product [619]
x x0
t t , in the development of U , will be a function of x and of x0 that I designate by
yx,x0 ; T being a function developed in the same variables t and t0 , the product U T will
be the generating function of a derived of yx,x0 , that I will designate by δyx,x0 ; and it
is easy to conclude from it that U T n will be the generating function of δ n yx,x0 .
If we have 0 = δyx,x0 , we will have an equation in the partial finite differences. Let
us represent this equation by the following

$$0 = a\,y_{x,x'} + b\,y_{x,x'+1} + c\,y_{x,x'+2} + \cdots + a'\,y_{x+1,x'} + b'\,y_{x+1,x'+1} + \cdots + a''\,y_{x+2,x'} + \cdots + \cdots;$$
it is easy to see that the generating function of the proposed equation will be
$$\frac{A + Bt' + Ct'^2 + \cdots + Ht'^{n'-1} + A' + B't + C't^2 + \cdots + H't^{n-1}}{a\,t^n t'^{n'} + b\,t^n t'^{n'-1} + c\,t^n t'^{n'-2} + \cdots + a'\,t^{n-1}t'^{n'} + b'\,t^{n-1}t'^{n'-1} + \cdots + a''\,t^{n-2}t'^{n'} + \cdots + \cdots},$$
n and n0 being the greatest increases of x and of x0 , in the proposed equation in partial
differences; A, B, C, . . . , H are arbitrary functions of t; A0 , B 0 , C 0 , . . . , H 0 are ar-
bitrary functions of t0 . We will determine all these functions by means of the generating
functions of
$$y_{0,x'},\ y_{1,x'},\ y_{2,x'},\ \ldots,\ y_{n-1,x'},$$
$$y_{x,0},\ y_{x,1},\ y_{x,2},\ \ldots,\ y_{x,n'-1}.$$

One of the principal advantages of this manner to integrate the equations in partial
differences consists in this that, the algebraic analysis furnishing diverse ways to de-
velop the functions, we are able to choose the one which agrees best to the proposed
question. The solution of the following problems, by the count de Laplace, my son, [620]
and the considerations that he has joined will spread a new day on the calculus of
generating functions.

§2. A player A draws from an urn, containing some white and black balls, one
ball which he returns after the trial, with the probability p to bring forth a white ball
and the probability q to extract from it a black; a second player B draws next, from
another urn, a ball which he returns equally after the drawing, with the probabilities p0
of a white ball and q 0 of a black. These two players continue thus to extract alternately,
each from their respective urn, a ball which they always take care to return. If one of
the players brings forth a white ball, he counts a point; if, on the contrary, he makes a
black ball exit, he counts nothing, and the turn of the player passes simply to the other.
The players having settled, by the conditions of their game, the number of points that
each must attain first in order to win the game, and having commenced to play, there is
lacking yet to player A the number x points in order to win, and x0 to player B; and the
turn to play belongs to player A. We demand, in this position, what is the probability
of each player to win the game.
Let zx,x0 be the probability of second player B, and let us represent by Yx,x0 his
probability, if he were the first to play. Player A, by beginning, is able to bring forth
a white ball, and the probability of B becomes Yx−1,x0 ; or the first player makes a
black ball exit, and then counts nothing, and the probability of the second is changed
into Yx,x0 ; but the probability of the first case is p, that of the second q; we will have
therefore the equation
zx,x0 = pYx−1,x0 + qYx,x0 ;
by a similar reasoning, we will have further this one
Yx,x0 = p0 zx,x0 −1 + q 0 zx,x0 ;
whence we deduce
Yx−1,x0 = p0 zx−1,x0 −1 + q 0 zx−1,x0 ,
and consequently(1 ) [621]
0 0 0 0
zx,x0 = p(p zx−1,x0 −1 + q zx−1,x0 ) + q(p zx,x0 −1 + q zx,x0 )
1 We arrive again to this equation in partial differences by considering together the two successive draw-

ings of A and B as one trial, and by examining the different cases which are able to be presented after this
trial played; now they are in number of four: 1 ˚ either the two players bring forth each one white ball, an
event of which the probability is pp0 ; then the probability zx,x0 will be changed into this one zx−1,x0 −1 ;
2 ˚ or the first player extracts a white ball and the second a black; under this hypothesis, which has for prob-
ability pq 0 , zx,x0 will become zx−1,x0 ; 3 ˚ or on the contrary the first player makes a black ball exit and
the second a white; under this hypothesis, which has for probability p0 q, zx,x0 will become zx,x0 −1 ; 4 ˚ or
finally each player draws a black ball, an event of which the probability is qq 0 , and then the probability zx,x0
remains the same. We will have therefore, by the known principles of probabilities, the equation
zx,x0 = pp0 zx−1,x0 −1 + pq 0 zx−1,x0 + p0 qzx,x0 −1 + qq 0 zx,x0 .
One obtains the generating function of zx,x0 , in this equation in partial differences, by applying to this case
the general rule which has just been exposed.

or
$$z_{x,x'} = \frac{pq'}{1 - qq'}\,z_{x-1,x'} + \frac{p'q}{1 - qq'}\,z_{x,x'-1} + \frac{pp'}{1 - qq'}\,z_{x-1,x'-1},$$
and by making
$$\frac{pq'}{1 - qq'} = m, \qquad \frac{p'q}{1 - qq'} = m', \qquad \frac{pp'}{1 - qq'} = n,$$
it will become
$$z_{x,x'} = m\,z_{x-1,x'} + m'\,z_{x,x'-1} + n\,z_{x-1,x'-1}.$$
The generating function of zx,x0 , in this equation in partial differences, is
A + A0
,
1 − mt − m0 t0 − ntt0
A being an arbitrary function of t, and A0 another arbitrary function of t0 ; I observe
first that by attributing to the function A0 the term independent of t in the function A,
the generating function above is able to be put under this form
A1 t + A01
,
1 − mt − m0 t0 − ntt0
A1 and A01 being new arbitrary functions of t and of t0 that it is the question to de-
termine. Now, if we pay attention that z0,x0 is null, whatever be x0 , the probability of [622]
player A is changed then to certitude, we see that the coefficient of t0 in the develop-
ment of the generating function with respect to the powers of t must be null, and we
will have
A01
=0 or A01 = 0.
1 − m0 t0
Moreover, zx,0 is null when x is zero, and equal to unity when x is either 1 or 2, or
3, . . ., since then the probability of player B is changed into certitude; the generating
function of $z_{x,0}$ is therefore $\frac{t}{1-t}$; it is the coefficient of $t'^0$ in the development of the generating function according to the powers of $t'$; we will have therefore
$$\frac{A_1 t}{1 - mt} = \frac{t}{1 - t};$$
that which gives
$$A_1 t = \frac{t(1 - mt)}{1 - t};$$
consequently the generating function of zx,x0 is
t(1 − mt)
(a) ;
(1 − t)(1 − mt − m0 t0 − ntt0 )
by putting it under this form
$$\frac{t}{1-t}\cdot\frac{1}{1 - \dfrac{m' + nt}{1 - mt}\,t'}$$
and the development with respect to the powers of $t'$, we have
$$\frac{t}{1-t}\left[1 + \frac{m' + nt}{1 - mt}\,t' + \left(\frac{m' + nt}{1 - mt}\right)^2 t'^2 + \left(\frac{m' + nt}{1 - mt}\right)^3 t'^3 + \cdots\right].$$
The coefficient of $t'^{x'}$ in this series is
$$\frac{t}{1-t}\left(\frac{m' + nt}{1 - mt}\right)^{x'},$$
and the one of $t^x$ in the development of this last function will be the expression of $z_{x,x'}$. [623]
Now, if we reduce first the expression $t\left(\frac{m' + nt}{1 - mt}\right)^{x'}$ into a series ordered according to the powers of $t$, and if we multiply it next by the development of $\frac{1}{1-t}$, it is easy to see that the coefficient of $t^x$ in this product is that which the series becomes by making $t = 1$ in it and stopping ourselves at the power $x$ of $t$; and we will find, for the value of
this coefficient or of zx,x0 ,

$$z_{x,x'} = m'^{x'}\left\{
\begin{aligned}
&1 + \frac{x'}{1}\frac{n}{m'} + \frac{x'(x'-1)}{1.2}\frac{n^2}{m'^2} + \frac{x'(x'-1)(x'-2)}{1.2.3}\frac{n^3}{m'^3} + \cdots + \frac{x'(x'-1)\cdots(x'-x+2)}{1.2\ldots(x-1)}\frac{n^{x-1}}{m'^{x-1}}\\
&+ \frac{x'}{1}\,m\left[1 + \frac{x'}{1}\frac{n}{m'} + \frac{x'(x'-1)}{1.2}\frac{n^2}{m'^2} + \cdots + \frac{x'(x'-1)\cdots(x'-x+3)}{1.2\ldots(x-2)}\frac{n^{x-2}}{m'^{x-2}}\right]\\
&+ \frac{x'(x'+1)}{1.2}\,m^2\left[1 + \frac{x'}{1}\frac{n}{m'} + \cdots + \frac{x'(x'-1)\cdots(x'-x+4)}{1.2\ldots(x-3)}\frac{n^{x-3}}{m'^{x-3}}\right]\\
&+ \cdots\cdots\cdots\cdots\cdots\cdots\cdots\cdots\cdots\cdots\cdots\\
&+ \frac{x'(x'+1)\cdots(x'+x-2)}{1.2\ldots(x-1)}\,m^{x-1}
\end{aligned}\right\}$$
By designating by yx,x0 the probability of player A, we will be led, by the same
reasonings, to a similar equation in the partial differences,

yx,x0 = myx−1,x0 + m0 yx,x0 −1 + nyx−1,x0 −1 ,

which gives similarly for the variable yx,x0 a generating function of the form

A1 t + A01
,
1 − mt − m0 t0 − ntt0
$A_1$ and $A'_1$ being, as above, arbitrary functions of $t$ and of $t'$ that we will determine by the same considerations. In fact the generating function of $y_{0,x'}$ is $\frac{1}{1-t'}$, that of $y_{x,0}$ is unity: we will form therefore the equations
$$\frac{A'_1}{1 - m't'} = \frac{1}{1 - t'};$$
whence we deduce
$$A'_1 = \frac{1 - m't'}{1 - t'}$$

5
and [624]
A1 t + 1
= 1;
1 − mt
whence we conclude
A1 t = −mt.
The generating function of yx,x0 will be therefore
1−m0 t0
1−t0 − mt
(b) ,
1 − mt − m0 t0 − ntt0
which, developed according to the powers of t and of t0 , will give, by the coefficient of
0
tx t0x , the expression of yx,x0 which will be of a form similar to that of zx,x0 , although
a little more complicated.
By adding the two generating functions (a) and (b), their sum is reduced to that
here
$$\frac{1}{(1 - t)(1 - t')},$$
in which the coefficient of $t^x t'^{x'}$ is unity; thus we have

yx,x0 + zx,x0 = 1;

and effectively, the game must be necessarily won by one of the players, because both
are certain to be able to extract each from their urn the determined numbers of white
balls.
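The probabilities $y_{x,x'}$ and $z_{x,x'}$ can also be computed directly from their equations in partial differences, which gives a quick check of the relation $y + z = 1$; the sketch below (not in the original) uses arbitrary illustrative values of $p$ and $p'$.

```python
from functools import lru_cache

# Direct recursion for z (probability of B) and y (probability of A), using
# u_{x,x'} = m u_{x-1,x'} + m' u_{x,x'-1} + n u_{x-1,x'-1} with the boundary
# values stated in the text; p, q, p', q' below are illustrative only.
p, q, pp, qp = 0.6, 0.4, 0.5, 0.5
m  = p * qp / (1 - q * qp)
mp = pp * q / (1 - q * qp)
n  = p * pp / (1 - q * qp)

@lru_cache(maxsize=None)
def z(x, xp):                 # B's probability of winning, A having the turn
    if x == 0:
        return 0.0            # A has already won
    if xp == 0:
        return 1.0            # B has already won
    return m * z(x - 1, xp) + mp * z(x, xp - 1) + n * z(x - 1, xp - 1)

@lru_cache(maxsize=None)
def y(x, xp):                 # A's probability of winning
    if x == 0:
        return 1.0
    if xp == 0:
        return 0.0
    return m * y(x - 1, xp) + mp * y(x, xp - 1) + n * y(x - 1, xp - 1)

print(z(4, 3), y(4, 3), z(4, 3) + y(4, 3))   # the sum should be 1
```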
Now, let us suppose p = 0 and consequently q = 1, we have

m = 0, m0 = 1 and n = 0;

then the expression of zx,x0 becomes unity; that which is evident, since the player B,
having no more chances to lose, must always end by winning.
If, to the contrary, we suppose p = 1 and q = 0, that is if the first player A counts
a point before each drawing of player B, then

m = q0 , m0 = 0 and n = p0 ;

x0 being greater than x or equal, the expression zx,x0 is reduced to zero; and, in fact, it [625]
is evidently impossible that, in this case, player B is able to win the game; but, when x
is greater than x0 , the value of zx,x0 takes this form

$$z_{x,x'} = p'^{x'}\left[1 + \frac{x'}{1}q' + \frac{x'(x'+1)}{1.2}q'^2 + \cdots + \frac{x'(x'+1)\cdots(x-2)}{1.2\ldots(x-x'-1)}q'^{x-x'-1}\right].$$

Under this assumption, player B is able to win only so much as he will bring forth
x0 white balls before x − x0 black balls; otherwise, he is anticipated by player A who
counts a point at each trial: this expression of zx,x0 is therefore the probability that
player B will have drawn x0 white balls before having extracted from it x − x0 blacks,
and, consequently, the probability to win, if he made the wager with player A, who

would count then a point with the exit of each black ball while he counts one of them
at the exit of a white, to attain x0 points before his adversary has x − x0 of them; that
which is the problem of points. (2 )
If we examine with attention the form of the general expression which gives zx,x0 , [626]
we will recognize that this problem is able yet to be resolved, and even with simplicity,
by means of the theory of combinations: in fact, let a be the number of white balls
contained in the urn of player A, and b the one of the blacks; a0 the number of white
balls of player B, and b0 the one of the blacks; by considering, as we have already done,
the set of two successive drawings of A and B as one trial,
aa0 will be the number of combinations in which the players bring forth each one
white ball;
ab0 the one of the combinations which will give one white ball to A and one black
to B;
a0 b the one of the combinations which will give, to the contrary, one black ball to
A and one white to B;
bb0 the one of the combinations in which both players draw a black ball;
And the sum aa0 + ab0 + a0 b + bb0 will form the collection of all the combinations
which are able to take place in a trial. The combinations where the players bring
forth each one black ball bring no change to their position, we are able to set it aside,
and then we occupy ourselves only with the trials where there will be brought forth
at least one white ball. It is clear that in x + x0 similar trials one of the players has
necessarily won, and the game must be decided: now the number of all the equally
possible combinations, according to which these x + x0 trials are able to be presented,
will be 0
(aa0 + ab0 + a0 b)x+x ;
2 The generating function of zx,x0 is reduced in this case to

t(1 − q 0 t)
,
(1 − t)(1 − q 0 t − p0 tt0 )
and the equation in the corresponding partial differences will be
zx,x0 = q 0 zx−1,x0 + p0 zx−1,x0 −1 ,
in which zx,x0 is a function of x and of x0 which we will designate by φ(x, x0 ); if we make x − x0 = s, we
will have
φ(x, x0 ) = φ(s + x0 , x0 ),
and, if we represent by zs,x0 this last function, there results from it
zx,x0 = zs,x0 , zx−1,x0 = zs−1,x0 , zx−1,x0 −1 = zs,x0 −1 ;
and the equation in the partial differences is changed into that here
zs,x0 = q 0 zs−1,x0 + p0 zs,x0 −1 ,
an equation to which the problem of points would lead directly under the conditions enunciated above. By
paying attention that, in consequence of this transformation, zs,0 = 1 and z0,x0 = 0, and that z0,0 is not
able to take place, it is easy to see that the generating function of zs,x0 will be

t(1 − q 0 t)
,
(1 − t)(1 − q 0 t − p0 t0 )
0
in the development of which the coefficient of ts t0x will be the expression of zs,x0 .

the question is reduced therefore to choose in all these combinations those which make
player B win, that is those in which this player will have x0 white balls before player
A has brought forth x of them. In order to fix the ideas, let us suppose x0 greater
than x; we are able to form the following hypotheses: either player B will have won
at the xth trial, that is by drawing without interruption a white ball at each trial, and
then the number of the preceding combinations which are corresponding to this case is
evidently [627]
x0 x0 (x0 − 1) 2 x0 −2

0 0 0
a0x bx + abx −1 + a b + ···
1 1.2
x0 (x0 − 1) · · · (x0 − x + 2) x−1 x0 −x+1

+ a b (aa0 + ab0 + a0 b)x ;
1.2 . . . (x − 1)
0
and by dividing it by (aa0 + ab0 + a0 b)x+x , the total number of combinations, we will
have, for the probability of this hypothesis,
0 0
a0x bx x0 a x0 (x0 − 1) a2 x0 (x0 − 1) · · · (x0 − x + 2) ax−1
 
1 + + + · · · + ;
(aa0 + ab0 + a0 b)x0 1 b 1.2 b2 1.2 . . . (x − 1) bx−1
or the player B will have won at the (x0 +1)st trial, that is by having drawn only a single
black ball, for example at the commencement, and then the number of combinations
favorable to this event is
x0 x0 (x0 − 1) 2 x0 −2

0 0 0
b0 a0x bx + abx −1 + a b + ···
1 1.2
x0 (x0 − 1) · · · (x0 − x + 3) x−2 x0 −x+2

+ a b (aa0 + ab0 + a0 b)x−1 ;
1.2 . . . (x − 2)
but this number is the same, if the black ball is brought forth in the first trial or in the
second, . . ., or in the xth trial; it is necessary therefore to multiply it by x0 in order to
have all the combinations relative to this hypothesis, of which the probability is, by this
means,
0 0
x0 ab0 a0x bx x0 a x0 (x0 − 1) a2 x0 (x0 − 1) · · · (x0 − x + 3) ax−2
 
1 + + + · · · + ;
1 (aa0 + ab0 + a0 b)x0 +1 1 b 1.2 b2 1.2 . . . (x − 2) bx−2
or player B will have won at the (x0 + 2)nd trial, and we will see in the same manner
that the probability of this hypothesis will be
0 0
x0 (x0 + 1) a2 b02 a0x bx x0 a x0 (x0 − 1) · · · (x0 − x + 4) ax−3
 
1+ + ··· + ;
1.2 (aa0 + ab0 + a0 b)x0 +2 1 b 1.2 . . . (x − 3) bx−3
By continuing thus, we will have the probabilities of all the successive hypotheses
which are able to be presented under the supposition of the gain of the game by player
B, until that where he would win only at the (x0 + x − 1)st trial, an event of which the [628]
probability will be
0 0
x0 (x0 + 1) · · · (x0 + x − 2) ax−1 b0x−1 a0x bx
;
1.2 . . . (x − 1) (aa0 + ab0 + a0 b)x0 +x−1

and effectively, in this case, there are not able to be trials where the players bring forth
at the same time a white ball.
The sum of all these probabilities will give evidently that of player B in order to
win the game.
If we pay attention that

ab0 a0 b q n
= m, = m0 , and = 0,
aa0 + ab0 + a0 b aa0 + ab0 + a0 b b m
we recover the expression of zx,x0 .
Let us imagine presently that there are in the urns some white balls bearing the n◦ 1,
and other balls, of the same color, which bear the n◦ 2; each ball diminishing by its nu-
meral, by its exit, the number of points which are lacking yet to the player to which it is
favorable. The problem is no longer susceptible to be resolved generally by means of
combinations, instead the calculation of the generating functions will continue to fur-
nish a general expression of which the development will contain the complete solution
of the question and will be able, in certain cases, to be executed by laws easy to know,
as we will have occasion to see.
Let p be the probability player A to extract a ball labeled 1, p1 that to extract a
ball labeled 2, and q that to bring forth a black ball; p0 , p01 and q 0 the corresponding
probabilities for player B; and let always zx,x0 be the probability of this last player in
order to win the game. By following the same march as above, we will be led to the
equation in partial differences

zx,x0 =mzx−1,x0 + m1 zx−2,x0 + m0 zx,x0 −1 + m01 zx,x0 −2


+ nzx−1,x0 −1 + n1 zx−2,x0 −1 + n0 zx−1,x0 −2 + n01 zx−2,x0 −2

in which we make [629]


0 0 0
pq p1 q pq p01 q
= m, = m1 , = m0 , = m01 ,
1 − qq 0 1 − qq 0 1 − qq 0 1 − qq 0
pp0 p1 p0 pp01 p1 p01
= n, = n1 , = n0 , = n01 ;
1 − qq 0 1 − qq 0 1 − qq 0 1 − qq 0
the generating function of the variable zx,x0 given by this equation, will be

A + Bt0 + A0 + B 0 t
(c) ,
1 − mt − m1 t2 − m0 t0 − m01 t02 − ntt0 − n1 t2 t0 − n0 tt02 − n01 t2 t02

A and B being arbitrary functions of t, A0 and B 0 arbitrary functions of t0 , which will


be determined by means of the generating functions of

z0,x0 , zx,0 , z1,x0 , zx,1

which are themselves it by the conditions of the game.


We find, as previously, that the generating function of $z_{0,x'}$ is zero and that of $z_{x,0}$, $\frac{t}{1-t}$.
From the general equation, we deduce the equation in finite differences

z1,x0 = m0 z1,x0 −1 + m01 z1,x0 −2 ,

which holds for all the values of x0 from x0 = 2 inclusively, and which gives conse-
quently, for the generating function of z1,x0 ,
a + bt0
,
1 − m0 t0 − m01 t02
a and b being constants that we determine by means of the values of z1,0 and z1,1 ; and
as z1,0 is equal to unity, z1,1 is equal to m0 + m01 , and is at the same time the coefficient
of t0 in the development of the generating function; there results from it

a = 1 and b = m01 ;

the generating function of z1,x0 is therefore


1 + m01 t0
.
1 − m0 t0 − m01 t02
Now, if in the preceding equation we put 1 − yx,x0 in the place of zx,x0 , yx,x0 being [630]
always the probability of the first player A, it is reformed in the same manner with
respect to this last variable, and we will deduce from it similarly the equation in the
finite differences
yx,1 = myx−1,1 + m1 yx−2,1 .
But we will see at the same time that it begins to hold only when x surpasses 2; because,
x being 2, we will have

y2,1 = my1,1 + m1 y0,1 + n1 + n01 .

It is necessary therefore to employ it only by departing from x = 3, and then the


generating function of yx,1 is of the form

a + bt + ct2
,
1 − mt − m1 t2
a, b and c being constants that we will determine, as previously, by means of the values
of y1,0 , y1,1 and y1,2 ; now y1,0 is unity; y1,1 is equal to 1 − m0 − m01 , and is the
coefficient of t in the development of the generating function; y2,1 has for value, as we
have just seen,
m(1 − m0 − m01 ) + m1 + n1 + n01 ;
this is the coefficient of t2 in the development of the function. We will conclude from
it
a = 1, b = 1 − m − m0 − m01 , and c = n1 + n01 ,
and the generating function of yx,1 will be therefore

1 + (1 − m − m0 − m01 ) t + (n1 + n01 ) t2


;
1 − mt − m1 t2

consequently that of zx,1 is

1 1 + (1 − m − m0 − m01 ) t + (n1 + n01 ) t2



1−t 1 − mt − m1 t2
(m0 + m01 ) t + (n1 + n01 ) t2 + (n1 + n01 ) t3
= .
(1 − t)(1 − mt − m1 t2 )

Let us resume actually the generating function (c); we are able always to restore it [631]
to this form
A1 t + B1 t2 t0 + A01 + B10 tt0
,
1 − mt − m1 t2 − m0 t0 − m01 t02 − ntt0 − n1 t2 t0 − n0 tt02 − n01 t2 t02

A1 and B1 being the arbitrary functions of t, A01 and B10 the arbitrary functions of t0 ;
which we determine easily, by equating first the coefficient of t0 in the development
of this function to the generating function of z0,x0 or zero, next the one of t00 to the
t
generating function of zx,0 or 1−t , since the one of t to the generating function of
0
z1,x0 , and finally the one of t to the generating function of zx,1 , that which will give
successively

1 − mt − m1 t2 m01 + n0 + n01 t
A01 = 0, A1 = , B10 = m01 , B1 = ,
1−t 1−t
and, consequently, for the generating function of zx,x0 ,

(1 − mt − m1 t2 )t + m01 tt0 + n0 t2 t0 + n01 t3 t0


(d) .
(1 − t)(1 − mt − m1 t2 − m0 t0 − m01 t02 − ntt0 − n1 t2 t0 − n0 tt02 − n01 t2 t02 )

If we suppose p and p0 null, then we have

m = 0, m0 = 0, n = 0, n1 = 0, and n0 = 0,

and the function (d) takes this form

tt0 (m01 + n01 t2 ) t


h  0 0 2 i + h  0 0 2  i,
m1 +n1 t 02 m1 +n1 t
2
(1 − t)(1 − m1 t ) 1 − 1−m1 t2 t (1 − t) 1 − 1−m 1t
2 t02

under which it is susceptible of the same developments as the function (a). There is to
note that we will recover the same coefficient for
0 0 0 0
t2r t02r , t2r−1 t02r , t2r t02r −1 , t2r−1 t02r −1 ;

that which is seen a priori, by paying attention that the players always count two points
with each white ball that they make exit.
Let us suppose that player A alone has some balls labeled 1 and 2, and that the [632]
other player has only some white balls marked 1, or which count to him only one point
on exiting; then
p01 = 0

and, hence,
m01 = 0, n0 = 0, n01 = 0;
the function (d) becomes

t(1 − mt − m1 t2 ) t 1
= i ;
(1 − t)(1 − mt − m1 t2 − m0 t0 − ntt0 − n1 t2 t0 )
h
1 − t 1 − m0 +(n+n1 t)t t0
1−(m+m1 t)t

0
by developing it according to the powers of t0 , the coefficient of t0x will be
x0
t [m0 + (n + n1 t)t]
x0
,
(1 − t) [1 − (m + m1 t)t]
an expression that the concern is to develop with respect to the powers of t in order to
have the coefficient of tx ; now this coefficient will be the sum of all the coefficients of
the powers of t inferior or equal to tx−1 , in the development of the expression
x0
[m0 + (n + n1 t)t]
x0
,
[1 − (m + m1 t)t]
which, by omitting the terms where the powers of t outside the binomials are superior
to tx−1 , is able to be put under this form
 2 x−1 
x0 n + n1 t x0 (x0 − 1) n + n1 t 0 0 0
   
x (x − 1) · · · (x − x + 2) n + n 1 t
t2 + · · · + tx−1 
 

 1+ 0
t+ 0 0

1 m 1.2 m 1.2 . . . (x − 1) m

 


 

 " # 
 0 0
  0 0 0
  x−2 

 x x n + n 1 t x (x − 1) · · · (x − x + 3) n + n 1 t x−2


+ (m + m t)t 1 + t + · · · + t
 
1
 
0 − 0
 


 1 1 m 1.2 . . . (x 2) m 


 
0x0 " #
m 0 0
x (x + 1) 0 0 0
x (x − 1) · · · (x − x + 4) n + n1 t
  x−3

 + (m + m1 t)2 t2 1 + · · · + 0
tx−3 




 1.2 1.2 . . . (x − 3) m 




 

 +. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .





 
0 0 0
 

 x (x − 1) · · · (x − x + 2) x−1 x−1


+ (m + m t) t .
 
1
 
1.2 . . . (x − 1)
 

If we reject further from this series all the powers of t superior to tx−1 , which will [633]
result from the developments of the binomials, and if, in that which remains, we make
t = 1, we will have the expression of zx,x0 .
Let us examine further the case where player A would be certain to extract at each
trial a ball which would count to that player one point, that is where we would have

p = 1, p1 = 0, q = 0,

and consequently

m = q0 , m1 = 0, m0 = 0, m01 = 0,
n = p0 , n1 = 0, n0 = p01 , n01 = 0.

The generating function of zx,x0 or the function (d) would be reduced to
t(1 − q 0 t) + p01 t2 t0
,
(1 − t)(1 − q 0 t − p0 tt0 − p01 tt02 )
and that of yx,x0 would be, hence,
1 t(1 − q 0 t) + p01 t2 t0
− ,
(1 − t)(1 − t0 ) (1 − t)(1 − q 0 t − p0 tt0 − p01 tt02 )
1 tt0
= + .
1 − t0 (1 − t)(1 − q 0 t − p0 tt0 − p01 tt02 )
In this last expression, the first term represents the generating function of y0,x0 , which
is equal to unity whatever be x0 , and the second will give, by developing it with respect
to the powers of t and of t0 , all the other values of yx,x0 ; now the coefficient of tx will
be
t0 [q 0 + (p0 + p01 t0 )t0 ]x−1
;
1 − t0
whence it results that, if we reject from the development of the series
" #
0 0 2
 0 0 0
  0 
(x − 1) p + p t (x − 1)(x − 2) p + p t
q 0x−1 t0 + 1
t02 + 1
t03 + · · ·
1 q0 1.2 q0
0
all the powers of t0 superior to t0x , and if we made in that which remains t0 = 1, we
0
will have, by supposing x0 even and equal to 2r + 2, the coefficient of tx t0x , or [634]
  0 0
 0 0
2 r 
(x − 1)(x − 2) · · · (x − r) p0 + p01 
 
(x − 1) p + p1 (x − 1)(x − 2) p + p1
+ ··· +


 1+ 0
+ 0 0





 1 q 1.2 q 1.2 . . . r q 



0r+1 0 02 0r
   

 (x − 1)(x − 2) · · · (x − r − 1) p (r + 1) p 1 (r + 1)r p 1 (r + 1)r · · · 2 p 1


+ 1 + + + · · · +

 



 1.2 . . . (r + 1) q 0r+1 1 p 0 1.2 p 02 1.2 . . . r p 0r 


 
0x−1 0r−1
yx,x0 =q (x − 1)(x − 2) · · · (x − r − 2) p0r+2 (r + 2) p01
 
(r + 2)(r + 1)r · · · 4 p1
+
 1+ + ··· + 


 1.2 . . . (r + 2) q 0r+2 1 p0 1.2 . . . (r + 1) p0r−1 



 
+ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

 


 

 
02r+1
(x − 1)(x − 2) · · · (x − 2r − 1) p

 

+

 

02r+1

1.2 . . . (2r + 1) q
and, in the case of x0 odd or equal to 2r + 1,
 2 r 
(x − 1) p0 + p01 (x − 1)(x − 2) p0 + p01 (x − 1)(x − 2) · · · (x − r) p0 + p01 
   
+ ··· +


 1+ 0
+ 0 0





 1 q 1.2 q 1.2 . . . r q 



0r+1 0 02 0r−1
   

 (x − 1)(x − 2) · · · (x − r − 1) p (r + 1) p 1 (r + 1)r p 1 (r + 1)r · · · 3 p1


+ 1 + + + · · · +

 

1.2 . . . (r + 1) q 0r+1 1 p 0 1.2 p 02 1.2 . . . (r − 1) p0r−1

 


 

0x−1 0r+2 0 0r−2
yx,x0 =q
 
(x − 1)(x − 2) · · · (x − r − 2) p (r + 2) p1 (r + 2)(r + 1)r · · · 5 p1
+
 1+ + ··· + 
1.2 . . . (r + 2) q 0r+2 1 p0 1.2 . . . (r − 2) p0r−2
 


 

 
+ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

 


 

 
02r
(x − 1)(x − 2) · · · (x − 2r) p

 


+ 

 
1.2 . . . 2r q 02r

It is clear that player B is able to expect to win only as long as x is greater than
r + 1, or that x0 equal 2r + 2 or 2r + 1; and effectively, beyond this supposition, the
preceding values of yx,x0 become all equal to unity.
We will note also that player A has necessarily won the game when player B will
have drawn x − r − 1 black balls before having attained x0 points; but this last player
is able yet to have lost before having drawn the totality of this number of black balls,
that which makes that this question is not at all susceptible to return into that which
is treated in the analytic Theory, after the problem of points, as previously a similar
supposition has led us to this last problem.

§3. The problem of points having been the object of the researches of two great
geometers of the XVIIth century (3 ), and to some extent the first of this kind subject to [635]
analytic methods, one will be perhaps curious to see how this same problem is deduced
again, as corollary, from another question of probability, of which the solution will
offer besides a new application of the method of generating functions.
We draw successively from an urn, which contains a determined quantity of white
and black balls, a ball that we do not return after the trial, and we demand, after a
certain number of known drawings, what is the probability to complete the drawing of
such given number of white balls before that of such other number, given equally, of
black balls.
Let a and a0 be the numbers of white and black balls contained originally in the
urn, n the number of white balls that we are proposed to attain before having extracted
another number n0 of black balls; and let us suppose that after having drawn succes-
sively from the urn a ball without returning it, we have brought forth n − x white balls
and n0 − x0 black balls, x and x0 being then the number of white and black balls that
there remain to make exit in order to decide the question. Let us represent by yx,x0 the
probability to bring forth in the following drawings x white balls before x0 black balls,
or to attain the totality of n white balls before having extracted n0 blacks; we will have,
according to the known rules of probabilities, the equation
$$y_{x,x'} = \frac{a - n + x}{a + a' - n - n' + x + x'}\,y_{x-1,x'} + \frac{a' - n' + x'}{a + a' - n - n' + x + x'}\,y_{x,x'-1}.$$
Let us make
a − n + x = s, a0 − n0 + x0 = s0 and yx,x0 = us,s0 ;
the preceding equation becomes
$$u_{s,s'} = \frac{s}{s + s'}\,u_{s-1,s'} + \frac{s'}{s + s'}\,u_{s,s'-1},$$
and, by supposing [636]
$$u_{s,s'} = \frac{1.2.3\ldots s\;.\;1.2.3\ldots s'}{1.2.3\ldots(s + s')}\,z_{s,s'},$$
it is restored to this form
zs,s0 = zs−1,s0 + zs,s0 −1 ,
3 Pascal and Fermat.

an equation in the partial differences with constant coefficients, which must hold for
all the entire and positive values of s and of s0 , by departing from s = a − n and from
s0 = a0 − n0 , and gives consequently for the generating function of zs,s0

0 0 A + A0
ta−n t0a −n ,
1 − t − t0
A being an arbitrary function of t, and A0 an arbitrary function of t0 . We are able
always to transform this expression into this one

0 0 A1 + A01 t0
ta−n t0a −n ,
1 − t − t0
in which A1 and A01 are new arbitrary functions of t and of t0 . In order to determine
them, we will observe that, y0,0 not being able to take place and yx,0 being equal to
zero, whatever be the entire and positive values of x, we will have

1.2.3 . . . s.1.2.3 . . . (a0 − n0 )


0 = us,a0 −n0 = zs,a0 −n0 ;
1.2.3 . . . (a0 − n0 + s)

consequently the generating function of zs,a0 −n0 will be null, that which gives

0 0 A1
ta−n t0a −n = 0, and hence A1 = 0.
1−t
Moreover, y0,x0 being equal to unity for all the values of x0 from x0 = 1, we will have
similarly
1.2.3 . . . (a − n).1.2.3 . . . s0
1 = ua−n,s0 = za−n,s0 ;
1.2.3 . . . (a − n + s0 )
0
whence we deduce, for the value of za−n,s0 or the coefficient of ta−n t0s in the devel-
opment of its generating function, [637]

(a − n + 1)(a − n + 2) · · · (a − n + s0 )
za−n,s0 = ,
1.2.3 . . . s0
that which gives

0 0 A01 t0 0 0
a−n 0a0 −n0 (a − n + 1) · · · (a + a − n − n + 1)
ta−n t0a −n =t t
1 − t0 1.2.3 . . . (a0 − n0 + 1)
(a + a − n − n0 + 2)t02
0
 
t0 + + · · ·
 a0 − n0 + 2 
× 0 0 0 0 0 0x0

 (a + a − n − n + 2) · · · (a + a − n − n + x )t 
+ + · · ·
(a0 − n0 + 2) · · · (a0 − n0 + x0 )
1
The second member of this equation multiplied by t
1− 1−t
will be therefore the gener-
0

ating function of z s,s0 ; by developing it with respect to the powers of t and next with

respect to those of t0 , it is easy to see that the coefficient of ts or of ta−n+x is
0
0 0
0 (a − n + 1) · · · (a + a − n − n + 1)
t0a −n
1.2.3 . . . (a0 − n0 + 1)
(a + a0 − n − n0 + 2) 02
 
0
× t + t + ···
a0 − n0 + 2
x(x + 1) · · · (x + x0 − 2) 0x0 −1
 
x 0 x(x + 1) 02
× 1+ t + t + ··· + t + ··· ,
1 1.2 1.2 . . . (x0 − 1)
0 0 0 0
and the one of t0s , or of t0a −n +x in this last expression, or zs,s0 , is equal to

(a − n + 1) · · · (a + a0 − n − n0 + 1)
1.2.3 . . . (a0 − n0 + 1)
x(x + 1) · · · (x + x0 − 2) a + a0 − n − n0 + 2 x(x + 1) · · · (x + x0 − 3)
 
+ + · · ·
 1.2 . . . (x0 − 1) a 0 − n0 + 2 1.2 . . . (x0 − 2) 
× 0 0 0 0 0
.
 (a + a − n − n + 2) · · · (a + a − n − n + x ) 
+
(a0 − n0 + 2) · · · (a0 − n0 + x0 )
Now, by multiplying this value of zs,s0 by

1.2.3 . . . (a0 − n0 + x0 )
,
(a0 − n0 + x + 1) · · · (a + a0 − n − n0 + x + x0 )
we will have, after all the reductions, for the expression of yx,x0 , [638]

(a − n + x) · · · (a − n + 1)
yx,x0 =
(a + a0 − n − n0 + x + x0 ) · · · (a + a0 − n − n0 + x + 1)
x a0 − n0 + x0 x(x + 1) (a0 − n0 + x0 )(a0 − n0 + x0 − 1)
 
1+ + + · · ·
 1 a + a0 − n − n0 + x0 1.2 (a + a0 − n − n0 + x0 )(a + a0 − n − n0 + x − 1) 
× 0 0 0 0 0 0

 x(x + 1) · · · (x + x − 2) (a − n + x ) · · · (a − n + 2) 
+ 0 0 0 0 0
1.2 . . . (x − 1) (a + a − n − n + x ) · · · (a + a − n − n + 2)

Let us imagine actually a − n and a0 − n0 in the ratio of p to q, so that we have


a − n = pk and a0 − n0 = qk, and let us imagine that k becomes a very great number
or infinity; it is clear that the probability of the exit of a white ball or of a black ball in
the successive drawings will become constant and will be $\frac{p}{p+q}$ for a white ball and $\frac{q}{p+q}$ for a black, and the probability $y_{x,x'}$ will be reduced to this expression
$$y_{x,x'} = \left(\frac{p}{p+q}\right)^x\left[1 + \frac{x}{1}\,\frac{q}{p+q} + \frac{x(x+1)}{1.2}\left(\frac{q}{p+q}\right)^2 + \cdots + \frac{x(x+1)\cdots(x+x'-2)}{1.2\ldots(x'-1)}\left(\frac{q}{p+q}\right)^{x'-1}\right];$$

such is the formula to which the problem of points leads, and effectively we return to
the conditions of this problem by the supposition of k infinite.
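This closed form lends itself to a direct numerical check: the same probability can be computed from first principles by a recursion on the numbers of trials still lacking to the two players. The sketch below is only an illustration under that assumption; the function names and the chosen values of $p$, $q$, $x$, $x'$ are not taken from the text.

```python
from functools import lru_cache
from math import comb

p, q = 3, 2                      # illustrative odds for a white and for a black ball
P, Q = p / (p + q), q / (p + q)  # limiting probabilities of the two colours

def points_formula(x, xp):
    """The closed form above: (p/(p+q))^x * sum_{j < x'} C(x+j-1, j) (q/(p+q))^j."""
    return P**x * sum(comb(x + j - 1, j) * Q**j for j in range(xp))

@lru_cache(maxsize=None)
def points_recursion(x, xp):
    """First principles: A lacks x trials, B lacks x'; A wins each trial with probability P."""
    if x == 0:
        return 1.0
    if xp == 0:
        return 0.0
    return P * points_recursion(x - 1, xp) + Q * points_recursion(x, xp - 1)

for x in range(1, 7):
    for xp in range(1, 7):
        assert abs(points_formula(x, xp) - points_recursion(x, xp)) < 1e-12
```

The two computations agree to rounding error over the small grid of values tested.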

If we suppose $n$ equal to $a$ and $n'$ equal to $a'$, $y_{x,x'}$ will then express the probability of the exit of all the white balls remaining in the urn before all the blacks have been depleted, and its expression will be changed into this one

\[ \frac{1\cdot2\cdot3\cdots x}{(x+x')\cdots(x'+1)}\left[1 + \frac{x}{1} + \frac{x(x+1)}{1\cdot2} + \cdots + \frac{x(x+1)\cdots(x+x'-2)}{1\cdot2\cdots(x'-1)}\right], \]

which is reduced to
\[ \frac{x'}{x+x'}. \]
The probability of extracting from the urn the totality of the white balls before that [639] of the blacks is therefore to the contrary probability in the inverse ratio of the number of white balls to that of the blacks.
We arrive at this last result, in an extremely simple manner, by means of combinations; in fact, the probability of the exit of all the balls from the urn, in any one given order by color, will be
\[ \frac{x(x-1)\cdots2\cdot1\;\cdot\;x'(x'-1)\cdots2\cdot1}{(x+x')(x+x'-1)\cdots3\cdot2\cdot1} = \frac{1\cdot2\cdot3\cdots x'}{(x+1)\cdots(x+x')}. \]

But, in order that the white balls exit in totality first, it is necessary that a ball of the black color exit last: by combining $x'-1$ by $x'-1$ the $x+x'-1$ ranks of exit which are found before the last, we will form as many different rankings for the balls of the black color, and as many orders of exit by color, which will comprehend all those where a black ball exits in last place; now the number of these combinations is

\[ \frac{(x+x'-1)(x+x'-2)\cdots(x+1)}{1\cdot2\cdots(x'-1)}, \]

and by multiplying it by the probability common to each order of exit by color, we will
have the sought probability equal to

\[ \frac{1\cdot2\cdot3\cdots x'}{(x+1)\cdots(x+x')}\,\frac{(x+1)\cdots(x+x'-1)}{1\cdot2\cdot3\cdots(x'-1)} = \frac{x'}{x+x'}. \]
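The same result can be checked by brute simulation: empty an urn of $x$ white and $x'$ black balls at random and count how often the whites are exhausted first, that is, how often a black ball occupies the last rank of exit. A minimal sketch, in which the sample size and the values of $x$, $x'$ are arbitrary choices:

```python
import random

def whites_exhausted_first(x, xp, trials=200_000):
    """Estimate the probability that the x white balls all exit before the x' black ones."""
    wins = 0
    for _ in range(trials):
        urn = ['W'] * x + ['B'] * xp
        random.shuffle(urn)
        if urn[-1] == 'B':        # the whites finish first exactly when the last ball is black
            wins += 1
    return wins / trials

x, xp = 3, 5
print(whites_exhausted_first(x, xp), xp / (x + xp))   # the estimate should approach x'/(x+x')
```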

Remarks on generating functions.

§4. Let u be a generating function in one or many variables; each equation between
this function and its variables, linear with respect to u, rational with respect to the
variables, will subsist still if we pass from the generating functions to the coefficients,
among these same coefficients, and will give place to an equation in the partial differ-
ences; but if, in this equation in partial differences, we pass again from the coefficients
to the generating functions, we will no longer arrive at an equation rigorously exact, unless we restore at the same time the functions of the variables which have been able [640]

to vanish in the first passage. Thus, in one of the questions that we have treated above,
the equation in the partial differences

\[ z_{x,x'} = m\,z_{x-1,x'} + m'\,z_{x,x'-1} + n\,z_{x-1,x'-1} \]

would give, by going up again simply from the coefficients to the generating functions,
this one
\[ u = m\,u\,t + m'\,u\,t' + n\,u\,t\,t', \]
which is not at all exact; because it is easy to see that, according to the conditions of the problem, it would be necessary to add to the second member the generating function of $z_{x,0}$, less this same function multiplied by $m$. This function of $t$, which it is necessary to restore in the second member of the equation in order to complete it, is precisely the arbitrary function that we have had to determine in the solution of this question. In general, the functions to add in order still to have an equation in the passage from the coefficients to the generating functions are the same as the arbitrary functions which form the numerator of the integral generating function before it is developed.
For lack of having regard to these functions, we can fall into grave errors by making use of this manner of integrating the equations in partial differences. For this same reason, the march followed in the solution of problems §§8 and 10 of Book II of the Théorie analytique des Probabilités is by no means rigorous, and seems to imply a contradiction, in that it establishes a connection among the variables which are and must always be independent. Without entering into the
particular considerations which have been able to make it succeed here, and which are easy to recognize, we will show that the method of integration set forth at the beginning of this Supplément applies equally to these questions, and resolves them with no less simplicity.
In the problem of §8, we have proposed to determine the lot of a number $n$ of players A, B, C, …, of whom $p, q, r, \ldots$ represent the respective probabilities, that is, their probabilities to win a trial, when, in order to win the game, there are lacking $x$ trials to player A, $x'$ trials to player B, $x''$ trials to player C, etc. By naming $y_{x,x',x'',\ldots}$ the [641]
probability of player A to win the game, we have the equation in partial differences

\[ y_{x,x',x'',\ldots} = p\,y_{x-1,x',x'',\ldots} + q\,y_{x,x'-1,x'',\ldots} + r\,y_{x,x',x''-1,\ldots} + \cdots, \]

which gives for $y_{x,x',x'',\ldots}$ this generating function
\[ \frac{P + Q + R + \cdots}{1 - pt - qt' - rt'' - \cdots}, \]
in which $P, Q, R, \ldots$ are as many arbitrary functions of the variables $t, t', t'', \ldots$ as there are of these variables, the first not containing $t$ at all, the second not $t'$, the third not $t''$, etc. Now, this function is able to be put under this form
\[ \frac{P' + Q't + R'tt' + S'tt't'' + \cdots}{1 - pt - qt' - rt'' - st''' - \cdots}, \]
$P', Q', R', \ldots$ being, as above, arbitrary functions, the first of all the variables with the exception of $t$, the second of all the variables excepting $t'$, the third equally of all the variables except $t''$, and thus consecutively. In order to determine them, we will
observe that, in $y_{x,x',x'',\ldots}$, two of the indices $x, x', x'', \ldots$ or a greater number are not able to be null at the same time, since the game ceases when one of the players has attained his points; moreover, $y_{0,x',x'',\ldots}$ is equal to unity, whatever be $x', x'', \ldots$; the generating function of this expression, or that which gives unity for the coefficient of any product whatsoever $t'^{\,x'}t''^{\,x''}t'''^{\,x'''}\ldots$, is

\[ \frac{t'}{1-t'}\,\frac{t''}{1-t''}\,\frac{t'''}{1-t'''}\cdots; \]
consequently, we will have

\[ P' = \frac{t'}{1-t'}\,\frac{t''}{1-t''}\,\frac{t'''}{1-t'''}\cdots\,(1 - qt' - rt'' - st''' - \cdots). \]
Each value of $y_{x,x',x'',\ldots}$ in which another index than $x$ is null being equal to zero, [642] the corresponding generating function becomes null also; we will have therefore successively
\[ Q' = 0, \quad R' = 0, \quad S' = 0, \ldots \]
Hence, the generating function of $y_{x,x',x'',\ldots}$ will be
\[ \frac{t'}{1-t'}\,\frac{t''}{1-t''}\cdots\frac{1 - qt' - rt'' - \cdots}{1 - pt - qt' - rt'' - \cdots}, \]
and the coefficient of $t^x$, in the development of this function with respect to the powers of $t$,
\[ \frac{t'}{1-t'}\,\frac{t''}{1-t''}\cdots\frac{p^x}{(1 - qt' - rt'' - \cdots)^x}; \]
whence it is easy to deduce the coefficient of $t'^{\,x'}t''^{\,x''}\ldots$, or
\[ y_{x,x',x'',\ldots} = p^x\left[1 + \frac{x}{1}(q + r + \cdots) + \frac{x(x+1)}{1\cdot2}(q + r + \cdots)^2 + \frac{x(x+1)(x+2)}{1\cdot2\cdot3}(q + r + \cdots)^3 + \cdots\right], \]
by taking care to reject the terms in which the power of $q$ surpasses $x'-1$, those in which the power of $r$ surpasses $x''-1$, …
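As a numerical illustration of this rule, the bracketed series can be generated term by term, rejecting the powers of $q$ that exceed $x'-1$ and the powers of $r$ that exceed $x''-1$, and the result compared with a direct recursion on the state of the game. The sketch below treats three players only; the values of $p$, $q$, $r$ and of $x$, $x'$, $x''$ are arbitrary choices, not taken from the text.

```python
from functools import lru_cache
from math import comb

p, q, r = 0.5, 0.3, 0.2        # skills of players A, B, C; they must sum to 1
x, xp, xpp = 2, 3, 4           # trials lacking to A, B, C respectively

def truncated_series(x, xp, xpp):
    """p^x * sum_k C(x+k-1, k)(q+r)^k, expanded and truncated as the text prescribes."""
    total = 0.0
    for k in range((xp - 1) + (xpp - 1) + 1):
        for j in range(k + 1):                     # j = power of q, k - j = power of r
            if j <= xp - 1 and k - j <= xpp - 1:   # reject the excluded terms
                total += comb(x + k - 1, k) * comb(k, j) * q**j * r**(k - j)
    return p**x * total

@lru_cache(maxsize=None)
def lot_of_A(x, xp, xpp):
    """Probability that A wins, from the equation in partial differences."""
    if x == 0:
        return 1.0
    if xp == 0 or xpp == 0:
        return 0.0
    return (p * lot_of_A(x - 1, xp, xpp)
            + q * lot_of_A(x, xp - 1, xpp)
            + r * lot_of_A(x, xp, xpp - 1))

print(truncated_series(x, xp, xpp), lot_of_A(x, xp, xpp))   # the two values should coincide
```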
In the problem of §10, we consider two players A and B of whom the skills are $p$ and $q$, and of whom the first has $a$ tokens and the second $b$ tokens; and we suppose that at each trial, the one who loses gives a token to his adversary, and that the game finishes only when one of the players will have lost all his tokens. We demand the probability that one of the players, A for example, will win the game before or at the $n$th trial.
By representing by $y_{x,x'}$ the probability of this player to win the game when he has $x$ tokens and when there are no more than $x'$ trials to play in order to
attain the $n$ trials, we arrive, by the first principles of the probabilities, at the equation [643]
in the partial differences
\[ y_{x,x'} = p\,y_{x+1,x'-1} + q\,y_{x-1,x'-1}, \]
which gives, for the generating function of $y_{x,x'}$,
\[ \frac{A + A' + B'\,t}{q t^2 t' - t + p t'}, \]
$A$ being an arbitrary function of $t$, $A'$ and $B'$ two arbitrary functions of $t'$. In order to determine them more conveniently, we will transform this generating function into this one
\[ \frac{A_1\,t + A_1' + B_1'\,t\,t'}{q t^2 t' - t + p t'}, \]
in which $A_1$, $A_1'$ and $B_1'$ are, as above, arbitrary functions of $t$ and of $t'$. Now $\dfrac{A_1'}{pt'}$ is the coefficient of $t^0$ in the development of the function with respect to the powers of $t$, or the generating function of $y_{0,x'}$; but, by the conditions of the problem, $y_{0,x'}$ is null whatever be $x'$; consequently its generating function is null also; $A_1'$ is therefore equal to zero.
The coefficient of $t'^{\,0}$, in the development of the generating function with respect to $t'$, is $-A_1$, which is at the same time the generating function of $y_{x,0}$, a quantity which is null so long as $x$ is less than the sum of the tokens, or $a+b$, and which becomes unity when $x = a+b$; $A_1$ is therefore a function of $t$ which has $t^{a+b}$ for factor, and of which we are able to take no account in the numerator of the generating function, because it must give only powers of $t$ superior to $t^{a+b}$, and we have seen that this generating function must be composed only of the powers of $t$ which do not surpass $t^{a+b}$, since $x$ is able to be extended only from $x = 0$ to $x = a + b$.
The generating function of $y_{x,x'}$, thus limited between these values, is reduced therefore to
\[ \frac{B_1'\,t\,t'}{q t^2 t' - t + p t'}, \]
which we are able to put easily under this form [644]
\[ (\Pi)\qquad \frac{B_1'\,t}{p}\,\frac{1}{2\sqrt{\dfrac{1}{t'^{2}}-4pq}}\left(\frac{\dfrac{1}{t'}+\sqrt{\dfrac{1}{t'^{2}}-4pq}}{1-\dfrac{\dfrac{1}{t'}+\sqrt{\dfrac{1}{t'^{2}}-4pq}}{2p}\,t}-\frac{\dfrac{1}{t'}-\sqrt{\dfrac{1}{t'^{2}}-4pq}}{1-\dfrac{\dfrac{1}{t'}-\sqrt{\dfrac{1}{t'^{2}}-4pq}}{2p}\,t}\right); \]

whence we deduce, for the coefficient of $t^{a+b}$, the expression
\[ \frac{B_1'}{p}\,\frac{\left(\dfrac{1}{t'}+\sqrt{\dfrac{1}{t'^{2}}-4pq}\right)^{a+b}-\left(\dfrac{1}{t'}-\sqrt{\dfrac{1}{t'^{2}}-4pq}\right)^{a+b}}{(2p)^{a+b-1}\cdot 2\sqrt{\dfrac{1}{t'^{2}}-4pq}}. \]

But this coefficient is the generating function of $y_{a+b,x'}$, a quantity which is equal to unity; because it is certain that player A has won the game when he has won all the tokens of B: moreover, $x'$ must be here zero or an even number, since the number of trials in which A is able to win the game is equal to $b$ plus an even number; and, in fact, he must win all the tokens of B, and win back again each token that he has lost, that which requires two trials. The series

\[ y_{a+b,0}\,t'^{\,0} + y_{a+b,2}\,t'^{\,2} + y_{a+b,4}\,t'^{\,4} + \cdots, \]
which represents the coefficient of $t^{a+b}$, is therefore equal to $\dfrac{1}{1-t'^{2}}$, and we conclude from it
\[ \frac{B_1'}{p} = \frac{(2p)^{a+b-1}}{1-t'^{2}}\;\frac{2\sqrt{\dfrac{1}{t'^{2}}-4pq}}{\left(\dfrac{1}{t'}+\sqrt{\dfrac{1}{t'^{2}}-4pq}\right)^{a+b}-\left(\dfrac{1}{t'}-\sqrt{\dfrac{1}{t'^{2}}-4pq}\right)^{a+b}}. \]

Now the coefficient of $t^{a}$, deduced from the development of the function (Π), always with respect to the powers of $t$, will be [645]
\[ \frac{B_1'}{p}\,\frac{\left(\dfrac{1}{t'}+\sqrt{\dfrac{1}{t'^{2}}-4pq}\right)^{a}-\left(\dfrac{1}{t'}-\sqrt{\dfrac{1}{t'^{2}}-4pq}\right)^{a}}{(2p)^{a-1}\cdot 2\sqrt{\dfrac{1}{t'^{2}}-4pq}}, \]

and by substituting for $\dfrac{B_1'}{p}$ its value, we will have this coefficient, or the generating function of $y_{x,x'}$, equal to
\[ \frac{2^{b} p^{b}}{1-t'^{2}}\;\frac{\left(\dfrac{1}{t'}+\sqrt{\dfrac{1}{t'^{2}}-4pq}\right)^{a}-\left(\dfrac{1}{t'}-\sqrt{\dfrac{1}{t'^{2}}-4pq}\right)^{a}}{\left(\dfrac{1}{t'}+\sqrt{\dfrac{1}{t'^{2}}-4pq}\right)^{a+b}-\left(\dfrac{1}{t'}-\sqrt{\dfrac{1}{t'^{2}}-4pq}\right)^{a+b}} \]

or
\[ \frac{2^{b} p^{b}\, t'^{\,b}}{1-t'^{2}}\;\frac{\left(1+\sqrt{1-4pq\,t'^{2}}\right)^{a}-\left(1-\sqrt{1-4pq\,t'^{2}}\right)^{a}}{\left(1+\sqrt{1-4pq\,t'^{2}}\right)^{a+b}-\left(1-\sqrt{1-4pq\,t'^{2}}\right)^{a+b}}, \]

that which is formula (o) of the Théorie analytique.
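Formula (o) can likewise be verified numerically: expanding the generating function above in powers of $t'$ and comparing the coefficient of $t'^{\,n}$ with the probability obtained from the recursion $y_{x,x'} = p\,y_{x+1,x'-1} + q\,y_{x-1,x'-1}$. The sketch below uses sympy for the series expansion; the values of $p$, $a$, $b$ and the horizon $n$ (chosen of the same parity as $b$, as the text requires) are illustrative assumptions only.

```python
import sympy as sp
from functools import lru_cache

p_, q_ = sp.Rational(11, 20), sp.Rational(9, 20)   # illustrative skills of A and B, p + q = 1
a, b, n = 3, 2, 12                                  # tokens of A and B; horizon n, same parity as b

t = sp.symbols("t")                                 # t stands here for the variable t' of the text
lam = sp.sqrt(1 - 4 * p_ * q_ * t**2)
gen = (2**b * p_**b * t**b / (1 - t**2)
       * ((1 + lam)**a - (1 - lam)**a) / ((1 + lam)**(a + b) - (1 - lam)**(a + b)))

coeff = sp.series(gen, t, 0, n + 1).removeO().coeff(t, n)   # coefficient of t'^n in formula (o)

@lru_cache(maxsize=None)
def y(x, xp):
    """Probability that A, holding x tokens with x' trials left, strips B of all his tokens."""
    if x == a + b:
        return sp.Integer(1)
    if x == 0 or xp == 0:
        return sp.Integer(0)
    return p_ * y(x + 1, xp - 1) + q_ * y(x - 1, xp - 1)

print(sp.simplify(coeff - y(a, n)))                 # prints 0 when the two computations agree
```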
