Geographical Analysis - July 1979 - Brassel - A Procedure To Generate Thiessen Polygons
Kurt E. Brassel and
Douglas Reif*
1. INTRODUCTION
In 1911 the climatologist A. H. Thiessen suggested a new method of
representing precipitation data from unevenly distributed weather stations. He
defined regions based on a set of data points in the plane (weather stations)
such that “regions be enclosed by a line midway between the station under
consideration and surrounding stations” [36, p. 1083]. Based on this proposal
the term Thiessen polygon has since been commonly used in geography to
denote polygons defined by proximity criteria with respect to a set of points in
the plane. The net of all Thiessen polygons defined by the point set is called a
*Some preliminary studies leading to this algorithm have been initiated by Dr. Thomas Peucker,
Simon Fraser University, and have been supported by ONR Contract N0001475-0886. Mr. Robert
Fowler, Simon Fraser University, has pointed out and corrected some important shortcomings of a
preliminary version of this algorithm. The implementation of the present procedure has been
supported by the Erie and Niagara Counties Regional Planning Board. Mr. George Sicherman
reviewed the manuscript and provided substantial help in formulating the proof section of this
paper. Mr. Mike Wasilenko executed the illustrations. The authors gratefully acknowledge all these
contributions.
2. DEFINITIONS
A set N of n points is given in the plane, which in this discussion are called
centroids (see Fig. 1). Find a set of points V in the plane such that each $V_i \in V$
is equidistant and closest to at least three centroids; these points are termed
(Thiessen) vertices. A (Thiessen) edge is the locus of all points equidistant and
closest to two centroids; Thiessen edges may be delimited by two vertices or
may be unlimited in one direction. A Thiessen polygon is defined as the locus of
all points closer to a centroid $C \in N$ than to any other centroid. This definition
implies that Thiessen polygons are convex. The set of n centroids determines a
set of n Thiessen polygons. The set of all polygons is called a Thiessen diagram.
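The proximity definitions above can be checked numerically with a brute-force sketch. The function names `owner` and `is_thiessen_vertex`, the tolerance `tol`, and the sample coordinates are assumptions made for illustration; they are not part of the procedure described in this paper.

```python
import math

def owner(point, centroids):
    """Index of the centroid whose Thiessen polygon contains `point`.

    This is simply the nearest centroid; ties go to the lowest index.
    """
    return min(range(len(centroids)),
               key=lambda i: math.dist(point, centroids[i]))

def is_thiessen_vertex(point, centroids, tol=1e-9):
    """True if `point` is equidistant and closest to at least three centroids."""
    d = sorted(math.dist(point, c) for c in centroids)
    # The three smallest distances must coincide (within the tolerance).
    return len(d) >= 3 and d[2] - d[0] <= tol

# Hypothetical sample data (not taken from Fig. 1).
centroids = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (5.0, 5.0)]
```

Here the point (1, 1) is equidistant from the first three centroids and farther from the fourth, so it satisfies the vertex definition, whereas (1, 0) is equidistant from only two centroids and therefore lies on an edge.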
[Figure legend: centroids; v = Thiessen vertex; p = pseudo-vertex.]
Thiessen polygons may be closed or open; closed polygons are entirely bounded
by Thiessen edges, open polygons extend to infinity. Define the convex hull of
the set N of centroids as the smallest convex polygon enclosing all centroids. All
centroids on the boundary of the convex hull have open Thiessen polygons, and
all interior centroids have closed polygons. A centroid B is called a Thiessen
neighbor of a centroid C if the Thiessen polygons about the two centroids have
an edge in common; they are called half-neighbors if they have a single vertex
in common.
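The relation between the convex hull and open polygons can be sketched as follows. This uses Andrew's monotone chain hull, which is a substitute chosen for brevity and is not a step of the procedure in this paper; the names `convex_hull` and `open_polygon_centroids` are hypothetical.

```python
def cross(o, a, b):
    """z-component of (a - o) x (b - o); positive for a counterclockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Hull boundary points via Andrew's monotone chain.

    Points lying on a hull edge between two corners are kept, because
    they are on the boundary of the hull as well.
    """
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) < 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) < 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def open_polygon_centroids(points):
    """Centroids on the hull boundary, i.e. those with open Thiessen polygons."""
    return set(convex_hull(points))
```

For a square of four corner centroids with one additional centroid on its bottom edge and one in its interior, only the interior centroid is reported as having a closed polygon.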
3. BASIC CONCEPTS
Assume a given set of centroids in the plane. Given also is a rectangular
window that includes the set of centroids; this window is further referred to as
the border of the Thiessen diagram. In order to ease computation, the border is
selected parallel to the Cartesian axes and the centroids are sorted in the x-direction.
In our procedure the Thiessen diagram will be restricted to the Cartesian
rectangle, where portions of the border delimit open Thiessen polygons. This
delimitation of the Thiessen diagram is achieved by adding some dummy points
to the point set during processing time (see Fig. 2).
Given the centroid A, four dummy points W through Z are introduced such
that perpendicular bisectors between A and the dummy points generate the
four edges of the border (Fig. 2). If A is the point closest to the lower left
corner V, then V must be a vertex of the Thiessen polygon about A, and
dummy point X is a neighbor of A. Further searching can be reduced to a
search for the “next neighbor of A clockwise to X.” In this discussion the phrase
“next clockwise neighbor” will be used for this relationship.
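One way to obtain dummy points with the stated property is to mirror the centroid across each border line, since the perpendicular bisector between a point and its mirror image is the mirror line itself. The sketch below makes that assumption explicit; the function name `dummy_points` is hypothetical, and the results are keyed by border side because the assignment of the labels W through Z to particular sides is not stated in the text.

```python
def dummy_points(a, xmin, ymin, xmax, ymax):
    """Mirror images of centroid `a` across the four border lines.

    The perpendicular bisector between `a` and each mirror image
    coincides with the corresponding border line.
    """
    ax, ay = a
    return {
        "left":   (2 * xmin - ax, ay),
        "bottom": (ax, 2 * ymin - ay),
        "right":  (2 * xmax - ax, ay),
        "top":    (ax, 2 * ymax - ay),
    }
```

When processing moves on to another centroid, calling the function again with the new centroid yields the shifted dummy points.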
[Flowchart of the search for the next clockwise neighbor. Legible box labels: (1) find first potential neighbor P; compute vertex V and radius R: V, R = f(A, X, P); (2) find first point P' to be subjected to circle test; find next point P'; declare P' as potential neighbor P; recompute V, R.]
The search for next clockwise neighbors is repeated until the polygon about
the centroid in question is completed, and all the neighbors are recorded in a
neighborhood record. For the polygon about A in Figure 6, the neighborhood
recorded is as follows: A: X, B, D, C, Y, (X).
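The circle test used in the neighbor search can be sketched as below. This is a minimal illustration, assuming that every candidate already lies on the admissible (clockwise) side of the line through A and the known neighbor X; the cheaper rejection tests that precede the circle test are omitted. The names `circumcircle` and `next_neighbor` are hypothetical.

```python
import math

def circumcircle(a, b, c):
    """Center V and radius R of the circle through three non-collinear points."""
    ax, ay = a
    bx, by = b
    cx, cy = c
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if d == 0:
        raise ValueError("points are collinear")
    a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    return (ux, uy), math.hypot(ux - ax, uy - ay)

def next_neighbor(a, x, p, candidates):
    """Refine potential neighbor `p` of centroid `a`, given known neighbor `x`.

    Any candidate inside the current circle (V, R) replaces `p`,
    and V and R are recomputed.
    """
    v, r = circumcircle(a, x, p)
    for q in candidates:
        if q in (a, x, p):
            continue
        if math.dist(q, v) < r:  # circle test
            p = q
            v, r = circumcircle(a, x, p)
    return p, v, r
```

With a = (0, 0), x = (4, 0) and initial p = (2, 6), the candidate (2, 1) falls inside the first circle and replaces p, after which the vertex becomes (2, -1.5) with radius 2.5.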
Since each vertex $V_i$ pertains to three Thiessen polygons, all neighborhood
relationships are mutual. Once a vertex is found it is stored and the mutual
relationships among the centroids involved are recorded in a bookkeeping
process. Whenever a new vertex $V_i$ in Figure 6 is found, the following entries
are made into the neighborhood file (modifications to the neighborhood file in
each step are in boldface).
Since the polygon about A is now completed, its neighbor record is eliminated
from the list, and the next centroid is chosen from the list for processing. This is
centroid B with D, A, X as known neighbors, where X is the most clockwise
point. The four dummy points are shifted to a position such that the bisectors
between dummy points and centroid B generate the four edges of the border.
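The mutual bookkeeping can be sketched with a dictionary of neighbor lists. This is an assumption-laden illustration: the names `record_vertex`, `neighbors`, and `vertices` are hypothetical, the vertex coordinates are made-up placeholders, and entries are appended in order of discovery rather than in the clockwise order maintained by the neighborhood file described here.

```python
from collections import defaultdict

def record_vertex(neighbors, vertices, triple, coord):
    """Store a vertex and make the three centroids mutual neighbors."""
    vertices.append((coord, tuple(triple)))
    for c in triple:
        for other in triple:
            if other != c and other not in neighbors[c]:
                neighbors[c].append(other)

neighbors = defaultdict(list)
vertices = []
# Placeholder coordinates, for illustration only.
record_vertex(neighbors, vertices, ("A", "X", "B"), (1.0, 0.5))
record_vertex(neighbors, vertices, ("A", "B", "D"), (1.5, 1.2))
```

After these two calls the record for A holds X, B, D, and the records for B and D already contain A, so those relationships need not be rediscovered when B and D are processed.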
[Flowchart of the overall procedure. Legible box labels: sort points along x-axis; add dummy points to point set; first centroid; find next neighbor clockwise: compute new vertex; bookkeeping: store new vertex coordinates and record neighborhood relationships; Thiessen polygon processed? (yes); restructure files.]
Processing continues:
V,: B: D, A, X, W
    D: C, A, B
    C: Y, A, D
V,: B: D, A, X, W, (D) completed
    D: C, A, B, W
    C: Y, A, D
diagrams with 20-5,000 randomly distributed data points. Other test runs
included data sets with one or two centroid clusters. The cumulative processing
time for the several steps of the Thiessen procedure is shown in Figure 9.
Note the near-linear character of these curves and the slight reduction
in processing time for clustered data sets. Figure 10 shows the observed
average computation time for each Thiessen diagram as a function of the
number of data points on a semilogarithmic scale.
Further analysis of the Thiessen procedure is shown in Figure 11. It represents
empirically computed counts of processing steps. The top curve (A)
indicates the average number of data points consulted in order to create a
Thiessen polygon. These data points are subjected to the test for overall
rejection (“is P' within the abscissa range of circle (V, R)?”) and the diagonal test (“is
P' in the halfplane $E_1$?”). These two tests include a total of five additions, three
multiplications, and two arithmetic branching checks. Approximately 50 percent
of the points pass these tests; they are represented in curve B and are
checked for their location within the circle about the potential vertex. This test
includes two additions, one multiplication, and two arithmetic checks. As may be
seen from curve C, it drastically reduces the number of data points that remain to be
considered. For all points inside the circle (curve C) Thiessen
vertices have to be computed in a procedure including twenty-six additions and
seventeen multiplications. From this figure it is evident that many points are
subjected to simple tests, whereas the time-consuming processes are performed
only for relatively few centroids. This figure also indicates that the number of
processing steps is lower for clustered data sets.
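The cascade of tests described above can be sketched as follows. This is only an illustration under stated assumptions: the function names and the counter keys are hypothetical, the y-axis is assumed to point upward (so the clockwise side of the ray from C through P has a negative cross product), and the operation counts of this sketch are not claimed to match those reported for the original implementation.

```python
def in_abscissa_range(q, v, r):
    """Overall rejection test: is q within the x-extent of circle (V, R)?"""
    return abs(q[0] - v[0]) <= r

def in_halfplane(c, p, q):
    """Diagonal test: is q on the clockwise side of the ray from c through p?

    Assumes a y-up coordinate system.
    """
    return (p[0] - c[0]) * (q[1] - c[1]) - (p[1] - c[1]) * (q[0] - c[0]) < 0

def in_circle(q, v, r):
    """Circle test on squared distances (no square root needed)."""
    dx, dy = q[0] - v[0], q[1] - v[1]
    return dx * dx + dy * dy < r * r

def cascade(c, p, v, r, points):
    """Apply the cheap tests first and the circle test only to survivors."""
    counts = {"consulted": 0, "passed_cheap_tests": 0, "inside_circle": 0}
    inside = []
    for q in points:
        counts["consulted"] += 1
        if not (in_abscissa_range(q, v, r) and in_halfplane(c, p, q)):
            continue
        counts["passed_cheap_tests"] += 1
        if in_circle(q, v, r):
            counts["inside_circle"] += 1
            inside.append(q)
    return inside, counts
```

The three counters correspond loosely to curves A, B, and C: only points counted under `inside_circle` would require the comparatively expensive vertex computation.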
[Figure: computation time in seconds (CDC Cyber 173) versus size of point set (number of centroids).]
FIG. 10. Average Computation Time for One Thiessen Polygon as a Function of the Number of
Data Points on the Point Set (in milliseconds, CDC Cyber 173).
[Figure: average number of data points consulted to compute a Thiessen polygon versus size of point set (number of centroids). Curves: A, number of points used as initial P'; B, number of points passing diagonal test; C, computation of new vertices (points passing circle test); AC, BC, results for clustered data sets.]
Proof. Let $S_i$, $i \in I$, be convex, and let $S = \bigcap_{i \in I} S_i$. If a and b are two points in
S, then they are in each $S_i$, so the line joining a to b lies in each $S_i$, and
therefore in S.
Proof of main lemma: The locus of points closer to a centroid A than to
another centroid $B_i$ is a halfplane $H_i$. The Thiessen polygon about A is the
intersection of the halfplanes $H_i$ for all $B_i \neq A$ in the data set. But every
halfplane is convex; therefore, the Thiessen polygon about A is convex.
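The halfplane argument also suggests a direct, if slow, way to construct a single polygon: start from the border rectangle and clip it against the bisector halfplane of every other centroid. The sketch below does this with a standard convex polygon clipping step; it is a brute-force illustration of the lemma, not the search procedure of this paper, and all function names are hypothetical.

```python
def bisector_side(p, a, b):
    """|p - a|^2 - |p - b|^2, which is linear in p.

    The value is <= 0 when p is at least as close to a as to b.
    """
    return (2.0 * (p[0] * (b[0] - a[0]) + p[1] * (b[1] - a[1]))
            + a[0] ** 2 + a[1] ** 2 - b[0] ** 2 - b[1] ** 2)

def clip_to_halfplane(poly, a, b):
    """Keep the part of convex polygon `poly` that is closer to a than to b."""
    out = []
    n = len(poly)
    for i in range(n):
        p, q = poly[i], poly[(i + 1) % n]
        fp, fq = bisector_side(p, a, b), bisector_side(q, a, b)
        if fp <= 0:
            out.append(p)
        if fp * fq < 0:  # edge p-q crosses the bisector
            t = fp / (fp - fq)
            out.append((p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1])))
    return out

def thiessen_polygon(a, centroids, border):
    """Intersection of the border with all halfplanes 'closer to a than to b'."""
    poly = list(border)
    for b in centroids:
        if b != a:
            poly = clip_to_halfplane(poly, a, b)
    return poly

def area(poly):
    """Shoelace area of a simple polygon."""
    s = 0.0
    for i in range(len(poly)):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % len(poly)]
        s += x1 * y2 - x2 * y1
    return abs(s) / 2.0
```

Because each clipping step intersects a convex polygon with a halfplane, the result stays convex throughout, which mirrors the proof. The cost is proportional to the number of centroids for every polygon, which is the kind of exhaustive work the neighbor search is designed to avoid.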
LEMMA 2. Given a centroid C, which is known to create a closed Thiessen
polygon, assume a radial coordinate system in which clockwise angles are
positive (Fig. 12). Let point P be a true Thiessen neighbor of C. The
straight line CP subdivides the plane into two halfplanes $E_1$ and $E_2$.
Lemma: For a closed Thiessen polygon the true next neighbor about C
clockwise of point P must lie in the halfplane $E_1$.
Proof. Assume at least one point P' in the halfplane $E_1$ (closed Thiessen
polygon). The angular relationship between the vector CP and the direction of
FIG. 12. Angular Relationship between Vectors Connecting Neighbors and Bisectors. [Figure labels: halfplanes $E_1$, $E_2$.]
$\phi_Q = \phi_P + 90^\circ$.
Likewise, for the next neighbor P' it follows that $\phi_{Q'} = \phi_{P'} + 90^\circ$; the angle
between directions Q and Q' is therefore
$\alpha = \phi_{Q'} - \phi_Q = \phi_{P'} - \phi_P$.
For all $\phi_{P''} < \phi_P$ the angle between Q and Q'' will be negative ($\phi_{Q''} - \phi_Q < 0$).
Since Q'' is the direction of an edge subsequent to the edge Q in clockwise
order, the above condition implies that the Thiessen polygon about a point C is
not convex, which has been proven impossible. All points P'' in the halfplane $E_2$
can therefore be excluded as candidates for the next neighbor in clockwise
direction of a closed Thiessen polygon. Likewise, points on the straight line
through CP can be excluded (Fig. 13): $P_1$ cannot be a neighbor of C if P exists,
$P_2$ cannot exist if P is known to be a neighbor of C, and $P_3$ would create an
open Thiessen polygon.
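A consequence of Lemma 2 is that, for a closed polygon, the clockwise angular gap between consecutive neighbors must be positive and less than 180 degrees. The sketch below checks this necessary condition for a given neighbor list. The function names are hypothetical, and a y-up coordinate system with clockwise angles measured from the positive x-axis is assumed.

```python
import math

def clockwise_angle(c, p):
    """Direction of vector c->p in degrees, clockwise positive, in [0, 360)."""
    return (-math.degrees(math.atan2(p[1] - c[1], p[0] - c[0]))) % 360.0

def clockwise_gaps(c, neighbors):
    """Angular gaps between consecutive neighbors in clockwise order."""
    ang = sorted(clockwise_angle(c, p) for p in neighbors)
    n = len(ang)
    return [(ang[(i + 1) % n] - ang[i]) % 360.0 for i in range(n)]

def may_be_closed(c, neighbors):
    """Necessary condition from Lemma 2: every gap lies strictly in (0, 180)."""
    return all(0.0 < g < 180.0 for g in clockwise_gaps(c, neighbors))
```

Four neighbors at the compass points around C give four gaps of 90 degrees and pass the check, while two neighbors a quarter turn apart leave a 270 degree gap and fail it.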
LITERATURE CITED
1. Bentley, J. L. “Divide and Conquer Algorithms for Closest Point Problems in Multidimensional
Space.” Ph.D. thesis, University of North Carolina at Chapel Hill, 1976.
2. Bentley, J. L., and M. I. Shamos. “Divide and Conquer in Multidimensional Space.” Proceedings
of the Eighth Annual ACM Symposium on Theory of Computing, Hershey, Pa., 1976, pp.
220-30.
3. Besag, J. “Spatial Interaction and the Statistical Analysis of Lattice Systems.” Journal of the
Royal Statistical Society (B), 36 (1974), 192-236.
4. Boots, B. M. “Contact Number Properties in the Study of Cellular Networks.” Geographical
Analysis, 9 (1977), 379-87.
5. Boots, B. M. “Delaunay Triangles: An Alternative Approach to Point Pattern Analysis.” Proceedings
of the Association of American Geographers, 6 (1974), 26-29.
6. Boots, B. M. “Some Models of the Random Subdivision of Space.” Geografiska Annaler, Ser. B, 55
(1973), 34-48.
7. Brassel, K. E. “Neighborhood Computations for Large Sets of Data Points.” Proceedings of the
International Symposium on Computer-Assisted Cartography (AUTO-CARTO II). Washington,
D.C.: American Congress on Surveying and Mapping, 1978, 337-45.
8. Brassel, K. E. “A Topological Data Structure for Multi-Element Map Processing.” Harvard Papers on
Geographic Data Structures, 4 (1978).
9. Crain, I. K. “The Monte Carlo Generation of Random Polygons.” Computers and Geoscience, 4
(1978), 131-41.
10. Delaunay, B. “Sur la sphere vide.” Bulletin of the Academy of Sciences of the USSR, Classe
Sci. Mat. Nat. (1934), 793-800.
11. Dobkin, D. P., and R. J. Lipton. “The Complexity of Searching Lines in the Plane.”
Department of Computer Science, Yale University, Research Report No. 71, 1975.