Geographic Information Systems in Transportation Research
Edited by
JEAN-CLAUDE THILL
Department of Geography and National Center
for Geographic Information and Analysis,
State University of New York at Buffalo
Emerald Group Publishing Limited
Howard House, Wagon Lane, Bingley BD16 1WA, UK
No part of this book may be reproduced, stored in a retrieval system, transmitted in any
form or by any means electronic, mechanical, photocopying, recording or otherwise
without either the prior written permission of the publisher or a licence permitting
restricted copying issued in the UK by The Copyright Licensing Agency and in the USA
by The Copyright Clearance Center. No responsibility is accepted for the accuracy of
information contained in the text, illustrations or advertisements. The opinions expressed
in these chapters are not necessarily those of the Editor or the publisher.
ISBN: 978-0-080-43630-2
TRANSPORTATION RESEARCH PART C
Preface
The idea for this book grew out of the difficulties encountered by graduate students enrolled in
my Transportation Modeling and GIS class in grasping the depth and breadth of the geographic
information systems (GIS) technology in transportation research. The long separate evolution of
research in the fields of transportation and GIS has finally reached the point where paths overlap
and continue in great unison. GIS has proven to be an approach to spatial research that
dramatically enhances efficiency and effectiveness. Sorely missing in this rapidly growing field of
application is a proper framing of the wealth of knowledge accumulated so far in applying GIS to
transportation research questions. This volume is precisely intended to fill this void, with the
ultimate goal of supporting the dissemination of recent research into graduate education and of
hastening technology adoption in the field of transportation. It appears to be the first original
book devoted exclusively to this topic.
This volume consists of 22 original papers solicited to represent the broad base of contemporary
research themes in transportation GIS. Forty-five scholars from America, Europe and
Asia have contributed their knowledge to produce a unique compilation of recent developments in
the field. The quality of this volume has been greatly enhanced by the unselfish professionalism
and dedication of 75 colleagues who advised me in the form of anonymous referee reports on
individual manuscripts.
Special arrangements with Elsevier Science Ltd. made possible the dual publication of all the
contributions in the form of the present book and as volume 8 of the periodical Transportation
Research Part C: Emerging Technologies. In spite of the stringent schedule imposed by this
arrangement, contributing authors showed great diligence throughout the entire process. Professor
Stephen Ritchie, Editor of Transportation Research Part C: Emerging Technologies, provided
valuable oversight and guidance in meeting the editorial standards of the journal. His support
during this effort has been remarkable. Finally, I am thankful for the patience, availability, and
dedication of the editorial staff at Elsevier Science Ltd., particularly Chris Pringle and Leighton
Chipperfield.
Jean-Claude Thill
Department of Geography and National Center for Geog. Info & Analysis,
State University of New York at Buffalo,
Buffalo, NY 14261, USA
E-mail address: [email protected]
0968-090X/00/$ - see front matter © 2000 Elsevier Science Ltd. All rights reserved.
PII: S0968-090X(00)00030-9
CONTENTS
K.J. Dueker and J.A. Butler  13  A geographic information system framework for transportation data sharing
J.C. Sutton and M.M. Wyman  37  Dynamic location: an iconic model to synchronize temporal and spatial transportation data
M.A.P. Taylor, J.E. Woolley and R. Zito  257  Integration of the global positioning system and geographical information systems for traffic congestion studies
S.M. Alexander and N.M. Waters  307  The effects of highway transportation corridors on wildlife: a case study of Banff National Park
R.L. Church and T.J. Cova  321  Mapping evacuation risk on transportation networks using a spatial optimization model
W.C. Frank, J.-C. Thill and R. Batta  337  Spatial decision support system for hazardous material truck routing
Y.-W. Huang, N. Jing and E.A. Rundensteiner  381  Optimizing path query performance: graph clustering strategies
445  Index
VOLUME I
Abstract
The late 1980s saw the first widespread use of Geographic Information Systems (GIS) in transportation
research and management. Due to the specific requirements of transportation applications and to the rather
late adoption of this information technology in transportation, research has been directed toward
enhancing existing GIS approaches to enable the full range of capabilities needed in transportation research
and management. This paper places the concept of transportation GIS in the broader perspective of
research in GIS and Geographic Information Science. The emphasis is placed on the requirements specific to
the transportation domain of application of this emerging information technology as well as on core
research challenges. © 2000 Elsevier Science Ltd. All rights reserved.
J.-C. Thill / Transportation Research Part C 8 (2000) 3-12
PII: S0968-090X(00)00029-2
1. Introduction
It is quite a paradox that the field of transportation has at last come to embrace Geographic
Information Systems (GISs) as a key technology to support its research and operational needs,
while some of the early pioneers of GIS at the University of Washington and Northwestern
University were in fact transportation scientists. Over three decades have passed since these
missed opportunities for cross-fertilization. GIS has since evolved and matured from a tool, to a
technology, and finally to a legitimate domain of scientific inquiry called Geographic Information
Science (Goodchild, 1992a). In the meantime, transportation has distanced itself from its
historical roots in geographic and spatial sciences, but has also become increasingly
multi-disciplinary, thus reflecting the multi-faceted reality of transportation infrastructure and flows and
movements of passengers and freight.
In the United States, the multi-disciplinary outlook of transportation has been fostered by
several key pieces of federal legislation passed in the 1990s (Meyer, 1999), including the Clean
Air Act Amendments, the Intermodal Surface Transportation Efficiency Act, the Americans with
Disabilities Act, and the Transportation Equity Act for the 21st Century. All four acts contained
explicit requirements for local and state governments to consider transportation systems through
their interdependence with other natural, social, or economic systems. From the new integrative
mission of transportation studies grew the need for enhanced approaches to store, manipulate,
and analyze data spanning multiple themes: for instance, highway infrastructure, peak-time
traffic flow, transit offering (fare, frequency, reliability), population ethnicity, work force
participation, and air quality. Offering a data management and modeling platform capable of
integrating a vast array of data from various sources, captured at different resolutions (street
segments, census tracts, traffic analysis zones, street corners, etc.), and on seemingly unrelated
themes, GIS has positioned itself as the ultimate information integration technology. In a GIS,
integration proceeds by referencing all objects to some common locational framework. With
proper conversion rules and algorithms, data stored in different scales, projections and data
models can be registered to the same underlying referencing framework. It can be argued that
the present adoption of GIS in transportation brings the field full circle as it is rediscovering
the primacy of space and place, two concepts that launched the systematic study of
transportation in Geography and Regional Science in the 1950s. The acronym GIS-T is often
employed to refer to the application and adaptation of GIS to research, planning, and
management in transportation.
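To make the idea of integration through a common locational framework concrete, the sketch below relates a point feature (a street corner) to the polygon (a census tract) that contains it, using a ray-casting point-in-polygon test. It is only an illustration and not a procedure described in this paper: it assumes all coordinates have already been converted to one planar reference system, and the tract identifiers, coordinates, and function names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]


def point_in_polygon(pt: Point, ring: List[Point]) -> bool:
    """Ray-casting test: cast a ray from pt toward +x and count edge crossings."""
    x, y = pt
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through pt can be crossed.
        if (y1 > y) != (y2 > y):
            # x coordinate at which the edge meets that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def assign_zone(pt: Point, zones: Dict[str, List[Point]]) -> Optional[str]:
    """Return the id of the first zone containing pt, or None if there is none."""
    for zone_id, ring in zones.items():
        if point_in_polygon(pt, ring):
            return zone_id
    return None


# Hypothetical census tracts, given as rings of (x, y) vertices in one planar frame.
zones = {
    "tract_A": [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)],
    "tract_B": [(10.0, 0.0), (20.0, 0.0), (20.0, 10.0), (10.0, 10.0)],
}
street_corner = (12.5, 4.0)
print(assign_zone(street_corner, zones))
```

Once each point carries a zone identifier, attributes recorded at the two resolutions can be joined through that identifier. Points lying exactly on a shared boundary are a known edge case that this sketch does not treat specially.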
In this paper, I give an overview of the nature of GIS and place its evolution in context. I also
discuss the specificity and requirements of GIS in transportation. Finally, some of the core
research themes in transportation GIS are highlighted.
GISs are computer-based systems for the capture, storage, manipulation, display, and analysis
of geographic information. The multiple functionality afforded by GIS distinguishes it from older
technologies. The integration of multiple functionalities within one rather seamless environment
spares users from mastering a collection of disparate and specialized technologies. As it turns
out, this aspect is often held by organizations as one of the decisive criteria in their decision to
adopt GIS technology because of its efficiency benefits.
The functional complexity of GIS is what makes it a system different from any other. Without
geo-visualization capability, the GIS is merely a database management engine endowed with some
power to extract meaningful relationships between data entities. Without analytical capability, GIS
would be reduced to an automated mapping application. Without database management features,
GIS would be unable to capture spatial and topological relationships between geo-referenced
entities if these relationships were not pre-defined.
What sets GIS apart from other database management systems (DBMS) is not the nature of the
information handled. Indeed GIS and DBMS may contain exactly the same information, say fatal
accidents occurring on New York state highways during a given year. The difference between the
two systems is "under the hood", namely in the way information is referenced. A DBMS
references accidents by some unique index or combination of indexes, such as the date of occurrence,
the vehicle make, or the weather conditions. By contrast, information is all about a geographic
description of the surface of the Earth in a GIS. Each accident record is a geographic event in the
sense that it is tied to a unique location defined in a given referencing framework (global, national
or local datum). With the spatial referencing of objects, topology of the data can be defined, which
in turn enables a host of spatial query operations on objects and sets of objects. For instance, the
task "identify all accidents that occurred within 100 meters of any intersection on urban arterials"
requires little effort because of the spatial indexing of all accident and roadway link objects in the
GIS databases.
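The query quoted above can be sketched with a simple uniform-grid spatial index. This is an illustrative assumption about how such an index might work, not the indexing scheme of any particular GIS product: coordinates are taken to be planar and in meters, and all identifiers and sample locations are hypothetical.

```python
import math
from collections import defaultdict

CELL = 100.0  # grid cell size in meters (assumed equal to the default search radius)


def cell_of(x, y):
    """Grid cell (column, row) containing the point (x, y)."""
    return (int(math.floor(x / CELL)), int(math.floor(y / CELL)))


def build_index(points):
    """Map each grid cell to the ids of the points that fall inside it."""
    index = defaultdict(list)
    for pid, (x, y) in points.items():
        index[cell_of(x, y)].append(pid)
    return index


def accidents_near_intersections(accidents, intersections, radius=100.0):
    """Sorted ids of accidents lying within `radius` meters of any intersection."""
    index = build_index(intersections)
    reach = int(math.ceil(radius / CELL))  # how many neighboring cells to inspect
    hits = set()
    for aid, (ax, ay) in accidents.items():
        cx, cy = cell_of(ax, ay)
        for dx in range(-reach, reach + 1):
            for dy in range(-reach, reach + 1):
                for iid in index.get((cx + dx, cy + dy), ()):
                    ix, iy = intersections[iid]
                    if math.hypot(ax - ix, ay - iy) <= radius:
                        hits.add(aid)
    return sorted(hits)


# Hypothetical sample data (x, y in meters).
intersections = {"I1": (0.0, 0.0), "I2": (500.0, 500.0)}
accidents = {"a1": (30.0, 40.0), "a2": (300.0, 300.0), "a3": (450.0, 520.0)}
print(accidents_near_intersections(accidents, intersections))
```

Only intersections in the grid cells surrounding each accident are compared, which is what keeps the query cheap as the data set grows. An attribute-indexed DBMS has no comparable shortcut for a distance condition and would have to compare every accident with every intersection.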
The concept of GIS traces its roots back to a handful of research initiatives in the US,
Canada and Europe during the late 1950s. It is widely acknowledged that the first real GIS was
the Canada Geographic Information System set up for the Canada Land Inventory. The reader
will find complete histories of GIS in Coppock and Rhind (1991) and Foresman (1998). It
suffices to say here that the development of the concept and its implementations is closely
associated with the requirements of land information systems. Early transportation applications
of GIS were few and unable to create a momentum in GIS research sufficient to remediate the
known limitations of the technology in handling transportation data, in interfacing with
complex analytical network-based models, and in fitting into existing enterprise-wide business models.
Even the US Bureau of the Census's Dual Independent Map Encoding (DIME) system - precursor
of the Topologically Integrated Geographic Encoding and Referencing (TIGER) system
- does not qualify as a proper effort to enhance transportation GIS because of its crude
topology.
A GIS is a spatial representation, or model, of the data used to depict a portion of the surface
of the earth (Frank, 1992). In the transportation context, three classes of GIS models are relevant
(Goodchild, 1992b, 1998).
• Field models, or representation of the continuous variation of a phenomenon over space.
Terrain elevation uses this model.
• Discrete models, according to which discrete entities (points, lines or polygons) populate space.
Highway rest areas, toll barriers, urbanized areas may use this model.
• Network models to represent topologically connected linear entities (such as roads, rail lines, or
airlines) that are fixed in the continuous reference surface.
While all three models may be useful in transportation, the network model built around the
concepts of arc and node plays the most prominent role in this application domain because single-
and multi-modal infrastructure networks are vital in enabling and supporting passenger and
freight movement. In fact, many transportation applications only require a network model to
represent data. Examples of such applications include:
• pavement and other facility management systems;
• real-time and off-line routing procedures, including emergency vehicle dispatching and traffic
assignment in the four-step urban transportation planning process;
• web-based traffic information systems and trip planning engines;
• in-vehicle navigation systems;
• real-time congestion management and accident detection.
The previous section has identified the main data models of GIS. It also stressed that a common
trait of research on, and with, GIS-T is its reliance on the network data model, at times to the
exclusion of any other data model. This is not to say that other domains of application have no
use for network representations, but when used at all, networks play a rather peripheral role. The
network model is elegantly simple, yet functional. With its arc and node structure, it represents the
one-dimensional network object in reference to the two- (or three-) dimensional surface of the
earth. Arcs and nodes themselves are primitives of the discrete entity model. Their locational
referencing is absolute, usually two-dimensional: x and y coordinates, longitude/latitude, for
example.
Once proper topology has been defined on the network, the network model supports basic as
well as advanced forms of network analysis (Waters, 1999; Souleyrette and Strauss, 2000), from
location-allocation modeling to vehicle routing and scheduling and traffic assignment, and finally
to network connectivity optimization and design. Network-based GIS thus enables the study of
flows and movements, which lies at the core of transportation research.
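As a minimal illustration of how an arc-node structure supports network analysis, the sketch below derives node-to-node topology from a list of arcs and then runs a standard shortest-path search (Dijkstra's algorithm) over link impedances. The node names, coordinates, impedance values, and the assumption that every arc can be traveled in both directions are hypothetical choices made for the example; nothing here is specific to a particular GIS package.

```python
import heapq
from collections import defaultdict

# Nodes carry absolute (x, y) locational references; shown for completeness,
# the routing below relies on topology and impedance only.
nodes = {"A": (0.0, 0.0), "B": (1.0, 0.0), "C": (1.0, 1.0), "D": (2.0, 1.0)}

# Arcs: (from_node, to_node, impedance in minutes). Values are made up.
arcs = [("A", "B", 4.0), ("B", "C", 3.0), ("A", "C", 9.0),
        ("C", "D", 2.0), ("B", "D", 8.0)]


def build_topology(arcs):
    """Derive node adjacency from the arc list (arcs assumed two-way)."""
    adjacency = defaultdict(list)
    for u, v, cost in arcs:
        adjacency[u].append((v, cost))
        adjacency[v].append((u, cost))
    return adjacency


def shortest_path(adjacency, origin, destination):
    """Dijkstra's algorithm; returns (node sequence, total impedance)."""
    best = {origin: 0.0}
    previous = {}
    queue = [(0.0, origin)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == destination:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry, a cheaper route was already found
        for nxt, arc_cost in adjacency[node]:
            new_cost = cost + arc_cost
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                previous[nxt] = node
                heapq.heappush(queue, (new_cost, nxt))
    if destination not in best:
        return None, float("inf")
    path = [destination]
    while path[-1] != origin:
        path.append(previous[path[-1]])
    return path[::-1], best[destination]


adjacency = build_topology(arcs)
print(shortest_path(adjacency, "A", "D"))
```

Vehicle routing, traffic assignment, and location-allocation models are considerably more involved, but many of them repeatedly call a shortest-path routine of this kind on the same arc-node structure.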
Land information systems and early population census information systems have not conceived
of roadways, railways and other transportation infrastructure "as features for analysis in and of
themselves" (O'Neill and Harper, 2000) because transportation lines, like other linear features -
most singularly streams - primarily serve to delineate polygons. In transportation research,
however, there is a compelling need for attributing infrastructure lines. This is in line with the
primary mission of transportation agencies to be custodian of the transportation infrastructure in
their jurisdiction, and maintain it in good operating condition (Petzold and Freund, 1990).
Furthermore, most models of network analysis mentioned above incorporate some measure of
travel impedance on each link of the network, while some of them also use link-specific traffic
capacity attributes. It is also well known that the external validity of many models is greatly
enhanced by a better representation of traffic conditions at nodes on the network (at-grade
intersections, freeway entrances or exits). Nodes are remarkable locations on the network where
various restrictions to movement may exist and delay often develops in relation to the mixing of
traffic streams. Node attributes may entail a rather elaborate description of an intersection by
traffic priorities, the presence of traffic signals, their timing and phase, among others.
So far, the implicit assumption has been that network links are homogeneous. This may hold
true in some systems, but not in others. Number of lanes, pavement width, pavement condition,
and posted speed are but a few of the attributes that cannot be constrained to be constant between
terminal nodes of a link. Similarly, on the National Highway System, traffic parameters of speed,
flow and capacity cannot be expected to be constant between widely spaced junctions. The
dynamic nature of these distributed attributes of the network makes it impractical to edit the
network permanently so as to maintain the homogeneity of each link on each attribute. Instead an attribute
can be viewed as a spatial (linear) event occurring on the network. The variation of the attribute
can be referenced to discrete locations measured by relative positions on a linear feature belonging
to the network. In this approach, attributes are linearly referenced and linked dynamically to the
entities forming the network (Scarponcini, 1999). Early research in GIS in transportation led
Dueker (1987), Fletcher (1987), and Vonderohe et al. (1993) to identify the critical need for this
capability in GIS-T. Traffic accidents, bridges, traffic signs, and other zero-dimensional events can
also take advantage of linear referencing systems for linking to the one-dimensional
transportation infrastructure.
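The two uses of linear referencing described above can be sketched as follows: converting a measure along a route into planar coordinates (for point events such as accidents), and looking up an attribute stored as a linear event with a from-measure and a to-measure. This is a simplified illustration under stated assumptions, not the generalized model of Scarponcini (1999): the route geometry, the measures, the speed-limit values, and all function names are hypothetical, and consecutive vertices are assumed to be distinct.

```python
import math
from bisect import bisect_right

# Hypothetical route geometry: polyline vertices in meters.
route = [(0.0, 0.0), (300.0, 0.0), (300.0, 400.0)]


def cumulative_measures(vertices):
    """Distance from the start of the route to each vertex."""
    measures = [0.0]
    for (x1, y1), (x2, y2) in zip(vertices, vertices[1:]):
        measures.append(measures[-1] + math.hypot(x2 - x1, y2 - y1))
    return measures


def locate(vertices, measure):
    """Convert a linear measure (distance from route start) into (x, y)."""
    measures = cumulative_measures(vertices)
    if not 0.0 <= measure <= measures[-1]:
        raise ValueError("measure lies outside the route")
    # Index of the segment containing the measure (last segment at the route end).
    i = min(bisect_right(measures, measure) - 1, len(vertices) - 2)
    t = (measure - measures[i]) / (measures[i + 1] - measures[i])
    (x1, y1), (x2, y2) = vertices[i], vertices[i + 1]
    return (x1 + t * (x2 - x1), y1 + t * (y2 - y1))


# Linear events: (from_measure, to_measure, value). The posted speed changes at
# measure 250 m, which is not a vertex or node of the route.
speed_limit_events = [(0.0, 250.0, 50), (250.0, 700.0, 70)]


def attribute_at(events, measure):
    """Value of a linearly referenced attribute at a given measure, or None."""
    for start, end, value in events:
        if start <= measure < end:
            return value
    return None


print(locate(route, 500.0))                    # position of an event 500 m along the route
print(attribute_at(speed_limit_events, 100.0))
```

Because the speed-limit table is separate from the geometry, a change in posted speed only requires editing an event record; the link itself does not need to be split.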
Though the basic network data model is already a domain-specific departure from conventional
GIS data modeling, it does not suffice to handle the complexity embedded in transportation
network data. As pointed out by Goodchild (1998), extensions are needed to handle particular
structures. The following three meaningful extensions were recognized by Goodchild (1998).
• Planar versus non-planar model, wherein topological representation differs from cartographic
representation by not forcing nodes at cartographic intersections. Non-planarity allows the
representation of freeway overpasses as well as of turn prohibitions. Navigational databases must
conform to a non-planar model.
• Turn tables contain properties of the turn between any pair of links connected on the network.
Properties can be binary (allowed, disallowed), or cardinal measurements (for instance,
expected delay through an intersection).
• Links are objects formed of traffic lanes. A structure allowing for this object-oriented view of
the infrastructure needs to define topology between lanes. It may store attributes for individual
lanes.
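The turn-table extension listed above can be illustrated with a small lookup structure keyed by pairs of links. The sketch is hypothetical in every detail (link identifiers, travel times, delays) and makes one modeling assumption worth stating: a pair of links absent from the table is treated as not connected, while an entry of None marks a turn that exists physically but is prohibited.

```python
# Link travel times in minutes (hypothetical values).
link_time = {"L1": 2.0, "L2": 3.0, "L3": 1.5, "L4": 2.5}

# Turn table keyed by (from_link, to_link).
# A number is the expected delay in minutes; None means the turn is prohibited.
turn_table = {
    ("L1", "L2"): 0.5,
    ("L1", "L3"): None,   # e.g. a banned left turn
    ("L2", "L4"): 0.0,
    ("L3", "L4"): 1.0,
}


def path_time(links, link_time, turn_table):
    """Total time over a sequence of links, or None if any turn is not allowed."""
    total = 0.0
    for i, link in enumerate(links):
        total += link_time[link]
        if i + 1 < len(links):
            turn = (link, links[i + 1])
            if turn not in turn_table or turn_table[turn] is None:
                return None  # links not connected, or turn prohibited
            total += turn_table[turn]
    return total


print(path_time(["L1", "L2", "L4"], link_time, turn_table))
print(path_time(["L1", "L3", "L4"], link_time, turn_table))
```

A routing algorithm that honors such a table must search over links, or over link-to-link movements, rather than over nodes alone, since the cost of reaching a node then depends on the link by which it was entered.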
Certainly the need for these and other extensions to the base network model is not universal. It
can be motivated by the resolution and the geographic scale of the representation. Ultimately,
need is dictated by the specific GIS-T application. The representation afforded by extensions of
this kind is essential for the development of navigational databases. On the other hand, it is most
likely to be superfluous in a task of transportation planning for an entire metropolitan area.
To sum up, GIS in transportation is more than just one more domain of application of generic
GIS functionality. GIS-T has several data modeling, data manipulation, and data analysis
requirements that are not fulfilled by conventional GIS. The final report of NCHRP Project 20-27
(Vonderohe et al., 1993) theorizes GIS-T as the product of the cross-fertilization of an enhanced
GIS and an enhanced Transportation Information System (TIS). See Fig. 1. To quote Vonderohe
et al. (1993), "the necessary enhancement to existing TISs is the structuring of the attribute
databases to provide consistent location reference data in a form compatible with the GIS, which
in turn has been enhanced to represent and process geographic data in the forms required for
transportation applications" (p. 11).
Fig. 1. GIS-T, product of an enhanced GIS and an enhanced TIS. After Vonderohe et al. (1993).
The current flurry of research activity in and around GIS-T is a clear sign of the interest of
transportation researchers and professionals in this still emerging technology. Some of the new
trends discernible in state-of-the-art research in GIS-T merely echo the transformations of GIS
per se. Some of these transformations are technologically motivated (Fletcher, 2000). Others are
part of an agenda set forth by the University Consortium for Geographic Information Science
(UCGIS) to strengthen the scientific basis of this emerging discipline born of GIS technology.
In 1997, UCGIS outlined a research agenda composed of research priorities in ten areas. These
priorities are listed in Table 1. The state of research in GIS-T with respect to each of the UCGIS
priorities has recently been compiled by Wiggins et al. (2000). This section points to selected
research themes and challenges that are particularly salient in contemporary GIS-T research.
Research themes and challenges may manifest themselves differently on different functional
aspects of GIS. It is imperative therefore to discuss them in a functional framework. For the sake
of the exposition, GIS functionality is here organized in relation to the level of intensity of data
processing involved. A commonplace framework derived from this line of thought identifies three
functional groups: data management, which concerns storage and retrieval of data; data
manipulation, which refers to the creation of new data out of raw data; and data analysis or
analytical modeling. See McCormack and Nyerges (1997) for a similar framework in the context of
GIS-T. Interestingly, requirements associated with each group are not independent. As data
manipulation requires data storage, and modeling is built on the latter two, requirements and
challenges are cumulative. This hierarchical view of functionality is depicted in Fig. 2. This logic is
also followed in the organization of the contributions included in this volume dedicated to GIS in
transportation research. Many of the themes introduced in the rest of this paper are further
developed in these contributions.
Table 1
UCGIS Research Priorities for Geographic Information Science
(1) Spatial data acquisition and integration
(2) Distributed computing
(3) Extensions to geographic representation
(4) Cognition of geographic information
(5) Interoperability of geographic information
(6) Scale
(7) Spatial analysis in a GIS environment
(8) The future of the spatial information infrastructure
(9) Uncertainty in spatial data and GIS-based analyses
(10) GIS and society
Fig. 2. Hierarchical model of data management, data manipulation, and data analysis functional groups.
The interoperability issue is quickly becoming one of the most pressing themes in GIS-T as
geo-referenced data find their way into the market place. Detailed digital street databases populate
routing and dispatching systems for emergency services, and vehicle navigation systems
available to the general public and to fleets of commercial vehicles. Elements of Intelligent
Transportation Systems (ITS) that involve wireless communication between motorists and a
traffic control center or information service provider necessitate unambiguous identification of the
motorist locations within a reasonable range of accuracy. The agenda set forth above will
contribute to enabling new generations of wireless information services.
Geo-referenced data are increasingly collected as part of a continuous process rather than at a
few pre-set moments in time. A need has also emerged for accessing data on a real-time basis. For
instance, continuous streams of traffic data from vehicles carrying toll transponders on parts of
New York state's freeway system are fed into computational algorithms for early accident
detection. In other metropolitan regions, probe vehicles equipped with a Global Positioning System
(GPS) device provide speed data to the Traffic Management Center, which in turn disseminates
congestion information and forecasts to wireless information service providers, thus fitting in the
area's Congestion Management System. These (quasi) real-time traffic data are also a primary input
of World Wide Web applications discussed below. Real-time data storage, retrieval, processing,
and analysis are presently not meeting the needs of society when it comes to geo-referenced data.
Quicker access data models, and more powerful spatial data fusion techniques and dynamic
routing algorithms are needed to take advantage of real-time traffic information.
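One small piece of such a real-time pipeline can be sketched as a sliding-window average of probe-vehicle speeds per network link. The class below is an illustrative assumption, not a description of any operational Traffic Management Center system: the window length, the link identifier, and the requirement that reports for a link arrive in time order are all choices made for the example.

```python
from collections import defaultdict, deque


class LinkSpeedMonitor:
    """Keeps recent probe speed reports per link and reports a windowed mean."""

    def __init__(self, window_s=300.0):
        self.window_s = window_s
        self.reports = defaultdict(deque)  # link_id -> deque of (time_s, speed_kmh)

    def add_report(self, link_id, timestamp_s, speed_kmh):
        """Store one probe report; reports per link are assumed to arrive in time order."""
        queue = self.reports[link_id]
        queue.append((timestamp_s, speed_kmh))
        self._expire(queue, timestamp_s)

    def _expire(self, queue, now_s):
        """Drop reports older than the window."""
        while queue and now_s - queue[0][0] > self.window_s:
            queue.popleft()

    def mean_speed(self, link_id, now_s):
        """Mean speed over the window ending at now_s, or None without fresh data."""
        queue = self.reports.get(link_id)
        if not queue:
            return None
        self._expire(queue, now_s)
        if not queue:
            return None
        return sum(speed for _, speed in queue) / len(queue)


monitor = LinkSpeedMonitor(window_s=300.0)
monitor.add_report("link_17", 0.0, 100.0)
monitor.add_report("link_17", 200.0, 60.0)
monitor.add_report("link_17", 400.0, 20.0)   # the report from t = 0 has now expired
print(monitor.mean_speed("link_17", 400.0))
```

A windowed mean of this kind could feed link impedances to a dynamic routing algorithm. Fusing several data sources, or detecting incidents from abrupt speed drops, would require considerably more than this sketch.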
Real-world transportation problems tend to involve large amounts of geo-referenced data and
large networks. Visualization techniques on which GIS mapping is based are inherited from an
age when data were not abundant. GIS-T will benefit from an evolution in Geographic
Information Science research towards close integration of geo-visualization principles and
computational methods of knowledge discovery and data mining. As the latter are still very much in their
infancy, no tangible outcome should be expected in the near future. With GIS-T, the complexity is
compounded by the difficulty of visualizing information on the single dimension of a network.
The sheer size of transportation data sets often requires innovative system designs that manage
both to optimize speed and accuracy of the display of information and to optimize the run time of
algorithms and analytical tools of flow and network analysis.
The connectivity offered by Internet technology has transformed the relationship between
the computer, the software application, the data, and the user. Computing has emerged as a
mobile, distributed, and ubiquitous reality. Web-based GIS applications have become
commonplace, including in the domain of transportation. Real-time transit route and schedule
information, road construction, and traffic information are examples of applications that are currently
available. Remaining challenges revolve around bringing to the Internet client-server environment
the power of desktop GIS-T. This entails the development of more powerful and robust analytical
tools to fit the limited distributed computing resources and limited bandwidth on communication
networks. Also, system architectures will need to be judiciously designed to make efficient use of
local and remote computing resources.
The future - no longer distant - of mobile computing is with Internet-enabled Personal Digital
Assistants (PDA), Personal Navigation Assistants (PNA), and other on-board computing devices.
All the issues brought up earlier in this section are considerably magnified in this setting due to the
more severe constraints on bandwidth and local computing resources. An issue re-emerging in this
context is that of geo-referencing of remote service users and tracking of their movement in real
time.
5. Conclusions
The late 1980s saw the first widespread use of GIS in transportation research and management.
Due to the specific requirements of transportation applications and to the rather late adoption of
this information technology in transportation, research has been directed toward enhancing
existing GIS approaches to enable the full range of capabilities needed in transportation research
and management.
This paper placed the concept of transportation GIS in the broader perspective of research in
geographic information systems and Geographic Information Science. The emphasis was placed
on the requirements specific to the transportation domain of application of this emerging
information technology. The paper concluded with a synopsis of dominant themes in current research
in GIS for transportation. Successful pursuit of this agenda should solidify the position of GIS as
an integrative system for transportation research and management.
References
Coppock, J.T., Rhind, D.W., 1991. The history of GIS. In: Maguire, D.J., Goodchild, M.F., Rhind, D.W. (Eds.),
Geographical Information Systems: Principles and Applications, vol. 1. Longman, Harlow, UK, pp. 21-43.
Dueker, K.J., 1987. Geographic information systems and computer-aided mapping. Journal of the American Planning
Association 53, 383-390.
Fletcher, D.R., 1987. Modelling GIS transportation networks. In: Proceedings of the 25th Annual Conference of the
Urban and Regional Information Systems Association, Fort Lauderdale, pp. 84-92.
Fletcher, D.R., 2000. GIS-T in the new millennium - A look forward. In: Transportation in the New Millennium,
Transportation Research Board, Washington, DC (CD-Rom).
Foresman, T.W., 1998. The History of Geographic Information Systems: Perspectives from the Pioneers. Prentice-Hall,
Upper Saddle River, NJ.
Frank, A.U., 1992. Spatial concepts, geometric data models, and geometric data structures. Computers and Geosciences
18, 409-417.
Goodchild, M.F., 1992a. Geographic information science. International Journal of Geographic Information Systems 6,
31-45.
Goodchild, M.F., 1992b. Geographical data modeling. Computers and Geosciences 18, 401-408.
Goodchild, M.F., 1998. Geographic information systems and disaggregate transportation planning. Geographical
Systems 5, 19-44.
McCormack, E., Nyerges, T., 1997. What transportation modeling needs from a GIS: a conceptual framework.
Transportation Planning and Technology 21, 5-23.
Meyer, M.D., 1999. Transportation planning in the 21st Century. TR News 204, 15-22.
O'Neill, W., Harper, E.A., 2000. Implementation of linear referencing systems in GIS. In: Easa, S., Chan, Y. (Eds.),
Urban Planning and Development Applications of GIS. American Society of Civil Engineers, Reston, VA, pp. 79-98.
Petzold, R.G., Freund, D.M., 1990. Potential for geographic information systems in transportation planning and
highway infrastructure management. Transportation Research Record 1261, 1-9.
Scarponcini, P., 1999. Generalized model for linear referencing. In: Advances in Geographic Information Systems,
Proceedings of the Seventh International Symposium, ACM GIS'99, ACM, pp. 53-59.
Souleyrette, R.R., Strauss, T.R., 2000. Transportation. In: Easa, S., Chan, Y. (Eds.), Urban Planning and Development
Applications of GIS. American Society of Civil Engineers, Reston, VA, pp. 117-132.
Vonderohe, A.P., Travis, L., Smith, R.L., Tsai, V., 1993. Adaptation of geographic information systems for
transportation. National Cooperative Highway Research Program Report 359, Transportation Research Board,
Washington, DC.
Waters, N.M., 1999. Transportation GIS: GIS-T. In: Longley, P.A., Goodchild, M.F., Maguire, D.J., Rhind, D.W.
(Eds.), Geographic information systems, vol. 2: Management Issues and Applications. Wiley, New York, pp. 827-844.
Wiggins, L., Dueker, K., Ferreira, J., Merry, C., Peng, Z.-R., Spear, B., 2000. Application challenges for geographic
information science: Implications for research, education and policy for transportation planning and management.
Journal of the Urban and Regional Information Systems Association 12 (2), 51-59.
Abstract
Keywords: GIS-T; Data sharing; Data models; Linear locational referencing systems
0968-090X/00/$ - see front matter © 2000 Elsevier Science Ltd. All rights reserved.
PII: S0968-090X(00)00006-1
K.J. Dueker, J.A. Butler / Transportation Research Part C 8 (2000) 13-36
1. Introduction
Data sharing is more than simply having the ability to occasionally import information from
someone else's database. The business needs driving dynamic data sharing include those of
multi-agency local government GIS infrastructures, where E-911 (emergency dispatch), public works,
and property assessment organizations need to use a common database. The property assessor
receives a new subdivision plat on which recently constructed streets are shown. The E-911 center
needs to know where the new streets are located; the building inspector needs to know all the new
lot addresses; and the public works department will need to establish new trash pick-up routes and
pavement management segments. Somehow, the information on the plat has to be communicated
accurately, efficiently, and quickly to these many users, each with a unique linear and/or non-linear
location referencing system.
The reality in most cases is that the enterprise approach to GIS is late on the scene, with the
different agency-specific GIS applications having been developed in isolation. Until these systems
come to use a single enterprise-wide data infrastructure, the agencies will need to exchange data
sets frequently. This paper attempts to set the stage for establishing comprehensive transportation
data exchange mechanisms by describing a way to integrate them in an enterprise GIS-T data model.
Sharing GIS-T data is both an important issue and a difficult one. It is important because there
are many organizations that produce or use GIS-T data; it is difficult because there are many ways
to segment and cartographically represent transportation system elements. There is a lack of
agreement among transportation organizations in defining transportation objects and in the
spatial accuracy with which they are represented cartographically. This lack of agreement leads to
difficulty in conflating or integrating two views of the same or adjacent linear objects.²
There are two problems in defining transportation objects: different definitions of roads and
different criteria with which to break roads into logical segments. The logical segments become
objects in the database that we will refer to as "transportation features". We have selected this
term in order to include more than just roads. Roadways, railroads, transit systems, shipping
lanes, and air routes are all linear features that use the same basic network data model, which
relies on linear travel paths between points of intersection. Since they all use the same basic data
model, we will generally restrict our discussion to roadways for simplicity.
Transportation features become the building blocks for specific applications. Persons building
vehicle navigation databases need to include private roads that are open for public use. "Paper
streets", those which are not yet constructed and that cannot be navigated, should be omitted.
Yet public organizations responsible for road maintenance follow different rules. They omit
private roads and include planned public roads on their maps. Similarly, two organizations
responsible for roads on resource lands, the Forest Service and the Bureau of Land Management,
have quite different definitions of roads.³ Most organizations that maintain databases of roads
break them into logical segments to create discrete transportation features according to some
business interest, such as a change of pavement type, jurisdiction, or functional type, or at all
intersections.

² Sperling and Sharp (1999) describe conflation of US Bureau of the Census TIGER files with local street centerline
files as the "automatic matching and transfer of features and attributes from one geo-spatial database into another".
³ The Forest Service defines roads as any visible track, while the BLM limits roads to tracks that can be traversed by a
normal vehicle.
These differences in the original purpose of transportation databases create a difficult arena for
sharing data with others. The data sharing arena includes data producers, data users, and,
increasingly, data integrators who collect data from the field, legacy databases, or other data
producers or users and reorganize it for new uses and/or to maintain currency. A healthy data
sharing environment suggests that data producers embed registration points and feature
identifiers in their original data to facilitate importing and registration of foreign or legacy
cartography and attribute data.
The need is for standards for data sharing among organizations, both public and private.
However, standards are difficult to develop because the system requirements of advanced applications
of GIS-T technology differ in spatial and temporal accuracy and in the detail of such features as ramps
and lanes. Further, differing levels of real-time use of systems dictate the response requirements of
databases and interfaces.
A more systematic approach to GIS-T data sharing calls for relief from the conflation technique,
which requires the matching of spatial objects of separate data sets. It is a problematic process
due to the need to simultaneously match both topological and geometric properties. This is
exemplified by efforts to transfer TIGER attributes to more accurate vector files (Brown et al.,
1995; Tomaselli, 1994). This approach to matching spatial data of similar scale works well if the
data were captured using the same data model or criteria by which to define and segment roads.
If roads are not segmented into similar spatial objects, then conflation is not a satisfactory way
to share transportation data. Some (Sester et al., 1998; Walter and Fritsch, 1999) work toward
automating the conflation process, while Devogele et al. (1998) call for developing an integrated
schema from data models to facilitate data sharing and/or interoperability.
In addition, sharing of transportation data is not a one-time issue. In a larger context, it is a
means of disseminating data about changes in the transportation system. Management of updates
in order to maintain current representations of transportation systems is a growing concern. Until
now, the effort has been on building the initial database. Attention is now turning to
maintaining the currency of databases of increasing detail and complexity. The GIS-T community is
in need of guidance on this issue. We see neither sufficient progress on the conflation approach
nor adequate consensus on schema integration to support the wide range of applications of GIS-T
data.
Sester et al. (1998) identify an alternative to the bottom-up approach of conflation: a top-down
approach known as semantic data integration. While maintaining the richness of attribute detail
along features by linear referencing, our application of this top-down approach to GIS-T requires
the aggregation of spatial object primitives into larger (longer) transportation features that can
and need to be uniquely identified. Similarly, the transportation features in our enterprise GIS-T
data model are not topological spatial objects. This enables the building of application-specific
networks from selections of a single and consistent set of underlying data. Our enterprise GIS-T
data model also allows for multiple spatial representations to accommodate the need for both
abstract arterial-level data and detailed representations of roads, including freeway ramps and
local streets.
Consequently, our enterprise GIS-T data model falls somewhere between one that specifies an
end product, like TIGER or GDF, and a data exchange standard, like SDTS or DIGEST. "Between"
implies that it is more than a neutral standard and more than a format for a database to
support a specific application. We call it an enterprise GIS-T data model to convey the notion of a
standard approach to maintaining business data in a transportation organization about the
transportation system for which it is responsible. One master set of transportation features
enables maintenance of current knowledge about the system. Selection of transportation features
by type and from the available multiple geometric representations enables building of a number of
application-specific networks for functions such as vehicle navigation and emergency, pavement,
bridge, and congestion management.
political boundaries will require a splitting of topological edges. This increases the difficulty of
sharing data among networks that are separately maintained. Alternatively, newer GIS
software, such as ArcInfo 8, is based on data models that break this limitation by allowing
vector data to be non-topological line features or topological features. Our enterprise GIS-T
data model builds on that principle and separates transportation features from topological
links. It is an intermediate form from which databases to support applications can be
generated.
feature, not to the linear datum as in the NCHRP 20-27 data model. Making the linear datum the
principal entity requires transferring topology to the datum and thence to the cartography. Using
the NCHRP 20-27 data model approach, a dynamic segmentation process to display transportation
feature attributes requires conversion from network measures to node offsets and then to
anchor point offsets.
There are missing elements in both approaches that are being addressed in the NCHRP
consensus data modeling effort. One is the problem of treating time and the third spatial
dimension more explicitly in the data model. The other is a problem with GIS data models in
general that becomes paramount in GIS-T. In traditional GIS, a spatial object is defined by its
location. Consequently, there is no suitable way to represent a moving object, like a vehicle,
package shipment, or storm, in such a GIS. There needs to be a new dynamic tracking or
moving object class in GIS, especially in a GIS-T. There are three approaches. One is a static
object with frequently changing positions. Another is a new object class with location as an
attribute rather than part of the definition. Yet another is a moving object construct with a
starting location and attributes of direction, speed, and destination. Emerging object-oriented
GIS platforms offer a way to do this by treating location as an attribute of an object.
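The third approach listed above, a moving object construct defined by a starting location plus direction and speed, can be illustrated with a minimal sketch. It is not part of the data model described in this paper: the class name MovingObject, its fields, the planar coordinates in metres, and the straight-line travel assumption are all hypothetical, and the destination attribute is omitted for brevity.

```python
import math
from dataclasses import dataclass
from typing import Tuple


@dataclass
class MovingObject:
    """Hypothetical moving-object construct: position is derived, not stored."""
    object_id: str
    start_x: float      # starting easting in metres (planar coordinates assumed)
    start_y: float      # starting northing in metres
    heading_deg: float  # direction of travel, degrees clockwise from north
    speed_mps: float    # speed in metres per second

    def position_at(self, elapsed_s: float) -> Tuple[float, float]:
        """Return the (x, y) position after elapsed_s seconds of straight-line travel."""
        distance = self.speed_mps * elapsed_s
        heading = math.radians(self.heading_deg)
        return (self.start_x + distance * math.sin(heading),
                self.start_y + distance * math.cos(heading))


# A truck heading due east at 20 m/s, observed after one minute.
truck = MovingObject("truck-1", 0.0, 0.0, heading_deg=90.0, speed_mps=20.0)
x, y = truck.position_at(60.0)
```

Because the location is computed from attributes on request, the object itself never has to be redefined as it moves, which is the point of treating location as an attribute rather than as part of the definition.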
Section 5 presents the enterprise GIS-T data model. The model brings consistency to the
representation of transportation data and provides guidance to the data sharing participants. The
model also provides the basis for the subsequent development of data sharing principles.
In previous works, Dueker and Butler provide a data model well suited for GIS-T data sharing.
It unbundles the geometry, topology, and attributes to facilitate separate maintenance and enables
extraction of data for different uses. The original paper provided a general introduction to
information system design by offering tutorial appendices on such topics as user requirements
analysis, data modeling, and business rule construction. The tutorial information was offered to
enhance the reader's understanding of the transportation data model described in the main body
of the paper. That model was proposed as a comprehensive description of the entire scope of
transportation data that might be housed in a State Department of Transportation. While the
business rules on which the model was founded were presented, the accommodation of
implementation details was the main thrust of the work.
One unfortunate outcome of that paper was the general difficulty readers had in taking in the
big picture of a comprehensive data model and the myriad database tables needed to implement
it. Reader comments on that paper have motivated us to create a new version, contained herein,
that is based on the formal business rules upon which our enterprise GIS-T data model is
founded. These business rules provide the basis for the ensuing discussion of GIS-T data sharing
issues.
The simplified version of the enterprise GIS-T data model is shown in Fig. 1.
Entities, the things about which we wish to store information, are shown in boxes, while the
relationships between entities are shown using lines. Each entity type has been identified by a
special style. Each relationship type has been shown using descriptive text and connection
symbols. Each group of entities for a single type can be treated as a stand-alone data set.
An entity is a discrete part of our world, one for which we want to store information. An entity
is a basic building block of data models. Entities are related to each other through verb-oriented
statements, such as, "A jurisdiction defines one or more transportation features", and its
corollary, "A transportation feature must be defined in the context of a single jurisdiction".
Relationships are the other building blocks of data models. A relationship without explicitly stated
verbs generally can be read as an ownership relationship, such as, "A base map string has one or
more line segments". Attributes are part of the entities, not separate entities. For example, a line
segment may have width and color attributes, which are aspects of the line segment entity, not
separate entities owned by line segment.
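As a rough illustration of the distinction between entities, relationships, and attributes, the sketch below encodes the ownership statement "A base map string has one or more line segments", with width and color held as attributes of the line segment. The class and field names are hypothetical and chosen only for this example; they are not taken from Fig. 1.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class LineSegment:
    """Entity: one straight piece of cartography."""
    start: Tuple[float, float]
    end: Tuple[float, float]
    width: float = 1.0    # attribute of the entity, not a separate entity
    color: str = "black"  # attribute of the entity, not a separate entity


@dataclass
class BaseMapString:
    """Entity that owns one or more LineSegment entities."""
    string_id: str
    segments: List[LineSegment]

    def __post_init__(self) -> None:
        # Enforce the "one or more" side of the ownership relationship.
        if not self.segments:
            raise ValueError("a base map string must have at least one line segment")


main_street = BaseMapString(
    "bms-001",
    [LineSegment((0.0, 0.0), (10.0, 0.0), width=2.0, color="red")],
)
```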
The first group of entities we need to examine is the one that contains our basic transportation
elements. The group consists of six entities: Jurisdiction, Transportation feature, Event point,
Linear event, Point event, and Intersection. Definitions of these entities and their relationships are
as follows:
• Jurisdiction: The political or other context for designating transportation features and their
names, which may be merely numerical references unique within the jurisdiction. A jurisdiction sets
the context for defining the extent and name of each transportation feature and is primarily
geographic. Jurisdiction carries no other burden; i.e., it does not indicate which agency has
maintenance responsibility. A State Department of Transportation may choose to subdivide the
State highway system on a county basis, with each county-specific portion of a roadway having
its own identifier. In this instance, "county" would be the value of Jurisdiction. The maintenance
jurisdiction in which a transportation feature may be located would be an attribute of the
feature, which could be stored as part of the transportation feature entity if it applied to the
entire feature, or as a separate linear event. Jurisdiction need not be the same for all transportation
feature types. Airports can be named on a national basis, interstate freeways on a state
basis, and local streets on a zip code basis. Relationships: Jurisdiction defines one or
more Transportation Feature. Transportation Feature must be defined within the context of
Jurisdiction.
• Transportation feature: An identifiable element of the transportation system. A transportation
feature can be like a point (interchange or bridge), a line (road or railroad), or an area (rail yard or
airport). Some transportation features can consist of other features. This may be the case with
bridges and roadways. In a roadway inventory, the bridge is an event that occurs on a specific
roadway. The location of the bridge would typically be defined by a linear LRS location
description. In a bridge inventory, the bridge is a transportation feature in its own right, and its
location may be defined by a set of earth coordinates. Relationships: Transportation Feature may
have one or more Event Point. Event Point must be defined on a single Transportation Feature.
Transportation Feature may contain one or more Intersection. Intersection must be owned by one
or more Transportation Feature.
• Event: An attribute, occurrence, or physical component of a transportation feature. Attributes
include functional class, speed limit, pavement type, and state road number. These things are not
tangible but describe a tangible element, such as a road. Occurrences include traffic crashes and
projects. Physical components include the number of lanes, guardrails, signs, bridges,
intersections, and other tangible things that are field-identifiable elements. There are three event subtypes.
A given event instance may be expressed as more than one subtype:
Point event: A component or attribute that is found at a single location (one event point). Point
events may occur independently or on transportation features of the linear or area form.
Linear event: A component or attribute that is found along a segment of a linear transportation
feature. Linear events are defined by two event points (beginning and ending). Linear events
may occur only on linear transportation features.
Area event: A transportation feature component or a non-transportation entity that affects a
transportation feature. Areas can be explicitly represented as polygons or implicitly represented by
where they intersect transportation features. The implicit option is called an area event and is
represented through related linear and point events. For example, an area event could be a city.
The city could be expressed by creating a linear event for the portion of a transportation feature
located within it, or as point events where the city limits cross a transportation feature.
Another example could be a park-and-ride lot, which would be stored as a point event located where
the driveway to the lot intersects the adjacent road (transportation feature). Area events may be
applicable to any kind of transportation feature. Area event as a discrete entity is omitted from
the simplified model since such events are almost always expressed in a transportation database
using linear and point events. The omission of Area Event also lets us drop polygon, its
corresponding cartographic entity.
• Event point: The location where an event occurs on a transportation feature. Event points are
located on a transportation feature as an offset distance measured from the beginning of the
transportation feature. Most transportation databases use event point measures made in units of
0.01 miles. The smaller the measurement unit, the higher the resolution of the database, i.e., the
closer two events can be and still be stored as separate events. A resolution of 0.01 miles means
that two events within 52.8 feet of each other may be stored as being located at the same
position along the transportation feature. Event point locations are stored using field measures in
real-world units, not cartographic units. Event points locate events on the transportation system,
while cartographic points locate representative graphical elements on a map. In addition to
being located by a linear measure along transportation features, event points can be located by
direct coordinate measurement, obtained by digitizing maps or from field measurements made by
surveying or GPS. Just as linear measures along transportation features can be converted to
coordinates by interpolation along shape files representing the transportation feature using dynamic
segmentation, coordinates can be snapped to transportation features and converted to linear
measures by a reverse of dynamic segmentation. Relationships: Transportation Feature may
possess one or more Event Point. Event Point must be defined in the context of one
Transportation Feature. Event Point may locate one or more Point Event. Point Event must be
located (on Transportation Feature) by one Event Point. Event Point may locate the beginning or
end of one or more Linear Event. Linear Event must be located by a beginning Event Point and
an ending Event Point. Event Point represents Geographic Point. Geographic Point locates
Event Point.
• Intersection: A special type of Point Event which may be owned by more than one transportation
feature. From the perspective of a single transportation feature, an intersection has but one
owner; i.e., that transportation feature. In actuality, however, an intersection may appear as a
point event on more than one transportation feature. Relying solely on the Transportation
Feature owns Point Event relationship would result in redundant data storage, in that each
transportation feature would be required to record the same intersection data. The use of an
Intersection entity allows the location of the intersection to be stored as a point event on each
intersecting transportation feature while the intersection characteristics are stored as attributes of
the Intersection entity. It may be advantageous to treat any type of transportation system junction or
crossing as an intersection. For example, all bridges may be treated as intersections in order to link bridge
height and weight data to the roads going over and under the bridge. Relationships: Intersection
must correspond to one or more Point Event. Point Event may represent one Intersection.
Intersection must involve one or more Transportation Feature. Transportation Feature may possess
one or more Intersection.
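The Event point definition above describes measurement resolution, dynamic segmentation, and its reverse. The following minimal sketch illustrates all three under stated assumptions: the cartographic representation is a simple polyline in planar coordinates, measures are in the same units as those coordinates, and the function names resolution_feet, measure_to_xy, and xy_to_measure are hypothetical rather than drawn from any GIS product.

```python
import math

FEET_PER_MILE = 5280.0


def resolution_feet(unit_miles):
    """Ground length covered by one measurement unit (0.01 mile is about 52.8 ft)."""
    return unit_miles * FEET_PER_MILE


def measure_to_xy(polyline, measure):
    """Dynamic segmentation: interpolate the point at `measure` along the polyline."""
    if measure < 0:
        raise ValueError("measure must be non-negative")
    remaining = measure
    for (x0, y0), (x1, y1) in zip(polyline, polyline[1:]):
        seg_len = math.hypot(x1 - x0, y1 - y0)
        if seg_len > 0 and remaining <= seg_len:
            t = remaining / seg_len
            return (x0 + t * (x1 - x0), y0 + t * (y1 - y0))
        remaining -= seg_len
    raise ValueError("measure lies beyond the end of the feature")


def xy_to_measure(polyline, point):
    """Reverse step: snap `point` to the nearest spot on the polyline, return its measure."""
    px, py = point
    best_dist, best_measure, walked = math.inf, 0.0, 0.0
    for (x0, y0), (x1, y1) in zip(polyline, polyline[1:]):
        dx, dy = x1 - x0, y1 - y0
        seg_len = math.hypot(dx, dy)
        if seg_len == 0:
            continue
        # Parameter of the perpendicular projection, clamped to the segment.
        t = max(0.0, min(1.0, ((px - x0) * dx + (py - y0) * dy) / seg_len ** 2))
        dist = math.hypot(px - (x0 + t * dx), py - (y0 + t * dy))
        if dist < best_dist:
            best_dist, best_measure = dist, walked + t * seg_len
        walked += seg_len
    return best_measure


road = [(0.0, 0.0), (3.0, 0.0), (3.0, 4.0)]    # L-shaped feature of length 7
bridge_xy = measure_to_xy(road, 5.0)           # event point at measure 5
gps_measure = xy_to_measure(road, (3.5, 2.0))  # GPS fix half a unit off the road
```

In this example the GPS fix snaps back to the same measure that was used to place the bridge, which is the round trip the text describes. A production implementation would also need to handle measure calibration, since field measures and cartographic lengths rarely agree exactly.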
The second group of entities consists of four entities that contain information about the
connectivity (topology) of the transportation network: Node, Link, Traversal, and Traversal
Segment. Networks are subdivided into segments called links, which begin and end at nodes. A
traversal is a path through the network that is composed of traversal segments, each of which may
be formed from one or more links and their attributes.
• Node: A zero-dimension object that represents the topological junction between two or more
links, or the endpoint of a link. In their most common form, nodes will correspond to intersections
and other point events. Not every intersection need be represented as a node. It is sufficient for a
routing application, for example, to include only those decision points (i.e., where the course may
be altered) that have been deemed suitable. Relationships: Node may represent one Point Event.
Node may begin one or more Link. Node may end one or more Link. In most cases, Node must
begin or end at least one Link (this may not be true if centroid nodes are included in the data set).
• Link: A one-dimension object that represents the logical connections between nodes. Links, as
used here, are dimensionless in that they only specify the possible connections and not the actual
distance between nodes. We could include an attribute for valid directions, but we have chosen to
recommend two alternative implementation strategies. One is to order nodes; the other is to
include only valid nodes that may be reached from a given node. In the ordered node approach, if one
could travel from Node A to Node B or from B to A, as would be the case with a two-way road,
then a link table would include the entries A,B and B,A. A corresponding one-way road would
omit one of the node pair entries. In the valid destination node approach, a two-node entry for
Node A would be B and one for B would be A. If, however, the link A-B were a one-way road
with the direction of travel being from B to A, then there would not be an entry for Node B in the
node table record for Node A. Relationships: Link must begin at one Node. Link must end at one
Node. Link may be part of one or more Traversal Segment.
• Traversal: A path or route through a portion of a transportation network consisting of one or
more segments. We have chosen to make a distinction in application between static traversals,
such as that formed by a state road crossing many counties, and dynamic traversals, such as that
created to route an overweight truck from origin to destination. These are functionally the same
thing; however, their duration of use is different. Both are constructed by selecting traversal
segments with a specified set of attributes. For the state road traversal, this would be those
traversal segments with the same road number. For the overweight truck route traversal, this would
be those that could support the weight of the truck and possibly match other criteria, such as low
traffic volumes. Relationships: Traversal contains one or more Traversal Segment. Traversal
Segment may be part of one or more Traversal.
• Traversal segment: A link and its relevant attributes. A traversal segment is both topological
and physical. It combines the connectivity of the link with the attributes of the transportation
feature segment it represents. We have elected to utilize this separate entity to store link attributes
for two reasons. First, we wanted to keep the purely topological information separate in order to
facilitate data sharing and our central concept of unbundling. Second, we wanted to be able to
select the attribute set to match the needs of the application, while preserving a normalized data
structure for transportation features and their attributes. Traversal Segments may be viewed as a
special linear event that corresponds to the path of the related link. Traversal Segment attributes
may be derived by finding the desired linear and/or point event records that fall within the same
roadway segment. The actual selected values may be minimums (e.g., bridge with the lowest
weight bearing capacity, lowest overhead clearance, etc.) or maximums (e.g., highest traffic
volume, greatest population, etc.). Relationships: Traversal Segment may be part of one or more
Traversal. Traversal Segment must represent one Link. Traversal Segment may include attributes
of one or more Point Event. Traversal Segment may include attributes of one or more Linear
Event. Linear Event may be applicable to one or more Traversal Segment. Point Event may be
applicable to one or more Traversal Segment.
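The two implementation strategies described under Link can be compared with a short sketch. The node labels and the helper names to_valid_destinations and can_travel are illustrative assumptions. The network has a two-way road between A and B and a one-way road on which travel is permitted from D to C only.

```python
# Strategy 1, ordered node pairs: an entry (origin, destination) permits travel
# in that direction only, so a two-way road needs two entries.
ordered_links = [
    ("A", "B"), ("B", "A"),  # two-way road between A and B
    ("D", "C"),              # one-way road, travel from D to C only
]


def to_valid_destinations(links):
    """Convert ordered node pairs into the valid-destination-node form."""
    table = {}
    for origin, destination in links:
        table.setdefault(origin, set()).add(destination)
        table.setdefault(destination, set())  # every node gets a record
    return table


# Strategy 2, valid destination nodes: each node record lists only the nodes
# that may be reached from it.
valid_destinations = to_valid_destinations(ordered_links)


def can_travel(table, origin, destination):
    """True if the node record for `origin` lists `destination`."""
    return destination in table.get(origin, set())
```

Here the record for node C is empty: because travel on the one-way road runs from D to C, D does not appear in the record for C, mirroring the one-way example in the text. Both forms carry the same information; the choice mainly affects whether a lookup scans a link table or reads a single node record.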
These entity and relationship definitions are based on the following business rules:
1. Link-node data structures may be used to express certain attributes that conform to that
structure, such as the distances or travel times between intersections (block is a link, intersection is a
node).
2. Static and dynamic paths (traversals) must be defined through the highway network. A static
path is one that is defined by a slowly changing characteristic, such as state road number or a
bus route. A dynamic path is one created to serve an immediate and frequently changing need,
such as to route an overweight vehicle to its destination or to direct a rider to the nearest bus
stop.
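Business rule 2, together with the Traversal and Traversal Segment definitions, can be illustrated with a small sketch. Traversal segment attributes are derived as the minimum bridge weight limit among the point events that fall within the segment, a static traversal is selected by road number, and a dynamic traversal is found by searching only the segments able to carry an overweight truck. All data values, dictionary keys, and function names are hypothetical, and links are treated as one-way in the order given to keep the example short.

```python
from collections import deque

# Point events located by a measure along a transportation feature.
bridge_events = [
    {"feature": "SR-9", "measure": 1.20, "weight_limit_tons": 40},
    {"feature": "SR-9", "measure": 3.75, "weight_limit_tons": 25},
    {"feature": "CR-4", "measure": 0.60, "weight_limit_tons": 60},
]

# Each traversal segment represents one link and the feature portion it covers.
traversal_segments = [
    {"link": ("A", "B"), "feature": "SR-9", "from": 0.0, "to": 2.0},
    {"link": ("B", "C"), "feature": "SR-9", "from": 2.0, "to": 5.0},
    {"link": ("A", "D"), "feature": "CR-4", "from": 0.0, "to": 1.5},
    {"link": ("D", "C"), "feature": "CR-4", "from": 1.5, "to": 4.0},
]


def derive_weight_limit(segment, events):
    """Lowest bridge weight limit within the segment, or None if it has no bridge."""
    limits = [e["weight_limit_tons"] for e in events
              if e["feature"] == segment["feature"]
              and segment["from"] <= e["measure"] <= segment["to"]]
    return min(limits) if limits else None


for seg in traversal_segments:
    seg["weight_limit_tons"] = derive_weight_limit(seg, bridge_events)

# Static traversal: every segment that carries the same road number.
state_road_9 = [s["link"] for s in traversal_segments if s["feature"] == "SR-9"]


def route(segments, origin, destination, truck_tons):
    """Dynamic traversal: breadth-first search over segments that can carry the truck."""
    usable = [s for s in segments
              if s["weight_limit_tons"] is None or s["weight_limit_tons"] >= truck_tons]
    queue, seen = deque([(origin, [])]), {origin}
    while queue:
        node, path = queue.popleft()
        if node == destination:
            return path
        for s in usable:
            start, end = s["link"]
            if start == node and end not in seen:
                seen.add(end)
                queue.append((end, path + [s["link"]]))
    return None  # no route satisfies the weight constraint


heavy_route = route(traversal_segments, "A", "C", truck_tons=30)
light_route = route(traversal_segments, "A", "C", truck_tons=20)
```

The 30-ton truck is routed around the 25-ton bridge on SR-9, while the 20-ton truck can stay on the state road. The same segment records serve both the static and the dynamic traversal; only the selection criteria differ.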
We have provided four entities to tie transportation features to the surface of the earth in 2- or
3-coordinate spaces. These entities are derived from those proposed by Vonderohe and Hepworth
(1998).
• Anchor point: A zero-dimension object representing the end or beginning of an anchor section
and serving to relate that terminus to an earth-based location. Position on earth and on any related
transportation feature are mandatory attributes. By storing the earth-based and linear LRS
coordinates for an anchor point, the database model provides a registration mechanism to tie the
transportation features to the ground. This facilitates cartographic conflation. (We actually
recommend storing the linear LRS data in the anchor section records, since the position is specific to
each terminating anchor section.) Relationships: Anchor Point may be located by one or more
Reference Object. Anchor Point may begin one or more Anchor Section. Anchor Point may end
one or more Anchor Section. From a practical perspective, Anchor Point must begin or end at
least one Anchor Section.
• Anchor section: A one-dimension object providing a logical representation of all or part of a
transportation feature. Length is a mandatory attribute. An anchor section begins and ends at an
ordered pair of anchor points. This ordering provides an indication of the direction of increasing
linear LRS measures. Anchor sections and points differ from links and nodes by including more
than topological information and an incomplete specification of topology. For example, anchor
point locations are defined in possibly several systems; node locations are not defined at all.
Ordered anchor point pairs for defining anchor section direction fail to indicate whether traffic
can flow in the opposite direction. Relationships: Anchor Section may establish the linear LRS
datum for one Transportation Feature. Transportation Feature may be defined by one or more
Anchor Section. Anchor Section must begin at one Anchor Point. Anchor Section must end at
one Anchor Point.
• Reference object: A physical object or recoverable location to which the position of anchor points
may be conveniently related. Many anchor point locations are likely to be conceptually easy to
define but physically hard to find in the field. An example is the middle of an interchange or
intersection. It is easy to recognize but hard to precisely and consistently locate. Tying the location
to a physical object, such as a traffic signal pole, allows the precise location of the anchor section
to be readily found using direction and distance from the reference object. Anchor Point and
Reference Object may be merged to form a single entity. Relationships: Reference Object is located
on the surface of the earth using one Geographic Point (which may have many coordinate
descriptions). Reference Object may be used to locate one or more Anchor Point.
• Geographic point: A zero-dimension object carrying the real-world (earth-based) location of a
reference object. We have been deliberately restrictive in our definition of Geographic Point as a
means of simplifying the model. In reality, every other entity in the real world could be described
using a geographic point. What we want to emphasize here is the translation of geographic points,
defined minimally for reference objects, to cartographic points as a means of facilitating
conflation. Relationships: Geographic Point may locate one Reference Object. Geographic Point may be
transformed to one or more Cartographic Point (each with its own cartographic datum and
coordinate system). Event Point represents Geographic Point. Geographic Point locates Event
Point.
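The four datum entities and the relationships stated above can be pictured as plain record types. The following is only a minimal sketch in Python, not the authors' schema: the class and field names, the coordinate fields, the positive-length check, and the helper `orphan_anchor_points` are all assumptions introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class GeographicPoint:
    """Earth-based location of a reference object (field names are illustrative)."""
    lon: float
    lat: float
    elevation: Optional[float] = None


@dataclass
class ReferenceObject:
    """Recoverable physical object, located by one GeographicPoint."""
    description: str
    location: GeographicPoint


@dataclass
class AnchorPoint:
    """Zero-dimension terminus; may be located by one or more reference objects."""
    point_id: str
    located_by: List[ReferenceObject] = field(default_factory=list)


@dataclass
class AnchorSection:
    """One-dimension logical piece of a transportation feature."""
    section_id: str
    begin: AnchorPoint   # ordered pair: LRS measures increase from begin to end
    end: AnchorPoint
    length: float        # mandatory attribute, in the units of the linear LRS

    def __post_init__(self):
        # Assumption: a usable length is strictly positive.
        if self.length <= 0:
            raise ValueError("length is mandatory and must be positive")


def orphan_anchor_points(points, sections):
    """Ids of anchor points that neither begin nor end any anchor section."""
    used = {s.begin.point_id for s in sections} | {s.end.point_id for s in sections}
    return [p.point_id for p in points if p.point_id not in used]


pole = ReferenceObject("traffic signal pole", GeographicPoint(-122.68, 45.52))
a, b, c = AnchorPoint("A", [pole]), AnchorPoint("B"), AnchorPoint("C")
s1 = AnchorSection("S1", a, b, 1.25)
print(orphan_anchor_points([a, b, c], [s1]))  # "C" violates the practical rule
```

The check at the end mirrors the practical rule that an anchor point must begin or end at least one anchor section.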
The business rules that produced this portion of the data model are:
1. Many transportation features may be efficiently located on the surface of the earth using
non-linear location referencing systems based on GPS and other earth geode methods. Most
typically, these features are of the area and point geometric forms. Point and linear attributes on
linear features may be secondarily defined using earth coordinates. However, this approach will
not unambiguously define a point on the linear feature from a graphic perspective due to
differences (errors) between the map and the real world, the measured position on the earth and its
equivalent on the map, and between the measured position and the true position.
2. The model must provide the data entities needed to store a linear datum developed to meet
positional accuracy and data sharing needs, but not require that such a datum exist.
We use the term 'cartography' to refer to the map or picture produced by a GIS application.
There are five cartographic entities that may be used to visually express the location and shape of
transportation features, linear events, and point events. The following entity definitions are
independent of proprietary software terms:
• Cartographic point: The internal address reference for placing a single point on the surface of a
two- or three-dimension map (manifold). Each mapping environment has its own way of
defining single locations, i.e., its cartographic datum. Relationships: Cartographic Point may
represent one Geographic Point. Cartographic Point may shape one or more Line Segment.
Cartographic Point may locate one Point Symbol.
• Line segment: A straight connection or otherwise defined mathematical path between two
cartographic locations. All line segments must be defined using beginning and ending cartographic
locations (points). Relationships: Line Segment is shaped by one or more Cartographic Point.
Line Segment may be part of Base Map String. Line Segment may be part of Linear Event
String.
• Base map string: A connected non-branching sequence of line segments, usually specified as an
ordered sequence of vertices, which cartographically defines the shape of a transportation feature.
Our basic implementation suggestion for mapping transportation features is to start with an
equivalency between each entire transportation feature and a single line object. This line object
will be composed of one or more line segments. Multiple spatial representations are enabled. A
transportation feature can be represented by more than one base map string. Relationships:
Base Map String may be used to describe the shape and position of one Transportation
Feature. Base Map String must consist of one or more Line Segment.
• Linear event string: A connected non-branching sequence of line segments, usually specified as an
ordered sequence of vertices, which cartographically defines the shape of a linear event occurring
on a transportation feature. Our expectation here is that most linear event strings will be created
as needed by dynamic segmentation, which uses straight-line interpolation to extract the
portion of a base map string that corresponds to the location of a linear event using linear LRS
measures. Line style, width, and color may all be used to distinguish the attribute being
described. Relationships: Linear Event String must represent one Linear Event. Linear Event
String must consist of one or more Line Segment.
• Point symbol: A cartographic object that is used to show the position and nature of a point-like
real-world feature. Just as linear features can be shown using a line, point features can be shown
using a point symbol. Relationships: Point Symbol may be located on a map using one
Cartographic Point. Point Event may be illustrated by Point Symbol, which may also select the
correct point symbol instance to display.
The business rules that produced this portion of the data model are:
1. Separate graphic representations from attribute data to provide scale independence; this provides
"one database, many maps" functionality.
2. Utilize dynamic segmentation and field-based linear LRS measures to create cartographic
objects that correspond to linear events and to position point symbols.
3. Provide a means of relating real-world measures to map locations.
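Dynamic segmentation, as described for linear event strings, can be illustrated with a short straight-line interpolation routine. This is a sketch under stated assumptions rather than any vendor's implementation: map coordinates are planar, field LRS measures are scaled proportionally to the map length of the base map string, and the function names `locate` and `linear_event_string` are hypothetical.

```python
import math


def _cumulative(vertices):
    """Cumulative map distance at each vertex of a base map string."""
    total = [0.0]
    for (x0, y0), (x1, y1) in zip(vertices, vertices[1:]):
        total.append(total[-1] + math.hypot(x1 - x0, y1 - y0))
    return total


def locate(vertices, measure, lrs_length):
    """Map position of a linear LRS measure, by straight-line interpolation."""
    cum = _cumulative(vertices)
    # Scale the field measure onto the (possibly different) map length.
    target = max(0.0, min(1.0, measure / lrs_length)) * cum[-1]
    for i in range(1, len(cum)):
        if target <= cum[i]:
            seg = cum[i] - cum[i - 1]
            t = 0.0 if seg == 0 else (target - cum[i - 1]) / seg
            (x0, y0), (x1, y1) = vertices[i - 1], vertices[i]
            return (x0 + t * (x1 - x0), y0 + t * (y1 - y0))
    return vertices[-1]


def linear_event_string(vertices, from_m, to_m, lrs_length):
    """Portion of a base map string lying between two LRS measures."""
    cum = _cumulative(vertices)
    lo = from_m / lrs_length * cum[-1]
    hi = to_m / lrs_length * cum[-1]
    # Keep the shape vertices strictly inside the event, add interpolated ends.
    inner = [v for v, c in zip(vertices, cum) if lo < c < hi]
    return ([locate(vertices, from_m, lrs_length)] + inner
            + [locate(vertices, to_m, lrs_length)])


base_map_string = [(0, 0), (10, 0), (10, 10)]      # 20 map units long
# A linear event from milepoint 0.5 to 1.5 on a feature 2.0 miles long:
print(linear_event_string(base_map_string, 0.5, 1.5, 2.0))
```

Because the event is stored only as a pair of measures, the same event table can be drawn on any base map string of the feature, which is the "one database, many maps" idea in rule 1.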
More details of the model and examples of implementation are contained in the original
presentation (Dueker and Butler, 1998). For example, attributes that are offsets from the centerline,
such as signs, guardrails, or number of lanes, are handled by means of point or linear events.
Linear events, such as lanes, can be reformulated in several ways. A linear event could be
represented as a recursive one-to-many relationship for the Transportation Feature entity, if we want to
look at the lanes as transportation features owned by a larger one. Or each lane could be a
stand-alone feature. The data model itself is substantially indifferent to the details of transportation
feature construction. We choose not to specify one or the other, as the application may drive one
over the other.
The data model supports non-planar graphs, although we encourage the use of intersections for
over- or underpasses to store clearances and restrictions. Although our enterprise GIS-T data
model does not support temporal data explicitly, time is an attribute of transportation features or
of the point and linear events associated with them.
Whether this data model or another is used, a common understanding of the transportation system is
needed in order to share data effectively. The data model facilitates selecting appropriate
transportation features from different databases, spatially registering them, and creating new
application-specific networks. The GIS open systems concept applied to interoperability of
transportation data requires a common feature schema, which is consistent with our
transportation features. In addition, the data model separates the update and maintenance of
transportation features from the networks used in specific applications. Different networks can be
generated from a common set of transportation features, and one update process can support a set
of applications.
The connection between the NCHRP 20-27 data model and the Dueker-Butler data model is
considerable and implicitly apparent from the common use of several terms and methods.
Recognizing the considerable contribution of 20-27 towards consensus building in the field, we sought
to adopt, wherever possible, the conventions, approach, and definitions of the 20-27 data model.
A basic component of our model is the concept of transportation features. Transportation
features have many of the characteristics of the 20-27 model's traversals. For example, a linear
transportation feature can have points located along its length by the use of a linear LRS.
However, the concept of transportation feature includes area and point forms, not just linear
ones, where the term 'traversal' does not intuitively apply. We have retained the traversal concept
for linear transportation features, even to the point of using the same definition as in 20-27, but
the primary linear LRS is tied to the transportation feature. (Positions on traversals may be
defined using a subordinate linear LRS, including one based on time, e.g., a transit route running
on all or part of several transportation features.)
However, there are fundamental differences. First, we do not require transportation features to
be composed of one or more links. In other words, topology is not a requirement for
transportation features. Second, we relate the cartography to the transportation feature, not to the linear
datum. To some degree we also adopt a third difference by embedding the linear LRS in the
transportation feature; however, the implementation of this approach and that of the 20-27 data
model appear to be identical. A fourth difference is the expansion of datum entities to include
reference objects that are more readily recoverable in the field than anchor points. To some
degree, our inclusion of reference objects is intended to accommodate the proposed national
intelligent transportation system (ITS) Datum, which consists of a group of reference objects and
the rules to define them.
One of the biggest differences is the fifth one: incorporation of cartography. The 20-27 data
model provides support for cartographic conflation in that it can supply position information in
the form of earth coordinates for anchor points. These coordinates can be mapped in the
cartographic environment and be used to relate graphical objects to the datum and improve the
quality of maps. This approach leaves out many graphical objects and provides little roadway
shape information. Cartographic conflation remains a substantially manual process, but one that
is a necessary pre-requisite for data sharing. The requirement for transferring topology, imposed
by the use of links and nodes to connect traversals (a.k.a. linear transportation features) to the
datum and thence to the cartography, places a still greater burden on data sharing.
The enterprise GIS-T data model provides participants in data sharing with a framework for
clarifying roles. Departments of transportation or public works are organizations that have
ownership and maintenance responsibilities for transportation infrastructure. They are data
producers, data integrators, and data users both for internal uses and for other organizations.
Internal uses vary considerably. Planning, design, construction, and maintenance divisions are
substantially independent of one another. The solution is not a single GIS-T but an enterprise
approach to GIS-T data.
Motorists and the general public are primarily data users. Organizations that use the
transportation system in their business (the police, delivery services, etc.) often rely on data integrators
to provide transportation data in the form of maps and networks for location, path finding, and
routing. Increasingly, users are demanding current, logically correct, and spatially accurate
transportation data in interoperable digital form for large regions that span many jurisdictions.
Currently there are neither technical nor institutional processes to achieve a single integrated
database to handle those diverse needs, nor is it likely that such processes will be developed and
sustained. Rather, principles need to be established to guide data sharing and the development of
application-specific transportation databases that can be assembled without costly redundant
recollection of source data and updates from the field each time.
There are two participants whose accuracy requirements drive the data sharing process:
• Emergency management, E-9-1-1 and computer-aided (emergency) dispatch (CAD) have the
most demanding need for currency and completeness.
• Vehicle navigation applications, which may include CAD, have the most demanding need for
spatial accuracy of street centerline files. This is sometimes referred to as "map matching" of
the GPS-derived location of vehicles to the correct street or road in the road database. Identifying
the correct ramp of a complex freeway interchange on which a disabled vehicle is located is a
particularly demanding task. Similarly, electronic (ITS) toll collection applications may require
tracking vehicles by lane of multiple-lane facilities.
Others have less demanding needs for temporal accuracy (currency), completeness, and spatial
accuracy.
Successful data sharing requires a common schema or data model that is flexible enough to
handle the needs of diverse participants. The flow of data from providers to users calls for
capturing data once and delivering it to users in various forms, time frames, and spatial scales.
Principles for successful sharing of transportation data among participants must address a
variety of issues: definition and identification of transportation features, cartography and spatial
accuracy, generation of application-specific network representations, and interoperability.
Two important principles follow from the GIS-T data model:
• Transportation features are bounded by jurisdictions, not intersections. This is not to say that
the underlying cartography could not have other forms, only that the link between attribute
data and cartography must occur through transportation features rather than spatial primitives
or network topology.
• Attributes of transportation features are represented as linear or point events and are located
along the feature using linear referencing.
These two principles enable longer transportation features than is the case in link-based networks
and reduce the number of transportation features that must be maintained to represent the
system. Adding network detail or additional attributes does not necessarily increase the number of
features. Additional detail can be added by linearly referenced event tables and analyzed and
visualized using dynamic segmentation.
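To make concrete how detail can be added without adding features, the sketch below overlays two linear event tables for a single transportation feature into homogeneous segments. It is an illustration only: the table layout `(from_measure, to_measure, value)`, the attribute names, and the function `overlay_events` are assumptions, not part of the data model.

```python
def overlay_events(**tables):
    """Overlay linear event tables for one feature into homogeneous segments.

    Each keyword argument names an attribute and supplies a list of
    (from_measure, to_measure, value) rows along the same feature.
    """
    # Every event boundary becomes a potential break in the output.
    breaks = sorted({m for rows in tables.values()
                     for f, t, _ in rows for m in (f, t)})
    result = []
    for lo, hi in zip(breaks, breaks[1:]):
        attrs = {}
        for name, rows in tables.items():
            for f, t, value in rows:
                if f <= lo and hi <= t:   # this event covers the whole piece
                    attrs[name] = value
        result.append((lo, hi, attrs))
    return result


lanes = [(0.0, 3.0, 2), (3.0, 5.0, 4)]
pavement = [(0.0, 2.0, "asphalt"), (2.0, 5.0, "concrete")]
for segment in overlay_events(lanes=lanes, pavement=pavement):
    print(segment)
```

The feature itself is never split: the three output segments exist only as the result of the query, and a third event table could be added without touching the feature or its identifier.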
Butler and Dueker (1998) also identified important data sharing principles:
• Transportation features must be uniquely identified to facilitate sharing of data among
participants. Participants need to identify common features in sharing data.
• Transportation data producers need to include a standardized unique identifier with each
transportation feature.
• Collect minor road facilities by street name or route number to minimize the number of unique
identifiers. But do not use names or routes as identifiers. They change.
There are several other principles that are offered to reduce the amount of manual coding and
conflation, and thereby ease compliance with the data sharing principles. These are offered to
avoid the need for simultaneous conflation of cartography and topology with a process to resolve
inconsistent segments:
• Exchange attribute data as event tables for logical transportation features, i.e., without shape
points.
• Exchange cartography without topology.
• There is no need to code topology; let the GIS generate application-specific networks from a
selection of appropriate transportation features.
• Minimize manual coding of transportation feature identifiers by embedding existing identifiers
into more global identifiers and using scripts to bulk-assign state and county codes.
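The idea of letting software generate an application-specific network from selected features, rather than coding topology by hand, can be sketched as follows. This is a deliberately simplified illustration: it assumes that features meeting at a junction share an identical vertex, it ignores over- and underpasses, and the feature types and the function `build_network` are hypothetical. A production GIS would compute geometric intersections with tolerances instead.

```python
from collections import Counter


def build_network(features, wanted_types):
    """Derive links from selected features.

    features: dict of feature_id -> (feature_type, [vertices]).
    Returns a list of links (feature_id, from_node, to_node); nodes are
    placed at feature ends and at vertices shared by two or more features.
    """
    selected = {fid: verts for fid, (ftype, verts) in features.items()
                if ftype in wanted_types}
    usage = Counter(v for verts in selected.values() for v in set(verts))
    links = []
    for fid, verts in selected.items():
        start = verts[0]
        for v in verts[1:]:
            if usage[v] > 1 or v == verts[-1]:
                links.append((fid, start, v))
                start = v
    return links


features = {
    "MAIN":  ("road", [(0, 0), (1, 0), (2, 0), (3, 0)]),
    "ELM":   ("road", [(2, -1), (2, 0), (2, 1)]),
    "TRAIL": ("pedestrian", [(1, 0), (1, 1)]),
}
print(len(build_network(features, {"road"})))                # vehicle network
print(len(build_network(features, {"road", "pedestrian"})))  # walking network
```

The two calls produce different link sets from the same three features, which is the sense in which networks are application-specific while features are shared.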
Fig. 2 shows how using a transportation feature-based identifier to exchange data avoids most of
the problems of conflating cartographic and topological objects. In this example, the heavy line
represents the common transportation feature in all three schemas. The top and middle versions
use topological (link/node) internal data structures but with different sets of links and nodes (the
top version has more). The bottom version uses simple line strings. From an implementation
perspective, the top two might be ArcInfo-based coverages, while the bottom is representative of
line strings used in GeoMedia, ArcSDE, and ArcView. Dependence on conflation to make the
cartography and topology the same must precede the practical sharing of data. However, if all
three had previously combined their primitive objects to create the higher-level transportation
feature and given it a common identifier, then there would be no need to make the maps and
Fig. 2. Comparison of cartographic and topological model approaches: (a) Transportation feature defined by links
B-D, D-F, F-G, G-H, and H-I; (b) Transportation feature defined by links 1-4, 4-5, 5-6, and 6-3; (c) Transportation
feature defined by lines 1 and 5.
A transportation feature can be like a point (interchange or bridge), a line (road or railroad), or
an area (rail yard or airport). Nevertheless, some applications may restrict their databases to
roads, pedestrian paths, or waterways. The important point is to code the type of transportation
feature so that the type can be used to select those features of common interest for sharing of data.
Butler and Dueker (1998) proposed an internet-like address identifier for transportation
features. Similarly, the Oregon road base information team subcommittee (ORBITS) (Bosworth
et al., 1998) and the NSDI framework transportation identification standard (FGDC, 1999) have
proposed a roadway identifier schema. The NSDI proposal will likely prevail when it is complete.
The purpose of assigning a stable and unique identifier to each transportation feature is to
eliminate or reduce reliance on traditional conflation processes to reconcile different
transportation databases. Unique identifiers are used to match transportation features between databases
without relying on matching coordinates and links.
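Matching by identifier instead of by coordinates reduces to a keyed join, as the following sketch suggests. The identifier strings, attribute names, and the function `match_by_identifier` are hypothetical; the point is only that no geometry is consulted.

```python
def match_by_identifier(db_a, db_b):
    """Join two feature tables on a shared unique feature identifier.

    db_a, db_b: dict of feature_id -> dict of attributes.
    Returns (merged, only_in_a, only_in_b).
    """
    shared = sorted(db_a.keys() & db_b.keys())
    merged = {fid: {**db_a[fid], **db_b[fid]} for fid in shared}
    only_a = sorted(db_a.keys() - db_b.keys())
    only_b = sorted(db_b.keys() - db_a.keys())
    return merged, only_a, only_b


state_dot = {"F-001": {"lanes": 4}, "F-002": {"lanes": 2}}
county = {"F-002": {"surface": "gravel"}, "F-003": {"surface": "asphalt"}}
merged, only_state, only_county = match_by_identifier(state_dot, county)
print(merged)                   # attributes combined for the shared feature
print(only_state, only_county)  # features still to be reconciled
```

Features that appear in only one database are reported rather than guessed at, which leaves any remaining reconciliation as an explicit, reviewable step.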
A case study was conducted to test methods of assigning both the ORBITS and NSDI
identification codes. The ORBITS approach collects or divides roadway features and identifies them
with a unique code. In the context of the case study, decision rules for breaking or collecting
roadway sections and procedures for bulk assignment of higher level codes to sequence-numbered
roadway features were developed. The assignment of NSDI codes to roadway features was
similar, except that point identifiers were also assigned to beginning and ending points of roadway
features. The ORBITS team chose not to code the topology, leaving that to be generated if
needed, as it is too application-specific to be of general use.
The ORBITS case study developed different decision rules for assignment of transportation
feature identifiers to arterial roads and local roads. Urban arterial roads are segmented at
intersections of major arterials. Arterial road identifiers in urban counties are a concatenation of
state and county FIPS codes and a concatenation of i and j traffic assignment network node
numbers. In rural counties, we would recommend arterial roads be assigned a code that is the
concatenation of state and county FIPS codes and the State Department of Transportation or
county Department of Transportation road identifier. Portland Metro desired a finer breakdown
of uniquely identified major roads than the rural rule would have accomplished.
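The two arterial rules amount to string concatenation, which is what makes bulk assignment by script practical. The sketch below is illustrative only: the source does not specify field widths or separators, so the zero-padded FIPS layout, the hyphens, the node numbers, and the road identifier shown are assumptions.

```python
def urban_arterial_id(state_fips, county_fips, i_node, j_node):
    """State and county FIPS codes plus i and j traffic assignment node numbers."""
    return f"{state_fips:02d}{county_fips:03d}-{i_node}-{j_node}"


def rural_arterial_id(state_fips, county_fips, dot_road_id):
    """State and county FIPS codes plus the state or county DOT road identifier."""
    return f"{state_fips:02d}{county_fips:03d}-{dot_road_id}"


# Oregon is state FIPS 41 and Multnomah County is 051; node numbers are made up.
print(urban_arterial_id(41, 51, 1204, 1377))
print(rural_arterial_id(41, 51, "HWY026"))
```

Because the state and county prefix is the same for every feature in a county, only the trailing component has to come from the local database.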
The decision rule for assigning codes to local roads is to collect connected TIGER lines that
have the same name and assign them sequence numbers concatenated with state and county FIPS
codes. Some judgment has to be applied to deal with interruptions in connectedness of local
streets. When there are minor interruptions, the same code is assigned to the local road with the
common name. When interruptions are more than minor, separate identifiers are assigned. Also,
there are situations where name changes occur arbitrarily, such as at municipal boundaries, where
a different identifier may not be needed. Bender et al. (1999) provide a description of a case study
of assigning NSDI identifiers.
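The mechanical part of the local-road rule, collecting connected lines that share a name, can be sketched as a connected-components pass. This is an assumption-laden illustration: the tuple layout, the `L` prefix, the five-digit sequence width, and the function `assign_local_ids` are invented here, and the judgment calls about minor interruptions and arbitrary name changes are deliberately left out.

```python
from collections import defaultdict


def assign_local_ids(lines, state_fips, county_fips):
    """Give every group of connected, same-named lines one identifier.

    lines: list of (line_id, street_name, from_node, to_node).
    Returns dict of line_id -> identifier.
    """
    by_name = defaultdict(list)
    for line in lines:
        by_name[line[1]].append(line)

    ids, seq = {}, 0
    for name in sorted(by_name):
        group = by_name[name]
        at_node = defaultdict(list)          # node -> lines touching it
        for line_id, _, a, b in group:
            at_node[a].append(line_id)
            at_node[b].append(line_id)
        ends = {line_id: (a, b) for line_id, _, a, b in group}

        seen = set()
        for line_id, _, _, _ in group:
            if line_id in seen:
                continue
            seq += 1                          # new connected piece, new code
            code = f"{state_fips:02d}{county_fips:03d}-L{seq:05d}"
            stack = [line_id]
            while stack:
                cur = stack.pop()
                if cur in seen:
                    continue
                seen.add(cur)
                ids[cur] = code
                for node in ends[cur]:
                    stack.extend(n for n in at_node[node] if n not in seen)
    return ids


tiger = [
    (1, "Oak St", "a", "b"),
    (2, "Oak St", "b", "c"),
    (3, "Oak St", "x", "y"),    # same name, not connected to lines 1 and 2
    (4, "Pine St", "c", "d"),
]
print(assign_local_ids(tiger, 41, 51))
```

In this sketch the disconnected piece of Oak St always receives its own code; merging it with the main piece when the interruption is minor would be a manual or rule-based step layered on top.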
Even with the collection of links of legacy databases into larger transportation features,
the assignment of identifiers is an onerous task, especially if mandated without assistance.
The present draft NSDI framework transportation identification standard (FGDC, 1999)
satisfies many of the stated business rules of our enterprise GIS-T data model. But the proposed
standard is not based on a formal data model and as a result suffers from ambiguities. Framework
transportation segments (FTSegs) have a unique identifier and, like anchor sections, FTSegs are
defined by beginning and ending points (framework transportation reference points, or FTRPs).
Without a formal data model, the NSDI proposal lacks guidance on whether length of FTSegs is
mandatory or optional, or how to code an interchange.
The implicit topology of FTSegs, created by stating the terminal FTRPs, is equivalent to that in
our anchor point/anchor section structure. The NSDI proposal requires an intermediate FTRP to
be defined if there are multiple paths between the two FTRPs defining a given FTSeg. The explicit
topology of FTSegs and intermediate FTRPs (which do not create new segments but are located
using a distance offset along the segment) is essentially the same as our point event. Both
intermediate FTRPs and explicit topology can be expressed using our point event entity.
A continuing problem with the NSDI proposal is the need to serve both logical and physical
descriptions of the transportation network. Some users may need to use a single FTRP to
represent an entire interchange that for another user may require a dozen or more FTRPs. We have
proposed the use of an intersection entity to serve as a general, or logical, object for what may be
one or more physical position descriptions for all or part of a complex structure.
The NSDI proposal meets the transportation data sharing needs for unique and public
identifiers. It almost meets the requirements of a national transportation datum, such as that which
has been listed as a pre-requisite for many ITS deployments, needing only to make length a
mandatory attribute. It almost provides a clear statement of network topology, falling short only
in its ambiguity of connection specification.
And yet, we are not satisfied. Data users need to exchange not only attributes relevant to an
entire transportation feature, but also those that apply to a portion of a feature. A proposed
standard structure for constructing a universal transportation data exchange format seems
necessary for people to be able to exchange entire data sets, not simply identify which road they are
talking about. The absence of linework to represent the FTSegs is a serious omission in a
"spatial" standard, in the eyes of many potential users. For it is the need for more accurate
linework that drives most spatial data sharing needs. But more accurate linework is accompanied
by a more detailed representation of transportation features. Yet the NSDI Framework
Transportation Identification Standard is a major effort in the process of forging a true data exchange
mechanism.
Issues that states face in constructing a road layer for a statewide GIS illustrate the problem
of sharing transportation data. The problem is stitching together data from various sources and
vintages.
Typically, State Departments of Transportation have a roadway database for highway
inventory. Attributes of roads are recorded by milepoint and associated with a straight-line chart
for visualization. Some states have incorporated the linearly referenced data into a GIS and use
dynamic segmentation for analysis and visualization. The spatial scale of the cartography ranges
from 1:24 000 to 1:100 000.
Similarly, the spatial scale of the road layer used by natural resource agencies is from 1:24 000
or 1:100 000 USGS sources but with very little attribution and with uneven currency. However,
the USGS digital orthophotography quarter-quadrangle program offers the opportunity for states
to update their road vectors and register them using 1:12 000 imagery. This ought to provide
sufficient spatial and temporal accuracy for E-911 and for vehicle navigation (snapping vehicle
GPS tracking data to road vectors). However, these sources may not be sufficiently accurate to
distinguish road lanes, which are needed in urban areas for dynamic vehicle navigation and road
pricing (e.g., snapping vehicle GPS tracking data to lanes and ramps to relate to lane-specific
volumes and speeds from loops, imaging and vehicle probes).
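The "snapping" of GPS tracking data to road vectors mentioned above can be sketched as a nearest-segment projection. This is a minimal geometric illustration, not a production map-matching method: it assumes planar (projected) coordinates in a common unit, treats each GPS fix independently, and uses hypothetical names (`project`, `snap`, `max_dist`). Real systems typically also use heading, speed, and the sequence of fixes.

```python
import math


def project(p, a, b):
    """Closest point to p on segment a-b, and its distance from p."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    d2 = dx * dx + dy * dy
    t = 0.0 if d2 == 0 else max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / d2))
    q = (ax + t * dx, ay + t * dy)
    return q, math.hypot(px - q[0], py - q[1])


def snap(p, roads, max_dist):
    """Snap a GPS fix to the nearest road vector within max_dist.

    roads: dict of feature_id -> list of vertices.
    Returns (feature_id, snapped_point, distance), or None if nothing is close.
    """
    best = None
    for fid, verts in roads.items():
        for a, b in zip(verts, verts[1:]):
            q, d = project(p, a, b)
            if d <= max_dist and (best is None or d < best[2]):
                best = (fid, q, d)
    return best


roads = {"MAIN": [(0, 0), (100, 0)], "RAMP": [(0, 20), (100, 40)]}
print(snap((50, 6), roads, max_dist=15))   # lands on MAIN, 6 units away
print(snap((50, 6), roads, max_dist=5))    # None: map or GPS error too large
```

The second call hints at why spatial accuracy matters: when the combined map and GPS error exceeds the spacing between candidate roads, lanes, or ramps, a nearest-segment rule either fails or picks the wrong feature.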
Vehicle tracking will require proper positioning, both in terms of which road the vehicle is on
and where it is on that road. The ITS community has proposed a "cross-streets" profile that
provides this information in a message format that includes the name of the current street and the
location as the terminal cross-streets for the present block on which the vehicle is located. One
reason for this approach is to avoid the need to have precise GPS-map concurrence regarding
spatial position. Researchers have discovered, however, that current maps do not have sufficiently
reliable and complete street name attributes for this schema to be routinely implemented
(Noronha, 1999). Currently, the ITS database community is proposing to use the coordinates of
intersections as well as street addresses and may add street type for locational referencing.
If work must be done to populate databases with sufficient information to identify which street
and block a vehicle is traversing, then it seems appropriate to develop a more complete approach
that avoids the remaining problems, such as differences in spelling that may arise for street names
in different databases. The purpose of a transportation feature identification schema (perhaps
with a redundant cross-street index) is to ensure completeness and currency of databases.
In-vehicle navigation systems will provide the greatest challenge in terms of spatial and
temporal accuracy for road map databases. Current technology supports generalized client-based
networks for minimum path routing (based on typical speeds or impedances) that produces
instructions in terms of street names and turns. This is based on a road map base that snaps GPS
vehicle-tracking data to road vectors.
In the near future, we are likely to see detailed server-based dynamic routing based on current
network traffic conditions with instructions including ramp and signage details and snapping of
GPS vehicle tracking data to lanes and ramps. The coding of topology using formal and widely
recognized transportation feature identifiers will allow vehicle routing to be done without reliance
on maps.
The chief problem with transportation networks is the perception that there is one base
network that will satisfy all applications. We contend this is a false premise, as someone will always
want to add or delete links. Networks ought to be application-specific, and consequently, the
network cannot be the building block of sharable or interoperable transportation data.
cameras used to estimate vehicle flow rates. This stream of data will require format standards.
Dailey et al. (1999) provide a self-describing method for transfer of data in real time for ITS
applications.
8. Conclusion
Sharing of GIS-T data poses challenges. This paper identified the issues and developed a
framework and principles to address them. The central principle is the establishment of a schema
for transportation features and their identifiers. An underlying principle is the need for a common
data model that holds transportation features as the object of interest, and in which attributes of
transportation features are represented as linear and point events that are located along the
feature using linear referencing. Until there is agreement on these principles, data sharing and
interoperability will not progress well. This lack of agreement stems from the current state of flux
with respect to GIS-T data models. This problem should diminish with the completion of the
NCHRP 20-27(3) project and final adoption of the NSDI framework transportation identification
standard.
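The data model principle stated above can be made concrete with a minimal sketch. It assumes a feature carries a permanent identifier and that attributes are stored as linear and point events located by a measure along the feature. The class names, field names, and the identifier format "OR-0001" are illustrative assumptions, not part of the enterprise GIS-T data model itself.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class LinearEvent:
    """Attribute that applies between two measures along a feature."""
    attribute: str
    value: str
    from_measure: float
    to_measure: float


@dataclass(frozen=True)
class PointEvent:
    """Attribute located at a single measure along a feature."""
    attribute: str
    value: str
    measure: float


@dataclass
class TransportationFeature:
    feature_id: str          # permanent, unique identifier (hypothetical format)
    name: str
    length: float            # length in the feature's own linear units
    linear_events: list = field(default_factory=list)
    point_events: list = field(default_factory=list)

    def attributes_at(self, measure: float) -> dict:
        """Return the linear-event attributes in effect at a given measure."""
        return {
            e.attribute: e.value
            for e in self.linear_events
            if e.from_measure <= measure < e.to_measure
        }


# Illustrative data only.
road = TransportationFeature("OR-0001", "Example Road", 10.0)
road.linear_events.append(LinearEvent("surface", "asphalt", 0.0, 6.0))
road.linear_events.append(LinearEvent("surface", "concrete", 6.0, 10.0))
road.linear_events.append(LinearEvent("lanes", "2", 0.0, 10.0))
road.point_events.append(PointEvent("sign", "stop", 9.9))
```

In this arrangement the feature, not a network link, is the unit that is identified and exchanged; an application would build its own links and nodes from the features and events it receives.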
In this context, sharing of transportation data involves exchange of relevant transportation
features and events, not links and nodes of application-specific databases. This is a major
departure from the existing process of conflation. The exchange of more fundamental features is
encouraged in recognition that each application has quite specific requirements for its end-use
database, but all users have need for basic transportation features. The major contribution of our
enterprise GIS-T data model is this separation of the maintenance database from many
application databases. The term enterprise takes on a broader meaning of a shared stewardship of data
about the larger transportation system.
Strategies for the sharing of transportation features follow from this approach. The first is to
enlist state and local cooperators to construct transportation features by registering existing
transportation vector data from TIGER and local sources to digital orthophotography, such as
the USGS orthophotography quarter-quadrangles. A second stage would be to update the vectors
using replications of vehicle tracking data. The fundamental strategy is to identify features in the
database to facilitate a transactional update system, one that does not require rebuilding the entire
database anew.
Although this approach to the sharing of transportation data needs to be refined, it provides a
better framework than that which currently exists. There is no common approach among the
communities of ITS, vehicle navigation database vendors, NSDI, state and local transportation
organizations, and E-911. However, we are encouraged by the continuing efforts of these groups
to reach a more consistent approach that will facilitate data sharing by reducing the number of
inconsistent and duplicative representations.
Transportation features are the fundamental objects in the database and must be uniquely and
permanently identified. Similarly, additions, deletions, and modifications to transportation
features must be identified to facilitate database updates and maintenance. This approach, or data
model, separates the update and maintenance of transportation features from the links of
networks used in specific applications. Thus, different networks can be generated from a common set
of transportation features and one update process can support a set of applications.
36 K.J. Dueker, J.A. Butler I Transportation Research Part C 8 (2000) 13-36
References
Adams, T.M., Vonderohe, A.P., Butler, J.A., 1998. Multimodal, multidimensional location referencing system
modeling issues. Prepared for NCHRP 20-27(3). Workshop on Functional Specifications for Multimodal,
Multidimensional Transportation Location Referencing Systems, Washington, DC, 3-5 December 1998.
Bender, P., Bosworth, M., Dueker, K., 1999. ORBIT: The Oregon Road Base Information Team: the Canby case study
report. Center for Urban Studies, Portland State University, Catalog Number PR111.
http://www.upa.pdx.edu/CUS/PUBS/contents.html.
Bosworth, M., Dueker, K., Wuest, P., 1998. ORBIT: The Oregon Road Base Information Team: a draft summary
report. Center for Urban Studies, Portland State University, Catalog Number PR106, 19 p.
http://www.upa.pdx.edu/CUS/PUBS/contents.html.
Brown, J., Rao, A., Baran, J., 1995. Are you conflated? Integrating TIGER and other data sets through automated
network conflation. In: GIS-T Symposium Proceedings of the American Association of State Highway and
Transportation Officials, 220-229.
Butler, J.A., Dueker, K., 1998. A proposed method of transportation feature identification. Center for Urban Studies,
Portland State University, Catalog Number DP97-8, 18 p. http://www.upa.pdx.edu/CUS/PUBS/contents.html.
Chan, K., 1998. DIGEST: A primer for the international GIS standard. CRC Press, Boca Raton, FL.
Dailey, D., Meyers, D., Friedland, N., 1999. A self describing data transfer methodology for ITS applications. Preprint
CD-ROM. 78th Annual Meeting, January 1999, Transportation Research Board.
Dueker, K., Butler, J.A., 1998. GIS-T enterprise data model with suggested implementation choices. Journal of the
Urban and Regional Information Systems Association 10 (1), 12-36.
Devogele, T., Parent, C., Spaccapietra, S., 1998. On spatial database integration. International Journal of Geographic
Information Sciences 12 (4), 335-352.
FGDC: Federal Geographic Data Committee (Ground Transportation Subcommittee), 1999. NSDI Framework
Transportation Identification Standard - Working Draft, 1999 May. http://www.bts.gov/gis/fgdc/web_intr.html.
GDF: Geographic Data Files Standard, version 3.0, 1995. Working Group 7.2 of CEN Technical Committee 278,
European Committee for Standardization.
GDF: Geographic Data Files Standard, version 5.0, 1999. ISO/TC 204/SC/WG3. ISO/WD 19990722-2.
Noronha, V., 1999. Spatial data interoperability for ITS. Presentation at Annual Meeting of the Transportation
Research Board, Washington, DC.
Sester, M., Anders, K., Walker, V., 1998. Linking objects of different spatial data sets by integration and aggregation.
GeoInformatica 2 (4), 335-358.
Sperling, J., Sharp, S., 1999. A prototype cooperative effort to enhance TIGER. Journal of the Urban and Regional
Information Systems Association 11 (2), 35-12.
Sutton, J., 1999. Object-oriented network data structures: the road to interoperability. Presentation at GIS-T
Symposium, April 1999, San Diego.
Tomaselli, L., 1994. Topological transfer: evolving linear GIS accuracy. URISA Proceedings, 245-259.
Vonderohe, A., Chou, C., Sun, F., Adams, T., 1998. A generic data model for linear referencing systems. Research
Results Digest Number 218. National Cooperative Highway Research Program, Transportation Research Board.
Vonderohe, A., Hepworth, T., 1998. A methodology for design of measurement systems for linear referencing. Journal
of the Urban and Regional Information Systems Association 10 (1), 48-56.
Walter, V., Fritsch, D., 1999. Matching spatial data sets: a statistical approach. International Journal of Geographic
Information Sciences 13 (5), 445-473.
Abstract
Keywords: Dynamic location; Temporal version control; Combined linear and geodetic referencing
0968-090X/00/$ - see front matter © 2000 Elsevier Science Ltd. All rights reserved.
PII: S0968-090X(00)00028-0
38 J.C. Sutton, M.M. Wyman I Transportation Research Part C 8 (2000) 37-52
1. Introduction
The purpose of this paper is to review some of the problems of integrating transportation data
in geographic information systems (GIS) and to present a new data model that provides a robust
method for referencing and managing temporal and spatial changes. The model has been
developed from field practice and experience, and utilizes off-the-shelf database and GIS software
components. The model has been tested on road network files only, as described below, but
should be applicable to transit networks also. The model, referred to as dynamic location, has
been developed in response to some of the limitations in GIS with respect to the management of
networks and linearly referenced data. It also takes advantage of some of the technology changes
affecting the use of geospatial technologies, especially the use of object-oriented programming
techniques and object relational database management systems (ORDBMS). The tool is a
synergistic counterpart to dynamic segmentation.
Linear referencing is a core transportation method because many transportation features are
linear in nature. This referencing process is easily implemented in a tabular environment, but
complicated by limitations within GIS. GIS has been an important information technology tool in
transportation because of its ability to perform a spatial intersect on network overlays, an analysis
that structured query language (SQL) cannot support because data order (location) must be
considered. There are two ways to conduct a spatial intersect: the intersection of topologically related
tables, or the mathematical graphing and subsequent measuring of graphed spatial relationships.
Topological methods separate attribute data from geometry and are a requirement when greater
accuracy is afforded by physically measuring the earth, say with a calibrated odometer, and
storing the values. These accurate values can then be mapped for display using dynamic
segmentation. Although this process preserves field-measured accuracy, it separates data from its
GIS geometry (Dueker and Vrana, 1992; Adams and Vonderohe, 1998; Bespalko et al., 1998), and
this prevents the use of common IT tools such as temporal and spatial version control and
database rollback. Four problems result.
First, there is not a one-to-one relationship between data records and spatial references. This is
an interesting point, because tools such as dynamic segmentation consider the ability to map
many-to-one relationships an asset. In many cases it is. However, much more real-world
infrastructure and events are discrete, and when treated as an object, a one-to-one relationship is
created. This opens the door for a new tool that maximizes the one-to-one object relationship to
synchronize data and geometry, and thus regain spatial and temporal control.
Second, additional problems are encountered when poor map scale or generalization cannot
elucidate modern transportation infrastructure (Sutton, 1997; Bespalko et al., 1998, 1999). Such
situations are called network pathologies (Sutton, 1995, 1996a); examples include ramps and
connectors, frontage and service roads, interchanges, stacked decks, high-occupancy lanes and
contra-flow lanes whose directionality may vary with time. Similar difficulties arise when integrating and
managing multiple network topologies of the same real-world network feature, for example, integrating
street centerline files and model networks for highway and transit demand modeling
(Mainguenaud, 1995; Kwan et al., 1996; Horowitz, 1997; Kwan and Golledge, 1997; Sutton, 1996b, 1997).
Since small-scale generalized maps could not mathematically manage complicated geometry,
topological intersect tools emphasized many-to-one relationships and not graphical intersect
means. In the case of transportation infrastructure management, it was easier to physically
measure the distances between field infrastructures using a calibrated odometer, build
many-to-one relational tables, and live with the synchronization problems of topology. However, accurate
technologies such as GPS, meter-level imagery, and light detection and ranging (LIDAR) can
create mapping grids that surpass the accuracy that can be gained from traditional field
measurement techniques (Burkholder, 1997). There is opportunity to utilize GIS as a Cartesian model
of those data. Dynamic segmentation relates tabular data to the prevailing GIS Cartesian grid,
while dynamic location assumes a GIS that is sufficient to preserve and mathematically
manipulate data without tables.
Third, because relative locations are a function of a route and distance from origin (DFO), data
management is complicated when DFO points are moved, routes are renamed, or roadbeds are
realigned. Experience has shown that there are many difficulties in integrating relative locations
(route and DFO) used in the transportation industry with the geodetic coordinates (latitude and
longitude, state plane coordinates (SPC), or Universal Transverse Mercator (UTM)) used by most
other mapping disciplines. Attempts by researchers to develop a standard methodology have
complex implementations (Fletcher et al., 1995, 1996; Vonderohe and Hepworth, 1996; Dueker
and Butler, 1998a). Further, data are neither readily exchanged between different linear
referencing systems (LRS) nor with the three-dimensional coordinate systems used by GPS. There is
incompatibility between relative and absolute locations. For example, a hazardous waste spill at
Route 66, milepost 23 is not easy to pass on to others who are not using this method and do not
have the geographic coordinates.
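The gap between relative and absolute location can be illustrated with a short sketch. It assumes a route is available as a list of (x, y, m) vertices with strictly increasing measures, and converts a route and DFO reference into coordinates by linear interpolation between the bracketing vertices. The function name and the sample geometry are hypothetical.

```python
from bisect import bisect_right


def locate(route, dfo):
    """Interpolate the (x, y) position at a distance from origin (DFO).

    route is a list of (x, y, m) vertices sorted by strictly increasing m.
    """
    measures = [m for _, _, m in route]
    if not measures[0] <= dfo <= measures[-1]:
        raise ValueError("DFO lies outside the calibrated route")
    # Index of the vertex that ends the segment containing the DFO.
    i = min(bisect_right(measures, dfo), len(route) - 1)
    (x0, y0, m0), (x1, y1, m1) = route[i - 1], route[i]
    t = (dfo - m0) / (m1 - m0)
    return (x0 + t * (x1 - x0), y0 + t * (y1 - y0))


# Hypothetical calibrated route: measures are mileposts, x and y are map units.
route66 = [(0.0, 0.0, 0.0), (30.0, 0.0, 20.0), (30.0, 40.0, 30.0)]
spill_xy = locate(route66, 23.0)   # milepost 23 expressed as map coordinates
```

Without the measured geometry in hand, the recipient of "Route 66, milepost 23" has no way to perform this conversion, which is the incompatibility described above.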
Last, the topology necessary to accomplish a spatial intersect (network overlay) by dynamic
segmentation is complex, computationally intensive, and fails to supply query and analysis results
in a reasonable time period within an enterprise environment. Object models streamline this
process and consume less bandwidth; however, a linear spatial intersect process has been
unavailable, preventing the adoption of object models within transportation GIS. Dynamic location
provides this tool.
As indicated above, attempts have been made to resolve these problems. These
implementations fall into three broad categories.
object-oriented methods. ESRI and Intergraph are also developing new versions of their dynamic
segmentation programs that are similarly object based.
Since 1994, GIS-T researchers and practitioners have been attempting to develop a generic data
model for linear referencing that would be compatible with existing field practices and GIS
software. Beginning with the initial NCHRP 20-27 project on adaptation of GIS for
transportation (Vonderohe et al., 1995), the research evolved into the GIS-T Linear Referencing
Pooled-Fund Study (Fletcher et al., 1995), and then to the current NCHRP 20-27(3) project on guidelines
for the implementation of multimodal transportation location referencing systems (Adams and
Vonderohe, 1998). At present, two states, Iowa and Minnesota, are implementing these models
using Intergraph and ESRI software, respectively.
As introduced above, GIS can be one of two things: a map linked to data (a relational model),
or data stored as a mathematically tractable model (an object or iconic model). Since mapping
technologies could not support sufficiently reproducible mathematical models, both research and
application have stressed GIS as a linkage to accurate tabular data. Tools such as dynamic
segmentation assumed it was best to keep data separate from maps, and live with the problems of
synchronizing data and geometry.
Dynamic location takes the second approach. If sufficient data are available to build an iconic
GIS model, what new tools are required and what new benefits result? GIS software has already
shifted into the object model world. It creates a one-to-one relationship between real-world and
database features, thus providing temporal and spatial synchronization. What has not been
provided is a means of performing linear spatial intersect.
3. Dynamic location
Fig. 1. RDLO is modeled with {x, y, m} coordinate tuples at each vertex. EDLO require only {x, y} pairs.
dynamic location objects (DLO) database record, and facilitates spatial intersect between EDLO
and RDLO. From and to distance fields are not required because they may be derived by looking
at the EDLO end point locations relative to the {x, y, m} coordinate of any desired RDLO. Any
combination of EDLO may be intersected with any other event or route DLO in this manner. Old
views of separate geometry with detached tables of explicit from, to, and route values are
discarded. Another way of conceptualizing the EDLO and RDLO is as logical networks (EDLO)
referencing the underlying geometric network (e.g., RDLO street centerline file). The logical
EDLO (e.g., pavement section or bus route) corresponds to the geometric network (e.g., highway
270) by location (linear and absolute), through spatial coincidence. No complex linkage of tables
to topology is required.
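The derivation of from and to measures from end point locations can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it assumes an RDLO is a list of (x, y, m) vertices and an EDLO a list of (x, y) vertices, and it obtains each measure by projecting an EDLO end point onto the nearest RDLO segment. The function names are hypothetical.

```python
def measure_at(route, point):
    """Return the m value of the location on the route nearest to an (x, y) point."""
    px, py = point
    best = None  # (squared distance, interpolated measure)
    for (x0, y0, m0), (x1, y1, m1) in zip(route, route[1:]):
        dx, dy = x1 - x0, y1 - y0
        seg_len_sq = dx * dx + dy * dy
        if seg_len_sq == 0:
            continue  # skip degenerate segments
        # Fraction along the segment of the perpendicular foot, clamped to the segment.
        t = ((px - x0) * dx + (py - y0) * dy) / seg_len_sq
        t = max(0.0, min(1.0, t))
        qx, qy = x0 + t * dx, y0 + t * dy
        d2 = (px - qx) ** 2 + (py - qy) ** 2
        if best is None or d2 < best[0]:
            best = (d2, m0 + t * (m1 - m0))
    if best is None:
        raise ValueError("route has no usable segments")
    return best[1]


def event_measures(route, edlo):
    """Derive (from, to) measures for an EDLO from its two end points."""
    first, last = measure_at(route, edlo[0]), measure_at(route, edlo[-1])
    return (min(first, last), max(first, last))


# Hypothetical RDLO and a pavement-section EDLO lying on the same ground track.
rdlo = [(0.0, 0.0, 0.0), (10.0, 0.0, 10.0), (10.0, 10.0, 20.0)]
pavement = [(2.0, 0.0), (10.0, 0.0), (10.0, 5.0)]
from_m, to_m = event_measures(rdlo, pavement)
```

Because the EDLO stores only {x, y} locations, the same call against a differently calibrated RDLO would yield measures in that other referencing system with no change to the event record.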
This builds a stable geodetic data model with the functional benefits of linear referencing. Many
RDLO can follow the same ground track, each calibrated to any datum or linear reference system.
Any EDLO can then be referenced to any location referencing system, linear or Cartesian. Since
each record stores both the spatial object and attribute data, any table or set of values may be
developed on the fly by placing the desired EDLO against any RDLO and building an intersect table
by reading down the number line (RDLO). If a roadbed is renamed, realigned, or recalibrated,
EDLO do not change, because they are {x, y} absolute locations not assigned to any route. To
obtain new measures, bring in a new RDLO and develop DFO values as desired.
The dynamic location process is illustrated in Fig. 3. The desired intersect product is a table
with the segmentation of surface type and condition in terms of a stated route system. The process
begins by spatially selecting the surface type and surface condition EDLO that meet query
constraints, as well as the desired RDLO to view the solution. In this example, the three-component
cursor set contains nine records (4 surface type, 3 surface condition and 2 route sections), each
equipped with geographic shape and a single attribute. Since the cursor is both small and
independent of topological constraint, data may be rapidly transmitted from the server to the client.
Once records have been selected by the server, the client plots the three DLO sets. Next, the
from and to EDLO points, and associated single attribute values, are defined and linked to the
RDLO. Last, the points along the RDLO are read in increasing order to build the query table at
the bottom half of Fig. 3. The tabular data are not used to generate a geographic picture as in
dynamic segmentation, but rather geographic shapes are used to create ordered tabular data in the
output query table.
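The final step, reading the points along the RDLO in increasing order, can be sketched in a few lines. The sketch assumes the from and to measures of each EDLO have already been derived against the chosen RDLO, and it uses four surface type and three surface condition records to mirror the counts in the example. The attribute values, the route identifier, and the function name are invented for illustration.

```python
def intersect_table(route_id, event_sets):
    """Build an ordered intersect table from several sets of linear events.

    event_sets maps an attribute name to a list of (from_m, to_m, value) tuples.
    """
    # Every event end point is a potential break in the output segmentation.
    breaks = sorted(
        {m for events in event_sets.values() for f, t, _ in events for m in (f, t)}
    )
    rows = []
    for lo, hi in zip(breaks, breaks[1:]):
        mid = (lo + hi) / 2  # any interior point identifies the covering events
        row = {"route": route_id, "from": lo, "to": hi}
        for name, events in event_sets.items():
            row[name] = next((v for f, t, v in events if f <= mid < t), None)
        rows.append(row)
    return rows


# Hypothetical measures and values.
events = {
    "surface_type": [(0, 2, "asphalt"), (2, 5, "concrete"),
                     (5, 7, "asphalt"), (7, 10, "gravel")],
    "surface_condition": [(0, 3, "good"), (3, 8, "fair"), (8, 10, "poor")],
}
table = intersect_table("RDLO-1", events)
```

The seven distinct break points produce six output rows, each carrying one value per attribute; no from, to, or route columns had to be stored with the events beforehand.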
This same method can also transfer data between any set of referencing systems in terms of
multiple attributes (intersect), different linear measurement systems, and linear and geodetic
referencing systems. Using snapping procedures, it is not necessary to use systems of precisely
coincident shapes; however, loss of precision is encountered as scales decrease and data become
spatially inconsistent. It is also important to note that only precise, that is repeatable, data should
be used (e.g., replication of an agreed upon set of roadbed centerlines). Inaccurate data, such as
smaller scale maps, can also be used without loss of information. The key is the replication of the
same consistent shape.
As shown in Fig. 4, RDLO and EDLO have equal status, and have no relational linkages.
Relationships between selected RDLO and EDLO are developed on demand through geodetic
stacking or proximity. It is important to note that by definition, the object model provides version
control because spatial and attribute data may be stored in the same record. This is by design, and
the fundamental benefit of the object model. However, the object model cannot be readily adapted
to GIS-T business practice because no network overlay tools are available outside dynamic
segmentation. Dynamic location provides this intersect functionality, and thus the full benefit of
the spatial object model may be exploited.
Fig. 4. There are no relationships between {x, y} EDLO event and {x, y, m} RDLO.
One of the advantages of this model is that different types of objects can be referenced to more
complex, larger scale geometry. This is illustrated in Fig. 5, where four different types of objects or
transportation elements can be referenced to the base map. In each case, the absolute location is
retained even when the object may be referenced linearly for network modeling purposes.
The DLO object model can integrate four types of spatial objects (see Fig. 5):
1. Route objects form the navigable network on the centerlines. Dynamic segmentation and
dynamic location can reference and develop data using these networks. The set of all route
objects contains calibrated roadbed centerlines, modal travelway centerlines (e.g., HOV lanes),
transport route events, such as transit stops, and other logical networks, such as a pedestrian
network.
2. File objects group non-spatial data by location. File object icons are located at various physical
network locations, for example, an intersection. Clicking on the icon will present the user with
a pick list of file information associated with that icon. Example files would include road
infrastructure drawings, traffic signal timing diagrams, and traffic count data.
3. Transport data objects are iconic models of transport infrastructure and are located at their
real-world map locations using drawn shapes that reflect their real-world shape. Attribute data are
then attached via the feature ID to the object. Examples include road signs or crosswalks for pedestrians.
4. Dynamic link objects are icons that represent the real-time state of bit streams. Icons are placed
at their real-world map locations and can display traffic signal status, loop detector status,
CCTV images, and other real-time data.
4. DLO creation
DLO creation, either RDLO or EDLO, begins with the adoption of a base network. Once the
base network has been adopted, it must be maintained, as EDLO and RDLO are replicated from
it (thus precise, not necessarily accurate). Since dynamic location supports an object model, it is
important to adopt a network of sufficient scale to avoid network pathologies. Roadbed based
models have proven the most successful mapping strategy for dynamic location, as described in
the case study later.
The adopted network should be viewed as raw material in pure link-node form. That is, one
continuous line network between intersections and cul-de-sacs. From this raw material, RDLO and
EDLO may be marked out. For the creation of RDLO, select appropriate DFO start points and
use these points to create the desired route system from the base network, then calibrate the {x, y,
m} tuples accordingly. To create a second RDLO, a different set of DFO points is applied to the
original unsplit base network, which is then calibrated. Likewise, pavement attribute change
points may be collected as GPS points and tagged with the attributes leading into that point.
These points are then used to mark out the EDLO, and the attributes from the points are snapped
onto the EDLO line segments. Note how this system is designed to collect field data as absolute
points, and not measured linear segments. Mathematical operation is carried out on the GIS
Cartesian grid, and not in tables of calibrated odometer distances. This marks a fundamental shift
in GIS, from a map linked to data, to a geometric and mathematically tractable model of data.
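Calibration of the {x, y, m} tuples can be sketched under a simplifying assumption: that the measure is the planar distance accumulated from the chosen DFO start point. A surface model measured against elevation, or a route calibrated to field reference markers, would need a different rule. The function name and coordinates are hypothetical.

```python
import math


def calibrate(vertices, start_measure=0.0):
    """Attach a cumulative planar-distance measure m to each (x, y) vertex."""
    route = []
    m = start_measure
    prev = None
    for x, y in vertices:
        if prev is not None:
            m += math.hypot(x - prev[0], y - prev[1])
        route.append((x, y, m))
        prev = (x, y)
    return route


# One unsplit base line, calibrated twice from different DFO start points.
base = [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)]
rdlo_a = calibrate(base)                     # origin at the first vertex
rdlo_b = calibrate(list(reversed(base)))     # origin at the opposite end
```

Both RDLO follow the same ground track and differ only in their measures, so an EDLO stored as absolute points can be read against either one.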
Changes in EDLO are simple because geometry does not change. Old data are closed out with
from and to dates, with new concurrent geometry opened. When the base geometry changes, there
is a point where the new network departs the old roadbase and a point where the new geometry
rejoins the old network. These two points are used to cut the old section out of all stacked
geometry sets. The retired pieces are closed out with from and to dates, and new sections with the
new geometry are replicated and placed in the database. For example, if there were a mile stretch of
concrete roadbed whose last half-mile was realigned, the section would be cut at the half-mile
point. The first half-mile would keep its old attribute set and start date, while the second section
after the half-mile point would receive the new geometry and attributes. The process has
proven easy to set up, and effective, using standard GIS programming languages.
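The half-mile example can be traced with a small sketch. It makes several simplifying assumptions that are not in the text: sections are described by measures rather than stored shapes, a text label stands in for the geometry, and the realigned piece is given the same measure range as the piece it replaces. All names are hypothetical.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Section:
    from_m: float
    to_m: float
    surface: str
    geometry: str                 # label standing in for the stored shape
    begin: date
    end: Optional[date] = None    # None means the section is still current


def realign(sections, cut_from, cut_to, new_geometry, new_surface, when):
    """Retire the current pieces inside [cut_from, cut_to] and open a replacement."""
    result = []
    for s in sections:
        untouched = s.end is not None or s.to_m <= cut_from or s.from_m >= cut_to
        if untouched:
            result.append(s)
            continue
        lo, hi = max(s.from_m, cut_from), min(s.to_m, cut_to)
        if s.from_m < lo:
            # Piece before the cut keeps its attributes and start date.
            result.append(replace(s, to_m=lo))
        # Piece inside the cut is closed out but kept for history.
        result.append(replace(s, from_m=lo, to_m=hi, end=when))
        if hi < s.to_m:
            result.append(replace(s, from_m=hi))
    result.append(Section(cut_from, cut_to, new_surface, new_geometry, begin=when))
    return result


mile = [Section(0.0, 1.0, "concrete", "alignment-1990", date(1990, 1, 1))]
updated = realign(mile, 0.5, 1.0, "alignment-1999", "asphalt", date(1999, 6, 1))
```

The result holds three records: the unchanged first half-mile, the retired second half-mile with its end date, and the newly opened section. Nothing is deleted.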
representation of the correct time period. This procedure has not been well implemented within
GIS-T because the practice of dynamic segmentation separates events from their respective
geometric locations. Traditionally, geometric changes to the roadbed have been calibrated with
the routes and events by remeasurement using the linear referencing system. This led to
inaccuracies in data location on those sections unaffected by the road realignment. The NCHRP and
other data models deal with these anomalies through the use of reference points to provide greater
control in the remeasurement process. Retired sections of road with the old ID can be queried as
can new sections with the new ID, but they cannot be queried together. This is because they are
referenced to separate route systems, control sections, and reference points. Even where the same
datum is used (anchor sections and anchor points in the NCHRP model) the location of event
data still relies on linear measurement, and without absolute location to correspond to the
underlying cartographic representation, versioning is problematic. In practice, events, routes and
route systems are duplicated for each time period that a geometric change occurs, which
introduces a lot of redundancy and overhead into the data management process. In contrast, though
object models provide this capability, they have not been adopted by GIS-T because
spatio-temporal intersect capabilities are not supported in GIS software.
Since dynamic location provides spatial intersect capabilities, the object model may be used to
manage geometry and attribute information in a single monomial record that is temporally bound and
controlled with begin and end dates. Following any query, spatial and/or temporal, selected
attributes will already contain the applicable network for the time period in question. Point events
will be a set of points, linear events a set of lines, and continuous events a representation of the
complete network at a given point in time. Route and EDLO will be added to the database,
opened and closed with begin and end dates, but never removed.
To answer the query, "What was the pavement composition and alignment as they appeared in
1997?", a spatial index query is conducted first to extract all pavement records within the
geographic area of interest, independent of date (see Fig. 6). A second SQL pass through the cursor
set selects those spatial records meeting temporal constraints. Note how expired DLO reside
within the active database. Since time-expired event and RDLO contain their own geometry,
placing historical events over the correct historical network is simple.
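The two-pass query can be sketched without a database, under the assumption that each record carries its own shape together with begin and end dates. A bounding-box vertex test stands in for the spatial index, a list filter stands in for the second SQL pass, and the record layout is hypothetical.

```python
from datetime import date


def in_bbox(shape, bbox):
    """True if any vertex of the shape falls inside (xmin, ymin, xmax, ymax)."""
    xmin, ymin, xmax, ymax = bbox
    return any(xmin <= x <= xmax and ymin <= y <= ymax for x, y in shape)


def as_of(records, bbox, when):
    """Select records in the area of interest that were in effect on a date."""
    # First pass: spatial selection only, independent of date.
    cursor = [r for r in records if in_bbox(r["shape"], bbox)]
    # Second pass: temporal constraint applied to the spatial cursor set.
    return [
        r for r in cursor
        if r["begin"] <= when and (r["end"] is None or when < r["end"])
    ]


# Hypothetical pavement records; the expired record stays in the active set.
pavement = [
    {"id": "A", "surface": "asphalt", "shape": [(0, 0), (5, 0)],
     "begin": date(1990, 1, 1), "end": date(1998, 6, 1)},
    {"id": "B", "surface": "concrete", "shape": [(0, 0), (5, 1)],
     "begin": date(1998, 6, 1), "end": None},
    {"id": "C", "surface": "asphalt", "shape": [(50, 50), (60, 50)],
     "begin": date(1985, 1, 1), "end": None},
]
in_1997 = as_of(pavement, (0, 0, 10, 10), date(1997, 7, 1))
```

Record A is returned with its own 1997 geometry, so the historical pavement can be drawn on the alignment that existed at the time.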
As new transportation features are added or subtracted, new shapes are added to the database,
while old shapes are closed with an "end date" entry and maintained in the same database. Route
renaming and realignments are both addressed in this manner. Since dynamic location is
essentially a linearly enabled geodetic model, recalibration of measures need not be accomplished.
The location object concept is being applied in the design of the transportation data layer for
the Texas Strategic Mapping (StratMap) project, sponsored by the Texas Water Development Board
but involving several state agencies and the USGS. StratMap is designed to integrate disparate
networks from Texas DOT, counties, cities and other sources in a unified data layer. In addition,
the network will have multiple attributes associated with it. Thus, multiple spatial and attribute
data are being conflated into a single network file. It was apparent that the traditional methods of
integrating the data in GIS, whether by network conflation or linear referencing, would not be
robust enough to accomplish the task. There were also issues of tracking data changes over time
as well as being able to integrate spatial data at different levels of resolution and detail, from
1:1000 to 1:24000 scale. The data model also had to accommodate use of GPS and ortho-rectified
images that may be used to update data in future, at great accuracy (~1 m precision). It was
decided to use the DLO data model, which had previously been developed for the Texas
Department of Transportation to enable TxDOT to integrate the Texas Reference Marker (TRM)
System (the state LRS) with a new GPS based referencing schema. By adopting the Texas Linear
Measurement System (TLMS), StratMap will be consistent with TxDOT, and will use the same
methodology for managing and updating the transportation data layer.
The DLO model as outlined above was programmed in the GIS environment in use at TxDOT,
namely Arc/Info and ArcView. One of the reasons TxDOT were open to the new TLMS data
model is that the integration of the TRM in Arc/Info relied upon dynamic segmentation, and
only those users trained in this method and with access to Arc/Info could perform the spatial
queries required. TxDOT were therefore looking for a desktop GIS solution, and one that was
easier to learn and quicker to execute. The DLO method accomplishes both these objectives. For
example, tests conducted on a sample data set in Arc/Info using dynamic segmentation versus the
DLO query method showed dramatic performance enhancement. The same result in the DLO was
accomplished in a fraction of the time it takes dynamic segmentation to compute the spatial
intersect. This is because most of the data processing is performed in the database rather than the
GIS, without the added overhead of managing network topology.
Fig. 7 illustrates the format of the temporally controlled DLO table used in TLMS, while Fig. 8
illustrates the entities of the entire model. Note how events and geometry are controlled by the
dates they are opened and closed to traffic. A second set of dates is used to mark when the objects
were entered and retired from the database. The prior date pair offers version control, and the
latter database rollback functionality. The lock feature is provided to prevent any attribute
changes, should a piece of geometry enter an edit mode or transition stage. Surface length is
carried as a separate field because TLMS was designed as a surface model, where route segments
are always measured against a digital elevation model. As configured in Fig. 8, the model
accommodates multiple LRS, geodetic coordinates (GPS), and linear, continuous, and point EDLO.
48 J.C. Sutton, M.M. Wyman / Transportation Research Part C 8 (2000) 37-52
Anchor points are not required by the model, but are used here to carry field descriptors for begin
and end points.
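The two date pairs described above can be illustrated with a minimal Python sketch. The record layout and the names `DloRecord`, `open_to_traffic` and `database_as_of` are assumptions made for illustration only; the actual TLMS table format is the one shown in Fig. 7.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class DloRecord:
    # Hypothetical field names; the real TLMS columns may differ.
    feature_id: str
    opened: date               # date the feature opened to traffic
    closed: Optional[date]     # date it closed to traffic (None = still open)
    entered: date              # date the record entered the database
    retired: Optional[date]    # date the record was retired (None = current)


def _covers(start: date, end: Optional[date], when: date) -> bool:
    """True if 'when' falls in the half-open interval [start, end)."""
    return start <= when and (end is None or when < end)


def open_to_traffic(records: List[DloRecord], when: date) -> List[DloRecord]:
    """Features that were open to traffic on the given date."""
    return [r for r in records if _covers(r.opened, r.closed, when)]


def database_as_of(records: List[DloRecord], when: date) -> List[DloRecord]:
    """Records that were live in the database on the given date."""
    return [r for r in records if _covers(r.entered, r.retired, when)]
```

Filtering on the first pair answers "what was on the ground on this date", while filtering on the second answers "what did the database contain on this date"; the two questions are independent, which is why the model carries both pairs.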
The DLO data model prototype was developed using ArcView 3.1 and Access database
management system. ArcView's measured shape file format, polyLineM, was utilized to link
features referenced by absolute (x, y, {z} coordinates) and linear measurement (m value). The
polyLineM data type joins the functionality of both geodetic and LRS. The {x, y, m} coordinate
tuple creates a one-to-one correspondence between relative and absolute locations. This
builds a stable geodetic data model with the functional benefits of linear referencing. Roads
and transportation features are modeled as separate and independent geometric artifacts, i.e.,
photograph to locate features on the highway that may serve as anchor points for linear
referencing of state highway features or intersection points with other network files. Note how the
Feature ID is a concatenation of the latitude and longitude, thus establishing the absolute location
for the designated feature.
7. Summary and conclusions
Dynamic segmentation builds geographic displays from tabular data on demand, and suffers
the pitfalls of storing tabular data separate from its geometry. Dynamic location is an
object-oriented method of encapsulating transportation attributes with their geographic shape and
location in a single database record. This facilitates full spatial and temporal query, storage, and
version control. In contrast to dynamic segmentation, tabular data are created on demand from
geographic shape.
Spatial intersect between any number of EDLO is accomplished by stacking DLO end points
over any RDLO and extracting the underlying measure from the {x, y, m} coordinate tuple. Data
may be swiftly exchanged between referencing systems through a stable geodetic datum, while
enjoying the benefits of linear referencing.
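As a rough illustration of extracting a measure from {x, y, m} tuples, the sketch below projects a point onto the nearest segment of a measured polyline and linearly interpolates m between the segment's vertices. This is an assumed reading of the operation, not the TxDOT implementation; the function names `measure_at` and `event_extent` and the sample coordinates are hypothetical.

```python
import math
from typing import List, Tuple

Vertex = Tuple[float, float, float]  # (x, y, m)


def measure_at(route: List[Vertex], px: float, py: float) -> float:
    """Interpolated m value at the point on 'route' nearest to (px, py)."""
    best_dist = math.inf
    best_m = route[0][2]
    for (x1, y1, m1), (x2, y2, m2) in zip(route, route[1:]):
        dx, dy = x2 - x1, y2 - y1
        seg2 = dx * dx + dy * dy
        # Parameter t in [0, 1] of the projection onto this segment.
        t = 0.0 if seg2 == 0 else max(0.0, min(1.0, ((px - x1) * dx + (py - y1) * dy) / seg2))
        d = math.hypot(px - (x1 + t * dx), py - (y1 + t * dy))
        if d < best_dist:
            best_dist, best_m = d, m1 + t * (m2 - m1)
    return best_m


def event_extent(route: List[Vertex], start_xy, end_xy) -> Tuple[float, float]:
    """Measure range on the route covered by an event given by its two end points."""
    a = measure_at(route, *start_xy)
    b = measure_at(route, *end_xy)
    return (min(a, b), max(a, b))
```

For a route `[(0, 0, 0), (100, 0, 100), (100, 50, 150)]`, an event with end points near (40, 3) and (103, 25) would occupy measures 40 to 125, so two events overlap on the route whenever their measure ranges overlap.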
The dynamic location model changes how linear data are stored in an automated environment.
Geographic shapes are easily stored as binary large objects in a single database field, and become
a precise iconic model that supersedes the need for data decomposition into tabular form.
Relationships between separate iconic objects are inherently bound to the geodetic datum, using
geodetic location as the only data integrator. By using iconic objects rather than relationship
tables, dynamic location is a paradigmatic shift in data management process, and thus the way
transportation data are collected, stored, and used.
While dynamic location is a different model to dynamic segmentation, the two are not mutually
incompatible. It is still possible to use dynamic segmentation with the DLO data or create DLOs
from dynamic segmentation routes and events. In this way, the DLO extends the capabilities of
GIS-T and puts network overlay queries on the users' desktops.
Finally, the DLO model has been developed and tested on a specific GIS platform. However,
the data model is robust enough to work with other GIS software that is capable of storing and
managing measurements in its network data structure, and can import/export data from
database management systems. The BLOB format is supported by Access and Oracle, for example,
and is employed to minimize storage and transmission, a critical factor in Internet access to
spatial data and operations on the server. The DLO model has proved to be a robust method that
meets these requirements using industry standard technology components.
References
Sutton, J., 1995. Network Pathologies Phase 1 Report. Sandia National Laboratories, Project AH-2266, November.
Sutton, J., 1996a. Network Pathologies Phase 2 Report. Sandia National Laboratories, Project AH-2266, March.
Sutton, J., 1996b. The role of GIS in regional transportation planning. Transportation Research Record 1518, 25-31.
Sutton, J., 1997. Data attribution and network representation issues in GIS and transportation. Transportation
Planning and Technology 21, 25-44.
Vonderohe, A.P., Chou, C.L., Sun, F., Adams, T., 1995. Results of a workshop on a generic data model for linear
referencing systems. In: Alan Vonderohe (Ed.), Summary of a workshop sponsored by the National Cooperative
Highway Research Program (NCHRP), Project No. 20-27. University of Wisconsin, Madison, WI.
Vonderohe, A.P., Hepworth, T.D., 1996. A Methodology for Design of a Linear Referencing System for Surface
Transportation, Final Report. Project AT-4567, Sandia National Laboratories.
Westcott, B. (Ed.), 1997. NSDI Framework Road Data Modeling Workshop. Summary of a workshop held in
Wrightsville Beach, NC, 3-5 December.
Abstract
We were contracted to test a suite of proposed location messaging standards for the intelligent
transportation systems (ITS) industry. We studied six different databases for the County of Santa Barbara,
documented types and magnitudes of error, and examined the likely success of the proposed standards.
This paper synthesizes the test results and identifies caveats for the user community as well as challenges to
academia. We conclude that, first, current messaging proposals are inadequate, and superior methods are
required to convey both location and a measure of confidence to the recipient. Second, there is a need to
develop methods to correct map data geometrically, so that location is more accurately captured, stored
and communicated, particularly in mission critical applications such as emergency servicing. To address
this, we have developed methods for comparing maps and adjusting them in real time. Third, there must be
standards for centerline map accuracy, that reflect the data models and functions associated with
transportation. © 2000 Elsevier Science Ltd. All rights reserved.
Keywords: Location referencing; Intelligent transportation systems; Geographic information systems; Street network
databases; Map database interoperability; Transportation datums
1. Introduction
Although it is several years, even decades since the advent of digital maps, global positioning
systems (GPS) technology and machine-searchable street names and coordinates, it can be
surprisingly difficult for the average person to describe a location, even one limited to the discrete
0968-090X/00/$ - see front matter © 2000 Elsevier Science Ltd. All rights reserved.
PII: S0968-090X(00)00020-6
54 V. Noronha, M.F. Goodchild / Transportation Research Part C 8 (2000) 53-69
confines of a street network. This problem is faced daily by people reporting accidents and vehicle
breakdowns, and by the emergency service personnel receiving and servicing those calls.
Recent technological development in intelligent transportation systems (ITS) allows vehicles
and pedestrians to be tracked in real time, using GPS, cellular phone tower triangulation,
inductive loop detectors embedded in highway pavement, or closed circuit television (CCTV)
cameras; even satellite imagery is being proposed as a tracking option. The methods have
applications in emergency servicing, real-time highway information provision (e.g. congestion,
construction, fog) and traffic management, hazardous material management, travel demand
studies, law enforcement and criminology. They are certain to generate a large volume of
geographically referenced data in the coming years. But given the inherent difficulty in describing
location, the positional references in such data could be ambiguous or erroneous, and the errors
could propagate through subsequent processing of the data.
In the United States, ITS is a major area of technology research, its development abetted by
two transportation bills, the 1991 Intermodal Surface Transportation Efficiency Act (ISTEA), and
the 1998 Transportation Equity Act for the 21st Century (TEA-21). ISTEA spurred the initial
development of ITS concepts, notably the national ITS architecture (USDOT, 1997), but it
became clear by the mid-1990s that significant operational problems remained to be resolved. One
was the need for standards for interoperability, particularly in location referencing. TEA-21
explicitly addressed this, and in 1998 the US Department of Transportation released a list of
"critical standards" (USDOT, 1998) for immediate research, development and testing.
Interoperability in general, and location referencing in particular, featured prominently in this list.
Emergency management services (EMS) are an important component of ITS, because they
present challenges similar to incident management (IM) services in ITS, and because the road
system is the medium of delivery of service to most non-road emergencies. The problem of
determining the precise location of an incident, identification of appropriate resources, facilitation of
service delivery by signal pre-emption, and real-time traffic management or evacuation, are issues
that cut across the boundaries of EMS and ITS.
Another aspect of transportation research, with potentially large impacts on infrastructure, is
the periodic systematic study of travel behavior, best exemplified by the Federal Highway
Administration's Nationwide Personal Transportation Survey (NPTS). Individuals are asked to
maintain diaries over a period of days or weeks, recording each trip they make for personal or
business reasons, listing time of day, origin and destination, intervening stops and route.
Battelle (1997) conducted an experimental variant on the standard methodology in Lexington,
Kentucky, using GPS loggers to supplement the respondents' answers; similar methods have
since been explored in Georgia and Texas. A study by the California Air Resources Board
tracked trucks with GPS to study their traffic patterns, and their implications in air quality.
Such new technology solutions are obviously data rich, but there is the potential for locations
(origins, destinations and routes) referenced in the data to be incorrectly interpreted due to
GPS and map data errors. Clearly the need for accuracy is greater in some applications than in
others.
Problems with location reporting have been observed anecdotally for a number of years, but
there has been relatively little scientific documentation of the scale of the problem and its solution.
Some researchers have focused on the problem of address matching and geocoding (e.g. Kim and
Nitz, 1994), and a number of studies have examined GPS error independently (e.g. Quiroga and
Bullock, 1998). Neither of these addresses the broader issues of location reporting in a
transportation context.
This paper examines these issues analytically, synthesizing results of formal tests conducted by
the Vehicle Intelligence and Transportation Analysis Laboratory (VITAL) in response to the need
for standardization in the ITS industry. It expands on ideas by Noronha (1999). Section 2 presents
the need for location expression and exchange with particular reference to ITS scenarios. Section 3
examines the types of error commonly found in contemporary digital maps, and their impact on
location referencing. Section 4 discusses short- and long-term measures to improve the accuracy
of location reporting. The paper concludes with speculations on future requirements, and the
academic, technical and managerial challenges that lie ahead.
To understand the extent of the spatial interoperability problem, one must appreciate the types
and degree of error in digital map databases. In general there are errors in position, classification
and inclusion, names and other descriptive attributes, linear measurement and topological
relationships. The following discussion is based on tests conducted in the County of Santa Barbara,
California, using six databases from public and private sources, representing the principal street
map vendors for ITS in the US. Although we use the term "error" liberally in this discussion, bear
in mind that these databases were developed for different purposes, from different scales of survey
and mapping, and that their sale prices vary from $1500 to $45 000. We do not identify vendors or
compare products, and our comments are intended not to be critical of vendors, but to illustrate
the kinds of operational hurdles that exist and must be overcome by users.
Testing involves sampling a set of point locations, either lab-generated or selected in the field
(field locations are associated with landmarks such as utility poles, and documented by notes
and photographs). In some cases, sample points are generated at regular or random intervals
along network polylines; in other tests, they are required to be at commonly identifiable
locations such as intersections. Major streets are examined separately, on the grounds that
initial ITS deployment would be on highways and major arterials only. The sample points are
transferred to each database in turn by the LX method under test, and an appropriate measure
of success applied to examine the accuracy of the transfer. The test results are obviously
specific to the County of Santa Barbara and the time of acquisition of the data sets (May,
1997). The county does present a representative and average cross-section of street styles, both
urban freeways and sinuous mountain roads, but it is reasonable to believe that results would
be better with databases focused on large urban centers, and conversely poorer in more remote
locations.
3.1. Position
Fig. 1. Positional error. A coordinate from a vehicle on a ramp or highway in map A (star on solid line) snaps to the
wrong ramp or street in map B (dot on broken line): (a) shows a freeway interchange in Santa Barbara; (b) is a highway
and adjacent neighborhood on the outskirts of the city.
for two reasons: first, current databases do not contain the third dimension; secondly, error in
GPS elevation readings is generally twice as high as horizontal error.
Fig. 2. (a) Error vectors constructed between points on map A and corresponding points on map B. (b) Vector field.
Errors that are consistent in magnitude and direction (northeast of figure) are easy to model and to correct. When
vectors are inconsistent (western portion of figure), extensive correction or complete re-survey may be required.
Fig. 3. Overlay of two maps showing different positions (c and e) where ramp meets freeway. When these are
represented by planar-enforced models, topological relations clearly differ. The impact on linear measurements taken from
this point is more than 200 m. Reproduced from Noronha (1999) with permission.
To correct error, one must first measure, understand and model it. For a single point, an
obvious measure of error is the Euclidean distance between the point on the erroneous map, and
the true position of that point. The vectors between the points (Fig. 2(a)) show the magnitude as
well as the direction of the error. Note that the term error assumes that one version is accurate and
the other is not; this may not be easy to determine, considering the remarks in the last paragraph.
Our analysis of error is based on an engineering scale map of the study area, and verification by
differential GPS. Matching of points from one database to another is accomplished by a
combination of automated and manual methods.
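The error vector for a single matched point can be computed as in the following minimal sketch. It assumes both points are in the same projected coordinate system with units of metres; the function name `error_vector` and the bearing convention (degrees clockwise from north) are choices made for this illustration.

```python
import math
from typing import Tuple


def error_vector(true_pt: Tuple[float, float],
                 map_pt: Tuple[float, float]) -> Tuple[float, float, float, float]:
    """Return (dx, dy, magnitude, bearing) of the error from true_pt to map_pt.

    dx and dy are the east and north components; magnitude is the Euclidean
    distance; bearing is in degrees clockwise from north.
    """
    dx = map_pt[0] - true_pt[0]
    dy = map_pt[1] - true_pt[1]
    magnitude = math.hypot(dx, dy)
    bearing = math.degrees(math.atan2(dx, dy)) % 360.0
    return dx, dy, magnitude, bearing
```

A point mapped 3 m east and 4 m north of its true position has an error magnitude of 5 m at a bearing of roughly 37 degrees.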
Extending the vector analysis to a larger area, one can develop a vector field (Goodchild and
Hunter, 1997) that shows the variation in error over space (Fig. 2(b)), and suggests its origins. One
could hypothesize that the field is continuous, in which case the error at a given point can be
interpolated from that at surrounding points; our preliminary studies indicate that this is true at
least to some degree (Church et al., 1998; Funk et al., 1998). Error direction and magnitude are
often consistent over entire neighborhoods, but they change abruptly across some boundaries.
This indicates that the map was developed piecemeal, at different times, or from different sources.
Continuity of the error field also breaks down on freeways. Intersections of ramps with freeways
are prone to high positional error that is difficult to model, because the paths intersect at a small
angle, and a slight error in alignment translates into a large longitudinal displacement (Fig. 3).
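If the error field is locally continuous, the error at an unsampled point can be estimated from nearby matched points. The sketch below uses inverse-distance weighting, which is an assumption made for illustration; the text does not say which interpolation method was used in the cited studies, and the function name `interpolate_error` is hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Vector = Tuple[float, float]


def interpolate_error(known: List[Tuple[Point, Vector]],
                      x: float, y: float, power: float = 2.0) -> Vector:
    """Inverse-distance-weighted estimate of the error vector at (x, y).

    'known' holds ((x, y), (ex, ey)) pairs: sample locations and the error
    vectors measured there.
    """
    w_sum = ex_sum = ey_sum = 0.0
    for (kx, ky), (ex, ey) in known:
        d2 = (x - kx) ** 2 + (y - ky) ** 2
        if d2 == 0.0:
            return (ex, ey)          # exactly on a sample point
        w = 1.0 / d2 ** (power / 2.0)
        w_sum += w
        ex_sum += w * ex
        ey_sum += w * ey
    return (ex_sum / w_sum, ey_sum / w_sum)
```

Such an estimate is only meaningful within a region where the vectors are consistent. Across the abrupt boundaries and freeway interchanges described above, samples from the two sides should not be mixed.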
It was argued above that mapping necessarily entails an interpretation of reality. This is truest
in the case of road inclusion and classification, which are closely connected with map scale and
resolution. Dual carriageway freeways are single lines at one scale and double lines at larger map
scales. Similarly there are differences in the treatment of traffic circles, median strips and
channelized turn lanes. It is impossible to find two maps that agree on their categorization of
"major" roads, let alone the fine distinctions between arterials and collectors. Discrepancies in
classification lead to differences in inclusion. Driveways into condominiums and building
complexes (e.g. hospitals) do not appear in all maps, because not all vendors consider these part of the
street network. Inclusion is also impacted by the effectiveness of map update. The vendor that
promptly includes new neighborhoods clearly shows several streets that another vendor does not.
A further consequence of discrepancies in inclusion is conflicts in topology (the term is used in
the GIS sense, of connective relationships between points, lines and polygons). Clearly if one map
shows a number of driveways off a principal road, while another map ignores them, the
topological relations differ. Another aspect of topological inconsistency is the reduction of non-planar
intersections (i.e. overpasses). GIS structures modeled on the TIGER database (Marx, 1986)
impose planarity on all intersections due to the need to consider roads as potential polygon
boundaries. Navigation turn tables are built into the database structure, with a large artificial
impedance on turns that are not physically possible. Problems then arise when gross topological
errors occur in the database, e.g. when a cloverleaf ramp meets a freeway east rather than west of
an overpass (Fig. 3). Correction of the positional error requires extensive recoding of the
topological relationships within the interchange.
It is tempting to study some of the above differences in inclusion and classification by
characterizing each road section by some combination of name and position, and looking for a
corresponding segment in the other database. This does not work because of (a) differences in
naming, discussed in detail below, and (b) topological discrepancies that result in one map having
long sections of road, while another fragments them into short segments. Nystuen et al. (1997)
study both positional and inclusion differences between maps by buffering the centerlines in each
map, and topologically intersecting the buffers. The degree of correspondence is analyzed spatially
by gridding the area and measuring the intersections between buffers, cell by cell. This method is
attractive because it mimics visual analysis and is intuitively valid; a drawback is that finer
topological differences such as dual-line versus single-line highways do not affect the statistics
appreciably, although these differences can have significant impacts on interoperability.
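A much simplified sketch of the buffer comparison idea follows. It is not the procedure of Nystuen et al. (1997): instead of intersecting buffer polygons and reporting statistics cell by cell, it tests each grid cell centre against both buffers and returns a single agreement ratio for the whole area. The function names, the ratio itself and all parameters are assumptions for illustration.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]
Polyline = List[Point]


def _dist_to_segment(px: float, py: float, a: Point, b: Point) -> float:
    """Distance from (px, py) to the segment a-b."""
    (x1, y1), (x2, y2) = a, b
    dx, dy = x2 - x1, y2 - y1
    seg2 = dx * dx + dy * dy
    t = 0.0 if seg2 == 0 else max(0.0, min(1.0, ((px - x1) * dx + (py - y1) * dy) / seg2))
    return math.hypot(px - (x1 + t * dx), py - (y1 + t * dy))


def in_buffer(px: float, py: float, lines: List[Polyline], radius: float) -> bool:
    """True if the point lies within 'radius' of any centerline."""
    return any(_dist_to_segment(px, py, a, b) <= radius
               for line in lines for a, b in zip(line, line[1:]))


def buffer_agreement(map_a: List[Polyline], map_b: List[Polyline], radius: float,
                     bounds: Tuple[float, float, float, float], cell: float) -> float:
    """Share of buffered cells that fall inside the buffers of both maps."""
    xmin, ymin, xmax, ymax = bounds
    both = either = 0
    y = ymin + cell / 2
    while y < ymax:
        x = xmin + cell / 2
        while x < xmax:
            a = in_buffer(x, y, map_a, radius)
            b = in_buffer(x, y, map_b, radius)
            both += int(a and b)
            either += int(a or b)
            x += cell
        y += cell
    return both / either if either else 1.0
```

A ratio near 1 indicates that the two maps place centerlines in nearly the same corridors. As the text notes, a measure of this kind is insensitive to finer topological differences such as dual-line versus single-line highways.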
3.3. Names
In the light of the inadequacy of coordinates as an unambiguous LX, it has been suggested that
street names, in the form of street addresses or intersections, would be more reliable. This is a
reasonable argument up to a point; after all, the post office delivers mail based on street
addresses where they exist, with little error. There are two caveats to this. First, mail delivery is
facilitated by postal codes that enable sorting and reduce ambiguity, and also by intelligent
human interpretation that easily substitutes an incorrect "Market Street" with the correct
"Marquette Street". Secondly, transportation applications are not restricted to mail delivery
routes. Many roads (notably freeway ramps) do not have names; for Santa Barbara County as a
whole, 20-45% of records have blank name fields.
Further problems exist in the capture and storage of names. Abbreviations are inconsistent (e.g.
"Av" versus "Ave"), parsing of fields varies (e.g. Spanish street types such as "Via del" should be
separated as street type prefixes, but they are often incorporated into the proper name), and only
one of our six databases provides for aliases that would equate "San Diego Freeway" with
Interstate 5. There are consistency errors even within databases: one section of a road is correctly
named "Winding Way" while another contains a typographic error, "Winting Way". Appropriate
data structuring would avoid this.
Table 1
Four stretches of road, as named in five databases A-E (a)
Road  Database
      A  B  C  D  E
1  E Camino Cielo  Camino Cielo  [blank]  [Street not present]  E Cm Cielo
2  San Marcos Pass  [blank]  San Marcos Pass  San Marcos Pass
   Hwy 154  Highway 154  CA-154  [blank]
3  Mountain  Mountain  [blank]  Mountain  W Mountain
   E Mountain  E Mountain
   Park  Park
   Bella Vista  Bella Vista
4  Foothill  Hwy 192  Foothill  Foothill
   Cathedral Oaks  Cathedral Oaks  Cathedral Oaks  Cathedral Oaks
(a) A road may have several names over the sampled distance; databases may not agree on where changes occur. Only A
provides for an alias. Reproduced from Noronha (1999) with permission.
Table 1 shows four randomly selected lengths of street, as named in five databases. The streets
are selected visually, and some are so long as to have multiple names over their course. The
variation in names offers a quick insight into the matching problem.
For the reasons outlined above, success rates in matching names across databases are lower in
ITS applications than even the 60-75% often reported for address matching as used with street
addresses. Intelligent software does exist to deal with some of the problems, e.g. to equate
Winding with Winting, Market with Marquette; and even CA-101 with HWY-101 (Fellegi and
Sunter, 1969; Knuth, 1973; Jaro, 1989); but software alone cannot resolve the more difficult cases,
particularly where blank names are encountered.
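One well-known phonetic technique of the kind alluded to here is Soundex, described in Knuth (1973). The sketch below is a compact implementation of the standard American Soundex rules, offered only as an illustration of how such software can equate the name pairs mentioned above; it is not claimed to be the matcher used in any of the tested products.

```python
def soundex(name: str) -> str:
    """Four-character Soundex code (initial letter plus three digits)."""
    codes = {}
    for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                           ("L", "4"), ("MN", "5"), ("R", "6")):
        for ch in letters:
            codes[ch] = digit

    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""

    out = [letters[0]]
    prev = codes.get(letters[0], "")
    for ch in letters[1:]:
        code = codes.get(ch, "")
        if code and code != prev:
            out.append(code)
        # H and W do not separate letters that share a code; vowels do.
        if ch not in "HW":
            prev = code
    return ("".join(out) + "000")[:4]
```

Both "Winding" and "Winting" reduce to W535, and both "Market" and "Marquette" reduce to M623, so a comparison on Soundex codes treats each pair as a match. A blank name field has no code at all, which is consistent with the point that software alone cannot resolve those cases.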
Prior to the advent of GPS, the only practical way to describe a location on a remote section of
road was by linear measurement from a defined starting point. Transportation engineers continue
to use linear references today, sometimes to the exclusion of maps and two-dimensional
referencing. While motorists' experience of linear referencing is limited to the 100 m resolution of
vehicle odometers, professionals employ distance measuring instruments (DMI) for 1 m-resolution
readings. DMI technology has evolved over the years, from mechanical or optical revolution
counters attached to the wheels of a vehicle, to modern electronic pulse sensors that hook into the
transmission.
There are three components to a linear reference: (a) the road on which the point lies, described
by a road name or numeric identifier; (b) the origin and direction for the measurement; (c) the
distance measurement. Error in (a) is eliminated if parties communicate using standard identifiers,
else the comments on street name interoperability above apply. Specification of the origin (b) is
similarly susceptible to identification error. The origin has to be described with precision
equivalent to that of the measurement: if the distance is stated with 1 m precision, it is not sufficient to
specify the origin simply as the intersection of two streets, when the intersection is 30 m wide.
Finally the measurement (c) is subject to errors, depending on curvature and other physical
characteristics of the road. Throughout this paper we assume offsets to be measured along the
road (in the case of the DMI) or along its polyline representation.
The linear LXX question is two-fold: (i) how well a location expressed by other means (e.g.
coordinates or street names) translates to a linear reference, and (ii) how well a linear reference
measured by a DMI, or with respect to one map database, describes that location with reference
to a different map database. Regarding (i), in the Santa Barbara databases, there is on average
a 40% chance that a GPS point (with 100 m selective availability error) snaps to the wrong road,
and a further 10% chance that it snaps to the wrong topological section of the correct road. The
average linear reference error from such a point is about 90 m, assuming no recalibration points
(VITAL, 1999). If the point is specified directly from DMI measurement, there are errors due to
instrument limitations, calibration and operation, usually the greater of 1% or 2-10 m. On the
matter of (ii), the relationship between polyline geometry, generalization and linear measurement
has been studied from a number of standpoints (Mandelbrot, 1967; Douglas and Poiker, 1973;
Buttenfield, 1985). It is generally well known that as a polyline is subjected to increasing
generalization, the digitized length decreases. The Santa Barbara data set shows average differences of
8% in digitized length (longest versus shortest version of the same road) on a variety of roads, with
a range of 1-16%. The deviation of digitized length from DMI length for a given data set over all
sample roads ranges from 0.5% to 4% with an average of 3% and a worst case of 16%.
Compensation for this error can be applied by normalizing a linear offset, i.e. expressing the offset as
the proportion of its distance from the origin, to the total length of the road section (clearly this
adds to the cost of specifying the location, because it requires two measurements to be made).
Absolute linear transfer errors are about 50 m on average, and normalized transfer errors are
about 25 m; errors up to nearly 1 km are encountered in exceptional cases, due to longitudinal
errors in defining the intersections of ramps with highways (VITAL, 1999).
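The effect of normalization can be shown with a short worked sketch. The road lengths and the function names `transfer_absolute` and `transfer_normalized` are hypothetical; the 8% length difference is chosen only to match the average difference reported above.

```python
def transfer_absolute(offset_a: float, length_b: float) -> float:
    """Reuse the offset measured in map A directly in map B (clamped to B's length)."""
    return min(offset_a, length_b)


def transfer_normalized(offset_a: float, length_a: float, length_b: float) -> float:
    """Express the offset as a proportion of A's length, then rescale to B's length."""
    return offset_a / length_a * length_b


# Hypothetical road section: 1000 m long in map A, 920 m in the more
# generalized map B (8% shorter). A point lies halfway along the road.
length_a, length_b = 1000.0, 920.0
offset_a = 500.0

absolute = transfer_absolute(offset_a, length_b)                 # 500 m along B
normalized = transfer_normalized(offset_a, length_a, length_b)   # 460 m along B
discrepancy = absolute - normalized                              # 40 m
```

If the length difference is spread evenly along the road, the normalized offset lands at the true halfway point in map B while the absolute offset overshoots by 40 m. Normalization cannot help when the discrepancy is concentrated at one end, as with the ramp intersection errors mentioned above.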
4. Error remediation
Given the immediate need for incident reporting and amelioration of transportation problems,
it is necessary to examine strategies that enable the deployment of services using current
databases. Taking an ITS example, a motorist may buy an in-vehicle database from vendor A, receive
data from a TMC that operates with vendor B, and pass messages to an emergency center that
uses vendor C. Regardless of which pair of databases is involved in a transaction, the message
must be received unambiguously and error-free.
In addition to ITS concerns, there is the general problem of backward compatibility of
databases. A wealth of legacy data is attached to databases of differing positional quality. For
example, an emergency center may attach a note to a 1980s-vintage database, that 872 Whistler
Highway is the farm of John Brown, with a large red barn near the driveway. It is important that
when the emergency center upgrades to a more current database, the details of the Brown farm be
preserved. This section examines how this might be done.
Fletcher (1999) defines four levels of interaction, the "Four Is": (a) integration, in which parties
share hardware and software from a common vendor, or use a single database in which internal
identifiers are universally understood, e.g. within a small office; (b) interoperability, where vendors
may differ, but the semantics of objects and processes are standardized, and systems differ only in
the internal details of implementation; (c) interfacing, where there are differences in semantics,
traditions, systems and vendors, and communication is achieved by means of third-party
translators that attempt to harmonize semantics to the extent possible; and (d) independence, in which
parties fail to communicate because their systems are radically different. Interfacing is generally
considered less elegant than is interoperation, and attracts irreverent terms such as "duct tape".
But given the considerable problems with map databases, we may not have the luxury of
restricting ourselves to any one of these I's, hence the word interoperability is used in the broadest
sense below.
4.1.1. Messaging
One obvious candidate for standardization is the message used to communicate a location. In
the language of the Open Systems Interconnect (OSI) model (Day and Zimmerman, 1983), this
discussion of message standardization pertains to the upper level layers, application and
Fig. 4. The J2374 cross-street profile expresses location in terms of a principal street (Ash), two cross-streets (Birch and
Cedar) and the offset distance from the first cross-street.
An obvious need in messaging is for the user to have a measure of information reliability, and it
may be important for the sender to know whether the message has been satisfactorily received.
VITAL has therefore proposed a robust, theoretically 100% successful protocol termed
"LX-100", for use in mission-critical LXX. LX-100 mimics the negotiation process that takes place in
verbal communication when one describes an emergency or any other location. Given the luxury
of time, say to describe the venue of an office party, one employs a combination of street names,
landmarks, absolute and relative navigation directives, which are together redundant; the receiver
processes the information, detects success or failure based on redundant clues, and requests
additional information in the event of failure. Similar principles are used for error checking in digital
packet data transfer, around which the Internet Protocol is built; the European standards
development and testing initiative, Extensive Validation of Identification Concepts in Europe
(EVIDENCE), also uses redundant LX in its concept of the intersection location (ILOC).
LX-100 is still under development, and many issues and algorithms remain to be resolved. If the
process is to approach 100% success, it must make assumptions about the quality and fitness for
use of reference databases.
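The negotiation principle described above (accumulate redundant clues until they agree on one location, otherwise ask for more) can be sketched as follows. This is not the LX-100 protocol, which the text describes as still under development; the reference database, the clue strings, and the block identifiers are all invented for illustration.

```python
from typing import Dict, List, Optional, Set

# Hypothetical reference database: each clue resolves to a set of candidate block IDs.
REFERENCE_DB: Dict[str, Set[str]] = {
    "on Ash St": {"b1", "b2", "b3"},
    "near Birch St": {"b2", "b7"},
    "opposite the library": {"b2", "b3"},
}


def resolve(clues: List[str]) -> Optional[str]:
    """Return the single location consistent with all clues, else None."""
    candidates: Optional[Set[str]] = None
    for clue in clues:
        matches = REFERENCE_DB.get(clue, set())  # unknown clue -> no candidates
        candidates = matches if candidates is None else candidates & matches
    if candidates is not None and len(candidates) == 1:
        return next(iter(candidates))
    return None  # ambiguous or contradictory: more information is needed


def negotiate(clue_stream: List[str]) -> Optional[str]:
    """Consume clues one at a time until they agree on exactly one location."""
    received: List[str] = []
    for clue in clue_stream:
        received.append(clue)
        location = resolve(received)
        if location is not None:
            return location
    return None  # sender ran out of clues before the receiver succeeded
```

As the text notes, the success of such a process depends on the quality of the reference database: a clue that the database cannot resolve makes the whole message fail in this sketch.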
4.1.3. Datums
The more ambitious aspects of standardization, such as coordinate accuracy, may not be
realizable by private vendors alone, and government intervention may be required. In the 1980s and
Table 2
Error tolerance in ITS applications, present and future (a)

Period          Locale              Lateral tolerance (m)   Longitudinal tolerance (m)
Present (2000)  Open highway        25                      50-100
                Urban/interchange   5                       50
Future (2015)   Open highway        1                       10-20
                Urban/interchange   1                       5

(a) The numbers are coarse estimates; the absolute values are less important than the relative values and the structure of
the table.
Fig. 5. VITAL's rubber-streeting algorithm applies a geometric adjustment to one map to make it agree with another
more accurately surveyed map or datum. Reproduced from Noronha (1999) with permission.
between these concepts and the ITS Datum, which incorporates all the requirements of the other
standards - linear accuracy, standard identifiers - and three-dimensional accuracy.
VITAL (1999) has shown that while some of the interoperability test results with current
databases are poor, results with differential GPS and better quality databases are excellent, at least
for current application requirements. One might legitimately argue that new surveys will soon
brighten ITS interoperability prospects. While this is true of the future, there are two causes for
caution. First, GPS-based re-survey is technologically possible today, but remains largely undone
because of cost. An alternate way to improve public sector databases is by integration of
construction and engineering data into transportation GIS; but for technical and management
reasons, there are few jurisdictions where this has been done effectively. Second, improved coordinate
accuracy is only one aspect of quality. Legacy attribute data have to be merged with new
coordinates and perhaps revised topology - that can be best achieved by a datum that links old
databases with new by means of common identifiers.
Another argument is that the interoperability problem can be avoided by employing just one
database across applications. There are several reasons why this is not practical. First, the
functional requirements of a database are not universal. Data needs vary considerably between
users - for example, there are situations where single-line representation of freeways is
appropriate and preferable to dual-lines. Second, in a market-driven economy that is stimulated by
competition, there is a danger in allowing any vendor, private or public, a monopoly over a
critical information resource such as road centerlines. However, in the short term, the
single-vendor approach may be most expedient.
5. Conclusions and prospects
The preceding sections have documented the types and magnitude of interoperability problems
in location reporting. The data errors are presented not as a case of vendor malpractice, but as a
consequence of an evolving state of art that is currently being overtaken by increasingly ambitious
user requirements. As recently as 1980, development of street network databases was driven by
demographic applications that required accuracy at no better than the general neighborhood or
block face. Now just 20 years later, users are equipped with inexpensive GPS and portable
computers; they seek navigation solutions, they demand carriageway resolution, 1-20 m accuracy
and current information on attributes that change frequently (e.g. one-ways and turn restrictions,
even construction and congestion). Emerging methodologies for emergency management and
other applications are built around these heightened expectations. Inevitably, in the short term
some of these requirements will not be met.
There are a number of initiatives under way, e.g. intelligent messaging and datums, to address
these problems. One could argue that as the quality of databases evolves, the need for these
solutions will diminish. On the other hand, it is reasonable to speculate that requirements will be
even more stringent in the future, and that solutions will have to keep pace. One obvious area of
development is improved resolution, from carriageway to lane. It is likely that navigational
instructions will direct drivers to move into a certain lane in preparation for an approaching
maneuver. The California Department of Transportation (Caltrans) is planning for a traffic
management future where diversion of highway traffic is initiated several kilometres ahead of an
incident, one lane at a time, to prevent concentration of such traffic into surrounding arterials.
These developments raise academic challenges. Data models must evolve to accommodate lanes
(e.g. Fohl et al., 1996), and public initiatives to improve coordinate accuracy, such as datums,
must anticipate the evolution of user requirements. New messaging methods have to be developed
for traffic information scenarios that not only report current status, but also describe previous or
future events, such as the anticipated course of a toxic plume - some work on multi-dimensional
data modeling is already under way, sponsored by the National Cooperative Highway Research
Program (NCHRP) project 20-27(3). Based on the recent pace of development it is reasonable to
believe that despite GPS, or perhaps because of it, the art and science of location reporting will
remain an important area of endeavor over the coming decades.
Acknowledgements
References
Battelle Memorial Institute, 1997. Global positioning systems for personal travel surveys: Lexington area travel data
collection test. Final Report. United States Department of Transportation, Federal Highway Administration, Office
of Highway Management, Washington, DC.
Buttenfield, B.P., 1985. Treatment of the cartographic line. Cartographica 22 (2), 1-26.
Church, R., Curtin, K., Fohl, P., Funk, C., Goodchild, M., Kyriakidis, P., Noronha, V., 1998. Positional distortion in
geographic data sets as a barrier to interoperation. Proceedings, ACSM Baltimore [document available at
www.ncgia.ucsb.edu/vital].
Day, J.D., Zimmerman, H., 1983. The OSI reference model. Proceedings of the IEEE 71, 1334-1340.
Douglas, D.H., Poiker (formerly Peucker), T.K., 1973. Algorithms for the reduction of the number of points required
to represent a digitized line or its caricature. Canadian Cartographer 10, 112-122.
Fellegi, I.P., Sunter, A., 1969. A theory for record linkage. Journal of the American Statistical Association 64 (328),
1183-1210.
FGDC, 1998. Road Data Model - Content Standard and Implementation Guide (Working Draft). Federal Geographic
Data Committee, Ground Transportation Subcommittee, Washington, DC.
Fletcher, D., 1999. Road data model workshop, Washington, DC. Meeting Notes.
Fohl, P., Curtin, K., Goodchild, M.F., Church, R.L., 1996. A non-planar, lane-based navigable data model for ITS. In:
Kraak, M.J., Molenaar, M. (Eds.), Proceedings of the Spatial Data Handling, Delft, 12-16 August, pp. 7B/17-29.
Funk, C., Curtin, K., Goodchild, M., Montello, D., Noronha, V., 1998. Formulation and test of a model of positional
distortion fields. In: Proceedings of the Third International Symposium on Spatial Accuracy Assessment in Natural
Resources and Environmental Sciences, Quebec City [URL www.ncgia.ucsb.edu/vital].
Goodchild, M., Hunter, G., 1997. A simple positional accuracy measure for linear features. International Journal of
Geographical Information Systems 11 (3), 299-306.
Jaro, M., 1989. Advances in records linkage methodology as applied to matching the 1985 census of Tampa. Journal of
the American Statistical Association 84 (406), 414-420.
Kim, K., Nitz, L., 1994. Application of automated records linkage software in traffic records analysis. Transportation
Research Record 1467, 50-55.
Knuth, D.E., 1973. The art of computer programming. Sorting and Searching, vol. 3. Addison-Wesley, Reading, MA.
Mandelbrot, B.B., 1967. How long is the coast of Britain? Statistical self-similarity and fractional dimension. Science
156, 636-638.
Marx, R.W., 1986. The TIGER system: automating the geographic structure of the United States census. Government
Publications Review 13, 181-201.
Abstract
Network matching is frequently needed for integrating data that come from different sources. Traditional
ways of finding correspondences between networks are time-consuming and require considerable manual
manipulation. This paper describes a three-stage matching algorithm (node matching, segment matching,
and edge matching) that combines bottom-up and top-down procedures to carry out the matching
computation. As it uses sensitive matching measures, the proposed algorithm promises good improvement to
existing algorithms. An experiment of matching two waterway networks is reported in the paper. The
results of this experiment demonstrate that a reasonable matching rate and good computational efficiency
can be achieved with this algorithm. The paper also briefly discusses necessary improvements in areas of
linear alignment, aspatial matching and higher-level matching. © 2000 Elsevier Science Ltd. All rights
reserved.
1. Introduction
Matching road networks that come from different sources or have been created at different
times is an important function for transportation data management and manipulation.
Frequently, applications need to integrate data that are referenced by different networks for the same
area (Rosen and Saalfeld, 1985; Saalfeld, 1988; Brown et al., 1995; Nystuen et al., 1997; Walter
and Fritsch, 1999). A common procedure used to realize this integration is to establish
correspondences among different networks; then data that are referenced on different networks are
transferred to and integrated into one of the networks. The same procedure is necessary in
situations where an existing database grows older and has to be updated with a newer version of the
network. In this case, the newer version of the network functions as the control layer, and matches
0968-090X/00/$ - see front matter © 2000 Elsevier Science Ltd. All rights reserved.
PII: S0968-090X(00)00011-5
D. Xiong / Transportation Research Part C 8 (2000) 71-89
between this newer version and the earlier versions are established. Then data on the earlier
versions of networks are conflated to the newer version.
Network matching also has important applications in image processing (Novak, 1992; Stilla,
1995; Wang, 1998). Raw images obtained from remote sensors usually contain various kinds of
distortions. To geometrically rectify a distorted image, this image can be overlaid with a network
map. By establishing correspondences between linear features on the image and on a network
map, the image geometry can be transformed and rectified. Matching with a network map is also a
good strategy for image recognition. When unknown linear features on an image can be matched
with known features on a network map, the unknown features on the image can be recognized as
the same features as the ones on the network map. After matching, the characteristics of the
matched features can be extracted, and by referencing these characteristics, other features on the
image that may not directly correspond to an existing map can be recognized.
Due to the importance of these applications, considerable effort has been devoted to studying
and developing automated procedures for network matching. This research focuses on an
investigation of an algorithm that consists of three types of matching: node matching, segment
matching, and edge matching. Node matching establishes node correspondences between two
networks using Euclidean distance and angle patterns formed by incident edges. Segment
matching tracks segment pairs along potential matching edges. Edge matching takes input from
segment matching and then derives matching measures at the edge level. Applying these matching
measures allows identification of the best matching pairs.
In this research, a computational strategy that combines bottom-up and top-down
computations is adopted. The bottom-up computation starts with node matching, then proceeds to
segment matching, and finally ends up with edge matching. The top-down procedure is the reverse.
At the initial stage of the top-down computation, potential edge-matching pairs are first
hypothesized using screening criteria such as distance and angle difference. Then, segment matching
proceeds. Through segment matching, matching measures for those hypothesized matches are
computed, likely matches are confirmed, and unlikely matches are rejected. The purpose of combining
the bottom-up and top-down computations is that the bottom-up computation will find matches
where node matches can be quickly established, while the top-down computation will find
matches where node matches fail or network structures differ.
This three-stage matching procedure promises good improvements to previous matching
procedures for two reasons. Firstly, previous research has frequently used node matching and edge
matching, but not segment matching. Segment matching is important because it can be used to
evaluate correspondences between each pair of segments on potentially matched edges. Through
segment matching, matching measures between each pair of segments are first computed; then,
overall matching measures at the edge level are obtained. Due to these detailed considerations,
edge-matching measures derived from segment matching can be more sensitive in recognizing
differences and similarities among different edge pairs. Using these measures, there is a better
chance to find the best matches.
The other advantage of the current algorithm is in its overall computational procedure.
Previous research has often used a bottom-up procedure as the main computational strategy. There
are some exceptions (e.g., the two-stage matching method developed by Gabay and Doytsher
(1994)). The current algorithm not only combines the bottom-up and top-down computations
but also considers interaction and feedback between different stages of matching, especially in
segment matching and edge matching, thereby providing some refinement to those procedures
that already have a computational mechanism in place such as the one developed by Gabay and
Doytsher (1994).
In the following discussions, Section 2 provides an overview of previous research on the subject
of network matching; Section 3 describes the proposed matching algorithm; Section 4 reports on
the matching experiment; and Section 5 provides conclusions and discussions of the current
algorithm with respect to its overall performance and the improvements needed in the future.
2. Previous research
The problem of network matching has been studied in different disciplines, and related
literature can be found in the areas of geographic information systems (GIS), cartography,
transportation, and image processing. Due to diversified references, different terms have been used in
the literature. These terms include map matching, conflation, and linear alignment. In this section,
alternate terms will be used as well.
One of the earliest matching algorithms was developed by the US Census to integrate data from
the US Geological Survey and the Census Bureau (Rosen and Saalfeld, 1985; Saalfeld, 1988). This
algorithm was applied to match maps that have a link-node structure. A bottom-up
computational procedure is utilized in this algorithm, which uses node matching as the starting point. After
node matching, a rubber sheeting operation is implemented. In this operation, the geometry of
one of the maps is adjusted to make it correlate better with the other map. Finally, matches of the
corresponding links on the two maps are identified.
The method introduced by Gabay and Doytsher (1994) represents a major departure from
previous matching algorithms. The major advantage of their method is that it is able to find both
common elements on two maps and to reveal unique elements appearing on only one of the maps.
This capability allows geometric inconsistency and differences of topological characteristics to be
recognized and handled during the map matching process. The two-stage matching procedure
proposed by Gabay and Doytsher (1994) is also a blueprint of the strategy that combines the
bottom-up and top-down computations. In this two-stage matching procedure, lines that have
matched end nodes are matched first, then unmatched lines are further evaluated, and final
matches are obtained.
Brown et al. (1995) described a conflation system that was developed in a GIS environment.
This system makes use of existing GIS functions such as rubber sheeting and dynamic
segmentation to adjust network geometry and establish node and link correspondences of two matching
networks. With the system, linear mappings along network edges are possible, and network
attributes such as direction flags and distances can be assigned or computed automatically when
these attributes are transferred from one network to another network.
Nystuen et al. (1997) studied a method that compares two networks for the purpose of
evaluating the quality of digital network databases. In their research, they developed an automated
procedure to associate network elements. Through nearest node analysis and buffering,
corresponding nodes and edges on two participating networks are identified and statistics showing
how well two networks correspond to each other are generated. Their studies identified several
problems arising from network comparison. These problems, including scale effects, definitional
3. The algorithm
This section introduces some notation and definitions and describes the overall procedure used
to carry out the calculation of network matching and the individual algorithms that constitute the
overall computational procedure.
For simplicity, the symbol $G = (N, E)$ is used to represent a directed network graph, planar or
non-planar, with $N = \{n_1, \ldots, n_m\}$ representing the node set, and $E = \{e_1, \ldots, e_p\}$ representing
the edge set. For matching purposes, each node will have a set of attributes, including an ID,
the number of edges that are incident to the node, called degree $D(n)$, the incident edge set,
$n(e) = \{e_1, \ldots, e_{D(n)}\}$, and a coordinate pair $(x, y)$ representing the location of the node. An
edge-attribute set will include an ID, from-node and to-node (e.g., $n_i$ and $n_j$), and a set of
coordinate pairs $\{(x_1, y_1), \ldots, (x_k, y_k)\}$ representing the shape points of the edge. The shape
points on an edge also define a set of segments, and this segment set is denoted as
$S = (s_1, \ldots, s_{k-1})$.
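A minimal sketch of the node and edge attribute sets just described may help fix the notation. The class and field names are illustrative choices, not part of the original algorithm; only the attributes themselves (ID, degree, incident edge set, coordinates, from-node, to-node, shape points, segments) come from the text.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Point = Tuple[float, float]


@dataclass
class Node:
    node_id: str
    xy: Point                                                # coordinate pair (x, y)
    incident_edges: List[str] = field(default_factory=list)  # incident edge set n(e)

    @property
    def degree(self) -> int:
        """D(n): the number of edges incident to the node."""
        return len(self.incident_edges)


@dataclass
class Edge:
    edge_id: str
    from_node: str
    to_node: str
    shape_points: List[Point]                                # (x_1, y_1), ..., (x_k, y_k)

    def segments(self) -> List[Tuple[Point, Point]]:
        """k shape points define k - 1 consecutive segments s_1, ..., s_{k-1}."""
        pts = self.shape_points
        return list(zip(pts[:-1], pts[1:]))


# Small example: one edge with three shape points, hence two segments.
edge = Edge("e1", "n1", "n2", [(0.0, 0.0), (1.0, 0.0), (2.0, 1.0)])
node = Node("n1", (0.0, 0.0), ["e1"])
```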
The matches of two network graphs $G_a(N_a, E_a)$ and $G_b(N_b, E_b)$ are defined with three sets
of mappings: node mapping, $M_n = \{(n_a, n_b) \mid n_a \in N_a \text{ and } n_b \in N_b\}$; segment mapping,
$M_{es} = \{(s^a, s^b) \mid s^a$ representing a segment on an edge, $e_a$, of $G_a$, and $s^b$ representing a segment on the
edge $e_b$ that corresponds to $e_a\}$; and edge mapping, $M_e = \{(e_a, e_b) \mid e_a \in E_a \text{ and } e_b \in E_b\}$. As precise
correspondences between elements (nodes, segments, and edges) of two networks are rare, it is
assumed that mappings between network matching pairs are based on edited network graphs
rather than on the original ones. That is, network edges may be split or merged (or network nodes
inserted or removed) when mappings between two networks are generated.
In this discussion, one of the networks, Ga, that participates in the matching is called the
reference network (following the definition by Nystuen et al. (1997)), and the other participating
network, Gb, is referred to as the matching network. In general, it is assumed that the reference
network functions as a control or reference layer. When data from two networks need to be in-
tegrated, usually data from the matching network will be transferred to the reference network. In
cases where networks need to be geometrically rectified, the reference network will be used as a
control layer, and the geometry of the matching network will be transformed.
The major objective of network matching is to generate three types of mappings: node map-
ping, segment mapping, and edge mapping. Naturally, calculations of these mappings involve
three types of computations: node matching, segment matching, and edge matching. As network
elements (nodes, segments, and edges) are intrinsically related to each other, it is not efficient to
compute individual mappings separately. To integrate these three types of computations, we
consider an overall computational strategy as shown in Fig. 1.
For this overall computational strategy, the matching computations are divided into two sub-
processes, the bottom-up and the top-down. The bottom-up computation starts with matching
network nodes using criteria of Euclidean distances between nodes and angle patterns of network
edges that are incident to these nodes. After node correspondences are established, segment
matching proceeds. Segment matching involves both tracking corresponding segments of a po-
tential match and computing the likelihood of whether one segment indeed forms the counterpart
of another segment. Finally, edge matching is carried out. In edge matching, matching measures
obtained from segment matching are first aggregated at the edge level. Then these measures are
applied to eliminate unlikely matches.
By searching nearest neighbors and exploiting network topological relationships, the
bottom-up computation can quickly find node and edge matches at locations where node correspondences
can be reliably established, but this method may leave good matches unidentified at other
locations. To deal with this problem, the top-down computation follows. The top-down computation
assumes no node correspondences; instead, it proceeds with generating matches between network
edges as the first step. With the bottom-up results, the top-down computation will need only to
evaluate those edges that are not matched after the bottom-up matching. Note, however, that
edge matching actually proceeds with two stages in the top-down computation. In the first stage,
potential candidates on the matching network that may form a counterpart to a given edge on the
reference network are first identified. After that, segment matching takes over, and for each pair
of edge-matching candidates, segment mappings are generated. With these segment mappings,
the edge-matching procedure is reactivated to compute matching measures at the edge level (or
sub-edge level if edges are split). And finally, decisions are made to find the best edge matches.
The top-down procedure can be independently implemented. In this case, the edge-mapping set
from the bottom-up matching, as shown in Fig. 1, is assumed to be empty.
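The control flow of the combined strategy can be sketched as below. Only the ordering is taken from the text: bottom-up first, then top-down on the edges left unmatched, with an empty bottom-up result corresponding to an independent top-down run. The function name `match_networks` and the callable signatures are assumptions made for illustration, and the two stages are passed in as stubs.

```python
from typing import Callable, Dict, Iterable, List

EdgeMapping = Dict[str, str]  # reference edge ID -> matching edge ID


def match_networks(
    reference_edges: Iterable[str],
    bottom_up: Callable[[], EdgeMapping],
    top_down: Callable[[List[str]], EdgeMapping],
) -> EdgeMapping:
    """Run bottom-up matching, then top-down matching on the remaining edges."""
    mapping = dict(bottom_up())
    # The top-down stage evaluates only edges not matched by the bottom-up stage.
    unmatched = [e for e in reference_edges if e not in mapping]
    for ref_edge, match_edge in top_down(unmatched).items():
        mapping.setdefault(ref_edge, match_edge)  # never overwrite bottom-up matches
    return mapping


# Stub stages: bottom-up finds a1; top-down finds a2 but fails on a3.
result = match_networks(
    ["a1", "a2", "a3"],
    bottom_up=lambda: {"a1": "b1"},
    top_down=lambda unmatched: {e: "b" + e[1:] for e in unmatched if e != "a3"},
)

# Independent top-down run: the bottom-up edge-mapping set is empty.
top_down_only = match_networks(
    ["a1", "a2", "a3"],
    bottom_up=lambda: {},
    top_down=lambda unmatched: {e: "b" + e[1:] for e in unmatched if e != "a3"},
)
```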
With the overall procedure described above, we can now treat each of the three matching
procedures (node matching, segment matching, and edge matching) separately.
Assuming that geometric errors and distortions of the networks tend to be local and are bounded,
the gap between a corresponding node pair, $d^{n}_{ij}$, can be used as a criterion to search for candidate
node matches.
As each node has one or more edges incident to the node, if two nodes represent the coun-
terparts of each other on two networks, the similarity of angle patterns formed between the nodes
and edges can be used as another matching criterion. Assume the angle between two edges ref-
erenced to the same node is calculated by
The minimization sign used in Eq. (3) means that the minimum angle difference is used to de-
termine whether at least one pair of edges is matched for these two corresponding nodes. With the
distance and the angle difference criteria, node matching is computed with a decision function
where Δ and Φ are thresholds set for distance and angle difference, beyond which no matches will
be allowed. In Eq. (4), 0 indicates no matches, and 1 indicates the two nodes are matched. The
procedure to derive matches between nodes of two networks is straightforward. The inputs for
this procedure include node and edge coordinates of the two networks, the maximum matching
distance, and the maximum angle difference. The computation proceeds as follows:
Step 1. Initialize the node-mapping set (e.g., set
Step 2. For each node $n_a$ on the reference network, find all the nodes on the matching network
that are within the maximum distance Δ. Then, apply the matching criteria using Eq. (4), and
find the best match.
Step 3. Output the mapping Mn, which now contains nodes on the matching network that cor-
respond to the nodes on the reference network.
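The three steps above can be sketched as follows. Several details are assumptions rather than the paper's definitions: each node's angle pattern is represented by the bearings (in degrees) of its incident edges, the angle criterion is the smallest bearing difference over all edge pairs, and "best match" is taken to mean the nearest node that passes both thresholds. All function names are illustrative.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]
NodeInfo = Tuple[Point, List[float]]  # (location, bearings of incident edges in degrees)


def angle_gap(a: float, b: float) -> float:
    """Smallest absolute difference between two bearings, in the range 0..180."""
    diff = abs(a - b) % 360.0
    return min(diff, 360.0 - diff)


def min_angle_difference(bearings_a: List[float], bearings_b: List[float]) -> float:
    """Minimum angle difference over all pairs of incident edges (inf if none)."""
    return min((angle_gap(a, b) for a in bearings_a for b in bearings_b),
               default=math.inf)


def is_match(na: NodeInfo, nb: NodeInfo, max_dist: float, max_angle: float) -> int:
    """Decision function: 1 if both thresholds are satisfied, otherwise 0."""
    dist = math.hypot(na[0][0] - nb[0][0], na[0][1] - nb[0][1])
    return int(dist <= max_dist and min_angle_difference(na[1], nb[1]) <= max_angle)


def match_nodes(reference: Dict[str, NodeInfo], matching: Dict[str, NodeInfo],
                max_dist: float, max_angle: float) -> Dict[str, str]:
    mapping: Dict[str, str] = {}                       # Step 1: empty node mapping
    for ref_id, na in reference.items():               # Step 2: test each candidate
        best_id, best_dist = None, math.inf
        for cand_id, nb in matching.items():
            if is_match(na, nb, max_dist, max_angle):
                dist = math.hypot(na[0][0] - nb[0][0], na[0][1] - nb[0][1])
                if dist < best_dist:                   # assumption: nearest is best
                    best_id, best_dist = cand_id, dist
        if best_id is not None:
            mapping[ref_id] = best_id
    return mapping                                     # Step 3: output the mapping


ref_nodes = {"a1": ((0.0, 0.0), [0.0, 90.0]), "a2": ((100.0, 0.0), [180.0])}
cand_nodes = {"b1": ((3.0, 4.0), [2.0, 95.0]),
              "b2": ((4.0, 0.0), [200.0, 300.0]),
              "b3": ((101.0, 1.0), [178.0])}
node_mapping = match_nodes(ref_nodes, cand_nodes, max_dist=10.0, max_angle=15.0)
```

In this example node b2 is closer to a1 than b1 is, but it is rejected because none of its incident edges has a bearing within 15 degrees of an edge at a1. A production implementation would replace the inner loop with a spatial index so that only nodes within the distance threshold are examined.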
Segment matching establishes detailed correspondences at the segment level between edges. It
can find corresponding locations of nodes that are represented on one network but not on the
other network, and generate measures that can be aggregated at the edge level for edge-matching
decisions. We utilize three criteria to evaluate segment matching: angle difference between seg-
ments, distance between segments, and matched segment length. The use of these three criteria is
based upon the consideration that angle difference and distance are direct measures of how well
two segments can correspond with each other, while the length can be used as a weighting factor
for both the angle difference and the distance.
To effectively match segments, a configuration, as shown in Fig. 2, is used to determine
relationships between segments of the two participating networks. In this configuration, $s^a$ represents
a segment on an edge of the reference network, and $s^b$ represents a segment on an edge of the
matching network. The angle difference between $s^a$ and $s^b$ can be computed using an equation
similar to Eq. (2), the only difference is that the vectors used in the equation are now the two
segments instead of the two edges.
Before the distance between the two segments is calculated, four distance measures are first
derived: $d_{AC}$, $d_{CE}$, $d_{BF}$, and $d_{DB}$. As shown in Fig. 2, $d_{AC}$ represents the distance from the starting
point A of $s^a$ to $s^b$, $d_{CE}$ represents the distance from the starting point C of $s^b$ to $s^a$, $d_{BF}$ represents
the distance from the ending point B of $s^a$ to $s^b$, and $d_{DB}$ represents the distance from the ending
point D of $s^b$ to $s^a$. The distance between the two segments then is computed with
Notice that distance between the two segments can be defined in different ways - for instance,
$\min(d_{AC}, d_{CE}, d_{BF}, d_{DB})$. Nevertheless, Eq. (5) provides an average distance measure that takes
account of the distances at both ends of the segment pair. By contrast, measures such as
$\min(d_{AC}, d_{CE}, d_{BF}, d_{DB})$ give distance measures only at certain points. When corresponding
segments stretch into different directions, these measures may fail to provide sensitive information in
a particular match. During the computation of the segment distance, the lengths of the effectively
matched segments are also determined. In the case of Fig. 2, the matching lengths are $d_{EB}$ on $s^a$
and $d_{CF}$ on $s^b$.
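The contrast between an averaged and a minimum segment distance can be sketched as below. Two assumptions are made: each of the four endpoint distances is computed as the distance from the endpoint to the nearest point on the other segment, and the average is the plain mean of the four values. The exact form of Eq. (5) is not reproduced here and may differ.

```python
import math
from typing import Tuple

Point = Tuple[float, float]
Segment = Tuple[Point, Point]


def point_to_segment(p: Point, seg: Segment) -> float:
    """Distance from point p to the nearest point on segment seg."""
    (x1, y1), (x2, y2) = seg
    dx, dy = x2 - x1, y2 - y1
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:                      # degenerate segment: a single point
        return math.hypot(p[0] - x1, p[1] - y1)
    t = ((p[0] - x1) * dx + (p[1] - y1) * dy) / length_sq
    t = max(0.0, min(1.0, t))                 # clamp the projection onto the segment
    return math.hypot(p[0] - (x1 + t * dx), p[1] - (y1 + t * dy))


def endpoint_distances(sa: Segment, sb: Segment) -> Tuple[float, float, float, float]:
    """Distances from A and B (ends of sa) to sb, and from C and D (ends of sb) to sa."""
    (a, b), (c, d) = sa, sb
    return (point_to_segment(a, sb), point_to_segment(c, sa),
            point_to_segment(b, sb), point_to_segment(d, sa))


def average_distance(sa: Segment, sb: Segment) -> float:
    """Assumed stand-in for Eq. (5): mean of the four endpoint distances."""
    return sum(endpoint_distances(sa, sb)) / 4.0


def minimum_distance(sa: Segment, sb: Segment) -> float:
    return min(endpoint_distances(sa, sb))


parallel = (((0.0, 0.0), (10.0, 0.0)), ((0.0, 3.0), (10.0, 3.0)))
diverging = (((0.0, 0.0), (10.0, 0.0)), ((0.0, 0.0), (10.0, 10.0)))
```

For the diverging pair, which shares a start point but separates at 45 degrees, the minimum distance is zero while the averaged distance is clearly positive. This is the insensitivity of point-wise measures that the text warns about.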
Segment matching in general involves intensive computations. To carry out these computations
effectively, a tracking procedure that can coordinate with the node-matching and edge-matching
computations is devised. The major steps for this tracking procedure are as follows:
Step 1. For a given edge pair (that may be derived from node matching or assumed as a
potential matching candidate from edge matching), first compare their directions. If they are in
opposite directions, then the order of the shape points for one of the two edges will be reversed.
Step 2. Search for the first pair of matching segments along the two edges. This matching pair
must be within the allowed maximum distance and in the same direction.
Step 3. Compute the angle difference, distance, and matching lengths as described earlier for
each segment pair. Continue the process until one or two of the edges end or break away. Dur-
ing the process, the decision on which segment is selected for the next computation is based on
whether a segment is an overshot or an undershot (in Fig. 2, sb is an overshot and sa is an un-
dershot). The segment that is next to a previously undershot segment will be selected for next
computation (in Fig. 2, the segment next to sa will be selected).
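A simplified sketch of the tracking steps is given below. It covers the direction check of Step 1 and the undershoot rule of Step 3. It omits the search for the first matching pair in Step 2 and the "break away" test, and it assumes that both edges start near each other and have no zero-length segments. The projection test used to decide which segment is the undershoot is an illustrative choice, not taken from the paper.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def align_direction(pts_a: List[Point], pts_b: List[Point]) -> List[Point]:
    """Step 1: reverse pts_b if the two edges run in opposite directions."""
    ax, ay = pts_a[-1][0] - pts_a[0][0], pts_a[-1][1] - pts_a[0][1]
    bx, by = pts_b[-1][0] - pts_b[0][0], pts_b[-1][1] - pts_b[0][1]
    return pts_b[::-1] if ax * bx + ay * by < 0 else pts_b


def track_segment_pairs(pts_a: List[Point], pts_b: List[Point]) -> List[Tuple[int, int]]:
    """Return (i, j) index pairs of segments compared along the two edges.

    Index j refers to the shape points of edge b after any reversal in Step 1.
    """
    pts_b = align_direction(pts_a, pts_b)
    i, j = 0, 0
    pairs: List[Tuple[int, int]] = []
    while i < len(pts_a) - 1 and j < len(pts_b) - 1:
        pairs.append((i, j))
        # Project the end point of segment a onto the line of segment b.
        (cx, cy), (ex, ey) = pts_b[j], pts_b[j + 1]
        px, py = pts_a[i + 1]
        vx, vy = ex - cx, ey - cy
        t = ((px - cx) * vx + (py - cy) * vy) / (vx * vx + vy * vy)
        if abs(t - 1.0) < 1e-9:   # both segments end together: advance both
            i += 1
            j += 1
        elif t < 1.0:             # a ends first: a is the undershoot, advance a
            i += 1
        else:                     # b ends first: b is the undershoot, advance b
            j += 1
    return pairs


edge_a = [(0.0, 0.0), (4.0, 0.0), (10.0, 0.0)]
edge_b = [(10.0, 1.0), (6.0, 1.0), (0.0, 1.0)]   # digitized in the opposite direction
tracked = track_segment_pairs(edge_a, edge_b)
```

In a fuller implementation, each tracked pair would be passed to the angle, distance, and matched-length computations described earlier.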
For edge matching, we use three matching measures: average angle difference, average distance,
and total matched lengths. These three measures are derived from segment matching but aggre-
gated at the edge level. The average angle difference is computed with the length-weighted average
of segment angle differences. The average distance is the length-weighted average of segment
distances. The total matched lengths are the total lengths of the edges on the reference network
and on the matching network, respectively. With these measures, the overall matching measure
between two edges is calculated with Eq. (6),
where i and j represent the ith edge on the reference network and the jth edge on the matching network;
the terms of Eq. (6) represent the average distance, average angle difference, total matched length
on the edge of the reference network, and total matched length on the edge of the matching
network, respectively; α, β, and δ are weighting factors used to balance the effect of the different
measures. In program implementation, a decision function is utilized to facilitate matching decisions:
where 1 indicates a match, and 0 indicates no match; $\Delta_e$ represents the maximum distance allowed
between two edges, and $L_e$ represents the minimum length required for an edge match. It is
assumed that, among all potential matches, the matching pair with the minimum overall measure
will be identified to make the match. The total lengths of the edge pair may be different from the
matched lengths. The reason to use the total lengths as constraints in Eq. (7) is to make sure that
short edges are not eliminated from the matching process.
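The edge-level aggregation described above can be sketched as follows. This is a minimal illustration, not the author's program: the `SegmentMatch` record and the function name are hypothetical, and the choice to weight each segment by its matched length on the reference network is an assumption, since the text states only that the averages are length-weighted.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass
class SegmentMatch:
    """Measures for one matched segment pair (hypothetical record layout)."""
    angle_diff: float    # angle difference between the two segments
    distance: float      # segment distance
    length_ref: float    # matched length on the reference-network segment
    length_match: float  # matched length on the matching-network segment


def aggregate_edge_measures(
    segments: Sequence[SegmentMatch],
) -> Tuple[float, float, float, float]:
    """Aggregate segment measures to the edge level.

    Returns (average angle difference, average distance,
    total matched length on the reference edge,
    total matched length on the matching edge).
    Assumption: the weights are the reference-side matched lengths.
    """
    total_ref = sum(s.length_ref for s in segments)
    total_match = sum(s.length_match for s in segments)
    if total_ref <= 0.0:
        raise ValueError("no matched length to aggregate")
    avg_angle = sum(s.angle_diff * s.length_ref for s in segments) / total_ref
    avg_dist = sum(s.distance * s.length_ref for s in segments) / total_ref
    return avg_angle, avg_dist, total_ref, total_match


pairs = [SegmentMatch(10.0, 2.0, 1.0, 1.5), SegmentMatch(20.0, 4.0, 3.0, 2.5)]
result = aggregate_edge_measures(pairs)
```

With the two sample segments, the longer segment dominates both averages, which is the intended effect of length weighting.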
In designing procedures to carry out edge matching, two scenarios must be considered. The first
scenario is when the segments between two edges have been matched (e.g., edge matching with the
bottom-up computation). In this case, the task of edge matching is to apply the aggregated
matching criteria derived from segment matching to validate a potential match. The second
scenario is when edge matching starts without segment matching (e.g., edge matching with the
top-down procedure). In this case, edge matching proceeds with selecting a set of potential
matching candidates as the starting point, then segment matching follows. Through segment
D. Xiong / Transportation Research Part C 8 (2000) 71-89
4. The experiment
In spite of differences, these two waterway networks have a very similar node-link structure and
have limited coordinate discrepancies, which make them suitable for the use of the bottom-up
After node matching, edge correspondences are evaluated. This evaluation starts with segment
matching, which generates matching measures for edges that have their end nodes matched. The
computations of matching measures follow the general procedures outlined in Sections 3.4 and
3.5, but with some customization. In this specific application, each of the matching measures is
transformed into a 0-6 point score, and the weighting factors α, β, and δ, as shown in Eq. (6), are
set to 1.0. Finally, the three scores are added together to get the total score for each of the
matching pairs.
Table 1 illustrates how the matching measures are transformed into the 0-6 point scores. The
purpose of this transformation is to have a discretized score so that each of the measures used for
the edge matching can be easily assessed. As is shown in Table 1, the thresholds used for the
discretization are spaced at unequal intervals. The purpose was to provide a higher resolution in
the ranges of the measures where potential matches are more likely. As no evidence suggests that a
particular matching measure needs to be weighted more or less than other measures, 1.0 was
chosen for each of the weighting factors to give equal weight to the measures.
When the total scores are computed for each of the edge pairs, those edge pairs that have a
score less than 1 are eliminated as matches. With these scores, the matching result is that, of all
5088 edges on the 1996 network, 4894 edges are matched to the 1995 network. A visual
examination of these matched nodes and edges reveals no obvious mismatches.
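The scoring scheme can be sketched as below. The break points are purely hypothetical placeholders (the actual intervals are those of Table 1), the use of a matched-length ratio as the third measure is an assumption, and the function names are illustrative. The sketch shows only the mechanics: unevenly spaced thresholds map each measure to a 0-6 score, the scores are combined with weights of 1.0, and pairs scoring less than 1 are dropped.

```python
from bisect import bisect_left, bisect_right

# Hypothetical break points, finer where matches are more likely.
DISTANCE_BREAKS = [50.0, 100.0, 200.0, 400.0, 800.0, 1600.0]
ANGLE_BREAKS = [2.0, 5.0, 10.0, 20.0, 30.0, 45.0]
LENGTH_RATIO_BREAKS = [0.10, 0.25, 0.50, 0.70, 0.85, 0.95]


def score_low_is_good(value, breaks):
    """6 when value <= the first break, 0 when value > the last break."""
    return len(breaks) - bisect_left(breaks, value)


def score_high_is_good(value, breaks):
    """0 when value < the first break, 6 when value >= the last break."""
    return bisect_right(breaks, value)


def total_score(avg_distance, avg_angle, matched_ratio,
                weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the three 0-6 scores (all weights 1.0 by default)."""
    w_dist, w_angle, w_len = weights
    return (w_dist * score_low_is_good(avg_distance, DISTANCE_BREAKS)
            + w_angle * score_low_is_good(avg_angle, ANGLE_BREAKS)
            + w_len * score_high_is_good(matched_ratio, LENGTH_RATIO_BREAKS))


def is_candidate(score, minimum=1.0):
    """Edge pairs with a total score below the minimum are eliminated."""
    return score >= minimum


good = total_score(60.0, 3.0, 0.9)
poor = total_score(5000.0, 90.0, 0.0)
```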
After the bottom-up matching, 194 edges on the 1996 network are left without a match. Some
of these edges may not have a match, but some may have a counterpart on the 1995 network whose
end nodes are not exactly matched. To allow matching of additional edges, the top-down
matching procedure is now utilized.
For the top-down matching, only those edges that are unmatched during the bottom-up
matching are considered for further matching. The edge mapping set generated from the
bottom-up matching is passed to the top-down matching procedure so that those unmatched edges can be
identified. In the top-down matching, each edge on the 1996 network is evaluated separately.
When a 1996 edge is selected, unmatched edges on the 1995 network that are close and parallel to
the selected edge on the 1996 network are identified. Each of the candidate pairs is then matched
at the segment level. After segment matching, matching scores between each edge pair are
Table 1
Transformation of matching measures to 0-6 point scores
5. Conclusions and discussions
The experiment results show that a reasonable match rate can be achieved with the proposed
matching algorithm. The combined use of top-down and bottom-up computations is effective. As
the waterway networks used in our experiment have a similar network structure, the bottom-up
computation can quickly identify matches using node correspondences. In cases where node
correspondences cannot be effectively identified, the top-down computation is able to find
additional matches. The experiment also demonstrates that segment matching provides leverage when
edges are compared in detail. With segment matching, finer resolution can be achieved when edges
show subtle differences.
Currently, the program used to implement the proposed algorithm is able to read and write
ESRI's shape file format directly, which has proven to be very helpful. The data can be viewed and
analyzed conveniently before the algorithm is applied. After matching computations, the results
can be quickly evaluated. In addition, it is possible to take advantage of existing GIS functions for
pre-processing and post-processing the network data, which is necessary in many real-world
applications.
The computational performance of the proposed algorithm appears satisfactory based upon the
current experiment. The entire matching process for the waterway networks took about 2 s on an
Intel Pentium II processor. These 2 s included the time for reading the input files and writing the
output files and the time spent on the actual matching computations. This computational
effectiveness has a great deal to do with the use of a grid system for effectively indexing network nodes
and edges. This grid system is very similar to the one introduced by Franklin et al. (1994). The
major usefulness of this grid system is in searching for matching candidates during node and edge
matching. Take node matching as an example: if indexes are not used, then $m \times n$ calculations are
necessary to find the nearest matches for two sets of nodes that have the node numbers m and n,
respectively. If m and n are assumed to be the same size, then the computational complexity is on
the order of $n^2$, or $O(n^2)$. When grid indexes are utilized, however, the number of computations can be
cut to $9mn/g$, where g is the total number of cells. The factor of 9 is based on the assumption
that the maximum search range is equal to or less than the edge length of a cell, so a maximum of
nine cells will be searched in each of the searches. As the size of g is adjustable and can be made
proportional to the size of n, the computational complexity is now actually on the order of n, or
$O(n)$.
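The nine-cell search can be illustrated with a small uniform-grid index. This is a generic sketch under stated assumptions, not the grid system of Franklin et al. (1994) or the author's implementation: the function names are hypothetical, nodes are plain (x, y) tuples, and the search range is required to be no larger than the cell size so that only the query cell and its eight neighbours need to be inspected.

```python
import math
from collections import defaultdict


def build_grid(points, cell_size):
    """Index point positions by the integer cell that contains them."""
    grid = defaultdict(list)
    for idx, (x, y) in enumerate(points):
        cell = (math.floor(x / cell_size), math.floor(y / cell_size))
        grid[cell].append(idx)
    return grid


def nearest_within(query, points, grid, cell_size, max_range):
    """Return the index of the nearest point within max_range, or None.

    Only the 3 x 3 block of cells around the query is searched, which is
    sufficient because max_range <= cell_size.
    """
    if max_range > cell_size:
        raise ValueError("max_range must not exceed cell_size")
    cx = math.floor(query[0] / cell_size)
    cy = math.floor(query[1] / cell_size)
    best, best_d = None, max_range
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            for idx in grid.get((cx + dx, cy + dy), ()):
                d = math.dist(query, points[idx])
                if d <= best_d:
                    best, best_d = idx, d
    return best


nodes_1995 = [(0.5, 0.5), (3.2, 3.1), (9.0, 9.0)]
grid = build_grid(nodes_1995, cell_size=1.0)
hit = nearest_within((3.0, 3.0), nodes_1995, grid, 1.0, 1.0)
miss = nearest_within((6.0, 6.0), nodes_1995, grid, 1.0, 1.0)
```

Each query inspects at most nine cells, so when the number of cells grows in proportion to the number of nodes, the work per query stays roughly constant.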
With the use of the spatial indexes, as described above, an increase in network size will not
significantly slow down the matching computation. Nevertheless, for extremely large networks
such as those for cities like Los Angeles or New York, a large amount of data may overwhelm
computer memory if it is loaded at once. In these cases, special care must be taken. For
instance, these large networks can first be divided into several smaller networks; these smaller
networks are then matched separately. Another factor that has to be considered in the analysis of
computational performance is the similarity and difference of the participating networks. In
general, networks with similar geometrical and topological characteristics can be more effectively
matched. However, further study of this problem will be necessary in order to arrive at more
substantive conclusions.
In spite of the preliminary success of this algorithm, additional improvement or extension of the
algorithm is necessary to make it more effective and reliable. In the short term, it appears that
improvements should be made in three areas. Firstly, the algorithm uses a segment-matching
procedure that is based on distance and angle differences to track corresponding segments. The
drawback of this procedure is that when network edges are distorted significantly, segment
mappings derived from this procedure cannot faithfully reflect actual segment correspondences. If
this occurs, matching measures derived from these segment mappings are not accurate, and node
positions represented by the segment mappings cannot be determined precisely. Filin and
Doytsher (1998) studied the problem and suggested that edges or curves should be divided into
separate smooth sub-edges or sub-curves, and then mappings should be established between these
Acknowledgements
Sincere thanks are due to the three anonymous reviewers for their valuable comments.
References
Brown, J., Rao, A., Baran, J., 1995. Automated GIS conflation: coverage update problems and solutions. In:
Proceedings of Geographic Information Systems for Transportation Symposium (GIS-T), Sparks, Nevada, USA,
pp. 220-229.
Fonseca, L.M.G., Manjunath, B.S., 1996. Registration techniques for multisensor remotely sensed imagery.
Photogrammetric Engineering and Remote Sensing 62, 1049-1056.
Filin, S., Doytsher, Y., 1998. Conflation - the linear matching issue. In: Proceedings of 1998 Annual Convention and
Exhibition, American Congress on Surveying and Mapping, Baltimore, Maryland, USA.
Franklin, W.R., Sivaswami, V., Sun, D., Kankanhalli, M., Narayanaswami, C., 1994. Calculating the area of overlaid
polygons without constructing the overlay. Cartography and Geographic Information Systems 21, 81-89.
Gabay, Y., Doytsher, Y., 1994. Automatic adjustment of line maps. In: Proceedings of the GIS/LIS'94 Annual
Convention, Arizona, Phoenix, USA, pp. 333-341.
Novak, K., 1992. Rectification of digital imagery. Photogrammetric Engineering and Remote Sensing 58, 339-344.
Nystuen, J.D., Frank, A.I., Frank, Jr., L., 1997. Assessing topological similarity of spatial networks. In: Proceedings of
International Conference on Interoperating Geographic Information Systems, Santa Barbara, California, USA.
Rosen, B., Saalfeld, A., 1985. Match criteria for automatic alignment. In: Proceedings of Auto-Carto VII, American
Congress on Surveying and Mapping and American Society for Photogrammetry and Remote Sensing, pp. 1-20.
Saalfeld, A., 1988. Automated map compilation. International Journal of Geographical Information Systems 2,
217-218.
Shapiro, L.G., 1980. A structural model of shape. IEEE Transactions on Pattern Analysis and Machine Intelligence 2,
111-126.
Shapiro, L.G., Haralick, R.M., 1981. Structural description and inexact matching. IEEE Transactions on Pattern
Analysis and Machine Intelligence 5, 504-519.
Stilla, U., 1995. Map-aided structural analysis of aerial images. ISPRS Journal of Photogrammetry and Remote
Sensing 50, 3-10.
Walter, V., Fritsch, D., 1999. Matching spatial data sets: a statistical approach. International Journal of Geographic
Information Science 13, 445-73.
Wang, Y., 1998. Principles and applications of structural image matching. ISPRS Journal of Photogrammetry and
Remote Sensing 53, 154-165.
Ventura, A.D., Rampini, A., Schettini, R., 1990. Image registration by recognition of corresponding structures. IEEE
Transactions on Geoscience and Remote Sensing 28, 305-314.
Abstract
Third-generation personal navigation assistants (PNAs) (i.e., those that provide a map, the user's current
location, and directions) must be able to reconcile the user's location with the underlying map. This process
is known as map matching. Most existing research has focused on map matching when both the user's
location and the map are known with a high degree of accuracy. However, there are many situations in
which this is unlikely to be the case. Hence, this paper considers map matching algorithms that can be used
to reconcile inaccurate locational data with an inaccurate map/network. © 2000 Published by Elsevier
Science Ltd.
1. Introduction
There are three different types of personal navigation assistants (PNAs). First-generation PNAs
simply provide the user with a map and the ability to search the map in a variety of ways (e.g.,
search for an address, search for a landmark, scroll, and pan). Second-generation PNAs provide
both a map and the user's current location/position. Third-generation PNAs provide a map, the
user's location, and directions of some kind.
It should be clear why we distinguish first-generation PNAs from second- and third-generation
systems. Clearly, a system that provides the user's current location is much more complicated than
one that does not, and generally requires both additional hardware and software. What may not
be clear is why we distinguish between second- and third-generation PNAs.
The rationale for doing so is actually quite simple. In second-generation systems the location
that is provided to the user need not coincide with the street system (or subway system, etc.).
However, in order to provide directions, the user's location must coincide with a street (or subway
line, etc.) when appropriate.
There are, in essence, three different ways to determine the user's location. The first is to use
some form of dead reckoning (DR) in which the user's speed of movement, direction of
movement, etc. are continuously used to update her/his location (Collier, 1990). The second is to use
some form of ground-based beacon that broadcasts its location to nearby users (Iwaki et al.,
1989). The third is to use some form of radio/satellite positioning system that transmits
information that the PNA can use to determine the user's location. This last approach is by far the
most popular; a great many PNAs use the global positioning system (GPS) to determine the user's
location (Hofmann-Wellenhof et al., 1994).
Given a GPS receiver, it is almost trivial to convert a first-generation PNA into a
second-generation PNA (i.e., one that provides both a map and the user's location), and many people
have done so. However, reconciling the user's location with the underlying map (or network) can
be much more complicated. In other words, converting a second-generation PNA into a
third-generation PNA can be quite difficult.
When both the user's location and the underlying network are very accurate, the reconciliation
problem is thought to be straightforward - simply "snap" the location obtained from the GPS
receiver to the nearest node or arc in the network. Hence, it is not surprising that a number of
people are working on improving the accuracy of both the underlying network and the
positioning system. In order to develop more accurate maps/networks, enormous "surveying" efforts
are underway (Deretsky and Rodny, 1993; Schiff, 1993; Shibata, 1994). Some of these efforts are
being undertaken by government agencies and others are being undertaken by private companies.
In order to develop more accurate positioning systems, a great deal of attention is being given to
combining data from multiple sources. Some systems combine GPS with dead reckoning systems
(Degawa, 1992; Mattos, 1994; Kim, 1996), others use differential GPS (Blackwell, 1986), and
others use multiple sources of data (sometimes including maps) and then filter or fuse the data in
some way (Krakiwsky et al., 1988; Tanaka, 1990; Abousalem and Krakiwsky, 1993; Scott and
Drane, 1994; Watanabe et al., 1994; Jo et al., 1996).
We are interested in situations in which it is not possible or desirable to improve the accuracy
of the map/network and the user's location enough to make a simple "snapping" algorithm
feasible. Such situations arise for many reasons. Firstly, not all PNAs are vehicle-based. Hence,
it may not be possible to use DR or other data sources. Secondly, even if it is possible to develop
a network/map that is accurate enough, such a network may not always be available. For
example, the PNA may not have sufficient capacity to store the complete, accurate network at all
times and hence, may need to either store inaccurate/incomplete networks or download
less-detailed networks from either a local or central server. Thirdly, many facilities will probably
never be available from map/network vendors and will need to be obtained on-the-fly from the
facility, probably with limited accuracy. For example, vendors may not provide detailed
networks/maps of airports, campuses (both corporate and university), large parking facilities, and
shopping centers.
Hence, the purpose of this paper is to discuss some simple map matching algorithms that can be
used to reconcile inaccurate locational data with an inaccurate map/network. We begin in the
following section with a formal definition of the problem. We then discuss point-to-point,
point-to-curve and curve-to-curve matching. In all three cases, we consider algorithms that only use
C.E. White et al. / Transportation Research Part C 8 (2000) 91-108
geometric information and algorithms that also use topological information. Then, we consider
the performance of each of the algorithms in practice (in an admittedly limited number of tests).
Finally, we conclude with a discussion of possible future research directions.
Our objective in this paper is not to provide a definitive evaluation of different map
matching algorithms. Rather, our objective is to describe some simple algorithms and to
consider, both theoretically and in a small number of tests, why they might or might not work
well in practice.
2. Problem statement
Our concern is with a person (or vehicle) moving along a finite system (or set) of streets, $\bar{N}$. At
a finite number, T, of points in time, denoted by {0,1,..., T}, we are provided with an estimate of
this person's location. The person's actual location at time t is denoted by $\bar{P}^t$ and the estimate is
denoted by $P^t$. Our goal is to determine the street in $\bar{N}$ that contains $\bar{P}^t$. That is, we want to
determine the street that the person is on at time t.
Of course, we do not know the street system, $\bar{N}$, exactly. Instead, as illustrated in Fig. 1, we
have a network representation, N, consisting of a set of curves in $\mathbb{R}^2$, each of which is called an
arc. Each arc is assumed to be piece-wise linear. Hence, arc $A \in N$ can be completely
characterized by a finite sequence of points $(A^0, A^1, \ldots, A^{n_A})$ (i.e., the endpoints of the individual line
segments that comprise A), each of which is in $\mathbb{R}^2$. The points $A^0$ and $A^{n_A}$ are referred to as nodes
while $(A^1, A^2, \ldots, A^{n_A - 1})$ are referred to as shape points. A node is a point at which an arc
terminates/begins (e.g., corresponding to a dead-end in the street system) or a point at which it is
possible to move from one arc to another (e.g., corresponding to an intersection in the street
system).
This problem is called a map matching problem because the goal is to match the estimated
location, $P^t$, with an arc, A, in the "map", N, and then determine the street in $\bar{N}$ that
corresponds to the person's actual location, $\bar{P}^t$. A secondary goal is to determine the position on A
that best corresponds to $\bar{P}^t$.¹
In order to simplify the exposition, we assume that there is a one-to-one correspondence
between the arcs in N and the streets in $\bar{N}$. This assumption can easily be relaxed, however (and
often does not hold in practice).
There are a number of different ways to approach the map matching problem, each of which
has advantages and disadvantages. We will briefly discuss several of them before moving on to a
discussion of the specific algorithms that we considered.
¹ Not surprisingly, the problem considered here is similar to the map matching problem in mobile robotics. There, the
problem is to establish a correspondence between a current local map and a stored global map.
One can view the map matching problem as a simple search problem. Then the problem is to
match $P^t$ to the "closest" node or shape point in the network.
A number of data structures and algorithms exist (see e.g., Bentley and Maurer, 1980; Fuchs
et al., 1980) for identifying all of the points "near" a given point (often called a range query). It is
then a simple matter to find the distance between $P^t$ and every node and shape point that is within
a "reasonable" distance of it (regardless of the metric used), and select the closest.
While this approach is both reasonably easy to implement and fast, it has many problems in
practice. Perhaps most importantly, it depends critically on the way in which shape points are
used in the network. To see this, consider the example shown in Fig. 2. Here, $P^t$ is much closer to
$B^1$ than it is to either $A^0$ or $A^1$; hence it will be matched to arc B even though it is intuitively clear
that it should be matched to arc A. Hence, this kind of algorithm is very sensitive to the way in
which the network was digitized. That is, other things being equal, arcs with more shape points
are more likely to be matched to.
One might argue that this problem could be overcome simply by including more shape points
for every arc. Unfortunately, this dramatically increases the size of the network and is not
guaranteed to correct the problem.
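The sensitivity to digitization can be reproduced with a small sketch of point-to-point matching. The coordinates below are hypothetical and only mimic the situation described for Fig. 2: arc A is a straight street stored with its two nodes only, while arc B carries an extra shape point near the estimated position. The function name and the dictionary layout of the network are illustrative assumptions.

```python
import math


def match_point_to_point(p, network):
    """Match p to the arc owning the closest node or shape point.

    network maps an arc name to its list of (x, y) vertices.
    Returns (arc name, closest vertex, distance).
    """
    best_arc, best_vertex, best_d = None, None, math.inf
    for name, vertices in network.items():
        for v in vertices:
            d = math.dist(p, v)
            if d < best_d:
                best_arc, best_vertex, best_d = name, v, d
    return best_arc, best_vertex, best_d


# Hypothetical geometry: A passes 1 unit from p, B passes 2 units from p,
# but B has a shape point directly beside p while A has none.
network = {
    "A": [(0.0, 0.0), (10.0, 0.0)],
    "B": [(0.0, 3.0), (5.0, 3.0), (10.0, 3.0)],
}
p = (5.0, 1.0)
arc, vertex, dist = match_point_to_point(p, network)
```

Here the estimate is matched to B at distance 2, although arc A actually runs only 1 unit away from it.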
One can also view map matching as a problem of statistical estimation. In this approach, one
considers a sequence of points $(P^s, \ldots, P^t)$ and attempts to fit a curve to them. This curve is
constrained to lie on the network.
This kind of approach has been explored in numerous papers (see e.g., Krakiwsky et al., 1988;
Scott and Drane, 1994; Jo et al., 1996) and is quite appealing. It is particularly elegant when the
model describing the "physics of motion" is simple (e.g., movement is only possible along a
straight line). Unfortunately, in most practical applications, the physics of motion is dictated by
(or constrained by) the network. This makes it quite difficult to model.
To understand why this is important, consider the network shown in Fig. 3. In this example, the
positions $P^1, \ldots, P^7$ have been recorded. Our objective is to fit a curve to these points, but the curve
is constrained to lie on the network. In this case, there are two candidate curves, A and B (we
ignore the rest of the network for simplicity).
In general (i.e., regardless of the metric), the curve P is closer to the curve B than it is to the
curve A. Thus, if one uses a simple model of motion one will be led to match P to B rather than to
A.
Our objective in this study was to combine the simplicity of the simple search approach with
some of the ideas in the statistical approach. In the end, we implemented and tested four different
algorithms.
Algorithm 1 is very simple. It finds nodes that are close to the GPS "tick" and finds the set of arcs
that are incident to these nodes. It then finds the closest of these arcs and projects the point onto
that arc (using a minimum norm projection). As shown in Fig. 4, calculating the minimum distance
between a point and a line segment is slightly more complicated than calculating the minimum
distance between a point and a line. Calculating the minimum distance between p and the line
segment between $A^0$ and $A^1$ is straightforward since it is the same as the minimum distance between
p and the line through $A^0$ and $A^1$. However, when we calculate the distance between q and the line
through $A^0$ and $A^1$, we see that the "perpendicular" intersects the line outside the line segment.
Hence, we must also calculate the distance between q and both $A^0$ and $A^1$ and choose the smaller.
Finally, since each arc is a piece-wise linear curve, we must find the minimum distance from the
point of interest to each of the line segments that comprise A and select the smallest. Thus,
calculating the minimum distance between a point, $P^t$, and an arc, A, involves finding the minimum
distance between $P^t$ and the line segments $\{\lambda A^0 + (1 - \lambda) A^1,\ \lambda \in [0,1]\}$,
$\{\lambda A^1 + (1 - \lambda) A^2,\ \lambda \in [0,1]\}$, ..., $\{\lambda A^{n_A - 1} + (1 - \lambda) A^{n_A},\ \lambda \in [0,1]\}$
and choosing the smallest.
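The distance calculation described above can be sketched as follows. This is one common way to compute it, given here as an illustration rather than as the authors' code: the function names are hypothetical, coordinates are treated as planar, and the segment is parameterized as a + t(b - a), which corresponds to the λ-form in the text with t = 1 - λ. Clamping t to [0, 1] covers the case in which the perpendicular falls outside the segment, since the nearest point is then one of the endpoints.

```python
import math


def point_segment_distance(p, a, b):
    """Minimum distance from point p to the segment from a to b."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: both endpoints coincide.
        return math.dist(p, a)
    # Parameter of the perpendicular foot on the infinite line.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    # Clamp to the segment; outside [0, 1] the nearest point is an endpoint.
    t = max(0.0, min(1.0, t))
    return math.dist(p, (ax + t * dx, ay + t * dy))


def point_arc_distance(p, arc):
    """Minimum distance from p to a piece-wise linear arc (list of vertices)."""
    return min(
        point_segment_distance(p, arc[i], arc[i + 1])
        for i in range(len(arc) - 1)
    )


inside = point_segment_distance((5.0, 1.0), (0.0, 0.0), (10.0, 0.0))
outside = point_segment_distance((13.0, 4.0), (0.0, 0.0), (10.0, 0.0))
along_arc = point_arc_distance((12.0, 5.0), [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0)])
```

The first call corresponds to the point p of Fig. 4 (perpendicular inside the segment) and the second to the point q (nearest point is an endpoint).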
Obviously, this algorithm has many shortcomings. Firstly, it does not make use of "historical"
information and this can cause problems of the kind illustrated in Fig. 5. The estimated position
$P^2$ is equally close to arcs A and B. However, given $P^0$ and $P^1$ it seems clear that $P^2$ should be
matched to arc A.
Another problem with this algorithm is that it can be quite "unstable". This is illustrated in
Fig. 6. The points $P^0$, $P^1$, and $P^2$ are all roughly equidistant from arcs A and B. But it turns out that
$P^0$ and $P^2$ are slightly closer to A and $P^1$ is slightly closer to B. Hence, the matching oscillates back
and forth between the two.
Hence, Algorithm 1 will play the role of a "straw man". It is fast, easy to implement, and
should be easy to beat.
Algorithm 2 is identical to Algorithm 1 except that it makes use of "heading" information.² If
the heading of the PNA is not comparable to the heading of the arc, then the arc is discarded. So,
A similar process is used for all C and D, and for the other candidate nodes. $P^2$ is then projected
onto the closest.
The street network we used for this study was taken from the 1997 TIGER/Line files for Mercer
County, New Jersey. We collected data while traveling on four pre-determined routes. All of the
routes were "in-town". That is, they did not involve highways or arterials.
We determined our actual location by manually instructing the PNA to record the system time
as we entered each link. We determined our estimated location using a GPS receiver that was
manufactured by TravRoute using a chipset from Rockwell (12 channels, differential correction
was turned off). We ran the receiver in NMEA-0183 mode and used the GPRMC sentences (which
include information about the time, latitude and longitude, speed, heading, and satellite count).
We recorded one GPS tick per second, along with the system time when it was generated.
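For readers unfamiliar with the format, a GPS tick of this kind can be decoded roughly as follows. This is a minimal sketch that assumes the conventional comma-separated field order of an NMEA 0183 RMC sentence (time, status, latitude, hemisphere, longitude, hemisphere, speed in knots, course, date); it is not the software used in the study, the function name is hypothetical, the checksum is not verified, and the sample sentence is an illustrative one rather than recorded data.

```python
def parse_gprmc(sentence):
    """Extract time, position, speed and heading from a GPRMC sentence."""
    body = sentence.strip()
    if body.startswith("$"):
        body = body[1:]
    body = body.split("*", 1)[0]  # drop the checksum (not verified here)
    fields = body.split(",")
    if fields[0] != "GPRMC" or len(fields) < 10:
        raise ValueError("not a GPRMC sentence")

    def to_degrees(value, hemisphere):
        # NMEA packs angles as (d)ddmm.mmmm: whole degrees, then minutes.
        raw = float(value)
        degrees = int(raw // 100)
        minutes = raw - 100 * degrees
        decimal = degrees + minutes / 60.0
        return -decimal if hemisphere in ("S", "W") else decimal

    def to_float(value):
        # Speed and course may be empty when the receiver has no fix.
        return float(value) if value else None

    return {
        "utc_time": fields[1],
        "valid": fields[2] == "A",
        "lat": to_degrees(fields[3], fields[4]),
        "lon": to_degrees(fields[5], fields[6]),
        "speed_knots": to_float(fields[7]),
        "heading_deg": to_float(fields[8]),
        "date": fields[9],
    }


tick = parse_gprmc(
    "$GPRMC,123519,A,4807.038,N,01131.000,E,022.4,084.4,230394,003.1,W*6A"
)
```

The heading and speed fields are the ones that an algorithm using "heading" information would consume, alongside the latitude and longitude.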
Note that, given this sampling scheme, the data will be spatially biased. Unfortunately, given
current technology, it is not possible to obtain a spatially unbiased sample for two reasons.
Firstly, the satellites themselves broadcast messages on a pre-determined (temporal) schedule.
Secondly, the GPS receiver cannot know when it has traveled a given distance, and, even if it
Table 1
General route information
Route   No. of arcs   Avg. arc length (km)
1       12            0.1706
2       12            0.2249
3       14            0.1928
4       16            0.6083
could, it would not necessarily be able to determine its location at that time. Thus, an algorithm
that cannot perform well in the presence of the spatial bias is not a good algorithm for our
purposes.
Note also that the GPS receiver did not provide us with raw satellite information; we let the
GPS receiver do as much processing as it could in the time available to it. In fact, the GPS ticks we
used were actually "smoothed" by the receiver. Of course, this filtering does introduce error that
we would prefer to avoid. Unfortunately, it is not clear how to avoid this kind of error. In
particular, the system is almost always under-identified (i.e., fewer than four satellites are visible)
or over-identified (i.e., more than four satellites are visible). We cannot ignore all such
observations because, in practice, they occur frequently (indeed, sometimes for an entire trip). Since every
method of handling the identification problem introduces error, we chose to use the filtering
scheme that Rockwell built into their chip set.
Table 1 provides some information about the routes that were used. The four routes had
between 12 and 16 arcs, most of which had a speed limit of 25 mph. The speed limit was never
exceeded while the data were being collected, but it was frequently not realized because of other
vehicles on the road. Routes 1, 2, and 3 have similar mean arc lengths. Route 4 has a considerably
longer mean arc length because it included a few "long" links.
5.3. Results
Unfortunately, space prevents us from presenting maps of all of the results. We will, instead, try
to summarize them. Given the small number and limited variability of routes in the field test, we
³ Of course, it is possible to record the PNA's true location on a closed test track. Unfortunately, the only test tracks
available to us were small and had simple topologies. Almost all algorithms work well on test tracks.
have chosen not to present a statistical analysis of the data. Indeed, we caution against drawing
any strong conclusions from these results.
Table 2 displays, for each of the four pre-planned routes used in the study, the percentage of
correct matches attained by each algorithm.
Overall, the best algorithm only correctly matches between 66% and 86% of the GPS
ticks. However, this is not as bad as it might sound. Recall that one GPS tick was generated
each second. This means that relatively more ticks are generated near intersections
(because speeds are always lower near intersections and are sometimes zero) and the map
matching problem is much more difficult near intersections since several arcs are very close to each
intersection. In addition, the GPS receiver itself performs much more poorly near intersections
because the speed of the vehicle is lower. Hence, the error in the GPS ticks is much larger at
intersections.
This is relatively easy to see on a map. Fig. 10 contains a portion of the arcs that
compose Route 1. In the figure, there are two sets of arrow-heads. The lighter
arrow-heads represent the GPS ticks and the darker arrow-heads represent the map matched
locations that were produced by Algorithm 1. As you can see, most of the problems occur at
intersections.
Table 2
Matching rates for route/algorithm pairs (fraction of GPS ticks matched correctly)
Algorithm   Route 1   Route 2   Route 3   Route 4
1           0.534     0.677     0.618     0.608
2           0.663     0.736     0.855     0.681
3           0.661     0.707     0.858     0.664
4           0.617     0.726     0.771     0.687
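As a quick arithmetic check of the "between 66% and 86%" statement, the following sketch picks the best matching rate per route from the values of Table 2. The variable names are illustrative; only the numbers come from the table.

```python
# Matching rates from Table 2: algorithm -> rates on Routes 1 to 4.
MATCH_RATE = {
    1: [0.534, 0.677, 0.618, 0.608],
    2: [0.663, 0.736, 0.855, 0.681],
    3: [0.661, 0.707, 0.858, 0.664],
    4: [0.617, 0.726, 0.771, 0.687],
}
ROUTES = ["Route 1", "Route 2", "Route 3", "Route 4"]

best = {}
for idx, route in enumerate(ROUTES):
    # Algorithm with the highest matching rate on this route.
    algo = max(MATCH_RATE, key=lambda a: MATCH_RATE[a][idx])
    best[route] = (algo, MATCH_RATE[algo][idx])

lowest_best = min(rate for _, rate in best.values())
highest_best = max(rate for _, rate in best.values())
print(best, lowest_best, highest_best)
```

The best rate per route ranges from 0.663 to 0.858, and the algorithm that attains it is not the same on every route.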
Table 3
Detailed performance of Algorithm 1 (point-to-curve matching)
Attribute                  Match      Statistic  Route 1  Route 2  Route 3  Route 4
Arc length (km)            Correct    Mean       0.246    0.290    0.268    1.12
                                      S.D.       0.074    0.089    0.072    1.12
                           Incorrect  Mean       0.146    0.239    0.191    0.345
                                      S.D.       0.098    0.116    0.115    0.375
Speed (mph)                Correct    Mean       21.8     21.5     22.2     32.8
                                      S.D.       5.43     5.92     4.69     14.4
                           Incorrect  Mean       15.8     17.8     15.2     32.4
                                      S.D.       6.16     7.12     6.16     38.1
Closest arc (km)           Correct    Mean       0.025    0.020    0.021    0.025
                                      S.D.       0.019    0.023    0.019    0.022
                           Incorrect  Mean       0.020    0.023    0.020    0.079
                                      S.D.       0.013    0.022    0.015    0.096
Next closest arc (km)      Correct    Mean       0.065    0.061    0.069    0.144
                                      S.D.       0.027    0.032    0.032    0.087
                           Incorrect  Mean       0.040    0.037    0.039    0.122
                                      S.D.       0.019    0.024    0.019    0.107
Serial correlation         Correct    Mean       10.3     18.6     12.8     16.7
                                      S.D.       2.13     2.02     1.92     4.33
                           Incorrect  Mean       8.18     8.88     7.91     11.5
                                      S.D.       3.43     3.12     2.49     3.55
Closest intersection (km)  Correct    Mean       0.073    0.076    0.079    0.173
                                      S.D.       0.042    0.047    0.047    0.127
                           Incorrect  Mean       0.040    0.035    0.038    0.063
                                      S.D.       0.022    0.025    0.024    0.060
the GPS tick to the matched arc, the distance from the GPS tick to the next best arc, the serial
correlation in the GPS ticks, and the distance from the GPS tick to the nearest intersection. This
information is summarized in Tables 3-6.
As you can see, all algorithms were more likely to produce an incorrect match when the PNA was
traversing a relatively short road segment. This is almost certainly because, on such a segment,
the PNA is never far from an intersection.
As one might expect, all of the algorithms worked better with "better" GPS ticks (i.e.,
when the distance between the location approximation and the closest arc is
relatively small). Note, however, that because of errors in the map database, this does not
necessarily mean that a more accurate GPS receiver (e.g., a DGPS receiver) will yield better
results.
Speed can also play a role and can be associated with arc length. Indeed, correct matches
tend to occur at greater speeds than incorrect matches. This may simply be because the mean
speed of travel is higher on longer arcs (and, hence, the GPS readings tend to be better).
However, in the case of Algorithm 4, it may also be a direct result of the inclusion of
topological information. Oddly, however, Algorithm 4 seems to perform badly at high speeds on
Route 4.
Table 4
Detailed performance of Algorithm 2 (point-to-curve matching supplemented with heading information)
Attribute                  Match      Statistic  Route 1  Route 2  Route 3  Route 4
Arc length (km)            Correct    Mean       0.223    0.284    0.246    0.098
                                      S.D.       0.084    0.094    0.097    1.043
                           Incorrect  Mean       0.153    0.246    0.192    0.466
                                      S.D.       0.111    0.115    0.094    0.706
Speed (mph)                Correct    Mean       21.2     21.8     20.4     30.9
                                      S.D.       5.70     5.62     5.85     11.0
                           Incorrect  Mean       14.7     16.0     14.3     36.5
                                      S.D.       5.79     7.08     6.21     43.6
Closest arc (km)           Correct    Mean       0.030    0.023    0.028    0.032
                                      S.D.       0.020    0.024    0.022    0.032
                           Incorrect  Mean       0.048    0.042    0.050    0.175
                                      S.D.       0.073    0.061    0.072    0.197
Next closest arc (km)      Correct    Mean       0.073    0.066    0.072    0.186
                                      S.D.       0.031    0.035    0.028    0.143
                           Incorrect  Mean       0.093    0.066    0.070    0.259
                                      S.D.       0.10     0.081    0.099    0.23
Serial correlation         Correct    Mean       10.7     12.5     16.3     20.0
                                      S.D.       1.91     2.86     3.52     3.90
                           Incorrect  Mean       5.00     4.46     2.75     9.36
                                      S.D.       3.04     1.65     0.94     3.48
Closest intersection (km)  Correct    Mean       0.065    0.076    0.063    0.145
                                      S.D.       0.044    0.048    0.049    0.116
                           Incorrect  Mean       0.054    0.047    0.038    0.094
                                      S.D.       0.076    0.064    0.072    0.147
Table 5
Detailed performance of Algorithm 3 (point-to-curve matching supplemented with heading information and connectivity)
Attribute                  Match      Statistic  Route 1  Route 2  Route 3  Route 4
Arc length (km)            Correct    Mean       0.219    0.279    0.250    0.942
                                      S.D.       0.082    0.092    0.098    1.02
                           Incorrect  Mean       0.163    0.267    0.177    0.569
                                      S.D.       0.117    0.117    0.073    0.846
Speed (mph)                Correct    Mean       21.2     21.3     20.1     31.1
                                      S.D.       5.78     5.84     5.97     11.1
                           Incorrect  Mean       15.6     18.4     16.6     35.7
                                      S.D.       5.78     7.60     7.00     42.6
Closest arc (km)           Correct    Mean       0.029    0.019    0.029    0.033
                                      S.D.       0.020    0.021    0.023    0.033
                           Incorrect  Mean       0.035    0.058    0.051    0.158
                                      S.D.       0.032    0.062    0.066    0.178
Next closest arc (km)      Correct    Mean       0.106    0.116    0.119    0.639
                                      S.D.       0.059    0.084    0.059    0.633
                           Incorrect  Mean       0.079    0.105    0.098    0.258
                                      S.D.       0.053    0.104    0.105    0.217
Serial correlation         Correct    Mean       10.3     16.9     19.3     22.8
                                      S.D.       1.81     2.69     3.35     3.33
                           Incorrect  Mean       4.85     7.00     3.20     12.5
                                      S.D.       3.01     2.75     2.46     3.45
Closest intersection (km)  Correct    Mean       0.061    0.071    0.059    0.288
                                      S.D.       0.04     0.046    0.045    0.385
                           Incorrect  Mean       0.033    0.044    0.019    0.084
                                      S.D.       0.039    0.059    0.056    0.140
Table 6
Detailed performance of Algorithm 4 (curve-to-curve matching)
Attribute                  Match      Statistic  Route 1  Route 2  Route 3  Route 4
Arc length (km)            Correct    Mean       0.233    0.288    0.250    1.04
                                      S.D.       0.081    0.087    0.094    1.08
                           Incorrect  Mean       0.147    0.238    0.199    0.347
                                      S.D.       0.106    0.124    0.104    0.416
Speed (mph)                Correct    Mean       21.4     21.2     21.0     31.5
                                      S.D.       5.30     6.12     5.64     10.7
                           Incorrect  Mean       16.0     19.2     15.8     36.0
                                      S.D.       6.19     6.51     5.78     44.3
Closest arc (km)           Correct    Mean       0.03     0.021    0.028    0.031
                                      S.D.       0.022    0.022    0.022    0.031
                           Incorrect  Mean       0.035    0.037    0.051    0.166
                                      S.D.       0.028    0.027    0.058    0.153
Next closest arc (km)      Correct    Mean       0.00     0.00     0.00     0.00
                                      S.D.       0.00     0.00     0.00     0.00
                           Incorrect  Mean       0.00     0.00     0.00     0.00
                                      S.D.       0.00     0.00     0.00     0.00
Serial correlation         Correct    Mean       9.67     15.6     10.8     21.5
                                      S.D.       2.14     3.17     2.58     4.48
                           Incorrect  Mean       5.54     5.9      3.19     9.77
                                      S.D.       2.78     2.25     1.10     4.00
Closest intersection (km)  Correct    Mean       0.067    0.076    0.065    0.289
                                      S.D.       0.045    0.048    0.046    0.379
                           Incorrect  Mean       0.027    0.027    0.036    0.125
                                      S.D.       0.037    0.03     0.063    0.153
of small errors in the map matched location. In other cases, the directions change dramatically.
Hence, work needs to be done both on more robust path-finding algorithms and on varying the
map matching algorithm in different situations.
Acknowledgements
References
Abousalem, M.A. , Krakiwsky , E.J. , 1993 . A qualit y contro l approac h fo r GPS-base d automati c vehicl e location
and navigatio n systems . In : Proceeding s o f th e Vehicl e Navigation an d Informatio n System s Conference , pp .
466-70.
108 C.E. White e t al . I Transportation Research Part C 8 (2000) 91-108
Bentley, J.L. , Maurer , H.A. , 1980 . Efficien t worst-cas e dat a structure s fo r rang e searching . Act a Inf . 13 ,
155-168.
Blackwell, E.G. , 1986 . Overview of differential GP S methods . Globa l Positionin g Sys . 3, 89-100.
Collier, W.C. , 1990 . In-vehicle route guidance system s using map matche d dea d reckoning . In : Proceeding s o f IEEE
Position Locatio n and Navigatio n Symposium , pp. 359-363 .
Degawa, H. , 1992 . A ne w navigatio n syste m wit h multipl e informatio n sources . In : Proceeding s o f th e Vehicl e
Navigation and Informatio n System s Conference, pp. 143-149 .
Deretsky, Z., U. Rodny, 1993 . Automatic conflation of digital maps: how to handle unmatched data. In: Proceedings o f
the Vehicle Navigation and Informatio n System s Conference, pp. A27-A29 .
Fuchs, H. , Kedem , Z.M. , Naylor , B.F. , 1980 . O n visibl e surfac e generatio n b y a prior i tre e structures . Comput .
Graphics 14 , 124-133.
Hofmann-Wellenhoff, B. , Lichtenegger, H., Collins , J. , 1994 . GPS: Theor y an d Practice . Springer , Berlin.
Iwaki, F. , Kakihari , M. , Sasaki , M., 1989 . Recognition o f Vehicle's Locatio n fo r Navigation . In : Proceeding s o f th e
Vehicle Navigation an d Informatio n System s Conference, pp. 131-138 .
Jo, T. , Haseyamai , M. , Kitajima , H. , 1996 . A ma p matchin g metho d wit h th e innovatio n o f th e Kalma n filtering .
IEICE Trans . Fund . Electron . Comm . Comput . Sci . E79-A, 1853-1855 .
Kim, J.-S. , 1996 . Node base d ma p matchin g algorithm for car navigation system . In: Proceedings o f the Internationa l
Symposium o n Automotive Technolog y and Automation , pp . 121-126 .
Krakiwsky, E.J. , Harris , C.B. , Wong , R.V.C. , 1988 . A Kalma n filte r fo r integratin g dea d reckoning , ma p
matching an d GP S positioning . In : Proceeding s o f IEE E Positio n Locatio n an d Navigatio n Symposium ,
pp. 39-46 .
Mattos, P.G. , 1994 . Integrated GP S and dea d reckonin g for low-cos t vehicle navigation and tracking . In: Proceeding s
of the Vehicl e Navigation an d Informatio n System s Conference, pp. 569-574 .
Scott, C.A., Drane , C.R., 1994 . Increased accurac y o f motor vehicle position estimatio n by utilizing map data , vehicle
dynamics an d othe r informatio n sources . In : Proceeding s o f th e Vehicl e Navigatio n an d Informatio n System s
Conference, pp . 585-590 .
Schiff, T.H. , 1993 . Dat a source s an d consolidatio n method s fo r creating , improvin g an d maintainin g navigatio n
databases, In : Proceeding s o f the Vehicl e Navigation an d Informatio n System s Conference, pp . 3-7 .
Shibata, M. , 1994 . Updating o f digital road map . In : Proceedings o f the Vehicle Navigation an d Informatio n Systems
Conference, pp . 547-550 .
Tanaka, J. , 1990 . Navigation syste m with map-matching method . In : Proceedings o f the SA E International Congres s
and Exposition , pp . 45-50.
Watanabe, K. , Kobayashi , K. , Munekata , F. , 1994 , Multiple sensor fusion fo r navigation systems. In: Proceedings of
the Vehicle Navigation an d Informatio n System s Conference, pp. 575-578 .
Abstract
Keywords: Database; Query language; Path management; Query resolution engine; Geographical information system
1. Introduction
Many efforts are being undertaken to elaborate innovative solutions for the representation and
exploration of complex database applications (Medeiros and Pires, 1994). In the context of
geographical databases, several spatial data models have been identified. Many proposals have
been advanced to query these databases (Aufaure-Portier and Trepied, 1996; Di Loreto et al.,
1996; Egenhofer, 1994; Kirby and Pazner, 1990; Meyer, 1992; Tsou and Buttenfield, 1996;
Woodruff et al., 1995). Graph structures are introduced into geographic database models to
handle networks. Network structures are particularly useful to represent physical or influence
relationships in space. Physical networks include the distribution of electricity, gas, water or
telecommunication resources. Influence networks describe economic or social patterns in space.
Network structures can also be used to represent spatial navigation processes in space and time
(i.e., a movement, generally a human one, between several locations in space). Such processes are
represented through cognitive representations of space that integrate complementary levels of
abstraction (i.e., large and local scales according to Kuiper, 1978). In this case, network nodes
represent symbolic and discrete locations in space, and edges represent displacements between
these locations. A classic logical architecture of a Geographical Information System (GIS) is
presented in Fig. 1. Three main levels and their interactions are presented in the figure.
*Fax: +33-2-32-95-97-08.
E-mail address: [email protected] r (M. Mainguenaud).
0968-090X/00/$ - see front matter © 2000 Elsevier Science Ltd. All rights reserved.
PII: S0968-090X(00)00018-8
M. Mainguenaud / Transportation Research Part C 8 (2000) 109-127
The User Interface (UI) level is responsible for the query definition process and the result
display. These two tasks may follow different approaches. As an example, the query definition
process may be performed through an interface based on click boxes (e.g., a form). The result
display process may be performed through an interface based on a visual language. There is no
reason to constrain the query definition process to be performed on the same hardware as the
display of the query result (e.g., a query defined with real buttons and a visualization of the
result on a screen).
The Query Resolution Engine (QRE) is responsible for transforming an end-user query into
orders that can be understood by the Data Base Management System (DBMS). The allowed
expressive power (i.e., the classes of queries an end-user can define) of the UI must be greater than
what the DBMS is able to accept. Otherwise, the UI level is not a UI level but a simple interface to
the DBMS. Naturally, the two expressive powers (i.e., the query language of the UI and the
database query language) must be compatible. A solution based on a UI with operators and a
predicate-based query language for the DBMS does not make sense. For example, the UI may
accept in a single order the composition of operators. Because the DBMS does not accept the
composition, the QRE is in charge of providing the software interface that simulates the
composition. Let us consider for the time being that the DBMS is only in charge of storing data
and providing tools to retrieve this data. We are concerned in this research with the links between
the UI level and the QRE.
Geographical Information System for Transportation (GIS-T) represents a class of GIS
problems. This kind of application focuses on geographic data that can be organized with or
without a spatial representation (i.e., the management of a network). Applications can
manipulate a network without any knowledge (or need of knowledge) of the spatial locations (i.e.,
spatial coordinates) of the different entities. The manipulation is based on the concept of a
graph without a spatial component. This property allows for the development of applications
on very cheap hardware devices since only an alphanumeric view of the data is required. For
example, planning and management of a public bus system involves a list of buses and
connections linked to a timetable and does not require a two-dimensional (2D) representation. Most
of the time, a 2D representation of a network is schematic and does not respect scale. In richer
applications where scale and positional accuracy are important, the logical representation of a
network and its environment are merged. The quantity of data available is larger and
provides the opportunity to define more complex queries than "I would like to go from place A
to place B". In this case, an end-user query can formally be defined as the application of a set of
operators (spatial or not). GIS-T must provide a path evaluation operator. For example, a query
may be the application of a selection based on alphanumeric criteria (e.g., a route with the
characteristic "highway"), an evaluation of a path (e.g., from a town named "Paris" to a town
named "Nice"), and a spatial intersection operator (e.g., the path must cross a forest) to provide
a unique end-user query (e.g., a route from Paris to Nice which only uses a highway and crosses
a forest).
Several spatial database models and query languages have been proposed to represent the
properties of networks, whether in space or not (Angellacio et al., 1990; Car and Frank, 1994;
Christophides et al., 1996; Cruz et al., 1987; Erwig and Güting, 1994; Timpf et al., 1992). The
definition of a data manipulation language that operates, organizes, and presents a set of network
queries within their geographic context remains an important research challenge to date.
Database query languages are mainly based on a logic of predicates that restrict a query result to a set
of tuples or objects. The projection operator restricts a query result to some of the attributes.
(Note that a projection operator is also defined within object-oriented database query languages
as an operator which restricts resulting attributes, hides and possibly renames or redefines
object properties.) To extend the semantics of the projection operator, aggregate functions have
been proposed (e.g., sum, average, maximum, minimum, count). These functions can be
integrated within the projection operator in order to extend the semantics delivered by the query
results. In a GIS context, a query result combines network, spatial, and alphanumeric properties.
For example, routes of air flights present a network with a map as a background. Nodes have
relevant geographic locations but the edges are symbolic in the sense that they do not represent
the real traveling routes of the planes.
A first extension proposed to handle spatial data in query languages is the introduction of
spatial predicates (e.g., in the "where" clause of an extended SQL-like query language). However,
the semantics of these languages is not adapted to the complexity of geographic applications. In
this domain, operations are often oriented toward the spatial and logical manipulations of
entities. The introduction of spatial-oriented (or network-oriented) operators improves the benefit of
spatial queries. New alphanumeric and spatial semantics may be derived from their application
results must take into account this risk (e.g., inter-dependent applications). The philosophies of a
query definition and the presentation of the results may be different (e.g., form vs visual). The
management of query results involves three components. The first is the data model associated
with the query results. The second entails the results themselves (metabase/database). The last is the
interpretation to avoid errors due to visual ambiguities of an operator with an aggregate function.
The remainder of this paper is organized as follows. The second section presents the query
modeling. This is the first step to enable GIS-T database manipulations. The modeling of the
query results and their basic manipulations is discussed in the following section. This is the first
step to achieve data visualization to end-users. Conclusions of this work are drawn in Section 4.
2. Query modeling
2.1. Graph
Without any loss of generality, we consider here a graph with a single level of abstraction. We
require a graph structure to illustrate the examples on transportation data. The way this graph is
represented (the data model) and acquired (constructor operators) is not important here since
we are only concerned with the composition of operators in a query: this graph structure is used
by a path operator. Therefore, let us use a simple definition of a graph: a graph is formally defined
by a set of labeled vertices called nodes and a set of labeled edges. Data can be associated with a
graph as a whole. Fig. 2 presents the formal definition of a graph G (Cruz et al., 1987). This
definition also serves to model a query, in which case nodes are the operators and edges model the
arguments (i.e., a conventional representation of a functional expression).
The notion of a graph (an abstract data type) will guide us to model the application data,
whatever the data model and the semantics of the application are. The formal modeling of
application data (database schema) can be defined by one of several approaches (Peckham and
Maryanski, 1988; Smith and Smith, 1987; Stonebraker and Moore, 1996; Zdonik and Maier,
1990).
Let us use a traditional complex object notation to denote a sample database: [ ] for
aggregations, { } for sets, and ( ) for lists, together with the conventional types integer, string, and
float. Let us define an abstract data type (Seshadri et al., 1997; Stemple et al., 1986) named
SpatialRepresentation_type to model the cartographic representation of a node (resp. an edge).
Fig. 3 presents a sample database with towns, forests, roads, and polluted areas, while Fig. 4
presents the instances of this database (alphanumeric part). Without any loss of generality, we
assume that all links are unidirectional. Fig. 5 presents the logical view of the transportation
graph, that is, the relevant relative spatial location between nodes is missing. In this example, one
mandatory link exists between Lyon and Avignon in a path from Paris to Nice, whereas several
alternative paths can be followed to join Paris to Nice. This is a fairly typical situation in
transportation.
In the formal definition of a graph, ν is a function that provides spatialRepresentation to a
node; ε is a function that provides duration and spatialRepresentation to an edge. Town_type
is a subtype of Node_type, and Road_type is a subtype of Edge_type. TransportNetwork_type is
a subtype of Graph_type. γ is a function that provides a visualization context for a transportation
network (e.g., with some forests and some polluted areas).
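The labeled graph described above can be sketched as a small data structure. This is a minimal illustration under stated assumptions: the class name LabeledGraph, the edge identifiers, the durations, and the reading of Ψ as an incidence function (edge to origin and destination) are hypothetical, since the formal definition of Fig. 2 is not reproduced here.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Set, Tuple

@dataclass
class LabeledGraph:
    nodes: Set[str]
    edges: Set[str]
    psi: Dict[str, Tuple[str, str]]      # assumed: edge id -> (origin, destination)
    nu: Dict[str, Dict[str, Any]]        # node labels, e.g. spatialRepresentation
    epsilon: Dict[str, Dict[str, Any]]   # edge labels, e.g. duration
    gamma: Dict[str, Any] = field(default_factory=dict)  # data on the graph as a whole

    def successors(self, node: str) -> List[str]:
        """Nodes reachable from `node` through one unidirectional link."""
        return sorted(self.psi[e][1] for e in self.edges if self.psi[e][0] == node)

# Hypothetical fragment of a transportation network (durations are invented).
network = LabeledGraph(
    nodes={"Paris", "Lyon", "Avignon"},
    edges={"r1", "r2"},
    psi={"r1": ("Paris", "Lyon"), "r2": ("Lyon", "Avignon")},
    nu={name: {"spatialRepresentation": None} for name in ("Paris", "Lyon", "Avignon")},
    epsilon={
        "r1": {"duration": 2, "spatialRepresentation": None},
        "r2": {"duration": 2, "spatialRepresentation": None},
    },
    gamma={"context": ["forests", "polluted areas"]},
)
print(network.successors("Paris"))
```

Keeping the spatial representation as an optional label reflects the point made in the text that a network can be manipulated without any spatial coordinates.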
A double spatial representation of transportation data is handled with the ε function. This
allows for duration or travel time to be different. The logical representation of a given (origin,
destination) pair and of the double spatial representation associated to it is depicted in Fig. 6.
Data manipulations in a GIS-T database can be reduced to four classes. Class (a) conventional
database manipulations (i.e., querying the ν or ε function) are based on a selection with alphanumeric
criteria: for instance, a town with a given name, or "what portion of road network N is maintained
by political entity P?" Class (b) conventional GIS data manipulations are based on spatial
operators: for instance, an intersection operator for "a road must cross a forest", or "Do the bounding
boxes of two networks overlap?" This class involves the spatial representation. Class (c.1) is the
evaluation of paths with a finite state automata-based query: the road classification must be
"secondary", then "highway", and finally "secondary" (Cruz et al., 1987). For this query type, the
constraint is evaluated at each edge of a path. Class (c.2) is the evaluation of paths with one or
more aggregates: for instance, the total duration is less than 10 units. The constraint is here
evaluated for the path as a whole.
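The difference between the two path classes can be sketched as follows. This is an illustration only: a regular expression stands in for the finite state automaton of class (c.1), each edge is encoded by one letter, and the edge attributes, names, and durations are hypothetical.

```python
import re

# Assumed one-letter encoding of road classifications.
CLASS_CODE = {"secondary": "s", "highway": "h"}

def satisfies_class_pattern(path_edges, pattern):
    """Class (c.1): the constraint is checked edge by edge along the path."""
    codes = "".join(CLASS_CODE[edge["classification"]] for edge in path_edges)
    return re.fullmatch(pattern, codes) is not None

def satisfies_aggregate(path_edges, max_total_duration):
    """Class (c.2): the constraint applies to the path as a whole."""
    return sum(edge["duration"] for edge in path_edges) < max_total_duration

# "secondary", then "highway", and finally "secondary".
SECONDARY_HIGHWAY_SECONDARY = r"s+h+s+"

path = [
    {"classification": "secondary", "duration": 1},
    {"classification": "highway", "duration": 3},
    {"classification": "highway", "duration": 2},
    {"classification": "secondary", "duration": 1},
]
print(satisfies_class_pattern(path, SECONDARY_HIGHWAY_SECONDARY),
      satisfies_aggregate(path, 10))
```

The first check inspects the sequence of edge labels, whereas the second needs only a single value computed over the whole path, which is why the two constraints can be combined in one path operator.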
Within this classification, class (b) is the only one that requires a spatial representation. Classes
(c.1) and (c.2) involve a path evaluation. A path may have two representations, namely a logical
one (a graph) and a spatial one. The spatial representation of a path may be used in a class (b)
query.
In this paper, we are not concerned with queries of class (a) since they correspond to
conventional database manipulation operators. Furthermore, classes (c.1) and (c.2) can be considered
as one single class (c) since we are not concerned with the query definition process (e.g., what is the
kind of UI? how many paths are required? what is the value of the aggregate?) or the query
evaluation process (e.g., what algorithm should be used to evaluate a path? what is the "best"
optimization?). To illustrate transportation manipulations on our sample database, let us define
the following set of queries to tackle the problems of GIS-T:
Query Q1: I would like to join Paris to Nice.
Query Q2: I would like to join Paris to Nice with a total duration less than 10 units.
Query Q3: Show me a map with a road(s) that crosses, with at least a 10-unit length, a forest
(with a polluted area) in its non-polluted part.
Query Q4: Show me a map with a route(s) from Paris to Nice that crosses a forest and borders a
polluted area.
Query Q1 requires the evaluation of a transitive closure on a graph G defined in the database
(i.e., class (c)). Query Q2 has the same requirement extended with an aggregate function (i.e., path
evaluation under constraints). Query Q3 requires the composition of two spatial operators (an
intersection applied on the result of a spatial difference), i.e., class (b). Query Q3 can be
generalized as a mix with query Q1 to involve the path operator (road vs path from Paris to Nice).
Query Q3 represents a vertical composition (i.e., the algebraic representation is a tree). Query Q4 is
also a query with a composition of operators. It represents a horizontal composition (i.e., the
algebraic representation is a directed acyclic graph, DAG).
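The distinction between vertical and horizontal composition can be made concrete with a small sketch. The encoding below (a dictionary from node identifier to an operator name and its argument identifiers) and all operator names are assumptions made for illustration; they are not the notation of the paper.

```python
from collections import Counter

# Query Q3: intersection(road, difference(forest, polluted_area)).
Q3 = {
    "road": ("leaf", []),
    "forest": ("leaf", []),
    "polluted_area": ("leaf", []),
    "non_polluted_part": ("difference", ["forest", "polluted_area"]),
    "result": ("intersection", ["road", "non_polluted_part"]),
}

# Query Q4: the route from Paris to Nice is an argument of two operators.
Q4 = {
    "paris": ("leaf", []),
    "nice": ("leaf", []),
    "forest": ("leaf", []),
    "polluted_area": ("leaf", []),
    "route": ("path", ["paris", "nice"]),
    "crosses": ("intersection", ["route", "forest"]),
    "borders": ("border", ["route", "polluted_area"]),
    "result": ("and", ["crosses", "borders"]),
}

def parent_counts(query):
    """Count how many times each node is used as an argument."""
    counts = Counter()
    for _operator, args in query.values():
        counts.update(args)
    return counts

def is_tree(query):
    """A query graph is a tree when no node is used as an argument twice."""
    return all(count <= 1 for count in parent_counts(query).values())

print(is_tree(Q3), is_tree(Q4))
```

In this encoding Q3 is a tree (vertical composition), while in Q4 the shared "route" sub-query makes the representation a DAG (horizontal composition).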
In the following, we consider that some data types are available (e.g., Town_type), and some
queries are defined to illustrate the different configurations (e.g., query Q3).
The output of the QRE to the DBMS depends on the expressive power of the database query
language. The introduction of predicates or operators as primitives of a spatial database query
language leads to different expressive powers (and therefore some queries may or may not be
expressed).
A spatial predicate (e.g., spatial difference) applied to data provides a Boolean result (e.g., a
basic manipulation in a "Where" clause in SQL). As an example, query Q3 cannot be formulated
with a predicate-based language since it requires the results of the spatial difference operator to be
able to evaluate whether an intersection exists or not (i.e., using the result of an operator). The
application of a GIS to transportation data introduces several consequences for the query
language definition:
Operator-based language: The extension to transportation management requires the definition
of an operator-based language. The important point in the evaluation of a path between Paris and
Nice (e.g., query Q1) is the components (i.e., edges and nodes) of the path. Information such as
whether a path exists (i.e., evaluation with a predicate) is not relevant for a GIS-T application (i.e.,
the result of the path operator is needed instead of a Boolean result). To manage transportation
data, a GIS must handle other graph manipulation operators such as node/edge/graph manipulations.
Closedness of operators: The underlying notion of a labeled graph requires that a result of a
graph manipulation should be compatible with the notion of a labeled graph. This requirement is
an extension of spatial data manipulations. Geographic data is composed of at least one pair of
alphanumeric and spatial data. A GIS operator must provide as a result a datum with the same
structure (a pair of alphanumeric and spatial data). As an extension, a GIS datum for
transportation is a geographical datum with an associated graph. The result of an operator must be a
geographical datum (i.e., alphanumeric and spatial data) and an associated graph. In the context
of a network, the spatial part may be a symbolic representation (i.e., the real world coordinates
are not mandatory).
Expressive power: Once the query language provides the concept of operator, the next problem
is to define the data on which operators are applied. A traditional database query language (e.g.,
relational algebra) is based on first order logic without functions. To provide a realistic expressive
power to develop applications, query languages (e.g., SQL) introduce some basic functions/
operators (e.g., count). GIS-T must provide spatial and graph manipulation operators. In parallel
with GIS databases, two levels can be defined: with and without a combination of operators. To
modeling represents the input of the QRE. The result modeling represents the output. The added
value of the QRE is introduced in the output.
3. Result modeling
The evaluation of a path operator in a GIS-T database should provide a set of paths instead of
a single path as a result. The QRE is not responsible for the definition of the relevant number of
paths to be evaluated. This task belongs to the interface level or the database operator. For
example, in a GIS application for tourism management and information, a shortest path may be far
from desirable since visitors are more motivated by landmarks and other sightseeing attractions
than by efficiency considerations. Furthermore, the least expensive path may also be the longest. In
the context of a multi-criteria analysis, it does not seem realistic to define a query a priori. Human
interaction may be necessary to make a choice from a small set of propositions. To reduce the
volume of possible answers, a wide range of aggregate functions ought to be provided during the
query definition process (e.g., total travel time under 10 units).
Aggregate functions may involve some ambiguities while the end-user interprets the results of a
query. The query result modeling must provide a structure to avoid such ambiguities. To ma-
nipulate this structure, some operators must be defined.
If the result of a path operator in a GIS-T query is considered as a unique graph, modeling the
set of paths (i.e., a set-oriented approach) may lead to confusion when an aggregation function is
involved in the query. The result must therefore be a set of graphs.
A path is itself a graph, that is a set of nodes and a set of edges, along with some characteristic
data (γ function). Let us consider the result presented to the end-user in a single graph ($G_r$). This
graph is built as the union ($\cup$) of the sets of nodes (resp. edges) of each path. Let $n$ be the number
of paths $G_i(N_i, E_i, \nu, \varepsilon, \Psi, \gamma_i)$, where $i = 1, \ldots, n$. The result of the path operator is
$$G_r = \Bigl(\bigcup N_i,\ \bigcup E_i,\ \nu,\ \varepsilon,\ \Psi,\ \gamma_i\Bigr) \quad \forall i = 1, \ldots, n.$$
This formalism can be used when no aggregate function is involved in the query (e.g., query
Q1). It cannot be used anymore once an aggregate function is involved (e.g., query Q2). As an
example, the following paths are provided while query Q2 is evaluated (Fig. 4):
(1) Paris-Dijon-Lyon-Avignon-Marseille-Nice: Duration 10,
(2) Paris-Lyon-Avignon-Marseille-Nice: Duration 6,
(3) Paris-Lyon-Avignon-Toulon-Nice: Duration 8.
The following path:
(4) Paris-Dijon-Lyon-Avignon-Toulon-Nice
does not fulfill the requirements since its total duration is 12, which is greater than 10.
The generation of a graph built with the union of the different paths that fulfill the requirement
leads to two difficulties:
The answer is wrong. The union provides the same graph as the one presented in Fig. 5 and
therefore, one can infer that path (4) is relevant. The fact that path (4) is not admissible has been
lost by the application of the union operator.
The result for the aggregate function is lost. It must be re-computed as soon as a path is
required since the γ function cannot individualize each path. This disappearance is due to the
union operator.
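The first difficulty can be reproduced with a short sketch. The per-edge durations below are hypothetical values chosen only so that the four path totals equal those quoted in the text (10, 6, 8, and 12), and the threshold is taken as "at most 10" so that the three listed answers are retained; the function names are illustrative.

```python
# Hypothetical edge durations, consistent with the path totals given in the text.
DURATION = {
    ("Paris", "Dijon"): 3, ("Dijon", "Lyon"): 3, ("Paris", "Lyon"): 2,
    ("Lyon", "Avignon"): 2, ("Avignon", "Marseille"): 1, ("Marseille", "Nice"): 1,
    ("Avignon", "Toulon"): 2, ("Toulon", "Nice"): 2,
}

def simple_paths(edges, origin, destination, prefix=None):
    """Enumerate all cycle-free paths over unidirectional edges."""
    prefix = prefix or [origin]
    if origin == destination:
        yield list(prefix)
        return
    for a, b in sorted(edges):
        if a == origin and b not in prefix:
            yield from simple_paths(edges, b, destination, prefix + [b])

def total_duration(path):
    """Aggregate function: sum of the durations along the path."""
    return sum(DURATION[(a, b)] for a, b in zip(path, path[1:]))

all_paths = list(simple_paths(DURATION.keys(), "Paris", "Nice"))
answers = [p for p in all_paths if total_duration(p) <= 10]

# Single-graph result: union of the edges of every admissible path.
union_edges = {edge for p in answers for edge in zip(p, p[1:])}

# Reading paths back from the union graph reintroduces a rejected path.
paths_in_union = list(simple_paths(union_edges, "Paris", "Nice"))
spurious = [p for p in paths_in_union if p not in answers]
print(spurious, [total_duration(p) for p in spurious])
```

The union of the three admissible paths contains every edge of the network, so path (4), with a total duration of 12, can be read from it again. Keeping the answers as a set of separate paths, each with its own aggregate value, avoids both this error and the recomputation of the aggregate.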
A set of paths results from the path operator. Each path in this set is defined independently of the
others. The result of a path operator with multiple paths as an answer cannot be managed as a
shortest path would be. A path operator is applied between Paris and Nice in our example. Three
paths are provided as an answer. This problem can be generalized. The only restriction is to
provide a realistic computational time (i.e., several starting places for a single arriving place or the
opposite).
Query results depend on the expressive power of the language used to define a query. A
predicate-based query language leads to a result built as a set of database components (e.g., re-
lations for relational-based DBMS, classes for object oriented DBMS). An operator-based query
M. Mainguenaud / Transportation Research Part C 8 (2000) 109-127 121
The same sub-query may be used several times in the same query (i.e., the graph modeling a query
is not a tree). For example, Fig. 9 presents a horizontal composition with an operator. Fig. 12
presents a horizontal composition with a leaf, that is, a simplified graph associated with the
following query: "a road crosses the non-polluted part of a forest and borders "<>" this polluted
area" - i.e., an extension of query Q^. The functional representation is
The evaluation of such a query is similar to a conjunction of queries. Therefore, the evaluation
mechanism will end since only deletions are involved. The deletion of non-relevant data is a re-
cursive application of the evaluation since the query is modeled with a DAG instead of being
modeled with a tree. Some arguments may have a reduced set of relevant data when they share the
same operator: for instance, a forest may be deleted from the relevant data set if it has a polluted
part, but the polluted area is not bordered by a road that crosses this forest.
The interpretation is different for a path operator whose answer is multiple paths because the
application of the path operator on an instance number i (the origin) and on an instance number j
(the destination) does not provide a single result (i.e., an instance number k) but a set of instances
(a set of paths). Since aggregate functions are almost unavoidable in a path evaluation, the result
must be a set of graphs (and not a unique graph). This restriction leads to some troubles while
defining in the query language the signatures of operators manipulating the result of a path
operator since the result is a set of graphs.
In conventional GIS interpretation (Fig. 10), a given pair of entities (say, Melun and PA1)
provides a unique answer (resp., Al). When aggregate functions are involved in a query, the result
of a path operator can no longer be defined as the union of the different paths. Therefore, from a
given pair, several results are provided.
Two modes can be defined depending on the expressive power of the query language: the union
mode (a graph as a result) and the elementary-at-a-time mode (a set of graphs as a result). The
union mode is appropriate to predicate-based query language or to a query in which no aggregate
function is involved. It is also appropriate with a path operator providing a unique result by
definition (e.g., a shortest path operator). As discussed above, when an aggregate function is
involved in a query (i.e., in a path operator or on the result of a spatial operator), the Union mode
can no longer be used: the elementary-at-a-time mode becomes mandatory.
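The rule for choosing between the two modes can be summarized in a few lines. This is only a sketch of the rule as stated in the paragraph above; the names `Mode` and `select_mode` are illustrative and do not come from the paper.

```python
from enum import Enum


class Mode(Enum):
    UNION = "union"                       # a single graph as a result
    ELEMENTARY = "elementary-at-a-time"   # a set of graphs as a result


def select_mode(has_aggregate: bool, unique_result: bool) -> Mode:
    """Pick the interpretation mode of a query result.

    has_aggregate: an aggregate function appears in the query.
    unique_result: the path operator returns a single path by definition
                   (e.g., a shortest path operator).
    """
    if unique_result or not has_aggregate:
        return Mode.UNION
    return Mode.ELEMENTARY
```

For instance, a query such as Q2, with an aggregate function and several candidate paths, would fall under the elementary-at-a-time mode, whereas a shortest path query would stay in union mode.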
A mode is independent of a path operator. A query is defined as a graph with labeling function
γ. The γ function is in charge of defining the relevant mode since its determination can be made
before the evaluation or after the evaluation if the result is a unique path. Visual ambiguities may
arise from the path operator or from a different operator in the query. Query Q2 and the gen-
eralization of query Q3 (i.e., with the query Q1 that does not require an aggregate function) are
examples of such queries. Query Q2 involves an aggregate function in the path operator. The
being able to return to the previous solution (Previous), being able to retain a solution (Handled),
being able to select a retained solution (Select), and being able to cancel a "handled" solution
(Cancel). A manipulation type, Manipulation_type, is defined to model the relevant operations:
Manipulation_type = (Init, Next, Previous, Handled, Select, Cancel).
Whenever an ambiguous query is presented, several solutions have to be displayed. The basic
manipulation has the following signature:
DisplayAmbiguousResult : FromQREtoUI-type × Manipulation_type → Graph-type.
The graph obtained by the DisplayAmbiguousResult function can be processed as if the query
was in Union mode since no visual ambiguity may occur. The context is similar to a manipulation
with a GIS that provides a shortest path operator (i.e., a single path as a result).
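One possible reading of this signature is sketched below. It is an illustration under assumptions, not the paper's implementation: the class `AmbiguousResult` stands in for the FromQREtoUI structure, the result set is assumed to be non-empty, moves are clamped at both ends of the list, and Select is assumed to jump to the next retained solution (cyclically).

```python
from enum import Enum, auto


class Manipulation(Enum):
    INIT = auto()
    NEXT = auto()
    PREVIOUS = auto()
    HANDLED = auto()
    SELECT = auto()
    CANCEL = auto()


class AmbiguousResult:
    """Hypothetical stand-in for the FromQREtoUI structure: one graph per path."""

    def __init__(self, graphs):
        self.graphs = list(graphs)   # assumed non-empty
        self.cursor = 0              # index of the solution currently displayed
        self.handled = []            # indices of the retained solutions

    def display(self, op):
        """Apply one manipulation and return the single graph to display."""
        if op is Manipulation.INIT:
            self.cursor = 0
        elif op is Manipulation.NEXT:
            self.cursor = min(self.cursor + 1, len(self.graphs) - 1)
        elif op is Manipulation.PREVIOUS:
            self.cursor = max(self.cursor - 1, 0)
        elif op is Manipulation.HANDLED:
            if self.cursor not in self.handled:
                self.handled.append(self.cursor)
        elif op is Manipulation.CANCEL:
            if self.cursor in self.handled:
                self.handled.remove(self.cursor)
        elif op is Manipulation.SELECT and self.handled:
            # Assumption: jump to the next retained solution, wrapping around.
            later = [i for i in self.handled if i > self.cursor]
            self.cursor = min(later) if later else min(self.handled)
        return self.graphs[self.cursor]
```

Because every call returns exactly one graph, the caller never sees a union of paths, which is the property the text relies on to rule out visual ambiguity.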
4. Conclusion
The large diffusion of spatial database applications in scientific, planning, and business domains
leads to the emergence of new requirements in terms of data representation and derivation.
In particular, current spatial data models and query language operations have to be extended in
order to integrate a more complete semantic representation of complex domains. The integration
and representation of graph structures within spatial data models is of particular interest for many
application areas involved in the management of transportation networks.
Transportation data are modeled with the concept of a labeled graph. In this paper, the starting
point is an end-user query defined as a set of operators. This query is modeled with the notion of
graph. The result of a query is modeled with a graph, which is isomorphic to the query graph. This
choice allows a homogeneity between the modeling of the different components of a GIS database
manipulation: data, query, and results are modeled with the same concept. To avoid visual
ambiguities that may arise once an aggregate function is involved in a query, an interpretation
structure is provided to define the relevant instances of query results. This structure is also
modeled with a graph (isomorphic to a result graph). The approach is of great significance in
transportation applications because aggregate functions are nearly mandatory to reduce the
amount of data and to provide a realistic computation time.
The decomposition of a GIS logical architecture into three independent layers (a UI, a QRE
and a DBMS) allows for greater flexibility in the management of the expressive power. A formal
communication between these components (a graph structure between the interface and the query
resolution engine, an extended SQL-query language from the QRE to the DBMS) allows for
easily changing one of them. An end-user query is a composition of operators (spatial or not). The
aim of a QRE is to fill the gap between the level of abstraction of the UI and the level of
abstraction of the DBMS (without composition of operators, with composition that does not allow
the same operator with the same arguments to be used several times, and with a total
composition). The introduction of a path operator that provides several results requires changing the
interpretation of the final result once an aggregate function is involved in the query. Two modes
are defined: the union mode (a set-oriented approach) and the elementary-at-a-time mode (path by
path). New manipulation functions are associated to the elementary-at-a-time mode to handle the
risk of visual ambiguities.
The introduction of a GIS as a decision tool requires the opportunity to propose several
solutions. The added value of a human being is the decision. The UI must be able to present different
solutions without any visual ambiguity. Using the framework of the QRE, several concrete
interfaces may be provided depending on the end-user (obviously, this requires that the path operator
of the database provide several paths as a result instead of a single path).
References
Angellacio, M. , Catarci , T. , Santucci , G., 1990 . QBD*: a graphical query language with recursion. IEE E Transactio n
on Softwar e Engineering 1 6 (10), 1150-1163 .
Aufaure-Portier, M.A. , Trepied , C. , 1996 . A survey of query languages fo r geographic information systems. In: Thir d
International Worksho p o n Interface s to Databases , Edinbugh , UK , 8-1 0 July .
Barrera, Buchmann, 1991. Schema definition and quer y language for a geographical database system . Transactions o n
Computer Architecture : Patter n Analysi s and Imag e Databas e Managemen t 11 , 250-256.
Car, A. , Frank , A. , 1994 . Genera l principle s o f hierarchica l spatia l reasonin g - th e cas e o f wayfinding . In: Sixt h
International Symposiu m on Spatia l Dat a Handling , Edinburgh, Scotland, 5- 9 September .
Christophides, V., Cluet, S., Moerkotte, G. 1996 . Evaluating queries with generalized path expressions. In: Proceeding s
of ACM SIGMO D Conference , Montreal, Canada .
Claramunt, C. , Mainguenaud , M. , 1999 . A revisite d databas e projectio n operato r fo r networ k facilitie s in a GIS .
Informatica 23 , 187-201 .
Cruz, I.F. , Mendelzon , A.O., Wood , P.T. , 1987 . A graphical quer y language supporting recursion . In : Proceedings o f
ACM SIGMO D Conference , San Francisco, USA , 27-29 May .
Di Loreto , F. , Ferri , F. , Massari , F. , Rafanelli , M. , 1996 . A pictoria l quer y language for geographica l databases . In :
International Worksho p o n Advance d Visua l Interfaces, Gubbio, Italy , 27-2 9 May .
Egenhofer, M. , 1994 . Spatia l SQL : a quer y and presentatio n language . IEE E Transaction s o n Knowledg e and Dat a
Engineering 6 (1), 86-95.
Erwig, M. , Giiting , R.H. , 1994 . Explici t graph s i n a functiona l model fo r spatia l databases . IEE E Transactio n o n
Knowledge and Dat a Engineerin g 6 (5), 787-804.
Guting, R.H. , 1991 . Extendin g a spatia l databas e syste m b y graph s an d objec t clas s hierarchies . In : Internationa l
Workshop o n Databas e Managemen t Syste m for Geographica l Applications , Capri , Italy , May .
Haas, L.M. , Cody , W.F. , 1991 . Exploitin g extensibl e DBM S i n integrate d geographica l informatio n systems . In :
Second Symposiu m o n Larg e Spatia l Databases , Zurich , Switzerland , Augus t 1991 , Lecture Note s i n Compute r
Science 525 , Springer , Heidelberg , pp . 423-450 .
Kuiper, B. , 1978 . Modelling spatial knowledge. Cognitive Science 2, 129-153 .
Kirby, K.C. , Pazner , M., 1990 . Graphic map algebra. 4th International Symposium o n Spatial Data Handling, Zurich ,
Switzerland, 22-28 July.
Larue, T. , Pastre, D., Viemont, Y. , 1993 . Strong integratio n o f spatial domains an d operators in a relational databas e
system. In : Abel , D.J. , Ooi , B.C . (Eds) , Advance s i n Spatia l Databases , Lectur e Note s i n Compute r Scienc e 692 .
Springer, Singapore , pp . 53-71.
Mainguenaud, M. , 1994 . Consistency of geographical informatio n system query result. Computers, Environmen t an d
Urban System s 18 , 333-342.
Mainguenaud, M. , 1995 . Th e modelin g o f th e geographi c informatio n syste m networ k component . Internationa l
Journal of Geographical Informatio n System s 9 (6), 575-593 .
Medeiros, C.B. , Pires , F. , 1994 . Databases fo r GI S SIGMO D record 2 3 (1), 107-115 .
Meyer, B., 1992 . Beyond icons: towards new metaphors fo r visua l query languages for spatial information systems. In:
Cooper, R . (Ed.), Firs t Internationa l Worksho p o n Interfaces to Database Systems . Springer, Heidelberg , pp. 113 -
135.
M. Mainguenaud I Transportation Research Part C 8 (2000) 109-127 12 7
Peckham, J. , Maryanski , F., 1988 . Semantic dat a models . AC M Computin g Survey s 20 (3), 153-189 .
Seshadri, P. , Livny , M., Ramakrishnan , R. , 1997 . The cas e fo r enhance d abstrac t dat a types . In : 23r d Internationa l
Conference o n Ver y Larg e Databases , Athens , Greece .
Smith, J.M., Smith , D.C. , 1987 . Database abstraction: aggregation and generalization. AC M Transaction On Database
System 2 (2), 105-133 .
Stemple, D. , Sheard , T. , Bunker , R., 1986 . Abstract Dat a Types i n databases: specification , manipulation an d access .
In: Proceeding s o f th e Secon d Internationa l Conferenc e o n Dat a Engineering , Lo s Angeles , CA , USA , 6- 8
February.
Stonebraker, M., Moore, D., 1996 . Object-Relational DBMSs: The Next Great Wave . Morgan Kaufmann, San Mateo,
CA.
Timpf, S. , Volta, G.S. , Pollock , D.W. , Egenhofer , M. , 1992 . A conceptual model of wayfinding using multiple level s of
abstraction. In : Frank , A.U. , Campari , I. , Formentini , U . (Eds.) , Theorie s an d Method s o f Spatio-Tempora l
Reasoning i n Geographic Space . Springer , Berlin , pp. 349-367 .
Tsou, M.H., Buttenfield , B. , 1996. A direct manipulation interface for geographical information processing. In: Seventh
International Symposiu m on Spatia l Data Handling , Delft , The Netherlands, 12-1 6 August .
Woodruff, A. , Su, A., Stonebraker, M., Paxson, C., Chen, J., Aiken, A., Wisnovsky, P., Taylor, C., 1995. Zooming an d
tunneling i n Tioga: supportin g navigatio n i n multidimensional space . In : Spaccapietra , S. , Jain, R . (Eds.) , Visua l
Database Systems 3 : Visual Information Management . Chapma n & Hall, London , pp. 360-371 .
Zdonik, S.B. , Maier , D. , 1990 . Reading s i n Object-oriented Databas e Systems. Morga n Kaufmann , Sa n Mateo, CA.
Abstract
This paper presents an integrated transit-oriented travel demand modeling procedure within the
framework of geographic information systems (GIS). Focusing on transit network development, this paper
presents both the procedure and algorithm for automatically generating both link and line data for transit
demand modeling from the conventional street network data using spatial analysis and dynamic
segmentation. For this purpose, transit stop digitizing, topology and route system building, and the conversion of
route and stop data into link and line data sets are performed. Using spatial analysis, such as the
functionality to search arcs nearest from a given node, the nearest stops are identified along the associated links
of the transit line, while the topological relation between links and line data sets can also be computed using
dynamic segmentation. The advantage of this approach is that street map databases represented by a
centerline can be directly used along with the existing legacy urban transportation planning systems (UTPS)
type travel modeling packages and existing GIS without incurring the additional cost of purchasing a
full-blown transportation GIS package. A small test network is adopted to demonstrate the process and the
results. The authors anticipate that the procedure set forth in this paper will be useful to many cities and
regional transit agencies in their transit demand modeling process within the integrated GIS-based
computing environment. © 2000 Elsevier Science Ltd. All rights reserved.
Keywords: Transit network development; GIS; Digital map; Spatial analysis; Dynamic segmentation
1. Introduction
0968-090X/00/$ - see front matter © 2000 Elsevier Science Ltd. All rights reserved.
PII: S0968-090X(00)00013-9
130 K. Choi, W. Jang / Transportation Research Part C 8 (2000) 129-146
the middle of highway links, represented by arcs in GIS. One motivation of this paper is to
overcome these types of discrepancies and difficulties.
The purpose of this research is to derive more realistic transit network data to be used in
transit demand modeling from the street GIS map databases. The proposed approach makes use
of spatial analysis and dynamic segmentation. Using spatial analysis, the nearest (bus) stops are
identified as nodes, while the topological relation between nodes/links and a series of line data
sets is computed using dynamic segmentation. In this paper, both the procedure and the algorithm
to generate transit network data are presented along with a concrete implementation and a
discussion of the advantages of the algorithm.
As stated before, the basic data used in travel demand forecasting are origin-destination
matrices and network data. The origin-destination data represent the demand side of quantity. The
network data describe the supply side that accommodates the demand. Among the criticisms
identified by Choi (1993) and Choi and Kim (1994) in making those data sets for transportation
planning and demand forecasting, labor-intensiveness is considered as one of the most serious
problems, especially in preparing transit network data.
During the preparation of the transit network and the related data sets needed, transportation
planners have to prepare maps to describe study areas and actual transportation networks. As
O'Neil (1991) states, network generation requires extensive data collection and integration efforts.
Furthermore, the generated networks are frequently modified to reflect changes, such as (1)
changes in study area boundaries, (2) changes in zone delineation due to land use change, (3)
modified networks (link shape and node location change) for testing alternative network
scenarios, and (4) link attribute changes like speed limit or capacity. In such cases, data requirements
are extensive. Two typical problems in transit network preparation are described first, along with
some possible problem solving activities.
Fig. 1. Highway network vs. transit network. An arrow indicates the link direction.
Besides node and link data, a complete transit network requires additional input for transit
demand modeling. This is called line data. It is composed of an ordered set of stops on top of link
data. The most prevalent problem encountered during the network preparation is the lack of
consistency between link and line data. Fig. 2 illustrates such a situation where link 102-111 exists
in the line data set but is absent from the link data set.
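The kind of inconsistency described here can be detected automatically. The sketch below is illustrative only: the stop numbers other than 102 and 111 and the line name are hypothetical (Fig. 2 is not reproduced here), and links are assumed to be directed pairs of stops.

```python
def missing_links(links, lines):
    """Return, per line, the consecutive stop pairs that have no link record.

    links: iterable of directed (from_stop, to_stop) pairs.
    lines: mapping of line name to its ordered list of stops.
    """
    link_set = set(links)
    missing = {}
    for name, stops in lines.items():
        gaps = [pair for pair in zip(stops, stops[1:]) if pair not in link_set]
        if gaps:
            missing[name] = gaps
    return missing


# Hypothetical data reproducing the situation described in the text:
# the line uses link 102-111, which is absent from the link data set.
links = [(101, 102), (111, 112)]
lines = {"A": [101, 102, 111, 112]}
report = missing_links(links, lines)
```

With these hypothetical inputs, `report` lists the pair (102, 111) for line "A". Deriving links and lines from the same route, as proposed later in the paper, avoids this class of error by construction.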
Choi and Jang (1997) proposed node-based transit network development procedures in which
all stops are assumed to be located at physical highway intersections and are represented as
nodes. But in actual transit networks, stops are normally represented as points along arcs rather
than nodes. The node-based transit network development procedure is rather straightforward to
implement. The data preparation relationship in node-based transit network development
procedures is shown in Fig. 4. See Choi and Jang (1997) for more details. In Fig. 4, we use a
terminology consistent with the use of ESRI's ARC/INFO GIS application. The RAT, AAT,
NAT, SEC, and AML acronyms stand for route attribute table, arc attribute table, node
attribute table, section table, and arc macro language, respectively (see Environmental Systems
Research Institute, 1993a,b, and 1995 for more detail). On the other hand, the point-based
procedure requires more complex steps to derive link and line data. The process will be discussed
in more detail in the rest of the paper. Table 1 summarizes the major differences between the two
procedures.
Table 1
Node-based vs. point-based transit network development
Aspects                   Node-based                      Point-based
Stop is represented by    Node                            Stop
Line data are             Series of nodes                 Ordered set of stops
Link length               Arc length (embedded in AAT)    Needs to be calculated
Application               Not practicable                 Practicable
3. Data preparation
Before proceeding with the description of the main algorithm for generating link and line data,
processes involved in stop positioning, stop-arc relationship building, and link distance
calculation are introduced in this section.
When the stop is positioned during stop data preparation, care must be taken to maintain the
relationship between arcs and stops. As is the case in Fig. 5, the stop should be placed
closest to the link to which it is eventually related. For example, in Fig. 5, stop 2 must be
positioned closer to link A than to link B to represent that stop 2 is located along link A. This
kind of relationship is needed to build the route system and to generate the transit network
during the steps described later.
The stop's direction is categorized as one-way or two-way. Two-way means that the stop can be
used for both directions of traffic, as is the case with a normal subway station. Stop 1 in Fig. 5 is
the typical example of a two-way stop. In this situation, the stop is positioned on the centerline.
But a one-way stop should be positioned at an offset on one side or the other (stop 2). One more
important attribute of the stop data is the mode availability at that stop. Every available mode
should be listed under the stop's attribute data.
The relationship between stops and arcs does not exist at first (after the stop data preparation).
However, the relationship can be constructed easily by GIS functions, and Fig. 6 shows this
procedure. In Fig. 6, stops represented as points have no relational information about arcs
nearby. However, by computing the shortest distance between a node and an arc, we can identify
the relationship, that is, which stops are lying on which arc.
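Outside a GIS package, the same stop-arc relationship can be approximated with a point-to-polyline distance. The following is a minimal sketch under assumptions: arcs are given as lists of planar (x, y) vertices, the arc identifiers and coordinates are hypothetical, and this is not the ARC/INFO function used by the authors.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the straight segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: both end points coincide.
        return math.hypot(px - ax, py - ay)
    # Parameter of the perpendicular foot, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def nearest_arc(stop, arcs):
    """Return the id of the arc (polyline) closest to the stop."""
    def distance_to(arc_id):
        vertices = arcs[arc_id]
        return min(point_segment_distance(stop, a, b)
                   for a, b in zip(vertices, vertices[1:]))
    return min(arcs, key=distance_to)


# Hypothetical layout: link A runs east, link B runs north from its end.
arcs = {"A": [(0.0, 0.0), (10.0, 0.0)], "B": [(10.0, 0.0), (10.0, 10.0)]}
stop_on_a = nearest_arc((6.0, 1.0), arcs)
stop_near_corner = nearest_arc((9.5, 2.0), arcs)
```

The second stop shows why the positioning rule of Fig. 5 matters: a stop digitized too close to an intersection is attached to the other link.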
Link data information in a transit network consists of a pair of stops, link distance, and other
related attribute data of the link. In a node-based transit network development procedure, the
distances between stops are embedded in the topological information of the arc-node relational
database. But in a point-based procedure, there is no direct link distance (stop-to-stop distance
along the route) information available. Therefore, a separate procedure/algorithm must be
developed to calculate transit link distances.
As shown in Fig. 7, if either stop is located in the middle of an arc, the algorithm has to
calculate the partial length of the arc where the stop is located. Four steps are required to achieve
this. First, find the perpendicular points from the stops to the nearest arc segments, using the
nearest-arc-finding capability of the GIS (spatial analysis) from the node representing each stop.
Second, calculate the distance from the beginning stop of the link to the first node encountered
along the route (using the dynamic segmentation). Third, calculate the distance from the ending stop
of the link to the first node encountered back through the route using the arc-node and section
tables, as in the second step. Fourth, sum the results from the second and third steps and all
intervening arc lengths, as illustrated in Fig. 8. If the beginning stop is located exactly at the node
position, the second step is not needed. The same is true of the third step for the ending stop.
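Steps two to four reduce to simple arithmetic once each stop has been projected onto its arc. The sketch below assumes a simplified representation that is not in the paper: the route is a list of arc lengths in travel order, and each stop is given as (arc index, fraction of the arc already travelled at the stop). The case of two stops on the same arc is an added assumption.

```python
def link_distance(arc_lengths, start, end):
    """Stop-to-stop distance along a route.

    arc_lengths: lengths of the arcs in route order.
    start, end: (arc index, fraction in 0..1 along that arc, measured in the
                direction of travel) for the beginning and ending stops.
    """
    (i, f), (j, g) = start, end
    if (i, f) > (j, g):
        raise ValueError("the ending stop must not precede the beginning stop")
    if i == j:
        # Added assumption: both stops lie on the same arc.
        return (g - f) * arc_lengths[i]
    head = (1.0 - f) * arc_lengths[i]      # step 2: beginning stop to next node
    tail = g * arc_lengths[j]              # step 3: last node back to ending stop
    between = sum(arc_lengths[i + 1:j])    # intervening full arcs
    return head + between + tail           # step 4: sum of the three parts


# Hypothetical route of three arcs; stops at 25% of arc 0 and 50% of arc 2.
example = link_distance([100.0, 80.0, 120.0], (0, 0.25), (2, 0.5))
```

When a stop sits exactly on a node, its fraction is 0 or 1 and the corresponding partial term contributes either nothing or a full arc length, which matches the remark that the second or third step can then be skipped.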
4. The algorithm
Fig. 9 shows the schematic flow of the procedure, whereas Fig. 10 depicts the flow
chart of the proposed algorithm. The upper part of Fig. 9 depicts the stop
positioning on top of the existing GIS layer, the building of a route system with stops positioned
using dynamic segmentation, and the nearest arc finding process using GIS spatial analysis. The
lower part of Fig. 9 describes the link and line data derivation using the data set prepared in the
upper part.
The algorithm converts route data (generated by the GIS dynamic segmentation model) to
point-based links and lines using the relationship between stops and arcs described earlier. The
algorithm converts one route at a time and examines all sections in that route. At the section
examining stage, it searches for all stops that have the same arc number. If any stops are
found, they are checked to see if the route passes through them. Once the stop is found to be
on the route, it is added to the temporary stop buffer with its cumulative length from the
beginning point of the route. After examining all sections, stops stored in the temporary buffer
are sorted by cumulative length, and link and line data are extracted. After a route system is
processed in this fashion, the algorithm goes to the next route and so on until all the routes are
processed.
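The conversion of a single route can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the `Section` and `Stop` records, the percent-based positions, and the rule that a section traversed against the arc's digitized direction swaps the right and left sides are all hypothetical simplifications.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Section:
    arc: int            # arc number the section lies on
    from_pos: float     # percent along the arc where the section starts
    to_pos: float       # percent where it ends (smaller if traversed backwards)
    arc_length: float   # full length of the arc


@dataclass(frozen=True)
class Stop:
    stop_id: int
    arc: int            # nearest arc number
    pos: float          # percent along the arc of the stop's perpendicular foot
    side: str           # "right" or "left" of the digitized arc, or "both"


def route_to_line_and_links(sections, stops):
    """Convert one route into an ordered stop list (line) and its links."""
    buffer = []         # temporary stop buffer: (cumulative length, stop id)
    travelled = 0.0     # route length covered by the sections examined so far
    for sec in sections:
        forward = sec.to_pos >= sec.from_pos
        lo, hi = sorted((sec.from_pos, sec.to_pos))
        right_side = "right" if forward else "left"
        for stop in stops:
            if stop.arc != sec.arc:
                continue                          # stop belongs to another arc
            if stop.side not in ("both", right_side):
                continue                          # wrong side for this direction
            if not lo <= stop.pos <= hi:
                continue                          # outside the section
            offset = abs(stop.pos - sec.from_pos) * sec.arc_length / 100.0
            buffer.append((travelled + offset, stop.stop_id))
        travelled += (hi - lo) * sec.arc_length / 100.0
    buffer.sort()                                 # order stops along the route
    line = [stop_id for _, stop_id in buffer]
    links = [(a_id, b_id, b_len - a_len)
             for (a_len, a_id), (b_len, b_id) in zip(buffer, buffer[1:])]
    return line, links


# Hypothetical route: half of arc 1, all of arc 2, then arc 3 traversed backwards.
sections = [Section(1, 50.0, 100.0, 200.0), Section(2, 0.0, 100.0, 100.0),
            Section(3, 100.0, 40.0, 150.0)]
stops = [Stop(4, 3, 70.0, "left"), Stop(1, 1, 50.0, "both"),
         Stop(9, 1, 80.0, "left"), Stop(3, 2, 40.0, "right"),
         Stop(2, 1, 75.0, "right"), Stop(8, 3, 20.0, "left")]
line, links = route_to_line_and_links(sections, stops)
```

Because the line and its links are both read from the same sorted buffer, every consecutive pair of stops in the line has a matching link, which is the consistency property the paper emphasizes.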
To describe the algorithm algebraically, the following notation is employed.
d(k, i): length of section i of route k
D(k, j): total distance from the beginning of route k to the stop j along the route k
d(k, i, j): distance from the beginning of section i of route k to the stop j along the route k
Q_j = j'th element in set Q
|Q| = number of elements in set Q
The three most fundamental steps of the algorithm are described hereunder.
step (0)
  Read R, S, A from files
  L = ∅, L_k = ∅
step (1)
  for each T^k, k = {1, ..., |R|}
    for each T^k_i, i = {1, ..., |T^k|}
      S' = {j : j's nearest arc = A_i}
      B = ∅, D = 0
      for each j in S'
        if (j is within section i and (j is on the right of route or j is a bi-directional stop))
        then
To demonstrate the working of the algorithm described above, a small test network has been
arranged. Fig. 11 shows the test network and Fig. 12 depicts the associated stop and section
tables. The network is composed of 7 stops and 1 route based upon a street network having 23
arcs and 16 nodes. The dotted line in Fig. 11 shows the transit route under consideration, covering
5 full arcs and 2 partial arcs.
Fig. 12. Section and stop tables: "SECNO" means section number and "ARC#" means arc number. "FROMPOS"
specifies the from-position where the section starts on the arc (%), whereas "TOPOS" denotes the to-position where the
section ends on the arc (%).
First of all, each stop's nearest arc should be identified as shown in Fig. 13. The far right column
in the table shows the result. After identifying the nearest arcs (ARC#), the following data sets are
converted to text files: highway network topology, arc coordinates, route and section tables, stop
table including nearest arc number, and related attribute data.
The converted text files prepared in the data preparation are input data to the algorithm. These
data sets are loaded into memory as arrays. For each element in a route array, each section in that
route is examined in order. First, the arc number of the section is identified via the section-arc
relationship, which is ascertained with the help of dynamic segmentation. Then, all stops having
the same arc number are searched. If any stops are found, each one is examined to see whether the
route passes through it, following the two sub-steps:
1. Check whether the stop is to the right of an arc along the route or is a two-way stop; if not,
process the next found stop.
2. Check if the stop is within the boundaries of a section; if not, process the next stop.
Once a stop passes the above examination process, the next step is to calculate its total length
from the beginning of the route, so that the stops can be arranged in sequential order, and to save
the associated data (stop number, total length) to the temporary stop buffer. If any sections remain,
continue to examine the next section.
Fig. 14. Identified stops with lengths (from the start of line) via section examining.
The final step is to fetch the link and line data sets from the buffer. Again, a link is a pair of
stops and a line is an ordered set of stops. Before writing down the link and line data, stops saved
in the buffer must be sorted by length (in ascending order) to secure the order of stops from the
start of the line. The line data for line X, in this example in Fig. 14, is composed of stops 1, 2, 3, 6,
5, and 7 in this order, and the associated link lengths can also be generated.
While processing a route, the route is transformed into a series of stops, which are stored in
the temporary buffer. A single line and its associated multiple links are created at the same
time, maintaining the consistency in the relationship between link and line data pointed out
earlier. The algorithm should process all the lines in this manner before termination, as shown
in Fig. 15.
We applied the procedure discussed above to the actual transit network in the city of Suwon,
Korea (Fig. 16). Although the city has more than 70 bus lines, only four bus lines, each having
20-30 stops in both directions, are considered, for the sake of simplicity.
As stated in the introduction, the goal is to extract the link and line data for each bus line and
feed it into transit demand modeling packages, such as TRANPLAN and EMME/2, without
incurring the additional costs of purchasing new GIS packages for transit demand modeling. The
extracted data are based on a network composed of 55 links and 4 lines. Although the results of
the transit demand modeling have been omitted in this paper, the extracted link and line data
provided seamless and error-free transit network (supply) data. The extracted link and line data
are shown in Table 2.
Table 2
Results of link and line data for real bus network
Number of links: 55
Link no.    Stop from    Stop to    Length
1           1            2           95.423
2           2            3           85.280
3           3            4           79.837
4           4            5           78.021
5           5            6          121.233
6           6            7          119.670
7           7            8          102.990
8           8            59          84.632
9           8            71         145.028
10          18           6          146.430
Number of lines: 4
Route 1: 27, 25, 24, 22, 20, 18, 6, 7, 8, 59, 56, 95, 91, 97, 99, 102, 104, 106, 108
Route 2: 82, 80, 78, 70, 68, 66, 64, 44, 41, 40, 38, 36, 33, 32, 30, 62
Route 3: 1, 2, 3, 4, 5, 6, 7, 8, 71, 73, 75, 77
Route 4: 84, 86, 88, 90, 92, 94, 59, 55, 53, 51, 49, 60, 46, 62
Table 3
Comparison of two methods
Method                 Consistency    Accuracy    Labor-intensiveness
Conventional method    Low            Low         High
Using GIS database     High           High        Low
Compared with the conventional method of using an analog map with a text editor to prepare the
input transit network data, the GIS-oriented approach seems to be superior. The procedure set
forth in this paper should be helpful to transit agencies and city governments facing transit
problems. For example, the procedure can be applied to the transit system of the city of Seoul,
which is composed of 90 transit companies, more than 430 lines, and more than 9000 buses. In
addition, the algorithm can be expected to be applicable to the construction of mobility databases,
such as activity travel pattern and point-to-point trip pattern for each purpose and mode, to the
issue of geocoding of travel data, and to the preparation of inputs of the micro simulation modeling
process, such as TRANSIMS or an equivalent that is currently under construction.
Acknowledgements
References
Affum, J.K. , Taylor , M.A.P. , 1997 . SELATM- A GI S base d progra m fo r evaluatin g the safet y benefit s o f loca l are a
traffic managemen t schemes . Transportatio n Plannin g an d Technolog y 21 , 93-120.
Ahuja, R.K. , Magnanti , T.L. , Orlin , J.B. , 1993 . Network Flows . Prentice-Hall , Ne w York.
Choi, K. , 1993 . Th e implementatio n o f a n integrate d transportatio n plannin g wit h GI S an d exper t system s fo r
interactive transportation planning . Unpublishe d Ph.D . Dissertation , Universit y of Illinois at Urbana-Champaign ,
Urbana, IL .
Choi, K. , Jang , W. , 1997 . Transit networ k developmen t usin g arc-node topologica l database . In : Proceeding s o f th e
1997 Geographic Informatio n System s for Transportatio n Symposium , pp . 357-364 .
Choi, K. , Kim, T.J., 1994 . Transportation planning with GIS: issue s and prospects. Journal o f Planning Education an d
Research 1 3 (3), 199-207 .
Environmental System s Research Institute , 1993a . User' s Guid e o f Dynamic Segmentation , Redland , CA .
Environmental System s Researc h Institute , 1993b . User' s Guid e o f Network Modeling , Redland , CA .
Environmental System s Research Institute , 1995 . Workshop Proceeding s o f the 199 5 ESRI Use r Conference , Redland,
CA.
Golledge, R.G. , 1998 . The relationship between GIS and disaggregate behavior travel modeling. Geographical System s
3 (5), 9-18 .
Goodchild, M.F. , 1998 . Geographi c informatio n system s an d disaggregat e transportatio n modeling . Geographica l
Systems 3 (5), 19-44 .
Greaves, S. , Stopher , P. , 1998 . A synthesi s of GI S an d activity-base d travel-forecasting . Geographica l System s 3 (5),
59-90.
Horowitz, A.J., 1997 . Integrating GIS concepts int o transportatio n networ k data structures . Transportatio n Plannin g
and Technolog y 21 , 139-153 .
Johnston, R.A. , Barra , T., 2000 . Comprehensiv e regiona l modelin g for long-rang e planning : linking integrated urba n
models and geographi c informatio n systems . Transportatio n Researc h 3 4 (2), 125-136 .
McCormack, E. , Nyerges , T. , 1997 . Wha t transportatio n modelin g need s fro m a GIS : a conceptua l framework .
Transportation Plannin g and Technolog y 21, 5-23 .
McNally, M.G. , 1998 . Activity-base d forecastin g model s integrating GIS. Geographica l System s 3 (5), 163-188 .
146 K . Choi, W . Jang I Transportation Research Part C 8 (2000) 129-146
Miller, H.J. , 1998 . Emergin g theme s an d researc h frontier s i n GI S an d activity-base d trave l deman d forecasting .
Geographical System s 3 (5), 189-198 .
O'Neill, W.A. , 1991 . Developing optima l transportatio n analysi s zones usin g GIS. IT E Journa l 6 1 (11), 33-36.
O'Neill, W.A. , Ramsey , R.D. , Chou , J. , 1992 . Analysis of transit servic e areas usin g geographic informatio n systems .
Transportation Researc h Recor d 1364 , 131-138 .
Racca, D. , 1998 . Usin g GI S t o identif y market s fo r transi t i n delaware . In : Proceeding s o f th e 199 8 Geographi c
Information System s for Transportatio n Symposium . Sal t Lak e City , UT , pp . 270-291 .
Spear, B.D. , Lakshmanan , T.R. , 1998 . The role of GIS in transportation plannin g and analysis . Geographical System s
3 (5), 45-58.
Sutton, J. , 1997 . Dat a attributio n an d networ k representation : issue s i n GI S an d transportation . Transportatio n
Planning and Technolog y 21 , 25-44.
Transportation Researc h Boar d (TRB) , 1990 . Transportation Researc h Recor d 1261 : GIS 1990 , Washington, DC .
You, J. , Nedovic-Budic , Z. , Kim , T.J. , 1997a . A GIS-base d traffi c analysi s zon e design : technique . Transportatio n
Planning and Technolog y 21 , 45-68.
You, J., Nedovic-Budic, Z. , Kim, T.J., 1997b . A GIS-based traffi c analysi s zone design: implementation an d evaluation .
Transportation Plannin g and Technolog y 21 , 69-91.
You, J., Kim, T.J., 1998 . An integrated land use-transportation modeling system with GIS. In : Proceedings of the 199 8
Geographic Informatio n System s for Transportatio n Symposium . Sal t Lak e City , UT , pp . 448-461.
Wakefield, S. , 1997 . BusMAP bring s B C transit u p t o speed . Geolnf o Systems . May , pp. 29-32 .
Abstract
1. Introduction
The demand for goods has grown steadily over the past half century so that today an essential
ingredient of a thriving national economy is a cost-effective freight transportation system. This
involves the use of multimodal, including intermodal, transportation options. Intermodal
movements are those in which two or more different transportation modes are linked end-to-end
in order to move freight and/or people from point of origin to point of destination. Within the
* The submitted manuscript has been authored by a contractor of the US Government under contract No. DE-AC05-96OR2464.
Accordingly, the US Government retains a nonexclusive, royalty-free license to publish or reproduce the
published form of this contribution, or allow others to do so, for US Government purposes.
* Corresponding author.
0968-090X/00/$ - see front matter © 2000 Elsevier Science Ltd. All rights reserved.
PII: S0968-090X(00)00004-8
F. Southworth, B.E. Peterson / Transportation Research Part C 8 (2000) 147-166
expanded to reflect the shipment volumes routed over them, are then used to estimate the national,
statewide and regional ton-miles and dollar-miles of freight activity associated with these
calendar year 1997 shipments, by modes and commodity types. These mileage statistics support a
range of policy analysis, such as an assessment of the ton-miles of truck freight activity passing
into, within, out of, and through each of the 50 states (BTS, 1997a,b).
A significant value of the CFS is its treatment of "door-to-door" freight movements in which
the true origin of a shipment is tied to the true destination. This is important because most sources
of freight movement data collected by the federal government deal with terminal to terminal
movements within specific modes. Hence neither the true geography of movements nor the
intermodal nature of many shipments is captured. The CFS fills this data gap, while the use of
network-based models to route CFS shipments allows a better understanding of how the different
modal infrastructures are being used in getting freight from source to customer, an area
identified as needing attention in the recent freight literature (GAO, 1996).
The CFS uses zip code locations to capture both shipment origin and destination locations,
while city and country of destination as well as US port of exit are also reported for export
shipments. This meant developing a digital network capable of routing freight between some
forty-eight thousand different traffic generators and attractors. Given the voluminous number and
variety of possible shipments generated by the survey, a number of reasonably generic computer
programs had to be developed to accommodate an automated set of shipment distance
calculations. These methods are described below under the following headings:
• intermodal network construction: placing components of intermodal freight shipments within a
network structure, by merging different modal networks into a single, intermodal network;
• intermodal network access: ways to connect shipment origins and destinations to networks, and
the need to evaluate multiple origin as well as multiple destination network access and egress
options in order to select most likely shipment routes;
• modeling intermodal terminal transfers and inter-carrier interlining practices within a network
structure: including the modeling of trans-oceanic and trans-border export shipments; and
• the use of generalized impedance functions for modeling the trans-global as well as
trans-continental routing of domestic and export shipments.
These topics are discussed in turn in Sections 2-5 of the paper. Much of this discussion has an
applicability to national and international network analysis that goes well beyond the CFS
problem per se. Indeed, considerable value resides in a transportation network data model that
can now act as a starting point for a wide range of freight movement studies. Some additional uses
for such digital networks are summarized in Section 6 of the paper.
intermodal paths. The network is "logical" in the sense that a computer program can find a chain of
links between all possible origins and destinations. All of these network links represent some
reality, whether physical trafficways, or processes that the shipment passes through in sequence,
and they all have a geographic location. The resulting "links" in the CFS composite network
therefore range from sections of real highway pavement to broad ocean sea lanes, to transfer
processes involving cranes, drayage, storage, and repackaging at locations within a large seaport
area. Two separate digital intermodal networks were constructed for traffic routing in the 1997
CFS: a truck-rail-waterways (TRW) network, and a truck-air (TA) network. Only the
construction and application of the former network is the subject of this present paper. It was built by
combining, and where necessary modifying, early 1997 calendar year versions of the following
digital databases (see Southworth, 1997 for attribute details; also Southworth et al., 1998 for data
sources):
• the Oak Ridge National Laboratory (ORNL) National Highway Network and its extensions
into the main highways of Canada and Mexico;
• the Federal Railroad Administration's (FRA) National Rail Network and its extension into the
main rail lines of Canada and Mexico;
• the US Army Corps of Engineers National Waterways Network;
• the ORNL constructed Trans-Oceanic Network;
• the ORNL constructed National Intermodal Terminals Database; and
• a database of 5-Digit Zip-Code area locations.
For shipment routing purposes each of these databases may be thought of as a brick in the CFS
multimodal network building exercise, while the modeling and data handling techniques described
in this paper provide the mortar that was used to integrate them into a coherent network data
structure. The intermodal terminals database similarly represents a major database development
effort in its own right. Middendorf (1998) describes this database and its construction, including a
list of the many different data sources that were used to build it.
Table 1 lists the number of separate link records, and implied network mileages contained in a
final version of this merged, multimodal CFS network. Three points are worth noting about this
table. First, all mileages are what are commonly referred to as "center-line" mileages. Highway
miles for example are not lane-miles. Second, differences exist in both the level of geographic
detail and also in the geographic scale of the various modal networks listed in the table. Hence
waterway links were significantly longer than other links, while highways were significantly
shorter, on average, than the links for other modes. While the highway and inland waterway
networks were constructed at a scale of 1:100,000, the only rail network containing data on
trackage rights was built at a scale of 1:2,000,000. The ORNL deep sea component of the
waterways network was much cruder: scaled at approximately 1:6,000,000.

Table 1
1997 CFS North-American and trans-oceanic network statistics
Link type                     # of links   Network mileage
Highway                       84,537       489,679
Rail                          22,126       236,931
Water (US):
  Inland & Inter-coastal      4,269        44,116
  Great Lakes                 912          13,156
Water (deep sea)              7,278        2,543,672
Intermodal transfers          14,176
Access/egress                 Generated as needed
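The remark that waterway links are much longer, and highway links much shorter, than the links of other modes can be checked directly from Table 1. The short Python sketch below divides each network mileage by its link count; the dictionary name and function name are illustrative choices, not part of the CFS software.

```python
# Link counts and center-line mileages copied from Table 1.
TABLE_1 = {
    "Highway": (84_537, 489_679),
    "Rail": (22_126, 236_931),
    "Water, inland & inter-coastal": (4_269, 44_116),
    "Water, Great Lakes": (912, 13_156),
    "Water, deep sea": (7_278, 2_543_672),
}


def mean_link_miles(table):
    """Return the average number of miles per link for each link type."""
    return {name: miles / links for name, (links, miles) in table.items()}


if __name__ == "__main__":
    for name, avg in mean_link_miles(TABLE_1).items():
        print(f"{name}: {avg:.1f} miles per link")
```

On these figures the highway links average under 6 miles each, while the deep sea links average several hundred miles, which is consistent with the scale differences described in the text.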
For traffic routing purposes these differences in scale are not important. What is important is
the ability to connect networks together at appropriate (terminal) transfer locations. Once the
usefulness of geographic detail has been established for a particular multimodal network it may be
computationally efficient to simplify one or more of the constituent networks. For this purpose
and prior to computing the number of links shown in Table 1 a certain amount of highway end-on
link chaining across county and other administrative borders was carried out. Note also that all
network access and egress links, which are used to put freight on and off the network, were built
"on the fly" by a set of CFS routing algorithms, as needed for specific shipments. This procedure
is discussed in Section 3.
To support traffic routing each of the major mode-specific networks needed some modification
and enhancement prior to being merged into the multimodal CFS network. Besides the addition
of a few specific links, notably rail spur lines and terminal access connectors not already in these
databases at the time, the following structural modifications were needed:
Highways. The ORNL National Highway Network was built specifically for, and has been used
continuously in, traffic routing studies for over a decade (Southworth et al., 1986; Chin et al.,
1989). However, the CFS distinguishes between private and for-hire trucking sub-modes and
reports separate statistics for each. Therefore where both private and for-hire trucking were listed
in a shipment's mode sequence, efforts were made to identify likely truck (typically, cargo
consolidation) terminals as intermediate stops on a route. These within-the-highway-mode terminals
were simulated as additional network links in a manner similar to intermodal transfer terminals
(see Section 4).
Waterways. For traffic routing purposes the National Waterway Network was divided into
three different but connected sub-networks, following US Army Corps of Engineers definitions.
These are the inland and inter-coastal (largely barge traffic) sub-network, the Great Lakes
sub-network, and a trans-oceanic or "deep sea" sub-network. These distinctions are important for
waterborne commerce routing because each of these sub-networks uses different vessel types to
handle freight, since large, more robust vessels needed to cross large bodies of open water are
uneconomical as a method of inland, riverine transport. Hence it was important to capture both
the locations and relative costs involved in transferring goods from one vessel type to another.
This was done by adding inter-vessel transfer links to the logical CFS network, at locations where
such cargo unloading and loading takes place.
Second, to assist in imputing the within-United-States mileages of export shipments where
the US port of exit was unreported in the CFS, the ORNL trans-global deep sea sub-network was
merged with this US waterways network. This deep sea sub-network takes the form of a
latticework of open water links supplemented by much longer, more direct links between selected high
volume seaport corridors (Southworth et al., 1998). It was linked by manual GIS-based editing to
the National Waterways Network principally by adding connector links outside US seaports.
Railroads. Two important aspects of rail traffic routing involve railroad company specific
"trackage rights" and between company "interlining" practices. To accommodate these the 1997
version of the FRA Rail Network was also subjected to modification prior to its inclusion within
the CFS network. First, the representation of the railroad system as a connected set of individual
companies' sub-networks was needed for traffic routing purposes. This required that each railroad
link have a list of the companies that can operate over it, railroads that are said to have trackage
rights. The FRA network included all trackage rights over a decade-long period, but we needed
lists for 1997 only. This was accomplished by adding transition dates to ownership and trackage
rights attribute information on each link. These dates were calculated from a model of corporate
ancestry and the results were used to populate the geographic database.
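As a rough illustration of how transition dates let a decade-long list of trackage rights be reduced to a single year, the Python sketch below filters a link's rights records by year. The record layout, field names and company names are hypothetical; the paper does not describe the actual attribute schema.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TrackageRight:
    """One company's right to operate over a link (hypothetical layout)."""
    company: str
    start_year: int            # first year in which the right applies
    end_year: Optional[int]    # last year in which it applies; None = open-ended


def operators_in_year(rights: List[TrackageRight], year: int) -> List[str]:
    """Return the companies entitled to operate over the link during `year`."""
    return sorted(
        r.company
        for r in rights
        if r.start_year <= year and (r.end_year is None or year <= r.end_year)
    )


# Made-up rights history for a single rail link.
LINK_RIGHTS = [
    TrackageRight("Railroad A", 1988, 1995),   # right ended before 1997
    TrackageRight("Railroad B", 1996, None),   # successor company
    TrackageRight("Railroad C", 1990, None),
]

if __name__ == "__main__":
    print(operators_in_year(LINK_RIGHTS, 1997))
```

Storing start and end dates once, rather than a separate list per year, means the same link records can serve any survey year within the covered period.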
Second, we needed to know where traffic was being exchanged between railroad companies.
These locations are known as interlines. While the FRA network contained some data on these
they were assigned for the most part only in an approximate fashion to the nearest metropolitan
area. To obtain the desired level of geographic specificity it was necessary to assign these interlines
to specific network locations, by defining them as a set of inter-railroad connector links joining the
pair of railroads involved. This link-by-link attribute editing of the rail network was carried out
within a commercial GIS.
Finally, both the CFS rail and highway networks were extended to include major Canadian and
Mexican rail lines and highways, each tied to the US domestic transportation networks by the
addition of transfer links at border crossings. The cost of delays at customs stations can be
attached to these transfer links to simulate the relative costs of alternative routes when considering
truck and rail export shipments.
The multimodal CFS network was created by merging the above, now traffic routable single
mode networks. This was done by linking them through a series of intermodal truck-rail (TR),
truck-water (TW) and water-rail transfer terminals. Fig. 1 illustrates this concept, for the case of
a truck-rail-truck (TRT) shipment. Using a suitable "shortest path" route-finding algorithm, a
route is generated by first of all accessing the highway sub-network, linking this via a TR terminal
to the rail sub-network and returning to the highway network via a second intermodal terminal
transfer. In practice, two separate versions of the (same) highway sub-network are invoked in this
routing procedure. Each of the three sub-networks shown may be activated or suppressed by
suitable, user driven program commands to handle specific shipments, using a common
sub-network selection software. This is done by invoking only those parts of the multimodal network
database that are necessary for a specific shipment's routing exercise.
The correct mode sequence for intermodal trips is ensured as follows. We can begin by thinking
of all of the CFS network's links as being "switched off", and by processing each reported shipment
in turn. A copy of the highway portion of the CFS intermodal network is switched on. For the
shipment shown in Fig. 1 a set of highway sub-network access links are generated from the traffic's
origin (zip code) by the method described in Section 4 of this paper. This is done for the first copy of
the highway sub-network only. Similarly a set of destination egress links are created and indexed
only to a second copy of the (structurally identical) highway sub-network. To bring the
intermediate rail portion of the route into the picture the rail sub-network is also switched on as is a
suitable subset of the CFS network's intermodal truck-to-rail terminal transfer links. All other
terminal transfer links, including all rail-to-truck transfers, are at this time turned off prior to
shipment routing (by assigning them an infinite impedance as a starting default). The remaining,
direction-specific terminal transfers then ensure the correct TRT routing sequence reported in the
shipper survey. Finally, in making such terminal transfers it was often necessary to also generate, at
execution time, a set of local terminal access and egress links not present in any of the modal
sub-networks, notably where the use of trucks was involved. These are also illustrated in Fig. 1.
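The switching logic above can be sketched with a small shortest path example in Python. This is an illustration under stated assumptions, not the CFS routing code: the node names, impedances and the `H1:`/`R:`/`H2:` prefixes (first highway copy, rail, second highway copy) are invented, and a plain Dijkstra search stands in for whatever "shortest path" algorithm was actually used. Links with infinite impedance are treated as switched off.

```python
import heapq
import math


def shortest_path(links, source, target):
    """Dijkstra over {(from_node, to_node): impedance}; infinite links are ignored."""
    adjacency = {}
    for (a, b), cost in links.items():
        if math.isfinite(cost):
            adjacency.setdefault(a, []).append((b, cost))
    best = {source: 0.0}
    previous = {}
    heap = [(0.0, source)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == target:
            break
        if dist > best.get(node, math.inf):
            continue  # stale heap entry
        for nxt, cost in adjacency.get(node, []):
            candidate = dist + cost
            if candidate < best.get(nxt, math.inf):
                best[nxt] = candidate
                previous[nxt] = node
                heapq.heappush(heap, (candidate, nxt))
    if target not in best:
        return math.inf, []
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    return best[target], path[::-1]


# Hypothetical TRT shipment. Access links touch only the first highway copy (H1),
# egress links only the second copy (H2), so any route must pass through rail.
LINKS = {
    ("origin", "H1:a"): 1.0,     # access link, first highway copy only
    ("H1:a", "H1:b"): 50.0,      # highway, first copy
    ("H1:a", "H1:c"): 400.0,     # all-highway alternative, first copy
    ("H1:b", "R:b"): 5.0,        # truck-to-rail transfer, switched on
    ("R:b", "R:c"): 100.0,       # rail line-haul
    ("R:c", "H1:c"): math.inf,   # rail-to-truck back into first copy: switched off
    ("R:c", "H2:c"): 5.0,        # rail-to-truck transfer into second copy
    ("H2:c", "H2:d"): 20.0,      # highway, second copy
    ("H2:d", "dest"): 1.0,       # egress link, second highway copy only
}

if __name__ == "__main__":
    cost, path = shortest_path(LINKS, "origin", "dest")
    print(cost, " -> ".join(path))
```

Because the destination is reachable only from the second highway copy, and that copy is reachable only through a rail-to-truck transfer, the all-highway link in the first copy cannot produce a valid route even though it exists in the data.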
Efficient organization of the multimodal shipments to be routed made for rapid computer
processing on a shipment by shipment basis. Time saving computational procedures were also
developed to recognize and store previously computed shipment routes (see footnote 2). Once the
impedance for a network link or complete origin to destination route has been computed it can be stored for
re-use in subsequent routing exercises. How these modal and intermodal impedances and their
resulting routes were selected is discussed in Section 4.
After the intermodal terminals and their local access/egress links to the modal networks are
included in the CFS network we have a unified, logical, geographic network with a reduced but
common set of attributes necessary for the CFS routing programs. While this network is formally
a set of geographic objects, there are a number of features that must be overcome to import and
use it within a commercial GIS.
While the individual modal and terminals databases can be readily stored and maintained
within commercial GIS software, it was found less efficient to store the composite multimodal
network in this fashion. Rather, it is constructed by a custom program that reads and processes
the individual databases and produces the data structures required by the CFS routing programs.
The impediments in using this or similarly structured networks in a GIS are:
• A large number of logical links are of zero length because they represent a process that is
idealized to occur at a point, such as terminal activities and inter-company cargo interlines. Other
terminals are co-located with modal network nodes, so their access links appear, geometrically,
as points. Most commercial GIS packages do not allow points, even when they are degenerate
lines, in a dataset of polyline objects.
• Some other objects exactly overlay each other, such as railroad logical links that represent the
trackage rights of different companies on the same physical link, or different logical terminals
that handle multiple commodities in the same physical facility.
• Most commercial GIS packages construct network topology geometrically, i.e., if two links
share endpoints at the same geographic location, the links will be connected. This will clearly
render the CFS network described here analytically useless. (The atypical alternative is to
establish connectivity through endpoint nodes identified on each link's attribute list.)
2 Maintenance of two separate highway sub-networks was also useful for capturing the separate route mileage
estimates for private versus for-hire trucks discussed earlier in Section 2.

In order to use this network in a standard GIS, two solutions may be implemented during its
construction. The first is to slightly perturb logical node locations that would otherwise fall on the
same point, along with the ending vertices of polylines incident to them. This will simultaneously
ensure positive line lengths and prevent spurious link connections. A second method (in GIS
packages that support it) is to specify infinite turning penalties between links that do not share a logical
endpoint node. Since our network routing analyses occurred outside a commercial GIS we did not
need to use these solutions. It was necessary, however, for us to load a version of the CFS network
into a commercial GIS to display particular routes. The solution we chose was to pick out zero
length links and call them points, so that the network would have three classes of objects as
separate data layers with corresponding attribute sets: links, nodes, and zero length links. For
identification, objects within layers are distinguished further by attributes such as mode or commodity.
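A minimal Python sketch of the two ideas in this discussion follows: connectivity decided by endpoint node identifiers rather than by coordinates, and zero length links moved into their own layer for display. The record layout (`from`, `to`, `coords`, `mode`) and the sample links are assumptions made for illustration only.

```python
from math import hypot


def polyline_length(coords):
    """Planar length of a polyline given as a list of (x, y) vertices."""
    return sum(hypot(x2 - x1, y2 - y1) for (x1, y1), (x2, y2) in zip(coords, coords[1:]))


def split_layers(links):
    """Separate ordinary links from zero length links (to be shown as points)."""
    line_layer = [link for link in links if polyline_length(link["coords"]) > 0]
    point_layer = [link for link in links if polyline_length(link["coords"]) == 0]
    return line_layer, point_layer


def logically_connected(a, b):
    """Two links connect only if they share an endpoint node identifier."""
    return bool({a["from"], a["to"]} & {b["from"], b["to"]})


# Two companies' logical links over the same physical track, plus an interline
# link of zero length joining them at one end (all names are made up).
RAIL_A = {"id": "rail_A", "from": "A1", "to": "A2", "coords": [(0, 0), (10, 0)], "mode": "rail"}
RAIL_B = {"id": "rail_B", "from": "B1", "to": "B2", "coords": [(0, 0), (10, 0)], "mode": "rail"}
INTERLINE = {"id": "interline_AB", "from": "A2", "to": "B2",
             "coords": [(10, 0), (10, 0)], "mode": "interline"}

if __name__ == "__main__":
    lines, points = split_layers([RAIL_A, RAIL_B, INTERLINE])
    print([l["id"] for l in lines], [p["id"] for p in points])
    print(logically_connected(RAIL_A, RAIL_B))
```

A geometry-based topology builder would join `rail_A` and `rail_B` at both ends because their coordinates coincide; with node identifiers they meet only through the interline link.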
Table 2
Mode-specific default values of access model parameters (a)
Parameter           Highway         Rail            Inland & Great Lakes   Deep sea
p                   0.00 (0.00)     4.97 (8.00)     4.97 (8.00)            6.22 (10.00)
(2e) (Domestic)     0.22 (0.35)     1.86 (3.00)     1.86 (3.00)            1.86 (3.00)
(2e) (Foreign) (b)  6.22 (10.00)    18.65 (30.00)   31.08 (50.00)          31.08 (50.00)
(a) Distances are in miles (kilometers given in brackets).
(b) Includes Canada and Mexico.
root mean square geographic error in the network's representation of link locations, and p is the
maximum length of a local access connector (e.g., a local road or rail spur line) not included in the
CFS mode specific network databases.
Table 2 lists the initial, mode specific values of e and p used in the above formula.
Using Eq. (1) a mode specific maximum access distance from a zip code centroid is defined and
all network links and nodes within that distance are searched as possible network access or egress
points. How this search occurs and how specific access links are subsequently selected and linked
to the CFS network varied by mode of transportation, as follows:
Highway access: In the case of the comparatively dense highway network the value of p in the
above local access formula is initially set to zero. In this case such access represents local street mileage, or possibly
local minor arterials not included in the national network database. Highway access link distances
were nearly always quite short. These access distances are approximated by finding the
straight-line distance from the zip centroid to the nearest point on the network (link or node) in each of
three mutually exclusive, 120° sectors which meet at the zip centroid. As shown in Fig. 2, these
sectors are defined initially by finding the nearest point on the highway network and using it as the
center of the first of these sectors. Also shown in Fig. 2 is the presence of a river barrier. Where
such barriers were known to exist an access link had its length increased to recognize the need to
cross the barrier at a point other than that bisected by its straight line approximation, with
additional length penalties imposed if a road was of the limited access type. Also, access was not
allowed to occur within a ferry link. The resulting network access distances were then multiplied
by 1.2 to represent local highway network circuity (i.e., the ratio of network distance to Great
Circle distance). In the rare case that no network links existed within R_z the single closest highway
link was selected as the access point so that no zip code area was without highway access. For
route selection purposes the impedances assigned to access links were those of low level urban or
rural collector roads.
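The three-sector search can be sketched in Python as follows. This is a simplified illustration, not the CFS procedure itself: it assumes planar coordinates, takes a ready-made list of candidate network points instead of searching links and nodes within R_z, and omits the river barrier, limited access and ferry rules. Only the 120° sector logic and the 1.2 circuity factor come from the text; all names are hypothetical.

```python
import math

CIRCUITY = 1.2  # local highway circuity factor quoted in the text


def sector_access(centroid, candidates):
    """Pick the nearest candidate point in each of three 120-degree sectors.

    The first sector is centered on the bearing of the overall nearest point.
    Returns {sector_index: (point, straight_line_distance * CIRCUITY)}.
    """
    cx, cy = centroid

    def dist(p):
        return math.hypot(p[0] - cx, p[1] - cy)

    def bearing(p):
        return math.degrees(math.atan2(p[1] - cy, p[0] - cx))

    nearest = min(candidates, key=dist)
    base = bearing(nearest)
    best = {}
    for p in candidates:
        # Shift by 60 degrees so that sector 0 spans -60..+60 around the nearest point.
        relative = (bearing(p) - base + 60.0) % 360.0
        sector = min(int(relative // 120.0), 2)  # guard against float round-up to 360
        if sector not in best or dist(p) < dist(best[sector]):
            best[sector] = p
    return {s: (p, dist(p) * CIRCUITY) for s, p in best.items()}


# Made-up candidate points around a centroid at the origin.
CANDIDATES = [(1, 0), (4, 1), (0, 3), (-5, 0), (-2, -2)]

if __name__ == "__main__":
    for sector, (point, miles) in sorted(sector_access((0, 0), CANDIDATES).items()):
        print(sector, point, round(miles, 2))
```

Searching by sector rather than taking the three nearest points overall gives access links that leave the centroid in different directions, so a shipment is not forced onto the network on the wrong side of its origin.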
Rail access: The 1997 CFS used a modified version of the Federal Railroad Administration's
1:2,000,000 scale network database. This meant there was about a 95% chance that the geographic
error in line placement was within (2 × e) = 1.86 miles (3 km), while the maximum length of an
industrial rail spur, p, was set at almost 5 miles (8 km). The left half of Fig. 3 shows how Eq. (1) is
used. Access links were typically attached to the interior of a rail network link by segmenting it
into two or more subdivisions called "shadow links" (because they logically lie beneath the
original link). The right half of this figure shows how two shadow links are created. These links
are built from the point at which the zip code centroid connector meets the network (point X) to
each end-point (nodes A and B) of the real rail link. Each of these shadow links is given a distance
equal to the length of that portion of the rail network link it represents, in effect creating a
notional node at point X for access modeling purposes. This approach was used in order to retain as
much of the rail network's link distance detail as possible.
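The shadow link construction can be illustrated with a short Python sketch. It assumes, for simplicity, that the real link A-B is drawn as a single straight segment in planar coordinates and that its recorded length is split in proportion to where X falls along that segment; the function and argument names are hypothetical.

```python
import math


def shadow_links(a, b, link_length, centroid):
    """Split real link A-B at the point X nearest to the centroid.

    Returns (x, access_length, length_ax, length_xb), where the two shadow
    link lengths share out the recorded length of the real link.
    """
    ax, ay = a
    bx, by = b
    cx, cy = centroid
    dx, dy = bx - ax, by - ay
    seg2 = dx * dx + dy * dy
    # Fractional position of X along A-B, clamped to the segment.
    t = 0.0 if seg2 == 0 else max(0.0, min(1.0, ((cx - ax) * dx + (cy - ay) * dy) / seg2))
    x = (ax + t * dx, ay + t * dy)
    access_length = math.hypot(cx - x[0], cy - x[1])
    return x, access_length, t * link_length, (1.0 - t) * link_length


if __name__ == "__main__":
    # A link drawn from (0, 0) to (10, 0) whose recorded length is 12 miles.
    x, access, ax_len, xb_len = shadow_links((0, 0), (10, 0), 12.0, (4, 3))
    print(x, access, ax_len, xb_len)
```

Two centroids attached to the same real link each get their own point X, so a short movement between them is measured along the intervening portion of the link rather than via the link's end nodes.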
An alternative to this shadow link approach would be to connect two notional links directly
from the traffic generating or attracting centroid to the end-points of the nearest real rail link. A
problem with this more data efficient method (i.e., two new links are added instead of three) is the
chance that access connectors from more than one centroid may connect to a real network node,
causing routes to be built from one set of notional access links to another without traversing any
real network links. Fig. 4 shows how the two approaches would work in this case. Of the two
approaches only the shadow link approach allows short movements between traffic generators
adjacent to the same (rail) network link to receive the correct network distances.
Water access: Navigable waterways access distance is determined in a manner similar to that
used for rail access, but with two notable differences. First, given that many of the longer
waterway links meander across the landscape, an additional procedure was added to ensure selection
of the most likely access location. Once a set of centroid-to-waterway notional access links has
been constructed such connectors have their impedance multiplied by 5.0. This value was derived
by experiment and generally ensures that the access link with the lowest value of (5.0 × notional
connector straight line distance + network distance along a connected waterway link to its nearest
real network node) is selected for routing purposes, i.e., selection strongly favors that point on the
river system closest to the zip code centroid. Second, the value for spur length, p, is re-set to zero
after its use in the threshold search process, i.e., before the mileage computation is recorded. That
is, the distance from the zip code centroid to the nearest waterway link is not included in the CFS
distance estimate. This is because shipments reported by the survey to either start or end by
waterway are assumed to originate or terminate at a dock located alongside the waterway
network.
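The waterway access rule can be written as a one-line selection criterion. In the Python sketch below the candidate names and distances are invented; only the 5.0 multiplier and the rule that connector mileage is left out of the reported distance come from the text.

```python
CONNECTOR_WEIGHT = 5.0  # multiplier applied to notional connector distance (from the text)


def pick_water_access(candidates):
    """Choose the access candidate with the lowest weighted impedance.

    Each candidate is (name, connector_miles, miles_along_link_to_nearest_node).
    """
    return min(candidates, key=lambda c: CONNECTOR_WEIGHT * c[1] + c[2])


def recorded_access_miles(candidate):
    """Mileage counted toward the shipment: the connector itself is excluded."""
    return candidate[2]


# Two made-up candidates on a meandering river.
CANDIDATES = [
    ("bend near centroid", 1.0, 6.0),   # weighted impedance 5*1 + 6 = 11
    ("point near a node", 3.0, 1.0),    # weighted impedance 5*3 + 1 = 16
]

if __name__ == "__main__":
    chosen = pick_water_access(CANDIDATES)
    print(chosen[0], recorded_access_miles(chosen))
```

Without the weight the second candidate would win (4 miles against 7 in total), which shows how the multiplier pulls the access point toward the part of the river closest to the zip code centroid.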
To derive suitable intermodal routes from differen t combinations , and sequences, of single mode
networks require s a networ k mergin g t o occur . Tha t is , functional linkage s ar e require d a t lo -
cations i n th e rea l worl d wher e intermodal freigh t transfer s take place . Thi s i s accomplished b y
modeling th e operatio n o f intermoda l terminal s withi n a networ k context . Fig . 5 show s tw o
approaches. Th e approac h show n i n th e lef t hal f o f Fig . 5 is termed th e bi-moda l connection s
model, since each intermodal transfe r is represented as a single network lin k between two differen t
modes of transport. This is the method that ORNL use d to model intermodal transfers in the 1993
CFS (Middendor f et al. , 1995 ; Southwort h e t al. , 1997) . This is a straightforwar d solutio n tha t
allows the modeler to assign a direction-specific transfer cost to each terminal link in the databas e
for the purpose of traffic routing. I t can be implemented b y adding a series of bi-modal "notional"
links to a network's database o f single mode links. However, this solution has the limitation tha t
F. Southworth, B.E. Peterson I Transportation Research Part C 8 (2000) 147-166 15 9
only a single, catch-all transfer cost or impedance value is assigned to each bi-modal,
direction-specific linkage, and changing any single component of transfer cost would involve a
re-calculation of each composite bi-modal link cost.
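As a rough illustration of the bi-modal connections model, the sketch below adds one pair of direction-specific "notional" transfer links between a truck node and a rail node. The graph layout, the node naming and the cost values are assumptions made for this example only; they are not taken from the ORNL database.

```python
from collections import defaultdict


def add_bimodal_links(graph, terminals):
    """Add direction-specific notional transfer links to a multimodal graph.

    graph:     dict mapping node -> list of (neighbor, impedance) pairs.
    terminals: iterable of (node_a, node_b, cost_a_to_b, cost_b_to_a).
    Each terminal contributes two one-way links, so the transfer cost
    may differ by direction (hypothetical values below).
    """
    for node_a, node_b, cost_ab, cost_ba in terminals:
        graph[node_a].append((node_b, cost_ab))  # e.g. truck -> rail
        graph[node_b].append((node_a, cost_ba))  # e.g. rail -> truck
    return graph


# Two single mode networks that share the location "B".
graph = defaultdict(list)
graph[("truck", "A")].append((("truck", "B"), 10.0))
graph[("rail", "B")].append((("rail", "C"), 30.0))

# One catch-all transfer cost per direction at terminal "B".
add_bimodal_links(graph, [(("truck", "B"), ("rail", "B"), 5.0, 7.0)])
```

Because each notional link carries a single composite cost, changing one component of that cost means recomputing the link value itself, which is the limitation noted above.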
For the 1997 CFS, making use of a detailed intermodal terminals database developed since
1993 (Middendorf, 1998), this bi-modal link transfer model was replaced with the one represented
in the right half of Fig. 5. In this approach an explicit transfer facility at a specific geographic
location is identified, along with explicit representation of local network-to-terminal access (and
egress) links. The local terminal access links shown as single dashed lines in Fig. 5 are actually
represented in the database as two, uni-directional linkages for each of a series of terminal access
"gates". There may also be more than one such gate per primary mode of transportation. For
example, in Fig. 5(b) two separate railroad company connections are shown. The same procedure
is used for every other mode, varying only those parameters specific to a modal sub-network's
characteristics.
The more elaborate access model shown in Fig. 5(b) is especially useful where the main-line
modal networks are comparatively sparse and therefore need quite long access links to bring
terminals into the network: especially where more than one geographical direction of mode
specific entry/exit exists to a large terminal complex. Indeed, it is often at these local access and
transfer points that major delays, and hence costs, occur in today's freight movement system. This
approach also allows a more realistic representation of within-terminal versus outside terminal
operating and maintenance costs should such a logistical analysis prove to be of interest. Finally,
it allows terminal and network representation issues to be de-coupled, so that multiple terminal
sets or models may be used with the same networks, and vice versa. It must be noted, however,
that data at the level shown in Fig. 5(b) are far from being universally available at the present time
(Middendorf, 1998).
Putting CFS shipments onto the CFS network for the purpose of estimating mode and commodity
specific ton-miles and dollar-miles of freight activity required a method or methods for
first of all generating sensible single and multi-modal routes, and where more than one route was
likely to be used, a method for assigning percentages of shipment volumes to each of these
candidate routes. For consistency with the 1993 CFS single route truck freight modeling was used
to compute 1997 CFS shipment distances. Single path waterway routing was also the norm.
However, where rail dominated a route's mileage (both rail only and rail-inclusive intermodal
routing) the situation was more complicated. More than one rail carrier-specific route was often
both plausible and likely, and therefore each route needed to be both generated and assigned a
portion of the origin-to-destination volume. Rail shipment volumes were then spread across a
limited number of highly likely rail routes using a logit assignment model calibrated roughly to the
tonnages carried on the high volume traffic corridors reported in the Surface Transportation
Board's annual railcar waybill sample (AAR, 1998).
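A minimal sketch of how a logit assignment might spread one origin-to-destination volume across candidate rail routes follows. The multinomial logit form, the dispersion parameter `theta` and the example impedances are assumptions for illustration; the paper does not report the functional form or calibrated values it used.

```python
import math


def logit_shares(route_impedances, theta=0.1):
    """Return the share of volume assigned to each candidate route.

    share_i = exp(-theta * c_i) / sum_j exp(-theta * c_j)
    theta is a hypothetical dispersion parameter: larger values
    concentrate volume on the lowest-impedance route.
    """
    c_min = min(route_impedances)  # shift costs for numerical stability
    weights = [math.exp(-theta * (c - c_min)) for c in route_impedances]
    total = sum(weights)
    return [w / total for w in weights]


def assign_volume(volume, route_impedances, theta=0.1):
    """Split a shipment volume (e.g. tons) across the candidate routes."""
    return [volume * s for s in logit_shares(route_impedances, theta)]


# Three hypothetical carrier-specific rail routes for one O-D pair.
shares = logit_shares([100.0, 105.0, 120.0])
tons = assign_volume(1000.0, [100.0, 105.0, 120.0])
```

Calibration in this setting would amount to adjusting `theta` (and the impedances) until the assigned corridor tonnages roughly match an observed source such as the waybill sample.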
A "good" route, for CFS purposes, is a route that reproduces the shipper reported mode sequence
and can either be validated using other data sources, or in the absence of such sources can
stand up to some common sense rules associated with the economics of freight movement.
Recourse to the literature on multimodal freight routing practices, including the work of Friesz et al.
(1986), Harker (1997) and Guelat et al. (1990), indicates a complex set of factors influencing
actual routes taken, involving carrier as well as shipper decisions, and one that also varies by
commodity type. However, empirical validation of a large number of such mode and commodity
specific route selection models was beyond the resources of the study. Nor would such a thing be
easy to accomplish given the current state of freight movement data across the nation as a whole,
and notably so for movements involving trucks (see Southworth, 1997). To ensure the selection of
sensible routes, therefore, link specific impedances were developed to represent the generalized
cost of different en route activities, including the costs of:
• local access to major traffic ways and terminals;
• within terminal transfer activities including loading and unloading between modes, vehicles,
and railroad companies;
• negotiation of border crossings; and
• the line haul costs in different corridors.
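The cost components listed above can be treated as link impedances in a single merged network, over which a least-impedance route is then searched. The sketch below uses Dijkstra's algorithm as a stand-in for the unspecified shortest path routine; the node names and impedance values are hypothetical.

```python
import heapq


def shortest_path(links, origin, destination):
    """Least-impedance path over a directed network.

    links: dict mapping node -> list of (neighbor, impedance) pairs,
           with all impedances non-negative.
    Returns (path, total_impedance), or (None, inf) if unreachable.
    """
    best = {origin: 0.0}
    prev = {}
    heap = [(0.0, origin)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == destination:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale heap entry
        for nxt, impedance in links.get(node, []):
            new_cost = cost + impedance
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                prev[nxt] = node
                heapq.heappush(heap, (new_cost, nxt))
    if destination not in best:
        return None, float("inf")
    path = [destination]
    while path[-1] != origin:
        path.append(prev[path[-1]])
    return path[::-1], best[destination]


# Hypothetical impedances: local access, terminal transfer, line haul.
links = {
    "origin": [("truck_gate", 4.0), ("dest", 30.0)],  # access link; all-truck line haul
    "truck_gate": [("rail_gate", 3.0)],               # within terminal transfer
    "rail_gate": [("dest_terminal", 12.0)],           # rail line haul
    "dest_terminal": [("dest", 5.0)],                 # transfer plus local egress
}
path, impedance = shortest_path(links, "origin", "dest")
```

In this toy network the truck-rail route (total impedance 24) is preferred to the all-truck route (30); raising the transfer impedance would reverse that choice.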
In all cases one or more routes through the CFS network are identified by a shortest path routine.
Path length is determined here on the basis of a set of modal impedances. This process starts with a
set of what we term "native link impedance functions", i.e., native to the mode in question. In the
case of the highway network these native impedances are assigned based on a number of link
attributes, notably distance and urban and rural functional class, with default link traversal
speeds modified on the basis of traffic conditions, access controls, the presence of a toll or a truck
route designation and whether the highway is divided or not. The native impedance for highways
is therefore a surrogate travel time impedance. Route selection over the railroad network, in
contrast, is determined by an evaluation of line importance called "main line class". Though
primarily based on traffic volumes (e.g., "branch" lines carry less than five million gross tons/year
and "A-main" lines more than 30 million), we subjectively modified these classes on the basis of
operating conditions and the principal commodities carried. Our routing procedure also required
the identification and assignment of an impedance to those "interline" points where railcars may
be transferred between separate railroad companies, data that are now a part of the rail network
database. Waterway routings in contrast were comparatively straightforward for the most part,
as only a single waterway route was typically available and competitive. However, where Great
3 There are a few exceptions, such as truck or container-sized shipments to Hawaii, which were presumed to use a
Pacific coast port in preference to passage through the Panama Canal.
4 For highways such a "best" facility was a rural interstate road. For rail it was an "A-Main" rail line. For
waterborne commerce, with few places where any route choice existed, all links were treated as equal within each of the
three vessel types.
Unlikely splits between modal mileages occur when the routing algorithm selects paths with long
mileages on a more expensive mode relative to a less expensive one. This latter can also occur
when the algorithm selects mode-specific mileages by going through a transfer terminal that
produces mileages that are much longer than a direct trip by a single mode would be. With a little
computer programming it was possible to pick out these questionable routings from among very
long data lists and investigate these cases in more detail, subsequently using a GIS package to
display questionable routes.
Alternative TR routing models. Many of the dubious cases identified by the above route
validation criteria involved TR intermodal moves. A majority of the intermodal shipments reported
in the 1997 CFS involve these two modes. To address these issues two different TR routing
models were developed. These models are termed respectively the "major terminals" model and
"distributed terminals" model. In particular, a distinction was made between containerized and
non-containerized (bulk and break-bulk) freight. As shown in Fig. 6, freight designated as
containerized by 1997 CFS shippers was handled by the major terminals model, and specifically by
allowing TR transfers to occur only at those terminals where containerized traffic was known to
be handled (footnote 5). Where non-containerized shipments were concerned the ORNL terminals database
provided the first set of candidate intermodal transfer locations tried by the rail-inclusive routing
algorithms. If the resulting circuity was found to be unacceptably high for a specific
origin-destination shipment, or if the resulting allocation of highway-to-rail mileage was deemed too
high to warrant expensive rail-based intermodal transfers, then the alternative "distributed
5 For the continental United States, and from among the more than 2900 terminals contained within the ORNL
intermodal terminals database at that time (mid-1997), some 256 TR containerized cargo terminals were identified,
involving either Trailer-on-Flatcar or Container-on-Flatcar operations.
terminals" model was applied. In such cases a "major terminals" routing alternative was
considered suspect when either of the following conditions held:
• when the route length is more than 2.5 times the Great Circle Distance (i.e., circuity exceeds 2.5),
• when the highway proportion of the entire origin-to-destination route length is greater than
25%.
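The two screening rules above can be sketched as a simple check. The thresholds (2.5 and 25%) come from the text; the haversine formula, the Earth radius value and the function names are assumptions introduced for this illustration.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, assumed for this sketch


def great_circle_miles(lat1, lon1, lat2, lon2):
    """Great circle distance between two points (degrees), via haversine."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def is_suspect(route_miles, highway_miles, gcd_miles,
               max_circuity=2.5, max_highway_share=0.25):
    """Flag a "major terminals" route that fails either screening rule."""
    if route_miles <= 0 or gcd_miles <= 0:
        raise ValueError("distances must be positive")
    circuity = route_miles / gcd_miles
    highway_share = highway_miles / route_miles
    return circuity > max_circuity or highway_share > max_highway_share
```

For example, a 1000 mile route with 100 highway miles and an 800 mile great circle distance passes both rules, whereas the same route with 300 highway miles would be flagged.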
A GIS is a valuable tool here for examining suspect terminal-inclusive routes. Rejection of a route
led to the use of the "distributed terminals" model. This model assumes that for certain types of
TR intermodal movement there will be a team track or other rail transfer facility within a
reasonable distance of the shipment origin or destination (depending on the TR, RT, or TRT mode
sequence involved). Without knowing where all of these terminals are located the model posits a
TR transfer facility at the single closest node on each rail company's sub-network, for all rail
nodes within a 90 mile search radius of the truck end of the trip. Once located, a highway route
between a zip code traffic generator and these "ad hoc" terminals is then constructed. If the result
obtained from this distributed terminals model was deemed significantly better, in the sense of the
route being noticeably less circuitous than that supplied by the major terminals model, it was
accepted. As a practical matter, access links to all major terminals are constructed by the ORNL
routing procedures at network generation time. Access links to ad hoc terminals under the
distributed terminals model are constructed at model execution time.
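The search for "ad hoc" terminals described above might be sketched as follows: for each rail company, keep the single closest rail node lying within the search radius of the truck end of the trip. The 90 mile radius is from the text; the record layout, the use of straight-line (great circle) distance for the search, and all names are assumptions for this example.

```python
import math


def miles_between(p, q):
    """Haversine distance in miles between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(p[0]), math.radians(q[0])
    dphi = phi2 - phi1
    dlam = math.radians(q[1] - p[1])
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * 3958.8 * math.asin(math.sqrt(a))


def ad_hoc_terminals(truck_end, rail_nodes, radius_miles=90.0):
    """Return {company: (node_id, miles)} for the closest node per company.

    truck_end:  (lat, lon) of the zip code traffic generator.
    rail_nodes: iterable of (node_id, company, lat, lon) records
                (a hypothetical layout, not the ORNL schema).
    """
    closest = {}
    for node_id, company, lat, lon in rail_nodes:
        d = miles_between(truck_end, (lat, lon))
        if d > radius_miles:
            continue  # outside the search radius
        if company not in closest or d < closest[company][1]:
            closest[company] = (node_id, d)
    return closest


rail_nodes = [
    ("n1", "RR_A", 40.5, -90.0),
    ("n2", "RR_A", 40.2, -90.0),
    ("n3", "RR_B", 41.0, -90.0),
    ("n4", "RR_C", 42.0, -90.0),  # roughly 138 miles away, so excluded
]
found = ad_hoc_terminals((40.0, -90.0), rail_nodes)
```

Each node returned would then receive a highway access link built at model execution time, as the text describes.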
Handling export shipments. Routing export shipments within the 1997 CFS required data on the
US seaport of exit as well as the domestic origin and foreign destination of the movement. Where
US port of exit data was missing from otherwise useful shipment records a method for imputing
the most likely port of exit was devised. This was done by adding deep sea impedances to
within-US truck, rail and/or waterway impedances associated with each export shipment. The resulting
US port-inclusive, relative origin-to-destination impedances were then used to estimate a set of
travel impedance-discounted comparative port attraction factors, with the most attractive port(s)
being assigned the export shipments (Southworth et al., 1998). In terms of ton-mileage and other
distance calculations the non-US portions of these routes are not reported by the CFS, so that the
principal value of the routes to the survey is to identify the US origin to US port of exit mileages
involved. Fig. 7 shows an example trans-oceanic shipment routing, using a commercial GIS to pan
and zoom in along three specific sections of the route. This particular shipment (footnote 6) begins by rail in
East St. Louis with cargo transfer in the New York-New Jersey port area followed by
trans-Atlantic shipping to the port of Amsterdam in the Netherlands. Expert knowledge of the network
options and port terminals involved can be verified quickly with the aid of the GIS.
6. Summary
This paper has described the development of a large and detailed multimodal network, created
and stored in digital form for use in a specific freight traffic routing study: the 1997 United States
Commodity Flow Survey (CFS). While commercial GIS software was found to be invaluable for
6 Note, this is an example route, not an actual shipment route drawn from the 1997 CFS database, for which detailed
information may not be divulged.
Fig. 7. Three views of a rail-water trans-oceanic route: (a) pan view of route, (b) US portion of rail route, and (c) zoom
in on rail-water port transfer.
displaying, checking and editing the network, it was also found to be most efficient to construct
and process shipment routes outside this environment. Among other benefits this approach allows
different mode specific line haul networks to be linked together via more than one data
representation for transportation terminals, and using more than one approach to defining local
network access and egress. The procedures described in the paper are for the most part the generic or
default methods used to automate this process. The reader should note that adjustment and
manipulation of the various network parameter settings reported in the paper were often needed
to accommodate specific route selections, where available empirical evidence suggested that they
were warranted. Most of this post-model development adjustment activity is currently necessary
because there exists no nationally representative sample of how either trucks or intermodal
shipments move around the United States against which to calibrate such routing models. For rail
and water movements this situation is somewhat better, if still far from ideal. Given these
cautions, geographically referenced digital databases such as the CFS network now offer a starting
point from which to model freight activity in more detail, on a route or corridor specific basis.
As long as such activities as cargo handling and intermodal transfer can be translated into
link-specific measures of relative modal impedance, they can be used within the network modeling
framework described in this paper to investigate the implications for future freight movements
from introducing new and location-specific freight movement technologies. This includes the
simulation of often far reaching congestion impacts caused by in-transit delays at major seaports
or other traffic bottlenecks (see, for example, Port of Long Beach, 1994). While validation of
many CFS-generated routes is currently problematic due to the limitations of alternative data
sources, the generation of sensible traffic routing options, based on shipper reported mode
sequences, provides a good deal of insight into the infrastructure-constrained options available for
intermodal transportation within the continental United States. This is a topic of growing
importance as the globalization of trade puts competitive pressures on national economies to
increase their freight transportation carrying capacity. A detailed geography of freight
transportation networks will make it easier to anticipate as well as understand the need for these
new capital investments, both at home and abroad, and within increasingly inter-dependent
national and international freight movement systems.
Acknowledgements
References
AAR, 1998. User Guide to the 1997 Surface Transportation Board Carload Waybill Sample. Association of American
Railroads, Washington, DC 20001.
Bureau of the Census, 1995. 1993 Commodity Flow Survey. United States. TC92-CF-52. US Department of
Commerce, Washington, DC 20233.
Bureau of Transportation Statistics, 1997a. Transportation Statistics Annual Report 1997. US Department of
Transportation, Washington, DC 20590.
Bureau of Transportation Statistics, 1997b. Truck movements in America: shipments from, to, within, and through
states. Transtats 1. US Department of Transportation, Washington, DC 20590.
Chin, S-M., Peterson, B.E., Southworth, F., Davis, R.M., Scott, R.G., 1989. Graphics display of convoy movements.
In: Proceedings, ASCE Conference on Microcomputers in Transportation Planning. Oak Ridge National
Laboratory, San Francisco, July.
Crainic, T.G., Florian, M., Guelat, J., Spiess, H., 1990. Strategic planning of freight transportation: STAN, an
interactive graphic system. Transportation Research Record 1283, 97-124.
Friesz, T.L., Gottfried, J.A., Morlok, E.K., 1986. A sequential shipper-carrier network model for predicting freight
flows. Transportation Science 20, 80-91.
GAO, 1996. Intermodal Freight Transportation. Projects and Planning Issues, United States General Accounting
Office, GAO/NSIAD-96-159.
Guelat, J., Florian, M., Crainic, T.G., 1990. A multimode multiproduct network assignment model for strategic
planning of freight flows. Transportation Science 24, 25-39.
Harker, P.T., 1997. Predicting Intercity Freight Flows. VNU Science Press, Utrecht.
Jourquin, B., Beuthe, M., 1996. Transportation policy analysis with a geographic information system: the virtual
network of freight transportation in Europe. Transportation Research C 4 (6), 359-371.
Middendorf, D., 1998. Intermodal terminals database: concepts, design, implementation, and maintenance. Report
Prepared for the Bureau of Transportation Statistics by Oak Ridge National Laboratory, Oak Ridge, TN 37831.
Middendorf, D., Bronzini, M.S., Peterson, B.E., Liu, C., Chin, S-M., 1995. Estimation and validation of mode
distances for the 1993 Commodity Flow Survey. In: Proceedings of the 37th Annual Transportation Research
Forum. Reston, VA 22090, pp. 456-473.
Port of Long Beach, 1994. The national economic significance of the Alameda corridor. Report Prepared for the
Alameda Corridor Transportation Authority, Long Beach, CA 90801.
Southworth, F., 1997. Development of data and analysis tools in support of a national intermodal network analysis
capability. Bureau of Transportation Statistics, US Department of Transportation, Washington, DC
(http://www.bts.gov/gis/reference/develop/develop.html).
Southworth, F., Peterson, B.E., Chin, S-M., 1998. Methodology for estimating freight shipment distances for the 1997
Commodity Flow Survey. Report Prepared for the Bureau of Transportation Statistics, US Department of
Transportation, Washington, DC 20590.
Southworth, F., Peterson, B.E., Davis, R.M., Chin, S-M., Scott, R.G., 1986. Application of the ORNL highway
network data base to military and civilian transportation operations planning. Papers and Proceedings of Applied
Geography Conferences, vol. 9, pp. 217-227.
Southworth, F., Xiong, D., Middendorf, D., 1997. Development of analytic intermodal freight networks for use within
a GIS. In: Proceedings of the GIS-T 97 Symposium. American Association of State Highway and Transportation
Officials Conference, pp. 201-218.
Abstract
Current geographical information systems (GIS) are not well adapted to the management of very
dynamic geographical phenomena. This is due to the lack of conceptual and physical interoperability with
real-time computing facilities. The research described in this paper is oriented towards the identification
and experimentation of a new methodological and applied framework for the real-time integration,
manipulation and visualisation of urban traffic data. It is based on proactive interaction between the
spatio-temporal database and visualisation levels, and between the visualisation and end-user levels. The proposed
framework integrates different spatial and temporal levels of granularity during the analysis of urban traffic
data. Urban traffic behaviours are analysed either by observation of the movements of several vehicles in
space, or by changes in urban network properties (i.e., micro- versus macro-modelling). Visualisation and
interaction tools together constitute a flexible interface environment for the visualisation of urban traffic
data within GIS. These concepts provide a relevant support for the visual analysis of urban traffic patterns
in the thematic, spatial and temporal dimensions. This integrated framework is illustrated by an
experimental prototype developed in a large town in the UK. © 2000 Elsevier Science Ltd. All rights reserved.
1. Introduction
Recent developments in information technology are having a major effect on the way in which
systems are designed and used in many application fields. Geographical information systems
Corresponding author .
E-mail addresses: [email protected] k (C . Claramunt) , [email protected] e (B . Jiang) , [email protected] k
(A. Bargiela) .
0968-090X/00/$ - see front matter © 2000 Elsevier Science Ltd. All rights reserved.
PII: S0968-090X(00)00009- 7
168 C. Claramunt et al. / Transportation Research Part C 8 (2000) 167-184
(GIS) have been adopted as a successful solution by a wide range of disciplines such as
environmental planning, business demographics, property management and urban studies to mention
some examples. Currently, one of the most important challenges for GIS is to generate a
corporate resource whose full potential will be achieved by making it accessible to a large set of
end-users. In the urban domain, an important issue is the development of a co-operative traffic GIS
that integrates static urban data with dynamic traffic flows (Pursula, 1998). Such a system will be
of great interest for many applications related to the monitoring and analysis of urban traffic in
which represented vehicles or network properties are changing in a fast and almost continuous
mode. Recent advances in traffic systems include the development of graphical interfaces as a new
functional level of monitoring tasks (Peytchev et al., 1996; Kosonen et al., 1998; Barcielo et al.,
1999) and real-time traffic interfaces in the World Wide Web that display traffic conditions on a
regular basis (Dayley and Mayers, 1999; Feng et al., 1999). Nevertheless, the functions provided
by these solutions are quite limited in terms of the analysis and visualisation of urban traffic data.
Furthermore, we believe that the full potential and benefit of traffic databases still need a closer
integration with GIS that will facilitate the integration of traffic data as a component of urban and
transport planning and environmental and health studies. However, current GIS software and
interfaces do not provide the set of functions to make this technology compatible with traffic
systems used for monitoring and simulation purposes. First, the integration of GIS and traffic
systems is likely to be a challenging and worthwhile objective for user communities whose needs
are not satisfied by a loosely connected set of existing systems. This poor level of integration is
often the result of the different paradigms used within GIS and modelling systems and the fact that
many integrated solutions often imply the re-design of existing solutions (Abel et al., 1992).
Secondly, despite recent progress in the development of temporal GISs (e.g., Langran, 1992;
Peuquet, 1994; Claramunt and Theriault, 1995), current GISs are still not adapted to the
management of very dynamic geographical phenomena due to the lack of interoperability with
real-time computing facilities. Moreover, the development of GIS applications, characterised by a high
frequency of changes, implies a reconsideration of the modelling, manipulation, analysis and
visualisation functions as GIS models and architectures have not been preliminarily designed to
handle the properties of very dynamic phenomena.
The research described in this paper is oriented to the identification and experimentation of a
new methodological framework for the real-time integration, manipulation, visualisation and
animation of urban traffic data within GIS. We characterise a very dynamic GIS (VDGIS) as a
GIS application which has a high frequency of change (e.g., real-time traffic databases, simulated
traffic databases). Changes include modifications to located properties (e.g., traffic flows within an
urban network) and moving properties of one to several geographical objects (e.g., vehicle
positions within an urban network). A high frequency of change corresponds to a small temporal
unit of change, which can be generally quantified in seconds or minutes. The scope of VDGIS is
relatively large. It also includes, for example, real-time applications that monitor a large volume
of urban or environmental data, and simulation systems that attempt to predict the future states
of a real-world system. VDGIS objectives are multiple, from the control of the geographical
locations of one to several moving objects, to the support of analysis tasks oriented to the
identification of complex spatial behaviours. For example, traffic monitoring and simulation
applications integrate real-time locations of vehicles on a second, or less, temporal granularity
basis. The integration of GIS capabilities within these engineering systems is a promising
Fig. 1. Visualisation of changes with: (a) map series; (b) map symbol; (c) computer animation.
visualisation of local changes at the individual object level; cartographic symbols such as an arrow
line, for example, are commonly used to indicate the trace of a dynamic object in space (Fig. 1(b)).
In order to facilitate the representation and analysis of dynamic phenomena and change
patterns, cartographers have developed temporal aggregation mechanisms to reduce the number of
snapshots of a map series, and to provide a level of temporal analysis adapted for the particular
needs of an application domain. These concepts can be illustrated with the study of population
migration in urban areas using temporal maps (Szego, 1987), or at the local level, by the map
representation of daily individual activities (Parkes and Thrift, 1980). Animation techniques
(Fig. 1(c)) also play an important role in the analysis of dynamic geographical data, as they
provide a form of temporal continuity, thus facilitating an understanding of processes and
changes. An animated map is a cartographic statement that occurs in time; its interpretation is
based on the human sensitivity to detect movement or changes in a graphic display (Peterson,
1995). One of the first cartographic animations was in fact the urban growth simulation in the
Detroit region developed by Tobler (1970). The combination of GIS and animations provides
powerful platforms to simulate very dynamic phenomena. These can be used for the analysis of
real-time systems (Valsecchi et al., 1999) or the simulation of human behaviours (Jiang, 1999).
The availability of a large volume of spatio-temporal data has stimulated research interest in
visualisation and exploration of dynamic phenomena (Robertson, 1988; Campbell and Egbert,
1990; Kraak and MacEachren, 1994; MacEachren, 1994; Jiang, 1996). A map, or an interactive
map, supports the visualisation of spatio-temporal objects, or properties in space, individually,
but also more interestingly in a logical way, from which map users can perceive spatial
relationships, density, arrangements, trends, connectivity relationships, hierarchies and spatial
associations (Muehrcke, 1981). Efforts have been made on implementing strategies for the cartographical
exploration of time-series data (Monmonier, 1990), proactive graphics (Buttenfield, 1993), and the
identification of dynamic variables for the visualisation of changes (DiBiase et al., 1992;
MacEachren, 1994). Recent advances have explored time-series animation of urban growth (Buziek,
1997), geological changes (Bishop et al., 1999), and socio-economical changes (Andrienko and
Andrienko, 1998). MacEachren et al. (1999) have investigated the integration of geographic
visualisation and data mining for knowledge discovery in the context of spatio-temporal
environmental data. Various new terms have been used to reflect this evolution of cartography such as
'animated cartography' (Peterson, 1995), and 'exploratory cartography' (Kraak, 1998).
Techniques for visualising time and change in cartography and GIS also benefit from related research
areas such as information visualisation, human-computer interaction and data mining
(McCormick et al., 1987; Schneiderman, 1994; Card et al., 1999; Chen, 1999). In particular, visualisation
techniques used for the analysis of communication traffic in large computing networks are of
interest as they are based on comparable network models. Let us mention among others the
development of filtering, interactive manipulation of visualisation parameters, and the interactive
use of temporal (e.g., selection of appropriate temporal periods) and spatial operations (e.g.,
zooming) for displaying telecommunication network traffic (Eick, 1996).
in which the information flows between the database, visualisation and user levels are particularly
intensive. Such concepts provide a solution to the derivation of spatio-temporal data, as a query
language that supports the expression of user-defined spatio-temporal queries on the one hand
and as an interaction language between users and the visualisation level on the other.
We promote a flexible view of this multi-layered approach to the visualisation of very dynamic
phenomena, which is independent of any data model or query language. The set of principles
explored defines a user-defined level between a spatio-temporal database and the range of
visualisation mechanisms required for the manipulation of very dynamic databases. Within the
proposed framework, proactive aggregation tools are illustrated by the composition of derived
views based on different levels of granularity in the temporal and spatial dimensions (Fig. 3). This
interactive visualisation level allows for the manipulation of this user-defined level using different
visualisation mechanisms (e.g., animated map, animated chart) depending on the objective of the
end-user(s).
interconnected maps defined, for example, using different scales, and additional data media
(e.g., photographs). Secondly, a visualisation goes further by integrating animated scenes that
represent the temporal evolution of a region of interest and/or thematic property changes. A
visualisation is then more than a static map as it also integrates a dynamic component, which can
be user-controlled depending on the properties of the phenomenon represented. The structure of a
visualisation is not a linear one but rather a complex one composed of different levels of
granularity in both the spatial and temporal dimensions. Thirdly, a visualisation is completed by
interactive tasks for the manipulation of its visual components (e.g., pan and zooming functions).
As such, a visualisation is supported by a set of interactive facilities offered to end-users, which
can then manipulate different visual components in order to develop a user-oriented perception of
real-world phenomena.
In the context of VDGIS, visualisations combine geographical and thematic data along the
temporal line from different media and sources using visual communication techniques. A
visualisation integrates multiple components such as maps, charts and tables. Interactive functions,
such as spatial operations or temporal brushes, allow for the manipulation of visualisations, and in
fact of the underlying GIS database. Several complementary aspects need to be analysed for the
development of VDGIS visualisations:
• underlying properties of very dynamic geographical data;
• pre-processing functions for filtering large volumes of data;
• query and visualisation operations;
• interactive functions for animation and interface manipulation.
These considerations lead to the analysis of the nature of the dynamic data and changes to be
displayed. Changes have been categorised by DiBiase et al. (1992) as changes in either: (a) spatial
location; (b) spatial location and/or attributes; and (c) classification of objects within the attribute
space (e.g., re-classification). Dynamic objects or spatial properties are difficult to evaluate on an
individual basis. Therefore, the analysis and presentation of a set of dynamic objects often require
the use of parsing, aggregation and/or statistical techniques (Kraak and MacEachren, 1994). This
is illustrated, for example, by the pre-processing of the spatial, thematic and temporal properties
of very dynamic objects. Often, passing from one spatial or temporal level of granularity to a
coarser one provides a complementary insight for the analysis and understanding of spatial
phenomena. Within the time dimension, temporal operators could be used for changing the
temporal granularity of the phenomena represented. As such, these operations allow phenomena
visualisation from a hierarchical point of view. Propagation of temporal constraints between
spatio-temporal processes represented at complementary levels of granularity can be realised
using constraint propagation algorithms (Claramunt and Bai, 1999). Reasoning and manipulation
in temporal systems have been widely studied in temporal logic (Allen, 1984; Bestougeff and
Ligozat, 1992; Badaloni and Benati, 1994). Formal temporal languages and operators are of
particular interest for the temporal aggregation of very dynamic data. Generally, composition
operations, based on the manipulation of temporal intervals, allow the representation of
phenomena at a coarser level of granularity using temporal operators that aggregate temporal periods
(e.g., from an hour to a day frequency of change). Within the spatial and thematic dimensions,
aggregational, statistical and relational operations can also be applied. These operations
constitute a set of pre-processing functions that cover the different dimensions of geographical
phenomena.
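The passage from a finer to a coarser temporal granularity can be sketched as an operation on time intervals. The following is a minimal illustration only, not the operators of the cited temporal logics: the function name `coarsen`, the half-open interval convention and the choice of aligning granules on midnight are all assumptions made for this example.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)  # assumed alignment origin for granules


def coarsen(intervals, granule):
    """Re-express half-open [start, end) intervals at a coarser granularity.

    Each interval is widened to the enclosing granule boundaries, and
    intervals that overlap after widening are merged into one.
    """
    snapped = []
    for start, end in intervals:
        lo = EPOCH + ((start - EPOCH) // granule) * granule    # floor to granule
        hi = EPOCH + (-((EPOCH - end) // granule)) * granule   # ceiling to granule
        snapped.append((lo, hi))

    snapped.sort()
    merged = []
    for lo, hi in snapped:
        if merged and lo < merged[-1][1]:
            # Overlaps the previous coarse interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))
    return merged


# Three hourly periods of change, re-expressed at a daily granularity.
hourly = [
    (datetime(1998, 12, 12, 10), datetime(1998, 12, 12, 11)),
    (datetime(1998, 12, 12, 11), datetime(1998, 12, 12, 12)),
    (datetime(1998, 12, 13, 9), datetime(1998, 12, 13, 10)),
]
daily = coarsen(hourly, timedelta(days=1))
```

In this sketch, adjacent coarse intervals are deliberately kept separate so that each result corresponds to one granule (or one run of overlapping granules); merging adjacent granules as well would be an equally defensible design choice.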
C. Claramunt et al. / Transportation Research Part C 8 (2000) 167-184
The information contained within a VDGIS visualisation includes dynamic objects and their
changing properties on the one hand and the relatively static environment on the other. The latter
acts as a visual background for presentation and animation purposes. For static data, a single
visual representation is generally used; it is bounded in time by the temporal validity of the
geographical data visualised, which is either a time instant or interval. The static component of a
visualisation provides support for the interactive exploration of changes as it gives a geographical
reference to the dynamic phenomena analysed. The duration of a temporal visualisation scene
(i.e., animation) can be proportional to the magnitude of the phenomenon represented. For
example, the temporal progression of animation slows down as changes visualised are increasing in
intensity. This type of animation technique is referred to as pacing (DiBiase et al., 1992). This also
requires an analysis of the visual properties that support the presentation of dynamic objects
(urban, rural, motorway). The research prototype reported here relates to our work with the
SCOOT traffic monitoring and control system, which provides a basis for a whole range of
telematics applications including microscopic and macroscopic traffic simulation and portable traffic
information systems (Bargiela and Berry, 1999). The SCOOT traffic management system retained
for the development of this project models a part of the city of Mansfield, a mid-sized city in the
UK. The temporal granularity of incoming traffic data provided by the memory management
system is given on a per-second basis. Such a frequency of communication flow leads to a huge volume
of traffic data (about one million traffic data messages per day). In order to reduce such a huge
volume of data, we decided to aggregate incoming traffic data to half an hour time interval
samples. This resolution largely reduces the amount of traffic data generated, and is still relevant
for the objectives of an analysis of traffic conditions. The applications have been interfaced using a
generic inter-process communication facility developed at Nottingham Trent University, the
distributed memory environment DIME (Argile et al., 1996). This communication environment is
based on a TCP/IP protocol and a client-server architecture. This system has been developed and
tested in conjunction with the SCOOT system. With the aid of DIME, the VDGIS can appear to
SCOOT as another application that performs complex data aggregation and visualisation tasks
while essentially maintaining its autonomy.
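The reduction of per-second messages to half an hour samples can be sketched as follows. This is a minimal illustration under stated assumptions, not the OSIRIS pre-processor: the message layout `(timestamp, lane_id, value)` and the function name `aggregate_half_hour` are hypothetical, and only the two summary statistics are computed (average and maximum).

```python
from collections import defaultdict
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)  # assumed alignment origin for the time bins


def aggregate_half_hour(messages, width=timedelta(minutes=30)):
    """Group (timestamp, lane_id, value) messages into fixed-width time bins.

    Returns a dict mapping (lane_id, bin_start) to (average, maximum).
    """
    bins = defaultdict(list)
    for timestamp, lane_id, value in messages:
        # Floor the timestamp to the start of its half-hour bin.
        bin_start = EPOCH + ((timestamp - EPOCH) // width) * width
        bins[(lane_id, bin_start)].append(value)
    return {key: (sum(vals) / len(vals), max(vals)) for key, vals in bins.items()}


# Hypothetical queue-length messages for one lane.
messages = [
    (datetime(1998, 12, 12, 10, 0, 1), "lane_1", 4),
    (datetime(1998, 12, 12, 10, 15, 0), "lane_1", 8),
    (datetime(1998, 12, 12, 10, 30, 0), "lane_1", 6),
]
samples = aggregate_half_hour(messages)
```

With one message per second, each lane contributes up to 1800 values to a bin, so every half-hour sample stands in for up to 1800 raw messages from that lane.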
We illustrate these concepts in the context of the OSIRIS prototype oriented to the
development of an inter-operable system for the integration of real-time traffic data within a GIS
(Etches et al., 1999; Valsecchi et al., 1999; Grzywacz and Claramunt, 2000). The database
method used to support the description of the traffic system is based on an object-relationship
model (Etches et al., 1999). For the purposes of our prototype, the database design has been
mapped to a geo-relational model. The resulting model supports both object and attribute
versioning, thus allowing a flexible representation of temporal properties. The OSIRIS prototype
extends the current capabilities of traffic monitoring systems in terms of database functions and
develops a user-oriented interface based on the integration, aggregation, manipulation,
visualisation and animation of traffic conditions within an urban network. Such a system complements
the monitoring functions provided by real-time traffic systems. It is oriented towards urban
studies that integrate traffic conditions as a parameter. Within OSIRIS, traffic data are imported
from an urban traffic control system that optimises the split, cycle, and offset times of traffic
signals. This traffic system is a macroscopic traffic system which is therefore not oriented towards
the modelling of individual cars but rather traffic conditions within a road network (e.g., queue
lengths). The OSIRIS implementation is realised on top of MapInfo GIS using C++, Delphi (a
Windows GUI editor and Pascal compiler), and MapBasic programming languages. The urban
network component of the database is based on Ordnance Survey Centre Alignment of Roads
(OSCAR) data.
Changing the level of granularity in the representation of any real-world phenomenon has an
impact on both the spatial and temporal dimensions. In the temporal dimension, the granularity
of very dynamic data needs to be pre-defined according to user needs. Within a traffic system,
incoming data based on a very dynamic frequency of change can be pre-processed according to
the minimal time interval of interest (Valsecchi et al., 1999). In order to analyse traffic conditions
at complementary levels of granularity, several spatial and temporal aggregation mechanisms
have been developed. At the spatial level, the aggregation of traffic data is based on three
complementary levels that provide different representations of traffic data flows, from the finest spatial
level of granularity to the coarser spatial level of granularity, i.e., incoming lane, road segment
and node, respectively (Fig. 6).
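The three spatial levels can be read as a containment hierarchy in which lane values are rolled up to road segments and segment values to nodes. The sketch below assumes this hierarchy is available as two lookup tables; the identifiers, the function name `rollup` and the use of a sum to combine queue lengths are illustrative assumptions, since the combination rule depends on the traffic attribute being aggregated.

```python
from collections import defaultdict


def rollup(values, parent_of, combine):
    """Aggregate child-level values to the parent level.

    values    -- dict mapping child identifier to a value
    parent_of -- dict mapping child identifier to parent identifier
    combine   -- function reducing a list of child values to one value
    """
    groups = defaultdict(list)
    for child, value in values.items():
        groups[parent_of[child]].append(value)
    return {parent: combine(vals) for parent, vals in groups.items()}


# Hypothetical network fragment: three lanes, two segments, one node.
lane_queue = {"lane_a": 3, "lane_b": 5, "lane_c": 2}
lane_to_segment = {"lane_a": "seg_1", "lane_b": "seg_1", "lane_c": "seg_2"}
segment_to_node = {"seg_1": "node_x", "seg_2": "node_x"}

segment_queue = rollup(lane_queue, lane_to_segment, sum)
node_queue = rollup(segment_queue, segment_to_node, sum)
```

Passing the reducer as a parameter keeps the same function usable for attributes where a maximum or an average is the more meaningful summary.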
Additionally, a user-defined level allows the aggregation of traffic data on pre-selected routes
(e.g., set of road segment ends). At the temporal level, the source temporal granularity provided
by DIME (i.e., 1 s) is aggregated on a half an hour basis by the pre-processor (i.e., averages and
maximum of traffic data values). A user-oriented temporal granularity is also selected during
aggregation analysis according to application needs. The pre-processing functions of incoming
traffic data have been implemented through a visual user interface. The pre-processing of a
visualisation requires the definition of the temporal parameters (i.e., time interval, period of
aggregation), the definition of incoming traffic attributes (either based on a maximum or average
basis), and the levels of spatial and temporal granularity (Fig. 7). For example, pre-processing
functions calculate averages and maximums (Fig. 7(a)) of queue lengths, traffic light periods, and
node saturation. These functions are applied on either maximum or average incoming traffic data
attributes (Fig. 7(b)). For the analysis of very dynamic phenomena such as traffic flows, spatial
and/or temporal aggregations provide different levels of analysis. At a coarser level of granularity,
aggregated behaviours are identified. Coarser temporal and/or spatial granularity levels allow the
identification of global changes. On the other hand, the analysis of local changes requires finer
temporal and spatial granularity levels. Browsing throughout different spatial and temporal levels
of granularity is an important functional requirement for the development of successful VDGIS in
order to support a large range of user functions that cover both the study of local properties and
the analysis of general trends within the urban traffic network.
The query components of the OSIRIS prototype have been completed by the implementation
of temporal operations that extend current relational and spatial operations provided by a GIS
system (Grzywacz and Claramunt, 2000). Temporal operations are embedded within a query
interface that integrates thematic, spatial and temporal operations. The temporal functions
represent the extension developed. The graphic user interface (GUI) extends the current MapInfo
query interface by integrating temporal predicates within the WHERE clause and temporal
functions within the SELECT clause (e.g., Valid(), Cast(Valid() as interval)). A temporal
operation wizard allows the user to create a temporal predicate (Fig. 8). Each temporal predicate
consists of two operands and an operation in-between. The system controls the choices of
operands according to the temporal operation selected by the user. This implementation is based
on a dual approach that combines a first normal form (1NF) approach with TSQL2 temporal
operations, which is the current database standard for temporal operations. Such a solution
presents the advantage of being compatible with current geo-relational software architectures,
which is a constraint of our prototype environment. Typical query examples are as follows: (1)
display the spatial extents and deliver the identifiers, average number of passing cars, and valid
times of the lanes that have an average number of passing cars greater than or equal to 25, during
periods that end after 11:45 on 12 December 1998; (2) return the maximum value of average
numbers of passing cars, for periods of time after 10:30 on 12 December 1998. This temporal
manipulation interface implements the main operations defined in TSQL2, and extends the range
of GIS querying capabilities towards the temporal dimension. These temporal operations
complete the pre-processing and query capabilities of the OSIRIS prototype.
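The logic of the two query examples can be sketched over valid-time records in plain Python rather than in TSQL2 or MapBasic. The record layout `LaneRecord` and the function names are hypothetical, spatial extents are left out, and reading "periods of time after 10:30" as periods starting at or after that instant is an assumption of this sketch.

```python
from datetime import datetime
from typing import NamedTuple


class LaneRecord(NamedTuple):
    lane_id: str
    avg_cars: float        # average number of passing cars in the period
    valid_from: datetime   # start of the valid-time period
    valid_to: datetime     # end of the valid-time period


def busy_lanes_ending_after(records, threshold, instant):
    """Query (1): records with avg_cars >= threshold whose period ends after instant."""
    return [r for r in records if r.avg_cars >= threshold and r.valid_to > instant]


def max_average_after(records, instant):
    """Query (2): maximum avg_cars over periods starting at or after instant."""
    later = [r.avg_cars for r in records if r.valid_from >= instant]
    return max(later) if later else None


day = (1998, 12, 12)
records = [
    LaneRecord("lane_1", 30.0, datetime(*day, 11, 30), datetime(*day, 12, 0)),
    LaneRecord("lane_2", 20.0, datetime(*day, 11, 30), datetime(*day, 12, 0)),
    LaneRecord("lane_3", 40.0, datetime(*day, 11, 0), datetime(*day, 11, 30)),
]
q1 = busy_lanes_ending_after(records, 25, datetime(*day, 11, 45))
q2 = max_average_after(records, datetime(*day, 10, 30))
```

Each predicate here has the same shape as the wizard-built predicates described above: two operands (a valid-time bound and an instant) and a comparison between them.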
In order to provide complementary visualisation perspectives, multi-dimensional visualisation
techniques reflect the dynamic properties of incoming traffic data (e.g., thematic chart, spatial
chart, thematic animation, spatial animation). For example, an animation allows users to browse
through the temporal traffic states of selected and aggregated traffic values within a considered
period of time. Such functions enrich the user perception of traffic data through time and act as an
exploratory tool that can be used to identify traffic patterns in space and time. These visualisations
can be used to detect incidents in order to identify critical nodes, or for the analysis of traffic
patterns within the traffic network. In the context of our project, the visualisation of very dynamic
geographical data implies a high level of interaction that supports complementary user-defined
tasks:
• definition of complementary temporal and spatial levels of granularity (Fig. 7);
• derivation of traffic data using query language capabilities (Fig. 8);
• combination of different dimensions in order to analyse patterns in the spatial, temporal and
thematic dimensions (Fig. 9).
Within the scope of the OSIRIS prototype, different visualisation and animation techniques have
been used:
• Map animations that present the variation of traffic properties located in the network, using
different spatial (lane, road segment or node) and temporal aggregations (i.e., different
temporal granules). Fig. 9(a) presents an example of spatial animation that can either simulate traffic
behaviours at the queue, road segment or node levels. The animation can be controlled through
the GUI with an interaction box that is user-controlled.
• Animated graphs that describe the variation of traffic properties, using different spatial (i.e.,
lane, road segment or node) and temporal aggregation levels (different temporal granules).
Fig. 9(b) presents an example of thematic animation that simulates the variation of traffic
queue values along the time line thanks to an interaction box that is user-controlled.
• Charts that present the temporal evolution of a traffic parameter for a user-defined route or set
of road elements (i.e., lane, road segment or node). Fig. 9(c) presents an example of variation of
traffic queue values along the time line for a set of traffic network nodes.
• Animations that present the evolution of the distribution of a traffic parameter for a
user-defined set of temporal components (i.e., lane, road segment or node). Fig. 9(d) presents
an animation that illustrates the distribution of traffic values for a set of traffic network
nodes.
Fig. 9. (a) Spatially oriented animation, (b) Thematically oriented animation, (c) Thematically oriented chart, (d)
Distribution-oriented animation.
The following screen snapshots - examples taken from the OSIRIS prototype - illustrate the
concepts of visualisation and animation tools that integrate complementary graphical and
cartographical techniques. The interaction level is given by a set of actions that support temporal
browsing functions within the different visualisations. Different levels of granularity are
user-defined during the aggregation of data selected for the visualisation process. All together, these
visualisation and interaction tools provide a suitable platform that allows users to explore urban
traffic data from various perspectives and to generate a set of dynamic visual representations that
give an overview of traffic flows. Such functions enrich the user perception of traffic data through
time and act as an exploration tool that can be used to identify traffic incidents and patterns in
space and time. For example, OSIRIS visualisations can be used to detect the impact of an
6. Conclusion
The experimental research presented in this paper develops a new framework for the
integration, analysis and visualisation of urban traffic data within VDGIS. The integration of urban
traffic data within VDGIS requires a sequence of manipulations that include pre-processing
functions, selection and derivation of traffic data, and visualisation and animation tasks. In
particular, the constraints of a VDGIS imply the development of pre-processing functions that
aggregate incoming traffic data in both the spatial and temporal dimensions. These functions
allow the analysis of spatio-temporal phenomena at complementary levels of granularity. The
manipulation and analysis of urban traffic is based on several complementary levels:
pre-processing, visualisation and interaction tools that allow users to analyse urban traffic data within
GIS. The presented framework has been illustrated and validated in the context of a VDGIS for a
real-time traffic system. The method proposed and the implementation realised with the prototype
OSIRIS are original as the proposed architecture combines: (1) a dynamic integration of traffic
data, (2) pre-processing of traffic data at complementary levels of granularity, (3) the integration
of temporal operations within a GIS query language, and (4) an interface that supports
visualisations and animations in the thematic, spatial and temporal dimensions. Further work includes
the development and prototyping of a real-time traffic GIS for simulation purposes.
Acknowledgements
We would like to thank the anonymous reviewers and Prof. Jean-Claude Thill for their most
helpful comments and suggestions.
References
Abel, D.J., Yap, S.K., Ackland, R., Cameron, M.A., Smith, D.F., Walker, G., 1992. Environmental decision support system project: an exploration of alternative architectures for GIS. International Journal of Geographic Information Systems 6 (3), 193-204.
Allen, J.F., 1984. Towards a general theory of actions and time. Artificial Intelligence 23, 123-154.
Andrienko, G.L., Andrienko, N.V., 1998. Visual data exploration by dynamic manipulation of maps. In: Poiker, T., Chrisman, N. (Eds.), Proceedings of the Eighth International Symposium on Spatial Data Handling, Vancouver, pp. 533-542.
Argile, A., Peytchev, E., Bargiela, A., Kosonen, I., 1996. DIME: a shared memory environment for distributed simulation, monitoring and control of urban traffic. In: Proceedings of European Simulation Symposium ESS'96, SCS, vol. 1, Genoa, pp. 152-156.
Badaloni, S., Benati, M., 1994. Dealing with time granularity in a temporal planning system. In: Proceedings of the First International Conference on Temporal Logic. Springer, Berlin, pp. 101-116.
Barcielo, J., Ferrer, J.L., Martin, R., 1999. Simulation assisted design and assessment of vehicle guidance systems. International Transactions in Operational Research 6, 123-143.
Bargiela, A., Berry, R., 1999. Enhancing the benefits of UTC through distributed applications. Traffic Technology International, February, pp. 63-66.
Bestougeff, H., Ligozat, G., 1992. Logical Tools for Temporal Knowledge Representation. Ellis Horwood, UK.
Bishop, I.D., Ramasamy, S.M., Stephens, P., Joyce, E.B., 1999. Visualisation of 8000 years of geological history in Southern India. International Journal of Geographic Information Science 13 (4), 417-427.
Buttenfield, B.P., 1993. Proactive graphics and GIS: prototype tools for query, modelling and display. In: Proceedings of Auto Carto 11. ACSM/ASPRS, Minneapolis, pp. 377-385.
Buziek, G., 1997. The design of a cartographic animation - experiences and results. In: Proceedings of the 18th International Cartographic Conference. ICA, Stockholm, pp. 1344-1351.
Campbell, C.S., Egbert, S.L., 1990. Animated cartography/Thirty years of scratching the surface. Cartographica 27 (2), 24-46.
Card, S.K., Mackinlay, J.D., Schneiderman, B., 1999. Readings in Information Visualisation: Using Vision to Think. Morgan Kaufmann, San Francisco.
Chen, C., 1999. Information Visualisation and Virtual Environments. Springer, Berlin.
Claramunt, C., Bai, L., 1999. A multi-scale approach to the propagation of temporal constraints in GIS. Journal of Geographic Information and Decision Analysis 3 (1), 9-20.
Claramunt, C., Mainguenaud, M., 1996. A Spatial representation and navigation model. In: Kraak, M.J., Molenaar, M. (Eds.), Advances in GIS II. Taylor & Francis, Delft, Netherlands, pp. 767-784.
Claramunt, C., Thériault, M., 1995. Managing time in GIS: an event-oriented approach. In: Clifford, J., Tuzhilin, A. (Eds.), Recent Advances in Temporal Databases. Springer, Berlin, pp. 23-42.
Dayley, D.J., Mayers, D., 1999. A statistical model for dynamic ride-matching in the World Wide Web. In: Proceedings of the ITSC'99 Conference. The IEEE Computer Society, Tokyo, pp. 154-165.
DiBiase, D.A., MacEachren, M., Krygier, J.B., Reeves, C., 1992. Animation and the role of map design in scientific visualisation. Cartography and Geographic Information Systems 19 (4), 201-214.
Eick, S.G., 1996. Aspects of network visualization. Computer Graphics and Applications 16 (2), 69-72.
Etches, A., Claramunt, C., Bargiela, A., Kosonen, I., 1999. An interoperable TGIS model for traffic systems. In: Gittings, B. (Ed.), Innovations in GIS 6, Integrating Information Infrastructures with GI Technology. Taylor & Francis, London, pp. 217-228.
Feng, C., Wei, H., Lee, J., 1999. WWW-GIS strategies for transportation applications. In: Proceedings of the 78th Transportation Research Board, Washington, DC, pp. 234-249.
Grzywacz, M., Claramunt, C., 2000. An implementation of temporal operations within a co-operative traffic system. International Journal of Applied Systems Studies, Special Issue on Applied Co-operative Systems 1 (1), forthcoming.
Jiang, B., 1996. Cartographic visualisation: analytical and communication tools. Cartography, 1-11.
Jiang, B., 1999. SimPed: simulating pedestrian crowds in a virtual urban environment. Journal of Geographic Information and Decision Analysis 3 (1), 21-30.
Kosonen, I., 1999. HUTSIM - Urban traffic simulation and control model: principles and applications, unpublished Ph.D. dissertation. Helsinki University of Technology.
Kosonen, I., Bargiela, A., Claramunt, C., 1998. A distributed information system for traffic control. In: Bargiela, A., Kerckhoffs, E. (Eds.), Proceedings of the Tenth European Symposium in Simulation Systems. Nottingham, pp. 355-361.
Kraak, M.J., 1998. The cartographic visualisation process: from presentation to exploration. The Cartographic Journal 35 (1), 11-15.
Kraak, M.J., MacEachren, A.M., 1994. Visualisation of temporal component of spatial data. In: Waugh, T.C., Healey, R.G. (Eds.), Proceedings of the International Spatial Data Handling Conference SDH'94, Edinburgh, pp. 391-409.
Langran, G., 1992. Time in Geographic Information Systems. Taylor & Francis, London.
MacEachren, A., 1994. Time as a cartographic variable. In: Hearnshaw, H.M., Unwin, D.J. (Eds.), Visualisation in Geographical Information Systems. Wiley, New York, pp. 115-130.
MacEachren, A.M., Wachowicz, M., Edsall, R., Haug, D., 1999. Constructing knowledge from multivariate spatio-temporal data: integrating geographical visualisation with knowledge discovering in database methods. International Journal of Geographic Information Science 13 (4), 311-334.
McCormick, B.H., DeFanti, T.A., Brown, M.D., 1987. Visualisation in scientific computing (special issue). ACM SIGGRAPH Computer Graphics 21 (6).
Monmonier, M., 1990. Strategies for the visualisation of geographic time-series data. Cartographica 27 (1), 30-45.
Muehrcke, P.C., 1981. Maps in geography. Cartographica 18 (2), 1-41.
Parkes, D., Thrift, N., 1980. Times, Spaces, and Places. Wiley, New York.
Peterson, M.P., 1995. Interactive and Animated Cartography. Prentice-Hall, Englewood Cliffs, NJ.
Peuquet, D.J., 1994. It's about time: a conceptual framework for the representation of temporal dynamics in geographic information systems. Annals of the Association of the American Geographers 84 (3), 441-61.
Peytchev, E., Bargiela, A., Gessing, R., 1996. A predictive macroscopic city traffic flows simulation model. In: Proceedings of European Simulation Symposium ESS'96, SCS, vol. 2, Genoa, pp. 38-42.
Pursula, M., 1998. Simulation of traffic systems: an overview. In: Bargiela, A., Kerckhoffs, E. (Eds.), Proceedings of the 10th European Simulation Symposium, pp. 20-24.
Robertson, P.K., 1988. Choosing data representations for the effective visualisation of spatial data. In: Proceedings of the Third International Symposium of Spatial Data Handling, ICA, Sydney, pp. 243-252.
Schneiderman, B., 1994. Dynamic queries for visual information seeking. IEEE Software 11 (6), 70-77.
Stonebraker, M., Chen, J., Natha, N., Paxson, C., Wu, J., 1993. Tioga: providing data management support for scientific visualisation applications. In: Agrawal, R., Baker, S., Bell, D.B. (Eds.), Proceedings of the Very Large Database Conference, Dublin, Ireland, pp. 123-134.
Szego, J., 1987. Human Cartography: Mapping the World of Man. The Swedish Council for Building Research, Stockholm, Sweden.
Tobler, W.R., 1970. A computer movie simulating urban growth in the Detroit region. Economic Geography 46 (2), 234-240.
Tufte, E.R., 1983. The Visual Display of Quantitative Information. Graphics Press, Cheshire, CT.
Valsecchi, P., Claramunt, C., Peytchev, E., 1999. OSIRIS: an inter-operable system for the integration of real time traffic data within GIS. Computers, Environment and Urban Systems 23 (2), 245-257.
Vasiliev, I., 1996. Design issues to be considered when mapping time. In: Wood, C., Keller, C.P. (Eds.), Cartographic Design: Theoretical and Practical Perspectives. Wiley, New York, pp. 137-146.
Voisard, A., 1991. Towards a toolbox for geographical user interfaces. In: Günther, O., Schek, H.-J. (Eds.), Advances in Spatial Databases. Springer, Zurich, pp. 75-98.
Abstract
A major difficulty in the analysis of disaggregate activity-travel behavior in the past arises from the many
interacting dimensions involved (e.g. location, timing, duration and sequencing of trips and activities).
Often, the researcher is forced to decompose activity-travel patterns into their component dimensions and
focus only on one or two dimensions at a time, or to treat them as a multidimensional whole using
multivariate methods to derive generalized activity-travel patterns. This paper describes several GIS-based
three-dimensional (3D) geovisualization methods for dealing with the spatial and temporal dimensions of
human activity-travel patterns at the same time while avoiding the interpretative complexity of multivariate
pattern generalization or recognition methods. These methods are operationalized using interactive 3D GIS
techniques and a travel diary data set collected in the Portland (Oregon) metropolitan region. The study
demonstrates several advantages in using these methods. First, significance of the temporal dimension and
its interaction with the spatial dimension in structuring the daily space-time trajectories of individuals can
be clearly revealed. Second, they are effective tools for the exploratory analysis of activity diary data that
can lead to more focused analysis in later stages of a study. They can also help the formulation of more
realistic computational or behavioral travel models. © 2000 Published by Elsevier Science Ltd. All rights
reserved.
1. Introduction
As evident in early time-use and activity-travel studies (e.g. Chapin, 1974; Cullen et al., 1972;
Szalai, 1972), a major difficulty in the analysis of human activity-travel patterns is that individual
0968-090X/00/$ - see front matter © 2000 Published by Elsevier Science Ltd. All rights reserved.
PII: S0968-090X(00)00017-6
M.-P. Kwan / Transportation Research Part C 8 (2000) 185-203
movement in space-time is a complex trajectory with many interacting dimensions. These include
the location, timing, duration, sequencing and type of activities and/or trips. This characteristic of
activity-travel behavior has made the simultaneous analysis of its many dimensions difficult
(Burnett and Hanson, 1982). Often, one has either to focus on a few component dimensions at a
time (e.g. Bhat, 1997, 1998; Chapin, 1974; Golob and McNally, 1997; Goulias, 1999; Lu and Pas,
1999; Michelson, 1985; Pendyala, 1997), or to treat the pattern as a multidimensional whole and
use multivariate methods to derive generalized activity-travel patterns from a large number of
variables (e.g. Bhat and Singh, 2000; Golob, 1985; Hanson and Hanson, 1980, 1981; Janelle and
Goodchild, 1988; Koppelman and Pas, 1985; Ma and Goulias, 1997a,b; Pas, 1982, 1983; Recker
et al., 1983, 1987).
The development and application of these quantitative methods in transportation research have enhanced our understanding of activity-travel behavior. Through the use of multivariate group identification methods, such as clustering or pattern recognition algorithms, complex patterns in the original data set can be represented by some general characteristics and organized into a relatively small number of homogeneous classes. Further, once activity-travel patterns are represented, they can be related to a large number of attributes of the individuals or households which generate them and used as a response variable in models of activity-travel behavior (Koppelman and Pas, 1985). While these quantitative methods are useful for modeling purposes and for discovering the complex interrelations among variables, they also have their limitations.
First, few of these methods were designed to handle real geographical locations of human activities and trips in the context of a study area (Kwan, 1997). Often, the spatial dimension is represented by measures derived from real geographical locations (e.g. distance or direction from a reference point such as the home or workplace of an individual). Further, locational information on activities and trips was often aggregated with respect to a zonal division of the study area (e.g. traffic analysis zones). With such zone-based data, location and/or distance are measured using zone centroids, so information about activity locations in geographic space and their spatial relations with other urban opportunities is lost (Kwan and Hong, 1998). As point-based activity-travel data geocoded to street addresses have gradually become available in recent years, new analytical methods that can handle the location of activities and trips in real geographic space are needed.
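The loss of information under zone-based measurement can be illustrated with a minimal sketch. The coordinates, the zone centroid and the helper name `euclidean` below are hypothetical and are not taken from the Portland data; the point is only that two distinct activity sites falling in the same zone collapse onto one centroid.

```python
import math

def euclidean(p, q):
    """Straight-line distance between two (x, y) points, in the units of the input."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

# Hypothetical point-based locations (metres), both inside the same analysis zone.
home = (0.0, 0.0)
shop = (300.0, 400.0)

# Hypothetical centroid of that zone; zone-based data would record both sites here.
zone_centroid = (1000.0, 0.0)

point_based_distance = euclidean(home, shop)                   # uses geocoded locations
zone_based_distance = euclidean(zone_centroid, zone_centroid)  # both sites share one centroid
```

In this constructed case the point-based distance is 500 m while the zone-based distance is zero, so the intra-zonal trip disappears from any distance measure built on centroids.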
Second, since many analytical methods (e.g. log-linear models) are designed to deal with categorical data, organizing the original data in terms of discrete units of space and time has been a necessary step in most analyses of activity-travel patterns in the past. Discretization of temporal variables, such as the start time or duration of activities, involves dividing the relevant span of time into several units and assigning each activity or trip to the appropriate class (e.g. dividing a day into 8 or 12 temporal divisions into which activities or trips are grouped). Discretization of spatial variables, such as distance from home, involves dividing the relevant distance range into several "rings". Since both the spatial and temporal dimensions are continuous, results of any analysis based upon these discretized variables may be affected by the particular scheme of spatial and/or temporal divisions used. The problem may be serious when dealing with the interaction between spatial and temporal variables since two discretized variables are involved. Visualization may have an important role to play in alleviating this difficulty since the spatio-temporal patterns of the original data can be explored before they are discretized for further analysis or modeling.
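The sensitivity to the discretization scheme can be sketched as follows. The start times are invented for illustration, and the function name `discretize_start_times` and the equal-width classes are assumptions rather than the procedure of any study cited above.

```python
from collections import Counter

def discretize_start_times(start_minutes, n_bins):
    """Count activities per equal-width time class over a 24-hour day.

    start_minutes: start times in minutes after midnight (0 <= t < 1440).
    n_bins: number of temporal divisions, e.g. 8 or 12.
    """
    width = 24 * 60 / n_bins
    return Counter(min(int(t // width), n_bins - 1) for t in start_minutes)

# Hypothetical start times: four activities clustered around 10:00 and one at 17:30.
starts = [590, 595, 605, 610, 1050]

counts_8 = discretize_start_times(starts, 8)    # 3-hour classes
counts_12 = discretize_start_times(starts, 12)  # 2-hour classes
```

With eight divisions the four mid-morning activities fall into a single class, whereas with twelve divisions the class boundary at 10:00 splits the same cluster into two classes of two activities each. The underlying continuous data are identical; only the scheme differs.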
Third, as the amount and complexity of activity-travel data have increased considerably in recent years (Cambridge Systematics, 1996a), effective methods for exploring these data are also urgently needed (McCormack, 1999). Without them, the researcher may need to model activity-travel patterns without a preliminary understanding of the behavioral characteristics or uniqueness of the individuals in the sample at hand. This can be costly in later stages of a study if the model's specifications fail to take into account the behavioral anomalies involved. Since exploratory data analysis (EDA) can often lead to more focused and fruitful methods or models in later stages of a study, the recent development and use of scientific visualization for EDA suggest a possible direction for overcoming the problem (Dykes, 1996; Gahegan, 2000). Recent developments in the integration of scientific visualization and exploratory spatial data analysis (ESDA) also indicate the potential of geovisualization for the analysis of activity-travel patterns (Anselin, 1998, 1999; Wise et al., 1999).
This study explores the application of interactive geographical visualization (or geovisualization) in the analysis of georeferenced activity-travel data. It describes several GIS-based three-dimensional (3D) visualization methods for handling the spatial and temporal dimensions of activity-travel behavior that avoid the interpretative complexity of multivariate pattern generalization or recognition methods. These include space-time activity density surfaces, space-time aquariums and standardized space-time paths. These methods are operationalized using interactive 3D GIS techniques and an activity-travel diary data set collected in the Portland (Oregon) metropolitan area in 1994/95. While some of these methods were developed by the author for analyzing a smaller data set collected in Columbus, Ohio (Kwan, 1999a), new methods are developed and explored in this paper. These include the use of GIS-based surface modeling and virtual reality techniques. Further, as these visualization methods are computationally intensive, implementing them for handling a large data set would shed light on their feasibility, value and limitations for the analysis of activity patterns in space-time for transportation researchers.
Visualization is the process of creating and viewing graphical images of data with the aim of increasing human understanding (Hearnshaw and Unwin, 1994). It is based on the premise that humans are able to reason and learn more effectively in a visual setting than when using textual and numerical data (Tufte, 1990, 1997). Visualization is particularly suitable for dealing with large and complex data sets because conventional inferential statistics and pattern recognition algorithms may fail when a large number of attributes are involved (Gahegan, 2000). In view of the large number of attributes that can be used to characterize activity-travel patterns, and given the capability of scientific visualization in handling a large number of attributes, visualization is a promising direction for exploring and analyzing large and complex activity-travel data. Geovisualization, on the other hand, is the use of concrete visual representations and human visual abilities to make spatial contexts and problems visible (MacEachren et al., 1999). By involving the geographical dimension in the visualization process, it greatly facilitates the identification and interpretation of spatial patterns and relationships in complex data in the geographical context of a particular study area.
For the visualization of geographic data, conventional GIS has focused largely on the representation and analysis of geographic phenomena in two dimensions. Although 3D visualization programs with advanced 3D modeling and rendering capabilities have been available for many years, they have been developed and applied largely in areas outside the GIS domain (Sheppard, 1999). Only recently has GIS incorporated the ability to visualize geographic data in 3D (although specialized surface modeling programs have existed long before). This is so not only in the digital representation of the physical landscape and terrain of land surfaces, but also in the 3D representation of geographic objects using various data structures. There are many methods for representing complex geographic objects in 3D (Li, 1994). One is to assign the Z value using attributes available in the two-dimensional (2D) database to produce a "3Dable" geographic database. For example, a 3D representation of a building can be created by extruding the 2D building outline along the Z-axis by the height of the building. This practice is often referred to as 2½-D, as there can be only one Z value for any single location (X, Y) on the 2D surface, which limits its ability to represent complex geographic objects in 3D. To represent geographic entities as true 3D objects, one has to use other methods. These include solid modeling used in computer-aided design (CAD) software, the voxel data structure that covers 3D space with 3D pixels (voxels), or object-oriented 3D data models (Lee and Kwan, 2000).
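The extrusion of a 2D outline along the Z-axis described above can be sketched in a few lines. This is an illustrative geometry routine only, not the procedure of any particular GIS package; the function name `extrude_footprint` and the sample footprint are hypothetical.

```python
def extrude_footprint(outline_xy, height, base_z=0.0):
    """Extrude a 2D polygon outline into a prism.

    outline_xy: list of (x, y) vertices in order, without repeating the first vertex.
    height: the attribute used as the extrusion distance along Z.
    Returns (bottom_ring, top_ring, wall_faces); each wall is a quad of 3D vertices.
    """
    bottom = [(x, y, base_z) for x, y in outline_xy]
    top = [(x, y, base_z + height) for x, y in outline_xy]
    n = len(outline_xy)
    walls = []
    for i in range(n):
        j = (i + 1) % n  # wrap around to close the outline
        walls.append((bottom[i], bottom[j], top[j], top[i]))
    return bottom, top, walls

# Hypothetical rectangular building outline (metres) with a height attribute of 35 m.
footprint = [(0.0, 0.0), (20.0, 0.0), (20.0, 10.0), (0.0, 10.0)]
bottom, top, walls = extrude_footprint(footprint, height=35.0)
```

Every vertex of the top ring shares its (X, Y) with a bottom vertex and differs only in Z, which is why extruded shapes of this kind cannot represent overhangs or other forms needing several Z values at one location.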
Although GIS-based 3D geovisualization has been applied in many areas of research in recent years, its use in the analysis of human activity patterns is rather limited to date. In many early studies, 2D maps and graphical methods were used to portray the patterns of human activity-travel behavior (e.g. Chapin, 1974; Tivers, 1985). Individual daily space-time paths were represented as lines connecting various destinations. With such 2D graphical methods, information about the timing, duration and sequence of activities and trips was lost. Even long after the adoption of the theoretical constructs of the time-geographic perspective by many researchers in the 1970s and 1980s, the 3D representation of space-time aquariums and space-time paths seldom went beyond the schematic representations used either to explain the logic of a particular behavioral model or to put forward a theoretical argument about human activity-travel behavior. They were not intended to portray the real experience of individuals in relation to the concrete geographical context in any empirical sense.
However, as more georeferenced activity-travel diary data become available, and as more GIS software has incorporated 3D capabilities, it is apparent that GIS-based 3D geovisualization is a fruitful approach for examining human activity-travel behavior in space-time. For instance, Forer (1998) and Huisman and Forer (1998) implemented space-time paths and prisms based on a 3D raster data structure for visualizing and computing space-time accessibility surfaces. Their methods are especially useful for aggregating individuals with similar socioeconomic characteristics and for identifying behavioral patterns. However, since the raster data structure is not suitable for representing the complex topology of a transportation network, the implementation of network-based computational algorithms is difficult when using their methods. On the other hand, Kwan (1999a, 2000a) implemented 3D visualization of space-time paths and aquariums using vector GIS methods and activity-travel diary data. These recent studies indicate that GIS-based geovisualization has considerable potential for advancing research on human activity-travel behavior. Further, implementing 3D visualization of human activity-travel patterns can be an important first step in the development of GIS-based geocomputational procedures that are applicable in many areas of transportation research. For example, Kwan (1998, 1999b), Kwan and Hong (1998) and Miller (1991, 1999) developed different network-based algorithms for computing individual accessibility using vector GIS procedures.
The use of GIS-based 3D geovisualization in the analysis of human activity patterns has several advantages. First, it provides a dynamic and interactive environment that is much more flexible than the conventional mode of data analysis in transportation research. The researcher can not only directly manipulate the attributes of a scene and its features, but can also change the views, alter parameters, query data and see the results of any of these actions easily. Second, since GIS has the capability to integrate a large amount of geographic data in various formats and from different sources into a comprehensive geographic database, it is able to generate far more complex and realistic representations of the urban environment than conventional methods (Weber and Kwan, 2000). The concrete spatial context it provides can greatly facilitate exploratory spatial data analysis and the identification of spatial relations in the data. Results can also be exported easily to spatial analysis packages for performing formal spatial analysis (Anselin and Bao, 1997). Third, with many useful navigational capabilities such as fly-through, zooming, panning and dynamic rotation, as well as the multimedia capabilities to generate map animation series such as 3D "walk-throughs" and "fly-bys", the researcher can create a "virtual world" that represents the urban environment with a very high level of realism (Batty et al., 1998). Lastly, unlike quantitative methods that tend to reduce the dimensionality of data in the process of analysis, 3D geovisualization may retain the complexity of the original data to the extent that human visual processing is still capable of handling.
3. Data
The data used in this research are from the Activity and Travel Survey conducted in the Portland (Oregon) metropolitan area in the spring and autumn of 1994 and the winter of 1995 (see Cambridge Systematics (1996b) for details of the survey). The survey used a two-day activity diary to record all activities involving travel and all in-home activities with a duration of at least 30 min for all individuals in the sampled households. Of the 7090 households recruited for the survey, 4451 households with a total of 10,084 individuals returned completed and usable surveys. The data set logged a total of 129,188 activities and 71,808 trips.
Besides the information commonly obtained in travel diary surveys, this data set comes with the geocodes (xy coordinates) of all activity locations, including the home and workplace of all individuals in the sample. This greatly facilitates its incorporation into a geographic database of the study area. Besides the activity diary data, geographic information about the Portland metropolitan region is also used in this study. This includes data on various aspects of the urban environment and transportation system. This contextual information allows the activity-travel data to be related to the geographical environment of the region during visualization.
The next two sections explore several methods for the 3D geovisualization of human activity-travel patterns in space-time. For the implementation of these methods, various segments of a subsample of the original respondents are used. The subsample consists of individuals who were identified as head of household or spouse or partner and are employed full-time or part-time. It provides information for 4,744 individuals who are working adults in households with at least one adult. All geoprocessing is performed using ARC/INFO and ArcView GIS, while the 3D interactive geovisualization is conducted using ArcView 3D Analyst.
tation of activity patterns in space-time, the Z variable in this study represents the time dimension of activities and trips. In this particular example, activity start time is used as the Z variable in the conversion process. Using this Z value, each activity is first located in 3D space as a point entity using its geographic location (X, Y) and activity start time (Z). To represent the duration of each activity, the activity points in 3D are extruded from their start times by a value equal to the duration of the activity.
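The conversion just described (locate each activity at (X, Y, start time), then extrude by its duration) can be sketched as below. The record layout `Activity`, the use of minutes as the time unit and the sample values are assumptions for illustration; the study's own processing was carried out in ARC/INFO and ArcView.

```python
from typing import NamedTuple

class Activity(NamedTuple):
    """Hypothetical diary record: a geocoded location plus timing in minutes."""
    x: float
    y: float
    start_min: float     # minutes after midnight, used as the Z value
    duration_min: float  # extrusion length along Z

def activity_to_segment(act):
    """Return the (base, top) 3D endpoints of the vertical line for one activity."""
    base = (act.x, act.y, act.start_min)
    top = (act.x, act.y, act.start_min + act.duration_min)
    return base, top

# Hypothetical activity starting at 12:15 and lasting 45 minutes.
lunch = Activity(x=7643200.0, y=684100.0, start_min=735.0, duration_min=45.0)
base, top = activity_to_segment(lunch)
```

The length of the resulting vertical line (top Z minus base Z) equals the activity duration, which is what makes duration readable directly from the 3D scene.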
Fig. 1 shows the result of using this method for the 14,783 out-of-home, non-employment activities performed by the 2157 European-American (white) women in the subsample. In the figure, activity duration is indicated by the length of the vertical line that represents the temporal span of an activity. Activity start time can be color-coded so that the temporal distribution of activities can be viewed during interactive visualization (these color codes are used in the color version of Fig. 1 posted on the Web). A helpful background for relating the activity patterns to various locations in the study area is created by adding several layers of geographic information into the 3D scene. These layers include the boundary of the Portland metropolitan region, freeways and major arterials. For better visual anchoring and locational referencing during visualization, a 3D representation of downtown Portland, which appears as a partially transparent 3D pillar derived from extruding its 2D boundary along the Z dimension, is also added into the scene.
Through interactive geovisualization, it is apparent that the highest concentrations of non-employment activities of the selected women are found largely in areas close to downtown Portland inside the "loop" and areas west of downtown along and south of Freeway 84. Important clusters of non-employment activities are also found in Beaverton in the west and Gresham in the east. Most of the non-employment activities are of very short duration (94% of them with duration under 5 min, and less than 1% have duration over 10 min). Further, most of the non-employment activities that were undertaken during lunch hour are found largely in the high density areas near downtown, while non-employment activities in the more suburban areas tend to take place in the morning, late afternoon or evening (perhaps associated with the commute trip). There is strong spatial association between the location of non-employment activities and the locations of workplace for the selected individuals.
where k(·) is the kernel function, the parameter h > 0 is the bandwidth determining the amount of smoothing, w_i is a weighting factor, and δ_h(x) is an edge correction factor (Cressie, 1993). In this study, the quartic kernel function
Fig. 2. Activity density patterns in geographic space, with the surface for non-employment activities over the surface of home locations.
Fig. 3. A close-up view of activity density patterns in geographic space, with the surface for workplace over the surface of home locations.
peaks of home locations can be clearly seen. The most intensive one is in the east of downtown Portland while the other two are in the west of downtown and southwest of downtown along Freeway 5. Not as obvious in the figure is the peak of the workplace surface, which is centered at downtown Portland inside the "loop" (it cannot be seen easily because the surface is interrupted from below at the "saddle" between the peaks of the home surface due to its transparency). Another area with a high density of workplaces is found along Freeway 217 in the southwest, between the junctions with State Route 26 in the north and Freeway 205 in the south. During the interactive 3D geovisualization of this scene, the proximity and spatial relationship between the peaks of the two surfaces are striking.
The major advantage of this method is its capability for examining the spatial relationships between different surfaces in their concrete geographical context. However, to explore the temporal dimension and its interaction with the spatial dimension, another visualization method is needed.
Fig. 5. Gender difference in the density of non-employment activities between women and men employed part-time.
performed many more non-employment activities than women in this space-time area. Another sharp trough is found in the evening hours between 7 p.m. and 8 p.m., about 3 km from home.
There are two major advantages in using these 3D space-time activity density surfaces. First, they reveal the intensity of activities in space and time simultaneously, thus facilitating the analysis of their interaction. Second, the grid-based method is amenable to many map-algebraic operations that can be used to adjust the computed raw density for highlighting the distinctiveness in the activity patterns of a particular population subgroup. It also makes the derivation of a "difference surface" for two population subgroups relatively easy, thus facilitating the examination of inter-group differences. The following section turns to the 3D geovisualization of space-time paths.
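A cell-by-cell "difference surface" of the kind mentioned above can be sketched with plain nested lists. The grids, the layout (rows as time-of-day classes, columns as distance-from-home classes) and the normalization step that rescales each grid to sum to one are assumptions made for illustration; the adjustment actually applied to the raw densities in the study is not reproduced here.

```python
def normalize(grid):
    """Rescale a density grid so its cells sum to 1 (an assumed adjustment)."""
    total = sum(sum(row) for row in grid)
    if total == 0:
        raise ValueError("cannot normalize an all-zero grid")
    return [[cell / total for cell in row] for row in grid]

def difference_surface(grid_a, grid_b):
    """Map-algebra subtraction: cell-by-cell difference of two equally sized grids."""
    same_shape = len(grid_a) == len(grid_b) and all(
        len(ra) == len(rb) for ra, rb in zip(grid_a, grid_b)
    )
    if not same_shape:
        raise ValueError("grids must have identical dimensions")
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(grid_a, grid_b)]

# Hypothetical raw densities for two subgroups on a tiny 2 x 2 grid.
women_raw = [[1.0, 3.0], [2.0, 2.0]]
men_raw = [[2.0, 1.0], [0.0, 1.0]]

diff = difference_surface(normalize(women_raw), normalize(men_raw))
```

Positive cells mark space-time areas where the first subgroup's share of activities is higher and negative cells mark the reverse, corresponding to peaks and troughs in a rendered difference surface.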
For the visualization of individual space-time paths, the earliest 3D method is the 'space-time aquarium' conceived by Hägerstrand (1970). In a schematic representation of the 'aquarium', the vertical axis is the time of day and the boundary of the horizontal plane represents the spatial scope of the study area. Individual space-time paths are portrayed as trajectories in this 3D aquarium. Although the schematic representation of the 'space-time aquarium' was developed long ago, it has never been implemented using real activity-travel diary data. The main difficulties include the need to convert the activity data into "3Dable" formats that can be used by existing visualization software, and the lack of comprehensive geographic data for representing complex geographic objects of the urban environment. The recent incorporation of 3D capabilities into GIS packages and the availability of contextual geographic data for many metropolitan regions have greatly reduced these two difficulties.
To implement 3D geovisualization of the space-time aquarium, four contextual geographic data layers are first converted from 2D map layers to 3D shape files and added to a 3D scene. These include the metropolitan boundary, freeways, major arterials, and rivers. For better close-up visualization and for improving the realism of the scene, outlines of commercial and industrial parcels in the study area are converted to 3D shapes and vertically extruded in the scene. Finally, the 3D space-time paths of individuals who are African Americans, Hispanics and Asian Americans from the subsample are generated and added to the 3D scene. These procedures created the scene shown in Fig. 6.
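How a single individual's space-time path might be assembled from diary records can be sketched as follows. The tuple layout, the time unit and the sample diary are hypothetical, and straight-line trip segments are an assumption; the sketch does not reproduce the shape-file generation used in the study.

```python
def space_time_path(activities):
    """Build the 3D vertices (x, y, t) of one individual's space-time path.

    activities: iterable of (x, y, start, end) tuples for a single day.
    Each activity contributes a vertical segment (same x, y from start to end);
    the segment joining one activity's end to the next activity's start is the trip.
    """
    path = []
    for x, y, start, end in sorted(activities, key=lambda a: a[2]):
        if end < start:
            raise ValueError("activity ends before it starts")
        path.append((x, y, start))
        path.append((x, y, end))
    return path

# Hypothetical one-day diary (metres, minutes after midnight): home, work, home.
diary = [
    (5000.0, 2000.0, 510.0, 1020.0),  # work
    (0.0, 0.0, 0.0, 480.0),           # home, morning
    (0.0, 0.0, 1050.0, 1440.0),       # home, evening
]
path = space_time_path(diary)
```

Sorting by start time keeps the trajectory monotonic in time, so the list of vertices can be passed as one 3D polyline to whatever rendering environment is used.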
The overall pattern of the space-time paths for these three groups shown in Fig. 6 indicates heavy concentration of day-time activities in areas in and around downtown Portland. Using the interactive visualization capabilities of the 3D GIS, it can be seen that many individuals in these ethnic minority groups work in that area and a considerable amount of their non-employment activities are undertaken in areas within and east of downtown Portland. Space-time paths for individuals who undertook several non-employment activities in a sequence within a single day tend to be more fragmented than those who have long work hours during the day. Further, ethnic differences in the spatial distribution of workplace are observed using the interactive capabilities provided by the geovisualization environment. The space-time paths of Hispanics and Asian Americans are more spatially scattered throughout the area, while those of the African Americans are spatially restricted, concentrating largely in the east side of the metropolitan region.
Fig. 6. Space-time aquarium with the space-time paths of African Americans, Hispanics and Asian Americans in the subsample.
A close-up view from the southwest of this interactive geovisualization session is given in Fig. 7, which shows some of the details of downtown Portland in areas around the "loop" and along the Willamette River in the foreground. Portions of some space-time paths can also be seen in this scene. With the 3D parcels and other contextual layers in view, the figure gives the researcher a strong sense of the geographical context through a virtual reality-like view of the downtown area. This interactive virtual environment not only contextualizes the visualization in its actual geographical surroundings but also enables the analysis of local variations at fine spatial scales. For instance, in the color version of the figure provided on the Web, commercial buildings are color-coded orange-brown, while industrial buildings are in green. The use of color codes for distinguishing different types of buildings would give the analyst a sense of the potential interaction space and its context, which can then be compared to the activities and paths of the individuals (where activities and stops can also be color-coded by activity type). This approach will therefore have considerable potential for the development of person-specific, activity-based methods at fine spatial scales.
Fig. 7. A close-up view of downtown Portland from the 3D scene shown in this figure.
coordinate system. This can be done by shifting the locational coordinates of all activity sites for an individual so that the home location becomes the origin (0,0) and the home-work axis is rotated until it becomes the positive x-axis. Using these transformed or 'standardized' space-time paths, many distinctive features of the trajectories of a particular population subgroup may still be identifiable even when numerous space-time paths of many individuals are plotted. Fig. 8 shows the standardized space-time paths for the individuals of the three minority groups in the subsample. The vertical plane is the home-work plane where the home location is indicated by the origin (0,0). In the interactive visualization session, it can be seen that there is a considerable amount of non-employment activity during the day, and that there are distinctive bundles of work activities at particular distances from home. Further, the spatial distribution of these non-employment activities reflects a bias toward the home-work axis, supporting similar observations in previous studies (e.g., Kitamura et al., 1990; Saxena and Mokhtarian, 1997).
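The standardization described above (translate so that home is the origin, then rotate so that the home-work axis becomes the positive x-axis) can be sketched as a plane rotation applied to every vertex of a path. The function name `standardize_path` and the sample coordinates are hypothetical; time values are left unchanged.

```python
import math

def standardize_path(path, home, work):
    """Translate and rotate a space-time path into a home-work coordinate system.

    path: list of (x, y, t) vertices; home, work: (x, y) locations.
    After the transform, home is at (0, 0) and work lies on the positive x-axis.
    """
    hx, hy = home
    # Angle of the home-work axis relative to the original x-axis.
    theta = math.atan2(work[1] - hy, work[0] - hx)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    standardized = []
    for x, y, t in path:
        dx, dy = x - hx, y - hy              # shift so home becomes the origin
        new_x = dx * cos_t + dy * sin_t      # rotate by -theta
        new_y = -dx * sin_t + dy * cos_t
        standardized.append((new_x, new_y, t))
    return standardized

# Hypothetical individual: home at (100, 200), workplace 5 units away at (103, 204).
home, work = (100.0, 200.0), (103.0, 204.0)
raw_path = [(100.0, 200.0, 480.0), (103.0, 204.0, 510.0), (100.0, 200.0, 1050.0)]
std_path = standardize_path(raw_path, home, work)
```

In this example the workplace vertex maps to approximately (5, 0) and both home vertices map to (0, 0), so paths from individuals living in different parts of the region become directly comparable. If home and work coincide, the angle defaults to zero and only the translation is applied.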
In view of the complex space-time patterns the researcher has to deal with when using these methods, pattern extraction algorithms and other geocomputational procedures can be developed to complement the geovisualization methods discussed in this section (e.g. the GIS-based geocomputation of individual accessibility by Kwan, 1998; Miller, 1999). Thus, these 3D methods enabled by the 3D geovisualization environment can be the basis for developing and formulating quantitative methods for the characterization and extraction of patterns from the large number of space-time trajectories as valuable analytical tools.
6. Conclusions
There are, however, several difficulties in the development and use of these 3D methods. First, there is the challenge of converting many types of data into "3Dable" formats for a particular geovisualization environment. Since every visualization software package may have its own data format requirements, and the activity and geographic data currently available are largely in 2D formats, the data preparation and conversion process can be time consuming and costly. For example, considerable data preparation and pre-processing are required for converting the Portland activity-travel data before they can be displayed as 3D space-time paths. Future research should investigate how the effort and time spent on data conversion could be reduced when data from various sources are used.
Second, the researcher may encounter barriers to the effective visualization of large and complex activity-travel data sets. Four such potential barriers identified by Gahegan (1999) are: (1) rendering speed: the ability of the hardware to deliver satisfactory performance for the interactive display and manipulation of large data sets; (2) visual combination effects: problems associated with the limitation in human ability to identify patterns and relations when many layers, themes or variables are simultaneously viewed; (3) large number of visual possibilities: the complexity associated with the vast range of possibilities that a visualization environment provides (i.e., the vast number of permutations and combinations of visual properties the researcher can assign to particular data attributes); and (4) the orientation of the user in a visualized scene or virtual world. Implementation of the interactive 3D methods in this study shows that a geovisualization environment which provides a geographical context for the researcher may considerably alleviate the fourth problem. However, the other three barriers may still remain a significant challenge to researchers who want to use methods of this kind. For instance, rendering the density surface in Fig. 4, which involves 227,041 triangles, can be taxing on the hardware. Further, identifying patterns from the space-time paths covering 129,188 activities undertaken by the survey respondents may push our visual ability beyond its limit. Future research should examine how human cognitive barriers involved in the interpretation of complex 3D patterns may be overcome.
Third, the use of individual-level activity-travel data geocoded to street addresses, given their reasonable degree of positional accuracy, may lead to considerable risk of privacy violation. As Armstrong and Ruggles (1999) demonstrated, although "raw" maps composed of abstract map symbols do not directly disclose confidential information, a determined data spy can use GIS technology and other knowledge to "hack" the maps and make an estimate of the actual address (and hence, a good guess of the identity of an individual) associated with each point symbol. This practice, called "inverse address-matching", has the potential for serious confidentiality or privacy violation. As "map hackers" may be able to accurately recover a large proportion of original addresses from dot maps, any use of such individual-level geocoded data should be conducted with great concern for protecting the privacy of survey respondents and maintaining the confidentiality of information. As apparent in the 3D geovisualization examples in this paper (e.g., the details in Fig. 7), releasing a 3D scene created from several accurate data themes in virtual reality markup language (VRML) format may lead to significant risk of privacy violation because map hackers may be able to recover the identity of a particular survey respondent. This may further lead to the disclosure of other confidential information. As a result, researchers using the 3D geovisualization methods discussed in the paper should pay particular attention to this potential risk.
M.-P. Kwan / Transportation Research Part C 8 (2000) 185-203
Acknowledgements
Support for this research from the College of Social and Behavioral Sciences, the Ohio State
University, is gratefully acknowledged. The author would like to thank three anonymous
reviewers for their comments on an earlier draft of this paper.
References
Anselin, L., 1998. Exploratory spatial data analysis in a geocomputational environment. In: Longley, P.A., Brooks,
S.M., McDonnell, R., MacMillan, B. (Eds.), Geocomputation: A Primer. Wiley, New York, pp. 77-94.
Anselin, L., 1999. Interactive techniques and exploratory spatial data analysis. In: Longley, P.A., Goodchild, M.F.,
Maguire, D.J., Rhind, D.W. (Eds.), Geographic Information Systems, vol. 1: Principles and Technical Issues, second
ed. Wiley, New York, pp. 253-266.
Anselin, L., Bao, S., 1997. Exploratory spatial data analysis linking SpaceStat and ArcView. In: Fischer, M., Getis, A.
(Eds.), Recent Developments in Spatial Analysis. Springer, Berlin, pp. 35-59.
Armstrong, M.P., Ruggles, A.J., 1999. Map hacking: on the use of inverse address-matching to discover individual
identities from point-mapped information sources. Paper presented at the Geographic Information and Society
Conference, University of Minnesota, 20-22 June.
Bailey, T.C., Gatrell, A.C., 1995. Interactive Spatial Data Analysis. Longman, New York.
Batty, M., Dodge, M., Doyle, S., Smith, A., 1998. Modelling virtual environments. In: Longley, P.A., Brooks, S.M.,
McDonnell, R., MacMillan, B. (Eds.), Geocomputation: A Primer. Wiley, New York, pp. 139-161.
Bhat, C.R., 1997. Work travel mode choice and number of non-work commute stops. Transportation Research B 31
(1), 41-54.
Bhat, C.R., 1998. A model of home-arrival activity participation behavior. Transportation Research B 32 (6),
387-400.
Bhat, C.R., Singh, S.K., 2000. A comprehensive daily activity-travel generation model system for workers.
Transportation Research A 34 (1), 1-22.
Burnett, P., Hanson, S., 1982. The analysis of travel as an example of complex human behavior in spatially constrained
situations: definition and measurement issues. Transportation Research A 16 (2), 87-102.
Cambridge Systematics, 1996a. Scan of Recent Travel Surveys. Cambridge Systematics, Oakland, CA.
Cambridge Systematics, 1996b. Data Collection in the Portland, Oregon Metropolitan Area. Cambridge Systematics,
Oakland, CA.
Chapin, F.S. Jr., 1974. Human Activity Patterns in the City. Wiley, New York.
Cressie, N.A.C., 1993. Statistics for Spatial Data. Wiley, New York.
Cullen, I., Godson, V., Major, S., 1972. The structure of activity patterns. In: Wilson, A.G. (Ed.), Patterns and
Processes in Urban and Regional Systems. Pion, London, pp. 281-296.
Dykes, J., 1996. Dynamic maps for spatial science: a unified approach to cartographic visualization. In: Parker, D.
(Ed.), Innovations in GIS 3. Taylor & Francis, London, pp. 177-187.
Forer, P., 1998. Geometric approaches to the nexus of time, space, and microprocess: implementing a practical model
for mundane socio-spatial systems. In: Egenhofer, M.J., Golledge, R.G. (Eds.), Spatial and Temporal Reasoning in
Geographic Information Systems. Oxford University Press, Oxford, England, pp. 171-190.
Gahegan, M., 1999. Four barriers to the development of effective exploratory visualization tools for the geosciences.
International Journal of Geographic Information Science 13 (4), 289-309.
Gahegan, M., 2000. The case for inductive and visual techniques in the analysis of spatial data. Journal of Geographical
Systems 2 (1), 77-83.
Gatrell, A., 1994. Density estimation and the visualization of point patterns. In: Hearnshaw, H.M., Unwin, D.J. (Eds.),
Visualization in Geographical Information Systems. Wiley, New York, pp. 65-75.
Golob, T.F., 1985. Analyzing activity pattern data using qualitative multivariate statistical methods. In: Nijkamp, P.,
Leitner, H., Wrigley, N. (Eds.), Measuring the Unmeasurable. Martinus Nijhoff, Boston, MA, pp. 339-356.
Golob, T.F., McNally, M.G., 1997. A model of activity participation and travel interactions between household heads.
Transportation Research B 31 (3), 177-194.
Goulias, K.G., 1999. Longitudinal analysis of activity and travel pattern dynamics using generalized mixed Markov
latent class models. Transportation Research B 33 (8), 535-558.
Hägerstrand, T., 1970. What about people in regional science? Papers of Regional Science Association 24, 7-21.
Hanson, S., Hanson, P., 1980. Gender and urban activity patterns in Uppsala, Sweden. Geographical Review 70 (3),
291-299.
Hanson, S., Hanson, P., 1981. The travel-activity patterns of urban residents: dimensions and relationships to
sociodemographic characteristics. Economic Geography 57, 332-347.
Hearnshaw, H.M., Unwin, D. (Eds.), 1994. Visualization in Geographical Information Systems. Wiley, Chichester,
England.
Huisman, O., Forer, P., 1998. Towards a geometric framework for modelling space-time opportunities and interaction
potential. Paper presented at the International Geographical Union, Commission on Modelling Geographical
Systems Meeting (IGU-CMGS), Lisbon, Portugal, 28-29 August.
Janelle, D.G., Goodchild, M.F., 1988. Space-time diaries and travel characteristics for different levels of respondent
aggregation. Environment and Planning A 20 (7), 891-906.
Kitamura, R., Nishii, K., Goulias, K., 1990. Trip chaining behavior by central city commuters: a causal analysis of
time-space constraints. In: Jones, P. (Ed.), Developments in Dynamic and Activity-Based Approaches to Travel
Analysis. Avebury, Aldershot, pp. 145-170.
Kondo, K., Kitamura, R., 1987. Time-space constraints and the formation of trip chains. Regional Science and
Economics 17 (1), 49-65.
Koppelman, F.S., Pas, E.I., 1985. Travel-activity behavior in time and space: methods for representation and analysis.
In: Nijkamp, P., Leitner, H., Wrigley, N. (Eds.), Measuring the Unmeasurable. Martinus Nijhoff, Boston, MA,
pp. 587-627.
Kostyniuk, L.P., Kitamura, R., 1984. Trip chains and activity sequences: test of temporal stability. Transportation
Research Record 987, 29-39.
Kwan, M.-P., 1997. GISICAS: an activity-based travel decision support system using a GIS-interfaced computational-
process model. In: Ettema, D.F., Timmermans, H.J.P. (Eds.), Activity-Based Approaches to Travel Analysis.
Elsevier, New York, pp. 263-282.
Kwan, M.-P., 1998. Space-time and integral measures of individual accessibility: a comparative analysis using a point-
based framework. Geographical Analysis 30 (3), 191-216.
Kwan, M.-P., 1999a. Gender, the home-work link, and space-time patterns of non-employment activities. Economic
Geography 75 (4), 370-394.
Kwan, M.-P., 1999b. Gender and individual access to urban opportunities: a study using space-time measures. The
Professional Geographer 51 (2), 210-227.
Kwan, M.-P., 2000a. Human extensibility and individual access to information: a multi-scale representation using GIS.
In: Janelle, D., Hodge, D. (Eds.), Information, Places, and Cyberspace: Issues in Accessibility. Elsevier, Amsterdam
(in press).
Kwan, M.-P., 2000b. Gender differences in space-time constraints. Area (in press).
Kwan, M.-P., 2000c. Analysis of human spatial behavior in a GIS environment: recent developments and future
prospect. Journal of Geographical Systems 2 (1), 85-90.
Kwan, M.-P., Hong, X.-D., 1998. Network-based constraints-oriented choice set formation using GIS. Geographical
Systems 5, 139-162.
Lee, J., Kwan, M.-P., 2000. A 3D data model for representing spatial entities in built environments. Paper presented at
the 96th Annual Meeting of the Association of American Geographers, 4-8 April, Pittsburgh, Pennsylvania.
Lenntorp, B., 1976. Paths in Time-Space Environments: A Time Geographic Study of Movement Possibilities of
Individuals. Gleerup, Lund.
Li, R., 1994. Data structures and application issues in 3-D geographic information systems. Geomatica 48 (3),
209-224.
Lu, X., Pas, E.I., 1999. Socio-demographic, activity participation and travel behavior. Transportation Research A 33
(1), 1-18.
Ma, J., Goulias, K.G., 1997a. An analysis of activity and travel patterns in the Puget Sound transportation panel. In:
Ettema, D., Timmermans, H. (Eds.), Activity-based Approaches to Travel Analysis. Elsevier, Tarrytown, NY,
pp. 189-207.
Ma, J., Goulias, K.G., 1997b. A dynamic analysis of person and household activity and travel patterns using data from
the first two waves in the Puget Sound Transportation Panel. Transportation 24 (3), 309-331.
McCormack, E., 1999. Using a GIS to enhance the value of travel diaries. ITE Journal 69 (1), 38-43.
MacEachren, A.M., Wachowicz, M., Edsall, R., Haug, D., 1999. Constructing knowledge from multivariate
spatiotemporal data: integrating geographical visualization and knowledge discovery in database methods.
International Journal of Geographical Information Science 13 (4), 311-334.
Michelson, W., 1985. From Sun to Sun: Daily Obligations and Community Structure in the Lives of Employed Women
and their Families. Rowman and Allanheld, Totowa, NJ.
Miller, H.J., 1991. Modelling accessibility using space-time prism concepts within geographic information systems.
International Journal of Geographical Information Systems 5 (3), 287-301.
Miller, H.J., 1999. Measuring space-time accessibility benefits within transportation networks: basic theory and
computational procedures. Geographical Analysis 31 (2), 187-212.
Parkes, D.N., Thrift, N., 1975. Timing space and spacing time. Environment and Planning A 7, 651-670.
Pas, E.I., 1982. Analytically derived classifications of daily travel-activity behavior: description, evaluation and
interpretation. Transportation Research Record 879.
Pas, E.I., 1983. A flexible and integrated methodology for analytical classification of daily travel-activity behavior.
Transportation Science 17 (3), 405-429.
Pendyala, R.M., 1997. An activity-based microsimulation analysis of transportation control measures. Transportation
Policy 4 (3), 183-192.
Recker, W.W., McNally, M.G., Root, G.S., 1983. Application of pattern recognition theory to activity pattern analysis.
In: Carpenter, S., Jones, P. (Eds.), Recent Advances in Travel Demand Analysis. Gower, Aldershot, England,
pp. 434-449.
Recker, W.W., McNally, M.G., Root, G.S., 1987. An empirical analysis of urban activity patterns. Geographical
Analysis 19 (2), 166-181.
Saxena, S., Mokhtarian, P.L., 1997. The impact of telecommuting on the activity spaces of participants. Geographical
Analysis 29 (2), 124-144.
Sheppard, S.R.J., 1999. Visualization software brings GIS applications to life. GeoWorld 12 (3), 36-37.
Silverman, B.W., 1986. Density Estimation for Statistics and Data Analysis. Chapman & Hall, London.
Szalai, A. (Ed.), 1972. The Use of Time. Mouton, The Hague.
Tivers, J., 1985. Women Attached: The Daily Lives of Women with Young Children. Croom Helm, London.
Tufte, E.R., 1990. Envisioning Information. Graphics Press, Cheshire, Connecticut.
Tufte, E.R., 1997. Visual Explanations: Images and Quantities, Evidence and Narrative. Graphics Press, Cheshire,
Connecticut.
Weber, J., Kwan, M.-P., 2000. The influence of time-of-day travel time variations on individual accessibility. Paper
presented at the 96th Annual Meeting of the Association of American Geographers, 4-8 April, Pittsburgh,
Pennsylvania.
Wise, S., Haining, R., Signoretta, P., 1999. Scientific visualisation and the exploratory analysis of area data.
Environment and Planning A 31, 1825-1838.
VOLUME II
TRANSPORTATION
RESEARCH
PART C
Abstract
Keywords: Air quality; Geographic information systems; Mobile source emissions; Travel demand modeling
0968-090X/00/$ - see front matter © 2000 Elsevier Science Ltd. All rights reserved.
PII: S0968-090X(00)00005-X
W. Bachman et al. / Transportation Research Part C 8 (2000) 205-229
1. Introduction
The Clean Air Act, as amended in 1990, and other federal legislation and regulations require
metropolitan areas with unacceptable air quality to develop strategies for reducing air pollution.
In planning for attainment of these standards, metropolitan areas establish emissions 'budgets'
that provide benchmarks for gauging attainment progress. Meeting emissions budget limits in
target years often becomes difficult. Metropolitan areas must accommodate the needs of a
growing population and economy, while simultaneously lowering or maintaining levels of ambient
pollutants. Therefore, growing urban areas must continually develop creative strategies to curb
increased pollutant production. Transportation systems contribute significantly to carbon
monoxide (CO), nitrogen oxides (NOx), and hydrocarbon (HC) emissions in urban areas. Estimates
for the amount of pollutants produced by motor vehicles vary from 33% to 50% of NOx, 33% to
97% of CO, 40% to 50% of HC, 50% of ozone precursors, and at least one-fourth of volatile
organic compounds (VOC) (Chatterjee et al., 1997; SCAQMD, 1996; USEPA, 1995; CARB,
1994; USDOT, 1993). Developing measures of effectiveness and subsequent predictions of the
overall impact of control strategies requires an ability to model the relationships between
observable transportation system characteristics and their resulting emissions. In addition, models
that incorporate these relationships must balance input data availability and quality with
predictive power.
Motor vehicle emission rates correlate with a variety of vehicle and engine characteristics
(weight, engine size, transmission type, emission control equipment, etc.), operating modes (idle,
cruise, acceleration and deceleration), and transportation system conditions (road grade,
pavement condition, etc.) (Barth et al., 1996; Guensler, 1994). Emission rates used in models are
estimates of the rate at which different pollutants are emitted in grams per activity unit, such as
grams of CO/s or CO/mile. Different vehicle activities (starting an engine, accelerating, cruising,
etc.) result in different emission rates. Exhaust pollutants produced from starting a vehicle are
correlated to the vehicle's engine characteristics and duration of the engine cool down time
between starts. Running exhaust emissions require additional estimates of dynamic engine
conditions that vary with the way the vehicle is driven. Estimating motor vehicle emissions requires the
ability to predict or measure these activity parameters for an entire region at a level of spatial and
temporal aggregation fitting the scope of anticipated control strategies.
Traditional motor vehicle emissions modeling involves four separate modeling regimes: travel
demand forecasting models, mobile source emissions rate models, photochemical models (for
emission inventories and resulting regional air quality), and microscale models. Travel demand
forecasting models use characteristics of the transportation system and socioeconomic data to
estimate road-specific traffic volumes. Emission rate models employ fleet characteristic data,
operating environment characteristics, and assumptions related to emission control programs to
predict emission rates for the on-road fleet. The travel demand estimates are linked with the
outputs of the emission rate models to predict mobile source mass emissions. Analysts spatially
allocate the mobile source emissions estimates, along with stationary source estimates, to a
regional grid as input to photochemical models. The photochemical models employ emissions
estimates (from all sources) and meteorological data to predict ambient pollutant levels in space and
time. Microscale models such as CALINE or FLINT employ mobile source emissions estimates
and ambient estimates to predict pollutant levels near specific transportation facilities such as
Mobile emissions are intrinsically spatial. Among other things, emission rates vary by location
and engine activity. Fig. 1 shows conceptually how emissions vary as a vehicle operates in space
and time. Each 'block' of elevated or reduced emissions has different predictor variables. When the
engine is off, emissions continue but at a reduced level (evaporative mode). When a vehicle starts
(engine start mode), emissions are high due to the nature of catalysts (they need to reach an
elevated temperature before operating efficiently). After a few minutes, emissions are low unless
interrupted by a sharp power demand from hard acceleration or grade-induced engine load
(running exhaust mode). All of these conditions vary in spatial terms as a vehicle moves along its
path. This relationship between emission rates and location is a primary argument for using GIS
in mobile emissions model development.
Mobile emissions model inputs can be divided into three general categories: fleet activity, fleet
characteristics, and operating conditions. Most regions model emissions using regional
aggregations of all these inputs, and then use estimates of vehicle miles traveled (VMT) calculated for each
grid cell to disaggregate each of the totals. Unfortunately, this practice does not account for the
spatial variability of each of the inputs. For example, current regulatory emissions models assume
a uniform distribution of the vehicle fleet across a region in calculating their emission estimates.
This is not a reasonable assumption in most areas. Fig. 2 shows the census block group
distribution of average vehicle model years in the Atlanta metropolitan area. Clearly, significant spatial
variability of automobile ownership exists across this region. Improving the spatial scale of
emissions modeling would be a major advance that could benefit from the many geoprocessing
capabilities of a robust GIS modeling platform.
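As an illustration only, the VMT-based disaggregation criticized above can be sketched in a few
lines of Python. The function name, the cell identifiers and all numbers are hypothetical and are
not taken from any regulatory model; the sketch assumes a single regional total that is split in
proportion to each grid cell's share of regional VMT.

```python
def disaggregate_by_vmt(regional_total, vmt_by_cell):
    """Split a regional emissions total across grid cells in proportion to VMT.

    Hypothetical helper: each cell receives regional_total * (cell VMT / regional VMT).
    """
    total_vmt = sum(vmt_by_cell.values())
    if total_vmt <= 0:
        raise ValueError("total VMT must be positive")
    return {cell: regional_total * vmt / total_vmt
            for cell, vmt in vmt_by_cell.items()}


# Made-up inputs: 200 kg of CO for the region and VMT for two grid cells.
vmt = {"cell_a": 1000.0, "cell_b": 3000.0}
co_by_cell = disaggregate_by_vmt(200.0, vmt)
print(co_by_cell)
```

Under this scheme two cells with equal VMT always receive equal emissions, whatever the age or
technology of the vehicles actually driven there, which is the loss of spatial variability the text
describes.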
Fig. 2. Atlanta mean registered model year by block group and full model year distribution for two sample ZIP codes.
quality models linking traffic information with a GIS to produce synthesized databases for use in
vehicle emission and air dispersion models. Barros et al. (1998) developed a methodology for
building a GIS-based traffic emission inventory for Portugal, useful for estimating both area and
line sources. Briggs et al. (1997) described the use of a GIS combined with least squares regression
analysis for mapping traffic-related air pollution to generate predictive models of pollution
surfaces based on monitored pollution data and exogenous information. Anderson et al. (1996) also
described the use of a GIS as a tool to illustrate the spatial patterns of emissions and to visualize
the impact congestion has on emissions. The model consisted of an integrated urban landuse
model that interfaced with the emissions rate model MOBILE 5C. The integrated model allowed
the impact of transportation and landuse policy changes to be simulated in terms of their air
quality impact.
One common theme of all of these efforts is the use of GIS as a tool to prepare or process data
related to emissions modeling. None of these earlier efforts used GIS as an integrated modeling
environment capable of estimating emission rates at a user-defined grid cell level.
A research effort that does not involve a commercial GIS platform but has focused on
improving the spatial scale of data inputs into emissions photochemical models is the development
of the TRansportation ANalysis and SIMulation System (TRANSIMS) microsimulation travel
forecasting model (Williams et al., 1999). TRANSIMS processes, stores, and manipulates spatial
data through the use of a powerful spatial database engine with explicit network topology. It uses
an advanced approach of vehicle microsimulation using synthetic populations. The TRANSIMS
approach dramatically improves the spatial scale issue by modeling individual synthetic vehicles
on a second-by-second basis. Unfortunately, TRANSIMS input requirements far exceed those of even
the most detailed regional models. Further, the TRANSIMS model is still under development with many of
its components undergoing calibration and validation. Until TRANSIMS becomes widely
accepted and implemented, there will be continued reliance on existing modeling frameworks. A
GIS-based macroscopic modeling approach that does not rely heavily on regional aggregations
for inputs provides a robust alternative to the current modeling regime as well as to future regional
microscopic models such as TRANSIMS. In Section 2.3, we introduce a conceptual framework for
a GIS-based macroscopic modeling approach.
Elements of any emissions model should include data important for accurately predicting emission
rates, data that are available to the expected user, and data that fit the mitigation tools available
to transportation professionals. GIS, computing power, and data storage capabilities allow this
model design to expand from historical ones focused on simplicity to one focused on
comprehensiveness, usefulness, and flexibility. By removing the concern of processing time and disk
storage, a robust emissions model conceptual framework can be conceived without fear that it
cannot be implemented.
Research suggests that the conceptual framework for a robust model should be modal in
nature. A GIS framework is ideally suited for implementing a modal modeling approach, where
emission rates are a function of specific modes of vehicle operation (engine starts, running
exhaust, enrichment, etc.). Modal models hold the most significant promise for improving model
accuracy and eliminating the various shortcomings of the current highly aggregated approach
(Washington, 1995; Barth et al., 1996). Conceptually, a GIS can be used to estimate mobile source
emissions for different operating modes and store these results on individual layers. The layered
information could then be aggregated to grid cell layers that are compatible with photochemical
models. Once aggregated, total estimates can be calculated by summing across layers. The
summing process in itself would not contribute to the error of the individual layer estimates
because the cell polygons whose attributes are summed across corresponding layers would be
identical in shape, size, and location. Thus the main contributor to error from a spatial processing
standpoint is in the disaggregation (and to some extent the aggregation) of data to grid cells.
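The layer-summing step can be sketched as follows. This is a minimal illustration under stated
assumptions, not the implementation of any particular model: each modal layer is represented as
a plain Python dictionary keyed by grid cell, the layer names mirror the modes named above, and
all emission values are invented. The check that every layer uses the same set of cells stands in
for the requirement that the cell polygons be identical across layers.

```python
def sum_modal_layers(layers):
    """Sum per-cell emissions across modal layers defined on one shared grid.

    layers: dict mapping a mode name to a dict of {grid_cell: grams}.
    Raises ValueError if the layers do not cover exactly the same cells.
    """
    if not layers:
        return {}
    cell_sets = [set(layer) for layer in layers.values()]
    if any(cells != cell_sets[0] for cells in cell_sets[1:]):
        raise ValueError("all layers must be defined on the same grid cells")
    totals = {cell: 0.0 for cell in cell_sets[0]}
    for layer in layers.values():
        for cell, grams in layer.items():
            totals[cell] += grams
    return totals


# Hypothetical per-cell estimates (grams) already aggregated to a 1 x 2 grid.
layers = {
    "engine_start": {(0, 0): 12.0, (0, 1): 3.0},
    "running_exhaust": {(0, 0): 30.0, (0, 1): 45.0},
    "enrichment": {(0, 0): 2.5, (0, 1): 0.5},
}
totals = sum_modal_layers(layers)
print(totals)
```

Because the sum is a cell-by-cell addition over identical cells, it introduces no spatial error of
its own; any error is already present in the per-layer values.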
In 1995, Bachman et al. (1996a) developed a conceptual framework for a modal GIS-based
emissions modeling regime. The conceptual framework benefited from lessons learned in earlier
attempts using a GIS to assist in modeling mobile source emissions. Fig. 3 illustrates this original
conceptual model. The framework attempted to improve on the spatial resolution of inputs, while
adding additional elements that had been shown to have significant impact on emissions but were
not currently considered in regulatory emissions models. For example, several research efforts
have shown that roadway grades are directly related to elevated emissions because they have
significant impacts on engine loads (Cicero-Fernandez et al., 1997; Pierson et al., 1996).
Unfortunately, grades have been largely ignored in current regulatory models.
Portions of the original conceptual framework were implemented into a working prototype
model that included experimental emission rates designed to identify the impacts of changes in
acceleration and engine load. While the prototype was limited (e.g., it did not consider fleet
distribution), it did illustrate several benefits of using a GIS platform for modeling emissions
estimates:
• efficiently manages spatially referenced parameters that affect emissions,
• provides manipulation tools to calculate emissions from the modal parameters,
• allows a 'layered' approach to individual vehicle activity estimation,
• can efficiently aggregate emission estimates into grid cells for input to photochemical models
using topologic overlay capabilities,
• includes a robust set of geocoding tools, such as address matching and global positioning
system linkages to facilitate creation of new and modified databases,
• provides visualization and map-making tools, and
• contains useful links to other software packages such as statistical analysis software, that allow
analysis and manipulation of data beyond the capabilities of a stand-alone GIS.
3. Introducing MEASURE
The GIS conceptual framework and associated prototype have evolved into the Mobile Emission
Assessment System for Urban and Regional Evaluation (MEASURE). While MEASURE model
development and evaluation are ongoing, a discussion of its specific approach and design provides
insights into how the production GIS-based model for public release is being developed. Early
validation and testing have shown promising results. The details of MEASURE model design and
architecture can be found in the USEPA report entitled "A GIS-Based Modal Model of
Automobile Exhaust Emissions" (USEPA, 1998).
MEASURE includes automobile exhaust-related modal vehicle activity measures for different
vehicle conditions including starts, idle, cruise, acceleration, and deceleration. Vehicle technology
characteristics (model year, engine size, etc.) and operating conditions (road grade, traffic flow, etc.)
are included as data inputs. The model outputs emissions by facility type, grid cell, operating mode,
and pollutant type (VOCs, NOx, and CO). Fig. 4 depicts general model processes and the flow of
information. The model is currently undergoing a variety of validation studies to demonstrate that
MEASURE exceeds the predictive capabilities of current models (see Section 3.6 for additional
discussion on validation efforts). A recent study conducted in Atlanta compared how well the
MEASURE and MOBILE emission rate models predict the results of 16 different running exhaust
emission test cycles (bag tests). MEASURE proved to be substantially better than MOBILE in
predicting the cycles (which varied in cycle speeds and accelerations) (Fomunung et al., 2000).
This description focuses on the model components and procedures that demonstrate innovative
uses of GIS and spatial analysis. Other crucial elements, such as new modal emissions rates, are
discussed briefly; in-depth discussions have been published elsewhere (Washington, 1995;
Washington et al., 1997; Fomunung et al., 2000). While not all the indirect relationships are
modeled (e.g., landuse change), an effort was made to include these parameters in anticipation of
future research findings.
The data required to develop a MEASURE model can be divided into five categories:
spatial character, temporal character, vehicle technology, modal activity, and trip generation.
These data are identified as follows:
Spatial character:
• landuse boundaries,
• US census block boundaries,
• traffic analysis zone boundaries,
• roads,
• travel demand forecasting network,
• output grid cell boundaries (user-defined).
Temporal character:
• hour of the day.
Vehicle technology:
• model year,
• engine displacement,
• transmission type,
• fuel delivery technology,
• supplemental air injection system,
• catalyst configuration,
• exhaust gas recirculation.
Modal activity:
• idle,
• cruise,
• acceleration,
• deceleration,
• starts,
• engine off.
Trip generation:
• landuse,
• housing units,
• socioeconomic characteristics (for spatial allocation only),
• home-based work trips,
• home-based shopping trips,
• home-based university trips,
• home-based grade school trips,
• home-based other trips,
• non-home-based trips.
The organization of the MEASURE spatial environment is a function of the format of input
data provided from other sources. Historically, emissions modeling regimes have divided exhaust
emissions into start and non-start (running exhaust) emissions. Most prognostic travel models
provide a traffic analysis zone (TAZ) estimate of the number of trip origins and a line (link)
estimate of road volume and average speed. By defining an engine start as being synonymous with a
trip origin, TAZs become the base spatial entity used for estimating engine start emissions within
MEASURE. Running exhaust emissions occur on the road network, suggesting the network 'link'
as the base spatial entity. Improvements in the spatial resolution of the zonal estimates can be
made outside the travel model by disaggregating trips to smaller zones. In MEASURE, trip
origins are disaggregated by census landuse data and by US census block group household data.
Census blocks can be used to disaggregate trips even further. For example, TAZ trips leaving
from home are spatially allocated to the portions of the TAZ that contain residential land uses
and further weighted by household density determined from census blocks. This process does not
affect the total TAZ home-based origins; the origins are simply allocated to smaller zones within
the TAZ based on those factors. The final 'zone' used for emissions modeling is created through a
series of GIS polygon overlays that intersect TAZs, census block and census block group
boundaries, and US postal code (ZIP code) boundaries.
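The allocation of home-based origins described above can be illustrated with a minimal Python sketch. It assumes that each TAZ has already been split into sub-zones by the polygon overlays and that each sub-zone carries a residential flag and a census-block household count. The function name, the dictionary keys, and the even-split fallback are hypothetical and are not taken from MEASURE itself.

```python
def disaggregate_origins(taz_origins, subzones):
    """Allocate a TAZ's home-based trip origins to its sub-zones.

    subzones: list of dicts with hypothetical keys 'id', 'residential'
    (bool), and 'households' (count derived from census blocks).
    Non-residential sub-zones receive no home-based origins.
    """
    weights = {
        z["id"]: (float(z["households"]) if z["residential"] else 0.0)
        for z in subzones
    }
    total = sum(weights.values())
    if total == 0:
        # Assumption: with no residential households, split evenly.
        return {z["id"]: taz_origins / len(subzones) for z in subzones}
    # Normalizing by the TAZ total keeps the sum of origins unchanged.
    return {zid: taz_origins * w / total for zid, w in weights.items()}


subzones = [
    {"id": "a", "residential": True, "households": 150},
    {"id": "b", "residential": True, "households": 50},
    {"id": "c", "residential": False, "households": 0},
]
allocated = disaggregate_origins(400.0, subzones)
```

Because the weights are normalized within the TAZ, the allocated origins sum to the original TAZ total, which mirrors the statement that disaggregation does not affect total home-based origins.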
The spatial accuracy of the line estimate is improved by conflating the travel demand
forecasting network 'links' to a comprehensive and accurate road database. Travel demand
forecasting model networks are frequently abstract stick networks without intermediate shape points.
Conflating improves the spatial resolution of roadway links and helps to clearly identify the
network access points from each TAZ. Bachman et al. (1996b) describe a GIS conflation
procedure for improving the spatial accuracy of travel forecasting networks. This procedure uses the
GIS's built-in rubber sheeting tools as well as a series of heuristics (rules of thumb) to guide the
automated conflation. A one-to-one correspondence table is also necessary to accurately conflate
complicated road configurations, such as interchanges and closely spaced roads. The Bachman
paper also discusses the potential impacts which conflating a travel demand forecasting network
would have on hot stabilized emissions when the final estimates are aggregated to 1- and 5-km
grid cells suitable for photochemical modeling.
distributions of a local fleet (census blocks within a 3-mile radius of the ramp) were combined, the
fleet distribution at each ramp location could be predicted. While Tomeh's approach to predicting
on-road vehicles proved valid, his research results also indicated that a more accurate prediction
would result if the local fleet could be better defined. Instead of a 3-mile radius, the actual spatial
pattern of observed vehicle home locations was skewed based on the network configuration and
the time of day. If the ramp was an 'off-ramp', there was an upstream concentration of local
vehicles in the afternoon and a regional distribution in the morning (the opposite held true for
on-ramps). In its current form, MEASURE relies on Tomeh's original strategy (3-mile search radius),
but future efforts may include a more advanced approach based on his observations. In addition,
MEASURE adjusts the weightings of regional versus local vehicle distributions based on road
classification. Interstates are assigned a higher regional fraction (assuming that interstates serve
more of a regional set of vehicles), while local roads are assigned a higher local fraction (Bachman
et al., 1998; Tomeh, 1996).
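The weighting of regional versus local fleets by road classification can be sketched as a simple convex blend of two technology-group distributions. The Python example below is an illustration under stated assumptions: the numeric regional fractions, the technology-group labels, and the function name are hypothetical, since the text gives only the direction of the weighting (interstates more regional, local roads more local).

```python
# Hypothetical regional weights by road class; the source gives no numeric values.
REGIONAL_FRACTION = {"interstate": 0.8, "arterial": 0.5, "local": 0.2}


def on_road_fleet(local_dist, regional_dist, road_class):
    """Blend local and regional technology-group fractions for one road segment."""
    w = REGIONAL_FRACTION[road_class]
    groups = sorted(set(local_dist) | set(regional_dist))
    return {
        g: w * regional_dist.get(g, 0.0) + (1.0 - w) * local_dist.get(g, 0.0)
        for g in groups
    }


# Illustrative distributions (fractions of registered vehicles per group).
local = {"older_carbureted": 0.4, "newer_fuel_injected": 0.6}
regional = {"older_carbureted": 0.2, "newer_fuel_injected": 0.8}

interstate_fleet = on_road_fleet(local, regional, "interstate")
local_road_fleet = on_road_fleet(local, regional, "local")
```

With these illustrative inputs the interstate fleet sits closer to the regional distribution and the local-road fleet closer to the 3-mile local distribution, while each blended distribution still sums to one.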
The core prognostic capability of the model rests on the ability of travel demand forecasting
models to accurately predict regional travel. The emission-related vehicle activity estimates
provided by regional travel models are the number and location of peak hour (or daily) trip origins,
the road segment volumes, and the volume-to-capacity-based average speeds (later post-processed
in estimating speed/acceleration distributions). Important activities not provided by most current
models are temporal travel behavior and modal (idle, cruise, acceleration, and deceleration)
operations. In MEASURE, the usable travel model information is translated into the emissions
modeling environment and any missing emission-related parameters are estimated based on data
available.
(cadastral, hydrologic, etc.). Census blocks typically include 50-200 dwelling units. The 1990
estimates of the number of households are available at the census block level. Although these
estimates are dated, they can provide clues to housing density within the TAZ and landuse
designations. This information is used to further spatially disaggregate trips originating from
residential areas. With good landuse and socioeconomic data, various trips can be disaggregated to
smaller zones. Even if the landuse designations are as broad as "residential" and
"non-residential", the spatial resolution of trip generation estimates is improved. This allows for an improved
spatial resolution for engine start estimates.
accurate fractions of vehicle activity in the high power demand areas are selected to estimate the
fraction of emission-specific modal behavior occurring in those instances. Modal profiles have
been developed for interstates, ramps (suspected as high power demand areas), arterials, and
signalized intersections. Sample profiles are shown in Fig. 5. Because vehicle emissions for most
speed and acceleration conditions are relatively constant, it is only important that the assigned
profiles accurately reflect the fraction of vehicle activity under crucial high-emission modes.
Hence, although these speed-acceleration profiles do not perfectly reflect the entire range of speed
and acceleration operations, they do accurately predict the fraction of activity occurring under
high acceleration and high power demand conditions for different levels of congestion (Hallmark
and Guensler, 1999). When this process is conducted for every road segment section, the
distribution of modal behavior can be estimated in space and time.
Fig. 5. Sample speed-acceleration profiles for different road segments along a major interchange in Atlanta.
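As a rough illustration of how a speed-acceleration profile yields the fractions of activity in high-emission modes, the Python sketch below sums the activity in bins that exceed an acceleration threshold or a speed-times-acceleration threshold. Both thresholds, the bin values, and the function name are hypothetical; the text does not state the cut points used in MEASURE.

```python
# Hypothetical thresholds; the source does not state the cut points.
HIGH_ACCEL_MPH_PER_S = 3.0
HIGH_POWER_MPH2_PER_S = 120.0  # speed x acceleration, used here as a power surrogate


def high_emission_fractions(profile):
    """Return (high-acceleration share, high-power share) of total activity.

    profile maps (speed_mph, accel_mph_per_s) bin centers to activity fractions.
    """
    total = sum(profile.values())
    high_accel = sum(
        f for (v, a), f in profile.items() if a >= HIGH_ACCEL_MPH_PER_S
    )
    high_power = sum(
        f for (v, a), f in profile.items() if v * a >= HIGH_POWER_MPH2_PER_S
    )
    return high_accel / total, high_power / total


# Illustrative profile for a single ramp segment.
ramp_profile = {
    (30.0, 0.0): 0.50,
    (30.0, 3.5): 0.10,
    (50.0, 1.0): 0.25,
    (50.0, 3.0): 0.15,
}
accel_share, power_share = high_emission_fractions(ramp_profile)
```

Only the shares in the high-acceleration and high-power bins matter for this purpose, which reflects the point that a profile need not reproduce the full range of operations as long as those fractions are right.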
Roadway facilities are divided into zones and lines corresponding to the previously mentioned
emission modes of engine starts and running exhaust (respectively). Facility activity estimates are
used to allocate emission production to those vector spatial data structures currently used by
transportation planners. By tying emission production estimates to facilities, tasks associated
with research, reporting, validation, or control strategy development are easier. Emission rates for
each portion were developed by reanalyzing vehicle emission tests from a variety of sources (Wolf
et al., 1998; Fomunung et al., 1999).
where E is the emissions for a single facility in grams (CO, HC, or NOx), N the number of
technology groups for the pollutant of interest, TG the fraction of registered vehicles in the zone in the
specified technology group, ER the gram per start emission rate for the specified technology group
and pollutant, and O the number of vehicle trip origins.
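A minimal sketch of how these terms might combine, assuming that start emissions equal the number of trip origins multiplied by the fleet-weighted gram-per-start rate, that is, E = O × Σ (TG_i × ER_i) summed over the N technology groups. That form is inferred from the variable definitions and is an assumption; the function name, group labels, and numeric values in the Python example are hypothetical.

```python
def start_emissions(origins, tech_fractions, start_rates):
    """Engine start emissions in grams for one zone and one pollutant.

    origins: number of vehicle trip origins (O).
    tech_fractions: technology group -> fraction of registered vehicles (TG).
    start_rates: technology group -> grams per start (ER).
    Assumed form: E = O * sum(TG_i * ER_i).
    """
    weighted_rate = sum(
        tech_fractions[group] * start_rates[group] for group in tech_fractions
    )
    return origins * weighted_rate


# Illustrative inputs only; not measured values.
zone_fractions = {"group_1": 0.75, "group_2": 0.25}
zone_start_rates = {"group_1": 2.0, "group_2": 10.0}
zone_start_grams = start_emissions(1000, zone_fractions, zone_start_rates)
```

Under this reading, a small share of vehicles in a high-rate technology group can account for a large share of a zone's start emissions.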
The resulting emissions of CO, HC, and NOx are usually reported for a typical weekday
(Tuesday-Thursday) on an hourly basis. The typical weekday limitation is a result of the travel
demand modeling process, as few models are currently set up to predict weekend or Friday travel.
MEASURE allows other time aggregations to occur since vehicle activity is a direct input into the
model.
network. For these zones, the total travel time, the typical local road speed and acceleration
profile, and the zonal technology characteristics are used as inputs to the emission rate algorithm
(discussed in Section 3.4.3).
where E is the emissions for a single facility in grams (CO, HC, or NOx), N the number of
technology groups for the pollutant of interest, TG the fraction of registered vehicles on the road in
the specified technology group, B the mean FTP Bag 2 emission rate in g/s for the specified
technology group, I the interaction factor for the specific technology combination and estimated
modal conditions, F the constant for each of the three pollutants, and T the total seconds of
travel time for that road segment.
More details on the equations and their validation can be found in Fomunung et al. (1999,
2000).
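One plausible reading of these definitions is a multiplicative form in which the FTP Bag 2 rate of each technology group is scaled by its interaction factor and the pollutant constant, weighted by the on-road fleet fraction, and multiplied by travel time: E = T × Σ (TG_i × B_i × I_i × F). The Python sketch below assumes that form purely for illustration. The actual functional form should be taken from Fomunung et al. (1999, 2000), and all names and numbers here are hypothetical.

```python
def running_exhaust_emissions(travel_seconds, tech_fractions, bag2_rates,
                              interaction, pollutant_constant):
    """Running exhaust emissions in grams for one road segment and pollutant.

    Assumed (not source-confirmed) form: E = T * sum(TG_i * B_i * I_i * F).
    travel_seconds: total seconds of travel time on the segment (T).
    tech_fractions: technology group -> on-road fleet fraction (TG).
    bag2_rates: technology group -> mean FTP Bag 2 rate in g/s (B).
    interaction: technology group -> modal interaction factor (I).
    pollutant_constant: constant for the pollutant (F).
    """
    fleet_rate = sum(
        tech_fractions[g] * bag2_rates[g] * interaction[g] * pollutant_constant
        for g in tech_fractions
    )
    return travel_seconds * fleet_rate


# Illustrative inputs only; not measured values.
segment_grams = running_exhaust_emissions(
    travel_seconds=100.0,
    tech_fractions={"group_1": 0.5, "group_2": 0.5},
    bag2_rates={"group_1": 0.01, "group_2": 0.03},
    interaction={"group_1": 1.0, "group_2": 2.0},
    pollutant_constant=1.0,
)
```

In this sketch the interaction factor is the only term that responds to modal conditions, so segments with more hard acceleration would show higher emissions for the same travel time.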
The role of the emissions inventory module is to convert the facility-based emission estimates
into gridded estimates. Procedurally, the user selects a grid cell size; the software creates a polygon
database of grid cell boundaries, allocates each zone or line (or parts of zones or lines) to its
corresponding grid, sums all emissions for each cell, and finally converts the results to raster data
structures.
Grid cell size is optional for the user but usually is dictated by the subsequent photochemical
model. The gridded results from MEASURE are inputs into other models that predict ambient
pollutant concentrations. Most photochemical models use grid cell sizes of 4-5 km; however, new
designs plan to use 1-km grid cells. MEASURE creates a grid cell boundary database (polygons,
not raster cells) for the study area. Emissions at each facility (zone or line) are converted to a rate
based on the area (zones) or length (roads). Resulting values are therefore in g/square km or
g/km. The grid cell boundaries are used as 'cookie cutters' to identify which facilities or parts of
facilities fall in each grid cell. The facility emissions are then converted back to grams by
multiplying the rates by the new areas or lengths.
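The 'cookie cutter' step can be sketched in Python for the simplest case. The example assumes the zone is an axis-aligned rectangle in kilometre coordinates and that the grid is aligned to the origin; MEASURE itself overlays arbitrary polygons and lines using the GIS, so the function name and the geometry simplification are illustrative assumptions only.

```python
import math


def grid_zone_emissions(zone, grams, cell_size):
    """Split a rectangular zone's emissions among square grid cells.

    zone: (xmin, ymin, xmax, ymax) in km; grams: total zone emissions.
    Returns {(col, row): grams} for each grid cell the zone overlaps.
    """
    xmin, ymin, xmax, ymax = zone
    # Convert the zone total to a rate in g per square km.
    rate = grams / ((xmax - xmin) * (ymax - ymin))
    cells = {}
    for col in range(math.floor(xmin / cell_size), math.ceil(xmax / cell_size)):
        for row in range(math.floor(ymin / cell_size), math.ceil(ymax / cell_size)):
            # Overlap of the zone with this grid cell (the "cookie cutter").
            width = min(xmax, (col + 1) * cell_size) - max(xmin, col * cell_size)
            height = min(ymax, (row + 1) * cell_size) - max(ymin, row * cell_size)
            if width > 0 and height > 0:
                # Convert back to grams using the clipped area.
                cells[(col, row)] = rate * width * height
    return cells


# A 2 km x 1 km zone emitting 200 g, cut by 1-km grid cells.
gridded = grid_zone_emissions((0.5, 0.0, 2.5, 1.0), 200.0, 1.0)
```

Because each clipped piece receives the rate times its own area, the gridded values sum to the original zone total. The same idea applies to road links, with length in place of area.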
The polygon grid cells are then converted to raster data structures with raster cells equivalent
to the size and position of the original polygon. This conversion does not contribute to model
error because each raster cell has exactly the same shape and location as the corresponding
polygon grid cell. The conversion is conducted because the raster database is more efficient at
storing gridded information. The final raster datasets are individual 'layers' of each pollutant
emission 'mode' (totals, engine starts, etc.). MEASURE includes a customized user interface for
querying and visualizing two- and three-dimensional images of the various input and output
databases.
MEASURE was written using 'C' code and ARC/INFO AML. Each of the modules described
previously is controlled by a 'Makefile'. The model requires ARC/INFO software to be resident
on the system, but handles all software access and syntax. A sample model was developed that
predicts grams of CO, HC, and NOx for all zones, lines, 100-m cells, 250-m cells, 500-m cells, and
1-km cells. The study area was the 13-county non-attainment area in Atlanta, Georgia. The
following input datasets were used:
• 1995 Atlanta Regional Commission (ARC) LandUse Data,
• 1990 US Census Summary Tape File (STF) 3a,
• 1994 US Census Topologically Integrated Geographic Encoding and Referencing (TIGER)
File,
• 1995 Updated TIGER Road Database,
• 1996 ARC ARCMAP Road Database,
• 1995 ARC Traffic Analysis Zones,
• 1995 ARC Travel Demand Forecasting Network,
• 1995 ARC Temporal distributions by trip type,
• 1996 Georgia Department of Motor Vehicles Registration Dataset, and
• 1996-97 Georgia Tech Speed and Acceleration Profiles.
The following output files were created by the model:
• Zonal Vehicle Characteristics,
• Road Segment Technology Group Distributions,
• Zonal Vehicle Activity,
• Road Segment Vehicle Activity,
• Zonal Start Emissions (Fig. 6),
• Zonal Running Exhaust Emissions (Fig. 6),
• Road Segment Running Exhaust Emissions, and
• Gridded Emissions (Fig. 7).
Sample model run outputs for Atlanta are demonstrated in Figs. 6 and 7. Spatial input data
for the model run were organized under a single datum and projection system. Major data
preparation steps that were conducted outside the MEASURE domain include geocoding of
the vehicle registration database to ZIP codes, vehicle identification number (VIN) decoding of
the vehicle registration database, generating block group polygons from TIGER, and adding
US Census STF3a housing unit data to the block groups. The travel demand forecasting
network was converted using a software utility developed specifically for that purpose. Some
"clean-up" of this network was necessary using some of the automated techniques described
previously.
Fig. 6. Road segment and zonal emission estimates for 7-8 a.m. in Atlanta.
The model code ('C' and ARC/INFO AML) is organized in a 'make' routine that verifies code
updates, cleans temporary files, and initiates the modules in the appropriate sequence. The
shortest path routines that allocate each engine start zone to the closest major intersection took
the longest to process. The entire run, which took approximately 25 h of continuous processing,
was conducted on a Dell dual-processor 400 MHz Pentium II operating Windows NT with
ARC/INFO resident.
Fig. 7. Total grams of CO mobile emissions in Atlanta from 7-8 a.m. (1-km grid cells).
This initial run for Atlanta did not include modal activity for intersections since signalized
intersection data were not available. Instead, a speed/acceleration profile for a typical signalized
arterial road was used. This 'typical' arterial speed/acceleration profile included observed data
aggregated from several signalized and unsignalized arterial roads. All of the sites observed were
multi-lane facilities. In the model run, all non-interstate, non-ramp multi-lane roads were assigned
this profile as an estimate of modal activity.
The Atlanta model is being used as a basis for validating MEASURE components. One effort
currently underway is a vertical pollutant flux study that involves detailed monitoring of a
metered ramp system on I-75, north of downtown Atlanta. The research team collected traffic flow
data, vehicle classification, fleet technology characteristics (by monitoring license plate data and
later decoding the registration VINs), and speed-acceleration profiles with laser guns on the four
ramps and the adjacent freeway links. The research team is currently analyzing the 18 days of
vehicle activity data and will use the vertical pollutant flux results to compare predicted and
measured emission rates.
Initial comparisons have already been conducted between MEASURE and MOBILE5a to
explore and identify differences in their emission rates. Fig. 8 shows the NOx g/s emission rates for
MOBILE5a and MEASURE by speed and acceleration bin (using an Atlanta regional fleet). The
charts indicate that MEASURE is much more sensitive to changes in acceleration, particularly at
high operating speeds. This is not unexpected because MOBILE emission rates are primarily a
function of average speed, while MEASURE rates are directly influenced by the relationship
between speed and acceleration. In this comparison, the regional fleet distribution was used to
estimate emissions for interstate LOS A-F. Mean emission rates in g/s from the two methods were
within 20% of each other for LOS A. MEASURE emission rates were 50% higher than MOBILE for
LOS B and C, and twice as high for LOS D and E. Emission rates were back to within 20% for LOS F.
Because MEASURE is sensitive to accelerations at high speed, its emission rates were much
higher for moderate congestion levels, where high speeds and variable acceleration resulting from
increased vehicle interaction are typical. Once there is a breakdown in traffic flow, resulting in low
speeds with variable acceleration, MEASURE emission rates drop to levels comparable to
MOBILE.
The significant differences in emission rates between MEASURE and MOBILE, especially at
moderate LOSs, underscore the need for further validation studies.
If EPA approves a GIS-based modal emissions model, such as MEASURE, for regulatory use,
the types of mitigation strategies available to local and state governments will change
dramatically. Under the current modeling system, transportation planners and engineers have only three
ways to reduce mobile emissions: depend upon EPA to pass new vehicle certification standards,
reduce vehicle miles of travel, or optimize average speeds to ranges where emissions based on
average speed estimates are reduced. The choices usually result in reducing the mobility and
accessibility desired by the transportation system users.
If spatially resolved modal models are developed, much more diverse and creative strategies
become assessable. Any strategy that reduces the number of high-emitting vehicles or reduces the
occurrence of hard accelerations and decelerations is expected to reduce mobile emissions.
However, current modeling regimes cannot accurately assess the impact of those changes to
regional air quality. A model framework, such as MEASURE's, makes two significant
improvements in this area: spatial variability becomes an important component of any mitigation strategy
and the modal characteristics allow the assessment of variability in traffic flow, not just average
speed. Reducing traffic volumes may be less important than improving traffic flow through ITS
strategies, signal timing, or even lane additions. The new modal approaches may show that (at
least in the short term) mobility and accessibility can increase as mobile emissions decrease
(Hallmark et al., 2000).
Spatially resolved emissions estimates allow planners to prioritize certain locations for
mitigation strategies because of their disproportional contribution to regional ozone formation.
This is the real value of improved spatial variability. The disproportional contribution may be
the result of topography, landuse, or climatic factors. Regardless, a dollar spent mitigating
mobile emissions in one part of the region may not result in the same reduction if spent in
another part. The spatially resolved estimates at proper resolutions also allow local
transportation planners and traffic engineers to develop sub-regional strategies that help to improve
regional air quality.
4. Future research
Some specific GIS-oriented model design and implementation research projects could improve
the accuracy of MEASURE estimates. While model validation is important in confirming current
capabilities, addressing these theoretical issues could provide significant modeling benefits over
the short term.
• Fraction of total vehicle operation by vehicle type: The registration dataset represents all
light-duty vehicles that are licensed to operate on the road. The actual operating fleet may look quite
different. Older third and fourth vehicles in a household are not expected to be used to the same
extent.
• On-road vehicle distribution search pattern: Additional studies have indicated that the radial
search pattern used in the model could be significantly improved for determining a local
operating fleet. Research into the size and shape of the search pattern will significantly improve the
capability of predicting the on-road fleet distribution.
While the conceptual design of MEASURE is comprehensive, the actual working model is not.
The current model scope is limited to automobile exhaust emissions. Moving to a complete mobile
emissions model involves adding much more information and data. Some of the major items are
listed below.
• On- and off-network grade distributions and impacts: Because road grade has spatial variability
and has significant impact on the load on an engine, it should be included in the research
design. This may mean moving to more detailed modal emission rates that model emissions as a
direct function of engine load rather than as a function of load surrogates.
• More speed/acceleration matrices: Currently, the model is limited to approximately 20 different
profiles. Further refining the subroutines that define speed and acceleration profiles for all road
types and configurations (accounting for influences of weaving sections and other physical
parameters) will provide a more comprehensive view of modal activity.
• Other motor vehicle types: A comprehensive mobile source model must include all vehicle
classifications. Currently, all light-duty vehicles are modeled as automobiles because there are few
vehicle emission tests for sports-utility vehicles and light-duty trucks under a wide variety of
operating conditions. Heavy-duty truck modeling components (load-based) are currently being
added.
• Load-based approach: A new engine load-based modeling approach to predicting emissions will
allow enrichment emissions to be separately identified, an original model design objective.
• Non-exhaust mobile emissions: Exhaust emissions only make up a portion of the overall mobile
emission modes. Evaporative emissions are currently being adapted directly from the
MOBILE5a model and will be upgraded in future models.
• External/internal trips: Currently, external/internal trips are excluded from the models'
predictions of start activity and evaporative emissions.
At the time MEASURE development first began, a number of GIS platforms were evaluated.
ARC/INFO was chosen because of its robust set of spatial analysis tools as well as its widespread
use at Metropolitan Planning Organizations (MPOs) throughout the US. The emergence and
evolution of ArcView and related components, Geomedia, and MapObjects offer attractive and
more affordable GIS platforms for a future version of MEASURE once the ongoing validation
phase is complete. A thorough evaluation of these and other GIS software will need to be
conducted to determine if they are suitably equipped to accommodate the complex spatial modeling
requirements of MEASURE.
5. Conclusion
Traditional transportation emission models have been shown to suffer from problems, such as
highly aggregated datasets, non-representative emission factors, and lack of spatial resolution.
Consequently, there is a desire to move towards a modal approach which relates activity-specific
emission production with corresponding vehicle activity. The special capabilities of a GIS can
greatly simplify the procedures associated with creating, combining, and manipulating the spatial
databases necessary for implementing a modal approach. Further, the GIS can be used to
establish emissions estimates as a function of a number of variables associated with points and areas
(off-network activities) or roadway links (on-network activities). Once established, these improved
spatial estimates can be aggregated to grid cells that are compatible for input into regional
photochemical models. The major modeling deficiencies that exist in current models are addressed
with the implementation of the MEASURE model. The limitations of MEASURE revolve mostly
around the intensity of data required. MEASURE was designed and developed as a research
model that considers all relevant data at the best resolution possible. As research and validation
continue, relationships between variables may be identified which would reduce the quantity and
variety of data. This is critical if MEASURE is to be widely accepted and implemented on a large
scale.
While tailpipe emission reductions for the fleet-at-large will continue to be an important control
strategy for years to come, there is an increasing interest in other methods of controlling mobile
sources. Future mobile-source emission models must be equipped to predict the impact of such
methods as improved traffic flow, improved inspection and maintenance programs, targeted
enforcement of super-emitters, traffic restrictions, and use of alternative fuels. MEASURE has
already been shown to be capable of analyzing these types of policies. For example, MEASURE is
currently being used in Atlanta by the Georgia Department of Transportation to evaluate the
emission impacts of proposed ITS projects. Even with great strides in mobile emission reductions,
there will always be a need to gather comprehensive spatial and temporal distributions of
emissions for urban areas. A GIS-based emissions modeling framework makes this more practical
than ever before.
References
Anderson, W.P., Kanaroglou, P.S., Miller, E.J., Buliung, R.N., 1996. Simulating automobile emissions in an integrated
urban model. Transportation Research Record 1520, Transportation Research Board, Washington, pp. 71-80.
Bachman, W., Sarasua, W., Guensler, R., 1996a. Geographic information framework for modeling mobile-source
emissions. Transportation Research Record 1551, Transportation Research Board, Washington, pp. 123-133.
Bachman, W., Sarasua, W., Washington, S., Guensler, R., Hallmark, S., Meyer, M., 1996b. Integrating travel demand
forecasting models with GIS to estimate hot stabilized mobile source emissions. In: Proceedings of the 1996
AASHTO GIS-T Symposium. American Association of State Highway and Transportation Officials, Kansas City,
pp. 257-268.
Bachman, W., Granell, J., Guensler, R., Leonard, J., 1998. Research needs for determining spatially resolved subfleet
characteristics. Transportation Research Record 1625, Transportation Research Board, Washington, pp. 139-146.
Barros, N., Borrego, C., Lopes, M., Miranda, A., Tchepel, O., 1998. Development of an emissions data base for air
pollutants from mobile sources in Portugal. In: Fourth International Conference on Urban Transport and the
Environment for the 21st Century, pp. 285-294.
Barth, M., An, F., Norbeck, J., Ross, M., 1996. Modal emissions modeling: a physical approach. Transportation
Research Record 1520, Transportation Research Board, Washington, pp. 81-88.
Briggs, D.J., Collins, S., Elliot, P., Fischer, P., Kingham, S., Lebret, E., Pryl, K., Van Reeuwijk, H., Smallbone, K., Van
Der Veen, A., 1997. Mapping urban air pollution using GIS: a regression-based approach. International Journal of
Geographic Information Science 11 (7), 699-718.
Bruckman, L., Dickson, R.J., Wilkonson, J.G., 1992. The use of GIS software in the development of emissions
inventories and emissions modeling. In: Proceedings of the Air and Waste Management Association. Pittsburgh, PA.
California Air Resources Board (CARB), 1994. The land use-air quality linkage: how land use and transportation affect
air quality. CARB, Sacramento, CA.
Chatterjee, A., Wholley, T.F., Guensler, R., Hartgen, D.T., Margiotta, R.A., Miller, T.L., Philpot, J.W., Stopher, P.R.,
1997. Improving transportation data for mobile source emissions estimates. NCHRP Project 25-7, National
Cooperative Highway Research Program, Report 394, Washington.
Cicero-Fernandez, P., Wong, W., Long, J.R., 1997. Fixed point mobile source emissions due to terrain related effects: a
preliminary assessment. In: Proceedings of the Air and Waste Management Association's 90th Annual Meeting.
Toronto, Canada.
Fomunung, I., Washington, S., Guensler, R., 1999. A statistical model for estimating oxides of nitrogen emissions from
light-duty motor vehicles. Transportation Research D 4 (5), 333-352.
Fomunung, I., Washington, S., Guensler, R., Bachman, W., 2000. Performance evaluation of MEASURE emission
factors - comparison with MOBILE. Published in CD-ROM of the Proceedings of the 78th Annual Meeting of the
Transportation Research Board, Washington.
Grant, C., Guensler, R., Meyer, M.D., 1996. Variability of heavy-duty vehicle operating mode frequencies for
prediction of mobile emissions. In: Proceedings of the 89th Annual Meeting of the Air and Waste Management
Association. Pittsburgh, PA.
Guensler, R., 1994. Vehicle emission rates and average vehicle operating speeds. Dissertation, Department of Civil
Engineering, University of California, Davis.
Hallmark, S., O'Neill, W., 1996. Integrating geographic information systems for transportation and air quality models
for microscale analysis. Transportation Research Record 1551, Transportation Research Board, Washington, pp.
133-140.
Hallmark, S.L., Guensler, R., 1999. Comparison of speed-acceleration profiles from field data with NETSIM output
for modal air quality analysis of signalized intersections. Transportation Research Record 1664, Transportation
Research Board, Washington, pp. 40-48.
Hallmark, S.L., Bachman, W., Guensler, R., 2000. Assessing the impacts of improved signal timing as a transportation
control measure using an activity-specific modeling approach. Transportation Research Record, Transportation
Research Board, Washington (in press).
LeBlanc, D.C., Saunders, M., Meyers, M.D., Guensler, R., 1995. Driving pattern variability and impacts on vehicle
carbon monoxide emissions. Transportation Research Record 1472, Transportation Research Board, Washington,
pp. 45-52.
Medina, I.C., Schattanek, G., Nichols, F. Jr., 1994. A framework for integrating information systems in air quality
analysis. In: Proceedings of URISA 1994, 32nd Annual Conference. Milwaukee, WI, p. 339.
Pierson, W.R., Gertler, A.W., Robinson, N.F., Sagebiel, J.C., Zielinska, B., Bishop, G.A., Stedman, D.H., Zweidinger,
R.B., Ray, W.D., 1996. Real-world automotive emissions - summary of studies in the Fort McHenry and Tuscarora
mountain tunnels. Atmospheric Environment 30 (12), 2233-2256.
Siwek, S.J., 1997. Summary of Proceedings, EPA-FHWA Modeling Workshop. Ann Arbor Science, Ann Arbor,
MI.
South Coast Air Quality Management District (SCAQMD), 1996. Current Air Quality. Diamond Bar, CA.
Souleyrette, R.R., Sathisan, S.K., James, D.E., Lim, S., 1992. GIS for transportation and air quality analysis. In:
Proceedings of the National Conference on Transportation Planning and Air Quality. ASCE, New York, NY, pp.
182-194.
Stopher, P., 1993. Deficiencies of travel-forecasting methods relative to mobile emissions. Journal of Transportation
Engineering, ASCE 119 (5), 723-741.
Stopher, P.R., Hartgen, D.T., Li, Y., 1996. SMART: simulation model for activities, resources and travel.
Transportation, vol. 23. Kluwer Academic Publishers, Dordrecht, Netherlands, pp. 293-312.
Tomeh, O., 1996. Spatial and temporal characterization of the vehicle fleet as a function of local and regional
registration mix. Dissertation, School of Civil and Environmental Engineering, Georgia Institute of Technology,
Atlanta, GA.
United States Department of Transportation and US Environmental Protection Agency, 1993. Clean air through
transportation: challenges in meeting national air quality standards.
United States Environmental Protection Agency, 1998. A GIS-based modal model of automobile exhaust emissions.
Report number EPA-600/-98-097, Research Triangle Park, NC.
United States Environmental Protection Agency, 1995. National air pollutant emission trends, 1900-1994. Office of Air
Quality Planning and Standards, Research Triangle Park, NC.
Vonderohe, A.P., Travis, L., Smith, R., Tsai, V., 1993. National Cooperative Highway Research Program Report 359.
Transportation Research Board, Washington.
W. Bachman et al . I Transportation Research Part C 8 (2000) 205-229 22 9
Williams, M.D. , Thayer , G.R. , Smith , L. , 1999 . A compariso n o f emissions estimated in th e TRANSIM S approac h
with those estimated from continuou s speeds and accelerations. Published in CD-ROM o f the 78th Annual Meeting
of the Transportation Researc h Board , Washington.
Wolf, J., Washington, S., Guensler, R., Bachman , W., 1998 . High emitting vehicle characteristics using regression tree
analysis. Transportation Researc h Recor d 1641 , Transportation Researc h Board , Washington, pp. 58-65 .
Washington, S., 1995. Estimation of a vehicular carbon monoxide modal emission model and assessment of an intelligent
transportation technology . Dissertation, Department o f Civil Engineering, University of California, Davis.
Washington, S. , Leonard , J. , Roberts , C. , Young , T. , Botha , J. , Sperling , D. , 1997 . Forecastin g vehicl e modes o f
operation neede d a s inpu t t o moda l emission s models . In : Proceeding s o f th e Fourt h Internationa l Scientifi c
Symposium o n Transport an d Ai r Pollution . Lyon, France.
Abstract
The purpose of this paper is to develop and evaluate a hybrid travel time forecasting model with geographic
information systems (GIS) technologies for predicting link travel times in congested road networks.
In a separate study by You and Kim (cf. You, J., Kim, T.J., 1999b. In: Proceedings of the Third Bi-Annual
Conference of the Eastern Asia Society for Transportation Studies, 14-17 September, Taipei, Taiwan), a
non-parametric regression model has been developed as a core forecasting algorithm to reduce computation
time and increase forecasting accuracy. Using the core forecasting algorithm, a prototype hybrid forecasting
model has been developed and tested by deploying GIS technologies in the following areas: (1)
storing, retrieving, and displaying traffic data to assist in the forecasting procedures, (2) building road
network data, and (3) integrating historical databases and road network data. This study shows that
adopting GIS technologies in link travel time forecasting is efficient for achieving two goals: (1) reducing
computational delay and (2) increasing forecasting accuracy. © 2000 Elsevier Science Ltd. All rights
reserved.
Keywords: GIS; Travel time forecasting; Non-parametric regression; Historical database; Machine learning; Parameter
adjustment
0968-090X/00/$ - see front matter © 2000 Elsevier Science Ltd. All rights reserved.
PII: S0968-090X(00)00012-7
J. You, T.J. Kim / Transportation Research Part C 8 (2000) 231-256
1. Introduction
Many researchers have endeavored to develop reliable travel time forecasting models using
various methods including historical profile approaches, time series models, neural networks,
non-parametric regression models, traffic simulation models, and dynamic traffic assignment (DTA)
models (Sen et al., 1997; Ben-Akiva et al., 1995; Gilmore and Abe, 1995; Peeta and Mahmassani,
1995; Ben-Akiva et al., 1994; Mahmassani et al., 1991; Davis et al., 1990). Lessons learned from
these experimental efforts evince that future travel times are difficult to estimate using a single
forecasting method.
Therefore, the main purpose of this study is to develop and evaluate a hybrid travel time
forecasting model in a comprehensive framework by adopting geographical information systems
(GIS) technologies. Upon reviewing various functions of GIS-based applications (Nygard et al.,
1995; Choi and Kim, 1994; Azad and Cook, 1993; Gillespie, 1993; Ries, 1993; Loukes, 1992;
Abkowitz et al., 1990), the three important roles of GIS, data management, technology
management, and information management, have been employed to support the operation of the
hybrid forecasting model:
• Data Management. The hybrid model requires various types of traffic data including traffic
speed, traffic volume, occupancy rate, number of lanes, and so forth. Therefore, GIS should
be able to provide mechanisms for data aggregation and manipulation.
• Technology Management. The hybrid model requires various functions including display,
computation, and analysis. Therefore, GIS should provide flexible tools to integrate and customize
the required functions. Moreover, GIS should support travel time forecasting models to deal
with various types of raw traffic data from loop detectors, mobile communication, and global
positioning systems (GPS).
• Information Management. After forecasting future travel times, the results are validated and
classified so that traffic management and information centers could utilize them to manage
network traffic. It is expected that GIS will eventually become an information gateway to the
public, through various types of communication tools, such as the internet, telecommunications,
broadcasting, etc.
For a more reliable and operational travel time forecasting model, this study focuses on pursuing
the following objectives:
• developing a historical database using traffic data collected from loop detectors and probe
vehicles;
• devising a possible way of integrating historical databases with road network data; and
• developing and evaluating a hybrid travel time forecasting model, which adopts non-parametric
regression techniques in conjunction with GIS technologies.
In this study, three distinctive types of travel time forecasting models are considered in
conjunction with GIS technologies as shown in Fig. 1. They are: (1) Type I - the forecasting model
"including" GIS, (2) Type II - the forecasting model "connected to" GIS, and (3) Type III - the
forecasting model "within" GIS. In general, these three integration methods fall within the
category of close combination of information systems, which technically combines subsystems in an
integrated framework. However, these integration methods show different levels of system
performance, integration difficulty, and system modification (Kriger and Schlosser, 1992).
Type I represents a forecasting model that includes GIS functions. In general, it is believed to
be difficult to build customized computer programs for GIS functions due to the complicated
mechanisms of GIS data. To ease the difficulty of developing Type I applications, many vendors
are currently offering object-oriented GIS function libraries and ActiveX controls. MapObjects,
NetEngine (ESRI), MGE (Intergraph), GISDK (Caliper Corporation), MapX, and MapBasic
(MapInfo) are some examples of this category. These object libraries and controls allow users to
perform spatial and attribute-based queries, communicate with external applications, and build
custom application interfaces in conjunction with object-oriented development environments such
as Visual Basic, Visual C++ (Microsoft), and Delphi (Inprise Corporation). In developing Type I
applications, it is possible to adopt necessary GIS components selectively. As a result, applications
in this category could be more efficient than other methods, although this would be a difficult task
to implement.
By combining GIS and a forecasting model, Type II could be implemented by using customized
software interfaces. These customized interfaces can be developed using any software development
tools, as seen in Choi and Kim (1994; 1996), and You and Kim (1999a). There are several
transportation modeling software packages that have such interfaces to access GIS data formats,
such as ESRI's coverage and shape files. TP+ and Tranplan DBC (Urban Analysis Group) can
directly access ESRI's shape files. This is done in the same manner as ARC/INFO and ArcView
GIS can access EMME/2 (INRO) data (Lussier and Wu, 1997). Moreover, several transportation
agencies have developed customized hybrid systems that interface between GIS and transportation
models. Santa Clara County, CA, for example, has integrated Tranplan with ARC/INFO to
accomplish transportation demand modeling and analysis (Lockfeld and Speed, 1993). Likewise,
Mesa County, CO has developed a countywide traffic model with MINUTP (Urban Analysis
Group) and ARC/View. To utilize car and public transport time matrices and population statistics
to perform an accessibility analysis, both ARC/INFO and TRIPS (MVA) have been combined
(Holm and Stavanger, 1997). A research group at Los Alamos National Laboratory (LANL) has
developed the TRansportation ANalysis and SIMulation System (TRANSIMS), which is a part
of transportation model improvement program (TMIP). TRANSIMS has been developed to
provide transportation planners accurate, complete information on traffic impacts, congestion,
and pollution (Bush, 1999; Nagel et al., 1998). Although TRANSIMS utilizes ARC/INFO and
ArcView, it largely uses the C++ development environments for simulation programs (Berkbigler
et al., 1997).
Type III applications can be implemented by developing a forecasting model within GIS using
built-in macro languages, such as arc macro language (AML) in ARC/INFO, Avenue in ArcView
GIS, and GISDK (Caliper Corporation). MGE macro language (Intergraph), unlike the
built-in macro languages, utilizes the PERL macro language, which can be used on most hardware
platforms, such as UNIX, NT, and DOS. Nevertheless, it is understood that these macro
languages are less effective in comparison to object-oriented development environments. In
addition, there are some commercially available software packages that include transportation
modeling programs, such as TransCAD (Caliper Corporation) and UfosNET (RST International).
The hybrid model is designed as illustrated in Fig. 2. It includes five modules: graphic user
interface (GUI), real-time traffic data collection, database, forecasting, and machine learning
(ML) modules.
3.1. GUI
The GUIs accept user input, such as origin and destination and model parameters, and display
model output. Users execute the forecasting tasks and monitor outputs through GUIs.
3.2. Real-time traffic data collection
The real-time traffic data collection module collects real-time traffic data transmitted from such
devices as loop detectors and probe vehicles and sends them to the database module.
3.3. Database
"15-min ahead" travel time forecast. Ideally, the computation time should be less than a minute
or two. Thus, our priority in developing the hybrid model is to minimize the computation time of
interaction between GIS network topology and historical databases.
A simpler approach to create an efficient interface between historical databases and road
networks is shown in the example in Fig. 3. The figure shows a transportation network that
consists of 6 nodes and 7 links and the arc topology table that corresponds to the network. The
arc table shows that Link 1 consists of Node 1 and Node 2, and that its
direction is from Node 1 to Node 2. If we assume that Link 1 is a two-way road, there must be a
record for the other direction of Link 1 from Node 2 to Node 1. If all 7 links are bi-directional,
the historical traffic database requires 14 record fields to store the traffic information as shown in
Table 1, whose entries give the traffic volume on each directed link (for example, Link 1) at each
time point. By combining the typical arc topology table (as shown in Fig. 3) with the
historical traffic database (as shown in Table 1), only half of the traffic data would be connected to
the arc topology table. In order to avoid this problem, an interfacing algorithm is devised to make
the connection possible among the 14 links in Table 2 and the arc topology table in Fig. 3. Unlike
the conventional arc topology table (Fig. 3), this expanded arc topology table (Table 2) includes
the bi-directional information of the sample network, and thus it can be connected to the
historical database using the identification field, Historical_DB_ID. The search algorithm for
connecting GIS road network data to historical databases is simple enough to identify
corresponding links between the two (network data and historical data) as shown in Fig. 4. Since
the algorithm is efficient for computation, we did not have to use any other linear referencing
systems, such as dynamic segmentation, for this particular application.
Table 1
Historical traffic data corresponding to the sample network in Fig. 3
Table 2
Expanded arc topology table
Link_ID  Historical_DB_ID  From_Node#  To_Node#
1        L1                1           2
2        L2                1           3
3        L3                3           4
4        L4                2           4
5        L5                3           5
6        L6                5           6
7        L7                4           6
1        L8                2           1
2        L9                3           1
3        L10               4           3
4        L11               4           2
5        L12               5           3
6        L13               6           5
7        L14               6           4
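The expanded arc topology table can be illustrated with a short Python sketch. This is only an illustration under stated assumptions, not the authors' interfacing algorithm: the names `Arc`, `expand_arcs`, and `build_index` are hypothetical, and the node pairs and Historical_DB_ID values are copied from Table 2. Each one-way arc is duplicated in the reverse direction, and a dictionary keyed by (from node, to node) gives a direct lookup of the historical database record.

```python
from typing import NamedTuple


class Arc(NamedTuple):
    link_id: int
    historical_db_id: str
    from_node: int
    to_node: int


# Conventional arc topology of the sample network: (Link_ID, From_Node, To_Node).
BASE_ARCS = [(1, 1, 2), (2, 1, 3), (3, 3, 4), (4, 2, 4),
             (5, 3, 5), (6, 5, 6), (7, 4, 6)]


def expand_arcs(base_arcs):
    """Return forward arcs followed by reversed arcs, as in Table 2."""
    n = len(base_arcs)
    forward = [Arc(link, f"L{i + 1}", a, b)
               for i, (link, a, b) in enumerate(base_arcs)]
    reverse = [Arc(link, f"L{n + i + 1}", b, a)
               for i, (link, a, b) in enumerate(base_arcs)]
    return forward + reverse


def build_index(arcs):
    """Map each directed node pair to its historical database record ID."""
    return {(arc.from_node, arc.to_node): arc.historical_db_id for arc in arcs}


arcs = expand_arcs(BASE_ARCS)
index = build_index(arcs)
print(index[(1, 2)], index[(2, 1)])  # forward and reverse records of Link 1
```

A dictionary lookup is one simple way to keep the link between network topology and historical records fast; the original system may have used a different search structure.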
Fig. 4. A schematic chart of the search algorithm for connecting GIS road network data to historical traffic database.
among probe vehicle data samples. One problem associated with data samples is that some probe
vehicles are delivery trucks that occasionally report exceptionally large values of travel times (i.e.,
outliers) between intersections due to delivery and pick-up of packages.
section. Fig. 5 illustrates a simplified concept of non-parametric regression, which adopts the
k-Nearest Neighbor (k-NN) smoothing method. A brief explanation of the model is given below.
Suppose that part of a time series represents a non-linear system. By segmenting the
original time series into a number of finite levels, it is possible to obtain local linear subsystems. A
group of past cases similar to the present condition is identified (parameter K in the required input
data), and the slope of each past case is compared with the slope of the present condition
(parameter k in the required input data). From the resulting k sets of similar cases, the future travel
time can be estimated.
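The two-stage selection described above can be sketched in Python as follows. This is a minimal illustration, not the authors' core algorithm: the function names, the Euclidean distance used for the first stage, and the averaging of the selected cases' future values are all assumptions made for the example. The first stage keeps the K past segments closest to the current segment, the second stage re-ranks them by slope difference and keeps k of them.

```python
from statistics import mean


def slope(segment):
    """Least-squares slope of a segment against its index 0..n-1."""
    n = len(segment)
    x_bar = (n - 1) / 2
    y_bar = mean(segment)
    num = sum((i - x_bar) * (y - y_bar) for i, y in enumerate(segment))
    den = sum((i - x_bar) ** 2 for i in range(n))
    return num / den


def knn_forecast(history, current, horizon, big_k, small_k):
    """Forecast the value `horizon` steps after `current` from similar past segments."""
    if small_k > big_k:
        raise ValueError("small k must not exceed large K")
    m = len(current)
    candidates = []
    for start in range(len(history) - m - horizon + 1):
        window = history[start:start + m]
        dist = sum((a - b) ** 2 for a, b in zip(window, current)) ** 0.5
        future = history[start + m + horizon - 1]
        candidates.append((dist, window, future))
    if not candidates:
        raise ValueError("history is too short for this segment length and horizon")
    # Stage 1: the K past segments closest to the current condition.
    nearest = sorted(candidates, key=lambda c: c[0])[:big_k]
    # Stage 2: re-rank by slope similarity and keep k of them.
    current_slope = slope(current)
    selected = sorted(nearest, key=lambda c: abs(slope(c[1]) - current_slope))[:small_k]
    return mean(c[2] for c in selected)


history = [10, 12, 14, 12, 10, 12, 14, 12, 10, 12, 14, 12]
print(knn_forecast(history, current=[10, 12], horizon=1, big_k=3, small_k=2))
```

On this toy periodic series, the three past segments equal to `[10, 12]` are each followed by 14, so the forecast is 14.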
Table 3
Domains of model parameters
Parameters                  Type      Domain                            Unit
Forecasting range           Discrete  {15, 30, 45, 60}                  Minute
Search data segment length  Discrete  {15, 30, 45, 60}                  Minute
Day of the week             Binary    {Consider, Ignore}                -
Search range                Discrete  {1, 2, 3}                         Hour
Large K                     Discrete  {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}   -
Small k                     Discrete  {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}   -
Local estimation method     Binary    {Local averaging, Local fitting}  -
Data preprocessing          Binary    {Wavelet, Outlier detection}      -
• Search range is used to limit the size of the search space in historical databases. The larger the
search range, the longer the search time. Domain ranges are ±1, ±2, or ±3 h.
• Large K is used to limit the number of similar patterns selected from past data for forecasting in
the k-NN smoothing method of the non-parametric regression module within the search range. 'K'
sets of similar cases are selected based on the similarity between the current traffic condition
and the historical database. Domain ranges from 1 to 10.
• Small k is the subset of the large K. It is used to limit the number of similar patterns to the
current condition within 'K' sets of similar cases. The large 'K' sets of similar cases are re-ranked
after comparing the similarity between the two: the slope of the current traffic pattern and each
of the slopes of 'K' sets of similar cases. Domain ranges from 1 to 10, but k values should not be
larger than the K value.
• Local estimation method is used to calculate travel times from 'k' sets of similar cases. Basically,
each similar case is composed of a number of continuous data points. There are two methods
for estimating the travel time representing the 'k' sets. Local averaging method uses only the last
set of data points of traffic data for estimating the mean, ignoring all other data points in the
set, while local fitting method uses all data points to delineate a local linear trend line using the
least-squares estimation.
• Data preprocessing assists in filtering noise from traffic data. Users can select either the wavelet
transform technique or an outlier detection algorithm of a robust estimation method.
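The difference between the two local estimation methods can be made concrete with a small Python sketch. It rests on one reading of the description above and is not taken from the original implementation: `local_averaging` averages only the last data point of each of the k similar cases, while `local_fitting` pools all points of the k cases, fits one least-squares line against the within-case index, and extrapolates it `steps_ahead` beyond the last index. Both function names and the pooling choice are assumptions.

```python
from statistics import mean


def local_averaging(cases):
    """Average the last data point of each similar case; other points are ignored."""
    return mean(case[-1] for case in cases)


def local_fitting(cases, steps_ahead=1):
    """Fit one least-squares trend line to all points and extrapolate it."""
    points = [(i, y) for case in cases for i, y in enumerate(case)]
    x_bar = mean(x for x, _ in points)
    y_bar = mean(y for _, y in points)
    sxx = sum((x - x_bar) ** 2 for x, _ in points)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    b = sxy / sxx                      # slope of the local trend line
    a = y_bar - b * x_bar              # intercept
    last_index = max(len(case) for case in cases) - 1
    return a + b * (last_index + steps_ahead)


cases = [[10, 12, 14], [11, 13, 15]]   # k = 2 similar cases, travel times in minutes
print(local_averaging(cases))          # mean of the last points 14 and 15
print(local_fitting(cases))            # trend line evaluated one step ahead
```

In this toy input the cases are still rising, so the fitted trend line yields a higher estimate than the plain average of the last points.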
Table 4
Data structur e o f training sample s
3.4.4. Evaluation
The evaluation module will be activated when observed actual travel times become available.
There is a time delay because the observed actual travel times are not available at the time when a
user orders a forecasting task. Based on the evaluation result, if the a priori given acceptable error
margin has not been met, the hybrid forecasting model activates the ML module to readjust
parameter values. If the result is acceptable, it completes a forecasting task and waits for the next
task.
3.5. The ML module
The ML module in the hybrid forecasting model readjusts model parameter values to produce
more accurate forecasting results. With this objective, two tasks are performed: generating training
samples beforehand and learning parameters afterward with the generated training samples. In the
first instance, training samples are generated as input to the ML module. The training samples
contain estimated forecasting errors with different combinations of parameter values, as shown in
Table 4, which lists the nth parameter value of the mth sample together with F_m, the mth estimated
travel time, and O_m, the mth observed actual travel time. As described before, when the hybrid model
detects an unsatisfactory prediction result, training samples can be generated.
Upon generating training samples, the learning process is initiated to identify the lowest
forecasting error from each parameter value. In the ML module, as shown in Fig. 6, the input
training samples are ranked by the average errors in ascending order. By combining parameters
with the least errors, a combination of parameter values (i.e., a linear string of numbers) is
created, and the error for this newly combined parameter is estimated. If the forecasting error is
within the a priori given acceptable margin, then the learned parameter values are applied to the
hybrid model, and the learning process is terminated. If the forecasting error is greater than the
acceptable error margin, changing one of the parameter values mutates the combination. The
order of mutation is based on the size of average forecasting errors within parameter values under
consideration. In this manner, the learning processes are iterated until a combination that satisfies
the decision rule (i.e., the error level should be smaller than the acceptable error margin) is found.
Finally, if this iterative process fails to find any combination that meets the decision rule, the
learning process attempts to produce a randomly selected combination among the training
samples with relatively small forecasting errors. Once the random generation is completed, the
user has the choice to accept or reject the result.
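The learning loop described above can be sketched roughly in Python. Everything specific here is an assumption made for illustration: the function names, the representation of a training sample as a `(parameters, error)` pair, the user-supplied `evaluate` callback that returns the forecasting error of a combination, single-value mutations starting from the best combination, and the choice of the lowest-error quarter of samples for the random fallback. The original module may differ in all of these details.

```python
import random
from collections import defaultdict
from statistics import mean


def rank_values(samples):
    """For each parameter, rank its values by average error, ascending."""
    errors = defaultdict(lambda: defaultdict(list))
    for params, err in samples:
        for name, value in params.items():
            errors[name][value].append(err)
    return {name: sorted(((mean(errs), value) for value, errs in by_value.items()),
                         key=lambda pair: pair[0])
            for name, by_value in errors.items()}


def learn(samples, evaluate, margin, rng=None):
    """Return a parameter combination whose error is below `margin`, if one is found."""
    ranking = rank_values(samples)
    # Combine the value with the least average error for every parameter.
    best = {name: ranked[0][1] for name, ranked in ranking.items()}
    if evaluate(best) < margin:
        return best
    # Mutate one parameter value at a time, trying low-error alternatives first.
    mutations = sorted(((avg, name, value)
                        for name, ranked in ranking.items()
                        for avg, value in ranked[1:]),
                       key=lambda item: item[0])
    for _, name, value in mutations:
        candidate = {**best, name: value}
        if evaluate(candidate) < margin:
            return candidate
    # Fallback: a random pick among training samples with relatively small errors.
    rng = rng or random.Random(0)
    good = sorted(samples, key=lambda s: s[1])[:max(1, len(samples) // 4)]
    return dict(rng.choice(good)[0])


samples = [({"K": 1, "method": "avg"}, 12.0), ({"K": 2, "method": "avg"}, 10.0),
           ({"K": 1, "method": "fit"}, 9.0), ({"K": 2, "method": "fit"}, 15.0)]
accept_only = {"K": 2, "method": "avg"}
found = learn(samples, lambda p: 5.0 if p == accept_only else 20.0, margin=10.0)
print(found)
```

In the toy run the initial best combination is rejected, the first mutation is rejected as well, and the second mutation meets the margin. As in the text, a combination returned by the random fallback would still be presented to the user for acceptance or rejection.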
Table 5
Development tools for Type I integration
Functions                                    Development tools
Supporting user inputs and system controls   Visual BASIC
Displaying networks                          ODE ARCPLOT ActiveX Control
Managing real-time and historical data       ODE ARCPLOT ActiveX Control
Representing traffic networks                ODE ARCPLOT ActiveX Control
Simulating real-time data inputs             Visual BASIC
Screening and filtering raw traffic data     MATLAB: wavelet transform; Visual BASIC: robust estimation
Performing predictions                       Visual BASIC and Visual C++
Managing forecasting model executions        Visual BASIC
Evaluating predicted results                 OLE MSGraph object
4.2. GUI
The main GUI for the forecasting model is shown in Fig. 7. There are three other important
GUIs in the model. GUIs for evaluation are shown in Figs. 8 and 9, and the GUI for the ML
module is shown in Fig. 10. Two different methods were carried out to evaluate the performance
of the model: forecasting based on continuous time points and forecasting based on randomly
selected discrete time points. As shown in Fig. 8, the continuous forecast simulates uninterrupted
predictions for all points of time from a prespecified simulation starting time. On the
other hand, the random forecast assists in predicting travel time using randomly generated
points of time as shown in Fig. 9. Both GUIs utilize OLE objects to draw charts and show
output data. Fig. 10 shows the ML GUI that assists in enhancing the hybrid model's forecasting
accuracy.
4.3. Data
For the implementation of the hybrid model, raw traffic data that come from loop detectors
installed and managed by the Korea Highway Corporation (KHC) are used. KHC is currently
collecting various real-time traffic data through loop detectors and other traffic surveillance
systems through an information highway network. In this study, 30-s interval real-time traffic
data, transmitted from 8 loop detectors between 11 and 17 March 1996, are used. Collected traffic
data include traffic volumes, occupancy rates, and traffic speeds. These 8 detectors cover a total
length of 114.3 km between PanGyo I/C and ChungWon I/C, as shown in Fig. 11.
The hybrid model has also been applied for arterial road networks in forecasting travel times.
Raw arterial traffic data have been acquired from the LG Traffic Information Systems (LGTIS),
Seoul, Korea. LGTIS has developed an advanced traveler information system called Road Traffic
Information Systems (ROTIS) and has collected real-time arterial traffic information using probe
vehicles and roadside beacons since 1998. As of 1999, there are approximately 2500 roadside
beacons on major intersections in Seoul, Korea. These roadside beacons relay traffic information
that is transmitted from probe vehicles to the company's traffic information center. In this study,
arterial traffic data come from 6 roadside beacons between 13 and 19 February 1999. These 6
beacons cover the total length of 5.3 km for 7 road links on 5 major arterial roads. Transmitted
arterial data include probe vehicles' driven distances, driven times, and stopped times for each
link. The locations of roadside beacons are shown in Fig. 12 along with other roadside beacons
managed by the company.
A real-time traffic data simulator simulates traffic data transmissions from the traffic data
collection module. For the performance evaluation, a real-time traffic data simulator is devised to
simulate as if online traffic data are transmitted from both loop detectors and roadside beacons.
Thus, it replaces the traffic data collection module in Fig. 2. For the simulation of real-time traffic
data, parts of historical data are taken and assumed as if these are real-time data for both arterials
and highways. In other words, the system has been implemented using parts of the historical
databases to predict the condition at time t using data from earlier times of t(-1), as shown in
Fig. 13. In the figure, the calendar time of t(-1) is the same time point as the time t at the
simulated calendar time.
Fig. 9. GUI for forecasting travel time based on randomly selected discrete time points.
Three different measures of effectiveness are used in this research for evaluating the perfor-
mance of the hybrid forecasting model: root mean square error (RMSE), mean absolute percent
error (MAPE), and correlation coefficients. RMSE is useful for understanding the deviation
between observed and forecasted values for each forecasting output. MAPE is calculated to
understand the overall performance of the hybrid model. Correlation coefficients (ρ) are calcu-
lated to identify the relationships between observed and forecasted values (i.e., the closer the value
of ρ to 1, the better the performance of the forecasting model).
where x_i denotes the observed data, x_mean the mean of the observed data, ρ the correlation
coefficient, and σ the standard deviation of the observed data x_i; the predicted data, the mean of
the forecasted data, and the standard deviation of the forecasted data are denoted analogously.
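The three measures of effectiveness can be computed as in the following Python sketch. The formulas are the standard textbook definitions of RMSE, MAPE, and the Pearson correlation coefficient, which are assumed here to match the ones used in the study; the function names and the symbol `pred` for the forecasted values are introduced only for this example.

```python
from math import sqrt


def rmse(obs, pred):
    """Root mean square error between observed and forecasted values."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(obs, pred)) / len(obs))


def mape(obs, pred):
    """Mean absolute percent error, in percent; observed values must be non-zero."""
    return 100 * sum(abs(x - y) / abs(x) for x, y in zip(obs, pred)) / len(obs)


def correlation(obs, pred):
    """Pearson correlation coefficient between observed and forecasted values."""
    n = len(obs)
    mean_obs = sum(obs) / n
    mean_pred = sum(pred) / n
    cov = sum((x - mean_obs) * (y - mean_pred) for x, y in zip(obs, pred))
    sd_obs = sqrt(sum((x - mean_obs) ** 2 for x in obs))
    sd_pred = sqrt(sum((y - mean_pred) ** 2 for y in pred))
    return cov / (sd_obs * sd_pred)


observed = [100.0, 200.0, 300.0]    # illustrative travel times in seconds
forecast = [110.0, 190.0, 300.0]
print(rmse(observed, forecast), mape(observed, forecast), correlation(observed, forecast))
```

For this toy input the MAPE is 5%, since the relative errors are 10%, 5%, and 0%.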
Fig. 14. Forecasting based on randomly selected discrete time points with arterial data (origin: B1173; destination:
B1179).
Fig. 15. Forecasting based on randomly selected discrete time points with highway data (origin: PanGyo; destination:
SinGal).
Travel times are predicted using randomly selected discrete time points from the historical
databases that are generated using a random number generator. In general, travel conditions
measured during daytime are very different from those measured during nighttime on a particular
network. With the historical databases that reflect such variations of changing traffic conditions,
this approach assists in analyzing whether the hybrid model fulfills reliable predictions for various
points of time.
To accomplish random forecasting tests, 200 points of time have been randomly generated.
Fig. 14 shows the forecasting results for "15-min ahead" travel times using the arterial historical
database. The error measures indicate that the model forecasts travel time with about 10% error
margin of MAPE and RMSE. The correlation coefficients (ρ) for both cases (two different times at
the same location) are reported in Fig. 14.
The model was then applied to the highway historical database for predicting
"15-min ahead" travel times, as shown in Fig. 15. The error measures indicate that the
hybrid forecasting model performs better with the highway historical database than with the
arterial historical database. For both cases of the highway data (applying the model for the two
different times at the same location), MAPE is less than 3%, and the correlation coefficients
are close to 1.
Fig. 16. Forecasting based on continuous time points with arterial data (origin: B1173; destination: B1179).
In this simulation, travel times are predicted using continuous travel time data from a given
point of time. This is to verify if the hybrid model predicts any biased results. This analysis is
particularly useful in understanding whether the hybrid model corresponds to continuously
changing traffic conditions on highways or arterial road networks.
Inspecting Figs. 16 and 17, it is evident that there is no specific tendency of underestimation or
overestimation for "15-min ahead" travel time forecasting. As in the previous simulation, this
analysis indicates that the hybrid model produces a relatively high accuracy of less than 10% error
in MAPE with the arterial historical data and a very high accuracy of less than 3% error in MAPE
with the highway historical data.
Fig. 17. Forecasting based on continuous time points with highway data (origin: PanGyo; destination: SinGal).
For the performance evaluation, we used the following number of parameter values: forecasting
range (15, 30, 45, and 60 min), search data segment length (15, 30, 45, and 60 min), day of the
week (yes or no), search range (±1, ±2, and ±3 h), K (1-10), k (1-10), local estimation methods
(the average or the fitting), and data preprocessing (the wavelet or the outlier detection analysis).
The total possible combination of mutation from the input parameter values is 21,120
(4 × 4 × 2 × 3 × 55 × 2 × 2). Among these possibilities, we have implemented the model 10,560
minimum computation time in implementing the forecasting model, believing that fast computation
times are very important in designing a hybrid forecasting model.
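The figure of 21,120 combinations can be checked by enumeration. The factor 55 is the number of (K, k) pairs with K and k between 1 and 10 and k no larger than K, that is 1 + 2 + ... + 10. The Python sketch below is only a verification aid; the constant names are hypothetical and the value labels follow Table 3.

```python
from itertools import product

FORECASTING_RANGE = (15, 30, 45, 60)          # minutes
SEGMENT_LENGTH = (15, 30, 45, 60)             # minutes
DAY_OF_WEEK = ("consider", "ignore")
SEARCH_RANGE = (1, 2, 3)                      # hours
# Valid (K, k) pairs: small k must not exceed large K.
K_PAIRS = [(big_k, small_k)
           for big_k in range(1, 11)
           for small_k in range(1, big_k + 1)]
LOCAL_ESTIMATION = ("local averaging", "local fitting")
PREPROCESSING = ("wavelet", "outlier detection")


def all_combinations():
    """Iterate over every admissible combination of model parameter values."""
    return product(FORECASTING_RANGE, SEGMENT_LENGTH, DAY_OF_WEEK,
                   SEARCH_RANGE, K_PAIRS, LOCAL_ESTIMATION, PREPROCESSING)


total = sum(1 for _ in all_combinations())
print(len(K_PAIRS), total)
```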
To evaluate the changes in computation times when more links are added, we added five links,
one by one. As shown in Fig. 20, each additional link adds a relatively small amount of time.
For predictions with 5 links, the computation time required is less than 20 s for arterial data, and
less than about a minute for highway data.
5. Conclusion
In order to forecast travel time reasonably well, the core forecasting algorithm with
non-parametric regression techniques has been integrated with GIS technologies to implement a
hybrid travel time forecasting model. A hybrid forecasting model has been developed and tested by
deploying GIS technologies in the following areas: (1) storing, retrieving, and displaying traffic
data to assist in the forecasting procedures, (2) building road network data, and (3) integrating
historical databases and road network data. Based on the performance evaluation results, we
are confident that the demonstrated hybrid forecasting model could be utilized as an important
tool for various ITS applications.
Two major contributions are made in this paper:
1. we developed a Type I model for combining GIS and a travel time forecasting model and
implemented it successfully;
2. we developed an ML module and successfully integrated it with the travel time forecasting
model with positive performance evaluation results.
Acknowledgements
ROTIS), Seoul, Korea, for providing traffic data. We also wish to extend our thanks to Mr.
Sathyamoorthy Ponnuswamy for editing this paper. We take, however, the full responsibility for
the contents of this paper.
References
Abkowitz, M. , Walsh , S. , Hauser, E. , 1990 . Adaptation o f geographic informatio n systems to highwa y management .
Journal o f Transportation Engineerin g 116, 310-327 .
Azad, B. , Cook , P. , 1993 . Th e management/organizationa l challenge s o f th e "Server-Net " mode l o f GIS- T a s
recommended b y th e NCHR P 20-27 . In : Proceeding s o f th e GIS-T'93 : Geographica l Informatio n System s fo r
Transportation Symposium , pp. 327-342 .
Ben-Akiva, M., Cascetta, E. , Gunn, H. , 1995 . An on-line dynamic traffic predictio n model for an inter-urban motorway
network. In : Gartner , N.H. , Improta , G . (Eds.) , Urba n Traffi c Networks , Dynami c Flo w Modelin g and Control .
Springer, Ne w York, pp . 83-122 .
Ben-Akiva, M. , Koutsopoulos , H.N. , Mukundan , A. , 1994 . A dynami c traffi c mode l syste m fo r ATMS/ATI S
operations. IVH S Journa l 2 (1), 1-19 .
Berkbigler, K.P., Bush , B.W., Davis, J.F., 1997 . TRANSIMS softwar e architectur e for IOC-1. Report No. LA-UR-97-
1242, Lo s Alamos Nationa l Laboratory .
Bush, B.W. , 1999 . The TRANSIMS framework . TRANSIMS Opportunit y Forum , Sant a Fe , NM , 2 8 June.
Choi, K. , Kim , T.J. , 1994 . Integratin g transportatio n plannin g model s wit h GIS : issue s an d prospects . Journa l o f
Planning Educatio n an d Researc h 13 , 199-207 .
Choi, K. , Kim, T.J., 1996 . A hybrid travel demand mode l with GIS and expert systems. Computers, Environmen t and
Urban System s 20 (4/5), 247-259.
Davis, G.A. , Nihan , N.L. , Hamed , M.M. , Jacobson , L.N. , 1990 . Adaptiv e forecastin g o f traffi c congestion .
Transportation Researc h Recor d 1287 , 29-33.
Gillespie, S. , 1993 . Th e benefit s o f GI S us e fo r transportation . In : Proceeding s o f th e GIS-T'93 : Geographica l
Information System s fo r Transportation Symposium , pp . 34-41 .
Gilmore, J.F., Abe , N., 1995 . Neural networ k models for traffi c contro l and congestion prediction. IVHS Journal 2 (3),
231-252.
Holm, T., Stavanger , A.V., 1997 . Using GIS in mobility and accessibilit y analysis. Paper Presente d at th e 17t h Annual
ESRI Use r Conference , 8-11 July , San Diego, CA .
Kriger, D., Schlosser, M., 1992 . Integration of GIS with a travel demand forecasting model for transportation planning.
Transportation Foru m 4 , 106-115 .
Lockfeld, F. , Speed , V., 1993 . GI S link s county t o othe r transportatio n planners . Public Works 124 , 43-44.
Loukes, D. , 1992 . Geographi c informatio n system s i n transportatio n (GIS-T) : a n infrastructur e managemen t
information system s tool. In : Proceeding s o f the TAC Annua l Conferenc e 3, pp. B27-B42 .
Lussier, R. , Wu , J.H. , 1997 . Developmen t o f a dat a exchang e protoco l betwee n EMME/ 2 an d ARC/INFO . Pape r
Presented a t th e 17t h Annual ESR I Use r Conference , 8-1 1 July , San Diego, CA.
Mahmassani, H.S. , Peeta , S., Chang, G.-L. , Junchaya, T., 1991 . A review of dynamic assignment and traffi c simulation
models fo r ADIS/ATM S applications . Technica l Repor t DTFH61-90-R-00074 . Cente r fo r Transportatio n
Research, Th e Universit y of Texas, Austin .
Nagel, K. , Rickert , M. , Simon , P.M. , 1998 . The dynamic s o f iterated transportatio n simulations . Paper presente d at
TRISTAN-HI. Repor t No . LA-U R 98-2168 . Los Alamos Nationa l Laboratory.
Nygard, K. , Vellanki , R., Xie , T., 1995 . Issues i n GI S fo r transportation . MP C Repor t No . 95-43 . Mountain-Plains
Consortium, Fargo, ND.
Peeta, S. , Mahmassani , H.S. , 1995 . Multipl e use r classe s real-tim e traffi c assignmen t fo r on-lin e ATIS/ATM S
operations: a rolling horizon solutio n framework . Preprints o f papers a t th e Transportatio n Researc h Board 74th
Annual meeting , Washington, p . 27.
Ries, T. , 1993 . Desig n Requirement s fo r Locatio n a s a Foundatio n fo r Transportatio n Informatio n Systems . In :
Proceedings o f the GIS-T'93 : Geographica l Informatio n System s for Transportation Symposium , pp. 48-66 .
256 J. You, T.J. Ki m I Transportation Research Part C 8 (2000) 231-256
Sen, A., Sööt , S. , Ligas, J., Tian , X. , 1997 . Arterial link travel time estimation: probes , detector s an d assignment-typ e
models. Preprint s of papers a t th e Transportatio n Researc h Boar d 76t h Annual meeting , Washington , p . 21.
You, J. , Kim , T.J., 1999a . An integrate d urba n system s model wit h GIS. Th e Journal o f Geographical System s 1 (4),
305-321.
You, J., Kim, T.J., 1999b . Implementation of a hybrid travel time forecasting model with GIS-T. In: Proceedings of the
Third Bi-Annua l Conferenc e o f th e Easter n Asi a Societ y fo r Transportatio n Studies , 14-1 7 September , Taipei ,
Taiwan.
Abstract
The Transport Systems Centre (TSC) has developed an integrated Global Positioning System (GPS)-
Geographical Information System (GIS) for collecting on-road traffic data from a probe vehicle. This
system has been further integrated with the engine management system of a vehicle to provide time-tagged
data on GPS position and speed, distance travelled, acceleration, fuel consumption, engine performance,
and air pollutant emissions on a second-by-second basis. These data are handled within a GIS and can be
processed and queried during the data collection (from a notebook PC in the vehicle) or saved to a file for
later analysis. The database so generated provides a rich source of information for studies of travel times
and delays, congestion levels, and energy and emissions. A case study application of the system is described,
focusing on studies of congestion levels on two parallel routes in a major arterial corridor in metropolitan
Adelaide, South Australia. As part of these investigations, a discussion of the nature of traffic congestion is
given. This provides both a general definition of traffic congestion and a discussion of a number of
parametric measures of congestion. The computation of these parameters for the study corridor on the
basis of data collected from the integrated GPS-GIS system is described. The GIS provides a database
management platform for the integration, display, and analysis of the data collected from GPS and the
in-vehicle instrumentation. © 2000 Elsevier Science Ltd. All rights reserved.
Keywords: Moving observer traffic studies; Traffic congestion; Global Positioning System; Geographic Information
System; Traffic data analysis
1. Introduction
Transportation data, in common with many other data sets in civil engineering and the social
sciences, often have spatial attributes. For example, traffic counts come from specific sites, travel
time data refer to particular routes, and origin-destination data apply to a given area. Conventional
database systems cannot make much use of the spatial or locational attributes of a data set,
other than hold reference details for it. Geographical Information Systems (GIS), on the other
hand, can absorb a database, relate its spatial attributes to maps of the region, and offer spatial
integration with other pertinent databases for that region.
0968-090X/00/$ - see front matter © 2000 Elsevier Science Ltd. All rights reserved.
PII: S0968-090X(00)00015-2
M.A.P. Taylor et al. / Transportation Research Part C 8 (2000) 257-285
This paper describes the use of GIS techniques in the collection and analysis of travel time,
delay, and congestion data for urban road corridors using data collected by a moving observer.
The techniques have applications in both planning and design and in real-time monitoring of
traffic systems. A key element of the GIS-based system is the use of Global Positioning System
(GPS) data to determine locations, for both static observations and dynamic recording of vehicle
positions over time. The GIS takes on the central role in data management, in terms of data entry
and integration and some aspects of data analysis and display. Quiroga and
Bullock (1998a) provide a prime example of the capabilities of GPS as a tool in moving observer
traffic studies.
The Transport Systems Centre (TSC) at the University of South Australia has developed a
GIS-based system for collecting on-road traffic data from an instrumented probe vehicle driven in
a traffic stream. The probe vehicle can record travel time, distance covered, location, speed, fuel
consumed, air pollutant emissions, engine performance, operating state variables, and delay and
queuing data over time (second-by-second). Table 1 lists the data items recorded by the vehicle.
The integration of the various data collection modules in the vehicle is a major task facilitated by
the use of GIS. The data collection modules include differential GPS (for time-based location,
speed, and direction of travel) and on-board vehicle data (speed, acceleration, fuel consumption,
engine performance, and emission rates).¹ The GIS makes strong use of GPS for real-time
data collection, and integration with a radio communications system allows for the real-time
tracking of a fleet of probe vehicles, as well as post-processing of the on-road data. The TSC test
vehicle has a customised interface with its engine management system and Global Positioning
Receivers (GPS) (Zito and Taylor, 1999). It can provide the data shown in Table 1 on a
second-by-second basis. The data can be viewed in real time using a notebook PC in the vehicle or logged
to a file for later analysis. A typical output string is of the form:
time (s), distance (m), speed (km/h), fuel (ml), engine RPM, MAP, A/C, gear, P/E, eng temp, Tpos, lat, long
123.45, 1876.456, 62.1, 534.21, 1700, 111, 0, 4, E, 80, 45, -349197536, +1386064077.
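A record in this format is straightforward to parse. The following Python sketch is illustrative only:
the field names are hypothetical, and the scaling of the latitude and longitude integers by 10^-7 to
obtain decimal degrees is an assumption (it places the sample record in the Adelaide area, but the
text does not state the encoding).

```python
# Field names are illustrative; the order follows the output string in the text.
FIELDS = [
    ("time_s", float), ("distance_m", float), ("speed_kmh", float),
    ("fuel_ml", float), ("engine_rpm", int), ("map", int),
    ("air_con", int), ("gear", int), ("power_economy", str),
    ("engine_temp", int), ("throttle_pos", int),
    ("lat_raw", int), ("lon_raw", int),
]

# Assumption: positions are logged as integers in units of 1e-7 degrees.
DEGREE_SCALE = 1e-7


def parse_record(line):
    """Parse one comma-separated probe vehicle record into a dict."""
    # Drop the trailing full stop and any stray spaces inside the values.
    parts = [p.replace(" ", "") for p in line.strip().rstrip(".").split(",")]
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    record = {name: cast(value) for (name, cast), value in zip(FIELDS, parts)}
    record["lat_deg"] = record["lat_raw"] * DEGREE_SCALE
    record["lon_deg"] = record["lon_raw"] * DEGREE_SCALE
    return record


sample = ("123.45, 1876.456, 62.1, 534.21, 1700, 111, 0, 4, E, 80, 45, "
          "-349197536, +1386064077.")
record = parse_record(sample)
```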
The paper outlines the elements of the data recording system and their integration through
GIS. It then illustrates the use of the system with a recent study of the southern road corridor in
the Adelaide metropolitan area, where the first stage of a new 'high-tech' reversible flow
Expressway was recently opened. At the time of the study, the engine maps for pollutant emissions
were not available. Therefore, emission data are not reported here. The maps were not available
until February 2000. However, the recorded data can be re-analysed using these maps to provide
the emissions, and this will be done in subsequent stages of the southern road corridor study.
¹ All variables except emission rates are measured directly by the on-board instrumentation. Pollutant emission rates
are measured indirectly, using the engine performance variables engine revolutions (RPM), manifold pressure and
engine temperature, and then applying calibrated engine maps for each pollutant determined from dynamometer
testing of the vehicle (Zito and Taylor, 1999).
Table 1
Vehicle parameters logged in real time by the TSC probe vehicle
Variable              Measurement units       Variable                  Measurement units
Time                  s                       Air conditioning          on/off
Distance              m                       Power/economy mode        on/off
Speed                 km/h                    Engine gear               gear (1-4)
Fuel consumption      l                       Hydrocarbons (HC)         ppm
Engine revolutions    rpm                     Nitrogen oxides (NOx)     ppm
Manifold pressure     Pa                      Carbon monoxide (CO)      ppm
Throttle position     ratio                   Carbon dioxide (CO2)      ppm
Engine temperature    °C                      Oxygen (O2)               ppm
GPS position          Latitude + Longitude
2. GIS-GPS integration
The use of GIS as a database integrator for a transport study area is illustrated in Fig. 1. This
shows a schematic diagram representing a set of individual databases for the study area, which
comprise a mixture of spatial, numerical, and perhaps textual data. These are integrated within
the GIS and displayed by superposition of a separate map layer for each database. The layers
shown in Fig. 1 include topographical and land use data, transport network infrastructure,
socio-economic and demographic data, traffic flow data, and pollution and environment impact data.
Areas of modelling and analysis in transport planning and transport systems management
involving the use of different subsets of layers are also indicated. An example of a GIS map showing a
combination of data layers is given in Fig. 2. This figure includes probe vehicle locations over time
along a route together with street centreline data and street map data layers. It is a snapshot of the
data display available in the TSC instrumented vehicle, as described in this paper. A useful feature
of the display is the use of a GIS 'info tool', providing a display window on the map of Fig. 2,
showing the data relating to the journey and the vehicle's performance for the point highlighted by
the arrow. The info tool window is in fact displaying, for that point on the journey, all of the
recorded variables for the probe vehicle, as listed in Table 1. This display is available at any stage
whilst the data are being logged (on the notebook PC in the probe vehicle) or subsequently in
post-processing and analysis of the recorded data. The system thus allows the entire history of the
journey to be studied at any time.
GPS receivers provide a fast and convenient method for obtaining position information that
can be collected in real time and is easily employed within a GIS. This is because the basic GPS
position information is provided in the form of latitude and longitude on the surface of the Earth,
which is compatible with common GIS location specifications. Differential GPS corrections can
be used to enhance positional accuracy to about ±5 m (with 95% confidence) under almost all
conditions. In addition, instantaneous speed accuracy to ±2 km/h (with 95% confidence) can be
achieved, with vehicle speed being observed directly and recorded independently of the GPS
position data, using commonly available GPS receivers. Zito et al. (1995), Quiroga and Bullock
(1998a), and D'Este et al. (1999) explain the use of GPS in traffic studies and the theory of and
methods for instantaneous speed observation using GPS, and describe a series of experiments
which validated the above results for both position and the independent speed observations.
A critical factor in the use of a GIS is the data that are available with it. Rigorous spatial
analysis using GIS is only possible when the appropriate map databases are available. For
Adelaide, South Australia, the only road centreline database available until very recently was that
derived from the Digital Cadastral Database (DCDB) held by the South Australian Department
of Lands. This database is used for land titling in the state and shows the boundaries and gives the
coordinates of the points that make up all parcels of land in South Australia. The street
centrelines were then formulated by taking the centreline between the land boundaries to estimate the
road centreline position. However, this method did not yield an exhaustive coverage of roads in
Adelaide, so the database was complemented with other digitised data from various maps, as well
as the use of GPS to fill in data on missing road segments. This database has proved to be more
than adequate for application in traffic studies. The spatial coverage of the database, shown in
Fig. 3, is some 100 km east-west and almost 170 km north-south. The basic attributes associated
with each link in the raw street centreline database, besides the start and endpoints of each link,
are shown in Table 2. These attributes are basic, being little more than street address references
and a broad road type code, but they open the door to more detailed descriptions through
linkages with other databases.
Fig. 2. GIS map of probe vehicle journey through street network, showing different data layers and 'info tool' display of
journey parameters at the indicated location along the route.
Whilst the database provides a comprehensive coverage of the roads in the region, it does have
some deficiencies. Firstly, Fig. 3 shows an inset with an enlargement of a sample non-linear road segment
centreline. The inset clearly shows that the road curve is made up of four independent links. This
coding of road curves by a sequence of straight-line segments was adopted to facilitate the
digitising of the road database. However, this greatly increases the numbers of links and nodes in the
overall road network. Nor are 'arc' or 'polyline' objects a part of this database, which further
decreases its efficiency. The current database is made up of some 135,000 records. Maintenance of
the database is also an issue, for the street centreline database only truly represents the road
network at a given date. This is a significant issue in areas undergoing new development, as is the
case in the study region considered in this paper. Updates must be released to include newly
constructed roads and any modifications to existing roads through re-alignment or traffic
management changes, such as the conversion of a two-way road into a one-way road. Until the late
1990s, updates were not available; now, updates to the database are released annually.
Fig. 3. Adelaide DCDB street centreline database, showing inset of a typical road segment.
Table 2
Sample road centreline attributes for link segments in the DCDB database
Road name       From left   To left +   From right   To right   Road type
ACACIA RD       29          51          26           48         1
REBECCA AV      3           3           4            4          0
WENTWORTH ST    160         154         159          153        0
Thus, the database, while useful as an overlay to GPS data points, is not completely
suitable for navigation purposes. Its limitations need to be understood and the data used
3. Traffic studies
Traffic studies are an essential part of traffic planning and engineering. They provide the basic
inventory and performance data, as well as area-specific travel demand data, that are required for
project planning and design and traffic systems management and monitoring (e.g., Taylor and
Young, 1988; Liu, 1994; Robertson, 1994; Taylor et al., 1996; Taylor et al., 2000). Much of the
data have spatial attributes, and many of the traffic study techniques involve data surveys based
on observations from a variety of sites or collected by moving observers. Such data include traffic
counts from road sections and turning movement counts at intersections, road crash data, speed,
travel time and delay data, and origin-destination data. GIS techniques have a significant role in
the storage, analysis, and reporting of these data sets and also allow for the integration of different
databases relating to a study area, including both traffic-based data and related land use,
demographic, topographic, and environmental data.
Moving observer methods are commonly used for travel time and delay surveys, including
assessments of traffic congestion, and are also beginning to be used for studies of fuel
consumption and pollutant emissions (for examples see Taylor et al., 2000). The basic method
involves the surveyor ('observer') being in a vehicle in the traffic stream and noting, among other
things, the time taken to travel between specified points. These techniques are particularly suitable
for relatively long or complex journeys which do not have sufficient through-flow to support other
travel time survey techniques, such as registration plate matching or input/output methods
(Taylor et al., 2000). Another advantage of the moving observer methods is that they can yield
information about travel times and traffic conditions for intermediate stretches of road and so
identify the reasons for any abnormality in the overall journey time. The obvious problem with
the basic moving observer method is that, unless repeated many times by different drivers, it is not
likely to be representative of actual conditions and may be unduly influenced by the driver's
driving style.² A general rule is that, in order to overcome the sampling problem, each route should
be driven at least 15 times with different drivers each time. This is seldom a practical proposition,
and may introduce its own inaccuracies (e.g., on the shoulders of peak periods when traffic
conditions may be changing rapidly, the different runs may not equate to random observations
from the same population). Adopting the floating car method, whereby the drivers are instructed
to attempt to 'float' in the traffic stream, overtaking as many vehicles as overtake them, can reduce
the extent of bias due to driving style. Overzealous attempts to comply with this instruction can,
of course, have safety consequences, but even where it is safe to overtake or hang back it may not
be obvious whether it is appropriate. It can, for example, be difficult to judge whether a given
vehicle is a part of the main traffic stream in which the driver has to float or whether it is slowing
down to join a queue of traffic waiting to exit onto a side road. The developers of the floating car
method (Wardrop and Charlesworth, 1954) suggested a method of correcting for failure to float
perfectly; see also O'Flaherty and Simons (1970) and Cowan and Erikson (1972) for independent
assessments of the method. Wardrop and Charlesworth suggested a correction formula to
calculate the mean travel time from a to b, $\bar{t}_{ab}$, as shown in Eq. (1):
\[ \bar{t}_{ab} = t_{ab} + \frac{O}{q} \tag{1} \]
where $t_{ab}$ is the time taken by a survey car to travel the route from a to b, $O$ is the net number of
vehicles overtaken by the survey car (i.e. vehicles overtaken minus vehicles that overtake it)
while travelling from a to b, and $q$ is the mean flow rate. The mean flow rate appears in the
formula in order to reflect the fact that overtaking one 'too many' vehicles in a flow of 1000 veh/h
is less significant than the same thing in a flow of 100 veh/h. Since $q$ is rarely available from
conventional sources (it is likely to differ on different parts of the route), an estimate can be made
by having a vehicle travel in the opposite direction (b → a), recording travel time ($t_{ba}$) and the number
of vehicles met ($m$) in the opposing flow (i.e. travelling in the a → b direction). Then an estimate of
$q$ is given by Eq. (2):
\[ q = \frac{m - O}{t_{ab} + t_{ba}} \tag{2} \]
The accuracy of the method is improved by taking several runs in the a → b direction (with
matching b → a runs to estimate $q$), and it is convenient to utilise pairs of vehicles running back
and forth between a and b for this purpose (note that, although common, the use of just one
vehicle to provide the a → b and b → a times is theoretically flawed, since the a → b and b → a runs
should be simultaneous). If travel times are required for subsections within the a → b route, it will
be necessary to record data to allow separate correction factors to be calculated for each
subsection. If executed correctly, this method can produce useful data, albeit at some cost. However,
it is subject to error if conditions fluctuate markedly during the survey. This method for volume
estimation is not recommended if alternative methods can be employed, e.g. the use of traffic
analysers to record volumes at sites along the route or volume data extracted from an urban traffic
control system (as is the case in the study reported in this paper). Information derived from
moving observer surveys can be enriched if data are simultaneously recorded on time spent in
queues within each segment of the journey, and it is in this regard that use of an instrumented
probe vehicle becomes important.
² Note that the problem is not overcome by instructing the driver to tail another vehicle; that merely replaces a bias
due to the surveyor's driving style by one due to that of a randomly selected member of the driving population.
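The floating car correction can be illustrated numerically. The sketch below is a minimal illustration
under stated assumptions: it uses the Wardrop and Charlesworth moving observer relationships in the
form q = (m - O)/(t_ab + t_ba) and mean travel time = t_ab + O/q, with O counted as vehicles
overtaken minus vehicles overtaking; the function names and the sample survey values are hypothetical.

```python
def estimate_flow(m, net_overtaken, t_ab, t_ba):
    """Estimate the mean flow q (veh/h) in the a -> b direction.

    m: vehicles met by the car travelling b -> a
    net_overtaken: vehicles overtaken minus vehicles overtaking on the a -> b run
    t_ab, t_ba: travel times of the two runs, in hours
    """
    return (m - net_overtaken) / (t_ab + t_ba)


def corrected_travel_time(t_ab, net_overtaken, q):
    """Mean stream travel time from a to b, in hours."""
    if q <= 0:
        raise ValueError("flow estimate must be positive")
    return t_ab + net_overtaken / q


# Hypothetical survey: two 6-minute runs, 107 vehicles met, net 7 overtaken.
q = estimate_flow(m=107, net_overtaken=7, t_ab=0.1, t_ba=0.1)
mean_t = corrected_travel_time(t_ab=0.1, net_overtaken=7, q=q)
```

In this made-up case the survey car overtook more vehicles than overtook it, so it was travelling faster
than the stream and the correction lengthens the measured time. The same O in a lighter flow would
give a larger correction, which is the point made in the text about 1000 veh/h versus 100 veh/h.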
Modern moving observer surveys thus often require the use of specially equipped probe vehicles
to collect data in real time, as described in D'Este et al. (1999), Quiroga and Bullock (1998a),
Woolley and Taylor (1999), and Zito and Taylor (1999). Increasing use is being made of GPS for
the recording of vehicle positions over time (see Quiroga and Bullock, 1998a), and previous
research at the TSC has indicated that pseudo-instantaneous travel speeds can also be recorded
accurately by GPS, independently of the GPS position data. See Zito and Taylor (1994), Zito et al.
(1995), and D'Este et al. (1999) for research results clearly indicating that the direct speed
measurements from GPS correlate closely with 'actual' vehicle speeds (as recorded by on-board
instrumentation). GPS location data, along with other data recorded from the GPS or
simultaneously from other on-board instrumentation on the probe vehicle, are most conveniently
handled through GIS.
The probe vehicle used in the research study reported in this paper is the TSC's instrumented
vehicle, currently a General Motors Holden VS Commodore sedan car. This vehicle has been
instrumented so that it can log data continuously (at time intervals of one second) directly
from its engine management system in synchronisation with GPS receivers. The GPS data
provide both spatial and time/distance-based data from which various traffic parameters can
be derived, including travel time, stopped time, travel speeds (instantaneous and average), and
various congestion indices. Differentially corrected GPS position data are recorded in real time
using the differential corrections broadcast on the FM radio network 'JJJ' in Australia; see Zito
and Taylor (1996) for an explanation of this system and a case study application concerned
with residential area traffic calming. The engine management system module directly provides
data such as time, distance, speed, fuel consumption, engine revolutions (RPM), throttle
position, engine temperature, engine gear, use of air conditioning, and economy/power mode
(see Table 1). Air pollutant emissions are determined using calibrated engine maps for the
vehicle and are based on the observed engine performance variables as noted above. The
development and use of the engine maps is described in Zito and Taylor (1999). The elemental
data provided by the vehicle and the GPS are stored in GIS software running on a notebook
PC in the vehicle. Thus the recorded data may be displayed, interrogated, and analysed using
GIS-specific functionality both during the data collection and afterwards, as indicated in
Fig. 2.
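As an illustration of how such parameters might be derived from the second-by-second log, the sketch
below computes travel time, stopped time, and average travel speed from a list of (time, distance,
speed) samples. It is a minimal sketch, not the TSC software: the function name, the sample values,
and the 1 km/h threshold used to classify an interval as stopped are assumptions.

```python
def journey_summary(samples, stop_threshold_kmh=1.0):
    """Summarise a run from (time_s, distance_m, speed_kmh) samples.

    Samples are assumed to be in time order. An interval between consecutive
    samples counts as stopped when the speed at its start is below
    stop_threshold_kmh (an assumed cut-off, not taken from the paper).
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    travel_time_s = samples[-1][0] - samples[0][0]
    if travel_time_s <= 0:
        raise ValueError("samples must span a positive time interval")
    distance_m = samples[-1][1] - samples[0][1]
    stopped_time_s = sum(
        nxt[0] - cur[0]
        for cur, nxt in zip(samples, samples[1:])
        if cur[2] < stop_threshold_kmh
    )
    return {
        "travel_time_s": travel_time_s,
        "stopped_time_s": stopped_time_s,
        "distance_m": distance_m,
        # m/s converted to km/h
        "average_speed_kmh": 3.6 * distance_m / travel_time_s,
    }


# Hypothetical one-second samples: the vehicle is stationary for two seconds.
log = [
    (0, 0.0, 36.0), (1, 10.0, 36.0), (2, 20.0, 0.0),
    (3, 20.0, 0.0), (4, 20.0, 18.0), (5, 25.0, 18.0),
]
summary = journey_summary(log)
```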
Once the GPS and other vehicle-based data have been collected, they are imported into a GIS,
where they can be displayed and analysed spatially, as well as analytically. Street centreline data
are included as a layer in the GIS so that the exact route, speed profile, and time data can be
determined on a link-by-link basis. In addition, other map layers, such as aerial photography,
electronic street directory maps, topographical features and landmarks are included. Fig. 2
provides a sample plot of probe vehicle locations over time along a route superimposed on the
street centreline data and electronic street map data layers (a raster image of street layout and land
uses). Vehicle location at each point in time is represented on the map by a small coloured circle.
The circles can be colour-coded to indicate relative values of a specific data item, such as
instantaneous speed or fuel consumption rate. The user may select the specific data item to be
displayed in this way by the GIS.
As well as indicating the ability of GIS to display the position of the probe vehicle in the street
network, Fig. 2 also illustrates the use of GIS to associate the GPS time-tagged, vehicle-based
data variables with each circle, as shown by the 'Info Tool' dialogue box. In this way, data can be
queried using standard database techniques (e.g. display those time-tagged observations where
the vehicle was stationary or where its speed exceeded 80 km/h) and also queried spatially
(e.g. what was happening to the vehicle at a certain point along the route, as in Fig. 2, or what
data were collected within 100 metres upstream and downstream of a given intersection). To
give the data collected added value, GIS has the ability to overlay different databases. Fig. 2 shows
how the collected data have been overlaid with a raster street directory image, allowing the user to
read the street names directly off the map. In addition, another vector layer of street centrelines
has been added, together with the attributes associated with this database, thus including the geographic
coordinates of the links, as well as the address ranges for each side of the road. This database
allows the raster image to be georeferenced so that it is spatially correct, hence GPS positions can
be overlaid on the map.
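The two kinds of query described above can be sketched outside a GIS as follows. This is an
illustrative Python example only: the record layout, function names, coordinates, and the use of a
great-circle (haversine) radius around the intersection in place of a true along-route distance are
all assumptions, not the TSC implementation.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius, adequate for short distances


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def attribute_query(records, max_speed_kmh=80.0):
    """Records where the vehicle was stationary or exceeded max_speed_kmh."""
    return [r for r in records
            if r["speed_kmh"] == 0 or r["speed_kmh"] > max_speed_kmh]


def spatial_query(records, lat, lon, radius_m=100.0):
    """Records logged within radius_m of the point (lat, lon)."""
    return [r for r in records
            if haversine_m(r["lat"], r["lon"], lat, lon) <= radius_m]


# Hypothetical observations near Adelaide.
records = [
    {"time_s": 0, "speed_kmh": 0.0, "lat": -34.9200, "lon": 138.6000},
    {"time_s": 1, "speed_kmh": 62.1, "lat": -34.9205, "lon": 138.6000},
    {"time_s": 2, "speed_kmh": 85.0, "lat": -34.9300, "lon": 138.6000},
]
flagged = attribute_query(records)
nearby = spatial_query(records, lat=-34.9200, lon=138.6000)
```

A GIS would normally answer the spatial query with an index over the point layer rather than a
linear scan; the scan here simply keeps the example self-contained.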
A major advantage of the use of GIS in management and analysis of traffic data is the ability
to integrate these data with other data sets, such as land use and socio-economic and
demographic data relating to the region. This is especially important in environmental impact studies,
where the possible juxtaposition of poor environmental conditions and residential land use in
some locations is of concern. This issue and the use of GIS to determine such locations are
described in Klungboonkrong and Taylor (1998), Affum and Brown (1999), and Taylor et al.
(2000).
Fig. 4 shows some typical demographic maps of the Adelaide metropolitan area. These data
were collected in the 1996 Australian National Census and are readily available in a GIS format
suitable for immediate mapping and analysis. The map on the left-hand side shows the total
number of passenger cars per census collector district (the basic zoning system for the census
data). The second map shows the dwelling densities in the same region. In both maps, the darker
colours reflect the higher values, while the lighter colours represent the lower values. Kenworthy
and Newman (1982) suggested that these parameters, amongst others, could be used to segment a
larger region into various socio-demographic areas that could possibly represent different travel
demand and driving patterns within them. Typically, all of the demographic variables suggested by
Kenworthy and Newman (1982), when applied to the Adelaide situation, showed similar
characteristics to those in Fig. 4. There are basically three distinct regions: northern, southern, and
central. This result suggests the need to undertake separate analyses of travel characteristics and
traffic conditions within these three regions. The Southern Expressway corridor studied in this
paper lies in the southern region.
M.A.P. Taylor et al. / Transportation Research Part C 8 (2000) 257-285 267
Fig. 4. Typical demographic features of the Adelaide Statistical Division in 1996, indicating the broad division of the
metropolitan area into three social regions: northern, central and southern.
Given the regions that have been established, it is a matter of determining which roads in those
regions should be driven on to obtain a representative sample of driving in that region. Fig. 5
shows a GIS map of the main arterial routes in the Adelaide metropolitan area, which displays
average morning peak hour traffic volumes. Higher volumes are shown as darker shades and
lower volumes in lighter shades. These network flow maps may be analysed by the GIS to indicate
the proportions of driving (vehicle-kilometres of travel or vehicle-hours of travel) within the three
regions (and by time of day). They may be further used to estimate total delay loads (e.g.,
vehicle-hours of delay) in the network, given delay time information. This can be obtained from the
moving vehicle studies, as indicated in Section 4.
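The regional proportions of driving mentioned above reduce to a sum of link volume multiplied by link length within each region. The following Python sketch shows the arithmetic only; the link records, region labels and function name are hypothetical and do not come from the Adelaide data.

```python
from collections import defaultdict

# Hypothetical link records: (region, length_km, peak_hour_volume_veh).
links = [
    ("northern", 2.0, 1500),
    ("central", 1.0, 3000),
    ("southern", 4.0, 1000),
    ("southern", 1.0, 2000),
]


def vkt_share_by_region(link_records):
    """Return each region's share of total vehicle-kilometres of travel (VKT)."""
    vkt = defaultdict(float)
    for region, length_km, volume in link_records:
        vkt[region] += length_km * volume  # VKT contributed by this link
    total = sum(vkt.values())
    return {region: value / total for region, value in vkt.items()}
```

Replacing link length with link travel time in the product would give shares of vehicle-hours of travel instead.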
4. Congestion
Fig. 5. Morning peak hour traffic volumes on main roads in the Adelaide metropolitan area.
basic information about the occurrence of congestion in a network. Some of this information can
be collected from permanent monitoring sites, whilst other information can be collected using
moving observer methods. The temporal and spatial distribution of congestion in a region is
important, and the use of GIS software for database integration, data analysis, and data display is
most advantageous (Affum and Taylor, 1999).
If knowledge about congestion and its extent and intensity is important, then the first
consideration is to define just what congestion is. Congestion is an integral part of a transport system,
but its specific definition and identification are not immediately obvious. Taylor (1992) reviewed a
number of different definitions of traffic congestion and the observed phenomena associated with
it. On the basis of this review, three recurrent ideas that occurred in the various definitions of
congestion were identified:
• congestion involves the imposition of additional travel costs on all users of a transport facility
by each user of that facility;
• transport facilities (e.g. road links, intersections, lanes and turning movements) have finite
capacities to handle traffic, and congestion occurs when the demand to use a facility approaches
or exceeds the capacity;
• congestion occurs on a regular, cyclic basis, reflecting the levels and scheduling of social and
economic activities in a given area; and
• in addition to the recurrent congestion, special episodes of congestion may occur at different
points in a network due to irregular incidents, such as roadworks, breakdowns or accidents.
The following definition of congestion was subsequently proposed for use in traffic studies
(Taylor, 1999):
This definition is an extension of that given in Taylor (1992). The more recent extension, as
presented here, recognizes that the capacity of an element in a traffic system may vary over time,
e.g. when traffic incidents occur or for minor stream traffic movements where capacity may
depend on the traffic volume in the major stream.
Thus, congestion may always be present in any part of a transport system, but the level of
congestion may have to exceed some threshold value to be recognised. The threshold may be
context-specific, for instance owing to the occurrence of incidents, such as breakdowns, road
works, or road crashes. Peak periods are recognised as prone to congestion, but it must also be
recognised that congestion can occur at other times due to different traffic management regimes in
place off-peak or due to traffic incidents or unusual local traffic generating activity. For strategic
transport planning purposes, a satisfactory definition of the level of congestion on a network
component (e.g. a route, link or intersection turning movement) is the excess travel time incurred
by a traveller when traversing that network component. Excess travel time is the additional travel
time over and above the free flow travel time (T_0), which is the minimum amount of time required
to cover the component. Thus the excess travel time corresponds to the 'system delay' pertaining
to the component under the given traffic conditions (Taylor et al., 2000). It should be noted that
system delay is a measure of the total delay experienced, including 'stopped delay', delays due to
decelerations and accelerations (which may be due to the vehicle joining a queue or negotiating
road features, such as bends or traffic control devices), and delays incurred through interactions
with other road users in a traffic stream. The GPS-equipped probe vehicle can measure some of
these components of total delay, such as stopped time or acceleration noise (AN) (as described
below).
Traffic movement along a link in a network may be seen as consisting of two components. The
first component is cruising, with traffic moving along the link largely uninterrupted (except for the
possibility of side friction, say due to vehicle parking manoeuvres). Travel along the link may also
be punctuated by points of interruption, say pedestrian crossings, bus stops and, most
importantly, road junctions. For example, the junction at the downstream end of the link may dictate
the traffic progression along the link. Movement through the interruption points can be handled
using the methods for intersection analysis and queuing theory. What is also needed, particularly
for urban areas or other places where congestion is expected, is a composite relationship that can
include the two components simultaneously.
Having established concepts and principles that may be used to define and identify traffic
congestion in a network, it is necessary to seek some measures or indices that may be used as
indicators of the levels of congestion in the network.
4.1. Delay
Delay may be used as one measure of the extent of travel time. A commonly accepted definition
of delay is the system delay (d), defined as the excess travel time above the minimum (free flow)
travel time needed to traverse a network element (e.g. a link, road section, or intersection). If T is
the actual travel time and T_0 is the free flow travel time, then the system delay (as described above)
is
d = T - T_0. (3)
Because it is a direct measure of the additional time costs imposed by one motorist on others,
system delay is usually the most appropriate measure of delay for use in travel time studies in
congested networks. There are alternative definitions of delay (e.g. stopped time (T_s), as discussed
in Taylor et al., 2000), and care is needed to ensure compatibility between definitions and
computational procedures when making comparisons between results from different studies. The
probe vehicle data collected in this study are used to estimate both system delay and stopped time
delay. The cause of the delay may also be important, with the need to distinguish between
recurrent delays due to cyclic variations in travel demand and incident-based delays. With
knowledge of traffic volumes along the links in a route, the delay time based on Eq. (3) may be
expressed in terms of vehicle-hours of delay (the product of link volume and delay on the link).
Thus a measure of the total delay load along the route or on its component links can be obtained.
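As a worked illustration of Eq. (3) and of the delay load expressed in vehicle-hours, the following Python sketch may help. The three-link route, its travel times and its volumes are invented for the example, and the function names are hypothetical.

```python
def system_delay(travel_time_s, free_flow_time_s):
    """Eq. (3): d = T - T_0, in seconds."""
    return travel_time_s - free_flow_time_s


def delay_load_veh_hours(route_links):
    """Sum over links of (link volume x link system delay), in vehicle-hours."""
    return sum(
        volume * system_delay(t, t0) / 3600.0
        for (t, t0, volume) in route_links
    )


# Hypothetical route of three links: (T in s, T_0 in s, volume in vehicles).
route = [(90.0, 60.0, 1200), (45.0, 45.0, 800), (150.0, 90.0, 600)]
```

For this invented route the first and third links each contribute 10 vehicle-hours and the second link none, so `delay_load_veh_hours(route)` gives 20 vehicle-hours.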
Abstract
Many metropolitan areas have started programs to monitor the performance of their transportation
network and to develop systems to measure and manage congestion. This paper presents a review of issues,
procedures, and examples of application of geographic information system (GIS) technology to the
development of congestion management systems (CMSs). The paper examines transportation network
performance measures and discusses the benefit of using travel time as a robust, easy to understand
performance measure. The paper addresses data needs and examines the use of global positioning system
(GPS) technology for the collection of travel time and speed data. The paper also describes GIS platforms
and sample user interfaces to process the data collected in the field, data attribute requirements and
database schemas, and examples of application of GIS technology for the production of maps and tabular
reports. © 2000 Elsevier Science Ltd. All rights reserved.
Keywords: Congestion management systems; Travel time; GPS; GIS; Dynamic segmentation; Performance measures
1. Introduction
Traffic congestion is a critical problem in urban areas. Several indicators confirm this trend. For
example, between 1976 and 1996, the number of vehicle-miles traveled (VMT) in the United States
increased by 77%, while the mileage of roads and streets increased only by 2% (FHWA, 1998).
Over the years, the percentage of the peak-hour VMT that occurs under congested conditions has
steadily increased, although at a slower pace in recent years. In 1996, that percentage was 54% for
the urban interstate system and 45% for the urban national highway system. Congestion usually
results in time delays, increased fuel consumption, pollution, stress, health hazards, and added
0968-090X/00/$ - see front matter © 2000 Elsevier Science Ltd. All rights reserved.
PII: S0968-090X(00)00008-5
288 C.A. Quiroga / Transportation Research Part C 8 (2000) 287-306
vehicle wear. The associated cost is huge. For example, in 1997 congestion cost travelers in 68
urban areas in the United States 4.3 billion hours of delay, 6.6 billion gallons of wasted fuel,
and $72 billion of time and fuel cost (Schrank and Lomax, 1999).
Travel delay is perceived by many as the most noticeable impact of congestion. Not
surprisingly, numerous efforts have been made to eliminate it or, at least, to alleviate its effects. In the
past, adding capacity was considered the main solution to eliminate or reduce travel delays.
However, this approach has frequently proved to be insufficient. Faced with this reality, many
urban areas have opted for implementing alternative management measures. Examples of these
alternative management measures include improved traffic surveillance and control systems,
dedicated high-occupancy vehicle (HOV) lanes, improved transit service, and congestion and
parking pricing. The objective of these measures is to manage and reduce congestion by improving
traffic flow, enhancing mobility and safety, and reducing demand for car use. Many of these
management measures are the result of federal, state or local legislation. Such is the case, for
example, of the measures that had to be implemented in urban areas designated as non-attainment
for ozone or carbon monoxide (CAAA, 1990; ISTEA, 1991) and that continue to be eligible for
federal funding under TEA 21 (1998).
2. Performance measures
Travel time (t_L) is the total time required to traverse a roadway segment of length L. It can be
measured directly using field studies, although it could also be derived using simulation models or
empirical relationships between volume and roadway characteristics.
Acceptable travel time (t_Lo) is the travel time associated with a performance goal established
for the transportation facility. The acceptable travel time should be influenced by community
input and should, explicitly or implicitly, provide a balance between transportation quality,
economic activity, land use patterns, environmental issues, and political concerns. In the
absence of a more detailed analysis, a number of transportation agencies define the acceptable
travel time as that associated with free flow conditions (or sometimes posted speed limits). A
more detailed analysis should provide a differentiation by time period (i.e., peak vs off-peak), by
functional class (e.g., freeway vs major arterial), and by geographic location (e.g., central
Segment speed (u) is the result of dividing the length (L) of the segment along which travel time
data are collected by the corresponding travel time t_L:
u = L/t_L. (1)
Length is an established item in a roadway inventory database and is normally measured in the
field with a distance measuring instrument (DMI). GIS packages can also be used to provide
estimates of distance, but the accuracy of these estimates depends on the accuracy of
the underlying digital base map. Digital base map accuracy has dramatically increased in recent
years (submeter accuracy is now commonplace) and, as a result, measuring distances with GIS
packages has become a feasible alternative. Readers should realize, however, that GIS packages
usually measure distances on the ellipsoid, i.e., they "simulate" ground distances by using a
mathematical model of the surface of the Earth. Some GIS packages take into consideration the
eccentricity of the ellipsoid (i.e., explicitly account for the fact that the Earth is not a perfect sphere),
while other GIS packages simply assume a spherical model of the surface of the Earth. GIS
packages that do consider the eccentricity effect provide more accurate estimates of distance than
GIS packages that assume a simple spherical model of the surface of the Earth. Differences
between the two models can be quite significant. For example, at 30° of latitude, differences between
the two models could be up to 3 m/km in the N-S direction and up to 2 m/km in the E-W
direction (Quiroga, 1999).
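The size of the spherical versus ellipsoidal discrepancy can be checked with a short Python sketch that compares the local radii of curvature of an ellipsoid with a single spherical radius. The choice of the WGS84 ellipsoid and of a 6371 km sphere is an assumption made here for illustration; the text does not state which models were compared in Quiroga (1999).

```python
import math

# Assumed models: WGS84 ellipsoid and a sphere with a 6371 km mean radius.
A = 6378137.0                 # WGS84 semi-major axis (m)
F = 1.0 / 298.257223563       # WGS84 flattening
E2 = F * (2.0 - F)            # first eccentricity squared
R_SPHERE = 6371000.0          # assumed spherical radius (m)


def radii_of_curvature(lat_deg):
    """Return (M, N): meridional and prime-vertical radii of curvature (m)."""
    s2 = math.sin(math.radians(lat_deg)) ** 2
    w = 1.0 - E2 * s2
    m = A * (1.0 - E2) / w ** 1.5   # governs short N-S distances
    n = A / math.sqrt(w)            # governs short E-W distances
    return m, n


def sphere_vs_ellipsoid_m_per_km(lat_deg):
    """Model difference in metres per kilometre for N-S and E-W displacements."""
    m, n = radii_of_curvature(lat_deg)
    ns = abs(R_SPHERE - m) / m * 1000.0
    ew = abs(R_SPHERE - n) / n * 1000.0
    return ns, ew
```

Under these assumptions, `sphere_vs_ellipsoid_m_per_km(30.0)` returns values of the same order as the 3 m/km (N-S) and 2 m/km (E-W) figures quoted above.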
In general, travel time studies involve several runs and more than one segment. In this case, it
may also be of interest to compute representative speeds and travel times for each segment and/or
for all segments combined. In general, as shown in Fig. 2, if there are interchanges or intersections
along the route, the number of runs per segment may be different.
Assume a representative segment travel time is given by the arithmetic average of all travel time
values associated with a segment. Following Quiroga and Bullock (1999), the total representative
travel time and speed over all segments can be expressed as
where t_mi is the median travel time associated with segment i, and u_mi is the median speed
associated with segment i.
Travel rate (t_r) is the inverse of average speed and is usually expressed in min/km (or min/mile).
While not readily understood by all audiences, travel rate provides a useful measure that can be
averaged for a facility, geographic area, or mode. It can also be used to compare performance
among transportation facilities more effectively than speed. Travel rate can be expressed as
t_r = t_L/L. (5)
2.5. Delay
Delay (d_L) is the difference between travel time and the acceptable travel time on a road
segment. Delay can be expressed as
d_L = t_L - t_Lo. (6)
Total delay (D_Ltp) is the sum of delays for all vehicles traversing the segment during the time
period for which travel time data are available and is normally expressed in vehicle-minutes or
vehicle-hours. Total delay can be expressed as
where tp is the subscript associated with the time period for which data are available (e.g., 15 min), and
V_tp is the number of vehicles traversing the segment during the time period for which travel time
data are available. Traffic volumes are established items in a roadway inventory database.
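The segment measures defined so far can be gathered into a few Python functions. This is a sketch for illustration: the function names and sample values are hypothetical, and the total delay function assumes that total delay is the product of V_tp and d_L (every vehicle in the period experiencing the same segment delay), which follows the verbal definition above but is not reproduced from an equation in the text.

```python
def segment_speed(length_km, travel_time_h):
    """Eq. (1): u = L / t_L, in km/h."""
    return length_km / travel_time_h


def travel_rate(length_km, travel_time_min):
    """Eq. (5): t_r = t_L / L, in min/km."""
    return travel_time_min / length_km


def delay(travel_time_min, acceptable_time_min):
    """Eq. (6): d_L = t_L - t_Lo, in minutes."""
    return travel_time_min - acceptable_time_min


def total_delay(travel_time_min, acceptable_time_min, vehicles):
    """Assumed form: D_Ltp = V_tp * d_L, in vehicle-minutes."""
    return vehicles * delay(travel_time_min, acceptable_time_min)
```

For a hypothetical 2 km segment traversed in 3 min, with an acceptable travel time of 2 min and 400 vehicles in the period, these functions give a speed of 40 km/h, a travel rate of 1.5 min/km, a delay of 1 min and a total delay of 400 vehicle-minutes.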
Delay rate (d_rL) is the rate of time loss for a specified roadway segment. It is calculated as the
difference between the travel rate and the acceptable travel rate. Delay rate can be expressed as
3. Data collection
There are essentially two groups of travel time data collection techniques: roadside techniques
and vehicle techniques. Roadside techniques are based on the use of detecting devices physically
located along the study routes at pre-specified intervals. They obtain travel time data from
vehicles traversing the route by recording passing times at predefined checkpoints. Examples of
these techniques include license plate matching and automatic vehicle identification (AVI).
License plate matching is based on recording the license plate numbers of individual vehicles and
the corresponding time stamps as they pass checkpoints. Travel times are determined as
differences in time stamps between checkpoints. An assumption of this technique is that each individual
vehicle does not make intermediate stops. This may be limiting, particularly if there are
intersections, on-ramps, off-ramps, or interchanges between checkpoints.
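The matching step of the license plate technique amounts to joining two lists of observations on the plate number and differencing the time stamps. A minimal Python sketch follows; the plate strings, time stamps and function name are hypothetical.

```python
def match_plates(upstream, downstream):
    """Return {plate: travel_time_s} for plates recorded at both checkpoints.

    Each argument maps a plate string to the time stamp (in seconds) at which
    that plate was recorded at the checkpoint.
    """
    times = {}
    for plate, t_in in upstream.items():
        t_out = downstream.get(plate)
        # Keep only plates seen downstream after they were seen upstream.
        if t_out is not None and t_out > t_in:
            times[plate] = t_out - t_in
    return times


# Hypothetical observations at two consecutive checkpoints.
upstream = {"ABC123": 0.0, "XYZ789": 12.0, "LMN456": 20.0}
downstream = {"ABC123": 95.0, "LMN456": 140.0, "QRS000": 150.0}
```

Vehicles that leave the route between checkpoints (XYZ789 here) or join it (QRS000) simply fail to match. A vehicle that makes an intermediate stop still matches, which is why the no-stops assumption noted above matters.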
AVI is an example of a data collection technique included as part of traffic surveillance and
control system deployments at traffic management centers. AVI systems are based on the use of
in-vehicle transponders (or tags), roadside reading units, a communication network, and a central
computer system. The roadside reading units detect individual vehicles equipped with
transponders as they pass nearby and transmit the corresponding transponder data to the central
computer system. Travel times between consecutive checkpoints are computed in a similar manner
as with the license plate technique, except that transponder identification numbers are used to
compare time stamps instead of vehicle license plate numbers. One advantage of AVI technology
is that area-wide real-time travel time data collection and dissemination are possible. With
GIS-based Internet tools, for example, cities like Houston, Chicago, and Seattle are using AVI
technology to disseminate up-to-date geo-referenced travel time and speed data to the traveling public.
Vehicle techniques are based on the use of detection devices carried inside the vehicle. Examples
of these techniques include the traditional stopwatch and clipboard technique and automatic
vehicle location (AVL). In the stopwatch and clipboard technique, travel time and passage of
specific landmarks are manually recorded along the route. Two technicians are required in the
vehicle: one to drive and the other to manually record items such as the location and
time of individual checkpoints and the length and time spent in queues. Unfortunately, this
process tends to be labor intensive during the data collection and data reduction phases, and
spatial resolution and coverage are limited. In addition, problems such as missing checkpoints or
inaccurately marked checkpoints are common. To avoid some of these problems, many
transportation agencies use distance measuring instruments (DMIs) in their probe vehicles. With a
DMI, only one technician is needed in the vehicle. In some cases, it is even possible to log route
and checkpoint locations. However, DMIs require frequent calibrations to avoid inaccurate speed
and distance readings (Benz and Ogden, 1996). In addition, DMI data (which by definition are
linearly referenced) are not always compatible with geographic databases because of the difficulty
of ensuring that critical checkpoints on the survey, mainly the beginning and ending points, have
been properly geo-referenced in the field.
AVL is a generic term that groups several techniques that use receivers or transmitters on board
to determine vehicle location (in latitude and longitude) and speed. Examples of these techniques
are ground-based radio navigational system techniques and GPS techniques. GPS techniques are
particularly advantageous because they do not need receiving towers on the ground as traditional
radio navigational systems do. Several GPS-based techniques have been developed in recent years
(Guo and Poling, 1995; Laird, 1996; Quiroga and Bullock, 1996; Quiroga and Bullock, 1999; Zito
et al., 1995) and the number of implementations in urban areas is constantly increasing.
One of the significant advantages of AVL techniques compared to other techniques is that
traffic monitoring is roadway network and driver independent. This makes AVL suitable for
many applications, including tracking the motion of special-purpose probe vehicles and entire
fleets. When used with single probe vehicles, AVL systems are usually configured so that data are
collected and stored on board, and then post-processed in the office. When used with entire fleets,
AVL systems are usually configured so that data are collected and transmitted via radio or cellular
phone to a central location where they can be automatically processed.
Table 1 is a summary of characteristics and applicability of the travel time data collection
techniques described previously. Roadside techniques are obviously infrastructure dependent, as
opposed to vehicle techniques. Roadside techniques have lower levels of resolution and accuracy
than vehicle techniques. However, vehicle techniques are generally based on a limited number of
probe vehicles, which means that area wide coverage is limited. This makes roadside techniques
(specifically AVI) better suited for daily or real-time monitoring. In contrast, vehicle techniques
are best for determining initial conditions and for annual monitoring.
Table 1
Comparison of travel time data collection techniques (adapted from Liu and Raines, 1996; Turner, 1996)

                                Roadside techniques                Vehicle techniques
Criteria                        License plate matching  AVI        DMI      AVL
Characteristics
  Infrastructure dependent      Yes                     Yes        No       No
  Travel time/speed resolution  Low                     Low        High     High
  Travel time/speed accuracy    Good                    Good       Good     Very good
  Area wide coverage            Low                     Very good  Low      Low
  Technology status             Proven                  Proven     Proven   Proven (a)
  Capital costs                 Low                     High       Low      Low to moderate
  Operating costs per unit      Moderate                Low        High     Low to moderate
Applicability
  Annual monitoring             Yes                     Yes        Yes      Yes
  Daily monitoring              Limited                 Yes        Limited  Limited
  Real-time travel information  Limited                 Yes        No       Yes
  Incident detection            Limited                 Yes        Limited  Limited

(a) GPS is a proven technology. However, its applicability to travel time studies has been limited until recently.
4. Data management
Regardless of the data collection technique used to collect travel time data, the data
management component of a CMS is critical. In general, that component should be built using a
geographic relational database model and provide all the necessary interfaces and procedures to
measure congestion accurately, reliably, and efficiently. Obviously, the
structure of the data management component depends on the data collection techniques used and
the performance measures chosen.
As an illustration, this section summarizes the data management component of a travel time
data collection application developed recently in support of a congestion management system for
Baton Rouge, Louisiana (Quiroga and Bullock, 1998; Quiroga and Bullock, 1999). This
application, called travel time with GPS (TTG), is based on the use of GPS receivers to collect travel
time data and GIS-based procedures to manage the data collected in the field. TTG includes a
spatial model, a geographic relational database, a procedure for linearly referencing GPS data
using dynamic segmentation tools, and data reporting procedures. TTG was used to process 2.4
million GPS records on 40,000 km (25,000 miles) of travel time runs on a 240-km (151 miles)
highway network.
The spatial model is based on highway links, where highway links are defined as directional
centerlines delimited by physical discontinuities like signalized intersections, ramps, and
interchanges (Fig. 3). Fig. 3 also shows sample GPS data being mapped to highway links. Linearly
referencing these GPS points involves computing cumulative linear distances for GPS points
located along the route of interest.
Fig. 3. Highway links on Florida Boulevard in Baton Rouge, Louisiana, with overlaid GPS data collected during a
travel time run.
Fig. 4 shows a generic time-distance diagram for a probe vehicle that traverses a segment of
length L (Note: a segment does not necessarily have to be the same spatial entity as a link). The
segment travel time, t_L, could be estimated by interpolating the time stamps of the two GPS points
located immediately before and after the segment entrance, and the time stamps of the two GPS
points located immediately before and after the segment exit. If instantaneous speed values are
recorded along with the GPS positional data, a representative speed value for the segment could
be computed. In this case, the segment travel time, t_L, could be estimated by dividing the segment
length by the representative segment speed value.
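The first estimation approach, interpolating time stamps at the segment entrance and exit, can be sketched in Python as follows. Linear interpolation between the two bracketing GPS points is assumed here, and the function names and sample run are hypothetical rather than taken from TTG.

```python
def crossing_time(before, after, boundary_m):
    """Linearly interpolate the time at which the vehicle passes boundary_m.

    before and after are (time_s, cumulative_distance_m) pairs for the GPS
    points immediately before and after the boundary.
    """
    (t0, d0), (t1, d1) = before, after
    if d1 == d0:  # vehicle did not move between the two fixes
        return t0
    return t0 + (boundary_m - d0) / (d1 - d0) * (t1 - t0)


def segment_travel_time(points, start_m, end_m):
    """Estimate t_L from points = [(time_s, cumulative_distance_m), ...]."""
    def bracket(boundary):
        # Find the consecutive pair of GPS points that straddles the boundary.
        for p, q in zip(points, points[1:]):
            if p[1] <= boundary <= q[1]:
                return p, q
        raise ValueError("boundary not covered by the GPS run")

    t_in = crossing_time(*bracket(start_m), start_m)
    t_out = crossing_time(*bracket(end_m), end_m)
    return t_out - t_in


# Hypothetical 1 s GPS run: (time_s, cumulative distance along the route in m).
run = [(0, 0.0), (1, 10.0), (2, 30.0), (3, 60.0), (4, 100.0)]
```

For a segment running from 20 m to 80 m along this invented run, the entrance is crossed at 1.5 s and the exit at 3.5 s, so the estimated segment travel time is 2.0 s.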
Each GPS data file contains data such as time stamps, speed, latitude, longitude, and satellite
navigational data at regular time intervals, say every 1 s. These data need to be linearly referenced
so that GPS data can be associated with routes on the highway network. To manage all this
information efficiently, a geographic relational database was developed. For illustration purposes,
Fig. 5 shows the database schema (or database structure) and includes tables, field names and
relationships (both one-to-one and one-to-many) among tables. To assist readers in the process of
understanding the database structure, Fig. 6 shows a few sample records of each table included in
the database.
The database structure assumes that a link code is explicitly associated with each GPS point
(LinkCode attribute in table GPS_DATA). This link code results from the linear referencing
process and is the same as that associated with links in the highway network map (e.g., Link
code = 1779 in Fig. 3 and table GPS_DATA of Figs. 5 and 6). This way, users can easily build
queries based on the same segmentation scheme as that used for generating the highway network
map. Strictly speaking, however, all that is required from the linear referencing process is a route
Fig. 5. Geographic database schema showing tables and relationships among tables (primary keys are indicated in
bold).
Fig. 6. Sample of records from the database (primary keys are indicated in bold).
code and a cumulative distance value for each GPS point (attributes RouteCode and MilePost in
table GPS_DATA). With this information and any table containing cumulative distances
associated with links or segments along routes, generic GPS data tables for any highway segmentation
scheme can be produced to generate segment aggregated travel time and speed data.
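The idea that any segmentation scheme can be applied once each GPS point carries a cumulative distance can be sketched in Python. The breakpoint lists, sample points and function name below are hypothetical, and the sketch averages point speeds per segment as a simplification; it does not reproduce the travel-time-based link speed calculation used in TTG.

```python
from bisect import bisect_right
from collections import defaultdict


def aggregate_by_segment(gps_points, breakpoints):
    """Average point speed per segment along one route.

    gps_points: list of (milepost, speed) pairs for a single route.
    breakpoints: sorted cumulative distances [b0, b1, ..., bn] delimiting
    segments [b0, b1), [b1, b2), and so on.
    """
    speeds = defaultdict(list)
    for milepost, speed in gps_points:
        idx = bisect_right(breakpoints, milepost) - 1
        if 0 <= idx < len(breakpoints) - 1:  # ignore points outside the scheme
            speeds[idx].append(speed)
    return {idx: sum(v) / len(v) for idx, v in speeds.items()}


# Hypothetical linearly referenced GPS points: (MilePost, speed).
points = [(0.1, 30.0), (0.4, 40.0), (0.6, 50.0), (1.1, 60.0), (1.3, 20.0)]
```

With breakpoints `[0.0, 0.5, 1.2]` the points fall into two segments, while `[0.0, 1.2]` treats the same stretch as one segment. Only the breakpoint table changes; the GPS table is untouched.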
TransCAD was used to linearly reference GPS data and to display results in a map format. Like
other desktop GIS packages, TransCAD's architecture is based on a fairly large number of
associated files. For example, a geographic file can easily involve 10-20 associated files including
graphical elements, tables, indexes, and data dictionaries. Most tables are in dBase IV format
(.dbf extension), which means that each table is stored in a separate dBase file. This kind of
architecture is intended to provide flexibility to typical TransCAD users. However, it can also
complicate data management in an environment where tens or hundreds of GPS data
files are being generated and processed. With GPS data scattered in several dBase files, the process
of enforcing data integrity constraints, building queries, and producing reports that involve
aggregating or summarizing travel time and speed data by, say, time period or corridor, could be
quite challenging. To address this issue, all database tables were stored in a single Access file.
Tables LINKS, ROUTE_LINKS, and GPS_DATA contain records imported from TransCAD
files. Table LINKS contains the same records as file links.dbf, which is a file generated by
TransCAD for viewing attribute data associated with each link in the highway network. Table
ROUTE_LINKS is the result of joining file links.dbf and all route link.dbf files in TransCAD.
Table GPS_DATA contains linearly referenced GPS data that result from the linear referencing
process in TransCAD.
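The benefit of holding all tables in one relational store can be illustrated with a small Python sketch. SQLite is used here purely as a stand-in for the Access file described above. Only the table names LINKS and GPS_DATA and the fields LinkCode, RouteCode and MilePost come from the text; every other field name, all sample rows and both function names are hypothetical, and the schema is a cut-down illustration rather than the one in Fig. 5.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a cut-down, partly hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE LINKS (
            LinkCode INTEGER PRIMARY KEY,
            LengthKm REAL                    -- hypothetical field
        );
        CREATE TABLE GPS_DATA (
            GpsId     INTEGER PRIMARY KEY,   -- hypothetical field
            LinkCode  INTEGER REFERENCES LINKS(LinkCode),
            RouteCode TEXT,
            MilePost  REAL,
            TimeOfDay TEXT,                  -- hypothetical field, 'HH:MM:SS'
            Speed     REAL                   -- hypothetical field, km/h
        );
    """)
    conn.executemany("INSERT INTO LINKS VALUES (?, ?)",
                     [(1779, 0.4), (1780, 0.6)])
    conn.executemany(
        "INSERT INTO GPS_DATA (LinkCode, RouteCode, MilePost, TimeOfDay, Speed) "
        "VALUES (?, ?, ?, ?, ?)",
        [
            (1779, "R1", 0.10, "07:01:00", 40.0),
            (1779, "R1", 0.12, "07:01:01", 44.0),
            (1780, "R1", 0.50, "07:02:00", 30.0),
            (1779, "R1", 0.10, "08:00:00", 60.0),
        ],
    )
    return conn


def average_speed_by_link(conn, start, end):
    """Mean GPS point speed per link for records with start <= TimeOfDay < end."""
    return conn.execute(
        "SELECT LinkCode, AVG(Speed) FROM GPS_DATA "
        "WHERE TimeOfDay >= ? AND TimeOfDay < ? "
        "GROUP BY LinkCode ORDER BY LinkCode",
        (start, end),
    ).fetchall()
```

Because every GPS record sits in one table keyed by link code, a single grouped query can summarize speed by link and time period. Averaging point speeds is a simplification chosen for brevity and is not the link speed calculation performed by TTG.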
Fig. 7 shows a generic view of a typical work flow using TTG. For completeness, Fig. 7 shows
both data reduction steps and data reporting steps. In this section, only the data reduction steps
are discussed. In summary, the data reduction procedure allows users to
• Impor t GPS data file into the GIS to generate a point geographi c file that can be overlaid on
the highway network vector map ;
• Specif y when and where the vehicle enters and exits a study route using an animated GPS play-
back utilit y (Fig. 8) . Of particular interes t in Fig. 8 are th e mark-in (MI) and mark-ou t (MO )
buttons. The MI butto n i s used to specif y whe n the vehicle enters a route by marking the first
GPS point associated with that route . Similarly , the MO button is used to specif y whe n the ve-
hicle exits a route by marking the last GP S point associate d wit h that route. Fo r adde d flexi-
bility, the GPS player utilit y allows users to defin e MI-M O pairs anywhere along a route and
define more than on e MI-MO pair per route. This technique is useful fo r filtering out spuriou s
GPS points an d fo r partial rout e analysis;
• Linearl y reference GPS points to the highway network. TTG also measures th e transversal dis -
tance from eac h GPS point t o the mapped lin k on the network (Offset field in table GPS_DA -
TA of Fig. 6). This offset provide s a verification of GPS positional accuracy and allows users to
flag GPS points that ma y have been reference d to incorrec t routes ;
• Impor t th e linearly referenced GPS data int o a repository databas e (Microsof t Access).
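The linear referencing step in the list above can be pictured with a small geometric sketch. It is not the TransCAD dynamic segmentation tool; it only assumes that a link is a planar polyline in projected coordinates and that a GPS point is referenced to the closest position on that polyline. The function name `linear_reference` and the sample coordinates are hypothetical.

```python
import math

def linear_reference(point, polyline):
    """Project `point` onto `polyline`; return (distance_along, offset).

    Assumes planar (projected) coordinates in consistent units.
    """
    px, py = point
    best = None
    travelled = 0.0
    for (x1, y1), (x2, y2) in zip(polyline, polyline[1:]):
        dx, dy = x2 - x1, y2 - y1
        seg_len = math.hypot(dx, dy)
        if seg_len == 0.0:
            continue  # skip duplicate vertices
        # Fraction along the segment of the closest point, clamped to [0, 1].
        t = ((px - x1) * dx + (py - y1) * dy) / seg_len ** 2
        t = max(0.0, min(1.0, t))
        cx, cy = x1 + t * dx, y1 + t * dy
        offset = math.hypot(px - cx, py - cy)  # transversal distance
        if best is None or offset < best[1]:
            best = (travelled + t * seg_len, offset)
        travelled += seg_len
    if best is None:
        raise ValueError("polyline needs at least one non-degenerate segment")
    return best

# Hypothetical link geometry (metres) and two GPS points.
link = [(0.0, 0.0), (100.0, 0.0), (100.0, 50.0)]
along_a, offset_a = linear_reference((40.0, 3.0), link)
along_b, offset_b = linear_reference((104.0, 20.0), link)
```

A large offset returned by such a routine is the kind of value the text suggests flagging, since it may point to a GPS positional error or to a point referenced to the wrong route.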
After storing the linearly referenced GPS data in a repository database, the next step involves
constructing queries and reports. For the sake of brevity, only a sample of database querying and
data reporting options are included here. For additional data reporting examples, readers are
referred to other sources (Quiroga, 1997; Quiroga and Bullock, 1998; Turner et al., 1998).
Suppose it is of interest to produce reports showing average link speeds on a corridor. The procedure
to do this can be summarized as follows:
• Build a query to retrieve the GPS records associated with the time period of interest (e.g.,
7:00-7:15 am).
• Calculate link travel time and speed for each travel time run conducted during the time period
of interest. Because relational database and GIS packages do not have tools to readily perform
numerical interpolations, a special purpose utility was developed. This utility automatically
calculates link speeds based on a table such as GPS_DATA and outputs the results to a table
called SEGMENT_SPEEDS_x (where x represents the number associated with the procedure
chosen to calculate link or segment speed). For example, by selecting a time interpolation
procedure (Fig. 4), the output table would be called SEGMENT_SPEEDS_1. The utility provides
users with the capability to compute segment speeds and travel times for any highway
segmentation scheme.
• Calculate representative segment speeds (Eqs. (2)-(4)). Off-the-shelf database functions give
users the capability to compute minimum, average, and maximum speeds, but not median speeds.
As a result, a utility to compute median speeds was developed.
• Produce reports documenting average speeds on the corridor of interest. Examples of reports
include maps and strip charts (Fig. 9) and tabular reports.
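The interpolation step in the procedure above can be sketched as follows. This is an illustration only, not the utility described in the paper: it assumes that each GPS record carries a time stamp and a milepost, that mileposts increase strictly along the run (a stopped vehicle would need extra handling), and that the time at each link boundary is obtained by linear interpolation between the two neighbouring GPS points. The names `time_at` and `link_speeds` and the sample run are hypothetical.

```python
from bisect import bisect_right

def time_at(milepost, posts, times):
    """Linearly interpolate the time at which the vehicle passed `milepost`."""
    if not posts[0] <= milepost <= posts[-1]:
        raise ValueError("milepost outside the GPS run")
    i = min(bisect_right(posts, milepost), len(posts) - 1)
    p0, p1 = posts[i - 1], posts[i]
    t0, t1 = times[i - 1], times[i]
    return t0 + (milepost - p0) / (p1 - p0) * (t1 - t0)

def link_speeds(posts_km, times_s, bounds_km):
    """Return (travel_time_s, speed_kmh) for each link between consecutive bounds."""
    crossings = [time_at(b, posts_km, times_s) for b in bounds_km]
    results = []
    for k in range(len(bounds_km) - 1):
        travel_time = crossings[k + 1] - crossings[k]
        length = bounds_km[k + 1] - bounds_km[k]
        results.append((travel_time, length / travel_time * 3600.0))
    return results

# Hypothetical run: one GPS point every 10 s.
posts = [0.0, 0.2, 0.5, 0.7, 1.0]      # mileposts in km
times = [0.0, 10.0, 20.0, 30.0, 40.0]  # seconds since start of run
speeds = link_speeds(posts, times, [0.1, 0.6, 0.9])
```

Because the link boundaries are passed in as a plain list, the same routine works for any segmentation scheme, which mirrors the flexibility the text attributes to the utility.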
TTG was tested using GPS data from 428 files collected in Baton Rouge, Louisiana. The 428
files included 2.4 million GPS records on 40,000 km (25,000 miles) of travel time runs on a 240-km
(151 miles) highway network. Of the 2.4 million GPS records collected, 1.8 million GPS records
were located on the main routes. The remaining 0.6 million GPS records were located on the other
parts of the network including service roads, on-ramps, off-ramps, and intersecting streets. The 1.8
million GPS records located on the main routes were stored in table GPS_DATA in the Access
database file. The resulting size of this database file, including the other tables shown in Figs. 5
and 6, was 238 Mbytes. This number translates to approximately 132 bytes per linearly referenced
GPS record.
Processing and storing 1.8 million linearly referenced GPS records in a relational database
provided the capability to improve and optimize the data reduction application, develop
generalized data quality control checks, and detect limitations and bugs of the development platform. In
general, the centralized database approach requires some extra effort on the part of users when
setting up projects and at the beginning and end of the data reduction process (mainly to generate
entries for the GPS data file in Access and to import and append gpsdata.dbf files to Access).
However, it appears the extra effort is worth the benefits the system provides, particularly with
respect to the ability to build comprehensive queries to generate travel time reports, the ability to
conduct generalized data quality control checks, and the ability to process data much faster than
with manual data collection methods. An analysis of data reduction speeds indicates that the
automated data reduction process can result in 15-17 min of data reduction time for two hours
worth of GPS data. By comparison, traditional data reduction procedures based on manual data
collection procedures require about one hour of data reduction per hour of data collection
(Turner et al., 1998). In other words, the GPS/GIS data reduction procedure described here is
about 7.5 times faster than traditional manual data collection procedures.
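The storage and speed figures quoted above can be checked with a few lines of arithmetic. The sketch assumes that one Mbyte means 10^6 bytes and that the midpoint of the 15-17 min range is representative; both are assumptions rather than statements from the paper. Note also that the 238 Mbytes include the other database tables, so the per-record figure is an upper estimate.

```python
# Storage per linearly referenced GPS record.
db_bytes = 238e6        # 238 Mbytes, taken as 10**6 bytes each (assumption)
records = 1.8e6         # records stored in table GPS_DATA
bytes_per_record = db_bytes / records

# Speed-up of automated over manual data reduction for two hours of data.
manual_min = 2 * 60.0              # one hour of reduction per hour of data
auto_min = (15.0 + 17.0) / 2.0     # midpoint of the reported range (assumption)
speedup = manual_min / auto_min
```

Under these assumptions the two results agree with the rounded values given in the text, about 132 bytes per record and a factor of 7.5.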
During the processing of the GPS data files, a number of areas were detected where errors tend
to occur frequently. To assist readers in this process, a set of procedures or checks for data quality
control were developed. Some of these checks are listed below. Additional checks can be found in
Quiroga and Bullock (1998).
• File system. Make sure each GPS data file is stored in a separate subdirectory under the
GPSDATA subdirectory. Likewise, make sure the file entry in table GPS_FILE_PROPERTIES,
particularly the file name and path, is correct. The need to check for the location of the GPS
data file could be eliminated by developing a script to automatically store GPS data files in
the appropriate subdirectories as they are being uploaded from the data collection equipment.
The script would also generate an entry in the database automatically.
• MI/MO operation. Verify that route assignments and beginning and ending time stamps are
correct. An effective way of doing this is by checking the contents of the output file from the
data reduction process. As an aid to users, this file is automatically displayed at the end of every
MI/MO session.
• Route system files. Before beginning with the formal linear referencing process, verify that all
route system files are correct, i.e., that routes contain only valid links and that the beginning
and ending mileposts of individual links are correct. TransCAD automatically calculates
mileposts; however, if links must be edited, the GIS does not always recalculate distances correctly.
• Linear referencing. Verify that link codes, mileposts, and offsets are correct and meaningful.
Offsets are transverse distances from the GPS points to the highway links and provide an
indirect measurement of either GPS data positional errors (if the underlying highway network base
map is more accurate than the GPS data) or GPS data mapping errors (unusually high offsets
could be an indication that the route assignment was incorrect). Table GPS_DATA can be used
to check for large offset values.
• Average link/segment speeds. Verify that representative link/segment speeds are meaningful.
TTG includes two formulations for the computation of representative link/segment speeds: using
a harmonic mean formulation (Eq. (3)), and using a median formulation (Eq. (4)). As discussed
previously, harmonic mean speeds are based on arithmetic mean travel times. However,
harmonic mean speeds are very sensitive to low outlying speeds, which tend to occur under atypically adverse
traffic conditions. By comparison, median speeds are not sensitive to outliers and, therefore, they
tend to provide more robust estimates of central tendency than harmonic mean speeds.
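The contrast between the two representative speeds in the last check can be illustrated with a short sketch. It uses the textbook harmonic mean and median from the Python standard library, which may differ in detail from Eqs. (3) and (4) of the paper; the link length and run speeds are hypothetical.

```python
from statistics import harmonic_mean, mean, median

link_km = 1.0
# Hypothetical speeds from five runs on one link; the last is a slow outlier.
speeds_kmh = [95.0, 100.0, 105.0, 100.0, 20.0]
times_h = [link_km / v for v in speeds_kmh]

hm_speed = harmonic_mean(speeds_kmh)
# Same quantity obtained from the arithmetic mean of the travel times.
speed_from_mean_time = link_km / mean(times_h)
median_speed = median(speeds_kmh)
```

In this example the single slow run pulls the harmonic mean well below the speeds of the other four runs, while the median stays at 100 km/h.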
5. Conclusions
This paper presented a summary of procedures, collectively called travel time with GPS (TTG),
and examples of application of GPS and GIS technologies for the collection of travel time data
needed for monitoring and managing congestion. The paper examined transportation network
performance measures and discussed the benefit of using travel time as a robust,
easy-to-understand performance measure. The paper addressed data needs and examined the use of new
technologies such as global positioning system (GPS) technology for the collection of travel time
and speed data.
TTG is built using a general data model that includes a spatial model, a geographic relational
database, and a procedure for linearly referencing GPS data. The spatial model uses a GPS-based
directional vector representation of the network. In this vector representation of the network,
routes are partitioned into links and links are assigned unique identification numbers. The
geographic relational database is composed of a series of tables that store information about links,
routes, posted speed limits, GPS file descriptors, and linearly referenced data. The procedure for
linearly referencing GPS data uses GIS dynamic segmentation tools. To automate this process, an
application that allows users to determine when a vehicle enters and exits a route and to
automatically calculate mileposts for all GPS points along the routes of interest was developed.
TTG was implemented using a PC-based TransCAD-Access environment. This environment is
relatively inexpensive and allows users to process vast amounts of GPS data and produce reports
quickly and cost-effectively. This environment works well in most cases, although it was found that
the software had some deficiencies that could produce erroneous results if care is not taken in
processing the data.
Following an analysis of the travel time and speed data and an evaluation of the congestion
situation, the next step would be to identify and evaluate potential TCMs. If appropriate and
properly implemented, TCMs could result in reductions of congestion levels. A sample of TCMs
is included in Table 2. Notice that some of the TCMs can be evaluated using GIS techniques,
particularly those that involve spatial analyses such as accessibility analysis, routing analysis, and
demand/market program analysis. As an illustration, consider the case of defining appropriate
locations for park-and-ride lots. Each park-and-ride location has an associated cost to locate,
build, and operate. In addition, each location is expected to serve a number of drivers. For each
driver using the park-and-ride lot, there is an associated cost of service, e.g., the travel time be-
tween the driver’s home and the park-and-ride location. A goal for the park-and-ride location
could be to minimize the cost of service for all drivers. Cost of service values are stored in cost
matrices. In a cost matrix, each row represents the location of each alternative park-and-ride lot
and each column represents a driver. Each cell, therefore, represents the cost of service for a single
location-driver combination. For the GIS analysis, the following layers of data are needed: a
network map layer with travel time and/or speed data, a park-and-ride location layer, and a driver
location layer. Using GIS routing and dynamic segmentation functions, it is possible to construct
the corresponding cost matrix. The minimum total cost of service could be obtained by adding the
cost of service for all drivers who are expected to use each park-and-ride location and by com-
paring the total cost of service among locations. The locations with the lowest total cost of service
are then retained for further analysis.
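The comparison of candidate park-and-ride locations described above amounts to summing each row of the cost matrix and ranking the totals. A minimal sketch follows; the lot names and travel times are hypothetical, and it assumes every driver is counted against every candidate lot.

```python
# Rows: candidate park-and-ride lots; columns: drivers.
# Cell values: cost of service, here travel time in minutes (hypothetical data).
cost_matrix = {
    "Lot A": [12.0, 8.0, 15.0, 20.0],
    "Lot B": [10.0, 9.0, 11.0, 14.0],
    "Lot C": [18.0, 7.0, 9.0, 25.0],
}

# Total cost of service per candidate location.
totals = {lot: sum(row) for lot, row in cost_matrix.items()}

# Candidates ordered from lowest to highest total cost.
ranked = sorted(totals, key=totals.get)
```

In practice the cells would come from GIS routing on the network layer, and the location costs of building and operating each lot would be weighed alongside these totals.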
Another example could be that of development of para-transit operation improvements. The
objective would be to service all origins and destinations based on a specified fleet of vehicles,
while minimizing the total travel time for their customers. Each vehicle has a fixed capacity and
each destination has a demand given by the number of passengers that must be transported
there. This is a typical routing problem that requires the construction of a vehicle routing
matrix. The vehicle routing matrix contains the distance and travel time between each origin
(i.e., each customer’s home) and each destination and between every pair of origins. It is
possible to build the vehicle routing matrix by using GIS routing functions applied to a network
map layer, a customer home layer, and a destination layer. Once the vehicle routing matrix is
developed, the system attempts to find efficient routes that service as many customers and
destinations as possible while trying to minimize the total travel time. The output from the
procedure is an itinerary for each vehicle summarizing the route and all stops and destinations
associated with that route.

Table 2
Transportation control measures likely to have a positive impact on congestion
Transportation control measure group          Examples
High occupancy vehicle (HOV) lanes            Entrance ramp priority, dedicated HOV lanes
Traffic flow improvements                     Traffic signal optimization, incident management systems
Parking management                            Preferential parking for HOVs, parking zoning regulations,
                                              park-and-ride facilities, shuttle services
Vehicle use restrictions                      Route diversion, downtown vehicle restrictions, no-drive days,
                                              truck movement control
Special event and activity center programs    Remote parking with shuttle service, parking management
Improved public transit                       Service expansion, operational improvements, rail expansion
Employer-based transportation                 Carpooling, transit, financial incentives to employees,
management programs                           telecommuting, flextime, compressed work weeks
Trip reduction ordinances                     Special use permits, mandated ridesharing
Rideshare incentives                          Commute management organizations, tax incentives
Bicycle and pedestrian programs               Bicycle routes and storage facilities, sidewalks and walkways
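As a rough illustration of the vehicle routing matrix in the para-transit example, the sketch below computes shortest travel times between every pair of stops on a small network. It uses Dijkstra's algorithm as a stand-in for the GIS routing functions, which the paper does not describe in detail; the network, node names, and link times are hypothetical, and the subsequent routing optimization is not attempted.

```python
import heapq

def shortest_times(graph, source):
    """Dijkstra's algorithm: minimum travel time from `source` to each reachable node."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        t, node = heapq.heappop(queue)
        if t > best.get(node, float("inf")):
            continue  # stale queue entry
        for nxt, cost in graph.get(node, {}).items():
            cand = t + cost
            if cand < best.get(nxt, float("inf")):
                best[nxt] = cand
                heapq.heappush(queue, (cand, nxt))
    return best

# Hypothetical network: link travel times in minutes, both directions listed.
graph = {
    "home1": {"x": 4.0},
    "home2": {"x": 3.0, "clinic": 9.0},
    "x": {"home1": 4.0, "home2": 3.0, "clinic": 5.0},
    "clinic": {"x": 5.0, "home2": 9.0},
}
stops = ["home1", "home2", "clinic"]  # origins (homes) and one destination

matrix = {}
for origin in stops:
    times = shortest_times(graph, origin)
    matrix[origin] = {dest: times[dest] for dest in stops}
```

The resulting matrix holds travel times between each origin and the destination and between the two origins, which is the input a routing heuristic would then use to build vehicle itineraries.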
References
Benz, R.J., Ogden, M.A., 1996. Development and benefits of computer-aided travel time data collection. Transportation Research Record 1551, TRB, National Research Council, Washington, DC, pp. 1-7.
CAAA, 1990. Clean Air Act Amendment. US Code, Title 1, Section 103.
FHWA, 1998. Our Nation's Highways - Selected Facts and Figures. Publication No. FHWA-PL-98-015, US Department of Transportation, Washington, DC, p. 28.
Francois, M.I., Willis, A., 1995. Developing effective congestion management systems. Federal Highway Administration, Technical Report No. 8, p. 22.
Guo, P., Poling, A.D., 1995. Geographic information systems/global positioning systems design for network travel time study. Transportation Research Record 1497, TRB, National Research Council, Washington, DC, pp. 135-139.
HRB, 1950. Highway Capacity Manual; Practical Applications of Research. National Research Council, Washington, DC, p. 147.
HRB, 1965. Highway Capacity Manual, second ed. Special Report 87, National Research Council, Washington, DC, p. 397.
ISTEA, 1991. Intermodal Surface Transportation Efficiency Act. US Code, Title 23, Chapter 3, Section 303.
JHK, 1996. Performance Measures and Levels of Service in the Year 2000 Highway Capacity Manual. NCHRP Project 3-55(4), TRB, National Research Council, Washington, DC, p. 25.
Laird, D., 1996. Emerging Issues in the Use of GPS for Travel Time Data Collection. National Traffic Data Acquisition Conference, Albuquerque, NM, 6-9 May, 1, pp. 117-123.
Lindquist, E., 1999. Assessing effectiveness measures in the ISTEA management systems. Southwest Region University Transportation Center, Texas Transportation Institute, College Station, TX, p. 103.
Liu, T.K., Haines, M., 1996. Travel Time Data Collection Field Tests - Lessons Learned. Report FHWA-PL-96-010, US Department of Transportation, p. 116.
Lomax, T., Turner, S., Shunk, G., Levinson, H.S., Pratt, R.H., Bay, P.N., Douglas, G.B., 1997. Quantifying congestion. Final Report, National Cooperative Highway Research Program, Transportation Research Board, p. 184.
Quiroga, C.A., 1997. An integrated GPS-GIS methodology for performing travel time studies. Ph.D. Dissertation, Louisiana State University, Baton Rouge, LA, p. 171.
Quiroga, C.A., 1999. Accuracy of linearly referenced data using GIS. Transportation Research Record 1660, TRB, National Research Council, Washington, DC, pp. 100-107.
Quiroga, C.A., Bullock, D., 1996. Architecture of a congestion management system for controlled-access facilities. Transportation Research Record 1551, TRB, National Research Council, Washington, DC, pp. 105-113.
Quiroga, C.A., Bullock, D., 1998. Development of CMS Monitoring Procedures. Report No. FHWA/LA-314, Louisiana Transportation Research Center, p. 87.
Quiroga, C.A., Bullock, D., 1999. Travel time information using GPS and dynamic segmentation techniques. Transportation Research Record 1660, TRB, National Research Council, Washington, DC, pp. 48-57.
Schrank, D.L., Lomax, T.J., 1999. The 1999 Annual Mobility Report. Texas Transportation Institute, p. 123.
Schwartz, W.L., Suhrbier, J.H., Gardner, B.J., 1995. Data collection and analysis methods to support congestion management systems. ASCE Transportation Congress, Proceedings V. 2, San Diego, CA, pp. 2012-2023.
TEA 21, 1998. Transportation Equity Act for the 21st Century. US Code, Title 23, Section 149.
TRB, 1985. Highway Capacity Manual, third ed. Special Report 209, National Research Council, Washington, DC, p. 504.
TRB, 1994. Highway Capacity Manual, third ed. Update, Special Report 209, National Research Council, Washington, DC.
TRB, 1997. Highway Capacity Manual, third ed. Update, Special Report 209, National Research Council, Washington, DC.
Turner, S.M., 1996. Advanced techniques for travel time data collection. Transportation Research Record 1551, TRB, National Research Council, Washington, DC, pp. 51-58.
Turner, S.M., Eisele, W.L., Benz, R.J., Holdener, D.J., 1998. Travel Time Data Collection Handbook. Report No. FHWA-PL-98-035, Texas Transportation Institute, College Station, TX, p. 346.
Zito, R., D'Este, G., Taylor, M.A.P., 1995. Global positioning systems in the time domain: how useful a tool for intelligent vehicle-highway systems? Transportation Research Part C 3 (4), 193-209.
Abstract
Road fragmentation is a concern for wildlife viability in and adjacent to protected areas in the Rocky
Mountains. Roads create a barrier to wildlife movement and have documented demographic effects,
including the alteration of animal communities, the reduction of biological diversity, and the increased
threat of extinction. Wildlife movement across and adjacent to the Trans-Canada Highway (TCH) (14,000
annual average daily traffic, AADT) and Highway 1A (3000 AADT) was studied in Banff National Park,
Alberta. Animal tracks were observed crossing roadways and on transects adjacent to roads for wolves,
cougar, lynx, wolverine, marten, elk, deer, sheep, hare, and red squirrel relative to road types. Data were
analyzed to assess the barrier effect and a geographical information system (GIS) was used to identify
landscape attributes associated with species movement. The TCH was found to be a barrier to movement
for all species. In less perturbed environments, it was observed that movement patterns for the wildlife
communities were spatially continuous and that individual species movement was complex. This
movement was not observed across the TCH. An interpolation of point data showed sites of high crossing
frequency within the continuum of crossing points. These sites ranged from 250 to 2000 m in diameter.
General predictors for movement by aspect were found to be the south, southwest and west facing slopes.
Flat slopes, areas of low topographic complexity, and slopes lower than 5° were also effective predictors of
animal movements. The data suggest that maintaining contiguous tracts of habitat with the above
attributes facilitates normal wildlife movement most effectively. Mitigation that approximates previous patterns
can be achieved only by elevating and/or burying extensive sections of highway. © 2000 Elsevier Science
Ltd. All rights reserved.
0968-090X/00/$ - see front matter © 2000 Elsevier Science Ltd. All rights reserved.
PII: S0968-090X(00)00014-0
308 S.M. Alexander, N.M. Waters / Transportation Research Part C 8 (2000) 307-320
1. Introduction
One of the richest sources of original GIS-T research has been the AASHTO sponsored GIS-T
conferences held annually since 1988. Cooke et al. (1998) note that many state departments of
transportation (DOT) use GIS to support transportation decision-making processes. However,
they argue that their case study is novel in that it uses GIS to integrate environmental issues into
the transportation systems planning process in the North Carolina Department of Transportation
(NCDOT). The previous lack of environmental data had, according to the authors, led to a feeling
of frustration with the decision-making process and confrontation rather than cooperation among
transportation planners, engineers, and environmental planners. One of the main benefits of GIS
was found to be the ease with which consensus could be reached on environmentally preferred
corridors among a variety of agencies with widely differing perspectives. The use of GIS by the
NCDOT has allowed site-specific environmental information to be incorporated in the earliest
phase of the planning process. Later in this paper, similar site-specific information is used to
determine the best locations for road mitigation measures, such as under- and over-passes on
wildlife movement patterns. The NCDOT used their GIS to develop a series of environmental
sensitivity maps that would allow the least sensitive areas to be set aside for future transportation
corridors.
If the AASHTO sponsored GIS-T symposia have traditionally neglected the environment, then
the ICOWET conferences have conversely paid scant attention to GIS as they address how the
environment is impacted by transportation infrastructure. That this is beginning to change is
The creation of linkages between isolated patches of habitat has been identified as one method
for improving biotic exchange across road barriers (Forman and Alexander, 1998). Linkages
across barriers facilitate dispersal and migration processes, which are critical to species persistence
(Weaver et al., 1996). Dispersal refers to movements by juvenile animals when leaving their natal
range, while migration is the movement of individuals or groups between two areas (Weaver et al.,
1996).
4. Study area
to April (1997/98 to 1998/99). Except for wildlife warning signs, none of the study roads are
currently mitigated.
Three road sections were surveyed; each is approximately 30 km in length. Annual average
daily traffic (AADT) volume on the TCH is about 14,000 vehicles and volume is about 3000
vehicles on the 1A (Banff-Bow Valley Study, 1996). The BNP road survey sections used in this
study are classified as:
B1: Bow Valley Parkway (Hwy 1A) from 5 Mile Bridge to Castle Junction,
B2: Bow Valley Parkway (Hwy 1A) from Castle Junction to Lake Louise, and
B3: TCH from Castle Junction to the British Columbia Border (unfenced, non-twinned section
of the TCH).
4.1. Methodology
extended survey period relative to the road survey. Within daylight hours, a researcher is
physically capable of surveying no more than five to seven 1-km transects per day. Tracks were
recorded on transects for all road crossing species plus squirrel, weasel, and hare. The additional
species were recorded for use as explanatory variables in the presence of other species. These three
species were not included on road surveys because their abundance would prohibit timely
completion of the surveys. Transect survey data provide an estimate of available migrants and are
critical to explaining variances in movement at different traffic volumes.
Landscape attributes have been used to predict movement of species across the landscape.
Research on wolves in BNP has indicated that movement can be predicted by presence or use of
landscape features, such as slope, aspect, elevation, and vegetation type and closure (Paquet,
1993). A disproportionate use of certain physiographic attributes is assumed to reflect a preference
(Alexander and Waters, 1999).
The Idrisi GIS (Eastman, 1997) was used to analyze landscape attributes coincident with
crossings observed in 1998/99 for marten, wolf, lynx, cougar, and elk. Slope and aspect coverages
were derived using the SURFACE module and a digital elevation model (DEM-Alberta
Environment, Provincial Base Map) with a 30-m resolution. Topographic complexity was calculated
from the DEM using the PATTERN module within Idrisi (Eastman, 1997). One function in this
module, which determines the number of different classes around a centroid, was used to specify a
3 x 3 moving window for the complexity analysis. For details on this operation refer to the Idrisi
instruction manual (Eastman, 1997). A combination of the Idrisi CROSSTAB module and the
Minitab statistical package was used to calculate χ2 and related statistics (Loether and McTavish,
1988; Fosnight and Fowler, 1996) to examine the various landscape associations discussed below.
To visualize the spatial distribution of points, a map overlay operation was used to render a
composite surface of all crossing points. An Idrisi SURFACE interpolation was then run on the
previous road coverage to identify crossing "hot spots" for multiple species. These hot spots may
define suitable locations and spatial extent required for successful mitigation.
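The topographic complexity measure described above, the number of different classes in a 3 x 3 window around each cell, can be sketched in a few lines. This is not the Idrisi PATTERN module; the function name, the treatment of edge cells (left undefined here), and the sample grid of elevation classes are all assumptions made for illustration.

```python
def complexity(grid):
    """Count distinct classes in the 3 x 3 window centred on each interior cell.

    Edge cells, which lack a full window, are left as None (an assumption).
    """
    rows, cols = len(grid), len(grid[0])
    out = [[None] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            window = {grid[rr][cc]
                      for rr in (r - 1, r, r + 1)
                      for cc in (c - 1, c, c + 1)}
            out[r][c] = len(window)
    return out

# Hypothetical raster of elevation classes.
elev_class = [
    [1, 1, 1, 2],
    [1, 1, 2, 3],
    [1, 1, 2, 4],
    [1, 1, 1, 1],
]
result = complexity(elev_class)
```

Low counts mark cells surrounded by similar terrain, and higher counts mark more varied terrain.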
5. Results
Tables 1 and 2 summarize crossing counts for all species surveyed during the winter seasons
1998/99 and 1997/98. Each of the three road sections is approximately 30 km in length.
Crossing frequencies were compared by highway section using the χ2 statistic at the 99%
confidence level. Observed crossings were tested against a uniform distribution of crossings.
Results of this analysis are presented in Tables 3 and 4.
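As an illustration of the section comparison, the sketch below recomputes the 1997/98 statistic for the 1A East section against the TCH section from the counts in Table 2. It reads the comparison as a species-by-section contingency test, with species that were never observed on either section dropped; this reading is an assumption, since the text describes the null hypothesis only as a uniform distribution of crossings. Cramer's V is computed with the standard formula, the square root of χ2 / (N · min(r − 1, c − 1)). The function name is hypothetical.

```python
from math import sqrt

def chi_square(table):
    """Pearson chi-square, degrees of freedom, and Cramer's V for a contingency table."""
    rows = [r for r in table if sum(r) > 0]      # drop species with no crossings
    col_tot = [sum(col) for col in zip(*rows)]
    n = sum(col_tot)
    chi2 = 0.0
    for row in rows:
        row_tot = sum(row)
        for obs, ct in zip(row, col_tot):
            exp = row_tot * ct / n               # expected count under independence
            chi2 += (obs - exp) ** 2 / exp
    k = min(len(rows) - 1, len(col_tot) - 1)
    df = (len(rows) - 1) * (len(col_tot) - 1)
    return chi2, df, sqrt(chi2 / (n * k))

# 1997/98 crossings from Table 2: (1A East, TCH section B3), one row per species
# in the order marten, coyote, wolf, lynx, cougar, wolverine, elk, moose,
# sheep, deer, fox, fisher.
counts = [(68, 16), (77, 23), (14, 1), (3, 5), (12, 0), (6, 1),
          (50, 2), (0, 0), (3, 0), (7, 0), (1, 0), (0, 0)]
chi2, df, v = chi_square(counts)
```

With these counts the sketch yields a χ2 of about 27.27 with 9 degrees of freedom and a Cramer's V of about 0.31, in line with the corresponding row of Table 3.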
Observed crossings were compared with expected counts using output from the Idrisi
CROSSTAB module, and the χ2 statistics were calculated using Minitab (Kitchin and Tate, 2000), while
Cramer's V measures of association were calculated by the authors (Loether and McTavish,
1988). Observations were tested against a uniform distribution of crossings by attribute class
Table 1
Banff road crossing summary: total for 1998/1999 (17 surveys)
Species      1A: East (B1)    1A: West (B2)    Phase IIIB (B3)
             (unmitigated)    (unmitigated)    (unmitigated)
Marten       219              106              65
Coyote       76               36               28
Wolf         30               5                1
Lynx         0                2                6
Cougar       18               0                0
Wolverine    0                0                0
Elk          69               8                2
Moose        0                1                0
Sheep        6                0                0
Deer         57               1                0
Fox          1                0                0
Fisher       0                1                0
Table 2
Banff road crossing summary: total for 1997/1998 (12 surveys)
Species      1A: East (B1)    1A: West (B2)    Phase IIIB (B3)
             (unmitigated)    (unmitigated)    (unmitigated)
Marten       68               15               16
Coyote       77               9                23
Wolf         14               7                1
Lynx         3                6                5
Cougar       12               0                0
Wolverine    6                0                1
Elk          50               3                2
Moose        0                0                0
Sheep        3                0                0
Deer         7                0                0
Fox          1                0                0
Fisher       0                0                0
Table 3
1997/98 summary statistics
Sections compared    df    χ2         Cramer's V    Significance level (%)
B1 vs B3             9     27.268     0.31          99
B2 vs B3             5     11.314     0.35          95
IIIA vs B3           6     393.775    0.66          99
IIIA vs B2           8     455.355    0.44          99
B1 vs B2 vs B3       18    224.585    0.54          99
Table 4
1998/99 summary statistics
Sections compared    df    χ2         Cramer's V    Significance level (%)
B1 vs B3             8     72.276     0.35          99
B2 vs B3             7     9.734      0.19          Under 95
B1 vs B2 vs B3       20    120.603    0.29          99
Table 5
Marten (N = 489)
Attribute                 df    χ2         Significance (%)    Cramer's V
Aspect                          272.528    99                  0.0290
Slope                     21    375.096    99                  0.0350
Topographic complexity          343.693    99                  0.0314
Table 6
Elk (N = 134)
Attribute                 df    χ2         Significance (%)    Cramer's V
Aspect                    8     83.282     99                  0.016
Slope                     11    149.232    99                  0.021
Topographic complexity    8     109.716    99                  0.019
Table 7
Wolf (N = 19 - points surveyed opportunistically not included)
Attribute                 df    χ2         Significance (%)    Cramer's V
Aspect                    8     32.810     99                  0.010
Slope                     11    30.466     99                  0.010
Table 8
Lynx (N = 3 - points surveyed opportunistically not included)
Attribute                 df    χ2         Significance (%)    Cramer's V
Aspect                          19.475     99                  0.008
Slope                     11    20.357     95                  0.007
Table 9
Cougar (N = 1 - points surveyed opportunistically not included)
Attribute                 df    χ2         Significance (%)    Cramer's V
Aspect                          11.716     Under 95            N/A
Slope                     11    10.166     Under 95            N/A
(aspect, slope, and complexity). The relationships between crossings and landscape attributes are
listed by species in Tables 5-9. Details on the variability explained by specific attribute classes are
presented in the next section.
6. Discussion of results
frequencies for B1, B2 and B3 indicated a significant difference at 99% confidence between
highway crossing frequencies in both 1997/98 and 1998/99 (χ2 = 224.585, df = 18; χ2 = 120.603,
df = 20, respectively). This difference is explained primarily by variation between B1 (1A East)
and B3 (TCH). A high crossing frequency by marten, coyote, cougar, wolf, wolverine, and elk on
B1 (1A East) contributed most to the differences between highway segments in both data sets.
Pairwise comparisons showed a significant difference at 99% confidence between the TCH (B3)
and the 1A East (B1) in 1997/98 and 1998/99 (χ2 = 27.268, df = 9; χ2 = 72.276, df = 8,
respectively). In 1997/98, a significant difference was observed between B2 and B3 at the 95% confidence
level but did not reach the 99% level of confidence (χ2 = 11.314, df = 5). In 1998/99, no
significant difference was observed between B2 and B3, at either the 99% or even the 95% confidence
level (χ2 = 9.734, df = 7).
A preliminary analysis of transect data shows that species richness and abundance were
comparable in habitat adjacent to the B3 (TCH) and B1 (1A East) but lower in neighboring B2
(1A West). This finding further supports the conclusion of a barrier effect along B3 (TCH) because
the lower ratio of actual crossings over potential crossings increases the severity of the existing
statistical differences. The crossing frequencies on B2 are low because of the presence of less
suitable habitat adjacent to the road. In contrast to the habitat bordering sections B1 and B3, that
adjacent to B2 is steeper and more rugged, which reduces the suitability of the habitat for multiple
species and thereby reduces the number of available migrants. Subsequent formal analysis of
transect data is planned and will provide an index of habitat suitability in these regions.
It was expected that all species would have lower crossing frequencies on the TCH compared
with lower traffic volume roads (e.g., 1A). However, lynx crossings were higher on the TCH (B3)
compared to other highway sections and increases in movement appear to coincide with the
offspring dispersal period.
6.2.1. Marten
The distribution of marten crossings was significantly different from uniform (99% confidence)
for the landscape attributes aspect, slope, and topographic complexity.
Marten tracks were observed more often than expected in aspect classes south (157.5-202.5°),
southwest (202.5-247.5°), west (247.5-292.5°), and flat (slope = 0°). Marten were observed less
than expected in classes north (337.5-22.5°), northwest (292.5-337.5°), northeast (22.5-67.5°),
east (67.5-112.5°), and southeast (112.5-157.5°). These findings are consistent with expected
occurrence of marten by vegetation type, the latter being correlated with aspect (Buskirk and
Ruggiero, 1994).
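The aspect classes used above can be reproduced from a raster of aspect angles. The sketch below is illustrative only: the function name is hypothetical, and assigning a boundary angle (e.g., exactly 22.5°) to the class that begins at that angle is an assumption, since the text does not say which class owns the boundary.

```python
# Eight 45-degree compass classes, starting with north centred on 0 degrees.
ASPECT_CLASSES = [
    "north", "northeast", "east", "southeast",
    "south", "southwest", "west", "northwest",
]


def aspect_class(aspect_deg, slope_deg):
    """Return the aspect class of a cell; cells with zero slope are 'flat'."""
    if slope_deg == 0:
        return "flat"
    # Shift by half a class width so that north spans 337.5-22.5 degrees.
    # Assumption: a boundary angle belongs to the class that starts there.
    index = int(((aspect_deg % 360) + 22.5) // 45) % 8
    return ASPECT_CLASSES[index]


print(aspect_class(180.0, 4))  # south
print(aspect_class(350.0, 2))  # north
print(aspect_class(123.0, 0))  # flat
```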
A statistically non-uniform distribution of crossings by slope angle (99% confidence) was
shown for marten. The higher frequency of observations over expected values at 0°, 1°, and 2°
contributed most to the statistical variability. All slopes above 5° showed lower crossing fre-
quencies than expected. No crossings occurred in areas over 21°. The latter is related to sampling
on roads, which rarely occur on steep slopes.
Marten observations coincided statistically (99% confidence) with areas of low topographic
complexity, as represented by the number of different elevation classes around a centroid (3 × 3
moving window). Substantially higher observed versus expected values for cells with one, two, and
316 S.M. Alexander, N.M. Waters I Transportation Research Part C 8 (2000) 307-320
three different neighboring classes and lower observed versus expected for cells with eight and nine
different neighbors contributed most to the non-uniform distribution of points.
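The topographic complexity measure described above, the number of different elevation classes in a 3 × 3 moving window, can be sketched as follows. This is a plain-Python illustration rather than the Idrisi procedure used in the study; the function name is hypothetical, the centre cell is counted as part of the window (which is what allows values up to nine), and truncating the window at the raster edge is an assumption.

```python
def topographic_complexity(classes):
    """Count distinct elevation classes in the 3 x 3 window around each cell.

    `classes` is a list of equal-length rows of elevation class identifiers.
    Windows are truncated at the raster edge (an assumption of this sketch).
    """
    rows, cols = len(classes), len(classes[0])
    result = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = {
                classes[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
            }
            result[r][c] = len(window)
    return result


# Hypothetical 3 x 3 raster of elevation classes.
grid = [
    [1, 1, 2],
    [1, 1, 3],
    [4, 5, 6],
]
complexity = topographic_complexity(grid)
print(complexity[1][1])  # 6 distinct classes around the centre cell
print(complexity[0][0])  # 1 distinct class in the upper-left corner window
```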
6.2.2. Elk
It was found that elk distribution was statistically non-uniform by aspect category (99% confidence).
Most of the variance can be explained by higher-than-expected use of flat and
southern slopes and disproportionately lower-than-expected use of north and northeast
slopes.
Use of slopes by elk was significantly non-uniform (99% confidence). Slopes below 5° were used
more frequently than expected. Much of the variance from a uniform expectation can be ex-
plained by the relatively higher observed compared to expected values on flat and 1° slopes, and
lower observed versus expected values for slopes above 11°.
Elk use of habitat of varying complexity was shown to be statistically non-uniform (99%
confidence). Overuse of areas of low complexity (one, two, and three different types of neigh-
boring cells) explained this statistical difference. Under-utilization of areas of the highest
complexity class (nine different neighboring cells) also made a substantive contribution to the
non-uniformity.
6.2.3. Wolf
Wolf crossings were statistically non-uniform (99% confidence) in distribution by aspect class.
Seven crossings were observed in the "flat" aspect class, explaining most of the variance in dis-
tribution. The small sample size weakens the validity of these results and additional research will
be necessary to confirm these observations.
The occurrence of wolf points was non-uniform by slope class (99% confidence). The majority
of wolf points were observed in flat aspects.
Qualitative observations indicated that wildlife crossings on a community level are not spatially
clustered but are spatially continuous. See Fig. 2 for an example. In contrast, the existing TCH
wildlife mitigation constrains movement to narrow and infrequent sites. On lower traffic volume
roads, such as Bl and B2, crossings for many survey species are characterized by multiple
crossings in one movement session (e.g., in one day). For example, one wolf pack was observed to
cross the road 13 times in 24 h (12 February 1999). This type of movement is not possible on the
mitigated portion of the TCH (Phase IIIA) because of inadequate mitigation design, such as
narrow culverts that wolves refuse to cross (Paquet, 1993).
The Idrisi surface interpolation module was run on the continuum of points shown in Fig. 2.
This procedure identified nodes of high crossing frequency and indicated that these hot spots
range from 250 to 2000 m in diameter.
7. Conclusions
The present findings yield strong evidence that traffic on the TCH, Phase IIIB creates a barrier
to movement for most species.
All the statistical analyses of crossings by landscape attributes showed low strength of relationship
(using Cramer's V). This is a common occurrence in raster-based GIS analysis because of
the overwhelming presence of zero values in the crossing image.
It may be concluded that general predictors for movement for aspect are south, southwest,
west, and flat slopes. In addition, lynx were found on northeastern aspects. Road mitigation
should be designed to maintain connectivity of wildlife populations between these suitable aspects.
Low topographic complexity and slopes lower than 5° were shown to be optimal areas for
movement. These findings are consistent with predictive models for wolf movement that were
constructed using radio-telemetry observations (Paquet, 1993). It is suggested that areas exhibiting
the previous qualities may be useful to delineate movement corridors and to locate sites for
mitigation that have high movement potential along roadways.
These results indicate that the mitigation approach employed in BNP at present does not facilitate
natural movement. In less perturbed environments, the spatial movement pattern observed
for individual species and at a community level was continuous and complex. Within this continuum
of points shown on the multi-species crossing surface, sites of high crossing frequencies
were observed, the average of which ranged from 250 to 1000 m in width. Thus, in designing
mitigation that meets the continuous nature of movement, the frequent placement of mitigation
structures that have considerable spatial extent (e.g., greater than 250 m wide) is recommended. It
is hypothesized that the best mitigation design should simultaneously allow use of the mitigated
area as a home range for smaller species, such as a marten, and as a movement corridor for larger
species, such as the wolf. The proposed design is achieved most appropriately by elevating and/or
burying substantial portions of the highway.
The present research has used a traditional GIS framework and tools of analysis to determine
the spatial requisites of wildlife movement patterns, and compare these to existing styles of
mitigation. Only through the use of such tools is it possible to show the limitations of these
measures and suggest appropriate remedies. To date, such research integrating GIS, environmental
concerns, and transportation has been relatively rare in the literature.
Acknowledgements
References
Adler, J.L., Franks, T., Nelson, D., Benware, T., Ivery, M., McVoy, G., 1997. Expert system architecture for computer-aided
environmental analysis. Transportation Research Record No. 1601, Environmental Issues in Transportation,
Transportation Research Board, National Research Council, National Academy Press, Washington, DC, pp. 29-34.
Alexander, S., Waters, N., 1999. Decision support methods for assessing placement and efficacy of road crossing
structures for wildlife. In: Evink, G.L., Garrett, P., Zeigler, D. (Eds.), Proceedings of the Third International
Conference on Wildlife Ecology and Transportation. FL-ER-73-99, Florida Department of Transportation,
Tallahassee, FL, pp. 237-246.
Baker, E.J., Lee, Y., 1975. Alternative analyses of geographical contingency tables. Professional Geographer 27, 179-188.
Banff-Bow Valley Study, 1996. Banff-bow valley: at the crossroads. Technical report of the Banff-Bow Valley Task
Force (Robert Page, Suzanne Bayley, J. Douglas Cook, Jeffrey E. Green, J.R. Brent Ritchie). Prepared for the
Honourable Sheila Copps, Minister of Canadian Heritage, Ottawa, ON.
Bardman, C.A., 1997. Applicability of biodiversity impact assessment methodologies to transportation projects.
Transportation Research Record No. 1601, Environmental Issues in Transportation, Transportation Research
Board, National Research Council, National Academy Press, Washington, pp.
Bascompte, J., Solé, R., 1996. Habitat fragmentation and extinction thresholds in spatially explicit models. Journal of
Animal Ecology 65, 465-473.
Buskirk, S.W., Ruggiero, L.F., 1994. American marten. In: The Scientific Basis for Conserving Forest Carnivores:
American Marten, Fisher, Lynx and Wolverine in the Western United States. US Dept. of Agriculture, Forest
Service, Gen. Tech. Rep. RM-254, Fort Collins, Colorado, USA 80526. p. 184.
Callaghan, C., 1999. Personal communication.
Callaghan, C., Paquet, P., Wierzchowski, J., 1999. Highway effects on gray wolves within the golden canyon, British
Columbia. In: Proceedings of the Third International Conference on Wildlife Ecology and Transportation. Evink,
G.L., Garrett, P., Zeigler, D. (Eds.). FL-ER-73-99, Florida Department of Transportation, Tallahassee, FL, pp. 39-51.
Carr, M.H., Zwick, P.D., Hoctor, T., Harrell, W., Goethals, A., Benedict, M., 1998. Using GIS for identifying the
interface between ecological greenways and roadway systems at the state and sub-state levels. In: Proceedings of the
International Conference on Wildlife Ecology and Transportation. Evink, G.L., Garrett, P., Berry, J.B. (Eds.),
FL-ER-69-98, Florida Department of Transportation, Tallahassee, FL, pp. 68-77.
Cooke, P.D., Foster, D., Schuller, E.R., 1998. Geographic information systems (GIS) implementation in the phased
environmental process for systems planning at the North Carolina department of transportation (NCDOT). In:
Proceedings of the Geographic Information Systems for Transportation Symposium, American Association of State
Highway and Transportation Officials, Washington, pp. 230-253.
Eastman, J.R., 1997. Idrisi for Windows. User's Guide. Version 2. Clark Labs for Cartographic Technology and
Geographic Analysis, Clark University, Worcester, MA.
ESRI, 1995. Understanding GIS: The Arc/Info Method. Environmental Systems Research Institute, Geoinformation
International, Cambridge, UK.
Forman, R.T.T., Alexander, L.E., 1998. Roads and their major ecological effects. Annual Review of Ecological Systems
29, 207-231.
Fosnight, E.A., Fowler, G.W., 1996. Measures of association and agreement for describing land cover characterization
classes. In: Proceedings of the Spatial Accuracy Assessment in Natural Resources and Environmental Sciences
Symposium, USDA Forest Service, General Technical report RM-GTR-277, pp. 425-433.
Goodchild, M.F., 1977. An evaluation of lattice solutions to the problem of corridor location. Environment and
Planning A 9, 727-738.
Hobbs, R.J., 1993. Effects of landscape fragmentation on ecosystem processes in the western Australian wheatbelt.
Biological Conservation 64, 193-201.
Ims, R.A., Rolstad, J., Wegge, P., 1993. Predicting space use responses to habitat fragmentation: Can voles (Microtus
oeconomus) serve as an experimental model system (ems) for capercaillie grouse (Tetrao urogallus) in Boreal forest?
Biological Conservation 63, 261-268.
Kitchin, R., Tate, N.J., 2000. Conducting Research into Human Geography. Prentice-Hall, Englewood Cliffs, NJ.
Klein, L., 1999. Usage of GIS in wildlife passage planning in estonia. In: Evink, G.L., Garrett, P., Zeigler, D. (Eds.),
Proceedings of the Third International Conference on Wildlife Ecology and Transportation. FL-ER-73-99, Florida
Department of Transportation, Tallahassee, FL, pp. 179-184.
Lackey, A.E., 1997. Reconciling transportation corridor preservation and national environmental policy act process:
evaluation of north Carolina phased environmental approach. Transportation Research Record No. 1601,
Environmental Issues in Transportation, Transportation Research Board, National Research Council, National
Academy Press, Washington, pp. 21-28.
Lee, B.D., Tomlin, C.D., 1997. Automate transportation corridor location. GIS World 10 (1), 56-60.
Loether, H.J., McTavish, D.G., 1988. Descriptive and Inferential Statistics: an Introduction, third ed. Allyn & Bacon,
Newton, MA.
Mader, H.J., Schell, C., Kornacker, P., 1990. Linear barriers to arthropod movements in the landscape. Biological
Conservation 54, 209-222.
McHarg, I., 1969. Design with nature. Doubleday, Natural History Press, New York.
Noronha, V., 1999. Unit 183 Transportation Networks. Core Curriculum in Geographic Information Science, National
Center for Geographic Information and Analysis, Department of Geography, University of California, Santa
Barbara, California, URL:http://www.ncgia.ucsb.edu/giscc/units/ul83/ .
Nyerges, T., 1995. Geographic information system support for urban/regional transportation analysis. In: Susan
Hanson (Ed.), The Geography of Urban Transportation, second ed. Guilford Press, New York, pp. 240-265.
Oehler, J.D., Litvaitis, J.A., 1995. The role of spatial scale in understanding responses of medium-sized carnivores to
forest fragmentation. Canadian Journal of Zoology 74, 2070-2079.
Paquet, P.C., 1993. Summary reference document-ecological studies of recolonizing wolves in the Central Canadian
Rocky Mountains. BNP Warden Service, Banff, AB.
Paquet, P.C., Callaghan, C., 1996. Effects of linear developments on winter movements of gray wolves in the bow river
valley of Banff National Park, Alberta. In: Evink, G.L., Garrett, P., Zeigler, D., Berry, J. (Eds.), Trends in
Addressing Transportation Related Wildlife Mortality, Proceedings of the Transportation Related Wildlife
Mortality Seminar. FL, USA. June 1996.
Reed, R.A., Johnson-Barnard, J., Baker, W.L., 1996. Contribution of roads to forest fragmentation in the rocky
mountains. Conservation Biology 10 (4), 1098-1106.
Waters, N. (Ed.), 1992. Geographic information systems in transportation. In: Proceedings of the Transportation
Research Forum 6, 112-29.
Waters, N., 1999. Transportation GIS: GIS-T. In: Longley, P., Goodchild, M., Maguire, D., Rhind, D. (Eds.),
Geographical Information Systems: Principles, Techniques, Applications and Management, second ed. Wiley, New
York.
Weaver, J.L., Paquet, P.C., Ruggiero, L.F., 1996. Resilience and conservation of large carnivores in the rocky
mountains. Conservation Biology 10 (4), 964-976.
Wilcox, B.A., Murphy, D.D., 1985. Conservation strategy: the effects of fragmentation on extinction. American
Naturalist 125, 879-887.
Abstract
Keywords: Emergency evacuation; Network capacity; Optimization; Integer programming; Geographical information
systems
1. Introduction
Disasters that require some type of evacuation are relatively common (Perry, 1985). Buildings,
industrial plants, airplanes, flood zones are but a few examples of areas or structures, which may
require immediate and quick evacuation. The risk involved may be well understood. For example,
buildings are often modeled for their evacuation or egress potential (Owen et al., 1996). Room
0968-090X/00/$ - see front matter © 2000 Elsevier Science Ltd. All rights reserved.
PII: S0968-090X(00)00019-X
322 R.L. Church, T.J. Cova I Transportation Research Part C 8 (2000) 321-336
neighborhood that might be difficult to evacuate. Then, that area can be classified as to the degree
to which evacuation difficulty exists. By applying the model numerous times across the network
and classifying each local area as to evacuation difficulty, a map of evacuation vulnerability
emerges. With such a model it is possible to map "evacuation risk" just as flood risk can be
mapped. In the next section we describe one possible approach to estimating evacuation risk. In
subsequent sections we integrate the approach to estimating evacuation risk into a model that can
be used in conjunction with GIS to map evacuation risk.
Little is known about small area evacuation as it is nearly impossible to measure accurately
during an emergency. Further, few models have been developed to simulate such an emergency for
a transportation network at the scale of a neighborhood or relatively small urban area (Sinuany-Stern
and Stern, 1993). To use a simulation model, the neighborhood needs to be defined in
advance. Since the difficulty involved in evacuation is inextricably tied to the definition of the
neighborhood, the problem of neighborhood evacuation analysis includes identifying
the critical size and shape of the neighborhood in question. Thus, defining the exact boundaries of
the neighborhood is part of the evacuation risk mapping problem. Consequently, we begin our
discussion here with two factors that are indisputably important: (1) total demand, and (2)
transport capacity. Consider the following terms:
pop_k  the population of neighborhood k, which may be estimated as the product of the
       number of houses times the number of people per household
cpp_k  people per vehicle during a sudden evacuation of neighborhood k
cap_k  capacity of lanes leading out of neighborhood k in vehicles per minute
cte_k  clearing time estimate for neighborhood k.
The above cte value for a neighborhood is a simple estimate of how much time it would take to
clear a neighborhood of its inhabitants. It assumes no accidents, that all inhabitants would either
drive or ride a vehicle, and that the critical transportation element is the outbound road capacity
of the neighborhood. Let us suppose that it takes time to notify and to marshal people in terms of
the urgency of evacuation. Technically the time for notification and marshalling needs to be added
to the cte value. In the event of an accident, outbound or exiting capacity will likely be
impaired and therefore clearing time increased. Thus, the above ratio is a lower bound estimate on
the actual time to clear a neighborhood of its inhabitants. If the cte is low in value, then it is
theoretically possible to make a speedy clearance of the neighborhood. If the cte is large in value,
then it might signal a potential difficulty, especially if the time is too large in comparison to the
time available due to the hazard (e.g. encroaching toxic cloud or a moving firestorm).
If exit capacity is measured in terms of the number of exit lanes, rather than a flow rate of
vehicles per minute, then the above ratio is a measure of bulk demand per lane and has units of
vehicles per lane that must evacuate. We can designate this ratio as bulk lane demand (bld). The
higher the bulk lane demand, the higher the risk in leaving a neighborhood. Both bld and cte
values are estimates of evacuation risk. The larger these values are, the greater the time and
volume of traffic per lane that is required to clear a neighborhood.
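A small numerical illustration of the two measures may help. The sketch below assumes that the ratio referred to above is population divided by the product of people per vehicle and exit capacity, which is what the stated units (minutes for cte, vehicles per lane for bld) imply; the function names and the example neighborhood are hypothetical.

```python
def clearing_time_estimate(pop, cpp, cap):
    """Lower-bound clearing time in minutes.

    pop: people in the neighborhood
    cpp: people per vehicle during the evacuation
    cap: outbound capacity in vehicles per minute
    Assumed form: cte = pop / (cpp * cap).
    """
    vehicles = pop / cpp
    return vehicles / cap


def bulk_lane_demand(pop, cpp, exit_lanes):
    """Vehicles that must leave per exit lane (assumed form: pop / (cpp * lanes))."""
    return (pop / cpp) / exit_lanes


# Hypothetical neighborhood: 400 houses with 2.5 people per household.
population = 400 * 2.5
cte = clearing_time_estimate(population, cpp=2.0, cap=20.0)
bld = bulk_lane_demand(population, cpp=2.0, exit_lanes=2)
print(cte)  # 25.0 minutes
print(bld)  # 250.0 vehicles per lane
```

Notification and marshalling time would have to be added to the cte value, as noted above.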
Although the cte and bld values are not the only possible measures of evacuation risk, they are
both easy to compute and estimate. These factors (i.e. clearing time and bulk demand) are
undoubtedly directly related to evacuation vulnerability. An alternate approach would be to
employ a micro-simulation model of a neighborhood and estimate an expected clearing time. The
difficulty that we face here is that we do not have a neighborhood defined in advance. Since cte
and bld values can be easily calculated (as compared to using a simulation model) for a very large
number of possible neighborhood definitions, they allow us to search for the
critical neighborhood definition. For the remainder of this paper, we will assume that the cte and
the related bld value are reasonable surrogate measures for evacuation risk. In the following
section, we discuss the definition of an "evacuation neighborhood".
Fig. 1. (a) A bld calculation for a specific neighborhood, (b) A bld calculation for a larger neighborhood.
analysis, this may not be appropriate. For example, in Fig. 1(b), the bld is computed for a slightly
larger neighborhood. The bld value increases since demand increases and the number of exit lanes
decreases to one. Thus, the bld and cte values will be quite sensitive to the definition of the
neighborhood. Taken from a more personal perspective, let us say you lived at node 1, which is
indicated by the numeral 1 in Fig. 1. Your interest in evacuation might involve the question: what
is the most difficult (worst-case) evacuation scenario within which I might participate (given that I
am at node 1)? Clearly, this dictates finding the neighborhood which surrounds you at node 1 and
which has the highest bld or cte values. We will call the neighborhood defined about a given node,
which maximizes bld or cte, the "critical neighborhood". The nodes and street segments of
which the "critical neighborhood" is comprised will be called the critical cluster. For each node of
the network, there exists a critical cluster. Identifying the critical cluster for a given node is the
critical cluster problem (CCP). In the following section, we present a model which can be used to
solve an instance of the CCP.
Given a node of interest, called an anchor node, the critical cluster problem involves identifying
a critical neighborhood of connected nodes and arcs about that anchor node which has the highest
cte or bld value. As our focus is on delineating a small area or neighborhood EPZ, an upper limit
will be used to restrict the critical cluster size. That is, since the area of the hazard that is faced is
small and concentrated like the Oakland Hills fire, the immediate area of evacuation will be
relatively small. Consequently, for the work presented here, we assume that each critical cluster
will be limited and not exceed some maximum possible size.
There are many possible clusters that can be defined about a given anchor node. The critical
cluster for that node is the one that maximizes some measure of risk or vulnerability like cte or
bld. An example of a critical cluster is given in Fig. 2 for node 1. In this example each node has a
population of 1 and each arc has one lane in each direction. Fig. 2 also presents a plot of the
relationship between cluster size and the bld value. It is quite likely that one of the critical points
in evacuation occurs when the neighborhood is relatively small as depicted in Fig. 2. Note that for
the example anchor node, the critical cluster occurs at a cluster size of five. The focus in spatial
evacuation analysis is on the local spatial variation in critical cluster values from anchor node to
anchor node.
The problem of finding an anchor node's critical cluster falls within the broad category of graph
or network partitioning problems (Cova and Church, 1997). A partition is a subset of nodes
within a larger graph, and an optimal partitioning maximizes or minimizes some specified criteria
related to the partition. In this context, we are only interested in contiguous partitions which will
be referred to as clusters. The CCP is related in concept to the Graph Partitioning problem in the
operations research literature (see, for example, Kernighan and Lin, 1970; Johnson et al., 1989;
Jin and Chan, 1992; Laguna et al., 1994; Pirkul and Rolland, 1994). The critical cluster problem
for node r can be stated formally as:
Given a graph G with capacities on its edges and population at its nodes (set V), find a contiguous
partition less than a given maximum size that contains anchor node r (call it partition
V_r) so as to maximize some measure of evacuation vulnerability/risk such as cte or bld.
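On a very small network the problem statement can be illustrated by exhaustive enumeration. The sketch below is not the integer programming model of this paper: it simply lists every connected set of nodes that contains the anchor node, up to a maximum size, and keeps the one with the highest bld. The function names, the data layout (a dictionary of directed arcs with lane counts), and the toy network are all hypothetical, and unit vehicles per person are assumed so that population stands in for demand.

```python
from itertools import combinations


def is_connected(nodes, adjacency):
    """Check that `nodes` induces a connected subgraph."""
    nodes = set(nodes)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in adjacency[u]:
            if v in nodes and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen == nodes


def critical_cluster(anchor, max_size, pop, lanes):
    """Brute-force search for the connected cluster with the highest bld.

    pop:   {node: population}
    lanes: {(i, j): lanes on the directed arc from i to j}
    """
    adjacency = {i: set() for i in pop}
    for i, j in lanes:
        adjacency[i].add(j)
        adjacency[j].add(i)
    others = [i for i in pop if i != anchor]
    best_cluster, best_value = None, -1.0
    for size in range(1, max_size + 1):
        for extra in combinations(others, size - 1):
            cluster = {anchor, *extra}
            if not is_connected(cluster, adjacency):
                continue
            exit_lanes = sum(c for (i, j), c in lanes.items()
                             if i in cluster and j not in cluster)
            if exit_lanes == 0:
                continue  # no way out: not a meaningful evacuation cluster
            value = sum(pop[i] for i in cluster) / exit_lanes
            if value > best_value:
                best_cluster, best_value = cluster, value
    return best_cluster, best_value


# Toy network: a dead-end street 1-2-3 joining a junction 4 with two exits.
edges = [(1, 2), (2, 3), (3, 4), (4, 5), (4, 6)]
lanes = {}
for i, j in edges:
    lanes[(i, j)] = 1
    lanes[(j, i)] = 1
pop = {node: 1 for node in range(1, 7)}

print(critical_cluster(1, 4, pop, lanes))  # ({1, 2, 3}, 3.0)
```

Enumeration grows combinatorially with network size, which is one reason an optimization formulation is needed for realistic street networks.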
We can formulate the CCM with respect to maximizing bld in the following manner:
Objective:
Maximize:
Subject to:
where
where the objective is to maximize M. On the right-hand side of the equation, it is evident that this
is a nonlinear objective as we have the variable M multiplied by other variable terms c_ij y_ij. To
transform the formulation into a linear integer programming (LP/IP) problem, we can fix M as a
constant, add the above right-hand expression as a constraint, and append a new objective:
minimize the size of the cluster. In other words, find the smallest cluster (in nodes) that includes
node r, such that the ratio of the cluster's population to lane-capacity is greater than M. This
alternative LP/IP formulation is given as follows:
Objective:
Subject to:
where M is the minimum partition threshold, and where all other notation is as defined previously.
This formulation can be used to solve for critical clusters which maximize cte or bld values
depending upon which values are used in the definition of the objective (7). Technically, it can be
used to search for the smallest cluster which has a specified minimum cte or bld value. It may be
necessary to use several constraint values, M, in order to identify the highest feasible M value and
its associated critical cluster for a given anchor node r.
Because a cluster can alternatively be viewed as either a set of nodes or a set of arcs, the y_ij
variables are not seen as important in finding a critical cluster. In the model only the x_i variables
must be restricted to integer values with an upper bound of 1. The y_ij variables will be integer as
long as the x_i are integer in value. For large data sets, this greatly reduces the number of branches
that occur during the branch-and-bound procedure.
The contiguity constraint set (6) was not specified mathematically above in order to keep the
CCM formulation as simple as possible. In application it is worthwhile to attempt to solve a
problem without such constraints. If the optimal solution to the above model is connected without
such constraints, then these constraints are not necessary. We will describe how we can make
selective use of such constraints later. However, it is worth specifying these constraints unambiguously.
To structure such a constraint set consider a node l that is different from the anchor
node, but has been selected as a part of the cluster. If the cluster is contiguous, then a path must
exist that connects the anchor node to node l and which lies entirely within the cluster. We can
ensure that a path exists within the cluster between the anchor node r and node l in the following
manner:
Contiguity constraint set for node l, (6)
where additionally,
The above contiguity constraint set ensures that a path will exist wholly within the cluster
between node l of the cluster and the anchor node. The first constraint (6a) specifies that an arc in
a path leading to node l from the anchor node r must be chosen to depart from the anchor node r
if node l is chosen for the cluster. If node l is not chosen for the cluster then no such path requirement
is enforced. This constraint ensures that if node l is in the cluster, then there is one arc
that originates from the anchor node that is an element of the path between the anchor node and
node l. The next constraint (6b) specifies that for any node i that is not the anchor node or the
destination node l, the number of arcs chosen in the path to l that terminate at node i must be
equal to the number of arcs chosen in the path to l that originate from node i. This assures that a
path does not terminate at an intermediate node on its way to node l. Constraint (6c) specifies that
the number of arcs chosen in the path to l that terminate at node l is equal to 1 if node l is in the
cluster and 0 if it is not. This constraint ensures that if node l is in the cluster, then there is one arc
terminating at node l that is an element of the path between the anchor node and node l. The last
two constraints specify that any arc used in the path from the anchor node to node l must have
both endpoints within the cluster.
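The condition that these constraints encode can also be checked directly on a candidate solution. The sketch below is a procedural check, not the constraint set itself: for each selected node l it looks for a path from the anchor node whose arcs all have both endpoints inside the cluster, using a breadth-first search. The function names and the arc-list representation are hypothetical.

```python
from collections import deque


def path_within_cluster(anchor, target, cluster, arcs):
    """Return a list of arcs from anchor to target that stays inside the
    cluster, or None if no such path exists."""
    cluster = set(cluster)
    if anchor not in cluster or target not in cluster:
        return None
    parent = {anchor: None}
    queue = deque([anchor])
    while queue:
        u = queue.popleft()
        if u == target:
            break
        for i, j in arcs:
            # Only follow arcs whose endpoints are both in the cluster.
            if i == u and j in cluster and j not in parent:
                parent[j] = u
                queue.append(j)
    if target not in parent:
        return None
    path, node = [], target
    while parent[node] is not None:
        path.append((parent[node], node))
        node = parent[node]
    return path[::-1]


def is_contiguous(anchor, cluster, arcs):
    """True if every cluster node can be reached from the anchor inside the cluster."""
    return all(path_within_cluster(anchor, l, cluster, arcs) is not None
               for l in cluster if l != anchor)


# Hypothetical street network with one arc in each direction per segment.
arcs = [(1, 2), (2, 1), (2, 3), (3, 2), (3, 4), (4, 3)]
print(path_within_cluster(1, 3, {1, 2, 3}, arcs))  # [(1, 2), (2, 3)]
print(is_contiguous(1, {1, 2, 3}, arcs))           # True
print(is_contiguous(1, {1, 3}, arcs))              # False
```

A check of this kind is one way to decide, after solving without contiguity constraints, whether any such constraints need to be added at all.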
The set of constraints (6a)-(6e) given above for node l can be replicated for all nodes of the
network other than the anchor node. This would then represent a complete contiguity constraint
set (6). The size of the model without contiguity constraints, in terms of the numbers of variables
and constraints is: n x_i variables; 2m y_ij variables; 2m + 3 constraints, where n is the number of
nodes and m is the number of arcs. A full complement of contiguity constraints adds: 2mn P
variables and n^2 + 4mn + 2n − 1 constraints. It is easy to see that the above model is a manageable
size without the contiguity constraints, but is far too large to be practical when all possible
contiguity constraints are employed. Our approach in using selected contiguity constraints is
described in the next section.
The contiguity constraints structured above use a set of path variables for each potential node l
that could be selected for the cluster. An alternative approach could be to use a network flow
approach, where flow originates at the anchor node and must flow across arcs wholly within the
cluster and where one unit of flow must arrive at each node selected for the cluster. Such an
approach requires fewer constraints and variables; however, constraints that ensure that flow
must be on arcs wholly within a cluster cannot be written in an integer friendly form (in contrast
to the integer friendly form of constraints (6d) and (6e)).
The general approach to deriving optimal solutions for the CCM involved writing a software
driver to generate a mathematical programming system (MPS) file. In this context, a driver is a
program that generates an instance of an optimization model (e.g. critical cluster model) in the
330 R.L. Church, T.J. Cova I Transportation Research Part C 8 (2000) 321-336
MPS file format. CPLEX™ was used for solvin g all integer programming problems. Fig . 3 shows
the procedur e fo r derivin g an optima l solutio n to th e CCM . Th e driver accepts a textua l repre -
sentation o f the network and a set of parameters that specif y th e anchor node and scale limit. This
driver outputs an MPS file for input into CPLEX, whic h solves the problem optimally. The arrow
running fro m th e solutio n bac k t o th e drive r implies that thi s procedure i s iterative.
In many cases, the initial solution to the model (without contiguity constraints) for a given
anchor node and scale limit was not contiguous. Because the objective of the modified linear
formulation is to minimize the size of the cluster, the model consistently selected the anchor node
and a non-contiguous set of heavily weighted satellite clusters. In these cases, an optimal
contiguous solution could be "teased out" by fixing one of the heavily weighted nodes from a satellite
cluster out of the solution with an additional constraint. However, there needed to be a basis for
determining whether or not the node to fix out was clearly not an element of the optimal solution.
To accomplish this, the node was first "fixed-in" to the solution, along with the required
contiguity constraints for that particular node only. This usually generated a solution that was
clearly inferior to a simple heuristically derived solution about the anchor node. Hence, one can,
without loss of generality, fix that node out in subsequent model runs. Deriving heuristic solutions
to the CCM is the subject of a companion paper (Cova and Church, 1997). The best known
contiguous cluster value that was derived from a heuristic algorithm described in Cova and
Church (1997) was used as the initial M value. If CPLEX failed to find a cluster that contained the
"fixed-in node" and matched or exceeded the M value, then that node could not be part of the
optimal solution, and it could justifiably be fixed out. If a new "higher" cluster was found that
included the fixed-in node, then that cluster would become the new best M value, and the search
for a greater cluster could be continued.
The M parameter that was introduced in order to make the model linear can be viewed as a
form of threshold. Fig. 4 shows a "clustergram" that depicts a graphic interpretation of the M
value. In a sense, the M value changes the query to, "Is there an evacuation starting scenario
within which residents at this node might participate, within the scale limit, that exceeds M?" In
short, the size limit (s) (see model formulation) bounds the size of the cluster to search for, and the
M value puts a lower bound on the cluster difficulty value to search for. This means that it is
possible that no solution may exist for a given anchor node and scale limit if there is no cluster
with a difficulty value that exceeds M. For this reason, this approach to solving the model is useful
for performing an optimality test on a known cluster. In other words, if a known cluster has a
difficulty value m and a size s, we can test whether it is the optimal cluster for any of its nodes by
substituting m for M.
Optimal solutions were derived for 40 randomly selected nodes from 4 real world street networks
(10 each) at 3 scale limits, 10, 25, and 50 (120 problems). The networks ranged in size from 200 to
300 nodes, and the overall process was labor intensive because the decision process as to
which nodes to fix-in and fix-out was not completely automated. An average problem took
on the order of a half-hour to an hour to completely solve and resulted in approximately 30
runs of CPLEX to systematically fix-out as many as 15 nodes to arrive at the optimal contiguous
cluster. Such a process could be automated to reduce operator involvement, but still would be
computationally intensive in terms of the use of the CPLEX software. Obviously, such a model
relying solely on general purpose software for generating solutions is of limited use. Research on
alternate optimal approaches could be of considerable value for application.
The real value of the above optimization model and general purpose solution software is in its
use to test heuristic approaches to the CCM. Without a competitive solution algorithm, it is
necessary to rely on a heuristic approach for application. The design and acceptance of a heuristic
can be made with confidence only when known optimal solutions can be used in testing a
heuristic's performance. The solutions generated by the CCM model and general purpose
optimization software were used in testing the efficacy of a CCM heuristic presented in Cova and Church
(1997). The heuristic algorithm relies on a region-growing approach to construct a contiguous
cluster from each potential anchor node in a transportation network by iteratively adding nodes
on the fringe of the current cluster. A semi-greedy strategy is used to add a node to the cluster for
each iteration, where the next node is randomly selected from a list of nodes within a specified
percent of the node that most increases the objective value. The heuristic was found to be
relatively robust at identifying optimal critical clusters. Fig. 5 shows a histogram of the mean percent
from optimal when the heuristic was restarted 128 times from each node and any node within 77%
of the greedy choice at each step is a potential candidate to add to the cluster. In this case, the
heuristic identified the optimal solution 70% of the time and the mean percent from optimal was
3%.
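
The region-growing, semi-greedy idea described above can be illustrated with a minimal sketch. It is not the implementation of Cova and Church (1997). The function names, the data layout (a `pop` dictionary of node populations and a `lanes` dictionary of exit lanes per arc), the use of bulk lane demand as the objective, and the reading of "within a specified percent" as "scoring at least `alpha` times the greedy choice" are all assumptions made for illustration.

```python
import random

def bulk_lane_demand(cluster, pop, lanes):
    """Cluster population divided by the lanes on arcs that leave the cluster."""
    people = sum(pop[u] for u in cluster)
    exits = sum(n for u in cluster for v, n in lanes[u].items() if v not in cluster)
    return people / exits if exits else 0.0  # assumption: no exit lanes scores 0

def grow_once(anchor, pop, lanes, scale_limit, alpha, rng):
    """One randomized region-growing pass; returns (best value, best cluster)."""
    cluster = {anchor}
    best = (bulk_lane_demand(cluster, pop, lanes), frozenset(cluster))
    while len(cluster) < scale_limit:
        # Fringe: nodes adjacent to the cluster but not yet in it.
        fringe = sorted({v for u in cluster for v in lanes[u]} - cluster)
        if not fringe:
            break
        scored = [(bulk_lane_demand(cluster | {v}, pop, lanes), v) for v in fringe]
        top = max(score for score, _ in scored)
        # Semi-greedy step: any node scoring at least alpha * top is a candidate.
        candidates = [v for score, v in scored if score >= alpha * top]
        cluster.add(rng.choice(candidates))
        value = bulk_lane_demand(cluster, pop, lanes)
        if value > best[0]:
            best = (value, frozenset(cluster))
    return best

def semi_greedy(anchor, pop, lanes, scale_limit, alpha=0.77, restarts=128, seed=0):
    """Restart the randomized pass several times and keep the best cluster found."""
    rng = random.Random(seed)
    runs = (grow_once(anchor, pop, lanes, scale_limit, alpha, rng) for _ in range(restarts))
    return max(runs, key=lambda run: run[0])
```

Because every added node comes from the fringe, each cluster is contiguous by construction; the restarts only change which fringe node is picked when several score close to the greedy choice.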
Maps are often used in depicting risk which varies spatially, like flood plains and seismic zones
subject to liquefaction. The designation of a flood plain, for example, involves identifying which
areas will be flooded with some annual frequency (e.g. once in 50 yr). A flood plain map then
depicts those areas which are subject to a reasonable risk of flooding (i.e. probability of an event
occurring on a given land unit). Such maps depict "event risk occurrence", but not the risk the
occupants face when an emergency evacuation is made. The most common map associated with
evacuation involves the depiction of evacuation routes and safe zones (like shelters). Such maps
depict an evacuation plan for a designated area, but not the risks. A critique of such maps can be
found in Dymon and Winter (1993).
What is proposed in this work and in Cova and Church (1997) is the development of risk maps
of potential evacuation difficulty, which could be used in conjunction with event-risk maps to
develop better evacuation planning maps. The major objective would be to identify places of high
event-risk and high evacuation-risk. For example, hindsight shows that the Oakland Hills area
was such an area. Advance knowledge and public recognition could possibly have averted the
tragedy of 1991.
The CCM represents one possible model that can be used to estimate small neighborhood
evacuation risk difficulty. The higher the bld or cte, the greater the possible problems that may be
encountered in an emergency evacuation. If each node of the network is used as an anchor node for
the CCM, it is then possible to label each node on the transport network with a risk measure like
bld or cte. Given a spatial depiction of node locations and risk values, it is possible to perform two
types of mapping functions: (1) interpolate lines of equal risk, like elevation contours, and develop
a risk contour map; or (2) classify each node according to its "relative risk value" and map nodes
and arcs using some type of color scheme or gray scale according to the risk.
To develop such a map would realistically be out of the question for a large area, unless many
of the functions were automated. It is only natural to select a GIS platform to supply much of this
functionality. For our project, we selected the ARC/INFO GIS system, although other systems
would support such an application as well. Essentially, data in the GIS was represented by a set of
coverages, including network and population information. An export function using the ARC/INFO
macro language was developed which produced a forward star data structure for the
associated road network. This data structure was used by the MPS setup program and the heuristic.
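
For readers unfamiliar with the forward star representation mentioned above, the following minimal sketch shows one common way to build it from an arc list. The array names (`point`, `heads`, `costs`) and the integer node numbering are assumptions for illustration and do not reproduce the export format used in the project.

```python
def forward_star(num_nodes, arcs):
    """Build forward-star arrays from (tail, head, cost) arcs with nodes 0..num_nodes-1."""
    arcs = sorted(arcs)  # group arcs by tail node
    point = [0] * (num_nodes + 1)
    for tail, _, _ in arcs:
        point[tail + 1] += 1
    for i in range(num_nodes):
        point[i + 1] += point[i]  # point[i] = index of the first arc leaving node i
    heads = [head for _, head, _ in arcs]
    costs = [cost for _, _, cost in arcs]
    return point, heads, costs

def out_arcs(node, point, heads, costs):
    """Arcs leaving `node` occupy the slice point[node]:point[node + 1]."""
    return [(heads[k], costs[k]) for k in range(point[node], point[node + 1])]
```

The appeal of this layout is that all arcs leaving a node are stored contiguously, so a solver or heuristic can scan a node's outgoing arcs without searching the whole arc list.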
All network data was stored within the GIS and was exported in the special form for a given
selected anchor node. The CCM model was then solved (using either the heuristic or the CPLEX general
purpose LP/IP software system) and the result was imported back into the GIS. The application
was automated so that each node was systematically selected as an anchor node and solved by the
heuristic. The risk values, e.g. bld, were then imported into the GIS and assigned as attributes to
the nodes. Each identified critical cluster represented a set of nodes and a bld value. After the solution
for a specified anchor node was determined, each node within the critical cluster was tested to see
if the new bld value was higher than the current bld value in the data layer. If the new value was
higher, then the node attribute for the critical value was set at that new value. If the new value was
lower, then the node attribute for the critical value was left unchanged. Essentially, for each node
the critical value is the highest value found associated with all critical clusters found which
included that node.
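
The attribute update rule described above amounts to keeping a running maximum per node. A minimal sketch, assuming a plain dictionary stands in for the GIS node attribute table (the function and variable names are hypothetical):

```python
def update_node_risk(node_risk, cluster_nodes, bld):
    """Raise each cluster node's stored value to `bld` if `bld` is higher."""
    for node in cluster_nodes:
        if bld > node_risk.get(node, 0.0):
            node_risk[node] = bld
    return node_risk
```

Calling this once per solved anchor node leaves every node labeled with the highest bld value of any critical cluster that contained it.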
After all nodes were considered as possible anchor nodes for solving a CCM, several Arc
Macro Language (AML) routines were executed. These routines assigned the evacuation difficulty
of each arc as the higher of the nodal endpoint evacuation risk values (i.e. bld values). Then each
node and arc was categorized by the relative bld value. These categories were then assigned either
a color or a gray scale value, and a complete map was produced. Without the capabilities of the
GIS, this type of mapping exercise would be too time consuming.
An example evacuation risk map developed by the use of the CCM model coupled with
ARC/INFO is given in Fig. 6, which depicts a risk map for the south coast region of Santa Barbara
County. The upper portion of Fig. 6 depicts the entire south coast region. This region is bounded
by the ocean to the south and a mountain range to the north. A gray scale was used to depict bulk
lane demand in ranges of 0-200; 201-300; 301-400; 401-500; >500 people per exit lane. Census
population data was used to estimate population values and a NavTech database was used to
depict transport network links.
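
The arc rule (each arc takes the higher of its endpoint values) and the gray-scale ranges above can be sketched as follows. This is an illustrative Python stand-in for the AML routines, not a reproduction of them; the class numbering 0 to 4 and the break values read from the listed ranges are assumptions.

```python
from bisect import bisect_left

# Upper bounds of the first four bulk lane demand ranges (people per exit lane).
BREAKS = [200, 300, 400, 500]

def arc_risk(node_risk, tail, head):
    """An arc takes the higher of its two endpoint risk values."""
    return max(node_risk[tail], node_risk[head])

def gray_class(bld):
    """Return 0 for 0-200, 1 for 201-300, 2 for 301-400, 3 for 401-500, 4 for >500."""
    return bisect_left(BREAKS, bld)
```

A value that falls exactly on a break, such as 200, is placed in the lower class, which matches ranges written as 0-200 and 201-300.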
In the lower part of Fig. 6, four separate areas are shown in greater detail. They are Mission
Canyon (upper left), Carpinteria (upper right), downtown Santa Barbara southwest of Highway
101 (lower right), and Isla Vista (lower left). These depict four of the areas that appear as very
dark in the upper map. Because of the natural foliage and steep terrain, the Mission Canyon area is
a region of very high fire risk. If a fire risk map were available, then it would be possible to identify
Mission Canyon as both high evacuation risk and high fire risk. Most other Santa Barbara
foothill locations have lower evacuation risk, thus planning efforts can be concentrated in such
special areas of high risk. As an aside, homeowners and fire department officials have been
convinced of the urgency of the problem by this map. Even a district supervisor called a planning
meeting based upon the results of this mapping exercise. Members of the homeowners association
are now suggesting the need for a detailed simulation and evacuation plan.
8. Conclusion
We have presented a specialized network partition model called the critical cluster model. This
model can be used to identify small neighborhoods about a given node that have potentially risky
combinations of high population and low exit road capacity. The CCM can be used to identify a
contiguous nodal cluster that maximizes bulk lane demand or an estimate of network clearing
time. Both measures, while not exact, are assumed to be reasonable surrogate measures of
evacuation risk. Although it would be desirable to identify neighborhoods at risk by a micro
simulation model, such a process would require defining the neighborhood in advance. The CCM
model can be used to perform this task.
Acknowledgements
The authors appreciate the helpful comments provided by the reviewers of the original draft of
the manuscript. Network data was supplied by Navigational Technologies, under an agreement with
the National Center for Geographic Information and Analysis. Support by the National Science
Foundation (NSF SBR96-00465) is gratefully acknowledged.
References
Cova, T.J., Church, R.L., 1997. Modeling community evacuation vulnerability using GIS. International Journal of Geographic Information Science 11, 763-784.
Dymon, U.J., Winter, N.L., 1993. Evacuation mapping: the utility of guidelines. Disasters 17, 12-24.
Hobeika, A.G., Jamei, B., 1985. MASSVAC: a model for calculating evacuation times under natural disasters. Emergency Planning, Simulations Series 15, 23-28.
Jin, L.M., Chan, S.P., 1992. A genetic approach for network partitioning. International Journal of Computer Mathematics 42, 47-60.
Johnson, D.S., Aragon, C.R., McGeoch, L.A., Schevon, C., 1989. Optimization by simulated annealing: an experimental evaluation; part I, graph partitioning. Operations Research 37, 865-892.
Kernighan, B.W., Lin, S., 1970. An efficient heuristic procedure for partitioning graphs. Bell Systems Technical Journal 49, 291-307.
Laguna, M., Feo, T.A., Elrod, H.C., 1994. A greedy randomized adaptive search procedure for the two-partition problem. Operations Research 42, 677-687.
Lindell, M.K., Perry, R.W., 1991. Understanding evacuation behavior: an editorial introduction. International Journal of Mass Emergencies and Disasters 9, 133-136.
Office of Emergency Services, 1992. The East Bay Hills Fire - A Multi-agency Review of the October 1991 Fire in the Oakland/Berkeley Hills. East Bay Hills Fire Operations Review Group, Governor's Office, Sacramento, CA.
Owen, M., Galea, E.R., Lawrence, P.J., 1996. The EXODUS evacuation model applied to building evacuation scenarios. Fire Engineers Journal 56, 26-30.
Perry, R., 1985. Comprehensive Emergency Management: Evacuating Threatened Populations. JAI Press, London.
Pidd, M., Eglese, R., de Silva, F.N., 1997. CEMPS: a prototype spatial decision support system to aid in planning emergency evacuations. Transactions in GIS 1, 321-334.
Pirkul, H., Holland, E., 1994. New heuristic solution procedures for the uniform graph partitioning problem: extensions and evaluation. Computers and Operations Research 21, 895-907.
Sheffi, Y., Mahmassani, H., Powell, W.B., 1982. A transportation network evacuation model. Transportation Research 16A, 209-218.
Shough, W.H., Magdalena, A.T., Stalberg, C.E., 1992. Hazard mitigation report for the East Bay fire in the Oakland-Berkeley hills. FEMA, FEMA-919-DR-CA.
Sinuany-Stern, Z., Stern, E., 1993. Simulating the evacuation of a small city: the effects of traffic factors. Socio-Economic Planning Sciences 27, 97-108.
Sorensen, J.H., Vogt, B.M., Mileti, D., 1987. Evacuation: An Assessment of Planning and Research. Oak Ridge National Laboratory ORNL-6376, Tennessee.
Southworth, F., 1991. Regional Evacuation Modeling: A State-of-the-Art Review. Oak Ridge National Laboratory ORNL-11740, Tennessee.
Stern, E., Sinuany-Stern, Z., 1989. A behavioral-based simulation model for urban evacuation. Papers of the Regional Science Association 66, 87-103.
Tufekci, S., Kisko, T.M., 1991. Regional evacuation modelling system REMS: a decision support system for emergency area evacuations. Computers and Industrial Engineering 21, 89-93.
Abstract
Shipping hazardous material (hazmat) places the public at risk. People who live or work near roads
commonly traveled by hazmat trucks endure the greatest risk. Careful selection of roads used for a hazmat
shipment can reduce the population at risk. On the other hand, a least time route will often consist of urban
interstate, thus placing many people in harm's way. Route selection is therefore the process of resolving the
conflict between population at risk and efficiency considerations. To assist in resolving this conflict, a
working spatial decision support system (SDSS) called Hazmat Path is developed. The proposed hazmat
routing SDSS overcomes three significant challenges, namely handling a realistic network, offering
sophisticated route generating heuristics and functioning on a desktop personal computer. The paper
discusses creative approaches to data manipulation, data and solution visualization, user interfaces, and
optimization heuristics implemented in Hazmat Path to meet these challenges. © 2000 Elsevier Science
Ltd. All rights reserved.
Keywords: Hazardous material; Routing; Spatial decision support system; GIS; Highway network
1. Introduction
0968-090X/00/$ - see front matter © 2000 Elsevier Science Ltd. All rights reserved.
PII: S0968-090X(00)00007-3
338 W.C. Frank et al. / Transportation Research Part C 8 (2000) 337-359
The goal of this paper is to develop a working SDSS capable of handling a realistic
transportation network covering a multi-state region or an entire country and its numerous loading
and delivery points. DSSs are computer-based systems that share several key characteristics,
including (Sprague and Carlson, 1982; Geoffrion, 1983; Turban, 1995):
• Assisting users in their decision making in a flexible and interactive manner.
• Solving all classes of problems, including ill-structured ones.
• Having a powerful and user-friendly interface.
• Having a data analysis and modeling engine.
Conceptually, SDSSs are special cases of DSSs. Densham (1991) effectively argues that they differ
markedly from general DSSs in some key respects. They need spatial capabilities for data input,
display of complex relations and structures, analysis, and cartographic output. The architecture of
our Hazmat Path SDSS is depicted in Fig. 1. Hazmat Path is a full-featured SDSS allowing for
interactive problem editing, comparison of solutions, and evaluation of decision criteria.
There is a difference between conceptualizing and actually developing a working SDSS. When
developing a working SDSS, numerous trade-offs exist between level of effort and SDSS quality.
A conceptual SDSS does not have this real world trade-off. Commercial geographic information
systems (GIS) software applications, which often provide the spatial information processing
engine of SDSSs, are designed with tremendous flexibility but at a cost to time for producing results.
Displaying results in a timely manner is imperative for a SDSS. Therefore, the choice has been
made to design a Windows-based software application tailored to the SDSS and running on
mid-range desktop computers instead of using an existing commercial application.
Since most long-distance travel occurs on highways, they compose the network on which travel
time is minimized. They primarily connect and traverse large population centers. Therefore,
minimizing travel time puts a large population at risk. Population centers can be avoided by using
slower and less direct non-interstate roads. Thus, the strategy of selecting a route to minimize
travel time and the strategy of minimizing population at risk are conflicting in nature.
Arbitration between these conflicting costs makes use of the capability of the SDSS to generate
several alternative truck routing solutions based on single optimization criteria. A widely used
approach is to combine several attribute costs into a single cost. The new cost is often taken to be
a linear function of population at risk, distance, time and accident probability. With a single link
cost, a simple solution method (e.g., Dijkstra's shortest path algorithm) can be used to determine a
vehicle route. By varying the weights of the attribute costs, different routes can be generated. The
process of varying the weights indicates the sensitivity of route selection to the attribute
costs.
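
As a minimal sketch of the weighted-sum approach described above, the following combines link attributes into a single cost and runs Dijkstra's algorithm on it. The attribute names, the `links` layout and the function names are assumptions for illustration; they are not taken from Hazmat Path.

```python
import heapq

def combined_cost(attrs, weights):
    """Linear combination of link attributes, e.g. time and population at risk."""
    return sum(weights[name] * attrs[name] for name in weights)

def dijkstra(links, weights, source, target):
    """links[u] is a list of (v, attrs). Returns (cost, path), or None if unreachable."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == target:
            path = [u]
            while path[-1] != source:
                path.append(prev[path[-1]])
            return d, path[::-1]
        for v, attrs in links.get(u, []):
            nd = d + combined_cost(attrs, weights)
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    return None
```

Re-running `dijkstra` with different `weights` on the same network is the sensitivity exercise the paragraph describes: a time-only weighting favors the fast urban route, while a population-only weighting favors the slower rural one.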
Another approach to route selection is having multiple objectives. Minimizing travel time and
total population at risk is an example of multiple objectives. Many routes can be in the solution
set because it contains all of the non-dominated routes. The number of non-dominated paths can
become very large in networks typical of real-world applications, thus rendering the approach
impractical.
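
To make the notion of a non-dominated route concrete, here is a small sketch that filters a list of candidate routes on two objectives. The tuple layout (name, travel time, population at risk) and the function name are assumptions for illustration.

```python
def non_dominated(routes):
    """Keep routes that no other route matches or beats on both time and population."""
    def dominates(a, b):
        # a dominates b if it is no worse on both objectives and better on at least one.
        return a[1] <= b[1] and a[2] <= b[2] and (a[1] < b[1] or a[2] < b[2])
    return [r for r in routes if not any(dominates(other, r) for other in routes)]
```

Each surviving route represents a different trade-off, which is why the set can grow large on a realistic network: nothing in the filter ranks one trade-off above another.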
Still another approach is to minimize one cost attribute while limiting the sums of other cost
attributes. This type of problem is called a constrained shortest path (CSP). It is used in this
research as follows. Travel time is minimized while four other criteria - total population at risk,
distance, accident probability, and consequence - are constrained. Travel time is chosen for the
objective function because it best represents financial cost. As noted earlier, minimizing time for a
long route often produces a path which places large numbers of people at risk. Consequently, the
CSP problem is a method that resolves the minimum time and minimum population conflict.
An added complication to the CSP problem is that some link attributes are assumed in this
research to be a function of the time of day. Link travel time is one of the temporal link attributes.
The level of traffic congestion influences link travel time, and congestion is affected by time of day.
This problem is called the time-dependent constrained shortest path (TCSP).
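
The constrained shortest path idea can be illustrated with a minimal sketch that minimizes travel time subject to a single cap on accumulated risk. It uses a simple label-based search with dominance pruning on static, non-negative link values; it is not the heuristic implemented in Hazmat Path, it handles only one constraint, and it ignores time dependence. All names and the `graph` layout are assumptions.

```python
import heapq

def constrained_shortest_path(graph, source, target, limit):
    """graph[u] is a list of (v, time, risk). Minimize time with total risk <= limit."""
    heap = [(0.0, 0.0, source, (source,))]
    expanded = {}  # node -> list of (time, risk) labels already expanded
    while heap:
        time, risk, node, path = heapq.heappop(heap)
        if node == target:
            return time, risk, list(path)
        # Skip labels that an earlier label at this node matches or beats on both counts.
        if any(t <= time and r <= risk for t, r in expanded.get(node, [])):
            continue
        expanded.setdefault(node, []).append((time, risk))
        for nxt, dt, dr in graph.get(node, []):
            if risk + dr <= limit:  # only extend labels that stay within the cap
                heapq.heappush(heap, (time + dt, risk + dr, nxt, path + (nxt,)))
    return None  # no feasible route
```

Tightening `limit` pushes the search off the fastest route and onto slower, lower-risk ones, and a cap that is too tight leaves no feasible route at all.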
3. Literature review
The literature relevant to our hazmat routing problem can be organized into studies that
implement a solution within a GIS and those that use a SDSS. Both groups use GIS techniques for
data storage, data manipulation (for instance, to generate link attribute costs) and to display
solutions on a map. SDSS-based implementations have the added capability of allowing the user
to easily specify shipment origins and destinations, along with removing intersections and roads
from route consideration. GIS provides an ideal environment for design and management of
hazmat routes because of its ability to integrate multi-theme and multi-source data into an
operational information system. Souleyrette and Sathisan (1994) advocate the use of GIS for the
comparative study of pre-defined, alternative routes on selected characteristics. This type of
analysis is illustrated by a case study of Nevada highway and rail routes for shipment of high-level
radioactive materials.
Abkowitz et al. (1990) envision GIS to fulfill functions in hazmat transportation that reach
beyond those of input data storage, data manipulation and output map display. They propose a
GIS application of hazmat routing on a large-scale network of size similar to the one used in this
research. The routing algorithm handles a single routing criterion, but compromise or negotiated
solutions can be achieved by ex-post comparison of solutions generated on different routing
criteria. In their implementation, the authors use criteria of distance (a measure of efficiency) and
populations at risk (a measure of safety). The latter is measured by the tally of people within a
given bandwidth (0.25, 0.5, 1, 3, 10, or 25 miles) along highway segments. A gradient method is
used in conjunction with Thiessen polygons to allocate enumeration district population to
pre-defined buffers.
In their study of transportation of aqueous hazardous waste in the London, UK, area,
Brainard et al. (1996) apply weighting schemes to identify routing solutions that compromise between
criteria. A single link attribute is calculated from a weighted combination of link attributes
(population at risk, groundwater vulnerability, and accident likelihood computed from historical
records). A labeling algorithm is then used to minimize this new attribute combination. Solutions
associated with alternative weights can be compared visually (map display) and statistically (for
instance, travel time, expected number of accidents, highway mileage versus mileage on local
roads) for risk assessment purposes. A similar approach is suggested by Lepofsky et al. (1993).
These authors stress that the network data model used by GIS to represent individual highway
segments allows for their detailed attribute characterization and for the efficient modeling of
segment-specific risk of hazmat shipping. Following established practices in the matter, they
define risk by a combination of accident likelihood, probability of a release, consequence of an
incident measured in terms of population exposed, and risk preference of affected interest groups.
Generation of the accident likelihood and consequence factors is considerably enhanced in a GIS
environment. Lepofsky et al. (1993) present the case study of a shipment from the
California/Arizona border to Vandenberg, CA, with a rather small-scale network representing the highway
system of Southern California. A route through Los Angeles is produced when travel time is
minimized. On the down side, this route has the highest population exposure. A compromise
solution is retained with weights of 25% travel time and 75% accident likelihood, allowing for an
acceptable trade-off between efficiency and safety.
Useful components of a hazmat SDSS are outlined by Baaj et al. (1990) and discussed in greater
detail by Erkut (1996). They include using different routing solution methods. Also, interactive
post-editing of generated and displayed routes gives control of the process to the analyst: a user
may wish to create a detour around a sensitive location, or it may be deemed desirable to remove
some links from the network and have a new route generated. Usually there is no "best" route.
Minimizing one criterion typically conflicts with minimizing another. By an iterative process of
displaying routes, using different solution methods and creating detours, a compromise route can
be developed.
A working SDSS is developed by Lassarre et al. (1993). Their network covers 600 km² in the
region of Haute-Normandie, France. The SDSS has the capability of loading geographical
overlays that include hydrology, railways and population densities. Dijkstra's algorithm is used to
compute a route with the lowest risk. Risk is defined to be the product of accident rate and people
affected in an accident. The latter is composed of the day population in adjacent polygons (buffers
of given width around highway segments), on links (traffic of road segments), and nearby points
(children at school). Their SDSS also has the ability of removing links and nodes or areas with
particular characteristics (i.e., single-lane roadways) from the network.
Coutinho-Rodrigues et al. (1997) propose a personal computer-based SDSS for multi-objective
hazmat location and routing problems. The routing solution generator offers several alternative
multi-objective optimization techniques, including the weighted method, the constraint method,
and goal programming. The user interacts with the SDSS through multiple graphical and
numerical solution displays. Time-dependence of attributes and solutions is not accommodated by
the SDSS in its current form. Geographic visualization of attributes and solution paths is limited
to graph windows that hardly allow for geographic reasoning on solutions and other
geo-referenced information.
The user may elect to display several relevant map objects. Fig. 2 shows the options available to
the user. In this dialogue box, truck route selection is gray because no route has been generated so
far and therefore cannot be displayed. The user can also specify the size and color of many of
these objects.
Displaying individual road attributes is useful when evaluating truck routes. Questions such as
"Are there any alternative routes available with a lower total population at risk without greatly
increasing travel time?" can be explored and answered by visual examination of suitably
attributed map objects. The geo-visualization method used here to display link costs is by road width.
In Fig. 2, the attribute "total population at risk" is selected for display and the resulting map is
shown in Fig. 3. The truck route is shown with intersection arrival times.
A linear relationship between attribute cost and width of the representation of the road feature
is used. The maximum road width is seven pixels. Each incremental pixel width corresponds to a
range of the attribute cost, with the requirement that the range of the single pixel start at zero.
Therefore, the area of a link on the monitor is a measure of its cost. Furthermore, the origin to
destination path displayed with the least area (smallest number of colored pixels) is also the least
cost path for that attribute.
The population at risk road parameters ("total population at risk" and "expected consequence"
in Fig. 2) exhibit heavily skewed frequency distributions, where very large values appear in a few
large urban areas. A strict application of the rule of proportional symbolic representation leads to
the depiction of most roads outside these areas with a width of one pixel. While this poses no
problem when the area of interest includes these few large urban areas, for proper visual
differentiation, the pixel width should be scaled up as shown in Fig. 2 when they are not included. With
the default scaling of 20, roads with 1/20 to 20/20 of the true maximum population at risk are
displayed with the same maximum width. Roads with zero to 1/20 of the true maximum population
at risk have a pixel width proportional to population at risk.
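The width rule described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `pixel_width`, its parameters, and the choice to round up to the next whole pixel (so that the one-pixel range starts at zero) are hypothetical and are not taken from Hazmat Path itself.

```python
import math

def pixel_width(cost, true_max, scale=20, max_width=7):
    """Map a non-negative attribute rate to a line width in pixels.

    Assumptions (not from the paper): widths are rounded up, and every
    drawn road is given at least one pixel, including zero-cost roads.
    """
    if true_max <= 0 or cost <= 0:
        return 1
    # With scale=20, anything above 1/20 of the true maximum saturates.
    capped_max = true_max / scale
    fraction = min(cost, capped_max) / capped_max
    return max(1, math.ceil(fraction * max_width))
```

With `scale=1` the mapping is strictly proportional to the true maximum; with the default of 20, only roads below 1/20 of the maximum are visually differentiated, which matches the behavior the text describes for areas that exclude the large urban centers.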
Since the area of a link is linearly proportional to one of its attribute costs, the link length times
the displayed attribute needs to represent the cost to travel that link. The displayed attribute must
therefore be a rate per unit length. The pixel width for probability of an accident is a linear
function of the probability of an accident per mile. The pixel width for expected consequence is also
based on a rate, expected consequence per mile. The same is true for total population.
Average population at risk has no equivalent rate and therefore, cannot be displayed.
Hazmat Path uses the National Highway Planning Network (NHPN) maintained by the Federal
Highway Administration (Bureau of Transportation Statistics, 1999). The NHPN has 95,000 nodes.
Pre-processing of the original network brings it down to 57,000 nodes. This operation involved
removing nodes outside the continental US as well as most of the nodes with only two incident links.
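Removing a node with only two incident links amounts to merging those two links into one. A minimal sketch of that step follows, assuming an undirected network stored as a nested dictionary of link lengths. The function name, the `keep` argument, and the rule of leaving a node in place when merging would create a parallel link are illustrative assumptions; the paper does not say which two-link nodes were retained or why.

```python
def contract_degree_two(adj, keep=frozenset()):
    """Merge away nodes that have exactly two incident links.

    adj:  {node: {neighbor: length}} for an undirected network.
    keep: nodes that must survive (hypothetical, e.g. origins or city nodes).
    Returns a new adjacency dictionary; the input is not modified.
    """
    g = {u: dict(nbrs) for u, nbrs in adj.items()}
    changed = True
    while changed:
        changed = False
        for n in list(g):
            if n in keep or len(g[n]) != 2:
                continue
            (a, la), (b, lb) = g[n].items()
            if b in g[a]:
                # Merging would duplicate an existing link; leave the node.
                continue
            g[a][b] = g[b][a] = la + lb  # the merged link keeps the total length
            del g[a][n], g[b][n], g[n]
            changed = True
    return g
```

In a real network the other link attributes would have to be aggregated in the same way as length, and the shape points of the merged links retained for map display.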
The user has the option of temporarily removing intersections from the network. Intersections
temporarily removed from the network cannot be part of the truck route being planned. There are
two reasons for removing intersections, namely to reduce solution run-time and to prevent a
solution from traversing a particular region for equity reasons or to comply with local or state
hazmat traffic regulations.
In our SDSS, three different methods are available to exclude or deactivate intersections. The
first method is by selecting individual intersections by point and click with the mouse or cursor.
Intersections can be added to the network by the same operation. Also, the cursor can be
dragged over a rectangular region to add or remove all intersections within the circumscribed
area. Fig. 4 displays a rectangle of intersections being deleted: the intersections represented by
clear circles have been removed from the network. The last method is associated with a
constraint imposed by the user. Only intersections that are reachable while traveling from the
specified origin to the specified destination without violating the said constraint are included in
the network. In the current implementation, the constraint is based either on travel time or
distance.
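One plausible reading of the constraint-based method is that an intersection is kept only if the shortest origin-to-destination path passing through it stays within the bound. The sketch below implements that reading with two Dijkstra runs, one forward from the origin and one backward from the destination. The function names and the dictionary-based network layout are assumptions made for illustration, and the paper does not state that this is how Hazmat Path performs the test.

```python
import heapq

def dijkstra(adj, source):
    """Shortest distances from source over adj = {u: {v: cost}}."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def reachable_within(adj, origin, destination, bound):
    """Nodes n with shortest(origin, n) + shortest(n, destination) <= bound."""
    reverse = {}
    for u, nbrs in adj.items():
        for v, w in nbrs.items():
            reverse.setdefault(v, {})[u] = w
    from_origin = dijkstra(adj, origin)
    to_dest = dijkstra(reverse, destination)
    return {n for n in from_origin
            if n in to_dest and from_origin[n] + to_dest[n] <= bound}
```

The link cost can be either travel time or distance, matching the two constraint types mentioned in the text.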
Often many hazmat shipments take place between the same origin-destination pairs. Following
the same route for multiple shipments could concentrate the risk among a localized group of
people above and beyond what is deemed acceptable. Therefore, with such equity considerations
in mind, one may consider spreading the risk more evenly over a large population. This can be
accomplished by finding D differentiated routes. The metric for total population at risk would be a
set of D numbers (total population placed at risk by only one of the routes, total population
placed at risk by any two of the routes, ..., total population placed at risk by all D routes). This
requires the calculation of the combined total population at risk of any two links along with
developing a solution method, both of which are beyond the scope of this paper.
A less computationally intensive approach can be applied by utilizing the network editing
features of the SDSS. The user can develop differentiated routes by reducing the overlap of route
buffers in populated areas. For this purpose, the total population at risk along each road segment
is displayed along with the mile scale so as to reveal the population density in regions of interest.
The mile scale is needed for the user to estimate the distance between highway links or, more
importantly, the degree of overlap between link buffers. The approach consists in manually
subtracting and adding intersections and routes generated until an appropriate level of differentiation
is obtained.
W.C. Frank et al. / Transportation Research Part C 8 (2000) 337-359 345
The SDSS can generate two different categories of routes, namely those along the least cost
path and those along the least time path with limits on the other attributes. In both instances, the
user is prompted to input the distance from a road within which an accident is expected to have
harmful impacts on human populations. The area affected by the release of toxic materials and
population exposure depends heavily on the properties of the material being shipped and the spill
characteristics (Lepofsky et al., 1993). The user-supplied distance is therefore case-specific. It
defines the radius of the moving circular buffer used to identify average population at risk when
calculating the expected consequence of an accident as well as the bandwidth of buffers around
highway segments used when determining total population at risk. More details on the calculation
of the population at risk attributes are found in Section 5.
With the least cost path option, each of the five link attributes is weighted to produce a single
composite link cost. As in Lepofsky et al. (1993) and Brainard et al. (1996), Dijkstra's algorithm is
then applied to this cost to generate a single path. The user inputs the weights using a dialogue
box. Due to the wide variation of measurement scales of each link attribute, weights are
normalized. The normalization consists in dividing non-zero weighted attributes of each path by the
minimum origin-destination path cost for that attribute. As weights are applied linearly, the
solution path is invariant with respect to a scaling of the weight vector.
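The weighting scheme can be illustrated with a small sketch. It assumes that each link carries a dictionary of attribute values and that the minimum origin-destination path cost for each attribute (`minima`) has already been computed by separate single-attribute shortest path runs. All names are hypothetical, and the code is a generic Dijkstra search rather than the Hazmat Path implementation.

```python
import heapq

def composite_cost(attrs, weights, minima):
    """Weighted sum of link attributes, each divided by its minimum path cost."""
    return sum(w * attrs[k] / minima[k] for k, w in weights.items() if w != 0)

def least_cost_path(links, origin, destination, weights, minima):
    """links: {u: {v: {attribute: value}}}. Returns (cost, [nodes])."""
    best = {origin: 0.0}
    prev = {}
    heap = [(0.0, origin)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == destination:
            break
        if d > best.get(u, float("inf")):
            continue  # stale heap entry
        for v, attrs in links.get(u, {}).items():
            nd = d + composite_cost(attrs, weights, minima)
            if nd < best.get(v, float("inf")):
                best[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if destination not in best:
        return float("inf"), []
    path = [destination]
    while path[-1] != origin:
        path.append(prev[path[-1]])
    return best[destination], path[::-1]
```

Because the composite cost is linear in the weights, multiplying every weight by the same positive constant rescales all path costs equally and leaves the chosen path unchanged, which is the invariance noted above.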
The least cost path solution method can conceivably use non-temporal accident and population
at risk attributes only. If temporal attributes were used without a constraint on time, then a likely
solution would involve excessive parking to avoid links with high cost during specific time
windows. For instance, a vehicle assigned to a 500 mile ride could optimally be scheduled to travel for
a few hours each day so as to take advantage of lower cost temporal attributes, thus unrealistically
and impractically delaying delivery for several days. The approach proposed here avoids such
unrealistic scenarios.
The constrained least time path problem has three different solution methods, all of which are
extensions of Handler and Zang (1980). The first approach assumes non-temporal attributes. The
remaining two solution methods allow for temporal attributes. They are the intermediate node
method and the weight-guided solution method. Discussion of the latter methods is conducted in
Section 6.
If a solution method with temporal link attributes is selected, then start time influences
the route selected. Start time can be input manually in a dialogue box. The month of the year
can also be supplied to account for seasonal fluctuations in the level of congestion. The
dialogue box through which the constrained least time path problem is specified is shown in
Fig. 5(a).
The lower left panel is labeled "shortest paths without constraints". With the exception of the
minimum time, non-temporal attributes are used to calculate the minimum link attributes. Fig.
5(a) displays the results describing the solution path in terms of each of the five attributes
(accident probability, distance, expected consequence, travel time, and total population) when travel
time is minimized, while Fig. 5(b) displays the corresponding results when total population is
minimized. By clicking on the appropriate radio button, the numerical results can be displayed for
different attribute minima. This feature assists the user in determining appropriate constraint
bounds.
Solution paths can be depicted in the decision space by displaying the appropriate map object
in the map window. As an aid to decision making, the performance of solutions vis-à-vis the
objectives and constraints of the routing problem can be visualized by displaying them in the
objective space. Spider webs are used in this research (see also Coutinho-Rodrigues et al., 1997).
Routing performance results with travel time minimization are depicted in Fig. 6. In this graph,
the corners of the dark gray polygon represent the minima of the five attributes. The corners of
the light gray polygon represent the attribute sums when travel time is minimized. Fig. 7 shows the
solution diagram for a scenario where total population is minimized. It should be pointed out that
the dark gray polygon is the same as in Fig. 6 but the light gray polygon is very different. These
graphs illustrate the conflict that typically exists between minimizing travel time and minimizing
population at risk.
Fig. 5. (a) Constrained least time path dialogue box. (b) Total population minimized.
The lower center panel of Fig. 5(a) is used to supply upper bounds on attribute sums. The
numerical results are displayed in the lower right side of Fig. 5(a). Fig. 8 displays the results
graphically. This figure is similar to Fig. 6 except that the constraint is added in the form of a dot
on the total population axis. The attribute sums are shown in the form of a polygon without a
colored interior. The differences between the light gray polygon and the uncolored polygon are
caused by constraints.
The Hazmat Path SDSS utilizes two data structure sets, namely link costs and map objects. The
link costs set is used to determine the truck route. The development of these costs is presented in
Section 5. To produce one TCSP route, Dijkstra's algorithm might be performed thousands of
times. Therefore, to reduce computer runtime, these data are stored in RAM in a binary tree data
structure. The tree is constructed only once when the program is initially started. The user has the
option of choosing different solution methods. This option dictates the assumptions applied to the
link attribute costs.
The second data structure set consists of information needed to draw a map. Most of these data
are in the form of shape points used to draw roads and state boundaries. These data are stored on
disk because of their large size. There are three complete sets of state boundaries. Each set has the
same number of chains but the number of shape points varies dramatically. As the user zooms in
or out, the resolution is adjusted by displaying a set with more or fewer shape points. The sets range
in total file size from 300 KB to 8 MB. This decreases map drawing time with little noticeable
decrease in map quality. Without different sets, multiple shape points may represent one pixel on a
monitor. The state boundary chains are divided among 10 files, each of which covers a different
region of the country. A file is read when at least one chain from that file is displayed. This
prevents reading an entire state boundary data set every time state boundaries are redrawn. Road
data sets are handled in a similar manner. There are three road data sets, each divided into 45
regions. The sets range in total file size from 23 to 67 MB. To reduce file read time, interstate
chains are placed at the beginning of each of the files. With this structure, the entire file need not
be read if the user wants to display interstate roads only.
The resolution-specific data sets are created from the same master data set by selectively
removing shape points. Generalization of features in a data set proceeds as follows. If the change in
direction at a shape point is below a given threshold, then the shape point is removed. Another
approach for the pre-processing of chain features is based on the distance between adjacent shape
points. If this distance is below a pre-defined threshold, then one of the shape points is removed.
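Both generalization rules are simple filters over the ordered shape points of a chain. The sketch below shows one way to write them, assuming planar coordinates. The function names, the threshold units, and the decisions to always keep the two chain end points and to measure turn and distance against the last retained point are assumptions, not details taken from the paper.

```python
import math

def drop_by_angle(points, min_turn_deg):
    """Remove interior shape points where the chain turns less than min_turn_deg."""
    if len(points) < 3:
        return list(points)
    kept = [points[0]]
    for cur, nxt in zip(points[1:], points[2:]):
        prv = kept[-1]
        heading_in = math.atan2(cur[1] - prv[1], cur[0] - prv[0])
        heading_out = math.atan2(nxt[1] - cur[1], nxt[0] - cur[0])
        turn = abs(math.degrees(heading_out - heading_in))
        turn = min(turn, 360.0 - turn)  # fold into the 0-180 degree range
        if turn >= min_turn_deg:
            kept.append(cur)
    kept.append(points[-1])
    return kept

def drop_by_distance(points, min_dist):
    """Remove interior shape points closer than min_dist to the last kept point."""
    if len(points) < 3:
        return list(points)
    kept = [points[0]]
    for p in points[1:-1]:
        if math.dist(p, kept[-1]) >= min_dist:
            kept.append(p)
    kept.append(points[-1])
    return kept
```

Running either filter with progressively larger thresholds on the master data set would yield the coarser sets used at smaller map scales.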
This method greatly reduces double counting but does not eliminate it. Indeed, if the B regions of
two links overlap, double counting occurs.
The other population at risk metric is average population. It measures the average population
at risk in case of accident on a link, under the assumption that the probability of an accident is
constant along the length of the link. If an accident occurs while traveling a link, then the expected
number of people exposed is the population within a given radius λ of the accident location. The
value of the radius λ depends on the type of hazmat and the characteristics of the accident. Given
that the location of future accidents is not known and that they are expected to occur with a
probability that is uniformly distributed over the link, the expected number of people at risk can
be approximated as follows. A series of circles are centered and equally spaced along the length of
the link as shown in Fig. 10. The circle radius λ is taken to be constant in this implementation, but
the methodology proposed here can accommodate a distribution of radii associated with a dis-
tribution of hazmat accidents drawn from historical records. The number of people within each
circle is calculated by overlay with the layer of block group polygons produced by the Bureau of
the Census and incorporated in the TIGER/Line files. The mean of these counts produces the
average population at risk. By design, the average population at risk calculation assigns more weight
to people the closer they live to the link. A person nearer the link is more likely to be counted
multiple times than a person further from the link. A similar discussion is held in Erkut and Verter
(1995).
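The moving-circle calculation can be sketched as follows. To keep the example self-contained, population is represented as weighted points (for instance block group centroids) instead of the block group polygon overlay used in the paper, and the link is treated as a straight segment. These simplifications, the function name, and the `n_circles` parameter are all assumptions.

```python
import math

def average_population_at_risk(start, end, radius, pop_points, n_circles=10):
    """Mean population inside equally spaced circles centered along a link.

    start, end: (x, y) end points of a straight link.
    pop_points: iterable of (x, y, population); a stand-in for polygon overlay.
    """
    counts = []
    for k in range(n_circles):
        t = (k + 0.5) / n_circles  # circle centers spaced evenly along the link
        cx = start[0] + t * (end[0] - start[0])
        cy = start[1] + t * (end[1] - start[1])
        inside = sum(p for x, y, p in pop_points
                     if math.hypot(x - cx, y - cy) <= radius)
        counts.append(inside)
    return sum(counts) / n_circles
```

A point close to the link falls inside several neighboring circles and is therefore counted more than once before averaging, which is the distance weighting described in the text.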
Average population at risk is of little use by itself. Once multiplied by the probability that
an accident occurs on a link, it produces the expected consequence of a single truck accident.
Three nationwide data sources are integrated to establish temporal accident rates:
1. Annual weighted accident police reports with type of roadway, time of day and day of week
(National Highway Traffic Safety Administration, 1993).
2. Truck mileage by type of roadway for a one year period (FHWA, 1996).
3. Annual truck mileage by type of roadway, time of day and day of week (US DoT, 1992).
The second and third sources are used to develop a nationally weighted truck mileage by type of
roadway, time of day and day of week for a one year period. This statistic is then combined with
the first source to determine accident rates by type of roadway, time of day and weekday/weekend.
All data are relative to the 1993 calendar year. The results are shown in Fig. 11 for all types of
roadways combined.
On most urban and all rural roads, the travel speed is not significantly affected by congestion.
Therefore, on these roads, the travel speed can be assumed to be constant. Link travel time is
calculated as the link length divided by the speed limit inferred from the road type. On the
remaining urban roads where congestion is a problem, travel speed or time delay is time dependent.
Ideally, a four-step travel demand methodology should be used to forecast link travel speeds in
each metropolitan area afflicted with significant congestion. Given the enormity of such a task in the
context of the present research, an alternative method incorporating simulations is followed.
The approach to estimating travel time starts with a bivariate classification of links into groups
with similar characteristics, namely average annual daily traffic (AADT) and lane width. A
simulation is performed for each group to determine temporal travel speeds under expected
recurrent and non-recurrent congestion conditions. The simulation follows the procedure outlined
in US DoT (1986). Inputs to the model include freeway capacity reduction under incident
conditions, incident frequency and average incident duration. There are five different lane widths and
five AADT values for a total of 25 simulations. Linear interpolation is used for values between the
five average AADT values. The average AADT on any link is estimated by a series of simple
proportionality rules given the following link attributes: city name, month of the year, weekday or
weekend, time of day, and link orientation with respect to the center of the metropolitan area. AADT
per lane for the 50 most congested metropolitan areas comes from the US DoT (1997). National
averages are used to distribute the latter by month, weekday or weekend, and time of day.
Traffic directionality varies dramatically with the time of the day. More vehicles are heading
towards a city in the morning, while more vehicles exit a city during the evening peak hours. These
traffic patterns are modeled as a function of link orientation with respect to the city center by
means of a directional flow. Equal flow in both directions is associated with an even 50% directional
flow. Different directional flows are applied to links for morning and evening peak periods to
capture the directionality of commuting patterns.
The directional flow is determined in two steps. The first step is determining the spatial relation
between the center of an urban area and the direction of flow. Traffic flow angle is used as the
metric in the first step. The traffic flow angle is converted into a directional flow in the second step.
The following algorithm calculates the traffic flow angle:
1. Draw a line through the link enter node to the link exit node. These are points A and C from
the example in Fig. 12.
2. Draw a line from the center of the urban area to a point midway between the entrance and exit
nodes, point B.
3. The angle made by these two lines is the traffic flow angle.
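The three steps above reduce to the angle between two vectors, which can be sketched as follows. The function name and argument names are hypothetical, and the convention that the angle is unsigned and runs from 0 degrees (link pointing directly away from the center) to 180 degrees (link pointing directly toward it) is an assumption; the paper's own sign convention is fixed by Fig. 12, which is not reproduced here.

```python
import math

def traffic_flow_angle(enter, exit_, center):
    """Angle in degrees between the link direction and the center-to-midpoint line.

    enter, exit_: (x, y) of the link enter and exit nodes (points A and C).
    center:       (x, y) of the urban area center.
    """
    mid = ((enter[0] + exit_[0]) / 2.0, (enter[1] + exit_[1]) / 2.0)  # point B
    link = (exit_[0] - enter[0], exit_[1] - enter[1])
    radial = (mid[0] - center[0], mid[1] - center[1])
    dot = link[0] * radial[0] + link[1] * radial[1]
    cross = radial[0] * link[1] - radial[1] * link[0]
    return math.degrees(math.atan2(abs(cross), dot))
```

The conversion from this angle to a directional flow percentage is left out because the exact transform is not recoverable from the text at this point.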
For morning traffic, traffic varies directly with the traffic flow angle. The opposite is true for the
evening. A sine function is used to convert traffic flow angle to directional factor. Proper scaling of
the transform generates directional flows in the 34-66% range (Robinson et al., 1992). The traffic
flow angle-directional factor relationship is given by
Costs (approximated by travel distance, travel time and risk exposure metrics) occur when
traveling from the origin to the destination of the shipment. This section describes the methods
implemented to evaluate these costs once the solution path has been identified. The mathematical
formulation of the TCSP problem is also presented along with solution methods to solve it.
Let path $\Pi$ be an origin to destination path. Also, let us denote by $\delta_{ijt}$ the travel time on link $(i,j)$
when departing node $i$ at time $t$. The departure time from node $i$ is denoted by $\gamma_i$.
If parking is not allowed, then departure time is $\gamma_j = \gamma_i + \delta_{ij\gamma_i}$. On the other hand, if parking is
utilized, then departure time is $\gamma_j = \gamma_i + \delta_{ij\gamma_i} + D_j$, where $D_j$ is the time parked on node $j$.
Let the probability of an accident on link $(i,j)$ at time $t$ be denoted by $a_{ijt}$. Summing up the costs
for all the links on a path, the probability of an accident on $\Pi$ is $\sum_{\forall (i,j) \in \Pi} a_{ij\gamma_i}$.
The average population at risk along link $(i,j)$ at time $t$ is denoted by $s_{ijt}$. The expected
consequence on link $(i,j)$ is $a_{ijt} s_{ijt}$. This assumes that an accident anywhere along link $(i,j)$ is equally
likely. The expected consequence on $\Pi$ becomes $\sum_{\forall (i,j) \in \Pi} a_{ij\gamma_i} s_{ij\gamma_i}$.
The total population at risk along link $(i,j)$ at time $t$ is denoted by $b_{ijt}$, and the total
population attribute associated with path $\Pi$ is $\sum_{\forall (i,j) \in \Pi} b_{ij\gamma_i}$.
Distance is the only non-temporal link attribute. The total distance on $\Pi$ is $\sum_{\forall (i,j) \in \Pi} d_{ij}$, where $d_{ij}$ is
the distance when traveling link $(i,j)$.
Total travel time is equal to the destination departure time ($\gamma_D$) with no parking on the
destination.
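The path evaluation just described is a single forward pass in which the departure time is updated link by link and every temporal attribute is looked up at the departure time from the link's start node. A minimal sketch follows. The function and argument names are hypothetical, and the attribute look-ups are passed in as plain Python callables rather than the in-memory structures used by Hazmat Path.

```python
def evaluate_path(path, start_time, travel_time, accident_prob,
                  avg_population, total_population, distance, parking=None):
    """Accumulate the five path attributes for a list of nodes.

    travel_time, accident_prob, avg_population, total_population: f(i, j, t).
    distance: f(i, j). parking: {node: time parked there}, default none.
    """
    parking = parking or {}
    gamma = start_time  # departure time from the current node
    totals = {"accident": 0.0, "consequence": 0.0,
              "population": 0.0, "distance": 0.0}
    for i, j in zip(path, path[1:]):
        a = accident_prob(i, j, gamma)
        totals["accident"] += a
        totals["consequence"] += a * avg_population(i, j, gamma)
        totals["population"] += total_population(i, j, gamma)
        totals["distance"] += distance(i, j)
        # Departure from j: arrival time plus any time parked on j.
        gamma = gamma + travel_time(i, j, gamma) + parking.get(j, 0.0)
    totals["gamma_D"] = gamma  # destination departure time
    return totals
```

Passing an empty `parking` dictionary gives the no-parking recursion, while non-zero entries reproduce the parking case.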
Therefore, the TCSP takes the following formulation:
Minimize $\gamma_D$
Subject to:
The TCSP incorporates a number of link attributes whose valuation varies with time of the day
and day of the week. In addition, parking or stopping on a node and continuance of the trip at a
later time is allowed in the temporal network. Waiting to enter a link may be advantageous when
the cost of traversing this link is expected to decrease in the future. Parking can be thought of as a
cycle where the only non-zero link attribute is time. Cycles may exist in the optimum solution if
parking is not allowed. If neither cycles nor parking are allowed, then a less obvious method of time
consumption may exist. An example is given to illustrate this problem. Consider the network in
Fig. 14.
Assume the travel time of subpath 1-3-2 is greater than the travel time from node 1 to node 2.
Also assume all other link attribute costs for subpath 1-3-2 are greater than subpath 1-2 attribute
costs. Subpath 1-2 will be chosen over subpath 1-3-2 if parking is allowed, given that subpath 1-2
dominates subpath 1-3-2. The temporal costs of the link from node 2 to the destination have no
effect on which subpath is chosen.
Now, consider the case where parking is not allowed and the costs to travel link 2D decrease
with time. Traveling subpath 1-3-2 will delay entering link 2D. This delay will decrease the costs
of traveling link 2D. However, this delay comes at the increased costs of traveling subpath 1-3-2.
Subpath 1-3-2 will be chosen if the total cost is less.
By its sheer size, the 57,000 node network derived from FHWA's NHPN network entails a
major challenge to solving the TCSP. Even under the optimistic scenario of a 90% reduction in
network size obtained by applying the constraints discussed in Section 4, the network is still too
large to apply a temporal dynamic programming solution method. A solution method with
exponential complexity would not be practical with such a large network, so a major concern of this
research is the development of efficient heuristic procedures.
A workable strategy to get around the network size problem is to break up the problem into
manageable parts. The primary difficulty of this approach is relating information between the
manageable parts. Two solution methods are implemented in Hazmat Path, namely the so-called
weight-guided solution method, and the intermediate nodes solution method. Both approaches
are an extension of Handler and Zang (1980) and are now outlined.
Fig. 14. Simple network (node labels: Origin, 1, 2, Destination).
The weight-guided solution method disentangles the original formulation into a master
problem and a sub-problem. The master problem produces a path using Dijkstra's algorithm with a
single link cost. This link cost is a linear combination of the five link attributes. The sub-problem
determines the parking times at nodes. The attribute sums from this path along with the upper
bounds on these attribute sums are inputs in determining the new single link cost. The master
problem then produces another origin-destination path. This is repeated until there is no change
in results between iterations.
The intermediate nodes solution method transforms the TCSP into many time-independent
CSP problems. All link attributes are defined over discrete time intervals, with the exception of
travel time in congested cities, which is a continuous function of time. The discrete time intervals
range from 1 to 6 h. To implement this method, travel time in large urban areas is converted to a
function of discrete time. The algorithm proceeds as follows. A vehicle starts at the origin and
reaches a pre-determined intermediate node before any attributes change values. It stays at this
parking position until attributes change value. Since no attributes change value while traveling
from the origin to the intermediate node, a time-independent solution method can be used in this
subpath route selection. The vehicle resumes its trip to another intermediate node or to the
destination. This process is repeated as needed until the destination is reached.
In this paper, we presented a working and easy to use hazmat routing SDSS that overcomes
three significant challenges, namely handling a realistic network, offering sophisticated route
generating heuristics and functioning on a desktop personal computer. Although many parts of
this work can individually be found in previous work, never before have they been combined into
one single working system.
A successful SDSS necessitates the development of custom software. Decision making is
rendered considerably less cumbersome for several reasons. First, the user follows a logical procedure
when developing a route. By contrast, off-the-shelf software adapted to hazmat routing
requires learning the general syntax of the software prior to delving into hazmat routing. Custom
software also produces a route in a more timely manner because it incorporates efficient data
structures and solution algorithms. The navigational simplicity and efficiency advantages help the
decision-maker focus on creating solutions, negotiating trade-offs, and evaluating scenarios.
This paper outlines two solution methods to the TCSP problem implemented in Hazmat Path.
Another possible solution method could be developed by combining the work of Handler and Zang
(1980) with the Lombard and Church (1993) approach to solving the gateway shortest path problem.
This method can be outlined as follows. The best route generated from the gateway procedure is
used as input to the Handler-Zang procedure, which calculates a set of link weights. These link
weights are then applied to the network. The gateway procedure is run again and the best route is
determined. This process is repeated until there is no improvement between iterations.
In Section 4, we discussed producing differentiated routes to spread risk over a larger
population. In our approach, routes were constructed interactively by the decision-maker on the map
window. A possible enhancement of the SDSS could involve adding a route generator to produce
differentiated routes. Akgun et al. (1999) evaluated several methods for creating differentiated
paths. The computational effort required to generate these paths is very low for some of these
methods. However, evaluating the quality of the set of differentiated paths would require
considerable computational effort during route selection. This calculation consists of creating polygon
overlays, which are used for determining the overlap of link buffers.
The display of temporal link attributes is another area where the current system could benefit
from future enhancements. The SDSS presently displays attributes for one user-defined time
period whereas a shipment may take a considerable amount of time, possibly several days.
Displaying multiple maps of temporal attributes in a time loop would prove to be a useful decision
support tool. Such capability would require creating, storing and retrieving multiple bitmaps.
Acknowledgements
References
Abkowitz, M., Cheng, P.D.-M., Lepofsky, M., 1990. Use of geographic information systems in managing hazardous
materials shipments. Transportation Research Record 1261, 35-43.
Akgun, V., Erkut, E., Batta, R., 2000. On finding dissimilar paths. European Journal of Operational Research 121,
232-246.
Baaj, M.H., Ashur, S.A., Chaparrofarina, M., Pijawka, K.D., 1990. Design of routing networks using geographic
information systems: applications to solid and hazardous waste transportation planning. Transportation Research
Record 1497, 140-144.
Brainard, J., Lovett, A., Parfitt, J., 1996. Assessing hazardous waste transport risks using a GIS. International Journal
of Geographic Information Systems 10, 831-849.
Bureau of Transportation Statistics, 1999. National Transportation Atlas Databases, US Department of
Transportation, Washington, DC.
Coutinho-Rodrigues, J., Current, J., Climaco, J., Ratick, S., 1997. Interactive spatial decision-support system for
multiobjective hazardous materials location-routing problems. Transportation Research Record 1602, 101-109.
Densham, P., 1991. Spatial decision support systems. In: Maguire, D.J., Goodchild, M.F., Rhind, D.W. (Eds.),
Geographic Information Systems: Principles and Applications, vol. 1. Longman, London, pp. 403-412.
Erkut, E., 1996. The road not taken. OR/MS Today 23 (6), 22-28.
Erkut, E., Verter, V., 1995. A framework for hazardous materials transport risk assessment. Risk Analysis 15, 589-601.
FHWA, 1996. Annual Vehicle Miles of Travel and Related Data, US Department of Transportation, Publication No.
FHWA-PL-96-024.
Geoffrion, A.M., 1983. Can OR/MS evolve fast enough. Interfaces 13, 10-25.
Handler, G.Y., Zang, I., 1980. A dual algorithm for the constrained shortest path problem. Networks 10, 293-310.
Jovanis, P.P., Delleur, J., 1983. Exposure-based analysis of motor vehicle accidents. Transportation Research Record
910, 1-7.
Abstract
Current geographic information systems typically offer limited analytical capabilities and lack the
flexibility to support spatial decision making effectively. Spatial decision support systems aim to fill this gap.
Following this approach, this paper describes an operational system for integrated land-use and
transportation planning called Location Planner. The system integrates a wide variety of spatial models in a
flexible and easy-to-use problem solving environment. Users are able to construct a model out of available
components and use the model for impact analysis and optimization. Thus, in contrast to existing spatial
decision support systems, the proposed system allows users to address a wide range of problems. The paper
describes the architecture of the system and an illustrative application. Furthermore, the potentials of the
system for land-use and transportation planning are discussed. © 2000 Elsevier Science Ltd. All rights
reserved.
Keywords: Integrated land-use and transportation planning; Spatial decision support systems; Spatial models;
Multipurpose trips models
1. Introduction
0968-090X/00/$ - see front matter © 2000 Elsevier Science Ltd. All rights reserved.
PII: S0968-090X(00)00010- 3
362 T.A. Arentze, H.J.P. Timmermans / Transportation Research Part C 8 (2000) 361-380
geographical information systems offer. In most commercially available systems, such opportunities
are rather limited. Moreover, to the extent that systems offer modeling opportunities, they
tend to be quite dated and often restricted to the simplest versions of that model.
Under such circumstances, the user is left with the option to either apply more or less outdated
technology, or develop and apply dedicated software. Several geographical information systems
now also provide some script language that may be used, but in general, this is not particularly
efficient. In many cases, we have therefore chosen to build dedicated software that has been
optimized with a particular class of problems in mind, and that incorporates the latest models that
have been developed for that problem. Communication with widely used geographical information
systems is guaranteed through the use of particular data file formats.
The present article focuses on an example of such a spatial decision support system that was
made operational over the last couple of years: Location Planner. It has been developed with
retail planning problems in mind, and allows the user to address issues of transportation and land-use
planning. A major problem is how the supply of retail facilities, given the properties of the
transportation system, affects spatial shopping behavior and related trip patterns, and vice versa.
The article is organized as follows. First, we will describe the objectives of the system against a
brief overview of spatial decision support systems. This is followed by a more detailed discussion
of the architecture and various components of the decision support system. Next, we will illustrate
the use of part of the system in a case study of Veldhoven, Netherlands. The article is concluded
with a summary and evaluation of the proposed system.
Decision support systems (DSS) were first introduced in business management. The aim was to
improve the decision support capabilities of the management information systems that were used
at the time. The first DSS applications began to appear in the early 1970s. Since the early 1980s,
DSS efforts have gained in strength under the influence of the PC revolution, the increasing
performance-price ratio of hardware and software, and the increasing availability of public databases
and other sources of external data (Sprague, 1989). Although there is no generally agreed upon definition,
the term DSS commonly refers to "computer-based systems which help decision makers utilize
data and models to solve unstructured problems" (Sprague, 1989). Spatial DSS is generally defined
as a DSS which combines geographic information with appropriate algorithms to support
locational decision making (Crossland et al., 1995; Keenan, 1998; Maniezzo et al., 1998). This
section first reviews approaches in spatial DSS for transportation and location planning and then
positions our approach in this context.
Spatial DSS approaches reported in the literature are typically centered on a single modeling
technique. According to the technique used, we can distinguish spatial interaction/choice modeling,
mathematical programming, or multi-criteria decision modeling approaches. Examples of systems
based on the first approach are Roy and Anderson (1988), Borgers and Timmermans (1991),
Grothe and Scholten (1992), Kohsaka (1993), Birkin (1994), Birkin et al. (1996) and Clarke and
Clarke (1995). The core of these systems is a spatial interaction/choice model for predicting
destination choice or interactions between zones dependent on travel distance, size and attributes of
retail or service facilities. Typically, predictions are represented in an origin-destination matrix
from which market shares of stores and travel demands of consumers are derived. Thus, the systems
allow users to predict and analyze impacts of possible location or transportation plans. A few
systems also actively support location selection. The system proposed by Kohsaka (1993)
incorporates a steepest descent algorithm to search for locations on a continuous potential surface
(predicted by the shopping model). Clarke and Clarke (1995) and Birkin et al. (1996) describe
dedicated GIS applications that incorporate a location-allocation model for optimizing the spatial
configuration of a network of facilities.
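The chain from choice model to origin-destination matrix to market shares can be illustrated with a minimal sketch. It assumes a multinomial logit form in which utility increases with the logarithm of floor space and decreases linearly with distance; the parameter values, function names and data are hypothetical and are not taken from any of the systems cited above.

```python
import math

def destination_probabilities(distances, floor_space, beta=0.001, alpha=1.0):
    """Logit choice probabilities over stores for one origin zone.

    Assumed utility: alpha * ln(floor_space_j) - beta * distance_j.
    """
    utilities = [alpha * math.log(s) - beta * d
                 for d, s in zip(distances, floor_space)]
    top = max(utilities)  # subtract the maximum for numerical stability
    weights = [math.exp(u - top) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]

def od_matrix(populations, distance_rows, floor_space):
    """Predicted trips from each origin zone (row) to each store (column)."""
    return [[pop * p for p in destination_probabilities(row, floor_space)]
            for pop, row in zip(populations, distance_rows)]

def market_shares(flows):
    """Share of all predicted trips attracted by each store."""
    column_totals = [sum(column) for column in zip(*flows)]
    grand_total = sum(column_totals)
    return [t / grand_total for t in column_totals]

# Hypothetical data: two zones, two equally sized stores.
populations = [1000, 500]
distance_rows = [[500.0, 2000.0], [1500.0, 800.0]]
floor_space = [2000.0, 2000.0]
flows = od_matrix(populations, distance_rows, floor_space)
shares = market_shares(flows)
```

Travel demand per zone follows from the same matrix by weighting each row of flows with the corresponding distances.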
In contrast, location-allocation models make up the core of systems based on the second
approach. Location-allocation models simultaneously optimize the choice of locations for facilities
and the allocation of consumers to those facilities. The mathematical programming approach
implies that an optimization problem can be defined by users in terms of a (single) objective
function and one or more constraints on solutions. For given candidate locations, the system
generates optimal configurations of the retail/service network and, in some cases, optional outlet
formats. Densham (1994) reviews applications for location selection. Armstrong et al. (1990) and
Densham and Rushton (1996) describe spatial DSS applications for reorganizing service delivery
systems. As Densham (1991) argues, the objective of spatial DSS (in this approach) is to provide a
'flexible problem-solving environment in which decision makers can explore a given problem,
evaluate the possible trade-off between conflicting objectives, and identify unanticipated, possibly
undesirable characteristics of the problem'.
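As an illustration of what a location-allocation model computes, the following sketch solves a small p-median problem by exhaustive enumeration: it chooses p sites from a candidate list so that total demand-weighted distance is minimal, allocating each demand zone to its nearest open site. The p-median objective is only one of many possible formulations, enumeration is practical only for tiny instances, and all names and data are hypothetical.

```python
from itertools import combinations

def p_median(demand, distance, candidates, p):
    """Choose p sites minimizing demand-weighted distance (brute force).

    distance[i][j] is the distance from demand zone i to candidate site j.
    Returns the chosen sites, the site allocated to each zone, and the cost.
    """
    best_cost, best_sites = float("inf"), None
    for sites in combinations(candidates, p):
        cost = sum(weight * min(distance[i][j] for j in sites)
                   for i, weight in enumerate(demand))
        if cost < best_cost:
            best_cost, best_sites = cost, sites
    allocation = [min(best_sites, key=lambda j: distance[i][j])
                  for i in range(len(demand))]
    return best_sites, allocation, best_cost

# Hypothetical instance: three zones that are also the candidate sites.
demand = [100, 50, 80]
distance = [[0, 4, 9],
            [4, 0, 5],
            [9, 5, 0]]
one_site = p_median(demand, distance, candidates=[0, 1, 2], p=1)
two_sites = p_median(demand, distance, candidates=[0, 1, 2], p=2)
```

Constraints on solutions, such as a maximum travel distance, could be added by skipping site combinations that violate them.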
Multi-criteria or multi-objective DSS represents a third approach aimed at providing tools for
analyzing the complex trade-off between candidate locations in choosing a suitable location for a
new facility (e.g., highway, waste dump, power plant, hospital). Multi-criteria evaluation
(MCE) refers to a set of techniques for ranking a given set of choice alternatives on a given set of
multiple, possibly conflicting, criteria. Multi-criteria or multi-objective spatial DSS is typically
applied in a group setting. It assists the group in identifying location candidates, developing lists
of criteria for evaluation, determining weights of criteria, performing sensitivity analysis and
alternative ranking (Janssen, 1991). The systems typically integrate MCE or multi-objective
programming (e.g., goal programming) with analytical and mapping tools of GIS. There are
many examples of such systems reported in the GIS literature (Fedra and Reitsma, 1990; Carver,
1991; Pereira and Duckstein, 1993; Jankowski, 1995; Lin et al., 1997; Malczewski, 1996;
Jankowski and Ewart, 1996; Crossland et al., 1995). Thill (1999) gives a review of the field. The
spatial DSS for solid waste planning described in MacDonald (1997) combines various techniques
including mathematical programming, impact analysis, and the Analytic Hierarchy Process.
Carver (1991) describes a method for integrating GIS and MCE techniques.
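The ranking step of MCE can be sketched with the simplest technique of this family, a weighted sum of normalized criterion scores. This is a minimal illustration under assumed data, not the method of any cited system; the site names, criteria and weights are hypothetical.

```python
def rank_alternatives(scores, weights, benefit):
    """Rank alternatives by a weighted sum of min-max normalized criteria.

    scores:  {alternative: [value per criterion]}
    weights: relative importance per criterion
    benefit: per criterion, True if higher is better, False if lower is better
    """
    names = list(scores)
    normalized = {name: [] for name in names}
    for k in range(len(weights)):
        column = [scores[name][k] for name in names]
        low, high = min(column), max(column)
        span = (high - low) or 1.0  # guard against a constant criterion
        for name in names:
            value = (scores[name][k] - low) / span
            normalized[name].append(value if benefit[k] else 1.0 - value)
    totals = {name: sum(w * v for w, v in zip(weights, normalized[name]))
              for name in names}
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)

# Hypothetical sites scored on cost (lower is better) and accessibility.
scores = {"site_a": [10.0, 0.9], "site_b": [6.0, 0.5], "site_c": [8.0, 0.8]}
balanced = rank_alternatives(scores, weights=[0.5, 0.5], benefit=[False, True])
cost_driven = rank_alternatives(scores, weights=[0.9, 0.1], benefit=[False, True])
```

Re-running the ranking with different weights, as in the last line, is a crude form of the sensitivity analysis mentioned above.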
The spatial DSS that we propose intends to improve the flexibility and interactive properties of
current spatial DSS. Sprague and Carlson (1982) distinguish two aspects of flexibility. First-level
flexibility is the ability of the system to adapt to a solution path preferred by the decision maker.
This is important because location problems are often ill-structured, implying that no standard
solution procedure exists. Second-level flexibility is the ability to modify the configuration of the
DSS so that it can handle a different set of problems. This is an essential feature of a generic DSS.
Existing systems perform poorly on both first and second-level flexibility. With respect to the first
level, spatial interaction/choice model (SIM) systems focus on what Densham and Armstrong
(1993) call the intuitive mode. In this mode, users can specify scenarios in terms of planned or
anticipated developments (e.g., opening a new facility, population forecasts) and the system gives
feedback in terms of impacts on criterion variables (e.g., travel demands). On the other hand, DSS
based on mathematical programming (MP) or multi-criteria evaluation (MCE) methods are
designed to support the goal-seeking mode. That is, users can specify the location problem in terms
of criterion variables (e.g., criteria weights) and the system generates optimal solutions in terms of
decision variables (i.e., optimal configurations or rankings). Where SIM-systems lack the
analytical capabilities or an appropriate user-interface for the goal-seeking mode, the MP/MCE
systems are weak in supporting the intuitive mode. The goal of our approach is to design a system
that supports both interaction modes in terms of modeling capabilities, as well as user interfaces.
With respect to second-level flexibility, SIM-systems also tend to be highly restrictive. They
offer users limited possibilities to change the specification of a shopping model (selection of
attributes), the utility function form (e.g., nested logit or MNL), and sometimes even the values of
model parameters (weights of attributes). The reason for this is that most systems are designed for
a specific application. For each new application, systems must be re-designed to suit the specific
information needs. A generic DSS requires a more fundamental solution. To be able to support
problems in both public-sector and private-sector planning, the system must cover a wide range of
planning objectives. Moreover, users should be able to choose attributes, attribute parameters,
and even the form of the spatial shopping model.
Interactive properties relate to two dimensions of the user-interface: the extent to which users
can view and manipulate relevant conditions of a problem (openness of the system) and
communicative properties. To identify potentially relevant information categories, we use a model of
the planning cycle. The model is schematically shown in Fig. 1. The boxes in this scheme
correspond to different information categories or sections of the problem domain and the arrows
represent the dependency relationships between them. The user-interface of the proposed system
allows users to interact with each of the sections to view or change conditions of the system being
planned. In particular, users should not only be able to evaluate alternative plans, but also
demographic and economic scenarios of change in a study area.
To enhance the communicative properties, the second dimension of user-interfaces, we use
dynamic-graphics techniques much in the same way as proposed by Densham and Armstrong
(1993) and Densham (1994). That is, the proposed system supports multiple representation
formats (views) of domain sections. These include map, graph and table format. The views are
dynamically linked with each other so that changes in one view lead to automatically updating
related views. For example, if the user selects a record in the table view, the system highlights the
corresponding object in the map and graph views at the same time.
interaction mode. Users can define scenarios, for example related to population or plans, and run
models to predict and analyze impacts on planning objectives. To provide second-level flexibility,
the inference model is not a fixed component of the system. Rather, users are able to construct the
model out of more elementary components that are available in the model base of the system.
Thus, users can construct a model that suits information needs, available data and preference for
model types. To provide first-level flexibility, the second major component, called Problem Solver,
supports the goal-seeking interaction mode. Users can define a problem in terms of objectives and
constraints and the system generates the optimum and near-optimum solutions. The system is
implemented as a stand-alone Windows-95 application using the object-oriented programming
tool C++-builder. The remainder of this section discusses each component in turn.
1. procedures for retrieving data from the external database (present-state, goal-state and changes
attributes);
2. procedures for arithmetic summation (compute: future = present + changes);
3. spatial shopping models (future-state-interactions attributes);
4. demand-related performance analysis models (future-C-demand attributes);
5. supply-related performance analysis models (future-C-supply attributes); and
6. performance evaluation models (discrepancy attributes).
Having selected a model, users next specify model parameters, if any, and indicate which
attributes in related sections (methods 2-6) or external databases (method 1) provide input data.
Thus, methods establish connections between mutual attributes (2-6) or between attributes and
the external database (1).
The external database consists of a collection of files specified by the user. The data files should
contain attribute information of demand locations and supply locations, and distances between
locations in table form. Geographic and image data about the study area are optional and allow
the system to display a map of the area and an image background for the map.
Location Planner does not provide an editor for editing these files. Instead, the system supports
the use of existing files, for example, generated by a GIS or other general-purpose software.
Supported formats include DBase, TransCad-table and text files for attribute and distance data,
BNA for geographical data, and bitmap for image data.
The inference engine controls the execution of methods for evaluating attributes and guards the
internal consistency of the inference model. In this context, it is important to note that defining
dependency relationships between attributes and methods constitutes a coherent model structure.
Generally, the methods that need to be executed for evaluating an attribute at any position in the
model constitute a tree structure of which the endpoints (leaves) consist of methods for reading
external data. If the engine receives a command to evaluate an attribute, it identifies and executes
the nodes (methods) of the tree starting from leaf nodes. On the other hand, if the definition of an
attribute changes, the data of dependent attributes are affected as well. The attributes that are
affected constitute a tree structure in the opposite direction, whereby leaves represent the endpoints
of reasoning chains. If an action of the user leads to a change of the definition of an attribute, the
engine identifies and resets the tree of affected attributes. Various user actions can lead to a change
of variable definitions. These include re-specifying or deleting a method, changing the selection
status of supply or demand objects, changing data files, editing attributes in present state, change
or goal sections and adding or deleting attributes. In sum, the inference model works much like a
spreadsheet where formulas need to be defined only once and are automatically executed if
needed. Besides speeding up the process of evaluation, the advantage is that users are insulated
from the technical details of the models.
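The spreadsheet-like behavior described above can be sketched as follows. This is a minimal illustration of on-demand evaluation with resetting of dependent attributes, not the actual implementation of Location Planner (which is written in C++); the class and attribute names are hypothetical.

```python
class InferenceModel:
    """Attributes defined by methods, evaluated on demand and cached."""

    def __init__(self):
        self._methods = {}  # attribute -> (function, names of input attributes)
        self._cache = {}    # attribute -> last evaluated value

    def define(self, name, func, inputs=()):
        """(Re)define an attribute and reset every attribute depending on it."""
        self._methods[name] = (func, tuple(inputs))
        self._reset(name, set())

    def _reset(self, name, seen):
        if name in seen:
            return
        seen.add(name)
        self._cache.pop(name, None)
        for other, (_, inputs) in self._methods.items():
            if name in inputs:
                self._reset(other, seen)

    def evaluate(self, name):
        """Evaluate the tree below `name`, starting from the leaf methods."""
        if name not in self._cache:
            func, inputs = self._methods[name]
            args = [self.evaluate(i) for i in inputs]
            self._cache[name] = func(*args)
        return self._cache[name]

model = InferenceModel()
model.define("present", lambda: [100, 200])   # stands in for reading a data file
model.define("changes", lambda: [10, -20])    # a scenario
model.define("future", lambda p, c: [a + b for a, b in zip(p, c)],
             inputs=("present", "changes"))
model.define("total", sum, inputs=("future",))
first = model.evaluate("total")
model.define("changes", lambda: [0, 0])       # user edits the scenario
second = model.evaluate("total")              # dependents were reset
```

Redefining "changes" clears only the cached values of "changes", "future" and "total"; "present" keeps its cached value, which is the source of the speed-up mentioned above.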
All settings defined by users, as well as the internal database, are stored in a single project file.
When users open a new project file, every section of the database is empty and holds only one
attribute. Sections can be viewed through a table, map and graph representation format. In the
table view, users can add attributes (columns or sheets) and define methods for attributes.
The table and map views are bi-directionally linked. When activated, the map window displays
demand objects as zones with centroids and supply objects as point locations. By clicking on a
map object, the record in the table window is simultaneously highlighted. Conversely, by clicking
on a table record, the corresponding map object receives focus. This functionality of the system is
considered important, as it allows users to link attribute and geographical data of objects.
Furthermore, the system automatically displays thematic information contained in the table view on
the map. When users select a column of a table, the map is refreshed by the system and the
attribute is displayed in the form of a circle diagram on the map. Thus, the spatial distribution of a
quantity, such as population or floor space, can be easily assessed. Interaction data, on the other
hand, are displayed in the form of lines connecting demand and supply locations. The width of a
line represents the relative size of the flow. Finally, when activated, the graph view displays the
selected attribute in the form of a bar diagram. This format gives a visual impression of the
distribution of objects on an attribute (e.g., population). Table-graph links are uni-directional
from table to graph. Specifically, the graph is refreshed each time the selection of a column
changes.
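The linking of views can be pictured as a small propagation scheme in which each view forwards a selection to the views it is linked to. The sketch below is only an assumption about how such links might be organized; the class and its methods are hypothetical and do not describe the actual user-interface code.

```python
class View:
    """A table, map or graph view that forwards selections along its links."""

    def __init__(self, name):
        self.name = name
        self.highlighted = None  # record currently highlighted
        self.column = None       # attribute currently displayed
        self.links = []          # views refreshed when this view changes

    def link_to(self, other):
        self.links.append(other)

    def select_record(self, record):
        self._propagate("highlighted", record, set())

    def select_column(self, column):
        self._propagate("column", column, set())

    def _propagate(self, field, value, visited):
        if self in visited:      # stop at views that were already updated
            return
        visited.add(self)
        setattr(self, field, value)
        for other in self.links:
            other._propagate(field, value, visited)

table, map_view, graph = View("table"), View("map"), View("graph")
table.link_to(map_view)
map_view.link_to(table)   # table and map: bi-directional
table.link_to(graph)      # table to graph: uni-directional
```

With these links, clicking a map object updates the table, whereas a change made in the graph view stays local because the graph has no outgoing links.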
Editing data is possible in the table view of change, future and goal sections. In Location
Planner, the content of a demand-change or supply-change section is called a 'scenario'. A
scenario can be implemented by editing in the table view or by retrieving data from the external
database. By clicking on the update button, users give a command to the inference engine to
update attributes. The discrepancy state shows consequences in terms of performance scores
relative to the goal state (if any). The Scenario Manager module allows one to store and
manage scenarios. Using the scenario manager, users can easily return to an old scenario or
evaluate new combinations of demand and supply scenarios. Problem Solver supports, in a
similar way, the management of generated solutions. Generated solutions are automatically
stored. Users can display, remove, or re-generate solutions. A solution itself is a list generally
consisting of N best solutions. The same linked table, graph and map views are available to
view solutions.
Typically, the final result of this user-system interaction is a set of alternative scenarios and
corresponding outcomes. Each scenario is based on specific assumptions about demographic and
economic developments (demand scenarios), consumer choice behavior (shopping model) and
objectives (optimization model or goal state). To facilitate political decision making, the results
together with scenario assumptions should be compiled into a comprehensive report. Location
Planner does not offer a facility for generating such a report. However, the views provide the
required material in the form of tables, graphs and maps. Moreover, each attribute is explicitly
defined in terms of the linked method specification.
4. Illustration
The case study conducted to illustrate the system considers a large-scale expansion of the major
shopping center in Veldhoven, Netherlands. The multi-purpose trip model and complementary
performance models are applied to predict the impacts of the expansion on travel demands of
consumers and market shares of shopping centers. Thus, the case study focuses on the inference
model component of Location Planner. This section discusses the definition of the study area, the
estimation of a multi-purpose trip model, and the information that can be derived from the model.
Fig. 4. Structure of the multipurpose trip model used in the Veldhoven case study.
was multi-purpose in terms of the two-good classification. The destination choice set for each trip
was defined as the set of centers known to the individual, where stores required by the trip type
under concern were available (daily, non-daily or both). Each destination alternative was
described in terms of both travel distance from the home location and a set of attributes of the
shopping center. Distance was measured as the length in meters of the shortest route across the
road network. Respondents were assigned to the node of the network that was closest to the
five-digit zip code of their home address. Similarly, the shopping centers were assigned to the nodes
closest to the centroid of the shopping area (e.g., a street). A major road to the network linked the
relevant shopping areas in Eindhoven. The GIS package TransCAD (Caliper Corporation, 1996)
was used to digitize the geographic data and to generate a demand × supply distance matrix using
a shortest path routine. The attributes used to describe shopping centers included the total floor
space of stores in the daily and the non-daily sectors respectively, and a binary variable
representing the presence of a low-price level image of the center. Center atmosphere and parking
facilities are generally also influential factors, but were not included because the data were not
available.
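A demand × supply distance matrix of the kind described above can be derived from a road network with any standard shortest path routine. The sketch below uses Dijkstra's algorithm on a small hypothetical network; it does not reproduce the TransCAD routine used in the study, and the node names and link lengths are invented for illustration.

```python
import heapq

def shortest_distances(graph, source):
    """Dijkstra's algorithm; graph maps node -> {neighbor: length in meters}."""
    dist = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        d, node = heapq.heappop(queue)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry, a shorter route was already found
        for neighbor, length in graph.get(node, {}).items():
            candidate = d + length
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return dist

def distance_matrix(graph, demand_nodes, supply_nodes):
    """Rows are demand nodes, columns are supply nodes."""
    matrix = []
    for origin in demand_nodes:
        dist = shortest_distances(graph, origin)
        matrix.append([dist.get(dest, float("inf")) for dest in supply_nodes])
    return matrix

# Hypothetical undirected network: zip code nodes a, b and center nodes c, d.
roads = {
    "a": {"b": 400.0, "c": 1500.0},
    "b": {"a": 400.0, "c": 600.0, "d": 2000.0},
    "c": {"a": 1500.0, "b": 600.0, "d": 700.0},
    "d": {"b": 2000.0, "c": 700.0},
}
matrix = distance_matrix(roads, demand_nodes=["a", "b"], supply_nodes=["c", "d"])
```

In this example the shortest route from a to c runs through b (1000 m) rather than along the direct 1500 m link.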
The software HieLow was used for full-information estimation of the hierarchical choice
model (Bierlaire, 1995). The rho-bar-squared value of 0.163 indicates a satisfactory goodness-of-fit
of the model considering the limited set of variables used to describe shopping centers. All
parameter values were statistically significant and had values as expected. The logsum
parameters have an interesting interpretation for impact analysis. For each trip type, a logsum
parameter was estimated. The estimated values of these parameters indicate the extent to which
the choice of a trip type is influenced by the attractiveness of locations. If this value is 0, trip
type choice is not sensitive to supply. If this value is 1, the supply elasticity is maximal. The
values found in this case study are 0.34 (daily), 0.64 (non-daily) and 0.53 (multi-purpose),
suggesting that the higher-order trips are more sensitive to variation in supply than the
lower-order trips.
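The role of the logsum parameters can be made concrete with a small sketch of the upper level of a nested logit model. It assumes one common specification in which the utility of a trip type equals a constant plus the logsum parameter times the log of the summed exponentiated destination utilities; the exact specification of the estimated model is not reproduced here, and all utilities and constants below are hypothetical. Only the three parameter values are taken from the text.

```python
import math

def logsum(utilities):
    """ln(sum(exp(u))) computed in a numerically stable way."""
    top = max(utilities)
    return top + math.log(sum(math.exp(u - top) for u in utilities))

def trip_type_probabilities(destination_utilities, theta, constants):
    """Upper-level choice among trip types.

    Assumed utility of trip type t: constants[t] + theta[t] * logsum_t,
    where logsum_t summarizes the attractiveness of all destinations for t.
    """
    upper = {t: constants[t] + theta[t] * logsum(destination_utilities[t])
             for t in theta}
    top = max(upper.values())
    weights = {t: math.exp(u - top) for t, u in upper.items()}
    total = sum(weights.values())
    return {t: w / total for t, w in weights.items()}

theta = {"daily": 0.34, "non_daily": 0.64, "multi": 0.53}    # reported estimates
constants = {"daily": 0.0, "non_daily": 0.0, "multi": 0.0}   # hypothetical
before = {"daily": [1.0, 0.5], "non_daily": [0.8, 0.2], "multi": [0.6, 0.1]}
after = {"daily": [1.0, 0.5], "non_daily": [0.8, 0.2], "multi": [1.2, 0.1]}
p_before = trip_type_probabilities(before, theta, constants)
p_after = trip_type_probabilities(after, theta, constants)
```

With all logsum parameters set to 0 the two calls return identical probabilities, which corresponds to trip type choice being insensitive to supply.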
The model was implemented in Location Planner using the system facilities for model
construction. For predicting shopping trips, the same travel distance matrix as in the estimation stage
represented the transportation system. The travel distance matrix was based on a subdivision of
the area into 89 zip code areas. However, demographic data were available only at a more
aggregated level of 14 districts. The demographic data were disaggregated to the zip code level by
assuming that populations within districts are evenly distributed across the zip code areas.
Destination choice-sets per trip type and per zone were defined using deterministic rules available for
that purpose in Location Planner. Specifically, for each trip type the choice-set was defined as the
centers offering the required store types (daily or non-daily or both). Furthermore, an additional
rule was used for single-purpose daily good trips to further reduce choice-sets to centers lying
within a distance of 5000 meters from origin locations.
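The two data preparation rules described above, even disaggregation of district population and deterministic choice-set definition, can be sketched as follows. The function names, zone labels and center data are hypothetical; only the population of district 1 and the 5000 meter cut-off come from the case study.

```python
def disaggregate_evenly(district_population, zones_by_district):
    """Spread each district's population evenly over its zip code zones."""
    zone_population = {}
    for district, zones in zones_by_district.items():
        share = district_population[district] / len(zones)
        for zone in zones:
            zone_population[zone] = share
    return zone_population

def choice_set(centers, required, distances=None, max_distance=None):
    """Centers offering all required store types, optionally within a cut-off.

    centers:  {center: set of sectors offered, e.g. {"daily", "non_daily"}}
    required: set of sectors needed for the trip type
    """
    selected = []
    for name, sectors in centers.items():
        if not required <= sectors:
            continue  # center lacks a required store type
        if max_distance is not None and distances[name] > max_distance:
            continue  # beyond the distance cut-off
        selected.append(name)
    return selected

# District 1 (population 1560) split over three hypothetical zip code zones.
zones = disaggregate_evenly({1: 1560}, {1: ["z1", "z2", "z3"]})

centers = {"city": {"daily", "non_daily"}, "corner": {"daily"}, "far": {"daily"}}
distances = {"city": 1200.0, "corner": 300.0, "far": 7000.0}
daily_set = choice_set(centers, {"daily"}, distances, max_distance=5000.0)
multi_set = choice_set(centers, {"daily", "non_daily"})
```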
The multi-purpose trip model was used to predict trips in both the before and after situation.
Since the expansion of the City Center was the only development that had taken place, differences
found could be interpreted as impacts of the expansion. Impacts on the choice of multi-purpose
trips, travel demands of consumers, and market shares of centers were considered using the
demand and supply related performance models available in Location Planner.
Table 1
Travel demand after expansion of the City Center expressed as a percentage of travel demand before expansion

District   Population   Total frequency   Average trip length   Total travel
1          1560         99.6              99.0                  98.5
2          210          99.5              98.7                  98.3
3          3225         99.5              97.9                  97.6
4          160          99.6              99.6                  99.1
5          3900         99.5              98.5                  98.0
6          4690         99.6              99.7                  99.3
7          3920         99.5              96.4                  95.9
8          2430         99.5              96.4                  95.9
9          5060         99.6              96.3                  95.9
10         560          99.6              96.3                  95.9
11         1500         99.5              98.1                  97.6
12         4800         99.6              98.6                  98.1
13         2975         99.6              98.1                  97.6
14         5985         99.5              94.8                  94.3
Average                 99.5              97.5                  97.1
individuals keep purchase frequencies constant and make multi-purpose trips in order to reduce
the required number of trips.
The model was run for both the before and after situation. The output generated by Location
Planner describes for each zone (i) the predicted trip frequency per capita, (ii) the average trip
length, and (iii) the total distance traveled. Table 1 shows the after situation when the before
predictions are set to 100. Average trip length is calculated as a weighted sum of trip lengths using
probabilities of trip types as weights. Then, total distance traveled is simply calculated as the
product of trip frequency, trip length and population weights. As the figures indicate, the
multi-purpose model predicts a decrease in total travel across residential zones of 2.9%. The model
predicts for each zone a small decrease in total trip frequency (on average 0.5%). Assuming
that the sum of purchases of daily and non-daily goods remains constant, the decrease is
attributable to the predicted increase in the share of multi-purpose trips (on average 3.6%). Also,
the predicted average trip length has decreased for each zone (on average 2.5%). A closer look
at the destination choice probabilities reveals that this decrease is the result of two counter-acting
effects. First, the City Center tends to attract trips which in the before situation went to local
district centers, inducing more travel inside Veldhoven. At the same time, the increased
competitive strength of the center is responsible for a decrease of relatively long trips to the larger centers
in Eindhoven. The net result of these two opposite effects is a decrease of average trip length.
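The relation between the columns of Table 1 can be checked with simple arithmetic. Because population is the same in the before and after situation, it cancels out of the index, so the total travel index of a district should be close to the product of its frequency and trip length indices divided by 100. The sketch below applies this to four rows of Table 1; the function name is hypothetical, and small deviations are to be expected because the published indices are rounded.

```python
# (frequency index, trip length index, reported total travel index), Table 1
rows = {
    1: (99.6, 99.0, 98.5),
    5: (99.5, 98.5, 98.0),
    7: (99.5, 96.4, 95.9),
    14: (99.5, 94.8, 94.3),
}

def total_travel_index(frequency_index, length_index):
    """Index of total travel when both inputs are indices with before = 100."""
    return frequency_index * length_index / 100.0

computed = {district: round(total_travel_index(f, l), 1)
            for district, (f, l, _) in rows.items()}
deviation = {district: abs(computed[district] - rows[district][2])
             for district in rows}
```

For district 14, for example, 99.5 × 94.8 / 100 is approximately 94.3, the value reported in the table.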
Table 2
Attributes and market shares after expansion of the City Center (a)

Center              Size daily   Size non-daily   Low price   Market share   Market share
                    (m2)         (m2)             image       daily (%)      non-daily (%)
1  City Center      8000         10,800           0           137.9          131.6
2  Burg van Hoof    1198         1745             0           93.4           94.9
3  Kromstraat       1534         4148             0           94.1           95.1
4  Heikant          775          30               0           93.8           95.0
5  't Look          610          0                0           93.8           -
6  Zonderwijk       1340         230              0           93.7           94.9
7  Mariaplein       115          745              0           93.3           95.0
8  Zeelst           657          1146             0           93.2           94.9
9  Oerle            100          0                0           93.3           -
10 EH inner city    4273         88,273           0           92.6           94.6
11 EH Wonsoel       7780         12139            0           92.5           94.0
12 De Hurk          1225         3163             1           92.8           95.0
13 Kast. Plein      1653         2318             0           93.2           94.9
14 Trudoplein       207          2189             0           93.0           95.0

(a) Market shares are expressed as a percentage of the market share before expansion.
types and users can specify the relative weights of trip type. In the present case, the relative weight
of multi-purpose trips was set to 0.5 for both daily and non-daily goods, assuming that the amount
of expenditure for each good is twice as much on single-purpose trips as on multi-purpose trips.
Furthermore, users can specify for each shopping center the amount of expenditure attracted from
outside the study area. Because these data were not available in this case, inflows were assumed to
be zero. Hence, the calculated market shares cannot be readily interpreted in terms of turnover, but
they do give an indication of the competitive strength of centers in attracting consumers from
within the study area, which was the primary concern in the present study.
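The effect of trip-type weights on market shares can be illustrated with a minimal sketch. It assumes that the market share of a center in a sector is its share of weighted trips, with single-purpose trips weighted 1.0 and multi-purpose trips weighted 0.5 as in the case study, and with inflows from outside the study area set to zero. The function name and the trip counts are hypothetical, and the actual performance model in Location Planner may differ in detail.

```python
def sector_market_shares(trips, weights):
    """Market share per center from predicted trips and trip-type weights.

    trips:   {trip_type: {center: predicted number of trips}}
    weights: {trip_type: relative expenditure per trip}
    """
    expenditure = {}
    for trip_type, by_center in trips.items():
        for center, count in by_center.items():
            expenditure[center] = (expenditure.get(center, 0.0)
                                   + weights[trip_type] * count)
    total = sum(expenditure.values())
    return {center: value / total for center, value in expenditure.items()}

# Hypothetical daily-sector trips to two centers.
trips = {"single": {"A": 600, "B": 400}, "multi": {"A": 400, "B": 100}}
shares = sector_market_shares(trips, {"single": 1.0, "multi": 0.5})
unweighted = sector_market_shares(trips, {"single": 1.0, "multi": 1.0})
```

In this example center A attracts most of the multi-purpose trips, so halving their weight lowers its share from about 0.67 to 0.64.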
Predicted market shares for the after situation when the before situation is set to 100 are shown
in Table 2. As expected, the market share of the City Center has increased considerably in both
the daily (38%) and non-daily sector (32%). The decrease in market share of competing centers is
distributed almost evenly across the centers. Impacts range between 6.9% and 7.5% in the daily
sector and between 4.9% and 6.0% in the non-daily sector. The competition with the major center
of Eindhoven is also of interest. The loss of Veldhoven market share in Eindhoven inner city is
7.4% (daily sector) and 5.4% (non-daily sector).
4.4. Discussion
As the results indicate, the expansion of the center has reduced the total distance traveled by the
population for shopping purposes (by approximately 3%). Multi-purpose trips have increased
somewhat, but the effect is largely due to a substitution of long trips to the Eindhoven center by
shorter trips to the Veldhoven center. The market share of the City Center has increased
considerably (32-38%), but to a lesser extent than the increase in floor space (approximately 50%).
The model did not take into account possible impacts on transport mode choice for the shopping
trips. However, the same model structure could be used to define a nested model of transport-mode
and location choice. In a mode-location-nested model, alternative destinations of trips are
nested under mode choices (for example, car versus other modes), so that the impact of changes at
destinations on mode choice can be predicted.
In this case study, the model was estimated on consumer data after the change had taken place.
However, in most studies one would estimate the model on the before situation and use it to
predict possible impacts in a planning stage. Several scenarios might be considered related to
anticipated population developments or compensatory actions to reduce negative impacts, if any.
For example, if parking facilities were included as additional attributes in the location utility
function, the effects of simultaneous parking policies could be evaluated. The facilities offered by
Location Planner may reduce the threshold for formulating scenarios. The inference engine
automatically runs the entire inference model for evaluating impacts of each scenario and Scenario
Manager can be used for generating and managing a scenario base.
The present case study focused on how a multi-purpose trip model can be specified, estimated
and applied for impact analysis using Location Planner. The two-good system assumed in this
case suits the data-availability and information needs of local governments. At least in Dutch
retail planning, it is usual to analyze and collect floor space data at the level of daily and non-daily
goods. Retail plans are also normally formulated at this level. The case study highlighted the extra
information that can be derived from the multi-purpose model. Besides an origin-destination
matrix, the model predicts trip-type probabilities and trip frequencies dependent on the choice of
multi-purpose trips. Moreover, the model is sensitive to (i) settings of the relative weight of trip
types in predicting market shares, (ii) assumptions about the trip-generation effects of
multi-purpose trips, (iii) purpose-specific attractiveness of centers, and (iv) supply elasticity of trip
choice.
5. Conclusions and discussion
This article described and illustrated with a case study the spatial decision support system,
Location Planner, which we have developed. The primary objective of Location Planner is to
provide a system that is easy to use and able to support a large variety of problems in retail/service
planning. The system is relevant in addressing both issues of transportation planning and location
decisions. It incorporates a wide range of spatial models, including spatial interaction/choice
models, system performance models and location-allocation models (Fig. 5).
The system can be evaluated against general objectives of a decision support system. First,
adaptability of the system to a wide range of problems is a strong point of the system. The
structure of the inference model is not fixed but can be defined by users. Because the model base
includes a wide range of model variants within each category, a wide range of problems can be
accommodated. Once the model structure is defined, the system has the flexibility to allow the
choice between different modes of user-system interaction. The intuitive mode supports impact
analysis of plans or market developments. The goal-seeking mode supports model-based
optimization of the spatial configuration of retail or service networks.
Second, the system is strong on visual and interactive properties. The use of dynamic variable
definitions strongly reduces the length of feedback loops. Users can manipulate a wide range of
conditions and need only to click on an update button to see the implications. Using multiple
active and linked views on data sections enhances the interactive properties. Users can view the
same data set in table, map and graph views. The views are linked so that the selection of objects
or attributes in one view is simultaneously implemented in linked views as well.
Third, it should be noted that the system has a limited focus. Not all stages of plan decision
making are supported. The system emphasizes the stages of impact analysis and plan generation.
To also support the preceding stage of monitoring developments and identifying problems, the
system must be extended with a time dimension that makes the representation and analysis of time
series data possible. The identification of candidate locations and evaluation of plan alternatives
also receive limited attention.
Given the emphasis on impact analysis and plan generation, general-purpose GIS and MCE
software are considered complementary. Standard GIS tools support the elementary forms of
spatial analysis required for identifying candidate locations. Commercially available MCE software,
such as Expert Choice (1995), can be used in addition for identifying criteria, deriving
criterion weights, and ranking alternatives on the criteria. Hence, Location Planner is explicitly
meant to be complementary to existing GIS and MCE (group) software. Communication is
realized through data files.
Acknowledgements
References
Arentze, T.A., Borgers, A.W.J., Timmermans, H.J.P., 1994a. Geographical information systems and the measurement of accessibility in the context of multipurpose travel: a new approach. Geographical Systems 1, 87-102.
Arentze, T.A., Borgers, A.W.J., Timmermans, H.J.P., 1994b. Multistop-based measurements of accessibility in a GIS environment. The International Journal of Geographical Information Systems 8, 343-356.
Arentze, T., Oppewal, H., Timmermans, H.J.P., 1997. A multipurpose destination choice model for shopping trips: some empirical results. In: Proceedings of the Paper presented at the Fourth Recent Advances in Retailing and Services Science Conference, 30 June-3 July, Scottsdale, Arizona.
Armstrong, M.P., De, S., Densham, P.J., Lolonis, P., Rushton, G., Tewari, V.K., 1990. A knowledge-based approach for supporting locational decisionmaking. Environment and Planning B: Planning and Design 17, 341-364.
Bierlaire, M., 1995. A robust algorithm for the simultaneous estimation of hierarchical logit models. GRT Report 95/3, Department of Mathematics, FUNDP, Namur, Belgium.
Birkin, M., 1994. Understanding retail interaction patterns: the case of the missing performance indicators. In: Bertuglia, C.S., Clarke, G.P., Wilson, A.G. (Eds.), Modelling the City: Performance, Policy and Planning. Routledge, London, UK, pp. 121-150.
Birkin, M., Clarke, G., Clarke, M., Wilson, A., 1996. Intelligent GIS: Location Decisions and Strategic Planning. Geoinformation International, Cambridge, UK.
Borgers, A.W.J., Timmermans, H.J.P., 1991. A decision support and expert system for retail planning. Computers Environment and Urban Systems 15, 179-188.
Breheny, M.J., 1978. The measurement of spatial opportunity in strategic planning. Regional Studies 12, 463-479.
Caliper Corporation, 1996. TransCAD: Transportation GIS software: User's Guide Version 3.0. Caliper Corporation, Newton, MA.
Carver, S.J., 1991. Integrating multicriteria evaluation with geographical information systems. International Journal of Geographical Information Systems 5, 321-339.
Clarke, C., Clarke, M., 1995. The development and benefits of customized spatial decision support systems. In: Longley, P., Clarke, G. (Eds.), GIS for Business and Service Planning. Geoinformation International, Cambridge, UK, pp. 227-254.
Clarke, G.P., Wilson, A.G., 1994. A new geography of performance indicators for urban planning. In: Bertuglia, C.S., Clarke, G.P., Wilson, A.G. (Eds.), Modelling the City: Performance, Policy and Planning. Routledge, London, UK, pp. 55-81.
Crossland, M.D., Wynne, B.E., Perkins, W.C., 1995. Spatial decision support systems: an overview of technology and a test of efficacy. Decision Support Systems 14, 219-235.
Densham, P.J., 1991. Spatial decision support systems. In: Maguire, D.J., Goodchild, M.F., Rhind, D.W. (Eds.), Geographical Information Systems: Principles. Wiley, New York, pp. 403-412.
Densham, P.J., 1994. Integrating GIS and spatial modelling: visual interactive modelling and location selection. Geographical Systems 1, 203-221.
Densham, P.J., Armstrong, M.P., 1993. Supporting visual interactive locational analysis using multiple abstracted topological structures. In: Proceedings of AutoCarto 11, American Congress on Surveying and Mapping, Bethesda, MD, pp. 2-22.
Densham, P.J., Rushton, G., 1996. Providing spatial decision support for rural public service facilities that require a minimum work load. Environment and Planning B: Planning and Design 23, 553-574.
Expert Choice, 1995. Decision Support Software: User Manual. Expert Choice, Pittsburgh, Pennsylvania, US.
Fedra, K., Reitsma, R.F., 1990. Decision support and geographic information systems. In: Scholten, J.J., Stillwell, J.C.H. (Eds.), Geographic Information Systems for Urban and Regional Planning. Kluwer, Dordrecht, Netherlands, pp. 177-188.
Ghosh, A., McLafferty, S.L., 1987. Location Strategies for Retail and Service Firms. Lexington Books, Massachusetts.
Grothe, M., Scholten, H.J., 1992. Modelling catchment areas: towards the development of spatial decision support systems for facility location problems. In: Harts, J.J., Ottens, H.F.L., Scholten, H.J. (Eds.), Proceedings of the Second European Conference on Geographical Information Systems 2. EGIS Foundation, Faculty of Geographical Sciences, Utrecht, Netherlands, pp. 978-987.
Jankowski, P., 1995. Integrating geographical information systems and multiple criteria decision-making methods. International Journal of Geographical Information Systems 9, 251-273.
Jankowski, P., Ewart, G., 1996. Spatial decision support system for health practitioners: selecting a location for rural health. Geographical Systems 3, 279-299.
Janssen, R., 1991. Multiobjective decision support for environmental problems. Dissertation, Free University, Amsterdam, Netherlands.
Keenan, B., 1998. Spatial decision support systems for vehicle routing. Decision Support Systems 22, 65-71.
Kohsaka, H., 1993. A monitoring and locational decision support system for retail activity. Environment and Planning A 25, 197-211.
Lin, H., Wan, Q., Li, X., Chan, J., Kong, Y., 1997. GIS-based multicriteria evaluation for investment environments. Environment and Planning B: Planning and Design 24, 403-414.
Macdonald, M., 1997. A spatial decision support system for collaborative solid waste planning. In: Craglia, M., Couclelis, H. (Eds.), Geographic Information Research: Bridging the Atlantic. Taylor & Francis, London, UK, pp. 510-522.
Malczewski, J., 1996. A GIS-based approach to multiple criteria group decision-making. International Journal of Geographical Information Systems 10, 955-997.
Maniezzo, I., Mendes, I., Paruccini, M., 1998. Decision support for siting problems. Decision Support Systems 23, 273-284.
Pereira, J.M.C., Duckstein, L., 1993. A multiple criteria decision-making approach to GIS-based land suitability evaluation. International Journal of Geographical Information Systems 7, 407-424.
Roy, J.R., Anderson, M., 1988. Assessing impacts of retail development and redevelopment. In: Taylor, M.A.P., Sharpe, R. (Eds.), Desktop Planning: Microcomputer Applications for Infrastructure & Services Planning & Management. Newton Hargreen Publishing Company, Melbourne, Australia, pp. 172-179.
Sprague, R.H., 1989. A framework for the development of decision support systems. In: Sprague, R.H., Watson, H.J. (Eds.), Decision Support Systems: Putting Theory Into Practice. Prentice-Hall, London, pp. 9-35.
Sprague Jr., R.H., Carlson, J.E.D., 1982. Building Effective Decision Support Systems. Prentice-Hall, New Jersey, US.
Thill, J.C., 1999. Spatial Multicriteria Decision-Making and Analysis: A Geographic Information Sciences Approach. Ashgate, Aldershot, UK.
Abstract
Path queries over transportation networks are operations required by many Geographic Information
Systems applications. Such networks, typically modeled as graphs composed of nodes and links and
represented as link relations, can be very large and hence often need to be stored on secondary storage devices.
Path query computation over such large persistent networks amounts to high I/O costs due to having to
repeatedly bring in links from the link relation from secondary storage into the main memory buffer for
processing. This paper is the first to present a comparative experimental evaluation of alternative graph
clustering solutions in order to show their effectiveness in path query processing over transportation
networks. Clustering optimization is attractive because it does not incur any run-time cost, requires no
auxiliary data structures, and is complementary to many of the existing solutions on path query processing. In
this paper, we develop a novel clustering technique, called spatial partition clustering (SPC), that exploits
unique properties of transportation networks such as spatial coordinates and high locality. We identify
other promising candidates for clustering optimizations from the literature, such as two-way partitioning
and approximate topological clustering. We fine-tune them to optimize their I/O behavior for path query
processing. Our experimental evaluation of the performance of these graph clustering techniques using an
actual city road network as well as randomly generated graphs considers variations in parameters such as
memory buffer size, length of the paths, locality, and out-degree. Our experimental results are the
foundation for establishing guidelines to select the best clustering technique based on the type of networks. We
* This work was supported in part by the University of Michigan ITS Research Center of Excellence grant (DTFH61-93-X-00017-Sub) sponsored by the US Department of Transportation and by the Michigan Department of Transportation. N. Jing was supported in part by the State Education Commission of the People's Republic of China.
Corresponding author. Tel: +1-914-784-7523.
E-mail addresses: [email protected] (Y.-W. Huang), [email protected] (N. Jing), [email protected] (E.A. Rundensteiner).
1 This work was performed while the author was a Ph.D. student at the University of Michigan.
2 This work was performed while the author was visiting the University of Michigan.
3 This work was performed while the author was a faculty member of the University of Michigan.
0968-090X/00/$ - see front matter © 2000 Elsevier Science Ltd. All rights reserved.
PII: S0968-090X(00)00049-8
Y.-W. Huang et al. / Transportation Research Part C 8 (2000) 381-408
find that our SPC performs the best for the highly interconnected city map; the hybrid approach for
random graphs with high locality; and the two-way partitioning based on link weights for random graphs
with no locality. © 2000 Elsevier Science Ltd. All rights reserved.
Keywords: Path query processing; Transportation networks; Spatial clustering; Clustering optimization; Geographic
information systems
1. Introduction
intersections and road segments are stored in two separate structures, called the node table and the
link table. Each element in such a table is referred to as a tuple of the table. The attributes that
describe a node tuple may include its x- and y-coordinates, the connecting road segments (incoming
and outgoing), the traffic control configuration (traffic light, stop sign, etc.), points of
interest, and so on. A link itself is identified by its origin and destination nodes. Additional
attributes for describing each link of a road network include, for example, the number of lanes,
maximum speed, length, up-to-date link travel speed, and so on. Each node and link tuple
can therefore be very large, up to hundreds of bytes in length.
Transportation networks are considered stable graphs for the purpose of this paper, since the
addition and the removal of intersections or roads occur only very infrequently in practice. The
cost measurement data used for path query computation may however be either stable or unstable
depending on the attributes. The up-to-date estimated link traversal time, for example, may
depend on changing traffic conditions, and is therefore unstable because it needs to be updated as
soon as the traffic changes occur. Link distance and the geographic coordinates of nodes on the
network, on the other hand, are considered stable.
This paper investigates the optimization of path query processing based on graph clustering
techniques. To compute paths for path queries such as those previously listed (Q1-Q3), we assume
that popular graph-traversal search algorithms such as the Dijkstra, A*, Breadth-First Search, and
Depth-First Search algorithms or any of their variants are used. They search for paths by
traversing from one node to another through their respective connecting link. Because path search
computation is recursive in nature, searching for a path means recursively accessing links from the link
table.
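The following minimal sketch illustrates how a graph-traversal search repeatedly accesses the link table. It uses Dijkstra's algorithm over an in-memory list of `(origin, destination, cost)` tuples; this tuple layout and the function names are assumptions made for illustration, not the storage format used in the paper.

```python
import heapq
from collections import defaultdict

def build_link_index(link_table):
    """Group link tuples by origin node so each expansion fetches one group."""
    by_origin = defaultdict(list)
    for origin, dest, cost in link_table:
        by_origin[origin].append((dest, cost))
    return by_origin

def dijkstra(link_table, source, target):
    """Return (shortest cost, list of expanded nodes in expansion order)."""
    by_origin = build_link_index(link_table)
    dist = {source: 0.0}
    heap = [(0.0, source)]
    done = set()
    expanded = []
    while heap:
        d, node = heapq.heappop(heap)
        if node in done:
            continue  # stale heap entry
        done.add(node)
        expanded.append(node)
        if node == target:
            return d, expanded
        # Node expansion: every outgoing link of `node` is read from the link table.
        for dest, cost in by_origin[node]:
            nd = d + cost
            if nd < dist.get(dest, float("inf")):
                dist[dest] = nd
                heapq.heappush(heap, (nd, dest))
    return float("inf"), expanded
```

Each entry in the returned expansion order corresponds to one fetch of that node's outgoing links; when the link table lives on disk, the order of these fetches determines which pages must be in the buffer.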
However, since the size of the link table is often larger than the capacity of the main memory
buffer of a given GIS system, the link table may need to be stored on a secondary storage device,
typically on disk. While state-of-the-art database engines may attempt to cache the link table into
main memory during path evaluation, this will generally not be feasible due to size constraints. In
this case, many tuples (links) in the link table may need to be retrieved over and over again from
secondary storage and placed into the main memory buffer for evaluation. Given that such I/O
operations on most modern computers are typically several hundred times more expensive than CPU
operations, the I/O costs are the dominant factor of path computation costs.
The high processing costs are thus incurred by the recursive nature of the graph traversal
component of path query computation. Resolving embedded constraints may further increase I/O
costs significantly. For example, in a related effort (Huang et al., 1998, 1997,a,b,d,e), we found
that processing spatial constraints (see Q3 path query) is very I/O intensive. Thus such constraint
resolution competes with the path finding component of the search process for computational
resources such as the buffer space. This further motivates our research presented in this paper on
optimizing the path computation process by reducing I/O activities.
Data is commonly not transferred between secondary storage and main memory one tuple at a
time, but rather at the granularity of one or more buffer pages, each possibly containing many
tuples. Hence, one important performance consideration studied by the database community is
how best to place tuples onto disk pages so as to minimize the number of required I/O operations.
This is done by ensuring that the tuples brought into memory on one disk page are ideally all made
use of while the page is in the buffer. This optimization strategy of grouping data onto pages is
commonly referred to as clustering.
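The effect of clustering on I/O can be illustrated with a small simulation that counts page misses for a given sequence of tuple accesses and a given tuple-to-page placement. This is only a sketch: the least-recently-used (LRU) replacement policy and the names `count_page_misses`, `page_of` and `buffer_pages` are assumptions made here, since the text does not specify a buffer replacement policy.

```python
from collections import OrderedDict

def count_page_misses(access_sequence, page_of, buffer_pages):
    """Count page misses under an assumed LRU buffer of `buffer_pages` pages.

    access_sequence: tuple ids in the order the search algorithm touches them
    page_of: {tuple id: id of the disk page that stores the tuple}
    """
    buffer = OrderedDict()  # page id -> True, ordered from least to most recently used
    misses = 0
    for tuple_id in access_sequence:
        page = page_of[tuple_id]
        if page in buffer:
            buffer.move_to_end(page)        # hit: mark page as most recently used
        else:
            misses += 1                     # miss: page must be read from disk
            buffer[page] = True
            if len(buffer) > buffer_pages:
                buffer.popitem(last=False)  # evict the least recently used page
    return misses
```

For the access sequence `[0, 1, 2, 3, 0, 1, 2, 3]` and a one-page buffer, placing tuples 0 and 1 on one page and tuples 2 and 3 on another gives 4 misses, whereas interleaving them (0 and 2 together, 1 and 3 together) gives 8.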
The purpose of this paper is to demonstrate that clustering optimization for path query
computation can be effective for many types of transportation networks. Clustering is attractive
because it does not incur any run-time cost, nor does it require any auxiliary data structure that
demands buffer space. Because transportation networks are stable graphs, clustering is a one-time
a priori cost not affecting actual path processing. Most importantly, clustering is at a level lower
than many other path query solutions that focus on auxiliary access structures or on algorithmic
techniques; therefore, results emerging from the comparative evaluation of our clustering research
can be deployed by such other solutions that do not already employ specific link clustering
(Agrawal and Jagadish, 1988, 1989; Bancilhon and Ramakrishnan, 1986; Zhao and Zaki, 1994).
Our work is thus complementary to much of the existing work on path finding and could be
exploited to further optimize such techniques.
1.3. Contributions
network considered includ e parameters suc h as the size of the graph, th e average out-degree o r
the locality. 4
While a preliminary version of this work appears in an earlier conference paper (Huang et al.,
1996b), this journal paper differs from it in many respects. First, we propose two additional new
graph clustering techniques that are extensions of the clustering techniques presented in
(Huang et al., 1996b), namely the hybrid SPC technique and the link weight based partitioning
technique. Second, we include the experimental results of the new extensions in the performance
evaluation section of this paper. Third, this paper presents additional types of experiments that
provide new insights into the behavior of the proposed clustering techniques. Such experiments, not
available in the previous report, are for example the path finding experiments based on paths of
different lengths and networks of various average out-degrees. These new types of experiments
have been instrumented for all clustering techniques, both the ones introduced in Huang et al.
(1996b) as well as the new optimizations. Fourth, because the performance of the new extensions
can be shown to lead to further improvement over the original techniques, the conclusions for this
paper have been revised to reflect the new results. Lastly, this paper is written with more examples
and illustrations for better understanding and accessibility of the material.
2. Related work
There are many recent research efforts reported in the literature that focus on minimizing the
I/O costs of path computation in a database setting that assumes a fixed-size main memory I/O
buffer. Most of such research has proposed solutions to solve recursive query problems for general
databases that focused on pure transitive closure computation (Agrawal et al., 1998; Agrawal and
Jagadish, 1990; Bancilhon, 1985; Ebert, 1981; Ioannidis, 1986; Ioannidis and Ramakrishnan, 1988;
Ioannidis et al., 1993; Schmitz, 1983). In our work, rather than aiming for generality, we now take
an application-driven stance by proposing different disk page clustering algorithms for optimizing
path query processing for GIS types of applications and then experimentally evaluating their
relative advantages and disadvantages.
Two unresolved problems arise when applying transitive closure pre-computation techniques to
path query processing for transportation networks. First, a single transitive closure computation
cannot take different embedded constraints into account. For example, a transitive closure
computed for path query Q1 cannot be used to answer path query Q3. To answer all path queries
with a large set of different embedded constraints, we may need to compute numerous transitive
closures, each based on a unique embedded constraint. Clearly, this is not feasible in practice.
Second, some link weights (cost measurements) may be unstable and can change very
frequently. In order for the transitive closure computed based on such cost measurements to reflect
the most up-to-date cost, re-computation may need to be conducted very frequently. However,
performance results in Agrawal et al. (1998) and Ioannidis et al. (1993) have shown that their
techniques are not efficient in computing the shortest path transitive closure for graphs with cycles
4 We define that in a graph of high locality, the two end nodes of most links are located close together geographically,
whereas for graphs of no locality, such restriction does not apply.
We propose to cluster link tuples from the link table based on the spatial proximity of their
origin nodes, that is, to group tuples of the link table into disk pages and then to transfer the link
relation between secondary storage and main memory in the granularity of these pages. We call
this Spatial Partition Clustering, or SPC for short. To understand why SPC can be effective for
transportation networks, we describe the unique characteristics of (road) networks:
• Road networks are relatively sparse and have uniform fanout, typically between 2 and 5.
• Road networks are strongly inter-connected, with each node typically reachable from nearby
nodes by traversing only a few links.
• Road networks consist of mostly short links in comparison to the size of the underlying spatial
region. In other words, most road links span a short distance from one intersection to the
neighboring intersection.
Graph-traversal search algorithms conduct node expansions by traversing links. Because most
road links are short, these algorithms therefore exhibit high expansion locality on transportation
networks. Furthermore, page sizes in modern databases can be quite large. Therefore many link
tuples in the link table can be stored within one page. Because road transportation networks are
sparse with low fanout, multiple groups of links with the same origin can be stored within one page.
We call them same-origin-link (SOL) groups. For example, with a 4 KB page size and a link tuple size
of 128 bytes, 32 links can be stored within one page. For a transportation network with an average
fanout of 3, roughly 11 SOL groups can be clustered in one page. This means that there are roughly
11 different nodes in each page that could potentially be expanded by the search algorithm.
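The arithmetic in this example can be reproduced with a short sketch. The function names are hypothetical, and the calculation assumes that a page holds only whole link tuples and that 4 KB means 4096 bytes.

```python
def blocking_factor(page_bytes, tuple_bytes):
    """Number of whole link tuples that fit into one page."""
    return page_bytes // tuple_bytes

def sol_groups_per_page(page_bytes, tuple_bytes, avg_fanout):
    """Approximate number of same-origin-link (SOL) groups stored in one page."""
    return blocking_factor(page_bytes, tuple_bytes) / avg_fanout
```

With `page_bytes=4096`, `tuple_bytes=128` and `avg_fanout=3`, the blocking factor is 32 and the number of SOL groups per page is 32/3, or about 10.7, which the text rounds to roughly 11.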
If we cluster the link table so that every page contains links whose origin nodes are
geographically closely located, we are grouping the expansion nodes based on their spatial
proximity. Based on the fact that transportation networks are highly inter-connected and
consist of mostly short links, with such a clustering graph-traversal algorithms such as Dijkstra
are likely to expand nodes within the same page by traversing the intra-page links before
traversing cross-page links. Given a fixed-size main memory buffer not large enough to
hold the entire link table, such paging behavior would reduce page misses caused by
cross-page link traversal. We now present the algorithm that creates the spatial partition clustering
for a given network.
The algorithm that creates the SPC clustering is based on the plane-sweep techniques commonly
found in multi-dimensional spatial data operations. The plane-sweep technique is for example
used to implement the spatial intersect operation in Brinkhoff et al. (1994), Preparata and Shamos
(1985) and Shamos and Hoey (1976). The basic idea of SPC is first to sort all links by the
x-coordinate values of their origin nodes. The plane-sweep technique is then applied to sweep all
x-sorted links along the x-coordinate from left to right. The sweeping process stops periodically to
sort the links swept since the last stoppage by the y-coordinate values of their origin nodes. Because
the origin nodes of the links between two stoppage points span a short distance along the x-axis,
sorting these links by the y-coordinate values of their origin nodes achieves a partial spatial
ordering. After each y-sorting, the y-sorted links can be grouped into disk pages. We call the output
of this clustering process the SPC-clustered link table, which as explained earlier corresponds to
the layout of link tuples onto disk pages.
One critical decision in such a partitioning algorithm is determining the proper stoppage points
during plane sweeping, when y-sorting takes place. Our goal is to achieve a balanced partitioning
in which each resulting partition consists of links whose origin nodes are located within a
bounding area that resembles a square block when the links are evenly distributed on the map.
Below, we introduce a heuristic that dynamically computes the proper stoppage points in order
to achieve such a balanced partitioning. To accommodate unevenly distributed maps, the
heuristic we use adjusts the bounding block for each partition by growing it in the y-axis
direction if the regional link distribution is sparse, and shrinking it otherwise. In either case,
each partitioned page is maximally filled with links whose origin nodes are relatively closely
located.
To present the algorithm that creates the SPC clustering, we use the following parameters:
• f refers to the number of link tuples that fit into a given page size, referred to as the link tuple
blocking factor. We thus call every f consecutive link tuples an f-page.
• The block table is a temporary table that stores the links collected between two stoppage points
during the sweeping process.
• $dx_i$ is the difference between the minimum and maximum x-coordinate values of the origin
nodes of the links in the first i f-pages in the block table.
• $dy_i$ is the difference between the minimum and maximum y-coordinate values of the origin nodes of
the links in the first i f-pages in the block table.
• The SPC-clustered link table is the resulting table.
Algorithm (SPC0).
Input: L: link table filled with all link tuples
Output: CL: link table clustered into pages
1 The unclustered link table L is sorted by the x-coordinate values of the origin nodes of its link
tuples. The result is called the x-sorted link table.
2 Read the x-sorted link table sequentially one f-page at a time using the following process: read
the next f-page and write the page to the end of the block table (the block table is initially empty).
Then check the following conditions:
• If all tuples in the x-sorted link table are read, go to step 3.
• If there is only one f-page in the block table, go to step 2 to read the next f-page and write it
to the end of the block table.
• Otherwise, conduct the following evaluation:
2.1 Let p be the number of f-pages in the block table. Compute the following:
The intuition behind the heuristic is that when the first few f-pages (e.g., 1 or 2) are written from
the link table to the block table, $p$ is small, and $d_p = |(dy_p/p) - dx_p|$ will likely be large, assuming a
map with evenly distributed links. This is because in the plane-sweep process, we are proceeding
with small progress on the x-axis and with the entire range on the y-axis. When more f-pages are
added to the block table, $p$ increases and $d_p$ decreases. At some point, $d_p$ will approach 0 and then
start picking up again when $dx_p > (dy_p/p)$. We capture this point by dynamically detecting
$d_p > d_{p-1}$ and make it a stoppage point. At a stoppage point, links in the first $p-1$ f-pages in the block
table are sorted by the y-coordinate values of their origin nodes. Because
$d_{p-1} = |(dy_{p-1}/(p-1)) - dx_{p-1}|$ approaches 0, each partition will be bounded by an area that resembles a
square box.
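The sweep and the stoppage heuristic described above can be sketched as follows. This is a reconstruction from the prose only, since the remaining steps of the algorithm listing are not reproduced in this text; it should be read as one plausible interpretation rather than the authors' exact procedure. Links are assumed to be `(x, y, link_id)` tuples holding the origin node coordinates, and the function name, the helper names and the tie-breaking sort keys are assumptions.

```python
def spc_cluster(links, f):
    """Cluster links into pages of at most f tuples using a plane-sweep heuristic.

    links: list of (x, y, link_id), where (x, y) is the origin node of the link
    f: link tuple blocking factor (tuples per page)
    """
    # Step 1: sort by x-coordinate of the origin node and cut into f-pages.
    x_sorted = sorted(links, key=lambda l: (l[0], l[1]))
    fpages = [x_sorted[i:i + f] for i in range(0, len(x_sorted), f)]

    def squareness_gap(pages):
        # d_p = |dy_p / p - dx_p| over the links of the given p f-pages.
        pts = [l for pg in pages for l in pg]
        dx = max(l[0] for l in pts) - min(l[0] for l in pts)
        dy = max(l[1] for l in pts) - min(l[1] for l in pts)
        return abs(dy / len(pages) - dx)

    clustered = []

    def flush(pages):
        # y-sort the collected links and write them out page by page.
        y_sorted = sorted((l for pg in pages for l in pg), key=lambda l: (l[1], l[0]))
        clustered.extend(y_sorted[i:i + f] for i in range(0, len(y_sorted), f))

    # Step 2: sweep, appending one f-page at a time to the block table.
    block = []
    for page in fpages:
        block.append(page)
        if len(block) < 2:
            continue
        # Stoppage point: d_p > d_{p-1}. Emit the first p-1 f-pages.
        if squareness_gap(block) > squareness_gap(block[:-1]):
            flush(block[:-1])
            block = [page]
    # Assumed final step: emit whatever remains in the block table.
    if block:
        flush(block)
    return clustered
```

On a 16 x 16 grid of evenly spaced origin nodes with f = 16, this sketch stops after every four x-sorted f-pages and produces 16 pages whose origin nodes each fall in a 4 x 4 square, which matches the "square box" intuition in the text.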
Fig. 1 illustrates the sweeping process and the reasoning behind the heuristics in determining
the stoppage points. In Fig. 1(a), the link tuples are sorted by the x-coordinate values of their
origin nodes. Next, f-pages of link tuples are written to the block table sequentially. In Fig. 1(b),
when the fourth f-page is written to the block table, $d_4 > d_3$. This is a stoppage point. In Fig. 1(c),
links in the first 3 f-pages in the block table are then sorted by the y-coordinate values of their
390 Y.-W. Huang et al. / Transportation Research Part C 8 (2000) 381-408
Fig. 1. Spatial partition clustering: (a) sort links by x-value of their origin nodes, (b) 4 f-pages loaded in block table
when $d_4 > d_3$, where $d_4 = |(dy_4/4) - dx_4|$ and $d_3 = |(dy_3/3) - dx_3|$, (c) sort the links of 3 f-pages by y-value of their origin
nodes, (d) spatial partition clustered links.
Partitioning algorithms have been widely deployed in the design and fabrication of very large
scale integrated circuit (VLSI) chips. Most such algorithms partition a network into two
subnetworks (Cheng and Wei, 1991; Fiduccia and Mattheyses, 1982; Kernighan and Lin, 1970), and
through a divide-and-conquer process, reduce a complex problem into smaller and hence more
manageable subproblems. The common objective of such partitioning is to shorten the total
interconnection distance between all subnetworks in achieving a reduced layout cost and better
system performance. We now propose that these partitioning algorithms could also be applied to
our problem of transportation network clustering, namely to cluster the link table by storing each
partition within a single page. In our context, the size of each partition therefore is bounded by the
size of a buffer page. Our goal of such partitioning is to reduce the page misses that occur during
path query computation to a minimum. Because each cross-page traversal in path computation
may potentially incur a page miss, our partition objective is then to minimize the number of
inter-partition (cross-page) links.
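The partitioning objective can be stated as a small counting function. The sketch below is illustrative only: it assumes links are given as (origin, destination) pairs and that a hypothetical mapping `page_of` records the page on which each node's outgoing links are stored.

```python
def cross_page_links(links, page_of):
    """Count links whose origin and destination nodes lie on different pages."""
    return sum(1 for origin, dest in links if page_of[origin] != page_of[dest])


ring = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]
clustered = {"a": 0, "b": 0, "c": 1, "d": 1}   # neighbors share a page
scattered = {"a": 0, "b": 1, "c": 0, "d": 1}   # neighbors always split
few = cross_page_links(ring, clustered)
many = cross_page_links(ring, scattered)
```

For the four-link ring, the clustered assignment leaves two cross-page links while the scattered one makes every link cross a page boundary, which is the quantity the partitioning tries to minimize.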
resulting partitions set to s1 and s2. The i, s1, s2 are pre-specified input parameters.
2.3 The result of step 2.2 is a two-way cut of G'.
3 Restoring stage:
3.1 Restore the two partitions in G' created in step 2 by replacing each condensed node in
each partition by its original nodes in the corresponding partition created in step 1. The
result is a two-way cut in G.
3.2 Apply the Fiduccia-Mattheyses algorithm on the two restored partitions in G one time,
and the ending two partitions are the final result.
The ratio-cut routine (Wei and Cheng, 1990) in step 1.2 and the Fiduccia-Mattheyses algorithm
(Fiduccia and Mattheyses, 1982) in step 2.2 are two partitioning algorithms based on the
two-stage heuristics described earlier. The former relies on the ratio-cut property to achieve a more
balanced split while the latter deploys a data structure that reduces the computational complexity
and a specific size restriction to achieve a desired partitioning.
The intuition behind the contraction approach is that nodes that are more strongly connected
are identified by the ratio-cut routine in step 1.2 and treated as one node in the swapping stage.
This way the chance of them being split into different partitions by a bad split is reduced. For
example, in Fig. 4, the ratio-cut routine may group the nodes forming a circular versus a
triangular configuration into two different subgraphs A and B. Because partitions A and B are
subsequently contracted into two inseparable units, if there is a cut that goes through A and B, it
has to go through the link between node x and node y. Note that this is an optimal cut between
A and B and no further swapping will change this cut-link. If no contraction is performed, an
initial cut of all the nodes in A and B may look like the cut in Fig. 5. Note that subsequent
swapping will not alter this cut because any single-node or pair-wise swapping of nodes a, b, c, d
does not yield a cut with fewer inter-partition links. Therefore the optimal cut that goes through
the link between nodes x and y would be lost if we did not use the contraction
heuristic.
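The contraction step itself is simple to illustrate. The following sketch assumes that a grouping of nodes into subgraphs is already available (for example from a ratio-cut routine, which is not shown); the function name `contract` and the node labels are hypothetical.

```python
from collections import Counter


def contract(links, group_of):
    """Condense each group of nodes into one super-node.

    links: iterable of (origin, dest); group_of: node -> group id.
    Returns a Counter mapping (group_a, group_b) to the number of original
    links between the two groups; links inside a group disappear.
    """
    condensed = Counter()
    for origin, dest in links:
        ga, gb = group_of[origin], group_of[dest]
        if ga != gb:
            condensed[(ga, gb)] += 1
    return condensed


# Two tightly connected subgraphs joined only by the link (x, y).
links = [("a", "b"), ("b", "x"), ("x", "a"),
         ("c", "d"), ("d", "y"), ("y", "c"),
         ("x", "y")]
group_of = {"a": "A", "b": "A", "x": "A", "c": "B", "d": "B", "y": "B"}
condensed = contract(links, group_of)
```

After contraction only the single link between the two super-nodes remains, so any cut of the condensed graph must pass through the x-y link, as argued above.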
and removing acyclic portions of the cyclic graph. By this approach, the cyclic graph can be
"approximately" topologically sorted.
In this paper, we implement and evaluate the approximately topological clustering algorithm
proposed in Agrawal and Kiernan (1993) and call it TopoC (for Topological Clustering). The
following is a description of the main steps of TopoC.
Algorithm (TopoC).
Input: L: link table representing the transportation network
Output: CL: link table clustered into pages
1. Move a root-link⁵ of the link table L into the clustered link table CL. Repeat this process until
no more root-links can be found in the remaining table. If the remaining link table is empty, go
to step 4.
2. Move a sink-link⁶ to a temporary link table. Repeat this process until no more sink-links
remain in the link table L.
3. Randomly pick a node in the remaining link table L. Move all its outgoing links to the
temporary link table. Go to step 1.
4. Append the links in the temporary link table in reverse order to the clustered link table CL.
TopoC achieves approximately topological clustering in three phases: steps 1 and 2 remove
the acyclic portions of the graph. Step 3 breaks cycles by removing all outgoing links of one
selected node. When the remaining graph is empty, step 4 appends the links in the temporary link
table to the resulting clustered link table.
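The four steps can be sketched in Python as follows. This is a simplified illustration, not the authors' code: it orders links only (packing into pages is omitted), it treats any link whose destination has no outgoing link as a sink-link (the additional condition that the origin lies on no cycle is not checked), and the function name `topo_cluster` and the seeded random choice are assumptions.

```python
import random


def topo_cluster(links, seed=0):
    """Order (origin, dest) links in an approximately topological sequence."""
    rng = random.Random(seed)
    remaining = list(links)
    clustered, temp = [], []

    def move(selected, target):
        # Move the selected links out of `remaining` and onto `target`.
        nonlocal remaining
        chosen = set(selected)
        target.extend(selected)
        remaining = [l for l in remaining if l not in chosen]

    while remaining:
        # Step 1: repeatedly move root-links (origin has no incoming link).
        while True:
            has_incoming = {d for _, d in remaining}
            roots = [l for l in remaining if l[0] not in has_incoming]
            if not roots:
                break
            move(roots, clustered)
        if not remaining:
            break
        # Step 2: repeatedly move sink-links (destination has no outgoing link).
        while True:
            has_outgoing = {o for o, _ in remaining}
            sinks = [l for l in remaining if l[1] not in has_outgoing]
            if not sinks:
                break
            move(sinks, temp)
        # Step 3: break a cycle by moving all outgoing links of a random node.
        origins = sorted({o for o, _ in remaining})
        if origins:
            node = rng.choice(origins)
            move([l for l in remaining if l[0] == node], temp)
    # Step 4: append the temporary table in reverse order.
    clustered.extend(reversed(temp))
    return clustered


dag_order = topo_cluster([("a", "b"), ("b", "c"), ("a", "c")])
```

On an acyclic input only step 1 fires and the result is an ordinary topological order of the links; on a cyclic input the order depends on which node step 3 happens to pick.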
As an example, Fig. 6(a) shows a cyclic directed graph and its unclustered link table. We use
this graph to illustrate how the clustered link table is built step by step by the TopoC algorithm.
Fig. 6(b) depicts the graph after TopoC moves root-links $L_{fc}$ and $L_{fg}$ into the clustered link table.
Since Fig. 6(b) has no sink-links, TopoC skips step 2. In Fig. 6(c), TopoC breaks a cycle by
moving link $L_{ab}$ into the temporary link table. Repeating step 1, Fig. 6(d) depicts the graph after
moving root-links $L_{bd}$ and $L_{be}$ to the clustered link table. Then step 2 removes sink-link $L_{ca}$ from
the graph into the temporary link table (Fig. 6(e)). By breaking another cycle in Fig. 6(f), link $L_{de}$
is moved over to the temporary link table. Since the remaining graph at this time is an acyclic
graph, its links are topologically sorted by repeating step 1 to move root-links to the clustered link
table (Fig. 6(g)). Finally, Fig. 6(h) depicts the topologically clustered link table after appending the
links of the temporary link table in reverse order.
5 If the origin node of a link has no incoming link, this link is referred to as a root-link.
6 If the destination node of a link has no outgoing link, and the origin node of this link does not belong to any of the
cycles, this link is referred to as a sink-link.
Fig. 6. Example of the approximately topological clustering (TopoC) algorithm: (a) example graph, (b) step 1: remove root-links (of node f), (c)
step 3: break cycle (of node a), (d) step 1: remove root-links (of node b), (e) step 2: remove sink-link (of node c), (f) step
3: break cycle (of node d), (g) step 1: remove root-links, (h) step 4: append temporary links.
determine the path query processing cost when no clustering strategy is deployed. In this paper,
we compare its performance in path query processing with those of all other clustering strategies.
that combines the spatial partition clustering (Section 3.2) with the swapping technique (Section
4.1.1). We call it the hybrid approach. This hybrid approach starts by performing the spatial
partition clustering on the graph, striving as before to reach a high occupancy rate for each
partition. In contrast to the SPC approach where the page occupancy rate is approximately 100%, we
allow the minimum occupancy rate to be as low as 90% for the hybrid approach.⁷ This relaxation
on the occupancy rate leaves some space on each page that can be used for more effective
single-node swapping.
When the spatial partitioning process is complete, we perform single-node swapping with the
maximum occupancy rate set to 100%. This means that swapping cannot fill a partition so as to
exceed the size of a page. We conduct such swapping for several iterations with the partitioning
objective based on the minimum number of inter-partition links. Next, we continue by performing
pair-wise swapping for several more iterations. This is designed to give the partitions that are full
a chance to swap nodes with other partitions to reach a better cut. In contrast to the swapping in
the two-way partitioning approach in which each node has only one other partition to swap to
(see Section 4.1), the swapping in our hybrid approach has to consider all other partitions a node
can potentially swap to. This is because the spatial partitioning process creates many initial
partitions. It is to be expected (and our experiments confirm) that the resulting clustered link table
has a few more pages than the SPC-clustered link table because of its slightly lowered occupancy
rate, but the number of inter-partition links could potentially be reduced by the swapping process.
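One pass of capacity-constrained single-node swapping over many partitions might look like the sketch below. It is a greedy illustration under stated assumptions, not the procedure used in the experiments: the space a node occupies is given by a hypothetical `size` mapping, links are counted in both directions, and pair-wise swapping is not shown.

```python
def single_node_pass(links, page_of, size, capacity):
    """One pass of single-node moves that reduce the number of cross-page links.

    links: list of (origin, dest); page_of: node -> page (modified in place);
    size: node -> space the node occupies; capacity: space available per page.
    Returns the number of nodes moved.
    """
    neighbors = {}
    for o, d in links:
        neighbors.setdefault(o, []).append(d)
        neighbors.setdefault(d, []).append(o)
    load = {}
    for node, page in page_of.items():
        load[page] = load.get(page, 0) + size[node]
    moved = 0
    for node in sorted(neighbors):
        here = page_of[node]
        # Number of links joining this node to each page.
        count = {}
        for other in neighbors[node]:
            if other != node:
                count[page_of[other]] = count.get(page_of[other], 0) + 1
        best, best_gain = here, 0
        for page, links_into in count.items():
            gain = links_into - count.get(here, 0)   # cross-page links saved
            fits = load[page] + size[node] <= capacity
            if page != here and gain > best_gain and fits:
                best, best_gain = page, gain
        if best != here:
            load[here] -= size[node]
            load[best] += size[node]
            page_of[node] = best
            moved += 1
    return moved


links = [("a", "b"), ("a", "c"), ("a", "d")]
page_of = {"a": 0, "b": 0, "c": 1, "d": 1}
size = {n: 1 for n in page_of}
moved = single_node_pass(links, page_of, size, capacity=3)
```

Here node a moves to the page of c and d because that page still has room, which removes one cross-page link; node b would like to follow but cannot, since the target page is then full. This is the situation that pair-wise swapping is meant to address.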
In applying the two-way partitioning algorithm to graph clustering, our objective is to reduce
the number of cross-page links (TWPC_con in Section 4.1). We believe such an approach will lead
graph-traversal path computation to traverse more intra-page links and fewer inter-page links since
the numbers of the latter are reduced. As a result, the page misses happening during path search
are likely to be reduced. We now extend the partitioning objective to include link weights also.
As before, we set our objective first to minimize the total number of cross-page links.
However, if two or more possible cuts have the same number of reductions in terms of
inter-partition links, we break the tie by giving the swapping priority to the cut that results in the
maximum sum of weights of all cross-page links.⁸ This objective is based on the fact that our path
search algorithm, the Dijkstra algorithm, is a priority search that gives priority to the node with
the minimum traversed weight so far. With everything else being equal, a link with a larger weight
is less likely to be favored by the Dijkstra algorithm to be traversed next than a link with a smaller
weight. Therefore, given an equal number of cross-page links, it can be expected that a partition
that has a larger total weight of all cross-page links could potentially further improve the I/O
efficiency of path query computation based on the Dijkstra algorithm. Note that reducing the
number of cross-page links is still the first priority because if we put the maximum sum of all
7 Our experiments showed that 90% is a good compromise between high occupancy rate and sufficient room for
subsequent node swapping.
8 Note that for stable graph clustering that we consider in this paper, the link weight used in this partitioning objective
should not be an unstable link attribute that changes frequently.
weights of all cross-page links as the first priority, the resulting cut will possibly have a large
number of cross-page links, making the partitioning ineffective.
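The two-level objective amounts to a lexicographic comparison. The sketch below is one way to express it, assuming candidate cuts are compared by their resulting partitionings (equivalent to comparing reductions when all candidates start from the same cut); `cut_key` and the sample weights are hypothetical.

```python
def cut_key(links, page_of, weight):
    """Sort key for comparing candidate partitionings.

    Fewer cross-page links is better; among equals, a LARGER total weight of
    cross-page links is better. Tuples compare lexicographically, so the
    smallest key identifies the preferred candidate.
    """
    crossing = [l for l in links if page_of[l[0]] != page_of[l[1]]]
    return (len(crossing), -sum(weight[l] for l in crossing))


links = [("a", "b"), ("b", "c")]
weight = {("a", "b"): 1, ("b", "c"): 9}
cut_light = {"a": 0, "b": 1, "c": 1}   # cuts the light link (a, b)
cut_heavy = {"a": 0, "b": 0, "c": 1}   # cuts the heavy link (b, c)
best = min([cut_light, cut_heavy], key=lambda p: cut_key(links, p, weight))
```

Both candidates cut exactly one link, so the tie is broken in favor of the cut through the heavier link, which Dijkstra's algorithm is less likely to traverse early.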
6. Testbed environment
In this section, we first discuss our experimental testbed setup, followed by the graph
representation, and data sets, and then experimental parameters and measurements.
We use the link table to model the topology of the graph. Each link tuple in the link table
models a link in the graph. The path queries discussed in this paper are assumed to be path queries
with embedded constraints (see examples in Section 1). Because resolving such constraints may
require the retrieval of link attributes in order to evaluate the validity of each link traversed during
path finding, we must store relevant link attributes in their corresponding link tuples. In this
paper, the link tuple adopted in our experiments is set to 128 bytes. The reader can find a listing of
possible attributes that could be associated with such links. As discussed in Section 1, depending
on the type of query, the link attributes must be kept with the link itself in order to allow for the
filtering of links from the candidate paths during path processing, such as for the spatial
constraints in query Q3 from Section 1.
We test two kinds of graphs: randomly generated graphs and a real (fine-granularity) network
representing the streets of Ann Arbor City that has 5596 nodes and 14,033 links. We experiment
with random graphs with 5000 nodes and vary the average out-degree from 2 to 8. To create a
random graph with average out-degree d, we randomly select, for each node, from 1 to $2d - 1$
outgoing links and, for each link, we randomly select a destination. The destination must be
different from the origin of the same link, and the destinations of two different outgoing links from
the same origin must be different. The weight for each link is chosen to be a random integer
between 1 and 100. We also create two sets of such random graphs, one with high locality, and the
other with no locality. To control the locality of a random graph, we associate each node with an
x-coordinate and a y-coordinate value. For graphs of high locality, we allow a link to exist only
when its origin and destination are within a relatively close vicinity as compared to the total area.
For graphs of no locality, we set no such limitation.
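A generator following this description can be sketched as below. The unit-square coordinates, the `radius` parameter used as the locality rule, and the function name `random_graph` are assumptions made for illustration; the text does not specify how "relatively close vicinity" was defined. Drawing the out-degree uniformly from 1 to 2d - 1 gives an expected out-degree of d.

```python
import math
import random


def random_graph(n, d, radius=None, seed=0):
    """Generate a random directed graph with average out-degree about d.

    Returns (coords, links), where links maps (origin, dest) -> weight.
    radius=None gives a graph with no locality; otherwise a link may only join
    nodes at most `radius` apart in the unit square (an assumed locality rule).
    """
    rng = random.Random(seed)
    coords = [(rng.random(), rng.random()) for _ in range(n)]
    links = {}
    for origin in range(n):
        candidates = [
            v for v in range(n)
            if v != origin
            and (radius is None or math.dist(coords[origin], coords[v]) <= radius)
        ]
        # With a small radius a node may have fewer candidates than requested.
        k = min(rng.randint(1, 2 * d - 1), len(candidates))
        for dest in rng.sample(candidates, k):      # distinct destinations
            links[(origin, dest)] = rng.randint(1, 100)
    return coords, links


coords, links = random_graph(200, 3)                      # no locality
near_coords, near_links = random_graph(200, 3, radius=0.2)  # high locality
```

The candidate scan is quadratic in the number of nodes, which is acceptable for a sketch at a few thousand nodes; a spatial grid would be the natural refinement for larger graphs.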
We experiment with random networks of both high and no locality because more
advanced GIS applications such as Intelligent Transportation Systems need to model graphs
beyond the road transportation networks (such as the Ann Arbor city network). For example, the
graphs of airline flight routes exhibit neither planarity nor locality and can therefore be better modeled by
random graphs with no locality. An inter-modal network of both subway train and bus routes,
however, exhibits high locality without planarity, and can therefore be modeled by random graphs
with high locality.
In our experiments, we first prepare the network data using the various clustering techniques
proposed in this paper, namely SPC, Hybrid (the hybrid approach that combines SPC and node
swapping), two-way partition clustering based on connectivity (TWPC_con), two-way partition
clustering based on stable link weights (TWPC_wgt), topological clustering (TopoC), and random
clustering (RandC). The data is then laid out on disk based on this preprocessing stage.
Thereafter, for each experiment, we apply the Dijkstra algorithm to conduct a single-source
shortest path search from randomly selected source nodes i to all other nodes in the network. Such
computation corresponds to the graph-traversal search for the shortest path from node i to the
one node j that is the farthest away from i. Hence this set of experiments tests the worst-case
scenario in searching a shortest path from i.
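The measurement can be imitated with a small simulation that runs Dijkstra's algorithm while counting page misses against a fixed-size buffer. The sketch assumes an LRU replacement policy and a hypothetical `page_of` mapping from each node to the page holding its outgoing links; neither detail is taken from the text.

```python
import heapq
from collections import OrderedDict


def dijkstra_page_misses(adj, page_of, source, buffer_pages):
    """Single-source Dijkstra that counts page misses under an LRU buffer.

    adj: node -> list of (dest, weight); page_of: node -> page id that stores
    the node's outgoing links. Returns (dist, misses).
    """
    buffer = OrderedDict()          # page id -> True, oldest first
    misses = 0
    dist = {source: 0}
    heap = [(0, source)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        links = adj.get(u)
        if not links:
            continue                # nothing to read for a node without links
        page = page_of[u]
        if page in buffer:
            buffer.move_to_end(page)            # mark as most recently used
        else:
            misses += 1                         # page must be read from disk
            buffer[page] = True
            if len(buffer) > buffer_pages:
                buffer.popitem(last=False)      # evict least recently used
        for v, w in links:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist, misses


adj = {0: [(1, 1)], 1: [(2, 1)], 2: [(3, 1)], 3: []}
clustered = {0: 0, 1: 0, 2: 1, 3: 1}
scattered = {0: 0, 1: 1, 2: 0, 3: 1}
_, good = dijkstra_page_misses(adj, clustered, 0, buffer_pages=1)
_, bad = dijkstra_page_misses(adj, scattered, 0, buffer_pages=1)
```

With a one-page buffer the clustered layout of this four-node chain costs two page reads and the scattered layout three, while the computed distances are identical. The clustering changes only the I/O cost, not the result of the search.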
In this paper, the size of pages on disk and in the main memory buffer is set to be 4 KB each. The
experiments are based on a buffer containing up to 240 pages, i.e., varying the size of the buffer
from 64 KB up to 960 KB. The size of the entire link table is about 2 MB, with a small difference
between the various clustering techniques.
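These figures can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers quoted in the text (4 KB pages, 128-byte link tuples, 14,033 links in the Ann Arbor network, 90% minimum occupancy for the hybrid approach); the helper name `pages_needed` is hypothetical.

```python
import math

PAGE_BYTES = 4 * 1024
TUPLE_BYTES = 128
tuples_per_page = PAGE_BYTES // TUPLE_BYTES      # link tuples that fit on a page


def pages_needed(num_links, occupancy=1.0):
    """Pages required to store num_links tuples at a given occupancy rate."""
    return math.ceil(num_links / (tuples_per_page * occupancy))


ann_arbor_full = pages_needed(14033)             # pages at 100% occupancy
ann_arbor_90 = pages_needed(14033, 0.9)          # pages at 90% occupancy
table_kb = ann_arbor_full * PAGE_BYTES // 1024   # link table size in KB
smallest_buffer_pages = 64 * 1024 // PAGE_BYTES  # 64 KB buffer
largest_buffer_pages = 960 * 1024 // PAGE_BYTES  # 960 KB buffer
```

Under these assumptions a page holds 32 link tuples, the Ann Arbor link table occupies 439 pages (about 1.7 MB, consistent with the "about 2 MB" quoted above), and the buffer ranges from 16 to 240 pages, so even the largest buffer holds only a little over half of the table.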
Although the experiments presented in this paper are based on buffer sizes up to 960 KB, the
adequate buffer size for path query processing is proportional to the size of the underlying
network. The experimental networks in this paper (i.e., the Ann Arbor city network) are of medium
size (Section 6.3). The networks of larger cities can be many times larger. For example, the Detroit
road network we are using for related research in this project has more than 50,000 links, which is
about three times the size of the Ann Arbor map. Consequently, the buffer requirement should
increase for larger maps. Second, resolving constraints embedded in the path queries may incur
heavy I/O activities, which take buffer space away from the path search process. Lastly, in a
multi-user and multi-tasking database system, one cannot assume that all resources such as
the buffer space are available to one single query process. We therefore are motivated to find the
clustering strategy that can process path queries efficiently while using as small a portion of the
buffer as possible. Therefore, in reality, the buffer requirement for processing constrained path
finding on a large network in a multi-user database environment can be many times larger than
the various buffer sizes depicted in our experimental evaluation in Section 7.
Given that we use the same search algorithm (Dijkstra) in our experiments, the CPU processing
costs are all fairly comparable whereas the number of disk pages that must be transferred from
the slower secondary storage device to the faster main memory system for processing, referred to
7. Experimental evaluation
In the first set of experiments, we use the Ann Arbor road network and conduct single-source
shortest path search for randomly selected nodes using the clustering techniques proposed in this
paper. The results in Fig. 7 show that Random clustering performs much worse than any other
clustering, confirming our claim in this paper that proper graph clustering can be a key to efficient
path query processing. Because the cost of the Random clustering is very high, making it hard to
see the difference in performance between the other five clustering approaches, we plotted Fig. 8
without showing the Random clustering results. In Fig. 8, it is clear that SPC performs the best,
followed by, in exact order, TWPC_wgt, TWPC_con, Hybrid, and TopoC. It is surprising to see
that although TopoC has the worst performance among the five clustering optimizations, it is still
much more effective than RandC. This contradicts the suggestion in Shekhar and Liu
(1995) and Zhao and Zaki (1994) that topological clustering is not effective for highly cyclic
graphs such as our road network.
It is also interesting to note that the Hybrid approach performs worse than the SPC. This
indicates that the partitioning objective we set to minimize the cross-page links during the
swapping process may not be as relevant to highly interconnected near-planar graphs like the
Ann Arbor network as the spatial proximity which SPC is based upon. This also helps to explain
why SPC performs better than TWPC_wgt and TWPC_con. The fact that TWPC_wgt performs
better than TWPC_con indicates that by incorporating link weights, the partitioning objective
of TWPC_wgt captures the expansion behavior of the Dijkstra algorithm better than that of
TWPC_con. Note that when the buffer size is greater than 512 KB, all five clustering strategies
perform the same. This is because the size of the buffer is large enough to contain the expansion
locality of the Dijkstra algorithm captured by all five clustering optimizations. Therefore,
roughly one pass for such a large buffer would be sufficient to compute the single-source
shortest paths.
Fig. 8. I/O cost of searching the longest path (excluding random clustering) on Ann Arbor map.
The second set of experiments is based on a randomly generated graph with 5000 nodes,
average out-degree of 3, and high locality. We conduct the same single-source path search
experiments described above. While the Ann Arbor network is very planar and interconnected,
the high-locality random graph does not guarantee planarity and high interconnection. The
results in Fig. 9 show that RandC remains the distant worst, with TopoC significantly worse
than the other four clustering approaches.
Fig. 9. I/O cost of searching the longest path on the high-locality random graph.
The close-up results in Fig. 10 show that the Hybrid and SPC perform better than TWPC_con
and TWPC_wgt, with the Hybrid having a slight edge. This is different from the experimental
results on the Ann Arbor network where the Hybrid is worse than the other three. This indicates
that the partitioning objective of minimizing cross-page links does help in bringing down the I/O
cost incurred by the Dijkstra path search for high-locality graphs without high interconnection
and planarity. The superior performance of both the Hybrid and SPC over the TWPC_con and
TWPC_wgt indicates that each partition created by our proposed SPC partitioning algorithm is
tailored to fit perfectly into a buffer page. Therefore the resulting graph partitions are better, in
Fig. 10. I/O cost of searching the longest path (excluding random clustering) on high-locality random graph.
terms of both page occupancy rate and expansion locality exhibited by the Dijkstra algorithm,
than the partitions created by the TWPC approaches that use divide-and-conquer to distribute
partitions into pages.
In the third set of experiments, we test a randomly generated graph with 5000 nodes, average
out-degree of 3, and no locality. Interestingly, the results in Fig. 11 show that RandC and SPC are
equally the worst. This can be explained by the fact that without locality, the spatial proximity is
irrelevant for proper partitioning. Consequently, the SPC performs the same as the RandC on
graphs with no locality. The swapping process in the Hybrid approach tries to correct the
irrelevant spatial partitioning, but its performance is still worse than the TWPC approaches, and even
worse than the TopoC for some buffer sizes. The results also show that TWPC_wgt has the best
performance, followed by TWPC_con. This indicates that the two TWPC approaches are not
locality-dependent and therefore have better performance than the SPC and the Hybrid approaches
on graphs with no locality. The link-weight based partitioning objective in TWPC_wgt that we
have proposed is an effective optimization over the pure connectivity based objective in
TWPC_con. We note however that the TWPC_wgt has the limitation that the link weight used as
the partitioning objective must be stable.
So far, our experiments focus on the worst-case scenario, i.e., the cost of computing the longest
among all shortest paths to all possible destinations. We now explore the path search performance
on paths of different lengths, namely short paths, medium paths, and long paths. We define the
direct distance between the farthest node-pair as $d_{max}$, and the direct distance between the two end
nodes of a path as $d_{short}$ for short paths, $d_{medium}$ for medium paths, and $d_{long}$ for long paths. Then
the following relations hold:
Because such a direct-distance based classification is only meaningful if the graph has locality and
is highly inter-connected, we conduct this set of experiments on the real Ann Arbor city map only.
We randomly select a number of node-pairs from each category, conduct path search, and
collect the average results. The buffer size is set to 256 KB. Because the performance for RandC is
much worse than any other clustering approach, we use the log scale in Fig. 12. Fig. 13 shows the
results in linear scale without RandC. In Fig. 13, we can see more clearly that SPC consistently
has the best performance for all three kinds of paths, while TopoC generally has the worst
performance.
In this last set of experiments, we randomly generate graphs with 5000 nodes, varying the
average out-degrees from 2 to 8. We set the buffer size to 960 KB in order to accommodate graphs
with large average out-degree. The purpose of the experiments is to find out which clustering
strategies work better for graphs of various average out-degrees. Because graphs of high
out-degrees automatically lose locality with evenly distributed nodes, we only generate graphs with no
locality for this set of experiments.
The experimental results in Fig. 14 show that both TWPC_wgt and TWPC_con perform better
as the average out-degree increases. Although TWPC_wgt has a better performance than
TWPC_con, the performance curve of TWPC_con actually becomes (i.e., drops) more favorably
Fig. 13. I/O cost by length of paths (excluding random clustering) on the Ann Arbor network.
8. Conclusions
For future work, effective graph clustering techniques can also be extended to solve more
general path problems such as recursive query processing. Results from this paper can also be
exploited for further optimization of complex path query processing with embedded constraints.
For instance, our exploration of a framework of spatial path queries (Huang et al., 1997) is based
on the SPC solution first introduced in this paper. Lastly, clustering techniques could be explored
to take into account knowledge about the spatial locations in which paths with certain properties can
be found. Such knowledge could be utilized to constrain the search and also to adjust the
clustering of links for origin-destination pairs that meet this particular class of queries.
References
Agrawal, R., Dar, S., Jagadish, H.V., 1998. Direct transitive closure algorithms: design and performance evaluation.
ACM Transactions on Database Systems 15 (3), 427-458.
Agrawal, R., Jagadish, H.V., 1988. Efficient search in very large databases. In: Proceedings of the 14th VLDB
Conference. Los Angeles, CA, pp. 407-418.
Agrawal, R., Jagadish, H.V., 1989. Materialization and incremental update of path information. In: IEEE Fifth
International Conference on Data Engineering, pp. 374-383.
Agrawal, R., Jagadish, H.V., 1990. Hybrid transitive closure algorithms. In: Proceedings of the 16th VLDB Conference.
Brisbane, Australia, pp. 326-334.
Agrawal, R., Kiernan, J., 1993. An access structure for generalized transitive closure queries. In: IEEE Ninth
International Conference on Data Engineering, pp. 429-438.
Banerjee, J., Kim, W., Kim, S.J., Garza, J.F., 1988. Clustering a DAG for CAD databases. IEEE Transactions on
Software Engineering 14 (11).
Bancilhon, F., 1985. Naive evaluation of recursively defined relations. In: Brodie, M., Mylopoulos, J. (Eds.), On
Knowledge Base Management Systems - Integrating Database and AI Systems. Springer, New York.
Bancilhon, F., Ramakrishnan, R., 1986. An amateur's introduction to recursive query processing strategies. In:
Proceedings of the 1986 ACM SIGMOD International Conference on Management of Data.
Brinkhoff, T., Kriegel, H., Schneider, R., Seeger, B., 1994. Multi-step processing of spatial joins. In: Proceedings of the
1994 ACM SIGMOD International Conference on Management of Data, pp. 197-208.
Garey, M.R., Johnson, D.S., Stockmeyer, L., 1976. Some simplified NP-complete graph problems. Theoretical
Computer Science 237-267.
Cheng, C.K., Wei, T.C., 1991. An improved two-way partitioning algorithm with stable performance. IEEE
Transactions on Computer-Aided Design 10 (12), 1502-1511.
Dijkstra, E.W., 1959. A note on two problems in connection with graphs. Numer. 269-271.
Ebert, J., 1981. A sensitive transitive closure algorithm. Information Processing Letters 12, 255-258.
Fiduccia, C.M., Mattheyses, R.M., 1982. A linear time heuristic for improving network partitions. In: Proceedings of
ACM/IEEE 19th Design Automation Conference, pp. 175-181.
Goodchild, M.F., 1990. Tiling large geographical databases. In: Buchmann, A., Günther, O., Smith, T.R., Wang, Y.-F.
(Eds.), Design and Implementation of Large Spatial Databases. Springer, New York, pp. 137-146.
Goodchild, M.F., Shiren, Y., 1992. A hierarchical spatial data structure for global geographic information systems.
Computer Vision Graphics and Image Processing: Graphical Models and Image Processing 54 (1), 31-44.
Huang, Y.W., Jones, M.C., Rundensteiner, E.A., 1998. Symbolic intersect detection: a method for improving spatial
intersect joins. Journal of GeoInformatica, Special Issue on Spatial Database Systems 2 (2), 149-174.
Huang, Y.W., Jing, N., Rundensteiner, E., 1997. Integrated query processing strategies for spatial path queries. In:
IEEE International Conference on Data Engineering, pp. 477-486.
Huang, Y.W., Jing, N., Rundensteiner, E., Huang, Yun-Wu., Jones, Matthew C., Rundensteiner, Elke A., 1997a.
Improving spatial intersect joins using symbolic intersect detection. In: SSD Conference, pp. 165-177.
Huang, Y.W., Jing, N., Rundensteiner, E., 1997b. A cost model for estimating the performance of spatial joins using
R-trees. SSDBM (1997) 30-38.
Huang, Y.W., Jing, N., Rundensteiner, E.A., 1997c. A hierarchical path view model for path finding in intelligent
transportation systems. Journal of GeoInformatica 1 (2), 125-159.
Huang, Y.W., Jing, N., Rundensteiner, E., 1997d. Spatial joins using R-trees: breadth-first traversal with global
optimizations. VLDB (1997) 396-405.
Huang, Y.W., Jing, N., Rundensteiner, E.A., 1997e. Query processing strategies for spatial path queries. In: IEEE
International Conference on Data Engineering, ICDE-13, England.
Huang, Y.W., Jing, N., Rundensteiner, E.A., 1996. Path view algorithm for transportation networks: the dynamic
reordering approach. In: ACM Workshop on Geographic Information Systems, ACM GIS'96, Washington, DC.
Huang, Y.W., Jing, N., Rundensteiner, E.A., 1996. Evaluation of hierarchical path finding techniques for ITS route
guidance. In: Proceedings of ITS-America, Houston, April.
Huang, Y.W., Jing, N., Rundensteiner, E., 1996. Effective graph clustering for path queries in digital map databases. In:
Proceedings of the Fifth International Conference on CIKM 1996, Washington, DC, pp. 215-222.
Ioannidis, Y.E., 1986. On the computation of the transitive closure of relational operators. In: Proceedings of the 12th
International Conference on VLDB, pp. 403-411.
Ioannidis, Y.E., Ramakrishnan, R., 1988. An efficient transitive closure algorithm. In: Proceedings of the 14th
International Conference on VLDB, pp. 382-394.
Ioannidis, Y.E., Ramakrishnan, R., Winger, L., 1993. Transitive closure algorithms based on graph traversal. ACM
Transactions on Database Systems 18 (3), 512-576.
Jing, N., Huang, Y.W., Rundensteiner, E.A., 1998. Hierarchical encoded path views for path query processing: an
optimal model and its performance evaluation. IEEE Transactions on Knowledge and Data Eng. 10 (3), 409-432.
Jing, N., Huang, Y.W., Rundensteiner, E., 1996. Hierarchical optimization of optimal path finding for transportation
applications. In: Proceedings of the Fifth International Conference on CIKM, pp. 261-268.
Larson, P.A., Deshpande, V., 1998. A file structure supporting traversal recursion. In: Proceedings of the 1989 ACM
SIGMOD International Conference on Management of Data, pp. 243-252.
Kernighan, B.W., Lin, S., 1970. An efficient heuristic procedure for partitioning graphs. Bell System Technical Journal
49 (2), 291-307.
Preparata, F.P., Shamos, M.I., 1985. Computational Geometry. Springer, New York.
Schmitz, I., 1983. An improved transitive closure algorithm. Computing 30, 359-371.
Shamos, M.I., Hoey, D.J., 1976. Geometric intersection problems. In: Proceedings of the 17th Annual Conference on
Foundations of Computer Science, pp. 208-215.
Shekhar, S., Liu, D.R., 1995. CCAM: a connectivity-clustered access method for aggregate queries on transportation
networks: a summary of results. In: IEEE 11th International Conference on Data Engineering, pp. 410-419.
Wei, Y.-C., Cheng, C.-K., 1990. Ratio cut partitioning for hierarchical designs. Technical Report CS90-164, University
of California, San Diego, January.
Zhan, Noon, 1998. Shortest path algorithms: an evaluation using real road networks. Transportation Science 32, 65-73.
Zhao, J.L., Zaki, A., 1994. Spatial data traversal in road map databases: a graph indexing approach. In: Proceedings of
the Third International Conference on CIKM, pp. 355-362.
TRANSPORTATION RESEARCH PART C
Abstract
This article presents a Web-based transit information system design that uses Internet Geographic
Information Systems (GIS) technologies to integrate Web serving, GIS processing, network analysis and
database management. A path finding algorithm for transit networks is proposed to handle the special
characteristics of transit networks, e.g., time-dependent services, common bus lines on the same street, and
non-symmetric routing with respect to an origin/destination pair. The algorithm takes into account the
overall level of services and service schedule on a route to determine the shortest path and transfer points. A
framework is created to categorize the development of transit information systems on the basis of content
and functionality, from simple static schedule display to more sophisticated real time transit information
systems. A unique feature of the reported Web-based transit information system is the Internet-GIS based
system with an interactive map interface. This enables the user to interact with information on transit
routes, schedules, and trip itinerary planning. Some map rendering, querying, and network analysis
functions are also provided. © 2000 Elsevier Science Ltd. All rights reserved.
Keywords: Internet GIS; Transit networks; Shortest path; Intelligent transportation systems
1. Introduction
This paper describes and discusses a way of designing a Web-based transit information system
that allows transit users to plan a trip itinerary and to query service-related information, such as
schedules and routes, using Internet Geographic Information Systems (GIS) technologies.
Historically, transit agencies have relied on printed schedules to provide customers with information
0968-090X/00/$ - see front matter © 2000 Elsevier Science Ltd. All rights reserved.
PII: S0968-090X(00)00016-4
410 Z.-R. Peng, R. Huang / Transportation Research Part C 8 (2000) 409-425
about transit routing and schedules. Transit users have had to select proper routes and transfer
points based on the information printed on the schedule. Use of the schedule is complex and can
be confusing for many people. In addition, since schedules are infrequently updated, many service
changes cannot be reflected in the brochure in a timely manner. Most transit agencies also staff
customer service agents to provide telephone assistance in answering customer inquiries about
schedules and directions. Customer service agents can suggest itinerary plans for customers based
on printed route maps, published schedules, and service updates not yet made on the printed
schedules. This manual itinerary planning process is tedious, time-consuming, redundant, and
often error-prone; information can be inconsistent from one service agent to another (Salters,
1996). Recently, computer-assisted trip planning systems were developed to automate the process
(Salters, 1996; Peng, 1997; Casey et al., 1998). However, these early computer-aided trip planning
systems are mainly designed to assist customer service agents and require proprietary software
installed on the users' local computers. Transit users have limited or no direct access to them and have
to call in to get updated service information and an itinerary.
The Internet and the World Wide Web are revolutionizing the process of information
dissemination, communications, and transactions, which has brought some important changes to
traditional functions of transit services. For example, most of the traditional customer service
functions (e.g., schedules, routing, itinerary planning) can be enhanced or even substituted by
Web-based information systems. The beauty of Web-based information is that it can tie
together and make more intelligible the routing and scheduling information that traditional brochure
designers struggled with for years in the pre-Web days. Many transit agencies are now in the
process of creating and upgrading their transit information on the Web. With the rapid
development of Internet technology and the proliferation of online information, the number and
use of transit information Web sites are increasing rapidly. For instance, the UK Public Transport
Information Web site had 1000 visits a month at the end of 1996; it received over 13,000 per
month by July 1998 (http://www.ul.ie/~infopolis/).
There are many transit information systems on the Web. These range from simple static schedule
displays to more sophisticated real time bus location systems. This paper provides a taxonomy to
review the state of the art of existing and future development of transit information systems
on the Web and to form a framework for Web-based transit information systems.
The purpose of transit information on the Web, like any other online information system, is
evolving from information dissemination to interactive communications and online transactions.
Transit information dissemination serves the purpose of transit information announcement and
display, such as information about published schedules and routing, as well as service changes.
Users receive the information passively. Interactive communication provides user interactivity
and feedback channels. Users can actively manipulate and search for specific information based
on their own needs and give feedback to the system providers. Transactions offer instant
interactions between system providers and users, for example, online ticketing and reservations.
Based on these evolving purposes, online transit information systems can vary significantly in
terms of content and function. Table 1 provides a framework for online transit information
Table 1
Taxonomy of online transit information systems

Function levels (functions and interface):
  0  Web browsing (HTML, PDF)
  1  Text search, static graphic links (map images)
  2  Interactive map-based search, query and analysis (Internet GIS)
  3  Customization and information delivery
  4  Online transaction

Contents                                            Content level   0    1    2    3    4
General information                                 A               A0   A1
Static information (routes, schedule and fare)      B               B0   B1   B2   B3   B4
Trip itinerary planning                             C               C0   C1   C2   C3   C4
Real time information (bus locations and delays)    D               D0   D1   D2   D3   D4
browser to server) where all information is provided as Hyper Text Markup Language (HTML)
or Portable Document Format (PDF) and/or static image maps. This is the minimal function the
online transit information system should support.
The next level (Level 1) of interface support can provide graphic browsing using graphic links
embedded in map images (clickable maps). A transit network map can be provided to link
schedule data to each time point on the map. The user is able to select a transit line or time point
on the transit network map. Each transit line or time point is linked with the schedule data. This is
useful for users who know the location of their trip ends and are already familiar with the transit
routes and services. This is still a two-tier architecture (Web browser to Web server)
implementation, and the system at this level does not provide data search and query capability.
The next level of functional support (Level 2) can provide spatial search and attribute data
search. Furthermore, it can also provide a graphic interface to allow users to directly interact with
transit network maps. For example, if the user enters an address or points to a specific location on
the map, the system can find all the bus routes and stops within walking distance of that location.
The user can render the transit network and street network maps by zooming in and out, panning,
or by conducting a spatial search. The network location data and schedule data, or real time bus
location data, are linked with a relational database in a database management system (DBMS)
on the server. Typically, this level of service requires a three-tier network architecture that handles
client-side user interaction, server-side network analysis, and database management.
There is a significant difference between functions and architectures at Level 1 and Level 2. For
the Level 1 function support, the system architecture is still the two-tier Web browser to server
structure. Static documents are linked with the portions of the map images by the underlying
Universal Resource Locators (URLs). When the user clicks a location on a map image, a linked
document page is displayed. The linked document pages, like bus schedules, are static and
pre-prepared. Any changes in the bus schedules have to be made manually on those pre-prepared
document pages. For Level 2 functionality, the system architecture is a three-tier structure. That
is, the user interface on the Web browser is linked with the Web server, which is further linked
with a GIS application server and/or database server. The spatial features (e.g., routes, stops, time
points) and their attributes (e.g., schedules, real time bus locations) on the map are connected with
a map server and/or DBMS. Any changes in the schedules in the database will be automatically
updated and available instantly. When the user makes a request, that request is transferred to the
server; the server then searches the database and returns the query results to the user. The
Extensible Markup Language (XML) can also be used to facilitate spatial and attribute data search.
Level 3 of function support is capable of providing customization and information delivery.
The system can store the user's personal profile, such as the customer's frequent origins and
destinations, and the usual time of travel. When the user logs on, the system can retrieve the
information based on the user profile, such as bus arrival information on the user's frequently
patronized routes and stops. Furthermore, this customized information can also be delivered to
the customer via pagers or hand-held devices (Peng and Jan, 1999). Wireless information delivery
is clearly a future trend of online information systems. Customization facilitates the formation
and timely delivery of personalized information. Use of real time information is especially
important for customized services. The desirability of user profiles is somewhat controversial. On the
one hand, the storage of personal profiles makes it easier to retrieve and deliver personalized
information. On the other hand, some users may resent someone else watching their travel
One of the recent developments in GIS technology is to deliver GIS data and analysis functions
on the Web through the Internet (Batty, 1999; Colman, 1999; Plewe, 1997). Internet GIS, an
emerging technology to serve GIS data and provide GIS functionality on the Web (Plewe, 1997;
Peng and Beimborn, 1998; Peng, 1999), is designed to integrate the Web and GIS in order to
manipulate, visualize, and analyze GIS data on the Web. Internet GIS has been used in many
applications (Sarjakoski, 1998; Doyle et al., 1998; Peng and Beimborn, 1998; Muro-Medrano et
al., 1999). The system architecture of Internet GIS has already evolved in its short existence. Early
development looked at the Internet as a way to disseminate spatial data (Coleman and
McLaughlin, 1997; Peng and Nebert, 1997). But accessing spatial data on the Internet did not
provide any GIS analysis functionality and was thus a very limited application. Later
developments linked existing GIS programs with the Web server to provide users some limited GIS
functionality on the Web (Colman, 1999; Conquest and Speer, 1996). This approach takes
advantage of existing GIS programs and their functions and delivers them to users through Web
browsers. Recent advances explore distributed components and three-tier system architecture
(Ran et al., 1999). The distributed component approach adopts the client server model to
distribute data and GIS processing components from the server to the Web client and is more
efficient and scalable.
Three-tier system architecture is used to design a transit information system at Levels C2, C3,
D2 and D3 as shown in Fig. 1. The three-tier architecture is composed of the Web browser (client
tier), Web server (server tier), and one or more application servers (application tier). The Web
browser is a user interface used to gather user input. The Web server acts as middleware to handle
users' requests and transfer the requests to an application server. The application server is used to
process user requests. It is composed of three components: a map server, a network analysis
server, and a database server. The map server is designed for map rendering and spatial analysis;
the network analysis server is used to provide network analysis functions; and the database server
is used to handle data management via DBMSs.
The architecture shown in Fig. 1 is a server-based information system. That is, users make
queries at the Web browser, but the processing is conducted at the application server. User queries
from a Web browser are transferred to the Web server, which sends the user's request to the map
server. Based on the user's request, the map server either processes the query itself or sends the
task to a network analysis component and/or a DBMS for processing. The output is then
delivered to the Web server and ultimately to the user at the Web browser.
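The request flow described above can be sketched in a few lines of Python. This is only an illustration of the routing between tiers, not the system's actual implementation: the request kinds ("render_map", "plan_trip", "schedule"), the Request class and the handler functions are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Request:
    kind: str     # hypothetical request types: "render_map", "plan_trip", "schedule"
    params: dict  # user input gathered by the Web browser (client tier)


def map_server(request: Request, backends: Dict[str, Callable[[dict], dict]]) -> dict:
    """Application tier: answer map requests directly, delegate everything else."""
    if request.kind == "render_map":
        return {"handled_by": "map server", "result": request.params}
    handler = backends.get(request.kind)
    if handler is None:
        return {"handled_by": "map server", "error": "unsupported request"}
    return handler(request.params)


def web_server(request: Request, backends: Dict[str, Callable[[dict], dict]]) -> dict:
    """Server tier: middleware that forwards the request and relays the output."""
    return map_server(request, backends)


# Stand-ins for the network analysis server and the database server.
backends = {
    "plan_trip": lambda p: {"handled_by": "network analysis server", "result": p},
    "schedule": lambda p: {"handled_by": "database server", "result": p},
}

reply = web_server(Request("plan_trip", {"origin": "A", "destination": "B"}), backends)
```

Keeping all processing behind the Web server is what lets the browser stay a thin client: it only submits input and displays the returned output.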
The system implementation process is shown in Fig. 2. It starts with user input on trip origin,
destination, and travel date and time. An interface needs to be developed where users interact with
the application. Its flexibility and ease of use are critical to the use of the application (Howard and
MacEachren, 1996). There are several options that can be used to design the user interface as
discussed in Table 1, i.e., text, sketch map images, and interactive maps. The problem of the
text-only interface, besides the lack of the vivid visual effect that a map can provide, is that the program
may not find the location of the trip origin and destination if inexact addresses are provided. This
is quite often the case when the user is not familiar with the area or does not know the exact
address. To avoid this situation, pull-down list boxes containing street intersections and
landmarks are sometimes used (e.g., http://www.romanse.org.uk/). Sketch route maps offer little or no
reference to surrounding streets, nor do they provide proper scale.
A unique feature of this user interface (Fig. 3) is that it is a GIS-based system with an interactive
map interface, which lets users select locations directly on the map. Some map rendering
functions, such as zoom and pan, are also provided. Users can also enter their trip
origins and destinations in an input box or select intersections or landmarks from a pull-down
list.
Furthermore, the map-based user interface also provides spatial query and searching functions
to obtain street information (e.g., street names and locations) from the map, as well as to search
for the locations of streets by typing street names. The user can find a particular street from the database
and the map will automatically center at that location. The street maps use selective labeling. That
is, as the user zooms into the detail of the map, the street names will be shown on the map. This
makes it easy to browse around the map to find more information about the neighborhood, other
bus lines and local attractions.
To enhance the interactivity between the user and the map, Web client-side applications, such
as plug-ins, ActiveX controls and Java applets, could be developed. But these client-side
applications are perceived to be too technical for transit users by the transit agency for which this
system was designed. Therefore, a server-side approach is used to build the interface to interact
with users. This is the thin-client approach, the simplest, yet the most user-friendly. It imposes no
limitation on a user's computer platform and local resources. Basic map rendering functions like
zoom, query and search are provided in HTML form. Users are able to select features directly
from the map. However, they are not able to draw a box or a circle directly on a map image
because of the limitations of HTML. XML can be used for further improvement.
The system is intended to offer users interactivity with the map by allowing users to browse
service and other information directly on the map. Therefore, the system has to be able to offer
map-rendering and address matching capability. This is handled by the Map server (e.g.,
MapObjects and MapObjects IMS by ESRI).
Trip origins and destinations are not necessarily on the transit network. Consequently, the
system needs to find all bus stops that are within walking distance (a quarter mile or 0.4 km) of the
user's trip origin and destination. The reason that all stops within walking distance are searched
rather than only the closest one is that the closest transit stop may not be on the shortest
route path (Peng, 1997). Sometimes a little longer walking time may result in a shorter overall
travel time. If there is no stop within walking distance of the trip origin and destination, a longer
walking distance should be used. After the stops near the origin and destination are found,
they are flagged for network analysis.
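The stop search can be illustrated with a minimal Python sketch. It assumes straight-line (great-circle) distance between latitude/longitude points, which the text does not specify, and the widening step (0.2 km) and upper bound (1.6 km) for the fallback are illustrative assumptions; only the 0.4 km starting radius comes from the text.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two latitude/longitude points."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def stops_within_walk(point, stops, radius_km=0.4, step_km=0.2, max_km=1.6):
    """Return (stop ids, radius used). All stops inside the radius are kept,
    not just the nearest; the radius is widened only if none is found.
    step_km and max_km are assumptions made for this sketch."""
    radius = radius_km
    while radius <= max_km + 1e-9:
        found = sorted(
            sid for sid, (lat, lon) in stops.items()
            if haversine_km(point[0], point[1], lat, lon) <= radius
        )
        if found:
            return found, radius
        radius += step_km
    return [], None


origin = (43.000, -87.900)
stops = {"S1": (43.002, -87.900), "S2": (43.003, -87.900), "S3": (43.010, -87.900)}
candidates, used_radius = stops_within_walk(origin, stops)
```

Here S1 and S2 (roughly 0.22 km and 0.33 km away) are both flagged, while S3 (about 1.1 km away) is only reached if no closer stop exists.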
Four data files were used in developing the application: bus route network, street network, bus
stops, and time points with schedule data. These data files are stored in a relational database
system using Access by Microsoft. The database is linked with the map server and network server
(discussed below) through open database connectivity (ODBC). Real time GPS location data were
not available at the time of project development, but the application was developed as an open
system to incorporate real time information when it becomes available in the near future. The bus
route file is derived from the street centerline file with the addition of some attribute data, such as
the street length, speed limit, and travel time. A line feature street map and a point feature bus
stop map were used as background layers. The street map was also used as a base map for address
matching of trip origins and destinations. The bus stop map was used for defining start and end
stops of a trip. The bus schedule database is stored separately from the spatial data for easy
update and management.
A network analysis model is the key component to provide trip itinerary planning. However,
most path finding algorithms and programs are designed for highway usage (Moor, 1957; Martin,
1963; Dial, 1971; Ikeda et al., 1994; Zhan and Noon, 1998). Although existing network analysis
and path finding algorithms serve well for highway routing and traffic assignment, problems arise
when they are applied to transit, because transit networks have significantly different
characteristics from highway networks (Spear, 1994; Peng, 1997). Many researchers have pointed out the
inadequacy of applying the path finding algorithms of highway networks to solve the minimal
path finding problems for transit networks (Le Clercq, 1972; Chriqui and Robillard, 1975; Last
and Leak, 1976; Tong and Richardson, 1984; De Cea and Fernandez, 1989; Spiess and Florian,
1989; Wong and Tong, 1998). First, transit service is time dependent: different times of the day
or different days of the week have different levels of transit service, and some services are available
only in the peak period. Second, one street segment may serve different bus routes and many
routes may stop at the same bus stop. This is the so-called "common bus lines problem" (Chriqui
and Robillard, 1975). Third, unlike the highway routing problem, where the computation of the
shortest path is symmetric with respect to an origin/destination pair, the routing on transit
networks from origins to destinations is not symmetric with that from destinations to origins.
Fourth, transit transfers depend on the arrival time of another bus. Hence the best path between
an origin and destination can change depending upon the timing of services available.
Furthermore, many routes have loops and layover time. These unique characteristics make the
minimal path finding application for transit networks much more challenging.
In the case of highway network analysis, each street segment has one unique value of travel
time and each turn at an intersection has one unique turn weight (or turn penalty). However, in a
transit system, one street segment may have several bus routes, and each has its own headway.
Some are regular buses and some are express buses. It is even more difficult to determine the turn
weight (wait time) at
each intersection in a transit system. At the same intersection, different buses may have different
turn weights. Take the intersection A in Fig. 4 as an example. For riders taking bus B-3 north, the
left turn weight is very small because the rider does not need to transfer. But for the same
turn, if the rider has to transfer from B-3 to B-2 west, the turn weight could be very large because
of the transfer. However, conventional path finding programs based on the street
network require a single turn weight for each turn at every intersection. This makes the actual turn
weight extremely difficult to determine.
Adding more complexity to the situation, the turn weight changes over time because the
bus headway changes; some buses even stop serving at certain times of the day. This problem is
similar to constructing a shortest path based on the congestion level on roadways, sometimes
referred to as the time dependent constrained shortest path or TCSP (Frank et al., 2000). Even when the
turn weight is calculated for every second, it is still difficult to determine which time period to use.
If the time of trip origin is used, the transfer route may no longer be in service when the bus gets
to that transfer point. If the expected arrival time at that intersection is used, how can the
expected arrival time be known before the path has been determined? Because of the complexity of the transit
route network, conventional highway network topology and analysis methods are difficult to
apply in transit networks.
Several transit network models were proposed in the literature. Most of these prior models for
shortest path finding in transit networks can be categorized into two groups: headway-based
and schedule-based. The headway-based path finding algorithms assign passengers to the first
arriving vehicle based on the combined frequencies of common bus lines (Spiess and Florian,
1989). Constant (average) headway during a time period on a segment is usually assumed. The
shortest path finding algorithms are usually variants of traffic assignment procedures used for
highway networks that are modified to reflect the waiting time inherent to transit networks (Dial,
1967; Last and Leak, 1976). The schedule-based transit network models use a branch-and-bound
type algorithm to determine the time dependent least cost paths between all origin/destination
pairs in the transit network. The passengers are assumed to board the first vehicle to arrive as
specified in a pre-determined schedule (Tong and Richardson, 1984; Wong and Tong, 1998). The
path assignment for the headway-based approach is stochastic and heuristic in nature in the sense
that a passenger always takes the first vehicle that arrives. But the algorithm itself does not
identify which routes the passenger should take if there is more than one bus line going to the
same destination. The results could be more than one path for any origin/destination pair. In
contrast, the schedule-based transit models deterministically identify the arrival of the first vehicle
based on a fixed schedule. Therefore, they yield one and only one optimal solution for any given
origin/destination pair.
This study combines the headway-based and schedule-based approaches in a two-stage path
finding process. In the first stage, average headway is calculated for each segment on the transit
network. The average headway is used to calculate travel time on each link and the turn weight at
each intersection. A vine-building type shortest path algorithm (Kirby and Potts, 1969) is then
used to estimate the shortest path for each origin/destination pair. Since transit service frequency
varies at different times of the day, the average headway needs to be calculated for each time
period (every half-hour in this case). In other words, the shortest path problem has been
transformed into many time dependent shortest path problems over many discrete time intervals. This
is similar to the time independent solution method in Frank et al. (2000) and Handler and Zang
(1980).
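The per-period headway calculation can be sketched as follows. This is a minimal illustration rather than the authors' procedure: departure times are given in seconds after midnight, and each gap between consecutive departures is assigned to the half-hour period in which the earlier departure falls, which is an assumption of this sketch.

```python
from collections import defaultdict

PERIOD_S = 1800  # half-hour time periods, as stated in the text


def period_index(t_s):
    """Index of the half-hour period containing time t_s (seconds after midnight)."""
    return int(t_s // PERIOD_S)


def average_headway_by_period(departures_s):
    """Map period index -> mean gap (s) between consecutive departures.
    Each gap is credited to the period of the earlier departure (an assumption)."""
    times = sorted(departures_s)
    gaps = defaultdict(list)
    for prev, nxt in zip(times, times[1:]):
        gaps[period_index(prev)].append(nxt - prev)
    return {p: sum(g) / len(g) for p, g in gaps.items()}


# Departures at 16:00, 16:10, 16:20, 16:30, 16:50 and 17:10 on one segment.
departures = [57600, 58200, 58800, 59400, 60600, 61800]
headways = average_headway_by_period(departures)
```

In this example the 16:00 period (index 32) has an average headway of 600 s and the 16:30 period (index 33) has 1200 s, so each period gets its own link and turn weights for the first-stage search.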
An example is shown in Fig. 5. Assume a user takes bus B-1 from the street St-1 and needs to
change to the west on Street St-2 at intersection A. The user needs to transfer at point A. The
waiting time at point A depends on the number of bus routes on the street St-2 running to the west
and the headway of each bus route on that street at a specific time.
The value of the turn weight is derived statistically from the possible wait time at the
intersection that may involve bus transfers. One-half of the average headway of all buses in one
direction at the intersection is used to determine the turn weight at a specific time. For each bus
line, the average headway is taken over the next three consecutive headways, and one-half of the
combined headway of all lines is used as an aggregated measure of turn weight. For example, if the
trip start time is 4:00 p.m., the average of three consecutive headways of route B-2 and route B-3
at intersection A after 4:00 p.m. is 600 and 900 seconds, respectively. Together the two routes
provide 3600/600 + 3600/900 = 10 buses per hour, a combined headway of 360 s. Thus the weight
value for that turn can be calculated by
(1/2) * 3600/(3600/600 + 3600/900) = 180 s; the average waiting time is calculated to be 180 s.
Intersections that have more transit routes will have smaller headway and smaller turn weight
values, and hence less impedance. The larger the number of alternative bus routes and the shorter
the headway, the smaller the turn weight value and therefore the more likely the intersection is to
be selected as a turn point (or transfer point). Because the bus headway may change by time of
day, the turn weight value varies accordingly.
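The turn weight arithmetic in the example can be written as a small Python function. The function name and its list-of-headways input are choices made for this sketch; the formula itself is the one given in the text.

```python
def turn_weight_s(headways_s):
    """Half the combined headway (s) of the bus lines serving a turn.

    headways_s: average headway of each line, in seconds, for the time
    period of interest (e.g., the mean of its next three headways).
    """
    if not headways_s:
        raise ValueError("at least one bus line is required")
    buses_per_hour = sum(3600.0 / h for h in headways_s)  # combined frequency
    combined_headway = 3600.0 / buses_per_hour            # seconds between buses
    return 0.5 * combined_headway


# Routes B-2 and B-3 at intersection A after 4:00 p.m.
weight = turn_weight_s([600, 900])
```

With headways of 600 s and 900 s this reproduces the 180 s value in the text. A single line with a 600 s headway would give 300 s, which shows why intersections served by more routes receive smaller turn weights.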
This method of calculating the value of turn weight is only a proxy for potential wait time. The
turn weight is only used in the process of finding the shortest path. It is updated at the time of
travel. The travel time for each link on the transit network is determined by the average travel
time of bus lines. Except for occasional express buses, the travel time on each link for every bus is
very similar. The link travel time and turn weight are then used to search for the shortest path
using a vine-building algorithm.
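To show how link travel times and turn weights enter a shortest path search together, the sketch below uses a Dijkstra search over links rather than nodes. It is a stand-in chosen for brevity, not the Kirby and Potts vine-building algorithm cited in the text; what the two share is that the cost of entering a link may depend on the link just traversed. The data layout (dictionaries keyed by node pairs and node triples) is an assumption of this sketch.

```python
import heapq


def shortest_path_with_turns(links, turn_weight, origin, destination):
    """links: {(u, v): travel_time_s}; turn_weight: {(u, v, w): penalty_s} for
    moving from link (u, v) onto link (v, w). Missing turns cost nothing.
    Returns (total_time_s, [nodes]) or (inf, []) if no path exists."""
    if origin == destination:
        return 0.0, [origin]
    out = {}
    for (u, v), t in links.items():
        out.setdefault(u, []).append((v, t))
    # The search state is the link just traversed, so a turn penalty can depend on it.
    heap = [(t, (origin, v), [origin, v]) for v, t in out.get(origin, [])]
    heapq.heapify(heap)
    settled = set()
    while heap:
        cost, (u, v), path = heapq.heappop(heap)
        if (u, v) in settled:
            continue
        settled.add((u, v))
        if v == destination:
            return cost, path
        for w, t in out.get(v, []):
            if (v, w) not in settled:
                penalty = turn_weight.get((u, v, w), 0.0)
                heapq.heappush(heap, (cost + penalty + t, (v, w), path + [w]))
    return float("inf"), []


links = {("O", "A"): 100, ("A", "D"): 100, ("O", "B"): 150, ("B", "D"): 100}
turns = {("O", "A", "D"): 180.0}  # e.g., a transfer at intersection A
best_time, best_path = shortest_path_with_turns(links, turns, "O", "D")
```

In this toy network the route through A is shorter on link time alone (200 s) but costs 380 s once the 180 s turn weight is added, so the search returns the 250 s route through B.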
Once a shortest path is defined, actual wait time and transfer points are retrieved from the
schedule database, and the user is assigned to specific bus lines based on the actual schedule.
Although the shortest path finding program identifies transfer points at intersections, some actual
transfer points may be at non-intersections. For example, buses B-1 and B-4 run on the same
street, St-1, and a passenger needs to transfer from B-4 to B-1 as shown in Fig. 6. The shortest
path program will identify the transfer point at intersection A. The program will then check
the bus stop location database to identify the shortest distance between stops on routes B-1 and
B-4. Usually a shared bus stop (in this case, stop Z) will be chosen as the actual transfer point.
In the case of multiple routes going to the same destination that could be transferred to, the
next arriving bus is chosen as the first choice for the transfer. However, the program also lists
other available bus lines so that the passenger has backup options in case the current bus and/or
the transfer bus is not on schedule. If there are express bus lines, the first regular bus that arrives
at the transfer point may not be the best choice. The passenger may be better off waiting a little longer
for the next express bus than taking the first arriving regular bus. To take this into consideration,
the program compares the arrival times, at the next transfer point or destination, of all possible
buses that go through the trip origin or current transfer point. The one that gets to the next transfer
point or destination first is the fastest one. The program lists that choice as the first path
choice. As expected, the shortest path that goes through major streets that have more bus routes is
usually selected, and transfer points are usually chosen at intersections that have more alternative
buses on the same street.
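The comparison of arrival times can be sketched as a simple ranking. The example below is only an illustration under assumed data: the schedule records, field names, and times (minutes after midnight) are hypothetical, and the real system reads them from its schedule database.

```python
def rank_transfer_options(options, ready_time):
    """Order boardable buses by arrival time at the next transfer point or destination.

    options: list of dicts with keys 'route', 'depart', 'arrive'
             (minutes after midnight; hypothetical schedule records).
    ready_time: earliest minute at which the passenger can board.
    """
    # Only buses that leave after the passenger is at the stop can be taken.
    boardable = [o for o in options if o["depart"] >= ready_time]
    # Earliest arrival first; an earlier departure breaks ties.
    return sorted(boardable, key=lambda o: (o["arrive"], o["depart"]))


options = [
    {"route": "regular", "depart": 965, "arrive": 995},  # 16:05 -> 16:35
    {"route": "express", "depart": 970, "arrive": 985},  # 16:10 -> 16:25
    {"route": "early", "depart": 955, "arrive": 980},    # left before 16:00
]
ranked = rank_transfer_options(options, ready_time=960)
print([o["route"] for o in ranked])  # ['express', 'regular']
```

The express bus is listed first even though it departs later, and the regular bus remains in the list as a backup option.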
This hybrid method, relying on the average headway as turn weight, may not produce the
shortest path. To produce one single best route, one has to calculate the actual wait time at each
potential transfer point based on the service schedule. For example, there would be two alternative
turn weights at point A in Fig. 5 for a given time period. One is the wait time from bus B-1
transferring to bus B-2 and the other is the wait time from bus B-1 to bus B-3. But all
vine-building-based path finding algorithms can handle only one turn weight for each intersection; they
cannot handle two or more turn weights at one intersection. To solve this problem, additional
pseudo-links and pseudo-nodes need to be added to the transit network to represent each individual
bus route. Theoretically, this method could produce the shortest path if the bus is always
100% on time. However, buses are not always on time. Therefore, there are at least two problems
associated with this schedule-only approach. First, since the bus is not always on time, we have to
give enough cushion time for passengers to transfer. If the cushion time is too long, the resulting
shortest path may not be the shortest. If the cushion time is too short, it is possible that the
passenger may miss the transfer bus at the transfer point. Second, the headway-based turn weight
usually selects a transfer point with more transit routes. This gives passengers more options in case
they miss the first bus. But the schedule-only-based algorithm depends solely on the schedule of
individual routes; it does not give preference to streets with multiple bus routes. Therefore, it offers
fewer route options. For example, based on the schedule match, the schedule-only approach may result
in the shortest path that goes through a link with a single bus route with a headway of 30 min. The
hybrid approach yields a path that goes through a link with multiple bus routes. In the former
case, the wait time is only 5 min; in the latter case, the wait time is 7 min, but there are two bus
lines each with 15 min headway. If for some reason the passenger misses the first bus, the passenger
then has to wait up to 30 min for the next bus in the former case, but he/she needs only to wait
for another 15 min or less in the latter case. Therefore, given the not-so-reliable transit services,
the hybrid approach will produce a better solution.
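The trade-off in this example can be made concrete with a small calculation. The sketch below is an illustration only: it assumes a probability `p_miss` of missing the first bus, which does not appear in the text, and it uses the stated upper bounds (30 min and 15 min) as the wait after a missed connection.

```python
def expected_wait(planned_wait_min, wait_if_missed_min, p_miss):
    """Expected wait when the first bus is missed with probability p_miss (an assumed parameter)."""
    if not 0.0 <= p_miss <= 1.0:
        raise ValueError("p_miss must be between 0 and 1")
    return (1.0 - p_miss) * planned_wait_min + p_miss * wait_if_missed_min


def break_even_miss_probability(plan_a, missed_a, plan_b, missed_b):
    """Miss probability at which options A and B have the same expected wait, or None."""
    # Solve plan_a + p * (missed_a - plan_a) = plan_b + p * (missed_b - plan_b) for p.
    denom = (missed_a - plan_a) - (missed_b - plan_b)
    if denom == 0:
        return None
    return (plan_b - plan_a) / denom


# Schedule-only path: 5 min planned wait, up to 30 min if the bus is missed.
# Hybrid path: 7 min planned wait, up to 15 min if the bus is missed.
p_star = break_even_miss_probability(5, 30, 7, 15)
print(round(p_star, 3))  # 0.118
```

Under these assumptions the hybrid path has the lower expected wait whenever the chance of missing the first bus exceeds about 12%, which is consistent with the argument that it is preferable when service is unreliable.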
The performance of the path finding program is very good. The shortest path has been
pre-constructed from every node to all other nodes for every half-hour time period and is stored on the
server. Therefore, there is no need to estimate the shortest path on the fly. Depending on network
traffic, it usually takes a couple of seconds to retrieve the information from the server.
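A pre-constructed path store of this kind amounts to a lookup keyed by origin, destination, and half-hour period. The following is a minimal sketch under assumed names: the in-memory dictionary, node identifiers, and function names are hypothetical stand-ins for the server-side storage.

```python
def half_hour_period(minutes_after_midnight):
    """Map a time of day to one of 48 half-hour periods (0 = 00:00-00:29)."""
    return (minutes_after_midnight % 1440) // 30


def lookup_path(table, origin, destination, minutes_after_midnight):
    """Return the stored path for this origin, destination, and period, or None."""
    key = (origin, destination, half_hour_period(minutes_after_midnight))
    return table.get(key)


# Hypothetical table with a single stored path for the 16:00-16:29 period.
path_table = {("N1", "N9", 32): ["N1", "N4", "N9"]}
print(lookup_path(path_table, "N1", "N9", 16 * 60 + 10))  # ['N1', 'N4', 'N9']
```

Because the search is done ahead of time, answering a request costs one dictionary lookup, at the price of storing one path per node pair and period.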
Once the shortest path has been determined, the system creates path directions to report to the
user the names of starting and ending bus stops, the bus route(s) to take, and transfer stops and
transfer routes, as well as bus arrival and departure information. The system also conducts a
separate shortest path search for walk directions from trip origin to the starting transit stop and
from ending transit stops to trip destination. Since the street distance is the only link cost,
Dijkstra's (1959) shortest path algorithm is used to determine the walking path. The whole path
direction (including walking and transit) is reported to the user in both text format and maps.
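For the walking segments, where distance is the only cost, a standard implementation of Dijkstra's algorithm is sufficient. The sketch below uses a hypothetical street graph with distances in metres; node names and distances are invented for illustration.

```python
import heapq


def dijkstra(graph, source, target):
    """Shortest path by distance. graph: {node: {neighbor: distance}}.

    Returns (distance, node_list), or (inf, []) if the target is unreachable.
    """
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return float("inf"), []
    # Walk the predecessor chain back from the target.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return dist[target], path


# Hypothetical walk from a trip origin to the starting transit stop.
streets = {
    "home": {"a": 120.0, "b": 200.0},
    "a": {"stop": 150.0},
    "b": {"stop": 50.0},
}
print(dijkstra(streets, "home", "stop"))  # (250.0, ['home', 'b', 'stop'])
```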
A user profile is used to store customers' personal information, including home address, work
address, locations of other common destinations, and usual time to work and to home. This
information is stored in a user profile database. Each user is assigned a user identifier and a
password. When the user enters the site, his/her personalized information will be automatically
retrieved. The personalized information includes trip itinerary and bus schedules. A timer is set up
to check the current time against the usual time to work and time to home. If the current time is
closer to the time to home, a work-to-home trip itinerary will be presented by default. The user
can always change his/her profile information, and the database will be updated accordingly. The
personal profile is not mandatory. But creating a personal profile allows the system to inform the
user via e-mail or other means of any service changes. In the future, as real time information becomes
available, the user can get more timely information on delays and service changes.
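The default itinerary rule can be sketched as a comparison of two time gaps. This is an illustration under stated assumptions: times are minutes after midnight, the gap is measured around the clock (so 23:30 is close to 00:30), and the function name and return labels are hypothetical.

```python
def default_trip(now_min, to_work_min, to_home_min):
    """Pick the default itinerary from the current time and the user's usual travel times."""

    def gap(a, b):
        # Distance between two times of day, measured around a 24-hour clock (assumption).
        d = abs(a - b) % 1440
        return min(d, 1440 - d)

    if gap(now_min, to_home_min) < gap(now_min, to_work_min):
        return "work-to-home"
    return "home-to-work"


# Hypothetical profile: leaves for work at 07:30 and for home at 17:00.
print(default_trip(16 * 60 + 10, 7 * 60 + 30, 17 * 60))  # work-to-home
print(default_trip(6 * 60, 7 * 60 + 30, 17 * 60))        # home-to-work
```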
The system design allows for displaying real time bus locations once the real time AVL data are
available. The bus locations on the map can be updated at a pre-defined time interval such as
every 30 seconds or every minute. An example is shown in Fig. 7 to display real time bus locations
and bus movement animation using Global Positioning System (GPS) data. One important
improvement to be made in the future would be using real time bus locations based on the AVL data
in the path finding process, raising the level of service to D2 and D3. Since AVL systems have been
installed on buses in many transit systems, serving real time data is a matter of technical and
institutional coordination. The current online transit information system is designed flexibly enough
to accommodate future real time GPS data.
Z.-R. Peng, R. Huang / Transportation Research Part C 8 (2000) 409-425
5. Conclusion
This paper presents a distributed Web-based transit information system. A unique feature of
the system is that it integrates Internet GIS into the system design so that the user interface is
map-based. The user can interact with the transit network and street maps, conducting query,
search, and map rendering. The interactive map-based user interface also allows the system to
incorporate other information, such as shops, theaters, parks, and other local attractions. This is
very important for visitors who may want to explore these sites around their destinations. Internet
GIS has been proven to be a powerful tool to develop flexible and versatile functions and to
deliver rich information content to the users through the Internet and the World Wide Web.
A shortest path finding algorithm that combines headway-based and schedule-based methods is
also developed to fit the unique characteristics of transit networks, namely time-dependent
services, multiple transit routes on the same street, and non-symmetry of shortest paths between
origin/destination pair and destination/origin pair.
Further work includes expansion to utilize real time transit location information and traffic
conditions to allow real time trip planning. Another expansion would be developing mechanisms
to deliver personalized information to users via wireless devices.
Acknowledgements
This article has benefited greatly from the comments and suggestions from Professor
Jean-Claude Thill and three anonymous referees. The authors would also like to acknowledge financial
support from the Wisconsin Department of Workforce Development and the Center for
Transportation Education and Development at the University of Wisconsin - Milwaukee. The authors
would also like to thank Professor Nancy Frank who has painstakingly edited the draft of the
manuscript.
References
Batty, M., 1999. New technology and GIS. In: Longley, P.A., Goodchild, M.F., Maguire, D.J., Rhind, D.W. (Eds.),
Geographic Information Systems. Wiley, Chichester, pp. 309-316.
Casey, R.F., Labell, L.N., Prensky, S.P., Schweiger, C.L., 1998. Advanced Public Transportation Systems: The State of
the Art Update '98. Federal Transit Administration, Washington.
Chriqui, C., Robillard, P., 1975. Common bus lines. Transportation Science 9, 115-121.
Coleman, D.J., 1999. Geographic information systems in networked environments. In: Longley, P.A., Goodchild, M.F.,
Maguire, D.J., Rhind, D.W. (Eds.), Geographic Information Systems. Wiley, Chichester, pp. 317-329.
Coleman, D.J., McLaughlin, J.D., 1997. Information access and usage in a spatial information marketplace. Journal of
Urban and Regional Information Systems 9 (1), 8-19.
Conquest, J., Speer, E., 1996. Disseminating ARC/INFO dataset documentation in a distributed computing
environment. In: Proceedings of 1996 ESRI User Conference, Redlands, CA (ESRI URL:
http://www.esri.com/resources/userconf/proc96/TO200/PAP166/P165.m).
De Cea, J., Fernandez, J.E., 1989. Transit assignment to minimal routes: an efficient new algorithm. Traffic Engineering
and Control 30, 491-494.
Dial, R.B., 1967. Transit pathfinder algorithm. Highway Research Records 205, 67-85.
Dial, R.B., 1971. A probabilistic multipath assignment model which obviates path enumeration. Transportation
Research 5, 83-111.
Dijkstra, E.W., 1959. A note on two problems in connection with graphs. Numerische Mathematik 1, 269-271.
Doyle, S., Dodge, M., Smith, A., 1998. The potential of web-based mapping and virtual reality technologies for
modeling urban environments. Computers Environment and Urban Systems 22 (2), 137-155.
Environmental Systems Research Institute, Inc., 1998. NetEngine: A Programmer's Library for Network Analysis.
Environmental Systems Research Institute, Inc., Redlands, CA.
Frank, W.C., Thill, J.-C., Batta, R., 2000. Spatial decision support system for hazardous material truck routing.
Transportation Research C 8 (1-6), 337-359.
Handler, G.Y., Zang, I., 1980. A dual algorithm for the constrained shortest path problem. Networks 10, 293-310.
Howard, D., MacEachren, A.M., 1996. Interface design for geographic visualization: tools for representing reliability.
Cartography and Geographic Information Systems 23 (2), 59-77.
Ikeda, T., Hsu, M.Y., Imai, H., Nishimura, S., Shimoura, H., Hashimoto, T., Temmoku, K., Mitoh, K., 1994. A fast
algorithm for finding better routes by AI search techniques. IEEE Vehicle Navigation & Information Systems
Conference Proceedings B 3-6, 291-296.
Kirby, R., Potts, R.B., 1969. The minimal route problem for networks with turn penalties and prohibition.
Transportation Research 3.3, 397-408.
Last, A., Leak, S.E., 1976. Transept: a bus model. Traffic Engineering and Control 17, 14-20.
Le Clercq, F., 1972. A public transport assignment method. Traffic Engineering and Control 13, 91-96.
Moore, E.F., 1957. The shortest path through a maze. In: Proceedings of the International Symposium on the Theory of
Switching, Harvard University.
Martin, B.V., 1963. Minimum path algorithms for transportation planning. Research Report R63-52. Department of
Civil Engineering, Massachusetts Institute of Technology.
Muro-Medrano, P.R., Infante, D., Guillo, J., Zarazaga, J., Banares, J.A., 1999. A CORBA infrastructure to provide
distributed GPS data in real time to GIS applications. Computers Environment and Urban Systems 23 (4), 271-285.
Peng, Z.R., 1997. A methodology for design of GIS-based automatic transit traveler information systems. Computers
Environment and Urban Systems 21 (5), 359-372.
Peng, Z.R., 1999. An assessment framework of the development strategies of internet GIS. Environment and Planning
B: Planning and Design 26 (1), 117-132.
Peng, Z.R., Beimborn, E., 1998. Internet GIS: applications in transportation. TR News, Number 195, March/April,
pp. 22-26.
Peng, Z.R., Jan, O., 1999. An assessment of means of transit information delivery. Transportation Research Record,
forthcoming.
Peng, Z.R., Nebert, D., 1997. An internet-based GIS data access system. Journal of Urban and Regional Information
Systems 9 (1), 20-30.
Plewe, B., 1997. GIS Online: information retrieval, mapping, and the internet. OnWorld Press, Santa Fe, NM.
Ran, B., Chang, B.P., Chen, J., 1999. Architecture development for web-based GIS applications in transportation. In:
Paper presented at the 78th Transportation Research Board Annual meeting, Washington, 10-14 January.
Sarjakoski, T., 1998. Networked GIS for public participation - emphasis on utilizing image data. Computers
Environment and Urban Systems 22 (4), 381-392.
Salters, T., 1996. DART on target. ITS World, May/June.
Spear, B.D., 1994. GIS and spatial data needs for urban transportation applications. In: David Moyer, D., Ries, T.
(Eds.), Proceedings of the 1994 Geographic Information Systems for Transportation (GIS-T) Symposium, Norfolk,
Virginia, pp. 31-41.
Spiess, H., Florian, M., 1989. Optimal strategies: a new assignment model for transit networks. Transportation
Research B 23, 83-102.
Tong, C.O., Richardson, A.J., 1984. A computer model for finding the time-dependent minimum path in a transit
system with fixed schedules. Journal of Advanced Transportation 18 (2), 145-161.
Wong, S.C., Tong, C.O., 1998. Estimation of time-dependent origin-destination matrices for transit networks.
Transportation Research B 32 (1), 35-48.
Zhan, F.B., Noon, C.E., 1998. Shortest path algorithms: an evaluation using real road networks. Transportation
Science 32 (1), 65-73.
Abstract
This paper is concerned with the development of an Internet-based geographic information system (GIS)
that brings together spatio-temporal data, models and users in a single efficient framework to be used for a
wide range of transportation applications - planning, engineering and operational. The functional
requirements of the system are outlined taking into consideration the various enabling technologies, such as
Internet tools, large-scale databases and distributed computing systems. Implementation issues as well as
the necessary models needed to support the system are briefly discussed. © 2000 Elsevier Science Ltd. All
rights reserved.
Keywords: Geographic information systems; Intelligent Transportation Systems; Distributed systems; Dynamic traffic
assignment
1. Introduction
0968-090X/00/$ - see front matter © 2000 Elsevier Science Ltd. All rights reserved.
PII: S0968-090X(00)00027-9
A.K. Ziliaskopoulos, S.T. Waller / Transportation Research Part C 8 (2000) 427-444
associated with spatial and temporal coordinates, (iii) they are often used by more than one
model for different applications. Great benefits could be realized if these data were integrated into
a single database or linked distributed databases and became accessible by all necessary models.
Data and models could, in turn, be available to all involved entities: planners, engineers,
operators, as well as various stakeholders, such as researchers, consultants, trucking companies,
special interest groups and even the traveling public.
Presently, transportation professionals have to cope with fragmented databases, multiple and
incompatible models, redundant and often conflicting data acquisition efforts, and lack of
coordination between various agencies and private companies operating on the same transportation
facilities. This results in serious inefficiencies and waste of resources. For example, a transportation
planner uses the network of a metropolitan area for air quality analysis in a certain format that
is usually incompatible with the signal optimization or simulation software that the traffic engineer
of the same urban area uses to optimize the operations on the urban network. Typically, both
professionals independently collect and code the data for the same network and demand (albeit at
different aggregation levels) without taking advantage of the existing resources at each other's
agency.
Enabling technologies developed over the past few years have created unprecedented
opportunities to overcome some of the problems above. These technologies include the explosion of the
Internet and Internet support tools, terabyte size databases, distributed computing architectures,
client server technologies as well as a new generation of transportation tools resulting from the
evolution of Intelligent Transportation Systems. Many of these technologies have already been
adopted by corporations and have led to the development of new business models. In fact, from
supply chain dynamics to customer relationship management to the enabling of an e-commerce
infrastructure, the opportunity and need have never been greater to link business processes and
people throughout an organization (Chambers, 1999). Internet-enabled geographic information
systems (GIS) have also attracted a lot of attention in the last few years: Jankowski and Stasik
(1997) introduced an Internet-based GIS to make possible collaborative spatial decision making
via public participation. Keisler and Sundell (1997) presented an integrated geographic
multi-attribute utility system with application to park planning. An extensive survey of applications and
research issues for geographic information technologies in business is provided in
Mennecke (1997).
This paper introduces a prototype Internet-based GIS that aims to integrate spatio-temporal
data and models for a wide range of transport applications: planning, engineering and
operational. The GIS graphic user interface (GUI) is built in JAVA, so that it can be used over the
Internet (or any other large network). The database efficiently stores and retrieves spatio-temporal
data by associating geographic coordinates and time stamps. The database is designed to
efficiently manage a wide range of transport data: from off-line planning and engineering data to streams
of real-time data, such as those coming in from street sensors and newer vehicle based devices.
Furthermore, the control and encapsulation of the data that becomes possible with such a system
helps deal with problems arising from agency specific requirements.
A number of models have been coded in the same framework and can be accessed via the GIS
GUI in a client server setting. The models can be remotely accessed via the GIS and run in a
distributed environment based on the common object request broker architecture (CORBA). The
implemented models include traditional signal control and analysis tools, planning models as well
as newer dynamic traffic assignment (DTA) and routing algorithms. This paper discusses the
interactions between the models, the potential efficiencies achieved by the integration of data and
models, implementation difficulties as well as the implications for planning, engineering and
operational practices. It should be noted that the primary focus is on the technical aspects of this
integration and not on the institutional/policy issues even though the latter would obviously
constrain the final system. Such issues, however, deserve attention, which is beyond the scope of
this paper. Furthermore, due to the wide variety of policies and agencies, the introduced system is
not currently presented in the context of any particular state agency or entity. Instead, the
underlying characteristics of transportation data and models are examined in order to develop a
system which can later be placed within the institutional framework of a particular entity.
In the next section, we identify the needs and the functional requirements of the system. The
overall model architecture is presented in Section 3. The models currently implemented or that
need to be part of the framework are described in Section 4. Implementation details are provided
in Section 5, including a justification for the Internet-based GUI and the distributed computing
environment. Section 6 discusses the potential benefits that can be realized by the integration of
the data, models and users. Section 7 concludes this paper and identifies directions for future
research.
The transportation practice involves data, models and users. Data currently collected by
transportation agencies can be loosely classified into the following types: planning, engineering
and operational; a similar classification can be made for the models used. Planning data range
from trip surveys, socioeconomic data by region and network infrastructure data, to aggregate
daily link traffic flows. These data are used for long-range planning purposes, such as
estimating the impact of infrastructure improvements, mode split, and environmental impact.
Models used involve the traditional four-step planning models as well as other forecasting and
discrete choice models. Engineering applications typically require more detailed network,
control and traffic data, obtained by direct observation and traffic engineering studies.
Network data involve detailed representation of intersection geometry, turning movements and
prohibitions; traffic data are often aggregated every 15 min, while signal or sign control
information, such as controller type and precise timing plan, is needed. These data are used for
shorter-term engineering applications, such as warrant analysis, signal timing plan
development, and freeway management. Models used by engineers typically comprise signal
optimization, simulation and capacity analysis. Finally, traffic operators collect real-time data
from online sensors, based on which real-time operational decisions are being made. The
network infrastructure data are also continuously updated based on construction zone plan
changes, incidents and traffic control. Urban traffic control system (UTCS) type models are
often used by control centers and occasionally ramp metering and variable message algorithms
by freeway operators. A rather crude list of the various types of the data, models and users is
included in Table 1.
As mentioned earlier, the state of the transportation practice is characterized by fragmented
databases, redundant and often ill-conceived data acquisition efforts, legacy database software,
Table 1
Types of data, models and users for transport applications
This functional requirement identifies the Internet as the means of accessing the system, though
an Intranet or any other large scale corporate network structure could also satisfy this
requirement. Given, however, that one class of users could ultimately be the public, the Internet seems to
be the logical and least expensive approach to adopt. In addition, the existing vast infrastructure
of the Internet guarantees continuous updates, new tools and familiarity of the users with these
tools. If the interface of the GIS is accessible via the Internet, many users can access the system
simultaneously and perform a plethora of operations.
2.2. An integrated (and possibly distributed) Internet-enabled database that can manage large sets of
off-line and online data
2.3. Distributed models so that the embedded algorithms with large needs in computational power can
be executed in a reasonable time
Simply put, distributed systems allow simultaneous execution of many objects on many
computers spatially distributed and interconnected via the Internet or another network.
Distributed systems possess intrinsic abilities for both synchronous and asynchronous
communication among objects. This allows the interface to be executed remotely from the
implementation of the algorithms, and multiple processors to be used for the execution of the
algorithm modules if needed.
2.4. An efficient reporting system that can query the database and produce intuitive reports that can
address the needs of all users
This functionality will facilitate communication among professionals. As mentioned above, this
functionality typically comes as a support tool to commercial databases, such as ORACLE 8i. It
should be emphasized, however, that there is a trade-off between a reporting system that is easy to
use and one that allows the user to customize the query and reporting system, which increases the
need for training and the difficulty of using the system.
2.5. An efficient administrative support system for allowing users to manage their datasets and
reports
During the course of using the system, many reports and secondary datasets will be produced.
The users should have the ability to manage their files, remove temporary data sets and
permanently store and publish the final ones.
2.6. An effective security system that protects data integrity and protects authorized data access
Different users will have access to the database at various authorization levels. For example, a
construction engineer can update the construction plans for a workzone area that changes the
available capacity on the particular facility. These changes will be immediately known to the
operator of that facility, who may not have, however, the authority to change them. The opposite
may happen for the ramp metering strategy of the facility under construction: it may be known to
the construction or resident engineer but cannot be altered by them.
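The workzone and ramp metering example can be expressed as a small permission table. This is a minimal sketch under assumed names: the roles, data set labels, and the `is_allowed` helper are hypothetical and do not describe the actual security mechanism of the system.

```python
# Hypothetical authorization table: role -> data set -> allowed actions.
PERMISSIONS = {
    "construction_engineer": {
        "workzone_plan": {"read", "write"},
        "ramp_metering": {"read"},
    },
    "facility_operator": {
        "workzone_plan": {"read"},
        "ramp_metering": {"read", "write"},
    },
}


def is_allowed(role, dataset, action):
    """Return True if the role may perform the action on the data set."""
    return action in PERMISSIONS.get(role, {}).get(dataset, set())


# The engineer may change the workzone plan; the operator can only see it.
print(is_allowed("construction_engineer", "workzone_plan", "write"))  # True
print(is_allowed("facility_operator", "workzone_plan", "write"))      # False
```

Unknown roles or data sets fall through to an empty permission set, so access is denied by default.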
2.7. Scalability of the system so that data and models can be expanded as user needs evolve
The system will continuously evolve as more users, data and models are included. Thus, the
system should be scalable to accommodate future growth and changes.
2.8. Communication among users allowing the creation of virtual professional communities
The Internet and the integrated database provide for a transparent system, where besides data
and models, secondary data, reports, analysis and actions are available to other users operating
on the system. This will allow not only for information access but also for information exchange; if
researchers are included, this could be a convenient medium for technology transfer.
The overall approach introduced in this paper was inspired by and is implemented according to
practices adopted by the InfoTech (IT) Industry. The objective is to meet the functional
requirements identified in the previous section and bring together data, models, users and
applications into a seamless, efficient, interwoven system. The introduced framework is called visual
interactive system for transportation algorithms (VISTA). VISTA is a CORBA compliant
distributed system accessible over a network (including the Internet; see footnote 1); the client is
machine-independent (JAVA technology), user-friendly and accessible based on various authorization levels.
Data and models can be accessed by all users at different capacities, for retrieval, maintenance and
analysis. CORBA is employed since application modules are written in a variety of languages (C,
C++, FORTRAN, etc.) and may run on various Windows or UNIX platforms. While other
distribution technologies (RMI, DCOM) have developed towards this more open object-oriented
framework structure, it has been a relatively recent move and was not found sufficient to meet the
functional requirements at the time of development.
Specifically, the system implemented so far includes:
• A data warehouse module (similar to ORACLE 8i) accessed over a network (Intranet/Internet).
• Existing and new transportation models and tools (planning, engineering, control, monitoring,
evaluation and operational).
1 http://its.civil.nwu.edu/vista-beta/
• User interfaces for the various stakeholders (planners, engineers, policy makers, and operators)
that enable access to the data and models from any computer hardware, at any location, at any
time, by any user.
• The system is a CORBA compliant distributed system, allowing for CPU intensive models to be
executed on many computers.
• Support capabilities for all relevant transportation applications.
• Functionality for interaction among users.
• Some reporting capability.
• Security features for many users accessing the same database at various authorization levels.
• Some basic administrative capabilities.
The system is intended to be deployed at a State level, where the State Department of
Transportation (DOT), the Metropolitan Planning Organizations (MPOs), County Engineers,
City Engineers, Transit Agencies, Freight Agencies, and other stakeholders will have access to the
system at various authorization levels to obtain/maintain data, run models and perform analysis.
Policy makers at the Federal, State and Local governments will be able to monitor projects,
obtain data, evaluate impacts of policies and make decisions. The overall structure of VISTA is
outlined in Fig. 1.
The user interface is written entirely in JAVA so that it can easily be used on multiple platforms
and across the Internet. It communicates with the Management Module through the standard
Java 1.2 object request broker (ORB), allowing any user with a recent Netscape or Internet
Explorer browser to access the system by using the Java 1.2 plug-in. The interface works as a
client exclusively, and communicates only with the Management and Database modules. If the
client is functioning as an Applet, it is constrained by tight security restrictions with regard to
network communication. Essentially, the Applet is allowed to communicate only with the
machine from which it originated. Therefore, it is important that the Management Module be run on
the same machine which services the VISTA web page.
When the VISTA HTML page is loaded, the interoperable object reference (IOR) string is read
in as a parameter. This IOR string consists of a sequence of characters which uniquely identifies
the Management Module and allows the interface to look it up through the CORBA function
ORB.string_to_object(ior). This function returns a reference that can be used to get a Java
object of class Management. Once this reference is obtained, the Management object can be
manipulated as if it were any other local object.
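To make the IOR parameter more concrete: in its stringified form, a CORBA object reference is the prefix IOR: followed by an even number of hexadecimal digits. The sketch below only checks and decodes that outer form. The class name IorString and the sample value are made up for illustration and are not part of VISTA; resolving the reference into a usable object still requires a real ORB and the string_to_object call described above.

```java
class IorString {
    /** Decodes the hex payload of a stringified IOR, or throws if the outer form is wrong. */
    static byte[] decode(String ior) {
        if (ior == null || !ior.startsWith("IOR:")) {
            throw new IllegalArgumentException("missing IOR: prefix");
        }
        String hex = ior.substring(4).trim();
        if (hex.isEmpty() || hex.length() % 2 != 0) {
            throw new IllegalArgumentException("payload must be a non-empty, even number of hex digits");
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = Character.digit(hex.charAt(2 * i), 16);
            int lo = Character.digit(hex.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("non-hex character in payload");
            }
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up, truncated sample; a real IOR is far longer.
        byte[] payload = decode("IOR:000000000000000f");
        System.out.println(payload.length + " bytes decoded");
    }
}
```

A check of this kind could run when the HTML parameter is read, so that a malformed string is reported before any network lookup is attempted.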
The current database module is based on a combination of the PostgreSQL database
management system and specialized file handling routines. The specialized routines are
implemented in order to store intermediate model data in an optimal manner. Such data
include the travel costs as reported by the simulator within each iteration of the DTA
algorithm. However, once the algorithm is complete, the cost data are transferred into the
structured query language (SQL) database in both disaggregate and filtered aggregate form in
order to support a unified reporting model. The original intermediate binary data can still be
accessed through C/C++ and CORBA library functions. The database module can be
accessed through CORBA, C/C++, open database connectivity (ODBC), or Java database
connectivity (JDBC) libraries. Since the database module supports the ODBC interface,
widely available tools such as Microsoft Access can be used in order to manipulate VISTA
data. Finally, by implementing a standard database protocol, the underlying data
management tool is compatible with, or can easily be changed to, other compliant database systems,
such as ORACLE 8i.
A typical concern when dealing with unified transportation systems involving GIS tools is the
representation of geographical information. A great deal of work has been done within this area,
such as the linear referencing system (NCHRP 20-27, 1997). The presented work does not attempt to
expand upon this specific field nor describe a system which is dependent on any one
representation methodology. Instead, by using the object-oriented nature of the framework, internal data
representations can be maintained that are independent of specific standards. This can be
accomplished through the encapsulation of data within the database, with access provided only
through proxy data values.
Another concern is data fusion and archiving: transportation data obtained on a real-time basis
from sensing devices should be archived so that they can be used for engineering and planning
applications. For example, data obtained by sensors should be available:
• at the 6-s format to a freeway operator, who can invoke real-time control strategies;
• at a "processed 6-s format" as travel times on the freeway to a freight operator who can issue an
advisory to his truck drivers (or ultimately to the traveling public);
• at the 15-min aggregate format to a district engineer designing traffic maintenance plans for
upcoming construction plans;
• at an aggregate average daily traffic (ADT) format to a transit operator deciding on new transit
routes;
• at an aggregate annual average daily traffic (AADT) format to an MPO planner who develops
a congestion management system or the Federal Government for the Highway Performance
Monitoring System.
The area of data archiving is an active field of research in Computer Science. Little has been
done to incorporate these approaches in the current framework, but there is a need for further
development in this area.
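The idea of serving one archived sensor stream at several resolutions can be sketched as a simple roll-up. The example below sums raw 6-s vehicle counts into coarser bins such as 15 min; the class and method names are hypothetical, the input data are synthetic, and the actual VISTA archiving routines are not described in the text.

```java
import java.util.Arrays;

class SensorRollup {
    static final int SAMPLE_SECONDS = 6; // raw detector reporting interval

    /** Sums consecutive raw samples into bins of binSeconds; a trailing partial bin is kept. */
    static int[] rollUp(int[] counts, int binSeconds) {
        if (binSeconds <= 0 || binSeconds % SAMPLE_SECONDS != 0) {
            throw new IllegalArgumentException("bin must be a positive multiple of 6 s");
        }
        int perBin = binSeconds / SAMPLE_SECONDS;
        int bins = (counts.length + perBin - 1) / perBin; // ceiling division
        int[] out = new int[bins];
        for (int i = 0; i < counts.length; i++) {
            out[i / perBin] += counts[i];
        }
        return out;
    }

    public static void main(String[] args) {
        int[] raw = new int[300];  // 30 minutes of 6-s samples
        Arrays.fill(raw, 2);       // synthetic: two vehicles per sample
        System.out.println(Arrays.toString(rollUp(raw, 15 * 60))); // prints [300, 300]
    }
}
```

The same routine applied with a 24-hour bin would give daily totals, from which ADT and AADT figures could be averaged.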
Another example of the improved efficiency gained by maintaining the data in a single database is
the maintenance of crash data and the performance of safety analysis. Currently, accident data are
maintained by multiple jurisdictions and involve many people employed by the agencies in coding
and cleaning the data. If one needs to access the data for year 1999, in most States one has to wait
at least a year for the data to become available (if at all). With the proposed system, the officer will
ultimately have the capability to input the data directly via the Internet and a handheld device (in an
interface form resembling the paper forms currently filled out). These data will immediately become
available to all agencies with minimum additional processing. Having the data available in a single
database will allow safety experts to accurately correlate them to prevailing infrastructure, weather,
traffic and control conditions (at the time of the accident) and develop appropriate
countermeasures on time.
The presence of abundant data does not itself guarantee useful information for potential
users. Furthermore, specific information requirements are often not known in advance when
dealing with complex systems. These are the specific issues which have motivated the SQL,
ODBC, and JDBC protocols. By supporting all of these protocols with complete network
and Internet availability, the database module allows a multitude of tools complete (or
limited for security reasons) access to framework data. This well-established design model
allows for the straightforward implementation of complex querying and reporting through
the support of JDBC from the GIS. Typical features that users expect can easily be
incorporated into the framework as simple forms and buttons within a GUI. Furthermore, this
open-architecture design ensures that if a desired functionality is omitted, a user can acquire
the desired information through third-party tools without the need for changes within the
framework.
In addition to the benefits of SQL/ODBC for reporting useful information,
benefits are experienced with regard to the administration of user datasets. Data tables can be
manipulated and moved through typical SQL commands (which can easily be automated through
a GUI). In addition to the SQL functionality, users can manipulate the intermediate
binary module data through a Java explorer interface, which references CORBA calls to the
Management Module.
The data can be accessed via three methods over a network: CORBA, JDBC, and ODBC. For
CORBA access, a user must first log in with an accepted account. Only then can access be gained
to the intermediate binary data. Access to this form of data can easily be restricted as desired by
server administrators. JDBC and ODBC support includes all typical security features, such as
encryption (which far exceed the scope of this brief description). Typical SQL user account and
security information is maintained, such as user privileges and data ownership.
Due to the distributed nature of the framework, one method available for scaling the
functional ability of the system is the addition of workstations. While this is obviously a
benefit when multiple users are running various algorithms, or a single user is running multiple
ones, many of the models can take direct advantage of additional CPUs through inherent parallel
properties. Additional servers are also an option for the management of data since many SQL
servers support data replication to other ODBC-compliant databases as well as various local
caching strategies.
Next, we describe the model structure, while the implementation details of the distributed
system are discussed in Section 5.
4. Model structure
The primary modules currently implemented in the VISTA framework include a traffic
simulator (RouteSim), traditional (static) planning models, DTA models, network routing algorithms,
signal optimization models, ramp metering and incident management models. The interactions
among models are coordinated by the central Management Module. Although each of these
models may have different data type and structure requirements, the format for this data is kept
uniform. The way in which the VISTA modules interact is represented in Fig. 1. Each interaction
is specified as either a synchronous or asynchronous invocation.
The Management Module is the central component of the VISTA framework, and one of the
only modules the user interface directly communicates with. It continuously runs on the server,
handles incoming requests from remote interface modules, and executes the algorithm modules. A
remote CORBA object can be described by its IDL file (Object Management Group, 1995).
As discussed in Section 3, when the Management Module first runs, it creates an HTML file,
which contains the IOR string as an HTML parameter. This string uniquely identifies the
Management Module as a CORBA object. By knowing this string, any CORBA-enabled object
present on the Internet has the ability to look up and communicate with the Management object.
Alternatively, a remote module could contact various system modules by accessing the CORBA
naming service available on the central server.
When the Management Module receives a remote call to execute an algorithm, it does so
through various means. The simplest method is to use the ANSI system() function to execute a
separate program. The input parameters are specified as command line arguments, and the
resulting output is read from a temporary file, then returned to the invoking interface object. This
method is best for relatively small algorithms which need little computational time and data
manipulation.
For larger, more complex systems such as DTA, the algorithm appears as a different
CORBA module, which contains its own remote methods. For these cases, the remote methods
are specified to be asynchronous or oneway (which is interpreted as asynchronous within the
employed ORB implementations). For very complex systems where distribution is a possibility,
this option also allows further distribution within the algorithm to take place using the framework
services.
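The difference between the two invocation styles can be illustrated without an ORB. In the sketch below a synchronous call blocks the caller until the result is ready, whereas the asynchronous call returns at once and the result is collected later. All names are hypothetical and the "algorithm" is a placeholder loop. Note one simplification: a CORBA oneway operation returns no value, so the future used here stands in for whatever mechanism the caller later uses to fetch results (for instance a second remote call or a database query).

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class InvocationStyles {
    /** Placeholder for a long-running algorithm module. */
    static double runAlgorithm(int iterations) {
        double cost = 100.0;
        for (int i = 0; i < iterations; i++) {
            cost *= 0.9; // pretend each iteration improves the solution
        }
        return cost;
    }

    /** Synchronous style: the caller waits for the result. */
    static double callSync(int iterations) {
        return runAlgorithm(iterations);
    }

    /** Asynchronous style: returns immediately; the work runs on the pool. */
    static CompletableFuture<Double> callAsync(int iterations, ExecutorService pool) {
        return CompletableFuture.supplyAsync(() -> runAlgorithm(iterations), pool);
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            CompletableFuture<Double> pending = callAsync(10, pool); // does not block
            double quick = callSync(1);                              // blocks briefly
            System.out.println(quick + " " + pending.join());        // collect later
        } finally {
            pool.shutdown();
        }
    }
}
```

The asynchronous form is what keeps a user interface responsive while a model such as DTA runs for minutes on another machine.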
4.2. RouteSim
degree of detail. RouteSim requires as inputs network geometry and path flow data. The path flow
data can be generated from time-dependent or static origin-destination matrices or input directly
by the user. RouteSim assigns every generated vehicle to a path, similar to the DYNASMART
model introduced by Mahmassani et al. (1993). An advantage of RouteSim is that the simulation
step and the representational detail are adjustable to the geometry of the network. Lengthy
freeway segments that do not need to be modeled in detail are simulated as aggregate long cells
and their state is updated infrequently - for example, a two-mile freeway segment without on- and
off-ramps could be modeled as a single cell and be updated every 2 min. On the other hand, close
to intersections or problematic points where the evolution of queues, spatio-temporal traffic
dynamics and signalization phases need to be captured in detail, the simulation step can be as
small as 2 s. Simulation steps of this magnitude allow detailed representation of signalized
intersections - that is, signal control strategies, phasing, start-up/lost times and gap acceptance
behavior. Note that while detailed data (e.g., geometry, timing plans, turning movements) are required
for accurately simulating a network with signalized intersections, RouteSim will run even if no
such data are provided, by assuming (and prompting the user for) geometry, control and traffic data.
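The link between cell length and update interval can be made explicit with a small calculation. The sketch assumes the usual cell-transmission convention that a vehicle at free-flow speed crosses exactly one cell per update step; this convention and the speeds used (60 mph on the freeway, 30 mph near an intersection) are assumptions for illustration, not values stated for RouteSim.

```java
class CellStep {
    /** Update interval in seconds for a cell traversed at free-flow speed in one step. */
    static double updateIntervalSeconds(double cellLengthMiles, double freeFlowMph) {
        if (cellLengthMiles <= 0 || freeFlowMph <= 0) {
            throw new IllegalArgumentException("inputs must be positive");
        }
        return cellLengthMiles / freeFlowMph * 3600.0;
    }

    /** Cell length in feet covered in one step of the given duration at free-flow speed. */
    static double cellLengthFeet(double stepSeconds, double freeFlowMph) {
        if (stepSeconds <= 0 || freeFlowMph <= 0) {
            throw new IllegalArgumentException("inputs must be positive");
        }
        return freeFlowMph * 5280.0 / 3600.0 * stepSeconds;
    }

    public static void main(String[] args) {
        // A two-mile freeway cell at an assumed 60 mph: about 120 s, i.e. the 2-min update.
        System.out.println(updateIntervalSeconds(2.0, 60.0));
        // A 2-s step at an assumed 30 mph: a cell of about 88 ft near an intersection.
        System.out.println(cellLengthFeet(2.0, 30.0));
    }
}
```

Under these assumptions the two-mile, 2-min example in the text corresponds to a free-flow speed of 60 mph.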
System optimum and user equilibrium static assignment algorithms have been implemented and
can be invoked through VISTA. The algorithms are deterministic approaches based on
Frank-Wolfe's convex combinations method (Sheffi, 1985); a stochastic user equilibrium model is
currently under development using a paired combinatorial logit model (Gliebe et al., 1999). The
demand tables are part of the input data, since no trip generation, distribution and mode split
modules are currently implemented. VISTA, however, provides a convenient framework for
embedding such models, as well as using them in conjunction with DTA models.
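For readers unfamiliar with the convex combinations method, the following is a minimal sketch on a toy network of parallel links serving one origin-destination pair. It is not the VISTA implementation: the class name, the linear cost functions, the demand of 20 trips and the bisection line search are all illustrative assumptions. Each iteration loads all demand on the currently cheapest link (the all-or-nothing step) and then moves part of the way towards that loading.

```java
import java.util.function.DoubleUnaryOperator;

class FrankWolfeSketch {
    /** Index of the link that is cheapest at the given flows. */
    static int cheapest(DoubleUnaryOperator[] cost, double[] x) {
        int best = 0;
        for (int a = 1; a < cost.length; a++) {
            if (cost[a].applyAsDouble(x[a]) < cost[best].applyAsDouble(x[best])) best = a;
        }
        return best;
    }

    /** User-equilibrium flows on parallel links between one origin-destination pair. */
    static double[] solve(DoubleUnaryOperator[] cost, double demand, int maxIter, double tol) {
        int n = cost.length;
        double[] x = new double[n];
        x[cheapest(cost, x)] = demand;          // initial all-or-nothing load at free flow
        for (int it = 0; it < maxIter; it++) {
            double[] y = new double[n];
            y[cheapest(cost, x)] = demand;      // all-or-nothing load at current costs
            double gap = 0.0, total = 0.0;
            for (int a = 0; a < n; a++) {
                double c = cost[a].applyAsDouble(x[a]);
                gap += c * (x[a] - y[a]);
                total += c * x[a];
            }
            if (gap <= tol * total) break;      // relative gap small enough
            double lo = 0.0, hi = 1.0;          // bisection on the step size
            for (int k = 0; k < 60; k++) {
                double mid = 0.5 * (lo + hi), slope = 0.0;
                for (int a = 0; a < n; a++) {
                    slope += cost[a].applyAsDouble(x[a] + mid * (y[a] - x[a])) * (y[a] - x[a]);
                }
                if (slope > 0) hi = mid; else lo = mid;
            }
            double step = 0.5 * (lo + hi);
            for (int a = 0; a < n; a++) x[a] += step * (y[a] - x[a]);
        }
        return x;
    }

    /** Two-link toy network with linear costs and a demand of 20 trips. */
    static double[] demo() {
        DoubleUnaryOperator[] cost = { f -> 10.0 + f, f -> 20.0 + 0.5 * f };
        return solve(cost, 20.0, 100, 1e-9);
    }

    public static void main(String[] args) {
        double[] x = demo();
        System.out.printf("x1=%.3f x2=%.3f%n", x[0], x[1]);
    }
}
```

For this toy case the equilibrium can be checked by hand: setting 10 + x1 = 20 + 0.5(20 - x1) gives x1 = 40/3, so both links end up with the same travel cost.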
In addition, highway capacity analysis modules are currently being implemented so that the
level of service for intersections and street segments can be computed for the equilibrium flows.
The computational procedures follow the Highway Capacity Manual suggestions.
Existing software, such as the HCS, could also be interfaced.
Signal timing plans can be computed for isolated intersections based on simple delay functions
and offsets for intersections along an arterial (McShane and Roess, 1990). Network-wide signal
optimization models are currently under development, although any of the already existing
models (e.g., TRANSYT) can be easily interfaced. A user-friendly graphic interface for viewing
(or modifying) the intersection signal time plans has also been developed.
Various routing algorithms can be invoked through VISTA: static and dynamic shortest path
algorithms based on time or cost on the links. Versions of the dynamic algorithms that
simultaneously optimize route and departure time are also being developed. The algorithms are im-
Table 2
DTA CPU time per iteration
Module                   CPU time (s)
UE (TDSP algorithm)      72
SO (TDLC algorithm)      177
Simulator (DYNASMART)    265
The foundation for the introduced framework is CORBA, which is a part of the object
management architecture (OMA) (Soley, 1993). The goal of the OMA is to facilitate the construction
of systems from objects distributed across multiple computers. Some of the benefits from using
this system include:
• Object orientation. CORBA is an inherently object-oriented system. Therefore, each component
is recognized as a distinct object with an internal state. This aids code reuse, framework
structure and expansion.
• Distributed capabilities. CORBA possesses intrinsic abilities for both synchronous and
asynchronous communication between objects. This allows the interface to be executed remotely from
the implementation of the algorithms, and multiple processors to be used for the execution
of the algorithm modules if needed.
• The OMG interface definition language (IDL). This is a basic scripting language used to specify
the interface for each module. Due to this, each module needs only to know the interface of
other modules in order to communicate with them. This essentially separates the published
interface of an object from its implementation.
• The object request broker (ORB). This handles the underlying network communication issues
and allows remote objects to be invoked as if they were local. Furthermore, the ORB handles
When specific data need to be returned from a remote module, one method is to define a
specialized data type within the framework. This allows any necessary data to be returned by remote
procedures using the typical function return capability. For example, the traffic simulator function
RouteSim returns information such as total system travel time, and number of nodes, links, cells,
and vehicles. To create this specialized data type it is first specified in the framework's IDL file as:
struct s_RSimData {
    double ttime;                        // total system travel time
    long nodes, links, cell, vehicles;   // IDL has no `int`; `long` is its 32-bit integer
};
typedef s_RSimData RSimData;
Other specialized data types include such things as summary data from the static assignment and
DTA module, administrative user messages and warnings, module execution history, and path
information from network routing algorithms. An alternative method would be to use the central
SQL database. The database offers increased robustness, whereas direct CORBA transfers typically
yield significantly quicker communication and would prove most useful for real-time requirements.
As has been mentioned previously, the CORBA standard allows remote objects to be called as
if they were typical local objects. This allows a straightforward means for data communication
between modules. A typical remote execution and retrieval of data for a synchronous method
would appear as:
try {
    RSimData rsimdata = Management.RunRouteSim(start, stop, step);
} catch (Exception e) { ... }
In this example, the traffic simulator RouteSim is run with the specified start, stop and step
times. The RunRouteSim() method lacks the ONEWAY specification in the IDL file, which
implies it is to be a synchronous function. Therefore, the interface will wait until the simulator is
complete before continuing with operation. When completed, the remote call will return a variable
of type RSimData. This type was defined from within the IDL file, so all CORBA modules in the
VISTA framework understand this data type. This is the simplest kind of data communication:
data are passed to remote modules as parameters in the remote procedure calls, and data are
returned back to the client as a specialized data type.
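The call pattern can be tried out locally with a stand-in for the remote object. In the sketch below, RSimData mirrors the fields of the IDL struct (IDL integers are shown as Java int), while the Management interface, the FakeManagement class, the int parameter types and all returned numbers are hypothetical; a real client would hold a CORBA stub generated from the IDL file instead.

```java
class RouteSimCallSketch {
    /** Java-side mirror of the RSimData struct. */
    static final class RSimData {
        final double ttime;
        final int nodes, links, cell, vehicles;

        RSimData(double ttime, int nodes, int links, int cell, int vehicles) {
            this.ttime = ttime;
            this.nodes = nodes;
            this.links = links;
            this.cell = cell;
            this.vehicles = vehicles;
        }
    }

    /** Hypothetical local view of the remote Management object. */
    interface Management {
        RSimData RunRouteSim(int start, int stop, int step);
    }

    /** Stand-in that returns synthetic numbers instead of running a simulation. */
    static final class FakeManagement implements Management {
        public RSimData RunRouteSim(int start, int stop, int step) {
            if (step <= 0 || stop < start) {
                throw new IllegalArgumentException("bad time window");
            }
            int steps = (stop - start) / step;
            return new RSimData(steps * 1.5, 4, 3, 12, 100);
        }
    }

    /** Synchronous call: blocks until the (here simulated) remote method returns. */
    static double totalTravelTime(Management m, int start, int stop, int step) {
        try {
            RSimData rsimdata = m.RunRouteSim(start, stop, step);
            return rsimdata.ttime;
        } catch (RuntimeException e) {
            return Double.NaN; // signal failure to the caller
        }
    }

    public static void main(String[] args) {
        System.out.println(totalTravelTime(new FakeManagement(), 0, 600, 6));
    }
}
```

Because the caller depends only on the Management interface, the stand-in can later be replaced by a real stub without changing the calling code.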
No major technical challenges appear to exist for implementing and deploying the proposed
framework, but quite a few institutional ones do. The system requires close collaboration among many
agencies and other entities that traditionally do not interact. It also assumes that certain
organizations will relinquish control of data and will be asked to change certain practices. We
recognize that this may be a formidable obstacle. However, the expected benefits are so obvious and
substantial that as these technologies are routinely accepted and used by corporations, the
pressure on public agencies will mount, and ultimately some version of the proposed system
will be adopted.
Next, some of the expected benefits from deploying the proposed system are briefly identified
and discussed:
Data consistency. The system ensures the use of the same data by all participating agencies and
professionals.
Efficiency. Data duplication and redundant data collection activities will be eliminated.
Economies of scale. Resources could be pooled by various agencies to obtain additional data or
functionality that will benefit all. In addition, the value of collecting additional data or developing
an additional model can be easily identified; this can help ensure equity for all participating entities.
New data collection and other approaches. The acceptance of this new technology can act as a
catalyst for new approaches to be considered by agencies. For example, the fact that the system is
accessible over the Internet makes it open to the general public. Travelers could use it to obtain
traffic and transit information, either pre-route or en-route. At the same time, though, the system
could obtain data from the travelers, such as origin-destination node, time of travel and possibly
other types of information that traditionally are obtained through surveys.
Data availability. Anytime, anywhere, anyone (authorized) can access the system and its
functionality.
Productivity improvement. It will enable transportation professionals to do their jobs better by
freeing them from tedious tasks of data conversion and manipulation, as well as help them better
interact with other professionals since they are using consistent data and models. In addition,
having all tools and data at hand at any place and time helps them easily perform look-up tasks
and present their results. Information on any component of the transportation system will be
available on-line, enabling time-sensitive decisions to be taken within a much shorter time frame.
Implementation of the models will be conducted seamlessly and on time.
Creation of a professional community. Communication among users could provide for the
creation of virtual professional communities. Secondary data, reports, analysis and actions can be
available to all responsible parties via the database. This will allow not only for information but
also for knowledge exchange. If a county engineer has difficulties performing a task, he can access
expertise at another county or state office with a click of a button.
New product development. The open architecture of the system will enable developers to validate
and test new models based on the same data that all other transportation professionals use,
providing credibility to their work and reducing the time to enter the market. Furthermore, the
proposed system will enable new users to develop transportation software based on a common
platform, reducing software development and implementation costs.
Transparency and accountability. By using electronic signatures, agencies will know who did
what, when and for what purpose.
7. Conclusions
References
Jankowski, P., Stasik, M., 1997. Design considerations for space and time distributed collaborative spatial decision
making. Journal of Geographic Information and Decision Analysis 1 (1), 1-8.
Janssen, B., Spreitzer, M., Larner, D., Jacobi, C., 1998. ILU 2.0alpha12 Reference Manual,
ftp://ftp.parc.xerox.com/ilu/ilu.html.
Keisler, J.M., Sundell, R.C., 1997. Combining multi-attribute utility and geographic information for boundary
decisions: an application to park planning. Journal of Geographic Information and Decision Analysis 1 (2),
101-118.
Mahmassani, H.S., Peeta, S., Hu, T.Y., Ziliaskopoulos, A.K., 1992. Dynamic traffic assignment and simulation
procedures for ATIS/ATMS applications. Technical Report DTFH61-90-R-00074-2, Center for Transportation
Research, The University of Texas at Austin.
Mahmassani, H.S., Peeta, S., Hu, T.Y., Ziliaskopoulos, A.K., 1993. Dynamic traffic assignment with multiple user
classes for real-time ATIS/ATMS applications. In: Proceedings of the Advanced Traffic Management Conference,
St. Petersburg, Florida, pp. 91-117.
McShane, W.R., Roess, R.P., 1990. Traffic Engineering. Prentice-Hall, Englewood Cliffs, NJ.
Mennecke, B., 1997. Understanding the role of GIS in business: application and research directions. Journal of
Geographic Information and Decision Analysis 3 (1), 44-68.
NCHRP 20-27, 1997. A generic data model for linear referencing systems. National Cooperative Highway Research
Program of the Transportation Research Board (National Research Council), September 1997.
Object Management Group, 1995. The Common Object Request Broker Architecture and Specification, Revision 2.0.
Framingham, MA, July 1995.
Sheffi, Y., 1985. Urban Transportation Networks: Equilibrium Analysis with Mathematical Programming Methods.
Prentice-Hall, Englewood Cliffs, NJ.
Soley, R.M. (Ed.), 1993. Object Management Architecture Guide, second ed. Object Management Group.
Yue, L., Ziliaskopoulos, A.K., Waller, S.T., 1999. Linear programming formulations for system optimum DTA with
arrival time based and departure time based demands. Transportation Research Board (submitted).
Ziliaskopoulos, A.K., 2000. A linear programming model for the single destination system optimum dynamic traffic
assignment problem. Transportation Science 34, 1-14.
Ziliaskopoulos, A.K., Lee, S., 1997. A cell transmission based assignment-simulation model for integrated freeway/
surface street systems. Transportation Research Record 1701, 12-23.
Ziliaskopoulos, A.K., Mahmassani, H.S., 1994. A time-dependent shortest path algorithm for real-time intelligent
vehicle/highway systems. Transportation Research Record 1408, 94-104.
Ziliaskopoulos, A.K., Rao, L., 1998. A simultaneous route and departure time choice equilibrium model on dynamic
networks. International Transactions of Operational Research 6 (1), 21-37.
Ziliaskopoulos, A.K., Kotzinos, D., Mahmassani, H.S., 1998. Design and implementation of parallel time-dependent
shortest path algorithms for real-time intelligent transportation systems. Transportation Research C 5 (2), 95-107.
SUBJECT INDEX
Dynamic segmentation 9 19 29
37–41 46 47
50 73 136
139 141 144
295 303 423
Dynamic tracking 10 19
Dynamic traffic assignment 231 429 436
438
DYNASMART model 437 438
Edge-mapping set 76 80
Edge matching 72 75 79
80
Emergency planning zones 322
Engine start activity 216 217
Enterprise data model 15 16 19–28
Environmental impact 308–318
Errors in map databases 9 56–62
Evacuation neighborhood 324 325
Evacuation risk factor 323 324
Evacuation vulnerability 323 332
Exploratory data analysis 185 187 199
Export shipments 163
Image registration 74
Inference engine 369
Information theory 74
Institutional challenges 442 443
Integer programming 330
Intelligent transportation systems (ITS) 10 28 29
33 35 54–56
61 227 290
Interactive geo-visualization 185–200
Intermodal Surface Transportation Efficiency Act (ISTEA) 4 54 288
Intermodal network 148–155
Intermodal transfer terminals 150 151–162
Intermodal transportation 147–149
Internet see world wide web
Interoperability 9 10 15–17
29 34 35
55 56 62–67
430
Inter-Operable System for the Integration of
Real-Time Traffic Data within a GIS (OSIRIS) 177–182
Intra-zonal running exhaust activity 217
Inverse address matching 200
ITS datum 64–66
TEA–21 4 54 288
Temporal control 45–47
Temporal GIS 168–171
Terminal access/egress links 153 154
Three-dimensional GIS 185–200
Three-tier system architecture 414 415
TIGER 5 15–17 31
35 59 100
222 351
Time-dependent constrained shortest path 340 355–357 419
Time-Dependent Constrained Shortest Path (TCSP) problem 340 355–357 419
Time-geographic perspective 190
Traffic analysis zones 214 216 217
Traffic congestion 257 267–270 287
288 304
Traffic data 168 176–182 234–236
243 244 257
TRANSIMS 210 234 430
Transit building algorithm 137–139
Transit information systems 410–415
Transit networks 130–145 419
Transit routing problem 418–422
Transit stops 131–145 420
Trans–Oceanic Network 150
Transportation control measures 304
Transportation information system 7
Transportation of hazardous materials 337–358