Turning Points
The Nature of Creativity
ISBN 978-7-04-031703-9
Higher Education Press, Beijing
Among the uniquely human capabilities is the capacity to create and discover.
Understanding how humans create innovative art, music, poetry, or novels
and discover scientific principles, patterns, or relationships requires a recursive
form of creativity and discovery.
The foundations for human creativity and discovery depend on passion
for solving problems and fluency with social contexts that promote solutions.
The passion produces persistence over time and enables devotion to solving
important problems, filling troubling gaps, stretching annoying boundaries,
or opening doors to fresh opportunities.
The fluency with social contexts helps researchers to see problems more
clearly, bridge disciplines, and apply methods from one knowledge domain to
another. The social context also provides powerful motivations that encour-
age varied forms of competition and collaboration. Sometimes competition is
fierce, other times it can be friendly. Sometimes collaboration is narrow and
limited to dialogs between trusted partners, other times it can be broad and
long-term, producing lively conversations among thousands of contributors
who are united by the passion to solve a problem. Innovators who protect
their nascent ideas too closely will miss the opportunity to get feedback about
their progress or learn about related ideas.
Researchers are increasingly attracted to studying the dynamics of creativity
and discovery. For the first time in history, the databases of human scientific
activity are sufficiently large and widely available. For the first time in history,
the tools for analyzing this data are capable of performing appropriate
analyses and are becoming widely available.
Retrospective citation analysis of scientific papers remains the major
approach, sometimes complemented by informed ethnographic observations
and interviews by researchers with sufficient knowledge-domain understand-
ing to recognize important steps, controversies, or mistakes. However, anal-
ysis of patents, patent citations, trade journal articles, blogs, emails, twitter
posts, and other social media will provide a finer-grained, more diverse, and
Ben Shneiderman
University of Maryland
July 2011
Preface
Research assessment has become a central issue for more and more govern-
ment agencies and private organizations in making decisions and policies.
New indicators of research excellence or predictors of impact are appearing
one after another. However, if we look behind the available methods and
beyond the horizon decorated by the various types of indicators, then we will
encounter a few questions again and again: What is the nature of creativity
in science? Is there a way to tell great ideas apart early on? Are there
ways to help us choose the right paths? Can we make ourselves
more creative?
There are only two types of theories no matter what their subjects are:
the ones that are instructional and the ones that are not. An instructional
theory will explain the underlying mechanisms of a phenomenon in such a
way that we can see what we need to do to make a difference. The quest
for us in this book is to look for a better understanding of mechanisms be-
hind creativity, especially in the context of making and assessing scientific
discoveries. In this book, my goal is to identify principles that appear to
be necessary for creative thinking from a diverse range of sources and clarify
where we may struggle with biases and pitfalls created by our own perceptual
and cognitive systems. Then I will introduce an explanatory and computa-
tional theory of discovery and demonstrate its instructional nature through a
series of increasingly refined quantitative approaches to the study of knowl-
edge domains in science. Finally, the potential of transformative research is
measured by metrics derived from the theoretical underpinning and validated
with retrospective indicators of impact. The theory, for example, leads to a
much simplified explanation of why several of the good predictors of an article's
citation counts found by previous research are due to the same underlying
mechanisms.
The conception of the theory of discovery was inspired by a series of intel-
lectual landmarks across a diverse range of perspectives, notably, Vannevar
Bush’s As We May Think and his vision for trailblazing a space of knowledge
in his Memex (memory and index), Thomas Kuhn’s paradigm shift theory of
scientific revolutions, Henry Small’s methods for analyzing co-citation net-
works, Ronald Burt’s structural-hole theory, and Peter Pirolli’s optimal
information foraging theory. The development and use of the CiteSpace system
have played an instrumental role in experimenting with and synthesizing these
great ideas. I have been developing and maintaining CiteSpace since 2003. I
have made it freely available for researchers and students to analyze emerg-
ing trends and turning points in the literature. The provision of CiteSpace
has probably also promoted the awareness of scientometrics, the field that
is concerned with quantitative approaches to the study of science. Feedback,
questions, and requests for new features from a diverse and growing popu-
lation of users have also propelled the search for theories to explain various
patterns that we see in the literature.
The central thesis of the book is that there are generic mechanisms for
creative thinking and problem solving. If we can better understand these
mechanisms, then we will be able to incorporate them and further enhance
them with computational techniques. Another important insight gained from
reviewing the literature across different fields is that creativity is about the
ability and willingness to find a new perspective so that we can see afresh
something that we take for granted.
The notion of an intellectual turning point has naturally emerged. Kuhn’s
gestalt switch between competing paradigms and Hegel’s syntheses of theses
and antitheses are exemplars of view-changing intellectual turning points. We
may feel lucky or unlucky, depending on the particular perspective we take.
We may miss the obvious if we are looking for something else. I hope that this
book can provide the reader with some useful perspectives to study science
and its role in society as well as insights into the nature of creativity so that
we will be better able to recognize creative ideas and create opportunities for
more creative ideas.
I had a few types of readers in mind when I was preparing this book:
1) anyone who is curious about the nature of creativity and wondering if
there is anything beyond the serendipitous view of creativity
2) analysts, evaluators, and policy makers in a situation where tough deci-
sions have to be made that will influence the fate of creative work
3) researchers and students who need to not only keep abreast of their own
fields of study but also position themselves strategically with a competi-
tive edge
4) historians and philosophers of science
The first four chapters of the book should be accessible to college students
and readers at more advanced levels. The next four chapters may require a higher level
of background information in areas such as network analysis and citation
analysis. The book may be used for graduate-level courses or seminars in
information science, research evaluation, and business management.
Chaomei Chen
Philadelphia, Pennsylvania
April 2011
Acknowledgements
Many people have played an important role in the ideas presented in this
book.
My long-term collaborators in interdisciplinary research projects include
Michael S. Vogeley, an astrophysicist at the Department of Physics, Drexel
University, on a project funded by the National Science Foundation (NSF)
(IIS-0612129) to study the interconnections between astronomical literature
and the usage of the astronomical data obtained by the Sloan Digital Sky
Survey (SDSS), Alan M. MacEachren, at the Department of Geography, Penn
State University, on the Northeast Visual Analytic Center (NEVAC) project
funded by the Department of Homeland Security, my graduate research assis-
tants and doctoral students Jian Zhang and Don Pellegrino, and international
visitors Fidelia Ibekwe-SanJuan (France) and Roberto Pinho (Brazil).
Eugene Garfield and Henry Small, visionary pioneers of citation analy-
sis and co-citation analysis at Thomson Reuters, have been generous with
their time and insights. Thomson Reuters’ younger generation, David Liu
(China), Weiping Yue (China), and Berenika Webster (Australia), are en-
thusiastic, energetic, and supportive. In particular, Thomson Reuters generously
made arrangements for me to have an extensive period of access to the
Web of Science while I was on sabbatical leave. I was a recipient of the 2002
Citation Research Award from the ISI and the American Society for Infor-
mation Science and Technology.
I would like to thank Julia I. Lane and Mary L. Maher, Program Directors
at the National Science Foundation (NSF), for their efforts in organizing
the research portfolio evaluation project to explore technical fea-
sibilities of evaluating NSF proposals (NSFDACS-10P1303), Jared Milbank
and Bruce A. Lefker at Pfizer Global Research and Development at Groton
Labs for collaborating on a Pfizer-funded drug discovery project.
I am also grateful to Zeyuan Liu at the WISELab, Dalian University of
Technology, for his enthusiasm, vision, and insights in the use of CiteSpace in
mapping knowledge domains in China, Hung Tseng, a biologist-turned NIH
program director, for sharing his enthusiasm and insights in issues concerning
the evaluation of research and tracing timelines of discoveries from a funding
agency’s point of view, Rod Miller, Drexel University, for numerous in-depth
Contents
Chapter 5 Foraging
5.1 An Information-Theoretic View of Visual Analytics
5.1.1 Information Foraging and Sensemaking
5.1.2 Evidence and Beliefs
5.1.3 Salience and Novelty
5.1.4 Structural Holes and Brokerage
5.1.5 Macroscopic Views of Information Contents
5.2 Turning Points
5.2.1 The Index of the Interesting
5.2.2 Proteus Phenomenon
5.2.3 The Concept of Scientific Change
5.2.4 Specialties and Scientific Change
5.2.5 Knowledge Diffusion
5.2.6 Predictors of Future Citations
5.3 Generic Mechanisms for Scientific Discovery
5.3.1 Scientific Discovery as Problem Solving
5.3.2 Literature-Based Discovery
5.3.3 Spanning Diverse Perspectives
5.3.4 Bridging Intellectual Structural Holes
Index
Chapter 1 The Gathering Storm
There are two ways to boil a frog alive. One is to boil the water first and then
drop the frog into boiling water — the frog will jump out from the immediate
crisis. The other is to put the frog in cold water and then gradually heat the
water until it boils — the frog will not realize that it is now in a creeping crisis.
As far as the frog is concerned, the creeping crisis is even more dangerous
because the frog loses its chance to make a move that could save its life.
Several major crises in the past triggered the U.S. to respond immediately,
notably the Japanese attack at Pearl Harbor in 1941, the Soviet Union’s
launch of Sputnik 1 in 1957, and the 9/11 terrorist attacks in 2001. The Sputnik
crisis, for example, led to the creation of NASA and DARPA and an increase
in the U.S. government spending on scientific research and education. In
contrast to these abrupt crises, several prestigious committees and advisory
boards to the governing bodies of science and technology policy have sounded
an alarm that the U.S. is now facing an invisible but deeply profound crisis —
a creeping crisis that is eroding the very foundation that has sustained the
competitive position of the nation in science and technology.
In 2005, William Wulf, the President of the National Academy of Engineering
(NAE), made his case before the U.S. House of Representatives’
Committee on Science. He used the creeping crisis scenario to stress the
nature of the current crisis — a pattern of short-term thinking and a lack of
long-term investment. However, the view is controversial. There have been in-
tensive debates on the priorities that the nation should act upon and whether
there is such a thing as a “creeping crisis” altogether. One of the central points
in the debate is whether science and engineering (S&E) education, especially
math and science, is trailing behind that of the nation’s major competitors
in terms of standardized test performance and the ability to meet the demands
of industry.
Why are people’s views so different that the idea of any reconciliation
seems distant and far-fetched? Is the crisis really there? Why are some
so concerned while others are not? What are the key arguments and counterarguments?
After all, what I want to address in this book is: what are the most
critical factors on which the nation’s leading position in science and technology
hinges? Furthermore, what does it really take to sustain the competitiveness
The notion that the U.S. is in the middle of a creeping crisis was most force-
fully presented to the U.S. House of Representatives’ Committee on Science
on October 20, 2005.[1] Norman R. Augustine, the chairman of the competi-
tiveness assessment committee, P. Roy Vagelos, a member of the committee,
and William A. Wulf, the president of the National Academy of Engineering
presented their assessments of the situation. Augustine is the retired chair-
man and CEO of Lockheed Martin Corporation and Vagelos is the retired
chairman and CEO of Merck. The full report was published by the National
Academies Press in 2007, entitled Rising above the Gathering Storm (Na-
tional Academy of Sciences, National Academy of Engineering, & Institute
of Medicine of the National Academies, 2007). In the same year, Is America
Falling Off the Flat Earth?, written by Augustine, was also published by the
National Academies Press[2] (Augustine, 2007).
The Gathering Storm committee included members such as Nobel laure-
ate Joshua Lederberg, executives of research-intensive corporations such as
Intel and DuPont, the director of Lawrence Berkeley National Laboratory,
and presidents of MIT, Yale University, Texas A&M, Rensselaer Polytech-
nic Institute, and the University of Maryland. The prestigious background
of the committee and its star-studded membership, as well as its well-articulated
arguments, brought considerable publicity to the notion of the creeping
crisis — the gathering storm!
The key points of the creeping crisis presented by the Gathering Storm
committee can be summarized as follows:
1) America must repair its failing K-12 educational system, particularly in
mathematics and science.
2) The federal government must markedly increase its investment in basic
research, that is, in the creation of new knowledge.
The primary factor in this crisis is the so-called Death of Distance,
which refers to the increasing globalization in all aspects of our life. Now
the competitors and consumers are all just a “mouse-click” away. Fast and
profound changes in a wide range of areas are threatening the leading position
of the U.S., for example, the mobility of manufacturing driven by the cost
of labor and the existence of a vibrant domestic market. For the cost of one
engineer in the United States, a company can hire eleven in India. More
importantly, the Gathering Storm committee highlighted that the increasing
mobility of financial capital, human capital, and knowledge capital is now
[1] http://www7.nationalacademies.org/ocga/testimony/gathering storm energizing and employing america2.asp
[2] The National Academies Press offers a free podcast at http://books.nap.edu/catalog.php?record_id=12021
The U.S. Congress passed the America COMPETES Act[3] in 2007 to enact
some of the recommendations made by the Gathering Storm committee. For
example, the Act includes requirements for the National Science Foundation
(NSF), the major funding agency for basic research:
• (Sec. 4006) Requires the NSF Director to: (1) consider the degree to
which NSF-eligible awards and research activities may assist in meeting
critical national needs in innovation, competitiveness, the physical and
natural sciences, technology, engineering, and mathematics; and (2) give
priority in the selection of the NSF awards, research resources, and grants
to entities that can be expected to make contributions in such fields.
• (Sec. 4007) Prohibits anything in Divisions A or D of this Act from being
construed to alter or modify the NSF merit-review system or peer-review
process.
• (Sec. 4008) Earmarks funds for FY2008-FY2011 for the Experimental
Program to Stimulate Competitive Research under the National Science
Foundation Authorization Act of 1988.
Despite the compelling creeping crisis case and the consensus on the need
for action, many have raised serious questions that challenge the diagnosis
and treatment of the crisis. Indeed, multiple views, conflicting positions,
and competing recommendations need to be validated, resolved, and
implemented. Not only for policy makers but also for scientists, educators,
students, and the general public, there is an urgent need to make sense
of what is really happening, and more importantly to understand the
spectrum of the long-term consequences of decisions made today.
One of the most forceful attacks on the Gathering Storm report was made in
Into the Eye of the Storm (Lowell & Salzman, 2007). The authors of the paper
are Lindsay Lowell of Georgetown University and Hal Salzman of the Urban
Institute. Their research was funded by the Alfred P. Sloan Foundation and
the National Science Foundation.
The key finding of Into the Eye of the Storm is that their review of
the data fails to find support for the challenges identified by the Gathering
Storm and those with similar views. Specifically, they did not find evidence
for the decline in the supply of high quality students from the beginning to
the end of the science and engineering pipeline due to a declining emphasis
on mathematics and science education and a declining career interest among
the U.S. domestic students in science and engineering careers. First, Lowell
and Salzman showed that the claim that the U.S. falls behind the world in
science and mathematics is questionable; their data shows that the U.S. is
the only country with a considerable diversity of student performance and
[3] http://thomas.loc.gov/cgi-bin/bdquery/z?d110:SN00761:@@@D&summ2=m&
that simple rank positions make little sense in light of such a degree of di-
versity. Second, their analysis of the flow of students up through the science
and engineering pipeline suggests that the supply of qualified graduates is
far in excess of demand. Third, the more than adequate supply calls for a
better understanding of why the demand side fails to induce more graduates
into the S&E workforce. Policy approaches to human capital development
and employment from the prior era do not address the current workforce or
economic policy needs.
Lowell and Salzman’s analysis shows that, from employers’ point of view,
literacy and a competence in a broad range of subjects beyond math and
science are essential. Furthermore, they rightly stated that the question is
not about whether to improve the U.S. education system, but rather why
the U.S. performance is lower than that of other countries, what the implications
are for future competitiveness, and what policies would best address the
deficiencies. Their analysis draws attention to the fact that, according to the
2006 U.S. census, single-parent households with children under age 17 account
for 33% of families in the U.S., whereas the number is 17% in Norway and less
than 10% in Japan, Singapore, and Korea. Therefore, it is unclear whether
using average test scores provides any meaningful indication of the educational or
potential economic performance of the U.S. because one could argue that it
is the diversity and openness of the U.S. that contribute to its lower average
educational performance as well as its high economic performance.
Further analysis of the education-to-career pipeline shows that science
and engineering firms most often complain about schools failing to provide
students with the non-technical skills needed in today’s firms.
In summary, Into the Eye of the Storm concluded that the perceived la-
bor market shortage of scientists and engineers and the decline of qualified
students are not supported by the educational performance and employment
data that Lowell and Salzman have reviewed. In contrast to the policy fo-
cus of the U.S. competitiveness committees calling for the U.S. to emulate
Singapore’s math and science education programs, Singapore’s recent com-
petitiveness policy focuses on creativity and developing a more broad-based
education — an emulation of the U.S. education.
The debates have made it clear that different questions should be asked:
What are the factors that have led to the consistently high performance of
the U.S. economy? What kind of workforce is likely to improve prospects of
the U.S. in the future? Lessons learned from the conflicting views underline
that evidence-based policy is necessary for developing effective programs for
the emerging global economy. Julia Lane, the Program Director of the NSF
Science of Science Policy Program, supports evidence-based approaches to
science policy.
In a recent article published in Scientific American, Beryl Lieff Benderly
(2010), a columnist for Science Careers at the journal Science,
addressed the question: Does the U.S. produce too many scientists? For ex-
ample, she addressed practical issues associated with the fact that labs in the
their paper has been cited three times. It was first cited in 1987 by Schubert
(1987) in a Scientometrics article on quantitative studies of science. In 1993,
it was cited in Psychological Inquiry by Hans Eysenck (1993) on creativity
and personality. He suggested a causal chain reaching from DNA to creative
achievement, based largely on experimental findings not usually considered
in relation to creativity (e.g., latent inhibition). His model is highly specula-
tive, but nonetheless testable. The most recent citation to Zhao and Jiang’s
paper was made by an article on a bibliometric model for journal discarding
policy in academic libraries.
Zhao introduced the notion of the social mean age of a country’s scientists
at time t as the average age at which its scientists make significant contributions:

A_t = \frac{\sum_{i=1}^{N_t} (X_i - X_b)}{N_t}

where Xb is the year of the birth of a scientist, Xi is the time when the
scientist makes noteworthy contributions, and Nt is the total number of sci-
entists at time t. Zhao noticed some interesting patterns: the At of 50 years
old seems to be a tipping point. Immediately before a country becomes the
center of scientific activity, the At of its outstanding scientists is below 50
years old. For example, Italy was the world center in 1540 – 1610; the social
mean age of scientists of Italy was 30∼45 years old between 1530 and 1570.
Similarly, England was the center during 1660 – 1730 and its social mean age
was 38∼45 between 1640 and 1680. France was the center 1770 – 1830 and
its social mean age was 43∼50 between 1760 and 1800. Germany became
the center in 1810 – 1920 and its social mean age was 41∼45. The U.S. has
been the center since 1920 and its social mean age of scientists was about 50
between 1860 and 1920.
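
To make the calculation concrete, here is a minimal sketch in Python (not taken from Zhao's work; the function name and the data are purely illustrative) of how the social mean age could be computed from pairs of birth years and contribution years:

    from typing import List, Tuple

    def social_mean_age(scientists: List[Tuple[int, int]]) -> float:
        """Zhao's social mean age A_t: the average of (X_i - X_b) over the
        N_t scientists active at time t, where X_b is a scientist's birth year
        and X_i is the year of a noteworthy contribution."""
        if not scientists:
            raise ValueError("need at least one scientist")
        return sum(x_i - x_b for x_b, x_i in scientists) / len(scientists)

    # Illustrative (made-up) data: (birth year, contribution year) pairs.
    print(social_mean_age([(1700, 1745), (1695, 1738), (1710, 1752)]))  # about 43.3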
On the other hand, if the social mean age of scientists in the host country
of the current center of scientific activity exceeds 50 years old, it tends to
lose its center position. For example, the At of France started to exceed 50
years old in 1800; by 1840, the center had shifted to Germany. Why is the age of
50 so special?
As we shall see in Chapter 2, Zhao approached this question from a
statistical perspective and defined the concept of an optimal age — a period
of the most creative years in the career of a scientist. Zhao found that when
a country’s social mean age approaches the distribution of the optimal ages
of the scientists in the country, the country’s science is likely on the rise;
otherwise, it is likely to decline. The estimation of the optimal age is built on
his theory of scientific discovery. We will re-visit Zhao’s work in more detail
in Chapter 2.
A different approach to the question was offered by Zeyuan Liu and Haishan
Wang in the 1980s.[4] They found that a country’s status as the world center of
scientific activities appeared to follow a 60-year leading period of revolutions
[4] http://www.collnet.de/workshop/liu.html
attention space and Kuhn’s competing paradigms is that for Collins explicit
rivalry between schools of thought often developed in succeeding generations
Fig. 1.2 The intellectual trails of the field of nanoscience between 1997 and 2007.
(see color figure at the end of this book)
Fig. 1.3 shows not only a map of the Universe but also discoveries and
research interests associated with various areas in the Universe. The earth is
at the center of the map because the distance to an astronomic object is mea-
sured from the earth. The blue band of galaxies and the red band of quasars
were formed at the early stage of the Universe. As the Universe expands,
they become further away from us. The Hubble Ultra Deep Field, shown
at the upper-right corner of the image, was one of the farthest observations
made by scientists. Unlike the free-form layout method used in generating
the visualization shown in Figure 1.2, the map of the Universe preserves the
relative positions of astronomic objects. It is common in cartography to use
a base map as the general organizational framework and then add various
thematic layers on top of it. Adding multiple thematic layers is in effect com-
bining information from multiple perspectives. A fundamental question yet
to be answered is how one should interpret the meaning of such combinations.
Each perspective represents its own conceptual space, which may or may not
be compatible with other spaces. The compatibility here means whether there
exists a topological mapping from one space to another. A central property of
a topological mapping is that it preserves the proximity relations so that nearby
points in one space will remain neighbors when they are mapped into a
new space. This obviously does not hold between the astronomical space and the
space of astronomical knowledge. Two black holes may be further apart in
the Universe, but they can be dealt with by the same theory in the knowledge
space. In contrast, two different theories may address the same phenomenon
in the Universe.
Fig. 1.3 A map of the Universe with overlays of discoveries and astronomical
objects associated with bursts of citations. The close-up view of the Hubble Ultra
Deep Field is shown at the upper-right corner (circled). (see color figure at the end
of this book)
In May 2009, as H1N1 was rapidly spreading across many countries, there
was a rich body of knowledge about influenza pandemics in the literature.
The Web of Science alone indexed over 4,500 research papers on influenza and
pandemics. Fig. 1.4 shows a timeline visualization of this literature as of May
8th, 2009. Spots in red were articles with a burst of citations. In contrast,
Fig. 1.5 shows a similarity map of 114,996 influenza virus protein sequences.
Some of the significant questions to be addressed are what multiple views
of influenza such as these two would tell us and how they would foster new
research questions.
Fig. 1.4 A timeline visualization of the state of the art in research related to
influenza and pandemics as of May 8th, 2009. (see color figure at the end of this
book)
Fig. 1.5 114,996 influenza virus protein sequences. Source: (Pellegrino & Chen,
2011) (see color figure at the end of this book)
fields, or disrupting accepted theories and perspectives (NSF, 2007). The em-
phasis is clearly on the potential that may lead to revolutionary changes of
disciplines and fields. In contrast, European perspectives tend to emphasize
the role of high risks in the equation to justify the potential high impact.
The term scientific breakthrough is often used by European researchers
officials when referring to transformative research.
The NSF has implemented several mechanisms to promote the funding
of transformative research, or risky science. For example, the EArly-concept
Grants for Exploratory Research (EAGER) funding mechanism aims to sup-
port exploratory work in its early stages on untested but potentially trans-
formative research. The NSF also has a quick-response funding mechanism
called the Grants for Rapid Response Research (RAPID) to deal with natural
or anthropogenic disasters or other unanticipated events.
Fig. 1.6 shows a network of terms used in 63 NSF EAGER award abstracts
in the IIS program between 2009 and 2010. A network like this can give a high-
level picture of what is going on in a highly volatile environment. Terms, or more
precisely noun phrases, that appear in these abstracts are grouped together based
on how often they appear side by side, known as co-occurrence. Frequently
co-occurring terms tend to form denser groups, whereas terms that rarely
appear together tend to stay in separate groups. This is a commonly used
Fig. 1.6 A network of 682 co-occurring terms generated from 63 NSF IIS EAGER
projects awarded in 2009 (cyan) and 2010 (yellow). Q = 0.8565, Mean silhouette =
0.9397. Links = 22347. (see color figure at the end of this book)
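
As a rough illustration of the grouping just described, here is a minimal sketch in Python (the sample abstracts and the link threshold are illustrative assumptions, and this is not the actual procedure behind Fig. 1.6) of counting term co-occurrences and keeping the frequent pairs as network links:

    from itertools import combinations
    from collections import Counter

    # Illustrative abstracts, already reduced to their noun phrases.
    abstracts = [
        ["machine learning", "social network", "data mining"],
        ["data mining", "social network"],
        ["visual analytics", "data mining"],
    ]

    # Two terms co-occur whenever they appear in the same abstract.
    cooccur = Counter()
    for phrases in abstracts:
        for a, b in combinations(sorted(set(phrases)), 2):
            cooccur[(a, b)] += 1

    # Keep only pairs seen at least min_count times; the surviving pairs
    # are the links of the term co-occurrence network.
    min_count = 2
    links = [(a, b, n) for (a, b), n in cooccur.items() if n >= min_count]
    print(links)  # [('data mining', 'social network', 2)]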
of the current global economic, social and climate conditions. The COV found
that the division has maintained high quality and integrity in selecting and funding
innovative and far-reaching research, although the amount of funding has not
kept pace with the growing importance of IIS research. One of the questions
addressed in the COV report was whether the program portfolio has an ap-
propriate balance of innovative and potentially transformative projects. The
COV identified several steps made by the IIS division in this direction:
• Specific instructions to the review panels to consider the transformative
aspect of proposals
• Solicitations which push the frontiers of research
• Advice to panels to avoid implicit bias
• The creation of programs which require potentially transformative re-
search
In response to the COV report, the IIS management acknowledged that
low success rates continue to be a concern in each of the CISE divisions and
in the NSF as a whole. The NSF intends to make a strong case for increased
investments in computing.
The Science of Science and Innovation Policy (SciSIP) Program at the
NSF, directed by Julia Lane, is particularly relevant to issues concerning the
performance evaluation of research, typically at national and disciplinary levels.
The growing portfolio of the SciSIP program includes a variety of innovative
research projects that investigate technical and fundamental issues concerning
evidence-based studies of the performance of science and research.
For example, one SciSIP project, CREA, aims to develop measurements for
analyzing highly creative research in the U.S. and Europe. The Cyber-enabled
Discovery and Innovation (CDI) Program is an NSF-wide initiative on mul-
tidisciplinary research on innovations in computational thinking. As we shall
see in this book, there are good reasons why interdisciplinary work may be
an effective mechanism for transformative research.
1.6 Summary
The debates in the U.S. over the nature and extent of the crises and priorities
of action have profound implications. They are among the most substantial
proactive and responsive self-assessments since Pearl Harbor, Sputnik, and
9/11. These self-assessments are valuable and crucial for sustaining the com-
petitive edge. The Yuasa Phenomenon and its potential causes are particu-
larly interesting in this context. Emergent trends and patterns at macroscopic
levels demand explanations at microscopic levels.
What is the role of creativity in scientific discovery and innovation?
What can be done to increase our creativity?
Regardless of our opinions in response to the specific arguments and interpretations
of available evidence concerning the Gathering Storm, it is vital
to sustain and enhance the competitive position of a country, the drive of a
discipline, and our own creativity. Evidently, how to achieve such a
goal is one of the top priorities on the agenda of a wide range of stakeholders.
References
Augustine, N.R. (2007). Is America Falling Off the Flat Earth? National Academies Press.
Benderly, B.L. (2010). Does the U.S. produce too many scientists? Scientific American. http://www.scientificamerican.com/article.cfm?id=does-the-us-produce-too-m&offset=6
Collins, R. (1998). The Sociology of Philosophies: A Global Theory of Intellectual Change. Cambridge, MA: Harvard University Press.
COSEPUP. (1993). Science, technology, and the federal government: National goals for a new era. Washington, DC: National Academy Press.
Eysenck, H.J. (1993). Creativity and personality: Suggestions for a theory. Psychological Inquiry, 4(3), 147-178.
Lowell, B.L., & Salzman, H. (2007). Into the eye of the storm: Assessing the evidence on science and engineering education, quality, and workforce demand. The Urban Institute.
We have all heard stories of how a falling apple set Isaac Newton and
his gravitational theory on the right track and how Henri Poincaré arrived
at his ultimate moment of enlightenment just as he was stepping onto a bus.
Serendipity is one of the most fascinating, widely-admired, and yet most
mysterious characterizations of creativity. Through the lens of serendipity,
everything would magically fall into place so effortlessly that it leaves no trace
and no clue of how one gets there. Discoveries rendered in serendipitous
paint make fascinating headline stories, and yet they add little to our
knowledge of what we have to do to get there ourselves. Indeed,
the notion of serendipity categorically denies rational pursuits of creative
thinking.
The April 2004 issue of PLoS Biology reported a study of the brain activity
that accompanies the so-called ‘Aha!’ moments — the moments of
inspiration.[1] Researchers gave participants a series of word problems to solve and
studied their brain activities using brain imaging techniques. They found
that activity increased in an area called the temporal lobe, in the right lobe
[1] http://men.webmd.com/news/20040413/scientists-explain-aha-moments
5) verification
In the first stage — preparation — the problem is identified and formu-
lated. Previous work on the problem is also studied in this stage. The prob-
lem is then internalized in the incubation stage. There may be no apparent
progress on solving the problem in this stage. Importantly, this period of in-
terruption seems to be necessary for breaking away from misleading signals
and false alarms. In the intimation stage, we can feel that a solution is on its
way. In the illumination stage, the insight or the spark of creativity bursts
through from its preconscious processing to conscious awareness. The insight
often arrives suddenly and intuitively. Eventually, the idea is verified and
evaluated. The question that many of us want to ask is: What does it take
to be able to reach the illumination stage and find the inspirational insight?
Researchers and practitioners have repeatedly asked whether creativity
is something we are born with or something that can be trained and learned. The practical
implications are clearly related to the fact that individuals in organizations
are expected to become increasingly creative as they collaborate in project
teams. A meta-analysis conducted by Scott and colleagues (2004) reviewed
the results of 70 studies of creative training effects and found that carefully
constructed creativity training programs typically improve performance. In
contrast, Benedek and his colleagues (2006) studied whether repeated prac-
tice can enhance the creativity of adults in terms of the fluency and originality
of idea generation. They found that while training did improve fluency,
it had no impact on originality.
The American psychologist Howard E. Gruber (1922 – 2005), a pioneer
of the psychological study of creativity, questioned the validity of lab-based
experimental studies of creativity. He argued that because creative works
tend to be produced over a much longer period of time than the duration
of a lab experiment, the laboratory setting is simply not realistic enough to
study creativity. As a result, Gruber (1992) was convinced that an alternative
approach, the evolving systems, should be used for the study of creativity.
To him, a theory of creativity should explain the unique and unrepeatable
aspects of creativity rather than the predictable and repeatable aspects seen
in normal science.
Gruber strongly believed that the most meaningful study of creative work
should focus on the greatest individuals rather than attempt to develop quan-
titative measures of creativity based on a larger group of people. His work,
Darwin on Man: A Psychological Study of Scientific Creativity, is a classic
exemplar of his evolving systems approach. His principle is similar to Albert
Einstein’s famous dictum: as simple as possible, but not simpler. He strongly
believed that characteristics of the truly creative work may not be found in an
extended population (Gruber, 1992). Instead, he chose to study how exactly
the greatest creative people such as Charles Darwin made their discoveries.
He chose in-depth case studies over lab-based experimental studies.
Creativity is purposeful work. Gruber studied the lives of famous innova-
tors and found broad common characteristics:
tive illnesses. Being passionate is probably one of the necessary factors that
would go hand in hand with creativity. As Nobel laureate Max Planck once
said, “The creative scientist needs an artistic imagination.”
Fig. 2.1 A co-occurring network of major terms (noun phrases) extracted from
the abstracts of 5,656 articles on creativity (1990 – 2010).
Many researchers have been deeply intrigued by the analogy between trial-
and-error problem solving and natural selection in evolution. Is it merely an
analogy on the surface or more than that? The American social scientist
Donald Campbell (1916 – 1996) was a pioneer of one of the most profound
creative process models. He characterized creative thinking as a process of
blind variation and selective retention (Campbell, 1960). His later work along
this direction became known as a selectionist theory of human creativity.
If divergent thinking is what it takes for blind variation, then convergent
thinking certainly has a part to play for selective retention.
Campbell was deeply influenced by the work of Charles Darwin. In Campbell’s
evolutionary epistemology, a discoverer searches for candidate solutions
with no prior knowledge of whether a particular candidate is the ultimate
one to retain. Campbell specifically chose the word blind, instead of random,
to emphasize the absence of foresight in the production of variations. He argued
that the inductive gains in creative processes hinge on three necessary
conditions (a code sketch follows the list below):
1) There must be a mechanism for introducing variation.
2) There must be a consistent selection process.
3) There must be a mechanism for preserving and reproducing the selected
variations.
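
Read algorithmically, these three conditions form the skeleton of an evolutionary search loop. The toy sketch below, written in Python, is an illustration of that general structure rather than Campbell's own formulation: it blindly varies candidate strings, selects the fittest, and retains it for the next round.

    import random

    random.seed(42)
    TARGET = "turning points"
    ALPHABET = "abcdefghijklmnopqrstuvwxyz "

    def fitness(candidate: str) -> int:
        # Selection criterion (condition 2): count positions matching the target.
        return sum(c == t for c, t in zip(candidate, TARGET))

    def blind_variation(parent: str, n_variants: int = 50) -> list:
        # Condition 1: vary with no foresight -- mutate one random position.
        variants = []
        for _ in range(n_variants):
            i = random.randrange(len(parent))
            variants.append(parent[:i] + random.choice(ALPHABET) + parent[i + 1:])
        return variants

    best = "".join(random.choice(ALPHABET) for _ in range(len(TARGET)))
    while fitness(best) < len(TARGET):
        # Condition 3: retention -- the current best is preserved and reproduced.
        candidates = blind_variation(best) + [best]
        best = max(candidates, key=fitness)
    print(best)  # eventually reaches the target string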
Campbell’s work is widely cited, attracting both support and criticism.
As of July 2010, his original paper was cited 373 times in the Web of Science,
and over 1,000 times on Google Scholar.
Dean Simonton’s Origins of Genius: Darwinian Perspectives on Creativ-
ity (1999) is perhaps the most prominent extension of Campbell’s work.
His main thesis was that the Darwinian model might actually subsume all
other theories of creativity as special cases of a larger evolutionary frame-
work. Simonton pointed out that there are two forms of Darwinism. The
primary form concerns biological evolution — the Darwinism that is most
widely known. The secondary form of Darwinism provides a generic model
that could be applied to all developmental or historical processes of blind vari-
ation and selective retention. Campbell’s evolutionary epistemology belongs
to the secondary form of Darwinism. Campbell’s proponents argued that the
cultural history of scientific knowledge is governed by the same principles
that guide the natural history of biological adaptations. Simonton provided
supportive evidence from three methodological domains: the experimental,
the psychometric, and the historiometric domains.
Critics of Campbell’s 1960 model mostly questioned the blind-variation
aspect of the model. A common objection is that there would be too many
possible variations to search if there is no effective way to narrow down and
prioritize the search. The searcher would be overwhelmed by the enormous
volume of potential variations. In contrast, the number of variations that
would be worth retaining is extremely small. A scenario behind the British
Museum Algorithm may illustrate the odds (Newell, Shaw, & Simon, 1958).
Given enough time, what are the odds of a group of trained chimpanzees
typing randomly and producing all of the books in the British Museum?
Campbell defended his approach with the following key points and ar-
gued that the disagreement between his approach and the creative thinking
the literature is made of two distinct types of publications with very different
half-lives — the classic and the transient contributions. According to Price
(p. 515), “the research front builds on recent work, and the network becomes
very tight.” He estimated about 30∼40 articles published before a citing
article would constitute the research front relative to the citing article. We
calculated the average number of references cited by a paper in several fields
and found that the average number is 31 (see Table 2.1). It seems reasonable
to expect a paper to cite the entirety of its research front in this sense.
Table 2.1 Average number of references cited per paper
Topic                     Records    Avg refs per paper    Max refs per paper
Pulsars                     1,048            13                   200
Knowledge organization      4,444            14                   331
Terrorism                   1,732            21                   168
String theory               7,983            38                   182
Mass extinction             1,847            67                 1,078
Mean                                         31
Price arranged 200 articles on the topic of N-rays chronologically and used
a matrix of citations, in which articles in the columns cite articles in the rows, to depict
the research front of the subject. It was clear that the research front consists
of about 50 articles published prior to the citing article. Researchers are less
likely to pay attention to papers published before these 50 or so papers. To
cope with this immediacy effect and keep an idea constantly visible, scientists
need to publish their work persistently. How long does it take for a paper
to become obsolete? It depends on particular fields of study. The research
front tends to move fast as a field begins to emerge. For instance, the initial
discovery of pulsars was followed by fast-paced publications. Within the first
18 months of the discovery, the average half-life of papers was as short as
weeks rather than months or years. Publish or perish!
The blind variation and selective retention perspective is also evident in the
work of the late Chinese scholar Hongzhou Zhao, although it is not clear
whether his work was influenced by Campbell’s work. Outside China, Zhao
is probably better known for his work on the dynamics of the world center
of scientific activities (Zhao & Jiang, 1985). With his education in physics,
Zhao defined an element of knowledge as a scientific concept with a quantifi-
able value, for instance, the concepts of force and acceleration in Newton’s
F = ma, or the concept of energy in Einstein’s E = mc2 . The mechanism for
variation in scientific discovery is the creation of a meaningful binding be-
tween previously unconnected knowledge elements in a vast imaginary space.
simultaneously. It is named after the Roman god Janus, who had two faces
looking in opposite directions (see Fig. 2.2). As we shall see shortly, Janusian
thinking can be seen as a special type of divergent thinking and it can be
used to generate original ideas systematically. In addition, an interesting
connection between Janusian thinking and the work of the sociologist Murray
S. Davis on why we find a new theory interesting is discussed in Chapter 4.
Fig. 2.2 Janus, the Roman god. Source: (The Delphian Society, 1913).
knowledge and relations when bringing and using these opposites together.
The nature of the new conception is similar to a Gestalt switch or a change of
viewpoint (see Fig. 2.3). Finding the right perspective is the key to creativity.
Fig. 2.3 The famous “my wife and my mother-in-law” by W. E. Hill (1915).
to a view that may contradict what we believe. The caveat is that the
new theory should not overdo it. If it goes too far, it will lose our interest.
The difference between Davis’ framework and Janusian thinking is subtle but
significant. In Davis’ framework, when we are facing two opposite and contra-
dictory views, we are supposed to choose one of them. In contrast, Janusian
thinking is not about choosing one of the existing views and discarding the
other. Instead, we must come up with a new and creative perspective so that
it can accommodate and subsume all the contradictions. The contradictions
at one level are no longer seen as a problem at the new level of thinking.
It is in this type of conceptual and cognitive transformation that discoverers
create a new theory that makes the co-existence of the antitheses meaningful.
The ability to view things from multiple perspectives and reconcile contradictions
is at the center of dialectical thinking. The origin of dialectics is
a dialog between two or more people who hold different views but wish to seek a
resolution. Socrates, Hegel, and Marx are the most influential figures in the
development of dialectical thinking.
According to Hegel, a dialectic process consists of three stages of thesis,
antithesis, and synthesis. An antithesis contradicts and negates the thesis.
The tension between the thesis and antithesis is resolved by synthesis. Each
stage of the dialectic thinking process makes implicit contradictions in the
preceding stage explicit. An important dialectical principle in Hegel’s system
is the transition from quantity to quality. In the commonly used expression,
“the last straw that broke the camel’s back”, the one additional straw is a
quantitative change, whereas the broken-down camel is a qualitative change. The
negation of the negation is another important principle for Hegel. To Hegel,
human history is a dialectical process.
Hegel was criticized by materialist or Marxist dialectics. In Karl Marx’s
own words, his dialectic method is the direct opposite of Hegel’s. To Marx,
the material world determines the mind. Marxists see contradiction as the
source of development. In this view, class struggle is the contradiction that
plays the central role in social and political life. In Chapter 1 we introduced
how internalism and externalism differ in terms of their views of the nature
of science and its role in the society. Dialectic thinking does seem to have a
unique place in a diverse range of contexts.
Opposites, antitheses, and contradictions in Janusian thinking in particular
and dialectic thinking in general are an integral part of a broader system
or a longer process. Contradictory components are not reconciled but remain
in conflict; opposites are not combined, and oppositions are not resolved
(Rothenberg, 1996). Opposites do not vanish; instead, one must transcend
the tension between contradictory components to find a creative solution.
With regard to the 5-stage model of a creative process, the most creative
and critical component of Janusian thinking is the transition from
the third phase to the fourth phase, i.e. from simultaneous opposition to the
construction of a new perspective. In Campbell’s perspective of blind varia-
tion and selective retention, Janusian thinking proactively seeks antitheses as
a mechanism for variation and imposes retention criteria for variations that
can synthesize theses and antitheses.
2.7 TRIZ
Fig. 2.4 Trial-and-error searches for a path to solution. Source: Figure 2 of (Alt-
shuller, 1999), p. 25.
[3] http://www.salon.com/tech/feature/2000/06/29/altshuller/index.html
as early as 1801 that when electric current passes through metal filaments,
filaments will light up. But the problem was that the filaments burned out
in the process. The contradiction is that the filaments must get hot enough
to glow but not so hot that they burn themselves out. The contradiction was not
resolved until 70 years later by the invention of Joseph Wilson Swan and
Thomas Alva Edison. They solved the problem by placing the filaments in a
vacuum bulb.
Another classic example is the design of the tokamak for magnetic confinement
fusion. Tokamaks were invented in the 1950s by Soviet physicists Igor
Tamm and Andrei Sakharov. The problem was how to confine the hot fu-
sion fuel plasma. The temperature of such plasmas is so high that containers
made of any solid material would melt away. The contradiction was resolved
by using a magnetic field to confine the plasma in the shape of a donut. Con-
tradiction removal is a valuable strategy for creative thinking and problem
solving in general. Focusing on a contradiction is likely to help us to ask the
right questions.
2.8 Summary
knowing if we have found the best needle in the haystack yet. After all, we
may not even have the foggiest idea of what the needle looks like. Scien-
tific discoveries and creative thinking in general take place in such an open,
dynamic space that is full of uncertainty.
galaxies of compounds. However, some questions still remain: What are the
implications of such a tendency for discovering new drugs? How common is
this tendency across the full spectrum of compound similarity measurement?
A far-reaching question is whether or not these galaxies of compounds are
distributed evenly and sparsely — because an even and sparse distribution
would make the search harder than otherwise. In the cosmological universe,
it is known that the distribution is uneven. Much of the universe is empty,
or void. If it is also true in the chemical space, galaxies of therapeutically
interesting compounds would be separated by vast voids between
them.
Thousands of high-throughput screening (HTS) programs suggested that
compounds that bind to certain target classes are clustered together in dis-
crete regions of chemical space. These regions can be defined by particular
chemical descriptors.
Large pharmaceutical companies usually have files of 10^6 compounds.
Chemical space is too large for a systematic scan. High-throughput screening
serves as the starting point of the current primary strategy of the pharmaceutical
industry for identifying biologically active molecules.
HTS is one of the new concepts of drug discovery. A large number of
hypothetical targets are simultaneously exposed to a large number of com-
pounds. These compounds in turn represent numerous variations on a few
chemical themes or fewer variations on a greater number of themes in HTS
configurations. Hits in the HTS process are expected to become leads, the
compounds that remain valid candidates in subsequent and more com-
plex models. Data points are screening results associated with one compound
at one concentration in a particular test. The number of data points has in-
creased rapidly, from 200,000 at the beginning of the 1990s to 5∼6 million
in the mid-1990s, and to over 50 million around the year 2000.
This leaps-and-bounds increase has not generated any comparable increase
in the research productivity of drug discovery. Although HTS has resulted in
a large number of “hits,” some industry leaders were disappointed that very
few leads and development compounds, if any, can be credited to the new drug
discovery paradigm (Drews, 2000). As pointed out by Jürgen Drews (2000):
“The critical discourse between chemists and biologists and the quality of
scientific reasoning are sometimes replaced by the magic of large numbers.”
Others have reached similar assessments (Lipinski & Hopkins, 2004): the
generally poor quality of these data is not widely appreciated by those outside
industry. Drug discoverers using HTS as a massive filtering mechanism need
something else to improve the effectiveness of drug discovery.
Drug discovery is a lengthy process. It can take a decade or even longer
from the initial basic research to its commercialization. Some discoveries are
incremental, some are radical. Some discovered new core compounds for the
first time, while others found new ways of using known compounds. The
discovery process becomes increasingly expensive as it moves from initial
research to clinical trials. A recently published study (Sternitzke, 2010) found
that radical innovations are more likely than incremental ones to originate
from basic research. On average, each drug is accompanied by 19 journal
publications and 23 additional patents. Additional patent filings peak when
the commercialization of the drug is in reach.
Kneller (2010) investigated the origins of 252 new drugs approved by
the US Food and Drug Administration (FDA) between 1998 and 2007. He
identified several factors that appear to play an important role in discov-
ering innovative drugs, for example, the levels of public funding for aca-
demic biomedical research, rigorous peer review, and professional mobility.
Higher levels of open, public funding are valuable for scientists trained in the
course of academic research and biotechnology and pharmaceutical compa-
nies. Peer review in government funding agencies such as the NIH has been
criticized for being reluctant to award funding to younger researchers and to
non-traditional projects. However, the increased competitiveness required for
successful proposals may nevertheless raise the quality of research and reduce
the potential monopoly of senior professors as shown elsewhere, for example,
in Japan. A higher professional mobility and career flexibility may contribute
to more opportunities of cross-fertilizing ideas and initiatives.
As far as drug discovery is concerned, improving the efficiency of the
process is a pressing problem. Navigators in the vast chemical space need
signs and clues that can help them to choose new paths more efficiently. The
current use of HTS is a truly blind variation mechanism. The mechanism
thus far does not make use of any knowledge of the structural properties of
the chemical space.
sky, spot a red rose in a land covered by green vegetation, or pick “the odd
one out” if it differs from the others in terms of its shape or size. However,
our perceptual system is not very good at detecting and recognizing changes
that occur in a scene, especially after our view is interrupted. The inability to
detect this type of change effectively is called change blindness.
One of the most well-known studies of change blindness was an experiment
conducted by Simons (2000). In the experiment, the experimenter stopped a
pedestrian on a street and asked for directions. The experimenter was holding
a basketball as they were talking. Then a group of students walked by and
interrupted their conversation and visual contact. During the interruption,
the experimenter’s basketball was taken away. After the brief interruption,
the conversation resumed. Very few pedestrians spontaneously noticed that anything was missing, but more than half realized that the basketball was gone when they were asked specifically about it.
Change blindness can happen to both professionals and laymen. Researchers have come up with many theories in an attempt to explain how and why change blindness happens. For instance, some theories focus on the role of the stimulus and suggest that change blindness is due to the stimulus being shown at different times. Other theories suggest that we only remember either what we see before or what we see after, but never both, so we have no way to compare the two and identify the difference. Theories arguing that we remember the before scene are known as first-impression theories, whereas theories suggesting that we remember the after scene are known as overwriting theories.
When the chess pieces were placed at random, however, expert players could no longer recall the positions with the same level of accuracy — the number of correctly recalled positions was much lower than what the expert players could recall from settings based on real games.
Generally speaking, experts demonstrate superior recall performance within their domain of expertise. As the chess experiments show, however, the superior performance of experts is unlikely to be due to a better general ability to memorize details; otherwise, they would perform about the same regardless of whether the settings were random or realistic.
So what could separate an expert from a novice?
Researchers have noticed that experts organize their knowledge in a much
more complex structure than novices do. In particular, a key idea is chunking,
which organizes knowledge or other types of information at different levels
of abstraction. For example, the geographic knowledge of the world can be
organized at levels of country, state, and city. At the country level, we do
not need to address details regarding states and cities. At the state level, we
do not need to address details regarding cities. In this way, we find it easier
to handle the geographic knowledge as a whole. A study of memory skills
in 1988 analyzed a waiter who could memorize up to 20 dinner orders at a
time.3 The waiter used mnemonic strategies to encode items from different
categories such as meat temperature and salad dressing. For each order, he
encoded an item from a category along with previous items from the same
category. If an order included blue cheese dressing, he would encode the initial
letter of the salad dressing along with the initial letters from previous orders.
So orders of blue cheese, oil-and-vinegar, and oil-and-vinegar would become “BOO”. The same organizing structure thus supports both encoding and later retrieval. Cicero
(106 B.C. – 43 B.C.), one of the greatest orators in ancient Rome, was known
for his good memory. He used a strategy called memory palaces to organize
information for later retrieval. One may associate an item with a room in
a real palace, but in general any structure would serve the purpose equally
well. The chunking strategy is effectively the same as using a memory palace.
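To make the chunking strategy concrete, here is a minimal Python sketch (a hypothetical illustration written for this discussion, not code from the original study) that encodes a set of dinner orders category by category, so that each category collapses into a single chunk such as “BOO”.

from collections import defaultdict

def chunk_orders(orders):
    """Group order items by category and reduce each category to one
    chunk of initial letters, mimicking the waiter's strategy."""
    chunks = defaultdict(str)
    for order in orders:                      # one dict per diner
        for category, item in order.items():  # e.g. "dressing": "blue cheese"
            chunks[category] += item[0].upper()
    return dict(chunks)

orders = [
    {"dressing": "blue cheese", "temperature": "rare"},
    {"dressing": "oil and vinegar", "temperature": "medium"},
    {"dressing": "oil and vinegar", "temperature": "well done"},
]

print(chunk_orders(orders))  # {'dressing': 'BOO', 'temperature': 'RMW'}

Instead of many independent items, the waiter only needs to hold a handful of such chunks, one per category.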
We can easily fall into some common pitfalls as we try to come up with new
ideas or find unprecedented paths to discover the unknown. These pitfalls
can degrade the quality of the decisions we make or cause us to miss the target altogether.
We tend to pay more attention to things that immediately surround us
than things that are further away. This tendency can be described in terms
of the degree of interest: the intensity of our interest in a topic decreases as its distance from the topics we are already familiar with increases. This tendency may cause problems. While we tend to search locally, the real solution to a problem may take an extensive search to find.
3
Ericsson, K. A., & Polson, P. G. (1988). An experimental analysis of the mechanisms of a memory skill. Journal of Experimental Psychology: Learning, Memory, and Cognition, 14, 305-316.
Another type of pitfall is that we tend to take paths of the least resistance
rather than paths that are likely to lead to the best answers or the best
decisions. The more we go down the same path, the less resistance the path appears to offer. Whenever a new path competes with a well-trodden one, our
decision tends to be biased towards the familiar and proven path. We prefer
no uncertainty and want to avoid unforeseen risks. This preference may have
serious consequences!
This problem can be better explained in terms of mental models, or cogni-
tive models. Mental models are simplified abstractions of how a phenomenon,
the reality, or the world works. We use such models to describe, explain and
predict how things work and how situations may evolve. However, we are also
biased by our own mental models.
Mental models are easy to form and yet hard to change. Once we have
established our mental model, we tend to see the world through the same
mental model and reinforce it with new evidence. If we have to deal with
evidence that apparently contradicts the model, our instinct is to find an
interpretation for the contradiction rather than to question the model itself.
Once we find extenuating reasons to convince ourselves why it makes sense
that the evidence doesn’t appear to fit the model, we move on with an un-
compromised faith in the original model. Fig. 3.1 shows a series of drawings
that gradually change from a man’s face to a sitting woman. If we start to
look at these images one by one from the top left, we see images showing a
man’s face until the last few images. In contrast, if we start from the lower right and move backwards, we see more of the images as showing a seated woman.
Fig. 3.1 Mental models are easy to form, but hard to change.
Our perception readily organizes what we see into Gestalt patterns, which we then take for granted. Sometimes such patterns prematurely narrow down the solution
space for subsequent search. Sometimes one may unconsciously rule out the
ultimate solutions. A simple connecting-the-dot game illustrates this point.
In this game, 9 dots are arranged in three rows and three columns (see
Fig. 3.2). You are asked to find a way to connect all these dots by four straight
lines. The end point of each line must be the starting point of the next line.
Fig. 3.2 Connecting the dots with no more than four joined straight lines.
If there is still no sign of a solution after a few trials, ask yourself whether
you are making any implicit assumptions and whether you are imposing un-
necessary constraints on your solutions. The problem is usually caused by implicit assumptions we make based on a Gestalt pattern that we may not even realize we have formed. These implicit assumptions set the scope of our subsequent search. In this case, we won’t be able to solve the problem unless the
implicit assumptions are removed. This type of blind spot is more common
in our thinking than we realize. Sometimes such blind spots are the direct
source of accidents.
Charles Perrow published a book called Normal Accidents: Living with
High-Risk Technologies (1984). He made a series of compelling cases to un-
derline the core insight that many accidents are caused by human factors.
One of the cases was about an accident in the Chesapeake Bay in 1978. The
captain of a Coast Guard cutter training vessel saw a ship ahead. It was dark and he saw two lights on the other ship, so he thought it was a ship traveling in the same direction as his own. What he didn’t know was that there were actually three lights and that the other ship was heading towards them. Since he missed one of the lights, his understanding of the situation was wrong. Although both ships traveled at full speed and were closing in on each other rapidly, the captain interpreted the other vessel as a very slow fishing boat that he was about to overtake. The other ship was, in fact, a large cargo ship. As both of them approached the Potomac River, the Coast Guard captain, still acting on his incorrect mental model, suddenly decided to make a left turn so that the supposedly small and slow fishing boat could turn to port. Unfortunately, this turn put his ship on a collision course with the oncoming freighter. Eleven Coast Guardsmen on the ship were killed in the accident.
The captain’s mental model in this case was how he perceived, interpreted
and responded to the situation. He started with a wrong mental model and
made a series of wrong decisions without questioning the model itself. At
a larger scale, we form mental models of the reality not only individually,
but also collectively. A mental model of the reality can be shared by a large
number of people. A group of scientists can share a common scientific theory.
A group of thinkers can share a school of thought. In Kuhn’s Structure of
Scientific Revolutions, a paradigm is accepted by a scientific community. It
functions just like the mental model of the scientists in the community. For
scientists who work in a well-established paradigm, it may become increas-
ingly difficult to adopt an alternative paradigm and see how the world can
be interpreted in a different way. Kuhn used the notion of Gestalt switch to
explain the difficulty in changing perspectives.
How can we tell which models or paradigms are superior? Can we expect paradigms to get ‘better’ all the time? Even if a mental model is shared by a large number of intelligent people, it could still be a poor representation of the reality. The Ptolemaic system of the solar system is a classic example.4
The Copernican model is not only a more accurate representation of the
reality, but also a much simpler one.5 The simplicity stands out when you
look at the two models side by side in Fig. 3.3. Simplicity is one of the few
criteria that we expect to see in a superior theory.
Fig. 3.3 Models of the Ptolemaic system (left) and the Copernican system (right).
Mental models and theories in general are about how the reality works or
how a phenomenon takes place. They can be used to make predictions. Such
predictions have a direct influence on what decisions we make and which
course of action we take. There are two broad ways for us to assess the quality of a mental model or a theory. We can examine the coherence and integrity of a theory internally: a theory is expected to explain the mechanisms of a phenomenon in a consistent manner. We can also examine the utility of a theory externally: how does it compete with an alternative theory? Can it give simpler explanations of the same phenomenon than its competitors?
Don Norman, a pioneer of human-computer interaction, proposed that
one should check the reality through seven critical stages of an action cycle. Norman’s action cycle starts with the intended goal to achieve, proceeds through forming the intention, specifying the action sequence, and executing the action, and then continues with perceiving, interpreting, and evaluating the resulting state of the world against the goal.
4
http://microcosmos.uchicago.edu/microcosmos new/ptolemy.html
5
http://microcosmos.uchicago.edu/microcosmos new/copernicus.html
Consider the contrast between the 911 terrorist attacks and Iraq’s missing weapons of mass destruction (WMD): in one case, the intelligence community failed to provide enough warning; in the other, it failed by providing too much (Betts, 2007).
It was commonly believed after the 911 terrorist attacks that U.S. intelligence had failed badly. However, Betts pointed out that the issue is not that simple. The intelligence system did detect that a major attack was imminent weeks before the 911 attacks. The system warned clearly that an attack was coming, but could not say where, how, or exactly when. The vital component lacking from the warning was actionability — it was too vague to act upon.
According to Betts, two months before 911, an official briefing warned
that Bin Laden “will launch a significant terrorist attack against the U.S.
and/or Israeli interest in the coming weeks. The attack will be spectacular
and designed to inflict mass casualties.” George Tenet, the Director of Cen-
tral Intelligence (DCI), later said in his memoirs that “we will generally not
have specific time and place warning of terrorist attacks.” In addition, many
intercepted messages or cryptic warnings were not followed by any terrorist
attack. Before 911, more than 30 such messages had been intercepted without being followed by any terrorist attack. Furthermore, it is not unusual to choose not to act on warnings when various uncertainties are involved. The threat of an extreme hurricane hitting New Orleans had been identified by the Federal Emergency Management Agency (FEMA) long before Hurricane Katrina arrived in 2005. Fixing New Orleans’s vulnerability would have cost an estimated $14 billion. Before the disaster, the perceived cost-effectiveness did not favor making such an investment, whose benefit was hard to estimate. The question sometimes is not how to act upon a credible assessment of a threat or a potential disaster; rather, it is prudent to ask whether one should act at all, given the cost-effectiveness of acting in the situation. Gambling sometimes pays off. Other times it will be seen as a failure, especially in hindsight. In Chapter 4, we will
discuss the gambling nature of almost all decision-making in terms of the
optimal foraging framework.
Another factor that contributed to the failure was the loss of focus caused by the tendency to maximize collection. The issue is the trade-off between collecting more dots and connecting the dots. The fear of missing any potentially useful dots and the fear of taking direct responsibility drove the collection of as many dots as possible. However, collecting more dots makes
connecting the dots even harder. Indeed, after reading the 911 Commission’s
report, Richard Posner concluded that it is almost impossible to take effective
action to prevent something that has never occurred previously. In many
ways, this is a question also faced by scientists, who are searching for ways to
find meaningful dots and connect them. As we will see in later chapters, there
are pitfalls as well as effective strategies for dealing with such situations.
The 911 Commission recommended that the dots should be connected more creatively and that it is “crucial to find a way of routinizing, even bureaucratizing, the exercise of imagination.” Mechanisms for thinking outside the box should be promoted and rewarded, just as Heuer (1999) recommended.
The Nobel Prize is widely regarded as the highest honor and recognition of
one’s outstanding achievements in physics, chemistry, medicine, literature,
and peace. In his will, Alfred Nobel specified that the prizes should be given to persons who have made the most important discoveries, produced the most outstanding work, or done the best work for promoting peace, regardless of their nationality:
“one part to the person who shall have made the most impor-
tant discovery or invention within the field of physics; one part
to the person who shall have made the most important chemi-
cal discovery or improvement; one part to the person who shall
have made the most important discovery within the domain of
physiology or medicine; one part to the person who shall have
produced in the field of literature the most outstanding work in
an ideal direction; and one part to the person who shall have
done the most or the best work for fraternity between nations,
for the abolition or reduction of standing armies and for the
holding and promotion of peace congresses.”7
Although there is no question that the Prizes should be given to the most important discoveries and the most outstanding work, there are always disagreements, to say the least, about which discovery is the most important and which work is the most outstanding. While the importance of some Nobel Prize winning work was recognized all along, the significance of other Nobel Prize winning work was initially overlooked or misperceived. How often
7
http://nobelprize.org/alfred nobel/will/will-full.html
Cormack’s theoretical work and Hounsfield’s engineering work did not come together for the first 10 years — until Hounsfield used Cormack’s theoretical
calculations and built the first CT scanner in 1971. Cormack and Hounsfield
shared the 1979 Nobel Prize in Physiology or Medicine.
Nobel laureate Stanley Prusiner wrote9 , “while it is quite reasonable for
scientists to be sceptical of new ideas that do not fit within the accepted realm
of scientific knowledge, the best science often emerges from situations where
results carefully obtained do not fit within the accepted paradigms.” Prusiner’s
comment echoes our earlier discussions on the potential biases of one’s mind-
set. Both researchers and reviewers are subject to such biases. While it is reassuring to know that some initially rejected works do become recognized later on, it is hard to find out how many, if any, potentially important discoveries were discontinued because their value was not recognized soon enough.
The main thesis of the first half of the chapter is that human perception and cognition are biased. We are conservative and not particularly good at adopting a new perspective, even in the presence of otherwise meaningful signs. The theme of the second half of the chapter is that our imagination is rather limited. Research has shown that the quality of hypotheses increases as more and more hypotheses are generated. However, analysts are more likely to select the first hypothesis that appears good enough than to choose the best from all feasible options. We need extra assistance to stretch our imagination significantly.
According to the impact theory, dust thrown up by the impact stayed in the atmosphere for a couple of years. In 1990, a big crater was discovered in the Gulf of Mexico region, and the crater was believed to be the most direct piece of evidence for the impact theory. In 2001, inspired by the success of the impact
theory in explaining the KT extinction, researchers proposed a new line of
research that would follow the same pattern of the impact theory’s success.
The difference was that it aimed to explain an even earlier mass extinction
250 million years ago. However, the validity of the analogy was questioned by more and more researchers. By 2010, the consensus was that the analogy does not hold given the available evidence. We were able to detect
the analogical path from citation patterns in the relevant literature in our
paper published in 2006 (Chen, 2006). The same conclusion was reached by
domain experts in a 2010 review (French & Koeberl, 2010). We will revisit
this example in later chapters of this book.
We are often torn by competing hypotheses. Each hypothesis on its own can be very convincing, yet they apparently conflict with each other. One of the reasons we find it hard to deal with such situations is that most of us cannot handle the cognitive load needed for actively processing multiple conflicting hypotheses simultaneously. We can focus on one hypothesis, one option, or one perspective at a time. In the literature, the commonly accepted magic number is 7, give or take 2. In other words, if a problem involves about 5∼9 aspects, we can handle them reasonably well. If we need to do more than that, we need to, so to speak, externalize the information, just as we need a calculator or a piece of paper to do multiplications beyond single-digit numbers.
It is much easier to convince people using vivid, concrete, and personal
information than using abstract, logical information. Even physicians, who
are well qualified to understand the significance of statistical data, are con-
vinced more easily by vivid personal experiences than by rigorous statistical
data. Radiologists who examine lung x-rays every day are found to have the lowest rate of smoking. Similarly, physicians who diagnose and treat lung cancer patients are unlikely to smoke.
Analysis of Competing Hypotheses (ACH) is a procedure to assist the
judgment on important issues in a complex situation. It is particularly designed to support decision making involving controversial issues by keeping track of what issues analysts have considered and how they arrived at their judgment. In other words, ACH provides a provenance trail of the decision-making process.
The ACH procedure has eight steps (Heuer, 1999):
1) Identify the possible hypotheses to be considered. Use a group of analysts
with different perspectives to brainstorm the possibilities.
2) Make a list of significant evidence and arguments for and against each
hypothesis.
3) Prepare a matrix with hypotheses across the top and evidence on the
side. Analyze the “diagnosticity” of the evidence and arguments — that
is, identify which items are most helpful in judging the relative likelihood
of the hypotheses.
4) Refine the matrix. Reconsider the hypotheses and delete evidence and
arguments that have no diagnostic value.
5) Draw tentative conclusions about the relative likelihood of each hypothe-
sis. Proceed by trying to disprove the hypotheses rather than prove them.
6) Analyze how sensitive your conclusion is to a few critical items of evidence.
Consider the consequences for your analysis if that evidence were wrong,
misleading, or subject to a different interpretation.
7) Report conclusions. Discuss the relative likelihood of all the hypotheses,
not just the most likely one.
8) Identify milestones for future observation that may indicate events are
taking a different course than expected.
The concept of diagnostic evidence is important. The presence of diag-
nostic evidence removes all the uncertainty in choosing one hypothesis over
an alternative. Evidence that is consistent with all the hypotheses has no
diagnostic value. Many illnesses share the symptom of fever; thus, fever is not diagnostic evidence on its own. In the mass extinction example, the analogy with the KT mass extinction does not have sufficient diagnostic evidence to convince the scientific community. We use evidence in such situations to help us estimate the credibility of a hypothesis.
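As a rough illustration of steps 3 through 5, the following Python sketch (with entirely hypothetical hypotheses, evidence items, and consistency ratings) builds a small hypothesis-by-evidence matrix, identifies the items that have diagnostic value, and counts how much evidence argues against each hypothesis, in the spirit of trying to disprove rather than prove.

# Evidence item -> consistency with each hypothesis
# "C" = consistent, "I" = inconsistent (hypothetical ratings)
matrix = {
    "fever":             {"H1: flu": "C", "H2: malaria": "C"},
    "recent travel":     {"H1: flu": "C", "H2: malaria": "C"},
    "parasite in blood": {"H1: flu": "I", "H2: malaria": "C"},
}

def diagnostic_items(matrix):
    # Evidence is diagnostic only if it distinguishes between hypotheses.
    return [e for e, ratings in matrix.items()
            if len(set(ratings.values())) > 1]

def inconsistency_count(matrix):
    # Rank hypotheses by how much evidence argues against them.
    counts = {}
    for ratings in matrix.values():
        for h, r in ratings.items():
            counts[h] = counts.get(h, 0) + (r == "I")
    return counts

print(diagnostic_items(matrix))     # ['parasite in blood']
print(inconsistency_count(matrix))  # {'H1: flu': 1, 'H2: malaria': 0}

In this toy matrix, fever and recent travel are consistent with both hypotheses and therefore carry no diagnostic weight; only the third item helps in choosing between them.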
Astronomical images prepared for the general public are qualitatively different from the purely “scientific images.” To the
public, these “pretty pictures” are taken as scientific rather than aesthetic.
The Pillars of Creation10 was a famous example of such public-friendly pic-
tures. The public not only treated it as a scientific image, but also attached
additional interpretations that were not found in the original scientific image
(Greenberg, 2004).
The Eagle nebula is a huge cloud of interstellar gas in the south-east corner
of the constellation Serpens. Jeff Hester and Paul Scowen at Arizona State
University took images of the Eagle nebula using the Hubble Space Telescope.
They were excited when they saw the image of three vertical columns of gas.
Hester recalled, “we were just blown away when we saw them.” Then their
attention was directed to “a lot of really fascinating science in them” such as
the “star being uncovered” and “material boiling” off of a cloud.
Greenberg (2004) described how the public reacted to the image. The public first saw the image on the CNN evening news. CNN received calls from hundreds of viewers who claimed that they saw an apparition of Jesus Christ in the Eagle nebula image. On CNN’s live call-in program the next day, viewers reported that they were able to see even more: a cow, a cat, a dog, Jesus Christ, the Statue of Liberty, and Gene Shalit (a prominent film critic).
The reactions to the Eagle nebula image illustrate the concept of a bound-
ary object, which is subject to reinterpretation by different communities. A
boundary object is both vague enough to be adapted to local needs and yet robust enough to maintain a common identity across sites. Different communities, including astronomers, religious groups, and the public, attached various meanings to the image. More interestingly, non-scientific groups were able to make use of the scientific image and its unchallenged absolute authority for their own purposes, so long as the newly added meanings did not conflict with the original scientific meaning. Greenberg’s analysis underlines that the more the scientific process is black-boxed, the easier it becomes to augment scientific knowledge with extra-scientific meanings.
It is usually easy to prove that something really exists, provided that we have the right equipment for detection and observation. It is almost impossible to prove that something does not exist. An example of the former is the discovery of an impact crater as the diagnostic evidence for the impact theory of the mass extinction 65 million years ago. An example of the latter is the failure of the intelligence community to find evidence that Saddam did not have weapons of mass destruction.
The Black Swan is a bestseller by Nassim Nicholas Taleb, who was concerned with the influence of highly improbable and unpredictable events that
10
http://apod.nasa.gov/apod/ap070218.html
A review published in Nature asked whether a system gives any early signs as it approaches such tipping points (Scheffer et al., 2009).
The review in Nature found that although predicting such critical points in
advance is extremely difficult, research in different scientific fields is now sug-
gesting the existence of generic early warning signals. If, as the authors concluded in their review, the dynamics of a system near a critical point have generic properties regardless of differences in the details of each system, then it is indeed a profound finding.
The most important clues of whether a system is getting close to a critical
threshold are related to a phenomenon called critical slowing down in dynam-
ical system theory. To understand critical slowing down, we need to explain a
few concepts, such as fixed points, bifurcations, and fold bifurcations. A fixed point, also known as an invariant point of a function, is a point that is mapped to itself by the function; it remains unaffected by the function or mapping. Fixed points are used to describe
the stability of a system. In a dynamical system, a bifurcation represents the
sudden appearance of a qualitatively different solution for a nonlinear system
as some parameters change. A bifurcation is a separation of a structure into
two branches.
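A standard textbook example, not taken from the review itself, illustrates a fold bifurcation and the critical slowing down associated with it. Consider the one-dimensional system

\[ \frac{dx}{dt} = r + x^{2}, \qquad r < 0. \]

Its fixed points are \( x^{*}_{\pm} = \pm\sqrt{-r} \); the branch \( x^{*}_{-} = -\sqrt{-r} \) is stable and \( x^{*}_{+} \) is unstable. Linearizing around the stable branch gives a recovery rate

\[ \lambda = \bigl| 2\,x^{*}_{-} \bigr| = 2\sqrt{-r} \;\longrightarrow\; 0 \quad \text{as } r \to 0^{-}, \]

so small perturbations die out ever more slowly as the two fixed points approach each other and merge at the fold bifurcation at \( r = 0 \).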
Near a fold bifurcation point (where one stable and one unstable fixed point meet), the system becomes increasingly slow in recovering from perturbations. Research has shown that 1) such slowing down typically starts far from the bifurcation point, and 2) recovery rates decrease smoothly to zero as the critical point is approached. The change of the recovery rate provides impor-
tant clues of how close a system is to a tipping point. In fact, the phenomenon
of critical slowing down suggests three possible early-warning signals in the
dynamics of a system approaching a radical change: slower recovery from
perturbations, increased autocorrelation, and increased variance.
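Both statistical signals can be estimated directly from observed time series. The Python sketch below is a minimal illustration, using a synthetic autoregressive series rather than any data from the review; it computes the lag-1 autocorrelation and the variance in a sliding window, the two quantities expected to rise as a system slows down near a tipping point.

import numpy as np

def early_warning_indicators(x, window):
    # Rolling lag-1 autocorrelation and variance of a 1-D time series.
    ac1, var = [], []
    for i in range(len(x) - window + 1):
        w = x[i:i + window]
        var.append(np.var(w))
        ac1.append(np.corrcoef(w[:-1], w[1:])[0, 1])  # lag-1 autocorrelation
    return np.array(ac1), np.array(var)

# Synthetic example: an AR(1) process whose memory slowly increases,
# mimicking a system drifting towards a tipping point.
rng = np.random.default_rng(0)
n = 1000
phi = np.linspace(0.2, 0.95, n)   # autoregressive coefficient drifts upward
x = np.zeros(n)
for t in range(1, n):
    x[t] = phi[t] * x[t - 1] + rng.normal()

ac1, var = early_warning_indicators(x, window=200)
print(f"autocorrelation: early {ac1[0]:.2f}, late {ac1[-1]:.2f}")
print(f"variance:        early {var[0]:.2f}, late {var[-1]:.2f}")

A sustained upward trend in both indicators, rather than any single value, is what would be read as an early warning.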
The authors of the review article stress that the work on early-warning
signals in simple models is quite strong and it is expected that similar sig-
nals may arise in highly complex systems. They also note that more work is
needed, especially in areas such as detecting patterns in real data and deal-
ing with challenges associated with handling false positive and false negative
signals. It is also possible that sudden shifts in a system may not necessarily
follow a gradual approach to a threshold.
3.6 Summary
In this chapter, we have seen many common cognitive pitfalls that may un-
dermine our creativity. In particular, these pitfalls may impair our ability
to detect early signs, to re-examine our existing mental model, to reduce
possible biases in analytic reasoning, and to solve problems from diverse and
possibly conflicting perspectives.
Mental models are valuable because they provide us with a framework to describe, explain, and predict how things work.
References
Lamb, W.E., Schleich, W.P., Scully, M.O., & Townes, C.H. (1999). Laser physics:
Quantum controversy in action. Reviews of Modern Physics, 71, S263-S273.
Lipinski, C., & Hopkins, A. (2004). Navigating chemical space for biology and
medicine. Nature, 432(7019), 855-861.
Perrow, C. (1984). Normal accidents: living with high-risk technologies, Princeton
University Press.
Scheffer, M., Bascompte, J., Brock, W.A., Brovkin, V., Carpenter, S.R., Dakos, V.,
et al. (2009). Early-warning signals for critical transitions. Nature, 461(7260),
53-59.
Star, S.L. (1989). The structure of ill-structured solutions: Boundary objects and heterogeneous distributed problem solving. In M. Huhns & L. Gasser (Eds.), Readings in Distributed Artificial Intelligence 3 (pp. 37-54). Menlo Park, CA: Morgan Kaufmann.
Sternitzke, C. (2010). Knowledge sources, patent protection, and commercialization
of pharmaceutical innovations. Research Policy, 39(6), 810-821.
Ware, C. (2008). Visual thinking for design. Morgan Kaufmann.
Wenger, E. (1998). Communities of practice — learning, meaning, and identity.
Cambridge, UK: Cambridge University Press.
Wohlstetter, R. (1962). Pearl Harbor: Warning and decision. Stanford University Press.
Chapter 4 Recognizing the Potential of
Research
Basic research often shows no early sign of whether and how it might be practically valuable. Research has found a recurring pattern: many scientific breakthroughs emerge as multiple lines of research converge. The question is: Is it possible to recognize a fruitful path ahead of time? In this chapter, we discuss lessons learned from both hindsight and foresight studies of identifying and recognizing the most important discoveries and innovations.
4.1 Hindsight
What can we learn from the past? How were scientific breakthroughs made?
Black bears hibernate for 5∼7 months. When they wake up, they are as strong as ever. In contrast, if we are inactive for as little as several days, we may start to get weaker rather than stronger. We could start to lose bone mass and strength. People who are unable to maintain their usual levels of activity need to be very careful. For example, astronauts who spend days in space need specially designed programs to keep themselves strong.
What makes the difference between human beings and black bears? This is the type of question whose value everyone can see even before it is answered. Seth Donahue and colleagues at Michigan Technological University started off with this good question. They were able to isolate a bone-building biomarker in the blood of black bears. The research has great commercial implications for osteoporosis treatment and prevention.
Their publication records show that their 2004 paper, entitled “Bending
properties, porosity, and ash fraction of black bear (Ursus americanus) corti-
cal bone are not compromised with aging despite annual periods of disuse,”
was cited 13 times by 2010, whereas their 2006 paper, entitled “Hibernating
bears as a model for preventing disuse osteoporosis,” was cited 3 times. The
practical value is more explicit in the 2006 paper. Donahue’s technique was
licensed to a company founded in 2007, called Aursos, to make therapeutic compounds for osteoporosis patients. Their story became one of 100 success stories published in 2010 about how federal funding enables basic research and creates jobs (The Science Coalition, 2010).
The connection between the basic research and its practical value is easy enough to spot in this case. The successful commercialization made it easier for the funding agencies to justify the funding decisions they had made when the research was in its cradle.
Scientists, social scientists, and politicians frequently credit basic science with stimulating technological innovation and, through it, economic growth. Despite a substantial body of research investigating this general relationship, relatively little empirical attention has been given to understanding the mechanisms that might generate this linkage. Researchers considered whether more rapid diffusion of knowledge, brought about by the norm of publication, might account for part of this effect (Sorenson & Fleming, 2004). They identified the importance of publication by comparing the patterns of citations from
future patents to three groups of focal patents: (i) those that reference scien-
tific (peer-reviewed) publications, (ii) those that reference commercial (non-
scientific) publications; and (iii) those that reference neither. Their analyses
strongly indicated publication as an important mechanism for accelerating
the rate of technological innovation: Patents that reference published mate-
rials, whether peer-reviewed or not, receive more citations, primarily because
their influence diffuses faster in time and space.
In parallel to the role of citation data in modeling and visualizing sci-
entific revolutions, patent citation patterns play an important role in the
construction of knowledge diffusion examples (Jaffe & Trajtenberg, 2002).
There are a number of extensively studied knowledge diffusion, or knowl-
edge spillover, cases, namely liquid crystal display (LCD), nanotechnology
(Braun, Schubert, & Zsindely, 1997; Meyer, 2000), and tissue engineering
(Chen & Hicks, 2004).
In addition, knowledge diffusion between basic research and technological innovation (see Meyer, 2000; Narin & Olivastro, 1992) is an intrinsically related topic.
Empirical evidence shows a tendency of geographical localization in knowl-
edge spillovers (Jaffe & Trajtenberg, 2002). Further studies have revealed
profound implications of social dynamics. Agrawal, Cockburn, and McHale (2003) show that social ties between collaborating inventors play a stronger part than geographic proximity in knowledge diffusion: inventors’ patents continue to be cited by their colleagues at their former institutions.
Singh (2004) considered not just direct social ties but also indirect ones
in social networks of inventors’ teams based on data extracted from U.S.
Patent Office patents from 1975 – 1995. Two teams are connected in the so-
cial network if they have a common inventor. He used this network to analyze
knowledge flows between teams based on patent citations among over half a
million patents from 1986 – 1995. Social links between teams are associated with a higher probability of knowledge flow. The probability decreases as social distance increases. An interesting finding in his study is that social links further explain why knowledge spillovers appear to be geographically localized. He also found a close social link to be a good predictor of knowledge flow regardless of geographic proximity. In social network analysis, such networks of patents and inventors are known as an affiliation network (Wasserman & Faust, 1994). This affiliation network consists of two kinds of nodes: the inventors (the “actors”) and the patents (the “events”).
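Such a bipartite affiliation network is straightforward to represent and to project. The sketch below, with made-up inventor names and patent numbers purely for illustration, uses the networkx library to link inventors to patents and then derives the network in which two patents, and hence two inventor teams, are connected whenever they share an inventor.

import networkx as nx
from networkx.algorithms import bipartite

# Hypothetical inventor-patent pairs (the "actors" and the "events")
pairs = [
    ("Alice", "US-001"), ("Bob", "US-001"),
    ("Bob", "US-002"), ("Carol", "US-002"),
    ("Dave", "US-003"),
]

G = nx.Graph()
inventors = {i for i, _ in pairs}
patents = {p for _, p in pairs}
G.add_nodes_from(inventors, bipartite=0)   # actor nodes
G.add_nodes_from(patents, bipartite=1)     # event nodes
G.add_edges_from(pairs)

# Project onto patents: two patents are linked if they share an inventor,
# which is essentially how teams were connected in Singh's social network.
patent_net = bipartite.projected_graph(G, patents)
print(list(patent_net.edges()))  # e.g. [('US-001', 'US-002')]

Citation links between patents can then be compared against this projected network to ask whether knowledge flows follow social ties.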
LCD first appeared in 1968 and was subsequently improved several times
between 1969 and 2003. Nanotechnology has various potential applications, such as self-replicating nanobots, smart materials for artificial drugs, and self-healing materials. Tissue engineering uses a combination of cells, engineering materials, and suitable biochemical factors to improve or replace biological functions in an effort to advance medicine. Sci-
ence and technology linkages are particularly valuable for funding agencies
to evaluate funding efficiencies, for science policy researchers to study science
and technology indicators, and even for investment fund managers to rank
companies based on their innovation potentials.
A turning point in U.S. science policy came in 1967 and 1968. Up until then, sci-
ence policy had been dominated by the cold war. By 1963, the national invest-
ment in R&D was approaching 3% of GDP, 2.5 times the peak reached just
before the end of World War II. More than 70% of this effort was supported
by the federal government. 93% of it came from only three federal agencies:
the Department of Defense (DOD), the Atomic Energy Commission (AEC),
and the National Aeronautics and Space Administration (NASA). Their mis-
sion was to ensure the commercial spinoff of the knowledge and technologies
they had developed. Much of the rest of federal R&D was in the biomedical
field. Federal funding for basic research in universities reached a peak in 1967.
It declined after that in real terms until about 1976.
The current funding environment is very competitive because of the tight-
ened budget and increasing demands from researchers. In addition, the view that science needs to serve society means that funding authorities as well as individual scientists need to justify how scientific inquiries meet the needs of society in terms of the economy, welfare, and national security and competitiveness. There are two types of approaches to assess the quality and impact
of scientific activities and identify priority areas for strategic planning. One is
Hsieh (2010) tested such a hypothesis with U.S. patents granted between 1975 and 1999. The usefulness of an
invention was measured by future citation, while the relatedness of inventions
was measured by the network of citations. He found a statistically significant
inverse U-shaped relationship between an invention’s usefulness and the re-
latedness among its component features. The usefulness of an invention was
relatively low when the relatedness was either too strong or too weak. In
contrast, the usefulness was the highest when the relatedness was in between
the two extremes.
Transformative research is often characterized as being high risk and po-
tentially high payoff. Revolutionary and groundbreaking discoveries are hard
to come by. What are the implications of the trade-off strategy on funding
transformative research with public funding? It is known that it takes a long time before the value of scientific discoveries and technological innovations becomes clear. Nobel Prizes are usually awarded for work done a few decades earlier. We also know that Nobel-class ideas do get rejected. The ques-
tion is: to what extent will we be able to foresee scientific breakthroughs?
How long does it take for the society to fully recognize the value of scientific
breakthroughs or technological innovations? Project Hindsight was commis-
sioned by the U.S. Department of Defense (DoD) in order to search for lessons
learned from the development of some of the most revolutionary weapon sys-
tems. One of the preliminary conclusions drawn from Project Hindsight was
that basic research commonly found in universities didn’t seem to matter
very much in these highly creative developments. It appears, in contrast,
that projects with specific objectives were much more fruitful.
In 1966, a preliminary report of Project Hindsight was published1 . A team
of scientists and engineers retrospectively analyzed how 20 important military weapons came about, including the Polaris and Minuteman missiles, nuclear warheads, the C-141 aircraft, the Mark 46 torpedo, and the M102 howitzer. Researchers identified 686 “research or exploratory development events” that were essential for the development of the weapons. Only 9% were regarded as “scientific research” and 0.3% as basic research; 9% of the research was conducted in universities.
Project Hindsight indicated that science and technology funds deliberately
invested and managed for defense purposes have been about one order of
magnitude more efficient in producing useful events than the same amount of
funds invested without specific concerns for defense needs. Project Hindsight
further concluded that:
1) The contributions of university research were minimal.
1
Science, 1976, 192, pp. 105-111.
2) Scientists contributed most effectively when their effort was mission ori-
ented.
3) The lag between initial discovery and final application was shortest when
the scientist worked in areas targeted by his sponsor.
In terms of its implications for science policy, Project Hindsight emphasized mission-oriented research, contract research, and commission-initiated research. Although these conclusions were drawn from the study of military weapon development, some of them found their way into the evaluation of scientific fields such as biomedical research.
The extended use of these preliminary findings drew considerable criticism. Comroe and Dripps (2002) criticized Project Hindsight as anecdotal and biased, especially because it was based on the judgments of a team of experts. In contrast to the panel-based approach taken by Project Hindsight, they started off with clinical advances since the early 1940s that have been directly responsible for diagnosing, preventing, or curing cardiovascular or pulmonary disease, stopping its progression, decreasing suffering, or prolonging useful life. They asked 40 physicians to list the advances they considered to
be the most important for their patients. Physicians’ responses were grouped
into two lists in association with two diseases: a cardiovascular disease and
a pulmonary disease. Then each of the lists was sent to 40∼50 specialists in the corresponding field. Specialists were asked to identify corresponding key articles, which needed to meet the following criteria:
1) It had an important effect on the direction of subsequent research and
development, which in turn proved to be important for clinical advance
in one or more of the ten clinical advances they were studying.
2) It reported new data, new ways of looking at old data, a new concept or hypothesis, a new method, or new techniques that either were essential for the full development of one or more of the clinical advances or greatly accelerated it.
A total of 529 key articles were identified in relation to 10 advances in
biomedicine:
• Cardiac surgery
• Vascular surgery
• Hypertension
• Coronary insufficiency
• Cardiac resuscitation
• Oral diuretics
• Intensive care
• Antibiotics
• New diagnostic methods
• Poliomyelitis
It was found that 41% of the work judged to be essential for later clinical advances was not clinically oriented at the time it was done. The scientists responsible for these key articles sought knowledge for the sake of knowledge. Of the key articles, 61.7% described basic research, 21.2% reported other types of research, 15.3% were concerned with the development of new apparatus, techniques, operations, or procedures, and 1.8% were review articles or syntheses of the data of others.
Comroe and Dripps discussed research on research, similar to the notion of
science of science. They pointed out that it requires long periods of time and long-term support to conduct retrospective and prospective research on the nature of scientific discovery and to understand the causes of long and short lags between discovery and application. Their suggestion echoes the
results of an earlier project commissioned by the NSF in response to Project
Hindsight. NSF argued that the timeframe studied by the Hindsight project
was too short to identify the basic research events that had contributed to
technological advances. The NSF commissioned a project known as TRACES
to find how long it would take for basic research to evolve to the point that
potential applications become clear. However, Mowery and Rosenberg (1982) argued that the concept of research events is much too simplistic and that neither Hindsight nor TRACES used a methodology capable of showing what they purported to show.
4.1.4 TRACES
4.2 Foresight
Since 1970, the Science and Technology Agency (STA) in Japan has carried out a series of long-term forecasts, looking 30 years ahead into the future of science, technology, and innovation. These forecasts are among the most systematic and wide-ranging forms of foresight process. Some of the priority topics identified in forecasts made in the early 1990s include the development of an HIV vaccine and effective methods for preventing Alzheimer’s disease.
Twenty years after the first Delphi exercise, Japan’s National Institute of Science and Technology Policy (NISTEP) reviewed the accuracy of its forecasts and found that 64% of topics were realized to some extent, but only 28% were fully realized. The accuracy rate was overall regarded as encouraging given the experimental nature of the first Delphi exercise. In addition, the inaccurate results were more often due to political or social changes than to technological development. Lessons learned from a separate analysis of the foresight exercise indicated that expert panels used in such surveys should draw upon a wide range of expertise, because experts tend to be over-optimistically biased about the development of their own fields. Interestingly,
it was found that experts in neighboring fields were better able to foresee
potential barriers in related topics. This finding underlines a central premise
of this book: transformative discoveries are likely to emerge from the twilight
zones where multiple fields meet.
Although it may be ironic that experts are more likely to be biased on topics in which they have the most expertise, this is a valuable reminder of how vulnerable our cognitive abilities are. An interesting approach utilized by NISTEP is
using science maps to depict hot research areas as mountains. Although the metaphor of a landscape has been used in a variety of visualizations for a long time, it is still rare to see such use in official reports of scientific priority
forecasts. Hot research areas in NISTEP’s science maps are defined as topic
areas in which the total number of publications has exceeded a threshold.
NISTEP identified the promising role of a science map as a boundary
object in facilitating communications between domain experts and facilita-
tors: “During interviews, we were struck by the usefulness of the Science
map as a basis for discussion . . . . With shared data such as the Science map,
researchers from different fields can engage in more meaningful discussion
of the development of scientific research. By sharing the same ‘arena’, re-
searchers can mutually adjust their sense of distance, facilitating discussion
among researchers or among researchers and policy makers. In the future, we
would like to pursue this idea of the Science map as an arena for discussion.”
(Saka, Igami, & Kuwahara, 2008).
It is certainly tempting for analysts to gather opinions from scientists
about future development of scientific fields, but the question is how reliable
the results are. The foresight approach in general is based on four principles
(Martin, 1995):
1) The forecasts must incorporate economic and social needs;
2) They must cover all of science and technology;
3) They should evaluate the relative importance of different R&D tasks and determine priorities for policy purposes;
4) They should be both predictive (forecasting what is likely to happen) and normative (setting goals for what should happen).
Japan, the UK and Australia are widely known for their continued ef-
forts in foresight processes. What has been done in the U.S. regarding the
future of science and technology? During the 1960s, the Committee on Sci-
ence and Public Policy (COSPUP) of the U.S. National Academy of Sciences
conducted a series of field surveys in order to assess individual scientific dis-
ciplines and promising areas in these disciplines (Westheimer, 1965). The
surveys were resumed in 1980 by the National Research Council (NRC). The
Pimentel report Opportunities in Chemistry (Pimentel, 1985) was regarded
as a successful field survey. A committee was set up in 1982, chaired by Professor Pimentel, with the goal of surveying the research frontiers of chemistry. Several hundred chemists were asked to identify topics for further review.
Funding agencies and the U.S. Congress criticized field surveys for several reasons. In particular, almost all the field surveys in the 1980s made the unrealistic demand that funding for the field in question be doubled over the next 5 years. Each field study on average cost $0.5 million ∼ $1.0 million and took 3 years to complete. The final reports were often too long and
inaccessible to outsiders. No attempt was made to identify any overall pri-
orities. Field studies relied on informed but subjective judgments of experts.
More importantly, field studies did not identify priority areas needed for science policy decision making in response to stretched public funding. In part, the reluctance to identify areas of declining importance came from the scientific community and from the National Academies, which serve the interests of the scientific community. Subsequent foresight activities learned from these lessons and placed more emphasis on balancing interested parties and independent third parties. Ben Martin’s reviews (Martin, 1995; Mar-
tin, 2010) provide informative accounts of the history of foresight, including
Australia, Germany, New Zealand, the Netherlands, and the UK.
Goodwin and Wright (2010) reviewed forecasting methods that target rare, high-impact events. They identified the following six types of problems that may undermine the performance of forecasting methods:
• Sparsity of reference class
• Reference class that is outdated or does not contain extreme events
• Use of inappropriate statistical models
• The danger of misplaced causality
• Cognitive biases
• Frame blindness
Goodwin and Wright identified three heuristics that can lead to sys-
tematically biased judgments: (1) availability, (2) representativeness, and
(3) anchoring and insufficient adjustment. The availability heuristic bias means that human beings find it easier to recall some events than others, even though easy-to-recall events do not usually have a higher probability of occurring than hard-to-recall events. The representativeness heuristic is a tendency to ignore base-rate frequencies. Anchoring and insufficient adjustment means that forecasters make insufficient adjustments for future conditions because they anchor on current values. As we can see,
cognitive biases illustrate how vulnerable human cognition is in terms of es-
timating probabilities intuitively. Expert judgment is likely to be influenced
by these cognitive biases. Researchers have argued that in many real world
tasks, apparent expertise may have little to do with any real judgment skills
at the task in question.
The increasing emphasis on accountability for science and science pol-
icy is influenced by many factors, but two of them are particularly influen-
tial and persistent despite the fact that they started to emerge more than a
decade ago. The two factors are 1) limited public funding, and 2) the growing
view that publicly funded research should contribute to the needs of society
(MacLean, Anderson, & Martin, 1998). The notion of a value-added chain is
useful for explaining the implications. The earlier value-added chain, especially between the 1940s and 1960s, was simple. Researchers and end-users were
loosely coupled in such value-added chains. The primary role of researchers
was seen as producing knowledge and the primary role of end-users was to
users often make a valuable input on what is attractive. This type of method was adopted by MacLean et al. (1998) to identify the nature of links in a value-added chain and map science outputs onto user needs. Specifically, a two-round Delphi survey was conducted. Responses from over 100 individu-
als were obtained in each round. The differences between responses in the two
rounds were plotted in a two-dimensional feasibility-by-attractiveness space.
Fig. 4.1 shows a schematic illustration of the movements of assessments
in this two-dimensional space, which is a representation of the linkage be-
tween scientists and users in the value-added chain model. For example, the
topic remote data acquisition in the high feasibility and high attractiveness
quadrant moved to a position with an even higher feasibility and a higher at-
tractiveness after the second round of the Delphi survey. In contrast, the topic
sustainable use of marine resources was reduced in terms of both feasibility
and attractiveness. It is possible that one group of stakeholders changed their
assessments, but the other group’s assessments remained unchanged. For ex-
ample, while users did not alter their attractiveness assessments of topics
such as management of freshwater resources and prediction of extreme atmo-
spheric events, scientists updated the corresponding feasibility assessments:
one went up and the other went down.
Fig. 4.1 Feasibility and attractiveness of research topics. Source: The diagram is
drawn based on Figure 3 of (MacLean et al., 1998).
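A diagram of this kind is easy to reproduce. The sketch below uses invented feasibility and attractiveness scores, not the actual survey responses, and draws an arrow from each topic's first-round position to its second-round position in the two-dimensional space.

import matplotlib.pyplot as plt

# Hypothetical (feasibility, attractiveness) scores for two Delphi rounds
topics = {
    "remote data acquisition":      ((0.70, 0.80), (0.80, 0.90)),
    "sustainable marine resources": ((0.60, 0.70), (0.50, 0.60)),
    "freshwater management":        ((0.40, 0.60), (0.50, 0.60)),
}

fig, ax = plt.subplots()
for name, (round1, round2) in topics.items():
    ax.annotate("", xy=round2, xytext=round1,
                arrowprops=dict(arrowstyle="->"))   # movement between rounds
    ax.text(*round1, name, fontsize=8)
ax.set_xlabel("Feasibility")
ax.set_ylabel("Attractiveness")
ax.set_xlim(0, 1)
ax.set_ylim(0, 1)
plt.savefig("feasibility_attractiveness.png")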
The Delphi method is the most frequently used method in foresight activities.
Fig. 4.2 A genealogical tree of national applications of the Delphi method. Source:
Figure 1 in (Grupp & Linstone, 1999).
Some of the earliest studies using Delphi were performed at the RAND Cor-
poration (Dalkey, 1969; Kaplan, Skogstad, & Girshick, 1950). In 1972, the
Science and Technology Agency in Japan selected the Delphi method for fore-
sight activities. It gathers experts’ judgments using successive iterations of a
survey questionnaire. Each iteration is also known as a round. The results of
earlier rounds are shared among experts in a new round so that facilitators of
the survey can identify the convergence of opinions or the persistence of dif-
ferent opinions (Grupp & Linstone, 1999). The Delphi method is considered
particularly useful for making long-range forecasts over 20∼30 years because
in such situations expert opinions are the only source of information avail-
able. Unlike a committee, which usually seeks consensus, Delphi does not
force consensus. The Delphi method allows experts to shift their opinions
from one round to another based on new information that becomes available
to them. Fig. 4.2 shows a genealogical tree of national applications of the
Delphi method. The diagram is adapted from Grupp and Linstone (1999).
A widely cited retrospective review of Japan’s Delphi experiences was
done by Cuhls (1998) in her doctoral dissertation. She found that the Japanese
Delphi studies were able to keep unresolved issues, such as early earthquake detection, on the national science and technology agenda even in years without earthquakes, when the public and policy makers paid little attention to these issues.
How realistic and reliable are expert opinions obtained from foresight activ-
ities? There is a rich body of literature on Delphi and related issues, such as individual opinion change and judgmental accuracy in Delphi-like groups (Rowe, Wright, & McColl, 2005), as well as pitfalls and neglected aspects (Geels & Smit, 2000).
In a recent study, Felix Brandes (2009) addressed this issue and assessed expert anticipations in the UK technology foresight program. The UK’s technology foresight program was recommended in the famous 1993 White Paper ‘Realizing our Potential.’ Fifteen expert panels were formed along with a large-scale national Delphi survey. The survey was sent to 8,384 experts in 1994 to generate forecasts for 2015 and beyond; 2,585 experts responded. About 2/3 of the statements were predicted to be realized between 1995 and 2004. Brandes’ study therefore assessed how realistic the 1994 expert estimates had turned out to be by 2004, i.e., 10 years later.
Out of the original 15 panels of the 1994 UK foresight program, Brandes selected three panels (Chemicals, Energy, and Retail & Distribution) to follow up the status of their forecasts in terms of Realized, Partially Realized, Not Realized, and Don’t Know. An online survey was used and the overall response rate was 38%.
4.3 Summary
2
The Office of Science and Technology (UK) sent out a ‘Hindsight on Foresight’ survey
in 1995.
References
Agrawal, A., Cockburn, I., & McHale, J. (2003). Gone but not forgotten: Labor
flows, knowledge spillovers, and enduring social capital. NBER Working Paper
No. 9950.
Anderson, J. (1997). Technology foresight for competitive advantage. Long Range
Planning, 30(5), 665-677.
Brandes, F. (2009). The UK technology foresight programme: An assessment of
expert estimates. Technological Forecasting and Social Change, 76(7), 869-879.
Braun, T., Schubert, A., & Zsindely, S. (1997). Nanoscience and nanotechnology
on the balance. Scientometrics, 38, 321-325.
Chen, C., & Hicks, D. (2004). Tracking knowledge diffusion. Scientometrics, 59(2),
199-211.
Chubin, D.E., & Hackett, E.J. (1990). Peerless science: Peer review and U.S.
science policy. Albany, NY: State University of New York Press.
Comroe, J.H., & Dripps, R.D. (2002). Scientific basis for the support of biomedical
science. In R.E. Bulger, E. Heitman & S.J. Reiser (Eds.), The ethical dimensions
of the biological and health sciences (2nd ed., pp. 327-340). Cambridge, UK:
Cambridge University Press.
Cuhls, K. (1998). Technikvorausschau in Japan. Heidelberg: Physica-Springer.
Dalkey, N.C. (1969). The Delphi method: An experimental study of group opinion.
Santa Monica, CA: The Rand Corporation.
Editorial. (2010). Assessing assessment. Nature, 465, 845.
Geels, F.W., & Smit, W.A. (2000). Failed technology futures: Pitfalls and lessons
from a historical survey. Futures, 32(9-10), 867-885.
Goodwin, P., & Wright, G. (2010). The limits of forecasting methods in anticipating
rare events. Technological Forecasting and Social Change, 77(3), 355-368.
Grupp, H., & Linstone, H.A. (1999). National technology foresight activities around
the globe — Resurrection and new paradigms. Technological Forecasting and
Social Change, 60(1), 85-94.
Hsieh, C. (2010). Explicitly searching for useful inventions: Dynamic relatedness
and the costs of connecting versus synthesizing. Scientometrics.
Illinois Institute of Technology. (1969). Technology in retrospect and critical events
in science. Chicago: The Illinois Institute of Technology Research Institute.
Jaffe, A., & Trajtenberg, M. (2002). Patents, citations & innovations. The MIT
Press.
Kaplan, A., Skogstad, A.L., & Girshick, M.A. (1950). The prediction of social and
technological events. Public Opinion Quarterly XIV, 93-110.
Laudel, G. (2006). The art of getting funded: How scientists adapt to their funding
conditions. Science and Public Policy, 33(7), 489-504.
MacLean, M., Anderson, J., & Martin, B.R. (1998). Identifying research priorities
in public sector funding agencies: Mapping science outputs on to user needs.
Technology Analysis & Strategic Management, 10(2), 139-155.
Martin, B.R. (1995). Foresight in science and technology. Technology Analysis &
Strategic Management, 7(2), 139-168.
Martin, B.R. (2010). The origins of the concept of ‘foresight’ in science and tech-
nology: An insider’s perspective. Technological Forecasting and Social Change,
77(9), 1438-1447.
Meyer, M. (2000). What is special about patent citations? Differences between
scientific and patent citations. Scientometrics, 49(1), 93-123.
Miles, I. (2010). The development of technology foresight: A review. Technological
Forecasting and Social Change, 77(9), 1448-1456.
Narin, F., & Olivastro, D. (1992). Linkage between technology and science. Research
Policy, 21(3), 237-249.
The investigation of the 9/11 terrorist attacks raised questions about whether the
intelligence agencies could have connected the dots and prevented the terror-
ist attacks (Anderson, Schum, & Twining, 2005). Prior to the attacks, several
foreign nationals enrolled in different civilian flying schools to learn how to fly
large commercial aircraft. They were interested in learning how to navigate
civilian airliners, but not in landings or takeoffs. And
they all paid cash for their lessons. What is needed for someone to connect
these seemingly isolated dots and reveal the hidden story?
In an intriguing New Yorker article, Gladwell differentiated puzzles from
mysteries using the story of the collapse of Enron (Gladwell, 2007). To
solve a puzzle, more specific information is needed. To solve a mystery, one
needs to ask the right question. Connecting the dots is more of a mystery than
a puzzle. Solving mysteries is one of the many challenges for visual analytic
reasoning. We may have all the necessary information in front of us and yet
fail to see the connection or recognize an emergent pattern. Asking the right
question is critical to staying on track.
In many types of investigations, seeking answers is only part of the game.
It is essential to augment the ability of analysts and decision makers to ana-
lyze and assimilate complex situations and reach informed decisions. We con-
sider a generic framework for visual analytics based on information theory
and related analytic strategies and techniques. The potential of this frame-
work to facilitate analytical reasoning is illustrated through several examples
from this consistent perspective.
In information theory, the value of information carried by a message is
the difference of information entropy before and after the receipt of the mes-
sage. Information entropy is a macroscopic measure of uncertainty defined
on a frequency or probability distribution. A key function of an information-
theoretical approach is to quantify discrepancies of the information content
of distributions. Information indices, such as the widely known Kullback-
Leibler divergence (Kullback & Leibler, 1951), are entropy-based measures
of discrepancies between distributions (Soofi & Retzer, 2002).
The Kullback-Leibler (K-L) divergence of probability distribution Q from
probability distribution P is defined as follows:
\mathrm{Divergence}_{K\text{-}L}(P : Q) = \sum_{i} P(i)\,\log\frac{P(i)}{Q(i)}
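A minimal sketch of this computation for two discrete distributions, assuming NumPy and that Q(i) > 0 wherever P(i) > 0, might look like this:

```python
import numpy as np

def kl_divergence(p, q):
    """Kullback-Leibler divergence D(P : Q) of Q from P for discrete distributions.
    Assumes q[i] > 0 wherever p[i] > 0; zero-probability terms of P contribute nothing."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p, q = p / p.sum(), q / q.sum()           # normalize frequency counts
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

# Illustrative term-frequency distributions before and after a message arrives
before = [25, 25, 25, 25]
after  = [70, 10, 10, 10]
print(kl_divergence(after, before))           # divergence of the prior from the updated belief
```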
follow a path that can maximize the overall profitability. Information scent
is the perception of the value, cost, or accessible path of information sources.
When possible, one relies on information scent to estimate the potential prof-
itability of a patch.
The power of information foraging theory lies in its adaptability and ex-
tensibility. It provides a quantitative framework for interpreting behavioral
patterns at both microscopic and macroscopic levels. For instance, connect-
ing the dots of the mysterious behaviors of the 9/11 hijackers at flying schools would
depend on the prevalence and strength of the relevant information scent (An-
derson et al., 2005). The question is where an analyst could draw the right
information scent in the first place. Fig. 5.1 is not designed with information
foraging theory in mind, but the visualization intuitively illustrates the profit
maximization principle behind the theory. The connective density reinforces
the boundaries of patches. Colors and shapes give various information scents
about each patch in terms of its average age and the popularity of citations.
These scents will help users to choose which patch they want to explore in
more detail.
Fig. 5.1 The three clusters of co-cited papers can be seen as three patches of infor-
mation. All three patches are about terrorism research. Prominently labeled papers
in each patch offer information scent of the patch. Colors of patches, indicating the
time of a connection, provide a scent of freshness. The sizes of citation rings provide
a scent of citation popularity. Source: (Chen, 2008). (see color figure at the end of
this book)
We revise our beliefs when new information becomes available. For example,
physicians run various tests on their patients, make sense of the test results,
and decide whether more tests are needed. In general elections, voters ask
questions about candidates' political positions in order to reduce or eliminate
uncertainties about whom to choose. Bayesian reasoning is
a widely used method to analyze evidence and synthesize our beliefs. It has
been used in a wide variety of application domains, from interpreting women’s
mammography for breast cancer risks to differentiating spam from genuine
emails.
The search for the USS Scorpion nuclear submarine is a frequently told
story of a successful application of Bayesian reasoning. The USS Scorpion was
lost at sea in May 1968. An extensive search failed to locate the vessel.
The search was particularly challenging because of the lack of knowledge of
its location prior to its disappearance. The subsequent search was guided by
Bayesian search theory, which takes the following steps:
1) Formulate hypotheses about the whereabouts of the lost object.
2) Construct a probability distribution over a grid of areas based on the
hypotheses.
3) Construct a probability distribution of finding the lost object at a location
if it is indeed there.
4) Combine the two distributions to form a new probability distribution and
use it to guide the search.
5) Start the search from the area with the highest probability and move to
areas with the next highest probabilities.
6) Revise the probability distribution using the Bayesian theorem as the
search goes on.
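A minimal sketch of these steps on a small hypothetical grid (NumPy assumed; the prior and detection probabilities are invented for illustration):

```python
import numpy as np

# Step 2: prior probability of the object being in each grid square (invented).
prior = np.array([[0.05, 0.10, 0.05],
                  [0.10, 0.40, 0.10],
                  [0.05, 0.10, 0.05]])
# Step 3: probability of detecting the object in a square if it is really there.
p_detect = np.full(prior.shape, 0.8)

def search_once(belief, p_detect):
    """Steps 4-6: search the square with the highest combined probability; if the
    object is not found, update the whole distribution with Bayes' theorem."""
    cell = np.unravel_index(np.argmax(belief * p_detect), belief.shape)
    posterior = belief.copy()
    posterior[cell] *= (1.0 - p_detect[cell])     # likelihood of an unsuccessful search
    return cell, posterior / posterior.sum()

belief = prior
for _ in range(3):
    cell, belief = search_once(belief, p_detect)
    print(f"searched {cell}, revised belief:\n{belief.round(3)}")
```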
In the Scorpion search, experienced submarine commanders were called in
to formulate, independently, hypotheses about the whereabouts of the Scorpion.
The search started from the grid square of the sea with the highest probability
and moved on to squares with the next highest probabilities. The probabil-
ity distribution over the grid was updated as the search moved along using
Bayes' theorem. The Scorpion was found in October 1968, more than 10,000 feet
under water and within 200 feet of the location suggested by the Bayesian search.
The Bayesian method enables searchers to estimate the cost of a search at
local levels and allows them to adapt their search path according to the
revised beliefs as the process progresses. This adaptive strategy is strikingly
similar to the profit maximization assumption of information foraging theory.
The revision of our beliefs turns probability distributions into information
scents. The Bayesian search method is thus a tool that may help analysts solve
mysteries.
If solving mysteries in visual analytics is akin to finding needles in numer-
ous haystacks, the needles of interest often keep a low profile: they tend to
blend in well with others. Furthermore, human analysts excel at identifying
and differentiating information that differs only subtly from everything
around it. In order to find connections between a few low-profile needles,
analysts need tools that can reliably single out subtle outliers or surprises
from an overwhelmingly vast and diverse population. Information indices are
designed to capture discrepancies between distributions. The following example
shows how such information indices are used to detect surprising spots in video frames.
Fig. 5.2 Information entropies of the literature of terrorism research between 1990
and the first half of 2007. The two steep increases correspond to the Oklahoma
City bombing in 1995 and the 9/11 terrorist attacks in 2001. Source: (Chen, 2008).
Fig. 5.3 Symmetric relative entropy matrix shows the divergence between the
overall use of terms across different years. The recent few years are most similar
to each other. The boundaries between areas in different colors indicate significant
changes of underlying topics. Source: (Chen, 2008). (see color figure at the end of
this book)
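A matrix of this kind can be approximated with a symmetric relative entropy, i.e., the sum of the two directed K-L divergences; the sketch below (Python with NumPy; the per-year term-frequency distributions are invented) builds a small year-by-year matrix:

```python
import numpy as np

def symmetric_kl(p, q, eps=1e-12):
    """Symmetric relative entropy: D(P : Q) + D(Q : P), with smoothing to avoid zeros."""
    p = np.asarray(p, float) + eps
    q = np.asarray(q, float) + eps
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p)))

# Invented term-frequency distributions over a shared vocabulary, one per year.
years = {1999: [40, 5, 3, 1], 2000: [38, 6, 4, 2], 2006: [10, 30, 25, 20]}
labels = sorted(years)
matrix = [[symmetric_kl(years[a], years[b]) for b in labels] for a in labels]
print(labels)
print(np.array(matrix).round(3))   # nearby years show small divergences, distant years large ones
```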
Fig. 5.4 illustrates how one may facilitate a sensemaking process with
both high- and low-profile patterns embedded in the same visualization. The
network in Fig. 5.4 consists of keywords that appeared in 1995, 1996, and
1997, corresponding to the first period of substantial change in terrorism
research. High-profile patterns are labeled in black, whereas low-profile pat-
terns are labeled in dark red. High-profile patterns help us to understand the
most salient topics in terrorism research in this period of time. For exam-
ple, terrorism, posttraumatic-stress-disorder, terrorist bombings, and blast
overpressure are the most salient ones. The latter two are closely related to
the Oklahoma City bombing, whereas posttraumatic-stress-disorder is
not directly connected at this level. In contrast, low-profile patterns include
avoidance symptoms, early intrusion, and neuropathology. These terms are
unique with reference to other keywords. Once these patterns are identified,
analysts can investigate even further and make informed decisions. For exam-
ple, one may examine whether this is the first appearance of an unexpected
topic or whether the emergence of a new layer of uncertainty to the system
at this point makes perfect sense.
Developing methods and principles for representing data quality, reliabil-
ity, and certainty measures throughout the data transformation and analysis
process is a key element on the research agenda for visual analytics (Thomas
& Cook, 2005). Each of the individual methods illustrated here has been used
in its own application domain, and some of them have already been applied
to visual analytics. However, introducing the collection of theories, strategies,
and techniques as a consistent and yet versatile information-theoretic view
of visual analytics is expected to strengthen the theory and practice of visual
analytics.
At the beginning of the chapter, we emphasized that asking the right ques-
tion holds the key to connecting the dots. The examples discussed here illustrate
various ways to find the dots, make sense of them, and differentiate them at
different levels of abstraction, ranging from macroscopic to microscopic levels.
The information-theoretic perspective provides a potentially effective frame-
work to address questions concerning analytical reasoning with uncertainty,
synthesizing evidence from multiple sources, and developing a macroscopic
understanding of a complex, large-scale, and diverse body of information.
Fig. 5.4 A network of keywords in the terrorism research literature (1995 – 1997).
High-frequency terms are shown in black, whereas outlier terms identified by infor-
mational bias are shown in dark red. Source: (Chen, 2008). (see color figure at the
end of this book)
The late sociologist Murray S. Davis developed some intriguing insights into
why we are interested in some (sociological) theories but not others (Davis,
1971a). Although his work focused on sociological theories, the insights are
broad-ranging. According to Davis, “the truth of a theory has very little to
do with its impact, for a theory can continue to be found interesting even
though its truth is disputed — even refuted!” A theory is interesting to the
audience because it denies some of their assumptions or beliefs to a certain extent.
But if a theory goes beyond certain points, it may have gone too far and the
audience will lose interest. People pay attention to a theory not really
because the theory is true and valid; rather, a theory gets people's
attention because it may change their beliefs.
Continued
Phenomenon             Dialectical Relations
Evaluation             Good ←→ Bad
Multiple Co-relation   Interdependent ←→ Independent
Co-existence           Co-exist ←→ Not co-exist
Co-variation           Positive ←→ Negative
Opposition             Similar ←→ Opposite
Causation              Independent ←→ Dependent
Proteus is a sea god in Greek Mythology. He could change his shape at will.
The Proteus phenomenon refers to early extreme contradictory estimates in
published research. Controversial results can be attractive to investigators
and editors. Ioannidis and Trikalinos (2005) tested an interesting hypothesis
that the most extreme, opposite results would appear very early as data accu-
mulate rather than late. They used meta-analyses of studies on genetic associ-
ations from MEDLINE and meta-analyses of randomized trials of health care
interventions from the Cochrane Library. They evaluated how the between-
study variance for studies on the same question changed over time and at
what point the studies with the most extreme results had been published.
The results show that the maximum between-study variance was more likely to
be found early, both in the 44 meta-analyses of genetic association studies and
in the 37 meta-analyses of health care interventions. The between-study variance
decreased over time in the genetic association studies, and the decrease was
statistically significant (Fig. 5.5). This 2005 study itself had attracted 330
citations by 2010.
Fig. 5.5 The swing of results decreased over time. Source: Figure 1 in (Ioannidis
& Trikalinos, 2005).
The nature of scientific change has been studied in the philosophy of sci-
ence (Collins, 1998; Laudan, Donovan, Laudan, Barker, Brown, Leplin, Tha-
gard, & Wykstra, 1986; Schaffner, 1992), sociology (Fuchs, 1993; Griffith &
Mullins, 1977), and history of science (Brannigan & Wanner, 1983). Quan-
titative studies of the topic can be found in the fields of scientometrics, ci-
tation analysis, and information science in general (Chen, 2003; Heinze &
Bauer, 2007; Heinze, Shapira, Senker, & Kuhlmann, 2007; Hummon & Dor-
eian, 1989; Small & Crane, 1979; Sullivan, Koester, White, & Kern, 1980;
Wagner-Dobler, 1999). Scientific literature has increasingly become one of the
most essential sources for these studies. Social network analysis and complex
network analysis also provide valuable perspectives (Barabási, Jeong, Néda,
Ravasz, Schubert, & Vicsek, 2002; Newman, 2001; Redner, 2004; Snijders,
2001; Valente, 1996; Wasserman & Faust, 1994).
It is evident that scientific discoveries share important and generic prop-
erties (Bradshaw, Langley, & Simon, 1983; Simon, Langley, & Bradshaw,
1981a). In order to obtain conclusive evidence, one will need a theory of scientific
discovery.
Mark Twain thought he could learn how to become a Mississippi river pilot by
studying charts and manuals. In fact, he discovered that it would take several
years of apprenticeship to become an experienced river pilot and countless
journeys over the same terrain to be able to “read” the meaning of currents
and water-levels of the ever-changing river in always-different circumstances
(Twain, 2001).
probabilities. Each node in the network represents a state. Moving from one
node to another is governed by a state transition probability, which can be
updated with available evidence using Bayes' rule. The spread of knowl-
edge is thus translated into a question of how easy or how hard it would be
to make such moves.
The ant colony and random walk models have a more profound connection
to information foraging theory (Pirolli, 2007). The fundamental premise
of information foraging theory is that the behavior of a forager (an information
searcher or, in this case, a scientist) is driven by a perceived
or calculated profitability of the potential move. The profitability takes into
account the expected returns as well as the potential risks or costs involved. For
example, if online access to an article costs $30, then the cost is only part of
the equation. Whether the article is worth paying $30 for depends on
what you can do with the article and how urgently you need it.
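The cost-benefit intuition can be made concrete with a toy calculation; the figures below are invented and serve only to show that the access fee and the handling time both enter the forager's estimate:

```python
# Toy estimate of the profitability of accessing one article (all values invented).
expected_value = 50.0    # what the reader expects to gain from the article
access_cost    = 30.0    # the online access fee
handling_hours = 2.0     # time needed to read and digest the article
value_per_hour = 20.0    # what an hour of the reader's time is worth

net_gain = expected_value - (access_cost + handling_hours * value_per_hour)
print(f"net expected gain: {net_gain}")   # a negative value suggests skipping this patch
```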
Sandstrom argued that information seekers are very much like foragers,
especially in terms of how and where they forage for valued resources (Sand-
strom, 1999). She introduced the notion of bibliographic microhabitats to
underline the similarity between hunters and information seekers. She fur-
ther argued that if some empirical cost-benefit currency can be established,
then analysts would be able to rank foragers’ preferences, predict which re-
sources will be pursued, and specify the net returns associated with particular
choices.
In summary, unlike epidemic models, foraging models emphasize not only
structural properties of an information space for information seekers or a
problem space for scientists, but also the interplay between perceived values,
handling costs, and various competing and probably conflicting factors in
a broader context of decision making. In other words, one may incorporate
foraging models into existing workflows so as to recognize and act
upon vital clues that may lead to a fruitful path.
Predicting the future impact of scientific work is of great interest to both researchers
and science policy makers. The predictive power of a diverse range of variables
has been tested in the literature. As shown in Table 5.2, most of the commonly
studied variables can be categorized into a few groups according to the parent
classes to which they belong. For example, the number of pages of a
paper is an attribute of the paper as an article. The number of authors of a
variables will be added to the list. We expect to demonstrate that our theory
of transformative discovery provides a theoretical framework to accommodate
this diverse set of attributive variables and provides a consistent explanation
for most of them.
Table 5.2 Variables associated with articles that may be predictive of their sub-
sequent citations.

Components    Attributive Variables                          Hypotheses derived from theory of discovery
Article       Number of pages                                Boundary spanning needs more text to describe.
              Number of years since publication
Authorship    Number of authors                              More authors are more likely to contribute from diverse perspectives.
              Reputation (citations, h-index)
              Gender
              Age
              Position of last name in alphabet
Impact        Citation counts                                The value of the work is recognized.
Usage         Download times                                 The value of the work is recognized.
Abstract      Number of words                                Transformative ideas tend to be more complex than simple ones. More words are needed to express more complex ideas.
              Structured (yes/no)
Content       Type of contributions: tools, reviews,
              methods, data, etc.
              Scientific rigor of study design
Reference     Number of references                           More references are needed to cover multiple topics that are being synthesized.
Discipline    Number of disciplines                          It is more likely that the work synthesizes multiple disciplines.
Country       Number of countries                            It is more likely that authors from different countries bring in distinct perspectives.
Institution   Number of institutions                         It is more likely that authors from different institutions bring in distinct perspectives.
Journal       Impact factor
              Indexed by different databases
              Sponsored (yes/no)
author in 1990, the first year of the period. If an article has multiple authors,
the most prominent author’s reputation was used. The reputation of a journal
was represented by its impact factor in 1990. They used duration analysis,
which originated in survival analysis, to analyze the data. The central question is:
what determines the probability of an article changing from the initial state
of not being cited to a state in which it is cited? In survival analysis, the
role of a hazard function is to estimate the probability of transitions from
the initial state. The simplest form of a hazard function is constant with no
memory of how long the initial state lasts. In other words, the probability
of an article moving away from the initial state in the next time frame does
not depend on how much time it has spent in the initial state. More
realistic hazard functions include positive and negative duration dependence.
Positive duration dependence specifies that the longer an article has been in
the initial state, i.e. not being cited, the better the chance it will be cited.
In contrast, negative duration dependence assumes the opposite. Dalen and
Henkens chose their hazard function based on the Gompertz distribution.
The Gompertz distribution is a theoretical distribution of survival times.
It was proposed by Gompertz in 1825 to model human mortality. The resul-
tant hazard function is defined as follows:

y(t) = a\,e^{b\,e^{ct}}

where a is the upper asymptote, i.e., the value of y(t) as t approaches infinity,
b is the displacement along the time axis, c is the growth rate, and e is Euler's
number. The Gompertz function models slow growth at the initial and
final stages and faster growth in the intermediate stages. It has been used to
model the growth of tumors, the uptake of mobile phones, and the mortality
of populations.
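A small sketch of the curve, with parameter values chosen only to show the characteristic slow-fast-slow shape (NumPy assumed):

```python
import numpy as np

def gompertz(t, a, b, c):
    """Gompertz curve y(t) = a * exp(b * exp(c * t)); with b < 0 and c < 0 the curve
    starts near zero, accelerates, and levels off at the asymptote a."""
    return a * np.exp(b * np.exp(c * t))

t = np.arange(0, 11)
print(gompertz(t, a=100.0, b=-5.0, c=-0.5).round(1))   # slow start, fast middle, saturation
```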
Dalen and Henkens divided articles into four categories and then used a
statistical method called multinomial logit to test how explanatory factors
such as authors’ and journals’ reputations could explain the citation patterns.
1) Articles with too few and/or too late citations (the forgotten ones).
2) Articles with late citations (sleeping beauties).
3) Articles with early citations that fade quickly (flash-in-the-pans).
4) Articles with early citations and many subsequent cites (normal science).
For example, the probability that an article turns out to be a sleeping beauty is modeled as

\mathrm{Prob}(\text{Article} = \text{sleeping beauty}) = \frac{\exp(X\beta^{(2)})}{1 + \exp(X\beta^{(2)}) + \exp(X\beta^{(3)}) + \exp(X\beta^{(4)})}
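Treating the forgotten category as the baseline, the four probabilities can be evaluated directly; the linear predictors Xβ below are invented for illustration:

```python
import numpy as np

def category_probabilities(xb2, xb3, xb4):
    """Multinomial logit with 'forgotten' articles as the baseline category.
    Returns P(forgotten), P(sleeping beauty), P(flash-in-the-pan), P(normal science)."""
    numerators = np.array([1.0, np.exp(xb2), np.exp(xb3), np.exp(xb4)])
    return numerators / numerators.sum()

# Invented linear predictors X*beta for one article
print(category_probabilities(xb2=-1.2, xb3=0.3, xb4=1.5).round(3))
```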
Their model shows statistically significant effects of several explanatory
variables such as author reputation, the number of pages, and journal repu-
tation (impact factor) at p < 0.01.
The survival model of the timing of the first citation identified the major
role of the communication process in speeding up the uptake of a scientific
paper, namely visibility, language, and the reputation of authors and journals.
When the effect of a journal’s quality, such as the reputation of its editors
and its editorial policy, is controlled for, the duration analysis reveals the
reputation effect of authors; the effect of journals also becomes clear.
Dalen and Henkens’ duration study essentially tells us that the reputation
of the authors of an article and the reputation of the journal in which the
article is published are the most critical factors for the article to gain visibility
and get cited. Are we attracted by other signals? What about structural,
temporal, and semantic properties of the underlying topic?
To what extent do quantitative rankings of highly cited authors confirm
or, even more ambitiously, predict Nobel Prize awards? Between
1977 and 1992, Eugene Garfield published a series of studies of Nobel Prize
winners’ publications and their citations and made predictions of future Nobel
Prize laureates based on existing citation data.
He reported that eight Nobel laureates were found on a list of the 100 most
cited authors from 1981 through 1990 (Garfield & Welljamsdorof, 1992). Oth-
ers on the list were seen as potential Nobel Prize winners in the future. On
the other hand, it was noted that the undifferentiated rankings of the most
cited authors in a given period of time could be further fine-tuned to increase
the accuracy of its coverage of Nobel Prize awards. For example, the Nobel
Committee sometimes selects relatively small specialties. Further dividing
the list according to specialties shows that Nobel laureates in relatively small
specialties are among the most cited authors in their specialties.
Methods papers of Nobel Prize winners tend to attract a disproportionately
high number of citations. More recent examples of methodological con-
tributions include the 2007 Nobel Prize for the British embryonic stem cell
research architect Martin Evans. Garfield called this the Lowry
Phenomenon, referring to the classic example of Oliver Lowry's 1951 methods
paper, which had been cited 205,000 times by 1990.
Research has shown that citation frequency has low predictive power for
Nobel awards because many other scientists have citation counts as high as or
higher than those of the few Nobel Prize winners. The greatest value of count-
ing citations is its simplicity. Subsequent attempts to improve the accuracy
of the method tend to lose the simplicity. Hirsch’s h-index has drawn much
interest also because of its simplicity despite its known limitations (Hirsch,
2005a). Antonakis and Lalive intended to capture both the quality and pro-
ductivity of a scholar with a new index, IQp (Antonakis & Lalive, 2008). They
computed the new index for Nobel winners in physics, chemistry, medicine,
and economics. It is worth noting here that one should always be cautious
when using quantitative indicators in qualitative decisions. The authors found
that about two thirds of Nobel winners have an IQp over 60. They also showed
that in several examples IQp differentiated the Nobel class from others more accu-
rately than the h-index did, including physicist Ed Witten (h=115, IQp=230)
and others who have a high h-index but a relatively low IQp, such as S. H. Snyder
(h=198, IQp=117) and R. C. Gallo (h=155, IQp=75).
In the context of scientific discovery, we will expand the information for-
aging theory to describe the behavior of scientists in searching for novel hy-
potheses and theories. This will help us to explain how a scientist would
maximize the profitability of the next move.
The most prominent work in this area has been done by Herbert Simon and
his colleagues using computer simulation to study and reconstruct scientific
discoveries (Bradshaw et al., 1983). A long list of examples of automated
discoveries was given in (Glymour, 2004). He used the metaphor of finding
a needle in a haystack to characterize the problems faced by scientists in
discovery. Rather than sifting through things in the haystack one by one,
automated discovery is akin to either setting the haystack on fire and blowing
away the ashes to find the needle, or running a magnet through the haystack.
There are advantages and limitations. Following the metaphor, for example,
the fire may melt the needle.
Many studies have addressed the nature of insight in scientific discovery.
For example, Gestalt psychologists suggest that insight occurs when prob-
lem solvers see the original problem from a fresh perspective (Mayer, 1995).
Other researchers have emphasized that the complexity of searching in a
problem space has more to do with the structure of a problem space than the
searcher (Perkins, 1995; Simon, 1981). In particular, Perkins distinguished
two types of problem spaces. In a Homing Space, there are many clues and
signposts such that navigating in such spaces is relatively easy. In contrast, a
Klondike Space has very few such clues. The sparseness of clues is illustrated
by Perkins (p. 498) in a widely known case of sudden insight — Charles Dar-
win’s discovery of the principle of natural selection. According to Darwin’s
autobiography, in October 1838, he conceived the principle while he “hap-
pened to read for amusement ‘Malthus on Population.’ What is remarkable
is that the next person arrived at the same principle 20 years later. What is
even more remarkable is that the person, Alfred Russell Wallace, came up
with the same idea while reading the same 1826 book by Malthus!
How could one increase the odds of stumbling on such clues? It becomes
clear, from Sandstrom’s notion of bibliographic microhabitats to Perkins’
characterizations of Homing and Klondike spaces, that finding and recog-
nizing clues is essential for both information foragers and problem solvers.
Research in the data mining community on interestingness is particularly
relevant (Hilderman & Hamilton, 2001; Liqiang & Howard, 2006). Interest-
ingness is a quantitative measure of where a set of scientific ideas fits on
the spectrum that ranges from the practice of normal science to that of
paradigm-shifting ideas (Davis, 1971b). In this regard, interestingness lies be-
tween order and complete randomness, or chaos. We posit three distinct
ranges of scientific reports and ideas: 1) those that are confirmatory or boring,
offering nothing new to the scientific reader; the previously stated hypotheses
have not been falsified yet, and are less and less likely to be; 2) those that are
interesting, which may deny widely accepted assumptions, state new relationships
between old ideas, or propose new mechanisms, but do not require the reader
to adopt wholly new ways of thinking; and 3) paradigm shifts and transformative
discoveries. Interesting ideas are enlightening and surprising in a non-threatening
way; in fact, the surprise is generally a pleasant one, in contrast to the experience
of living through a paradigm shift, especially when one's accepted paradigm is
being replaced by a more successful one.
water purification. He described the lessons learned from each application, and
how the techniques can be improved further (Kostoff, 2008).
Where can we go from here? How often could a Nobel Prize award be
characterized in terms of this A-B-C pattern of transitivity? Are there other
patterns of scientific discoveries? If literature-based discovery is a computer-
aided search in a problem space, what would it miss?
Effective strategies for making scientific discoveries have highlighted the abil-
ity to think creatively and look at a problem from a fresh perspective. Dun-
bar, for example, compared two different strategies of hypothesis generation
using a Nobel Prize winning discovery as the test case (Dunbar, 1993). He
found that it is a more effective discovery strategy to encourage researchers
to consider novel alternative hypotheses. A 1992 special issue of Theoreti-
cal Medicine examined the mechanisms of scientific revolution and how the
Nobel Prize committee selected scientific discoveries (Lindahal, 1992).
A longitudinal study of highly creative scientists in nano science and tech-
nology has found that it is not only the sheer quantity of publications that
enables scientists to produce creative work but also their ability to effectively
communicate with otherwise disconnected peers and to address a broader
work spectrum (Heinze & Bauer, 2007). Why is it possible that communicat-
ing with otherwise disconnected scientists can lead to more creative work?
What can one do specifically to come up with novel alternative hypotheses?
How do we think outside the box?
There are many philosophical theories of scientific change. Philosophers
of science (Laudan et al., 1986) argue that it would be useful to compare rival
theories of scientific change against the history of science. Proponents suggest
that conjectures of philosophical theories should be organized into theses so
that one can compare these theories in terms of individual theses. Laudan et
al. recommended rephrasing Lakatos’ research programme, Laudan’s research
tradition, and Kuhn’s paradigm in terms of a more generic notion of guiding
assumptions. A superior theory of scientific change would be the one that has
the most matches with the historical data. This idea was later criticized by
Radder (1997) as being far too ambitious.
Our needs here are different. Our goal is not to evaluate the value of
individual philosophical theories of scientific change. Rather, what we need is
an explanatory theory that can clarify the underlying mechanisms of specific
scientific discoveries. In addition, we need a theory that can be instrumental
for quantitative studies of scientific change.
Kuhn’s paradigm-shift model of scientific revolutions (Kuhn, 1962, 1970)
is probably the most widely known theory. It describes how science advances
through a path of normal science, crisis, revolution, and new normal science.
On the other hand, it differs from Swanson’s famous A-B-C model. Instead of
searching for a transitive closure of A→C, given A→B and B→C, we focus on
the brokerage mechanism of discovery, which aims to establish an innovative
connection between A and C. Another important difference is that we utilize
structural properties of a network, whereas such properties are not used in
Swanson’s approach and its variations.
Third, our theory is related to network evolution models in complex net-
work analysis. Preferential attachment models, for example, characterize the
growth of a network as a process in which popular nodes become even more
popular as new nodes and links are added to the network (Albert & Barabási,
2002; Barabási et al., 2002). The popularity of a node can be broadly defined
by an attribute function of node, such as prestige, age, or by other rank-
ing mechanisms. Such processes often result in scale-free networks, which are
characterized by power law distributions of node degrees. While earlier pref-
erential attachment models assume that each newly arriving node is fully aware
of the prestigious status of every existing node, more recent studies have re-
laxed the assumption to ranking functions defined on a subset of the existing
nodes instead (Fortunato, Flammini, & Menczer, 2006). In contrast, the bro-
kerage mechanism in our theory provides a growth mechanism by building
connections across structural holes between two or more thematic networks.
A brokerage-driven growth is distinct from growths that can be modeled by
preferential attachment.
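For contrast with brokerage-driven growth, a preferential attachment network can be generated in a few lines; the sketch below assumes the networkx package and uses invented parameters, and the heavy-tailed degree sequence it prints is the signature of that growth process:

```python
import networkx as nx

# Preferential attachment growth: each new node attaches to m existing nodes with
# probability proportional to their current degree (Barabasi-Albert model).
G = nx.barabasi_albert_graph(n=1000, m=2, seed=42)
degrees = sorted((d for _, d in G.degree()), reverse=True)
print(degrees[:10])   # a handful of highly connected hubs, typical of scale-free networks
```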
Fourth, our theory extends earlier efforts for predicting Nobel Prize win-
ners based on citation ranking (Garfield, 1992). Thomson Reuters’ Citation
Laureates1 are also in this category. Our approach is distinct in several im-
portant ways. Although using citation ranking alone has the advantage of
simplicity, we take multiple factors such as structural holes and the rate of
citation growth into account in order to better accommodate the complexity.
In addition, we are concerned with the possibly delayed identification due to
the time taken for the citation profile of a scholarly publication to become
prominent enough to be noticed. We expect that using structural properties
in the theory can resolve the issue to some extent.
Fifth, our theory provides an explanatory mechanism for the diffusion pro-
cess associated with a transformative discovery. Once a brokerage connection
is established between previously disparate areas, it would facilitate the in-
formation flow between these areas. In other words, we expect that the newly
discovered connection will accelerate the diffusion process. Interestingly, the
expected effect on diffusion can be explained in terms of the information
foraging theory (Pirolli, 2007). According to the information foraging the-
ory, searchers need to evaluate multiple patches of information. They need to
make decisions on which patch they should focus and how long they should
spend on a patch before they move on. Their decisions are essentially deter-
mined by the perceived profitability of each move. The higher the perceived
1 http://scientific.thomsonreuters.com/nobel/nominees/
profitability, the more likely they will decide to go ahead and take the ac-
tion. The newly discovered connection will increase the perceived profitability
because the discovery not only reduces the risk, but also provides concrete
and positive examples of success. Therefore, we could conjecture that the
increased perceived profitability will be translated into bursts of observed
frequencies such as citation and co-citation counts.
Finally, the theory is related to but distinct from the notion of co-citation
pathways through science (Small, 2000). The creation of co-citation path-
ways aims to traverse scientific literature through a chain of highly co-cited
pairs of papers. Small found a co-citation pathway of 331 highly cited doc-
uments starting from economics and ending in astrophysics (Small, 1999).
He observed that each successive document in this path embodies an in-
formation transition towards the destination topic and, in most cases, such
transitions are surprisingly smooth. In contrast, the focus of our theory is on
novel connections that bridge previously disparate fields. Although in theory
such connections may appear as part of a co-citation pathway, it seems to be
more likely that brokerage connections would either deviate from pathways
of highly co-cited documents or not be selected at all because of a high
co-citation threshold. Nevertheless, more investigations are needed to clarify
the relationships in detail.
Would the absence of such structural and temporal properties rule out
the possibility of a transformative discovery? This issue is concerned with
the scope of the theory. Further investigations are needed. In the following
section, we present some examples to further clarify the major properties
derived from the theory.
5.4.3 Integration
For a reference v in a network G over a time interval T, the σ indices are defined
as geometric means of its saliency measures ρ_i:

\sigma_1(v, G, T, \rho_{burst}) = \Big(\prod_{i \in \{burst\}} \rho_i\Big)^{1/1} = \rho_{burst}   (2)

\sigma_1(v, G, T, \rho_{centrality}) = \Big(\prod_{i \in \{centrality\}} \rho_i\Big)^{1/1} = \rho_{centrality}   (3)

\sigma_1(v, G, T, \rho_{citation}) = \Big(\prod_{i \in \{citation\}} \rho_i\Big)^{1/1} = \rho_{citation}   (4)

\sigma_2(v, G, T, \rho_{burst}, \rho_{centrality}) = \Big(\prod_{i \in \{burst,\,centrality\}} \rho_i\Big)^{1/2} = \sqrt{\rho_{burst} \cdot \rho_{centrality}}   (5)

\sigma_3(v, G, T, \rho_{burst}, \rho_{centrality}, \rho_{citation}) = \Big(\prod_{i \in \{burst,\,centrality,\,citation\}} \rho_i\Big)^{1/3} = \sqrt[3]{\rho_{burst} \cdot \rho_{centrality} \cdot \rho_{citation}}   (6)
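Read literally, Equations (2) through (6) are geometric means of the chosen saliency measures; a minimal sketch in Python, using the burst and centrality values of Marshall-1988 reported in Table 5.3:

```python
import math

def sigma(*rhos):
    """Geometric mean of k saliency measures, e.g. sigma(rho_burst, rho_centrality)."""
    return math.prod(rhos) ** (1.0 / len(rhos))

# Marshall-1988 in Table 5.3: rho_burst = 0.607, rho_centrality = 0.286
print(round(sigma(0.607, 0.286), 3))   # about 0.417, in line with the sigma_2 value of 0.416
```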
Note that σ1 (ρcitation ), a special case of the generic definition, ranks the
significance of a reference based on its citations as seen in earlier efforts for
predicting Nobel Prize winners based on citation counts (Garfield, 1992). We
will also compare pair-wise Pearson correlation coefficients between the σ1, σ2,
and σ3 indices of centrality, burst, and citation frequency in order to identify
the simplest and most effective metrics among them.
In summary, our theory suggests that the σ indices would be good indica-
tors of potential transformative discoveries. Furthermore, once a reference is
identified with a high σ index, the theory provides an explanatory framework
such that we can focus on the precise brokerage connections at work. The
theory also suggests alternative ways to model the evolution of a network by
taking brokerage connections into account. According to our theory, a subset
of Nobel Prize discoveries will be transformative discoveries. More transfor-
mative discoveries would be expected from the recipients of a variety of other
awards in science. In addition, we expect that transformative discoveries can
be identified by these σ metrics at an earlier stage than by single-dimensional
ranking systems. In terms of diffusion, we expect that transformative discov-
eries in general will lead to a more rapid and sustained diffusion process. If
we see the diffusion process as an information foraging process by the scien-
tific community as a whole, transformative discoveries, i.e., brokerage con-
nections across structural holes, would have a higher perceived profitability,
which would motivate and stimulate the diffusion process. It also follows that
the domain-wide foraging process will spend more time with transformative
discoveries than with other patches of scientific knowledge.
In each case study, CiteSpace (Chen, 2006) was used to construct a co-citation
network of the references relevant to the chosen topic. We followed the gen-
eral procedure described in (Chen, 2004; Chen, 2006). Bibliographic records
were retrieved from the Web of Science with a topical search for articles
only. Reviews, editorials, and other document types were excluded from the
analysis.
CiteSpace uses a time-slicing mechanism to generate a synthesized
panoramic network visualization based on a series of snapshots of the evolving
network across consecutive time slices2 . Each node in the network represents
a reference cited by records in the retrieved dataset. A line connecting two
2 http://cluster.cis.drexel.edu/~cchen/citespace/
nodes represents one or more co-citation instances involving the two refer-
ences. Colors of co-citation links correspond to the earliest year in which
co-citation associations were first made. Each node is shown with a tree-
ring of citation history in the same color scheme, representing the history of
citations received by the underlying reference.
Structural-hole and burst properties are depicted in two distinct colors —
purple and red — in visualizations. If a node is rendered with a purple ring,
it means it has a strong betweenness centrality. The purple color can only
appear as the color of the outermost rim of a node. The thickness of the purple
ring is proportional to the degree of the centrality: the thicker, the stronger
the betweenness centrality. In contrast, if a node has red rings, these red
rings represent the presence and strength of its burst property. It can appear
as the color of any inner rings of the tree ring of a node. The presence of
one or more red rings on a node indicates that a significant citation burst was
detected. In other words, there was a period of time in which citations to
the reference increased sharply with respect to other references in the pool,
hence the name CiteSpace.
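Betweenness centrality itself can be computed with standard tools; the sketch below (networkx assumed, toy network invented) picks out the node that brokers between two clusters, the role that the purple rings highlight:

```python
import networkx as nx

# A toy co-citation network: two tightly knit clusters joined through a single broker "X".
G = nx.Graph()
G.add_edges_from([
    ("A", "B"), ("A", "C"), ("B", "C"),   # cluster 1
    ("D", "E"), ("D", "F"), ("E", "F"),   # cluster 2
    ("C", "X"), ("X", "D"),               # "X" spans the structural hole between them
])
centrality = nx.betweenness_centrality(G, normalized=True)
print(max(centrality, key=centrality.get))   # "X", the candidate brokerage reference
```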
The captions below network snapshots record the time interval, the num-
ber of nodes, the number of co-citation links, and three thresholds. For exam-
ple, the caption “1981 – 1985. N=210, E=2038. 3,3,20” under the first snap-
shot of the network means that the network was formed between 1981 and
1985, consisting of 210 references and 2,038 co-citation pairs. Each reference
has received at least 3 citations in one of the 5 years during this period.
According to independent sources (Pincock, 2005), the first major publi-
cation of the Helicobacter pylori discovery was (Marshall & Warren, 1984).
Marshall-1984 appeared in the 1986 – 1990 network with essentially cyan and
green citation rings, which means it received its citations mostly in 1987 and
1988. It is quite possible that Marshall-1984 was cited as soon as it was pub-
lished in the 1981 – 1985 time interval, but it did not reach the top of the
most cited list until the 1986 – 1990 network. The six snapshots also demon-
strate that peptic ulcer research has evolved constantly with new references
reaching the top cited levels.
Fig. 5.6 A co-citation network of references on peptic ulcer research (1980 – 1990).
Source: (Chen, Chen, Horowitz, Hou, Liu, & Pellegrino, 2009). (see color figure at
the end of this book)
Fig. 5.7 shows a panorama view of the entire time interval of the dataset
(1980 – 2007). Marshall-1984 has a prominent structural property — a high
betweenness centrality (a large purple ring). Although it does demonstrate
the temporal property of burstness, its burst rate is detectable but not as
strong as that of some of its neighbors. The burst period was between 1986 and
1988, which is consistent with our observations in the earlier 5-year snapshot
series. The overview network shows that Marshall-1984 is in a dense cluster
with numerous references with citation bursts, suggesting other high-impact
references were present in the landscape of peptic ulcer research.
Fig. 5.7 A co-citation network of references cited between 1981 and 2007 in
peptic ulcer research. Source: (Chen et al., 2009). (see color figure at the end of
this book)
As shown in Table 5.3, Marshall-1984 was the most cited reference (711
citations) and had the highest betweenness centrality (ρcentrality = 0.393). On
the other hand, its burst rate ranked 372nd. Marshall and Warren en-
countered resistance in getting their discovery accepted by the peptic ulcer
research community. The slow acceptance was documented (Pincock, 2005),
which may in part explain the relatively low burst rate. In contrast, Marshall-
1988 has the highest σ2 of 0.416. It was entitled Prospective double-blind trial
of duodenal ulcer relapse after eradication of Campylobacter pylori. In his
Nobel Prize lecture, Marshall dated the acceptance of his work to the 1994
NIH consensus conference in Washington DC.
The last column in Table 5.3 contains the σ2 index, i.e., the geometric
mean of the burst and centrality metrics. According to our theory, a trans-
formative discovery is a brokerage between previously disconnected areas of
scientific knowledge. The σ2 index takes into account both structural and
temporal properties that a discovery over a structural hole would demon-
strate. In this case, Marshall-1988 was the highest ranking candidate accord-
ing to the σ2 index, even though its citation count of 421 was much lower than
that of Marshall-1984. Validating the true value of Marshall-1988 is beyond our own
expertise and beyond the scope of the analysis. Properly validating the value
of references with such strong combinations of structural and temporal prop-
erties will be an important issue to address in future work on the theory.
It is also related to the potential power of predicting high-impact discoveries
even before they reach their citation peaks or while they are overshadowed by
other highly cited references.
Table 5.3 Top 5 most cited references in peptic ulcer research (1980 – 2007).
Citation  Author        Year  Source          Vol.  Page  ρburst  ρcentrality  σ2
711       MARSHALL BJ   1984  LANCET          1     1311  0.138   0.393        0.232
581       PARSONNET J   1991  NEW ENGL J MED  325   1127  0.208   0.143        0.172
579       WARREN JR     1983  LANCET          1     1273  0.165   0.250        0.203
466       YAMADA T      1994  JAMA            272   65    0.635   0.071        0.213
421       MARSHALL BJ   1988  LANCET          2     1437  0.607   0.286        0.416
Fig. 5.8 A co-citation network of references cited between 1985 and 2007 in gene
targeting research. References with the strongest betweenness centrality scores are
labeled. The burst periods of their citations are shown as the thickened curves in
the three diagrams to the left. Source: (Chen et al., 2009). (see color figure at the
end of this book)
and ρburst . The 1st, 3rd, and 4th references are connected to the Nobel Prize
winning discoveries. Note that the first discovery paper Thomas-1987 has
the highest ranking although its citation count of 268 is not the highest. The
2nd reference is a book. If we consider journal articles only, the first three
references would all be related to the Nobel discoveries (see Fig. 5.9).
Fig. 5.9 Nobel Prize winning papers are ranked among the highest by the σ2
index. Source: (Chen et al., 2009).
Fig. 5.10 is a visualization of the areas associated with the Nobel Prize
winning discoveries in gene targeting research. The visualization was gener-
ated based on citing articles with 15 or more citations in the Web of Science.
In other words, these citing articles themselves have made impacts on the
field in their own right. Co-cited references are aggregated into clusters. The
diffusion of knowledge is tracked by showing how co-citation footprints move
from one cluster to another over time and how long they stay in particular
clusters. The history of the evolution can be seen as an information forag-
ing process participated in by all the scientists in the field. For example, the
embryo-derived stem cell (cluster #11) attracted a lot of citations in 1987
(shown as a high density co-citation cluster in red). In 1988, the foraging
process moved to DNA delivery method (cluster #19) above cluster #11.
All three papers associated with the 2007 Nobel Prize are concentrated in
cluster #12 — gene correction. During 1989 and 1990, much of the foraging
process was inside cluster #12. We also studied the diffusion process over
a longer period of time, and the foraging process appeared to spend much
longer with cluster #12 than with any other cluster. Our general hypothesis
is that transformative discoveries tend to retain the foraging process longer
than other patches of knowledge. Further investigations are needed; in particular,
the connection between structural-hole theory and information foraging theory is
an important direction for future research.
Fig. 5.10 A diffusion map of gene targeting research between 1985 and 2007.
Selection criteria are at least 15 citations for citing articles and top 30 cited
articles per time slice. Polygons represent clusters of co-cited papers. Each cluster
is labeled by title phrases selected from papers citing the cluster. Red lines depict
co-citations made in the current year. The concentrations of red lines track the
context in which co-citation clusters are referenced. Source: (Chen et al., 2009).
(see color figure at the end of this book)
Fig. 5.11 A co-citation network of references cited between 1990 and 2003 in
string theory. Polchinski-1995 marked the beginning of the second string theory
revolution. Maldacena-1998 is a highly transformative brokerage link between
string theory and particle theories. The three embedded plots show the burst
periods of citations of Witten-1991, Maldacena-1998, and Polchinski-1995. Source:
(Chen et al., 2009). (see color figure at the end of this book)
smallest scales; and Einstein’s general theory of relativity, which deals with
the very largest.” In addition, our search on the web reveals that he is the
recipient of the 2007 Dannie Heineman Prize for Mathematical Physics4 “for
profound developments in Mathematical Physics that have illuminated in-
terconnections and launched major research areas in Quantum Field Theory,
String Theory, and Gravity.”
Table 5.5 shows pair-wise Pearson correlation coefficients between normal-
ized burst and centrality scores, the σ2 index of burst and centrality, and the
σ3 index of burst, centrality, and citation frequency. The σ2 and σ3 indices
are strongly correlated (r = 0.9780), suggesting that, at least in this case,
the σ3 index is redundant and we can simply focus on σ2 . The correlation
coefficients also show that burstness and centrality are almost independent
measures, although they both have some connections to citation counts. This
is a simple justification of our choice to use both burstness and centrality to
construct σ2 as an index of high-impact discoveries. More comprehensive val-
idations may consider other measures such as the h-index and its numerous
variations, e.g. (Antonakis & Lalive, 2008; Hirsch, 2005b).
Table 5.5 Pearson correlation coefficients between individual and synthetic in-
dices.
                                      ρburst    ρcentrality    σ2(ρburst, ρcentrality)
ρcitation                             0.8026    0.3618
ρburst                                          0.0409
σ3(ρburst, ρcentrality, ρcitation)                             0.9780
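The coefficients in Table 5.5 were computed from the case-study data; purely to illustrate the mechanics, the sketch below computes the same kind of correlation matrix on invented scores (Python with NumPy assumed):

```python
import numpy as np

# Invented per-reference scores; the correlation structure is for illustration only.
rng = np.random.default_rng(0)
burst      = rng.random(50)
centrality = rng.random(50)
citations  = 0.8 * burst + 0.2 * rng.random(50)       # citations loosely tied to bursts
sigma2 = np.sqrt(burst * centrality)
sigma3 = (burst * centrality * citations) ** (1 / 3)

scores = np.vstack([burst, centrality, citations, sigma2, sigma3])
print(np.corrcoef(scores).round(3))   # 5 x 5 Pearson correlation matrix
```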
5.5 Summary
References
Adar, E., Zhang, L., Adamic, L.A., & Lukose, R.M. (2004). Implicit structure and
the dynamics of blogspace. In Proceedings of the workshop on the weblogging
ecosystem at 13th international world wide web conference.
Albert, R., & Barabási, A.L. (2002). Statistical mechanics of complex networks. Re-
views of Modern Physics, 74(1), 47-97.
Anderson, T., Schum, D., & Twining, W. (2005). Analysis of evidence. (2nd ed.).
Cambridge, England: Cambridge University Press.
Antonakis, J., & Lalive, R. (2008). Quantifying scholarly impact: IQp versus the
Hirsch h. Journal of the American Society for Information Science and Technol-
ogy, 59(6), 956-969.
Barabási, A.L., Jeong, H., Néda, Z., Ravasz, E., Schubert, A., & Vicsek, T. (2002).
Evolution of the social network of scientific collaborations. Physica A, 311, 590-
614.
Bartels, L. (1988). Issue voting under uncertainty: An empirical test. American
Journal of Political Science, 30, 709-728.
Bederson, B.B., & Shneiderman, B. (2003). Theories for understanding information
visualization. In the craft of information visualization: Readings and reflections
(pp. 349-351): Morgan Kaufmann.
Bettencourt, L.M.A., Castillo-Chavez, C., Kaiser, D., & Wojick, D.E. (2006). Re-
port for the office of scientific and technical information: Population modeling
of the emergence and development of scientific fields.
Bettencourt, L.M.A., Kaiser, D.I., Kaur, J., Castillo-Chavez, C., & Wojick, D.E.
(2008). Population modeling of the emergence and development of scientific
fields. Scientometrics, 75(3), 495-518.
Bradshaw, G.F., Langley, P.W., & Simon, H.A. (1983). Studying scientific discovery
by computer simulation. Science, 222(4627), 971-975.
Brandes, U. (2001). A faster algorithm for betweenness centrality. Journal of Math-
ematical Sociology, 25(2), 163-177.
Brannigan, A., & Wanner, R.A. (1983). Historical distributions of multiple discov-
eries and theories of scientific change. Social Studies of Science, 13, 417-435.
Brody, T., Harnad, S., & Carr, L. (2006). Earlier web usage statistics as predictors
of later citation impact. Journal of the American Society for Information
Science and Technology, 57(8), 1060-1072.
Brush, S.G. (1994). Dynamics of theory change: The role of predictions. In Proceed-
ings of the 1994 Biennial Meeting of the Philosophy of Science Association
(pp. 133-145). East Lansing, MI.
Brush, S.G. (1995). Prediction and theory evaluation in physics and astronomy. In
A.J. Kox & D.M. Siegel (Eds.), No Truth Except in the Details (pp. 299-318).
Dordrecht: Kluwer Academic Publishers.
Burt, R.S. (1992). Structural holes: The social structure of competition. Cambridge,
Massachusetts: Harvard University Press.
Burt, R.S. (2001). The social capital of structural holes. In N.F. Guillen, R. Collins,
P. England & M. Meyer (Eds.), New directions in economic sociology. New York:
Russell Sage Foundation.
Burt, R.S. (2004). Structural holes and good ideas. American Journal of Sociology,
110(2), 349-399.
Burt, R.S. (2005). Brokerage and closure: An introduction to social capital. Oxford,
UK: Oxford University Press.
Cahn, R.W. (1970). Case histories of innovations. Nature, 225, 693-695.
Chen, C. (2003). Mapping scientific frontiers: The quest for knowledge visualiza-
tion. London: Springer.
Chen, C. (2004). Searching for intellectual turning points: Progressive knowledge
domain visualization. Proc. Natl. Acad. Sci. USA, 101(suppl), 5303-5310.
Chen, C. (2006). CiteSpace II: Detecting and visualizing emerging trends and tran-
sient patterns in scientific literature. Journal of the American Society for Infor-
mation Science and Technology, 57(3), 359-377.
Chen, C. (2008). An information-theoretic view of visual analytics. IEEE Computer
Graphics & Applications, 28(1), 18-23.
Chen, C., Chen, Y., Horowitz, M., Hou, H., Liu, Z., & Pellegrino, D. (2009). To-
wards an explanatory and computational theory of scientific discovery. Journal
of Informetrics, 3(3), 191-209.
Chen, C., & Kuljis, J. (2003). The rising landscape: A visual exploration of super-
string revolutions in physics. Journal of the American Society for Information
Science and Technology, 54(5), 435-446.
Chubin, D.E. (1976). The conceptualization of scientific specialties. The Sociological
Quarterly, 17(4), 448-476.
Collins, R. (1998). The sociology of philosophies: A global theory of intellectual
change. Cambridge, MA: Harvard University Press.
Crane, D. (1972). Invisible colleges: diffusion of knowledge in scientific communities.
Chicago, Illinois: University of Chicago Press.
Dalen, H.P.v., & Henkens, K. (2005). Signals in science — on the importance of
signaling in gaining attention in science. Scientometrics, 64(2), 209-233.
Davis, M.S. (1971a). That’s interesting! Towards a phenomenology of sociology and
a sociology of phenomenology. Philosophy of the Social Sciences, 1(2), 309-344
Davis, M.S. (1971b). That’s interesting! Towards a phenomenology of sociology
and a sociology of phenomenology. Phil. Soc. Sci., 1, 309-344.
Dorigo, M., & Gambardella, L.M. (1997). Ant colony system: A cooperative learn-
ing approach to the traveling salesman problem. IEEE Transactions on Evolu-
tionary Computation, 1(1), 53-66.
Dunbar, K. (1993). Concept discovery in a scientific domain. Cognitive Science, 17,
397-434.
Evans, M., & Kaufman, M. (1981). Establishment in culture of pluripotential cells
from mouse embryos. Nature, 292(5819), 154-156.
Fleming, L., Mingo, S., & Chen, D. (2007). Collaborative brokerage, generative
creativity, and creative success. Administrative Science Quarterly, 52, 443-475.
Fortunato, S., Flammini, A., & Menczer, F. (2006). Scale-free network growth by
ranking. Phys. Rev. Lett., 96, 218701.
Freeman, L.C. (1977). A set of measures of centrality based on betweenness. Sociom-
etry, 40, 35-41.
Fuchs, S. (1993). A sociological theory of scientific change. Social Forces, 71(4),
933-953.
Garfield, E. (1992). Of Nobel class: Part 2. Forecasting Nobel Prizes using citation
data and the odds against it. Current Contents, 35, 3-12.
Garfield, E., & Welljamsdorof, A. (1992). Of Nobel class — a citation perspective
on high-impact research authors. Theoretical Medicine, 13(2), 117-135.
Gill, J. (2005). An entropy measure of uncertainty in vote choice. Electorial Studies,
24, 371-392.
Girvan, M., & Newman, M.E.J. (2002). Community structure in social and biological networks. Proc. Natl. Acad. Sci. USA, 99(12), 7821-7826.
Kullback, S., & Leibler, R.A. (1951). On information and sufficiency. Annals of
Mathematical Statistics, 22, 79-86.
Kumar, R., Novak, J., Raghavan, P., & Tomkins, A. (2003). On the bursty evolution
of blogspace. In proceedings of the WWW2003(pp. 477). Budapest, Hungary.
Kumar, R., Novak, J., Raghavan, P., & Tomkins, A. (2004). Structure and evolution
of blogspace. Communications of the ACM, 47(12), 35-39.
Laudan, L., Donovan, A., Laudan, R., Barker, P., Brown, H., Leplin, J., et al.
(1986). Scientific change — philosophical models and historical research. Syn-
these, 69(2), 141-223.
Lazarsfeld, P.F., Berelson, B., & Gaudet, H. (1944). The people’s choice: How
the voter makes up his mind in a presidential campaign. New York: Columbia
University Press.
Leydesdorff, L. (2007). Betweenness centrality as an indicator of the interdisci-
plinarity of scientific journals. Journal of the American Society for Information
Science and Technology, 58(9), 1303-1319.
Liben-Nowell, D., & Kleinberg, J. (2008). Tracing information flow on a global scale
using Internet chain-letter data. PNAS, 105(12), 4633-4638.
Lindahl, B.I.B. (1992). Discovery, theory change, and the Nobel Prize: On the
mechanism of scientific evolution. Theoretical Medicine, 13(2), 97-231.
Lindsay, R.K., & Gordon, M.D. (1999). Literature-based discovery by lexical statis-
tics. Journal of the American Society for Information Science, 50(7), 574-587.
Geng, L., & Hamilton, H.J. (2006). Interestingness measures for data mining: A
survey. ACM Computing Surveys, 38(3), 9.
Lokker, C., McKibbon, K.A., McKinlay, R.J., Wilczynski, N.L., & Haynes, R.B.
(2008). Prediction of citation counts for clinical articles at two years using data
available within three weeks of publication: retrospective cohort study. BMJ,
336(7645), 655-657.
Marshall, B.J. (2005). Helicobacter connections. Nobel Lecture.
Marshall, B.J., & Warren, J.R. (1984). Unidentified curved bacilli in the stomach
of patients with gastritis and peptic ulceration. Lancet, 1(8390), 1311-1315.
Mayer, R.E. (1995). The search for insight: Grappling with Gestalt Psychology’s
unanswered questions. In R.J. Sternberg & J.E. Davidson (Eds.), The Nature
of Insight (pp. 3-32). Cambridge, MA: The MIT Press.
Morris, S.A., & Van der Veer Martens, B. (2008). Mapping research specialties.
Annual Review of Information Science and Technology, 42, 213-295.
Mullins, N.C., Hargens, L.L., Hecht, P.K., & Kick, E.L. (1977). The group struc-
ture of cocitation clusters: A comparative study. American Sociological Review,
42(4), 552-562.
Newman, M.E.J. (2001). The structure of scientific collaboration networks. Proc.
Natl. Acad. Sci. USA, 98, 404-409.
Nowakowska, M. (1973). An epidemical spread of scientific objects: an attempt of
empirical approach to some problems of meta-science. Theory and Decision, 3,
262-297.
NSF. (2007). Important Notice No. 130: Transformative research. Retrieved Nov
19, 2008, from http://www.nsf.gov/pubs/2007/in130/in130.jsp
Perkins, D.N. (1995). Insight in minds and genes. In R.J. Sternberg & J.E. Davidson
(Eds.), The nature of insight (pp. 495-534). Cambridge, MA: MIT Press.
Perneger, T.V. (2004). Relation between online “hit counts” and subsequent cita-
tions: prospective study of research papers in the BMJ. BMJ, 329, 546-547.
Pincock, S. (2005). Nobel Prize winners Robin Warren and Barry Marshall. Lancet,
366(9495), 1429.
Pirolli, P. (2007). Information foraging theory: Adaptive interaction with informa-
tion. Oxford, England: Oxford University Press.
Radder, H. (1997). Philosophy and history of science: Beyond the Kuhnian paradigm. Studies in History and Philosophy of Science, 28(4), 633-655.
Thomas, J.J., & Cook, K.A. (Eds.). (2005). Illuminating the path: The research and development agenda for visual analytics. Los Alamitos, CA: IEEE Computer Society Press.
Valente, T.W. (1996). Social network thresholds in the diffusion of innovations.
Social Networks, 18, 69-89.
Wagner-Dobler, R. (1999). William Goffman’s “Mathematical approach to the pre-
diction of scientific discovery” and its application to logic, revisited. Sciento-
metrics, 46(3), 635-645.
Wasserman, S., & Faust, K. (1994). Social network analysis: Methods and applica-
tions. Cambridge University Press.
Chapter 6 Knowledge Domain Analysis
Fig. 6.1 The relationship between a research front and its intellectual base. Source:
(Chen, 2006).
6.1.2 Tasks
Knowledge domain visualization involves three primary tasks in understanding a body of scientific literature or other types of documents, such as grant proposals and patent applications:
1) Improving the clarity of individual networks;
2) Highlighting transitions between adjacent networks;
3) Identifying potentially important nodes.
The first task focuses on the clarity of individual networks’ representa-
tions. One of the major aesthetic criteria established by research in graph
drawing is that link crossings should be avoided whenever possible. A net-
work visualization with the fewest edge crossings is regarded as not only more aesthetically pleasing, but also more efficient to work with in terms of
the performance of relevant perceptual tasks. The number of link crossings
may be reduced by pruning various links in a network. Minimum spanning
trees and Pathfinder network scaling are commonly used algorithms. The
major advantages and disadvantages of these scaling techniques are further
analyzed in the following subsection.
The second task requires that two adjacent networks can be progressively
merged so that it becomes clear which part of the earlier network is persistent
in the new network, which part of the earlier network is no longer active in
the new network, and which part of the new network is completely new. Much
of the novelty of our method is associated with how we address this issue.
The third task underlines the role of visually salient features in simplifying
search tasks for intellectual turning points. Visually salient nodes include
landmark nodes, pivot nodes, and hub nodes.
6.1.2.1 Improving the Clarity of Networks
Co-citation networks often have too many links to display them all without links occluding one another. There are two general approaches to reducing the number
of links in a display: a threshold-based approach and a topology-based ap-
proach. In a threshold-based approach, the elimination of a link is purely
determined by whether the link’s weight exceeds a threshold. In contrast,
in a topology-based approach, the elimination of a link is determined by
a more extensive consideration of intrinsic topological properties; therefore,
such approaches tend to preserve certain intrinsic topological properties more
reliably, although the computational complexity tends to be higher.
Pathfinder network scaling was originally developed by cognitive scientists
to build procedural models based on subjective ratings (Schvaneveldt, 1990).
It uses a more sophisticated link elimination mechanism than a minimum
spanning tree (MST) algorithm. It retains the most important links and pre-
serves the integrity of the network. Every network has a unique Pathfinder
network, which contains all the alternative MSTs of the original network.
Pathfinder network scaling aims to simplify a dense network while preserv-
ing its salient properties. The topology of a Pathfinder network is determined
by two parameters r and q. The r parameter defines a metric space over a
given network based on the Minkowski distance so that one can measure the
length of a path connecting two nodes in the network. The Minkowski dis-
tance becomes the familiar Euclidean distance when r = 2. When r = ∞, the
weight of a path is defined as the maximum weight of its component links,
and the distance is known as the maximum value distance. Given a metric
space, the triangle inequality condition can be stated as follows:

$$ w_{ij} \leq \left( \sum_{k=1}^{m} w_{n_k n_{k+1}}^{r} \right)^{1/r} $$

where $w_{ij}$ is the weight of the direct link between nodes i and j, and $w_{n_k n_{k+1}}$ is the weight of the link between $n_k$ and $n_{k+1}$ on an alternative path, for k = 1, 2, ..., m, with $i = n_1$ and $j = n_{m+1}$. In other words, the alternative path between i and j may go all the way round through the nodes $n_1, n_2, \ldots, n_{m+1}$ as long as each intermediate link belongs to the network.
If wij is greater than the weight of an alternative path, then the direct path
between i and j violates the inequality condition. Consequently, the link i − j
will be removed because it is assumed that such links do not represent the
most salient aspects of the association between the nodes i and j.
The q parameter specifies the maximum number of links that alternative
paths can have for the triangle inequality test. The value of q can be set to
any integer between 2 and N − 1, where N is the number of nodes in the
network. If an alternative path has a lower cost than the direct path, the
direct path will be removed. In this way, Pathfinder reduces the number of
links from the original network, while all the nodes remain untouched. The
resultant network is also known as a minimum-cost network.
The strength of Pathfinder network scaling is its ability to derive more
accurate local structures than other comparable algorithms such as multidi-
mensional scaling (MDS) and minimum spanning tree (MST). However, the
Pathfinder algorithm is computationally expensive. The maximum pruning
power of Pathfinder is achievable with q = N −1 and r = ∞; not surprisingly,
this is also the most expensive configuration because all possible paths must be
examined for each link. Some recent implementations of Pathfinder networks
reported the use of the set union of MSTs.
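To make the pruning rule concrete, here is a minimal Python sketch of the strongest configuration (r = ∞, q = N − 1), assuming that link weights are distances (smaller means stronger), so similarity weights such as co-citation coefficients would first have to be converted, for example to 1 − similarity. The function name and the toy matrix are illustrative and are not part of CiteSpace or of the original Pathfinder implementation.

```python
import numpy as np

def pathfinder_inf(dist):
    """PFNET(r=infinity, q=N-1) over a symmetric distance matrix (np.inf = no link).

    A direct link i-j is kept only if no alternative path has a smaller
    maximum link weight, i.e. the link satisfies the triangle inequality
    under the maximum value distance.
    """
    n = dist.shape[0]
    minimax = dist.copy()
    # Floyd-Warshall variant: the cost of a path is its largest link weight
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = max(minimax[i, k], minimax[k, j])
                if via_k < minimax[i, j]:
                    minimax[i, j] = via_k
    # Retain only links whose direct weight does not violate the inequality
    keep = np.isfinite(dist) & (dist <= minimax)
    np.fill_diagonal(keep, False)
    return keep

# Toy example: in a triangle, the longest edge violates the inequality
d = np.array([[0.0, 1.0, 3.0],
              [1.0, 0.0, 1.5],
              [3.0, 1.5, 0.0]])
print(pathfinder_inf(d))   # the 0-2 link (weight 3.0) is pruned
```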
Fig. 6.2 Three types of salient nodes in a co-citation network. Source: (Chen,
2004).
6.1.3 CiteSpace
CiteSpace has been the primary vehicle for progressive knowledge domain analysis. It is a freely available Java application (http://cluster.ischool.drexel.edu/~cchen/citespace) for visualizing and analyzing emerging patterns and critical changes in the literature of a scientific domain (Chen, 2004; Chen, 2006; Chen, Ibekwe-SanJuan, & Hou, 2010). CiteSpace
uses a set of bibliographic records as input, typically including information
on cited references, and produces interactive visualizations of networks in which authors, references, and several other types of entities appear as nodes over a number of consecutive time slices. These visualizations are designed to help
users identify intellectual turning points, critical paths of transitions, and
aggregations of individual nodes. The general procedure is described below
(see Fig. 6.3). Details of more specific analytic features will be given as they
are needed.
6.1.3.1 Time Slicing
Time slicing divides the entire time interval into equal-length segments called
time slices. The duration of each segment can be as short as one year or as
long as tens or even hundreds of years. If appropriate data is available, it is
possible to slice it thinner to make monthly or weekly segments. Currently,
sliced segments are mutually exclusive, although overlapping segments could
be an interesting alternative.
6.1.3.2 Sampling
Citation analysis and co-citation analysis typically sample the most highly
cited work — the cream of the crop. In order to construct a network in CiteSpace, users may set their own criteria for node selection and link selection.
Alternatively, they can use the default setting provided by CiteSpace. The
simplest way to select nodes is the Top-N method, in which the N most cited
articles within the timeframe of each slice will be included in the final net-
work. Similarly, the Top-N% method will include the N% of the most cited
references within each slice. CiteSpace also allows the user to choose three
sets of threshold values and interpolates these values across all the slices.
Each set of threshold values consists of a citation count (c), a co-citation count (cc), and a cosine coefficient of co-citation similarity (ccv). In CiteSpace, the user needs to select the desired thresholds for the beginning, the middle, and the
ending slices. CiteSpace automatically assigns interpolated thresholds to the
remaining slices.
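The interpolation step can be sketched as follows: the user supplies (c, cc, ccv) triples for the first, middle, and last slices, and the remaining slices receive linearly interpolated values. The exact interpolation scheme used by CiteSpace is not spelled out here, so treat this as one plausible reading rather than the actual implementation.

```python
import numpy as np

def interpolate_thresholds(n_slices, first, middle, last):
    """Linearly interpolate (c, cc, ccv) threshold triples across time slices."""
    anchor_slices = [0, (n_slices - 1) / 2, n_slices - 1]
    curves = []
    for i in range(3):                       # one curve per threshold: c, cc, ccv
        anchor_values = [first[i], middle[i], last[i]]
        curves.append(np.interp(range(n_slices), anchor_slices, anchor_values))
    return list(zip(*curves))                # one (c, cc, ccv) triple per slice

# Hypothetical thresholds for a five-slice analysis
print(interpolate_thresholds(5, (3, 3, 0.15), (4, 3, 0.15), (5, 4, 0.20)))
```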
Research has shown that citation counts often follow a power law distri-
bution. The vast majority of published articles are never cited. On the other
hand, a small number of articles claim a lion's share of citations. Many
factors may influence the frequency and distribution of citations to published
articles. A highly cited article is highly visible. Its visibility is likely to at-
tract more citations. As far as intellectual turning points are concerned, we
are particularly interested in articles that have rapidly growing citations. In
the following superstring example, we use a simple model to normalize the ci-
tations of an article within each time slice by the logarithm of its publication
age — the number of years elapsed since its publication year. The rationale is
to highlight articles whose citations grew most in the early years after their publication.
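A minimal sketch of such a normalization is shown below; the text only specifies "the logarithm of its publication age", so the exact form used here, log(age + 1), is an illustrative assumption.

```python
import math

def normalized_citations(citations_in_slice, slice_year, publication_year):
    """Discount a slice's citation count by the logarithm of publication age."""
    age = max(slice_year - publication_year, 1)      # years since publication
    return citations_in_slice / math.log(age + 1)

print(normalized_citations(40, 2003, 2001))   # young, fast-rising article scores high
print(normalized_citations(40, 2003, 1990))   # older article with the same count scores lower
```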
6.1.3.3 Modeling
By default, co-citation counts are calculated within each time slice. Co-
citation counts are normalized as cosine coefficients, provided c(i) ≠ 0 and c(j) ≠ 0:

$$ cc_{\mathrm{cosine}}(i, j) = \frac{cc(i, j)}{\sqrt{c(i) \cdot c(j)}} $$
where cc(i, j) is the co-citation count between documents i and j, and c(i) and
c(j) are their citation counts, respectively. The user can specify a selection
threshold for co-citation coefficients; the default value is 0.15.
Alternative measures of co-citation strengths are available in the infor-
mation science literature, such as Dice and Jaccard coefficients.
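The cosine normalization is easy to reproduce; the following sketch assumes raw citation and co-citation counts are already available and compares the resulting coefficient against the default selection threshold of 0.15.

```python
import math

def cocitation_cosine(cc_ij, c_i, c_j):
    """Cosine-normalized co-citation strength: cc(i, j) / sqrt(c(i) * c(j))."""
    if c_i == 0 or c_j == 0:
        return 0.0
    return cc_ij / math.sqrt(c_i * c_j)

# Two references cited 50 and 32 times, co-cited 12 times
w = cocitation_cosine(12, 50, 32)
print(w, w >= 0.15)   # 0.3, passes the default threshold
```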
6.1.3.4 Pruning
An effective pruning can reduce link crossing and improve the clarity of the
resultant network visualization. CiteSpace supports two common network
pruning algorithms, namely Pathfinder and MST. The user can choose to prune individual networks only, the merged network only, or both. Pruning
increases the complexity of the visualization process. In the following section,
visualizations with local pruning and global pruning are discussed.
Here we concentrate on Pathfinder-based pruning. To prune individual
networks with Pathfinder, the parameters q and r are set to Nk − 1 and ∞,
respectively, to ensure the most extensive pruning effect, where Nk is the size
of the network in the kth time slice. For the merged network, the q parameter
is set to (Σ_k N_k) − 1, where the sum runs over all time slices k = 1, 2, ... .
6.1.3.5 Merging
The sequence of time sliced networks is merged into a synthesized network,
which contains the set union of all nodes that ever appear in any of the individ-
ual networks. Links from individual networks are merged based on either the
earliest establishment rule or the latest reinforcement rule. The earliest es-
tablishment rule selects the link that has the earliest time stamp and drops
subsequent links connecting the same pair of nodes, whereas the latest rein-
forcement rule retains the link that has the latest time stamp and eliminates
earlier links.
By default, the earliest establishment rule applies. The rationale is to
support the detection of the earliest moment when a connection was made
in the literature. More precisely, such links mark the first time a connection
becomes strong enough with respect to the chosen thresholds.
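A minimal sketch of the two rules follows; the dictionary-based representation of sliced networks is purely illustrative and is not how CiteSpace stores its data.

```python
def merge_networks(sliced_links, rule="earliest"):
    """Merge time-sliced link lists under the earliest establishment rule.

    'sliced_links' maps a time slice (e.g., a year) to {(i, j): weight}.
    Under the earliest establishment rule, the first slice in which a pair
    appears wins; iterating slices in reverse order instead implements the
    latest reinforcement rule.
    """
    order = sorted(sliced_links, reverse=(rule == "latest"))
    merged = {}
    for t in order:
        for pair, weight in sliced_links[t].items():
            if pair not in merged:           # keep only the first link encountered
                merged[pair] = (t, weight)
    return merged

slices = {2001: {("a", "b"): 0.21},
          2003: {("a", "b"): 0.35, ("b", "c"): 0.18}}
print(merge_networks(slices))                # ('a', 'b') keeps its 2001 time stamp
print(merge_networks(slices, rule="latest")) # ('a', 'b') keeps its 2003 time stamp
```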
6.1.3.6 Mapping
The layout of each network, either individual time sliced networks or the
merged one, is produced using Kamada and Kawai's algorithm (Kamada &
Kawai, 1989). The size of a node is proportional to the normalized citation
counts in the latest time interval. Landmark nodes can be identified by their
large discs. The label size of each node is proportional to citations of the
article, thus larger nodes also have larger-sized labels. The user can enlarge
font sizes at will. The width of a link is proportional to the corresponding
co-citation coefficient. The color of a link indicates the earliest appearance
time of the link with reference to chosen thresholds.
Visually salient nodes such as landmarks, hubs, and pivots are easy to
detect by visual inspection. CiteSpace also includes algorithms to detect such
nodes computationally. The visual effect is a natural result of slicing and
merging, while additional computational metrics enhance the visual features
even further. A useful computational metric should reflect the degree of a
node, and it should also take into account the heterogeneity of the node’s
links. The more dissimilar the links through which a node connects to other nodes, the more likely the node plays a pivotal role.
Fig. 6.4 Turning points in superstring research. Source: (Chen, 2004). (see color
figure at the end of this book)
Fig. 6.5 A network of 624 co-cited references. Source: (Chen, 2004). (see color
figure at the end of this book)
Friedan’s 1986 article is a distinct pivot node connecting a blue cluster (1985 –
1987), a pink cluster (1988 – 1990), and a green cluster (1991 – 1993). Witten’s
1986 article is a pivot between a blue cluster (1985 – 1987) and a yellow cluster
(2000 – 2002).
In Fig. 6.4, small clusters in red (2003) indicate the candidates for emerg-
ing clusters. We were able to find Polchinski's 1995 article in a smaller merged network, but the article was overwhelmed by the 4,000-strong set of links
of the larger network. Nevertheless, the quality of the visualized network is
promising: intellectually significant articles tend to have topologically unique
positions.
Articles by Maldacena, Witten, and Gubser-Klebanov-Polyakov, located
towards the top of the major network component, were all published in 1998.
When we asked Witten to comment on an earlier version of the map, in which
citation counts were not normalized by years since publication, he indicated
that the Green-Schwarz article is more important to the field than the three
top cited ones, and that the earlier articles in the 1990s appeared to be under-
represented in the map. There is an apparent mismatch between citation
frequencies of nodes and their importance judged by domain experts. Witten’s
comments raised an important question: is it possible that an intellectually
significant article may not always be the most highly cited? Yes, indeed; it
is possible.
The comments from domain experts have confirmed that both versions of
the merged network indeed highlight significant articles. And these articles
tend to have unique topological properties that distinguish themselves from
other articles.
Fig. 6.6 Major areas in terrorism research. Source: (Chen, 2006). (see color figure
at the end of this book)
Fig. 6.7 Trends in mass extinctions research. Source: (Chen, 2006). (see color
figure at the end of this book)
Fig. 6.8 Macroscopic patterns were identified by our citation analysis published
in 2006 and by mass extinction experts in 2010.
We often take for granted that we can always tell the source of a shadow
by just looking at the shadow alone. Henry Bursill’s book Hand Shadows to
be Thrown Upon the Wall is full of vivid shadows made by bare hands on
the wall. Fig. 6.9 shows one of the shadows from the book. Making a vivid-looking shadow out of something drastically different has become an art. As shown in Fig. 6.10, the motorcycle-shaped shadow is not a shadow of a motorcycle; instead, the source of the shadow was a pile of junk. Even our natural language becomes awkward when expressing the split between a shadow and its source. This type of projection is so vividly and precisely rendered
that it is hard for us to realize that the boy or the motorcycle does not even
exist!
6.2.2 Metrics
Our new procedure adopts several structural and temporal metrics of co-
citation networks and subsequently generated clusters. Structural metrics
include betweenness centrality, modularity, and silhouette. Temporal and
hybrid metrics include citation burstness and novelty.
The betweenness centrality metric is defined for each node in a network. It
measures the extent to which the node is in the middle of a path that connects
other nodes in the network (Brandes, 2001; Freeman, 1977). High between-
ness centrality values identify potentially revolutionary scientific publications
(Chen, 2005) as well as gatekeepers in social networks. If a node provides the
only connection between two large but otherwise unrelated clusters, then this
node would have a very high value of betweenness centrality. Recently, power
centrality, introduced by Bonacich (1987), has also been drawing a lot of attention for dealing with networks in which someone's power depends on the power of
those he/she is socially related to, for example, in (Kiss & Bichler, 2008).
The modularity Q measures the extent to which a network can be divided
into independent blocks, i.e. modules (Newman, 2006; Shibata, Kajikawa,
Taked, & Matsushima, 2008). The modularity score ranges from 0 to 1. A
low modularity suggests a network that cannot be reduced to clusters with
clear boundaries, whereas a high modularity may imply a well-structured
network. On the other hand, networks with modularity scores of 1 or very
close to 1 may turn out to be some trivial special cases where individual
components are simply isolated from one another. Since the modularity is
defined for any network, one may compare different networks in terms of
their modularity, for example, between ACA and DCA networks.
The silhouette metric (Rousseeuw, 1987) is useful in estimating the un-
certainty involved in identifying the nature of a cluster. The silhouette value
of a cluster, ranging from −1 to 1, indicates the uncertainty that one needs
to take into account when interpreting the nature of the cluster. The value of
1 represents a perfect separation from other clusters. In this study, we expect
that cluster labeling or other aggregation tasks will become more straightfor-
ward for clusters with silhouette values in the range of 0.7 – 0.9 or higher.
Burst detection determines whether a given frequency function has statis-
tically significant fluctuations during a short time interval within the overall
time period. It is valuable for citation analysts to detect whether and when
the citation count of a particular reference has surged. For example, after the
September 11 terrorist attacks, citations to earlier studies of the Oklahoma City bombing increased abruptly (Chen, 2006). It can also be used to detect whether
a particular connection has been significantly strengthened within a short
period of time (Kumar, Novak, Raghavan, & Tomkins, 2003). We adopt the
burst detection algorithm introduced in (Kleinberg, 2002).
Sigma (Σ) is introduced in (Chen, Chen, Horowitz, Hou, Liu, & Pelle-
grino, 2009a) as a measure of scientific novelty. It identifies scientific pub-
lications that are likely to represent novel ideas according to two criteria
of transformative discovery. As demonstrated in case studies (Chen et al.,
2009a), Nobel Prize and other award-winning research tends to have the highest values of this measure. CiteSpace currently uses (centrality + 1)^burstness as the Σ value so that the brokerage mechanism plays a more prominent role than
the rate of recognition by peers.
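As an illustration, the sketch below computes this form of Σ with the networkx library over a stand-in graph and hypothetical burst strengths; in practice the burst values would come from a separate run of the burst detection algorithm.

```python
import networkx as nx

def sigma(graph, burstness):
    """Compute sigma = (centrality + 1) ** burstness for every node.

    'burstness' maps nodes to citation burst strengths obtained elsewhere;
    betweenness centrality is computed from the network itself.
    """
    centrality = nx.betweenness_centrality(graph, normalized=True)
    return {n: (centrality[n] + 1) ** burstness.get(n, 0.0) for n in graph}

g = nx.karate_club_graph()                 # stand-in for a co-citation network
bursts = {0: 4.2, 33: 2.8, 5: 1.1}         # hypothetical burst strengths
top = sorted(sigma(g, bursts).items(), key=lambda kv: -kv[1])[:3]
print(top)                                 # nodes combining brokerage and bursts rank first
```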
6.2.3 Clustering
The network G is divided into K mutually exclusive sub-graphs such that $G = \bigcup_{k=1}^{K} G_k$ and $G_i \cap G_j = \emptyset$ for all $i \neq j$. Given sub-graphs A and B, a cut function is defined as follows: $\mathrm{cut}(A, B) = \sum_{i \in A, j \in B} w_{ij}$, where the $w_{ij}$'s are the cosine coefficients mentioned above. The criterion that items in the same cluster should have strong connections can be optimized by maximizing $\sum_{k=1}^{K} \mathrm{cut}(G_k, G_k)$. The criterion that items between different clusters should be only weakly connected can be optimized by minimizing $\sum_{k=1}^{K} \mathrm{cut}(G_k, G - G_k)$. In this study, the cut function is normalized as

$$ \sum_{k=1}^{K} \frac{\mathrm{cut}(G_k, G - G_k)}{\mathrm{vol}(G_k)} $$

to achieve more balanced partitions, where $\mathrm{vol}(G_k)$ is the sum of the weights of the links in $G_k$, i.e. $\mathrm{vol}(G_k) = \sum_{i \in G_k} \sum_{j} w_{ij}$ (Shi & Malik, 2000).
Spectral clustering is an efficient and generic clustering method (Luxburg,
2006; Ng et al., 2002; Shi & Malik, 2000). It has roots in spectral graph
theory. Spectral clustering algorithms identify clusters based on eigenvectors
of Laplacian matrices derived from the original network. Spectral clustering
has several desirable features compared to traditional algorithms such as k-
means and single linkage (Luxburg, 2006):
1) It is more flexible and robust because it does not make any assumptions
on the forms of the clusters;
2) It makes use of standard linear algebra methods to solve clustering prob-
lems;
3) It is often more efficient than traditional clustering algorithms.
The multiple-perspective method utilizes the same spectral clustering al-
gorithm for both ACA and DCA studies. Despite its limitations (Luxburg,
Bousquet, & Belkin, 2009), spectral clustering provides clearly defined infor-
mation for subsequent automatic labeling and summarization to work with.
In this study, instead of letting the analyst specify how many clusters
there should be, the number of clusters is uniformly determined by the spec-
tral clustering algorithm based on the optimal cut described above.
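For readers who wish to experiment with the idea, the sketch below runs spectral clustering on a small precomputed affinity matrix using scikit-learn. One simplification should be noted: the number of clusters is supplied explicitly here, whereas the procedure described above determines it from the optimal cut.

```python
import numpy as np
from sklearn.cluster import SpectralClustering

# Hypothetical symmetric affinity matrix of cosine co-citation coefficients;
# rows and columns correspond to cited references.
affinity = np.array([[1.0, 0.8, 0.7, 0.1, 0.0],
                     [0.8, 1.0, 0.6, 0.0, 0.1],
                     [0.7, 0.6, 1.0, 0.1, 0.0],
                     [0.1, 0.0, 0.1, 1.0, 0.9],
                     [0.0, 0.1, 0.0, 0.9, 1.0]])

model = SpectralClustering(n_clusters=2, affinity="precomputed",
                           assign_labels="discretize", random_state=0)
labels = model.fit_predict(affinity)
print(labels)   # e.g., [0 0 0 1 1]: two co-citation clusters
```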
Candidate cluster labels are selected from noun phrases and index terms
of citing articles of each cluster. These terms are ranked by three different
algorithms. In particular, noun phrases are extracted from titles and abstracts
of citing articles. The three term ranking algorithms are tf*idf (Salton, Yang,
& Wong, 1975), log-likelihood ratio (LLR) tests (Dunning, 1993), and mutual
information (MI). Labels selected by tf*idf weighting tend to represent the
most salient aspect of a cluster, whereas those chosen by log-likelihood ratio
tests and mutual information tend to reflect a unique aspect of a cluster.
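As a concrete illustration of the second of these algorithms, the sketch below computes Dunning's (1993) log-likelihood ratio (the G² statistic) for a single candidate term from a 2x2 contingency table; the counts are hypothetical.

```python
import math

def llr(k11, k12, k21, k22):
    """Dunning's log-likelihood ratio (G^2) for a 2x2 contingency table.

    k11: occurrences of the term in the cluster's citing articles
    k12: occurrences of all other terms in the cluster's citing articles
    k21: occurrences of the term in the rest of the corpus
    k22: occurrences of all other terms in the rest of the corpus
    """
    def h(*counts):                          # unnormalized entropy-like term
        total = sum(counts)
        return sum(k * math.log(k / total) for k in counts if k > 0)

    return 2 * (h(k11, k12, k21, k22)
                - h(k11 + k12, k21 + k22)    # row totals
                - h(k11 + k21, k12 + k22))   # column totals

# A term that is far more frequent among the cluster's citers than elsewhere
print(llr(40, 960, 5, 19995))                # large value => strong cluster label candidate
```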
Garfield (1979) has discussed various challenges of computationally se-
lecting the most meaningful terms from scientific publications for subject
indexing. Indeed, the notion of citation indexing was originally proposed as
an alternative strategy to deal with some of the challenges. White (2007a,
2007b) offers a new way to capture the relevance of a communication in terms
of the widely known tf*idf formula.
A good text summary should have a sufficient and balanced coverage
with minimal redundant information (Sparck Jones, 1999). Teufel and Moens
(2002) proposed an intriguing strategy for summarizing scientific articles
based on the rhetorical status of statements in an article. Their strategy
specifically focuses on identifying the new contribution of a source article
and its connections to earlier work. Automatic summarization techniques
have been applied to areas such as identifying drug interventions from MED-
LINE (Fiszman, Demner-Fushman, Kilicoglu, & Rindflesch, 2009).
centrality is greater than 0.1; the thickness of the ring is proportional to its
centrality value.
A line connecting two items in the visualization represents a co-citation
link. The thickness of a line is proportional to the strength of co-citation. The
color of a line represents the time slice in which the co-citation was made
for the first time. A useful byproduct of spectral clustering is that tightly
coupled clusters tend to be placed next to each other and visually form a
supercluster.
Fig. 6.13 A 120-author ACA network on a single time slice of 5 years (2001 –
2005). Clusters are labeled by citers’ title terms using tf∗ idf weighting. An
undefined cluster (#11) is omitted. Source: (Chen et al., 2010).
author can only appear in one cluster but may appear in multiple factors,
one matching factor was selected only if the author has the greatest factor
loading in absolute values; if no such factor was found, the author had no
match. The overall overlapping rate is 82%, computed as follows:
$$ \frac{\sum_{i=1}^{11} \sum_{j=1}^{12} |C_i \cap F_j|}{\sum_{i=1}^{12} |C_i|} = \frac{98}{120} = 0.82 $$
Fig. 6.14 An associative network of clusters (diamonds) and factors (circles) with
10% or more overlaps (thickness of line). Cluster labels are shown in two parts:
terms chosen by tf∗ idf and by log-likelihood ratio. Source: (Chen et al., 2010).
Fig. 6.15 40 ACA clusters (1996 – 2008) (Nodes=633, Edges=7,162, top N=150,
time slice length=1, modularity=0.2278, mean silhouette=0.6929). Source: (Chen
et al., 2010).
Table 6.1 shows automatically chosen cluster labels of the 6 largest ACA
clusters along with their size and silhouette value. Top-ranked title terms by
LLR were selected as cluster labels. The largest cluster interactive informa-
tion retrieval (#31) has 199 members. Its negative silhouette value of −0.090
suggests a heterogeneous citer set. The second largest cluster (#17), with
95 members, is labeled as information retrieval. Other candidate labels for
the cluster include probabilistic model and query expansion, confirming that
this cluster deals with classic information retrieval issues. The third largest
cluster (#7) is bibliometric analysis.
Most cited authors include Spink A and Saracevic T in interactive in-
formation retrieval (#31), Salton G, Robertson SE, and van Rijsbergen CJ
in information retrieval (#17), Garfield E, Moed HF, and Merton RK in
bibliometric analysis (#7), Egghe L, Price DJD, and Lotka AJ in statisti-
cal analysis (#2). The webometric analysis cluster (#11) includes Cronin B,
Rousseau R, and Lawrence S. The journal co-citation analysis (#8) includes
Small H, Leydesdorff L, and White HD.
Note that one may reach different insights into the nature of a co-citation
cluster if different sources of information are used. The cited members of a
cluster define its intellectual base, whereas citers to the cluster form a re-
search front. The major advantage of our approach is that it enables analysts
to consider multiple aspects of the citation relationship from multiple per-
spectives.
In the progressive DCA, co-citation networks were first constructed with the
top-100 most cited documents in each of the 13 one-year time slices between
1996 and 2008. Then, these networks were merged into a network of 655 co-
cited references. The merged network was subsequently decomposed into 50
clusters. Table 6.2 summarizes these clusters. We first provide an overview of
these clusters and discuss the five largest clusters in detail.
The 50 clusters vary considerably in size. The largest cluster #18 contains
150 members, which is 22.90% of the entire set of 655 references. The five
largest clusters altogether account for 51.60%. In contrast, six clusters contain only two members.
The network’s overall mean silhouette value is 0.7372, which is the high-
est among the three co-citation networks we analyzed in this study. In gen-
eral, the silhouette value of a cluster is negatively correlated with its size
(−0.654). For example, the largest cluster, #18, has the lowest silhouette
value of −0.024, indicating its diverse and heterogeneous structure. In con-
trast, the second largest cluster, #43, has a more homogenous structure with
a reasonably high silhouette value of 0.522. The fifth largest cluster, #2, has
a very high silhouette value of 0.834. The following discussion will focus on
the five largest clusters and their interrelationships.
The five largest document co-citation clusters are interactive information
retrieval (#18), academic web (#43), information retrieval (#46), citation
behavior (#44), and h-index (#2). We analyzed two aspects of each specialty:
(1) prominent members of a cluster as the intellectual basis and (2) themes
identified in the citers of the cluster as research fronts.
Table 6.3 summarizes two clusters (#43 and #2, both have silhouette
values greater than 0.50) in terms of top-cited members and their structural,
temporal, and saliency metrics such as citation count (ϕ), betweenness cen-
trality (σ), citation burstness (τ), and sigma (Σ) — a novelty indicator (Chen
et al., 2009a). The stars in the academic web cluster (#43) are Lawrence 1999
and Kleinberg 1999. Both papers were published outside the domain defined
by the 12 source journals; instead, they appeared in Nature and JACM.
This is an example of how one discipline (information science) was influenced
by another (computer science). The core of the fifth largest cluster, the h-
index cluster (#2), is Hirsch 2005, which originally introduced the concept
of h-index. The strongest citation burst of 15.75 was detected in the citation
history of Hirsch 2005. As our analysis will demonstrate, the h-index cluster
is one of the most active areas of research in recent years.
Table 6.3 Most frequently cited references in two of the document co-citation clusters
(ϕ = citation count, τ = burstness, σ = centrality, Σ = sigma).

Cluster #  ϕ   τ      σ     Σ     Cited references
43         76  8.83   0.06  0.17  LAWRENCE S (1999) Accessibility and distribution of information on the Web, Nature, 400, 107
43         64  9.00   0.06  0.16  ALMIND TC (1997) Informetric analyses on the world wide web: methodological approaches to 'Webometrics', J DOC, 53, 404
43         63  3.44   0.04  0.22  INGWERSEN P (1998) The calculation of Web impact factors, J DOC, 54, 236
43         53  6.58   0.02  0.12  Kleinberg, J.M. (1999) Authoritative sources in a hyperlinked environment, JACM, 46, 604-632
43         50  7.12   0.03  0.13  Rob Kling and Geoffrey W. McKim (2000) Not just a matter of time: Field differences and the shaping of electronic media in supporting scientific communication, JASIS, 51(14), 1306-1320
2          42  15.75  0     0.02  HIRSCH JE (2005) An index to quantify an individual's scientific research output, P NATL ACAD SCI USA, 102, 16569
2          24  8.98   0     0.01  Bornmann, L. & Daniel, H.-D. (2005) Does the h-index for ranking of scientists really work? Scientometrics, 65(3), 391-392
2          22  7.54   0     0.02  Ball, P. (2005) Index aims for fair ranking of scientists, Nature, 436(7053), 900
2          19  7.11   0     0.01  Braun, Tibor (2005) A Hirsch-type index for journals, The Scientist, 19(22), 8
2          18  6.73   0     0.02  Egghe, L. (2005) Power laws in the information production process: Lotkaian informetrics. Elsevier: Oxford, UK
The average age of core papers in a cluster is an estimation of the time the
cluster was formed. According to the average age of top-5 core papers, the
37-year-old citation behavior cluster is the oldest — formed around 1973, its average
year of publication, whereas the h-index cluster is the youngest — 5 years
old, formed in 2005. In between, the Information Retrieval cluster (#13) is
31 (formed in 1979); the interactive Information Retrieval cluster (#18) is 18
(formed in 1992); and the academic web cluster (#43) is 11 (formed in 1999).
Research fronts of a document co-citation cluster were characterized by
terms extracted from the citers of the cluster. Nine methods of ranking ex-
tracted terms were implemented in CiteSpace by choosing terms from three
sources — titles, abstracts, and index terms of the citers of each cluster —
and three ranking algorithms, namely, tf∗ idf weighting (Salton et al., 1975),
log-likelihood ratio tests (LLR) (Dunning, 1993; Witten & Frank, 1999), and
mutual information (MI)(Witten & Frank, 1999). Top-ranked terms became
candidate cluster labels.
The reliability of these term ranking methods was measured by a consen-
sus score r = 0.1 ∗ (n + 1), where n is the number of other methods that also
top rank the same term. It turned out that the best three ranking methods
were: (1) title terms ranked by LLR, (2) index terms ranked by LLR, and
(3) title terms ranked by tf∗ idf. tf∗ idf and LLR produced identical labels for
36 clusters out of 50 (72%).
The largest cluster (#18) has 150 members and it has the lowest silhou-
ette value. It turned out that the cluster was cited by 185 citing articles in
the dataset. A total of 869 terms were extracted from the titles of these citing
articles. In order to verify the heterogeneity of this set of citers, the term sim-
ilarity network was decomposed using singular value decomposition (SVD).
As a result, the term space was indeed multi-dimensional in nature because
the largest connected component of the term similarity network contains only
353 terms, which is 40.62% of the 869 terms. In contrast, the h-index cluster
(#2) was much more homogeneous; the cluster was the citation footprint of
39 citing articles.
The second largest cluster was labeled as academic web by LLR, but
the top-ranked index term was webometrics, which was also the name of a
specialty identified by Zhao and Strotmann. The index term webometrics is
broader and more generic than the term academic web. This observation sug-
gests that a manual labeling process is probably very similar to the indexing
process after all.
The identification of the h-index cluster is unique because there was no
such cluster in the 1996 – 2008 ACA. This is a good example of why one should
consider both ACA and DCA so that distinct DCA clusters such as the h-
index one can be detected.
The time span τ between a research front and its intellectual base can be
estimated as the difference between their average years of publications:
$$ \tau(C_i) = \frac{\sum_{d \in \mathrm{citers}(C_i)} \mathrm{year}(d)}{|\mathrm{citers}(C_i)|} - \frac{\sum_{c \in C_i} \mathrm{year}(c)}{|C_i|} + 1 $$
For example, citation behavior (#35) has the longest time span, τ(C35) = 2000 − 1973 + 1 = 28 years. IR has the second longest time span, τ(C13) = 2000 − 1979 + 1 = 22 years. The time span for interactive IR is τ(C18) = 2000 − 1992 + 1 = 9 years; for academic web (#43), τ(C43) = 2003 − 1999 + 1 = 5 years; and for h-index (#2), τ(C2) = 2007 − 2005 + 1 = 3 years.
Table 6.4 lists the most representative citing articles in each cluster. For
example, Thelwall has a prominent role in the research front of the academic
web cluster (#43). He authored 3 of the top 5 citing articles of the cluster,
including Thelwall 2003 which cited 14 references of the cluster.
Table 6.4 Titles of the two most frequent citers to each of the 5 largest DCA clusters. Terms chosen by LLR are underlined.

Cluster #18, Interactive information retrieval:
  (16) Robins D (2000) shifts of focus on various aspects of user information problems during interactive information retrieval
  (15) Beaulieu M (2000) interaction in information searching and retrieval

Cluster #43, Academic web:
  (14) Thelwall M (2003) disciplinary and linguistic considerations for academic web linking: an exploratory hyperlink mediated study with mainland china and taiwan
  (12) Wilkinson D (2003) motivations for academic web site interlinking: evidence for the web as a novel source of information on informal scholarly communication

Cluster #13, Information retrieval:
  (8) Ding Y (2000) bibliometric information retrieval system (birs): a web search interface utilizing bibliometric research results
  (6) Dominich S (2000) a unified mathematical definition of classical information retrieval
  (6) Sparck-Jones K (2000) a probabilistic model of information retrieval: development and comparative experiments part 2

Cluster #35, Citation behavior:
  (5) Case DO (2000) how can we investigate citation behavior? a study of reasons for citing literature in communication
  (5) Ding Y (2000) bibliometric information retrieval system (birs): a web search interface utilizing bibliometric research results

Cluster #2, H-index:
  (14) Bornmann L (2007) what do we know about the h-index?
  (11) Sidiropoulos A (2007) generalized hirsch h-index for disclosing latent facts in citation networks
The DCA network shown in Fig. 6.16 was generated by CiteSpace. The
655 references and 6,099 co-citation links were divided into 50 clusters with a
modularity of 0.6205, which represents a considerable amount of inter-cluster
links. Major clusters are labeled in the visualization in red color with the font
size proportional to the size of clusters. The colors of co-citation links reveal
that the earliest inter-cluster connection is between interactive IR and IR,
Fig. 6.16 An overview of the co-citation networks. Cited references with highest
sigma values are labeled. Source: (Chen et al., 2010).
Fig. 6.17 A timeline visualization of the 50 DCA clusters (655 nodes, 6,099 links,
modularity=0.6205, mean Silhouette=0.7372). Cluster labels are automatically
generated from title terms of citing articles of specific clusters. Source: (Chen et
al., 2010).
Fig. 6.18 The burst of citations to Hirsch 2005. Source: (Chen et al., 2010).
If we start from the top of the timeline visualization and move down line
by line, we can see many representative references in these clusters. For exam-
ple, Film Archive (#11) is a relatively new cluster, containing Jansen 2000
on searching for multimedia on the web as a major reference. Similarly, the
most cited reference in information retrieval (#13) is Salton’s book. Further
down the timeline list is the citation behavior cluster (#35), which features
Garfield 1979 prominently. Many co-citation links join citation behavior and
academic web. Some long-range co-citation links connect the h-index clus-
ter and other clusters such as the academic web cluster and the power law
cluster.
6.4 Summary
free from the limitations of a specific data source. On the other hand, they
may need to deal with a potentially much larger search space, which can
be a daunting task, especially for those who do not have an encyclopedic
knowledge of the subject domain. Utilizing external information sources such
as Wikipedia and the World Wide Web is a promising direction for resolving the limited term space problem. An interesting approach
was reported recently in (Carmel, Roitman, & Zwerdling, 2009).
Although some cluster labels make good sense, some labels are still puz-
zling and some members of clusters may not be as intuitive as others. Some of
the labels appear to be strongly biased by particular citing articles, especially
when the size of a cluster is relatively small. Algorithmically generated clus-
ter labels are limited to deal with clusters that have multiple aspects formed
by a diverse range of citing papers. Clusters with low mean silhouette values
tend to be subject to such limitations more than high silhouette clusters.
On a positive note, metrics such as modularity and silhouette provide useful
indicators of uncertainty that analysts should take into account when inter-
preting the nature of clusters. We have been looking for a labeling algorithm
that is consistently better than others. Since we do not have datasets with
gold standards, this cannot be validated systematically except by making
comparisons across the 9 sets of candidate labels.
A few more fundamental questions need to be thoroughly addressed. If
labels selected from citers differ from those from citees, how do we reconcile
the difference? How do we make sense of the citer-citee duality? One of the
fundamental assumptions for co-citation analysis is that co-citation clusters
do represent something substantial, as real as any other part of reality, although they might otherwise be invisible. Given that some co-citation clusters appear to
be biased by the citation behavior of particular publications, it may become
necessary to re-examine the assumption, especially whether co-citation clus-
ters represent something that is truly integral to the scientific community as
a whole.
The multiple-perspective approach has the following advantages over the
traditional one:
1) It can be consistently used for both DCA and ACA.
2) It uses more flexible and efficient spectral clustering to identify co-citation
clusters.
3) It characterizes clusters with candidate labels selected by multiple ranking
algorithms from the citers of these clusters and reveals the nature of a
cluster in terms of how it has been cited.
4) It provides metrics such as modularity and silhouette as quality indicators
of clustering to aid interpretation tasks.
5) It provides integrated and interactive visualizations for exploratory anal-
ysis.
These features enhance the interpretability and accountability of co-
citation analysis. Modularity and silhouette metrics provide useful quality in-
dicators of clustering and network decomposition. This is a valuable addition
References

Chapter 7 Messages in Text
Conflicting opinions are part of life. At a larger scale, debates can last for years about the causes of mass extinctions hundreds of millions of years ago. De-
bates like the one concerning the competitiveness of science and technology
in the Gathering Storm can involve a wide range of stakeholders and decision
makers. At a smaller scale, reviewers may give contradictory recommenda-
tions on whether particular research proposals should be funded. Consumers
may find drastically different opinions on whether a new book or a new prod-
uct is worth purchasing. These types of clashes of opinions are an essential
and valuable driving force in situational awareness and decision making. As
we have seen from the examples discussed in Chapter 2, contradictions are
often an integral part of creativity.
Critical challenges are to identify the basic premises of arguments from
each individual perspective, assess the credibility of available evidence and
alternative perspectives, understand the context and background of a par-
ticular position, and track the development of various perspectives in a broad context over a long period of time. While detecting trends and dynam-
and help users to understand exactly what the nature of an identified new thematic pattern is.
Fig. 7.1 The distribution of customer reviews of The Da Vinci Code on Ama-
zon.com within the first year of its publication (March 18, 2003 ∼ March 30, 2004).
Although positive reviews consistently outnumbered negative ones, arguments and
reasons behind these reviews are not apparent. Source: (Chen, Ibekwe-SanJuan,
SanJuan, & Weaver, 2006).
1 http://www.cogsci.ed.ac.uk/~mikheev/tagger_demo.html
then used to select terms that are not merely highly frequent, but influential in differentiating reviews of different categories. The selected terms represent an aggressive dimensionality reduction, ranging from 94.5% to 99.5%. Selected
terms are used for decision tree learning and classification tests with other
classifiers.
SVM can be used to visualize reviews of different categories. Each review
is represented as a point in a high-dimensional space S, which contains three
independent subspaces Sp, Sq, and Sc: S = Sp ⊕ Sq ⊕ Sc. Sp represents a review in positive review terms only. Similarly, Sq represents a review in neg-
ative review terms only and Sc represents reviews with both positive and
negative review terms. In other words, a review is decomposed into three
components to reflect the presence of positive review terms, negative review
terms, and terms that are common in both categories. Note that if a review
does not contain any of these selected terms, then it will not have a mean-
ingful presence in this space. All such reviews are mapped to the origin of
the high-dimensional space and they are excluded from subsequent analysis.
The optimal configuration of the SVM classifier is determined by a number
of parameters, which are in turn determined based on a k-fold cross-validation
(Chang & Lin, 2001). This process is known as model selection. A simple grid
search heuristic is used to find the optimal parameters in terms of the average
accuracy so as to avoid the potential overfitting problem.
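A minimal sketch of this kind of model selection is shown below, using scikit-learn's grid search rather than the LIBSVM tools of Chang and Lin (2001); the synthetic data merely stands in for the review term vectors.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Synthetic stand-in for the positive/negative review term vectors
X, y = make_classification(n_samples=300, n_features=50, random_state=0)

# Grid search over the RBF kernel's C and gamma with 10-fold cross-validation,
# selecting the combination with the best average accuracy
grid = GridSearchCV(SVC(kernel="rbf"),
                    param_grid={"C": [1, 10, 100], "gamma": [0.001, 0.01, 0.1]},
                    cv=10, scoring="accuracy")
grid.fit(X, y)
print(grid.best_params_, round(grid.best_score_, 3))
```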
Table 7.3 shows the statistics of the term extraction and clustering by
TermWatch. We describe these results in more detail in the following sections.
Table 7.3 Multi-layered feature selection using TermWatch.
Review categories Terms Classes Components Unique features
Positive 20,078 1,017 1,983 879
Negative 14,464 906 1,995 2,018
Fig. 7.4 Terms extracted from positive reviews are clustered based on both syn-
tactic and semantic relationships. Source: (Chen et al., 2006).
A term variation network has three levels: clusters are shown at the highest
level, then components, and finally terms at the lowest level.
7.1.4.1 Positive Reviews
The largest cluster labeled leonardo da vinci art in the network of terms asso-
ciated with positive reviews is surrounded by the clusters literary fiction, the
complete dead sea scroll, harvard professor, and isaac newton. The structure
of this cluster is highly interconnected and its content appears to be coherent
as it captures the main facets of the positive reviews: comments on the major
characters (Prof Langdon), the praises (great storytelling, clever story, grip-
ping novel, historic fiction), other major characters (Sophie Neveu, Leonardo
Da Vinci, Sir Isaac Newton).
Another main cluster Da Vinci code fuss is also about the book itself (the
da vinci code fuss, the da vinci novel, the da vinci code review). These terms were grouped into the same cluster because of terminological variation (here, modifier substitution).
The Da Vinci code fuss cluster is linked to another cluster labeled the vinci
code, which in turn connects to another cluster labeled as mary magdalene
legend. The mary magdalene cluster is concerned with the historical plausi-
bility of events, people and organizations described in the book. For instance,
there is much controversy about the supposed liaison between Mary Magda-
lene and Jesus Christ. Other much debated topics are the roles of the Prieure
de Sion and Opus Dei organizations, the effects of the historical events as
depicted in the book on the religious faith of today's Christians, and the research the
author claimed to have carried out to back up his version of the historical
events. Because of the varied nature of the terms in this cluster, most of the
links are due to associations (co-occurrence).
An isolated sub-network deals with the author’s writing history: his next,
previous or new books. Apparently, the terminology used to talk about this
in the reviews is distinct from the terms used to praise the current book,
hence the isolation.
The predictive text analysis of the book reviews serves two objectives: to
validate the predictive power of selected terms and to provide a visual rep-
resentation for analysts to explore and understand the role of these terms in
reviews of different categories.
Terms are ranked differently by document frequency and log-likelihood
ratio. As shown in Table 7.4, terms with high document frequency tend to
be descriptive of the book being reviewed (e.g., book, story, novel, fiction),
whereas terms with a high log-likelihood ratio tend to be more related to
opinions, judgments, and recommendations (e.g., money, hype, great read,
disappoint, waste).
Table 7.5 summarizes the number of terms selected by log-likelihood val-
ues and the accuracies of three classifiers with 10-fold cross-validation. The
original set of extracted terms contains 28,763 terms. The dimensionality re-
duction rates range from 94% to 99.5%. In contrast, if we select terms based
on their document frequencies (>= 2), there will be 6,881 terms and the
accuracy of classification with a C4.5 decision tree is 68.89%, which is be-
low all the models with log likelihood tests. More importantly, decision trees
(C4.5) are relatively stable in terms of 10-fold cross-validation accuracies
(slightly over 70%), whereas SVM models achieve more than 80% accuracy, which means the selected terms are good candidates for categorizing these reviews. These classifiers are available in the data mining software Weka (Witten
& Frank, 1999).
Table 7.4 Terms ranked differently by document frequency (DF) and log-likelihood
ratio.
Term DF Log-likelihood Term DF Log-likelihood
book 2456 2.99 money 83 68.27
story 697 14.25 write 179 66.61
reader 571 0.23 hype 146 61.37
character 561 59.32 character 561 59.32
da vinci code 559 10.85 author 504 53.04
novel 539 0.00 great read 92 48.40
fiction 536 4.89 couldn’t 135 48.10
time 512 21.17 disappoint 39 46.77
author 504 53.04 waste 33 39.26
plot 499 17.25 don’t waste 22 37.63
Fig. 7.5 Distributions of selected terms. The colors of dots indicate the statis-
tical significance level of the corresponding terms, namely green (p < 0.001), blue (p = 0.001), red (p = 0.01), and pink (p = 0.5). Source: (Chen et al., 2006). (see color
figure at the end of this book)
Fig. 7.6 A decision tree representation of terms that are likely to differentiate
positive reviews from negative reviews made in 2003. Source: (Chen et al., 2006).
Fig. 7.7 A decision tree based on reviews made in 2004. Source: (Chen et al.,
2006).
blah, and catholic conspiracy has only 6 variants in reviews, all negative,
identifying readers shocked by the book.
Browsing the interrelationship between reviews and TermWatch clusters
reveals topics that appear in both the positive and negative categories and are thus ig-
nored by decision trees. As it turns out, each of the terms like jesus christ
wife, mary magdalene gospel, conspiracy theory and christian history have
more than 50 variants that are almost evenly distributed between positive
and negative reviews.
The perspective of term variation helps to identify the major themes of
positive and negative reviews. For negative reviews, the heavy religious con-
troversies raised by the book are signified by a set of persistent and variation-rich terms such as mary magdalene, opus dei, and the holy grail, and none
of these terms ever reached the same status in positive reviews. Much of the
enthusiasm in positive reviews can be explained by the perspective that the
book is a work of fiction rather than scholarly work with discriminating terms
such as vacation read, beach read, and summer read.
Fig. 7.8 shows an opinion differentiation tree for reviews of the video iPod. The classification accuracy of this decision tree model is as high as 91.49%. The presence of the term video quality predicts a positive review, whereas battery life is a sign of a negative review. The more specific term short battery life appears in the tree at the 6th level from the top.
Fig. 7.10 The structure of a concept tree. Each sub-tree corresponds to an un-
derlying concept.
fined query.
Research in citation-based trend analysis provides additional motivations
to the work. A typical way to analyze emerging trends in a scientific domain
is to analyze the structure and dynamics of its literature by forming a net-
work of references and then studying the network dynamics (Chen, 2006).
Analysts often run clustering algorithms to divide the document space into
clusters of documents. Interpreting the nature of such clusters has been a
bottleneck of the analytical process. What the analyst needs at this stage are
patterns that characterize precise relations expressed in terms of hypotheses
and findings, which are not readily captured by statistical methods. In other
words, patterns must reflect natural language expressions.
Analysts often need to differentiate two instances of text, either two doc-
uments on related topics, or two samples of text from different time points.
Historians of science, for example, need to study the variations of terms in
order to establish reasons why a scientific theory was rejected before and
why it was accepted later on. In the case of the continental drift theory, its acceptance relied on fundamental conceptual changes (Thagard, 1992).
except the input data per se. The reason for making such a restricted assumption is that we want to establish a baseline for the further development of such processing procedures, and we also want to identify how far existing natural language processing and general-purpose programming techniques can go toward this goal.
Our approach is inspired by an observation that is intrinsically related to the notion of the degree of interest (DOI). According to the commonly known explanation of DOI, variations of perceived details in a scene are a function of the viewer's interest. The function reflects where the viewer's interest is placed. Usually, we pay more attention to things next to us. In contrast, we may pay less and less attention to things further and further away from us. If we turn this thinking to natural languages, we recognize something strikingly similar: variations of descriptive details in text are a function of the writer's interest. If we are writing about a few topics, we naturally tend to use more words to describe, clarify, differentiate, and iterate the topics that we think are more important than the rest. We tend to find more examples and take into account more perspectives than we do otherwise. As a result, the more important topics will be surrounded by far richer varieties of words than the remaining topics. The design of the new procedure draws upon this observation and focuses on identifying the core of such a concentration in text as the symbol of a concept. In addition, we decided to concentrate on the predicate chain of the subject, the verb, and the object in a sentence so as to simplify the original sentence and make it easy for the user or the analyst to decide whether there is sufficient interest to pursue. We expect that, given a sufficiently large amount of input, the basic structure of a sentence will be found more and more frequently, especially for important topics. As such instances accumulate, emergent patterns may appear. Such patterns are expected to be insightful for making sense of unstructured text. They may play instrumental roles in facilitating further visual analysis of seemingly unstructured text.
7.2.3.1 Procedure
The flow chart in Fig. 7.11 illustrates the key components of the procedure
and their communications.
The procedure starts with the selection of sources of text. The user may select a single document, multiple documents, or a directory of documents as the initial input data source. The selected documents as a whole are referred to as a source of the procedure. The current prototype supports two sources. The text in the selected source is subsequently processed by part-of-speech (POS) tagging. The result is that each and every word in the original text is annotated with a POS tag. For example, the noun tree is tagged as tree/nn and the verb grow is tagged as grow/vb.
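This tagging step can be approximated with an off-the-shelf tagger. The sketch below uses NLTK, whose Penn Treebank tags are upper case (NN, VB) rather than the lower-case nn/vb notation used here; the example sentence is made up.

# Minimal POS-tagging sketch with NLTK (Penn Treebank tags, e.g. NN, VB).
import nltk
# nltk.download("punkt"); nltk.download("averaged_perceptron_tagger")  # first run only

sentence = "These results demonstrate that the network grows rapidly."
tokens = nltk.word_tokenize(sentence)
tagged = nltk.pos_tag(tokens)     # [('These', 'DT'), ('results', 'NNS'), ...]

# Render each word as word/tag, the form used by the pattern matcher.
print(" ".join("%s/%s" % (w, t.lower()) for w, t in tagged))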
The next step, pattern matching, is to identify segments of word sequences based on their POS tags and then organize various parts of such segments according to heuristics on implied hierarchical relations. For example, the noun phrase large-scale network can be split into two parts: large-scale and network. The noun network is known as the head noun; large-scale is known as the modifier of the head noun. Thus the word network represents a concept and will be stored as a parent node in a hierarchical representation. The term large-scale can be seen as an attribute of the concept and will be stored as a child node under the parent network. The pattern matching process is illustrated in Fig. 7.12.
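The head/modifier heuristic can be sketched in a few lines: the last noun of a tagged noun phrase becomes the concept (parent) node and the preceding words become its attribute (child) nodes. The dictionary below is a simplified stand-in for the prototype's internal tree representation.

# Store noun phrases as concept (head noun) -> attributes (modifiers).
from collections import defaultdict

concept_tree = defaultdict(set)

def add_noun_phrase(tagged_np):
    """tagged_np: list of (word, pos) pairs for one noun phrase,
    e.g. [('large-scale', 'jj'), ('network', 'nn')]."""
    head = tagged_np[-1][0]                     # last noun is the head
    modifiers = [w for w, _ in tagged_np[:-1]]  # everything before it
    concept_tree[head].update(modifiers)        # head noun is the parent node

add_noun_phrase([("large-scale", "jj"), ("network", "nn")])
add_noun_phrase([("complex", "jj"), ("network", "nn")])
print(dict(concept_tree))    # {'network': {'large-scale', 'complex'}}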
Table 7.6 summarizes the major patterns defined by regular expressions
over POS-tagged text. To make the construction of these patterns easier,
we use a bottom-up approach. Starting with the basic building blocks such
as nouns, verbs, and adjectives, more complex patterns are built by joining
these building blocks. For example, the predicate pattern is defined in terms
of subject, verb, and object, which are in turn defined in terms of noun
phrases and verb groups.
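The bottom-up composition can be mimicked with ordinary regular expressions over the word/tag stream, as in the drastically simplified sketch below; the actual patterns are far longer (the predicate pattern alone runs to thousands of characters), so this only illustrates the compositional idea, and the tag patterns shown are simplified assumptions.

# Building patterns bottom-up over POS-tagged text (word/tag tokens).
# A drastically simplified sketch; the actual patterns are much richer.
import re

NOUN = r"\S+/nns?"               # noun or plural noun
ADJ  = r"\S+/jj[rs]?"            # adjective
DET  = r"\S+/dt"                 # determiner
VERB = r"\S+/vb[dgnpz]?"         # any verb form

NP = rf"(?:{DET}\s+)?(?:{ADJ}\s+)*(?:{NOUN}\s+)*{NOUN}"          # noun phrase
PREDICATE = rf"(?P<subject>{NP})\s+(?P<verb>{VERB})\s+(?P<object>{NP})"

tagged = "these/dt results/nns demonstrate/vbp the/dt scalability/nn"
m = re.search(PREDICATE, tagged)
if m:
    print(m.group("subject"), "|", m.group("verb"), "|", m.group("object"))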
Fig. 7.12 The regular expression of the subject-predicate pattern consists of 3,480
characters. The sentence on the top of the figure is tagged first. Corresponding
patterns are matched based on rules defined in the regular expression for predicates.
Identified patterns are added to the tree.
Fig. 7.13 The user interface of a prototype. The portion of the predicate tree
shown in the figure represents the pattern of “these results demonstrate . . .”, which
forms a small branch of a 111,507-node hierarchy constructed based on 110-year
Science article abstracts (1900 – 2010).
are displayed on the top of the screen and instances of the current focal node
in the tree are highlighted in yellow.
the sizes of both the concept tree and the predicate tree. We are particularly interested in the relationship between the average length of sentences and the overall coverage rate (the percentage of sentences in which concept and predicate patterns are found). We also recorded the runtime taken to completion. The experiment used an IBM ThinkPad T500 with a 2.53 GHz dual-core processor, 3 GB of RAM, and Java Runtime version 1.6.0_11. The results are shown in Tables 7.7 and 7.8.
Table 7.7 Datasets tested.

Source                   Type      Sentences   Words       Nouns (%)   Verbs (%)
Yahoo Patents (15227)    Abstract  9,342       189,662     40.72       13.84
Google Patents (823)     Abstract  4,372       83,586      39.40       14.40
InfoVis (2000 – 2009)    Abstract  2,139       42,472      34.40       12.35
Darwin (1872)*           Book      5,635       197,332     23.05       13.77
Burt (2005)              Book      5,201       112,556     30.94       12.87
Science (1900 – 2000)    Abstract  98,370      2,062,010   37.09       10.98
Chen (2004)              Article   358         6,440       30.45       13.23
Shneiderman (1996)       Article   258         4,500       34.04       12.51
Chen (2006)              Article   659         10,831      33.60       12.55
*Excluding glossaries and the index.
Table 7.8 Records are sorted by the coverage rate (% of sentences in the trees).

Source                   Type      Concepts   Predicates   Coverage (% sentences)   Runtime (ms)
Yahoo Patents (15227)    Abstract  15,227     12,082       67.90                    71,666
Google Patents (823)     Abstract  7,091      5,393        63.63                    577
InfoVis (2000 – 2009)    Abstract  5,364      2,535        58.85                    6,957
Darwin (1872)*           Book      13,583     5,538        50.22                    279,622
Burt (2005)              Book      11,187     5,896        49.51                    147,826
Science (1900 – 2000)    Abstract  279,932    111,506      49.44                    4,852
Chen (2004)              Article   899        375          41.06                    8,722
Shneiderman (1996)       Article   747        279          37.60                    6,514
Chen (2006)              Article   1,463      589          34.90                    14,756
Fig. 7.14 suggests that the average length of sentences in text may be correlated with the overall coverage rate. Patent abstracts and InfoVis abstracts have notably higher coverage rates than the other sample datasets tested. It
Fig. 7.14 Words from longer sentences may be more likely to be included in the
trees.
in and out to obtain more contextual details of a concept, for example, all
sorts of adjectives and nouns that were found surrounding the term data in
this particular source of text.
Fig. 7.15 A concept tree, a predicate tree, and a comparative predicate tree of
the InfoVis abstract dataset.
secondary part to these trees. Fig. 7.16 is an example of such merged trees. Patterns found in the first sample are shown in pink; those found in the second sample are shown in green; and patterns in common are shown in yellow. These predicates are associated with the subject node article. We can see that the most commonly used patterns in the InfoVis dataset include article + presents + *, article + introduces + *, article + proposes + *, and article + describes + *. These rhetorical statements are probably also common across the majority of scholarly publications. Scrolling deeper down the tree reveals more semantically focused statements, for example, GPU + requires + data parallel programming.
Fig. 7.16 A merged predicate tree from IEEE InfoVis papers in two periods of
time: 2000 – 2004 and 2005 – 2009.
words, sentences, documents, and the entire data set, which would reduce
the cognitive burden of the analyst.
The third scenario is the study of a lengthy document. In this scenario, the
analyst can use the interactive visualization of the concepts and predicates as
an indexing mechanism. Since all the instances and contexts of a concept can
be found within the subtree of the concept, it becomes easy for the analyst
to access them. We illustrate this scenario with two book examples.
Brokerage and Closure is written by sociologist Ronald S. Burt (2005) to
introduce the concepts of brokerage and closure in social networks and their
practical implications. Fig. 7.17 shows the top of the concept tree and the
top of the predicate tree, where prominent patterns are usually positioned.
For example, it is visually evident that the book has many ways to describe
people, network, trust, and ideas as shown on the left concept tree. Similarly,
the predicate tree on the right reveals that the leading actors in the book are
you, we, they, and it.
Fig. 7.17 A concept tree and a predicate tree of Ronald Burt’s 2005 book.
The example in Fig. 7.18 is Darwin’s classic The Origin of Species (6th
ed.). Predictably, species is the most prominent concept, followed by forms,
varieties, differences, animals, plants, and group. Its predicate tree includes
patterns such as forms of life + are + *, it + * + *, we + * + *, they
+ * + *, and many exotic plants + have + *.
Fig. 7.18 A concept tree and a predicate tree of Darwin's The Origin of Species.
We list some examples below and hope they can serve as test data for further enriched and enhanced heuristics and pattern matching rules. The challenges for further improvement include making the pattern matching mechanisms more efficient by shortening the lengths of the regular expressions, expanding the coverage of the processing procedure so that a wider variety of sentences can be handled, and improving the time efficiency of the algorithm by reducing the overall runtime.
node. Using an indexing mechanism to store the bulk of data outside the tree
structures would improve the performance.
The implementation of the prototype uses our own hand-crafted regular expression patterns. More sophisticated patterns may be developed and tuned with natural language processing tools such as GATE. It will also be useful to compare the coverage and other benchmark scores with a wider range of natural language processing resources, including various POS taggers.
The new method has several application implications. For example, the tree construction procedure can be used to develop an alternative indexing and ranking algorithm that weighs the size of the subtrees of a node. One may also derive semantic metrics based on the positions of nodes in such trees and measure the degree of interest between two nodes. This indexing potential has the advantage of preserving the original context and providing an easy-to-access interface to all the instances in multiple documents.
The new method provides an extra layer of interface between text visu-
alization and text so that one can focus on exploring aggregated patterns
as the intermediate linkage between specific words and sentences and their
broader context. This extra layer enables the analyst to move back and forth
across the boundaries of documents and focus on the essence of the text.
The emergent structural patterns enable users to identify the areas to
pursue from the global overview of the entire dataset. The new method allows
the analyst to contrast and compare two sources of text at various levels of
detail, for example, in the study of patents from competing corporations or publications from different schools of thought.
Future work should address the issues discussed above. In addition, constructing an authoritative ontology of the information visualization field for comprehensive experimental tests, incorporating ontology construction techniques, and conducting user evaluations and field studies are among the promising routes forward.
Michael Jackson, the legendary King of Pop, died on June 25, 2009. The news of his death created a surge of search volume on the Internet that was too much for Google News to handle.3 The volcanic peak was marked as B in Fig. 7.19.
This type of surge is also known as a burst. Identifying such bursts in a
timely manner is the primary goal for burst detection. While it is reasonably
straightforward to track down reasons behind this type of burst as well as
their timing and duration, it can be much more complex and challenging in
other situations. Is there a visible burst of positive or negative reviews of The
3 http://articles.cnn.com/2009-06-26/tech/michael.jackson.internet_1_google-trends-search-results-michael-jackson?_s=PM:TECH
Da Vinci Code in Fig. 7.1? Was there a burst of particular terms in those
reviews? In the literature of a scientific field, do we expect to see a burst of
a topic as it becomes suddenly popular? If a field is experiencing a Kuhnian
paradigm shift, or a conceptual revolution, would we be able to detect a burst
of articles representing the new paradigm?
Fig. 7.19 The volcanic peak of search of “Michael Jackson.” Source: Google
Trends.4
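As a rough illustration of the idea, the sketch below flags bursts in a series of yearly counts whenever the observed rate exceeds a multiple of the long-run baseline rate. This is a deliberately simple threshold detector rather than the state-based burst-detection algorithm used in the analyses that follow, and the counts are made up.

# A deliberately simple burst detector over yearly counts: flag years whose
# count exceeds a multiple of the overall baseline rate. The counts below are
# hypothetical; the analyses in this book use a state-based detection algorithm.
def detect_bursts(counts, threshold=2.0):
    baseline = sum(counts) / len(counts)           # long-run average rate
    bursting = [c >= threshold * baseline for c in counts]
    bursts, start = [], None
    for i, b in enumerate(bursting + [False]):     # sentinel closes a trailing run
        if b and start is None:
            start = i
        elif not b and start is not None:
            bursts.append((start, i - 1))          # inclusive interval of burst years
            start = None
    return bursts

citations_per_year = [0, 1, 1, 2, 9, 12, 8, 3, 2, 2]
print(detect_bursts(citations_per_year))           # -> [(4, 6)]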
one below) and the more recent complex network analysis that was largely
done by physicists.
Fig. 7.20 The largest connected component of complex network analysis research
(1980 – 2009).
The unique position of the pivotal node suggests that it is essentially one
of the few that both communities had in common. Was there any burst of
citation to the pivotal-point work? If we can detect a burst, what would it
tell us about the evolution of the two communities?
As it turns out, there was indeed a burst of citations. The burst started in 1998 and lasted until 2000 (Fig. 7.21). Among the early citers in 1998 was the groundbreaking paper for complex network analysis written by Watts and published in Nature. What is intriguing is the position of the burst in the entire 20-year history. The level of the citation counts during the burst period is, in hindsight, not the highest, but the timing is much more meaningful for the purpose of identifying an emerging trend or a new field. The timing coincided with the publication of the groundbreaking paper of the new complex network analysis. Furthermore, the pivotal work was indeed cited by the groundbreaking paper. The citations of the pivotal work increased rapidly from 2002. The delay in part reflects how long it may take for knowledge to diffuse and for a new paradigm to become established.
Fig. 7.21 A burst of citation was detected between 1998 and 2000.
Fig. 7.22 Comparing time to effect and duration of burst with survival analysis.
cited paper in string theory written by Juan Maldacena. The paper was published in 1998. A citation burst was detected between 1999 and 2003. The waiting time was 1 year. A total of 288 cited references are shown in the network in Fig. 7.24. They are split into high- and low-citation groups by the median citation count of 19.50, so that survival analysis can address the question of whether papers in the two groups differ in terms of patterns associated with their citation bursts.
Fig. 7.23 The waiting time and the duration of a citation burst for a 1998 paper
by Maldacena. The burst began in 1999, one year after its publication, and ended
in 2003.
Survival analysis found that the highly cited reference group is likely to have a citation burst sooner than the less cited papers. A highly cited paper waits, on average, about 2.2 years, whereas a less cited paper waits about 5.2 years (see Fig. 7.25).
Fig. 7.24 The input data for survival analysis consist of 288 cited references, as shown in this network. They are split into a high- and a low-citation group by h-index.
Fig. 7.25 The highly cited papers are likely to have a citation burst sooner (burst within 2.2 years) than the less cited papers (burst within 5.2 years).
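In outline, the comparison treats the onset of a citation burst as the event, the years from publication to the burst as the duration, and contrasts the two groups with Kaplan-Meier estimates and a log-rank test. The sketch below uses the lifelines package in Python; the file bursts.csv and its columns are hypothetical.

# Survival analysis of time-to-burst for high- vs. low-citation references.
# A minimal sketch with the lifelines package; bursts.csv and its columns
# (years_to_burst, burst_observed, high_citation) are hypothetical.
import pandas as pd
from lifelines import KaplanMeierFitter
from lifelines.statistics import logrank_test

df = pd.read_csv("bursts.csv")
high = df[df["high_citation"] == 1]
low  = df[df["high_citation"] == 0]

kmf = KaplanMeierFitter()
for name, grp in [("high-citation", high), ("low-citation", low)]:
    kmf.fit(grp["years_to_burst"], event_observed=grp["burst_observed"], label=name)
    print(name, "median time to burst:", kmf.median_survival_time_)

result = logrank_test(high["years_to_burst"], low["years_to_burst"],
                      event_observed_A=high["burst_observed"],
                      event_observed_B=low["burst_observed"])
print("log-rank p-value:", result.p_value)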
The previous example suggests that survival analysis, combined with burst
detection, can help us to differentiate two kinds of publications, namely pa-
pers that are highly cited and papers that are not. On closer inspection, the procedure itself is quite generic. It can be used to compare not only two but several groups. A broad range
of events can be defined accordingly. Thus, the procedure is applicable to
compare different types of proposals, different types of patent applications,
as well as different types of scientific publications. In the following example
we illustrate how this procedure can be used to differentiate successful and
unsuccessful proposals.
We hypothesize that awarded and declined proposals may differ in terms of how and when they deal with hot topics. The timing of the appearance of a hot topic can be measured by burst detection. The comparison between the awarded and declined groups is done with survival analysis, which allows us to address the question of whether the two groups differ statistically in terms of how soon hot topics appear and how long they last. The results indicate that such differences are indeed detectable with statistical significance, with the caveat that the choices made in noun phrase extraction, and therefore in text segmentation, appear to influence the sensitivity of the analysis.
We considered hypotheses that may distinguish awarded and declined
proposals:
1) Awarded proposals address topics in a timelier manner than declined pro-
posals.
2) Awarded proposals address more profound topics than declined proposals.
3) Noun phrases extracted from core segments are more specific and focused
on proposed research questions than terms from one-page project sum-
maries.
4) Survival analysis of noun phrases extracted from one-page project sum-
maries is the same as results from noun phrases extracted from the core
segments.
The first hypothesis can be tested in terms of the survival probabilities of
bursts of terms as the events. Awarded proposals are expected to deal with
hot topics earlier than declined ones. The second hypothesis means that the
duration of a term burst in an awarded proposal is expected to be longer
than the corresponding duration in a declined proposal.
The hypotheses were tested with 1,206 awarded and 4,305 declined proposals from an NSF program over four years (2007 – 2010). The results, shown in Table 7.9, include the number of noun phrases extracted and the number of noun phrases with detected bursts. The “Time till burst” column shows the p-level of the statistical significance between the awarded and declined groups. Notably, the awarded and declined groups are distinguishable by single words and two kinds of noun phrases, namely, single-word nouns
This means that core segments are better sources of text than one-page summaries for studies of documents such as grant proposals.
Survival analysis of bursts of noun phrases (1 ∼ 4 words) from one-page project summaries revealed a statistically significant difference (p = 0.007) between awarded and declined proposals in terms of the duration of a burst. As shown in Fig. 7.26, bursts of terms in awarded proposals lasted for a shorter time (1.792 years on average) than in declined proposals (2.381 years on average), although no statistically significant difference was found in the time till burst.
Fig. 7.26 Awarded and declined proposals have different survival probabilities
of burst duration. Source: one-page project summaries of proposals (1 – awarded,
0 – declined).
7.4 Summary
In this chapter, we first addressed some of the issues concerning how to differentiate conflicting opinions in an evidence-based approach, in particular, the role of decision trees in representing terms that may predict the orientation of customer reviews. The method takes particular advantage of the available ratings of reviews. Decision trees of terms and positions of reviewers provide not only a descriptive model of the central issues of a debate but also a predictive model for anticipating the paths of argument one may follow.
The second part of the chapter deals with situations in which we have neither judgments from users or experts in numerical form nor a taxonomy or ontology. We introduced a method we are developing to address these issues by tapping into patterns that we can discern from linguistic relations. The flexibility needed to tolerate the ambiguity of natural
References
Budiu, R., Pirolli, P., & Fleetwood, M. (2006). Navigation in degree of interest trees.
http://www2.parc.com/istl/groups/uir/publications/items/UIR-2006-02-Budiu-
NavigationinDOITrees.pdf. Accessed June 1, 2010.
Burt, R.S. (2004). Structural holes and good ideas. American Journal of Sociology,
110(2), 349-399.
Burt, R.S. (2005). Brokerage and closure. New York, NY: Oxford University Press.
Callon, M., Courtial, J.P., Turner, W.A., & Bauin, S. (1983). From translations
to problematic networks — An introduction to co-word analysis. Social Science
Information Sur Les Sciences Sociales, 22(2), 191-235.
Card, S., & Nation, D. (2002). Degree-of-interest trees: A component of an attention-
reactive user interface. Proceedings of AVI (pp. 231-245).
Chalmers, M. (1992). BEAD: Explorations in information visualisation. In Proceedings of SIGIR '92 (pp. 330-337). Copenhagen, Denmark: ACM Press.
Chang, C.-C., & Lin, C.-J. (2001). LIBSVM: A library for support vector machines.
http://www.csie.ntu.edu.tw/∼cjlin/libsvm.
Chen, C. (2004). Searching for intellectual turning points: Progressive knowledge
domain visualization. Proc. Natl. Acad. Sci. USA, 101(suppl), 5303-5310.
Chen, C. (2006). CiteSpace II: Detecting and visualizing emerging trends and tran-
sient patterns in scientific literature. Journal of the American Society for Infor-
mation Science and Technology, 57(3), 359-377.
Chen, C., Chen, Y., Horowitz, M., Hou, H., Liu, Z., & Pellegrino, D. (2009). To-
wards an explanatory and computational theory of scientific discovery. Journal
of Informetrics, 3(3), 191-209.
Chen, C., Ibekwe-SanJuan, F., SanJuan, E., & Weaver, C. (2006). Visual analysis of conflicting opinions. In Proceedings of the IEEE Symposium on Visual Analytics Science and Technology (VAST) (pp. 59-66). Baltimore, MD.
Daille, B. (2003). Conceptual structuring through term variations. In Proceedings of the ACL-2003 Workshop on Multiword Expressions: Analysis, Acquisition and Treatment (pp. 9-16). Sapporo, Japan.
Darwin, C. (1872). The origin of species (6th ed.). Project Gutenberg.
Fiszman, M., Demner-Fushman, D., Kilicoglu, H., & Rindflesch, T.C. (2009). Auto-
matic summarization of MEDLINE citations for evidence-based medical treat-
ment: A topic-oriented evaluation. Journal of Biomedical Informatics, 42, 801-
813.
Fry, B. (2009). On the origin of species: The preservation of favoured traces. http://
benfry.com/traces/.
Furnas, G.W. (1986). Generalized fisheye views. In Proceedings of CHI '86 (pp. 16-23). ACM Press.
Ham, F.v., Wattenberg, M., & Viégas, F.B. (2009). Mapping text with phrase nets. IEEE Transactions on Visualization and Computer Graphics, 15(6), 1169-1176.
Havre, S., Hetzler, E., Whitney, P., & Nowell, L. (2002). ThemeRiver: Visualizing
thematic changes in large document collections. IEEE Transactions on Visual-
ization and Computer Graphics, 8(1), 9-20.
Heer, J. (2007). The prefuse visualization toolkit. http://prefuse.org/.
Heer, J., & Card, S.K. (2004). DOI Trees revisited: Scalable, space-constrained
visualization of hierarchical data. Proceedings of AVI (pp. 421-424).
Hetzler, B., Whitney, P., Martucci, L., & Thomas, J. (1998). Multi-faceted insight
through interoperable visual information analysis paradigms. In Proceedings of
the IEEE Information Visualization ’98(pp. 137-144). Los Alamitos, CA: IEEE
Computer Society Press.
Ibekwe-SanJuan, F. (1998). A linguistic and mathematical method for mapping
thematic trends from texts. In Proceedings of the 13th European Conference on
Artificial Intelligence (ECAI’98). (pp. 170-174). Brighton, UK.
Ibekwe-SanJuan, F., & SanJuan, E. (2004). Mining textual data through term
variant clustering: The TermWatch system. In Proceedings of the Recherche
d’Information assistée par ordinateur(RIAO 2004)(pp. 487-503). University of
Avignon, France.
Kohonen, T. (1995). Self-organizing maps. Springer.
Paley, W.B. (2002). TextArc. http://www.textarc.org/.
Pang, B., & Lee, L. (2004). A sentimental education: Sentiment analysis using
subjectivity summarization based on minimum cuts. In Proceedings of the ACL.
PNNL. IN-SPIRE. http://in-spire.pnl.gov/.
Rip, A., & Courtial, J.P. (1984). Co-word maps of biotechnology — An example of
cognitive scientometrics. Scientometrics, 6(6), 381-400.
Shneiderman, B. (1996). The eyes have it: A task by data type taxonomy for in-
formation visualization. In Proceedings of the IEEE Workshop on Visual Lan-
guage(pp. 336-343). Boulder, CO: IEEE Computer Society Press.
Sparck Jones, K. (1999). Automatic summarizing: Factors and directions. In I.
Mani & M.T. Maybury (Eds.), Advances in Automatic Text Summarization
(pp. 2-12). Cambridge, MA: MIT Press.
Swanson, D.R. (1986). Fish oil, Raynaud’s syndrome, and undiscovered public
knowledge. Perspectives in Biology and Medicine, (30), 7-18.
Teufel, S., & Moens, M. (2002). Summarizing scientific articles: Experiments with
relevance and rhetorical status. Computational Linguistics, 28(4), 409-445.
Thagard, P. (1992). Conceptual revolutions. Princeton, New Jersey: Princeton Uni-
versity Press.
Thomas, J.J., & Cook, K.A. (Eds.). (2005). Illuminating the path: The research
and development agenda for visual analytics. IEEE Computer Society Press.
Tijssen, R.J.W., & Vanraan, A.F.J. (1989). Mapping co-word structures — a com-
parison of multidimensional-scaling and leximappe. Scientometrics, 15(3-4), 283-
295.
Toutanova, K., Klein, D., Manning, C., & Singer, Y. (2003). Feature-rich part-of-
speech tagging with a cyclic dependency network. Proceedings of HLT-NAACL
2003 (pp. 252-259).
Viégas, F.B., Wattenberg, M., Ham, F.v., Kriss, J., & McKeon, M. (2007). Many
eyes: A site for visualization at Internet scale. IEEE Transactions on Visualiza-
tion and Computer Graphics, 13(6), 1121-1128.
Wattenberg, M., & Viégas, F.B. (2008). The word tree: an interactive visual con-
cordance. IEEE Transactions on Visualization and Computer Graphics, 14(6),
1221-1228.
Witten, I.H., & Frank, E. (1999). Data mining: Practical machine learning tools and
techniques with Java implementations. San Francisco, CA: Morgan Kaufmann.
Yang, Y., & Pedersen, J.O. (1997). A comparative study on feature selection in text categorization. In D.H. Fisher (Ed.), Proceedings of the 14th International Conference on Machine Learning (ICML'97) (pp. 412-420). Nashville, US: Morgan Kaufmann Publishers.
Chapter 8 Transformative Potential
Identifying and supporting high-risk, high-payoff research has been one of the major concerns of science policy as well as of individual scientists and their institutions. The National Science Foundation (NSF) has been concerned with identifying and funding transformative research for decades. The U.S. is not the only country experiencing a sense of urgency like that of the Gathering Storm we discussed at the beginning of the book. Research Councils UK (RCUK) characterizes high-potential, high-impact research as adventurous, speculative, innovative, exciting, creative, radical, groundbreaking, precedence setting, unconventional, visionary, challenging, ambitious, uncertain, mould-breaking or revolutionary (RCUK 2006). The Natural Sciences and Engineering Research Council of Canada (NSERC) defined the concept of risk in terms of unconventionality and uncertainty of results (NSERC 2003).
Due to intensified international competition, shrinking public funds, and increasingly stringent criteria imposed by funding agencies, obtaining competitive research funding is now a common problem worldwide at various levels. Scientists have to make tough decisions (NSF, 2007) about how to spend their precious time and effort on writing research proposals that have a diminishing chance of success and that face increasingly overwhelmed reviewers. While many funding agencies encourage high-risk, high-payoff research, assessing the transformative potential of a research idea is an increasingly pressing challenge.
allows us to pinpoint the specific links that alter the structure of existing knowledge the most, which is valuable information for additional validation, for example, by consulting with the scientists themselves and other domain experts. In addition, the ability to pinpoint the potential of specific connections makes it possible for analysts to keep track of the evolution of their impact over time, so that one can verify whether scientific ideas identified today as having transformative potential turn out to be transformative in due course.
In order to assess the extent to which these metrics can capture transformative research, our strategy is to take a retrospective-predictive approach by
predicting citation counts received by scientific publications that had induced
strong structural variations in the past. In other words, our hypothesis is that
the degree of structural variation introduced by a scientific publication (as a
symbol of scientific ideas) is a significant predictor of its citation counts in
subsequent years.
Fig. 8.1 shows two plots related to original research articles on the subject of mass extinction between 1975 and 2010. What is interesting is that the two plots appear to run in parallel with each other most of the time, until the most recent years. Is this purely coincidental? Does it also happen with other subjects?
Before we address these questions, we need to introduce some notation and terminology. We use the notation dsource → dtarget to denote the fact that a scientific publication dsource cites another scientific publication dtarget. For a given scientific publication d, its references R are the scientific publications it cites, i.e. references(d) = {ri | d → ri}, whereas its citations are the instances in which subsequent scientific publications cite d, i.e. citations(d) = {ci | ci → d}.
The two plots in Fig. 8.1 trigger a hypothesis that the number of references
made by an article is correlated with the number of citations it receives. In
other words, articles with a longer list of references appear to receive more
citations than articles with a shorter list of references.
A news article2 published on Nature News on August 13, 2010 was quick to jump to the conclusion that an easy way to boost a paper's citations is to include more references. Gregory Webster, a psychologist at the University of Florida in Gainesville, found a strong correlation between the number of references and the number of citations in 50,000 Science papers, but he made a superficial claim: “if you want to get more cited, the answer could be to cite more people.” First, the claim stretched a simple correlation into a causal relationship. Second, the claim lacked the support of a theory that
2
http://www.nature.com/news/2010/100813/full/news.2010.406.html
Fig. 8.1 A correlation between the average number of references of articles and
their average citations.
can explain why this would be the case. A few informetricians questioned the claim. One of them, Ronald Rousseau, the President of the International Society for Scientometrics and Informetrics (ISSI), put forward a conjecture: an article that deals with several topics has a higher probability of being useful and being cited than an article that is relevant to just one subfield. The number of references itself is not the cause of the relationship. Rousseau asked whether anyone could prove or disprove the conjecture.
In fact, Rousseau’s conjecture is expressed almost in a form that can be
derived from our explanatory theory of transformative discovery. According
to our theory, the brokerage mechanism is one of the key mechanisms that
can lead to transformative research. The brokerage mechanism, also known
as boundary spanning, creates unprecedented links that connect previously
disjoint topics or bodies of knowledge. A paper on potentially transformative
discovery is then likely to build conceptual bridges between multiple topics
and even distinct fields or disciplines. By doing so, it becomes natural that it
tends to cite references from multiple topics or fields and, as a result, leads
to a higher total number of references than a paper on a single topic would
cite. More significantly, due to the transformative value of the paper, it is
likely to receive more citations than less transformative papers. Therefore, we
hypothesize that it is not the total number of references that causes a higher
citation count; rather, high citations are much more likely to be caused by
the structural variation introduced by the transformative paper.
The next step is to test this hypothesis. The most straightforward strategy is to first compute structural variation metrics of scientific papers published after time t with reference to K(t), the knowledge structure up to time t, and then test to what extent such metrics predict the citations these papers obtain over the next 5 or 10 years, alongside other variables that have been commonly considered as predictors of citation counts, such as the length of a paper, the number of collaborating authors, and the number of references.
In fact, many of these variables are derivable from the central hypothesis that unusual conceptual linkages are likely to indicate transformative potential. For example, the quality of a paper may not really be related to the number of its co-authors; instead, it may be related to the number of topical areas the coauthors come from. Similarly, the quality of a paper may not be related to the number of countries of the coauthors, but, instead, to the number of distinct disciplines the coauthors belong to.
One of the essential criteria of a good theory is its coherence, that is, whether the theory can explain many seemingly different things within the same framework. In the case of our explanatory theory of transformative discovery, the theory has largely reduced the number of variables to consider, with the same underlying explanation. We introduce the rationale for the design of the metrics of structural variation in the following sections.
prior to the 2010 World Cup, most people would not think of an octopus in
any association with soccer.
Similar examples are also available in science and technology. Before the terrorist attacks of September 11, 2001, the literature on post-traumatic stress disorder (PTSD) generally focused on people who were on site when traumatic events took place. After 9/11, however, researchers realized that even people who were not physically anywhere near a trauma could still develop symptoms of PTSD due to graphic news coverage and extensive special coverage in the mass media:
1) people ∼ eyewitness/experience trauma ∼ PTSD
2) people ∼ news coverage on mass media ∼ PTSD
A search for potentially transformative research can be at least partially
fulfilled by finding scientific papers that make such structural changes to
the intellectual structure of a subject domain, e.g. PTSD in this example.
Then our hypothesis becomes that papers making this type of contribution are more likely to be cited than papers making less significant structural changes.
In the history of science, there are many examples of how new theories revolutionized the contemporary knowledge structure. For example, the 2005 Nobel Prize in medicine was awarded for the discovery of Helicobacter pylori, a bacterium that was not believed to be able to survive in the human gastric system (Chen, Chen, Horowitz, Hou, Liu, & Pellegrino, 2009). In literature-based discovery, Swanson discovered a previously unnoticed linkage between fish oil and Raynaud's syndrome (Swanson, 1986). In drug discovery, one of the major challenges is to find, in the vast chemical space, new compound structures that satisfy an array of constraints (Lipinski & Hopkins, 2004). In mapping scientific frontiers (Chen, 2003) and in studies of the science of science (Price, 1965), it would be particularly valuable if scientists, funding agencies, and policy makers had tools that could assist them in assessing the novelty of ideas in terms of their conceptual distance from contemporary domain knowledge. In these and many more scenarios, a common challenge in coping with a constantly changing environment is to estimate the extent to which the structure of a network should be updated in response to newly available information.
The metrics to be introduced below are generic and suitable for a variety
of networks. To illustrate the use of such metrics, we focus on intellectual
networks of scientific domains and show how these metrics can be used to
detect potentially significant new publications.
A document co-citation network G(V, E) can be generated from a set of scientific publications S. Each node n in V represents a scientific publication cited by a member of the given set S. An edge eij connecting nodes ni and nj represents a co-citation relationship, which means that there exists an s in S such that s cites both ni and nj. Usually an edge is weighted to reflect the relative strength of such binding: the more such co-citation instances there are, the stronger the edge between the two nodes grows. We are interested in the following question: given a new publication s′ arriving from a new set of publications S′, what structural changes does s′ introduce with regard to the network G formed prior to the arrival of s′? In other words, we seek to measure δE, the change from E to E′ in the new G(V, E′). Note that V remains constant. Here, we limit our focus to situations with a constant V. Scenarios in which V varies are more complex; they will need to be addressed once we have a better understanding of the relatively simple cases of a constant V.
In essence, we define structural change metrics in terms of δE with respect to E. The simplest metric is |ΔE|, the number of new edges introduced by the new publication s′. If the new paper s′ uniformly cites all the references in the existing network G, then s′ adds nothing new to the structure of the network, thus |ΔE| = 0. If all the references cited by the new paper s′ are already connected in E, then it adds very little to the structure of the network as far as the network topology is concerned, although it in effect reinforces a substructure of the network. If s′ adds new edges to the original network, then we are receiving new information that may potentially lead to a global change of the network.
A more sophisticated metric takes the position of each node in the network into account. For example, a metric can be defined according to the change of the centrality scores of all the nodes in the network. The node centrality of a network G(V, E), C(G), is the distribution of the centrality scores of all the nodes, <c1, c2, ..., cn>, where ci is the centrality of node ni and n is |V|, the total number of nodes. The degree of structural change can then be defined in terms of the K-L divergence between the centrality distributions before and after the links of the new paper are added; we denote this metric as Δcentrality.
The next metric, Δmodularity, is defined to measure novel associations added across aggregations of nodes. First, decompose G(V, E) into a set of clusters {Ck}; in this case, Ck is a co-citation cluster (Chen, Ibekwe-SanJuan, & Hou, 2010). Given a cluster configuration, the modularity of the network can be computed. The modularity measures whether the network can be decomposed cleanly into the given clusters. A high modularity means that the given cluster configuration divides the network into relatively independent partitions with few cross-cluster edges. In contrast, a low modularity means that the given cluster configuration cannot divide the network without many cross-cluster edges. If a new paper s′ adds an edge connecting members of the same cluster, it will have no impact on the modularity and will make no difference to the value of Δmodularity. On the other hand, if s′ adds an edge between two clusters that were previously not connected, the modularity of the new structure will be lower than that of the original structure. Δmodularity = modularity(G′)/modularity(G).
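A rough way to compute these three quantities from a baseline co-citation network and the reference list of a new paper is sketched below with networkx. It simplifies many details (unweighted edges, degree centrality as the centrality measure, and a fixed baseline cluster configuration), so it should be read as an illustration of |ΔE|, Δcentrality, and Δmodularity rather than as the actual implementation.

# Sketch of structural variation metrics for a new paper s' against a baseline
# co-citation network G. Simplified: unweighted edges, degree centrality, and a
# fixed cluster configuration; the actual metrics are richer than this.
from itertools import combinations
from networkx.algorithms.community import modularity
from scipy.stats import entropy    # computes K-L divergence

def structural_variation(G, new_refs, clusters):
    """G: a networkx Graph (baseline co-citation network); new_refs: references
    cited by the new paper that already appear in G (V is held constant);
    clusters: a partition of the nodes (list of node sets)."""
    new_refs = [r for r in new_refs if r in G]
    new_edges = [(u, v) for u, v in combinations(new_refs, 2) if not G.has_edge(u, v)]

    G2 = G.copy()
    G2.add_edges_from(new_edges)

    # |delta E|: number of edges the paper adds to the baseline structure.
    delta_E = len(new_edges)

    # delta_centrality: K-L divergence between the centrality distributions.
    nodes = list(G.nodes())
    c_old = [G.degree(n) + 1e-9 for n in nodes]    # small constant avoids log(0)
    c_new = [G2.degree(n) + 1e-9 for n in nodes]
    delta_centrality = entropy(c_new, c_old)        # normalizes the distributions

    # delta_modularity: ratio of modularities under the same cluster configuration.
    q_old = modularity(G, clusters)
    q_new = modularity(G2, clusters)
    delta_modularity = q_new / q_old if q_old else float("inf")

    return delta_E, delta_centrality, delta_modularity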
The modularity of a network is a function of a set of alternative partitions
of the network. Some partitions lead to a higher modularity, whereas others
lead to lower modularity scores. The optimal partition can be determined
based on the variation of modularity scores over different partitions of the
same network. Since the maximum modularity implies the maximum separa-
Fig. 8.2 An incoming article #3 added a new link that connects references #1 and #2. The accumulated change of modularity by #3 is 0.022% with respect to the network structure without the connections made by #3. Data source: Fish oil and Raynaud's Syndrome.
Fig. 8.3 shows that the same method is applicable to detect the novelty
of a paper with reference to a network of co-occurring terms found in previ-
ous publications on terrorism research. A 2001 paper by P. J. Maddox made
a fresh connection between terms terrorist attacks and accidental exposure,
Fig. 8.3 The novelty of a newly arrived paper can be also determined according
to the changes in modularity and centrality in a network of terms. Data source:
Terrorism Research.
able Citations. We also included two additional scores, alpha and beta, where alpha is the proportion of existing and redundant edges made by the article in question and beta is the proportion of new edges added by the article. We controlled for the effect of NR, the total number of references of the article. The dependent variable Citations is the number of citations the article had received up to 2010.
UNIANOVA Citations WITH Δmodularity Δcentrality alpha beta
/REGWGT=NR
/METHOD=SSTYPE(3)
/INTERCEPT=INCLUDE
/PRINT=PARAMETER ETASQ
/CRITERIA=ALPHA(.05)
/DESIGN=Δmodularity Δcentrality alpha beta.
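For readers who do not use SPSS, a roughly equivalent weighted model can be fitted with statsmodels in Python, as sketched below; the input file and column names are hypothetical stand-ins for the variables above.

# Approximate equivalent of the SPSS UNIANOVA model: weighted least squares
# with NR as the weight. The file name and column names are hypothetical.
import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("citing_papers.csv")
X = sm.add_constant(df[["d_modularity", "d_centrality", "alpha", "beta"]])
model = sm.WLS(df["citations"], X, weights=df["NR"]).fit()
print(model.summary())    # coefficients, p-values, R-squared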
The hypothesis was tested on papers that cite a major paper on CiteSpace (Chen, 2006) in the Web of Science, because we have extensive expertise in the relevant areas of research and could draw upon our domain knowledge in the analysis and interpretation of the results. The data source contained 76 articles, including 32 journal papers, 38 conference proceedings papers, 5 review papers, and 1 editorial. They were written by a total of 229 authors from 108 institutions. This dataset represents a total of 3,647 references. The distribution of these articles from 2006 through 2010 is: 1, 17, 16, 31, and 11. The primary subject category of these articles is information science and library science. The secondary one is computer science, information systems, and interdisciplinary applications.
The algorithm operates as follows. At any time point t in the interval
[2006, 2010], the goal is to estimate the novelty of papers published in year
t, St, according to how much structural change they introduced in comparison with the network structure up to the year before. In other words, the
algorithm constructs a network of co-cited references Gt−1 based on papers
published in the interval [2006, t − 1]. Δmodularity (s) and Δcentrality (s) are
computed for each s ∈ St . For example, for t=2009, there are 31 papers in
the incoming stream. Their novelty is computed with reference to the co-
citation network formed based on references cited by 1+17+16=34 papers
published prior to 2009. In CiteSpace, we generated networks per slice with
the top 200 most cited references in each time slice. Further investigations are
needed to find out the impact of such selection criteria on the final ranking results. Table 8.1 lists the details of the accumulative networks prior to a given time point t.

Table 8.1 The accumulative networks prior to the streaming articles.

1-year slice   Citers   Criteria   Space   Nodes   Links   Network   Size      Modularity
2006           1        top 200    19      19      171     G2006     19×19     0.0000
2007           17       top 200    338     200     2634    G2007     216×216   0.7340
2008           16       top 200    1526    200     9261    G2008     399×399   0.2268
2009           31       top 200    868     200     2432    G2009     558×558   0.3269
2010           11       top 200    475     200     2933
Takeda and Kajikawa (2010) studied the change of modularity in networks of direct citations and found that the evolution of such direct citation networks appears to have three stages: core clusters are formed first, followed by peripheral clusters, and then by the further growth of the core clusters. Takeda and Kajikawa adopted the clustering algorithm originally introduced by Mark Newman. Newman's algorithm starts with a bottom-up procedure in which individual nodes are joined together. The algorithm searches for the structure that represents the maximum modularity. Instead of searching for the maximum modularity, Takeda and Kajikawa simply kept track of the modularity at each step of the process and used this information to explore the structural change in the network. In our analysis, the network in 2006 was formed by the citation behavior of one article. By 2007, the network represented the citation trails of 18 articles with a very high network modularity of 0.7340, suggesting a relatively clear partition of the network into distinct topics. By 2008, the network grew even larger with contributions from an additional 16 articles. The modularity of the new network dropped to 0.2268, which means the overall interconnectivity of the network increased considerably. Finally, by incorporating co-cited references from another 31 articles, the network contained 558 references. Interestingly, as the network continued to grow, the modularity increased to 0.3269, suggesting that new topics were probably introduced into the network and that the boundaries between the new topics and the older ones were still recognizable.
Now let’s look at the ranking results in Table 8.2. If a paper scores high on a metric based on whether it adds novel connections between clusters in a network, it follows that the paper creates bridges, or boundary-spanning links, between previously unconnected patches of knowledge. What types of papers would be ranked high in such scenarios? Table 8.2 shows a ranked list of papers that cited (Chen, 2006) by Δmodularity, which is the first column in the table.
Table 8.2 Top-10 papers ranked by the modularity variation rate ΔQ, i.e. Δmodularity

ΔQ       ΔC      TC   NR    Author             Year   Title                                                                                             Source
4.5329   .0567   18   610   JUDIT BAR-ILAN     2008   Informetrics at the beginning of the 21st century — A review                                     J INFORMETR
2.0735   .0236   3    370   STEVEN A. MORRIS   2008   Mapping research specialties                                                                      ANNU REV INFORM SCI TECH
1.5902   .0044   3    106   CHAOMEI CHEN       2009   Towards an explanatory and computational theory of scientific discovery                          J INFORMETR
.8241    .0024   1    62    ERJIA YAN          2009   Applying Centrality Measures to Impact Analysis: A Coauthorship Network Analysis                 J AM SOC INF SCI TECHNOL
.7701    .0014   2    29    YOSHIYUKI TAKEDA   2009   Optics: a bibliometric approach to detect emerging research domains and intellectual bases       SCIENTOMETRICS
.7079    .0037   1    84    KATY BORNER        2009   Visual conceptualizations and models of science                                                  J INFORMETR
.4769    .0003   0    23    YOSHIYUKI TAKEDA   2010   Tracking modularity in citation networks                                                         SCIENTOMETRICS
.4635    .0026   1    45    YOSHIYUKI TAKEDA   2009   Nanobiotechnology as an emerging research domain from nanotechnology: A bibliometric approach    SCIENTOMETRICS
.4124    .0008   0    42    ALEKS ARIS         2009   Visual Overviews for Discovering Key Papers and Influences Across Research Fronts                J AM SOC INF SCI TECHNOL
.3574    .0012   0    33    ERJIA YAN          2009   The Use of Centrality Measures in Scientific Evaluation: A Coauthorship Network Analysis         PROC INTER CONF SCI INFOMET
Chen in Table 8.2) is the connection between Diana Crane's work on invisible colleges and an article by K. Dunbar on scientific discovery. Similarly, the co-citation added by the 4th paper in Table 8.2, written by Erjia Yan, between Freeman's paper on centrality and the h-index paper by Hirsch is among the 'unusual' ones as far as this dataset is concerned. In another example, the 2009 article by Yoshiyuki Takeda, the 5th paper in Table 8.2, co-cited Klavans 2006 and Chen 2002. In summary, our new metrics provide a holistic measure of the 'unusual' connections contributed by a new paper.
We tested our hypothesis with a univariate General Linear Model. The
results are shown in Tables 8.3 and 8.4. The model found a statistically signifi-
cant effect of ΔCentrality in predicting the number of citations (p = 0.007), but
no significant effect was found for ΔModularity . The model explained 87.5%
of the variance, thus it can be regarded as a sufficiently accurate model.
Table 8.4 indicates that the effect of the centrality divergence is practically
meaningful.
Table 8.3 Tests of Between-Subjects Effectsb. Data source: 76 papers citing (Chen, 2006).

Dependent Variable: Citations

Source            Type III Sum of Squares   df   Mean Square   F        Sig.   Partial Eta Squared
Corrected Model   112675.351a               4    28168.838     58.578   .000   .890
Intercept         2331.753                  1    2331.753      4.849    .036   .143
ΔModularity       801.177                   1    801.177       1.666    .207   .054
ΔCentrality       4098.399                  1    4098.399      8.523    .007   .227
alpha             46.711                    1    46.711        .097     .758   .003
beta              1263.181                  1    1263.181      2.627    .116   .083
Error             13945.494                 29   480.879
Total             214646.000                34
Corrected Total   126620.845                33

a. R Squared = .890 (Adjusted R Squared = .875)
b. Weighted Least Squares Regression – Weighted by NR
The results are encouraging, and we expect that these metrics can provide valuable information for analyzing the dynamics of networks and for dealing with changes and uncertainties. Next, we tested the hypothesis with a negative binomial regression.
Negative binomial regression models are frequently used in the literature for analyzing frequency data in which the mean is much smaller than the variance. Paper citations and patent citations are a typical type of count data. Various studies have used ordinary linear regression models. However, researchers have also noticed that citation data tend to have many zeros and small values. In other words, the variance in citation data is often greater than the mean. Negative binomial regression models are more appropriate for this type of data (Lee, Lee, Song, & Lee, 2007; Lokker & Walter, 2010). The negative binomial regression was specified as a generalized linear model with the modularity and centrality variation rates as covariates to predict the number of citations retrospectively.
* Generalized Linear Models.
GENLIN Citations WITH ModularityVariation CentralityVariation
/MODEL ModularityVariation CentralityVariation INTERCEPT=YES
OFFSET=Year SCALEWEIGHT=NR
DISTRIBUTION=NEGBIN(1) LINK=LOG
/CRITERIA METHOD=FISHER(1) SCALE=1 COVB=ROBUST
MAXITERATIONS=100 MAXSTEPHALVING=5 PCONVERGE=1E-006(ABSOLUTE)
SINGULAR=1E-012 ANALYSISTYPE=3(LR) CILEVEL=95 CITYPE=WALD
LIKELIHOOD=FULL
/MISSING CLASSMISSING=EXCLUDE
/PRINT CPS DESCRIPTIVES MODELINFO FIT SUMMARY SOLUTION.
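A roughly equivalent model can be specified with statsmodels in Python, as sketched below; it mirrors the GENLIN specification (log link, negative binomial with ancillary parameter 1, Year as an offset, and NR as a weight), and the input file and column names are hypothetical.

# Sketch of the negative binomial model with statsmodels, mirroring the GENLIN
# specification above. The file name and column names are hypothetical.
import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("citing_papers.csv")
X = sm.add_constant(df[["modularity_variation", "centrality_variation"]])
model = sm.GLM(df["citations"], X,
               family=sm.families.NegativeBinomial(alpha=1.0),   # log link by default
               offset=df["year"],
               var_weights=df["NR"]).fit()
print(model.summary())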
The results of statistical tests are shown in Tables 8.5∼8.7. The model
effects of both modularity and centrality variation rates are statistically sig-
nificant in predicting citation counts. In terms of parameter estimates, the
centrality variation rate is statistically significant, but the modularity varia-
tion rate is not. The result of the negative binomial regression is consistent
with the results of the UNIANOVA test.
Table 8.5 Omnibus Testa
Likelihood Ratio Chi-Square df Sig.
3892.663 2 .000
Dependent Variable: Citations
Model: (Intercept), Modularity Variation, Centrality Variation, offset = Year
a. Compares the fitted model against the intercept-only model.
Fig. 8.4 What do the two variation metrics measure? Source: 80 papers citing
(Chen, 2006).
rate. The size of a circle is proportional to the number of citations that the corresponding paper had received by 2010. As the graph shows, the two papers that were ranked consistently high by both metrics are review papers. What are these metrics really measuring? Is there a reason why a review paper would be ranked high by these variation metrics?
Recall that the modularity variation rate is designed to give higher scores to papers that add unprecedented connections between distinct modules (i.e., clusters) in a network representation of the history of a topic. A paper ranked high by modularity variation should be among the papers that cite references that have not been cited together before. A review paper obviously fits into this category. Similarly, a strong centrality variation rate means that the paper has introduced a considerable shift in the distribution of node centrality, probably by citing references in an unusual pattern or combination of patterns. Again, a review paper can fit into this category as well. Therefore, the pattern that review papers are ranked high is indeed consistent with the theoretical expectation.
The more interesting cases are the non-review papers that are ranked high. Among the top-5 papers ranked by modularity variation, #3∼#5 are original research papers. In other words, these papers are not review papers, but they stand out because they add new connections between modules that were previously not connected. According to our theory of discovery, these papers have the potential to transform the research topic. They are likely to become highly cited later on.
Another noteworthy property of the ranking method is that #3 and #5 are papers published in the current year. This shows that the method can identify papers with high modularity variation without having to wait for citations to build up. In fact, the method does not rely on any usage or evaluative data such as times downloaded, times visited, or times cited. This is one of the distinct advantages of this approach. Users can access relevant indicators of transformative potential as soon as a paper is published, or even when it is submitted for publication.
The results of the UNIANOVA test and the negative binomial regression consistently identified the centrality variation rate as a reliable predictor of the citation counts of a paper. This implies that the centrality variation rate as a novelty metric is likely to be meaningful. More thorough tests should be done with larger datasets over a longer period of time to further verify the role of both metrics. Nevertheless, in addition to scientific publications, the same method is applicable to grant proposals, patents, and other sources of information by constructing baseline network representations similar to the ones we tested with journal papers. In the next section, we apply the method in a case study of pulsar research in the first 10 years of its development.
Fig. 8.5 Hewish et al. (1968) has 472 citations as of July 5, 2010; Gold (1968) has 362.
millionth of a second per year. The ratio of a pulsar’s current speed to its
slow-down rate tells us how old it is.
The terminology also changed rapidly in the early years of pulsar research. The initial term used by the discoverers was pulsating radio sources. Of the papers that used the term, 72% were published in 1968 (18 papers). Only 3 papers used the term in 1969 (12%). After 1970, the term almost completely vanished. In comparison, the term pulsar was used in 54 papers published in 1968, in 147 papers in 1969, and in 151 papers in 1970. The word pulsar is a contraction of pulsating star.
In Meadows and O’Connor’s study of the growth of pulsar research, they
noticed a distinct initial concentration of pulsar papers in Nature, especially
within the first 6 months of the publication of the paper by Hewish et al.
Five weeks later, the first two theoretical papers on pulsars appeared, also in
Nature. Meadows and O’Connor identified this as a general tendency of the
birth of a new field: papers in a new growth area tend to concentrate in the
same journal as the original discovery paper.
The initial concentration of pulsar papers in Nature attracted 52% of the
citations to pulsar papers. The rate dropped to 40% in the first half of 1969.
Pulsar papers subsequently appeared in more and more journals. Meadows and O'Connor noticed that, within the first 18 months, it was already evident that the journals carrying pulsar papers complied with Bradford's law.
The speed of publication in the initial growth period of the pulsar research area is remarkable. The original discovery paper took two weeks from receipt to publication in Nature, and one of the first theoretical papers took only five days! The citation half-life of pulsar papers in the first two years after the discovery was 0.7 years, which means that 50% of the citations went to papers published within the previous 0.7 years. In contrast, the half-life of astronomy papers as a whole in the same time frame was 5.4 years. The short half-life
of pulsar papers in the initial years means that researchers had to seek rapid publication; otherwise, their papers would become obsolete before they were even published.
The number of citations per paper in a new area of research also conveys interesting signals. Initially, there are few papers to cite. As relevant papers appear rapidly, the number of citations increases rapidly.4 In the pulsar example, the average number of citations per paper grew from 7.1 in 1968 to 9.9 in 1969. In addition, the rate of self-citation dropped from 15% in 1968 to 10% in 1969 as the research expanded to more research groups.
Another characteristic noticed by Meadows and O'Connor is that the initial pulsar papers had a higher number of co-authors on average (2.0 authors per paper) than astronomy as a whole (1.5 authors per paper). More specifically, observational papers had a higher average (2.65 authors per paper) than theoretical papers (1.55).
The rapid increase of papers, as both citing and cited papers, leads to the expectation that the citation network must become rapidly interwoven. During the initial period of growth, what would be the prominent effects of citations on the structure of the literature? In terms of a co-citation network, if one paper's citations preserve the citation structure of the new growth area while another paper's citations drastically alter that structure, can we tell which paper is likely to be more important, purely based on how their citations influence the existing citation structure?
To illustrate the extent to which macroscopic properties of pulsar papers at earlier stages signal their subsequent impact, we use the co-citation network of 1968 pulsar papers as the baseline reference and measure the degree of structural change introduced by each new pulsar paper published in the following 5 years. Papers that induce the most profound structural change are regarded as having strong brokerage potential that could lead to fundamental changes later on. We expect the brokerage potential measure to be an important factor in explaining the actual impact observed several years later.
Pulsar papers published in a 10-year window (1968 – 1977) were collected from the Science Citation Index Expanded. Because of the change in terminology, we used a topic search for both 'pulsating radio source*' and 'pulsar*'. Each paper is ranked with a score ΔQ, the percentage change in network modularity due to its citations relative to the network structure formed by publications up to the previous year, and a score ΔC, the relative entropy of the new betweenness centrality distribution over the prior distribution. The citations of these papers are measured globally as of July 6, 2010. The search found 1,048 records. The peak of the subject lasted about 10 years, until the early 1980s.
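The centrality variation score ΔC can be sketched along similar lines. The snippet below, again only a sketch, treats the normalized betweenness centrality values of the baseline and updated networks as probability distributions and computes the relative entropy (Kullback-Leibler divergence) of the new distribution over the prior one; the smoothing constant is an assumption introduced only to keep the divergence finite.

import networkx as nx
import numpy as np

def centrality_variation(G0, G1, eps=1e-9):
    """Relative entropy D(q || p) of the betweenness centrality distribution of
    the updated network G1 (q) over that of the baseline network G0 (p)."""
    nodes = sorted(set(G0) | set(G1))
    c0 = nx.betweenness_centrality(G0)
    c1 = nx.betweenness_centrality(G1)
    p = np.array([c0.get(n, 0.0) for n in nodes]) + eps   # prior distribution
    q = np.array([c1.get(n, 0.0) for n in nodes]) + eps   # distribution after the new paper
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(q * np.log(q / p)))

# Toy example: a chain of four references; a new paper closes the loop and
# thereby flattens the betweenness centrality distribution.
G0 = nx.path_graph(["r1", "r2", "r3", "r4"])
G1 = G0.copy()
G1.add_edge("r1", "r4")
print(centrality_variation(G0, G1))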
The 11-cluster configuration is optimal based on modularity and silhou-
ette scores (Fig. 8.6). Nodes are labeled by sigma scores, which identified
Hewish 1968 and Gold 1968 as the most influential discovery papers. Cluster
4. As a more recent example, the Sloan Digital Sky Survey (SDSS) research doubles its citation and paper counts every 10 months.
Continued
Year  ΔM    ΔC      TC   NR  Article
1971  0.69  0.0012   25  33  CHIU HY, THEORY OF RADIATION MECHANISMS OF PULSARS .1., ASTROPHYS J, V163, P577
1972  0.67  0.0045   16  22  SHITOV YP, FINE-STRUCTURE OF SPECTRA OF RADIO EMISSION OF PULSARS, ASTRON ZH, V49, P470
1970  0.55  0.0009  397  42  GUNN JE, ON NATURE OF PULSARS .3. ANALYSIS OF OBSERVATIONS, ASTROPHYS J, V160, P979
1971  0.54  0.0033   31  16  HUNT GC, RATE OF CHANGE OF PERIOD OF PULSARS, MON NOTIC ROY ASTRON SOC, V153, P119
1971  0.54  0.0032   22  29  MANCHESTER RN, ROTATION MEASURE AND INTRINSIC ANGLE OF CRAB PULSAR RADIO EMISSION, NATURE-PHYS SCI, V231, P189
1970  0.44  0.0013   10  27  SMITH FG, GENERATION OF RADIO WAVES IN PULSARS, NATURE, V228, P913
The following example is based on a report we produced for the NSF CISE/
SBE Advisory Committee’s Subcommittee on Discovery in Research Portfo-
lios. The analysis was done between October 2009 and October 2010. The
subcommittee was charged with identifying and demonstrating techniques
and tools that characterize a specific set of proposal and award portfolios.
The subcommittee was asked to identify tools and approaches that are most
effective in deriving knowledge from the data provided, i.e., most robust in terms of permitting program officers to visualize, interact with, and understand the knowledge derived from the data. Subcommittee members were asked to
apply their research tools to structure, analyze, visualize, and interact with
data sets provided by the NSF.
Grant proposals submitted to the NSF consist of a number of components, including a cover page, a one-page project summary, a project description of up to 15 pages, a list of references, 2-page biographies of investigators, and budget information. The abstracts of awarded projects are publicly available on the NSF's website. In the following analysis, we distinguish two sources of data: the publicly available award abstracts and the proposal dataset that was made available to the members of the subcommittee for a limited period of time. All the results discussed in this book regarding the proposal dataset have been approved by a specific clearance procedure, which was in place to safeguard the privacy and security of the proposal dataset.
We focused on questions at two levels. At the individual proposal level,
the main questions are: What is a proposal about in a nutshell? How does
one proposal differ from other proposals in terms of their nutshell representa-
tions? At the portfolio level, the questions focus on characteristics of a group
of proposals. What are the computational indicators that may differentiate
awarded and declined proposals? What are the indicators that may identify
transformative proposals in a portfolio?
Fig. 8.8 The procedure for identifying core passages of a full-length document.
Hearst’s algorithm for text segmentation detects the subtopic shift pat-
Fig. 8.9 A prototype that helps users find the core information of a proposal.
Hot topics are defined in terms of the frequency of noun phrases found in project descriptions, project summaries, or other sources of text. Generally speaking, high-frequency noun phrases are regarded as indicators of a possible hot topic. The most valuable information about a hot topic is therefore when it becomes hot and how long it stays hot relative to other topics occurring at the same time.
There are many techniques for detecting the timing of a hot topic. In this
project, we adopt Kleinberg’s burst detection algorithm and detect when the
frequency of a noun phrase starts to jump and how long it remains high.
We use these two measures in subsequent survival analysis to differentiate
awarded and declined proposals.
Burst detection determines whether the frequency of an observed event is substantially elevated over time with respect to other events of the same type. The types of events are generic, including the appearance of a keyword in newspapers over a period of 12 months and the citations to a particular reference in papers published over the past 10 years. The data mining and knowledge discovery community has developed several burst detection algorithms. We adopt Kleinberg's algorithm for its flexibility.
Two major measures of a noun phrase's burst are considered in our work: the waiting time to burst and the duration of the burst. The waiting time to burst is the time that elapses between the initial appearance of a noun phrase in a set of proposals and the time when a burst is statistically detected. The duration of the burst is the time that elapses from the beginning of the burst until either the burst subsides or the end of the timeframe of the analysis is reached. These measures are used in the subsequent survival analysis that aims to differentiate awarded and declined proposals. They are domain-independent and do not require additional semantic input.
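As a minimal sketch of how the two measures are derived once a burst detector (for example, an implementation of Kleinberg's algorithm) has reported the burst interval of a noun phrase, consider the following; the function and field names are illustrative, and the burst detection itself is not reproduced here.

def burst_measures(first_year, burst_start, burst_end, window_end):
    """Waiting time to burst and duration of burst for one noun phrase.

    first_year  -- year the phrase first appears in the proposal set
    burst_start -- year the burst is statistically detected
    burst_end   -- year the burst subsides, or None if it is still ongoing
    window_end  -- last year covered by the analysis
    """
    waiting_time = burst_start - first_year
    duration = (burst_end if burst_end is not None else window_end) - burst_start
    return waiting_time, duration

# A phrase first seen in 2003 that bursts from 2006 and is still bursting when
# the analysis window closes in 2009: waiting time 3 years, duration 3 years.
print(burst_measures(first_year=2003, burst_start=2006, burst_end=None, window_end=2009))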
We also conducted a second preliminary test with 200 proposals (100 awarded and 100 declined) randomly sampled from 7,345 proposals of an NSF program. The core information of each proposal is represented by the
8.4 Summary
References
Chen, C. (2003). Mapping scientific frontiers: The quest for knowledge visualiza-
tion. London: Springer.
Chen, C. (2006). CiteSpace II: Detecting and visualizing emerging trends and tran-
sient patterns in scientific literature. Journal of the American Society for Infor-
mation Science and Technology, 57(3), 359-377.
Chen, C., Chen, Y., Horowitz, M., Hou, H., Liu, Z., & Pellegrino, D. (2009). To-
wards an explanatory and computational theory of scientific discovery. Journal
of Informetrics, 3(3), 191-209.
Chen, C., Ibekwe-SanJuan, F., & Hou, J. (2010). The structure and dynamics of co-citation clusters: A multiple-perspective co-citation analysis. Journal of the American Society for Information Science and Technology, 61(7), 1386-1409.
Gold, T. (1968). Rotating neutron stars as origin of pulsating radio sources. Nature,
218(5143), 731-732.
Häyrynen, M. (2007). Breakthrough research: Funding for high-risk research at the Academy of Finland. Helsinki: The Academy of Finland.
Hewish, A., Bell, S.J., Pilkington, J.D.H., Scott, P.F., & Collins, R.A. (1968).
Observation of a rapidly pulsating radio source. Nature, 217(5130), 709-713.
Lee, Y.G., Lee, J.D., Song, Y.I., & Lee, S.J. (2007). An in-depth empirical analysis
of patent citation counts using zero-inflated count data model: The case of
KIST. Scientometrics, 70(1), 27-39.
Lipinski, C., & Hopkins, A. (2004). Navigating chemical space for biology and
medicine. Nature, 432(7019), 855-861.
Lokker, C., & Walter, S.D. (2010). Prediction of citation counts: A comparison of results from alternative statistical models. Retrieved October 15, 2010, from http://www.bmj.com/content/336/7645/655/reply
Meadows, A.J., & O’Connor, J.G. (1971). Bibliographical statistics as a guide to
growth points in science. Science Studies, 1(1), 95-99.
NSF. (2007, September 25). Important notice No. 130: Transformative research. Retrieved August 14, 2010, from http://www.nsf.gov/pubs/2007/in130/in130.txt
Price, D.D. (1965). Networks of scientific papers. Science, 149, 510-515.
Swanson, D.R. (1986). Fish oil, Raynaud's syndrome, and undiscovered public knowledge. Perspectives in Biology and Medicine, 30(1), 7-18.
Takeda, Y., & Kajikawa, Y. (2010). Tracking modularity in citation networks. Sci-
entometrics, 83(3), 783-792.
Chapter 9 The Way Ahead
In this final chapter, we first summarize the key points in the previous chap-
ters and how they are connected or may influence each other. Then we iden-
tify a few theoretical and practical issues that need to be dealt with in future
studies and applications.
Before the Gathering Storm, a primary source of concern for individual scientists, their institutions, and public funding agencies was the decreasing level of public funds available to meet the increasing demand for research funding. In this context, as funding agencies become more and more stringent in selecting projects to fund, researchers respond to the declining success rate by increasing the number of their submissions. At the same time, funding agencies and institutions increasingly find themselves in a position to address accountability issues: which projects did you decide to fund but should not have, which projects did you decline but should have funded, and how do you justify the way public funds are allocated? On the one hand, taxpayers rightfully
expect that their money should be used to fund research that is likely to
benefit society. On the other hand, it is also known that applied science and technological innovations are built on basic research, and that the societal implications of basic research are not always clear; in fact, they are almost never clear at the outset. We cannot demand eggs while rejecting the responsibility for feeding the hens. How do we resolve this dilemma with limited resources?
The Gathering Storm and related debates are not only pressing on these
issues even harder but also drawing our attention to more profound questions.
What is the key to maintaining and sharpening the competitive edge of
a nation? To address these questions, we narrow down from the funding
crunch to the nature of creativity and the role of individual scientists in
sustaining the leading position of a nation in science and technology. The
Yuasa phenomenon is a macroscopic pattern, but it may have microscopic
explanations. The average age of scientific productivity offers one possible
explanation of why the world center of scientific activity may move or stay.
It hints at the potential role of scientific creativity, but like many macroscopically focused approaches, theories along this line do not offer much constructive guidance; there is not much we can do about our age. What actions can we take beyond age? What is it in the ways of our thinking that we can consciously enhance so that we can not only come up with more original and creative ideas, but also find creative solutions to hard problems?
Chapter 3 is concerned with biases and pitfalls that one may encounter and has to cope with along the way of searching for creative ideas and recognizing them. Our mental models, our perspectives, and our working theories are not only simplified but also biased representations of the world. The same evidence can be used by different parties for their own purposes. As shown in the examples of the September 11 terrorist attacks and Iraqi WMDs, data do not speak for themselves; theories and models do!
Rejecting future Nobel Prize-worthy papers raises more questions about how the transformative potential of research is recognized at various stages, from early grant proposals and intermediate publications to widely recognized impacts on society. Experts who propose new research projects can be overly optimistic, whereas experts who serve as peer reviewers may have legitimate reasons to reject premature ideas and poorly articulated research plans. On the other hand, from a social and community point of view, peer-review experts technically do have a conflict of interest: they are competing peers.
Conflicts of interest aside, how hard is it to recognize the potential of a re-
search topic? In Chapter 4, we have looked at both hindsight and foresight of
scientific breakthroughs. Project Hindsight and TRACES looked retrospectively into the past and provided many lessons. If Project Hindsight focused more closely on the selective retention phase of creativity, TRACES emphasized the blind variation phase more. Lessons learned from TRACES show that technical innovations are preceded by years of mission-oriented critical events, which are in turn preceded by even longer periods of non-mission research. It is particularly hard to justify the potential value of non-mission research to society.
Early warning signs might be available and detectable as a complex sys-
tem is about to experience a phase transition, or a transformative change,
although some changes take place without any early signs or clues at all. Early
warning signs serve as navigational cues for navigators in the vast space of
the unknown. Although the optimal foraging theory is not introduced until
Chapter 5, the presence or absence of early signs will make qualitative differ-
ences, as they tip the balance between perceived risks and rewards. A change in the ratio of perceived risks to rewards will alter the diffusion and feedback within the system. Self-reinforcing feedback will cascade the initial impact and accelerate the transformation of the system.
The reflection on the history and findings of foresight activities reinforces the theme of the chapter that human cognition is biased at both individual and collective scales. Recent assessments of the accuracy of forecasts made by earlier foresight surveys indicated that experts tend to be overly optimistic. Although researchers offer explanations of why experts make overly optimistic predictions, the practical implications of this tendency for foresight-seeking activities as a whole are still not clear. Soliciting stakeholders' judgments on the attractiveness of research topics is a move toward expanding the scope of the social contract between science and society. Attractiveness ratings from stakeholders provide the best justification of the social value, or at least the potential social value, of research topics. The downside is that such attractiveness rating schemes are intrinsically limited to the mission-oriented and development phases of scientific and technological breakthroughs, as demonstrated by the TRACES study; they will not be reliable, or even feasible, for judging non-mission science.
Assessments of foresight activities so far have generally missed the broader issue: to what extent were high-impact scientific breakthroughs ever identified as priority areas by foresight-seeking activities? If the NSF, the DoD, or the Office of Science and Technology were to commission another Project Hindsight, TRACES, or Hindsight on Foresight today, how many transformative discoveries achieved today were given priority based on expert consensus 20, 30, or 50 years ago? What were the signs that led experts to get their feasibility ratings right? Who were the visionary users back then who could identify the attractiveness of transformative discoveries before their conception?
9.4 Foraging
Chapter 7 focuses on temporal patterns and variations in text. The first part
of the chapter gives an example of distinguishing conflicting opinions, namely positive and negative reviews written by Amazon customers on the bestseller The Da Vinci Code. The second part introduces a method designed for extracting
patterns from unstructured text without relying on any predefined ontology
or taxonomy. The method is designed based on an observation that the more
important a topic is to an author, the more language variations are likely
to be used to describe the topic. This phenomenon can be found not only
at a document level, but also at a cultural level. For example, the Chinese language has a much richer vocabulary for describing relatives than languages such as English do. In Chinese, there is one specific word for an elder brother and another for a younger brother, whereas English needs a combination of two words to express the same meaning. This particularly refined vocabulary reflects the significance of these distinctions in Chinese culture. By the same token, in the Origin of Species Darwin wrote many
things about species, forms, plants, and differences. These patterns emerged
from text expressed in a natural language.
The design of the method is in fact more ambitious. Its ultimate goal is to
provide a baseline representation of the current scientific knowledge. Concepts
can be identified naturalistically from natural language passages. Relations and predicates can be identified in the basic subject-verb-object form. The known and the unknown can be represented as assertions, claims, and hypotheses associated with available evidence and a level of uncertainty. Newly proposed scientific ideas can then be compared against this master representation of knowledge, and their novelty can be derived and inferred.
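As a minimal sketch of the subject-verb-object idea, assuming spaCy and its small English model (en_core_web_sm) rather than any tool described in this book, one can read simple predicates off a dependency parse:

import spacy

nlp = spacy.load("en_core_web_sm")

def svo_triples(text):
    """Yield (subject, verb, object) triples found in the text."""
    doc = nlp(text)
    for token in doc:
        if token.pos_ != "VERB":
            continue
        subjects = [w for w in token.lefts if w.dep_ in ("nsubj", "nsubjpass")]
        objects = [w for w in token.rights if w.dep_ in ("dobj", "attr")]
        for s in subjects:
            for o in objects:
                yield (s.text, token.lemma_, o.text)

print(list(svo_triples("Fish oil reduces blood viscosity.")))
# Expected output along the lines of [('oil', 'reduce', 'viscosity')]

Triples extracted in this way could then be matched against a baseline representation of what the literature already claims, which is the comparison the paragraph above envisions.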
The third part is about detecting the burstness of topical patterns. Burst
detection aims to identify the intensity and duration of an elevated level of
activities. For example, a citation burst is defined as a period of time in which citations to a paper exceed a given threshold or a probabilistically defined transition rate. The burst of occurrences of a word can be similarly defined as a period of time in which the frequency of the word is exceedingly high with regard to other words.
The fourth part on survival analysis is relatively new. Although survival
analysis as a statistical method is widely used, the combination of burst de-
tection and survival analysis is novel. Survival analysis enables us to compare
two or more groups in terms of their temporal patterns. In particular, survival analysis allows us to address questions such as: between highly cited and less highly cited groups of publications on a topic, which group is more likely to contain topics that burst sooner? Which group is more likely to contain topics that sustain their bursts longer?
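A minimal sketch of this combination, assuming the lifelines package and illustrative column names (the data below are made up solely to show the shape of the analysis), could look as follows: each topic contributes its waiting time to burst, an indicator of whether a burst was observed before the analysis window closed, and its group label.

import pandas as pd
from lifelines import KaplanMeierFitter
from lifelines.statistics import logrank_test

df = pd.DataFrame({
    "wait":     [1, 2, 2, 3, 5, 1, 4, 6, 6, 7],   # years until the topic bursts
    "observed": [1, 1, 1, 1, 0, 1, 1, 1, 0, 0],   # 0 = censored (no burst before the window closed)
    "group":    ["high"] * 5 + ["low"] * 5,        # highly cited vs. less highly cited publications
})

kmf = KaplanMeierFitter()
for name, g in df.groupby("group"):
    kmf.fit(g["wait"], event_observed=g["observed"], label=name)
    print(name, kmf.median_survival_time_)         # typical waiting time to burst in each group

high, low = df[df.group == "high"], df[df.group == "low"]
test = logrank_test(high["wait"], low["wait"], high["observed"], low["observed"])
print(test.p_value)                                # do the two groups differ in how soon topics burst?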
9.8 Recommendations
Several lessons learned are particularly worth noting, along with a few recommendations for individual researchers, students, science policy makers, and funding agencies.
First, self-assessment and the courage to face long-term challenges by opening debates such as the Gathering Storm are essential to sustaining the leading position of a nation in science and technology, as well as in the economic, political, and cultural sectors.
Second, foresight-seeking activities need longitudinal follow-up assessments. Retrospective assessments should pay close attention not only to how previously identified priority areas evolved, but also to the scientific breakthroughs that emerged in the same timeframe as a whole, regardless of whether they were ever identified as strategic priority areas. More TRACES-style studies should be commissioned by funding agencies, independently and jointly, so that critical events at various stages of the development of transformative science and technology can be closely tracked, understood, and disseminated.
Third, biases and pitfalls in human cognition and decision making should
be studied systematically in connection with generic and specific mechanisms
for divergent thinking and problem solving.
Fourth, the theory of transformative discovery based on foraging and brokerage mechanisms is valuable because it reduces a large number of possible factors to fewer, more fundamental ones. There are certainly types of discoveries that are beyond the reach of the theory. It is therefore important to identify other mechanisms that could explain other types of discoveries.
Fifth, quantitative and visual analytic methods are becoming increasingly capable of tracking the evolution of the intellectual dynamics of a scientific domain. More theories should be developed to guide the design and use of such tools.
The most important message of the book is twofold:
• Creativity often arises from carefully considering conflicting conceptual-
izations.
• Creativity can be cultivated and enhanced with a better understanding
of generic mechanisms and potential early signs as well as an improved
awareness of biases, pitfalls, and cognitive traps.
Creativity is the ability and willingness to embrace the unknown with an
open mind!
Index
O
Oklahoma City bombing, 95
optimal age, 8, 31
optimal foraging theory, 87, 256
optimization, 156
originality, 23, 24

P
PageRank, 246
paradigm shift, 114, 208
part-of-speech, 191, 247
partition, 156
patents, 199

Q
quantitative studies of science, 8
quantum mechanics, 130
quasars, 237

R
radical change, 65
random walk, 106
reasoning, 60
recovery rate, 65
regular expression, 191, 247
relative entropy, 93, 239
Fig. 1.5 114,996 influenza virus protein sequences. Source: (Pellegrino & Chen,
2011)
(Fig. 1.6 image: legible cluster labels include transforming ..., everyday social activity coordination, understanding social behavior, exceptional case, and cultural heritage application.)
Fig.1.6 A network of 682 co-occurring terms generated from 63 NSF IIS EAGER
projects awarded in 2009 (cyan) and 2010 (yellow). Q = 0.8565, Mean silhouette =
0.9397. Links = 22347.
Fig.5.1 The three clusters of co-cited papers can be seen as three patches of infor-
mation. All three patches are about terrorism research. Prominently labeled papers
in each patch offer information scent of the patch. Colors of patches, indicating the
time of a connection, provide a scent of freshness. The sizes of citation rings provide
a scent of citation popularity. Source: (Chen, 2008).
Fig.5.3 Symmetric relative entropy matrix shows the divergence between the
overall use of terms across different years. The recent few years are most similar
to each other. The boundaries between areas in different colors indicate significant
changes of underlying topics. Source: (Chen, 2008).
(Figure image: six network snapshots labeled 1981-1985 (N=210, E=2038), 1986-1990 (N=261, E=3815), 1991-1995 (N=228, E=3940), 1996-2000 (N=209, E=1993), 2001-2005 (N=140, E=1045), and 2006-2007 (N=156, E=1860); prominently labeled references include WARREN JR, 1983, LANCET; MARSHALL BJ, 1984, LANCET; and PARSONNET J, 1991, NEW ENGL J MED.)
Fig. 5.7 A co-citation network of references cited between 1981 and 2007 in peptic
ulcer research. Source: (Chen et aI., 2009).
(Fig. 5.8 image: the labeled references are 1. Capecchi MR, 1989, Science, V244, P1288; 2. Mansour SL, 1988, Nature, V336, P348; 3. Thomas KR, 1987, Cell, V51, P503.)
Fig.5.8 A co-citation network of references cited between 1985 and 2007 in gene
targeting research. References with the strongest betweenness centrality scores are
labeled. The burst periods of their citations are shown as the thickened curves in
the three diagrams to the left. Source: (Chen et al., 2009).
Fig.5.10 A diffusion map of gene targeting research between 1985 and 2007.
Selection criteria are at least 15 citations for citing articles and top 30 cited articles
per time slice. Polygons represent clusters of co-cited papers. Each cluster is labeled
by title phrases selected from papers citing the cluster. Red lines depict co-citations
made in the current year. The concentrations of red lines track the context in which
co-citation clusters are referenced. Source: (Chen et al., 2009).
Fig. 5.11 A co-citation network of references cited between 1990 and 2003 in string theory. Polchinski-1995 marked the beginning of the second string theory revolution. Maldacena-1998 is highly transformative and serves as a brokerage link between string theory and particle theories. The three embedded plots show the burst periods of citations of Witten-1991, Maldacena-1998, and Polchinski-1995. Source: (Chen et al., 2009).
(Figure image: a string theory co-citation network annotated with turning points and pivot points over the time slices 1985-1987, 1988-1990, 1991-1993, 1994-1996, 1997-1999, 2000-2002, and 2003; labeled references include HAWKING SW 1975 V43 P199, CANDELAS P 1985 V258 P46, DINE M 1985 V156 P55, FRIEDAN D 1986 V271 P93, WITTEN E 1986 V268 P253, WITTEN E 1991 V44 P314, and POLCHINSKI J 1995 V75.)