Application of bibliometric analysis: Advantages & pitfalls

Thed van Leeuwen

Workshop on Research Evaluation in Statistical Sciences, Bologna, 25th March 2010


Introduction to bibliometrics
• Bibliometrics can be defined as the quantitative analysis of science and technology performance and of the cognitive and organizational structure of science and technology.

• The basis for these analyses is the scientific communication between scientists through (mainly) journal publications.

• Key concepts in bibliometrics are output and impact, as measured through publications and citations.

• An important starting point in bibliometrics: through citations in their scientific publications, scientists express a certain degree of influence of others on their own work.

• Through large-scale quantification, citations indicate influence or (inter)national visibility of scientific activity, but they should not be interpreted as a synonym for ‘quality’.
CWTS data system
• CWTS has a full bibliometric license from Thomson
Reuters Scientific to conduct evaluation studies
using the Web of Science.

• Our database covers the period 1981-2009.

• Some characteristics:
– Over 31,000,000 publications.
– Over 350,000,000 citation relations between source papers.
– 100,000,000 authors (incl. variations), 15,000,000 ‘unique’ names.
– Over 60,000,000 addresses, some 90% cleaned up over the last 10 years.
– Contains reference sets for journal and field citation data.
Bibliometric indicators produced by CWTS
Some basic indicators are …

• P: number of publications in journals processed for the Web of Science.
• C: number of received citations, excl. self-citations.
• CPP: mean number of citations per publication, excl. self-citations.
• Pnc: percentage of the publications not cited (within a certain time-frame!).
• % SC: percentage of self-citations related to an output set.
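The basic indicators listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the CWTS implementation: the `Publication` record and the example counts are hypothetical, Pnc is assumed to count publications without any non-self citation, and % SC is assumed to be the share of self-citations in all citations received by the set.

```python
from dataclasses import dataclass


@dataclass
class Publication:
    """Hypothetical record for one publication in an output set."""
    citations: int        # all citations received within the chosen window
    self_citations: int   # the part of those citations that are self-citations


def basic_indicators(pubs):
    """Return P, C, CPP, Pnc and % SC for a non-empty list of publications."""
    if not pubs:
        raise ValueError("the output set must contain at least one publication")
    p = len(pubs)
    total = sum(pub.citations for pub in pubs)
    self_total = sum(pub.self_citations for pub in pubs)
    c = total - self_total                      # citations excl. self-citations
    cpp = c / p                                 # mean citations per publication
    # Assumption: "not cited" means no citations left after removing self-citations.
    uncited = sum(1 for pub in pubs if pub.citations - pub.self_citations == 0)
    pnc = 100 * uncited / p
    # Assumption: % SC is the share of self-citations in all citations received.
    pct_sc = 100 * self_total / total if total else 0.0
    return {"P": p, "C": c, "CPP": cpp, "Pnc": pnc, "%SC": pct_sc}


# Hypothetical output set of four publications.
example = [Publication(10, 2), Publication(0, 0), Publication(5, 5), Publication(3, 0)]
result = basic_indicators(example)
print(result)
```

Because Pnc depends on the time-frame, the citation counts passed in should already be restricted to the citation window being studied.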
Important indicators are …

• CPP/JCSm: ratio between the actual impact and the mean journal impact.
• CPP/FCSm: ratio between the actual impact and the mean field impact.
• JCSm/FCSm: ratio between the journal impact and the field impact, indicative of the ‘quality’ of the journal package in the field.
Various types of analysis focus on …
• Research profiles: a breakdown of the output over various fields of science.
• Scientific cooperation analysis: a breakdown of the output over various types of scientific collaboration.
• Knowledge user analysis: a breakdown of the ‘responding’ output into citing fields, countries or institutions.
• Highly cited paper analysis: which publications are among the most highly cited output (top 10%, 5%, 1%) of the global literature in the same field(s).
• Social network analysis: how the network of partners is composed, based on scientific cooperation.
Journal & Field Normalization
Calculating the JCSm & FCSm
----------------------------------------------------------------------------------------------
      Type      Publ. year   Journal      Journal category   # citations until 1999
----------------------------------------------------------------------------------------------
I     review    1996         CANCER RES   Oncology           17
II    note      1997         J CLIN END   Endocrinology      4
III   article   1999         J CLIN END   Endocrinology      6
IV    article   1999         J CLIN END   Endocrinology      8
----------------------------------------------------------------------------------------------
Calculating the JCSm & FCSm 2
-----------------------------------------------------------------
      CPP     JCS     FCS
-----------------------------------------------------------------
I     17      16.9    23.7
II    4       3.1     3.0
III   6       4.8     4.1
IV    8       4.8     4.1
-----------------------------------------------------------------
Calculating the JCSm & FCSm 3
The mean citation score is determined as:

CPP = (17 + 4 + 6 + 8) / (1 + 1 + 1 + 1) = 8.8

The mean journal citation score as:

JCSm = ((1 x 16.9) + (1 x 3.1) + (2 x 4.8)) / (1 + 1 + 2) = 7.4

The mean field citation score as:

FCSm = ((1 x 23.7) + (1 x 3.0) + (2 x 4.1)) / (1 + 1 + 2) = 8.7

CPP / JCSm = 8.8 / 7.4 = 1.19
CPP / FCSm = 8.8 / 8.7 = 1.01
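The worked example for the four papers can be checked with a short Python sketch. The citation counts and the JCS and FCS values are taken from the two tables; the variable names are illustrative only. Note that the slide divides the rounded means (8.8 / 7.4 and 8.8 / 8.7), so the unrounded ratios printed here differ slightly in the second decimal.

```python
# One row per paper: (label, citations, JCS, FCS), values as given in the tables.
papers = [
    ("I", 17, 16.9, 23.7),
    ("II", 4, 3.1, 3.0),
    ("III", 6, 4.8, 4.1),
    ("IV", 8, 4.8, 4.1),
]

n = len(papers)
cpp = sum(citations for _, citations, _, _ in papers) / n   # mean citation score
jcsm = sum(jcs for _, _, jcs, _ in papers) / n              # mean journal citation score
fcsm = sum(fcs for _, _, _, fcs in papers) / n              # mean field citation score

print(f"CPP  = {cpp:.2f}")
print(f"JCSm = {jcsm:.2f}")
print(f"FCSm = {fcsm:.2f}")
print(f"CPP/JCSm  = {cpp / jcsm:.2f}")
print(f"CPP/FCSm  = {cpp / fcsm:.2f}")
print(f"JCSm/FCSm = {jcsm / fcsm:.2f}")
```

Averaging JCS and FCS per paper is the same as weighting each journal or field value by the number of papers that share it, which is how the formulas are written on the slide.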
Citation Windows & Impact Measurement
Citation measurement and ‘windows’

• Publication years with a fixed citation ‘window’.
Publications of 2002, with three citation years (namely 2002, 2003, and 2004), followed by publications of 2003, with three years, etc.

• Blocks of publication years with a window decreasing in length.
Publications of 2002-2005, with a citation window of 4 years (2002-2005), 3 years (2003-2005), 2 years (2004-2005), and 1 year (2005).
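The two window schemes can be made concrete with a small Python sketch. The function names are hypothetical, and cutting the fixed window off at the last year available (2009 here, the end of the database period mentioned earlier) is an assumption for illustration.

```python
def fixed_window(pub_year, length=3, last_year=2009):
    """Citation years for one publication year with a fixed window.

    Assumption: the window is cut off at the last year with citation data.
    """
    return [year for year in range(pub_year, pub_year + length) if year <= last_year]


def year_block(first_pub_year, last_pub_year):
    """Citation years per publication year within one block.

    Every publication year is followed up to the end of the block, so the
    window shrinks by one year for each later publication year.
    """
    return {
        pub_year: list(range(pub_year, last_pub_year + 1))
        for pub_year in range(first_pub_year, last_pub_year + 1)
    }


print(fixed_window(2002))
for pub_year, citation_years in year_block(2002, 2005).items():
    print(pub_year, citation_years)
```

The fixed window keeps publication years comparable with each other, while the block scheme uses all citation data available inside the block at the cost of unequal window lengths.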
Citation measurement with ‘fixed window’
[Figure: publication years (rows) against citation years 2002-2009 (columns); each publication year is followed for a fixed window of three citation years, e.g. publications of 2002 cited in 2002, 2003 and 2004.]
Citation measurement with ‘year blocks’
[Figure: blocks of publication years (rows) against citation years 2002-2009 (columns); within each block the citation window decreases in length for later publication years.]
Methodological issues
Adequacy of citation indexes: implications for bibliometric studies
How to tackle this issue?

• We conduct analyses of the adequacy of the citation indexes across disciplines, based on the reference behavior of researchers themselves.

• The degree to which researchers refer to other indexed literature indicates the importance of journal literature in the scientific communication process.
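The idea of measuring coverage from reference behavior can be illustrated with a minimal Python sketch. It assumes that each cited reference has already been matched to a source title and that a set of indexed source titles is available; the function name, the source titles and the data are all hypothetical.

```python
def indexed_reference_share(cited_sources, indexed_sources):
    """Percentage of references that point to sources covered by the index."""
    if not cited_sources:
        raise ValueError("at least one reference is needed")
    hits = sum(1 for source in cited_sources if source in indexed_sources)
    return 100 * hits / len(cited_sources)


# Hypothetical reference list of one field and hypothetical index coverage.
indexed = {"JOURNAL A", "JOURNAL B", "JOURNAL C"}
references = ["JOURNAL A", "JOURNAL B", "SOME BOOK", "JOURNAL A"]

share = indexed_reference_share(references, indexed)
print(f"{share:.0f}% of the references go to indexed sources")
```

A low share suggests that much of the communication in a field runs through books, proceedings or reports that the index does not capture.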
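Assessment of WoS Coverage
[Diagram: references from citing/source WoS publications go to cited/target items that are either in WoS (?%) or outside WoS (?%); non-WoS targets include non-WoS journals, books, conference proceedings, reports, etc.]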
Total ISI/WoS Database (2002)
[Diagram: of the references in citing/source WoS publications, 25% go to non-WoS targets and 75% go to WoS targets.]
The medical & life sciences
[Figure: stacked bars (0-100%) showing the share of references to ISI versus non-ISI sources in 1991, 1996, 2001 and 2006, for: Agriculture and Food Science; Basic Life Sciences; Basic Medical Sciences; Biological Sciences; Biomedical Sciences; Clinical Medicine; Health Sciences.]
The natural sciences
[Figure: stacked bars (0-100%) showing the share of references to ISI versus non-ISI sources in 1991, 1996, 2001 and 2006, for: Astronomy and Astrophysics; Chemistry and Chemical Engineering; Computer Sciences; Earth Sciences and Technology; Environmental Sciences and Technology; Mathematics; Physics and Materials Science; Statistical Sciences.]
Statistical sciences
[Figure: horizontal stacked bars (0-100%) showing the share of references to ISI versus non-ISI sources in 1991, 1996, 2001 and 2006.]
The engineering sciences
[Figure: stacked bars (0-100%) showing the share of references to ISI versus non-ISI sources in 1991, 1996, 2001 and 2006, for: Civil Engineering and Construction; Electrical Engineering and Telecommunication; Energy Science and Technology; General and Industrial Engineering; Instruments and Instrumentation; Mechanical Engineering and Aerospace.]
The social and behavioral sciences
[Figure: stacked bars (0-100%) showing the share of references to ISI versus non-ISI sources in 1991, 1996, 2001 and 2006, for: Economics and Business; Educational Sciences; Management and Planning; Political Science and Public Administration; Psychology; Social and Behavioral Sciences, Interdisciplinary; Sociology and Anthropology.]
The humanities
[Figure: stacked bars (0-100%) showing the share of references to ISI versus non-ISI sources in 1991, 1996, 2001 and 2006, for: Information and Communication Sciences; Language and Linguistics; Creative Arts, Culture and Music; History, Philosophy and Religion; Law and Criminology; Literature.]
Overall WoS coverage by main field
• EXCELLENT (> 80%): Biochem & Mol Biol; Biol Sci – Humans; Chemistry; Clin Medicine; Phys & Astron
• VERY GOOD (60-80%): Appl Phys & Chem; Biol Sci – Anim & Plants; Psychol & Psychiat; Geosciences; Soc Sci ~ Medicine
• GOOD (40-60%): Mathematics & Statistical sciences; Economics; Engineering
• MODERATE (< 40%): Other Soc Sci; Humanities & Arts
Conclusions on adequacy issue

• We can clearly conclude that the application of bibliometric techniques based solely on WoS (and very likely also on Scopus) will not be valid for some of the ‘soft’ fields in the social sciences and the humanities.

• That is why the tool box has to be extended!


The H-Index and its limitations
The H-Index, defined as …

• With the publications in a set ranked by descending number of received citations, the H-Index is the highest ranking position at which the number of received citations is still equal to or larger than that ranking position.

• Idea of an American physicist, J. Hirsch, who published about this index in the Proc. NAS USA.
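The definition can be turned into a few lines of Python. This is a minimal sketch of the usual reading of the H-Index (the largest h such that h publications have at least h citations each); the function name and the example citation counts are hypothetical.

```python
def h_index(citations):
    """Largest h such that h publications have at least h citations each."""
    ranked = sorted(citations, reverse=True)   # most cited publication first
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank       # the publication at this rank still has enough citations
        else:
            break          # all later publications have even fewer citations
    return h


print(h_index([10, 8, 5, 4, 3]))
```

In the example, the fourth-ranked paper has 4 citations and the fifth has only 3, so the function returns 4.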
Examples of Hirsch-index values
• Environmental biologist, output of 188 papers, cited 4,788 times in the period 80-04.
• Hirsch-index value of 31.
[Figure: Citations (0-350) against Publications (0-200); value of H-Index = 31.]

• Clinical psychologist, output of 72 papers, cited 760 times in the period 80-04.
• Hirsch-index value of 14.
[Figure: Citations (0-80) against Publications (0-80); value of H-Index = 14.]
Problems with the H-Index
• For serious evaluation of scientific performance, the H-Index is not suitable as an indicator, as the index:
– Is insensitive to field-specific characteristics (e.g., differences in citation cultures between medicine and other disciplines).
– Does not take into account the age and career length of scientists; a small oeuvre necessarily leads to a low H-Index value.
– Is inconsistent in its ‘behaviour’.
• Actual versus field-normalized impact (CPP/FCSm) displayed against the output.
• Large output can be combined with a relatively low impact.
[Figure: scatter plot of CPP/FCSm (0.00-7.00) against total publications (0-250); points labelled by field: Phy, Med, Soc, Psy, Eng, Env, Che, Bio, Hum, Mat.]
• H-Index displayed against the output.
• Larger output is strongly correlated with a high H-Index value.
[Figure: scatter plot of H-index (0-60) against total publications (0-250); points labelled by field: Med, Bio, Phy, Env, Psy, Che, Eng, Soc, Hum, Mat.]
Consistency: Definition

Definition. A scientific performance measure is said to be consistent if and only if for any two actors A and B and for any number n ≥ 0 the ranking of A and B given by the performance measure does not change when A and B both have a new publication with n citations.
Consistency: Motivation

• Consistency ensures that if the publishing behavior of two actors does not change over time, their ranking relative to each other also does not change.
• Consistency ensures that if the individual researchers in one research group X outperform the individual researchers in another research group Y, the former research group X as a whole outperforms the latter research group Y.
Inconsistency of the h-index
[Figure: citations (0-9) against publications (0-12) for Actor A and Actor B. Upper panels: Actor A h=4, Actor B h=6. Lower panels: Actor A h=8, Actor B h=6.]
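The reversal of the ranking can be reproduced with a small Python sketch. The citation counts below are a constructed example chosen to match the h values in the figure (4 and 6 before, 8 and 6 after); they are an assumption, not data read from the slide.

```python
def h_index(citations):
    """Largest h such that h publications have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    return sum(1 for rank, count in enumerate(ranked, start=1) if count >= rank)


# Constructed example (assumption): A has few, well-cited papers; B has more papers.
actor_a = [8, 8, 8, 8]            # 4 publications with 8 citations each
actor_b = [6, 6, 6, 6, 6, 6]      # 6 publications with 6 citations each

# Both actors publish the same new work: 4 publications with 8 citations each.
new_papers = [8, 8, 8, 8]

before = (h_index(actor_a), h_index(actor_b))
after = (h_index(actor_a + new_papers), h_index(actor_b + new_papers))

print("before:", before)   # B ranks above A
print("after: ", after)    # A ranks above B
```

Both actors add identical publications, yet their order under the h-index flips, which is exactly what the consistency definition rules out.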
ISI Impact Factors: calculation and validity
Methodology: ISI’s classical IF

• The ISI Impact Factor (IF) is defined as the number of citations received by a journal in year t, divided by the number of citeable documents in that same journal in the years t-1 and t-2.

• Or, as a formula:

IF = Citations in year t / Number of ‘citeable documents’ in t-1 & t-2
Share ‘citations-for-free’ for The Lancet

                 Publications   Citations
                 90+91          1992
Article          784            2986
Note             144            593
Review           29             232
Sub-total        957 (a)        7959 (b)
Letter           4181 (d)       4264 (e)
Editorial        1313           905
Other            1421           909
Total            7872           14037 (c)

• ISI Method:
Citations in 2000 / Citeable documents in ‘98 and ‘99 = 14037 (c) / 957 (a), IF = 14.7

• CWTS Method:
Citations to Art/Not/Rev in 2000 / Art/Not/Rev in ‘98 and ‘99 = 7959 (b) / 957 (a), IF = 8.3

Citations to Art/Let/Not/Rev in 2000 / Art/Let/Not/Rev in ‘98 and ‘99 = (7959 + 4264) (b+e) / (957 + 4181) (a+d), IF = 2.4
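The three impact factor variants for The Lancet can be recomputed from the labelled totals (a) to (e). This is a minimal Python sketch using only the numbers on the slide; the variable names are illustrative.

```python
# Labelled totals from the slide.
a = 957      # articles, notes and reviews (the 'citeable documents')
b = 7959     # citations to articles, notes and reviews
c = 14037    # citations to all document types
d = 4181     # letters
e = 4264     # citations to letters

if_isi = c / a                        # all citations over citeable documents only
if_cwts = b / a                       # citations and documents of the same types
if_with_letters = (b + e) / (a + d)   # letters added to numerator and denominator

print(f"ISI method:            {if_isi:.1f}")
print(f"CWTS method:           {if_cwts:.1f}")
print(f"CWTS method + letters: {if_with_letters:.1f}")
```

The gap between the first two values comes from citations to letters, editorials and other items that are counted in the numerator while those documents are left out of the denominator.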
ISI Impact Factors

• From 1995 onwards CWTS has analyzed the uses and validity of the ISI Journal Impact Factor (IF).
• Most important points of criticism were:

– Calculated erroneously.
– Not sensitive to the composition of the journal in terms of document types.
– Not sensitive to the science fields a journal is attached to …
– Based on too short ‘citation windows’.
Distribution of citations used for the calculation of the IF value of The Lancet

• The red area indicates citations ‘for free’, while the blue area indicates ‘correct citations’.
• The IF-score of The Lancet is seriously ‘overrated’ by the scientific ‘audience’ of the journal.
[Figure: stacked area chart (0-100%) over the years 82-98.]
Impact Factors for Br. J. Clin. Pharm. and Clin. Pharm. & Ther.

• The graph shows the correct and erroneous impact factors of BJCP and CPT.
• In the case of CPT, citations to published meeting abstracts are included, while BJCP has stopped publishing meeting abstracts!
[Figure: line chart of impact factor (0.00-4.50) over the years 83-98, with series CPT Err IF, CPT IF, BJCP Err IF and BJCP IF.]
Document types and fields

Field                      Journal                         IF      Rank   JFIS   Rank
IMMUNOLOGY                 ANN REV IMMUNOL                 50.49   1      5.18   1
BIOCHEM & MOLECULAR BIOL   ANN REV BIOCHEM                 34.61   1      4.10   3
PHARMACOL & PHARMACY       PHARMACOLOGICAL REV             27.74   1      4.75   1
CELL BIOL                  ANN REV CELL & DEVELOPM BIOL    27.53   1      1.72   13
DEVELOPMENTAL BIOL         ANN REV CELL & DEVELOPM BIOL    27.53   1      1.72   3
PHYSIOLOGY                 PHYSIOLOGICAL REV               24.82   1      3.18   1
CELL BIOLOGY               NATURE REV MOL CELL BIOL        22.21   4      2.76   8
ENDOCRINOL & METABOLISM    ENDOCRINE REV                   21.98   1      2.87   1
NEUROSCIENCES              ANN REV NEUROSCIENCE            21.89   1      3.12   4
PHYSICS                    REV MODERN PHYSICS              20.14   1      5.02   1
CHEMISTRY                  CHEMICAL REV                    19.67   1      2.89   2

The IF is for ‘02, JFIS covers ‘98-‘02
Fields and Citation windows
[Figure: horizontal bar chart (axis from 0 to 4.5) with one bar per journal category, grouped by main field; the number in brackets follows each category label.]
Chemistry: POLYMER SCIENCE (55); CHEM, APPLIED (25); CHEM, CLIN&MEDIC (8); CHEM, PHYSICAL (78); CRYSTALLOGRAPHY (18); ELECTROCHEMISTRY (10); CHEM, INORG&NUC (37); BIOCH & MOL BIOL (169); CHEM, ORGANIC (42); CHEMISTRY (128); CHEM, MISCELLAN (7); CHEM, ANALYTICAL (54)
Engineering sciences: ENG, INDUSTRIAL (14); ENG, MANUFACT (5); ENGINEERING (84); ENG, BIOMEDICAL (33); ENG, PETROLEUM (8); ENG, MECHANIC (69); ENG, CIVIL (49); ENG, ENVIRONM (6); ENG, CHEMICAL (69); ENG, MARINE (8); ENG, ELECTRICAL (127)
Physics: PHYSICS, MATHEMA (10); ACOUSTICS (20); THERMODYNAMICS (11); PHYSICS, FLUIDS (16); PHYSICS, MISCELL (6); PHYSICS, AT,M,C (22); OPTICS (37); PHYSICS, APPLIED (49); PHYSICS, COND MA (36); PHYSICS (85); PHYSICS, NUCLEAR (16); PHYSICS, PART&FI (11)
Citation measurement of IF
[Figure: publication years (rows) against citation years 2002-2009 (columns), laid out as in the ‘fixed window’ scheme.]
CWTS answer to the problems of the IF
• This indicator is the JFIS, the Journal-to-Field Impact Score.

• The JFIS solves the main objections against the Impact Factor, as
– the calculation of JFIS is based on equally large entities,
– document types are taken into account,
– JFIS is field-normalized, and finally,
– JFIS is based on longer citation windows (1-4 years).
Citation measurement of JFIS
[Figure: blocks of publication years (rows) against citation years 2002-2009 (columns), laid out as in the ‘year blocks’ scheme.]
End of the presentation

For questions regarding the contents of the presentation, mail to: [email protected]
