
Error and Fraud The Dark Side of Biomedical Research 1st Edition Geoffrey P Webb PDF Download

The document discusses the book 'Error and Fraud: The Dark Side of Biomedical Research' by Geoffrey P. Webb, which explores the flaws and limitations in biomedical research methods, as well as notable cases of scientific errors and research misconduct. It highlights the prevalence of research fraud and its detrimental effects on scientific credibility and public health. The book aims to raise awareness about the importance of integrity in research practices and the need for better mechanisms to detect and prevent fraud.


Error and Fraud

The Dark Side of Biomedical Research

Geoffrey P. Webb, BSc, MSc, PhD, SFHEA


Nutrition writer and consultant,
School of Health, Sport and Bioscience,
University of East London, UK
First edition published 2021
by CRC Press
6000 Broken Sound Parkway NW, Suite 300, Boca Raton, FL 33487-2742
and by CRC Press
2 Park Square, Milton Park, Abingdon, Oxon, OX14 4RN

© 2021 Geoffrey Webb

CRC Press is an imprint of Taylor & Francis Group, LLC

The right of Geoffrey Webb to be identified as author of this work has been asserted by him in accor-
dance with sections 77 and 78 of the Copyright, Designs and Patents Act 1988.

This book contains information obtained from authentic and highly regarded sources. While all
reasonable efforts have been made to publish reliable data and information, neither the author[s]
nor the publisher can accept any legal responsibility or liability for any errors or omissions that may
be made. The publishers wish to make clear that any views or opinions expressed in this book by
individual editors, authors or contributors are personal to them and do not necessarily reflect the
views/opinions of the publishers. The information or guidance contained in this book is intended
for use by medical, scientific or health-care professionals and is provided strictly as a supplement to
the medical or other professional’s own judgement, their knowledge of the patient’s medical history,
relevant manufacturer’s instructions and the appropriate best practice guidelines. Because of the
rapid advances in medical science, any information or advice on dosages, procedures or diagnoses
should be independently verified. The reader is strongly urged to consult the relevant national drug
formulary and the drug companies’ and device or material manufacturers’ printed instructions,
and their websites, before administering or utilizing any of the drugs, devices or materials men-
tioned in this book. This book does not indicate whether a particular treatment is appropriate or
suitable for a particular individual. Ultimately it is the sole responsibility of the medical professional
to make his or her own professional judgements, so as to advise and treat patients appropriately. The
authors and publishers have also attempted to trace the copyright holders of all material reproduced
in this publication and apologize to copyright holders if permission to publish in this form has not
been obtained. If any copyright material has not been acknowledged please write and let us know so
we may rectify in any future reprint.

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced,
transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or
hereafter invented, including photocopying, microfilming, and recording, or in any information
storage or retrieval system, without written permission from the publishers.

For permission to photocopy or use material electronically from this work, access www.copyright.com
or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA
01923, 978-750-8400. For works that are not available on CCC please contact
mpkbookspermissions@tandf.co.uk

Trademark notice: Product or corporate names may be trademarks or registered trademarks and
are used only for identification and explanation without intent to infringe.

ISBN: 978-0-367-46993-1 (hbk)
ISBN: 978-0-367-46992-4 (pbk)
ISBN: 978-1-003-03267-0 (ebk)

Typeset in Minion
by KnowledgeWorks Global Ltd.
For Kate and Lucy Webb
Contents

Preface
Prologue
Acknowledgements

1 Research strategies in the biomedical sciences
The use of statistics
The range and classification of available methods
Observational human studies
Animal and in vitro experiments
Human experimental studies
Quality assessment of human trials
Meta-analysis
Decision-making and hierarchies of evidence
National Institute for Health and Clinical Excellence (NICE)
Key references

2 Case studies of scientific errors
Case study 1 – Sleeping position and cot death
Case study 2 – The protein gap
Case study 3 – Defective brown fat thermogenesis as an important cause of obesity
Case study 4 – Antioxidant supplements to increase life expectancy
Key references

3 More general concerns about scientific credibility
Are most published research findings wrong?
An avalanche of junk papers
Why is so much published data irreproducible?
Is all published research useful?
Key references

4 The accused – case studies of scientists accused of research misconduct
Jatinder Ahluwalia
Werner Bezwoda
Joachim Boldt
Stephen Breuning
Michael Briggs
Sir Cyril Burt
Ranjit Kumar Chandra
Dipak Kumar Das
Charles Dawson
Yoshitaka Fujii
Viswa Jit Gupta
John William Heslop Harrison
Paul Kammerer
Gregor Mendel
Haruko Obokata
J Malcolm Pearce
Scott S Reuben
Ram Bahadur Singh
Diederik Stapel
Jon Sudbo
William Talley Summerlin
Andrew Wakefield
Minor case studies

5 Research fraud overview
Basic definitions
How common is research fraud?
Is research fraud becoming more common?
The harm done by research fraud – why it matters
The publication process: how is the accuracy and integrity of the scientific record maintained?
Key references

6 Protection – barriers to the publication of fraudulent data
Author level barriers
Peer review by co-authors, referees and editors
Key references

7 Detection: identifying fraud after publication
Faith in the efficacy of peer review and replication
Encouraging and protecting potential whistle-blowers
Reader scrutiny
Indicators of fraudulent papers and fraudulent scientists
Key references

8 Disinfection and measures for minimising the impact of research fraud
Disinfection overview
One confirmed act of fraud should trigger a wider investigation
Who should investigate accusations of research fraud?
Indications that data has been fabricated
Interviews with co-authors
Life after death – continuing influence of retracted papers
What can be done to improve the situation?
Key references

Index
Preface

This book has had a very long gestation period and I have probably spent more
time researching and writing it than any of my other books, including my
650-page nutrition textbook. I started to make serious efforts towards research-
ing this book in 2011 and I have had an almost complete draft in my files for over
5 years. I have had difficulty in finding a publisher willing to take the risk of pub-
lishing it; it seemed to fall into the gap between an academic text and a popular
science book that is often perceived to have limited sales potential. I seriously con-
sidered resorting to self-publishing so that my efforts were accessible to readers
and not wasted. I am therefore particularly grateful to my commissioning editor,
Joanna Koster, for persuading Taylor & Francis Group to agree to publish this
book. My daughter Kate also helped in the dissemination of my ideas and research
by setting up a blog site for me (https://drgeoffnutrition.wordpress.com/) which
I have used to post many articles relating to scientific error and research fraud
including many detailed accounts of the fraud and error case studies that are
summarized in Chapters 2 and 4. This blog also contains many opinion pieces
and educational articles. I have cited these blog articles many times in this book
as fuller accounts of issues and cases summarized here and as a route to find the
many and varied original sources that are too numerous to list here.
The first half of this book is concerned with the flaws and limitations of the
research methods used by biomedical scientists and some of the scientific errors
that have resulted from misuse or misinterpretation of these methods. I have
been writing about research methodology and research errors in my articles
and books for over 30 years. My interest in scientific mistakes was first triggered
by finding out that the protein gap that was so prominent in nutrition teach-
ing and research during the 1950s and 1960s was an illusion (see case study in
Chapter 2). The problem of meeting human protein needs was a major topic on
my undergraduate course and was a key element of the rationale for my PhD
thesis. During my undergraduate and postgraduate studies at Southampton
University (1967–1973), my department was heavily involved in work on a potential
new protein source known locally as the “Rank mold” being developed by
the company then called Rank Hovis McDougall. This research eventually led to
the marketing of a mycoprotein preparation as a meat replacement product for
affluent Western vegetarians. This research was one of many expensive projects
around the world looking for novel sources of protein-rich food that might help
to alleviate the perceived critical shortage of protein supplies for the Third World
where most children were considered to be at very high risk of protein deficiency.
After finishing a year’s study leave at King’s College London in 1987, I started
to do some library research into the past overestimation of the protein needs
of children that led almost all nutritional scientists to the mistaken belief in a
large and rapidly increasing shortfall in world protein supplies. My researches led
to publication of a review article about the protein gap in an education journal.
This research also left me with an abiding interest in the flaws and limitations of
biomedical research methods and their interpretation and ultimately led to the
research upon which this book is based and to critiques of these methods in my
papers and books, and to accounts of the error case studies discussed in Chapter 2.
Despite writing extensively about research errors and despite having served
for about 8 years on the editorial board of the British Journal of Nutrition (BJN),
up until around 2010, I had never really thought seriously about professional sci-
entists deliberately fabricating research data. I had heard of the Piltdown man,
of course, but considered it a one-off aberration, the exception that proves the
rule. The possibility of data fabrication or falsification was not something that I
considered when reviewing papers for the BJN and other journals or when choos-
ing sources to rely on for my own books and papers. My eyes were first opened
to this possibility when I read that an author that I had cited many times and
who was regarded as a leading authority in his field of nutritional immunology,
RK Chandra, had been openly accused of publishing fabricated trial data. Then
in November 2010, I read in the Times Higher Education Supplement that a col-
league, Jatinder Ahluwalia, had had a paper retracted from Nature and been
accused in a report by University College London of multiple acts of research
misconduct in generating the data for this paper. I was once again particularly
shocked by these allegations because I had cited this Nature paper several times
in my books and articles. These revelations triggered the start of my research
into fraud and misconduct in scientific research. As soon as I started to look for
information about research fraud, I found numerous other examples of individu-
als who had been accused of research fraud, and a large sample of my case studies
of these individuals is summarized in Chapter 4. I have always tried to under-
stand and explain the background to each of these cases of fraudulent research
and its impact on that research field as well as details of how the perpetrator was
exposed and his or (rarely) her subsequent fate. My case studies include botanists,
zoologists, psychologists, dentists, paleontologists, geneticists, clinicians, cancer
biologists, immunologists and nutritionists. I have thus had to delve into many
areas of science that are outside my own areas of expertise; despite the nega-
tive stimulus that prompted this research it has been an enjoyable and rewarding
experience. I have largely avoided the physical and mathematical sciences in my
investigations because I did not think that I would have the background to fully
explain the science behind these cases and their impact.
The momentum of my research into scientific fraud was maintained and
increased when my daughter chose research fraud as the subject of her MA dis-
sertation. Her work on this dissertation gave me an entry into the wider literature
about research fraud and the efforts being made to detect, quantify and control it.
Without the stimulus of working with her for those few months, I doubt whether
this book would ever have been completed. Despite being an English graduate
who had a very limited science background, she produced a report that I have
found scientifically useful and still refer to today.
I have given many invited talks to students and academics about research
error and fraud. The reaction of my audiences has convinced me that, despite
publicity about recent high-profile cases like Andrew Wakefield and the MMR
vaccination, I was not alone in rarely thinking about data fabrication or falsi-
fication as a major issue. Many scientists think as I did, that research fraud is a
rare aberration that has little impact on scientific progress or clinical practice.
In Chapter 5, I have tried to quantify the extent of research misconduct and
highlight the sometimes quite disproportionate harm that it can cause. Whilst
fraudsters may be uncommon, they are often prolific and prolonged offenders;
faking exciting, high-quality data is much quicker than generating it honestly.
The faked data of these serial offenders have sometimes done considerable harm
and wasted precious research time and resources. In Chapter 3, I similarly try
to give an indication of the extent of false and irreproducible conclusions in the
scientific literature and the reasons for this frequent lack of reproducibility. The
four error case studies discussed in Chapter 2 may represent the tip of a large
iceberg, and some have even claimed that the conclusions of most scientific papers
are wrong and that most research expenditure is wasted. The pace of scientific
progress has been rapid despite this waste of resources, but if the effort and expen-
diture directed towards unproductive and erroneous research could be reduced,
this would amount to a free boost to the resources available for useful research.
One of the major reasons for writing this book, and for the numerous articles
about error or fraud that have preceded it, is to raise awareness and consideration
of these problems by scientists and science students. Awareness and understand-
ing of a problem is a necessary first step in its reduction.
If we spent more time analysing and discussing the causes of past mistakes
and how they achieved general acceptance, then perhaps we would be less likely
to make similar mistakes in the future. I have suggested that extrapolation of
evidence that is towards the bottom of the evidence hierarchy and making pre-
mature interventions based upon that evidence has been a common element in
several of these past mistakes. If NICE had existed in 1970 and had graded and
evaluated the evidence in favour of front sleeping for infants, would it ever have
become the norm recommended by most health professionals and baby care
writers?
If scientists and particularly journal editors and referees had been more con-
scious of the possibility that data might have been fabricated or falsified, then might
questions about the authenticity of the data of some career fraudsters have been
raised sooner and their impact reduced? In several cases, questions and doubts
were raised about serial fraudsters long before they were finally exposed but no
action was taken. In the final three chapters of the book, I have discussed some of
the features of fraudulent data and fraudulent authors that might raise suspicions
and allow those suspicions to be tested.
Consideration and discussion of research error and research fraud should
start early and be more overtly present in the undergraduate and especially the
postgraduate programs of science students. There should be even more emphasis
on guiding students towards good practice and ethical behaviour and encourag-
ing them to avoid cutting corners to get another paper under their belt. Students
should also be given guidance on how to recognize poor practice and the possi-
bility that the data of others has been fabricated or falsified. There should be more
discussion of past mistakes and the reasons why they occurred and why false
beliefs flourished for so long. Students should discuss past cases of research fraud
and the ways in which fraudulent practice was detected. As I say later in the book:
Today’s students are tomorrow’s referees and editors.
Prologue

AIMS, ORIGINS AND STRUCTURE OF THIS BOOK


For many years, I have been writing about major scientific mistakes such as the
four examples below which are discussed more fully in Chapter 2:

● The mistaken belief that protein deficiency was the most important cause of
world malnutrition because of a massive deficit in world protein supply, the
so-called protein gap.
● The promotion of front sleeping for babies that led to large worldwide
increases in cot death rates in the 1970s and 1980s.
● The belief that a defect in the heat-generating capacity of brown fat was a
major cause of human obesity and that drugs that stimulated brown fat
might be a viable treatment for human obesity.
● The belief that antioxidant supplements, when taken by well-nourished adults,
would reduce cancer and heart disease and so increase life expectancy.

A common feature of these errors has been an uncritical and unjustified extrapolation
from findings at a low level in the evidence hierarchy, such as:

● Assuming that an epidemiological association is due to a cause-and-effect relationship
● Assuming that a favourable change in some biochemical risk marker will
inevitably lead to reduced disease risk or increased life expectancy
● Prematurely applying the results from small animal studies to people
● Extrapolating suggested benefits for a small high-risk group to the whole
population

In recent years, evidence-grading hierarchies have been developed. Normally
changes in clinical practice or health policy should only be made if there is clear
supportive evidence at the highest levels of the evidence hierarchy or pyramid.
Rigorous application of this system would have prevented most of the practical
consequences of these past errors. In Chapter 1, I briefly review the observational
and experimental methods available to scientists in the biomedical sciences and
discuss the strengths and limitations of these various lines of enquiry. I also discuss
how results from this variety of investigative approaches can be integrated
and graded to optimise the chances of making correct scientific, clinical and pol-
icy judgements. Meta-analysis has become a very popular technique for trying to
get a consensus from similar studies with common outcomes, and meta-analyses
of controlled trials are at the top of the evidence hierarchy. Meta-analysis involves
a weighted amalgamation of similar studies, so it is prone to distortion by large
or multiple fabricated trials; this reinforces the importance of identifying false or
fabricated data and removing it from the scientific record.
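
The sensitivity of a weighted amalgamation to a single large fabricated trial can be illustrated with a minimal sketch. It assumes fixed-effect, inverse-variance weighting, which is one common scheme; the text above does not say which weighting is meant. The function name pooled_effect and every effect size and standard error below are invented purely for illustration and do not come from any real trial.

```python
from math import sqrt


def pooled_effect(effects, std_errors):
    """Fixed-effect pooled estimate using inverse-variance weights.

    Each study is weighted by 1 / SE^2, so a study reporting a very
    small standard error (typically a large trial) dominates the result.
    Returns the pooled estimate and its standard error.
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    estimate = sum(w * e for w, e in zip(weights, effects)) / total_weight
    return estimate, sqrt(1.0 / total_weight)


# Hypothetical honest trials: small effects scattered around zero
# (for example log risk ratios), each with a fairly wide standard error.
honest_effects = [0.05, -0.10, 0.02, 0.08]
honest_ses = [0.20, 0.25, 0.20, 0.30]

# Hypothetical fabricated trial: a strong "benefit" reported with an
# implausibly small standard error, as if from a very large study.
fabricated_effect, fabricated_se = -0.60, 0.05

before, before_se = pooled_effect(honest_effects, honest_ses)
after, after_se = pooled_effect(
    honest_effects + [fabricated_effect],
    honest_ses + [fabricated_se],
)

print(f"Pooled estimate without the fabricated trial: {before:.3f} (SE {before_se:.3f})")
print(f"Pooled estimate with the fabricated trial:    {after:.3f} (SE {after_se:.3f})")
```

With these invented numbers the pooled estimate moves from close to zero to roughly -0.5 once the fabricated trial is included, because its tiny standard error gives it far more weight than the four honest trials combined. A random-effects model would weight the studies differently, but a large fabricated trial would still pull the result towards itself.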

IS THERE A SYSTEMIC PROBLEM WITH THE SOUNDNESS OF PUBLISHED RESEARCH?

In Chapter 3, I discuss claims that most published research is wrong and that as
much as 85% of research expenditure is wasted, i.e. that the four error case studies
discussed in Chapter 2 are not just isolated cases but are symptomatic of a more
general problem with the soundness of much scientific research. I review some
of the problems with the design, execution and analysis of scientific studies that
may increase the likelihood that their results and conclusions will be unreli-
able. A major factor undermining confidence in published research is the lack of
reproducibility or lack of any attempt to reproduce most of it. Despite this lack of
reproducibility, many scientists still have great faith in the traditional belief that
incorrect or fraudulent science will be quickly detected when other scientists are
unable to reproduce it; the evidence suggests that this faith may not be wholly
justified.
The drive to generate research papers has led to an avalanche of research
papers that are largely unread. Many of these papers are of low quality and published
in low-quality journals with low or very low thresholds of acceptance.
Much of this research seems to have little obvious potential for improving sci-
entific understanding or little chance for improving healthcare or health advice.
An improbable claim based upon statistically weak or flawed evidence may gen-
erate a succession of similar papers oscillating between supporting and refuting
the original claim. For example, weak evidence of an association between eating
dairy products such as yogurt and ovarian cancer risk has helped to spawn scores
of follow-up papers over several decades that have not advanced our understand-
ing of the causes of ovarian cancer or our ability to make recommendations to
reduce it. This is discussed more fully in Chapter 3, where I conclude
that further similar research is unlikely to change this position.

MY PERSONAL JOURNEY FROM ERROR TO FRAUD


Error and fraud may seem like two quite distinct issues:

Largely honest production of flawed data or misinterpretation of
data to support a false hypothesis.

As opposed to:

Wilful fabrication of data or manipulation of real data to convince
others of the correctness of a hypothesis.

My interest in both error and fraud was sparked by personal involvement. My
doctoral research project was part of a programme to develop an alternative
fungal protein source that could be produced industrially and so contribute to
alleviating the perceived large and increasing shortage of protein for human
consumption – the “protein gap”. I was shocked when I later discovered that this
protein gap had probably never existed. My later research involved the use of
genetically obese mice. The notion that a defect in the heat-generating system in
brown fat might be the primary cause of obesity in these mice and that defective
thermogenesis might be an important cause of human obesity became briefly
fashionable at this time (late 1970s/early 1980s). Our observation that mice could
lower their body temperature and become torpid when fasted led us to suggest
that the well-documented persistent mild hypothermia of these mice and their
intolerance to sudden cold exposure was not a failure of thermogenesis but man-
ifestations of an adaptive energy conserving response to perceived starvation.
Their genetic defect is now known to leave their brains unable to detect their
huge fat stores and so they respond as if in a permanent state of starvation; they
respond by entering a permanent semi-torpid state. The defective brown fat
theory of human obesity was the result of an incorrect interpretation of research
observations in mice and its inappropriate application to people.
I first became conscious of research fraud when I discovered that Ranjit K
Chandra, whose publications I had cited in several of my books and papers, had
been accused of fabricating his data. I had also cited a Nature paper of Jatinder
Ahluwalia to support my case against the likely benefits of antioxidant supple-
ments. Ahluwalia had become a colleague by the time news first broke that he
had been accused of research fraud and I subsequently became involved in efforts
to persuade my employer to take action against him. At around this time, my
daughter Kate was taking an MA in publishing at University College London
and because of my frequent discussions (ranting?) about research fraud she chose
to write about an aspect of research fraud for her dissertation. It was her research
and her discussions with me that helped to convert what had been a general
interest with sporadic bouts of reading about individual cases to more systematic
research about fraud and its causes and consequences that eventually led to this
book. She made me aware of many more cases and showed me that there is a
substantial body of academic literature dealing with various aspects of research
fraud. She also made me aware of organisations that deal with research fraud like
the Office of Research Integrity (ORI) in the USA and the UK Research Integrity
Office (UKRIO).
Scientists have often failed to fully consider flaws in the logic and gaps or
inconsistencies in the evidence for the theories that generated major scientific
mistakes. With hindsight, some of these gaps and flaws now look fairly obvious.
The high protein requirements of rapidly growing laboratory animals were used
as evidence for the high protein needs of slow-growing human babies despite
the low protein content of human milk. The largely speculative benefits of front
sleeping were deemed sufficient to justify promoting a change in sleeping posi-
tion for babies that subsequently led to increased risk of cot death. Analysis of
past “mistakes” made me very aware of the fallibility not only of individual sci-
entists but of the scientific establishment in accepting flawed and poorly substan-
tiated theories and translating them into practical health advice or treatment.
It was thus no surprise to me that individual scientists who cheat and generate
false data may similarly be able to escape detection by the scientific community.
Papers based upon fabricated or falsified data may go largely unchallenged and
be cited for decades despite glaring flaws or anomalies that, in hindsight, now
seem obvious. The focus of the second half of this book is on scientists who try to
deceive others by the use of fabricated or improperly manipulated data.
Error and fraud both flourish when there is a collective suspension or sup-
pression of critical evaluation by the rest of the scientific community. Some of the
evidence used to support mistaken theories now looks weak or seriously flawed.
This makes it difficult to understand how they became so firmly established and
sometimes persisted as the accepted textbook belief for decades. It is even more
alarming that major policy decisions were made and implemented on the basis of
these inadequately substantiated theories.
Was there ever enough evidence to justify programmes costing the equivalent
of billions of pounds/dollars to develop new or improved protein sources to close
the protein gap?
Was there ever any substantial evidence to support recommendations for par-
ents to adopt the front sleeping position for their babies? Maybe the change in
sleeping position was casually adopted because of the illogical assumption that
even if it did no good then such a simple change in behaviour could not do any
real harm, but surely any behaviour change considered worthwhile and signifi-
cant enough to do some good must also have the potential to do harm.
The history of medicine is littered with well-intentioned practices that have
turned out to do more harm than good. The phrase evidence-based practice
has become the mantra of health professionals and health policy makers, but has
there always been proper and critical evaluation of the evidence that underpins
current practice? Once a scientific theory or mode of treatment has been widely
accepted it is treated as the truth and is presented as such to students in
their textbooks and lectures. In Chapter 3, I discuss efforts to challenge many
current practices and weed out those that are ineffective or even harmful. Once a
theory becomes accepted fact, further attempts to question or test the fundamental
basis of the theory often cease. Research may be focused upon aspects of the
theory or its application under the assumption that it is proven fact. Sometimes
a theory may become so much a part of the research fabric in an area that
careers, research grants and even whole research programmes and organisations
may be dependent upon continuation of the theory.
For example, so much money, scientific effort and political capital had been
invested in measures to increase protein availability that it was very difficult to
persuade people to look critically at the rationale that underpinned this effort
and expenditure, i.e. to question the belief that there really was a crisis in world
protein supplies. The existence of a protein gap was the raison d’être of an agency
of the United Nations, the Protein Advisory Group. In his watershed 1974 Lancet
paper questioning the existence of “a protein gap”, Donald McLaren suggested
that many scientists had privately expressed sympathy with his opposition to the
existence of a protein gap and the expensive efforts to close it. He claimed that
they were unwilling to support him for fear of damaging their careers and
research funding. He even suggested that there were attempts to cynically sup-
press critical re-evaluation of the protein gap concept.
The antioxidant theory of disease and aging prevention has been seriously
undermined in recent years, and it is becoming increasingly difficult to argue that
indiscriminate use of antioxidant supplements in basically well-nourished adults
will prolong life. This theory has generated tens of thousands of research papers,
dozens of books and has been the justification for countless research grants and
programmes. Many scientific reputations and careers have been built upon it.
This theory has also had considerable wider commercial impact; it has been used
to promote the sale and manufacture of many foods, drinks and supplements
that are “high in antioxidants”. New exotic fruit and vegetables with high anti-
oxidant content have been introduced to the market and promoted on the basis
of their antioxidant content; new varieties of existing fruit and vegetables with
higher antioxidant levels have also been developed. If it is confirmed that extra
antioxidants given to normally nourished people have no long-term benefits,
then these products would lose the prestige and marketability they have gained
from their high antioxidant activity.
A similar suspension of fundamental critical analysis also characterises
many of the examples of deliberate fraud that are discussed in the second part
of the book. In Chapter 4, there are summaries of selected case studies of major
research fraudsters, some of whom have had decades of success and have
become very influential by using fabricated or falsified data. These cases have
been selected in part to act as a reservoir of examples to illustrate points made in
the last four chapters. There now seem to be obvious flaws and inconsistencies in
their fraudulent data; some of their claims now seem outrageous and unbelievable,
and the sheer volume and scope of work said to have been done by some
fraudulent researchers seems beyond belief.
With the benefit of hindsight, one might question:

●● Why some obvious and major statistical flaws in a person’s research output
were overlooked: impossible standard errors/deviations, regression coef-
ficients that remained constant to 3 decimal places despite large increases in
sample size, the same number of people with the same side effect in multiple
clinical trials, baseline distributions that could not have occurred if trial
subjects were randomly assigned as claimed by the authors.
●● How a single individual could generate data and research papers at a phenomenal
rate, sometimes publishing as sole author papers that would normally
involve multiple co-authors and an army of assistants with differing
areas of expertise.
●● How it was possible for someone to get away with using totally fictitious
collaborators on their papers.
●● Why data that is completely out of line with the results of other groups
remained largely unchallenged for decades.
●● How claims to have used impossibly large samples of fictitious subjects with
narrow or rare selection criteria went unchallenged.
●● How claims to have used drugs, databases or other materials that were not
then available were not noticed.
●● How someone who cannot be traced and probably does not exist can publish
papers or be listed as a co-author.
●● How someone can publish studies using many laboratory animals when the
institution’s animal house has not supplied them.
●● How someone can report experiments using radioactive chemicals when
there is no record of them in the legally required log of use and safe
disposal.

Scientists may be trained to criticise the methodology of other scientists and the
way they analyse and interpret their data, but they generally trust the honesty
of what has been written or said. Referees look for flaws in the design of a study,
question whether the correct statistical analyses have been done and whether the
interpretation of the results is sound. However, they generally assume that the
methods have been honestly described and the data submitted has been gener-
ated and analysed in the way described by the authors. They would not expect
the authors to be deliberately trying to deceive them. This trust acts as a powerful
shield for the deliberate fraudster because referees do not actively look for indica-
tions of fraud. Colleagues and managers at the host institution may be so blinded
by the reflected glory generated by their star researcher that they do not question
how their prolific output was achieved.
In some cases, the revelation that a prominent scientist has been faking data
for years comes as an apparent surprise to the scientific community. In other
cases, suspicions about a scientist’s work may have been present long before the
general acceptance of misconduct—suspicions that have been largely ignored or
suppressed for years.
After an accusation against Canadian scientist Ranjit Chandra, a Memorial
University panel decided in 1994 that he was almost certainly guilty of
misconduct:

With respect to the allegations, the committee is, therefore, led
to conclude that scientific misconduct has been committed by
Dr Chandra.

This report was not released, and Chandra continued to work at the university
until around 2002, when further public accusations of fraud were made. He pub-
lished prolifically during these intervening years including many influential
scientific reviews and keynote lectures at scientific conferences. He became an
acknowledged expert about how diet and dietary supplements affect the immune
system, and he was regarded by some as the father of nutritional immunology.
Chandra was awarded the Order of Canada, was on the 1995 honour roll
of Canadians who had made a difference published by Maclean’s magazine and
was said to have been twice nominated for the Nobel Prize. Several of his major
papers have only recently been retracted and his Order of Canada revoked.
In 2012, the Japanese anaesthesiologist Yoshitaka Fujii was proven to have faked
at least 172 scientific papers including many clinical trials. Suspicions about the
veracity of his work had been first raised in a letter published in April 2000 when
it was pointed out that the incidence of a side effect appeared to be almost exactly the
same in all the groups in his 21 published clinical trials. It was statistically almost
impossible that these reported identical incidences could have occurred by chance
and some underlying influence must be responsible for this “incredibly nice” data,
i.e. a subtle way of suggesting that the results were not genuine. Despite this warn-
ing signal he continued to publish prolifically for another decade.
Why were these and other serial fraudsters allowed to pollute the scientific
record with fabricated data for so long, especially when suspicion preceded their
formal unmasking by many years? If flaws in a scientist’s work that suggest miscon-
duct are highlighted then how can they carry on obtaining research grants, holding
on to senior academic positions and continuing to publish further flawed data? Did
colleagues and particularly senior colleagues at the host institution not question
how these individuals were able to publish so prolifically without creating a whirl-
wind of research activity in their laboratories and clinics? Were senior academics
and managerial staff reluctant to question the integrity of the data of their star
performer who was generating research income and publications to enhance the
standing and prestige of their institution? Did they fear that any attempt to investi-
gate the suspected offender would risk scandal and damage their institution’s repu-
tation? Were research sponsors happy to accept favourable data even though there
might be some suspicion that there had been irregularities in its generation? Were
junior colleagues too in awe of the eminent professor to question their integrity
or were they concerned that their own career might be blighted by any attempt at
whistle blowing? Did the tradition of trust amongst scientists and a failure to con-
sider the possibility of data fabrication help them to remain unexposed for so long?
Do the libel laws in particular so frighten employers, authors, editors and
journal owners that they become very reluctant to do anything that could be
construed as questioning the integrity of an author? Journal editors may be reluc-
tant to retract papers unless requested to do so by the other authors; until Ranjit
Chandra lost a libel case against the Canadian Broadcasting Corporation in 2016,
only one of his papers had been retracted even though clear evidence of data fab-
rication had been around for over two decades and a paper had been retracted for
research fraud in 2005.
In Chapter 5, I discuss efforts to estimate the prevalence of research fraud.
It is sometimes argued that research fraud is not a major problem because the
numbers of people involved are small, but some fraudsters have caused major and
lasting harm with their fraudulent publications, and there is discussion of some
of the adverse consequences of research fraud; in some cases, research fraud has
almost certainly been responsible for increased patient deaths. As noted earlier,
the popular technique of meta-analysis will be distorted by large or multiple fabricated
data sets and there is discussion of how data from some prolific fraudsters
changes the outcome of meta-analyses.
Chapters 6, 7 and 8 look at the ways in which the scientific literature might be
kept relatively free of fraudulent and incorrect research under the three umbrella
headings of prevention, detection and disinfection. Peer review is seen as the
primary safeguard against publication of poor quality research. Peer review may
identify many studies that are poorly designed, poorly executed or inappropri-
ately analysed and interpreted. However, because reviewers usually trust that a
paper is an honest account of what was done and what was found, it is relatively
ineffective where authors fabricate their findings and deliberately aim to deceive
the editor, peer reviewers and journal readers. Some editors have been found
to bypass the peer review process in order to fill their journal and they have
accepted deliberately and obviously flawed spoof papers sent to test their peer
review process. The growth of open access publishing, where publication fees
from authors are the main source of the journal’s revenue, may have encouraged
some less scrupulous editors to accept almost anything submitted provided that
the publication fee is forthcoming. Some notorious fraudsters have published
much of their fraudulent work in journals that they edit and have thus been able to
control or bypass peer review of their own papers.
Chapter 7 is primarily concerned with the detection of fraudulent or incorrect
research after publication. As noted earlier there is a traditional belief that fraudu-
lent or incorrect data that gets through the peer review process will be identified
when other researchers try to reproduce it. Despite this faith, peer review and
lack of reproducibility have rarely been responsible for unmasking high profile
cases of research fraud. Most fraudsters have been identified when colleagues have
acted as whistle blowers and reported their suspicions about the behaviour of a
colleague or collaborator. Other fraudsters have been unmasked by vigilant read-
ers who have spotted signs of data fabrication in published work. Whistle blow-
ers render an important service to the scientific community by helping to uphold
scientific integrity and reduce the distracting and distorting effects of fraudulent
published data. However, rather than being lauded and rewarded for their service
to science, some whistle blowers have suffered as a consequence of their vital ser-
vice and examples of these harmful impacts upon whistle blowers are highlighted.
Some of the characteristics of papers and data that have led readers to suspect that
publications contain fraudulent data are also discussed in this chapter.
Chapter 8 deals with the ways in which the total output of a fraudulent
researcher should be investigated so that the literature can be disinfected of
all their suspect work. In some cases most of the published output of known
fraudsters remains in the literature, but in other cases, expensive investigations
by employers or professional bodies have led to mass retractions of a fraudster’s
past papers. The importance of removing fraudulent data from the scientific
record has grown with the increasing amalgamation of published data into
meta-analyses. Finally there is discussion of some practical measures that might
reduce the amount of fraudulent material that reaches and remains in the scien-
tific literature.

WHAT DO I HOPE TO ACHIEVE BY WRITING THIS BOOK?


One of my main motivations for writing about past mistakes was to encourage open
discussion of why they had occurred, and how we could learn from them and thus be
less likely to make similar mistakes in the future. In my opinion, there has been too
little analysis and open discussion of these past mistakes despite their major impacts,
e.g. billions of dollars/pounds in wasted research and development projects to close
the protein gap and tens of thousands of extra cot deaths caused by the promotion of
front sleeping for babies. There seems to be a reluctance among some in the scientific community
to acknowledge just how important and damaging these mistakes were and
to fully recognise some of the fundamental flaws in the way research was conducted
and interpreted which caused them. The first part of the book aims to increase aware-
ness and discussion of past, present, and likely future errors in research. It considers
why so much research effort and expenditure is thought to be wasted and why most
published papers are unread and/or incorrect in their conclusions.
A code of relative silence and an apparent reluctance to openly address serious
problems in scientific research has also characterised the way in which research
fraud has sometimes been handled. This book aims to increase awareness of the
impact of research fraud, some of its tell-tale signs and characteristics and of
the great damage that it can cause. This increased awareness should make life
harder for the dishonest researcher, increase the chances of exposure and maybe
increase the chances of their being quickly identified and severely penalised for
their dishonesty. In addition to the relatively small number of cases of outright
fabrication, there is suggestive evidence, and considerable suspicion, that selecting
or improperly presenting data is more common, e.g. the clustering of reported
probabilities just below the 0.05 value taken to indicate significance is one indica-
tion of inappropriate data selection or manipulation.
Perhaps all science courses in universities should include direct coverage of
research fraud as a curriculum topic. Students could be given analysis of indi-
vidual cases of proven fraud and made more aware of the possibility of fraud and
shown some of the characteristics of fraudulent papers:

Today’s undergraduates and postgraduate students are tomorrow’s
journal referees and editors.

This more open discussion and awareness may make it more likely that other cases
will be detected more quickly and dealt with effectively once allegations have been
made or suspicion aroused. One recurring comment made after a fraudster has been
exposed is that referees, editors, co-authors and employers had not even considered
the possibility that they were being deceived and so they unquestioningly accepted
the authenticity of data provided by the fraudster; they looked for error and misjudge-
ment but did not consider the possibility of deliberate deception. In a lead article in
The Times Higher Education Supplement (August 23rd 2012), John Gill, in an appeal
for more openness in the way allegations of research fraud are dealt with, wrote:

“Sunlight is the best disinfectant, and the UK’s institutions and
researchers must be fearless in shining a light on misconduct”.

Referees and editors need to be conscious of the possibility that some of the data
they review may be fabricated or dishonestly manipulated; they should actively
consider this possibility and be aware of some characteristics of fraudulent
papers/authors. They should ask themselves questions like:

●● Are the results credible? Are they consistent with previous findings? If not,
then do the authors give a reasonable explanation for differences from other
published work?
●● Are the statistics believable? At the very simplest level do the standard devia-
tions/errors translate into likely or even possible ranges and are they consis-
tent with other data in the paper?
●● Are the baseline characteristics of subjects consistent with other established
data?
●● Do the authors give a clear and believable account of key parts of their meth-
odology such as how subjects were recruited and allocated to study groups?
●● Is the baseline data consistent with claims of random allocation?
●● Are the numbers of subjects with specific, and perhaps uncommon, charac-
teristics feasible?
●● Do authors give adequate details of what ethical approval was obtained for
the study and is any evidence of this approval supplied?
●● Is the workload and expertise involved in generating the data consistent with
the contributor list and the acknowledged assistance?
●● Are the co-authors all aware of their involvement and have they signified
their willingness to accept responsibility for the quality and veracity of the
methods and results submitted?
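
One of the questions above, whether reported standard deviations or standard errors translate into possible ranges, can be partly checked by simple arithmetic. The sketch below is only an illustration and is not a method described in this book: the function names are hypothetical, and it assumes that the measurement has known lowest and highest possible values (for example a 0 to 10 rating scale) and that the reported standard deviation uses the usual n - 1 denominator. It relies on the fact that n values confined to a fixed range cannot have a sample standard deviation larger than half the range multiplied by the square root of n/(n - 1).

```python
from math import sqrt


def sd_from_standard_error(standard_error: float, n: int) -> float:
    """Convert a reported standard error of the mean back to a standard deviation."""
    return standard_error * sqrt(n)


def max_possible_sd(lowest: float, highest: float, n: int) -> float:
    """Largest sample SD (n - 1 denominator) that n values in [lowest, highest] can have.

    The extreme case is half the values at each end of the range.
    """
    return (highest - lowest) / 2.0 * sqrt(n / (n - 1))


def sd_is_possible(reported_sd: float, lowest: float, highest: float, n: int) -> bool:
    """Return False when a reported SD could not arise from data within the stated range."""
    return 0.0 <= reported_sd <= max_possible_sd(lowest, highest, n)


# Hypothetical example: a 0-10 rating scale measured in 20 subjects.
print(sd_is_possible(3.0, 0.0, 10.0, 20))  # a feasible value
print(sd_is_possible(7.0, 0.0, 10.0, 20))  # larger than any data set on this scale allows
```

A value that passes this check is not thereby shown to be genuine; the check can only flag figures that are arithmetically impossible.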

These questions would, in most cases, be almost a formality unless a referee’s
or editor’s suspicion is aroused. Once misconduct is suspected then other more
time-consuming investigations into the publication record of the suspect could
be initiated. Often such wider investigations do generate evidence of research
fraud; the evidence is often very clear once detailed analysis of the accused’s work
begins. The unmasking of Yoshitaka Fujii by an intensive analysis of the baseline
data in his clinical trials and the demonstration that it could not possibly have
arisen by random allocation of subjects provided compelling evidence of data
fabrication that was taken as essentially proof of research fraud. A detailed analy-
sis of raw data requested by the BMJ to support one of the papers submitted to the
journal by Ram Bahadur Singh showed that there was overwhelming evidence of
data fabrication or falsification.
Several fraudsters seem to have used other authors’ names without permission
to give credibility to their work, and some have used fictitious co-authors or have
written papers in the name of fictitious authors to support and add credibility to
their own published findings.

●● Yoshitaka Fujii used the names of other anaesthesiologists on his papers
without their knowledge.
●● A paper published by Amrit Jain in a journal edited by RK Chandra supported
Chandra’s past findings; it was accepted the day after submission and
Jain has never been traced.
●● Psychologist Sir Cyril Burt has been accused of writing papers, reviews and
articles for the journal he edited using various pseudonyms. Two of his long
time assistants have never been traced despite co-authoring articles with him
and even publishing independent articles in the journal that Burt edited.

Awareness of such ploys makes their earlier detection more likely. Journals
should take reasonable steps to ensure that all the authors listed on an accepted
paper really exist and that they are aware that the paper is being published in
their name. Journal editors obviously play a pivotal role in journals’ efforts to
prevent publication of fraudulent research and in dealing with previously pub-
lished suspect material. Journal editors also hold a position of trust and as can be
seen from the Burt and Chandra cases this trust can be abused and the journal
manipulated to suit the editor’s own agenda. Many of the fraudsters dealt with in
the case studies had held one or more editorial positions with journals. Journal
editors must be scientists of the highest integrity and any doubts about their past
behaviour should rule them out of holding such a position of trust. Journal own-
ers and publishers must ensure that their journals do not become the personal
fiefdom of a dominant or long serving editor. Checks should be embedded in the
editorial process to prevent editors publishing whatever they want and to prevent
them handling papers that they have co-authored or that affect them directly, e.g.
supporting/refuting their own work.
In addition to phantom collaborators and people listed as collaborators with-
out their knowledge, some respected co-authors have been duped into adding
credibility to fraudulent findings. Some of the discredited papers published by
the Indian palaeontologist Vishwa Jit Gupta were co-authored by other highly
respected academic palaeontologists. He claimed to have found numerous fossils
in the Himalayas that had never otherwise been found within thousands of kilometres.
Gupta sometimes sent a genuine fossil to another expert, who would
authenticate it and help to describe it and, because of this role, would be
listed as a co-author. Their presence as co-author would have added to the
authority and credibility of Gupta’s claim that he had found the fossil in the
Himalayas even though it had been acquired from a shop or museum.

LIMITING THE INFLUENCE OF FRAUDULENT AUTHORS


Another aim of writing this book is to encourage journals to do more to restore
the integrity of the scientific record. Strongly suspect papers should be retracted
and readers warned about data in un-retracted papers of known fraudsters.
Otherwise fraudulent research will continue to influence those writing reviews
and conducting meta-analyses. Fraudulent data will continue to be used in
commercial promotion of products and may continue to influence clinical deci-
sions. Data reported by Ranjit Chandra about the hypoallergenic properties
of some commercial infant formulae were used to support these products for
years after an internal enquiry at Memorial University in 1994 found evidence
of research misconduct in the generation of this data. Yoshitaka Fujii’s prolific
clinical trials distorted conclusions about the best way of treating post-operative
nausea and vomiting. Meta-analysis showed that the apparent efficacy of treatments
such as the drug granisetron was quite different when analysed with and without
the inclusion of his fabricated data.
Fraudsters exert influence not only by publishing their fraudulent output but
also by being given opportunities to influence scientific opinion by invitations
to write and speak and to sit on policy making bodies. They will also supervise,
teach and perhaps write the books used by students who will become the next
generation of scientists. Review articles written by serial fraudsters should also
be considered for retraction. Often these reviews were invited and only published
because of the author’s reputation that was gained by use of fraudulent data. Most
fraudsters freely cite their own tarnished research in their review articles. Many
non-specialists and textbook writers will rely upon review articles to keep them
up to date in areas outside their own specialty. I have found that many reviews written
by prolific fraudsters are still in the literature and often cite the work of their
discredited author, usually including their retracted papers.
I found several reviews by disgraced oncologist Werner Bezwoda remaining
in the literature that used fabricated clinical trial data to support the benefits of
aggressive high dose chemotherapy for some breast cancer patients.
Finally these fraudsters can influence scientific and public opinion through
writing books, popular articles and through media interviews. It is claimed, for
example, that RK Chandra has authored or edited 22 books.
Many fraudsters are serial offenders and have been allowed to pollute the sci-
entific record for decades even when suspicions were raised much earlier – did
secrecy and the desire not to make concerns public protect fraudsters and allow
them to continue? One telling addition to the discussion about how fraudsters
are able to flourish for so long is a comment made in a book review by the distin-
guished immunologist and Nobel laureate Sir Peter Medawar. He recalls being
shown a rabbit which had purportedly had its whole cornea transplanted from
a human cadaver. William Summerlin claimed that pre-incubation of skin and
corneas prevented rejection even when transplants were between different species.
The eye of the rabbit showed no signs of surgery, and Medawar said that he did not
believe that the rabbit had undergone any surgery to its cornea (which it had not)
but admits that he lacked the moral courage to say that he believed that the distin-
guished audience were being made victims of a hoax. Even such a distinguished
scientist felt inhibited from pointing out what he believed to be an obvious fraud,
so it is not surprising that lesser mortals are also tempted to say nothing:

“All it takes for evil to succeed is for a few good men to do nothing...”
Acknowledgements

I would like to thank my commissioning editor Joanna Koster and her colleagues
at Taylor & Francis Group for helping to make this book happen. I am grateful
for the help and guidance given in converting my mass of research and writings
into a publishable manuscript.
I also would like to acknowledge the important role that my daughter Kate
played in opening my eyes to the large body of literature on research misconduct
and the efforts being made to detect and control it.

1
Research strategies in the
biomedical sciences

THE USE OF STATISTICS


The end result of most biomedical research studies is one of the following:

●● A comparison between the means of variables measured in different groups,
e.g. in control and test groups
●● A comparison between some measure of relative risk in different groups, e.g.
the relative risk of lung cancer in non-smokers and smokers (perhaps subdi-
vided according to level of tobacco usage)
●● A comparison of the frequency of occurrence of some variable in different
groups, e.g. the frequency of a potential side effect recorded in those given a
drug or placebo
●● The association or correlation between two variables

Scientists use statistical analysis to make objective judgements about whether
any differences or correlations are likely to be significant or just due to chance.
If a large group of 15-year-old children were divided in half by alphabetical
order, then the average heights of the two groups would almost certainly not
be exactly the same. If one divided them according to sex, then the difference
between average heights of the two groups would probably be bigger and the
male average higher. Statistical analysis should indicate that any small differ-
ence between the alphabetically selected groups was just due to chance but that
the difference between sexes was significant because adolescent boys tend to be
taller than girls.
Even if subjects are matched for factors like age and sex, most biological vari-
ables are normally distributed around a mean or average value. If a frequency
distribution is plotted, then this yields a so-called bell-shaped curve; lots of val-
ues are clustered close to the mean and the number of values decreases as you
get further away from the mean on either side. The standard deviation describes
the distribution of values around the mean in a normal distribution: about 68% of
values lie within one standard deviation either side of the mean, about 95%
within two standard deviations and 99.7% within three standard deviations of
the mean (see Figure 1.1).

[Figure 1.1 The frequency of values around the mean (0) in a normal
distribution. Axis ticks run from –3 to 3 standard deviations; the bands mark
68%, 95% and 99.7% of data.]

This means that in a large sample of British men, only about 2.5% would be
taller than the mean/average plus two standard deviations and just 0.15% taller
than the mean or average plus three standard deviations. If a survey revealed that
25% of children in an area were more than two standard deviations below the
national average height, this would indicate that some factor was adversely affecting
the growth of this group. An individual value for any variable that is much
more than three standard deviations above or below the mean might indicate
a pathological cause.
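
The proportions quoted above can be reproduced from the standard normal distribution. The following is a minimal sketch using only the Python standard library; the function names are illustrative and are not taken from the text.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def fraction_within(k: float) -> float:
    """Fraction of a normal population within k standard deviations either side of the mean."""
    return STANDARD_NORMAL.cdf(k) - STANDARD_NORMAL.cdf(-k)


def fraction_above(k: float) -> float:
    """Fraction of a normal population more than k standard deviations above the mean."""
    return 1.0 - STANDARD_NORMAL.cdf(k)


for k in (1, 2, 3):
    print(f"within {k} SD: {fraction_within(k):.4f}   above +{k} SD: {fraction_above(k):.4f}")
```

The exact upper-tail fractions are about 2.3% beyond two standard deviations and about 0.13% beyond three, slightly smaller than the 2.5% and 0.15% obtained by halving the rounded figures of 95% and 99.7%.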
Using the means, standard deviations and sample sizes of a measurement
made on two groups, it is possible to estimate the probability that any difference
between the means of the two groups is just due to chance (e.g. using a “t” test). If
the likelihood of the difference occurring by chance is say 1:1000, then one would
be confident that this was a real difference and this result would be classified as
highly significant, i.e. p (probability) = 0.001 or 0.1%. If the probability was say
1:2 (p = 0.5), then this would be classified as a non-significant difference between
the two means because one would expect to get a difference of this magnitude
every second time simply by chance. By convention, scientists take a probability
of less than 1:20 (p < 0.05 or 5%) as statistically significant. Of course, statistically
significant differences can still occur by chance; theoretically once in every 20
comparisons. This means that if one does multiple comparisons on two groups
then this increases the risk of false positive results. If a treatment has a “real”
effect upon an outcome measure, then whether or not a test achieves statistical
significance will depend upon:

●● The magnitude of the treatment effect
●● The general variability in the outcome measured and any variability in the
response to the treatment
●● The size of the sample
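
The earlier point about multiple comparisons can be put into numbers. The sketch below assumes that the comparisons are independent of one another, which is a simplification, and the function name is purely illustrative.

```python
def chance_of_false_positive(n_comparisons: int, alpha: float = 0.05) -> float:
    """Probability of at least one 'significant' result arising purely by
    chance across n independent comparisons, each tested at level alpha."""
    return 1.0 - (1.0 - alpha) ** n_comparisons

for n in (1, 5, 20):
    print(n, "comparisons:", round(chance_of_false_positive(n), 3))
```

Under this assumption, 20 comparisons give roughly a 64% chance of at least one false positive even when no real differences exist.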

A statistically significant difference between control and experimental treatments
could also be the result of bias in the design or conduct of the experiment
rather than a real effect of the treatment under test. For example:

●● Some bias in the way the two groups were initially selected
●● Bias in the way the control and experimental groups were treated during the
experiment
●● A bias in the way outcome was measured in the two groups

When designing an experiment, it is possible to predict the sample size that is
needed to produce a statistically significant result (i.e. p < 0.05 or <5%), if the
intervention alters the measured variable by a given amount, e.g. what sample
size one would require to show significance if a drug reduced blood pressure by
say 20%?
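
A rough sample size calculation of this kind might look like the sketch below. It uses the common normal-approximation formula for comparing two group means. The 80% power figure, the 10 mmHg expected difference and the 15 mmHg standard deviation are illustrative assumptions of mine, not values taken from any study.

```python
from math import ceil
from statistics import NormalDist

def subjects_per_group(expected_difference: float, standard_deviation: float,
                       alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate number of subjects needed in each of two groups to detect
    a given difference in means (two-sided test, normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # significance threshold
    z_power = NormalDist().inv_cdf(power)          # chance of detecting a real effect
    n = 2 * ((z_alpha + z_power) * standard_deviation / expected_difference) ** 2
    return ceil(n)

# Hypothetical trial: expect a 10 mmHg fall, between-subject SD of 15 mmHg
print(subjects_per_group(expected_difference=10, standard_deviation=15))
```

Halving the expected effect roughly quadruples the number of subjects required, which illustrates why small effects need very large trials.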
Statistics may tell us that two means are probably different, but this may not
be a clinically significant difference. An intervention may alter a measured clini-
cal indicator significantly, but this does not necessarily mean that the patient
gets any real benefit from the intervention. For example, if an intervention low-
ers blood cholesterol by a small but statistically significant amount in short-term
trials, will this be of any clinical benefit to patients? Will the fall in blood cho-
lesterol make a real difference to heart attack risk and will it be maintained over
the longer term? The magnitude of the effect or degree of patient benefit from
an intervention may be more important than the probability value; something
can have a small and not clinically useful effect that is nonetheless statistically
significant.

Association between variables


Biomedical scientists, particularly epidemiologists, often look for associations
between variables:

●● Is there any relationship between saturated fat intake and blood cholesterol
concentration?
●● Is there any association between salt consumption and blood pressure?
●● Is there any association between activity level or fitness and body fatness?
●● Is there any association between level of cigarette smoking and lung cancer risk?
●● Is there any association between fruit and vegetable consumption and mor-
tality from cardiovascular disease?

Figure 1.2 Interpreting correlation coefficients: positive correlation (r = +1.00), inverse correlation (r = −1.00) and no correlation (r = 0.00).

When testing the relationship between biological variables that one expects to
be normally distributed, one can calculate a value known as the (Pearson) correlation
coefficient (r). If the two variables are plotted on the x- and y-axes of a
graph and all the points lie on a straight line with
a positive slope (i.e. increases in x are associated with increases in y), then this is
a perfect positive correlation and r = +1. If the straight line has a negative slope
(y decreases as x increases), then this is a perfect negative correlation and r = −1. If
the points are evenly scattered around a horizontal line, then this means
that there is no correlation between the measures on the x- and y-axes and r = 0
(shown in Figure 1.2).
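
As a minimal sketch, the Pearson correlation coefficient can be calculated directly from its definition. The data sets below are invented solely to reproduce the three situations just described.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]
rising = [2 * v + 1 for v in x]    # straight line, positive slope
falling = [10 - 3 * v for v in x]  # straight line, negative slope
flat = [3, 1, 2, 1, 3]             # scattered around a horizontal line

for label, y in (("rising", rising), ("falling", falling), ("flat", flat)):
    print(label, round(pearson_r(x, y), 3))
```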
As with differences between means, one can assess the probability of a mea-
sured association being due to chance. The threshold for statistical significance
of r values between 0 and +1/−1 depends upon the sample size. The threshold
r (p = 0.05) is 0.6 with 10 pairs of values, 0.3 with 30 pairs, around 0.17 with
100 pairs and with 1,000 pairs an r of just 0.05 would be significant. Even though
low r values may be statistically significant, the association between x and y is
only a very weak one. If one squares the r value, this indicates how much of the
variation in y is explained by variation in x. With r values of +1 and −1, r² is
1 and so 100% of the variation in y can be explained by variation in x; for other
values:

r value (+ or −)    r²      Variation in y explained by variation in x
1                   1       100%
0.7                 0.49    49%
0.5                 0.25    25%
0.2                 0.04    4%
0.1                 0.01    1%
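
The way the significance threshold for r shrinks with sample size can be approximated as follows. This sketch uses Fisher's z transformation, which is my choice of approximation rather than a method stated in the text. It also assumes a one-sided test, which gives values close to the thresholds quoted above; a two-sided test gives somewhat larger thresholds.

```python
from math import sqrt, tanh
from statistics import NormalDist

def approx_threshold_r(n_pairs: int, alpha: float = 0.05,
                       two_sided: bool = False) -> float:
    """Approximate smallest r that reaches significance at level alpha
    for n_pairs of observations (Fisher z approximation)."""
    tail = alpha / 2 if two_sided else alpha
    z_crit = NormalDist().inv_cdf(1 - tail)
    return tanh(z_crit / sqrt(n_pairs - 3))

for n in (10, 30, 100, 1000):
    r = approx_threshold_r(n)
    print(f"{n:>5} pairs: threshold r is about {r:.2f}, "
          f"explaining {r * r:.1%} of the variation in y")
```

With 1,000 pairs, a just-significant correlation explains only about a quarter of one percent of the variation in y.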

As noted above, an intervention may have a statistically significant effect on a
measured clinical indicator that is not clinically useful. Likewise, a weak correlation
can be statistically significant without being of much value to a medical
researcher trying to apply it practically, e.g. a weak association between a lifestyle
factor and a disease risk factor. A significant correlation coefficient indicates
that two variables are associated but does not necessarily mean that changes in
x cause changes in y. Overlooking this mantra that association does not mean
cause and effect is probably the biggest weakness in the interpretation and analy-
sis of epidemiological findings and is discussed at length in Chapter 3.
When looking at potential causes of disease, it is common to calculate
the relative risk. In its simplest form, this is the incidence of
disease in a group exposed to a potential cause of the disease divided by the
incidence in the unexposed group, e.g. the incidence of lung cancer in cigarette
smokers divided by the incidence in non-smokers. In this example, the relative
risk may be well over 10, i.e. habitual smokers are more than ten times as likely as
non-smokers to develop lung cancer. One could calculate relative risk at different levels or
durations of smoking and expect to see a graded increase in relative risk with
increased exposure. One might also compare relative risk in ex-smokers, non-
smokers and current smokers. Such epidemiological studies can only demon-
strate association but in this case the relative risk is so large and consistent across
different types of studies that it was accepted fairly quickly that this association
was almost certainly causal. The interpretation of small apparent increases in
relative risk is discussed in Chapter 3.
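
In its simplest form, the relative risk calculation described above amounts to the following. The counts are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def relative_risk(cases_exposed: int, n_exposed: int,
                  cases_unexposed: int, n_unexposed: int) -> float:
    """Incidence in the exposed group divided by incidence in the unexposed group."""
    incidence_exposed = cases_exposed / n_exposed
    incidence_unexposed = cases_unexposed / n_unexposed
    return incidence_exposed / incidence_unexposed

# Hypothetical figures: 130 cases among 10,000 smokers
# and 10 cases among 10,000 non-smokers
rr = relative_risk(130, 10_000, 10, 10_000)
print(f"relative risk = {rr:.1f}")
```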

THE RANGE AND CLASSIFICATION OF AVAILABLE METHODS

Observation → hypothesis → experiment → accept/reject/modify hypothesis

This is the traditional model of scientific investigation; an observation generates
a hypothesis and an experiment is designed to test this hypothesis. In his famous
1854 cholera study, John Snow noted that cases of cholera during an outbreak in
Soho, London, tended to be clustered around a particular water source, the Broad
Street pump. Cases also occurred in people who lived outside this pump’s normal
catchment area but nevertheless got their water from this pump; people who lived
within this catchment area but had an alternative water supply were usually not
affected. From these observations, he hypothesised that water from this pump
was the source of the infection and removal of the pump handle was the interven-
tion that confirmed this hypothesis.
Biomedical researchers now have a range of sophisticated observational and
experimental methods available to help establish the causes of a disease and the
effectiveness of potential treatments. The key difference between an observa-
tional study and an experiment is that an experimenter imposes some constraint
or intervention to see if this produces the result predicted by the hypothesis.
Experiments usually compare the outcome measure in control and test groups or
control and test periods within the same group. Experiments are designed with
the aim of ideally making the intervention the only consistent difference between
control and experimental groups or periods. If this is achieved, then any statisti-
cally significant difference in the measured outcome between the groups can be
confidently attributed to the intervention. A clear cut and reproducible result
from a well-designed and executed experiment is frequently taken as proof of the
hypothesis being tested.
Observational studies cannot technically prove cause and effect. They can only
show that two variables are related and indicate the strength of that association.
They thus allow a hypothesis about cause and effect to be generated. Measuring
levels of blood cholesterol and relating them to subjects’ estimated habitual
saturated fat intake is an observational approach; it produced the hypothesis that
replacing dietary saturated fat with polyunsaturated fat would reduce
blood cholesterol level. This hypothesis was tested in controlled experiments,
which showed that this substitution did indeed lead to a
significant reduction in blood cholesterol level.
Some associations are so strong and consistent that they may be all but
accepted as proof of cause and effect, such as the link between cigarette smoking
and lung cancer. Experimental studies can often confirm or disprove the hypothesis
generated by observational research but in some instances, including in my
specialist area of nutrition, the obvious controlled experiment is difficult
to conduct, unethical or, in some cases, totally impractical. Observational
studies suggest that long-term, habitually high fruit and vegetable consumption
is associated with reduced cardiovascular and total mortality. It would not be
practically feasible to carry out the large, long-term experiment needed to test
directly whether controlled changes in fruit and vegetable consumption lead to
reduced mortality.
I have grouped the methods available to biomedical scientists into four
categories:

●● Observational human studies. These can range from simple anecdotal or
clinical observations through to sophisticated analytical methods like cohort
studies that may involve monitoring the behaviour and health outcomes of
many tens of thousands of subjects for years or even decades.
●● In vitro and animal experiments. Such studies often provide the initial evi-
dence that ultimately leads to new treatments or preventive strategies; their
primary role in biomedicine is to generate hypotheses about what would
happen in people. Almost all Nobel Prize winners in physiology or medicine
have made use of non-human experiments in their prize-winning work.
●● Experiments with human subjects. These range from short-term experiments
often looking at the effects of interventions on risk factors like blood choles-
terol, blood pressure or measures of oxidant stress through to randomised,
placebo-controlled trials (RCTs) with holistic outcomes like disease inci-
dence or total mortality. RCTs are seen as the gold standard of evidence in
medical research.
●● Meta-analysis. This is a very popular and fairly recent addition to the tools
available in the biomedical researcher’s armoury. It involves weighted
aggregation of similar studies testing the same hypothesis, i.e. studies with
similar methodology and the same outcome measure. It is essentially a
statistical procedure which can be used to amalgamate almost any type of
study from animal experiments and observational studies through to RCTs.
Meta-analysis effectively amalgamates several smaller studies into one larger
study of greater statistical power and gives a consensus of the results from
the smaller studies. Meta-analyses of RCTs are placed at the pinnacle of
the evidence hierarchy in medical research. The results of a well-conducted
meta-analysis would be seen as one step higher in the hierarchy of evidence
than a single study of the type it combines. In practice, differences
between individual studies in things like the level of intervention (e.g. drug
dose), selection criteria for subjects (e.g. diagnostic criteria for a disease) and
outcome measures may make meta-analysis problematical or impossible.
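
The "weighted aggregation" at the heart of a meta-analysis can be sketched as below. This assumes the simplest fixed-effect approach, in which each study is weighted by the inverse of its variance; other weighting schemes exist and the three trial results are invented for illustration.

```python
from math import sqrt

def pooled_fixed_effect(estimates, standard_errors):
    """Inverse-variance weighted average of study estimates and its standard error."""
    weights = [1.0 / se ** 2 for se in standard_errors]
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    pooled_se = sqrt(1.0 / sum(weights))
    return pooled, pooled_se

# Hypothetical trials: change in blood pressure (mmHg) and its standard error
effects = [-4.0, -2.0, -3.0]
standard_errors = [2.0, 1.0, 1.5]

pooled, pooled_se = pooled_fixed_effect(effects, standard_errors)
print(f"pooled effect = {pooled:.2f} mmHg (standard error {pooled_se:.2f})")
```

The pooled standard error is smaller than that of any single trial, which is the sense in which the combined analysis has greater statistical power.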

OBSERVATIONAL HUMAN STUDIES


Cross-cultural comparisons
In cross-cultural comparisons, differences in an environmental, lifestyle or
dietary factor in populations are related to an outcome measure like their mor-
tality rates or incidence of a disease. For example, average sugar consumption
could be related to numbers of decayed, missing or filled teeth in an age group
of children or tobacco consumption could be related to death rates from lung
cancer. Standardised Mortality Ratio (SMR) is an age-corrected measure of mor-
tality that allows mortality rates in populations with different age structures to
be directly compared.
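
The text does not spell out how the SMR is calculated. The sketch below assumes the usual definition, in which observed deaths are divided by the deaths expected if reference age-specific death rates applied to the study population, expressed as a percentage. All the numbers are hypothetical.

```python
def standardised_mortality_ratio(observed_deaths, population_by_age, reference_rates):
    """Observed deaths as a percentage of the deaths expected from reference rates."""
    expected = sum(population_by_age[age] * reference_rates[age]
                   for age in population_by_age)
    return 100.0 * observed_deaths / expected

# Hypothetical study population and reference annual death rates by age band
population = {"45-54": 2000, "55-64": 1500, "65-74": 1000}
reference = {"45-54": 0.004, "55-64": 0.010, "65-74": 0.025}

# Expected deaths = 8 + 15 + 25 = 48, so 60 observed deaths is a 25% excess
smr = standardised_mortality_ratio(60, population, reference)
print(f"SMR = {smr:.0f}")
```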
Back in 1973, Lillian Gleiberman published a paper in the rather obscure jour-
nal Ecology of Food and Nutrition in which she correlated average blood pres-
sure with estimated average daily salt intake in 27 populations around the world.
Figure 1.3 was plotted using some of Gleiberman’s male data and shows a highly
significant positive correlation which suggests that around 37% (r2) of the varia-
tion in blood pressure can be explained by variation in average salt intake. These
findings are consistent with the now widely accepted belief that high salt intake
is a causative factor in the development of hypertension. Gleiberman would have
been aware of the many limitations in her original study but the data was con-
vincing enough to suggest that the hypothesis that high salt intake is a cause of
high blood pressure was worthy of further investigation.

Figure 1.3 Relationship between average blood pressure and estimated salt
intake in 27 male populations around the world (x-axis: mean daily salt intake, g;
y-axis: mean blood pressure, kPa; r = 0.612, P < 0.001).

Time trends
Changes in a population’s behaviour or an environmental change can be related
to changes in disease frequency. Changes in sugar consumption might be related
to changes in rates of tooth decay; the start of bulk sugar imports to an island
population might be associated with a sharp increase in rates of tooth decay in
the island’s children. The increase in dental disease with the increased use of
sugary foods on the remote South Atlantic island of Tristan da Cunha after 1937
is widely used as evidence for the key role of sugar in promoting dental caries.
More recently, it has been claimed that the introduction of a fluoride supplementation
programme in 1982 was associated with a subsequent large increase in the
number of caries-free children on the island (see Chestnutt, 2003). Changes in a
population’s alcohol consumption could be related to the incidence of liver disease.
As part of a wide-ranging review of the epidemiology of alcoholic liver disease,
Mann et al (2003) looked at the effects of prohibition on liver disease mortality
in the USA. The introduction of prohibition in 1920 was associated with a huge
decline in death rates from liver cirrhosis from the high levels seen at the begin-
ning of the 20th century. Rates of liver disease rose again after prohibition ended
in 1933. Observations such as these have been used to generate the now gener-
ally accepted hypotheses that high alcohol consumption causes liver disease and
sugar causes tooth decay.
In Chapter 2 (see Figure 2.1), it will be shown that cot death rates rose
following promotion of the front sleeping position and fell sharply once
“Back to Sleep” campaigns were started.

Migration studies
Migration exposes migrants to a new environment and usually results in changes
in their diet and lifestyle as they acculturate. There is also a tendency for their
health characteristics and disease patterns to move towards those of the native
population in their new home. This suggests that many of the differences in
disease and mortality patterns between populations are due to differences in
environment, lifestyle and diet rather than due to genetic differences between
them. Studies on the rates of multiple sclerosis in migrants have generated sev-
eral theories about some of the causes of this disease. When people migrate from
high-incidence areas like the UK to low-incidence areas, rates tend to fall. This
fall seems to be more pronounced in those who migrate before 15 years old than
in those who migrate later in life – does a childhood illness or exposure increase
risk of multiple sclerosis? When people migrate from low- to high-incidence areas, they
tend to retain their low risk. Studies with Japanese migrants to the USA have
been used to support several proposed links between lifestyle factors and disease.
As these Japanese migrants acculturated, their rates of heart disease rose but
their rates of stroke fell, which has been linked to their increased saturated fat
and reduced salt consumption in America.
Migration within national boundaries may also be studied. For example,
Poulter et al (1990) found that members of the Kenyan Luo tribe who had
migrated to Nairobi had higher blood pressure than those who had remained
in their villages. The higher salt intake of the urban diet was consistent with the
hypothesis that higher salt intake causes increased blood pressure.

Anomalous groups
Some groups may deviate markedly from a consistent cross-cultural trend. For
example, traditional Greenland Eskimos were said to have relatively low levels
of heart disease despite eating large amounts of animal-derived fats. Most of the
animal fat in the Inuit diet came from fish and marine mammals, and this observation
triggered the current interest in fish oils and the omega-3 fatty acids that are
abundant in marine oils. However, recent claims suggest that early studies may have
underestimated heart disease rates amongst the Inuit (see Webb, 5 August 2018
for a referenced discussion).
Seventh Day Adventists in the USA tend to abstain from alcohol and tobacco
use and about half do not eat meat. Adventists also have lower rates of some cancers
and heart disease than other Americans and those Adventists who do and do not
eat meat also have different disease and mortality patterns; these observations have
generated many hypotheses about how diet affects health and mortality patterns.
Observations made upon victims of some congenital disorder, injury or
adverse environmental exposure may provide useful evidence about possible
causes or treatments for ill health. Observations upon individuals with rare inher-
ited immunological deficits played a role in helping us understand the workings
of the immune system and observations of citizens of Hiroshima and Nagasaki
after the Second World War told us much about the short- and long-term effects
of exposure to radiation. Familial hypercholesterolaemia is caused by a domi-
nant gene and leads to a very elevated plasma cholesterol concentration. Even
people who are heterozygous for the gene have high cholesterol levels and are prone
to premature coronary heart disease and the few individuals who are homozygous
for this gene have usually died in childhood from heart disease without
intensive treatment. This confirms that high plasma cholesterol concentrations
and coronary heart disease risk are positively associated and because high cho-
lesterol is known to be a primary consequence of the genetic defect, it supports
the proposition that this association is causal.

Cross-sectional surveys
One can look for correlations between measured variables in data from cross-
sectional surveys like the UK’s National Diet and Nutrition Survey (NDNS) or
the Health Survey for England (HSE). In studies of this type, it has been consis-
tently shown that the proportion of people who are overweight or obese rises in
the lower activity categories, i.e. consistent with the hypothesis that inactivity is a
cause of excessive weight gain and obesity. Figure 1.4 was derived using data taken
from the 1993 HSE and has been used in past editions of my books. Subjects who
are overweight or obese tend to have higher blood pressures than those of the
same age and sex who are classified as normal weight; this supports the hypothesis
that being overweight increases blood pressure and that weight loss might
be helpful in reducing blood pressure in people with hypertension.

Figure 1.4 The effect of estimated activity level (low, medium, high) upon the risk of being overweight or obese (y-axis: odds ratio for BMI over threshold; series: BMI 25+ and BMI 30+).

Case-control studies
In case-control studies, the exposure to a suspected causative factor is compared
in matched groups of people with or without a disease or disease indicator. For
example, the rates of past smoking in lung cancer patients (cases) can be compared
to rates in matched subjects who do not have lung cancer (controls). Such studies
find a much higher smoking rate (15–40 times) in the cases than in the controls
which helped identify smoking as a major cause of lung cancer. In their land-
mark 1950 study, Doll and Bradford Hill (1950) interviewed 700 London hospital
patients with lung cancer to determine their previous smoking behaviour. Their
responses were compared to those of patients with other cancers and cancer-free
patients of the same age and sex. They found that lung cancer patients were much
more likely to have smoked and they estimated that above the age of 45 years, the
risk of developing lung cancer in those who smoked more than 25 cigarettes a day
might be as much as 50 times as high as those who did not smoke. Even allowing
for some inaccuracy in this estimate, it would be difficult not to believe that this
was probably a causal association. The rate of use of the front sleeping position in
infants who have died of cot death (cases) has been found to be much higher than in
matched survivors (controls). An expert report published by the UK Department of Health
(DoH, 1993) reported that 20 studies of this type from different countries had
found that the rate of use of the front sleeping position was between 2 and
12 times higher in those who had suffered cot death than in those who had
not. These results provide strong support for the hypothesis that the front sleep-
ing position increases the risk of cot death.
These sorts of case-control studies are often called retrospective studies
because they involve asking about past behaviour. However, they can be made
prospective if say blood samples from a large group are collected and stored, so
that analytes in the stored blood can be compared in those who later develop
particular conditions (cases) and those who do not (controls).
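
Because a case-control study starts from cases and controls rather than from exposed and unexposed groups, incidence cannot be measured directly. Results are therefore often summarised as an odds ratio, a measure this section does not name but which is used in Figure 1.4. The counts below are hypothetical.

```python
def odds_ratio(exposed_cases: int, unexposed_cases: int,
               exposed_controls: int, unexposed_controls: int) -> float:
    """Odds of exposure among cases divided by odds of exposure among controls."""
    odds_in_cases = exposed_cases / unexposed_cases
    odds_in_controls = exposed_controls / unexposed_controls
    return odds_in_cases / odds_in_controls

# Hypothetical study: 60 of 100 cases and 20 of 100 controls were exposed
ratio = odds_ratio(exposed_cases=60, unexposed_cases=40,
                   exposed_controls=20, unexposed_controls=80)
print(f"odds ratio = {ratio:.1f}")
```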

Cohort studies
Initial measurements are made on a cohort of people; things like their tobacco
usage, activity levels, alcohol consumption, salt intake, body mass index or blood
cholesterol levels. The cohort is then monitored, often for many years, and any
illnesses and deaths recorded. Incidence of specific diseases or risk of death
can then be related to the initial measurements, e.g. the risk of having a fatal
heart attack could be related to the initial blood cholesterol level and this would
probably show a progressive increase in heart disease risk with increasing blood
cholesterol level. One might relate fitness level or activity to the risk of becom-
ing overweight or obese; several studies have found, for example, that hours of
TV watching in children is strongly associated with risk of becoming obese. To
get enough data (e.g. deaths from or cases of a disease) for statistical analysis
often requires thousands of subjects to be monitored for several years. Cohort
studies are the most powerful of the observational methods but they are often
expensive, labour intensive and may take years to generate any useful data.
Several famous cohort studies have recruited very large samples and have fol-
lowed these subjects for decades. Thus, the Framingham study based in the town
of that name in Massachusetts, USA, started in 1948 with an initial sample of
5,000 adults and the Nurses’ Health Study was started in 1976 with a sample
of 120,000 married American female nurses. Both studies are still ongoing and
have been widened to include additional measurements and new cohorts over the
years. The European Prospective Investigation into Cancer and Nutrition (EPIC)
is one of the largest cohort studies ever set up. Detailed information about diet
and lifestyle, physical measurements and blood samples were collected from over
500,000 people from 23 centres in 10 different European countries. These dietary
and lifestyle characteristics were then related to the risk of developing cancer.
Cohort studies such as these have identified many associations between aspects
of lifestyle and diet and disease risk.

Figure 1.5 The relationship between daily tobacco usage (0, 1–14 g, 15–24 g, 25+ g) and death rate from lung cancer (plotted death rates: 0.07, 0.47, 0.86 and 1.66, respectively).
The study by Doll and Hill, published in 1956, is a classic example of an early
cohort study. Around 40,000 British physicians filled in questionnaires about
their smoking habits; causes of death in this cohort were then monitored for several
years. In men over 35 years, death rate from lung cancer increased progressively
with the amount of tobacco smoked (see Figure 1.5). Overall, the risk of dying of
lung cancer was 13 times higher in smokers than in non-smokers and
the relative risk in the highest smoking category was 24!
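
The graded increase can be seen by dividing each death rate plotted in Figure 1.5 by the rate in non-smokers. This sketch assumes that the four plotted values correspond, in ascending order, to the four tobacco categories; the units are those used in the figure.

```python
# Lung cancer death rates as plotted in Figure 1.5, keyed by daily tobacco use
death_rates = {"0": 0.07, "1-14 g": 0.47, "15-24 g": 0.86, "25+ g": 1.66}

baseline = death_rates["0"]  # rate in non-smokers
relative_risks = {category: rate / baseline
                  for category, rate in death_rates.items()}

for category, rr in relative_risks.items():
    print(f"{category:>8}: relative risk = {rr:.1f}")
```

The heaviest smoking category comes out at about 24 times the non-smoker rate, matching the figure quoted in the text.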

Association in observational studies does not prove cause and effect
Epidemiological methods can only demonstrate association between two vari-
ables but association between two variables does not prove cause and effect. An
association remains just an association even if millions of subjects are used as
with the negative association between fruit and vegetable consumption and total
and cardiovascular mortality.
If the variables A and B are found to be associated, then it could be that
A causes B (cause and effect) or that B causes A (effect and cause). Children
who are overweight or obese watch more television than those who are lean.
This could mean that inactivity is a cause of their excessive weight gain or that
being overweight makes exercise more stressful and causes them to curtail their
activity. The observation that current TV watching predicts their future risk of
becoming obese gives added support to the hypothesis that inactivity causes
weight gain.
It may be that some third variable C is linked to both A and B, so that A and B
are not causally linked and the association between them arises because both
are related to C; C is then called a confounding variable. There may
be multiple confounding variables responsible for an apparent link between A
and B. It is also possible that any apparent association between A and B is just a
chance observation or caused by bias in the study, especially if the association is
a weak one. If a relatively weak association was found between alcohol consump-
tion and lung cancer, this might mean that alcohol is a direct contributory cause
of lung cancer but it could be that people who drink more alcohol are also more
likely to smoke or have been exposed to passive cigarette smoke. The associa-
tion between alcohol consumption and risk of lung cancer could, therefore, be
because drinkers get more exposure to carcinogenic cigarette smoke. Correction
for cigarette smoking might eliminate the association between alcohol and lung
cancer risk.
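
The alcohol and lung cancer example can be illustrated with invented numbers. In the sketch below, drinking has no effect on risk within either smokers or non-smokers, yet drinkers appear to be at higher risk overall simply because more of them smoke. Each entry is a hypothetical (cases, people) pair.

```python
# Hypothetical counts: (lung cancer cases, number of people)
strata = {
    "smokers":     {"drinkers": (40, 800), "non_drinkers": (10, 200)},
    "non_smokers": {"drinkers": (1, 200),  "non_drinkers": (4, 800)},
}

def relative_risk(group):
    """Risk in drinkers divided by risk in non-drinkers."""
    cases_d, people_d = group["drinkers"]
    cases_n, people_n = group["non_drinkers"]
    return (cases_d / people_d) / (cases_n / people_n)

# Crude analysis: ignore smoking and pool everyone together
crude = {key: (sum(s[key][0] for s in strata.values()),
               sum(s[key][1] for s in strata.values()))
         for key in ("drinkers", "non_drinkers")}
crude_rr = relative_risk(crude)

# Corrected analysis: look within each smoking category separately
stratum_rr = {name: relative_risk(group) for name, group in strata.items()}

print(f"crude relative risk for drinkers: {crude_rr:.2f}")
for name, rr in stratum_rr.items():
    print(f"relative risk among {name}: {rr:.2f}")
```

Real data are rarely this tidy, and the correction is only as good as the measurement of the confounder.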
A very strong negative correlation (r² = 0.97) was found between the number
of fresh lemons imported into the USA from Mexico and US road fatalities
over a 5-year period. Intuitively, one knows this cannot be a cause and effect
relationship despite its strength. Lemon imports happened to be rising while
highway fatalities were falling over the same period and so they appear to be related; time would
be the key confounding variable in this instance (see Johnson, 2008).
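
Any two quantities that drift steadily over the same period will show a correlation of this kind. The sketch below uses invented yearly figures, not the published data, for one rising series and one falling series.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Invented figures for five consecutive years
years = [1, 2, 3, 4, 5]
imports = [100, 120, 135, 160, 180]  # rising over time
fatalities = [50, 47, 45, 41, 38]    # falling over time

r = pearson_r(imports, fatalities)
print(f"imports vs fatalities: r = {r:.3f}, r squared = {r * r:.3f}")
print(f"imports vs year:       r = {pearson_r(years, imports):.3f}")
print(f"fatalities vs year:    r = {pearson_r(years, fatalities):.3f}")
```

Both series are really tracking the passage of time, so their strong correlation with each other says nothing about cause and effect.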
When people design and analyse epidemiological studies, they try to correct
for the effects of confounding variables like correcting the association between
alcohol consumption and lung cancer for cigarette smoking. However, it may
not always be possible to confidently identify all of the confounding variables
and some may be difficult to accurately correct for because there may be only
a crude estimation of the confounder or it may not have been measured at all.
Many epidemiological studies, where activity or physical fitness is a likely con-
founder, have either not attempted to correct for it or only used crude categori-
sation of people into wide self-reported activity bands. Most studies relating a
dietary variable to a disease routinely correct for social class, smoking behav-
iour and other dietary variables but many past studies would not have corrected
for physical activity levels. The choice of which potential confounding variables
are corrected for and how this correction is done may determine whether an
association remains statistically significant or not (see Chapter 3 under multiple
modelling).
Many reports of associations in scientific papers claim that an association
remains significant after correction for a variety of likely confounding variables.
However, there is no statistical magic wand that precisely corrects for all of the
possible confounders. The process of correction is an imprecise, sometimes very
imprecise, process especially if the measurement of the confounder is only a
crude categorisation or unreliable estimate. The choice of which variables to cor-
rect for is a decision of the authors who may be limited by data availability. In
many studies, there may be no data on one or more potentially confounding vari-
ables because the data was collected for a different purpose, e.g. as part of large
national surveys intended to monitor the health, diet and general well-being of
the population. Even measurement of the primary variables under test may be
relatively crude, particularly in nutritional studies.
Redemonstrating existing, well-documented associations with ever-larger
numbers of subjects, like the inverse association between fruit and vegetable
consumption and total mortality, does not really help in deciding whether the
association is causal unless better information on likely confounding variables is
also available.

Criteria for establishing causation in epidemiology


There are several characteristics of an epidemiological association that increase
the likelihood of it being causal. These are sometimes known as the Bradford Hill
Criteria after the epidemiologist Austin Bradford Hill and are discussed below.

STRENGTH OF THE ASSOCIATION


The stronger an association is, the more likely it is to be due to cause and effect.
The 13-fold increase in lung cancer mortality in smokers is a strong indication
that smoking is likely to be an important cause of lung cancer, and this is now
generally accepted. Many reported associations between dietary characteristics
and disease have relative risks that are well under 2. It is much more likely that
some of these may be caused by bias or due to the residual effects of some con-
founding variable(s).

TEMPORALITY AND REVERSIBILITY


The suspected cause must precede its effect. For example, any rise in tooth decay
in an island population must occur shortly after supplies of sugar start arriving
on the island. The repeal of prohibition in the USA in 1933 was followed shortly
after by a rise in the levels of alcoholic liver diseases as alcohol consumption rose
again.
If reduced exposure to the suspected cause is associated with reduced inci-
dence of the disease, then it is more likely to be a causal association. Interruption
of sugar supplies to an island population (e.g. during wartime) should be fol-
lowed by a decrease in rates of tooth decay. Epidemiological evidence led to the
hypothesis that increased use of front sleeping was responsible for the rise in cot
death rates in the 1970s. The halving of the rate of cot deaths in the year after the
start of the “Back to Sleep” campaign in the UK, and similar observations in other
countries, provided compelling support for this hypothesis.
Health promotion initiatives are based upon this belief in reversibility:
reduced tobacco use will lead to reduced lung cancer risk, moderation of alco-
hol consumption will reduce the risk of liver disease, and a reduction in blood
cholesterol will reduce risk of a heart attack. Improved outcome requires that
behaviour change starts before irreversible damage has been done; it is too late to
give up smoking after advanced lung cancer has been diagnosed.

SPECIFICITY OF VARIABLES
A single cause has a single effect and the more specific an association is, the
more likely it is to be causal. Correction for confounding variables should not
eliminate the association although the difficulties of this correction process were
noted earlier and will be discussed further in Chapter 3.

IS IT GRADED OR DOSE DEPENDENT?


There is usually a graded effect of true causes rather than a threshold effect,
as seen earlier in the progressive increase in lung cancer risk with increased
tobacco use.

CONSISTENCY OF THE FINDINGS


If an association is found in a variety of studies using different investigative
approaches, then it is more likely to be due to cause and effect. Many different
types of study indicate a link between smoking and lung cancer, and it is now
almost universally accepted that this association is causal.
Note that several different types of epidemiological studies suggest that there
is a negative association between intake of dietary antioxidants and risk of can-
cer and heart disease. Despite this, controlled trials of several antioxidant supple-
ments in adults have failed to demonstrate any holistic benefits of taking them and
some appear to do net harm rather than net good (discussed further in Chapter 2).

PLAUSIBILITY
An association is more likely to be due to cause and effect if there is a plausible
mechanism supported by laboratory studies. A causal link between inactivity
and excessive weight gain is plausible because inactivity reduces energy
expenditure, making energy surplus and increased fat storage seem more likely.
A note of caution: scientists are very good at producing intellectually
satisfying mechanisms to explain how associations could be due to cause and
effect. Sometimes equally plausible mechanisms can be produced to explain the
opposite finding. The most plausible and intellectually satisfying explanations
are not always those that prove to be correct.

COHERENCE
The suggested cause and effect relationship should not conflict with other relevant
knowledge whether it is other epidemiological evidence or the results of experi-
ments. In the case of antioxidants, there is just such a conflict because although
there is substantial epidemiological evidence that high antioxidant intakes are
associated with lower mortality, several large RCTs of antioxidant supplements
have reported similar or increased mortality in the supplemented group.

ANALOGY
The case is strengthened if the proposed cause and effect relationship is analo-
gous to another known cause and effect relationship. Type 2 diabetes is the result
of decreased tissue sensitivity to insulin rather than a primary failure of insulin
production. Most obese people have high levels of leptin, a hormone produced by
adipose tissue that is thought to regulate body fat stores by reducing food intake.
This is consistent with the proposition that most obesity is driven by reduced
leptin sensitivity rather than a failure of production, just as type 2 diabetes
is driven by declining insulin sensitivity.

EXPERIMENTAL EVIDENCE
Evidence from whatever controlled experiments are possible should also be con-
sistent with the cause and effect hypothesis. This is clearly not the case with the
antioxidant example discussed earlier.

ANIMAL AND IN VITRO EXPERIMENTS


In vitro experiments are literally those done “in glass”. This category includes
experiments done with microorganisms, isolated cells, cell-free systems like bro-
ken cells or isolated enzymes and pieces of animal, human or plant tissue.
In vivo experiments involve the use of living animals, people or plants.
In silico studies are those done using computer simulations and although not
discussed in this book, they are increasingly being used in some areas like drug
design.
Animal and in vitro experiments are often the starting point of any major new
advance in the biomedical sciences. They can only properly be used to generate
hypotheses about what would happen in people and these hypotheses must then
be tested with human subjects. Showing that a substance kills bacteria in vitro is
only a first step in the development of a new antibiotic but something that failed
this test would be an unlikely candidate for an antibiotic. Showing that a sub-
stance kills or slows the growth of cultured tumour cells or even has beneficial
effects upon an experimental animal cancer would be a promising start but a
long way from showing that it might be useful in treating or preventing human
cancer.
Poorly interpreted animal experiments have considerable potential to mislead
human biologists. Sometimes medical researchers seem to have regarded mice or
rats almost as scale models of people rather than species that have evolved with
their own unique characteristics. In Chapter 2, I will suggest that poorly inter-
preted animal experiments played a role in the protein gap myth by exaggerating
the protein needs of growing children who grow much more slowly and need less
protein than rapidly growing experimental animals. The notion that some defect
in the heat-generating capacity of brown fat might be a major cause of human
obesity was mainly based upon studies with rats and mice (see Chapter 2).
Some who oppose animal experiments have suggested that differences
between species make animal experiments of little value in biomedical research.
Even though the responses of laboratory animals to an experimental intervention
can sometimes be a misleading guide to the likely human response, most
biomedical scientists believe that animal experiments have been and will
continue to be an essential tool in biomedical research. Some of the
alternatives favoured by opponents of animal experiments, like experiments with
cultured cells or microorganisms or even computer simulations, seem even less
likely to predict human responses.

Non-human experiments have played a central role in advancing understanding
in many areas of human biology:

●● Our understanding of the way in which many genetic diseases are inherited
stems from work with peas and fruit flies; experiments with microorganisms
have played a major part in our unravelling the molecular mechanisms of
inheritance.
●● Experiments with animals were crucial in advancing our understanding of
vitamin deficiencies like beriberi, scurvy, rickets and pellagra.
●● Animal experiments demonstrated that proteins and some amino acids were
essential nutrients but ironically may also have helped exaggerate the protein
requirements of growing children.
●● Experiments with dogs played a major role in the discovery of insulin and
producing the first effective treatment for type 1 diabetes.
●● The development of penicillin stemmed from studies using bacteria grown
on agar plates that had become contaminated with penicillin-producing
mould.

Animal and in vitro experiments are often an essential first step on the path
that leads to a major new advance in medical understanding or treatment. It
could be argued that the examples above are anecdotal, but Table 1.1 shows
that over an 80-year period around 90% of Nobel Prize winners in physiology
or medicine had made use of non-human species in their research. Nobel prizes
provide a relatively unbiased sample of key landmark advances in human biol-
ogy and medicine. This makes it difficult to sustain the argument that studies
with non-human organisms have not made a critical contribution to medical
research.
The notion of evolution from common ancestors underpins and unifies the
biological sciences. Evolution modifies a basic plan, and this has produced a vast
array of living organisms that share some basic characteristics but have specific
characteristics that make each species fit to survive in its own environment. For
example, all independently viable organisms have the same genetic code in their
DNA, which determines the amino acid sequences of the proteins that an organism
produces. This code was identified using in vitro studies. Therefore, substances
that cause mutation in bacteria are often carcinogenic because both processes are
caused by changes in DNA.

Table 1.1 “Subjects” (%) used by Nobel Prize winners in physiology or
medicine 1901–1984

Period       Humans   Primates   Other vertebrates   Other metazoans   Microbes
1901–1984    22a      6          73                  11                24
1901–1942    29       2          88                  17                7
1942–1984    20       8          66                  10                31

a About half of these also used other species

In vitro experiments
One might test whether a substance inhibited multiplication of cultured tumour
cells or killed cultured bacteria. Such experiments might be a useful way of ini-
tially screening potential anti-cancer agents or potential antibiotics or antisep-
tics. One might test whether a potential drug inhibited an enzyme, e.g. a group
of drugs known as statins block a key enzyme in the normal synthetic pathway
for cholesterol and are now very widely used as cholesterol-lowering drugs. One
might test whether a substance induces mutations in cultured bacteria because
substances that do this are potential carcinogens. Conversely, one might test
whether substances inhibit the effect of other mutagens and so potentially reduce
cancer risk. These types of experiments can be particularly useful as a way of
rapidly screening large numbers of compounds for therapeutic potential or their
potential hazard to human health. For example, testing the mutagenic potential
of food additives with tests like the Ames test is widely used as a way of screening
for likely carcinogens.

Animal experiments
Experiments with animals, especially small animals, have a number of obvi-
ous advantages over human experiments. The experiments are relatively cheap
because of the small size, short life span and rapid breeding of these animals.
It is possible to do animal experiments of much higher technical quality than
is usually possible with human experiments, e.g. one can keep them all under
identical, controlled environmental conditions and feed them on precisely
controlled diets. By using highly inbred strains of mice, one can also essentially
eliminate genetic variability between individual animals, e.g. tissue can be
transplanted between individuals from these inbred strains without triggering
rejection, just as with human identical twins. There are also fewer ethical
limitations on the nature and range of experiments that can be undertaken
despite strict legal regulations and ethical guidelines governing animal experi-
ments. One cannot normally set up human experiments where serious harm to
the subjects is expected or likely, but this is frequently done in animal experi-
ments; COVID-19 vaccine testing would be greatly accelerated if one deliber-
ately exposed subjects to the virus.
One would expect that results from well-designed animal experiments should
have high repeatability or reliability but how valid is it to apply the results of
experiments done with inbred mice under highly controlled laboratory con-
ditions to genetically diverse, free-living people? If one did an experiment on
a single person from an inbred, isolated and newly discovered tribe of people
would one confidently expect the results to predict what would happen in people
generally let alone predict the response of mice? Animal experiments do not
always give a valid prediction of human responses and can only justifiably be
used to produce hypotheses about human responses. There are sometimes obvi-
ous reasons why the response of a rat or mouse might not predict the response
of a person. It is sometimes not even possible to predict the experimental
responses of one strain of mice from those of another strain.

The potential of animal experiments to mislead human biologists
Animal studies have the potential to mislead human biologists if there is not
careful consideration of how and whether the results from animal studies should
be applied to people. Scientists may sometimes apply their animal findings to
people without always fully considering why the responses of a laboratory rat
or mouse may not predict those of free-living people. In many cases, animal
experiments do indicate how human beings would respond in a given situa-
tion but sometimes there are clear biological reasons why what is observed in
a laboratory animal would not always be a good indication of what would hap-
pen in a person. Rapidly growing offspring of rats and mice require much more
of the energy in their food to be in the form of protein than do slow-growing
human babies and children. This is reflected in the big difference in protein
content between rat or mouse milk and human milk; rat and mouse milk has
four times as much of its energy in the form of protein as human milk. With the
benefit of hindsight, predicting the protein needs of human babies and children
from those of growing rats seems likely to be misleading (discussed under the
protein gap in Chapter 2).
During cold exposure small mammals, like mice, must generate more heat
to maintain their body temperature and energy-expensive heat generation in a
tissue known as brown fat (non-shivering thermogenesis) is the key to their sur-
vival in a cold environment. The main physiological response of people to cold
exposure is to conserve heat by reducing surface blood flow and reducing heat
losses. This basic biological difference was largely overlooked and the low body
temperature and poor cold tolerance of genetically obese mice was generally
interpreted as indicative of a thermogenic defect that caused their obesity and
similar defective brown fat thermogenesis was also suggested as a probable cause
of human obesity (see Chapter 2). Fish eat less in colder water because the body
temperature of fish and other poikilotherms falls and their metabolism slows
in the cold. Non-hibernating, homeothermic mammals increase their metabolic
rate in the cold to maintain their body temperature at around 37°C. Fish would be
a poor guide to the effect of environmental temperature on food requirements
in most mammals, as might studies with hibernating animals. The lesser known
capacity of mice to lower their body temperature and become torpid to conserve
energy (see Chapter 2) may similarly make some experiments with mice an unre-
liable guide to human responses.
After 3 weeks of gestation, a female mouse produces a litter of pups that weighs
around 40% of the mother’s pre-pregnant weight and then supplies milk to this
litter that enables it to double its birth weight in around 4–5 days. After 9 months’
gestation, a human mother produces a baby that is on average just 6% of her pre-
pregnancy weight. The human infant may need to be breastfed for 4–6 months
to double its birth weight. Studies with mice would seem like a poor guide to
the likely impact of pregnancy and lactation upon the nutritional needs of the
human mother.
If a rat or mouse is deprived of vitamin C, this has no impact upon its health
because, like most mammals, rats and mice can manufacture their own vitamin C.
Human beings, other primates and guinea pigs develop the deficiency disease
scurvy within a few weeks of vitamin C deprivation.
Penicillin is toxic to guinea pigs so if guinea pigs had been used in the early
testing of penicillin, it might have considerably delayed its development as an
antibiotic.
It may also be problematical to translate some quantitative findings like effec-
tive drug dosages or nutrient requirements of small mammals to human beings
because of differences in both size and inherent sensitivity. Scaling on a weight
basis (e.g. mg per kg of body weight) may be easy and convenient but drug or
nutrient doses may be more dependent upon the relative metabolic rates of the
two species than relative body weight. Species may also vary in their fundamen-
tal sensitivity to a drug or their requirement for a nutrient. Consider the problem
of trying to decide upon the appropriate dose of LSD (lysergic acid diethylamide) to use on
an elephant. This example is based on a real piece of research and Fiona Macrae
writing in the Daily Mail (9 November 2007) classified it as one of “the ten silliest
experiments of all time”. If the dose was estimated on a simple body weight basis
then scaling from the dose effective in a cat would give 297 mg but scaling from
the dose effective in people, it would be 8 mg. If the dose was scaled according
to the relative metabolic rate, then the dose would be 80 mg if one relied on the
cat or 3 mg if one used the effective human dose. As the brain is the main site of
action of LSD, if one used the relative brain sizes of people and elephants then the
predicted dose would be just 0.4 mg. The elephant was given a dose of 297 mg and
within minutes it went into convulsions and died.
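
The difference between the two scaling rules can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a reconstruction of the original calculation: the body weights and the human dose below are hypothetical round numbers, and the exponent of 0.75 relating metabolic rate to body weight is a commonly used approximation rather than a figure given in the text, so the output will not exactly match the doses quoted above.

```python
def scale_by_body_weight(dose_mg, source_kg, target_kg):
    """Give the target species the same dose per kg of body weight."""
    return dose_mg * target_kg / source_kg


def scale_by_metabolic_rate(dose_mg, source_kg, target_kg, exponent=0.75):
    """Scale the dose in proportion to metabolic rate.

    Metabolic rate is assumed to vary as body weight ** exponent; the
    default of 0.75 is an assumption, not a value from the text.
    """
    return dose_mg * (target_kg / source_kg) ** exponent


# Hypothetical inputs chosen for illustration only.
human_kg = 70.0
human_dose_mg = 0.2
elephant_kg = 3000.0

by_weight = scale_by_body_weight(human_dose_mg, human_kg, elephant_kg)
by_metabolism = scale_by_metabolic_rate(human_dose_mg, human_kg, elephant_kg)

print(f"scaled by body weight:    {by_weight:.1f} mg")
print(f"scaled by metabolic rate: {by_metabolism:.1f} mg")
```

Because the exponent is below 1, scaling up to a larger animal by metabolic rate always gives a smaller dose than scaling by body weight, which is the direction of the difference described above.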
As another example of this problem of scaling, consider the problem of trying
to model with laboratory mice, the situation of two human populations, one con-
suming 40 kg sugar/person/year and another consuming 20 kg per year. These
sugar intakes represent somewhere around 20% and 10%, respectively, of the
total calorie intake of these two human populations. If you scaled these sugar
doses simply according to the difference in body weight of people and mice, then
the two mice populations would receive amounts of sugar that represented just
2.5% and 1.3% of their calories in the form of sugar.
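
A rough sketch of this calculation follows. The body weights, the daily energy intakes and the energy value of sugar are all assumed round figures introduced for illustration; none of them comes from the text, and the exact percentages printed depend entirely on them.

```python
KCAL_PER_G_SUGAR = 4.0  # assumed approximate energy value of sugar


def sugar_share_of_energy(sugar_g_per_day, total_kcal_per_day):
    """Percentage of daily energy intake supplied by sugar."""
    return 100 * sugar_g_per_day * KCAL_PER_G_SUGAR / total_kcal_per_day


# Assumed values for illustration only (not from the text).
human_kg, human_kcal_per_day = 70.0, 2200.0
mouse_kg, mouse_kcal_per_day = 0.03, 10.0

# 40 kg of sugar per person per year, expressed per day.
human_sugar_g = 40_000 / 365

# Scale the sugar dose to the mouse purely on a body weight basis.
mouse_sugar_g = human_sugar_g * mouse_kg / human_kg

human_share = sugar_share_of_energy(human_sugar_g, human_kcal_per_day)
mouse_share = sugar_share_of_energy(mouse_sugar_g, mouse_kcal_per_day)

print(f"human: {human_share:.1f}% of energy from sugar")
print(f"mouse: {mouse_share:.1f}% of energy from sugar")
```

The mouse ends up with a far smaller share of its energy as sugar because a small mammal eats much more energy per kg of body weight than a person does, so a dose matched per kg is diluted in a proportionally larger diet.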
How significant is dietary cholesterol in elevating human blood cholesterol
and increasing atherosclerosis? The blood cholesterol of rabbits rises, and they
develop atherosclerosis if 1% cholesterol is added to their feed. Rabbits are
herbivores, so their natural diet would not contain cholesterol and one might
expect them to be ill-equipped to deal with an artificial dietary cholesterol load.
Omnivorous rats and people are much less sensitive to the effects of dietary
cholesterol.

HUMAN EXPERIMENTAL STUDIES


In human experimental studies, investigators allocate subjects to control and test
groups and determine the level of exposure (e.g. drug dose) of each test group to
whatever intervention is being tested. The design aims are that:

●● The test and control groups should be well-matched at the outset of the
experiment
●● The only consistent difference between the control and experimental groups
is the intervention being tested
●● The outcome measures are not influenced by the expectations of either the
subjects or the investigators

This should mean that any statistically significant difference in outcome between
groups can be confidently attributed to the intervention. Randomized, double-
blind, placebo-controlled trials (RCTs) should achieve these aims and are said
to be the gold standard of evidence in medicine. A highly statistically significant
effect obtained from an RCT can effectively prove the specific hypothesis being
tested. In some instances, a crossover design may be used where subjects act as
their own control; outcome measures are compared after periods on the con-
trol and real treatment. For example, subjects on low-salt diets had their blood
pressure measured after periods taking either salt tablets designed to negate the
effects of the low salt diet or a placebo.

A randomised double-blind, placebo-controlled trial


Unless numbers are very small, matching of groups is usually achieved by
randomly assigning subjects to test and control groups. If the distribution of
variables in groups is found to be incompatible with random allocation, this
indicates bias in the allocation and in some cases suggests that the data has
been fabricated.
A placebo is a dummy treatment that is indistinguishable from the real treatment
under test. Double-blind means that neither the subjects nor investigators know
who is receiving the real or dummy treatments until after the data collection has
been completed. Treatments (real or placebo) are given codes by an independent
person and decoding to identify real or placebo treatments is only done after the
data has been collected.
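
The allocation and coding steps described above can be sketched as follows. This is a simplified illustration, not a description of any real trial system: the function names, the group codes "A" and "B" and the subject numbering are all hypothetical, and a real trial would use a dedicated, audited randomisation procedure.

```python
import random


def allocate(subject_ids, seed=None):
    """Randomly split subjects into two equally sized coded groups, A and B."""
    rng = random.Random(seed)
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"A": sorted(shuffled[:half]), "B": sorted(shuffled[half:])}


def make_blinding_key(seed=None):
    """Randomly decide which code is the real treatment and which the placebo.

    The key would be held by an independent person and only consulted
    after data collection is complete.
    """
    rng = random.Random(seed)
    labels = ["treatment", "placebo"]
    rng.shuffle(labels)
    return dict(zip(("A", "B"), labels))


# Twenty hypothetical subjects, numbered 1 to 20.
groups = allocate(range(1, 21), seed=1)
key = make_blinding_key(seed=2)

# Investigators and subjects see only the codes A and B.
print(groups)
```

Keeping the key in a separate function mirrors the separation of roles: whoever runs the trial works only with the coded groups, while the mapping from code to treatment stays with the independent person.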
The “placebo effect” can be a substantial one, especially where the outcome is
subjective (e.g. pain intensity, a symptom like hot flushes or level of depression).
A placebo may improve subjective outcomes in over 50% of subjects and may
also affect objectively measured parameters like heart rate and blood pressure.
Investigator bias may occur because investigators tend to see the results they
expect or want, or because there are subtle differences in the way they treat or
measure subjects in control and test groups.
Many things can influence the size of the placebo effect, such as giving mul-
tiple pills, the pill’s colour, whether capsules or tablets are used, route of admin-
istration (oral or injection) and the investigator’s perceived enthusiasm for the
6. It is asserted that mimetic resemblances are produced in the most
diverse ways; that the modes whereby the similarity in appearance is
brought about are varied, but the result is uniform.

“A lepidopterous insect,” writes Poulton (p. 251), “requires above all
to gain transparent wings, and this, in the most striking cases that
have been studied, is produced by the loose attachment of the scales,
so that they easily and rapidly fall off and leave the wing bare except
for a marginal line and along the veins (Hemaris, Trochilium).”

7. It is alleged that the imitator and imitated are always found in the
same locality. If they were not, no advantage would be derived
from the resemblance. It is further alleged that where the mimicking
species is edible it is invariably less abundant where it occurs than the
species it imitates.

8. It is pointed out that it sometimes happens that where in the
mimic the sexes differ in appearance, the male copies one species,
the female quite a different one. This is said to be because the
deception would be liable to be detected if the mimicking species
became common relatively to that which is imitated. “We therefore
find that two or more models are mimicked by the same
species” (Essays on Evolution, p. 372).

Occasionally the female mimics two other species, i.e. she occurs in
two forms, each like a different species.

It sometimes happens that the female alone mimics. This is said by
Wallace to be due to her greater need of protection. When she is
laden with eggs her flight is slow, and therefore she requires a special
degree of protection.

9. It is said that in some species we find a non-mimetic ancestor
preserved on islands where the struggle for existence is less severe,
while on the adjacent continent mimicry has been developed.

10. It is alleged that in the cases where moths resemble butterflies
the former are either as diurnal as the butterflies or are species which
“readily fly by day when disturbed.”

11. It is asserted that some seasonally dimorphic forms are examples
of mimicry only in one state, in the form that comes into being at the
time when the struggle for existence is most severe; that is to say, in
the dry season, in Africa, when insect life is far less abundant than in
the rainy season.

In other cases the mimicry of the dry-weather form is said to be far
more perfect.

Instances of this phenomenon are set forth in Professor Poulton’s
Essays on Evolution.

Alternative Theories

It will be observed that we have quoted very largely from Professor
Poulton’s work. Our reason for so doing is that he appears to be the
most prominent advocate of the theory of protective mimicry, and his
work, which was published in 1908, may be taken as the latest
Neo-Darwinian pronouncement on the subject.

Hence if we can show, as we believe we can, that his arguments are
not sound, we may take it that we have demonstrated that the theory
in its present form is untenable.

It is worthy of notice that Professor Poulton sets forth three other
suggestions which have been proposed as substitutes for natural
selection as an explanation of the phenomena of mimicry.

The first is the theory of External Causes, namely, that the
resemblance is due to some external cause, such as food or climate.

The second is the theory of Internal Causes, which states that
mimetic resemblance is due to internal developmental causes.

The third is the suggestion that sexual selection has caused the origin
of these resemblances.

He then proceeds to demolish these to his own satisfaction, and adds
triumphantly, “The conclusion appears inevitable that under no
theory, except natural selection, do the various resemblances of
animals to their organic and inorganic environments fall
together into a natural arrangement and receive a common
explanation” (p. 228).

To reasoning of this description there is an obvious reply. Even if it be
granted that the alternatives to the theory of natural selection as set
forth by Professor Poulton are untenable, it does not follow that
natural selection affords an adequate explanation. If A, B, C and D
are charged with theft and the prosecutor proves that neither A nor B
nor C committed the theft, this will not suffice to secure the
conviction of D. It is quite possible that a fifth person, E, may be the
culprit.

Much of the popularity of the theory of natural selection is due to the
fact that biologists have not yet been able to discover a substitute for
it.

It seems to us that the proper method of making progress in science
is not to bolster up natural selection by ingenious speculations, but to
look around for other hitherto undiscovered causes.

KING-CROW OR DRONGO

This very conspicuous black bird (Dicrurus ater), ranging from Africa
to China, is a striking feature of the landscape wherever it occurs.

DRONGO-CUCKOO

The fork of the tail in this bird is unique among cuckoos, but is
nevertheless much less developed than in the supposed model, and
may be an adaptation for evolution in flight, as such tails usually
appear to be.

Objections to the Theory that the so-called Cases of Mimicry owe their
Origin to Natural Selection

It is obvious that for one creature to resemble another can be of little
or no benefit to either until the resemblance is tolerably close. It is,
therefore, insufficient to prove the utility of the perfected
resemblance. We may readily grant this and yet maintain that
the origin of the resemblance cannot be due to the action of natural
selection.

The Drongo-cuckoo (Surniculus lugubris) displays so great a likeness
to the King Crow (Dicrurus ater) that it is frequently held up by
Neo-Darwinians as an excellent example of mimicry among birds. But D.
Dewar writes, on page 204 of Birds of the Plains: “I do not pretend to
know the colour of the last common ancestor of all the cuckoos, but I
do not believe that the colour was black. What then caused
Surniculus lugubris to become black and assume a king-crow-like tail?

“A black feather or two, even if coupled with some lengthening of the
tail, would in no way assist the cuckoo in placing its egg in the
drongo’s nest. Suppose an ass were to borrow the caudal appendage
of the king of the forest, pin it on behind him, and then advance
among his fellows with loud brays, would any donkey of average
intelligence be misled by the feeble attempt at disguise? I think not.
Much less would a king-crow be deceived by a few black feathers in
the plumage of a cuckoo. I do not believe that natural selection has
any direct connection with the nigritude of the drongo-cuckoo.”

Darwin was fully alive to this difficulty when he wrote: “As
some writers have felt much difficulty in understanding how the
first step in the process of mimicry could have been effected through
natural selection, it may be well to remark that the process probably
commenced long ago between forms not widely dissimilar in colour”
(Descent of Man, 10th Ed., p. 324). Such a statement is of course
quite inconsistent with the Neo-Darwinian position. “The conclusion
which emerges most clearly,” writes Poulton (Essays on Evolution, p.
232), “is the entire independence of zoological affinity exhibited by
these resemblances; and one of the rare cases in which Darwin’s
insight into a biological problem did not lead him right was when he
suggested that a former closer relationship may help us to a general
understanding of the origin of mimicry. The preservation of an original
likeness due to affinity undoubtedly explains certain cases of mimicry,
but we cannot appeal to this principle in the most remarkable
instances.”

It is unnecessary to labour this point. It is surely evident to everyone
with average intelligence that, until the resemblance between two
forms has advanced a considerable way, the likeness cannot be of
utility to either, or at any rate of sufficient utility to give its possessor
a survival advantage in the struggle for existence. Until it reaches this
stage, natural selection cannot operate on it. It is therefore
absurd to look upon natural selection as the direct cause of the
origin of the likeness. When once a certain degree of resemblance
has risen, it is quite likely that in some cases natural selection has
strengthened the likeness.

The second great objection to the Neo-Darwinian explanation of the
phenomenon known as mimicry is that in many cases the
resemblance is unnecessarily exact. Even as we saw how the
Kallimas, or dead-leaf butterflies, carried their resemblance to dead
leaves to such an extent as to make it appear probable that factors
other than natural selection have had a share in its production, so do
we see in certain cases of mimetic resemblance an unnecessarily
faithful likeness.

The Brain-fever Bird

The common Hawk Cuckoo of India (Hierococcyx varius) furnishes an
example of this: “The brain-fever bird,” writes Finn, on page 58 of
Ornithological and Other Oddities, “is the most wonderful feather
copy of the Indian Sparrow-hawk or Shikra (Astur badius). All the
markings in the hawk are reproduced in the cuckoo, which is also of
about the same size, and of similar proportions in the matter of tail
and wing; and both hawk and cuckoo having a first plumage quite
different from the one they assume when adult, the resemblance
extends to that too. Moreover, their flight is so much the same that
unless one is near enough to see the beak, or can watch the
bird settle and note the difference between the horizontal pose
of the cuckoo and the erect bearing of the hawk, it is impossible to
tell them apart on a casual view.” Moreover, the tail of the cuckoo
sometimes hangs down vertically, thus intensifying the likeness to the
hawk.

It is quite possible that the brain-fever bird derives some benefit from
the resemblance; indeed, it has been seen to alarm small birds, even
as the hawk-like common cuckoo frightens its dupes, but, as D.
Dewar pointed out, on page 105 of vol. 57 of the Journal of the
Society of Arts, “this is not sufficient to explain a likeness which is so
faithful as to extend to the marking of each individual feather. When a
babbler espies a hawk-like bird, it does not wait to inspect each
feather before fleeing in terror; hence all that is necessary to the
cuckoo is that it should bear a general resemblance to the shikra. The
fact that the likeness extends to minute details in feather marking,
points to the fact that in each case identical causes have operated to
produce this type of plumage.” This conclusion is still further
strengthened by the fact that the likeness extends to the immature
plumage, that is to say, exists at a time when it cannot assist the
cuckoo in its parasitical work.

Poulton meets this objection as follows:

[Illustration: SHIKRA HAWK. The upper surface of the tail, not shown in
this drawing, exactly corresponds with that of the cuckoo “mimic.”]

[Illustration: HAWK-CUCKOO. This species (Hierococcyx varius) is
commonly known in India as the “Brain-fever bird.”]

Hypertely

“All such criticism is founded on our imperfect knowledge of the
struggle for existence. The impressions and judgments of man are
immensely influenced by the ‘corroborative detail,’ giving ‘artistic
verisimilitude to a bold and unconvincing narrative.’ Indeed, the
laughter which is invariably raised by this passage from The Mikado
is, I have always thought, not only or chiefly due to the humour of
the application, but to the way in which a great and familiar truth
breaks in upon the listener with all the pleasing surprise which
belongs to epigram. Birds, the chief enemies of insects, are known to
have powers of sight far superior to those of man, and, from our
experience of them in captivity, it may be safely asserted that their
attention is attracted by excessively minute detail. Until our
knowledge of the struggle for life is far more extensive than at
present, the argument founded on Hypertely may be left to contend
with another argument often employed against the explanation of
cryptic and mimetic resemblance by natural selection. Hypertely
assumes that there are unnecessary details in the resemblance, that
the resemblance is perfect beyond the requirements of the insect; the
second argument maintains that birds are so supremely sharp-sighted
that no resemblance, however perfect, is of any avail against them. In
the meantime the majority of naturalists will probably reject both
extremes, and believe that the enemies are certainly sharp-sighted
and successful in pursuit, but that perfection in detail
makes their task a harder one, and gives to the individuals possessing
it in a higher degree than others, increased chances of escape, and of
becoming the parents of future generations.” (Essays on Evolution, p.
302.)

This long quotation requires careful consideration, since to us it
appears to be typical of the kind of reasoning resorted to by
Neo-Darwinians.

Note the reference to our “imperfect knowledge of the struggle for
existence.” This is almost invariably the last refuge of the
Neo-Darwinian when worsted in argument. We fully admit that there is still
much to be learned of the nature of the struggle for existence, but
such a statement sounds very curious when uttered to those who pin
their faith to the theory which sees in the principle of natural selection
an explanation of all the phenomena of the organic world. Natural
selection, be it remembered, is but a name for the struggle for
existence.

Birds capturing Butterflies

“Birds,” says Professor Poulton, “are the chief enemies of insects.”
This may be so. But we greatly doubt whether they are the chief
enemies of butterflies and moths, among which the most perfect
examples of mimicry are supposed to occur.

We have watched birds closely for some years, but believe that we
could almost count on our fingers the cases in which we have
seen a bird chase a butterfly.

Professor Poulton, being aware of this objection, sets forth, on pp.
283-292 of Essays on Evolution, the evidence he has gathered in
favour of the view that birds are the chief enemies of butterflies and
other lepidoptera.

As the result of five years’ observation in S. Africa, Mr G. A. K.
Marshall was able to record some eight cases of birds capturing
butterflies. In three cases the butterfly seized was warningly coloured,
or, at any rate, conspicuous! In two of these eight cases the bird
failed to capture its quarry!

Says Mr Marshall, “the fact that birds refrain from pursuing butterflies
may be due rather to the difficulty in catching them than to any
widespread distastefulness on the part of these insects.”

During six years’ observation in India and Ceylon, Colonel Yerbury
records some half dozen cases of birds capturing, or attempting to
capture, insects. He writes: “In my opinion an all-sufficient reason for
the rarity of the occurrence exists in the fact that in butterflies the
edible matter is a minimum, while the inedible wings, etc., are a
maximum.”

Colonel C. T. Bingham in Burma states that between 1878 and 1891
he on two occasions witnessed the systematic hawking of butterflies
by birds, although he observed on other occasions some
isolated cases.

This appears to be the sum total of the evidence adduced by
Professor Poulton as regards the capture of butterflies by birds. This
seems to us an altogether insufficient foundation upon which to build
the theory that the cases of resemblance between unrelated species
have been effected by natural selection.

It is, however, to be noted that probably among birds the most
dangerous enemies of butterflies are not those that habitually catch
insect prey on the wing. Such are experts in the art of fly-catching,
and would despise the comparatively meatless butterfly. One often
comes across butterflies with an identical notch in each wing, which
leaves little room for doubt that those particular butterflies had been
snapped at, while resting, by a bird. Among birds the chief enemies of
butterflies and moths are probably to be found in those that hunt for
their food in bushes and trees.

Thus, what we do know of the nature of the struggle for existence
offers but poor support to the Neo-Darwinian explanations of the
cases of so-called mimicry in nature.

Observing-powers of Birds

Professor Poulton’s idea of pitting the argument of Hypertely against
that of the alleged supreme sharp-sightedness of birds is ingenious,
but is not likely to satisfy very many people save those content
to live in a fools’ paradise. If birds are supremely sharp-sighted,
and pay attention to excessively minute detail, the difficulty of
accounting for the origin of protective mimicry on the natural
selection hypothesis becomes all the greater.

The question whether or not birds are good observers is a most
interesting one. Unfortunately, hitherto, but little attention has been
paid to the subject. The evidence available seems to point to the fact
that birds, like savages, have sharp eyes only for certain objects—that
is to say, for the things they are accustomed to look out for. All
observers of nature must have noticed how quick a butcher-bird is to
catch sight of a tiny insect upon the ground at a distance of some
yards from his perch.

On the other hand, it is said that when there is snow upon the ground
wood pigeons will approach quite close to a man wearing white
clothes and a white hat, provided he keep perfectly still. Finn once
witnessed in Calcutta a sparrow pick up a very young toad, obviously
by mistake, for it dropped it at once with evident distaste. Birds of
prey are supposed to have remarkably good eyesight; yet they can
readily be caught by a net stretched out before their quarry. They are
not trained to be on the watch for such things as nets, and so do not
appear to notice one when erected.

It is thus our belief that the very perfection and detail of some
so-called mimetic resemblances are a very serious objection to
the theory of protective mimicry as enunciated by Professor Poulton
and other Neo-Darwinians.

There is yet a further objection to this theory, one which, in our
opinion, is fatal to the hypothesis in its generally accepted form.

A number of cases occur where two species, in no way related, show
close resemblance to one another under such circumstances that
neither can possibly derive any benefit from the likeness. The theory
of protective mimicry is quite unable to explain these cases. This fact
leads to a suspicion that, in the instances where the theory does at
first sight appear to offer an explanation, the resemblance may also
be due to mere coincidence.

We may perhaps call the cases which the theory of mimicry is unable
to account for “false mimicry,” but in so doing we must bear in mind
the possibility that some, at any rate, of the examples of so-called
mimicry may, on further investigation, prove to be nothing of the
kind.

“False” Mimicry among Mammals

The Cacomistle of Mexico (Bassaris astuta), one of the raccoon family,
has a grey body and long black-and-white ringed tail, just like the
ring-tailed Lemur of Madagascar (Lemur catta); both are
arboreal and about the same size, and this lemur’s colouration
is exceptional in its family.

The banded Duiker-buck of West Africa (Cephalophus doriae), has the
same very unusual colouration as the thylacine or marsupial wolf of
Tasmania, light brown, with bold black bands across the hinder part
of the back, and the animals are about the same size.

The dormouse of Europe closely resembles a small American
Opossum (Didelphys murina), and a larger opossum (D.
crassicaudata) is very like the Siberian Mink (Mustela sibirica).

The Flying Squirrel of North America (Sciuropterus volucella) is closely
copied by the Flying Phalanger (Petaurus breviceps) of Australia.

It will be readily seen that in no one of these cases can the likeness
be of utility to either the “model” or the “copy.”

False Batesian Mimicry among Birds

There are many instances of this phenomenon among birds. The New
Zealand Cuckoo (Urodynamis taitensis) shows a far closer
resemblance to the American Sparrow-hawk (Accipiter cooperi) than
to any New Zealand hawk, and in fact closely mimics this quite alien
bird.

The stormy petrel, a purely oceanic bird, closely resembles in size,
colour, and style of flight the Indian Swift (Cypselus affinis), a purely
inland creature; both are sooty black, with a conspicuous white
patch on the lower back.

The Pied Babbling Thrush (Crateropus bicolor) of Africa is singularly
like the Pied Myna (Graculipica melanoptera) of Java, both being of
about the same size, with white body and black wings and tail quills.
This, we may add, is a very unusual colouration among small birds.

The black-headed Oriole (Oriolus melanocephalus) of India is very
similar in appearance to the common Troupial (Icterus vulgaris) of
Brazil; indeed, the troupials, a purely American group, are so like the
old world orioles in colour that they usurp their name in America.

The little insectivorous Iora (Ægithina tiphia) of India strongly
resembles in size and colour a Siskin (Chrysomitris colambiana) from
South America, the males in both being black above and yellow
below, while in the females the black is replaced by olive-green.

Another Indian babbler (Cephalopyrus flammiceps), yellowish-green,
with orange forehead, is closely copied by, or copies, the well-known
Brazilian Saffron-finch (Sycalis flaveola).

In Fergusson Island, near New Guinea, there is a ground pigeon
(Otidiphaps insularis) which is black with chestnut wings, like several
of the powerful ground cuckoos of the genus Centropus, but no
species of these cuckoos so coloured appears to inhabit the island.

In Africa there is a tit (Parus leucopterus) which has the same
very unusual colouration as an East-Indian bulbul (Micropus
melanoleucus), both being black with a white patch on the wing-
coverts. These two birds are about the same size. As showing the
purely coincidental character of such resemblances, we may mention
that this same rare pattern occurs again in our Black Guillemot (Uria
grylle) and in the Muscovy Duck (Cairina moschata).

We have already quoted Gadow (p. 198) on “false mimicry” among
snakes. He also gives, on p. 110 of Through Southern Mexico, an
example of this phenomenon among amphibia. It is, he writes,
“impossible to distinguish certain green tree-frogs of the African
genus Rappia from a Hyla, unless we cut them open. If they lived side
by side, which they do not, this close resemblance would be extolled
as an example of mimicry.”

We should be very greatly surprised if abundant examples of “false
mimicry” are not found among insects. We trust that this remark will
stimulate some entomologist to pay attention to the subject.

It is the essence of Müllerian mimicry that both model and copy are
immune from attack from enemies. Unfortunately for the theory,
similar resemblances occur among birds of prey, where neither
party can benefit from the association. This gives rise to what
we may perhaps call false Müllerian mimicry. Thus the goshawk and
peregrine falcon resemble each other in being brown above and
streaked below in immature plumage, and having barred underparts
and a grey upper plumage when adult.

Theory of Mimicry Criticised

Having stated the more important objections to the theory of
protective mimicry, it now remains for us to deal specifically with each
head of evidence offered in its favour.

1. With regard to the assertion that the model and its copy are often
not nearly related, we have shown that among mammals and birds
instances of resemblance between widely-separated groups occur
under such circumstances that neither party can derive any benefit
therefrom.

2. As regards the assertion that species which are mimicked are either
well-defended or unpalatable, this certainly does not hold good with
regard to some at any rate of the coincidental resemblances among
birds which we have pointed out; even if these pairs of similar species
lived in the same country it would require considerable ingenuity to
say why one should mimic the other.

3. As regards the argument that the inedible species of Ithomiinæ,
etc., display only fifteen colours, while the less numerous edible
Papilios display more than double this number of colours, we
may draw attention to the fact that those birds which are most
immune from attack are precisely those which display the smallest
range as regards colour, e.g., hawks, owls, crows, gulls, storks, and
cranes. As we have already submitted, no question of Müllerian
association comes in here.

On the other hand, the eminently edible families of game-birds and
ducks display great variety of colour, in the males at all events.

4. As regards the statement that although in many cases the mimetic
resemblances extend to the minutest detail, they are not
accompanied by any structural changes except such as assist in the
production of a superficial likeness, we may refer to the case we have
already cited of the New Zealand cuckoo, which, though it so closely
copies an American hawk, is typically cuculine in structure. Here, of
course, there can be no question of advantage to the “mimicking”
cuckoo in the resemblances.

5. In answer to the argument that mimetic resemblance extends to
form, attitude, and movement, as well as colour, and that
deep-seated organs are affected only when the superficial resemblance is
thereby intensified, we may draw attention to such cases as the
following:—

(a) The harmless Indian Snake (Lycodon aulicus) is closely similar to
the well-known Krait (Bungarus cœruleus), also Indian; but the
resemblance extends to a structural detail which can hardly
have mimetic value—namely, the harmless snake has long, fang-like
front teeth, though these are unconnected with poison-glands.
Animals which come into contact with the krait and its mimic are
hardly likely to inspect their teeth.

(b) A considerable number of birds of the shrike group—known as
Cuckoo-Shrikes (Campophaga)—closely resemble cuckoos in
plumage; but even if they derive any benefit from mimicking birds
which are credited with being mimics already, they cannot profit by
the fact that the shafts of the rump-feathers in both groups are
stiffened; this being a peculiarity which would not be perceptible until
the bird was in the grasp of an aggressor.

(c) As a third case of coincidence we may refer to the tubercle in the
nostril of the Brain-fever-bird (Hierococcyx varius), as a minute detail
of hawk-like appearance, though not present in the particular species
imitated.

6. The argument that mimetic resemblances are produced in the most
diverse ways, but the result is uniform, loses much of its force when
we consider the various methods by which short-tailed birds appear
to have long caudal appendages.

In the peacock it is the upper tail coverts which are elongated; in the
Stanley Crane (Tetrapteryx paradisea) it is the innermost or
tertiary quills of wing; in one of the egrets some of the feathers
of the upper back grow to a great length and form a train; in the Bird
of Paradise (Paradisea apoda) the long flank plumes are commonly
mistaken for the tail.

In these cases there can be no question of mimicry.

7. We have shown that the idea that imitator and imitated are always
found in the same area is absolutely fallacious. In birds, for example,
the most striking resemblances appear to occur between species that
dwell far apart.

8. We can cite, as parallel to the case of a mimicking species of which
the male copies one model and the female another, the strange
similarity between the barred brown plumage of the female blackcock
and that of the female eider-duck. The males of these species,
although both black and white, differ greatly in appearance; but the
male blackcock is admittedly very like the male of another species of
sea-duck—the scoter.

9. Against the supposed ancestral non-mimetic forms existing on
islands we can pit the “mimetic” orioles in small islands and their
non-mimetic cousins on the mainland. In Australia an oriole of what
appears to be an ancestral style lives beside, but declines to mimic, a
friar bird of a very pronounced type.

10. The case of certain diurnal moths mimicking butterflies appears to
be explicable without the aid of the theory of protective mimicry.
When two species adopt the same method of obtaining food, it not
infrequently happens that a professional likeness springs up between
them. Of this the swifts and swallows afford a striking illustration.

11. As a set-off to the cases where the alleged mimicry is confined to
certain seasons of the year, we may cite the case of the
pheasant-tailed Jaçana (Hydrophasianus chirurgus), which in its winter plumage
might easily be mistaken, when on the wing, for the paddy bird or
Pond Heron (Ardeola grayii), both being of like size and having a
brown back, long green legs, and white wings. Moreover, they are to
be found in the same localities in India. At the breeding season,
however, they are absolutely different in plumage.

Yet another argument commonly adduced in favour of the theory of
protective mimicry is that local variations of the imitated species are
sometimes followed by the imitator; thus the butterfly Danais
chrysippus shows a white patch on the hind wings in Africa, and this
is followed by its mimic.

But the same thing occurs, quite irrationally, so to speak, among
birds. The peregrine falcon and hobby of Europe are only winter
migrants to India, where they are replaced as residents by the
Shaheen (Falco peregrinator) and Indian Hobby (F. severus).
Both these differ from the migratory forms by being blacker above
and chestnut below, instead of cream colour. Thus the resemblance
occurs in each race. A similar distinction, as noted by Blyth, exists
between the Common Swallow (Hirundo rustica) and the Swallow (H.
tytleri) of Eastern Asia, the latter having the whole ventral surface
rufous instead of only the throat. Yet no one will suggest that
swallows mimic falcons, or that there is mimicry between the
peregrine and hobby. It is obvious that such parallel changes occur
independently of mimicry.

The Water-rail (Rallus aquaticus) and Baillon’s Crake (Porzana bailloni)
of Europe are distinguished from their allies of Eastern Asia by having
the sides of the head plain grey, whereas the Eastern Asiatic forms
(R. indicus and P. pusilla) have a brown streak along each side of the
face. Here, again, we have an instance of birds of the same family
varying together with geographical distribution.

“Recognition” Colours

One of the prettiest conceits of the Wallaceian school of zoologists is
the theory of recognition markings.

“If,” writes Wallace, on page 217 of Darwinism, “we consider
the habits and life-histories of those animals which are more or
less gregarious, comprising a large proportion of the herbivora, some
carnivora, and a considerable number of all orders of birds, we shall
see that a means of ready recognition of its own kind, at a distance or
during rapid motion, in the dusk of twilight or in partial cover, must be
of the greatest advantage and often lead to the preservation of life.
Animals of this kind will not usually receive a stranger in their midst.
While they keep together they are generally safe from attack, but a
solitary straggler becomes an easy prey to the enemy; it is therefore
of the highest importance that, in such a case, the wanderer should
have every facility for discovering its companions with certainty at any
distance within the range of vision.

“Some means of easy recognition must be of vital importance to the
young and inexperienced of each flock, and it also enables the sexes
to recognise their kind and thus avoid the evils of infertile crosses;
and I am inclined to believe that its necessity has had a more
widespread influence in determining the diversities of animal
colouration than any other cause whatever. To it may probably be
imputed the singular fact that whereas bilateral symmetry of
colouration is very frequently lost among domesticated animals, it
almost universally prevails in a state of nature; for if the two
sides of an animal were unlike, and the diversity of colouration
among domestic animals occurred in a wild state, easy recognition
would be impossible among numerous closely allied forms.”

As examples of recognition colouration, Wallace cites, among others,
the white upturned tail of the rabbit—a “signal flag of danger,” the
conspicuous white patch displayed by many antelopes, the white
marks on the wing- and tail-feathers of the British species of butcher-
birds, the stone-chat, the whin-chat, and the wheat-ear.

Wallace therefore asserts, firstly, that recognition marks not only help
herbivorous animals to keep together, but act as a danger signal; the
member of a flock which first catches sight of the enemy takes to its
heels, displaying its white flag, which is the signal of danger to the
other members of the flock. Secondly, that recognition marks prevent
the evils of infertile crosses. Thirdly, that the necessity of being able
to recognise one another has rigidly preserved bilateral symmetry
among animals in a state of nature.

As regards assertion number one, we would point out that where a
flock of herbivora is being stalked by a beast of prey, the member of
the flock nearest to the enemy—that is to say, the hindmost member
—will probably be the first to observe him. As that creature will be
more unfavourably situated for escape than the rest of the
herd, it will not be to their advantage to follow the line it has
taken. Moreover, being at the rear of the flock, it is not in a good
position to take the lead, and its pursuer is likely to see the danger
signal before its friends do. It would thus seem that “danger signals,”
while possibly sometimes of service to their possessors, are on the
whole ornaments which might profitably be dispensed with. Natural
selection can scarcely be charged with the production of a character
of such doubtful utility to the organism.

Moreover, flourishing species of many gregarious animals do not
possess any “signal flag of danger,” while, on the other hand, a great
many solitary species display markings that render them very
conspicuous when in motion. Take the case of the famous Indian
Paddy Bird (Ardeola grayii). This, when at rest, is coloured so as to be
very difficult to distinguish from its surroundings, but flight transforms
it, for it then displays its milk-white pinions, which would make a
perfect danger signal, if only it were not peculiarly solitary in its
habits. Its gregarious brethren, the Cattle Egrets (Bubulcus
coromandus), on the other hand, display no danger signal.

Interbreeding of Allied Species

That these recognition marks prevent the intercrossing of allied
species and the production of infertile hybrids appears to be pure
fiction. As we have already shown, hybrids between allied species are
by no means always infertile. Moreover, species which differ
only in colour seem usually to interbreed in those parts where
they meet.

“This interbreeding,” writes Finn, on page 14 of Ornithological and
Other Oddities, “occurs where the carrion crow (Corvus corone)
meets the hooded crow (Corvus cornix), where the European and
Himalayan goldfinches (Carduelis carduelis and C. caniceps)
encounter each other, and where the blue rollers of India and Burma
(Coracias indicus and C. affinis) come into contact, to say nothing of
other cases.”

Of these other cases, the Indian bulbuls of the genus Molpastes form
a very remarkable one. In all places where two of the so-called
species meet they appear to interbreed, and so freely do they
interbreed that at the points where the allied species run into one
another it is not possible to refer the bulbuls to either species. Thus
William Jesse writes of the Madras Red-vented Bulbul (Molpastes
hæmorrhous) (page 487 of The Ibis for July 1902): “This bird,
although I have given it the above designation, is not the true M.
hæmorrhous. I have examined numbers of skins and taken nests and
eggs time after time, and have come to the conclusion that our type
is very constant, and at the same time differs from all the red-vented
bulbuls hitherto described. The dimensions tally with those given by
Oates for M. hæmorrhous, while the black of the crown
terminates rather abruptly on the hind neck, and is not extended
along the back, as is the case with M. intermedius and M.
bengalensis. On the other hand, as in the two last species, the ear
coverts are chocolate. Furthermore, I may add—although I attach
little importance to this—that the eggs of the Lucknow bird which I
have seen are, without exception, far smaller than my eggs of
genuine M. intermedius from the Punjab. My own opinion is that the
Lucknow race is the result of a hybridisation between the other three
species.”

Further, in Bannu, Mr D. Donald saw M. intermedius and M.
leucogenys paired at the same nest. That gentleman could not
possibly be mistaken on the point, as the latter species has white
cheeks and yellow under tail-coverts, while the cheeks of the former
species are dark-coloured and the patch of feathers under the tail is
red. Similarly, Whitehead and Magrath, writing of the birds of the
Kurram Valley (Ibis, January 1909), record that the former shot no
fewer than twelve bulbuls, which undoubtedly appear to be hybrids
between these two species. As these hybrids differ considerably inter
se, there seems no room for doubt that they breed with one another
and with the parent species.

Symmetry in Nature

Wallace’s third statement, that if the two sides of animals in a state of
nature were unlike, easy recognition would be impossible among
numerous closely allied forms, reminds us forcibly of the sad
case of the boy whose tailor was his mother. Humanum est
errare: she made her son one pair of trousers that fastened up
behind, so that the poor boy when wearing them never knew whether
he was going to or coming home from school! If animals are able to
recognise their mates, their bilateral symmetry does not seem
necessary to enable them to distinguish their fellows from allied
species.

It is, indeed, true that asymmetrically marked animals are very rarely
seen in the wild state, while they are the rule rather than the
exception among domesticated species. But this appears to be due,
not to the necessity of recognition markings in nature, but to the fact
that those animals that display a tendency to massed pigment perish
in the struggle for existence, since this massing of pigment appears to
be correlated with weakness of constitution. In other words, this
massing of pigment is an unfavourable variation, which under natural
conditions dooms its possessor. In the easier circumstances of
domestication, animals which are irregularly pigmented are able to
survive, so that, among them, the almost universal tendency to the
massing of pigment can be followed without let or hindrance.

It is unnecessary to say more upon this subject. The few facts we
have set forth suffice to destroy this particular excrescence on the
Darwinian theory.


The Colouring of Flowers and Fruits


Extremely interesting though the subject be, we are unable to
consider at length the generally accepted theory that the colour
markings and perfumes of wild flowers are the result of the
unconscious selection exercised by insects.

While not denying that many flowers profit by their colouring, that
these colours may sometimes serve to attract the insects, by means
of which cross-fertilisation is effected, we are not prepared to go to
the length of admitting that all the colours, etc., displayed by flowers
and floral structures are due to the unconscious selection exercised by
insects. It is one thing to admit that the colour of its flowers is of
direct utility to a plant; it is quite another to assert that the colour in
question owes its origin and development to natural selection. Our
attitude towards the generally accepted explanation of the colours of
flowers is similar to that which we adopt towards the theory of
protective mimicry among animals. In certain cases we are prepared
to admit that the mimicking organism derives benefit from the
likeness; but this, we assert, is no proof that natural selection has
originated the likeness.

Cross- versus Self-fertilisation

The theory that flowers have developed their colours in order to
attract insects to them, and thus secure cross-fertilisation, is based on
the assumption that cross-fertilisation is advantageous to
plants. It is questionable whether this assumption is justified.
True it is that numbers of experiments have been performed, which
show that, in many cases, flowers which are artificially self-fertilised
yield comparatively few seeds. But experiments of this kind do not
prove very much.

To place on the stigma pollen from the anthers of the same flower, in
case of a plant which for many generations has been cross-fertilised,
is to subject the plant in question to a novel experience—an
experience which may be compared to transplanting it to another soil.
The immediate effect may appear to be unfavourable, although, if the
experiment be persisted in, the ultimate results may prove beneficial
to the plant.

That this is the case with some flowers that are artificially fertilised is
asserted by the Rev. G. Henslow. This observer states, that had
Darwin pursued his investigations further, he would probably have
modified his views regarding the benefits of self-fertilisation. Darwin’s
statement that “Nature abhors perpetual self-fertilisation” seems to
be as far from the truth as that which declares “Nature abhors a
vacuum.”

From the mere fact that cross-fertilised flowers yield a greater
quantity of seed than they do when self-fertilised, it does not
necessarily follow that cross-fertilisation is advantageous. The
amount of seed produced is probably not always a criterion as
to the advantages of the crossing to the plant. Some flowers yield
most seed when fertilised by the pollen from flowers belonging to a
different species!

It is significant that some plants produce cleistogamous flowers, that
is to say, flowers which invariably fertilise themselves. Such flowers
never open; so that the visits of insects are precluded.

According to Bentham, the Pansy (Viola tricolor) is the only British
species of Viola in which the showy flowers produce seeds. The other
species are all propagated by their cleistogamous flowers. The genus
Viola is an advanced one: it would therefore seem that the
production of cleistogamous flowers is an advance on the production
of entomophilous flowers. Cleistogamous blossoms are obviously
more economical.

Insects and Flowers

In the case of the malvas, epilobias and geraniums, where we see,
side by side, races of which the individuals produce insect-fertilised
flowers and those that are characterised by self-fertilised flowers, the
latter are quite as thriving as the former.

The common groundsel, which, according to Lord Avebury, is “rarely
visited by insects,” flourishes like the green bay tree, as many
gardeners know to their cost. The same may be said of the
pimpernels. In this connection it is important to bear in mind that the
anemophilous, or wind-fertilised, angiosperms, as, for example, the
grasses, are believed to be descendants of insect-fertilised or
entomophilous forms.

A weighty objection to the theory that the colours of flowers have
been developed because they attract insects has been urged by Mr E.
Kay Robinson, namely, that among wild flowers the most highly
coloured ones are the least attractive to insects.

“Show me,” writes he, on page 222 of The Country-Side for March 20,
1909, “the insect-collector who will seek for specimens among the
brilliant scarlet poppies. Of what use is the dog rose, with its large
discs of pinky-white, to him? On the other hand, does he not find that
by far the most attractive flowers are the almost invisible spurge
laurel blossoms in February and March, the fuzzy sallow catkins in
March and April, the bramble blossom in midsummer, and the ivy’s
small green flowers in autumn? Of these only the bramble has any
pretensions to colour, and if you try, as I have tried, the experiment of
picking off every petal from sprays of bramble blossoms you will find
that its attraction to moths does not appear diminished.

“The fact that insects do visit many conspicuously coloured
flowers does not show that the colour attracts them, when the
fact is borne in mind that they neglect others which are equally
coloured, while the flowers which they particularly haunt are
inconspicuous. Conspicuous flowers which have abundance of nectar
attract insects, of course, but so do inconspicuous flowers which have
nectar. If they have no nectar, neither the conspicuous nor the
inconspicuous flowers attract insects other than pollen or petal eaters,
whose visits are not good for the plant. This shows that the nectar
attracts the insects and that the colour of the flowers makes no
difference.”

In autumn many leaves assume bright and beautiful tints. These are
not believed to be in any way useful to the plant. The autumnal hues
and shades are regarded, and rightly regarded, as the garb of death
and decay. Such colours are the result of the oxidation of the
chlorophyll or green colouring matter of the leaves. Why should not
the colours of the petals of the flowers, which wither and fade long
before the green leaves do, be due to a similar cause? The bright
colours of fruits are supposed to have been effected by natural
selection in order to attract fruit-eating animals. Surely a hungry
animal does not require that its food be brightly coloured in order to
find it! We must remember that during the greater part of the
year most animals have no occupation save that of finding their
food. Inconspicuously coloured fruits, like those of the ivy, are
frequently eaten by birds. The bright colours of some ripening fruits
are undoubtedly the colours of decay. Many fungi and seaweeds have
bright colours. It is never hinted that these are of any direct utility to
their possessor.

Every flower, every plant, every organism must be of some colour.

Honey

Many flowering plants produce honey. This is said by some botanists
to have been directly caused by natural selection, because the honey
attracts insects. Possibly those who take up this attitude are putting
the cart before the horse. It is probable that honey, like oxygen, is an
ordinary product of the metabolism of the plant, and that the visits of
bees and other insects to such plants are the result rather than the
cause of the honey being there. Boisier found that some plants, for
example, Potentilla tormentilla and Geum urbanum, gave honey in
Norway, but very little near Paris.

He further discovered that by supplying certain plants copiously with
water he could induce them to produce more than their normal
output of honey.

As is their habit, Neo-Darwinians have pushed their pet theory to
absurd lengths in its application to flowers. They assert that the
visits of insects are responsible for not merely the general
colour of every flower, but also the various lines, spots, and other
markings of flowers. The lines that frequently occur on the petals are
supposed to guide the insects to the honey! This particular
refinement of Neo-Darwinism, to quote Kay Robinson, “needs little
discussion. Insects have very poor sight. You can see this when a bee
or a butterfly flies bang against a whitewashed wall; when a wasp
pounces upon a black spot on a sunlit floor, mistaking it for a fly; or
when a settled dragon-fly will allow you to poke it in the face with the
end of a walking-stick, although it will be off like a flash if you raise
your arm. There is, therefore, large reason to doubt whether insects
can even see the fine lines in the throats of flowers which are
supposed to guide them to the nectar. It is rather absurd, too, to
suppose that such lines can be needed, since insects come in swarms
to inconspicuous and apparently scentless flowers or to ‘sugared’
tree-trunks in the dark. Where there is nectar, insects which have
come to the feast from a distance need no pencilled lines to guide
them over the last quarter of an inch of their journey.”

Scents of Flowers

Neo-Darwinians further assert that the scents of flowers have been
developed by natural selection because they serve to attract insect
visitors to the flowers. In support of this contention it is urged
that the most highly scented flowers are not usually the most
conspicuous ones, since it is not necessary for a flower to be both
highly coloured and strongly scented. Again, those flowers which
open at night are usually very highly scented.

Plausible though this view seems, there are weighty objections to it.
These are so admirably summarised by Kay Robinson in the issue of
The Country-Side for March 27, 1909, that we feel we cannot do
better than reproduce his words:—

“It is true that many flowers which are strongly scented are visited by
insects, but these flowers have abundance of nectar, and the insects
come in spite of the scent, and not on account of it. They visit
unscented flowers, provided that they have nectar, equally freely; and
they do not visit flowers which have scent without nectar.

“Moreover, fruits are more generally scented even than flowers; but
what explanation have those, who attribute the scents of flowers to
the tastes of insects, for the scents of fruits? Insects which visit fruits
are only robbers. Therefore, if we say that plants have scents for the
purpose of attracting insects, we accuse all plants which have scented
fruits of attempted suicide.

“There are hosts of plants, again, with scented leaves. Here also the
insects are only robbers, and it is quite clear that the scent is
not useful in attracting insects. If, therefore, you adopt the
insect theory to explain the scents of flowers, you must invent entirely
new theories to explain the scents of fruits and leaves.”

It is thus evident that the ordinarily accepted explanation of the
colours, scents, and markings of flowers is far from satisfactory.

Kay Robinson’s Theory

Mr E. Kay Robinson has put forth in recent issues of The Country-Side
(March 20, 27, and April 3, 1909) quite a new explanation of the
phenomena, and one which deserves careful consideration. He
maintains that “the real, primary, and original meaning of the colours,
markings, nectar and scents of flowers is not to attract insects, but to
deter grazing and browsing animals.”

“I say,” he writes, “that grazing and browsing animals avoid eating
conspicuous flowers. I have watched a flock of five hundred sheep
pass across a yard-wide strip of close-nibbled turf on the Norfolk
coast, grazing as they passed, and the number of open daisy
blossoms after they had passed seemed the same as before they
came. Every one of five hundred sheep had eaten something from
that yard of grass, and not one had eaten any of the hundred and
thirty odd daisies.

“Every summer the farm horses are turned into the same old pasture,
and as the summer wanes the field always presents the same
appearance—the green grass close-grazed, the tall buttercups
left standing high.

“Once, leaning over a gate with friends, I pointed out that a flock of
sheep grazing in a sainfoin field were nibbling the greenstuff close,
but were not eating the flowery stalks, when one sheep near us
accidentally pulled up a whole sainfoin plant by the roots and
proceeded to munch it upwards. Inch by inch the stem passed into its
jaws, and I began to be afraid that it was going to establish an
‘exception’ to my rule. But, just when the bright cluster of pink
sainfoin blossom was within two inches of its teeth, it gave an extra
nip, and the flower head fell to the ground, and the sheep resumed
its search for greenstuff.

“I do not say that this would always happen—I should be sorry for
any theory which depended upon the intelligence of a sheep—but it
was a very striking object-lesson to my two companions; and any one
who looks around during this summer with an inquiring mind will find
plenty of evidence that grazing, browsing, and nibbling animals avoid
flowers, and stick to greenstuff when they can get it.

“I do not say that all animals avoid the same flowers. Horses, for
instance, may dislike large flowers like roses and conspicuous yellow
flowers like buttercups, but they will bite off flat clusters of minute
white or pale yellow flowers, such as yarrow or wild parsnip.
These distinctions made by certain kinds of beasts will probably
in the future be found to afford valuable evidence as to the regions of
origin of our flowers and animals. Such plants as the yarrow and the
wild parsnip, for instance, probably did not originate in the home of
the wild horse, because they are not protected against it.

“As a general rule, however, there is abundance of evidence that
plants with conspicuous flowers gain a large advantage in the
struggle for existence, because grazing and browsing animals avoid
them; while there is no real evidence at all that conspicuous flowers
attract insects.”

Kay Robinson extends this explanation to the shape, the scent, and
the nectar of flowers. He admits that many flowers are adapted to the
visits of insects, but this is, he asserts, but a secondary result. The
“real, primary meaning” of the shapes of flowers of curious
configuration is, he insists, “a deterrent to grazing or browsing
animals.”

According to him plants, like the snap-dragon, which have “blossoms
in the semblance of a mouth,” are avoided by grazing animals,
because they mistake such flowers for mouths, and have no wish to
be bitten! Orchids, he asserts, “are strongly deterrent to grazing and
browsing animals, which are looking for greenstuff, and regard these
gaudy, spidery, winged blossoms as live creatures.” “If this is
not the truth,” he asks, “will any adherent of the theory that we
owe the shapes of flowers to insects explain why some of our
common British orchids are so like bees, spiders, etc.? Some which
have no particular resemblance to any insect still exhibit weird
shapes, suggestive to the human mind of living things, such as
lizards, etc. The reason why they look like bees, spiders, lizards, and
various unclassed creatures is quite simple. Grazing animals are
looking for greenstuff, and do not wish to eat living creatures which
may bite or sting or taste nasty. Thus the orchids have acquired the
power of looking like creatures.

“Every one,” he continues, “who is familiar with the blossom of the
wild carrot—a flat head of minute, dull-white blossoms—must have
noticed how very often the centre blossom in each head is purplish or
reddish-black. This makes it very conspicuous in the middle of the flat
white flower head. Now what conceivable use can this barren little
blackish blossom—scarcely bigger than a pin’s head—be to the wild
carrot plant if we regard the flat head of white flowers as an
attraction to the sight of insects? If, on the other hand, we rightly
regard the flat head of white blossoms as an advertisement to grazing
animals that it is not wholesome greenstuff, but innutritious blossoms
liable to be infested with ants and other stinging insects, we
see at once the great use of this small blackish flower in the
middle. It looks like an insect, and possibly in the home of the wild
carrot there is some minute blackish insect with a peculiarly villainous
smell or taste—or perhaps a potent sting—which grazing animals
carefully avoid whenever they can see it. Thus the wild carrot
flourishes; though here in Britain—where the wild carrot has
established itself now—we may fail at first to see the exact meaning
of the trick. I think, however, that, when we understand it, it fits
admirably into the theory that the shapes and colours of flowers are
primarily useful as deterrents to grazing and browsing animals and
not as attractions to insects.

“Thus we see,” he concludes, “that the queer shapes of these orchids,
which are a great stumbling-block in the way of those who preach
that we owe the shapes of flowers to the tastes of insects, become a
strong confirmation of my theory that we owe the shapes of flowers
to grazing and browsing animals.”

Of the nectar of flowers, Kay Robinson writes: “Since this is eagerly
sought for by hosts of insects, whose visits are in most cases useful
to the flowers, it seems only natural to suppose that we see cause
and effect in this connection.

“Here, however, I will outline my theory of the origin of nectar and of
flowers in general.

“I think there is no doubt whatever that all the parts of a flower
are modified leaves. The original type of flowering plant—I
think we may safely assume—had a single stem and produced its
seed at the summit, as the crown of its year’s endeavour. The flower,
before it became what we would recognise as a flower, was a cluster
of protecting leaves round the seed-making parts of the plant. To the
production of the seed the whole energies of the plant were devoted,
and into the cluster of leaves at the top of the stem all the essences
of the plant were concentrated. If during the coming spring you
handle and examine the leaves at the end of the strong shoots of
thorns or fruit bushes, you will find that the surface of the young
leaves is quite sticky. If you observe browsing animals also, you will
discover that—contrary to expectation—they do not like strong-
growing, juicy shoots, evidently preferring mature leaves lower down
the branch. This shows, I think, that plants have the power of
protecting their new shoots by crowding into them the volatile oils
and essences which they produce as a protection against animals.
Now nectar appears always to be distasteful to grazing and browsing
animals; and they also dislike scented flowers. I think, therefore, that