Your Passport to a Career in Bioinformatics
Second Edition
Prashanth N. Suravajhala, Editor
This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd.
The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721,
Singapore
To my Mother Nirmala Sastry
Foreword
When I first heard about the field of bioinformatics, I was a university senior
majoring in chemistry. It was 1995, and my intention at the time was to focus on
the application of chemistry in the life sciences. In fact, in those days I was interested
in any field of science or engineering that could be applied to biology. But, when it
came time to select a project for my senior thesis, I was asked by my thesis adviser if
I had an interest in computers. Certainly, I did. I had a year of computer science
courses under my belt, but I also had an avid interest in computers as a hobby—I
wrote my first BASIC program circa 1981 on a friend’s Atari 800. And so my
adviser proceeded to tell me that there is this nascent field called “bioinformatics,”
which is a hybrid of computer science and biology. I immediately fell in love with
the idea that I could combine a professional interest of mine with a personal one.
And, from then on, even through graduate school, all of my research projects
involved programming. Not one required that I stand at a bench with a micropipette,
as I knew I would be doing as a biochemist. Of course, it did not go over so well with
many of the professors back then that a student would pursue a degree in either
biochemistry or biology with a purely computational project. In the 1990s, there
were just a handful of degree programs in bioinformatics in the whole world—one of
them halfway around the world from where I lived. But I limited my own geographical
options, and it seemed that my only choice was to pursue a graduate degree in
“traditional” biochemistry and find an adviser and laboratory group that had an
interest in performing computational analyses on their data.
Fortunately for aspiring scientists today, there are many straightforward ways to
enter the field of bioinformatics. To that point, there are scores of degree programs
throughout the world—many of them online degrees. And, there are other ways to
further one’s own career as a bioinformatics practitioner. For one, there is the
Bioinformatics.Org website, of which I am the founder, with Prashanth Suravajhala
among the directors. Prash also founded Bioclues.org and has been active in
I thank Messrs. Springer, Bhavik Sawhney, Beracah John Martyn, and Camilya
Anitta for agreeing to my request to reconsider the revised edition of this book and
for their consistent help in proofreading. Aninda Bose and Chandra Shekhar of
Springer, who supported me all through the making of the first edition of the
book, have played a major role. Although the cartoons and illustrations were ideated
by me, full credit goes to Partha Paul for bringing them to life.
My sincere obeisances to my revered Guru Maa Bijaya for her grace and
blessings. My sincere gratitude goes to my mentees without whose thoughts this
book would not have been here today. Likewise, I owe appreciation to my wife
Renuka and my daughters Bhavya and Nirmala who always stood by me.
My peers at Bioclues.org and Bioinformatics.org, ex-colleagues, researchers
in India, Denmark, the United States, and Japan, and countless “e-colleagues” also
contributed to my discussions. I sincerely thank Cox Murray, Jeff Bizzaro, Madhan
Mohan, and Pawan Dhar, who were generous enough to respond to the
questionnaire. My grandparents, Shri D. S. Sastry and D. S. R. Murthy, are always
remembered with fond love and affection. They have helped me in imparting clarity,
coherence, and brevity to the text.
Finally, the book would not have taken shape without the
contributions of various authors across four continents, Springer reviewers,
friends, and well-wishers. Last but not least, the author sincerely thanks the Springer
typesetting team, Messrs. Nalini Gyaneshwar, Kamiya Khatter, et al., for bringing the
manuscripts into shape.
Prologue
Today, we define success by publicity and bank accounts. But that is not really
success at all. Do not believe the hype. Success is ephemeral. You have to define it
yourself.
Chris North
Most people would succeed in small things if they were not troubled with great
ambitions.
Henry Wadsworth Longfellow
Any new word invites inquiry, excitement, and sometimes disdain, and so it was with
bioinformatics, at least in developing countries. Theoretical bioinformatics, although
born in the 1980s, has flourished ever since, as many new academic and empirical
developments with a focal point on wet-lab research confirm. Bioinformatics is now
regarded as a tool but fantasized as a familiar science, even by a few scientists who
have a track record of early career building. With research on bioinformatics
mushrooming, both theoretical and wet-lab-based bioinformatics-aided works are
often deemed so procedural and laden with paraphernalia that they are not easily
accessible to those who want to use the “tools for biology.” Additionally, career
paths in bioinformatics are tacitly assumed to require programming skills, which is
not always the case. This book aims to be an interface between those who aspire to
bioinformatics and those who apply it in research, with a focus on Q and
A on career growth. A great saying goes, “If you want more, you have to require
more from yourself.” This also applies to bioinformatics. Happy reading!
Contents
1 Whither Bioinformatics?
Prashanth N. Suravajhala
2 Ten Reasons One Should Take Bioinformatics as a Career
Prashanth N. Suravajhala
3 Developing Bioinformatics Skills
Prashanth N. Suravajhala
4 The Esoteric of Bioinformatics
Prashanth N. Suravajhala
5 Common Minimum Standards: A Syllabus for Bioinformatics Practitioners
Prashanth N. Suravajhala
6 Colloquial Group Discussion on Bioinformatics: Grand Challenges
Prashanth N. Suravajhala
7 The Bioinforma “TICKS”: Frequently Asked Questions
Prashanth N. Suravajhala
8 Undergraduate Education in Bioinformatics—Progress and Lessons Learnt from an Engineering Degree
Bruno A. Gaeta
9 Engineering Minds for Biologists
Alfredo Benso, Stefano Di Carlo, and Gianfranco Politano
10 Design Bioinformatics Curriculum Guidelines: Perspectives
Qanita Bani Baker and Maryam S. Nuser
11 Machine Learning for Bioinformatics
Harshita Bhargava, Amita Sharma, and Jayaraman K. Valadi
Chapter 1
Whither Bioinformatics?
Prashanth N. Suravajhala
Ever since the term “Theoretical Biology” was coined by Paulien Hogeweg in 1978,
bioinformatics, the current term, has steadily come into its own, with many
biologists taking a leaf out of this discipline. Researchers by now regard
bioinformatics as a mere tool, whereas its sister field, computational biology, is deemed a
discipline. With bioinformatics burgeoning in the late 1990s, we relate the
commencement of the data deluge to the animistic knowledge that bioinformatics has brought in,
lessening the scale of experimentation. Authentic bioinformatics, however, will not
gain significant interest among researchers, at least until wet-laboratory biologists take
a leap forward in acclimatizing to the split half-term in bioinformatics. The figure of
dogmas is pivotal in bringing about collaboration between biologists and
cross-disciplinarians across biology, as the advent of dogmas has in turn introduced a plethora
of new relationships between scientific studies and molecular biology. In effect,
researchers have asked several questions on specialized mechanisms, if any, that may
be discovered with the advent of bioinformatics knowledge. This collaborative
knowledge owes its impetus to the differentiation of an independent, eccentric science, namely,
systems biology (SB). So, to ask “whither bioinformatics?” with regard to the enunciation and
practice of bioinformatics tools and scientific methods is a candid query.
Bioinformatics has long fostered a process of reasoning that is certainly
not dependent on biology alone. Prior notions of intelligent algorithms, combined with
statisticians’ skills, IT scientists’ inclination, physicists’ predictions, chemists’
corner, and mathematicians’ mind, are a necessity for bioinformatics research.
No individual can cover all of these disciplines alone; unified efforts are needed
to meet the goal of deriving bioinformatics knowledge. For example, the next
P. N. Suravajhala (*)
Department of Biotechnology and Bioinformatics, Birla Institute of Scientific Research, Jaipur,
India
Bioclues Organization, Hyderabad, India
e-mail: [email protected]; http://bioclues.org
Table 1.1 Components defining different ‘omics’ technologies. The word ‘ome’ refers to ‘many’
or ‘monies.’ For example, genomes indicate the study of many genes
‘Omes’ | Description
Genome | The full complement of genetic information both coding and noncoding in an organism
Proteome | The complete set of proteins expressed by the genome in an organism
Transcriptome | The population of mRNA transcripts in the cell, weighted by their expression levels as transcripts copy number
Metabolome | The quantitative complement of all the small molecules present in a cell in a specific physiological state
Interactome | Product of interactions between all macromolecules in a cell
Phenome | Qualitative identification of the form and function derived from genes, but lacking a quantitative, integrative definition
Glycome | The population of carbohydrate molecules in the cell
Translatome | The population of mRNA transcripts in the cell, weighted by their expression levels as protein products
Regulome | Genome wide regulatory network of the cell
Operome | The characterization of proteins with unknown biological function
Synthetome | The population of the synthetic gene products
Hypothome | Interactome of hypothetical proteins
Fig. 1.1 An overview of the dogma of molecular biology with known specialized and unknown
mechanisms/flows. (Image courtesy: Daniel Horspool)
the huge increase in the scale of data being produced from time to time could be
better facilitated by in silico analysis. With the help of high-performance computing
(HPC), sequences generated can be sporadic and further analyzed. Nevertheless,
given that the molecular biology of a system is very complex, understanding and
disseminating the information is to be carried out at different levels using the “omes”
         | Knowns | Unknowns
Knowns   | KK     | KU
Unknowns | UK     | UU
Fig. 1.2 The importance of Known Unknowns, alias “hypothetical genes,” in the genome,
illustrated in the form of a checkerboard. Known is abbreviated “K” and Unknown
“U.” Apparently, we seldom find “UU”s as it is a misnomer here. Unless the genome is sequenced,
we find genes evaluating and devaluating
revolutionized genomics and proteomics? Well, there has been a focus on molecular
medicine, which paved the way from intervention in and treatment of well-known
diseases to proactive prediction and prevention of disease risk. These
approaches require new informatics systems that link large-scale
databanks with special programs for data mining and retrieval in bioinformatics and
chemoinformatics. Wet laboratories should be able to provide a platform for
powerful new molecular diagnostic tools, along with multianalyte assays for the
expression of genes and proteins in different patterns of disease. With researchers scaling
the ladder of bioinformatics progress by leaps and bounds, there is a need for an
enhanced understanding of the interactions in a system (organism). What are the
components that interact with each other? What is the outcome of such interactions?
Do interactions alone allow us to decipher function? Should we be
satisfied with the progress made on, say, cures for diseases by the year 2050? Should
we reach a consensus on the combination of tools, namely, rapid and inexpensive
DNA sequencing technologies, the HapMap project, the dollar one genome (DOG), and so
on? We hope that this will let us understand precisely how bioinformatics transits
from research to vocation and avocation (Table 1.2).
Systems biology has gained a lot of attention over the years. Of late, biologists have
been actively engaged in this discipline in different forms, as molecular biology is
merged with multi-context disciplines. In the process, SB has acquired several
definitions. To answer what a system is, we could think of the multiple organelles
existing in the human body, as we use components to describe entities in a system.
Just as we integrate various vehicular components to construct a vehicle, we describe
components such as organelles as making up a living system. The biology of the system
is called systems biology. Every system has an effect on its environment, and so do
the components in a system. The components relevant to SB include genes,
proteins, metabolites, and enzymes as minor entities, and cells, tissues, organelles,
and organs as major components. Hence, interactions among the components are
of interest in evaluating SB. While a system may have many organelles and
components that make up its flow, they are bound to interact with one
another. For example, enzymes, proteins, metabolites, genes, DNA, and functional
protein domains are known to interact with each other. Integrating all the interactions
of components indicates which survives (and competes) the best, while the ultimate
goal of SB is to exploit the interplay among the components. From a reductionist
point of view, researchers define SB based on whether the components in a system
are interacting with each other: mutations arising and falling, proteins evaluating and
devaluating, strains adapting and unfitting in the environment, and some genes being lost
and found (Table 1.3).
Is systems biology (SB) all about the genes making up the proteins and how the
components processing in a system interact with each other? The fields of omics have
in the recent past arguably revolutionized biomedicine, and there
needs to be a focus on change in defining these upcoming omics-es. Huang’s
classification of SB has yielded the loose and the apparent but broadened definitions
from the dynamic and reductionist approaches (Huang 2004). The dynamic view of SB
operates at a pure level where the system is based on models and networks, be they
quantitative or qualitative, whereas the reductionist view defines SB based on
high-throughput methods involving different molecular biology techniques. Overall, the
loose definition applies to projects exploring individual biological networks, while
the broadened but still “derivative” definition is the outgrowth of theoretical models
along with systems theory across interdisciplinary sciences such as engineering,
mathematics, statistics, artificial intelligence, and so forth. However, many authors
(Tracy 2008; Cornish-Bowden et al. 2007; Huang and Wikswo 2006; Strömbäck
et al. 2006; Bruggeman and Westerhoff 2007) have argued that the concept of
the gene resulting in omics has begun to outlive its usefulness, and they felt that
SB could be projected into several dimensions, keeping in view the multifaceted
systems complexity of living organisms (Ideker and Hood 2019). With SB
maturing, researchers have started proposing an alternative means of defining the gene based on
a richer explanation: the genetic functor, or genitor, a sweeping extension of the
classical genotype/phenotype paradigm that describes the “functional” gene (Fox
Keller and Harel 2007). Thus, we could understand the dynamic behaviors of
molecular associations implicitly known from various methods and technologies
integrating one or more of the SB data:
Overall, SB can be envisaged keeping in view the following points:
1. Systems biology is conceptualized in terms of protein–protein interactions (PPI).
The interplay between components in systems is exploited at the protein–protein,
domain–domain, and DNA–DNA levels as a whole, or even at the protein–DNA level.
2. The interactions among the components are better explained by recognizing that
what holds in theory need not fit in practice, implying that a hypothesis-driven
approach need not always be experimentally (biologically) driven.
3. If some interactions are known, we can take a measure of the unknown
interactions in a system; SB thus works toward understanding
bona fide PPI.
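The points above can be sketched in code. The following is a minimal illustration, not a method from this chapter: it stores a handful of known interactions as an undirected network and lists the component pairs for which no interaction has been recorded. The component names (`ProtA`, `DomX`, `GeneZ`, and so on) and the interaction list are hypothetical, and a pair with no recorded interaction is only a candidate for testing, not a confirmed negative.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy data: names and interaction types are illustrative only.
known_interactions = [
    ("ProtA", "ProtB", "protein-protein"),
    ("ProtA", "ProtC", "protein-protein"),
    ("DomX", "DomY", "domain-domain"),
    ("ProtB", "GeneZ", "protein-DNA"),
]


def build_network(interactions):
    """Return an undirected adjacency map: component -> {partner: interaction type}."""
    network = defaultdict(dict)
    for a, b, kind in interactions:
        network[a][b] = kind
        network[b][a] = kind
    return dict(network)


def untested_pairs(network):
    """List component pairs with no recorded interaction (candidates, not negatives)."""
    components = sorted(network)
    return [(a, b) for a, b in combinations(components, 2) if b not in network[a]]


network = build_network(known_interactions)
# Degree = number of known partners per component.
degree = {name: len(partners) for name, partners in network.items()}
unknown = untested_pairs(network)

print(degree)
print(len(unknown), "pairs without a recorded interaction")
```

With six components there are fifteen possible pairs, so four known interactions leave eleven pairs unexplored; this is one simple way to "take a measure" of what remains unknown in a small system.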
Does SB back biologists? There are specific traits that make up PPI networks:
everything in biology is better explained through interactions, and the interactions
are prioritized in accordance with the organization, cooperativity, and mapping of the
components in a system. SB indicates whether components interact with each other.
This has led to the birth of several disciplines such as systems molecular medicine,
immunological SB, local and global metabolic profiling, systems diagnostic therapy,
and systems drug development, all budding across nascent biology disciplines.
Although PPI are outcomes of almost all cellular processes, there is diversity
in protein interactions; that is, all proteins share common properties only to a certain extent.
For example, the distortion of protein interfaces leads to the development of many
diseases, and to understand the mechanism, we conduct PPI experiments. When proteins
recognize specific targets and bind them, the result is conservation that depends on
structural and physicochemical properties. The nature and applications of SB with
respect to PPI are well reviewed elsewhere (Huang 2004; Tracy 2008;
Cornish-Bowden et al. 2007; Huang and Wikswo 2006; Strömbäck et al. 2006).
Apart from the three most common omics-es, namely, “Gen-omics,” “Prote-omics,”
and “Transcript-omics,” bioinformatics and biology researchers have been taking up
omes and omics-es very rapidly as is evident from the use of the terms in PubMed
(Dell et al. 1996). As a result, a variety of omics disciplines such as phenomics
(Schork 1997), physiomics (Chotani et al. 2000; Gomase and Tagore 2008),
metabolomics (Kuiper et al. 2001; Fiehn 2002), lipidomics (Han and Gross 2003),
glycomics (Gronow and Brade 2001), interactomics (Govorun and Archakov 2002), and
cellomics (Taylor et al. 2001) have begun to emerge, each with its own set of
instruments, techniques, reagents, and software. These have driven new areas of
research consisting of DNA and protein microarrays, mass spectrometry, and a
number of other instruments that enable high-throughput analyses.
While genomics forms a main hierarchy of classification, there are many other
omics-es that fall under a clade of primary (gen)omics-enabled SB, for example,
functional genomics, comparative genomics, computational genomics, and
phylogenomics. With more than 1800 microbial genomes sequenced or being
sequenced today and the number still increasing, another set of omics called
metagenomics aims to access the genomic potential of an environmental sample. It
would answer some of the questions we posed in the earlier sections. This
environmental “omics” bridges the integration of metagenomics with complementary
approaches in microbial ecology (Schloss and Handelsman 2003).
While the mapping of PPI is key to understanding biological processes through
interactomics, many technologies have been reported to map interactions, and they are widely
applied in yeast. At present, the number of reported yeast protein interactions truly
validated by at least one other approach is low relative to the throughput it takes
to process them (Cornell et al. 2004). This is because of the false discovery rate of proteins
interacting with their partners. With the advent of virtual interactions, the number of
false positives has also increased, prompting researchers to keep track of
these false positives through statistical inference. Any dataset of an interaction
map is complex, while tools to decipher true positives are being developed in the
Fundamental biological processes can now be studied by applying the full range of
omics technologies (genomics, transcriptomics, proteomics, metabolomics, etc.)
using the same biological sample and high-throughput methods such as mass spectrometry (MS)
(McGuire et al. 2008; Kim et al. 2008). A wide array of assays, including
high-throughput methods such as tandem mass spectrometry (MS/MS), yeast two-hybrid
(Y2H), and pull-down assays, are preferentially used to navigate them. Clearly, it
would be desirable if the concept of the sample were shared among technologies
such as MS for that, until the time a biological sample is prepared for use in a specific
Fig. 1.3 Quantitative picture of various omics and the various fields an enthusiast can take up
2006). Conversely, as dogs too naturally develop cancer, they may share many
characteristics with human malignancies. This would probably accelerate genome-wide
cross-comparison of organisms to find the function of more genes, ultimately
for use in drug discovery and development.
1.4.1 Metabolomics
Metabolomics has come into sight as one of the newest “omics” sciences, offering a
dynamic portrait of the metabolic status of living systems. The analysis of the
metabolome is particularly challenging; it has its roots in early metabolite
profiling studies but is now a rapidly expanding area of scientific research in its
own right. It is a science employed toward the understanding of global SB (Rochfort
2005). Metabolomic tools aim to fill the gap between genotype and phenotype by
permitting simultaneous monitoring of molecules in a living system. Metabolic
information could be translated into diagnostic tests, as
these might have the potential to impact clinical practice and might
supplement traditional biomarkers of cellular integrity, cell and tissue
homeostasis, and morphological alterations that result from cell damage or death
(Claudino et al. 2007). Metabolomics has been widely applied to optimize
microorganisms for white biotechnology, even as it spreads to the investigation of
biotransformation and cell culture. Together with the other more established omics
technologies, metabolomics aims to contribute to different spheres ranging from
an understanding of the in vivo function of gene products to the simulation of the
whole cell in the SB approach. This will allow the construction of designer
organisms, from which yet another science, synthetic biology, evolves (Oldiges et al. 2007).
Although metabonomics measures the multiparametric response of living systems
to genetic modification, there is a consistent debate about its synonymy with
metabolomics. Admittedly, there is agreement that the former is associated
with NMR while the latter is associated with mass spectrometry. This part of the
microbial transformation has led to several standards for these two meta-omics
delivering SB tools (Fiehn et al. 2006).
1.5 Mitochondriomics
correlated with mitochondrial diseases, where the clinical pathologies are believed to
include infertility, diabetes, blindness, deafness, stroke, migraine, and heart, kidney, and
liver diseases (Reichert and Neupert 2004).
Recently, cancer was added to this list when investigations into human cancer
cells from breast, bladder, neck, and lung revealed a high occurrence of mutations in
mtDNA. With the understanding of the role of mitochondria in a vast array of
pathologies, research on mitochondria and mitochondrial dysfunction has in the
last decade yielded a huge amount of data in the form of publications and databases.
Yet, the field of mitochondrial research is still far from exhausted, with many
essentials waiting to be discovered. The recent identification of a number of proteins
targeting mitochondria has sparked immense interest in understanding the function of
some genes hitherto unnoticed in the mitochondrion (Calvo et al. 2006). With only 13
proteins sitting inside mitochondria through oxidative phosphorylation, and more than
1500 estimated proteins targeting this tiny organelle, identifying the complete protein
repertoire of this machinery could decipher the biology behind mitochondria, or what
makes us breathe. Work on a complete set of mitochondrial proteomes syntenic with other
eukaryotes has just started, and there is promise in understanding how the organelle
proteomes and interactomes could be used in SB (Calvo
et al. 2006).
laboratory research community has been successful in exploring these data using
bioinformatics, many challenges still persist. One of them is the effective integration
of datasets directly into approaches based on mathematical modeling of biological
systems. This is where SB has budded, resulting in top–down and bottom–up
approaches. The advent of functional genomics has enabled the molecular
biosciences to come a long way toward characterizing the molecular constituents of life.
Yet, the challenge for biology overall is to understand how organisms function. By
discovering how function arises in dynamic interactions, SB is
addressing the missing links between molecules and physiology. Top–down SB
identifies molecular interaction networks on the basis of correlated molecular
behavior observed in genome-wide “omics” studies. On the other hand, bottom–up SB
examines the mechanisms through which functional properties arise in the
interactions of known components. Applications in cancer are a good example for contrasting
these two major types of complementary strategies (Stransky et al. 2007). Several
web-based repositories have been established to store protein and peptide
identifications derived from MS data, and a similar number of peptide identification
software pipelines and workflows have emerged to deliver identifications to these
repositories. Integrated data analysis is introduced as the intermediate level of an SB
approach and as a supplement to bioinformatics for analyzing different “omics”
datasets, that is, genome-wide measurements of transcripts, protein levels or PPI,
and metabolite levels, aiming at generating a coherent understanding of biological
function (Steinfath et al. 2007). Furthermore, existing and potential problems and
solutions, such as the de facto experimental challenges and the following bioinformatics
challenges, might hold promise in the near future:
1. Challenges in high-dimensional biology (HDB): Recently, the term HDB has
been proposed for investigations involving high-throughput data (Mehta et al.
2006). The HDB includes whole-genome sequences, expression levels of genes,
protein abundance measurements, and other permutations. The identification of
biomarkers, the effects of mutations and drug treatments, and the investigation of
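The top–down strategy mentioned earlier, in which interaction networks are inferred from correlated molecular behavior in high-throughput data, can be illustrated with a small sketch. This is only an assumed, simplified reading of the idea: the expression profiles, the gene names, and the 0.9 cutoff are all hypothetical, and Pearson correlation is just one of several measures that could be used. A strong correlation suggests a candidate association and does not demonstrate a physical interaction.

```python
from itertools import combinations
from math import sqrt

# Hypothetical expression profiles (one value per sample); illustrative numbers only.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],  # rises with geneA
    "geneC": [4.0, 3.0, 2.0, 1.0],  # falls as geneA rises
    "geneD": [1.0, 3.0, 2.0, 3.0],  # only weakly related to the others
}


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant profile carries no correlation signal
    return cov / (sd_x * sd_y)


def correlation_network(profiles, threshold=0.9):
    """Return (gene1, gene2, r) for every pair whose |r| reaches the threshold."""
    edges = []
    for g1, g2 in combinations(sorted(profiles), 2):
        r = pearson(profiles[g1], profiles[g2])
        if abs(r) >= threshold:
            edges.append((g1, g2, round(r, 3)))
    return edges


edges = correlation_network(profiles)
for g1, g2, r in edges:
    print(g1, g2, r)
```

In this toy data, geneA, geneB, and geneC end up linked to one another (positively or negatively), while geneD stays unconnected because its correlation with every other profile falls below the assumed cutoff.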
Does close homology between two proteins imply that they interact in the same
manner? Yes, they do, and they confer evolutionary constraints in lieu of structural
divergence, while remotely related proteins have a different interaction mode
(Drummond et al. 2005). Also, conservation of the protein interface is indicative of the
average conservation of the rest of the protein. While all these form an integral
part of SB, apart from the novel interactions that arise based on the type of
homology, there are interactions based on the binding entity, namely, stable and
transient. The former interactions are consistent and bookmarked, while the latter are
temporary. There are interacting proteins that might co-express, indicating that the
expressed proteins, which evolve slowly, are normalized, wherein the normalized
difference between the absolute expression data is calculated using several tools
such as microarrays (Drummond et al. 2005). However, other techniques
such as density gradient and virtual pull-down assay methods, cited above, are
beginning to be understood and substantiate the above views.
As thousands of new genes are identified in genomics efforts, the rush is on to
learn something about the functional roles of the proteins encoded by those genes.
Clues to protein functions, activation states, and PPI have been revealed in focused
studies of protein localization. A meta-analysis of data derived from genome-wide
studies of aging in simple eukaryotes will allow the identification of conserved
determinants of longevity that can be tested in other mammals (Khanna 2006;
Kaeberlein 2004). Adding to the various high-throughput methods, and building on technical
breakthroughs such as GFP protein tagging and recombinase clones, large-scale screens of
protein localization are now being undertaken to understand the function of the
proteins (O’Rourke et al. 2005).
In the recent past, various bioinformatics tools have been developed that allow
researchers to compare genomic and proteomic repertoires. Comparative studies
using algorithms such as BLAST, together with sequence databases, are carried out to
distinguish unique proteins from paralogs, the latter of which might have resulted
from gene duplication events. The genomes sequenced so far have helped not only in
predicting evolutionary relationships but also in identifying gene functions through
functional genomics. Although many methods are employed by researchers, screening of
proteins for novel translatable candidates is not often used, and the researcher
repeatedly performs the screening with laborious wet laboratory experiments. To
increase the sensitivity, further clues on tissues and developmental stages for the
queried gene’s sequences could be surveyed using tools such as the Gene Expression
Omnibus (GEO), UniGene-EST, or cDNA profile databases. Furthermore, linking the
protein to the genomic location specified by transcript mapping, radiation hybrid
mapping, genetic mapping, or cytogenetic mapping, as available from GenBank
resources, would improve the understanding of protein annotation. Besides this,
whether or not the transcript encoding a protein contains a polyadenylation signal
could be added knowledge toward meeting the criteria of well-annotated proteins. This
is because tools such as MEME reveal many 3′ UTRs forming conserved motifs, which
indicates that these regions are more conserved than expected. This means that the
higher the conservation, the greater the number of duplications and the greater the
chance of a protein remaining unannotated or “hypothetical.” There seem to be many
unique genes that are overrepresented in the form of duplications; a simple search of
the GenBank gene list would reveal that several accessions are duplicated. For
example, in the case of the gene FusA2a, the bona fide accession is mapped to
CAD92986, and yet a few of the isoforms/unique genes remain unknown (e.g.,
CAD93127). In summary, many proteins may remain poorly annotated even though many
tools are available to describe function. This raises the question: what would be the
fate of proteins that cannot be annotated by some tools, or, conversely, how many of
the best tools should be used to describe or annotate a protein?
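The remark about duplicated accessions can be illustrated with a minimal sketch, assuming the records have already been retrieved as (accession, sequence) pairs. The accession names and sequences below are hypothetical placeholders, not real GenBank entries, and exact sequence identity is a simplifying assumption; real redundancy detection would more likely rely on alignment-based similarity.

```python
from collections import defaultdict


def group_duplicate_accessions(records):
    """Group accessions whose sequences are identical after normalization.

    `records` is an iterable of (accession, sequence) pairs. Only groups
    with more than one accession are returned.
    """
    groups = defaultdict(list)
    for accession, sequence in records:
        # Normalize case and strip whitespace so trivial formatting
        # differences do not hide a duplicate.
        key = "".join(sequence.split()).upper()
        groups[key].append(accession)
    return [sorted(accs) for accs in groups.values() if len(accs) > 1]


# Hypothetical toy records (not real database entries).
records = [
    ("ACC_0001", "MKTAYIAKQR"),
    ("ACC_0002", "mktayiakqr"),
    ("ACC_0003", "MSLLTEVETY"),
]
duplicates = group_duplicate_accessions(records)
```

Each returned group lists accessions that share one sequence and is a candidate for manual review rather than a confirmed duplication.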
Apart from BLAST and FASTA, sequence-based feature annotation is
applied by RefSeq using several tools, namely, BEAUTY X-BLAST Enhanced Alignment
Utility and PROSITE. While many other variants of BLAST, including PSI-BLAST and
PHI-BLAST, and sequence alignment tools such as ClustalW, ClustalX, and COBALT are
used, not all the tools are used in tandem to eliminate false positives. Whether a
protein is soluble or insoluble can be predicted with TopPred; the topology of the
protein, with the orientation and location of its transmembrane helices, contributes
to its function. Additionally, orthology mapping with tools such as HomoMINT is used,
which increases the chance of annotating the protein. With the central tenet of
bioinformatics today, namely, that sequence specifies structure and function,
annotations have become robust enough to support further manual curation, allowing
researchers to perform experimental analyses for some proteins. The structures of
proteins not only determine their functions; the shapes they adopt also allow them to
interact selectively with other proteins or molecules. This specificity is key for a
protein to interact with another protein, from which its function can be inferred.
However, most bioinformatics analyses can be misleading unless biochemical
characterization is carried out. Furthermore, protein annotation has gained much
importance with the introduction of many metazoan genome sequencing projects, in
addition to the 1000 Genomes Project that is in progress. With 40–50% of identified
genes corresponding to proteins of unknown function, the functional structural
annotation screening technology using NMR (FAST-NMR) has been developed to
assign a biological function. It is based on the basic dogma of biochemistry that
proteins with similar functions will have similar active sites and exhibit similar
ligand-binding interactions, even when they differ globally in sequence and
structure. Several tools have been widely used for this purpose: Combinatorial
Extension (CE), which assesses structural similarity; DALI, for NMR structures; and,
for finally determining function, PvSOAR and ProFunc, the latter of which aims to
identify a protein’s function from a given 3D structure. There are also many
other methods, such as the Rosetta Stone method, the phylogenetic profiling method,
and conserved gene neighbors, that have been widely employed and are being accepted
by the scientific community.
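The idea behind phylogenetic profiling can be sketched in a few lines: proteins that are present and absent in the same set of genomes are inferred to be functionally linked. The following is a minimal illustration under stated assumptions, not a published implementation; the protein names, the five-genome presence/absence vectors, and the similarity threshold are all hypothetical.

```python
from itertools import combinations


def profile_similarity(a, b):
    """Fraction of genomes in which two presence/absence profiles agree."""
    if len(a) != len(b):
        raise ValueError("profiles must cover the same genomes")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)


def linked_pairs(profiles, threshold=1.0):
    """Return protein pairs whose profiles agree at or above the threshold."""
    pairs = []
    for (n1, p1), (n2, p2) in combinations(sorted(profiles.items()), 2):
        if profile_similarity(p1, p2) >= threshold:
            pairs.append((n1, n2))
    return pairs


# Hypothetical profiles: 1 = homolog present in that genome, 0 = absent.
profiles = {
    "protA": (1, 0, 1, 1, 0),
    "protB": (1, 0, 1, 1, 0),
    "protC": (0, 1, 1, 0, 1),
}
links = linked_pairs(profiles)
```

Here protA and protB share an identical profile and are reported as a candidate functional link, while protC is not. In practice, profiles that are present in every genome or absent from every genome carry little information and are usually filtered out first.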
Knowledge of the biological function of proteins helps in the identification of
novel drug targets and helps reduce the extensive cost of practical examinations of
several candidates. With the enormous amount of sequence and structure information
available, innumerable automated annotation tools for proteins have also been
developed. One such example is the automated protein annotation tool (APAT),
which uses a markup language concept to provide wrappers for several kinds of
protein annotations. FFPred is available to predict molecular function for
orphan and unannotated protein sequences; the method has been optimized for
performance using a protein feature-based approach with support vector machines
(SVMs) that does not require prior identification of protein sequence homologs. It
relies on features such as posttranslational modifications, Gene Ontology terms, and
protein localization. Yet another tool, VICMpred, aids in the broad functional
classification of bacterial proteins into virulence factors, information
molecules, cellular processes, and metabolism molecules. The VICMpred server uses an
SVM-based method built on patterns, amino acid composition, and dipeptide
composition of bacterial protein sequences. ConSeq and ConSurf have been widely
applied in predicting functional/structural sites in a protein using conservation
and hypervariation.
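To make the amino acid and dipeptide composition features concrete, the sketch below builds a 420-dimensional vector (20 amino acid fractions plus 400 dipeptide fractions) of the kind that could be fed to an SVM. It is an illustration under stated assumptions rather than the VICMpred implementation: the function names and the example sequence are hypothetical, and residues outside the 20 standard amino acids are simply ignored.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Map each of the 400 possible dipeptides to a fixed position in the vector.
DIPEPTIDE_INDEX = {
    a + b: i for i, (a, b) in enumerate(product(AMINO_ACIDS, repeat=2))
}


def amino_acid_composition(seq):
    """Return 20 fractions, one per standard amino acid."""
    residues = [aa for aa in seq.upper() if aa in AMINO_ACIDS]
    if not residues:
        return [0.0] * len(AMINO_ACIDS)
    return [residues.count(aa) / len(residues) for aa in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Return 400 fractions, one per ordered pair of adjacent residues."""
    seq = seq.upper()
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    valid = [p for p in pairs if p in DIPEPTIDE_INDEX]
    vector = [0.0] * len(DIPEPTIDE_INDEX)
    for pair in valid:
        vector[DIPEPTIDE_INDEX[pair]] += 1 / len(valid)
    return vector


def feature_vector(seq):
    """Concatenate both compositions into one 420-dimensional vector."""
    return amino_acid_composition(seq) + dipeptide_composition(seq)


# Hypothetical toy sequence, not a real bacterial protein.
features = feature_vector("MKKLA")
```

Because both parts are fractions, the vector length does not depend on the sequence length, which is what lets proteins of different sizes share one classifier input space.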
The final part of annotation can be studied through interactions and associations.
All interactions are associations, while not all associations are interactions. The
association tools, namely, Search Tool for the Retrieval of Interacting Genes/Proteins
(STRING), GeneCards, IntAct, MINT, Biomolecular Interaction Network Database
indulging in any explanations, and obviously with great moral effort,
Willett staggered dizzily down to the cellar and tried the fateful
platform before the tubs. It was unyielding. Crossing to where he had
left his yet-unused tool satchel the day before, he obtained a chisel
and began to pry up the stubborn planks one by one. Underneath the
smooth concrete was still visible, but of any opening or perforation
there was no longer a trace. Nothing yawned this time to sicken the
mystified father who had followed the doctor downstairs; only the
smooth concrete underneath the planks—no noisome well, no world
of subterrene horrors, no secret library, no Curwen papers, no
nightmare pits of stench and howling, no laboratory or shelves or
chiseled formulae, no—Dr. Willett turned pale, and clutched at the
younger man. "Yesterday," he asked softly, "did you see it here—and
smell it?" And when Mr. Ward, himself transfixed with dread and
wonder, found strength to nod an affirmative, the physician gave a
sound half a sigh and half a gasp, and nodded in turn. "Then I will
tell you," he said.
So for an hour, in the sunniest room they could find upstairs, the
physician whispered his frightful tale to the wondering father. There
was nothing to relate beyond the looming up of that form when the
greenish-black vapor from the kylix parted, and Willett was too tired
to ask himself what had really occurred. There were futile, bewildered
head-shakings from both men, and once Mr. Ward ventured a hushed
suggestion, "Do you suppose it would be of any use to dig?" The
doctor was silent, for it seemed hardly fitting for any human brain to
answer when powers of unknown spheres had so vitally encroached
on this side of the Great Abyss. Again Mr. Ward asked, "But where did
it go? It brought you here, you know, and it sealed up the hole
somehow."
And Willett again let silence answer for him.
But after all, this was not the final phase of the matter. Reaching for
his handkerchief before rising to leave, Dr. Willett's fingers closed
upon a piece of paper in his pocket which had not been there before,
and which was companioned by the candles and matches he had
seized in the vanished vault. It was a common sheet, torn obviously
from the cheap pad in that fabulous room of horror somewhere
underground, and the writing upon it was that of an ordinary lead
pencil—doubtless the one which had lain beside the pad. It was
folded very carelessly, and beyond the faint acrid scent of the cryptic
chamber bore no print or mark of any world but this. But in the text
itself it did indeed reek with wonder; for here was no script of any
wholesome age, but the labored strokes of mediaeval darkness,
scarcely legible to the laymen who now strained over it, yet having
combinations of symbols which seemed vaguely familiar. The briefly
scrawled message was this, and its mystery lent purpose to the
shaken pair, who forthwith walked steadily out to the Ward car and
gave orders to be driven first to a quiet dining place and then to the
John Hay Library on the hill.
Willett rang for the man and asked him some low-toned questions. It
had, surely enough, been a bad business. There had been noises—a
cry, a gasp, a choking, and a sort of clattering or creaking or
thumping, or all of these. And Mr. Charles was not the same when he
stalked out without a word. The butler shivered as he spoke, and
sniffed at the heavy air that blew down from some open window
upstairs. Terror had settled definitely upon the house, and only the
businesslike detectives failed to imbibe a full measure of it. Even they
were restless, for this case had held vague elements in the
background which pleased them not at all. Dr. Willett was thinking
deeply and rapidly, and his thoughts were terrible ones. Now and
then he would almost break into muttering as he ran over in his head
a new, appalling, and increasingly conclusive chain of nightmare
happenings.
Then Mr. Ward made a sign that the conference was over, and
everyone save him and the doctor left the room. It was noon now,
but shadows as of coming night seemed to engulf the phantom-
haunted mansion. Willett began talking very seriously to his host, and
urged that he leave a great deal of the future investigation to him.
There would be, he predicted, certain obnoxious elements which a
friend could bear better than a relative. As family physician he must
have a free hand, and the first thing he required was a period alone
and undisturbed in the abandoned library upstairs, where the ancient
overmantel had gathered about itself an aura of noisome horror more
intense than when Joseph Curwen's features themselves glanced
slyly down from the painted panel.
Mr. Ward, dazed by the flood of grotesque morbidities and
unthinkably maddening suggestions that poured in upon him from
every side, could only acquiesce; and half an hour later the doctor
was locked in the shunned room with the paneling from Olney Court.
The father, listening outside, heard fumbling sounds of moving and
rummaging as the moments passed; and finally a wrench and a
creak, as if a tight cupboard door were being opened. Then there
was a muffled cry, a kind of snorting choke, and a hasty slamming of
whatever had been opened. Almost at once the key rattled and
Willett appeared in the hall, haggard and ghastly, and demanding
wood for the real fireplace on the south wall of the room. The
furnace was not enough, he said; and the electric log had little
practical use. Longing yet not daring to ask questions, Mr. Ward gave
the requisite orders and a man brought some stout pine logs,
shuddering as he entered the tainted air of the library to place them
in the grate. Willett meanwhile had gone up to the dismantled
laboratory and brought down a few odds and ends not included in
the moving of the July before. They were in a covered basket, and
Mr. Ward never saw what they were.
Then the doctor locked himself up in the library once more, and by
the clouds of smoke which rolled down past the windows from the
chimney it was known that he had lighted the fire. Later, after a great
rustling of newspapers, that odd wrench and creaking were heard
again; followed by a thumping which none of the eavesdroppers
liked. Thereafter two suppressed cries of Willett's were heard, and
hard upon these came a swishing rustle of indefinable hatefulness.
Finally the smoke that the wind beat down from the chimney grew
very dark and acrid, and everyone wished that the weather had
spared them this choking and venomous inundation of peculiar
fumes. Mr. Ward's head reeled, and the servants all clustered
together in a knot to watch the horrible black smoke swoop down.
After an age of waiting the vapors seemed to lighten, and half-
formless sounds of scraping, sweeping, and other minor operations
were heard behind the bolted door. And at last, after the slamming of
some cupboard within, Willett made his appearance, sad, pale and
haggard, and bearing the cloth-draped basket he had taken from the
upstairs laboratory. He had left the window open, and into that once
accursed room was pouring a wealth of pure, wholesome air to mix
with a queer new smell of disinfectants. The ancient overmantel still
lingered; but it seemed robbed of malignity now, and rose as calm
and stately in its white paneling as if it had never borne the picture of
Joseph Curwen. Night was coming on, yet this time its shadows held
no latent fright, but only a gentle melancholy. Of what he had done
the doctor would never speak. To Mr. Ward he said, "I can answer no
questions, but I will say that there are different kinds of magic. I
have made a great purgation. Those in this house will sleep the
better for it."
That Dr. Willett's "purgation" had been an ordeal almost as nerve-
racking in its way as his hideous wandering in the vanished crypt is
shewn by the fact that the elderly physician gave out completely as
soon as he reached home that evening. For three days he rested
constantly in his room, though servants later muttered something
about having heard him after midnight on Wednesday, when the
outer door softly opened, and closed with phenomenal softness.
Servants' imaginations, fortunately, are limited, else comment might
have been excited by an item in Thursday's Evening Bulletin which
ran as follows:
10 Barnes St.,
Providence, R. I.,
April 12, 1928.
Dear Theodore:
I feel that I must say a word to you before doing what I am going to
do tomorrow. It will conclude the terrible business we have been
going through (for I feel that no spade is ever likely to reach that
monstrous place we know of), but I'm afraid it won't set your mind at
rest unless I expressly assure you how very conclusive it is.
You have known me ever since you were a small boy, so I think you
will not distrust me when I hint that some matters are best left
undecided and unexplored. It is better that you attempt no further
speculation as to Charles's case, and almost imperative that you tell
his mother nothing more than she already suspects. When I call on
you tomorrow Charles will have escaped. That is all which need
remain in anyone's mind. He was mad, and he escaped.
So don't ask me any questions when I call. It may be that something
will go wrong, but I'll tell you if it does. I don't think it will. There will
be nothing more to worry about, for Charles will be very, very safe.
He is now—safer than you dream. You need hold no fears about
Allen, and who or what he is. He forms as much a part of the past as
Joseph Curwen's picture, and when I ring your doorbell you may feel
certain that there is no such person. And what wrote that minuscule
message will never trouble you or yours.
But you must steel yourself to melancholy, and prepare your wife to
do the same. I must tell you frankly that Charles's escape will not
mean his restoration to you. He has been afflicted with a peculiar
disease, as you must realize from the subtle physical as well as
mental changes in him, and you must not hope to see him again. He
stumbled on things no mortal ought ever to know, and reached back
through the years as no one ever should reach; and something came
out of those years to engulf him.
And now comes the matter in which I must ask you to trust me most
of all. For there will be, indeed, no uncertainty about Charles's fate.
In about a year, say, you can if you wish devise a suitable account of
the end, for the boy will be no more. You can put up a stone in your
lot at the North Burial ground exactly ten feet west of your father's
and facing the same way, and that will mark the true resting-place of
your son. Nor need you fear that it will mark any abnormality or
changeling. The ashes in that grave will be those of your own
unaltered bone and sinew—of the real Charles Dexter Ward whose
mind you watched from infancy—the real Charles with the olive-mark
on his hip and without the black witch-mark on his chest or the pit on
his forehead. The Charles who never did actual evil, and who will
have paid with his life for his "squeamishness."
That is all. Charles will have escaped, and a year from now you can
put up his stone. Do not question me tomorrow. And believe that the
honour of your ancient family remains untainted now, as it has been
at all times in the past.
With profoundest sympathy, and exhortations to fortitude, calmness,
and resignation, I am ever
Sincerely your friend,
Marinus B. Willett.
So on the morning of Friday, April 13, 1928, Marinus Bicknell Willett
visited the room of Charles Dexter Ward at Dr. Waite's private
hospital on Conanicut Island. After the interchange of a few strained
formalities, a new element of constraint crept in, as Ward seemed to
read behind the doctor's masklike face a terrible purpose which had
never been there before.
Ward actually turned pale, and the doctor was the first to speak.
"More," he said, "has been found out, and I must warn you fairly that
a reckoning is due."
"Digging again, and coming upon more poor starving pets?" was the
ironic reply. It was evident that the youth meant to shew bravado to
the last.
"No," Willett slowly rejoined, "this time I did not have to dig. We have
had men looking up Dr. Allen, and they found the false beard and
spectacles in the bungalow!"
"Excellent," commented the disquieted host in an effort to be wittily
insulting, "and I trust they proved more becoming than the beard and
glasses you now have on!"
"They would become you very well," came the even and studied
response, "as indeed they seem to have done."
As Willett said this, it almost seemed as though a cloud passed over
the sun; though there was no change in the shadows on the floor.
Then Ward ventured:
"And is this what asks so hotly for a reckoning? Suppose a man does
find it now and then useful to be twofold?"
"No," said Willett gravely, "again you are wrong. It is no business of
mine if any man seeks duality; provided he has any right to exist at
all, and provided he does not destroy what called him out of space."
Ward now started violently. "Well, Sir, what have ye found, and what
d'ye want with me?"
The doctor let a little time elapse before replying, as if choosing his
words for an effective answer.
"I have found," he finally intoned, "something in a cupboard behind
an ancient overmantel where a picture once was, and I have burned
it and buried the ashes where the grave of Charles Dexter Ward
ought to be."
The madman choked and sprang from the chair in which he had been
sitting:
"Damn ye, who did ye tell—and who'll believe it was he after these
full two months, with me alive? What d'ye mean to do?"
Willett, though a small man, actually took on a kind of judicial
majesty as he calmed the patient with a gesture.
"I have told no one. This is no common case—it is a madness out of
time and a horror from beyond the spheres which no police or
lawyers or courts or alienists could ever fathom or grapple with. You
cannot deceive me, Joseph Curwen, for I know that your accursed
magic is true!
"I know how you wove the spell that brooded outside the years and
fastened on your double and descendant; I know how you drew him
into the past and got him to raise you up from your detestable grave;
I know how he kept you hidden in his laboratory while you studied
modern things and roved abroad as a vampire by night, and how you
later shewed yourself in beard and glasses that no one might wonder
at your godless likeness to him; I know what you resolved to do
when he balked at your monstrous rifling of the world's tombs, and at
what you planned afterward, and I know how you did it.
"You left off your beard and glasses and fooled the guards around the
house. They thought it was he who went in, and they thought it was
he who came out when you had strangled and hidden him. But you
hadn't reckoned on the different contacts of two minds. You were a
fool, Curwen, to fancy that a mere visual identity would be enough.
Why didn't you think of the speech and the voice and the
handwriting? It hasn't worked, you see, after all. You know better
than I who or what wrote that message in minuscules, but I will warn
you it was not written in vain. There are abominations and
blasphemies which must be stamped out, and I believe that the
writer of those words will attend to Orne and Hutchinson. One of
those creatures wrote you once, 'do not call up any that you cannot
put down.' Curwen, a man can't tamper with Nature beyond certain
limits, and every horror you have woven will rise up to wipe you out."
But here the doctor was cut short by a convulsive cry from the
creature before him. Hopelessly at bay, weaponless, and knowing
that any show of physical violence would bring a score of attendants
to the doctor's rescue, Joseph Curwen had recourse to his one
ancient ally, and began a series of cabalistic motions with his
forefingers as his deep, hollow voice, now unconcealed by feigned
hoarseness, bellowed out the opening words of a terrible formula.
"PER ADONAI ELOIM, ADONAI JEHOVA, ADONAI SABAOTH,
METRATON...."
But Willett was too quick for him. Even as the dogs in the yard
outside began to howl, and even as a chill wind sprang suddenly up
from the bay, the doctor commenced the solemn and measured
intonation of that which he had meant all along to recite. An eye for
an eye—magic for magic—let the outcome shew how well the lesson
of the abyss had been learned! So in a clear voice Marinus Bicknell
Willett began the second of that pair of formulae whose first had
raised the writer of those minuscules—the cryptic invocation whose
heading was the Dragon's Tail, sign of the descending node—
At the very first word from Willett's mouth the previously commenced
formula of the patient stopped short. Unable to speak, the monster
made wild motions with his arms until they too were arrested. When
the awful name of Yog-Sothoth was uttered, the hideous change
began. It was not merely a dissolution, but rather a transformation or
recapitulation; and Willett shut his eyes lest he faint before the rest
of the incantation could be pronounced.
But he did not faint, and that man of unholy centuries and forbidden
secrets never troubled the world again. The madness out of time had
subsided, and the case of Charles Dexter Ward was closed. Opening
his eyes before staggering out of that room of horror, Dr. Willett saw
that what he had kept in memory had not been kept amiss. There
had, as he had predicted, been no need for acids. For like his
accursed picture a year before, Joseph Curwen now lay scattered on
the floor as a thin coating of fine bluish-gray dust.
*** END OF THE PROJECT GUTENBERG EBOOK THE CASE OF
CHARLES DEXTER WARD ***