8 Metagenomics and Drug-Discovery
Bhupender Singh and Ayan Roy

Abstract

The emergence of high-throughput sequencing and its application to the analysis of microbial populations has introduced a new area of scientific research: metagenomics. The term refers to the collective genomic analysis of unculturable microbial communities residing in a specific environmental condition or niche. Metagenomic analysis has revolutionised various fields of biological research, notably drug discovery. A comprehensive investigation of the microbial diversity of unexploited areas, aided by molecular biology and high-throughput sequencing technologies, has opened the floodgates to exploring and profiling a variety of novel bioactive metabolites and potential antibiotics that promise to be of immense value to the pharmaceutical sector. High-throughput sequencing has accelerated the identification of metabolites from metagenomic samples, and several bioactive metabolites with immense therapeutic potential have been obtained in this way. Some examples include malacidin, fluoroquinolone, minimide and erdacin. In this chapter, major benchmark studies on pharmacologically significant bioactive metabolites extracted from metagenomic samples are discussed in detail. Several specialised bioinformatics pipelines frequently employed for the purpose are also reviewed. The chapter also aims to highlight the major unexplored areas of drug discovery from metagenomic samples and their associated metabolites, a hidden treasure for the pharmaceutical sector.

B. Singh · A. Roy (*)


School of Bioengineering and Biosciences, Lovely Professional University, Phagwara, Punjab,
India
e-mail: [email protected]

© Springer Nature Singapore Pte Ltd. 2020
R. S. Chopra et al. (eds.), Metagenomics: Techniques, Applications, Challenges and Opportunities, https://doi.org/10.1007/978-981-15-6529-8_8

Keywords

Metagenomics · Drug discovery · High-throughput sequencing · Metabolites · Antibiotics

8.1 Introduction

Recent advances in microbiology have led researchers to explore microbes in new and more insightful ways (Arnold et al. 2016). The finding that most microbes cannot be cultured acted as a catalyst for changing the approaches previously used to study microbial populations (Stewart 2012). Knowledge of the impact of microbes on humans and the environment has led microbiologists to develop strategies for examining uncultured microorganisms (Handelsman 2004). The urge to study the evolutionary and functional characteristics of uncultured microbial diversity gave rise to the multidisciplinary field called metagenomics (Imhoff 2016). It involves isolating genomic DNA from environmental samples, then cloning and expressing it in a culturable organism for further analysis (Handelsman 2004). The term metagenomics was coined to denote the meta-examination of relatively similar microbial populations residing in various niches (Neelakanta and Sultana 2013). The field was brought into the spotlight by the studies of DeLong and his colleagues, who generated a metagenomic library of prokaryotes from sea water. 16S rRNA sequencing confirmed that the library contained sequences from an archaeon that had not been cultured until that time (Stein et al. 1996).
Sequencing analysis has a unique role in the identification and functional annotation of metagenomic samples (Österlund et al. 2017). The very first metagenomic analysis coupled with shotgun sequencing was carried out to analyse the viral and microbial diversity of surface sea water, and more than 65% of the sequences it identified were previously unknown (Osunmakinde et al. 2018). The majority of metagenome analyses have been performed on marine samples, first because two-thirds of the earth is covered by water and second because the marine environment provides a niche for diverse microbial communities that regulate the ecosystem (Aguiar-Pulido et al. 2016). In addition, the marine microbial population serves as a hidden treasure trove of novel pharmaceutical and industrially relevant metabolites (Hug et al. 2018). Progressively, the combination of metagenomics with high-throughput sequencing (HTS) technologies has opened the floodgates to finding novel microbial metabolites with clinical implications. Together, these approaches have led to the identification of biomass-degrading enzymes from cow rumen, the recognition of novel CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) systems and the establishment of a gene index of the human microbiome (Seshadri et al. 2018; Stewart et al. 2018).

8.2 HTS Technologies for Metagenome Examination

HTS technology was first introduced in 2005 and has since been steadily improved to provide greater accuracy and precision in genome analysis (Reuter et al. 2015). Advances in HTS have laid the foundation for the extensive exploration of microbial communities by producing high-throughput genomic data in a cost- and time-effective manner (Zhou et al. 2015). The array of HTS approaches includes amplicon sequencing, whole-genome sequencing and shotgun metagenome sequencing (Petrosino et al. 2009). The advantages of HTS over conventional sequencing are its high-throughput data generation, the absence of a cloning step and its lower cost (Ari and Arikan 2016). The critical step in this technology is drawing statistically significant inferences from the generated data. In the following discussion, various HTS platforms that can be used to analyse metagenomes are highlighted (and summarized in Fig. 8.1).

8.2.1 Roche 454 Sequencer

Variants: GS20, GS-FLX, GS-FLX Titanium and GS-FLX Titanium+.

GS20 was the first HTS variant, introduced in 2005. This sequencer implements a sequencing-by-synthesis approach in a picotitre plate and gives 20 megabases of output in a single run with a mean read size of 100 base pairs (Pareek et al. 2011). The sequencer works by pyrosequencing: nucleotide triphosphates (NTPs) are supplied, and the addition of a nucleotide complementary to the sequencing strand is detected through the release of pyrophosphate (Harrington et al. 2013). Its high-end variant, GS-FLX Titanium+, generates around 850 megabases in a single run with a mean read size of 700–750 base pairs. The system was well suited to 16S rRNA sequencing because its reads can span the highly variable regions of 16S rRNA. The GS-FLX variants were discontinued in December 2016 because their cost, error rate and required quantity of sample DNA were higher than those of other HTS platforms. Nevertheless, the platform produced an enormous amount of data, much of which is not yet reflected in the scientific literature (Liu et al. 2012).

Fig. 8.1 Various HTS platforms available for metagenome examination

8.2.2 Illumina Sequencer

Variants: GA I, GA II, HiSeq, MiSeq, NextSeq 500, HiSeq 2500 and HiSeq X Ten.

The Illumina sequencer was introduced in 2006 and was widely adopted by scientists because of its lower cost. Its major downside was the short read length of the initial variants, which was addressed in later variants. The MiSeq variant gives read lengths of 2 × 300 base pairs (Quail et al. 2008). These features made scientists switch from the Roche 454 platform to Illumina sequencers. The platform uses a sequencing-by-synthesis approach based on reversible dye terminators. The HiSeq 2500 variant produces up to four billion paired-end reads of 125 bases each (Schirmer et al. 2016). Its more recent advancement, HiSeq X Ten, couples ten HiSeq machines, as the name suggests, to obtain high-throughput data (Levy and Myers 2016). Illumina has also introduced the first small-sized sequencer, known as the NextSeq 500 (Buermans and den Dunnen 2014).

8.2.3 SOLiD (Sequencing by Oligonucleotide Ligation and Detection)

SOLiD was announced in 2006 by Applied Biosystems as a sequencing platform that uses a sequencing-by-ligation methodology (Pareek et al. 2011). The 5500 xl variant generates 300 gigabytes of data, 3 billion base pairs, in a single run with a read size of 75 base pairs. Despite its abundant data generation and low cost per sequenced base, its comparatively small read size and the cost of a single run were its major setbacks (Goodwin et al. 2016).

8.2.4 Ion Torrent Sequencer

Variants: PGM (Personal Genome Machine), Proton and S5.

Ion Torrent was the first organisation to introduce small-scale sequencers to researchers, in the form of the PGM in 2010. It received positive feedback and became popular among researchers as a way to perform sequencing analysis at comparatively low cost. Sequencing is carried out in a microwell plate in which DNA fragments are attached to beads. When a nucleotide is added to the sequencing strand, a proton is released, and the resulting change in pH is sensed by the detector. Ion Proton, the high-throughput variant of Ion Torrent, generates ten gigabases of data from almost 50 million reads in a single run, with a read size of 200 base pairs (Lahens et al. 2017). The most significant advantage of the PGM is that it can generate a large read size of about 400 base pairs (Henson et al. 2012). The latest variant is the Ion S5, which can generate 15 gigabases of output from 60 to 80 million reads of around 200 base pairs in a single run (Mehrotra et al. 2017).
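The run outputs quoted for these platforms can be sanity-checked with simple arithmetic, since the yield of a run is roughly the number of reads multiplied by the read length. The following is a minimal sketch of that check; the function names are hypothetical and the figures are the ones quoted in this section.

```python
def expected_yield_bases(n_reads: int, read_length_bp: int) -> int:
    """Approximate run output in bases as number of reads x read length."""
    return n_reads * read_length_bp


def to_gigabases(bases: int) -> float:
    """Convert a base count to gigabases (1 gigabase = 1e9 bases)."""
    return bases / 1e9


# Ion Proton: about 50 million reads of 200 bp
ion_proton_gb = to_gigabases(expected_yield_bases(50_000_000, 200))

# Ion S5: 60 to 80 million reads of about 200 bp
ion_s5_low_gb = to_gigabases(expected_yield_bases(60_000_000, 200))
ion_s5_high_gb = to_gigabases(expected_yield_bases(80_000_000, 200))

print(ion_proton_gb)                   # 10.0
print(ion_s5_low_gb, ion_s5_high_gb)   # 12.0 16.0
```

The Ion Proton figure reproduces the quoted ten gigabases, and the quoted Ion S5 output of about 15 falls inside the computed range of 12 to 16 gigabases. Real runs deviate from this estimate because read lengths vary and some reads are filtered out.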

8.2.5 Pacific Biosciences

Variant: PacBio RS II.

In 2012, Pacific Biosciences launched its SMRT (Single Molecule Real-Time) sequencing platform, which performs sequencing by synthesis. Helicos BioSciences offered the first single-molecule sequencing platform, but PacBio was able to establish a distinguished reputation in the SMRT sequencing area. It uses CCS (Circular Consensus Sequencing) to obtain low-error sequence stretches. The platform performs sequencing in a ZMW (Zero-Mode Waveguide), in which a DNA polymerase is bound to a single DNA molecule. The incorporation of each nucleotide into the strand is then recorded by detecting fluorescence: each of the four nucleotide types is labelled with a different fluorescent dye, which emits a distinct signal on incorporation into the synthesizing strand. The platform generates reads of 10,000 to 60,000 base pairs with a precision of around 99.999 percent. Because of its exceptional annotation efficiency, it is regarded as best for shotgun metagenome analysis (Ardui et al. 2018).
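The intuition behind consensus sequencing is that repeated passes over the same molecule let random errors be outvoted. The sketch below illustrates this with a per-position majority vote. It is a deliberately simplified illustration, not the CCS algorithm itself: it assumes the passes are already aligned to equal length and contain substitution errors only, and the function name is hypothetical.

```python
from collections import Counter
from typing import List


def majority_consensus(passes: List[str]) -> str:
    """Return the per-column majority base across pre-aligned passes."""
    if not passes:
        raise ValueError("need at least one pass")
    length = len(passes[0])
    if any(len(p) != length for p in passes):
        raise ValueError("passes must be pre-aligned to equal length")
    consensus = []
    for column in zip(*passes):
        # most_common(1) returns the most frequent base in this column;
        # ties are resolved in favour of the base seen first.
        base, _count = Counter(column).most_common(1)[0]
        consensus.append(base)
    return "".join(consensus)


# Five passes over the same molecule, three error-free, two with one error each
passes = ["ACGTAC", "ACGTTC", "ACCTAC", "ACGTAC", "ACGTAC"]
print(majority_consensus(passes))  # ACGTAC
```

Each individual error appears in only one pass, so the vote recovers the underlying sequence. Insertions and deletions, which are common in single-molecule reads, would require a proper alignment step before any vote.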

8.2.6 Oxford Nanopore Sequencer

Variants: MinION, PromethION, SmidgION and VolTRAX.

The nanopore sequencer from Oxford uses state-of-the-art strand sequencing, which can sequence an entire DNA fragment by detecting the change in electric current as the fragment passes through minute nanopores made of protein. The MinION Mk1B is a compact sequencer that can be coupled with any sort of computer for immediate data analysis. The PromethION sequencer offers 144,000 channels for sequencing (48 flow cells of 3000 nanopore channels each). The SmidgION variant can be operated from a smartphone for instant analysis. The VolTRAX variant can be controlled over a Universal Serial Bus (USB) connection once the sample is loaded. The comparatively large read size removes the need for shotgun sequencing, thus bringing a revolution to the field (Wanunu 2012).

8.3 Elucidation of Metagenomic Data

The analysis of metagenome data is designed specifically to process a mixture of genomes and contigs of different sizes. The elucidation of metagenome data involves the following stages.

8.3.1 Processing of Poor-Quality Reads

Poor-quality reads are initially processed by utilities that support the sequencer variant used. One such utility is the FASTX-Toolkit (a command-line utility for pre-processing FASTA/FASTQ data). FastQC (a quality-check utility for raw HTS data) is used for the same purpose and also gives summary statistics for the FASTQ data. Tools such as Galaxy (a multivariate genome analysis platform), SolexaQA (for viewing a graphical representation of sequence quality) and Lucy 2 (a command-line sequence cleaner and visualiser) are also used to process FASTQ data. These tools rely on Q quality or Phred scores (a measure of sequencing quality), whose threshold depends on the sequencer variant used.
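A Phred score Q relates to the estimated probability P that a base call is wrong through Q = -10 log10(P), so Q20 corresponds to a 1% error probability and Q30 to 0.1%. The following minimal sketch decodes a FASTQ quality string and applies a mean-quality filter. It assumes the common Phred+33 ASCII encoding; the function names and the cutoff of 20 are illustrative choices, not the behaviour of any tool named above.

```python
from typing import List

PHRED_OFFSET = 33  # assumption: Phred+33 (Sanger / Illumina 1.8+) encoding


def phred_to_error_prob(q: float) -> float:
    """Convert a Phred score to an error probability: P = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def decode_qualities(quality_line: str, offset: int = PHRED_OFFSET) -> List[int]:
    """Decode a FASTQ quality string into per-base Phred scores."""
    return [ord(ch) - offset for ch in quality_line]


def mean_quality(quality_line: str, offset: int = PHRED_OFFSET) -> float:
    """Arithmetic mean of the per-base Phred scores of one read."""
    scores = decode_qualities(quality_line, offset)
    return sum(scores) / len(scores)


def passes_filter(quality_line: str, min_mean_q: float = 20.0) -> bool:
    """Keep a read only if its mean Phred score reaches the cutoff."""
    return mean_quality(quality_line) >= min_mean_q


print(decode_qualities("I5!"))    # [40, 20, 0]
print(passes_filter("IIII"))      # True  (all bases Q40)
print(passes_filter("!!!!"))      # False (all bases Q0)
```

Averaging Phred scores is a simplification: because the scale is logarithmic, a few very poor bases affect the true expected error rate more than the mean score suggests, which is why many trimming tools work on sliding windows or on error probabilities instead.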

8.3.2 Masking of Low-Complexity Reads

Masking of low-complexity reads is carried out with utilities such as DUST. After this, reads/sequences that share more than 95% identity are eliminated. Some tools, such as MG-RAST (MetaGenomic Rapid Annotation using Subsystems Technology), allow the user to eliminate reads that closely match the genomes of model organisms such as human, fly, cow and mouse. This step is carried out with Bowtie 2 (a fast and efficient tool for aligning sequencing reads against a reference sequence).
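Low-complexity filters of this kind typically score how often short words repeat within a read. The sketch below implements a simplified, DUST-style score based on triplet counts. It is an illustration of the idea rather than a reimplementation of the DUST utility, and the function names and the threshold of 2.0 are hypothetical values chosen for the example.

```python
from collections import Counter


def dust_like_score(seq: str, k: int = 3) -> float:
    """Score repetitiveness: sum of c*(c-1)/2 over k-mer counts c,
    normalised by (number of k-mers - 1). Higher means lower complexity."""
    seq = seq.upper()
    n_kmers = len(seq) - k + 1
    if n_kmers < 2:
        return 0.0
    counts = Counter(seq[i:i + k] for i in range(n_kmers))
    repeats = sum(c * (c - 1) / 2 for c in counts.values())
    return repeats / (n_kmers - 1)


def is_low_complexity(seq: str, threshold: float = 2.0) -> bool:
    """Flag a read whose score exceeds an (illustrative) threshold."""
    return dust_like_score(seq) > threshold


print(dust_like_score("AAAAAAAAAA"))  # 4.0 (a homopolymer, one triplet seen 8 times)
print(dust_like_score("ACGTTGCA"))    # 0.0 (every triplet is unique)
```

A read made of a single repeated base scores high because one triplet accounts for every position, whereas a read in which no triplet recurs scores zero. In practice the threshold would be tuned to the read length and the data set.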

8.3.3 Gene Identification

In this step, “gene calling” allows the user to recognise the genes present in reads/contigs. These include coding DNA sequences (CDS) and non-coding RNA genes; some tools also allow the user to recognise CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) arrays. MetaGene, FragGeneScan, Prodigal, MetaGeneMark and Orphelia help to recognise CDS genes by ab initio gene prediction. These utilities use codon information to classify regions of reads/contigs as coding or non-coding, and they can be trained on user-supplied datasets. FragGeneScan is used for recognising prokaryotic genes and is implemented by IMG/M (Integrated Microbial Genomes with Microbiome samples; a platform for comparative metagenome data analysis), EBI (European Bioinformatics Institute) Metagenomics and MG-RAST (Metagenomics-Rapid Annotation using Subsystems Technology; a platform for evolutionary and functional analysis of metagenomes). The prediction accuracy of FragGeneScan ranges from 65 to 70 percent. Non-coding RNA genes are predicted by dedicated tools: tRNAscan predicts tRNAs, whereas rRNA genes are predicted with custom rRNA models in IMG (Integrated Microbial Genomes)/MER (Microbiomes Expert Review), while MG-RAST implements a homology-based search against SILVA (a specialised database of aligned ribosomal RNA sequences of Eukarya, Archaea and Bacteria) and RDP (Ribosomal Database Project). PILER-CR and CRT (CRISPR Recognition Tool) are used to predict CRISPR arrays.
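The simplest form of gene calling is a scan for open reading frames (ORFs), that is, stretches that begin with a start codon and run in frame to a stop codon. The sketch below performs such a scan over all six reading frames. It is a naive illustration only: the gene callers named above use trained codon-usage and statistical models and cope with fragmentary, error-prone reads, which this code does not attempt. All names and the minimum length are hypothetical.

```python
from typing import List, Tuple

START = "ATG"
STOPS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(seq: str, min_codons: int = 3) -> List[Tuple[str, int, int, int]]:
    """Return (strand, frame, start, end) for each ATG...stop ORF.
    Coordinates are 0-based, end-exclusive, on the scanned strand;
    the length in codons includes the stop codon."""
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == START:
                    start = i                      # open a candidate ORF
                elif start is not None and codon in STOPS:
                    if (i + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, frame, start, i + 3))
                    start = None                   # close it at the stop codon
    return orfs


contig = "CCATGAAATTTTAGCC"
print(find_orfs(contig))  # [('+', 2, 2, 14)]
print(contig[2:14])       # ATGAAATTTTAG
```

On metagenomic reads many genes are truncated at one or both ends, so a scan that insists on both a start and a stop codon misses them; handling such fragments is one reason model-based predictors are preferred.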

8.3.4 Gene Annotation

The next step of metagenomic data elucidation is the assignment of functions to the genes. This is accomplished by similarity search, in which the query sequence is compared with database sequences carrying annotated gene information. The basic steps involved in metagenome annotation are shown in Fig. 8.2. The large size of metagenomic data sets means that this process has to be automated and is computationally expensive. The BLAST (Basic Local Alignment Search Tool) utility is run on high-end computing servers, and multithreading is used to split a job across numerous CPU (Central Processing Unit) or GPU (Graphics Processing Unit) cores in order to obtain results in a short time span. Metagenomic data are annotated with the help of various databases such as KEGG (Kyoto Encyclopedia of Genes and Genomes), eggNOG (evolutionary genealogy of genes: Non-supervised Orthologous Groups), COG (Clusters of Orthologous Groups)/KOG (EuKaryotic Orthologous Groups) and protein databases such as PFAM (Protein FAMily), InterPro and TIGRFAM (The Institute of Genomic Research’s database of protein Families). IMG/MER uses Hidden Markov Model (HMM) profiles to link the query genes with PFAM, after which ortholog clustering is carried out with the help of COG. The PSSM (Position-Specific Scoring Matrix) dataset is retrieved from the NCBI (National Center for Biotechnology Information) for the functional assignment of proteins. Genes are also identified with the help of KEGG and EC (Enzyme Commission) numbers, and evolutionary analysis of the metagenome data is carried out by homology search.

Fig. 8.2 General work-flow of the HTS metagenome annotation
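Splitting a similarity search across workers is straightforward because each query can be scored against the database independently of the others. The sketch below shows the pattern with a thread pool and a toy k-mer (Jaccard) similarity standing in for a real aligner. Everything in it is hypothetical: the sequences, the annotation labels and the scoring function are invented for illustration, and this is not how BLAST computes alignments.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, Set


def kmer_set(seq: str, k: int = 3) -> Set[str]:
    """All overlapping k-mers of a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a: str, b: str, k: int = 3) -> float:
    """Toy similarity: shared k-mers divided by total distinct k-mers."""
    ka, kb = kmer_set(a, k), kmer_set(b, k)
    if not ka or not kb:
        return 0.0
    return len(ka & kb) / len(ka | kb)


def best_hit(query: str, database: Dict[str, str]) -> str:
    """Return the annotation label of the most similar database entry."""
    return max(database.items(), key=lambda item: jaccard(query, item[1]))[0]


def annotate_all(queries: Dict[str, str], database: Dict[str, str],
                 workers: int = 4) -> Dict[str, str]:
    """Score every query against the database, one task per query."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        hits = list(pool.map(lambda q: best_hit(q, database), queries.values()))
    return dict(zip(queries.keys(), hits))


database = {"kinase": "ATGGCGTACGTTAGC", "transporter": "ATGCCCGGGAAATTT"}
queries = {"read1": "GCGTACGTTA", "read2": "CCCGGGAAAT"}
print(annotate_all(queries, database))
# {'read1': 'kinase', 'read2': 'transporter'}
```

In CPython, threads do not speed up pure-Python computation because of the global interpreter lock, so this sketch shows the division of work rather than a real speed-up. Production pipelines obtain parallelism from natively multithreaded tools or by distributing batches of queries across processes or cluster nodes.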
IMG/MER contains a huge amount of genomic data, which it uses to retrieve extra annotation information. The first step in its workflow is the prediction of genes from the metagenome, after which it uses other resources to annotate those genes further. This leads to the assignment of PFAM families, which MG-RAST does not determine, and gives comprehensive annotation alongside COG, the only protein family identification resource used by MG-RAST. One bottleneck for IMG/MER is the rapid increase in gene counts, which is not the case with MG-RAST. However, because IMG/MER subjects the query metagenome to PFAM analysis, it provides more extensive annotation and reporting of the metagenome.
MG-RAST first predicts the genes present in the metagenome and then searches for homologs of those predicted genes in isolate genomes. The search is carried out by a utility called BLAT (BLAST-Like Alignment Tool). It considers only those homologs whose identity is more than 70% and therefore omits a considerable number of hits. The best homologs from the isolate genomes, rather than the metagenome genes themselves, are then subjected to annotation. This is a drawback, because the annotation is carried out on substitute genes from isolate genomes while the metagenome itself is ignored. Nevertheless, the advantage of this method is the shorter time span needed for complete annotation. In addition, its database does not grow, whereas that of IMG/MER does.
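The effect of an identity cutoff can be made concrete with a small example. The sketch below computes percent identity for a pair of already aligned sequences and keeps only the hits above 70%. It is a minimal illustration under stated assumptions: the alignments are taken as given (gaps written as "-"), identity is counted over aligned columns, and the function names and hit list are hypothetical rather than taken from BLAT or MG-RAST.

```python
from typing import List, Tuple


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns with identical residues.
    Columns where both sequences have a gap are ignored."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def filter_hits(hits: List[Tuple[str, float]],
                min_identity: float = 70.0) -> List[Tuple[str, float]]:
    """Keep hits whose identity is strictly greater than the cutoff."""
    return [hit for hit in hits if hit[1] > min_identity]


print(percent_identity("ACGTACGTAC", "ACGTTCGTAC"))  # 90.0
print(percent_identity("ACGT-CGT", "ACGTACGA"))      # 75.0

hits = [("homolog_a", 90.0), ("homolog_b", 70.0), ("homolog_c", 65.0)]
print(filter_hits(hits))  # [('homolog_a', 90.0)]
```

A hit at exactly 70% is discarded here because the text specifies identity of more than 70%. Genes from novel lineages often have only distant homologs, so a fixed cutoff of this kind is one reason considerable numbers of hits are omitted.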

8.4 Pharmaceutical Products from Metagenomes

To obtain pharmaceutically significant metabolites, it is of great importance to consider functionality first. The objective of the research is to obtain a pharmaceutically validated active metabolite with the desired functionality. Advances in HTS technologies have shifted the interest of researchers towards sequence-based analysis of clone libraries. Shotgun metagenome sequence analysis has resulted not only in the prompt classification of biosynthetic gene clusters but also in the prediction of the corresponding biochemical assembly. Standalone bioinformatics and HTS approaches can predict only a limited number of gene clusters, but improved methods have allowed researchers to identify novel pharmaceutically significant active metabolites.
Nowadays, instead of functionally annotating metagenome gene clusters, research has shifted to targeted screening, which takes into account the background of the metagenome under examination. In the following part, we highlight various pharmaceutically active metabolites obtained through metagenome examination. Irrespective of the pipeline followed for functional and structural characterisation, the metagenome is a reservoir of metabolite-synthesizing genes.

8.4.1 ET-743 (Yondelis)

In 1969, examination of the sea squirt Ecteinascidia turbinata led to the identification of its anti-cancer properties. The structure of ecteinascidin (ET-743) was elucidated in 1984, and it is now a pharmaceutically validated anti-cancer metabolite. Attempts to grow the sea squirt to meet pharmaceutical demand were not very successful, so extensive synthetic approaches were adopted instead. The homology of ET-743 to the bacteria-derived metabolites saframycin A (Streptomyces lavendulae), safracin B (Pseudomonas fluorescens) and saframycin Mx1 (Myxococcus xanthus) led to the understanding that ET-743 is synthesized by symbiotic bacterial communities. Metagenome sequence analysis of the tunicate indicated that a non-ribosomal peptide synthetase pathway involving the expression of 25 genes is responsible. The extensive sequence annotation workflow identified Candidatus Endoecteinascidia frumentensis as the producer of the bioactive molecule. The whole genome of this organism was found to be approximately 631 kb in size. Determination of the pathway for the synthesis of the metabolite opened the gates for the pharmaceutical industry to synthesize the metabolite and its analogues at massive scale.

8.4.2 Bryostatins

In 1968, bryostatin was found in Bugula neritina and came into the limelight because of its toxicity to cancerous cells, explicitly targeting protein kinase C. The activity of bryostatin has been assessed in more than 80 clinical trials, and the compound has also been studied for the treatment of Alzheimer's disease. Initially, the existence of numerous forms of the compound suggested that bryostatin was produced through a symbiotic relationship. Later, a cosmid library of B. neritina was constructed and numerous corresponding clones were sequenced, which led to the identification of the 65 kb bry gene cluster. Hybridisation studies were then carried out on two strains of the bacterial symbiont E. sertula from different hosts. In one strain the genes were found to be adjoining, while in the other the gene cluster was separated from the auxiliary genes. As E. sertula is not culturable, the bry gene cluster can be expressed in various host organisms to meet the needs of the pharmaceutical industry.

8.4.3 Psymberin

Psymberin is a polyketide with cytotoxic activity against tumour cells. It was obtained from numerous sea sponges. Because of its intricate structure, the elucidation of the structure and bioactivity of the compound took 11 years and more than 600 samples. The biosynthesis of the compound was determined in the metagenome of Psammocinia aff. bulbosa. Samples of Psammocinia aff. bulbosa were collected by scuba diving at Milne Bay, New Guinea. Protein sequence-based alignments of psymberin and other related groups were then generated, and amplicons were produced using a primer-based amplification approach. Total sponge DNA was isolated and two libraries of Psammocinia aff. bulbosa (320,000 and 900,000 clones) were generated, followed by PCR-based screening with the psymEAD-Yyspez2-forward and psymEAD-Yyspez1-reverse primers to obtain the polyketide synthase (PKS) gene clusters (Haas 2009). Genomic composition analysis suggests that the compound is of bacterial origin.

8.4.4 Onnamides

Mycalamide and pederin target tumour cells, inhibiting replication and translation even at concentrations as low as 1 ng/ml. Studies suggest that the application of these compounds increases the life span of cancer-induced mice. Metagenome analysis has shown that these compounds are derived from a non-culturable Pseudomonas associated with Paederus fuscipes. Homology studies of pederin led to the identification of structurally and functionally similar compounds in Lithistida. Using the pederin polyketide synthase as an amplification template against the Theonella swinhoei metagenome, the biosynthesis pathway for onnamide was subsequently determined. Further analysis of the T. swinhoei metagenome indicated that these polyketide synthases can be obtained only from sponges that already contain pederin homologs. Finally, it was established that onnamides are derived from non-culturable Candidatus Entotheonella spp.

8.4.5 Patellazoles

Patellazoles were extracted from a tunicate during the 1980s and have gained importance because of their pharmacological significance. The metabolite showed potent antifungal activity and cytotoxicity towards human cell lines. The organisms chiefly associated with this metabolite were L. patella and C. albicans. The presence of acetate units and a thiazole ring in the patellazole structure led to the assumption that it is synthesized through polyketide synthase and non-ribosomal peptide synthetase pathways, respectively. The assumption was tested by sequence-based metagenome analysis of the tunic and cloaca niche, which gave negative results. PCR (Polymerase Chain Reaction) analysis revealed that patellazole synthesis is mediated by the trans-acyltransferase family and originates from the miniature zooids. Shotgun sequence analysis of isolated zooid DNA revealed an 86 kb genomic region corresponding to trans-acyltransferase polyketide synthase pathways. This region was attributed to Candidatus Endolissoclinum faulkneri.

8.4.6 Calyculin A

Calyculin A was extracted from the sea sponge Discodermia calyx in 1986 and is highly cytotoxic. Its biosynthesis is mediated by a combination of non-ribosomal peptide and polyketide pathways. Homologs of calyculin were also identified, which suggests that its production depends on a symbiotic interaction. The biosynthetic gene cluster for calyculin was determined with the aid of metagenome analysis: examination of the D. calyx metagenome revealed a gene cluster of 150 kb. Using this gene cluster as a template in molecular biology analyses, the calyculin synthesis pathway was found to reside in filamentous bacteria. Further, 16S rRNA sequence analysis of these bacteria showed 97% sequence identity with Candidatus Entotheonella factor obtained from the T. swinhoei sponge.

8.4.7 Polytheonamides

This group of compounds is cytotoxic and is extracted from T. swinhoei sponges. The compound, a peptide of 48 amino acids, was thought to be synthesized by a non-ribosomal peptide synthetase. However, PCR analysis of the T. swinhoei metagenome revealed that the compound is derived from the ribosome and is a product of the symbiotic relationship with bacteria. Subsequent analysis revealed that the producer of polytheonamide was a non-culturable Entotheonella sp. This species is responsible for the production of both polytheonamide and onnamide metabolites.

8.5 Conclusion

Metagenomic examination of samples from different environmental conditions and niches has enabled researchers to screen a large number of pharmaceutically relevant active metabolites. Metagenome examination with the help of HTS approaches has enabled researchers not only to retrieve metagenome data but also to annotate them effectively. We have discussed various HTS platforms that can be used for metagenome examination, along with their pros and cons. Strategies for metagenome analysis have also been discussed, along with some useful resources. The main objective of this chapter was to draw the attention of researchers to the usefulness of HTS in metagenome examination, so that this exponentially growing field not only receives appreciation but also encourages wet-lab researchers to design their workflows in a manner that combines the distinctive strengths of wet-lab and dry-lab analysis to serve human society at its best.

References
Aguiar-Pulido V, Huang W, Suarez-Ulloa V et al (2016) Metagenomics, metatranscriptomics, and metabolomics approaches for microbiome analysis. Evol Bioinforma 12:5–16
Ardui S, Ameur A, Vermeesch JR, Hestand MS (2018) Single molecule real-time (SMRT) sequencing comes of age: applications and utilities for medical diagnostics. Nucleic Acids Res 46:2159–2168
Ari Ş, Arikan M (2016) Next-generation sequencing: advantages, disadvantages, and future. In: Hakeem KR, Tombuloğlu H, Tombuloğlu G (eds) Plant omics: trends and applications. Springer, Berlin, pp 109–135
Arnold JW, Roach J, Azcarate-Peril MA (2016) Emerging technologies for gut microbiome research. Trends Microbiol 24:887–901
Buermans HPJ, den Dunnen JT (2014) Next generation sequencing technology: advances and applications. Biochim Biophys Acta Mol Basis Dis 1842:1932–1941
Goodwin S, McPherson JD, McCombie WR (2016) Coming of age: ten years of next-generation sequencing technologies. Nat Rev Genet 17:333–351. https://doi.org/10.1038/nrg.2016.49
Haas MJ (2009) Polyketide pas de deux. Sci Exch 2:898–898. https://doi.org/10.1038/scibx.2009.898
Handelsman J (2004) Metagenomics: application of genomics to uncultured microorganisms. Microbiol Mol Biol Rev 68:669–685. https://doi.org/10.1128/MMBR.68.4.669-685.2004
Harrington CT, Lin EI, Olson MT, Eshleman JR (2013) Fundamentals of pyrosequencing. Arch Pathol Lab Med 137:1296–1303. https://doi.org/10.5858/arpa.2012-0463-RA
Henson J, Tischler G, Ning Z (2012) Next-generation sequencing and large genome assemblies. Pharmacogenomics 13:901–915
Hug JJ, Bader CD, Remškar M et al (2018) Concepts and methods to access novel antibiotics from actinomycetes. Antibiotics 7:44
Imhoff J (2016) New dimensions in microbial ecology—functional genes in studies to unravel the biodiversity and role of functional microbial groups in the environment. Microorganisms 4:19. https://doi.org/10.3390/microorganisms4020019
Lahens NF, Ricciotti E, Smirnova O et al (2017) A comparison of Illumina and Ion Torrent sequencing platforms in the context of differential gene expression. BMC Genomics 18:602. https://doi.org/10.1186/s12864-017-4011-0
Levy SE, Myers RM (2016) Advancements in next-generation sequencing. Annu Rev Genomics Hum Genet 17:95–115. https://doi.org/10.1146/annurev-genom-083115-022413
Liu L, Li Y, Li S et al (2012) Comparison of next-generation sequencing systems. J Biomed Biotechnol 2012:1–11. https://doi.org/10.1155/2012/251364
Mehrotra M, Duose DY, Singh RR et al (2017) Versatile ion S5XL sequencer for targeted next generation sequencing of solid tumors in a clinical laboratory. PLoS One 12:e0181968. https://doi.org/10.1371/journal.pone.0181968
Neelakanta G, Sultana H (2013) The use of metagenomic approaches to analyze changes in microbial communities. Microbiol Insights 6:MBI.S10819. https://doi.org/10.4137/mbi.s10819
Österlund T, Jonsson V, Kristiansson E (2017) HirBin: high-resolution identification of differentially abundant functions in metagenomes. BMC Genomics 18:1–11. https://doi.org/10.1186/s12864-017-3686-6
Osunmakinde CO, Selvarajan R, Sibanda T et al (2018) Overview of trends in the application of metagenomic techniques in the analysis of human enteric viral diversity in Africa’s environmental regimes. Viruses 10:429
Pareek CS, Smoczynski R, Tretyn A (2011) Sequencing technologies and genome sequencing. J Appl Genet 52:413–435
Petrosino JF, Highlander S, Luna RA et al (2009) Metagenomic pyrosequencing and microbial identification. Clin Chem 55:856–866
Quail MA, Kozarewa I, Smith F et al (2008) A large genome center’s improvements to the Illumina sequencing system. Nat Methods 5:1005–1010. https://doi.org/10.1038/nmeth.1270
Reuter JA, Spacek DV, Snyder MP (2015) High-throughput sequencing technologies. Mol Cell 58:586–597
Schirmer M, D’Amore R, Ijaz UZ et al (2016) Illumina error profiles: resolving fine-scale variation in metagenomic sequencing data. BMC Bioinf 17:125. https://doi.org/10.1186/s12859-016-0976-y
Seshadri R, Leahy SC, Attwood GT et al (2018) Cultivation and sequencing of rumen microbiome members from the Hungate1000 Collection. Nat Biotechnol 36:359–367. https://doi.org/10.1038/nbt.4110
Stein JL, Marsh TL, Wu KY et al (1996) Characterization of uncultivated prokaryotes: isolation and analysis of a 40-kilobase-pair genome fragment from a planktonic marine archaeon. J Bacteriol 178:591–599. https://doi.org/10.1128/jb.178.3.591-599.1996
Stewart EJ (2012) Growing unculturable bacteria. J Bacteriol 194:4151–4160
Stewart RD, Auffret MD, Warr A et al (2018) Assembly of 913 microbial genomes from metagenomic sequencing of the cow rumen. Nat Commun 9:870. https://doi.org/10.1038/s41467-018-03317-6
Wanunu M (2012) Nanopores: a journey towards DNA sequencing. Phys Life Rev 9:125–158
Zhou J, He Z, Yang Y et al (2015) High-throughput metagenomic technologies for complex microbial community analysis: open and closed formats. MBio 6:e02288-14. https://doi.org/10.1128/mBio.02288-14
