
Research: Metagenomic Frameworks For Monitoring Antibiotic Resistance in Aquatic Environments

The document discusses using metagenomic analysis to develop an antibiotic resistance determinant (ARD) index to quantify antibiotic resistance potential in environmental samples. It analyzed published metagenomic datasets from various aquatic environments and observed differences in ARD levels between ecosystems with high and low human impact. The selection of sequence similarity thresholds influenced the index measurements. Unique index patterns distinguished the different metagenomes, showing potential for environmental health monitoring and surveillance.


Research

All EHP content is accessible to individuals with disabilities. A fully accessible (Section 508–compliant) HTML version of this article is available at http://dx.doi.org/10.1289/ehp.1307009.

Metagenomic Frameworks for Monitoring Antibiotic Resistance in Aquatic Environments
Jesse A. Port,1,2 Alison C. Cullen,3 James C. Wallace,1,2 Marissa N. Smith,1,2 and Elaine M. Faustman1,2
1Department of Environmental and Occupational Health Sciences, and 2The Institute for Risk Analysis and Risk Communication, School
of Public Health, University of Washington, Seattle, Washington, USA; 3Evans School of Public Affairs, University of Washington,
Seattle, Washington, USA

Background: High-throughput genomic technologies offer new approaches for environmental health monitoring, including metagenomic surveillance of antibiotic resistance determinants (ARDs). Although natural environments serve as reservoirs for antibiotic resistance genes that can be transferred to pathogenic and human commensal bacteria, monitoring of these determinants has been infrequent and incomplete. Furthermore, surveillance efforts have not been integrated into public health decision making.

Objectives: We used a metagenomic epidemiology–based approach to develop an ARD index that quantifies antibiotic resistance potential, and we analyzed this index for common modal patterns across environmental samples. We also explored how metagenomic data such as this index could be conceptually framed within an early risk management context.

Methods: We analyzed 25 published data sets from shotgun pyrosequencing projects. The samples consisted of microbial community DNA collected from marine and freshwater environments across a gradient of human impact. We used principal component analysis to identify index patterns across samples.

Results: We observed significant differences in the overall index and index subcategory levels when comparing ecosystems more proximal versus distal to human impact. The selection of different sequence similarity thresholds strongly influenced the index measurements. Unique index subcategory modes distinguished the different metagenomes.

Conclusions: Broad-scale screening of ARD potential using this index revealed utility for framing environmental health monitoring and surveillance. This approach holds promise as a screening tool for establishing baseline ARD levels that can be used to inform and prioritize decision making regarding management of ARD sources and human exposure routes.

Citation: Port JA, Cullen AC, Wallace JC, Smith MN, Faustman EM. 2014. Metagenomic frameworks for monitoring antibiotic resistance in aquatic environments. Environ Health Perspect 122:222–228; http://dx.doi.org/10.1289/ehp.1307009

Introduction

Advances in genomic technologies now offer novel approaches for environmental health monitoring and risk assessment. High-throughput sequencing of whole microbial communities provides global snapshots of community and functional composition, as opposed to more conventional analyses that are species and gene specific (Hugenholtz and Tyson 2008). Because these new techniques rely on culture-independent approaches, they are able to access genomic information from the vast majority of bacteria that are not culturable (Amann et al. 1995). These technologies are also less labor and laboratory intensive and can generate massive volumes of genomic data in less than a day (Glenn 2011). Shotgun metagenomics, or the direct extraction, sequencing and analysis of DNA from a community of microorganisms (Handelsman 2004), is one high-throughput approach that in tandem with next generation sequencing has potential utility for environmental public health surveillance.

Although the environmental health applications of metagenomics remain to be fully elucidated, this approach has been used to track fecal contamination in watersheds via community composition profiling (Wu et al. 2010), detect pathogens in wastewater (Ye and Zhang 2011), and identify indicators of sewage contamination (Bibby and Peccia 2013; McLellan et al. 2010). Although these techniques are thus promising, interpretation of the massive amounts of data produced poses a series of challenges for public health decision makers. Determining the significance of a given genomic signal in the context of risk, defining the levels of genomic response needed to drive a decision, and identifying the cost-benefit balance of using these methods versus more traditional approaches will be necessary to translate metagenomic data into public health decision making. Here we present a first step toward developing a decision-monitoring tool using the case study of antibiotic resistance in marine and freshwater environments.

Antibiotic resistance is a global phenomenon and is a growing source of morbidity and mortality (Bush et al. 2011). Resistance occurs when bacteria evolve under selective pressure to confer resistance to antibiotics used to treat their infection. Although the majority of antibiotic resistance investigations have been focused on pathogenic bacteria in clinical settings, antibiotic resistance and antibiotic resistance determinants (ARDs) have been shown to be widespread in environmental bacteria (Wright 2010); furthermore, many resistance genes found in pathogenic bacteria have evolved or are sourced from environmental microbial communities (Martinez 2009). ARDs refer here to the genomic factors related to the presence and dissemination of antibiotic resistance genes (ARGs), including mobile genetic elements (MGEs) such as plasmids, transposable elements (TEs), and phages as well as metal resistance genes (MRGs), which have been shown to co-select for ARGs (Wright 2007). The antibiotic resistomes of natural environments including soil, marine, freshwater, and wastewater ecosystems have revealed an abundance of ARDs (Allen et al. 2010; Davies and Davies 2010; Zhang et al. 2009). In many cases, these genes have been shown to be functionally resistant to selected antibiotics (Schmieder and Edwards 2012). The presence of resistance genes in the environment may be due to selective pressures favoring these genes, including antibiotic overuse and misuse in clinical treatment and agricultural and aquaculture applications as well as metal pollution. ARDs are ultimately disseminated into watersheds and coastal systems via sewage, animal waste, and urban/agricultural runoff, and thus form environmental reservoirs of ARDs (Davies and Davies 2010). Humans can be exposed through food, including crops, livestock, and seafood; consumption of contaminated drinking water; recreational activities such as swimming; or direct contact with organisms carrying antibiotic resistant bacteria (Wellington et al. 2013).

Address correspondence to E.M. Faustman, Department of Environmental and Occupational Health Sciences, School of Public Health, University of Washington, 4225 Roosevelt Way NE #100, Seattle, Washington 98105-6099 USA. Telephone: (206) 685-2269. E-mail: [email protected]
Supplemental Material is available online (http://dx.doi.org/10.1289/ehp.1307009).
This work was supported by the University of Washington Pacific Northwest Center for Human Health and Ocean Studies and funded by the National Institutes of Health, National Institute of Environmental Health Sciences (grant P50 ES012762); the National Science Foundation (grant OCE-0434087); and the National Oceanic and Atmospheric Administration (grant UCAR S08-67883).
The authors declare they have no actual or potential competing financial interests.
Received: 25 April 2013; Accepted: 10 December 2013; Advance Publication: 13 December 2013; Final Publication: 1 March 2014.

222 volume 122 | number 3 | March 2014  •  Environmental Health Perspectives


Metagenomics and antibiotic resistance monitoring

Monitoring for antibiotic resistance in the marine environment has been infrequent and incomplete (Allen et al. 2010) and has predominantly focused on measuring levels of antibiotics in different water matrices (Segura et al. 2009). Furthermore, environmental monitoring of antibiotic resistance has not been formalized into public health surveillance or water quality management decision frameworks, likely because of a continuing lack of data and uncertainty regarding risk and risk metrics. Instead, global surveillance efforts such as the European Antimicrobial Resistance Surveillance Network (European Centre for Disease Prevention and Control 2014) and the National Antimicrobial Resistance Monitoring System for Enteric Bacteria (Centers for Disease Control and Prevention 2014) have predominantly focused on the prevalence of antibiotic usage and antibiotic resistance isolates in clinical and public health laboratory settings (Grundmann et al. 2011). Given the global magnitude of antibiotic resistance, including the emergence of multi-drug resistance bacterial strains and increasing reports of occurrence in the environment, there is a critical need for the identification, characterization, and control of these generally uncharacterized environmental ARD reservoirs (Bush et al. 2011).

The objectives of the present study were three-fold. First, a metagenomic epidemiology-based approach was used to develop an index that quantifies the resistance potential of an environment. Metagenomic epidemiology is a multi-layered approach that considers the entire microbiotic context for environmental antibiotic resistance by characterizing simultaneously the different levels of microbiome complexity that drive antibiotic resistance including ARGs, genetic vectors, and the species in which these genes occur (Baquero 2012). Second, the index was analyzed for common modal patterns (i.e., principal components) across a diverse set of marine and freshwater ecosystems. Third, we sought to integrate the index into a public health surveillance framework in order to provide an example by which high-throughput metagenomic data can be applied to regulation or management.

Methods

Data sources. Sequence reads for the 25 metagenomic samples included in this analysis are publicly available and were downloaded from the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA; http://www.ncbi.nlm.nih.gov/sra). These 25 samples were divided into seven ecosystems: estuary, coastal ocean, freshwater lake, marina, river sediment, wastewater treatment plant (WWTP) effluent, and WWTP sludge (Table 1). The estuary data set includes surface water samples taken offshore in the northern (samples P1, P26) and central (P5, P28, P32) basins of Puget Sound, State of Washington (Port et al. 2012). Sampling site P26 was specifically located adjacent to the northern basin in the Strait of Juan De Fuca, State of Washington. The marina sample is from the central basin of Puget Sound and was taken near shore inside an urban marina and close to a source of freshwater input (Port et al. 2012). The coastal ocean samples were collected as part of an annual California Cooperative Oceanic Fisheries Investigations cruise in the Southern California Bight (Allen et al. 2012). Samples at seven stations were taken along hydrographic and nutrient gradients in near (samples GS257, GS263, GS264) and offshore (GS258, GS259, GS260, GS262) upwelling regions within the California Current Ecosystem. The term marine refers here to the estuary, marina, and coastal ocean samples. The freshwater lake sample is from a reservoir encompassing 59 square miles near Atlanta, Georgia, that serves as a drinking water supply for the city and is used for recreational activities (Oh et al. 2011). The river sediment samples were taken at intervals downstream from a WWTP discharge site in Patancheru, Hyderabad, India, that processes water from approximately 90 drug manufacturers (Kristiansson et al. 2011). The wastewater effluent, taken from a WWTP that discharges into Puget Sound, has an average daily inflow of 133 million gallons and is sourced from storm water/groundwater (53%), residential (29%), commercial (17%), and industrial (1%) processes (Port et al. 2012). The WWTP from which the activated sludge sample was obtained discharges into a local waterway in Charlotte, North Carolina, and has a daily inflow of 7.5 million gallons from primarily domestic sources in addition to several industries, a university, and a hospital (Sanapareddy et al. 2009).

All samples analyzed in this study (except the river sediment sample) were filtered and size fractionated (0.1–3.0 μm) to target the microbial community. Genomic DNA was extracted and shotgun sequenced using pyrosequencing (Margulies et al. 2005). Pyrosequencing of total genomic DNA was performed using 454 GS-FLX or GS-FLX Titanium technologies (454 Life Sciences, Branford, CT). For data sets with multiple samples (e.g., estuary, coastal ocean, river sediment), samples were individually barcoded and sequenced in parallel. Summary sequencing statistics, including functional annotation, are provided in Table 1. Open reading frames (ORFs) were predicted with MetaGeneMark software (http://topaz.gatech.edu/metagenome/) (Zhu et al. 2010) and protein domains assigned using the Pfam 26.0 database (Punta et al. 2012).

ARD index. Metagenomic data relevant to environmental surveillance of ARDs was classified into three categories: gene transfer potential, ARG potential, and pathogenicity potential (Figure 1). A fourth category, source tracking, relates to identifying potential anthropogenic sources of ARDs through community composition profiling but has not yet been incorporated into the index analysis. The index categories were quantified via their respective subcategories as shown in Figure 1.

Bioinformatic analyses. The unassembled DNA sequence reads for each metagenome were run through a bioinformatic framework that quantified the ARD index (Figure 1). Reads were quality processed using the MG-Rast pipeline (Meyer et al. 2008) and then run through three separate analyses (one for each index category, excluding source tracking). Quality control parameters included the removal of reads that had a length > 2 SDs from the mean sample read length, > 5 ambiguous bases, < 5% of any one nucleotide, or 100% identity to another sequence over the first 50 bp.
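The four quality control criteria above can be illustrated with a short sketch. In the study this step was performed by the MG-Rast pipeline; the code below is not that pipeline's implementation. The function name `qc_filter` and its parameters are hypothetical, and three readings of the text are assumptions: "length > 2 SDs from the mean" is taken as an absolute deviation, "< 5% of any one nucleotide" as any of A, C, G, or T making up less than 5% of the read, and duplicate removal as keeping the first read seen for each 50-bp prefix.

```python
from statistics import mean, pstdev


def qc_filter(reads, max_sd=2.0, max_ambiguous=5, min_base_frac=0.05, prefix_len=50):
    """Return reads passing four QC filters (hypothetical helper, not MG-Rast code)."""
    if not reads:
        return []
    lengths = [len(r) for r in reads]
    mu, sd = mean(lengths), pstdev(lengths)
    seen_prefixes = set()
    kept = []
    for read in reads:
        seq = read.upper()
        if not seq:
            continue
        # 1) Length more than max_sd standard deviations from the sample mean.
        if abs(len(seq) - mu) > max_sd * sd:
            continue
        # 2) More than max_ambiguous bases that are not A, C, G, or T.
        if sum(1 for base in seq if base not in "ACGT") > max_ambiguous:
            continue
        # 3) Any single nucleotide below min_base_frac of the read (assumed reading).
        if any(seq.count(base) / len(seq) < min_base_frac for base in "ACGT"):
            continue
        # 4) Identical to an earlier kept read over the first prefix_len bases.
        prefix = seq[:prefix_len]
        if prefix in seen_prefixes:
            continue
        seen_prefixes.add(prefix)
        kept.append(read)
    return kept
```

The filters are applied in the order listed in the text, so a read rejected for length or composition never registers its prefix and cannot cause a later, otherwise valid read to be dropped as a duplicate.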
Table 1. Metagenomic samples included in this study with associated metadata and summary statistics.

| Characteristic | Estuary: Puget Sound, USA | Coastal ocean: California Bight, USA | Freshwater: Atlanta, GA, USA | Marina: Puget Sound, USA | River sediment: Patancheru, India | WWTP effluent: Seattle, WA, USA | WWTP sludge: Charlotte, NC, USA |
|---|---|---|---|---|---|---|---|
| No. of samples | 5 | 12 | 1 | 1 | 4 | 1 | 1 |
| Size fraction (μm) | 0.2–3 | 0.1–0.8, 0.8–3 | 0.22–1.6 | 0.2–3 | NA | 0.2–3 | NA |
| Depth (m) | 5 | 2 | 5 | 1 | 0^a | NA | NA |
| Megabase pairs | 413 | 1,940 | 502 | 91 | 91 | 48 | 95 |
| Mean read length (bp) | 368 | 551 | 395 | 379 | 365 | 381 | 250 |
| ORFs [(mean ± SD) %] | 81.8 ± 2.1 | 69.4 ± 9.95 | 86.2 | 87.3 | 74.9 ± 0.565 | 89.7 | 89.8 |
| Pfams [(mean ± SD) %] | 40.8 ± 1.5 | 32.8 ± 8.9 | 39.8 | 45.3 | 30.5 ± 1.1 | 39.8 | 35.6 |
| SRA accession no. | SRP015952 | SRP006681 | SRA023414 | SRP015952 | SRP002078 | SRX328700 | SRA001012 |
| Reference | Port et al. 2012 | Allen et al. 2012 | Oh et al. 2011 | Port et al. 2012 | Kristiansson et al. 2011 | Port et al. 2012 | Sanapareddy et al. 2009 |

Abbreviations: NA, not applicable; ORF, open reading frame; Pfam, protein family.
^a River sediment samples were collected at centimeters in depth along the shoreline of the river.



Port et al.

The abundance of each index subcategory was calculated using different sequence similarity thresholds in order to generate a distribution of values for each subcategory and to determine how these thresholds affect data interpretation (Table 2). The high threshold represents the most conservative annotation approach (with the least false-positives), followed by a gradual reduction in stringency, including medium-high, medium-low, and low thresholds. Unless stated otherwise, annotated reads (per subcategory) were normalized to the total number of sequence reads per sample.

Gene transfer potential subcategories included plasmids, TEs, and phages. Plasmids were annotated by BLASTN (http://blast.ncbi.nlm.nih.gov/Blast.cgi) searching (Altschul et al. 1990) the reads against plasmid sequences available in the NCBI RefSeq database (http://www.ncbi.nlm.nih.gov/refseq) (1,843 sequences) using the sequence similarity thresholds shown in Table 2. To identify TEs, 431,000 sequences annotated as TEs were downloaded from GenBank (http://www.ncbi.nlm.nih.gov/genbank/) and databased, and metagenomic reads were then searched against this database using the similarity thresholds. To annotate phages, reads were taxonomically assigned through the MG-Rast server using BLASTP (http://blast.ncbi.nlm.nih.gov/Blast.cgi), and reads matching to phage families or genera were retained for each similarity threshold. The total phage count for each metagenome was normalized to the total number of sequences assigned at the domain level.

ARG potential subcategories included ARGs and MRGs. ARGs were identified using the same approach as previously described (Port et al. 2012). Briefly, we compiled an ARG database (11,498 sequences) composed of a nonredundant and updated version of the Antibiotic Resistance Genes Database (http://ardb.cbcb.umd.edu/) (Liu and Pop 2009) in addition to ARGs from metagenomic samples that were functionally verified to confer resistance (Schmieder and Edwards 2012). Proteins were predicted from the ORFs generated from MetaGeneMark (Zhu et al. 2010) and then BLASTP searched against the ARG database (E-value < 10^-5) using the thresholds presented in Table 2 to determine the best match. Sequences with similarity to MRGs were identified by searching the SEED database subsystem “Resistance to antibiotics and toxic compounds” (Overbeek et al. 2005). This subsystem contains genes and gene clusters encoding resistance to arsenic, mercury, and cadmium.

Two approaches were used to identify pathogenic bacteria.
• Sequences were searched against the Ribosomal Database Project (Cole et al. 2009) at the similarity thresholds and species level; matches were then annotated as pathogens if present in the Microbial Rosetta Stone Database (Ecker et al. 2005). This database contains a list of bacterial pathogens known to pose a human health risk.
• Sequences were taxonomically annotated using the lowest common ancestor algorithm (LCA) within the MG-Rast server (Meyer et al. 2008), and reads matching to the species level at each similarity threshold were retained and run against the Microbial Rosetta Stone Database (Ecker et al. 2005).

[Figure 1: flowchart. Labeled elements: environmental sampling; next generation sequencing; unassembled DNA sequence reads; MG-Rast server (quality processing); MetaGeneMark gene prediction; BlastN, BlastN/BlastX, and BlastP searches; NCBI plasmid, nucleotide, and protein databases; SEED database; metagenomic protein database; antibiotic resistance gene database; Ribosomal Database Project (RDP); Microbial Rosetta Stone Database. Index categories and subcategories: gene transfer potential (plasmids, TEs, phages); ARG potential (ARGs, MRGs); pathogenicity potential (pathogenic bacteria, virulence factors); source tracking (community composition, commensal bacteria).]

Figure 1. Bioinformatic framework for quantifying the index of ARDs. The index categories are shown in the cream-colored boxes and the subcategories in red. The gray boxes (e.g., commensal bacteria and virulence factors) represent subcategories that have not yet been incorporated into the index but may still play an important role in determining ARD potential. NCBI, National Center for Biotechnology Information.

Table 2. Sequence similarity thresholds used to quantify the index subcategories.

| Index category | Subcategory | High | Medium-high | Medium-low | Low |
|---|---|---|---|---|---|
| Gene transfer potential | Plasmids | 95% ID; 400 bp | 95% ID; 300 bp | 95% ID; 200 bp | 95% ID; 100 bp |
| Gene transfer potential | TEs | 80% ID; 120 aa | 80% ID; 90 aa | 80% ID; 60 aa | 80% ID; 30 aa |
| Gene transfer potential | Phages | 50% ID; 150 aa | 50% ID; 100 aa | 50% ID; 75 aa | 50% ID; 50 aa |
| ARG potential | ARGs | 80% ID; 150 aa | 80% ID; 100 aa | 80% ID; 75 aa | 80% ID; 50 aa |
| ARG potential | MRGs | 50% ID; 150 aa | 50% ID; 100 aa | 50% ID; 75 aa | 50% ID; 50 aa |
| Pathogenicity potential | Pathogens | 95% ID; 400 bp or 150 aa | 95% ID; 300 bp or 100 aa | 95% ID; 200 bp or 75 aa | 95% ID; 100 bp or 50 aa |

Abbreviations: aa, amino acids; bp, base pairs; ID, identity.

Statistical analyses. For principal component analysis (PCA), the abundance counts for each index subcategory were normalized to the total number of sequences in the index for a given sample. PCA was performed on the normalized data using the JMP version 10.0 statistical package (SAS Institute Inc., Cary, NC). Eigen vectors and loading values were extracted for the first two principal components. Finer scale analysis of the genomic elements composing each index subcategory was run using GraphPad Prism, version 6.0 (GraphPad Software, San Diego, CA). Abundance counts per genomic element were normalized to the total number of sequences within the respective subcategory


and 95% confidence intervals were generated for each proportion.

Results

Antibiotic resistance potential. An ARD index was developed that consisted of three categories related to the molecular etiology of antibiotic resistance: a) gene transfer, b) ARG, and c) pathogenicity potential. To first compare the antibiotic resistance potential across the samples, index scores were calculated for each metagenome using four sequence similarity thresholds ranging from high to low stringency (Table 2). When different bioinformatic thresholds are applied, the index scores change and consequently reveal differences that can impact public health monitoring and decision making. Application of the highest threshold generated the lowest percentage of index-positive sequences (mean, 0.025%) for all samples except the river sediment (Figure 2A). As the similarity thresholds are reduced, this percentage increases to 0.033% (medium-high), 0.28% (medium-low), and 0.55% (low). Individual index subcategories were also differentially sensitive to increases in alignment length and, therefore, threshold selection (Figure 2B–2G).

As hypothesized, environments most proximal to human impact had the highest cumulative ARD index scores at all similarity thresholds (Figure 2). Only the activated sludge sample did not follow this trend, likely because the average read length of the sludge data set did not meet the alignment length criteria of the higher thresholds. The river sediment samples taken downstream from a WWTP processing high volumes of antibiotics, as well as the effluent sample, had higher proportions of index-positive sequences due to elevated ARGs, plasmids, and TEs relative to the other samples (Figure 2B–2G; see also Supplemental Material, Table S1). The most impacted environments also had the largest proportion of sequences meeting the high similarity threshold. In particular, sequences from the river sediment data sets had strong matches to known plasmids. The estuary samples, on average, had a slightly increased cumulative score relative to the coastal ocean samples, with higher levels of MRGs, and to a lesser extent TEs, than the other marine samples. Sample P26 (estuary) had an elevated index score relative to all other marine samples due to an increased phage count (Podoviridae). Pathogens were rare at the higher similarity thresholds yet still detected in the effluent, river sediment, coastal ocean, and marina samples (Figure 2G; see also Supplemental Material, Table S1).

Multivariate analysis of all samples revealed ARGs and plasmids to be the most strongly correlated index subcategories at all sequence similarity thresholds (r = 0.85–0.98, p < 0.0001) (see Supplemental Material, Table S2).

ARD index patterns. We used PCA to identify modalities (or principal component “patterns”) for the metagenomic data associated with each sample. PCA reduces our highly multidimensional data set by generating weighted (or loaded) linear combinations [i.e., principal components (PCs)] of the metagenomic categories (e.g., ARGs, MRGs). As a result, a small number of PCs explain as much of the variance in the data set as possible. We ran PCA at the index subcategory level using the medium-high sequence similarity threshold for this case. For the abundance of genomic elements composing the index subcategories refer to Supplemental Material, Table S2 and Figure S1. In our analysis of the full set of samples, the first two principal components, PC1 and PC2, explained 68% of the total variance in the data set.

PC1 was predominantly characterized by the presence (reflected by positive loadings) of ARGs, plasmids, and TEs and the relative absence (negative loadings) of phages, whereas PC2 reflected the presence of MRGs, TEs, and pathogens and the relative absence of ARGs, plasmids, and phages (Figure 3). There was a clear division between the coastal ocean and river sediment samples along PC1, whereas the estuary, freshwater lake, and WWTP effluent formed a mixed cluster with neutral scores along PC1. Despite the diversity of sample types and relatively small sample size, the marine locations were still largely distinguished from one another along PC2. The estuary samples had positive scores within PC2, whereas the coastal ocean samples were negative. The PC scores for the estuary samples are consistent with the presence of MRGs (arsenic and mercury resistance) and TEs (mainly Rhodobacteraceae sp.) and the relative absence of ARGs and plasmids, whereas the coastal ocean samples were characterized by phages (primarily Myoviridae and Podoviridae) and the relative absence of MRGs, TEs, and pathogens. The freshwater lake sample had a similar profile to the estuary, including the presence of MRGs (mainly arsenic resistance) and TEs (mainly Ralstonia, Rickettsia, and Synechococcus spp.). The PC results characterized the river sediment samples by the presence of ARGs (sulfonamide and aminoglycoside resistance genes) and plasmids (Edwardsiella tarda plasmid pEIB202, Escherichia coli pO26, and Pasteurella multocida plasmid pCCK38) and by the relative absence of MRGs (Figure 3; see also Supplemental Material, Table S2). The WWTP effluent was characterized by the presence of MRGs (arsenic and mercury resistance), pathogens (Acinetobacter calcoaceticus) and to lesser extent TEs, and the relative absence of phages. The effluent and activated sludge share positive scores in PC2. However, the sludge was nearly unweighted in PC1, whereas effluent was highly positive on that axis mainly because of an absence of phages.

PCA outliers included estuary sample P26 and three coastal ocean samples (GS259.1, GS260.8, and GS262.1). Site P26 experiences increased mixing of oceanic and Puget Sound waters relative to the other estuary locations, which may explain why it is grouped within the coastal ocean cluster. GS259.1 and GS262.1 are the only coastal ocean samples representing the 0.1-μm microbial community from oligotrophic waters, but there is no apparent relationship between distance offshore and the ARD index profile.

Discussion

We developed and tested an index for characterizing the ARD potential of marine and freshwater environments using shotgun metagenomics. Currently available metagenomic data sets allow for gene transfer potential, ARG potential, and pathogenicity potential to be included in the index, although future introduction of source tracking data will enrich the approach. The index comprises an ecological context for ARD potential by providing both the prevalence of ARGs and the potential mechanisms by, and species in which, these genes may be passed. This index differed across both diverse environmental samples and also within a group of marine samples. Ecosystems proximal to human impact, including effluent and river sediment collected downstream from a WWTP processing high volumes of pharmaceuticals, had the highest cumulative index scores. These samples were distinguished by higher potentials for gene transfer, pathogenicity, and the presence of ARGs. Less impacted environments, including marine samples and a freshwater lake, had indices reflecting reduced public health concern but exhibited a distinct fingerprint characterized by either phages or MRGs, depending on location. Pathogens were rare across all data sets but were likely underestimated given the shotgun approach and, therefore, limited sequencing depth.

As the samples in this study were diverse, multiple factors may have contributed to the index profiles obtained including microbial community composition, ecosystem type, sampling methods, seasonality, and underlying data quality. We did not directly address community composition or seasonality, but composition is likely reflected in the ecosystem type. We aimed to minimize the impact of sample collection in part by only including studies that targeted the same size fraction. The PCA results suggest that ecosystem type is a stronger predictor of the index profiles than sequence quality. The coastal ocean samples had a wide range in the


[Figure 2: bar charts, panels A–G. Panel A y-axis: ARD index-positive sequences (%). Panels B–G y-axes: ARG sequences (%), MRG sequences (%), plasmid sequences (%), TE sequences (%), phage sequences (%), and pathogen sequences (n). x-axis: samples grouped by ecosystem: estuary (P1, P5, P26, P28, P32); coastal ocean (GS258, GS259, GS262, GS263, GS264 at 0.1 μm; GS257, GS258, GS259, GS260, GS262, GS263, GS264 at 0.8 μm); freshwater (Lake Lanier); marina; WWTP (sludge, effluent); river sediment downstream from WWTP (discharge site, 2.3 km, 2.7 km, and 17.7 km downstream). Legend: high, medium-high, medium-low, and low thresholds. A vertical bar separates ecosystems more distal to human impact from those proximal to human impact.]
Figure 2. Percentage of total sequenced reads per metagenome assigned to the ARD index. (A) Percentage of index-positive sequences per sample and
ecosystem and (B–G) the percentage of sequence reads per sample and ecosystem assigned to each index subcategory [(B) ARG sequences; (C) MRG
sequences; (D) plasmid sequences; (E) TE sequences; (F) phage sequences; (G) pathogen sequences]. The percentages are shown for four different sequence
similarity thresholds [including high, medium-high, medium, and low stringencies (see Table 2)]. The number of pathogen-annotated sequences is shown instead
of the percentage. The vertical bar in each plot separates ecosystems more distal versus more proximal to human impact. Filter sizes (i.e., 0.1 and 0.8 μm) are
listed after the station names for the coastal ocean samples. The graph inserts for ARGs and plasmids in B and D are zoomed-in views of the abundance of each
subcategory excluding the river sediment samples.
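
The quantities plotted in Figure 2 can be illustrated with a minimal sketch. It assumes that each annotated read is summarized by its index subcategory, percent identity, and alignment length, and that a read counts toward the index at a given stringency only if it meets both cutoffs. The function name `index_summary` and the cutoff values in `THRESHOLDS` are hypothetical placeholders chosen for illustration; the values actually used in the study are those listed in Table 2.

```python
from collections import defaultdict

# Hypothetical cutoffs: (minimum % identity, minimum alignment length).
# These are placeholders; the study's actual values are given in its Table 2.
THRESHOLDS = {
    "high": (90.0, 50),
    "medium-high": (70.0, 40),
    "medium-low": (50.0, 30),
    "low": (30.0, 20),
}


def index_summary(total_reads, hits, thresholds=THRESHOLDS):
    """Summarize index-positive reads per subcategory at each threshold.

    hits: iterable of (subcategory, percent_identity, alignment_length),
    one tuple per annotated read.
    Returns {threshold: {subcategory: value}}, where value is a percentage
    of total_reads, except for "pathogen", which is reported as a raw
    count (as in Figure 2G).
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    hits = list(hits)  # allow generators; the hits are scanned once per threshold
    summary = {}
    for name, (min_identity, min_length) in thresholds.items():
        counts = defaultdict(int)
        for subcategory, identity, length in hits:
            if identity >= min_identity and length >= min_length:
                counts[subcategory] += 1
        summary[name] = {
            sub: (n if sub == "pathogen" else 100.0 * n / total_reads)
            for sub, n in counts.items()
        }
    return summary


if __name__ == "__main__":
    # Toy input: four annotated reads out of 1,000 sequenced reads.
    example_hits = [
        ("ARG", 95.0, 60),
        ("ARG", 60.0, 35),
        ("plasmid", 75.0, 45),
        ("pathogen", 92.0, 55),
    ]
    for threshold, values in index_summary(1000, example_hits).items():
        print(threshold, values)
```

Under these assumptions, raising the stringency can only remove reads from a subcategory, which is consistent with the decrease in index-positive sequences at higher thresholds reported in the text.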


Metagenomics and antibiotic resistance monitoring

number of predicted ORFs and proteins, yet they clustered closely in the PCA score plots. Furthermore, the river sediment samples, which had a low number of ORFs and predicted proteins compared to the other samples (except coastal ocean), had the highest number of predicted ARGs and MGEs. Thus, although data quality may impact quantification of the index, the diverse nature of the samples confounded other potential factors. As more metagenomic data with greater spatiotemporal resolution become available, we will be better able to tease apart these factors.
We evaluated the choice of sequence similarity thresholds for annotating metagenomic data. Specific public health decisions may require the selection of different thresholds in order to optimize the balance of false positives to false negatives. Our sequence similarity thresholds matched or exceeded the criteria used in other studies investigating ARGs and gene transfer in water (Kristiansson et al. 2011; Zhang et al. 2011). There was a significant decrease in the number of index-positive sequences for each sample and index subcategory as the threshold was increased. This trend may be related to sequence read length, in that sequences assigned at the lower thresholds may be too short to reach the alignment length criteria of the higher thresholds (e.g., WWTP sludge sample), or it may indicate that the lower thresholds overassign false positives. Further optimization of sequence similarity thresholds for public health applications will be necessary to ensure proper interpretation of the index.
Applications to public health surveillance. Current water quality standards are culture based and highly specific for targeted organisms. For example, beach and shellfishery closures in Washington State occur when fecal coliform levels exceed a geometric mean of 14 colony forming units (CFUs) or enterococci levels exceed a geometric mean of 70 CFU/100 mL marine water (State of Washington 2014). Although this has been an effective approach for reducing exposure to well-known pathogens, early risk management may benefit from population-level screening that results in a lower false-negative rate and thus increased sensitivity for a broader range of organisms or genes of interest. Furthermore, a reduction in specificity, and subsequent increase in false positives, may not be appropriate for regulatory contexts, but it may be accepted when using a metric such as the ARD index to gain a broader understanding of the antibiotic resistance potential of an environmental sample and to detect the emergence of ARDs.
To frame metagenomic screening data within an early risk management approach, we can calculate an environmental detection rate for the ARD index by sample (see Supplemental Material, Figure S2). The environmental detection rate provides a rough estimate of the number of ARD sequences present per volume of water sampled (and takes into account the mass of DNA extracted, the mass sequenced, and sequencing depth). For example, the environmental detection rates for ARGs in the WWTP effluent and estuary samples were approximately 1.58 × 10^8 sequences/L and 0 sequences/L, respectively (based on the medium-high sequence similarity threshold). Quantitative microbial risk assessments have used gene abundance counts (i.e., genome copies/L detected via quantitative polymerase chain reaction) for pathogenic markers in fecally contaminated recreational waters to determine pathogen dose (Staley et al. 2012). The environmental detection rate described above begins to lay out a similar approach for metagenomic assessments that may be informative for distinguishing differently impacted environments and evaluating a variety of public health impacts across marine microbial communities.
Relevance to public health management. Water quality management decisions have ignored ARDs or antibiotics, likely because of a lack of data and uncertainty regarding risk and risk metrics. Given the global magnitude of antibiotic resistance, including the emergence of multidrug-resistant bacterial strains in the environment, information pertaining to the status, patterns, and trends in ARDs is needed. Public health management decisions that may benefit from information regarding ARD potential include actions aimed at reducing the sources and exposure routes of ARDs and the framing of adaptive monitoring protocols. Source control of ARDs entering coastal environments primarily involves waste management and the regulation of antibiotic use in agriculture, aquaculture, hospitals, and households (Davies and Davies 2010). Exposure control of ARDs may involve beach or shellfish bed advisories or aquaculture siting. Due to the uncertainty in links between exposure and actual human health risk, current applications of the index as a screening tool are best suited to ARD source control. For example, using the index to screen WWTP
and cruise ship effluent and discharge sites, freshwater inputs such as river mouths and coastal aquaculture operations could provide baseline environmental levels for anthropogenically sourced ARDs. The availability of such data would benefit decisions that currently do not account for the potential risk associated with antibiotic resistance release, such as reducing ARD dissemination into the environment by improving WWTP technologies or reducing the use of activated sludge as fertilizer for agricultural crops.

[Figure 3 graphic: PCA score plot of PC1 (41% of the variance explained) versus PC2 (27% of the variance explained), with samples labeled by ecosystem (estuary, coastal ocean, freshwater lake, marina, river sediment downstream of WWTP, WWTP effluent, WWTP activated sludge) and sample P26 marked; two accompanying bar charts show the PC1 and PC2 loading values for the ARG, MRG, TE, plasmid, phage, and pathogen subcategories.]
Figure 3. PCA score plot and corresponding loading values for the index subcategories by ecosystem. The medium-high sequence similarity threshold was used for this analysis (see Table 2). Sampling location P26 experiences increased mixing of oceanic waters relative to the other estuary samples. The red and purple circles indicate distinct coastal ocean and river sediment sample clusters, respectively, according to PC1.

Future data needs. The ARD index is a high-throughput measure of ARD potential and as such cannot be directly related to human health risk. For environmentally sourced ARGs to pose a health risk, they must a) be transferable via MGEs; b) be transferred to either pathogenic or commensal bacteria that then infect or colonize humans; and c) confer resistance to antibiotics of clinical importance. Furthermore, vectors such as phages are ubiquitous in the marine environment (Breitbart 2012); thus any link to the dissemination of ARGs will require more targeted investigations. These limitations


reflect the fact that, although emerging technologies will continue to provide unlimited access to genomic information, development of risk assessment frameworks will be of equal importance.
Although the cost and time required for metagenomic analysis are still greater than those of existing regulatory options for monitoring, advances in sequencing technologies and bioinformatic platforms are increasing the utility of high-throughput approaches. Next generation sequencing platforms now offer increased sequencing depths and read lengths for < $0.10/megabase (Glenn 2011). Furthermore, the public availability of bioinformatic analysis tools and pipelines (Scholz et al. 2012) provides a platform that public health practitioners can access and automate to address the research or regulatory question at hand.
Decreased sequencing costs and increased sequencing depths will also allow for longitudinal sampling and greater geospatial coverage, leading to a more comprehensive profiling of the ARD index. Furthermore, although the sample size in this study was limited, the PCA framework presented provides a platform from which to tease apart the index and characterize individual ecosystems.

Conclusions
We had three objectives: a) to develop a metagenomic ARD index that quantifies the antibiotic resistance signal within marine and freshwater environments, b) to analyze this index for common patterns characterizing specific ecosystems, and c) to conceptually frame the index within an environmental health surveillance context. Significant differences were seen in the index when comparing marine and freshwater environments that differ in proximity to human impact, and distinct index patterns were evident across these environments. We conclude that the index has potential to be a valuable screening tool for early risk management of ARDs, but defining index threshold levels of concern and linking these levels to decisions will require a better understanding of the prevalence, fate, and transport of ARGs in the marine environment. Nevertheless, characterization of the ARD potential of environmental microbial communities is a first step toward incorporating metagenomic information into monitoring frameworks for antibiotic resistance in aquatic ecosystems.

References
Allen HK, Donato J, Wang HH, Cloud-Hansen KA, Davies J, Handelsman J. 2010. Call of the wild: antibiotic resistance genes in natural environments. Nat Rev Microbiol 8:251–259.
Allen LZ, Allen EE, Badger JH, McCrow JP, Paulsen IT, Elbourne LD, et al. 2012. Influence of nutrients and currents on the genomic composition of microbes across an upwelling mosaic. ISME J 6:1403–1414.
Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. 1990. Basic local alignment search tool. J Mol Biol 215:403–410.
Amann RI, Ludwig W, Schleifer KH. 1995. Phylogenetic identification and in-situ detection of individual microbial-cells without cultivation. Microbiol Rev 59:143–169.
Baquero F. 2012. Metagenomic epidemiology: a public health need for the control of antimicrobial resistance. Clin Microbiol Infect 18(suppl 4):67–73.
Bibby K, Peccia J. 2013. Identification of viral pathogen diversity in sewage sludge by metagenome analysis. Environ Sci Technol 47:1945–1951.
Breitbart M. 2012. Marine viruses: Truth or dare. Ann Rev Mar Sci 4:425–448.
Bush K, Courvalin P, Dantas G, Davies J, Eisenstein B, Huovinen P, et al. 2011. Tackling antibiotic resistance. Nat Rev Microbiol 9:894–896.
Centers for Disease Control and Prevention. 2014. National Antimicrobial Resistance Monitoring System for Enteric Bacteria (NARMS) Homepage. Available: http://www.cdc.gov/narms/ [accessed 8 January 2014].
Cole JR, Wang Q, Cardenas E, Fish J, Chai B, Farris RJ, et al. 2009. The Ribosomal Database Project: improved alignments and new tools for rRNA analysis. Nucleic Acids Res 37:D141–D145.
Davies J, Davies D. 2010. Origins and evolution of antibiotic resistance. Microbiol Mol Biol Rev 74:417–433.
Ecker DJ, Sampath R, Willett P, Wyatt JR, Samant V, Massire C, et al. 2005. The Microbial Rosetta Stone Database: a compilation of global and emerging infectious microorganisms and bioterrorist threat agents. BMC Microbiol 5:19; doi:10.1186/1471-2180-5-19.
European Centre for Disease Prevention and Control. 2014. European Antimicrobial Resistance Surveillance Network (EARS-Net) Homepage. Available: http://www.ecdc.europa.eu/en/activities/surveillance/EARS-Net [accessed 8 January 2014].
Glenn TC. 2011. Field guide to next-generation DNA sequencers. Mol Ecol Resour 11:759–769.
Grundmann H, Klugman KP, Walsh T, Ramon-Pardo P, Sigauque B, Khan W, et al. 2011. A framework for global surveillance of antibiotic resistance. Drug Resist Update 14:79–87.
Handelsman J. 2004. Metagenomics: application of genomics to uncultured microorganisms. Microbiol Mol Biol Rev 68:669–685.
Hugenholtz P, Tyson GW. 2008. Microbiology: metagenomics. Nature 455:481–483.
Kristiansson E, Fick J, Janzon A, Grabic R, Rutgersson C, Weijdegard B, et al. 2011. Pyrosequencing of antibiotic-contaminated river sediments reveals high levels of resistance and gene transfer elements. PLoS One 6:e17038; doi:10.1371/journal.pone.0017038.
Liu B, Pop M. 2009. ARDB—Antibiotic Resistance Genes Database. Nucleic Acids Res 37:D443–D447.
Margulies M, Egholm M, Altman WE, Attiya S, Bader JS, Bemben LA, et al. 2005. Genome sequencing in microfabricated high-density picolitre reactors. Nature 437:376–380.
Martinez JL. 2009. The role of natural environments in the evolution of resistance traits in pathogenic bacteria. Proc Biol Sci 276:2521–2530.
McLellan SL, Huse SM, Mueller-Spitz SR, Andreishcheva EN, Sogin ML. 2010. Diversity and population structure of sewage-derived microorganisms in wastewater treatment plant influent. Environ Microbiol 12:378–392.
Meyer F, Paarmann D, D’Souza M, Olson R, Glass EM, Kubal M, et al. 2008. The metagenomics RAST server—a public resource for the automatic phylogenetic and functional analysis of metagenomes. BMC Bioinformatics 9:386; doi:10.1186/1471-2105-9-386.
Oh S, Caro-Quintero A, Tsementzi D, DeLeon-Rodriguez N, Luo C, Poretsky R, et al. 2011. Metagenomic insights into the evolution, function, and complexity of the planktonic microbial community of Lake Lanier, a temperate freshwater ecosystem. Appl Environ Microbiol 77:6000–6011.
Overbeek R, Begley T, Butler RM, Choudhuri JV, Chuang HY, Cohoon M, et al. 2005. The subsystems approach to genome annotation and its use in the project to annotate 1000 genomes. Nucleic Acids Res 33:5691–5702.
Port JA, Wallace JC, Griffith WC, Faustman EM. 2012. Metagenomic profiling of microbial composition and antibiotic resistance determinants in Puget Sound. PLoS One 7:e48000; doi:10.1371/journal.pone.0048000.
Punta M, Coggill PC, Eberhardt RY, Mistry J, Tate J, Boursnell C, et al. 2012. The Pfam protein families database. Nucleic Acids Res 40:D290–D301.
Sanapareddy N, Hamp TJ, Gonzalez LC, Hilger HA, Fodor AA, Clinton SM. 2009. Molecular diversity of a North Carolina wastewater treatment plant as revealed by pyrosequencing. Appl Environ Microbiol 75:1688–1696.
Schmieder R, Edwards R. 2012. Insights into antibiotic resistance through metagenomic approaches. Future Microbiol 7:73–89.
Scholz MB, Lo CC, Chain PS. 2012. Next generation sequencing and bioinformatic bottlenecks: the current state of metagenomic data analysis. Curr Opin Biotech 23:9–15.
Segura PA, François M, Gagnon C, Sauve S. 2009. Review of the occurrence of anti-infectives in contaminated wastewaters and natural and drinking waters. Environ Health Perspect 117:675–684; doi:10.1289/ehp.11776.
Staley C, Gordon KV, Schoen ME, Harwood VJ. 2012. Performance of two quantitative PCR methods for microbial source tracking of human sewage and implications for microbial risk assessment in recreational waters. Appl Environ Microbiol 78:7317–7326.
State of Washington. 2014. WAC 173-201A-210. Marine Water Designated Uses and Criteria. Available: http://apps.leg.wa.gov/wac/default.aspx?cite=173-201A-210 [accessed 8 January 2014].
Wellington EM, Boxall AB, Cross P, Feil EJ, Gaze WH, Hawkey PM, et al. 2013. The role of the natural environment in the emergence of antibiotic resistance in gram-negative bacteria. Lancet Infect Dis 13:155–165.
Wright GD. 2007. The antibiotic resistome: the nexus of chemical and genetic diversity. Nat Rev Microbiol 5:175–186.
Wright GD. 2010. Antibiotic resistance in the environment: a link to the clinic? Curr Opin Microbiol 13:589–594.
Wu CH, Sercu B, Van de Werfhorst LC, Wong J, DeSantis TZ, Brodie EL, et al. 2010. Characterization of coastal urban watershed bacterial communities leads to alternative community-based indicators. PLoS One 5:e11285; doi:10.1371/journal.pone.0011285.
Ye L, Zhang T. 2011. Pathogenic bacteria in sewage treatment plants as revealed by 454 pyrosequencing. Environ Sci Technol 45:7173–7179.
Zhang T, Zhang XX, Ye L. 2011. Plasmid metagenome reveals high levels of antibiotic resistance genes and mobile genetic elements in activated sludge. PLoS One 6:e26041; doi:10.1371/journal.pone.0026041.
Zhang XX, Zhang T, Fang HH. 2009. Antibiotic resistance genes in water environment. Appl Microbiol Biotechnol 82:397–414.
Zhu W, Lomsadze A, Borodovsky M. 2010. Ab initio gene identification in metagenomic sequences. Nucleic Acids Res 38:e132; doi:10.1093/nar/gkq275.
