The loss of effective antimicrobials is reducing our ability to protect the global population from infectious disease. However, the field of antibiotic drug discovery and the public health monitoring of antimicrobial resistance (AMR) are beginning to exploit the power of genome and metagenome sequencing. The creation of novel AMR bioinformatics tools and databases and their continued development will advance our understanding of the molecular mechanisms and threat severity of antibiotic resistance, while simultaneously improving our ability to accurately predict and screen for antibiotic resistance genes within environmental, agricultural, and clinical settings. To do so, efforts must be focused toward exploiting the advancements of genome sequencing and information technology. Currently, AMR bioinformatics software and databases reflect different scopes and functions, each with its own strengths and weaknesses. A review of the available tools reveals common approaches and reference data but also reveals gaps in our curated data, models, algorithms, and data-sharing tools. These gaps must be addressed to overcome the limitations and areas of unmet need within the AMR research field before DNA sequencing can be fully exploited for AMR surveillance and improved clinical outcomes.
Jia B, Raphenya AR, Alcock B, Waglechner N, Guo P, Tsang KK, Lago BA, Dave BM, Pereira S, Sharma AN, Doshi S, Courtot M, Lo R, Williams LE, Frye JG, Elsayegh T, Sardar D, Westman EL, Pawlowski AC, Johnson TA, Brinkman FS, Wright GD, & McArthur AG.
The Comprehensive Antibiotic Resistance Database (CARD; http://arpcard.mcmaster.ca) is a manually curated resource containing high-quality reference data on the molecular basis of antimicrobial resistance (AMR), with an emphasis on the genes, proteins and mutations involved in AMR. CARD is ontologically structured, model-centric, and spans the breadth of AMR drug classes and resistance mechanisms, including intrinsic, mutation-driven and acquired resistance. It is built upon the Antibiotic Resistance Ontology (ARO), a custom-built, interconnected and hierarchical controlled vocabulary allowing advanced data sharing and organization. Its design allows the development of novel genome analysis tools, such as the Resistance Gene Identifier (RGI) for resistome prediction from raw genome sequence. Recent improvements include extensive curation of additional reference sequences and mutations, development of a unique Model Ontology and accompanying AMR detection models to power sequence analysis, new visualization tools, and expansion of the RGI for detection of emergent AMR threats. CARD curation is updated monthly based on an interplay of manual literature curation, computational text mining, and genome analysis.
Three of our undergraduate students gave poster presentations in the last month. Arjun Sharma (Biochemistry & Biomedical Sciences 3rd year) outlined his work on developing the CARD*Shark text mining algorithms of the Comprehensive Antibiotic Resistance Database (arpcard.mcmaster.ca) at the Michael G. DeGroote Institute for Infectious Disease Research (IIDR) Trainee Day. Kirill Pankov (Biomedical Discovery & Commercialization 4th year) presented the results of his summer NSERC Undergraduate Student Research Award (USRA) research in the Laboratory of Dr. Joanna Wilson into the origin of Cnidarian P450 enzymes, work he is continuing in our lab as part of his thesis research. Mohammad Khan (Biomedical Discovery & Commercialization 4th year), a thesis student in the Laboratory of Dr. Eric Brown, which collaborates with our group, also presented a poster at IIDR Trainee Day on his work on chemical-genetic interaction database design.
Sharma, A.N., S. Doshi, A.R. Raphenya, B. Alcock, B.M. Dave, B.A. Lago, K.K. Tsang, & A.G. McArthur. 2016. CARDShark: Computer-assisted biocuration of the Comprehensive Antibiotic Resistance Database. Poster presentation at the 2016 Michael G. DeGroote Institute for Infectious Disease Research (IIDR) Trainee Day, Hamilton, Ontario, Canada.
Pankov, K., A.G. McArthur & J.Y. Wilson. 2016. The Cytochrome P450 (CYP) superfamily in the Cnidarian phylum. Poster presentation at the 2016 Undergraduate Student Research Awards (USRA) Poster Session, Hamilton, Ontario, Canada.
Khan, M.A., S. French, B. Aubie, A.G. McArthur & E.D. Brown. 2016. Challenging common screening filters through analysis of a chemical-genetic screening database. Poster presentation at the 2016 Michael G. DeGroote Institute for Infectious Disease Research (IIDR) Trainee Day, Hamilton, Ontario, Canada.
A cross-national research consortium co-led by McMaster’s Andrew McArthur is receiving two of 16 federal grants to further develop a big data solution to the growing problem of antimicrobial resistance (AMR). The government’s investment, totaling more than $4M, is the result of Genome Canada’s 2015 Bioinformatics and Computational Biology Competition, a partnership with the Canadian Institutes of Health Research (CIHR). McArthur and his colleagues will receive $500,000 over two years. McArthur will work closely with researchers from the University of British Columbia, Simon Fraser University, Dalhousie University and the Public Health Agency of Canada to design and develop novel software and database systems that will empower public health agencies and the agri-food sector to rapidly respond to threats posed by infectious disease outbreaks and food-borne illnesses.
McArthur, A.G., B. Jia, A.R. Raphenya, P. Guo, K. Tsang, B. Dave, B. Alcock, B. Lago, N. Waglechner, & G.D. Wright. 2016. The Comprehensive Antibiotic Resistance Database – A Platform for Antimicrobial Resistance Surveillance. Invited presentation at the 2nd Conference Rapid Microbial NGS and Bioinformatics: Translation Into Practice, Hamburg, Germany.
Antimicrobial resistance (AMR) is among the most pressing public health crises of the 21st century. Despite the importance of resistance to health, this field has been slow to take advantage of genome-scale tools. Phenotype-based criteria dominate the epidemiology of antibiotic action and effectiveness. There is a poor understanding of which antibiotic resistance genes are in circulation, which are a threat, and how clinicians and public health workers can manage the crisis of resistance. However, DNA sequencing is rapidly decreasing in cost and as such we are on the cusp of an age of high-throughput molecular epidemiology. What is needed are tools for rapid, accurate analysis of DNA sequence data for the genetic underpinnings of antibiotic resistance. In an effort to address this problem, we have created the Comprehensive Antibiotic Resistance Database (card.mcmaster.ca). This database is a rigorously curated collection of known antibiotics, targets, and resistance determinants. It integrates disparate molecular and sequence data, provides a unique organizing principle in the form of the Antibiotic Resistance Ontology (ARO), and can quickly identify putative antibiotic resistance genes in raw genome sequences using the novel Resistance Gene Identifier (RGI). Here we review the current state of the CARD, particularly recent advances in the curation of resistance determinants and the structure of the ARO. We will also present our plans for development of semi- and fully-automated text mining algorithms for curation of broader AMR data, construction of meta-models for improved AMR phenotype prediction, and release of portable command-line genome analysis tools.
The completion of the Human Genome Project in 2001 sparked the beginning of a sequencing revolution with applications that are only now being realized by researchers. The decreasing cost of DNA sequencing has ignited a continuous generation of genomic data, yet only a limited number of researchers are able to manipulate the output. Consequently, the demand to examine this genetic information has forced bioinformaticians to improve the analytical tools involved in sequence analysis. Galaxy is a user-friendly analytical platform where researchers without a computational background can navigate their way through an investigation and use various analytical tools and workflows to assist them with their genomic research (1). Galaxy enables individual users to add novel software to the environment, filling gaps left by tools that have not been created by the Galaxy team. This project will focus on a particular analytical gap concerning tools related to antibiotic resistance, phylogenetics, and bacterial virulence. Currently, the software proposed for adaptation to the Galaxy setting includes the Resistance Gene Identifier (RGI) associated with the Comprehensive Antibiotic Resistance Database (CARD) (2), a single nucleotide polymorphism identifier (BANSP), and novel virulence factor identification software associated with the Virulence Factor Database (VFDB) (3). The combination of Galaxy’s existing ToolShed and these unique additions will create a comprehensive analytical environment that can be applied to realistic situations. One such situation that this project will concentrate on is the outbreaks of Clostridium difficile (C. diff) in the health care system.
The loss of effective antimicrobials is reducing the ability to protect the global population from infectious diseases, leading to profound impacts on the healthcare system, international trade, agriculture, and environment. The field of antibiotic drug discovery and the monitoring of new and dynamic antibiotic resistance elements have yet to fully exploit the power of the genome revolution. The curation and directed development of the Comprehensive Antibiotic Resistance Database (CARD) will advance the understanding of the genetics, genomics, and threat severity of antibiotic resistance, while simultaneously improving its ability to accurately predict and screen for antibiotic resistance genes within raw genomes. Strategically advancing the Antibiotic Resistance Ontology (ARO), the unique organizing principle of the CARD, allows the value of big data in disparate realms of research to be used and understood by the multidisciplinary efforts working to combat the emergence and prevalence of the ESKAPE pathogens, a critical driving force of the global health crisis.
Wright, G.D. & A.G. McArthur. 2015. A bioinformatic platform for the characterization of antibiotic resistance in bacterial genomes and metagenomes. Presentation at the 2015 Interscience Conference on Antimicrobial Agents and Chemotherapy, San Diego, California.
The increasingly routine sequencing of bacterial genomes in biomedical research and the clinical lab requires access to easy-to-use, efficient, and accurate bioinformatic tools for analysis of bacterial traits from virulence to drug resistance. To contribute to this growing need, we have developed a platform for the investigation of antibiotic resistance elements, the Comprehensive Antibiotic Resistance Database (http://arpcard.mcmaster.ca/). This resource includes a manually curated database of over 3000 resistance genes and associated literature, protein structures, and target antibiotics. Associated with this platform are tools to aid in the study of resistance, including the Resistance Gene Identifier (RGI), which can analyze genomic data for the presence of resistance elements. Our goal is to accurately predict resistance phenotype from genomic data. Our analysis of many genomes and associated antibiograms reveals a reservoir of ‘silent’ resistance genes that are predicted to encode viable resistance elements even though the observed phenotype is drug sensitive. Our efforts to manage these issues, along with identifying and adding new resistance genes, will be presented.
Authors: McArthur AG, Wright GD. Curr Opin Microbiol. 2015 Jul 31;27:45-50.
Antimicrobial resistance is a global health challenge and has an evolutionary trajectory ranging from proto-resistance in the environment to untreatable clinical pathogens. Resistance is not static, as pathogenic strains can move among patient populations and individual resistance genes can move among pathogens. Effective treatment of resistant infections, antimicrobial stewardship, and new drug discovery increasingly rely upon genotype information, powered by decreasing costs of DNA sequencing. These new approaches will require advances in microbial informatics, particularly in development of reference databases of molecular determinants such as our Comprehensive Antibiotic Resistance Database and clinical metadata, new algorithms for prediction of resistome and resistance phenotype from genotype, and new protocols for global collection and sharing of high-throughput molecular epidemiology data.