Combatting Antibiotic Resistance Using Surveillance – click on the image to watch the 10 minute video. More details here.

Zachary Lin – Adapting Galaxy bioinformatics to outbreak-associated Clostridium difficile

The completion of the human genome project in 2001 sparked the beginning of a sequencing revolution with applications that are only now being realized by researchers. The decreasing cost of DNA sequencing has ignited a continuous generation of genomic data, with a limited number of researchers able to manipulate the output. Consequently, the demand to examine this genetic information has forced bioinformaticians to improve the analytical tools involved in sequence analysis. Galaxy is a user-friendly analytical platform where researchers without a computational background can navigate their way through an investigation and use various analytical tools and workflows to assist them with their genomic research (1). Galaxy enables individual users to add novel software to the environment, filling gaps left by tools that haven’t been created by the Galaxy team. This project will focus on a particular analytical gap concerning tools related to antibiotic resistance, phylogenetics, and bacterial virulence. Currently, the proposed software to be adapted to the Galaxy setting includes a resistance gene identifier (RGI) associated with the Comprehensive Antibiotic Resistance Database (CARD) (2), a single nucleotide polymorphism identifier (BANSP), and novel virulence factor identification software associated with the virulence factor database (VFDB) (3). The combination of Galaxy’s existing ToolShed and these unique additions will create a comprehensive analytical environment that can be applied to realistic situations. One such situation, on which this project will concentrate, is the outbreaks of Clostridium difficile (C. diff) in the health care system.

Kara Tsang – Expansion of the Antibiotic Resistance Ontology for the ESKAPE pathogens

The loss of effective antimicrobials is reducing the ability to protect the global population from infectious diseases, leading to profound impacts on the healthcare system, international trade, agriculture, and the environment. The field of antibiotic drug discovery and the monitoring of dynamic and new antibiotic resistance elements have yet to fully exploit the power of the genome revolution. The curation and directed development of the Comprehensive Antibiotic Resistance Database (CARD) will advance the understanding of the genetics, genomics, and threat severity of antibiotic resistance, while simultaneously improving its ability to accurately predict and screen for antibiotic resistance genes within raw genomes. Strategically advancing the Antibiotic Resistance Ontology (ARO), the unique organizing principle of the CARD, allows the value of big data in disparate realms of research to be used and understood by the multidisciplinary efforts working to combat the emergence and prevalence of the ESKAPE pathogens, a critical driving force of the global health crisis.


Dr. McArthur gave a MacTalk at McMaster’s Big Ideas Better Cities evenings on Health and Social Innovation through Big Data about “Combatting antibiotic resistance using surveillance”. See the related How ‘Big Data’ can help solve big problems article at McMaster Daily News and the coverage at the Hamilton Spectator.



Wright, G.D. & A.G. McArthur. 2015. A bioinformatic platform for the characterization of antibiotic resistance in bacterial genomes and metagenomes. Presentation at the 2015 Interscience Conference of Antimicrobial Agents and Chemotherapy, San Diego, California.

The increasingly routine sequencing of bacterial genomes in biomedical research and the clinical lab requires access to easy-to-use, efficient, and accurate bioinformatic tools for analysis of bacterial traits from virulence to drug resistance. To contribute to this growing need, we have developed a platform for the investigation of antibiotic resistance elements, the Comprehensive Antibiotic Resistance Database (CARD). This resource includes a manually curated database of over 3000 resistance genes and associated literature, protein structures, and target antibiotics. Associated with this platform are tools to aid in the study of resistance, including the Resistance Gene Identifier (RGI), which can analyze genomic data for the presence of resistance elements. Our goal is to accurately predict resistance phenotype from genomic data. Our analysis of many genomes and associated antibiograms reveals a reservoir of ‘silent’ resistance genes that are predicted to encode viable resistance elements even though the observed phenotype is drug sensitive. Our efforts to manage these issues along with identifying and adding new resistance genes will be presented.
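
The idea of a ‘silent’ resistance gene can be illustrated with a minimal sketch. This is not the RGI or any CARD code: the function name, the data layout, and the gene and drug names are all hypothetical placeholders, and the rule used (a gene counts as silent when every tested drug it is predicted to affect came back susceptible) is an assumed simplification of comparing genome predictions with an antibiogram.

```python
from typing import Dict, List, Set


def find_silent_genes(
    predicted: Dict[str, Set[str]],
    antibiogram: Dict[str, str],
) -> List[str]:
    """Return predicted resistance genes whose tested drugs were all susceptible.

    predicted:   gene name -> drugs the gene is predicted to confer resistance to
    antibiogram: drug -> "R" (resistant) or "S" (susceptible), from lab testing
    """
    silent = []
    for gene, drugs in sorted(predicted.items()):
        # Only drugs that were actually tested can support or contradict a prediction.
        tested = [d for d in drugs if d in antibiogram]
        # Assumed rule: silent means at least one drug was tested and none was resistant.
        if tested and all(antibiogram[d] == "S" for d in tested):
            silent.append(gene)
    return silent


# Hypothetical isolate with placeholder names; this is not real data.
predicted = {
    "geneA": {"drug1"},
    "geneB": {"drug2", "drug3"},
    "geneC": {"drug4"},  # drug4 was never tested, so geneC cannot be judged
}
antibiogram = {"drug1": "R", "drug2": "S", "drug3": "S"}

silent_genes = find_silent_genes(predicted, antibiogram)
print(silent_genes)
```

In this toy case only geneB is reported: geneA matches a resistant phenotype, and geneC is left unjudged because none of its predicted drugs appears in the antibiogram.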



I spent July travelling to two great meetings in the British Isles. First was the Galaxy Community Conference in Norwich, UK, which provided a crash course on the Galaxy Platform for data analysis – data intensive biology for everyone! We will definitely be using Galaxy for projects in 2015-2016. Second was the 2015 Annual Conference on Intelligent Systems for Molecular Biology / European Conference on Computational Biology joint meeting in Dublin, Ireland. This meeting covers a very broad spectrum of computational biology and our work on the CARD was well received. I also got a chance to attend the Bio-Ontologies SIG for the first time, which provided a lot of perspective for our ontology development efforts. And yes, I had a few pints with colleagues…


  • McArthur, A.G. 2015. Flash Update – The Antibiotic Resistance Ontology. Presentation at Bio-Ontologies 2015, Dublin, Ireland.
  • McArthur, A.G., Waglechner, N., Nizam, F., Pereira, S.K., Jia, B., Sardar, D., Westman, E.L., Pawlowski, A.C., Johnson, T., Lo, R., Courtot, M., Brinkman, F.S., Williams, L.E., Frye, J.G., & Wright, G.D. 2015. The Comprehensive Antibiotic Resistance Database. Poster Presentation at the 23rd Annual International Conference on Intelligent Systems for Molecular Biology, Dublin, Ireland.
One of the themes in the McArthurLab is research at the intersection of academia, government, and industry. We endeavour to work with government agencies and industrial partners as much as with fellow academics. This is in part a reflection of our emphasis upon applied research but also my history of starting and owning my own bioinformatics company between being a professor in the United States (until 2006) and starting my faculty position at McMaster (in late 2014). This weekend I had the opportunity to participate in @Mac_Spectrum’s Summer Startup where student teams worked to create a start-up company in 36 hours. I got to meet with a lot of the teams and discuss their start-up strategies as well as give a talk on “Software in Science Entrepreneurship” where I discussed our efforts to partner our antibiotic resistance software with government and industry partners. It was a great weekend and impressive competition. Congratulations to Clear Roots, Avaro, and E-Dopa on their awards!



McArthur, A.G., Waglechner, N., Nizam, F., Pereira, S.K., Jia, B., Sardar, D., Westman, E.L., Pawlowski, A.C., Johnson, T., Lo, R., Courtot, M., Brinkman, F.S., Williams, L.E., Frye, J.G., & Wright, G.D. 2015. The Comprehensive Antibiotic Resistance Database. Presentation at the 4th ASM Conference on Antimicrobial Resistance in Zoonotic Bacteria and Foodborne Pathogens, Washington, District of Columbia.

Antimicrobial resistance (AMR) is among the most pressing public health crises of the 21st Century. Despite the importance of resistance to health, this field has been slow to take advantage of genome scale tools. Rather, phenotype based criteria dominate the epidemiology of antibiotic action and effectiveness. As a result, there is a poor understanding of which antibiotic resistance genes are in circulation, which ones are a threat, and how clinicians and public health workers can manage the crisis of resistance. However, DNA sequencing is rapidly decreasing in cost and as such we are on the cusp of an age of high-throughput molecular epidemiology. What is needed are tools for rapid, accurate analysis of DNA sequence data for the genetic underpinnings of antibiotic resistance. In an effort to address this problem, we have created the Comprehensive Antibiotic Resistance Database (CARD). This database is a rigorously curated collection of known antibiotics, targets, and resistance determinants. It integrates disparate molecular and sequence data, provides a unique organizing principle in the form of the Antibiotic Resistance Ontology (ARO), and can quickly identify putative antibiotic resistance genes in raw genome sequences using the novel Resistance Gene Identifier (RGI). Here we review the current state of the CARD, particularly recent advances in the curation of resistance determinants and the structure of the ARO. We will also present our plans for development of semi- and fully-automated text mining algorithms for curation of broader AMR data, construction of Probabilistic Graphical Models for improved AMR phenotype prediction, and development of portable command-line genome analysis tools.

Graham, C., D. Boreham, T. Glenn, S. Lance, J. Martino, R. Manzon, A.G. McArthur, S. Rogers, J.Y. Wilson, & C. Somers. 2014. Low quality DNA affects double digest restriction associated DNA sequencing (ddRADSeq). Poster presentation at Genomics: The Power & the Promise, Ottawa, Canada.