Today we say farewell to Arjun Sharma & Suman Virdee. Arjun joined the lab as a second-year volunteer, staying to perform a Biochem 3R06 research project. He has been very active in our Comprehensive Antibiotic Resistance Database project, co-developing our CARD*Shark text mining tools for computer-guided curation of literature in PubMed, pipelines for our clinical isolate genome sequencing work, and developing novel algorithms for predicting glycopeptide resistance from genome assemblies. He was the recipient of an IIDR Summer Student Fellowship and leaves the Biochemistry program to enter medical school at the University of Toronto. Suman joined the lab in the 4th year of the Biomedical Discovery & Commercialization program, performing her thesis research on RNA-Seq bioinformatics workflows in a collaboration between our lab and the laboratory of Dr. Kristen Hope (McMaster Stem Cell and Cancer Research Institute), extending her research into the summer by winning a CIHR Summer Undergraduate Research Award. Suman finished her degree and this September starts in the McMaster Master of Science in Global Health program. Bonne chance Suman & Arjun!



We haven’t been travelling much this year, but our collaborators have been busy!

Dearborn, D.C., A.B. Gager, A.G. McArthur, M.E. Gilmour, E. Mandzhukova, R.A. Mauck. 2017. How to get diverse MHC genotypes without disassortative mating. Presentation at the 2017 Annual Meeting of the Society for Integrative and Comparative Biology, New Orleans, Louisiana.

McLean, M., D. Theriault, M. Kelley, B.A. Lago, A.G. McArthur, & L. Williams. 2017. Role of Nfe2 and pro-oxidant exposure in inner ear development in zebrafish. Presentation at the Society of Toxicology 56th Annual Meeting, Baltimore, Maryland.

Williams, L.M., B.A. Lago, A.G. McArthur, A.R. Raphenya, N. Pray, N. Saleem, S. Salas, K. Paulson, R.S. Mangar, Y. Liu, A.H. Vo, & J.A. Shavit. 2017. The transcription factor, Nuclear factor, erythroid 2 (Nfe2), is a regulator of the oxidative stress response during Danio rerio development. Presentation at the Society of Toxicology 56th Annual Meeting, Baltimore, Maryland.

Winsor, G.L., C. Bertelli, K.K. Tsang, B. Alcock, A.G. McArthur, & F.S.L. Brinkman. 2017. Pseudomonas Genome Database 2017: Improved gene/AMR/VF/genomic island annotations, comparative genome analyses, and a platform for facilitating public health genomic epidemiology. Presentation at the 16th International Conference on Pseudomonas, Liverpool, United Kingdom.


McArthur AG & Tsang KK.

Ann N Y Acad Sci. 2017 Jan;1388(1):78-91.

The loss of effective antimicrobials is reducing our ability to protect the global population from infectious disease. However, the field of antibiotic drug discovery and the public health monitoring of antimicrobial resistance (AMR) is beginning to exploit the power of genome and metagenome sequencing. The creation of novel AMR bioinformatics tools and databases and their continued development will advance our understanding of the molecular mechanisms and threat severity of antibiotic resistance, while simultaneously improving our ability to accurately predict and screen for antibiotic resistance genes within environmental, agricultural, and clinical settings. To do so, efforts must be focused toward exploiting the advancements of genome sequencing and information technology. Currently, AMR bioinformatics software and databases reflect different scopes and functions, each with its own strengths and weaknesses. A review of the available tools reveals common approaches and reference data but also reveals gaps in our curated data, models, algorithms, and data-sharing tools that must be addressed to conquer the limitations and areas of unmet need within the AMR research field before DNA sequencing can be fully exploited for AMR surveillance and improved clinical outcomes.

Jia B, Raphenya AR, Alcock B, Waglechner N, Guo P, Tsang KK, Lago BA, Dave BM, Pereira S, Sharma AN, Doshi S, Courtot M, Lo R, Williams LE, Frye JG, Elsayegh T, Sardar D, Westman EL, Pawlowski AC, Johnson TA, Brinkman FS, Wright GD, & McArthur AG.

Nucleic Acids Res. 2017 Jan 4;45(D1):D566-D573.

The Comprehensive Antibiotic Resistance Database (CARD) is a manually curated resource containing high quality reference data on the molecular basis of antimicrobial resistance (AMR), with an emphasis on the genes, proteins and mutations involved in AMR. CARD is ontologically structured, model centric, and spans the breadth of AMR drug classes and resistance mechanisms, including intrinsic, mutation-driven and acquired resistance. It is built upon the Antibiotic Resistance Ontology (ARO), a custom built, interconnected and hierarchical controlled vocabulary allowing advanced data sharing and organization. Its design allows the development of novel genome analysis tools, such as the Resistance Gene Identifier (RGI) for resistome prediction from raw genome sequence. Recent improvements include extensive curation of additional reference sequences and mutations, development of a unique Model Ontology and accompanying AMR detection models to power sequence analysis, new visualization tools, and expansion of the RGI for detection of emergent AMR threats. CARD curation is updated monthly based on an interplay of manual literature curation, computational text mining, and genome analysis.

Three of our undergraduate students gave poster presentations in the last month. Arjun Sharma (Biochemistry & Biomedical Sciences 3rd year) outlined his work on developing the Comprehensive Antibiotic Resistance Database’s CARD*Shark text mining algorithms at the Michael G. DeGroote Institute for Infectious Disease Research (IIDR) Trainee Day, while Kirill Pankov (Biomedical Discovery & Commercialization 4th year) presented the results of his summer NSERC Undergraduate Student Research Award (USRA) research in the Laboratory of Dr. Joanna Wilson into the origin of Cnidarian P450 enzymes, work he is continuing in our lab as part of his thesis research. Mohammad Khan (Biomedical Discovery & Commercialization 4th year), a thesis student in the Laboratory of Dr. Eric Brown that collaborates with our group, also presented a poster at IIDR Trainee Day on his work on chemical-genetic interaction database design.

Sharma, A.N., S. Doshi, A.R. Raphenya, B. Alcock, B.M. Dave, B.A. Lago, K.K. Tsang, & A.G. McArthur. 2016. CARDShark: Computer-assisted biocuration of the Comprehensive Antibiotic Resistance Database. Poster presentation at the 2016 Michael G. DeGroote Institute for Infectious Disease Research (IIDR) Trainee Day, Hamilton, Ontario, Canada.

Pankov, K., A.G. McArthur & J.Y. Wilson. 2016. The Cytochrome P450 (CYP) superfamily in the Cnidarian phylum. Poster presentation at the 2016 Undergraduate Student Research Awards (USRA) Poster Session, Hamilton, Ontario, Canada.

Khan, M.A., S. French, B. Aubie, A.G. McArthur & E.D. Brown. 2016. Challenging common screening filters through analysis of a chemical-genetic screening database. Poster presentation at the 2016 Michael G. DeGroote Institute for Infectious Disease Research (IIDR) Trainee Day, Hamilton, Ontario, Canada.

A cross-national research consortium co-led by McMaster’s Andrew McArthur is receiving two of 16 federal grants to further develop a big data solution to the growing problem of antimicrobial resistance (AMR). The government’s investment, totaling more than $4M, is the result of Genome Canada’s 2015 Bioinformatics and Computational Biology Competition, a partnership with the Canadian Institutes of Health Research (CIHR). McArthur and his colleagues will receive $500,000 over two years. McArthur will work closely with researchers from the University of British Columbia, Simon Fraser University, Dalhousie University and the Public Health Agency of Canada to design and develop novel software and database systems that will empower public health agencies and the agri-food sector to rapidly respond to threats posed by infectious disease outbreaks and food-borne illnesses.

Full Coverage: Faculty of Health Sciences, Genome Canada, Newswire, Hamilton Spectator

McArthur, A.G., B. Jia, A.R. Raphenya, P. Guo, K. Tsang, B. Dave, B. Alcock, B. Lago, N. Waglechner, & G.D. Wright. 2016. The Comprehensive Antibiotic Resistance Database – A Platform for Antimicrobial Resistance Surveillance. Invited presentation at the 2nd Conference Rapid Microbial NGS and Bioinformatics: Translation Into Practice, Hamburg, Germany.

Antimicrobial resistance (AMR) is among the most pressing public health crises of the 21st Century. Despite the importance of resistance to health, this field has been slow to take advantage of genome scale tools. Phenotype based criteria dominate the epidemiology of antibiotic action and effectiveness. There is a poor understanding of which antibiotic resistance genes are in circulation, which are a threat, and how clinicians and public health workers can manage the crisis of resistance. However, DNA sequencing is rapidly decreasing in cost and as such we are on the cusp of an age of high-throughput molecular epidemiology. What are needed are tools for rapid, accurate analysis of DNA sequence data for the genetic underpinnings of antibiotic resistance. In an effort to address this problem, we have created the Comprehensive Antibiotic Resistance Database (CARD). This database is a rigorously curated collection of known antibiotics, targets, and resistance determinants. It integrates disparate molecular and sequence data, provides a unique organizing principle in the form of the Antibiotic Resistance Ontology (ARO), and can quickly identify putative antibiotic resistance genes in raw genome sequences using the novel Resistance Gene Identifier (RGI). Here we review the current state of the CARD, particularly recent advances in the curation of resistance determinants and the structure of the ARO. We will also present our plans for development of semi- and fully-automated text mining algorithms for curation of broader AMR data, construction of meta-models for improved AMR phenotype prediction, and release of portable command-line genome analysis tools.


Brian Alcock has joined the McArthur Lab to lead curation of the Comprehensive Antibiotic Resistance Database (CARD). Brian recently completed his MSc in the laboratory of Dr. Ben Evans in McMaster’s Biology Department. Welcome Brian!

Zachary Lin – Adapting Galaxy bioinformatics to outbreak-associated Clostridium difficile

The completion of the Human Genome Project in 2001 sparked the beginning of a sequencing revolution with applications that are only now being realized by researchers. The decreasing cost of DNA sequencing has ignited a continuous generation of genomic data, with a limited number of researchers able to manipulate the output. Consequently, the demand to examine this genetic information has forced bioinformaticians to improve the analytical tools involved in sequence analysis. Galaxy is a user-friendly analytical platform where researchers without a computational background can navigate their way through an investigation and use various analytical tools and workflows to assist them with their genomic research (1). Galaxy enables individual users to add novel software to the environment, filling gaps left by tools that haven’t been created by the Galaxy team. This project will focus on a particular analytical gap concerning tools related to antibiotic resistance, phylogenetics, and bacterial virulence. Currently, the software proposed for adaptation to the Galaxy setting includes the Resistance Gene Identifier (RGI) associated with the Comprehensive Antibiotic Resistance Database (CARD) (2), a single nucleotide polymorphism identifier (BANSP), and novel virulence factor identification software associated with the Virulence Factor Database (VFDB) (3). The combination of Galaxy’s existing ToolShed and these unique additions will create a comprehensive analytical environment that can be applied to realistic situations. One such situation that this project will concentrate on is outbreaks of Clostridium difficile (C. diff) in the health care system.

Kara Tsang – Expansion of the Antibiotic Resistance Ontology for the ESKAPE pathogens

The loss of effective antimicrobials is reducing the ability to protect the global population from infectious diseases, leading to profound impacts on the healthcare system, international trade, agriculture, and environment. The field of antibiotic drug discovery and the monitoring of the dynamic and new antibiotic resistance elements have yet to fully exploit the power of the genome revolution. The curation and directed development of the Comprehensive Antibiotic Resistance Database (CARD) will advance the understanding of the genetics, genomics, and threat severity of antibiotic resistance, while simultaneously improving its ability to accurately predict and screen for antibiotic resistance genes within raw genomes. Strategically advancing the Antibiotic Resistance Ontology (ARO), the unique organizing principle of the CARD, allows the value of big data in disparate realms of research to be used and understood by the multidisciplinary efforts working to combat the emergence and prevalence of the ESKAPE pathogens, a critical driving force of the global health crisis.


