Team:IIT Roorkee/ML Overview

Revision as of 11:42, 26 October 2020

PYOMANCER

Overview

DARG
(Detection of Antibiotic Resistant Genes)



Inspiration: https://2018.igem.org/Team:Paris_Bettencourt/AMPDesigner_Overview.



About

Antibiotic resistance is a major threat to mankind, with an estimated 10 million deaths annually projected by the year 2050. Acinetobacter baumannii is the most critical priority pathogen, with 77% of its strains in India being carbapenem-resistant as per the ICMR Report, 2017. Most efforts to overcome resistance in A. baumannii have focused on understanding its biological mechanisms through wet-lab analysis. However, the availability of public datasets makes it crucial to also carry out analysis using computational tools, especially machine learning algorithms.





Brief Description

We have developed a machine learning approach called DARG (Detection of Antibiotic Resistant Genes), which applies a class of machine learning algorithms called Support Vector Machines (SVMs) [1] to the allele-strain data in the PATRIC database [2]. The approach is inspired by previous work on Mycobacterium tuberculosis [3] and on Escherichia coli, Pseudomonas aeruginosa, and Staphylococcus aureus [4]. The genomes of A. baumannii (the target pathogen) strains are annotated using the Prokka software [5] to build a pan-genome. We used SVMs to predict the resistance phenotype of each strain from the genes present in it, and then interpreted the trained SVM model to obtain the weight it assigns to each allele when predicting resistance. We then selected the top alleles based on these weights, performed correlation analysis on them, and analysed the impact of mutations on the resistance phenotype of strains.
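The weight-based ranking described above can be illustrated with a small sketch. This is not the DARG code itself: the allele names, the presence/absence matrix, and the labels are made up, and a tiny linear SVM trained by batch subgradient descent stands in for a library implementation so that the example has no dependencies. With scikit-learn, a comparable ranking could be read from the `coef_` attribute of a fitted `LinearSVC`.

```python
# Hypothetical toy data: rows are strains, columns are alleles (1 = present).
ALLELES = ["allele_A", "allele_B", "allele_C", "allele_D"]
X = [
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 1, 1, 1],
    [0, 0, 0, 1],
]
# +1 = resistant, -1 = susceptible. Made-up labels in which resistance
# tracks the presence of allele_A.
y = [1, 1, 1, 1, -1, -1, -1, -1]


def decision(w, b, x):
    """Linear SVM decision value w.x + b for one strain."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_linear_svm(X, y, lam=0.01, lr=0.05, epochs=1000):
    """Minimise (lam/2)*||w||^2 + mean hinge loss by batch subgradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]   # gradient of the L2 penalty
        grad_b = 0.0
        for xi, yi in zip(X, y):
            if yi * decision(w, b, xi) < 1:   # strain violates the margin
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def rank_alleles(w, names):
    """Sort alleles by the absolute value of their SVM weight, largest first."""
    return sorted(zip(names, w), key=lambda pair: abs(pair[1]), reverse=True)


w, b = train_linear_svm(X, y)
ranking = rank_alleles(w, ALLELES)
predictions = [1 if decision(w, b, xi) >= 0 else -1 for xi in X]

for name, weight in ranking:
    print(f"{name}: {weight:+.3f}")
```

In this toy setting the allele that separates resistant from susceptible strains receives the largest absolute weight. On real pan-genome data, the ranking should be validated on held-out strains before it is interpreted.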





Requirements

  • A system with Python and the necessary libraries: NumPy, pandas, Matplotlib and scikit-learn (sklearn)
  • Ability to run and use Prokka software
  • Knowledge of PATRIC database
  • Basic understanding of Support Vector Machines and their working
  • For more information, check our GitHub link
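As a quick way to check the software requirements listed above, a short script along the following lines may help. It is a sketch under two assumptions: the Python libraries are looked up by their import names, and the Prokka executable is expected on the PATH under its usual name, `prokka`. The helper `check_environment` is a hypothetical name introduced here for illustration.

```python
import importlib.util
import shutil

# Import names of the required Python libraries
# (scikit-learn is imported as "sklearn").
PYTHON_LIBS = ["numpy", "pandas", "matplotlib", "sklearn"]


def check_environment():
    """Return a dict mapping each requirement to True (found) or False."""
    status = {
        name: importlib.util.find_spec(name) is not None
        for name in PYTHON_LIBS
    }
    # Assumption: Prokka is installed as an executable called "prokka".
    status["prokka"] = shutil.which("prokka") is not None
    return status


status = check_environment()
for name, found in status.items():
    print(f"{name:12s} {'found' if found else 'MISSING'}")
```

The script only reports whether each tool can be located. It does not check versions.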





Brief Results

Implementation of our approach yielded three key results.

  • A list of top genes conferring resistance in A. baumannii strains to several well-known antibiotics
  • An epistatic and correlation analysis between top alleles, to understand the impact of a mutation in one gene on another
  • An analysis of the impact of mutations in genes on the resistance phenotype of strains
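The correlation part of the second result can be sketched as follows. The allele names and presence vectors are hypothetical, and the Pearson correlation of presence/absence vectors across strains is only one possible measure, chosen here for illustration. The sketch does not cover the epistatic analysis.

```python
from itertools import combinations
from math import sqrt


def pearson(u, v):
    """Pearson correlation of two equal-length numeric vectors."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    if var_u == 0 or var_v == 0:
        # Allele present in all strains or in none: correlation is
        # undefined, so this sketch reports 0.0 by convention.
        return 0.0
    return cov / sqrt(var_u * var_v)


# Hypothetical presence (1) / absence (0) of three top alleles in 8 strains.
presence = {
    "allele_A": [1, 1, 1, 1, 0, 0, 0, 0],
    "allele_B": [1, 1, 1, 0, 0, 0, 0, 1],
    "allele_C": [0, 0, 0, 0, 1, 1, 1, 1],
}

# Correlation for every unordered pair of alleles.
pairwise = {
    (a, b): pearson(presence[a], presence[b])
    for a, b in combinations(presence, 2)
}

for (a, b), r in pairwise.items():
    print(f"{a} vs {b}: r = {r:+.2f}")
```

A strongly negative value, as for the hypothetical `allele_A` and `allele_C`, indicates alleles that rarely occur in the same strain. A positive value indicates alleles that tend to occur together.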





References

  1. Support-Vector Networks, Cortes C., Vapnik V., Machine Learning, 20, 273-297 (1995), DOI: 10.1007/BF00994018
  2. PATRIC, the bacterial bioinformatics database and analysis resource, Wattam A. R. et al., Nucleic Acids Research, 42 (Database issue), D581-D591 (2013), DOI: 10.1093/nar/gkt1099
  3. Machine learning and structural analysis of Mycobacterium tuberculosis pan-genome identifies genetic signatures of antibiotic resistance, Kavvas E. S. et al., Nature Communications, 9:4306 (2018), DOI: 10.1038/s41467-018-06634-y
  4. Machine learning with random subspace ensembles identifies antimicrobial resistance determinants from pan-genomes of three pathogens, Hyun J. C. et al., PLOS Computational Biology, 16(3): e1007608 (2020), DOI: 10.1371/journal.pcbi.1007608
  5. Prokka: rapid prokaryotic genome annotation, Seemann T., Bioinformatics, 30(14), 2068-2069 (2014), DOI: 10.1093/bioinformatics/btu153