Latest revision as of 17:21, 18 December 2020
DARG
(Detection of Antibiotic Resistant Genes)
About
Antibiotic resistance is a major threat to mankind, with an estimated 10 million deaths annually by the year 2050. Acinetobacter baumannii is the most critical priority pathogen and is resistant to most of the antibiotics tested (approximately >70%), with the exception of colistin (ICMR Report: https://main.icmr.nic.in/sites/default/files/reports/annual_report_amr_jan2017-18.pdf). Most efforts to overcome the resistance of A. baumannii have focused on understanding biological mechanisms through wet-lab analysis. However, the availability of public datasets makes it crucial to also carry out analysis using computational tools, especially machine learning algorithms.
Brief Description
We have developed a machine learning approach called DARG (Detection of Antibiotic Resistant Genes), which applies a class of machine learning algorithms called Support Vector Machines (SVMs) [1] to the allele-strain database PATRIC [2]. The approach is inspired by previous work on Mycobacterium tuberculosis [3] and on Escherichia coli, Pseudomonas aeruginosa, and Staphylococcus aureus [4]. The genomes of A. baumannii (the target pathogen) strains are annotated using the Prokka software [5] to build a pan-genome. We used SVMs to predict the phenotype of each strain from the genes present in it, and interpreted the trained SVM model to obtain the weights assigned to different alleles when predicting the resistance phenotype. We then selected the top alleles based on these weights and performed correlation and mutation analysis to assess the impact of mutations on the resistance phenotype of strains.
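The weight-based ranking described above can be illustrated with a minimal, dependency-free sketch. It is not the DARG implementation: the strain names, allele names and phenotypes are hypothetical toy data, the linear SVM is trained with a plain full-batch subgradient method, and the columns are mean-centred as a convenience for this balanced toy example. In practice a library implementation such as sklearn's LinearSVC would likely take the place of `train_linear_svm`.

```python
# Hypothetical toy data: which alleles were annotated in each strain.
strain_alleles = {
    "strain_1": {"allele_A"},
    "strain_2": {"allele_A", "allele_B"},
    "strain_3": {"allele_A", "allele_C"},
    "strain_4": {"allele_A", "allele_B", "allele_C"},
    "strain_5": set(),
    "strain_6": {"allele_B"},
    "strain_7": {"allele_C"},
    "strain_8": {"allele_B", "allele_C"},
}
# Hypothetical phenotypes: +1 = resistant, -1 = susceptible.
phenotype = {"strain_1": 1, "strain_2": 1, "strain_3": 1, "strain_4": 1,
             "strain_5": -1, "strain_6": -1, "strain_7": -1, "strain_8": -1}


def presence_matrix(strains, alleles):
    """Rows are strains, columns are alleles; 1.0 = present, 0.0 = absent."""
    return [[1.0 if a in strain_alleles[s] else 0.0 for a in alleles]
            for s in strains]


def centre_columns(X):
    """Subtract each column mean (an assumption made for this toy example)."""
    n = len(X)
    means = [sum(col) / n for col in zip(*X)]
    return [[x - m for x, m in zip(row, means)] for row in X]


def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Full-batch subgradient descent on the L2-regularised hinge loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [lam * wj for wj in w], 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1.0:  # strain is inside the margin or misclassified
                for j, xj in enumerate(xi):
                    grad_w[j] -= yi * xj / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def rank_alleles(alleles, w):
    """Sort alleles by the absolute value of their SVM weight, largest first."""
    return sorted(zip(alleles, w), key=lambda pair: abs(pair[1]), reverse=True)


alleles = ["allele_A", "allele_B", "allele_C"]
strains = list(strain_alleles)
X = centre_columns(presence_matrix(strains, alleles))
y = [phenotype[s] for s in strains]
weights, bias = train_linear_svm(X, y)
ranked = rank_alleles(alleles, weights)
for name, weight in ranked:
    print(f"{name}: {weight:+.3f}")
```

In this toy data only allele_A separates resistant from susceptible strains, so it receives the largest absolute weight, while the other two alleles, which are spread evenly across both classes, stay at zero.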
Requirements
- A system with Python and the necessary libraries: numpy, pandas, matplotlib and sklearn
- Ability to run and use Prokka software
- Knowledge of PATRIC database
- Basic understanding of Support Vector Machines and their working
- For more information, see our GitHub repository: https://github.com/TEAM-IGEM-IIT-ROORKEE/machine-learning-darg
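As a quick way to confirm the Python requirements above, the following sketch reports which of the listed libraries cannot be found in the current environment. The helper name `missing_packages` is hypothetical; `sklearn` is the import name of the scikit-learn package. Prokka is a separate command-line tool and is not covered by this check.

```python
import importlib.util

# Import names of the libraries listed in the requirements.
REQUIRED = ["numpy", "pandas", "matplotlib", "sklearn"]


def missing_packages(names):
    """Return the import names that cannot be located in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
print("Missing packages:", ", ".join(missing) if missing else "none")
```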
Brief Results
Implementation of our approach yielded three key results.
- A list of top genes conferring resistance in A. baumannii strains to several well-known antibiotics
- Correlation analysis between the top alleles, to understand how a mutation in one gene relates to mutations in another
- Mutational analysis of how mutations in genes affect the resistance phenotype of a strain
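The two analyses in the list above can be sketched on toy data as follows. This is one simple reading of them, not the DARG code: correlation is taken as the Pearson correlation between presence/absence columns of two alleles, and the mutational effect is taken as the fraction of resistant strains with and without a given allele. All allele names and values are hypothetical.

```python
from itertools import combinations
from math import sqrt

# Hypothetical presence/absence columns for six strains (1 = allele present).
columns = {
    "allele_A": [1, 1, 1, 0, 0, 0],
    "allele_B": [1, 1, 1, 0, 0, 0],  # always co-occurs with allele_A
    "allele_C": [0, 0, 0, 1, 1, 1],  # never co-occurs with allele_A
}
# Hypothetical phenotypes for the same six strains (1 = resistant).
resistant = [1, 1, 0, 0, 0, 1]


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return float("nan")  # undefined when a column is constant
    return cov / sqrt(vx * vy)


def resistant_fraction(presence, labels, state):
    """Fraction of resistant strains among those whose allele state equals `state`."""
    group = [r for p, r in zip(presence, labels) if p == state]
    return sum(group) / len(group) if group else float("nan")


# Pairwise correlation between alleles.
for a, b in combinations(columns, 2):
    print(f"{a} vs {b}: r = {pearson(columns[a], columns[b]):+.2f}")

# Resistance rate with and without each allele.
for name, col in columns.items():
    with_allele = resistant_fraction(col, resistant, 1)
    without_allele = resistant_fraction(col, resistant, 0)
    print(f"{name}: resistant with = {with_allele:.2f}, without = {without_allele:.2f}")
```

A strongly positive correlation suggests two alleles tend to occur together, and a large gap between the "with" and "without" fractions suggests an allele is associated with the resistance phenotype. Neither number establishes causation on its own.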
References
1. Cortes, C., Vapnik, V. 1995. Support-Vector Networks. Machine Learning, 20, pp.273-297
2. Wattam, A. R. et al. 2013. PATRIC, the bacterial bioinformatics database and analysis resource. Nucleic Acids Research, 42 (Database issue), pp.D581-D591
3. Kavvas, E. S. et al. 2018. Machine learning and structural analysis of Mycobacterium tuberculosis pan-genome identifies genetic signatures of antibiotic resistance. Nature Communications, 9(4306)
4. Hyun, J. C. et al. 2020. Machine learning with random subspace ensembles identifies antimicrobial resistance determinants from pan-genomes of three pathogens. PLOS Computational Biology, 16(3), pp.e1007608
5. Seemann, T. 2014. Prokka: rapid prokaryotic genome annotation. Bioinformatics, 30(14), pp.2068-2069