The structural context of mutations in proteins predicts their effect on antibiotic resistance.
| Title: | The structural context of mutations in proteins predicts their effect on antibiotic resistance. |
|---|---|
| Authors: | Green AG; Department of Biomedical Informatics, Harvard Medical School, 25 Shattuck St., Boston, MA, 02115, USA.; Manning College of Information and Computer Sciences, University of Massachusetts, 140 Governors Dr., Amherst, MA, USA.; Tasmin M; Manning College of Information and Computer Sciences, University of Massachusetts, 140 Governors Dr., Amherst, MA, USA.; Vargas R Jr; Department of Biomedical Informatics, Harvard Medical School, 25 Shattuck St., Boston, MA, 02115, USA.; Farhat MR; Department of Biomedical Informatics, Harvard Medical School, 25 Shattuck St., Boston, MA, 02115, USA.; Division of Pulmonary & Critical Care, Massachusetts General Hospital, 55 Fruit St, Boston, MA, 02114, USA. |
| Source: | BioRxiv : the preprint server for biology [bioRxiv] 2025 Sep 25. Date of Electronic Publication: 2025 Sep 25. |
| Publication Type: | Journal Article; Preprint |
| Language: | English |
| Journal Info: | Country of Publication: United States NLM ID: 101680187 Publication Model: Electronic Cited Medium: Internet ISSN: 2692-8205 (Electronic) Linking ISSN: 26928205 NLM ISO Abbreviation: bioRxiv Subsets: PubMed not MEDLINE |
| Abstract: | In Mycobacterium tuberculosis, a prevalent and deadly pathogen, resistance to antibiotics evolves primarily through non-synonymous mutations in proteins. Sequence-based analyses are currently used to understand the genetic basis of antibiotic resistance, either via genotype-phenotype association or via signals of convergent evolution. These methods focus on primary sequence and usually neglect other biological signals such as protein structural information. We hypothesize that integrating the structural context of mutations improves the prediction of effects on function and phenotype. We curate high-confidence structural annotations for the M. tuberculosis proteome from 1,371 crystallographic structures and 2,316 AlphaFold predictions, and combine the structures with mutations from over 31,000 clinical M. tuberculosis isolates. We demonstrate that mutations in proteins known to cause resistance are clustered in 3D space, even in proteins where inactivating mutations at any position are thought to cause resistance. We develop a statistic to search the M. tuberculosis proteome for a signal of clustered non-synonymous mutations, finding over 450 proteins that display this signal, many of which have a known relationship with antibiotic resistance. We also show that a supervised classifier trained on structure features alone achieves an F1 score of 94.6% when classifying mutations as resistance-conferring. This work demonstrates that protein structure provides useful information for categorizing which variants may cause antibiotic resistance, even when the majority of structures are AI-predicted. |
| Competing Interests: | Conflict of interest statement. All authors declare no competing interests. |
| Grant Information: | F32 AI161793 United States AI NIAID NIH HHS; S10 RR028832 United States RR NCRR NIH HHS |
| Entry Date(s): | Date Created: 20251003 Date Completed: 20251013 Latest Revision: 20251013 |
| Update Code: | 20260130 |
| PubMed Central ID: | PMC12485870 |
| DOI: | 10.1101/2025.09.23.676583 |
| PMID: | 41040149 |
| Database: | MEDLINE |
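
The abstract describes a statistic for detecting non-synonymous mutations that cluster in 3D space, but this record does not define it. The sketch below is one generic way to test for such clustering, not the authors' statistic: it compares the mean pairwise distance between mutated residues against random draws of the same number of residues from the same structure. The function names, the use of one coordinate per residue (for example the C-alpha atom), and the toy straight-line "protein" are all illustrative assumptions.

```python
import math
import random
from itertools import combinations


def mean_pairwise_distance(coords):
    """Mean Euclidean distance over all pairs of 3D points."""
    pairs = list(combinations(coords, 2))
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)


def clustering_p_value(residue_coords, mutated_positions,
                       n_permutations=1000, seed=0):
    """Permutation test for spatial clustering of mutated residues.

    residue_coords: dict mapping residue position -> (x, y, z), assumed
        to hold one representative atom per residue.
    mutated_positions: positions carrying non-synonymous mutations
        (at least two are needed to form a pair).
    Returns the observed mean pairwise distance and an empirical
    p-value: the fraction of random residue sets of the same size
    that are at least as compact as the observed set.
    """
    rng = random.Random(seed)
    positions = list(residue_coords)
    k = len(mutated_positions)
    observed = mean_pairwise_distance(
        [residue_coords[p] for p in mutated_positions]
    )
    hits = 0
    for _ in range(n_permutations):
        sample = rng.sample(positions, k)
        null = mean_pairwise_distance([residue_coords[p] for p in sample])
        if null <= observed:
            hits += 1
    # Add-one correction keeps the empirical p-value strictly positive.
    return observed, (hits + 1) / (n_permutations + 1)


# Toy example (hypothetical data): 50 residues on a straight line,
# 3.8 angstroms apart, with four adjacent mutated positions.
toy_coords = {i: (3.8 * i, 0.0, 0.0) for i in range(50)}
toy_mutated = [10, 11, 12, 13]
observed, p_value = clustering_p_value(toy_coords, toy_mutated)
```

A small p-value here only says the mutated residues sit closer together than random residue sets of equal size. A real analysis would also need to account for protein length, mutation counts per site, and multiple testing across the proteome.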