| Description: |
MARISMa is a large-scale, curated dataset of Matrix-Assisted Laser Desorption/Ionization Time-of-Flight Mass Spectrometry (MALDI-TOF MS) spectra, comprising 202,700 unique spectra from isolates collected between 2018 and 2024 at the Hospital General Universitario Gregorio Marañón (Madrid, Spain). The dataset covers 1,148 bacterial and fungal species. Spectra were acquired using the Bruker Daltonics BT Smart MALDI Biotyper (Bremen, Germany), and MARISMa includes both the raw mass spectra and the corresponding metadata. Additionally, we provide antimicrobial resistance (AMR) annotations for 29,679 unique isolates, with susceptibility results for up to 78 antibiotics across 220 microbial species. Antimicrobial susceptibility testing (AST) results are reported as minimum inhibitory concentration (MIC) values, as categorical interpretations (R: Resistant, S: Susceptible, I: Intermediate), or both. The dataset is organized hierarchically by year of collection (2018–2024), then by genus, and then by species, with the raw MALDI-TOF MS spectra stored per isolate. Within each species folder, data are grouped by isolate identifier, and each isolate folder may contain multiple subfolders corresponding to biological replicates. If a biological replicate folder contains multiple numerically named subfolders, these are treated as technical replicates. This structure enables flexible access to and analysis of spectra across taxonomic levels and experimental conditions. MARISMa is designed to complement existing public resources and to support reproducible research in computational microbiology, especially the development and evaluation of machine learning models that predict AMR directly from MALDI-TOF spectra. |
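
The folder hierarchy described above (year, genus, species, isolate, biological replicate, numerically named technical replicates) can be indexed with a short directory walk. The following is a minimal sketch, not an official loader: the names `ReplicateEntry` and `iter_replicates` are hypothetical, and it assumes that every level is a plain directory and that a numerically named subfolder is one whose name consists only of decimal digits. It does not read the spectra files themselves, since their internal layout is not described here.

```python
from pathlib import Path
from typing import Iterator, List, NamedTuple, Tuple


class ReplicateEntry(NamedTuple):
    """One biological replicate folder and its position in the hierarchy."""
    year: str
    genus: str
    species: str
    isolate_id: str
    biological_replicate: str
    numeric_folders: Tuple[str, ...]  # numerically named subfolders, in numeric order

    @property
    def has_technical_replicates(self) -> bool:
        # Per the description, *multiple* numeric subfolders are technical replicates.
        return len(self.numeric_folders) > 1


def _subdirs(path: Path) -> List[Path]:
    """Return the immediate subdirectories of `path`, sorted by name."""
    return sorted(p for p in path.iterdir() if p.is_dir())


def iter_replicates(root: Path) -> Iterator[ReplicateEntry]:
    """Walk root/year/genus/species/isolate/biological_replicate (assumed layout)."""
    for year_dir in _subdirs(root):
        for genus_dir in _subdirs(year_dir):
            for species_dir in _subdirs(genus_dir):
                for isolate_dir in _subdirs(species_dir):
                    for bio_dir in _subdirs(isolate_dir):
                        numeric = sorted(
                            (d.name for d in _subdirs(bio_dir) if d.name.isdecimal()),
                            key=int,
                        )
                        yield ReplicateEntry(
                            year=year_dir.name,
                            genus=genus_dir.name,
                            species=species_dir.name,
                            isolate_id=isolate_dir.name,
                            biological_replicate=bio_dir.name,
                            numeric_folders=tuple(numeric),
                        )
```

A typical use would be `entries = list(iter_replicates(Path("path/to/MARISMa")))`, where the path is a placeholder for the local copy of the dataset, followed by filtering on `entry.species` or `entry.has_technical_replicates`. Yielding one record per biological replicate keeps the index flat, so it can be loaded into a table and joined with the AMR annotations by isolate identifier.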