| Description: |
European habitats are classified under a framework developed by the European Topic Centre for Biodiversity for the European Environment Agency, as part of the European Nature Information System (EUNIS) (Davies et al. 2004). All terrestrial, freshwater, and marine habitats follow a hierarchical classification based on physical features, human influence, and dominant vegetation (Moss 2008, Chytrý et al. 2020). Habitat distribution maps are provided, modelled from occurrence data of indicator species collected in vegetation surveys (Hennekens 2017). Although the system may seem accurate, when we first plotted the distribution of the main species of our case-study habitat, EUNIS Habitat S22 ‘Alpine and subalpine ericoid heath’ (European Environment Agency 2019), we observed that occurrence data from sources such as the Global Biodiversity Information Facility (GBIF) often fell outside the mapped areas of the habitat. Furthermore, important occurrence data sources, such as herbaria, were left out of the official distribution mapping, which in our view is a significant shortcoming of the EUNIS system. This study addresses these gaps by integrating diverse sources of in situ occurrence data (herbaria, vegetation surveys, citizen science) through a machine learning approach to complement the current EUNIS mapping. Specifically, we modelled the distributions of the diagnostic species of Habitat S22 using species distribution models (SDMs). For this purpose, we retrieved occurrence data from GBIF, searching by accepted names as well as taxonomic synonyms, using the R package rgbif (Chamberlain et al. 2025) and following the Darwin Core standard (Wieczorek et al. 2012). Data were filtered to include European occurrences with spatial coordinates and uncertainty of |