| Title: |
Combining phenotypic similarity and network propagation to improve performance and clinical consistency of rare disease diagnosis |
| Authors: |
Chahdil, Maroua; Fabrizzi, Carolina; Hanauer, Marc; Lucano, Caterina; Rath, Ana; Lagorce, David; Tichit, Laurent |
| Contributors: |
Institut de Mathématiques de Marseille (I2M); Aix Marseille Université (AMU)-École Centrale de Marseille (ECM)-Centre National de la Recherche Scientifique (CNRS); Plateforme d'information et de services pour les maladies rares et les médicaments orphelins (Orphanet); Assistance publique - Hôpitaux de Paris (AP-HP)-Hôpital Broussais-Institut National de la Santé et de la Recherche Médicale (INSERM) |
| Source: |
https://hal.science/hal-05517071 ; 2026. |
| Publisher Information: |
CCSD |
| Publication Year: |
2026 |
| Collection: |
Aix-Marseille Université: HAL |
| Subject Terms: |
[INFO.INFO-BI]Computer Science [cs]/Bioinformatics [q-bio.QM]; [SDV.BIBS]Life Sciences [q-bio]/Quantitative Methods [q-bio.QM]; [SDV.GEN.GH]Life Sciences [q-bio]/Genetics/Human genetics |
| Description: |
Achieving timely diagnosis for rare diseases remains challenging owing to factors such as phenotypic heterogeneity and incomplete clinical data. While the Solve-RD project developed a phenotype-based gene prioritisation method, that approach did not account for the clinical consistency among related diseases in Orphanet's hierarchical classifications. We present a phenotype-based computational pipeline that ranks candidate ORPHAcodes according to patient phenotypes. The pipeline computes patient-disease similarity using asymmetric semantic aggregation of Human Phenotype Ontology terms, filtering subsumed terms and incorporating Orphanet frequency annotations. Evaluated on 139 expert-curated Solve-RD cases representing 78 distinct ORPHAcodes, our methodology outperformed the established Solve-RD baseline method, achieving a harmonic mean rank of 4.64 for confirmed diagnoses (versus 7.97) and retrieving the correct suspected rare disease within the top 10 positions for 39% of patients (versus 29%). We then explore a disease similarity network using Random Walk with Restart (RWR) to generate ranked candidate lists. Two complementary experiments demonstrate that RWR-ranked candidates exhibit improved clinical consistency, reflected by their proximity within the Orphanet nomenclature of rare diseases. This approach provides more interpretable and actionable differential diagnosis hypotheses to guide clinical decision-making. |
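The description names Random Walk with Restart as the propagation step but this record gives no parameters, so the following is only a minimal sketch of the generic technique, not the authors' implementation. The node labels `D1` to `D4`, the edge weights, the restart probability of 0.3, and the function name `random_walk_with_restart` are all hypothetical and chosen for illustration; how the actual disease similarity network is built and normalised is not stated here.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Generic RWR on a weighted graph given as {node: {neighbour: weight}}.

    Assumptions (not taken from the source): edge weights are non-negative
    similarities, transitions are proportional to edge weight, and the walker
    restarts uniformly over the seed nodes.
    """
    nodes = sorted(adjacency)
    total_weight = {u: sum(adjacency[u].values()) for u in nodes}

    # Restart distribution: uniform over the seed nodes.
    p0 = {n: 0.0 for n in nodes}
    for s in seeds:
        p0[s] = 1.0 / len(seeds)

    p = dict(p0)
    for _ in range(max_iter):
        # With probability `restart` the walker jumps back to a seed.
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            if total_weight[u] == 0:
                # Isolated node: send its mass back to the seeds.
                for n in nodes:
                    nxt[n] += (1 - restart) * p[u] * p0[n]
                continue
            # Otherwise it moves to a neighbour in proportion to edge weight.
            for v, w in adjacency[u].items():
                nxt[v] += (1 - restart) * p[u] * w / total_weight[u]
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Hypothetical toy network: a chain of four diseases with made-up similarities.
network = {
    "D1": {"D2": 0.9},
    "D2": {"D1": 0.9, "D3": 0.5},
    "D3": {"D2": 0.5, "D4": 0.2},
    "D4": {"D3": 0.2},
}

scores = random_walk_with_restart(network, seeds=["D1"])
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

In this toy chain the seed and its closest neighbours receive the highest steady-state scores, which is the intuition behind using propagation to favour candidates that lie near each other in the network.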
| Document Type: |
report |
| Language: |
English |
| Relation: |
MEDRXIV: 2026.02.15.26346357 |
| DOI: |
10.64898/2026.02.15.26346357 |
| Availability: |
https://hal.science/hal-05517071; https://hal.science/hal-05517071v1/document; https://hal.science/hal-05517071v1/file/2026.02.15.26346357v1.full.pdf; https://doi.org/10.64898/2026.02.15.26346357 |
| Rights: |
https://creativecommons.org/licenses/by/4.0/ ; info:eu-repo/semantics/OpenAccess |
| Accession Number: |
edsbas.63BE5F94 |
| Database: |
BASE |