SampleExplorer: using language models to discover relevant transcriptome data

Title: SampleExplorer: using language models to discover relevant transcriptome data
Authors: Chin, Wee Loong; Lassmann, Timo
Contributors: Birol, Inanc; Stan Perron Foundation
Source: Bioinformatics ; volume 41, issue 1 ; ISSN 1367-4811
Publisher Information: Oxford University Press (OUP)
Publication Year: 2024
Description: Motivation: Over the last two decades, transcriptomics has become a standard technique in biomedical research. We now have large databases of RNA-seq data, accompanied by valuable metadata detailing scientific objectives and the experimental procedures used. The metadata is crucial for understanding and replicating published studies, but so far it has been underutilized in helping researchers discover existing datasets. Results: We present SampleExplorer, a tool that allows researchers to search for relevant data using both text and gene set queries. SampleExplorer embeds sample metadata and uses a transformer-based language model to retrieve similar datasets. Extensive benchmarking (see Supplementary Materials and Methods) using the ARCHS4 database demonstrates that SampleExplorer provides an effective approach for retrieving biologically relevant samples from large-scale transcriptomic data. This tool provides an efficient approach for discovering relevant gene expression datasets in large public repositories. It improves sample and dataset identification across diverse experimental contexts, helping researchers leverage existing transcriptomic data for potential replication or verification studies. Availability and implementation: SampleExplorer is a Python package compatible with Python versions 3.9 to 3.11 and can be installed from the Python Package Index (PyPI). The codebase and documentation are accessible at https://github.com/wlchin/SampleExplorer. Supplementary data (Supplementary Materials and Methods) provide detailed methodological information, including an algorithmic description of the retrieval process and data preparation steps.
Document Type: article in journal/newspaper
Language: English
DOI: 10.1093/bioinformatics/btae759
Availability: https://doi.org/10.1093/bioinformatics/btae759; https://academic.oup.com/bioinformatics/advance-article-pdf/doi/10.1093/bioinformatics/btae759/61296381/btae759.pdf; https://academic.oup.com/bioinformatics/article-pdf/41/1/btae759/61296381/btae759.pdf
Rights: https://creativecommons.org/licenses/by/4.0/
Accession Number: edsbas.951A80F3
Database: BASE
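
Illustrative note: the description above says that SampleExplorer embeds sample metadata and retrieves similar datasets for a text query. The general idea of embedding-based retrieval can be sketched as follows. This is a minimal sketch and not the SampleExplorer API or the authors' method: the bag-of-words `embed` function is a toy stand-in for a transformer-based language model, and the names `embed`, `cosine`, `search`, and the `SAMPLES` records are all hypothetical and invented for illustration.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for a language-model embedding: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, metadata, top_k=2):
    """Rank samples by similarity between the query and each metadata text."""
    q = embed(query)
    scored = [(cosine(q, embed(text)), sample_id)
              for sample_id, text in metadata.items()]
    # Highest similarity first; ties broken by sample id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(sample_id, score) for score, sample_id in scored[:top_k]]


# Hypothetical metadata records, invented for this example.
SAMPLES = {
    "sample_A": "RNA-seq of mouse lung tumour after checkpoint blockade immunotherapy",
    "sample_B": "Time course of zebrafish heart regeneration",
    "sample_C": "Human melanoma biopsies before and after immunotherapy",
}

results = search("tumour response to immunotherapy", SAMPLES)
for sample_id, score in results:
    print(sample_id, round(score, 3))
```

In a real system the count vectors would be replaced by dense vectors from a language model, so that related terms (for example "tumour" and "melanoma") score as similar even without shared words.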