
Dataset Discovery using Semantic Matching

Title: Dataset Discovery using Semantic Matching
Authors: Khwaileh, Enas; Velegrakis, Yannis; Sub Data Intensive Systems
Publication Year: 2025
Subject Terms: Information Systems; Software; Computer Science Applications
Description: The exponential growth of data sizes and heterogeneity has made it increasingly challenging to identify datasets that meet specific analytical needs. Traditional keyword search methods often fail at this task because they cannot fully capture the semantics of the datasets and match them to those of the query. We introduce a novel dataset discovery method that significantly enhances both accuracy and retrieval speed. By employing advanced semantic matching at the individual field level and leveraging clustering and dimensionality reduction techniques, our method efficiently and effectively retrieves the datasets related to a query. Unlike traditional methods that focus on syntactic matches, our approach uncovers deeper semantic relationships within table data, providing more precise and relevant results. It achieves this by using transformers to generate and work with embeddings instead of the actual values. We present three different search methods that utilize these embeddings, and experimentally demonstrate the improvement achieved over the state of the art.
Document Type: book part
File Description: application/pdf
Language: English
ISSN: 2367-2005
Relation: https://dspace.library.uu.nl/handle/1874/482923
Availability: https://dspace.library.uu.nl/handle/1874/482923
Rights: info:eu-repo/semantics/OpenAccess
Accession Number: edsbas.7DEE79F
Database: BASE