
FIONA: Detecting Syntactical Outliers in Attributes with Categorical Values

Title: FIONA: Detecting Syntactical Outliers in Attributes with Categorical Values
Authors: Tsiamis, Thanos; Qahtan, Hakim; Data Intensive Systems; Sub Data Intensive Systems
Publication Year: 2025
Subject Terms: Categorical outliers; generalization tree; patterns; similarity measures; syntactic structure; Taverne
Description: Outlier detection is crucial for data cleaning, influencing analysis and decision-making. While numerical outlier detection is well-studied, identifying outliers in relational data with categorical attributes poses greater challenges because a suitable similarity measure is difficult to define. Current approaches for detecting categorical outliers rely on encoding the categorical values as numerical values, using frequency as an indicator of the outlierness score, or extracting predefined syntactic structures from the values. In this paper, we propose FIONA (FInding Outliers iN Attributes) to detect outliers in attributes with categorical values. Since categorical values in the relational model usually follow specific syntactic structures, FIONA defines a similarity measure that can reveal the hidden patterns and identify a set of dominant patterns in the data. Values that do not conform to the dominant patterns are declared outliers. In comparison to alternative tools, FIONA accurately identifies outliers and dominant patterns within datasets and provides a clear explanation for declaring a given value an outlier.
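The general idea of pattern-based outlier detection described above can be illustrated with a minimal sketch. This is not FIONA's algorithm: the record does not specify its similarity measure or generalization tree, so the character classes (`U`, `L`, `D`), the run-collapsing step, the function names `generalize` and `find_outliers`, and the `min_share` threshold are all assumptions made purely for illustration.

```python
from collections import Counter
from itertools import groupby


def generalize(value: str) -> str:
    """Map a value to a coarse syntactic pattern, e.g. 'AB-123' -> 'U-D'.

    Hypothetical generalization: U = uppercase letters, L = other letters,
    D = digits; any other character is kept as a literal symbol.
    """
    def char_class(ch: str) -> str:
        if ch.isdigit():
            return "D"
        if ch.isalpha():
            return "U" if ch.isupper() else "L"
        return ch  # punctuation and whitespace stay literal

    classes = [char_class(ch) for ch in value]
    # Collapse runs of the same class so length differences are ignored.
    return "".join(key for key, _ in groupby(classes))


def find_outliers(values, min_share=0.2):
    """Return (dominant patterns, list of (value, pattern) outliers).

    A pattern is treated as dominant when at least `min_share` of the
    values follow it (an assumed, illustrative criterion).
    """
    patterns = [generalize(v) for v in values]
    counts = Counter(patterns)
    total = len(values)
    dominant = {p for p, c in counts.items() if c / total >= min_share}
    # Each outlier is reported with its pattern, which serves as a
    # simple explanation of why the value was flagged.
    outliers = [(v, p) for v, p in zip(values, patterns) if p not in dominant]
    return dominant, outliers


if __name__ == "__main__":
    column = ["AB-123", "CD-456", "EF-7890", "GH-12", "xy_99", "IJ-345"]
    dominant, outliers = find_outliers(column)
    print("dominant patterns:", dominant)
    print("outliers:", outliers)
```

Reporting the non-conforming pattern alongside each flagged value mirrors, in a very simplified way, the explainability the description attributes to FIONA; a real implementation would need a more careful generalization hierarchy and similarity measure.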
Document Type: conference object
File Description: application/pdf
Language: English
Relation: https://dspace.library.uu.nl/handle/1874/463123
Availability: https://dspace.library.uu.nl/handle/1874/463123
Rights: info:eu-repo/semantics/OpenAccess
Accession Number: edsbas.E416F23
Database: BASE