
TGAIN: Missing Data Imputation for Mixed-Type Relational Datasets Using Generative Adversarial Networks

Title: TGAIN: Missing Data Imputation for Mixed-Type Relational Datasets Using Generative Adversarial Networks
Authors: Bannany, Ouassim; Qahtan, Abdulhakim A.; Sub Data Intensive Systems; Stahlbock, Robert; Arabnia, Hamid R.
Publication Year: 2025
Subject Terms: data imputation; generative adversarial network; missing values; synthetic data generation; Taverne; General Computer Science; General Mathematics
Description: Missing values are common in real-world datasets and pose a challenge for most data analytics tasks. For that reason, many data imputation techniques have been proposed to fill in missing values. However, these existing techniques may fail to capture the characteristics of the data and can mislead downstream analytics, resulting in inaccurate conclusions. Generative Adversarial Networks (GANs) have proved to be a good technique for generating synthetic data; using GANs, synthetic examples are generated that preserve the existing values in the record. These synthetic examples can then be used to fill the missing values and capture the data characteristics better than other data imputation techniques. In this paper, we propose a framework based on Generative Adversarial Networks to impute the missing values in incomplete datasets. The performance of the framework is evaluated using two methodologies: 1) determining the prediction error of the imputed values after introducing missing values into an otherwise complete dataset, and 2) comparing the performance of classifiers trained on datasets imputed with our proposed framework and with other imputation frameworks. The proposed framework outperformed other state-of-the-art tools at high missing rates (50% and beyond) while achieving comparable results at lower missing rates. In addition, classifiers trained on data imputed with the proposed framework achieve higher accuracy than those trained on data imputed with some of the other baseline methods.
Document Type: book part
File Description: application/pdf
Language: English
ISSN: 1865-0929
Relation: https://dspace.library.uu.nl/handle/1874/482836
Availability: https://dspace.library.uu.nl/handle/1874/482836
Rights: info:eu-repo/semantics/OpenAccess
Accession Number: edsbas.6F6D4866
Database: BASE
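
Illustrative note on the first evaluation methodology named in the Description (hide values in an otherwise complete dataset, impute them, and measure the prediction error on the hidden cells). The sketch below is not the TGAIN framework and is not taken from the paper. It assumes numeric-only data, values removed completely at random, RMSE as the error measure, and column-mean imputation as a stand-in imputer; the function names are hypothetical.

```python
import numpy as np


def mask_completely_at_random(data, missing_rate, rng):
    """Hide roughly `missing_rate` of the cells (assumed MCAR mechanism).

    Returns the corrupted copy (hidden cells set to NaN) and the boolean
    mask marking which cells were hidden.
    """
    hidden = rng.random(data.shape) < missing_rate
    corrupted = data.copy()
    corrupted[hidden] = np.nan
    return corrupted, hidden


def mean_impute(corrupted):
    """Stand-in imputer: fill each NaN with the observed mean of its column."""
    col_means = np.nanmean(corrupted, axis=0)
    return np.where(np.isnan(corrupted), col_means, corrupted)


def imputation_rmse(truth, imputed, hidden):
    """Root mean squared error computed over the hidden cells only."""
    diff = truth[hidden] - imputed[hidden]
    return float(np.sqrt(np.mean(diff ** 2)))


# Synthetic complete dataset, used purely for illustration.
rng = np.random.default_rng(0)
complete = rng.normal(size=(200, 4))

for rate in (0.2, 0.5):
    corrupted, hidden = mask_completely_at_random(complete, rate, rng)
    imputed = mean_impute(corrupted)
    print(f"missing rate {rate:.0%}: RMSE = {imputation_rmse(complete, imputed, hidden):.3f}")
```

Swapping `mean_impute` for any other imputer with the same input and output shape allows methods to be compared at each missing rate. Mixed-type (categorical plus numeric) data, which the paper targets, would need a different error measure for the categorical columns.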