Zero-Shot Deep Learning for Media Mining: Person Spotting and Face Clustering in Video Big Data

Title: Zero-Shot Deep Learning for Media Mining: Person Spotting and Face Clustering in Video Big Data
Authors: Mohamed S. Abdallah; HyungWon Kim; Mohammad E. Ragab; Elsayed E. Hemayed
Source: Electronics, Vol 8, Iss 12, p 1394 (2019)
Publisher Information: MDPI AG
Publication Year: 2019
Collection: Directory of Open Access Journals: DOAJ Articles
Subject Terms: face clustering; face recognition; face detection; cnn; kl divergence; triplet loss; Electronics; TK7800-8360
Description: Analyzing frame sequences in talk show videos, which is necessary for media mining and television production, requires significant manual effort and is very time-consuming. Given the vast number of unlabeled face frames from talk show videos, we address the problem of recognizing and clustering faces. We propose a TV media mining system based on a deep convolutional neural network trained with a triplet loss minimization method. The system's main function is to index and cluster video data, which supports effective media production analysis of individuals in talk show videos and rapid identification of a specific individual in video data in real time. Our system uses several face datasets: Labeled Faces in the Wild (LFW), which is a collection of unlabeled web face images, as well as the YouTube Faces and talk show faces datasets. In the recognition (person spotting) task, our system achieves an F-measure of 0.996 on the collection of unlabeled web face images and an F-measure of 0.972 on the talk show faces dataset. In the clustering task, our system achieves an F-measure of 0.764 on the YouTube Faces database, 0.935 on the LFW dataset, and 0.832 on the talk show faces dataset, improvements of 5.4%, 6.5%, and 8.2% over previous methods.
Document Type: article in journal/newspaper
Language: English
Relation: https://www.mdpi.com/2079-9292/8/12/1394; https://doaj.org/toc/2079-9292; https://doaj.org/article/442c9ee9a36a4c88a78ab3e41f175d69
DOI: 10.3390/electronics8121394
Availability: https://doi.org/10.3390/electronics8121394; https://doaj.org/article/442c9ee9a36a4c88a78ab3e41f175d69
Accession Number: edsbas.FDC4D910
Database: BASE
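
Note on terms used in the description: the abstract refers to a triplet loss and to F-measure scores without defining them. The sketch below shows the commonly used textbook forms of both, as a reading aid only. It is not taken from the paper: the squared Euclidean distance, the margin value of 0.2, and the function names are assumptions made for illustration, and the paper may define its loss and its clustering F-measure differently.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two embedding vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss for a single (anchor, positive, negative) triplet.

    The loss is zero once the negative is farther from the anchor than the
    positive by at least `margin` (an assumed, illustrative value).
    """
    gap = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    return max(gap + margin, 0.0)


def f_measure(precision, recall):
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

For example, `triplet_loss([0, 0], [0, 0], [1, 0])` is zero because the negative already lies more than the margin farther away than the positive, while swapping the positive and negative yields a positive loss.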