A YOLOv11-Based Deep Learning Framework for Multi-Class Human Action Recognition

Title:	A YOLOv11-Based Deep Learning Framework for Multi-Class Human Action Recognition
Authors:	Nayeemul Islam Nayeem; Shirin Mahbuba; Sanjida Islam Disha; Md Rifat Hossain Buiyan; Shakila Rahman; M. Abdullah-Al-Wadud; Jia Uddin
Source:	Computers, Materials & Continua ; ISSN: 1546-2218 (Print) ; ISSN: 1546-2226 (Online) ; Volume 85 ; Issue 1
Publisher Information:	Tech Science Press
Publication Year:	2025
Subject Terms:	Human activity recognition; YOLOv11; deep learning; real-time detection; anchor-free detection; attention mechanisms; object detection; image classification; multi-class recognition; surveillance applications
Description:	Human activity recognition is a significant area of research in artificial intelligence for surveillance, healthcare, sports, and human-computer interaction applications. The article benchmarks the performance of a You Only Look Once version 11-based (YOLOv11-based) architecture for multi-class human activity recognition. The dataset consists of 14,186 images across 19 activity classes, ranging from dynamic activities such as running and swimming to static activities such as sitting and sleeping. Preprocessing included resizing all images to 512 × 512 pixels, annotating them in YOLO's bounding box format, and applying data augmentation methods such as flipping, rotation, and cropping to enhance model generalization. The proposed model was trained for 100 epochs with adaptive learning rate methods and hyperparameter optimization, achieving a mAP@0.5 of 74.93% and a mAP@0.5-0.95 of 64.11% and outperforming previous versions of YOLO (v10, v9, and v8) and general-purpose architectures like ResNet50 and EfficientNet. It exhibited improved precision and recall for all activity classes, with precision values of 0.76 for running, 0.79 for swimming, 0.80 for sitting, and 0.81 for sleeping. It was also tested for real-time deployment, where an inference time of 8.9 ms per image showed it to be computationally light. The proposed YOLOv11's improvements are attributed to architectural advancements such as a more complex feature extraction process, better attention modules, and an anchor-free detection mechanism. By comparison, YOLOv10 was extremely stable in static activity recognition; YOLOv9 performed well in dynamic environments but suffered from overfitting; and YOLOv8, while a decent baseline, failed to differentiate between overlapping static activities. The experimental results show the proposed YOLOv11 to be the most appropriate model, providing an ideal balance between accuracy, ...
Document Type:	article in journal/newspaper
File Description:	application/pdf
Language:	English
Relation:	https://doi.org/10.32604/cmc.2025.065061
DOI:	10.32604/cmc.2025.065061
Availability:	https://doi.org/10.32604/cmc.2025.065061
Rights:	info:eu-repo/semantics/openAccess ; https://creativecommons.org/licenses/by/4.0/
Accession Number:	edsbas.BB81CE8D
Database:	BASE
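
Note on the preprocessing named in the Description: the abstract mentions annotation in YOLO's bounding box format and flipping as an augmentation, without giving details. The sketch below illustrates the commonly used YOLO label convention (class id followed by box center, width, and height, all normalized to the image size) and how a horizontal flip changes such a label. The function names (to_yolo, hflip_yolo, format_label) and the sample box are hypothetical and chosen only for illustration; this is not the authors' pipeline.

```python
from typing import Tuple

# (x_center, y_center, width, height), each normalized to [0, 1]
Box = Tuple[float, float, float, float]


def to_yolo(x_min: float, y_min: float, x_max: float, y_max: float,
            img_w: int, img_h: int) -> Box:
    """Convert pixel corner coordinates to a normalized YOLO-style box."""
    x_c = (x_min + x_max) / 2.0 / img_w
    y_c = (y_min + y_max) / 2.0 / img_h
    w = (x_max - x_min) / img_w
    h = (y_max - y_min) / img_h
    return (x_c, y_c, w, h)


def hflip_yolo(box: Box) -> Box:
    """Mirror a normalized box horizontally; only the x center changes."""
    x_c, y_c, w, h = box
    return (1.0 - x_c, y_c, w, h)


def format_label(class_id: int, box: Box) -> str:
    """Render one line of a YOLO-style label file."""
    return f"{class_id} " + " ".join(f"{v:.6f}" for v in box)


if __name__ == "__main__":
    # Hypothetical box in a 512 x 512 image (the size named in the abstract).
    box = to_yolo(128, 64, 384, 448, img_w=512, img_h=512)
    print(format_label(3, box))              # class id 3 is arbitrary
    print(format_label(3, hflip_yolo(box)))  # same box after a horizontal flip
```

Because the coordinates are normalized, the same label stays valid when an image is resized without cropping; rotation and cropping, in contrast, require recomputing the box.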