Multiagent Learning via Dynamic Skill Selection

Title:	Multiagent Learning via Dynamic Skill Selection
Authors:	Sachdeva, Enna
Contributors:	Tumer, Kagan; Turkan, Yelda; Davidson, Joe; Hollinger, Geoffrey
Publisher Information:	Oregon State University
Collection:	ScholarsArchive@OSU (Oregon State University)
Description:	Multiagent coordination has many real-world applications, such as self-driving cars, inventory management, search and rescue, package delivery, traffic management, warehouse management, and transportation. These tasks are generally characterized by a global team objective that is often temporally sparse, realized only upon completing an episode. The sparsity of the shared team objective often makes it an inadequate signal for learning effective strategies. Moreover, this reward signal does not capture the marginal contribution of each agent towards the global objective. This leads to the problem of structural credit assignment in multiagent systems. Furthermore, due to a lack of accurate understanding of desired task behaviors, it is often challenging to manually design agent-specific rewards to improve coordination. While learning these undefined local objectives is critical for successful coordination, it is extremely difficult because of two core challenges. First, due to interaction among agents in an environment, the complexity of the problem may rise exponentially with the number of agents and their behavioral sophistication. Each agent perceives the environment as non-stationary because all agents learn concurrently, so it perceives the coordination objective as extremely noisy. Second, the goal information required to learn coordination behavior is distributed among agents. This makes it difficult for agents to learn undefined desired behaviors that optimize a team objective. The key contribution of this work is to address the credit assignment problem in multiagent coordination using several semantically meaningful local rewards. We argue that real-world multiagent coordination tasks can be decomposed into several meaningful skills. 
Further, we introduce MADyS, a framework that can optimize a global reward by learning to dynamically select the best skill from a set of semantically meaningful skills, each characterized by its local reward, without requiring any form of reward ...
Document Type:	master thesis
Language:	English; unknown
Relation:	https://ir.library.oregonstate.edu/concern/graduate_thesis_or_dissertations/44558m99c
Availability:	https://ir.library.oregonstate.edu/concern/graduate_thesis_or_dissertations/44558m99c
Rights:	All rights reserved
Accession Number:	edsbas.7A55A1A3
Database:	BASE