Description

This work leans into the efficiency and robustness advantages of a hierarchical learning structure by introducing HAC-E-SAK, which extends the Hierarchical Actor-Critic framework with an Exploration paradigm driven by Synchronized, Adversarial, and Knowledge-based actions. While Hierarchical Actor-Critic (HAC) emphasizes a strictly defined hierarchical organization for rapid learning through parallelized training of multilevel subtask transition functions, it does not extend this principle to the exploration phase of training, an oversight addressed by this work. Moreover, HAC's exploration strategy consists of simple epsilon-greedy perturbations of the deterministic actions generated by the DDPG algorithm. The approach presented here replaces this with an adversarial strategy that draws on knowledge of prior agent experiences, motivating guided environment discovery for tasks with continuous state and action spaces. HAC-E-SAK thus reuses the hierarchical organization employed by leading subtask-learning methods for the parallel purpose of structured exploration, allowing explicit synchronization between levels. Experiments across a range of sparse-reward scenarios in Flow and OpenAI Gym show that HAC-E-SAK consistently outperforms the other tested procedures in both sample efficiency and task success rate.
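As a concrete illustration of the two exploration schemes contrasted above, the sketch below implements the epsilon-greedy perturbation of deterministic DDPG actions that the abstract attributes to HAC, together with one plausible, purely hypothetical reading of an adversarial, knowledge-based alternative (here, a novelty score computed over actions stored in a replay buffer). This is a minimal sketch, not the paper's implementation; all names (hac_epsilon_greedy, adversarial_knowledge_action, past_actions, noise_scale) are illustrative assumptions.

import numpy as np

def hac_epsilon_greedy(actor, state, action_low, action_high,
                       epsilon=0.2, noise_scale=0.05, rng=None):
    # HAC-style exploration as described in the abstract: with probability
    # epsilon sample a uniformly random action, otherwise add Gaussian
    # noise to the actor's deterministic DDPG output.
    rng = rng if rng is not None else np.random.default_rng()
    if rng.random() < epsilon:
        return rng.uniform(action_low, action_high)
    action = actor(state)  # deterministic policy output
    noise = rng.normal(0.0, noise_scale * (action_high - action_low))
    return np.clip(action + noise, action_low, action_high)

def adversarial_knowledge_action(actor, state, past_actions, action_low,
                                 action_high, n_candidates=16,
                                 noise_scale=0.1, rng=None):
    # Hypothetical stand-in for the knowledge-based adversarial strategy:
    # generate several noisy candidates around the deterministic action and
    # keep the one farthest (on average) from actions already recorded in
    # the replay buffer (past_actions, shape [m, d]), steering the agent
    # toward under-explored behavior.
    rng = rng if rng is not None else np.random.default_rng()
    base = np.asarray(actor(state))
    noise = rng.normal(0.0, noise_scale * (action_high - action_low),
                       size=(n_candidates,) + base.shape)
    candidates = np.clip(base + noise, action_low, action_high)
    dists = np.linalg.norm(candidates[:, None, :] - past_actions[None, :, :],
                           axis=-1)
    return candidates[int(np.argmax(dists.mean(axis=1)))]

Both helpers assume a flat continuous action vector with elementwise bounds; in the hierarchical setting described in the abstract, each level would presumably apply its own exploration rule to its subgoal or primitive action.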
