Description
In this report, we first investigate several methods for conducting CPP with dynamic obstacles, using a graph search algorithm (Spanning Tree Coverage) and an on-policy reinforcement learning algorithm (Proximal Policy Optimization) separately. We then introduce a guided reinforcement learning approach for CPP that incorporates a unique reward structure as well as additional coverage information generated by Spanning Tree Coverage to help guide the reinforcement learning agent. We evaluate our proposed algorithm, Coverage Path Planning through Guided Reinforcement Learning (CPP-GRL), across different settings (grid sizes and obstacle locations) and compare it with prior research methods. Experimental results show that CPP-GRL generalizes well to both stochastic and deterministic moving obstacles and performs very similarly to other control-based and learning-based algorithms, but with fewer constraints (such as following specific control laws) and lower computational cost (such as using a Convolutional Neural Network instead).
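To make the guided-observation idea concrete, the following minimal sketch (not the report's actual implementation; the channel layout, function name, and inputs are assumptions) shows one way STC-derived coverage guidance could be stacked with obstacle and coverage maps into a grid observation suitable for a small CNN policy trained with PPO:

```python
import numpy as np

def build_observation(grid_size, obstacles, covered, agent_pos, stc_next_cells):
    """Stack per-cell feature channels for a CNN policy (hypothetical layout).

    Channels:
      0: obstacle occupancy
      1: cells already covered
      2: agent position
      3: STC-suggested next cells (guidance from Spanning Tree Coverage)
    """
    obs = np.zeros((4, grid_size, grid_size), dtype=np.float32)
    for r, c in obstacles:
        obs[0, r, c] = 1.0
    for r, c in covered:
        obs[1, r, c] = 1.0
    obs[2, agent_pos[0], agent_pos[1]] = 1.0
    for r, c in stc_next_cells:
        obs[3, r, c] = 1.0
    return obs

# Example: 8x8 grid, one moving obstacle, partial coverage, two STC-suggested cells.
obs = build_observation(
    grid_size=8,
    obstacles=[(3, 4)],
    covered=[(0, 0), (0, 1), (1, 1)],
    agent_pos=(1, 1),
    stc_next_cells=[(2, 1), (1, 2)],
)
print(obs.shape)  # (4, 8, 8) -> input to a small convolutional policy network
```

The sketch only illustrates how STC guidance might enter the agent's observation; the report's reward structure and network details are described in later sections.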