Current approaches to exploration tasks typically rely on a model that describes the environment and the process of interest. This setup works well as long as the model describes them precisely. If the environment or the process changes, a new model has to be built and the algorithm adapted to it, which increases the effort required to develop algorithms for novel exploration tasks.
The Swarm Exploration group is developing machine learning algorithms that enable a swarm of robots to learn how to carry out complex exploration tasks. In particular, we focus here on model-free deep reinforcement learning (Deep-RL) approaches, which do not require a model of the environment or the process of interest. Reinforcement learning (RL) algorithms enable an agent to learn how to behave by interacting with its environment. Learning is driven by a reward signal, which encodes how well the agent is performing. Hence, the aim of an RL agent is to learn a policy that maximizes the expected future reward.
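To make the RL loop described above concrete, the following is a minimal tabular Q-learning sketch on a hypothetical toy task (a 1-D corridor, not the DeepIG setting): the reward signal (+1 on reaching the goal) encodes task success, and the agent learns a policy that maximizes the expected future discounted reward. All names and hyperparameters here are illustrative assumptions.

```python
import numpy as np

# Toy task (illustrative, not the DeepIG environment): states 0..4 on a
# line, goal at state 4, actions move left or right. The agent receives
# reward +1 only when it reaches the goal.
N_STATES, GOAL = 5, 4
ACTIONS = (-1, +1)                    # move left, move right

rng = np.random.default_rng(0)
Q = np.zeros((N_STATES, len(ACTIONS)))
alpha, gamma, eps = 0.5, 0.9, 0.2     # learning rate, discount, exploration

for episode in range(200):
    s = int(rng.integers(GOAL))       # random non-goal start state
    for _ in range(100):              # cap episode length
        # epsilon-greedy: mostly exploit the current policy, sometimes explore
        if rng.random() < eps:
            a = int(rng.integers(len(ACTIONS)))
        else:
            a = int(Q[s].argmax())
        s_next = int(np.clip(s + ACTIONS[a], 0, N_STATES - 1))
        r = 1.0 if s_next == GOAL else 0.0
        # temporal-difference update toward reward + discounted future value
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
        s = s_next
        if s == GOAL:
            break

# The learned greedy policy moves right (action index 1) in every
# non-goal state, i.e. it maximizes expected future reward.
policy = Q.argmax(axis=1)
```

Deep-RL replaces the table `Q` with a neural network so the same learning principle scales to the high-dimensional observations a robot swarm encounters.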
Model-free RL has been shown to offer outstanding results for a wide variety of tasks. Nevertheless, there are many applications for which a model of the process of interest has been well studied. This is the case, e.g., in one of our applications of interest: gas source localization. In gas source localization, partial differential equations have been shown to model the gas dispersion very accurately. Therefore, one of the questions that we also address in our research is: how can we introduce domain knowledge -- a model -- of a physical process in RL to solve an exploration task?
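As an illustration of such a model, gas dispersion is commonly described by the advection-diffusion equation (a standard textbook formulation, not necessarily the exact model used in our work):

```latex
\frac{\partial c}{\partial t}
  = \nabla \cdot \left( D \, \nabla c \right)
  - \nabla \cdot \left( \mathbf{v} \, c \right)
  + S,
```

where $c(\mathbf{x}, t)$ is the gas concentration, $D$ the diffusion coefficient, $\mathbf{v}$ the wind velocity field, and $S$ the source term. Embedding such domain knowledge in an RL agent is one way to inject a model of the physical process into an otherwise model-free learner.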
We developed a framework – DeepIG – that allows multiple robots to learn how to accomplish complex exploration tasks using Deep RL. In particular, our focus lies on terrain mapping, wildfire monitoring, and gas source localization tasks.
DeepIG: Multi-Robot Information Gathering with Deep Reinforcement Learning
Viseras, Alberto and Garcia, Ricardo (2019). DeepIG: Multi-robot information gathering with deep reinforcement learning. IEEE Robotics and Automation Letters, 4(3), 3059-3066.