Temporal task planning in wirelessly connected environments with unknown channel quality
We consider a mobile robot tasked with gathering data in an environment of interest and transmitting these data to a data center. The task is specified as a high-level Linear Temporal Logic (LTL) formula that captures the data to be gathered at various regions in the workspace. The robot has a limited buffer to store the data, which must be transmitted to the data center before the buffer overflows. Communication between the robot and the data center takes place over a dedicated wireless network to which the robot can upload data at rates that are uncertain and unknown. In this setting, most existing methods based on dynamic programming cannot be applied due to the lack of an accurate model. To address this challenge, we propose an actor-critic reinforcement learning algorithm in which task execution, workspace exploration, and parameterized-policy learning are all performed online and simultaneously. The derived motion and communication control strategy satisfies the buffer constraints and is reactive to the uncertainty in the wireless transmission rate. The overall complexity and performance of our method are compared in simulation to static solutions that search for constrained shortest paths, and to existing learning algorithms that rely on the construction of the product automaton.
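To make the actor-critic component concrete, the following is a minimal sketch of a generic one-step actor-critic update with a softmax parameterized policy and a tabular TD(0) critic. The toy two-state MDP, learning rates, and reward structure are illustrative assumptions, not the robot/wireless model or the specific algorithm of the paper.

```python
import numpy as np

# Minimal one-step actor-critic on a hypothetical 2-state, 2-action MDP
# (illustrative only; NOT the paper's robot/wireless-network setting).
# Actor: softmax policy over learned preferences theta.
# Critic: tabular state-value estimates v, updated by TD(0).

rng = np.random.default_rng(0)

n_states, n_actions = 2, 2
theta = np.zeros((n_states, n_actions))  # actor parameters (action preferences)
v = np.zeros(n_states)                   # critic: state-value estimates
alpha_actor, alpha_critic, gamma = 0.1, 0.2, 0.95

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

def env_step(s, a):
    # Toy dynamics: action 0 keeps the state, action 1 flips it.
    # Reward 1 only for taking action 1 in state 0 (arbitrary choice).
    s_next = s if a == 0 else 1 - s
    r = 1.0 if (s == 0 and a == 1) else 0.0
    return s_next, r

s = 0
for _ in range(5000):
    probs = softmax(theta[s])
    a = rng.choice(n_actions, p=probs)
    s_next, r = env_step(s, a)
    # One TD error drives both the critic and the actor update.
    delta = r + gamma * v[s_next] - v[s]
    v[s] += alpha_critic * delta
    # Policy-gradient step: for softmax, grad log pi(a|s) = onehot(a) - probs.
    grad_log_pi = -probs
    grad_log_pi[a] += 1.0
    theta[s] += alpha_actor * delta * grad_log_pi
    s = s_next

# After training, the policy in state 0 should prefer the rewarding action 1.
print(softmax(theta[0])[1] > 0.5)
```

In the paper's setting the state would additionally track the automaton state of the LTL specification and the buffer level, and the learning runs online during task execution; this sketch only shows the core update rule shared by actor-critic methods.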