ESense: BioMimetic modeling of echolocation and electrolocation using homeostatic dual-layered reinforcement learning
This research explores different approaches to finding a moving target in a gridworld through reinforcement learning. One well-known method for implementing reinforcement learning is the SARSA-γ algorithm. The traditional SARSA-γ algorithm is inefficient at finding a moving target because it relies only on the learned values of a stationary Q-table. While this works in static environments (e.g., finding optimal routes through a challenging environment to a stationary goal), it is poorly suited to a target that changes position. The proposed solution to this problem is eSense. eSense is a dual-layered, dynamic, homeostatic SARSA-γ algorithm with eligibility traces. It gives the AI a temporal sense (so it knows what is around it) to aid in the learning process. The dual-layered descriptor signifies that there are two grids in place: one for navigation within the environment and another that tracks the area surrounding the agent. Because this second grid moves around on the environment grid, it is dynamic; the target the agent is pursuing also moves, so it is dynamic as well. Because the second grid stays centered on the agent, it is homeostatic. Finally, the eligibility traces enhance learning within this environment by providing more feedback per iteration (i.e., more states are updated each iteration). This enhanced configuration has helped eSense learn the target's tendencies while still relying on the Q-table to guide it away from walls and other obstacles. This layered approach improves on the standard SARSA-γ approach.
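The ideas in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a tabular SARSA update with accumulating eligibility traces, a square grid whose walls clamp movement, a target that follows a random walk, and an agent-centered sensing window. For brevity it folds both layers into a single Q-table keyed by (agent position, sensed target offset), which may differ from how eSense combines its two grids. All names (`sense`, `step`, `choose`, `sarsa_trace_update`, `run_episode`), the reward values, and the parameter defaults are hypothetical; the trace-decay parameter is written `lam`.

```python
import random
from collections import defaultdict

# Moves as (row, col) steps: right, left, down, up.
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]

def sense(agent, target, radius=2):
    """Agent-centered window: the target's offset if it is in view, else None."""
    dr, dc = target[0] - agent[0], target[1] - agent[1]
    return (dr, dc) if max(abs(dr), abs(dc)) <= radius else None

def step(pos, move, size):
    """Move on a size x size grid; walls clamp the position (assumption)."""
    return (min(max(pos[0] + move[0], 0), size - 1),
            min(max(pos[1] + move[1], 0), size - 1))

def choose(Q, state, epsilon, rng):
    """Epsilon-greedy action selection over the Q-table."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    return max(range(len(ACTIONS)), key=lambda a: Q[(state, a)])

def sarsa_trace_update(Q, E, s, a, r, s2, a2, done, alpha, gamma, lam):
    """One SARSA step with accumulating eligibility traces.

    Every (state, action) pair with a nonzero trace is updated, so more
    than one table entry changes per iteration.
    """
    td_target = r if done else r + gamma * Q[(s2, a2)]
    delta = td_target - Q[(s, a)]
    E[(s, a)] += 1.0                  # mark the pair just visited
    for key in list(E):
        Q[key] += alpha * delta * E[key]
        E[key] *= gamma * lam         # decay all traces

def run_episode(Q, size=6, max_steps=200, alpha=0.1, gamma=0.9,
                lam=0.8, epsilon=0.1, seed=0):
    """Chase a randomly walking target; return the number of steps taken."""
    rng = random.Random(seed)
    agent, target = (0, 0), (size - 1, size - 1)
    E = defaultdict(float)            # traces are reset each episode
    s = (agent, sense(agent, target))
    a = choose(Q, s, epsilon, rng)
    for t in range(1, max_steps + 1):
        agent = step(agent, ACTIONS[a], size)
        target = step(target, rng.choice(ACTIONS), size)
        done = agent == target
        r = 1.0 if done else -0.01    # illustrative rewards (assumption)
        s2 = (agent, sense(agent, target))
        a2 = choose(Q, s2, epsilon, rng)
        sarsa_trace_update(Q, E, s, a, r, s2, a2, done, alpha, gamma, lam)
        if done:
            return t
        s, a = s2, a2
    return max_steps

if __name__ == "__main__":
    Q = defaultdict(float)
    lengths = [run_episode(Q, seed=i) for i in range(300)]
    print("mean steps, first 50 episodes:", sum(lengths[:50]) / 50)
    print("mean steps, last 50 episodes:", sum(lengths[-50:]) / 50)
```

Because the sensed offset is expressed relative to the agent, the same window entries apply wherever the agent stands on the environment grid, while the position component of the state still lets the table learn about walls. Whether episode lengths shrink in this toy setting depends on the seeds and parameters chosen.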