The aim of this lab is to understand the reinforcement learning subject of the autonomous robots course and implement a reinforcement learning algorithm to learn a policy that moves a robot to a goal position. The algorithm is the Q-learning algorithm and it will be implemented in Matlab.
1 - Introduction
The reinforcement learning algorithm does not force the robot to plan path by using any path planning algorithm, rather the algorithm learns optimal solution by randomly moving inside map for several times. It is an approximation of natural learning process, where unknown problem is solved just by trial and error method. The following sections will briefly discuss about the implementation and the results obtained by the algorithm.
Environment: The environment used for this lab experiment is shown below.
Figure-1: Environment used for the implementation.
States and Actions: The size of the given environment is 20$\times$14 = 280 states. The robot can only do 4 different actions: ←, ↑, →, ↓. Thus, the size of the Q matrices would be 280$\times$4 = 1120 cells.
Dynamics: Dynamics make the robot move towards a direction according to the actions. The robot will move one cell per iteration to the direction of the action that we select, unless there is an obstacle or the wall in front of it, in which case it will stay in the same position.
Reinforcement function: Reinforcement function assigns reward at each cell, +1 for goal cell and -1 otherwise.