News Release 

Robot learns fast but safe navigation strategy

Combining deep reinforcement learning and curriculum learning to achieve fast but safe mobile robot navigation

Toyohashi University of Technology (TUT)

Research News

IMAGE: Plot of some robot trajectories over several speed settings after training. In the experiments, various speed settings were applied to the mobile robot (depicted as red circles) for three goal...

Credit: COPYRIGHT (C) TOYOHASHI UNIVERSITY OF TECHNOLOGY. ALL RIGHTS RESERVED.

Overview:

A research group from the Active Intelligent System Laboratory (AISL) at Toyohashi University of Technology (TUT) has proposed a new framework for training mobile robots to quickly navigate while maintaining low collision rates. The framework combines deep reinforcement learning (DRL) and curriculum learning in the training process for robots to learn a fast but safe navigation policy.

Details:

One of the basic requirements of autonomous mobile robots is navigation capability. The robot must be able to navigate from its current position to a target position specified by coordinates on the map, while avoiding surrounding obstacles. In some cases, the robot is required to move fast enough to reach its destination as quickly as possible. However, robots that navigate faster usually face a higher risk of collision, which makes navigation unsafe and endangers the robot and its surroundings.

To solve this problem, a research group from the Active Intelligent System Laboratory (AISL) in the Department of Computer Science and Engineering at Toyohashi University of Technology (TUT) proposed a new framework that balances speed and safety in robot navigation. The proposed framework enables the robot to learn a policy for fast but safe navigation in an indoor environment by utilizing deep reinforcement learning (DRL) and curriculum learning.

Chandra Kusuma Dewa, a doctoral student and the first author of the paper, explained that DRL can enable the robot to learn appropriate actions based on the current state of the environment (e.g., robot position and obstacle placements) by repeatedly trying various actions. In addition, the paper explains that the execution of the current action stops as soon as the robot reaches the goal position or collides with an obstacle, because the learning algorithm assumes that each action has been successfully executed by the robot and uses the consequence of that action to improve the policy. The proposed framework can help maintain the consistency of the learning environment so that the robot can learn a better navigation policy.
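As a rough illustration of the idea that an episode ends the moment the robot reaches the goal or hits an obstacle, the following minimal Python sketch simulates a point robot among circular obstacles. It is not the implementation from the paper: the function names (step, run_episode), the radii, the time step, and the reward values are all hypothetical choices made only for this example.

```python
import math

GOAL_RADIUS = 0.3   # hypothetical: distance at which the goal counts as reached
ROBOT_RADIUS = 0.2  # hypothetical: radius of the robot's footprint


def step(position, heading, speed, goal, obstacles, dt=0.1):
    """Move a point robot for one time step and classify the outcome.

    obstacles is a list of (x, y, radius) circles.
    Returns (new_position, reward, done, outcome).
    Reward values are illustrative assumptions, not taken from the paper.
    """
    x = position[0] + speed * math.cos(heading) * dt
    y = position[1] + speed * math.sin(heading) * dt
    new_position = (x, y)

    # A collision ends the episode immediately.
    for ox, oy, radius in obstacles:
        if math.hypot(x - ox, y - oy) <= radius + ROBOT_RADIUS:
            return new_position, -1.0, True, "collision"

    # Reaching the goal also ends the episode immediately.
    if math.hypot(x - goal[0], y - goal[1]) <= GOAL_RADIUS:
        return new_position, 1.0, True, "goal"

    # Otherwise the episode continues with a small per-step penalty.
    return new_position, -0.01, False, "running"


def run_episode(policy, start, goal, obstacles, speed, max_steps=500):
    """Roll out one episode and record (state, action, reward, next_state, done)."""
    position = start
    transitions = []
    for _ in range(max_steps):
        heading = policy(position, goal)
        new_position, reward, done, outcome = step(
            position, heading, speed, goal, obstacles
        )
        transitions.append((position, heading, reward, new_position, done))
        position = new_position
        if done:
            return outcome, transitions
    return "timeout", transitions
```

For example, a policy that always heads straight for the goal, such as lambda p, g: math.atan2(g[1] - p[1], g[0] - p[0]), ends with the outcome "goal" on an empty map and with "collision" when an obstacle sits on the direct path. The recorded transitions are the kind of data a DRL algorithm would learn from.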

In addition, Professor Jun Miura, the head of AISL at TUT, explained that the framework follows a curriculum learning strategy by setting a small velocity for the robot at the beginning of training. As the number of episodes increases, the robot's velocity is raised gradually, so that the robot learns the complex task of fast but safe navigation step by step, from the easiest level (slow movement) to the most difficult level (fast movement).
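The curriculum described above can be pictured as a schedule that maps the episode number to a speed setting. The Python sketch below shows one possible stepwise schedule. The function name scheduled_velocity, the speed range, and the number of levels are assumptions made for illustration, and the actual schedule used in the paper may differ.

```python
def scheduled_velocity(episode, total_episodes, v_min=0.2, v_max=1.0, n_levels=5):
    """Return the speed setting for a training episode (hypothetical schedule).

    Training is split into n_levels equal stages. The speed starts at v_min
    in the first stage and rises in equal increments to v_max in the last.
    All default values are illustrative, not taken from the paper.
    """
    if total_episodes <= 0 or n_levels < 1:
        raise ValueError("total_episodes and n_levels must be positive")
    if n_levels == 1:
        return v_max

    # Clamp so that out-of-range episode numbers map to the first or last stage.
    episode = min(max(episode, 0), total_episodes - 1)

    # Stage index in 0 .. n_levels - 1, increasing with the episode number.
    level = episode * n_levels // total_episodes
    return v_min + (v_max - v_min) * level / (n_levels - 1)
```

With 1,000 training episodes and the default arguments, the speed stays at the lowest setting for the first 200 episodes and then increases by one increment every 200 episodes. A smoother (for example, linear) schedule would be an equally valid reading of the description.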

Experimental results and prospect:

Because collisions in the training phase are undesirable, research on learning algorithms is usually conducted in a simulated environment. For the experiments, the research group simulated an indoor environment, shown in the accompanying image. In both training and validation, the proposed framework enabled the robot to navigate faster and with a higher success rate than the previously existing frameworks it was compared with. Based on these evaluation results, the research group believes that the framework is valuable and can be widely used to train mobile robots in any field that requires fast but safe navigation.

###

Funding Agency:

This work was supported in part by the Japan Society for the Promotion of Science (JSPS) KAKENHI under Grant 17H01799.

Reference:

C. K. Dewa and J. Miura, "A Framework for DRL Navigation With State Transition Checking and Velocity Increment Scheduling," in IEEE Access, vol. 8, pp. 191826-191838, 2020, doi: 10.1109/ACCESS.2020.3033016.
