Curriculum Learning for Robot Navigation in Dynamic Environments with Uncertainties
Date: 2024
Author: Doğan, Devran
Open access
Abstract
In this study, we investigated whether the training process of a deep reinforcement learning (DRL) agent can be made easier and its success rate on the given tasks improved. To accelerate convergence, we adopted curriculum learning techniques. As automated vehicles grow in importance and capabilities such as target search in unknown environments attract increasing attention, path generation and exploration of the environment become critical, particularly when humans are present in the environment and human life is at risk. In complex real-world applications, safety and risk awareness are unavoidable concerns: systems that are not risk-aware may make suboptimal decisions that lead to failures. We used risk-aware systems in unknown environments to test the model's robustness in localization and path generation and to observe its performance in situations not encountered during training. Because such exploration requires long computation times, we needed improved risk-aware decision making to train a risk-sensitive policy that performs well, adapts to the required level of risk, and navigates collision-free among static and dynamic obstacles.
DRL algorithms have demonstrated their capabilities when learning from reward signals that are easy to compute, but their long training times limit their applicability to real-world problems. We therefore combined curriculum learning with DRL algorithms to build a goal-oriented model. This achieved faster convergence: for the same number of training episodes, the target search time and the collision rate were both reduced. During training we also wanted to understand under which conditions training becomes hardest. To that end, we injected Gaussian noise into the neural network parameters in different forms, used different environments, and delayed the sensory information to observe the agent's behavior and prediction success; we tested with static obstacles only, with dynamic obstacles only, and finally with both types of obstacle together. Most of the environments were partially observable; we also tested fully observable environments, but found that DRL agents solve them easily.
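The sketch below illustrates the two curriculum ingredients described above, a staged easy-to-hard schedule over obstacle types and sensor delay, and Gaussian noise injected into the network parameters, here expressed with PyTorch. The stage values, noise scales, and helper names are assumptions for illustration; the abstract does not give the thesis's exact schedule.

```python
from dataclasses import dataclass
import torch

@dataclass
class CurriculumStage:
    """One training stage; all field values below are illustrative."""
    name: str
    static_obstacles: int
    dynamic_obstacles: int
    sensor_delay_steps: int  # how many steps observations are delayed
    param_noise_std: float   # std of Gaussian noise added to weights

# Example easy-to-hard schedule (assumed, not the thesis's exact one).
CURRICULUM = [
    CurriculumStage("static-only",  5, 0, sensor_delay_steps=0, param_noise_std=0.00),
    CurriculumStage("dynamic-only", 0, 5, sensor_delay_steps=0, param_noise_std=0.00),
    CurriculumStage("mixed",        5, 5, sensor_delay_steps=1, param_noise_std=0.01),
    CurriculumStage("mixed-hard",   8, 8, sensor_delay_steps=3, param_noise_std=0.05),
]

def inject_parameter_noise(model: torch.nn.Module, std: float) -> None:
    """Add zero-mean Gaussian noise to every network parameter,
    one way to perturb the policy during the harder stages."""
    if std <= 0.0:
        return
    with torch.no_grad():
        for p in model.parameters():
            p.add_(torch.randn_like(p) * std)
```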
To carry out this study and measure efficiency, we built a 2D simulation environment, and the performance is verified through analysis of the simulation results. We measured the agent's efficiency by collecting total hit-ratio metrics. Experiments show that the agent trained with curriculum learning reaches a higher success rate, controls the robot more efficiently, performs better under noisy conditions, and adapts faster to unknown environments.
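As a rough sketch of how such a total hit ratio might be aggregated over evaluation episodes (the outcome labels and function name are hypothetical, not the thesis's exact metric definition):

```python
from collections import Counter

def hit_ratio(episode_outcomes: list[str]) -> dict[str, float]:
    """Aggregate per-episode outcomes ("goal", "collision", "timeout")
    into success, collision, and timeout rates over an evaluation run."""
    counts = Counter(episode_outcomes)
    total = len(episode_outcomes)
    return {
        "success_rate": counts["goal"] / total,
        "collision_rate": counts["collision"] / total,
        "timeout_rate": counts["timeout"] / total,
    }

# Example: 10 evaluation episodes.
print(hit_ratio(["goal"] * 7 + ["collision"] * 2 + ["timeout"]))
# -> {'success_rate': 0.7, 'collision_rate': 0.2, 'timeout_rate': 0.1}
```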