Agile Flight in Dynamic Environments: Bridging Reinforcement and Imitation Learning
Abstract
In recent years, the use of drones has increased markedly across sectors including surveillance, delivery services, and environmental monitoring, driven largely by advances in drone technology that have made them more accessible and versatile. Among the capabilities that distinguish drones, agile flight is paramount, enabling navigation of complex environments with precision and efficiency. Achieving agile flight in dynamic environments, however, poses significant challenges, particularly in rapid trajectory re-planning and computational demand. This thesis proposes a novel approach to agile drone navigation that integrates Reinforcement Learning (RL) and Imitation Learning (IL). A state-based teacher policy is first trained with the Proximal Policy Optimization (PPO) algorithm, with access to comprehensive environmental information, including obstacle velocities. A student policy is then trained through Behavioral Cloning (BC) to navigate without direct velocity information, relying instead on a recurrent neural network architecture to infer it from observation history. Experimental results demonstrate that the proposed method significantly improves the agility and efficiency of drones in dynamic environments. Combining RL and IL reduces the computational burden and shortens training time, enabling quicker adaptation and better performance. These findings contribute to autonomous drone technology by offering a robust solution for navigating cluttered and unpredictable environments. The project is available at: https://github.com/Ag05ccc/agile_flight
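To make the teacher-student pipeline concrete, the sketch below illustrates the core distillation step in PyTorch. It is a minimal illustration under stated assumptions, not the thesis implementation: the module names, network sizes, and dimensions (STATE_DIM, PRIV_DIM, ACT_DIM, HID) are hypothetical, and the random tensors stand in for logged teacher rollouts. A privileged teacher maps the full state, including obstacle velocities, to actions; a recurrent student is trained by behavioral cloning to imitate those actions from observations that omit the velocities, inferring them from history via a GRU.

import torch
import torch.nn as nn

STATE_DIM, PRIV_DIM, ACT_DIM, HID = 18, 6, 4, 64  # hypothetical sizes

class Teacher(nn.Module):
    """State-based policy with access to privileged obstacle velocities."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM + PRIV_DIM, HID), nn.Tanh(),
            nn.Linear(HID, ACT_DIM),
        )

    def forward(self, state, priv):
        # Teacher sees the full state plus privileged velocity information.
        return self.net(torch.cat([state, priv], dim=-1))

class Student(nn.Module):
    """Recurrent policy that must infer velocities from observation history."""
    def __init__(self):
        super().__init__()
        self.gru = nn.GRU(STATE_DIM, HID, batch_first=True)
        self.head = nn.Linear(HID, ACT_DIM)

    def forward(self, obs_seq, hidden=None):
        # The GRU hidden state carries temporal context that substitutes
        # for the velocity measurements the student never observes.
        out, hidden = self.gru(obs_seq, hidden)
        return self.head(out), hidden

teacher, student = Teacher(), Student()
opt = torch.optim.Adam(student.parameters(), lr=3e-4)

# One behavioral-cloning step: regress student actions onto the frozen
# teacher's actions over rollouts of length T (random stand-in data here).
B, T = 32, 50
states = torch.randn(B, T, STATE_DIM)   # observations available to both
priv = torch.randn(B, T, PRIV_DIM)      # obstacle velocities (teacher-only)
with torch.no_grad():
    target_actions = teacher(states, priv)
pred_actions, _ = student(states)
loss = nn.functional.mse_loss(pred_actions, target_actions)
opt.zero_grad()
loss.backward()
opt.step()

In the full pipeline the teacher would first be trained with PPO in simulation; the cloning loss above is then minimized over many rollouts, so the student learns to reproduce velocity-aware behavior from partial observations alone.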