dc.contributor.author: Larsen, Thomas Nakken
dc.contributor.author: Teigen, Halvor Ødegård
dc.contributor.author: Laache, Torkel
dc.contributor.author: Varagnolo, Damiano
dc.contributor.author: Rasheed, Adil
dc.identifier.citation: Frontiers in Robotics and AI. 2021, 8, 738113. [en_US]
dc.description.abstract: Reinforcement Learning (RL) controllers have proved to effectively tackle the dual objectives of path following and collision avoidance. However, finding which RL algorithm setup optimally trades off these two tasks is not necessarily easy. This work proposes a methodology to explore this that leverages analyzing the performance and task-specific behavioral characteristics for a range of RL algorithms applied to path-following and collision-avoidance for underactuated surface vehicles in environments of increasing complexity. Compared to the introduced RL algorithms, the results show that the Proximal Policy Optimization (PPO) algorithm exhibits superior robustness to changes in the environment complexity, the reward function, and when generalized to environments with a considerable domain gap from the training environment. Whereas the proposed reward function significantly improves the competing algorithms’ ability to solve the training environment, an unexpected consequence of the dimensionality reduction in the sensor suite, combined with the domain gap, is identified as the source of their impaired generalization performance. [en_US]
dc.rights: Attribution 4.0 International [*]
dc.subject: Deep reinforcement learning [en_US]
dc.subject: Autonomous surface vehicle [en_US]
dc.subject: Collision avoidance [en_US]
dc.subject: Path following [en_US]
dc.subject: Machine learning controller [en_US]
dc.title: Comparing Deep Reinforcement Learning Algorithms’ Ability to Safely Navigate Challenging Waters [en_US]
dc.type: Peer reviewed [en_US]
dc.type: Journal article [en_US]
dc.rights.holder: © 2021 Larsen, Teigen, Laache, Varagnolo and Rasheed. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. [en_US]
dc.source.journal: Frontiers in Robotics and AI [en_US]
dc.relation.project: Norges forskningsråd: 295033 [en_US]

Except where otherwise noted, this item's license is described as Attribution 4.0 International