Show simple item record

dc.contributor.author: Larsen, Thomas Nakken
dc.contributor.author: Teigen, Halvor Ødegård
dc.contributor.author: Laache, Torkel
dc.contributor.author: Varagnolo, Damiano
dc.contributor.author: Rasheed, Adil
dc.date.accessioned: 2022-05-06T11:19:34Z
dc.date.available: 2022-05-06T11:19:34Z
dc.date.created: 2021-09-03T09:18:33Z
dc.date.issued: 2021
dc.identifier.citation: Frontiers in Robotics and AI. 2021, 8, 738113.
dc.identifier.issn: 2296-9144
dc.identifier.uri: https://hdl.handle.net/11250/2994547
dc.description.abstract: Reinforcement Learning (RL) controllers have proved effective at tackling the dual objectives of path following and collision avoidance. However, finding which RL algorithm setup optimally trades off these two tasks is not straightforward. This work proposes a methodology for exploring this trade-off: analyzing the performance and task-specific behavioral characteristics of a range of RL algorithms applied to path following and collision avoidance for underactuated surface vehicles in environments of increasing complexity. The results show that, compared to the other RL algorithms considered, the Proximal Policy Optimization (PPO) algorithm exhibits superior robustness to changes in the environment complexity and in the reward function, and when generalized to environments with a considerable domain gap from the training environment. Whereas the proposed reward function significantly improves the competing algorithms’ ability to solve the training environment, an unexpected consequence of the dimensionality reduction in the sensor suite, combined with the domain gap, is identified as the source of their impaired generalization performance.
dc.language.iso: eng
dc.publisher: Frontiers
dc.rights: Attribution 4.0 International (Navngivelse 4.0 Internasjonal)
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/deed.no
dc.subject: Deep reinforcement learning
dc.subject: Autonomous surface vehicle
dc.subject: Collision avoidance
dc.subject: Path following
dc.subject: Machine learning controller
dc.title: Comparing Deep Reinforcement Learning Algorithms’ Ability to Safely Navigate Challenging Waters
dc.type: Peer reviewed
dc.type: Journal article
dc.description.version: publishedVersion
dc.rights.holder: © 2021 Larsen, Teigen, Laache, Varagnolo and Rasheed. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
dc.source.pagenumber: 19
dc.source.volume: 8
dc.source.journal: Frontiers in Robotics and AI
dc.identifier.doi: 10.3389/frobt.2021.738113
dc.identifier.cristin: 1931036
dc.relation.project: Norges forskningsråd: 295033
dc.source.articlenumber: 738113
cristin.ispublished: true
cristin.fulltext: original
cristin.qualitycode: 1
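
The abstract above singles out Proximal Policy Optimization (PPO) as the most robust of the compared algorithms. As a purely illustrative sketch of how a comparison of this kind might be set up, the snippet below trains PPO on a stand-in continuous-control task using the stable-baselines3 library; the library choice, the Pendulum-v1 environment, and the training budget are assumptions of this example, not the authors' actual vessel simulator, sensor suite, or hyperparameters.

    # Illustrative sketch only: stable-baselines3 and the Pendulum-v1
    # stand-in environment are assumptions of this example; the paper's
    # own vessel environment and settings are not reproduced here.
    import gymnasium as gym
    from stable_baselines3 import PPO

    env = gym.make("Pendulum-v1")        # stand-in for a vessel simulator
    model = PPO("MlpPolicy", env, verbose=0)
    model.learn(total_timesteps=50_000)  # short budget, for illustration

    # Roll out the learned policy for one episode and report its return.
    obs, _ = env.reset()
    done, episode_return = False, 0.0
    while not done:
        action, _ = model.predict(obs, deterministic=True)
        obs, reward, terminated, truncated, _ = env.step(action)
        episode_return += reward
        done = terminated or truncated
    print(f"episode return: {episode_return:.1f}")

Swapping PPO for another stable-baselines3 algorithm (e.g., SAC or DDPG) in the same loop would give the kind of side-by-side performance comparison the abstract describes.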


Associated file(s)


This item appears in the following collection(s)


Attribution 4.0 International
Except where otherwise noted, this item's license is described as Attribution 4.0 International