Safe Reinforcement Learning for Continuous Spaces through Lyapunov-Constrained Behavior

Fjerdingen, Sigrud Aksnes; Kyrkjebø, Erik
Journal article, Peer reviewed
SINTEF+S19504.pdf (194.8Kb)
Permanent link
http://hdl.handle.net/11250/2430386
Publication date
2011
Collections
  • Publications from CRIStin - SINTEF AS [4330]
  • SINTEF Digital [1671]
Original version
Frontiers in Artificial Intelligence and Applications. 2011, 70-79.  
Abstract
This paper presents a safe learning strategy for continuous state and action spaces that exploits the Lyapunov stability properties of the studied systems. The reinforcement learning algorithm Continuous Actor-Critic Learning Automaton (CACLA) is combined with control Lyapunov functions (CLFs) to confine learning and exploration to the stability region of the system, ensuring safe operation at all times. The paper extends previous results for discrete action sets to the more general continuous action sets, and shows that the continuous method finds a solution comparable to the best discrete action choices while avoiding the need for good heuristic choices in the design process.
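The idea of restricting CACLA-style exploration with a control Lyapunov function can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the single-integrator plant, the candidate function V(x) = x^2, the linear actor and critic, the fallback action, and all function names and parameter values.

```python
import random

DT = 0.1  # hypothetical integration step


def step(x, u):
    """Hypothetical plant: a single integrator, x_next = x + DT * u."""
    return x + DT * u


def lyapunov(x):
    """Assumed candidate control Lyapunov function V(x) = x^2."""
    return x * x


def is_safe(x, u):
    """Admit an action only if it does not increase V."""
    return lyapunov(step(x, u)) <= lyapunov(x)


def safe_explore(x, u_actor, sigma, rng, tries=20):
    """Gaussian exploration around the actor output, rejecting unsafe samples."""
    for _ in range(tries):
        u = rng.gauss(u_actor, sigma)
        if is_safe(x, u):
            return u
    return -x  # fallback: an action known to decrease V for this plant


def cacla_step(x, w, c, rng, sigma=0.5, alpha=0.1, beta=0.1, gamma=0.9):
    """One CACLA-style update with actor u = w * x and critic value c * x^2."""
    u = safe_explore(x, w * x, sigma, rng)
    x_next = step(x, u)
    reward = -lyapunov(x_next)  # assumed reward: penalize distance from origin
    delta = reward + gamma * c * x_next ** 2 - c * x ** 2  # TD error
    c += alpha * delta * x ** 2  # critic update
    if delta > 0:
        # CACLA moves the actor toward the explored action only when the
        # TD error is positive.
        w += beta * (u - w * x) * x
    return x_next, w, c


if __name__ == "__main__":
    rng = random.Random(0)
    x, w, c = 1.0, 0.0, 0.0
    for _ in range(50):
        x, w, c = cacla_step(x, w, c, rng)
    print(x, w, c)
```

Because every applied action passes the `is_safe` check or is the fallback, V never increases along the trajectory in this sketch, regardless of how poor the learned actor is. The constraint set and system used in the paper may differ from this toy setup.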
Journal
Frontiers in Artificial Intelligence and Applications
