
dc.contributor.author: González-Redondo, Álvaro
dc.contributor.author: Garrido Alcázar, Jesús Alberto
dc.contributor.author: Naveros Arrabal, Francisco
dc.contributor.author: Hellgren, Jeanette
dc.contributor.author: Grillner, Sten
dc.contributor.author: Ros Vidal, Eduardo
dc.date.accessioned: 2023-06-23T07:53:23Z
dc.date.available: 2023-06-23T07:53:23Z
dc.date.issued: 2023-05-26
dc.identifier.uri: https://hdl.handle.net/10481/82756
dc.description.abstract: The basal ganglia (BG), and more specifically the striatum, have long been proposed to play an essential role in action selection based on a reinforcement learning (RL) paradigm. However, some recent findings, such as striatal spike-timing-dependent plasticity (STDP) or striatal lateral connectivity, require further research and modelling, as their respective roles are still not well understood. Theoretical models of spiking neurons with homeostatic mechanisms, lateral connectivity, and reward-modulated STDP have demonstrated a remarkable capability to learn sensory patterns that statistically correlate with a rewarding signal. In this article, we implement a functional and biologically inspired network model of the striatum, where learning is based on a previously proposed learning rule called spike-timing-dependent eligibility (STDE), which captures important experimental features of the striatum. The proposed computational model can recognize complex input patterns and consistently choose rewarded actions in response to such sensory inputs. Moreover, we assess the roles that different neuronal and network features, such as homeostatic mechanisms and lateral inhibitory connections, play in action selection with the proposed model. The homeostatic mechanisms make learning more robust (in terms of suitable parameters) and facilitate recovery after rewarding-policy swapping, while lateral inhibitory connections are important when multiple input patterns are associated with the same rewarded action. Finally, according to our simulations, the optimal delay between the action and the dopaminergic feedback is found to be around 300 ms, consistent with previous RL and biological studies.
dc.language.iso: eng
dc.rights: Creative Commons Attribution-NonCommercial-NoDerivs 3.0 License
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/3.0/
dc.subject: Striatum
dc.subject: Reinforcement learning
dc.subject: Spiking neural network
dc.subject: Dopamine
dc.subject: Eligibility trace
dc.subject: Spike-timing-dependent plasticity
dc.title: Reinforcement Learning in a Spiking Neural Model of Striatum Plasticity
dc.type: journal article
dc.relation.projectID: https://cordis.europa.eu/project/id/945539
dc.relation.projectID: https://cordis.europa.eu/project/id/891774
dc.rights.accessRights: open access
dc.identifier.doi: https://doi.org/10.1016/j.neucom.2023.126377
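The abstract describes learning via spike-timing-dependent eligibility (STDE): STDP-shaped pairings write into an eligibility trace, and a delayed dopamine signal (optimally around 300 ms) converts that trace into an actual weight change. A minimal sketch of this general mechanism, not the paper's actual implementation — all parameter names and values here are illustrative assumptions:

```python
import math

def stde_weight_update(pairings, reward_time, reward, w0=0.5,
                       a_plus=1.0, a_minus=0.6,
                       tau_stdp=0.020, tau_e=0.300, lr=0.1):
    """Sketch of spike-timing-dependent eligibility (STDE):
    each (t_pre, t_post) pairing adds an STDP-shaped increment to an
    eligibility trace; the trace decays with time constant tau_e and
    is turned into a weight change only when the delayed dopamine
    signal ('reward') arrives at reward_time. Times in seconds."""
    eligibility = 0.0
    for t_pre, t_post in pairings:
        dt = t_post - t_pre
        # Classic exponential STDP kernel: potentiate pre-before-post
        # pairings, depress post-before-pre pairings.
        if dt >= 0:
            eligibility += a_plus * math.exp(-dt / tau_stdp)
        else:
            eligibility -= a_minus * math.exp(dt / tau_stdp)
    # The trace decays between the last pairing and the dopamine signal,
    # so rewards arriving much later than tau_e barely move the weight.
    t_last = max(t_post for _, t_post in pairings)
    eligibility *= math.exp(-(reward_time - t_last) / tau_e)
    # Dopamine gates the eligibility trace into an actual weight change.
    return w0 + lr * reward * eligibility

# Pre-before-post pairing, dopamine arriving 300 ms after the post-spike:
w = stde_weight_update([(0.000, 0.010)], reward_time=0.310, reward=1.0)
```

With positive dopamine the pre-before-post pairing is potentiated; a negative dopamine signal depresses the same synapse, and a reward arriving long after the trace has decayed produces almost no change.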


