Research Objective
To extend the projective simulation (PS) model of artificial intelligence to account for meta-learning in reinforcement learning settings, enabling the agent to autonomously and dynamically adjust its own learning parameters (meta-parameters).
Research Results
The meta-learning PS agent was shown to cope well in all tested scenarios, reaching near-optimal or optimal success probabilities. The agent dynamically adjusts its meta-parameters to fit each new scenario, demonstrating the utility of both reflex-type adaptation and adaptation through learning.
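The idea of a meta-level that tunes a base-level learning parameter can be sketched as follows. This is a minimal illustrative implementation, not the paper's actual agent: the class name `MetaPSAgent`, the window-based meta-update rule, and the specific choice of the damping parameter `gamma` as the tuned meta-parameter are all assumptions made for the sketch.

```python
import random

class MetaPSAgent:
    """Sketch of a PS agent whose damping meta-parameter gamma is itself adapted."""

    def __init__(self, n_actions, gamma=0.1):
        self.h = [1.0] * n_actions  # h-values for a single percept (illustrative)
        self.gamma = gamma          # damping meta-parameter, tuned by the meta-level
        self.rewards = []           # reward history used for meta-level statistics

    def act(self):
        # Choose an action with probability proportional to its h-value.
        total = sum(self.h)
        r = random.random() * total
        acc = 0.0
        for a, hv in enumerate(self.h):
            acc += hv
            if r < acc:
                return a
        return len(self.h) - 1

    def learn(self, action, reward):
        # Base-level update: damp all h-values toward 1, reinforce the taken action.
        self.h = [hv - self.gamma * (hv - 1.0) for hv in self.h]
        self.h[action] += reward
        self.rewards.append(reward)

    def meta_update(self, window=100, step=0.01):
        # Meta-level update, running on a much longer time-scale: compare the
        # average reward over the last two windows and nudge gamma accordingly.
        if len(self.rewards) < 2 * window:
            return
        recent = sum(self.rewards[-window:]) / window
        earlier = sum(self.rewards[-2 * window:-window]) / window
        self.gamma += -step if recent >= earlier else step
        self.gamma = min(max(self.gamma, 0.0), 1.0)  # keep gamma in [0, 1]

def run(agent, rewarded_action, steps):
    # Simple bandit-style task: one fixed action yields reward 1, the rest 0.
    for t in range(steps):
        a = agent.act()
        agent.learn(a, 1.0 if a == rewarded_action else 0.0)
        if (t + 1) % 100 == 0:  # meta-update every 100 base-level steps
            agent.meta_update()
    return agent
```

The separation of time-scales mentioned in the limitations is visible here: the base-level `learn` runs every step, while `meta_update` fires only every 100 steps because it needs a window of rewards for a statistically meaningful comparison.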
Research Limitations
The meta-learning process occurs on a much larger time-scale than the base-level network learning, which is necessary for statistical evaluation but limits the speed of adaptation. Additionally, there is a trade-off between flexibility and learning time: higher flexibility may lead to longer learning times.