Tuning the ExperienceHorizon hyperparameter for a PPO agent (Reinforcement Learning)

Hello everyone,
I’m trying to train a PPO agent, and I would like to change the value of the ExperienceHorizon hyperparameter (Options for PPO agent – MATLAB – MathWorks Switzerland).
When I try a value other than the default, the agent waits for the end of the episode to update its policy. For example, ExperienceHorizon=1024 doesn’t work for me, despite the episode’s length being more than 1024 steps. I’m also not using parallel training.
I get the same issue if I change MiniBatchSize from its default value.
Is there anything I’ve missed about this parameter?
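
The configuration described above could be sketched roughly as follows. This is only an illustration, assuming the options are created with rlPPOAgentOptions and rlTrainingOptions; the variable names and the MiniBatchSize value are hypothetical, and only ExperienceHorizon=1024 and the absence of parallel training come from the description.

```matlab
% Sketch only: PPO agent options with a non-default experience horizon.
agentOpts = rlPPOAgentOptions( ...
    'ExperienceHorizon', 1024, ...  % value mentioned in the question
    'MiniBatchSize', 256);          % hypothetical non-default value

% Parallel training is not used, as stated in the question.
trainOpts = rlTrainingOptions('UseParallel', false);
```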

More information on the PPO algorithm: Proximal Policy Optimization (PPO) Agents – MATLAB & Simulink – MathWorks Switzerland

If anyone could help, that would be very nice!

Thanks a lot in advance,
Nicolas
