When DDPG optimizes PID parameters, how do I keep data from the system's initial stabilization phase out of the experience buffer?
I am doing adaptive PID control using Simulink's own RL Agent block. The first 20 s of each simulation are a stabilization (buffer) phase for the system: they are not part of the transitions the agent should learn from, but they must exist in the simulation. How can I make the Agent block ignore the actions, observations, and rewards from the first 20 s, or otherwise prevent them from affecting training? In practice, if the agent learns from this warm-up phase, the training results are very poor.

Tags: reinforcement learning, simulink, agent

MATLAB Answers — New Questions
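One possible workaround, sketched below under assumptions: instead of filtering samples out of the experience buffer mid-episode (which the RL Agent block does not expose directly), use the environment's `ResetFcn` to start each training episode at the already-stabilized operating point, so the warm-up transient never generates transitions at all. The model name `pidTuningModel`, the variable `x0`, the value `xStabilized`, and `obsInfo`/`actInfo` are all placeholders for illustration, not names from the original question.

```matlab
% Sketch: skip the stabilization phase by resetting each episode
% at the stabilized operating point (all names are placeholders).
mdl = 'pidTuningModel';
env = rlSimulinkEnv(mdl, [mdl '/RL Agent'], obsInfo, actInfo);

% ResetFcn receives a Simulink.SimulationInput object; setVariable
% overrides the model workspace variable 'x0' (assumed to hold the
% plant's initial state) with the stabilized state, so the episode
% begins after the transient and no warm-up samples reach the buffer.
env.ResetFcn = @(in) setVariable(in, 'x0', xStabilized);
```

An alternative, if the warm-up must be simulated every episode, is to gate the reward signal in Simulink (e.g. a Switch driven by a Clock that outputs zero reward for t < 20 s); note this only suppresses the reward, while the transitions themselves still enter the buffer.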