RL Training Slows Down With Full Experience Buffer
I’m working on an RL project using a DQN agent with a custom environment. Training without an LSTM layer had limited success, so I’m now trying with an LSTM layer. I quickly noticed that once the experience buffer fills, i.e. the Total Number of Steps counter reaches the ExperienceBufferLength set in the agent options, training slows to a crawl. The first several training episodes complete in seconds, but once that step count is reached the next episode takes several minutes, and later episodes never recover the original speed. This did not happen in any of my earlier iterations of the agent and learning settings before I added the LSTM layer.
I’m wondering whether this is expected behavior and why it might be happening given my settings. My agent options are below; more details can be provided if needed. Thanks!
agentOpts = rlDQNAgentOptions;
agentOpts.SequenceLength = 16;
agentOpts.MiniBatchSize = 100; % grab 100 "trajectories" of length 16 from the buffer at a time for training
agentOpts.NumStepsToLookAhead = 1; % forced to 1 for LSTM-enabled critics
agentOpts.ExperienceBufferLength = 10000; % once the total steps reaches this number, training slows down dramatically
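To give a sense of scale, these options mean each gradient update samples 100 sequences of 16 steps, i.e. 1600 time steps pushed through the 1024-unit LSTM critic shown further down. A rough back-of-the-envelope, using only the sizes from this post:
% Rough per-update cost once the buffer is full (sizes taken from the settings in this post)
miniBatchSize = 100;    % sequences sampled per update
sequenceLength = 16;    % steps per sequence
stepsPerUpdate = miniBatchSize * sequenceLength   % = 1600 LSTM time steps per update
lstmUnits = 1024;       % hidden units in the critic's lstmLayer
fcOutSize = 1024;       % size of the fully connected layer feeding the LSTM
lstmLearnables = 4 * (lstmUnits*fcOutSize + lstmUnits^2 + lstmUnits)   % roughly 8.4 million weights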
The critic network setup is:
ObservationInfo = rlNumericSpec([16 1]); % 16 tile game board
ActionInfo = rlFiniteSetSpec([1 2 3 4]); % 4 possible actions per turn
% Input Layer
layers = [
sequenceInputLayer(16, "Name", "input_1")
fullyConnectedLayer(1024,"Name","fc_1", "WeightsInitializer", "he")
reluLayer("Name","relu_body")
];
% Body Hidden Layers
for i = 1:3
layers = [layers
fullyConnectedLayer(1024, "Name", "fc_body_" + i, "WeightsInitializer", "he") % "fc_body_" + i keeps names distinct from "fc_1" above
reluLayer("Name", "body_output_" + i)
];
end
% LSTM Layer
layers = [layers
lstmLayer(1024,"Name","lstm", "InputWeightsInitializer", "he", "RecurrentWeightsInitializer", "he")
];
% Output Layer
layers = [layers
fullyConnectedLayer(4,"Name","fc_action")
regressionLayer("Name","output")
];
dqnCritic = rlQValueRepresentation(layers, ObservationInfo, ActionInfo, "Observation", "input_1");
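For completeness, the agent is then built from this critic and trained in the usual way. The snippet below is a simplified sketch; the training option values and the env variable are placeholders rather than my exact settings:
agent = rlDQNAgent(dqnCritic, agentOpts);
% Placeholder training options -- my real episode limits and stopping criteria differ
trainOpts = rlTrainingOptions("MaxEpisodes", 5000, "MaxStepsPerEpisode", 500, ...
    "Plots", "training-progress", "Verbose", false);
trainingStats = train(agent, env, trainOpts);  % env is my custom environment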