Training agent in reinforcement learning: reproducibility of the code

I get two different results when running this water-tank reinforcement learning example from MathWorks:
https://uk.mathworks.com/help/reinforcement-learning/ug/create-simulink-environment-and-train-agent.html
This example fixes the random number generator seed with rng(0), so I expected the result to be the same on all computers. However, I ended up with two different agents on two computers:
Computer A finished training the agent after 86 episodes (just like the published example) and gave me an identical agent to the example.
Computer B needed 182 episodes to train the agent and gave me a different agent.
Both computers run MATLAB R2023b 64-bit on MS Windows 10. The code is unchanged from the example (except for changing doTraining = false to doTraining = true).
Computer A has an 8-core i7 processor. Computer B has a 6-core i7 processor.
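To compare the two setups, a short diagnostic such as the one below could be run on each computer. It is only a sketch and not part of the MathWorks example. It assumes that the seeding call, the MATLAB version, and the number of computational threads are the settings worth comparing, and the variable names s and x are illustrative.

```matlab
% Diagnostic sketch (not from the example): run on each computer and
% compare the printed values side by side.
rng(0);                          % same seeding call as in the example
s = rng;                         % query the generator settings after seeding
fprintf('Generator: %s, seed: %d\n', s.Type, s.Seed);
fprintf('MATLAB version: %s\n', version);
fprintf('Computational threads: %d\n', maxNumCompThreads);
x = rand(1, 3);                  % first three draws after seeding
fprintf('First draws: %.15f %.15f %.15f\n', x);
```

If every printed value matches on both machines, the seed itself is presumably not the source of the difference, and the divergence would have to arise somewhere during training.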
I’m writing a tutorial for a university-level course, so reproducibility is necessary for students to be able to follow the example. Any tips on how to achieve this would be much appreciated.
