Parfor HPC Cluster – How to Assign Objects to Same Core Consistently?
Hello,
TL;DR: Is there a way to force MATLAB to consistently assign a classdef object to the same core when the parfor loop sits inside another (serial) loop?
Details:
I’m working on a fairly complex/large scale project which involves a large number of classdef objects & a 3D simulation. I’m running on an HPC cluster using the Slurm scheduler.
The 3D simulation has to run in a serial triple loop (at least for now; that’s not the bottleneck).
The bottleneck is the array of objects, each of which stores its own state & calls ode15s once per iteration. These are all independent so I want to run this part in a parfor loop, and this step takes much longer than the triple loop right now.
I’m running on a small test chunk within the 3D space, with about 1200 independent objects. Ultimately this will need to scale about 100x to 150,000 objects, so I need to make this as efficient as possible.
It looks like Matlab is smartly assigning the same object to the same core for the first ~704 objects, but then after that it randomly toggles between 2 cores & a few others:
This shows ~20 loops (loop iterations going downwards), with ~1200 class objects on the X axis; the colors represent the core/task assignment on each iteration. I used this inside the parfor loop to build the matrix:
task = getCurrentTask();
coreID(ti, ci) = task.ID;
This plot was created with the objects constructed in a parfor loop, but that didn’t help:
The basic structure of the code is this:
% pseudocode:
n_objects = 1200; % this needs to scale up to ~150,000 (so ~100x)
for i = 1:n_objects
    object_array(i) = constructor();
    % also tried doing this as parfor but didn’t help
end
% … other setup code…
% Big Loop:
dt = 1; % seconds
n_timesteps = 10000;
for i = 1:n_timesteps
    % unavoidable 3D triple loop update
    update3D(dt);
    parfor j = 1:n_objects
        % each object depends on 1 scalar from the 3D matrix
        object_array(j).update_ODEs(dt); % each object calls ode15s independently
    end
    % update 3D matrix with 1 scalar from each ODE object
end
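For reference, a minimal self-contained sketch of how the coreID matrix described above can be recorded around the same loop structure. It assumes the Parallel Computing Toolbox; the `row` buffer and the `pause` call are hypothetical stand-ins (the real code calls update_ODEs on each object), and the sizes are shrunk to match the ~20 iterations in the plot.

```matlab
% Sketch only: pause() stands in for the real update_ODEs/ode15s call.
n_objects   = 1200;
n_timesteps = 20;
coreID = zeros(n_timesteps, n_objects);   % rows: loop iterations, columns: objects

for ti = 1:n_timesteps
    row = zeros(1, n_objects);            % sliced output for this iteration
    parfor ci = 1:n_objects
        task = getCurrentTask();          % empty when running on the client
        if isempty(task)
            row(ci) = 0;                  % 0 marks "no worker" (serial run)
        else
            row(ci) = task.ID;            % ID of the worker task that ran object ci
        end
        pause(0.001);                     % placeholder for the per-object work
    end
    coreID(ti, :) = row;
end

% Visualize the assignment: one color per worker task ID
imagesc(coreID); colorbar;
xlabel('object index'); ylabel('loop iteration');
```

A column with a single color means that object was handled by the same worker on every iteration; mixed colors in a column mean the assignment changed between iterations.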
I’ve tried adding more RAM per core, but for some reason the assignment still seems to break after the 704th object, which is interesting.
And doing the object initialization/constructors inside a parfor loop made the initial core assignments less consistent (top row of plot).
Anyway, thank you for your help & please let me know if you have any ideas!
I’m also curious if there’s a way to make the "Big Loop" the parfor loop, and make a "serial critical section" or something for the 3D part? Or some other hack like that?
Thank you!
ETA 7/28/25: Updated pseudocode with dt & scalar values passing between 3D simulation & ODE objects