Month: August 2025
uifigure+uitree, multiple selected nodes
Hello,
Does anybody know an (undocumented) workaround to have multiple nodes selected in a uitree with checkboxes? I would like to distinguish between checked and selected items.
If the uitree component is used without checkboxes, this is possible. I am wondering why this is not implemented (yet)…
hF = uifigure();
hT = uitree(hF,'checkbox');
hO1 = uitreenode(Parent=hT, Text='object1');
hO2 = uitreenode(Parent=hT, Text='object2');
hO3 = uitreenode(Parent=hT, Text='object3');
hO4 = uitreenode(Parent=hT, Text='object4');
hT.CheckedNodes = [hO1, hO2];
hT.SelectedNodes = [hO3, hO4];
%Error using matlab.ui.container.internal.model.TreeComponent/set.SelectedNodes (line 161)
%'SelectedNodes' must be an empty array or a 1-by-1 TreeNode object that is a child in the CheckBoxTree.
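One direction I have been experimenting with as a purely visual workaround (my own sketch, and it assumes a release where addStyle supports tree nodes) is to keep my own "selected" list separate from CheckedNodes and highlight those nodes with a uistyle:
mySelection = [hO3, hO4]; %my own selection list, independent of CheckedNodes
s = uistyle("BackgroundColor",[0.8 0.9 1]); %highlight colour for "selected" nodes
addStyle(hT, s, "node", mySelection);
This only changes the appearance, of course; the tree itself still allows just one node in SelectedNodes.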
Thanks in advance,
Martin
uifigure, uitree, selectednodes MATLAB Answers — New Questions
MATLAB graph/cyclebasis: How can I extract labeling-independent “minimal loop units”?
I have two undirected graphs that represent the same connectivity (isomorphic up to node relabeling). When I call cyclebasis on each, I get different sets of cycles. I understand a cycle basis can depend on the spanning tree/edge order, but I want a labeling-invariant notion of “minimal loop units.”
Code
clc; clear; close all
node = [
2 7
2 3
3 4
4 5
5 6
6 1
1 4
1 5
3 6
1 3
5 7
6 7
2 6
5 8];
G = graph(node(:,1), node(:,2), []);
cycles = cyclebasis(G)
figure(1); plot(G,'Layout','force');
%%
node2 = [
1 2
2 3
3 4
4 1
1 5
5 6
2 4
6 7
3 6
2 5
3 7
4 7
2 6
6 8];
G2 = graph(node2(:,1), node2(:,2), []);
cycles2 = cyclebasis(G2)
figure(2); plot(G2,'Layout','force');
Result:
cycles =
7×1 cell array
{[1 3 6 5]}
{[ 1 4 5]}
{[ 1 5 6]}
{[ 2 3 6]}
{[2 6 5 7]}
{[3 4 5 6]}
{[ 5 6 7]}
cycles2 =
7×1 cell array
{[ 1 2 4]}
{[ 1 2 5]}
{[ 2 3 4]}
{[ 2 3 6]}
{[2 3 7 6]}
{[2 4 7 6]}
{[ 2 5 6]}
Questions:
I know cyclebasis can vary with spanning tree/edge ordering. What's the recommended way in MATLAB to obtain "minimal loop units" that do not depend on node labeling or edge input order? For example, in the above case, each cycle is a 3-node triangle, and there should be seven such cycles.
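Not an answer, just two labeling-invariant checks I have been experimenting with while thinking about this (my own sketch, not an official recommendation):
%count 3-node loops straight from the adjacency matrix; independent of node labeling and edge input order
A = adjacency(G);
nTriangles = full(trace(A^3))/6
%confirm the two graphs really have the same connectivity up to relabeling
p = isomorphism(G, G2)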
graph, nodes, isomorphic, graph theory MATLAB Answers — New Questions
openfig visibility when saved figure has a callback
I used the method described here to save figures with a callback that would change their visibility to ‘on’ when opened.
fig = figure('Visible','off');
set(fig, 'CreateFcn', 'set(gcbo,"Visible","on")');
plot([1:10]);
savefig(fig,"C:\temp\test.fig")
close all
openfig("C:\temp\test.fig",'invisible')
Behavior of sample code: figure opens and is visible
Desired Behavior: figure opens and is invisible.
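The only workaround I can think of so far (my own guess, and the figure may still flash briefly) is to force visibility off again after openfig returns, since the stored CreateFcn appears to have already run by then:
fig = openfig("C:\temp\test.fig",'invisible');
fig.Visible = 'off';
But I would prefer a way to have the 'invisible' option respected directly.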
Is this possible? Thank you.
figure visibility MATLAB Answers — New Questions
How to avoid trimming of string cells got by readtable method?
Hello, everyone,
I am working with spreadsheet tables, which I import with the readtable method. I have several numeric columns, but my crucial point is the column with strings. The readtable function trims every string (removes leading and trailing whitespace). However, I need to avoid this trimming and leave the strings as they are. I tried to look into the documentation, but found only the trimming options related to tables originating from text files.
My current code looks like the following:
mytable=readtable(filename,'Sheet',sheetname,'NumHeaderLines',0)
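One thing I considered (my own guess: the WhitespaceRule option is only documented for text files, and I do not know whether it is honored when reading spreadsheets) is to go through import options instead:
opts = detectImportOptions(filename,'Sheet',sheetname);
opts = setvaropts(opts,'MyTextColumn','WhitespaceRule','preserve'); %'MyTextColumn' is a placeholder variable name
mytable = readtable(filename,opts);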
Do you have any suggestions?
readtable, strtrim MATLAB Answers — New Questions
Unverified Sender Messages Highlighted By Outlook Mobile
Unverified Sender Visual Marker to Warn of Potential Problems
It’s easy to forget details of the reports about new features that appear in the Microsoft 365 message center, especially when a delay occurs between the expected availability date and when the feature appears in plain sight. This is the case with MC1112452 (10 July 2025, Microsoft 365 roadmap item 491471), which reports the arrival of a new visual indicator in Outlook mobile to highlight messages from unverified senders. The indicator doesn’t mean that the message contains spam or malware; just that its properties are unexpected in a way that might be problematic.
I knew about the feature and had read that mid-July 2025 was the scheduled rollout date. Alas, after that I forgot about unverified visual indicators until a bunch of messages were tagged on August 18 (Figure 1).

MC1112452 says that the new visual warning “aligns with existing functionality in Outlook for desktop and web, bringing a consistent experience across platforms.” Later, the text adds “This feature is available by default when the unverified signal is received by the service.” The support documentation notes “If the message is suspicious but isn’t deemed malicious, the sender will be marked as unverified to notify the receiver that the sender may not be who they appear to be.”
No administrative controls exist to enable or disable the marking of unverified senders. This seems like a pity because external mail tagging, which is another marking applied to inbound messages by Exchange Online, can be turned off or on. The same is true for the first contact safety tip, which can be disabled through the anti-phishing policy actions in the Microsoft Defender portal.
Message Headers and Visual Markers
After examining the headers of several unverified messages, it seems that if the Exchange Online transport service determines that an inbound message can't be fully verified by the normal tests it applies, like SPF, DKIM, and DMARC, the service marks the message as unverified, and users see the warning alongside the message in the message list.
For example, Figure 2 (generated by the Message Header Analyzer) shows that Exchange was unable to check the DKIM status for a message, perhaps because of a DKIM configuration error in DNS, or because the email is routed through a third-party mail hygiene system like Mimecast that must be configured in a certain manner for DKIM to work properly.

Same Warnings in other Outlook Clients
As noted above, the functionality now implemented in Outlook mobile matches what users see in Outlook classic, OWA, and the new Outlook for Windows (adjusted to match the UX of the client).

The interesting thing is that Outlook only displays the unverified sender warning in the message list for the first message in a thread. If you reply to an unverified sender, the warning is still present, but only for that message. Any subsequent messages received from that sender are not highlighted, even if the same issues persist with DKIM, etc. Perhaps the logic here is that if you engage in an email conversation with a sender, that person is verified by your action and Outlook no longer considers them to be unverified. It's the best theory that I have to explain what I see happening.
Most Unverified Senders are OK, and Then There’s a Bad One
Marking email coming from unverified senders is a good idea. In many cases, a perfectly simple reason exists for why the verification checks fail, and the sending system can quickly adjust its configuration to address the problem. In others, a user could receive a message that's an attempt to impersonate someone else. Hopefully, messages from attackers will be intercepted by Exchange Online Protection, Microsoft Defender for Office 365, or whatever anti-malware service is deployed, but it's nice to have another check. Just in case.
Insight like this doesn’t come easily. You’ve got to know the technology and understand how to look behind the scenes. Benefit from the knowledge and experience of the Office 365 for IT Pros team by subscribing to the best eBook covering Office 365 and the wider Microsoft 365 ecosystem.
Enabled Subsystem does not reliably disable
The Simulink model shown in the screenshot does not behave as I expect, and I would like to understand the reason.
All blocks are on default settings, except:
Gain: 1/pi
Compare To Constant: operator >=, constant value 1.0
Logical Operator: XOR
(Scope: two inputs, line markers)
Note that the Compare To Constant block has Zero-crossing (ZC) detection enabled by default. The Enabled Subsystem consists of one input port directly connected to the output port.
All solver settings are at default values. The solver is auto(VariableStepDiscrete). I only change the StopTime of the simulation (see below).
Expected behavior
The enabled subsystem should be enabled for exactly one time point (in major time steps) and "sample and hold" the value of its input (the scaled clock signal).
This should occur when the scaled clock signal reaches the threshold value of 1.0 (but not later than that, thanks to zero-crossing detection).
The sampled and held value (see Display block) should be 1.0 (sampled at time t=pi).
Observed behavior
Sometimes the result (= sampled and held value at the end of the simulation) is larger than expected. The scope screenshot (lower half) shows that the enabled subsystem seems to be enabled for two time points (instead of one). At the first time point (at t=pi) it copies a value of 1.0 to its output; at the subsequent second time point, a value of approx. 1.025, which is then held until the end of the simulation.
Whether the problem occurs weirdly depends on the StopTime. Some examples:
For these values of StopTime the problem occurs: 4.00001, 8, 9, 10.1, 11, 13.
For these values everything works fine: 4, 6, 7, 10, 12.
My analysis so far
When the problem occurs, Simulink has placed two time points at the zero-crossing, one just before and one just after the zero-crossing. When the problem doesn’t occur, there are three time points close to the zero-crossing (one just before, two just after the ZC).
Although the XOR block output is correctly true only for one time point (right-hand side of the zero-crossing, see upper half of the scope screenshot), the final output of the enabled subsystem always seems to be equal to the value of its input sampled one time point later (that is: when the XOR has dropped to false again). That is not a problem if that time point is the third of 3 close to the ZC, but it produces a wrong result if that time point is much later than the ZC.
So I wonder why Simulink sometimes uses 3 time points at a ZC (good), but sometimes only 2 (bad). Any hints or explanations are welcome as to why Simulink behaves like this and/or why it should(n't).
Some more notes
I know the expected behavior could be implemented differently. That’s not the point. This is a minimal example. In my opinion it should not behave the way it does.
I’m unsure about the correct/official wording: What I mean by "time point" are the elements of "tout". Usually I refer to them as major time steps, but maybe that’s wrong as they’re points in time, not steps. (?)
My maybe related (and unfortunately still unanswered) question about zero-crossings and the number of time steps/points taken by Simulink: https://de.mathworks.com/matlabcentral/answers/553960-number-of-necessary-time-steps-to-handle-a-zero-crossing
enabled subsystem, zero-crossing MATLAB Answers — New Questions
How to patch the area under curve for a polarplot?
I want to create a figure as shown below using the "polarplot" function. Can anyone please help me regarding this?
patch, polarplot MATLAB Answers — New Questions
Why does my Custom TD3 not learn like the built-in TD3 agent?
I have tried to code up my custom TD3 agent to behave as much as possible like the built-in TD3 agent in the same Simulink environment. The only difference between them is that, for the custom agent, I had to use a Rate Transition block to perform a zero-order hold between the states, rewards, and done signal and the custom agent. I used the Rate Transition block's "specify mode for output port sample time" option to set the custom agent's sample time.
My code for my custom TD3 agent is below. I tried to make it as much like the built-in TD3 as possible; the ep_counter and num_of_ep properties are unused.
classdef test_TD3Agent_V2 < rl.agent.CustomAgent
properties
%neural networks
actor
critic1
critic2
%target networks
target_actor
target_critic1
target_critic2
%dimensions
statesize
actionsize
%optimizers
actor_optimizer
critic1_optimizer
critic2_optimizer
%buffer
statebuffer
nextstatebuffer
actionbuffer
rewardbuffer
donebuffer
counter %keeps count of number experiences encountered
index %keeps track of current available index in buffer
buffersize
batchsize
%episodes
num_of_ep
ep_counter
%keep count of critic number of updates
num_critic_update
end
methods
%constructor
function obj = test_TD3Agent_V2(actor,critic1,critic2,target_actor,target_critic1,target_critic2,actor_opt,critic1_opt,critic2_opt,statesize,actionsize,buffer_size,batchsize,num_of_ep)
%(required) call abstract class constructor
obj = obj@rl.agent.CustomAgent();
%define observation + action space
obj.ObservationInfo = rlNumericSpec([statesize 1]);
obj.ActionInfo = rlNumericSpec([actionsize 1],LowerLimit = -1,UpperLimit = 1);
obj.SampleTime = -1; %determined by rate transition block
%define networks
obj.actor = actor;
obj.critic1 = critic1;
obj.critic2 = critic2;
%define target networks
obj.target_actor = target_actor;
obj.target_critic1 = target_critic1;
obj.target_critic2 = target_critic2;
%define optimizer
obj.actor_optimizer = actor_opt;
obj.critic1_optimizer = critic1_opt;
obj.critic2_optimizer = critic2_opt;
%record dimensions
obj.statesize = statesize;
obj.actionsize = actionsize;
%initialize buffer
obj.statebuffer = dlarray(zeros(statesize,1,buffer_size));
obj.nextstatebuffer = dlarray(zeros(statesize,1,buffer_size));
obj.actionbuffer = dlarray(zeros(actionsize,1,buffer_size));
obj.rewardbuffer = dlarray(zeros(1,buffer_size));
obj.donebuffer = zeros(1,buffer_size);
obj.buffersize = buffer_size;
obj.batchsize = batchsize;
obj.counter = 0;
obj.index = 1;
%episodes (unused)
obj.num_of_ep = num_of_ep;
obj.ep_counter = 1;
%used for delay actor update and target network soft transfer
obj.num_critic_update = 0;
end
end
methods (Access = protected)
%Action method
function action = getActionImpl(obj,Observation)
% Given the current state of the system, return an action
action = getAction(obj.actor,Observation);
end
%Action with noise method
function action = getActionWithExplorationImpl(obj,Observation)
% Given the current observation, select an action
action = getAction(obj.actor,Observation);
% Add random noise to action
end
%Learn method
function action = learnImpl(obj,Experience)
%parse experience
state = Experience{1};
action_ = Experience{2};
reward = Experience{3};
next_state = Experience{4};
isdone = Experience{5};
%buffer operations
%check if index wraps around
if (obj.index > obj.buffersize)
obj.index = 1;
end
%record experience in buffer
obj.statebuffer(:,:,obj.index) = state{1};
obj.actionbuffer(:,:,obj.index) = action_{1};
obj.rewardbuffer(:,obj.index) = reward;
obj.nextstatebuffer(:,:,obj.index) = next_state{1};
obj.donebuffer(:,obj.index) = isdone;
%increment index and counter
obj.counter = obj.counter + 1;
obj.index = obj.index + 1;
%if non terminal state
if (isdone == false)
action = getAction(obj.actor,next_state); %select next action
noise = randn([6,1]).*0.1; %gaussian noise with standard dev of 0.1
action{1} = action{1} + noise; %add noise
action{1} = clip(action{1},-1,1); %clip action noise
else
%learning at the end of episode
if (obj.counter >= obj.batchsize)
max_index = min([obj.counter obj.buffersize]); %range of index 1 to max_index for buffer
%sample experience randomly from buffer
sample_index_vector = randsample(max_index,obj.batchsize); %vector of index experience to sample
%create buffer mini batch dlarrays
state_batch = dlarray(zeros(obj.statesize,1,obj.batchsize));
nextstate_batch = dlarray(zeros(obj.statesize,1,obj.batchsize));
action_batch = dlarray(zeros(obj.actionsize,1,obj.batchsize));
reward_batch = dlarray(zeros(1,obj.batchsize));
done_batch = zeros(1,obj.batchsize);
for i = 1:obj.batchsize %iterate through buffer and transfer experience over to mini batch
state_batch(:,:,i) = obj.statebuffer(:,:,sample_index_vector(i));
nextstate_batch(:,:,i) = obj.nextstatebuffer(:,:,sample_index_vector(i));
action_batch(:,:,i) = obj.actionbuffer(:,:,sample_index_vector(i));
reward_batch(:,i) = obj.rewardbuffer(:,sample_index_vector(i));
done_batch(:,i) = obj.donebuffer(:,sample_index_vector(i));
end
%update critic networks
criticgrad1 = dlfeval(@critic_gradient,obj.critic1,obj.target_actor,obj.target_critic1,obj.target_critic2,{state_batch},{nextstate_batch},{action_batch},reward_batch,done_batch,obj.batchsize);
[obj.critic1,obj.critic1_optimizer] = update(obj.critic1_optimizer,obj.critic1,criticgrad1);
criticgrad2 = dlfeval(@critic_gradient,obj.critic2,obj.target_actor,obj.target_critic1,obj.target_critic2,{state_batch},{nextstate_batch},{action_batch},reward_batch,done_batch,obj.batchsize);
[obj.critic2,obj.critic2_optimizer] = update(obj.critic2_optimizer,obj.critic2,criticgrad2);
%update num of critic updates
obj.num_critic_update = obj.num_critic_update + 1;
%delayed actor update + target network transfer
if (mod(obj.num_critic_update,2) == 0)
actorgrad = dlfeval(@actor_gradient,obj.actor,obj.critic1,obj.critic2,{state_batch});
[obj.actor,obj.actor_optimizer] = update(obj.actor_optimizer,obj.actor,actorgrad);
target_soft_transfer(obj);
end
end
end
end
%function used to soft transfer over to target networks
function target_soft_transfer(obj)
smooth_factor = 0.005;
for i = 1:6
obj.target_actor.Learnables{i} = smooth_factor*obj.actor.Learnables{i} + (1 - smooth_factor)*obj.target_actor.Learnables{i};
obj.target_critic1.Learnables{i} = smooth_factor*obj.critic1.Learnables{i} + (1 - smooth_factor)*obj.target_critic1.Learnables{i};
obj.target_critic2.Learnables{i} = smooth_factor*obj.critic2.Learnables{i} + (1 - smooth_factor)*obj.target_critic2.Learnables{i};
end
end
end
end
%obtain gradient of Q value wrt actor
function actorgradient = actor_gradient(actorNet,critic1,critic2,states,batchsize)
actoraction = getAction(actorNet,states); %obtain actor action
%obtain Q values
Q1 = getValue(critic1,states,actoraction);
Q2 = getValue(critic2,states,actoraction);
%obtain min Q values + reverse sign for gradient ascent
Qmin = min(Q1,Q2);
Q = -1*mean(Qmin);
gradient = dlgradient(Q,actorNet.Learnables); %calculate gradient of Q value wrt NN learnables
actorgradient = gradient;
end
%obtain gradient of critic NN
function criticgradient = critic_gradient(critic,target_actor,target_critic_1,target_critic_2,states,nextstates,actions,rewards,dones,batchsize)
%obtain target action
target_actions = getAction(target_actor,nextstates);
%target policy smoothing
for i = 1:batchsize
target_noise = randn([6,1]).*sqrt(0.2);
target_noise = clip(target_noise,-0.5,0.5);
target_actions{1}(:,:,i) = target_actions{1}(:,:,i) + target_noise; %add noise to action for smoothing
end
target_actions{1}(:,:,:) = clip(target_actions{1}(:,:,:),-1,1); %clip btw -1 and 1
%obtain Q values
Qtarget1 = getValue(target_critic_1,nextstates,target_actions);
Qtarget2 = getValue(target_critic_2,nextstates,target_actions);
Qmin = min(Qtarget1,Qtarget2);
Qoptimal = rewards + 0.99*Qmin.*(1 - dones);
Qpred = getValue(critic,states,actions);
%obtain critic loss
criticLoss = 0.5*mean((Qoptimal - Qpred).^2);
criticgradient = dlgradient(criticLoss,critic.Learnables);
end
And here is my code when using the built-in TD3 agent:
clc
%define times
dt = 0.1; %time steps
Tf = 7; %simulation time
%create stateInfo and actionInfo objects
statesize = 38;
actionsize = 6;
stateInfo = rlNumericSpec([statesize 1]);
actionInfo = rlNumericSpec([actionsize 1],LowerLimit = -1,UpperLimit = 1);
mdl = 'KUKA_EE_Controller_v18_disturbed';
blk = 'KUKA_EE_Controller_v18_disturbed/RL Agent';
%create environment object
env = rlSimulinkEnv(mdl,blk,stateInfo,actionInfo);
%assign reset function
env.ResetFcn = @ResetFunction;
% %create actor network
actorlayers = [
featureInputLayer(statesize)
fullyConnectedLayer(800)
reluLayer
fullyConnectedLayer(600)
reluLayer
fullyConnectedLayer(actionsize)
tanhLayer
];
actorNet = dlnetwork;
actorNet = addLayers(actorNet, actorlayers);
actorNet = initialize(actorNet);
actor = rlContinuousDeterministicActor(actorNet, stateInfo, actionInfo);
%create critic networks
statelayers = [
featureInputLayer(statesize, Name='states')
concatenationLayer(1, 2, Name='concat')
fullyConnectedLayer(400)
reluLayer
fullyConnectedLayer(400)
reluLayer
fullyConnectedLayer(1, Name='Qvalue')
];
actionlayers = featureInputLayer(actionsize, Name='actions');
criticNet = dlnetwork;
criticNet = addLayers(criticNet, statelayers);
criticNet = addLayers(criticNet, actionlayers);
criticNet = connectLayers(criticNet, 'actions', 'concat/in2');
criticNet = initialize(criticNet);
critic1 = rlQValueFunction(criticNet,stateInfo,actionInfo,ObservationInputNames='states',ActionInputNames='actions');
criticNet2 = dlnetwork;
criticNet2 = addLayers(criticNet2, statelayers);
criticNet2 = addLayers(criticNet2, actionlayers);
criticNet2 = connectLayers(criticNet2, 'actions', 'concat/in2');
criticNet2 = initialize(criticNet2);
critic2 = rlQValueFunction(criticNet2,stateInfo,actionInfo,ObservationInputNames='states',ActionInputNames='actions');
%create options object for actor and critic
actoroptions = rlOptimizerOptions(Optimizer='adam',LearnRate=0.001);
criticoptions = rlOptimizerOptions(Optimizer='adam',LearnRate=0.003);
agentoptions = rlTD3AgentOptions;
agentoptions.SampleTime = dt;
agentoptions.ActorOptimizerOptions = actoroptions;
agentoptions.CriticOptimizerOptions = criticoptions;
agentoptions.DiscountFactor = 0.99;
agentoptions.TargetSmoothFactor = 0.005;
agentoptions.ExperienceBufferLength = 1000000;
agentoptions.MiniBatchSize = 250;
agentoptions.ExplorationModel.StandardDeviation = 0.1;
agentoptions.ExplorationModel.StandardDeviationDecayRate = 1e-4;
agent = rlTD3Agent(actor, [critic1 critic2], agentoptions);
%create training options object
trainOpts = rlTrainingOptions(MaxEpisodes=20,MaxStepsPerEpisode=floor(Tf/dt),StopTrainingCriteria='none',SimulationStorageType='none');
%train agent
trainresults = train(agent,env,trainOpts);
I made my custom TD3 agent with the same actor and critic structures, the same hyperparameters, and the same agent options, but it doesn't seem to learn and I don't know why. I don't know if the Rate Transition block is having a negative impact on the training. One difference between my custom TD3 and the built-in TD3 is the actor gradient. The MATLAB documentation on the TD3 agent says the gradient is calculated for every sample in the mini-batch, and then the gradients are accumulated and averaged.
https://www.mathworks.com/help/reinforcement-learning/ug/td3-agents.html (TD3 documentation)
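My reading of that description, written out as a sketch (my own assumption, not code taken from the documentation), would be to compute one gradient per experience inside learnImpl and average the accumulated result, instead of averaging the Q values first:
%per-sample actor gradients, accumulated and averaged over the mini batch (sketch, using the same variables as in learnImpl above)
accGrad = [];
for i = 1:obj.batchsize
    g = dlfeval(@actor_gradient,obj.actor,obj.critic1,obj.critic2,{state_batch(:,:,i)});
    if isempty(accGrad)
        accGrad = g;
    else
        accGrad = cellfun(@plus,accGrad,g,'UniformOutput',false); %assumes gradients are cell arrays of dlarrays, matching my soft-update code
    end
end
accGrad = cellfun(@(v) v./obj.batchsize,accGrad,'UniformOutput',false);
[obj.actor,obj.actor_optimizer] = update(obj.actor_optimizer,obj.actor,accGrad);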
But in the actor_gradient function in my code above, I averaged the Q values over the mini-batch first and then performed only one gradient operation. So maybe that's one possible reason why my custom TD3 agent isn't learning. Here are my rewards for:
Built-in TD3
Custom TD3 (I stopped it early because it wasn't learning)
I would appreciate any help because I have been stuck for months.So I have tried to code up my custom TD3 agent to behave as much like the built in TD3 agent in the same simulink environment, the only difference between them is for the custom agent, I had to use a rate transition block to perform zero order hold between the states, rewards, done signal and the custom agent. I used the rate transition block specify mode for output port sample time options to set the custom agent sample time.
My code for my custom TD3 agent is below, I tried to make it as much like the built-in TD3 as possible, the ep_counter,num_of_ep properties are unused.
classdef test_TD3Agent_V2 < rl.agent.CustomAgent
properties
%neural networks
actor
critic1
critic2
%target networks
target_actor
target_critic1
target_critic2
%dimensions
statesize
actionsize
%optimizers
actor_optimizer
critic1_optimizer
critic2_optimizer
%buffer
statebuffer
nextstatebuffer
actionbuffer
rewardbuffer
donebuffer
counter %keeps count of number experiences encountered
index %keeps track of current available index in buffer
buffersize
batchsize
%episodes
num_of_ep
ep_counter
%keep count of critic number of updates
num_critic_update
end
methods
%constructor
function obj = test_TD3Agent_V2(actor,critic1,critic2,target_actor,target_critic1,target_critic2,actor_opt,critic1_opt,critic2_opt,statesize,actionsize,buffer_size,batchsize,num_of_ep)
%(required) call abstract class constructor
obj = obj@rl.agent.CustomAgent();
%define observation + action space
obj.ObservationInfo = rlNumericSpec([statesize 1]);
obj.ActionInfo = rlNumericSpec([actionsize 1],LowerLimit = -1,UpperLimit = 1);
obj.SampleTime = -1; %determined by rate transition block
%define networks
obj.actor = actor;
obj.critic1 = critic1;
obj.critic2 = critic2;
%define target networks
obj.target_actor = target_actor;
obj.target_critic1 = target_critic1;
obj.target_critic2 = target_critic2;
%define optimizer
obj.actor_optimizer = actor_opt;
obj.critic1_optimizer = critic1_opt;
obj.critic2_optimizer = critic2_opt;
%record dimensions
obj.statesize = statesize;
obj.actionsize = actionsize;
%initialize buffer
obj.statebuffer = dlarray(zeros(statesize,1,buffer_size));
obj.nextstatebuffer = dlarray(zeros(statesize,1,buffer_size));
obj.actionbuffer = dlarray(zeros(actionsize,1,buffer_size));
obj.rewardbuffer = dlarray(zeros(1,buffer_size));
obj.donebuffer = zeros(1,buffer_size);
obj.buffersize = buffer_size;
obj.batchsize = batchsize;
obj.counter = 0;
obj.index = 1;
%episodes (unused)
obj.num_of_ep = num_of_ep;
obj.ep_counter = 1;
%used for delay actor update and target network soft transfer
obj.num_critic_update = 0;
end
end
methods (Access = protected)
%Action method
function action = getActionImpl(obj,Observation)
% Given the current state of the system, return an action
action = getAction(obj.actor,Observation);
end
%Action with noise method
function action = getActionWithExplorationImpl(obj,Observation)
% Given the current observation, select an action
action = getAction(obj.actor,Observation);
% Add random noise to action
end
%Learn method
function action = learnImpl(obj,Experience)
%parse experience
state = Experience{1};
action_ = Experience{2};
reward = Experience{3};
next_state = Experience{4};
isdone = Experience{5};
%buffer operations
%check if index wraps around
if (obj.index > obj.buffersize)
obj.index = 1;
end
%record experience in buffer
obj.statebuffer(:,:,obj.index) = state{1};
obj.actionbuffer(:,:,obj.index) = action_{1};
obj.rewardbuffer(:,obj.index) = reward;
obj.nextstatebuffer(:,:,obj.index) = next_state{1};
obj.donebuffer(:,obj.index) = isdone;
%increment index and counter
obj.counter = obj.counter + 1;
obj.index = obj.index + 1;
%if non terminal state
if (isdone == false)
action = getAction(obj.actor,next_state); %select next action
noise = randn([6,1]).*0.1; %gaussian noise with standard dev of 0.1
action{1} = action{1} + noise; %add noise
action{1} = clip(action{1},-1,1); %clip action noise
else
%learning at the end of episode
if (obj.counter >= obj.batchsize)
max_index = min([obj.counter obj.buffersize]); %range of index 1 to max_index for buffer
%sample experience randomly from buffer
sample_index_vector = randsample(max_index,obj.batchsize); %vector of index experience to sample
%create buffer mini batch dlarrays
state_batch = dlarray(zeros(obj.statesize,1,obj.batchsize));
nextstate_batch = dlarray(zeros(obj.statesize,1,obj.batchsize));
action_batch = dlarray(zeros(obj.actionsize,1,obj.batchsize));
reward_batch = dlarray(zeros(1,obj.batchsize));
done_batch = zeros(1,obj.batchsize);
for i = 1:obj.batchsize %iterate through buffer and transfer experience over to mini batch
state_batch(:,:,i) = obj.statebuffer(:,:,sample_index_vector(i));
nextstate_batch(:,:,i) = obj.nextstatebuffer(:,:,sample_index_vector(i));
action_batch(:,:,i) = obj.actionbuffer(:,:,sample_index_vector(i));
reward_batch(:,i) = obj.rewardbuffer(:,sample_index_vector(i));
done_batch(:,i) = obj.donebuffer(:,sample_index_vector(i));
end
%update critic networks
criticgrad1 = dlfeval(@critic_gradient,obj.critic1,obj.target_actor,obj.target_critic1,obj.target_critic2,{state_batch},{nextstate_batch},{action_batch},reward_batch,done_batch,obj.batchsize);
[obj.critic1,obj.critic1_optimizer] = update(obj.critic1_optimizer,obj.critic1,criticgrad1);
criticgrad2 = dlfeval(@critic_gradient,obj.critic2,obj.target_actor,obj.target_critic1,obj.target_critic2,{state_batch},{nextstate_batch},{action_batch},reward_batch,done_batch,obj.batchsize);
[obj.critic2,obj.critic2_optimizer] = update(obj.critic2_optimizer,obj.critic2,criticgrad2);
%update num of critic updates
obj.num_critic_update = obj.num_critic_update + 1;
%delayed actor update + target network transfer
if (mod(obj.num_critic_update,2) == 0)
actorgrad = dlfeval(@actor_gradient,obj.actor,obj.critic1,obj.critic2,{state_batch});
[obj.actor,obj.actor_optimizer] = update(obj.actor_optimizer,obj.actor,actorgrad);
target_soft_transfer(obj);
end
end
end
end
%function used to soft transfer over to target networks
function target_soft_transfer(obj)
smooth_factor = 0.005;
for i = 1:6
obj.target_actor.Learnables{i} = smooth_factor*obj.actor.Learnables{i} + (1 – smooth_factor)*obj.target_actor.Learnables{i};
obj.target_critic1.Learnables{i} = smooth_factor*obj.critic1.Learnables{i} + (1 – smooth_factor)*obj.target_critic1.Learnables{i};
obj.target_critic2.Learnables{i} = smooth_factor*obj.critic2.Learnables{i} + (1 – smooth_factor)*obj.target_critic2.Learnables{i};
end
end
end
end
%obtain gradient of Q value wrt actor
function actorgradient = actor_gradient(actorNet,critic1,critic2,states,batchsize)
actoraction = getAction(actorNet,states); %obtain actor action
%obtain Q values
Q1 = getValue(critic1,states,actoraction);
Q2 = getValue(critic2,states,actoraction);
%obtain min Q values + reverse sign for gradient ascent
Qmin = min(Q1,Q2);
Q = -1*mean(Qmin);
gradient = dlgradient(Q,actorNet.Learnables); %calculate gradient of Q value wrt NN learnables
actorgradient = gradient;
end
%obtain gradient of critic NN
function criticgradient = critic_gradient(critic,target_actor,target_critic_1,target_critic_2,states,nextstates,actions,rewards,dones,batchsize)
%obtain target action
target_actions = getAction(target_actor,nextstates);
%target policy smoothing
for i = 1:batchsize
target_noise = randn([6,1]).*sqrt(0.2);
target_noise = clip(target_noise,-0.5,0.5);
target_actions{1}(:,:,i) = target_actions{1}(:,:,i) + target_noise; %add noise to action for smoothing
end
target_actions{1}(:,:,:) = clip(target_actions{1}(:,:,:),-1,1); %clip btw -1 and 1
%obtain Q values
Qtarget1 = getValue(target_critic_1,nextstates,target_actions);
Qtarget2 = getValue(target_critic_2,nextstates,target_actions);
Qmin = min(Qtarget1,Qtarget2);
Qoptimal = rewards + 0.99*Qmin.*(1 – dones);
Qpred = getValue(critic,states,actions);
%obtain critic loss
criticLoss = 0.5*mean((Qoptimal – Qpred).^2);
criticgradient = dlgradient(criticLoss,critic.Learnables);
end
And here is my code when using the built in TD3 agent
clc
%define times
dt = 0.1; %time steps
Tf = 7; %simulation time
%create stateInfo and actionInfo objects
statesize = 38;
actionsize = 6;
stateInfo = rlNumericSpec([statesize 1]);
actionInfo = rlNumericSpec([actionsize 1],LowerLimit = -1,UpperLimit = 1);
mdl = ‘KUKA_EE_Controller_v18_disturbed’;
blk = ‘KUKA_EE_Controller_v18_disturbed/RL Agent’;
%create environment object
env = rlSimulinkEnv(mdl,blk,stateInfo,actionInfo);
%assign reset function
env.ResetFcn = @ResetFunction;
% %create actor network
actorlayers = [
featureInputLayer(statesize)
fullyConnectedLayer(800)
reluLayer
fullyConnectedLayer(600)
reluLayer
fullyConnectedLayer(actionsize)
tanhLayer
];
actorNet = dlnetwork;
actorNet = addLayers(actorNet, actorlayers);
actorNet = initialize(actorNet);
actor = rlContinuousDeterministicActor(actorNet, stateInfo, actionInfo);
%create critic networks
statelayers = [
featureInputLayer(statesize, Name=’states’)
concatenationLayer(1, 2, Name=’concat’)
fullyConnectedLayer(400)
reluLayer
fullyConnectedLayer(400)
reluLayer
fullyConnectedLayer(1, Name=’Qvalue’)
];
actionlayers = featureInputLayer(actionsize, Name=’actions’);
criticNet = dlnetwork;
criticNet = addLayers(criticNet, statelayers);
criticNet = addLayers(criticNet, actionlayers);
criticNet = connectLayers(criticNet, ‘actions’, ‘concat/in2’);
criticNet = initialize(criticNet);
critic1 = rlQValueFunction(criticNet,stateInfo,actionInfo,ObservationInputNames=’states’,ActionInputNames=’actions’);
criticNet2 = dlnetwork;
criticNet2 = addLayers(criticNet2, statelayers);
criticNet2 = addLayers(criticNet2, actionlayers);
criticNet2 = connectLayers(criticNet2, ‘actions’, ‘concat/in2’);
criticNet2 = initialize(criticNet2);
critic2 = rlQValueFunction(criticNet2,stateInfo,actionInfo,ObservationInputNames=’states’,ActionInputNames=’actions’);
%create options object for actor and critic
actoroptions = rlOptimizerOptions(Optimizer=’adam’,LearnRate=0.001);
criticoptions = rlOptimizerOptions(Optimizer=’adam’,LearnRate=0.003);
agentoptions = rlTD3AgentOptions;
agentoptions.SampleTime = dt;
agentoptions.ActorOptimizerOptions = actoroptions;
agentoptions.CriticOptimizerOptions = criticoptions;
agentoptions.DiscountFactor = 0.99;
agentoptions.TargetSmoothFactor = 0.005;
agentoptions.ExperienceBufferLength = 1000000;
agentoptions.MiniBatchSize = 250;
agentoptions.ExplorationModel.StandardDeviation = 0.1;
agentoptions.ExplorationModel.StandardDeviationDecayRate = 1e-4;
agent = rlTD3Agent(actor, [critic1 critic2], agentoptions);
%create training options object
trainOpts = rlTrainingOptions(MaxEpisodes=20,MaxStepsPerEpisode=floor(Tf/dt),StopTrainingCriteria=’none’,SimulationStorageType=’none’);
%train agent
trainresults = train(agent,env,trainOpts);
I made my custom TD3 agent with the same actor and critic structures, with the same hyperparameters, and with the same agent options. But it doesn’t seem to learn and I don’t know why. I don’t know if the rate transition block is having a negative impact on the training. One difference between my custom TD3 and the built-in TD3 is the actor gradient. In the matlab documentation on TD3 agent, it says the gradient is calculated for every sample in the mini batch then the gradient is accumulated and averaged out.
https://www.mathworks.com/help/reinforcement-learning/ug/td3-agents.html (TD3 documentation)
But how I calculated my actor gradient in my above code in the actorgradient function, I averaged the Q values over the minbatch first, then I performed only one gradient operation. So maybe that’s one possible reason why my built in TD3 agent isn’t learning. Here are my reward for
Built-TD3
Custom TD3, I stopped it early because it wasn’t learning
I would appreciate any help because I have been stuck for months. So I have tried to code up my custom TD3 agent to behave as much like the built in TD3 agent in the same simulink environment, the only difference between them is for the custom agent, I had to use a rate transition block to perform zero order hold between the states, rewards, done signal and the custom agent. I used the rate transition block specify mode for output port sample time options to set the custom agent sample time.
My code for my custom TD3 agent is below, I tried to make it as much like the built-in TD3 as possible, the ep_counter,num_of_ep properties are unused.
classdef test_TD3Agent_V2 < rl.agent.CustomAgent
properties
%neural networks
actor
critic1
critic2
%target networks
target_actor
target_critic1
target_critic2
%dimensions
statesize
actionsize
%optimizers
actor_optimizer
critic1_optimizer
critic2_optimizer
%buffer
statebuffer
nextstatebuffer
actionbuffer
rewardbuffer
donebuffer
counter %keeps count of number experiences encountered
index %keeps track of current available index in buffer
buffersize
batchsize
%episodes
num_of_ep
ep_counter
%keep count of critic number of updates
num_critic_update
end
methods
%constructor
function obj = test_TD3Agent_V2(actor,critic1,critic2,target_actor,target_critic1,target_critic2,actor_opt,critic1_opt,critic2_opt,statesize,actionsize,buffer_size,batchsize,num_of_ep)
%(required) call abstract class constructor
obj = obj@rl.agent.CustomAgent();
%define observation + action space
obj.ObservationInfo = rlNumericSpec([statesize 1]);
obj.ActionInfo = rlNumericSpec([actionsize 1],LowerLimit = -1,UpperLimit = 1);
obj.SampleTime = -1; %determined by rate transition block
%define networks
obj.actor = actor;
obj.critic1 = critic1;
obj.critic2 = critic2;
%define target networks
obj.target_actor = target_actor;
obj.target_critic1 = target_critic1;
obj.target_critic2 = target_critic2;
%define optimizer
obj.actor_optimizer = actor_opt;
obj.critic1_optimizer = critic1_opt;
obj.critic2_optimizer = critic2_opt;
%record dimensions
obj.statesize = statesize;
obj.actionsize = actionsize;
%initialize buffer
obj.statebuffer = dlarray(zeros(statesize,1,buffer_size));
obj.nextstatebuffer = dlarray(zeros(statesize,1,buffer_size));
obj.actionbuffer = dlarray(zeros(actionsize,1,buffer_size));
obj.rewardbuffer = dlarray(zeros(1,buffer_size));
obj.donebuffer = zeros(1,buffer_size);
obj.buffersize = buffer_size;
obj.batchsize = batchsize;
obj.counter = 0;
obj.index = 1;
%episodes (unused)
obj.num_of_ep = num_of_ep;
obj.ep_counter = 1;
%used for delay actor update and target network soft transfer
obj.num_critic_update = 0;
end
end
methods (Access = protected)
%Action method
function action = getActionImpl(obj,Observation)
% Given the current state of the system, return an action
action = getAction(obj.actor,Observation);
end
%Action with noise method
function action = getActionWithExplorationImpl(obj,Observation)
% Given the current observation, select an action
action = getAction(obj.actor,Observation);
% Add random noise to action
end
%Learn method
function action = learnImpl(obj,Experience)
%parse experience
state = Experience{1};
action_ = Experience{2};
reward = Experience{3};
next_state = Experience{4};
isdone = Experience{5};
%buffer operations
%check if index wraps around
if (obj.index > obj.buffersize)
obj.index = 1;
end
%record experience in buffer
obj.statebuffer(:,:,obj.index) = state{1};
obj.actionbuffer(:,:,obj.index) = action_{1};
obj.rewardbuffer(:,obj.index) = reward;
obj.nextstatebuffer(:,:,obj.index) = next_state{1};
obj.donebuffer(:,obj.index) = isdone;
%increment index and counter
obj.counter = obj.counter + 1;
obj.index = obj.index + 1;
%if non terminal state
if (isdone == false)
action = getAction(obj.actor,next_state); %select next action
noise = randn([6,1]).*0.1; %gaussian noise with standard dev of 0.1
action{1} = action{1} + noise; %add noise
action{1} = clip(action{1},-1,1); %clip action noise
else
%learning at the end of episode
if (obj.counter >= obj.batchsize)
max_index = min([obj.counter obj.buffersize]); %range of index 1 to max_index for buffer
%sample experience randomly from buffer
sample_index_vector = randsample(max_index,obj.batchsize); %vector of index experience to sample
%create buffer mini batch dlarrays
state_batch = dlarray(zeros(obj.statesize,1,obj.batchsize));
nextstate_batch = dlarray(zeros(obj.statesize,1,obj.batchsize));
action_batch = dlarray(zeros(obj.actionsize,1,obj.batchsize));
reward_batch = dlarray(zeros(1,obj.batchsize));
done_batch = zeros(1,obj.batchsize);
for i = 1:obj.batchsize %iterate through buffer and transfer experience over to mini batch
state_batch(:,:,i) = obj.statebuffer(:,:,sample_index_vector(i));
nextstate_batch(:,:,i) = obj.nextstatebuffer(:,:,sample_index_vector(i));
action_batch(:,:,i) = obj.actionbuffer(:,:,sample_index_vector(i));
reward_batch(:,i) = obj.rewardbuffer(:,sample_index_vector(i));
done_batch(:,i) = obj.donebuffer(:,sample_index_vector(i));
end
%update critic networks
criticgrad1 = dlfeval(@critic_gradient,obj.critic1,obj.target_actor,obj.target_critic1,obj.target_critic2,{state_batch},{nextstate_batch},{action_batch},reward_batch,done_batch,obj.batchsize);
[obj.critic1,obj.critic1_optimizer] = update(obj.critic1_optimizer,obj.critic1,criticgrad1);
criticgrad2 = dlfeval(@critic_gradient,obj.critic2,obj.target_actor,obj.target_critic1,obj.target_critic2,{state_batch},{nextstate_batch},{action_batch},reward_batch,done_batch,obj.batchsize);
[obj.critic2,obj.critic2_optimizer] = update(obj.critic2_optimizer,obj.critic2,criticgrad2);
%update num of critic updates
obj.num_critic_update = obj.num_critic_update + 1;
%delayed actor update + target network transfer
if (mod(obj.num_critic_update,2) == 0)
actorgrad = dlfeval(@actor_gradient,obj.actor,obj.critic1,obj.critic2,{state_batch});
[obj.actor,obj.actor_optimizer] = update(obj.actor_optimizer,obj.actor,actorgrad);
target_soft_transfer(obj);
end
end
end
end
%function used to soft transfer over to target networks
function target_soft_transfer(obj)
smooth_factor = 0.005;
for i = 1:6
obj.target_actor.Learnables{i} = smooth_factor*obj.actor.Learnables{i} + (1 – smooth_factor)*obj.target_actor.Learnables{i};
obj.target_critic1.Learnables{i} = smooth_factor*obj.critic1.Learnables{i} + (1 – smooth_factor)*obj.target_critic1.Learnables{i};
obj.target_critic2.Learnables{i} = smooth_factor*obj.critic2.Learnables{i} + (1 – smooth_factor)*obj.target_critic2.Learnables{i};
end
end
end
end
%obtain gradient of Q value wrt actor
function actorgradient = actor_gradient(actorNet,critic1,critic2,states,batchsize)
actoraction = getAction(actorNet,states); %obtain actor action
%obtain Q values
Q1 = getValue(critic1,states,actoraction);
Q2 = getValue(critic2,states,actoraction);
%obtain min Q values + reverse sign for gradient ascent
Qmin = min(Q1,Q2);
Q = -1*mean(Qmin);
gradient = dlgradient(Q,actorNet.Learnables); %calculate gradient of Q value wrt NN learnables
actorgradient = gradient;
end
%obtain gradient of critic NN
function criticgradient = critic_gradient(critic,target_actor,target_critic_1,target_critic_2,states,nextstates,actions,rewards,dones,batchsize)
%obtain target action
target_actions = getAction(target_actor,nextstates);
%target policy smoothing
for i = 1:batchsize
target_noise = randn([6,1]).*sqrt(0.2);
target_noise = clip(target_noise,-0.5,0.5);
target_actions{1}(:,:,i) = target_actions{1}(:,:,i) + target_noise; %add noise to action for smoothing
end
target_actions{1}(:,:,:) = clip(target_actions{1}(:,:,:),-1,1); %clip btw -1 and 1
%obtain Q values
Qtarget1 = getValue(target_critic_1,nextstates,target_actions);
Qtarget2 = getValue(target_critic_2,nextstates,target_actions);
Qmin = min(Qtarget1,Qtarget2);
Qoptimal = rewards + 0.99*Qmin.*(1 – dones);
Qpred = getValue(critic,states,actions);
%obtain critic loss
criticLoss = 0.5*mean((Qoptimal – Qpred).^2);
criticgradient = dlgradient(criticLoss,critic.Learnables);
end
And here is my code when using the built in TD3 agent
clc
%define times
dt = 0.1; %time steps
Tf = 7; %simulation time
%create stateInfo and actionInfo objects
statesize = 38;
actionsize = 6;
stateInfo = rlNumericSpec([statesize 1]);
actionInfo = rlNumericSpec([actionsize 1],LowerLimit = -1,UpperLimit = 1);
mdl = ‘KUKA_EE_Controller_v18_disturbed’;
blk = ‘KUKA_EE_Controller_v18_disturbed/RL Agent’;
%create environment object
env = rlSimulinkEnv(mdl,blk,stateInfo,actionInfo);
%assign reset function
env.ResetFcn = @ResetFunction;
% %create actor network
actorlayers = [
featureInputLayer(statesize)
fullyConnectedLayer(800)
reluLayer
fullyConnectedLayer(600)
reluLayer
fullyConnectedLayer(actionsize)
tanhLayer
];
actorNet = dlnetwork;
actorNet = addLayers(actorNet, actorlayers);
actorNet = initialize(actorNet);
actor = rlContinuousDeterministicActor(actorNet, stateInfo, actionInfo);
%create critic networks
statelayers = [
featureInputLayer(statesize, Name='states')
concatenationLayer(1, 2, Name='concat')
fullyConnectedLayer(400)
reluLayer
fullyConnectedLayer(400)
reluLayer
fullyConnectedLayer(1, Name='Qvalue')
];
actionlayers = featureInputLayer(actionsize, Name='actions');
criticNet = dlnetwork;
criticNet = addLayers(criticNet, statelayers);
criticNet = addLayers(criticNet, actionlayers);
criticNet = connectLayers(criticNet, 'actions', 'concat/in2');
criticNet = initialize(criticNet);
critic1 = rlQValueFunction(criticNet,stateInfo,actionInfo,ObservationInputNames='states',ActionInputNames='actions');
criticNet2 = dlnetwork;
criticNet2 = addLayers(criticNet2, statelayers);
criticNet2 = addLayers(criticNet2, actionlayers);
criticNet2 = connectLayers(criticNet2, 'actions', 'concat/in2');
criticNet2 = initialize(criticNet2);
critic2 = rlQValueFunction(criticNet2,stateInfo,actionInfo,ObservationInputNames='states',ActionInputNames='actions');
%create options object for actor and critic
actoroptions = rlOptimizerOptions(Optimizer='adam',LearnRate=0.001);
criticoptions = rlOptimizerOptions(Optimizer='adam',LearnRate=0.003);
agentoptions = rlTD3AgentOptions;
agentoptions.SampleTime = dt;
agentoptions.ActorOptimizerOptions = actoroptions;
agentoptions.CriticOptimizerOptions = criticoptions;
agentoptions.DiscountFactor = 0.99;
agentoptions.TargetSmoothFactor = 0.005;
agentoptions.ExperienceBufferLength = 1000000;
agentoptions.MiniBatchSize = 250;
agentoptions.ExplorationModel.StandardDeviation = 0.1;
agentoptions.ExplorationModel.StandardDeviationDecayRate = 1e-4;
agent = rlTD3Agent(actor, [critic1 critic2], agentoptions);
%create training options object
trainOpts = rlTrainingOptions(MaxEpisodes=20,MaxStepsPerEpisode=floor(Tf/dt),StopTrainingCriteria='none',SimulationStorageType='none');
%train agent
trainresults = train(agent,env,trainOpts);
I made my custom TD3 agent with the same actor and critic structures, the same hyperparameters, and the same agent options, but it doesn't seem to learn and I don't know why. I don't know if the rate transition block is having a negative impact on the training. One difference between my custom TD3 and the built-in TD3 is the actor gradient. In the MATLAB documentation on the TD3 agent, it says the gradient is calculated for every sample in the mini-batch, and then the gradients are accumulated and averaged.
https://www.mathworks.com/help/reinforcement-learning/ug/td3-agents.html (TD3 documentation)
In my code above, however, the actor_gradient function averages the Q values over the mini-batch first and then performs a single gradient operation, so maybe that's one possible reason why my custom TD3 agent isn't learning (a sketch of the per-sample accumulation is included after the reward plots below). Here are my reward plots:
Built-in TD3
Custom TD3, I stopped it early because it wasn’t learning
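For reference, here is a minimal sketch (not the built-in implementation) of what per-sample accumulation could look like, written in the same style as the actor_gradient function above. It assumes the same dlnetwork-style actor/critics and the same cell-array-of-dlarray mini-batch layout, and the function name is illustrative:
function actorgradient = actor_gradient_per_sample(actorNet,critic1,critic2,states,batchsize)
%accumulate the actor gradient one sample at a time, then average
accum = [];
for k = 1:batchsize
    sample = {states{1}(:,:,k)}; %single observation from the mini-batch
    action = getAction(actorNet,sample);
    Q1 = getValue(critic1,sample,action);
    Q2 = getValue(critic2,sample,action);
    Q = -min(Q1,Q2); %reverse sign for gradient ascent
    g = dlgradient(Q,actorNet.Learnables); %gradient for this sample only
    %if this errors about discarded trace data, add 'RetainData',true to the dlgradient call
    if isempty(accum)
        accum = g;
    else
        accum.Value = cellfun(@plus,accum.Value,g.Value,UniformOutput=false);
    end
end
accum.Value = cellfun(@(v) v./batchsize,accum.Value,UniformOutput=false); %average over the mini-batch
actorgradient = accum;
end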
I would appreciate any help because I have been stuck for months. simulink, matlab, reinforcement learning, custom td3 MATLAB Answers — New Questions
Special case of function not found even when in current directory or on path
Seeing the behavior confirmed by others, I just submitted a bug report, Case 08020464.
Matlab (v2024a or 2024b) is unable to identify a function in my current directory and/or on my path, if called from another function that has an if-statement like the one shown in the example below. First function, saved to current directory:
function out=matlabbugfun1
out=6;
end
Second function, saved to current directory:
function out=matlabbugfun2
if exist('matlabbugfun1.m')~=2
matlabbugfun1=@()(4);
end
out=matlabbugfun1();
end
Now from the command line:
matlabbugfun2
And I get:
Unrecognized function or variable 'matlabbugfun1'.
Error in matlabbugfun2 (line 5)
out=matlabbugfun1();
I have validated this on 2 computers, one with 2024a, the other with 2024b. Note that if the functions are not in your current folder but are on your path:
a=pwd;
addpath(pwd)
cd ..
matlabbugfun2
Then you get a slightly different error:
matlabbugfun1 is not found in the current folder or on the MATLAB path, but exists in:
C:\Users\taasher\tmp\matlabbug
Change the MATLAB current folder or add its folder to the MATLAB path.
Error in matlabbugfun2 (line 5)
out=matlabbugfun1();
Additional notes:
Calling matlabbugfun1 from the command line (or a script) works just fine.
Calling matlabbugfun2 from a script fails the same as it does on the command line.
Putting a break in while running it shows that the if condition returns false and the branch is not evaluated, as would be expected.
Putting something like strfind(path,fileparts(which('matlabbugfun1'))) inside matlabbugfun2 will also show that during execution, Matlab thinks it IS on the path.
As noted in my response to Ron’s answer, if the two functions are made subfunctions of a main script that calls matlabbugfun2, the same error occurs. However the script can call matlabbugfun1 without issue.
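For anyone hitting the same thing, here is a minimal sketch of one possible workaround (assuming the intent is to fall back to a default when the file is absent): assign the handle to a separately named variable so that matlabbugfun1 is never the target of an assignment inside the function. The function name below is illustrative:
function out=matlabbugfun2_workaround
%use a separate variable for the handle instead of reusing the function name
if exist('matlabbugfun1.m','file')==2
    fh = @matlabbugfun1; %file found: call the real function
else
    fh = @()(4); %fallback value
end
out = fh();
end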
Output from ver:
—————————————————————————————————–
MATLAB Version: 24.1.0.2653294 (R2024a) Update 5
MATLAB License Number: ••••••••
Operating System: Microsoft Windows 11 Enterprise Version 10.0 (Build 26100)
Java Version: Java 1.8.0_202-b08 with Oracle Corporation Java HotSpot(TM) 64-Bit Server VM mixed mode
—————————————————————————————————–
MATLAB Version 24.1 (R2024a)
Signal Processing Toolbox Version 24.1 (R2024a)
Statistics and Machine Learning Toolbox Version 24.1 (R2024a)
bug, matlab MATLAB Answers — New Questions
Microsoft Defender for Office 365, Shared Mailboxes, and Microsoft 365 Groups
MDO Licensing Required for Shared Mailboxes but Not for Groups
Some Microsoft representatives expressed disappointment after the publication of the article about unexpected costs to license shared mailboxes for Microsoft Defender for Office 365 (MDO). They felt that I didn’t do MDO justice. Let me be clear: MDO covers a wide range of functionality to protect user communications (not just email, but also Teams and the Office apps) from threat. MDO Plan 2 also includes some neat SOC and attack simulation tools. Overall, MDO Plan 2 is a strong package that adds a lot of value to the Office 365 E5 and Microsoft 365 E5 SKUs.
The point of the article was not to discuss MDO capabilities. Instead, it turned a light on the unexpected licensing consequences of MDO becoming active within a tenant. Once MDO protects tenant communications, all user mailboxes and all shared mailboxes must be licensed for MDO Plan 2. That’s an unfair and unexpected consequence of upgrading a tenant from E3 to E5 licenses, something that Microsoft wants customers to do.
Indeed, at the analyst call following quarterly Microsoft results, CFO Amy Hood invariably mentions the success Microsoft has in driving higher Average Revenue Per User (ARPU) due to E5 upgrades and add-on licenses. In the Q4 FY25 call, she noted “ARPU growth again driven by E5 and M365 Copilot.” This kind of management commentary must have an effect on those who make licensing decisions for products.
Microsoft pointed out to me that they have not changed their guidance or documentation on this topic. This is accurate. The same guidance has been in place for several years. The MDO service description covers licensing, and anyone who takes the time to peruse that text will discover just how many MDO licenses their tenant needs. In terms of unexpected licensing consequences, if you don’t read what Microsoft says about a product, you won’t understand the rules of the game and surprises are almost inevitable.
Consequences of Previous Microsoft Decisions
But here’s the thing. The situation around MDO licensing for shared mailboxes is the consequence of two Microsoft decisions taken in the past. The first is that when Exchange Server launched shared mailboxes, Exchange created a user account for each shared mailbox. In an on-premises environment, the extra user accounts made no difference to licensing costs.
Entra ID and Exchange Online took the on-premises model and applied it to the cloud. I’ve often been critical of Entra ID’s inability to identify utility accounts used for purposes like shared and room mailboxes or break glass accounts. Treating these accounts like regular user accounts is nonsense. Failing to disable the accounts created for utility Exchange objects is silly, and allowing people to sign into those accounts (which creates a whole new can of licensing worms) isn’t much better. Exchange Online uses accounts for shared mailboxes like it does on-premises, and that’s the root of the problem created for MDO licensing.
Shared Mailboxes and Group Mailboxes Can Receive and Send Mail
Microsoft says that they require MDO licenses for shared mailboxes because the mailboxes can send and receive email and therefore benefit from the MDO service. Well, the group mailboxes created for Microsoft 365 groups can also send and receive email and those mailboxes support many (but not all) of the features found in shared mailboxes. The fact is that mail-based Microsoft 365 groups, as currently implemented (Figure 1), operate very much like shared mailboxes when it comes to sending and receiving mail. Both types of mailbox receive the same level of protection from MDO.
Figure 1: A mail-based Microsoft 365 group
Overall, Microsoft 365 groups are used far more extensively than shared mailboxes, mainly to support Teams, but I can’t find a single reference to an MDO licensing requirement for Microsoft 365 groups in the MDO service description. The reason why MDO ignores licensing for Microsoft 365 groups is simple: Microsoft 365 groups don’t have any form of Entra ID account. They exist as an Entra ID group that just happens to be connected to a set of resources like a plan, team, SharePoint site and group mailbox.
It’s possible to assign licenses to a Microsoft 365 group, but only for the purpose of group-based license assignments managed through the Microsoft 365 admin center (you can also manage group-based license assignments with PowerShell). Because Microsoft 365 groups don’t have user accounts, they don’t follow the normal licensing regime, so MDO cannot be licensed.
Drop the Need for MDO to License Shared Mailboxes
Microsoft has long recommended that customers should replace distribution lists and shared mailboxes by Microsoft 365 groups. Indeed, a great deal of engineering effort went into the addition of capabilities like delegated send for Microsoft 365 groups. After 2019, Microsoft dedicated less attention to the email side of Microsoft 365 groups because of the emphasis on Teams, but the debate about whether to use Groups or shared mailboxes remains active.
Today, far fewer Microsoft 365 groups support email-based communication than those used with Teams. However, the fact remains that a dichotomy exists between how MDO treats the licensing of shared mailboxes and Microsoft 365 groups.
A case could be argued that email-based Microsoft 365 Groups operate by distributing copies of email to group members, and those user accounts should have MDO licenses. That’s true, but group mailboxes receive email processed by MDO, just like shared mailboxes do, so shouldn’t the same rule apply? To solve the conundrum, Microsoft should simplify the situation by dropping the need for MDO licenses for shared mailboxes. I suspect that internal budgets, revenue recognition, and a myriad of other issues will stop this happening, but that’s what should be done.
Support the work of the Office 365 for IT Pros team by subscribing to the Office 365 for IT Pros eBook. Your support pays for the time we need to track, analyze, and document the changing world of Microsoft 365 and Office 365. Only humans contribute to our work!
How to replicate Regression Learner app based training using Matlab script?
I have trained a ML model in the Regression Learner app using the optimizable GPR model with the default settings such as 5-fold cross-validation and 30 iterations. Now I am trying to do the same with a MATLAB script, using the following, where X are the regressors and Y is the response variable.
>> ML_mdl=fitrgp(X,Y,'OptimizeHyperparameters','all','HyperparameterOptimizationOptions',struct('KFold',5))
Are the two resulting models more or less equivalent? I know there will be some difference due to the probabilistic nature of the algorithm. When I test it on the entire training set, the R squared value is practically 1.0. Is it overfitting even with k-fold cross-validation? The prediction on the unseen testing set is not that good. Any suggestions?
regression learner, script based ml model training MATLAB Answers — New Questions
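For what it's worth, here is a minimal sketch of how the app-style workflow could be approximated in a script while also checking for overfitting on data the optimization never saw; the 80/20 split, the fixed seed, and the variable names are illustrative choices, not the app's exact behaviour:
rng(0); %fix the seed for repeatability
cv = cvpartition(size(X,1),'HoldOut',0.2); %keep 20% as an unseen test set
Xtr = X(training(cv),:); Ytr = Y(training(cv));
Xte = X(test(cv),:); Yte = Y(test(cv));
ML_mdl = fitrgp(Xtr,Ytr,'OptimizeHyperparameters','all', ...
    'HyperparameterOptimizationOptions',struct('KFold',5,'MaxObjectiveEvaluations',30)); %5-fold CV, 30 iterations, as in the app defaults
Ypred = predict(ML_mdl,Xte); %evaluate on data not used for training or optimization
R2 = 1 - sum((Yte - Ypred).^2)/sum((Yte - mean(Yte)).^2);
fprintf('Test R^2: %.3f\n',R2);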
I need Matlab for theoretical physics course/research projects. I am uncertain about which products/toolboxes to get.
I have opted for Differential Equations Toolbox, Symbolic Math Toolbox, Optimization Toolbox and Global Optimization Toolbox. Other than these I have no idea what I need and what I don't. Can anyone from a theoretical background help me pick the products?
theoretical physics, newtoproduct MATLAB Answers — New Questions
How do I run a load flow analysis for a solar PV array?
I simulated a microgrid on matlab simulink. I want to run a load flow analysis of the system. However matlab does not recognize the solar PV array in the microgrid as a source. When I place a load flow bus after the inverter of the PV array, it produces an error message saying the model does not converge. Increasing the number of iterations does not help. Any advice would be greatly appreciated.
simulink MATLAB Answers — New Questions
parfor number of cores
I am using a machine with an intel processor with 14 physical cores (among those, 6 performance cores). If I run this code I get an error message.
numWorkers = 8;
p = parpool(numWorkers);
n = 10;
parfor i=1:n
disp(i)
end
The error message is the following:
Error using parpool (line 132)
Too many workers requested. The profile "Threads" has the NumWorkers property set to a
maximum of 6 workers but 8 workers were requested. Either request a number of workers
less than NumWorkers, or increase the value of the NumWorkers property for the profile
(up to a maximum of 512 for thread-based pools).
I don’t understand the error message. Does this mean that the maximum number of cores I can use on my machine is 6, even though it has 14 physical cores? Matlab cannot use cores that are not performance cores?
Thanks in advance for any feedback on this.
parallel computing toolbox, threads, parfor MATLAB Answers — New Questions
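In case it is useful, here is a minimal sketch of one way around that cap (an assumption about intent, not a definitive answer): request a process-based pool explicitly so the limit of the default 'Threads' profile does not apply. Requires Parallel Computing Toolbox:
numWorkers = 8;
p = parpool("Processes", numWorkers); %process-based workers instead of thread-based
n = 10;
parfor i = 1:n
    disp(i)
end
delete(p); %shut the pool down when finished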
double free or corruption (out) error when using MATLAB Runtime 2023b
I can not add an example of my code below because it is in a closed space, however I’ve been able to isolate the issue.
While using MATLAB Runtime 2023b with Python, I am importing my own .ctf file. After about 2000 calls of the eigs() function in MATLAB I eventually receive a "double free or corruption (out) MATLAB is exiting because of the fatal error." I've never received this error when simply using MATLAB with the same code base. So my theory is that there's something wrong with MATLAB Runtime. Has anyone run into this? Does anyone have any advice on what I could possibly try next?
matlab runtime, matlab compiler, python MATLAB Answers — New Questions
Help with Difference Equation in MATLAB
Hi, all
I have a structural engineering book dated back at 1979 that presents a fairly odd equation to compute rotations at member joints.
This is:
Where C, V , h and K are constant properties. Theta is the joint rotation at the level considered, the level above and the level below (hence the subscript).
The book mentions several ways of solving it, one being iteration procedures. It also refers to this equation as a difference equation, which I have no experience with at all, although I have read that they are fairly similar to differential equations, which I have some experience with.
My question is, is there a function or command in MATLAB that could be employed to solve this equation? Like there is for differential equations symbolically and numerically. And could somebody help me with this?
Many thanks,
Scott
difference equations, functions, commands MATLAB Answers — New Questions
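Since the equation itself is not reproduced above (the image did not come through), here is only a generic sketch: a linear, constant-coefficient, second-order difference equation relating the rotation at one level to the rotations at the levels above and below can be assembled into a tridiagonal linear system and solved directly with the backslash operator. All coefficient values below are placeholders:
N = 10; %number of interior levels (placeholder)
a = 1; b = -4; %placeholder constants in a*theta(n-1) + b*theta(n) + a*theta(n+1) = r(n)
r = ones(N,1); %placeholder right-hand side
theta0 = 0; thetaEnd = 0; %known boundary rotations (placeholder)
A = diag(b*ones(N,1)) + diag(a*ones(N-1,1),1) + diag(a*ones(N-1,1),-1); %tridiagonal coefficient matrix
r(1) = r(1) - a*theta0; %fold the boundary values into the right-hand side
r(end) = r(end) - a*thetaEnd;
theta = A \ r %rotations at the interior levels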
Integrations of Bessel functions
I have these two integrals and I want to find their analytical form. I tried the MATLAB Symbolic Math Toolbox but I could not get a result.
How can I find these integrals analytically?
integration, bessel functions MATLAB Answers — New Questions
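Since the integrals themselves are not reproduced above, here is only a generic pattern (the integrand besselj(0,a*x)*x and the limits are placeholders): try the symbolic integral first, and if int returns an unresolved expression, fall back to numeric evaluation for specific parameter values:
syms x a R positive
F = int(besselj(0,a*x)*x, x, 0, R) %symbolic attempt (Symbolic Math Toolbox)
aVal = 2; RVal = 1;
Fnum = integral(@(x) besselj(0,aVal*x).*x, 0, RVal) %numeric fallback for given parameter values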
Must a Function be in Scope Before Creating a Handle to It?
The doc page Create Function Handle states:
"Keep the following in mind when creating handles to functions:
Scope — The function must be in scope at the time you create the handle. Therefore, the function must be on the MATLAB path or in the current folder."
Trying to verify that statement ….
Verify a proposed function name is not currently in scope
exist xyzzy
But we can create a handle to this non-existent function (seemingly contra to the doc page)
f = @xyzzy;
Error is generated if trying to use the handle, of course.
try
f()
catch ME
ME.message
end
After generating the handle we can create the function (typically through the editor, or renaming an existing file, or ….)
str = ["function a = xyzzy";"a=1;"];
writelines(str,"xyzzy.m");
type xyzzy.m
And now the function handle executes
f()
Am I misunderstanding something about that doc page, or is the doc page wrong, or does the doc page reflect how Matlab is supposed to work and the preceding code indicates a bug of some sort?
function handle MATLAB Answers — New Questions
How do I create log file using MATLAB Compiler R2025a?
MATLAB Answers — New Questions
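One avenue that may be worth trying from the command line (a sketch; the script and log file names are placeholders, and it assumes a standalone target): pass the -logfile runtime option through mcc with -R, so the deployed application writes a log when it runs:
mcc -m myapp.m -R '-logfile,myapp_log.txt'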
How can I continue using my old deployment .prj file in R2025a?
With the change to deployment workflow in R2025a (Release Notes), I understand the Compiler apps have been updated to integrate with MATLAB® projects, represented by .prj files. The .prj files no longer represent deployment projects as in previous releases.
How can I use my old .prj file from previous releases to compile code in R2025a?
MATLAB Answers — New Questions