Negative variance of state when training.
For our image segmentation task, we are trying to implement a custom training loop for our network, giving us more freedom to visualize predictions while training. Below are the parts of the code that should be key to identifying the underlying issue:
%% Classes
classNames = ["bg", "live", "nk", "round", "blob", "other"];
labelIDs = [0 192 255 1 2 3];
numClasses = 6;
%% Create mobilenet
network = 'mobilenetv2';
lgraph = deeplabv3plusLayers([224 224 3],numClasses,network);
% X is our whole training data
[m, s] = calculate_input_params(single(X));
input_layer_new = imageInputLayer([224 224 3], "Normalization","zscore", "Mean",m, "StandardDeviation",s);
lgraph = replaceLayer(lgraph, "input_1", input_layer_new);
lgraph = removeLayers(lgraph, "classification");
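For reference, calculate_input_params is our own helper; simplified, it computes per-channel statistics along these lines (a sketch, the exact implementation may differ):

```matlab
function [m, s] = calculate_input_params(X)
    % X is an H-by-W-by-3-by-N single array of all training images.
    % Compute per-channel mean and standard deviation over height,
    % width and batch, for the z-score normalization of the input layer.
    m = mean(X, [1 2 4]);    % 1-by-1-by-3 per-channel mean
    s = std(X, 0, [1 2 4]);  % 1-by-1-by-3 per-channel standard deviation
end
```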
%% Initialize network, training data and parameters
net = dlnetwork(lgraph);
mbq = minibatchqueue(ds_augmented, "MiniBatchSize",16, "MiniBatchFormat",["SSCB" "SSB"]);
numepochs = 3;
initialLearnRate = 0.01;
decay = 0.01;
momentum = 0.9;
vel = [];
%% Necessary code to avoid error
try
    nnet.internal.cnngpu.reluForward(1);
catch ME
end
%% Train network
epoch = 0;
iteration = 0;
while epoch < numepochs
    epoch = epoch + 1;
    shuffle(mbq);
    while hasdata(mbq)
        iteration = iteration + 1;
        epoch_iteration = [epoch iteration]
        [X_b, Y_b] = next(mbq);
        Y_b = adjust_dimensions(Y_b);
        [loss,gradients,state] = dlfeval(@modelLoss,net,X_b,Y_b);
        net.State = state;
        loss
        learnRate = initialLearnRate/(1 + decay*iteration);
        [net, vel] = sgdmupdate(net, gradients, vel, learnRate, momentum);
    end
end
function [loss,gradients,state] = modelLoss(net,X_b,Y_b)
    classWeights = [1 10 10 10 10 10];
    % Forward data through network.
    [Y_p,state] = forward(net,X_b);
    % Calculate cross-entropy loss.
    loss = crossentropy(Y_p,Y_b,classWeights,'WeightsFormat','UC','TargetCategories','independent');
    % Calculate gradients of loss with respect to learnable parameters.
    gradients = dlgradient(loss,net.Learnables);
end
Essentially, when we run the Train network section, we manage to run a couple of iterations (the exact number may vary) until we get the following error:
Alongside this, we have noticed that no matter how many iterations we run, when we access X_b, Y_b, and Y_p and visualize the first and second images of the batch, we always get the same prediction regardless of X_b and Y_b. It seems that the Y_p generated by forward(net, X_b) is somehow constant:
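The way we inspect predictions is roughly the following (a simplified sketch; the exact plotting code and colormap choice are illustrative):

```matlab
% Y_p is H-by-W-by-numClasses-by-B; take the arg-max over channels
% to get a per-pixel predicted class index for the first image.
scores = extractdata(Y_p);           % strip dlarray for visualization
[~, pred] = max(scores, [], 3);      % per-pixel predicted class
img = extractdata(X_b(:,:,:,1));     % first image of the batch

figure
imshow(rescale(img))
hold on
h = imagesc(pred(:,:,1,1));          % overlay predicted labels
h.AlphaData = 0.4;
colormap(jet(6))
hold off
```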
Since my lab partner and I do not possess any formal training in deep learning or image segmentation, we find it challenging to connect the dots and overcome this problem. Any feedback regarding the code or approach would be much appreciated.