Question about training neural networks for regression problem using the Adam optimizer
Hello,
I am trying to construct and train a neural network for a multi-output regression task. However, I am starting with a toy problem with a single output in order to understand how things work. I tried both 'trainNetwork' and 'trainnet', but neither works as expected. In the toy problem, I am trying to approximate a simple function using a network with only 2 hidden layers of 20 neurons each. The network does not converge with either function.
Could you tell me if I am doing something wrong in the way I am using them? Thank you.
clear; clc; close all;
t = linspace(0,2*pi,500);
s = cos(2*pi*t).*exp(0.1*t) + sech(0.2*t);
%%
% Normalize inputs to the range [0, 1]
input_min = min(t); % Overall minimum
input_max = max(t); % Overall maximum
normalized_inputs = (t - input_min) / (input_max - input_min);
% Normalize outputs to the range [0, 1]
output_min = min(s); % Overall minimum
output_max = max(s); % Overall maximum
normalized_outputs = (s - output_min) / (output_max - output_min);
%%
% Define the architecture
layers = [
    featureInputLayer(size(normalized_inputs, 1))     % Input layer
    fullyConnectedLayer(20)                           % Hidden layer with 20 neurons
    reluLayer                                         % Activation function
    fullyConnectedLayer(20)                           % Hidden layer with 20 neurons
    reluLayer                                         % Activation function
    fullyConnectedLayer(size(normalized_outputs, 1))  % Output layer
    regressionLayer                                   % Regression layer for continuous outputs
];
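% NOTE (sketch): in newer releases (R2023b+), trainnet expects the layer array
% WITHOUT an output layer such as regressionLayer; the loss ("mse") is passed
% as an argument instead. A minimal trainnet-compatible variant of the same
% 2x20 architecture (the name layersForTrainnet is illustrative) might be:
layersForTrainnet = [
    featureInputLayer(1)      % single scalar feature t
    fullyConnectedLayer(20)
    reluLayer
    fullyConnectedLayer(20)
    reluLayer
    fullyConnectedLayer(1)    % single regression output; no regressionLayer here
];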
idx = randperm(numel(t));
train_idx = idx(1:round(0.8*numel(t)));
val_idx = idx(round(0.8*numel(t))+1:end);
t_train = normalized_inputs(train_idx);
s_train = normalized_outputs(train_idx);
t_val = normalized_inputs(val_idx);
s_val = normalized_outputs(val_idx);
miniBatchSize = 32;
options = trainingOptions('adam', ...
    'MiniBatchSize', miniBatchSize, ...
    'MaxEpochs', 2000, ...
    'InitialLearnRate', 0.001, ...
    'LearnRateSchedule', 'piecewise', ...
    'LearnRateDropFactor', 0.1, ...
    'LearnRateDropPeriod', 1000, ...
    'Shuffle', 'every-epoch', ...
    'ValidationData', {t_val', s_val'}, ...
    'ValidationFrequency', 10, ...
    'Plots', 'training-progress', ...
    'Verbose', true);
% Train the network
% net = trainnet(normalized_inputs', normalized_outputs', layers, "mse", options);
net = trainNetwork(normalized_inputs', normalized_outputs', layers, options);
% net = fitrnet(normalized_inputs', normalized_outputs', 'Activations', "tanh", ...
%     "LayerSizes", [50 50 50]);
% Predict using the trained network
normalized_predictions = predict(net, normalized_inputs')';
% Denormalize predictions back to the original scale
predictions = normalized_predictions .* (output_max - output_min) + output_min;
% Evaluate performance (e.g., Mean Squared Error)
mse_loss = mean((normalized_outputs - normalized_predictions).^2, 'all');
fprintf('MSE: %.4f\n', mse_loss);
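% NOTE (sketch): the MSE above is computed on the normalized scale, and the
% denormalized predictions variable is never used. For an error in the
% original units of s, something like:
% mse_original = mean((s - predictions).^2, 'all');
% fprintf('MSE (original units): %.4f\n', mse_original);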
%%
figure('units','normalized','outerposition',[0 0 1 1])
plot(t,normalized_outputs,'r');
grid;
hold on
plot(t,normalized_predictions,'--b');
legend('Truth','NN','Location','best');
set(gcf,'color','w')