Why is my transformer training erroring out with the message "Error using trainnet (line 46)"?
The full error message reads "Error using trainnet (line 46)
The number of mini-batch queue outputs (2) must match the number of network inputs plus
the number of network outputs (4)."
I'm using an arrayDatastore to pass the predictors (x2) and the targets (x2) to the transformer model. Both predictors have 410 features; one of the targets has 410 features and the other target is a scalar.
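The training call itself is essentially the following; the loss function and options here are placeholders rather than my exact settings:
%--------------------------------------------------------------------
options = trainingOptions("adam", MiniBatchSize=10);   % placeholder options
net = trainnet(dstrain, net, "mse", options);          % placeholder loss; this is the call that errors
%--------------------------------------------------------------------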
Code to generate the dummy predictor and target data is pasted below:
%--------------------------------------------------------------------
% data generation for encoder
numObs = 10;
vocabSize = 410;               % feature length per observation (matches the 1x410 cells below)
seqLen = vocabSize;
x_enc = randi([1,10],[seqLen,numObs]);
y_enc = zeros(numObs,1);
for i = 1:numObs
    % scalar target: sum of the entries of x_enc picked out by its first two values
    idx = x_enc(1:2,i);
    y_enc(i,:) = sum(x_enc(idx,i));
end
x_enc = num2cell(x_enc',2);    % numObs-by-1 cell of 1-by-seqLen rows
y_enc = num2cell(y_enc)';
x_1 = x_enc;
y_2 = y_enc';
% data generation for decoder
x_series = randi([1,10],[seqLen,numObs]);
y_series = sin(rand([seqLen,numObs]));
x_dec = x_series(:,1:end)';
y_dec = y_series(:,1:end)';
x_dec = num2cell(x_dec,2); x_2 = x_dec;
y_dec = num2cell(y_dec,2); y_1 = y_dec;
cell_data = [x_1 x_2 y_1 y_2];             % 10x4 cell: {x_enc, x_dec, y_dec, y_enc}
dstrain = arrayDatastore(cell_data,'OutputType','same');
%--------------------------------------------------------------------
cell_data is of the form:
cell_data
cell_data =
10×4 cell array
{1×410 double} {1×410 double} {1×410 double} {[ 8]}
{1×410 double} {1×410 double} {1×410 double} {[10]}
{1×410 double} {1×410 double} {1×410 double} {[13]}
{1×410 double} {1×410 double} {1×410 double} {[20]}
{1×410 double} {1×410 double} {1×410 double} {[ 7]}
{1×410 double} {1×410 double} {1×410 double} {[ 8]}
{1×410 double} {1×410 double} {1×410 double} {[17]}
{1×410 double} {1×410 double} {1×410 double} {[ 6]}
{1×410 double} {1×410 double} {1×410 double} {[11]}
{1×410 double} {1×410 double} {1×410 double} {[ 8]}
If I use readall(dstrain) to read the datastore, I get the same format as cell_data:
fds = readall(dstrain)
fds =
10×4 cell array
{1×410 double} {1×410 double} {1×410 double} {[ 8]}
{1×410 double} {1×410 double} {1×410 double} {[10]}
{1×410 double} {1×410 double} {1×410 double} {[13]}
{1×410 double} {1×410 double} {1×410 double} {[20]}
{1×410 double} {1×410 double} {1×410 double} {[ 7]}
{1×410 double} {1×410 double} {1×410 double} {[ 8]}
{1×410 double} {1×410 double} {1×410 double} {[17]}
{1×410 double} {1×410 double} {1×410 double} {[ 6]}
{1×410 double} {1×410 double} {1×410 double} {[11]}
{1×410 double} {1×410 double} {1×410 double} {[ 8]}
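A single read behaves the same way; with the default IterationDimension of 1, one read should return one 1-by-4 row of the cell array:
oneObs = read(dstrain);    % 1x4 cell: {x_enc, x_dec, y_dec, y_enc} for one observation
reset(dstrain);            % rewind before training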
Finally, if I use minibatchqueue to create a mini-batch queue from the datastore dstrain, I get:
mbq = minibatchqueue(dstrain)
mbq =
minibatchqueue with 4 outputs and properties:
Mini-batch creation:
MiniBatchSize: 10
PartialMiniBatch: 'return'
MiniBatchFcn: 'collate'
PreprocessingEnvironment: 'serial'
Outputs:
OutputCast: {'single' 'single' 'single' 'single'}
OutputAsDlarray: [1 1 1 1]
MiniBatchFormat: {'' '' '' ''}
OutputEnvironment: {'auto' 'auto' 'auto' 'auto'}
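Pulling a batch manually also asks for four outputs, one per column of cell_data (the variable names below are just mine for illustration):
[xEncBatch, xDecBatch, yDecBatch, yEncBatch] = next(mbq);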
As you can see, the minibatchqueue has four outputs, which appears to contradict the error message's claim that there are only two mini-batch queue outputs.
Also, to confirm, I double-checked the transformer's input/output structure:
net
net =
dlnetwork with properties:
Layers: [64×1 nnet.cnn.layer.Layer]
Connections: [1714×2 table]
Learnables: [110×3 table]
State: [0×3 table]
InputNames: {'in_enc' 'in_dec'}
OutputNames: {'decoder_out' 'fc_13'}
Initialized: 1
View summary with summary.
which shows two inputs and two outputs.
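So predicting with this network should take the two inputs and hand back the two outputs, along these lines (dlX1 and dlX2 stand in for one mini-batch of the encoder and decoder inputs):
[yDecPred, yScalarPred] = predict(net, dlX1, dlX2);   % two outputs, matching OutputNames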
Could someone point me to the mistake I'm making here (likely with the datastore format)? It seems that during batching only two of the cell columns from cell_data/dstrain are being picked up for the inputs and outputs, rather than all four, and it's not clear why. Thanks in advance for your help!
CG