I'm using a vision transformer (ViT) in my code. How can I convert the 1-D output of a ViT layer into a 2-D output with the format SSCB?
I used the following code from a MATLAB Answers post to solve the errors shown in the attached figure.
(Samuel Somuyiwa on 24 Jul 2023)
% Get Vision Transformer model
net = visionTransformer;
% Create dummy input
input = dlarray(rand(384,384,3),'SSCB');
% Obtain output embedding from last layerNormalizationLayer
out = forward(net, input, Outputs='encoder_norm');
% Reshape output patch embedding
out = reshapePatchEmbedding(out);
function out = reshapePatchEmbedding(in)
% Remove output embedding corresponding to class token from input
out = in(2:end,:,:);
% Reshape resulting embedding to input format
WH = sqrt(size(out, 1));
C = size(out, 2);
out = reshape(out, WH, WH, C, []); % Shape is W x H x C x N
out = permute(out, [2, 1, 3, 4]); % Shape is H x W x C x N
% Convert to formatted dlarray
out = dlarray(out, 'SSCB');
end
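To make the reshape step easier to check, here is a minimal sketch on a plain numeric array. It does not use the toolbox. It assumes the same layout that reshapePatchEmbedding assumes, namely tokens x channels x batch with the class token in row 1 and the patch tokens in row-major order. The sizes G, C, N and the fill values are hypothetical and chosen only for illustration. If dims(out) and size(out) on the real encoder_norm output show a different dimension order, the indexing would need to change to match.

```matlab
% Toy sizes (hypothetical): 3x3 patch grid, 2 channels, batch of 1
G = 3; C = 2; N = 1;
T = G*G + 1;                     % one class token followed by G*G patch tokens

% Assumed layout: tokens x channels x batch, class token in row 1
in = zeros(T, C, N);
for p = 1:G*G
    for k = 1:C
        in(p + 1, k, 1) = 100*k + p;   % encode channel k and patch index p
    end
end

% Same steps as reshapePatchEmbedding, on a plain array
patches = in(2:end, :, :);             % drop the class token
fmap = reshape(patches, G, G, C, []);  % W x H x C x N (column-major fill)
fmap = permute(fmap, [2 1 3 4]);       % H x W x C x N

% Patch at grid row r, column c should have row-major index (r-1)*G + c
r = 2; c = 3; k = 1;
assert(fmap(r, c, k) == 100*k + (r - 1)*G + c);
assert(isequal(size(fmap), [G G C]));  % trailing singleton batch dim is dropped by size
```

If both assertions pass, the reshape followed by the permute places each patch token at its row and column in the grid, which is the arrangement the final dlarray(out, 'SSCB') call expects.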