How to convert the format of data from sequences to matrices when designing deep learning networks?
Hello,
After importing the network into Deep Network Designer for analysis, I ran into the following problem: the output of selfAttentionLayer has the size and format 577 (S) x 577 (C) x 1 (B).
I want to convert it to an image-like format, like the one produced by imageInputLayer: _ (S) x _ (S) x _ (C) x _ (B). How can I implement this in MATLAB?
The code for the network is as follows:
patchSize = 16;
embeddingOutputSize = 768;
layer = patchEmbeddingLayer(patchSize,embeddingOutputSize)
net = dlnetwork;
inputSize = [384 384 3];
maxPosition = (inputSize(1)/patchSize)^2 + 1;
numHeads = 4;
numKeyChannels = 4*embeddingOutputSize;
numClasses = 1000;
layers = [
imageInputLayer(inputSize)
patchEmbeddingLayer(patchSize,embeddingOutputSize,Name="patch-emb")
embeddingConcatenationLayer(Name="emb-cat")
positionEmbeddingLayer(embeddingOutputSize,maxPosition,Name="pos-emb")
additionLayer(2,Name="add")
selfAttentionLayer(numHeads,numKeyChannels,AttentionMask="causal",OutputSize=maxPosition)
fullyConnectedLayer(numClasses)
softmaxLayer];
net = addLayers(net,layers);
net = connectLayers(net,"emb-cat","add/in2");
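For reference, here is a minimal sketch of the reshape being asked about, done on a stand-in dlarray outside the network. It rests on two assumptions that are not confirmed by the code above: that the class token added by embeddingConcatenationLayer is the first element along the S dimension, and that the remaining 576 patch positions are ordered so that a column-major reshape restores the 24-by-24 patch grid. The variable names gridSize, numChannels, and batchSize are only illustrative.

```matlab
% Stand-in for the selfAttentionLayer output reported above:
% 577 (S) x 577 (C) x 1 (B).
X = dlarray(rand(577,577,1,"single"),"SCB");

gridSize = 384/16;      % 24 patches per side, from inputSize(1)/patchSize

Y = stripdims(X);       % unformatted data, dimension order is still S, C, B
Y = Y(2:end,:,:);       % ASSUMPTION: the class token is the first sequence element
numChannels = size(Y,2);
batchSize = size(Y,3);

% ASSUMPTION: the patch order matches a column-major reshape of the grid.
Y = reshape(Y,gridSize,gridSize,numChannels,batchSize);
Y = dlarray(Y,"SSCB");  % relabel as spatial, spatial, channel, batch

disp(dims(Y))
```

Inside the network, the same steps could probably be wrapped in a functionLayer created with Formattable=true and placed after selfAttentionLayer, but I have not verified that here. Note also that the 577 channels come from OutputSize=maxPosition in the selfAttentionLayer call, so the result would be 24 (S) x 24 (S) x 577 (C) x 1 (B).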