3D U-Net Semantic Segmentation on custom CT dataset
Hello:
I am trying to apply the concept and sample code of the tutorial "3D brain tumor segmentation using deep learning", found at the link below:
https://in.mathworks.com/help/deeplearning/examples/segment-3d-brain-tumor-using-deep-learning.html
to a custom CT dataset in order to segment lymph nodes. My use case also has 2 classes, i.e. background and lymph nodes. The current dataset is small (70 data points in total), and the split after preprocessing is 56 for training, 10 for validation, and 4 for testing.
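For reference, the split is a simple random partition along these lines (a sketch; the index variables are mine, not the tutorial's):

rng(0);                    % reproducible shuffle
idx = randperm(70);
trainIdx = idx(1:56);
valIdx   = idx(57:66);
testIdx  = idx(67:70);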
I have modified the preprocessing code to accept the custom dataset, and I crop the ROI (i.e. the cuboid region around the abdomen that contains the ground truth for the lymph nodes) out of the full CT torso volume.
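The crop itself is roughly the following (a sketch; niftiread and the file variables are assumptions about my data format, not code from the tutorial):

vol = niftiread(volFile);   % full CT torso volume
lbl = niftiread(lblFile);   % ground-truth lymph-node mask
% Cuboid bounding box around all labeled voxels
[r, c, s] = ind2sub(size(lbl), find(lbl > 0));
volCrop = vol(min(r):max(r), min(c):max(c), min(s):max(s));
lblCrop = lbl(min(r):max(r), min(c):max(c), min(s):max(s));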
I have visualized the patches extracted by the randomPatchExtractionDatastore and found that there is a class imbalance: roughly 4 out of 5 extracted patches contain a lymph node portion, and only 1 out of 5 is pure background. I suspect this may be causing an issue (the Dice scores mentioned later in this post suggest the same).
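This is roughly how I counted the patch classes (a sketch; the datastore variables and the class name 'lymph' are my assumptions):

patchds = randomPatchExtractionDatastore(volds, pxds, [64 64 64], ...
    'PatchesPerImage', 32);
nWithNode = 0; nTotal = 0;
while hasdata(patchds)
    t = read(patchds);      % table with InputImage and ResponsePixelLabelImage
    for k = 1:height(t)
        lblPatch = t.ResponsePixelLabelImage{k};
        nWithNode = nWithNode + any(lblPatch(:) == 'lymph');
        nTotal = nTotal + 1;
    end
end
fprintf('%.0f%% of patches contain lymph-node voxels\n', 100*nWithNode/nTotal);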
In my case the volumes are single-channel, whereas in the tutorial they are 4-channel, so I have changed the input layer size to [64 64 64 1].
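Concretely, the change amounts to this (a sketch using unet3dLayers from the Computer Vision Toolbox rather than the tutorial's layer-by-layer construction):

inputPatchSize = [64 64 64 1];   % 1 channel for CT instead of 4 for multimodal MRI
numClasses = 2;                  % background and lymph node
lgraph = unet3dLayers(inputPatchSize, numClasses, 'ConvolutionPadding', 'same');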
The code runs without errors, and training works, but the accuracy starts at 100% and the loss starts at approximately 1. This is unlike the graph shown in the tutorial, where the accuracy gradually increases and reaches about 72%. One epoch takes about 75 minutes on an RTX 2080 Ti, so I trained for only 10 epochs, as I was suspicious about the class imbalance of the randomly extracted patches.
I used PatchesPerImage = 32 and an initial learning rate of 0.001 for the first 5 epochs. The validation frequency is 10.
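My training options look roughly like this (a sketch; dsVal is my validation patch datastore):

options = trainingOptions('adam', ...
    'InitialLearnRate', 1e-3, ...
    'MaxEpochs', 10, ...
    'ValidationData', dsVal, ...
    'ValidationFrequency', 10, ...
    'Plots', 'training-progress', ...
    'Verbose', false);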
The training graph is shown below:
The segmentation test shows that the model only identifies the background and is not learning to segment the lymph nodes. The "Quantify Segmentation Accuracy" section of the code suggests the same thing: the lymph nodes are not identified by the network.
Average Dice score of the background across the 4 test volumes = 0.99685
Average Dice score of the lymph nodes across the 4 test volumes = 0
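The scores come from a per-class Dice computation along these lines (groundTruthLabels and predictedLabels are my cell arrays of categorical test volumes; names are mine):

diceResult = zeros(4, 2);        % rows: test volumes, columns: classes
for k = 1:4
    diceResult(k,:) = dice(groundTruthLabels{k}, predictedLabels{k});
end
meanDiceBackground = mean(diceResult(:,1));   % 0.99685 in my run
meanDiceLymphNode  = mean(diceResult(:,2));   % 0 in my run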
A box plot of the Dice scores is shown below:
I need some help and guidance on the following –
1. What can be the reason for this? Is it the class imbalance, the lack of training, or the small number of data points?
2. How can I ensure that the patches in the randomPatchExtractionDatastore are equally balanced across both classes, i.e. background and lymph nodes?
3. Should I train the network for more epochs (like 50 or 100) before evaluating its efficacy?
4. Is the limited number of data points the problem? I only have 175 data points in the comprehensive dataset. Should I use a higher PatchesPerImage value, like 64 (instead of 16 as used in the tutorial)?
Thanks for your guidance.