Mask R-CNN maximum number of detected instances per image.
Hi, I’ve been working on Mask R-CNN following the documentation instructions. I’ve got everything to work, but I stumbled upon a potential library limitation. Let me explain my situation: I am working on a dataset of ~250 images (split between training and validation) with just 1 category. Each image can contain anywhere from 40-50 instances up to 600-650 instances.
The problem is this: the maskrcnn object can only detect up to 100 instances per image, because the limit is hard-coded in the class definition. I believe this is hurting the training of the network, but I cannot confirm it directly, because I have to run the training remotely from the command prompt, since I don’t have a GPU powerful enough to train locally. My evidence is that, after training, the network performs reasonably well on images with 40-50 instances, while it performs horribly on images with many instances. In fact, when I evaluate the network on the validation set (something I can do on my own computer), it outputs at most 100 masks per image.
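For reference, here is a minimal sketch of how I check the mask count per validation image (variable names like detector and valFiles are placeholders, not my exact script):

% Count how many masks the trained detector returns per validation image.
% "detector" is the trained maskrcnn object, "valFiles" a cell array of image file names.
maxMasks = 0;
for k = 1:numel(valFiles)
    I = imread(valFiles{k});
    [masks, labels, scores] = segmentObjects(detector, I);   % masks is H-by-W-by-numDetections
    maxMasks = max(maxMasks, size(masks, 3));
end
fprintf("Most masks returned for any validation image: %d\n", maxMasks);

With the unmodified toolbox file, maxMasks never goes above 100, no matter how many instances an image actually contains.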
My "local" fix: I edited the maskrcnn.m file of the library. I went to the directory "C:ProgramFilesMATLABR2023btoolboxvisionvision@maskrcnnmaskrcnn.m" and at line 172 of the code, instead of
NumStrongestRegionsPrediction = 100
I put (expecting to not detect more than 800 instances, given my ground truth data)
NumStrongestRegionsPrediction = 800
which fixes my issue at least at validation time. However, my training still runs without this fix and, given my results, I am writing here to ask what I can do about this issue. I am basically certain my own code is correct.
Again, all I can observe at training time is the training loss. It converges to a good value, but sometimes it jumps to a larger number, probably when it encounters a batch containing the images with many instances; in other words, the network isn’t learning enough from these images, and that shows up as mistakes and higher training loss.
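To back up the "many instances" point, this is roughly how I inspect the per-image ground-truth instance counts (a sketch; it assumes the usual Mask R-CNN training datastore format where each read returns a 1-by-4 cell of {image, boxes, labels, masks}, and trainDS is a placeholder name):

% Count ground-truth instances per image and see how many images exceed the 100 cap.
reset(trainDS);
counts = [];
while hasdata(trainDS)
    data = read(trainDS);
    counts(end+1) = size(data{2}, 1);   % one bounding-box row per instance
end
fprintf("Images with more than 100 instances: %d of %d (max = %d)\n", ...
    nnz(counts > 100), numel(counts), max(counts));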
I can provide more information if needed, but for now I want to keep the post simple.