Custom loss function (based on multiplying errors rather than summing them) in a classification neural network
Hi everyone,
First, thank you! This is a fantastic community from which I’m learning so much. This is my first question (hopefully, I’ll be able to contribute answers in the future!).
I have a system consisting of 10 elements, where each element can exist in one of four states (or classes), so the system as a whole has 4^10 = 1,048,576 possible states. For each element, I have 61 features that can be used to predict its state. I’ve experimented with different neural networks (feedforward networks have worked well so far), mainly focusing on predicting the labels of individual elements.
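For concreteness, the per-element classifier I have in mind is something like this (the layer sizes here are placeholder guesses of mine, not tuned values):

```matlab
% Per-element feedforward classifier: 61 features in, 4 class
% probabilities out. The hidden width (64) is an illustrative placeholder.
layers = [
    featureInputLayer(61)
    fullyConnectedLayer(64)
    reluLayer
    fullyConnectedLayer(4)
    softmaxLayer];
net = dlnetwork(layers);
```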
However, I’ve encountered some challenges:
The classes are naturally imbalanced.
The problem is non-deterministic, meaning two identical feature vectors can correspond to different labels.
I’ve been addressing these issues with relative success by applying techniques such as downsampling, oversampling, data augmentation, and soft labels (the latter has been the most effective).
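For reference, the soft-label idea in my setup looks roughly like this (a minimal sketch; the numbers are made up, and I am assuming crossentropy accepts probabilistic targets rather than only one-hot ones):

```matlab
% Soft labels: instead of a one-hot target, each target holds the
% empirical class distribution observed for that feature vector.
Y = dlarray([0.60; 0.25; 0.10; 0.05], 'CB');  % predicted probabilities (4 classes, 1 observation)
T = [0.70; 0.20; 0.10; 0.00];                 % soft target: empirical class distribution
loss = crossentropy(Y, T);
```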
Now, I want to predict the probability of the entire system being in each of its 4^10 states. One issue I’ve noticed is that a misclassification error of 0.05 has minimal impact when the predicted probability is close to random (e.g., 0.25 with four classes), but a significant impact when probabilities are closer to 1 or 0.
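To make that concrete, here is a back-of-the-envelope check in plain MATLAB (the numbers are illustrative): in a sum-based loss, a 0.05 error always contributes the same amount, but its effect on a product of probabilities depends strongly on the value it perturbs.

```matlab
% Relative effect of the same absolute error (0.05) on one probability,
% and the resulting factor on a product over all 10 elements if every
% element were perturbed equally. Plain MATLAB, no toolboxes.
err = 0.05;
p   = [0.95 0.25 0.10];            % near-certain, near-random, near-zero
relChange   = err ./ p             % ~0.053, 0.20, 0.50
jointFactor = (1 - err ./ p).^10   % ~0.58, ~0.11, ~0.00098
```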
What I’d like to do next is implement a loss function that considers the entire system rather than individual elements, while still being based on predictions for each element. My idea is to:
Take batches of 10 observations (corresponding to the 10 elements of the system).
Compute the probability of each element belonging to each of the 4 classes.
Calculate the probability of the system being in each of its 4^10 possible states based on these predictions.
Sort these probabilities and use the known labels to find the index of the correct state.
Minimize this index (see the sketch after this list).
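Very roughly, I imagine the computation looking like the sketch below (plain MATLAB; the variable names are mine, and random numbers stand in for actual network outputs). One caveat I’m aware of, flagged in the comments: the sorted index is an integer, so it has zero gradient almost everywhere.

```matlab
% Sketch of the proposed system-level loss. CAVEAT: the rank computed at
% the end is piecewise constant in the network weights, so gradient-based
% training (e.g. trainnet with a custom loss function) would need a
% differentiable surrogate, such as the negative log of the true state's
% joint probability.
nElements = 10;  nClasses = 4;

% Per-element class probabilities for one system (e.g. softmax outputs
% for a batch of 10 observations); random stand-ins here.
elementProbs = rand(nClasses, nElements);
elementProbs = elementProbs ./ sum(elementProbs, 1);  % normalize columns

trueLabels = randi(nClasses, 1, nElements);           % known label of each element

% Enumerate all 4^10 system states as rows of class indices
% (about 10^6 rows -- feasible, but memory-hungry).
grids = cell(1, nElements);
[grids{:}] = ndgrid(1:nClasses);
states = reshape(cat(nElements + 1, grids{:}), [], nElements);  % (4^10)-by-10

% Joint probability of each state = product of the per-element
% probabilities it selects (assumes independence across elements).
cols = repmat(1:nElements, size(states, 1), 1);
jointProb = prod(elementProbs(sub2ind(size(elementProbs), states, cols)), 2);

% Rank of the true state when states are sorted by descending probability.
[~, order] = sort(jointProb, 'descend');
trueRow = find(all(states == trueLabels, 2));
rankOfTrueState = find(order == trueRow)  % the quantity I propose to minimize
```

If a differentiable surrogate is needed, it is worth noting that the negative log of the true state’s joint probability, under the independence assumption above, decomposes exactly into the sum of the per-element cross-entropies.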
Does this approach make sense? Is it feasible? And if so, how could it be implemented?
Many thanks!
David