Overfitting if: training loss << validation loss
Underfitting if: training loss >> validation loss
Just right if: training loss ~ validation loss
Question: How should we interpret >>, <<, and ~?
For example, what ratio between the training and validation loss would indicate that you are overfitting, underfitting, or in a good place?
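There is no universally agreed ratio, but as a rough illustration, the comparison could be encoded as a heuristic like this. The 10% margin is an arbitrary assumption for the sketch, not an established rule:

```python
def diagnose_fit(train_loss, val_loss, margin=0.10):
    """Rough heuristic comparing validation loss to training loss.

    `margin` (10% here) is an arbitrary illustrative choice,
    not an established threshold.
    """
    if val_loss > train_loss * (1 + margin):
        return "possibly overfitting"
    if train_loss > val_loss * (1 + margin):
        return "possibly underfitting (or keep training)"
    return "about right"

print(diagnose_fit(0.20, 0.45))  # validation loss much higher than training
print(diagnose_fit(0.50, 0.48))  # losses close together
```

In practice the raw ratio matters less than the trend over epochs, which the replies below get into.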
OK, so your underlying question is how to interpret << and >>.
I'm not an expert, but my assumptions have been:
- Typically validation loss should be similar to, but slightly higher than, training loss. As long as validation loss is lower than or even equal to training loss, keep training.
- If training loss is decreasing without an increase in validation loss, then again keep training.
- If validation loss starts increasing then it is time to stop
- If overall accuracy is still not acceptable, then review the mistakes the model is making and think about what you could change:
- More data? More / different data augmentations? Generative data?
- Different architecture?
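The first three rules of thumb above amount to simple early stopping: keep training while validation loss improves, and stop once it has gone a few epochs without improving. A minimal sketch, where the `patience` value of 2 is my own assumption:

```python
def best_stopping_epoch(val_losses, patience=2):
    """Return the (0-based) epoch with the best validation loss,
    scanning until the loss has failed to improve for `patience`
    consecutive epochs. `patience=2` is an arbitrary assumption.
    """
    best, best_epoch = float("inf"), 0
    bad_streak = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, best_epoch = loss, epoch
            bad_streak = 0
        else:
            bad_streak += 1
            if bad_streak >= patience:
                break  # validation loss has started increasing: stop
    return best_epoch

# Toy curve: improves for three epochs, then starts rising.
print(best_stopping_epoch([0.9, 0.7, 0.6, 0.65, 0.7, 0.8]))  # → 2
```

Real training loops would checkpoint the model at the best epoch rather than just record its index, but the stopping logic is the same.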
Funnily enough, some over-fitting is nearly always a good thing. All that matters in the end is: is the validation loss as low as you can get it (and/or the validation accuracy as high)? That lowest point often occurs when the training loss is already quite a bit lower than the validation loss.
From lesson 1 we have:
If you try training for more epochs, you’ll notice that we start to overfit, which means that our model is learning to recognize the specific images in the training set, rather than generalizing such that we also get good results on the validation set.
So, I took the 3 lines of code and ran for 50 epochs and got the following:

```python
arch = resnet34
data = ImageClassifierData.from_paths(PATH, tfms=tfms_from_model(arch, sz))
learn = ConvLearner.pretrained(arch, data, precompute=True)
learn.fit(0.01, 50)
```
Two things I observe from this graph about over-fitting are:
- The training loss keeps decreasing after every epoch. Our model is learning to recognize the specific images in the training set.
- The validation loss keeps increasing after every epoch. Our model is not generalizing well enough on the validation set.
After 250 epochs
The trend is so clear with lots of epochs!
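Given per-epoch losses like the ones behind these graphs, the epoch where overfitting began is simply the one with the lowest validation loss. A small sketch on made-up numbers shaped like the run described above (training loss keeps falling, validation loss bottoms out and then rises):

```python
# Toy numbers for illustration only, not the actual losses from the run.
train_losses = [0.80, 0.55, 0.40, 0.30, 0.22, 0.16, 0.12, 0.09]
val_losses   = [0.85, 0.62, 0.50, 0.46, 0.47, 0.51, 0.58, 0.66]

# The epoch worth keeping is the one with minimum validation loss;
# everything after it is the overfitting regime seen in the graph.
best_epoch = min(range(len(val_losses)), key=val_losses.__getitem__)
print(best_epoch, val_losses[best_epoch])  # → 3 0.46
```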