The basic concept of the Receiver Operating Characteristic (ROC) curve can be tough to understand. In my previous blog post (https://taseenresearch.blogspot.com/2021/04/machine-learning-model-evaluation.html), I wrote about the ROC curve in short.
Here I am trying to explain the ROC curve by relating it to the confusion matrix. I hope it will help clear up your understanding of the ROC curve and how it evaluates a model's performance.
The ROC curve is basically an indicator that evaluates the output quality of a classifier algorithm. In a ROC curve, the True Positive (TP) rate is on the Y-axis and the False Positive (FP) rate is on the X-axis. This means the top-left corner of the plot is the "ideal" point, with a True Positive rate of one and a False Positive rate of zero, which indicates a better model.
Have a look at this figure:
Now, coming to the point: why does this help to evaluate a model's performance? I mean, why can we say that a True Positive rate of one and a False Positive rate of zero indicates a better model?
Suppose the dataset consists of pregnant women and women who are not pregnant. Then...
Have a look at this confusion matrix (True Positive = 3, False Negative = 0, False Positive = 3, True Negative = 0).
Let's calculate the True Positive (TP) rate:
TP rate = True Positive / (True Positive + False Negative)
TP rate = 3 / (3 + 0) = 1
This means every single pregnant woman is correctly classified as pregnant when the threshold is this low. The proportion of pregnant women classified as pregnant is 1.
Now, let's calculate the False Positive (FP) rate:
FP rate = False Positive / (False Positive + True Negative)
FP rate = 3 / (3 + 0) = 1
This means every single woman who is not pregnant is incorrectly classified as pregnant when the threshold is this low. The proportion of non-pregnant women classified as pregnant is also 1.
That means the point is (1, 1) if we draw it on the plot.
Actually, the point (1, 1) means...
100% of the pregnant women are correctly classified as pregnant, and 100% of the women who are not pregnant are incorrectly classified as pregnant, so the False Positive rate is too high.
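The arithmetic above can be sketched in a few lines of Python. This is only a minimal illustration: the function name `roc_point` and its argument names are my own choices, and the counts are the ones from the first confusion matrix.

```python
def roc_point(tp, fn, fp, tn):
    """Return (FP rate, TP rate) for one confusion matrix."""
    tpr = tp / (tp + fn)  # share of actual positives predicted positive
    fpr = fp / (fp + tn)  # share of actual negatives predicted positive
    return fpr, tpr


# First confusion matrix: every sample is predicted as pregnant.
first_point = roc_point(tp=3, fn=0, fp=3, tn=0)
print(first_point)  # (1.0, 1.0)
```

The result is returned as (x, y), so it can go straight onto the ROC plot.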
In the same way...
Have a look at this confusion matrix (True Positive = 3, False Negative = 0, False Positive = 2, True Negative = 1).
Let's calculate the True Positive (TP) rate:
TP rate = True Positive / (True Positive + False Negative)
TP rate = 3 / (3 + 0) = 1
This means every single pregnant woman is still correctly classified as pregnant at this threshold. The proportion of pregnant women classified as pregnant is 1.
Now, let's calculate the False Positive (FP) rate:
FP rate = False Positive / (False Positive + True Negative)
FP rate = 2 / (2 + 1) = 0.67
This means two of the three women who are not pregnant are incorrectly classified as pregnant at this threshold. The proportion of non-pregnant women classified as pregnant is 0.67.
That means the point is (0.67, 1), which is to the left of the point (1, 1).
Actually, the point (0.67, 1) means...
100% of the pregnant women are correctly classified as pregnant, and 67% of the women who are not pregnant are incorrectly classified as pregnant, so the False Positive rate is still high.
That means the proportion of pregnant women who are correctly classified (the True Positive rate) is greater than the proportion of non-pregnant women who are incorrectly classified as pregnant (the False Positive rate).
This indicates a better model than the first one.
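If you start from labels instead of a ready-made confusion matrix, the four cells can be counted directly. The sketch below uses made-up label lists (`y_true` and `y_pred` are my assumptions, chosen only so that they reproduce the second confusion matrix), with 1 meaning pregnant and 0 meaning not pregnant.

```python
# Hypothetical labels: 1 = pregnant, 0 = not pregnant.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 1, 1, 1, 0]  # assumed predictions matching the second matrix

pairs = list(zip(y_true, y_pred))
tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # pregnant, predicted pregnant
fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # pregnant, predicted not pregnant
fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # not pregnant, predicted pregnant
tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # not pregnant, predicted not pregnant

tpr = tp / (tp + fn)
fpr = fp / (fp + tn)
print(tp, fn, fp, tn)      # 3 0 2 1
print(round(fpr, 2), tpr)  # 0.67 1.0
```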
Following the same way...
Have a look at this confusion matrix (True Positive = 2, False Negative = 1, False Positive = 1, True Negative = 2).
Let's calculate the True Positive (TP) rate:
TP rate = True Positive / (True Positive + False Negative)
TP rate = 2 / (2 + 1) = 0.67
This means two of the three pregnant women are correctly classified as pregnant at this threshold. The proportion of pregnant women classified as pregnant is 0.67.
Now, let's calculate the False Positive (FP) rate:
FP rate = False Positive / (False Positive + True Negative)
FP rate = 1 / (1 + 2) = 0.33
This means one of the three women who are not pregnant is incorrectly classified as pregnant at this threshold. The proportion of non-pregnant women classified as pregnant is 0.33.
That means the point is (0.33, 0.67) if we draw it on the plot, which is to the left of the point (0.67, 1).
Actually, the point (0.33, 0.67) means...
67% of the pregnant women are correctly classified as pregnant, and 33% of the women who are not pregnant are incorrectly classified as pregnant, so some False Positives still exist.
That means the proportion of pregnant women who are correctly classified (the True Positive rate) is greater than the proportion of non-pregnant women who are incorrectly classified as pregnant (the False Positive rate).
This indicates a better model than the first two.
Well, the last case...
Have a look at this confusion matrix (True Positive = 2, False Negative = 1, False Positive = 0, True Negative = 3).
Let's calculate the True Positive (TP) rate:
TP rate = True Positive / (True Positive + False Negative)
TP rate = 2 / (2 + 1) = 0.67
This means two of the three pregnant women are correctly classified as pregnant at this threshold. The proportion of pregnant women classified as pregnant is 0.67.
Now, let's calculate the False Positive (FP) rate:
FP rate = False Positive / (False Positive + True Negative)
FP rate = 0 / (0 + 3) = 0
This means none of the women who are not pregnant is incorrectly classified as pregnant at this threshold. The proportion of non-pregnant women classified as pregnant is 0.
That means the point is (0, 0.67) if we draw it on the plot, which is to the left of the point (0.33, 0.67).
Actually, the point (0, 0.67) means...
67% of the pregnant women are correctly classified as pregnant, and 0% of the women who are not pregnant are incorrectly classified as pregnant. This means 100% of them are correctly predicted as not pregnant. Simply said, there are no False Positives.
That means the proportion of pregnant women who are correctly classified (the True Positive rate) is greater than the proportion of non-pregnant women who are incorrectly classified as pregnant (the False Positive rate).
This indicates a better model than the first three.
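One way to put a number on "closer to the ideal corner" is to measure how far each point is from (0, 1). This is just a sketch with my own assumption that straight-line (Euclidean) distance is a reasonable yardstick. It is not the only way to compare ROC points, and the names `points` and `ideal` are made up for the illustration.

```python
import math

# (FP rate, TP rate) points from the four confusion matrices in this post.
points = {
    "first": (3 / 3, 3 / 3),
    "second": (2 / 3, 3 / 3),
    "third": (1 / 3, 2 / 3),
    "fourth": (0 / 3, 2 / 3),
}

ideal = (0.0, 1.0)  # top-left corner: FP rate = 0, TP rate = 1

# Euclidean distance to the ideal corner; smaller means closer to ideal.
distances = {name: math.dist(p, ideal) for name, p in points.items()}
ranking = sorted(distances, key=distances.get)
for name in ranking:
    print(name, round(distances[name], 2))
```

If you run it, the fourth point comes out closest to the corner and the first one farthest, which matches the order in the walkthrough.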
So, it can be said that the lower the False Positive rate, the better the model is, as long as the True Positive rate stays high. This is why the top-left corner of the plot is the "ideal" point, with a True Positive rate of one and a False Positive rate of zero.
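To see how a threshold produces these points, here is a small sketch that sweeps four thresholds over made-up scores. The scores and threshold values are my assumptions (they are not from any real dataset); I picked them only so that the four thresholds reproduce the four points discussed above.

```python
# Hypothetical scores (not real data): higher means "more likely pregnant".
samples = [
    (0.9, 1), (0.8, 1), (0.6, 1),  # actually pregnant
    (0.7, 0), (0.5, 0), (0.2, 0),  # actually not pregnant
]


def roc_point_at(threshold):
    """Classify score >= threshold as pregnant and return (FP rate, TP rate)."""
    tp = sum(1 for s, y in samples if y == 1 and s >= threshold)
    fn = sum(1 for s, y in samples if y == 1 and s < threshold)
    fp = sum(1 for s, y in samples if y == 0 and s >= threshold)
    tn = sum(1 for s, y in samples if y == 0 and s < threshold)
    return round(fp / (fp + tn), 2), round(tp / (tp + fn), 2)


thresholds = [0.1, 0.4, 0.65, 0.75]  # assumed values, from low to high
roc_points = [roc_point_at(t) for t in thresholds]
print(roc_points)
# [(1.0, 1.0), (0.67, 1.0), (0.33, 0.67), (0.0, 0.67)]
```

Raising the threshold moves the point to the left on the plot, because fewer women who are not pregnant get classified as pregnant.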