The ROC curve is a graphical representation of the relationship between the true positive rate and the false positive rate. A standard way to summarize that relationship is the area under the curve, which is displayed below the graph in the report. ROC curves are often used to show the trade-off between clinical sensitivity and specificity for each possible cutoff of a test or combination of tests.
Furthermore, the area under the ROC curve gives an idea of the benefit of using the tests in question. The term ROC stands for receiver operating characteristic. The initial investigation was motivated by a desire to determine how the US radar "receiver operators" had missed the Japanese aircraft. Today the same analysis is applied to diagnostic tests, as described above.
What are ROC curves?
A useful tool when predicting the probability of a binary outcome is the receiver operating characteristic curve, or ROC curve. It is a plot of the false positive rate (x-axis) against the true positive rate (y-axis) for a number of candidate threshold values between 0.0 and 1.0. In other words, it plots the false alarm rate against the hit rate.
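The threshold sweep described above can be sketched in a few lines of plain Python. This is only an illustration: the function name roc_points and the four-sample data set are made up for the example, and using every distinct score as a candidate threshold is one convenient choice, not the only one.

```python
def roc_points(y_true, scores):
    """Return a list of (false positive rate, true positive rate) pairs,
    one pair per candidate threshold."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    # Candidate thresholds: every distinct score, highest first, plus one
    # threshold above all scores so the curve starts at (0, 0).
    thresholds = [float("inf")] + sorted(set(scores), reverse=True)
    points = []
    for t in thresholds:
        # A sample is predicted positive when its score reaches the threshold.
        tp = sum(1 for y, s in zip(y_true, scores) if s >= t and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= t and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


# Hypothetical labels (1 = condition present) and predicted scores.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
points = roc_points(y_true, scores)
for fpr, tpr in points:
    print(fpr, tpr)
```

Plotting the second value of each pair against the first gives the ROC curve for this small example.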
Two standard definitions used in this area are as follows:
- Sensitivity, the probability that a given x-value (a test or measurement) correctly predicts an existing condition. For a given x, the probability of missing an existing condition is 1 - sensitivity.
- Specificity, the probability that a test will correctly predict that a condition does not exist.
A ROC curve is a graph of sensitivity by (1 - specificity) for each value of x. The area under the ROC curve is a common index that is used to summarize the information contained in the curve. When you perform a simple logistic regression on a binary result, there is a platform option to request an ROC curve for that analysis.
After selecting the ROC Curve option, you must specify which level to use as the positive response. If a test were to predict perfectly, it would have a value above which the entire abnormal population would fall and below which all normal values would fall. It would be perfectly sensitive and perfectly specific, and its ROC curve would pass through the point (0,1) on the grid. The closer the ROC curve comes to this ideal point, the better the test discriminates. A test without predictive ability produces a curve that follows the diagonal of the grid.
True positive and false positive rate
The true positive rate is calculated as the number of true positives divided by the sum of the number of true positives and the number of false negatives. It describes how good the model is at predicting the positive class when the actual result is positive.
True positive rate = true positives / (true positives + false negatives)
Sensitivity = true positives / (true positives + false negatives)
The false positive rate is calculated as the number of false positives divided by the sum of the number of false positives and the number of true negatives. Also called the false alarm rate, it summarizes how often a positive class is predicted when the actual result is negative.
False positive rate = false positives / (false positives + true negatives)
The false positive rate is also equal to one minus the specificity, where the specificity is the number of true negatives divided by the sum of the number of true negatives and false positives.
Specificity = true negatives / (true negatives + false positives)
False positive rate = 1 - Specificity
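As a quick check of the formulas above, the following sketch computes the three rates from a set of confusion counts. The function name rates and the counts are invented for illustration only.

```python
def rates(tp, fp, tn, fn):
    """Return (true positive rate, false positive rate, specificity)."""
    tpr = tp / (tp + fn)           # sensitivity, or hit rate
    fpr = fp / (fp + tn)           # false alarm rate
    specificity = tn / (tn + fp)
    return tpr, fpr, specificity


# Hypothetical counts: 50 people with the condition, 100 without it.
tpr, fpr, specificity = rates(tp=40, fp=10, tn=90, fn=10)
print(tpr, fpr, specificity)  # prints 0.8 0.1 0.9

# The false positive rate equals 1 - specificity, up to floating-point rounding.
print(abs(fpr - (1 - specificity)) < 1e-12)  # prints True
```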
The ROC curve is a useful tool for several reasons:
a. The curves of different models can be directly compared in general or for different thresholds.
b. The area under the curve (AUC) can be used as a summary of the model's ability.
The shape of the curve contains a lot of information, including what we are often most interested in for a given problem: the expected false positive rate and the expected false negative rate.
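One common way to compute the area under the curve is the trapezoidal rule applied to the (false positive rate, true positive rate) points. The sketch below assumes the points are already available as a list of pairs; the function name auc_trapezoid and the sample points are illustrative, not taken from any particular package.

```python
def auc_trapezoid(points):
    """Area under a ROC curve given as (fpr, tpr) pairs, by the trapezoidal rule."""
    pts = sorted(points)  # order by false positive rate, then true positive rate
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        # Width of the strip times the average height of its two ends.
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical ROC points for a small example model.
example = [(0.0, 0.0), (0.0, 0.5), (0.5, 0.5), (0.5, 1.0), (1.0, 1.0)]
print(auc_trapezoid(example))                                # prints 0.75
print(auc_trapezoid([(0.0, 0.0), (1.0, 1.0)]))               # diagonal: prints 0.5
print(auc_trapezoid([(0.0, 0.0), (0.0, 1.0), (1.0, 1.0)]))   # perfect: prints 1.0
```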
Utility of ROC Curves
ROC analysis is especially useful for evaluating predictive models or other tests that produce output values in a continuous range, as it captures the trade-off between sensitivity and specificity across that range. There are many ways to perform a ROC analysis. ROC curves are used in clinical biochemistry to choose the most appropriate cutoff for a test. The best cutoff combines the highest true positive rate with the lowest false positive rate. Because the area under a ROC curve measures the overall usefulness of a test, with a larger area meaning a more useful test, the areas under the ROC curves are used to compare tests.
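The text does not name a specific rule for picking the cutoff. One widely used option, swapped in here as an assumption, is Youden's J statistic: choose the threshold that maximizes the true positive rate minus the false positive rate. The function name best_cutoff and the data are hypothetical.

```python
def best_cutoff(y_true, scores):
    """Return (threshold, J) for the threshold with the largest
    J = true positive rate - false positive rate (Youden's J)."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    best = None
    for t in sorted(set(scores)):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= t and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= t and y == 0)
        j = tp / positives - fp / negatives
        # Keep the first threshold that reaches the highest J seen so far.
        if best is None or j > best[1]:
            best = (t, j)
    return best


# Hypothetical test results: 1 = condition present, score = test value.
y_true = [0, 0, 0, 1, 0, 1, 1, 1]
scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.9]
print(best_cutoff(y_true, scores))  # prints (0.4, 0.75)
```

In this example the thresholds 0.4 and 0.6 tie on J; the sketch keeps the lower one. In practice the choice between them depends on whether false positives or false negatives are more costly.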
Explanation of the values of the ROC Curves
To make this clear: smaller values on the x-axis of the graph indicate fewer false positives and more true negatives. Larger values on the y-axis of the graph indicate more true positives and fewer false negatives. When we predict the positive class, the prediction is either correct (a true positive) or incorrect (a false positive). There is a tension between these outcomes, and the same holds for true negatives and false negatives. Handling this tension well is what we mean when we say that a model has skill. Skillful models are generally represented by curves that bow toward the top left of the plot.
Classifiers without skill
A no-skill classifier is one that cannot discriminate between the classes and would predict a random class or a constant class in all cases. A model without skill is represented at the point (0.5, 0.5). A model with no skill at each threshold is therefore represented by a diagonal line from the lower left of the graph to the upper right and has an AUC of 0.5. A model with perfect skill is represented at the point (0,1), and its curve is a line that travels from the bottom left of the plot to the top left and then across the top to the top right. An operator can plot the ROC curve for the final model and choose a threshold that provides a desirable balance between false positives and false negatives.
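The AUC values quoted above can be checked with a small sketch. It relies on a standard equivalent view of the AUC: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The function name auc_rank and the data are illustrative assumptions.

```python
def auc_rank(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs in which the positive
    case has the higher score; tied pairs count as half a win."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


y_true = [0, 0, 1, 1]

# A constant prediction cannot separate the classes at all.
print(auc_rank(y_true, [0.5, 0.5, 0.5, 0.5]))  # prints 0.5

# Scores that rank every positive above every negative.
print(auc_rank(y_true, [0.1, 0.2, 0.8, 0.9]))  # prints 1.0
```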
How to make a ROC Curve
To make an ROC curve, you must be familiar with the concepts of true positive, true negative, false positive, and false negative. These concepts are used when the results of a test are compared with clinical truth, which is established through the use of diagnostic procedures that do not involve the test in question.
In a classification problem, we can decide to predict the class values directly. Alternatively, it may be more flexible to predict the probabilities for each class. The reason for this is to provide the ability to choose and even calibrate the threshold of how to interpret the predicted probabilities.
For example, a default might be to use a threshold of 0.5, which means that a probability below 0.5 is a negative result (0) and a probability of 0.5 or above is a positive result (1). This threshold can be adjusted to tune the behavior of the model for a specific problem. An example would be to reduce one type of error more than the other. When making a prediction for a binary or two-class classification problem, there are two types of errors that we could make.
False positive. Predicting an event when there was no event.
False negative. Predicting no event when there was actually an event.
By predicting probabilities and calibrating a threshold, the model operator can choose a balance of these two concerns.
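The effect of moving the threshold can be seen in a minimal sketch that counts both error types at several thresholds. The function name count_errors and the probabilities are made up for the example.

```python
def count_errors(y_true, probs, threshold=0.5):
    """Return (false positives, false negatives) at the given threshold."""
    fp = fn = 0
    for y, p in zip(y_true, probs):
        pred = 1 if p >= threshold else 0
        if pred == 1 and y == 0:
            fp += 1   # predicted an event when there was none
        elif pred == 0 and y == 1:
            fn += 1   # missed an event that actually occurred
    return fp, fn


# Hypothetical outcomes (1 = event) and predicted probabilities.
y_true = [0, 0, 0, 1, 1, 1]
probs = [0.1, 0.35, 0.6, 0.4, 0.7, 0.9]

print(count_errors(y_true, probs, 0.5))   # prints (1, 1)
print(count_errors(y_true, probs, 0.3))   # prints (2, 0)
print(count_errors(y_true, probs, 0.65))  # prints (0, 1)
```

Lowering the threshold removes the false negative at the cost of an extra false positive, and raising it does the reverse.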
A common way to compare models that predict probabilities for two-class problems is to use an ROC curve.
Suppose you have a value of x that is a diagnostic measure and you want to determine a threshold value of x that indicates the following:
- A condition exists if the x value is greater than the threshold.
- There is no condition if the x value is less than the threshold.
Now consider a diagnostic test as the threshold varies, causing more or fewer false positives and false negatives. Ideally, there is a very narrow range of x criterion values that best separates the true negatives from the true positives. The receiver operating characteristic (ROC) curve shows how quickly this transition occurs. The goal is a diagnostic whose ROC curve has the largest possible area under it.