MedCalc

ROC curve analysis

Command: Statistics > ROC curves > ROC curve analysis

What is a ROC curve?

In a ROC curve the true positive rate (Sensitivity) is plotted as a function of the false positive rate (100-Specificity) for different cut-off points of a parameter. Each point on the ROC curve represents a sensitivity/specificity pair corresponding to a particular decision threshold. The Area Under the ROC curve (AUC) is a measure of how well a parameter can distinguish between two diagnostic groups (diseased/normal).

MedCalc creates a complete sensitivity/specificity report.

The ROC curve is a fundamental tool for diagnostic test evaluation.

Theory summary

The diagnostic performance of a test, or the accuracy of a test to discriminate diseased cases from normal cases is evaluated using Receiver Operating Characteristic (ROC) curve analysis (Metz, 1978; Zweig & Campbell, 1993). ROC curves can also be used to compare the diagnostic performance of two or more laboratory or diagnostic tests (Griner et al., 1981).

When you consider the results of a particular test in two populations, one population with a disease, the other population without the disease, you will rarely observe a perfect separation between the two groups. Indeed, the distribution of the test results will overlap, as shown in the following figure.

True negative, false negative, true positive and false positive fractions change when changing the criterion value.

For every possible cut-off point or criterion value you select to discriminate between the two populations, there will be some cases with the disease correctly classified as positive (TP = True Positive fraction), but some cases with the disease will be classified negative (FN = False Negative fraction). On the other hand, some cases without the disease will be correctly classified as negative (TN = True Negative fraction), but some cases without the disease will be classified as positive (FP = False Positive fraction).

Schematic outcomes of a test

The different fractions (TP, FP, TN, FN) are represented in the following table.

Test        Disease present       n        Disease absent        n        Total
Positive    True Positive (TP)    a        False Positive (FP)   c        a + c
Negative    False Negative (FN)   b        True Negative (TN)    d        b + d
Total                             a + b                          c + d

The following statistics can be defined:

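Sensitivity $$ Sensitivity = \frac {a} {a + b} $$

Specificity $$ Specificity = \frac {d} {c + d} $$

Positive Likelihood Ratio $$ +LR = \frac {Sensitivity} {1 - Specificity} $$

Negative Likelihood Ratio $$ -LR = \frac {1 - Sensitivity} {Specificity} $$

Positive Predictive Value $$ PPV = \frac {a} {a + c} $$

Negative Predictive Value $$ NPV = \frac {d} {b + d} $$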
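As an illustration, the statistics defined above can be computed directly from the four cell counts of the table. The sketch below is a minimal example; the function name `diagnostic_stats` and the counts in the usage line are hypothetical and not taken from MedCalc.

```python
def diagnostic_stats(a, b, c, d):
    """Statistics from the 2x2 table: a = TP, b = FN, c = FP, d = TN.

    Results are fractions (multiply by 100 for percentages).
    Assumes no denominator is zero (e.g. specificity < 1 for +LR).
    """
    sensitivity = a / (a + b)
    specificity = d / (c + d)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "positive_lr": sensitivity / (1 - specificity),
        "negative_lr": (1 - sensitivity) / specificity,
        # Predictive values computed this way reflect the prevalence
        # in the sample itself, (a + b) / (a + b + c + d).
        "ppv": a / (a + c),
        "npv": d / (b + d),
    }


# Hypothetical counts, for illustration only.
stats = diagnostic_stats(a=80, b=20, c=10, d=90)
```

With these hypothetical counts, sensitivity is 80/100 and specificity is 90/100, so the positive likelihood ratio is approximately 8.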

Sensitivity and specificity versus criterion value

When you select a higher criterion value, the false positive fraction will decrease with increased specificity but on the other hand the true positive fraction and sensitivity will decrease:

Sensitivity and specificity versus criterion value.

When you select a lower threshold value, then the true positive fraction and sensitivity will increase. On the other hand the false positive fraction will also increase, and therefore the true negative fraction and specificity will decrease.

The ROC curve

In a Receiver Operating Characteristic (ROC) curve the true positive rate (Sensitivity) is plotted as a function of the false positive rate (100-Specificity) for different cut-off points. Each point on the ROC curve represents a sensitivity/specificity pair corresponding to a particular decision threshold. A test with perfect discrimination (no overlap in the two distributions) has a ROC curve that passes through the upper left corner (100% sensitivity, 100% specificity). Therefore the closer the ROC curve is to the upper left corner, the higher the overall accuracy of the test (Zweig & Campbell, 1993).

Example of ROC curve
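The construction of the curve can be sketched in a few lines: for each observed value used as criterion, count the true and false positives and record one point. This is a minimal sketch that assumes higher values indicate disease (a case is positive when its value exceeds the criterion) and that reports fractions rather than percentages; the function name `roc_points` and the data are hypothetical.

```python
def roc_points(values, diagnosis):
    """Return (false positive rate, sensitivity) pairs, one per criterion.

    diagnosis: 1 = diseased, 0 = non-diseased.
    Assumption: a case is classified positive when value > criterion.
    """
    n_pos = sum(diagnosis)
    n_neg = len(diagnosis) - n_pos
    # A criterion below every observed value classifies all cases as positive.
    points = [(1.0, 1.0)]
    for criterion in sorted(set(values)):
        tp = sum(1 for v, d in zip(values, diagnosis) if d == 1 and v > criterion)
        fp = sum(1 for v, d in zip(values, diagnosis) if d == 0 and v > criterion)
        points.append((fp / n_neg, tp / n_pos))
    return points


# Hypothetical measurements and diagnoses, for illustration only.
test1 = [1, 2, 3, 4, 5, 6, 7, 8]
diagnosis = [0, 0, 0, 1, 0, 1, 1, 1]
points = roc_points(test1, diagnosis)
```

The list runs from the upper right corner (every case positive) to the lower left corner (no case positive), and plotting the pairs in that order traces the curve.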

How to enter data for ROC curve analysis

In order to perform ROC curve analysis in MedCalc you should have a measurement of interest (= the parameter you want to study) and an independent diagnosis which classifies your study subjects into two distinct groups: a diseased and non-diseased group. The latter diagnosis should be independent from the measurement of interest.

In the spreadsheet, create a column DIAGNOSIS and a column for the variable of interest, e.g. TEST1. For every study subject enter a code for the diagnosis as follows: 1 for the diseased cases, and 0 for the non-diseased or normal cases. In the TEST1 column, enter the measurement of interest (this can be measurements, grades, etc. - if the data are categorical, code them with numerical values).


Required input

Complete the ROC curve analysis dialog box as follows:

How to complete the ROC curve analysis dialog box

Data

Methodology

Disease prevalence

Sensitivity and specificity (and therefore the ROC curve), as well as the positive and negative likelihood ratios, are independent of disease prevalence. Positive and negative predictive values, by contrast, are highly dependent on disease prevalence, or prior probability of disease. Therefore, when disease prevalence is unknown, the program cannot calculate positive and negative predictive values.

Clinically, the disease prevalence is the same as the probability of disease being present before the test is performed (prior probability of disease).

Options

ROC graph

Results

Sample size

First the program displays the number of observations in the two groups. Concerning sample size, it has been suggested that meaningful qualitative conclusions can be drawn from ROC experiments performed with a total of about 100 observations (Metz, 1978).

ROC report - sample size

Area under the ROC curve, with standard error and 95% Confidence Interval

ROC report - Area Under the Curve

This value can be interpreted as follows (Zhou, Obuchowski & McClish, 2002):

When the variable under study cannot distinguish between the two groups, i.e. when there is no difference between the two distributions, the area will be equal to 0.5 (the ROC curve will coincide with the diagonal). When there is a perfect separation of the values of the two groups, i.e. there is no overlap of the distributions, the area under the ROC curve equals 1 (the ROC curve will reach the upper left corner of the plot).
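These two limiting cases can be checked numerically. The sketch below computes the empirical area as the proportion of (diseased, non-diseased) pairs in which the diseased case has the higher value, with ties counted as one half; this equals the area under the empirical ROC curve. It assumes higher values indicate disease, and the function name `auc_pairwise` and the data are hypothetical; the text does not state how MedCalc computes the area internally.

```python
def auc_pairwise(values, diagnosis):
    """Empirical AUC: share of diseased/non-diseased pairs ranked correctly.

    diagnosis: 1 = diseased, 0 = non-diseased. Ties count as 0.5.
    """
    diseased = [v for v, d in zip(values, diagnosis) if d == 1]
    normal = [v for v, d in zip(values, diagnosis) if d == 0]
    score = 0.0
    for x in diseased:
        for y in normal:
            if x > y:
                score += 1.0
            elif x == y:
                score += 0.5
    return score / (len(diseased) * len(normal))


# Identical distributions in both groups: no discrimination.
auc_none = auc_pairwise([1, 2, 1, 2], [1, 1, 0, 0])
# No overlap between the groups: perfect separation.
auc_perfect = auc_pairwise([1, 2, 3, 4], [0, 0, 1, 1])
# Partial overlap (hypothetical data).
auc_partial = auc_pairwise([1, 2, 3, 4, 5, 6, 7, 8], [0, 0, 0, 1, 0, 1, 1, 1])
```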

The 95% Confidence Interval is the interval in which the true (population) Area under the ROC curve lies with 95% confidence.

The Significance level or P-value is the probability of observing a sample Area under the ROC curve at least as far from 0.5 as the one found when, in fact, the true (population) Area under the ROC curve is 0.5 (null hypothesis: Area = 0.5). If P is small (P<0.05) then it can be concluded that the Area under the ROC curve is significantly different from 0.5 and that therefore there is evidence that the laboratory test does have an ability to distinguish between the two groups (Hanley & McNeil, 1982; Zweig & Campbell, 1993).
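One published way to obtain a standard error for the area, and from it a z statistic and P-value, is the approximation of Hanley & McNeil (1982). The sketch below assumes that approximation and a two-sided normal test; the text here does not state which method MedCalc applies, so the result need not match the program's output. The function name and the input values are hypothetical.

```python
from math import erfc, sqrt


def hanley_mcneil_test(auc, n_diseased, n_normal):
    """Standard error, z statistic and two-sided P-value for H0: AUC = 0.5.

    Uses the Hanley & McNeil (1982) approximation for the standard error.
    Not defined for auc == 0 or auc == 1 (the standard error becomes zero).
    """
    q1 = auc / (2 - auc)
    q2 = 2 * auc**2 / (1 + auc)
    variance = (
        auc * (1 - auc)
        + (n_diseased - 1) * (q1 - auc**2)
        + (n_normal - 1) * (q2 - auc**2)
    ) / (n_diseased * n_normal)
    se = sqrt(variance)
    z = (auc - 0.5) / se
    p_value = erfc(abs(z) / sqrt(2))  # two-sided tail area of the normal distribution
    return se, z, p_value


# Hypothetical area and group sizes, for illustration only.
se, z, p = hanley_mcneil_test(auc=0.85, n_diseased=50, n_normal=50)
```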

Youden index

ROC report - Youden index

The Youden index J (Youden, 1950) is defined as:

$$ J = \max_{c}\ \{ sensitivity_c + specificity_c - 1\} $$

where c ranges over all possible criterion values.

Graphically, J is the maximum vertical distance between the ROC curve and the diagonal line.
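The definition can be illustrated by a direct search over the observed values. This minimal sketch assumes higher values indicate disease (positive when value > criterion) and returns the first criterion that attains the maximum; the function name `youden_index` and the data are hypothetical.

```python
def youden_index(values, diagnosis):
    """Return (J, criterion) maximizing sensitivity + specificity - 1.

    diagnosis: 1 = diseased, 0 = non-diseased.
    Assumption: a case is classified positive when value > criterion.
    """
    n_pos = sum(diagnosis)
    n_neg = len(diagnosis) - n_pos
    best_j, best_criterion = float("-inf"), None
    for criterion in sorted(set(values)):
        tp = sum(1 for v, d in zip(values, diagnosis) if d == 1 and v > criterion)
        tn = sum(1 for v, d in zip(values, diagnosis) if d == 0 and v <= criterion)
        j = tp / n_pos + tn / n_neg - 1
        if j > best_j:  # strict comparison keeps the first maximum
            best_j, best_criterion = j, criterion
    return best_j, best_criterion


# Hypothetical data, for illustration only.
values = [1, 2, 3, 4, 5, 6, 7, 8, 9]
diagnosis = [0, 0, 0, 0, 1, 0, 1, 1, 1]
j, criterion = youden_index(values, diagnosis)
```

For these data the maximum is reached at the criterion >4, where sensitivity is 4/4 and specificity is 4/5.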

The criterion value corresponding with the Youden index J is the optimal criterion value only when disease prevalence is 50%, equal weight is given to sensitivity and specificity, and costs of various decisions are ignored.

When the corresponding Advanced option has been selected, MedCalc will calculate BCa bootstrapped 95% confidence intervals (Efron, 1987; Efron & Tibshirani, 1993) for both the Youden index and its corresponding criterion value.

Criterion values

MedCalc does not simply report threshold or criterion values, but it reports the criterion values with a comparison sign, > or <, depending on whether higher values indicate disease, or lower values indicate disease.

See the note on Criterion values.

Optimal criterion

This panel is only displayed when disease prevalence and cost parameters are known.

ROC report - Optimal criterion

The optimal criterion value takes into account not only sensitivity and specificity, but also disease prevalence, and costs of various decisions. When these data are known, MedCalc will calculate the optimal criterion and associated sensitivity and specificity. And when the corresponding Advanced option has been selected, MedCalc will calculate BCa bootstrapped 95% confidence intervals (Efron, 1987; Efron & Tibshirani, 1993) for these parameters.

When a test is used either for the purpose of screening or to exclude a diagnostic possibility, a cut-off value with a higher sensitivity may be selected; and when a test is used to confirm a disease, a higher specificity may be required.

Summary table

This panel is only displayed when the corresponding Advanced option has been selected.

ROC report - summary table

The summary table displays the estimated specificity for fixed and pre-specified sensitivities of 80, 90, 95 and 97.5% as well as estimated sensitivity for fixed and pre-specified specificities (Zhou et al., 2002), with the corresponding criterion values.

Confidence intervals are BCa bootstrapped 95% confidence intervals (Efron, 1987; Efron & Tibshirani, 1993).

Criterion values and coordinates of the ROC curve

ROC report - criterion values and coordinates of the ROC curve

This section of the results window lists the different filters or cut-off values with their corresponding sensitivity and specificity of the test, and the positive (+LR) and negative likelihood ratio (-LR). When the disease prevalence is known, the program will also report the positive predictive value (+PV) and the negative predictive value (-PV).

When you do not select the option Include all observed criterion values, the program only lists the more important points of the ROC curve: for equal sensitivity (resp. specificity) it gives the threshold value (criterion value) with the highest specificity (resp. sensitivity). When you do select the option Include all observed criterion values, the program will list sensitivity and specificity for all possible threshold values.

Sensitivity, specificity, positive and negative predictive value as well as disease prevalence are expressed as percentages.

Confidence intervals for sensitivity and specificity are "exact" Clopper-Pearson confidence intervals.
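As an illustration of what an "exact" Clopper-Pearson interval is, the sketch below finds the two limits by bisection on the binomial distribution, using only the standard library. It is a minimal example under the usual definition of the interval, not MedCalc's implementation; the function names and the counts in the usage line are hypothetical.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def clopper_pearson(k, n, conf=0.95):
    """Exact confidence interval for a proportion k/n (as fractions)."""
    alpha = 1 - conf

    def solve(func, target):
        # func is increasing in p on [0, 1]; bisect for func(p) == target.
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2
            if func(mid) < target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower limit: P(X >= k | p) = alpha / 2.
    lower = 0.0 if k == 0 else solve(lambda p: 1 - binom_cdf(k - 1, n, p), alpha / 2)
    # Upper limit: P(X <= k | p) = alpha / 2, i.e. P(X > k | p) = 1 - alpha / 2.
    upper = 1.0 if k == n else solve(lambda p: 1 - binom_cdf(k, n, p), 1 - alpha / 2)
    return lower, upper


# Hypothetical example: 45 of 50 diseased cases test positive (sensitivity 90%).
lower, upper = clopper_pearson(45, 50)
```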

Confidence intervals for the likelihood ratios are calculated using the "Log method" as given on page 109 of Altman et al. 2000.
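The log method is usually stated as follows: the interval is computed on the logarithmic scale, using a standard error built from the four cell counts (a = TP, b = FN, c = FP, d = TN), and then transformed back. The sketch below assumes that common formulation and a normal quantile of 1.96 for a 95% interval; it has not been checked against the cited page, and the function name and counts are hypothetical.

```python
from math import exp, sqrt


def likelihood_ratio_ci(a, b, c, d, z=1.96):
    """Likelihood ratios with log-method confidence limits.

    a = TP, b = FN, c = FP, d = TN. Requires all four counts to be > 0.
    Returns {name: (estimate, lower, upper)}.
    """
    plr = (a / (a + b)) / (c / (c + d))  # sensitivity / (1 - specificity)
    nlr = (b / (a + b)) / (d / (c + d))  # (1 - sensitivity) / specificity
    se_log_plr = sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    se_log_nlr = sqrt(1 / b - 1 / (a + b) + 1 / d - 1 / (c + d))
    return {
        "positive_lr": (plr, plr * exp(-z * se_log_plr), plr * exp(z * se_log_plr)),
        "negative_lr": (nlr, nlr * exp(-z * se_log_nlr), nlr * exp(z * se_log_nlr)),
    }


# Hypothetical counts, for illustration only.
result = likelihood_ratio_ci(a=80, b=20, c=10, d=90)
```

Because the limits are symmetric on the log scale, the estimate is the geometric mean of the lower and upper limit.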

Confidence intervals for the predictive values are the standard logit confidence intervals given by Mercaldo et al. 2007.

ROC curve

The ROC curve will be displayed in a second window when you have selected the corresponding option in the dialog box.

ROC curve option.

In a ROC curve the true positive rate (Sensitivity) is plotted as a function of the false positive rate (100-Specificity) for different cut-off points. Each point on the ROC curve represents a sensitivity/specificity pair corresponding to a particular decision threshold. A test with perfect discrimination (no overlap in the two distributions) has a ROC curve that passes through the upper left corner (100% sensitivity, 100% specificity). Therefore the closer the ROC curve is to the upper left corner, the higher the overall accuracy of the test (Zweig & Campbell, 1993).

ROC curve.

When you click on a specific point of the ROC curve, the corresponding cut-off point with sensitivity and specificity will be displayed.

ROC curve with infobox.

This is the ROC curve with the option Include 95% Confidence Bounds:

ROC curve with confidence interval.

Presentation of results

The prevalence of a disease may be different in different clinical settings. For instance the pre-test probability for a positive test will be higher when a patient consults a specialist than when he consults a general practitioner. Since positive and negative predictive values are sensitive to the prevalence of the disease, it would be misleading to compare these values from different studies where the prevalence of the disease differs, or apply them in different settings.

The data from the results window can be summarized in a table. The sample size in the two groups should be clearly stated. The table can contain a column for the different criterion values, the corresponding sensitivity (with 95% CI), specificity (with 95% CI), and possibly the positive and negative predictive value. The table should not only contain the test's characteristics for one single cut-off value, but preferably there should be a row for the values corresponding with a sensitivity of 90%, 95% and 97.5%, specificity of 90%, 95% and 97.5%, and the value corresponding with the Youden index or highest accuracy.

With these data, any reader can calculate the negative and positive predictive value applicable in his own clinical setting when he knows the prior probability of disease (pre-test probability or prevalence of disease) in this setting, by the following formulas based on Bayes' theorem:

Positive predictive value $$ PPV = \frac {sensitivity \times prevalence } {sensitivity \times prevalence + (1-specificity)\times (1-prevalence) } $$

and

Negative predictive value $$ NPV = \frac {specificity \times (1-prevalence) }{ (1-sensitivity) \times prevalence + specificity \times (1-prevalence) } $$
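The effect of prevalence can be seen by applying the two formulas to the same test in two settings. In this minimal sketch all quantities are fractions rather than percentages; the function name `predictive_values` and the input values are hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Positive and negative predictive value from Bayes' theorem (fractions)."""
    ppv = (sensitivity * prevalence) / (
        sensitivity * prevalence + (1 - specificity) * (1 - prevalence)
    )
    npv = (specificity * (1 - prevalence)) / (
        (1 - sensitivity) * prevalence + specificity * (1 - prevalence)
    )
    return ppv, npv


# Same hypothetical test (90% sensitivity, 90% specificity) in two settings.
ppv_high, npv_high = predictive_values(0.9, 0.9, prevalence=0.5)
ppv_low, npv_low = predictive_values(0.9, 0.9, prevalence=0.1)
```

With a prevalence of 50% the positive predictive value is about 0.90, but with a prevalence of 10% it drops to about 0.50, while the negative predictive value rises to about 0.99.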

The negative and positive likelihood ratio must be handled with care because they are easily and commonly misinterpreted.

Literature


Recommended book

Statistical Methods in Diagnostic Medicine
Xiao-Hua Zhou, Nancy A. Obuchowski, Donna K. McClish


Statistical Methods in Diagnostic Medicine provides a comprehensive approach to the topic, guiding readers through the necessary practices for understanding these studies and generalizing the results to patient populations.

Following a basic introduction to measuring test accuracy and study design, the authors successfully define various measures of diagnostic accuracy, describe strategies for designing diagnostic accuracy studies, and present key statistical methods for estimating and comparing test accuracy.