SciStat
 

ROC curve analysis

Description

Use ROC curve analysis to obtain a ROC plot and a complete sensitivity/specificity report. A Receiver Operating Characteristic (ROC) curve is a graph that plots the true positive rate as a function of the false positive rate at different cut-off points.

See ROC curve analysis: theory summary for details.

Required input

  • Variable: identify the variables under study.
  • Classification: select a dichotomous variable indicating diagnosis (0=negative, 1=positive). If the diagnosis is coded with values other than 0 and 1, you can use the IF function to transform the codes into 0 and 1 values, e.g. IF(RESULT="pos",1,0).
  • Filter: (optionally) a filter to include only a subgroup of cases (e.g. AGE>21, SEX="Male").

Methodology

  • DeLong et al.: use the method of DeLong et al. (1988) for the calculation of the Standard Error of the Area Under the Curve (recommended).
  • Hanley & McNeil: use the method of Hanley & McNeil (1982) for the calculation of the Standard Error of the Area Under the Curve.
  • Binomial exact Confidence Interval for the AUC: calculate an exact Binomial Confidence Interval for the Area Under the Curve (recommended). If this option is not selected, the Confidence Interval is calculated as AUC ± 1.96 × its Standard Error.
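As an illustration of the Hanley & McNeil option and the AUC ± 1.96 × SE interval, the following is a minimal Python sketch rather than the program's own code. It uses the standard error formula published by Hanley & McNeil (1982); the function names and the example numbers are hypothetical.

```python
import math

def hanley_mcneil_se(auc, n_pos, n_neg):
    """Standard error of the AUC using the Hanley & McNeil (1982) formula."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    variance = (auc * (1.0 - auc)
                + (n_pos - 1) * (q1 - auc ** 2)
                + (n_neg - 1) * (q2 - auc ** 2)) / (n_pos * n_neg)
    return math.sqrt(variance)

def normal_ci(auc, se, z=1.96):
    """Interval AUC - z*SE to AUC + z*SE (not clipped to the 0..1 range)."""
    return auc - z * se, auc + z * se

# Hypothetical example: AUC of 0.85 with 50 positive and 50 negative cases.
se = hanley_mcneil_se(0.85, n_pos=50, n_neg=50)
low, high = normal_ci(0.85, se)
print(f"SE = {se:.4f}, 95% CI = {low:.3f} to {high:.3f}")
```

Because this interval is symmetric around the AUC, it can extend beyond 1 when the AUC is close to 1.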

Disease prevalence

If the sample sizes in the positive and the negative group do reflect the real prevalence of the disease, this can be indicated in the dialog box. Alternatively you can enter the disease prevalence, expressed as a percentage. Clinically, the disease prevalence is the same as the probability of disease being present before the test is performed. If the disease prevalence is unknown, or irrelevant for the current statistical analysis, you can ignore these fields. In this case the program will not calculate predictive values.

Options

  • List criterion values with test characteristics: option to create a list of criterion values corresponding with the coordinates of the ROC curve, with associated sensitivity, specificity, likelihood ratios and predictive values (if disease prevalence is known).
  • 95% Confidence Interval for sensitivity/specificity, likelihood ratio and predictive values: select the Confidence Intervals you require.
  • Calculate optimal criterion value taking into account costs: option to calculate the optimal criterion value taking into account the disease prevalence and cost of false and true positive and negative decisions (Zweig & Campbell, 1993). This calculation is only possible when disease prevalence is known (see above).
    • FPc: the cost of a false positive decision.
    • FNc: the cost of a false negative decision.
    • TPc: the cost of a true positive decision.
    • TNc: the cost of a true negative decision.

    These data are used to calculate a parameter S as follows:

    S = ((FPc - TNc) / (FNc - TPc)) × ((1 - P) / P)

    where P denotes the prevalence in the target population, expressed as a proportion between 0 and 1 (Greiner et al., 2000). The point on the ROC curve where a line with this slope S touches the curve is the optimal operating point, taking into account prevalence and the costs of the different decisions.

    Costs can be financial costs or health costs, but all 4 cost factors need to be expressed on a common scale. Benefits can be expressed as negative costs. Suppose a false negative (FN) decision is judged to be twice as costly as a false positive (FP) decision, and no assumptions are made about the costs for true positive and true negative decisions. Then for FNc you enter 2, for FPc enter 1 and enter 0 for both TPc and TNc.

    Because the slope S must be a positive number:

    • FPc cannot be equal to TNc
    • FNc cannot be equal to TPc
    • When TNc is larger than FPc then TPc must be larger than FNc
    • When TNc is smaller than FPc then TPc must be smaller than FNc

    The parameter S is "cost-neutral" when (FPc-TNc)/(FNc-TPc) evaluates to 1, that is when FPc-TNc equals FNc-TPc. In this case S, and therefore the "optimal criterion value", depends only on the disease prevalence.

  • Advanced: click this button for Advanced options including bootstrapping confidence intervals for the Youden index and its associated criterion value.
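The cost-based criterion described under the options above can be sketched in Python as follows. This is an illustrative sketch, not the program's implementation. It assumes S = ((FPc - TNc) / (FNc - TPc)) × ((1 - P) / P) with P given as a fraction rather than a percentage, and it finds the point touched by a line of slope S as the ROC point that maximises TPR - S × FPR. The function names and the ROC coordinates are made up for the example.

```python
def cost_slope(prevalence, fp_cost, fn_cost, tp_cost=0.0, tn_cost=0.0):
    """Slope S from the four costs and the prevalence (a fraction, not a %)."""
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must lie strictly between 0 and 1")
    numerator = fp_cost - tn_cost
    denominator = fn_cost - tp_cost
    # S must be positive: both differences non-zero and of the same sign.
    if numerator == 0 or denominator == 0 or (numerator > 0) != (denominator > 0):
        raise ValueError("the costs must give a positive slope S")
    return (numerator / denominator) * ((1.0 - prevalence) / prevalence)

def optimal_point(roc_points, slope):
    """Return the (fpr, tpr) pair that maximises tpr - slope * fpr."""
    return max(roc_points, key=lambda point: point[1] - slope * point[0])

# Example from the text: a false negative is twice as costly as a false
# positive, and true positive and true negative decisions cost 0.
s = cost_slope(prevalence=0.20, fp_cost=1, fn_cost=2)

# Made-up ROC coordinates as (false positive rate, true positive rate).
roc = [(0.0, 0.0), (0.1, 0.6), (0.3, 0.85), (0.6, 0.95), (1.0, 1.0)]
best = optimal_point(roc, s)
print(s, best)
```

With a prevalence of 20% the slope is steeper than 1, so the selected point favours a low false positive rate even though false negatives carry the higher cost.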

Graph

  • Create graph
  • Mark points corresponding to criterion values
  • Include 95% Confidence Bounds (Hilgers, 1991)

Results

The results for ROC curve analysis include:

Sample size

First the program displays the number of observations in the two groups. Concerning sample size, it has been suggested that meaningful qualitative conclusions can be drawn from ROC experiments performed with a total of about 100 observations (Metz, 1978).

Area under the ROC curve, with standard error and 95% Confidence Interval

This value can be interpreted as follows (Zhou, Obuchowski & McClish, 2002):

  • the average value of sensitivity for all possible values of specificity;
  • the average value of specificity for all possible values of sensitivity;
  • the probability that a randomly selected individual from the positive group has a test result indicating greater suspicion than that for a randomly chosen individual from the negative group.

When the variable under study cannot distinguish between the two groups, i.e. when there is no difference between the two distributions, the area will be equal to 0.5 (the ROC curve will coincide with the diagonal). When there is a perfect separation of the values of the two groups, i.e. there is no overlap of the distributions, the area under the ROC curve equals 1 (the ROC curve will reach the upper left corner of the plot).
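The probability interpretation of the area can be checked directly by comparing every positive case with every negative case. The Python sketch below assumes that higher values indicate greater suspicion and counts ties as one half, which is the usual Mann-Whitney convention; the function name is hypothetical.

```python
def auc_by_pairs(positives, negatives):
    """Fraction of (positive, negative) pairs in which the positive case
    has the higher value; ties count as one half."""
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Perfect separation gives an area of 1, identical groups give 0.5.
print(auc_by_pairs([3, 4], [1, 2]))
print(auc_by_pairs([1, 2], [1, 2]))
```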

The 95% Confidence Interval is the interval in which the true (population) Area under the ROC curve lies with 95% confidence.

The Significance level or P-value is the probability of observing a sample Area under the ROC curve at least as far from 0.5 as the one found, when in fact the true (population) Area under the ROC curve is 0.5 (null hypothesis: Area = 0.5). If P is small (P<0.05) then it can be concluded that the Area under the ROC curve is significantly different from 0.5 and that there is therefore evidence that the laboratory test is able to distinguish between the two groups.

Youden index

The Youden index J (Youden, 1950) is defined as:

J = max_c { sensitivity_c + specificity_c - 1 }

where c ranges over all possible criterion values.

Graphically, J is the maximum vertical distance between the ROC curve and the diagonal line.

The criterion value corresponding with the Youden index J is the optimal criterion value only when disease prevalence is 50%, equal weight is given to sensitivity and specificity, and costs of various decisions are ignored.
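A minimal Python sketch of the Youden index calculation is given below. It assumes that a case is classified as positive when its value is greater than or equal to the criterion; the program's own convention may differ. The data and the function name are invented for the example.

```python
def youden_index(values, labels):
    """Return (J, criterion) for the rule 'positive if value >= criterion'.
    labels use 1 for diseased and 0 for non-diseased cases."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best_j, best_criterion = -1.0, None
    for criterion in sorted(set(values)):
        true_pos = sum(1 for v, y in zip(values, labels) if y == 1 and v >= criterion)
        true_neg = sum(1 for v, y in zip(values, labels) if y == 0 and v < criterion)
        j = true_pos / n_pos + true_neg / n_neg - 1.0
        if j > best_j:  # on ties the lowest criterion is kept
            best_j, best_criterion = j, criterion
    return best_j, best_criterion

# Invented data: 3 diseased and 4 non-diseased cases.
values = [1, 2, 3, 4, 5, 6, 7]
labels = [0, 0, 0, 1, 0, 1, 1]
j, criterion = youden_index(values, labels)
print(j, criterion)
```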

When the corresponding Advanced option has been selected, SciStat.com will calculate BCa bootstrapped 95% confidence intervals (Efron 1987; Efron & Tibshirani, 1993) for both the Youden index and its corresponding criterion value.

Optimal criterion*

The optimal criterion value takes into account not only sensitivity and specificity, but also disease prevalence and the costs of the various decisions. When these data are known, SciStat.com will calculate the optimal criterion and the associated sensitivity and specificity. When the corresponding Advanced option has been selected, SciStat.com will also calculate BCa bootstrapped 95% confidence intervals (Efron 1987; Efron & Tibshirani, 1993) for these parameters.

When a test is used either for the purpose of screening or to exclude a diagnostic possibility, a cut-off value with a higher sensitivity may be selected; and when a test is used to confirm a disease, a higher specificity may be required.

*This panel is only displayed when disease prevalence and cost parameters are known.

See also a note on Criterion values.

Summary table*

The summary table displays the estimated specificity for a range of fixed and pre-specified sensitivities of 80, 90, 95 and 97.5% as well as estimated sensitivity for a range of fixed and pre-specified specificities (Zhou et al., 2002), with the corresponding criterion values.

Confidence intervals are BCa bootstrapped 95% confidence intervals (Efron 1987; Efron & Tibshirani, 1993).

*This panel is only displayed when the corresponding Advanced option has been selected.

Criterion values and coordinates of the ROC curve

The next section of the results window lists the different threshold or cut-off values with their corresponding sensitivity and specificity of the test, and the positive (+LR) and negative likelihood ratio (-LR). When the disease prevalence is known, the program will also report the positive predictive value (+PV) and the negative predictive value (-PV).

  • Sensitivity (with optional 95% Confidence Interval): Probability that a test result will be positive when the disease is present (true positive rate).
  • Specificity (with optional 95% Confidence Interval): Probability that a test result will be negative when the disease is not present (true negative rate).
  • Positive likelihood ratio (with optional 95% Confidence Interval): Ratio between the probability of a positive test result given the presence of the disease and the probability of a positive test result given the absence of the disease.
  • Negative likelihood ratio (with optional 95% Confidence Interval): Ratio between the probability of a negative test result given the presence of the disease and the probability of a negative test result given the absence of the disease.
  • Positive predictive value (with optional 95% Confidence Interval): Probability that the disease is present when the test is positive.
  • Negative predictive value (with optional 95% Confidence Interval): Probability that the disease is not present when the test is negative.
  • Cost*: The average cost resulting from the use of the diagnostic test at that decision level. Note that the cost reported here excludes the "overhead cost", i.e. the cost of doing the test, which is constant at all decision levels.

    *This column is only displayed when disease prevalence and cost parameters are known.
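The text does not give the formula behind the Cost column. One plausible reading, shown here as an assumption only, is the expected cost per tested subject: each of the four decision outcomes is weighted by its probability at the given decision level. The function name is hypothetical and prevalence, sensitivity and specificity are given as fractions.

```python
def average_cost(sensitivity, specificity, prevalence,
                 fp_cost, fn_cost, tp_cost=0.0, tn_cost=0.0):
    """Expected cost per subject, excluding the cost of doing the test.
    Assumed formula: each outcome's cost weighted by its probability."""
    diseased = prevalence * (sensitivity * tp_cost
                             + (1.0 - sensitivity) * fn_cost)
    healthy = (1.0 - prevalence) * (specificity * tn_cost
                                    + (1.0 - specificity) * fp_cost)
    return diseased + healthy

# Hypothetical decision level: 80% sensitivity, 90% specificity,
# 20% prevalence, FNc = 2, FPc = 1, TPc = TNc = 0.
cost = average_cost(0.80, 0.90, 0.20, fp_cost=1, fn_cost=2)
print(round(cost, 4))
```

Under this assumed formula, the decision level with the lowest cost is the same one selected by a line of slope S = ((FPc - TNc) / (FNc - TPc)) × ((1 - P) / P) when FNc is larger than TPc.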

Sensitivity, specificity, positive and negative predictive value as well as disease prevalence are expressed as percentages.
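The definitions in the list above translate directly into a short calculation. The Python sketch below works with fractions rather than percentages and derives the predictive values from the prevalence with Bayes' theorem; the function name and the counts are invented for illustration.

```python
def diagnostic_summary(tp, fn, fp, tn, prevalence=None):
    """Test characteristics from the four cell counts at one criterion value.
    Predictive values are only returned when a prevalence (fraction) is given."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    result = {
        "sensitivity": sens,
        "specificity": spec,
        "+LR": sens / (1.0 - spec) if spec < 1.0 else float("inf"),
        "-LR": (1.0 - sens) / spec if spec > 0.0 else float("inf"),
    }
    if prevalence is not None:
        result["+PV"] = (sens * prevalence
                         / (sens * prevalence + (1.0 - spec) * (1.0 - prevalence)))
        result["-PV"] = (spec * (1.0 - prevalence)
                         / (spec * (1.0 - prevalence) + (1.0 - sens) * prevalence))
    return result

# Invented counts: 100 diseased and 100 non-diseased cases, prevalence 10%.
summary = diagnostic_summary(tp=80, fn=20, fp=10, tn=90, prevalence=0.10)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```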

Confidence intervals for sensitivity and specificity are "exact" Clopper-Pearson confidence intervals.
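For readers who want to reproduce such an interval, the following Python sketch computes a Clopper-Pearson interval by solving the two binomial tail equations with bisection. It uses only the standard library and is not the program's own routine; the function names are hypothetical.

```python
import math

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p ** i * (1.0 - p) ** (n - i)
               for i in range(k + 1))

def _bisect_increasing(func, iterations=100):
    """Root on [0, 1] of a function that increases from negative to positive."""
    low, high = 0.0, 1.0
    for _ in range(iterations):
        mid = (low + high) / 2.0
        if func(mid) < 0.0:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0

def clopper_pearson(x, n, confidence=0.95):
    """Exact binomial interval for x successes out of n trials."""
    tail = (1.0 - confidence) / 2.0
    # Lower limit: P(X >= x | p) = tail. Upper limit: P(X <= x | p) = tail.
    lower = 0.0 if x == 0 else _bisect_increasing(
        lambda p: (1.0 - binom_cdf(x - 1, n, p)) - tail)
    upper = 1.0 if x == n else _bisect_increasing(
        lambda p: tail - binom_cdf(x, n, p))
    return lower, upper

# Sensitivity of 80/100: interval for the underlying proportion.
print(clopper_pearson(80, 100))
```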

Confidence intervals for the likelihood ratios are calculated using the "Log method" as given on page 109 of Altman et al. 2000.
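The sketch below shows a log-method interval for a likelihood ratio, treating the ratio as a ratio of two independent proportions. It assumes this commonly used standard error, sqrt(1/a - 1/n1 + 1/b - 1/n2), is the formula the text refers to. The function name and the counts are invented.

```python
import math

def ratio_log_ci(num_events, num_total, den_events, den_total, z=1.96):
    """Ratio of two proportions with a confidence interval built on the
    log scale; returns (ratio, lower, upper)."""
    ratio = (num_events / num_total) / (den_events / den_total)
    se_log = math.sqrt(1.0 / num_events - 1.0 / num_total
                       + 1.0 / den_events - 1.0 / den_total)
    return ratio, ratio * math.exp(-z * se_log), ratio * math.exp(z * se_log)

# Invented counts at one criterion value.
tp, fn, fp, tn = 80, 20, 10, 90

# +LR = sensitivity / (1 - specificity)
pos_lr = ratio_log_ci(tp, tp + fn, fp, fp + tn)
# -LR = (1 - sensitivity) / specificity
neg_lr = ratio_log_ci(fn, tp + fn, tn, fp + tn)
print(pos_lr)
print(neg_lr)
```

The interval is symmetric on the log scale, so the likelihood ratio is the geometric mean of its two limits. The calculation is undefined when one of the event counts is zero.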

Confidence intervals for the predictive values are the standard logit confidence intervals given by Mercaldo et al. 2007.

Graph

In a ROC curve the true positive rate (Sensitivity) is plotted as a function of the false positive rate (100-Specificity) for different cut-off points.

Each point on the ROC plot represents a sensitivity/specificity pair corresponding to a particular decision threshold. A test with perfect discrimination (no overlap in the two distributions) has a ROC plot that passes through the upper left corner (100% sensitivity, 100% specificity). Therefore the closer the ROC plot is to the upper left corner, the higher the overall accuracy of the test (Zweig & Campbell, 1993).
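The coordinates of such a plot can be obtained by sweeping the decision threshold over the observed values. The Python sketch below assumes the rule "positive if value >= criterion" and reports percentages, as the plot does; the data and the function name are invented.

```python
def roc_coordinates(values, labels):
    """List of (criterion, sensitivity %, 100 - specificity %) tuples for the
    rule 'positive if value >= criterion'; labels are 1 (diseased) or 0."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = []
    for criterion in sorted(set(values)):
        true_pos = sum(1 for v, y in zip(values, labels) if y == 1 and v >= criterion)
        false_pos = sum(1 for v, y in zip(values, labels) if y == 0 and v >= criterion)
        points.append((criterion,
                       100.0 * true_pos / n_pos,
                       100.0 * false_pos / n_neg))
    return points

# Invented data: 3 diseased and 4 non-diseased cases.
values = [1, 2, 3, 4, 5, 6, 7]
labels = [0, 0, 0, 1, 0, 1, 1]
points = roc_coordinates(values, labels)
for criterion, sens, one_minus_spec in points:
    print(f">= {criterion}: sensitivity {sens:.1f}, 100-specificity {one_minus_spec:.1f}")
```

Plotting the second value against the third for each tuple, and adding the point (0, 0) for a criterion above the highest observed value, gives the ROC curve.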

Literature

  • DeLong ER, DeLong DM, Clarke-Pearson DL (1988) Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics 44:837-845.
  • Efron B (1987) Better Bootstrap Confidence Intervals. Journal of the American Statistical Association 82:171-185.
  • Efron B, Tibshirani RJ (1993) An introduction to the Bootstrap. Chapman & Hall/CRC.
  • Greiner M, Pfeiffer D, Smith RD (2000) Principles and practical application of the receiver-operating characteristic analysis for diagnostic tests. Preventive Veterinary Medicine 45:23-41.
  • Griner PF, Mayewski RJ, Mushlin AI, Greenland P (1981) Selection and interpretation of diagnostic tests and procedures. Annals of Internal Medicine 94:555-600.
  • Hanley JA, Hajian-Tilaki KO (1997) Sampling variability of nonparametric estimates of the areas under receiver operating characteristic curves: an update. Academic Radiology 4:49-58.
  • Hanley JA, McNeil BJ (1982) The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology 143:29-36.
  • Hilgers RA (1991) Distribution-free confidence bounds for ROC curves. Methods of Information in Medicine 30:96-101.
  • Mercaldo ND, Lau KF, Zhou XH (2007) Confidence intervals for predictive values with an emphasis to case-control studies. Statistics in Medicine 26:2170-2183.
  • Youden WJ (1950) An index for rating diagnostic tests. Cancer 3:32-35.
  • Zhou XH, Obuchowski NA, McClish DK (2002) Statistical methods in diagnostic medicine. Wiley-Interscience.
  • Zweig MH, Campbell G (1993) Receiver-operating characteristic (ROC) plots: a fundamental evaluation tool in clinical medicine. Clinical Chemistry 39:561-577.
