# Kaplan-Meier survival analysis

## Description

Performs survival analysis and generates a Kaplan-Meier plot.

In clinical trials the investigator is often interested in the time until participants in a study present a specific event or endpoint. This event usually is a clinical outcome such as death, disappearance of a tumour, etc.

Participants are followed from a defined starting point, and the time until the event of interest occurs is recorded. Usually, the study ends before all participants have presented this event, so the outcome of the remaining patients is unknown. The outcome is also unknown for participants who withdrew from the study. For all these cases the time of follow-up is recorded (censored data).

On SciStat.com, these data can be analysed by means of the Kaplan-Meier method, which is the most common method to describe survival characteristics.

In order to be able to analyse the data, you need to enter the data in the following way:

- in a first column, the survival time has to be recorded;
- in a second column, it must be recorded whether or not the case has reached the endpoint (by entering the code 1) or whether the time is censored, i.e. the outcome is unknown (by entering the code 0);
- in a third column, a code can be entered to assign the case to a particular group (study group - control group).

The order of these columns is not important, and the rows do not have to be sorted in any way.

From these data, SciStat.com can easily calculate and construct the Kaplan-Meier survival curve.
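As an illustration of what is computed from these columns, the following is a minimal sketch of the Kaplan-Meier product-limit estimate in plain Python. The function name `kaplan_meier` and the example data are hypothetical and are not part of SciStat.com; the sketch only shows how survival times and endpoint codes (1 = event, 0 = censored) turn into a survival curve.

```python
from collections import Counter

def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct event time.

    times  -- follow-up time of each case
    events -- 1 if the endpoint was reached, 0 if the time is censored
    """
    event_counts = Counter(t for t, e in zip(times, events) if e == 1)
    leaving = Counter(times)   # cases (events and censored) leaving at each time
    at_risk = len(times)       # cases still under observation
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = event_counts.get(t, 0)
        if d:
            # Multiply by the fraction of at-risk cases that survive time t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        # Cases with an event or censored at t are no longer at risk afterwards.
        at_risk -= leaving[t]
    return curve

# Hypothetical data: six cases, two of them censored (code 0).
times = [6, 7, 10, 15, 19, 25]
events = [1, 0, 1, 1, 0, 1]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"time {t}: survival {s:.4f}")
```

Censored cases never lower the curve themselves; they only reduce the number at risk for later event times, which is why the input rows do not need to be sorted.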

## Required input

### Survival time

The name of the variable containing the time to reach the event of interest, or the time of follow-up.

### Endpoint

The name of a variable containing code 1 for the cases that have reached the endpoint, or code 0 for the cases that have not reached the endpoint, either because they withdrew from the study or because the end of the study was reached.

### Factor

For *Factor* select a qualitative or discrete variable (grouping variable). The factor may contain either character or numeric codes. These codes are used to split the data into subgroups.

SciStat.com allows comparison of survival curves for up to 6 subgroups.

If no *Factor* variable is selected, then SciStat.com will display only one survival curve (all data are considered to belong to one group).

### Filter

A filter to include only a selected subgroup of cases in the graph.

### Options

**Linear trend for factor levels**: Allows testing for a linear trend across levels of the factor. It is appropriate if the factor levels have a natural ordering (for example, factor codes represent doses applied to different groups). The trend test assumes that the factor levels are equally spaced.

**Restricted Mean Survival Time**: Option to calculate the restricted mean survival time (RMST) and to compare the RMST between groups. The restricted mean survival time is the average event-free survival time up to a pre-specified time point. It is equivalent to the area under the Kaplan-Meier curve from the beginning of the study through that time point. The selected time point must lie between the first and last event in every group; if not, the software resets it to the earliest of the last event times among the different groups.

**Graph**:

- Survival probability (%): plots the survival probability (%) against time (descending curves)
- 100 - Survival probability (%): plots 100 - survival probability (%) against time (ascending curves)

**Include 95% CI in graph**: Plots the 95% confidence interval for the survival curves.

**Mark censored data in graph**: Marks censored data in the graph with a small vertical line.

**Number at risk table below graph**: Shows a table below the graph with the number of subjects at risk.

## Results

### Cases summary

This table shows the number of cases that reached the endpoint (Number of events), the number of cases that did not reach the endpoint (Number censored), and the total number of cases.

### Mean and median survival

The mean and median survival time are reported with their 95% confidence interval (CI).

The mean survival time is estimated as the area under the survival curve in the interval 0 to *t*_{max} (Klein & Moeschberger, 2003).

The median survival is the smallest time at which the survival probability drops to 0.5 (50%) or below. If the survival curve does not drop to 0.5 or below, the median time cannot be computed. The median survival time and its 95% CI are calculated according to Brookmeyer & Crowley, 1982.
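The definition of the median can be sketched in a few lines of Python. The function `median_survival` and the example curves are hypothetical; the curve is assumed to be a list of (time, survival probability) pairs sorted by time. The sketch covers the point estimate only, not the Brookmeyer & Crowley confidence interval.

```python
def median_survival(curve):
    """Return the smallest time at which survival is <= 0.5, or None.

    curve -- [(time, survival)] sorted by increasing time
    """
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # the curve never reaches 0.5: median cannot be computed

# Hypothetical curves, expressed as survival proportions (not percentages).
reaches_half = [(6, 0.8333), (10, 0.625), (15, 0.4167), (25, 0.0)]
stays_above = [(6, 0.9), (10, 0.7)]

print(median_survival(reaches_half))
print(median_survival(stays_above))
```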

### Restricted Mean Survival Time

The restricted mean survival time (RMST) is reported with its 95% confidence interval. If groups are defined, a table is displayed with the differences in RMST between groups, the 95% CI of each difference, and the associated P-value (Royston & Parmar, 2013).
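Because the RMST is the area under the Kaplan-Meier step curve up to the chosen time point, the point estimate can be sketched as a sum of rectangles. The function `rmst`, the parameter name `tau`, and the example curve are hypothetical; the sketch assumes the curve starts at survival 1.0 at time 0 and does not compute the confidence interval or the between-group comparison.

```python
def rmst(curve, tau):
    """Area under a survival step curve from time 0 to tau.

    curve -- [(time, survival)] sorted by increasing time; survival is the
             value of the curve from that time onwards
    tau   -- pre-specified time point (restriction time)
    """
    area = 0.0
    prev_t, prev_s = 0.0, 1.0  # the curve starts at 100% survival
    for t, s in curve:
        if t >= tau:
            break
        area += prev_s * (t - prev_t)  # rectangle up to the next drop
        prev_t, prev_s = t, s
    area += prev_s * (tau - prev_t)    # final rectangle up to tau
    return area

# Hypothetical curve with drops at times 6, 10, 15 and 25.
curve = [(6, 5 / 6), (10, 0.625), (15, 5 / 12), (25, 0.0)]
print(round(rmst(curve, 20), 4))
```

In this sketch, choosing `tau` equal to the largest observed time gives the area over the whole observed range, which corresponds to the mean survival time described under "Mean and median survival".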

### Survival table

At each observed timepoint, the survival proportions (with standard error) are listed for all groups, as well as the overall survival proportion.

### Comparison of survival curves (Logrank test)

Next follows the result of the logrank test for the comparison between the survival curves. If the P-value associated with the Chi-squared statistic is small (P<0.05), the conclusion is that the survival curves differ significantly, or that the grouping variable has a significant influence on survival time.
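For two groups, the idea behind the logrank test can be sketched as follows: at every event time, the observed number of events in one group is compared with the number expected if both groups shared the same hazard. This is a minimal sketch of the common hypergeometric-variance form of the test; it is an assumption that it matches the exact variant used by SciStat.com, and the function name and data are hypothetical.

```python
import math

def logrank_two_groups(times, events, groups):
    """Two-group logrank test; returns (chi_squared, p_value) with 1 DF."""
    first = sorted(set(groups))[0]  # reference group for observed/expected
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    observed = expected = variance = 0.0
    for t in event_times:
        rows = list(zip(times, events, groups))
        n = sum(1 for ti, _, _ in rows if ti >= t)                # at risk, all
        n1 = sum(1 for ti, _, g in rows if ti >= t and g == first)
        d = sum(1 for ti, e, _ in rows if ti == t and e == 1)     # events, all
        d1 = sum(1 for ti, e, g in rows
                 if ti == t and e == 1 and g == first)
        observed += d1
        expected += d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    chi2 = (observed - expected) ** 2 / variance
    # Upper tail of the Chi-squared distribution with 1 degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

# Hypothetical data: group "A" has all the early events.
times = [1, 2, 3, 4, 5, 6]
events = [1, 1, 1, 1, 1, 1]
groups = ["A", "A", "A", "B", "B", "B"]
chi2, p = logrank_two_groups(times, events, groups)
print(f"Chi-squared = {chi2:.4f}, P = {p:.4f}")
```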

### Hazard ratios with 95% Confidence Interval

When you have specified a factor then SciStat.com also calculates the hazard ratios with 95% confidence interval (CI). Hazard is a measure of how rapidly the event of interest occurs. The hazard ratio compares the hazards in two groups.

If, for example, the hazard ratio is 5.1462 with a 95% confidence interval of 2.3506 to 11.2663, then the estimated hazard of the event of interest in the group indicated in the table column is 5.1462 times that in the group indicated in the table row. This hazard ratio is significantly different from the value 1 (corresponding to equal hazards) since the confidence interval does not include the value 1.

The hazard ratios and confidence intervals are calculated according to Altman et al., 2000.

Note that the computation of the hazard ratio assumes that the ratio is constant over time. If the survival curves cross, the hazard ratio should therefore be ignored.
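One commonly described way to obtain a hazard ratio from logrank quantities uses the observed (O) and expected (E) numbers of events in each group, with a confidence interval computed on the log scale. The sketch below illustrates that approach; that it reproduces the SciStat.com calculation exactly is an assumption, and the function name and input values are hypothetical.

```python
import math

def hazard_ratio(o1, e1, o2, e2, z=1.96):
    """Hazard ratio of group 1 versus group 2 with an approximate 95% CI.

    o1, o2 -- observed numbers of events in each group
    e1, e2 -- expected numbers of events under equal hazards
    z      -- normal quantile for the CI (1.96 for 95%)
    """
    hr = (o1 / e1) / (o2 / e2)
    se_log_hr = math.sqrt(1 / e1 + 1 / e2)  # assumed standard error of ln(HR)
    lower = math.exp(math.log(hr) - z * se_log_hr)
    upper = math.exp(math.log(hr) + z * se_log_hr)
    return hr, lower, upper

# Hypothetical observed and expected event counts for two groups.
hr, lower, upper = hazard_ratio(o1=3, e1=1.15, o2=3, e2=4.85)
print(f"HR = {hr:.4f}, 95% CI {lower:.4f} to {upper:.4f}")
print("significant" if lower > 1 or upper < 1 else "not significant")
```

The last line mirrors the interpretation above: the ratio differs significantly from 1 only when the confidence interval excludes 1.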

### Logrank test for trend

If more than two survival curves are compared, and there is a natural ordering of the groups, then SciStat.com can also perform the logrank test for trend. This tests whether there is a trend in survival across the ordered groups.

## Literature

- Altman DG (1991) Practical statistics for medical research. London: Chapman and Hall.
- Altman DG, Machin D, Bryant TN, Gardner MJ (Eds) (2000) Statistics with confidence, 2nd ed. BMJ Books.
- Brookmeyer R, Crowley JA (1982) A confidence interval for the median survival time. Biometrics 38:29-41.
- Klein JP, Moeschberger ML (2003) Survival Analysis. Techniques for censored and truncated data, 2nd ed. New York: Springer Publishers.
- Royston P, Parmar MKB (2013) Restricted mean survival time: an alternative to the hazard ratio for the design and analysis of randomized trials with a time-to-event outcome. BMC Medical Research Methodology 13:152.