## Cox proportional-hazards regression
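## Description

Whereas the Kaplan-Meier method with log-rank test is useful for comparing survival curves in two or more groups, Cox regression (or proportional-hazards regression) allows analyzing the effect of several risk factors on survival.

The instantaneous risk of the endpoint (death, or any other event of interest, e.g. recurrence of disease) at time t, given survival up to that time, is called the hazard. The hazard is modeled as:

$$H(t) = H_0(t) \times \exp(b_1 X_1 + b_2 X_2 + \dots + b_k X_k)$$

where $X_1, \dots, X_k$ are the predictor variables and $H_0(t)$ is the baseline hazard at time t, i.e. the hazard for a case with the value 0 for all predictor variables.

By dividing both sides of the above equation by $H_0(t)$ and taking logarithms, we obtain:

$$\ln\left(\frac{H(t)}{H_0(t)}\right) = b_1 X_1 + b_2 X_2 + \dots + b_k X_k$$

We call $H(t) / H_0(t)$ the hazard ratio. The coefficients $b_1, \dots, b_k$ are estimated by Cox regression.

Suppose the covariate (risk factor) is dichotomous, coded 1 if present and 0 if absent. Then $\exp(b_i)$ is the hazard ratio, at any time, for a case with the risk factor present compared with a case with the risk factor absent, all other covariates being equal.

Suppose the covariate is continuous. Then $\exp(b_i)$ is the hazard ratio, at any time, associated with an increase of 1 in the value of the covariate, all other covariates being equal.

## Required input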
The Cox proportional-hazards regression model assumes that the effects of the predictor variables on the hazard are constant over time. Furthermore, there should be a linear relationship between the logarithm of the hazard and the predictor variables. Predictor variables that have a highly skewed distribution may require logarithmic transformation to reduce the effect of extreme values. Logarithmic transformation of a variable […]
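A minimal sketch of what a logarithmic transformation does to a predictor's effect, assuming the standard Cox relationship in which the hazard ratio relative to baseline is exp(b1·X1 + ... + bk·Xk). The helper `hazard_ratio`, the coefficient `b_log` and the predictor values are hypothetical and chosen only for illustration; they are not taken from the example data or from MedCalc.

```python
import math

def hazard_ratio(coefficients, covariates):
    """Hazard ratio relative to baseline: exp(b1*x1 + ... + bk*xk)."""
    return math.exp(sum(b * x for b, x in zip(coefficients, covariates)))

# Hypothetical coefficient for a predictor entered as its natural logarithm.
b_log = 0.40

# Two cases whose raw predictor values differ by a factor of 2.
hr_low = hazard_ratio([b_log], [math.log(10.0)])
hr_high = hazard_ratio([b_log], [math.log(20.0)])

# On the log scale, doubling the raw value multiplies the hazard by 2**b_log,
# no matter how large the raw value already is.
doubling_effect = hr_high / hr_low
print(f"Hazard multiplier per doubling: {doubling_effect:.3f}")
```

Because the effect depends on ratios of the raw values rather than on their differences, a few very large observations carry less weight than they would on the original scale.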
- Method: select the way independent variables are entered into the model.
  - Enter: enter all variables in the model in one single step, without checking
  - Forward: enter significant variables sequentially
  - Backward: first enter all variables into the model and next remove the non-significant variables sequentially
  - Stepwise: enter significant variables sequentially; after entering a variable in the model, check and possibly remove variables that became non-significant.
- Enter variable if P<: a variable is entered into the model if its associated significance level is less than this P-value.
- Remove variable if P>: a variable is removed from the model if its associated significance level is greater than this P-value.
- Categorical: click this button to identify nominal categorical variables.
- Graph:
  - Survival probability (%): plot Survival probability (%) against time (descending curves)
  - 100 - Survival probability (%): plot 100 - Survival probability (%) against time (ascending curves)
**Graph subgroups**: here you can select one of the predictor variables. The graph will display a separate survival curve for each value of this covariate (which must be categorical, and may not contain more than 8 categories). If no covariate is selected here, the graph will display the survival curve at the mean of the covariates in the model.
## Results

In the example (taken from Bland, 2000), "survival time" is the time to recurrence of gallstones following dissolution (variable […])

## Cases summary

This table shows the number of cases that reached the endpoint (Number of events), the number of cases that did not reach the endpoint (Number censored), and the total number of cases.

## Overall Model Fit

The Chi-squared statistic tests the relationship between time and all the covariates in the model.

## Coefficients and Standard Errors

Using the Forward selection method, the two covariates […]

MedCalc lists the regression coefficient b, its standard error, Wald statistic (b/SE)² […]

The coefficient for months for dissolution (continuous variable […])

The coefficient for multiple gallstones (dichotomous variable […])

## Variables not included in the model

The variable […]

## Baseline cumulative hazard function

Finally, the program lists the baseline cumulative hazard $H_0(t)$ […]

The baseline cumulative hazard can be used to calculate the survival probability S(t) for any case at time t:

$$S(t) = \exp\left(-H_0(t) \times \exp(PI)\right)$$

where PI is a prognostic index:

$$PI = b_1 X_1 + b_2 X_2 + \dots + b_k X_k$$

## Graph

The graph displays the survival curves for all categories of the categorical variable […]

If no covariate was selected for Graph - Subgroups, or if the selected variable was not included in the model, then the graph displays a single survival curve at the mean of all covariates in the model.

## Sample size considerations

Based on the work of Peduzzi et al. (1995), the following guideline for a minimum number of cases to include in a study can be suggested.

Let p be the smallest of the proportions of negative or positive cases in the population and k the number of covariates (predictor variables). The minimum number of cases to include is then:

$$N = 10\,k / p$$

For example: you have 3 predictor variables to include in the model and the proportion of positive cases in the population is 0.20 (20%). The minimum number of cases required is

$$N = 10 \times 3 / 0.20 = 150$$

If the resulting number is less than 100, you should increase it to 100, as suggested by Long (1997).

## Literature

- Christensen E (1987) Multivariate survival analysis using Cox's regression model. Hepatology 7:1346-1358.
- Long JS (1997) Regression Models for categorical and limited dependent variables. Thousand Oaks, CA: Sage Publications.
- Peduzzi P, Concato J, Feinstein AR, Holford TR (1995) Importance of events per independent variable in proportional hazards regression analysis. II. Accuracy and precision of regression estimates. Journal of Clinical Epidemiology 48:1503-1510.
- Rosner B (2006) Fundamentals of Biostatistics. Pacific Grove: Duxbury.
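
The sample-size guideline and the survival-probability calculation described in the sections above can be sketched in Python. This is an illustrative sketch rather than MedCalc's own implementation: the function names are hypothetical, and the survival formula assumes the standard Cox relationship S(t) = exp(-H0(t) × exp(PI)) with the prognostic index PI taken as the linear combination b1·X1 + ... + bk·Xk.

```python
import math

def minimum_cases(k, p, floor=100):
    """Guideline N = 10 * k / p, raised to `floor` (100) when smaller."""
    if k < 1 or not 0.0 < p <= 1.0:
        raise ValueError("k must be >= 1 and p must be in (0, 1]")
    # Small tolerance so floating-point noise cannot push an exact
    # result such as 150 up to 151 when rounding up.
    n = math.ceil(10 * k / p - 1e-9)
    return max(n, floor)

def survival_probability(h0_t, coefficients, covariates):
    """S(t) = exp(-H0(t) * exp(PI)), with PI = b1*x1 + ... + bk*xk.

    h0_t is the baseline cumulative hazard at time t.
    """
    prognostic_index = sum(b * x for b, x in zip(coefficients, covariates))
    return math.exp(-h0_t * math.exp(prognostic_index))

print(minimum_cases(3, 0.20))  # the worked example: 10 * 3 / 0.20
print(minimum_cases(2, 0.50))  # 40 by the formula, raised to the floor of 100
```

A case with all covariates equal to 0 has PI = 0, so its survival probability reduces to exp(-H0(t)), the baseline survival.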