Correlation coefficient

Command: Statistics > Correlation > Correlation coefficient

Description

Correlation analysis is used to determine whether the values of two variables are associated. The two variables should be random samples, and should have a Normal distribution (possibly after transformation).

Required input

[Figure: dialog box for the Pearson correlation coefficient]

This box is completed in much the same way as the box for summary statistics, except that two variables must be selected. To select a variable from the variables list, click the button and select the variable in the list that is displayed. Next, move the cursor to the Variable X field and click the button again to select the second variable in the list.

Finally, you can select a logarithmic transformation for one or both variable(s) to obtain Normal distributions. See Logarithmic transformation.

After you click OK you obtain the requested statistics in the results window:

Results

Correlation

Variable Y	WEIGHT
Variable X	LENGTH

Sample size	100
Correlation coefficient r	0.4459
Significance level	P<0.0001
95% Confidence interval for r	0.2734 to 0.5906

Sample size: the number of data pairs n

Pearson's correlation coefficient r with P-value. The Pearson correlation coefficient is a number between -1 and 1. In general, the correlation expresses the degree to which, on average, two variables change together.

If one variable increases when the second one increases, then there is a positive correlation. In this case the correlation coefficient will be closer to 1. For instance the height and age of children are positively correlated.

If one variable decreases when the other variable increases, then there is a negative correlation and the correlation coefficient will be closer to -1.
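As an illustration of how the sign of r follows the direction of the relationship, here is a minimal Python sketch of the usual product-moment formula. The function name pearson_r and the sample data are hypothetical and are not taken from the program or from the results shown above.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data, for illustration only.
length = [150, 160, 165, 172, 180]
rising = [52, 58, 63, 70, 77]    # increases with length: r close to 1
falling = [77, 70, 63, 58, 52]   # decreases as length increases: r close to -1
print(round(pearson_r(length, rising), 3))
print(round(pearson_r(length, falling), 3))
```

The first call prints a value close to 1 and the second a value close to -1, matching the two cases described above.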

The P-value is the probability of finding a correlation at least as strong as the one observed if the true correlation coefficient were in fact zero (null hypothesis). If this probability is lower than the conventional 5% (P<0.05), the correlation coefficient is called statistically significant.
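This page does not state how the P-value is calculated. A common textbook approach, assumed here, converts r to a t statistic with n - 2 degrees of freedom. The sketch below uses only the Python standard library and integrates the t density numerically; the function name correlation_p_value and the steps parameter are hypothetical.

```python
import math

def correlation_p_value(r, n, steps=2000):
    """Return (t, two-sided P) for the null hypothesis of zero correlation.

    Assumes the usual t-test for a correlation coefficient:
    t = |r| * sqrt((n - 2) / (1 - r^2)) with n - 2 degrees of freedom.
    """
    if n < 3 or abs(r) >= 1:
        raise ValueError("need n >= 3 and -1 < r < 1")
    df = n - 2
    t = abs(r) * math.sqrt(df / (1.0 - r * r))

    # Log of the normalising constant of Student's t density.
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))

    def pdf(u):
        return math.exp(log_norm - (df + 1) / 2 * math.log1p(u * u / df))

    # Simpson's rule (steps must be even) for the area between 0 and t.
    h = t / steps
    area = pdf(0.0) + pdf(t)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * pdf(i * h)
    area *= h / 3
    # Two-sided P-value: probability mass in both tails beyond |t|.
    return t, max(0.0, 1.0 - 2.0 * area)

# Values from the results shown above: r = 0.4459, n = 100.
t, p = correlation_p_value(0.4459, 100)
print(f"t = {t:.2f}, P below 0.0001: {p < 0.0001}")
```

Where SciPy is available, scipy.stats.pearsonr returns r and a P-value directly from the raw data.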

It is, however, important not to confuse correlation with causation. When two variables are correlated, there may or may not be a causative connection, and this connection may moreover be indirect. Correlation can only be interpreted in terms of causation if the variables under investigation provide a logical (biological) basis for such interpretation.

95% confidence interval (CI) for the Pearson correlation coefficient: this is the range of values that contains the 'true' correlation coefficient with 95% confidence.
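The method used for the confidence interval is not described on this page. One widely used approach, assumed here, is Fisher's z transformation: transform r with the inverse hyperbolic tangent, build a Normal interval with standard error 1/sqrt(n - 3), and transform back. The function name correlation_ci_95 is hypothetical.

```python
import math

def correlation_ci_95(r, n):
    """Approximate 95% CI for a Pearson r using Fisher's z transformation."""
    if n < 4 or abs(r) >= 1:
        raise ValueError("need n >= 4 and -1 < r < 1")
    z = math.atanh(r)               # Fisher z transform of r
    se = 1.0 / math.sqrt(n - 3)     # standard error on the z scale
    z_crit = 1.959964               # 97.5th percentile of the standard Normal
    # Back-transform the interval limits to the correlation scale.
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

# Values from the results shown above: r = 0.4459, n = 100.
low, high = correlation_ci_95(0.4459, 100)
print(f"95% CI {low:.4f} to {high:.4f}")
```

For r = 0.4459 and n = 100 this gives an interval of approximately 0.27 to 0.59, in line with the results shown above. Note that the interval is not symmetric around r.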

Presentation of results

Report the number of data pairs (sample size) and the correlation coefficient (two decimal places), together with the P-value and the 95% confidence interval. For example: in a sample of 100 data pairs, the correlation coefficient was 0.45 (P<0.0001, 95% CI 0.27 to 0.59).
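For reports generated by a script, the rounding described above can be applied with a small helper. This is only a sketch: the function name format_correlation, the output layout, and the four-decimal display of larger P-values are assumptions, and the exact P-value passed in the example is a placeholder, since the results above only state P<0.0001.

```python
def format_correlation(n, r, p, ci_low, ci_high):
    """Format a correlation result with r and the CI at two decimal places."""
    # Very small P-values are reported as an inequality rather than as 0.0000.
    p_text = "P<0.0001" if p < 0.0001 else f"P={p:.4f}"
    return (f"n = {n}, r = {r:.2f} "
            f"({p_text}, 95% CI {ci_low:.2f} to {ci_high:.2f})")

# r, n and the CI come from the results above; p is a placeholder value.
print(format_correlation(100, 0.4459, 0.000003, 0.2734, 0.5906))
```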

The relationship between two variables can easily be represented graphically by a scatter diagram.

Literature

Armitage P, Berry G, Matthews JNS (2002) Statistical methods in medical research. 4th ed. Blackwell Science.
Bland M (2000) An introduction to medical statistics. 3rd ed. Oxford: Oxford University Press.
Altman DG (1991) Practical statistics for medical research. London: Chapman and Hall.

External links

Pearson product-moment correlation coefficient on Wikipedia.
