# Independent samples t-test

Command: Statistics > T-tests > Independent samples t-test

## Description

The independent samples (or two-sample) t-test is used to compare the means of two independent samples.

## Required input

Select the variables for sample 1 and sample 2. Differences will be calculated as Sample2−Sample1.

Caveat: if the two variables are the same, then the two filters must define distinct groups so that the same case is not included in the two samples.

### Options

- Logarithmic transformation: if the data require a logarithmic transformation (e.g. when the data are positively skewed), select the Logarithmic transformation option.
- Confidence interval: select the required confidence interval for the difference between the means. A 95% confidence interval is the usual selection; select a 90% confidence interval for equivalence testing.
- Correction for unequal variances: select either the t-test (assuming equal variances) or the t-test corrected for unequal variances (Welch test, Armitage et al., 2002). With the option "Automatic", the software selects the appropriate test based on the F-test (comparison of variances).
- Residuals: optionally, select a Test for Normal distribution of the residuals. In the independent samples t-test, residuals are the differences between the observations and their group or sample mean.

## Results

The results window for the Independent samples t-test displays the summary statistics of the two samples, followed by the statistical tests.

First, an F-test is performed. If the P-value is low (P<0.05), the variances of the two samples cannot be assumed to be equal, and the t-test with a correction for unequal variances (Welch test) should be considered (see above).
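As a rough illustration of the comparison of variances, the following Python sketch computes a variance ratio F and its degrees of freedom. It assumes the common convention of placing the larger variance in the numerator, which may differ from what the program does internally. The function name `variance_ratio` and the example data are hypothetical, and the P-value is left out because it requires the F distribution, which the Python standard library does not provide.

```python
from statistics import variance


def variance_ratio(sample1, sample2):
    """Return (F, df_numerator, df_denominator) for two samples.

    Assumption: the larger sample variance goes in the numerator.
    """
    v1, v2 = variance(sample1), variance(sample2)
    if min(v1, v2) == 0:
        raise ValueError("a sample with zero variance gives an undefined ratio")
    if v1 >= v2:
        return v1 / v2, len(sample1) - 1, len(sample2) - 1
    return v2 / v1, len(sample2) - 1, len(sample1) - 1


# Hypothetical example data
sample1 = [1.0, 2.0, 3.0, 4.0, 5.0]
sample2 = [2.0, 4.0, 6.0, 8.0, 10.0]
f_value, df_num, df_den = variance_ratio(sample1, sample2)
print(f"F={f_value:.2f}, DF={df_num} and {df_den}")
```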

The independent samples t-test is used to test the hypothesis that the difference between the means of two samples is equal to 0 (this hypothesis is therefore called the null hypothesis). The program displays the difference between the two means, and the confidence interval (CI) of this difference. Next follow the test statistic t, the Degrees of Freedom (DF) and the two-tailed probability P. When the P-value is less than the conventional 0.05, the null hypothesis is rejected and the conclusion is that the two means do indeed differ significantly.
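The quantities described above can be sketched in Python as follows. This is a minimal illustration using the textbook formulas for the equal-variance t-test and the Welch test with Welch-Satterthwaite degrees of freedom; it is not the program's own implementation. The function names and data are hypothetical. The difference is taken as Sample2−Sample1, as stated under Required input. P-values and confidence intervals are omitted because they require the t distribution, which is not part of the Python standard library.

```python
import math
from statistics import mean, variance


def pooled_t(sample1, sample2):
    """Equal-variance t-test: return (difference, t, DF)."""
    n1, n2 = len(sample1), len(sample2)
    diff = mean(sample2) - mean(sample1)
    df = n1 + n2 - 2
    # Pooled variance: sample variances weighted by their degrees of freedom
    pooled_var = ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return diff, diff / se, df


def welch_t(sample1, sample2):
    """Welch test (unequal variances): return (difference, t, DF)."""
    n1, n2 = len(sample1), len(sample2)
    diff = mean(sample2) - mean(sample1)
    q1, q2 = variance(sample1) / n1, variance(sample2) / n2
    se = math.sqrt(q1 + q2)
    # Welch-Satterthwaite approximation; DF is generally not an integer
    df = (q1 + q2) ** 2 / (q1 ** 2 / (n1 - 1) + q2 ** 2 / (n2 - 1))
    return diff, diff / se, df


# Hypothetical example data
sample1 = [1.0, 2.0, 3.0, 4.0, 5.0]
sample2 = [2.0, 4.0, 6.0, 8.0, 10.0]
pooled_diff, pooled_stat, pooled_df = pooled_t(sample1, sample2)
welch_diff, welch_stat, welch_df = welch_t(sample1, sample2)
print(pooled_diff, pooled_stat, pooled_df)
print(welch_diff, welch_stat, welch_df)
```

With equal sample sizes, as in this example, the two tests give the same t statistic and differ only in the degrees of freedom.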

### Logarithmic transformation

If you selected the Logarithmic transformation option, the program performs the calculations on the logarithms of the observations, but reports the back-transformed summary statistics.

For the t-test, the difference and its confidence interval are given, and the test is performed on the log-transformed scale.

Next, the results of the t-test are transformed back and the interpretation is as follows: the back-transformed difference of the means of the logs is the ratio of the geometric means of the two samples (see Bland, 2000).
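The identity behind this interpretation can be checked with a short Python sketch. It assumes natural logarithms (the ratio is the same for any base, provided the back-transformation uses the same base), and the data are hypothetical.

```python
import math
from statistics import geometric_mean, mean

# Hypothetical positive-valued example data
sample1 = [1.0, 2.0, 4.0]
sample2 = [2.0, 8.0, 32.0]

# Difference of the means of the logs (Sample2 - Sample1)
log_diff = mean([math.log(x) for x in sample2]) - mean([math.log(x) for x in sample1])

# Back-transforming the difference gives the ratio of the geometric means
back_transformed = math.exp(log_diff)
ratio = geometric_mean(sample2) / geometric_mean(sample1)
print(back_transformed, ratio)
```

The same back-transformation applied to the limits of the confidence interval yields a confidence interval for the ratio of the geometric means.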

### Normal distribution of residuals

For the independent samples t-test, it is assumed that the residuals (the differences between the observations and their group or sample mean) follow a Normal distribution. This assumption can be evaluated with a formal test, or by means of graphical methods.

The different formal Tests for Normal distribution may not have enough power to detect deviation from the Normal distribution when sample size is small. On the other hand, when sample size is large, the requirement of a Normal distribution is less stringent because of the central limit theorem.

Therefore, it is often preferred to visually evaluate the symmetry and peakedness of the distribution of the residuals using the Histogram, Box-and-whisker plot, or Normal plot.

To do so, you click the hyperlink "Save residuals" in the results window. This will save the residual values as a new variable in the spreadsheet. You can then use this new variable in the different distribution plots.
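For readers who want to reproduce the residuals outside the program, the definition above amounts to the following Python sketch. The function name `residuals` and the data are hypothetical.

```python
from statistics import mean


def residuals(sample1, sample2):
    """Return each observation minus the mean of its own sample."""
    m1, m2 = mean(sample1), mean(sample2)
    return [x - m1 for x in sample1] + [x - m2 for x in sample2]


# Hypothetical example data
sample1 = [4.0, 5.0, 9.0]
sample2 = [10.0, 14.0]
res = residuals(sample1, sample2)
print(res)
```

By construction, the residuals within each sample sum to zero, so only their spread and shape carry information about the Normality assumption.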

## One-sided or two-sided tests

In MedCalc, P-values are always *two-sided* (as recommended by Fleiss, 1981, and Altman, 1991) and not *one-sided*.

A *two-sided* (or two-tailed) P-value is appropriate when the difference between the two means can occur in both directions: it may be either negative or positive, the mean of one sample may either be smaller or larger than that of the other sample.

A *one-sided* test should only be performed when, before the start of the study, it has already been established that a difference can only occur in one direction. This is the case, for example, when the mean of sample *A* must be larger than the mean of sample *B* for reasons other than those connected with the sample(s).

## Interpretation of P-values

P-values should not be interpreted too strictly. Although a significance level of 5% is generally accepted as a cut-off point for a significant versus a non-significant result, it would be a mistake to interpret a shift in P-value from, for example, 0.045 to 0.055 as a change from significance to non-significance. Therefore, the actual P-values are preferably reported (P=0.045 or P=0.055, instead of P<0.05 or P>0.05), so that readers can make their own interpretation.

With regard to the interpretation of P-values as significant versus non-significant, it has been recommended to select a smaller significance level, for example 0.01, *when it is necessary to be quite certain that a difference exists before accepting it*. When a study is designed to *uncover a difference*, or when *a life-saving* drug is being studied, we should be willing to accept that there is a difference even when the P-value is as large as 0.10 or even 0.20 (Lentner, 1982). The same source states that *"The tendency in medical and biological investigations is to use too small a significance probability"*.

## Confidence intervals

Whereas the P-value may give information on the statistical significance of the result, the 95% confidence interval gives information to assess the clinical importance of the result.

When the number of cases included in the study is large, a biologically unimportant difference can be statistically highly significant. *A statistically significant result does not necessarily indicate a real biological difference*.

On the other hand, a high P-value can lead to the conclusion of a statistically non-significant difference although the difference is clinically meaningful and relevant, especially when the number of cases is small. *A non-significant result does not mean that there is no real biological difference*.

Confidence intervals are therefore helpful in interpretation of a difference, whether or not it is statistically significant (Altman et al., 1983).

## Presentation of results

It is recommended to report the results of the t-test (and other tests) not by a simple statement such as P<0.05, but by giving full statistical information, as in the following example by Gardner & Altman (1986):

*The difference between the sample mean systolic blood pressure in diabetics and non-diabetics was 6.0 mm Hg, with a 95% confidence interval from 1.1 to 10.9 mm Hg; the t test statistic was 2.4, with 198 degrees of freedom and an associated P value of P=0.02.*

In short:

*Mean 6.0 mm Hg, 95% CI 1.1 to 10.9; t=2.4, df=198, P=0.02*
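When results are reported programmatically, the short form above can be produced by a small helper such as the following Python sketch. The function name `format_t_test`, its parameters, and the number of decimals are assumptions chosen to reproduce the example; they are not part of the program.

```python
def format_t_test(diff, ci_low, ci_high, t, df, p, unit="", level=95):
    """Format t-test results in the short form shown above.

    Assumption: one decimal for the estimate, the CI limits and t,
    two decimals for P.
    """
    unit_part = f" {unit}" if unit else ""
    return (
        f"Mean {diff:.1f}{unit_part}, {level}% CI {ci_low:.1f} to {ci_high:.1f}; "
        f"t={t:.1f}, df={df}, P={p:.2f}"
    )


# Values taken from the Gardner & Altman (1986) example above
summary = format_t_test(6.0, 1.1, 10.9, 2.4, 198, 0.02, unit="mm Hg")
print(summary)
```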

## Literature

- Altman DG, Gore SM, Gardner MJ, Pocock SJ (1983) Statistical guidelines for contributors to medical journals. British Medical Journal, 286, 1489-1493.
- Armitage P, Berry G, Matthews JNS (2002) Statistical methods in medical research. 4th ed. Blackwell Science.
- Bland M (2000) An introduction to medical statistics, 3rd ed. Oxford: Oxford University Press.
- Fleiss JL (1981) Statistical methods for rates and proportions, 2nd ed. New York: John Wiley & Sons.
- Gardner MJ, Altman DG (1986) Confidence intervals rather than P values: estimation rather than hypothesis testing. British Medical Journal, 292, 746-750.
- Lentner C (ed) (1982) Geigy Scientific Tables, 8th edition, Volume 2. Basle: Ciba-Geigy Limited.