# Calculation of Trimmed Mean, SE and confidence interval

The k-times trimmed mean is calculated as the mean of the sample after the k smallest and k largest observations are deleted from the sample.

If the number of observations to be trimmed is specified as a percentage p, then p is the percentage of observations trimmed from each tail of the sample. For example, if the percentage is 20%, MedCalc trims 20% from the lower tail and 20% from the upper tail.

With the proportion $\gamma = p/100$, the number of observations to trim from each tail, $k$, is given by $$k = \operatorname{trunc}(n \gamma)$$

If $n \gamma$ is not an integer, it is rounded down to the nearest integer. SPSS, R and Excel also round down, whereas SAS rounds up.

With the observations sorted in ascending order, $x_1 \le x_2 \le \dots \le x_n$, the k-times trimmed mean is calculated as $$\bar{x}_{tk} = \frac{1} {n-2k} \sum_{i=k+1}^{n-k}{x_i}$$
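The trimming rule and the trimmed mean formula can be illustrated with a minimal Python sketch. The function names `trim_count` and `trimmed_mean` and the sample data are hypothetical and chosen for illustration only; this is not MedCalc's own implementation.

```python
import math

def trim_count(n, percent):
    """Observations to trim from each tail: k = trunc(n * p / 100)."""
    if not 0 <= percent < 50:
        raise ValueError("percent must be in the range [0, 50)")
    # n * percent is computed first so that whole-number percentages
    # are not affected by floating-point error before rounding down.
    return math.floor(n * percent / 100)

def trimmed_mean(data, percent):
    """Mean of the sorted sample without its k smallest and k largest values."""
    if not data:
        raise ValueError("data must not be empty")
    x = sorted(data)
    n = len(x)
    k = trim_count(n, percent)
    kept = x[k:n - k]  # x_{k+1} .. x_{n-k} in the 1-based notation above
    return sum(kept) / len(kept)

# Hypothetical sample with one outlier (40).
sample = [2, 4, 5, 6, 7, 8, 9, 10, 12, 40]
print(trim_count(len(sample), 20))  # 2
print(trimmed_mean(sample, 20))     # 7.5, the mean of [5, 6, 7, 8, 9, 10]
```

With 20% trimming and n = 10, two observations are dropped from each tail, so the outlier 40 no longer influences the result.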

The Standard Error of the trimmed mean is based on the Winsorized mean and Winsorized sum of squared deviations (Tukey & McLaughlin, 1963). The Winsorized mean is calculated as $$\bar{x}_{wk} = \frac{1}{n} \left( (k+1) x_{k+1} + \sum_{i=k+2}^{n-k-1}{x_i} + (k+1) x_{n-k} \right)$$

and the Winsorized sum of squared deviations is calculated as $$s^{2}_{wk} = (k+1) {(x_{k+1} - \bar{x}_{wk})}^2 + \sum_{i=k+2}^{n-k-1}{({x_i}-\bar{x}_{wk})}^2 + (k+1) {(x_{n-k} - \bar{x}_{wk})}^2$$

The Standard Error of the trimmed mean can then be calculated as: $$\text{SE}(\bar{x}_{tk}) = \frac{s_{wk}}{\sqrt{(n-2k)(n-2k-1)} }$$

The confidence interval for the trimmed mean is defined as $$\bar{x}_{tk} \pm t_{(1- \frac{\alpha}{2}, n-2k-1)} \text{SE}(\bar{x}_{tk})$$
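The Winsorized mean, the Winsorized sum of squared deviations, the standard error and the confidence interval can be sketched in Python as follows. The names `trimmed_mean_se` and `trimmed_mean_ci` are hypothetical, and the critical value `t_crit` is assumed to be supplied by the caller (for example from a t-table) so that the sketch needs only the standard library.

```python
import math

def trimmed_mean_se(data, k):
    """Return (trimmed mean, Winsorized mean, SE of the trimmed mean)."""
    x = sorted(data)
    n = len(x)
    h = n - 2 * k  # observations left after trimming
    if k < 0 or h < 2:
        raise ValueError("need k >= 0 and at least 2 observations after trimming")
    # Winsorize: the k smallest values become x_{k+1}, the k largest become x_{n-k}.
    w = [x[k]] * k + x[k:n - k] + [x[n - k - 1]] * k
    w_mean = sum(w) / n
    ss_w = sum((v - w_mean) ** 2 for v in w)  # Winsorized sum of squared deviations
    t_mean = sum(x[k:n - k]) / h
    se = math.sqrt(ss_w / (h * (h - 1)))
    return t_mean, w_mean, se

def trimmed_mean_ci(data, k, t_crit):
    """Confidence interval; t_crit is the 1 - alpha/2 t quantile with n - 2k - 1 df."""
    t_mean, _, se = trimmed_mean_se(data, k)
    return t_mean - t_crit * se, t_mean + t_crit * se

# Hypothetical sample; k = 2 corresponds to 20% trimming for n = 10.
sample = [2, 4, 5, 6, 7, 8, 9, 10, 12, 40]
t_mean, w_mean, se = trimmed_mean_se(sample, 2)
print(t_mean, w_mean, round(se, 4))
# 2.571 is the tabulated 0.975 quantile of the t-distribution with 5 df.
print(trimmed_mean_ci(sample, 2, 2.571))
```

Passing the critical value in keeps the example free of third-party dependencies; a statistics library could supply the quantile instead.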

## Comparison of 2 independent trimmed means, the Yuen-Welch test

These calculations are based on the method given by Yuen, 1974 (see Wilcox, 2022).

Yuen's test statistic is

$$T = \frac{ \bar{x}_{tk1} - \bar{x}_{tk2} } { \sqrt {d_1 + d_2 } }$$

where $d_1$ and $d_2$ are the squared standard errors of the two trimmed means, $d_j = \text{SE}(\bar{x}_{tkj})^2$, calculated as described above.

The estimated degrees of freedom are:

$$df = \frac {(d_1+d_2)^2} { \frac{d_1^2}{h_1-1}+\frac{d_2^2}{h_2-1} }$$

where $h_j = n_j-2k_j$ is the number of observations left after trimming.

The P-value is taken from the t-distribution with df degrees of freedom.

The $100(1-\alpha)\%$ confidence interval for the difference (95% for $\alpha = 0.05$) is

$$(\bar{x}_{tk1} - \bar{x}_{tk2}) \pm t \sqrt {d_1 + d_2 }$$

where t is the $1- \alpha / 2$ quantile of the t-distribution with df degrees of freedom.
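As an illustration, the test statistic and degrees of freedom of the Yuen-Welch test can be computed with the following Python sketch. It takes the difference between the two trimmed means as the numerator and the squared standard errors of the trimmed means as $d_1$ and $d_2$. The function names `winsorized_ss` and `yuen_welch` and the data are hypothetical; the P-value and confidence interval would additionally require the t-distribution, which is left out here to stay within the standard library.

```python
import math

def winsorized_ss(x_sorted, k):
    """Winsorized sum of squared deviations of an already sorted sample."""
    n = len(x_sorted)
    w = [x_sorted[k]] * k + x_sorted[k:n - k] + [x_sorted[n - k - 1]] * k
    w_mean = sum(w) / n
    return sum((v - w_mean) ** 2 for v in w)

def yuen_welch(sample1, sample2, percent=20):
    """Return (T, df) for two independent samples trimmed by `percent` per tail."""
    stats = []
    for data in (sample1, sample2):
        x = sorted(data)
        n = len(x)
        k = math.floor(n * percent / 100)
        h = n - 2 * k                          # h_j = n_j - 2 k_j
        t_mean = sum(x[k:n - k]) / h           # trimmed mean
        d = winsorized_ss(x, k) / (h * (h - 1))  # squared SE of the trimmed mean
        stats.append((t_mean, d, h))
    (m1, d1, h1), (m2, d2, h2) = stats
    t_stat = (m1 - m2) / math.sqrt(d1 + d2)
    df = (d1 + d2) ** 2 / (d1 ** 2 / (h1 - 1) + d2 ** 2 / (h2 - 1))
    return t_stat, df

# Hypothetical data: group2 is group1 shifted upwards by 3 units.
group1 = [2, 4, 5, 6, 7, 8, 9, 10, 12, 40]
group2 = [v + 3 for v in group1]
t_stat, df = yuen_welch(group1, group2, 20)
print(round(t_stat, 4), round(df, 2))
```

Because both groups have the same spread and the same number of observations left after trimming, the estimated degrees of freedom reduce to $2(h-1) = 10$ in this example.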

## Comparison of the trimmed means of paired samples

These calculations are based on the method given by Wilcox, 2022.

The squared standard error of the difference between the trimmed means $\bar{x}_{tk1} - \bar{x}_{tk2}$ is estimated with: $$\frac {1}{h(h-1)} \left\{ \sum { (x_{1i}-\bar{x}_{wk1})^2 } + \sum { (x_{2i}-\bar{x}_{wk2})^2 } - 2 \sum { (x_{1i}-\bar{x}_{wk1})(x_{2i}-\bar{x}_{wk2}) } \right\}$$

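where $h = n-2k$ is the number of observations left after trimming, and the sums are taken over the Winsorized observations of each sample, with the pairs kept together.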

Letting

$$d_j = \frac {1}{h(h-1)} \sum { (x_{ji}-\bar{x}_{wkj})^2 }$$

and

$$d_{12} = \frac {1}{h(h-1)} \sum { (x_{1i}-\bar{x}_{wk1})(x_{2i}-\bar{x}_{wk2}) }$$

The test statistic T is given by $$T = \frac{ \bar{x}_{tk1} - \bar{x}_{tk2} } { \sqrt {d_1 + d_2 - 2d_{12}} }$$

The P-value is taken from the t-distribution with h−1 degrees of freedom.

The $100(1-\alpha)\%$ confidence interval for the difference (95% for $\alpha = 0.05$) is

$$(\bar{x}_{tk1} - \bar{x}_{tk2}) \pm t \sqrt {d_1 + d_2 - 2d_{12}}$$

where t is the $1- \alpha / 2$ quantile of the t-distribution with h−1 degrees of freedom.
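The paired calculation can be sketched in Python as shown below. The sketch assumes, following Wilcox (2022), that each sample is Winsorized separately while the observations keep their original order so that the pairs stay aligned, and that the numerator is the difference between the two trimmed means. The function names `winsorize_keep_order` and `paired_trimmed_test` and the data are hypothetical.

```python
import math

def winsorize_keep_order(data, k):
    """Clamp values to [x_(k+1), x_(n-k)] without reordering them."""
    x = sorted(data)
    low, high = x[k], x[len(x) - k - 1]
    return [min(max(v, low), high) for v in data]

def paired_trimmed_test(sample1, sample2, percent=20):
    """Return (T, df) for paired samples trimmed by `percent` per tail."""
    n = len(sample1)
    if n != len(sample2):
        raise ValueError("paired samples must have the same length")
    k = math.floor(n * percent / 100)
    h = n - 2 * k
    w1 = winsorize_keep_order(sample1, k)
    w2 = winsorize_keep_order(sample2, k)
    m1, m2 = sum(w1) / n, sum(w2) / n  # Winsorized means
    c = 1 / (h * (h - 1))
    d1 = c * sum((a - m1) ** 2 for a in w1)
    d2 = c * sum((b - m2) ** 2 for b in w2)
    d12 = c * sum((a - m1) * (b - m2) for a, b in zip(w1, w2))
    t1 = sum(sorted(sample1)[k:n - k]) / h  # trimmed means
    t2 = sum(sorted(sample2)[k:n - k]) / h
    t_stat = (t1 - t2) / math.sqrt(d1 + d2 - 2 * d12)
    return t_stat, h - 1

# Hypothetical paired measurements on 10 subjects.
first = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
second = [3, 2, 5, 4, 7, 6, 9, 8, 11, 10]
t_stat, df = paired_trimmed_test(first, second, 20)
print(round(t_stat, 4), df)  # -2.2361 5
```

For these data the trimmed means are 5.5 and 6.5 and $d_1 + d_2 - 2d_{12} = 0.2$, so $T = -1/\sqrt{0.2} = -\sqrt{5}$ with $h - 1 = 5$ degrees of freedom.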

## References

• Tukey JW, McLaughlin DH (1963) Less Vulnerable Confidence and Significance Procedures for Location Based on a Single Sample: Trimming/Winsorization 1. Sankhya A, 25:331–352.
• Wilcox RR (2022) Introduction to robust estimation and hypothesis testing. 5th ed. Elsevier Academic Press.
• Yuen KK (1974) The two-sample trimmed t for unequal population variances. Biometrika 61:165–170.
