MedCalc

Calculation of Trimmed Mean, SE and confidence interval

The k-times trimmed mean is calculated as the mean of the sample after the k smallest and k largest observations are deleted from the sample.

If the number of observations to be trimmed is specified as a percentage p, then p is the percentage of observations to be trimmed at each tail of the sample. For example, if the percentage is 20%, MedCalc trims 20% at the lower tail and 20% at the upper tail.

With the proportion $ \gamma = p/100 $, the number of observations k to trim at each tail is given by $$ k = \operatorname{trunc} ( n \gamma ) $$

If $ n \gamma $ is not an integer, it is rounded down to the nearest integer. Note that SPSS, R and Excel also round down, whereas SAS rounds up.
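As a minimal sketch of the rounding rule above, the number of observations to trim can be computed as follows. The function name `trim_count` and the restriction to percentages below 50 are assumptions made for illustration; they are not taken from MedCalc.

```python
import math


def trim_count(n, percent):
    """Observations to trim at each tail: k = trunc(n * p / 100)."""
    if not 0 <= percent < 50:
        raise ValueError("percent must be in the range [0, 50)")
    # Multiply before dividing so that whole-number percentages stay exact
    # (0.29 * 100 evaluates to 28.999999999999996 in floating point).
    return math.trunc(n * percent / 100)


print(trim_count(10, 20))  # 2
print(trim_count(19, 10))  # 1 (1.9 is rounded down)
```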

With the observations sorted in ascending order, $ x_1 \le x_2 \le \dots \le x_n $, the k-times trimmed mean is calculated as

$$ \bar{x}_{tk} = \frac{1} {n-2k} \sum_{i=k+1}^{n-k}{x_i} $$
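The trimmed mean formula can be sketched in a few lines of Python. The function name `trimmed_mean` and the sample data are hypothetical and serve only to illustrate the calculation.

```python
def trimmed_mean(values, k):
    """Mean of the sample after removing the k smallest and k largest values."""
    x = sorted(values)
    n = len(x)
    if k < 0 or n - 2 * k < 1:
        raise ValueError("k must leave at least one observation")
    kept = x[k:n - k]  # x_(k+1) ... x_(n-k) in the 1-based notation of the text
    return sum(kept) / len(kept)


data = [2, 4, 5, 6, 7, 8, 9, 10, 12, 50]  # hypothetical sample with one outlier
print(sum(data) / len(data))  # 11.3 (ordinary mean)
print(trimmed_mean(data, 2))  # 7.5
```

In this example the single large value pulls the ordinary mean up to 11.3, while the 2-times trimmed mean is unaffected by it.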

The Standard Error of the trimmed mean is based on the Winsorized mean and Winsorized sum of squared deviations (Tukey & McLaughlin, 1963). The Winsorized mean is calculated as

$$ \bar{x}_{wk} = \frac{1}{n} \left( (k+1) x_{k+1} + \sum_{i=k+2}^{n-k-1}{x_i} + (k+1) x_{n-k} \right) $$

and the Winsorized sum of squared deviations is calculated as

$$ s^{2}_{wk} = (k+1) {(x_{k+1} - \bar{x}_{wk})}^2 + \sum_{i=k+2}^{n-k-1}{({x_i}-\bar{x}_{wk})}^2 + (k+1) {(x_{n-k} - \bar{x}_{wk})}^2 $$

The Standard Error of the trimmed mean can then be calculated as:

$$ \text{SE}(\bar{x}_{tk}) = \frac{s_{wk}}{\sqrt{(n-2k)(n-2k-1)} } $$
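The three formulas above can be illustrated with a short Python sketch. It builds the Winsorized sample explicitly (the k smallest values replaced by $x_{k+1}$, the k largest by $x_{n-k}$), which gives the same sums as the weighted form used in the formulas. The function name `winsorized_se` and the sample data are assumptions for illustration only.

```python
import math


def winsorized_se(values, k):
    """Return (Winsorized mean, Winsorized SSD, SE of the k-times trimmed mean)."""
    x = sorted(values)
    n = len(x)
    h = n - 2 * k  # observations left after trimming
    if k < 0 or h < 2:
        raise ValueError("k must leave at least two observations")
    # Winsorized sample: k + 1 copies of x_(k+1) and of x_(n-k) in total
    w = [x[k]] * k + x[k:n - k] + [x[n - k - 1]] * k
    w_mean = sum(w) / n
    ssd = sum((v - w_mean) ** 2 for v in w)
    se = math.sqrt(ssd / (h * (h - 1)))
    return w_mean, ssd, se


data = [2, 4, 5, 6, 7, 8, 9, 10, 12, 50]  # hypothetical sample
w_mean, ssd, se = winsorized_se(data, 2)
print(w_mean, ssd, round(se, 4))  # 7.5 42.5 1.1902
```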

The confidence interval for the trimmed mean is defined as

$$ \bar{x}_{tk} \pm t_{(1- \frac{\alpha}{2}, n-2k-1)} \text{SE}(\bar{x}_{tk}) $$
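A possible sketch of the confidence interval calculation is given below. To keep the example free of external dependencies, the critical value of the t-distribution is passed in as the argument `t_crit`; in practice it could be taken from a t table or from a statistics library. The function name and the sample data are hypothetical.

```python
import math


def trimmed_mean_ci(values, k, t_crit):
    """Confidence interval for the k-times trimmed mean.

    t_crit is the 1 - alpha/2 quantile of the t-distribution with
    n - 2k - 1 degrees of freedom, supplied by the caller.
    """
    x = sorted(values)
    n = len(x)
    h = n - 2 * k
    if k < 0 or h < 2:
        raise ValueError("k must leave at least two observations")
    t_mean = sum(x[k:n - k]) / h
    # Winsorized sample, mean and sum of squared deviations
    w = [x[k]] * k + x[k:n - k] + [x[n - k - 1]] * k
    w_mean = sum(w) / n
    ssd = sum((v - w_mean) ** 2 for v in w)
    se = math.sqrt(ssd / (h * (h - 1)))
    half_width = t_crit * se
    return t_mean - half_width, t_mean + half_width


data = [2, 4, 5, 6, 7, 8, 9, 10, 12, 50]  # hypothetical sample
# 0.975 quantile of the t-distribution with 10 - 2*2 - 1 = 5 degrees of freedom
low, high = trimmed_mean_ci(data, 2, 2.5706)
print(round(low, 2), round(high, 2))  # 4.44 10.56
```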

Comparison of 2 independent trimmed means, the Yuen-Welch test

These calculations are based on the method given by Yuen, 1974 (see Wilcox, 2022).

Yuen's test statistic is

$$ T = \frac{ \bar{x}_{tk1} - \bar{x}_{tk2} } { \sqrt {d_1 + d_2 } } $$

where $ \bar{x}_{tk1} $ and $ \bar{x}_{tk2} $ are the trimmed means of the 2 samples, and $d_1$ and $d_2$ are the squares of the standard errors of these trimmed means: $ d_j = s^{2}_{wkj} / \left( (n_j-2k_j)(n_j-2k_j-1) \right) $, with $ s^{2}_{wkj} $ the Winsorized sum of squared deviations of sample j.

The estimated degrees of freedom is:

$$ df = \frac {(d_1+d_2)^2} { \frac{d_1^2}{h_1-1}+\frac{d_2^2}{h_2-1} } $$

where $ h_j = n_j-2k_j$ is the number of observations left after trimming.

The P-value is taken from the t-distribution with df degrees of freedom.

The 95% confidence interval for the difference is

$$ (\bar{x}_{tk1} - \bar{x}_{tk2}) \pm t \sqrt {d_1 + d_2 } $$

where t is the $ 1- \alpha / 2 $ quantile of the t-distribution with df degrees of freedom.
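The following Python sketch illustrates the Yuen-Welch calculation. It assumes, as in Yuen's method, that the numerator is the difference between the two trimmed means and that each $d_j$ is the squared standard error of the corresponding trimmed mean. The function names and the sample data are hypothetical; the P-value and confidence interval are left out because they require the t-distribution from a statistics library.

```python
import math


def _trimmed_mean_and_d(values, k):
    """Return (trimmed mean, squared SE of the trimmed mean, h)."""
    x = sorted(values)
    n = len(x)
    h = n - 2 * k
    if k < 0 or h < 2:
        raise ValueError("k must leave at least two observations")
    t_mean = sum(x[k:n - k]) / h
    w = [x[k]] * k + x[k:n - k] + [x[n - k - 1]] * k  # Winsorized sample
    w_mean = sum(w) / n
    ssd = sum((v - w_mean) ** 2 for v in w)
    return t_mean, ssd / (h * (h - 1)), h


def yuen_statistic(sample1, sample2, k1, k2):
    """Return (T, df) for the comparison of two independent trimmed means."""
    m1, d1, h1 = _trimmed_mean_and_d(sample1, k1)
    m2, d2, h2 = _trimmed_mean_and_d(sample2, k2)
    t_stat = (m1 - m2) / math.sqrt(d1 + d2)
    df = (d1 + d2) ** 2 / (d1 ** 2 / (h1 - 1) + d2 ** 2 / (h2 - 1))
    return t_stat, df


a = [2, 4, 5, 6, 7, 8, 9, 10, 12, 50]  # hypothetical samples
b = [1, 3, 4, 5, 6, 7, 8, 9, 11, 30]
t_stat, df = yuen_statistic(a, b, 2, 2)
print(round(t_stat, 4), round(df, 2))  # 0.5941 10.0
```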

Comparison of the trimmed means of paired samples

These calculations are based on the method given by Wilcox, 2022.

The squared standard error of the difference between the trimmed means $ \bar{x}_{tk1} - \bar{x}_{tk2} $ is estimated with: $$ \frac {1}{h(h-1)} \left\{ \sum { (x_{1i}-\bar{x}_{wk1})^2 } + \sum { (x_{2i}-\bar{x}_{wk2})^2 } - 2 \sum { (x_{1i}-\bar{x}_{wk1})(x_{2i}-\bar{x}_{wk2}) } \right\} $$

where $ h = n-2k$ is the number of observations left after trimming, $ x_{ji} $ is the i-th observation of sample j after Winsorizing, $ \bar{x}_{wkj} $ is the Winsorized mean of sample j, and the sums are taken over all n pairs.

Letting

$$ d_j = \frac {1}{h(h-1)} \sum { (x_{ji}-\bar{x}_{wkj})^2 } $$

and

$$ d_{12} = \frac {1}{h(h-1)} \sum { (x_{1i}-\bar{x}_{wk1})(x_{2i}-\bar{x}_{wk2}) } $$

the test statistic T is given by $$ T = \frac{ \bar{x}_{tk1} - \bar{x}_{tk2} } { \sqrt {d_1 + d_2 - 2d_{12}} } $$

The P-value is taken from the t-distribution with h−1 degrees of freedom.

The 95% confidence interval for the difference is

$$ (\bar{x}_{tk1} - \bar{x}_{tk2}) \pm t \sqrt {d_1 + d_2 - 2d_{12}} $$

where t is the $ 1- \alpha / 2 $ quantile of the t-distribution with h−1 degrees of freedom.
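The paired calculation can be sketched as follows. The sketch assumes the procedure described by Wilcox: each sample is Winsorized separately while the pairing is kept, the sums in $d_1$, $d_2$ and $d_{12}$ use the Winsorized values and Winsorized means, and the numerator is the difference between the trimmed means. The function name and the sample data are hypothetical.

```python
import math


def paired_trimmed_statistic(x1, x2, k):
    """Return (T, degrees of freedom) for two paired samples."""
    n = len(x1)
    if len(x2) != n:
        raise ValueError("paired samples must have the same length")
    h = n - 2 * k
    if k < 0 or h < 2:
        raise ValueError("k must leave at least two observations")

    def winsorize(values):
        # Clamp to [x_(k+1), x_(n-k)] without reordering, so pairs stay aligned
        s = sorted(values)
        low, high = s[k], s[n - k - 1]
        return [min(max(v, low), high) for v in values]

    def trimmed_mean(values):
        return sum(sorted(values)[k:n - k]) / h

    w1, w2 = winsorize(x1), winsorize(x2)
    m1, m2 = sum(w1) / n, sum(w2) / n  # Winsorized means
    c = h * (h - 1)
    d1 = sum((a - m1) ** 2 for a in w1) / c
    d2 = sum((b - m2) ** 2 for b in w2) / c
    d12 = sum((a - m1) * (b - m2) for a, b in zip(w1, w2)) / c
    t_stat = (trimmed_mean(x1) - trimmed_mean(x2)) / math.sqrt(d1 + d2 - 2 * d12)
    return t_stat, h - 1


before = [3, 6, 7, 8, 22]  # hypothetical paired measurements
after = [2, 3, 5, 9, 7]
t_stat, df = paired_trimmed_statistic(before, after, 1)
print(round(t_stat, 4), df)  # 2.4495 2
```

Note that the denominator is zero when the Winsorized differences are constant across all pairs; the sketch does not handle that case.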

References

  • Tukey JW, McLaughlin DH (1963) Less Vulnerable Confidence and Significance Procedures for Location Based on a Single Sample: Trimming/Winsorization 1. Sankhya A 25:331–352.
  • Wilcox RR (2022) Introduction to robust estimation and hypothesis testing. 5th ed. Elsevier Academic Press.
  • Yuen KK (1974) The two-sample trimmed t for unequal population variances. Biometrika 61:165–170.