# Calculation of Trimmed Mean, SE and confidence interval

The k-times trimmed mean is calculated as the mean of the sample after the k smallest and k largest observations are deleted from the sample.

If the number of observations to be trimmed is specified as a percentage p, then p is the percentage of observations trimmed at each tail of the sample. For example, if the percentage is 20%, MedCalc trims 20% at the lower tail and 20% at the upper tail.

With the proportion $ \gamma = p/100 $, the number of observations to trim at each tail, k, is given by $$ k = \operatorname{trunc} ( n \gamma ) $$

If $ n \gamma $ is not an integer, it is truncated to the largest integer that does not exceed it (rounded down). Note that SPSS, R and Excel also round down, whereas SAS rounds up.
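As a small illustration of the rounding rule, the following Python sketch computes k from the sample size and the trimming percentage. The function name `trim_count` is hypothetical and is not part of MedCalc.

```python
import math


def trim_count(n, percent):
    """Return k, the number of observations trimmed at each tail.

    Hypothetical helper: k = trunc(n * gamma) with gamma = percent / 100.
    """
    # Multiply before dividing so that whole-number results such as
    # 15 * 20 / 100 are computed exactly.
    return math.trunc(n * percent / 100)


# With 20% trimming, a sample of 12 gives n * gamma = 2.4, truncated to k = 2.
for n in (10, 12, 19):
    print(n, trim_count(n, 20))
```

For a sample of 19 observations and 20% trimming, $ n \gamma = 3.8 $, so 3 observations are removed from each tail and 13 remain.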

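The k-times trimmed mean is calculated as

$$ \bar{x}_{tk} = \frac{1}{n-2k} \sum_{i=k+1}^{n-k} x_{(i)} $$

where $ x_{(1)} \le x_{(2)} \le \dots \le x_{(n)} $ are the observations sorted in ascending order.

The Standard Error of the trimmed mean is based on the Winsorized mean and Winsorized sum of squared deviations (Tukey & McLaughlin, 1963). The Winsorized mean is calculated as

$$ \bar{x}_{wk} = \frac{1}{n} \left\{ (k+1) x_{(k+1)} + \sum_{i=k+2}^{n-k-1} x_{(i)} + (k+1) x_{(n-k)} \right\} $$

and the Winsorized sum of squared deviations is calculated as

$$ s_{wk}^2 = (k+1) (x_{(k+1)} - \bar{x}_{wk})^2 + \sum_{i=k+2}^{n-k-1} (x_{(i)} - \bar{x}_{wk})^2 + (k+1) (x_{(n-k)} - \bar{x}_{wk})^2 $$

The Standard Error of the trimmed mean can then be calculated as

$$ SE(\bar{x}_{tk}) = \frac{ s_{wk} } { \sqrt{ (n-2k)(n-2k-1) } } $$

The confidence interval for the trimmed mean is defined as

$$ \bar{x}_{tk} \pm t \; SE(\bar{x}_{tk}) $$

where t is the $ 1- \alpha / 2 $ quantile of the t-distribution with $ n-2k-1 $ degrees of freedom.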
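The single-sample calculation can be sketched in Python as follows. This is a minimal illustration that assumes the usual Tukey-McLaughlin definitions: the Winsorized sample replaces each of the k trimmed values at either tail by the nearest retained value, and the standard error is the square root of the Winsorized sum of squared deviations divided by (n−2k)(n−2k−1). The function names are hypothetical. The t quantile is passed in by the caller (for example from `scipy.stats.t.ppf`) so that the sketch needs only the standard library.

```python
import math


def trimmed_mean_stats(values, k):
    """Return (trimmed mean, Winsorized mean, Winsorized SS, SE)."""
    x = sorted(values)
    n = len(x)
    h = n - 2 * k  # observations left after trimming
    if k < 0 or h < 2:
        raise ValueError("need at least 2 observations after trimming")

    kept = x[k:n - k]
    trimmed_mean = sum(kept) / h

    # Winsorize: each trimmed value is replaced by the nearest kept value.
    winsorized = [kept[0]] * k + kept + [kept[-1]] * k
    wins_mean = sum(winsorized) / n
    wins_ss = sum((w - wins_mean) ** 2 for w in winsorized)

    se = math.sqrt(wins_ss / (h * (h - 1)))
    return trimmed_mean, wins_mean, wins_ss, se


def trimmed_mean_ci(trimmed_mean, se, t_crit):
    """Confidence interval, given the t quantile with h - 1 degrees of freedom."""
    return trimmed_mean - t_crit * se, trimmed_mean + t_crit * se


data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
mean_t, mean_w, ss_w, se = trimmed_mean_stats(data, k=2)
print(mean_t, mean_w, ss_w, se)
```

In this example the outlier 100 is trimmed, so the trimmed mean (5.5) is far below the ordinary mean (14.5).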

## Comparison of 2 independent trimmed means, the Yuen-Welch test

These calculations are based on the method given by Yuen, 1974 (see Wilcox, 2022).

Yuen's test statistic is

$$ T = \frac{ \bar{x}_{tk1} - \bar{x}_{tk2} } { \sqrt {d_1 + d_2 } } $$

where $ \bar{x}_{tk1} $ and $ \bar{x}_{tk2} $ are the trimmed means of the 2 samples, and $d_1$ and $d_2$ are the squares of the standard errors of these trimmed means.

The estimated degrees of freedom are

$$ df = \frac {(d_1+d_2)^2} { \frac{d_1^2}{h_1-1}+\frac{d_2^2}{h_2-1} } $$

where $ h_j = n_j-2k_j $ is the number of observations left in sample j after trimming.

The P-value is taken from the t-distribution with df degrees of freedom.

The 95% confidence interval for the difference is

$$ (\bar{x}_{tk1} - \bar{x}_{tk2}) \pm t \sqrt {d_1 + d_2 } $$

where t is the $ 1- \alpha / 2 $ quantile (with $ \alpha = 0.05 $) of the t-distribution with df degrees of freedom.
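The Yuen-Welch calculation can be sketched in Python as below. This is an illustrative sketch, not MedCalc's implementation. It assumes that each $d_j$ is the Winsorized sum of squared deviations of sample j divided by $h_j(h_j-1)$, and that the numerator of T is the difference between the two trimmed means. The names `trim_summary` and `yuen_welch` are hypothetical. The P-value and the t quantile are left to a statistics library.

```python
import math


def trim_summary(values, k):
    """Return (trimmed mean, squared SE d, h) for one sample."""
    x = sorted(values)
    n = len(x)
    h = n - 2 * k  # observations left after trimming
    if k < 0 or h < 2:
        raise ValueError("need at least 2 observations after trimming")
    kept = x[k:n - k]
    # Winsorized sample: trimmed values replaced by the nearest kept value.
    winsorized = [kept[0]] * k + kept + [kept[-1]] * k
    wins_mean = sum(winsorized) / n
    wins_ss = sum((w - wins_mean) ** 2 for w in winsorized)
    return sum(kept) / h, wins_ss / (h * (h - 1)), h


def yuen_welch(sample1, sample2, k1, k2):
    """Return (T, df, difference of trimmed means, SE of the difference)."""
    m1, d1, h1 = trim_summary(sample1, k1)
    m2, d2, h2 = trim_summary(sample2, k2)
    se_diff = math.sqrt(d1 + d2)
    t_stat = (m1 - m2) / se_diff
    df = (d1 + d2) ** 2 / (d1 ** 2 / (h1 - 1) + d2 ** 2 / (h2 - 1))
    return t_stat, df, m1 - m2, se_diff


a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
b = [11, 12, 13, 14, 15, 16, 17, 18, 19, 110]
t_stat, df, diff, se_diff = yuen_welch(a, b, k1=2, k2=2)
print(t_stat, df, diff, se_diff)
```

Given a t quantile `t_crit` for `df` degrees of freedom, the confidence interval is `diff - t_crit * se_diff` to `diff + t_crit * se_diff`.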

## Comparison of the trimmed means of paired samples

These calculations are based on the method given by Wilcox, 2022.

The squared standard error of the difference between the trimmed means $ \bar{x}_{tk1} - \bar{x}_{tk2} $ is estimated with

$$ \frac {1}{h(h-1)} \left\{ \sum { (x_{1i}-\bar{x}_{wk1})^2 } + \sum { (x_{2i}-\bar{x}_{wk2})^2 } - 2 \sum { (x_{1i}-\bar{x}_{wk1}) (x_{2i}-\bar{x}_{wk2}) } \right\} $$

where $ h = n-2k $ is the number of observations left after trimming. In these sums, $ x_{ji} $ denotes the i-th observation of sample j after Winsorization, $ \bar{x}_{wkj} $ is the corresponding Winsorized mean, and the sums run over all n pairs.

Letting

$$ d_j = \frac {1}{h(h-1)} \sum { (x_{ji}-\bar{x}_{wkj})^2 } $$

and

$$ d_{12} = \frac {1}{h(h-1)} \sum { (x_{1i}-\bar{x}_{wk1}) (x_{2i}-\bar{x}_{wk2}) } $$

the test statistic T is given by

$$ T = \frac{ \bar{x}_{tk1} - \bar{x}_{tk2} } { \sqrt {d_1 + d_2 - 2d_{12}} } $$

where $ \bar{x}_{tk1} $ and $ \bar{x}_{tk2} $ are the trimmed means of the 2 samples.

The P-value is taken from the t-distribution with $ h-1 $ degrees of freedom.

The 95% confidence interval for the difference is

$$ (\bar{x}_{tk1} - \bar{x}_{tk2}) \pm t \sqrt {d_1 + d_2 - 2d_{12}} $$

where t is the $ 1- \alpha / 2 $ quantile (with $ \alpha = 0.05 $) of the t-distribution with $ h-1 $ degrees of freedom.
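A minimal Python sketch of the paired calculation follows. It assumes, following the approach described by Wilcox, that each sample is Winsorized separately while the pairing is preserved, that the sums of squares and cross-products are taken over the Winsorized values, and that the numerator of T is the difference between the two trimmed means. The function names are hypothetical.

```python
import math


def winsorize(values, k):
    """Clamp the k smallest and k largest values, keeping the original order."""
    x = sorted(values)
    low, high = x[k], x[len(x) - k - 1]
    return [min(max(v, low), high) for v in values]


def trimmed_mean(values, k):
    x = sorted(values)
    kept = x[k:len(x) - k]
    return sum(kept) / len(kept)


def paired_trimmed_test(x1, x2, k):
    """Return (T, degrees of freedom, difference of trimmed means, SE)."""
    if len(x1) != len(x2):
        raise ValueError("paired samples must have equal length")
    n = len(x1)
    h = n - 2 * k  # observations left after trimming
    if k < 0 or h < 2:
        raise ValueError("need at least 2 observations after trimming")

    w1, w2 = winsorize(x1, k), winsorize(x2, k)
    m1, m2 = sum(w1) / n, sum(w2) / n  # Winsorized means
    scale = h * (h - 1)
    d1 = sum((a - m1) ** 2 for a in w1) / scale
    d2 = sum((b - m2) ** 2 for b in w2) / scale
    d12 = sum((a - m1) * (b - m2) for a, b in zip(w1, w2)) / scale

    diff = trimmed_mean(x1, k) - trimmed_mean(x2, k)
    se_diff = math.sqrt(d1 + d2 - 2 * d12)
    return diff / se_diff, h - 1, diff, se_diff


before = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
after = [3, 2, 5, 4, 7, 6, 9, 8, 11, 50]
t_stat, df, diff, se_diff = paired_trimmed_test(before, after, k=2)
print(t_stat, df, diff, se_diff)
```

Clamping each value between the (k+1)-th smallest and (k+1)-th largest observation Winsorizes a sample without sorting it, which keeps the two members of each pair aligned for the cross-product term.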

## References

- Tukey JW, McLaughlin DH (1963) Less Vulnerable Confidence and Significance Procedures for Location Based on a Single Sample: Trimming/Winsorization 1. Sankhya A, 25:331–352.
- Wilcox RR (2022) Introduction to robust estimation and hypothesis testing. 5th ed. Elsevier Academic Press.
- Yuen KK (1974) The two-sample trimmed t for unequal population variances. Biometrika 61:165–170.
