The Chi-squared test can be used for the following:
- To test the hypothesis that for one classification table (e.g. gender), all classification levels have the same frequency.
- To test the relationship between two classification factors (e.g. gender and profession).
How to enter data
In the following example we have two categorical variables. For the variable OUTCOME a code 1 is entered for a positive outcome and a code 0 for a negative outcome. For the variable SMOKING a code 1 is used for the subjects that smoke, and a code 0 for the subjects that do not smoke. The data of each case is entered on one row of the spreadsheet.
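As an illustration of this one-case-per-row layout, the following minimal Python sketch tallies such data into a 2x2 frequency table. The six cases are made up for the example and are not the data set used in this text; the names `cases`, `counts` and `table` are illustrative assumptions.

```python
from collections import Counter

# Hypothetical cases, one per row: (OUTCOME, SMOKING), both coded 0 or 1.
cases = [(1, 1), (0, 0), (1, 0), (0, 0), (1, 1), (0, 1)]

# Count how often each (OUTCOME, SMOKING) combination occurs.
counts = Counter(cases)

# Arrange the counts as a table: rows are OUTCOME 0/1, columns are SMOKING 0/1.
table = [[counts[(outcome, smoking)] for smoking in (0, 1)]
         for outcome in (0, 1)]

for outcome, row in zip((0, 1), table):
    print(f"OUTCOME={outcome}: {row}")
```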
In the Chi-squared test dialog box, one or two discrete variables with the classification data must be identified. Classification data may be either numeric or alphanumeric (string) values. If required, you can convert a continuous variable into a discrete variable using the IF function.
After you have completed the dialog box, click OK to obtain the frequency table with the relevant statistics.
When you select the option Show all percentages in the results window, each cell of the table also shows its frequency as a percentage of the row total, of the column total and of the grand total.
In this example the number 42 in the upper left cell (for both Coded X and Coded Y equal to 0) is 67.7% of the row total of 62 cases, 75% of the column total of 56 cases and 42% of the grand total of 100 cases.
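The three percentages quoted for the upper left cell can be checked with a short Python sketch. Only the value 42 and the totals 62, 56 and 100 come from the text; the other three cells (20, 14 and 24) are derived from those totals, and the helper name `percentages` is an assumption made for this example.

```python
# 2x2 frequency table. Only the 42 and the totals 62, 56 and 100 are quoted
# in the text; the remaining cells follow from those totals.
table = [[42, 20],
         [14, 24]]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
grand_total = sum(row_totals)

def percentages(i, j):
    """Return (row %, column %, grand-total %) for cell (i, j)."""
    n = table[i][j]
    return (100 * n / row_totals[i],
            100 * n / col_totals[j],
            100 * n / grand_total)

row_pct, col_pct, total_pct = percentages(0, 0)
print(f"{row_pct:.1f}% of row, {col_pct:.1f}% of column, {total_pct:.1f}% of total")
```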
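The Chi-squared statistic is the sum, over all cells of the table, of the squared difference between the observed and the expected frequency divided by the expected frequency:
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
where O is the observed frequency and E the expected frequency of a cell.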
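The calculation can be sketched in Python as follows. The sketch assumes the usual expected frequency for a two-way table (row total times column total divided by the grand total), which the text does not state explicitly, and it uses the 2x2 table implied by the totals quoted in the percentages example. The function name `chi_squared` is illustrative.

```python
def chi_squared(table):
    """Sum of (observed - expected)^2 / expected over all cells of a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Assumed expected frequency under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    return statistic

table = [[42, 20],
         [14, 24]]
chi2 = chi_squared(table)
print(f"Chi-squared = {chi2:.2f}")
```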
Single classification factor
When you want to test the hypothesis that for a single classification factor (e.g. gender) all classification levels have the same frequency, identify only one discrete variable in the dialog form. In this case the null hypothesis is that all classification levels have the same frequency. If the calculated P-value is low (P<0.05), you reject the null hypothesis and accept the alternative hypothesis that the frequencies of the classification levels differ significantly.
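A minimal Python sketch of this single-factor test is given below. The observed counts are hypothetical, the degrees of freedom are taken as the number of levels minus one (the standard choice, not stated in the text), and `chi2_sf` is a helper written for this example that evaluates the upper tail of the Chi-squared distribution for integer degrees of freedom.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of the Chi-squared distribution, integer df >= 1."""
    t = x / 2.0
    # Start from the closed forms for df = 2 (even) or df = 1 (odd) ...
    if df % 2 == 0:
        a, q = 1.0, math.exp(-t)
    else:
        a, q = 0.5, math.erfc(math.sqrt(t))
    # ... and step up with the incomplete-gamma recurrence until a = df / 2.
    while a < df / 2.0:
        q += t ** a * math.exp(-t) / math.gamma(a + 1.0)
        a += 1.0
    return q

# Hypothetical counts for three classification levels.
observed = [30, 50, 40]

# Under the null hypothesis every level has the same expected frequency.
expected = sum(observed) / len(observed)
chi2 = sum((o - expected) ** 2 / expected for o in observed)
df = len(observed) - 1
p_value = chi2_sf(chi2, df)

print(f"Chi-squared = {chi2:.2f}, DF = {df}, P = {p_value:.4f}")
```

With these made-up counts the P-value is above 0.05, so the null hypothesis of equal frequencies would not be rejected.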
In a single classification table the mode of the observations is the most common observation or category (the observation with the highest frequency). A unimodal distribution has one mode; a bimodal distribution, two modes.
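The definition of the mode can be illustrated with a short Python sketch. The observations are hypothetical, and treating categories that tie for the highest frequency as separate modes is an assumption of this example.

```python
from collections import Counter

# Hypothetical observations of one classification factor.
observations = ["A", "B", "B", "C", "C", "A", "B", "C"]

counts = Counter(observations)
highest = max(counts.values())

# Every category that reaches the highest frequency counts as a mode.
modes = sorted(level for level, n in counts.items() if n == highest)

print(f"Frequencies: {dict(counts)}")
print(f"Mode(s): {modes}")
```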
Two classification factors
When you want to study the relationship between two classification factors (e.g. gender and profession), then identify the two discrete variables in the dialog form. In this case the null hypothesis is that the two factors are independent. If the calculated P-value is low (P<0.05), then the null hypothesis is rejected and you accept the alternative hypothesis that there is a relation between the two factors.
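The test of independence can be sketched in Python for a hypothetical 2x3 table (for example gender by three professions). The counts are invented, the degrees of freedom are taken as (rows - 1) x (columns - 1), and the P-value uses the closed form exp(-x/2), which is valid only for 2 degrees of freedom.

```python
import math

# Hypothetical frequencies: 2 rows (e.g. gender) by 3 columns (e.g. profession).
table = [[20, 30, 50],
         [30, 30, 40]]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
n = sum(row_totals)

chi2 = 0.0
for i, row in enumerate(table):
    for j, observed in enumerate(row):
        # Expected frequency if the two factors are independent.
        expected = row_totals[i] * col_totals[j] / n
        chi2 += (observed - expected) ** 2 / expected

df = (len(table) - 1) * (len(table[0]) - 1)

# Closed-form upper tail of the Chi-squared distribution, valid for df == 2 only.
assert df == 2
p_value = math.exp(-chi2 / 2.0)

print(f"Chi-squared = {chi2:.3f}, DF = {df}, P = {p_value:.3f}")
```

For other table sizes the P-value has to come from the Chi-squared distribution with the corresponding degrees of freedom.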
Chi-squared test for trend
If the table has two columns and three or more rows (or two rows and three or more columns), and the categories can be quantified, MedCalc will also perform the Chi-squared test for trend. The Cochran-Armitage test for trend (Armitage, 1955) tests whether there is a linear trend between row (or column) number and the fraction of subjects in the left column (or top row). When the categories are ordered and such a trend exists, the Cochran-Armitage test is more powerful than the unordered test of independence described above.
If there is no meaningful order in the row (or column) categories, then you should ignore this calculation.
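A sketch of a Chi-squared test for trend is given below, using a common textbook form of the statistic with 1 degree of freedom. Whether MedCalc computes it in exactly this way is an assumption, as are the counts, the scores 1, 2, 3 assigned to the rows, and the function name `trend_chi_squared`.

```python
import math

def trend_chi_squared(positives, totals, scores):
    """Chi-squared for linear trend in proportions (1 degree of freedom).

    positives[i]: subjects of row i that fall in the left column
    totals[i]:    all subjects in row i
    scores[i]:    numeric score assigned to row i
    """
    n = sum(totals)
    p = sum(positives) / n  # overall proportion in the left column
    sum_rx = sum(r * x for r, x in zip(positives, scores))
    sum_nx = sum(t * x for t, x in zip(totals, scores))
    sum_nx2 = sum(t * x * x for t, x in zip(totals, scores))
    numerator = (sum_rx - p * sum_nx) ** 2
    denominator = p * (1 - p) * (sum_nx2 - sum_nx ** 2 / n)
    return numerator / denominator

# Hypothetical 3x2 table: 10/50, 20/50 and 30/50 subjects in the left column.
chi2_trend = trend_chi_squared([10, 20, 30], [50, 50, 50], [1, 2, 3])

# Upper tail of the Chi-squared distribution with 1 degree of freedom.
p_value = math.erfc(math.sqrt(chi2_trend / 2.0))

print(f"Chi-squared (trend) = {chi2_trend:.2f}, P = {p_value:.5f}")
```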
Analysis of 2x2 table
- For a 2x2 table, MedCalc uses the "N-1" Chi-squared test as recommended by Campbell (2007) and Richardson (2011). In the "N-1" Chi-squared test, the Chi-squared statistic as given above is multiplied by a factor (N-1)/N. The use of Yates' continuity correction is no longer recommended.
- When the two classification factors are not independent, or when you want to test the difference between proportions in related or paired observations (e.g. in studies in which patients serve as their own control), you must use the McNemar test.
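The "N-1" adjustment for a 2x2 table can be sketched in Python as follows. The sketch uses the standard shortcut formula for the Chi-squared statistic of a 2x2 table with cells a, b, c, d, applied to the table implied by the totals quoted in the percentages example; the function name `n_minus_1_chi_squared` is an assumption, and the code is an illustration rather than MedCalc's own implementation.

```python
import math

def n_minus_1_chi_squared(a, b, c, d):
    """'N-1' Chi-squared for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    # Shortcut form of the Chi-squared statistic for a 2x2 table.
    pearson = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Multiply by (N-1)/N as described in the text.
    return pearson * (n - 1) / n

chi2_n1 = n_minus_1_chi_squared(42, 20, 14, 24)

# A 2x2 table has 1 degree of freedom; upper tail of the Chi-squared distribution.
p_value = math.erfc(math.sqrt(chi2_n1 / 2.0))

print(f"N-1 Chi-squared = {chi2_n1:.2f}, P = {p_value:.4f}")
```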
References
- Altman DG (1991) Practical statistics for medical research. London: Chapman and Hall.
- Armitage P (1955) Tests for linear trends in proportions and frequencies. Biometrics 11:375-386.
- Campbell I (2007) Chi-squared and Fisher-Irwin tests of two-by-two tables with small sample recommendations. Statistics in Medicine 26:3661-3675.
- Richardson JTE (2011) The analysis of 2 x 2 contingency tables - Yet again. Statistics in Medicine 30:890.