Case-Control matching
Command: Tools > Case-Control matching
Description
The case-control matching procedure is used to randomly match cases and controls based on specific criteria. MedCalc can match on up to 4 different variables.
In the example we will use the following data:
The treated cases are coded 1 and the controls are coded 0. For each treated case, MedCalc will try to find a control with matching age and gender.
Required input
- Classification variable: select or enter a dichotomous variable indicating group membership (0=control, 1=case). If your data are coded differently, you can use the Define status tool to recode your data.
- Variable with case identification: select a variable that contains a unique identification code for each subject in the spreadsheet. If you do not select a variable here (not recommended), MedCalc will use row numbers as case identification.
- Match on: select up to 4 variables and, for each variable, enter the maximum allowable difference (caliper). Smaller calipers will result in reduced bias and closer matches, but may also result in a smaller number of matches. Select the option "Exact match" to match on a variable that is not numerical (for example, a variable "Gender" that is coded 'Male' and 'Female').
- Filter: (optional) enter a filter to include only a selected subgroup of cases (e.g. SEX="Male").
- Advanced: click to enter the number of iterations and the random number seed.
For the example data, we complete the dialog box as follows:
Results
The results are displayed in a dialog box.
The program gives the total number of subjects, number of cases, number of controls and the number of matched cases, i.e. the number of cases for which a matching control has been found.
Next, for each matching variable, the difference between the matched subjects is summarized: mean difference, SD, 95% CI of the difference and associated P-value (paired samples t-test). The 95% confidence intervals should be narrow and the differences negligible. P-values should be non-significant. If for one or more variables the confidence interval is wide or the P-value is significant, the "maximum allowable difference" entered in the input dialog box (see above) was probably too large.
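As a rough illustration of the comparison described above, the following Python sketch computes the mean difference, SD and paired t statistic for one matching variable. It is not MedCalc's code: the function name `paired_summary` and the age values are hypothetical. The P-value and 95% CI are left out because they require the Student t distribution, which the Python standard library does not provide.

```python
import math
import statistics


def paired_summary(case_values, control_values):
    """Summarize case-minus-control differences for matched pairs.

    Both lists must be in the same order: element i of each list
    belongs to the same matched pair.
    """
    diffs = [t - c for t, c in zip(case_values, control_values)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd = statistics.stdev(diffs)      # sample SD of the differences
    se = sd / math.sqrt(n)            # standard error of the mean difference
    # The paired t statistic has n - 1 degrees of freedom.
    t_stat = mean_diff / se if se > 0 else float("nan")
    return {"n": n, "mean_diff": mean_diff, "sd": sd,
            "se": se, "t": t_stat, "df": n - 1}


# Hypothetical ages of four matched pairs (case, control).
case_ages = [50, 63, 71, 45]
control_ages = [51, 62, 70, 46]
summary = paired_summary(case_ages, control_ages)
print(summary)
```

A mean difference close to zero and a t statistic close to zero suggest that the matched groups are comparable on that variable.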
Save match IDs in spreadsheet column
Click the corresponding button to create a new column in the spreadsheet that contains, for each case, the identification of the matched control (and vice versa). A column is added to the spreadsheet.
In subsequent statistical analyses this new column can be used in a filter in order to include only cases and controls for which a match was found.
E.g. if the new column has MatchID as a heading, the filter could be MatchID>0 or MatchID<>"" (<> means Not Equal To).
Save as new file with paired data
Click the corresponding button to create a new MedCalc data file in which the data are rearranged as follows:
- The file includes the data of cases with matching controls only.
- A first set of columns contains the data of the cases. The heading of these columns is the original heading with "_T" appended. A second set of columns contains the data of the controls. The heading of these columns is the original heading with "_C" appended.
- Each row contains the data of a case and its matching control.
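The layout described in this list can be sketched in Python as follows. This is only an illustration of the rearrangement, not MedCalc's file format or code: the function `build_paired_rows` and the column names `ID`, `Treatment` and `Age` are hypothetical, and the match column is assumed to hold the ID of the matched subject or an empty string when no match was found.

```python
def build_paired_rows(rows, id_column, match_column, group_column):
    """Return one row per matched pair with "_T" and "_C" suffixed columns."""
    by_id = {row[id_column]: row for row in rows}
    paired = []
    for row in rows:
        if row[group_column] != 1:        # start from the cases (coded 1) only
            continue
        match_id = row.get(match_column)
        if match_id in (None, ""):        # skip cases without a matching control
            continue
        control = by_id[match_id]
        merged = {f"{name}_T": value for name, value in row.items()}
        merged.update({f"{name}_C": value for name, value in control.items()})
        paired.append(merged)
    return paired


# Hypothetical spreadsheet: case 1 is matched to control 4, case 2 is unmatched.
rows = [
    {"ID": 1, "Treatment": 1, "Age": 50, "MatchID": 4},
    {"ID": 2, "Treatment": 1, "Age": 80, "MatchID": ""},
    {"ID": 3, "Treatment": 0, "Age": 70, "MatchID": ""},
    {"ID": 4, "Treatment": 0, "Age": 51, "MatchID": 1},
]
paired = build_paired_rows(rows, "ID", "MatchID", "Treatment")
for pair in paired:
    print(pair)
```

Columns such as `Age_T` and `Age_C` then sit side by side on the same row, which is the shape a paired test expects.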
This new data file allows you to perform statistical tests on paired data.
Methodology
MedCalc uses a greedy matching algorithm within specified caliper distances. In n iterations the total standardized bias is minimized.
The matching ratio is 1:1, without replacement.
A paired t-test is used to assess comparability of baseline characteristics between matched groups.
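To make the idea of greedy 1:1 matching within calipers more concrete, here is a minimal Python sketch. It is an assumption-laden illustration, not MedCalc's implementation: it makes a single pass over the cases in random order and picks the closest available control, and it does not reproduce the repeated iterations or the total standardized bias criterion mentioned above. The function `greedy_match`, its parameters and the example data are all hypothetical.

```python
import random


def greedy_match(cases, controls, calipers, exact=(), seed=0):
    """Greedy 1:1 matching without replacement within calipers.

    cases, controls: dict mapping subject ID -> dict of variable values.
    calipers: dict mapping a numeric variable -> maximum allowable difference.
    exact: names of variables that must be identical (e.g. "gender").
    Returns a dict mapping case ID -> matched control ID.
    """
    rng = random.Random(seed)
    case_ids = list(cases)
    rng.shuffle(case_ids)                 # process cases in random order
    available = set(controls)             # controls not yet used
    matches = {}
    for case_id in case_ids:
        case = cases[case_id]
        candidates = []
        for control_id in sorted(available):
            control = controls[control_id]
            if any(case[v] != control[v] for v in exact):
                continue
            if any(abs(case[v] - control[v]) > c for v, c in calipers.items()):
                continue
            # Distance: sum of differences, each scaled by its caliper.
            distance = sum(abs(case[v] - control[v]) / c if c else 0.0
                           for v, c in calipers.items())
            candidates.append((distance, control_id))
        if candidates:
            _, best = min(candidates)     # closest control, ties broken by ID
            matches[case_id] = best
            available.remove(best)        # without replacement
    return matches


# Hypothetical data: match on age within 2 years and on identical gender.
cases = {
    "T1": {"age": 50, "gender": "Male"},
    "T2": {"age": 63, "gender": "Female"},
    "T3": {"age": 80, "gender": "Male"},
}
controls = {
    "C1": {"age": 51, "gender": "Male"},
    "C2": {"age": 62, "gender": "Female"},
    "C3": {"age": 49, "gender": "Female"},
    "C4": {"age": 70, "gender": "Male"},
}
matches = greedy_match(cases, controls, calipers={"age": 2}, exact=("gender",))
print(matches)
```

In this example case T3 stays unmatched because no male control lies within the 2-year caliper, which shows how a small caliper can reduce the number of matches.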