MedCalc

Case-Control matching

Description

The case-control matching procedure is used to randomly match cases and controls based on specific criteria. MedCalc can match on up to 4 different variables.

In the example we will use the following data:

[Image: example data in the MedCalc spreadsheet]

The treated cases are coded 1 and the controls are coded 0. For each treated case, MedCalc will try to find a control with matching age and gender.

Required input

  • Classification variable: select or enter a dichotomous variable indicating group membership (0=control, 1=case).

    If your data are coded differently, you can use the Define status tool to recode your data.
  • Variable with case identification: select a variable that contains a unique identification code for each subject in the spreadsheet. If you do not select a variable here (not recommended), MedCalc will use row numbers as case identification.
  • Match on: select up to 4 variables and for each variable the maximum allowable difference (caliper). Smaller calipers will result in reduced bias and closer matches, but may also result in a smaller number of matches. Select the option "Exact match" to match on a variable that is not numerical (for example, a variable "Gender" that is coded 'Male' and 'Female').
  • Filter: optionally, enter a filter to include only a selected subgroup of cases (e.g. SEX="Male").
  • Advanced: click Advanced to enter the number of iterations and the random number seed.
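To illustrate how a caliper restricts which controls are acceptable for a given case, here is a minimal Python sketch. It is not MedCalc's implementation: the function name `within_calipers`, the variable names, and the convention that `None` stands for the "Exact match" option are assumptions made for this example.

```python
def within_calipers(case, control, calipers):
    """Return True if `control` is an acceptable match for `case`.

    `calipers` maps a variable name to its maximum allowable difference.
    A caliper of None is a hypothetical way to request an exact match,
    which also works for non-numerical variables such as gender.
    """
    for variable, caliper in calipers.items():
        if caliper is None:
            if case[variable] != control[variable]:
                return False
        elif abs(case[variable] - control[variable]) > caliper:
            return False
    return True


case = {"age": 54, "gender": "Female"}
calipers = {"age": 2, "gender": None}

close_control = {"age": 55, "gender": "Female"}   # age differs by 1
far_control = {"age": 58, "gender": "Female"}     # age differs by 4
other_gender = {"age": 54, "gender": "Male"}      # gender differs

print(within_calipers(case, close_control, calipers))
print(within_calipers(case, far_control, calipers))
print(within_calipers(case, other_gender, calipers))
```

With an age caliper of 2, only the first control qualifies. Widening the caliper would admit more controls but allow less similar pairs.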

For the example data, we complete the dialog box as follows:

[Image: Case-Control matching dialog box completed for the example data]

Results

The results are displayed in a dialog box.

[Image: Case-Control matching results dialog box]

The program gives the total number of subjects, number of cases, number of controls and the number of matched cases, i.e. the number of cases for which a matching control has been found.

Next, for each matching variable, the mean difference between the matched subjects is given, with its SD, the 95% CI of the difference and the associated P-value (paired samples t-test). The differences and their 95% confidence intervals should be small and negligible. P-values should be non-significant. If for one or more variables the confidence interval is large or the P-value is significant, the "maximum allowable difference" entered in the input dialog box (see above) was probably too large.
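The summary statistics of the matched differences can be sketched in Python as follows. This is an illustration under stated assumptions, not MedCalc's code: the function name `paired_summary` is hypothetical, the ages are made up, and the critical t value is passed in by the caller (for example from a t-table) because the standard library has no t distribution. The P-value is therefore not computed here.

```python
import math
import statistics


def paired_summary(case_values, control_values, t_critical):
    """Summarise case-minus-control differences for matched pairs.

    `t_critical` is the two-sided 95% critical value of Student's t
    for n - 1 degrees of freedom, supplied by the caller.
    """
    differences = [t - c for t, c in zip(case_values, control_values)]
    n = len(differences)
    mean_diff = statistics.mean(differences)
    sd_diff = statistics.stdev(differences)      # sample SD of the differences
    se = sd_diff / math.sqrt(n)                  # standard error of the mean
    t_statistic = mean_diff / se if se > 0 else float("nan")
    ci95 = (mean_diff - t_critical * se, mean_diff + t_critical * se)
    return {"n": n, "mean": mean_diff, "sd": sd_diff,
            "t": t_statistic, "ci95": ci95}


# Made-up ages for five matched pairs; 2.776 is the critical value for 4 df.
case_ages = [54, 61, 47, 70, 58]
control_ages = [55, 60, 47, 71, 57]

summary = paired_summary(case_ages, control_ages, t_critical=2.776)
print(summary)
```

A mean difference near zero with a narrow interval around it indicates that the matched groups are comparable on that variable.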

Save match IDs in spreadsheet column

Click Save match IDs... to create a new column in the spreadsheet that contains, for each case, the identification of the matched control (and vice versa).

[Image: Save match IDs dialog box]

A column is added in the spreadsheet:

[Image: spreadsheet with the added match ID column]

In subsequent statistical analyses this new column can be used in a filter in order to include only cases and controls for which a match was found.

E.g. if the new column has MatchID as a heading, the filter could be MatchID>0 or MatchID<>"" (<> means Not Equal To).
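For readers who process the data outside MedCalc, the same filter can be expressed in Python. The sketch below assumes the spreadsheet rows are available as dictionaries; the column names `ID`, `Treatment` and `MatchID` and the values are hypothetical.

```python
# Hypothetical rows; MatchID is empty (None) when no match was found.
rows = [
    {"ID": 1, "Treatment": 1, "MatchID": 7},
    {"ID": 2, "Treatment": 1, "MatchID": None},   # case without a control
    {"ID": 7, "Treatment": 0, "MatchID": 1},
    {"ID": 9, "Treatment": 0, "MatchID": None},   # control that was not used
]

# Equivalent of the filter MatchID<>"": keep rows with a non-empty match ID.
matched_rows = [row for row in rows if row["MatchID"] not in (None, "")]
matched_ids = [row["ID"] for row in matched_rows]
print(matched_ids)
```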

Save as new file with paired data

Click Save new file... to create a new MedCalc data file in which the data are rearranged as follows:

  • The file includes the data of cases with matching controls only.
  • A first set of columns contains the data of the cases. The heading of these columns is the original heading with "_T" appended. A second set of columns contains the data of the controls. The heading of these columns is the original heading with "_C" appended.
  • On each row, the data of a case and its matching control is given.
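The rearrangement described above can be sketched in Python as follows. This is only an illustration of the layout, not MedCalc's code; the function name `build_paired_rows` and the column names are assumptions.

```python
def build_paired_rows(rows, id_column, group_column, match_column):
    """Put each matched case and its control side by side on one row.

    Case columns get the suffix "_T", control columns the suffix "_C".
    Cases without a matching control are left out.
    """
    by_id = {row[id_column]: row for row in rows}
    paired = []
    for row in rows:
        is_case = row[group_column] == 1
        has_match = row[match_column] not in (None, "")
        if not (is_case and has_match):
            continue
        control = by_id[row[match_column]]
        combined = {f"{name}_T": value for name, value in row.items()}
        combined.update({f"{name}_C": value for name, value in control.items()})
        paired.append(combined)
    return paired


# Hypothetical data: case 1 is matched to control 7, case 2 is unmatched.
rows = [
    {"ID": 1, "Treatment": 1, "AGE": 54, "MatchID": 7},
    {"ID": 2, "Treatment": 1, "AGE": 90, "MatchID": None},
    {"ID": 7, "Treatment": 0, "AGE": 55, "MatchID": 1},
]

paired = build_paired_rows(rows, "ID", "Treatment", "MatchID")
for pair in paired:
    print(pair)
```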

[Image: new data file with paired data]

This new data file allows you to perform statistical tests on paired data.

Methodology

MedCalc uses a greedy matching algorithm within specified caliper distances. In n iterations the total standardized bias is minimized.

The matching ratio is 1:1, without replacement.
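A greedy 1:1 matching without replacement within calipers could look like the following Python sketch. It is not MedCalc's algorithm: the function names `greedy_match` and `best_of` are hypothetical, variables are assumed to be coded numerically (a caliper of 0 gives an exact match), and the criterion used to compare iterations (most matches, then smallest total absolute difference) is a simple stand-in for the total standardized bias mentioned above.

```python
import random


def greedy_match(cases, controls, calipers, seed=0):
    """One greedy pass: each case takes its closest available control."""
    rng = random.Random(seed)
    case_ids = list(cases)
    rng.shuffle(case_ids)              # random processing order
    available = dict(controls)         # controls not yet used
    matches = {}
    for case_id in case_ids:
        case = cases[case_id]
        best_id, best_distance = None, None
        for control_id, control in available.items():
            diffs = {v: abs(case[v] - control[v]) for v in calipers}
            if any(diffs[v] > calipers[v] for v in calipers):
                continue               # outside at least one caliper
            distance = sum(diffs.values())
            if best_distance is None or distance < best_distance:
                best_id, best_distance = control_id, distance
        if best_id is not None:
            matches[case_id] = best_id
            del available[best_id]     # without replacement
    return matches


def best_of(cases, controls, calipers, iterations=10, seed=0):
    """Repeat the greedy pass and keep the best result (stand-in criterion)."""
    best, best_score = {}, None
    for i in range(iterations):
        matches = greedy_match(cases, controls, calipers, seed + i)
        total = sum(abs(cases[c][v] - controls[m][v])
                    for c, m in matches.items() for v in calipers)
        score = (-len(matches), total)
        if best_score is None or score < best_score:
            best, best_score = matches, score
    return best


# Hypothetical data; sex is coded 0/1 so that a caliper of 0 is an exact match.
cases = {"A": {"age": 50, "sex": 1}, "B": {"age": 60, "sex": 0}}
controls = {"X": {"age": 51, "sex": 1},
            "Y": {"age": 59, "sex": 0},
            "Z": {"age": 80, "sex": 1}}
calipers = {"age": 2, "sex": 0}

result = best_of(cases, controls, calipers)
print(result)
```

Because a control is removed from the pool once it has been used, the order in which cases are processed can change the outcome, which is why the sketch repeats the pass with different random orders.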

A paired t-test is used to assess comparability of baseline characteristics between matched groups.

See also