Weighted regression - worked example

The problem of heteroscedasticity

In regression analysis, heteroscedasticity is the situation in which the variance of the dependent variable (Y) varies across the levels of the independent variable (X). Heteroscedasticity can complicate analysis because ordinary least squares regression assumes equal variance across the levels of the independent variable.

[Figure: two scatter plots, one illustrating homoscedasticity and one illustrating heteroscedasticity]

Weighted regression can be used to correct for heteroscedasticity. In a weighted regression procedure, more weight is given to the observations with smaller variance, because these observations provide more reliable information about the regression function than those with larger variance.
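
In symbols, and as a hedged summary of the usual textbook formulation (the notation \beta_0, \beta_1 and \sigma_i is assumed here, not taken from this page), weighted least squares with a single predictor chooses the coefficients that minimize a weighted sum of squared residuals, with each weight equal to the reciprocal of that observation's variance:

```latex
Q_w = \sum_{i=1}^{n} w_i \left( Y_i - \beta_0 - \beta_1 X_i \right)^2,
\qquad w_i = \frac{1}{\sigma_i^2}
```

With all weights equal, this reduces to the ordinary least squares criterion.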

Neter et al. (1996) suggest the following process for estimating the regression coefficients in the presence of heteroscedasticity:

  1. Fit the regression model by unweighted least squares and analyze the residuals.
  2. Estimate the variance function or the standard deviation function by regressing either the squared residuals or the absolute residuals on the appropriate predictor(s).
  3. Use the fitted values from the estimated variance or standard deviation function to obtain the weights w_i.
  4. Estimate the regression coefficients using these weights.
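
The four steps above can be sketched in Python with NumPy. This is an illustrative sketch only, not MedCalc's implementation: the helper names `fit_line` and `four_step_weighted_fit` are hypothetical, the standard deviation function is assumed to be linear in the predictor, and the data are synthetic values generated for the demonstration, not the Neter data set.

```python
import numpy as np

def fit_line(x, y, w=None):
    """Return (intercept, slope) minimizing sum(w * (y - b0 - b1*x)**2)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.ones_like(x) if w is None else np.asarray(w, dtype=float)
    X = np.column_stack([np.ones_like(x), x])
    sw = np.sqrt(w)  # scaling each row by sqrt(w) turns WLS into an OLS problem
    coef, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return coef[0], coef[1]

def four_step_weighted_fit(x, y):
    """Hypothetical helper following the four-step process listed above."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Step 1: unweighted least squares fit and its residuals.
    b0, b1 = fit_line(x, y)
    resid = y - (b0 + b1 * x)
    # Step 2: standard deviation function, absolute residuals regressed on x.
    c0, c1 = fit_line(x, np.abs(resid))
    # Step 3: fitted standard deviations give the weights 1/SD^2.
    sd_hat = c0 + c1 * x
    weights = 1.0 / sd_hat**2
    # Step 4: weighted least squares fit with these weights.
    return fit_line(x, y, weights), weights

# Synthetic (made-up) data whose spread grows with age.
rng = np.random.default_rng(0)
age = rng.uniform(20, 60, size=200)
dbp = 56.0 + 0.58 * age + rng.normal(0.0, 0.2 * age)

(intercept, slope), weights = four_step_weighted_fit(age, dbp)
print(f"weighted fit: DBP = {intercept:.3f} + {slope:.3f} * Age")
```

Note that this sketch does not check whether the fitted standard deviations are positive; with real data that check is worth adding before the weights are used.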

How to do this automatically

MedCalc will perform these steps automatically when you select the dummy variable "*** AutoWeight 1/SD^2 ***" for "Weights" in the dialog boxes for regression.

How to perform each step in MedCalc

If you would like to have more control over the process, perhaps because you require some modifications of one or more steps, you can perform each of these steps using Weighted regression and other tools available in MedCalc.

This process is described below in detail.

The data for this example are available in the MedCalc sample files folder, file "Weighted regression (Neter).mc1". This file contains Age and Diastolic blood pressure (DBP) data collected on 54 subjects.

Source: http://www.ats.ucla.edu/stat/sas/examples/alsm/alsmsasch10.htm

Step 1. Fit the regression model by unweighted least squares and analyze the residuals

We select Regression in the statistics menu and complete the dialog box as follows.

[Screenshot: Regression dialog box]

Variable Y, the dependent variable, is DBP (diastolic blood pressure), and Variable X, the independent variable, is Age.

We do not select a variable for Weights because in this first step we perform ordinary unweighted least squares regression.

We obtain the following results:

[Screenshot: regression results]

In the results window, we click the hyperlink "Save residuals" to save the residuals in a new column of the spreadsheet.

Residuals are the differences between the observed values of the dependent variable DBP and the values calculated using the regression equation.
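
As a small illustration of this definition, the sketch below computes one residual from the unweighted regression equation reported later in this example (DBP = 56.1569 + 0.5800 Age). The subject used here is hypothetical and is not taken from the sample file; the function name `residual` is also just for illustration.

```python
# Coefficients of the unweighted fit reported later in this example.
INTERCEPT = 56.1569
SLOPE = 0.5800

def residual(age, observed_dbp):
    """Observed DBP minus the DBP predicted by the regression equation."""
    predicted = INTERCEPT + SLOPE * age
    return observed_dbp - predicted

# Hypothetical subject (not from the sample file): age 40, observed DBP 75.
r = residual(40, 75)
print(round(r, 4))  # -4.3569
```

A negative residual means the observed value lies below the regression line.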

In the subsequent dialog box, we click OK.

This will create a new column in the spreadsheet containing the residuals (variable "REGR_Resid1"):

[Screenshot: spreadsheet with the new column REGR_Resid1]

Step 2. Estimate the variance function or the standard deviation function

In this step, we build a regression model of the standard deviation against Age. We do that by regressing the absolute values of the residuals against Age, since the absolute residuals are an estimator of the standard deviation of DBP at different values of Age.

We select Regression in the statistics menu and complete the dialog box as follows:

[Screenshot: Regression dialog box]

For Variable Y, we first select the new variable "REGR_Resid1", then edit the selection so that it reads "abs(REGR_Resid1)".

We obtain the following results:

[Screenshot: regression results]

Step 3. Use the fitted values from the estimated variance or standard deviation function to obtain the weights

In the last results window, we click the hyperlink "Save predicted values" to save the predicted values in a new column of the spreadsheet.

In the subsequent dialog box, we click OK.

This will create a new column in the spreadsheet containing the predicted (or estimated) values of the standard deviation (variable "REGR_Pred1"):

[Screenshot: spreadsheet with the new column REGR_Pred1]

Step 4. Estimate the regression coefficients using these weights

Finally, we can build our weighted regression model.

For weights we use the reciprocal of the squared predicted values for standard deviation (variance is the standard deviation squared): observations with large standard deviation are given less weight than observations with smaller standard deviation.
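
The weight calculation can be sketched in a few lines of Python. The function name `weights_from_sd` is hypothetical, and the check for non-positive predicted standard deviations is an added safeguard assumed here, not a documented MedCalc behavior: a fitted standard deviation function can, in principle, return zero or negative values, for which 1/SD^2 is undefined or misleading.

```python
import numpy as np

def weights_from_sd(sd_pred):
    """Return weights 1/SD^2 from predicted standard deviations."""
    sd_pred = np.asarray(sd_pred, dtype=float)
    # Safeguard (an assumption of this sketch): reject SD values <= 0.
    if np.any(sd_pred <= 0):
        raise ValueError("predicted standard deviations must be positive")
    return 1.0 / sd_pred**2

# Made-up predicted standard deviations: doubling the SD quarters the weight.
w = weights_from_sd([2.0, 4.0, 8.0])
print(w.tolist())  # [0.25, 0.0625, 0.015625]
```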

We select Regression in the statistics menu and complete the dialog box as follows.

[Screenshot: Regression dialog box]

For Weights, we first select the new variable "REGR_Pred1", then edit the selection so that it reads "1/REGR_Pred1^2" (we could also use "1/(REGR_Pred1*REGR_Pred1)" or "1/Power(REGR_Pred1,2)").

We obtain the following results:

[Screenshot: weighted regression results]

The final (weighted) regression equation is

DBP = 55.5658 + 0.5963 Age

which is not much different from the original (unweighted) regression equation

DBP = 56.1569 + 0.5800 Age

However, the standard errors of the regression coefficients are smaller, resulting in narrower confidence intervals.
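
To see where such standard errors come from, the following sketch computes coefficient standard errors for an unweighted and a weighted straight-line fit. It is not MedCalc's code: `fit_with_se` is a hypothetical helper, the data are synthetic, and the weights assume that the standard deviation is proportional to age (which is how the synthetic data were generated). Whether the weighted standard errors come out smaller depends on the data at hand.

```python
import numpy as np

def fit_with_se(x, y, w=None):
    """Weighted least squares for a straight line.

    Returns (coefficients, standard errors), each ordered (intercept, slope).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    w = np.ones(n) if w is None else np.asarray(w, dtype=float)
    X = np.column_stack([np.ones(n), x])
    XtWX = X.T @ (X * w[:, None])                # X' W X
    beta = np.linalg.solve(XtWX, X.T @ (w * y))  # solves (X'WX) b = X'Wy
    resid = y - X @ beta
    mse = np.sum(w * resid**2) / (n - 2)         # weighted mean squared error
    se = np.sqrt(np.diag(mse * np.linalg.inv(XtWX)))
    return beta, se

# Synthetic (made-up) data with a standard deviation proportional to age.
rng = np.random.default_rng(1)
age = rng.uniform(20, 60, size=200)
dbp = 56.0 + 0.58 * age + rng.normal(0.0, 0.2 * age)

beta_u, se_u = fit_with_se(age, dbp)                  # unweighted
beta_w, se_w = fit_with_se(age, dbp, w=1.0 / age**2)  # weighted, w = 1/SD^2 up to a constant
print("unweighted SE (intercept, slope):", se_u)
print("weighted SE   (intercept, slope):", se_w)
```

Multiplying all weights by the same constant changes neither the coefficients nor their standard errors, so only the relative sizes of the weights matter.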

Literature

  • Neter J, Kutner MH, Nachtsheim CJ, Wasserman W (1996) Applied linear statistical models. 4th ed. Boston: McGraw-Hill.
