MedCalc

Scatter diagram & regression line

Command: Statistics > Regression > Scatter diagram & regression line

Description

In a scatter diagram, the relation between two numerical variables is presented graphically. One variable (the independent variable X) defines the horizontal axis and the other (the dependent variable Y) defines the vertical axis. Each row in the data spreadsheet supplies one point in the diagram: its X value and its Y value.

Required input

The dialog box for the scatter diagram is similar to the one for Regression:

Dialog box for scatter diagram with regression line

Variables

Regression equation

By default the option Include constant in equation is selected. This is the recommended option that will result in ordinary least-squares regression. When you need regression through the origin (no constant a in the equation), you can uncheck this option (an example of when this is appropriate is given in Eisenhauer, 2003).
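The effect of the Include constant in equation option can be illustrated with a minimal Python sketch. This is not MedCalc's code; the function name fit_line and the parameter include_constant are hypothetical, and the sketch only covers the straight-line equation y = a + b x.

```python
def fit_line(x, y, include_constant=True):
    """Least-squares fit of y = a + b*x; returns (a, b).

    With include_constant=False the line is forced through the origin (a = 0).
    """
    n = len(x)
    if include_constant:
        mean_x = sum(x) / n
        mean_y = sum(y) / n
        sxx = sum((xi - mean_x) ** 2 for xi in x)
        sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
        b = sxy / sxx
        a = mean_y - b * mean_x
    else:
        # Regression through the origin: b = sum(x*y) / sum(x^2)
        b = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
        a = 0.0
    return a, b


# Illustrative data lying exactly on y = 1 + 2x
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]
with_constant = fit_line(x, y)                            # recovers a = 1, b = 2
through_origin = fit_line(x, y, include_constant=False)   # a = 0, steeper slope
```

Because the illustrative data have a non-zero intercept, forcing the line through the origin changes the slope. This is why the constant should only be dropped when there is a substantive reason to do so.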

MedCalc offers a choice of 5 different regression equations (x represents the independent variable and y the dependent variable):

y = a + b x                  straight line
y = a + b log(x)             logarithmic curve
log(y) = a + b x             exponential curve
log(y) = a + b log(x)        geometric curve
y = a + b x + c x²           quadratic regression (parabola)

When you select an equation that contains a logarithmic transformation for one of the variables, the program will use a logarithmic scale for the corresponding variable.
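Each of the five equations is linear in its coefficients once the variables are transformed, so all of them can be fitted by ordinary least squares. The following Python sketch illustrates the idea; it is not MedCalc's implementation. The helper names are hypothetical, and the use of base-10 logarithms is an assumption, since this page does not state the base.

```python
import math


def solve_linear_system(A, rhs):
    """Solve A * coef = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    coef = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * coef[c] for c in range(r + 1, n))
        coef[r] = (M[r][n] - s) / M[r][r]
    return coef


def least_squares(columns, target):
    """Coefficients minimising the squared error of target ~ sum(coef[j] * columns[j])."""
    k = len(columns)
    A = [[sum(u * v for u, v in zip(columns[i], columns[j])) for j in range(k)]
         for i in range(k)]
    rhs = [sum(u * t for u, t in zip(columns[i], target)) for i in range(k)]
    return solve_linear_system(A, rhs)


def fit(x, y, equation):
    """Return [a, b] (or [a, b, c] for the quadratic) for the chosen equation."""
    ones = [1.0] * len(x)
    if equation == "straight line":      # y = a + b x
        return least_squares([ones, x], y)
    if equation == "logarithmic":        # y = a + b log(x)
        return least_squares([ones, [math.log10(v) for v in x]], y)
    if equation == "exponential":        # log(y) = a + b x
        return least_squares([ones, x], [math.log10(v) for v in y])
    if equation == "geometric":          # log(y) = a + b log(x)
        return least_squares([ones, [math.log10(v) for v in x]],
                             [math.log10(v) for v in y])
    if equation == "quadratic":          # y = a + b x + c x^2
        return least_squares([ones, x, [v * v for v in x]], y)
    raise ValueError("unknown equation: " + equation)


# Illustrative data generated from known coefficients
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_exp = [10 ** (0.5 + 0.25 * v) for v in x]       # log(y) = 0.5 + 0.25 x
y_quad = [1.0 + 2.0 * v + 3.0 * v * v for v in x]  # y = 1 + 2 x + 3 x^2
exp_coef = fit(x, y_exp, "exponential")
quad_coef = fit(x, y_quad, "quadratic")
```

Note that the logarithmic transformations require strictly positive values for the transformed variable.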

Options

Residuals

In regression analysis, residuals are the differences between the predicted values and the observed values for the dependent variable. The residual plot allows the visual evaluation of the goodness of fit of the selected model.

To obtain a residuals plot, select this option in the dialog box. This graph will be displayed in a second window.

Subgroups

Click the Subgroups button if you want to identify subgroups in the scatter diagram. A new dialog box is displayed in which you can select a categorical variable. The graph will use a different marker for each category of this variable, and can optionally show regression lines for all cases and for each subgroup.

Examples

Scatter diagram with regression line

Regression line and 95% confidence interval

Regression line and 95% prediction interval

Regression line, 95% confidence interval and 95% prediction interval

Regression line and heatmap
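For reference, the textbook formulas for the two kinds of interval around a straight line y = a + b x are given below. They assume the straight-line equation with a constant and the usual regression assumptions; this page does not state how MedCalc computes the intervals, so the block is a sketch of the standard definitions rather than a description of the program. Here n is the number of points, x̄ the mean of the x values, and t the 97.5th percentile of the t distribution with n − 2 degrees of freedom.

```latex
\begin{align*}
\hat{y}_0 &= a + b\,x_0, \qquad
s^2 = \frac{1}{n-2}\sum_{i=1}^{n}\bigl(y_i - a - b\,x_i\bigr)^2, \qquad
S_{xx} = \sum_{i=1}^{n}(x_i - \bar{x})^2 \\[4pt]
\text{95\% confidence interval:}\quad
&\hat{y}_0 \pm t_{0.975,\,n-2}\; s\,\sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}} \\[4pt]
\text{95\% prediction interval:}\quad
&\hat{y}_0 \pm t_{0.975,\,n-2}\; s\,\sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}}
\end{align*}
```

The confidence interval describes the uncertainty of the fitted line itself, while the prediction interval describes where a new observation is expected to fall, so the prediction interval is always the wider of the two.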


When you click a point on the regression line, the program will give the x-value and the f(x) value calculated using the regression equation.

Regression line show f(x)

You can press Ctrl+P to print the scatter diagram, or function key F10 to save the picture as a file on disk. To define other titles or colors in the graph, or to change the axis scaling, see Format graph.

To repeat the scatter diagram, for example with a different regression equation, press function key F7. The dialog box will reappear with the previous entries (see Recall dialog).

Extrapolation

By default, MedCalc shows the regression line only within the range of observed values. As a rule, it is not recommended to extrapolate the regression line beyond the observed range. For particular applications, however, such as the evaluation of stability data, extrapolation may be useful; see for example the ICH guideline Evaluation of Stability Data (PDF).

To allow extrapolation, right-click in the graph and click Allow extrapolation on the context menu.

Allow extrapolation
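The default behavior amounts to a range check on x. The short Python sketch below illustrates it for a straight line; the function predict and its allow_extrapolation parameter are hypothetical names chosen for illustration and do not correspond to any MedCalc interface.

```python
def predict(a, b, x0, observed_x, allow_extrapolation=False):
    """Return a + b*x0, or None when x0 lies outside the observed x range
    and extrapolation has not been allowed."""
    lo, hi = min(observed_x), max(observed_x)
    if not allow_extrapolation and not (lo <= x0 <= hi):
        return None
    return a + b * x0


# Illustrative line y = 1 + 2x with x observed between 1 and 4
observed_x = [1.0, 2.0, 3.0, 4.0]
inside = predict(1.0, 2.0, 2.5, observed_x)        # within the observed range
outside = predict(1.0, 2.0, 6.0, observed_x)       # None: beyond the observed range
extrapolated = predict(1.0, 2.0, 6.0, observed_x, allow_extrapolation=True)
```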

Residuals plot

When you select the option Residuals plot in the Regression line dialog box, the program will display a second window with the residuals plot. Residuals are the differences between the predicted values and the observed values for the dependent variable. The residuals plot allows a visual evaluation of the goodness of fit of the selected model or equation. Residuals may point to possible outliers (unusual values) in the data or to problems with the regression model. If the residuals display a systematic pattern, you should consider selecting a different regression model.

Residuals plot
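The following Python sketch shows how a pattern in the residuals can reveal a poorly chosen model. It fits a straight line to illustrative data that actually follow a parabola. The function names are hypothetical, and the sign convention (observed minus predicted) is an assumption, since the text above does not specify one.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def residuals(x, y, a, b):
    """Observed minus predicted value for each point (assumed sign convention)."""
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


# Illustrative data following y = x^2, deliberately fitted with a straight line
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [0.0, 1.0, 4.0, 9.0, 16.0]
a, b = fit_line(x, y)
res = residuals(x, y, a, b)
# res is positive at both ends and negative in the middle: a U-shaped pattern
```

A U-shaped pattern like this one suggests that a quadratic equation would describe the data better than a straight line.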

Literature
