Scatter diagram & regression line
In a scatter diagram, the relation between two numerical variables is presented graphically. One variable (the independent variable X) defines the horizontal axis and the other (the dependent variable Y) defines the vertical axis. The values of the two variables on the same row of the data spreadsheet give the coordinates of one point in the diagram.
The dialog box for the scatter diagram is similar to the one for Regression.
Select the 2 variables to be represented in the graph. Optionally, you may also enter a data filter in order to include only a selected subgroup of cases in the statistical analysis.
By default the option Include constant in equation is selected. This is the recommended option that will result in ordinary least-squares regression. When you need regression through the origin (no constant a in the equation), you can uncheck this option (an example of when this is appropriate is given in Eisenhauer, 2003).
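The difference between the two settings can be sketched in a few lines of Python. This is only an illustration of the least-squares formulas, not MedCalc's own implementation; the function name fit_line, its include_constant parameter, and the sample data are assumptions made for the example.

```python
def fit_line(xs, ys, include_constant=True):
    """Least-squares fit of y = a + b*x; returns (a, b).

    With include_constant=False the line is forced through the origin (a = 0).
    """
    n = len(xs)
    if include_constant:
        mean_x = sum(xs) / n
        mean_y = sum(ys) / n
        sxx = sum((x - mean_x) ** 2 for x in xs)
        sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        b = sxy / sxx
        a = mean_y - b * mean_x
    else:
        # Regression through the origin: minimise sum((y - b*x)^2)
        b = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
        a = 0.0
    return a, b


# Hypothetical sample data
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.1, 4.9, 7.2, 8.8, 11.0]

a, b = fit_line(xs, ys)                             # with constant
a0, b0 = fit_line(xs, ys, include_constant=False)   # through the origin
print(f"with constant:  y = {a:.3f} + {b:.3f} x")
print(f"through origin: y = {b0:.3f} x")
```

Forcing the line through the origin changes the slope whenever the fitted constant is not zero, which is why the option should only be unchecked when a zero intercept is justified by the subject matter.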
MedCalc offers a choice of 5 different regression equations (X represents the independent variable and Y the dependent variable):
When you select an equation that contains a logarithmic transformation for one of the variables, the program will use a logarithmic scale for the axis of the corresponding variable.
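As a rough illustration of what a logarithmic transformation means for the fit, the sketch below regresses log(Y) on X and then transforms predictions back to the original scale. The specific equation form, the use of base-10 logarithms, and the names fit_line and predict are assumptions for this example and are not taken from the program.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


# Hypothetical data that follow log10(y) = 0.5 + 0.2*x exactly
xs = [1.0, 2.0, 3.0, 4.0]
ys = [10 ** (0.5 + 0.2 * x) for x in xs]

# Fit a straight line to the log-transformed dependent variable
log_ys = [math.log10(y) for y in ys]
a, b = fit_line(xs, log_ys)


def predict(x):
    """Back-transform the fitted value to the original Y scale."""
    return 10 ** (a + b * x)


print(f"log10(y) = {a:.3f} + {b:.3f} x")
```

On a logarithmic Y axis this fitted curve appears as a straight line, which is the reason for switching the axis scale.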
In regression analysis, residuals are the differences between the predicted values and the observed values for the dependent variable. The residual plot allows the visual evaluation of the goodness of fit of the selected model.
To obtain a residuals plot, select this option in the dialog box. This graph will be displayed in a second window.
Click the Subgroups button if you want to identify subgroups in the scatter diagram. A new dialog box is displayed in which you can select a categorical variable. The graph will use different markers for the different categories in this variable, and optionally will show regression lines for all cases and for each subgroup.
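Conceptually, showing regression lines for all cases and for each subgroup amounts to repeating the fit on each subset of rows. A minimal sketch, assuming each row is an (x, y, category) tuple; the names fit_by_subgroup and rows, and the data, are hypothetical.

```python
from collections import defaultdict


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def fit_by_subgroup(rows):
    """Fit one line to all cases and one line per category."""
    groups = defaultdict(list)
    for x, y, category in rows:
        groups[category].append((x, y))
    overall = fit_line([r[0] for r in rows], [r[1] for r in rows])
    per_group = {
        category: fit_line([p[0] for p in pts], [p[1] for p in pts])
        for category, pts in groups.items()
    }
    return overall, per_group


# Hypothetical data: two categories with clearly different slopes
rows = [
    (1.0, 3.0, "A"), (2.0, 5.0, "A"), (3.0, 7.0, "A"),
    (1.0, 5.5, "B"), (2.0, 6.0, "B"), (3.0, 6.5, "B"),
]
overall, per_group = fit_by_subgroup(rows)
print("all cases:", overall)
for category, (a, b) in sorted(per_group.items()):
    print(category, (a, b))
```

In this example the line for all cases lies between the two subgroup lines and describes neither group well, which is the kind of situation that subgroup markers and separate lines help to reveal.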
When you click a point on the regression line, the program will give the x-value and the f(x) value calculated using the regression equation.
You can press Ctrl+P to print the scatter diagram, or function key F10 to save the picture as a file on disk. To define other titles or colors in the graph, or to change the axis scaling, see Format graph.
If you want to repeat the scatter diagram, possibly to select a different regression equation, then you only have to press function key F7. The dialog box will re-appear with the previous entries (see F7 - Repeat key).
MedCalc shows the regression line only within the range of the observed values. As a rule, it is not recommended to extrapolate the regression line beyond the observed range. For particular applications, however, such as the evaluation of stability data, extrapolation may be useful; see for example the ICH guideline Evaluation of Stability Data.
To allow extrapolation, right-click in the graph and select Allow extrapolation in the popup menu.
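The distinction between drawing the line over the observed range and extrapolating it can be sketched as follows. The function line_endpoints, its x_min and x_max parameters, and the stability-style numbers are hypothetical and only illustrate the idea.

```python
def line_endpoints(xs, a, b, x_min=None, x_max=None):
    """Return the two endpoints of the line y = a + b*x to be drawn.

    By default the line is clipped to the observed X range; passing
    x_min or x_max extends it beyond that range (extrapolation).
    """
    lo = min(xs) if x_min is None else x_min
    hi = max(xs) if x_max is None else x_max
    return (lo, a + b * lo), (hi, a + b * hi)


# Hypothetical stability data: assay (%) measured at 0 to 12 months,
# with an assumed fitted line y = 100.0 - 0.5*x
months = [0, 3, 6, 9, 12]
a, b = 100.0, -0.5

observed_only = line_endpoints(months, a, b)
extrapolated = line_endpoints(months, a, b, x_max=24)
print("observed range:", observed_only)
print("extrapolated:  ", extrapolated)
```

Any value read from the extrapolated part of the line rests on the assumption that the same relation continues to hold outside the observed range.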
When you select the option Residuals plot in the Regression line dialog box, the program will display a second window with the residuals plot. Residuals are the differences between the predicted values and the observed values for the dependent variable. The residuals plot allows a visual evaluation of the goodness of fit of the selected model or equation. Residuals may point to possible outliers (unusual values) in the data or to problems with the regression model. If the residuals display a systematic pattern, you should consider selecting a different regression model.
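How a pattern in the residuals reveals a poorly chosen model can be shown with a small sketch. Here a straight line is fitted to data that actually follow a curve. The sign convention (observed minus predicted) and the names fit_line and residuals are assumptions for this example.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


# Hypothetical data following y = x^2, deliberately fitted with a straight line
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [x ** 2 for x in xs]

a, b = fit_line(xs, ys)
predicted = [a + b * x for x in xs]

# Assumed convention: residual = observed value minus predicted value
residuals = [y - p for y, p in zip(ys, predicted)]

for x, r in zip(xs, residuals):
    print(f"x = {x:.0f}  residual = {r:+.2f}")
```

The residuals are positive at both ends of the X range and negative in the middle. Such a U-shaped pattern, rather than a random scatter around zero, suggests that an equation with curvature would fit these data better than a straight line.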