Multiple Linear Regression

See Also: Linear & Polynomial Regression  Nonlinear Regression  Data Table  Assessing Quality of Regression Models


This Polymath option will fit a linear function of the form: 

y(x1, x2, ..., xn) = a0 + a1*x1 + a2*x2 + ... + an*xn 

where a0, a1, ..., an are regression parameters, to a set of N tabulated values of x1, x2, ..., xn (independent variables) versus y (dependent variable). Note that the number of data points must be at least n+1 (thus N >= n+1), because the model contains n+1 parameters. The program calculates the coefficients a0, a1, ..., an by minimizing the sum of squares of the deviations between the calculated and the measured values of y.
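The least-squares fit described above can be sketched outside Polymath as follows. This is only an illustration in Python with NumPy, not Polymath's own implementation; the function name fit_multiple_linear and the synthetic data are assumptions made up for the example.

```python
import numpy as np

def fit_multiple_linear(X, y):
    """Return [a0, a1, ..., an] minimizing sum((y - a0 - a1*x1 - ... - an*xn)**2)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so that a0 acts as the free parameter.
    design = np.column_stack([np.ones(len(y)), X])
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coeffs

# Synthetic, noise-free data generated from y = 1 + 2*x1 + 3*x2
# (hypothetical values, chosen only to illustrate the call).
X_demo = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]])
y_demo = 1.0 + 2.0 * X_demo[:, 0] + 3.0 * X_demo[:, 1]

a_demo = fit_multiple_linear(X_demo, y_demo)
print(a_demo)  # a0, a1, a2 in that order
```

Here N = 5 data points are used for n = 2 independent variables, which satisfies N >= n+1.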

Multiple linear regressions are carried out from the Polymath Data Table. Select the "Regression" tab and then the "Multiple Linear" tab, as shown below for Example 4 in the Data Table window.



The options available in this window are the following:

Report: 
If this option is marked, a report is displayed showing the regression model, the numerical values and confidence intervals of the parameters, and additional statistical and other information.

Graph: 
If this option is marked, a graph showing the calculated points and the data points is prepared and displayed.

Store Model in column: 
If this option is marked, the regression model and the calculated parameters are stored in the next available empty column.

Residuals: 
If this option is marked, a graph of the residuals (the deviations between the data and the calculated values of the dependent variable) is prepared and displayed.

Dependent Variable: 
Select the dependent variable column name for regression from the pull-down menu.

Independent Variables: 
Select the independent variables for regression (indicated by x1, x2, ..., xn in the regression equation above). Note that holding down the Ctrl key while clicking with the left mouse button selects more than one variable.

Through origin: 
If this option is marked, the free parameter is set to zero in the regression model (a0 = 0).

Solve (pink arrow):
Carry out the regression and the additional calculations. 


Example:  Fitting a Multiple Linear Model to Heat of Hardening of Portland Cement versus Weight Percent of Components Data.

Consider the data that can be loaded into Polymath from Example 4 under the Examples drop-down menu in the "Data Table" window. Select hard_heat as the dependent variable and Wpc1, Wpc2, Wpc3 and Wpc4 as the independent variables. (Note that you must hold down the Ctrl key while clicking each variable name with the left mouse button.) It is known from theoretical considerations that the free parameter should be zero in this case, so mark the "Through origin" option. Also mark the "Residuals", "Graph" and "Store Model in column" options. The selections made are shown below.

Press the pink arrow to solve. A report containing the numerical results, a residual plot, and a graph showing the measured and calculated values are displayed.

Multiple Linear Regression Report

POLYMATH Report Heat of hardening of portland cement
Multiple linear regression 22-Nov-2004

Model: hard_heat = a1*Wpc1 + a2*Wpc2 + a3*Wpc3 + a4*Wpc4

Variable  Value      95% confidence
a1        2.189177   0.4182687
a2        1.154136   0.1082325
a3        0.7532949  0.3601112
a4        0.4885452  0.093483

General
Number of independent variables = 4
Regression not including a free parameter
Number of observations = 13

Statistics

R^2 0.9806563
R^2adj 0.9742084
Rmsd 0.5568439
Variance 5.822523

Source data points and calculated data points

    Wpc1  Wpc2  Wpc3  Wpc4  hard_heat  hard_heat calc  Delta hard_heat
1   7     26    6     60    78.7       79.164242       -0.46424225
2   1     29    15    52    74.3       72.362883       1.9371171
3   11    56    8     20    104.3      104.5098        -0.20980244
4   11    31    8     47    87.6       88.84713        -1.2471299
5   7     52    6     33    95.9       95.98105        -0.0810505
6   11    55    9     22    109.2      105.08605       4.113948
7   3     71    17    6     102.7      104.24845       -1.548447
8   1     31    22    44    72.5       76.035858       -3.5358576
9   2     54    18    22    93.1       91.008981       2.0910185
10  21    47    4     26    115.9      115.93244       -0.03243851
11  1     40    23    34    83.8       82.290922       1.5090779
12  11    66    9     12    113.3      112.89609       0.40390716
13  10    68    8     12    109.4      112.26189       -2.8618926
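As a cross-check of the report, the same through-origin fit can be sketched in Python with NumPy. This is an independent illustration, not the algorithm Polymath uses internally; the variable names are assumptions, and the data are copied from the table above. Omitting the column of ones from the design matrix corresponds to marking the "Through origin" option (a0 = 0).

```python
import numpy as np

# Columns Wpc1, Wpc2, Wpc3, Wpc4 copied from the report table.
X = np.array([
    [7, 26, 6, 60], [1, 29, 15, 52], [11, 56, 8, 20], [11, 31, 8, 47],
    [7, 52, 6, 33], [11, 55, 9, 22], [3, 71, 17, 6], [1, 31, 22, 44],
    [2, 54, 18, 22], [21, 47, 4, 26], [1, 40, 23, 34], [11, 66, 9, 12],
    [10, 68, 8, 12],
], dtype=float)

# Column hard_heat copied from the report table.
y = np.array([78.7, 74.3, 104.3, 87.6, 95.9, 109.2, 102.7,
              72.5, 93.1, 115.9, 83.8, 113.3, 109.4])

# No column of ones is added, so the model has no free parameter:
# hard_heat = a1*Wpc1 + a2*Wpc2 + a3*Wpc3 + a4*Wpc4
a, *_ = np.linalg.lstsq(X, y, rcond=None)

y_calc = X @ a      # corresponds to the "hard_heat calc" column
delta = y - y_calc  # corresponds to the "Delta hard_heat" column

print("a1..a4:", a)
print("largest |residual|:", np.abs(delta).max())
```

The printed coefficients and residuals can be compared with the Value and Delta hard_heat columns of the report; small differences in the last digits are to be expected from rounding.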

Multiple Linear Regression Graph

Multiple Linear Regression Residuals


See Assessing Quality of Regression Models for more information on whether the multiple linear regression represents the data appropriately and whether all the selected variables should be included in the regression.
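The statistics listed in the report can be recomputed from the hard_heat and hard_heat calc columns. The sketch below is in Python with NumPy; the formulas are assumptions, chosen because they are consistent with the values printed in the report, and are not quoted from Polymath's documentation. Note in particular that Rmsd is taken here as sqrt(SSE)/N, which differs from the more common root-mean-square definition sqrt(SSE/N).

```python
import numpy as np

# hard_heat (measured) and hard_heat calc (model) columns from the report.
y = np.array([78.7, 74.3, 104.3, 87.6, 95.9, 109.2, 102.7,
              72.5, 93.1, 115.9, 83.8, 113.3, 109.4])
y_calc = np.array([79.164242, 72.362883, 104.5098, 88.84713, 95.98105,
                   105.08605, 104.24845, 76.035858, 91.008981, 115.93244,
                   82.290922, 112.89609, 112.26189])

N = len(y)  # number of observations (13)
p = 4       # number of fitted parameters (a1..a4, no free parameter)

sse = np.sum((y - y_calc) ** 2)    # sum of squared residuals
sst = np.sum((y - y.mean()) ** 2)  # total sum of squares about the mean

variance = sse / (N - p)                      # assumed definition
rmsd = np.sqrt(sse) / N                       # assumed definition
r2 = 1.0 - sse / sst                          # assumed definition
r2_adj = 1.0 - (1.0 - r2) * (N - 1) / (N - p)  # assumed definition

print("R^2     ", r2)
print("R^2adj  ", r2_adj)
print("Rmsd    ", rmsd)
print("Variance", variance)
```

A small variance and an R^2 close to one, together with residuals that scatter randomly around zero, are the usual first indications that the model represents the data adequately.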