5. Testing and Validation

The objective of testing and validation is to evaluate the reliability and validity of the model we developed above.

The regression output produced by Excel has three parts: Regression Statistics, the Analysis of Variance, and the Parameter Estimates. The Analysis of Variance (ANOVA) reports the overall significance of the model, while the Parameter Estimates report the significance of each individual parameter in the model.

1)      Statistics

Before we validate our model, we want to spend a little time on some key statistics that help determine how well a model fits the data.

a.       R2 (the coefficient of determination) measures how well the independent variables explain the dependent variable. It represents the proportion of the variation in the dependent variable that can be explained by the variation in the independent variables. A value closer to 1.00 indicates a stronger relationship.

b.      Standard error of the regression is an estimate of the standard deviation of the observed values around the regression line, that is, the typical size of a residual.

c.       Coefficients measure the impact of the independent variables on the outcome. If a relationship exists between the independent variables and the dependent variable, the coefficients should be non-zero. The null hypothesis of the regression analysis is that the coefficients of all independent variables are zero; the intercept is not restricted.

d.      Significance of F is the significance probability (p-value) of the F-test for the model as a whole. If it is greater than the significance level, alpha, we cannot reject the null hypothesis, which means the model is not significant. If it is smaller than alpha, the model is significant: at least one independent variable helps explain the dependent variable.

e.       P-value measures the significance of each individual coefficient. We want p-values well below alpha so that we can be confident that a regression relationship exists between that variable and the dependent variable.
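To make the statistics above concrete, the following is a minimal sketch of how R2, the standard error of the regression, the F statistic, and the t statistic of a coefficient can be computed for a one-variable regression. It is not the Excel procedure used in this report: the function name and the sample data are hypothetical, and the significance probabilities themselves (which require the F and t distributions) are left to Excel.

```python
import math

def simple_regression_stats(x, y):
    """Fit y = intercept + slope * x by least squares and return fit statistics."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    fitted = [intercept + slope * xi for xi in x]
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # unexplained variation
    sst = sum((yi - mean_y) ** 2 for yi in y)               # total variation
    ssr = sst - sse                                         # explained variation

    df_resid = n - 2                           # two estimated parameters
    r_squared = ssr / sst                      # share of variation explained
    std_error = math.sqrt(sse / df_resid)      # standard error of the regression
    f_stat = ssr / (sse / df_resid)            # one numerator degree of freedom
    t_slope = slope / (std_error / math.sqrt(sxx))  # slope divided by its standard error

    return {
        "slope": slope,
        "intercept": intercept,
        "r_squared": r_squared,
        "std_error": std_error,
        "f_stat": f_stat,
        "t_slope": t_slope,
    }

# Made-up illustrative data, not taken from this report.
stats = simple_regression_stats([1, 2, 3, 4, 5], [2.1, 3.9, 6.2, 7.8, 10.1])
```

With a single independent variable, the F statistic equals the square of the slope's t statistic, which is why the overall test and the coefficient test agree in that case.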

2)      Average Firm Demand Time Series

Table 1 shows a Significance Probability of 0.0004267 and an R square of 0.4703. This indicates that the model is significant but the relationship is not very strong: the model explains less than half of the variation in average firm demand. Other factors, such as average price, average advertising expenditures, and average R&D, influence the demand as well. Our next step is to test the relationship between these other factors and the residuals from the time series analysis.
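The idea of carrying residuals from the time series step into a second regression can be sketched as follows. This assumes the time series model is a straight-line trend over the period number, which the report does not state; the function name and the demand figures are hypothetical.

```python
def trend_residuals(demand):
    """Fit demand = a + b * t for t = 1..n and return (a, b, residuals)."""
    n = len(demand)
    periods = list(range(1, n + 1))
    mean_t = sum(periods) / n
    mean_d = sum(demand) / n

    # Least-squares slope and intercept of the trend line (assumed linear).
    b = sum((t - mean_t) * (d - mean_d) for t, d in zip(periods, demand)) / sum(
        (t - mean_t) ** 2 for t in periods
    )
    a = mean_d - b * mean_t

    # Residual = observed demand minus the demand predicted by the trend.
    residuals = [d - (a + b * t) for t, d in zip(periods, demand)]
    return a, b, residuals

# Made-up illustrative demand series, not taken from this report.
a, b, residuals = trend_residuals([100, 104, 103, 110, 112, 111])
```

The residuals hold the part of demand that the trend does not explain, so they can serve as the dependent variable when testing factors such as average price.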

3)      Average Firm Demand Residual Regression Analysis

To reach the final result, we ran the residual regression analysis six times, each time eliminating one variable that was not significantly related to the dependent variable. The final result, listed in Table 2, shows that Average Price is the only variable left in the model, and it has an inverse relationship with the Residuals. The detailed step-by-step regression analysis can be found in the file Average_Demand_Regression.xls.
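The run-and-eliminate procedure described above resembles backward elimination, and its control flow can be sketched as below. This is an illustration only: `fit_pvalues` is a hypothetical callable standing in for one Excel regression run, the alpha of 0.05 is an assumption (the report does not state the level used), and the p-values in the stub are invented.

```python
def backward_eliminate(variables, fit_pvalues, alpha=0.05):
    """Repeatedly refit and drop the least significant variable.

    fit_pvalues(names) must return a dict mapping each name to its p-value.
    Returns (kept, dropped), with dropped in order of removal.
    """
    remaining = list(variables)
    dropped = []
    while remaining:
        pvalues = fit_pvalues(remaining)              # one regression run
        worst = max(remaining, key=lambda v: pvalues[v])
        if pvalues[worst] <= alpha:                   # everything left is significant
            break
        remaining.remove(worst)                       # eliminate one variable per run
        dropped.append(worst)
    return remaining, dropped

# Invented p-values for illustration; a real run would refit the model each time.
MADE_UP_PVALUES = {
    "Average Price": 0.01,
    "Average Advertising": 0.40,
    "Average R&D": 0.25,
}

def stub_fit(names):
    return {name: MADE_UP_PVALUES[name] for name in names}

kept, dropped = backward_eliminate(list(MADE_UP_PVALUES), stub_fit)
```

Dropping only one variable per run matters because p-values change once a variable leaves the model, so each remaining variable is re-evaluated before the next elimination.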

4)      Normalized Market Share Regression Analysis

The regression analysis shows that all the independent variables are significant based on their p-values. The adjusted R2 (0.698) also indicates a strong relationship between the independent variables and NSOM. Given the high adjusted R2 and low standard error of the NSOM model, the regression equation appears to provide a good fit to the data. The relationship of each independent variable to the dependent variable, NSOM, is also logical. This model should provide an accurate forecast of future NSOM.
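For reference, adjusted R2 discounts R2 for the number of independent variables, which is why it is the better measure for a model with several predictors. A minimal sketch of the standard formula follows; the function name and the input values are hypothetical and are not the figures behind the 0.698 reported here.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Adjust R2 for the number of predictors (the intercept is not counted)."""
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)

# Illustrative inputs only.
example = adjusted_r_squared(0.75, n_obs=21, n_predictors=4)
```

Adding a variable always raises or preserves R2, but it raises adjusted R2 only when the variable improves the fit by more than chance alone would suggest.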

