Abstract:
The objective of this research is to compare model-combining methods for multiple linear regression models. Three combining methods are studied: least absolute errors (LAE), combination by bootstrap (BO) and adaptive regression by mixing (ARM). The models to be combined are built by four procedures: all possible regressions, forward selection, backward elimination and stepwise regression. The mean absolute percentage error (MAPE) is used as the criterion for deciding which combining method is best. The number of independent variables is 3, 5 or 7. In the 3-variable case, the correlation between the independent variables x₁ and x₂ is 0.3, 0.5 or 0.8, and the sample sizes are 14, 20, 30, 40 and 50. In the 5-variable case, the correlations between x₁ and x₂ and between x₄ and x₅ are (0.3, 0.3), (0.4, 0.6) or (0.7, 0.9), and the sample sizes are 20, 30, 40 and 50. In the 7-variable case, the correlations between x₁ and x₂, between x₄ and x₅, and between x₆ and x₇ are (0.3, 0.3, 0.3), (0.4, 0.5, 0.6) or (0.7, 0.8, 0.9), and the sample sizes are 30, 40 and 50. The random errors are normally distributed with mean 0 and standard deviation 5. The study uses Monte Carlo simulation with 1,000 replications of each situation.

The results show that the factors affecting the average MAPE of every combining method are the correlations among the independent variables and the sample size: the average MAPE tends to increase as the correlations increase and to decrease as the sample size increases. Comparing the average MAPE of the three combining methods, the researcher concludes that the BO method is best in every case. In general, the model-building procedure that receives the maximum weight depends on the degree of multicollinearity among the independent variables: when multicollinearity is low, all possible regressions receives the maximum weight; when it is medium, backward elimination receives the maximum weight; and when it is high, stepwise regression receives the maximum weight.
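To make the evaluation criterion and the simulation design concrete, the following is a minimal sketch of one replication of the 3-variable case, using the standard MAPE definition (mean of |y − ŷ|/|y|, expressed in percent). The abstract does not report the true regression coefficients or the exact covariance structure, so the values of `beta` and the unit-variance, block-correlated design below are illustrative assumptions, not the study's actual settings.

```python
import numpy as np

rng = np.random.default_rng(0)

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))

def simulate_3var(n, rho, beta=(1.0, 2.0, 3.0, 4.0), sigma=5.0):
    """One sample from the 3-variable design: x1 and x2 correlated with
    coefficient rho, x3 independent, errors ~ N(0, sigma^2) with sigma = 5.
    The coefficients `beta` are illustrative placeholders."""
    cov = np.array([[1.0, rho, 0.0],
                    [rho, 1.0, 0.0],
                    [0.0, 0.0, 1.0]])
    X = rng.multivariate_normal(np.zeros(3), cov, size=n)
    eps = rng.normal(0.0, sigma, size=n)
    y = beta[0] + X @ np.array(beta[1:]) + eps
    return X, y

# Example: generate one sample, fit ordinary least squares, and score by MAPE.
X, y = simulate_3var(n=30, rho=0.5)
design = np.column_stack([np.ones(len(y)), X])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
y_hat = design @ coef
print(round(mape(y, y_hat), 2))
```

In the study itself, each of the 1,000 replications would fit candidate models via the four building procedures, form the LAE, BO and ARM combinations, and record their MAPE values; the sketch above covers only the shared data-generation and scoring steps.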