2.5 Quick hypothesis testing

Efficient computation of likelihood ratio tests requires updating the sum-of-squared-errors matrix without recomputing the actual regressions. Let $B$ denote the $k \times q$ matrix of estimates from the regression of $Y$ on $X$ and let $E$ denote the $n \times q$ matrix of errors from the regression. Expression (17) shows the restricted least squares estimate for the $j$th regression,

$$ B_R^{(j)} = B^{(j)} - (X'X)^{-1}R'[R(X'X)^{-1}R']^{-1}(RB^{(j)} - r) \qquad (17) $$

where $r$ is a $J \times 1$ vector, $R$ denotes a $J \times k$ matrix, and $J$ is the number of hypotheses imposed.
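As a concrete check, the restricted estimator in (17) can be verified numerically against the constraints it enforces. The following sketch uses illustrative dimensions and variable names (none taken from the paper):

```python
import numpy as np

# Hedged sketch of the textbook restricted least-squares update (eq. 17)
# for a single regression; dimensions and names are illustrative.
rng = np.random.default_rng(0)
n, k, J = 200, 5, 2              # observations, regressors, hypotheses
X = rng.standard_normal((n, k))
y = X @ np.arange(1.0, k + 1) + rng.standard_normal(n)

R = np.zeros((J, k))
R[0, 0] = 1.0                    # hypothesis: beta_0 = 0
R[1, 1] = 1.0                    # hypothesis: beta_1 = 0
r = np.zeros(J)

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y            # unrestricted estimate
A = R @ XtX_inv @ R.T            # small J x J matrix
b_r = b - XtX_inv @ R.T @ np.linalg.solve(A, R @ b - r)

# The restricted estimate satisfies the constraints exactly.
assert np.allclose(R @ b_r, r)
```

Note that only the $J \times J$ system involving $R(X'X)^{-1}R'$ needs to be solved beyond quantities already available from the unrestricted fit.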
Let $\Delta^{(j)}$ in (18) denote the change in the restricted least squares estimates versus the unrestricted estimates for the $j$th regression,

$$ \Delta^{(j)} = B_R^{(j)} - B^{(j)} = -(X'X)^{-1}R'[R(X'X)^{-1}R']^{-1}(RB^{(j)} - r) \qquad (18) $$
The inner product of any two vectors of restricted regression errors appears in (19),

$$ E_R^{(i)\prime} E_R^{(j)} = (E^{(i)} - X\Delta^{(i)})'(E^{(j)} - X\Delta^{(j)}) \qquad (19) $$

where $E_R^{(i)} = E^{(i)} - X\Delta^{(i)}$ and $E_R^{(j)} = E^{(j)} - X\Delta^{(j)}$ represent the vectors of restricted regression errors. Expanding (19) yields (20),

$$ E_R^{(i)\prime} E_R^{(j)} = E^{(i)\prime}E^{(j)} + \Delta^{(i)\prime}X'X\Delta^{(j)} \qquad (20) $$
where two of the four possible terms vanish due to the orthogonality between the residuals and the data enforced by least squares ($X'E^{(i)} = 0$). One can further expand the second term from (20),

$$ \Delta^{(i)\prime}X'X\Delta^{(j)} = (RB^{(i)} - r)'[R(X'X)^{-1}R']^{-1}R(X'X)^{-1}X'X(X'X)^{-1}R'[R(X'X)^{-1}R']^{-1}(RB^{(j)} - r). $$

Fortunately, many terms in the above expression cancel, which leaves the simple expression in (21) for the increase in error arising from the restrictions,

$$ E_R^{(i)\prime} E_R^{(j)} - E^{(i)\prime}E^{(j)} = (RB^{(i)} - r)'[R(X'X)^{-1}R']^{-1}(RB^{(j)} - r) \qquad (21) $$
Finally, define the $q \times q$ matrix of cross-products of the restricted least-squares regression errors as $E_R'E_R$, with $ij$th element $E_R^{(i)\prime}E_R^{(j)}$; the restricted sum of squared errors, $SSE_R(\alpha)$, then follows from the same quadratic form in $E_R'E_R$ used to compute the unrestricted sum of squared errors from $E'E$.
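A small numerical sketch (illustrative names and dimensions, not from the paper) confirms the update in (19)–(21): the restricted cross-product matrix equals the unrestricted one plus a correction built entirely from small matrices, with no further pass over the $n$-dimensional data:

```python
import numpy as np

# Hedged numerical check of the cross-product update: E_R'E_R should equal
# E'E plus a J x J / q x q correction term.  All names are illustrative.
rng = np.random.default_rng(1)
n, k, q, J = 300, 4, 3, 2
X = rng.standard_normal((n, k))
Y = rng.standard_normal((n, q))          # q "dependent" columns
R = np.hstack([np.eye(J), np.zeros((J, k - J))])
r = np.zeros((J, 1))

XtX_inv = np.linalg.inv(X.T @ X)
B = XtX_inv @ X.T @ Y                    # k x q unrestricted estimates
E = Y - X @ B                            # n x q unrestricted errors

D = R @ B - r                            # J x q: one column per regression
A = R @ XtX_inv @ R.T                    # J x J
update = D.T @ np.linalg.solve(A, D)     # q x q correction, no O(n) work

# Brute force: recompute the restricted errors directly and compare.
B_R = B - XtX_inv @ R.T @ np.linalg.solve(A, D)
E_R = Y - X @ B_R
assert np.allclose(E_R.T @ E_R, E.T @ E + update)
```

The brute-force recomputation of $E_R$ is included only to verify the identity; in practice the correction term alone suffices.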
Computation of the unrestricted regressions means the quantities $B$, $E'E$, and the Cholesky factors of $X'X$ (even if computed from the QR algorithm) are already known. However, the $J \times J$ matrix $R(X'X)^{-1}R'$ requires $O(J^3)$ operations for its decomposition. Typically, $J$ will be small: testing for the effect of the deletion of a single variable means $J$ equals 1, and testing a variable together with its associated lags means $J$ equals 1 plus the number of independent-variable lag terms. Since computing the increase in errors from the restrictions and re-solving the first-order conditions require operation counts that depend only on $J$ and $q$, deviance (i.e., likelihood ratio) tests do not depend upon $n$ and thus require very little time.
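To illustrate the computational point, the following hedged sketch computes a Gaussian deviance statistic for deleting a single variable ($J = 1$) via the cross-product update; once the unrestricted fit is in hand, the test itself involves only small-matrix arithmetic (names are illustrative, not from the paper):

```python
import numpy as np

# Sketch: deviance (likelihood-ratio) statistic for one deleted variable,
# computed from the restriction correction rather than a second regression.
rng = np.random.default_rng(2)
n, k = 500, 6
X = rng.standard_normal((n, k))
y = X @ rng.standard_normal(k) + rng.standard_normal(n)

R = np.zeros((1, k))
R[0, 2] = 1.0                    # J = 1: delete variable 2
r = np.zeros(1)

# Unrestricted fit: the only O(n) work in the whole procedure.
XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y
e = y - X @ b
sse_u = e @ e

# Restricted SSE via the update: J x J arithmetic only.
d = R @ b - r
sse_r = sse_u + d @ np.linalg.solve(R @ XtX_inv @ R.T, d)

deviance = n * np.log(sse_r / sse_u)   # asymptotically chi-squared, J dof
assert sse_r >= sse_u and deviance >= 0.0
```

Because the correction term is a quadratic form in a positive-definite $J \times J$ matrix, $SSE_R \ge SSE_U$ holds by construction and the deviance is always non-negative.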
One advantage of the likelihood-based MESS methodology noted in the introduction is the ability to accommodate Bayesian extensions. An interesting point here is that Bayesian logic implies the significance level should be a decreasing function of sample size: the region of rejection implied by the Bayes factor varies with sample size, in contrast to the usual classical region, which is held constant and independent of sample size. This distinction may be important for very large models of the type discussed here. The Bayes factor in favor of hypothesis $H_1$ relative to hypothesis $H_2$ is simply the ratio of the marginal likelihoods $p(y \mid H_1)/p(y \mid H_2)$, where computation of each marginal likelihood requires $p(y \mid H_i) = \int L(\theta; y)\,\pi(\theta \mid H_i)\,d\theta$, where $\pi$ denotes the prior distribution. Leamer (1978, 1983) discusses these issues.
This suggests that efficient computation of the likelihood made possible by the matrix exponential specification may enable Bayesian approaches to hypothesis testing that allow the level of significance to vary with the sample size. Since the rejection region shrinks as the sample size grows, Bayesian testing procedures should lead to more parsimonious specifications in cases involving large sample sizes.
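One standard way to see this sample-size dependence is the Schwarz (BIC) approximation to the log Bayes factor for a Gaussian regression restriction. This is a generic large-sample approximation, not the paper's exact marginal-likelihood computation; the function name is illustrative:

```python
import numpy as np

def log_bayes_factor_bic(sse_r, sse_u, n, J):
    """Schwarz/BIC approximation to the log Bayes factor in favor of a
    J-dimensional restriction in a Gaussian regression (an illustrative
    approximation, not the paper's procedure)."""
    return -0.5 * n * np.log(sse_r / sse_u) + 0.5 * J * np.log(n)

# The restriction is rejected when the log Bayes factor turns negative,
# i.e. when the deviance n*log(SSE_R/SSE_U) exceeds J*log(n) -- a
# threshold that grows with n, unlike a fixed classical critical value.
for n in (100, 10_000, 1_000_000):
    print(n, 1 * np.log(n))    # deviance threshold for J = 1
```

The $\frac{J}{2}\ln n$ penalty term is what makes the implied rejection region shrink as the sample grows, favoring more parsimonious specifications in large samples.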

See Gentle (1998, p. 166) for the standard restricted least squares estimator as well as some other techniques for computing these estimates.