4 Conclusion

Maximum likelihood estimation based on the matrix exponential spatial specification (MESS) introduced here was shown to be computationally superior to most spatial estimators. Conditional upon formation of the spatial weight matrix, it requires O(n) operations for n observations (the same as OLS), and forming the spatial weight matrix itself can take as few as O(n log(n)) operations. When used in conjunction with common approaches to specifying spatial influences, the MESS causes the log-determinant term in the spatial likelihood function to vanish. The MESS therefore provides an unusual situation (in spatial problems) where non-linear least-squares and maximum likelihood methods yield the same estimates. The resulting simplification of the log-likelihood admits a unique closed-form solution for the estimates, which greatly accelerates computation.
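The vanishing log-determinant can be checked numerically. The sketch below is illustrative only and is not the code used for the reported results: it assumes a row-stochastic nearest-neighbor weight matrix W with a zero diagonal, uses a truncated power series for the matrix exponential, and relies on the identity det(exp(A)) = exp(tr(A)). The function names, the simulated coordinates and data, and the value of alpha are all hypothetical.

```python
import numpy as np

def knn_weights(coords, m):
    """Row-stochastic weights: each point's m nearest neighbors get weight 1/m."""
    n = coords.shape[0]
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)            # a point is never its own neighbor
    idx = np.argsort(d, axis=1)[:, :m]     # indices of the m closest points
    W = np.zeros((n, n))
    W[np.arange(n)[:, None], idx] = 1.0 / m
    return W

def mat_exp(A, terms=30):
    """Truncated power series I + A + A^2/2! + ... (adequate for small norms)."""
    S = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for i in range(1, terms):
        term = term @ A / i
        S = S + term
    return S

def concentrated_loglik(alpha, W, y, X):
    """Concentrated log-likelihood up to a constant; no log-determinant term."""
    Sy = mat_exp(alpha * W) @ y
    beta, *_ = np.linalg.lstsq(X, Sy, rcond=None)
    e = Sy - X @ beta
    return -0.5 * len(y) * np.log(e @ e)

rng = np.random.default_rng(0)
coords = rng.uniform(size=(200, 2))        # hypothetical locations
W = knn_weights(coords, m=5)
alpha = -1.5                               # hypothetical parameter value

# ln|exp(alpha W)| = alpha * tr(W) = 0 because W has a zero diagonal.
sign, logdet = np.linalg.slogdet(mat_exp(alpha * W))
print("trace(W):", np.trace(W), "log-determinant:", logdet)

# With the determinant term gone, maximizing the likelihood over alpha
# is the same as minimizing the sum of squared errors.
X = np.column_stack([np.ones(200), rng.standard_normal(200)])
y = rng.standard_normal(200)
ll = concentrated_loglik(alpha, W, y, X)
```

Because the concentrated log-likelihood here is a monotone function of the sum of squared errors alone, any least-squares minimizer over alpha is also a likelihood maximizer under these assumptions.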
As an illustration of the speed gained through the use of these techniques, it took only 3.36 seconds using Matlab to compute a spatial autoregression involving 57,647 observations. We also demonstrated that the parameter estimates from the MESS model were almost identical to those from a spatial autoregressive (SAR) model and that the inferences were identical, while the MESS model was over 1,000 times faster to compute than the more traditional SAR model. The Fortran code ran 6 to 7 times faster than the Matlab code for the 57,647 observation census dataset. In other experiments (not reported here) we found that the MESS model, when specified with spatially lagged explanatory variables, produces estimates and inferences similar to those from the spatial Durbin model introduced in Anselin (1988).
This speed, along with the simpler MESS log-likelihood, facilitates maximization over a host of spatial parameter settings that can be used to vary the nature and extent of spatial influences in the model. The application by Bell and Bockstael (2000) provides a compelling motivation for this type of exploration. It may also enable Bayesian model selection criteria to be used in place of traditional likelihood ratio tests, which would allow the rejection regions to vary with the sample size. This has the potential to produce more parsimonious global model specifications because the rejection regions would narrow with larger sample sizes. Recent literature in the area of ‘Bayesian model averaging’ suggests that another potential role for Bayesian methods may be to produce a single posterior model that averages over alternative specifications associated with alternative spatial parameter settings. This would greatly facilitate reporting of results that are not conditional on a particular setting for the decay in spatial weights or the number of neighbors employed. A final point is that the simpler MESS log-likelihood may make it an easier model to use in the theoretical derivations needed to produce Bayesian and other spatial econometric extensions.
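As one possible illustration of a criterion whose rejection region depends on the sample size, the sketch below applies the Schwarz (BIC) criterion to the two models in Table 2. The choice of BIC is an assumption, since the text does not name a specific criterion, and the values 5 and 12 from Table 2 are taken here to be the parameter counts of the aspatial and spatial models.

```python
import math

n = 57_647                                  # observations (Table 2)
models = {                                  # (maximum log-likelihood, parameters)
    "aspatial": (-266_505.2, 5),
    "spatial": (-228_850.4, 12),
}

def bic(loglik, k, n):
    """Schwarz criterion: smaller is better; the penalty grows with ln(n)."""
    return -2.0 * loglik + k * math.log(n)

scores = {name: bic(ll, k, n) for name, (ll, k) in models.items()}

# Likelihood ratio statistic for the spatial model against the aspatial one.
lr_stat = 2.0 * (models["spatial"][0] - models["aspatial"][0])

# Under BIC the larger model is preferred only when the LR statistic exceeds
# (extra parameters) * ln(n), a threshold that rises with the sample size.
extra = models["spatial"][1] - models["aspatial"][1]
bic_threshold = extra * math.log(n)

for name, score in scores.items():
    print(f"{name}: BIC = {score:,.1f}")
print(f"LR statistic = {lr_stat:,.1f}, BIC threshold = {bic_threshold:.1f}")
```

A fixed-size likelihood ratio test compares the same statistic with a chi-squared critical value that does not change with n, which is the contrast drawn in the paragraph above.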
The computational advantages of the MESS should prove useful in solving a number of problems that arise in applied spatial econometric analysis. First, the MESS should provide an easily calculated benchmark against which to gauge the performance of other spatial estimators. In other words, MESS can serve as a more sophisticated null hypothesis than the typical assumption of spatial independence. In addition, since MESS provides a unique optimal estimate, it could help identify when another, more complex model has become trapped in a local optimum.
Second, the computational efficiency for large problems means that the MESS model can serve as a global description for very large data sets. Such global descriptions can help identify smaller regions where it may be of interest to apply more computationally costly techniques for analysis. Policy decisions often require global descriptions, so a collection of regional descriptions based on smaller subsets of the data set may not serve the desired purpose. This could become particularly important with the pending release of the year 2000 Census that will contain nearly 250,000 observations at the block-group microlevel.
A third area of application opened up by this approach is the computation of diagnostic statistics that have traditionally been problematic in the maximum likelihood spatial estimation setting. We provided a brief demonstration of the application of these statistics, but the potential of these diagnostics in large sample problems represents a relatively unexplored area for future research.
Another area where the MESS method could be useful is Monte Carlo experiments. Research examining the performance characteristics of alternative spatial estimation methodologies has been limited to relatively small data sets because of the computational burdens. Since both simulation and estimation proceed rapidly in the case of the MESS, this should facilitate Monte Carlo experiments based on larger, more realistic data sets.
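A small Monte Carlo experiment of this kind might be sketched as follows. Everything in the block is an assumption made for illustration: the ring-shaped weight matrix, the parameter values, the truncated power series for the matrix exponential, and the grid search, which stands in for the closed-form solution. The sketch simulates y = exp(-alpha W)(X beta + e) and estimates alpha by minimizing the sum of squared errors, written as a quadratic form in the series coefficients.

```python
import math
import numpy as np

def ring_weights(n, m):
    """Row-stochastic W on a ring: m neighbors on each side share equal weight."""
    W = np.zeros((n, n))
    for i in range(n):
        for k in range(1, m + 1):
            W[i, (i - k) % n] = W[i, (i + k) % n] = 1.0 / (2 * m)
    return W

def power_terms(W, y, q):
    """Columns y, Wy, W^2 y, ..., W^(q-1) y."""
    cols = [y]
    for _ in range(q - 1):
        cols.append(W @ cols[-1])
    return np.column_stack(cols)

def series_coeffs(alpha, q):
    """Coefficients alpha^i / i! of the truncated series for exp(alpha W)."""
    return np.array([alpha**i / math.factorial(i) for i in range(q)])

def simulate(W, X, beta, alpha, rng, q=20):
    """Draw y = exp(-alpha W)(X beta + e) with standard normal errors."""
    z = X @ beta + rng.standard_normal(X.shape[0])
    return power_terms(W, z, q) @ series_coeffs(-alpha, q)

def estimate_alpha(W, y, X, grid, q=20):
    """Grid minimizer of SSE(alpha) = v'Qv, with v the series coefficients."""
    Y = power_terms(W, y, q)
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
    MY = Y - X @ coef                       # residuals of each column on X
    Q = MY.T @ MY
    sse = [series_coeffs(a, q) @ Q @ series_coeffs(a, q) for a in grid]
    return grid[int(np.argmin(sse))]

rng = np.random.default_rng(42)
n, true_alpha = 400, -1.0                   # hypothetical design
W = ring_weights(n, m=2)
X = np.column_stack([np.ones(n), rng.standard_normal(n)])
beta = np.array([1.0, 2.0])
grid = np.linspace(-3.0, 1.0, 81)

draws = np.array([
    estimate_alpha(W, simulate(W, X, beta, true_alpha, rng), X, grid)
    for _ in range(200)
])
print("mean estimate:", draws.mean(), "std. dev.:", draws.std())
```

Each replication needs only a few sparse-style matrix-vector products and one small quadratic form per grid point, which is why larger sample sizes remain feasible in this kind of experiment.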
References
Anselin, L., Spatial Econometrics: Methods and Models, Dordrecht: Kluwer Academic Publishers, 1988.
Barry, Ronald, and R. Kelley Pace, “Kriging with Large Data Sets Using Sparse Matrix Techniques,” Communications in Statistics: Computation and Simulation, Volume 26, Number 2, 1997, p. 619-629.
Barry, Ronald, and R. Kelley Pace, “A Monte Carlo Estimator of the Log Determinant of Large Sparse Matrices,” Linear Algebra and its Applications, Volume 289, Number 1-3, 1999, p. 41-54.
Bell, Kathleen P., and Nancy E. Bockstael, “Applying the Generalized-Moments Estimation Approach to Spatial Problems Involving Microlevel Data,” Review of Economics and Statistics, Volume 82, Number 1, 2000, p. 72-82.
Bogart, William T., The Economics of Cities and Suburbs, Upper Saddle River: Prentice Hall, 1998.
Chiu, Tom Y.M., Tom Leonard, and Kam-Wah Tsui, “The Matrix-Logarithmic Covariance Model,” Journal of the American Statistical Association, 91, 1996, p. 198-210.
Christensen, Ronald, Plane Answers to Complex Questions, Second Edition, New York: Springer-Verlag, 1996.
Christensen, Ronald, Wesley Johnson, and Larry Pearson, “Prediction Diagnostics for Spatial Linear Models,” Biometrika, Volume 79, 1992, p. 583-591.
Eppstein, D., M.S. Paterson, and F.F. Yao, “On Nearest-Neighbor Graphs,” Discrete and Computational Geometry, Volume 17, 1997, p. 263-282.
Gentle, James, Numerical Linear Algebra for Applications in Statistics, New York: Springer-Verlag, 1998.
Haining, Robert, “Diagnostics for Regression Modeling in Spatial Econometrics,” Journal of Regional Science, Volume 34, 1994, p. 325-341.
Hendry, David, Adrian Pagan, and Denis Sargan, “Dynamic Specification,” in: Griliches, Z., and M. Intriligator, eds. Handbook of Econometrics, Volume 2, Amsterdam: North-Holland, 1984, p. 1023-1100.
Horn, Roger, and Charles Johnson, Matrix Analysis, New York: Cambridge University Press, 1993.
Kelejian, H., and I.R. Prucha, “A Generalized Spatial Two-Stage Least Squares Procedure for Estimating a Spatial Autoregressive Model with Autoregressive Disturbances,” Journal of Real Estate Finance and Economics, Volume 17, Number 1, 1998, p. 99-121.
Kelejian, H., and I.R. Prucha, “A Generalized Moments Estimator for the Autoregressive Parameter in a Spatial Model,” International Economic Review, Volume 40, 1999, p. 509-533.
Lay, David, Linear Algebra and its Applications, second edition, New York: Addison Wesley Longman, 1997.
Leamer, Edward E., “Regression-Selection Strategies and Revealed Priors,” Journal of the American Statistical Association, Volume 73, 1978, p. 507-510.
Leamer, Edward E., “Model Choice,” in: Griliches, Z., and M.D. Intriligator, eds. Handbook of Econometrics, Volume 1, Amsterdam: North-Holland, 1983, p. 286-330.
LeSage, James P., “Bayesian Estimation of Spatial Autoregressive Models,” International Regional Science Review, Volume 20, Number 1&2, 1997, p. 113-129.
LeSage, James P., “Bayesian Estimation of Limited Dependent Variable Spatial Autoregressive Models,” Geographical Analysis, Volume 32, Number 1, 2000, p. 19-35.
Mardia, K.V., and A.J. Watkins, “On Multimodality of the Likelihood in the Spatial Linear Model,” Biometrika, Volume 76, 1989, p. 289-295.
Martin, Richard J., “Leverage, Influence, and Residuals in Regression Models when Observations are Correlated,” Communications in Statistics: Theory and Methods, Volume 21, 1992, p. 1183-1212.
Myers, Jeffrey, Geostatistical Error Management, New York: Van Nostrand Reinhold, 1997.
Ord, J.K., “Estimation Methods for Models of Spatial Interaction,” Journal of the American Statistical Association, Volume 70, 1975, p. 120-126.
Pace, R. Kelley, and Ronald Barry, “Simulating Mixed Regressive Spatially Autoregressive Estimators,” Computational Statistics, Volume 13, Number 3, 1998, p. 397-418.
Pace, R. Kelley, and Dongya Zou, “Closed-Form Maximum Likelihood Estimates of Nearest Neighbor Spatial Dependence,” Geographical Analysis, Volume 32, Number 2, April 2000.
Press, William, Saul Teukolsky, William Vetterling, and Brian Flannery, Numerical Recipes in Fortran 77, second edition, New York: Cambridge University Press, 1996.
Ripley, Brian D., Statistical Inference for Spatial Processes, Cambridge: Cambridge University Press, 1988.
Sidje, Roger B., “Expokit: a Software Package for Computing Matrix Exponentials,” ACM Transactions on Mathematical Software, Volume 24, 1998, p. 130-156.
Strang, Gilbert, Linear Algebra and its Applications, New York: Academic Press, 1976.
Warnes, J.J., and Brian Ripley, “Problems with Likelihood Estimation of Covariance Functions of Spatial Gaussian Processes,” Biometrika, Volume 74, 1987, p. 640-642.
Table 2: Spatial and Aspatial Regression Models

Variables                               Aspatial Model    Spatial Model
Intercept                                        1.224           -0.151
ln(Land Area)                                   -0.085           -0.003
ln(Land Area)                                                    -0.017
Deviance                                       9,379.1          1,218.8
ln(Population)                                   0.115            0.022
ln(Population)                                                    0.030
Deviance                                       1,358.6            366.6
ln(Per Capita Income)                            1.084            0.677
ln(Per Capita Income)                                            -0.463
Deviance                                      43,355.2         29,764.3
ln(Age)                                         -0.127           -0.138
ln(Age)                                                           0.127
Deviance                                       1,175.2          2,289.5
# of neighbors                                                       30
Deviance (# of neighbors = 29)                                     2.72
Geometric decay                                                    0.90
Deviance (geometric decay = 0.95)                                612.82
Deviance (geometric decay = 0.85)                              1,240.52
Autoregressive parameter                                         -1.673
Deviance (autoregressive parameter)                            64,450.6
Number of observations                          57,647           57,647
Number of parameters                                 5               12
Maximum Log-likelihood                      -266,505.2       -228,850.4

Figure 1: Scaled Log-likelihood vs. Number of Neighbors across Differing
Figure 2: US Census Tract Locations with Smallest (O) and Largest () Leverage Observations Identified
Figure 3: US Census Tract Locations with Largest (O) and Smallest () Delete-1 Identified