Spatial Statistics Articles

 

Many users estimate spatial autoregressions to perform inference on regression parameters. However, as the sample size or the number of potential models rises, computational exigencies make exact computation of likelihood-based inferences tedious or even impossible. To address this problem, we introduce a lower bound on the likelihood ratio test that allows users to conduct conservative maximum likelihood inference while avoiding the computationally demanding task of computing exact maximum likelihood point estimates. This form of inference, known as likelihood dominance, performs almost as well as exact likelihood inference for the empirical examples examined. We illustrate the utility of the technique by performing likelihood-based inference on parameters from a spatial autoregression involving 890,091 observations in less than a minute (given the spatial weight matrix).

Pace and LeSage (2003), "Likelihood Dominance Spatial Inference," Geographical Analysis (pdf)

 

This chapter reviewed some of our techniques for handling spatial dependence in large data sets.

LeSage and Pace (2001), "Spatial Dependence in Data Mining" in Data Mining for Scientific and Engineering Applications, Kluwer

 

Spatial autocorrelation among regression residuals can arise from functional form misspecification. To address both functional form and spatial considerations, this paper simultaneously transforms the dependent and independent variables via B-splines as well as fitting a spatial autoregression (i.e., a spatial additive model). The log-likelihood contains two log-determinants, one for the dependent variable functional form transformation and one for the spatial dependent variable transformation. The paper applies these transformations to 11,006 observations on individual houses in Baton Rouge. The combined transformations greatly improve the pattern of the residuals and reduce their magnitude. On a Pentium Pro 200 MHz PC it took under a minute to calculate the spatial log-determinant and under 10 seconds to calculate the estimated joint transformations. I plan to include this type of function in the Spatial Statistics Toolbox.

Pace, Barry, Slawson, and Sirmans (forthcoming), "Simultaneous Spatial and Functional Form Transformations," Advances in Spatial Econometrics, Florax and Anselin, Editors.

 

Individual data arise over space as well as time, and usually show dependence in all of these dimensions. Modeling such dependence poses difficulties due to the different scales in the dimensions, and due to the unidirectional nature of time compared to the omnidirectional nature of space. This paper provided some computationally feasible means of modeling such spatial-temporal dependence for large data sets. In terms of performance, the spatial-temporal regression with 14 variables displayed 8% lower SSE than a regression using 211 variables attempting to control for the housing characteristics, time, and space via continuous and indicator variables. One-step ahead forecasts document the utility of the proposed spatial-temporal model. This estimator is implemented in the Spatial Statistics Toolbox 2.0.

Pace et al. (2000), "A Method for Spatial-Temporal Forecasting with an Application to Real Estate Prices," International Journal of Forecasting (pdf)

(This is on Elsevier's site, and may require payment for institutions which do not subscribe to their services)

 

Using nearest neighbor spatial dependence leads to a closed form for the eigenvalues and hence the log-determinant of the spatial weight matrix. In turn, this simple result leads to a closed-form spatial maximum likelihood estimator. Hence, one can find the neighbors and compute the maximum likelihood estimates for 100,000 observations in under one minute (on a Pentium III 500 MHz machine)! Using OLS for data known a priori to exhibit spatial dependence provides an easily demolished "straw man" null hypothesis. The nearest neighbor maximum likelihood estimator can serve as a more realistic null hypothesis for such spatial data. This estimator is implemented in the Spatial Statistics Toolbox 1.1.

Pace and Zou (2000), "Closed-Form Maximum Likelihood Estimates of Nearest Neighbor Spatial Dependence," Geographical Analysis (pdf)
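
The closed form can be illustrated with a small sketch. It assumes each observation has exactly one nearest neighbor and that no distances are tied, so the only cycles in the neighbor graph are mutual (symmetric) pairs; under that assumption ln|I - alpha*D| reduces to the number of symmetric pairs times ln(1 - alpha^2). The function names below are hypothetical and the brute-force neighbor search is for illustration only; this is not the Spatial Statistics Toolbox code.

```python
import math


def nearest_neighbors(points):
    """Return, for each point, the index of its single nearest neighbor.

    Brute-force O(n^2) search, used here only to keep the sketch short.
    """
    nn = []
    for i, p in enumerate(points):
        best_j, best_d = None, math.inf
        for j, q in enumerate(points):
            if i == j:
                continue
            d = math.dist(p, q)
            if d < best_d:
                best_j, best_d = j, d
        nn.append(best_j)
    return nn


def count_symmetric_pairs(nn):
    """Count pairs (i, j) in which each point is the other's nearest neighbor."""
    return sum(1 for i, j in enumerate(nn) if i < j and nn[j] == i)


def nn_log_det(nn, alpha):
    """ln|I - alpha*D| for a one-nearest-neighbor weight matrix D (no ties assumed)."""
    return count_symmetric_pairs(nn) * math.log(1.0 - alpha * alpha)
```

For example, the five collinear points (0, 0), (1, 0), (3, 0), (3.5, 0), (7, 0) contain two symmetric pairs, so with alpha = 0.5 the log-determinant is 2 ln(0.75), roughly -0.575, with no matrix factorization required.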

 

Ron Barry and I devised a means of estimating the log-determinant of large, sparse matrices. Estimation of the log-determinant of the variance-covariance matrix (or its inverse) allows maximum likelihood estimation of large-scale spatial statistical problems. Most importantly, the article shows a way of providing confidence intervals for the estimate and shows that these work via a coverage study. To illustrate the potential of the estimator, we estimated the log-determinant of a 1,000,000 by 1,000,000 matrix (which we did on a Pentium 133 MHz machine!). The estimator has a simple form, and its performance depends only upon the degree of sparsity and not its pattern. Source code and executable code for it reside in SpaceStatPack. The manuscript describing the estimator appeared in Linear Algebra and its Applications. You can obtain a pdf version of the article by selecting the link below or going to the Elsevier Science site.

Barry and Pace (1999), "Monte Carlo Estimates of the Log Determinant of Large Sparse Matrices," Linear Algebra and its Applications (pdf)
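
A minimal sketch of this kind of estimator follows. It relies on the series ln|I - alpha*D| = -sum_k alpha^k tr(D^k)/k and estimates each trace from random normal vectors x via n * x'D^k x / x'x. The function name, the neighbor-list storage, and the default settings are assumptions made for illustration; the sketch reports only a sampling standard error and omits the truncation bound that a full confidence interval would need. It is not the SpaceStatPack code.

```python
import math
import random


def mc_log_det(neighbors, alpha, n_terms=20, n_draws=150, seed=0):
    """Monte Carlo estimate of ln|I - alpha*D| for a sparse matrix D.

    neighbors[i] is a list of (j, weight) pairs giving the nonzeros of row i.
    Returns (estimate, standard error across draws).
    """
    rng = random.Random(seed)
    n = len(neighbors)
    draws = []
    for _ in range(n_draws):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        xx = sum(v * v for v in x)
        z = x[:]  # holds D^k x, updated once per series term
        total = 0.0
        for k in range(1, n_terms + 1):
            # Sparse matrix-vector product: only the stored nonzeros are touched.
            z = [sum(w * z[j] for j, w in row) for row in neighbors]
            xdz = sum(a * b for a, b in zip(x, z))
            total += (alpha ** k / k) * xdz / xx
        draws.append(-n * total)
    mean = sum(draws) / n_draws
    var = sum((d - mean) ** 2 for d in draws) / (n_draws - 1)
    return mean, math.sqrt(var / n_draws)
```

Each draw costs only n_terms sparse matrix-vector products, so the work grows with the number of nonzeros rather than with n squared, which is consistent with the dependence on sparsity noted above.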

 

Much of my computational approach to spatial statistics appeared in Geographical Analysis (1997). The innovations in this article include (1) permuting the rows and columns of the sparse spatial weight matrix to vastly accelerate the computation of the log-determinant; (2) log-determinant reuse; and (3) vectorized computation of the profile likelihoods. To give an idea about the speed involved, on a Pentium II 400 MHz machine it takes under 10 seconds to compute a 3,107 observation simultaneous spatial autoregression (SAR) via maximum likelihood.

Pace and Barry (1997), "Quick Computation of Regressions with a Spatially Autoregressive Dependent Variable," Geographical Analysis (pdf)
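
As a sketch of why items (2) and (3) save so much time, consider the standard concentrated (profile) log-likelihood for the SAR model. The notation below (alpha, D, M, e_o, e_d, C) is assumed for illustration and does not appear on this page.

```latex
\begin{aligned}
y &= \alpha D y + X\beta + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^2 I) \\
e_o &= M y, \qquad e_d = M D y, \qquad M = I - X (X'X)^{-1} X' \\
\mathrm{SSE}(\alpha) &= e_o' e_o - 2\alpha\, e_d' e_o + \alpha^2\, e_d' e_d \\
L(\alpha) &= C + \ln\lvert I - \alpha D \rvert - \frac{n}{2} \ln \mathrm{SSE}(\alpha)
\end{aligned}
```

The three inner products are computed once, so SSE(alpha) can be evaluated over a whole grid of alpha values in a single vectorized step. The log-determinant term depends only on D and alpha, so a table of its values over the same grid can be reused for any choice of y and X.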

 

An additional article, which appeared in the Journal of Statistical Computation and Simulation, discusses fast implementation of Conditional Spatial Autoregressions (CAR).

Pace and Barry (1997), "Fast CARs," Journal of Statistical Computation and Simulation (pdf)

 

We discussed sparse kriging in Communications in Statistics, Simulation and Computation. Using published estimates of a spherical variogram, we computed the kriging estimates 432 times as fast as with more conventional solution techniques.

Pace and Barry (1997), "Kriging with Large Data Sets," Communications in Statistics, Simulation and Computation (pdf image)

 

Statistics and Probability Letters published an article where we analyzed a spatial data set of 20,640 observations using normal maximum likelihood (SAR). The Geographical Analysis (1997) article proposed a computational approach superior to the one used in the Statistics and Probability Letters article, but the Statistics and Probability Letters article used a larger data set. Moreover, it provided details on this data set, which resides in the Spatial Statistics Toolbox zipped files.

Pace and Barry (1997), "Sparse Spatial Autoregressions," Statistics and Probability Letters (pdf)

  

Otis Gilley and I corrected a famous data set and augmented it with spatial information. We described this in an article in the Journal of Environmental Economics and Management (JEEM).

Gilley and Pace (1996), "The Harrison and Rubinfeld Data Revisited," Journal of Environmental Economics and Management (pdf)

 

I have a variety of real estate manuscripts which apply spatial statistics.

Real Estate Spatial Statistics Articles

 

 

We have copyright permission from the publishers above to place these manuscripts on the web. In all cases the publishers own the copyright and have graciously granted us permission to place them on this website. More details appear in each manuscript.

Ohio State University Press publishes Geographical Analysis.

Elsevier Science publishes Linear Algebra and its Applications, Journal of Environmental Economics and Management (JEEM), and Statistics and Probability Letters.

Taylor & Francis publishes the Journal of Statistical Computation and Simulation.

Marcel Dekker publishes Communications in Statistics.

 

You can download the articles above and the spatial software packages via anonymous FTP (ftp.spatial-statistics.com/Spatial_Statistics_Toolbox or ftp.spatiotemporal.com). Some FTP clients perform downloading much better than browsers. For example, WS_FTP, CuteFTP, and FTP Explorer allow resumption of interrupted transfers and contain other features that make them ideal for downloading large files over the net.

For those without pdf viewers, I have supplied some of the manuscripts and documentation in html form. If possible, I strongly urge using the pdf versions. The conversion to html loses some of the formatting, which makes the manuscripts and documentation harder to read. This is especially true for mathematical symbols and equations. In addition, the conversion sometimes loses footnotes. You can obtain pdf file viewers for free from www.adobe.com. If you download the html files, for many of the manuscripts you need to erase the filename appearing at the end of the URL in the browser URL window and hit return. This will show all the html and gif files pertaining to that manuscript. You need to download all of them to a directory on your machine if you wish to see the equations. This is far more work than simply downloading the pdf file version.

 

Geographical Analysis (1997) html

Geographical Analysis (2000) html

Geographical Analysis (2002) html

Advances in Spatial Econometrics (forthcoming) html

Statistics and Probability Letters (1997) html

JSCS (1997) html

JEEM (1996) html

Matlab Spatial Statistics Toolbox Documentation (1999) html

SpaceStatPack Documentation (1999) html

Matlab Spatial Statistics Toolbox 2.0 (2003) html

 

 
