Data-sparse price indexes by spatio-temporal regularization and PCA
Published 11/11/24
We present two novel approaches to overcome the limitations of data-sparse local house price indices and combine them into a single model pipeline that is simple, computationally efficient, and interpretable.
The first contribution is a new spatio-temporal regularisation of least squares dummy variable models, such as the repeat sales regression used here. This regularisation encodes prior knowledge of the proximity of houses in space and their sales in time. It handles missing values in a natural way. The second is nonlocal regularisation using truncated principal component analysis (PCA) applied to the resulting national collection of local price indices. The PCA loadings show that there are important underlying socioeconomic factors that can be leveraged in the construction of Australian market indices.
In particular, the PCA shows that many local markets can be described by a few broad aspects of the national market: a general trend that contrasts regions influenced by the mining industry with Sydney and Melbourne, and another trend that highlights lifestyle.
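The two steps can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the spatio-temporal regularisation takes the form of quadratic penalties on differences between adjacent periods and between neighbouring regions, which the text above does not specify. The function names, the penalty weights lam_time and lam_space, the neighbours list, and the synthetic data are all hypothetical.

```python
import numpy as np

def fit_regularised_indices(buy, sell, region, y, n_regions, n_periods,
                            neighbours, lam_time=1.0, lam_space=1.0):
    """Repeat sales regression with quadratic difference penalties (assumed form).

    buy, sell, region: integer arrays, one entry per sale pair (buy < sell).
    y: log price ratio of each pair. neighbours: list of (region, region) pairs.
    Returns log indices of shape (n_regions, n_periods), zero in period 0.
    """
    n = n_regions * n_periods
    rows = np.arange(len(y))
    # Dummy-variable design: -1 at the purchase period, +1 at the resale period.
    X = np.zeros((len(y), n))
    X[rows, region * n_periods + buy] = -1.0
    X[rows, region * n_periods + sell] = 1.0
    pen = []
    # Temporal penalty: adjacent periods within a region should be close.
    for r in range(n_regions):
        for t in range(n_periods - 1):
            d = np.zeros(n)
            d[r * n_periods + t], d[r * n_periods + t + 1] = -1.0, 1.0
            pen.append(np.sqrt(lam_time) * d)
    # Spatial penalty: neighbouring regions should be close in each period.
    for r, s in neighbours:
        for t in range(n_periods):
            d = np.zeros(n)
            d[r * n_periods + t], d[s * n_periods + t] = -1.0, 1.0
            pen.append(np.sqrt(lam_space) * d)
    # Penalised least squares solved as one augmented least squares problem.
    A = np.vstack([X, np.array(pen)])
    b = np.concatenate([y, np.zeros(len(pen))])
    beta, *_ = np.linalg.lstsq(A, b, rcond=None)
    beta = beta.reshape(n_regions, n_periods)
    return beta - beta[:, [0]]  # normalise each region to its base period

def truncated_pca(indices, k):
    """Keep the first k principal components of the regions-by-periods matrix."""
    mean = indices.mean(axis=0)
    U, s, Vt = np.linalg.svd(indices - mean, full_matrices=False)
    return mean + (U[:, :k] * s[:k]) @ Vt[:k], Vt[:k]

# Synthetic example: three regions, five periods, log prices rising 0.1 per period.
# Region 1 has no sales touching periods 1 and 3; region 2 has none touching period 2.
buy = np.array([0, 1, 2, 3, 0, 2, 0, 1])
sell = np.array([1, 2, 3, 4, 2, 4, 4, 3])
region = np.array([0, 0, 0, 0, 1, 1, 2, 2])
y = 0.1 * (sell - buy)
truth = np.tile(0.1 * np.arange(5), (3, 1))

idx = fit_regularised_indices(buy, sell, region, y, 3, 5,
                              neighbours=[(0, 1), (1, 2)],
                              lam_time=1e-3, lam_space=1e-3)
smooth, loadings = truncated_pca(idx, k=1)
```

In this sketch, periods with no sales are filled in by the penalties rather than by the data, which is one way to read the claim that missing values are handled naturally. The truncation in truncated_pca plays the role of the nonlocal regularisation: each local index is rebuilt from a small number of components shared across all regions.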