ENBIS-8 in Athens

21 – 25 September 2008 Abstract submission: 14 March – 11 August 2008

Quality improvement of land cover data bases: a sequential approach via agreement measures

22 September 2008, 14:00 – 14:20

Abstract

Submitted by
Elisabetta Carfagna
Authors
E. Carfagna, J. Marzialetti
Affiliation
University of Bologna
Abstract
Land cover data bases are frequently produced through photo-interpretation of remote sensing data according to a legend of land cover types. During the photo-interpretation process, the photo-interpreter outlines polygons and can make mistakes concerning the borders of polygons as well as the land cover type. A second, much more experienced photo-interpreter (the controller) performs a quality control on a sample of polygons to check whether the first photo-interpreter has made mistakes. The result of the two photo-interpretations is a confusion matrix whose rows (i) are the classes assigned by the photo-interpreter and whose columns (j) are the classes assigned by the controller. We assume that both use the same classes. Each cell of the matrix contains the number of polygons (or the area of polygons divided by the total area) assigned to class i by the photo-interpreter and to class j by the controller.
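As an illustration only, the confusion matrix described above could be assembled as in the following minimal Python sketch. The function name `confusion_matrix`, its arguments and the coding of classes as integers 0..k-1 are assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def confusion_matrix(interp, control, n_classes, areas=None):
    """Rows: class given by the photo-interpreter (i).
    Columns: class given by the controller (j).
    Classes are assumed to be coded as integers 0..n_classes-1."""
    interp = np.asarray(interp)
    control = np.asarray(control)
    # Each polygon contributes 1 (count-based matrix) or its area.
    if areas is None:
        w = np.ones(len(interp))
    else:
        w = np.asarray(areas, dtype=float)
    m = np.zeros((n_classes, n_classes))
    np.add.at(m, (interp, control), w)
    # Area-based matrices are expressed as shares of the total area.
    return m if areas is None else m / w.sum()
```

Passing `areas=None` gives the matrix of polygon counts; passing the polygon areas gives the matrix of area shares, whose diagonal sums to the share of area on which the two photo-interpretations agree.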
In this paper, we use quality control to continuously improve the data base production process. For this purpose we propose an adaptive sequential sample design, which allows high precision of the estimates to be reached with the smallest sample size and in the shortest time. However, adaptive sequential procedures generally do not allow unbiased estimates of the quality parameters, because their efficiency comes from making sample selection depend on previously selected units and from stopping rules based on the quality parameter.
In Carfagna and Marzialetti (2007) we proposed an adaptive sequential procedure with permanent random numbers which allows unbiased and efficient estimates of the percentage of area correctly photo-interpreted. In this procedure the sample size per stratum depends on the previously selected units but the sample selection does not, and the stopping rule is not based on the estimates of the quality parameter.
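The general idea of selection with permanent random numbers can be sketched as follows. This is a generic illustration under stated assumptions, not the procedure of Carfagna and Marzialetti (2007): the function names, the seed and the way stratum sizes are supplied are all hypothetical. Each polygon receives one random number that never changes, and the sample in a stratum is the set of polygons with the smallest numbers, so the order of selection is fixed in advance and only the stratum sample sizes vary from step to step.

```python
import numpy as np

def assign_prn(n_units, seed=0):
    """Give every unit one uniform random number, drawn once and kept."""
    rng = np.random.default_rng(seed)
    return rng.random(n_units)

def prn_sample(prn, strata, sizes):
    """Select, in each stratum h, the sizes[h] units with the smallest
    permanent random numbers. `sizes` maps stratum label -> sample size."""
    prn = np.asarray(prn)
    strata = np.asarray(strata)
    selected = []
    for h, n_h in sizes.items():
        idx = np.flatnonzero(strata == h)
        order = idx[np.argsort(prn[idx])]  # fixed selection order in stratum h
        selected.extend(order[:n_h].tolist())
    return selected
```

Increasing the size of one stratum from 5 to 8 keeps the 5 polygons already checked and adds the next 3 in the fixed order, which is why the selection itself does not depend on what was observed on earlier units.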
In this paper we apply the same adaptive sequential procedure to some measures of agreement, such as Cohen’s Kappa and its weighted version, where the weights are an inverse function of the distance between the classes of the photo-interpretation legend. Both agreement measures can be applied to confusion matrices based on the area of polygons as well as on the number of polygons.
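For reference, the two agreement measures can be computed from a confusion matrix as in the minimal sketch below. The weighting scheme shown, w_ij = 1 - d_ij / max(d), is only one possible inverse function of distance and is an assumption of this example; the abstract does not specify the function used. The name `cohen_kappa` and the `distance` argument are likewise illustrative.

```python
import numpy as np

def cohen_kappa(matrix, distance=None):
    """Cohen's Kappa from a confusion matrix of counts or area shares.
    If `distance` (a k x k matrix of distances between legend classes)
    is given, return the weighted Kappa with agreement weights
    w_ij = 1 - d_ij / max(d), an assumed weighting scheme."""
    p = np.asarray(matrix, dtype=float)
    p = p / p.sum()  # cell proportions
    k = p.shape[0]
    if distance is None:
        w = np.eye(k)  # only exact agreement counts
    else:
        d = np.asarray(distance, dtype=float)
        w = 1.0 - d / d.max()
    # Agreement expected by chance from the row and column margins.
    expected = np.outer(p.sum(axis=1), p.sum(axis=0))
    po = (w * p).sum()
    pe = (w * expected).sum()
    return (po - pe) / (1.0 - pe)
```

Because the matrix is normalised inside the function, the same code applies to matrices based on the number of polygons and to matrices based on area shares.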
We have performed several simulations corresponding to different selections of polygons from a land cover data base of photo-interpreted satellite data. The behaviour of the standard deviations of Cohen’s Kappa and the Weighted Kappa has shown that the sequential procedure reaches a specified precision of the estimates with a smaller sample size than the classical two-step procedure, in which the sample size necessary to obtain that precision is computed on the basis of an initial sample. However, for some simulations, the trend of the standard deviations is not strictly monotone: some simulations show a small increase of the standard deviation, while others show jumps of the standard deviation to higher values.
For confusion matrices based on the number of polygons, we have also used a log-linear model to describe the “structure” of the agreement (Tanner and Young 1985), in which two terms express, respectively, the agreement obtained by chance alone and the effective agreement between the photo-interpreter and the controller. If we hypothesize that the level of agreement does not change among the classes of the legend, we can use the homogeneous model; otherwise a non-homogeneous model is more appropriate. We also propose a sequential approach for the log-linear model.
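In one common notation, an agreement model of the Tanner and Young type for two raters can be written as below. The symbols are chosen for this illustration and may differ from those of the paper: m_ij is the expected number of polygons in cell (i, j), the superscripts P and C label the photo-interpreter and the controller, the first three terms describe agreement by chance alone (independence), and the delta term describes the agreement beyond chance.

```latex
\log m_{ij} = \mu + \lambda_i^{P} + \lambda_j^{C} + \delta_{ij}

% Homogeneous model: the same agreement parameter for every class
\delta_{ij} =
\begin{cases}
\delta & \text{if } i = j \\
0      & \text{if } i \neq j
\end{cases}

% Non-homogeneous model: one agreement parameter per class
\delta_{ij} =
\begin{cases}
\delta_i & \text{if } i = j \\
0        & \text{if } i \neq j
\end{cases}
```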
References
Carfagna E. and Marzialetti J. (2007) Sequential Design in Quality Control and Validation of Land Cover Data Bases, Proc. of the ENBIS-DEINDE Conference “Computer Experiments versus Physical Experiments”, Torino, G. Vicario and E. D. Isaia (eds.).
Tanner M. A. and Young M. A. (1985) Modeling agreement among raters, Journal of the American Statistical Association, 80, 175-180.