4.1.1 Robust M-Estimator
Robust statistics methods [Tukey 1977; Huber 1981] provide tools for statistical problems in which the underlying assumptions are inexact. A robust procedure should be insensitive to departures from the underlying assumptions caused by, for example, outliers. That is, it should perform well under the assumed model, and its performance should deteriorate gracefully as the situation departs from the assumptions. Applications of robust methods in vision include image restoration, smoothing, and segmentation [Kashyap and Eom 1988; Jolion et al. 1991; Meer et al. 1991], surface and shape fitting [Besl et al. 1988; Stein and Werman 1992], and pose estimation [Haralick et al. 1989], where outliers are an issue.
There are several types of robust estimators. Among them are the M-estimator (maximum likelihood type estimator), the L-estimator (linear combination of order statistics), and the R-estimator (estimator based on rank transformation) [Huber 1981]; the RM estimator (repeated median) [Siegel 1982]; and the LMS estimator (least median of squares) [Rousseeuw 1984]. We are concerned with the M-estimator.
The essential form of the M-estimation problem is the following: given a set of $m$ data samples $d = \{d_i \mid 1 \le i \le m\}$, where $d_i = f + \eta_i$, the problem is to estimate the location parameter $f$ under the noise $\eta_i$. The distribution of $\eta_i$ is not assumed to be known exactly. The only underlying assumption is that $\eta_1, \ldots, \eta_m$ obey a symmetric, independent, identical distribution (symmetric i.i.d.). A robust estimator has to deal with departures from this assumption.
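To make this setup concrete, the short sketch below (an illustrative assumption, not taken from the text) generates samples $d_i = f + \eta_i$ in which most of the noise is symmetric i.i.d. Gaussian while a few samples are corrupted into gross outliers, the kind of departure a robust estimator must tolerate:

```python
import numpy as np

rng = np.random.default_rng(0)
f_true = 3.0                                   # unknown location parameter f
m = 100

# Symmetric i.i.d. noise for most samples ...
eta = rng.normal(loc=0.0, scale=0.5, size=m)
# ... plus a small fraction of gross outliers that violate the assumed model.
idx = rng.choice(m, size=5, replace=False)
eta[idx] += 20.0

d = f_true + eta                               # observed data samples d_i
print(np.mean(d), np.median(d))                # the mean is pulled off; the median is not
```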
Let the residual errors be $\eta_i = d_i - f$ ($1 \le i \le m$) and the error penalty function be $g(\eta_i)$. The M-estimate $f^*$ is defined as the minimum of a global error function $E(f)$,
\[ f^* = \arg\min_f E(f), \]
where
\[ E(f) = \sum_{i=1}^{m} g(\eta_i) = \sum_{i=1}^{m} g(d_i - f). \]
To minimize the above, it is necessary to solve the following equation
\[ \sum_{i=1}^{m} g'(d_i - f) = 0. \]
This is the gradient-based condition for a minimum. When $g(\eta)$ can also be expressed as a function of $\eta^2$, its first derivative can take the following form
\[ g'(\eta) = 2\eta\, h(\eta), \]
where $h(\eta)$ is an even function. In this case, the estimate can be expressed as the following weighted sum of the data samples
\[ f = \frac{\sum_{i=1}^{m} h(\eta_i)\, d_i}{\sum_{i=1}^{m} h(\eta_i)}, \]
where $h$ acts as the weighting function.
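As a minimal sketch of how this weighted-sum form can be used in practice (an illustration, not the book's algorithm), the following code iterates the fixed point $f \leftarrow \sum_i h(\eta_i) d_i / \sum_i h(\eta_i)$ with $\eta_i = d_i - f$ recomputed at each step. The Huber-type weight and the MAD-based choice of $\gamma$ are assumed standard choices, not taken from Table 4.1:

```python
import numpy as np

def huber_weight(eta, gamma):
    """One standard choice of weighting function h(eta): unit weight for
    small residuals, weight proportional to gamma/|eta| for large ones."""
    a = np.abs(eta)
    return np.where(a <= gamma, 1.0, gamma / np.maximum(a, 1e-12))

def m_estimate_location(d, weight_fn=huber_weight, n_iter=100, tol=1e-10):
    """Solve f = sum_i h(eta_i) d_i / sum_i h(eta_i) by fixed-point iteration,
    recomputing the residuals eta_i = d_i - f at each step."""
    d = np.asarray(d, dtype=float)
    f = np.median(d)                               # robust initial guess
    gamma = 1.4826 * np.median(np.abs(d - f))      # MAD-based scale estimate
    if gamma == 0.0:
        gamma = 1.0                                # guard against degenerate data
    for _ in range(n_iter):
        h = weight_fn(d - f, gamma)                # adaptive weights h(eta_i)
        f_new = np.sum(h * d) / np.sum(h)
        if abs(f_new - f) < tol:
            break
        f = f_new
    return f

# One gross outlier pulls the LS estimate (the mean) but barely moves the M-estimate.
d = [1.0, 1.2, 0.9, 1.1, 1.0, 10.0]
print(np.mean(d), m_estimate_location(d))
```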
In LS regression, all data points are weighted equally, with $h(\eta_i) = 1$, and the estimate is the sample mean $f = \frac{1}{m}\sum_{i=1}^{m} d_i$. When outliers are weighted equally with inliers, they cause considerable bias and a rapid deterioration in the quality of the estimate. In robust M-estimation, the function $h$ provides adaptive weighting: the influence of $d_i$ is decreased when $|\eta_i|$ is very large and suppressed when it is infinitely large.
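To illustrate the contrast numerically (a hypothetical example; the redescending weight below is the standard Tukey-biweight form, used here only as one possible instance of $h$), the snippet prints the weight $h(\eta)$ and the influence $g'(\eta) = 2\eta h(\eta)$ for increasing residual magnitudes:

```python
import numpy as np

# LS penalty g(eta) = eta**2: h(eta) = 1, so the influence 2*eta grows without bound.
def h_ls(eta, gamma):
    return np.ones_like(eta)

# A redescending weight (standard Tukey-biweight form, shown only as an example):
# h(eta) falls to zero for |eta| > gamma, so the influence of gross outliers is
# suppressed entirely.
def h_tukey(eta, gamma):
    u = np.minimum(np.abs(eta) / gamma, 1.0)
    return (1.0 - u ** 2) ** 2

eta = np.array([0.5, 1.0, 2.0, 5.0, 50.0])
gamma = 2.5
for name, h in [("LS", h_ls), ("Tukey", h_tukey)]:
    w = h(eta, gamma)
    print(f"{name:5s} weights {np.round(w, 3)}  influence {np.round(2 * eta * w, 2)}")
```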
Table 4.1 lists some robust functions used in practice. They are closely related to the adaptive interaction function $h_\gamma(\eta)$ and the adaptive potential function $g_\gamma(\eta)$ defined in (3.27) and (3.28). Fig. 4.1 shows their qualitative shapes in comparison with the quadratic and the line process models (note that a trivial constant may be added to $g_\gamma(\eta)$). These robust functions are defined piecewise, as is the line process model. Moreover, the parameter $\gamma$ in $g_\gamma$ depends on some scale estimate, such as the median absolute deviation (MAD).
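Since $\gamma$ is tied to a data-driven scale estimate, a common concrete recipe (an assumption here; the exact choice is not spelled out in this excerpt) is $\gamma = c \cdot \mathrm{MAD}$, with the MAD rescaled so that it estimates the noise standard deviation under Gaussian noise:

```python
import numpy as np

def mad_scale(residuals):
    """Median absolute deviation, rescaled by 1.4826 so that it is a
    consistent estimate of the standard deviation under Gaussian noise."""
    r = np.asarray(residuals, dtype=float)
    return 1.4826 * np.median(np.abs(r - np.median(r)))

# Typical usage: gamma = c * mad_scale(residuals), where c is a tuning constant
# from the robust statistics literature (e.g., about 1.345 for Huber's penalty
# or 4.685 for Tukey's biweight).
residuals = np.array([0.1, -0.2, 0.05, 0.15, -0.1, 8.0])
print(mad_scale(residuals))
```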
Figure 4.1: The qualitative shapes of potential functions in use. The quadratic prior (equivalent to LS) model in (a) is unable to deal with discontinuities (or outliers). The line process model (b) and Tukey's (c), Huber's (d), Andrews' (e), and Hampel's (f) robust models are able to do so, owing to their property of limiting, and for gross outliers suppressing, the influence of large residuals.
From (Li 1995a) with permission; © 1995 Elsevier.