4.1.2
Computationally, existing M-estimators have several problems affecting
their performance: Firstly, it is not robust to the initial estimate, a
problem common to nonlinear regression procedures [Myers 1990],
also encountered by vision researchers
[Haralick et al. 1989 ; Meer et al. 1991 ; Zhuang et al. 1992]. The convergence of
the algorithm depends on the initialization. Even if the problem of
convergence is avoided, the need for good initial estimate cannot be
ignored for convergence to the global estimate; this is because most
M-estimators are defined as the global minimum of a generally
non-convex energy function and hence the commonly used gradient based
algorithms can get stuck at unfavorable local solutions. The M-estimator
has the theoretical breakpoint of where p is the number
of unknown parameters to be estimated, but in practice, the breakpoint can be well below this value because of the problem of local
minima.
Secondly, the definition of the M-estimator involves some scale estimate, such as the median of absolute deviation (MAD), and a parameter to be chosen. These are also sources of sensitivity and instability. For example, Tukey's biweight function [Tukey 1977] is defined as
where S is an estimate of spread, c is a constant parameter and
cS is the scale estimate. Possible choices are with c set to 6 or 9; or
(median of absolute deviation (MAD)) with
chosen for the best consistency with the Gaussian
distribution. Classical scale estimates such as the median and MAD are
not very robust. Design of scale estimates is crucial and needs devoted
studies.
Furthermore, the convergence of the M-estimator is often not guaranteed. Divergence can occur when initialization or parameters are not chosen properly. Owing to the above problems, the theoretical breakdown point can hardly be achieved.
In the following, an improved robust M-estimator, referred to as annealing M-estimator (AM-estimator), is presented to overcome the above problems. It has two main ingredients: a redefinition of the M-estimator and a GNC-like annealing algorithm.