6.1.5
MRF parameter estimation can also be done by using a least squares (LS)
fit procedure [Derin and Elliot 1987]. The procedure consists
of the following steps: (1) Find the relationship between the joint
probability and the parameters
; (2) use
histogram techniques to estimate all such probabilities; (3) construct
an over-determined system of equations in terms of the probabilities and
the parameters; and (4) solve it using the LS method.
Figure 6.3: Pixel i and its eight neighbors in the second order
neighborhood system.
Consider the MLL model with the label set . For the
8-neighborhood system shown in Fig.6.3, there are ten
type of clique potential parameters shown in
Fig.1.4. Denote all the parameters as a
vector
Let
be the sum of the potential functions over all the cliques containing i. Define indicator functions
for expressing (1.52) and
for (1.53). In terms of and
's,
(6.32) can be written as
where
is a vector whose elements count the number of each type of cliques in
the window. The RHS of the first row in the above equation
correspond to the potentials weighted by
, the
second row by
and
, the third row by
and
, the fourth to seventh by
, and the
last two rows by
(refer to Fig.1.4).
The conditional likelihood is related to the
potential
by
It can be expressed in terms of the joint probability
of the labels on the
window and the
joint probability
of the labels at the neighboring sites
The reason for this expression is that can be
estimated using histogram techniques (see shortly). Comparing
the above two equations yield the following equation
Note that the RHS of the above is independent of the value of and
therefore, so is the LHS.
Figure 6.4: Realizations from a Gibbs distribution with specified (upper)
and estimated (lower) parameters. See Table 6.1 for parameter values.
From (Derin and Elliott 1987) with permission; © 1987 IEEE.
Consider the LHS of (6.39) for any two distinct
values, I and , for
. Because the LHS is independent of
values, the following holds
Rearranging the above and using the relationship (6.35), we obtain
For any and
, the vector
can be determined using (6.36) and
can be estimated using histogram techniques. Therefore,
(6.41) represents a linear equation with
unknown
. One such equation can be obtained for each distinct
combination of I,
and
. An over-determined linear system
of equations is obtained from all possible combinations. It can be
solved using standard LS method. This procedure is
also used by [Nadabar and Jain 1992] for estimating line process
parameters in edge detection.
Histogram techniques are proposed to estimate , for
all combinations of
, from one or several realizations
(images). Assume that there are a total of N distinct
blocks in the image lattice and that a particular
configuration
occurs
times. Then
. The relationship between
the histogramming and the ML estimation is established in
[Guerelli and Onural 1994].
A few remarks follow: Any combination for which
cannot be used to obtain a linear equation because
of the logarithm in (6.41). Derin and Elliott
propose to use only those combinations
for which
are largest. This selection strategy is quite robust
[]. They also suggest to discard the three and four cliques
(set their potentials to zero), according to some prior or empirical
knowledge, to reduce the length of
and
to simplify the procedure. The normalized form of GRFs
[Griffeath 1976] described in Section 1.2.5
provides a foundation for the reduction. This will be discussed in
Section 6.3.2.
Table 6.1: Specified and estimated parameters.
From (Derin and Elliot 1987) with permission; © 1987 IEEE.
Fig.6.4 shows estimation results obtained by using
the LS fitting method. In this example, the only nonzero parameters are
. In the upper row are the texture images
generated with specified parameter values. These values are estimated
from the images. The estimated values are then used to generate textures
in the lower row. Table 6.1 compares the
specified values and the estimated values. Note that the estimate for
the image in column 4 is numerically not so good; Derin and Elliott
explanation for this is that the parameter representation for a
realization is not unique.
The LS fitting procedure has the following
advantages over the coding method and MPL estimation. It does not
require numerical maximization, avoiding solving nonlinear equations.
Nor is the final solution dependent on the initial estimate. Further, it
is more efficient and faster. A disadvantage is that the number of
equations (6.41) in the system increases
exponentially as the sizes of and
increase. Zero
histogramming counts also cause problems with the RHS of
(6.41).