6.1.1 Maximum Likelihood
Given a realization f of a single MRF, the maximum likelihood (ML) estimate maximizes the conditional probability $P(f \mid \theta)$ (the likelihood of $\theta$), that is,

$$\theta^{*} = \arg\max_{\theta} P(f \mid \theta)$$

or, equivalently, its log likelihood $\ln P(f \mid \theta)$. Note that in this case, f is the data used for the estimation. When the prior p.d.f. of the parameters, $p(\theta)$, is known, one may instead seek the MAP estimate, which maximizes the posterior density

$$P(\theta \mid f) \propto p(\theta)\, P(f \mid \theta)$$

Ignorance of the prior p.d.f. forces a rather diffuse choice: when prior information is totally unavailable, the prior p.d.f. is assumed to be flat. In this case the MAP estimation reduces to the ML estimation.
Let us review the idea of ML parameter estimation using a simple example. Let $f = \{f_1, \ldots, f_m\}$ be a realization of independent, identically distributed Gaussian random variables with two parameters, the mean $\mu$ and the variance $\sigma^2$, that is, $f_i \sim N(\mu, \sigma^2)$. We want to estimate the two parameters, $\theta = \{\mu, \sigma^2\}$, from f. The likelihood function of $\theta$ for fixed f is

$$p(f \mid \theta) = \prod_{i=1}^{m} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{-\frac{(f_i - \mu)^2}{2\sigma^2}\right\}$$

A necessary condition for maximizing $p(f \mid \theta)$, or equivalently maximizing $\ln p(f \mid \theta)$, is

$$\frac{\partial \ln p(f \mid \theta)}{\partial \mu} = 0
\qquad \text{and} \qquad
\frac{\partial \ln p(f \mid \theta)}{\partial \sigma^2} = 0$$

Solving these two equations, we find the ML estimates

$$\mu^{*} = \frac{1}{m}\sum_{i=1}^{m} f_i
\qquad \text{and} \qquad
(\sigma^2)^{*} = \frac{1}{m}\sum_{i=1}^{m} (f_i - \mu^{*})^2$$
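As a numerical check of these closed-form estimates, the following Python sketch (the function name ml_gaussian and the simulated sample are purely illustrative) computes $\mu^{*}$ and $(\sigma^2)^{*}$ directly from data:

import numpy as np

def ml_gaussian(f):
    """ML estimates of the mean and variance from i.i.d. Gaussian samples f."""
    f = np.asarray(f, dtype=float)
    m = f.size
    mu_ml = f.sum() / m                       # mu* = (1/m) sum_i f_i
    var_ml = ((f - mu_ml) ** 2).sum() / m     # (sigma^2)* = (1/m) sum_i (f_i - mu*)^2
    return mu_ml, var_ml

# Draw m = 1000 samples from N(2, 1.5^2) and recover the parameters.
rng = np.random.default_rng(0)
samples = rng.normal(loc=2.0, scale=1.5, size=1000)
print(ml_gaussian(samples))   # close to (2.0, 2.25) for large m

Note that the ML variance estimate divides by m rather than m - 1 and is therefore biased, though consistent.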
A more general result can be obtained for the m-variate Gaussian distribution which, when restricted by the Markovianity, is also known as the auto-normal [Besag 1974] MRF (cf. Section 1.3.1). Assuming $\mu = 0$ in (1.47), so that $\theta$ consists only of the variance $\sigma^2$ and the interaction matrix $B$, the log likelihood function for the auto-normal field is

$$\ln P(f \mid \theta) = -\frac{f^{T} B f}{2\sigma^2} - \frac{m}{2}\ln(2\pi\sigma^2) + \frac{1}{2}\ln\det(B)$$

where f is considered as a vector. Maximizing this requires evaluating the determinant $\det(B)$. Even so, ML estimation for Gaussian cases is usually less involved than for general MRFs.
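To make the role of the determinant concrete, here is a minimal Python sketch that evaluates this zero-mean auto-normal log likelihood for given $B$ and $\sigma^2$ (the function name is illustrative, and $B$ is assumed symmetric positive definite so that the density is proper):

import numpy as np

def autonormal_loglik(f, B, sigma2):
    """Log likelihood of a zero-mean auto-normal field (illustrative sketch).

    f      : length-m sample vector
    B      : m x m interaction matrix, assumed symmetric positive definite
    sigma2 : variance parameter
    """
    m = f.size
    sign, logdet = np.linalg.slogdet(B)   # stable evaluation of ln det(B)
    if sign <= 0:
        raise ValueError("B must be positive definite")
    quad = f @ B @ f                      # f^T B f
    return -quad / (2.0 * sigma2) - 0.5 * m * np.log(2.0 * np.pi * sigma2) + 0.5 * logdet

Since the log determinant can be computed in $O(m^3)$ time (e.g., via a Cholesky factorization), gradient-based maximization over $(B, \sigma^2)$ is feasible for moderate m.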
The ML estimation for an MRF in general requires evaluating the normalizing partition function of the corresponding Gibbs distribution. Consider a homogeneous and isotropic auto-logistic model in the 4-neighborhood system, with labels $f_i \in \{0, 1\}$ and parameters $\theta = \{\alpha, \beta\}$ (this MRF will also be used in subsequent sections for illustrations). According to (1.39) and (1.40), its energy function and conditional probability are, respectively,

$$U(f \mid \theta) = \sum_{\{i\} \in \mathcal{C}_1} \alpha f_i + \sum_{\{i, i'\} \in \mathcal{C}_2} \beta f_i f_{i'}$$

where $\mathcal{C}_1$ and $\mathcal{C}_2$ denote the single-site and pair-site cliques, and

$$P(f_i \mid f_{\mathcal{N}_i}, \theta) = \frac{\exp\left\{\alpha f_i + \sum_{i' \in \mathcal{N}_i} \beta f_i f_{i'}\right\}}{1 + \exp\left\{\alpha + \sum_{i' \in \mathcal{N}_i} \beta f_{i'}\right\}}$$
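As a small illustration (assuming the binary labels above; the function name is hypothetical), this conditional reduces to a logistic function of the weighted neighborhood sum:

import math

def autologistic_conditional(f_i, neighbor_sum, alpha, beta):
    """P(f_i | f_{N_i}) for the homogeneous, isotropic auto-logistic model.

    f_i          : label at site i, either 0 or 1
    neighbor_sum : sum of the (up to four) neighboring labels
    alpha, beta  : the model parameters theta = {alpha, beta}
    """
    t = alpha + beta * neighbor_sum          # local "field" acting on site i
    p_one = 1.0 / (1.0 + math.exp(-t))       # P(f_i = 1 | f_{N_i}) = e^t / (1 + e^t)
    return p_one if f_i == 1 else 1.0 - p_one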
The likelihood function is in the Gibbs form

$$P(f \mid \theta) = \frac{1}{Z(\theta)}\, e^{-U(f \mid \theta)}$$

where the partition function

$$Z(\theta) = \sum_{f \in \mathbb{F}} e^{-U(f \mid \theta)}$$

is also a function of $\theta$. Maximizing $P(f \mid \theta)$ with respect to $\theta$ requires evaluating the partition function $Z(\theta)$. However, the computation of $Z(\theta)$ is intractable even for moderately sized problems because there is a combinatorial number of elements in the configuration space $\mathbb{F}$. This is a main difficulty in parameter estimation for MRFs. Approximate formulae will be used to circumvent this problem.
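The blow-up is easy to see by brute force. The sketch below (illustrative only) enumerates every binary configuration of a small n x n 4-connected lattice with free boundaries to compute $Z(\theta)$ for the auto-logistic model above, counting each neighbor pair once; already a 16 x 16 image would require summing $2^{256}$ terms.

import itertools
import math
import numpy as np

def energy(f, alpha, beta):
    """U(f): alpha times the label sum plus beta over 4-neighbor pair cliques."""
    u = alpha * f.sum()
    u += beta * (f[:, :-1] * f[:, 1:]).sum()   # horizontal pair cliques
    u += beta * (f[:-1, :] * f[1:, :]).sum()   # vertical pair cliques
    return u

def partition_function(n, alpha, beta):
    """Z(theta) by exhaustive enumeration of all 2^(n*n) binary configurations."""
    z = 0.0
    for bits in itertools.product((0, 1), repeat=n * n):
        f = np.array(bits).reshape(n, n)
        z += math.exp(-energy(f, alpha, beta))
    return z

print(partition_function(3, alpha=0.5, beta=0.3))   # 2^9 = 512 terms: trivial
# partition_function(16, ...) would need 2^256 terms -- utterly infeasible.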
The approximate formulae are based on the conditional probabilities $P(f_i \mid f_{\mathcal{N}_i})$, $i \in \mathcal{S}$. (The notation for the parameters, $\theta$, on which the probabilities are conditioned, is dropped temporarily for clarity.) Write the energy function in the form

$$U(f) = \sum_{i \in \mathcal{S}} U_i(f_i, f_{\mathcal{N}_i})$$

Here, $U_i(f_i, f_{\mathcal{N}_i})$ depends on the configuration on the cliques involving i and $\mathcal{N}_i$, in which the labels $f_i$ and $f_{\mathcal{N}_i}$ are mutually dependent. If only single- and pair-site cliques are considered, then

$$U_i(f_i, f_{\mathcal{N}_i}) = V_1(f_i) + \sum_{i' \in \mathcal{N}_i} V_2(f_i, f_{i'})$$

The conditional probability (1.29) can then be written as

$$P(f_i \mid f_{\mathcal{N}_i}) = \frac{e^{-U_i(f_i, f_{\mathcal{N}_i})}}{\sum_{f_i \in \mathcal{L}} e^{-U_i(f_i, f_{\mathcal{N}_i})}}$$

where $\mathcal{L}$ is the set of labels. Based on this, approximate formulae for the joint probability $P(f)$ are given in the following subsections.
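Since this local conditional is the building block of the approximations that follow, here is a minimal sketch of its evaluation from caller-supplied single- and pair-site potentials (all names and the example potentials are illustrative assumptions, not notation from the text):

import math

def local_conditional(label, neighbor_labels, label_set, V1, V2):
    """P(f_i = label | f_{N_i}) from single- and pair-site potentials.

    label_set : the label set L over which the denominator normalizes
    V1        : single-site potential V1(f_i)
    V2        : pair-site potential V2(f_i, f_i')
    """
    def U_i(fi):
        return V1(fi) + sum(V2(fi, fj) for fj in neighbor_labels)

    num = math.exp(-U_i(label))
    den = sum(math.exp(-U_i(fi)) for fi in label_set)
    return num / den

# Example with a hypothetical smoothness prior: V1 = 0, V2 penalizes disagreement.
p = local_conditional(1, [1, 0, 1, 1], label_set=(0, 1),
                      V1=lambda fi: 0.0,
                      V2=lambda fi, fj: 0.8 * (fi != fj))
print(p)   # probability that site i takes label 1 given its four neighbors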