9.5
In the following experiments, we compare several algorithms for combinatorial minimization and for constrained minimization:
The schedules for the annealing algorithms are as follows. There are two annealing schedules for MFA. In the first schedule, which is given in [], T is decreased according to ; we set the initial temperature to . The second also takes the form , but is chosen on an ad hoc, trial-and-error basis to obtain the best result; it is , where S is the saturation defined in (9.23). We refer to these two schedules as MFA-1 and MFA-2, respectively. For SA, the schedule is as but with the initial temperature set to ; annealing stops if the desired number of acceptances is not achieved at 100 successive temperatures [Kirkpatrick et al. 1983]. These schedules are purposely tuned for the best results possible. The schedule for the annealing ICM is quite tolerant; it is chosen as with .
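For illustration, the sketch below (in Python) implements a geometric cooling rule of the form T(t+1) = kappa * T(t), the common shape of such schedules; the initial temperature T0 and the factor kappa are placeholders, not the particular values used in the schedules described above.

def geometric_schedule(T0, kappa, n_steps):
    # Yield the temperatures T0, kappa*T0, kappa^2*T0, ...
    T = T0
    for _ in range(n_steps):
        yield T
        T *= kappa

# Illustrative only: a slower schedule (kappa close to 1) for SA-like runs,
# a faster one for MFA-like runs.
for T in geometric_schedule(T0=10.0, kappa=0.95, n_steps=5):
    print("T = %.3f" % T)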
Initial labeling is assigned as follows. First, we set ( ) as the starting point common to all the compared algorithms, where rnd is a random number evenly distributed between 0 and 1. For the continuous algorithms (2, 3 and 4), we normalize to satisfy (8.19). For the discrete algorithms (1, 5 and 6), i is initially labeled according to maximal selection, . The
convergence of the continuous algorithms is judged by checking whether
the saturation, S, is larger than 0.99.
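The common starting point and the convergence test can be sketched as follows (Python); the row normalization stands in for constraint (8.19), and, since (9.23) is not reproduced here, the average peak value of the label assignment is used as an assumed proxy for the saturation S.

import numpy as np

rng = np.random.default_rng(0)
m, M = 10, 10                              # nodes in the two graphs
p = rng.uniform(0.0, 1.0, (m, M + 1))      # labels 0..M, with 0 the NULL node

# Continuous algorithms: normalize each row so that the label values of a
# node sum to one (the constraint corresponding to (8.19)).
p_cont = p / p.sum(axis=1, keepdims=True)

# Discrete algorithms: initial labeling by maximal selection.
f_init = p.argmax(axis=1)

# Convergence test for the continuous algorithms: stop once the saturation
# exceeds 0.99 (average peak value used here as a stand-in for (9.23)).
def saturation(q):
    return q.max(axis=1).mean()

converged = saturation(p_cont) > 0.99      # False for a random start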
The test bed is the MRF matching of weighted graphs. Here, a node corresponds to a point in the X-Y plane. Two random graphs, each containing m=M nodes, are generated as follows. First, a number of nodes, where is the ``floor'' operation, are generated using random numbers uniformly distributed within a box of size ; they form part of the first graph. Their counterparts in the other graph are generated by adding Gaussian noise of to the x and y coordinates of each of these nodes in the first graph, giving deviated locations of the nodes. Then, the rest of the nodes in each of the two graphs, where is the ``ceiling'' operation, are generated independently using random numbers uniformly distributed in the box. This simulates outlier nodes.
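A minimal sketch of this construction is given below (Python); the box size, the noise level sigma and the inlier fraction are illustrative placeholders, not the values used in the experiments.

import numpy as np

def make_graph_pair(m, box=100.0, sigma=2.0, inlier_frac=0.8, seed=0):
    # Generate two point sets whose pairwise distances define the weighted
    # graphs d and D; box, sigma and inlier_frac are illustrative only.
    rng = np.random.default_rng(seed)
    n_in = int(np.floor(inlier_frac * m))            # "floor" of the inliers

    pts1 = rng.uniform(0.0, box, (n_in, 2))          # nodes of the first graph
    pts2 = pts1 + rng.normal(0.0, sigma, (n_in, 2))  # noisy counterparts

    n_out = m - n_in                                 # "ceiling" of the outliers
    pts1 = np.vstack([pts1, rng.uniform(0.0, box, (n_out, 2))])
    pts2 = np.vstack([pts2, rng.uniform(0.0, box, (n_out, 2))])

    # Weights: Euclidean distances between every pair of nodes.
    d = np.linalg.norm(pts1[:, None] - pts1[None, :], axis=-1)
    D = np.linalg.norm(pts2[:, None] - pts2[None, :], axis=-1)
    return d, D

d, D = make_graph_pair(m=10)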
The above steps generate two weighted graphs, and , where and index the sets of nodes, and and are the distances between nodes in the first and second graph, respectively. The basis of matching is the weights in d and D, which reflect bilateral relations between nodes; unary properties of nodes are not used in the test. We augment by a special node, called the NULL node and indexed by 0, into ; the purpose is to cater for the matching of outlier nodes. Refer also to Section 5.1 for the graph representation. The matching is to assign a node from to each of the nodes in so that some goodness (cost) of matching is maximized (minimized).
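As a sketch of what is being matched, the following Python fragment scores a candidate matching f, with f(i)=0 denoting assignment to the NULL node; the distance-difference score and the NULL penalty are assumptions used for illustration, not the posterior energy of Section 5.2.

def match_score(f, d, D, null_penalty=1.0):
    # f[i] in {0, 1, ..., M}: the node of the second graph assigned to node i,
    # with 0 standing for the NULL node.  Higher scores indicate matchings
    # that better preserve the pairwise weights d and D.
    m = len(f)
    score = 0.0
    for i in range(m):
        for j in range(m):
            if i == j:
                continue
            if f[i] == 0 or f[j] == 0:          # outliers matched to NULL
                score -= null_penalty
            else:
                score -= abs(d[i, j] - D[f[i] - 1, f[j] - 1])
    return score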
The posterior energy formulated in Section 5.2 is used to measure the cost of a matching. To be suitable for real, as opposed to combinatorial, computation, the problem is then reformulated in terms of relaxation labeling (cf. Section 8.2.2). From the two weighted graphs, the compatibility matrix is defined in the following way: , where is a constant. The gain function is then . To reduce storage, we used only one byte (8 bits) to represent the compatibility coefficients: positive values are truncated to an integer between 0 and 255, while negative values are truncated to zero. In this case, we set . Although the precision of the compatibility coefficients is low, good results are still obtained; this demonstrates the error-tolerant aspect of the RL minimization approach.
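The sketch below (Python) assumes a compatibility of the common relaxation-labeling form, built from the difference of the pairwise distances with a constant C, together with the one-byte quantization just described; the exact definitions of the compatibility matrix, the constant and the gain used in the text are not reproduced.

import numpy as np

def compatibilities(d, D, C=1.0):
    # r[i, I, j, J]: compatibility of assigning I to i and J to j,
    # with label 0 reserved for the NULL node (left at zero here).
    m, M = d.shape[0], D.shape[0]
    r = np.zeros((m, M + 1, m, M + 1))
    for i in range(m):
        for j in range(m):
            if i == j:
                continue
            for I in range(1, M + 1):
                for J in range(1, M + 1):
                    if I != J:
                        r[i, I, j, J] = C - abs(d[i, j] - D[I - 1, J - 1])
    return r

def quantize_to_byte(r, scale=1.0):
    # One byte per coefficient: negative values are truncated to zero,
    # positive values are clipped to the integer range 0..255.
    return np.clip(np.rint(r * scale), 0, 255).astype(np.uint8)

def gain(p, r):
    # G(p) = sum_{i,I} sum_{j,J} r_{ij}(I, J) p_i(I) p_j(J)
    return np.einsum('iI,iIjJ,jJ->', p, r.astype(float), p)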
The purpose of the MRF and RL formulations is to provide the compatibility matrix. Our ultimate objective here is more abstract: to compare the ability of the algorithms to solve the minimization problem thus expressed, given the posterior probability function or the corresponding compatibility matrix.
The following results are obtained after 200 runs:
Fig. 9.8 illustrates the solution quality in terms of the maximized gain as a function of the noise level . SA provides the best result when carefully tuned, followed by the Hummel-Zucker algorithm, whereas the quality of the ICM solution is the poorest. The annealing procedure (cf. Section 9.3.1) significantly improves the quality of the ICM solution. The MFT annealing performs next to the Hummel-Zucker algorithm when m=M=10. However, for ``harder'' problems, e.g. when m=M=20, the results produced by using the simple schedule MFA-1 deteriorate quickly as the noise level increases, dropping even below those of ICM. For large m and M and high values, the algorithm is often trapped in the all-NULL solution, which is a significant local optimum. We also found that a slower schedule, i.e. smaller in , does not necessarily yield better solutions.
Fig. 9.9 demonstrates the cost in terms of the number of iterations as a function of the noise level . ICM converges very fast, after just a few iterations. In contrast, the number for SA with the specified heuristic schedule is two to four orders of magnitude higher than those for all the deterministic algorithms.
Fig. 9.10 shows the efficiency measured by the
maximized-gain/iteration-number ratio. The ordering by the efficiency is
roughly consistent with the inverse ordering by the iteration number.
Figure 9.8: The maximized gain (after division by ), for m=10 (top) and m=20 (bottom). The higher, the better.
Figure 9.9: The number of iterations, for m=10 (top) and m=20
(bottom). The lower, the better.
Figure 9.10: The efficiency measured by the maximized-gain/iteration-number ratio, for m=10 (top) and m=20 (bottom). The higher, the better. The efficiency of SA is near zero.
Besides the solution quality and the cost, an important factor that must be taken into account is the need for, and difficulties in, the tuning of the annealing schedule. It is well known that the schedule is critical to the success of SA and is an area of study [Aarts 1989]. We add that the schedule is also critical to the mean field annealing. How to choose an optimal schedule depends not only on the type, but also on the size, of the problem. Finding a good heuristic schedule based on trial and error can be tedious.
The comparison leads to a conclusion in favor of the Hummel-Zucker algorithm. It yields solutions of good quality, quite comparable with those of the time-consuming simulated annealing, yet it is much more efficient than SA, thus striking a good balance between quality and cost. Further, it avoids the cumbersome need for heuristic tuning of annealing schedules in the annealing algorithms. MFT annealing would also be a good choice if the rapid deterioration of its solutions on ``harder'' problems could be remedied. A recent experimental result [Li 1995d] shows that the Lagrange-Hopfield algorithm described in Section 8.3.4 yields solutions of quality similar to those produced by the Hummel-Zucker algorithm.
An earlier experimental comparison of several RL algorithms, in terms of the number of iterations (cost) and the number of errors (quality), was reported by Price [Price 1985]. Price drew the following conclusions: the algorithm of Faugeras and Price [Faugeras and Price 1981] is the best, that of Hummel and Zucker [Hummel and Zucker 1983] performs about as well, that of Peleg [Peleg 1980] converges too fast to yield good results, and that of Rosenfeld et al. performs only adequately. The goodness of interpretation in [Price 1985] relates not only to the algorithms themselves but also to how the problem is formulated (e.g. the definition of the compatibilities); it thus reflects a combination of the two issues of problem formulation and computational algorithm. Here, we measure the solution quality by the quantitative gain and regard RL as just a mechanism of minimization rather than one of interpretation. The interest of our comparison is in how well an algorithm can minimize an energy, regardless of the problem formulation.