Annealing

Next: Mean Field Annealing Up: Simulated Annealing Previous: Random Sampling Algorithms

9.1.2

Annealing

Figure 9.2: The simulated annealing algorithm.

Consider a system in which any f in the configuration space has probability

where T>0 is the temperature of the system. In the limit as , is a uniform distribution on ; for T=1, ; as , is concentrated on the peak(s) of . Of course, the same is also true with a posterior probability .

The SA algorithm is described in Fig.9.2. SA applies a sampling algorithm, such as Metropolis, successively at decreasing values of temperature T. Initially, T is set very high and f is set to a random configuration. At a fixed T, the sampling is according to the Gibbs distribution . After the Metropolis algorithm converges to the equilibrium at current T, T is decreased according to a carefully chosen schedule. This iterates until T is close to 0 when the system is ``frozen'' near the minimum of . The cooling schedule, specified by a decrement function and a final value, plays an important part; see below.

Two convergence theorems are proven by [Geman and Geman 1984]. The first concerns the convergence of the Metropolis algorithm. It says that if every configuration is visited infinitely often, the distribution of generated configurations is guaranteed to converge to the Boltzmann ( i.e.

Gibbs) distribution.

The second is about SA. It states that if the decreasing sequence of temperatures satisfy

and

where , then the system converges to the global minimum regardless of the initial configuration . Note that the above conditions are sufficient but not necessary for the convergence.

Unfortunately, the schedule (9.4) is too slow to be of practical use. In practice, heuristic, faster schedules have to be used instead. Geman and Geman (1984) adopt the following

where the constant C is set or for their problem. Kirkpatrick et al. (1983) choose

where typically takes a value between and . The initial temperature is set high enough so that essentially all configuration changes are accepted. At each temperature, enough configurations are tried that either there are 10m (10 times the number of sites) accepted configuration transitions or the number of tries exceeds 100m. The system is frozen and annealing stops if the desired number of acceptances is not achieved at three successive temperatures. In [Kato et al. 1993a], a multi-temperature annealing scheme is proposed for annealing in multi-resolution computation; there, the temperature in the Gibbs distribution is related not only to the time but also to the scale.

The two sampling procedures for SA, i.e. Metropolis algorithm [Metropolis and etal 1953] or Gibbs sampler [Geman and Geman 1984], are proven to be asymptotically equivalent for SA performed on lattices but this is not generally true for non-lattice structures [Chiang and Chow 1992]. The interested reader is referred to [Kirkpatrick et al. 1982,Geman and Geman 1984,Aarts 1989] for devoted discussions on SA.

A performance comparison between stochastic SA and deterministic GNC for continuous visual reconstruction is given by [Blake 1989]. Based on his experiments under controlled conditions, Black argues that GNC excels SA both in computational efficiency and problem-solving power. An experimental comparison between SA and deterministic RL algorithms for combinatorial optimization will be presented in Section 9.5.

Next: Mean Field Annealing Up: Simulated Annealing Previous: Random Sampling Algorithms