7.2.1
In optimization-based recognition, the optimal solution is explicitly defined as the extremum of an objective function. Let $f$ be a configuration representing a recognition solution. The cost of $f$ is measured by a global objective function $E(f \mid \theta)$, also called the energy. The definition of $E$ depends on $f$ and on $K+1$ parameters $\theta = [\theta_0, \theta_1, \ldots, \theta_K]^T$. As the optimality criterion for model-based recognition, it also relates to other factors such as the observation, denoted $\mathcal{G}$, and the model references, denoted $\mathcal{G}'$. Given $\mathcal{G}$ and $\mathcal{G}'$, the energy maps a solution $f$ to a real number by which the cost of the solution is evaluated. The optimal solution corresponds to the global energy minimum, expressed as
$$f^* = \arg\min_f E(f \mid \theta)$$
In this regard, it is important to formulate the energy function so that the ``correct solution'' is embedded as the global minimum. The energy may also serve as a guide to the search for a minimal solution. In this respect, it is desirable that the energy differentiate the global minimum from other configurations as much as possible.
The energy function may be derived using one of the following probabilistic approaches: fully parametric, partially parametric and non-parametric. In the fully parametric approach, the energy function is derived from probability distributions in which all the involved parameters are known. Parameter estimation is a problem only in the partially parametric and the non-parametric cases.
In the partially parametric case, the forms of the distributions are given but some of the parameters involved are unknown. One example is the Gaussian distribution with an unknown variance; another is the Gibbs distribution with unknown clique potential parameters. In this case, estimating the parameters in the objective function amounts to estimating the parameters of the underlying probability distributions.
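As a small worked illustration of the Gaussian example, assume (purely for illustration) that the observation consists of $m$ values $d_i$, each equal to the corresponding component $f_i$ of the solution plus independent zero-mean Gaussian noise of unknown variance $\sigma^2$. The symbols $d_i$, $m$ and $\sigma$ are assumptions introduced here, not notation from the text. Taking the negative log-likelihood and dropping the term that does not depend on $f$ gives an energy with a single unknown weight:

```latex
-\ln P(d \mid f)
  = \frac{1}{2\sigma^2} \sum_{i=1}^{m} (d_i - f_i)^2 + m \ln\!\left(\sigma\sqrt{2\pi}\right)
\quad\Longrightarrow\quad
E(f \mid \theta) = \theta \sum_{i=1}^{m} (d_i - f_i)^2,
\qquad \theta = \frac{1}{2\sigma^2}
```

Under these assumptions, estimating the energy parameter $\theta$ is the same problem as estimating the variance $\sigma^2$ of the distribution.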
In the non-parametric approach, no assumptions are made about the distributions, and the form of the objective function is obtained from experience or from pre-specified ``basis functions'' [Poggio and Edelman 1990]. This also applies to situations where the data set is too small to have statistical significance.
An important form for $E$ in object recognition is the weighted sum of various terms, expressed as
$$E(f \mid \theta) = \theta^T U(f) = \sum_{k=0}^{K} \theta_k U_k(f)$$
where $U(f) = [U_0(f), U_1(f), \ldots, U_K(f)]^T$ is a vector of potential functions. A potential function depends on $f$, the observation $\mathcal{G}$ and the model references $\mathcal{G}'$, where the dependence can be nonlinear in $f$, and often measures the violation of a certain constraint incurred by the solution $f$. This linear combination of (nonlinear) potential functions is not an unusual form. It has been used in many matching and recognition works; see [Duda and Hart 1973, Fischler and Elschlager 1973, Davis 1979, Gharaman et al. 1980, Jacobus et al. 1980, Shapiro and Haralick 1981, Oshima and Shirai 1983, Bhanu and Faugeras 1984, Wong and You 1985, Fan et al. 1989, Nasrabadi et al. 1990, Wells III 1991, Weng et al. 1992, Li 1994a].
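The weighted-sum energy and its minimization can be sketched on a toy problem. The following is a minimal illustration only, not the formulation used in the cited works: the two potentials (a count of unmatched scene items and a sum of squared feature mismatches), the function names `potentials` and `minimize`, and all data values are hypothetical. Exhaustive search stands in for whatever minimization algorithm would be used in practice.

```python
from itertools import product


def energy(theta, u):
    """Weighted sum E(f | theta) = sum_k theta_k * U_k(f)."""
    return sum(t * uk for t, uk in zip(theta, u))


def potentials(f, scene, model):
    """Hypothetical potential vector U(f) for a toy labeling problem.

    f[i] is the index of the model item matched to scene item i,
    or None when scene item i is left unmatched.
    U_0 counts unmatched items; U_1 sums squared feature mismatches.
    """
    u0 = sum(1 for label in f if label is None)
    u1 = sum((scene[i] - model[label]) ** 2
             for i, label in enumerate(f) if label is not None)
    return [u0, u1]


def minimize(theta, scene, model):
    """Exhaustive search for argmin_f E(f | theta); toy sizes only."""
    labels = [None] + list(range(len(model)))
    return min(product(labels, repeat=len(scene)),
               key=lambda f: energy(theta, potentials(f, scene, model)))


scene = [1.0, 5.2, 9.0]  # observed features (made-up values)
model = [1.1, 5.0]       # model features (made-up values)
theta = [0.6, 0.8]       # weights, chosen to have unit Euclidean length

f_star = minimize(theta, scene, model)
print(f_star)
```

With these made-up values, the first two scene items are matched to the nearest model features and the third is left unmatched, because leaving it unmatched costs less than either available mismatch. Changing the relative sizes of the two weights changes that trade-off, which is why the choice of weights matters.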
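Note that when $E(f \mid \theta)$ takes the linear form, multiplying $\theta$ by a positive factor $\kappa > 0$ does not change the minimal configuration:
$$\arg\min_f E(f \mid \theta) = \arg\min_f E(f \mid \kappa\theta)$$
Because of this equivalence, an additional constraint should be imposed on $\theta$ for uniqueness. In this work, $\theta$ is confined to have unit Euclidean length:
$$\|\theta\| = \sqrt{\sum_{k=0}^{K} \theta_k^2} = 1$$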
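The scale invariance and the unit-length constraint can be checked numerically. This is a minimal sketch under stated assumptions: the candidate potential vectors and the helper names `normalize` and `argmin_energy` are hypothetical, and each candidate configuration is represented only by its potential vector.

```python
import math


def normalize(theta):
    """Rescale theta to unit Euclidean length."""
    norm = math.sqrt(sum(t * t for t in theta))
    if norm == 0.0:
        raise ValueError("theta must be non-zero")
    return [t / norm for t in theta]


def argmin_energy(theta, candidates):
    """Index of the candidate whose potential vector minimizes theta^T U."""
    return min(range(len(candidates)),
               key=lambda i: sum(t * u for t, u in zip(theta, candidates[i])))


# Hypothetical potential vectors U(f) for three candidate configurations.
candidates = [[2.0, 0.5], [0.0, 3.0], [1.0, 1.0]]

theta = [3.0, 4.0]
unit_theta = normalize(theta)             # same direction, unit length
scaled_theta = [10.0 * t for t in theta]  # positive rescaling of theta

# All three weight vectors select the same minimal configuration.
best = [argmin_energy(w, candidates)
        for w in (theta, unit_theta, scaled_theta)]
print(best)
```

Because a positive rescaling multiplies every energy by the same factor, the ordering of the candidates is unchanged. Normalizing to unit length simply picks one representative from each family of equivalent weight vectors.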
Given an observation $\mathcal{G}$, a model reference $\mathcal{G}'$ and the form of $U(f)$, it is the value of $\theta$ that completely specifies the energy function $E(f \mid \theta)$ and thereby defines the minimal solution $f^*$. It is desirable to learn the parameters from exemplars so that the minimization-based recognition is performed correctly. The criteria for this purpose are established in the next subsection.