1.1.3
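In terms of the regularity of the sites and the continuity of the labels, we may classify a vision labeling problem into one of the following four categories:
LP1: Regular sites with continuous labels.
LP2: Regular sites with discrete labels.
LP3: Irregular sites with discrete labels.
LP4: Irregular sites with continuous labels.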
Restoration or smoothing of images having continuous pixel values is an
LP1. The set of sites corresponds to image pixels and the set
of labels is a real interval. The restoration is to estimate the
true image signal from a degraded or noise-corrupted image.
Restoration of binary or multi-level images is an LP2. As in
continuous restoration, the aim is to estimate the true image
signal from the input image. The difference is that each pixel in the
resulting image assumes a discrete value, and thus the label set in this case
is a set of discrete labels.
Region segmentation is an LP2. It partitions an observation image into mutually exclusive regions, each of which has some uniform and homogeneous properties whose values are significantly different from those of the neighboring regions. The property can be, for example, grey tone, color or texture. Pixels within each region are assigned a unique label.
The prior assumption in the above problems is that the signal is smooth or piecewise smooth. This is complementary to the assumption of abrupt changes made for edge detection.
Edge detection is also an LP2. Each edge site, located between two neighboring pixels, is assigned a label in {edge, non-edge}; the label edge is assigned when there is a significant difference between the two pixels. Continuous restoration with discontinuities can be viewed as a combination of LP1 and LP2.
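The notion of edge sites lying between neighboring pixels can be illustrated with a minimal Python sketch. It is only an illustration: the function name label_horizontal_edge_sites, the fixed threshold, and the restriction to horizontal neighbors are assumptions made here, and a simple thresholding rule stands in for whatever criterion of "significant difference" an actual edge detector would use.

```python
def label_horizontal_edge_sites(image, threshold):
    """Label each site between two horizontally adjacent pixels.

    A site gets "edge" when the absolute difference of its two
    neighboring pixels exceeds `threshold` (an assumed criterion),
    and "non-edge" otherwise.
    """
    labels = []
    for row in image:
        labels.append([
            "edge" if abs(left - right) > threshold else "non-edge"
            for left, right in zip(row, row[1:])
        ])
    return labels


# A 2x4 image with a sharp intensity jump between columns 1 and 2.
image = [
    [10, 12, 80, 82],
    [11, 13, 79, 81],
]
edge_labels = label_horizontal_edge_sites(image, threshold=30)
```

Each row of four pixels yields three edge sites, so the sites form a regular lattice of their own and the label set is discrete, which is what makes this an LP2.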
Perceptual grouping is an LP3. The sites usually correspond to initially segmented features (points, lines and regions) which are irregularly arranged. The fragmentary features are to be organized into perceptually more significant features. Each pair of features is assigned a label in {connected, disconnected}, indicating whether the two features should be linked.
Feature-based object matching and recognition is an LP3. Each site indexes an image feature such as a point, a line segment or a region. Labels are discrete in nature and each of them indexes a model feature. The resulting configuration is a mapping from the image features to those of a model object.
Pose estimation from a set of point correspondences might be formulated
as an LP4. A site is a given correspondence. A label represents an
admissible (orthogonal, affine or perspective) transformation. A prior
(unary) constraint is that each transformation label must itself be
orthogonal, affine or perspective. A mutual constraint is that the
labels should be close to each other so that they form a
consistent transformation.
For a discrete labeling problem of m sites and M labels, there exist
a total of M^m possible labelings, since each site independently takes one of the M labels. For a continuous labeling
problem, there are infinitely many. However, among all the possibilities, only a few are optimal in terms of a criterion
measuring the goodness (or, inversely, the cost) of solutions. Seeking such optimal labelings is
the optimization approach to visual labeling.
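To make the size of the discrete labeling space concrete, the following Python sketch enumerates every labeling of m sites with M labels (M^m in total, since each site independently takes one of the M labels) and selects the one of lowest cost by exhaustive search. The cost function here is purely hypothetical and chosen for illustration: it counts disagreements with an observed labeling plus label changes between adjacent sites on a 1D chain. The names enumerate_labelings, cost, best_labeling and smooth_weight are likewise assumptions, not definitions from the text.

```python
from itertools import product


def enumerate_labelings(num_sites, labels):
    """Return an iterator over all labelings, each a tuple with one label per site."""
    return product(labels, repeat=num_sites)


def cost(labeling, observed, smooth_weight=1.0):
    """Hypothetical cost of a labeling on a 1D chain of sites.

    data:   number of sites whose label differs from the observation
    smooth: number of adjacent site pairs carrying different labels
    """
    data = sum(1 for f, d in zip(labeling, observed) if f != d)
    smooth = sum(1 for a, b in zip(labeling, labeling[1:]) if a != b)
    return data + smooth_weight * smooth


def best_labeling(observed, labels, smooth_weight=1.0):
    """Exhaustively search all labelings for the one of minimum cost."""
    return min(
        enumerate_labelings(len(observed), labels),
        key=lambda f: cost(f, observed, smooth_weight),
    )


observed = (0, 0, 1, 0, 0)   # m = 5 sites, one isolated "noisy" label
labels = (0, 1)              # M = 2 labels

num_labelings = len(list(enumerate_labelings(len(observed), labels)))
best = best_labeling(observed, labels)
```

Exhaustive search is feasible only for very small m, because the number of labelings grows exponentially with the number of sites; this is why practical methods rely on more efficient optimization algorithms.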