Book contents
- Frontmatter
- Dedication
- Contents
- Preface
- For the Instructor
- Part I Preliminaries
- Part II Preprocessing
- Part III Image Understanding
- 8 Segmentation
- 9 Parametric Transforms
- 10 Representing and Matching Shape
- 11 Representing and Matching Scenes
- Part IV The 2D Image in a 3D World
- A Support Vector Machines
- B How to Differentiate a Function Containing a Kernel Operator
- C The Image File System (IFS) Software
- Author Index
- Subject Index
- References
11 - Representing and Matching Scenes
from Part III - Image Understanding
Published online by Cambridge University Press: 25 October 2017
Summary
One of these things is not like the other.
– Sesame Street
Introduction
In this chapter, rather than matching regions as we did in Chapter 10, we consider issues associated with matching scenes.
Matching at this level establishes an interpretation. That is, it puts two representations into correspondence:
• (Section 11.2) In this section, both representations may be of the same form. For example, correlation matches an observed image with a template, an approach called template matching. Eigenimages are another representation for images; they use the concepts of principal components to match images.
• (Section 11.3) When matching scenes, we do not really want to match every single pixel, but only to match at points that are “interesting.” This requires a definition of interest points.
• (Sections 11.4, 11.5, and 11.6) Once the interest points are identified, these sections develop three methods, SIFT, SKS, and HoG, for describing the neighborhood of the interest points using descriptors and then matching those descriptors.
• (Section 11.7) If the scene is represented abstractly, as nodes in a graph, methods are provided for matching graphs.
• (Sections 11.8 and 11.9) In these sections, two other matching methods, including deformable templates, are described.
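The eigenimage idea mentioned in the first bullet above can be sketched briefly: training images are flattened into vectors, a principal-component basis is extracted from them, and a query image is matched by comparing its projection coefficients against those of the gallery. The function names (`eigenimage_basis`, `match`) and the SVD-based computation are illustrative assumptions, not the book's own code.

```python
import numpy as np

def eigenimage_basis(images, k):
    """images: (n, d) array whose rows are flattened training images.

    Returns the mean image and the top-k principal components
    (the "eigenimages"), computed via SVD of the centered data.
    """
    mean = images.mean(axis=0)
    centered = images - mean
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    return mean, Vt[:k]

def project(img, mean, basis):
    # Coefficients of the (centered) image in the eigenimage basis.
    return basis @ (img - mean)

def match(query, gallery, mean, basis):
    # Return the index of the gallery image whose projection
    # coefficients lie closest to those of the query.
    q = project(query, mean, basis)
    coeffs = np.array([project(g, mean, basis) for g in gallery])
    return int(np.argmin(np.linalg.norm(coeffs - q, axis=1)))
```

Matching in the low-dimensional coefficient space, rather than pixel space, is what makes the principal-component representation attractive.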
As we investigate matching scenes, or components of scenes, a new word is introduced: descriptor. This word denotes a representation for a local neighborhood in a scene, a neighborhood of perhaps 200 pixels, larger than the kernels we have thought about, but smaller than templates. The terms kernel, template, and descriptor, while they do connote size to some extent, really describe how this local representation is used, as the reader will see.
Matching Iconic Representations
Matching Templates to Scenes
Recall that an iconic representation of an image is an image, e.g., a smaller image, an image that is not blurred, etc. In this section, we need to match two images.
A template is a representation for an image (or sub-image) that is itself an image, but almost always smaller than the original. A template is typically moved around the target image until a location is found that optimizes some match function. The most obvious such function is the sum-squared error, sometimes referred to as the sum-squared difference (SSD),

SSD(x, y) = Σ_{i=0}^{N−1} Σ_{j=0}^{N−1} (f(x + i, y + j) − T(i, j))²,

which provides a measure of how well the template T matches the image f at point (x, y), assuming the template is N × N. Since the SSD is an error measure, the best match is the location that minimizes it.
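Sliding an N × N template over the image and evaluating the SSD at each offset can be sketched as follows; the function name `ssd_match` and the brute-force double loop are illustrative assumptions, not the book's own code.

```python
import numpy as np

def ssd_match(f, T):
    """Slide template T over image f, computing the SSD at each offset.

    f: (H, W) image, T: (N, N) template. Returns an
    (H - N + 1, W - N + 1) surface; the best match is its argmin,
    since SSD measures error rather than similarity.
    """
    N = T.shape[0]
    H, W = f.shape
    out = np.empty((H - N + 1, W - N + 1))
    for y in range(H - N + 1):
        for x in range(W - N + 1):
            patch = f[y:y + N, x:x + N]
            out[y, x] = np.sum((patch - T) ** 2)
    return out
```

A usage sketch: plant the template inside an otherwise blank image, compute the surface, and recover the planted location with `np.unravel_index(np.argmin(surface), surface.shape)`. In practice the nested loops would be replaced by a correlation-based formulation for speed, but the brute-force version makes the definition explicit.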
Fundamentals of Computer Vision, pp. 267–300. Publisher: Cambridge University Press. Print publication year: 2017.