Feature Extraction (Local Descriptors)
2025-08-12
Why Local Features?
In computer vision, local features are designed to describe distinctive, repeatable patterns in small regions of an image. These features enable:
- Matching between different images (e.g., for image stitching, object recognition)
- Robustness to geometric and photometric transformations (rotation, scale, lighting)
- Efficiency in handling large scenes with partial views or occlusions
Key Concepts Overview
1. Keypoint Detection
Keypoints are specific, stable locations in the image that are distinctive and repeatable — e.g., corners or blob-like regions (edges alone are usually avoided, since they are poorly localized along their length).
- A good keypoint detector should be invariant to image transformations (e.g., scale, rotation).
- Common algorithms: DoG (SIFT), Harris corner, FAST (ORB), etc.
2. Descriptor Generation
A descriptor is a fixed-length vector (e.g., 128-dim for SIFT, 32 bytes for ORB) that encodes the appearance of the image region around a keypoint.
- It must be robust and distinctive to enable reliable matching.
- Descriptors can be floating-point (e.g., SIFT, SURF) or binary (e.g., ORB, BRIEF).
3. Invariance Properties
A good local feature has invariance to:
- Scale (size of object in image)
- Rotation
- Affine transformations
- Lighting changes
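To make the detector/descriptor split concrete, here is a minimal OpenCV sketch (not part of the original notes; the file name is a placeholder and it assumes OpenCV >= 4.4, where SIFT is in the main module). It detects keypoints and computes descriptors with both SIFT and ORB, showing the floating-point vs. binary formats mentioned above.

```python
# Minimal sketch: keypoints + descriptors with SIFT (float) and ORB (binary).
# "scene.jpg" is a placeholder; assumes OpenCV >= 4.4.
import cv2

img = cv2.imread("scene.jpg", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp_sift, des_sift = sift.detectAndCompute(img, None)
print(des_sift.shape, des_sift.dtype)  # (N, 128), float32 -> 128-dim float descriptor

orb = cv2.ORB_create()
kp_orb, des_orb = orb.detectAndCompute(img, None)
print(des_orb.shape, des_orb.dtype)    # (M, 32), uint8 -> 32 bytes = 256-bit binary descriptor
```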
Local Feature Methods
SIFT — Scale-Invariant Feature Transform
Developed by David Lowe (2004), SIFT is one of the most influential feature extraction algorithms.
Key Properties:
- Scale and rotation invariant
- Robust to noise and small affine distortions
- Widely used in academic research and industry
Steps:
1. Scale-space extrema detection
- Build a scale space by convolving the image with Gaussian kernels of increasing σ:
L(x, y, σ) = G(x, y, σ) ∗ I(x, y)
- Compute the Difference of Gaussians (DoG), a cheap approximation of the scale-normalized Laplacian:
D(x, y, σ) = L(x, y, kσ) − L(x, y, σ)
- Detect local extrema of D in 3D (x, y, scale), comparing each sample with its 26 neighbours (a minimal sketch of this step follows after step 4).
2. Keypoint localization
- Fit a 3D quadratic function to refine the keypoint location, discard low-contrast points, and reject edge-like responses using the ratio of principal curvatures of the Hessian matrix.
3. Orientation assignment
- Compute image gradients ∇I(x,y) in a region around the keypoint.
- Assign one or more dominant orientations using a histogram (36 bins over 360°).
4. Descriptor computation
- Take a 16×16 patch around the keypoint, sampled at its scale and rotated to its dominant orientation.
- Divide it into 4×4 subregions.
- Compute gradient orientation histograms in each subregion (8 bins).
- Result: 4×4×8=128-dimensional descriptor.
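As referenced in step 1, the following is a rough NumPy/OpenCV sketch of the scale-space and DoG construction. It is illustrative only, not the full SIFT implementation: real SIFT also organizes scales into octaves by repeated downsampling, and the image path and coordinates below are placeholders.

```python
# Sketch of step 1: Gaussian scale space, DoG, and a brute-force extremum check.
import cv2
import numpy as np

img = cv2.imread("scene.jpg", cv2.IMREAD_GRAYSCALE).astype(np.float32)

sigma0, k, num_scales = 1.6, 2 ** 0.5, 5
# L(x, y, sigma) = G(x, y, sigma) * I(x, y): progressively blurred copies of the image
L = [cv2.GaussianBlur(img, (0, 0), sigma0 * k ** i) for i in range(num_scales)]
# D(x, y, sigma) = L(x, y, k*sigma) - L(x, y, sigma)
D = [L[i + 1] - L[i] for i in range(num_scales - 1)]

# A candidate keypoint is an extremum among its 26 neighbours in (x, y, scale).
# Brute-force check for one interior pixel and one interior scale:
s, y, x = 1, 100, 100
cube = np.stack(D[s - 1:s + 2])[:, y - 1:y + 2, x - 1:x + 2]
is_extremum = D[s][y, x] in (cube.max(), cube.min())
print("extremum at (100, 100, scale 1):", is_extremum)
```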
Summary:
- Detector: DoG in scale space
- Descriptor: 128-dim floating point
- Invariant to: Scale, rotation, illumination, partial affine
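A typical use of these descriptors is matching two views of the same scene. Below is a hedged usage sketch (file names are placeholders): SIFT descriptors are compared with L2 distance and filtered with Lowe's ratio test, which keeps a match only when it is clearly better than the second-best candidate.

```python
# Sketch: matching SIFT descriptors between two images with a ratio test.
import cv2

img1 = cv2.imread("left.jpg", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("right.jpg", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

matcher = cv2.BFMatcher(cv2.NORM_L2)          # float descriptors -> Euclidean distance
matches = matcher.knnMatch(des1, des2, k=2)   # two nearest neighbours per query

# Lowe's ratio test: accept a match only if it is much closer than the runner-up
good = [m for m, n in matches if m.distance < 0.75 * n.distance]
print(len(good), "good matches")
```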
SURF — Speeded Up Robust Features
Developed by Bay et al. (2006), SURF is a faster approximation of SIFT.
Key Properties:
- Faster than SIFT (uses integral images)
- Based on Haar wavelet responses
- ⚠️ Patented — not free for commercial use
Method Overview:
- Scale-space: Uses box filters of increasing size instead of repeated Gaussian blurring; these are evaluated in constant time with integral images (see the sketch after this list).
- Keypoint detection: Blobs are found at maxima of the determinant of the (approximated) Hessian matrix:
H(x, σ) = [ Lxx(x, σ)  Lxy(x, σ) ; Lxy(x, σ)  Lyy(x, σ) ]
where Lxx, Lxy, Lyy are second-order Gaussian derivative responses.
- Orientation: Calculated using Haar wavelet responses around the keypoint
- Descriptor: 64- or 128-dimensional vector built from sums of Haar wavelet responses in subregions
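The speed of SURF comes largely from the integral-image trick mentioned above. This small sketch (illustrative only; the file name is a placeholder) shows why: once the integral image is built, the sum over any axis-aligned box costs four array lookups regardless of box size, so large box filters are as cheap as small ones.

```python
# Sketch: constant-time box sums with an integral image.
import cv2

img = cv2.imread("scene.jpg", cv2.IMREAD_GRAYSCALE)
ii = cv2.integral(img)  # shape (H+1, W+1); ii[y, x] = sum of img[:y, :x]

def box_sum(ii, y0, x0, y1, x1):
    """Sum of img[y0:y1, x0:x1] using four lookups."""
    return int(ii[y1, x1] - ii[y0, x1] - ii[y1, x0] + ii[y0, x0])

print(box_sum(ii, 10, 10, 50, 50))    # O(1), independent of box size
print(int(img[10:50, 10:50].sum()))   # same value, computed the slow way
```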
ORB — Oriented FAST and Rotated BRIEF
Developed by Rublee et al. (2011) at OpenCV Labs as an efficient, open-source alternative to SIFT and SURF.
Key Properties:
- FAST + BRIEF with rotation invariance
- Binary descriptor (much smaller and faster)
- Free to use, suitable for real-time systems
Method Overview:
1. Keypoint Detection (FAST):
- Detect corners using FAST algorithm (rapid intensity comparison).
- Apply Harris corner score to retain best keypoints.
2. Orientation Assignment:
- Compute the intensity centroid (xc, yc) = (m10/m00, m01/m00) of a patch around the keypoint.
- Estimate the orientation θ from the offset between the patch centre and the centroid (a toy sketch of this computation follows after step 3):
θ = arctan(m01 / m10), where the image moments are m_pq = Σ x^p y^q I(x, y)
3. Descriptor (Rotated BRIEF):
- Use binary tests (intensity comparisons) on pairs of pixels.
- Rotate test pattern according to keypoint orientation.
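Step 2's orientation estimate can be written in a few lines of NumPy. The sketch below is a toy version (the patch is a random placeholder, and the circular masking used by the real ORB is omitted): it computes the image moments about the patch centre and takes the angle of the offset to the intensity centroid.

```python
# Toy sketch of the intensity-centroid orientation used by ORB.
import numpy as np

def patch_orientation(patch):
    h, w = patch.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(float)
    xs -= (w - 1) / 2.0          # coordinates relative to the patch centre
    ys -= (h - 1) / 2.0
    m10 = np.sum(xs * patch)     # m_pq = sum over x, y of x^p y^q I(x, y)
    m01 = np.sum(ys * patch)
    return np.arctan2(m01, m10)  # angle of the centre-to-centroid offset

patch = np.random.rand(31, 31)   # placeholder patch around a keypoint
print(np.degrees(patch_orientation(patch)))
```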
Summary:
- Detector: FAST with Harris ranking
- Descriptor: 256-bit binary string
- Invariant to: Rotation; only partial scale invariance, obtained by detecting on an image pyramid rather than by true scale selection
- Very fast, ideal for mobile or embedded devices
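Because the descriptor is a binary string, ORB matches are scored with Hamming distance rather than L2, which is part of what makes it fast. A short hedged sketch (file names are placeholders):

```python
# Sketch: matching binary ORB descriptors with Hamming distance.
import cv2

img1 = cv2.imread("left.jpg", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("right.jpg", cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create(nfeatures=1000)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)
print(len(matches), "cross-checked matches")
```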
HOG — Histogram of Oriented Gradients
Originally proposed for pedestrian detection by Dalal and Triggs (2005).
Key Properties:
- Describes object shape and structure
- Based on local gradient orientation histograms
- Often used in object detection with SVM
Method Steps:
1. Compute gradients
- Horizontal and vertical gradients using simple 1-D kernels (e.g., [-1, 0, 1]).
2. Divide image into cells
- Each cell: e.g., 8×8 pixels.
3. Orientation histogram
- For each cell, create a histogram of gradient directions (e.g., 9 bins from 0° to 180°).
4. Block normalization
- Group adjacent cells into blocks (e.g., 2×2 cells).
- Normalize histograms to reduce illumination sensitivity.
5. Concatenate all normalized histograms
- Result: High-dimensional feature vector representing the entire region or image.
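The steps above map directly onto OpenCV's HOGDescriptor. The sketch below (the image path is a placeholder) uses the default Dalal-Triggs parameters: a 64×128 window, 8×8 cells, 2×2-cell blocks with an 8-pixel stride, and 9 bins, which works out to 105 blocks × 36 values = 3780 dimensions.

```python
# Sketch: computing a HOG descriptor with OpenCV's default parameters.
import cv2

img = cv2.imread("person.jpg", cv2.IMREAD_GRAYSCALE)
img = cv2.resize(img, (64, 128))   # match the default 64x128 detection window

hog = cv2.HOGDescriptor()          # default cell/block/bin settings (Dalal & Triggs)
descriptor = hog.compute(img)
print(descriptor.size)             # 3780 = 7x15 blocks * 2x2 cells * 9 bins
```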
Summary:
- Descriptor only (no keypoint detection)
- Dimensionality: Often 3k–10k dimensions
- Invariant to: Illumination, small local distortions
Summary Comparison Table
| Method | Descriptor type | Invariance | Descriptor size | License |
|---|---|---|---|---|
| SIFT | Float | Scale, rotation | 128-dim | Originally patented; patent expired (2020), now free to use |
| SURF | Float | Scale, rotation | 64–128 dim | Patented |
| ORB | Binary | Rotation (partial scale via pyramid) | 256-bit (32 bytes) | Free |
| HOG | Float | Lighting, small local distortions | Varies (often 3k–10k dims) | Free |
Practical Tips
- Use ORB for lightweight, real-time systems.
- Use SIFT when robustness and accuracy are more important than speed.
- Use HOG for shape-based tasks (e.g., pedestrian detection).
- Visualize keypoints using OpenCV:
```python
# Example: ORB keypoint visualization ("scene.jpg" is a placeholder path)
import cv2
image = cv2.imread("scene.jpg", cv2.IMREAD_GRAYSCALE)
orb = cv2.ORB_create()
keypoints = orb.detect(image, None)
image_with_kp = cv2.drawKeypoints(image, keypoints, None, flags=0)
cv2.imshow("Keypoints", image_with_kp)
cv2.waitKey(0)
```