Feature Matching and Geometric Estimation
2025-08-12
We’ll cover: matching methods → match refinement → geometric transformation estimation, and connect them to practical image registration.
Objective of Feature Matching & Geometric Estimation
Given keypoints and descriptors (from SIFT, ORB, etc.), we want to:
- Find correspondences between two images.
- Filter out incorrect matches.
- Estimate the geometric transformation relating the two images.
These steps are foundational for image stitching, 3D reconstruction, object recognition, and tracking.
1 Feature Matching Methods
1.1 Brute-Force Matcher
Concept: For each descriptor in Image A, compute its distance to all descriptors in Image B, and take the closest.
Distance metric depends on descriptor type:
- For SIFT / SURF (float descriptors): Euclidean distance (L2 norm)
$$d(f_1, f_2) = \sqrt{\sum_{i=1}^{n} (f_{1i} - f_{2i})^2}$$
- For ORB / BRIEF (binary descriptors): Hamming distance
$$d_{\text{Hamming}}(b_1, b_2) = \text{number of differing bits}$$
Pros: Simple, exact.
Cons: Slow for large descriptor sets ($O(NM)$ comparisons for $N$ and $M$ descriptors).
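A minimal sketch of the binary case in OpenCV (the file names are placeholders):

```python
import cv2

img1 = cv2.imread('img1.jpg', cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread('img2.jpg', cv2.IMREAD_GRAYSCALE)

# ORB yields binary descriptors, so Hamming distance is the right metric.
orb = cv2.ORB_create()
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

# crossCheck=True keeps only mutual matches (A->B and B->A agree),
# a cheap alternative to the ratio test described below.
bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(bf.match(des1, des2), key=lambda m: m.distance)
```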
1.2 FLANN — Fast Library for Approximate Nearest Neighbors
Concept: Accelerates nearest neighbor search using tree-based indexing.
Common methods:
- KD-Tree (good for low-dimensional float descriptors like SIFT)
- LSH (Locality Sensitive Hashing) (good for binary descriptors like ORB)
Returns approximate nearest neighbors (speed vs. accuracy trade-off).
Distance metrics are the same as above (L2 for float, Hamming for binary).
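A minimal OpenCV sketch, assuming `des1` and `des2` are SIFT (float) descriptors computed as in Section 5:

```python
import cv2

# KD-tree index for float descriptors such as SIFT.
# For binary descriptors (ORB), use FLANN_INDEX_LSH (= 6) instead.
FLANN_INDEX_KDTREE = 1
index_params = dict(algorithm=FLANN_INDEX_KDTREE, trees=5)
search_params = dict(checks=50)  # more checks: more accurate but slower

flann = cv2.FlannBasedMatcher(index_params, search_params)
matches = flann.knnMatch(des1, des2, k=2)  # k=2 enables the ratio test below
```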
2 Matching Optimization
2.1 Lowe’s Ratio Test
Purpose: Remove ambiguous matches caused by repetitive patterns or noise.
Idea: For a descriptor in Image A:
- Find its two nearest neighbors in Image B, at distances $d_1$ and $d_2$ with $d_1 < d_2$.
- Accept the match only if:
$$\frac{d_1}{d_2} < \tau$$
Typically $\tau = 0.75$.
Reasoning: If the closest match is much better than the second-best, it’s likely correct. If distances are similar, the match is unreliable.
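A toy illustration with hypothetical distances (the numbers are made up for the example):

```python
# Ratio test with tau = 0.75 on hypothetical neighbor distances.
d1, d2 = 120.0, 200.0
print(d1 / d2 < 0.75)  # 0.60 -> True: distinctive match, keep it
d1, d2 = 180.0, 200.0
print(d1 / d2 < 0.75)  # 0.90 -> False: ambiguous match, discard it
```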
2.2 RANSAC — Random Sample Consensus
Purpose: Remove outlier matches that do not fit the global transformation model.
Algorithm:
- Randomly pick a minimal subset of matches (e.g., 4 for homography, 3 for affine).
- Estimate the transformation model T from this subset.
- Count how many matches fit T within a tolerance (these are inliers).
- Repeat many times and keep the model with the most inliers.
- Re-estimate T using all inliers.
Mathematical model: if $p$ is the inlier ratio, $s$ the sample size, and $P$ the desired probability of drawing at least one all-inlier sample, the number of iterations needed is:
$$N = \frac{\log(1 - P)}{\log(1 - p^s)}$$
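A small helper makes the formula concrete (the helper name and parameter values are illustrative, not from any library):

```python
import math

def ransac_iterations(p, s, P=0.99):
    """Iterations N such that, with probability P, at least one random
    sample of size s contains only inliers (inlier ratio p)."""
    return math.ceil(math.log(1 - P) / math.log(1 - p ** s))

# Homography (s=4) with 50% inliers and 99% confidence:
print(ransac_iterations(p=0.5, s=4))  # -> 72 iterations
```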
3 Geometric Transformation Estimation
Once inliers are determined, we estimate the transformation that aligns the images.
3.1 Homography Matrix H
Definition: A 3×3 matrix relating two views of a planar surface (or from pure camera rotation).
$$\mathbf{x}' \sim H\mathbf{x}$$
where:
- $\mathbf{x} = (x, y, 1)^T$ is a point in Image 1
- $\mathbf{x}' = (x', y', 1)^T$ is the corresponding point in Image 2
- "$\sim$" means equality up to a scale factor.
Degrees of Freedom: 8 (scale is arbitrary). Minimal points needed: 4 correspondences.
DLT (Direct Linear Transform) solution: For each correspondence:
$$x'(h_{31}x + h_{32}y + h_{33}) = h_{11}x + h_{12}y + h_{13}$$
$$y'(h_{31}x + h_{32}y + h_{33}) = h_{21}x + h_{22}y + h_{23}$$
Stack the equations for all points into a matrix $A$ and solve $A\mathbf{h} = 0$ with SVD: the solution is the right singular vector associated with the smallest singular value.
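A minimal un-normalized DLT sketch in NumPy (`dlt_homography` is a name chosen here, not a library function; real implementations apply Hartley point normalization first for numerical stability):

```python
import numpy as np

def dlt_homography(src, dst):
    """Estimate H from >= 4 correspondences via the DLT equations above.
    src, dst: (N, 2) arrays of (x, y) points. No normalization is done."""
    A = []
    for (x, y), (xp, yp) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, xp * x, xp * y, xp])
        A.append([0, 0, 0, -x, -y, -1, yp * x, yp * y, yp])
    # Solve A h = 0: h is the right singular vector of A with the
    # smallest singular value.
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]  # remove the arbitrary scale
```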
3.2 Affine Transformation A
Definition: A 2×3 matrix preserving parallel lines:
$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} & t_x \\ a_{21} & a_{22} & t_y \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$
Degrees of Freedom: 6. Minimal points needed: 3 correspondences.
Affine can represent:
- Rotation
- Translation
- Uniform or non-uniform scaling
- Shearing
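A robust affine estimate can be sketched with OpenCV as follows, assuming `src_pts`, `dst_pts`, `img1`, and `img2` are built as in the full example of Section 5:

```python
import cv2

# Robust 2x3 affine estimate from matched points (RANSAC inside OpenCV).
# src_pts, dst_pts: (N, 1, 2) float32 arrays of corresponding coordinates.
A, inlier_mask = cv2.estimateAffine2D(src_pts, dst_pts,
                                      method=cv2.RANSAC,
                                      ransacReprojThreshold=3.0)

# Warp img1 into img2's frame with the estimated affine transform.
aligned = cv2.warpAffine(img1, A, (img2.shape[1], img2.shape[0]))
```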
4 Practical Advice
Image Registration (first step of stitching)
- Extract features (SIFT/ORB).
- Match descriptors (Brute-Force or FLANN).
- Apply Lowe’s ratio test.
- Use RANSAC to find inliers.
- Estimate a Homography (for perspective scenes or rotation-only panoramas) or an Affine transform (when perspective effects are negligible).
- Warp one image to align with the other.
5 OpenCV Implementation Example
```python
import cv2
import numpy as np

# Read images as grayscale
img1 = cv2.imread('img1.jpg', cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread('img2.jpg', cv2.IMREAD_GRAYSCALE)

# Detect keypoints and compute SIFT descriptors
sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# Match descriptors (L2 norm suits float descriptors like SIFT)
bf = cv2.BFMatcher(cv2.NORM_L2)
matches = bf.knnMatch(des1, des2, k=2)

# Lowe's ratio test; guard against queries with fewer than two neighbors
good = []
for pair in matches:
    if len(pair) == 2 and pair[0].distance < 0.75 * pair[1].distance:
        good.append(pair[0])

# Extract matched point coordinates
src_pts = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
dst_pts = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)

# Estimate homography with RANSAC (5 px reprojection threshold);
# H is None if estimation fails, e.g. with fewer than 4 matches
H, mask = cv2.findHomography(src_pts, dst_pts, cv2.RANSAC, 5.0)

# Draw matches, highlighting RANSAC inliers via the mask
matchesMask = mask.ravel().tolist()
draw_params = dict(matchColor=(0, 255, 0), matchesMask=matchesMask, flags=2)
img_matches = cv2.drawMatches(img1, kp1, img2, kp2, good, None, **draw_params)
cv2.imshow("Matches", img_matches)
cv2.waitKey(0)
cv2.destroyAllWindows()
```
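The pipeline above stops after visualizing inliers. The final registration step from Section 4 (warping one image onto the other) can be sketched as a short continuation, assuming H was estimated successfully:

```python
# Continuation of the example above: warp img1 into img2's frame.
h2, w2 = img2.shape[:2]
warped = cv2.warpPerspective(img1, H, (w2, h2))

# Simple 50/50 overlay to inspect alignment; a real stitcher would blend seams.
overlay = cv2.addWeighted(warped, 0.5, img2, 0.5, 0)
cv2.imshow("Aligned overlay", overlay)
cv2.waitKey(0)
cv2.destroyAllWindows()
```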
Summary Table
| Step | Method | Purpose | Math Core |
|---|---|---|---|
| Matching | Brute-Force | Exact nearest neighbor | L2 or Hamming distance |
| Matching | FLANN | Fast approximate NN search | KD-Tree / LSH |
| Filtering | Lowe's Ratio | Remove ambiguous matches | Distance ratio threshold |
| Filtering | RANSAC | Remove geometric outliers | Random sampling, consensus |
| Transform | Homography | Planar/perspective mapping | 8 DoF, 4 points min |
| Transform | Affine | Rotation/scale/shear | 6 DoF, 3 points min |