Feature Matching and Geometric Estimation
2025-08-12
We’ll cover: matching methods → match refinement → geometric transformation estimation, and connect them to practical image registration.
Objective of Feature Matching & Geometric Estimation
Given keypoints and descriptors (from SIFT, ORB, etc.), we want to:
- Find correspondences between two images.
- Filter out incorrect matches.
- Estimate the geometric transformation relating the two images.
These steps are foundational for image stitching, 3D reconstruction, object recognition, and tracking.
1 Feature Matching Methods
1.1 Brute-Force Matcher
Concept: For each descriptor in Image A, compute its distance to all descriptors in Image B, and take the closest.
Distance metric depends on descriptor type:
- For SIFT / SURF (float descriptors): Euclidean distance (L2 norm)
$$d(f_1, f_2) = \sqrt{\sum_{i=1}^{n} (f_{1i} - f_{2i})^2}$$
- For ORB / BRIEF (binary descriptors): Hamming distance
$$d_{\text{Hamming}}(b_1, b_2) = \text{number of differing bits}$$
Pros: Simple, exact.
Cons: Slow for large descriptor sets ($O(NM)$ comparisons for $N$ and $M$ descriptors).
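A minimal sketch of the binary case in OpenCV (the file names are placeholders):

```python
import cv2

img1 = cv2.imread('img1.jpg', cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread('img2.jpg', cv2.IMREAD_GRAYSCALE)

# ORB yields binary descriptors, so Hamming distance is the right metric.
orb = cv2.ORB_create()
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

# crossCheck=True keeps only mutual matches (A->B and B->A agree),
# a cheap alternative to the ratio test described below.
bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(bf.match(des1, des2), key=lambda m: m.distance)
```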
1.2 FLANN — Fast Library for Approximate Nearest Neighbors
Concept: Accelerates nearest neighbor search using tree-based indexing.
Common methods:
- KD-Tree (good for low-dimensional float descriptors like SIFT)
- LSH (Locality Sensitive Hashing) (good for binary descriptors like ORB)
Returns approximate nearest neighbors (speed vs. accuracy trade-off).
Distance metrics are the same as above (L2 for float, Hamming for binary).
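A minimal OpenCV sketch, assuming `des1` and `des2` are SIFT (float) descriptors computed as in Section 5:

```python
import cv2

# KD-tree index for float descriptors such as SIFT.
# For binary descriptors (ORB), use FLANN_INDEX_LSH (= 6) instead.
FLANN_INDEX_KDTREE = 1
index_params = dict(algorithm=FLANN_INDEX_KDTREE, trees=5)
search_params = dict(checks=50)  # more checks: more accurate but slower

flann = cv2.FlannBasedMatcher(index_params, search_params)
matches = flann.knnMatch(des1, des2, k=2)  # k=2 enables the ratio test below
```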
2 Matching Optimization
2.1 Lowe’s Ratio Test
Purpose: Remove ambiguous matches caused by repetitive patterns or noise.
Idea: For a descriptor in Image A:
- Find its two nearest neighbors in Image B, at distances $d_1$ and $d_2$ with $d_1 < d_2$.
- Accept the match only if:
$$\frac{d_1}{d_2} < \tau$$
Typically $\tau = 0.75$.
Reasoning: If the closest match is much better than the second-best, it’s likely correct. If distances are similar, the match is unreliable.
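A toy illustration with hypothetical distances (the numbers are made up for the example):

```python
# Ratio test with tau = 0.75 on hypothetical neighbor distances.
d1, d2 = 120.0, 200.0
print(d1 / d2 < 0.75)  # 0.60 -> True: distinctive match, keep it
d1, d2 = 180.0, 200.0
print(d1 / d2 < 0.75)  # 0.90 -> False: ambiguous match, discard it
```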
2.2 RANSAC — Random Sample Consensus
Purpose: Remove outlier matches that do not fit the global transformation model.
Algorithm:
- Randomly pick a minimal subset of matches (e.g., 4 for homography, 3 for affine).
- Estimate the transformation model T from this subset.
- Count how many matches fit T within a tolerance (these are inliers).
- Repeat many times and keep the model with the most inliers.
- Re-estimate T using all inliers.
Mathematical model: if $p$ is the inlier ratio, $s$ the sample size, and $P$ the desired probability of drawing at least one all-inlier sample, the number of iterations needed is:
$$N = \frac{\log(1 - P)}{\log(1 - p^s)}$$
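A small helper makes the formula concrete (the helper name and parameter values are illustrative, not from any library):

```python
import math

def ransac_iterations(p, s, P=0.99):
    """Iterations N such that, with probability P, at least one random
    sample of size s contains only inliers (inlier ratio p)."""
    return math.ceil(math.log(1 - P) / math.log(1 - p ** s))

# Homography (s=4) with 50% inliers and 99% confidence:
print(ransac_iterations(p=0.5, s=4))  # -> 72 iterations
```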
3 Geometric Transformation Estimation
Once inliers are determined, we estimate the transformation that aligns the images.
3.1 Homography Matrix H
Definition: A 3×3 matrix relating two views of a planar surface (or from pure camera rotation).
$$\mathbf{x}' \sim H\mathbf{x}$$
where:
- $\mathbf{x} = (x, y, 1)^T$ is a point in Image 1
- $\mathbf{x}' = (x', y', 1)^T$ is the corresponding point in Image 2
- "$\sim$" means equality up to a scale factor.
Degrees of Freedom: 8 (scale is arbitrary). Minimal points needed: 4 correspondences.
DLT (Direct Linear Transform) solution: For each correspondence:
$$x'(h_{31}x + h_{32}y + h_{33}) = h_{11}x + h_{12}y + h_{13}$$
$$y'(h_{31}x + h_{32}y + h_{33}) = h_{21}x + h_{22}y + h_{23}$$
Stack the equations for all points into a matrix $A$ and solve $A\mathbf{h} = 0$ with SVD: the solution is the right singular vector associated with the smallest singular value.
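A minimal un-normalized DLT sketch in NumPy (`dlt_homography` is a name chosen here, not a library function; real implementations apply Hartley point normalization first for numerical stability):

```python
import numpy as np

def dlt_homography(src, dst):
    """Estimate H from >= 4 correspondences via the DLT equations above.
    src, dst: (N, 2) arrays of (x, y) points. No normalization is done."""
    A = []
    for (x, y), (xp, yp) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, xp * x, xp * y, xp])
        A.append([0, 0, 0, -x, -y, -1, yp * x, yp * y, yp])
    # Solve A h = 0: h is the right singular vector of A with the
    # smallest singular value.
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]  # remove the arbitrary scale
```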
3.2 Affine Transformation A
Definition: A 2×3 matrix preserving parallel lines:
$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} & t_x \\ a_{21} & a_{22} & t_y \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$
Degrees of Freedom: 6. Minimal points needed: 3 correspondences.
Affine can represent:
- Rotation
- Translation
- Uniform or non-uniform scaling
- Shearing
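A robust affine estimate can be sketched with OpenCV as follows, assuming `src_pts`, `dst_pts`, `img1`, and `img2` are built as in the full example of Section 5:

```python
import cv2

# Robust 2x3 affine estimate from matched points (RANSAC inside OpenCV).
# src_pts, dst_pts: (N, 1, 2) float32 arrays of corresponding coordinates.
A, inlier_mask = cv2.estimateAffine2D(src_pts, dst_pts,
                                      method=cv2.RANSAC,
                                      ransacReprojThreshold=3.0)

# Warp img1 into img2's frame with the estimated affine transform.
aligned = cv2.warpAffine(img1, A, (img2.shape[1], img2.shape[0]))
```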
4 Practical Advice
Image Registration (first step of stitching)
- Extract features (SIFT/ORB).
- Match descriptors (Brute-Force or FLANN).
- Apply Lowe’s ratio test.
- Use RANSAC to find inliers.
- Estimate a Homography (for perspective scenes or rotation-only panoramas) or an Affine transform (when perspective effects are negligible).
- Warp one image to align with the other.
5 OpenCV Implementation Example
```python
import cv2
import numpy as np

# Read images as grayscale
img1 = cv2.imread('img1.jpg', cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread('img2.jpg', cv2.IMREAD_GRAYSCALE)

# Detect keypoints and compute SIFT descriptors
sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# Match descriptors (L2 norm suits float descriptors like SIFT)
bf = cv2.BFMatcher(cv2.NORM_L2)
matches = bf.knnMatch(des1, des2, k=2)

# Lowe's ratio test; guard against queries with fewer than two neighbors
good = []
for pair in matches:
    if len(pair) == 2 and pair[0].distance < 0.75 * pair[1].distance:
        good.append(pair[0])

# Extract matched point coordinates
src_pts = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
dst_pts = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)

# Estimate homography with RANSAC (5 px reprojection threshold);
# H is None if estimation fails, e.g. with fewer than 4 matches
H, mask = cv2.findHomography(src_pts, dst_pts, cv2.RANSAC, 5.0)

# Draw matches, highlighting RANSAC inliers via the mask
matchesMask = mask.ravel().tolist()
draw_params = dict(matchColor=(0, 255, 0), matchesMask=matchesMask, flags=2)
img_matches = cv2.drawMatches(img1, kp1, img2, kp2, good, None, **draw_params)
cv2.imshow("Matches", img_matches)
cv2.waitKey(0)
cv2.destroyAllWindows()
```
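The pipeline above stops after visualizing inliers. The final registration step from Section 4 (warping one image onto the other) can be sketched as a short continuation, assuming H was estimated successfully:

```python
# Continuation of the example above: warp img1 into img2's frame.
h2, w2 = img2.shape[:2]
warped = cv2.warpPerspective(img1, H, (w2, h2))

# Simple 50/50 overlay to inspect alignment; a real stitcher would blend seams.
overlay = cv2.addWeighted(warped, 0.5, img2, 0.5, 0)
cv2.imshow("Aligned overlay", overlay)
cv2.waitKey(0)
cv2.destroyAllWindows()
```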
Summary Table
| Step | Method | Purpose | Math Core |
|---|---|---|---|
| Matching | Brute-Force | Exact nearest neighbor | L2 or Hamming distance |
| Matching | FLANN | Fast approximate NN search | KD-Tree / LSH |
| Filtering | Lowe's Ratio | Remove ambiguous matches | Distance ratio threshold |
| Filtering | RANSAC | Remove geometric outliers | Random sampling, consensus |
| Transform | Homography | Planar/perspective mapping | 8 DoF, 4 points min |
| Transform | Affine | Rotation/scale/shear | 6 DoF, 3 points min |