
Estimating a Homography model with RANSAC

Why RANSAC?

Because we want to fit a model using only the good feature matches while rejecting the bad ones.

- Inliers:
  - Good matches
- Outliers:
  - Bad matches




Interest points (500 per image; 640×480 images)






What other algorithms are available?

- Exhaustive search
  - for each feature, compare against all features in the other image

- Hashing
  - compute a short descriptor (hash) from each feature vector

- Nearest-neighbor techniques:
  - k-d trees and their variants
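As a minimal sketch of the exhaustive-search option, the following NumPy snippet compares every feature descriptor in one image against every descriptor in the other using sum-of-squared-differences (SSD). The function name `exhaustive_match` and the toy descriptors are illustrative, not from the original post.

```python
import numpy as np

def exhaustive_match(desc_a, desc_b):
    """For each feature descriptor in image A, find the descriptor in image B
    with the smallest SSD (sum of squared differences).

    desc_a: (Na, d) array of descriptors for image A
    desc_b: (Nb, d) array of descriptors for image B
    Returns (best, ssd): index of the best match in B and its SSD, per row of A.
    """
    # (Na, Nb) matrix holding the SSD between every descriptor pair
    all_ssd = ((desc_a[:, None, :] - desc_b[None, :, :]) ** 2).sum(axis=2)
    best = all_ssd.argmin(axis=1)
    return best, all_ssd[np.arange(len(desc_a)), best]

# toy 2-D "descriptors" just to show the call shape
desc_a = np.array([[0.0, 0.0], [1.0, 1.0]])
desc_b = np.array([[0.1, 0.0], [5.0, 5.0], [1.0, 1.1]])
best, ssd = exhaustive_match(desc_a, desc_b)
```

This is O(Na × Nb) comparisons, which is exactly why the hashing and k-d tree alternatives above exist.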



Putative correspondences (268): best match, SSD < 20. Outliers (117): t = 1.25 pixels; 43 iterations.



How about outlier rejection?

- Threshold test: reject a match when SSD(patch1, patch2) > threshold
- 1-NN: keep only the closest match, i.e., the one with the smallest SSD
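The threshold test above can be sketched in a couple of lines. The function name `reject_outliers` and the threshold value of 20 (taken from the SSD < 20 figure caption) are illustrative:

```python
import numpy as np

def reject_outliers(ssd, threshold=20.0):
    """Keep matches whose SSD is below the threshold; reject the rest.

    ssd: array of per-match SSD values; threshold value is illustrative.
    Returns a boolean mask over the matches.
    """
    return ssd < threshold

ssd = np.array([5.0, 18.0, 30.0, 100.0])
keep = reject_outliers(ssd)  # → [True, True, False, False]
```

A fixed threshold alone is crude; as the next section notes, it can still leave too many outliers for a direct least-squares fit, which is where RANSAC comes in.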


How to handle too many outliers?


RANSAC loop:

  1. Select four feature pairs (at random)
  2. Compute homography H (exact)
  3. Compute inliers where SSD(p_i′, H p_i) < ε
  4. Keep largest set of inliers
  5. Re-compute least-squares H estimate on all of the inliers
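The five steps above can be sketched as follows. This is a minimal NumPy implementation, assuming the standard direct linear transform (DLT) for fitting H from point pairs; the function names, the iteration count, and the reprojection tolerance (ε = 1.25 pixels, matching the t value in the figure caption) are illustrative choices, not from the original post.

```python
import numpy as np

def fit_homography(src, dst):
    """Direct linear transform: H mapping src → dst.
    With 4 point pairs the fit is exact; with more it is least squares."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # the null vector (smallest singular value) of A gives H up to scale
    _, _, Vt = np.linalg.svd(np.asarray(A))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]

def project(H, pts):
    """Apply H to 2-D points via homogeneous coordinates."""
    p = np.c_[pts, np.ones(len(pts))] @ H.T
    return p[:, :2] / p[:, 2:3]

def ransac_homography(src, dst, n_iters=500, eps=1.25, rng=None):
    """RANSAC loop from the text:
    1. sample four feature pairs at random
    2. compute H exactly from them
    3. count inliers where SSD(p_i', H p_i) < eps^2
    4. keep the largest inlier set
    5. re-fit H by least squares on all of those inliers
    """
    rng = np.random.default_rng(rng)
    best_inliers = np.zeros(len(src), dtype=bool)
    for _ in range(n_iters):
        idx = rng.choice(len(src), 4, replace=False)
        H = fit_homography(src[idx], dst[idx])
        err = ((project(H, src) - dst) ** 2).sum(axis=1)  # SSD(p_i', H p_i)
        inliers = err < eps ** 2
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    return fit_homography(src[best_inliers], dst[best_inliers]), best_inliers
```

Note that step 3 compares squared distance against ε², so ε stays in pixel units. In practice a library routine such as OpenCV's `cv2.findHomography` with the RANSAC flag does the same job.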


Final inliers (262)











