
State of the Art SLAM techniques

The best stereo SLAM systems as of 2017 are reviewed.

Namely (in arbitrary order):

  • EKF-SLAM based
  • Keyframe based
  • Joint BA optimization based
  • RSLAM
  • S-PTAM
  • LSD-SLAM


The best RGB-D SLAM systems as of 2017 are also reviewed:

  • KinectFusion
  • Kintinuous
  • DVO-SLAM
  • ElasticFusion
  • RGB-D SLAM


See my key points on the best stereo and RGB-D SLAM systems below.


Stereo SLAM

Conditionally Independent Divide and Conquer EKF-SLAM [5]

  • operates in larger environments than other approaches of its time
  • uses both close and far points
  • far points are those whose depth cannot be reliably estimated due to the small disparity in the stereo camera
  • uses an inverse depth parametrization [6]
  • shows empirically that points can be triangulated reliably if their depth is less than about 40 times the stereo baseline
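The close/far split above follows directly from the stereo geometry. A minimal Python sketch of the 40-baseline rule; the focal length and baseline below are hypothetical values for illustration, not numbers from [5]:

```python
# Classify stereo points as "close" (reliably triangulated) or "far"
# using the empirical rule from [5]: depth < ~40x the stereo baseline.
# Camera parameters here are illustrative, not from the paper.

def stereo_depth(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Depth from disparity for a rectified stereo pair: z = f * b / d."""
    if disparity_px <= 0:
        return float("inf")  # zero disparity -> depth unobservable
    return focal_px * baseline_m / disparity_px

def is_close_point(depth_m: float, baseline_m: float, ratio: float = 40.0) -> bool:
    """True if the depth is within ~40 baselines, so triangulation is reliable."""
    return depth_m < ratio * baseline_m

f, b = 700.0, 0.12             # hypothetical: 700 px focal length, 12 cm baseline
z = stereo_depth(f, b, 20.0)   # 20 px disparity -> 4.2 m
print(z, is_close_point(z, b))                       # 4.2 True  (4.2 < 40 * 0.12 = 4.8)
print(is_close_point(stereo_depth(f, b, 10.0), b))   # 8.4 m -> False
```

Far points (beyond the threshold) still constrain rotation, which is why [5] keeps them in the filter under an inverse depth parametrization rather than discarding them.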

   
- Keyframe-based Stereo SLAM
  - uses BA optimization in a local area to achieve scalability.
  - [8]: joint optimization of BA (point-pose constraints) in an inner window
     - pose-graph (pose-pose constraints) in an outer window of keyframes
     - achieves constant-time complexity by limiting the size of these windows
     - at the expense of not guaranteeing global consistency.

  - [9]: RSLAM uses a relative representation of landmarks and poses
    - performs relative BA in an active area, which bounds the cost to constant time
    - able to close loops
    - allows active areas to be expanded on both sides of a loop
    - does not enforce global consistency

  - [10]: S-PTAM
    - performs local BA
    - lacks large loop closing


- [11]: LSD-SLAM
  - a semi-dense direct approach
  - minimizes photometric error in image regions with high gradient
  - more robust than feature-based methods to
    - motion blur
    - low-textured environments
  - performance is severely degraded by unmodeled effects such as
    - rolling shutter
    - non-Lambertian reflectance
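The core idea of a direct method — align images by minimizing intensity differences at high-gradient pixels, rather than matching features — can be illustrated with a toy 1-D sketch. The shift-based warp and the gradient threshold below are illustrative stand-ins for the pose-and-depth warp a real system like LSD-SLAM uses:

```python
# Toy 1-D sketch of photometric error minimization at high-gradient pixels.
# A real direct method warps pixels through the camera pose and depth map;
# here the "warp" is just an integer shift.

def high_gradient_pixels(img, thresh=10.0):
    """Indices where the (1-D) central-difference gradient exceeds thresh."""
    return [i for i in range(1, len(img) - 1)
            if abs(img[i + 1] - img[i - 1]) / 2.0 > thresh]

def photometric_error(ref, target, shift):
    """Sum of squared intensity residuals over the high-gradient pixels."""
    err = 0.0
    for i in high_gradient_pixels(ref):
        j = i + shift
        if 0 <= j < len(target):
            err += (ref[i] - target[j]) ** 2
    return err

ref = [0, 0, 50, 100, 100, 100, 50, 0, 0, 0]   # a bright blob
target = ref[1:] + [0]                          # same scene shifted by one pixel
best = min(range(-2, 3), key=lambda s: photometric_error(ref, target, s))
print(best)  # -1: the shift that minimizes the photometric error
```

Only the edge pixels contribute to the error, which is exactly why textureless regions (no gradient) and motion blur (smeared gradients, still present) behave so differently for direct versus feature-based methods.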


RGB-D SLAM

- KinectFusion [4]
   - fuses all depth data from the sensor into a volumetric dense model
   - uses ICP against the model to track the camera pose
   - limited to small workspaces due to the volumetric representation
   - lacks loop closing
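The volumetric fusion KinectFusion performs is, per voxel, a running weighted average of truncated signed distances (TSDF). A toy 1-D sketch along a single camera ray; real systems do this over a 3-D grid on the GPU, and the sizes and names here are illustrative:

```python
# Toy 1-D TSDF fusion: each voxel stores a truncated signed distance and a
# weight, updated by a running weighted average over depth measurements.

TRUNC = 0.1  # truncation distance in meters (illustrative)

def fuse(tsdf, weight, voxel_depths, measured_depth):
    """Fold one depth measurement into the voxel grid along this ray."""
    for i, vz in enumerate(voxel_depths):
        sdf = measured_depth - vz            # signed distance to the surface
        if sdf < -TRUNC:
            continue                         # voxel is far behind the surface
        d = min(sdf, TRUNC)                  # truncate in front of the surface
        tsdf[i] = (tsdf[i] * weight[i] + d) / (weight[i] + 1)
        weight[i] += 1

voxel_depths = [0.0, 0.5, 1.0, 1.5, 2.0]     # voxel centers along one ray
tsdf = [0.0] * 5
weight = [0] * 5
for depth in (1.0, 1.02, 0.98):              # three noisy views of a surface at ~1 m
    fuse(tsdf, weight, voxel_depths, depth)
# the surface lies where the TSDF crosses zero: near the voxel at 1.0 m
print(round(tsdf[2], 3))  # 0.0
```

The fixed grid is also the scheme's weakness noted above: memory grows with the mapped volume, which is what confines KinectFusion to small workspaces.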


- Kintinuous [12]:
  - operates in large environments
  - uses a rolling cyclical buffer
  - does loop closing with place recognition and pose-graph optimization
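The rolling cyclical buffer idea can be sketched simply: the dense volume follows the camera, and slices that scroll out of the active region are evicted (in the real system, extracted as a mesh and stored). A hypothetical minimal sketch; class and field names are illustrative:

```python
# Toy sketch of a rolling volume: keep a fixed number of active slices
# around the camera; evict the oldest as the camera advances.

from collections import deque

class RollingVolume:
    def __init__(self, extent=4):
        self.extent = extent           # number of slices kept active
        self.slices = deque()          # active (index, data) slices
        self.evicted = []              # slices scrolled out of the volume

    def advance(self, slice_index, data):
        """Camera moved forward: add a new slice, evict what falls behind."""
        self.slices.append((slice_index, data))
        while len(self.slices) > self.extent:
            self.evicted.append(self.slices.popleft())

vol = RollingVolume(extent=3)
for i in range(5):                     # camera sweeps through 5 slices
    vol.advance(i, f"tsdf-slice-{i}")
print([i for i, _ in vol.slices])      # [2, 3, 4] still active
print([i for i, _ in vol.evicted])     # [0, 1] scrolled out
```

This keeps the dense fusion cost bounded regardless of trajectory length, which is what lets Kintinuous escape KinectFusion's small-workspace limit.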

- RGB-D SLAM [13]:
  - feature-based system
  - front-end computes frame-to-frame motion by feature matching and ICP.
  - back-end performs pose-graph optimization with loop closure constraints from a heuristic search.
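The pose-graph back-end used by systems like [13] and [14] can be illustrated in 1-D: odometry edges between consecutive poses plus one loop-closure edge, solved by gradient descent on the squared constraint error. The drift values and learning rate are illustrative:

```python
# Toy 1-D pose-graph optimization: minimize the squared error of
# relative-pose constraints ((x_j - x_i) - measurement)^2 over all edges.

def optimize(poses, edges, iters=500, lr=0.1):
    x = list(poses)
    for _ in range(iters):
        grad = [0.0] * len(x)
        for i, j, meas in edges:
            r = (x[j] - x[i]) - meas
            grad[j] += 2 * r
            grad[i] -= 2 * r
        for k in range(1, len(x)):      # pose 0 stays fixed (gauge freedom)
            x[k] -= lr * grad[k]
    return x

# odometry claims each step moves +1.1 (drift); the loop closure says the
# robot returned to the start (x0 - x3 = -3.0)
edges = [(0, 1, 1.1), (1, 2, 1.1), (2, 3, 1.1), (3, 0, -3.0)]
initial = [0.0, 1.1, 2.2, 3.3]          # dead-reckoned initial guess
result = optimize(initial, edges)
print([round(v, 3) for v in result])    # drift is spread evenly around the loop
```

After the loop edge is added, the 0.3 m of accumulated drift is redistributed over the four edges, which is exactly the correction a pose-graph back-end provides on top of a drifting front-end.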

- DVO-SLAM [14]:
  - optimizes a pose-graph
  - computes keyframe-to-keyframe constraints from a visual odometry
  - the visual odometry minimizes both photometric and depth error
  - searches for loop candidates heuristically over all previous frames
  - instead of relying on place recognition

- ElasticFusion [15]:
  - builds a surfel-based map of the environment
  - a map-centric approach that forgets poses
  - performs loop closing by applying a non-rigid deformation to the map
  - instead of standard pose-graph optimization
  - impressive reconstruction detail
  - impressive localization accuracy
  - implementation is limited to room-size maps because complexity scales with the number of surfels in the map
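A surfel map stores the scene as a set of small oriented disks rather than voxels or feature points. A minimal sketch of a surfel record and a confidence-weighted observation fuse; the field names and update rule here are illustrative, not ElasticFusion's exact implementation:

```python
# Toy surfel representation: position, normal, radius, color, confidence,
# and last-seen timestamp, with a confidence-weighted position update.

from dataclasses import dataclass

@dataclass
class Surfel:
    position: tuple      # (x, y, z) in meters
    normal: tuple        # unit surface normal
    radius: float        # disk radius
    color: tuple         # (r, g, b)
    confidence: float    # raised each time the surfel is re-observed
    last_seen: int       # timestamp of the last observation

def merge(s: Surfel, position, confidence, t) -> Surfel:
    """Fuse a new observation into a surfel, weighted by confidence."""
    w = s.confidence + confidence
    pos = tuple((ps * s.confidence + pn * confidence) / w
                for ps, pn in zip(s.position, position))
    return Surfel(pos, s.normal, s.radius, s.color, w, t)

s = Surfel((1.0, 0.0, 2.0), (0.0, 0.0, -1.0), 0.01, (128, 128, 128), 1.0, 0)
s = merge(s, (1.2, 0.0, 2.0), 1.0, 1)
print(s.position[0], s.confidence)  # 1.1 2.0
```

Because every update touches individual surfels (and loop closing deforms them all), the cost scales with the surfel count — the reason the implementation stays room-sized, as noted above.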




References

[4] R. A. Newcombe, S. Izadi, O. Hilliges, D. Molyneaux, D. Kim, A. J. Davison, P. Kohli, J. Shotton, S. Hodges, and A. Fitzgibbon, “KinectFusion: Real-time dense surface mapping and tracking,” in IEEE Int. Symp. Mixed and Augmented Reality (ISMAR), 2011, pp. 127–136.
[5] L. M. Paz, P. Piniés, J. D. Tardós, and J. Neira, “Large-scale 6-DOF SLAM with stereo-in-hand,” IEEE Trans. Robot., vol. 24, no. 5, pp. 946–957, 2008.


[6] J. Civera, A. J. Davison, and J. M. M. Montiel, “Inverse depth parametrization for monocular SLAM,” IEEE Trans. Robot., vol. 24, no. 5, pp. 932–945, 2008.


[7] H. Strasdat, J. M. M. Montiel, and A. J. Davison, “Visual SLAM: Why filter?” Image and Vision Computing, vol. 30, no. 2, pp. 65–77, 2012.


[8] H. Strasdat, A. J. Davison, J. M. M. Montiel, and K. Konolige, “Double window optimisation for constant time visual SLAM,” in IEEE Int. Conf. Comput. Vision (ICCV), 2011, pp. 2352–2359.


[9] C. Mei, G. Sibley, M. Cummins, P. Newman, and I. Reid, “RSLAM: A system for large-scale mapping in constant-time using stereo,” Int. J. Comput. Vision, vol. 94, no. 2, pp. 198–214, 2011.


[10] T. Pire, T. Fischer, J. Civera, P. De Cristóforis, and J. J. Berlles, “Stereo parallel tracking and mapping for robot localization,” in IEEE/RSJ Int. Conf. Intell. Robots and Syst. (IROS), 2015, pp. 1373–1378.


[11] J. Engel, J. Stueckler, and D. Cremers, “Large-scale direct SLAM with stereo cameras,” in IEEE/RSJ Int. Conf. Intell. Robots and Syst. (IROS), 2015.


[12] T. Whelan, M. Kaess, H. Johannsson, M. Fallon, J. J. Leonard, and J. McDonald, “Real-time large-scale dense RGB-D SLAM with volumetric fusion,” Int. J. Robot. Res., vol. 34, no. 4-5, pp. 598–626, 2015.

[13] F. Endres, J. Hess, J. Sturm, D. Cremers, and W. Burgard, “3-D mapping with an RGB-D camera,” IEEE Trans. Robot., vol. 30, no. 1, pp. 177–187, 2014.

[14] C. Kerl, J. Sturm, and D. Cremers, “Dense visual SLAM for RGB-D cameras,” in IEEE/RSJ Int. Conf. Intell. Robots and Syst. (IROS), 2013.

[15] T. Whelan, R. F. Salas-Moreno, B. Glocker, A. J. Davison, and S. Leutenegger, “ElasticFusion: Real-time dense SLAM and light source estimation,” Int. J. Robot. Res., vol. 35, no. 14, pp. 1697–1716, 2016.
