Artificial Intelligence and Machine Vision

Showing posts from August, 2017

How to train a neural network to retrieve 3D maps from videos

This post is about how to train a neural network to extract depth maps from videos of moving people captured with a monocular camera. Note: with a monocular camera, extracting the depth map of moving people is difficult because of motion blur and the rolling shutter. However, we can overcome these limitations by predicting depth maps with a model trained on a dataset generated from normalized videos using structure from motion (SfM) and multi-view stereo (MVS). This normalized dataset can serve as the basis of the training set, so that the neural network learns to extract accurate depth maps from typical video footage without any further assistance from MVS. To start this project with SfM and MVS, we will use the TUM dataset. So, the basic idea is to use SfM and multi-view stereo to estimate depth, which serves as supervision during training. The RGB-D SLAM reference implementations from these papers are used:
- RGB-D SLAM (Robotics OS)
- Real-time 3D Visual SLAM ...
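To make the supervision idea concrete, here is a minimal sketch of a loss that compares predicted depth with MVS depth only where MVS produced a value. This is an illustration, not the loss used in the post: the function name `masked_log_depth_loss`, the log-depth L1 form, and the convention that zero or NaN marks a missing MVS pixel are all assumptions.

```python
import math


def masked_log_depth_loss(pred_depth, mvs_depth, eps=1e-6):
    """Mean absolute log-depth error over pixels with a valid MVS depth.

    pred_depth, mvs_depth: 2D lists (rows of floats) of the same shape.
    Assumption: MVS marks pixels it could not reconstruct with 0 or NaN.
    """
    total, count = 0.0, 0
    for pred_row, mvs_row in zip(pred_depth, mvs_depth):
        for pred, target in zip(pred_row, mvs_row):
            # Skip pixels where MVS gives no usable supervision.
            if not (math.isfinite(target) and target > 0):
                continue
            # Clamp the prediction so the logarithm is always defined.
            total += abs(math.log(max(pred, eps)) - math.log(target))
            count += 1
    return total / count if count else 0.0


predicted = [[1.0, 2.0], [4.0, 8.0]]
from_mvs = [[1.0, 0.0], [2.0, float("nan")]]  # two pixels have no MVS depth
print(masked_log_depth_loss(predicted, from_mvs))
```

In an actual training loop the same masking would be written with tensor operations in a deep learning framework so that gradients flow only through the valid pixels.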

State of the Art SLAM techniques

The best stereo SLAM systems of 2017 are reviewed, namely (in arbitrary order): EKF-SLAM based, keyframe based, joint BA optimization based, RSLAM, S-PTAM, and LSD-SLAM. The best RGB-D SLAM systems of 2017 are also reviewed: KinectFusion, Kintinuous, DVO-SLAM, ElasticFusion, and RGB-D SLAM. See my key points on the best stereo SLAMs.

Stereo SLAM

- Conditionally Independent Divide and Conquer EKF-SLAM [5]
  - operated in larger environments than other approaches at that time
  - uses both close and far points
  - far points are points whose depth cannot be reliably estimated due to little disparity in the stereo camera
  - uses an inverse depth parametrization [6]
  - shows empirically that points can be triangulated reliably if their depth is less than about 40 times the stereo baseline
- Keyframe-based Stereo SLAM
  - uses BA optimization in a local area to achieve scalability. ...
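The close/far distinction above can be illustrated with the standard rectified-stereo relation depth = focal length × baseline / disparity. The sketch below is only an illustration under assumed values: the function names, the 500 px focal length, and the 0.1 m baseline are hypothetical, and `inverse_depth` shows just the scalar 1/depth idea, not the full parametrization from [6].

```python
import math


def stereo_depth(disparity_px, focal_px, baseline_m):
    """Depth in meters from disparity for a rectified stereo pair."""
    if disparity_px <= 0:
        return math.inf  # no measurable disparity: point is effectively at infinity
    return focal_px * baseline_m / disparity_px


def is_close_point(depth_m, baseline_m, ratio=40.0):
    """True if depth is below about `ratio` times the stereo baseline."""
    return depth_m < ratio * baseline_m


def inverse_depth(depth_m):
    """Scalar inverse depth; stays finite (0.0) for points at infinity."""
    return 0.0 if math.isinf(depth_m) else 1.0 / depth_m


focal_px, baseline_m = 500.0, 0.1  # hypothetical camera: close/far threshold is 4 m
for disparity in (25.0, 5.0, 0.0):
    depth = stereo_depth(disparity, focal_px, baseline_m)
    label = "close" if is_close_point(depth, baseline_m) else "far"
    print(disparity, depth, inverse_depth(depth), label)
```

Inverse depth is convenient for far points because it remains bounded as disparity approaches zero, whereas depth itself grows without limit.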