
What are the depth sensors?

How to Create a holographic display and camcorder

In the last part of the series, I talked about why depth sensors may not be ideal for a consumer-grade camcorder.

These depth sensors have the following limitations
  • Bulky form factor (not miniaturized)
  • High cost
  • Poor handling of bad weather
  • Noticeable noise errors

Due to these limitations, the holographic display and camcorder will use alternatives to depth sensors.

What are the depth sensor alternatives?

Cameras


We can use one or more cameras to retrieve the depth information.

These camera configurations are
  • Monocular Camera
  • Stereoscopic Cameras
  • N-View Cameras


For the first prototype, we will limit our use case to indoors.

I haven't decided whether to use a monocular camera, stereoscopic cameras or n-view cameras.  This will largely be decided by how much time I have available.  Most likely, I will try all of these camera configurations and compare the results, the design, and the ease of use.

The camcorder should record
  • A person
  • Indoors

What are the depth sensors?

A camera records a scene.
The scene is recorded in 2D.  That is, it has width and height.
There is no depth distance recorded with a camera.
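
To see why a single image carries no depth, here is a minimal sketch that assumes an idealized pinhole camera (the camera model, the function name `project_pinhole`, and the sample points are all assumptions for illustration). A 3D point (X, Y, Z) lands on the image at (f·X/Z, f·Y/Z), so two points on the same ray through the camera center land on the same image position, regardless of how far away they are.

```python
def project_pinhole(point_xyz, focal_length=1.0):
    """Project a 3D point in camera coordinates onto the 2D image plane.

    Assumes an idealized pinhole camera looking down the +z axis.
    """
    x, y, z = point_xyz
    if z <= 0:
        raise ValueError("the point must be in front of the camera (z > 0)")
    return (focal_length * x / z, focal_length * y / z)


# Two hypothetical points on the same ray: one at 1 m depth, one at 4 m depth.
near_point = (0.5, 0.25, 1.0)
far_point = (2.0, 1.0, 4.0)

# Both project to the same image position, so the image alone
# cannot tell us which depth the camera was looking at.
print(project_pinhole(near_point))
print(project_pinhole(far_point))
```

The z value is divided out during projection, which is exactly the information a depth sensor measures directly.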


A depth sensor records the z-axis distance to every depth point.

The z-axis distance is the depth distance.


  • It is the distance between the depth sensor emitter and the surface point of an object in the scene.

For example, imagine you shoot one single laser beam from the depth sensor emitter to some object. Let's say, it's a small cube box in the scene. 

When the laser beam hits some point on the surface of the small box, you should see only one laser beam point reflected on the surface of the box.  

This reflected point is the depth point. 

The light from this depth point is reflected back to the depth sensor plane.  The time of flight, that is, the time the beam takes to travel to the surface and back, is measured to calculate the distance between the depth sensor emitter and the reflected surface point.
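
The time-of-flight calculation can be sketched in a few lines. This is a minimal sketch, assuming the sensor reports the round-trip time of a single beam; the function name `tof_distance_m` and the 20 ns sample value are hypothetical. Because the light travels to the surface and back, the one-way distance is half of the round-trip path.

```python
# Speed of light in vacuum, in meters per second.
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def tof_distance_m(round_trip_time_s: float) -> float:
    """Return the one-way distance in meters for a measured round-trip time.

    The beam covers the emitter-to-surface distance twice (out and back),
    so the path length is divided by two.
    """
    if round_trip_time_s < 0:
        raise ValueError("round-trip time cannot be negative")
    return SPEED_OF_LIGHT_M_PER_S * round_trip_time_s / 2.0


# A hypothetical beam that returns after 20 nanoseconds
# corresponds to a surface roughly 3 meters away.
print(f"{tof_distance_m(20e-9):.3f} m")
```

The times involved are tiny: one nanosecond of round-trip time corresponds to about 15 cm of distance.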

Now, imagine multiple laser beams hitting the surface of the box.   This means we can sample the distance from the laser beam emitters to the surface at each point.

That is just one object.

What if we shoot many laser beams at all objects in the scene?

With this, we can sample the time-of-flight distances between the laser beam emitters and the reflected depth points on all visible objects.
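
Sampling many beams at once amounts to building a grid of distances, often called a depth map. The sketch below assumes the beams are arranged in a regular grid and that a beam with no detected return is marked as `None`; the function name `depth_map_from_times` and the sample times are hypothetical, not taken from any particular sensor.

```python
# Speed of light in vacuum, in meters per second.
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def depth_map_from_times(round_trip_times_s):
    """Convert a 2D grid of round-trip times (seconds) to distances (meters).

    Each entry is one beam. None means no reflection was detected for
    that beam, and it stays None in the output.
    """
    return [
        [None if t is None else SPEED_OF_LIGHT_M_PER_S * t / 2.0 for t in row]
        for row in round_trip_times_s
    ]


# A hypothetical 2x3 grid of beams: a box about 3 m away on the left,
# a wall about 6 m away on the right, and one beam with no return.
times = [
    [20e-9, 20e-9, 40e-9],
    [20e-9, None, 40e-9],
]

depth_map = depth_map_from_times(times)
for row in depth_map:
    print(["no return" if d is None else f"{d:.2f} m" for d in row])
```

Each cell of the result is one depth point, so the grid as a whole describes the visible surfaces of the scene as seen from the sensor.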



In the next part, I'll talk about how to use cameras to retrieve the depth information.
After that, we can use the depth points to reconstruct a scene.


