Calibrating sensors on an L2 autonomous vehicle

In this post, I discuss how to calibrate the suite of sensors used in an L2 autonomous prototype vehicle.

Note:
- To ensure the quality of a dataset generated from an L2 autonomous prototype, calibrate all on-board sensors for each trip.

In our autonomous vehicle prototypes, we use:

6 - Cameras
- we use 6 cameras
- native 1600x900 resolution, cropped to smaller images
- images are in native BGR format
- auto-exposure with a maximum exposure time of 20 ms
- Bayer8 format, 1 byte per pixel encoding
- 1/1.8" CMOS sensor, 12 Hz capture frequency
- positions:
  - one front center camera
  - one front side mirror camera per side
  - one rear center camera
  - one rear door centered camera per side

5 - Long Range RADARs
- we use 5 RADAR sensors
- 13 Hz capture frequency, operating at 77 GHz
- measures distance and velocity independently in one cycle
- positions:
  - one front bumper center radar
  - one front side mirror radar per side
  - one rear door center radar per side


1 - LIDAR
- 20 Hz capture frequency with 32 channels
- 360 degrees horizontal FOV
- +10 degrees to -30 degrees vertical FOV
- range: 80 m to 100 m, but usable up to 70 m
- accuracy: +/- 2 cm
- up to 1.39 million points per second


Camera Calibration - Extrinsics
- use a cube-shaped object with known ChArUco patterns on three orthogonal planes.
- compute the transformation matrix from the camera to the LIDAR by aligning the planes of the object.
- compute the camera-to-ego-frame transformation matrix by combining the camera-to-LIDAR transformation with the LIDAR-to-ego-frame transformation.

Note:
- the ego frame is at the midpoint of the vehicle's rear axle.
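
The last step above is a product of two rigid transformations. Here is a minimal sketch of that chaining in plain Python. The helper names (`make_transform`, `compose`, `apply`) and all offsets are hypothetical values for illustration, not our measured calibration, and the rotation is limited to yaw to keep the example short.

```python
import math

def make_transform(yaw, tx, ty, tz):
    """4x4 homogeneous transform: rotate about z by yaw (rad), then translate."""
    c, s = math.cos(yaw), math.sin(yaw)
    return [
        [c, -s, 0.0, tx],
        [s, c, 0.0, ty],
        [0.0, 0.0, 1.0, tz],
        [0.0, 0.0, 0.0, 1.0],
    ]

def compose(a, b):
    """Matrix product a @ b: apply b first, then a."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def apply(t, point):
    """Map a 3D point through the 4x4 transform t."""
    v = (point[0], point[1], point[2], 1.0)
    return tuple(sum(t[i][k] * v[k] for k in range(4)) for i in range(3))

# Hypothetical values: camera 0.5 m ahead of and 0.3 m below the LIDAR,
# LIDAR 1.0 m ahead of and 1.8 m above the rear axle midpoint.
T_cam_to_lidar = make_transform(0.0, 0.5, 0.0, -0.3)
T_lidar_to_ego = make_transform(0.0, 1.0, 0.0, 1.8)

# Camera -> ego = (LIDAR -> ego) applied after (camera -> LIDAR).
T_cam_to_ego = compose(T_lidar_to_ego, T_cam_to_lidar)

camera_origin_in_ego = apply(T_cam_to_ego, (0.0, 0.0, 0.0))
```

The order of the product matters: the camera-to-LIDAR transform is applied first, so it sits on the right.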

Camera Calibration - Intrinsics
- compute the intrinsics and the distortion parameters of the camera using a calibration board with a known set of patterns.
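
To make clear what is being estimated, here is a small sketch of the forward model that the calibration fits: a pinhole projection with focal lengths `fx`, `fy`, principal point `cx`, `cy`, and two radial distortion coefficients `k1`, `k2`. The two-term radial model and all numbers below are assumptions for illustration; this post does not specify which distortion model our calibration uses.

```python
def project(point_cam, fx, fy, cx, cy, k1=0.0, k2=0.0):
    """Project a 3D point in the camera frame (z forward) to pixel coordinates."""
    X, Y, Z = point_cam
    x, y = X / Z, Y / Z                        # normalized image coordinates
    r2 = x * x + y * y                         # squared radius from the optical axis
    radial = 1.0 + k1 * r2 + k2 * r2 * r2      # radial distortion factor
    u = fx * x * radial + cx
    v = fy * y * radial + cy
    return u, v

# Hypothetical intrinsics for a 1600x900 image.
fx, fy, cx, cy = 1000.0, 1000.0, 800.0, 450.0

center = project((0.0, 0.0, 5.0), fx, fy, cx, cy)
undistorted = project((1.0, 0.0, 10.0), fx, fy, cx, cy)
distorted = project((1.0, 0.0, 10.0), fx, fy, cx, cy, k1=-0.1)
```

Calibration runs this model in reverse: given many detected board corners with known 3D positions, it searches for the parameters that minimize the pixel error between the projected and the detected corners.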

RADAR calibration
- calibrate the yaw angle using a brute force approach to minimize the compensated range rates for static objects.
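
The brute-force search can be sketched as below. This is a simplified illustration, not our production code: it assumes the vehicle drives straight at a known speed, ignores yaw rate and the radar's lever arm, and treats every detection as a static object. The function names, the search window, and the synthetic data are all hypothetical. For a static object seen at azimuth `az` by a radar mounted at angle `yaw`, the expected range rate is `-ego_speed * cos(az + yaw)`, so the compensated range rate should be zero at the correct yaw.

```python
import math

def compensated_range_rates(detections, ego_speed, yaw):
    """Range rates left over after removing the ego motion for a candidate yaw.

    detections: list of (azimuth_rad, range_rate_mps) in the radar frame.
    """
    return [rr + ego_speed * math.cos(az + yaw) for az, rr in detections]

def calibrate_yaw(detections, ego_speed, nominal_yaw,
                  window=math.radians(5.0), step=math.radians(0.1)):
    """Grid search around the nominal mounting yaw for the lowest squared residual."""
    steps = int(round(2.0 * window / step))
    best_yaw, best_cost = nominal_yaw, float("inf")
    for i in range(steps + 1):
        yaw = nominal_yaw - window + i * step
        residuals = compensated_range_rates(detections, ego_speed, yaw)
        cost = sum(r * r for r in residuals)
        if cost < best_cost:
            best_yaw, best_cost = yaw, cost
    return best_yaw

# Synthetic check: static objects seen by a radar whose true yaw is 31.3 degrees.
true_yaw = math.radians(31.3)
ego_speed = 10.0
azimuths = [-0.4, -0.1, 0.2, 0.5]
detections = [(az, -ego_speed * math.cos(az + true_yaw)) for az in azimuths]

estimated_yaw = calibrate_yaw(detections, ego_speed, nominal_yaw=math.radians(30.0))
```

Detections spread over several azimuths are needed, because a single detection cannot distinguish `yaw` from its mirror solution around that azimuth.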

LIDAR calibration
- use a laser liner to measure the location of the LIDAR relative to the ego frame.


