Commit 0dd2d7c4 authored by huiyujiang, committed by Calvin Miao

Perception: modify documents

Parent 640e0d9d
# How to Run Perception Module on Your Local Computer
The perception module requires an Nvidia GPU and CUDA to run the perception algorithms with Caffe. The CUDA and Caffe libraries are already installed in the released docker image. However, the Nvidia GPU driver is not installed in the released dev docker image. To run the perception module with CUDA acceleration, we suggest installing exactly the same version of the Nvidia driver inside the docker as the one installed on your host machine, and building Apollo with the GPU option.
We provide step-by-step instructions for running the perception module with an Nvidia GPU below:
@@ -47,4 +47,4 @@ or
```
./apollo.sh build_opt_gpu
```
Now the perception module can run in GPU mode with the command `./scripts/perception.sh start`. Please note that the Nvidia driver should be installed appropriately as shown above even if the perception module runs in Caffe CPU_ONLY mode (i.e., built with `./apollo.sh build` or `./apollo.sh build_opt`). Please see the detailed instructions for the perception module in [the perception README](https://github.com/ApolloAuto/apollo/blob/master/modules/perception/README.md).
3D Obstacle Perception
===================
The following sections describe the perception pipeline of obstacles
that are resolved by Apollo:
There are three main components of 3D obstacle perception:

- LiDAR Obstacle Perception
- RADAR Obstacle Perception
- Obstacle Results Fusion
## LiDAR Obstacle Perception

The details of the LiDAR obstacle perception pipeline are provided in the following sections:

- HDMap Region of Interest (ROI) Filter
- Convolutional Neural Networks (CNN) Segmentation
- MinBox Builder
- HM Object Tracker
- Sequential Type Fusion

### HDMap Region of Interest (ROI) Filter
The Region of Interest (ROI) specifies the drivable area, including road surfaces and junctions, that is retrieved from the HD
@@ -42,7 +47,7 @@ steps:
3. Point inquiry with ROI LUT
#### Coordinate Transformation
For the HDMap ROI filter, the data interface for HD map is defined in
terms of a set of polygons, each of which is actually an ordered set of
@@ -53,7 +58,7 @@ transforms the points of input point cloud and the HDMap polygons into a
local coordinate system that originates from the LiDAR sensor’s
location.
#### ROI LUT Construction
To determine whether an input point is inside or outside the ROI, Apollo
adopts a grid-wise LUT that quantizes the ROI into a birds-eye view 2D
@@ -73,7 +78,7 @@ system corresponding to the LiDAR sensor’s location. The 2D grid is composed
of 8×8 cells that are shown as green squares. The cells inside the ROI are
blue-filled squares while the ones outside the ROI are yellow-filled squares.
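As an illustration of how such a LUT can be queried, here is a minimal sketch assuming a square grid centered at the LiDAR origin; the struct and function names are hypothetical and not Apollo's actual API.

```cpp
#include <cmath>
#include <vector>

// Minimal bird's-eye-view ROI lookup table: a square grid of cells centered
// at the LiDAR origin, where true means the cell lies inside the ROI polygons.
struct RoiLut {
  double range;      // grid covers [-range, range] in both X and Y (meters)
  double cell_size;  // edge length of one cell (meters)
  std::vector<std::vector<bool>> in_roi;  // [row][col], square grid

  // Returns true if the point (x, y), expressed in the local LiDAR frame,
  // falls into an ROI cell; points outside the grid are treated as outside.
  bool IsInRoi(double x, double y) const {
    const int n = static_cast<int>(in_roi.size());
    const int col = static_cast<int>(std::floor((x + range) / cell_size));
    const int row = static_cast<int>(std::floor((y + range) / cell_size));
    if (row < 0 || row >= n || col < 0 || col >= n) return false;
    return in_roi[row][col];
  }
};
```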
#### Point Inquiry with ROI LUT
Based on the ROI LUT, the affiliation of each input point is queried
using two-step verification. Then, Apollo conducts data compilation and
@@ -98,8 +103,7 @@ table below on the usage of parameters for HDMap ROI Filter.
| cell_size | The size of cells for quantizing the 2D grid. | 0.25 meter |
| extend_dist | The distance of extending the ROI from the polygon boundary. | 0.0 meter |
### Convolutional Neural Networks (CNN) Segmentation
After the HDMap ROI filter, Apollo obtains the filtered point cloud that
includes *only* the points inside ROI (i.e., the drivable road and
@@ -128,7 +132,7 @@ The Apollo CNN segmentation consists of four successive steps:
The following sections describe the deep CNN in detail.
#### Channel Feature Extraction
Given a frame of point cloud, Apollo builds a birds-eye view (i.e.,
projected to the X-Y plane) 2D grid in the local coordinate system. Each
@@ -155,7 +159,7 @@ statistical measurements computed are the:
8. Binary value indicating whether the cell is empty or occupied
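To make the cell-wise feature extraction concrete, below is a small sketch that bins points into the bird's-eye-view grid and accumulates a subset of the statistics (max height, mean height, point count, occupancy); the types and the exact feature set here are illustrative only.

```cpp
#include <cmath>
#include <limits>
#include <vector>

struct PointXYZ { double x, y, z; };

struct CellFeature {
  double max_height = -std::numeric_limits<double>::infinity();
  double mean_height = 0.0;  // accumulated as a sum, normalized afterwards
  int point_count = 0;
  bool occupied = false;
};

// Bin points into a width x width bird's-eye-view grid covering
// [-range, range] in X and Y, accumulating a few per-cell statistics.
std::vector<CellFeature> ExtractChannelFeatures(
    const std::vector<PointXYZ>& points, double range, int width) {
  std::vector<CellFeature> cells(width * width);
  for (const auto& p : points) {
    const int col =
        static_cast<int>(std::floor((p.x + range) / (2.0 * range) * width));
    const int row =
        static_cast<int>(std::floor((p.y + range) / (2.0 * range) * width));
    if (row < 0 || row >= width || col < 0 || col >= width) continue;
    CellFeature& c = cells[row * width + col];
    if (p.z > c.max_height) c.max_height = p.z;
    c.mean_height += p.z;
    ++c.point_count;
    c.occupied = true;
  }
  for (auto& c : cells) {
    if (c.point_count > 0) c.mean_height /= c.point_count;
  }
  return cells;
}
```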
#### CNN-Based Obstacle Prediction
Based on the channel features described above, Apollo uses a deep fully
convolutional neural network (FCNN) to predict the cell-wise obstacle
@@ -191,7 +195,7 @@ layers with non-linear activation (i.e., ReLu) layers.
<div align=center>Figure 2 The FCNN for cell-wise obstacle prediction</div>
#### Obstacle Clustering
After the CNN-based prediction step, Apollo obtains prediction
information for individual cells. Apollo utilizes five cell object
@@ -234,7 +238,7 @@ components whose root nodes are adjacent to each other.
The class probabilities are summed up over the nodes (cells) within the object cluster for each candidate obstacle type, including vehicle, pedestrian, bicyclist and unknown. The obstacle type corresponding to the maximum averaged probability is the final classification result of the object cluster.
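The classification rule described above can be sketched as follows; the enum and function are hypothetical, but the logic mirrors the description: sum the per-cell class probabilities over the cluster and take the type with the maximum (averaged) probability.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Candidate obstacle types, mirroring the four classes mentioned above.
enum ObstacleType { VEHICLE = 0, PEDESTRIAN, BICYCLIST, UNKNOWN, NUM_TYPES };

// Per-cell class probabilities as output by the CNN-based prediction.
using ClassProb = std::array<double, NUM_TYPES>;

ObstacleType ClassifyCluster(const std::vector<ClassProb>& cell_probs) {
  ClassProb sum{};  // zero-initialized accumulator over all cells in the cluster
  for (const auto& p : cell_probs) {
    for (std::size_t k = 0; k < NUM_TYPES; ++k) sum[k] += p[k];
  }
  // Dividing by the cell count does not change the argmax, so comparing the
  // sums directly gives the type with the maximum averaged probability.
  std::size_t best = 0;
  for (std::size_t k = 1; k < NUM_TYPES; ++k) {
    if (sum[k] > sum[best]) best = k;
  }
  return static_cast<ObstacleType>(best);
}
```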
#### Post-processing
After clustering, Apollo obtains a set of candidate object clusters each
of which includes several cells. In the post-processing step, Apollo
@@ -264,10 +268,7 @@ explains the parameter usage and default values for CNN Segmentation.
| feature_param {height} | The number of cells in Y (row) axis of the 2D grid. | 512 |
| feature_param {range} | The range of the 2D grid with respect to the origin (the LiDAR sensor). | 60 meters |
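As a quick sanity check of these defaults, and assuming the grid spans [-range, range] along each axis, the resulting cell resolution is roughly:

```latex
\text{cell size} = \frac{2 \times 60\ \text{m}}{512} \approx 0.234\ \text{m}
```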
### MinBox Builder
The object builder component establishes a bounding box for the detected
obstacles. Due to occlusions or distance to the LiDAR sensor, the point
@@ -292,8 +293,7 @@ minimum area as the final bounding box.
<div align=center>Figure 4 Illustration of MinBox Object Builder</div>
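A minimal sketch of the min-area selection idea is given below; it evaluates candidate box orientations over a cluster's 2D points and keeps the direction with the smallest aligned bounding-box area. The helpers are hypothetical and omit the convex-hull construction and heading constraints of the real builder.

```cpp
#include <cmath>
#include <limits>
#include <utility>
#include <vector>

struct Point2D { double x, y; };

// Area of the bounding rectangle aligned with the unit direction (dx, dy).
double OrientedBoxArea(const std::vector<Point2D>& pts, double dx, double dy) {
  double min_u = std::numeric_limits<double>::max();
  double max_u = std::numeric_limits<double>::lowest();
  double min_v = min_u, max_v = max_u;
  for (const auto& p : pts) {
    const double u = p.x * dx + p.y * dy;   // projection onto the direction
    const double v = -p.x * dy + p.y * dx;  // projection onto its normal
    if (u < min_u) min_u = u;
    if (u > max_u) max_u = u;
    if (v < min_v) min_v = v;
    if (v > max_v) max_v = v;
  }
  return (max_u - min_u) * (max_v - min_v);
}

// Among candidate edge directions, keep the one whose aligned box is smallest.
std::pair<double, double> MinAreaDirection(
    const std::vector<Point2D>& pts,
    const std::vector<std::pair<double, double>>& candidate_dirs) {
  double best_area = std::numeric_limits<double>::max();
  std::pair<double, double> best_dir{1.0, 0.0};
  for (const auto& d : candidate_dirs) {
    const double len = std::hypot(d.first, d.second);
    if (len <= 0.0) continue;
    const double area = OrientedBoxArea(pts, d.first / len, d.second / len);
    if (area < best_area) {
      best_area = area;
      best_dir = d;
    }
  }
  return best_dir;
}
```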
### HM Object Tracker
The HM object tracker is designed to track obstacles detected by the
segmentation step. In general, it forms and updates track lists by
@@ -303,7 +303,7 @@ lists will be estimated after association. In HM object tracker, the
Hungarian algorithm is used for detection-to-track association, and a
Robust Kalman Filter is adopted for motion estimation.
#### Detection-to-Track Association
When associating detection to existing track lists, Apollo constructs a
bipartite graph and then uses the Hungarian algorithm to find the best
@@ -343,7 +343,7 @@ with distance greater than a reasonable maximum distance threshold.
<div align=center>Figure 5 Illustration of Bipartite Graph Matching</div>
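As an illustration of the gating step, the sketch below builds a track-by-detection cost matrix in which pairs farther apart than a maximum distance are given a prohibitively large cost, so the Hungarian solver will never associate them; the real tracker uses a richer association distance than plain Euclidean centroid distance.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

struct Centroid { double x, y; };

// Build a |tracks| x |detections| cost matrix of centroid distances. Pairs
// farther apart than max_dist get an effectively infinite cost so that the
// Hungarian solver never associates them.
std::vector<std::vector<double>> BuildCostMatrix(
    const std::vector<Centroid>& tracks,
    const std::vector<Centroid>& detections,
    double max_dist) {
  const double kForbidden = 1e9;
  std::vector<std::vector<double>> cost(
      tracks.size(), std::vector<double>(detections.size(), kForbidden));
  for (std::size_t i = 0; i < tracks.size(); ++i) {
    for (std::size_t j = 0; j < detections.size(); ++j) {
      const double d = std::hypot(tracks[i].x - detections[j].x,
                                  tracks[i].y - detections[j].y);
      if (d <= max_dist) cost[i][j] = d;
    }
  }
  return cost;
}
```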
#### Track Motion Estimation
After the detection-to-track association, HM object tracker uses a
Robust Kalman Filter to estimate the motion states of current track
@@ -393,31 +393,29 @@ A high-level workflow of HM object tracker is given in figure 6.
3) Update the motion state of updated track lists and collect the
tracking results.
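As a simple illustration of the motion-state update in step 3, the sketch below advances a track's belief state under a constant-velocity assumption; the actual tracker maintains a full Robust Kalman Filter state rather than this reduced 2D form.

```cpp
struct TrackState {
  double x, y;    // belief anchor point (2D position, meters)
  double vx, vy;  // belief velocity (meters per second)
};

// Advance a track's belief state by dt seconds assuming constant velocity;
// the belief velocity itself is left unchanged by the prediction step.
TrackState Predict(const TrackState& s, double dt) {
  TrackState out = s;
  out.x += s.vx * dt;
  out.y += s.vy * dt;
  return out;
}
```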
### Sequential Type Fusion
To smooth the obstacle type over the whole trajectory and reduce the type switch between adjacent frames, Apollo utilizes a sequential type fusion algorithm based on a linear-chain Conditional Random Field (CRF), which can be formulated as below:
<div align=center><img src="images/3d_obstacle_perception/CRF_eq1.png" alt="CRF_eq1"></div>
<div align=center><img src="images/3d_obstacle_perception/CRF_eq2.png" alt="CRF_eq2"></div>
where the unary term acts on each single node, while the binary one acts on each edge.
The probability in the unary term is the class probability output by the CNN-based prediction, and the state transition probability in the binary term is modeled by the obstacle type transition from time t-1 to time t, which is statistically learned from large amounts of obstacle trajectories. Specifically, Apollo also uses a learned confusion matrix indicating the probability of changing from the predicted type to the ground-truth type to rectify the original CNN-based class probability.
The sequential obstacle type is finally optimized by using the Viterbi algorithm to solve the following problem:
<div align=center><img src="images/3d_obstacle_perception/CRF_eq3.png" alt="CRF_eq3"></div>
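For reference, a generic linear-chain CRF over the per-frame type labels y_1..y_T given per-frame observations x_1..x_T has the form below; the exact potentials Apollo uses (the CNN class probability in the unary term and the learned transition/confusion statistics in the binary term) may be parameterized differently.

```latex
P(y_{1:T} \mid x_{1:T}) =
  \frac{1}{Z(x_{1:T})}
  \exp\!\left( \sum_{t=1}^{T} \psi_u(y_t, x_t) + \sum_{t=2}^{T} \psi_b(y_{t-1}, y_t) \right),
\qquad
\hat{y}_{1:T} = \arg\max_{y_{1:T}} P(y_{1:T} \mid x_{1:T})
```

The arg-max decoding over the whole trajectory is exactly what the Viterbi algorithm computes in time linear in the trajectory length.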
## RADAR Obstacle Perception

### RADAR Detector

Given the data from the RADAR sensor, some basic preprocessing is performed. First, the track ID is extended, because Apollo needs a global track ID for ID association: the RADAR sensor only provides an 8-bit ID, so it is hard to determine whether two objects with the same ID in two adjacent frames denote the same object in the tracking history, especially when frames are dropped. Apollo uses the measurement state provided by the RADAR sensor to handle this problem, and assigns a new track ID to an object that is far away from the object with the same track ID in the last frame. Second, a false-positive filter is used to remove noise; Apollo applies thresholds to the RADAR data to filter out results that are likely to be noise. The detections are then built into a unified object format and translated into the world coordinate system using the calibration results. Since the RADAR sensor only provides the relative velocity of each object, Apollo adds the host vehicle velocity obtained from localization to derive the absolute velocity of the detected object. Finally, the HDMap ROI filter is applied so that only objects inside the ROI are passed to the sensor fusion algorithm.

## Obstacle Results Fusion

The sensor fusion module is designed to fuse LiDAR tracking results and RADAR detection results. In general, a list of fusion items is maintained: Apollo first matches the incoming sensor results with the existing fusion items by track ID, then computes an association matrix between the unmatched sensor results and the unmatched fusion items to obtain an optimal matching. A matched sensor result updates the corresponding fusion item through an Adaptive Kalman Filter, an unmatched sensor result creates a new fusion item, and an unmatched fusion item is removed from the list if it has become too stale.
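The bookkeeping described above can be sketched as follows; the types, fields and staleness threshold are hypothetical, and the ID-based and Hungarian association steps are assumed to have already marked the matched items and updated them through the Adaptive Kalman Filter.

```cpp
#include <utility>
#include <vector>

struct SensorResult { int track_id; double timestamp; };
struct FusionItem   { int track_id; double last_update; bool matched_this_cycle; };

// After the ID-based and Hungarian association steps, apply the three rules:
// matched items have already been updated (and flagged) via the Adaptive
// Kalman Filter, unmatched results create new items, stale items are pruned.
void UpdateFusionList(std::vector<FusionItem>* items,
                      const std::vector<SensorResult>& unmatched_results,
                      double now, double max_staleness) {
  // Every unmatched sensor result starts a new fusion item.
  for (const auto& r : unmatched_results) {
    items->push_back(FusionItem{r.track_id, r.timestamp, true});
  }
  // Unmatched items that have not been observed for too long are dropped.
  std::vector<FusionItem> kept;
  for (const auto& item : *items) {
    if (item.matched_this_cycle || now - item.last_update <= max_staleness) {
      kept.push_back(item);
    }
  }
  *items = std::move(kept);
}
```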
### Fusion Items Management
@@ -429,4 +427,4 @@ When associating sensor results to the fusion lists, Apollo first matches the id
### Motion Fusion
Apollo uses an Adaptive Kalman Filter with a constant acceleration motion model to estimate the motion of the current item. The motion states include the belief anchor point, belief velocity and belief acceleration, which correspond to the 3D position, 3D velocity and 3D acceleration respectively. In practice, Apollo only obtains position and velocity from the sensor results; in motion fusion, Apollo caches the states of all sensor results and computes acceleration via the Kalman Filter. The LiDAR tracker and the RADAR detector both provide uncertainties of position and velocity, and all states and uncertainties are fed into the Adaptive Kalman Filter to obtain the fused result. Note that, to overcome over-estimation of the update gain, a breakdown threshold is used during filtering.
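A one-dimensional sketch of a measurement update with a capped gain is shown below; it is a deliberate simplification (Apollo's filter is multi-dimensional, uses a constant acceleration model, and adapts to the reported uncertainties), but it illustrates how a breakdown threshold keeps a single noisy measurement from dominating the fused estimate.

```cpp
struct Kalman1D {
  double x = 0.0;  // state estimate (e.g., one velocity component)
  double p = 1.0;  // estimate variance
};

// Measurement update with a capped gain: the standard Kalman gain is limited
// by a breakdown threshold so that one noisy measurement cannot dominate the
// fused estimate.
void UpdateWithGainCap(Kalman1D* kf, double z, double meas_var, double gain_cap) {
  double k = kf->p / (kf->p + meas_var);  // standard Kalman gain
  if (k > gain_cap) k = gain_cap;         // apply the breakdown threshold
  kf->x += k * (z - kf->x);               // correct the state
  kf->p *= (1.0 - k);                     // shrink the variance accordingly
}
```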
# Perception
## Introduction
The goal of the perception module is to provide the ability to perceive obstacles and traffic lights. The obstacle submodule detects, segments, classifies and tracks obstacles in the ROI defined by the high-resolution (HD) map. In addition, it predicts the obstacles’ motion and pose information (e.g., heading, velocity, etc.). It consists of two main components: obstacle perception based on input 3D point cloud data from the LiDAR sensor, and obstacle fusion of LiDAR and RADAR obstacles. Please see details in [the document of 3D obstacles perception](https://github.com/ApolloAuto/apollo/blob/master/docs/specs/3d_obstacle_perception.md). The traffic light submodule detects traffic lights and recognizes their status in the images. Please see details in [the document of traffic light perception](https://github.com/ApolloAuto/apollo/blob/master/docs/specs/traffic_light.md).
## Input
* Point cloud data (ROS topic _/apollo/sensor/velodyne64/compensator/PointCloud2_)
* RADAR data (ROS topic _/apollo/sensor/conti_radar_)
* Image data (ROS topic _/apollo/sensor/camera/traffic/image_long_ & _/apollo/sensor/camera/traffic/image_short_)
* Coordinate frame transformation information over time (ROS topic _/tf_)
* HD map
* Extrinsic parameters of LiDAR sensor calibration (ROS topic _/tf_static_)
* Extrinsic parameters of RADAR sensor calibration (from YAML files)
* Extrinsic and Intrinsic parameters of all camera calibration (from YAML files)
* Velocity of host vehicle (ROS topic _/apollo/localization/pose_)
## Output
* 3D obstacle tracks with heading, velocity and classification information (ROS topic _/apollo/perception/obstacles_)
* Traffic light bounding box and status (ROS topic _/apollo/perception/traffic_light_)
## Instruction
1. Set up the general settings in the configuration file `modules/perception/conf/perception.conf`.
2. Run the command `./scripts/bootstrap.sh` to launch the web GUI.
3. Select the vehicle model and HD map in the web GUI.
4. Launch the perception module by using the command `./scripts/perception.sh start` or by enabling the perception button on the *Module Controller* page of the web GUI. To stop the perception module, run `./scripts/perception.sh stop`.
5. In addition, we provide some demo data for developers; please download it from our [Open Data Platform](https://console.bce.baidu.com/apollo/task/download).
## Function enable/disable
The perception framework is designed as a directed acyclic graph (DAG). A typical DAG configuration for the perception module is shown below. There are three components in a DAG configuration: sub-nodes, edges and shared data. Each function is implemented as a sub-node in the DAG, and sub-nodes that share data are connected by an edge from producer to consumer.
The default obstacle perception consists of "LidarProcessSubnode", "RadarProcessSubnode" and "FusionSubnode", as shown in the *subnode_config* part. "LidarProcessSubnode" and "RadarProcessSubnode" receive sensor data and output obstacle data independently, i.e., the "LidarObjectData" and "RadarObjectData" in the *data_config* part. "FusionSubnode" subscribes to both kinds of obstacle data and publishes the final results. Traffic light perception is composed of "TLPreprocessorSubnode" and "TLProcSubnode". The edge and data configurations define these links. Each function can be disabled by removing the corresponding sub-node, edge and shared data configuration; just make sure all remaining input and output configurations are correct.
``` protobuf
# Define all nodes in DAG streaming.
subnode_config {
  # 64-Lidar Input nodes.
  subnodes {
    id: 1
    name: "LidarProcessSubnode"
    reserve: "device_id:velodyne64;"
    type: SUBNODE_IN
  }
  # Front radar Input nodes.
  subnodes {
    id: 2
    name: "RadarProcessSubnode"
    reserve: "device_id:radar;"
    type: SUBNODE_IN
  }
  # Fusion node.
  subnodes {
    id: 31
    name: "FusionSubnode"
    reserve: "pub_driven_event_id:1001;lidar_event_id:1001;radar_event_id:1002;"
    type: SUBNODE_OUT
  }
  # TrafficLight Preprocess node.
  subnodes {
    id: 41
    name: "TLPreprocessorSubnode"
    type: SUBNODE_IN
  }
  # TrafficLight process node.
  subnodes {
    id: 42
    name: "TLProcSubnode"
    type: SUBNODE_OUT
  }
}

###################################################################
# Define all edges linked nodes.
edge_config {
  # 64-Lidar LidarProcessSubnode -> FusionSubnode
  edges {
    id: 101
    from_node: 1
    to_node: 31
    events {
      id: 1001
      name: "lidar_fusion"
    }
  }
  # Radar RadarProcessSubnode -> FusionSubnode
  edges {
    id: 102
    from_node: 2
    to_node: 31
    events {
      id: 1002
      name: "radar_fusion"
    }
  }
  # TLPreprocessorSubnode -> TLProcSubnode
  edges {
    id: 201
    from_node: 41
    to_node: 42
    events {
      id: 1003
      name: "traffic_light"
    }
  }
}

###################################################################
# Define all shared data.
data_config {
  datas {
    id: 1
    name: "LidarObjectData"
  }
  datas {
    id: 2
    name: "RadarObjectData"
  }
  datas {
    id: 3
    name: "TLPreprocessingData"
  }
}
```
**Note**: An Nvidia GPU and CUDA are required to run the perception module with Caffe. The CUDA and Caffe libraries are already installed in the released docker image. However, the Nvidia GPU driver is not installed in the released dev docker image. To run the perception module with CUDA acceleration, we suggest installing exactly the same version of the Nvidia driver inside the docker as the one installed on your host machine, and building Apollo with the GPU option (i.e., using `./apollo.sh build_gpu` or `./apollo.sh build_opt_gpu`). Please see the detailed instructions in [How to Run Perception Module on Your Local Computer](https://github.com/ApolloAuto/apollo/blob/master/docs/howto/how_to_run_perception_module_on_your_local_computer.md).