diff --git a/PaddleCV/Paddle3D/PointRCNN/.gitignore b/PaddleCV/Paddle3D/PointRCNN/.gitignore
new file mode 100644
index 0000000000000000000000000000000000000000..9ea6e75c687e4ac93fa06d18bd0d1444e5d3b054
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/.gitignore
@@ -0,0 +1,14 @@
+*log*
+checkpoints*
+build
+output
+result_dir
+pp_pointrcnn*
+data/gt_database
+utils/pts_utils/dist
+utils/pts_utils/build
+utils/pts_utils/pts_utils.egg-info
+utils/cyops/*.c
+utils/cyops/*.so
+ext_op/src/*.o
+ext_op/src/*.so
diff --git a/PaddleCV/Paddle3D/PointRCNN/README.md b/PaddleCV/Paddle3D/PointRCNN/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..0560203b293d1e12ab576dcc1bd66891b1a44af1
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/README.md
@@ -0,0 +1,338 @@
+# PointRCNN 3D Object Detection Model
+
+---
+## Contents
+
+- [Introduction](#introduction)
+- [Quick Start](#quick-start)
+- [References](#references)
+- [Changelog](#changelog)
+
+## Introduction
+
+[PointRCNN](https://arxiv.org/abs/1812.04244), proposed by Shaoshuai Shi, Xiaogang Wang, and Hongsheng Li, is the first two-stage 3D object detector that operates only on raw point clouds. The first stage uses PointNet++ with MSG (Multi-scale Grouping) as the backbone to segment the raw point cloud directly into foreground and background points, and generates bounding boxes from the foreground points. The second stage refines and re-scores the generated bounding boxes in a canonical coordinate system. The model also proposes a bin-based scheme that turns box regression into a classification problem, which is shown to be effective for 3D bounding-box regression. PointRCNN is evaluated on the KITTI dataset and achieved the best performance on the KITTI 3D object detection leaderboard at the time of publication.
+
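+To make the bin-based idea concrete, the sketch below shows how a center offset can be encoded as a bin index (a classification target) plus an in-bin residual (a regression target). It is a minimal illustration, not the code used in this repo; the `LOC_SCOPE`/`LOC_BIN_SIZE` values are taken from the RPN section of `cfgs/default.yml`:
+
+```
+import numpy as np
+
+LOC_SCOPE, LOC_BIN_SIZE = 3.0, 0.5   # search range and bin width, see cfgs/default.yml
+
+def encode_bin_target(dx):
+    # shift the offset from [-LOC_SCOPE, LOC_SCOPE) into [0, 2 * LOC_SCOPE)
+    shifted = dx + LOC_SCOPE
+    # which bin the offset falls into -> classification target
+    bin_idx = int(np.floor(shifted / LOC_BIN_SIZE))
+    # distance to that bin's center -> regression target
+    residual = shifted - (bin_idx * LOC_BIN_SIZE + LOC_BIN_SIZE / 2.0)
+    return bin_idx, residual
+
+print(encode_bin_target(1.3))   # -> (8, 0.05...)
+```
+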
+The network architecture is shown below:
+
+*Figure: PointRCNN, an object detector for point clouds*
+
+**Note:** Building PointRCNN depends on custom C++ operators, which can currently only be compiled for GPU devices on Linux/Unix systems. This model **cannot run on Windows or on CPU-only devices**.
+
+
+## Quick Start
+
+### Installation
+
+**Install [PaddlePaddle](https://github.com/PaddlePaddle/Paddle):**
+
+Running the sample code in this directory requires the PaddlePaddle Fluid [develop daily build](https://www.paddlepaddle.org.cn/install/doc/tables#多版本whl包列表-dev-11), or PaddlePaddle compiled from source from the [develop branch](https://github.com/PaddlePaddle/Paddle/tree/develop).
+
+To keep the custom operators compatible with your Paddle version, it is **recommended to compile Paddle from source**; see [compile and install](https://www.paddlepaddle.org.cn/install/doc/source/ubuntu) for instructions.
+
+**Install PointRCNN:**
+
+1. Download the [PaddlePaddle/models](https://github.com/PaddlePaddle/models) repository
+
+Download the Paddle models repository with:
+
+```
+git clone https://github.com/PaddlePaddle/models
+```
+
+2. Download [pybind11](https://github.com/pybind/pybind11) into the `PaddleCV/Paddle3D/PointRCNN` directory
+
+Compiling `pts_utils` depends on `pybind11`, so the `pybind11` repository must be cloned under `PaddleCV/Paddle3D/PointRCNN`:
+
+```
+cd PaddleCV/Paddle3D/PointRCNN
+git clone https://github.com/pybind/pybind11
+```
+
+3. Build and install the `pts_utils`, `kitti_utils`, `roipool3d_utils`, and `iou_utils` modules
+
+Build and install these modules with:
+```
+sh build_and_install.sh
+```
+
+4. Install the Python dependencies
+
+Install the Python dependencies with:
+
+```
+pip install -r requirement.txt
+```
+
+**Note:** The KITTI mAP evaluation tool only runs on Python 3.6 or later, and the Python 3 environment needs `scikit-image`, `Numba`, and `fire` installed.
+The `scikit-image`, `Numba`, and `fire` entries in `requirement.txt` are exactly the dependencies of the KITTI mAP evaluation tool.
+
+### Compile the custom operators
+
+Make sure your Paddle is the PaddlePaddle Fluid develop daily build or is compiled from source on the Paddle develop branch; **compiling from source is the recommended way**.
+
+Compile the custom operators as follows:
+
+ Enter the `ext_op/src` directory and run the build script
+ ```
+ cd ext_op/src
+ sh make.sh
+ ```
+
+ After a successful build, `pointnet2_lib.so` is generated under `ext_op/src`.
+
+ Run the following to make sure the custom operators were compiled correctly:
+
+ ```
+ # add the dynamic library path to LD_LIBRARY_PATH
+ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:`python -c 'import paddle; print(paddle.sysconfig.get_lib())'`
+
+ # go back to the ext_op directory and add it to PYTHONPATH
+ cd ..
+ export PYTHONPATH=$PYTHONPATH:`pwd`
+
+ # run the unit tests
+ python tests/test_farthest_point_sampling_op.py
+ python tests/test_gather_point_op.py
+ python tests/test_group_points_op.py
+ python tests/test_query_ball_op.py
+ python tests/test_three_interp_op.py
+ python tests/test_three_nn_op.py
+ ```
+ A successful run of a unit test prints output like:
+
+ ```
+ .
+ ----------------------------------------------------------------------
+ Ran 1 test in 13.205s
+
+ OK
+ ```
+
+**Note:** The custom operators are compiled in the same way as in [PointNet++](../PointNet++); for more on compiling the custom operators, see [custom operator compilation](../PointNet++/ext_op/README.md).
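+
+As an extra sanity check, the compiled library can also be loaded directly from Python. This is a minimal sketch assuming the Fluid `load_op_library` API used in the PointNet++ custom-op docs:
+
+```
+import paddle.fluid as fluid
+# register the custom operators from the compiled shared library;
+# this fails loudly if the .so or its dependencies cannot be resolved
+fluid.load_op_library('ext_op/src/pointnet2_lib.so')
+```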
+
+### Data preparation
+
+**KITTI 3D object detection dataset:**
+
+PointRCNN is trained on the [KITTI 3D object detection](http://www.cvlibs.net/datasets/kitti/eval_object.php?obj_benchmark=3d) dataset.
+
+Download the dataset with:
+
+```
+cd data/KITTI/object
+sh download.sh
+```
+
+The images here are only used for visualization. Training additionally uses the [road planes](https://drive.google.com/file/d/1d5mq0RXRnvHPVeKx6Q612z0YRO1t2wAp/view?usp=sharing) data for data augmentation;
+please download it and extract it into the `./data/KITTI/object/training` directory.
+
+The data directory layout is as follows:
+
+```
+PointRCNN
+├── data
+│ ├── KITTI
+│ │ ├── ImageSets
+│ │ ├── object
+│ │ │ ├──training
+│ │ │ │ ├──calib & velodyne & label_2 & image_2 & planes
+│ │ │ ├──testing
+│ │ │ │ ├──calib & velodyne & image_2
+
+```
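+
+To quickly verify that the layout is complete before training, the expected subdirectories can be checked with a short script (an illustrative sketch, not part of this repo):
+
+```
+import os
+
+root = 'data/KITTI/object'
+# training needs labels and road planes; testing only needs the raw inputs
+for split, subs in [('training', ['calib', 'velodyne', 'label_2', 'image_2', 'planes']),
+                    ('testing', ['calib', 'velodyne', 'image_2'])]:
+    for sub in subs:
+        path = os.path.join(root, split, sub)
+        assert os.path.isdir(path), 'missing directory: %s' % path
+print('KITTI directory layout looks complete')
+```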
+
+
+### Training
+
+**PointRCNN model:**
+
+Training of the PointRCNN model can be launched as follows:
+
+1. Select a single GPU and set the dynamic library path
+
+```
+# train on a single GPU
+export CUDA_VISIBLE_DEVICES=0
+
+# add the dynamic library path to LD_LIBRARY_PATH
+export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:`python -c 'import paddle; print(paddle.sysconfig.get_lib())'`
+```
+
+2. Generate the ground-truth sampling database:
+
+```
+python tools/generate_gt_database.py --class_name 'Car' --split train
+```
+
+3. Train the RPN model
+
+```
+python train.py --cfg=./cfgs/default.yml \
+ --train_mode=rpn \
+ --batch_size=16 \
+ --epoch=200 \
+ --save_dir=checkpoints
+```
+
+RPN training checkpoints are saved to the `checkpoints/rpn` directory by default; the location can be changed with `--save_dir`.
+
+4. Generate augmented offline scene data, and save the RPN model's output features and ROIs for offline RCNN training
+
+Generate the augmented offline scene data with:
+
+```
+python tools/generate_aug_scene.py --class_name 'Car' --split train --aug_times 4
+```
+
+Save the RPN model's output features and ROIs for the offline augmented data. Use `--ckpt_dir` to point to the final RPN training weights, which are saved under `checkpoints/rpn` by default.
+When saving the features and ROIs, set `TEST.SPLIT` to `train_aug`, `TEST.RPN_POST_NMS_TOP_N` to `300`, and `TEST.RPN_NMS_THRESH` to `0.85`.
+Use `--output_dir` to choose where the features and ROIs are written; the default is `./output`.
+
+```
+python eval.py --cfg=cfgs/default.yml \
+ --eval_mode=rpn \
+ --ckpt_dir=./checkpoints/rpn/199 \
+ --save_rpn_feature \
+ --output_dir=output \
+ --set TEST.SPLIT train_aug TEST.RPN_POST_NMS_TOP_N 300 TEST.RPN_NMS_THRESH 0.85
+```
+
+The data saved under `--output_dir` is laid out as follows:
+
+```
+output
+├── detections
+│ ├── data # saved ROI data
+│ │ ├── 000000.txt
+│ │ ├── 000003.txt
+│ │ ├── ...
+├── features # saved output features
+│ ├── 000000_intensity.npy
+│ ├── 000000.npy
+│ ├── 000000_rawscore.npy
+│ ├── 000000_seg.npy
+│ ├── 000000_xyz.npy
+│ ├── ...
+├── seg_result # saved semantic segmentation results
+│ ├── 000000.npy
+│ ├── 000003.npy
+│ ├── ...
+```
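+
+The saved arrays are plain NumPy files and can be inspected directly. A minimal sketch (the exact shapes are implementation details, so they are left to `print` here):
+
+```
+import numpy as np
+
+xyz = np.load('output/features/000000_xyz.npy')    # sampled point coordinates
+feat = np.load('output/features/000000.npy')       # per-point backbone features
+seg = np.load('output/seg_result/000000.npy')      # per-point segmentation result
+print(xyz.shape, feat.shape, seg.shape)
+```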
+
+5. Train the RCNN model offline, pointing `--rcnn_training_roi_dir` and `--rcnn_training_feature_dir` at the ROIs and output features saved by the RPN model.
+
+```
+python train.py --cfg=./cfgs/default.yml \
+ --train_mode=rcnn_offline \
+ --batch_size=4 \
+ --epoch=30 \
+ --save_dir=checkpoints \
+ --rcnn_training_roi_dir=output/detections/data \
+ --rcnn_training_feature_dir=output/features
+```
+
+RCNN training checkpoints are saved under `checkpoints` by default (e.g. `checkpoints/rcnn_offline` for offline training); the location can be changed with `--save_dir`.
+
+**Note**: The best model is obtained by saving the RPN output features and ROIs and training the RCNN model with offline data augmentation; this is currently the only mode supported by default.
+
+
+### Evaluation
+
+**PointRCNN model:**
+
+Evaluation of the PointRCNN model can be launched as follows:
+
+1. Select a single GPU and set the dynamic library path
+
+```
+# run on a single GPU
+export CUDA_VISIBLE_DEVICES=0
+
+# add the dynamic library path to LD_LIBRARY_PATH
+export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:`python -c 'import paddle; print(paddle.sysconfig.get_lib())'`
+
+```
+
+2. Save the RPN model's output features and ROIs for the evaluation data
+
+Save the features and ROIs with the command below. Use `--ckpt_dir` to point to the final RPN training weights (saved under `checkpoints/rpn` by default),
+and `--output_dir` to choose where the features and ROIs are written; the default is `./output`.
+
+```
+python eval.py --cfg=cfgs/default.yml \
+ --eval_mode=rpn \
+ --ckpt_dir=./checkpoints/rpn/199 \
+ --save_rpn_feature \
+ --output_dir=output/val
+```
+
+The directory structure of the features and ROIs saved for the evaluation data is the same as the one described above for the offline augmented data.
+
+3. Evaluate the offline RCNN model
+
+Evaluate the offline RCNN model with:
+
+```
+python eval.py --cfg=cfgs/default.yml \
+ --eval_mode=rcnn_offline \
+ --ckpt_dir=./checkpoints/rcnn_offline/29 \
+ --rcnn_eval_roi_dir=output/val/detections/data \
+ --rcnn_eval_feature_dir=output/val/features \
+ --save_result
+```
+
+The final detection result files are saved in the `final_result` folder under `./result_dir`; passing `--save_result` additionally saves the `roi_output` and `refine_output` result files.
+The `result_dir` layout is as follows:
+
+```
+result_dir
+├── final_result
+│ ├── data # final detection results
+│ │ ├── 000001.txt
+│ │ ├── 000002.txt
+│ │ ├── ...
+├── roi_output
+│ ├── data # detection ROIs output by the RCNN model
+│ │ ├── 000001.txt
+│ │ ├── 000002.txt
+│ │ ├── ...
+├── refine_output
+│ ├── data # decoded detection results
+│ │ ├── 000001.txt
+│ │ ├── 000002.txt
+│ │ ├── ...
+```
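+
+Each result file follows the KITTI label format, one object per line, ending with a confidence score. A minimal sketch for reading the detections (the file name is for illustration only):
+
+```
+with open('result_dir/final_result/data/000001.txt') as f:
+    for line in f:
+        vals = line.split()
+        # type, truncated, occluded, alpha, 2D bbox (4), h/w/l, x/y/z, ry, score
+        cls_type, score = vals[0], float(vals[-1])
+        print(cls_type, score)
+```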
+
+4. Obtain evaluation results with the KITTI mAP tool
+
+If the Python version used during evaluation is 3.6 or later, the KITTI mAP evaluation runs automatically. With an older Python,
+since the KITTI mAP tool only supports Python 3.6+, run the evaluation separately with a suitable Python version:
+
+```
+python3 kitti_map.py
+```
+
+Evaluating with the final trained weights ([RPN model](https://paddlemodels.bj.bcebos.com/Paddle3D/pointrcnn_rpn.tar) and [RCNN model](https://paddlemodels.bj.bcebos.com/Paddle3D/pointrcnn_rcnn_offline.tar)) gives the following results:
+
+| Car AP@ | 0.70(easy) | 0.70(moderate) | 0.70(hard) |
+| :------- | :--------: | :------------: | :--------: |
+| bbox AP: | 90.20 | 88.85 | 88.59 |
+| bev AP: | 89.50 | 86.97 | 85.58 |
+| 3d AP: | 86.66 | 76.65 | 75.90 |
+| aos AP: | 90.10 | 88.64 | 88.26 |
+
+
+## References
+
+- [PointRCNN: 3D Object Proposal Generation and Detection From Point Cloud](https://arxiv.org/abs/1812.04244), Shaoshuai Shi, Xiaogang Wang, Hongsheng Li.
+- [PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space](https://arxiv.org/abs/1706.02413), Charles R. Qi, Li Yi, Hao Su, Leonidas J. Guibas.
+- [PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation](https://www.semanticscholar.org/paper/PointNet%3A-Deep-Learning-on-Point-Sets-for-3D-and-Qi-Su/d997beefc0922d97202789d2ac307c55c2c52fba), Charles Ruizhongtai Qi, Hao Su, Kaichun Mo, Leonidas J. Guibas.
+
+## Changelog
+
+- 11/2019: Added the PointRCNN model.
+
diff --git a/PaddleCV/Paddle3D/PointRCNN/build_and_install.sh b/PaddleCV/Paddle3D/PointRCNN/build_and_install.sh
new file mode 100644
index 0000000000000000000000000000000000000000..83aaef84704445cf9c7bf3e87cc453e0daa708cd
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/build_and_install.sh
@@ -0,0 +1,7 @@
+# compile cyops
+python utils/cyops/setup.py develop
+
+# compile and install pts_utils
+cd utils/pts_utils
+python setup.py install
+cd ../..
diff --git a/PaddleCV/Paddle3D/PointRCNN/cfgs/default.yml b/PaddleCV/Paddle3D/PointRCNN/cfgs/default.yml
new file mode 100644
index 0000000000000000000000000000000000000000..33dc45086ca48128174fc341e7f9fdee9374d53e
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/cfgs/default.yml
@@ -0,0 +1,167 @@
+# This config is based on https://github.com/sshaoshuai/PointRCNN/blob/master/tools/cfgs/default.yaml
+CLASSES: Car
+
+INCLUDE_SIMILAR_TYPE: True
+
+# config of augmentation
+AUG_DATA: True
+AUG_METHOD_LIST: ['rotation', 'scaling', 'flip']
+AUG_METHOD_PROB: [1.0, 1.0, 0.5]
+AUG_ROT_RANGE: 18
+
+GT_AUG_ENABLED: True
+GT_EXTRA_NUM: 15
+GT_AUG_RAND_NUM: True
+GT_AUG_APPLY_PROB: 1.0
+GT_AUG_HARD_RATIO: 0.6
+
+PC_REDUCE_BY_RANGE: True
+PC_AREA_SCOPE: [[-40, 40], [-1, 3], [0, 70.4]] # x, y, z scope in rect camera coords
+CLS_MEAN_SIZE: [[1.52563191462, 1.62856739989, 3.88311640418]]
+
+
+# 1. config of rpn network
+RPN:
+ ENABLED: True
+ FIXED: False
+
+ # config of input
+ USE_INTENSITY: False
+
+ # config of bin-based loss
+ LOC_XZ_FINE: True
+ LOC_SCOPE: 3.0
+ LOC_BIN_SIZE: 0.5
+ NUM_HEAD_BIN: 12
+
+ # config of network structure
+ BACKBONE: pointnet2_msg
+ USE_BN: True
+ NUM_POINTS: 16384
+
+ SA_CONFIG:
+ NPOINTS: [4096, 1024, 256, 64]
+ RADIUS: [[0.1, 0.5], [0.5, 1.0], [1.0, 2.0], [2.0, 4.0]]
+ NSAMPLE: [[16, 32], [16, 32], [16, 32], [16, 32]]
+ MLPS: [[[16, 16, 32], [32, 32, 64]],
+ [[64, 64, 128], [64, 96, 128]],
+ [[128, 196, 256], [128, 196, 256]],
+ [[256, 256, 512], [256, 384, 512]]]
+ FP_MLPS: [[128, 128], [256, 256], [512, 512], [512, 512]]
+ CLS_FC: [128]
+ REG_FC: [128]
+ DP_RATIO: 0.5
+
+ # config of training
+ LOSS_CLS: SigmoidFocalLoss
+ FG_WEIGHT: 15
+ FOCAL_ALPHA: [0.25, 0.75]
+ FOCAL_GAMMA: 2.0
+ REG_LOSS_WEIGHT: [1.0, 1.0, 1.0, 1.0]
+ LOSS_WEIGHT: [1.0, 1.0]
+ NMS_TYPE: normal
+
+ # config of testing
+ SCORE_THRESH: 0.3
+
+# 2. config of rcnn network
+RCNN:
+ ENABLED: True
+
+ # config of input
+ ROI_SAMPLE_JIT: False
+ REG_AUG_METHOD: multiple # multiple, single, normal
+ ROI_FG_AUG_TIMES: 10
+
+ USE_RPN_FEATURES: True
+ USE_MASK: True
+ MASK_TYPE: seg
+ USE_INTENSITY: False
+ USE_DEPTH: True
+ USE_SEG_SCORE: False
+
+ POOL_EXTRA_WIDTH: 1.0
+
+ # config of bin-based loss
+ LOC_SCOPE: 1.5
+ LOC_BIN_SIZE: 0.5
+ NUM_HEAD_BIN: 9
+ LOC_Y_BY_BIN: False
+ LOC_Y_SCOPE: 0.5
+ LOC_Y_BIN_SIZE: 0.25
+ SIZE_RES_ON_ROI: False
+
+ # config of network structure
+ USE_BN: False
+ DP_RATIO: 0.0
+
+ BACKBONE: pointnet # pointnet
+ XYZ_UP_LAYER: [128, 128]
+
+ NUM_POINTS: 512
+ SA_CONFIG:
+ NPOINTS: [128, 32, -1]
+ RADIUS: [0.2, 0.4, 100]
+ NSAMPLE: [64, 64, 64]
+ MLPS: [[128, 128, 128],
+ [128, 128, 256],
+ [256, 256, 512]]
+ CLS_FC: [256, 256]
+ REG_FC: [256, 256]
+
+ # config of training
+ LOSS_CLS: BinaryCrossEntropy
+ FOCAL_ALPHA: [0.25, 0.75]
+ FOCAL_GAMMA: 2.0
+ CLS_WEIGHT: [1.0, 1.0, 1.0]
+ CLS_FG_THRESH: 0.6
+ CLS_BG_THRESH: 0.45
+ CLS_BG_THRESH_LO: 0.05
+ REG_FG_THRESH: 0.55
+ FG_RATIO: 0.5
+ ROI_PER_IMAGE: 64
+ HARD_BG_RATIO: 0.8
+
+ # config of testing
+ SCORE_THRESH: 0.3
+ NMS_THRESH: 0.1
+
+# general training config
+TRAIN:
+ SPLIT: train
+ VAL_SPLIT: smallval
+
+ LR: 0.002
+ LR_CLIP: 0.00001
+ LR_DECAY: 0.5
+ DECAY_STEP_LIST: [100, 150, 180, 200]
+ LR_WARMUP: True
+ WARMUP_MIN: 0.0002
+ WARMUP_EPOCH: 1
+
+ BN_MOMENTUM: 0.1
+ BN_DECAY: 0.5
+ BNM_CLIP: 0.01
+ BN_DECAY_STEP_LIST: [1000]
+
+ OPTIMIZER: adam # adam, adam_onecycle
+ WEIGHT_DECAY: 0.001 # L2 regularization
+ MOMENTUM: 0.9
+
+ MOMS: [0.95, 0.85]
+ DIV_FACTOR: 10.0
+ PCT_START: 0.4
+
+ GRAD_NORM_CLIP: 1.0
+
+ RPN_PRE_NMS_TOP_N: 9000
+ RPN_POST_NMS_TOP_N: 512
+ RPN_NMS_THRESH: 0.85
+ RPN_DISTANCE_BASED_PROPOSE: True
+
+TEST:
+ SPLIT: val
+ RPN_PRE_NMS_TOP_N: 9000
+ RPN_POST_NMS_TOP_N: 100
+ RPN_NMS_THRESH: 0.8
+ RPN_DISTANCE_BASED_PROPOSE: True
diff --git a/PaddleCV/Paddle3D/PointRCNN/data/KITTI/object/download.sh b/PaddleCV/Paddle3D/PointRCNN/data/KITTI/object/download.sh
new file mode 100644
index 0000000000000000000000000000000000000000..1f5818d38323c5cc7349022ba82d2a55315a59a7
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/data/KITTI/object/download.sh
@@ -0,0 +1,25 @@
+DIR="$( cd "$(dirname "$0")" ; pwd -P )"
+cd "$DIR"
+
+echo "Downloading https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_velodyne.zip"
+wget https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_velodyne.zip
+echo "https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_image_2.zip"
+wget https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_image_2.zip
+echo "https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_calib.zip"
+wget https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_calib.zip
+echo "https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_label_2.zip"
+wget https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_label_2.zip
+
+echo "Decompressing data_object_velodyne.zip"
+unzip data_object_velodyne.zip
+echo "Decompressing data_object_image_2.zip"
+unzip "data_object_image_2.zip"
+echo "Decompressing data_object_calib.zip"
+unzip data_object_calib.zip
+echo "Decompressing data_object_label_2.zip"
+unzip data_object_label_2.zip
+
+echo "Download KITTI ImageSets"
+wget https://paddlemodels.bj.bcebos.com/Paddle3D/pointrcnn_kitti_imagesets.tar
+tar xf pointrcnn_kitti_imagesets.tar
+mv ImageSets ..
diff --git a/PaddleCV/Paddle3D/PointRCNN/data/__init__.py b/PaddleCV/Paddle3D/PointRCNN/data/__init__.py
new file mode 100644
index 0000000000000000000000000000000000000000..46a4f6ee220f10f50a182f4a2ed510b0551f64a8
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/data/__init__.py
@@ -0,0 +1,13 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
+#
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
diff --git a/PaddleCV/Paddle3D/PointRCNN/data/kitti_dataset.py b/PaddleCV/Paddle3D/PointRCNN/data/kitti_dataset.py
new file mode 100644
index 0000000000000000000000000000000000000000..0765a5045f6e330646fde26fe391eb313d022124
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/data/kitti_dataset.py
@@ -0,0 +1,77 @@
+"""
+This code is based on https://github.com/sshaoshuai/PointRCNN/blob/master/lib/datasets/kitti_dataset.py
+"""
+
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
+import os
+import cv2
+import numpy as np
+import utils.calibration as calibration
+from utils.object3d import get_objects_from_label
+from PIL import Image
+
+__all__ = ["KittiDataset"]
+
+
+class KittiDataset(object):
+ def __init__(self, data_dir, split='train'):
+ assert split in ['train', 'train_aug', 'val', 'test'], "unknown split {}".format(split)
+ self.split = split
+ self.is_test = self.split == 'test'
+ self.imageset_dir = os.path.join(data_dir, 'KITTI', 'object', 'testing' if self.is_test else 'training')
+
+ split_dir = os.path.join(data_dir, 'KITTI', 'ImageSets', split + '.txt')
+        with open(split_dir) as f:
+            self.image_idx_list = [x.strip() for x in f.readlines()]
+        self.num_sample = len(self.image_idx_list)
+
+ self.image_dir = os.path.join(self.imageset_dir, 'image_2')
+ self.lidar_dir = os.path.join(self.imageset_dir, 'velodyne')
+ self.calib_dir = os.path.join(self.imageset_dir, 'calib')
+ self.label_dir = os.path.join(self.imageset_dir, 'label_2')
+ self.plane_dir = os.path.join(self.imageset_dir, 'planes')
+
+ def get_image(self, idx):
+ img_file = os.path.join(self.image_dir, '%06d.png' % idx)
+ assert os.path.exists(img_file)
+ return cv2.imread(img_file) # (H, W, 3) BGR mode
+
+ def get_image_shape(self, idx):
+ img_file = os.path.join(self.image_dir, '%06d.png' % idx)
+ assert os.path.exists(img_file)
+ im = Image.open(img_file)
+ width, height = im.size
+ return height, width, 3
+
+ def get_lidar(self, idx):
+ lidar_file = os.path.join(self.lidar_dir, '%06d.bin' % idx)
+ assert os.path.exists(lidar_file)
+ return np.fromfile(lidar_file, dtype=np.float32).reshape(-1, 4)
+
+ def get_calib(self, idx):
+ calib_file = os.path.join(self.calib_dir, '%06d.txt' % idx)
+ assert os.path.exists(calib_file)
+ return calibration.Calibration(calib_file)
+
+ def get_label(self, idx):
+ label_file = os.path.join(self.label_dir, '%06d.txt' % idx)
+ assert os.path.exists(label_file)
+ # return kitti_utils.get_objects_from_label(label_file)
+ return get_objects_from_label(label_file)
+
+ def get_road_plane(self, idx):
+ plane_file = os.path.join(self.plane_dir, '%06d.txt' % idx)
+ with open(plane_file, 'r') as f:
+ lines = f.readlines()
+ lines = [float(i) for i in lines[3].split()]
+ plane = np.asarray(lines)
+
+ # Ensure normal is always facing up, this is in the rectified camera coordinate
+ if plane[1] > 0:
+ plane = -plane
+
+ norm = np.linalg.norm(plane[0:3])
+ plane = plane / norm
+ return plane
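+
+
+# Illustrative usage (assuming the KITTI layout described in the README):
+#   dataset = KittiDataset(data_dir='./data', split='train')
+#   pts = dataset.get_lidar(0)      # (N, 4) array of x, y, z, intensity
+#   calib = dataset.get_calib(0)    # calibration/projection info for sample 000000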
diff --git a/PaddleCV/Paddle3D/PointRCNN/data/kitti_rcnn_reader.py b/PaddleCV/Paddle3D/PointRCNN/data/kitti_rcnn_reader.py
new file mode 100644
index 0000000000000000000000000000000000000000..811a20b28402f0c7119a03605e9e90074ad99097
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/data/kitti_rcnn_reader.py
@@ -0,0 +1,1184 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
+#
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
+"""
+This code is based on https://github.com/sshaoshuai/PointRCNN/blob/master/lib/datasets/kitti_rcnn_dataset.py
+"""
+
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
+import os
+import logging
+import multiprocessing
+import numpy as np
+import scipy
+from scipy.spatial import Delaunay
+try:
+ import cPickle as pickle
+except:
+ import pickle
+
+import pts_utils
+import utils.cyops.kitti_utils as kitti_utils
+import utils.cyops.roipool3d_utils as roipool3d_utils
+from data.kitti_dataset import KittiDataset
+from utils.config import cfg
+from collections import OrderedDict
+
+__all__ = ["KittiRCNNReader"]
+
+logger = logging.getLogger(__name__)
+
+
+def has_empty(data):
+ for d in data:
+ if isinstance(d, np.ndarray) and len(d) == 0:
+ return True
+ return False
+
+
+def in_hull(p, hull):
+ """
+ :param p: (N, K) test points
+ :param hull: (M, K) M corners of a box
+ :return (N) bool
+ """
+ try:
+ if not isinstance(hull, Delaunay):
+ hull = Delaunay(hull)
+ flag = hull.find_simplex(p) >= 0
+ except scipy.spatial.qhull.QhullError:
+ logger.debug('Warning: not a hull.')
+ flag = np.zeros(p.shape[0], dtype=np.bool)
+
+ return flag
+
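+# Illustrative example: points tested against a unit-square hull
+#   hull = np.array([[0., 0.], [0., 1.], [1., 1.], [1., 0.]])
+#   in_hull(np.array([[0.5, 0.5], [2.0, 2.0]]), hull)  # -> array([ True, False])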
+
+class KittiRCNNReader(KittiDataset):
+ def __init__(self, data_dir, npoints=16384, split='train', classes='Car', mode='TRAIN',
+ random_select=True, rcnn_training_roi_dir=None, rcnn_training_feature_dir=None,
+ rcnn_eval_roi_dir=None, rcnn_eval_feature_dir=None, gt_database_dir=None):
+ super(KittiRCNNReader, self).__init__(data_dir=data_dir, split=split)
+ if classes == 'Car':
+ self.classes = ('Background', 'Car')
+ aug_scene_data_dir = os.path.join(data_dir, 'KITTI', 'aug_scene')
+ elif classes == 'People':
+ self.classes = ('Background', 'Pedestrian', 'Cyclist')
+ elif classes == 'Pedestrian':
+ self.classes = ('Background', 'Pedestrian')
+ aug_scene_data_dir = os.path.join(data_dir, 'KITTI', 'aug_scene_ped')
+ elif classes == 'Cyclist':
+ self.classes = ('Background', 'Cyclist')
+ aug_scene_data_dir = os.path.join(data_dir, 'KITTI', 'aug_scene_cyclist')
+ else:
+ assert False, "Invalid classes: %s" % classes
+
+ self.num_classes = len(self.classes)
+
+ self.npoints = npoints
+ self.sample_id_list = []
+ self.random_select = random_select
+
+        # both 'train_aug' and the other splits currently read the augmented
+        # labels and rectified points from the same directories
+        self.aug_label_dir = os.path.join(aug_scene_data_dir, 'training', 'aug_label')
+        self.aug_pts_dir = os.path.join(aug_scene_data_dir, 'training', 'rectified_data')
+
+ # for rcnn training
+ self.rcnn_training_bbox_list = []
+ self.rpn_feature_list = {}
+ self.pos_bbox_list = []
+ self.neg_bbox_list = []
+ self.far_neg_bbox_list = []
+ self.rcnn_eval_roi_dir = rcnn_eval_roi_dir
+ self.rcnn_eval_feature_dir = rcnn_eval_feature_dir
+ self.rcnn_training_roi_dir = rcnn_training_roi_dir
+ self.rcnn_training_feature_dir = rcnn_training_feature_dir
+
+ self.gt_database = None
+
+ if not self.random_select:
+ logger.warning('random select is False')
+
+ assert mode in ['TRAIN', 'EVAL', 'TEST'], 'Invalid mode: %s' % mode
+ self.mode = mode
+
+ if cfg.RPN.ENABLED:
+ if gt_database_dir is not None:
+ self.gt_database = pickle.load(open(gt_database_dir, 'rb'))
+
+ if cfg.GT_AUG_HARD_RATIO > 0:
+ easy_list, hard_list = [], []
+ for k in range(self.gt_database.__len__()):
+ obj = self.gt_database[k]
+ if obj['points'].shape[0] > 100:
+ easy_list.append(obj)
+ else:
+ hard_list.append(obj)
+ self.gt_database = [easy_list, hard_list]
+ logger.info('Loading gt_database(easy(pt_num>100): %d, hard(pt_num<=100): %d) from %s'
+ % (len(easy_list), len(hard_list), gt_database_dir))
+ else:
+ logger.info('Loading gt_database(%d) from %s' % (len(self.gt_database), gt_database_dir))
+
+ if mode == 'TRAIN':
+ self.preprocess_rpn_training_data()
+ else:
+ self.sample_id_list = [int(sample_id) for sample_id in self.image_idx_list]
+ logger.info('Load testing samples from %s' % self.imageset_dir)
+ logger.info('Done: total test samples %d' % len(self.sample_id_list))
+ elif cfg.RCNN.ENABLED:
+ for idx in range(0, self.num_sample):
+ sample_id = int(self.image_idx_list[idx])
+ obj_list = self.filtrate_objects(self.get_label(sample_id))
+ if len(obj_list) == 0:
+ # logger.info('No gt classes: %06d' % sample_id)
+ continue
+ self.sample_id_list.append(sample_id)
+
+ logger.info('Done: filter %s results for rcnn training: %d / %d\n' %
+ (self.mode, len(self.sample_id_list), len(self.image_idx_list)))
+
+ def preprocess_rpn_training_data(self):
+ """
+ Discard samples which don't have current classes, which will not be used for training.
+ Valid sample_id is stored in self.sample_id_list
+ """
+ logger.info('Loading %s samples from %s ...' % (self.mode, self.label_dir))
+ for idx in range(0, self.num_sample):
+ sample_id = int(self.image_idx_list[idx])
+ obj_list = self.filtrate_objects(self.get_label(sample_id))
+ if len(obj_list) == 0:
+ logger.debug('No gt classes: %06d' % sample_id)
+ continue
+ self.sample_id_list.append(sample_id)
+
+ logger.info('Done: filter %s results: %d / %d\n' % (self.mode, len(self.sample_id_list),
+ len(self.image_idx_list)))
+
+ def get_label(self, idx):
+ if idx < 10000:
+ label_file = os.path.join(self.label_dir, '%06d.txt' % idx)
+ else:
+ label_file = os.path.join(self.aug_label_dir, '%06d.txt' % idx)
+
+ assert os.path.exists(label_file)
+ return kitti_utils.get_objects_from_label(label_file)
+
+ def get_image(self, idx):
+ return super(KittiRCNNReader, self).get_image(idx % 10000)
+
+ def get_image_shape(self, idx):
+ return super(KittiRCNNReader, self).get_image_shape(idx % 10000)
+
+ def get_calib(self, idx):
+ return super(KittiRCNNReader, self).get_calib(idx % 10000)
+
+ def get_road_plane(self, idx):
+ return super(KittiRCNNReader, self).get_road_plane(idx % 10000)
+
+ @staticmethod
+ def get_rpn_features(rpn_feature_dir, idx):
+ rpn_feature_file = os.path.join(rpn_feature_dir, '%06d.npy' % idx)
+ rpn_xyz_file = os.path.join(rpn_feature_dir, '%06d_xyz.npy' % idx)
+ rpn_intensity_file = os.path.join(rpn_feature_dir, '%06d_intensity.npy' % idx)
+ if cfg.RCNN.USE_SEG_SCORE:
+ rpn_seg_file = os.path.join(rpn_feature_dir, '%06d_rawscore.npy' % idx)
+ rpn_seg_score = np.load(rpn_seg_file).reshape(-1)
+            # numpy sigmoid; the reference implementation used torch.sigmoid,
+            # but torch is neither imported nor needed here
+            rpn_seg_score = 1.0 / (1.0 + np.exp(-rpn_seg_score))
+ else:
+ rpn_seg_file = os.path.join(rpn_feature_dir, '%06d_seg.npy' % idx)
+ rpn_seg_score = np.load(rpn_seg_file).reshape(-1)
+ return np.load(rpn_xyz_file), np.load(rpn_feature_file), np.load(rpn_intensity_file).reshape(-1), rpn_seg_score
+
+ def filtrate_objects(self, obj_list):
+ """
+ Discard objects which are not in self.classes (or its similar classes)
+ :param obj_list: list
+ :return: list
+ """
+ type_whitelist = self.classes
+ if self.mode == 'TRAIN' and cfg.INCLUDE_SIMILAR_TYPE:
+ type_whitelist = list(self.classes)
+ if 'Car' in self.classes:
+ type_whitelist.append('Van')
+ if 'Pedestrian' in self.classes: # or 'Cyclist' in self.classes:
+ type_whitelist.append('Person_sitting')
+
+ valid_obj_list = []
+ for obj in obj_list:
+ if obj.cls_type not in type_whitelist: # rm Van, 20180928
+ continue
+ if self.mode == 'TRAIN' and cfg.PC_REDUCE_BY_RANGE and (self.check_pc_range(obj.pos) is False):
+ continue
+ valid_obj_list.append(obj)
+ return valid_obj_list
+
+ @staticmethod
+ def filtrate_dc_objects(obj_list):
+ valid_obj_list = []
+ for obj in obj_list:
+ if obj.cls_type in ['DontCare']:
+ continue
+ valid_obj_list.append(obj)
+
+ return valid_obj_list
+
+ @staticmethod
+ def check_pc_range(xyz):
+ """
+ :param xyz: [x, y, z]
+ :return:
+ """
+ x_range, y_range, z_range = cfg.PC_AREA_SCOPE
+ if (x_range[0] <= xyz[0] <= x_range[1]) and (y_range[0] <= xyz[1] <= y_range[1]) and \
+ (z_range[0] <= xyz[2] <= z_range[1]):
+ return True
+ return False
+
+ @staticmethod
+ def get_valid_flag(pts_rect, pts_img, pts_rect_depth, img_shape):
+ """
+ Valid point should be in the image (and in the PC_AREA_SCOPE)
+ :param pts_rect:
+ :param pts_img:
+ :param pts_rect_depth:
+ :param img_shape:
+ :return:
+ """
+ val_flag_1 = np.logical_and(pts_img[:, 0] >= 0, pts_img[:, 0] < img_shape[1])
+ val_flag_2 = np.logical_and(pts_img[:, 1] >= 0, pts_img[:, 1] < img_shape[0])
+ val_flag_merge = np.logical_and(val_flag_1, val_flag_2)
+ pts_valid_flag = np.logical_and(val_flag_merge, pts_rect_depth >= 0)
+
+ if cfg.PC_REDUCE_BY_RANGE:
+ x_range, y_range, z_range = cfg.PC_AREA_SCOPE
+ pts_x, pts_y, pts_z = pts_rect[:, 0], pts_rect[:, 1], pts_rect[:, 2]
+ range_flag = (pts_x >= x_range[0]) & (pts_x <= x_range[1]) \
+ & (pts_y >= y_range[0]) & (pts_y <= y_range[1]) \
+ & (pts_z >= z_range[0]) & (pts_z <= z_range[1])
+ pts_valid_flag = pts_valid_flag & range_flag
+ return pts_valid_flag
+
+ def get_rpn_sample(self, index):
+ sample_id = int(self.sample_id_list[index])
+ if sample_id < 10000:
+ calib = self.get_calib(sample_id)
+ # img = self.get_image(sample_id)
+ img_shape = self.get_image_shape(sample_id)
+ pts_lidar = self.get_lidar(sample_id)
+
+ # get valid point (projected points should be in image)
+ pts_rect = calib.lidar_to_rect(pts_lidar[:, 0:3])
+ pts_intensity = pts_lidar[:, 3]
+ else:
+ calib = self.get_calib(sample_id % 10000)
+ # img = self.get_image(sample_id % 10000)
+ img_shape = self.get_image_shape(sample_id % 10000)
+
+ pts_file = os.path.join(self.aug_pts_dir, '%06d.bin' % sample_id)
+ assert os.path.exists(pts_file), '%s' % pts_file
+ aug_pts = np.fromfile(pts_file, dtype=np.float32).reshape(-1, 4)
+ pts_rect, pts_intensity = aug_pts[:, 0:3], aug_pts[:, 3]
+
+ pts_img, pts_rect_depth = calib.rect_to_img(pts_rect)
+ pts_valid_flag = self.get_valid_flag(pts_rect, pts_img, pts_rect_depth, img_shape)
+
+ pts_rect = pts_rect[pts_valid_flag][:, 0:3]
+ pts_intensity = pts_intensity[pts_valid_flag]
+
+ if cfg.GT_AUG_ENABLED and self.mode == 'TRAIN':
+ # all labels for checking overlapping
+ all_gt_obj_list = self.filtrate_dc_objects(self.get_label(sample_id))
+ all_gt_boxes3d = kitti_utils.objs_to_boxes3d(all_gt_obj_list)
+
+ gt_aug_flag = False
+ if np.random.rand() < cfg.GT_AUG_APPLY_PROB:
+ # augment one scene
+ gt_aug_flag, pts_rect, pts_intensity, extra_gt_boxes3d, extra_gt_obj_list = \
+ self.apply_gt_aug_to_one_scene(sample_id, pts_rect, pts_intensity, all_gt_boxes3d)
+
+ # generate inputs
+ if self.mode == 'TRAIN' or self.random_select:
+ if self.npoints < len(pts_rect):
+ pts_depth = pts_rect[:, 2]
+ pts_near_flag = pts_depth < 40.0
+ far_idxs_choice = np.where(pts_near_flag == 0)[0]
+ near_idxs = np.where(pts_near_flag == 1)[0]
+ near_idxs_choice = np.random.choice(near_idxs, self.npoints - len(far_idxs_choice), replace=False)
+
+ choice = np.concatenate((near_idxs_choice, far_idxs_choice), axis=0) \
+ if len(far_idxs_choice) > 0 else near_idxs_choice
+ np.random.shuffle(choice)
+ else:
+ choice = np.arange(0, len(pts_rect), dtype=np.int32)
+ if self.npoints > len(pts_rect):
+ extra_choice = np.random.choice(choice, self.npoints - len(pts_rect), replace=False)
+ choice = np.concatenate((choice, extra_choice), axis=0)
+ np.random.shuffle(choice)
+
+ ret_pts_rect = pts_rect[choice, :]
+ ret_pts_intensity = pts_intensity[choice] - 0.5 # translate intensity to [-0.5, 0.5]
+ else:
+ ret_pts_rect = np.zeros((self.npoints, pts_rect.shape[1])).astype(pts_rect.dtype)
+ num_ = min(self.npoints, pts_rect.shape[0])
+ ret_pts_rect[:num_] = pts_rect[:num_]
+
+ ret_pts_intensity = pts_intensity - 0.5
+
+ pts_features = [ret_pts_intensity.reshape(-1, 1)]
+ ret_pts_features = np.concatenate(pts_features, axis=1) if pts_features.__len__() > 1 else pts_features[0]
+
+ sample_info = {'sample_id': sample_id, 'random_select': self.random_select}
+
+ if self.mode == 'TEST':
+ if cfg.RPN.USE_INTENSITY:
+ pts_input = np.concatenate((ret_pts_rect, ret_pts_features), axis=1) # (N, C)
+ else:
+ pts_input = ret_pts_rect
+ sample_info['pts_input'] = pts_input
+ sample_info['pts_rect'] = ret_pts_rect
+ sample_info['pts_features'] = ret_pts_features
+ return sample_info
+
+ gt_obj_list = self.filtrate_objects(self.get_label(sample_id))
+ if cfg.GT_AUG_ENABLED and self.mode == 'TRAIN' and gt_aug_flag:
+ gt_obj_list.extend(extra_gt_obj_list)
+ gt_boxes3d = kitti_utils.objs_to_boxes3d(gt_obj_list)
+
+ gt_alpha = np.zeros((gt_obj_list.__len__()), dtype=np.float32)
+ for k, obj in enumerate(gt_obj_list):
+ gt_alpha[k] = obj.alpha
+
+ # data augmentation
+ aug_pts_rect = ret_pts_rect.copy()
+ aug_gt_boxes3d = gt_boxes3d.copy()
+ if cfg.AUG_DATA and self.mode == 'TRAIN':
+ aug_pts_rect, aug_gt_boxes3d, aug_method = self.data_augmentation(aug_pts_rect, aug_gt_boxes3d, gt_alpha,
+ sample_id)
+ sample_info['aug_method'] = aug_method
+
+ # prepare input
+ if cfg.RPN.USE_INTENSITY:
+ pts_input = np.concatenate((aug_pts_rect, ret_pts_features), axis=1) # (N, C)
+ else:
+ pts_input = aug_pts_rect
+
+ if cfg.RPN.FIXED:
+ sample_info['pts_input'] = pts_input
+ sample_info['pts_rect'] = aug_pts_rect
+ sample_info['pts_features'] = ret_pts_features
+ sample_info['gt_boxes3d'] = aug_gt_boxes3d
+ return sample_info
+
+ if self.mode == 'EVAL' and aug_gt_boxes3d.shape[0] == 0:
+ aug_gt_boxes3d = np.zeros((1, aug_gt_boxes3d.shape[1]))
+
+ # generate training labels
+ rpn_cls_label, rpn_reg_label = self.generate_rpn_training_labels(aug_pts_rect, aug_gt_boxes3d)
+ sample_info['pts_input'] = pts_input
+ sample_info['pts_rect'] = aug_pts_rect
+ sample_info['pts_features'] = ret_pts_features
+ sample_info['rpn_cls_label'] = rpn_cls_label
+ sample_info['rpn_reg_label'] = rpn_reg_label
+ sample_info['gt_boxes3d'] = aug_gt_boxes3d
+ return sample_info
+
+ def apply_gt_aug_to_one_scene(self, sample_id, pts_rect, pts_intensity, all_gt_boxes3d):
+ """
+ :param pts_rect: (N, 3)
+ :param all_gt_boxex3d: (M2, 7)
+ :return:
+ """
+ assert self.gt_database is not None
+ # extra_gt_num = np.random.randint(10, 15)
+ # try_times = 50
+ if cfg.GT_AUG_RAND_NUM:
+ extra_gt_num = np.random.randint(10, cfg.GT_EXTRA_NUM)
+ else:
+ extra_gt_num = cfg.GT_EXTRA_NUM
+ try_times = 100
+ cnt = 0
+ cur_gt_boxes3d = all_gt_boxes3d.copy()
+ cur_gt_boxes3d[:, 4] += 0.5 # TODO: consider different objects
+ cur_gt_boxes3d[:, 5] += 0.5 # enlarge new added box to avoid too nearby boxes
+ cur_gt_corners = kitti_utils.boxes3d_to_corners3d(cur_gt_boxes3d)
+
+ extra_gt_obj_list = []
+ extra_gt_boxes3d_list = []
+ new_pts_list, new_pts_intensity_list = [], []
+ src_pts_flag = np.ones(pts_rect.shape[0], dtype=np.int32)
+
+ road_plane = self.get_road_plane(sample_id)
+ a, b, c, d = road_plane
+
+ while try_times > 0:
+ if cnt > extra_gt_num:
+ break
+
+ try_times -= 1
+ if cfg.GT_AUG_HARD_RATIO > 0:
+ p = np.random.rand()
+ if p > cfg.GT_AUG_HARD_RATIO:
+ # use easy sample
+ rand_idx = np.random.randint(0, len(self.gt_database[0]))
+ new_gt_dict = self.gt_database[0][rand_idx]
+ else:
+ # use hard sample
+ rand_idx = np.random.randint(0, len(self.gt_database[1]))
+ new_gt_dict = self.gt_database[1][rand_idx]
+ else:
+ rand_idx = np.random.randint(0, self.gt_database.__len__())
+ new_gt_dict = self.gt_database[rand_idx]
+
+ new_gt_box3d = new_gt_dict['gt_box3d'].copy()
+ new_gt_points = new_gt_dict['points'].copy()
+ new_gt_intensity = new_gt_dict['intensity'].copy()
+ new_gt_obj = new_gt_dict['obj']
+ center = new_gt_box3d[0:3]
+ if cfg.PC_REDUCE_BY_RANGE and (self.check_pc_range(center) is False):
+ continue
+
+ if new_gt_points.__len__() < 5: # too few points
+ continue
+
+ # put it on the road plane
+ cur_height = (-d - a * center[0] - c * center[2]) / b
+ move_height = new_gt_box3d[1] - cur_height
+ new_gt_box3d[1] -= move_height
+ new_gt_points[:, 1] -= move_height
+ new_gt_obj.pos[1] -= move_height
+
+ new_enlarged_box3d = new_gt_box3d.copy()
+ new_enlarged_box3d[4] += 0.5
+ new_enlarged_box3d[5] += 0.5 # enlarge new added box to avoid too nearby boxes
+
+ cnt += 1
+ new_corners = kitti_utils.boxes3d_to_corners3d(new_enlarged_box3d.reshape(1, 7))
+ iou3d = kitti_utils.get_iou3d(new_corners, cur_gt_corners)
+ valid_flag = iou3d.max() < 1e-8
+ if not valid_flag:
+ continue
+
+ enlarged_box3d = new_gt_box3d.copy()
+ enlarged_box3d[3] += 2 # remove the points above and below the object
+
+ boxes_pts_mask_list = pts_utils.pts_in_boxes3d(pts_rect,
+ enlarged_box3d.reshape(1, 7))
+ pt_mask_flag = (boxes_pts_mask_list[0] == 1)
+ src_pts_flag[pt_mask_flag] = 0 # remove the original points which are inside the new box
+
+ new_pts_list.append(new_gt_points)
+ new_pts_intensity_list.append(new_gt_intensity)
+ cur_gt_boxes3d = np.concatenate((cur_gt_boxes3d, new_enlarged_box3d.reshape(1, 7)), axis=0)
+ cur_gt_corners = np.concatenate((cur_gt_corners, new_corners), axis=0)
+ extra_gt_boxes3d_list.append(new_gt_box3d.reshape(1, 7))
+ extra_gt_obj_list.append(new_gt_obj)
+
+ if new_pts_list.__len__() == 0:
+ return False, pts_rect, pts_intensity, None, None
+
+ extra_gt_boxes3d = np.concatenate(extra_gt_boxes3d_list, axis=0)
+ # remove original points and add new points
+ pts_rect = pts_rect[src_pts_flag == 1]
+ pts_intensity = pts_intensity[src_pts_flag == 1]
+ new_pts_rect = np.concatenate(new_pts_list, axis=0)
+ new_pts_intensity = np.concatenate(new_pts_intensity_list, axis=0)
+ pts_rect = np.concatenate((pts_rect, new_pts_rect), axis=0)
+ pts_intensity = np.concatenate((pts_intensity, new_pts_intensity), axis=0)
+
+ return True, pts_rect, pts_intensity, extra_gt_boxes3d, extra_gt_obj_list
+
+ def rotate_box3d_along_y(self, box3d, rot_angle):
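+        # Rotating around the camera y-axis moves the box and changes its heading
+        # ry together; keep the viewpoint-relative observation angle alpha fixed
+        # and recompute ry from the new bearing beta = arctan2(z, x).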
+ old_x, old_z, ry = box3d[0], box3d[2], box3d[6]
+ old_beta = np.arctan2(old_z, old_x)
+ alpha = -np.sign(old_beta) * np.pi / 2 + old_beta + ry
+ box3d = kitti_utils.rotate_pc_along_y(box3d.reshape(1, 7), rot_angle=rot_angle)[0]
+ new_x, new_z = box3d[0], box3d[2]
+ new_beta = np.arctan2(new_z, new_x)
+ box3d[6] = np.sign(new_beta) * np.pi / 2 + alpha - new_beta
+ return box3d
+
+ def data_augmentation(self, aug_pts_rect, aug_gt_boxes3d, gt_alpha, sample_id=None, mustaug=False, stage=1):
+ """
+ :param aug_pts_rect: (N, 3)
+ :param aug_gt_boxes3d: (N, 7)
+ :param gt_alpha: (N)
+ :return:
+ """
+ aug_list = cfg.AUG_METHOD_LIST
+ aug_enable = 1 - np.random.rand(3)
+ if mustaug is True:
+ aug_enable[0] = -1
+ aug_enable[1] = -1
+ aug_method = []
+ if 'rotation' in aug_list and aug_enable[0] < cfg.AUG_METHOD_PROB[0]:
+ angle = np.random.uniform(-np.pi / cfg.AUG_ROT_RANGE, np.pi / cfg.AUG_ROT_RANGE)
+ aug_pts_rect = kitti_utils.rotate_pc_along_y(aug_pts_rect, rot_angle=angle)
+ if stage == 1:
+ # xyz change, hwl unchange
+ aug_gt_boxes3d = kitti_utils.rotate_pc_along_y(aug_gt_boxes3d, rot_angle=angle)
+
+ # calculate the ry after rotation
+ x, z = aug_gt_boxes3d[:, 0], aug_gt_boxes3d[:, 2]
+ beta = np.arctan2(z, x)
+ new_ry = np.sign(beta) * np.pi / 2 + gt_alpha - beta
+ aug_gt_boxes3d[:, 6] = new_ry # TODO: not in [-np.pi / 2, np.pi / 2]
+ elif stage == 2:
+ # for debug stage-2, this implementation has little float precision difference with the above one
+ assert aug_gt_boxes3d.shape[0] == 2
+ aug_gt_boxes3d[0] = self.rotate_box3d_along_y(aug_gt_boxes3d[0], angle)
+ aug_gt_boxes3d[1] = self.rotate_box3d_along_y(aug_gt_boxes3d[1], angle)
+ else:
+ raise NotImplementedError
+
+ aug_method.append(['rotation', angle])
+
+ if 'scaling' in aug_list and aug_enable[1] < cfg.AUG_METHOD_PROB[1]:
+ scale = np.random.uniform(0.95, 1.05)
+ aug_pts_rect = aug_pts_rect * scale
+ aug_gt_boxes3d[:, 0:6] = aug_gt_boxes3d[:, 0:6] * scale
+ aug_method.append(['scaling', scale])
+
+ if 'flip' in aug_list and aug_enable[2] < cfg.AUG_METHOD_PROB[2]:
+ # flip horizontal
+ aug_pts_rect[:, 0] = -aug_pts_rect[:, 0]
+ aug_gt_boxes3d[:, 0] = -aug_gt_boxes3d[:, 0]
+ # flip orientation: ry > 0: pi - ry, ry < 0: -pi - ry
+ if stage == 1:
+ aug_gt_boxes3d[:, 6] = np.sign(aug_gt_boxes3d[:, 6]) * np.pi - aug_gt_boxes3d[:, 6]
+ elif stage == 2:
+ assert aug_gt_boxes3d.shape[0] == 2
+ aug_gt_boxes3d[0, 6] = np.sign(aug_gt_boxes3d[0, 6]) * np.pi - aug_gt_boxes3d[0, 6]
+ aug_gt_boxes3d[1, 6] = np.sign(aug_gt_boxes3d[1, 6]) * np.pi - aug_gt_boxes3d[1, 6]
+ else:
+ raise NotImplementedError
+
+ aug_method.append('flip')
+
+ return aug_pts_rect, aug_gt_boxes3d, aug_method
+
+ @staticmethod
+ def generate_rpn_training_labels(pts_rect, gt_boxes3d):
+ cls_label = np.zeros((pts_rect.shape[0]), dtype=np.int32)
+ reg_label = np.zeros((pts_rect.shape[0], 7), dtype=np.float32) # dx, dy, dz, ry, h, w, l
+ gt_corners = kitti_utils.boxes3d_to_corners3d(gt_boxes3d, rotate=True)
+ extend_gt_boxes3d = kitti_utils.enlarge_box3d(gt_boxes3d, extra_width=0.2)
+ extend_gt_corners = kitti_utils.boxes3d_to_corners3d(extend_gt_boxes3d, rotate=True)
+ for k in range(gt_boxes3d.shape[0]):
+ box_corners = gt_corners[k]
+ fg_pt_flag = in_hull(pts_rect, box_corners)
+ fg_pts_rect = pts_rect[fg_pt_flag]
+ cls_label[fg_pt_flag] = 1
+
+ # enlarge the bbox3d, ignore nearby points
+ extend_box_corners = extend_gt_corners[k]
+ fg_enlarge_flag = in_hull(pts_rect, extend_box_corners)
+ ignore_flag = np.logical_xor(fg_pt_flag, fg_enlarge_flag)
+ cls_label[ignore_flag] = -1
+
+ # pixel offset of object center
+ center3d = gt_boxes3d[k][0:3].copy() # (x, y, z)
+ center3d[1] -= gt_boxes3d[k][3] / 2
+ reg_label[fg_pt_flag, 0:3] = center3d - fg_pts_rect # Now y is the true center of 3d box 20180928
+
+ # size and angle encoding
+ reg_label[fg_pt_flag, 3] = gt_boxes3d[k][3] # h
+ reg_label[fg_pt_flag, 4] = gt_boxes3d[k][4] # w
+ reg_label[fg_pt_flag, 5] = gt_boxes3d[k][5] # l
+ reg_label[fg_pt_flag, 6] = gt_boxes3d[k][6] # ry
+
+ return cls_label, reg_label
+
+ def get_rcnn_sample_jit(self, index):
+ sample_id = int(self.sample_id_list[index])
+ rpn_xyz, rpn_features, rpn_intensity, seg_mask = \
+ self.get_rpn_features(self.rcnn_training_feature_dir, sample_id)
+
+ # load rois and gt_boxes3d for this sample
+ roi_file = os.path.join(self.rcnn_training_roi_dir, '%06d.txt' % sample_id)
+ roi_obj_list = kitti_utils.get_objects_from_label(roi_file)
+ roi_boxes3d = kitti_utils.objs_to_boxes3d(roi_obj_list)
+ # roi_scores is not used currently
+ # roi_scores = kitti_utils.objs_to_scores(roi_obj_list)
+
+ gt_obj_list = self.filtrate_objects(self.get_label(sample_id))
+ gt_boxes3d = kitti_utils.objs_to_boxes3d(gt_obj_list)
+ sample_info = OrderedDict()
+ sample_info["sample_id"] = sample_id
+ sample_info['rpn_xyz'] = rpn_xyz
+ sample_info['rpn_features'] = rpn_features
+ sample_info['rpn_intensity'] = rpn_intensity
+ sample_info['seg_mask'] = seg_mask
+ sample_info['roi_boxes3d'] = roi_boxes3d
+ sample_info['pts_depth'] = np.linalg.norm(rpn_xyz, ord=2, axis=1)
+ sample_info['gt_boxes3d'] = gt_boxes3d
+
+ return sample_info
+
+ def sample_bg_inds(self, hard_bg_inds, easy_bg_inds, bg_rois_per_this_image):
+ if hard_bg_inds.size > 0 and easy_bg_inds.size > 0:
+ hard_bg_rois_num = int(bg_rois_per_this_image * cfg.RCNN.HARD_BG_RATIO)
+ easy_bg_rois_num = bg_rois_per_this_image - hard_bg_rois_num
+
+ # sampling hard bg
+ rand_num = np.floor(np.random.rand(hard_bg_rois_num) * hard_bg_inds.size).astype(np.int32)
+ hard_bg_inds = hard_bg_inds[rand_num]
+ # sampling easy bg
+ rand_num = np.floor(np.random.rand(easy_bg_rois_num) * easy_bg_inds.size).astype(np.int32)
+ easy_bg_inds = easy_bg_inds[rand_num]
+
+ bg_inds = np.concatenate([hard_bg_inds, easy_bg_inds], axis=0)
+ elif hard_bg_inds.size > 0 and easy_bg_inds.size == 0:
+ hard_bg_rois_num = bg_rois_per_this_image
+ # sampling hard bg
+ rand_num = np.floor(np.random.rand(hard_bg_rois_num) * hard_bg_inds.size).astype(np.int32)
+ bg_inds = hard_bg_inds[rand_num]
+ elif hard_bg_inds.size == 0 and easy_bg_inds.size > 0:
+ easy_bg_rois_num = bg_rois_per_this_image
+ # sampling easy bg
+ rand_num = np.floor(np.random.rand(easy_bg_rois_num) * easy_bg_inds.size).astype(np.int32)
+ bg_inds = easy_bg_inds[rand_num]
+ else:
+ raise NotImplementedError
+
+ return bg_inds
+
+ def aug_roi_by_noise_batch(self, roi_boxes3d, gt_boxes3d, aug_times=10):
+ """
+ :param roi_boxes3d: (N, 7)
+ :param gt_boxes3d: (N, 7)
+ :return:
+ """
+ iou_of_rois = np.zeros(roi_boxes3d.shape[0], dtype=np.float32)
+ for k in range(roi_boxes3d.__len__()):
+ temp_iou = cnt = 0
+ roi_box3d = roi_boxes3d[k]
+ gt_box3d = gt_boxes3d[k]
+ pos_thresh = min(cfg.RCNN.REG_FG_THRESH, cfg.RCNN.CLS_FG_THRESH)
+ gt_corners = kitti_utils.boxes3d_to_corners3d(gt_box3d.reshape(1, 7), True)
+ aug_box3d = roi_box3d
+ while temp_iou < pos_thresh and cnt < aug_times:
+ if np.random.rand() < 0.2:
+ aug_box3d = roi_box3d # p=0.2 to keep the original roi box
+ else:
+ aug_box3d = self.random_aug_box3d(roi_box3d)
+ aug_corners = kitti_utils.boxes3d_to_corners3d(aug_box3d.reshape(1, 7), True)
+ iou3d = kitti_utils.get_iou3d(aug_corners, gt_corners)
+ temp_iou = iou3d[0][0]
+ cnt += 1
+ roi_boxes3d[k] = aug_box3d
+ iou_of_rois[k] = temp_iou
+ return roi_boxes3d, iou_of_rois
+
+ @staticmethod
+ def canonical_transform_batch(pts_input, roi_boxes3d, gt_boxes3d):
+ """
+ :param pts_input: (N, npoints, 3 + C)
+ :param roi_boxes3d: (N, 7)
+ :param gt_boxes3d: (N, 7)
+ :return:
+ """
+ roi_ry = roi_boxes3d[:, 6] % (2 * np.pi) # 0 ~ 2pi
+ roi_center = roi_boxes3d[:, 0:3]
+ # shift to center
+ pts_input[:, :, [0, 1, 2]] = pts_input[:, :, [0, 1, 2]] - roi_center.reshape(-1, 1, 3)
+ gt_boxes3d_ct = np.copy(gt_boxes3d)
+ gt_boxes3d_ct[:, 0:3] = gt_boxes3d_ct[:, 0:3] - roi_center
+ # rotate to the direction of head
+ gt_boxes3d_ct = kitti_utils.rotate_pc_along_y_np(
+ gt_boxes3d_ct.reshape(-1, 1, 7),
+ roi_ry,
+ )
+ # TODO: check here
+ gt_boxes3d_ct = gt_boxes3d_ct.reshape(-1,7)
+ gt_boxes3d_ct[:, 6] = gt_boxes3d_ct[:, 6] - roi_ry
+ pts_input = kitti_utils.rotate_pc_along_y_np(
+ pts_input,
+ roi_ry
+ )
+ return pts_input, gt_boxes3d_ct
+
+ def get_rcnn_training_sample_batch(self, index):
+ sample_id = int(self.sample_id_list[index])
+ rpn_xyz, rpn_features, rpn_intensity, seg_mask = \
+ self.get_rpn_features(self.rcnn_training_feature_dir, sample_id)
+
+ # load rois and gt_boxes3d for this sample
+ roi_file = os.path.join(self.rcnn_training_roi_dir, '%06d.txt' % sample_id)
+ roi_obj_list = kitti_utils.get_objects_from_label(roi_file)
+ roi_boxes3d = kitti_utils.objs_to_boxes3d(roi_obj_list)
+ # roi_scores = kitti_utils.objs_to_scores(roi_obj_list)
+
+ gt_obj_list = self.filtrate_objects(self.get_label(sample_id))
+ gt_boxes3d = kitti_utils.objs_to_boxes3d(gt_obj_list)
+
+ # calculate original iou
+ iou3d = kitti_utils.get_iou3d(kitti_utils.boxes3d_to_corners3d(roi_boxes3d, True),
+ kitti_utils.boxes3d_to_corners3d(gt_boxes3d, True))
+ max_overlaps, gt_assignment = iou3d.max(axis=1), iou3d.argmax(axis=1)
+ max_iou_of_gt, roi_assignment = iou3d.max(axis=0), iou3d.argmax(axis=0)
+ roi_assignment = roi_assignment[max_iou_of_gt > 0].reshape(-1)
+
+ # sample fg, easy_bg, hard_bg
+ fg_rois_per_image = int(np.round(cfg.RCNN.FG_RATIO * cfg.RCNN.ROI_PER_IMAGE))
+ fg_thresh = min(cfg.RCNN.REG_FG_THRESH, cfg.RCNN.CLS_FG_THRESH)
+ fg_inds = np.nonzero(max_overlaps >= fg_thresh)[0]
+ fg_inds = np.concatenate((fg_inds, roi_assignment), axis=0) # consider the roi which has max_overlaps with gt as fg
+
+ easy_bg_inds = np.nonzero((max_overlaps < cfg.RCNN.CLS_BG_THRESH_LO))[0]
+ hard_bg_inds = np.nonzero((max_overlaps < cfg.RCNN.CLS_BG_THRESH) &
+ (max_overlaps >= cfg.RCNN.CLS_BG_THRESH_LO))[0]
+
+ fg_num_rois = fg_inds.size
+ bg_num_rois = hard_bg_inds.size + easy_bg_inds.size
+
+ if fg_num_rois > 0 and bg_num_rois > 0:
+ # sampling fg
+ fg_rois_per_this_image = min(fg_rois_per_image, fg_num_rois)
+ rand_num = np.random.permutation(fg_num_rois)
+ fg_inds = fg_inds[rand_num[:fg_rois_per_this_image]]
+
+ # sampling bg
+ bg_rois_per_this_image = cfg.RCNN.ROI_PER_IMAGE - fg_rois_per_this_image
+ bg_inds = self.sample_bg_inds(hard_bg_inds, easy_bg_inds, bg_rois_per_this_image)
+
+ elif fg_num_rois > 0 and bg_num_rois == 0:
+ # sampling fg
+            # cast to integer indices (the torch version used .long() here)
+            rand_num = np.floor(np.random.rand(cfg.RCNN.ROI_PER_IMAGE) * fg_num_rois).astype(np.int32)
+ fg_inds = fg_inds[rand_num]
+ fg_rois_per_this_image = cfg.RCNN.ROI_PER_IMAGE
+ bg_rois_per_this_image = 0
+ elif bg_num_rois > 0 and fg_num_rois == 0:
+ # sampling bg
+ bg_rois_per_this_image = cfg.RCNN.ROI_PER_IMAGE
+ bg_inds = self.sample_bg_inds(hard_bg_inds, easy_bg_inds, bg_rois_per_this_image)
+ fg_rois_per_this_image = 0
+ else:
+ import pdb
+ pdb.set_trace()
+ raise NotImplementedError
+
+ # augment the rois by noise
+ roi_list, roi_iou_list, roi_gt_list = [], [], []
+ if fg_rois_per_this_image > 0:
+ fg_rois_src = roi_boxes3d[fg_inds].copy()
+ gt_of_fg_rois = gt_boxes3d[gt_assignment[fg_inds]]
+ fg_rois, fg_iou3d = self.aug_roi_by_noise_batch(fg_rois_src, gt_of_fg_rois, aug_times=10)
+ roi_list.append(fg_rois)
+ roi_iou_list.append(fg_iou3d)
+ roi_gt_list.append(gt_of_fg_rois)
+
+ if bg_rois_per_this_image > 0:
+ bg_rois_src = roi_boxes3d[bg_inds].copy()
+ gt_of_bg_rois = gt_boxes3d[gt_assignment[bg_inds]]
+ bg_rois, bg_iou3d = self.aug_roi_by_noise_batch(bg_rois_src, gt_of_bg_rois, aug_times=1)
+ roi_list.append(bg_rois)
+ roi_iou_list.append(bg_iou3d)
+ roi_gt_list.append(gt_of_bg_rois)
+
+ rois = np.concatenate(roi_list, axis=0)
+ iou_of_rois = np.concatenate(roi_iou_list, axis=0)
+ gt_of_rois = np.concatenate(roi_gt_list, axis=0)
+
+ # collect extra features for point cloud pooling
+ if cfg.RCNN.USE_INTENSITY:
+ pts_extra_input_list = [rpn_intensity.reshape(-1, 1), seg_mask.reshape(-1, 1)]
+ else:
+ pts_extra_input_list = [seg_mask.reshape(-1, 1)]
+
+ if cfg.RCNN.USE_DEPTH:
+ pts_depth = (np.linalg.norm(rpn_xyz, ord=2, axis=1) / 70.0) - 0.5
+ pts_extra_input_list.append(pts_depth.reshape(-1, 1))
+ pts_extra_input = np.concatenate(pts_extra_input_list, axis=1)
+
+ # pts, pts_feature, boxes3d, pool_extra_width, sampled_pt_num
+ pts_input, pts_features, pts_empty_flag = roipool3d_utils.roipool3d_cpu(
+ rpn_xyz, rpn_features, rois, pts_extra_input,
+ cfg.RCNN.POOL_EXTRA_WIDTH,
+ sampled_pt_num=cfg.RCNN.NUM_POINTS,
+ #canonical_transform=False
+ )
+
+ # data augmentation
+ if cfg.AUG_DATA and self.mode == 'TRAIN':
+ for k in range(rois.__len__()):
+ aug_pts = pts_input[k, :, 0:3].copy()
+ aug_gt_box3d = gt_of_rois[k].copy()
+ aug_roi_box3d = rois[k].copy()
+
+ # calculate alpha by ry
+ temp_boxes3d = np.concatenate([aug_roi_box3d.reshape(1, 7), aug_gt_box3d.reshape(1, 7)], axis=0)
+ temp_x, temp_z, temp_ry = temp_boxes3d[:, 0], temp_boxes3d[:, 2], temp_boxes3d[:, 6]
+ temp_beta = np.arctan2(temp_z, temp_x).astype(np.float64)
+ temp_alpha = -np.sign(temp_beta) * np.pi / 2 + temp_beta + temp_ry
+
+ # data augmentation
+ aug_pts, aug_boxes3d, aug_method = self.data_augmentation(aug_pts, temp_boxes3d, temp_alpha,
+ mustaug=True, stage=2)
+
+ # assign to original data
+ pts_input[k, :, 0:3] = aug_pts
+ rois[k] = aug_boxes3d[0]
+ gt_of_rois[k] = aug_boxes3d[1]
+
+ valid_mask = (pts_empty_flag == 0).astype(np.int32)
+ # regression valid mask
+ reg_valid_mask = (iou_of_rois > cfg.RCNN.REG_FG_THRESH).astype(np.int32) & valid_mask
+
+ # classification label
+ cls_label = (iou_of_rois > cfg.RCNN.CLS_FG_THRESH).astype(np.int32)
+ invalid_mask = (iou_of_rois > cfg.RCNN.CLS_BG_THRESH) & (iou_of_rois < cfg.RCNN.CLS_FG_THRESH)
+ cls_label[invalid_mask] = -1
+ cls_label[valid_mask == 0] = -1
+
+ # canonical transform and sampling
+ pts_input_ct, gt_boxes3d_ct = self.canonical_transform_batch(pts_input, rois, gt_of_rois)
+
+ pts_input_ = np.concatenate((pts_input_ct, pts_features), axis=-1)
+ sample_info = OrderedDict()
+
+ sample_info['sample_id'] = sample_id
+ sample_info['pts_input'] = pts_input_
+ sample_info['pts_feature'] = pts_features
+ sample_info['roi_boxes3d'] = rois
+ sample_info['cls_label'] = cls_label
+ sample_info['reg_valid_mask'] = reg_valid_mask
+ sample_info['gt_boxes3d_ct'] = gt_boxes3d_ct
+ sample_info['gt_of_rois'] = gt_of_rois
+ return sample_info
+
+ @staticmethod
+ def random_aug_box3d(box3d):
+ """
+ :param box3d: (7) [x, y, z, h, w, l, ry]
+ random shift, scale, orientation
+ """
+ if cfg.RCNN.REG_AUG_METHOD == 'single':
+ pos_shift = (np.random.rand(3) - 0.5) # [-0.5 ~ 0.5]
+ hwl_scale = (np.random.rand(3) - 0.5) / (0.5 / 0.15) + 1.0 #
+ angle_rot = (np.random.rand(1) - 0.5) / (0.5 / (np.pi / 12)) # [-pi/12 ~ pi/12]
+
+ aug_box3d = np.concatenate([box3d[0:3] + pos_shift, box3d[3:6] * hwl_scale,
+ box3d[6:7] + angle_rot])
+ return aug_box3d
+ elif cfg.RCNN.REG_AUG_METHOD == 'multiple':
+ # pos_range, hwl_range, angle_range, mean_iou
+ range_config = [[0.2, 0.1, np.pi / 12, 0.7],
+ [0.3, 0.15, np.pi / 12, 0.6],
+ [0.5, 0.15, np.pi / 9, 0.5],
+ [0.8, 0.15, np.pi / 6, 0.3],
+ [1.0, 0.15, np.pi / 3, 0.2]]
+ idx = np.random.randint(len(range_config))
+
+ pos_shift = ((np.random.rand(3) - 0.5) / 0.5) * range_config[idx][0]
+ hwl_scale = ((np.random.rand(3) - 0.5) / 0.5) * range_config[idx][1] + 1.0
+ angle_rot = ((np.random.rand(1) - 0.5) / 0.5) * range_config[idx][2]
+
+ aug_box3d = np.concatenate([box3d[0:3] + pos_shift, box3d[3:6] * hwl_scale, box3d[6:7] + angle_rot])
+ return aug_box3d
+ elif cfg.RCNN.REG_AUG_METHOD == 'normal':
+ x_shift = np.random.normal(loc=0, scale=0.3)
+ y_shift = np.random.normal(loc=0, scale=0.2)
+ z_shift = np.random.normal(loc=0, scale=0.3)
+ h_shift = np.random.normal(loc=0, scale=0.25)
+ w_shift = np.random.normal(loc=0, scale=0.15)
+ l_shift = np.random.normal(loc=0, scale=0.5)
+ ry_shift = ((np.random.rand() - 0.5) / 0.5) * np.pi / 12
+
+ aug_box3d = np.array([box3d[0] + x_shift, box3d[1] + y_shift, box3d[2] + z_shift, box3d[3] + h_shift,
+ box3d[4] + w_shift, box3d[5] + l_shift, box3d[6] + ry_shift])
+ return aug_box3d
+ else:
+ raise NotImplementedError
+
+ def get_proposal_from_file(self, index):
+ sample_id = int(self.image_idx_list[index])
+ proposal_file = os.path.join(self.rcnn_eval_roi_dir, '%06d.txt' % sample_id)
+ roi_obj_list = kitti_utils.get_objects_from_label(proposal_file)
+
+ rpn_xyz, rpn_features, rpn_intensity, seg_mask = self.get_rpn_features(self.rcnn_eval_feature_dir, sample_id)
+ pts_rect, pts_rpn_features, pts_intensity = rpn_xyz, rpn_features, rpn_intensity
+
+ roi_box3d_list, roi_scores = [], []
+ for obj in roi_obj_list:
+ box3d = np.array([obj.pos[0], obj.pos[1], obj.pos[2], obj.h, obj.w, obj.l, obj.ry], dtype=np.float32)
+ roi_box3d_list.append(box3d.reshape(1, 7))
+ roi_scores.append(obj.score)
+
+ roi_boxes3d = np.concatenate(roi_box3d_list, axis=0) # (N, 7)
+ roi_scores = np.array(roi_scores, dtype=np.float32) # (N)
+
+ if cfg.RCNN.ROI_SAMPLE_JIT:
+ sample_dict = {'sample_id': sample_id,
+ 'rpn_xyz': rpn_xyz,
+ 'rpn_features': rpn_features,
+ 'seg_mask': seg_mask,
+ 'roi_boxes3d': roi_boxes3d,
+ 'roi_scores': roi_scores,
+ 'pts_depth': np.linalg.norm(rpn_xyz, ord=2, axis=1)}
+
+ if self.mode != 'TEST':
+ gt_obj_list = self.filtrate_objects(self.get_label(sample_id))
+ gt_boxes3d = kitti_utils.objs_to_boxes3d(gt_obj_list)
+
+ roi_corners = kitti_utils.boxes3d_to_corners3d(roi_boxes3d,True)
+ gt_corners = kitti_utils.boxes3d_to_corners3d(gt_boxes3d,True)
+ iou3d = kitti_utils.get_iou3d(roi_corners, gt_corners)
+ if gt_boxes3d.shape[0] > 0:
+ gt_iou = iou3d.max(axis=1)
+ else:
+ gt_iou = np.zeros(roi_boxes3d.shape[0]).astype(np.float32)
+
+ sample_dict['gt_boxes3d'] = gt_boxes3d
+ sample_dict['gt_iou'] = gt_iou
+ return sample_dict
+
+ if cfg.RCNN.USE_INTENSITY:
+ pts_extra_input_list = [pts_intensity.reshape(-1, 1), seg_mask.reshape(-1, 1)]
+ else:
+ pts_extra_input_list = [seg_mask.reshape(-1, 1)]
+
+ if cfg.RCNN.USE_DEPTH:
+ cur_depth = np.linalg.norm(pts_rect, axis=1, ord=2)
+ cur_depth_norm = (cur_depth / 70.0) - 0.5
+ pts_extra_input_list.append(cur_depth_norm.reshape(-1, 1))
+
+ pts_extra_input = np.concatenate(pts_extra_input_list, axis=1)
+ pts_input, pts_features, _ = roipool3d_utils.roipool3d_cpu(
+ pts_rect, pts_rpn_features, roi_boxes3d, pts_extra_input,
+ cfg.RCNN.POOL_EXTRA_WIDTH, sampled_pt_num=cfg.RCNN.NUM_POINTS,
+ canonical_transform=True
+ )
+ pts_input = np.concatenate((pts_input, pts_features), axis=-1)
+
+ sample_dict = OrderedDict()
+ sample_dict['sample_id'] = sample_id
+ sample_dict['pts_input'] = pts_input
+ sample_dict['pts_feature'] = pts_features
+ sample_dict['roi_boxes3d'] = roi_boxes3d
+ sample_dict['roi_scores'] = roi_scores
+ #sample_dict['roi_size'] = roi_boxes3d[:, 3:6]
+
+ if self.mode == 'TEST':
+ return sample_dict
+
+ gt_obj_list = self.filtrate_objects(self.get_label(sample_id))
+ gt_boxes3d = np.zeros((gt_obj_list.__len__(), 7), dtype=np.float32)
+
+ for k, obj in enumerate(gt_obj_list):
+ gt_boxes3d[k, 0:3], gt_boxes3d[k, 3], gt_boxes3d[k, 4], gt_boxes3d[k, 5], gt_boxes3d[k, 6] \
+ = obj.pos, obj.h, obj.w, obj.l, obj.ry
+
+ if gt_boxes3d.__len__() == 0:
+ gt_iou = np.zeros((roi_boxes3d.shape[0]), dtype=np.float32)
+ else:
+ roi_corners = kitti_utils.boxes3d_to_corners3d(roi_boxes3d,True)
+ gt_corners = kitti_utils.boxes3d_to_corners3d(gt_boxes3d,True)
+ iou3d = kitti_utils.get_iou3d(roi_corners, gt_corners)
+ gt_iou = iou3d.max(axis=1)
+
+ sample_dict['gt_iou'] = gt_iou
+ sample_dict['gt_boxes3d'] = gt_boxes3d
+
+ return sample_dict
+
+ def __len__(self):
+ if cfg.RPN.ENABLED:
+ return len(self.sample_id_list)
+ elif cfg.RCNN.ENABLED:
+ if self.mode == 'TRAIN':
+ return len(self.sample_id_list)
+ else:
+ return len(self.image_idx_list)
+ else:
+ raise NotImplementedError
+
+ def __getitem__(self, index):
+ if cfg.RPN.ENABLED:
+ return self.get_rpn_sample(index)
+ elif cfg.RCNN.ENABLED:
+ if self.mode == 'TRAIN':
+ if cfg.RCNN.ROI_SAMPLE_JIT:
+ return self.get_rcnn_sample_jit(index)
+ else:
+ return self.get_rcnn_training_sample_batch(index)
+ else:
+ return self.get_proposal_from_file(index)
+ else:
+ raise NotImplementedError
+
+ def padding_batch(self, batch_data, batch_size):
+ max_roi = 0
+ max_gt = 0
+
+ for k in range(batch_size):
+ # roi_boxes3d
+ max_roi = max(max_roi, batch_data[k][3].shape[0])
+ # gt_boxes3d
+ max_gt = max(max_gt, batch_data[k][-1].shape[0])
+ batch_roi_boxes3d = np.zeros((batch_size, max_roi, 7))
+ batch_gt_boxes3d = np.zeros((batch_size, max_gt, 7), dtype=np.float32)
+
+ for i, data in enumerate(batch_data):
+ roi_num = data[3].shape[0]
+ gt_num = data[-1].shape[0]
+ batch_roi_boxes3d[i,:roi_num,:] = data[3]
+ batch_gt_boxes3d[i,:gt_num,:] = data[-1]
+
+ new_batch = []
+ for i, data in enumerate(batch_data):
+ new_batch.append(data[:3])
+ # roi_boxes3d
+ new_batch[i].append(batch_roi_boxes3d[i])
+            # remaining fields (data[4:7]) are passed through unpadded
+ new_batch[i].extend(data[4:7])
+ # gt_boxes3d
+ new_batch[i].append(batch_gt_boxes3d[i])
+ return new_batch
+
+ def padding_batch_eval(self, batch_data, batch_size):
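+        # pads per-sample arrays to the batch max; the hard-coded 512/133/128
+        # shapes assume cfg.RCNN.NUM_POINTS == 512 with 133-dim pooled inputs
+        # and 128-dim RCNN features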
+ max_pts = 0
+ max_feats = 0
+ max_roi = 0
+ max_score = 0
+ max_iou = 0
+ max_gt = 0
+
+ for k in range(batch_size):
+ # pts_input
+ max_pts = max(max_pts, batch_data[k][1].shape[0])
+ # pts_feature
+ max_feats = max(max_feats, batch_data[k][2].shape[0])
+ # roi_boxes3d
+ max_roi = max(max_roi, batch_data[k][3].shape[0])
+ # gt_iou
+ max_iou = max(max_iou, batch_data[k][-2].shape[0])
+ # gt_boxes3d
+ max_gt = max(max_gt, batch_data[k][-1].shape[0])
+ batch_pts_input = np.zeros((batch_size, max_pts, 512, 133), dtype=np.float32)
+ batch_pts_feat = np.zeros((batch_size, max_feats, 512, 128), dtype=np.float32)
+ batch_roi_boxes3d = np.zeros((batch_size, max_roi, 7), dtype=np.float32)
+ batch_gt_iou = np.zeros((batch_size, max_iou), dtype=np.float32)
+ batch_gt_boxes3d = np.zeros((batch_size, max_gt, 7), dtype=np.float32)
+
+ for i, data in enumerate(batch_data):
+ # num
+ pts_num = data[1].shape[0]
+ pts_feat_num = data[2].shape[0]
+ roi_num = data[3].shape[0]
+ iou_num = data[-2].shape[0]
+ gt_num = data[-1].shape[0]
+ # data
+ batch_pts_input[i, :pts_num, :, :] = data[1]
+ batch_pts_feat[i, :pts_feat_num, :, :] = data[2]
+ batch_roi_boxes3d[i,:roi_num,:] = data[3]
+ batch_gt_iou[i,:iou_num] = data[-2]
+ batch_gt_boxes3d[i,:gt_num,:] = data[-1]
+
+ new_batch = []
+ for i, data in enumerate(batch_data):
+ new_batch.append(data[:1])
+ new_batch[i].append(batch_pts_input[i])
+ new_batch[i].append(batch_pts_feat[i])
+ new_batch[i].append(batch_roi_boxes3d[i])
+ new_batch[i].append(data[4])
+ new_batch[i].append(batch_gt_iou[i])
+ new_batch[i].append(batch_gt_boxes3d[i])
+ return new_batch
+
+ def get_reader(self, batch_size, fields, drop_last=False):
+ def reader():
+ batch_out = []
+ idxs = np.arange(self.__len__())
+ if self.mode == 'TRAIN':
+ np.random.shuffle(idxs)
+ for idx in idxs:
+ sample_all = self.__getitem__(idx)
+ sample = [sample_all[f] for f in fields]
+ if has_empty(sample):
+                    logger.info("sample with %d fields has an empty field" % len(sample))
+ continue
+ batch_out.append(sample)
+ if len(batch_out) >= batch_size:
+ if cfg.RPN.ENABLED:
+ yield batch_out
+ else:
+ if self.mode == 'TRAIN':
+ yield self.padding_batch(batch_out, batch_size)
+ elif self.mode == 'EVAL':
+                        # batch_size should be 1 in rcnn_offline eval currently;
+                        # if batch_size > 1, the batch should be padded as follows:
+                        # yield self.padding_batch_eval(batch_out, batch_size)
+                        yield batch_out
+                    else:
+                        logger.error("batch padding only supports TRAIN/EVAL modes")
+ batch_out = []
+ if not drop_last:
+ if len(batch_out) > 0:
+ yield batch_out
+ return reader
+
+ def get_multiprocess_reader(self, batch_size, fields, proc_num=8, max_queue_len=128, drop_last=False):
+ def read_to_queue(idxs, queue):
+ for idx in idxs:
+ sample_all = self.__getitem__(idx)
+ sample = [sample_all[f] for f in fields]
+ queue.put(sample)
+ queue.put(None)
+
+ def reader():
+            sample_num = self.__len__()
+            idxs = np.arange(sample_num)
+ if self.mode == 'TRAIN':
+ np.random.shuffle(idxs)
+
+ proc_idxs = []
+ proc_sample_num = int(sample_num / proc_num)
+ start_idx = 0
+ for i in range(proc_num - 1):
+ proc_idxs.append(idxs[start_idx:start_idx + proc_sample_num])
+ start_idx += proc_sample_num
+ proc_idxs.append(idxs[start_idx:])
+
+ queue = multiprocessing.Queue(max_queue_len)
+ p_list = []
+ for i in range(proc_num):
+ p_list.append(multiprocessing.Process(
+ target=read_to_queue, args=(proc_idxs[i], queue,)))
+ p_list[-1].start()
+
+ finish_num = 0
+ batch_out = []
+ while finish_num < len(p_list):
+ sample = queue.get()
+ if sample is None:
+ finish_num += 1
+ else:
+ batch_out.append(sample)
+ if len(batch_out) == batch_size:
+ yield batch_out
+ batch_out = []
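+            # note: unlike get_reader above, a trailing partial batch is
+            # dropped here regardless of `drop_last`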
+
+ # join process
+ for p in p_list:
+ if p.is_alive():
+ p.join()
+
+ return reader
+
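+# --- Usage sketch (illustrative only) ---
+# A minimal sketch of driving the batch reader in RPN training mode; the
+# constructor arguments mirror eval.py, and the field names follow the
+# sample dicts built above (split='train' is an assumption here):
+#
+#   kitti_reader = KittiRCNNReader(data_dir='./data',
+#                                  npoints=cfg.RPN.NUM_POINTS,
+#                                  split='train', mode='TRAIN',
+#                                  classes=cfg.CLASSES)
+#   train_reader = kitti_reader.get_reader(batch_size=4,
+#                                          fields=['pts_input', 'rpn_cls_label'])
+#   for batch in train_reader():
+#       pass  # each batch is a list of samples, each a list of `fields`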
diff --git a/PaddleCV/Paddle3D/PointRCNN/eval.py b/PaddleCV/Paddle3D/PointRCNN/eval.py
new file mode 100644
index 0000000000000000000000000000000000000000..7ee5d37f40bbee8a5486090b1ebda05f0d5928a8
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/eval.py
@@ -0,0 +1,343 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import os
+import sys
+import time
+import shutil
+import argparse
+import logging
+import multiprocessing
+import numpy as np
+from collections import OrderedDict
+import paddle
+import paddle.fluid as fluid
+
+from models.point_rcnn import PointRCNN
+from data.kitti_rcnn_reader import KittiRCNNReader
+from utils.run_utils import *
+from utils.config import cfg, load_config, set_config_from_list
+from utils.metric_utils import calc_iou_recall, rpn_metric, rcnn_metric
+
+logging.root.handlers = []
+FORMAT = '%(asctime)s-%(levelname)s: %(message)s'
+logging.basicConfig(level=logging.INFO, format=FORMAT, stream=sys.stdout)
+logger = logging.getLogger(__name__)
+
+np.random.seed(1024) # use same seed
+METRIC_PROC_NUM = 4
+
+
+def parse_args():
+ parser = argparse.ArgumentParser(
+        "PointRCNN evaluation script")
+ parser.add_argument(
+ '--cfg',
+ type=str,
+ default='cfgs/default.yml',
+        help='specify the config for evaluation')
+ parser.add_argument(
+ '--eval_mode',
+ type=str,
+ default='rpn',
+ required=True,
+        help='specify the evaluation mode: rpn, rcnn or rcnn_offline')
+ parser.add_argument(
+ '--batch_size',
+ type=int,
+ default=1,
+ help='evaluation batch size, default 1')
+ parser.add_argument(
+ '--ckpt_dir',
+ type=str,
+ default='checkpoints/199',
+ help='specify a ckpt directory to be evaluated if needed')
+ parser.add_argument(
+ '--data_dir',
+ type=str,
+ default='./data',
+ help='KITTI dataset root directory')
+ parser.add_argument(
+ '--output_dir',
+ type=str,
+ default='output',
+ help='output directory')
+ parser.add_argument(
+ '--save_rpn_feature',
+ action='store_true',
+ default=False,
+        help='save RPN features for separate RCNN training and evaluation')
+ parser.add_argument(
+ '--save_result',
+ action='store_true',
+ default=False,
+ help='save roi and refine result of evaluation')
+ parser.add_argument(
+ '--rcnn_eval_roi_dir',
+ type=str,
+ default=None,
+ help='specify the saved rois for rcnn evaluation when using rcnn_offline mode')
+ parser.add_argument(
+ '--rcnn_eval_feature_dir',
+ type=str,
+ default=None,
+ help='specify the saved features for rcnn evaluation when using rcnn_offline mode')
+ parser.add_argument(
+ '--log_interval',
+ type=int,
+ default=1,
+ help='mini-batch interval to log.')
+ parser.add_argument(
+ '--set',
+ dest='set_cfgs',
+ default=None,
+ nargs=argparse.REMAINDER,
+ help='set extra config keys if needed.')
+ args = parser.parse_args()
+ return args
+
+
+def eval():
+ args = parse_args()
+ print_arguments(args)
+ # check whether the installed paddle is compiled with GPU
+ # PointRCNN model can only run on GPU
+ check_gpu(True)
+
+ load_config(args.cfg)
+ if args.set_cfgs is not None:
+ set_config_from_list(args.set_cfgs)
+
+ if not os.path.isdir(args.output_dir):
+ os.makedirs(args.output_dir)
+
+ if args.eval_mode == 'rpn':
+ cfg.RPN.ENABLED = True
+ cfg.RCNN.ENABLED = False
+ elif args.eval_mode == 'rcnn':
+ cfg.RCNN.ENABLED = True
+ cfg.RPN.ENABLED = cfg.RPN.FIXED = True
+        assert args.batch_size == 1, "batch size must be 1 in rcnn evaluation"
+ elif args.eval_mode == 'rcnn_offline':
+ cfg.RCNN.ENABLED = True
+ cfg.RPN.ENABLED = False
+        assert args.batch_size == 1, "batch size must be 1 in rcnn_offline evaluation"
+ else:
+        raise NotImplementedError("unknown eval mode: {}".format(args.eval_mode))
+
+ place = fluid.CUDAPlace(0)
+ exe = fluid.Executor(place)
+
+ # build model
+ startup = fluid.Program()
+ eval_prog = fluid.Program()
+ with fluid.program_guard(eval_prog, startup):
+ with fluid.unique_name.guard():
+ eval_model = PointRCNN(cfg, args.batch_size, True, 'TEST')
+ eval_model.build()
+ eval_pyreader = eval_model.get_pyreader()
+ eval_feeds = eval_model.get_feeds()
+ eval_outputs = eval_model.get_outputs()
+ eval_prog = eval_prog.clone(True)
+
+ extra_keys = []
+ if args.eval_mode == 'rpn':
+ extra_keys.extend(['sample_id', 'rpn_cls_label', 'gt_boxes3d'])
+ if args.save_rpn_feature:
+ extra_keys.extend(['pts_rect', 'pts_features', 'pts_input',])
+ eval_keys, eval_values = parse_outputs(
+ eval_outputs, prog=eval_prog, extra_keys=extra_keys)
+
+ eval_compile_prog = fluid.compiler.CompiledProgram(
+ eval_prog).with_data_parallel()
+
+ exe.run(startup)
+
+ # load checkpoint
+ assert os.path.isdir(
+ args.ckpt_dir), "ckpt_dir {} not a directory".format(args.ckpt_dir)
+
+ def if_exist(var):
+ return os.path.exists(os.path.join(args.ckpt_dir, var.name))
+ fluid.io.load_vars(exe, args.ckpt_dir, eval_prog, predicate=if_exist)
+
+ kitti_feature_dir = os.path.join(args.output_dir, 'features')
+ kitti_output_dir = os.path.join(args.output_dir, 'detections', 'data')
+ seg_output_dir = os.path.join(args.output_dir, 'seg_result')
+ if args.save_rpn_feature:
+ if os.path.exists(kitti_feature_dir):
+ shutil.rmtree(kitti_feature_dir)
+ os.makedirs(kitti_feature_dir)
+ if os.path.exists(kitti_output_dir):
+ shutil.rmtree(kitti_output_dir)
+ os.makedirs(kitti_output_dir)
+ if os.path.exists(seg_output_dir):
+ shutil.rmtree(seg_output_dir)
+ os.makedirs(seg_output_dir)
+
+    # make sure these dirs exist
+ roi_output_dir = os.path.join('./result_dir', 'roi_result', 'data')
+ refine_output_dir = os.path.join('./result_dir', 'refine_result', 'data')
+ final_output_dir = os.path.join("./result_dir", 'final_result', 'data')
+ if not os.path.exists(final_output_dir):
+ os.makedirs(final_output_dir)
+ if args.save_result:
+ if not os.path.exists(roi_output_dir):
+ os.makedirs(roi_output_dir)
+ if not os.path.exists(refine_output_dir):
+ os.makedirs(refine_output_dir)
+
+ # get reader
+ kitti_rcnn_reader = KittiRCNNReader(data_dir=args.data_dir,
+ npoints=cfg.RPN.NUM_POINTS,
+ split=cfg.TEST.SPLIT,
+ mode='EVAL',
+ classes=cfg.CLASSES,
+ rcnn_eval_roi_dir=args.rcnn_eval_roi_dir,
+ rcnn_eval_feature_dir=args.rcnn_eval_feature_dir)
+ eval_reader = kitti_rcnn_reader.get_multiprocess_reader(args.batch_size, eval_feeds)
+ eval_pyreader.decorate_sample_list_generator(eval_reader, place)
+
+ thresh_list = [0.1, 0.3, 0.5, 0.7, 0.9]
+ queue = multiprocessing.Queue(128)
+ mgr = multiprocessing.Manager()
+ lock = multiprocessing.Lock()
+ mdict = mgr.dict()
+ if cfg.RPN.ENABLED:
+ mdict['exit_proc'] = 0
+ mdict['total_gt_bbox'] = 0
+ mdict['total_cnt'] = 0
+ mdict['total_rpn_iou'] = 0
+ for i in range(len(thresh_list)):
+ mdict['total_recalled_bbox_list_{}'.format(i)] = 0
+
+ p_list = []
+ for i in range(METRIC_PROC_NUM):
+ p_list.append(multiprocessing.Process(
+ target=rpn_metric,
+ args=(queue, mdict, lock, thresh_list, args.save_rpn_feature, kitti_feature_dir,
+ seg_output_dir, kitti_output_dir, kitti_rcnn_reader, cfg.CLASSES)))
+ p_list[-1].start()
+
+ if cfg.RCNN.ENABLED:
+ for i in range(len(thresh_list)):
+ mdict['total_recalled_bbox_list_{}'.format(i)] = 0
+ mdict['total_roi_recalled_bbox_list_{}'.format(i)] = 0
+ mdict['exit_proc'] = 0
+ mdict['total_cls_acc'] = 0
+ mdict['total_cls_acc_refined'] = 0
+ mdict['total_det_num'] = 0
+ mdict['total_gt_bbox'] = 0
+ p_list = []
+ for i in range(METRIC_PROC_NUM):
+ p_list.append(multiprocessing.Process(
+ target=rcnn_metric,
+ args=(queue, mdict, lock, thresh_list, kitti_rcnn_reader, roi_output_dir,
+ refine_output_dir, final_output_dir, args.save_result)
+ ))
+ p_list[-1].start()
+
+ try:
+ eval_pyreader.start()
+ eval_iter = 0
+ start_time = time.time()
+
+ cur_time = time.time()
+ while True:
+ eval_outs = exe.run(eval_compile_prog, fetch_list=eval_values, return_numpy=False)
+ rets_dict = {k: (np.array(v), v.recursive_sequence_lengths())
+ for k, v in zip(eval_keys, eval_outs)}
+ run_time = time.time() - cur_time
+ cur_time = time.time()
+ queue.put(rets_dict)
+ eval_iter += 1
+
+ logger.info("[EVAL] iter {}, time: {:.2f}".format(
+ eval_iter, run_time))
+
+ except fluid.core.EOFException:
+ # terminate metric process
+ for i in range(METRIC_PROC_NUM):
+ queue.put(None)
+ while mdict['exit_proc'] < METRIC_PROC_NUM:
+ time.sleep(1)
+ for p in p_list:
+ if p.is_alive():
+ p.join()
+
+ end_time = time.time()
+ logger.info("[EVAL] total {} iter finished, average time: {:.2f}".format(
+ eval_iter, (end_time - start_time) / float(eval_iter)))
+
+ if cfg.RPN.ENABLED:
+ avg_rpn_iou = mdict['total_rpn_iou'] / max(len(kitti_rcnn_reader), 1.)
+ logger.info("average rpn iou: {:.3f}".format(avg_rpn_iou))
+ total_gt_bbox = float(max(mdict['total_gt_bbox'], 1.0))
+ for idx, thresh in enumerate(thresh_list):
+ recall = mdict['total_recalled_bbox_list_{}'.format(idx)] / total_gt_bbox
+ logger.info("total bbox recall(thresh={:.3f}): {} / {} = {:.3f}".format(
+ thresh, mdict['total_recalled_bbox_list_{}'.format(idx)], mdict['total_gt_bbox'], recall))
+
+ if cfg.RCNN.ENABLED:
+ cnt = float(max(eval_iter, 1.0))
+ avg_cls_acc = mdict['total_cls_acc'] / cnt
+ avg_cls_acc_refined = mdict['total_cls_acc_refined'] / cnt
+ avg_det_num = mdict['total_det_num'] / cnt
+
+ logger.info("avg_cls_acc: {}".format(avg_cls_acc))
+ logger.info("avg_cls_acc_refined: {}".format(avg_cls_acc_refined))
+ logger.info("avg_det_num: {}".format(avg_det_num))
+
+ total_gt_bbox = float(max(mdict['total_gt_bbox'], 1.0))
+ for idx, thresh in enumerate(thresh_list):
+ cur_roi_recall = mdict['total_roi_recalled_bbox_list_{}'.format(idx)] / total_gt_bbox
+ logger.info('total roi bbox recall(thresh=%.3f): %d / %d = %f' % (
+ thresh, mdict['total_roi_recalled_bbox_list_{}'.format(idx)], total_gt_bbox, cur_roi_recall))
+
+ for idx, thresh in enumerate(thresh_list):
+ cur_recall = mdict['total_recalled_bbox_list_{}'.format(idx)] / total_gt_bbox
+ logger.info('total bbox recall(thresh=%.2f) %d / %.2f = %.4f' % (
+ thresh, mdict['total_recalled_bbox_list_{}'.format(idx)], total_gt_bbox, cur_recall))
+
+ split_file = os.path.join('./data/KITTI', 'ImageSets', 'val.txt')
+ image_idx_list = [x.strip() for x in open(split_file).readlines()]
+        for k in range(len(image_idx_list)):
+ cur_file = os.path.join(final_output_dir, '%s.txt' % image_idx_list[k])
+ if not os.path.exists(cur_file):
+ with open(cur_file, 'w') as temp_f:
+ pass
+
+        if sys.version_info >= (3, 6):
+ label_dir = os.path.join('./data/KITTI/object/training', 'label_2')
+ split_file = os.path.join('./data/KITTI', 'ImageSets', 'val.txt')
+ final_output_dir = os.path.join("./result_dir", 'final_result', 'data')
+ name_to_class = {'Car': 0, 'Pedestrian': 1, 'Cyclist': 2}
+
+ from tools.kitti_object_eval_python.evaluate import evaluate as kitti_evaluate
+ ap_result_str, ap_dict = kitti_evaluate(
+ label_dir, final_output_dir, label_split_file=split_file,
+ current_class=name_to_class["Car"])
+
+ logger.info("KITTI evaluate: {}, {}".format(ap_result_str, ap_dict))
+
+ else:
+            logger.info("KITTI mAP evaluation only supports Python >= 3.6; users can "
+                        "run 'python3 tools/kitti_eval.py' to evaluate KITTI mAP.")
+
+ finally:
+ eval_pyreader.reset()
+
+
+if __name__ == "__main__":
+ eval()
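+
+# Example invocations (paths and checkpoint ids are illustrative):
+#   python eval.py --cfg cfgs/default.yml --eval_mode rpn \
+#       --ckpt_dir checkpoints/199 --save_rpn_feature
+#   python eval.py --cfg cfgs/default.yml --eval_mode rcnn_offline --batch_size 1 \
+#       --rcnn_eval_roi_dir output/detections/data \
+#       --rcnn_eval_feature_dir output/features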
diff --git a/PaddleCV/Paddle3D/PointRCNN/ext_op b/PaddleCV/Paddle3D/PointRCNN/ext_op
new file mode 120000
index 0000000000000000000000000000000000000000..dca99c677c8fa26e7cbf3ce1d50a8e6af0621655
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/ext_op
@@ -0,0 +1 @@
+../PointNet++/ext_op
\ No newline at end of file
diff --git a/PaddleCV/Paddle3D/PointRCNN/images/teaser.png b/PaddleCV/Paddle3D/PointRCNN/images/teaser.png
new file mode 100644
index 0000000000000000000000000000000000000000..21ae7e98165074ef93dc34fc643b3fddc5fe6c36
Binary files /dev/null and b/PaddleCV/Paddle3D/PointRCNN/images/teaser.png differ
diff --git a/PaddleCV/Paddle3D/PointRCNN/models/__init__.py b/PaddleCV/Paddle3D/PointRCNN/models/__init__.py
new file mode 100644
index 0000000000000000000000000000000000000000..46a4f6ee220f10f50a182f4a2ed510b0551f64a8
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/models/__init__.py
@@ -0,0 +1,13 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
+#
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
diff --git a/PaddleCV/Paddle3D/PointRCNN/models/loss_utils.py b/PaddleCV/Paddle3D/PointRCNN/models/loss_utils.py
new file mode 100644
index 0000000000000000000000000000000000000000..04db2398d099b7edee10e72a11af710c0a509231
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/models/loss_utils.py
@@ -0,0 +1,201 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
+#
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
+
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
+import numpy as np
+
+import paddle.fluid as fluid
+from paddle.fluid.param_attr import ParamAttr
+from paddle.fluid.initializer import Constant
+
+__all__ = ["get_reg_loss"]
+
+
+def sigmoid_focal_loss(logits, labels, weights, gamma=2.0, alpha=0.25):
+ sce_loss = fluid.layers.sigmoid_cross_entropy_with_logits(logits, labels)
+ prob = fluid.layers.sigmoid(logits)
+ p_t = labels * prob + (1.0 - labels) * (1.0 - prob)
+ modulating_factor = fluid.layers.pow(1.0 - p_t, gamma)
+ alpha_weight_factor = labels * alpha + (1.0 - labels) * (1.0 - alpha)
+ return modulating_factor * alpha_weight_factor * sce_loss * weights
+
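+# A NumPy reference of the focal-loss formula above (illustrative; handy for
+# sanity-checking sigmoid_focal_loss on small arrays; the sce term is the
+# numerically stable sigmoid cross-entropy):
+#
+#   import numpy as np
+#   def sigmoid_focal_loss_np(logits, labels, weights, gamma=2.0, alpha=0.25):
+#       prob = 1.0 / (1.0 + np.exp(-logits))
+#       sce = np.maximum(logits, 0) - logits * labels + np.log1p(np.exp(-np.abs(logits)))
+#       p_t = labels * prob + (1.0 - labels) * (1.0 - prob)
+#       alpha_t = labels * alpha + (1.0 - labels) * (1.0 - alpha)
+#       return ((1.0 - p_t) ** gamma) * alpha_t * sce * weights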
+
+def get_reg_loss(pred_reg, reg_label, fg_mask, point_num, loc_scope,
+ loc_bin_size, num_head_bin, anchor_size,
+ get_xz_fine=True, get_y_by_bin=False, loc_y_scope=0.5,
+ loc_y_bin_size=0.25, get_ry_fine=False):
+
+ """
+ Bin-based 3D bounding boxes regression loss. See https://arxiv.org/abs/1812.04244 for more details.
+
+ :param pred_reg: (N, C)
+    :param reg_label: (N, 7) [dx, dy, dz, h, w, l, ry]
+    :param fg_mask: (N) mask of foreground points
+    :param point_num: total number of points, used to rescale the foreground loss
+ :param loc_scope: constant
+ :param loc_bin_size: constant
+ :param num_head_bin: constant
+ :param anchor_size: (N, 3) or (3)
+ :param get_xz_fine:
+ :param get_y_by_bin:
+ :param loc_y_scope:
+ :param loc_y_bin_size:
+ :param get_ry_fine:
+ :return:
+ """
+ fg_num = fluid.layers.cast(fluid.layers.reduce_sum(fg_mask), dtype=pred_reg.dtype)
+ fg_num = fluid.layers.clip(fg_num, min=1.0, max=point_num)
+ fg_scale = float(point_num) / fg_num
+
+ per_loc_bin_num = int(loc_scope / loc_bin_size) * 2
+ loc_y_bin_num = int(loc_y_scope / loc_y_bin_size) * 2
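+    # the x/z range [-loc_scope, loc_scope] is split into per_loc_bin_num bins
+    # per axis, turning localization into a per-bin classification plus an
+    # intra-bin residual regression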
+
+ reg_loss_dict = {}
+
+ # xz localization loss
+ x_offset_label, y_offset_label, z_offset_label = reg_label[:, 0:1], reg_label[:, 1:2], reg_label[:, 2:3]
+ x_shift = fluid.layers.clip(x_offset_label + loc_scope, 0., loc_scope * 2 - 1e-3)
+ z_shift = fluid.layers.clip(z_offset_label + loc_scope, 0., loc_scope * 2 - 1e-3)
+ x_bin_label = fluid.layers.cast(x_shift / loc_bin_size, dtype='int64')
+ z_bin_label = fluid.layers.cast(z_shift / loc_bin_size, dtype='int64')
+
+ x_bin_l, x_bin_r = 0, per_loc_bin_num
+ z_bin_l, z_bin_r = per_loc_bin_num, per_loc_bin_num * 2
+ start_offset = z_bin_r
+
+ loss_x_bin = fluid.layers.softmax_with_cross_entropy(pred_reg[:, x_bin_l: x_bin_r], x_bin_label)
+ loss_x_bin = fluid.layers.reduce_mean(loss_x_bin * fg_mask) * fg_scale
+ loss_z_bin = fluid.layers.softmax_with_cross_entropy(pred_reg[:, z_bin_l: z_bin_r], z_bin_label)
+ loss_z_bin = fluid.layers.reduce_mean(loss_z_bin * fg_mask) * fg_scale
+ reg_loss_dict['loss_x_bin'] = loss_x_bin
+ reg_loss_dict['loss_z_bin'] = loss_z_bin
+ loc_loss = loss_x_bin + loss_z_bin
+
+ if get_xz_fine:
+ x_res_l, x_res_r = per_loc_bin_num * 2, per_loc_bin_num * 3
+ z_res_l, z_res_r = per_loc_bin_num * 3, per_loc_bin_num * 4
+ start_offset = z_res_r
+
+ x_res_label = x_shift - (fluid.layers.cast(x_bin_label, dtype=x_shift.dtype) * loc_bin_size + loc_bin_size / 2.)
+ z_res_label = z_shift - (fluid.layers.cast(z_bin_label, dtype=z_shift.dtype) * loc_bin_size + loc_bin_size / 2.)
+ x_res_norm_label = x_res_label / loc_bin_size
+ z_res_norm_label = z_res_label / loc_bin_size
+
+ x_bin_onehot = fluid.layers.one_hot(x_bin_label, depth=per_loc_bin_num)
+ z_bin_onehot = fluid.layers.one_hot(z_bin_label, depth=per_loc_bin_num)
+
+ loss_x_res = fluid.layers.smooth_l1(fluid.layers.reduce_sum(pred_reg[:, x_res_l: x_res_r] * x_bin_onehot, dim=1, keep_dim=True), x_res_norm_label)
+ loss_x_res = fluid.layers.reduce_mean(loss_x_res * fg_mask) * fg_scale
+ loss_z_res = fluid.layers.smooth_l1(fluid.layers.reduce_sum(pred_reg[:, z_res_l: z_res_r] * z_bin_onehot, dim=1, keep_dim=True), z_res_norm_label)
+ loss_z_res = fluid.layers.reduce_mean(loss_z_res * fg_mask) * fg_scale
+ reg_loss_dict['loss_x_res'] = loss_x_res
+ reg_loss_dict['loss_z_res'] = loss_z_res
+ loc_loss += loss_x_res + loss_z_res
+
+ # y localization loss
+ if get_y_by_bin:
+ y_bin_l, y_bin_r = start_offset, start_offset + loc_y_bin_num
+ y_res_l, y_res_r = y_bin_r, y_bin_r + loc_y_bin_num
+ start_offset = y_res_r
+
+ y_shift = fluid.layers.clip(y_offset_label + loc_y_scope, 0., loc_y_scope * 2 - 1e-3)
+ y_bin_label = fluid.layers.cast(y_shift / loc_y_bin_size, dtype='int64')
+ y_res_label = y_shift - (fluid.layers.cast(y_bin_label, dtype=y_shift.dtype) * loc_y_bin_size + loc_y_bin_size / 2.)
+ y_res_norm_label = y_res_label / loc_y_bin_size
+
+        y_bin_onehot = fluid.layers.one_hot(y_bin_label, depth=loc_y_bin_num)
+
+        loss_y_bin = fluid.layers.softmax_with_cross_entropy(pred_reg[:, y_bin_l: y_bin_r], y_bin_label)
+ loss_y_bin = fluid.layers.reduce_mean(loss_y_bin * fg_mask) * fg_scale
+ loss_y_res = fluid.layers.smooth_l1(fluid.layers.reduce_sum(pred_reg[:, y_res_l: y_res_r] * y_bin_onehot, dim=1, keep_dim=True), y_res_norm_label)
+ loss_y_res = fluid.layers.reduce_mean(loss_y_res * fg_mask) * fg_scale
+
+ reg_loss_dict['loss_y_bin'] = loss_y_bin
+ reg_loss_dict['loss_y_res'] = loss_y_res
+
+ loc_loss += loss_y_bin + loss_y_res
+ else:
+ y_offset_l, y_offset_r = start_offset, start_offset + 1
+ start_offset = y_offset_r
+
+ loss_y_offset = fluid.layers.smooth_l1(fluid.layers.reduce_sum(pred_reg[:, y_offset_l: y_offset_r], dim=1, keep_dim=True), y_offset_label)
+ loss_y_offset = fluid.layers.reduce_mean(loss_y_offset * fg_mask) * fg_scale
+ reg_loss_dict['loss_y_offset'] = loss_y_offset
+ loc_loss += loss_y_offset
+
+ # angle loss
+ ry_bin_l, ry_bin_r = start_offset, start_offset + num_head_bin
+ ry_res_l, ry_res_r = ry_bin_r, ry_bin_r + num_head_bin
+
+ ry_label = reg_label[:, 6:7]
+
+ if get_ry_fine:
+ # divide pi/2 into several bins
+ angle_per_class = (np.pi / 2) / num_head_bin
+
+ ry_label = ry_label % (2 * np.pi) # 0 ~ 2pi
+ opposite_flag = fluid.layers.logical_and(ry_label > np.pi * 0.5, ry_label < np.pi * 1.5)
+ opposite_flag = fluid.layers.cast(opposite_flag, dtype=ry_label.dtype)
+ shift_angle = (ry_label + opposite_flag * np.pi + np.pi * 0.5) % (2 * np.pi) # (0 ~ pi)
+ shift_angle.stop_gradient = True
+
+ shift_angle = fluid.layers.clip(shift_angle - np.pi * 0.25, min=1e-3, max=np.pi * 0.5 - 1e-3) # (0, pi/2)
+
+ # bin center is (5, 10, 15, ..., 85)
+ ry_bin_label = fluid.layers.cast(shift_angle / angle_per_class, dtype='int64')
+ ry_res_label = shift_angle - (fluid.layers.cast(ry_bin_label, dtype=shift_angle.dtype) * angle_per_class + angle_per_class / 2)
+ ry_res_norm_label = ry_res_label / (angle_per_class / 2)
+
+ else:
+ # divide 2pi into several bins
+ angle_per_class = (2 * np.pi) / num_head_bin
+ heading_angle = ry_label % (2 * np.pi) # 0 ~ 2pi
+
+ shift_angle = (heading_angle + angle_per_class / 2) % (2 * np.pi)
+ shift_angle.stop_gradient = True
+ ry_bin_label = fluid.layers.cast(shift_angle / angle_per_class, dtype='int64')
+ ry_res_label = shift_angle - (fluid.layers.cast(ry_bin_label, dtype=shift_angle.dtype) * angle_per_class + angle_per_class / 2)
+ ry_res_norm_label = ry_res_label / (angle_per_class / 2)
+
+ ry_bin_onehot = fluid.layers.one_hot(ry_bin_label, depth=num_head_bin)
+ loss_ry_bin = fluid.layers.softmax_with_cross_entropy(pred_reg[:, ry_bin_l:ry_bin_r], ry_bin_label)
+ loss_ry_bin = fluid.layers.reduce_mean(loss_ry_bin * fg_mask) * fg_scale
+ loss_ry_res = fluid.layers.smooth_l1(fluid.layers.reduce_sum(pred_reg[:, ry_res_l: ry_res_r] * ry_bin_onehot, dim=1, keep_dim=True), ry_res_norm_label)
+ loss_ry_res = fluid.layers.reduce_mean(loss_ry_res * fg_mask) * fg_scale
+
+ reg_loss_dict['loss_ry_bin'] = loss_ry_bin
+ reg_loss_dict['loss_ry_res'] = loss_ry_res
+ angle_loss = loss_ry_bin + loss_ry_res
+
+ # size loss
+ size_res_l, size_res_r = ry_res_r, ry_res_r + 3
+ assert pred_reg.shape[1] == size_res_r, '%d vs %d' % (pred_reg.shape[1], size_res_r)
+
+ anchor_size_var = fluid.layers.zeros(shape=[3], dtype=reg_label.dtype)
+ fluid.layers.assign(np.array(anchor_size).astype('float32'), anchor_size_var)
+ size_res_norm_label = (reg_label[:, 3:6] - anchor_size_var) / anchor_size_var
+ size_res_norm_label = fluid.layers.reshape(size_res_norm_label, shape=[-1, 1], inplace=True)
+ size_res_norm = pred_reg[:, size_res_l:size_res_r]
+ size_res_norm = fluid.layers.reshape(size_res_norm, shape=[-1, 1], inplace=True)
+ size_loss = fluid.layers.smooth_l1(size_res_norm, size_res_norm_label)
+ size_loss = fluid.layers.reduce_mean(fluid.layers.reshape(size_loss, [-1, 3]) * fg_mask) * fg_scale
+
+ # Total regression loss
+ reg_loss_dict['loss_loc'] = loc_loss
+ reg_loss_dict['loss_angle'] = angle_loss
+ reg_loss_dict['loss_size'] = size_loss
+
+ return loc_loss, angle_loss, size_loss, reg_loss_dict
+
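+# --- Bin/residual encoding sketch (illustrative, not used by the model) ---
+# The x-axis target encoding above, reproduced in NumPy with assumed values
+# loc_scope=1.5 and loc_bin_size=0.5 (so per_loc_bin_num = 6):
+#
+#   import numpy as np
+#   loc_scope, loc_bin_size = 1.5, 0.5
+#   x_offset = np.array([-1.2, 0.3, 1.4])
+#   x_shift = np.clip(x_offset + loc_scope, 0., loc_scope * 2 - 1e-3)
+#   x_bin = (x_shift / loc_bin_size).astype('int64')   # -> [0, 3, 5]
+#   x_res = x_shift - (x_bin * loc_bin_size + loc_bin_size / 2.)
+#   x_res_norm = x_res / loc_bin_size                  # intra-bin residual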
diff --git a/PaddleCV/Paddle3D/PointRCNN/models/point_rcnn.py b/PaddleCV/Paddle3D/PointRCNN/models/point_rcnn.py
new file mode 100644
index 0000000000000000000000000000000000000000..890ef897405722f9cc1ba1d129bea2c80fce17a1
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/models/point_rcnn.py
@@ -0,0 +1,125 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
+#
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
+import numpy as np
+from collections import OrderedDict
+
+import paddle.fluid as fluid
+from paddle.fluid.param_attr import ParamAttr
+from paddle.fluid.initializer import Constant
+
+from models.rpn import RPN
+from models.rcnn import RCNN
+
+
+__all__ = ["PointRCNN"]
+
+
+class PointRCNN(object):
+ def __init__(self, cfg, batch_size, use_xyz=True, mode='TRAIN', prog=None):
+ self.cfg = cfg
+ self.batch_size = batch_size
+ self.use_xyz = use_xyz
+ self.mode = mode
+ self.is_train = mode == 'TRAIN'
+ self.num_points = self.cfg.RPN.NUM_POINTS
+ self.prog = prog
+ self.inputs = None
+ self.pyreader = None
+
+ def build_inputs(self):
+ self.inputs = OrderedDict()
+
+ if self.cfg.RPN.ENABLED:
+ self.inputs['sample_id'] = fluid.layers.data(name='sample_id', shape=[1], dtype='int32')
+ self.inputs['pts_input'] = fluid.layers.data(name='pts_input', shape=[self.num_points, 3], dtype='float32')
+ self.inputs['pts_rect'] = fluid.layers.data(name='pts_rect', shape=[self.num_points, 3], dtype='float32')
+ self.inputs['pts_features'] = fluid.layers.data(name='pts_features', shape=[self.num_points, 1], dtype='float32')
+ self.inputs['rpn_cls_label'] = fluid.layers.data(name='rpn_cls_label', shape=[self.num_points], dtype='int32')
+ self.inputs['rpn_reg_label'] = fluid.layers.data(name='rpn_reg_label', shape=[self.num_points, 7], dtype='float32')
+ self.inputs['gt_boxes3d'] = fluid.layers.data(name='gt_boxes3d', shape=[7], lod_level=1, dtype='float32')
+
+ if self.cfg.RCNN.ENABLED:
+ if self.cfg.RCNN.ROI_SAMPLE_JIT:
+ self.inputs['sample_id'] = fluid.layers.data(name='sample_id', shape=[1], dtype='int32', append_batch_size=False)
+ self.inputs['rpn_xyz'] = fluid.layers.data(name='rpn_xyz', shape=[self.num_points, 3], dtype='float32', append_batch_size=False)
+ self.inputs['rpn_features'] = fluid.layers.data(name='rpn_features', shape=[self.num_points,128], dtype='float32', append_batch_size=False)
+ self.inputs['rpn_intensity'] = fluid.layers.data(name='rpn_intensity', shape=[self.num_points], dtype='float32', append_batch_size=False)
+ self.inputs['seg_mask'] = fluid.layers.data(name='seg_mask', shape=[self.num_points], dtype='float32', append_batch_size=False)
+ self.inputs['roi_boxes3d'] = fluid.layers.data(name='roi_boxes3d', shape=[-1, -1, 7], dtype='float32', append_batch_size=False, lod_level=0)
+ self.inputs['pts_depth'] = fluid.layers.data(name='pts_depth', shape=[self.num_points], dtype='float32', append_batch_size=False)
+ self.inputs['gt_boxes3d'] = fluid.layers.data(name='gt_boxes3d', shape=[-1, -1, 7], dtype='float32', append_batch_size=False, lod_level=0)
+ else:
+ self.inputs['sample_id'] = fluid.layers.data(name='sample_id', shape=[-1], dtype='int32', append_batch_size=False)
+ self.inputs['pts_input'] = fluid.layers.data(name='pts_input', shape=[-1,512,133], dtype='float32', append_batch_size=False)
+ self.inputs['pts_feature'] = fluid.layers.data(name='pts_feature', shape=[-1,512,128], dtype='float32', append_batch_size=False)
+ self.inputs['roi_boxes3d'] = fluid.layers.data(name='roi_boxes3d', shape=[-1,7], dtype='float32', append_batch_size=False)
+ if self.is_train:
+ self.inputs['cls_label'] = fluid.layers.data(name='cls_label', shape=[-1], dtype='float32', append_batch_size=False)
+ self.inputs['reg_valid_mask'] = fluid.layers.data(name='reg_valid_mask', shape=[-1], dtype='float32', append_batch_size=False)
+ self.inputs['gt_boxes3d_ct'] = fluid.layers.data(name='gt_boxes3d_ct', shape=[-1,7], dtype='float32', append_batch_size=False)
+ self.inputs['gt_of_rois'] = fluid.layers.data(name='gt_of_rois', shape=[-1,7], dtype='float32', append_batch_size=False)
+ else:
+ self.inputs['roi_scores'] = fluid.layers.data(name='roi_scores', shape=[-1,], dtype='float32', append_batch_size=False)
+ self.inputs['gt_iou'] = fluid.layers.data(name='gt_iou', shape=[-1], dtype='float32', append_batch_size=False)
+ self.inputs['gt_boxes3d'] = fluid.layers.data(name='gt_boxes3d', shape=[-1,-1,7], dtype='float32', append_batch_size=False, lod_level=0)
+
+
+ self.pyreader = fluid.io.PyReader(
+ feed_list=list(self.inputs.values()),
+ capacity=64,
+ use_double_buffer=True,
+ iterable=False)
+
+ def build(self):
+ self.build_inputs()
+ if self.cfg.RPN.ENABLED:
+ self.rpn = RPN(self.cfg, self.batch_size, self.use_xyz,
+ self.mode, self.prog)
+ self.rpn.build(self.inputs)
+ self.rpn_outputs = self.rpn.get_outputs()
+ self.outputs = self.rpn_outputs
+
+ if self.cfg.RCNN.ENABLED:
+ self.rcnn = RCNN(self.cfg, 1, self.batch_size, self.mode)
+ self.rcnn.build_model(self.inputs)
+ self.outputs = self.rcnn.get_outputs()
+
+ if self.mode == 'TRAIN':
+ if self.cfg.RPN.ENABLED:
+ self.outputs['rpn_loss'], self.outputs['rpn_loss_cls'], \
+ self.outputs['rpn_loss_reg'] = self.rpn.get_loss()
+ if self.cfg.RCNN.ENABLED:
+ self.outputs['rcnn_loss'], self.outputs['rcnn_loss_cls'], \
+ self.outputs['rcnn_loss_reg'] = self.rcnn.get_loss()
+ self.outputs['loss'] = self.outputs.get('rpn_loss', 0.) \
+ + self.outputs.get('rcnn_loss', 0.)
+
+ def get_feeds(self):
+ return list(self.inputs.keys())
+
+ def get_outputs(self):
+ return self.outputs
+
+ def get_loss(self):
+ rpn_loss, _, _ = self.rpn.get_loss()
+ rcnn_loss, _, _ = self.rcnn.get_loss()
+ return rpn_loss + rcnn_loss
+
+ def get_pyreader(self):
+ return self.pyreader
+
diff --git a/PaddleCV/Paddle3D/PointRCNN/models/pointnet2_modules.py b/PaddleCV/Paddle3D/PointRCNN/models/pointnet2_modules.py
new file mode 100644
index 0000000000000000000000000000000000000000..43942fcf8110869dd066dfabe7716db055af05b6
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/models/pointnet2_modules.py
@@ -0,0 +1,197 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
+#
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
+"""
+Contains PointNet++ utility functions.
+"""
+
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
+import numpy as np
+
+import paddle.fluid as fluid
+from paddle.fluid.param_attr import ParamAttr
+from paddle.fluid.initializer import Constant
+from ext_op import *
+
+__all__ = ["conv_bn", "pointnet_sa_module", "pointnet_fp_module", "MLP"]
+
+
+def query_and_group(xyz, new_xyz, radius, nsample, features=None, use_xyz=True):
+ """
+ Perform query_ball and group_points
+
+ Args:
+        xyz (Variable): xyz coordinates features with shape [B, N, 3]
+        new_xyz (Variable): centroid features with shape [B, npoint, 3]
+        radius (float32): radius of ball
+        nsample (int32): maximum number of gather features
+        features (Variable): features with shape [B, N, C]
+        use_xyz (bool): whether to use xyz coordinate features
+
+ Returns:
+        out (Variable): features with shape [B, C + 3, npoint, nsample]
+ """
+ idx = query_ball(xyz, new_xyz, radius, nsample)
+ idx.stop_gradient = True
+ xyz = fluid.layers.transpose(xyz,perm=[0, 2, 1])
+ grouped_xyz = group_points(xyz, idx)
+ expand_new_xyz = fluid.layers.unsqueeze(fluid.layers.transpose(new_xyz, perm=[0, 2, 1]), axes=[-1])
+ expand_new_xyz = fluid.layers.expand(expand_new_xyz, [1, 1, 1, grouped_xyz.shape[3]])
+ grouped_xyz -= expand_new_xyz
+
+ if features is not None:
+ grouped_features = group_points(features, idx)
+ return fluid.layers.concat([grouped_xyz, grouped_features], axis=1) \
+ if use_xyz else grouped_features
+ else:
+ assert use_xyz, "use_xyz should be True when features is None"
+ return grouped_xyz
+
+
+def group_all(xyz, features=None, use_xyz=True):
+ """
+ Group all xyz and features when npoint is None
+ See query_and_group
+ """
+ xyz = fluid.layers.transpose(xyz,perm=[0, 2, 1])
+ grouped_xyz = fluid.layers.unsqueeze(xyz, axes=[2])
+ if features is not None:
+ grouped_features = fluid.layers.unsqueeze(features, axes=[2])
+ return fluid.layers.concat([grouped_xyz, grouped_features], axis=1) if use_xyz else grouped_features
+ else:
+ return grouped_xyz
+
+
+def conv_bn(input, out_channels, bn=True, bn_momentum=0.95, act='relu', name=None):
+ param_attr = ParamAttr(name='{}_conv_weight'.format(name),)
+ bias_attr = ParamAttr(name='{}_conv_bias'.format(name)) \
+ if not bn else False
+ out = fluid.layers.conv2d(input,
+ num_filters=out_channels,
+ filter_size=1,
+ stride=1,
+ padding=0,
+ dilation=1,
+ param_attr=param_attr,
+ bias_attr=bias_attr,
+ act=act if not bn else None)
+ if bn:
+ bn_name = name + "_bn"
+ out = fluid.layers.batch_norm(out,
+ act=act,
+ momentum=bn_momentum,
+ param_attr=ParamAttr(name=bn_name + "_scale"),
+ bias_attr=ParamAttr(name=bn_name + "_offset"),
+ moving_mean_name=bn_name + '_mean',
+ moving_variance_name=bn_name + '_var')
+
+ return out
+
+
+def MLP(features, out_channels_list, bn=True, bn_momentum=0.95, act='relu', name=None):
+ out = features
+ for i, out_channels in enumerate(out_channels_list):
+ out = conv_bn(out, out_channels, bn=bn, act=act, bn_momentum=bn_momentum, name=name + "_{}".format(i))
+ return out
+
+
+def pointnet_sa_module(xyz,
+ npoint=None,
+ radiuss=[],
+ nsamples=[],
+ mlps=[],
+ feature=None,
+ bn=True,
+ bn_momentum=0.95,
+ use_xyz=True,
+ name=None):
+ """
+    PointNet++ MSG (Multi-Scale Grouping) Set Abstraction Module.
+    Call with radiuss, nsamples, and mlps as single-element lists for
+    SSG (Single-Scale Grouping).
+
+ Args:
+        xyz (Variable): xyz coordinates features with shape [B, N, 3]
+        radiuss ([float32]): list of ball radii
+        nsamples ([int32]): list of maximum numbers of gathered features
+        mlps ([[int32]]): list of out_channels_list
+        feature (Variable): features with shape [B, C, N]
+        bn (bool): whether to perform batch norm after conv2d
+        bn_momentum (float): momentum of batch norm
+        use_xyz (bool): whether to use xyz coordinate features
+
+    Returns:
+        new_xyz (Variable): centroid features with shape [B, npoint, 3]
+        out (Variable): features with shape [B, \sum_i{mlps[i][-1]}, npoint]
+ """
+    assert len(radiuss) == len(nsamples) == len(mlps), \
+        "radiuss, nsamples and mlps must have the same length"
+
+ farthest_idx = farthest_point_sampling(xyz, npoint)
+ farthest_idx.stop_gradient = True
+ new_xyz = gather_point(xyz, farthest_idx) if npoint is not None else None
+
+ outs = []
+ for i, (radius, nsample, mlp) in enumerate(zip(radiuss, nsamples, mlps)):
+ out = query_and_group(xyz, new_xyz, radius, nsample, feature, use_xyz) if npoint is not None else group_all(xyz, feature, use_xyz)
+ out = MLP(out, mlp, bn=bn, bn_momentum=bn_momentum, name=name + '_mlp{}'.format(i))
+ out = fluid.layers.pool2d(out, pool_size=[1, out.shape[3]], pool_type='max')
+ out = fluid.layers.squeeze(out, axes=[-1])
+ outs.append(out)
+ out = fluid.layers.concat(outs, axis=1)
+
+ return (new_xyz, out)
+
+
+def pointnet_fp_module(unknown, known, unknown_feats, known_feats, mlp, bn=True, bn_momentum=0.95, name=None):
+ """
+ PointNet Feature Propagation Module
+
+ Args:
+        unknown (Variable): unknown xyz coordinate features with shape [B, N, 3]
+        known (Variable): known xyz coordinate features with shape [B, M, 3]
+ unknown_feats (Variable): unknown features with shape [B, N, C1] to be propagated to
+ known_feats (Variable): known features with shape [B, M, C2] to be propagated from
+ mlp ([int32]): out_channels_list
+ bn (bool): whether perform batch norm after conv2d
+
+ Returns:
+ new_features (Variable): new features with shape [B, N, mlp[-1]]
+ """
+ if known is None:
+        raise NotImplementedError("known as None is not implemented currently.")
+ else:
+ dist, idx = three_nn(unknown, known, eps=0.)
+ dist.stop_gradient = True
+ idx.stop_gradient = True
+ dist = fluid.layers.sqrt(dist)
+ ones = fluid.layers.fill_constant_batch_size_like(dist, dist.shape, dist.dtype, 1)
+        dist_recip = ones / (dist + 1e-8)  # 1.0 / dist
+ norm = fluid.layers.reduce_sum(dist_recip, dim=-1, keep_dim=True)
+ weight = dist_recip / norm
+ weight.stop_gradient = True
+ interp_feats = three_interp(known_feats, weight, idx)
+
+ new_features = interp_feats if unknown_feats is None else \
+ fluid.layers.concat([interp_feats, unknown_feats], axis=-1)
+ new_features = fluid.layers.transpose(new_features, perm=[0, 2, 1])
+ new_features = fluid.layers.unsqueeze(new_features, axes=[-1])
+ new_features = MLP(new_features, mlp, bn=bn, bn_momentum=bn_momentum, name=name + '_mlp')
+ new_features = fluid.layers.squeeze(new_features, axes=[-1])
+ new_features = fluid.layers.transpose(new_features, perm=[0, 2, 1])
+
+ return new_features
+
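+# --- Usage sketch (illustrative) ---
+# A minimal single-scale call of the SA/FP modules, assuming `xyz` is an
+# existing [B, N, 3] float32 Variable; shapes follow the docstrings above:
+#
+#   new_xyz, sa_feat = pointnet_sa_module(
+#       xyz, npoint=1024, radiuss=[0.5], nsamples=[16],
+#       mlps=[[16, 16, 32]], name='sa_demo')          # sa_feat: [B, 32, 1024]
+#   sa_feat_t = fluid.layers.transpose(sa_feat, perm=[0, 2, 1])
+#   up_feat = pointnet_fp_module(
+#       unknown=xyz, known=new_xyz, unknown_feats=None,
+#       known_feats=sa_feat_t, mlp=[32, 32], name='fp_demo')  # [B, N, 32]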
diff --git a/PaddleCV/Paddle3D/PointRCNN/models/pointnet2_msg.py b/PaddleCV/Paddle3D/PointRCNN/models/pointnet2_msg.py
new file mode 100644
index 0000000000000000000000000000000000000000..b4d5f98c3b320663111cf9eceef4f2649f44007d
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/models/pointnet2_msg.py
@@ -0,0 +1,78 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
+#
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
+"""
+Contains the PointNet++ MSG backbone network used by the RPN.
+"""
+
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
+import numpy as np
+
+import paddle.fluid as fluid
+from paddle.fluid.param_attr import ParamAttr
+from paddle.fluid.initializer import Constant
+from models.pointnet2_modules import *
+
+__all__ = ["PointNet2MSG"]
+
+
+class PointNet2MSG(object):
+ def __init__(self, cfg, xyz, feature=None, use_xyz=True):
+ self.cfg = cfg
+ self.xyz = xyz
+ self.feature = feature
+ self.use_xyz = use_xyz
+ self.model_config()
+
+ def model_config(self):
+ self.SA_confs = []
+        for i in range(len(self.cfg.RPN.SA_CONFIG.NPOINTS)):
+ self.SA_confs.append({
+ "npoint": self.cfg.RPN.SA_CONFIG.NPOINTS[i],
+ "radiuss": self.cfg.RPN.SA_CONFIG.RADIUS[i],
+ "nsamples": self.cfg.RPN.SA_CONFIG.NSAMPLE[i],
+ "mlps": self.cfg.RPN.SA_CONFIG.MLPS[i],
+ })
+
+ self.FP_confs = []
+        for i in range(len(self.cfg.RPN.FP_MLPS)):
+ self.FP_confs.append({"mlp": self.cfg.RPN.FP_MLPS[i]})
+
+ def build(self, bn_momentum=0.95):
+ xyzs, features = [self.xyz], [self.feature]
+ xyzi, featurei = self.xyz, self.feature
+ for i, SA_conf in enumerate(self.SA_confs):
+ xyzi, featurei = pointnet_sa_module(
+ xyz=xyzi,
+ feature=featurei,
+ bn_momentum=bn_momentum,
+ use_xyz=self.use_xyz,
+ name="sa_{}".format(i),
+ **SA_conf)
+ xyzs.append(xyzi)
+ features.append(fluid.layers.transpose(featurei, perm=[0, 2, 1]))
+ for i in range(-1, -(len(self.FP_confs) + 1), -1):
+ features[i - 1] = pointnet_fp_module(
+ unknown=xyzs[i - 1],
+ known=xyzs[i],
+ unknown_feats=features[i - 1],
+ known_feats=features[i],
+ bn_momentum=bn_momentum,
+ name="fp_{}".format(i + len(self.FP_confs)),
+ **self.FP_confs[i])
+
+ return xyzs[0], features[0]
+
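+# A minimal wiring sketch (illustrative), mirroring how the RPN drives this
+# backbone; `cfg` is the loaded PointRCNN config and 16384 stands in for
+# cfg.RPN.NUM_POINTS:
+#
+#   xyz = fluid.layers.data(name='pts_input', shape=[16384, 3], dtype='float32')
+#   msg = PointNet2MSG(cfg, xyz, feature=None, use_xyz=True)
+#   backbone_xyz, backbone_feature = msg.build()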
diff --git a/PaddleCV/Paddle3D/PointRCNN/models/rcnn.py b/PaddleCV/Paddle3D/PointRCNN/models/rcnn.py
new file mode 100644
index 0000000000000000000000000000000000000000..11247eb48c505e4cb8dc8a466ed1abca20078dd8
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/models/rcnn.py
@@ -0,0 +1,302 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
+#
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
+
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
+import numpy as np
+import sys
+
+import paddle.fluid as fluid
+from paddle.fluid.param_attr import ParamAttr
+from paddle.fluid.initializer import Constant
+
+from models.pointnet2_modules import MLP, pointnet_sa_module, conv_bn
+from models.loss_utils import sigmoid_focal_loss , get_reg_loss
+from utils.proposal_target import get_proposal_target_func
+from utils.cyops.kitti_utils import rotate_pc_along_y
+
+__all__ = ['RCNN']
+
+
+class RCNN(object):
+ def __init__(self, cfg, num_classes, batch_size, mode='TRAIN', use_xyz=True, input_channels=0):
+ self.cfg = cfg
+ self.use_xyz = use_xyz
+ self.num_classes = num_classes
+ self.input_channels = input_channels
+ self.inputs = None
+ self.training = mode == 'TRAIN'
+ self.batch_size = batch_size
+
+ def create_tmp_var(self, name, dtype, shape):
+ return fluid.default_main_program().current_block().create_var(
+ name=name, dtype=dtype, shape=shape
+ )
+
+ def build_model(self, inputs):
+ self.inputs = inputs
+ if self.cfg.RCNN.ROI_SAMPLE_JIT:
+ if self.training:
+ proposal_target = get_proposal_target_func(self.cfg)
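+                # fluid.layers.py_func runs the Python proposal sampler
+                # in-graph; it needs pre-created output Variables, which
+                # create_tmp_var allocates below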
+
+ tmp_list = [
+ self.inputs['seg_mask'],
+ self.inputs['rpn_features'],
+ self.inputs['gt_boxes3d'],
+ self.inputs['rpn_xyz'],
+ self.inputs['pts_depth'],
+ self.inputs['roi_boxes3d'],
+ self.inputs['rpn_intensity'],
+ ]
+            out_name = ['reg_valid_mask', 'sampled_pts', 'roi_boxes3d', 'gt_of_rois', 'pts_feature', 'cls_label', 'gt_iou']
+            reg_valid_mask = self.create_tmp_var(name="reg_valid_mask", dtype='float32', shape=[-1])
+            sampled_pts = self.create_tmp_var(name="sampled_pts", dtype='float32', shape=[-1, self.cfg.RCNN.NUM_POINTS, 3])
+            new_roi_boxes3d = self.create_tmp_var(name="new_roi_boxes3d", dtype='float32', shape=[-1, 7])
+            gt_of_rois = self.create_tmp_var(name="gt_of_rois", dtype='float32', shape=[-1, 7])
+            pts_feature = self.create_tmp_var(name="pts_feature", dtype='float32', shape=[-1, 512, 130])
+            cls_label = self.create_tmp_var(name="cls_label", dtype='int64', shape=[-1])
+            gt_iou = self.create_tmp_var(name="gt_iou", dtype='float32', shape=[-1])
+
+            out_list = [reg_valid_mask, sampled_pts, new_roi_boxes3d, gt_of_rois, pts_feature, cls_label, gt_iou]
+            out = fluid.layers.py_func(func=proposal_target, x=tmp_list, out=out_list)
+
+ self.target_dict = {}
+            for i, item in enumerate(out):
+ self.target_dict[out_name[i]] = item
+
+ pts = fluid.layers.concat(input=[self.target_dict['sampled_pts'],self.target_dict['pts_feature']], axis=2)
+ self.debug = pts
+ self.target_dict['pts_input'] = pts
+ else:
+ rpn_xyz, rpn_features = inputs['rpn_xyz'], inputs['rpn_features']
+ batch_rois = inputs['roi_boxes3d']
+ rpn_intensity = inputs['rpn_intensity']
+ rpn_intensity = fluid.layers.unsqueeze(rpn_intensity,axes=[2])
+ seg_mask = fluid.layers.unsqueeze(inputs['seg_mask'],axes=[2])
+ if self.cfg.RCNN.USE_INTENSITY:
+ pts_extra_input_list = [rpn_intensity, seg_mask]
+ else:
+ pts_extra_input_list = [seg_mask]
+
+ if self.cfg.RCNN.USE_DEPTH:
+                pts_depth = inputs['pts_depth'] / 70.0 - 0.5
+ pts_depth = fluid.layers.unsqueeze(pts_depth,axes=[2])
+ pts_extra_input_list.append(pts_depth)
+ pts_extra_input = fluid.layers.concat(pts_extra_input_list, axis=2)
+ pts_feature = fluid.layers.concat([pts_extra_input, rpn_features],axis=2)
+
+ pooled_features, pooled_empty_flag = fluid.layers.roi_pool_3d(rpn_xyz,pts_feature,batch_rois,
+ self.cfg.RCNN.POOL_EXTRA_WIDTH,
+ sampled_pt_num=self.cfg.RCNN.NUM_POINTS)
+ # canonical transformation
+ batch_size = batch_rois.shape[0]
+ roi_center = batch_rois[:, :, 0:3]
+ tmp = pooled_features[:, :, :, 0:3] - fluid.layers.unsqueeze(roi_center,axes=[2])
+ pooled_features = fluid.layers.concat(input=[tmp,pooled_features[:,:,:,3:]],axis=3)
+ concat_list = []
+ for i in range(batch_size):
+ tmp = rotate_pc_along_y(pooled_features[i, :, :, 0:3],
+ batch_rois[i, :, 6])
+ concat = fluid.layers.concat([tmp,pooled_features[i,:,:,3:]],axis=-1)
+ concat = fluid.layers.unsqueeze(concat,axes=[0])
+ concat_list.append(concat)
+ pooled_features = fluid.layers.concat(concat_list,axis=0)
+ pts = fluid.layers.reshape(pooled_features,shape=[-1,pooled_features.shape[2],pooled_features.shape[3]])
+
+ else:
+ pts = inputs['pts_input']
+ self.target_dict = {}
+ self.target_dict['pts_input'] = inputs['pts_input']
+ self.target_dict['roi_boxes3d'] = inputs['roi_boxes3d']
+
+ if self.training:
+ self.target_dict['cls_label'] = inputs['cls_label']
+ self.target_dict['reg_valid_mask'] = inputs['reg_valid_mask']
+ self.target_dict['gt_of_rois'] = inputs['gt_boxes3d_ct']
+
+ xyz = pts[:,:,0:3]
+ feature = fluid.layers.transpose(pts[:,:,3:], [0,2,1]) if pts.shape[-1]>3 else None
+ if self.cfg.RCNN.USE_RPN_FEATURES:
+ self.rcnn_input_channel = 3 + int(self.cfg.RCNN.USE_INTENSITY) + \
+ int(self.cfg.RCNN.USE_MASK) + int(self.cfg.RCNN.USE_DEPTH)
+ c_out = self.cfg.RCNN.XYZ_UP_LAYER[-1]
+
+ xyz_input = pts[:,:,:self.rcnn_input_channel]
+ xyz_input = fluid.layers.transpose(xyz_input, [0,2,1])
+ xyz_input = fluid.layers.unsqueeze(xyz_input, axes=[3])
+
+ rpn_feature = pts[:,:,self.rcnn_input_channel:]
+ rpn_feature = fluid.layers.transpose(rpn_feature, [0,2,1])
+ rpn_feature = fluid.layers.unsqueeze(rpn_feature,axes=[3])
+
+ xyz_feature = MLP(
+ xyz_input,
+ out_channels_list=self.cfg.RCNN.XYZ_UP_LAYER,
+ bn=self.cfg.RCNN.USE_BN,
+ name="xyz_up_layer")
+
+ merged_feature = fluid.layers.concat([xyz_feature, rpn_feature],axis=1)
+ merged_feature = MLP(
+ merged_feature,
+ out_channels_list=[c_out],
+ bn=self.cfg.RCNN.USE_BN,
+ name="xyz_down_layer")
+
+ xyzs = [xyz]
+ features = [fluid.layers.squeeze(merged_feature,axes=[3])]
+ else:
+ xyzs = [xyz]
+ features = [feature]
+
+ # forward
+ xyzi, featurei = xyzs[-1], features[-1]
+ for k in range(len(self.cfg.RCNN.SA_CONFIG.NPOINTS)):
+ mlps = self.cfg.RCNN.SA_CONFIG.MLPS[k]
+ npoint = self.cfg.RCNN.SA_CONFIG.NPOINTS[k] if self.cfg.RCNN.SA_CONFIG.NPOINTS[k] != -1 else None
+
+ xyzi, featurei = pointnet_sa_module(
+ xyz=xyzi,
+                feature=featurei,
+                bn=self.cfg.RCNN.USE_BN,
+                use_xyz=self.use_xyz,
+                name="sa_{}".format(k),
+                npoint=npoint,
+                mlps=[mlps],
+                radiuss=[self.cfg.RCNN.SA_CONFIG.RADIUS[k]],
+                nsamples=[self.cfg.RCNN.SA_CONFIG.NSAMPLE[k]]
+ )
+ xyzs.append(xyzi)
+ features.append(featurei)
+
+ head_in = features[-1]
+ head_in = fluid.layers.unsqueeze(head_in, axes=[2])
+
+ cls_out = head_in
+ reg_out = cls_out
+
+        for i in range(len(self.cfg.RCNN.CLS_FC)):
+ cls_out = conv_bn(cls_out, self.cfg.RCNN.CLS_FC[i], bn=self.cfg.RCNN.USE_BN, name='rcnn_cls_{}'.format(i))
+ if i == 0 and self.cfg.RCNN.DP_RATIO >= 0:
+ cls_out = fluid.layers.dropout(cls_out, self.cfg.RCNN.DP_RATIO, dropout_implementation="upscale_in_train")
+ cls_channel = 1 if self.num_classes == 2 else self.num_classes
+ cls_out = conv_bn(cls_out, cls_channel, act=None, name="cls_out", bn=self.cfg.RCNN.USE_BN)
+ self.cls_out = fluid.layers.squeeze(cls_out,axes=[1,3])
+
+ per_loc_bin_num = int(self.cfg.RCNN.LOC_SCOPE / self.cfg.RCNN.LOC_BIN_SIZE) * 2
+ loc_y_bin_num = int(self.cfg.RCNN.LOC_Y_SCOPE / self.cfg.RCNN.LOC_Y_BIN_SIZE) * 2
+ reg_channel = per_loc_bin_num * 4 + self.cfg.RCNN.NUM_HEAD_BIN * 2 + 3
+ reg_channel += (1 if not self.cfg.RCNN.LOC_Y_BY_BIN else loc_y_bin_num * 2)
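+        # channel layout consumed by get_reg_loss: x/z bin + residual logits
+        # (per_loc_bin_num * 4), heading bin + residual (NUM_HEAD_BIN * 2),
+        # 3 size residuals, then either 1 direct y offset or y bin + residual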
+        for i in range(len(self.cfg.RCNN.REG_FC)):
+ reg_out = conv_bn(reg_out, self.cfg.RCNN.REG_FC[i], bn=self.cfg.RCNN.USE_BN, name='rcnn_reg_{}'.format(i))
+ if i == 0 and self.cfg.RCNN.DP_RATIO >= 0:
+ reg_out = fluid.layers.dropout(reg_out, self.cfg.RCNN.DP_RATIO, dropout_implementation="upscale_in_train")
+
+ reg_out = conv_bn(reg_out, reg_channel, act=None, name="reg_out", bn=self.cfg.RCNN.USE_BN)
+ self.reg_out = fluid.layers.squeeze(reg_out, axes=[2,3])
+
+
+ self.outputs = {
+ 'rcnn_cls':self.cls_out,
+ 'rcnn_reg':self.reg_out,
+ }
+ if self.training:
+ self.outputs.update(self.target_dict)
+        else:
+ self.outputs['sample_id'] = inputs['sample_id']
+ self.outputs['pts_input'] = inputs['pts_input']
+ self.outputs['roi_boxes3d'] = inputs['roi_boxes3d']
+ self.outputs['roi_scores'] = inputs['roi_scores']
+ self.outputs['gt_iou'] = inputs['gt_iou']
+ self.outputs['gt_boxes3d'] = inputs['gt_boxes3d']
+
+ if self.cls_out.shape[1] == 1:
+ raw_scores = fluid.layers.reshape(self.cls_out, shape=[-1])
+ norm_scores = fluid.layers.sigmoid(raw_scores)
+ else:
+ norm_scores = fluid.layers.softmax(self.cls_out, axis=1)
+ self.outputs['norm_scores'] = norm_scores
+
+ def get_outputs(self):
+ return self.outputs
+
+ def get_loss(self):
+        assert self.inputs is not None, \
+            "please call build_model() first"
+ rcnn_cls_label = self.outputs['cls_label']
+ reg_valid_mask = self.outputs['reg_valid_mask']
+ roi_boxes3d = self.outputs['roi_boxes3d']
+ roi_size = roi_boxes3d[:, 3:6]
+ gt_boxes3d_ct = self.outputs['gt_of_rois']
+ pts_input = self.outputs['pts_input']
+
+ rcnn_cls = self.cls_out
+ rcnn_reg = self.reg_out
+
+ # RCNN classification loss
+ assert self.cfg.RCNN.LOSS_CLS in ["SigmoidFocalLoss", "BinaryCrossEntropy"], \
+ "unsupported RCNN cls loss type {}".format(self.cfg.RCNN.LOSS_CLS)
+
+ if self.cfg.RCNN.LOSS_CLS == "SigmoidFocalLoss":
+ cls_flat = fluid.layers.reshape(self.cls_out, shape=[-1])
+ cls_label_flat = fluid.layers.reshape(rcnn_cls_label, shape=[-1])
+ cls_label_flat = fluid.layers.cast(cls_label_flat, dtype=cls_flat.dtype)
+ cls_target = fluid.layers.cast(cls_label_flat>0, dtype=cls_flat.dtype)
+ cls_label_flat.stop_gradient = True
+ pos = fluid.layers.cast(cls_label_flat > 0, dtype=cls_flat.dtype)
+ pos.stop_gradient = True
+ pos_normalizer = fluid.layers.reduce_sum(pos)
+ cls_weights = fluid.layers.cast(cls_label_flat >= 0, dtype=cls_flat.dtype)
+ cls_weights = cls_weights / fluid.layers.clip(pos_normalizer, min=1.0, max=1e10)
+ cls_weights.stop_gradient = True
+ rcnn_loss_cls = sigmoid_focal_loss(cls_flat, cls_target, cls_weights)
+ rcnn_loss_cls = fluid.layers.reduce_sum(rcnn_loss_cls)
+ else: # BinaryCrossEntropy
+ cls_label = fluid.layers.reshape(rcnn_cls_label, shape=self.cls_out.shape)
+ cls_valid_mask = fluid.layers.cast(cls_label >= 0, dtype=self.cls_out.dtype)
+ cls_label = fluid.layers.cast(cls_label, dtype=self.cls_out.dtype)
+ cls_label.stop_gradient = True
+ rcnn_loss_cls = fluid.layers.sigmoid_cross_entropy_with_logits(self.cls_out, cls_label)
+            cls_mask_normalizer = fluid.layers.reduce_sum(cls_valid_mask)
+            rcnn_loss_cls = fluid.layers.reduce_sum(rcnn_loss_cls * cls_valid_mask) \
+                            / fluid.layers.clip(cls_mask_normalizer, min=1.0, max=1e10)
+
+ # RCNN regression loss
+ reg_out = self.reg_out
+ fg_mask = fluid.layers.cast(reg_valid_mask > 0, dtype=reg_out.dtype)
+ fg_mask.stop_gradient = True
+ gt_boxes3d_ct = fluid.layers.reshape(gt_boxes3d_ct, [-1,7])
+ all_anchor_size = roi_size
+ anchor_size = all_anchor_size[fg_mask] if self.cfg.RCNN.SIZE_RES_ON_ROI else self.cfg.CLS_MEAN_SIZE[0]
+
+ loc_loss, angle_loss, size_loss, loss_dict = get_reg_loss(
+ reg_out * fg_mask,
+ gt_boxes3d_ct,
+ fg_mask,
+ point_num=float(self.batch_size*64),
+ loc_scope=self.cfg.RCNN.LOC_SCOPE,
+ loc_bin_size=self.cfg.RCNN.LOC_BIN_SIZE,
+ num_head_bin=self.cfg.RCNN.NUM_HEAD_BIN,
+ anchor_size=anchor_size,
+ get_xz_fine=True,
+ get_y_by_bin=self.cfg.RCNN.LOC_Y_BY_BIN,
+ loc_y_scope=self.cfg.RCNN.LOC_Y_SCOPE,
+ loc_y_bin_size=self.cfg.RCNN.LOC_Y_BIN_SIZE,
+ get_ry_fine=True
+ )
+ rcnn_loss_reg = loc_loss + angle_loss + size_loss * 3
+ rcnn_loss = rcnn_loss_cls + rcnn_loss_reg
+ return rcnn_loss, rcnn_loss_cls, rcnn_loss_reg
+
diff --git a/PaddleCV/Paddle3D/PointRCNN/models/rpn.py b/PaddleCV/Paddle3D/PointRCNN/models/rpn.py
new file mode 100644
index 0000000000000000000000000000000000000000..30f0e34551a01a065f0784650256399506588639
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/models/rpn.py
@@ -0,0 +1,167 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
+#
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
+
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
+import numpy as np
+
+import paddle.fluid as fluid
+from paddle.fluid.param_attr import ParamAttr
+from paddle.fluid.initializer import Normal, Constant
+
+from utils.proposal_utils import get_proposal_func
+from models.pointnet2_msg import PointNet2MSG
+from models.pointnet2_modules import conv_bn
+from models.loss_utils import sigmoid_focal_loss, get_reg_loss
+
+__all__ = ["RPN"]
+
+
+class RPN(object):
+ def __init__(self, cfg, batch_size, use_xyz=True, mode='TRAIN', prog=None):
+ self.cfg = cfg
+ self.batch_size = batch_size
+ self.use_xyz = use_xyz
+ self.mode = mode
+ self.is_train = mode == 'TRAIN'
+ self.inputs = None
+ self.prog = fluid.default_main_program() if prog is None else prog
+
+ def build(self, inputs):
+        assert self.cfg.RPN.BACKBONE == 'pointnet2_msg', \
+            "RPN backbone only supports pointnet2_msg"
+ self.inputs = inputs
+ self.outputs = {}
+
+ xyz = inputs["pts_input"]
+        assert not self.cfg.RPN.USE_INTENSITY, \
+            "RPN.USE_INTENSITY is not supported yet"
+ feature = None
+ msg = PointNet2MSG(self.cfg, xyz, feature, self.use_xyz)
+ backbone_xyz, backbone_feature = msg.build()
+ self.outputs['backbone_xyz'] = backbone_xyz
+ self.outputs['backbone_feature'] = backbone_feature
+
+ backbone_feature = fluid.layers.transpose(backbone_feature, perm=[0, 2, 1])
+ cls_out = fluid.layers.unsqueeze(backbone_feature, axes=[-1])
+ reg_out = cls_out
+
+ # classification branch
+ for i in range(self.cfg.RPN.CLS_FC.__len__()):
+ cls_out = conv_bn(cls_out, self.cfg.RPN.CLS_FC[i], bn=self.cfg.RPN.USE_BN, name='rpn_cls_{}'.format(i))
+ if i == 0 and self.cfg.RPN.DP_RATIO > 0:
+ cls_out = fluid.layers.dropout(cls_out, self.cfg.RPN.DP_RATIO, dropout_implementation="upscale_in_train")
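+        # final 1x1 conv; the bias is initialized to -log((1 - pi) / pi) with
+        # pi = 0.01 (i.e. -log(99)), the usual focal-loss prior that keeps
+        # initial foreground scores low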
+ cls_out = fluid.layers.conv2d(cls_out,
+ num_filters=1,
+ filter_size=1,
+ stride=1,
+ padding=0,
+ dilation=1,
+ param_attr=ParamAttr(name='rpn_cls_out_conv_weight'),
+ bias_attr=ParamAttr(name='rpn_cls_out_conv_bias',
+ initializer=Constant(-np.log(99))))
+ cls_out = fluid.layers.squeeze(cls_out, axes=[1, 3])
+ self.outputs['rpn_cls'] = cls_out
+
+ # regression branch
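+        # bin-based box encoding: x/z are each regressed as a classification
+        # over per_loc_bin_num bins plus an in-bin residual, heading as
+        # NUM_HEAD_BIN bins plus residuals, followed by 3 size residuals and
+        # 1 channel for the y offset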
+ per_loc_bin_num = int(self.cfg.RPN.LOC_SCOPE / self.cfg.RPN.LOC_BIN_SIZE) * 2
+ if self.cfg.RPN.LOC_XZ_FINE:
+ reg_channel = per_loc_bin_num * 4 + self.cfg.RPN.NUM_HEAD_BIN * 2 + 3
+ else:
+ reg_channel = per_loc_bin_num * 2 + self.cfg.RPN.NUM_HEAD_BIN * 2 + 3
+ reg_channel += 1 # reg y
+
+ for i in range(self.cfg.RPN.REG_FC.__len__()):
+ reg_out = conv_bn(reg_out, self.cfg.RPN.REG_FC[i], bn=self.cfg.RPN.USE_BN, name='rpn_reg_{}'.format(i))
+ if i == 0 and self.cfg.RPN.DP_RATIO > 0:
+ reg_out = fluid.layers.dropout(reg_out, self.cfg.RPN.DP_RATIO, dropout_implementation="upscale_in_train")
+ reg_out = fluid.layers.conv2d(reg_out,
+ num_filters=reg_channel,
+ filter_size=1,
+ stride=1,
+ padding=0,
+ dilation=1,
+ param_attr=ParamAttr(name='rpn_reg_out_conv_weight',
+ initializer=Normal(0., 0.001),),
+ bias_attr=ParamAttr(name='rpn_reg_out_conv_bias'))
+ reg_out = fluid.layers.squeeze(reg_out, axes=[3])
+ reg_out = fluid.layers.transpose(reg_out, [0, 2, 1])
+ self.outputs['rpn_reg'] = reg_out
+
+ if self.mode != 'TRAIN' or self.cfg.RCNN.ENABLED:
+ rpn_scores_row = cls_out
+ rpn_scores_norm = fluid.layers.sigmoid(rpn_scores_row)
+ seg_mask = fluid.layers.cast(rpn_scores_norm > self.cfg.RPN.SCORE_THRESH, dtype='float32')
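+            # per-point Euclidean distance from the origin in the rect camera frame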
+ pts_depth = fluid.layers.sqrt(fluid.layers.reduce_sum(backbone_xyz * backbone_xyz, dim=2))
+ proposal_func = get_proposal_func(self.cfg, self.mode)
+ proposal_input = fluid.layers.concat([fluid.layers.unsqueeze(rpn_scores_row, axes=[-1]),
+ backbone_xyz, reg_out], axis=-1)
+ proposal = self.prog.current_block().create_var(name='proposal',
+ shape=[-1, proposal_input.shape[1], 8],
+ dtype='float32')
+ fluid.layers.py_func(proposal_func, proposal_input, proposal)
+ rois, roi_scores_row = proposal[:, :, :7], proposal[:, :, -1]
+ self.outputs['rois'] = rois
+ self.outputs['roi_scores_row'] = roi_scores_row
+ self.outputs['seg_mask'] = seg_mask
+ self.outputs['pts_depth'] = pts_depth
+
+ def get_outputs(self):
+ return self.outputs
+
+ def get_loss(self):
+ assert self.inputs is not None, \
+ "please call build() first"
+ rpn_cls_label = self.inputs['rpn_cls_label']
+ rpn_reg_label = self.inputs['rpn_reg_label']
+ rpn_cls = self.outputs['rpn_cls']
+ rpn_reg = self.outputs['rpn_reg']
+
+ # RPN classification loss
+ assert self.cfg.RPN.LOSS_CLS == "SigmoidFocalLoss", \
+ "unsupported RPN cls loss type {}".format(self.cfg.RPN.LOSS_CLS)
+ cls_flat = fluid.layers.reshape(rpn_cls, shape=[-1])
+ cls_label_flat = fluid.layers.reshape(rpn_cls_label, shape=[-1])
+ cls_label_pos = fluid.layers.cast(cls_label_flat > 0, dtype=cls_flat.dtype)
+ pos_normalizer = fluid.layers.reduce_sum(cls_label_pos)
+ cls_weights = fluid.layers.cast(cls_label_flat >= 0, dtype=cls_flat.dtype)
+ cls_weights = cls_weights / fluid.layers.clip(pos_normalizer, min=1.0, max=1e10)
+ cls_weights.stop_gradient = True
+ cls_label_flat = fluid.layers.cast(cls_label_flat, dtype=cls_flat.dtype)
+ cls_label_flat.stop_gradient = True
+ rpn_loss_cls = sigmoid_focal_loss(cls_flat, cls_label_pos, cls_weights)
+ rpn_loss_cls = fluid.layers.reduce_sum(rpn_loss_cls)
+
+ # RPN regression loss
+ rpn_reg = fluid.layers.reshape(rpn_reg, [-1, rpn_reg.shape[-1]])
+ reg_label = fluid.layers.reshape(rpn_reg_label, [-1, rpn_reg_label.shape[-1]])
+ fg_mask = fluid.layers.cast(cls_label_flat > 0, dtype=rpn_reg.dtype)
+ fg_mask.stop_gradient = True
+ loc_loss, angle_loss, size_loss, loss_dict = get_reg_loss(
+ rpn_reg * fg_mask, reg_label, fg_mask,
+ float(self.batch_size * self.cfg.RPN.NUM_POINTS),
+ loc_scope=self.cfg.RPN.LOC_SCOPE,
+ loc_bin_size=self.cfg.RPN.LOC_BIN_SIZE,
+ num_head_bin=self.cfg.RPN.NUM_HEAD_BIN,
+ anchor_size=self.cfg.CLS_MEAN_SIZE[0],
+ get_xz_fine=self.cfg.RPN.LOC_XZ_FINE,
+ get_y_by_bin=False,
+ get_ry_fine=False)
+ rpn_loss_reg = loc_loss + angle_loss + size_loss * 3
+
+ self.rpn_loss = rpn_loss_cls * self.cfg.RPN.LOSS_WEIGHT[0] + rpn_loss_reg * self.cfg.RPN.LOSS_WEIGHT[1]
+ return self.rpn_loss, rpn_loss_cls, rpn_loss_reg
+
diff --git a/PaddleCV/Paddle3D/PointRCNN/requirement.txt b/PaddleCV/Paddle3D/PointRCNN/requirement.txt
new file mode 100644
index 0000000000000000000000000000000000000000..6ff347ab06c588b507fd6b5f1442e2375afb032a
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/requirement.txt
@@ -0,0 +1,6 @@
+Cython
+opencv-python
+shapely
+scikit-image
+Numba
+fire
diff --git a/PaddleCV/Paddle3D/PointRCNN/tools/generate_aug_scene.py b/PaddleCV/Paddle3D/PointRCNN/tools/generate_aug_scene.py
new file mode 100644
index 0000000000000000000000000000000000000000..59cfa4abc0629c71d150f750e8f32400c6c361b9
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/tools/generate_aug_scene.py
@@ -0,0 +1,330 @@
+"""
+Generate GT database
+This code is based on https://github.com/sshaoshuai/PointRCNN/blob/master/tools/generate_aug_scene.py
+"""
+
+import os
+import numpy as np
+import pickle
+
+import pts_utils
+import utils.cyops.kitti_utils as kitti_utils
+from utils.box_utils import boxes_iou3d
+from utils import calibration as calib
+from data.kitti_dataset import KittiDataset
+import argparse
+
+np.random.seed(1024)
+
+parser = argparse.ArgumentParser()
+parser.add_argument('--mode', type=str, default='generator')
+parser.add_argument('--class_name', type=str, default='Car')
+parser.add_argument('--data_dir', type=str, default='./data')
+parser.add_argument('--save_dir', type=str, default='./data/KITTI/aug_scene/training')
+parser.add_argument('--split', type=str, default='train')
+parser.add_argument('--gt_database_dir', type=str, default='./data/gt_database/train_gt_database_3level_Car.pkl')
+parser.add_argument('--include_similar', action='store_true', default=False)
+parser.add_argument('--aug_times', type=int, default=4)
+args = parser.parse_args()
+
+PC_REDUCE_BY_RANGE = True
+if args.class_name == 'Car':
+ PC_AREA_SCOPE = np.array([[-40, 40], [-1, 3], [0, 70.4]]) # x, y, z scope in rect camera coords
+else:
+ PC_AREA_SCOPE = np.array([[-30, 30], [-1, 3], [0, 50]])
+
+
+def log_print(info, fp=None):
+ print(info)
+ if fp is not None:
+ # print(info, file=fp)
+ fp.write(info+"\n")
+
+
+def save_kitti_format(calib, bbox3d, obj_list, img_shape, save_fp):
+ corners3d = kitti_utils.boxes3d_to_corners3d(bbox3d)
+ img_boxes, _ = calib.corners3d_to_img_boxes(corners3d)
+
+ img_boxes[:, 0] = np.clip(img_boxes[:, 0], 0, img_shape[1] - 1)
+ img_boxes[:, 1] = np.clip(img_boxes[:, 1], 0, img_shape[0] - 1)
+ img_boxes[:, 2] = np.clip(img_boxes[:, 2], 0, img_shape[1] - 1)
+ img_boxes[:, 3] = np.clip(img_boxes[:, 3], 0, img_shape[0] - 1)
+
+ # Discard boxes that are larger than 80% of the image width OR height
+ img_boxes_w = img_boxes[:, 2] - img_boxes[:, 0]
+ img_boxes_h = img_boxes[:, 3] - img_boxes[:, 1]
+ box_valid_mask = np.logical_and(img_boxes_w < img_shape[1] * 0.8, img_boxes_h < img_shape[0] * 0.8)
+
+ for k in range(bbox3d.shape[0]):
+ if box_valid_mask[k] == 0:
+ continue
+ x, z, ry = bbox3d[k, 0], bbox3d[k, 2], bbox3d[k, 6]
+ beta = np.arctan2(z, x)
+ alpha = -np.sign(beta) * np.pi / 2 + beta + ry
+
+ save_fp.write('%s %.2f %d %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f\n' %
+ (args.class_name, obj_list[k].trucation, int(obj_list[k].occlusion), alpha, img_boxes[k, 0], img_boxes[k, 1],
+ img_boxes[k, 2], img_boxes[k, 3],
+ bbox3d[k, 3], bbox3d[k, 4], bbox3d[k, 5], bbox3d[k, 0], bbox3d[k, 1], bbox3d[k, 2],
+ bbox3d[k, 6]))
+
+
+class AugSceneGenerator(KittiDataset):
+ def __init__(self, root_dir, gt_database=None, split='train', classes=args.class_name):
+ super(AugSceneGenerator, self).__init__(root_dir, split=split)
+ self.gt_database = None
+ if classes == 'Car':
+ self.classes = ('Background', 'Car')
+ elif classes == 'People':
+ self.classes = ('Background', 'Pedestrian', 'Cyclist')
+ elif classes == 'Pedestrian':
+ self.classes = ('Background', 'Pedestrian')
+ elif classes == 'Cyclist':
+ self.classes = ('Background', 'Cyclist')
+ else:
+ assert False, "Invalid classes: %s" % classes
+
+ self.gt_database = gt_database
+
+ def __len__(self):
+ raise NotImplementedError
+
+ def __getitem__(self, item):
+ raise NotImplementedError
+
+ def filtrate_dc_objects(self, obj_list):
+ valid_obj_list = []
+ for obj in obj_list:
+ if obj.cls_type in ['DontCare']:
+ continue
+ valid_obj_list.append(obj)
+
+ return valid_obj_list
+
+ def filtrate_objects(self, obj_list):
+ valid_obj_list = []
+ type_whitelist = self.classes
+ if args.include_similar:
+ type_whitelist = list(self.classes)
+ if 'Car' in self.classes:
+ type_whitelist.append('Van')
+ if 'Pedestrian' in self.classes or 'Cyclist' in self.classes:
+ type_whitelist.append('Person_sitting')
+
+ for obj in obj_list:
+ if obj.cls_type in type_whitelist:
+ valid_obj_list.append(obj)
+ return valid_obj_list
+
+ @staticmethod
+ def get_valid_flag(pts_rect, pts_img, pts_rect_depth, img_shape):
+        """
+        A valid point should lie inside the image (and inside PC_AREA_SCOPE)
+        :param pts_rect: (N, 3) points in rect camera coordinates
+        :param pts_img: (N, 2) points projected onto the image plane
+        :param pts_rect_depth: (N,) point depths in the rect camera frame
+        :param img_shape: (H, W) image shape
+        :return: (N,) boolean mask of valid points
+        """
+ val_flag_1 = np.logical_and(pts_img[:, 0] >= 0, pts_img[:, 0] < img_shape[1])
+ val_flag_2 = np.logical_and(pts_img[:, 1] >= 0, pts_img[:, 1] < img_shape[0])
+ val_flag_merge = np.logical_and(val_flag_1, val_flag_2)
+ pts_valid_flag = np.logical_and(val_flag_merge, pts_rect_depth >= 0)
+
+ if PC_REDUCE_BY_RANGE:
+ x_range, y_range, z_range = PC_AREA_SCOPE
+ pts_x, pts_y, pts_z = pts_rect[:, 0], pts_rect[:, 1], pts_rect[:, 2]
+ range_flag = (pts_x >= x_range[0]) & (pts_x <= x_range[1]) \
+ & (pts_y >= y_range[0]) & (pts_y <= y_range[1]) \
+ & (pts_z >= z_range[0]) & (pts_z <= z_range[1])
+ pts_valid_flag = pts_valid_flag & range_flag
+ return pts_valid_flag
+
+ @staticmethod
+ def check_pc_range(xyz):
+ """
+ :param xyz: [x, y, z]
+ :return:
+ """
+ x_range, y_range, z_range = PC_AREA_SCOPE
+ if (x_range[0] <= xyz[0] <= x_range[1]) and (y_range[0] <= xyz[1] <= y_range[1]) and \
+ (z_range[0] <= xyz[2] <= z_range[1]):
+ return True
+ return False
+
+ def aug_one_scene(self, sample_id, pts_rect, pts_intensity, all_gt_boxes3d):
+        """
+        :param sample_id: KITTI sample index
+        :param pts_rect: (N, 3)
+        :param pts_intensity: (N,)
+        :param all_gt_boxes3d: (M, 7)
+        :return:
+        """
+ assert self.gt_database is not None
+ extra_gt_num = np.random.randint(10, 15)
+ try_times = 50
+ cnt = 0
+ cur_gt_boxes3d = all_gt_boxes3d.copy()
+ cur_gt_boxes3d[:, 4] += 0.5
+ cur_gt_boxes3d[:, 5] += 0.5 # enlarge new added box to avoid too nearby boxes
+
+ extra_gt_obj_list = []
+ extra_gt_boxes3d_list = []
+ new_pts_list, new_pts_intensity_list = [], []
+ src_pts_flag = np.ones(pts_rect.shape[0], dtype=np.int32)
+
+ road_plane = self.get_road_plane(sample_id)
+ a, b, c, d = road_plane
+
+ while try_times > 0:
+ try_times -= 1
+
+ rand_idx = np.random.randint(0, self.gt_database.__len__() - 1)
+
+ new_gt_dict = self.gt_database[rand_idx]
+ new_gt_box3d = new_gt_dict['gt_box3d'].copy()
+ new_gt_points = new_gt_dict['points'].copy()
+ new_gt_intensity = new_gt_dict['intensity'].copy()
+ new_gt_obj = new_gt_dict['obj']
+ center = new_gt_box3d[0:3]
+ if PC_REDUCE_BY_RANGE and (self.check_pc_range(center) is False):
+ continue
+ if cnt > extra_gt_num:
+ break
+ if new_gt_points.__len__() < 5: # too few points
+ continue
+
+ # put it on the road plane
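+            # the road plane satisfies a*x + b*y + c*z + d = 0; solve for y at
+            # the box center to get the plane height under the sampled box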
+ cur_height = (-d - a * center[0] - c * center[2]) / b
+ move_height = new_gt_box3d[1] - cur_height
+ new_gt_box3d[1] -= move_height
+ new_gt_points[:, 1] -= move_height
+
+ cnt += 1
+
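+            # reject the sampled box if it overlaps any existing (enlarged) gt box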
+ iou3d = boxes_iou3d(new_gt_box3d.reshape(1, 7), cur_gt_boxes3d)
+
+ valid_flag = iou3d.max() < 1e-8
+ if not valid_flag:
+ continue
+
+ enlarged_box3d = new_gt_box3d.copy()
+ enlarged_box3d[3] += 2 # remove the points above and below the object
+ boxes_pts_mask_list = pts_utils.pts_in_boxes3d(pts_rect, enlarged_box3d.reshape(1, 7))
+ pt_mask_flag = (boxes_pts_mask_list[0] == 1)
+ src_pts_flag[pt_mask_flag] = 0 # remove the original points which are inside the new box
+
+ new_pts_list.append(new_gt_points)
+ new_pts_intensity_list.append(new_gt_intensity)
+ enlarged_box3d = new_gt_box3d.copy()
+ enlarged_box3d[4] += 0.5
+ enlarged_box3d[5] += 0.5 # enlarge new added box to avoid too nearby boxes
+ cur_gt_boxes3d = np.concatenate((cur_gt_boxes3d, enlarged_box3d.reshape(1, 7)), axis=0)
+ extra_gt_boxes3d_list.append(new_gt_box3d.reshape(1, 7))
+ extra_gt_obj_list.append(new_gt_obj)
+
+ if new_pts_list.__len__() == 0:
+ return False, pts_rect, pts_intensity, None, None
+
+ extra_gt_boxes3d = np.concatenate(extra_gt_boxes3d_list, axis=0)
+ # remove original points and add new points
+ pts_rect = pts_rect[src_pts_flag == 1]
+ pts_intensity = pts_intensity[src_pts_flag == 1]
+ new_pts_rect = np.concatenate(new_pts_list, axis=0)
+ new_pts_intensity = np.concatenate(new_pts_intensity_list, axis=0)
+ pts_rect = np.concatenate((pts_rect, new_pts_rect), axis=0)
+ pts_intensity = np.concatenate((pts_intensity, new_pts_intensity), axis=0)
+
+ return True, pts_rect, pts_intensity, extra_gt_boxes3d, extra_gt_obj_list
+
+ def aug_one_epoch_scene(self, base_id, data_save_dir, label_save_dir, split_list, log_fp=None):
+ for idx, sample_id in enumerate(self.image_idx_list):
+ sample_id = int(sample_id)
+ print('process gt sample (%s, id=%06d)' % (args.split, sample_id))
+
+ pts_lidar = self.get_lidar(sample_id)
+ calib = self.get_calib(sample_id)
+ pts_rect = calib.lidar_to_rect(pts_lidar[:, 0:3])
+ pts_img, pts_rect_depth = calib.rect_to_img(pts_rect)
+ img_shape = self.get_image_shape(sample_id)
+
+ pts_valid_flag = self.get_valid_flag(pts_rect, pts_img, pts_rect_depth, img_shape)
+ pts_rect = pts_rect[pts_valid_flag][:, 0:3]
+ pts_intensity = pts_lidar[pts_valid_flag][:, 3]
+
+ # all labels for checking overlapping
+ all_obj_list = self.filtrate_dc_objects(self.get_label(sample_id))
+ all_gt_boxes3d = np.zeros((all_obj_list.__len__(), 7), dtype=np.float32)
+ for k, obj in enumerate(all_obj_list):
+ all_gt_boxes3d[k, 0:3], all_gt_boxes3d[k, 3], all_gt_boxes3d[k, 4], all_gt_boxes3d[k, 5], \
+ all_gt_boxes3d[k, 6] = obj.pos, obj.h, obj.w, obj.l, obj.ry
+
+ # gt_boxes3d of current label
+ obj_list = self.filtrate_objects(self.get_label(sample_id))
+ if args.class_name != 'Car' and obj_list.__len__() == 0:
+ continue
+
+ # augment one scene
+ aug_flag, pts_rect, pts_intensity, extra_gt_boxes3d, extra_gt_obj_list = \
+ self.aug_one_scene(sample_id, pts_rect, pts_intensity, all_gt_boxes3d)
+
+ # save augment result to file
+ pts_info = np.concatenate((pts_rect, pts_intensity.reshape(-1, 1)), axis=1)
+ bin_file = os.path.join(data_save_dir, '%06d.bin' % (base_id + sample_id))
+ pts_info.astype(np.float32).tofile(bin_file)
+
+ # save filtered original gt_boxes3d
+ label_save_file = os.path.join(label_save_dir, '%06d.txt' % (base_id + sample_id))
+ with open(label_save_file, 'w') as f:
+ for obj in obj_list:
+ f.write(obj.to_kitti_format() + '\n')
+
+                if aug_flag:
+                    # augment successfully
+                    save_kitti_format(calib, extra_gt_boxes3d, extra_gt_obj_list, img_shape=img_shape, save_fp=f)
+                else:
+                    extra_gt_boxes3d = np.zeros((0, 7), dtype=np.float32)
+ log_print('Save to file (new_obj: %s): %s' % (extra_gt_boxes3d.__len__(), label_save_file), fp=log_fp)
+ split_list.append('%06d' % (base_id + sample_id))
+
+ def generate_aug_scene(self, aug_times, log_fp=None):
+ data_save_dir = os.path.join(args.save_dir, 'rectified_data')
+ label_save_dir = os.path.join(args.save_dir, 'aug_label')
+ if not os.path.isdir(data_save_dir):
+ os.makedirs(data_save_dir)
+ if not os.path.isdir(label_save_dir):
+ os.makedirs(label_save_dir)
+
+ split_file = os.path.join(args.save_dir, '%s_aug.txt' % args.split)
+ split_list = self.image_idx_list[:]
+ for epoch in range(aug_times):
+ base_id = (epoch + 1) * 10000
+ self.aug_one_epoch_scene(base_id, data_save_dir, label_save_dir, split_list, log_fp=log_fp)
+
+ with open(split_file, 'w') as f:
+ for idx, sample_id in enumerate(split_list):
+ f.write(str(sample_id) + '\n')
+ log_print('Save split file to %s' % split_file, fp=log_fp)
+ target_dir = os.path.join(args.data_dir, 'KITTI/ImageSets/')
+ os.system('cp %s %s' % (split_file, target_dir))
+ log_print('Copy split file from %s to %s' % (split_file, target_dir), fp=log_fp)
+
+
+if __name__ == '__main__':
+ if not os.path.isdir(args.save_dir):
+ os.makedirs(args.save_dir)
+ info_file = os.path.join(args.save_dir, 'log_info.txt')
+
+ if args.mode == 'generator':
+ log_fp = open(info_file, 'w')
+
+        with open(args.gt_database_dir, 'rb') as f:
+            gt_database = pickle.load(f)
+ log_print('Loading gt_database(%d) from %s' % (gt_database.__len__(), args.gt_database_dir), fp=log_fp)
+
+ dataset = AugSceneGenerator(root_dir=args.data_dir, gt_database=gt_database, split=args.split)
+ dataset.generate_aug_scene(aug_times=args.aug_times, log_fp=log_fp)
+
+ log_fp.close()
+
+ else:
+ pass
+
diff --git a/PaddleCV/Paddle3D/PointRCNN/tools/generate_gt_database.py b/PaddleCV/Paddle3D/PointRCNN/tools/generate_gt_database.py
new file mode 100644
index 0000000000000000000000000000000000000000..43290db734c9734fef8120031cab44a394f4323b
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/tools/generate_gt_database.py
@@ -0,0 +1,104 @@
+"""
+Generate GT database
+This code is based on https://github.com/sshaoshuai/PointRCNN/blob/master/tools/generate_gt_database.py
+"""
+
+import os
+import numpy as np
+import pickle
+
+from data.kitti_dataset import KittiDataset
+import pts_utils
+import argparse
+
+parser = argparse.ArgumentParser()
+parser.add_argument('--data_dir', type=str, default='./data')
+parser.add_argument('--save_dir', type=str, default='./data/gt_database')
+parser.add_argument('--class_name', type=str, default='Car')
+parser.add_argument('--split', type=str, default='train')
+args = parser.parse_args()
+
+
+class GTDatabaseGenerator(KittiDataset):
+ def __init__(self, root_dir, split='train', classes=args.class_name):
+ super(GTDatabaseGenerator, self).__init__(root_dir, split=split)
+ self.gt_database = None
+ if classes == 'Car':
+ self.classes = ('Background', 'Car')
+ elif classes == 'People':
+ self.classes = ('Background', 'Pedestrian', 'Cyclist')
+ elif classes == 'Pedestrian':
+ self.classes = ('Background', 'Pedestrian')
+ elif classes == 'Cyclist':
+ self.classes = ('Background', 'Cyclist')
+ else:
+ assert False, "Invalid classes: %s" % classes
+
+ def __len__(self):
+ raise NotImplementedError
+
+ def __getitem__(self, item):
+ raise NotImplementedError
+
+ def filtrate_objects(self, obj_list):
+ valid_obj_list = []
+ for obj in obj_list:
+ if obj.cls_type not in self.classes:
+ continue
+ if obj.level_str not in ['Easy', 'Moderate', 'Hard']:
+ continue
+ valid_obj_list.append(obj)
+
+ return valid_obj_list
+
+ def generate_gt_database(self):
+ gt_database = []
+ for idx, sample_id in enumerate(self.image_idx_list):
+ sample_id = int(sample_id)
+ print('process gt sample (id=%06d)' % sample_id)
+
+ pts_lidar = self.get_lidar(sample_id)
+ calib = self.get_calib(sample_id)
+ pts_rect = calib.lidar_to_rect(pts_lidar[:, 0:3])
+ pts_intensity = pts_lidar[:, 3]
+
+ obj_list = self.filtrate_objects(self.get_label(sample_id))
+
+ gt_boxes3d = np.zeros((obj_list.__len__(), 7), dtype=np.float32)
+ for k, obj in enumerate(obj_list):
+ gt_boxes3d[k, 0:3], gt_boxes3d[k, 3], gt_boxes3d[k, 4], gt_boxes3d[k, 5], gt_boxes3d[k, 6] \
+ = obj.pos, obj.h, obj.w, obj.l, obj.ry
+
+ if gt_boxes3d.__len__() == 0:
+ print('No gt object')
+ continue
+
+ boxes_pts_mask_list = pts_utils.pts_in_boxes3d(pts_rect, gt_boxes3d)
+
+ for k in range(boxes_pts_mask_list.shape[0]):
+ pt_mask_flag = (boxes_pts_mask_list[k] == 1)
+ cur_pts = pts_rect[pt_mask_flag].astype(np.float32)
+ cur_pts_intensity = pts_intensity[pt_mask_flag].astype(np.float32)
+ sample_dict = {'sample_id': sample_id,
+ 'cls_type': obj_list[k].cls_type,
+ 'gt_box3d': gt_boxes3d[k],
+ 'points': cur_pts,
+ 'intensity': cur_pts_intensity,
+ 'obj': obj_list[k]}
+ gt_database.append(sample_dict)
+
+ save_file_name = os.path.join(args.save_dir, '%s_gt_database_3level_%s.pkl' % (args.split, self.classes[-1]))
+ with open(save_file_name, 'wb') as f:
+ pickle.dump(gt_database, f)
+
+ self.gt_database = gt_database
+ print('Save refine training sample info file to %s' % save_file_name)
+
+
+if __name__ == '__main__':
+ dataset = GTDatabaseGenerator(root_dir=args.data_dir, split=args.split)
+ if not os.path.isdir(args.save_dir):
+ os.makedirs(args.save_dir)
+
+ dataset.generate_gt_database()
+
diff --git a/PaddleCV/Paddle3D/PointRCNN/tools/kitti_eval.py b/PaddleCV/Paddle3D/PointRCNN/tools/kitti_eval.py
new file mode 100644
index 0000000000000000000000000000000000000000..6d16ef487301fb7ba45b71c64cd3af337cef13c5
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/tools/kitti_eval.py
@@ -0,0 +1,71 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import os
+import sys
+import argparse
+
+
+def parse_args():
+ parser = argparse.ArgumentParser(
+ "KITTI mAP evaluation script")
+ parser.add_argument(
+ '--result_dir',
+ type=str,
+ default='./result_dir',
+ help='detection result directory to evaluate')
+ parser.add_argument(
+ '--data_dir',
+ type=str,
+ default='./data',
+ help='KITTI dataset root directory')
+ parser.add_argument(
+ '--split',
+ type=str,
+ default='val',
+ help='evaluation split, default val')
+ parser.add_argument(
+ '--class_name',
+ type=str,
+ default='Car',
+ help='evaluation class name, default Car')
+ args = parser.parse_args()
+ return args
+
+
+def kitti_eval():
+    if sys.version_info < (3, 6):
+ print("KITTI mAP evaluation can only run with python3.6+")
+ sys.exit(1)
+
+ args = parse_args()
+
+ label_dir = os.path.join(args.data_dir, 'KITTI/object/training', 'label_2')
+ split_file = os.path.join(args.data_dir, 'KITTI/ImageSets',
+ '{}.txt'.format(args.split))
+ final_output_dir = os.path.join(args.result_dir, 'final_result', 'data')
+ name_to_class = {'Car': 0, 'Pedestrian': 1, 'Cyclist': 2}
+
+ from tools.kitti_object_eval_python.evaluate import evaluate as kitti_evaluate
+ ap_result_str, ap_dict = kitti_evaluate(
+ label_dir, final_output_dir, label_split_file=split_file,
+ current_class=name_to_class[args.class_name])
+
+ print("KITTI evaluate: ", ap_result_str, ap_dict)
+
+
+if __name__ == "__main__":
+ kitti_eval()
diff --git a/PaddleCV/Paddle3D/PointRCNN/tools/kitti_object_eval_python/LICENSE b/PaddleCV/Paddle3D/PointRCNN/tools/kitti_object_eval_python/LICENSE
new file mode 100644
index 0000000000000000000000000000000000000000..ab602974d200aa6849e6ad8220951ef9a78d9f08
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/tools/kitti_object_eval_python/LICENSE
@@ -0,0 +1,21 @@
+MIT License
+
+Copyright (c) 2018
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
diff --git a/PaddleCV/Paddle3D/PointRCNN/tools/kitti_object_eval_python/README.md b/PaddleCV/Paddle3D/PointRCNN/tools/kitti_object_eval_python/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..0e0e0c307c2db3f0486e594deae1c04ac49f55f3
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/tools/kitti_object_eval_python/README.md
@@ -0,0 +1,32 @@
+# kitti-object-eval-python
+**NOTE**: This is borrowed from [traveller59/kitti-object-eval-python](https://github.com/traveller59/kitti-object-eval-python)
+
+Fast KITTI object detection evaluation in Python (finishes in less than 10 seconds); supports 2D/BEV/3D/AOS and coco-style AP. If you use the command line interface, numba needs some time to compile JIT functions.
+## Dependencies
+Only Python 3.6+ is supported; `numpy`, `skimage`, `numba` and `fire` are required. If you have Anaconda, just install `cudatoolkit` in Anaconda. Otherwise, please refer to this [page](https://github.com/numba/numba#custom-python-environments) to set up LLVM and CUDA for numba.
+* Install by conda:
+```
+conda install -c numba cudatoolkit=x.x (8.0, 9.0, 9.1, depending on your environment)
+```
+## Usage
+* commandline interface:
+```
+python evaluate.py evaluate --label_path=/path/to/your_gt_label_folder --result_path=/path/to/your_result_folder --label_split_file=/path/to/val.txt --current_class=0 --coco=False
+```
+* python interface:
+```Python
+import kitti_common as kitti
+from eval import get_official_eval_result, get_coco_eval_result
+def _read_imageset_file(path):
+ with open(path, 'r') as f:
+ lines = f.readlines()
+ return [int(line) for line in lines]
+det_path = "/path/to/your_result_folder"
+dt_annos = kitti.get_label_annos(det_path)
+gt_path = "/path/to/your_gt_label_folder"
+gt_split_file = "/path/to/val.txt" # from https://xiaozhichen.github.io/files/mv3d/imagesets.tar.gz
+val_image_ids = _read_imageset_file(gt_split_file)
+gt_annos = kitti.get_label_annos(gt_path, val_image_ids)
+print(get_official_eval_result(gt_annos, dt_annos, 0)) # 6s in my computer
+print(get_coco_eval_result(gt_annos, dt_annos, 0)) # 18s in my computer
+```
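+
+In this repository, the evaluation above is also wrapped by `tools/kitti_eval.py`. A minimal invocation sketch (flag names and defaults taken from the wrapper's argument parser; run from the `PointRCNN` root directory so the `tools` package is importable, and adjust paths to your setup):
+```
+python tools/kitti_eval.py --result_dir=./result_dir --data_dir=./data --split=val --class_name=Car
+```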
diff --git a/PaddleCV/Paddle3D/PointRCNN/tools/kitti_object_eval_python/eval.py b/PaddleCV/Paddle3D/PointRCNN/tools/kitti_object_eval_python/eval.py
new file mode 100644
index 0000000000000000000000000000000000000000..38101ca69a59cdc0603ebc82cac0338432457550
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/tools/kitti_object_eval_python/eval.py
@@ -0,0 +1,740 @@
+import numpy as np
+import numba
+import io as sysio
+from tools.kitti_object_eval_python.rotate_iou import rotate_iou_gpu_eval
+
+
+@numba.jit
+def get_thresholds(scores: np.ndarray, num_gt, num_sample_pts=41):
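+    # pick score thresholds so that recall is sampled at (approximately)
+    # num_sample_pts evenly spaced points, as in the official KITTI protocol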
+ scores.sort()
+ scores = scores[::-1]
+ current_recall = 0
+ thresholds = []
+ for i, score in enumerate(scores):
+ l_recall = (i + 1) / num_gt
+ if i < (len(scores) - 1):
+ r_recall = (i + 2) / num_gt
+ else:
+ r_recall = l_recall
+ if (((r_recall - current_recall) < (current_recall - l_recall))
+ and (i < (len(scores) - 1))):
+ continue
+ # recall = l_recall
+ thresholds.append(score)
+ current_recall += 1 / (num_sample_pts - 1.0)
+ return thresholds
+
+
+def clean_data(gt_anno, dt_anno, current_class, difficulty):
+ CLASS_NAMES = ['car', 'pedestrian', 'cyclist']
+ MIN_HEIGHT = [40, 25, 25]
+ MAX_OCCLUSION = [0, 1, 2]
+ MAX_TRUNCATION = [0.15, 0.3, 0.5]
+ dc_bboxes, ignored_gt, ignored_dt = [], [], []
+ current_cls_name = CLASS_NAMES[current_class].lower()
+ num_gt = len(gt_anno["name"])
+ num_dt = len(dt_anno["name"])
+ num_valid_gt = 0
+ for i in range(num_gt):
+ bbox = gt_anno["bbox"][i]
+ gt_name = gt_anno["name"][i].lower()
+ height = bbox[3] - bbox[1]
+ valid_class = -1
+ if (gt_name == current_cls_name):
+ valid_class = 1
+ elif (current_cls_name == "Pedestrian".lower()
+ and "Person_sitting".lower() == gt_name):
+ valid_class = 0
+ elif (current_cls_name == "Car".lower() and "Van".lower() == gt_name):
+ valid_class = 0
+ else:
+ valid_class = -1
+ ignore = False
+ if ((gt_anno["occluded"][i] > MAX_OCCLUSION[difficulty])
+ or (gt_anno["truncated"][i] > MAX_TRUNCATION[difficulty])
+ or (height <= MIN_HEIGHT[difficulty])):
+ # if gt_anno["difficulty"][i] > difficulty or gt_anno["difficulty"][i] == -1:
+ ignore = True
+ if valid_class == 1 and not ignore:
+ ignored_gt.append(0)
+ num_valid_gt += 1
+ elif (valid_class == 0 or (ignore and (valid_class == 1))):
+ ignored_gt.append(1)
+ else:
+ ignored_gt.append(-1)
+ # for i in range(num_gt):
+ if gt_anno["name"][i] == "DontCare":
+ dc_bboxes.append(gt_anno["bbox"][i])
+ for i in range(num_dt):
+ if (dt_anno["name"][i].lower() == current_cls_name):
+ valid_class = 1
+ else:
+ valid_class = -1
+ height = abs(dt_anno["bbox"][i, 3] - dt_anno["bbox"][i, 1])
+ if height < MIN_HEIGHT[difficulty]:
+ ignored_dt.append(1)
+ elif valid_class == 1:
+ ignored_dt.append(0)
+ else:
+ ignored_dt.append(-1)
+
+ return num_valid_gt, ignored_gt, ignored_dt, dc_bboxes
+
+
+@numba.jit(nopython=True)
+def image_box_overlap(boxes, query_boxes, criterion=-1):
+ N = boxes.shape[0]
+ K = query_boxes.shape[0]
+ overlaps = np.zeros((N, K), dtype=boxes.dtype)
+ for k in range(K):
+ qbox_area = ((query_boxes[k, 2] - query_boxes[k, 0]) *
+ (query_boxes[k, 3] - query_boxes[k, 1]))
+ for n in range(N):
+ iw = (min(boxes[n, 2], query_boxes[k, 2]) -
+ max(boxes[n, 0], query_boxes[k, 0]))
+ if iw > 0:
+ ih = (min(boxes[n, 3], query_boxes[k, 3]) -
+ max(boxes[n, 1], query_boxes[k, 1]))
+ if ih > 0:
+ if criterion == -1:
+ ua = (
+ (boxes[n, 2] - boxes[n, 0]) *
+ (boxes[n, 3] - boxes[n, 1]) + qbox_area - iw * ih)
+ elif criterion == 0:
+ ua = ((boxes[n, 2] - boxes[n, 0]) *
+ (boxes[n, 3] - boxes[n, 1]))
+ elif criterion == 1:
+ ua = qbox_area
+ else:
+ ua = 1.0
+ overlaps[n, k] = iw * ih / ua
+ return overlaps
+
+
+def bev_box_overlap(boxes, qboxes, criterion=-1):
+ riou = rotate_iou_gpu_eval(boxes, qboxes, criterion)
+ return riou
+
+
+@numba.jit(nopython=True, parallel=True)
+def d3_box_overlap_kernel(boxes, qboxes, rinc, criterion=-1):
+    # ONLY supports overlap in CAMERA coordinates, not lidar.
+ N, K = boxes.shape[0], qboxes.shape[0]
+ for i in range(N):
+ for j in range(K):
+ if rinc[i, j] > 0:
+ # iw = (min(boxes[i, 1] + boxes[i, 4], qboxes[j, 1] +
+ # qboxes[j, 4]) - max(boxes[i, 1], qboxes[j, 1]))
+ iw = (min(boxes[i, 1], qboxes[j, 1]) - max(
+ boxes[i, 1] - boxes[i, 4], qboxes[j, 1] - qboxes[j, 4]))
+
+ if iw > 0:
+ area1 = boxes[i, 3] * boxes[i, 4] * boxes[i, 5]
+ area2 = qboxes[j, 3] * qboxes[j, 4] * qboxes[j, 5]
+ inc = iw * rinc[i, j]
+ if criterion == -1:
+ ua = (area1 + area2 - inc)
+ elif criterion == 0:
+ ua = area1
+ elif criterion == 1:
+ ua = area2
+ else:
+ ua = inc
+ rinc[i, j] = inc / ua
+ else:
+ rinc[i, j] = 0.0
+
+
+def d3_box_overlap(boxes, qboxes, criterion=-1):
+ rinc = rotate_iou_gpu_eval(boxes[:, [0, 2, 3, 5, 6]],
+ qboxes[:, [0, 2, 3, 5, 6]], 2)
+ d3_box_overlap_kernel(boxes, qboxes, rinc, criterion)
+ return rinc
+
+
+@numba.jit(nopython=True)
+def compute_statistics_jit(overlaps,
+ gt_datas,
+ dt_datas,
+ ignored_gt,
+ ignored_det,
+ dc_bboxes,
+ metric,
+ min_overlap,
+ thresh=0,
+ compute_fp=False,
+ compute_aos=False):
+
+ det_size = dt_datas.shape[0]
+ gt_size = gt_datas.shape[0]
+ dt_scores = dt_datas[:, -1]
+ dt_alphas = dt_datas[:, 4]
+ gt_alphas = gt_datas[:, 4]
+ dt_bboxes = dt_datas[:, :4]
+ gt_bboxes = gt_datas[:, :4]
+
+ assigned_detection = [False] * det_size
+ ignored_threshold = [False] * det_size
+ if compute_fp:
+ for i in range(det_size):
+ if (dt_scores[i] < thresh):
+ ignored_threshold[i] = True
+ NO_DETECTION = -10000000
+ tp, fp, fn, similarity = 0, 0, 0, 0
+ # thresholds = [0.0]
+ # delta = [0.0]
+ thresholds = np.zeros((gt_size, ))
+ thresh_idx = 0
+ delta = np.zeros((gt_size, ))
+ delta_idx = 0
+ for i in range(gt_size):
+ if ignored_gt[i] == -1:
+ continue
+ det_idx = -1
+ valid_detection = NO_DETECTION
+ max_overlap = 0
+ assigned_ignored_det = False
+
+ for j in range(det_size):
+ if (ignored_det[j] == -1):
+ continue
+ if (assigned_detection[j]):
+ continue
+ if (ignored_threshold[j]):
+ continue
+ overlap = overlaps[j, i]
+ dt_score = dt_scores[j]
+ if (not compute_fp and (overlap > min_overlap)
+ and dt_score > valid_detection):
+ det_idx = j
+ valid_detection = dt_score
+ elif (compute_fp and (overlap > min_overlap)
+ and (overlap > max_overlap or assigned_ignored_det)
+ and ignored_det[j] == 0):
+ max_overlap = overlap
+ det_idx = j
+ valid_detection = 1
+ assigned_ignored_det = False
+ elif (compute_fp and (overlap > min_overlap)
+ and (valid_detection == NO_DETECTION)
+ and ignored_det[j] == 1):
+ det_idx = j
+ valid_detection = 1
+ assigned_ignored_det = True
+
+ if (valid_detection == NO_DETECTION) and ignored_gt[i] == 0:
+ fn += 1
+ elif ((valid_detection != NO_DETECTION)
+ and (ignored_gt[i] == 1 or ignored_det[det_idx] == 1)):
+ assigned_detection[det_idx] = True
+ elif valid_detection != NO_DETECTION:
+ tp += 1
+ # thresholds.append(dt_scores[det_idx])
+ thresholds[thresh_idx] = dt_scores[det_idx]
+ thresh_idx += 1
+ if compute_aos:
+ # delta.append(gt_alphas[i] - dt_alphas[det_idx])
+ delta[delta_idx] = gt_alphas[i] - dt_alphas[det_idx]
+ delta_idx += 1
+
+ assigned_detection[det_idx] = True
+ if compute_fp:
+ for i in range(det_size):
+ if (not (assigned_detection[i] or ignored_det[i] == -1
+ or ignored_det[i] == 1 or ignored_threshold[i])):
+ fp += 1
+ nstuff = 0
+ if metric == 0:
+ overlaps_dt_dc = image_box_overlap(dt_bboxes, dc_bboxes, 0)
+ for i in range(dc_bboxes.shape[0]):
+ for j in range(det_size):
+ if (assigned_detection[j]):
+ continue
+ if (ignored_det[j] == -1 or ignored_det[j] == 1):
+ continue
+ if (ignored_threshold[j]):
+ continue
+ if overlaps_dt_dc[j, i] > min_overlap:
+ assigned_detection[j] = True
+ nstuff += 1
+ fp -= nstuff
+ if compute_aos:
+ tmp = np.zeros((fp + delta_idx, ))
+ # tmp = [0] * fp
+ for i in range(delta_idx):
+ tmp[i + fp] = (1.0 + np.cos(delta[i])) / 2.0
+ # tmp.append((1.0 + np.cos(delta[i])) / 2.0)
+ # assert len(tmp) == fp + tp
+ # assert len(delta) == tp
+ if tp > 0 or fp > 0:
+ similarity = np.sum(tmp)
+ else:
+ similarity = -1
+ return tp, fp, fn, similarity, thresholds[:thresh_idx]
+
+
+def get_split_parts(num, num_part):
+ same_part = num // num_part
+ remain_num = num % num_part
+ if remain_num == 0:
+ return [same_part] * num_part
+ else:
+ return [same_part] * num_part + [remain_num]
+
+
+@numba.jit(nopython=True)
+def fused_compute_statistics(overlaps,
+ pr,
+ gt_nums,
+ dt_nums,
+ dc_nums,
+ gt_datas,
+ dt_datas,
+ dontcares,
+ ignored_gts,
+ ignored_dets,
+ metric,
+ min_overlap,
+ thresholds,
+ compute_aos=False):
+ gt_num = 0
+ dt_num = 0
+ dc_num = 0
+ for i in range(gt_nums.shape[0]):
+ for t, thresh in enumerate(thresholds):
+ overlap = overlaps[dt_num:dt_num + dt_nums[i], gt_num:
+ gt_num + gt_nums[i]]
+
+ gt_data = gt_datas[gt_num:gt_num + gt_nums[i]]
+ dt_data = dt_datas[dt_num:dt_num + dt_nums[i]]
+ ignored_gt = ignored_gts[gt_num:gt_num + gt_nums[i]]
+ ignored_det = ignored_dets[dt_num:dt_num + dt_nums[i]]
+ dontcare = dontcares[dc_num:dc_num + dc_nums[i]]
+ tp, fp, fn, similarity, _ = compute_statistics_jit(
+ overlap,
+ gt_data,
+ dt_data,
+ ignored_gt,
+ ignored_det,
+ dontcare,
+ metric,
+ min_overlap=min_overlap,
+ thresh=thresh,
+ compute_fp=True,
+ compute_aos=compute_aos)
+ pr[t, 0] += tp
+ pr[t, 1] += fp
+ pr[t, 2] += fn
+ if similarity != -1:
+ pr[t, 3] += similarity
+ gt_num += gt_nums[i]
+ dt_num += dt_nums[i]
+ dc_num += dc_nums[i]
+
+
+def calculate_iou_partly(gt_annos, dt_annos, metric, num_parts=50):
+    """Fast IoU algorithm. This function can be used independently for
+    result analysis. Must be used in the CAMERA coordinate system.
+    Args:
+        gt_annos: list of dict, must come from get_label_annos() in kitti_common.py
+        dt_annos: list of dict, must come from get_label_annos() in kitti_common.py
+        metric: eval type. 0: bbox, 1: bev, 2: 3d
+        num_parts: int, a parameter for the fast calculation algorithm
+    """
+ assert len(gt_annos) == len(dt_annos)
+ total_dt_num = np.stack([len(a["name"]) for a in dt_annos], 0)
+ total_gt_num = np.stack([len(a["name"]) for a in gt_annos], 0)
+ num_examples = len(gt_annos)
+ split_parts = get_split_parts(num_examples, num_parts)
+ parted_overlaps = []
+ example_idx = 0
+
+ for num_part in split_parts:
+ gt_annos_part = gt_annos[example_idx:example_idx + num_part]
+ dt_annos_part = dt_annos[example_idx:example_idx + num_part]
+ if metric == 0:
+ gt_boxes = np.concatenate([a["bbox"] for a in gt_annos_part], 0)
+ dt_boxes = np.concatenate([a["bbox"] for a in dt_annos_part], 0)
+ overlap_part = image_box_overlap(gt_boxes, dt_boxes)
+ elif metric == 1:
+ loc = np.concatenate(
+ [a["location"][:, [0, 2]] for a in gt_annos_part], 0)
+ dims = np.concatenate(
+ [a["dimensions"][:, [0, 2]] for a in gt_annos_part], 0)
+ rots = np.concatenate([a["rotation_y"] for a in gt_annos_part], 0)
+ gt_boxes = np.concatenate(
+ [loc, dims, rots[..., np.newaxis]], axis=1)
+ loc = np.concatenate(
+ [a["location"][:, [0, 2]] for a in dt_annos_part], 0)
+ dims = np.concatenate(
+ [a["dimensions"][:, [0, 2]] for a in dt_annos_part], 0)
+ rots = np.concatenate([a["rotation_y"] for a in dt_annos_part], 0)
+ dt_boxes = np.concatenate(
+ [loc, dims, rots[..., np.newaxis]], axis=1)
+ overlap_part = bev_box_overlap(gt_boxes, dt_boxes).astype(
+ np.float64)
+ elif metric == 2:
+ loc = np.concatenate([a["location"] for a in gt_annos_part], 0)
+ dims = np.concatenate([a["dimensions"] for a in gt_annos_part], 0)
+ rots = np.concatenate([a["rotation_y"] for a in gt_annos_part], 0)
+ gt_boxes = np.concatenate(
+ [loc, dims, rots[..., np.newaxis]], axis=1)
+ loc = np.concatenate([a["location"] for a in dt_annos_part], 0)
+ dims = np.concatenate([a["dimensions"] for a in dt_annos_part], 0)
+ rots = np.concatenate([a["rotation_y"] for a in dt_annos_part], 0)
+ dt_boxes = np.concatenate(
+ [loc, dims, rots[..., np.newaxis]], axis=1)
+ overlap_part = d3_box_overlap(gt_boxes, dt_boxes).astype(
+ np.float64)
+ else:
+ raise ValueError("unknown metric")
+ parted_overlaps.append(overlap_part)
+ example_idx += num_part
+ overlaps = []
+ example_idx = 0
+ for j, num_part in enumerate(split_parts):
+ gt_annos_part = gt_annos[example_idx:example_idx + num_part]
+ dt_annos_part = dt_annos[example_idx:example_idx + num_part]
+ gt_num_idx, dt_num_idx = 0, 0
+ for i in range(num_part):
+ gt_box_num = total_gt_num[example_idx + i]
+ dt_box_num = total_dt_num[example_idx + i]
+ overlaps.append(
+ parted_overlaps[j][gt_num_idx:gt_num_idx + gt_box_num,
+ dt_num_idx:dt_num_idx + dt_box_num])
+ gt_num_idx += gt_box_num
+ dt_num_idx += dt_box_num
+ example_idx += num_part
+
+ return overlaps, parted_overlaps, total_gt_num, total_dt_num
+
+
+def _prepare_data(gt_annos, dt_annos, current_class, difficulty):
+ gt_datas_list = []
+ dt_datas_list = []
+ total_dc_num = []
+ ignored_gts, ignored_dets, dontcares = [], [], []
+ total_num_valid_gt = 0
+ for i in range(len(gt_annos)):
+ rets = clean_data(gt_annos[i], dt_annos[i], current_class, difficulty)
+ num_valid_gt, ignored_gt, ignored_det, dc_bboxes = rets
+ ignored_gts.append(np.array(ignored_gt, dtype=np.int64))
+ ignored_dets.append(np.array(ignored_det, dtype=np.int64))
+ if len(dc_bboxes) == 0:
+ dc_bboxes = np.zeros((0, 4)).astype(np.float64)
+ else:
+ dc_bboxes = np.stack(dc_bboxes, 0).astype(np.float64)
+ total_dc_num.append(dc_bboxes.shape[0])
+ dontcares.append(dc_bboxes)
+ total_num_valid_gt += num_valid_gt
+ gt_datas = np.concatenate(
+ [gt_annos[i]["bbox"], gt_annos[i]["alpha"][..., np.newaxis]], 1)
+ dt_datas = np.concatenate([
+ dt_annos[i]["bbox"], dt_annos[i]["alpha"][..., np.newaxis],
+ dt_annos[i]["score"][..., np.newaxis]
+ ], 1)
+ gt_datas_list.append(gt_datas)
+ dt_datas_list.append(dt_datas)
+ total_dc_num = np.stack(total_dc_num, axis=0)
+ return (gt_datas_list, dt_datas_list, ignored_gts, ignored_dets, dontcares,
+ total_dc_num, total_num_valid_gt)
+
+
+def eval_class(gt_annos,
+ dt_annos,
+ current_classes,
+ difficultys,
+ metric,
+ min_overlaps,
+ compute_aos=False,
+ num_parts=50):
+    """KITTI eval. Supports 2d/bev/3d/aos eval and 0.5:0.05:0.95 coco AP.
+    Args:
+        gt_annos: list of dict, must come from get_label_annos() in kitti_common.py
+        dt_annos: list of dict, must come from get_label_annos() in kitti_common.py
+        current_classes: list of int, 0: car, 1: pedestrian, 2: cyclist
+        difficultys: list of int. eval difficulty, 0: easy, 1: normal, 2: hard
+        metric: eval type. 0: bbox, 1: bev, 2: 3d
+        min_overlaps: float array of min IoU, format: [num_overlap, metric, class]
+        num_parts: int, a parameter for the fast calculation algorithm
+
+    Returns:
+        dict of recall, precision and aos
+    """
+ assert len(gt_annos) == len(dt_annos)
+ num_examples = len(gt_annos)
+ split_parts = get_split_parts(num_examples, num_parts)
+
+ rets = calculate_iou_partly(dt_annos, gt_annos, metric, num_parts)
+ overlaps, parted_overlaps, total_dt_num, total_gt_num = rets
+ N_SAMPLE_PTS = 41
+ num_minoverlap = len(min_overlaps)
+ num_class = len(current_classes)
+ num_difficulty = len(difficultys)
+ precision = np.zeros(
+ [num_class, num_difficulty, num_minoverlap, N_SAMPLE_PTS])
+ recall = np.zeros(
+ [num_class, num_difficulty, num_minoverlap, N_SAMPLE_PTS])
+ aos = np.zeros([num_class, num_difficulty, num_minoverlap, N_SAMPLE_PTS])
+ for m, current_class in enumerate(current_classes):
+ for l, difficulty in enumerate(difficultys):
+ rets = _prepare_data(gt_annos, dt_annos, current_class, difficulty)
+ (gt_datas_list, dt_datas_list, ignored_gts, ignored_dets,
+ dontcares, total_dc_num, total_num_valid_gt) = rets
+ for k, min_overlap in enumerate(min_overlaps[:, metric, m]):
+ thresholdss = []
+ for i in range(len(gt_annos)):
+ rets = compute_statistics_jit(
+ overlaps[i],
+ gt_datas_list[i],
+ dt_datas_list[i],
+ ignored_gts[i],
+ ignored_dets[i],
+ dontcares[i],
+ metric,
+ min_overlap=min_overlap,
+ thresh=0.0,
+ compute_fp=False)
+ tp, fp, fn, similarity, thresholds = rets
+ thresholdss += thresholds.tolist()
+ thresholdss = np.array(thresholdss)
+ thresholds = get_thresholds(thresholdss, total_num_valid_gt)
+ thresholds = np.array(thresholds)
+ pr = np.zeros([len(thresholds), 4])
+ idx = 0
+ for j, num_part in enumerate(split_parts):
+ gt_datas_part = np.concatenate(
+ gt_datas_list[idx:idx + num_part], 0)
+ dt_datas_part = np.concatenate(
+ dt_datas_list[idx:idx + num_part], 0)
+ dc_datas_part = np.concatenate(
+ dontcares[idx:idx + num_part], 0)
+ ignored_dets_part = np.concatenate(
+ ignored_dets[idx:idx + num_part], 0)
+ ignored_gts_part = np.concatenate(
+ ignored_gts[idx:idx + num_part], 0)
+ fused_compute_statistics(
+ parted_overlaps[j],
+ pr,
+ total_gt_num[idx:idx + num_part],
+ total_dt_num[idx:idx + num_part],
+ total_dc_num[idx:idx + num_part],
+ gt_datas_part,
+ dt_datas_part,
+ dc_datas_part,
+ ignored_gts_part,
+ ignored_dets_part,
+ metric,
+ min_overlap=min_overlap,
+ thresholds=thresholds,
+ compute_aos=compute_aos)
+ idx += num_part
+ for i in range(len(thresholds)):
+ recall[m, l, k, i] = pr[i, 0] / (pr[i, 0] + pr[i, 2])
+ precision[m, l, k, i] = pr[i, 0] / (pr[i, 0] + pr[i, 1])
+ if compute_aos:
+ aos[m, l, k, i] = pr[i, 3] / (pr[i, 0] + pr[i, 1])
+ for i in range(len(thresholds)):
+ precision[m, l, k, i] = np.max(
+ precision[m, l, k, i:], axis=-1)
+ recall[m, l, k, i] = np.max(recall[m, l, k, i:], axis=-1)
+ if compute_aos:
+ aos[m, l, k, i] = np.max(aos[m, l, k, i:], axis=-1)
+ ret_dict = {
+ "recall": recall,
+ "precision": precision,
+ "orientation": aos,
+ }
+ return ret_dict
+
+
+def get_mAP(prec):
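+    # 11-point interpolated AP: sample precision at every 4th of the 41 recall
+    # positions (recall = 0.0, 0.1, ..., 1.0), average, and convert to percent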
+ sums = 0
+ for i in range(0, prec.shape[-1], 4):
+ sums = sums + prec[..., i]
+ return sums / 11 * 100
+
+
+def print_str(value, *arg, sstream=None):
+ if sstream is None:
+ sstream = sysio.StringIO()
+ sstream.truncate(0)
+ sstream.seek(0)
+ print(value, *arg, file=sstream)
+ return sstream.getvalue()
+
+
+def do_eval(gt_annos,
+ dt_annos,
+ current_classes,
+ min_overlaps,
+ compute_aos=False):
+ # min_overlaps: [num_minoverlap, metric, num_class]
+ difficultys = [0, 1, 2]
+ ret = eval_class(gt_annos, dt_annos, current_classes, difficultys, 0,
+ min_overlaps, compute_aos)
+ # ret: [num_class, num_diff, num_minoverlap, num_sample_points]
+ mAP_bbox = get_mAP(ret["precision"])
+ mAP_aos = None
+ if compute_aos:
+ mAP_aos = get_mAP(ret["orientation"])
+ ret = eval_class(gt_annos, dt_annos, current_classes, difficultys, 1,
+ min_overlaps)
+ mAP_bev = get_mAP(ret["precision"])
+ ret = eval_class(gt_annos, dt_annos, current_classes, difficultys, 2,
+ min_overlaps)
+ mAP_3d = get_mAP(ret["precision"])
+ return mAP_bbox, mAP_bev, mAP_3d, mAP_aos
+
+
+def do_coco_style_eval(gt_annos, dt_annos, current_classes, overlap_ranges,
+ compute_aos):
+ # overlap_ranges: [range, metric, num_class]
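+    # expand each [start, stop, num] range into num linearly spaced IoU thresholds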
+ min_overlaps = np.zeros([10, *overlap_ranges.shape[1:]])
+ for i in range(overlap_ranges.shape[1]):
+ for j in range(overlap_ranges.shape[2]):
+ min_overlaps[:, i, j] = np.linspace(*overlap_ranges[:, i, j])
+ mAP_bbox, mAP_bev, mAP_3d, mAP_aos = do_eval(
+ gt_annos, dt_annos, current_classes, min_overlaps, compute_aos)
+ # ret: [num_class, num_diff, num_minoverlap]
+ mAP_bbox = mAP_bbox.mean(-1)
+ mAP_bev = mAP_bev.mean(-1)
+ mAP_3d = mAP_3d.mean(-1)
+ if mAP_aos is not None:
+ mAP_aos = mAP_aos.mean(-1)
+ return mAP_bbox, mAP_bev, mAP_3d, mAP_aos
+
+
+def get_official_eval_result(gt_annos, dt_annos, current_classes):
+ overlap_0_7 = np.array([[0.7, 0.5, 0.5, 0.7,
+ 0.5], [0.7, 0.5, 0.5, 0.7, 0.5],
+ [0.7, 0.5, 0.5, 0.7, 0.5]])
+ overlap_0_5 = np.array([[0.7, 0.5, 0.5, 0.7,
+ 0.5], [0.5, 0.25, 0.25, 0.5, 0.25],
+ [0.5, 0.25, 0.25, 0.5, 0.25]])
+ min_overlaps = np.stack([overlap_0_7, overlap_0_5], axis=0) # [2, 3, 5]
+ class_to_name = {
+ 0: 'Car',
+ 1: 'Pedestrian',
+ 2: 'Cyclist',
+ 3: 'Van',
+ 4: 'Person_sitting',
+ }
+ name_to_class = {v: n for n, v in class_to_name.items()}
+ if not isinstance(current_classes, (list, tuple)):
+ current_classes = [current_classes]
+ current_classes_int = []
+ for curcls in current_classes:
+ if isinstance(curcls, str):
+ current_classes_int.append(name_to_class[curcls])
+ else:
+ current_classes_int.append(curcls)
+ current_classes = current_classes_int
+ min_overlaps = min_overlaps[:, :, current_classes]
+ result = ''
+ # check whether alpha is valid
+ compute_aos = False
+ for anno in dt_annos:
+ if anno['alpha'].shape[0] != 0:
+ if anno['alpha'][0] != -10:
+ compute_aos = True
+ break
+ mAPbbox, mAPbev, mAP3d, mAPaos = do_eval(
+ gt_annos, dt_annos, current_classes, min_overlaps, compute_aos)
+
+ ret_dict = {}
+ for j, curcls in enumerate(current_classes):
+ # mAP threshold array: [num_minoverlap, metric, class]
+ # mAP result: [num_class, num_diff, num_minoverlap]
+ for i in range(min_overlaps.shape[0]):
+ result += print_str(
+ (f"{class_to_name[curcls]} "
+ "AP@{:.2f}, {:.2f}, {:.2f}:".format(*min_overlaps[i, :, j])))
+ result += print_str((f"bbox AP:{mAPbbox[j, 0, i]:.4f}, "
+ f"{mAPbbox[j, 1, i]:.4f}, "
+ f"{mAPbbox[j, 2, i]:.4f}"))
+ result += print_str((f"bev AP:{mAPbev[j, 0, i]:.4f}, "
+ f"{mAPbev[j, 1, i]:.4f}, "
+ f"{mAPbev[j, 2, i]:.4f}"))
+ result += print_str((f"3d AP:{mAP3d[j, 0, i]:.4f}, "
+ f"{mAP3d[j, 1, i]:.4f}, "
+ f"{mAP3d[j, 2, i]:.4f}"))
+
+ if compute_aos:
+ result += print_str((f"aos AP:{mAPaos[j, 0, i]:.2f}, "
+ f"{mAPaos[j, 1, i]:.2f}, "
+ f"{mAPaos[j, 2, i]:.2f}"))
+ ret_dict['Car_3d_easy'] = mAP3d[0, 0, 0]
+ ret_dict['Car_3d_moderate'] = mAP3d[0, 1, 0]
+ ret_dict['Car_3d_hard'] = mAP3d[0, 2, 0]
+ ret_dict['Car_bev_easy'] = mAPbev[0, 0, 0]
+ ret_dict['Car_bev_moderate'] = mAPbev[0, 1, 0]
+ ret_dict['Car_bev_hard'] = mAPbev[0, 2, 0]
+ ret_dict['Car_image_easy'] = mAPbbox[0, 0, 0]
+ ret_dict['Car_image_moderate'] = mAPbbox[0, 1, 0]
+ ret_dict['Car_image_hard'] = mAPbbox[0, 2, 0]
+
+ return result, ret_dict
+
+
+def get_coco_eval_result(gt_annos, dt_annos, current_classes):
+ class_to_name = {
+ 0: 'Car',
+ 1: 'Pedestrian',
+ 2: 'Cyclist',
+ 3: 'Van',
+ 4: 'Person_sitting',
+ }
+ class_to_range = {
+ 0: [0.5, 0.95, 10],
+ 1: [0.25, 0.7, 10],
+ 2: [0.25, 0.7, 10],
+ 3: [0.5, 0.95, 10],
+ 4: [0.25, 0.7, 10],
+ }
+ name_to_class = {v: n for n, v in class_to_name.items()}
+ if not isinstance(current_classes, (list, tuple)):
+ current_classes = [current_classes]
+ current_classes_int = []
+ for curcls in current_classes:
+ if isinstance(curcls, str):
+ current_classes_int.append(name_to_class[curcls])
+ else:
+ current_classes_int.append(curcls)
+ current_classes = current_classes_int
+ overlap_ranges = np.zeros([3, 3, len(current_classes)])
+ for i, curcls in enumerate(current_classes):
+ overlap_ranges[:, :, i] = np.array(
+ class_to_range[curcls])[:, np.newaxis]
+ result = ''
+ # check whether alpha is valid
+ compute_aos = False
+ for anno in dt_annos:
+ if anno['alpha'].shape[0] != 0:
+ if anno['alpha'][0] != -10:
+ compute_aos = True
+ break
+ mAPbbox, mAPbev, mAP3d, mAPaos = do_coco_style_eval(
+ gt_annos, dt_annos, current_classes, overlap_ranges, compute_aos)
+ for j, curcls in enumerate(current_classes):
+ # mAP threshold array: [num_minoverlap, metric, class]
+ # mAP result: [num_class, num_diff, num_minoverlap]
+ o_range = np.array(class_to_range[curcls])[[0, 2, 1]]
+ o_range[1] = (o_range[2] - o_range[0]) / (o_range[1] - 1)
+ result += print_str((f"{class_to_name[curcls]} "
+ "coco AP@{:.2f}:{:.2f}:{:.2f}:".format(*o_range)))
+ result += print_str((f"bbox AP:{mAPbbox[j, 0]:.2f}, "
+ f"{mAPbbox[j, 1]:.2f}, "
+ f"{mAPbbox[j, 2]:.2f}"))
+ result += print_str((f"bev AP:{mAPbev[j, 0]:.2f}, "
+ f"{mAPbev[j, 1]:.2f}, "
+ f"{mAPbev[j, 2]:.2f}"))
+ result += print_str((f"3d AP:{mAP3d[j, 0]:.2f}, "
+ f"{mAP3d[j, 1]:.2f}, "
+ f"{mAP3d[j, 2]:.2f}"))
+ if compute_aos:
+ result += print_str((f"aos AP:{mAPaos[j, 0]:.2f}, "
+ f"{mAPaos[j, 1]:.2f}, "
+ f"{mAPaos[j, 2]:.2f}"))
+ return result
diff --git a/PaddleCV/Paddle3D/PointRCNN/tools/kitti_object_eval_python/evaluate.py b/PaddleCV/Paddle3D/PointRCNN/tools/kitti_object_eval_python/evaluate.py
new file mode 100644
index 0000000000000000000000000000000000000000..e822ae464618eb05c4123b7bd05cec875a567b70
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/tools/kitti_object_eval_python/evaluate.py
@@ -0,0 +1,32 @@
+import time
+import fire
+
+import tools.kitti_object_eval_python.kitti_common as kitti
+from tools.kitti_object_eval_python.eval import get_official_eval_result, get_coco_eval_result
+
+
+def _read_imageset_file(path):
+ with open(path, 'r') as f:
+ lines = f.readlines()
+ return [int(line) for line in lines]
+
+
+def evaluate(label_path,
+ result_path,
+ label_split_file,
+ current_class=0,
+ coco=False,
+ score_thresh=-1):
+ dt_annos = kitti.get_label_annos(result_path)
+ if score_thresh > 0:
+ dt_annos = kitti.filter_annos_low_score(dt_annos, score_thresh)
+ val_image_ids = _read_imageset_file(label_split_file)
+ gt_annos = kitti.get_label_annos(label_path, val_image_ids)
+ if coco:
+ return get_coco_eval_result(gt_annos, dt_annos, current_class)
+ else:
+ return get_official_eval_result(gt_annos, dt_annos, current_class)
+
+
+if __name__ == '__main__':
+ fire.Fire()
diff --git a/PaddleCV/Paddle3D/PointRCNN/tools/kitti_object_eval_python/kitti_common.py b/PaddleCV/Paddle3D/PointRCNN/tools/kitti_object_eval_python/kitti_common.py
new file mode 100644
index 0000000000000000000000000000000000000000..e7e254ea4a27af9656757bbfb1f932c1348f59fe
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/tools/kitti_object_eval_python/kitti_common.py
@@ -0,0 +1,411 @@
+import concurrent.futures as futures
+import os
+import pathlib
+import re
+from collections import OrderedDict
+
+import numpy as np
+from skimage import io
+
+def get_image_index_str(img_idx):
+ return "{:06d}".format(img_idx)
+
+
+def get_kitti_info_path(idx,
+ prefix,
+ info_type='image_2',
+ file_tail='.png',
+ training=True,
+ relative_path=True):
+ img_idx_str = get_image_index_str(idx)
+ img_idx_str += file_tail
+ prefix = pathlib.Path(prefix)
+ if training:
+ file_path = pathlib.Path('training') / info_type / img_idx_str
+ else:
+ file_path = pathlib.Path('testing') / info_type / img_idx_str
+ if not (prefix / file_path).exists():
+ raise ValueError("file not exist: {}".format(file_path))
+ if relative_path:
+ return str(file_path)
+ else:
+ return str(prefix / file_path)
+
+
+def get_image_path(idx, prefix, training=True, relative_path=True):
+ return get_kitti_info_path(idx, prefix, 'image_2', '.png', training,
+ relative_path)
+
+
+def get_label_path(idx, prefix, training=True, relative_path=True):
+ return get_kitti_info_path(idx, prefix, 'label_2', '.txt', training,
+ relative_path)
+
+
+def get_velodyne_path(idx, prefix, training=True, relative_path=True):
+ return get_kitti_info_path(idx, prefix, 'velodyne', '.bin', training,
+ relative_path)
+
+
+def get_calib_path(idx, prefix, training=True, relative_path=True):
+ return get_kitti_info_path(idx, prefix, 'calib', '.txt', training,
+ relative_path)
+
+
+def _extend_matrix(mat):
+ mat = np.concatenate([mat, np.array([[0., 0., 0., 1.]])], axis=0)
+ return mat
+
+
+def get_kitti_image_info(path,
+ training=True,
+ label_info=True,
+ velodyne=False,
+ calib=False,
+ image_ids=7481,
+ extend_matrix=True,
+ num_worker=8,
+ relative_path=True,
+ with_imageshape=True):
+ # image_infos = []
+ root_path = pathlib.Path(path)
+ if not isinstance(image_ids, list):
+ image_ids = list(range(image_ids))
+
+ def map_func(idx):
+ image_info = {'image_idx': idx}
+ annotations = None
+ if velodyne:
+ image_info['velodyne_path'] = get_velodyne_path(
+ idx, path, training, relative_path)
+ image_info['img_path'] = get_image_path(idx, path, training,
+ relative_path)
+ if with_imageshape:
+ img_path = image_info['img_path']
+ if relative_path:
+ img_path = str(root_path / img_path)
+ image_info['img_shape'] = np.array(
+ io.imread(img_path).shape[:2], dtype=np.int32)
+ if label_info:
+ label_path = get_label_path(idx, path, training, relative_path)
+ if relative_path:
+ label_path = str(root_path / label_path)
+ annotations = get_label_anno(label_path)
+ if calib:
+ calib_path = get_calib_path(
+ idx, path, training, relative_path=False)
+ with open(calib_path, 'r') as f:
+ lines = f.readlines()
+ P0 = np.array(
+ [float(info) for info in lines[0].split(' ')[1:13]]).reshape(
+ [3, 4])
+ P1 = np.array(
+ [float(info) for info in lines[1].split(' ')[1:13]]).reshape(
+ [3, 4])
+ P2 = np.array(
+ [float(info) for info in lines[2].split(' ')[1:13]]).reshape(
+ [3, 4])
+ P3 = np.array(
+ [float(info) for info in lines[3].split(' ')[1:13]]).reshape(
+ [3, 4])
+ if extend_matrix:
+ P0 = _extend_matrix(P0)
+ P1 = _extend_matrix(P1)
+ P2 = _extend_matrix(P2)
+ P3 = _extend_matrix(P3)
+ image_info['calib/P0'] = P0
+ image_info['calib/P1'] = P1
+ image_info['calib/P2'] = P2
+ image_info['calib/P3'] = P3
+ R0_rect = np.array([
+ float(info) for info in lines[4].split(' ')[1:10]
+ ]).reshape([3, 3])
+ if extend_matrix:
+ rect_4x4 = np.zeros([4, 4], dtype=R0_rect.dtype)
+ rect_4x4[3, 3] = 1.
+ rect_4x4[:3, :3] = R0_rect
+ else:
+ rect_4x4 = R0_rect
+ image_info['calib/R0_rect'] = rect_4x4
+ Tr_velo_to_cam = np.array([
+ float(info) for info in lines[5].split(' ')[1:13]
+ ]).reshape([3, 4])
+ Tr_imu_to_velo = np.array([
+ float(info) for info in lines[6].split(' ')[1:13]
+ ]).reshape([3, 4])
+ if extend_matrix:
+ Tr_velo_to_cam = _extend_matrix(Tr_velo_to_cam)
+ Tr_imu_to_velo = _extend_matrix(Tr_imu_to_velo)
+ image_info['calib/Tr_velo_to_cam'] = Tr_velo_to_cam
+ image_info['calib/Tr_imu_to_velo'] = Tr_imu_to_velo
+ if annotations is not None:
+ image_info['annos'] = annotations
+ add_difficulty_to_annos(image_info)
+ return image_info
+
+ with futures.ThreadPoolExecutor(num_worker) as executor:
+ image_infos = executor.map(map_func, image_ids)
+ return list(image_infos)
+
+
+def filter_kitti_anno(image_anno,
+ used_classes,
+ used_difficulty=None,
+ dontcare_iou=None):
+ if not isinstance(used_classes, (list, tuple)):
+ used_classes = [used_classes]
+ img_filtered_annotations = {}
+ relevant_annotation_indices = [
+ i for i, x in enumerate(image_anno['name']) if x in used_classes
+ ]
+ for key in image_anno.keys():
+ img_filtered_annotations[key] = (
+ image_anno[key][relevant_annotation_indices])
+ if used_difficulty is not None:
+ relevant_annotation_indices = [
+ i for i, x in enumerate(img_filtered_annotations['difficulty'])
+ if x in used_difficulty
+ ]
+ for key in image_anno.keys():
+ img_filtered_annotations[key] = (
+ img_filtered_annotations[key][relevant_annotation_indices])
+
+ if 'DontCare' in used_classes and dontcare_iou is not None:
+ dont_care_indices = [
+ i for i, x in enumerate(img_filtered_annotations['name'])
+ if x == 'DontCare'
+ ]
+ # bounding box format [y_min, x_min, y_max, x_max]
+ all_boxes = img_filtered_annotations['bbox']
+ ious = iou(all_boxes, all_boxes[dont_care_indices])
+
+ # Remove all bounding boxes that overlap with a dontcare region.
+ if ious.size > 0:
+ boxes_to_remove = np.amax(ious, axis=1) > dontcare_iou
+ for key in image_anno.keys():
+ img_filtered_annotations[key] = (img_filtered_annotations[key][
+ np.logical_not(boxes_to_remove)])
+ return img_filtered_annotations
+
+def filter_annos_low_score(image_annos, thresh):
+ new_image_annos = []
+ for anno in image_annos:
+ img_filtered_annotations = {}
+ relevant_annotation_indices = [
+ i for i, s in enumerate(anno['score']) if s >= thresh
+ ]
+ for key in anno.keys():
+ img_filtered_annotations[key] = (
+ anno[key][relevant_annotation_indices])
+ new_image_annos.append(img_filtered_annotations)
+ return new_image_annos
+
+def kitti_result_line(result_dict, precision=4):
+ prec_float = "{" + ":.{}f".format(precision) + "}"
+ res_line = []
+ all_field_default = OrderedDict([
+ ('name', None),
+ ('truncated', -1),
+ ('occluded', -1),
+ ('alpha', -10),
+ ('bbox', None),
+ ('dimensions', [-1, -1, -1]),
+ ('location', [-1000, -1000, -1000]),
+ ('rotation_y', -10),
+ ('score', None),
+ ])
+ res_dict = [(key, None) for key, val in all_field_default.items()]
+ res_dict = OrderedDict(res_dict)
+ for key, val in result_dict.items():
+ if all_field_default[key] is None and val is None:
+ raise ValueError("you must specify a value for {}".format(key))
+ res_dict[key] = val
+
+ for key, val in res_dict.items():
+ if key == 'name':
+ res_line.append(val)
+ elif key in ['truncated', 'alpha', 'rotation_y', 'score']:
+ if val is None:
+ res_line.append(str(all_field_default[key]))
+ else:
+ res_line.append(prec_float.format(val))
+ elif key == 'occluded':
+ if val is None:
+ res_line.append(str(all_field_default[key]))
+ else:
+ res_line.append('{}'.format(val))
+ elif key in ['bbox', 'dimensions', 'location']:
+ if val is None:
+ res_line += [str(v) for v in all_field_default[key]]
+ else:
+ res_line += [prec_float.format(v) for v in val]
+ else:
+ raise ValueError("unknown key. supported key:{}".format(
+ res_dict.keys()))
+ return ' '.join(res_line)
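+
+# Illustrative example: a minimal result_dict such as
+#   {'name': 'Car', 'bbox': [100., 120., 200., 180.], 'score': 0.9,
+#    'dimensions': [1.5, 1.6, 3.9], 'location': [1., 1.5, 20.]}
+# yields a standard 16-field KITTI result line, with -1/-10 defaults for the
+# unspecified fields:
+#   'Car -1 -1 -10 100.0000 120.0000 200.0000 180.0000 1.5000 1.6000 3.9000 1.0000 1.5000 20.0000 -10 0.9000'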
+
+
+def add_difficulty_to_annos(info):
+ min_height = [40, 25,
+ 25] # minimum height for evaluated groundtruth/detections
+ max_occlusion = [
+ 0, 1, 2
+ ] # maximum occlusion level of the groundtruth used for evaluation
+ max_trunc = [
+ 0.15, 0.3, 0.5
+ ] # maximum truncation level of the groundtruth used for evaluation
+ annos = info['annos']
+ dims = annos['dimensions'] # lhw format
+ bbox = annos['bbox']
+ height = bbox[:, 3] - bbox[:, 1]
+ occlusion = annos['occluded']
+ truncation = annos['truncated']
+ diff = []
+    easy_mask = np.ones((len(dims), ), dtype=bool)
+    moderate_mask = np.ones((len(dims), ), dtype=bool)
+    hard_mask = np.ones((len(dims), ), dtype=bool)
+ i = 0
+ for h, o, t in zip(height, occlusion, truncation):
+ if o > max_occlusion[0] or h <= min_height[0] or t > max_trunc[0]:
+ easy_mask[i] = False
+ if o > max_occlusion[1] or h <= min_height[1] or t > max_trunc[1]:
+ moderate_mask[i] = False
+ if o > max_occlusion[2] or h <= min_height[2] or t > max_trunc[2]:
+ hard_mask[i] = False
+ i += 1
+ is_easy = easy_mask
+ is_moderate = np.logical_xor(easy_mask, moderate_mask)
+ is_hard = np.logical_xor(hard_mask, moderate_mask)
+
+ for i in range(len(dims)):
+ if is_easy[i]:
+ diff.append(0)
+ elif is_moderate[i]:
+ diff.append(1)
+ elif is_hard[i]:
+ diff.append(2)
+ else:
+ diff.append(-1)
+ annos["difficulty"] = np.array(diff, np.int32)
+ return diff
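+
+# The difficulty codes follow the official KITTI protocol:
+# 0 = Easy, 1 = Moderate, 2 = Hard, -1 = ignored (fails all three criteria).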
+
+
+def get_label_anno(label_path):
+ annotations = {}
+ annotations.update({
+ 'name': [],
+ 'truncated': [],
+ 'occluded': [],
+ 'alpha': [],
+ 'bbox': [],
+ 'dimensions': [],
+ 'location': [],
+ 'rotation_y': []
+ })
+ with open(label_path, 'r') as f:
+ lines = f.readlines()
+ # if len(lines) == 0 or len(lines[0]) < 15:
+ # content = []
+ # else:
+ content = [line.strip().split(' ') for line in lines]
+ annotations['name'] = np.array([x[0] for x in content])
+ annotations['truncated'] = np.array([float(x[1]) for x in content])
+ annotations['occluded'] = np.array([int(x[2]) for x in content])
+ annotations['alpha'] = np.array([float(x[3]) for x in content])
+ annotations['bbox'] = np.array(
+ [[float(info) for info in x[4:8]] for x in content]).reshape(-1, 4)
+ # dimensions will convert hwl format to standard lhw(camera) format.
+ annotations['dimensions'] = np.array(
+ [[float(info) for info in x[8:11]] for x in content]).reshape(
+ -1, 3)[:, [2, 0, 1]]
+ annotations['location'] = np.array(
+ [[float(info) for info in x[11:14]] for x in content]).reshape(-1, 3)
+ annotations['rotation_y'] = np.array(
+ [float(x[14]) for x in content]).reshape(-1)
+ if len(content) != 0 and len(content[0]) == 16: # have score
+ annotations['score'] = np.array([float(x[15]) for x in content])
+ else:
+ annotations['score'] = np.zeros([len(annotations['bbox'])])
+ return annotations
+
+def get_label_annos(label_folder, image_ids=None):
+ if image_ids is None:
+ filepaths = pathlib.Path(label_folder).glob('*.txt')
+        prog = re.compile(r'^\d{6}\.txt$')
+ filepaths = filter(lambda f: prog.match(f.name), filepaths)
+ image_ids = [int(p.stem) for p in filepaths]
+ image_ids = sorted(image_ids)
+ if not isinstance(image_ids, list):
+ image_ids = list(range(image_ids))
+ annos = []
+ label_folder = pathlib.Path(label_folder)
+ for idx in image_ids:
+ image_idx = get_image_index_str(idx)
+ label_filename = label_folder / (image_idx + '.txt')
+ annos.append(get_label_anno(label_filename))
+ return annos
+
+def area(boxes, add1=False):
+    """Computes area of boxes.
+
+    Args:
+        boxes: Numpy array with shape [N, 4] holding N boxes
+        add1: if True, use the pixel-inclusive convention (+1 per side)
+
+    Returns:
+        a numpy array with shape [N] representing box areas
+    """
+ if add1:
+ return (boxes[:, 2] - boxes[:, 0] + 1.0) * (
+ boxes[:, 3] - boxes[:, 1] + 1.0)
+ else:
+ return (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
+
+
+def intersection(boxes1, boxes2, add1=False):
+    """Compute pairwise intersection areas between boxes.
+
+    Args:
+        boxes1: a numpy array with shape [N, 4] holding N boxes
+        boxes2: a numpy array with shape [M, 4] holding M boxes
+        add1: if True, use the pixel-inclusive convention (+1 per side)
+
+    Returns:
+        a numpy array with shape [N, M] representing pairwise intersection areas
+    """
+ [y_min1, x_min1, y_max1, x_max1] = np.split(boxes1, 4, axis=1)
+ [y_min2, x_min2, y_max2, x_max2] = np.split(boxes2, 4, axis=1)
+
+ all_pairs_min_ymax = np.minimum(y_max1, np.transpose(y_max2))
+ all_pairs_max_ymin = np.maximum(y_min1, np.transpose(y_min2))
+ if add1:
+ all_pairs_min_ymax += 1.0
+ intersect_heights = np.maximum(
+ np.zeros(all_pairs_max_ymin.shape),
+ all_pairs_min_ymax - all_pairs_max_ymin)
+
+ all_pairs_min_xmax = np.minimum(x_max1, np.transpose(x_max2))
+ all_pairs_max_xmin = np.maximum(x_min1, np.transpose(x_min2))
+ if add1:
+ all_pairs_min_xmax += 1.0
+ intersect_widths = np.maximum(
+ np.zeros(all_pairs_max_xmin.shape),
+ all_pairs_min_xmax - all_pairs_max_xmin)
+ return intersect_heights * intersect_widths
+
+
+def iou(boxes1, boxes2, add1=False):
+ """Computes pairwise intersection-over-union between box collections.
+
+ Args:
+ boxes1: a numpy array with shape [N, 4] holding N boxes.
+        boxes2: a numpy array with shape [M, 4] holding M boxes.
+
+ Returns:
+ a numpy array with shape [N, M] representing pairwise iou scores.
+ """
+ intersect = intersection(boxes1, boxes2, add1)
+ area1 = area(boxes1, add1)
+ area2 = area(boxes2, add1)
+ union = np.expand_dims(
+ area1, axis=1) + np.expand_dims(
+ area2, axis=0) - intersect
+ return intersect / union
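+
+# Illustrative example with [y_min, x_min, y_max, x_max] boxes:
+#   iou(np.array([[0., 0., 10., 10.]]), np.array([[0., 0., 5., 10.]]))
+# gives [[0.5]]: intersection 50 over union 100 + 50 - 50 = 100.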
diff --git a/PaddleCV/Paddle3D/PointRCNN/tools/kitti_object_eval_python/rotate_iou.py b/PaddleCV/Paddle3D/PointRCNN/tools/kitti_object_eval_python/rotate_iou.py
new file mode 100644
index 0000000000000000000000000000000000000000..cd694ef5c5a0c9fac9595a17743a35db37d48820
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/tools/kitti_object_eval_python/rotate_iou.py
@@ -0,0 +1,329 @@
+#####################
+# Based on https://github.com/hongzhenwang/RRPN-revise
+# Licensed under The MIT License
+# Author: yanyan, scrin@foxmail.com
+#####################
+import math
+
+import numba
+import numpy as np
+from numba import cuda
+
+@numba.jit(nopython=True)
+def div_up(m, n):
+ return m // n + (m % n > 0)
+
+@cuda.jit('(float32[:], float32[:], float32[:])', device=True, inline=True)
+def trangle_area(a, b, c):
+ return ((a[0] - c[0]) * (b[1] - c[1]) - (a[1] - c[1]) *
+ (b[0] - c[0])) / 2.0
+
+
+@cuda.jit('(float32[:], int32)', device=True, inline=True)
+def area(int_pts, num_of_inter):
+ area_val = 0.0
+ for i in range(num_of_inter - 2):
+ area_val += abs(
+ trangle_area(int_pts[:2], int_pts[2 * i + 2:2 * i + 4],
+ int_pts[2 * i + 4:2 * i + 6]))
+ return area_val
+
+
+@cuda.jit('(float32[:], int32)', device=True, inline=True)
+def sort_vertex_in_convex_polygon(int_pts, num_of_inter):
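+    # Sort the intersection vertices angularly around their centroid using a
+    # monotonic pseudo-angle key (so no atan2 is needed on the device) and an
+    # insertion sort; area() can then fan-triangulate the convex polygon.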
+ if num_of_inter > 0:
+ center = cuda.local.array((2, ), dtype=numba.float32)
+ center[:] = 0.0
+ for i in range(num_of_inter):
+ center[0] += int_pts[2 * i]
+ center[1] += int_pts[2 * i + 1]
+ center[0] /= num_of_inter
+ center[1] /= num_of_inter
+ v = cuda.local.array((2, ), dtype=numba.float32)
+ vs = cuda.local.array((16, ), dtype=numba.float32)
+ for i in range(num_of_inter):
+ v[0] = int_pts[2 * i] - center[0]
+ v[1] = int_pts[2 * i + 1] - center[1]
+ d = math.sqrt(v[0] * v[0] + v[1] * v[1])
+ v[0] = v[0] / d
+ v[1] = v[1] / d
+ if v[1] < 0:
+ v[0] = -2 - v[0]
+ vs[i] = v[0]
+ j = 0
+ temp = 0
+ for i in range(1, num_of_inter):
+ if vs[i - 1] > vs[i]:
+ temp = vs[i]
+ tx = int_pts[2 * i]
+ ty = int_pts[2 * i + 1]
+ j = i
+ while j > 0 and vs[j - 1] > temp:
+ vs[j] = vs[j - 1]
+ int_pts[j * 2] = int_pts[j * 2 - 2]
+ int_pts[j * 2 + 1] = int_pts[j * 2 - 1]
+ j -= 1
+
+ vs[j] = temp
+ int_pts[j * 2] = tx
+ int_pts[j * 2 + 1] = ty
+
+
+@cuda.jit(
+ '(float32[:], float32[:], int32, int32, float32[:])',
+ device=True,
+ inline=True)
+def line_segment_intersection(pts1, pts2, i, j, temp_pts):
+ A = cuda.local.array((2, ), dtype=numba.float32)
+ B = cuda.local.array((2, ), dtype=numba.float32)
+ C = cuda.local.array((2, ), dtype=numba.float32)
+ D = cuda.local.array((2, ), dtype=numba.float32)
+
+ A[0] = pts1[2 * i]
+ A[1] = pts1[2 * i + 1]
+
+ B[0] = pts1[2 * ((i + 1) % 4)]
+ B[1] = pts1[2 * ((i + 1) % 4) + 1]
+
+ C[0] = pts2[2 * j]
+ C[1] = pts2[2 * j + 1]
+
+ D[0] = pts2[2 * ((j + 1) % 4)]
+ D[1] = pts2[2 * ((j + 1) % 4) + 1]
+ BA0 = B[0] - A[0]
+ BA1 = B[1] - A[1]
+ DA0 = D[0] - A[0]
+ CA0 = C[0] - A[0]
+ DA1 = D[1] - A[1]
+ CA1 = C[1] - A[1]
+ acd = DA1 * CA0 > CA1 * DA0
+ bcd = (D[1] - B[1]) * (C[0] - B[0]) > (C[1] - B[1]) * (D[0] - B[0])
+ if acd != bcd:
+ abc = CA1 * BA0 > BA1 * CA0
+ abd = DA1 * BA0 > BA1 * DA0
+ if abc != abd:
+ DC0 = D[0] - C[0]
+ DC1 = D[1] - C[1]
+ ABBA = A[0] * B[1] - B[0] * A[1]
+ CDDC = C[0] * D[1] - D[0] * C[1]
+ DH = BA1 * DC0 - BA0 * DC1
+ Dx = ABBA * DC0 - BA0 * CDDC
+ Dy = ABBA * DC1 - BA1 * CDDC
+ temp_pts[0] = Dx / DH
+ temp_pts[1] = Dy / DH
+ return True
+ return False
+
+
+@cuda.jit(
+ '(float32[:], float32[:], int32, int32, float32[:])',
+ device=True,
+ inline=True)
+def line_segment_intersection_v1(pts1, pts2, i, j, temp_pts):
+ a = cuda.local.array((2, ), dtype=numba.float32)
+ b = cuda.local.array((2, ), dtype=numba.float32)
+ c = cuda.local.array((2, ), dtype=numba.float32)
+ d = cuda.local.array((2, ), dtype=numba.float32)
+
+ a[0] = pts1[2 * i]
+ a[1] = pts1[2 * i + 1]
+
+ b[0] = pts1[2 * ((i + 1) % 4)]
+ b[1] = pts1[2 * ((i + 1) % 4) + 1]
+
+ c[0] = pts2[2 * j]
+ c[1] = pts2[2 * j + 1]
+
+ d[0] = pts2[2 * ((j + 1) % 4)]
+ d[1] = pts2[2 * ((j + 1) % 4) + 1]
+
+ area_abc = trangle_area(a, b, c)
+ area_abd = trangle_area(a, b, d)
+
+ if area_abc * area_abd >= 0:
+ return False
+
+ area_cda = trangle_area(c, d, a)
+ area_cdb = area_cda + area_abc - area_abd
+
+ if area_cda * area_cdb >= 0:
+ return False
+ t = area_cda / (area_abd - area_abc)
+
+ dx = t * (b[0] - a[0])
+ dy = t * (b[1] - a[1])
+ temp_pts[0] = a[0] + dx
+ temp_pts[1] = a[1] + dy
+ return True
+
+
+@cuda.jit('(float32, float32, float32[:])', device=True, inline=True)
+def point_in_quadrilateral(pt_x, pt_y, corners):
+ ab0 = corners[2] - corners[0]
+ ab1 = corners[3] - corners[1]
+
+ ad0 = corners[6] - corners[0]
+ ad1 = corners[7] - corners[1]
+
+ ap0 = pt_x - corners[0]
+ ap1 = pt_y - corners[1]
+
+ abab = ab0 * ab0 + ab1 * ab1
+ abap = ab0 * ap0 + ab1 * ap1
+ adad = ad0 * ad0 + ad1 * ad1
+ adap = ad0 * ap0 + ad1 * ap1
+
+ return abab >= abap and abap >= 0 and adad >= adap and adap >= 0
+
+
+@cuda.jit('(float32[:], float32[:], float32[:])', device=True, inline=True)
+def quadrilateral_intersection(pts1, pts2, int_pts):
+ num_of_inter = 0
+ for i in range(4):
+ if point_in_quadrilateral(pts1[2 * i], pts1[2 * i + 1], pts2):
+ int_pts[num_of_inter * 2] = pts1[2 * i]
+ int_pts[num_of_inter * 2 + 1] = pts1[2 * i + 1]
+ num_of_inter += 1
+ if point_in_quadrilateral(pts2[2 * i], pts2[2 * i + 1], pts1):
+ int_pts[num_of_inter * 2] = pts2[2 * i]
+ int_pts[num_of_inter * 2 + 1] = pts2[2 * i + 1]
+ num_of_inter += 1
+ temp_pts = cuda.local.array((2, ), dtype=numba.float32)
+ for i in range(4):
+ for j in range(4):
+ has_pts = line_segment_intersection(pts1, pts2, i, j, temp_pts)
+ if has_pts:
+ int_pts[num_of_inter * 2] = temp_pts[0]
+ int_pts[num_of_inter * 2 + 1] = temp_pts[1]
+ num_of_inter += 1
+
+ return num_of_inter
+
+
+@cuda.jit('(float32[:], float32[:])', device=True, inline=True)
+def rbbox_to_corners(corners, rbbox):
+ # generate clockwise corners and rotate it clockwise
+ angle = rbbox[4]
+ a_cos = math.cos(angle)
+ a_sin = math.sin(angle)
+ center_x = rbbox[0]
+ center_y = rbbox[1]
+ x_d = rbbox[2]
+ y_d = rbbox[3]
+ corners_x = cuda.local.array((4, ), dtype=numba.float32)
+ corners_y = cuda.local.array((4, ), dtype=numba.float32)
+ corners_x[0] = -x_d / 2
+ corners_x[1] = -x_d / 2
+ corners_x[2] = x_d / 2
+ corners_x[3] = x_d / 2
+ corners_y[0] = -y_d / 2
+ corners_y[1] = y_d / 2
+ corners_y[2] = y_d / 2
+ corners_y[3] = -y_d / 2
+ for i in range(4):
+ corners[2 *
+ i] = a_cos * corners_x[i] + a_sin * corners_y[i] + center_x
+ corners[2 * i
+ + 1] = -a_sin * corners_x[i] + a_cos * corners_y[i] + center_y
+
+
+@cuda.jit('(float32[:], float32[:])', device=True, inline=True)
+def inter(rbbox1, rbbox2):
+ corners1 = cuda.local.array((8, ), dtype=numba.float32)
+ corners2 = cuda.local.array((8, ), dtype=numba.float32)
+ intersection_corners = cuda.local.array((16, ), dtype=numba.float32)
+
+ rbbox_to_corners(corners1, rbbox1)
+ rbbox_to_corners(corners2, rbbox2)
+
+ num_intersection = quadrilateral_intersection(corners1, corners2,
+ intersection_corners)
+ sort_vertex_in_convex_polygon(intersection_corners, num_intersection)
+ # print(intersection_corners.reshape([-1, 2])[:num_intersection])
+
+ return area(intersection_corners, num_intersection)
+
+
+@cuda.jit('(float32[:], float32[:], int32)', device=True, inline=True)
+def devRotateIoUEval(rbox1, rbox2, criterion=-1):
+ area1 = rbox1[2] * rbox1[3]
+ area2 = rbox2[2] * rbox2[3]
+ area_inter = inter(rbox1, rbox2)
+ if criterion == -1:
+ return area_inter / (area1 + area2 - area_inter)
+ elif criterion == 0:
+ return area_inter / area1
+ elif criterion == 1:
+ return area_inter / area2
+ else:
+ return area_inter
+
+@cuda.jit('(int64, int64, float32[:], float32[:], float32[:], int32)', fastmath=False)
+def rotate_iou_kernel_eval(N, K, dev_boxes, dev_query_boxes, dev_iou, criterion=-1):
+ threadsPerBlock = 8 * 8
+ row_start = cuda.blockIdx.x
+ col_start = cuda.blockIdx.y
+ tx = cuda.threadIdx.x
+ row_size = min(N - row_start * threadsPerBlock, threadsPerBlock)
+ col_size = min(K - col_start * threadsPerBlock, threadsPerBlock)
+ block_boxes = cuda.shared.array(shape=(64 * 5, ), dtype=numba.float32)
+ block_qboxes = cuda.shared.array(shape=(64 * 5, ), dtype=numba.float32)
+
+ dev_query_box_idx = threadsPerBlock * col_start + tx
+ dev_box_idx = threadsPerBlock * row_start + tx
+ if (tx < col_size):
+ block_qboxes[tx * 5 + 0] = dev_query_boxes[dev_query_box_idx * 5 + 0]
+ block_qboxes[tx * 5 + 1] = dev_query_boxes[dev_query_box_idx * 5 + 1]
+ block_qboxes[tx * 5 + 2] = dev_query_boxes[dev_query_box_idx * 5 + 2]
+ block_qboxes[tx * 5 + 3] = dev_query_boxes[dev_query_box_idx * 5 + 3]
+ block_qboxes[tx * 5 + 4] = dev_query_boxes[dev_query_box_idx * 5 + 4]
+ if (tx < row_size):
+ block_boxes[tx * 5 + 0] = dev_boxes[dev_box_idx * 5 + 0]
+ block_boxes[tx * 5 + 1] = dev_boxes[dev_box_idx * 5 + 1]
+ block_boxes[tx * 5 + 2] = dev_boxes[dev_box_idx * 5 + 2]
+ block_boxes[tx * 5 + 3] = dev_boxes[dev_box_idx * 5 + 3]
+ block_boxes[tx * 5 + 4] = dev_boxes[dev_box_idx * 5 + 4]
+ cuda.syncthreads()
+ if tx < row_size:
+ for i in range(col_size):
+ offset = row_start * threadsPerBlock * K + col_start * threadsPerBlock + tx * K + i
+ dev_iou[offset] = devRotateIoUEval(block_qboxes[i * 5:i * 5 + 5],
+ block_boxes[tx * 5:tx * 5 + 5], criterion)
+
+
+def rotate_iou_gpu_eval(boxes, query_boxes, criterion=-1, device_id=0):
+ """rotated box iou running in gpu. 500x faster than cpu version
+ (take 5ms in one example with numba.cuda code).
+ convert from [this project](
+ https://github.com/hongzhenwang/RRPN-revise/tree/master/lib/rotation).
+
+ Args:
+ boxes (float tensor: [N, 5]): rbboxes. format: centers, dims,
+ angles(clockwise when positive)
+ query_boxes (float tensor: [K, 5]): [description]
+ device_id (int, optional): Defaults to 0. [description]
+
+ Returns:
+ [type]: [description]
+ """
+ box_dtype = boxes.dtype
+ boxes = boxes.astype(np.float32)
+ query_boxes = query_boxes.astype(np.float32)
+ N = boxes.shape[0]
+ K = query_boxes.shape[0]
+ iou = np.zeros((N, K), dtype=np.float32)
+ if N == 0 or K == 0:
+ return iou
+ threadsPerBlock = 8 * 8
+ cuda.select_device(device_id)
+ blockspergrid = (div_up(N, threadsPerBlock), div_up(K, threadsPerBlock))
+
+ stream = cuda.stream()
+ with stream.auto_synchronize():
+ boxes_dev = cuda.to_device(boxes.reshape([-1]), stream)
+ query_boxes_dev = cuda.to_device(query_boxes.reshape([-1]), stream)
+ iou_dev = cuda.to_device(iou.reshape([-1]), stream)
+ rotate_iou_kernel_eval[blockspergrid, threadsPerBlock, stream](
+ N, K, boxes_dev, query_boxes_dev, iou_dev, criterion)
+ iou_dev.copy_to_host(iou.reshape([-1]), stream=stream)
+ return iou.astype(boxes.dtype)
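+
+
+# Illustrative usage (requires a CUDA device visible to numba):
+#   boxes = np.array([[0., 0., 2., 2., 0.]], dtype=np.float32)
+#   query = np.array([[0., 0., 2., 2., np.pi / 2]], dtype=np.float32)
+#   rotate_iou_gpu_eval(boxes, query)  # ~[[1.0]], the square maps onto itself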
diff --git a/PaddleCV/Paddle3D/PointRCNN/train.py b/PaddleCV/Paddle3D/PointRCNN/train.py
new file mode 100644
index 0000000000000000000000000000000000000000..b7a39ca4555defbabdeee204c954bbcdfb7f8ee9
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/train.py
@@ -0,0 +1,240 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
+#
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
+
+import os
+import sys
+import time
+import shutil
+import argparse
+import logging
+import numpy as np
+import paddle
+import paddle.fluid as fluid
+from paddle.fluid.layers import control_flow
+from paddle.fluid.contrib.extend_optimizer import extend_with_decoupled_weight_decay
+import paddle.fluid.layers.learning_rate_scheduler as lr_scheduler
+
+from models.point_rcnn import PointRCNN
+from data.kitti_rcnn_reader import KittiRCNNReader
+from utils.run_utils import *
+from utils.config import cfg, load_config, set_config_from_list
+from utils.optimizer import optimize
+
+logging.root.handlers = []
+FORMAT = '%(asctime)s-%(levelname)s: %(message)s'
+logging.basicConfig(level=logging.INFO, format=FORMAT, stream=sys.stdout)
+logger = logging.getLogger(__name__)
+
+
+def parse_args():
+    parser = argparse.ArgumentParser("PointRCNN 3D object detection train script")
+ parser.add_argument(
+ '--cfg',
+ type=str,
+ default='cfgs/default.yml',
+ help='specify the config for training')
+ parser.add_argument(
+ '--train_mode',
+ type=str,
+ default='rpn',
+ required=True,
+        help='specify the training mode, one of [rpn, rcnn, rcnn_offline]')
+ parser.add_argument(
+ '--batch_size',
+ type=int,
+ default=16,
+ required=True,
+ help='training batch size, default 16')
+ parser.add_argument(
+ '--epoch',
+ type=int,
+ default=200,
+ required=True,
+ help='epoch number. default 200.')
+ parser.add_argument(
+ '--save_dir',
+ type=str,
+ default='checkpoints',
+        help='directory name to save training snapshots')
+ parser.add_argument(
+ '--resume',
+ type=str,
+ default=None,
+ help='path to resume training based on previous checkpoints. '
+ 'None for not resuming any checkpoints.')
+ parser.add_argument(
+ '--resume_epoch',
+ type=int,
+ default=0,
+ help='resume epoch id')
+ parser.add_argument(
+ '--data_dir',
+ type=str,
+ default='./data',
+ help='KITTI dataset root directory')
+ parser.add_argument(
+ '--gt_database',
+ type=str,
+ default='data/gt_database/train_gt_database_3level_Car.pkl',
+ help='generated gt database for augmentation')
+ parser.add_argument(
+ '--rcnn_training_roi_dir',
+ type=str,
+ default=None,
+ help='specify the saved rois for rcnn training when using rcnn_offline mode')
+ parser.add_argument(
+ '--rcnn_training_feature_dir',
+ type=str,
+ default=None,
+ help='specify the saved features for rcnn training when using rcnn_offline mode')
+ parser.add_argument(
+ '--log_interval',
+ type=int,
+ default=1,
+ help='mini-batch interval to log.')
+ parser.add_argument(
+ '--set',
+ dest='set_cfgs',
+ default=None,
+ nargs=argparse.REMAINDER,
+ help='set extra config keys if needed.')
+ args = parser.parse_args()
+ return args
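+
+# Typical invocation with the arguments above, e.g.:
+#   python train.py --cfg=cfgs/default.yml --train_mode=rpn \
+#                   --batch_size=16 --epoch=200 --save_dir=checkpoints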
+
+
+def train():
+ args = parse_args()
+ print_arguments(args)
+ # check whether the installed paddle is compiled with GPU
+ # PointRCNN model can only run on GPU
+ check_gpu(True)
+
+ load_config(args.cfg)
+ if args.set_cfgs is not None:
+ set_config_from_list(args.set_cfgs)
+
+ if args.train_mode == 'rpn':
+ cfg.RPN.ENABLED = True
+ cfg.RCNN.ENABLED = False
+ elif args.train_mode == 'rcnn':
+ cfg.RCNN.ENABLED = True
+ cfg.RPN.ENABLED = cfg.RPN.FIXED = True
+ elif args.train_mode == 'rcnn_offline':
+ cfg.RCNN.ENABLED = True
+ cfg.RPN.ENABLED = False
+ else:
+ raise NotImplementedError("unknown train mode: {}".format(args.train_mode))
+
+ checkpoints_dir = os.path.join(args.save_dir, args.train_mode)
+ if not os.path.isdir(checkpoints_dir):
+ os.makedirs(checkpoints_dir)
+
+ kitti_rcnn_reader = KittiRCNNReader(data_dir=args.data_dir,
+ npoints=cfg.RPN.NUM_POINTS,
+ split=cfg.TRAIN.SPLIT,
+ mode='TRAIN',
+ classes=cfg.CLASSES,
+ rcnn_training_roi_dir=args.rcnn_training_roi_dir,
+ rcnn_training_feature_dir=args.rcnn_training_feature_dir,
+ gt_database_dir=args.gt_database)
+ num_samples = len(kitti_rcnn_reader)
+ steps_per_epoch = int(num_samples / args.batch_size)
+ logger.info("Total {} samples, {} batch per epoch.".format(num_samples, steps_per_epoch))
+ boundaries = [i * steps_per_epoch for i in cfg.TRAIN.DECAY_STEP_LIST]
+ values = [cfg.TRAIN.LR * (cfg.TRAIN.LR_DECAY ** i) for i in range(len(boundaries) + 1)]
+
+ place = fluid.CUDAPlace(0)
+ exe = fluid.Executor(place)
+
+ # build model
+ startup = fluid.Program()
+ train_prog = fluid.Program()
+ with fluid.program_guard(train_prog, startup):
+ with fluid.unique_name.guard():
+ train_model = PointRCNN(cfg, args.batch_size, True, 'TRAIN')
+ train_model.build()
+ train_pyreader = train_model.get_pyreader()
+ train_feeds = train_model.get_feeds()
+ train_outputs = train_model.get_outputs()
+ train_loss = train_outputs['loss']
+ lr = optimize(train_loss,
+ learning_rate=cfg.TRAIN.LR,
+ warmup_factor=1. / cfg.TRAIN.DIV_FACTOR,
+ decay_factor=1e-5,
+ total_step=steps_per_epoch * args.epoch,
+ warmup_pct=cfg.TRAIN.PCT_START,
+ train_program=train_prog,
+ startup_prog=startup,
+ weight_decay=cfg.TRAIN.WEIGHT_DECAY,
+ clip_norm=cfg.TRAIN.GRAD_NORM_CLIP)
+ train_keys, train_values = parse_outputs(train_outputs, 'loss')
+
+ exe.run(startup)
+
+ if args.resume:
+ assert os.path.exists(args.resume), \
+ "Given resume weight dir {} not exist.".format(args.resume)
+ def if_exist(var):
+ logger.debug("{}: {}".format(var.name, os.path.exists(os.path.join(args.resume, var.name))))
+ return os.path.exists(os.path.join(args.resume, var.name))
+ fluid.io.load_vars(
+ exe, args.resume, predicate=if_exist, main_program=train_prog)
+
+ build_strategy = fluid.BuildStrategy()
+ build_strategy.memory_optimize = False
+ build_strategy.enable_inplace = False
+ build_strategy.fuse_all_optimizer_ops = False
+ train_compile_prog = fluid.compiler.CompiledProgram(
+ train_prog).with_data_parallel(loss_name=train_loss.name,
+ build_strategy=build_strategy)
+
+ def save_model(exe, prog, path):
+ if os.path.isdir(path):
+ shutil.rmtree(path)
+ logger.info("Save model to {}".format(path))
+ fluid.io.save_persistables(exe, path, prog)
+
+ # get reader
+ train_reader = kitti_rcnn_reader.get_multiprocess_reader(args.batch_size, train_feeds, drop_last=True)
+ train_pyreader.decorate_sample_list_generator(train_reader, place)
+
+ train_stat = Stat()
+ for epoch_id in range(args.resume_epoch, args.epoch):
+ try:
+ train_pyreader.start()
+ train_iter = 0
+ train_periods = []
+ while True:
+ cur_time = time.time()
+ train_outs = exe.run(train_compile_prog, fetch_list=train_values + [lr.name])
+ period = time.time() - cur_time
+ train_periods.append(period)
+ train_stat.update(train_keys, train_outs[:-1])
+ if train_iter % args.log_interval == 0:
+ log_str = ""
+ for name, values in zip(train_keys + ['learning_rate'], train_outs):
+ log_str += "{}: {:.6f}, ".format(name, np.mean(values))
+ logger.info("[TRAIN] Epoch {}, batch {}: {}time: {:.2f}".format(epoch_id, train_iter, log_str, period))
+ train_iter += 1
+ except fluid.core.EOFException:
+ logger.info("[TRAIN] Epoch {} finished, {}average time: {:.2f}".format(epoch_id, train_stat.get_mean_log(), np.mean(train_periods[2:])))
+ save_model(exe, train_prog, os.path.join(checkpoints_dir, str(epoch_id)))
+ train_stat.reset()
+ train_periods = []
+ finally:
+ train_pyreader.reset()
+
+
+if __name__ == "__main__":
+ train()
diff --git a/PaddleCV/Paddle3D/PointRCNN/utils/__init__.py b/PaddleCV/Paddle3D/PointRCNN/utils/__init__.py
new file mode 100644
index 0000000000000000000000000000000000000000..cad1d5d9ab5b0e5ed0724ddfc65ef53d14044b76
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/utils/__init__.py
@@ -0,0 +1,14 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
+#
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
+
diff --git a/PaddleCV/Paddle3D/PointRCNN/utils/box_utils.py b/PaddleCV/Paddle3D/PointRCNN/utils/box_utils.py
new file mode 100644
index 0000000000000000000000000000000000000000..49c9ee74a64634e1836d081220996919ffae16a4
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/utils/box_utils.py
@@ -0,0 +1,275 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
+#
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
+"""
+Contains proposal functions
+"""
+
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
+import numpy as np
+import paddle.fluid as fluid
+
+from utils.config import cfg
+
+__all__ = ["boxes3d_to_bev", "box_overlap_rotate", "boxes3d_to_bev", "box_iou", "box_nms"]
+
+
+def boxes3d_to_bev(boxes3d):
+ """
+ Args:
+ boxes3d: [N, 7], (x, y, z, h, w, l, ry)
+ Return:
+ boxes_bev: [N, 5], (x1, y1, x2, y2, ry)
+ """
+ boxes_bev = np.zeros((boxes3d.shape[0], 5), dtype='float32')
+
+ cu, cv = boxes3d[:, 0], boxes3d[:, 2]
+ half_l, half_w = boxes3d[:, 5] / 2, boxes3d[:, 4] / 2
+ boxes_bev[:, 0], boxes_bev[:, 1] = cu - half_l, cv - half_w
+ boxes_bev[:, 2], boxes_bev[:, 3] = cu + half_l, cv + half_w
+ boxes_bev[:, 4] = boxes3d[:, 6]
+ return boxes_bev
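+
+# Illustrative example: a camera-frame box (x=0, y=1.5, z=5, h=1.5, w=1.6, l=4, ry=0)
+# maps to the BEV box (x1=-2.0, y1=4.2, x2=2.0, y2=5.8, ry=0).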
+
+
+def rotate_around_center(center, angle_cos, angle_sin, corners):
+ new_x = (corners[:, 0] - center[0]) * angle_cos + \
+ (corners[:, 1] - center[1]) * angle_sin + center[0]
+ new_y = -(corners[:, 0] - center[0]) * angle_sin + \
+ (corners[:, 1] - center[1]) * angle_cos + center[1]
+ return np.concatenate([new_x[:, np.newaxis], new_y[:, np.newaxis]], axis=-1)
+
+
+def check_rect_cross(p1, p2, q1, q2):
+ return min(p1[0], p2[0]) <= max(q1[0], q2[0]) and \
+ min(q1[0], q2[0]) <= max(p1[0], p2[0]) and \
+ min(p1[1], p2[1]) <= max(q1[1], q2[1]) and \
+ min(q1[1], q2[1]) <= max(p1[1], p2[1])
+
+
+def cross(p1, p2, p0):
+    return (p1[0] - p0[0]) * (p2[1] - p0[1]) - (p2[0] - p0[0]) * (p1[1] - p0[1])
+
+
+def cross_area(a, b):
+ return a[0] * b[1] - a[1] * b[0]
+
+
+def intersection(p1, p0, q1, q0):
+ if not check_rect_cross(p1, p0, q1, q0):
+ return None
+
+ s1 = cross(q0, p1, p0)
+ s2 = cross(p1, q1, p0)
+ s3 = cross(p0, q1, q0)
+ s4 = cross(q1, p1, q0)
+ if not (s1 * s2 > 0 and s3 * s4 > 0):
+ return None
+
+ s5 = cross(q1, p1, p0)
+ if np.abs(s5 - s1) > 1e-8:
+ return np.array([(s5 * q0[0] - s1 * q1[0]) / (s5 - s1),
+ (s5 * q0[1] - s1 * q1[1]) / (s5 - s1)], dtype='float32')
+    else:
+        # solve the two line equations a*x + b*y + c = 0 for p0-p1 and q0-q1
+        a0 = p0[1] - p1[1]
+        b0 = p1[0] - p0[0]
+        c0 = p0[0] * p1[1] - p1[0] * p0[1]
+        a1 = q0[1] - q1[1]
+        b1 = q1[0] - q0[0]
+        c1 = q0[0] * q1[1] - q1[0] * q0[1]
+        D = a0 * b1 - a1 * b0
+        return np.array([(b0 * c1 - b1 * c0) / D, (a1 * c0 - a0 * c1) / D], dtype='float32')
+
+
+def check_in_box2d(box, p):
+ center_x = (box[0] + box[2]) / 2.
+ center_y = (box[1] + box[3]) / 2.
+ angle_cos = np.cos(-box[4])
+ angle_sin = np.sin(-box[4])
+ rot_x = (p[0] - center_x) * angle_cos + (p[1] - center_y) * angle_sin + center_x
+ rot_y = -(p[0] - center_x) * angle_sin + (p[1] - center_y) * angle_cos + center_y
+ return rot_x > box[0] - 1e-5 and rot_x < box[2] + 1e-5 and \
+ rot_y > box[1] - 1e-5 and rot_y < box[3] + 1e-5
+
+
+def point_cmp(a, b, center):
+ return np.arctan2(a[1] - center[1], a[0] - center[0]) > \
+ np.arctan2(b[1] - center[1], b[0] - center[0])
+
+
+def box_overlap_rotate(cur_box, boxes):
+ """
+    Calculate overlap areas between rotated boxes, box format: [x1, y1, x2, y2, angle]
+ """
+ areas = np.zeros((len(boxes), ), dtype='float32')
+ cur_center = [(cur_box[0] + cur_box[2]) / 2., (cur_box[1] + cur_box[3]) / 2.]
+ cur_corners = np.array([
+ [cur_box[0], cur_box[1]], # (x1, y1)
+ [cur_box[2], cur_box[1]], # (x2, y1)
+ [cur_box[2], cur_box[3]], # (x2, y2)
+ [cur_box[0], cur_box[3]], # (x1, y2)
+ [cur_box[0], cur_box[1]], # (x1, y1)
+ ], dtype='float32')
+ cur_angle_cos = np.cos(cur_box[4])
+ cur_angle_sin = np.sin(cur_box[4])
+ cur_corners = rotate_around_center(cur_center, cur_angle_cos, cur_angle_sin, cur_corners)
+
+ for i, box in enumerate(boxes):
+ box_center = [(box[0] + box[2]) / 2., (box[1] + box[3]) / 2.]
+ box_corners = np.array([
+ [box[0], box[1]],
+ [box[2], box[1]],
+ [box[2], box[3]],
+ [box[0], box[3]],
+ [box[0], box[1]],
+ ], dtype='float32')
+ box_angle_cos = np.cos(box[4])
+ box_angle_sin = np.sin(box[4])
+ box_corners = rotate_around_center(box_center, box_angle_cos, box_angle_sin, box_corners)
+
+ cross_points = np.zeros((16, 2), dtype='float32')
+ cnt = 0
+ # get intersection of lines
+ for j in range(4):
+ for k in range(4):
+ inters = intersection(cur_corners[j + 1], cur_corners[j],
+ box_corners[k + 1], box_corners[k])
+ if inters is not None:
+ cross_points[cnt, :] = inters
+ cnt += 1
+ # check corners
+ for l in range(4):
+ if check_in_box2d(cur_box, box_corners[l]):
+ cross_points[cnt, :] = box_corners[l]
+ cnt += 1
+ if check_in_box2d(box, cur_corners[l]):
+ cross_points[cnt, :] = cur_corners[l]
+ cnt += 1
+
+ if cnt > 0:
+ poly_center = np.sum(cross_points[:cnt, :], axis=0) / cnt
+ else:
+ poly_center = np.zeros((2,))
+
+ # sort the points of polygon
+ for j in range(cnt - 1):
+ for k in range(cnt - j - 1):
+ if point_cmp(cross_points[k], cross_points[k + 1], poly_center):
+ cross_points[k], cross_points[k + 1] = \
+ cross_points[k + 1].copy(), cross_points[k].copy()
+
+ # get the overlap areas
+ area = 0.
+ for j in range(cnt - 1):
+ area += cross_area(cross_points[j] - cross_points[0],
+ cross_points[j + 1] - cross_points[0])
+ areas[i] = np.abs(area) / 2.
+
+ return areas
+
+
+def box_iou(cur_box, boxes, box_type='normal'):
+ cur_S = (cur_box[2] - cur_box[0]) * (cur_box[3] - cur_box[1])
+ boxes_S = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
+
+ if box_type == 'normal':
+ inter_x1 = np.maximum(cur_box[0], boxes[:, 0])
+ inter_y1 = np.maximum(cur_box[1], boxes[:, 1])
+ inter_x2 = np.minimum(cur_box[2], boxes[:, 2])
+ inter_y2 = np.minimum(cur_box[3], boxes[:, 3])
+ inter_w = np.maximum(inter_x2 - inter_x1, 0.)
+ inter_h = np.maximum(inter_y2 - inter_y1, 0.)
+ inter_area = inter_w * inter_h
+ elif box_type == 'rotate':
+ inter_area = box_overlap_rotate(cur_box, boxes)
+ else:
+ raise NotImplementedError
+
+ return inter_area / np.maximum(cur_S + boxes_S - inter_area, 1e-8)
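+
+# Illustrative example ('normal' type, boxes as [x1, y1, x2, y2, ry]):
+#   box_iou(np.array([0., 0., 10., 10., 0.]), np.array([[0., 0., 5., 10., 0.]]))
+# gives [0.5]: intersection 50 over union 100 + 50 - 50 = 100.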
+
+
+def box_nms(boxes, scores, proposals, thresh, topk, nms_type='normal'):
+ assert nms_type in ['normal', 'rotate'], \
+ "unknown nms type {}".format(nms_type)
+ order = np.argsort(-scores)
+ boxes = boxes[order]
+ scores = scores[order]
+ proposals = proposals[order]
+
+ nmsed_scores = []
+ nmsed_proposals = []
+ cnt = 0
+ while boxes.shape[0]:
+ nmsed_scores.append(scores[0])
+ nmsed_proposals.append(proposals[0])
+        cnt += 1
+ if cnt >= topk or boxes.shape[0] == 1:
+ break
+ iou = box_iou(boxes[0], boxes[1:], nms_type)
+ boxes = boxes[1:][iou < thresh]
+ scores = scores[1:][iou < thresh]
+ proposals = proposals[1:][iou < thresh]
+ return nmsed_scores, nmsed_proposals
+
+
+def box_nms_eval(boxes, scores, proposals, thresh, nms_type='rotate'):
+ assert nms_type in ['normal', 'rotate'], \
+ "unknown nms type {}".format(nms_type)
+ order = np.argsort(-scores)
+ boxes = boxes[order]
+ scores = scores[order]
+ proposals = proposals[order]
+
+ nmsed_scores = []
+ nmsed_proposals = []
+ while boxes.shape[0]:
+ nmsed_scores.append(scores[0])
+ nmsed_proposals.append(proposals[0])
+ iou = box_iou(boxes[0], boxes[1:], nms_type)
+ inds = iou < thresh
+ boxes = boxes[1:][inds]
+ scores = scores[1:][inds]
+ proposals = proposals[1:][inds]
+ nmsed_scores = np.asarray(nmsed_scores)
+ nmsed_proposals = np.asarray(nmsed_proposals)
+ return nmsed_scores, nmsed_proposals
+
+def boxes_iou3d(boxes1, boxes2):
+ boxes1_bev = boxes3d_to_bev(boxes1)
+ boxes2_bev = boxes3d_to_bev(boxes2)
+
+ # bev overlap
+ overlaps_bev = np.zeros((boxes1_bev.shape[0], boxes2_bev.shape[0]))
+ for i in range(boxes1_bev.shape[0]):
+ overlaps_bev[i, :] = box_overlap_rotate(boxes1_bev[i], boxes2_bev)
+
+ # height overlap
+ boxes1_height_min = (boxes1[:, 1] - boxes1[:, 3]).reshape(-1, 1)
+ boxes1_height_max = boxes1[:, 1].reshape(-1, 1)
+ boxes2_height_min = (boxes2[:, 1] - boxes2[:, 3]).reshape(1, -1)
+ boxes2_height_max = boxes2[:, 1].reshape(1, -1)
+
+ max_of_min = np.maximum(boxes1_height_min, boxes2_height_min)
+ min_of_max = np.minimum(boxes1_height_max, boxes2_height_max)
+ overlaps_h = np.maximum(min_of_max - max_of_min, 0.)
+
+ # 3d iou
+ overlaps_3d = overlaps_bev * overlaps_h
+
+ vol_a = (boxes1[:, 3] * boxes1[:, 4] * boxes1[:, 5]).reshape(-1, 1)
+ vol_b = (boxes2[:, 3] * boxes2[:, 4] * boxes2[:, 5]).reshape(1, -1)
+ iou3d = overlaps_3d / np.maximum(vol_a + vol_b - overlaps_3d, 1e-7)
+
+ return iou3d
diff --git a/PaddleCV/Paddle3D/PointRCNN/utils/calibration.py b/PaddleCV/Paddle3D/PointRCNN/utils/calibration.py
new file mode 100644
index 0000000000000000000000000000000000000000..41fcf279db5a194c5dcc81ae8dafa48b088a42bc
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/utils/calibration.py
@@ -0,0 +1,143 @@
+"""
+This code is borrowed from https://github.com/sshaoshuai/PointRCNN/blob/master/lib/utils/kitti_utils.py
+"""
+import numpy as np
+import os
+
+
+def get_calib_from_file(calib_file):
+ with open(calib_file) as f:
+ lines = f.readlines()
+
+ obj = lines[2].strip().split(' ')[1:]
+ P2 = np.array(obj, dtype=np.float32)
+ obj = lines[3].strip().split(' ')[1:]
+ P3 = np.array(obj, dtype=np.float32)
+ obj = lines[4].strip().split(' ')[1:]
+ R0 = np.array(obj, dtype=np.float32)
+ obj = lines[5].strip().split(' ')[1:]
+ Tr_velo_to_cam = np.array(obj, dtype=np.float32)
+
+ return {'P2': P2.reshape(3, 4),
+ 'P3': P3.reshape(3, 4),
+ 'R0': R0.reshape(3, 3),
+ 'Tr_velo2cam': Tr_velo_to_cam.reshape(3, 4)}
+
+
+class Calibration(object):
+ def __init__(self, calib_file):
+ if isinstance(calib_file, str):
+ calib = get_calib_from_file(calib_file)
+ else:
+ calib = calib_file
+
+ self.P2 = calib['P2'] # 3 x 4
+ self.R0 = calib['R0'] # 3 x 3
+ self.V2C = calib['Tr_velo2cam'] # 3 x 4
+
+ # Camera intrinsics and extrinsics
+ self.cu = self.P2[0, 2]
+ self.cv = self.P2[1, 2]
+ self.fu = self.P2[0, 0]
+ self.fv = self.P2[1, 1]
+ self.tx = self.P2[0, 3] / (-self.fu)
+ self.ty = self.P2[1, 3] / (-self.fv)
+
+ def cart_to_hom(self, pts):
+ """
+ :param pts: (N, 3 or 2)
+ :return pts_hom: (N, 4 or 3)
+ """
+ pts_hom = np.hstack((pts, np.ones((pts.shape[0], 1), dtype=np.float32)))
+ return pts_hom
+
+ def lidar_to_rect(self, pts_lidar):
+ """
+ :param pts_lidar: (N, 3)
+ :return pts_rect: (N, 3)
+ """
+ pts_lidar_hom = self.cart_to_hom(pts_lidar)
+ pts_rect = np.dot(pts_lidar_hom, np.dot(self.V2C.T, self.R0.T))
+ # pts_rect = reduce(np.dot, (pts_lidar_hom, self.V2C.T, self.R0.T))
+ return pts_rect
+
+ def rect_to_img(self, pts_rect):
+ """
+ :param pts_rect: (N, 3)
+ :return pts_img: (N, 2)
+ """
+ pts_rect_hom = self.cart_to_hom(pts_rect)
+ pts_2d_hom = np.dot(pts_rect_hom, self.P2.T)
+ pts_img = (pts_2d_hom[:, 0:2].T / pts_rect_hom[:, 2]).T # (N, 2)
+ pts_rect_depth = pts_2d_hom[:, 2] - self.P2.T[3, 2] # depth in rect camera coord
+ return pts_img, pts_rect_depth
+
+ def lidar_to_img(self, pts_lidar):
+ """
+ :param pts_lidar: (N, 3)
+ :return pts_img: (N, 2)
+ """
+ pts_rect = self.lidar_to_rect(pts_lidar)
+ pts_img, pts_depth = self.rect_to_img(pts_rect)
+ return pts_img, pts_depth
+
+ def img_to_rect(self, u, v, depth_rect):
+ """
+ :param u: (N)
+ :param v: (N)
+ :param depth_rect: (N)
+ :return:
+ """
+ x = ((u - self.cu) * depth_rect) / self.fu + self.tx
+ y = ((v - self.cv) * depth_rect) / self.fv + self.ty
+ pts_rect = np.concatenate((x.reshape(-1, 1), y.reshape(-1, 1), depth_rect.reshape(-1, 1)), axis=1)
+ return pts_rect
+
+ def depthmap_to_rect(self, depth_map):
+ """
+ :param depth_map: (H, W), depth_map
+ :return:
+ """
+ x_range = np.arange(0, depth_map.shape[1])
+ y_range = np.arange(0, depth_map.shape[0])
+ x_idxs, y_idxs = np.meshgrid(x_range, y_range)
+ x_idxs, y_idxs = x_idxs.reshape(-1), y_idxs.reshape(-1)
+ depth = depth_map[y_idxs, x_idxs]
+ pts_rect = self.img_to_rect(x_idxs, y_idxs, depth)
+ return pts_rect, x_idxs, y_idxs
+
+ def corners3d_to_img_boxes(self, corners3d):
+ """
+ :param corners3d: (N, 8, 3) corners in rect coordinate
+        :return: boxes: (N, 4) [x1, y1, x2, y2] in rgb coordinate
+        :return: boxes_corner: (N, 8, 2) [xi, yi] in rgb coordinate
+ """
+ sample_num = corners3d.shape[0]
+ corners3d_hom = np.concatenate((corners3d, np.ones((sample_num, 8, 1))), axis=2) # (N, 8, 4)
+
+ img_pts = np.matmul(corners3d_hom, self.P2.T) # (N, 8, 3)
+
+ x, y = img_pts[:, :, 0] / img_pts[:, :, 2], img_pts[:, :, 1] / img_pts[:, :, 2]
+ x1, y1 = np.min(x, axis=1), np.min(y, axis=1)
+ x2, y2 = np.max(x, axis=1), np.max(y, axis=1)
+
+ boxes = np.concatenate((x1.reshape(-1, 1), y1.reshape(-1, 1), x2.reshape(-1, 1), y2.reshape(-1, 1)), axis=1)
+ boxes_corner = np.concatenate((x.reshape(-1, 8, 1), y.reshape(-1, 8, 1)), axis=2)
+
+ return boxes, boxes_corner
+
+ def camera_dis_to_rect(self, u, v, d):
+ """
+        Can only process valid u, v, d, i.e. u, v must not lie beyond the image shape; reprojection error ~0.02
+ :param u: (N)
+ :param v: (N)
+ :param d: (N), the distance between camera and 3d points, d^2 = x^2 + y^2 + z^2
+ :return:
+ """
+ assert self.fu == self.fv, '%.8f != %.8f' % (self.fu, self.fv)
+ fd = np.sqrt((u - self.cu)**2 + (v - self.cv)**2 + self.fu**2)
+ x = ((u - self.cu) * d) / fd + self.tx
+ y = ((v - self.cv) * d) / fd + self.ty
+ z = np.sqrt(d**2 - x**2 - y**2)
+ pts_rect = np.concatenate((x.reshape(-1, 1), y.reshape(-1, 1), z.reshape(-1, 1)), axis=1)
+ return pts_rect
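+
+
+# Typical usage (the calib path is illustrative; pts_lidar is an (N, 3) velodyne array):
+#   calib = Calibration('data/KITTI/object/training/calib/000007.txt')
+#   pts_rect = calib.lidar_to_rect(pts_lidar)          # velodyne -> rect camera
+#   pts_img, pts_depth = calib.rect_to_img(pts_rect)   # rect camera -> image plane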
diff --git a/PaddleCV/Paddle3D/PointRCNN/utils/config.py b/PaddleCV/Paddle3D/PointRCNN/utils/config.py
new file mode 100644
index 0000000000000000000000000000000000000000..dc24aee5253576e3e5f78b8ed246af51c06279ba
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/utils/config.py
@@ -0,0 +1,279 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
+#
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
+"""
+This code is based on https://github.com/sshaoshuai/PointRCNN/blob/master/lib/config.py
+"""
+
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
+import yaml
+import numpy as np
+from ast import literal_eval
+
+__all__ = ["load_config", "cfg"]
+
+
+class AttrDict(dict):
+ def __init__(self, *args, **kwargs):
+ for arg in args:
+ for k, v in arg.items():
+ if isinstance(v, dict):
+ arg[k] = AttrDict(v)
+ else:
+ arg[k] = v
+ super(AttrDict, self).__init__(*args, **kwargs)
+
+ def __getattr__(self, name):
+ if name in self.__dict__:
+ return self.__dict__[name]
+ elif name in self:
+ return self[name]
+ else:
+ raise AttributeError(name)
+
+ def __setattr__(self, name, value):
+ if name in self.__dict__:
+ self.__dict__[name] = value
+ else:
+ self[name] = value
+
+
+__C = AttrDict()
+cfg = __C
+
+# 0. basic config
+__C.TAG = 'default'
+__C.CLASSES = 'Car'
+
+__C.INCLUDE_SIMILAR_TYPE = False
+
+# config of augmentation
+__C.AUG_DATA = True
+__C.AUG_METHOD_LIST = ['rotation', 'scaling', 'flip']
+__C.AUG_METHOD_PROB = [0.5, 0.5, 0.5]
+__C.AUG_ROT_RANGE = 18
+
+__C.GT_AUG_ENABLED = False
+__C.GT_EXTRA_NUM = 15
+__C.GT_AUG_RAND_NUM = False
+__C.GT_AUG_APPLY_PROB = 0.75
+__C.GT_AUG_HARD_RATIO = 0.6
+
+__C.PC_REDUCE_BY_RANGE = True
+__C.PC_AREA_SCOPE = np.array([[-40, 40],
+ [-1, 3],
+ [0, 70.4]]) # x, y, z scope in rect camera coords
+
+__C.CLS_MEAN_SIZE = np.array([[1.52, 1.63, 3.88]], dtype=np.float32)
+
+
+# 1. config of rpn network
+__C.RPN = AttrDict()
+__C.RPN.ENABLED = True
+__C.RPN.FIXED = False
+
+__C.RPN.USE_INTENSITY = True
+
+# config of bin-based loss
+__C.RPN.LOC_XZ_FINE = False
+__C.RPN.LOC_SCOPE = 3.0
+__C.RPN.LOC_BIN_SIZE = 0.5
+__C.RPN.NUM_HEAD_BIN = 12
+
+# config of network structure
+__C.RPN.BACKBONE = 'pointnet2_msg'
+
+__C.RPN.USE_BN = True
+__C.RPN.NUM_POINTS = 16384
+
+__C.RPN.SA_CONFIG = AttrDict()
+__C.RPN.SA_CONFIG.NPOINTS = [4096, 1024, 256, 64]
+__C.RPN.SA_CONFIG.RADIUS = [[0.1, 0.5], [0.5, 1.0], [1.0, 2.0], [2.0, 4.0]]
+__C.RPN.SA_CONFIG.NSAMPLE = [[16, 32], [16, 32], [16, 32], [16, 32]]
+__C.RPN.SA_CONFIG.MLPS = [[[16, 16, 32], [32, 32, 64]],
+ [[64, 64, 128], [64, 96, 128]],
+ [[128, 196, 256], [128, 196, 256]],
+ [[256, 256, 512], [256, 384, 512]]]
+__C.RPN.FP_MLPS = [[128, 128], [256, 256], [512, 512], [512, 512]]
+__C.RPN.CLS_FC = [128]
+__C.RPN.REG_FC = [128]
+__C.RPN.DP_RATIO = 0.5
+
+# config of training
+__C.RPN.LOSS_CLS = 'DiceLoss'
+__C.RPN.FG_WEIGHT = 15
+__C.RPN.FOCAL_ALPHA = [0.25, 0.75]
+__C.RPN.FOCAL_GAMMA = 2.0
+__C.RPN.REG_LOSS_WEIGHT = [1.0, 1.0, 1.0, 1.0]
+__C.RPN.LOSS_WEIGHT = [1.0, 1.0]
+__C.RPN.NMS_TYPE = 'normal' # normal, rotate
+
+# config of testing
+__C.RPN.SCORE_THRESH = 0.3
+
+
+# 2. config of rcnn network
+__C.RCNN = AttrDict()
+__C.RCNN.ENABLED = False
+
+# config of input
+__C.RCNN.USE_RPN_FEATURES = True
+__C.RCNN.USE_MASK = True
+__C.RCNN.MASK_TYPE = 'seg'
+__C.RCNN.USE_INTENSITY = False
+__C.RCNN.USE_DEPTH = True
+__C.RCNN.USE_SEG_SCORE = False
+__C.RCNN.ROI_SAMPLE_JIT = False
+__C.RCNN.ROI_FG_AUG_TIMES = 10
+
+__C.RCNN.REG_AUG_METHOD = 'multiple' # multiple, single, normal
+__C.RCNN.POOL_EXTRA_WIDTH = 1.0
+
+# config of bin-based loss
+__C.RCNN.LOC_SCOPE = 1.5
+__C.RCNN.LOC_BIN_SIZE = 0.5
+__C.RCNN.NUM_HEAD_BIN = 9
+__C.RCNN.LOC_Y_BY_BIN = False
+__C.RCNN.LOC_Y_SCOPE = 0.5
+__C.RCNN.LOC_Y_BIN_SIZE = 0.25
+__C.RCNN.SIZE_RES_ON_ROI = False
+
+# config of network structure
+__C.RCNN.USE_BN = False
+__C.RCNN.DP_RATIO = 0.0
+
+__C.RCNN.BACKBONE = 'pointnet' # pointnet, pointsift
+__C.RCNN.XYZ_UP_LAYER = [128, 128]
+
+__C.RCNN.NUM_POINTS = 512
+__C.RCNN.SA_CONFIG = AttrDict()
+__C.RCNN.SA_CONFIG.NPOINTS = [128, 32, -1]
+__C.RCNN.SA_CONFIG.RADIUS = [0.2, 0.4, 100]
+__C.RCNN.SA_CONFIG.NSAMPLE = [64, 64, 64]
+__C.RCNN.SA_CONFIG.MLPS = [[128, 128, 128],
+ [128, 128, 256],
+ [256, 256, 512]]
+__C.RCNN.CLS_FC = [256, 256]
+__C.RCNN.REG_FC = [256, 256]
+
+# config of training
+__C.RCNN.LOSS_CLS = 'BinaryCrossEntropy'
+__C.RCNN.FOCAL_ALPHA = [0.25, 0.75]
+__C.RCNN.FOCAL_GAMMA = 2.0
+__C.RCNN.CLS_WEIGHT = np.array([1.0, 1.0, 1.0], dtype=np.float32)
+__C.RCNN.CLS_FG_THRESH = 0.6
+__C.RCNN.CLS_BG_THRESH = 0.45
+__C.RCNN.CLS_BG_THRESH_LO = 0.05
+__C.RCNN.REG_FG_THRESH = 0.55
+__C.RCNN.FG_RATIO = 0.5
+__C.RCNN.ROI_PER_IMAGE = 64
+__C.RCNN.HARD_BG_RATIO = 0.6
+
+# config of testing
+__C.RCNN.SCORE_THRESH = 0.3
+__C.RCNN.NMS_THRESH = 0.1
+
+
+# general training config
+__C.TRAIN = AttrDict()
+__C.TRAIN.SPLIT = 'train'
+__C.TRAIN.VAL_SPLIT = 'smallval'
+
+__C.TRAIN.LR = 0.002
+__C.TRAIN.LR_CLIP = 0.00001
+__C.TRAIN.LR_DECAY = 0.5
+__C.TRAIN.DECAY_STEP_LIST = [50, 100, 150, 200, 250, 300]
+__C.TRAIN.LR_WARMUP = False
+__C.TRAIN.WARMUP_MIN = 0.0002
+__C.TRAIN.WARMUP_EPOCH = 5
+
+__C.TRAIN.BN_MOMENTUM = 0.9
+__C.TRAIN.BN_DECAY = 0.5
+__C.TRAIN.BNM_CLIP = 0.01
+__C.TRAIN.BN_DECAY_STEP_LIST = [50, 100, 150, 200, 250, 300]
+
+__C.TRAIN.OPTIMIZER = 'adam'
+__C.TRAIN.WEIGHT_DECAY = 0.0 # "L2 regularization coeff [default: 0.0]"
+__C.TRAIN.MOMENTUM = 0.9
+
+__C.TRAIN.MOMS = [0.95, 0.85]
+__C.TRAIN.DIV_FACTOR = 10.0
+__C.TRAIN.PCT_START = 0.4
+
+__C.TRAIN.GRAD_NORM_CLIP = 1.0
+
+__C.TRAIN.RPN_PRE_NMS_TOP_N = 12000
+__C.TRAIN.RPN_POST_NMS_TOP_N = 2048
+__C.TRAIN.RPN_NMS_THRESH = 0.85
+__C.TRAIN.RPN_DISTANCE_BASED_PROPOSE = True
+
+
+__C.TEST = AttrDict()
+__C.TEST.SPLIT = 'val'
+__C.TEST.RPN_PRE_NMS_TOP_N = 9000
+__C.TEST.RPN_POST_NMS_TOP_N = 300
+__C.TEST.RPN_NMS_THRESH = 0.7
+__C.TEST.RPN_DISTANCE_BASED_PROPOSE = True
+
+
+def load_config(fname):
+ """
+ Load config from yaml file and merge into global cfg
+ """
+ with open(fname) as f:
+ yml_cfg = AttrDict(yaml.load(f.read(), Loader=yaml.Loader))
+ _merge_cfg_a_to_b(yml_cfg, __C)
+
+
+def set_config_from_list(cfg_list):
+ assert len(cfg_list) % 2 == 0, "cfgs list length invalid"
+ for k, v in zip(cfg_list[0::2], cfg_list[1::2]):
+ key_list = k.split('.')
+ d = __C
+ for subkey in key_list[:-1]:
+ assert subkey in d
+ d = d[subkey]
+ subkey = key_list[-1]
+ assert subkey in d
+        try:
+            value = literal_eval(v)
+        except (ValueError, SyntaxError):
+            # handle the case when v is a plain string literal
+            value = v
+ assert type(value) == type(d[subkey]), \
+ 'type {} does not match original type {}'.format(type(value), type(d[subkey]))
+ d[subkey] = value
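+
+# Example: override entries supplied via the command line --set flag, e.g.
+#   set_config_from_list(['TRAIN.LR', '0.001', 'RPN.SCORE_THRESH', '0.2'])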
+
+
+def _merge_cfg_a_to_b(a, b):
+ assert isinstance(a, AttrDict), \
+ "unknown type {}".format(type(a))
+
+ for k, v in a.items():
+ assert k in b, "unknown key {}".format(k)
+ if type(v) is not type(b[k]):
+ if isinstance(b[k], np.ndarray):
+ b[k] = np.array(v, dtype=b[k].dtype)
+ else:
+ raise TypeError("Config type mismatch")
+ if isinstance(v, AttrDict):
+ _merge_cfg_a_to_b(v, b[k])
+ else:
+ b[k] = v
+
+
+if __name__ == "__main__":
+ load_config("./cfgs/default.yml")
diff --git a/PaddleCV/Paddle3D/PointRCNN/utils/cyops/__init__.py b/PaddleCV/Paddle3D/PointRCNN/utils/cyops/__init__.py
new file mode 100644
index 0000000000000000000000000000000000000000..e02c54922625934fe1ab74a8c29e435f44f4d302
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/utils/cyops/__init__.py
@@ -0,0 +1,15 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
+#
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
+
+
diff --git a/PaddleCV/Paddle3D/PointRCNN/utils/cyops/iou3d_utils.pyx b/PaddleCV/Paddle3D/PointRCNN/utils/cyops/iou3d_utils.pyx
new file mode 100644
index 0000000000000000000000000000000000000000..b2c7f3c7169c0a0f5da1adeeb029eec423daf39e
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/utils/cyops/iou3d_utils.pyx
@@ -0,0 +1,195 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
+#
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
+
+import cython
+from math import pi, cos, sin
+import numpy as np
+cimport numpy as np
+
+
+cdef class Point:
+ cdef float x, y
+ def __cinit__(self, x, y):
+ self.x = x
+ self.y = y
+
+ def __add__(self, v):
+ if not isinstance(v, Point):
+ return NotImplemented
+ return Point(self.x + v.x, self.y + v.y)
+
+ def __sub__(self, v):
+ if not isinstance(v, Point):
+ return NotImplemented
+ return Point(self.x - v.x, self.y - v.y)
+
+ def cross(self, v):
+ if not isinstance(v, Point):
+ return NotImplemented
+ return self.x*v.y - self.y*v.x
+
+
+cdef class Line:
+ cdef float a, b, c
+ # ax + by + c = 0
+ def __cinit__(self, v1, v2):
+ self.a = v2.y - v1.y
+ self.b = v1.x - v2.x
+ self.c = v2.cross(v1)
+
+ def __call__(self, p):
+ return self.a*p.x + self.b*p.y + self.c
+
+ def intersection(self, other):
+ if not isinstance(other, Line):
+ return NotImplemented
+ w = self.a*other.b - self.b*other.a
+ return Point(
+ (self.b*other.c - self.c*other.b)/w,
+ (self.c*other.a - self.a*other.c)/w
+ )
+
+
+@cython.boundscheck(False)
+@cython.wraparound(False)
+def rectangle_vertices_(x1, y1, x2, y2, r):
+
+ cx = (x1 + x2) / 2
+ cy = (y1 + y2) / 2
+ angle = r
+ cr = cos(angle)
+ sr = sin(angle)
+ # rotate around center
+ return (
+ Point(
+ x=(x1-cx)*cr+(y1-cy)*sr+cx,
+ y=-(x1-cx)*sr+(y1-cy)*cr+cy
+ ),
+ Point(
+ x=(x2-cx)*cr+(y1-cy)*sr+cx,
+ y=-(x2-cx)*sr+(y1-cy)*cr+cy
+ ),
+ Point(
+ x=(x2-cx)*cr+(y2-cy)*sr+cx,
+ y=-(x2-cx)*sr+(y2-cy)*cr+cy
+ ),
+ Point(
+ x=(x1-cx)*cr+(y2-cy)*sr+cx,
+ y=-(x1-cx)*sr+(y2-cy)*cr+cy
+ )
+ )
+
+@cython.boundscheck(False)
+@cython.wraparound(False)
+def intersection_area(r1, r2):
+    # r1 and r2 are in (x1, y1, x2, y2, rotation) representation
+ # First convert these into a sequence of vertices
+
+ rect1 = rectangle_vertices_(*r1)
+ rect2 = rectangle_vertices_(*r2)
+
+ # Use the vertices of the first rectangle as
+ # starting vertices of the intersection polygon.
+ intersection = rect1
+
+ # Loop over the edges of the second rectangle
+ for p, q in zip(rect2, rect2[1:] + rect2[:1]):
+ if len(intersection) <= 2:
+ break # No intersection
+
+ line = Line(p, q)
+
+ # Any point p with line(p) <= 0 is on the "inside" (or on the boundary),
+ # any point p with line(p) > 0 is on the "outside".
+
+ # Loop over the edges of the intersection polygon,
+ # and determine which part is inside and which is outside.
+ new_intersection = []
+ line_values = [line(t) for t in intersection]
+ for s, t, s_value, t_value in zip(
+ intersection, intersection[1:] + intersection[:1],
+ line_values, line_values[1:] + line_values[:1]):
+ if s_value <= 0:
+ new_intersection.append(s)
+ if s_value * t_value < 0:
+ # Points are on opposite sides.
+ # Add the intersection of the lines to new_intersection.
+ intersection_point = line.intersection(Line(s, t))
+ new_intersection.append(intersection_point)
+
+ intersection = new_intersection
+
+ # Calculate area
+ if len(intersection) <= 2:
+ return 0
+
+ return 0.5 * sum(p.x*q.y - p.y*q.x for p, q in zip(intersection, intersection[1:] + intersection[:1]))
+
+
+def boxes3d_to_bev_(boxes3d):
+ """
+ Args:
+ boxes3d: [N, 7], (x, y, z, h, w, l, ry)
+ Return:
+ boxes_bev: [N, 5], (x1, y1, x2, y2, ry)
+ """
+ boxes_bev = np.zeros((boxes3d.shape[0], 5), dtype='float32')
+ cu, cv = boxes3d[:, 0], boxes3d[:, 2]
+ half_l, half_w = boxes3d[:, 5] / 2, boxes3d[:, 4] / 2
+ boxes_bev[:, 0], boxes_bev[:, 1] = cu - half_l, cv - half_w
+ boxes_bev[:, 2], boxes_bev[:, 3] = cu + half_l, cv + half_w
+ boxes_bev[:, 4] = boxes3d[:, 6]
+ return boxes_bev
+
+
+def boxes_iou3d(boxes_a, boxes_b):
+ """
+ :param boxes_a: (N, 7) [x, y, z, h, w, l, ry]
+ :param boxes_b: (M, 7) [x, y, z, h, w, l, ry]
+ :return:
+        ans_iou: (N, M)
+ """
+ boxes_a_bev = boxes3d_to_bev_(boxes_a)
+ boxes_b_bev = boxes3d_to_bev_(boxes_b)
+ # bev overlap
+ num_a = boxes_a_bev.shape[0]
+ num_b = boxes_b_bev.shape[0]
+ overlaps_bev = np.zeros((num_a, num_b), dtype=np.float32)
+ for i in range(num_a):
+ for j in range(num_b):
+ overlaps_bev[i][j] = intersection_area(boxes_a_bev[i], boxes_b_bev[j])
+
+ # height overlap
+ boxes_a_height_min = (boxes_a[:, 1] - boxes_a[:, 3]).reshape(-1, 1)
+ boxes_a_height_max = boxes_a[:, 1].reshape(-1, 1)
+ boxes_b_height_min = (boxes_b[:, 1] - boxes_b[:, 3]).reshape(1, -1)
+ boxes_b_height_max = boxes_b[:, 1].reshape(1, -1)
+
+ max_of_min = np.maximum(boxes_a_height_min, boxes_b_height_min)
+ min_of_max = np.minimum(boxes_a_height_max, boxes_b_height_max)
+ overlaps_h = np.clip(min_of_max - max_of_min, a_min=0, a_max=np.inf)
+ # 3d iou
+ overlaps_3d = overlaps_bev * overlaps_h
+
+ vol_a = (boxes_a[:, 3] * boxes_a[:, 4] * boxes_a[:, 5]).reshape(-1, 1)
+ vol_b = (boxes_b[:, 3] * boxes_b[:, 4] * boxes_b[:, 5]).reshape(1, -1)
+
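+    # IoU_3d = inter / (vol_a + vol_b - inter); the denominator is clipped at
+    # 1e-7 to guard against division by zero for degenerate boxes.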
+ iou3d = overlaps_3d / np.clip(vol_a + vol_b - overlaps_3d, a_min=1e-7, a_max=np.inf)
+ return iou3d
+
+#if __name__ == '__main__':
+#    # (x1, y1, x2, y2, rotation)
+# r1 = (10, 15, 15, 10, 30)
+# r2 = (15, 15, 20, 10, 0)
+# print(intersection_area(r1, r2))
diff --git a/PaddleCV/Paddle3D/PointRCNN/utils/cyops/kitti_utils.pyx b/PaddleCV/Paddle3D/PointRCNN/utils/cyops/kitti_utils.pyx
new file mode 100644
index 0000000000000000000000000000000000000000..593dd0c9354516a2861701c5103f8e9b10ae46b1
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/utils/cyops/kitti_utils.pyx
@@ -0,0 +1,346 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
+#
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
+
+import cython
+import numpy as np
+cimport numpy as np
+
+@cython.boundscheck(False)
+@cython.wraparound(False)
+def pts_in_boxes3d(np.ndarray pts_rect, np.ndarray boxes3d):
+ """
+    :param pts_rect: (N, 3) in rect-camera coords
+    :param boxes3d: (M, 7)
+    :return: boxes_pts_mask_list: (M, N) int32 array, 1 if point j lies inside box i
+ """
+ cdef float MAX_DIS = 10.0
+ cdef np.ndarray boxes_pts_mask_list = np.zeros((boxes3d.shape[0], pts_rect.shape[0]), dtype='int32')
+ cdef int boxes3d_num = boxes3d.shape[0]
+ cdef int pts_rect_num = pts_rect.shape[0]
+ cdef float cx, by, cz, h, w, l, angle, cy, cosa, sina, x_rot, z_rot
+    cdef float x, y, z  # point coordinates; declaring these int would truncate them
+    cdef int i, j
+
+ for i in range(boxes3d_num):
+ cx, by, cz, h, w, l, angle = boxes3d[i, :]
+ cy = by - h / 2.
+ cosa = np.cos(angle)
+ sina = np.sin(angle)
+ for j in range(pts_rect_num):
+ x, y, z = pts_rect[j, :]
+
+ if np.abs(x - cx) > MAX_DIS or np.abs(y - cy) > h / 2. or np.abs(z - cz) > MAX_DIS:
+ continue
+
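+            # rotate the point into the box's local frame (yaw about the
+            # camera y-axis), then test against the l/w extents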
+ x_rot = (x - cx) * cosa + (z - cz) * (-sina)
+ z_rot = (x - cx) * sina + (z - cz) * cosa
+ boxes_pts_mask_list[i, j] = int(x_rot >= -l / 2. and x_rot <= l / 2. and
+ z_rot >= -w / 2. and z_rot <= w / 2.)
+ return boxes_pts_mask_list
+
+
+@cython.boundscheck(False)
+@cython.wraparound(False)
+def rotate_pc_along_y(np.ndarray pc, float rot_angle):
+ """
+ params pc: (N, 3+C), (N, 3) is in the rectified camera coordinate
+ params rot_angle: rad scalar
+ Output pc: updated pc with XYZ rotated
+ """
+ cosval = np.cos(rot_angle)
+ sinval = np.sin(rot_angle)
+ rotmat = np.array([[cosval, -sinval], [sinval, cosval]])
+ pc[:, [0, 2]] = np.dot(pc[:, [0, 2]], np.transpose(rotmat))
+ return pc
+
+
+@cython.boundscheck(False)
+@cython.wraparound(False)
+def rotate_pc_along_y_np(np.ndarray pc, np.ndarray rot_angle):
+ """
+ :param pc: (N, 512, 3 + C)
+ :param rot_angle: (N)
+ :return:
+ """
+ cdef np.ndarray cosa, sina, raw_1, raw_2, R, pc_temp
+ cosa = np.cos(rot_angle).reshape(-1, 1)
+ sina = np.sin(rot_angle).reshape(-1, 1)
+ raw_1 = np.concatenate([cosa, -sina], axis=1)
+ raw_2 = np.concatenate([sina, cosa], axis=1)
+    # R: (N, 2, 2)
+    R = np.concatenate((np.expand_dims(raw_1, axis=1), np.expand_dims(raw_2, axis=1)), axis=1)
+ pc_temp = pc[:, :, [0, 2]]
+ pc[:, :, [0, 2]] = np.matmul(pc_temp, R.transpose(0, 2, 1))
+
+ return pc
+
+
+@cython.boundscheck(False)
+@cython.wraparound(False)
+def enlarge_box3d(np.ndarray boxes3d, float extra_width):
+ """
+ :param boxes3d: (N, 7) [x, y, z, h, w, l, ry]
+ """
+    cdef np.ndarray large_boxes3d = boxes3d.copy()
+ large_boxes3d[:, 3:6] += extra_width * 2
+ large_boxes3d[:, 1] += extra_width
+
+ return large_boxes3d
+
+
+@cython.boundscheck(False)
+@cython.wraparound(False)
+def boxes3d_to_corners3d(np.ndarray boxes3d, bint rotate=True):
+ """
+ :param boxes3d: (N, 7) [x, y, z, h, w, l, ry]
+ :param rotate:
+ :return: corners3d: (N, 8, 3)
+ """
+ cdef int boxes_num = boxes3d.shape[0]
+ cdef np.ndarray h, w, l
+ h, w, l = boxes3d[:, 3], boxes3d[:, 4], boxes3d[:, 5]
+    cdef np.ndarray x_corners, y_corners, z_corners
+ x_corners = np.array([l / 2., l / 2., -l / 2., -l / 2., l / 2., l / 2., -l / 2., -l / 2.], dtype=np.float32).T # (N, 8)
+ z_corners = np.array([w / 2., -w / 2., -w / 2., w / 2., w / 2., -w / 2., -w / 2., w / 2.], dtype=np.float32).T # (N, 8)
+
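+    # KITTI camera y points down: corners 0-3 lie on the box bottom (y offset
+    # 0) and corners 4-7 on the top (y offset -h)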
+ y_corners = np.zeros((boxes_num, 8), dtype=np.float32)
+ y_corners[:, 4:8] = -h.reshape(boxes_num, 1).repeat(4, axis=1) # (N, 8)
+
+ cdef np.ndarray ry, zeros, ones, rot_list, R_list, temp_corners, rotated_corners
+ if rotate:
+ ry = boxes3d[:, 6]
+ zeros, ones = np.zeros(ry.size, dtype=np.float32), np.ones(ry.size, dtype=np.float32)
+ rot_list = np.array([[np.cos(ry), zeros, -np.sin(ry)],
+ [zeros, ones, zeros],
+ [np.sin(ry), zeros, np.cos(ry)]]) # (3, 3, N)
+ R_list = np.transpose(rot_list, (2, 0, 1)) # (N, 3, 3)
+
+ temp_corners = np.concatenate((x_corners.reshape(-1, 8, 1), y_corners.reshape(-1, 8, 1),
+ z_corners.reshape(-1, 8, 1)), axis=2) # (N, 8, 3)
+ rotated_corners = np.matmul(temp_corners, R_list) # (N, 8, 3)
+ x_corners, y_corners, z_corners = rotated_corners[:, :, 0], rotated_corners[:, :, 1], rotated_corners[:, :, 2]
+
+ cdef np.ndarray x_loc, y_loc, z_loc
+ x_loc, y_loc, z_loc = boxes3d[:, 0], boxes3d[:, 1], boxes3d[:, 2]
+
+ cdef np.ndarray x, y, z, corners
+ x = x_loc.reshape(-1, 1) + x_corners.reshape(-1, 8)
+ y = y_loc.reshape(-1, 1) + y_corners.reshape(-1, 8)
+ z = z_loc.reshape(-1, 1) + z_corners.reshape(-1, 8)
+
+ corners = np.concatenate((x.reshape(-1, 8, 1), y.reshape(-1, 8, 1), z.reshape(-1, 8, 1)), axis=2).astype(np.float32)
+
+ return corners
+
+
+@cython.boundscheck(False)
+@cython.wraparound(False)
+def objs_to_boxes3d(obj_list):
+    cdef np.ndarray boxes3d = np.zeros((len(obj_list), 7), dtype=np.float32)
+ cdef int k
+ for k, obj in enumerate(obj_list):
+ boxes3d[k, 0:3], boxes3d[k, 3], boxes3d[k, 4], boxes3d[k, 5], boxes3d[k, 6] \
+ = obj.pos, obj.h, obj.w, obj.l, obj.ry
+ return boxes3d
+
+
+@cython.boundscheck(False)
+@cython.wraparound(False)
+def objs_to_scores(obj_list):
+    cdef np.ndarray scores = np.zeros(len(obj_list), dtype=np.float32)
+ cdef int k
+ for k, obj in enumerate(obj_list):
+ scores[k] = obj.score
+ return scores
+
+
+def get_iou3d(np.ndarray corners3d, np.ndarray query_corners3d, bint need_bev=False):
+ """
+ :param corners3d: (N, 8, 3) in rect coords
+ :param query_corners3d: (M, 8, 3)
+ :return:
+ """
+ from shapely.geometry import Polygon
+ A, B = corners3d, query_corners3d
+ N, M = A.shape[0], B.shape[0]
+ iou3d = np.zeros((N, M), dtype=np.float32)
+ iou_bev = np.zeros((N, M), dtype=np.float32)
+
+ # for height overlap, since y face down, use the negative y
+ min_h_a = -A[:, 0:4, 1].sum(axis=1) / 4.0
+ max_h_a = -A[:, 4:8, 1].sum(axis=1) / 4.0
+ min_h_b = -B[:, 0:4, 1].sum(axis=1) / 4.0
+ max_h_b = -B[:, 4:8, 1].sum(axis=1) / 4.0
+
+ for i in range(N):
+ for j in range(M):
+ max_of_min = np.max([min_h_a[i], min_h_b[j]])
+ min_of_max = np.min([max_h_a[i], max_h_b[j]])
+ h_overlap = np.max([0, min_of_max - max_of_min])
+ if h_overlap == 0:
+ continue
+
+ bottom_a, bottom_b = Polygon(A[i, 0:4, [0, 2]].T), Polygon(B[j, 0:4, [0, 2]].T)
+            if bottom_a.is_valid and bottom_b.is_valid:
+                # only intersect valid polygons: a valid Polygon has no
+                # self-intersecting or overlapping exterior/interior rings
+ bottom_overlap = bottom_a.intersection(bottom_b).area
+ else:
+ bottom_overlap = 0.
+ overlap3d = bottom_overlap * h_overlap
+ union3d = bottom_a.area * (max_h_a[i] - min_h_a[i]) + bottom_b.area * (max_h_b[j] - min_h_b[j]) - overlap3d
+ iou3d[i][j] = overlap3d / union3d
+ iou_bev[i][j] = bottom_overlap / (bottom_a.area + bottom_b.area - bottom_overlap)
+
+ if need_bev:
+ return iou3d, iou_bev
+
+ return iou3d
+
+
+def get_objects_from_label(label_file):
+ import utils.object3d as object3d
+
+ with open(label_file, 'r') as f:
+ lines = f.readlines()
+ objects = [object3d.Object3d(line) for line in lines]
+ return objects
+
+
+@cython.boundscheck(False)
+@cython.wraparound(False)
+def _rotate_pc_along_y(np.ndarray pc, np.ndarray angle):
+    cdef np.ndarray cosa = np.cos(angle).reshape(-1, 1)
+    cdef np.ndarray sina = np.sin(angle).reshape(-1, 1)
+
+    # per-box 2x2 rotation matrices over the (x, z) plane
+    cdef np.ndarray R = np.concatenate([cosa, -sina, sina, cosa], axis=-1).reshape(-1, 2, 2)
+    cdef np.ndarray pc_temp = pc[:, [0, 2]].reshape(-1, 1, 2)
+    pc[:, [0, 2]] = np.matmul(pc_temp, R.transpose(0, 2, 1)).reshape(-1, 2)
+
+ return pc
+
+@cython.boundscheck(False)
+@cython.wraparound(False)
+def decode_bbox_target(
+ np.ndarray roi_box3d,
+ np.ndarray pred_reg,
+ np.ndarray anchor_size,
+ float loc_scope,
+ float loc_bin_size,
+ int num_head_bin,
+ bint get_xz_fine=True,
+ float loc_y_scope=0.5,
+ float loc_y_bin_size=0.25,
+ bint get_y_by_bin=False,
+ bint get_ry_fine=False):
+
+ cdef int per_loc_bin_num = int(loc_scope / loc_bin_size) * 2
+ cdef int loc_y_bin_num = int(loc_y_scope / loc_y_bin_size) * 2
+
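+    # Bin-based decoding (the paper's bin-based box regression): each location
+    # and heading target is a coarse bin classification plus a normalized
+    # residual within the chosen bin; pred_reg packs [x bins, z bins,
+    # x residuals, z residuals, y, ry bins, ry residuals, size residuals]
+    # along axis 1 (the exact slots depend on the get_*_fine / by_bin flags).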
+ # recover xz localization
+ cdef int x_bin_l = 0
+ cdef int x_bin_r = per_loc_bin_num
+    cdef int z_bin_l = per_loc_bin_num
+ cdef int z_bin_r = per_loc_bin_num * 2
+ cdef int start_offset = z_bin_r
+ cdef np.ndarray x_bin = np.argmax(pred_reg[:, x_bin_l: x_bin_r], axis=1)
+ cdef np.ndarray z_bin = np.argmax(pred_reg[:, z_bin_l: z_bin_r], axis=1)
+
+ cdef np.ndarray pos_x = x_bin.astype('float32') * loc_bin_size + loc_bin_size / 2 - loc_scope
+ cdef np.ndarray pos_z = z_bin.astype('float32') * loc_bin_size + loc_bin_size / 2 - loc_scope
+
+ if get_xz_fine:
+ x_res_l, x_res_r = per_loc_bin_num * 2, per_loc_bin_num * 3
+ z_res_l, z_res_r = per_loc_bin_num * 3, per_loc_bin_num * 4
+ start_offset = z_res_r
+
+ x_res_norm = pred_reg[:, x_res_l:x_res_r][np.arange(len(x_bin)), x_bin]
+ z_res_norm = pred_reg[:, z_res_l:z_res_r][np.arange(len(z_bin)), z_bin]
+
+ x_res = x_res_norm * loc_bin_size
+ z_res = z_res_norm * loc_bin_size
+ pos_x += x_res
+ pos_z += z_res
+
+ # recover y localization
+ if get_y_by_bin:
+ y_bin_l, y_bin_r = start_offset, start_offset + loc_y_bin_num
+ y_res_l, y_res_r = y_bin_r, y_bin_r + loc_y_bin_num
+ start_offset = y_res_r
+
+ y_bin = np.argmax(pred_reg[:, y_bin_l: y_bin_r], axis=1)
+ y_res_norm = pred_reg[:, y_res_l:y_res_r][np.arange(len(y_bin)), y_bin]
+ y_res = y_res_norm * loc_y_bin_size
+ pos_y = y_bin.astype('float32') * loc_y_bin_size + loc_y_bin_size / 2 - loc_y_scope + y_res
+ pos_y = pos_y + np.array(roi_box3d[:, 1]).reshape(-1)
+ else:
+ y_offset_l, y_offset_r = start_offset, start_offset + 1
+ start_offset = y_offset_r
+
+ pos_y = np.array(roi_box3d[:, 1]) + np.array(pred_reg[:, y_offset_l])
+ pos_y = pos_y.reshape(-1)
+
+ # recover ry rotation
+    cdef int ry_bin_l = start_offset
+    cdef int ry_bin_r = start_offset + num_head_bin
+    cdef int ry_res_l = ry_bin_r
+    cdef int ry_res_r = ry_bin_r + num_head_bin
+
+ cdef np.ndarray ry_bin = np.argmax(pred_reg[:, ry_bin_l: ry_bin_r], axis=1)
+ cdef np.ndarray ry_res_norm = pred_reg[:, ry_res_l:ry_res_r][np.arange(len(ry_bin)), ry_bin]
+ if get_ry_fine:
+ # divide pi/2 into several bins
+ angle_per_class = (np.pi / 2) / num_head_bin
+ ry_res = ry_res_norm * (angle_per_class / 2)
+ ry = (ry_bin.astype('float32') * angle_per_class + angle_per_class / 2) + ry_res - np.pi / 4
+ else:
+ angle_per_class = (2 * np.pi) / num_head_bin
+ ry_res = ry_res_norm * (angle_per_class / 2)
+
+ # bin_center is (0, 30, 60, 90, 120, ..., 270, 300, 330)
+ ry = np.fmod(ry_bin.astype('float32') * angle_per_class + ry_res, 2 * np.pi)
+ ry[ry > np.pi] -= 2 * np.pi
+
+ # recover size
+ cdef int size_res_l = ry_res_r
+ cdef int size_res_r = ry_res_r + 3
+ assert size_res_r == pred_reg.shape[1]
+
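+    # size residuals are normalized by the class mean (anchor) size:
+    # hwl = anchor_size * (1 + residual)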
+ cdef np.ndarray size_res_norm = pred_reg[:, size_res_l: size_res_r]
+ cdef np.ndarray hwl = size_res_norm * anchor_size + anchor_size
+
+ # shift to original coords
+ cdef np.ndarray roi_center = np.array(roi_box3d[:, 0:3])
+ cdef np.ndarray shift_ret_box3d = np.concatenate((
+ pos_x.reshape(-1, 1),
+ pos_y.reshape(-1, 1),
+ pos_z.reshape(-1, 1),
+ hwl, ry.reshape(-1, 1)), axis=1)
+ ret_box3d = shift_ret_box3d
+ if roi_box3d.shape[1] == 7:
+ roi_ry = np.array(roi_box3d[:, 6]).reshape(-1)
+ ret_box3d = _rotate_pc_along_y(np.array(shift_ret_box3d), -roi_ry)
+ ret_box3d[:, 6] += roi_ry
+ ret_box3d[:, [0, 2]] += roi_center[:, [0, 2]]
+
+ return ret_box3d
diff --git a/PaddleCV/Paddle3D/PointRCNN/utils/cyops/object3d.py b/PaddleCV/Paddle3D/PointRCNN/utils/cyops/object3d.py
new file mode 100644
index 0000000000000000000000000000000000000000..97d81421afa89a0e26daa4f956c4d835763cb966
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/utils/cyops/object3d.py
@@ -0,0 +1,107 @@
+"""
+This code is borrowed from https://github.com/sshaoshuai/PointRCNN/blob/master/lib/utils/object3d.py
+"""
+import numpy as np
+
+
+def cls_type_to_id(cls_type):
+ type_to_id = {'Car': 1, 'Pedestrian': 2, 'Cyclist': 3, 'Van': 4}
+ if cls_type not in type_to_id.keys():
+ return -1
+ return type_to_id[cls_type]
+
+
+class Object3d(object):
+
+ def __init__(self, line):
+ label = line.strip().split(' ')
+ self.src = line
+ self.cls_type = label[0]
+ self.cls_id = cls_type_to_id(self.cls_type)
+ self.trucation = float(label[1])
+ self.occlusion = float(label[2]) # 0:fully visible 1:partly occluded 2:largely occluded 3:unknown
+ self.alpha = float(label[3])
+ self.box2d = np.array((float(label[4]), float(label[5]), float(label[6]), float(label[7])), dtype=np.float32)
+ self.h = float(label[8])
+ self.w = float(label[9])
+ self.l = float(label[10])
+ self.pos = np.array((float(label[11]), float(label[12]), float(label[13])), dtype=np.float32)
+ self.dis_to_cam = np.linalg.norm(self.pos)
+ self.ry = float(label[14])
+ self.score = float(label[15]) if label.__len__() == 16 else -1.0
+ self.level_str = None
+ self.level = self.get_obj_level()
+
+ def get_obj_level(self):
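+        # difficulty follows the official KITTI protocol: min 2D box height
+        # 40/25/25 px, max truncation 0.15/0.3/0.5, max occlusion 0/1/2 for
+        # Easy/Moderate/Hard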
+ height = float(self.box2d[3]) - float(self.box2d[1]) + 1
+
+ if height >= 40 and self.trucation <= 0.15 and self.occlusion <= 0:
+ self.level_str = 'Easy'
+ return 1 # Easy
+ elif height >= 25 and self.trucation <= 0.3 and self.occlusion <= 1:
+ self.level_str = 'Moderate'
+ return 2 # Moderate
+ elif height >= 25 and self.trucation <= 0.5 and self.occlusion <= 2:
+ self.level_str = 'Hard'
+ return 3 # Hard
+ else:
+ self.level_str = 'UnKnown'
+ return 4
+
+ def generate_corners3d(self):
+ """
+ generate corners3d representation for this object
+ :return corners_3d: (8, 3) corners of box3d in camera coord
+ """
+ l, h, w = self.l, self.h, self.w
+ x_corners = [l / 2, l / 2, -l / 2, -l / 2, l / 2, l / 2, -l / 2, -l / 2]
+ y_corners = [0, 0, 0, 0, -h, -h, -h, -h]
+ z_corners = [w / 2, -w / 2, -w / 2, w / 2, w / 2, -w / 2, -w / 2, w / 2]
+
+ R = np.array([[np.cos(self.ry), 0, np.sin(self.ry)],
+ [0, 1, 0],
+ [-np.sin(self.ry), 0, np.cos(self.ry)]])
+ corners3d = np.vstack([x_corners, y_corners, z_corners]) # (3, 8)
+ corners3d = np.dot(R, corners3d).T
+ corners3d = corners3d + self.pos
+ return corners3d
+
+ def to_bev_box2d(self, oblique=True, voxel_size=0.1):
+ """
+ :param bev_shape: (2) for bev shape (h, w), => (y_max, x_max) in image
+ :param voxel_size: float, 0.1m
+ :param oblique:
+ :return: box2d (4, 2)/ (4) in image coordinate
+ """
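+        # NOTE: Object3d.MIN_XZ and Object3d.BEV_SHAPE are not defined in this
+        # file; callers are expected to attach them to the class before use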
+ if oblique:
+ corners3d = self.generate_corners3d()
+ xz_corners = corners3d[0:4, [0, 2]]
+ box2d = np.zeros((4, 2), dtype=np.int32)
+ box2d[:, 0] = ((xz_corners[:, 0] - Object3d.MIN_XZ[0]) / voxel_size).astype(np.int32)
+ box2d[:, 1] = Object3d.BEV_SHAPE[0] - 1 - ((xz_corners[:, 1] - Object3d.MIN_XZ[1]) / voxel_size).astype(np.int32)
+ box2d[:, 0] = np.clip(box2d[:, 0], 0, Object3d.BEV_SHAPE[1])
+ box2d[:, 1] = np.clip(box2d[:, 1], 0, Object3d.BEV_SHAPE[0])
+ else:
+ box2d = np.zeros(4, dtype=np.int32)
+ # discrete_center = np.floor((self.pos / voxel_size)).astype(np.int32)
+ cu = np.floor((self.pos[0] - Object3d.MIN_XZ[0]) / voxel_size).astype(np.int32)
+ cv = Object3d.BEV_SHAPE[0] - 1 - ((self.pos[2] - Object3d.MIN_XZ[1]) / voxel_size).astype(np.int32)
+ half_l, half_w = int(self.l / voxel_size / 2), int(self.w / voxel_size / 2)
+ box2d[0], box2d[1] = cu - half_l, cv - half_w
+ box2d[2], box2d[3] = cu + half_l, cv + half_w
+
+ return box2d
+
+ def to_str(self):
+ print_str = '%s %.3f %.3f %.3f box2d: %s hwl: [%.3f %.3f %.3f] pos: %s ry: %.3f' \
+ % (self.cls_type, self.trucation, self.occlusion, self.alpha, self.box2d, self.h, self.w, self.l,
+ self.pos, self.ry)
+ return print_str
+
+ def to_kitti_format(self):
+ kitti_str = '%s %.2f %d %.2f %.2f %.2f %.2f %.2f %.2f %.2f %.2f %.2f %.2f %.2f %.2f' \
+ % (self.cls_type, self.trucation, int(self.occlusion), self.alpha, self.box2d[0], self.box2d[1],
+ self.box2d[2], self.box2d[3], self.h, self.w, self.l, self.pos[0], self.pos[1], self.pos[2],
+ self.ry)
+ return kitti_str
+
diff --git a/PaddleCV/Paddle3D/PointRCNN/utils/cyops/roipool3d_utils.pyx b/PaddleCV/Paddle3D/PointRCNN/utils/cyops/roipool3d_utils.pyx
new file mode 100644
index 0000000000000000000000000000000000000000..3efa83135fed11d3e3a3daceb821c63424beb524
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/utils/cyops/roipool3d_utils.pyx
@@ -0,0 +1,160 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
+#
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
+
+import numpy as np
+cimport numpy as np
+cimport cython
+from libc.math cimport sin, cos
+
+@cython.boundscheck(False)
+@cython.wraparound(False)
+cdef enlarge_box3d(np.ndarray boxes3d, float extra_width):
+ """
+ :param boxes3d: (N, 7) [x, y, z, h, w, l, ry]
+ """
+    large_boxes3d = boxes3d.copy()
+ large_boxes3d[:, 3:6] += extra_width * 2
+ large_boxes3d[:, 1] += extra_width
+ return large_boxes3d
+
+@cython.boundscheck(False)
+@cython.wraparound(False)
+cdef pt_in_box(float x, float y, float z, float cx, float bottom_y, float cz, float h, float w, float l, float angle):
+    cdef float max_dis = 10.0
+    cdef float cy = bottom_y - h / 2.0
+    if ((abs(x - cx) > max_dis) or (abs(y - cy) > h / 2.0) or (abs(z - cz) > max_dis)):
+ return 0
+ cdef float cosa = cos(angle)
+ cdef float sina = sin(angle)
+ cdef float x_rot = (x - cx) * cosa + (z - cz) * (-sina)
+
+ cdef float z_rot = (x - cx) * sina + (z - cz) * cosa
+
+    cdef bint flag = (x_rot >= -l / 2.0) and (x_rot <= l / 2.0) and (z_rot >= -w / 2.0) and (z_rot <= w / 2.0)
+ return flag
+
+@cython.boundscheck(False)
+@cython.wraparound(False)
+cdef _rotate_pc_along_y(np.ndarray pc, float rot_angle):
+ """
+ params pc: (N, 3+C), (N, 3) is in the rectified camera coordinate
+ params rot_angle: rad scalar
+ Output pc: updated pc with XYZ rotated
+ """
+ cosval = np.cos(rot_angle)
+ sinval = np.sin(rot_angle)
+ rotmat = np.array([[cosval, -sinval], [sinval, cosval]])
+ pc[:, [0, 2]] = np.dot(pc[:, [0, 2]], np.transpose(rotmat))
+ return pc
+
+@cython.boundscheck(False)
+@cython.wraparound(False)
+def roipool3d_cpu(
+ np.ndarray[float, ndim=2] pts,
+ np.ndarray[float, ndim=2] pts_feature,
+ np.ndarray[float, ndim=2] boxes3d,
+ np.ndarray[float, ndim=2] pts_extra_input,
+        float pool_extra_width, int sampled_pt_num, int batch_size=1, bint canonical_transform=False):
+ cdef np.ndarray pts_feature_all = np.concatenate((pts_extra_input, pts_feature), axis=1)
+
+ cdef np.ndarray larged_boxes3d = enlarge_box3d(boxes3d.reshape(-1, 7), pool_extra_width).reshape(batch_size, -1, 7)
+
+    cdef int pts_num = pts.shape[0]
+ cdef int boxes_num = boxes3d.shape[0]
+ cdef int feature_len = pts_feature_all.shape[1]
+ cdef np.ndarray pts_data = np.zeros((batch_size, boxes_num, sampled_pt_num, 3))
+ cdef np.ndarray features_data = np.zeros((batch_size, boxes_num, sampled_pt_num, feature_len))
+ cdef np.ndarray empty_flag_data = np.zeros((batch_size, boxes_num))
+
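+    # For each (enlarged) ROI, gather up to sampled_pt_num points that fall
+    # inside it; ROIs with fewer points are padded by cycling the collected
+    # points (the k % cnt indexing below), and ROIs with none are flagged empty.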
+ cdef int cnt = 0
+ cdef float cx = 0.
+ cdef float bottom_y = 0.
+ cdef float cz = 0.
+ cdef float h = 0.
+ cdef float w = 0.
+ cdef float l = 0.
+ cdef float ry = 0.
+ cdef float x = 0.
+ cdef float y = 0.
+ cdef float z = 0.
+ cdef np.ndarray x_i
+ cdef np.ndarray feat_i
+ cdef int bs
+ cdef int i
+ cdef int j
+ for bs in range(batch_size):
+ # boxes: 64,7
+ for i in range(boxes_num):
+ cnt = 0
+ # box
+ box = larged_boxes3d[bs][i]
+ cx = box[0]
+ bottom_y = box[1]
+ cz = box[2]
+ h = box[3]
+ w = box[4]
+ l = box[5]
+ ry = box[6]
+ # points: 16384,3
+ x_i = pts
+ # features: 16384, 128
+ feat_i = pts_feature_all
+
+ for j in range(pts_num):
+ x = x_i[j][0]
+ y = x_i[j][1]
+ z = x_i[j][2]
+ cur_in_flag = pt_in_box(x,y,z,cx,bottom_y,cz,h,w,l,ry)
+ if cur_in_flag:
+ if cnt < sampled_pt_num:
+ pts_data[bs][i][cnt][:] = x_i[j]
+ features_data[bs][i][cnt][:] = feat_i[j]
+ cnt += 1
+ else:
+ break
+
+ if cnt == 0:
+ empty_flag_data[bs][i] = 1
+ elif (cnt < sampled_pt_num):
+ for k in range(cnt, sampled_pt_num):
+ pts_data[bs][i][k] = pts_data[bs][i][k % cnt]
+ features_data[bs][i][k] = features_data[bs][i][k % cnt]
+
+
+ pooled_pts = pts_data.astype("float32")[0]
+ pooled_features = features_data.astype('float32')[0]
+ pooled_empty_flag = empty_flag_data.astype('int64')[0]
+
+ cdef int extra_input_len = pts_extra_input.shape[1]
+ pooled_pts = np.concatenate((pooled_pts, pooled_features[:,:,0:extra_input_len]),axis=2)
+ pooled_features = pooled_features[:,:,extra_input_len:]
+
+ if canonical_transform:
+ # Translate to the roi coordinates
+ roi_ry = boxes3d[:, 6] % (2 * np.pi) # 0~2pi
+ roi_center = boxes3d[:, 0:3]
+ # shift to center
+ pooled_pts[:, :, 0:3] = pooled_pts[:, :, 0:3] - roi_center[:, np.newaxis, :]
+ for k in range(pooled_pts.shape[0]):
+ pooled_pts[k] = _rotate_pc_along_y(pooled_pts[k], roi_ry[k])
+
+    return pooled_pts, pooled_features, pooled_empty_flag
+
diff --git a/PaddleCV/Paddle3D/PointRCNN/utils/cyops/setup.py b/PaddleCV/Paddle3D/PointRCNN/utils/cyops/setup.py
new file mode 100644
index 0000000000000000000000000000000000000000..0d775017468bbb683d0ea0f0058062e5de12da73
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/utils/cyops/setup.py
@@ -0,0 +1,74 @@
+# Copyright (c) 2017-present, Facebook, Inc.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+##############################################################################
+
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
+from Cython.Build import cythonize
+from setuptools import Extension
+from setuptools import setup
+
+import numpy as np
+
+_NP_INCLUDE_DIRS = np.get_include()
+
+
+# Extension modules
+ext_modules = [
+ Extension(
+ name='utils.cyops.roipool3d_utils',
+ sources=[
+ 'utils/cyops/roipool3d_utils.pyx'
+ ],
+ extra_compile_args=[
+ '-Wno-cpp'
+ ],
+ include_dirs=[
+ _NP_INCLUDE_DIRS
+ ]
+ ),
+
+ Extension(
+ name='utils.cyops.iou3d_utils',
+ sources=[
+ 'utils/cyops/iou3d_utils.pyx'
+ ],
+ extra_compile_args=[
+ '-Wno-cpp'
+ ],
+ include_dirs=[
+ _NP_INCLUDE_DIRS
+ ]
+ ),
+
+ Extension(
+ name='utils.cyops.kitti_utils',
+ sources=[
+ 'utils/cyops/kitti_utils.pyx'
+ ],
+ extra_compile_args=[
+ '-Wno-cpp'
+ ],
+ include_dirs=[
+ _NP_INCLUDE_DIRS
+ ]
+ ),
+]
+
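+# Build in place so the compiled modules land under utils/cyops (presumably
+# what build_and_install.sh invokes):
+#   python setup.py build_ext --inplace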
+setup(
+ name='pp_pointrcnn',
+ ext_modules=cythonize(ext_modules)
+)
diff --git a/PaddleCV/Paddle3D/PointRCNN/utils/metric_utils.py b/PaddleCV/Paddle3D/PointRCNN/utils/metric_utils.py
new file mode 100644
index 0000000000000000000000000000000000000000..aa7ee70652ac4e76aef9f4d755ec057ef2bc9123
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/utils/metric_utils.py
@@ -0,0 +1,216 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import os
+import sys
+import logging
+import numpy as np
+import utils.cyops.kitti_utils as kitti_utils
+from utils.config import cfg
+from utils.box_utils import boxes_iou3d, box_nms_eval, boxes3d_to_bev
+from utils.save_utils import save_rpn_feature, save_kitti_result, save_kitti_format
+
+__all__ = ['calc_iou_recall', 'rpn_metric', 'rcnn_metric']
+
+logging.root.handlers = []
+FORMAT = '%(asctime)s-%(levelname)s: %(message)s'
+logging.basicConfig(level=logging.INFO, format=FORMAT, stream=sys.stdout)
+logger = logging.getLogger(__name__)
+
+
+def calc_iou_recall(rets, thresh_list):
+ rpn_cls_label = rets['rpn_cls_label'][0]
+ boxes3d = rets['rois'][0]
+ seg_mask = rets['seg_mask'][0]
+ sample_id = rets['sample_id'][0]
+ gt_boxes3d = rets['gt_boxes3d'][0]
+ gt_boxes3d_num = rets['gt_boxes3d'][1]
+
+ gt_box_idx = 0
+ recalled_bbox_list = [0] * len(thresh_list)
+ gt_box_num = 0
+ rpn_iou_sum = 0.
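+    # recalled_bbox_list[i] counts ground-truth boxes whose best-overlapping
+    # proposal exceeds thresh_list[i] in 3D IoU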
+ for i in range(len(gt_boxes3d_num)):
+ cur_rpn_cls_label = rpn_cls_label[i]
+ cur_boxes3d = boxes3d[i]
+ cur_seg_mask = seg_mask[i]
+ cur_sample_id = sample_id[i]
+ cur_gt_boxes3d = gt_boxes3d[gt_box_idx: gt_box_idx +
+ gt_boxes3d_num[0][i]]
+ gt_box_idx += gt_boxes3d_num[0][i]
+
+        k = len(cur_gt_boxes3d) - 1
+ while k >= 0 and np.sum(cur_gt_boxes3d[k]) == 0:
+ k -= 1
+ cur_gt_boxes3d = cur_gt_boxes3d[:k + 1]
+
+ if cur_gt_boxes3d.shape[0] > 0:
+ iou3d = boxes_iou3d(cur_boxes3d, cur_gt_boxes3d[:, 0:7])
+ gt_max_iou = iou3d.max(axis=0)
+
+ for idx, thresh in enumerate(thresh_list):
+ recalled_bbox_list[idx] += np.sum(gt_max_iou > thresh)
+            gt_box_num += len(cur_gt_boxes3d)
+
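+        # point-segmentation IoU between the predicted foreground mask and the
+        # RPN classification label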
+ fg_mask = cur_rpn_cls_label > 0
+ correct = np.sum(np.logical_and(
+ cur_seg_mask == cur_rpn_cls_label, fg_mask))
+ union = np.sum(fg_mask) + np.sum(cur_seg_mask > 0) - correct
+ rpn_iou = float(correct) / max(float(union), 1.0)
+ rpn_iou_sum += rpn_iou
+        logger.debug('sample_id:{}, rpn_iou:{}, gt_box_num:{}, recalled_bbox_list:{}'.format(
+            cur_sample_id, rpn_iou, gt_box_num, str(recalled_bbox_list)))
+
+ return len(gt_boxes3d_num), gt_box_num, rpn_iou_sum, recalled_bbox_list
+
+
+def rpn_metric(queue, mdict, lock, thresh_list, is_save_rpn_feature, kitti_feature_dir,
+ seg_output_dir, kitti_output_dir, kitti_rcnn_reader, classes):
+ while True:
+ rets_dict = queue.get()
+ if rets_dict is None:
+ lock.acquire()
+ mdict['exit_proc'] += 1
+ lock.release()
+ return
+
+ cnt, gt_box_num, rpn_iou_sum, recalled_bbox_list = calc_iou_recall(
+ rets_dict, thresh_list)
+ lock.acquire()
+ mdict['total_cnt'] += cnt
+ mdict['total_gt_bbox'] += gt_box_num
+ mdict['total_rpn_iou'] += rpn_iou_sum
+ for i, bbox_num in enumerate(recalled_bbox_list):
+ mdict['total_recalled_bbox_list_{}'.format(i)] += bbox_num
+ logger.debug("rpn_metric: {}".format(str(mdict)))
+ lock.release()
+
+ if is_save_rpn_feature:
+ save_rpn_feature(rets_dict, kitti_feature_dir)
+ save_kitti_result(
+ rets_dict, seg_output_dir, kitti_output_dir, kitti_rcnn_reader, classes)
+
+
+def rcnn_metric(queue, mdict, lock, thresh_list, kitti_rcnn_reader, roi_output_dir,
+ refine_output_dir, final_output_dir, is_save_result=False):
+ while True:
+ rets_dict = queue.get()
+ if rets_dict is None:
+ lock.acquire()
+ mdict['exit_proc'] += 1
+ lock.release()
+ return
+
+        for k, v in rets_dict.items():
+ rets_dict[k] = v[0]
+
+ rcnn_cls = rets_dict['rcnn_cls']
+ rcnn_reg = rets_dict['rcnn_reg']
+ roi_boxes3d = rets_dict['roi_boxes3d']
+ roi_scores = rets_dict['roi_scores']
+
+ # bounding box regression
+ anchor_size = cfg.CLS_MEAN_SIZE[0]
+ pred_boxes3d = kitti_utils.decode_bbox_target(
+ roi_boxes3d,
+ rcnn_reg,
+ anchor_size=np.array(anchor_size),
+ loc_scope=cfg.RCNN.LOC_SCOPE,
+ loc_bin_size=cfg.RCNN.LOC_BIN_SIZE,
+ num_head_bin=cfg.RCNN.NUM_HEAD_BIN,
+ get_xz_fine=True,
+ get_y_by_bin=cfg.RCNN.LOC_Y_BY_BIN,
+ loc_y_scope=cfg.RCNN.LOC_Y_SCOPE,
+ loc_y_bin_size=cfg.RCNN.LOC_Y_BIN_SIZE,
+ get_ry_fine=True
+ )
+
+        # scoring
+        # norm_scores is needed both here and for the final score filtering below
+        norm_scores = rets_dict['norm_scores']
+        if rcnn_cls.shape[1] == 1:
+            raw_scores = rcnn_cls.reshape(-1)
+            pred_classes = norm_scores > cfg.RCNN.SCORE_THRESH
+            pred_classes = pred_classes.astype(np.float32)
+        else:
+            pred_classes = np.argmax(rcnn_cls, axis=1).reshape(-1)
+            raw_scores = rcnn_cls[:, pred_classes]
+
+ # evaluation
+ gt_iou = rets_dict['gt_iou']
+ gt_boxes3d = rets_dict['gt_boxes3d']
+
+ # recall
+ if gt_boxes3d.size > 0:
+ gt_num = gt_boxes3d.shape[1]
+ gt_boxes3d = gt_boxes3d.reshape((-1,7))
+ iou3d = boxes_iou3d(pred_boxes3d, gt_boxes3d)
+ gt_max_iou = iou3d.max(axis=0)
+ refined_iou = iou3d.max(axis=1)
+
+ recalled_num = (gt_max_iou > 0.7).sum()
+ roi_boxes3d = roi_boxes3d.reshape((-1,7))
+ iou3d_in = boxes_iou3d(roi_boxes3d, gt_boxes3d)
+ gt_max_iou_in = iou3d_in.max(axis=0)
+
+ lock.acquire()
+ mdict['total_gt_bbox'] += gt_num
+ for idx, thresh in enumerate(thresh_list):
+ recalled_bbox_num = (gt_max_iou > thresh).sum()
+ mdict['total_recalled_bbox_list_{}'.format(idx)] += recalled_bbox_num
+ for idx, thresh in enumerate(thresh_list):
+ roi_recalled_bbox_num = (gt_max_iou_in > thresh).sum()
+ mdict['total_roi_recalled_bbox_list_{}'.format(idx)] += roi_recalled_bbox_num
+ lock.release()
+
+ # classification accuracy
+ cls_label = gt_iou > cfg.RCNN.CLS_FG_THRESH
+ cls_label = cls_label.astype(np.float32)
+ cls_valid_mask = (gt_iou >= cfg.RCNN.CLS_FG_THRESH) | (gt_iou <= cfg.RCNN.CLS_BG_THRESH)
+ cls_valid_mask = cls_valid_mask.astype(np.float32)
+ cls_acc = (pred_classes == cls_label).astype(np.float32)
+ cls_acc = (cls_acc * cls_valid_mask).sum() / max(cls_valid_mask.sum(), 1.0) * 1.0
+
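+        # KITTI evaluates cars at IoU 0.7 and pedestrians/cyclists at 0.5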
+ iou_thresh = 0.7 if cfg.CLASSES == 'Car' else 0.5
+ cls_label_refined = (gt_iou >= iou_thresh)
+ cls_label_refined = cls_label_refined.astype(np.float32)
+ cls_acc_refined = (pred_classes == cls_label_refined).astype(np.float32).sum() / max(cls_label_refined.shape[0], 1.0)
+
+ sample_id = rets_dict['sample_id']
+ image_shape = kitti_rcnn_reader.get_image_shape(sample_id)
+
+ if is_save_result:
+ roi_boxes3d_np = roi_boxes3d
+ pred_boxes3d_np = pred_boxes3d
+ calib = kitti_rcnn_reader.get_calib(sample_id)
+ save_kitti_format(sample_id, calib, roi_boxes3d_np, roi_output_dir, roi_scores, image_shape)
+ save_kitti_format(sample_id, calib, pred_boxes3d_np, refine_output_dir, raw_scores, image_shape)
+
+ inds = norm_scores > cfg.RCNN.SCORE_THRESH
+ if inds.astype(np.float32).sum() == 0:
+ logger.debug("The num of 'norm_scores > thresh' of sample {} is 0".format(sample_id))
+ continue
+ pred_boxes3d_selected = pred_boxes3d[inds]
+ raw_scores_selected = raw_scores[inds]
+ # NMS thresh
+ boxes_bev_selected = boxes3d_to_bev(pred_boxes3d_selected)
+ scores_selected, pred_boxes3d_selected = box_nms_eval(boxes_bev_selected, raw_scores_selected, pred_boxes3d_selected, cfg.RCNN.NMS_THRESH)
+ calib = kitti_rcnn_reader.get_calib(sample_id)
+ save_kitti_format(sample_id, calib, pred_boxes3d_selected, final_output_dir, scores_selected, image_shape)
+ lock.acquire()
+ mdict['total_det_num'] += pred_boxes3d_selected.shape[0]
+ mdict['total_cls_acc'] += cls_acc
+ mdict['total_cls_acc_refined'] += cls_acc_refined
+ lock.release()
+ logger.debug("rcnn_metric: {}".format(str(mdict)))
+
diff --git a/PaddleCV/Paddle3D/PointRCNN/utils/object3d.py b/PaddleCV/Paddle3D/PointRCNN/utils/object3d.py
new file mode 100644
index 0000000000000000000000000000000000000000..7b5703bdbfba1c1bf239c2a2c9f2179ea908a7e5
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/utils/object3d.py
@@ -0,0 +1,113 @@
+"""
+This code is borrowed from https://github.com/sshaoshuai/PointRCNN/blob/master/lib/utils/object3d.py
+"""
+import numpy as np
+
+
+def cls_type_to_id(cls_type):
+ type_to_id = {'Car': 1, 'Pedestrian': 2, 'Cyclist': 3, 'Van': 4}
+ if cls_type not in type_to_id.keys():
+ return -1
+ return type_to_id[cls_type]
+
+
+def get_objects_from_label(label_file):
+ with open(label_file, 'r') as f:
+ lines = f.readlines()
+ objects = [Object3d(line) for line in lines]
+ return objects
+
+
+class Object3d(object):
+ def __init__(self, line):
+ label = line.strip().split(' ')
+ self.src = line
+ self.cls_type = label[0]
+ self.cls_id = cls_type_to_id(self.cls_type)
+ self.trucation = float(label[1])
+ self.occlusion = float(label[2]) # 0:fully visible 1:partly occluded 2:largely occluded 3:unknown
+ self.alpha = float(label[3])
+ self.box2d = np.array((float(label[4]), float(label[5]), float(label[6]), float(label[7])), dtype=np.float32)
+ self.h = float(label[8])
+ self.w = float(label[9])
+ self.l = float(label[10])
+ self.pos = np.array((float(label[11]), float(label[12]), float(label[13])), dtype=np.float32)
+ self.dis_to_cam = np.linalg.norm(self.pos)
+ self.ry = float(label[14])
+ self.score = float(label[15]) if label.__len__() == 16 else -1.0
+ self.level_str = None
+ self.level = self.get_obj_level()
+
+ def get_obj_level(self):
+ height = float(self.box2d[3]) - float(self.box2d[1]) + 1
+
+ if height >= 40 and self.trucation <= 0.15 and self.occlusion <= 0:
+ self.level_str = 'Easy'
+ return 1 # Easy
+ elif height >= 25 and self.trucation <= 0.3 and self.occlusion <= 1:
+ self.level_str = 'Moderate'
+ return 2 # Moderate
+ elif height >= 25 and self.trucation <= 0.5 and self.occlusion <= 2:
+ self.level_str = 'Hard'
+ return 3 # Hard
+ else:
+ self.level_str = 'UnKnown'
+ return 4
+
+ def generate_corners3d(self):
+ """
+ generate corners3d representation for this object
+ :return corners_3d: (8, 3) corners of box3d in camera coord
+ """
+ l, h, w = self.l, self.h, self.w
+ x_corners = [l / 2, l / 2, -l / 2, -l / 2, l / 2, l / 2, -l / 2, -l / 2]
+ y_corners = [0, 0, 0, 0, -h, -h, -h, -h]
+ z_corners = [w / 2, -w / 2, -w / 2, w / 2, w / 2, -w / 2, -w / 2, w / 2]
+
+ R = np.array([[np.cos(self.ry), 0, np.sin(self.ry)],
+ [0, 1, 0],
+ [-np.sin(self.ry), 0, np.cos(self.ry)]])
+ corners3d = np.vstack([x_corners, y_corners, z_corners]) # (3, 8)
+ corners3d = np.dot(R, corners3d).T
+ corners3d = corners3d + self.pos
+ return corners3d
+
+ def to_bev_box2d(self, oblique=True, voxel_size=0.1):
+ """
+ :param bev_shape: (2) for bev shape (h, w), => (y_max, x_max) in image
+ :param voxel_size: float, 0.1m
+ :param oblique:
+ :return: box2d (4, 2)/ (4) in image coordinate
+ """
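+        # NOTE: Object3d.MIN_XZ and Object3d.BEV_SHAPE must be attached to the
+        # class by the caller; they are not defined in this file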
+ if oblique:
+ corners3d = self.generate_corners3d()
+ xz_corners = corners3d[0:4, [0, 2]]
+ box2d = np.zeros((4, 2), dtype=np.int32)
+ box2d[:, 0] = ((xz_corners[:, 0] - Object3d.MIN_XZ[0]) / voxel_size).astype(np.int32)
+ box2d[:, 1] = Object3d.BEV_SHAPE[0] - 1 - ((xz_corners[:, 1] - Object3d.MIN_XZ[1]) / voxel_size).astype(np.int32)
+ box2d[:, 0] = np.clip(box2d[:, 0], 0, Object3d.BEV_SHAPE[1])
+ box2d[:, 1] = np.clip(box2d[:, 1], 0, Object3d.BEV_SHAPE[0])
+ else:
+ box2d = np.zeros(4, dtype=np.int32)
+ # discrete_center = np.floor((self.pos / voxel_size)).astype(np.int32)
+ cu = np.floor((self.pos[0] - Object3d.MIN_XZ[0]) / voxel_size).astype(np.int32)
+ cv = Object3d.BEV_SHAPE[0] - 1 - ((self.pos[2] - Object3d.MIN_XZ[1]) / voxel_size).astype(np.int32)
+ half_l, half_w = int(self.l / voxel_size / 2), int(self.w / voxel_size / 2)
+ box2d[0], box2d[1] = cu - half_l, cv - half_w
+ box2d[2], box2d[3] = cu + half_l, cv + half_w
+
+ return box2d
+
+ def to_str(self):
+ print_str = '%s %.3f %.3f %.3f box2d: %s hwl: [%.3f %.3f %.3f] pos: %s ry: %.3f' \
+ % (self.cls_type, self.trucation, self.occlusion, self.alpha, self.box2d, self.h, self.w, self.l,
+ self.pos, self.ry)
+ return print_str
+
+ def to_kitti_format(self):
+ kitti_str = '%s %.2f %d %.2f %.2f %.2f %.2f %.2f %.2f %.2f %.2f %.2f %.2f %.2f %.2f' \
+ % (self.cls_type, self.trucation, int(self.occlusion), self.alpha, self.box2d[0], self.box2d[1],
+ self.box2d[2], self.box2d[3], self.h, self.w, self.l, self.pos[0], self.pos[1], self.pos[2],
+ self.ry)
+ return kitti_str
+
diff --git a/PaddleCV/Paddle3D/PointRCNN/utils/optimizer.py b/PaddleCV/Paddle3D/PointRCNN/utils/optimizer.py
new file mode 100644
index 0000000000000000000000000000000000000000..e32d1df862de7692e520168a2b35f482535f3ac6
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/utils/optimizer.py
@@ -0,0 +1,122 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""Optimization and learning rate scheduling."""
+
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
+import numpy as np
+import paddle.fluid as fluid
+import paddle.fluid.layers.learning_rate_scheduler as lr_scheduler
+from paddle.fluid.layers import control_flow
+
+import logging
+logger = logging.getLogger(__name__)
+
+def cosine_warmup_decay(learning_rate, betas, warmup_factor, decay_factor,
+ total_step, warmup_pct):
+ def annealing_cos(start, end, pct):
+ "Cosine anneal from `start` to `end` as pct goes from 0.0 to 1.0."
+ cos_out = fluid.layers.cos(pct * np.pi) + 1.
+ return cos_out * (start - end) / 2. + end
+
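+    # One-cycle style schedule: during warmup the LR cosine-anneals from
+    # learning_rate * warmup_factor up to learning_rate while beta1 anneals
+    # from betas[0] down to betas[1]; after warmup both anneal back the other
+    # way, the LR ending at learning_rate * decay_factor.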
+ warmup_start_lr = learning_rate * warmup_factor
+ decay_end_lr = learning_rate * decay_factor
+ warmup_step = total_step * warmup_pct
+
+ global_step = lr_scheduler._decay_step_counter()
+
+ lr = fluid.layers.create_global_var(
+ shape=[1],
+ value=float(learning_rate),
+ dtype='float32',
+ persistable=True,
+ name="learning_rate")
+ beta1 = fluid.layers.create_global_var(
+ shape=[1],
+ value=float(betas[0]),
+ dtype='float32',
+ persistable=True,
+ name="beta1")
+
+ warmup_step_var = fluid.layers.fill_constant(
+ shape=[1], dtype='float32', value=float(warmup_step), force_cpu=True)
+
+ with control_flow.Switch() as switch:
+ with switch.case(global_step < warmup_step_var):
+ cur_lr = annealing_cos(warmup_start_lr, learning_rate,
+ global_step / warmup_step_var)
+ fluid.layers.assign(cur_lr, lr)
+ cur_beta1 = annealing_cos(betas[0], betas[1],
+ global_step / warmup_step_var)
+ fluid.layers.assign(cur_beta1, beta1)
+ with switch.case(global_step >= warmup_step_var):
+ cur_lr = annealing_cos(learning_rate, decay_end_lr,
+ (global_step - warmup_step_var) / (total_step - warmup_step))
+ fluid.layers.assign(cur_lr, lr)
+ cur_beta1 = annealing_cos(betas[1], betas[0],
+ (global_step - warmup_step_var) / (total_step - warmup_step))
+ fluid.layers.assign(cur_beta1, beta1)
+
+ return lr, beta1
+
+
+def optimize(loss,
+ learning_rate,
+ warmup_factor,
+ decay_factor,
+ total_step,
+ warmup_pct,
+ train_program,
+ startup_prog,
+ weight_decay,
+ clip_norm,
+ beta1=[0.95, 0.85],
+ beta2=0.99,
+ scheduler='cosine_warmup_decay'):
+
+    scheduled_lr = None
+    if scheduler == 'cosine_warmup_decay':
+        scheduled_lr, scheduled_beta1 = cosine_warmup_decay(learning_rate, beta1, warmup_factor,
+                                                            decay_factor, total_step,
+                                                            warmup_pct)
+    else:
+        raise ValueError("Unknown learning rate scheduler, should be "
+                         "'cosine_warmup_decay'")
+
+ optimizer = fluid.optimizer.Adam(learning_rate=scheduled_lr,
+ beta1=scheduled_beta1,
+ beta2=beta2)
+ fluid.clip.set_gradient_clip(
+ clip=fluid.clip.GradientClipByGlobalNorm(clip_norm=clip_norm))
+
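+    # Decoupled (AdamW-style) weight decay, implemented manually: snapshot the
+    # parameters before the Adam step, then subtract
+    # weight_decay * scheduled_lr * snapshot from each updated parameter.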
+ param_list = dict()
+
+ if weight_decay > 0:
+ for param in train_program.global_block().all_parameters():
+ param_list[param.name] = param * 1.0
+ param_list[param.name].stop_gradient = True
+
+ _, param_grads = optimizer.minimize(loss)
+
+ if weight_decay > 0:
+ for param, grad in param_grads:
+ with param.block.program._optimized_guard(
+ [param, grad]), fluid.framework.name_scope("weight_decay"):
+ updated_param = param - param_list[
+ param.name] * weight_decay * scheduled_lr
+ fluid.layers.assign(output=param, input=updated_param)
+
+ return scheduled_lr
diff --git a/PaddleCV/Paddle3D/PointRCNN/utils/proposal_target.py b/PaddleCV/Paddle3D/PointRCNN/utils/proposal_target.py
new file mode 100644
index 0000000000000000000000000000000000000000..deda51180bfb9007f1dadd265c3f33f397b1cccf
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/utils/proposal_target.py
@@ -0,0 +1,369 @@
+import numpy as np
+from utils.cyops import kitti_utils, roipool3d_utils, iou3d_utils
+
+CLOSE_RANDOM = False
+
+def get_proposal_target_func(cfg, mode='TRAIN'):
+
+ def sample_rois_for_rcnn(roi_boxes3d, gt_boxes3d):
+ """
+ :param roi_boxes3d: (B, M, 7)
+ :param gt_boxes3d: (B, N, 8) [x, y, z, h, w, l, ry, cls]
+ :return
+ batch_rois: (B, N, 7)
+ batch_gt_of_rois: (B, N, 8)
+ batch_roi_iou: (B, N)
+ """
+
+ batch_size = roi_boxes3d.shape[0]
+
+ fg_rois_per_image = int(np.round(cfg.RCNN.FG_RATIO * cfg.RCNN.ROI_PER_IMAGE))
+
+ batch_rois = np.zeros((batch_size, cfg.RCNN.ROI_PER_IMAGE, 7))
+ batch_gt_of_rois = np.zeros((batch_size, cfg.RCNN.ROI_PER_IMAGE, 7))
+ batch_roi_iou = np.zeros((batch_size, cfg.RCNN.ROI_PER_IMAGE))
+ for idx in range(batch_size):
+ cur_roi, cur_gt = roi_boxes3d[idx], gt_boxes3d[idx]
+ k = cur_gt.shape[0] - 1
+ while cur_gt[k].sum() == 0:
+ k -= 1
+ cur_gt = cur_gt[:k + 1]
+ # include gt boxes in the candidate rois
+ iou3d = iou3d_utils.boxes_iou3d(cur_roi, cur_gt[:, 0:7]) # (M, N)
+ max_overlaps = np.max(iou3d, axis=1)
+ gt_assignment = np.argmax(iou3d, axis=1)
+ # sample fg, easy_bg, hard_bg
+ fg_thresh = min(cfg.RCNN.REG_FG_THRESH, cfg.RCNN.CLS_FG_THRESH)
+ fg_inds = np.where(max_overlaps >= fg_thresh)[0].reshape(-1)
+
+ # TODO: this will mix the fg and bg when CLS_BG_THRESH_LO < iou < CLS_BG_THRESH
+ # fg_inds = torch.cat((fg_inds, roi_assignment), dim=0) # consider the roi which has max_iou with gt as fg
+ easy_bg_inds = np.where(max_overlaps < cfg.RCNN.CLS_BG_THRESH_LO)[0].reshape(-1)
+ hard_bg_inds = np.where((max_overlaps < cfg.RCNN.CLS_BG_THRESH) & (max_overlaps >= cfg.RCNN.CLS_BG_THRESH_LO))[0].reshape(-1)
+
+ fg_num_rois = fg_inds.shape[0]
+ bg_num_rois = hard_bg_inds.shape[0] + easy_bg_inds.shape[0]
+
+ if fg_num_rois > 0 and bg_num_rois > 0:
+ # sampling fg
+ fg_rois_per_this_image = min(fg_rois_per_image, fg_num_rois)
+ if CLOSE_RANDOM:
+ fg_inds = fg_inds[:fg_rois_per_this_image]
+ else:
+ rand_num = np.random.permutation(fg_num_rois)
+ fg_inds = fg_inds[rand_num[:fg_rois_per_this_image]]
+
+ # sampling bg
+ bg_rois_per_this_image = cfg.RCNN.ROI_PER_IMAGE - fg_rois_per_this_image
+ bg_inds = sample_bg_inds(hard_bg_inds, easy_bg_inds, bg_rois_per_this_image)
+
+ elif fg_num_rois > 0 and bg_num_rois == 0:
+ # sampling fg
+ rand_num = np.floor(np.random.rand(cfg.RCNN.ROI_PER_IMAGE) * fg_num_rois)
+ # rand_num = torch.from_numpy(rand_num).type_as(gt_boxes3d).long()
+ fg_inds = fg_inds[rand_num]
+ fg_rois_per_this_image = cfg.RCNN.ROI_PER_IMAGE
+ bg_rois_per_this_image = 0
+ elif bg_num_rois > 0 and fg_num_rois == 0:
+ # sampling bg
+ bg_rois_per_this_image = cfg.RCNN.ROI_PER_IMAGE
+ bg_inds = sample_bg_inds(hard_bg_inds, easy_bg_inds, bg_rois_per_this_image)
+
+ fg_rois_per_this_image = 0
+            else:
+                raise NotImplementedError
+ # augment the rois by noise
+ roi_list, roi_iou_list, roi_gt_list = [], [], []
+ if fg_rois_per_this_image > 0:
+ fg_rois_src = cur_roi[fg_inds]
+ gt_of_fg_rois = cur_gt[gt_assignment[fg_inds]]
+ iou3d_src = max_overlaps[fg_inds]
+ fg_rois, fg_iou3d = aug_roi_by_noise(
+ fg_rois_src, gt_of_fg_rois, iou3d_src, aug_times=cfg.RCNN.ROI_FG_AUG_TIMES)
+ roi_list.append(fg_rois)
+ roi_iou_list.append(fg_iou3d)
+ roi_gt_list.append(gt_of_fg_rois)
+
+ if bg_rois_per_this_image > 0:
+ bg_rois_src = cur_roi[bg_inds]
+ gt_of_bg_rois = cur_gt[gt_assignment[bg_inds]]
+ iou3d_src = max_overlaps[bg_inds]
+ aug_times = 1 if cfg.RCNN.ROI_FG_AUG_TIMES > 0 else 0
+ bg_rois, bg_iou3d = aug_roi_by_noise(
+ bg_rois_src, gt_of_bg_rois, iou3d_src, aug_times=aug_times)
+ roi_list.append(bg_rois)
+ roi_iou_list.append(bg_iou3d)
+ roi_gt_list.append(gt_of_bg_rois)
+
+
+ rois = np.concatenate(roi_list, axis=0)
+ iou_of_rois = np.concatenate(roi_iou_list, axis=0)
+ gt_of_rois = np.concatenate(roi_gt_list, axis=0)
+ batch_rois[idx] = rois
+ batch_gt_of_rois[idx] = gt_of_rois
+ batch_roi_iou[idx] = iou_of_rois
+
+ return batch_rois, batch_gt_of_rois, batch_roi_iou
+
+ def sample_bg_inds(hard_bg_inds, easy_bg_inds, bg_rois_per_this_image):
+
+ if hard_bg_inds.shape[0] > 0 and easy_bg_inds.shape[0] > 0:
+ hard_bg_rois_num = int(bg_rois_per_this_image * cfg.RCNN.HARD_BG_RATIO)
+ easy_bg_rois_num = bg_rois_per_this_image - hard_bg_rois_num
+ # sampling hard bg
+ if CLOSE_RANDOM:
+ rand_idx = list(np.arange(0,hard_bg_inds.shape[0]))*hard_bg_rois_num
+ rand_idx = rand_idx[:hard_bg_rois_num]
+ else:
+ rand_idx = np.random.randint(low=0, high=hard_bg_inds.shape[0], size=(hard_bg_rois_num,))
+ hard_bg_inds = hard_bg_inds[rand_idx]
+ # sampling easy bg
+ if CLOSE_RANDOM:
+ rand_idx = list(np.arange(0,easy_bg_inds.shape[0]))*easy_bg_rois_num
+ rand_idx = rand_idx[:easy_bg_rois_num]
+ else:
+ rand_idx = np.random.randint(low=0, high=easy_bg_inds.shape[0], size=(easy_bg_rois_num,))
+ easy_bg_inds = easy_bg_inds[rand_idx]
+ bg_inds = np.concatenate([hard_bg_inds, easy_bg_inds], axis=0)
+ elif hard_bg_inds.shape[0] > 0 and easy_bg_inds.shape[0] == 0:
+ hard_bg_rois_num = bg_rois_per_this_image
+ # sampling hard bg
+ rand_idx = np.random.randint(low=0, high=hard_bg_inds.shape[0], size=(hard_bg_rois_num,))
+ bg_inds = hard_bg_inds[rand_idx]
+ elif hard_bg_inds.shape[0] == 0 and easy_bg_inds.shape[0] > 0:
+ easy_bg_rois_num = bg_rois_per_this_image
+ # sampling easy bg
+ rand_idx = np.random.randint(low=0, high=easy_bg_inds.shape[0], size=(easy_bg_rois_num,))
+ bg_inds = easy_bg_inds[rand_idx]
+ else:
+ raise NotImplementedError
+
+ return bg_inds
+
+ def aug_roi_by_noise(roi_boxes3d, gt_boxes3d, iou3d_src, aug_times=10):
+ iou_of_rois = np.zeros(roi_boxes3d.shape[0]).astype(gt_boxes3d.dtype)
+ pos_thresh = min(cfg.RCNN.REG_FG_THRESH, cfg.RCNN.CLS_FG_THRESH)
+
+ for k in range(roi_boxes3d.shape[0]):
+ temp_iou = cnt = 0
+ roi_box3d = roi_boxes3d[k]
+
+ gt_box3d = gt_boxes3d[k].reshape(1, 7)
+ aug_box3d = roi_box3d
+ keep = True
+ while temp_iou < pos_thresh and cnt < aug_times:
+                if np.random.rand() < 0.2:
+                    aug_box3d = roi_box3d  # p=0.2 to keep the original roi box
+                    keep = True
+                else:
+                    aug_box3d = random_aug_box3d(roi_box3d)
+                    keep = False
+ aug_box3d = aug_box3d.reshape((1, 7))
+ iou3d = iou3d_utils.boxes_iou3d(aug_box3d, gt_box3d)
+ temp_iou = iou3d[0][0]
+ cnt += 1
+ roi_boxes3d[k] = aug_box3d.reshape(-1)
+ if cnt == 0 or keep:
+ iou_of_rois[k] = iou3d_src[k]
+ else:
+ iou_of_rois[k] = temp_iou
+ return roi_boxes3d, iou_of_rois
+
+ def random_aug_box3d(box3d):
+ """
+ :param box3d: (7) [x, y, z, h, w, l, ry]
+ random shift, scale, orientation
+ """
+ if cfg.RCNN.REG_AUG_METHOD == 'single':
+
+ pos_shift = (np.random.rand(3) - 0.5) # [-0.5 ~ 0.5]
+            hwl_scale = (np.random.rand(3) - 0.5) / (0.5 / 0.15) + 1.0  # [0.85 ~ 1.15]
+ angle_rot = (np.random.rand(1) - 0.5) / (0.5 / (np.pi / 12)) # [-pi/12 ~ pi/12]
+ aug_box3d = np.concatenate([box3d[0:3] + pos_shift, box3d[3:6] * hwl_scale, box3d[6:7] + angle_rot], axis=0)
+ return aug_box3d
+ elif cfg.RCNN.REG_AUG_METHOD == 'multiple':
+ # pos_range, hwl_range, angle_range, mean_iou
+ range_config = [[0.2, 0.1, np.pi / 12, 0.7],
+ [0.3, 0.15, np.pi / 12, 0.6],
+ [0.5, 0.15, np.pi / 9, 0.5],
+ [0.8, 0.15, np.pi / 6, 0.3],
+ [1.0, 0.15, np.pi / 3, 0.2]]
+ idx = np.random.randint(low=0, high=len(range_config), size=(1,))[0]
+ pos_shift = ((np.random.rand(3) - 0.5) / 0.5) * range_config[idx][0]
+ hwl_scale = ((np.random.rand(3) - 0.5) / 0.5) * range_config[idx][1] + 1.0
+ angle_rot = ((np.random.rand(1) - 0.5) / 0.5) * range_config[idx][2]
+ aug_box3d = np.concatenate([box3d[0:3] + pos_shift, box3d[3:6] * hwl_scale, box3d[6:7] + angle_rot], axis=0)
+ return aug_box3d
+ elif cfg.RCNN.REG_AUG_METHOD == 'normal':
+ x_shift = np.random.normal(loc=0, scale=0.3)
+ y_shift = np.random.normal(loc=0, scale=0.2)
+ z_shift = np.random.normal(loc=0, scale=0.3)
+ h_shift = np.random.normal(loc=0, scale=0.25)
+ w_shift = np.random.normal(loc=0, scale=0.15)
+ l_shift = np.random.normal(loc=0, scale=0.5)
+ ry_shift = ((np.random.rand() - 0.5) / 0.5) * np.pi / 12
+ aug_box3d = np.array([box3d[0] + x_shift, box3d[1] + y_shift, box3d[2] + z_shift, box3d[3] + h_shift,
+ box3d[4] + w_shift, box3d[5] + l_shift, box3d[6] + ry_shift], dtype=np.float32)
+ aug_box3d = aug_box3d.astype(box3d.dtype)
+ return aug_box3d
+ else:
+ raise NotImplementedError
+
+ def data_augmentation(pts, rois, gt_of_rois):
+ """
+ :param pts: (B, M, 512, 3)
+ :param rois: (B, M. 7)
+ :param gt_of_rois: (B, M, 7)
+ :return:
+ """
+ batch_size, boxes_num = pts.shape[0], pts.shape[1]
+
+ # rotation augmentation
+        angles = ((np.random.rand(batch_size, boxes_num) - 0.5) / 0.5) * (np.pi / cfg.AUG_ROT_RANGE)
+ # calculate gt alpha from gt_of_rois
+ temp_x, temp_z, temp_ry = gt_of_rois[:, :, 0], gt_of_rois[:, :, 2], gt_of_rois[:, :, 6]
+ temp_beta = np.arctan2(temp_z, temp_x)
+ gt_alpha = -np.sign(temp_beta) * np.pi / 2 + temp_beta + temp_ry # (B, M)
+
+ temp_x, temp_z, temp_ry = rois[:, :, 0], rois[:, :, 2], rois[:, :, 6]
+ temp_beta = np.arctan2(temp_z, temp_x)
+ roi_alpha = -np.sign(temp_beta) * np.pi / 2 + temp_beta + temp_ry # (B, M)
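+        # alpha (the observation angle) is invariant under rotation about the
+        # camera y-axis, so it is cached here to recover ry after rotating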
+
+ for k in range(batch_size):
+ pts[k] = kitti_utils.rotate_pc_along_y_np(pts[k], angles[k])
+ gt_of_rois[k] = np.squeeze(kitti_utils.rotate_pc_along_y_np(
+ np.expand_dims(gt_of_rois[k], axis=1), angles[k]), axis=1)
+ rois[k] = np.squeeze(kitti_utils.rotate_pc_along_y_np(
+ np.expand_dims(rois[k], axis=1), angles[k]),axis=1)
+
+ # calculate the ry after rotation
+ temp_x, temp_z = gt_of_rois[:, :, 0], gt_of_rois[:, :, 2]
+ temp_beta = np.arctan2(temp_z, temp_x)
+ gt_of_rois[:, :, 6] = np.sign(temp_beta) * np.pi / 2 + gt_alpha - temp_beta
+ temp_x, temp_z = rois[:, :, 0], rois[:, :, 2]
+ temp_beta = np.arctan2(temp_z, temp_x)
+ rois[:, :, 6] = np.sign(temp_beta) * np.pi / 2 + roi_alpha - temp_beta
+ # scaling augmentation
+ scales = 1 + ((np.random.rand(batch_size, boxes_num) - 0.5) / 0.5) * 0.05
+ pts = pts * np.expand_dims(np.expand_dims(scales, axis=2), axis=3)
+ gt_of_rois[:, :, 0:6] = gt_of_rois[:, :, 0:6] * np.expand_dims(scales, axis=2)
+ rois[:, :, 0:6] = rois[:, :, 0:6] * np.expand_dims(scales, axis=2)
+
+ # flip augmentation
+ flip_flag = np.sign(np.random.rand(batch_size, boxes_num) - 0.5)
+ pts[:, :, :, 0] = pts[:, :, :, 0] * np.expand_dims(flip_flag, axis=2)
+ gt_of_rois[:, :, 0] = gt_of_rois[:, :, 0] * flip_flag
+ # flip orientation: ry > 0: pi - ry, ry < 0: -pi - ry
+ src_ry = gt_of_rois[:, :, 6]
+ ry = (flip_flag == 1).astype(np.float32) * src_ry + (flip_flag == -1).astype(np.float32) * (np.sign(src_ry) * np.pi - src_ry)
+ gt_of_rois[:, :, 6] = ry
+
+ rois[:, :, 0] = rois[:, :, 0] * flip_flag
+ # flip orientation: ry > 0: pi - ry, ry < 0: -pi - ry
+ src_ry = rois[:, :, 6]
+ ry = (flip_flag == 1).astype(np.float32) * src_ry + (flip_flag == -1).astype(np.float32) * (np.sign(src_ry) * np.pi - src_ry)
+ rois[:, :, 6] = ry
+
+ return pts, rois, gt_of_rois
+
+    def generate_proposal_target(seg_mask, rpn_features, gt_boxes3d, rpn_xyz, pts_depth, roi_boxes3d, rpn_intensity):
+ seg_mask = np.array(seg_mask)
+ features = np.array(rpn_features)
+ gt_boxes3d = np.array(gt_boxes3d)
+ rpn_xyz = np.array(rpn_xyz)
+ pts_depth = np.array(pts_depth)
+ roi_boxes3d = np.array(roi_boxes3d)
+ rpn_intensity = np.array(rpn_intensity)
+ batch_rois, batch_gt_of_rois, batch_roi_iou = sample_rois_for_rcnn(roi_boxes3d, gt_boxes3d)
+
+ if cfg.RCNN.USE_INTENSITY:
+ pts_extra_input_list = [np.expand_dims(rpn_intensity, axis=2),
+ np.expand_dims(seg_mask, axis=2)]
+ else:
+ pts_extra_input_list = [np.expand_dims(seg_mask, axis=2)]
+
+ if cfg.RCNN.USE_DEPTH:
+ pts_depth = pts_depth / 70.0 - 0.5
+ pts_extra_input_list.append(np.expand_dims(pts_depth, axis=2))
+ pts_extra_input = np.concatenate(pts_extra_input_list, axis=2)
+
+ # point cloud pooling
+ pts_feature = np.concatenate((pts_extra_input, rpn_features), axis=2)
+
+ batch_rois = batch_rois.astype(np.float32)
+
+ pooled_features, pooled_empty_flag = roipool3d_utils.roipool3d_gpu(
+ rpn_xyz, pts_feature, batch_rois, cfg.RCNN.POOL_EXTRA_WIDTH,
+ sampled_pt_num=cfg.RCNN.NUM_POINTS
+ )
+
+ sampled_pts, sampled_features = pooled_features[:, :, :, 0:3], pooled_features[:, :, :, 3:]
+ # data augmentation
+ if cfg.AUG_DATA:
+ # data augmentation
+ sampled_pts, batch_rois, batch_gt_of_rois = \
+ data_augmentation(sampled_pts, batch_rois, batch_gt_of_rois)
+
+ # canonical transformation
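+        # (move each ROI's pooled points and GT box into the ROI's local frame:
+        # subtract the ROI center, then rotate by the ROI yaw)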
+ batch_size = batch_rois.shape[0]
+ roi_ry = batch_rois[:, :, 6] % (2 * np.pi)
+ roi_center = batch_rois[:, :, 0:3]
+ sampled_pts = sampled_pts - np.expand_dims(roi_center, axis=2) # (B, M, 512, 3)
+ batch_gt_of_rois[:, :, 0:3] = batch_gt_of_rois[:, :, 0:3] - roi_center
+ batch_gt_of_rois[:, :, 6] = batch_gt_of_rois[:, :, 6] - roi_ry
+
+ for k in range(batch_size):
+ sampled_pts[k] = kitti_utils.rotate_pc_along_y_np(sampled_pts[k], batch_rois[k, :, 6])
+ batch_gt_of_rois[k] = np.squeeze(kitti_utils.rotate_pc_along_y_np(
+ np.expand_dims(batch_gt_of_rois[k], axis=1), roi_ry[k]), axis=1)
+
+ # regression valid mask
+ valid_mask = (pooled_empty_flag == 0)
+ reg_valid_mask = ((batch_roi_iou > cfg.RCNN.REG_FG_THRESH) & valid_mask).astype(np.float32)
+
+ # classification label
+ batch_cls_label = (batch_roi_iou > cfg.RCNN.CLS_FG_THRESH).astype(np.int64)
+ invalid_mask = (batch_roi_iou > cfg.RCNN.CLS_BG_THRESH) & (batch_roi_iou < cfg.RCNN.CLS_FG_THRESH)
+ batch_cls_label[valid_mask == 0] = -1
+ batch_cls_label[invalid_mask > 0] = -1
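+        # cls_label: 1 = foreground (IoU > CLS_FG_THRESH), 0 = background,
+        # -1 = ignored (empty ROI, or IoU in the ambiguous band between
+        # CLS_BG_THRESH and CLS_FG_THRESH)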
+
+ output_dict = {'sampled_pts': sampled_pts.reshape(-1, cfg.RCNN.NUM_POINTS, 3).astype(np.float32),
+ 'pts_feature': sampled_features.reshape(-1, cfg.RCNN.NUM_POINTS, sampled_features.shape[3]).astype(np.float32),
+ 'cls_label': batch_cls_label.reshape(-1),
+ 'reg_valid_mask': reg_valid_mask.reshape(-1).astype(np.float32),
+ 'gt_of_rois': batch_gt_of_rois.reshape(-1, 7).astype(np.float32),
+ 'gt_iou': batch_roi_iou.reshape(-1).astype(np.float32),
+ 'roi_boxes3d': batch_rois.reshape(-1, 7).astype(np.float32)}
+
+ return output_dict.values()
+
+ return generate_proposal_target
+
+
+if __name__ == "__main__":
+
+ input_dict = {}
+ input_dict['roi_boxes3d'] = np.load("models/rpn_data/roi_boxes3d.npy")
+ input_dict['gt_boxes3d'] = np.load("models/rpn_data/gt_boxes3d.npy")
+ input_dict['rpn_xyz'] = np.load("models/rpn_data/rpn_xyz.npy")
+ input_dict['rpn_features'] = np.load("models/rpn_data/rpn_features.npy")
+ input_dict['rpn_intensity'] = np.load("models/rpn_data/rpn_intensity.npy")
+ input_dict['seg_mask'] = np.load("models/rpn_data/seg_mask.npy")
+ input_dict['pts_depth'] = np.load("models/rpn_data/pts_depth.npy")
+ for k, v in input_dict.items():
+ print(k, v.shape, np.sum(np.abs(v)))
+ input_dict[k] = np.expand_dims(v, axis=0)
+
+ from utils.config import cfg
+ cfg.RPN.LOC_XZ_FINE = True
+ cfg.TEST.RPN_DISTANCE_BASED_PROPOSE = False
+ cfg.RPN.NMS_TYPE = 'rotate'
+
+ proposal_target_func = get_proposal_target_func(cfg)
+ out_values = proposal_target_func(input_dict['seg_mask'], input_dict['rpn_features'], input_dict['gt_boxes3d'],
+ input_dict['rpn_xyz'], input_dict['pts_depth'], input_dict['roi_boxes3d'], input_dict['rpn_intensity'])
+ # generate_proposal_target returns output_dict.values(), so pair the values
+ # back with their keys for printing
+ out_keys = ['sampled_pts', 'pts_feature', 'cls_label', 'reg_valid_mask', 'gt_of_rois', 'gt_iou', 'roi_boxes3d']
+ for key, value in zip(out_keys, out_values):
+ print("name: {}, shape: {}".format(key, value.shape))
diff --git a/PaddleCV/Paddle3D/PointRCNN/utils/proposal_utils.py b/PaddleCV/Paddle3D/PointRCNN/utils/proposal_utils.py
new file mode 100644
index 0000000000000000000000000000000000000000..9160ffe8e4e4a1aff7f8e8984e5ddd3711d1ffb0
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/utils/proposal_utils.py
@@ -0,0 +1,270 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
+#
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
+"""
+Contains proposal functions
+"""
+
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
+import numpy as np
+import paddle.fluid as fluid
+
+import utils.box_utils as box_utils
+from utils.config import cfg
+
+__all__ = ["get_proposal_func"]
+
+
+def get_proposal_func(cfg, mode='TRAIN'):
+ def decode_bbox_target(roi_box3d, pred_reg, anchor_size, loc_scope,
+ loc_bin_size, num_head_bin, get_xz_fine=True,
+ loc_y_scope=0.5, loc_y_bin_size=0.25,
+ get_y_by_bin=False, get_ry_fine=False):
+ per_loc_bin_num = int(loc_scope / loc_bin_size) * 2
+ loc_y_bin_num = int(loc_y_scope / loc_y_bin_size) * 2
+
+ # recover xz localization
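+ # pred_reg channel layout: [x_bin | z_bin | (x_res | z_res) | y bins/offset | ry_bin | ry_res | size_res];
+ # each coarse x/z bin covers loc_bin_size meters within [-loc_scope, loc_scope]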
+ x_bin_l, x_bin_r = 0, per_loc_bin_num
+ z_bin_l, z_bin_r = per_loc_bin_num, per_loc_bin_num * 2
+ start_offset = z_bin_r
+
+ x_bin = np.argmax(pred_reg[:, x_bin_l: x_bin_r], axis=1)
+ z_bin = np.argmax(pred_reg[:, z_bin_l: z_bin_r], axis=1)
+
+ pos_x = x_bin.astype('float32') * loc_bin_size + loc_bin_size / 2 - loc_scope
+ pos_z = z_bin.astype('float32') * loc_bin_size + loc_bin_size / 2 - loc_scope
+ if get_xz_fine:
+ x_res_l, x_res_r = per_loc_bin_num * 2, per_loc_bin_num * 3
+ z_res_l, z_res_r = per_loc_bin_num * 3, per_loc_bin_num * 4
+ start_offset = z_res_r
+
+ x_res_norm = pred_reg[:, x_res_l:x_res_r][np.arange(len(x_bin)), x_bin]
+ z_res_norm = pred_reg[:, z_res_l:z_res_r][np.arange(len(z_bin)), z_bin]
+
+ x_res = x_res_norm * loc_bin_size
+ z_res = z_res_norm * loc_bin_size
+ pos_x += x_res
+ pos_z += z_res
+
+ # recover y localization
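+ # y is decoded either from bins plus a residual (get_y_by_bin) or as a direct
+ # offset relative to the roi center y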
+ if get_y_by_bin:
+ y_bin_l, y_bin_r = start_offset, start_offset + loc_y_bin_num
+ y_res_l, y_res_r = y_bin_r, y_bin_r + loc_y_bin_num
+ start_offset = y_res_r
+
+ y_bin = np.argmax(pred_reg[:, y_bin_l: y_bin_r], axis=1)
+ y_res_norm = pred_reg[:, y_res_l:y_res_r][np.arange(len(y_bin)), y_bin]
+ y_res = y_res_norm * loc_y_bin_size
+ pos_y = y_bin.astype('float32') * loc_y_bin_size + loc_y_bin_size / 2 - loc_y_scope + y_res
+ pos_y = pos_y + np.array(roi_box3d[:, 1]).reshape(-1)
+ else:
+ y_offset_l, y_offset_r = start_offset, start_offset + 1
+ start_offset = y_offset_r
+
+ pos_y = np.array(roi_box3d[:, 1]) + np.array(pred_reg[:, y_offset_l])
+ pos_y = pos_y.reshape(-1)
+
+ # recover ry rotation
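+ # heading is classified into num_head_bin bins plus a normalized residual;
+ # fine mode decodes within roughly [-pi/4, pi/4], coarse mode within (-pi, pi]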
+ ry_bin_l, ry_bin_r = start_offset, start_offset + num_head_bin
+ ry_res_l, ry_res_r = ry_bin_r, ry_bin_r + num_head_bin
+
+ ry_bin = np.argmax(pred_reg[:, ry_bin_l: ry_bin_r], axis=1)
+ ry_res_norm = pred_reg[:, ry_res_l:ry_res_r][np.arange(len(ry_bin)), ry_bin]
+ if get_ry_fine:
+ # divide pi/2 into several bins
+ angle_per_class = (np.pi / 2) / num_head_bin
+ ry_res = ry_res_norm * (angle_per_class / 2)
+ ry = (ry_bin.astype('float32') * angle_per_class + angle_per_class / 2) + ry_res - np.pi / 4
+ else:
+ angle_per_class = (2 * np.pi) / num_head_bin
+ ry_res = ry_res_norm * (angle_per_class / 2)
+
+ # bin_center is (0, 30, 60, 90, 120, ..., 270, 300, 330)
+ ry = np.fmod(ry_bin.astype('float32') * angle_per_class + ry_res, 2 * np.pi)
+ ry[ry > np.pi] -= 2 * np.pi
+
+ # recover size
+ size_res_l, size_res_r = ry_res_r, ry_res_r + 3
+ assert size_res_r == pred_reg.shape[1]
+
+ size_res_norm = pred_reg[:, size_res_l: size_res_r]
+ hwl = size_res_norm * anchor_size + anchor_size
+
+ def rotate_pc_along_y(pc, angle):
+ cosa = np.cos(angle).reshape(-1, 1)
+ sina = np.sin(angle).reshape(-1, 1)
+
+ R = np.concatenate([cosa, -sina, sina, cosa], axis=-1).reshape(-1, 2, 2)
+ pc_temp = pc[:, [0, 2]].reshape(-1, 1, 2)
+ pc[:, [0, 2]] = np.matmul(pc_temp, R.transpose(0, 2, 1)).reshape(-1, 2)
+
+ return pc
+
+ # shift to original coords
+ roi_center = np.array(roi_box3d[:, 0:3])
+ shift_ret_box3d = np.concatenate((
+ pos_x.reshape(-1, 1),
+ pos_y.reshape(-1, 1),
+ pos_z.reshape(-1, 1),
+ hwl, ry.reshape(-1, 1)), axis=1)
+ ret_box3d = shift_ret_box3d
+ if roi_box3d.shape[1] == 7:
+ roi_ry = np.array(roi_box3d[:, 6]).reshape(-1)
+ ret_box3d = rotate_pc_along_y(np.array(shift_ret_box3d), -roi_ry)
+ ret_box3d[:, 6] += roi_ry
+ ret_box3d[:, [0, 2]] += roi_center[:, [0, 2]]
+ return ret_box3d
+
+ def distance_based_proposal(scores, proposals, sorted_idxs):
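+ # run NMS separately for near (0-40m) and far (40-80m) proposals: 70% of the
+ # pre/post-NMS budget goes to the near range, the remainder to the far range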
+ nms_range_list = [0, 40.0, 80.0]
+ pre_tot_top_n = cfg[mode].RPN_PRE_NMS_TOP_N
+ pre_top_n_list = [0, int(pre_tot_top_n * 0.7), pre_tot_top_n - int(pre_tot_top_n * 0.7)]
+ post_tot_top_n = cfg[mode].RPN_POST_NMS_TOP_N
+ post_top_n_list = [0, int(post_tot_top_n * 0.7), post_tot_top_n - int(post_tot_top_n * 0.7)]
+
+ batch_size = scores.shape[0]
+ ret_proposals = np.zeros((batch_size, cfg[mode].RPN_POST_NMS_TOP_N, 7), dtype='float32')
+ ret_scores = np.zeros((batch_size, cfg[mode].RPN_POST_NMS_TOP_N, 1), dtype='float32')
+
+ for b, (score, proposal, sorted_idx) in enumerate(zip(scores, proposals, sorted_idxs)):
+ # sort by score
+ score_ord = score[sorted_idx]
+ proposal_ord = proposal[sorted_idx]
+
+ dist = proposal_ord[:, 2]
+ first_mask = (dist > nms_range_list[0]) & (dist <= nms_range_list[1])
+
+ scores_single_list, proposals_single_list = [], []
+ for i in range(1, len(nms_range_list)):
+ # get proposal distance mask
+ dist_mask = ((dist > nms_range_list[i - 1]) & (dist <= nms_range_list[i]))
+
+ if dist_mask.sum() != 0:
+ # this area has points, reduce by mask
+ cur_scores = score_ord[dist_mask]
+ cur_proposals = proposal_ord[dist_mask]
+
+ # fetch pre nms top K
+ cur_scores = cur_scores[:pre_top_n_list[i]]
+ cur_proposals = cur_proposals[:pre_top_n_list[i]]
+ else:
+ assert i == 2, '%d' % i
+ # this area doesn't have any points, so use rois of first area
+ cur_scores = score_ord[first_mask]
+ cur_proposals = proposal_ord[first_mask]
+
+ # fetch top K of first area
+ cur_scores = cur_scores[pre_top_n_list[i - 1]:][:pre_top_n_list[i]]
+ cur_proposals = cur_proposals[pre_top_n_list[i - 1]:][:pre_top_n_list[i]]
+
+ # oriented nms
+ boxes_bev = box_utils.boxes3d_to_bev(cur_proposals)
+ s_scores, s_proposals = box_utils.box_nms(
+ boxes_bev, cur_scores, cur_proposals,
+ cfg[mode].RPN_NMS_THRESH, post_top_n_list[i],
+ cfg.RPN.NMS_TYPE)
+ if len(s_scores) > 0:
+ scores_single_list.append(s_scores)
+ proposals_single_list.append(s_proposals)
+
+ scores_single = np.concatenate(scores_single_list, axis=0)
+ proposals_single = np.concatenate(proposals_single_list, axis=0)
+
+ prop_num = proposals_single.shape[0]
+ ret_scores[b, :prop_num, 0] = scores_single
+ ret_proposals[b, :prop_num] = proposals_single
+ # ret_proposals.tofile("proposal.data")
+ # ret_scores.tofile("score.data")
+ return np.concatenate([ret_proposals, ret_scores], axis=-1)
+
+ def score_based_proposal(scores, proposals, sorted_idxs):
+ batch_size = scores.shape[0]
+ ret_proposals = np.zeros((batch_size, cfg[mode].RPN_POST_NMS_TOP_N, 7), dtype='float32')
+ ret_scores = np.zeros((batch_size, cfg[mode].RPN_POST_NMS_TOP_N, 1), dtype='float32')
+ for b, (score, proposal, sorted_idx) in enumerate(zip(scores, proposals, sorted_idxs)):
+ # sort by score
+ score_ord = score[sorted_idx]
+ proposal_ord = proposal[sorted_idx]
+
+ # pre nms top K
+ cur_scores = score_ord[:cfg[mode].RPN_PRE_NMS_TOP_N]
+ cur_proposals = proposal_ord[:cfg[mode].RPN_PRE_NMS_TOP_N]
+
+ boxes_bev = box_utils.boxes3d_to_bev(cur_proposals)
+ s_scores, s_proposals = box_utils.box_nms(
+ boxes_bev, cur_scores, cur_proposals,
+ cfg[mode].RPN_NMS_THRESH,
+ cfg[mode].RPN_POST_NMS_TOP_N,
+ 'rotate')
+ prop_num = len(s_proposals)
+ ret_scores[b, :prop_num, 0] = s_scores
+ ret_proposals[b, :prop_num] = s_proposals
+ # ret_proposals.tofile("proposal.data")
+ # ret_scores.tofile("score.data")
+ return np.concatenate([ret_proposals, ret_scores], axis=-1)
+
+ def generate_proposal(x):
+ x = np.array(x)
+ rpn_scores = x[:, :, 0]
+ roi_box3d = x[:, :, 1:4]
+ pred_reg = x[:, :, 4:]
+
+ proposals = decode_bbox_target(
+ roi_box3d.reshape(-1, roi_box3d.shape[-1]),
+ pred_reg.reshape(-1, pred_reg.shape[-1]),
+ anchor_size=np.array(cfg.CLS_MEAN_SIZE[0], dtype='float32'),
+ loc_scope=cfg.RPN.LOC_SCOPE,
+ loc_bin_size=cfg.RPN.LOC_BIN_SIZE,
+ num_head_bin=cfg.RPN.NUM_HEAD_BIN,
+ get_xz_fine=cfg.RPN.LOC_XZ_FINE,
+ get_y_by_bin=False,
+ get_ry_fine=False)
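+ # decode_bbox_target returns the box center; KITTI boxes are anchored at the
+ # bottom face, so shift y down by half the box height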
+ proposals[:, 1] += proposals[:, 3] / 2
+ proposals = proposals.reshape(rpn_scores.shape[0], -1, proposals.shape[-1])
+
+ sorted_idxs = np.argsort(-rpn_scores, axis=-1)
+
+ if cfg.TEST.RPN_DISTANCE_BASED_PROPOSE:
+ ret = distance_based_proposal(rpn_scores, proposals, sorted_idxs)
+ else:
+ ret = score_based_proposal(rpn_scores, proposals, sorted_idxs)
+
+ return ret
+
+ return generate_proposal
+
+
+if __name__ == "__main__":
+ np.random.seed(3333)
+ x_np = np.random.random((4, 256, 84)).astype('float32')
+
+ from utils.config import cfg
+ cfg.RPN.LOC_XZ_FINE = True
+ # cfg.TEST.RPN_DISTANCE_BASED_PROPOSE = False
+ # cfg.RPN.NMS_TYPE = 'rotate'
+ proposal_func = get_proposal_func(cfg)
+
+ x = fluid.layers.data(name="x", shape=[256, 84], dtype='float32')
+ proposal = fluid.default_main_program().current_block().create_var(
+ name="proposal", dtype='float32', shape=[256, 7])
+ fluid.layers.py_func(proposal_func, x, proposal)
+ loss = fluid.layers.reduce_mean(proposal)
+
+ place = fluid.CUDAPlace(0)
+ exe = fluid.Executor(place)
+ exe.run(fluid.default_startup_program())
+ ret = exe.run(fetch_list=[proposal.name, loss.name], feed={'x': x_np})
+ print(ret)
diff --git a/PaddleCV/Paddle3D/PointRCNN/utils/pts_utils/CMakeLists.txt b/PaddleCV/Paddle3D/PointRCNN/utils/pts_utils/CMakeLists.txt
new file mode 100644
index 0000000000000000000000000000000000000000..044bbed5d020464250810601ec2dcdacdec0cd18
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/utils/pts_utils/CMakeLists.txt
@@ -0,0 +1,6 @@
+
+cmake_minimum_required(VERSION 2.8.12)
+project(pts_utils)
+
+add_subdirectory(pybind11)
+pybind11_add_module(pts_utils pts_utils.cpp)
diff --git a/PaddleCV/Paddle3D/PointRCNN/utils/pts_utils/pts_utils.cpp b/PaddleCV/Paddle3D/PointRCNN/utils/pts_utils/pts_utils.cpp
new file mode 100644
index 0000000000000000000000000000000000000000..356b02baa5288903e218c8fca1b17118ef8ea72b
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/utils/pts_utils/pts_utils.cpp
@@ -0,0 +1,62 @@
+#include <pybind11/pybind11.h>
+#include <pybind11/numpy.h>
+#include <math.h>
+#include <stdexcept>
+
+namespace py = pybind11;
+
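+// Check whether point (x, y, z) lies inside a 3D box given by its center
+// (cx, cy, cz), size (h, w, l) and the precomputed cos/sin of its yaw angle:
+// the point is rotated into the box-local frame before the extent test.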
+int pt_in_box3d(float x, float y, float z, float cx, float cy, float cz, float h, float w, float l, float cosa, float sina) {
+ if ((fabsf(x - cx) > 10.) || (fabsf(y - cy) > h / 2.0) || (fabsf(z - cz) > 10.)){
+ return 0;
+ }
+
+ float x_rot = (x - cx) * cosa + (z - cz) * (-sina);
+ float z_rot = (x - cx) * sina + (z - cz) * cosa;
+
+ int in_flag = static_cast<int>((x_rot >= -l / 2.0) & (x_rot <= l / 2.0) & (z_rot >= -w / 2.0) & (z_rot <= w / 2.0));
+ return in_flag;
+}
+
+py::array_t<int> pts_in_boxes3d(py::array_t<float> pts, py::array_t<float> boxes) {
+ py::buffer_info pts_buf = pts.request(), boxes_buf = boxes.request();
+
+ if (pts_buf.ndim != 2 || boxes_buf.ndim != 2) {
+ throw std::runtime_error("Number of dimensions must be 2");
+ }
+ if (pts_buf.shape[1] != 3) {
+ throw std::runtime_error("pts 2nd dimension must be 3");
+ }
+ if (boxes_buf.shape[1] != 7) {
+ throw std::runtime_error("boxes 2nd dimension must be 7");
+ }
+
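+ // the mask is built as a flat (boxes_num * pts_num) int array and reshaped
+ // to (boxes_num, pts_num) before returning to python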
+ auto pts_num = pts_buf.shape[0];
+ auto boxes_num = boxes_buf.shape[0];
+ auto mask = py::array_t<int>(pts_num * boxes_num);
+ py::buffer_info mask_buf = mask.request();
+
+ float *pts_ptr = (float *) pts_buf.ptr,
+ *boxes_ptr = (float *) boxes_buf.ptr;
+ int *mask_ptr = (int *) mask_buf.ptr;
+
+ for (ssize_t i = 0; i < boxes_num; i++) {
+ float cx = boxes_ptr[i * 7];
+ float cy = boxes_ptr[i * 7 + 1] - boxes_ptr[i * 7 + 3] / 2.;
+ float cz = boxes_ptr[i * 7 + 2];
+ float h = boxes_ptr[i * 7 + 3];
+ float w = boxes_ptr[i * 7 + 4];
+ float l = boxes_ptr[i * 7 + 5];
+ float angle = boxes_ptr[i * 7 + 6];
+ float cosa = cosf(angle);
+ float sina = sinf(angle);
+ for (ssize_t j = 0; j < pts_num; j++) {
+ mask_ptr[i * pts_num + j] = pt_in_box3d(pts_ptr[j * 3], pts_ptr[j * 3 + 1], pts_ptr[j * 3 + 2], cx, cy, cz, h, w, l, cosa, sina);
+ }
+ }
+
+ mask.resize({boxes_num, pts_num});
+ return mask;
+}
+
+PYBIND11_MODULE(pts_utils, m) {
+ m.def("pts_in_boxes3d", &pts_in_boxes3d, "Calculate mask for whether points in boxes3d");
+}
diff --git a/PaddleCV/Paddle3D/PointRCNN/utils/pts_utils/setup.py b/PaddleCV/Paddle3D/PointRCNN/utils/pts_utils/setup.py
new file mode 100644
index 0000000000000000000000000000000000000000..e44e80ea703c0b2b3d1938fadc3c1befadb1dad0
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/utils/pts_utils/setup.py
@@ -0,0 +1,12 @@
+from setuptools import setup
+from setuptools import Extension
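+# built via build_and_install.sh at the repo root; a direct in-place build
+# (e.g. `python setup.py build_ext --inplace`) should work as well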
+
+setup(
+ name='pts_utils',
+ ext_modules = [Extension(
+ name='pts_utils',
+ sources=['pts_utils.cpp'],
+ include_dirs=[r'../../pybind11/include'],
+ extra_compile_args=['-std=c++11']
+ )],
+)
diff --git a/PaddleCV/Paddle3D/PointRCNN/utils/pts_utils/test.py b/PaddleCV/Paddle3D/PointRCNN/utils/pts_utils/test.py
new file mode 100644
index 0000000000000000000000000000000000000000..e4e3be285e3363a2193102732f1c0d9894eb497d
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/utils/pts_utils/test.py
@@ -0,0 +1,7 @@
+import numpy as np
+import pts_utils
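+# smoke test: random points/boxes only exercise the binding, so the printed
+# mask values are arbitrary; only the shapes and the dtype are meaningful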
+
+a = np.random.random((16384, 3)).astype('float32')
+b = np.random.random((64, 7)).astype('float32')
+c = pts_utils.pts_in_boxes3d(a, b)
+print(a, b, c, c.shape, np.sum(c))
diff --git a/PaddleCV/Paddle3D/PointRCNN/utils/run_utils.py b/PaddleCV/Paddle3D/PointRCNN/utils/run_utils.py
new file mode 100644
index 0000000000000000000000000000000000000000..0df37e5658f86c0cfc416e8a0185c5556bffe9f9
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/utils/run_utils.py
@@ -0,0 +1,110 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
+#
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
+"""
+Contains common utility functions.
+"""
+
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
+import sys
+import six
+import logging
+import numpy as np
+import paddle.fluid as fluid
+
+__all__ = ["check_gpu", "print_arguments", "parse_outputs", "Stat"]
+
+logger = logging.getLogger(__name__)
+
+
+def check_gpu(use_gpu):
+ """
+ Log error and exit when set use_gpu=True in paddlepaddle
+ cpu version.
+ """
+ err = "Config use_gpu cannot be set as True while you are " \
+ "using paddlepaddle cpu version ! \nPlease try: \n" \
+ "\t1. Install paddlepaddle-gpu to run model on GPU \n" \
+ "\t2. Set --use_gpu=False to run model on CPU"
+
+ try:
+ if use_gpu and not fluid.is_compiled_with_cuda():
+ logger.error(err)
+ sys.exit(1)
+ except Exception:
+ pass
+
+
+def print_arguments(args):
+ """Print argparse's arguments.
+
+ Usage:
+
+ .. code-block:: python
+
+ parser = argparse.ArgumentParser()
+ parser.add_argument("name", default="Jonh", type=str, help="User name.")
+ args = parser.parse_args()
+ print_arguments(args)
+
+ :param args: Input argparse.Namespace for printing.
+ :type args: argparse.Namespace
+ """
+ logger.info("----------- Configuration Arguments -----------")
+ for arg, value in sorted(six.iteritems(vars(args))):
+ logger.info("%s: %s" % (arg, value))
+ logger.info("------------------------------------------------")
+
+
+def parse_outputs(outputs, filter_key=None, extra_keys=None, prog=None):
+ keys, values = [], []
+ for k, v in outputs.items():
+ if filter_key is not None and k.find(filter_key) < 0:
+ continue
+ keys.append(k)
+ v.persistable = True
+ values.append(v.name)
+
+ if prog is not None and extra_keys is not None:
+ for k in extra_keys:
+ try:
+ v = fluid.framework._get_var(k, prog)
+ keys.append(k)
+ v.persistable = True
+ values.append(v.name)
+ except Exception:
+ pass
+ return keys, values
+
+
+class Stat(object):
+ def __init__(self):
+ self.stats = {}
+
+ def update(self, keys, values):
+ for k, v in zip(keys, values):
+ if k not in self.stats:
+ self.stats[k] = []
+ self.stats[k].append(v)
+
+ def reset(self):
+ self.stats = {}
+
+ def get_mean_log(self):
+ log = ""
+ for k, v in self.stats.items():
+ log += "avg_{}: {:.4f}, ".format(k, np.mean(v))
+ return log
diff --git a/PaddleCV/Paddle3D/PointRCNN/utils/save_utils.py b/PaddleCV/Paddle3D/PointRCNN/utils/save_utils.py
new file mode 100644
index 0000000000000000000000000000000000000000..c24a89a2429bd5f45386efa1176f8c8770500120
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/utils/save_utils.py
@@ -0,0 +1,132 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import os
+import numpy as np
+from utils.config import cfg
+from utils import calibration as calib
+import utils.cyops.kitti_utils as kitti_utils
+
+__all__ = ['save_rpn_feature', 'save_kitti_result', 'save_kitti_format']
+
+
+def save_rpn_feature(rets, kitti_features_dir):
+ """
+ save rpn features for RCNN offline training
+ """
+
+ sample_id = rets['sample_id'][0]
+ backbone_xyz = rets['backbone_xyz'][0]
+ backbone_feature = rets['backbone_feature'][0]
+ pts_features = rets['pts_features'][0]
+ seg_mask = rets['seg_mask'][0]
+ rpn_cls = rets['rpn_cls'][0]
+
+ for i in range(len(sample_id)):
+ pts_intensity = pts_features[i, :, 0]
+ s_id = sample_id[i, 0]
+
+ output_file = os.path.join(kitti_features_dir, '%06d.npy' % s_id)
+ xyz_file = os.path.join(kitti_features_dir, '%06d_xyz.npy' % s_id)
+ seg_file = os.path.join(kitti_features_dir, '%06d_seg.npy' % s_id)
+ intensity_file = os.path.join(
+ kitti_features_dir, '%06d_intensity.npy' % s_id)
+ np.save(output_file, backbone_feature[i])
+ np.save(xyz_file, backbone_xyz[i])
+ np.save(seg_file, seg_mask[i])
+ np.save(intensity_file, pts_intensity)
+ rpn_scores_raw_file = os.path.join(
+ kitti_features_dir, '%06d_rawscore.npy' % s_id)
+ np.save(rpn_scores_raw_file, rpn_cls[i])
+
+
+def save_kitti_result(rets, seg_output_dir, kitti_output_dir, reader, classes):
+ sample_id = rets['sample_id'][0]
+ roi_scores_row = rets['roi_scores_row'][0]
+ bboxes3d = rets['rois'][0]
+ pts_rect = rets['pts_rect'][0]
+ seg_mask = rets['seg_mask'][0]
+ rpn_cls_label = rets['rpn_cls_label'][0]
+ gt_boxes3d = rets['gt_boxes3d'][0]
+ gt_boxes3d_num = rets['gt_boxes3d'][1]
+
+ for i in range(len(sample_id)):
+ s_id = sample_id[i, 0]
+
+ seg_result_data = np.concatenate((pts_rect[i].reshape(-1, 3),
+ rpn_cls_label[i].reshape(-1, 1),
+ seg_mask[i].reshape(-1, 1)),
+ axis=1).astype('float16')
+ seg_output_file = os.path.join(seg_output_dir, '%06d.npy' % s_id)
+ np.save(seg_output_file, seg_result_data)
+
+ scores = roi_scores_row[i, :]
+ bbox3d = bboxes3d[i, :]
+ img_shape = reader.get_image_shape(s_id)
+ calib = reader.get_calib(s_id)
+
+ corners3d = kitti_utils.boxes3d_to_corners3d(bbox3d)
+ img_boxes, _ = calib.corners3d_to_img_boxes(corners3d)
+
+ img_boxes[:, 0] = np.clip(img_boxes[:, 0], 0, img_shape[1] - 1)
+ img_boxes[:, 1] = np.clip(img_boxes[:, 1], 0, img_shape[0] - 1)
+ img_boxes[:, 2] = np.clip(img_boxes[:, 2], 0, img_shape[1] - 1)
+ img_boxes[:, 3] = np.clip(img_boxes[:, 3], 0, img_shape[0] - 1)
+
+ img_boxes_w = img_boxes[:, 2] - img_boxes[:, 0]
+ img_boxes_h = img_boxes[:, 3] - img_boxes[:, 1]
+ box_valid_mask = np.logical_and(
+ img_boxes_w < img_shape[1] * 0.8, img_boxes_h < img_shape[0] * 0.8)
+
+ kitti_output_file = os.path.join(kitti_output_dir, '%06d.txt' % s_id)
+ with open(kitti_output_file, 'w') as f:
+ for k in range(bbox3d.shape[0]):
+ if box_valid_mask[k] == 0:
+ continue
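+ # alpha is the KITTI observation angle: the global yaw ry corrected by the
+ # viewing-direction angle beta = arctan2(z, x)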
+ x, z, ry = bbox3d[k, 0], bbox3d[k, 2], bbox3d[k, 6]
+ beta = np.arctan2(z, x)
+ alpha = -np.sign(beta) * np.pi / 2 + beta + ry
+
+ f.write('{} -1 -1 {:.4f} {:.4f} {:.4f} {:.4f} {:.4f} {:.4f} {:.4f} {:.4f} {:.4f} {:.4f} {:.4f} {:.4f} {:.4f}\n'.format(
+ classes, alpha, img_boxes[k, 0], img_boxes[k, 1], img_boxes[k, 2], img_boxes[k, 3],
+ bbox3d[k, 3], bbox3d[k, 4], bbox3d[k, 5], bbox3d[k, 0], bbox3d[k, 1], bbox3d[k, 2],
+ bbox3d[k, 6], scores[k]))
+
+
+def save_kitti_format(sample_id, calib, bbox3d, kitti_output_dir, scores, img_shape):
+ corners3d = kitti_utils.boxes3d_to_corners3d(bbox3d)
+ img_boxes, _ = calib.corners3d_to_img_boxes(corners3d)
+ img_boxes[:, 0] = np.clip(img_boxes[:, 0], 0, img_shape[1] - 1)
+ img_boxes[:, 1] = np.clip(img_boxes[:, 1], 0, img_shape[0] - 1)
+ img_boxes[:, 2] = np.clip(img_boxes[:, 2], 0, img_shape[1] - 1)
+ img_boxes[:, 3] = np.clip(img_boxes[:, 3], 0, img_shape[0] - 1)
+
+ img_boxes_w = img_boxes[:, 2] - img_boxes[:, 0]
+ img_boxes_h = img_boxes[:, 3] - img_boxes[:, 1]
+ box_valid_mask = np.logical_and(img_boxes_w < img_shape[1] * 0.8, img_boxes_h < img_shape[0] * 0.8)
+
+ kitti_output_file = os.path.join(kitti_output_dir, '%06d.txt' % sample_id)
+ with open(kitti_output_file, 'w') as f:
+ for k in range(bbox3d.shape[0]):
+ if box_valid_mask[k] == 0:
+ continue
+ x, z, ry = bbox3d[k, 0], bbox3d[k, 2], bbox3d[k, 6]
+ beta = np.arctan2(z, x)
+ alpha = -np.sign(beta) * np.pi / 2 + beta + ry
+
+ f.write('%s -1 -1 %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f\n' %
+ (cfg.CLASSES, alpha, img_boxes[k, 0], img_boxes[k, 1], img_boxes[k, 2], img_boxes[k, 3],
+ bbox3d[k, 3], bbox3d[k, 4], bbox3d[k, 5], bbox3d[k, 0], bbox3d[k, 1], bbox3d[k, 2],
+ bbox3d[k, 6], scores[k]))
+