diff --git a/PaddleCV/Paddle3D/PointRCNN/.gitignore b/PaddleCV/Paddle3D/PointRCNN/.gitignore
new file mode 100644
index 0000000000000000000000000000000000000000..9ea6e75c687e4ac93fa06d18bd0d1444e5d3b054
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/.gitignore
@@ -0,0 +1,14 @@
*log*
checkpoints*
build
output
result_dir
pp_pointrcnn*
data/gt_database
utils/pts_utils/dist
utils/pts_utils/build
utils/pts_utils/pts_utils.egg-info
utils/cyops/*.c
utils/cyops/*.so
ext_op/src/*.o
ext_op/src/*.so
diff --git a/PaddleCV/Paddle3D/PointRCNN/README.md b/PaddleCV/Paddle3D/PointRCNN/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..0560203b293d1e12ab576dcc1bd66891b1a44af1
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/README.md
@@ -0,0 +1,338 @@
# PointRCNN 3D Object Detection Model

---
## Contents

- [Introduction](#introduction)
- [Quick Start](#quick-start)
- [References](#references)
- [Release Notes](#release-notes)

## Introduction

[PointRCNN](https://arxiv.org/abs/1812.04244), proposed by Shaoshuai Shi, Xiaogang Wang, and Hongsheng Li, is the first two-stage 3D object detector that operates on raw point clouds only. Stage one uses PointNet++ with MSG (Multi-scale Grouping) as the backbone to segment the raw point cloud into foreground and background points and to generate bounding-box proposals from the foreground points. Stage two refines and re-scores the generated bounding boxes in a canonical coordinate system. The model also introduces a bin-based scheme that turns box regression into a classification problem plus a small residual regression, which it shows to be effective for 3D bounding-box estimation (see the sketch below). Evaluated on the KITTI dataset, PointRCNN achieved the best performance on the KITTI 3D object detection leaderboard at the time of publication.

The network architecture is shown below:

<p align="center">
PointRCNN network architecture
</p>
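To make the bin-based idea concrete, the following minimal NumPy sketch encodes a continuous center offset as a bin index (a classification target) plus a normalized residual (a regression target). It is illustrative only and is not the repo's target-construction code; the `loc_scope`/`bin_size` defaults merely mirror the `LOC_SCOPE`/`LOC_BIN_SIZE` entries in `cfgs/default.yml`.

```
import numpy as np

def encode_offset(delta, loc_scope=3.0, bin_size=0.5):
    # Illustrative sketch of bin-based encoding, not the repo's exact code:
    # clip the offset into [-loc_scope, loc_scope), shift to [0, 2 * loc_scope)
    shifted = np.clip(delta, -loc_scope, loc_scope - 1e-3) + loc_scope
    bin_idx = int(shifted / bin_size)                             # classification target
    residual = (shifted - (bin_idx + 0.5) * bin_size) / bin_size  # regression target in [-0.5, 0.5)
    return bin_idx, residual

def decode_offset(bin_idx, residual, loc_scope=3.0, bin_size=0.5):
    # inverse of encode_offset
    return (bin_idx + 0.5 + residual) * bin_size - loc_scope

bin_idx, res = encode_offset(1.37)
print(bin_idx, res, decode_offset(bin_idx, res))  # recovers ~1.37
```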
**Note:** PointRCNN relies on custom C++ operators, which can currently only be compiled for GPU devices on Linux/Unix. This model **cannot run on Windows systems or CPU-only devices**.


## Quick Start

### Installation

**Install [PaddlePaddle](https://github.com/PaddlePaddle/Paddle):**

Running the sample code in this directory requires the PaddlePaddle Fluid [develop daily build](https://www.paddlepaddle.org.cn/install/doc/tables#多版本whl包列表-dev-11), or PaddlePaddle compiled from the [develop branch](https://github.com/PaddlePaddle/Paddle/tree/develop) sources.

To keep the custom operators compatible with your Paddle version, we recommend **compiling Paddle from source**; see the [build-from-source instructions](https://www.paddlepaddle.org.cn/install/doc/source/ubuntu).

**Install PointRCNN:**

1. Download the [PaddlePaddle/models](https://github.com/PaddlePaddle/models) repository

Download the Paddle models repository with:

```
git clone https://github.com/PaddlePaddle/models
```

2. Download [pybind11](https://github.com/pybind/pybind11) into the `PaddleCV/Paddle3D/PointRCNN` directory

`pts_utils` is compiled against `pybind11`, so the `pybind11` repository must be cloned into the `PaddleCV/Paddle3D/PointRCNN` directory:

```
cd PaddleCV/Paddle3D/PointRCNN
git clone https://github.com/pybind/pybind11
```

3. Build and install the `pts_utils`, `kitti_utils`, `roipool3d_utils` and `iou_utils` modules

Build and install these modules with:
```
sh build_and_install.sh
```

4. Install the Python dependencies

Install the required Python packages with:

```
pip install -r requirement.txt
```

**Note:** The KITTI mAP evaluation tool requires Python 3.6 or later, and `scikit-image`, `Numba` and `fire` must be installed in the Python 3 environment. The `scikit-image`, `Numba` and `fire` entries in `requirement.txt` are exactly the dependencies of this tool.

### Compiling the custom operators

Make sure your Paddle is a PaddlePaddle Fluid develop daily build or was compiled from the Paddle develop branch sources; **compiling from source is recommended**.

Compile the custom operators as follows:

    Enter the `ext_op/src` directory and run the build script:
    ```
    cd ext_op/src
    sh make.sh
    ```

    After a successful build, `pointnet2_lib.so` is generated under `ext_op/src`.

    Run the following to make sure the custom operators were compiled correctly:

    ```
    # add the dynamic library path to LD_LIBRARY_PATH
    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:`python -c 'import paddle; print(paddle.sysconfig.get_lib())'`

    # go back to the ext_op directory and extend PYTHONPATH
    cd ..
    export PYTHONPATH=$PYTHONPATH:`pwd`

    # run the unit tests
    python tests/test_farthest_point_sampling_op.py
    python tests/test_gather_point_op.py
    python tests/test_group_points_op.py
    python tests/test_query_ball_op.py
    python tests/test_three_interp_op.py
    python tests/test_three_nn_op.py
    ```
    Each test prints a summary like the following on success:

    ```
    .
    ----------------------------------------------------------------------
    Ran 1 test in 13.205s

    OK
    ```

**Note:** The custom operators are compiled in the same way as in [PointNet++](../PointNet++); for more details see [Compiling custom operators](../PointNet++/ext_op/README.md).

### Data preparation

**KITTI 3D object detection dataset:**

PointRCNN is trained on the [KITTI 3D object detection](http://www.cvlibs.net/datasets/kitti/eval_object.php?obj_benchmark=3d) dataset.

Download the dataset with:

```
cd data/KITTI/object
sh download.sh
```

The images are only used for visualization. Training also uses the [road planes](https://drive.google.com/file/d/1d5mq0RXRnvHPVeKx6Q612z0YRO1t2wAp/view?usp=sharing) data for data augmentation; download it and extract it into the `./data/KITTI/object/training` directory.

The data directory is laid out as follows:

```
PointRCNN
├── data
│   ├── KITTI
│   │   ├── ImageSets
│   │   ├── object
│   │   │   ├──training
│   │   │   │   ├──calib & velodyne & label_2 & image_2 & planes
│   │   │   ├──testing
│   │   │   │   ├──calib & velodyne & image_2

```


### Training

**PointRCNN model:**

Train the PointRCNN model as follows:

1. Select a single GPU and set the dynamic library path

```
# train on a single GPU
export CUDA_VISIBLE_DEVICES=0

# add the dynamic library path to LD_LIBRARY_PATH
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:`python -c 'import paddle; print(paddle.sysconfig.get_lib())'`
```

2. Generate the Ground Truth sampling database (you can sanity-check the result with the sketch below):

```
python tools/generate_gt_database.py --class_name 'Car' --split train
```
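As a quick sanity check of step 2, the database can be inspected with a few lines of Python. This assumes the pickle layout that `data/kitti_rcnn_reader.py` later loads (a list of dicts with `points`, `intensity`, `gt_box3d` and `obj` entries); the exact file name under `data/gt_database` may differ from the hypothetical one used here.

```
import pickle

# hypothetical path; check the actual file name under data/gt_database
with open('data/gt_database/train_gt_database_3level_Car.pkl', 'rb') as f:
    gt_database = pickle.load(f)

print('num sampled objects:', len(gt_database))
obj = gt_database[0]
print('keys:', sorted(obj.keys()))           # expect gt_box3d / intensity / obj / points
print('points shape:', obj['points'].shape)  # (N, 3) points inside the object box
print('gt_box3d:', obj['gt_box3d'])          # (7,) x, y, z, h, w, l, ry
```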
3. Train the RPN model

```
python train.py --cfg=./cfgs/default.yml \
                --train_mode=rpn \
                --batch_size=16 \
                --epoch=200 \
                --save_dir=checkpoints
```

RPN checkpoints are saved under `checkpoints/rpn` by default; the location can be changed with `--save_dir`.

4. Generate augmented offline scenes and save the RPN model's output features and ROIs for offline RCNN training

Generate the augmented offline scene data with:

```
python tools/generate_aug_scene.py --class_name 'Car' --split train --aug_times 4
```

Then save the RPN model's output features and ROIs for the offline augmented data. Use `--ckpt_dir` to point to the final RPN weights, which are saved under `checkpoints/rpn` by default. When saving features and ROIs, `TEST.SPLIT` must be set to `train_aug`, `TEST.RPN_POST_NMS_TOP_N` to `300`, and `TEST.RPN_NMS_THRESH` to `0.85`. Use `--output_dir` to choose where the features and ROIs are written (default `./output`).

```
python eval.py --cfg=cfgs/default.yml \
               --eval_mode=rpn \
               --ckpt_dir=./checkpoints/rpn/199 \
               --save_rpn_feature \
               --output_dir=output \
               --set TEST.SPLIT train_aug TEST.RPN_POST_NMS_TOP_N 300 TEST.RPN_NMS_THRESH 0.85
```

The data saved under `--output_dir` is laid out as follows:

```
output
├── detections
│   ├── data                 # saved ROIs
│   │   ├── 000000.txt
│   │   ├── 000003.txt
│   │   ├── ...
├── features                 # saved output features
│   ├── 000000_intensity.npy
│   ├── 000000.npy
│   ├── 000000_rawscore.npy
│   ├── 000000_seg.npy
│   ├── 000000_xyz.npy
│   ├── ...
├── seg_result               # saved semantic segmentation results
│   ├── 000000.npy
│   ├── 000003.npy
│   ├── ...
```

5. Train the RCNN model offline, pointing `--rcnn_training_roi_dir` and `--rcnn_training_feature_dir` at the ROIs and features saved by the RPN model.

```
python train.py --cfg=./cfgs/default.yml \
                --train_mode=rcnn_offline \
                --batch_size=4 \
                --epoch=30 \
                --save_dir=checkpoints \
                --rcnn_training_roi_dir=output/detections/data \
                --rcnn_training_feature_dir=output/features
```

RCNN weights are saved under `checkpoints/rcnn` by default; the location can be changed with `--save_dir`.

**Note**: The best model is obtained by saving the RPN features/ROIs and training the RCNN model on the offline augmented data, and this is currently the only supported mode.


### Evaluation

**PointRCNN model:**

Evaluate the PointRCNN model as follows:

1. Select a single GPU and set the dynamic library path

```
# evaluate on a single GPU
export CUDA_VISIBLE_DEVICES=0

# add the dynamic library path to LD_LIBRARY_PATH
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:`python -c 'import paddle; print(paddle.sysconfig.get_lib())'`

```

2. Save the RPN model's output features and ROIs for the evaluation data

Save the features and ROIs for the evaluation data with the command below. Use `--ckpt_dir` to point to the final RPN weights, which are saved under `checkpoints/rpn` by default, and `--output_dir` to choose where the features and ROIs are written (default `./output`).

```
python eval.py --cfg=cfgs/default.yml \
               --eval_mode=rpn \
               --ckpt_dir=./checkpoints/rpn/199 \
               --save_rpn_feature \
               --output_dir=output/val
```

The directory layout of the saved features and ROIs is the same as for the offline augmented data above.

3. Evaluate the offline RCNN model

Evaluate the offline RCNN model with:

```
python eval.py --cfg=cfgs/default.yml \
               --eval_mode=rcnn_offline \
               --ckpt_dir=./checkpoints/rcnn_offline/29 \
               --rcnn_eval_roi_dir=output/val/detections/data \
               --rcnn_eval_feature_dir=output/val/features \
               --save_result
```

The final detection results are saved in the `final_result` folder under `./result_dir`; passing `--save_result` additionally saves the `roi_output` and `refine_output` result files.
`result_dir` is laid out as follows:

```
result_dir
├── final_result
│   ├── data                 # final detection results
│   │   ├── 000001.txt
│   │   ├── 000002.txt
│   │   ├── ...
├── roi_output
│   ├── data                 # ROI results produced by the RCNN model
│   │   ├── 000001.txt
│   │   ├── 000002.txt
│   │   ├── ...
├── refine_output
│   ├── data                 # decoded refinement results
│   │   ├── 000001.txt
│   │   ├── 000002.txt
│   │   ├── ...
```
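The `.txt` files under `final_result/data` follow the KITTI label format, one object per line with the detection score appended as the last field. Below is a minimal parser; the field layout is the standard KITTI one, not something specific to this repo.

```
import numpy as np

def load_kitti_detections(txt_path):
    dets = []
    with open(txt_path) as f:
        for line in f:
            if not line.strip():
                continue
            vals = line.strip().split(' ')
            dets.append({
                'type': vals[0],                           # e.g. 'Car'
                'alpha': float(vals[3]),                   # observation angle
                'bbox2d': [float(v) for v in vals[4:8]],   # x1, y1, x2, y2
                'hwl': [float(v) for v in vals[8:11]],     # h, w, l
                'xyz': [float(v) for v in vals[11:14]],    # location in rect camera coords
                'ry': float(vals[14]),                     # rotation around y
                'score': float(vals[15]),                  # confidence
            })
    return dets

dets = load_kitti_detections('result_dir/final_result/data/000001.txt')
print(len(dets), dets[0]['type'] if dets else None)
```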
4. Obtain the evaluation results with the KITTI mAP tool

If evaluation runs under Python 3.6 or later, the KITTI mAP evaluation is executed automatically. The KITTI mAP tool only supports Python 3.6+, so if you evaluated with an older Python you must run it separately with a suitable interpreter:

```
python3 kitti_map.py
```

Evaluation results with the final trained weights ([RPN model](https://paddlemodels.bj.bcebos.com/Paddle3D/pointrcnn_rpn.tar) and [RCNN model](https://paddlemodels.bj.bcebos.com/Paddle3D/pointrcnn_rcnn_offline.tar)) are as follows:

| Car AP@  | 0.70(easy) | 0.70(moderate) | 0.70(hard) |
| :------- | :--------: | :------------: | :--------: |
| bbox AP: | 90.20      | 88.85          | 88.59      |
| bev AP:  | 89.50      | 86.97          | 85.58      |
| 3d AP:   | 86.66      | 76.65          | 75.90      |
| aos AP:  | 90.10      | 88.64          | 88.26      |


## References

- [PointRCNN: 3D Object Proposal Generation and Detection From Point Cloud](https://arxiv.org/abs/1812.04244), Shaoshuai Shi, Xiaogang Wang, Hongsheng Li.
- [PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space](https://arxiv.org/abs/1706.02413), Charles R. Qi, Li Yi, Hao Su, Leonidas J. Guibas.
- [PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation](https://www.semanticscholar.org/paper/PointNet%3A-Deep-Learning-on-Point-Sets-for-3D-and-Qi-Su/d997beefc0922d97202789d2ac307c55c2c52fba), Charles Ruizhongtai Qi, Hao Su, Kaichun Mo, Leonidas J. Guibas.

## Release Notes

- 11/2019: Added the PointRCNN model.

diff --git a/PaddleCV/Paddle3D/PointRCNN/build_and_install.sh b/PaddleCV/Paddle3D/PointRCNN/build_and_install.sh
new file mode 100644
index 0000000000000000000000000000000000000000..83aaef84704445cf9c7bf3e87cc453e0daa708cd
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/build_and_install.sh
@@ -0,0 +1,7 @@
# compile cyops
python utils/cyops/setup.py develop

# compile and install pts_utils
cd utils/pts_utils
python setup.py install
cd ../..
diff --git a/PaddleCV/Paddle3D/PointRCNN/cfgs/default.yml b/PaddleCV/Paddle3D/PointRCNN/cfgs/default.yml
new file mode 100644
index 0000000000000000000000000000000000000000..33dc45086ca48128174fc341e7f9fdee9374d53e
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/cfgs/default.yml
@@ -0,0 +1,167 @@
# This config is based on https://github.com/sshaoshuai/PointRCNN/blob/master/tools/cfgs/default.yaml
CLASSES: Car

INCLUDE_SIMILAR_TYPE: True

# config of augmentation
AUG_DATA: True
AUG_METHOD_LIST: ['rotation', 'scaling', 'flip']
AUG_METHOD_PROB: [1.0, 1.0, 0.5]
AUG_ROT_RANGE: 18

GT_AUG_ENABLED: True
GT_EXTRA_NUM: 15
GT_AUG_RAND_NUM: True
GT_AUG_APPLY_PROB: 1.0
GT_AUG_HARD_RATIO: 0.6

PC_REDUCE_BY_RANGE: True
PC_AREA_SCOPE: [[-40, 40], [-1, 3], [0, 70.4]]  # x, y, z scope in rect camera coords
CLS_MEAN_SIZE: [[1.52563191462, 1.62856739989, 3.88311640418]]

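# Note on the bin-based parameters used below: LOC_SCOPE and LOC_BIN_SIZE
# discretize center offsets into 2 * LOC_SCOPE / LOC_BIN_SIZE bins (a
# classification target plus a per-bin residual), and NUM_HEAD_BIN does the
# same for the heading angle; see the bin-based description in README.md.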
# 1. config of rpn network
RPN:
    ENABLED: True
    FIXED: False

    # config of input
    USE_INTENSITY: False

    # config of bin-based loss
    LOC_XZ_FINE: True
    LOC_SCOPE: 3.0
    LOC_BIN_SIZE: 0.5
    NUM_HEAD_BIN: 12

    # config of network structure
    BACKBONE: pointnet2_msg
    USE_BN: True
    NUM_POINTS: 16384

    SA_CONFIG:
        NPOINTS: [4096, 1024, 256, 64]
        RADIUS: [[0.1, 0.5], [0.5, 1.0], [1.0, 2.0], [2.0, 4.0]]
        NSAMPLE: [[16, 32], [16, 32], [16, 32], [16, 32]]
        MLPS: [[[16, 16, 32], [32, 32, 64]],
               [[64, 64, 128], [64, 96, 128]],
               [[128, 196, 256], [128, 196, 256]],
               [[256, 256, 512], [256, 384, 512]]]
    FP_MLPS: [[128, 128], [256, 256], [512, 512], [512, 512]]
    CLS_FC: [128]
    REG_FC: [128]
    DP_RATIO: 0.5

    # config of training
    LOSS_CLS: SigmoidFocalLoss
    FG_WEIGHT: 15
    FOCAL_ALPHA: [0.25, 0.75]
    FOCAL_GAMMA: 2.0
    REG_LOSS_WEIGHT: [1.0, 1.0, 1.0, 1.0]
    LOSS_WEIGHT: [1.0, 1.0]
    NMS_TYPE: normal

    # config of testing
    SCORE_THRESH: 0.3

# 2. config of rcnn network
RCNN:
    ENABLED: True

    # config of input
    ROI_SAMPLE_JIT: False
    REG_AUG_METHOD: multiple  # multiple, single, normal
    ROI_FG_AUG_TIMES: 10

    USE_RPN_FEATURES: True
    USE_MASK: True
    MASK_TYPE: seg
    USE_INTENSITY: False
    USE_DEPTH: True
    USE_SEG_SCORE: False

    POOL_EXTRA_WIDTH: 1.0

    # config of bin-based loss
    LOC_SCOPE: 1.5
    LOC_BIN_SIZE: 0.5
    NUM_HEAD_BIN: 9
    LOC_Y_BY_BIN: False
    LOC_Y_SCOPE: 0.5
    LOC_Y_BIN_SIZE: 0.25
    SIZE_RES_ON_ROI: False

    # config of network structure
    USE_BN: False
    DP_RATIO: 0.0

    BACKBONE: pointnet  # pointnet
    XYZ_UP_LAYER: [128, 128]

    NUM_POINTS: 512
    SA_CONFIG:
        NPOINTS: [128, 32, -1]
        RADIUS: [0.2, 0.4, 100]
        NSAMPLE: [64, 64, 64]
        MLPS: [[128, 128, 128],
               [128, 128, 256],
               [256, 256, 512]]
    CLS_FC: [256, 256]
    REG_FC: [256, 256]

    # config of training
    LOSS_CLS: BinaryCrossEntropy
    FOCAL_ALPHA: [0.25, 0.75]
    FOCAL_GAMMA: 2.0
    CLS_WEIGHT: [1.0, 1.0, 1.0]
    CLS_FG_THRESH: 0.6
    CLS_BG_THRESH: 0.45
    CLS_BG_THRESH_LO: 0.05
    REG_FG_THRESH: 0.55
    FG_RATIO: 0.5
    ROI_PER_IMAGE: 64
    HARD_BG_RATIO: 0.8

    # config of testing
    SCORE_THRESH: 0.3
    NMS_THRESH: 0.1

# general training config
TRAIN:
    SPLIT: train
    VAL_SPLIT: smallval

    LR: 0.002
    LR_CLIP: 0.00001
    LR_DECAY: 0.5
    DECAY_STEP_LIST: [100, 150, 180, 200]
    LR_WARMUP: True
    WARMUP_MIN: 0.0002
    WARMUP_EPOCH: 1

    BN_MOMENTUM: 0.1
    BN_DECAY: 0.5
    BNM_CLIP: 0.01
    BN_DECAY_STEP_LIST: [1000]

    OPTIMIZER: adam  # adam, adam_onecycle
    WEIGHT_DECAY: 0.001  # L2 regularization
    MOMENTUM: 0.9

    MOMS: [0.95, 0.85]
    DIV_FACTOR: 10.0
    PCT_START: 0.4

    GRAD_NORM_CLIP: 1.0

    RPN_PRE_NMS_TOP_N: 9000
    RPN_POST_NMS_TOP_N: 512
    RPN_NMS_THRESH: 0.85
    RPN_DISTANCE_BASED_PROPOSE: True

TEST:
    SPLIT: val
    RPN_PRE_NMS_TOP_N: 9000
    RPN_POST_NMS_TOP_N: 100
    RPN_NMS_THRESH: 0.8
    RPN_DISTANCE_BASED_PROPOSE: True
diff --git a/PaddleCV/Paddle3D/PointRCNN/data/KITTI/object/download.sh b/PaddleCV/Paddle3D/PointRCNN/data/KITTI/object/download.sh
new file mode 100644
index 0000000000000000000000000000000000000000..1f5818d38323c5cc7349022ba82d2a55315a59a7
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/data/KITTI/object/download.sh
@@ -0,0 +1,25 @@
DIR="$( cd "$(dirname "$0")" ; pwd -P )"
cd "$DIR"

echo "Downloading https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_velodyne.zip"
wget https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_velodyne.zip
echo "Downloading https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_image_2.zip"
wget https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_image_2.zip
echo "Downloading https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_calib.zip"
wget https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_calib.zip
echo "Downloading https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_label_2.zip"
wget https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_label_2.zip

echo "Decompressing data_object_velodyne.zip"
unzip data_object_velodyne.zip
echo "Decompressing data_object_image_2.zip"
unzip data_object_image_2.zip
echo "Decompressing data_object_calib.zip"
unzip data_object_calib.zip
echo "Decompressing data_object_label_2.zip"
unzip data_object_label_2.zip

echo "Downloading KITTI ImageSets"
wget https://paddlemodels.bj.bcebos.com/Paddle3D/pointrcnn_kitti_imagesets.tar
tar xf pointrcnn_kitti_imagesets.tar
mv ImageSets ..
diff --git a/PaddleCV/Paddle3D/PointRCNN/data/__init__.py b/PaddleCV/Paddle3D/PointRCNN/data/__init__.py
new file mode 100644
index 0000000000000000000000000000000000000000..46a4f6ee220f10f50a182f4a2ed510b0551f64a8
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/data/__init__.py
@@ -0,0 +1,13 @@
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
#
#Licensed under the Apache License, Version 2.0 (the "License");
#you may not use this file except in compliance with the License.
#You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
#Unless required by applicable law or agreed to in writing, software
#distributed under the License is distributed on an "AS IS" BASIS,
#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
#See the License for the specific language governing permissions and
#limitations under the License.
diff --git a/PaddleCV/Paddle3D/PointRCNN/data/kitti_dataset.py b/PaddleCV/Paddle3D/PointRCNN/data/kitti_dataset.py
new file mode 100644
index 0000000000000000000000000000000000000000..0765a5045f6e330646fde26fe391eb313d022124
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/data/kitti_dataset.py
@@ -0,0 +1,77 @@
"""
This code is based on https://github.com/sshaoshuai/PointRCNN/blob/master/lib/datasets/kitti_dataset.py
"""

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import os
import cv2
import numpy as np
import utils.calibration as calibration
from utils.object3d import get_objects_from_label
from PIL import Image

__all__ = ["KittiDataset"]


class KittiDataset(object):
    def __init__(self, data_dir, split='train'):
        assert split in ['train', 'train_aug', 'val', 'test'], "unknown split {}".format(split)
        self.split = split
        self.is_test = self.split == 'test'
        self.imageset_dir = os.path.join(data_dir, 'KITTI', 'object', 'testing' if self.is_test else 'training')

        split_dir = os.path.join(data_dir, 'KITTI', 'ImageSets', split + '.txt')
        self.image_idx_list = [x.strip() for x in open(split_dir).readlines()]
        self.num_sample = self.image_idx_list.__len__()

        self.image_dir = os.path.join(self.imageset_dir, 'image_2')
        self.lidar_dir = os.path.join(self.imageset_dir, 'velodyne')
        self.calib_dir = os.path.join(self.imageset_dir, 'calib')
        self.label_dir = os.path.join(self.imageset_dir, 'label_2')
        self.plane_dir = os.path.join(self.imageset_dir, 'planes')

    def get_image(self, idx):
        img_file = os.path.join(self.image_dir, '%06d.png' % idx)
        assert os.path.exists(img_file)
        return cv2.imread(img_file)  # (H, W, 3) BGR mode

    def get_image_shape(self,
idx): + img_file = os.path.join(self.image_dir, '%06d.png' % idx) + assert os.path.exists(img_file) + im = Image.open(img_file) + width, height = im.size + return height, width, 3 + + def get_lidar(self, idx): + lidar_file = os.path.join(self.lidar_dir, '%06d.bin' % idx) + assert os.path.exists(lidar_file) + return np.fromfile(lidar_file, dtype=np.float32).reshape(-1, 4) + + def get_calib(self, idx): + calib_file = os.path.join(self.calib_dir, '%06d.txt' % idx) + assert os.path.exists(calib_file) + return calibration.Calibration(calib_file) + + def get_label(self, idx): + label_file = os.path.join(self.label_dir, '%06d.txt' % idx) + assert os.path.exists(label_file) + # return kitti_utils.get_objects_from_label(label_file) + return get_objects_from_label(label_file) + + def get_road_plane(self, idx): + plane_file = os.path.join(self.plane_dir, '%06d.txt' % idx) + with open(plane_file, 'r') as f: + lines = f.readlines() + lines = [float(i) for i in lines[3].split()] + plane = np.asarray(lines) + + # Ensure normal is always facing up, this is in the rectified camera coordinate + if plane[1] > 0: + plane = -plane + + norm = np.linalg.norm(plane[0:3]) + plane = plane / norm + return plane diff --git a/PaddleCV/Paddle3D/PointRCNN/data/kitti_rcnn_reader.py b/PaddleCV/Paddle3D/PointRCNN/data/kitti_rcnn_reader.py new file mode 100644 index 0000000000000000000000000000000000000000..811a20b28402f0c7119a03605e9e90074ad99097 --- /dev/null +++ b/PaddleCV/Paddle3D/PointRCNN/data/kitti_rcnn_reader.py @@ -0,0 +1,1184 @@ +# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve. +# +#Licensed under the Apache License, Version 2.0 (the "License"); +#you may not use this file except in compliance with the License. +#You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +#Unless required by applicable law or agreed to in writing, software +#distributed under the License is distributed on an "AS IS" BASIS, +#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +#See the License for the specific language governing permissions and +#limitations under the License. 
+""" +This code is based on https://github.com/sshaoshuai/PointRCNN/blob/master/lib/datasets/kitti_rcnn_dataset.py +""" + +from __future__ import absolute_import +from __future__ import division +from __future__ import print_function + +import os +import logging +import multiprocessing +import numpy as np +import scipy +from scipy.spatial import Delaunay +try: + import cPickle as pickle +except: + import pickle + +import pts_utils +import utils.cyops.kitti_utils as kitti_utils +import utils.cyops.roipool3d_utils as roipool3d_utils +from data.kitti_dataset import KittiDataset +from utils.config import cfg +from collections import OrderedDict + +__all__ = ["KittiRCNNReader"] + +logger = logging.getLogger(__name__) + + +def has_empty(data): + for d in data: + if isinstance(d, np.ndarray) and len(d) == 0: + return True + return False + + +def in_hull(p, hull): + """ + :param p: (N, K) test points + :param hull: (M, K) M corners of a box + :return (N) bool + """ + try: + if not isinstance(hull, Delaunay): + hull = Delaunay(hull) + flag = hull.find_simplex(p) >= 0 + except scipy.spatial.qhull.QhullError: + logger.debug('Warning: not a hull.') + flag = np.zeros(p.shape[0], dtype=np.bool) + + return flag + + +class KittiRCNNReader(KittiDataset): + def __init__(self, data_dir, npoints=16384, split='train', classes='Car', mode='TRAIN', + random_select=True, rcnn_training_roi_dir=None, rcnn_training_feature_dir=None, + rcnn_eval_roi_dir=None, rcnn_eval_feature_dir=None, gt_database_dir=None): + super(KittiRCNNReader, self).__init__(data_dir=data_dir, split=split) + if classes == 'Car': + self.classes = ('Background', 'Car') + aug_scene_data_dir = os.path.join(data_dir, 'KITTI', 'aug_scene') + elif classes == 'People': + self.classes = ('Background', 'Pedestrian', 'Cyclist') + elif classes == 'Pedestrian': + self.classes = ('Background', 'Pedestrian') + aug_scene_data_dir = os.path.join(data_dir, 'KITTI', 'aug_scene_ped') + elif classes == 'Cyclist': + self.classes = ('Background', 'Cyclist') + aug_scene_data_dir = os.path.join(data_dir, 'KITTI', 'aug_scene_cyclist') + else: + assert False, "Invalid classes: %s" % classes + + self.num_classes = len(self.classes) + + self.npoints = npoints + self.sample_id_list = [] + self.random_select = random_select + + if split == 'train_aug': + self.aug_label_dir = os.path.join(aug_scene_data_dir, 'training', 'aug_label') + self.aug_pts_dir = os.path.join(aug_scene_data_dir, 'training', 'rectified_data') + else: + self.aug_label_dir = os.path.join(aug_scene_data_dir, 'training', 'aug_label') + self.aug_pts_dir = os.path.join(aug_scene_data_dir, 'training', 'rectified_data') + + # for rcnn training + self.rcnn_training_bbox_list = [] + self.rpn_feature_list = {} + self.pos_bbox_list = [] + self.neg_bbox_list = [] + self.far_neg_bbox_list = [] + self.rcnn_eval_roi_dir = rcnn_eval_roi_dir + self.rcnn_eval_feature_dir = rcnn_eval_feature_dir + self.rcnn_training_roi_dir = rcnn_training_roi_dir + self.rcnn_training_feature_dir = rcnn_training_feature_dir + + self.gt_database = None + + if not self.random_select: + logger.warning('random select is False') + + assert mode in ['TRAIN', 'EVAL', 'TEST'], 'Invalid mode: %s' % mode + self.mode = mode + + if cfg.RPN.ENABLED: + if gt_database_dir is not None: + self.gt_database = pickle.load(open(gt_database_dir, 'rb')) + + if cfg.GT_AUG_HARD_RATIO > 0: + easy_list, hard_list = [], [] + for k in range(self.gt_database.__len__()): + obj = self.gt_database[k] + if obj['points'].shape[0] > 100: + easy_list.append(obj) + else: + 
hard_list.append(obj)
                    self.gt_database = [easy_list, hard_list]
                    logger.info('Loading gt_database(easy(pt_num>100): %d, hard(pt_num<=100): %d) from %s'
                                % (len(easy_list), len(hard_list), gt_database_dir))
                else:
                    logger.info('Loading gt_database(%d) from %s' % (len(self.gt_database), gt_database_dir))

            if mode == 'TRAIN':
                self.preprocess_rpn_training_data()
            else:
                self.sample_id_list = [int(sample_id) for sample_id in self.image_idx_list]
                logger.info('Load testing samples from %s' % self.imageset_dir)
                logger.info('Done: total test samples %d' % len(self.sample_id_list))
        elif cfg.RCNN.ENABLED:
            for idx in range(0, self.num_sample):
                sample_id = int(self.image_idx_list[idx])
                obj_list = self.filtrate_objects(self.get_label(sample_id))
                if len(obj_list) == 0:
                    # logger.info('No gt classes: %06d' % sample_id)
                    continue
                self.sample_id_list.append(sample_id)

            logger.info('Done: filter %s results for rcnn training: %d / %d\n' %
                        (self.mode, len(self.sample_id_list), len(self.image_idx_list)))

    def preprocess_rpn_training_data(self):
        """
        Discard samples which don't have current classes, which will not be used for training.
        Valid sample_id is stored in self.sample_id_list
        """
        logger.info('Loading %s samples from %s ...' % (self.mode, self.label_dir))
        for idx in range(0, self.num_sample):
            sample_id = int(self.image_idx_list[idx])
            obj_list = self.filtrate_objects(self.get_label(sample_id))
            if len(obj_list) == 0:
                logger.debug('No gt classes: %06d' % sample_id)
                continue
            self.sample_id_list.append(sample_id)

        logger.info('Done: filter %s results: %d / %d\n' % (self.mode, len(self.sample_id_list),
                                                            len(self.image_idx_list)))

    def get_label(self, idx):
        if idx < 10000:
            label_file = os.path.join(self.label_dir, '%06d.txt' % idx)
        else:
            label_file = os.path.join(self.aug_label_dir, '%06d.txt' % idx)

        assert os.path.exists(label_file)
        return kitti_utils.get_objects_from_label(label_file)

    def get_image(self, idx):
        return super(KittiRCNNReader, self).get_image(idx % 10000)

    def get_image_shape(self, idx):
        return super(KittiRCNNReader, self).get_image_shape(idx % 10000)

    def get_calib(self, idx):
        return super(KittiRCNNReader, self).get_calib(idx % 10000)

    def get_road_plane(self, idx):
        return super(KittiRCNNReader, self).get_road_plane(idx % 10000)

    @staticmethod
    def get_rpn_features(rpn_feature_dir, idx):
        rpn_feature_file = os.path.join(rpn_feature_dir, '%06d.npy' % idx)
        rpn_xyz_file = os.path.join(rpn_feature_dir, '%06d_xyz.npy' % idx)
        rpn_intensity_file = os.path.join(rpn_feature_dir, '%06d_intensity.npy' % idx)
        if cfg.RCNN.USE_SEG_SCORE:
            rpn_seg_file = os.path.join(rpn_feature_dir, '%06d_rawscore.npy' % idx)
            rpn_seg_score = np.load(rpn_seg_file).reshape(-1)
            # numpy sigmoid; the original torch.sigmoid call relied on torch, which is not imported here
            rpn_seg_score = 1.0 / (1.0 + np.exp(-rpn_seg_score))
        else:
            rpn_seg_file = os.path.join(rpn_feature_dir, '%06d_seg.npy' % idx)
            rpn_seg_score = np.load(rpn_seg_file).reshape(-1)
        return np.load(rpn_xyz_file), np.load(rpn_feature_file), np.load(rpn_intensity_file).reshape(-1), rpn_seg_score

    def filtrate_objects(self, obj_list):
        """
        Discard objects which are not in self.classes (or its similar classes)
        :param obj_list: list
        :return: list
        """
        type_whitelist = self.classes
        if self.mode == 'TRAIN' and cfg.INCLUDE_SIMILAR_TYPE:
            type_whitelist = list(self.classes)
            if 'Car' in self.classes:
                type_whitelist.append('Van')
            if 'Pedestrian' in self.classes:  # or 'Cyclist' in self.classes:
type_whitelist.append('Person_sitting') + + valid_obj_list = [] + for obj in obj_list: + if obj.cls_type not in type_whitelist: # rm Van, 20180928 + continue + if self.mode == 'TRAIN' and cfg.PC_REDUCE_BY_RANGE and (self.check_pc_range(obj.pos) is False): + continue + valid_obj_list.append(obj) + return valid_obj_list + + @staticmethod + def filtrate_dc_objects(obj_list): + valid_obj_list = [] + for obj in obj_list: + if obj.cls_type in ['DontCare']: + continue + valid_obj_list.append(obj) + + return valid_obj_list + + @staticmethod + def check_pc_range(xyz): + """ + :param xyz: [x, y, z] + :return: + """ + x_range, y_range, z_range = cfg.PC_AREA_SCOPE + if (x_range[0] <= xyz[0] <= x_range[1]) and (y_range[0] <= xyz[1] <= y_range[1]) and \ + (z_range[0] <= xyz[2] <= z_range[1]): + return True + return False + + @staticmethod + def get_valid_flag(pts_rect, pts_img, pts_rect_depth, img_shape): + """ + Valid point should be in the image (and in the PC_AREA_SCOPE) + :param pts_rect: + :param pts_img: + :param pts_rect_depth: + :param img_shape: + :return: + """ + val_flag_1 = np.logical_and(pts_img[:, 0] >= 0, pts_img[:, 0] < img_shape[1]) + val_flag_2 = np.logical_and(pts_img[:, 1] >= 0, pts_img[:, 1] < img_shape[0]) + val_flag_merge = np.logical_and(val_flag_1, val_flag_2) + pts_valid_flag = np.logical_and(val_flag_merge, pts_rect_depth >= 0) + + if cfg.PC_REDUCE_BY_RANGE: + x_range, y_range, z_range = cfg.PC_AREA_SCOPE + pts_x, pts_y, pts_z = pts_rect[:, 0], pts_rect[:, 1], pts_rect[:, 2] + range_flag = (pts_x >= x_range[0]) & (pts_x <= x_range[1]) \ + & (pts_y >= y_range[0]) & (pts_y <= y_range[1]) \ + & (pts_z >= z_range[0]) & (pts_z <= z_range[1]) + pts_valid_flag = pts_valid_flag & range_flag + return pts_valid_flag + + def get_rpn_sample(self, index): + sample_id = int(self.sample_id_list[index]) + if sample_id < 10000: + calib = self.get_calib(sample_id) + # img = self.get_image(sample_id) + img_shape = self.get_image_shape(sample_id) + pts_lidar = self.get_lidar(sample_id) + + # get valid point (projected points should be in image) + pts_rect = calib.lidar_to_rect(pts_lidar[:, 0:3]) + pts_intensity = pts_lidar[:, 3] + else: + calib = self.get_calib(sample_id % 10000) + # img = self.get_image(sample_id % 10000) + img_shape = self.get_image_shape(sample_id % 10000) + + pts_file = os.path.join(self.aug_pts_dir, '%06d.bin' % sample_id) + assert os.path.exists(pts_file), '%s' % pts_file + aug_pts = np.fromfile(pts_file, dtype=np.float32).reshape(-1, 4) + pts_rect, pts_intensity = aug_pts[:, 0:3], aug_pts[:, 3] + + pts_img, pts_rect_depth = calib.rect_to_img(pts_rect) + pts_valid_flag = self.get_valid_flag(pts_rect, pts_img, pts_rect_depth, img_shape) + + pts_rect = pts_rect[pts_valid_flag][:, 0:3] + pts_intensity = pts_intensity[pts_valid_flag] + + if cfg.GT_AUG_ENABLED and self.mode == 'TRAIN': + # all labels for checking overlapping + all_gt_obj_list = self.filtrate_dc_objects(self.get_label(sample_id)) + all_gt_boxes3d = kitti_utils.objs_to_boxes3d(all_gt_obj_list) + + gt_aug_flag = False + if np.random.rand() < cfg.GT_AUG_APPLY_PROB: + # augment one scene + gt_aug_flag, pts_rect, pts_intensity, extra_gt_boxes3d, extra_gt_obj_list = \ + self.apply_gt_aug_to_one_scene(sample_id, pts_rect, pts_intensity, all_gt_boxes3d) + + # generate inputs + if self.mode == 'TRAIN' or self.random_select: + if self.npoints < len(pts_rect): + pts_depth = pts_rect[:, 2] + pts_near_flag = pts_depth < 40.0 + far_idxs_choice = np.where(pts_near_flag == 0)[0] + near_idxs = np.where(pts_near_flag == 1)[0] + 
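                # keep every far point (depth >= 40m) and randomly subsample the
                # near points so that exactly self.npoints points remain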
near_idxs_choice = np.random.choice(near_idxs, self.npoints - len(far_idxs_choice), replace=False) + + choice = np.concatenate((near_idxs_choice, far_idxs_choice), axis=0) \ + if len(far_idxs_choice) > 0 else near_idxs_choice + np.random.shuffle(choice) + else: + choice = np.arange(0, len(pts_rect), dtype=np.int32) + if self.npoints > len(pts_rect): + extra_choice = np.random.choice(choice, self.npoints - len(pts_rect), replace=False) + choice = np.concatenate((choice, extra_choice), axis=0) + np.random.shuffle(choice) + + ret_pts_rect = pts_rect[choice, :] + ret_pts_intensity = pts_intensity[choice] - 0.5 # translate intensity to [-0.5, 0.5] + else: + ret_pts_rect = np.zeros((self.npoints, pts_rect.shape[1])).astype(pts_rect.dtype) + num_ = min(self.npoints, pts_rect.shape[0]) + ret_pts_rect[:num_] = pts_rect[:num_] + + ret_pts_intensity = pts_intensity - 0.5 + + pts_features = [ret_pts_intensity.reshape(-1, 1)] + ret_pts_features = np.concatenate(pts_features, axis=1) if pts_features.__len__() > 1 else pts_features[0] + + sample_info = {'sample_id': sample_id, 'random_select': self.random_select} + + if self.mode == 'TEST': + if cfg.RPN.USE_INTENSITY: + pts_input = np.concatenate((ret_pts_rect, ret_pts_features), axis=1) # (N, C) + else: + pts_input = ret_pts_rect + sample_info['pts_input'] = pts_input + sample_info['pts_rect'] = ret_pts_rect + sample_info['pts_features'] = ret_pts_features + return sample_info + + gt_obj_list = self.filtrate_objects(self.get_label(sample_id)) + if cfg.GT_AUG_ENABLED and self.mode == 'TRAIN' and gt_aug_flag: + gt_obj_list.extend(extra_gt_obj_list) + gt_boxes3d = kitti_utils.objs_to_boxes3d(gt_obj_list) + + gt_alpha = np.zeros((gt_obj_list.__len__()), dtype=np.float32) + for k, obj in enumerate(gt_obj_list): + gt_alpha[k] = obj.alpha + + # data augmentation + aug_pts_rect = ret_pts_rect.copy() + aug_gt_boxes3d = gt_boxes3d.copy() + if cfg.AUG_DATA and self.mode == 'TRAIN': + aug_pts_rect, aug_gt_boxes3d, aug_method = self.data_augmentation(aug_pts_rect, aug_gt_boxes3d, gt_alpha, + sample_id) + sample_info['aug_method'] = aug_method + + # prepare input + if cfg.RPN.USE_INTENSITY: + pts_input = np.concatenate((aug_pts_rect, ret_pts_features), axis=1) # (N, C) + else: + pts_input = aug_pts_rect + + if cfg.RPN.FIXED: + sample_info['pts_input'] = pts_input + sample_info['pts_rect'] = aug_pts_rect + sample_info['pts_features'] = ret_pts_features + sample_info['gt_boxes3d'] = aug_gt_boxes3d + return sample_info + + if self.mode == 'EVAL' and aug_gt_boxes3d.shape[0] == 0: + aug_gt_boxes3d = np.zeros((1, aug_gt_boxes3d.shape[1])) + + # generate training labels + rpn_cls_label, rpn_reg_label = self.generate_rpn_training_labels(aug_pts_rect, aug_gt_boxes3d) + sample_info['pts_input'] = pts_input + sample_info['pts_rect'] = aug_pts_rect + sample_info['pts_features'] = ret_pts_features + sample_info['rpn_cls_label'] = rpn_cls_label + sample_info['rpn_reg_label'] = rpn_reg_label + sample_info['gt_boxes3d'] = aug_gt_boxes3d + return sample_info + + def apply_gt_aug_to_one_scene(self, sample_id, pts_rect, pts_intensity, all_gt_boxes3d): + """ + :param pts_rect: (N, 3) + :param all_gt_boxex3d: (M2, 7) + :return: + """ + assert self.gt_database is not None + # extra_gt_num = np.random.randint(10, 15) + # try_times = 50 + if cfg.GT_AUG_RAND_NUM: + extra_gt_num = np.random.randint(10, cfg.GT_EXTRA_NUM) + else: + extra_gt_num = cfg.GT_EXTRA_NUM + try_times = 100 + cnt = 0 + cur_gt_boxes3d = all_gt_boxes3d.copy() + cur_gt_boxes3d[:, 4] += 0.5 # TODO: consider different objects 
+ cur_gt_boxes3d[:, 5] += 0.5 # enlarge new added box to avoid too nearby boxes + cur_gt_corners = kitti_utils.boxes3d_to_corners3d(cur_gt_boxes3d) + + extra_gt_obj_list = [] + extra_gt_boxes3d_list = [] + new_pts_list, new_pts_intensity_list = [], [] + src_pts_flag = np.ones(pts_rect.shape[0], dtype=np.int32) + + road_plane = self.get_road_plane(sample_id) + a, b, c, d = road_plane + + while try_times > 0: + if cnt > extra_gt_num: + break + + try_times -= 1 + if cfg.GT_AUG_HARD_RATIO > 0: + p = np.random.rand() + if p > cfg.GT_AUG_HARD_RATIO: + # use easy sample + rand_idx = np.random.randint(0, len(self.gt_database[0])) + new_gt_dict = self.gt_database[0][rand_idx] + else: + # use hard sample + rand_idx = np.random.randint(0, len(self.gt_database[1])) + new_gt_dict = self.gt_database[1][rand_idx] + else: + rand_idx = np.random.randint(0, self.gt_database.__len__()) + new_gt_dict = self.gt_database[rand_idx] + + new_gt_box3d = new_gt_dict['gt_box3d'].copy() + new_gt_points = new_gt_dict['points'].copy() + new_gt_intensity = new_gt_dict['intensity'].copy() + new_gt_obj = new_gt_dict['obj'] + center = new_gt_box3d[0:3] + if cfg.PC_REDUCE_BY_RANGE and (self.check_pc_range(center) is False): + continue + + if new_gt_points.__len__() < 5: # too few points + continue + + # put it on the road plane + cur_height = (-d - a * center[0] - c * center[2]) / b + move_height = new_gt_box3d[1] - cur_height + new_gt_box3d[1] -= move_height + new_gt_points[:, 1] -= move_height + new_gt_obj.pos[1] -= move_height + + new_enlarged_box3d = new_gt_box3d.copy() + new_enlarged_box3d[4] += 0.5 + new_enlarged_box3d[5] += 0.5 # enlarge new added box to avoid too nearby boxes + + cnt += 1 + new_corners = kitti_utils.boxes3d_to_corners3d(new_enlarged_box3d.reshape(1, 7)) + iou3d = kitti_utils.get_iou3d(new_corners, cur_gt_corners) + valid_flag = iou3d.max() < 1e-8 + if not valid_flag: + continue + + enlarged_box3d = new_gt_box3d.copy() + enlarged_box3d[3] += 2 # remove the points above and below the object + + boxes_pts_mask_list = pts_utils.pts_in_boxes3d(pts_rect, + enlarged_box3d.reshape(1, 7)) + pt_mask_flag = (boxes_pts_mask_list[0] == 1) + src_pts_flag[pt_mask_flag] = 0 # remove the original points which are inside the new box + + new_pts_list.append(new_gt_points) + new_pts_intensity_list.append(new_gt_intensity) + cur_gt_boxes3d = np.concatenate((cur_gt_boxes3d, new_enlarged_box3d.reshape(1, 7)), axis=0) + cur_gt_corners = np.concatenate((cur_gt_corners, new_corners), axis=0) + extra_gt_boxes3d_list.append(new_gt_box3d.reshape(1, 7)) + extra_gt_obj_list.append(new_gt_obj) + + if new_pts_list.__len__() == 0: + return False, pts_rect, pts_intensity, None, None + + extra_gt_boxes3d = np.concatenate(extra_gt_boxes3d_list, axis=0) + # remove original points and add new points + pts_rect = pts_rect[src_pts_flag == 1] + pts_intensity = pts_intensity[src_pts_flag == 1] + new_pts_rect = np.concatenate(new_pts_list, axis=0) + new_pts_intensity = np.concatenate(new_pts_intensity_list, axis=0) + pts_rect = np.concatenate((pts_rect, new_pts_rect), axis=0) + pts_intensity = np.concatenate((pts_intensity, new_pts_intensity), axis=0) + + return True, pts_rect, pts_intensity, extra_gt_boxes3d, extra_gt_obj_list + + def rotate_box3d_along_y(self, box3d, rot_angle): + old_x, old_z, ry = box3d[0], box3d[2], box3d[6] + old_beta = np.arctan2(old_z, old_x) + alpha = -np.sign(old_beta) * np.pi / 2 + old_beta + ry + box3d = kitti_utils.rotate_pc_along_y(box3d.reshape(1, 7), rot_angle=rot_angle)[0] + new_x, new_z = box3d[0], box3d[2] + 
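        # recompute ry from the rotated position so that alpha (the observation
        # angle derived above) is preserved by the rotation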
new_beta = np.arctan2(new_z, new_x) + box3d[6] = np.sign(new_beta) * np.pi / 2 + alpha - new_beta + return box3d + + def data_augmentation(self, aug_pts_rect, aug_gt_boxes3d, gt_alpha, sample_id=None, mustaug=False, stage=1): + """ + :param aug_pts_rect: (N, 3) + :param aug_gt_boxes3d: (N, 7) + :param gt_alpha: (N) + :return: + """ + aug_list = cfg.AUG_METHOD_LIST + aug_enable = 1 - np.random.rand(3) + if mustaug is True: + aug_enable[0] = -1 + aug_enable[1] = -1 + aug_method = [] + if 'rotation' in aug_list and aug_enable[0] < cfg.AUG_METHOD_PROB[0]: + angle = np.random.uniform(-np.pi / cfg.AUG_ROT_RANGE, np.pi / cfg.AUG_ROT_RANGE) + aug_pts_rect = kitti_utils.rotate_pc_along_y(aug_pts_rect, rot_angle=angle) + if stage == 1: + # xyz change, hwl unchange + aug_gt_boxes3d = kitti_utils.rotate_pc_along_y(aug_gt_boxes3d, rot_angle=angle) + + # calculate the ry after rotation + x, z = aug_gt_boxes3d[:, 0], aug_gt_boxes3d[:, 2] + beta = np.arctan2(z, x) + new_ry = np.sign(beta) * np.pi / 2 + gt_alpha - beta + aug_gt_boxes3d[:, 6] = new_ry # TODO: not in [-np.pi / 2, np.pi / 2] + elif stage == 2: + # for debug stage-2, this implementation has little float precision difference with the above one + assert aug_gt_boxes3d.shape[0] == 2 + aug_gt_boxes3d[0] = self.rotate_box3d_along_y(aug_gt_boxes3d[0], angle) + aug_gt_boxes3d[1] = self.rotate_box3d_along_y(aug_gt_boxes3d[1], angle) + else: + raise NotImplementedError + + aug_method.append(['rotation', angle]) + + if 'scaling' in aug_list and aug_enable[1] < cfg.AUG_METHOD_PROB[1]: + scale = np.random.uniform(0.95, 1.05) + aug_pts_rect = aug_pts_rect * scale + aug_gt_boxes3d[:, 0:6] = aug_gt_boxes3d[:, 0:6] * scale + aug_method.append(['scaling', scale]) + + if 'flip' in aug_list and aug_enable[2] < cfg.AUG_METHOD_PROB[2]: + # flip horizontal + aug_pts_rect[:, 0] = -aug_pts_rect[:, 0] + aug_gt_boxes3d[:, 0] = -aug_gt_boxes3d[:, 0] + # flip orientation: ry > 0: pi - ry, ry < 0: -pi - ry + if stage == 1: + aug_gt_boxes3d[:, 6] = np.sign(aug_gt_boxes3d[:, 6]) * np.pi - aug_gt_boxes3d[:, 6] + elif stage == 2: + assert aug_gt_boxes3d.shape[0] == 2 + aug_gt_boxes3d[0, 6] = np.sign(aug_gt_boxes3d[0, 6]) * np.pi - aug_gt_boxes3d[0, 6] + aug_gt_boxes3d[1, 6] = np.sign(aug_gt_boxes3d[1, 6]) * np.pi - aug_gt_boxes3d[1, 6] + else: + raise NotImplementedError + + aug_method.append('flip') + + return aug_pts_rect, aug_gt_boxes3d, aug_method + + @staticmethod + def generate_rpn_training_labels(pts_rect, gt_boxes3d): + cls_label = np.zeros((pts_rect.shape[0]), dtype=np.int32) + reg_label = np.zeros((pts_rect.shape[0], 7), dtype=np.float32) # dx, dy, dz, ry, h, w, l + gt_corners = kitti_utils.boxes3d_to_corners3d(gt_boxes3d, rotate=True) + extend_gt_boxes3d = kitti_utils.enlarge_box3d(gt_boxes3d, extra_width=0.2) + extend_gt_corners = kitti_utils.boxes3d_to_corners3d(extend_gt_boxes3d, rotate=True) + for k in range(gt_boxes3d.shape[0]): + box_corners = gt_corners[k] + fg_pt_flag = in_hull(pts_rect, box_corners) + fg_pts_rect = pts_rect[fg_pt_flag] + cls_label[fg_pt_flag] = 1 + + # enlarge the bbox3d, ignore nearby points + extend_box_corners = extend_gt_corners[k] + fg_enlarge_flag = in_hull(pts_rect, extend_box_corners) + ignore_flag = np.logical_xor(fg_pt_flag, fg_enlarge_flag) + cls_label[ignore_flag] = -1 + + # pixel offset of object center + center3d = gt_boxes3d[k][0:3].copy() # (x, y, z) + center3d[1] -= gt_boxes3d[k][3] / 2 + reg_label[fg_pt_flag, 0:3] = center3d - fg_pts_rect # Now y is the true center of 3d box 20180928 + + # size and angle encoding + 
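            # h, w, l and ry are stored as raw targets here; the bin-based
            # encoding from the config is applied later when the loss is computed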
reg_label[fg_pt_flag, 3] = gt_boxes3d[k][3] # h + reg_label[fg_pt_flag, 4] = gt_boxes3d[k][4] # w + reg_label[fg_pt_flag, 5] = gt_boxes3d[k][5] # l + reg_label[fg_pt_flag, 6] = gt_boxes3d[k][6] # ry + + return cls_label, reg_label + + def get_rcnn_sample_jit(self, index): + sample_id = int(self.sample_id_list[index]) + rpn_xyz, rpn_features, rpn_intensity, seg_mask = \ + self.get_rpn_features(self.rcnn_training_feature_dir, sample_id) + + # load rois and gt_boxes3d for this sample + roi_file = os.path.join(self.rcnn_training_roi_dir, '%06d.txt' % sample_id) + roi_obj_list = kitti_utils.get_objects_from_label(roi_file) + roi_boxes3d = kitti_utils.objs_to_boxes3d(roi_obj_list) + # roi_scores is not used currently + # roi_scores = kitti_utils.objs_to_scores(roi_obj_list) + + gt_obj_list = self.filtrate_objects(self.get_label(sample_id)) + gt_boxes3d = kitti_utils.objs_to_boxes3d(gt_obj_list) + sample_info = OrderedDict() + sample_info["sample_id"] = sample_id + sample_info['rpn_xyz'] = rpn_xyz + sample_info['rpn_features'] = rpn_features + sample_info['rpn_intensity'] = rpn_intensity + sample_info['seg_mask'] = seg_mask + sample_info['roi_boxes3d'] = roi_boxes3d + sample_info['pts_depth'] = np.linalg.norm(rpn_xyz, ord=2, axis=1) + sample_info['gt_boxes3d'] = gt_boxes3d + + return sample_info + + def sample_bg_inds(self, hard_bg_inds, easy_bg_inds, bg_rois_per_this_image): + if hard_bg_inds.size > 0 and easy_bg_inds.size > 0: + hard_bg_rois_num = int(bg_rois_per_this_image * cfg.RCNN.HARD_BG_RATIO) + easy_bg_rois_num = bg_rois_per_this_image - hard_bg_rois_num + + # sampling hard bg + rand_num = np.floor(np.random.rand(hard_bg_rois_num) * hard_bg_inds.size).astype(np.int32) + hard_bg_inds = hard_bg_inds[rand_num] + # sampling easy bg + rand_num = np.floor(np.random.rand(easy_bg_rois_num) * easy_bg_inds.size).astype(np.int32) + easy_bg_inds = easy_bg_inds[rand_num] + + bg_inds = np.concatenate([hard_bg_inds, easy_bg_inds], axis=0) + elif hard_bg_inds.size > 0 and easy_bg_inds.size == 0: + hard_bg_rois_num = bg_rois_per_this_image + # sampling hard bg + rand_num = np.floor(np.random.rand(hard_bg_rois_num) * hard_bg_inds.size).astype(np.int32) + bg_inds = hard_bg_inds[rand_num] + elif hard_bg_inds.size == 0 and easy_bg_inds.size > 0: + easy_bg_rois_num = bg_rois_per_this_image + # sampling easy bg + rand_num = np.floor(np.random.rand(easy_bg_rois_num) * easy_bg_inds.size).astype(np.int32) + bg_inds = easy_bg_inds[rand_num] + else: + raise NotImplementedError + + return bg_inds + + def aug_roi_by_noise_batch(self, roi_boxes3d, gt_boxes3d, aug_times=10): + """ + :param roi_boxes3d: (N, 7) + :param gt_boxes3d: (N, 7) + :return: + """ + iou_of_rois = np.zeros(roi_boxes3d.shape[0], dtype=np.float32) + for k in range(roi_boxes3d.__len__()): + temp_iou = cnt = 0 + roi_box3d = roi_boxes3d[k] + gt_box3d = gt_boxes3d[k] + pos_thresh = min(cfg.RCNN.REG_FG_THRESH, cfg.RCNN.CLS_FG_THRESH) + gt_corners = kitti_utils.boxes3d_to_corners3d(gt_box3d.reshape(1, 7), True) + aug_box3d = roi_box3d + while temp_iou < pos_thresh and cnt < aug_times: + if np.random.rand() < 0.2: + aug_box3d = roi_box3d # p=0.2 to keep the original roi box + else: + aug_box3d = self.random_aug_box3d(roi_box3d) + aug_corners = kitti_utils.boxes3d_to_corners3d(aug_box3d.reshape(1, 7), True) + iou3d = kitti_utils.get_iou3d(aug_corners, gt_corners) + temp_iou = iou3d[0][0] + cnt += 1 + roi_boxes3d[k] = aug_box3d + iou_of_rois[k] = temp_iou + return roi_boxes3d, iou_of_rois + + @staticmethod + def canonical_transform_batch(pts_input, 
roi_boxes3d, gt_boxes3d):
        """
        :param pts_input: (N, npoints, 3 + C)
        :param roi_boxes3d: (N, 7)
        :param gt_boxes3d: (N, 7)
        :return:
        """
        roi_ry = roi_boxes3d[:, 6] % (2 * np.pi)  # 0 ~ 2pi
        roi_center = roi_boxes3d[:, 0:3]
        # shift to center
        pts_input[:, :, [0, 1, 2]] = pts_input[:, :, [0, 1, 2]] - roi_center.reshape(-1, 1, 3)
        gt_boxes3d_ct = np.copy(gt_boxes3d)
        gt_boxes3d_ct[:, 0:3] = gt_boxes3d_ct[:, 0:3] - roi_center
        # rotate to the direction of head
        gt_boxes3d_ct = kitti_utils.rotate_pc_along_y_np(
            gt_boxes3d_ct.reshape(-1, 1, 7),
            roi_ry,
        )
        # TODO: check here
        gt_boxes3d_ct = gt_boxes3d_ct.reshape(-1, 7)
        gt_boxes3d_ct[:, 6] = gt_boxes3d_ct[:, 6] - roi_ry
        pts_input = kitti_utils.rotate_pc_along_y_np(
            pts_input,
            roi_ry
        )
        return pts_input, gt_boxes3d_ct

    def get_rcnn_training_sample_batch(self, index):
        sample_id = int(self.sample_id_list[index])
        rpn_xyz, rpn_features, rpn_intensity, seg_mask = \
            self.get_rpn_features(self.rcnn_training_feature_dir, sample_id)

        # load rois and gt_boxes3d for this sample
        roi_file = os.path.join(self.rcnn_training_roi_dir, '%06d.txt' % sample_id)
        roi_obj_list = kitti_utils.get_objects_from_label(roi_file)
        roi_boxes3d = kitti_utils.objs_to_boxes3d(roi_obj_list)
        # roi_scores = kitti_utils.objs_to_scores(roi_obj_list)

        gt_obj_list = self.filtrate_objects(self.get_label(sample_id))
        gt_boxes3d = kitti_utils.objs_to_boxes3d(gt_obj_list)

        # calculate original iou
        iou3d = kitti_utils.get_iou3d(kitti_utils.boxes3d_to_corners3d(roi_boxes3d, True),
                                      kitti_utils.boxes3d_to_corners3d(gt_boxes3d, True))
        max_overlaps, gt_assignment = iou3d.max(axis=1), iou3d.argmax(axis=1)
        max_iou_of_gt, roi_assignment = iou3d.max(axis=0), iou3d.argmax(axis=0)
        roi_assignment = roi_assignment[max_iou_of_gt > 0].reshape(-1)

        # sample fg, easy_bg, hard_bg
        fg_rois_per_image = int(np.round(cfg.RCNN.FG_RATIO * cfg.RCNN.ROI_PER_IMAGE))
        fg_thresh = min(cfg.RCNN.REG_FG_THRESH, cfg.RCNN.CLS_FG_THRESH)
        fg_inds = np.nonzero(max_overlaps >= fg_thresh)[0]
        fg_inds = np.concatenate((fg_inds, roi_assignment), axis=0)  # consider the roi which has max_overlaps with gt as fg

        easy_bg_inds = np.nonzero((max_overlaps < cfg.RCNN.CLS_BG_THRESH_LO))[0]
        hard_bg_inds = np.nonzero((max_overlaps < cfg.RCNN.CLS_BG_THRESH) &
                                  (max_overlaps >= cfg.RCNN.CLS_BG_THRESH_LO))[0]

        fg_num_rois = fg_inds.size
        bg_num_rois = hard_bg_inds.size + easy_bg_inds.size

        if fg_num_rois > 0 and bg_num_rois > 0:
            # sampling fg
            fg_rois_per_this_image = min(fg_rois_per_image, fg_num_rois)
            rand_num = np.random.permutation(fg_num_rois)
            fg_inds = fg_inds[rand_num[:fg_rois_per_this_image]]

            # sampling bg
            bg_rois_per_this_image = cfg.RCNN.ROI_PER_IMAGE - fg_rois_per_this_image
            bg_inds = self.sample_bg_inds(hard_bg_inds, easy_bg_inds, bg_rois_per_this_image)

        elif fg_num_rois > 0 and bg_num_rois == 0:
            # sampling fg; cast to int for indexing (the original torch .long() cast was dropped in the port)
            rand_num = np.floor(np.random.rand(cfg.RCNN.ROI_PER_IMAGE) * fg_num_rois).astype(np.int32)
            fg_inds = fg_inds[rand_num]
            fg_rois_per_this_image = cfg.RCNN.ROI_PER_IMAGE
            bg_rois_per_this_image = 0
        elif bg_num_rois > 0 and fg_num_rois == 0:
            # sampling bg
            bg_rois_per_this_image = cfg.RCNN.ROI_PER_IMAGE
            bg_inds = self.sample_bg_inds(hard_bg_inds, easy_bg_inds, bg_rois_per_this_image)
            fg_rois_per_this_image = 0
        else:
            raise NotImplementedError

        # augment the rois by noise
        roi_list, roi_iou_list, roi_gt_list = [], [], []
        if fg_rois_per_this_image > 0:
            fg_rois_src = roi_boxes3d[fg_inds].copy()
            gt_of_fg_rois = gt_boxes3d[gt_assignment[fg_inds]]
            fg_rois, fg_iou3d = self.aug_roi_by_noise_batch(fg_rois_src, gt_of_fg_rois, aug_times=10)
            roi_list.append(fg_rois)
            roi_iou_list.append(fg_iou3d)
            roi_gt_list.append(gt_of_fg_rois)

        if bg_rois_per_this_image > 0:
            bg_rois_src = roi_boxes3d[bg_inds].copy()
            gt_of_bg_rois = gt_boxes3d[gt_assignment[bg_inds]]
            bg_rois, bg_iou3d = self.aug_roi_by_noise_batch(bg_rois_src, gt_of_bg_rois, aug_times=1)
            roi_list.append(bg_rois)
            roi_iou_list.append(bg_iou3d)
            roi_gt_list.append(gt_of_bg_rois)

        rois = np.concatenate(roi_list, axis=0)
        iou_of_rois = np.concatenate(roi_iou_list, axis=0)
        gt_of_rois = np.concatenate(roi_gt_list, axis=0)

        # collect extra features for point cloud pooling
        if cfg.RCNN.USE_INTENSITY:
            pts_extra_input_list = [rpn_intensity.reshape(-1, 1), seg_mask.reshape(-1, 1)]
        else:
            pts_extra_input_list = [seg_mask.reshape(-1, 1)]

        if cfg.RCNN.USE_DEPTH:
            pts_depth = (np.linalg.norm(rpn_xyz, ord=2, axis=1) / 70.0) - 0.5
            pts_extra_input_list.append(pts_depth.reshape(-1, 1))
        pts_extra_input = np.concatenate(pts_extra_input_list, axis=1)

        # pts, pts_feature, boxes3d, pool_extra_width, sampled_pt_num
        pts_input, pts_features, pts_empty_flag = roipool3d_utils.roipool3d_cpu(
            rpn_xyz, rpn_features, rois, pts_extra_input,
            cfg.RCNN.POOL_EXTRA_WIDTH,
            sampled_pt_num=cfg.RCNN.NUM_POINTS,
            # canonical_transform=False
        )

        # data augmentation
        if cfg.AUG_DATA and self.mode == 'TRAIN':
            for k in range(rois.__len__()):
                aug_pts = pts_input[k, :, 0:3].copy()
                aug_gt_box3d = gt_of_rois[k].copy()
                aug_roi_box3d = rois[k].copy()

                # calculate alpha by ry
                temp_boxes3d = np.concatenate([aug_roi_box3d.reshape(1, 7), aug_gt_box3d.reshape(1, 7)], axis=0)
                temp_x, temp_z, temp_ry = temp_boxes3d[:, 0], temp_boxes3d[:, 2], temp_boxes3d[:, 6]
                temp_beta = np.arctan2(temp_z, temp_x).astype(np.float64)
                temp_alpha = -np.sign(temp_beta) * np.pi / 2 + temp_beta + temp_ry

                # data augmentation
                aug_pts, aug_boxes3d, aug_method = self.data_augmentation(aug_pts, temp_boxes3d, temp_alpha,
                                                                          mustaug=True, stage=2)

                # assign to original data
                pts_input[k, :, 0:3] = aug_pts
                rois[k] = aug_boxes3d[0]
                gt_of_rois[k] = aug_boxes3d[1]

        valid_mask = (pts_empty_flag == 0).astype(np.int32)
        # regression valid mask
        reg_valid_mask = (iou_of_rois > cfg.RCNN.REG_FG_THRESH).astype(np.int32) & valid_mask

        # classification label
        cls_label = (iou_of_rois > cfg.RCNN.CLS_FG_THRESH).astype(np.int32)
        invalid_mask = (iou_of_rois > cfg.RCNN.CLS_BG_THRESH) & (iou_of_rois < cfg.RCNN.CLS_FG_THRESH)
        cls_label[invalid_mask] = -1
        cls_label[valid_mask == 0] = -1

        # canonical transform and sampling
        pts_input_ct, gt_boxes3d_ct = self.canonical_transform_batch(pts_input, rois, gt_of_rois)

        pts_input_ = np.concatenate((pts_input_ct, pts_features), axis=-1)
        sample_info = OrderedDict()

        sample_info['sample_id'] = sample_id
        sample_info['pts_input'] = pts_input_
        sample_info['pts_feature'] = pts_features
        sample_info['roi_boxes3d'] = rois
        sample_info['cls_label'] = cls_label
        sample_info['reg_valid_mask'] = reg_valid_mask
        sample_info['gt_boxes3d_ct'] = gt_boxes3d_ct
        sample_info['gt_of_rois'] = gt_of_rois
        return sample_info

    @staticmethod
    def random_aug_box3d(box3d):
        """
        :param box3d: (7) [x, y, z, h, w, l, ry]
        random shift, scale, orientation
        """
        if cfg.RCNN.REG_AUG_METHOD == 'single':
            pos_shift = (np.random.rand(3) - 0.5)  # [-0.5 ~ 0.5]
            hwl_scale = (np.random.rand(3) - 0.5) / (0.5 / 0.15) + 1.0  # [0.85 ~ 1.15]
            angle_rot = (np.random.rand(1) - 0.5) / (0.5 / (np.pi / 12))  # [-pi/12 ~ pi/12]

            aug_box3d = np.concatenate([box3d[0:3] + pos_shift, box3d[3:6] * hwl_scale,
                                        box3d[6:7] + angle_rot])
            return aug_box3d
        elif cfg.RCNN.REG_AUG_METHOD == 'multiple':
            # pos_range, hwl_range, angle_range, mean_iou
            range_config = [[0.2, 0.1, np.pi / 12, 0.7],
                            [0.3, 0.15, np.pi / 12, 0.6],
                            [0.5, 0.15, np.pi / 9, 0.5],
                            [0.8, 0.15, np.pi / 6, 0.3],
                            [1.0, 0.15, np.pi / 3, 0.2]]
            idx = np.random.randint(len(range_config))

            pos_shift = ((np.random.rand(3) - 0.5) / 0.5) * range_config[idx][0]
            hwl_scale = ((np.random.rand(3) - 0.5) / 0.5) * range_config[idx][1] + 1.0
            angle_rot = ((np.random.rand(1) - 0.5) / 0.5) * range_config[idx][2]

            aug_box3d = np.concatenate([box3d[0:3] + pos_shift, box3d[3:6] * hwl_scale, box3d[6:7] + angle_rot])
            return aug_box3d
        elif cfg.RCNN.REG_AUG_METHOD == 'normal':
            x_shift = np.random.normal(loc=0, scale=0.3)
            y_shift = np.random.normal(loc=0, scale=0.2)
            z_shift = np.random.normal(loc=0, scale=0.3)
            h_shift = np.random.normal(loc=0, scale=0.25)
            w_shift = np.random.normal(loc=0, scale=0.15)
            l_shift = np.random.normal(loc=0, scale=0.5)
            ry_shift = ((np.random.rand() - 0.5) / 0.5) * np.pi / 12

            aug_box3d = np.array([box3d[0] + x_shift, box3d[1] + y_shift, box3d[2] + z_shift, box3d[3] + h_shift,
                                  box3d[4] + w_shift, box3d[5] + l_shift, box3d[6] + ry_shift])
            return aug_box3d
        else:
            raise NotImplementedError

    def get_proposal_from_file(self, index):
        sample_id = int(self.image_idx_list[index])
        proposal_file = os.path.join(self.rcnn_eval_roi_dir, '%06d.txt' % sample_id)
        roi_obj_list = kitti_utils.get_objects_from_label(proposal_file)

        rpn_xyz, rpn_features, rpn_intensity, seg_mask = self.get_rpn_features(self.rcnn_eval_feature_dir, sample_id)
        pts_rect, pts_rpn_features, pts_intensity = rpn_xyz, rpn_features, rpn_intensity

        roi_box3d_list, roi_scores = [], []
        for obj in roi_obj_list:
            box3d = np.array([obj.pos[0], obj.pos[1], obj.pos[2], obj.h, obj.w, obj.l, obj.ry], dtype=np.float32)
            roi_box3d_list.append(box3d.reshape(1, 7))
            roi_scores.append(obj.score)

        roi_boxes3d = np.concatenate(roi_box3d_list, axis=0)  # (N, 7)
        roi_scores = np.array(roi_scores, dtype=np.float32)  # (N)

        if cfg.RCNN.ROI_SAMPLE_JIT:
            sample_dict = {'sample_id': sample_id,
                           'rpn_xyz': rpn_xyz,
                           'rpn_features': rpn_features,
                           'seg_mask': seg_mask,
                           'roi_boxes3d': roi_boxes3d,
                           'roi_scores': roi_scores,
                           'pts_depth': np.linalg.norm(rpn_xyz, ord=2, axis=1)}

            if self.mode != 'TEST':
                gt_obj_list = self.filtrate_objects(self.get_label(sample_id))
                gt_boxes3d = kitti_utils.objs_to_boxes3d(gt_obj_list)

                roi_corners = kitti_utils.boxes3d_to_corners3d(roi_boxes3d, True)
                gt_corners = kitti_utils.boxes3d_to_corners3d(gt_boxes3d, True)
                iou3d = kitti_utils.get_iou3d(roi_corners, gt_corners)
                if gt_boxes3d.shape[0] > 0:
                    gt_iou = iou3d.max(axis=1)
                else:
                    gt_iou = np.zeros(roi_boxes3d.shape[0]).astype(np.float32)

                sample_dict['gt_boxes3d'] = gt_boxes3d
                sample_dict['gt_iou'] = gt_iou
            return sample_dict

        if cfg.RCNN.USE_INTENSITY:
            pts_extra_input_list = [pts_intensity.reshape(-1, 1), seg_mask.reshape(-1, 1)]
        else:
            pts_extra_input_list = [seg_mask.reshape(-1, 1)]

        if cfg.RCNN.USE_DEPTH:
            cur_depth = np.linalg.norm(pts_rect, axis=1, ord=2)
            cur_depth_norm = (cur_depth / 70.0) - 0.5
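            # depth normalized to roughly [-0.5, 0.5]; 70m is about the maximum
            # forward range allowed by cfg.PC_AREA_SCOPE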
pts_extra_input_list.append(cur_depth_norm.reshape(-1, 1)) + + pts_extra_input = np.concatenate(pts_extra_input_list, axis=1) + pts_input, pts_features, _ = roipool3d_utils.roipool3d_cpu( + pts_rect, pts_rpn_features, roi_boxes3d, pts_extra_input, + cfg.RCNN.POOL_EXTRA_WIDTH, sampled_pt_num=cfg.RCNN.NUM_POINTS, + canonical_transform=True + ) + pts_input = np.concatenate((pts_input, pts_features), axis=-1) + + sample_dict = OrderedDict() + sample_dict['sample_id'] = sample_id + sample_dict['pts_input'] = pts_input + sample_dict['pts_feature'] = pts_features + sample_dict['roi_boxes3d'] = roi_boxes3d + sample_dict['roi_scores'] = roi_scores + #sample_dict['roi_size'] = roi_boxes3d[:, 3:6] + + if self.mode == 'TEST': + return sample_dict + + gt_obj_list = self.filtrate_objects(self.get_label(sample_id)) + gt_boxes3d = np.zeros((gt_obj_list.__len__(), 7), dtype=np.float32) + + for k, obj in enumerate(gt_obj_list): + gt_boxes3d[k, 0:3], gt_boxes3d[k, 3], gt_boxes3d[k, 4], gt_boxes3d[k, 5], gt_boxes3d[k, 6] \ + = obj.pos, obj.h, obj.w, obj.l, obj.ry + + if gt_boxes3d.__len__() == 0: + gt_iou = np.zeros((roi_boxes3d.shape[0]), dtype=np.float32) + else: + roi_corners = kitti_utils.boxes3d_to_corners3d(roi_boxes3d,True) + gt_corners = kitti_utils.boxes3d_to_corners3d(gt_boxes3d,True) + iou3d = kitti_utils.get_iou3d(roi_corners, gt_corners) + gt_iou = iou3d.max(axis=1) + + sample_dict['gt_iou'] = gt_iou + sample_dict['gt_boxes3d'] = gt_boxes3d + + return sample_dict + + def __len__(self): + if cfg.RPN.ENABLED: + return len(self.sample_id_list) + elif cfg.RCNN.ENABLED: + if self.mode == 'TRAIN': + return len(self.sample_id_list) + else: + return len(self.image_idx_list) + else: + raise NotImplementedError + + def __getitem__(self, index): + if cfg.RPN.ENABLED: + return self.get_rpn_sample(index) + elif cfg.RCNN.ENABLED: + if self.mode == 'TRAIN': + if cfg.RCNN.ROI_SAMPLE_JIT: + return self.get_rcnn_sample_jit(index) + else: + return self.get_rcnn_training_sample_batch(index) + else: + return self.get_proposal_from_file(index) + else: + raise NotImplementedError + + def padding_batch(self, batch_data, batch_size): + max_roi = 0 + max_gt = 0 + + for k in range(batch_size): + # roi_boxes3d + max_roi = max(max_roi, batch_data[k][3].shape[0]) + # gt_boxes3d + max_gt = max(max_gt, batch_data[k][-1].shape[0]) + batch_roi_boxes3d = np.zeros((batch_size, max_roi, 7)) + batch_gt_boxes3d = np.zeros((batch_size, max_gt, 7), dtype=np.float32) + + for i, data in enumerate(batch_data): + roi_num = data[3].shape[0] + gt_num = data[-1].shape[0] + batch_roi_boxes3d[i,:roi_num,:] = data[3] + batch_gt_boxes3d[i,:gt_num,:] = data[-1] + + new_batch = [] + for i, data in enumerate(batch_data): + new_batch.append(data[:3]) + # roi_boxes3d + new_batch[i].append(batch_roi_boxes3d[i]) + # ... 
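            # fields between roi_boxes3d and gt_boxes3d (indices 4-6) are
            # passed through without padding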
+ new_batch[i].extend(data[4:7])
+ # gt_boxes3d
+ new_batch[i].append(batch_gt_boxes3d[i])
+ return new_batch
+
+ def padding_batch_eval(self, batch_data, batch_size):
+ max_pts = 0
+ max_feats = 0
+ max_roi = 0
+ max_score = 0
+ max_iou = 0
+ max_gt = 0
+
+ for k in range(batch_size):
+ # pts_input
+ max_pts = max(max_pts, batch_data[k][1].shape[0])
+ # pts_feature
+ max_feats = max(max_feats, batch_data[k][2].shape[0])
+ # roi_boxes3d
+ max_roi = max(max_roi, batch_data[k][3].shape[0])
+ # gt_iou
+ max_iou = max(max_iou, batch_data[k][-2].shape[0])
+ # gt_boxes3d
+ max_gt = max(max_gt, batch_data[k][-1].shape[0])
+ batch_pts_input = np.zeros((batch_size, max_pts, 512, 133), dtype=np.float32)
+ batch_pts_feat = np.zeros((batch_size, max_feats, 512, 128), dtype=np.float32)
+ batch_roi_boxes3d = np.zeros((batch_size, max_roi, 7), dtype=np.float32)
+ batch_gt_iou = np.zeros((batch_size, max_iou), dtype=np.float32)
+ batch_gt_boxes3d = np.zeros((batch_size, max_gt, 7), dtype=np.float32)
+
+ for i, data in enumerate(batch_data):
+ # num
+ pts_num = data[1].shape[0]
+ pts_feat_num = data[2].shape[0]
+ roi_num = data[3].shape[0]
+ iou_num = data[-2].shape[0]
+ gt_num = data[-1].shape[0]
+ # data
+ batch_pts_input[i, :pts_num, :, :] = data[1]
+ batch_pts_feat[i, :pts_feat_num, :, :] = data[2]
+ batch_roi_boxes3d[i,:roi_num,:] = data[3]
+ batch_gt_iou[i,:iou_num] = data[-2]
+ batch_gt_boxes3d[i,:gt_num,:] = data[-1]
+
+ new_batch = []
+ for i, data in enumerate(batch_data):
+ new_batch.append(data[:1])
+ new_batch[i].append(batch_pts_input[i])
+ new_batch[i].append(batch_pts_feat[i])
+ new_batch[i].append(batch_roi_boxes3d[i])
+ new_batch[i].append(data[4])
+ new_batch[i].append(batch_gt_iou[i])
+ new_batch[i].append(batch_gt_boxes3d[i])
+ return new_batch
+
+ def get_reader(self, batch_size, fields, drop_last=False):
+ def reader():
+ batch_out = []
+ idxs = np.arange(self.__len__())
+ if self.mode == 'TRAIN':
+ np.random.shuffle(idxs)
+ for idx in idxs:
+ sample_all = self.__getitem__(idx)
+ sample = [sample_all[f] for f in fields]
+ if has_empty(sample):
+ logger.info("sample with %d fields has an empty field, skipped" % len(sample))
+ continue
+ batch_out.append(sample)
+ if len(batch_out) >= batch_size:
+ if cfg.RPN.ENABLED:
+ yield batch_out
+ else:
+ if self.mode == 'TRAIN':
+ yield self.padding_batch(batch_out, batch_size)
+ elif self.mode == 'EVAL':
+ # batch_size should be 1 in rcnn_offline eval currently;
+ # if batch_size > 1, the batch should be padded as follows:
+ # yield self.padding_batch_eval(batch_out, batch_size)
+ yield batch_out
+ else:
+ logger.error("batch padding only supports TRAIN/EVAL mode")
+ batch_out = []
+ if not drop_last:
+ if len(batch_out) > 0:
+ yield batch_out
+ return reader
+
+ def get_multiprocess_reader(self, batch_size, fields, proc_num=8, max_queue_len=128, drop_last=False):
+ def read_to_queue(idxs, queue):
+ for idx in idxs:
+ sample_all = self.__getitem__(idx)
+ sample = [sample_all[f] for f in fields]
+ queue.put(sample)
+ queue.put(None)
+
+ def reader():
+ sample_num = self.__len__()
+ idxs = np.arange(self.__len__())
+ if self.mode == 'TRAIN':
+ np.random.shuffle(idxs)
+
+ proc_idxs = []
+ proc_sample_num = int(sample_num / proc_num)
+ start_idx = 0
+ for i in range(proc_num - 1):
+ proc_idxs.append(idxs[start_idx:start_idx + proc_sample_num])
+ start_idx += proc_sample_num
+ proc_idxs.append(idxs[start_idx:])
+
+ queue = multiprocessing.Queue(max_queue_len)
+ p_list = []
+ for i in range(proc_num):
+ p_list.append(multiprocessing.Process(
+ target=read_to_queue,
args=(proc_idxs[i], queue,)))
+ p_list[-1].start()
+
+ finish_num = 0
+ batch_out = []
+ while finish_num < len(p_list):
+ sample = queue.get()
+ if sample is None:
+ finish_num += 1
+ else:
+ batch_out.append(sample)
+ if len(batch_out) == batch_size:
+ yield batch_out
+ batch_out = []
+
+ # join worker processes
+ for p in p_list:
+ if p.is_alive():
+ p.join()
+
+ return reader
+ diff --git a/PaddleCV/Paddle3D/PointRCNN/eval.py b/PaddleCV/Paddle3D/PointRCNN/eval.py new file mode 100644 index 0000000000000000000000000000000000000000..7ee5d37f40bbee8a5486090b1ebda05f0d5928a8 --- /dev/null +++ b/PaddleCV/Paddle3D/PointRCNN/eval.py @@ -0,0 +1,343 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import os
+import sys
+import time
+import shutil
+import argparse
+import logging
+import multiprocessing
+import numpy as np
+from collections import OrderedDict
+import paddle
+import paddle.fluid as fluid
+
+from models.point_rcnn import PointRCNN
+from data.kitti_rcnn_reader import KittiRCNNReader
+from utils.run_utils import *
+from utils.config import cfg, load_config, set_config_from_list
+from utils.metric_utils import calc_iou_recall, rpn_metric, rcnn_metric
+
+logging.root.handlers = []
+FORMAT = '%(asctime)s-%(levelname)s: %(message)s'
+logging.basicConfig(level=logging.INFO, format=FORMAT, stream=sys.stdout)
+logger = logging.getLogger(__name__)
+
+np.random.seed(1024) # use same seed
+METRIC_PROC_NUM = 4
+
+
+def parse_args():
+ parser = argparse.ArgumentParser(
+ "PointRCNN evaluation script")
+ parser.add_argument(
+ '--cfg',
+ type=str,
+ default='cfgs/default.yml',
+ help='specify the config for evaluation')
+ parser.add_argument(
+ '--eval_mode',
+ type=str,
+ default='rpn',
+ required=True,
+ help='specify the evaluation mode')
+ parser.add_argument(
+ '--batch_size',
+ type=int,
+ default=1,
+ help='evaluation batch size, default 1')
+ parser.add_argument(
+ '--ckpt_dir',
+ type=str,
+ default='checkpoints/199',
+ help='specify a ckpt directory to be evaluated if needed')
+ parser.add_argument(
+ '--data_dir',
+ type=str,
+ default='./data',
+ help='KITTI dataset root directory')
+ parser.add_argument(
+ '--output_dir',
+ type=str,
+ default='output',
+ help='output directory')
+ parser.add_argument(
+ '--save_rpn_feature',
+ action='store_true',
+ default=False,
+ help='save features for separate rcnn training and evaluation')
+ parser.add_argument(
+ '--save_result',
+ action='store_true',
+ default=False,
+ help='save roi and refine results of evaluation')
+ parser.add_argument(
+ '--rcnn_eval_roi_dir',
+ type=str,
+ default=None,
+ help='specify the saved rois for rcnn evaluation when using rcnn_offline mode')
+ parser.add_argument(
+ '--rcnn_eval_feature_dir',
+ type=str,
+ default=None,
+ help='specify the saved features for rcnn evaluation when using rcnn_offline mode')
+ parser.add_argument(
+ '--log_interval',
+ type=int,
+ default=1,
+ help='mini-batch interval to log.')
+ parser.add_argument(
'--set',
+ dest='set_cfgs',
+ default=None,
+ nargs=argparse.REMAINDER,
+ help='set extra config keys if needed.')
+ args = parser.parse_args()
+ return args
+
+
+def eval():
+ args = parse_args()
+ print_arguments(args)
+ # check whether the installed paddle is compiled with GPU
+ # PointRCNN model can only run on GPU
+ check_gpu(True)
+
+ load_config(args.cfg)
+ if args.set_cfgs is not None:
+ set_config_from_list(args.set_cfgs)
+
+ if not os.path.isdir(args.output_dir):
+ os.makedirs(args.output_dir)
+
+ if args.eval_mode == 'rpn':
+ cfg.RPN.ENABLED = True
+ cfg.RCNN.ENABLED = False
+ elif args.eval_mode == 'rcnn':
+ cfg.RCNN.ENABLED = True
+ cfg.RPN.ENABLED = cfg.RPN.FIXED = True
+ assert args.batch_size == 1, "batch size must be 1 in rcnn evaluation"
+ elif args.eval_mode == 'rcnn_offline':
+ cfg.RCNN.ENABLED = True
+ cfg.RPN.ENABLED = False
+ assert args.batch_size == 1, "batch size must be 1 in rcnn_offline evaluation"
+ else:
+ raise NotImplementedError("unknown eval mode: {}".format(args.eval_mode))
+
+ place = fluid.CUDAPlace(0)
+ exe = fluid.Executor(place)
+
+ # build model
+ startup = fluid.Program()
+ eval_prog = fluid.Program()
+ with fluid.program_guard(eval_prog, startup):
+ with fluid.unique_name.guard():
+ eval_model = PointRCNN(cfg, args.batch_size, True, 'TEST')
+ eval_model.build()
+ eval_pyreader = eval_model.get_pyreader()
+ eval_feeds = eval_model.get_feeds()
+ eval_outputs = eval_model.get_outputs()
+ eval_prog = eval_prog.clone(True)
+
+ extra_keys = []
+ if args.eval_mode == 'rpn':
+ extra_keys.extend(['sample_id', 'rpn_cls_label', 'gt_boxes3d'])
+ if args.save_rpn_feature:
+ extra_keys.extend(['pts_rect', 'pts_features', 'pts_input',])
+ eval_keys, eval_values = parse_outputs(
+ eval_outputs, prog=eval_prog, extra_keys=extra_keys)
+
+ eval_compile_prog = fluid.compiler.CompiledProgram(
+ eval_prog).with_data_parallel()
+
+ exe.run(startup)
+
+ # load checkpoint
+ assert os.path.isdir(
+ args.ckpt_dir), "ckpt_dir {} not a directory".format(args.ckpt_dir)
+
+ def if_exist(var):
+ return os.path.exists(os.path.join(args.ckpt_dir, var.name))
+ fluid.io.load_vars(exe, args.ckpt_dir, eval_prog, predicate=if_exist)
+
+ kitti_feature_dir = os.path.join(args.output_dir, 'features')
+ kitti_output_dir = os.path.join(args.output_dir, 'detections', 'data')
+ seg_output_dir = os.path.join(args.output_dir, 'seg_result')
+ if args.save_rpn_feature:
+ if os.path.exists(kitti_feature_dir):
+ shutil.rmtree(kitti_feature_dir)
+ os.makedirs(kitti_feature_dir)
+ if os.path.exists(kitti_output_dir):
+ shutil.rmtree(kitti_output_dir)
+ os.makedirs(kitti_output_dir)
+ if os.path.exists(seg_output_dir):
+ shutil.rmtree(seg_output_dir)
+ os.makedirs(seg_output_dir)
+
+ # make sure these dirs exist
+ roi_output_dir = os.path.join('./result_dir', 'roi_result', 'data')
+ refine_output_dir = os.path.join('./result_dir', 'refine_result', 'data')
+ final_output_dir = os.path.join("./result_dir", 'final_result', 'data')
+ if not os.path.exists(final_output_dir):
+ os.makedirs(final_output_dir)
+ if args.save_result:
+ if not os.path.exists(roi_output_dir):
+ os.makedirs(roi_output_dir)
+ if not os.path.exists(refine_output_dir):
+ os.makedirs(refine_output_dir)
+
+ # get reader
+ kitti_rcnn_reader = KittiRCNNReader(data_dir=args.data_dir,
+ npoints=cfg.RPN.NUM_POINTS,
+ split=cfg.TEST.SPLIT,
+ mode='EVAL',
+ classes=cfg.CLASSES,
+ rcnn_eval_roi_dir=args.rcnn_eval_roi_dir,
+ rcnn_eval_feature_dir=args.rcnn_eval_feature_dir)
+ eval_reader = kitti_rcnn_reader.get_multiprocess_reader(args.batch_size,
eval_feeds) + eval_pyreader.decorate_sample_list_generator(eval_reader, place) + + thresh_list = [0.1, 0.3, 0.5, 0.7, 0.9] + queue = multiprocessing.Queue(128) + mgr = multiprocessing.Manager() + lock = multiprocessing.Lock() + mdict = mgr.dict() + if cfg.RPN.ENABLED: + mdict['exit_proc'] = 0 + mdict['total_gt_bbox'] = 0 + mdict['total_cnt'] = 0 + mdict['total_rpn_iou'] = 0 + for i in range(len(thresh_list)): + mdict['total_recalled_bbox_list_{}'.format(i)] = 0 + + p_list = [] + for i in range(METRIC_PROC_NUM): + p_list.append(multiprocessing.Process( + target=rpn_metric, + args=(queue, mdict, lock, thresh_list, args.save_rpn_feature, kitti_feature_dir, + seg_output_dir, kitti_output_dir, kitti_rcnn_reader, cfg.CLASSES))) + p_list[-1].start() + + if cfg.RCNN.ENABLED: + for i in range(len(thresh_list)): + mdict['total_recalled_bbox_list_{}'.format(i)] = 0 + mdict['total_roi_recalled_bbox_list_{}'.format(i)] = 0 + mdict['exit_proc'] = 0 + mdict['total_cls_acc'] = 0 + mdict['total_cls_acc_refined'] = 0 + mdict['total_det_num'] = 0 + mdict['total_gt_bbox'] = 0 + p_list = [] + for i in range(METRIC_PROC_NUM): + p_list.append(multiprocessing.Process( + target=rcnn_metric, + args=(queue, mdict, lock, thresh_list, kitti_rcnn_reader, roi_output_dir, + refine_output_dir, final_output_dir, args.save_result) + )) + p_list[-1].start() + + try: + eval_pyreader.start() + eval_iter = 0 + start_time = time.time() + + cur_time = time.time() + while True: + eval_outs = exe.run(eval_compile_prog, fetch_list=eval_values, return_numpy=False) + rets_dict = {k: (np.array(v), v.recursive_sequence_lengths()) + for k, v in zip(eval_keys, eval_outs)} + run_time = time.time() - cur_time + cur_time = time.time() + queue.put(rets_dict) + eval_iter += 1 + + logger.info("[EVAL] iter {}, time: {:.2f}".format( + eval_iter, run_time)) + + except fluid.core.EOFException: + # terminate metric process + for i in range(METRIC_PROC_NUM): + queue.put(None) + while mdict['exit_proc'] < METRIC_PROC_NUM: + time.sleep(1) + for p in p_list: + if p.is_alive(): + p.join() + + end_time = time.time() + logger.info("[EVAL] total {} iter finished, average time: {:.2f}".format( + eval_iter, (end_time - start_time) / float(eval_iter))) + + if cfg.RPN.ENABLED: + avg_rpn_iou = mdict['total_rpn_iou'] / max(len(kitti_rcnn_reader), 1.) 
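+ # the totals in `mdict` were accumulated by the rpn_metric worker
+ # processes via the shared manager dict; aggregate and log recall per
+ # IoU threshold below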
+ logger.info("average rpn iou: {:.3f}".format(avg_rpn_iou)) + total_gt_bbox = float(max(mdict['total_gt_bbox'], 1.0)) + for idx, thresh in enumerate(thresh_list): + recall = mdict['total_recalled_bbox_list_{}'.format(idx)] / total_gt_bbox + logger.info("total bbox recall(thresh={:.3f}): {} / {} = {:.3f}".format( + thresh, mdict['total_recalled_bbox_list_{}'.format(idx)], mdict['total_gt_bbox'], recall)) + + if cfg.RCNN.ENABLED: + cnt = float(max(eval_iter, 1.0)) + avg_cls_acc = mdict['total_cls_acc'] / cnt + avg_cls_acc_refined = mdict['total_cls_acc_refined'] / cnt + avg_det_num = mdict['total_det_num'] / cnt + + logger.info("avg_cls_acc: {}".format(avg_cls_acc)) + logger.info("avg_cls_acc_refined: {}".format(avg_cls_acc_refined)) + logger.info("avg_det_num: {}".format(avg_det_num)) + + total_gt_bbox = float(max(mdict['total_gt_bbox'], 1.0)) + for idx, thresh in enumerate(thresh_list): + cur_roi_recall = mdict['total_roi_recalled_bbox_list_{}'.format(idx)] / total_gt_bbox + logger.info('total roi bbox recall(thresh=%.3f): %d / %d = %f' % ( + thresh, mdict['total_roi_recalled_bbox_list_{}'.format(idx)], total_gt_bbox, cur_roi_recall)) + + for idx, thresh in enumerate(thresh_list): + cur_recall = mdict['total_recalled_bbox_list_{}'.format(idx)] / total_gt_bbox + logger.info('total bbox recall(thresh=%.2f) %d / %.2f = %.4f' % ( + thresh, mdict['total_recalled_bbox_list_{}'.format(idx)], total_gt_bbox, cur_recall)) + + split_file = os.path.join('./data/KITTI', 'ImageSets', 'val.txt') + image_idx_list = [x.strip() for x in open(split_file).readlines()] + for k in range(image_idx_list.__len__()): + cur_file = os.path.join(final_output_dir, '%s.txt' % image_idx_list[k]) + if not os.path.exists(cur_file): + with open(cur_file, 'w') as temp_f: + pass + + if float(sys.version[:3]) >= 3.6: + label_dir = os.path.join('./data/KITTI/object/training', 'label_2') + split_file = os.path.join('./data/KITTI', 'ImageSets', 'val.txt') + final_output_dir = os.path.join("./result_dir", 'final_result', 'data') + name_to_class = {'Car': 0, 'Pedestrian': 1, 'Cyclist': 2} + + from tools.kitti_object_eval_python.evaluate import evaluate as kitti_evaluate + ap_result_str, ap_dict = kitti_evaluate( + label_dir, final_output_dir, label_split_file=split_file, + current_class=name_to_class["Car"]) + + logger.info("KITTI evaluate: {}, {}".format(ap_result_str, ap_dict)) + + else: + logger.info("KITTI mAP only support python version >= 3.6, users can " + "run 'python3 tools/kitti_eval.py' to evaluate KITTI mAP.") + + finally: + eval_pyreader.reset() + + +if __name__ == "__main__": + eval() diff --git a/PaddleCV/Paddle3D/PointRCNN/ext_op b/PaddleCV/Paddle3D/PointRCNN/ext_op new file mode 120000 index 0000000000000000000000000000000000000000..dca99c677c8fa26e7cbf3ce1d50a8e6af0621655 --- /dev/null +++ b/PaddleCV/Paddle3D/PointRCNN/ext_op @@ -0,0 +1 @@ +../PointNet++/ext_op \ No newline at end of file diff --git a/PaddleCV/Paddle3D/PointRCNN/images/teaser.png b/PaddleCV/Paddle3D/PointRCNN/images/teaser.png new file mode 100644 index 0000000000000000000000000000000000000000..21ae7e98165074ef93dc34fc643b3fddc5fe6c36 Binary files /dev/null and b/PaddleCV/Paddle3D/PointRCNN/images/teaser.png differ diff --git a/PaddleCV/Paddle3D/PointRCNN/models/__init__.py b/PaddleCV/Paddle3D/PointRCNN/models/__init__.py new file mode 100644 index 0000000000000000000000000000000000000000..46a4f6ee220f10f50a182f4a2ed510b0551f64a8 --- /dev/null +++ b/PaddleCV/Paddle3D/PointRCNN/models/__init__.py @@ -0,0 +1,13 @@ +# Copyright (c) 2019 PaddlePaddle 
Authors. All Rights Reserve. +# +#Licensed under the Apache License, Version 2.0 (the "License"); +#you may not use this file except in compliance with the License. +#You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +#Unless required by applicable law or agreed to in writing, software +#distributed under the License is distributed on an "AS IS" BASIS, +#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +#See the License for the specific language governing permissions and +#limitations under the License. diff --git a/PaddleCV/Paddle3D/PointRCNN/models/loss_utils.py b/PaddleCV/Paddle3D/PointRCNN/models/loss_utils.py new file mode 100644 index 0000000000000000000000000000000000000000..04db2398d099b7edee10e72a11af710c0a509231 --- /dev/null +++ b/PaddleCV/Paddle3D/PointRCNN/models/loss_utils.py @@ -0,0 +1,201 @@ +# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve. +# +#Licensed under the Apache License, Version 2.0 (the "License"); +#you may not use this file except in compliance with the License. +#You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +#Unless required by applicable law or agreed to in writing, software +#distributed under the License is distributed on an "AS IS" BASIS, +#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +#See the License for the specific language governing permissions and +#limitations under the License. + +from __future__ import absolute_import +from __future__ import division +from __future__ import print_function + +import numpy as np + +import paddle.fluid as fluid +from paddle.fluid.param_attr import ParamAttr +from paddle.fluid.initializer import Constant + +__all__ = ["get_reg_loss"] + + +def sigmoid_focal_loss(logits, labels, weights, gamma=2.0, alpha=0.25): + sce_loss = fluid.layers.sigmoid_cross_entropy_with_logits(logits, labels) + prob = fluid.layers.sigmoid(logits) + p_t = labels * prob + (1.0 - labels) * (1.0 - prob) + modulating_factor = fluid.layers.pow(1.0 - p_t, gamma) + alpha_weight_factor = labels * alpha + (1.0 - labels) * (1.0 - alpha) + return modulating_factor * alpha_weight_factor * sce_loss * weights + + +def get_reg_loss(pred_reg, reg_label, fg_mask, point_num, loc_scope, + loc_bin_size, num_head_bin, anchor_size, + get_xz_fine=True, get_y_by_bin=False, loc_y_scope=0.5, + loc_y_bin_size=0.25, get_ry_fine=False): + + """ + Bin-based 3D bounding boxes regression loss. See https://arxiv.org/abs/1812.04244 for more details. 
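+
+ Each x/z location (and optionally y and heading) target is encoded as a
+ coarse bin classification plus a normalized residual regressed within the
+ bin, which the paper reports to be more stable than direct offset regression.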
+ + :param pred_reg: (N, C) + :param reg_label: (N, 7) [dx, dy, dz, h, w, l, ry] + :param loc_scope: constant + :param loc_bin_size: constant + :param num_head_bin: constant + :param anchor_size: (N, 3) or (3) + :param get_xz_fine: + :param get_y_by_bin: + :param loc_y_scope: + :param loc_y_bin_size: + :param get_ry_fine: + :return: + """ + fg_num = fluid.layers.cast(fluid.layers.reduce_sum(fg_mask), dtype=pred_reg.dtype) + fg_num = fluid.layers.clip(fg_num, min=1.0, max=point_num) + fg_scale = float(point_num) / fg_num + + per_loc_bin_num = int(loc_scope / loc_bin_size) * 2 + loc_y_bin_num = int(loc_y_scope / loc_y_bin_size) * 2 + + reg_loss_dict = {} + + # xz localization loss + x_offset_label, y_offset_label, z_offset_label = reg_label[:, 0:1], reg_label[:, 1:2], reg_label[:, 2:3] + x_shift = fluid.layers.clip(x_offset_label + loc_scope, 0., loc_scope * 2 - 1e-3) + z_shift = fluid.layers.clip(z_offset_label + loc_scope, 0., loc_scope * 2 - 1e-3) + x_bin_label = fluid.layers.cast(x_shift / loc_bin_size, dtype='int64') + z_bin_label = fluid.layers.cast(z_shift / loc_bin_size, dtype='int64') + + x_bin_l, x_bin_r = 0, per_loc_bin_num + z_bin_l, z_bin_r = per_loc_bin_num, per_loc_bin_num * 2 + start_offset = z_bin_r + + loss_x_bin = fluid.layers.softmax_with_cross_entropy(pred_reg[:, x_bin_l: x_bin_r], x_bin_label) + loss_x_bin = fluid.layers.reduce_mean(loss_x_bin * fg_mask) * fg_scale + loss_z_bin = fluid.layers.softmax_with_cross_entropy(pred_reg[:, z_bin_l: z_bin_r], z_bin_label) + loss_z_bin = fluid.layers.reduce_mean(loss_z_bin * fg_mask) * fg_scale + reg_loss_dict['loss_x_bin'] = loss_x_bin + reg_loss_dict['loss_z_bin'] = loss_z_bin + loc_loss = loss_x_bin + loss_z_bin + + if get_xz_fine: + x_res_l, x_res_r = per_loc_bin_num * 2, per_loc_bin_num * 3 + z_res_l, z_res_r = per_loc_bin_num * 3, per_loc_bin_num * 4 + start_offset = z_res_r + + x_res_label = x_shift - (fluid.layers.cast(x_bin_label, dtype=x_shift.dtype) * loc_bin_size + loc_bin_size / 2.) + z_res_label = z_shift - (fluid.layers.cast(z_bin_label, dtype=z_shift.dtype) * loc_bin_size + loc_bin_size / 2.) + x_res_norm_label = x_res_label / loc_bin_size + z_res_norm_label = z_res_label / loc_bin_size + + x_bin_onehot = fluid.layers.one_hot(x_bin_label, depth=per_loc_bin_num) + z_bin_onehot = fluid.layers.one_hot(z_bin_label, depth=per_loc_bin_num) + + loss_x_res = fluid.layers.smooth_l1(fluid.layers.reduce_sum(pred_reg[:, x_res_l: x_res_r] * x_bin_onehot, dim=1, keep_dim=True), x_res_norm_label) + loss_x_res = fluid.layers.reduce_mean(loss_x_res * fg_mask) * fg_scale + loss_z_res = fluid.layers.smooth_l1(fluid.layers.reduce_sum(pred_reg[:, z_res_l: z_res_r] * z_bin_onehot, dim=1, keep_dim=True), z_res_norm_label) + loss_z_res = fluid.layers.reduce_mean(loss_z_res * fg_mask) * fg_scale + reg_loss_dict['loss_x_res'] = loss_x_res + reg_loss_dict['loss_z_res'] = loss_z_res + loc_loss += loss_x_res + loss_z_res + + # y localization loss + if get_y_by_bin: + y_bin_l, y_bin_r = start_offset, start_offset + loc_y_bin_num + y_res_l, y_res_r = y_bin_r, y_bin_r + loc_y_bin_num + start_offset = y_res_r + + y_shift = fluid.layers.clip(y_offset_label + loc_y_scope, 0., loc_y_scope * 2 - 1e-3) + y_bin_label = fluid.layers.cast(y_shift / loc_y_bin_size, dtype='int64') + y_res_label = y_shift - (fluid.layers.cast(y_bin_label, dtype=y_shift.dtype) * loc_y_bin_size + loc_y_bin_size / 2.) 
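+ # e.g. with the default loc_y_scope=0.5 and loc_y_bin_size=0.25, an offset
+ # of 0.3 is shifted to 0.8, falls into bin 3, and leaves a residual of
+ # 0.8 - (3 * 0.25 + 0.125) = -0.075 (normalized to -0.3 below)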
+ y_res_norm_label = y_res_label / loc_y_bin_size
+
+ y_bin_onehot = fluid.layers.one_hot(y_bin_label, depth=loc_y_bin_num)
+
+ loss_y_bin = fluid.layers.softmax_with_cross_entropy(pred_reg[:, y_bin_l: y_bin_r], y_bin_label)
+ loss_y_bin = fluid.layers.reduce_mean(loss_y_bin * fg_mask) * fg_scale
+ loss_y_res = fluid.layers.smooth_l1(fluid.layers.reduce_sum(pred_reg[:, y_res_l: y_res_r] * y_bin_onehot, dim=1, keep_dim=True), y_res_norm_label)
+ loss_y_res = fluid.layers.reduce_mean(loss_y_res * fg_mask) * fg_scale
+
+ reg_loss_dict['loss_y_bin'] = loss_y_bin
+ reg_loss_dict['loss_y_res'] = loss_y_res
+
+ loc_loss += loss_y_bin + loss_y_res
+ else:
+ y_offset_l, y_offset_r = start_offset, start_offset + 1
+ start_offset = y_offset_r
+
+ loss_y_offset = fluid.layers.smooth_l1(fluid.layers.reduce_sum(pred_reg[:, y_offset_l: y_offset_r], dim=1, keep_dim=True), y_offset_label)
+ loss_y_offset = fluid.layers.reduce_mean(loss_y_offset * fg_mask) * fg_scale
+ reg_loss_dict['loss_y_offset'] = loss_y_offset
+ loc_loss += loss_y_offset
+
+ # angle loss
+ ry_bin_l, ry_bin_r = start_offset, start_offset + num_head_bin
+ ry_res_l, ry_res_r = ry_bin_r, ry_bin_r + num_head_bin
+
+ ry_label = reg_label[:, 6:7]
+
+ if get_ry_fine:
+ # divide pi/2 into several bins
+ angle_per_class = (np.pi / 2) / num_head_bin
+
+ ry_label = ry_label % (2 * np.pi) # 0 ~ 2pi
+ opposite_flag = fluid.layers.logical_and(ry_label > np.pi * 0.5, ry_label < np.pi * 1.5)
+ opposite_flag = fluid.layers.cast(opposite_flag, dtype=ry_label.dtype)
+ shift_angle = (ry_label + opposite_flag * np.pi + np.pi * 0.5) % (2 * np.pi) # (0 ~ pi)
+ shift_angle.stop_gradient = True
+
+ shift_angle = fluid.layers.clip(shift_angle - np.pi * 0.25, min=1e-3, max=np.pi * 0.5 - 1e-3) # (0, pi/2)
+
+ # bin center is (5, 10, 15, ..., 85)
+ ry_bin_label = fluid.layers.cast(shift_angle / angle_per_class, dtype='int64')
+ ry_res_label = shift_angle - (fluid.layers.cast(ry_bin_label, dtype=shift_angle.dtype) * angle_per_class + angle_per_class / 2)
+ ry_res_norm_label = ry_res_label / (angle_per_class / 2)
+
+ else:
+ # divide 2pi into several bins
+ angle_per_class = (2 * np.pi) / num_head_bin
+ heading_angle = ry_label % (2 * np.pi) # 0 ~ 2pi
+
+ shift_angle = (heading_angle + angle_per_class / 2) % (2 * np.pi)
+ shift_angle.stop_gradient = True
+ ry_bin_label = fluid.layers.cast(shift_angle / angle_per_class, dtype='int64')
+ ry_res_label = shift_angle - (fluid.layers.cast(ry_bin_label, dtype=shift_angle.dtype) * angle_per_class + angle_per_class / 2)
+ ry_res_norm_label = ry_res_label / (angle_per_class / 2)
+
+ ry_bin_onehot = fluid.layers.one_hot(ry_bin_label, depth=num_head_bin)
+ loss_ry_bin = fluid.layers.softmax_with_cross_entropy(pred_reg[:, ry_bin_l:ry_bin_r], ry_bin_label)
+ loss_ry_bin = fluid.layers.reduce_mean(loss_ry_bin * fg_mask) * fg_scale
+ loss_ry_res = fluid.layers.smooth_l1(fluid.layers.reduce_sum(pred_reg[:, ry_res_l: ry_res_r] * ry_bin_onehot, dim=1, keep_dim=True), ry_res_norm_label)
+ loss_ry_res = fluid.layers.reduce_mean(loss_ry_res * fg_mask) * fg_scale
+
+ reg_loss_dict['loss_ry_bin'] = loss_ry_bin
+ reg_loss_dict['loss_ry_res'] = loss_ry_res
+ angle_loss = loss_ry_bin + loss_ry_res
+
+ # size loss
+ size_res_l, size_res_r = ry_res_r, ry_res_r + 3
+ assert pred_reg.shape[1] == size_res_r, '%d vs %d' % (pred_reg.shape[1], size_res_r)
+
+ anchor_size_var = fluid.layers.zeros(shape=[3], dtype=reg_label.dtype)
+ fluid.layers.assign(np.array(anchor_size).astype('float32'), anchor_size_var)
+ size_res_norm_label = (reg_label[:, 3:6] -
anchor_size_var) / anchor_size_var + size_res_norm_label = fluid.layers.reshape(size_res_norm_label, shape=[-1, 1], inplace=True) + size_res_norm = pred_reg[:, size_res_l:size_res_r] + size_res_norm = fluid.layers.reshape(size_res_norm, shape=[-1, 1], inplace=True) + size_loss = fluid.layers.smooth_l1(size_res_norm, size_res_norm_label) + size_loss = fluid.layers.reduce_mean(fluid.layers.reshape(size_loss, [-1, 3]) * fg_mask) * fg_scale + + # Total regression loss + reg_loss_dict['loss_loc'] = loc_loss + reg_loss_dict['loss_angle'] = angle_loss + reg_loss_dict['loss_size'] = size_loss + + return loc_loss, angle_loss, size_loss, reg_loss_dict + diff --git a/PaddleCV/Paddle3D/PointRCNN/models/point_rcnn.py b/PaddleCV/Paddle3D/PointRCNN/models/point_rcnn.py new file mode 100644 index 0000000000000000000000000000000000000000..890ef897405722f9cc1ba1d129bea2c80fce17a1 --- /dev/null +++ b/PaddleCV/Paddle3D/PointRCNN/models/point_rcnn.py @@ -0,0 +1,125 @@ +# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve. +# +#Licensed under the Apache License, Version 2.0 (the "License"); +#you may not use this file except in compliance with the License. +#You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +#Unless required by applicable law or agreed to in writing, software +#distributed under the License is distributed on an "AS IS" BASIS, +#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +#See the License for the specific language governing permissions and +#limitations under the License. +from __future__ import absolute_import +from __future__ import division +from __future__ import print_function + +import numpy as np +from collections import OrderedDict + +import paddle.fluid as fluid +from paddle.fluid.param_attr import ParamAttr +from paddle.fluid.initializer import Constant + +from models.rpn import RPN +from models.rcnn import RCNN + + +__all__ = ["PointRCNN"] + + +class PointRCNN(object): + def __init__(self, cfg, batch_size, use_xyz=True, mode='TRAIN', prog=None): + self.cfg = cfg + self.batch_size = batch_size + self.use_xyz = use_xyz + self.mode = mode + self.is_train = mode == 'TRAIN' + self.num_points = self.cfg.RPN.NUM_POINTS + self.prog = prog + self.inputs = None + self.pyreader = None + + def build_inputs(self): + self.inputs = OrderedDict() + + if self.cfg.RPN.ENABLED: + self.inputs['sample_id'] = fluid.layers.data(name='sample_id', shape=[1], dtype='int32') + self.inputs['pts_input'] = fluid.layers.data(name='pts_input', shape=[self.num_points, 3], dtype='float32') + self.inputs['pts_rect'] = fluid.layers.data(name='pts_rect', shape=[self.num_points, 3], dtype='float32') + self.inputs['pts_features'] = fluid.layers.data(name='pts_features', shape=[self.num_points, 1], dtype='float32') + self.inputs['rpn_cls_label'] = fluid.layers.data(name='rpn_cls_label', shape=[self.num_points], dtype='int32') + self.inputs['rpn_reg_label'] = fluid.layers.data(name='rpn_reg_label', shape=[self.num_points, 7], dtype='float32') + self.inputs['gt_boxes3d'] = fluid.layers.data(name='gt_boxes3d', shape=[7], lod_level=1, dtype='float32') + + if self.cfg.RCNN.ENABLED: + if self.cfg.RCNN.ROI_SAMPLE_JIT: + self.inputs['sample_id'] = fluid.layers.data(name='sample_id', shape=[1], dtype='int32', append_batch_size=False) + self.inputs['rpn_xyz'] = fluid.layers.data(name='rpn_xyz', shape=[self.num_points, 3], dtype='float32', append_batch_size=False) + self.inputs['rpn_features'] = fluid.layers.data(name='rpn_features', 
shape=[self.num_points,128], dtype='float32', append_batch_size=False) + self.inputs['rpn_intensity'] = fluid.layers.data(name='rpn_intensity', shape=[self.num_points], dtype='float32', append_batch_size=False) + self.inputs['seg_mask'] = fluid.layers.data(name='seg_mask', shape=[self.num_points], dtype='float32', append_batch_size=False) + self.inputs['roi_boxes3d'] = fluid.layers.data(name='roi_boxes3d', shape=[-1, -1, 7], dtype='float32', append_batch_size=False, lod_level=0) + self.inputs['pts_depth'] = fluid.layers.data(name='pts_depth', shape=[self.num_points], dtype='float32', append_batch_size=False) + self.inputs['gt_boxes3d'] = fluid.layers.data(name='gt_boxes3d', shape=[-1, -1, 7], dtype='float32', append_batch_size=False, lod_level=0) + else: + self.inputs['sample_id'] = fluid.layers.data(name='sample_id', shape=[-1], dtype='int32', append_batch_size=False) + self.inputs['pts_input'] = fluid.layers.data(name='pts_input', shape=[-1,512,133], dtype='float32', append_batch_size=False) + self.inputs['pts_feature'] = fluid.layers.data(name='pts_feature', shape=[-1,512,128], dtype='float32', append_batch_size=False) + self.inputs['roi_boxes3d'] = fluid.layers.data(name='roi_boxes3d', shape=[-1,7], dtype='float32', append_batch_size=False) + if self.is_train: + self.inputs['cls_label'] = fluid.layers.data(name='cls_label', shape=[-1], dtype='float32', append_batch_size=False) + self.inputs['reg_valid_mask'] = fluid.layers.data(name='reg_valid_mask', shape=[-1], dtype='float32', append_batch_size=False) + self.inputs['gt_boxes3d_ct'] = fluid.layers.data(name='gt_boxes3d_ct', shape=[-1,7], dtype='float32', append_batch_size=False) + self.inputs['gt_of_rois'] = fluid.layers.data(name='gt_of_rois', shape=[-1,7], dtype='float32', append_batch_size=False) + else: + self.inputs['roi_scores'] = fluid.layers.data(name='roi_scores', shape=[-1,], dtype='float32', append_batch_size=False) + self.inputs['gt_iou'] = fluid.layers.data(name='gt_iou', shape=[-1], dtype='float32', append_batch_size=False) + self.inputs['gt_boxes3d'] = fluid.layers.data(name='gt_boxes3d', shape=[-1,-1,7], dtype='float32', append_batch_size=False, lod_level=0) + + + self.pyreader = fluid.io.PyReader( + feed_list=list(self.inputs.values()), + capacity=64, + use_double_buffer=True, + iterable=False) + + def build(self): + self.build_inputs() + if self.cfg.RPN.ENABLED: + self.rpn = RPN(self.cfg, self.batch_size, self.use_xyz, + self.mode, self.prog) + self.rpn.build(self.inputs) + self.rpn_outputs = self.rpn.get_outputs() + self.outputs = self.rpn_outputs + + if self.cfg.RCNN.ENABLED: + self.rcnn = RCNN(self.cfg, 1, self.batch_size, self.mode) + self.rcnn.build_model(self.inputs) + self.outputs = self.rcnn.get_outputs() + + if self.mode == 'TRAIN': + if self.cfg.RPN.ENABLED: + self.outputs['rpn_loss'], self.outputs['rpn_loss_cls'], \ + self.outputs['rpn_loss_reg'] = self.rpn.get_loss() + if self.cfg.RCNN.ENABLED: + self.outputs['rcnn_loss'], self.outputs['rcnn_loss_cls'], \ + self.outputs['rcnn_loss_reg'] = self.rcnn.get_loss() + self.outputs['loss'] = self.outputs.get('rpn_loss', 0.) \ + + self.outputs.get('rcnn_loss', 0.) 
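+
+ # A minimal usage sketch (assuming a loaded `cfg` and a GPU place), matching
+ # how the train/eval scripts in this repo drive the class:
+ #   model = PointRCNN(cfg, batch_size=2, mode='TRAIN')
+ #   model.build()
+ #   pyreader = model.get_pyreader()  # feed via decorate_sample_list_generator
+ #   outputs = model.get_outputs()    # contains 'loss' in TRAIN mode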
+ + def get_feeds(self): + return list(self.inputs.keys()) + + def get_outputs(self): + return self.outputs + + def get_loss(self): + rpn_loss, _, _ = self.rpn.get_loss() + rcnn_loss, _, _ = self.rcnn.get_loss() + return rpn_loss + rcnn_loss + + def get_pyreader(self): + return self.pyreader + diff --git a/PaddleCV/Paddle3D/PointRCNN/models/pointnet2_modules.py b/PaddleCV/Paddle3D/PointRCNN/models/pointnet2_modules.py new file mode 100644 index 0000000000000000000000000000000000000000..43942fcf8110869dd066dfabe7716db055af05b6 --- /dev/null +++ b/PaddleCV/Paddle3D/PointRCNN/models/pointnet2_modules.py @@ -0,0 +1,197 @@ +# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve. +# +#Licensed under the Apache License, Version 2.0 (the "License"); +#you may not use this file except in compliance with the License. +#You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +#Unless required by applicable law or agreed to in writing, software +#distributed under the License is distributed on an "AS IS" BASIS, +#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +#See the License for the specific language governing permissions and +#limitations under the License. +""" +Contains PointNet++ utility functions. +""" + +from __future__ import absolute_import +from __future__ import division +from __future__ import print_function + +import numpy as np + +import paddle.fluid as fluid +from paddle.fluid.param_attr import ParamAttr +from paddle.fluid.initializer import Constant +from ext_op import * + +__all__ = ["conv_bn", "pointnet_sa_module", "pointnet_fp_module", "MLP"] + + +def query_and_group(xyz, new_xyz, radius, nsample, features=None, use_xyz=True): + """ + Perform query_ball and group_points + + Args: + xyz (Variable): xyz coordiantes features with shape [B, N, 3] + new_xyz (Variable): centriods features with shape [B, npoint, 3] + radius (float32): radius of ball + nsample (int32): maximum number of gather features + features (Variable): features with shape [B, N, C] + use_xyz (bool): whether use xyz coordiantes features + + Returns: + out (Variable): features with shape [B, npoint, nsample, C + 3] + """ + idx = query_ball(xyz, new_xyz, radius, nsample) + idx.stop_gradient = True + xyz = fluid.layers.transpose(xyz,perm=[0, 2, 1]) + grouped_xyz = group_points(xyz, idx) + expand_new_xyz = fluid.layers.unsqueeze(fluid.layers.transpose(new_xyz, perm=[0, 2, 1]), axes=[-1]) + expand_new_xyz = fluid.layers.expand(expand_new_xyz, [1, 1, 1, grouped_xyz.shape[3]]) + grouped_xyz -= expand_new_xyz + + if features is not None: + grouped_features = group_points(features, idx) + return fluid.layers.concat([grouped_xyz, grouped_features], axis=1) \ + if use_xyz else grouped_features + else: + assert use_xyz, "use_xyz should be True when features is None" + return grouped_xyz + + +def group_all(xyz, features=None, use_xyz=True): + """ + Group all xyz and features when npoint is None + See query_and_group + """ + xyz = fluid.layers.transpose(xyz,perm=[0, 2, 1]) + grouped_xyz = fluid.layers.unsqueeze(xyz, axes=[2]) + if features is not None: + grouped_features = fluid.layers.unsqueeze(features, axes=[2]) + return fluid.layers.concat([grouped_xyz, grouped_features], axis=1) if use_xyz else grouped_features + else: + return grouped_xyz + + +def conv_bn(input, out_channels, bn=True, bn_momentum=0.95, act='relu', name=None): + param_attr = ParamAttr(name='{}_conv_weight'.format(name),) + bias_attr = ParamAttr(name='{}_conv_bias'.format(name)) \ + if not bn 
else False + out = fluid.layers.conv2d(input, + num_filters=out_channels, + filter_size=1, + stride=1, + padding=0, + dilation=1, + param_attr=param_attr, + bias_attr=bias_attr, + act=act if not bn else None) + if bn: + bn_name = name + "_bn" + out = fluid.layers.batch_norm(out, + act=act, + momentum=bn_momentum, + param_attr=ParamAttr(name=bn_name + "_scale"), + bias_attr=ParamAttr(name=bn_name + "_offset"), + moving_mean_name=bn_name + '_mean', + moving_variance_name=bn_name + '_var') + + return out + + +def MLP(features, out_channels_list, bn=True, bn_momentum=0.95, act='relu', name=None): + out = features + for i, out_channels in enumerate(out_channels_list): + out = conv_bn(out, out_channels, bn=bn, act=act, bn_momentum=bn_momentum, name=name + "_{}".format(i)) + return out + + +def pointnet_sa_module(xyz, + npoint=None, + radiuss=[], + nsamples=[], + mlps=[], + feature=None, + bn=True, + bn_momentum=0.95, + use_xyz=True, + name=None): + """ + PointNet MSG(Multi-Scale Group) Set Abstraction Module. + Call with radiuss, nsamples, mlps as single element list for + SSG(Single-Scale Group). + + Args: + xyz (Variable): xyz coordiantes features with shape [B, N, 3] + radiuss ([float32]): list of radius of ball + nsamples ([int32]): list of maximum number of gather features + mlps ([[int32]]): list of out_channels_list + feature (Variable): features with shape [B, C, N] + bn (bool): whether perform batch norm after conv2d + bn_momentum (float): momentum of batch norm + use_xyz (bool): whether use xyz coordiantes features + + Returns: + new_xyz (Variable): centriods features with shape [B, npoint, 3] + out (Variable): features with shape [B, npoint, \sum_i{mlps[i][-1]}] + """ + assert len(radiuss) == len(nsamples) == len(mlps), \ + "radiuss, nsamples, mlps length should be same" + + farthest_idx = farthest_point_sampling(xyz, npoint) + farthest_idx.stop_gradient = True + new_xyz = gather_point(xyz, farthest_idx) if npoint is not None else None + + outs = [] + for i, (radius, nsample, mlp) in enumerate(zip(radiuss, nsamples, mlps)): + out = query_and_group(xyz, new_xyz, radius, nsample, feature, use_xyz) if npoint is not None else group_all(xyz, feature, use_xyz) + out = MLP(out, mlp, bn=bn, bn_momentum=bn_momentum, name=name + '_mlp{}'.format(i)) + out = fluid.layers.pool2d(out, pool_size=[1, out.shape[3]], pool_type='max') + out = fluid.layers.squeeze(out, axes=[-1]) + outs.append(out) + out = fluid.layers.concat(outs, axis=1) + + return (new_xyz, out) + + +def pointnet_fp_module(unknown, known, unknown_feats, known_feats, mlp, bn=True, bn_momentum=0.95, name=None): + """ + PointNet Feature Propagation Module + + Args: + unknown (Variable): unknown xyz coordiantes features with shape [B, N, 3] + known (Variable): known xyz coordiantes features with shape [B, M, 3] + unknown_feats (Variable): unknown features with shape [B, N, C1] to be propagated to + known_feats (Variable): known features with shape [B, M, C2] to be propagated from + mlp ([int32]): out_channels_list + bn (bool): whether perform batch norm after conv2d + + Returns: + new_features (Variable): new features with shape [B, N, mlp[-1]] + """ + if known is None: + raise NotImplementedError("Not implement known as None currently.") + else: + dist, idx = three_nn(unknown, known, eps=0.) 
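+ # three_nn returns squared distances and indices of the 3 nearest known
+ # points; take the sqrt, then interpolate features with normalized
+ # inverse-distance weights w_i = (1/d_i) / sum_j(1/d_j), so closer
+ # neighbors contribute more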
+ dist.stop_gradient = True + idx.stop_gradient = True + dist = fluid.layers.sqrt(dist) + ones = fluid.layers.fill_constant_batch_size_like(dist, dist.shape, dist.dtype, 1) + dist_recip = ones / (dist + 1e-8); # 1.0 / dist + norm = fluid.layers.reduce_sum(dist_recip, dim=-1, keep_dim=True) + weight = dist_recip / norm + weight.stop_gradient = True + interp_feats = three_interp(known_feats, weight, idx) + + new_features = interp_feats if unknown_feats is None else \ + fluid.layers.concat([interp_feats, unknown_feats], axis=-1) + new_features = fluid.layers.transpose(new_features, perm=[0, 2, 1]) + new_features = fluid.layers.unsqueeze(new_features, axes=[-1]) + new_features = MLP(new_features, mlp, bn=bn, bn_momentum=bn_momentum, name=name + '_mlp') + new_features = fluid.layers.squeeze(new_features, axes=[-1]) + new_features = fluid.layers.transpose(new_features, perm=[0, 2, 1]) + + return new_features + diff --git a/PaddleCV/Paddle3D/PointRCNN/models/pointnet2_msg.py b/PaddleCV/Paddle3D/PointRCNN/models/pointnet2_msg.py new file mode 100644 index 0000000000000000000000000000000000000000..b4d5f98c3b320663111cf9eceef4f2649f44007d --- /dev/null +++ b/PaddleCV/Paddle3D/PointRCNN/models/pointnet2_msg.py @@ -0,0 +1,78 @@ +# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve. +# +#Licensed under the Apache License, Version 2.0 (the "License"); +#you may not use this file except in compliance with the License. +#You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +#Unless required by applicable law or agreed to in writing, software +#distributed under the License is distributed on an "AS IS" BASIS, +#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +#See the License for the specific language governing permissions and +#limitations under the License. 
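+
+# NOTE: the MSG backbone below is an encoder-decoder: set-abstraction (SA)
+# layers configured by cfg.RPN.SA_CONFIG progressively downsample the point
+# cloud, then feature-propagation (FP) layers configured by cfg.RPN.FP_MLPS
+# propagate features back so every input point gets a feature vector.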
+""" +Contains PointNet++ SSG/MSG semantic segmentation models +""" + +from __future__ import absolute_import +from __future__ import division +from __future__ import print_function + +import numpy as np + +import paddle.fluid as fluid +from paddle.fluid.param_attr import ParamAttr +from paddle.fluid.initializer import Constant +from models.pointnet2_modules import * + +__all__ = ["PointNet2MSG"] + + +class PointNet2MSG(object): + def __init__(self, cfg, xyz, feature=None, use_xyz=True): + self.cfg = cfg + self.xyz = xyz + self.feature = feature + self.use_xyz = use_xyz + self.model_config() + + def model_config(self): + self.SA_confs = [] + for i in range(self.cfg.RPN.SA_CONFIG.NPOINTS.__len__()): + self.SA_confs.append({ + "npoint": self.cfg.RPN.SA_CONFIG.NPOINTS[i], + "radiuss": self.cfg.RPN.SA_CONFIG.RADIUS[i], + "nsamples": self.cfg.RPN.SA_CONFIG.NSAMPLE[i], + "mlps": self.cfg.RPN.SA_CONFIG.MLPS[i], + }) + + self.FP_confs = [] + for i in range(self.cfg.RPN.FP_MLPS.__len__()): + self.FP_confs.append({"mlp": self.cfg.RPN.FP_MLPS[i]}) + + def build(self, bn_momentum=0.95): + xyzs, features = [self.xyz], [self.feature] + xyzi, featurei = self.xyz, self.feature + for i, SA_conf in enumerate(self.SA_confs): + xyzi, featurei = pointnet_sa_module( + xyz=xyzi, + feature=featurei, + bn_momentum=bn_momentum, + use_xyz=self.use_xyz, + name="sa_{}".format(i), + **SA_conf) + xyzs.append(xyzi) + features.append(fluid.layers.transpose(featurei, perm=[0, 2, 1])) + for i in range(-1, -(len(self.FP_confs) + 1), -1): + features[i - 1] = pointnet_fp_module( + unknown=xyzs[i - 1], + known=xyzs[i], + unknown_feats=features[i - 1], + known_feats=features[i], + bn_momentum=bn_momentum, + name="fp_{}".format(i + len(self.FP_confs)), + **self.FP_confs[i]) + + return xyzs[0], features[0] + diff --git a/PaddleCV/Paddle3D/PointRCNN/models/rcnn.py b/PaddleCV/Paddle3D/PointRCNN/models/rcnn.py new file mode 100644 index 0000000000000000000000000000000000000000..11247eb48c505e4cb8dc8a466ed1abca20078dd8 --- /dev/null +++ b/PaddleCV/Paddle3D/PointRCNN/models/rcnn.py @@ -0,0 +1,302 @@ +# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve. +# +#Licensed under the Apache License, Version 2.0 (the "License"); +#you may not use this file except in compliance with the License. +#You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +#Unless required by applicable law or agreed to in writing, software +#distributed under the License is distributed on an "AS IS" BASIS, +#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +#See the License for the specific language governing permissions and +#limitations under the License. 
+ +from __future__ import absolute_import +from __future__ import division +from __future__ import print_function + +import numpy as np +import sys + +import paddle.fluid as fluid +from paddle.fluid.param_attr import ParamAttr +from paddle.fluid.initializer import Constant + +from models.pointnet2_modules import MLP, pointnet_sa_module, conv_bn +from models.loss_utils import sigmoid_focal_loss , get_reg_loss +from utils.proposal_target import get_proposal_target_func +from utils.cyops.kitti_utils import rotate_pc_along_y + +__all__ = ['RCNN'] + + +class RCNN(object): + def __init__(self, cfg, num_classes, batch_size, mode='TRAIN', use_xyz=True, input_channels=0): + self.cfg = cfg + self.use_xyz = use_xyz + self.num_classes = num_classes + self.input_channels = input_channels + self.inputs = None + self.training = mode == 'TRAIN' + self.batch_size = batch_size + + def create_tmp_var(self, name, dtype, shape): + return fluid.default_main_program().current_block().create_var( + name=name, dtype=dtype, shape=shape + ) + + def build_model(self, inputs): + self.inputs = inputs + if self.cfg.RCNN.ROI_SAMPLE_JIT: + if self.training: + proposal_target = get_proposal_target_func(self.cfg) + + tmp_list = [ + self.inputs['seg_mask'], + self.inputs['rpn_features'], + self.inputs['gt_boxes3d'], + self.inputs['rpn_xyz'], + self.inputs['pts_depth'], + self.inputs['roi_boxes3d'], + self.inputs['rpn_intensity'], + ] + out_name = ['reg_valid_mask' ,'sampled_pts' ,'roi_boxes3d', 'gt_of_rois', 'pts_feature' ,'cls_label','gt_iou'] + reg_valid_mask = self.create_tmp_var(name="reg_valid_mask",dtype='float32',shape=[-1,]) + sampled_pts = self.create_tmp_var(name="sampled_pts",dtype='float32',shape=[-1, self.cfg.RCNN.NUM_POINTS, 3]) + new_roi_boxes3d = self.create_tmp_var(name="new_roi_boxes3d",dtype='float32',shape=[-1, 7]) + gt_of_rois = self.create_tmp_var(name="gt_of_rois", dtype='float32', shape=[-1,7]) + pts_feature = self.create_tmp_var(name="pts_feature", dtype='float32',shape=[-1,512,130]) + cls_label = self.create_tmp_var(name="cls_label",dtype='int64',shape=[-1]) + gt_iou = self.create_tmp_var(name="gt_iou",dtype='float32',shape=[-1]) + + out_list = [reg_valid_mask, sampled_pts, new_roi_boxes3d, gt_of_rois, pts_feature, cls_label, gt_iou] + out = fluid.layers.py_func(func=proposal_target,x=tmp_list,out=out_list) + + self.target_dict = {} + for i,item in enumerate(out): + self.target_dict[out_name[i]] = item + + pts = fluid.layers.concat(input=[self.target_dict['sampled_pts'],self.target_dict['pts_feature']], axis=2) + self.debug = pts + self.target_dict['pts_input'] = pts + else: + rpn_xyz, rpn_features = inputs['rpn_xyz'], inputs['rpn_features'] + batch_rois = inputs['roi_boxes3d'] + rpn_intensity = inputs['rpn_intensity'] + rpn_intensity = fluid.layers.unsqueeze(rpn_intensity,axes=[2]) + seg_mask = fluid.layers.unsqueeze(inputs['seg_mask'],axes=[2]) + if self.cfg.RCNN.USE_INTENSITY: + pts_extra_input_list = [rpn_intensity, seg_mask] + else: + pts_extra_input_list = [seg_mask] + + if self.cfg.RCNN.USE_DEPTH: + pts_depth = inputs['pts_depth'] / 70.0 -0.5 + pts_depth = fluid.layers.unsqueeze(pts_depth,axes=[2]) + pts_extra_input_list.append(pts_depth) + pts_extra_input = fluid.layers.concat(pts_extra_input_list, axis=2) + pts_feature = fluid.layers.concat([pts_extra_input, rpn_features],axis=2) + + pooled_features, pooled_empty_flag = fluid.layers.roi_pool_3d(rpn_xyz,pts_feature,batch_rois, + self.cfg.RCNN.POOL_EXTRA_WIDTH, + sampled_pt_num=self.cfg.RCNN.NUM_POINTS) + # canonical transformation + 
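# translate the pooled points to each ROI's center, then rotate them around
+ # the y axis by the ROI heading ry (rotate_pc_along_y) so every box is
+ # axis-aligned at the origin and the regression targets become
+ # pose-invariant; roughly x' = cos(ry)*x - sin(ry)*z and
+ # z' = sin(ry)*x + cos(ry)*z, up to the sign convention of rotate_pc_along_y +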
batch_size = batch_rois.shape[0] + roi_center = batch_rois[:, :, 0:3] + tmp = pooled_features[:, :, :, 0:3] - fluid.layers.unsqueeze(roi_center,axes=[2]) + pooled_features = fluid.layers.concat(input=[tmp,pooled_features[:,:,:,3:]],axis=3) + concat_list = [] + for i in range(batch_size): + tmp = rotate_pc_along_y(pooled_features[i, :, :, 0:3], + batch_rois[i, :, 6]) + concat = fluid.layers.concat([tmp,pooled_features[i,:,:,3:]],axis=-1) + concat = fluid.layers.unsqueeze(concat,axes=[0]) + concat_list.append(concat) + pooled_features = fluid.layers.concat(concat_list,axis=0) + pts = fluid.layers.reshape(pooled_features,shape=[-1,pooled_features.shape[2],pooled_features.shape[3]]) + + else: + pts = inputs['pts_input'] + self.target_dict = {} + self.target_dict['pts_input'] = inputs['pts_input'] + self.target_dict['roi_boxes3d'] = inputs['roi_boxes3d'] + + if self.training: + self.target_dict['cls_label'] = inputs['cls_label'] + self.target_dict['reg_valid_mask'] = inputs['reg_valid_mask'] + self.target_dict['gt_of_rois'] = inputs['gt_boxes3d_ct'] + + xyz = pts[:,:,0:3] + feature = fluid.layers.transpose(pts[:,:,3:], [0,2,1]) if pts.shape[-1]>3 else None + if self.cfg.RCNN.USE_RPN_FEATURES: + self.rcnn_input_channel = 3 + int(self.cfg.RCNN.USE_INTENSITY) + \ + int(self.cfg.RCNN.USE_MASK) + int(self.cfg.RCNN.USE_DEPTH) + c_out = self.cfg.RCNN.XYZ_UP_LAYER[-1] + + xyz_input = pts[:,:,:self.rcnn_input_channel] + xyz_input = fluid.layers.transpose(xyz_input, [0,2,1]) + xyz_input = fluid.layers.unsqueeze(xyz_input, axes=[3]) + + rpn_feature = pts[:,:,self.rcnn_input_channel:] + rpn_feature = fluid.layers.transpose(rpn_feature, [0,2,1]) + rpn_feature = fluid.layers.unsqueeze(rpn_feature,axes=[3]) + + xyz_feature = MLP( + xyz_input, + out_channels_list=self.cfg.RCNN.XYZ_UP_LAYER, + bn=self.cfg.RCNN.USE_BN, + name="xyz_up_layer") + + merged_feature = fluid.layers.concat([xyz_feature, rpn_feature],axis=1) + merged_feature = MLP( + merged_feature, + out_channels_list=[c_out], + bn=self.cfg.RCNN.USE_BN, + name="xyz_down_layer") + + xyzs = [xyz] + features = [fluid.layers.squeeze(merged_feature,axes=[3])] + else: + xyzs = [xyz] + features = [feature] + + # forward + xyzi, featurei = xyzs[-1], features[-1] + for k in range(len(self.cfg.RCNN.SA_CONFIG.NPOINTS)): + mlps = self.cfg.RCNN.SA_CONFIG.MLPS[k] + npoint = self.cfg.RCNN.SA_CONFIG.NPOINTS[k] if self.cfg.RCNN.SA_CONFIG.NPOINTS[k] != -1 else None + + xyzi, featurei = pointnet_sa_module( + xyz=xyzi, + feature = featurei, + bn = self.cfg.RCNN.USE_BN, + use_xyz = self.use_xyz, + name = "sa_{}".format(k), + npoint = npoint, + mlps = [mlps], + radiuss = [self.cfg.RCNN.SA_CONFIG.RADIUS[k]], + nsamples = [self.cfg.RCNN.SA_CONFIG.NSAMPLE[k]] + ) + xyzs.append(xyzi) + features.append(featurei) + + head_in = features[-1] + head_in = fluid.layers.unsqueeze(head_in, axes=[2]) + + cls_out = head_in + reg_out = cls_out + + for i in range(0, self.cfg.RCNN.CLS_FC.__len__()): + cls_out = conv_bn(cls_out, self.cfg.RCNN.CLS_FC[i], bn=self.cfg.RCNN.USE_BN, name='rcnn_cls_{}'.format(i)) + if i == 0 and self.cfg.RCNN.DP_RATIO >= 0: + cls_out = fluid.layers.dropout(cls_out, self.cfg.RCNN.DP_RATIO, dropout_implementation="upscale_in_train") + cls_channel = 1 if self.num_classes == 2 else self.num_classes + cls_out = conv_bn(cls_out, cls_channel, act=None, name="cls_out", bn=self.cfg.RCNN.USE_BN) + self.cls_out = fluid.layers.squeeze(cls_out,axes=[1,3]) + + per_loc_bin_num = int(self.cfg.RCNN.LOC_SCOPE / self.cfg.RCNN.LOC_BIN_SIZE) * 2 + loc_y_bin_num = 
int(self.cfg.RCNN.LOC_Y_SCOPE / self.cfg.RCNN.LOC_Y_BIN_SIZE) * 2 + reg_channel = per_loc_bin_num * 4 + self.cfg.RCNN.NUM_HEAD_BIN * 2 + 3 + reg_channel += (1 if not self.cfg.RCNN.LOC_Y_BY_BIN else loc_y_bin_num * 2) + for i in range(0, self.cfg.RCNN.REG_FC.__len__()): + reg_out = conv_bn(reg_out, self.cfg.RCNN.REG_FC[i], bn=self.cfg.RCNN.USE_BN, name='rcnn_reg_{}'.format(i)) + if i == 0 and self.cfg.RCNN.DP_RATIO >= 0: + reg_out = fluid.layers.dropout(reg_out, self.cfg.RCNN.DP_RATIO, dropout_implementation="upscale_in_train") + + reg_out = conv_bn(reg_out, reg_channel, act=None, name="reg_out", bn=self.cfg.RCNN.USE_BN) + self.reg_out = fluid.layers.squeeze(reg_out, axes=[2,3]) + + + self.outputs = { + 'rcnn_cls':self.cls_out, + 'rcnn_reg':self.reg_out, + } + if self.training: + self.outputs.update(self.target_dict) + elif not self.training: + self.outputs['sample_id'] = inputs['sample_id'] + self.outputs['pts_input'] = inputs['pts_input'] + self.outputs['roi_boxes3d'] = inputs['roi_boxes3d'] + self.outputs['roi_scores'] = inputs['roi_scores'] + self.outputs['gt_iou'] = inputs['gt_iou'] + self.outputs['gt_boxes3d'] = inputs['gt_boxes3d'] + + if self.cls_out.shape[1] == 1: + raw_scores = fluid.layers.reshape(self.cls_out, shape=[-1]) + norm_scores = fluid.layers.sigmoid(raw_scores) + else: + norm_scores = fluid.layers.softmax(self.cls_out, axis=1) + self.outputs['norm_scores'] = norm_scores + + def get_outputs(self): + return self.outputs + + def get_loss(self): + assert self.inputs is not None, \ + "please call build() first" + rcnn_cls_label = self.outputs['cls_label'] + reg_valid_mask = self.outputs['reg_valid_mask'] + roi_boxes3d = self.outputs['roi_boxes3d'] + roi_size = roi_boxes3d[:, 3:6] + gt_boxes3d_ct = self.outputs['gt_of_rois'] + pts_input = self.outputs['pts_input'] + + rcnn_cls = self.cls_out + rcnn_reg = self.reg_out + + # RCNN classification loss + assert self.cfg.RCNN.LOSS_CLS in ["SigmoidFocalLoss", "BinaryCrossEntropy"], \ + "unsupported RCNN cls loss type {}".format(self.cfg.RCNN.LOSS_CLS) + + if self.cfg.RCNN.LOSS_CLS == "SigmoidFocalLoss": + cls_flat = fluid.layers.reshape(self.cls_out, shape=[-1]) + cls_label_flat = fluid.layers.reshape(rcnn_cls_label, shape=[-1]) + cls_label_flat = fluid.layers.cast(cls_label_flat, dtype=cls_flat.dtype) + cls_target = fluid.layers.cast(cls_label_flat>0, dtype=cls_flat.dtype) + cls_label_flat.stop_gradient = True + pos = fluid.layers.cast(cls_label_flat > 0, dtype=cls_flat.dtype) + pos.stop_gradient = True + pos_normalizer = fluid.layers.reduce_sum(pos) + cls_weights = fluid.layers.cast(cls_label_flat >= 0, dtype=cls_flat.dtype) + cls_weights = cls_weights / fluid.layers.clip(pos_normalizer, min=1.0, max=1e10) + cls_weights.stop_gradient = True + rcnn_loss_cls = sigmoid_focal_loss(cls_flat, cls_target, cls_weights) + rcnn_loss_cls = fluid.layers.reduce_sum(rcnn_loss_cls) + else: # BinaryCrossEntropy + cls_label = fluid.layers.reshape(rcnn_cls_label, shape=self.cls_out.shape) + cls_valid_mask = fluid.layers.cast(cls_label >= 0, dtype=self.cls_out.dtype) + cls_label = fluid.layers.cast(cls_label, dtype=self.cls_out.dtype) + cls_label.stop_gradient = True + rcnn_loss_cls = fluid.layers.sigmoid_cross_entropy_with_logits(self.cls_out, cls_label) + cls_mask_normalzer = fluid.layers.reduce_sum(cls_valid_mask) + rcnn_loss_cls = fluid.layers.reduce_sum(rcnn_loss_cls * cls_valid_mask) \ + / fluid.layers.clip(cls_mask_normalzer, min=1.0, max=1e10) + + # RCNN regression loss + reg_out = self.reg_out + fg_mask = fluid.layers.cast(reg_valid_mask > 
0, dtype=reg_out.dtype) + fg_mask.stop_gradient = True + gt_boxes3d_ct = fluid.layers.reshape(gt_boxes3d_ct, [-1,7]) + all_anchor_size = roi_size + anchor_size = all_anchor_size[fg_mask] if self.cfg.RCNN.SIZE_RES_ON_ROI else self.cfg.CLS_MEAN_SIZE[0] + + loc_loss, angle_loss, size_loss, loss_dict = get_reg_loss( + reg_out * fg_mask, + gt_boxes3d_ct, + fg_mask, + point_num=float(self.batch_size*64), + loc_scope=self.cfg.RCNN.LOC_SCOPE, + loc_bin_size=self.cfg.RCNN.LOC_BIN_SIZE, + num_head_bin=self.cfg.RCNN.NUM_HEAD_BIN, + anchor_size=anchor_size, + get_xz_fine=True, + get_y_by_bin=self.cfg.RCNN.LOC_Y_BY_BIN, + loc_y_scope=self.cfg.RCNN.LOC_Y_SCOPE, + loc_y_bin_size=self.cfg.RCNN.LOC_Y_BIN_SIZE, + get_ry_fine=True + ) + rcnn_loss_reg = loc_loss + angle_loss + size_loss * 3 + rcnn_loss = rcnn_loss_cls + rcnn_loss_reg + return rcnn_loss, rcnn_loss_cls, rcnn_loss_reg + diff --git a/PaddleCV/Paddle3D/PointRCNN/models/rpn.py b/PaddleCV/Paddle3D/PointRCNN/models/rpn.py new file mode 100644 index 0000000000000000000000000000000000000000..30f0e34551a01a065f0784650256399506588639 --- /dev/null +++ b/PaddleCV/Paddle3D/PointRCNN/models/rpn.py @@ -0,0 +1,167 @@ +# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve. +# +#Licensed under the Apache License, Version 2.0 (the "License"); +#you may not use this file except in compliance with the License. +#You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +#Unless required by applicable law or agreed to in writing, software +#distributed under the License is distributed on an "AS IS" BASIS, +#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +#See the License for the specific language governing permissions and +#limitations under the License. + +from __future__ import absolute_import +from __future__ import division +from __future__ import print_function + +import numpy as np + +import paddle.fluid as fluid +from paddle.fluid.param_attr import ParamAttr +from paddle.fluid.initializer import Normal, Constant + +from utils.proposal_utils import get_proposal_func +from models.pointnet2_msg import PointNet2MSG +from models.pointnet2_modules import conv_bn +from models.loss_utils import sigmoid_focal_loss, get_reg_loss + +__all__ = ["RPN"] + + +class RPN(object): + def __init__(self, cfg, batch_size, use_xyz=True, mode='TRAIN', prog=None): + self.cfg = cfg + self.batch_size = batch_size + self.use_xyz = use_xyz + self.mode = mode + self.is_train = mode == 'TRAIN' + self.inputs = None + self.prog = fluid.default_main_program() if prog is None else prog + + def build(self, inputs): + assert self.cfg.RPN.BACKBONE == 'pointnet2_msg', \ + "RPN backbone only support pointnet2_msg" + self.inputs = inputs + self.outputs = {} + + xyz = inputs["pts_input"] + assert not self.cfg.RPN.USE_INTENSITY, \ + "RPN.USE_INTENSITY not support now" + feature = None + msg = PointNet2MSG(self.cfg, xyz, feature, self.use_xyz) + backbone_xyz, backbone_feature = msg.build() + self.outputs['backbone_xyz'] = backbone_xyz + self.outputs['backbone_feature'] = backbone_feature + + backbone_feature = fluid.layers.transpose(backbone_feature, perm=[0, 2, 1]) + cls_out = fluid.layers.unsqueeze(backbone_feature, axes=[-1]) + reg_out = cls_out + + # classification branch + for i in range(self.cfg.RPN.CLS_FC.__len__()): + cls_out = conv_bn(cls_out, self.cfg.RPN.CLS_FC[i], bn=self.cfg.RPN.USE_BN, name='rpn_cls_{}'.format(i)) + if i == 0 and self.cfg.RPN.DP_RATIO > 0: + cls_out = fluid.layers.dropout(cls_out, self.cfg.RPN.DP_RATIO, 
dropout_implementation="upscale_in_train") + cls_out = fluid.layers.conv2d(cls_out, + num_filters=1, + filter_size=1, + stride=1, + padding=0, + dilation=1, + param_attr=ParamAttr(name='rpn_cls_out_conv_weight'), + bias_attr=ParamAttr(name='rpn_cls_out_conv_bias', + initializer=Constant(-np.log(99)))) + cls_out = fluid.layers.squeeze(cls_out, axes=[1, 3]) + self.outputs['rpn_cls'] = cls_out + + # regression branch + per_loc_bin_num = int(self.cfg.RPN.LOC_SCOPE / self.cfg.RPN.LOC_BIN_SIZE) * 2 + if self.cfg.RPN.LOC_XZ_FINE: + reg_channel = per_loc_bin_num * 4 + self.cfg.RPN.NUM_HEAD_BIN * 2 + 3 + else: + reg_channel = per_loc_bin_num * 2 + self.cfg.RPN.NUM_HEAD_BIN * 2 + 3 + reg_channel += 1 # reg y + + for i in range(self.cfg.RPN.REG_FC.__len__()): + reg_out = conv_bn(reg_out, self.cfg.RPN.REG_FC[i], bn=self.cfg.RPN.USE_BN, name='rpn_reg_{}'.format(i)) + if i == 0 and self.cfg.RPN.DP_RATIO > 0: + reg_out = fluid.layers.dropout(reg_out, self.cfg.RPN.DP_RATIO, dropout_implementation="upscale_in_train") + reg_out = fluid.layers.conv2d(reg_out, + num_filters=reg_channel, + filter_size=1, + stride=1, + padding=0, + dilation=1, + param_attr=ParamAttr(name='rpn_reg_out_conv_weight', + initializer=Normal(0., 0.001),), + bias_attr=ParamAttr(name='rpn_reg_out_conv_bias')) + reg_out = fluid.layers.squeeze(reg_out, axes=[3]) + reg_out = fluid.layers.transpose(reg_out, [0, 2, 1]) + self.outputs['rpn_reg'] = reg_out + + if self.mode != 'TRAIN' or self.cfg.RCNN.ENABLED: + rpn_scores_row = cls_out + rpn_scores_norm = fluid.layers.sigmoid(rpn_scores_row) + seg_mask = fluid.layers.cast(rpn_scores_norm > self.cfg.RPN.SCORE_THRESH, dtype='float32') + pts_depth = fluid.layers.sqrt(fluid.layers.reduce_sum(backbone_xyz * backbone_xyz, dim=2)) + proposal_func = get_proposal_func(self.cfg, self.mode) + proposal_input = fluid.layers.concat([fluid.layers.unsqueeze(rpn_scores_row, axes=[-1]), + backbone_xyz, reg_out], axis=-1) + proposal = self.prog.current_block().create_var(name='proposal', + shape=[-1, proposal_input.shape[1], 8], + dtype='float32') + fluid.layers.py_func(proposal_func, proposal_input, proposal) + rois, roi_scores_row = proposal[:, :, :7], proposal[:, :, -1] + self.outputs['rois'] = rois + self.outputs['roi_scores_row'] = roi_scores_row + self.outputs['seg_mask'] = seg_mask + self.outputs['pts_depth'] = pts_depth + + def get_outputs(self): + return self.outputs + + def get_loss(self): + assert self.inputs is not None, \ + "please call build() first" + rpn_cls_label = self.inputs['rpn_cls_label'] + rpn_reg_label = self.inputs['rpn_reg_label'] + rpn_cls = self.outputs['rpn_cls'] + rpn_reg = self.outputs['rpn_reg'] + + # RPN classification loss + assert self.cfg.RPN.LOSS_CLS == "SigmoidFocalLoss", \ + "unsupported RPN cls loss type {}".format(self.cfg.RPN.LOSS_CLS) + cls_flat = fluid.layers.reshape(rpn_cls, shape=[-1]) + cls_label_flat = fluid.layers.reshape(rpn_cls_label, shape=[-1]) + cls_label_pos = fluid.layers.cast(cls_label_flat > 0, dtype=cls_flat.dtype) + pos_normalizer = fluid.layers.reduce_sum(cls_label_pos) + cls_weights = fluid.layers.cast(cls_label_flat >= 0, dtype=cls_flat.dtype) + cls_weights = cls_weights / fluid.layers.clip(pos_normalizer, min=1.0, max=1e10) + cls_weights.stop_gradient = True + cls_label_flat = fluid.layers.cast(cls_label_flat, dtype=cls_flat.dtype) + cls_label_flat.stop_gradient = True + rpn_loss_cls = sigmoid_focal_loss(cls_flat, cls_label_pos, cls_weights) + rpn_loss_cls = fluid.layers.reduce_sum(rpn_loss_cls) + + # RPN regression loss + rpn_reg = 
fluid.layers.reshape(rpn_reg, [-1, rpn_reg.shape[-1]])
+        reg_label = fluid.layers.reshape(rpn_reg_label, [-1, rpn_reg_label.shape[-1]])
+        fg_mask = fluid.layers.cast(cls_label_flat > 0, dtype=rpn_reg.dtype)
+        fg_mask.stop_gradient = True
+        loc_loss, angle_loss, size_loss, loss_dict = get_reg_loss(
+            rpn_reg * fg_mask, reg_label, fg_mask,
+            float(self.batch_size * self.cfg.RPN.NUM_POINTS),
+            loc_scope=self.cfg.RPN.LOC_SCOPE,
+            loc_bin_size=self.cfg.RPN.LOC_BIN_SIZE,
+            num_head_bin=self.cfg.RPN.NUM_HEAD_BIN,
+            anchor_size=self.cfg.CLS_MEAN_SIZE[0],
+            get_xz_fine=self.cfg.RPN.LOC_XZ_FINE,
+            get_y_by_bin=False,
+            get_ry_fine=False)
+        rpn_loss_reg = loc_loss + angle_loss + size_loss * 3
+
+        self.rpn_loss = rpn_loss_cls * self.cfg.RPN.LOSS_WEIGHT[0] + rpn_loss_reg * self.cfg.RPN.LOSS_WEIGHT[1]
+        return self.rpn_loss, rpn_loss_cls, rpn_loss_reg
+
diff --git a/PaddleCV/Paddle3D/PointRCNN/requirement.txt b/PaddleCV/Paddle3D/PointRCNN/requirement.txt
new file mode 100644
index 0000000000000000000000000000000000000000..6ff347ab06c588b507fd6b5f1442e2375afb032a
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/requirement.txt
@@ -0,0 +1,6 @@
+Cython
+opencv-python
+shapely
+scikit-image
+Numba
+fire
diff --git a/PaddleCV/Paddle3D/PointRCNN/tools/generate_aug_scene.py b/PaddleCV/Paddle3D/PointRCNN/tools/generate_aug_scene.py
new file mode 100644
index 0000000000000000000000000000000000000000..59cfa4abc0629c71d150f750e8f32400c6c361b9
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/tools/generate_aug_scene.py
@@ -0,0 +1,330 @@
+"""
+Generate augmented training scenes from the GT sample database
+This code is based on https://github.com/sshaoshuai/PointRCNN/blob/master/tools/generate_aug_scene.py
+"""
+
+import os
+import numpy as np
+import pickle
+
+import pts_utils
+import utils.cyops.kitti_utils as kitti_utils
+from utils.box_utils import boxes_iou3d
+from utils import calibration as calib
+from data.kitti_dataset import KittiDataset
+import argparse
+
+np.random.seed(1024)
+
+parser = argparse.ArgumentParser()
+parser.add_argument('--mode', type=str, default='generator')
+parser.add_argument('--class_name', type=str, default='Car')
+parser.add_argument('--data_dir', type=str, default='./data')
+parser.add_argument('--save_dir', type=str, default='./data/KITTI/aug_scene/training')
+parser.add_argument('--split', type=str, default='train')
+parser.add_argument('--gt_database_dir', type=str, default='./data/gt_database/train_gt_database_3level_Car.pkl')
+parser.add_argument('--include_similar', action='store_true', default=False)
+parser.add_argument('--aug_times', type=int, default=4)
+args = parser.parse_args()
+
+PC_REDUCE_BY_RANGE = True
+if args.class_name == 'Car':
+    PC_AREA_SCOPE = np.array([[-40, 40], [-1, 3], [0, 70.4]])  # x, y, z scope in rect camera coords
+else:
+    PC_AREA_SCOPE = np.array([[-30, 30], [-1, 3], [0, 50]])
+
+
+def log_print(info, fp=None):
+    print(info)
+    if fp is not None:
+        # print(info, file=fp)
+        fp.write(info + "\n")
+
+
+def save_kitti_format(calib, bbox3d, obj_list, img_shape, save_fp):
+    corners3d = kitti_utils.boxes3d_to_corners3d(bbox3d)
+    img_boxes, _ = calib.corners3d_to_img_boxes(corners3d)
+
+    img_boxes[:, 0] = np.clip(img_boxes[:, 0], 0, img_shape[1] - 1)
+    img_boxes[:, 1] = np.clip(img_boxes[:, 1], 0, img_shape[0] - 1)
+    img_boxes[:, 2] = np.clip(img_boxes[:, 2], 0, img_shape[1] - 1)
+    img_boxes[:, 3] = np.clip(img_boxes[:, 3], 0, img_shape[0] - 1)
+
+    # Discard boxes that are larger than 80% of the image width OR height
+    img_boxes_w = img_boxes[:, 2] - img_boxes[:, 0]
+    img_boxes_h = img_boxes[:, 3] -
img_boxes[:, 1] + box_valid_mask = np.logical_and(img_boxes_w < img_shape[1] * 0.8, img_boxes_h < img_shape[0] * 0.8) + + for k in range(bbox3d.shape[0]): + if box_valid_mask[k] == 0: + continue + x, z, ry = bbox3d[k, 0], bbox3d[k, 2], bbox3d[k, 6] + beta = np.arctan2(z, x) + alpha = -np.sign(beta) * np.pi / 2 + beta + ry + + save_fp.write('%s %.2f %d %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f\n' % + (args.class_name, obj_list[k].trucation, int(obj_list[k].occlusion), alpha, img_boxes[k, 0], img_boxes[k, 1], + img_boxes[k, 2], img_boxes[k, 3], + bbox3d[k, 3], bbox3d[k, 4], bbox3d[k, 5], bbox3d[k, 0], bbox3d[k, 1], bbox3d[k, 2], + bbox3d[k, 6])) + + +class AugSceneGenerator(KittiDataset): + def __init__(self, root_dir, gt_database=None, split='train', classes=args.class_name): + super(AugSceneGenerator, self).__init__(root_dir, split=split) + self.gt_database = None + if classes == 'Car': + self.classes = ('Background', 'Car') + elif classes == 'People': + self.classes = ('Background', 'Pedestrian', 'Cyclist') + elif classes == 'Pedestrian': + self.classes = ('Background', 'Pedestrian') + elif classes == 'Cyclist': + self.classes = ('Background', 'Cyclist') + else: + assert False, "Invalid classes: %s" % classes + + self.gt_database = gt_database + + def __len__(self): + raise NotImplementedError + + def __getitem__(self, item): + raise NotImplementedError + + def filtrate_dc_objects(self, obj_list): + valid_obj_list = [] + for obj in obj_list: + if obj.cls_type in ['DontCare']: + continue + valid_obj_list.append(obj) + + return valid_obj_list + + def filtrate_objects(self, obj_list): + valid_obj_list = [] + type_whitelist = self.classes + if args.include_similar: + type_whitelist = list(self.classes) + if 'Car' in self.classes: + type_whitelist.append('Van') + if 'Pedestrian' in self.classes or 'Cyclist' in self.classes: + type_whitelist.append('Person_sitting') + + for obj in obj_list: + if obj.cls_type in type_whitelist: + valid_obj_list.append(obj) + return valid_obj_list + + @staticmethod + def get_valid_flag(pts_rect, pts_img, pts_rect_depth, img_shape): + """ + Valid point should be in the image (and in the PC_AREA_SCOPE) + :param pts_rect: + :param pts_img: + :param pts_rect_depth: + :param img_shape: + :return: + """ + val_flag_1 = np.logical_and(pts_img[:, 0] >= 0, pts_img[:, 0] < img_shape[1]) + val_flag_2 = np.logical_and(pts_img[:, 1] >= 0, pts_img[:, 1] < img_shape[0]) + val_flag_merge = np.logical_and(val_flag_1, val_flag_2) + pts_valid_flag = np.logical_and(val_flag_merge, pts_rect_depth >= 0) + + if PC_REDUCE_BY_RANGE: + x_range, y_range, z_range = PC_AREA_SCOPE + pts_x, pts_y, pts_z = pts_rect[:, 0], pts_rect[:, 1], pts_rect[:, 2] + range_flag = (pts_x >= x_range[0]) & (pts_x <= x_range[1]) \ + & (pts_y >= y_range[0]) & (pts_y <= y_range[1]) \ + & (pts_z >= z_range[0]) & (pts_z <= z_range[1]) + pts_valid_flag = pts_valid_flag & range_flag + return pts_valid_flag + + @staticmethod + def check_pc_range(xyz): + """ + :param xyz: [x, y, z] + :return: + """ + x_range, y_range, z_range = PC_AREA_SCOPE + if (x_range[0] <= xyz[0] <= x_range[1]) and (y_range[0] <= xyz[1] <= y_range[1]) and \ + (z_range[0] <= xyz[2] <= z_range[1]): + return True + return False + + def aug_one_scene(self, sample_id, pts_rect, pts_intensity, all_gt_boxes3d): + """ + :param pts_rect: (N, 3) + :param gt_boxes3d: (M1, 7) + :param all_gt_boxex3d: (M2, 7) + :return: + """ + assert self.gt_database is not None + extra_gt_num = np.random.randint(10, 15) + try_times = 50 + cnt = 0 + 
cur_gt_boxes3d = all_gt_boxes3d.copy() + cur_gt_boxes3d[:, 4] += 0.5 + cur_gt_boxes3d[:, 5] += 0.5 # enlarge new added box to avoid too nearby boxes + + extra_gt_obj_list = [] + extra_gt_boxes3d_list = [] + new_pts_list, new_pts_intensity_list = [], [] + src_pts_flag = np.ones(pts_rect.shape[0], dtype=np.int32) + + road_plane = self.get_road_plane(sample_id) + a, b, c, d = road_plane + + while try_times > 0: + try_times -= 1 + + rand_idx = np.random.randint(0, self.gt_database.__len__() - 1) + + new_gt_dict = self.gt_database[rand_idx] + new_gt_box3d = new_gt_dict['gt_box3d'].copy() + new_gt_points = new_gt_dict['points'].copy() + new_gt_intensity = new_gt_dict['intensity'].copy() + new_gt_obj = new_gt_dict['obj'] + center = new_gt_box3d[0:3] + if PC_REDUCE_BY_RANGE and (self.check_pc_range(center) is False): + continue + if cnt > extra_gt_num: + break + if new_gt_points.__len__() < 5: # too few points + continue + + # put it on the road plane + cur_height = (-d - a * center[0] - c * center[2]) / b + move_height = new_gt_box3d[1] - cur_height + new_gt_box3d[1] -= move_height + new_gt_points[:, 1] -= move_height + + cnt += 1 + + iou3d = boxes_iou3d(new_gt_box3d.reshape(1, 7), cur_gt_boxes3d) + + valid_flag = iou3d.max() < 1e-8 + if not valid_flag: + continue + + enlarged_box3d = new_gt_box3d.copy() + enlarged_box3d[3] += 2 # remove the points above and below the object + boxes_pts_mask_list = pts_utils.pts_in_boxes3d(pts_rect, enlarged_box3d.reshape(1, 7)) + pt_mask_flag = (boxes_pts_mask_list[0] == 1) + src_pts_flag[pt_mask_flag] = 0 # remove the original points which are inside the new box + + new_pts_list.append(new_gt_points) + new_pts_intensity_list.append(new_gt_intensity) + enlarged_box3d = new_gt_box3d.copy() + enlarged_box3d[4] += 0.5 + enlarged_box3d[5] += 0.5 # enlarge new added box to avoid too nearby boxes + cur_gt_boxes3d = np.concatenate((cur_gt_boxes3d, enlarged_box3d.reshape(1, 7)), axis=0) + extra_gt_boxes3d_list.append(new_gt_box3d.reshape(1, 7)) + extra_gt_obj_list.append(new_gt_obj) + + if new_pts_list.__len__() == 0: + return False, pts_rect, pts_intensity, None, None + + extra_gt_boxes3d = np.concatenate(extra_gt_boxes3d_list, axis=0) + # remove original points and add new points + pts_rect = pts_rect[src_pts_flag == 1] + pts_intensity = pts_intensity[src_pts_flag == 1] + new_pts_rect = np.concatenate(new_pts_list, axis=0) + new_pts_intensity = np.concatenate(new_pts_intensity_list, axis=0) + pts_rect = np.concatenate((pts_rect, new_pts_rect), axis=0) + pts_intensity = np.concatenate((pts_intensity, new_pts_intensity), axis=0) + + return True, pts_rect, pts_intensity, extra_gt_boxes3d, extra_gt_obj_list + + def aug_one_epoch_scene(self, base_id, data_save_dir, label_save_dir, split_list, log_fp=None): + for idx, sample_id in enumerate(self.image_idx_list): + sample_id = int(sample_id) + print('process gt sample (%s, id=%06d)' % (args.split, sample_id)) + + pts_lidar = self.get_lidar(sample_id) + calib = self.get_calib(sample_id) + pts_rect = calib.lidar_to_rect(pts_lidar[:, 0:3]) + pts_img, pts_rect_depth = calib.rect_to_img(pts_rect) + img_shape = self.get_image_shape(sample_id) + + pts_valid_flag = self.get_valid_flag(pts_rect, pts_img, pts_rect_depth, img_shape) + pts_rect = pts_rect[pts_valid_flag][:, 0:3] + pts_intensity = pts_lidar[pts_valid_flag][:, 3] + + # all labels for checking overlapping + all_obj_list = self.filtrate_dc_objects(self.get_label(sample_id)) + all_gt_boxes3d = np.zeros((all_obj_list.__len__(), 7), dtype=np.float32) + for k, obj in 
enumerate(all_obj_list): + all_gt_boxes3d[k, 0:3], all_gt_boxes3d[k, 3], all_gt_boxes3d[k, 4], all_gt_boxes3d[k, 5], \ + all_gt_boxes3d[k, 6] = obj.pos, obj.h, obj.w, obj.l, obj.ry + + # gt_boxes3d of current label + obj_list = self.filtrate_objects(self.get_label(sample_id)) + if args.class_name != 'Car' and obj_list.__len__() == 0: + continue + + # augment one scene + aug_flag, pts_rect, pts_intensity, extra_gt_boxes3d, extra_gt_obj_list = \ + self.aug_one_scene(sample_id, pts_rect, pts_intensity, all_gt_boxes3d) + + # save augment result to file + pts_info = np.concatenate((pts_rect, pts_intensity.reshape(-1, 1)), axis=1) + bin_file = os.path.join(data_save_dir, '%06d.bin' % (base_id + sample_id)) + pts_info.astype(np.float32).tofile(bin_file) + + # save filtered original gt_boxes3d + label_save_file = os.path.join(label_save_dir, '%06d.txt' % (base_id + sample_id)) + with open(label_save_file, 'w') as f: + for obj in obj_list: + f.write(obj.to_kitti_format() + '\n') + + if aug_flag: + # augment successfully + save_kitti_format(calib, extra_gt_boxes3d, extra_gt_obj_list, img_shape=img_shape, save_fp=f) + else: + extra_gt_boxes3d = np.zeros((0, 7), dtype=np.float32) + log_print('Save to file (new_obj: %s): %s' % (extra_gt_boxes3d.__len__(), label_save_file), fp=log_fp) + split_list.append('%06d' % (base_id + sample_id)) + + def generate_aug_scene(self, aug_times, log_fp=None): + data_save_dir = os.path.join(args.save_dir, 'rectified_data') + label_save_dir = os.path.join(args.save_dir, 'aug_label') + if not os.path.isdir(data_save_dir): + os.makedirs(data_save_dir) + if not os.path.isdir(label_save_dir): + os.makedirs(label_save_dir) + + split_file = os.path.join(args.save_dir, '%s_aug.txt' % args.split) + split_list = self.image_idx_list[:] + for epoch in range(aug_times): + base_id = (epoch + 1) * 10000 + self.aug_one_epoch_scene(base_id, data_save_dir, label_save_dir, split_list, log_fp=log_fp) + + with open(split_file, 'w') as f: + for idx, sample_id in enumerate(split_list): + f.write(str(sample_id) + '\n') + log_print('Save split file to %s' % split_file, fp=log_fp) + target_dir = os.path.join(args.data_dir, 'KITTI/ImageSets/') + os.system('cp %s %s' % (split_file, target_dir)) + log_print('Copy split file from %s to %s' % (split_file, target_dir), fp=log_fp) + + +if __name__ == '__main__': + if not os.path.isdir(args.save_dir): + os.makedirs(args.save_dir) + info_file = os.path.join(args.save_dir, 'log_info.txt') + + if args.mode == 'generator': + log_fp = open(info_file, 'w') + + gt_database = pickle.load(open(args.gt_database_dir, 'rb')) + log_print('Loading gt_database(%d) from %s' % (gt_database.__len__(), args.gt_database_dir), fp=log_fp) + + dataset = AugSceneGenerator(root_dir=args.data_dir, gt_database=gt_database, split=args.split) + dataset.generate_aug_scene(aug_times=args.aug_times, log_fp=log_fp) + + log_fp.close() + + else: + pass + diff --git a/PaddleCV/Paddle3D/PointRCNN/tools/generate_gt_database.py b/PaddleCV/Paddle3D/PointRCNN/tools/generate_gt_database.py new file mode 100644 index 0000000000000000000000000000000000000000..43290db734c9734fef8120031cab44a394f4323b --- /dev/null +++ b/PaddleCV/Paddle3D/PointRCNN/tools/generate_gt_database.py @@ -0,0 +1,104 @@ +""" +Generate GT database +This code is based on https://github.com/sshaoshuai/PointRCNN/blob/master/tools/generate_gt_database.py +""" + +import os +import numpy as np +import pickle + +from data.kitti_dataset import KittiDataset +import pts_utils +import argparse + +parser = argparse.ArgumentParser() 
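+# Options below: --data_dir points at the KITTI root, --save_dir receives the
+# pickled GT database, --class_name picks the foreground class, and --split
+# selects the image list used to build the database.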
+parser.add_argument('--data_dir', type=str, default='./data') +parser.add_argument('--save_dir', type=str, default='./data/gt_database') +parser.add_argument('--class_name', type=str, default='Car') +parser.add_argument('--split', type=str, default='train') +args = parser.parse_args() + + +class GTDatabaseGenerator(KittiDataset): + def __init__(self, root_dir, split='train', classes=args.class_name): + super(GTDatabaseGenerator, self).__init__(root_dir, split=split) + self.gt_database = None + if classes == 'Car': + self.classes = ('Background', 'Car') + elif classes == 'People': + self.classes = ('Background', 'Pedestrian', 'Cyclist') + elif classes == 'Pedestrian': + self.classes = ('Background', 'Pedestrian') + elif classes == 'Cyclist': + self.classes = ('Background', 'Cyclist') + else: + assert False, "Invalid classes: %s" % classes + + def __len__(self): + raise NotImplementedError + + def __getitem__(self, item): + raise NotImplementedError + + def filtrate_objects(self, obj_list): + valid_obj_list = [] + for obj in obj_list: + if obj.cls_type not in self.classes: + continue + if obj.level_str not in ['Easy', 'Moderate', 'Hard']: + continue + valid_obj_list.append(obj) + + return valid_obj_list + + def generate_gt_database(self): + gt_database = [] + for idx, sample_id in enumerate(self.image_idx_list): + sample_id = int(sample_id) + print('process gt sample (id=%06d)' % sample_id) + + pts_lidar = self.get_lidar(sample_id) + calib = self.get_calib(sample_id) + pts_rect = calib.lidar_to_rect(pts_lidar[:, 0:3]) + pts_intensity = pts_lidar[:, 3] + + obj_list = self.filtrate_objects(self.get_label(sample_id)) + + gt_boxes3d = np.zeros((obj_list.__len__(), 7), dtype=np.float32) + for k, obj in enumerate(obj_list): + gt_boxes3d[k, 0:3], gt_boxes3d[k, 3], gt_boxes3d[k, 4], gt_boxes3d[k, 5], gt_boxes3d[k, 6] \ + = obj.pos, obj.h, obj.w, obj.l, obj.ry + + if gt_boxes3d.__len__() == 0: + print('No gt object') + continue + + boxes_pts_mask_list = pts_utils.pts_in_boxes3d(pts_rect, gt_boxes3d) + + for k in range(boxes_pts_mask_list.shape[0]): + pt_mask_flag = (boxes_pts_mask_list[k] == 1) + cur_pts = pts_rect[pt_mask_flag].astype(np.float32) + cur_pts_intensity = pts_intensity[pt_mask_flag].astype(np.float32) + sample_dict = {'sample_id': sample_id, + 'cls_type': obj_list[k].cls_type, + 'gt_box3d': gt_boxes3d[k], + 'points': cur_pts, + 'intensity': cur_pts_intensity, + 'obj': obj_list[k]} + gt_database.append(sample_dict) + + save_file_name = os.path.join(args.save_dir, '%s_gt_database_3level_%s.pkl' % (args.split, self.classes[-1])) + with open(save_file_name, 'wb') as f: + pickle.dump(gt_database, f) + + self.gt_database = gt_database + print('Save refine training sample info file to %s' % save_file_name) + + +if __name__ == '__main__': + dataset = GTDatabaseGenerator(root_dir=args.data_dir, split=args.split) + if not os.path.isdir(args.save_dir): + os.makedirs(args.save_dir) + + dataset.generate_gt_database() + diff --git a/PaddleCV/Paddle3D/PointRCNN/tools/kitti_eval.py b/PaddleCV/Paddle3D/PointRCNN/tools/kitti_eval.py new file mode 100644 index 0000000000000000000000000000000000000000..6d16ef487301fb7ba45b71c64cd3af337cef13c5 --- /dev/null +++ b/PaddleCV/Paddle3D/PointRCNN/tools/kitti_eval.py @@ -0,0 +1,71 @@ +# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import os
+import sys
+import argparse
+
+
+def parse_args():
+    parser = argparse.ArgumentParser(
+        "KITTI mAP evaluation script")
+    parser.add_argument(
+        '--result_dir',
+        type=str,
+        default='./result_dir',
+        help='detection result directory to evaluate')
+    parser.add_argument(
+        '--data_dir',
+        type=str,
+        default='./data',
+        help='KITTI dataset root directory')
+    parser.add_argument(
+        '--split',
+        type=str,
+        default='val',
+        help='evaluation split, default val')
+    parser.add_argument(
+        '--class_name',
+        type=str,
+        default='Car',
+        help='evaluation class name, default Car')
+    args = parser.parse_args()
+    return args
+
+
+def kitti_eval():
+    # sys.version_info compares component-wise, so this also handles
+    # two-digit minor versions (e.g. 3.10) correctly
+    if sys.version_info < (3, 6):
+        print("KITTI mAP evaluation can only run with python3.6+")
+        sys.exit(1)
+
+    args = parse_args()
+
+    label_dir = os.path.join(args.data_dir, 'KITTI/object/training', 'label_2')
+    split_file = os.path.join(args.data_dir, 'KITTI/ImageSets',
+                              '{}.txt'.format(args.split))
+    final_output_dir = os.path.join(args.result_dir, 'final_result', 'data')
+    name_to_class = {'Car': 0, 'Pedestrian': 1, 'Cyclist': 2}
+
+    from tools.kitti_object_eval_python.evaluate import evaluate as kitti_evaluate
+    ap_result_str, ap_dict = kitti_evaluate(
+        label_dir, final_output_dir, label_split_file=split_file,
+        current_class=name_to_class[args.class_name])
+
+    print("KITTI evaluate: ", ap_result_str, ap_dict)
+
+
+if __name__ == "__main__":
+    kitti_eval()
+
+
diff --git a/PaddleCV/Paddle3D/PointRCNN/tools/kitti_object_eval_python/LICENSE b/PaddleCV/Paddle3D/PointRCNN/tools/kitti_object_eval_python/LICENSE
new file mode 100644
index 0000000000000000000000000000000000000000..ab602974d200aa6849e6ad8220951ef9a78d9f08
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/tools/kitti_object_eval_python/LICENSE
@@ -0,0 +1,21 @@
+MIT License
+
+Copyright (c) 2018
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
diff --git a/PaddleCV/Paddle3D/PointRCNN/tools/kitti_object_eval_python/README.md b/PaddleCV/Paddle3D/PointRCNN/tools/kitti_object_eval_python/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..0e0e0c307c2db3f0486e594deae1c04ac49f55f3
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/tools/kitti_object_eval_python/README.md
@@ -0,0 +1,32 @@
+# kitti-object-eval-python
+**NOTE**: This is borrowed from [traveller59/kitti-object-eval-python](https://github.com/traveller59/kitti-object-eval-python)
+
+Fast KITTI object detection evaluation in Python (finishes in under 10 seconds), with support for 2D/BEV/3D/AOS metrics and coco-style AP. When using the command-line interface, numba needs some time to compile its JIT functions on the first run.
+## Dependencies
+Only Python 3.6+ is supported; `numpy`, `skimage`, `numba` and `fire` are required. If you have Anaconda, just install `cudatoolkit` in Anaconda. Otherwise, please refer to this [page](https://github.com/numba/numba#custom-python-environments) to set up llvm and cuda for numba.
+* Install by conda:
+```
+conda install -c numba cudatoolkit=x.x (8.0, 9.0, 9.1, depending on your environment)
+```
+## Usage
+* command-line interface:
+```
+python evaluate.py evaluate --label_path=/path/to/your_gt_label_folder --result_path=/path/to/your_result_folder --label_split_file=/path/to/val.txt --current_class=0 --coco=False
+```
+* python interface:
+```Python
+import kitti_common as kitti
+from eval import get_official_eval_result, get_coco_eval_result
+def _read_imageset_file(path):
+    with open(path, 'r') as f:
+        lines = f.readlines()
+    return [int(line) for line in lines]
+det_path = "/path/to/your_result_folder"
+dt_annos = kitti.get_label_annos(det_path)
+gt_path = "/path/to/your_gt_label_folder"
+gt_split_file = "/path/to/val.txt" # from https://xiaozhichen.github.io/files/mv3d/imagesets.tar.gz
+val_image_ids = _read_imageset_file(gt_split_file)
+gt_annos = kitti.get_label_annos(gt_path, val_image_ids)
+print(get_official_eval_result(gt_annos, dt_annos, 0)) # about 6s on my machine
+print(get_coco_eval_result(gt_annos, dt_annos, 0)) # about 18s on my machine
+```
diff --git a/PaddleCV/Paddle3D/PointRCNN/tools/kitti_object_eval_python/eval.py b/PaddleCV/Paddle3D/PointRCNN/tools/kitti_object_eval_python/eval.py
new file mode 100644
index 0000000000000000000000000000000000000000..38101ca69a59cdc0603ebc82cac0338432457550
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/tools/kitti_object_eval_python/eval.py
@@ -0,0 +1,740 @@
+import numpy as np
+import numba
+import io as sysio
+from tools.kitti_object_eval_python.rotate_iou import rotate_iou_gpu_eval
+
+
+@numba.jit
+def get_thresholds(scores: np.ndarray, num_gt, num_sample_pts=41):
+    # pick the score thresholds that sample the recall axis at (roughly)
+    # num_sample_pts evenly spaced positions
+    scores.sort()
+    scores = scores[::-1]
+    current_recall = 0
+    thresholds = []
+    for i, score in enumerate(scores):
+        l_recall = (i + 1) / num_gt
+        if i < (len(scores) - 1):
+            r_recall = (i + 2) / num_gt
+        else:
+            r_recall = l_recall
+        if (((r_recall - current_recall) < (current_recall - l_recall))
+                and (i < (len(scores) - 1))):
+            continue
+        # recall = l_recall
+        thresholds.append(score)
+        current_recall += 1 / (num_sample_pts - 1.0)
+    return thresholds
+
+
+def clean_data(gt_anno, dt_anno, current_class, difficulty):
+    CLASS_NAMES = ['car', 'pedestrian', 'cyclist']
+    MIN_HEIGHT = [40, 25, 25]
+    MAX_OCCLUSION = [0, 1, 2]
+    MAX_TRUNCATION = [0.15, 0.3, 0.5]
+    dc_bboxes, ignored_gt, ignored_dt = [], [], []
+    current_cls_name = CLASS_NAMES[current_class].lower()
+    num_gt = len(gt_anno["name"])
+    num_dt = len(dt_anno["name"])
+    num_valid_gt = 0
+    for i in
range(num_gt): + bbox = gt_anno["bbox"][i] + gt_name = gt_anno["name"][i].lower() + height = bbox[3] - bbox[1] + valid_class = -1 + if (gt_name == current_cls_name): + valid_class = 1 + elif (current_cls_name == "Pedestrian".lower() + and "Person_sitting".lower() == gt_name): + valid_class = 0 + elif (current_cls_name == "Car".lower() and "Van".lower() == gt_name): + valid_class = 0 + else: + valid_class = -1 + ignore = False + if ((gt_anno["occluded"][i] > MAX_OCCLUSION[difficulty]) + or (gt_anno["truncated"][i] > MAX_TRUNCATION[difficulty]) + or (height <= MIN_HEIGHT[difficulty])): + # if gt_anno["difficulty"][i] > difficulty or gt_anno["difficulty"][i] == -1: + ignore = True + if valid_class == 1 and not ignore: + ignored_gt.append(0) + num_valid_gt += 1 + elif (valid_class == 0 or (ignore and (valid_class == 1))): + ignored_gt.append(1) + else: + ignored_gt.append(-1) + # for i in range(num_gt): + if gt_anno["name"][i] == "DontCare": + dc_bboxes.append(gt_anno["bbox"][i]) + for i in range(num_dt): + if (dt_anno["name"][i].lower() == current_cls_name): + valid_class = 1 + else: + valid_class = -1 + height = abs(dt_anno["bbox"][i, 3] - dt_anno["bbox"][i, 1]) + if height < MIN_HEIGHT[difficulty]: + ignored_dt.append(1) + elif valid_class == 1: + ignored_dt.append(0) + else: + ignored_dt.append(-1) + + return num_valid_gt, ignored_gt, ignored_dt, dc_bboxes + + +@numba.jit(nopython=True) +def image_box_overlap(boxes, query_boxes, criterion=-1): + N = boxes.shape[0] + K = query_boxes.shape[0] + overlaps = np.zeros((N, K), dtype=boxes.dtype) + for k in range(K): + qbox_area = ((query_boxes[k, 2] - query_boxes[k, 0]) * + (query_boxes[k, 3] - query_boxes[k, 1])) + for n in range(N): + iw = (min(boxes[n, 2], query_boxes[k, 2]) - + max(boxes[n, 0], query_boxes[k, 0])) + if iw > 0: + ih = (min(boxes[n, 3], query_boxes[k, 3]) - + max(boxes[n, 1], query_boxes[k, 1])) + if ih > 0: + if criterion == -1: + ua = ( + (boxes[n, 2] - boxes[n, 0]) * + (boxes[n, 3] - boxes[n, 1]) + qbox_area - iw * ih) + elif criterion == 0: + ua = ((boxes[n, 2] - boxes[n, 0]) * + (boxes[n, 3] - boxes[n, 1])) + elif criterion == 1: + ua = qbox_area + else: + ua = 1.0 + overlaps[n, k] = iw * ih / ua + return overlaps + + +def bev_box_overlap(boxes, qboxes, criterion=-1): + riou = rotate_iou_gpu_eval(boxes, qboxes, criterion) + return riou + + +@numba.jit(nopython=True, parallel=True) +def d3_box_overlap_kernel(boxes, qboxes, rinc, criterion=-1): + # ONLY support overlap in CAMERA, not lider. 
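+    # rinc comes in holding the rotated BEV intersection areas from
+    # rotate_iou_gpu_eval; each entry is multiplied by the y-axis (height)
+    # overlap to obtain the 3D intersection volume, then normalized by the
+    # union selected via `criterion` (-1: IoU, 0: volume of boxes,
+    # 1: volume of qboxes). In camera coordinates y points down and a box's
+    # location is its bottom face, so box i spans
+    # [boxes[i, 1] - boxes[i, 4], boxes[i, 1]] vertically.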
+ N, K = boxes.shape[0], qboxes.shape[0] + for i in range(N): + for j in range(K): + if rinc[i, j] > 0: + # iw = (min(boxes[i, 1] + boxes[i, 4], qboxes[j, 1] + + # qboxes[j, 4]) - max(boxes[i, 1], qboxes[j, 1])) + iw = (min(boxes[i, 1], qboxes[j, 1]) - max( + boxes[i, 1] - boxes[i, 4], qboxes[j, 1] - qboxes[j, 4])) + + if iw > 0: + area1 = boxes[i, 3] * boxes[i, 4] * boxes[i, 5] + area2 = qboxes[j, 3] * qboxes[j, 4] * qboxes[j, 5] + inc = iw * rinc[i, j] + if criterion == -1: + ua = (area1 + area2 - inc) + elif criterion == 0: + ua = area1 + elif criterion == 1: + ua = area2 + else: + ua = inc + rinc[i, j] = inc / ua + else: + rinc[i, j] = 0.0 + + +def d3_box_overlap(boxes, qboxes, criterion=-1): + rinc = rotate_iou_gpu_eval(boxes[:, [0, 2, 3, 5, 6]], + qboxes[:, [0, 2, 3, 5, 6]], 2) + d3_box_overlap_kernel(boxes, qboxes, rinc, criterion) + return rinc + + +@numba.jit(nopython=True) +def compute_statistics_jit(overlaps, + gt_datas, + dt_datas, + ignored_gt, + ignored_det, + dc_bboxes, + metric, + min_overlap, + thresh=0, + compute_fp=False, + compute_aos=False): + + det_size = dt_datas.shape[0] + gt_size = gt_datas.shape[0] + dt_scores = dt_datas[:, -1] + dt_alphas = dt_datas[:, 4] + gt_alphas = gt_datas[:, 4] + dt_bboxes = dt_datas[:, :4] + gt_bboxes = gt_datas[:, :4] + + assigned_detection = [False] * det_size + ignored_threshold = [False] * det_size + if compute_fp: + for i in range(det_size): + if (dt_scores[i] < thresh): + ignored_threshold[i] = True + NO_DETECTION = -10000000 + tp, fp, fn, similarity = 0, 0, 0, 0 + # thresholds = [0.0] + # delta = [0.0] + thresholds = np.zeros((gt_size, )) + thresh_idx = 0 + delta = np.zeros((gt_size, )) + delta_idx = 0 + for i in range(gt_size): + if ignored_gt[i] == -1: + continue + det_idx = -1 + valid_detection = NO_DETECTION + max_overlap = 0 + assigned_ignored_det = False + + for j in range(det_size): + if (ignored_det[j] == -1): + continue + if (assigned_detection[j]): + continue + if (ignored_threshold[j]): + continue + overlap = overlaps[j, i] + dt_score = dt_scores[j] + if (not compute_fp and (overlap > min_overlap) + and dt_score > valid_detection): + det_idx = j + valid_detection = dt_score + elif (compute_fp and (overlap > min_overlap) + and (overlap > max_overlap or assigned_ignored_det) + and ignored_det[j] == 0): + max_overlap = overlap + det_idx = j + valid_detection = 1 + assigned_ignored_det = False + elif (compute_fp and (overlap > min_overlap) + and (valid_detection == NO_DETECTION) + and ignored_det[j] == 1): + det_idx = j + valid_detection = 1 + assigned_ignored_det = True + + if (valid_detection == NO_DETECTION) and ignored_gt[i] == 0: + fn += 1 + elif ((valid_detection != NO_DETECTION) + and (ignored_gt[i] == 1 or ignored_det[det_idx] == 1)): + assigned_detection[det_idx] = True + elif valid_detection != NO_DETECTION: + tp += 1 + # thresholds.append(dt_scores[det_idx]) + thresholds[thresh_idx] = dt_scores[det_idx] + thresh_idx += 1 + if compute_aos: + # delta.append(gt_alphas[i] - dt_alphas[det_idx]) + delta[delta_idx] = gt_alphas[i] - dt_alphas[det_idx] + delta_idx += 1 + + assigned_detection[det_idx] = True + if compute_fp: + for i in range(det_size): + if (not (assigned_detection[i] or ignored_det[i] == -1 + or ignored_det[i] == 1 or ignored_threshold[i])): + fp += 1 + nstuff = 0 + if metric == 0: + overlaps_dt_dc = image_box_overlap(dt_bboxes, dc_bboxes, 0) + for i in range(dc_bboxes.shape[0]): + for j in range(det_size): + if (assigned_detection[j]): + continue + if (ignored_det[j] == -1 or ignored_det[j] == 1): + 
continue + if (ignored_threshold[j]): + continue + if overlaps_dt_dc[j, i] > min_overlap: + assigned_detection[j] = True + nstuff += 1 + fp -= nstuff + if compute_aos: + tmp = np.zeros((fp + delta_idx, )) + # tmp = [0] * fp + for i in range(delta_idx): + tmp[i + fp] = (1.0 + np.cos(delta[i])) / 2.0 + # tmp.append((1.0 + np.cos(delta[i])) / 2.0) + # assert len(tmp) == fp + tp + # assert len(delta) == tp + if tp > 0 or fp > 0: + similarity = np.sum(tmp) + else: + similarity = -1 + return tp, fp, fn, similarity, thresholds[:thresh_idx] + + +def get_split_parts(num, num_part): + same_part = num // num_part + remain_num = num % num_part + if remain_num == 0: + return [same_part] * num_part + else: + return [same_part] * num_part + [remain_num] + + +@numba.jit(nopython=True) +def fused_compute_statistics(overlaps, + pr, + gt_nums, + dt_nums, + dc_nums, + gt_datas, + dt_datas, + dontcares, + ignored_gts, + ignored_dets, + metric, + min_overlap, + thresholds, + compute_aos=False): + gt_num = 0 + dt_num = 0 + dc_num = 0 + for i in range(gt_nums.shape[0]): + for t, thresh in enumerate(thresholds): + overlap = overlaps[dt_num:dt_num + dt_nums[i], gt_num: + gt_num + gt_nums[i]] + + gt_data = gt_datas[gt_num:gt_num + gt_nums[i]] + dt_data = dt_datas[dt_num:dt_num + dt_nums[i]] + ignored_gt = ignored_gts[gt_num:gt_num + gt_nums[i]] + ignored_det = ignored_dets[dt_num:dt_num + dt_nums[i]] + dontcare = dontcares[dc_num:dc_num + dc_nums[i]] + tp, fp, fn, similarity, _ = compute_statistics_jit( + overlap, + gt_data, + dt_data, + ignored_gt, + ignored_det, + dontcare, + metric, + min_overlap=min_overlap, + thresh=thresh, + compute_fp=True, + compute_aos=compute_aos) + pr[t, 0] += tp + pr[t, 1] += fp + pr[t, 2] += fn + if similarity != -1: + pr[t, 3] += similarity + gt_num += gt_nums[i] + dt_num += dt_nums[i] + dc_num += dc_nums[i] + + +def calculate_iou_partly(gt_annos, dt_annos, metric, num_parts=50): + """fast iou algorithm. this function can be used independently to + do result analysis. Must be used in CAMERA coordinate system. + Args: + gt_annos: dict, must from get_label_annos() in kitti_common.py + dt_annos: dict, must from get_label_annos() in kitti_common.py + metric: eval type. 0: bbox, 1: bev, 2: 3d + num_parts: int. 
a parameter for fast calculate algorithm + """ + assert len(gt_annos) == len(dt_annos) + total_dt_num = np.stack([len(a["name"]) for a in dt_annos], 0) + total_gt_num = np.stack([len(a["name"]) for a in gt_annos], 0) + num_examples = len(gt_annos) + split_parts = get_split_parts(num_examples, num_parts) + parted_overlaps = [] + example_idx = 0 + + for num_part in split_parts: + gt_annos_part = gt_annos[example_idx:example_idx + num_part] + dt_annos_part = dt_annos[example_idx:example_idx + num_part] + if metric == 0: + gt_boxes = np.concatenate([a["bbox"] for a in gt_annos_part], 0) + dt_boxes = np.concatenate([a["bbox"] for a in dt_annos_part], 0) + overlap_part = image_box_overlap(gt_boxes, dt_boxes) + elif metric == 1: + loc = np.concatenate( + [a["location"][:, [0, 2]] for a in gt_annos_part], 0) + dims = np.concatenate( + [a["dimensions"][:, [0, 2]] for a in gt_annos_part], 0) + rots = np.concatenate([a["rotation_y"] for a in gt_annos_part], 0) + gt_boxes = np.concatenate( + [loc, dims, rots[..., np.newaxis]], axis=1) + loc = np.concatenate( + [a["location"][:, [0, 2]] for a in dt_annos_part], 0) + dims = np.concatenate( + [a["dimensions"][:, [0, 2]] for a in dt_annos_part], 0) + rots = np.concatenate([a["rotation_y"] for a in dt_annos_part], 0) + dt_boxes = np.concatenate( + [loc, dims, rots[..., np.newaxis]], axis=1) + overlap_part = bev_box_overlap(gt_boxes, dt_boxes).astype( + np.float64) + elif metric == 2: + loc = np.concatenate([a["location"] for a in gt_annos_part], 0) + dims = np.concatenate([a["dimensions"] for a in gt_annos_part], 0) + rots = np.concatenate([a["rotation_y"] for a in gt_annos_part], 0) + gt_boxes = np.concatenate( + [loc, dims, rots[..., np.newaxis]], axis=1) + loc = np.concatenate([a["location"] for a in dt_annos_part], 0) + dims = np.concatenate([a["dimensions"] for a in dt_annos_part], 0) + rots = np.concatenate([a["rotation_y"] for a in dt_annos_part], 0) + dt_boxes = np.concatenate( + [loc, dims, rots[..., np.newaxis]], axis=1) + overlap_part = d3_box_overlap(gt_boxes, dt_boxes).astype( + np.float64) + else: + raise ValueError("unknown metric") + parted_overlaps.append(overlap_part) + example_idx += num_part + overlaps = [] + example_idx = 0 + for j, num_part in enumerate(split_parts): + gt_annos_part = gt_annos[example_idx:example_idx + num_part] + dt_annos_part = dt_annos[example_idx:example_idx + num_part] + gt_num_idx, dt_num_idx = 0, 0 + for i in range(num_part): + gt_box_num = total_gt_num[example_idx + i] + dt_box_num = total_dt_num[example_idx + i] + overlaps.append( + parted_overlaps[j][gt_num_idx:gt_num_idx + gt_box_num, + dt_num_idx:dt_num_idx + dt_box_num]) + gt_num_idx += gt_box_num + dt_num_idx += dt_box_num + example_idx += num_part + + return overlaps, parted_overlaps, total_gt_num, total_dt_num + + +def _prepare_data(gt_annos, dt_annos, current_class, difficulty): + gt_datas_list = [] + dt_datas_list = [] + total_dc_num = [] + ignored_gts, ignored_dets, dontcares = [], [], [] + total_num_valid_gt = 0 + for i in range(len(gt_annos)): + rets = clean_data(gt_annos[i], dt_annos[i], current_class, difficulty) + num_valid_gt, ignored_gt, ignored_det, dc_bboxes = rets + ignored_gts.append(np.array(ignored_gt, dtype=np.int64)) + ignored_dets.append(np.array(ignored_det, dtype=np.int64)) + if len(dc_bboxes) == 0: + dc_bboxes = np.zeros((0, 4)).astype(np.float64) + else: + dc_bboxes = np.stack(dc_bboxes, 0).astype(np.float64) + total_dc_num.append(dc_bboxes.shape[0]) + dontcares.append(dc_bboxes) + total_num_valid_gt += num_valid_gt + gt_datas = 
np.concatenate( + [gt_annos[i]["bbox"], gt_annos[i]["alpha"][..., np.newaxis]], 1) + dt_datas = np.concatenate([ + dt_annos[i]["bbox"], dt_annos[i]["alpha"][..., np.newaxis], + dt_annos[i]["score"][..., np.newaxis] + ], 1) + gt_datas_list.append(gt_datas) + dt_datas_list.append(dt_datas) + total_dc_num = np.stack(total_dc_num, axis=0) + return (gt_datas_list, dt_datas_list, ignored_gts, ignored_dets, dontcares, + total_dc_num, total_num_valid_gt) + + +def eval_class(gt_annos, + dt_annos, + current_classes, + difficultys, + metric, + min_overlaps, + compute_aos=False, + num_parts=50): + """Kitti eval. support 2d/bev/3d/aos eval. support 0.5:0.05:0.95 coco AP. + Args: + gt_annos: dict, must from get_label_annos() in kitti_common.py + dt_annos: dict, must from get_label_annos() in kitti_common.py + current_classes: list of int, 0: car, 1: pedestrian, 2: cyclist + difficultys: list of int. eval difficulty, 0: easy, 1: normal, 2: hard + metric: eval type. 0: bbox, 1: bev, 2: 3d + min_overlaps: float, min overlap. format: [num_overlap, metric, class]. + num_parts: int. a parameter for fast calculate algorithm + + Returns: + dict of recall, precision and aos + """ + assert len(gt_annos) == len(dt_annos) + num_examples = len(gt_annos) + split_parts = get_split_parts(num_examples, num_parts) + + rets = calculate_iou_partly(dt_annos, gt_annos, metric, num_parts) + overlaps, parted_overlaps, total_dt_num, total_gt_num = rets + N_SAMPLE_PTS = 41 + num_minoverlap = len(min_overlaps) + num_class = len(current_classes) + num_difficulty = len(difficultys) + precision = np.zeros( + [num_class, num_difficulty, num_minoverlap, N_SAMPLE_PTS]) + recall = np.zeros( + [num_class, num_difficulty, num_minoverlap, N_SAMPLE_PTS]) + aos = np.zeros([num_class, num_difficulty, num_minoverlap, N_SAMPLE_PTS]) + for m, current_class in enumerate(current_classes): + for l, difficulty in enumerate(difficultys): + rets = _prepare_data(gt_annos, dt_annos, current_class, difficulty) + (gt_datas_list, dt_datas_list, ignored_gts, ignored_dets, + dontcares, total_dc_num, total_num_valid_gt) = rets + for k, min_overlap in enumerate(min_overlaps[:, metric, m]): + thresholdss = [] + for i in range(len(gt_annos)): + rets = compute_statistics_jit( + overlaps[i], + gt_datas_list[i], + dt_datas_list[i], + ignored_gts[i], + ignored_dets[i], + dontcares[i], + metric, + min_overlap=min_overlap, + thresh=0.0, + compute_fp=False) + tp, fp, fn, similarity, thresholds = rets + thresholdss += thresholds.tolist() + thresholdss = np.array(thresholdss) + thresholds = get_thresholds(thresholdss, total_num_valid_gt) + thresholds = np.array(thresholds) + pr = np.zeros([len(thresholds), 4]) + idx = 0 + for j, num_part in enumerate(split_parts): + gt_datas_part = np.concatenate( + gt_datas_list[idx:idx + num_part], 0) + dt_datas_part = np.concatenate( + dt_datas_list[idx:idx + num_part], 0) + dc_datas_part = np.concatenate( + dontcares[idx:idx + num_part], 0) + ignored_dets_part = np.concatenate( + ignored_dets[idx:idx + num_part], 0) + ignored_gts_part = np.concatenate( + ignored_gts[idx:idx + num_part], 0) + fused_compute_statistics( + parted_overlaps[j], + pr, + total_gt_num[idx:idx + num_part], + total_dt_num[idx:idx + num_part], + total_dc_num[idx:idx + num_part], + gt_datas_part, + dt_datas_part, + dc_datas_part, + ignored_gts_part, + ignored_dets_part, + metric, + min_overlap=min_overlap, + thresholds=thresholds, + compute_aos=compute_aos) + idx += num_part + for i in range(len(thresholds)): + recall[m, l, k, i] = pr[i, 0] / (pr[i, 0] + pr[i, 
2]) + precision[m, l, k, i] = pr[i, 0] / (pr[i, 0] + pr[i, 1]) + if compute_aos: + aos[m, l, k, i] = pr[i, 3] / (pr[i, 0] + pr[i, 1]) + for i in range(len(thresholds)): + precision[m, l, k, i] = np.max( + precision[m, l, k, i:], axis=-1) + recall[m, l, k, i] = np.max(recall[m, l, k, i:], axis=-1) + if compute_aos: + aos[m, l, k, i] = np.max(aos[m, l, k, i:], axis=-1) + ret_dict = { + "recall": recall, + "precision": precision, + "orientation": aos, + } + return ret_dict + + +def get_mAP(prec): + sums = 0 + for i in range(0, prec.shape[-1], 4): + sums = sums + prec[..., i] + return sums / 11 * 100 + + +def print_str(value, *arg, sstream=None): + if sstream is None: + sstream = sysio.StringIO() + sstream.truncate(0) + sstream.seek(0) + print(value, *arg, file=sstream) + return sstream.getvalue() + + +def do_eval(gt_annos, + dt_annos, + current_classes, + min_overlaps, + compute_aos=False): + # min_overlaps: [num_minoverlap, metric, num_class] + difficultys = [0, 1, 2] + ret = eval_class(gt_annos, dt_annos, current_classes, difficultys, 0, + min_overlaps, compute_aos) + # ret: [num_class, num_diff, num_minoverlap, num_sample_points] + mAP_bbox = get_mAP(ret["precision"]) + mAP_aos = None + if compute_aos: + mAP_aos = get_mAP(ret["orientation"]) + ret = eval_class(gt_annos, dt_annos, current_classes, difficultys, 1, + min_overlaps) + mAP_bev = get_mAP(ret["precision"]) + ret = eval_class(gt_annos, dt_annos, current_classes, difficultys, 2, + min_overlaps) + mAP_3d = get_mAP(ret["precision"]) + return mAP_bbox, mAP_bev, mAP_3d, mAP_aos + + +def do_coco_style_eval(gt_annos, dt_annos, current_classes, overlap_ranges, + compute_aos): + # overlap_ranges: [range, metric, num_class] + min_overlaps = np.zeros([10, *overlap_ranges.shape[1:]]) + for i in range(overlap_ranges.shape[1]): + for j in range(overlap_ranges.shape[2]): + min_overlaps[:, i, j] = np.linspace(*overlap_ranges[:, i, j]) + mAP_bbox, mAP_bev, mAP_3d, mAP_aos = do_eval( + gt_annos, dt_annos, current_classes, min_overlaps, compute_aos) + # ret: [num_class, num_diff, num_minoverlap] + mAP_bbox = mAP_bbox.mean(-1) + mAP_bev = mAP_bev.mean(-1) + mAP_3d = mAP_3d.mean(-1) + if mAP_aos is not None: + mAP_aos = mAP_aos.mean(-1) + return mAP_bbox, mAP_bev, mAP_3d, mAP_aos + + +def get_official_eval_result(gt_annos, dt_annos, current_classes): + overlap_0_7 = np.array([[0.7, 0.5, 0.5, 0.7, + 0.5], [0.7, 0.5, 0.5, 0.7, 0.5], + [0.7, 0.5, 0.5, 0.7, 0.5]]) + overlap_0_5 = np.array([[0.7, 0.5, 0.5, 0.7, + 0.5], [0.5, 0.25, 0.25, 0.5, 0.25], + [0.5, 0.25, 0.25, 0.5, 0.25]]) + min_overlaps = np.stack([overlap_0_7, overlap_0_5], axis=0) # [2, 3, 5] + class_to_name = { + 0: 'Car', + 1: 'Pedestrian', + 2: 'Cyclist', + 3: 'Van', + 4: 'Person_sitting', + } + name_to_class = {v: n for n, v in class_to_name.items()} + if not isinstance(current_classes, (list, tuple)): + current_classes = [current_classes] + current_classes_int = [] + for curcls in current_classes: + if isinstance(curcls, str): + current_classes_int.append(name_to_class[curcls]) + else: + current_classes_int.append(curcls) + current_classes = current_classes_int + min_overlaps = min_overlaps[:, :, current_classes] + result = '' + # check whether alpha is valid + compute_aos = False + for anno in dt_annos: + if anno['alpha'].shape[0] != 0: + if anno['alpha'][0] != -10: + compute_aos = True + break + mAPbbox, mAPbev, mAP3d, mAPaos = do_eval( + gt_annos, dt_annos, current_classes, min_overlaps, compute_aos) + + ret_dict = {} + for j, curcls in enumerate(current_classes): + # mAP threshold array: 
[num_minoverlap, metric, class] + # mAP result: [num_class, num_diff, num_minoverlap] + for i in range(min_overlaps.shape[0]): + result += print_str( + (f"{class_to_name[curcls]} " + "AP@{:.2f}, {:.2f}, {:.2f}:".format(*min_overlaps[i, :, j]))) + result += print_str((f"bbox AP:{mAPbbox[j, 0, i]:.4f}, " + f"{mAPbbox[j, 1, i]:.4f}, " + f"{mAPbbox[j, 2, i]:.4f}")) + result += print_str((f"bev AP:{mAPbev[j, 0, i]:.4f}, " + f"{mAPbev[j, 1, i]:.4f}, " + f"{mAPbev[j, 2, i]:.4f}")) + result += print_str((f"3d AP:{mAP3d[j, 0, i]:.4f}, " + f"{mAP3d[j, 1, i]:.4f}, " + f"{mAP3d[j, 2, i]:.4f}")) + + + if compute_aos: + result += print_str((f"aos AP:{mAPaos[j, 0, i]:.2f}, " + f"{mAPaos[j, 1, i]:.2f}, " + f"{mAPaos[j, 2, i]:.2f}")) + ret_dict['Car_3d_easy'] = mAP3d[0, 0, 0] + ret_dict['Car_3d_moderate'] = mAP3d[0, 1, 0] + ret_dict['Car_3d_hard'] = mAP3d[0, 2, 0] + ret_dict['Car_bev_easy'] = mAPbev[0, 0, 0] + ret_dict['Car_bev_moderate'] = mAPbev[0, 1, 0] + ret_dict['Car_bev_hard'] = mAPbev[0, 2, 0] + ret_dict['Car_image_easy'] = mAPbbox[0, 0, 0] + ret_dict['Car_image_moderate'] = mAPbbox[0, 1, 0] + ret_dict['Car_image_hard'] = mAPbbox[0, 2, 0] + + return result, ret_dict + + +def get_coco_eval_result(gt_annos, dt_annos, current_classes): + class_to_name = { + 0: 'Car', + 1: 'Pedestrian', + 2: 'Cyclist', + 3: 'Van', + 4: 'Person_sitting', + } + class_to_range = { + 0: [0.5, 0.95, 10], + 1: [0.25, 0.7, 10], + 2: [0.25, 0.7, 10], + 3: [0.5, 0.95, 10], + 4: [0.25, 0.7, 10], + } + name_to_class = {v: n for n, v in class_to_name.items()} + if not isinstance(current_classes, (list, tuple)): + current_classes = [current_classes] + current_classes_int = [] + for curcls in current_classes: + if isinstance(curcls, str): + current_classes_int.append(name_to_class[curcls]) + else: + current_classes_int.append(curcls) + current_classes = current_classes_int + overlap_ranges = np.zeros([3, 3, len(current_classes)]) + for i, curcls in enumerate(current_classes): + overlap_ranges[:, :, i] = np.array( + class_to_range[curcls])[:, np.newaxis] + result = '' + # check whether alpha is valid + compute_aos = False + for anno in dt_annos: + if anno['alpha'].shape[0] != 0: + if anno['alpha'][0] != -10: + compute_aos = True + break + mAPbbox, mAPbev, mAP3d, mAPaos = do_coco_style_eval( + gt_annos, dt_annos, current_classes, overlap_ranges, compute_aos) + for j, curcls in enumerate(current_classes): + # mAP threshold array: [num_minoverlap, metric, class] + # mAP result: [num_class, num_diff, num_minoverlap] + o_range = np.array(class_to_range[curcls])[[0, 2, 1]] + o_range[1] = (o_range[2] - o_range[0]) / (o_range[1] - 1) + result += print_str((f"{class_to_name[curcls]} " + "coco AP@{:.2f}:{:.2f}:{:.2f}:".format(*o_range))) + result += print_str((f"bbox AP:{mAPbbox[j, 0]:.2f}, " + f"{mAPbbox[j, 1]:.2f}, " + f"{mAPbbox[j, 2]:.2f}")) + result += print_str((f"bev AP:{mAPbev[j, 0]:.2f}, " + f"{mAPbev[j, 1]:.2f}, " + f"{mAPbev[j, 2]:.2f}")) + result += print_str((f"3d AP:{mAP3d[j, 0]:.2f}, " + f"{mAP3d[j, 1]:.2f}, " + f"{mAP3d[j, 2]:.2f}")) + if compute_aos: + result += print_str((f"aos AP:{mAPaos[j, 0]:.2f}, " + f"{mAPaos[j, 1]:.2f}, " + f"{mAPaos[j, 2]:.2f}")) + return result diff --git a/PaddleCV/Paddle3D/PointRCNN/tools/kitti_object_eval_python/evaluate.py b/PaddleCV/Paddle3D/PointRCNN/tools/kitti_object_eval_python/evaluate.py new file mode 100644 index 0000000000000000000000000000000000000000..e822ae464618eb05c4123b7bd05cec875a567b70 --- /dev/null +++ b/PaddleCV/Paddle3D/PointRCNN/tools/kitti_object_eval_python/evaluate.py @@ 
-0,0 +1,32 @@ +import time +import fire + +import tools.kitti_object_eval_python.kitti_common as kitti +from tools.kitti_object_eval_python.eval import get_official_eval_result, get_coco_eval_result + + +def _read_imageset_file(path): + with open(path, 'r') as f: + lines = f.readlines() + return [int(line) for line in lines] + + +def evaluate(label_path, + result_path, + label_split_file, + current_class=0, + coco=False, + score_thresh=-1): + dt_annos = kitti.get_label_annos(result_path) + if score_thresh > 0: + dt_annos = kitti.filter_annos_low_score(dt_annos, score_thresh) + val_image_ids = _read_imageset_file(label_split_file) + gt_annos = kitti.get_label_annos(label_path, val_image_ids) + if coco: + return get_coco_eval_result(gt_annos, dt_annos, current_class) + else: + return get_official_eval_result(gt_annos, dt_annos, current_class) + + +if __name__ == '__main__': + fire.Fire() diff --git a/PaddleCV/Paddle3D/PointRCNN/tools/kitti_object_eval_python/kitti_common.py b/PaddleCV/Paddle3D/PointRCNN/tools/kitti_object_eval_python/kitti_common.py new file mode 100644 index 0000000000000000000000000000000000000000..e7e254ea4a27af9656757bbfb1f932c1348f59fe --- /dev/null +++ b/PaddleCV/Paddle3D/PointRCNN/tools/kitti_object_eval_python/kitti_common.py @@ -0,0 +1,411 @@ +import concurrent.futures as futures +import os +import pathlib +import re +from collections import OrderedDict + +import numpy as np +from skimage import io + +def get_image_index_str(img_idx): + return "{:06d}".format(img_idx) + + +def get_kitti_info_path(idx, + prefix, + info_type='image_2', + file_tail='.png', + training=True, + relative_path=True): + img_idx_str = get_image_index_str(idx) + img_idx_str += file_tail + prefix = pathlib.Path(prefix) + if training: + file_path = pathlib.Path('training') / info_type / img_idx_str + else: + file_path = pathlib.Path('testing') / info_type / img_idx_str + if not (prefix / file_path).exists(): + raise ValueError("file not exist: {}".format(file_path)) + if relative_path: + return str(file_path) + else: + return str(prefix / file_path) + + +def get_image_path(idx, prefix, training=True, relative_path=True): + return get_kitti_info_path(idx, prefix, 'image_2', '.png', training, + relative_path) + + +def get_label_path(idx, prefix, training=True, relative_path=True): + return get_kitti_info_path(idx, prefix, 'label_2', '.txt', training, + relative_path) + + +def get_velodyne_path(idx, prefix, training=True, relative_path=True): + return get_kitti_info_path(idx, prefix, 'velodyne', '.bin', training, + relative_path) + + +def get_calib_path(idx, prefix, training=True, relative_path=True): + return get_kitti_info_path(idx, prefix, 'calib', '.txt', training, + relative_path) + + +def _extend_matrix(mat): + mat = np.concatenate([mat, np.array([[0., 0., 0., 1.]])], axis=0) + return mat + + +def get_kitti_image_info(path, + training=True, + label_info=True, + velodyne=False, + calib=False, + image_ids=7481, + extend_matrix=True, + num_worker=8, + relative_path=True, + with_imageshape=True): + # image_infos = [] + root_path = pathlib.Path(path) + if not isinstance(image_ids, list): + image_ids = list(range(image_ids)) + + def map_func(idx): + image_info = {'image_idx': idx} + annotations = None + if velodyne: + image_info['velodyne_path'] = get_velodyne_path( + idx, path, training, relative_path) + image_info['img_path'] = get_image_path(idx, path, training, + relative_path) + if with_imageshape: + img_path = image_info['img_path'] + if relative_path: + img_path = str(root_path / img_path) + 
image_info['img_shape'] = np.array( + io.imread(img_path).shape[:2], dtype=np.int32) + if label_info: + label_path = get_label_path(idx, path, training, relative_path) + if relative_path: + label_path = str(root_path / label_path) + annotations = get_label_anno(label_path) + if calib: + calib_path = get_calib_path( + idx, path, training, relative_path=False) + with open(calib_path, 'r') as f: + lines = f.readlines() + P0 = np.array( + [float(info) for info in lines[0].split(' ')[1:13]]).reshape( + [3, 4]) + P1 = np.array( + [float(info) for info in lines[1].split(' ')[1:13]]).reshape( + [3, 4]) + P2 = np.array( + [float(info) for info in lines[2].split(' ')[1:13]]).reshape( + [3, 4]) + P3 = np.array( + [float(info) for info in lines[3].split(' ')[1:13]]).reshape( + [3, 4]) + if extend_matrix: + P0 = _extend_matrix(P0) + P1 = _extend_matrix(P1) + P2 = _extend_matrix(P2) + P3 = _extend_matrix(P3) + image_info['calib/P0'] = P0 + image_info['calib/P1'] = P1 + image_info['calib/P2'] = P2 + image_info['calib/P3'] = P3 + R0_rect = np.array([ + float(info) for info in lines[4].split(' ')[1:10] + ]).reshape([3, 3]) + if extend_matrix: + rect_4x4 = np.zeros([4, 4], dtype=R0_rect.dtype) + rect_4x4[3, 3] = 1. + rect_4x4[:3, :3] = R0_rect + else: + rect_4x4 = R0_rect + image_info['calib/R0_rect'] = rect_4x4 + Tr_velo_to_cam = np.array([ + float(info) for info in lines[5].split(' ')[1:13] + ]).reshape([3, 4]) + Tr_imu_to_velo = np.array([ + float(info) for info in lines[6].split(' ')[1:13] + ]).reshape([3, 4]) + if extend_matrix: + Tr_velo_to_cam = _extend_matrix(Tr_velo_to_cam) + Tr_imu_to_velo = _extend_matrix(Tr_imu_to_velo) + image_info['calib/Tr_velo_to_cam'] = Tr_velo_to_cam + image_info['calib/Tr_imu_to_velo'] = Tr_imu_to_velo + if annotations is not None: + image_info['annos'] = annotations + add_difficulty_to_annos(image_info) + return image_info + + with futures.ThreadPoolExecutor(num_worker) as executor: + image_infos = executor.map(map_func, image_ids) + return list(image_infos) + + +def filter_kitti_anno(image_anno, + used_classes, + used_difficulty=None, + dontcare_iou=None): + if not isinstance(used_classes, (list, tuple)): + used_classes = [used_classes] + img_filtered_annotations = {} + relevant_annotation_indices = [ + i for i, x in enumerate(image_anno['name']) if x in used_classes + ] + for key in image_anno.keys(): + img_filtered_annotations[key] = ( + image_anno[key][relevant_annotation_indices]) + if used_difficulty is not None: + relevant_annotation_indices = [ + i for i, x in enumerate(img_filtered_annotations['difficulty']) + if x in used_difficulty + ] + for key in image_anno.keys(): + img_filtered_annotations[key] = ( + img_filtered_annotations[key][relevant_annotation_indices]) + + if 'DontCare' in used_classes and dontcare_iou is not None: + dont_care_indices = [ + i for i, x in enumerate(img_filtered_annotations['name']) + if x == 'DontCare' + ] + # bounding box format [y_min, x_min, y_max, x_max] + all_boxes = img_filtered_annotations['bbox'] + ious = iou(all_boxes, all_boxes[dont_care_indices]) + + # Remove all bounding boxes that overlap with a dontcare region. 
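+        # Any box (including the DontCare entries themselves) whose maximum
+        # IoU with a DontCare bbox exceeds dontcare_iou is dropped from every
+        # annotation field.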
+ if ious.size > 0: + boxes_to_remove = np.amax(ious, axis=1) > dontcare_iou + for key in image_anno.keys(): + img_filtered_annotations[key] = (img_filtered_annotations[key][ + np.logical_not(boxes_to_remove)]) + return img_filtered_annotations + +def filter_annos_low_score(image_annos, thresh): + new_image_annos = [] + for anno in image_annos: + img_filtered_annotations = {} + relevant_annotation_indices = [ + i for i, s in enumerate(anno['score']) if s >= thresh + ] + for key in anno.keys(): + img_filtered_annotations[key] = ( + anno[key][relevant_annotation_indices]) + new_image_annos.append(img_filtered_annotations) + return new_image_annos + +def kitti_result_line(result_dict, precision=4): + prec_float = "{" + ":.{}f".format(precision) + "}" + res_line = [] + all_field_default = OrderedDict([ + ('name', None), + ('truncated', -1), + ('occluded', -1), + ('alpha', -10), + ('bbox', None), + ('dimensions', [-1, -1, -1]), + ('location', [-1000, -1000, -1000]), + ('rotation_y', -10), + ('score', None), + ]) + res_dict = [(key, None) for key, val in all_field_default.items()] + res_dict = OrderedDict(res_dict) + for key, val in result_dict.items(): + if all_field_default[key] is None and val is None: + raise ValueError("you must specify a value for {}".format(key)) + res_dict[key] = val + + for key, val in res_dict.items(): + if key == 'name': + res_line.append(val) + elif key in ['truncated', 'alpha', 'rotation_y', 'score']: + if val is None: + res_line.append(str(all_field_default[key])) + else: + res_line.append(prec_float.format(val)) + elif key == 'occluded': + if val is None: + res_line.append(str(all_field_default[key])) + else: + res_line.append('{}'.format(val)) + elif key in ['bbox', 'dimensions', 'location']: + if val is None: + res_line += [str(v) for v in all_field_default[key]] + else: + res_line += [prec_float.format(v) for v in val] + else: + raise ValueError("unknown key. 
supported key:{}".format( + res_dict.keys())) + return ' '.join(res_line) + + +def add_difficulty_to_annos(info): + min_height = [40, 25, + 25] # minimum height for evaluated groundtruth/detections + max_occlusion = [ + 0, 1, 2 + ] # maximum occlusion level of the groundtruth used for evaluation + max_trunc = [ + 0.15, 0.3, 0.5 + ] # maximum truncation level of the groundtruth used for evaluation + annos = info['annos'] + dims = annos['dimensions'] # lhw format + bbox = annos['bbox'] + height = bbox[:, 3] - bbox[:, 1] + occlusion = annos['occluded'] + truncation = annos['truncated'] + diff = [] + easy_mask = np.ones((len(dims), ), dtype=np.bool) + moderate_mask = np.ones((len(dims), ), dtype=np.bool) + hard_mask = np.ones((len(dims), ), dtype=np.bool) + i = 0 + for h, o, t in zip(height, occlusion, truncation): + if o > max_occlusion[0] or h <= min_height[0] or t > max_trunc[0]: + easy_mask[i] = False + if o > max_occlusion[1] or h <= min_height[1] or t > max_trunc[1]: + moderate_mask[i] = False + if o > max_occlusion[2] or h <= min_height[2] or t > max_trunc[2]: + hard_mask[i] = False + i += 1 + is_easy = easy_mask + is_moderate = np.logical_xor(easy_mask, moderate_mask) + is_hard = np.logical_xor(hard_mask, moderate_mask) + + for i in range(len(dims)): + if is_easy[i]: + diff.append(0) + elif is_moderate[i]: + diff.append(1) + elif is_hard[i]: + diff.append(2) + else: + diff.append(-1) + annos["difficulty"] = np.array(diff, np.int32) + return diff + + +def get_label_anno(label_path): + annotations = {} + annotations.update({ + 'name': [], + 'truncated': [], + 'occluded': [], + 'alpha': [], + 'bbox': [], + 'dimensions': [], + 'location': [], + 'rotation_y': [] + }) + with open(label_path, 'r') as f: + lines = f.readlines() + # if len(lines) == 0 or len(lines[0]) < 15: + # content = [] + # else: + content = [line.strip().split(' ') for line in lines] + annotations['name'] = np.array([x[0] for x in content]) + annotations['truncated'] = np.array([float(x[1]) for x in content]) + annotations['occluded'] = np.array([int(x[2]) for x in content]) + annotations['alpha'] = np.array([float(x[3]) for x in content]) + annotations['bbox'] = np.array( + [[float(info) for info in x[4:8]] for x in content]).reshape(-1, 4) + # dimensions will convert hwl format to standard lhw(camera) format. + annotations['dimensions'] = np.array( + [[float(info) for info in x[8:11]] for x in content]).reshape( + -1, 3)[:, [2, 0, 1]] + annotations['location'] = np.array( + [[float(info) for info in x[11:14]] for x in content]).reshape(-1, 3) + annotations['rotation_y'] = np.array( + [float(x[14]) for x in content]).reshape(-1) + if len(content) != 0 and len(content[0]) == 16: # have score + annotations['score'] = np.array([float(x[15]) for x in content]) + else: + annotations['score'] = np.zeros([len(annotations['bbox'])]) + return annotations + +def get_label_annos(label_folder, image_ids=None): + if image_ids is None: + filepaths = pathlib.Path(label_folder).glob('*.txt') + prog = re.compile(r'^\d{6}.txt$') + filepaths = filter(lambda f: prog.match(f.name), filepaths) + image_ids = [int(p.stem) for p in filepaths] + image_ids = sorted(image_ids) + if not isinstance(image_ids, list): + image_ids = list(range(image_ids)) + annos = [] + label_folder = pathlib.Path(label_folder) + for idx in image_ids: + image_idx = get_image_index_str(idx) + label_filename = label_folder / (image_idx + '.txt') + annos.append(get_label_anno(label_filename)) + return annos + +def area(boxes, add1=False): + """Computes area of boxes. 
+ + Args: + boxes: Numpy array with shape [N, 4] holding N boxes + + Returns: + a numpy array with shape [N*1] representing box areas + """ + if add1: + return (boxes[:, 2] - boxes[:, 0] + 1.0) * ( + boxes[:, 3] - boxes[:, 1] + 1.0) + else: + return (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1]) + + +def intersection(boxes1, boxes2, add1=False): + """Compute pairwise intersection areas between boxes. + + Args: + boxes1: a numpy array with shape [N, 4] holding N boxes + boxes2: a numpy array with shape [M, 4] holding M boxes + + Returns: + a numpy array with shape [N*M] representing pairwise intersection area + """ + [y_min1, x_min1, y_max1, x_max1] = np.split(boxes1, 4, axis=1) + [y_min2, x_min2, y_max2, x_max2] = np.split(boxes2, 4, axis=1) + + all_pairs_min_ymax = np.minimum(y_max1, np.transpose(y_max2)) + all_pairs_max_ymin = np.maximum(y_min1, np.transpose(y_min2)) + if add1: + all_pairs_min_ymax += 1.0 + intersect_heights = np.maximum( + np.zeros(all_pairs_max_ymin.shape), + all_pairs_min_ymax - all_pairs_max_ymin) + + all_pairs_min_xmax = np.minimum(x_max1, np.transpose(x_max2)) + all_pairs_max_xmin = np.maximum(x_min1, np.transpose(x_min2)) + if add1: + all_pairs_min_xmax += 1.0 + intersect_widths = np.maximum( + np.zeros(all_pairs_max_xmin.shape), + all_pairs_min_xmax - all_pairs_max_xmin) + return intersect_heights * intersect_widths + + +def iou(boxes1, boxes2, add1=False): + """Computes pairwise intersection-over-union between box collections. + + Args: + boxes1: a numpy array with shape [N, 4] holding N boxes. + boxes2: a numpy array with shape [M, 4] holding N boxes. + + Returns: + a numpy array with shape [N, M] representing pairwise iou scores. + """ + intersect = intersection(boxes1, boxes2, add1) + area1 = area(boxes1, add1) + area2 = area(boxes2, add1) + union = np.expand_dims( + area1, axis=1) + np.expand_dims( + area2, axis=0) - intersect + return intersect / union diff --git a/PaddleCV/Paddle3D/PointRCNN/tools/kitti_object_eval_python/rotate_iou.py b/PaddleCV/Paddle3D/PointRCNN/tools/kitti_object_eval_python/rotate_iou.py new file mode 100644 index 0000000000000000000000000000000000000000..cd694ef5c5a0c9fac9595a17743a35db37d48820 --- /dev/null +++ b/PaddleCV/Paddle3D/PointRCNN/tools/kitti_object_eval_python/rotate_iou.py @@ -0,0 +1,329 @@ +##################### +# Based on https://github.com/hongzhenwang/RRPN-revise +# Licensed under The MIT License +# Author: yanyan, scrin@foxmail.com +##################### +import math + +import numba +import numpy as np +from numba import cuda + +@numba.jit(nopython=True) +def div_up(m, n): + return m // n + (m % n > 0) + +@cuda.jit('(float32[:], float32[:], float32[:])', device=True, inline=True) +def trangle_area(a, b, c): + return ((a[0] - c[0]) * (b[1] - c[1]) - (a[1] - c[1]) * + (b[0] - c[0])) / 2.0 + + +@cuda.jit('(float32[:], int32)', device=True, inline=True) +def area(int_pts, num_of_inter): + area_val = 0.0 + for i in range(num_of_inter - 2): + area_val += abs( + trangle_area(int_pts[:2], int_pts[2 * i + 2:2 * i + 4], + int_pts[2 * i + 4:2 * i + 6])) + return area_val + + +@cuda.jit('(float32[:], int32)', device=True, inline=True) +def sort_vertex_in_convex_polygon(int_pts, num_of_inter): + if num_of_inter > 0: + center = cuda.local.array((2, ), dtype=numba.float32) + center[:] = 0.0 + for i in range(num_of_inter): + center[0] += int_pts[2 * i] + center[1] += int_pts[2 * i + 1] + center[0] /= num_of_inter + center[1] /= num_of_inter + v = cuda.local.array((2, ), dtype=numba.float32) + vs = cuda.local.array((16, 
), dtype=numba.float32) + for i in range(num_of_inter): + v[0] = int_pts[2 * i] - center[0] + v[1] = int_pts[2 * i + 1] - center[1] + d = math.sqrt(v[0] * v[0] + v[1] * v[1]) + v[0] = v[0] / d + v[1] = v[1] / d + if v[1] < 0: + v[0] = -2 - v[0] + vs[i] = v[0] + j = 0 + temp = 0 + for i in range(1, num_of_inter): + if vs[i - 1] > vs[i]: + temp = vs[i] + tx = int_pts[2 * i] + ty = int_pts[2 * i + 1] + j = i + while j > 0 and vs[j - 1] > temp: + vs[j] = vs[j - 1] + int_pts[j * 2] = int_pts[j * 2 - 2] + int_pts[j * 2 + 1] = int_pts[j * 2 - 1] + j -= 1 + + vs[j] = temp + int_pts[j * 2] = tx + int_pts[j * 2 + 1] = ty + + +@cuda.jit( + '(float32[:], float32[:], int32, int32, float32[:])', + device=True, + inline=True) +def line_segment_intersection(pts1, pts2, i, j, temp_pts): + A = cuda.local.array((2, ), dtype=numba.float32) + B = cuda.local.array((2, ), dtype=numba.float32) + C = cuda.local.array((2, ), dtype=numba.float32) + D = cuda.local.array((2, ), dtype=numba.float32) + + A[0] = pts1[2 * i] + A[1] = pts1[2 * i + 1] + + B[0] = pts1[2 * ((i + 1) % 4)] + B[1] = pts1[2 * ((i + 1) % 4) + 1] + + C[0] = pts2[2 * j] + C[1] = pts2[2 * j + 1] + + D[0] = pts2[2 * ((j + 1) % 4)] + D[1] = pts2[2 * ((j + 1) % 4) + 1] + BA0 = B[0] - A[0] + BA1 = B[1] - A[1] + DA0 = D[0] - A[0] + CA0 = C[0] - A[0] + DA1 = D[1] - A[1] + CA1 = C[1] - A[1] + acd = DA1 * CA0 > CA1 * DA0 + bcd = (D[1] - B[1]) * (C[0] - B[0]) > (C[1] - B[1]) * (D[0] - B[0]) + if acd != bcd: + abc = CA1 * BA0 > BA1 * CA0 + abd = DA1 * BA0 > BA1 * DA0 + if abc != abd: + DC0 = D[0] - C[0] + DC1 = D[1] - C[1] + ABBA = A[0] * B[1] - B[0] * A[1] + CDDC = C[0] * D[1] - D[0] * C[1] + DH = BA1 * DC0 - BA0 * DC1 + Dx = ABBA * DC0 - BA0 * CDDC + Dy = ABBA * DC1 - BA1 * CDDC + temp_pts[0] = Dx / DH + temp_pts[1] = Dy / DH + return True + return False + + +@cuda.jit( + '(float32[:], float32[:], int32, int32, float32[:])', + device=True, + inline=True) +def line_segment_intersection_v1(pts1, pts2, i, j, temp_pts): + a = cuda.local.array((2, ), dtype=numba.float32) + b = cuda.local.array((2, ), dtype=numba.float32) + c = cuda.local.array((2, ), dtype=numba.float32) + d = cuda.local.array((2, ), dtype=numba.float32) + + a[0] = pts1[2 * i] + a[1] = pts1[2 * i + 1] + + b[0] = pts1[2 * ((i + 1) % 4)] + b[1] = pts1[2 * ((i + 1) % 4) + 1] + + c[0] = pts2[2 * j] + c[1] = pts2[2 * j + 1] + + d[0] = pts2[2 * ((j + 1) % 4)] + d[1] = pts2[2 * ((j + 1) % 4) + 1] + + area_abc = trangle_area(a, b, c) + area_abd = trangle_area(a, b, d) + + if area_abc * area_abd >= 0: + return False + + area_cda = trangle_area(c, d, a) + area_cdb = area_cda + area_abc - area_abd + + if area_cda * area_cdb >= 0: + return False + t = area_cda / (area_abd - area_abc) + + dx = t * (b[0] - a[0]) + dy = t * (b[1] - a[1]) + temp_pts[0] = a[0] + dx + temp_pts[1] = a[1] + dy + return True + + +@cuda.jit('(float32, float32, float32[:])', device=True, inline=True) +def point_in_quadrilateral(pt_x, pt_y, corners): + ab0 = corners[2] - corners[0] + ab1 = corners[3] - corners[1] + + ad0 = corners[6] - corners[0] + ad1 = corners[7] - corners[1] + + ap0 = pt_x - corners[0] + ap1 = pt_y - corners[1] + + abab = ab0 * ab0 + ab1 * ab1 + abap = ab0 * ap0 + ab1 * ap1 + adad = ad0 * ad0 + ad1 * ad1 + adap = ad0 * ap0 + ad1 * ap1 + + return abab >= abap and abap >= 0 and adad >= adap and adap >= 0 + + +@cuda.jit('(float32[:], float32[:], float32[:])', device=True, inline=True) +def quadrilateral_intersection(pts1, pts2, int_pts): + num_of_inter = 0 + for i in range(4): + if point_in_quadrilateral(pts1[2 * i], 
pts1[2 * i + 1], pts2): + int_pts[num_of_inter * 2] = pts1[2 * i] + int_pts[num_of_inter * 2 + 1] = pts1[2 * i + 1] + num_of_inter += 1 + if point_in_quadrilateral(pts2[2 * i], pts2[2 * i + 1], pts1): + int_pts[num_of_inter * 2] = pts2[2 * i] + int_pts[num_of_inter * 2 + 1] = pts2[2 * i + 1] + num_of_inter += 1 + temp_pts = cuda.local.array((2, ), dtype=numba.float32) + for i in range(4): + for j in range(4): + has_pts = line_segment_intersection(pts1, pts2, i, j, temp_pts) + if has_pts: + int_pts[num_of_inter * 2] = temp_pts[0] + int_pts[num_of_inter * 2 + 1] = temp_pts[1] + num_of_inter += 1 + + return num_of_inter + + +@cuda.jit('(float32[:], float32[:])', device=True, inline=True) +def rbbox_to_corners(corners, rbbox): + # generate clockwise corners and rotate it clockwise + angle = rbbox[4] + a_cos = math.cos(angle) + a_sin = math.sin(angle) + center_x = rbbox[0] + center_y = rbbox[1] + x_d = rbbox[2] + y_d = rbbox[3] + corners_x = cuda.local.array((4, ), dtype=numba.float32) + corners_y = cuda.local.array((4, ), dtype=numba.float32) + corners_x[0] = -x_d / 2 + corners_x[1] = -x_d / 2 + corners_x[2] = x_d / 2 + corners_x[3] = x_d / 2 + corners_y[0] = -y_d / 2 + corners_y[1] = y_d / 2 + corners_y[2] = y_d / 2 + corners_y[3] = -y_d / 2 + for i in range(4): + corners[2 * + i] = a_cos * corners_x[i] + a_sin * corners_y[i] + center_x + corners[2 * i + + 1] = -a_sin * corners_x[i] + a_cos * corners_y[i] + center_y + + +@cuda.jit('(float32[:], float32[:])', device=True, inline=True) +def inter(rbbox1, rbbox2): + corners1 = cuda.local.array((8, ), dtype=numba.float32) + corners2 = cuda.local.array((8, ), dtype=numba.float32) + intersection_corners = cuda.local.array((16, ), dtype=numba.float32) + + rbbox_to_corners(corners1, rbbox1) + rbbox_to_corners(corners2, rbbox2) + + num_intersection = quadrilateral_intersection(corners1, corners2, + intersection_corners) + sort_vertex_in_convex_polygon(intersection_corners, num_intersection) + # print(intersection_corners.reshape([-1, 2])[:num_intersection]) + + return area(intersection_corners, num_intersection) + + +@cuda.jit('(float32[:], float32[:], int32)', device=True, inline=True) +def devRotateIoUEval(rbox1, rbox2, criterion=-1): + area1 = rbox1[2] * rbox1[3] + area2 = rbox2[2] * rbox2[3] + area_inter = inter(rbox1, rbox2) + if criterion == -1: + return area_inter / (area1 + area2 - area_inter) + elif criterion == 0: + return area_inter / area1 + elif criterion == 1: + return area_inter / area2 + else: + return area_inter + +@cuda.jit('(int64, int64, float32[:], float32[:], float32[:], int32)', fastmath=False) +def rotate_iou_kernel_eval(N, K, dev_boxes, dev_query_boxes, dev_iou, criterion=-1): + threadsPerBlock = 8 * 8 + row_start = cuda.blockIdx.x + col_start = cuda.blockIdx.y + tx = cuda.threadIdx.x + row_size = min(N - row_start * threadsPerBlock, threadsPerBlock) + col_size = min(K - col_start * threadsPerBlock, threadsPerBlock) + block_boxes = cuda.shared.array(shape=(64 * 5, ), dtype=numba.float32) + block_qboxes = cuda.shared.array(shape=(64 * 5, ), dtype=numba.float32) + + dev_query_box_idx = threadsPerBlock * col_start + tx + dev_box_idx = threadsPerBlock * row_start + tx + if (tx < col_size): + block_qboxes[tx * 5 + 0] = dev_query_boxes[dev_query_box_idx * 5 + 0] + block_qboxes[tx * 5 + 1] = dev_query_boxes[dev_query_box_idx * 5 + 1] + block_qboxes[tx * 5 + 2] = dev_query_boxes[dev_query_box_idx * 5 + 2] + block_qboxes[tx * 5 + 3] = dev_query_boxes[dev_query_box_idx * 5 + 3] + block_qboxes[tx * 5 + 4] = 
dev_query_boxes[dev_query_box_idx * 5 + 4] + if (tx < row_size): + block_boxes[tx * 5 + 0] = dev_boxes[dev_box_idx * 5 + 0] + block_boxes[tx * 5 + 1] = dev_boxes[dev_box_idx * 5 + 1] + block_boxes[tx * 5 + 2] = dev_boxes[dev_box_idx * 5 + 2] + block_boxes[tx * 5 + 3] = dev_boxes[dev_box_idx * 5 + 3] + block_boxes[tx * 5 + 4] = dev_boxes[dev_box_idx * 5 + 4] + cuda.syncthreads() + if tx < row_size: + for i in range(col_size): + offset = row_start * threadsPerBlock * K + col_start * threadsPerBlock + tx * K + i + dev_iou[offset] = devRotateIoUEval(block_qboxes[i * 5:i * 5 + 5], + block_boxes[tx * 5:tx * 5 + 5], criterion) + + +def rotate_iou_gpu_eval(boxes, query_boxes, criterion=-1, device_id=0): + """rotated box iou running in gpu. 500x faster than cpu version + (take 5ms in one example with numba.cuda code). + convert from [this project]( + https://github.com/hongzhenwang/RRPN-revise/tree/master/lib/rotation). + + Args: + boxes (float tensor: [N, 5]): rbboxes. format: centers, dims, + angles(clockwise when positive) + query_boxes (float tensor: [K, 5]): [description] + device_id (int, optional): Defaults to 0. [description] + + Returns: + [type]: [description] + """ + box_dtype = boxes.dtype + boxes = boxes.astype(np.float32) + query_boxes = query_boxes.astype(np.float32) + N = boxes.shape[0] + K = query_boxes.shape[0] + iou = np.zeros((N, K), dtype=np.float32) + if N == 0 or K == 0: + return iou + threadsPerBlock = 8 * 8 + cuda.select_device(device_id) + blockspergrid = (div_up(N, threadsPerBlock), div_up(K, threadsPerBlock)) + + stream = cuda.stream() + with stream.auto_synchronize(): + boxes_dev = cuda.to_device(boxes.reshape([-1]), stream) + query_boxes_dev = cuda.to_device(query_boxes.reshape([-1]), stream) + iou_dev = cuda.to_device(iou.reshape([-1]), stream) + rotate_iou_kernel_eval[blockspergrid, threadsPerBlock, stream]( + N, K, boxes_dev, query_boxes_dev, iou_dev, criterion) + iou_dev.copy_to_host(iou.reshape([-1]), stream=stream) + return iou.astype(boxes.dtype) diff --git a/PaddleCV/Paddle3D/PointRCNN/train.py b/PaddleCV/Paddle3D/PointRCNN/train.py new file mode 100644 index 0000000000000000000000000000000000000000..b7a39ca4555defbabdeee204c954bbcdfb7f8ee9 --- /dev/null +++ b/PaddleCV/Paddle3D/PointRCNN/train.py @@ -0,0 +1,240 @@ +# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve. +# +#Licensed under the Apache License, Version 2.0 (the "License"); +#you may not use this file except in compliance with the License. +#You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +#Unless required by applicable law or agreed to in writing, software +#distributed under the License is distributed on an "AS IS" BASIS, +#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +#See the License for the specific language governing permissions and +#limitations under the License. 
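+#
+# Training entry point for PointRCNN. A typical invocation (all flags are
+# defined in parse_args below; see the README for the full workflow):
+#
+#   export CUDA_VISIBLE_DEVICES=0
+#   python train.py --cfg=cfgs/default.yml --train_mode=rpn \
+#       --batch_size=16 --epoch=200 --save_dir=checkpoints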
+
+import os
+import sys
+import time
+import shutil
+import argparse
+import logging
+import numpy as np
+import paddle
+import paddle.fluid as fluid
+from paddle.fluid.layers import control_flow
+from paddle.fluid.contrib.extend_optimizer import extend_with_decoupled_weight_decay
+import paddle.fluid.layers.learning_rate_scheduler as lr_scheduler
+
+from models.point_rcnn import PointRCNN
+from data.kitti_rcnn_reader import KittiRCNNReader
+from utils.run_utils import *
+from utils.config import cfg, load_config, set_config_from_list
+from utils.optimizer import optimize
+
+logging.root.handlers = []
+FORMAT = '%(asctime)s-%(levelname)s: %(message)s'
+logging.basicConfig(level=logging.INFO, format=FORMAT, stream=sys.stdout)
+logger = logging.getLogger(__name__)
+
+
+def parse_args():
+    parser = argparse.ArgumentParser("PointRCNN 3D object detection training script")
+    parser.add_argument(
+        '--cfg',
+        type=str,
+        default='cfgs/default.yml',
+        help='specify the config for training')
+    parser.add_argument(
+        '--train_mode',
+        type=str,
+        default='rpn',
+        required=True,
+        help='specify the training mode')
+    parser.add_argument(
+        '--batch_size',
+        type=int,
+        default=16,
+        required=True,
+        help='training batch size, default 16')
+    parser.add_argument(
+        '--epoch',
+        type=int,
+        default=200,
+        required=True,
+        help='epoch number, default 200')
+    parser.add_argument(
+        '--save_dir',
+        type=str,
+        default='checkpoints',
+        help='directory name to save training snapshots')
+    parser.add_argument(
+        '--resume',
+        type=str,
+        default=None,
+        help='path to resume training based on previous checkpoints. '
+        'None for not resuming any checkpoints.')
+    parser.add_argument(
+        '--resume_epoch',
+        type=int,
+        default=0,
+        help='resume epoch id')
+    parser.add_argument(
+        '--data_dir',
+        type=str,
+        default='./data',
+        help='KITTI dataset root directory')
+    parser.add_argument(
+        '--gt_database',
+        type=str,
+        default='data/gt_database/train_gt_database_3level_Car.pkl',
+        help='generated gt database for augmentation')
+    parser.add_argument(
+        '--rcnn_training_roi_dir',
+        type=str,
+        default=None,
+        help='specify the saved rois for rcnn training when using rcnn_offline mode')
+    parser.add_argument(
+        '--rcnn_training_feature_dir',
+        type=str,
+        default=None,
+        help='specify the saved features for rcnn training when using rcnn_offline mode')
+    parser.add_argument(
+        '--log_interval',
+        type=int,
+        default=1,
+        help='mini-batch interval to log.')
+    parser.add_argument(
+        '--set',
+        dest='set_cfgs',
+        default=None,
+        nargs=argparse.REMAINDER,
+        help='set extra config keys if needed.')
+    args = parser.parse_args()
+    return args
+
+
+def train():
+    args = parse_args()
+    print_arguments(args)
+    # check whether the installed paddle is compiled with GPU
+    # PointRCNN model can only run on GPU
+    check_gpu(True)
+
+    load_config(args.cfg)
+    if args.set_cfgs is not None:
+        set_config_from_list(args.set_cfgs)
+
+    if args.train_mode == 'rpn':
+        cfg.RPN.ENABLED = True
+        cfg.RCNN.ENABLED = False
+    elif args.train_mode == 'rcnn':
+        cfg.RCNN.ENABLED = True
+        cfg.RPN.ENABLED = cfg.RPN.FIXED = True
+    elif args.train_mode == 'rcnn_offline':
+        cfg.RCNN.ENABLED = True
+        cfg.RPN.ENABLED = False
+    else:
+        raise NotImplementedError("unknown train mode: {}".format(args.train_mode))
+
+    checkpoints_dir = os.path.join(args.save_dir, args.train_mode)
+    if not os.path.isdir(checkpoints_dir):
+        os.makedirs(checkpoints_dir)
+
+    kitti_rcnn_reader = KittiRCNNReader(data_dir=args.data_dir,
+                                        npoints=cfg.RPN.NUM_POINTS,
+                                        split=cfg.TRAIN.SPLIT,
+                                        
mode='TRAIN', + classes=cfg.CLASSES, + rcnn_training_roi_dir=args.rcnn_training_roi_dir, + rcnn_training_feature_dir=args.rcnn_training_feature_dir, + gt_database_dir=args.gt_database) + num_samples = len(kitti_rcnn_reader) + steps_per_epoch = int(num_samples / args.batch_size) + logger.info("Total {} samples, {} batch per epoch.".format(num_samples, steps_per_epoch)) + boundaries = [i * steps_per_epoch for i in cfg.TRAIN.DECAY_STEP_LIST] + values = [cfg.TRAIN.LR * (cfg.TRAIN.LR_DECAY ** i) for i in range(len(boundaries) + 1)] + + place = fluid.CUDAPlace(0) + exe = fluid.Executor(place) + + # build model + startup = fluid.Program() + train_prog = fluid.Program() + with fluid.program_guard(train_prog, startup): + with fluid.unique_name.guard(): + train_model = PointRCNN(cfg, args.batch_size, True, 'TRAIN') + train_model.build() + train_pyreader = train_model.get_pyreader() + train_feeds = train_model.get_feeds() + train_outputs = train_model.get_outputs() + train_loss = train_outputs['loss'] + lr = optimize(train_loss, + learning_rate=cfg.TRAIN.LR, + warmup_factor=1. / cfg.TRAIN.DIV_FACTOR, + decay_factor=1e-5, + total_step=steps_per_epoch * args.epoch, + warmup_pct=cfg.TRAIN.PCT_START, + train_program=train_prog, + startup_prog=startup, + weight_decay=cfg.TRAIN.WEIGHT_DECAY, + clip_norm=cfg.TRAIN.GRAD_NORM_CLIP) + train_keys, train_values = parse_outputs(train_outputs, 'loss') + + exe.run(startup) + + if args.resume: + assert os.path.exists(args.resume), \ + "Given resume weight dir {} not exist.".format(args.resume) + def if_exist(var): + logger.debug("{}: {}".format(var.name, os.path.exists(os.path.join(args.resume, var.name)))) + return os.path.exists(os.path.join(args.resume, var.name)) + fluid.io.load_vars( + exe, args.resume, predicate=if_exist, main_program=train_prog) + + build_strategy = fluid.BuildStrategy() + build_strategy.memory_optimize = False + build_strategy.enable_inplace = False + build_strategy.fuse_all_optimizer_ops = False + train_compile_prog = fluid.compiler.CompiledProgram( + train_prog).with_data_parallel(loss_name=train_loss.name, + build_strategy=build_strategy) + + def save_model(exe, prog, path): + if os.path.isdir(path): + shutil.rmtree(path) + logger.info("Save model to {}".format(path)) + fluid.io.save_persistables(exe, path, prog) + + # get reader + train_reader = kitti_rcnn_reader.get_multiprocess_reader(args.batch_size, train_feeds, drop_last=True) + train_pyreader.decorate_sample_list_generator(train_reader, place) + + train_stat = Stat() + for epoch_id in range(args.resume_epoch, args.epoch): + try: + train_pyreader.start() + train_iter = 0 + train_periods = [] + while True: + cur_time = time.time() + train_outs = exe.run(train_compile_prog, fetch_list=train_values + [lr.name]) + period = time.time() - cur_time + train_periods.append(period) + train_stat.update(train_keys, train_outs[:-1]) + if train_iter % args.log_interval == 0: + log_str = "" + for name, values in zip(train_keys + ['learning_rate'], train_outs): + log_str += "{}: {:.6f}, ".format(name, np.mean(values)) + logger.info("[TRAIN] Epoch {}, batch {}: {}time: {:.2f}".format(epoch_id, train_iter, log_str, period)) + train_iter += 1 + except fluid.core.EOFException: + logger.info("[TRAIN] Epoch {} finished, {}average time: {:.2f}".format(epoch_id, train_stat.get_mean_log(), np.mean(train_periods[2:]))) + save_model(exe, train_prog, os.path.join(checkpoints_dir, str(epoch_id))) + train_stat.reset() + train_periods = [] + finally: + train_pyreader.reset() + + +if __name__ == "__main__": + 
train() diff --git a/PaddleCV/Paddle3D/PointRCNN/utils/__init__.py b/PaddleCV/Paddle3D/PointRCNN/utils/__init__.py new file mode 100644 index 0000000000000000000000000000000000000000..cad1d5d9ab5b0e5ed0724ddfc65ef53d14044b76 --- /dev/null +++ b/PaddleCV/Paddle3D/PointRCNN/utils/__init__.py @@ -0,0 +1,14 @@ +# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve. +# +#Licensed under the Apache License, Version 2.0 (the "License"); +#you may not use this file except in compliance with the License. +#You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +#Unless required by applicable law or agreed to in writing, software +#distributed under the License is distributed on an "AS IS" BASIS, +#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +#See the License for the specific language governing permissions and +#limitations under the License. + diff --git a/PaddleCV/Paddle3D/PointRCNN/utils/box_utils.py b/PaddleCV/Paddle3D/PointRCNN/utils/box_utils.py new file mode 100644 index 0000000000000000000000000000000000000000..49c9ee74a64634e1836d081220996919ffae16a4 --- /dev/null +++ b/PaddleCV/Paddle3D/PointRCNN/utils/box_utils.py @@ -0,0 +1,275 @@ +# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve. +# +#Licensed under the Apache License, Version 2.0 (the "License"); +#you may not use this file except in compliance with the License. +#You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +#Unless required by applicable law or agreed to in writing, software +#distributed under the License is distributed on an "AS IS" BASIS, +#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +#See the License for the specific language governing permissions and +#limitations under the License. 
+""" +Contains proposal functions +""" + +from __future__ import absolute_import +from __future__ import division +from __future__ import print_function + +import numpy as np +import paddle.fluid as fluid + +from utils.config import cfg + +__all__ = ["boxes3d_to_bev", "box_overlap_rotate", "boxes3d_to_bev", "box_iou", "box_nms"] + + +def boxes3d_to_bev(boxes3d): + """ + Args: + boxes3d: [N, 7], (x, y, z, h, w, l, ry) + Return: + boxes_bev: [N, 5], (x1, y1, x2, y2, ry) + """ + boxes_bev = np.zeros((boxes3d.shape[0], 5), dtype='float32') + + cu, cv = boxes3d[:, 0], boxes3d[:, 2] + half_l, half_w = boxes3d[:, 5] / 2, boxes3d[:, 4] / 2 + boxes_bev[:, 0], boxes_bev[:, 1] = cu - half_l, cv - half_w + boxes_bev[:, 2], boxes_bev[:, 3] = cu + half_l, cv + half_w + boxes_bev[:, 4] = boxes3d[:, 6] + return boxes_bev + + +def rotate_around_center(center, angle_cos, angle_sin, corners): + new_x = (corners[:, 0] - center[0]) * angle_cos + \ + (corners[:, 1] - center[1]) * angle_sin + center[0] + new_y = -(corners[:, 0] - center[0]) * angle_sin + \ + (corners[:, 1] - center[1]) * angle_cos + center[1] + return np.concatenate([new_x[:, np.newaxis], new_y[:, np.newaxis]], axis=-1) + + +def check_rect_cross(p1, p2, q1, q2): + return min(p1[0], p2[0]) <= max(q1[0], q2[0]) and \ + min(q1[0], q2[0]) <= max(p1[0], p2[0]) and \ + min(p1[1], p2[1]) <= max(q1[1], q2[1]) and \ + min(q1[1], q2[1]) <= max(p1[1], p2[1]) + + +def cross(p1, p2, p0): + return (p1[0] - p0[0]) * (p2[1] - p0[1]) - (p2[0] - p0[0]) * (p1[1] - p0[1]); + + +def cross_area(a, b): + return a[0] * b[1] - a[1] * b[0] + + +def intersection(p1, p0, q1, q0): + if not check_rect_cross(p1, p0, q1, q0): + return None + + s1 = cross(q0, p1, p0) + s2 = cross(p1, q1, p0) + s3 = cross(p0, q1, q0) + s4 = cross(q1, p1, q0) + if not (s1 * s2 > 0 and s3 * s4 > 0): + return None + + s5 = cross(q1, p1, p0) + if np.abs(s5 - s1) > 1e-8: + return np.array([(s5 * q0[0] - s1 * q1[0]) / (s5 - s1), + (s5 * q0[1] - s1 * q1[1]) / (s5 - s1)], dtype='float32') + else: + a0 = p0[1] - p1[1] + b0 = p1[0] - p0[0] + c0 = p0[0] * p1[1] - p1[0] * p0[1] + a0 = q0[1] - q1[1] + b0 = q1[0] - q0[0] + c0 = q0[0] * q1[1] - q1[0] * q0[1] + D = a0 * b1 - a1 * b0 + return np.array([(b0 * c1 - b1 * c0) / D, (a1 * c0 - a0 * c1) / D], dtype='float32') + + +def check_in_box2d(box, p): + center_x = (box[0] + box[2]) / 2. + center_y = (box[1] + box[3]) / 2. + angle_cos = np.cos(-box[4]) + angle_sin = np.sin(-box[4]) + rot_x = (p[0] - center_x) * angle_cos + (p[1] - center_y) * angle_sin + center_x + rot_y = -(p[0] - center_x) * angle_sin + (p[1] - center_y) * angle_cos + center_y + return rot_x > box[0] - 1e-5 and rot_x < box[2] + 1e-5 and \ + rot_y > box[1] - 1e-5 and rot_y < box[3] + 1e-5 + + +def point_cmp(a, b, center): + return np.arctan2(a[1] - center[1], a[0] - center[0]) > \ + np.arctan2(b[1] - center[1], b[0] - center[0]) + + +def box_overlap_rotate(cur_box, boxes): + """ + Calculate box overlap with rotate, box: [x1, y1, x2, y2, angle] + """ + areas = np.zeros((len(boxes), ), dtype='float32') + cur_center = [(cur_box[0] + cur_box[2]) / 2., (cur_box[1] + cur_box[3]) / 2.] 
+ cur_corners = np.array([ + [cur_box[0], cur_box[1]], # (x1, y1) + [cur_box[2], cur_box[1]], # (x2, y1) + [cur_box[2], cur_box[3]], # (x2, y2) + [cur_box[0], cur_box[3]], # (x1, y2) + [cur_box[0], cur_box[1]], # (x1, y1) + ], dtype='float32') + cur_angle_cos = np.cos(cur_box[4]) + cur_angle_sin = np.sin(cur_box[4]) + cur_corners = rotate_around_center(cur_center, cur_angle_cos, cur_angle_sin, cur_corners) + + for i, box in enumerate(boxes): + box_center = [(box[0] + box[2]) / 2., (box[1] + box[3]) / 2.] + box_corners = np.array([ + [box[0], box[1]], + [box[2], box[1]], + [box[2], box[3]], + [box[0], box[3]], + [box[0], box[1]], + ], dtype='float32') + box_angle_cos = np.cos(box[4]) + box_angle_sin = np.sin(box[4]) + box_corners = rotate_around_center(box_center, box_angle_cos, box_angle_sin, box_corners) + + cross_points = np.zeros((16, 2), dtype='float32') + cnt = 0 + # get intersection of lines + for j in range(4): + for k in range(4): + inters = intersection(cur_corners[j + 1], cur_corners[j], + box_corners[k + 1], box_corners[k]) + if inters is not None: + cross_points[cnt, :] = inters + cnt += 1 + # check corners + for l in range(4): + if check_in_box2d(cur_box, box_corners[l]): + cross_points[cnt, :] = box_corners[l] + cnt += 1 + if check_in_box2d(box, cur_corners[l]): + cross_points[cnt, :] = cur_corners[l] + cnt += 1 + + if cnt > 0: + poly_center = np.sum(cross_points[:cnt, :], axis=0) / cnt + else: + poly_center = np.zeros((2,)) + + # sort the points of polygon + for j in range(cnt - 1): + for k in range(cnt - j - 1): + if point_cmp(cross_points[k], cross_points[k + 1], poly_center): + cross_points[k], cross_points[k + 1] = \ + cross_points[k + 1].copy(), cross_points[k].copy() + + # get the overlap areas + area = 0. + for j in range(cnt - 1): + area += cross_area(cross_points[j] - cross_points[0], + cross_points[j + 1] - cross_points[0]) + areas[i] = np.abs(area) / 2. + + return areas + + +def box_iou(cur_box, boxes, box_type='normal'): + cur_S = (cur_box[2] - cur_box[0]) * (cur_box[3] - cur_box[1]) + boxes_S = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1]) + + if box_type == 'normal': + inter_x1 = np.maximum(cur_box[0], boxes[:, 0]) + inter_y1 = np.maximum(cur_box[1], boxes[:, 1]) + inter_x2 = np.minimum(cur_box[2], boxes[:, 2]) + inter_y2 = np.minimum(cur_box[3], boxes[:, 3]) + inter_w = np.maximum(inter_x2 - inter_x1, 0.) + inter_h = np.maximum(inter_y2 - inter_y1, 0.) 
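+        # inter_w/inter_h are clipped at zero above, so disjoint boxes
+        # contribute zero intersection area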
+ inter_area = inter_w * inter_h + elif box_type == 'rotate': + inter_area = box_overlap_rotate(cur_box, boxes) + else: + raise NotImplementedError + + return inter_area / np.maximum(cur_S + boxes_S - inter_area, 1e-8) + + +def box_nms(boxes, scores, proposals, thresh, topk, nms_type='normal'): + assert nms_type in ['normal', 'rotate'], \ + "unknown nms type {}".format(nms_type) + order = np.argsort(-scores) + boxes = boxes[order] + scores = scores[order] + proposals = proposals[order] + + nmsed_scores = [] + nmsed_proposals = [] + cnt = 0 + while boxes.shape[0]: + nmsed_scores.append(scores[0]) + nmsed_proposals.append(proposals[0]) + cnt +=1 + if cnt >= topk or boxes.shape[0] == 1: + break + iou = box_iou(boxes[0], boxes[1:], nms_type) + boxes = boxes[1:][iou < thresh] + scores = scores[1:][iou < thresh] + proposals = proposals[1:][iou < thresh] + return nmsed_scores, nmsed_proposals + + +def box_nms_eval(boxes, scores, proposals, thresh, nms_type='rotate'): + assert nms_type in ['normal', 'rotate'], \ + "unknown nms type {}".format(nms_type) + order = np.argsort(-scores) + boxes = boxes[order] + scores = scores[order] + proposals = proposals[order] + + nmsed_scores = [] + nmsed_proposals = [] + while boxes.shape[0]: + nmsed_scores.append(scores[0]) + nmsed_proposals.append(proposals[0]) + iou = box_iou(boxes[0], boxes[1:], nms_type) + inds = iou < thresh + boxes = boxes[1:][inds] + scores = scores[1:][inds] + proposals = proposals[1:][inds] + nmsed_scores = np.asarray(nmsed_scores) + nmsed_proposals = np.asarray(nmsed_proposals) + return nmsed_scores, nmsed_proposals + +def boxes_iou3d(boxes1, boxes2): + boxes1_bev = boxes3d_to_bev(boxes1) + boxes2_bev = boxes3d_to_bev(boxes2) + + # bev overlap + overlaps_bev = np.zeros((boxes1_bev.shape[0], boxes2_bev.shape[0])) + for i in range(boxes1_bev.shape[0]): + overlaps_bev[i, :] = box_overlap_rotate(boxes1_bev[i], boxes2_bev) + + # height overlap + boxes1_height_min = (boxes1[:, 1] - boxes1[:, 3]).reshape(-1, 1) + boxes1_height_max = boxes1[:, 1].reshape(-1, 1) + boxes2_height_min = (boxes2[:, 1] - boxes2[:, 3]).reshape(1, -1) + boxes2_height_max = boxes2[:, 1].reshape(1, -1) + + max_of_min = np.maximum(boxes1_height_min, boxes2_height_min) + min_of_max = np.minimum(boxes1_height_max, boxes2_height_max) + overlaps_h = np.maximum(min_of_max - max_of_min, 0.) 
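+    # boxes are in rect camera coords where y points down and boxes3d[:, 1]
+    # is the bottom face, so each box spans [y - h, y]; overlaps_h is the
+    # clipped intersection of these vertical intervals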
+ + # 3d iou + overlaps_3d = overlaps_bev * overlaps_h + + vol_a = (boxes1[:, 3] * boxes1[:, 4] * boxes1[:, 5]).reshape(-1, 1) + vol_b = (boxes2[:, 3] * boxes2[:, 4] * boxes2[:, 5]).reshape(1, -1) + iou3d = overlaps_3d / np.maximum(vol_a + vol_b - overlaps_3d, 1e-7) + + return iou3d diff --git a/PaddleCV/Paddle3D/PointRCNN/utils/calibration.py b/PaddleCV/Paddle3D/PointRCNN/utils/calibration.py new file mode 100644 index 0000000000000000000000000000000000000000..41fcf279db5a194c5dcc81ae8dafa48b088a42bc --- /dev/null +++ b/PaddleCV/Paddle3D/PointRCNN/utils/calibration.py @@ -0,0 +1,143 @@ +""" +This code is borrow from https://github.com/sshaoshuai/PointRCNN/blob/master/lib/utils/kitti_utils.py +""" +import numpy as np +import os + + +def get_calib_from_file(calib_file): + with open(calib_file) as f: + lines = f.readlines() + + obj = lines[2].strip().split(' ')[1:] + P2 = np.array(obj, dtype=np.float32) + obj = lines[3].strip().split(' ')[1:] + P3 = np.array(obj, dtype=np.float32) + obj = lines[4].strip().split(' ')[1:] + R0 = np.array(obj, dtype=np.float32) + obj = lines[5].strip().split(' ')[1:] + Tr_velo_to_cam = np.array(obj, dtype=np.float32) + + return {'P2': P2.reshape(3, 4), + 'P3': P3.reshape(3, 4), + 'R0': R0.reshape(3, 3), + 'Tr_velo2cam': Tr_velo_to_cam.reshape(3, 4)} + + +class Calibration(object): + def __init__(self, calib_file): + if isinstance(calib_file, str): + calib = get_calib_from_file(calib_file) + else: + calib = calib_file + + self.P2 = calib['P2'] # 3 x 4 + self.R0 = calib['R0'] # 3 x 3 + self.V2C = calib['Tr_velo2cam'] # 3 x 4 + + # Camera intrinsics and extrinsics + self.cu = self.P2[0, 2] + self.cv = self.P2[1, 2] + self.fu = self.P2[0, 0] + self.fv = self.P2[1, 1] + self.tx = self.P2[0, 3] / (-self.fu) + self.ty = self.P2[1, 3] / (-self.fv) + + def cart_to_hom(self, pts): + """ + :param pts: (N, 3 or 2) + :return pts_hom: (N, 4 or 3) + """ + pts_hom = np.hstack((pts, np.ones((pts.shape[0], 1), dtype=np.float32))) + return pts_hom + + def lidar_to_rect(self, pts_lidar): + """ + :param pts_lidar: (N, 3) + :return pts_rect: (N, 3) + """ + pts_lidar_hom = self.cart_to_hom(pts_lidar) + pts_rect = np.dot(pts_lidar_hom, np.dot(self.V2C.T, self.R0.T)) + # pts_rect = reduce(np.dot, (pts_lidar_hom, self.V2C.T, self.R0.T)) + return pts_rect + + def rect_to_img(self, pts_rect): + """ + :param pts_rect: (N, 3) + :return pts_img: (N, 2) + """ + pts_rect_hom = self.cart_to_hom(pts_rect) + pts_2d_hom = np.dot(pts_rect_hom, self.P2.T) + pts_img = (pts_2d_hom[:, 0:2].T / pts_rect_hom[:, 2]).T # (N, 2) + pts_rect_depth = pts_2d_hom[:, 2] - self.P2.T[3, 2] # depth in rect camera coord + return pts_img, pts_rect_depth + + def lidar_to_img(self, pts_lidar): + """ + :param pts_lidar: (N, 3) + :return pts_img: (N, 2) + """ + pts_rect = self.lidar_to_rect(pts_lidar) + pts_img, pts_depth = self.rect_to_img(pts_rect) + return pts_img, pts_depth + + def img_to_rect(self, u, v, depth_rect): + """ + :param u: (N) + :param v: (N) + :param depth_rect: (N) + :return: + """ + x = ((u - self.cu) * depth_rect) / self.fu + self.tx + y = ((v - self.cv) * depth_rect) / self.fv + self.ty + pts_rect = np.concatenate((x.reshape(-1, 1), y.reshape(-1, 1), depth_rect.reshape(-1, 1)), axis=1) + return pts_rect + + def depthmap_to_rect(self, depth_map): + """ + :param depth_map: (H, W), depth_map + :return: + """ + x_range = np.arange(0, depth_map.shape[1]) + y_range = np.arange(0, depth_map.shape[0]) + x_idxs, y_idxs = np.meshgrid(x_range, y_range) + x_idxs, y_idxs = x_idxs.reshape(-1), y_idxs.reshape(-1) + 
depth = depth_map[y_idxs, x_idxs] + pts_rect = self.img_to_rect(x_idxs, y_idxs, depth) + return pts_rect, x_idxs, y_idxs + + def corners3d_to_img_boxes(self, corners3d): + """ + :param corners3d: (N, 8, 3) corners in rect coordinate + :return: boxes: (None, 4) [x1, y1, x2, y2] in rgb coordinate + :return: boxes_corner: (None, 8) [xi, yi] in rgb coordinate + """ + sample_num = corners3d.shape[0] + corners3d_hom = np.concatenate((corners3d, np.ones((sample_num, 8, 1))), axis=2) # (N, 8, 4) + + img_pts = np.matmul(corners3d_hom, self.P2.T) # (N, 8, 3) + + x, y = img_pts[:, :, 0] / img_pts[:, :, 2], img_pts[:, :, 1] / img_pts[:, :, 2] + x1, y1 = np.min(x, axis=1), np.min(y, axis=1) + x2, y2 = np.max(x, axis=1), np.max(y, axis=1) + + boxes = np.concatenate((x1.reshape(-1, 1), y1.reshape(-1, 1), x2.reshape(-1, 1), y2.reshape(-1, 1)), axis=1) + boxes_corner = np.concatenate((x.reshape(-1, 8, 1), y.reshape(-1, 8, 1)), axis=2) + + return boxes, boxes_corner + + def camera_dis_to_rect(self, u, v, d): + """ + Can only process valid u, v, d, which means u, v can not beyond the image shape, reprojection error 0.02 + :param u: (N) + :param v: (N) + :param d: (N), the distance between camera and 3d points, d^2 = x^2 + y^2 + z^2 + :return: + """ + assert self.fu == self.fv, '%.8f != %.8f' % (self.fu, self.fv) + fd = np.sqrt((u - self.cu)**2 + (v - self.cv)**2 + self.fu**2) + x = ((u - self.cu) * d) / fd + self.tx + y = ((v - self.cv) * d) / fd + self.ty + z = np.sqrt(d**2 - x**2 - y**2) + pts_rect = np.concatenate((x.reshape(-1, 1), y.reshape(-1, 1), z.reshape(-1, 1)), axis=1) + return pts_rect diff --git a/PaddleCV/Paddle3D/PointRCNN/utils/config.py b/PaddleCV/Paddle3D/PointRCNN/utils/config.py new file mode 100644 index 0000000000000000000000000000000000000000..dc24aee5253576e3e5f78b8ed246af51c06279ba --- /dev/null +++ b/PaddleCV/Paddle3D/PointRCNN/utils/config.py @@ -0,0 +1,279 @@ +# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve. +# +#Licensed under the Apache License, Version 2.0 (the "License"); +#you may not use this file except in compliance with the License. +#You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +#Unless required by applicable law or agreed to in writing, software +#distributed under the License is distributed on an "AS IS" BASIS, +#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +#See the License for the specific language governing permissions and +#limitations under the License. +""" +This code is bases on https://github.com/sshaoshuai/PointRCNN/blob/master/lib/config.py +""" + +from __future__ import absolute_import +from __future__ import division +from __future__ import print_function + +import yaml +import numpy as np +from ast import literal_eval + +__all__ = ["load_config", "cfg"] + + +class AttrDict(dict): + def __init__(self, *args, **kwargs): + for arg in args: + for k, v in arg.items(): + if isinstance(v, dict): + arg[k] = AttrDict(v) + else: + arg[k] = v + super(AttrDict, self).__init__(*args, **kwargs) + + def __getattr__(self, name): + if name in self.__dict__: + return self.__dict__[name] + elif name in self: + return self[name] + else: + raise AttributeError(name) + + def __setattr__(self, name, value): + if name in self.__dict__: + self.__dict__[name] = value + else: + self[name] = value + + +__C = AttrDict() +cfg = __C + +# 0. 
basic config +__C.TAG = 'default' +__C.CLASSES = 'Car' + +__C.INCLUDE_SIMILAR_TYPE = False + +# config of augmentation +__C.AUG_DATA = True +__C.AUG_METHOD_LIST = ['rotation', 'scaling', 'flip'] +__C.AUG_METHOD_PROB = [0.5, 0.5, 0.5] +__C.AUG_ROT_RANGE = 18 + +__C.GT_AUG_ENABLED = False +__C.GT_EXTRA_NUM = 15 +__C.GT_AUG_RAND_NUM = False +__C.GT_AUG_APPLY_PROB = 0.75 +__C.GT_AUG_HARD_RATIO = 0.6 + +__C.PC_REDUCE_BY_RANGE = True +__C.PC_AREA_SCOPE = np.array([[-40, 40], + [-1, 3], + [0, 70.4]]) # x, y, z scope in rect camera coords + +__C.CLS_MEAN_SIZE = np.array([[1.52, 1.63, 3.88]], dtype=np.float32) + + +# 1. config of rpn network +__C.RPN = AttrDict() +__C.RPN.ENABLED = True +__C.RPN.FIXED = False + +__C.RPN.USE_INTENSITY = True + +# config of bin-based loss +__C.RPN.LOC_XZ_FINE = False +__C.RPN.LOC_SCOPE = 3.0 +__C.RPN.LOC_BIN_SIZE = 0.5 +__C.RPN.NUM_HEAD_BIN = 12 + +# config of network structure +__C.RPN.BACKBONE = 'pointnet2_msg' + +__C.RPN.USE_BN = True +__C.RPN.NUM_POINTS = 16384 + +__C.RPN.SA_CONFIG = AttrDict() +__C.RPN.SA_CONFIG.NPOINTS = [4096, 1024, 256, 64] +__C.RPN.SA_CONFIG.RADIUS = [[0.1, 0.5], [0.5, 1.0], [1.0, 2.0], [2.0, 4.0]] +__C.RPN.SA_CONFIG.NSAMPLE = [[16, 32], [16, 32], [16, 32], [16, 32]] +__C.RPN.SA_CONFIG.MLPS = [[[16, 16, 32], [32, 32, 64]], + [[64, 64, 128], [64, 96, 128]], + [[128, 196, 256], [128, 196, 256]], + [[256, 256, 512], [256, 384, 512]]] +__C.RPN.FP_MLPS = [[128, 128], [256, 256], [512, 512], [512, 512]] +__C.RPN.CLS_FC = [128] +__C.RPN.REG_FC = [128] +__C.RPN.DP_RATIO = 0.5 + +# config of training +__C.RPN.LOSS_CLS = 'DiceLoss' +__C.RPN.FG_WEIGHT = 15 +__C.RPN.FOCAL_ALPHA = [0.25, 0.75] +__C.RPN.FOCAL_GAMMA = 2.0 +__C.RPN.REG_LOSS_WEIGHT = [1.0, 1.0, 1.0, 1.0] +__C.RPN.LOSS_WEIGHT = [1.0, 1.0] +__C.RPN.NMS_TYPE = 'normal' # normal, rotate + +# config of testing +__C.RPN.SCORE_THRESH = 0.3 + + +# 2. 
config of rcnn network +__C.RCNN = AttrDict() +__C.RCNN.ENABLED = False + +# config of input +__C.RCNN.USE_RPN_FEATURES = True +__C.RCNN.USE_MASK = True +__C.RCNN.MASK_TYPE = 'seg' +__C.RCNN.USE_INTENSITY = False +__C.RCNN.USE_DEPTH = True +__C.RCNN.USE_SEG_SCORE = False +__C.RCNN.ROI_SAMPLE_JIT = False +__C.RCNN.ROI_FG_AUG_TIMES = 10 + +__C.RCNN.REG_AUG_METHOD = 'multiple' # multiple, single, normal +__C.RCNN.POOL_EXTRA_WIDTH = 1.0 + +# config of bin-based loss +__C.RCNN.LOC_SCOPE = 1.5 +__C.RCNN.LOC_BIN_SIZE = 0.5 +__C.RCNN.NUM_HEAD_BIN = 9 +__C.RCNN.LOC_Y_BY_BIN = False +__C.RCNN.LOC_Y_SCOPE = 0.5 +__C.RCNN.LOC_Y_BIN_SIZE = 0.25 +__C.RCNN.SIZE_RES_ON_ROI = False + +# config of network structure +__C.RCNN.USE_BN = False +__C.RCNN.DP_RATIO = 0.0 + +__C.RCNN.BACKBONE = 'pointnet' # pointnet, pointsift +__C.RCNN.XYZ_UP_LAYER = [128, 128] + +__C.RCNN.NUM_POINTS = 512 +__C.RCNN.SA_CONFIG = AttrDict() +__C.RCNN.SA_CONFIG.NPOINTS = [128, 32, -1] +__C.RCNN.SA_CONFIG.RADIUS = [0.2, 0.4, 100] +__C.RCNN.SA_CONFIG.NSAMPLE = [64, 64, 64] +__C.RCNN.SA_CONFIG.MLPS = [[128, 128, 128], + [128, 128, 256], + [256, 256, 512]] +__C.RCNN.CLS_FC = [256, 256] +__C.RCNN.REG_FC = [256, 256] + +# config of training +__C.RCNN.LOSS_CLS = 'BinaryCrossEntropy' +__C.RCNN.FOCAL_ALPHA = [0.25, 0.75] +__C.RCNN.FOCAL_GAMMA = 2.0 +__C.RCNN.CLS_WEIGHT = np.array([1.0, 1.0, 1.0], dtype=np.float32) +__C.RCNN.CLS_FG_THRESH = 0.6 +__C.RCNN.CLS_BG_THRESH = 0.45 +__C.RCNN.CLS_BG_THRESH_LO = 0.05 +__C.RCNN.REG_FG_THRESH = 0.55 +__C.RCNN.FG_RATIO = 0.5 +__C.RCNN.ROI_PER_IMAGE = 64 +__C.RCNN.HARD_BG_RATIO = 0.6 + +# config of testing +__C.RCNN.SCORE_THRESH = 0.3 +__C.RCNN.NMS_THRESH = 0.1 + + +# general training config +__C.TRAIN = AttrDict() +__C.TRAIN.SPLIT = 'train' +__C.TRAIN.VAL_SPLIT = 'smallval' + +__C.TRAIN.LR = 0.002 +__C.TRAIN.LR_CLIP = 0.00001 +__C.TRAIN.LR_DECAY = 0.5 +__C.TRAIN.DECAY_STEP_LIST = [50, 100, 150, 200, 250, 300] +__C.TRAIN.LR_WARMUP = False +__C.TRAIN.WARMUP_MIN = 0.0002 +__C.TRAIN.WARMUP_EPOCH = 5 + +__C.TRAIN.BN_MOMENTUM = 0.9 +__C.TRAIN.BN_DECAY = 0.5 +__C.TRAIN.BNM_CLIP = 0.01 +__C.TRAIN.BN_DECAY_STEP_LIST = [50, 100, 150, 200, 250, 300] + +__C.TRAIN.OPTIMIZER = 'adam' +__C.TRAIN.WEIGHT_DECAY = 0.0 # "L2 regularization coeff [default: 0.0]" +__C.TRAIN.MOMENTUM = 0.9 + +__C.TRAIN.MOMS = [0.95, 0.85] +__C.TRAIN.DIV_FACTOR = 10.0 +__C.TRAIN.PCT_START = 0.4 + +__C.TRAIN.GRAD_NORM_CLIP = 1.0 + +__C.TRAIN.RPN_PRE_NMS_TOP_N = 12000 +__C.TRAIN.RPN_POST_NMS_TOP_N = 2048 +__C.TRAIN.RPN_NMS_THRESH = 0.85 +__C.TRAIN.RPN_DISTANCE_BASED_PROPOSE = True + + +__C.TEST = AttrDict() +__C.TEST.SPLIT = 'val' +__C.TEST.RPN_PRE_NMS_TOP_N = 9000 +__C.TEST.RPN_POST_NMS_TOP_N = 300 +__C.TEST.RPN_NMS_THRESH = 0.7 +__C.TEST.RPN_DISTANCE_BASED_PROPOSE = True + + +def load_config(fname): + """ + Load config from yaml file and merge into global cfg + """ + with open(fname) as f: + yml_cfg = AttrDict(yaml.load(f.read(), Loader=yaml.Loader)) + _merge_cfg_a_to_b(yml_cfg, __C) + + +def set_config_from_list(cfg_list): + assert len(cfg_list) % 2 == 0, "cfgs list length invalid" + for k, v in zip(cfg_list[0::2], cfg_list[1::2]): + key_list = k.split('.') + d = __C + for subkey in key_list[:-1]: + assert subkey in d + d = d[subkey] + subkey = key_list[-1] + assert subkey in d + try: + value = literal_eval(v) + except: + # handle the case when v is a string literal + value = v + assert type(value) == type(d[subkey]), \ + 'type {} does not match original type {}'.format(type(value), type(d[subkey])) + d[subkey] = value + + +def 
_merge_cfg_a_to_b(a, b): + assert isinstance(a, AttrDict), \ + "unknown type {}".format(type(a)) + + for k, v in a.items(): + assert k in b, "unknown key {}".format(k) + if type(v) is not type(b[k]): + if isinstance(b[k], np.ndarray): + b[k] = np.array(v, dtype=b[k].dtype) + else: + raise TypeError("Config type mismatch") + if isinstance(v, AttrDict): + _merge_cfg_a_to_b(v, b[k]) + else: + b[k] = v + + +if __name__ == "__main__": + load_config("./cfgs/default.yml") diff --git a/PaddleCV/Paddle3D/PointRCNN/utils/cyops/__init__.py b/PaddleCV/Paddle3D/PointRCNN/utils/cyops/__init__.py new file mode 100644 index 0000000000000000000000000000000000000000..e02c54922625934fe1ab74a8c29e435f44f4d302 --- /dev/null +++ b/PaddleCV/Paddle3D/PointRCNN/utils/cyops/__init__.py @@ -0,0 +1,15 @@ +# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve. +# +#Licensed under the Apache License, Version 2.0 (the "License"); +#you may not use this file except in compliance with the License. +#You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +#Unless required by applicable law or agreed to in writing, software +#distributed under the License is distributed on an "AS IS" BASIS, +#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +#See the License for the specific language governing permissions and +#limitations under the License. + + diff --git a/PaddleCV/Paddle3D/PointRCNN/utils/cyops/iou3d_utils.pyx b/PaddleCV/Paddle3D/PointRCNN/utils/cyops/iou3d_utils.pyx new file mode 100644 index 0000000000000000000000000000000000000000..b2c7f3c7169c0a0f5da1adeeb029eec423daf39e --- /dev/null +++ b/PaddleCV/Paddle3D/PointRCNN/utils/cyops/iou3d_utils.pyx @@ -0,0 +1,195 @@ +# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve. +# +#Licensed under the Apache License, Version 2.0 (the "License"); +#you may not use this file except in compliance with the License. +#You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +#Unless required by applicable law or agreed to in writing, software +#distributed under the License is distributed on an "AS IS" BASIS, +#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +#See the License for the specific language governing permissions and +#limitations under the License. 
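+#
+# Cython helpers for BEV/3D IoU between rotated boxes, used in evaluation.
+# A minimal usage sketch, assuming the extension has been built (see
+# build_and_install.sh) so that utils.cyops.iou3d_utils is importable:
+#
+#   import numpy as np
+#   from utils.cyops import iou3d_utils
+#   # (N, 7) boxes in rect camera coords: [x, y, z, h, w, l, ry]
+#   boxes_a = np.array([[0., 1.5, 10., 1.5, 1.6, 3.9, 0.]], dtype=np.float32)
+#   boxes_b = np.array([[0.5, 1.5, 10., 1.5, 1.6, 3.9, 0.3]], dtype=np.float32)
+#   print(iou3d_utils.boxes_iou3d(boxes_a, boxes_b))  # (1, 1) IoU matrix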
+ +import cython +from math import pi, cos, sin +import numpy as np +cimport numpy as np + + +cdef class Point: + cdef float x, y + def __cinit__(self, x, y): + self.x = x + self.y = y + + def __add__(self, v): + if not isinstance(v, Point): + return NotImplemented + return Point(self.x + v.x, self.y + v.y) + + def __sub__(self, v): + if not isinstance(v, Point): + return NotImplemented + return Point(self.x - v.x, self.y - v.y) + + def cross(self, v): + if not isinstance(v, Point): + return NotImplemented + return self.x*v.y - self.y*v.x + + +cdef class Line: + cdef float a, b, c + # ax + by + c = 0 + def __cinit__(self, v1, v2): + self.a = v2.y - v1.y + self.b = v1.x - v2.x + self.c = v2.cross(v1) + + def __call__(self, p): + return self.a*p.x + self.b*p.y + self.c + + def intersection(self, other): + if not isinstance(other, Line): + return NotImplemented + w = self.a*other.b - self.b*other.a + return Point( + (self.b*other.c - self.c*other.b)/w, + (self.c*other.a - self.a*other.c)/w + ) + + +@cython.boundscheck(False) +@cython.wraparound(False) +def rectangle_vertices_(x1, y1, x2, y2, r): + + cx = (x1 + x2) / 2 + cy = (y1 + y2) / 2 + angle = r + cr = cos(angle) + sr = sin(angle) + # rotate around center + return ( + Point( + x=(x1-cx)*cr+(y1-cy)*sr+cx, + y=-(x1-cx)*sr+(y1-cy)*cr+cy + ), + Point( + x=(x2-cx)*cr+(y1-cy)*sr+cx, + y=-(x2-cx)*sr+(y1-cy)*cr+cy + ), + Point( + x=(x2-cx)*cr+(y2-cy)*sr+cx, + y=-(x2-cx)*sr+(y2-cy)*cr+cy + ), + Point( + x=(x1-cx)*cr+(y2-cy)*sr+cx, + y=-(x1-cx)*sr+(y2-cy)*cr+cy + ) + ) + +@cython.boundscheck(False) +@cython.wraparound(False) +def intersection_area(r1, r2): + # r1 and r2 are in (center, width, height, rotation) representation + # First convert these into a sequence of vertices + + rect1 = rectangle_vertices_(*r1) + rect2 = rectangle_vertices_(*r2) + + # Use the vertices of the first rectangle as + # starting vertices of the intersection polygon. + intersection = rect1 + + # Loop over the edges of the second rectangle + for p, q in zip(rect2, rect2[1:] + rect2[:1]): + if len(intersection) <= 2: + break # No intersection + + line = Line(p, q) + + # Any point p with line(p) <= 0 is on the "inside" (or on the boundary), + # any point p with line(p) > 0 is on the "outside". + + # Loop over the edges of the intersection polygon, + # and determine which part is inside and which is outside. + new_intersection = [] + line_values = [line(t) for t in intersection] + for s, t, s_value, t_value in zip( + intersection, intersection[1:] + intersection[:1], + line_values, line_values[1:] + line_values[:1]): + if s_value <= 0: + new_intersection.append(s) + if s_value * t_value < 0: + # Points are on opposite sides. + # Add the intersection of the lines to new_intersection. 
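+                # Line(s, t) cannot be parallel to `line` here, because s and
+                # t lie strictly on opposite sides of it, so the intersection
+                # point is well defined.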
+ intersection_point = line.intersection(Line(s, t)) + new_intersection.append(intersection_point) + + intersection = new_intersection + + # Calculate area + if len(intersection) <= 2: + return 0 + + return 0.5 * sum(p.x*q.y - p.y*q.x for p, q in zip(intersection, intersection[1:] + intersection[:1])) + + +def boxes3d_to_bev_(boxes3d): + """ + Args: + boxes3d: [N, 7], (x, y, z, h, w, l, ry) + Return: + boxes_bev: [N, 5], (x1, y1, x2, y2, ry) + """ + boxes_bev = np.zeros((boxes3d.shape[0], 5), dtype='float32') + cu, cv = boxes3d[:, 0], boxes3d[:, 2] + half_l, half_w = boxes3d[:, 5] / 2, boxes3d[:, 4] / 2 + boxes_bev[:, 0], boxes_bev[:, 1] = cu - half_l, cv - half_w + boxes_bev[:, 2], boxes_bev[:, 3] = cu + half_l, cv + half_w + boxes_bev[:, 4] = boxes3d[:, 6] + return boxes_bev + + +def boxes_iou3d(boxes_a, boxes_b): + """ + :param boxes_a: (N, 7) [x, y, z, h, w, l, ry] + :param boxes_b: (M, 7) [x, y, z, h, w, l, ry] + :return: + ans_iou: (M, N) + """ + boxes_a_bev = boxes3d_to_bev_(boxes_a) + boxes_b_bev = boxes3d_to_bev_(boxes_b) + # bev overlap + num_a = boxes_a_bev.shape[0] + num_b = boxes_b_bev.shape[0] + overlaps_bev = np.zeros((num_a, num_b), dtype=np.float32) + for i in range(num_a): + for j in range(num_b): + overlaps_bev[i][j] = intersection_area(boxes_a_bev[i], boxes_b_bev[j]) + + # height overlap + boxes_a_height_min = (boxes_a[:, 1] - boxes_a[:, 3]).reshape(-1, 1) + boxes_a_height_max = boxes_a[:, 1].reshape(-1, 1) + boxes_b_height_min = (boxes_b[:, 1] - boxes_b[:, 3]).reshape(1, -1) + boxes_b_height_max = boxes_b[:, 1].reshape(1, -1) + + max_of_min = np.maximum(boxes_a_height_min, boxes_b_height_min) + min_of_max = np.minimum(boxes_a_height_max, boxes_b_height_max) + overlaps_h = np.clip(min_of_max - max_of_min, a_min=0, a_max=np.inf) + # 3d iou + overlaps_3d = overlaps_bev * overlaps_h + + vol_a = (boxes_a[:, 3] * boxes_a[:, 4] * boxes_a[:, 5]).reshape(-1, 1) + vol_b = (boxes_b[:, 3] * boxes_b[:, 4] * boxes_b[:, 5]).reshape(1, -1) + + iou3d = overlaps_3d / np.clip(vol_a + vol_b - overlaps_3d, a_min=1e-7, a_max=np.inf) + return iou3d + +#if __name__ == '__main__': +# # (center, width, height, rotation) +# r1 = (10, 15, 15, 10, 30) +# r2 = (15, 15, 20, 10, 0) +# print(intersection_area(r1, r2)) diff --git a/PaddleCV/Paddle3D/PointRCNN/utils/cyops/kitti_utils.pyx b/PaddleCV/Paddle3D/PointRCNN/utils/cyops/kitti_utils.pyx new file mode 100644 index 0000000000000000000000000000000000000000..593dd0c9354516a2861701c5103f8e9b10ae46b1 --- /dev/null +++ b/PaddleCV/Paddle3D/PointRCNN/utils/cyops/kitti_utils.pyx @@ -0,0 +1,346 @@ +# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve. +# +#Licensed under the Apache License, Version 2.0 (the "License"); +#you may not use this file except in compliance with the License. +#You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +#Unless required by applicable law or agreed to in writing, software +#distributed under the License is distributed on an "AS IS" BASIS, +#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +#See the License for the specific language governing permissions and +#limitations under the License. + +import cython +import numpy as np +cimport numpy as np + +@cython.boundscheck(False) +@cython.wraparound(False) +def pts_in_boxes3d(np.ndarray pts_rect, np.ndarray boxes3d): + """ + :param pts: (N, 3) in rect-camera coords + :param boxes3d: (M, 7) + :return: boxes_pts_mask_list: (M), list with [(N), (N), ..] 
+    """
+    cdef float MAX_DIS = 10.0
+    cdef np.ndarray boxes_pts_mask_list = np.zeros((boxes3d.shape[0], pts_rect.shape[0]), dtype='int32')
+    cdef int boxes3d_num = boxes3d.shape[0]
+    cdef int pts_rect_num = pts_rect.shape[0]
+    cdef float cx, by, cz, h, w, l, angle, cy, cosa, sina, x_rot, z_rot
+    cdef float x, y, z  # point coordinates are floats; an int declaration would truncate them
+
+    for i in range(boxes3d_num):
+        cx, by, cz, h, w, l, angle = boxes3d[i, :]
+        cy = by - h / 2.
+        cosa = np.cos(angle)
+        sina = np.sin(angle)
+        for j in range(pts_rect_num):
+            x, y, z = pts_rect[j, :]
+
+            if np.abs(x - cx) > MAX_DIS or np.abs(y - cy) > h / 2. or np.abs(z - cz) > MAX_DIS:
+                continue
+
+            x_rot = (x - cx) * cosa + (z - cz) * (-sina)
+            z_rot = (x - cx) * sina + (z - cz) * cosa
+            boxes_pts_mask_list[i, j] = int(x_rot >= -l / 2. and x_rot <= l / 2. and
+                                            z_rot >= -w / 2. and z_rot <= w / 2.)
+    return boxes_pts_mask_list
+
+
+@cython.boundscheck(False)
+@cython.wraparound(False)
+def rotate_pc_along_y(np.ndarray pc, float rot_angle):
+    """
+    params pc: (N, 3+C), (N, 3) is in the rectified camera coordinate
+    params rot_angle: rad scalar
+    Output pc: updated pc with XYZ rotated
+    """
+    cosval = np.cos(rot_angle)
+    sinval = np.sin(rot_angle)
+    rotmat = np.array([[cosval, -sinval], [sinval, cosval]])
+    pc[:, [0, 2]] = np.dot(pc[:, [0, 2]], np.transpose(rotmat))
+    return pc
+
+
+@cython.boundscheck(False)
+@cython.wraparound(False)
+def rotate_pc_along_y_np(np.ndarray pc, np.ndarray rot_angle):
+    """
+    :param pc: (N, 512, 3 + C)
+    :param rot_angle: (N)
+    :return:
+    TODO: merge with rotate_pc_along_y_torch in bbox_transform.py
+    """
+    cdef np.ndarray cosa, sina, raw_1, raw_2, R, pc_temp
+    cosa = np.cos(rot_angle).reshape(-1, 1)
+    sina = np.sin(rot_angle).reshape(-1, 1)
+    raw_1 = np.concatenate([cosa, -sina], axis=1)
+    raw_2 = np.concatenate([sina, cosa], axis=1)
+    # per-sample rotation matrices, (N, 2, 2)
+    R = np.concatenate((np.expand_dims(raw_1, axis=1), np.expand_dims(raw_2, axis=1)), axis=1)
+    pc_temp = pc[:, :, [0, 2]]
+    pc[:, :, [0, 2]] = np.matmul(pc_temp, R.transpose(0, 2, 1))
+
+    return pc
+
+
+@cython.boundscheck(False)
+@cython.wraparound(False)
+def enlarge_box3d(np.ndarray boxes3d, float extra_width):
+    """
+    :param boxes3d: (N, 7) [x, y, z, h, w, l, ry]
+    """
+    cdef np.ndarray large_boxes3d
+    if isinstance(boxes3d, np.ndarray):
+        large_boxes3d = boxes3d.copy()
+    else:
+        large_boxes3d = boxes3d.clone()
+    large_boxes3d[:, 3:6] += extra_width * 2
+    large_boxes3d[:, 1] += extra_width
+
+    return large_boxes3d
+
+
+@cython.boundscheck(False)
+@cython.wraparound(False)
+def boxes3d_to_corners3d(np.ndarray boxes3d, bint rotate=True):
+    """
+    :param boxes3d: (N, 7) [x, y, z, h, w, l, ry]
+    :param rotate:
+    :return: corners3d: (N, 8, 3)
+    """
+    cdef int boxes_num = boxes3d.shape[0]
+    cdef np.ndarray h, w, l
+    h, w, l = boxes3d[:, 3], boxes3d[:, 4], boxes3d[:, 5]
+    cdef np.ndarray x_corners, y_corners, z_corners
+    x_corners = np.array([l / 2., l / 2., -l / 2., -l / 2., l / 2., l / 2., -l / 2., -l / 2.], dtype=np.float32).T  # (N, 8)
+    z_corners = np.array([w / 2., -w / 2., -w / 2., w / 2., w / 2., -w / 2., -w / 2., w / 2.], dtype=np.float32).T  # (N, 8)
+
+    y_corners = np.zeros((boxes_num, 8), dtype=np.float32)
+    y_corners[:, 4:8] = -h.reshape(boxes_num, 1).repeat(4, axis=1)  # (N, 8)
+
+    cdef np.ndarray ry, zeros, ones, rot_list, R_list, temp_corners, rotated_corners
+    if rotate:
+        ry = boxes3d[:, 6]
+        zeros, ones = np.zeros(ry.size, dtype=np.float32), np.ones(ry.size, dtype=np.float32)
+        rot_list = np.array([[np.cos(ry), zeros, -np.sin(ry)],
+                             [zeros, ones, zeros],
+                             [np.sin(ry), zeros, np.cos(ry)]])  # (3, 3, N)
+        R_list = np.transpose(rot_list, (2, 0, 1))  # (N, 3, 3)
+
+        temp_corners = np.concatenate((x_corners.reshape(-1, 8, 1), y_corners.reshape(-1, 8, 1),
+                                       z_corners.reshape(-1, 8, 1)), axis=2)  # (N, 8, 3)
+        rotated_corners = np.matmul(temp_corners, R_list)  # (N, 8, 3)
+        x_corners, y_corners, z_corners = rotated_corners[:, :, 0], rotated_corners[:, :, 1], rotated_corners[:, :, 2]
+
+    cdef np.ndarray x_loc, y_loc, z_loc
+    x_loc, y_loc, z_loc = boxes3d[:, 0], boxes3d[:, 1], boxes3d[:, 2]
+
+    cdef np.ndarray x, y, z, corners
+    x = x_loc.reshape(-1, 1) + x_corners.reshape(-1, 8)
+    y = y_loc.reshape(-1, 1) + y_corners.reshape(-1, 8)
+    z = z_loc.reshape(-1, 1) + z_corners.reshape(-1, 8)
+
+    corners = np.concatenate((x.reshape(-1, 8, 1), y.reshape(-1, 8, 1), z.reshape(-1, 8, 1)), axis=2).astype(np.float32)
+
+    return corners
+
+
+@cython.boundscheck(False)
+@cython.wraparound(False)
+def objs_to_boxes3d(obj_list):
+    cdef np.ndarray boxes3d = np.zeros((obj_list.__len__(), 7), dtype=np.float32)
+    cdef int k
+    for k, obj in enumerate(obj_list):
+        boxes3d[k, 0:3], boxes3d[k, 3], boxes3d[k, 4], boxes3d[k, 5], boxes3d[k, 6] \
+            = obj.pos, obj.h, obj.w, obj.l, obj.ry
+    return boxes3d
+
+
+@cython.boundscheck(False)
+@cython.wraparound(False)
+def objs_to_scores(obj_list):
+    cdef np.ndarray scores = np.zeros((obj_list.__len__()), dtype=np.float32)
+    cdef int k
+    for k, obj in enumerate(obj_list):
+        scores[k] = obj.score
+    return scores
+
+
+def get_iou3d(np.ndarray corners3d, np.ndarray query_corners3d, bint need_bev=False):
+    """
+    :param corners3d: (N, 8, 3) in rect coords
+    :param query_corners3d: (M, 8, 3)
+    :return:
+    """
+    from shapely.geometry import Polygon
+    A, B = corners3d, query_corners3d
+    N, M = A.shape[0], B.shape[0]
+    iou3d = np.zeros((N, M), dtype=np.float32)
+    iou_bev = np.zeros((N, M), dtype=np.float32)
+
+    # for height overlap: since y faces down, use the negative y
+    min_h_a = -A[:, 0:4, 1].sum(axis=1) / 4.0
+    max_h_a = -A[:, 4:8, 1].sum(axis=1) / 4.0
+    min_h_b = -B[:, 0:4, 1].sum(axis=1) / 4.0
+    max_h_b = -B[:, 4:8, 1].sum(axis=1) / 4.0
+
+    for i in range(N):
+        for j in range(M):
+            max_of_min = np.max([min_h_a[i], min_h_b[j]])
+            min_of_max = np.min([max_h_a[i], max_h_b[j]])
+            h_overlap = np.max([0, min_of_max - max_of_min])
+            if h_overlap == 0:
+                continue
+
+            bottom_a, bottom_b = Polygon(A[i, 0:4, [0, 2]].T), Polygon(B[j, 0:4, [0, 2]].T)
+            if bottom_a.is_valid and bottom_b.is_valid:
+                # intersection() is only safe on valid polygons: a valid Polygon
+                # must not have self-overlapping exterior or interior rings.
+                bottom_overlap = bottom_a.intersection(bottom_b).area
+            else:
+                bottom_overlap = 0.
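+            # 3D intersection volume = BEV polygon overlap x height overlap;
+            # the union follows inclusion-exclusion over the two box volumes.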
+ overlap3d = bottom_overlap * h_overlap + union3d = bottom_a.area * (max_h_a[i] - min_h_a[i]) + bottom_b.area * (max_h_b[j] - min_h_b[j]) - overlap3d + iou3d[i][j] = overlap3d / union3d + iou_bev[i][j] = bottom_overlap / (bottom_a.area + bottom_b.area - bottom_overlap) + + if need_bev: + return iou3d, iou_bev + + return iou3d + + +def get_objects_from_label(label_file): + import utils.object3d as object3d + + with open(label_file, 'r') as f: + lines = f.readlines() + objects = [object3d.Object3d(line) for line in lines] + return objects + + +@cython.boundscheck(False) +@cython.wraparound(False) +def _rotate_pc_along_y(np.ndarray pc, np.ndarray angle): + cdef np.ndarray cosa = np.cos(angle) + cosa=cosa.reshape(-1, 1) + cdef np.ndarray sina = np.sin(angle) + sina = sina.reshape(-1, 1) + + cdef np.ndarray R = np.concatenate([cosa, -sina, sina, cosa], axis=-1) + R = R.reshape(-1, 2, 2) + cdef np.ndarray pc_temp = pc[:, [0, 2]] + pc_temp = pc_temp.reshape(-1, 1, 2) + cdef np.ndarray pc_temp_1 = np.matmul(pc_temp, R.transpose(0, 2, 1)) + pc_temp_1 = pc_temp_1.reshape(-1, 2) + pc[:,[0,2]] = pc_temp_1 + + return pc + +@cython.boundscheck(False) +@cython.wraparound(False) +def decode_bbox_target( + np.ndarray roi_box3d, + np.ndarray pred_reg, + np.ndarray anchor_size, + float loc_scope, + float loc_bin_size, + int num_head_bin, + bint get_xz_fine=True, + float loc_y_scope=0.5, + float loc_y_bin_size=0.25, + bint get_y_by_bin=False, + bint get_ry_fine=False): + + cdef int per_loc_bin_num = int(loc_scope / loc_bin_size) * 2 + cdef int loc_y_bin_num = int(loc_y_scope / loc_y_bin_size) * 2 + + # recover xz localization + cdef int x_bin_l = 0 + cdef int x_bin_r = per_loc_bin_num + cdef int z_bin_l = per_loc_bin_num, + cdef int z_bin_r = per_loc_bin_num * 2 + cdef int start_offset = z_bin_r + cdef np.ndarray x_bin = np.argmax(pred_reg[:, x_bin_l: x_bin_r], axis=1) + cdef np.ndarray z_bin = np.argmax(pred_reg[:, z_bin_l: z_bin_r], axis=1) + + cdef np.ndarray pos_x = x_bin.astype('float32') * loc_bin_size + loc_bin_size / 2 - loc_scope + cdef np.ndarray pos_z = z_bin.astype('float32') * loc_bin_size + loc_bin_size / 2 - loc_scope + + if get_xz_fine: + x_res_l, x_res_r = per_loc_bin_num * 2, per_loc_bin_num * 3 + z_res_l, z_res_r = per_loc_bin_num * 3, per_loc_bin_num * 4 + start_offset = z_res_r + + x_res_norm = pred_reg[:, x_res_l:x_res_r][np.arange(len(x_bin)), x_bin] + z_res_norm = pred_reg[:, z_res_l:z_res_r][np.arange(len(z_bin)), z_bin] + + x_res = x_res_norm * loc_bin_size + z_res = z_res_norm * loc_bin_size + pos_x += x_res + pos_z += z_res + + # recover y localization + if get_y_by_bin: + y_bin_l, y_bin_r = start_offset, start_offset + loc_y_bin_num + y_res_l, y_res_r = y_bin_r, y_bin_r + loc_y_bin_num + start_offset = y_res_r + + y_bin = np.argmax(pred_reg[:, y_bin_l: y_bin_r], axis=1) + y_res_norm = pred_reg[:, y_res_l:y_res_r][np.arange(len(y_bin)), y_bin] + y_res = y_res_norm * loc_y_bin_size + pos_y = y_bin.astype('float32') * loc_y_bin_size + loc_y_bin_size / 2 - loc_y_scope + y_res + pos_y = pos_y + np.array(roi_box3d[:, 1]).reshape(-1) + else: + y_offset_l, y_offset_r = start_offset, start_offset + 1 + start_offset = y_offset_r + + pos_y = np.array(roi_box3d[:, 1]) + np.array(pred_reg[:, y_offset_l]) + pos_y = pos_y.reshape(-1) + + # recover ry rotation + cdef int ry_bin_l = start_offset, + cdef int ry_bin_r = start_offset + num_head_bin + cdef int ry_res_l = ry_bin_r, + cdef int ry_res_r = ry_bin_r + num_head_bin + + cdef np.ndarray ry_bin = np.argmax(pred_reg[:, ry_bin_l: ry_bin_r], 
axis=1) + cdef np.ndarray ry_res_norm = pred_reg[:, ry_res_l:ry_res_r][np.arange(len(ry_bin)), ry_bin] + if get_ry_fine: + # divide pi/2 into several bins + angle_per_class = (np.pi / 2) / num_head_bin + ry_res = ry_res_norm * (angle_per_class / 2) + ry = (ry_bin.astype('float32') * angle_per_class + angle_per_class / 2) + ry_res - np.pi / 4 + else: + angle_per_class = (2 * np.pi) / num_head_bin + ry_res = ry_res_norm * (angle_per_class / 2) + + # bin_center is (0, 30, 60, 90, 120, ..., 270, 300, 330) + ry = np.fmod(ry_bin.astype('float32') * angle_per_class + ry_res, 2 * np.pi) + ry[ry > np.pi] -= 2 * np.pi + + # recover size + cdef int size_res_l = ry_res_r + cdef int size_res_r = ry_res_r + 3 + assert size_res_r == pred_reg.shape[1] + + cdef np.ndarray size_res_norm = pred_reg[:, size_res_l: size_res_r] + cdef np.ndarray hwl = size_res_norm * anchor_size + anchor_size + + # shift to original coords + cdef np.ndarray roi_center = np.array(roi_box3d[:, 0:3]) + cdef np.ndarray shift_ret_box3d = np.concatenate(( + pos_x.reshape(-1, 1), + pos_y.reshape(-1, 1), + pos_z.reshape(-1, 1), + hwl, ry.reshape(-1, 1)), axis=1) + ret_box3d = shift_ret_box3d + if roi_box3d.shape[1] == 7: + roi_ry = np.array(roi_box3d[:, 6]).reshape(-1) + ret_box3d = _rotate_pc_along_y(np.array(shift_ret_box3d), -roi_ry) + ret_box3d[:, 6] += roi_ry + ret_box3d[:, [0, 2]] += roi_center[:, [0, 2]] + + return ret_box3d diff --git a/PaddleCV/Paddle3D/PointRCNN/utils/cyops/object3d.py b/PaddleCV/Paddle3D/PointRCNN/utils/cyops/object3d.py new file mode 100644 index 0000000000000000000000000000000000000000..97d81421afa89a0e26daa4f956c4d835763cb966 --- /dev/null +++ b/PaddleCV/Paddle3D/PointRCNN/utils/cyops/object3d.py @@ -0,0 +1,107 @@ +""" +This code is borrow from https://github.com/sshaoshuai/PointRCNN/blob/master/lib/utils/object3d.py +""" +import numpy as np + + +def cls_type_to_id(cls_type): + type_to_id = {'Car': 1, 'Pedestrian': 2, 'Cyclist': 3, 'Van': 4} + if cls_type not in type_to_id.keys(): + return -1 + return type_to_id[cls_type] + + +class Object3d(object): + + def __init__(self, line): + label = line.strip().split(' ') + self.src = line + self.cls_type = label[0] + self.cls_id = cls_type_to_id(self.cls_type) + self.trucation = float(label[1]) + self.occlusion = float(label[2]) # 0:fully visible 1:partly occluded 2:largely occluded 3:unknown + self.alpha = float(label[3]) + self.box2d = np.array((float(label[4]), float(label[5]), float(label[6]), float(label[7])), dtype=np.float32) + self.h = float(label[8]) + self.w = float(label[9]) + self.l = float(label[10]) + self.pos = np.array((float(label[11]), float(label[12]), float(label[13])), dtype=np.float32) + self.dis_to_cam = np.linalg.norm(self.pos) + self.ry = float(label[14]) + self.score = float(label[15]) if label.__len__() == 16 else -1.0 + self.level_str = None + self.level = self.get_obj_level() + + def get_obj_level(self): + height = float(self.box2d[3]) - float(self.box2d[1]) + 1 + + if height >= 40 and self.trucation <= 0.15 and self.occlusion <= 0: + self.level_str = 'Easy' + return 1 # Easy + elif height >= 25 and self.trucation <= 0.3 and self.occlusion <= 1: + self.level_str = 'Moderate' + return 2 # Moderate + elif height >= 25 and self.trucation <= 0.5 and self.occlusion <= 2: + self.level_str = 'Hard' + return 3 # Hard + else: + self.level_str = 'UnKnown' + return 4 + + def generate_corners3d(self): + """ + generate corners3d representation for this object + :return corners_3d: (8, 3) corners of box3d in camera coord + """ + l, h, w = self.l, 
self.h, self.w + x_corners = [l / 2, l / 2, -l / 2, -l / 2, l / 2, l / 2, -l / 2, -l / 2] + y_corners = [0, 0, 0, 0, -h, -h, -h, -h] + z_corners = [w / 2, -w / 2, -w / 2, w / 2, w / 2, -w / 2, -w / 2, w / 2] + + R = np.array([[np.cos(self.ry), 0, np.sin(self.ry)], + [0, 1, 0], + [-np.sin(self.ry), 0, np.cos(self.ry)]]) + corners3d = np.vstack([x_corners, y_corners, z_corners]) # (3, 8) + corners3d = np.dot(R, corners3d).T + corners3d = corners3d + self.pos + return corners3d + + def to_bev_box2d(self, oblique=True, voxel_size=0.1): + """ + :param bev_shape: (2) for bev shape (h, w), => (y_max, x_max) in image + :param voxel_size: float, 0.1m + :param oblique: + :return: box2d (4, 2)/ (4) in image coordinate + """ + if oblique: + corners3d = self.generate_corners3d() + xz_corners = corners3d[0:4, [0, 2]] + box2d = np.zeros((4, 2), dtype=np.int32) + box2d[:, 0] = ((xz_corners[:, 0] - Object3d.MIN_XZ[0]) / voxel_size).astype(np.int32) + box2d[:, 1] = Object3d.BEV_SHAPE[0] - 1 - ((xz_corners[:, 1] - Object3d.MIN_XZ[1]) / voxel_size).astype(np.int32) + box2d[:, 0] = np.clip(box2d[:, 0], 0, Object3d.BEV_SHAPE[1]) + box2d[:, 1] = np.clip(box2d[:, 1], 0, Object3d.BEV_SHAPE[0]) + else: + box2d = np.zeros(4, dtype=np.int32) + # discrete_center = np.floor((self.pos / voxel_size)).astype(np.int32) + cu = np.floor((self.pos[0] - Object3d.MIN_XZ[0]) / voxel_size).astype(np.int32) + cv = Object3d.BEV_SHAPE[0] - 1 - ((self.pos[2] - Object3d.MIN_XZ[1]) / voxel_size).astype(np.int32) + half_l, half_w = int(self.l / voxel_size / 2), int(self.w / voxel_size / 2) + box2d[0], box2d[1] = cu - half_l, cv - half_w + box2d[2], box2d[3] = cu + half_l, cv + half_w + + return box2d + + def to_str(self): + print_str = '%s %.3f %.3f %.3f box2d: %s hwl: [%.3f %.3f %.3f] pos: %s ry: %.3f' \ + % (self.cls_type, self.trucation, self.occlusion, self.alpha, self.box2d, self.h, self.w, self.l, + self.pos, self.ry) + return print_str + + def to_kitti_format(self): + kitti_str = '%s %.2f %d %.2f %.2f %.2f %.2f %.2f %.2f %.2f %.2f %.2f %.2f %.2f %.2f' \ + % (self.cls_type, self.trucation, int(self.occlusion), self.alpha, self.box2d[0], self.box2d[1], + self.box2d[2], self.box2d[3], self.h, self.w, self.l, self.pos[0], self.pos[1], self.pos[2], + self.ry) + return kitti_str + diff --git a/PaddleCV/Paddle3D/PointRCNN/utils/cyops/roipool3d_utils.pyx b/PaddleCV/Paddle3D/PointRCNN/utils/cyops/roipool3d_utils.pyx new file mode 100644 index 0000000000000000000000000000000000000000..3efa83135fed11d3e3a3daceb821c63424beb524 --- /dev/null +++ b/PaddleCV/Paddle3D/PointRCNN/utils/cyops/roipool3d_utils.pyx @@ -0,0 +1,160 @@ +# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve. +# +#Licensed under the Apache License, Version 2.0 (the "License"); +#you may not use this file except in compliance with the License. +#You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +#Unless required by applicable law or agreed to in writing, software +#distributed under the License is distributed on an "AS IS" BASIS, +#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +#See the License for the specific language governing permissions and +#limitations under the License. 
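+#
+# roipool3d_cpu below is a CPU reference for RoI-aware point pooling: for every
+# (enlarged) 3D box, up to sampled_pt_num interior points and their features
+# are gathered, and boxes with fewer hits are padded by cyclically repeating
+# their own points.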
+ +import numpy as np +cimport numpy as np +cimport cython +from libc.math cimport sin, cos + +@cython.boundscheck(False) +@cython.wraparound(False) +cdef enlarge_box3d(np.ndarray boxes3d, int extra_width): + """ + :param boxes3d: (N, 7) [x, y, z, h, w, l, ry] + """ + if isinstance(boxes3d, np.ndarray): + large_boxes3d = boxes3d.copy() + else: + large_boxes3d = boxes3d.clone() + large_boxes3d[:, 3:6] += extra_width * 2 + large_boxes3d[:, 1] += extra_width + return large_boxes3d + +@cython.boundscheck(False) +@cython.wraparound(False) +cdef pt_in_box(float x, float y, float z, float cx, float bottom_y, float cz, float h, float w, float l, float angle): + cdef float max_ids = 10.0 + cdef float cy = bottom_y - h / 2.0 + if ((abs(x - cx) > max_ids) or (abs(y - cy) > h / 2.0) or (abs(z - cz) > max_ids)): + return 0 + cdef float cosa = cos(angle) + cdef float sina = sin(angle) + cdef float x_rot = (x - cx) * cosa + (z - cz) * (-sina) + + cdef float z_rot = (x - cx) * sina + (z - cz) * cosa + + cdef float flag = (x_rot >= -l / 2.0) and (x_rot <= l / 2.0) and (z_rot >= -w / 2.0) and (z_rot <= w / 2.0) + return flag + +@cython.boundscheck(False) +@cython.wraparound(False) +cdef _rotate_pc_along_y(np.ndarray pc, float rot_angle): + """ + params pc: (N, 3+C), (N, 3) is in the rectified camera coordinate + params rot_angle: rad scalar + Output pc: updated pc with XYZ rotated + """ + cosval = np.cos(rot_angle) + sinval = np.sin(rot_angle) + rotmat = np.array([[cosval, -sinval], [sinval, cosval]]) + pc[:, [0, 2]] = np.dot(pc[:, [0, 2]], np.transpose(rotmat)) + return pc + +@cython.boundscheck(False) +@cython.wraparound(False) +def roipool3d_cpu( + np.ndarray[float, ndim=2] pts, + np.ndarray[float, ndim=2] pts_feature, + np.ndarray[float, ndim=2] boxes3d, + np.ndarray[float, ndim=2] pts_extra_input, + int pool_extra_width, int sampled_pt_num, int batch_size=1, bint canonical_transform=False): + cdef np.ndarray pts_feature_all = np.concatenate((pts_extra_input, pts_feature), axis=1) + + cdef np.ndarray larged_boxes3d = enlarge_box3d(boxes3d.reshape(-1, 7), pool_extra_width).reshape(batch_size, -1, 7) + + cdef int pts_num = pts.shape[0], + cdef int boxes_num = boxes3d.shape[0] + cdef int feature_len = pts_feature_all.shape[1] + cdef np.ndarray pts_data = np.zeros((batch_size, boxes_num, sampled_pt_num, 3)) + cdef np.ndarray features_data = np.zeros((batch_size, boxes_num, sampled_pt_num, feature_len)) + cdef np.ndarray empty_flag_data = np.zeros((batch_size, boxes_num)) + + cdef int cnt = 0 + cdef float cx = 0. + cdef float bottom_y = 0. + cdef float cz = 0. + cdef float h = 0. + cdef float w = 0. + cdef float l = 0. + cdef float ry = 0. + cdef float x = 0. + cdef float y = 0. + cdef float z = 0. 
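+    # Scratch variables for the brute-force pass below: every point is tested
+    # against every enlarged box via pt_in_box, and hits are copied into the
+    # pooled output buffers until sampled_pt_num points are collected per box.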
+ cdef np.ndarray x_i + cdef np.ndarray feat_i + cdef int bs + cdef int i + cdef int j + for bs in range(batch_size): + # boxes: 64,7 + for i in range(boxes_num): + cnt = 0 + # box + box = larged_boxes3d[bs][i] + cx = box[0] + bottom_y = box[1] + cz = box[2] + h = box[3] + w = box[4] + l = box[5] + ry = box[6] + # points: 16384,3 + x_i = pts + # features: 16384, 128 + feat_i = pts_feature_all + + for j in range(pts_num): + x = x_i[j][0] + y = x_i[j][1] + z = x_i[j][2] + cur_in_flag = pt_in_box(x,y,z,cx,bottom_y,cz,h,w,l,ry) + if cur_in_flag: + if cnt < sampled_pt_num: + pts_data[bs][i][cnt][:] = x_i[j] + features_data[bs][i][cnt][:] = feat_i[j] + cnt += 1 + else: + break + + if cnt == 0: + empty_flag_data[bs][i] = 1 + elif (cnt < sampled_pt_num): + for k in range(cnt, sampled_pt_num): + pts_data[bs][i][k] = pts_data[bs][i][k % cnt] + features_data[bs][i][k] = features_data[bs][i][k % cnt] + + + pooled_pts = pts_data.astype("float32")[0] + pooled_features = features_data.astype('float32')[0] + pooled_empty_flag = empty_flag_data.astype('int64')[0] + + cdef int extra_input_len = pts_extra_input.shape[1] + pooled_pts = np.concatenate((pooled_pts, pooled_features[:,:,0:extra_input_len]),axis=2) + pooled_features = pooled_features[:,:,extra_input_len:] + + if canonical_transform: + # Translate to the roi coordinates + roi_ry = boxes3d[:, 6] % (2 * np.pi) # 0~2pi + roi_center = boxes3d[:, 0:3] + # shift to center + pooled_pts[:, :, 0:3] = pooled_pts[:, :, 0:3] - roi_center[:, np.newaxis, :] + for k in range(pooled_pts.shape[0]): + pooled_pts[k] = _rotate_pc_along_y(pooled_pts[k], roi_ry[k]) + return pooled_pts, pooled_features, pooled_empty_flag + + return pooled_pts, pooled_features, pooled_empty_flag + + +#def roipool3d_cpu(pts, pts_feature, boxes3d, pts_extra_input, pool_extra_width, sampled_pt_num=512, batch_size=1): +# return _roipool3d_cpu(pts, pts_feature, boxes3d, pts_extra_input, pool_extra_width, sampled_pt_num, batch_size) diff --git a/PaddleCV/Paddle3D/PointRCNN/utils/cyops/setup.py b/PaddleCV/Paddle3D/PointRCNN/utils/cyops/setup.py new file mode 100644 index 0000000000000000000000000000000000000000..0d775017468bbb683d0ea0f0058062e5de12da73 --- /dev/null +++ b/PaddleCV/Paddle3D/PointRCNN/utils/cyops/setup.py @@ -0,0 +1,74 @@ +# Copyright (c) 2017-present, Facebook, Inc. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
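+#
+# Builds the Cython extension modules under utils/cyops (roipool3d_utils,
+# iou3d_utils, kitti_utils). Typically invoked from the PointRCNN root, e.g.
+# `python utils/cyops/setup.py build_ext --inplace`, so that the relative
+# source paths below resolve.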
+############################################################################## + +from __future__ import absolute_import +from __future__ import division +from __future__ import print_function + +from Cython.Build import cythonize +from setuptools import Extension +from setuptools import setup + +import numpy as np + +_NP_INCLUDE_DIRS = np.get_include() + + +# Extension modules +ext_modules = [ + Extension( + name='utils.cyops.roipool3d_utils', + sources=[ + 'utils/cyops/roipool3d_utils.pyx' + ], + extra_compile_args=[ + '-Wno-cpp' + ], + include_dirs=[ + _NP_INCLUDE_DIRS + ] + ), + + Extension( + name='utils.cyops.iou3d_utils', + sources=[ + 'utils/cyops/iou3d_utils.pyx' + ], + extra_compile_args=[ + '-Wno-cpp' + ], + include_dirs=[ + _NP_INCLUDE_DIRS + ] + ), + + Extension( + name='utils.cyops.kitti_utils', + sources=[ + 'utils/cyops/kitti_utils.pyx' + ], + extra_compile_args=[ + '-Wno-cpp' + ], + include_dirs=[ + _NP_INCLUDE_DIRS + ] + ), +] + +setup( + name='pp_pointrcnn', + ext_modules=cythonize(ext_modules) +) diff --git a/PaddleCV/Paddle3D/PointRCNN/utils/metric_utils.py b/PaddleCV/Paddle3D/PointRCNN/utils/metric_utils.py new file mode 100644 index 0000000000000000000000000000000000000000..aa7ee70652ac4e76aef9f4d755ec057ef2bc9123 --- /dev/null +++ b/PaddleCV/Paddle3D/PointRCNN/utils/metric_utils.py @@ -0,0 +1,216 @@ +# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import os +import sys +import logging +import numpy as np +import utils.cyops.kitti_utils as kitti_utils +from utils.config import cfg +from utils.box_utils import boxes_iou3d, box_nms_eval, boxes3d_to_bev +from utils.save_utils import save_rpn_feature, save_kitti_result, save_kitti_format + +__all__ = ['calc_iou_recall', 'rpn_metric', 'rcnn_metric'] + +logging.root.handlers = [] +FORMAT = '%(asctime)s-%(levelname)s: %(message)s' +logging.basicConfig(level=logging.INFO, format=FORMAT, stream=sys.stdout) +logger = logging.getLogger(__name__) + + +def calc_iou_recall(rets, thresh_list): + rpn_cls_label = rets['rpn_cls_label'][0] + boxes3d = rets['rois'][0] + seg_mask = rets['seg_mask'][0] + sample_id = rets['sample_id'][0] + gt_boxes3d = rets['gt_boxes3d'][0] + gt_boxes3d_num = rets['gt_boxes3d'][1] + + gt_box_idx = 0 + recalled_bbox_list = [0] * len(thresh_list) + gt_box_num = 0 + rpn_iou_sum = 0. 
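+
+    # Per-sample pass: accumulate the RPN foreground-segmentation IoU and count
+    # ground-truth boxes recalled above each threshold in thresh_list.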
+ for i in range(len(gt_boxes3d_num)): + cur_rpn_cls_label = rpn_cls_label[i] + cur_boxes3d = boxes3d[i] + cur_seg_mask = seg_mask[i] + cur_sample_id = sample_id[i] + cur_gt_boxes3d = gt_boxes3d[gt_box_idx: gt_box_idx + + gt_boxes3d_num[0][i]] + gt_box_idx += gt_boxes3d_num[0][i] + + k = cur_gt_boxes3d.__len__() - 1 + while k >= 0 and np.sum(cur_gt_boxes3d[k]) == 0: + k -= 1 + cur_gt_boxes3d = cur_gt_boxes3d[:k + 1] + + if cur_gt_boxes3d.shape[0] > 0: + iou3d = boxes_iou3d(cur_boxes3d, cur_gt_boxes3d[:, 0:7]) + gt_max_iou = iou3d.max(axis=0) + + for idx, thresh in enumerate(thresh_list): + recalled_bbox_list[idx] += np.sum(gt_max_iou > thresh) + gt_box_num += cur_gt_boxes3d.__len__() + + fg_mask = cur_rpn_cls_label > 0 + correct = np.sum(np.logical_and( + cur_seg_mask == cur_rpn_cls_label, fg_mask)) + union = np.sum(fg_mask) + np.sum(cur_seg_mask > 0) - correct + rpn_iou = float(correct) / max(float(union), 1.0) + rpn_iou_sum += rpn_iou + logger.debug('sample_id:{}, rpn_iou:{}, gt_box_num:{}, recalled_bbox_list:{}'.format( + sample_id, rpn_iou, gt_box_num, str(recalled_bbox_list))) + + return len(gt_boxes3d_num), gt_box_num, rpn_iou_sum, recalled_bbox_list + + +def rpn_metric(queue, mdict, lock, thresh_list, is_save_rpn_feature, kitti_feature_dir, + seg_output_dir, kitti_output_dir, kitti_rcnn_reader, classes): + while True: + rets_dict = queue.get() + if rets_dict is None: + lock.acquire() + mdict['exit_proc'] += 1 + lock.release() + return + + cnt, gt_box_num, rpn_iou_sum, recalled_bbox_list = calc_iou_recall( + rets_dict, thresh_list) + lock.acquire() + mdict['total_cnt'] += cnt + mdict['total_gt_bbox'] += gt_box_num + mdict['total_rpn_iou'] += rpn_iou_sum + for i, bbox_num in enumerate(recalled_bbox_list): + mdict['total_recalled_bbox_list_{}'.format(i)] += bbox_num + logger.debug("rpn_metric: {}".format(str(mdict))) + lock.release() + + if is_save_rpn_feature: + save_rpn_feature(rets_dict, kitti_feature_dir) + save_kitti_result( + rets_dict, seg_output_dir, kitti_output_dir, kitti_rcnn_reader, classes) + + +def rcnn_metric(queue, mdict, lock, thresh_list, kitti_rcnn_reader, roi_output_dir, + refine_output_dir, final_output_dir, is_save_result=False): + while True: + rets_dict = queue.get() + if rets_dict is None: + lock.acquire() + mdict['exit_proc'] += 1 + lock.release() + return + + for k,v in rets_dict.items(): + rets_dict[k] = v[0] + + rcnn_cls = rets_dict['rcnn_cls'] + rcnn_reg = rets_dict['rcnn_reg'] + roi_boxes3d = rets_dict['roi_boxes3d'] + roi_scores = rets_dict['roi_scores'] + + # bounding box regression + anchor_size = cfg.CLS_MEAN_SIZE[0] + pred_boxes3d = kitti_utils.decode_bbox_target( + roi_boxes3d, + rcnn_reg, + anchor_size=np.array(anchor_size), + loc_scope=cfg.RCNN.LOC_SCOPE, + loc_bin_size=cfg.RCNN.LOC_BIN_SIZE, + num_head_bin=cfg.RCNN.NUM_HEAD_BIN, + get_xz_fine=True, + get_y_by_bin=cfg.RCNN.LOC_Y_BY_BIN, + loc_y_scope=cfg.RCNN.LOC_Y_SCOPE, + loc_y_bin_size=cfg.RCNN.LOC_Y_BIN_SIZE, + get_ry_fine=True + ) + + # scoring + if rcnn_cls.shape[1] == 1: + raw_scores = rcnn_cls.reshape(-1) + norm_scores = rets_dict['norm_scores'] + pred_classes = norm_scores > cfg.RCNN.SCORE_THRESH + pred_classes = pred_classes.astype(np.float32) + else: + pred_classes = np.argmax(rcnn_cls, axis=1).reshape(-1) + raw_scores = rcnn_cls[:, pred_classes] + + # evaluation + gt_iou = rets_dict['gt_iou'] + gt_boxes3d = rets_dict['gt_boxes3d'] + + # recall + if gt_boxes3d.size > 0: + gt_num = gt_boxes3d.shape[1] + gt_boxes3d = gt_boxes3d.reshape((-1,7)) + iou3d = boxes_iou3d(pred_boxes3d, 
gt_boxes3d) + gt_max_iou = iou3d.max(axis=0) + refined_iou = iou3d.max(axis=1) + + recalled_num = (gt_max_iou > 0.7).sum() + roi_boxes3d = roi_boxes3d.reshape((-1,7)) + iou3d_in = boxes_iou3d(roi_boxes3d, gt_boxes3d) + gt_max_iou_in = iou3d_in.max(axis=0) + + lock.acquire() + mdict['total_gt_bbox'] += gt_num + for idx, thresh in enumerate(thresh_list): + recalled_bbox_num = (gt_max_iou > thresh).sum() + mdict['total_recalled_bbox_list_{}'.format(idx)] += recalled_bbox_num + for idx, thresh in enumerate(thresh_list): + roi_recalled_bbox_num = (gt_max_iou_in > thresh).sum() + mdict['total_roi_recalled_bbox_list_{}'.format(idx)] += roi_recalled_bbox_num + lock.release() + + # classification accuracy + cls_label = gt_iou > cfg.RCNN.CLS_FG_THRESH + cls_label = cls_label.astype(np.float32) + cls_valid_mask = (gt_iou >= cfg.RCNN.CLS_FG_THRESH) | (gt_iou <= cfg.RCNN.CLS_BG_THRESH) + cls_valid_mask = cls_valid_mask.astype(np.float32) + cls_acc = (pred_classes == cls_label).astype(np.float32) + cls_acc = (cls_acc * cls_valid_mask).sum() / max(cls_valid_mask.sum(), 1.0) * 1.0 + + iou_thresh = 0.7 if cfg.CLASSES == 'Car' else 0.5 + cls_label_refined = (gt_iou >= iou_thresh) + cls_label_refined = cls_label_refined.astype(np.float32) + cls_acc_refined = (pred_classes == cls_label_refined).astype(np.float32).sum() / max(cls_label_refined.shape[0], 1.0) + + sample_id = rets_dict['sample_id'] + image_shape = kitti_rcnn_reader.get_image_shape(sample_id) + + if is_save_result: + roi_boxes3d_np = roi_boxes3d + pred_boxes3d_np = pred_boxes3d + calib = kitti_rcnn_reader.get_calib(sample_id) + save_kitti_format(sample_id, calib, roi_boxes3d_np, roi_output_dir, roi_scores, image_shape) + save_kitti_format(sample_id, calib, pred_boxes3d_np, refine_output_dir, raw_scores, image_shape) + + inds = norm_scores > cfg.RCNN.SCORE_THRESH + if inds.astype(np.float32).sum() == 0: + logger.debug("The num of 'norm_scores > thresh' of sample {} is 0".format(sample_id)) + continue + pred_boxes3d_selected = pred_boxes3d[inds] + raw_scores_selected = raw_scores[inds] + # NMS thresh + boxes_bev_selected = boxes3d_to_bev(pred_boxes3d_selected) + scores_selected, pred_boxes3d_selected = box_nms_eval(boxes_bev_selected, raw_scores_selected, pred_boxes3d_selected, cfg.RCNN.NMS_THRESH) + calib = kitti_rcnn_reader.get_calib(sample_id) + save_kitti_format(sample_id, calib, pred_boxes3d_selected, final_output_dir, scores_selected, image_shape) + lock.acquire() + mdict['total_det_num'] += pred_boxes3d_selected.shape[0] + mdict['total_cls_acc'] += cls_acc + mdict['total_cls_acc_refined'] += cls_acc_refined + lock.release() + logger.debug("rcnn_metric: {}".format(str(mdict))) + diff --git a/PaddleCV/Paddle3D/PointRCNN/utils/object3d.py b/PaddleCV/Paddle3D/PointRCNN/utils/object3d.py new file mode 100644 index 0000000000000000000000000000000000000000..7b5703bdbfba1c1bf239c2a2c9f2179ea908a7e5 --- /dev/null +++ b/PaddleCV/Paddle3D/PointRCNN/utils/object3d.py @@ -0,0 +1,113 @@ +""" +This code is borrow from https://github.com/sshaoshuai/PointRCNN/blob/master/lib/utils/object3d.py +""" +import numpy as np + + +def cls_type_to_id(cls_type): + type_to_id = {'Car': 1, 'Pedestrian': 2, 'Cyclist': 3, 'Van': 4} + if cls_type not in type_to_id.keys(): + return -1 + return type_to_id[cls_type] + + +def get_objects_from_label(label_file): + with open(label_file, 'r') as f: + lines = f.readlines() + objects = [Object3d(line) for line in lines] + return objects + + +class Object3d(object): + def __init__(self, line): + label = line.strip().split(' ') + 
self.src = line + self.cls_type = label[0] + self.cls_id = cls_type_to_id(self.cls_type) + self.trucation = float(label[1]) + self.occlusion = float(label[2]) # 0:fully visible 1:partly occluded 2:largely occluded 3:unknown + self.alpha = float(label[3]) + self.box2d = np.array((float(label[4]), float(label[5]), float(label[6]), float(label[7])), dtype=np.float32) + self.h = float(label[8]) + self.w = float(label[9]) + self.l = float(label[10]) + self.pos = np.array((float(label[11]), float(label[12]), float(label[13])), dtype=np.float32) + self.dis_to_cam = np.linalg.norm(self.pos) + self.ry = float(label[14]) + self.score = float(label[15]) if label.__len__() == 16 else -1.0 + self.level_str = None + self.level = self.get_obj_level() + + def get_obj_level(self): + height = float(self.box2d[3]) - float(self.box2d[1]) + 1 + + if height >= 40 and self.trucation <= 0.15 and self.occlusion <= 0: + self.level_str = 'Easy' + return 1 # Easy + elif height >= 25 and self.trucation <= 0.3 and self.occlusion <= 1: + self.level_str = 'Moderate' + return 2 # Moderate + elif height >= 25 and self.trucation <= 0.5 and self.occlusion <= 2: + self.level_str = 'Hard' + return 3 # Hard + else: + self.level_str = 'UnKnown' + return 4 + + def generate_corners3d(self): + """ + generate corners3d representation for this object + :return corners_3d: (8, 3) corners of box3d in camera coord + """ + l, h, w = self.l, self.h, self.w + x_corners = [l / 2, l / 2, -l / 2, -l / 2, l / 2, l / 2, -l / 2, -l / 2] + y_corners = [0, 0, 0, 0, -h, -h, -h, -h] + z_corners = [w / 2, -w / 2, -w / 2, w / 2, w / 2, -w / 2, -w / 2, w / 2] + + R = np.array([[np.cos(self.ry), 0, np.sin(self.ry)], + [0, 1, 0], + [-np.sin(self.ry), 0, np.cos(self.ry)]]) + corners3d = np.vstack([x_corners, y_corners, z_corners]) # (3, 8) + corners3d = np.dot(R, corners3d).T + corners3d = corners3d + self.pos + return corners3d + + def to_bev_box2d(self, oblique=True, voxel_size=0.1): + """ + :param bev_shape: (2) for bev shape (h, w), => (y_max, x_max) in image + :param voxel_size: float, 0.1m + :param oblique: + :return: box2d (4, 2)/ (4) in image coordinate + """ + if oblique: + corners3d = self.generate_corners3d() + xz_corners = corners3d[0:4, [0, 2]] + box2d = np.zeros((4, 2), dtype=np.int32) + box2d[:, 0] = ((xz_corners[:, 0] - Object3d.MIN_XZ[0]) / voxel_size).astype(np.int32) + box2d[:, 1] = Object3d.BEV_SHAPE[0] - 1 - ((xz_corners[:, 1] - Object3d.MIN_XZ[1]) / voxel_size).astype(np.int32) + box2d[:, 0] = np.clip(box2d[:, 0], 0, Object3d.BEV_SHAPE[1]) + box2d[:, 1] = np.clip(box2d[:, 1], 0, Object3d.BEV_SHAPE[0]) + else: + box2d = np.zeros(4, dtype=np.int32) + # discrete_center = np.floor((self.pos / voxel_size)).astype(np.int32) + cu = np.floor((self.pos[0] - Object3d.MIN_XZ[0]) / voxel_size).astype(np.int32) + cv = Object3d.BEV_SHAPE[0] - 1 - ((self.pos[2] - Object3d.MIN_XZ[1]) / voxel_size).astype(np.int32) + half_l, half_w = int(self.l / voxel_size / 2), int(self.w / voxel_size / 2) + box2d[0], box2d[1] = cu - half_l, cv - half_w + box2d[2], box2d[3] = cu + half_l, cv + half_w + + return box2d + + def to_str(self): + print_str = '%s %.3f %.3f %.3f box2d: %s hwl: [%.3f %.3f %.3f] pos: %s ry: %.3f' \ + % (self.cls_type, self.trucation, self.occlusion, self.alpha, self.box2d, self.h, self.w, self.l, + self.pos, self.ry) + return print_str + + def to_kitti_format(self): + kitti_str = '%s %.2f %d %.2f %.2f %.2f %.2f %.2f %.2f %.2f %.2f %.2f %.2f %.2f %.2f' \ + % (self.cls_type, self.trucation, int(self.occlusion), self.alpha, self.box2d[0], 
self.box2d[1], + self.box2d[2], self.box2d[3], self.h, self.w, self.l, self.pos[0], self.pos[1], self.pos[2], + self.ry) + return kitti_str + diff --git a/PaddleCV/Paddle3D/PointRCNN/utils/optimizer.py b/PaddleCV/Paddle3D/PointRCNN/utils/optimizer.py new file mode 100644 index 0000000000000000000000000000000000000000..e32d1df862de7692e520168a2b35f482535f3ac6 --- /dev/null +++ b/PaddleCV/Paddle3D/PointRCNN/utils/optimizer.py @@ -0,0 +1,122 @@ +# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +"""Optimization and learning rate scheduling.""" + +from __future__ import absolute_import +from __future__ import division +from __future__ import print_function + +import numpy as np +import paddle.fluid as fluid +import paddle.fluid.layers.learning_rate_scheduler as lr_scheduler +from paddle.fluid.layers import control_flow + +import logging +logger = logging.getLogger(__name__) + +def cosine_warmup_decay(learning_rate, betas, warmup_factor, decay_factor, + total_step, warmup_pct): + def annealing_cos(start, end, pct): + "Cosine anneal from `start` to `end` as pct goes from 0.0 to 1.0." + cos_out = fluid.layers.cos(pct * np.pi) + 1. + return cos_out * (start - end) / 2. + end + + warmup_start_lr = learning_rate * warmup_factor + decay_end_lr = learning_rate * decay_factor + warmup_step = total_step * warmup_pct + + global_step = lr_scheduler._decay_step_counter() + + lr = fluid.layers.create_global_var( + shape=[1], + value=float(learning_rate), + dtype='float32', + persistable=True, + name="learning_rate") + beta1 = fluid.layers.create_global_var( + shape=[1], + value=float(betas[0]), + dtype='float32', + persistable=True, + name="beta1") + + warmup_step_var = fluid.layers.fill_constant( + shape=[1], dtype='float32', value=float(warmup_step), force_cpu=True) + + with control_flow.Switch() as switch: + with switch.case(global_step < warmup_step_var): + cur_lr = annealing_cos(warmup_start_lr, learning_rate, + global_step / warmup_step_var) + fluid.layers.assign(cur_lr, lr) + cur_beta1 = annealing_cos(betas[0], betas[1], + global_step / warmup_step_var) + fluid.layers.assign(cur_beta1, beta1) + with switch.case(global_step >= warmup_step_var): + cur_lr = annealing_cos(learning_rate, decay_end_lr, + (global_step - warmup_step_var) / (total_step - warmup_step)) + fluid.layers.assign(cur_lr, lr) + cur_beta1 = annealing_cos(betas[1], betas[0], + (global_step - warmup_step_var) / (total_step - warmup_step)) + fluid.layers.assign(cur_beta1, beta1) + + return lr, beta1 + + +def optimize(loss, + learning_rate, + warmup_factor, + decay_factor, + total_step, + warmup_pct, + train_program, + startup_prog, + weight_decay, + clip_norm, + beta1=[0.95, 0.85], + beta2=0.99, + scheduler='cosine_warmup_decay'): + + scheduled_lr= None + if scheduler == 'cosine_warmup_decay': + scheduled_lr, scheduled_beta1 = cosine_warmup_decay(learning_rate, beta1, warmup_factor, + decay_factor, total_step, + warmup_pct) + else: + raise ValueError("Unkown learning rate 
scheduler, should be " + "'cosine_warmup_decay'") + + optimizer = fluid.optimizer.Adam(learning_rate=scheduled_lr, + beta1=scheduled_beta1, + beta2=beta2) + fluid.clip.set_gradient_clip( + clip=fluid.clip.GradientClipByGlobalNorm(clip_norm=clip_norm)) + + param_list = dict() + + if weight_decay > 0: + for param in train_program.global_block().all_parameters(): + param_list[param.name] = param * 1.0 + param_list[param.name].stop_gradient = True + + _, param_grads = optimizer.minimize(loss) + + if weight_decay > 0: + for param, grad in param_grads: + with param.block.program._optimized_guard( + [param, grad]), fluid.framework.name_scope("weight_decay"): + updated_param = param - param_list[ + param.name] * weight_decay * scheduled_lr + fluid.layers.assign(output=param, input=updated_param) + + return scheduled_lr diff --git a/PaddleCV/Paddle3D/PointRCNN/utils/proposal_target.py b/PaddleCV/Paddle3D/PointRCNN/utils/proposal_target.py new file mode 100644 index 0000000000000000000000000000000000000000..deda51180bfb9007f1dadd265c3f33f397b1cccf --- /dev/null +++ b/PaddleCV/Paddle3D/PointRCNN/utils/proposal_target.py @@ -0,0 +1,369 @@ +import numpy as np +from utils.cyops import kitti_utils, roipool3d_utils, iou3d_utils + +CLOSE_RANDOM = False + +def get_proposal_target_func(cfg, mode='TRAIN'): + + def sample_rois_for_rcnn(roi_boxes3d, gt_boxes3d): + """ + :param roi_boxes3d: (B, M, 7) + :param gt_boxes3d: (B, N, 8) [x, y, z, h, w, l, ry, cls] + :return + batch_rois: (B, N, 7) + batch_gt_of_rois: (B, N, 8) + batch_roi_iou: (B, N) + """ + + batch_size = roi_boxes3d.shape[0] + + #batch_size = 1 + fg_rois_per_image = int(np.round(cfg.RCNN.FG_RATIO * cfg.RCNN.ROI_PER_IMAGE)) + + batch_rois = np.zeros((batch_size, cfg.RCNN.ROI_PER_IMAGE, 7)) + batch_gt_of_rois = np.zeros((batch_size, cfg.RCNN.ROI_PER_IMAGE, 7)) + batch_roi_iou = np.zeros((batch_size, cfg.RCNN.ROI_PER_IMAGE)) + for idx in range(batch_size): + cur_roi, cur_gt = roi_boxes3d[idx], gt_boxes3d[idx] + k = cur_gt.shape[0] - 1 + while cur_gt[k].sum() == 0: + k -= 1 + cur_gt = cur_gt[:k + 1] + # include gt boxes in the candidate rois + iou3d = iou3d_utils.boxes_iou3d(cur_roi, cur_gt[:, 0:7]) # (M, N) + max_overlaps = np.max(iou3d, axis=1) + gt_assignment = np.argmax(iou3d, axis=1) + # sample fg, easy_bg, hard_bg + fg_thresh = min(cfg.RCNN.REG_FG_THRESH, cfg.RCNN.CLS_FG_THRESH) + fg_inds = np.where(max_overlaps >= fg_thresh)[0].reshape(-1) + + # TODO: this will mix the fg and bg when CLS_BG_THRESH_LO < iou < CLS_BG_THRESH + # fg_inds = torch.cat((fg_inds, roi_assignment), dim=0) # consider the roi which has max_iou with gt as fg + easy_bg_inds = np.where(max_overlaps < cfg.RCNN.CLS_BG_THRESH_LO)[0].reshape(-1) + hard_bg_inds = np.where((max_overlaps < cfg.RCNN.CLS_BG_THRESH) & (max_overlaps >= cfg.RCNN.CLS_BG_THRESH_LO))[0].reshape(-1) + + fg_num_rois = fg_inds.shape[0] + bg_num_rois = hard_bg_inds.shape[0] + easy_bg_inds.shape[0] + + if fg_num_rois > 0 and bg_num_rois > 0: + # sampling fg + fg_rois_per_this_image = min(fg_rois_per_image, fg_num_rois) + if CLOSE_RANDOM: + fg_inds = fg_inds[:fg_rois_per_this_image] + else: + rand_num = np.random.permutation(fg_num_rois) + fg_inds = fg_inds[rand_num[:fg_rois_per_this_image]] + + # sampling bg + bg_rois_per_this_image = cfg.RCNN.ROI_PER_IMAGE - fg_rois_per_this_image + bg_inds = sample_bg_inds(hard_bg_inds, easy_bg_inds, bg_rois_per_this_image) + + elif fg_num_rois > 0 and bg_num_rois == 0: + # sampling fg + rand_num = np.floor(np.random.rand(cfg.RCNN.ROI_PER_IMAGE) * fg_num_rois) + # rand_num = 
torch.from_numpy(rand_num).type_as(gt_boxes3d).long() + fg_inds = fg_inds[rand_num] + fg_rois_per_this_image = cfg.RCNN.ROI_PER_IMAGE + bg_rois_per_this_image = 0 + elif bg_num_rois > 0 and fg_num_rois == 0: + # sampling bg + bg_rois_per_this_image = cfg.RCNN.ROI_PER_IMAGE + bg_inds = sample_bg_inds(hard_bg_inds, easy_bg_inds, bg_rois_per_this_image) + + fg_rois_per_this_image = 0 + else: + import pdb + pdb.set_trace() + raise NotImplementedError + # augment the rois by noise + roi_list, roi_iou_list, roi_gt_list = [], [], [] + if fg_rois_per_this_image > 0: + fg_rois_src = cur_roi[fg_inds] + gt_of_fg_rois = cur_gt[gt_assignment[fg_inds]] + iou3d_src = max_overlaps[fg_inds] + fg_rois, fg_iou3d = aug_roi_by_noise( + fg_rois_src, gt_of_fg_rois, iou3d_src, aug_times=cfg.RCNN.ROI_FG_AUG_TIMES) + roi_list.append(fg_rois) + roi_iou_list.append(fg_iou3d) + roi_gt_list.append(gt_of_fg_rois) + + if bg_rois_per_this_image > 0: + bg_rois_src = cur_roi[bg_inds] + gt_of_bg_rois = cur_gt[gt_assignment[bg_inds]] + iou3d_src = max_overlaps[bg_inds] + aug_times = 1 if cfg.RCNN.ROI_FG_AUG_TIMES > 0 else 0 + bg_rois, bg_iou3d = aug_roi_by_noise( + bg_rois_src, gt_of_bg_rois, iou3d_src, aug_times=aug_times) + roi_list.append(bg_rois) + roi_iou_list.append(bg_iou3d) + roi_gt_list.append(gt_of_bg_rois) + + + rois = np.concatenate(roi_list, axis=0) + iou_of_rois = np.concatenate(roi_iou_list, axis=0) + gt_of_rois = np.concatenate(roi_gt_list, axis=0) + batch_rois[idx] = rois + batch_gt_of_rois[idx] = gt_of_rois + batch_roi_iou[idx] = iou_of_rois + + return batch_rois, batch_gt_of_rois, batch_roi_iou + + def sample_bg_inds(hard_bg_inds, easy_bg_inds, bg_rois_per_this_image): + + if hard_bg_inds.shape[0] > 0 and easy_bg_inds.shape[0] > 0: + hard_bg_rois_num = int(bg_rois_per_this_image * cfg.RCNN.HARD_BG_RATIO) + easy_bg_rois_num = bg_rois_per_this_image - hard_bg_rois_num + # sampling hard bg + if CLOSE_RANDOM: + rand_idx = list(np.arange(0,hard_bg_inds.shape[0]))*hard_bg_rois_num + rand_idx = rand_idx[:hard_bg_rois_num] + else: + rand_idx = np.random.randint(low=0, high=hard_bg_inds.shape[0], size=(hard_bg_rois_num,)) + hard_bg_inds = hard_bg_inds[rand_idx] + # sampling easy bg + if CLOSE_RANDOM: + rand_idx = list(np.arange(0,easy_bg_inds.shape[0]))*easy_bg_rois_num + rand_idx = rand_idx[:easy_bg_rois_num] + else: + rand_idx = np.random.randint(low=0, high=easy_bg_inds.shape[0], size=(easy_bg_rois_num,)) + easy_bg_inds = easy_bg_inds[rand_idx] + bg_inds = np.concatenate([hard_bg_inds, easy_bg_inds], axis=0) + elif hard_bg_inds.shape[0] > 0 and easy_bg_inds.shape[0] == 0: + hard_bg_rois_num = bg_rois_per_this_image + # sampling hard bg + rand_idx = np.random.randint(low=0, high=hard_bg_inds.shape[0], size=(hard_bg_rois_num,)) + bg_inds = hard_bg_inds[rand_idx] + elif hard_bg_inds.shape[0] == 0 and easy_bg_inds.shape[0] > 0: + easy_bg_rois_num = bg_rois_per_this_image + # sampling easy bg + rand_idx = np.random.randint(low=0, high=easy_bg_inds.shape[0], size=(easy_bg_rois_num,)) + bg_inds = easy_bg_inds[rand_idx] + else: + raise NotImplementedError + + return bg_inds + + def aug_roi_by_noise(roi_boxes3d, gt_boxes3d, iou3d_src, aug_times=10): + iou_of_rois = np.zeros(roi_boxes3d.shape[0]).astype(gt_boxes3d.dtype) + pos_thresh = min(cfg.RCNN.REG_FG_THRESH, cfg.RCNN.CLS_FG_THRESH) + + for k in range(roi_boxes3d.shape[0]): + temp_iou = cnt = 0 + roi_box3d = roi_boxes3d[k] + + gt_box3d = gt_boxes3d[k].reshape(1, 7) + aug_box3d = roi_box3d + keep = True + while temp_iou < pos_thresh and cnt < aug_times: + if True: 
#np.random.rand() < 0.2: + aug_box3d = roi_box3d # p=0.2 to keep the original roi box + keep = True + else: + aug_box3d = random_aug_box3d(roi_box3d) + keep = False + aug_box3d = aug_box3d.reshape((1, 7)) + iou3d = iou3d_utils.boxes_iou3d(aug_box3d, gt_box3d) + temp_iou = iou3d[0][0] + cnt += 1 + roi_boxes3d[k] = aug_box3d.reshape(-1) + if cnt == 0 or keep: + iou_of_rois[k] = iou3d_src[k] + else: + iou_of_rois[k] = temp_iou + return roi_boxes3d, iou_of_rois + + def random_aug_box3d(box3d): + """ + :param box3d: (7) [x, y, z, h, w, l, ry] + random shift, scale, orientation + """ + if cfg.RCNN.REG_AUG_METHOD == 'single': + + pos_shift = (np.random.rand(3) - 0.5) # [-0.5 ~ 0.5] + hwl_scale = (np.random.rand(3) - 0.5) / (0.5 / 0.15) + 1.0 # + angle_rot = (np.random.rand(1) - 0.5) / (0.5 / (np.pi / 12)) # [-pi/12 ~ pi/12] + aug_box3d = np.concatenate([box3d[0:3] + pos_shift, box3d[3:6] * hwl_scale, box3d[6:7] + angle_rot], axis=0) + return aug_box3d + elif cfg.RCNN.REG_AUG_METHOD == 'multiple': + # pos_range, hwl_range, angle_range, mean_iou + range_config = [[0.2, 0.1, np.pi / 12, 0.7], + [0.3, 0.15, np.pi / 12, 0.6], + [0.5, 0.15, np.pi / 9, 0.5], + [0.8, 0.15, np.pi / 6, 0.3], + [1.0, 0.15, np.pi / 3, 0.2]] + idx = np.random.randint(low=0, high=len(range_config), size=(1,))[0] + pos_shift = ((np.random.rand(3) - 0.5) / 0.5) * range_config[idx][0] + hwl_scale = ((np.random.rand(3) - 0.5) / 0.5) * range_config[idx][1] + 1.0 + angle_rot = ((np.random.rand(1) - 0.5) / 0.5) * range_config[idx][2] + aug_box3d = np.concatenate([box3d[0:3] + pos_shift, box3d[3:6] * hwl_scale, box3d[6:7] + angle_rot], axis=0) + return aug_box3d + elif cfg.RCNN.REG_AUG_METHOD == 'normal': + x_shift = np.random.normal(loc=0, scale=0.3) + y_shift = np.random.normal(loc=0, scale=0.2) + z_shift = np.random.normal(loc=0, scale=0.3) + h_shift = np.random.normal(loc=0, scale=0.25) + w_shift = np.random.normal(loc=0, scale=0.15) + l_shift = np.random.normal(loc=0, scale=0.5) + ry_shift = ((np.random.rand() - 0.5) / 0.5) * np.pi / 12 + aug_box3d = np.array([box3d[0] + x_shift, box3d[1] + y_shift, box3d[2] + z_shift, box3d[3] + h_shift, + box3d[4] + w_shift, box3d[5] + l_shift, box3d[6] + ry_shift], dtype=np.float32) + aug_box3d = aug_box3d.astype(box3d.dtype) + return aug_box3d + else: + raise NotImplementedError + + def data_augmentation(pts, rois, gt_of_rois): + """ + :param pts: (B, M, 512, 3) + :param rois: (B, M. 
7) + :param gt_of_rois: (B, M, 7) + :return: + """ + batch_size, boxes_num = pts.shape[0], pts.shape[1] + + # rotation augmentation + angles = (np.random.rand(batch_size, boxes_num) - 0.5 / 0.5) * (np.pi / cfg.AUG_ROT_RANGE) + # calculate gt alpha from gt_of_rois + temp_x, temp_z, temp_ry = gt_of_rois[:, :, 0], gt_of_rois[:, :, 2], gt_of_rois[:, :, 6] + temp_beta = np.arctan2(temp_z, temp_x) + gt_alpha = -np.sign(temp_beta) * np.pi / 2 + temp_beta + temp_ry # (B, M) + + temp_x, temp_z, temp_ry = rois[:, :, 0], rois[:, :, 2], rois[:, :, 6] + temp_beta = np.arctan2(temp_z, temp_x) + roi_alpha = -np.sign(temp_beta) * np.pi / 2 + temp_beta + temp_ry # (B, M) + + for k in range(batch_size): + pts[k] = kitti_utils.rotate_pc_along_y_np(pts[k], angles[k]) + gt_of_rois[k] = np.squeeze(kitti_utils.rotate_pc_along_y_np( + np.expand_dims(gt_of_rois[k], axis=1), angles[k]), axis=1) + rois[k] = np.squeeze(kitti_utils.rotate_pc_along_y_np( + np.expand_dims(rois[k], axis=1), angles[k]),axis=1) + + # calculate the ry after rotation + temp_x, temp_z = gt_of_rois[:, :, 0], gt_of_rois[:, :, 2] + temp_beta = np.arctan2(temp_z, temp_x) + gt_of_rois[:, :, 6] = np.sign(temp_beta) * np.pi / 2 + gt_alpha - temp_beta + temp_x, temp_z = rois[:, :, 0], rois[:, :, 2] + temp_beta = np.arctan2(temp_z, temp_x) + rois[:, :, 6] = np.sign(temp_beta) * np.pi / 2 + roi_alpha - temp_beta + # scaling augmentation + scales = 1 + ((np.random.rand(batch_size, boxes_num) - 0.5) / 0.5) * 0.05 + pts = pts * np.expand_dims(np.expand_dims(scales, axis=2), axis=3) + gt_of_rois[:, :, 0:6] = gt_of_rois[:, :, 0:6] * np.expand_dims(scales, axis=2) + rois[:, :, 0:6] = rois[:, :, 0:6] * np.expand_dims(scales, axis=2) + + # flip augmentation + flip_flag = np.sign(np.random.rand(batch_size, boxes_num) - 0.5) + pts[:, :, :, 0] = pts[:, :, :, 0] * np.expand_dims(flip_flag, axis=2) + gt_of_rois[:, :, 0] = gt_of_rois[:, :, 0] * flip_flag + # flip orientation: ry > 0: pi - ry, ry < 0: -pi - ry + src_ry = gt_of_rois[:, :, 6] + ry = (flip_flag == 1).astype(np.float32) * src_ry + (flip_flag == -1).astype(np.float32) * (np.sign(src_ry) * np.pi - src_ry) + gt_of_rois[:, :, 6] = ry + + rois[:, :, 0] = rois[:, :, 0] * flip_flag + # flip orientation: ry > 0: pi - ry, ry < 0: -pi - ry + src_ry = rois[:, :, 6] + ry = (flip_flag == 1).astype(np.float32) * src_ry + (flip_flag == -1).astype(np.float32) * (np.sign(src_ry) * np.pi - src_ry) + rois[:, :, 6] = ry + + return pts, rois, gt_of_rois + + def generate_proposal_target(seg_mask,rpn_features,gt_boxes3d,rpn_xyz,pts_depth,roi_boxes3d,rpn_intensity): + seg_mask = np.array(seg_mask) + features = np.array(rpn_features) + gt_boxes3d = np.array(gt_boxes3d) + rpn_xyz = np.array(rpn_xyz) + pts_depth = np.array(pts_depth) + roi_boxes3d = np.array(roi_boxes3d) + rpn_intensity = np.array(rpn_intensity) + batch_rois, batch_gt_of_rois, batch_roi_iou = sample_rois_for_rcnn(roi_boxes3d, gt_boxes3d) + + if cfg.RCNN.USE_INTENSITY: + pts_extra_input_list = [np.expand_dims(rpn_intensity, axis=2), + np.expand_dims(seg_mask, axis=2)] + else: + pts_extra_input_list = [np.expand_dims(seg_mask, axis=2)] + + if cfg.RCNN.USE_DEPTH: + pts_depth = pts_depth / 70.0 - 0.5 + pts_extra_input_list.append(np.expand_dims(pts_depth, axis=2)) + pts_extra_input = np.concatenate(pts_extra_input_list, axis=2) + + # point cloud pooling + pts_feature = np.concatenate((pts_extra_input, rpn_features), axis=2) + + batch_rois = batch_rois.astype(np.float32) + + pooled_features, pooled_empty_flag = roipool3d_utils.roipool3d_gpu( + rpn_xyz, pts_feature, 
batch_rois, cfg.RCNN.POOL_EXTRA_WIDTH, + sampled_pt_num=cfg.RCNN.NUM_POINTS + ) + + sampled_pts, sampled_features = pooled_features[:, :, :, 0:3], pooled_features[:, :, :, 3:] + # data augmentation + if cfg.AUG_DATA: + # data augmentation + sampled_pts, batch_rois, batch_gt_of_rois = \ + data_augmentation(sampled_pts, batch_rois, batch_gt_of_rois) + + # canonical transformation + batch_size = batch_rois.shape[0] + roi_ry = batch_rois[:, :, 6] % (2 * np.pi) + roi_center = batch_rois[:, :, 0:3] + sampled_pts = sampled_pts - np.expand_dims(roi_center, axis=2) # (B, M, 512, 3) + batch_gt_of_rois[:, :, 0:3] = batch_gt_of_rois[:, :, 0:3] - roi_center + batch_gt_of_rois[:, :, 6] = batch_gt_of_rois[:, :, 6] - roi_ry + + for k in range(batch_size): + sampled_pts[k] = kitti_utils.rotate_pc_along_y_np(sampled_pts[k], batch_rois[k, :, 6]) + batch_gt_of_rois[k] = np.squeeze(kitti_utils.rotate_pc_along_y_np( + np.expand_dims(batch_gt_of_rois[k], axis=1), roi_ry[k]), axis=1) + + # regression valid mask + valid_mask = (pooled_empty_flag == 0) + reg_valid_mask = ((batch_roi_iou > cfg.RCNN.REG_FG_THRESH) & valid_mask).astype(np.float32) + + # classification label + batch_cls_label = (batch_roi_iou > cfg.RCNN.CLS_FG_THRESH).astype(np.int64) + invalid_mask = (batch_roi_iou > cfg.RCNN.CLS_BG_THRESH) & (batch_roi_iou < cfg.RCNN.CLS_FG_THRESH) + batch_cls_label[valid_mask == 0] = -1 + batch_cls_label[invalid_mask > 0] = -1 + + output_dict = {'sampled_pts': sampled_pts.reshape(-1, cfg.RCNN.NUM_POINTS, 3).astype(np.float32), + 'pts_feature': sampled_features.reshape(-1, cfg.RCNN.NUM_POINTS, sampled_features.shape[3]).astype(np.float32), + 'cls_label': batch_cls_label.reshape(-1), + 'reg_valid_mask': reg_valid_mask.reshape(-1).astype(np.float32), + 'gt_of_rois': batch_gt_of_rois.reshape(-1, 7).astype(np.float32), + 'gt_iou': batch_roi_iou.reshape(-1).astype(np.float32), + 'roi_boxes3d': batch_rois.reshape(-1, 7).astype(np.float32)} + + return output_dict.values() + + return generate_proposal_target + + +if __name__ == "__main__": + + input_dict = {} + input_dict['roi_boxes3d'] = np.load("models/rpn_data/roi_boxes3d.npy") + input_dict['gt_boxes3d'] = np.load("models/rpn_data/gt_boxes3d.npy") + input_dict['rpn_xyz'] = np.load("models/rpn_data/rpn_xyz.npy") + input_dict['rpn_features'] = np.load("models/rpn_data/rpn_features.npy") + input_dict['rpn_intensity'] = np.load("models/rpn_data/rpn_intensity.npy") + input_dict['seg_mask'] = np.load("models/rpn_data/seg_mask.npy") + input_dict['pts_depth'] = np.load("models/rpn_data/pts_depth.npy") + for k, v in input_dict.items(): + print(k, v.shape, np.sum(np.abs(v))) + input_dict[k] = np.expand_dims(v, axis=0) + + from utils.config import cfg + cfg.RPN.LOC_XZ_FINE = True + cfg.TEST.RPN_DISTANCE_BASED_PROPOSE = False + cfg.RPN.NMS_TYPE = 'rotate' + + proposal_target_func = get_proposal_target_func(cfg) + out_dict = proposal_target_func(input_dict['seg_mask'],input_dict['rpn_features'],input_dict['gt_boxes3d'], + input_dict['rpn_xyz'],input_dict['pts_depth'],input_dict['roi_boxes3d'],input_dict['rpn_intensity']) + for key in out_dict.keys(): + print("name:{}, shape{}".format(key,out_dict[key].shape)) diff --git a/PaddleCV/Paddle3D/PointRCNN/utils/proposal_utils.py b/PaddleCV/Paddle3D/PointRCNN/utils/proposal_utils.py new file mode 100644 index 0000000000000000000000000000000000000000..9160ffe8e4e4a1aff7f8e8984e5ddd3711d1ffb0 --- /dev/null +++ b/PaddleCV/Paddle3D/PointRCNN/utils/proposal_utils.py @@ -0,0 +1,270 @@ +# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve. 
+# +#Licensed under the Apache License, Version 2.0 (the "License"); +#you may not use this file except in compliance with the License. +#You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +#Unless required by applicable law or agreed to in writing, software +#distributed under the License is distributed on an "AS IS" BASIS, +#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +#See the License for the specific language governing permissions and +#limitations under the License. +""" +Contains proposal functions +""" + +from __future__ import absolute_import +from __future__ import division +from __future__ import print_function + +import numpy as np +import paddle.fluid as fluid + +import utils.box_utils as box_utils +from utils.config import cfg + +__all__ = ["get_proposal_func"] + + +def get_proposal_func(cfg, mode='TRAIN'): + def decode_bbox_target(roi_box3d, pred_reg, anchor_size, loc_scope, + loc_bin_size, num_head_bin, get_xz_fine=True, + loc_y_scope=0.5, loc_y_bin_size=0.25, + get_y_by_bin=False, get_ry_fine=False): + per_loc_bin_num = int(loc_scope / loc_bin_size) * 2 + loc_y_bin_num = int(loc_y_scope / loc_y_bin_size) * 2 + + # recover xz localization + x_bin_l, x_bin_r = 0, per_loc_bin_num + z_bin_l, z_bin_r = per_loc_bin_num, per_loc_bin_num * 2 + start_offset = z_bin_r + + x_bin = np.argmax(pred_reg[:, x_bin_l: x_bin_r], axis=1) + z_bin = np.argmax(pred_reg[:, z_bin_l: z_bin_r], axis=1) + + pos_x = x_bin.astype('float32') * loc_bin_size + loc_bin_size / 2 - loc_scope + pos_z = z_bin.astype('float32') * loc_bin_size + loc_bin_size / 2 - loc_scope + if get_xz_fine: + x_res_l, x_res_r = per_loc_bin_num * 2, per_loc_bin_num * 3 + z_res_l, z_res_r = per_loc_bin_num * 3, per_loc_bin_num * 4 + start_offset = z_res_r + + x_res_norm = pred_reg[:, x_res_l:x_res_r][np.arange(len(x_bin)), x_bin] + z_res_norm = pred_reg[:, z_res_l:z_res_r][np.arange(len(z_bin)), z_bin] + + x_res = x_res_norm * loc_bin_size + z_res = z_res_norm * loc_bin_size + pos_x += x_res + pos_z += z_res + + # recover y localization + if get_y_by_bin: + y_bin_l, y_bin_r = start_offset, start_offset + loc_y_bin_num + y_res_l, y_res_r = y_bin_r, y_bin_r + loc_y_bin_num + start_offset = y_res_r + + y_bin = np.argmax(pred_reg[:, y_bin_l: y_bin_r], axis=1) + y_res_norm = pred_reg[:, y_res_l:y_res_r][np.arange(len(y_bin)), y_bin] + y_res = y_res_norm * loc_y_bin_size + pos_y = y_bin.astype('float32') * loc_y_bin_size + loc_y_bin_size / 2 - loc_y_scope + y_res + pos_y = pos_y + np.array(roi_box3d[:, 1]).reshape(-1) + else: + y_offset_l, y_offset_r = start_offset, start_offset + 1 + start_offset = y_offset_r + + pos_y = np.array(roi_box3d[:, 1]) + np.array(pred_reg[:, y_offset_l]) + pos_y = pos_y.reshape(-1) + + # recover ry rotation + ry_bin_l, ry_bin_r = start_offset, start_offset + num_head_bin + ry_res_l, ry_res_r = ry_bin_r, ry_bin_r + num_head_bin + + ry_bin = np.argmax(pred_reg[:, ry_bin_l: ry_bin_r], axis=1) + ry_res_norm = pred_reg[:, ry_res_l:ry_res_r][np.arange(len(ry_bin)), ry_bin] + if get_ry_fine: + # divide pi/2 into several bins + angle_per_class = (np.pi / 2) / num_head_bin + ry_res = ry_res_norm * (angle_per_class / 2) + ry = (ry_bin.astype('float32') * angle_per_class + angle_per_class / 2) + ry_res - np.pi / 4 + else: + angle_per_class = (2 * np.pi) / num_head_bin + ry_res = ry_res_norm * (angle_per_class / 2) + + # bin_center is (0, 30, 60, 90, 120, ..., 270, 300, 330) + ry = np.fmod(ry_bin.astype('float32') * angle_per_class + ry_res, 2 * 
np.pi) + ry[ry > np.pi] -= 2 * np.pi + + # recover size + size_res_l, size_res_r = ry_res_r, ry_res_r + 3 + assert size_res_r == pred_reg.shape[1] + + size_res_norm = pred_reg[:, size_res_l: size_res_r] + hwl = size_res_norm * anchor_size + anchor_size + + def rotate_pc_along_y(pc, angle): + cosa = np.cos(angle).reshape(-1, 1) + sina = np.sin(angle).reshape(-1, 1) + + R = np.concatenate([cosa, -sina, sina, cosa], axis=-1).reshape(-1, 2, 2) + pc_temp = pc[:, [0, 2]].reshape(-1, 1, 2) + pc[:, [0, 2]] = np.matmul(pc_temp, R.transpose(0, 2, 1)).reshape(-1, 2) + + return pc + + # shift to original coords + roi_center = np.array(roi_box3d[:, 0:3]) + shift_ret_box3d = np.concatenate(( + pos_x.reshape(-1, 1), + pos_y.reshape(-1, 1), + pos_z.reshape(-1, 1), + hwl, ry.reshape(-1, 1)), axis=1) + ret_box3d = shift_ret_box3d + if roi_box3d.shape[1] == 7: + roi_ry = np.array(roi_box3d[:, 6]).reshape(-1) + ret_box3d = rotate_pc_along_y(np.array(shift_ret_box3d), -roi_ry) + ret_box3d[:, 6] += roi_ry + ret_box3d[:, [0, 2]] += roi_center[:, [0, 2]] + return ret_box3d + + def distance_based_proposal(scores, proposals, sorted_idxs): + nms_range_list = [0, 40.0, 80.0] + pre_tot_top_n = cfg[mode].RPN_PRE_NMS_TOP_N + pre_top_n_list = [0, int(pre_tot_top_n * 0.7), pre_tot_top_n - int(pre_tot_top_n * 0.7)] + post_tot_top_n = cfg[mode].RPN_POST_NMS_TOP_N + post_top_n_list = [0, int(post_tot_top_n * 0.7), post_tot_top_n - int(post_tot_top_n * 0.7)] + + batch_size = scores.shape[0] + ret_proposals = np.zeros((batch_size, cfg[mode].RPN_POST_NMS_TOP_N, 7), dtype='float32') + ret_scores= np.zeros((batch_size, cfg[mode].RPN_POST_NMS_TOP_N, 1), dtype='float32') + + for b, (score, proposal, sorted_idx) in enumerate(zip(scores, proposals, sorted_idxs)): + # sort by score + score_ord = score[sorted_idx] + proposal_ord = proposal[sorted_idx] + + dist = proposal_ord[:, 2] + first_mask = (dist > nms_range_list[0]) & (dist <= nms_range_list[1]) + + scores_single_list, proposals_single_list = [], [] + for i in range(1, len(nms_range_list)): + # get proposal distance mask + dist_mask = ((dist > nms_range_list[i - 1]) & (dist <= nms_range_list[i])) + + if dist_mask.sum() != 0: + # this area has points, reduce by mask + cur_scores = score_ord[dist_mask] + cur_proposals = proposal_ord[dist_mask] + + # fetch pre nms top K + cur_scores = cur_scores[:pre_top_n_list[i]] + cur_proposals = cur_proposals[:pre_top_n_list[i]] + else: + assert i == 2, '%d' % i + # this area doesn't have any points, so use rois of first area + cur_scores = score_ord[first_mask] + cur_proposals = proposal_ord[first_mask] + + # fetch top K of first area + cur_scores = cur_scores[pre_top_n_list[i - 1]:][:pre_top_n_list[i]] + cur_proposals = cur_proposals[pre_top_n_list[i - 1]:][:pre_top_n_list[i]] + + # oriented nms + boxes_bev = box_utils.boxes3d_to_bev(cur_proposals) + s_scores, s_proposals = box_utils.box_nms( + boxes_bev, cur_scores, cur_proposals, + cfg[mode].RPN_NMS_THRESH, post_top_n_list[i], + cfg.RPN.NMS_TYPE) + if len(s_scores) > 0: + scores_single_list.append(s_scores) + proposals_single_list.append(s_proposals) + + scores_single = np.concatenate(scores_single_list, axis=0) + proposals_single = np.concatenate(proposals_single_list, axis=0) + + prop_num = proposals_single.shape[0] + ret_scores[b, :prop_num, 0] = scores_single + ret_proposals[b, :prop_num] = proposals_single + # ret_proposals.tofile("proposal.data") + # ret_scores.tofile("score.data") + return np.concatenate([ret_proposals, ret_scores], axis=-1) + + def score_based_proposal(scores, 
+    def score_based_proposal(scores, proposals, sorted_idxs):
+        batch_size = scores.shape[0]
+        ret_proposals = np.zeros((batch_size, cfg[mode].RPN_POST_NMS_TOP_N, 7), dtype='float32')
+        ret_scores = np.zeros((batch_size, cfg[mode].RPN_POST_NMS_TOP_N, 1), dtype='float32')
+        for b, (score, proposal, sorted_idx) in enumerate(zip(scores, proposals, sorted_idxs)):
+            # sort by score
+            score_ord = score[sorted_idx]
+            proposal_ord = proposal[sorted_idx]
+
+            # pre nms top K
+            cur_scores = score_ord[:cfg[mode].RPN_PRE_NMS_TOP_N]
+            cur_proposals = proposal_ord[:cfg[mode].RPN_PRE_NMS_TOP_N]
+
+            boxes_bev = box_utils.boxes3d_to_bev(cur_proposals)
+            s_scores, s_proposals = box_utils.box_nms(
+                boxes_bev, cur_scores, cur_proposals,
+                cfg[mode].RPN_NMS_THRESH,
+                cfg[mode].RPN_POST_NMS_TOP_N,
+                'rotate')
+            prop_num = len(s_proposals)
+            ret_scores[b, :prop_num, 0] = s_scores
+            ret_proposals[b, :prop_num] = s_proposals
+        # ret_proposals.tofile("proposal.data")
+        # ret_scores.tofile("score.data")
+        return np.concatenate([ret_proposals, ret_scores], axis=-1)
+
+    def generate_proposal(x):
+        # py_func passes LoDTensor inputs; convert to a numpy array once
+        # before slicing
+        x = np.array(x)
+        rpn_scores = x[:, :, 0]
+        roi_box3d = x[:, :, 1:4]
+        pred_reg = x[:, :, 4:]
+
+        proposals = decode_bbox_target(
+            roi_box3d.reshape(-1, roi_box3d.shape[-1]),
+            pred_reg.reshape(-1, pred_reg.shape[-1]),
+            anchor_size=np.array(cfg.CLS_MEAN_SIZE[0], dtype='float32'),
+            loc_scope=cfg.RPN.LOC_SCOPE,
+            loc_bin_size=cfg.RPN.LOC_BIN_SIZE,
+            num_head_bin=cfg.RPN.NUM_HEAD_BIN,
+            get_xz_fine=cfg.RPN.LOC_XZ_FINE,
+            get_y_by_bin=False,
+            get_ry_fine=False)
+        # set y as the center of the bottom face
+        proposals[:, 1] += proposals[:, 3] / 2
+        proposals = proposals.reshape(rpn_scores.shape[0], -1, proposals.shape[-1])
+
+        sorted_idxs = np.argsort(-rpn_scores, axis=-1)
+
+        if cfg.TEST.RPN_DISTANCE_BASED_PROPOSE:
+            ret = distance_based_proposal(rpn_scores, proposals, sorted_idxs)
+        else:
+            ret = score_based_proposal(rpn_scores, proposals, sorted_idxs)
+
+        return ret
+
+    return generate_proposal
+
+
+if __name__ == "__main__":
+    np.random.seed(3333)
+    x_np = np.random.random((4, 256, 84)).astype('float32')
+
+    from utils.config import cfg
+    cfg.RPN.LOC_XZ_FINE = True
+    # cfg.TEST.RPN_DISTANCE_BASED_PROPOSE = False
+    # cfg.RPN.NMS_TYPE = 'rotate'
+    proposal_func = get_proposal_func(cfg)
+
+    x = fluid.layers.data(name="x", shape=[256, 84], dtype='float32')
+    proposal = fluid.default_main_program().current_block().create_var(
+        name="proposal", dtype='float32', shape=[256, 7])
+    fluid.layers.py_func(proposal_func, x, proposal)
+    loss = fluid.layers.reduce_mean(proposal)
+
+    place = fluid.CUDAPlace(0)
+    exe = fluid.Executor(place)
+    exe.run(fluid.default_startup_program())
+    ret = exe.run(fetch_list=[proposal.name, loss.name], feed={'x': x_np})
+    print(ret)
diff --git a/PaddleCV/Paddle3D/PointRCNN/utils/pts_utils/CMakeLists.txt b/PaddleCV/Paddle3D/PointRCNN/utils/pts_utils/CMakeLists.txt
new file mode 100644
index 0000000000000000000000000000000000000000..044bbed5d020464250810601ec2dcdacdec0cd18
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/utils/pts_utils/CMakeLists.txt
@@ -0,0 +1,6 @@
+
+cmake_minimum_required(VERSION 2.8.12)
+project(pts_utils)
+
+add_subdirectory(pybind11)
+pybind11_add_module(pts_utils pts_utils.cpp)
diff --git a/PaddleCV/Paddle3D/PointRCNN/utils/pts_utils/pts_utils.cpp b/PaddleCV/Paddle3D/PointRCNN/utils/pts_utils/pts_utils.cpp
new file mode 100644
index 0000000000000000000000000000000000000000..356b02baa5288903e218c8fca1b17118ef8ea72b
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/utils/pts_utils/pts_utils.cpp
@@ -0,0 +1,62 @@
+#include <pybind11/pybind11.h>
+#include <pybind11/numpy.h>
+#include <math.h>
+
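+// pt_in_box3d tests a point against a rotated 3D box: it first coarsely
+// rejects points far from the box center, then rotates (x, z) into the
+// box's local frame so the inclusion test reduces to comparing against
+// the half-extents l / 2 and w / 2.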
+namespace py = pybind11;
+
+int pt_in_box3d(float x, float y, float z, float cx, float cy, float cz, float h, float w, float l, float cosa, float sina) {
+    // coarse reject: more than 10m away in x/z, or outside the box height
+    if ((fabsf(x - cx) > 10.) || (fabsf(y - cy) > h / 2.0) || (fabsf(z - cz) > 10.)) {
+        return 0;
+    }
+
+    float x_rot = (x - cx) * cosa + (z - cz) * (-sina);
+    float z_rot = (x - cx) * sina + (z - cz) * cosa;
+
+    int in_flag = static_cast<int>((x_rot >= -l / 2.0) & (x_rot <= l / 2.0) & (z_rot >= -w / 2.0) & (z_rot <= w / 2.0));
+    return in_flag;
+}
+
+py::array_t<int> pts_in_boxes3d(py::array_t<float> pts, py::array_t<float> boxes) {
+    py::buffer_info pts_buf = pts.request(), boxes_buf = boxes.request();
+
+    if (pts_buf.ndim != 2 || boxes_buf.ndim != 2) {
+        throw std::runtime_error("Number of dimensions must be 2");
+    }
+    if (pts_buf.shape[1] != 3) {
+        throw std::runtime_error("pts 2nd dimension must be 3");
+    }
+    if (boxes_buf.shape[1] != 7) {
+        throw std::runtime_error("boxes 2nd dimension must be 7");
+    }
+
+    auto pts_num = pts_buf.shape[0];
+    auto boxes_num = boxes_buf.shape[0];
+    auto mask = py::array_t<int>(pts_num * boxes_num);
+    py::buffer_info mask_buf = mask.request();
+
+    float *pts_ptr = (float *) pts_buf.ptr,
+          *boxes_ptr = (float *) boxes_buf.ptr;
+    int *mask_ptr = (int *) mask_buf.ptr;
+
+    for (ssize_t i = 0; i < boxes_num; i++) {
+        float cx = boxes_ptr[i * 7];
+        // boxes store the bottom-face y; shift to the vertical center
+        float cy = boxes_ptr[i * 7 + 1] - boxes_ptr[i * 7 + 3] / 2.;
+        float cz = boxes_ptr[i * 7 + 2];
+        float h = boxes_ptr[i * 7 + 3];
+        float w = boxes_ptr[i * 7 + 4];
+        float l = boxes_ptr[i * 7 + 5];
+        float angle = boxes_ptr[i * 7 + 6];
+        float cosa = cosf(angle);
+        float sina = sinf(angle);
+        for (ssize_t j = 0; j < pts_num; j++) {
+            mask_ptr[i * pts_num + j] = pt_in_box3d(pts_ptr[j * 3], pts_ptr[j * 3 + 1], pts_ptr[j * 3 + 2], cx, cy, cz, h, w, l, cosa, sina);
+        }
+    }
+
+    mask.resize({boxes_num, pts_num});
+    return mask;
+}
+
+PYBIND11_MODULE(pts_utils, m) {
+    m.def("pts_in_boxes3d", &pts_in_boxes3d, "Calculate mask for whether points in boxes3d");
+}
diff --git a/PaddleCV/Paddle3D/PointRCNN/utils/pts_utils/setup.py b/PaddleCV/Paddle3D/PointRCNN/utils/pts_utils/setup.py
new file mode 100644
index 0000000000000000000000000000000000000000..e44e80ea703c0b2b3d1938fadc3c1befadb1dad0
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/utils/pts_utils/setup.py
@@ -0,0 +1,12 @@
+from setuptools import setup
+from setuptools import Extension
+
+setup(
+    name='pts_utils',
+    ext_modules=[Extension(
+        name='pts_utils',
+        sources=['pts_utils.cpp'],
+        include_dirs=[r'../../pybind11/include'],
+        extra_compile_args=['-std=c++11']
+    )],
+)
diff --git a/PaddleCV/Paddle3D/PointRCNN/utils/pts_utils/test.py b/PaddleCV/Paddle3D/PointRCNN/utils/pts_utils/test.py
new file mode 100644
index 0000000000000000000000000000000000000000..e4e3be285e3363a2193102732f1c0d9894eb497d
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/utils/pts_utils/test.py
@@ -0,0 +1,7 @@
+import numpy as np
+import pts_utils
+
+a = np.random.random((16384, 3)).astype('float32')
+b = np.random.random((64, 7)).astype('float32')
+c = pts_utils.pts_in_boxes3d(a, b)
+print(a, b, c, c.shape, np.sum(c))
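+# Sanity check (illustrative): the returned mask should have shape
+# (64, 16384), one row of 0/1 flags per box; with uniform random boxes only
+# a small fraction of the points is expected to land inside any given box.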
diff --git a/PaddleCV/Paddle3D/PointRCNN/utils/run_utils.py b/PaddleCV/Paddle3D/PointRCNN/utils/run_utils.py
new file mode 100644
index 0000000000000000000000000000000000000000..0df37e5658f86c0cfc416e8a0185c5556bffe9f9
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/utils/run_utils.py
@@ -0,0 +1,110 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""
+Contains common utility functions.
+"""
+
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
+import sys
+import six
+import logging
+import numpy as np
+import paddle.fluid as fluid
+
+__all__ = ["check_gpu", "print_arguments", "parse_outputs", "Stat"]
+
+logger = logging.getLogger(__name__)
+
+
+def check_gpu(use_gpu):
+    """
+    Log an error and exit when use_gpu is set to True but the installed
+    PaddlePaddle is a CPU-only build.
+    """
+    err = "Config use_gpu cannot be set as True while you are " \
+          "using paddlepaddle cpu version!\nPlease try:\n" \
+          "\t1. Install paddlepaddle-gpu to run model on GPU\n" \
+          "\t2. Set --use_gpu=False to run model on CPU"
+
+    try:
+        if use_gpu and not fluid.is_compiled_with_cuda():
+            logger.error(err)
+            sys.exit(1)
+    except Exception:
+        pass
+
+
+def print_arguments(args):
+    """Print argparse's arguments.
+
+    Usage:
+
+    .. code-block:: python
+
+        parser = argparse.ArgumentParser()
+        parser.add_argument("name", default="John", type=str, help="User name.")
+        args = parser.parse_args()
+        print_arguments(args)
+
+    :param args: Input argparse.Namespace for printing.
+    :type args: argparse.Namespace
+    """
+    logger.info("----------- Configuration Arguments -----------")
+    for arg, value in sorted(six.iteritems(vars(args))):
+        logger.info("%s: %s" % (arg, value))
+    logger.info("------------------------------------------------")
+
+
+def parse_outputs(outputs, filter_key=None, extra_keys=None, prog=None):
+    keys, values = [], []
+    for k, v in outputs.items():
+        if filter_key is not None and k.find(filter_key) < 0:
+            continue
+        keys.append(k)
+        # mark variables persistable so they survive until fetched
+        v.persistable = True
+        values.append(v.name)
+
+    if prog is not None and extra_keys is not None:
+        for k in extra_keys:
+            try:
+                v = fluid.framework._get_var(k, prog)
+                keys.append(k)
+                v.persistable = True
+                values.append(v.name)
+            except Exception:
+                pass
+    return keys, values
+
+
+class Stat(object):
+    def __init__(self):
+        self.stats = {}
+
+    def update(self, keys, values):
+        for k, v in zip(keys, values):
+            if k not in self.stats:
+                self.stats[k] = []
+            self.stats[k].append(v)
+
+    def reset(self):
+        self.stats = {}
+
+    def get_mean_log(self):
+        log = ""
+        for k, v in self.stats.items():
+            log += "avg_{}: {:.4f}, ".format(k, np.mean(v))
+        return log
diff --git a/PaddleCV/Paddle3D/PointRCNN/utils/save_utils.py b/PaddleCV/Paddle3D/PointRCNN/utils/save_utils.py
new file mode 100644
index 0000000000000000000000000000000000000000..c24a89a2429bd5f45386efa1176f8c8770500120
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/utils/save_utils.py
@@ -0,0 +1,132 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import os
+import numpy as np
+from utils.config import cfg
+from utils import calibration as calib
+import utils.cyops.kitti_utils as kitti_utils
+
+__all__ = ['save_rpn_feature', 'save_kitti_result', 'save_kitti_format']
+
+
+def save_rpn_feature(rets, kitti_features_dir):
+    """
+    Save RPN features for offline RCNN training.
+    """
+
+    sample_id = rets['sample_id'][0]
+    backbone_xyz = rets['backbone_xyz'][0]
+    backbone_feature = rets['backbone_feature'][0]
+    pts_features = rets['pts_features'][0]
+    seg_mask = rets['seg_mask'][0]
+    rpn_cls = rets['rpn_cls'][0]
+
+    for i in range(len(sample_id)):
+        pts_intensity = pts_features[i, :, 0]
+        s_id = sample_id[i, 0]
+
+        output_file = os.path.join(kitti_features_dir, '%06d.npy' % s_id)
+        xyz_file = os.path.join(kitti_features_dir, '%06d_xyz.npy' % s_id)
+        seg_file = os.path.join(kitti_features_dir, '%06d_seg.npy' % s_id)
+        intensity_file = os.path.join(
+            kitti_features_dir, '%06d_intensity.npy' % s_id)
+        np.save(output_file, backbone_feature[i])
+        np.save(xyz_file, backbone_xyz[i])
+        np.save(seg_file, seg_mask[i])
+        np.save(intensity_file, pts_intensity)
+        rpn_scores_raw_file = os.path.join(
+            kitti_features_dir, '%06d_rawscore.npy' % s_id)
+        np.save(rpn_scores_raw_file, rpn_cls[i])
+
+
+def save_kitti_result(rets, seg_output_dir, kitti_output_dir, reader, classes):
+    sample_id = rets['sample_id'][0]
+    roi_scores_row = rets['roi_scores_row'][0]
+    bboxes3d = rets['rois'][0]
+    pts_rect = rets['pts_rect'][0]
+    seg_mask = rets['seg_mask'][0]
+    rpn_cls_label = rets['rpn_cls_label'][0]
+    gt_boxes3d = rets['gt_boxes3d'][0]
+    gt_boxes3d_num = rets['gt_boxes3d'][1]
+
+    for i in range(len(sample_id)):
+        s_id = sample_id[i, 0]
+
+        seg_result_data = np.concatenate((pts_rect[i].reshape(-1, 3),
+                                          rpn_cls_label[i].reshape(-1, 1),
+                                          seg_mask[i].reshape(-1, 1)),
+                                         axis=1).astype('float16')
+        seg_output_file = os.path.join(seg_output_dir, '%06d.npy' % s_id)
+        np.save(seg_output_file, seg_result_data)
+
+        scores = roi_scores_row[i, :]
+        bbox3d = bboxes3d[i, :]
+        img_shape = reader.get_image_shape(s_id)
+        calib = reader.get_calib(s_id)
+
+        corners3d = kitti_utils.boxes3d_to_corners3d(bbox3d)
+        img_boxes, _ = calib.corners3d_to_img_boxes(corners3d)
+
+        # clip projected boxes to the image bounds
+        img_boxes[:, 0] = np.clip(img_boxes[:, 0], 0, img_shape[1] - 1)
+        img_boxes[:, 1] = np.clip(img_boxes[:, 1], 0, img_shape[0] - 1)
+        img_boxes[:, 2] = np.clip(img_boxes[:, 2], 0, img_shape[1] - 1)
+        img_boxes[:, 3] = np.clip(img_boxes[:, 3], 0, img_shape[0] - 1)
+
+        img_boxes_w = img_boxes[:, 2] - img_boxes[:, 0]
+        img_boxes_h = img_boxes[:, 3] - img_boxes[:, 1]
+        # drop boxes that cover most of the image (likely projection artifacts)
+        box_valid_mask = np.logical_and(
+            img_boxes_w < img_shape[1] * 0.8, img_boxes_h < img_shape[0] * 0.8)
+
+        kitti_output_file = os.path.join(kitti_output_dir, '%06d.txt' % s_id)
+        with open(kitti_output_file, 'w') as f:
+            for k in range(bbox3d.shape[0]):
+                if box_valid_mask[k] == 0:
+                    continue
+                x, z, ry = bbox3d[k, 0], bbox3d[k, 2], bbox3d[k, 6]
+                beta = np.arctan2(z, x)
+                alpha = -np.sign(beta) * np.pi / 2 + beta + ry
+
+                f.write('{} -1 -1 {:.4f} {:.4f} {:.4f} {:.4f} {:.4f} {:.4f} {:.4f} {:.4f} {:.4f} {:.4f} {:.4f} {:.4f} {:.4f}\n'.format(
+                    classes, alpha, img_boxes[k, 0], img_boxes[k, 1], img_boxes[k, 2], img_boxes[k, 3],
+                    bbox3d[k, 3], bbox3d[k, 4], bbox3d[k, 5], bbox3d[k, 0], bbox3d[k, 1], bbox3d[k, 2],
+                    bbox3d[k, 6], scores[k]))
+
+
+def save_kitti_format(sample_id, calib, bbox3d, kitti_output_dir, scores, img_shape):
+    corners3d = kitti_utils.boxes3d_to_corners3d(bbox3d)
+    img_boxes, _ = calib.corners3d_to_img_boxes(corners3d)
+    img_boxes[:, 0] = np.clip(img_boxes[:, 0], 0, img_shape[1] - 1)
+    img_boxes[:, 1] = np.clip(img_boxes[:, 1], 0, img_shape[0] - 1)
+    img_boxes[:, 2] = np.clip(img_boxes[:, 2], 0, img_shape[1] - 1)
+    img_boxes[:, 3] = np.clip(img_boxes[:, 3], 0, img_shape[0] - 1)
+
+    img_boxes_w = img_boxes[:, 2] - img_boxes[:, 0]
+    img_boxes_h = img_boxes[:, 3] - img_boxes[:, 1]
+    box_valid_mask = np.logical_and(img_boxes_w < img_shape[1] * 0.8, img_boxes_h < img_shape[0] * 0.8)
+
+    kitti_output_file = os.path.join(kitti_output_dir, '%06d.txt' % sample_id)
+    with open(kitti_output_file, 'w') as f:
+        for k in range(bbox3d.shape[0]):
+            if box_valid_mask[k] == 0:
+                continue
+            x, z, ry = bbox3d[k, 0], bbox3d[k, 2], bbox3d[k, 6]
+            beta = np.arctan2(z, x)
+            alpha = -np.sign(beta) * np.pi / 2 + beta + ry
+
+            f.write('%s -1 -1 %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f\n' %
+                    (cfg.CLASSES, alpha, img_boxes[k, 0], img_boxes[k, 1], img_boxes[k, 2], img_boxes[k, 3],
+                     bbox3d[k, 3], bbox3d[k, 4], bbox3d[k, 5], bbox3d[k, 0], bbox3d[k, 1], bbox3d[k, 2],
+                     bbox3d[k, 6], scores[k]))
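+# Each line written above follows the KITTI result format:
+#   type truncated occluded alpha bbox(left, top, right, bottom)
+#   dimensions(h, w, l) location(x, y, z) rotation_y score
+# truncated and occluded are fixed to -1 since they are not predicted.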