Unverified commit 9b67ef53, authored by Kaipeng Deng, committed via GitHub

Add PointRCNN model (#3967)

* add PointRCNN model
   by heavengate, FDInSky, tink2123

Parent commit: 3584aeec
*log*
checkpoints*
build
output
result_dir
pp_pointrcnn*
data/gt_database
utils/pts_utils/dist
utils/pts_utils/build
utils/pts_utils/pts_utils.egg-info
utils/cyops/*.c
utils/cyops/*.so
ext_op/src/*.o
ext_op/src/*.so
# PointRCNN: 3D Object Detection Model
---
## Contents
- [Introduction](#introduction)
- [Quick Start](#quick-start)
- [References](#references)
- [Release Notes](#release-notes)
## Introduction
[PointRCNN](https://arxiv.org/abs/1812.04244), proposed by Shaoshuai Shi, Xiaogang Wang, and Hongsheng Li, is the first two-stage 3D object detector that works directly on raw point clouds. The first stage uses PointNet++ with MSG (Multi-scale Grouping) as the backbone to segment the raw point cloud into foreground and background points, and generates bounding box proposals from the foreground points. The second stage filters and refines the generated bounding boxes in a canonical coordinate system. The model also proposes a bin-based scheme that turns box regression into a classification problem, and shows that it is effective for 3D bounding-box regression. PointRCNN is evaluated on the KITTI dataset and, at the time of publication, achieved the best performance on the KITTI 3D object detection leaderboard.
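For intuition, bin-based localization classifies a coordinate offset into one of several discrete bins over a fixed scope, then refines it with a small regressed residual inside the chosen bin. The snippet below is a minimal illustrative sketch only, not this repo's exact implementation; `loc_scope` and `bin_size` mirror the `LOC_SCOPE` and `LOC_BIN_SIZE` entries in `cfgs/default.yml`.
```
import numpy as np

def encode_bin_target(delta, loc_scope=3.0, bin_size=0.5):
    """Encode a 1-D offset (gt - anchor) as a bin id plus a normalized residual.

    Minimal sketch of bin-based localization: the scope [-loc_scope, loc_scope)
    is split into bins; the network classifies the bin and regresses the
    residual inside it (values mirror RPN.LOC_SCOPE/LOC_BIN_SIZE in the config).
    """
    delta = np.clip(delta, -loc_scope, loc_scope - 1e-3)
    bin_id = int(np.floor((delta + loc_scope) / bin_size))
    # residual relative to the bin center, normalized to [-0.5, 0.5)
    residual = (delta + loc_scope - (bin_id + 0.5) * bin_size) / bin_size
    return bin_id, residual

print(encode_bin_target(1.3))  # -> (8, ~0.1): bin 8 of 12 plus a small residual
```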
The network architecture is shown below:
<p align="center">
<img src="images/teaser.png" height=300 width=800 hspace='10'/> <br />
PointRCNN architecture for 3D object detection on point clouds
</p>
**Note:** The PointRCNN model depends on custom C++ operators, which currently can only be compiled for GPU devices on Linux/Unix systems. This model **cannot run on Windows or on CPU-only devices**.
## Quick Start
### Installation
**Install [PaddlePaddle](https://github.com/PaddlePaddle/Paddle):**
Running the sample code in this directory requires the PaddlePaddle Fluid [develop daily build](https://www.paddlepaddle.org.cn/install/doc/tables#多版本whl包列表-dev-11), or PaddlePaddle compiled from source on the [develop branch](https://github.com/PaddlePaddle/Paddle/tree/develop).
To keep the custom operators compatible with your Paddle version, we recommend **compiling Paddle from source**; see [Compile from Source](https://www.paddlepaddle.org.cn/install/doc/source/ubuntu) for instructions.
**Install PointRCNN:**
1. Download the [PaddlePaddle/models](https://github.com/PaddlePaddle/models) repository
Download the Paddle models repository with:
```
git clone https://github.com/PaddlePaddle/models
```
2. Download [pybind11](https://github.com/pybind/pybind11) into the `PaddleCV/Paddle3D/PointRCNN` directory
`pts_utils` depends on `pybind11` for compilation, so the `pybind11` repository must be cloned into `PaddleCV/Paddle3D/PointRCNN` first:
```
cd PaddleCV/Paddle3D/PointRCNN
git clone https://github.com/pybind/pybind11
```
3. Build and install the `pts_utils`, `kitti_utils`, `roipool3d_utils`, and `iou_utils` modules
Build and install these modules with:
```
sh build_and_install.sh
```
4. Install the Python dependencies
Install the Python dependencies with:
```
pip install -r requirement.txt
```
**Note:** The KITTI mAP evaluation tool requires Python 3.6 or later, and the Python 3 environment must have `scikit-image`, `Numba`, and `fire` installed.
The `scikit-image`, `Numba`, and `fire` entries in `requirement.txt` are exactly the dependencies required by the KITTI mAP evaluation tool.
### Compiling the custom operators
Make sure your Paddle installation is the PaddlePaddle Fluid develop daily build, or was compiled from source on the Paddle develop branch; **compiling from source is recommended**.
Compile the custom operators as follows:
Enter the `ext_op/src` directory and run the build script:
```
cd ext_op/src
sh make.sh
```
On success, `pointnet2_lib.so` is generated under the `ext_op/src` directory.
Run the following to verify that the custom operators compiled correctly:
```
# add the Paddle dynamic library path to LD_LIBRARY_PATH
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:`python -c 'import paddle; print(paddle.sysconfig.get_lib())'`
# go back to the ext_op directory and extend PYTHONPATH
cd ..
export PYTHONPATH=$PYTHONPATH:`pwd`
# run the unit tests
python tests/test_farthest_point_sampling_op.py
python tests/test_gather_point_op.py
python tests/test_group_points_op.py
python tests/test_query_ball_op.py
python tests/test_three_interp_op.py
python tests/test_three_nn_op.py
```
A successful test run prints a message like the following:
```
.
----------------------------------------------------------------------
Ran 1 test in 13.205s
OK
```
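If the unit tests pass, you can additionally verify that the shared library loads into Paddle. This is a minimal sketch assuming the Paddle 1.x Fluid API (`fluid.load_op_library`) used with this repo's develop-branch dependency, run from the `ext_op` directory:
```
import paddle.fluid as fluid

# load the custom operator library built by ext_op/src/make.sh
# (path assumes the current working directory is ext_op/)
fluid.load_op_library('src/pointnet2_lib.so')
print('pointnet2_lib.so loaded successfully')
```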
**Note:** The custom operators here are compiled in the same way as in [PointNet++](../PointNet++); for more details on compiling custom operators, see [Custom Operator Compilation](../PointNet++/ext_op/README.md).
### Data preparation
**KITTI 3D object detection dataset:**
PointRCNN is trained on the [KITTI 3D object detection](http://www.cvlibs.net/datasets/kitti/eval_object.php?obj_benchmark=3d) dataset.
The dataset can be downloaded as follows:
```
cd data/KITTI/object
sh download.sh
```
The images here are used for visualization only. Training additionally uses the [road planes](https://drive.google.com/file/d/1d5mq0RXRnvHPVeKx6Q612z0YRO1t2wAp/view?usp=sharing) data for data augmentation;
please download it and extract it into the `./data/KITTI/object/training` directory.
The data directory layout is as follows:
```
PointRCNN
├── data
│ ├── KITTI
│ │ ├── ImageSets
│ │ ├── object
│ │ │ ├──training
│ │ │ │ ├──calib & velodyne & label_2 & image_2 & planes
│ │ │ ├──testing
│ │ │ │ ├──calib & velodyne & image_2
```
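The following sketch (paths taken from the tree above) can be used to sanity-check the layout before training:
```
import os

# minimal sanity check for the KITTI layout expected by this repo
data_root = 'data/KITTI/object'
splits = [('training', ['calib', 'velodyne', 'label_2', 'image_2', 'planes']),
          ('testing',  ['calib', 'velodyne', 'image_2'])]
for split, subdirs in splits:
    for d in subdirs:
        path = os.path.join(data_root, split, d)
        print('%-45s %s' % (path, 'OK' if os.path.isdir(path) else 'MISSING'))
```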
### Training
**PointRCNN model:**
Training of the PointRCNN model can be started as follows:
1. Select a single GPU and set the dynamic library path
```
# train on a single GPU
export CUDA_VISIBLE_DEVICES=0
# add the Paddle dynamic library path to LD_LIBRARY_PATH
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:`python -c 'import paddle; print(paddle.sysconfig.get_lib())'`
```
2. Generate the ground-truth sampling database (a quick way to inspect the result is sketched after the command):
```
python tools/generate_gt_database.py --class_name 'Car' --split train
```
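The generated database is a pickle file under `data/gt_database` holding per-object point crops that are later pasted into training scenes (see `KittiRCNNReader.apply_gt_aug_to_one_scene`). A minimal inspection sketch; the file name below is an assumption, use the path reported by `generate_gt_database.py`:
```
import pickle

# the file name is a placeholder; use the one written by generate_gt_database.py
with open('data/gt_database/train_gt_database.pkl', 'rb') as f:
    gt_database = pickle.load(f)
print('number of sampled objects:', len(gt_database))
# each entry holds the cropped object points, their intensity, the 3D box and the label object
print(sorted(gt_database[0].keys()))  # expected: ['gt_box3d', 'intensity', 'obj', 'points']
```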
3. Train the RPN model
```
python train.py --cfg=./cfgs/default.yml \
--train_mode=rpn \
--batch_size=16 \
--epoch=200 \
--save_dir=checkpoints
```
RPN checkpoints are saved under the `checkpoints/rpn` directory by default; the location can be changed with `--save_dir`.
4. Generate augmented offline scene data, and save the RPN model's output features and ROIs for offline RCNN training
Generate the augmented offline scene data with:
```
python tools/generate_aug_scene.py --class_name 'Car' --split train --aug_times 4
```
Then save the RPN model's output features and ROIs for the offline augmented data. Use `--ckpt_dir` to point to the final RPN weights, which are saved under `checkpoints/rpn` by default.
When saving the output features and ROIs, set `TEST.SPLIT` to `train_aug`, `TEST.RPN_POST_NMS_TOP_N` to `300`, and `TEST.RPN_NMS_THRESH` to `0.85`.
Use `--output_dir` to choose where the output features and ROIs are saved; the default is the `./output` directory.
```
python eval.py --cfg=cfgs/default.yml \
--eval_mode=rpn \
--ckpt_dir=./checkpoints/rpn/199 \
--save_rpn_feature \
--output_dir=output \
--set TEST.SPLIT train_aug TEST.RPN_POST_NMS_TOP_N 300 TEST.RPN_NMS_THRESH 0.85
```
The data saved under `--output_dir` is laid out as follows (a sketch after the tree shows how to load these files):
```
output
├── detections
│ │ ├── data # ROI data
│ │ ├── 000000.txt
│ │ ├── 000003.txt
│ │ ├── ...
├── features # output features
│ ├── 000000_intensity.npy
│ ├── 000000.npy
│ ├── 000000_rawscore.npy
│ ├── 000000_seg.npy
│ ├── 000000_xyz.npy
│ ├── ...
├── seg_result # semantic segmentation results
│ ├── 000000.npy
│ ├── 000003.npy
│ ├── ...
```
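These `.npy` files are exactly what `KittiRCNNReader.get_rpn_features` reads back during offline RCNN training, and they can be inspected directly; the shapes in the comments are indicative only:
```
import numpy as np

# inspect the RPN outputs saved for one sample (shapes indicative only)
xyz       = np.load('output/features/000000_xyz.npy')        # point coordinates, (N, 3)
feature   = np.load('output/features/000000.npy')            # backbone features, (N, C)
intensity = np.load('output/features/000000_intensity.npy')  # reflectance, (N,)
seg       = np.load('output/features/000000_seg.npy')        # foreground scores, (N,)
print(xyz.shape, feature.shape, intensity.shape, seg.shape)
```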
5. Train the RCNN model offline, using `--rcnn_training_roi_dir` and `--rcnn_training_feature_dir` to point to the output features and ROIs saved by the RPN model.
```
python train.py --cfg=./cfgs/default.yml \
--train_mode=rcnn_offline \
--batch_size=4 \
--epoch=30 \
--save_dir=checkpoints \
--rcnn_training_roi_dir=output/detections/data \
--rcnn_training_feature_dir=output/features
```
RCNN checkpoints are saved under the `checkpoints/rcnn` directory by default; the location can be changed with `--save_dir`.
**Note**: The best model is obtained by saving the RPN output features and ROIs and training the RCNN model on offline-augmented data; currently this is the only mode supported by default.
### Evaluation
**PointRCNN model:**
Evaluation of the PointRCNN model can be started as follows:
1. Select a single GPU and set the dynamic library path
```
# evaluate on a single GPU
export CUDA_VISIBLE_DEVICES=0
# add the Paddle dynamic library path to LD_LIBRARY_PATH
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:`python -c 'import paddle; print(paddle.sysconfig.get_lib())'`
```
2. Save the RPN model's output features and ROIs for the evaluation data
Save the RPN model's output features and ROIs for the evaluation data with the command below. Use `--ckpt_dir` to point to the final RPN weights, which are saved under `checkpoints/rpn` by default.
Use `--output_dir` to choose where the output features and ROIs are saved; the default is the `./output` directory.
```
python eval.py --cfg=cfgs/default.yml \
--eval_mode=rpn \
--ckpt_dir=./checkpoints/rpn/199 \
--save_rpn_feature \
--output_dir=output/val
```
The directory layout of the features and ROIs saved for the evaluation data is the same as the layout described above for the offline augmented data.
3. Evaluate the offline RCNN model
Evaluate the offline RCNN model with:
```
python eval.py --cfg=cfgs/default.yml \
--eval_mode=rcnn_offline \
--ckpt_dir=./checkpoints/rcnn_offline/29 \
--rcnn_eval_roi_dir=output/val/detections/data \
--rcnn_eval_feature_dir=output/val/features \
--save_result
```
The final detection results are saved in the `final_result` folder under the `./result_dir` directory; passing `--save_result` additionally saves the `roi_output` and `refine_output` result files.
The `result_dir` layout is as follows (a parsing sketch follows the tree):
```
result_dir
├── final_result
│ │ ├── data # final detection results
│ │ ├── 000001.txt
│ │ ├── 000002.txt
│ │ ├── ...
├── roi_output
│ ├── data # detection ROIs output by the RCNN model
│ │ ├── 000001.txt
│ │ ├── 000002.txt
│ │ ├── ...
├── refine_output
│ ├── data # decoded detection results
│ │ ├── 000001.txt
│ │ ├── 000002.txt
│ │ ├── ...
```
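The `.txt` files use the standard KITTI detection format (object type, truncation, occlusion, alpha, 2D box, 3D height/width/length, 3D location, rotation_y, score), so they can be parsed with a few lines; a minimal sketch:
```
def parse_kitti_det(line):
    """Parse one line of a KITTI-format detection result file."""
    v = line.strip().split(' ')
    return {
        'type': v[0],
        'alpha': float(v[3]),
        'bbox_2d': [float(x) for x in v[4:8]],   # x1, y1, x2, y2
        'hwl': [float(x) for x in v[8:11]],      # height, width, length
        'xyz': [float(x) for x in v[11:14]],     # location in rect camera coords
        'ry': float(v[14]),                      # rotation around the y axis
        'score': float(v[15]),
    }

with open('result_dir/final_result/data/000001.txt') as f:
    dets = [parse_kitti_det(l) for l in f if l.strip()]
```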
4. Obtain evaluation results with the KITTI mAP tool
If evaluation runs under Python 3.6 or later, the KITTI mAP evaluation is executed automatically. If your Python version is below 3.6,
run the evaluation separately with a suitable Python version, since the KITTI mAP tool requires Python 3.6+:
```
python3 kitti_map.py
```
Evaluation results with the final trained weights ([RPN model](https://paddlemodels.bj.bcebos.com/Paddle3D/pointrcnn_rpn.tar) and [RCNN model](https://paddlemodels.bj.bcebos.com/Paddle3D/pointrcnn_rcnn_offline.tar)) are as follows:
| Car AP@ | 0.70(easy) | 0.70(moderate) | 0.70(hard) |
| :------- | :--------: | :------------: | :--------: |
| bbox AP: | 90.20 | 88.85 | 88.59 |
| bev AP: | 89.50 | 86.97 | 85.58 |
| 3d AP: | 86.66 | 76.65 | 75.90 |
| aos AP: | 90.10 | 88.64 | 88.26 |
## References
- [PointRCNN: 3D Object Proposal Generation and Detection From Point Cloud](https://arxiv.org/abs/1812.04244), Shaoshuai Shi, Xiaogang Wang, Hongsheng Li.
- [PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space](https://arxiv.org/abs/1706.02413), Charles R. Qi, Li Yi, Hao Su, Leonidas J. Guibas.
- [PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation](https://www.semanticscholar.org/paper/PointNet%3A-Deep-Learning-on-Point-Sets-for-3D-and-Qi-Su/d997beefc0922d97202789d2ac307c55c2c52fba), Charles Ruizhongtai Qi, Hao Su, Kaichun Mo, Leonidas J. Guibas.
## Release Notes
- 11/2019: Added the PointRCNN model.
# compile cyops
python utils/cyops/setup.py develop
# compile and install pts_utils
cd utils/pts_utils
python setup.py install
cd ../..
# This config is based on https://github.com/sshaoshuai/PointRCNN/blob/master/tools/cfgs/default.yaml
CLASSES: Car
INCLUDE_SIMILAR_TYPE: True
# config of augmentation
AUG_DATA: True
AUG_METHOD_LIST: ['rotation', 'scaling', 'flip']
AUG_METHOD_PROB: [1.0, 1.0, 0.5]
AUG_ROT_RANGE: 18
GT_AUG_ENABLED: True
GT_EXTRA_NUM: 15
GT_AUG_RAND_NUM: True
GT_AUG_APPLY_PROB: 1.0
GT_AUG_HARD_RATIO: 0.6
PC_REDUCE_BY_RANGE: True
PC_AREA_SCOPE: [[-40, 40], [-1, 3], [0, 70.4]] # x, y, z scope in rect camera coords
CLS_MEAN_SIZE: [[1.52563191462, 1.62856739989, 3.88311640418]]
# 1. config of rpn network
RPN:
ENABLED: True
FIXED: False
# config of input
USE_INTENSITY: False
# config of bin-based loss
LOC_XZ_FINE: True
LOC_SCOPE: 3.0
LOC_BIN_SIZE: 0.5
NUM_HEAD_BIN: 12
# config of network structure
BACKBONE: pointnet2_msg
USE_BN: True
NUM_POINTS: 16384
SA_CONFIG:
NPOINTS: [4096, 1024, 256, 64]
RADIUS: [[0.1, 0.5], [0.5, 1.0], [1.0, 2.0], [2.0, 4.0]]
NSAMPLE: [[16, 32], [16, 32], [16, 32], [16, 32]]
MLPS: [[[16, 16, 32], [32, 32, 64]],
[[64, 64, 128], [64, 96, 128]],
[[128, 196, 256], [128, 196, 256]],
[[256, 256, 512], [256, 384, 512]]]
FP_MLPS: [[128, 128], [256, 256], [512, 512], [512, 512]]
CLS_FC: [128]
REG_FC: [128]
DP_RATIO: 0.5
# config of training
LOSS_CLS: SigmoidFocalLoss
FG_WEIGHT: 15
FOCAL_ALPHA: [0.25, 0.75]
FOCAL_GAMMA: 2.0
REG_LOSS_WEIGHT: [1.0, 1.0, 1.0, 1.0]
LOSS_WEIGHT: [1.0, 1.0]
NMS_TYPE: normal
# config of testing
SCORE_THRESH: 0.3
# 2. config of rcnn network
RCNN:
ENABLED: True
# config of input
ROI_SAMPLE_JIT: False
REG_AUG_METHOD: multiple # multiple, single, normal
ROI_FG_AUG_TIMES: 10
USE_RPN_FEATURES: True
USE_MASK: True
MASK_TYPE: seg
USE_INTENSITY: False
USE_DEPTH: True
USE_SEG_SCORE: False
POOL_EXTRA_WIDTH: 1.0
# config of bin-based loss
LOC_SCOPE: 1.5
LOC_BIN_SIZE: 0.5
NUM_HEAD_BIN: 9
LOC_Y_BY_BIN: False
LOC_Y_SCOPE: 0.5
LOC_Y_BIN_SIZE: 0.25
SIZE_RES_ON_ROI: False
# config of network structure
USE_BN: False
DP_RATIO: 0.0
BACKBONE: pointnet # pointnet
XYZ_UP_LAYER: [128, 128]
NUM_POINTS: 512
SA_CONFIG:
NPOINTS: [128, 32, -1]
RADIUS: [0.2, 0.4, 100]
NSAMPLE: [64, 64, 64]
MLPS: [[128, 128, 128],
[128, 128, 256],
[256, 256, 512]]
CLS_FC: [256, 256]
REG_FC: [256, 256]
# config of training
LOSS_CLS: BinaryCrossEntropy
FOCAL_ALPHA: [0.25, 0.75]
FOCAL_GAMMA: 2.0
CLS_WEIGHT: [1.0, 1.0, 1.0]
CLS_FG_THRESH: 0.6
CLS_BG_THRESH: 0.45
CLS_BG_THRESH_LO: 0.05
REG_FG_THRESH: 0.55
FG_RATIO: 0.5
ROI_PER_IMAGE: 64
HARD_BG_RATIO: 0.8
# config of testing
SCORE_THRESH: 0.3
NMS_THRESH: 0.1
# general training config
TRAIN:
SPLIT: train
VAL_SPLIT: smallval
LR: 0.002
LR_CLIP: 0.00001
LR_DECAY: 0.5
DECAY_STEP_LIST: [100, 150, 180, 200]
LR_WARMUP: True
WARMUP_MIN: 0.0002
WARMUP_EPOCH: 1
BN_MOMENTUM: 0.1
BN_DECAY: 0.5
BNM_CLIP: 0.01
BN_DECAY_STEP_LIST: [1000]
OPTIMIZER: adam # adam, adam_onecycle
WEIGHT_DECAY: 0.001 # L2 regularization
MOMENTUM: 0.9
MOMS: [0.95, 0.85]
DIV_FACTOR: 10.0
PCT_START: 0.4
GRAD_NORM_CLIP: 1.0
RPN_PRE_NMS_TOP_N: 9000
RPN_POST_NMS_TOP_N: 512
RPN_NMS_THRESH: 0.85
RPN_DISTANCE_BASED_PROPOSE: True
TEST:
SPLIT: val
RPN_PRE_NMS_TOP_N: 9000
RPN_POST_NMS_TOP_N: 100
RPN_NMS_THRESH: 0.8
RPN_DISTANCE_BASED_PROPOSE: True
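This config is consumed through `utils/config.py`, and individual keys can be overridden from the command line via `--set` (see `eval.py` below). A minimal sketch of how the config is loaded, based on the `load_config`/`set_config_from_list` calls used by the training and evaluation scripts:
```
from utils.config import cfg, load_config, set_config_from_list

# load the YAML config, then override keys the same way `--set` does
load_config('cfgs/default.yml')
set_config_from_list(['TEST.SPLIT', 'train_aug', 'TEST.RPN_NMS_THRESH', '0.85'])
print(cfg.RPN.NUM_POINTS, cfg.TEST.RPN_NMS_THRESH)
```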
DIR="$( cd "$(dirname "$0")" ; pwd -P )"
cd "$DIR"
echo "Downloading https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_velodyne.zip"
wget https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_velodyne.zip
echo "https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_image_2.zip"
wget https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_image_2.zip
echo "https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_calib.zip"
wget https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_calib.zip
echo "https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_label_2.zip"
wget https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_label_2.zip
echo "Decompressing data_object_velodyne.zip"
unzip data_object_velodyne.zip
echo "Decompressing data_object_image_2.zip"
unzip "data_object_image_2.zip"
echo "Decompressing data_object_calib.zip"
unzip data_object_calib.zip
echo "Decompressing data_object_label_2.zip"
unzip data_object_label_2.zip
echo "Download KITTI ImageSets"
wget https://paddlemodels.bj.bcebos.com/Paddle3D/pointrcnn_kitti_imagesets.tar
tar xf pointrcnn_kitti_imagesets.tar
mv ImageSets ..
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
This code is based on https://github.com/sshaoshuai/PointRCNN/blob/master/lib/datasets/kitti_dataset.py
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import os
import cv2
import numpy as np
import utils.calibration as calibration
from utils.object3d import get_objects_from_label
from PIL import Image
__all__ = ["KittiDataset"]
class KittiDataset(object):
def __init__(self, data_dir, split='train'):
assert split in ['train', 'train_aug', 'val', 'test'], "unknown split {}".format(split)
self.split = split
self.is_test = self.split == 'test'
self.imageset_dir = os.path.join(data_dir, 'KITTI', 'object', 'testing' if self.is_test else 'training')
split_dir = os.path.join(data_dir, 'KITTI', 'ImageSets', split + '.txt')
self.image_idx_list = [x.strip() for x in open(split_dir).readlines()]
self.num_sample = self.image_idx_list.__len__()
self.image_dir = os.path.join(self.imageset_dir, 'image_2')
self.lidar_dir = os.path.join(self.imageset_dir, 'velodyne')
self.calib_dir = os.path.join(self.imageset_dir, 'calib')
self.label_dir = os.path.join(self.imageset_dir, 'label_2')
self.plane_dir = os.path.join(self.imageset_dir, 'planes')
def get_image(self, idx):
img_file = os.path.join(self.image_dir, '%06d.png' % idx)
assert os.path.exists(img_file)
return cv2.imread(img_file) # (H, W, 3) BGR mode
def get_image_shape(self, idx):
img_file = os.path.join(self.image_dir, '%06d.png' % idx)
assert os.path.exists(img_file)
im = Image.open(img_file)
width, height = im.size
return height, width, 3
def get_lidar(self, idx):
lidar_file = os.path.join(self.lidar_dir, '%06d.bin' % idx)
assert os.path.exists(lidar_file)
return np.fromfile(lidar_file, dtype=np.float32).reshape(-1, 4)
def get_calib(self, idx):
calib_file = os.path.join(self.calib_dir, '%06d.txt' % idx)
assert os.path.exists(calib_file)
return calibration.Calibration(calib_file)
def get_label(self, idx):
label_file = os.path.join(self.label_dir, '%06d.txt' % idx)
assert os.path.exists(label_file)
# return kitti_utils.get_objects_from_label(label_file)
return get_objects_from_label(label_file)
def get_road_plane(self, idx):
plane_file = os.path.join(self.plane_dir, '%06d.txt' % idx)
with open(plane_file, 'r') as f:
lines = f.readlines()
lines = [float(i) for i in lines[3].split()]
plane = np.asarray(lines)
# Ensure normal is always facing up, this is in the rectified camera coordinate
if plane[1] > 0:
plane = -plane
norm = np.linalg.norm(plane[0:3])
plane = plane / norm
return plane
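For reference, a minimal usage sketch of `KittiDataset` (assumes the data layout from the README above; `calib.lidar_to_rect` is the same call the reader uses):
```
# minimal usage sketch, assuming the KITTI layout described in the README
dataset = KittiDataset(data_dir='./data', split='train')
print('samples:', dataset.num_sample)
idx = int(dataset.image_idx_list[0])
pts = dataset.get_lidar(idx)                  # (N, 4): x, y, z, intensity
calib = dataset.get_calib(idx)
pts_rect = calib.lidar_to_rect(pts[:, 0:3])   # to rectified camera coordinates
```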
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
This code is based on https://github.com/sshaoshuai/PointRCNN/blob/master/lib/datasets/kitti_rcnn_dataset.py
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import os
import logging
import multiprocessing
import numpy as np
import scipy
from scipy.spatial import Delaunay
try:
import cPickle as pickle
except:
import pickle
import pts_utils
import utils.cyops.kitti_utils as kitti_utils
import utils.cyops.roipool3d_utils as roipool3d_utils
from data.kitti_dataset import KittiDataset
from utils.config import cfg
from collections import OrderedDict
__all__ = ["KittiRCNNReader"]
logger = logging.getLogger(__name__)
def has_empty(data):
for d in data:
if isinstance(d, np.ndarray) and len(d) == 0:
return True
return False
def in_hull(p, hull):
"""
:param p: (N, K) test points
:param hull: (M, K) M corners of a box
:return (N) bool
"""
try:
if not isinstance(hull, Delaunay):
hull = Delaunay(hull)
flag = hull.find_simplex(p) >= 0
except scipy.spatial.qhull.QhullError:
logger.debug('Warning: not a hull.')
        flag = np.zeros(p.shape[0], dtype=np.bool_)
return flag
class KittiRCNNReader(KittiDataset):
def __init__(self, data_dir, npoints=16384, split='train', classes='Car', mode='TRAIN',
random_select=True, rcnn_training_roi_dir=None, rcnn_training_feature_dir=None,
rcnn_eval_roi_dir=None, rcnn_eval_feature_dir=None, gt_database_dir=None):
super(KittiRCNNReader, self).__init__(data_dir=data_dir, split=split)
if classes == 'Car':
self.classes = ('Background', 'Car')
aug_scene_data_dir = os.path.join(data_dir, 'KITTI', 'aug_scene')
elif classes == 'People':
self.classes = ('Background', 'Pedestrian', 'Cyclist')
elif classes == 'Pedestrian':
self.classes = ('Background', 'Pedestrian')
aug_scene_data_dir = os.path.join(data_dir, 'KITTI', 'aug_scene_ped')
elif classes == 'Cyclist':
self.classes = ('Background', 'Cyclist')
aug_scene_data_dir = os.path.join(data_dir, 'KITTI', 'aug_scene_cyclist')
else:
assert False, "Invalid classes: %s" % classes
self.num_classes = len(self.classes)
self.npoints = npoints
self.sample_id_list = []
self.random_select = random_select
        # both 'train_aug' and the other splits read augmented labels and points
        # from the same directories
        self.aug_label_dir = os.path.join(aug_scene_data_dir, 'training', 'aug_label')
        self.aug_pts_dir = os.path.join(aug_scene_data_dir, 'training', 'rectified_data')
# for rcnn training
self.rcnn_training_bbox_list = []
self.rpn_feature_list = {}
self.pos_bbox_list = []
self.neg_bbox_list = []
self.far_neg_bbox_list = []
self.rcnn_eval_roi_dir = rcnn_eval_roi_dir
self.rcnn_eval_feature_dir = rcnn_eval_feature_dir
self.rcnn_training_roi_dir = rcnn_training_roi_dir
self.rcnn_training_feature_dir = rcnn_training_feature_dir
self.gt_database = None
if not self.random_select:
logger.warning('random select is False')
assert mode in ['TRAIN', 'EVAL', 'TEST'], 'Invalid mode: %s' % mode
self.mode = mode
if cfg.RPN.ENABLED:
if gt_database_dir is not None:
self.gt_database = pickle.load(open(gt_database_dir, 'rb'))
if cfg.GT_AUG_HARD_RATIO > 0:
easy_list, hard_list = [], []
for k in range(self.gt_database.__len__()):
obj = self.gt_database[k]
if obj['points'].shape[0] > 100:
easy_list.append(obj)
else:
hard_list.append(obj)
self.gt_database = [easy_list, hard_list]
logger.info('Loading gt_database(easy(pt_num>100): %d, hard(pt_num<=100): %d) from %s'
% (len(easy_list), len(hard_list), gt_database_dir))
else:
logger.info('Loading gt_database(%d) from %s' % (len(self.gt_database), gt_database_dir))
if mode == 'TRAIN':
self.preprocess_rpn_training_data()
else:
self.sample_id_list = [int(sample_id) for sample_id in self.image_idx_list]
logger.info('Load testing samples from %s' % self.imageset_dir)
logger.info('Done: total test samples %d' % len(self.sample_id_list))
elif cfg.RCNN.ENABLED:
for idx in range(0, self.num_sample):
sample_id = int(self.image_idx_list[idx])
obj_list = self.filtrate_objects(self.get_label(sample_id))
if len(obj_list) == 0:
# logger.info('No gt classes: %06d' % sample_id)
continue
self.sample_id_list.append(sample_id)
logger.info('Done: filter %s results for rcnn training: %d / %d\n' %
(self.mode, len(self.sample_id_list), len(self.image_idx_list)))
def preprocess_rpn_training_data(self):
"""
Discard samples which don't have current classes, which will not be used for training.
Valid sample_id is stored in self.sample_id_list
"""
logger.info('Loading %s samples from %s ...' % (self.mode, self.label_dir))
for idx in range(0, self.num_sample):
sample_id = int(self.image_idx_list[idx])
obj_list = self.filtrate_objects(self.get_label(sample_id))
if len(obj_list) == 0:
logger.debug('No gt classes: %06d' % sample_id)
continue
self.sample_id_list.append(sample_id)
logger.info('Done: filter %s results: %d / %d\n' % (self.mode, len(self.sample_id_list),
len(self.image_idx_list)))
def get_label(self, idx):
if idx < 10000:
label_file = os.path.join(self.label_dir, '%06d.txt' % idx)
else:
label_file = os.path.join(self.aug_label_dir, '%06d.txt' % idx)
assert os.path.exists(label_file)
return kitti_utils.get_objects_from_label(label_file)
def get_image(self, idx):
return super(KittiRCNNReader, self).get_image(idx % 10000)
def get_image_shape(self, idx):
return super(KittiRCNNReader, self).get_image_shape(idx % 10000)
def get_calib(self, idx):
return super(KittiRCNNReader, self).get_calib(idx % 10000)
def get_road_plane(self, idx):
return super(KittiRCNNReader, self).get_road_plane(idx % 10000)
@staticmethod
def get_rpn_features(rpn_feature_dir, idx):
rpn_feature_file = os.path.join(rpn_feature_dir, '%06d.npy' % idx)
rpn_xyz_file = os.path.join(rpn_feature_dir, '%06d_xyz.npy' % idx)
rpn_intensity_file = os.path.join(rpn_feature_dir, '%06d_intensity.npy' % idx)
        if cfg.RCNN.USE_SEG_SCORE:
            rpn_seg_file = os.path.join(rpn_feature_dir, '%06d_rawscore.npy' % idx)
            rpn_seg_score = np.load(rpn_seg_file).reshape(-1)
            # squash raw scores to (0, 1) with a numpy sigmoid; the reference
            # implementation used torch.sigmoid, but torch is not a dependency here
            rpn_seg_score = 1. / (1. + np.exp(-rpn_seg_score))
        else:
            rpn_seg_file = os.path.join(rpn_feature_dir, '%06d_seg.npy' % idx)
            rpn_seg_score = np.load(rpn_seg_file).reshape(-1)
return np.load(rpn_xyz_file), np.load(rpn_feature_file), np.load(rpn_intensity_file).reshape(-1), rpn_seg_score
def filtrate_objects(self, obj_list):
"""
Discard objects which are not in self.classes (or its similar classes)
:param obj_list: list
:return: list
"""
type_whitelist = self.classes
if self.mode == 'TRAIN' and cfg.INCLUDE_SIMILAR_TYPE:
type_whitelist = list(self.classes)
if 'Car' in self.classes:
type_whitelist.append('Van')
if 'Pedestrian' in self.classes: # or 'Cyclist' in self.classes:
type_whitelist.append('Person_sitting')
valid_obj_list = []
for obj in obj_list:
if obj.cls_type not in type_whitelist: # rm Van, 20180928
continue
if self.mode == 'TRAIN' and cfg.PC_REDUCE_BY_RANGE and (self.check_pc_range(obj.pos) is False):
continue
valid_obj_list.append(obj)
return valid_obj_list
@staticmethod
def filtrate_dc_objects(obj_list):
valid_obj_list = []
for obj in obj_list:
if obj.cls_type in ['DontCare']:
continue
valid_obj_list.append(obj)
return valid_obj_list
@staticmethod
def check_pc_range(xyz):
"""
:param xyz: [x, y, z]
:return:
"""
x_range, y_range, z_range = cfg.PC_AREA_SCOPE
if (x_range[0] <= xyz[0] <= x_range[1]) and (y_range[0] <= xyz[1] <= y_range[1]) and \
(z_range[0] <= xyz[2] <= z_range[1]):
return True
return False
@staticmethod
def get_valid_flag(pts_rect, pts_img, pts_rect_depth, img_shape):
"""
Valid point should be in the image (and in the PC_AREA_SCOPE)
:param pts_rect:
:param pts_img:
:param pts_rect_depth:
:param img_shape:
:return:
"""
val_flag_1 = np.logical_and(pts_img[:, 0] >= 0, pts_img[:, 0] < img_shape[1])
val_flag_2 = np.logical_and(pts_img[:, 1] >= 0, pts_img[:, 1] < img_shape[0])
val_flag_merge = np.logical_and(val_flag_1, val_flag_2)
pts_valid_flag = np.logical_and(val_flag_merge, pts_rect_depth >= 0)
if cfg.PC_REDUCE_BY_RANGE:
x_range, y_range, z_range = cfg.PC_AREA_SCOPE
pts_x, pts_y, pts_z = pts_rect[:, 0], pts_rect[:, 1], pts_rect[:, 2]
range_flag = (pts_x >= x_range[0]) & (pts_x <= x_range[1]) \
& (pts_y >= y_range[0]) & (pts_y <= y_range[1]) \
& (pts_z >= z_range[0]) & (pts_z <= z_range[1])
pts_valid_flag = pts_valid_flag & range_flag
return pts_valid_flag
def get_rpn_sample(self, index):
sample_id = int(self.sample_id_list[index])
if sample_id < 10000:
calib = self.get_calib(sample_id)
# img = self.get_image(sample_id)
img_shape = self.get_image_shape(sample_id)
pts_lidar = self.get_lidar(sample_id)
# get valid point (projected points should be in image)
pts_rect = calib.lidar_to_rect(pts_lidar[:, 0:3])
pts_intensity = pts_lidar[:, 3]
else:
calib = self.get_calib(sample_id % 10000)
# img = self.get_image(sample_id % 10000)
img_shape = self.get_image_shape(sample_id % 10000)
pts_file = os.path.join(self.aug_pts_dir, '%06d.bin' % sample_id)
assert os.path.exists(pts_file), '%s' % pts_file
aug_pts = np.fromfile(pts_file, dtype=np.float32).reshape(-1, 4)
pts_rect, pts_intensity = aug_pts[:, 0:3], aug_pts[:, 3]
pts_img, pts_rect_depth = calib.rect_to_img(pts_rect)
pts_valid_flag = self.get_valid_flag(pts_rect, pts_img, pts_rect_depth, img_shape)
pts_rect = pts_rect[pts_valid_flag][:, 0:3]
pts_intensity = pts_intensity[pts_valid_flag]
if cfg.GT_AUG_ENABLED and self.mode == 'TRAIN':
# all labels for checking overlapping
all_gt_obj_list = self.filtrate_dc_objects(self.get_label(sample_id))
all_gt_boxes3d = kitti_utils.objs_to_boxes3d(all_gt_obj_list)
gt_aug_flag = False
if np.random.rand() < cfg.GT_AUG_APPLY_PROB:
# augment one scene
gt_aug_flag, pts_rect, pts_intensity, extra_gt_boxes3d, extra_gt_obj_list = \
self.apply_gt_aug_to_one_scene(sample_id, pts_rect, pts_intensity, all_gt_boxes3d)
# generate inputs
if self.mode == 'TRAIN' or self.random_select:
if self.npoints < len(pts_rect):
pts_depth = pts_rect[:, 2]
pts_near_flag = pts_depth < 40.0
far_idxs_choice = np.where(pts_near_flag == 0)[0]
near_idxs = np.where(pts_near_flag == 1)[0]
near_idxs_choice = np.random.choice(near_idxs, self.npoints - len(far_idxs_choice), replace=False)
choice = np.concatenate((near_idxs_choice, far_idxs_choice), axis=0) \
if len(far_idxs_choice) > 0 else near_idxs_choice
np.random.shuffle(choice)
else:
choice = np.arange(0, len(pts_rect), dtype=np.int32)
if self.npoints > len(pts_rect):
extra_choice = np.random.choice(choice, self.npoints - len(pts_rect), replace=False)
choice = np.concatenate((choice, extra_choice), axis=0)
np.random.shuffle(choice)
ret_pts_rect = pts_rect[choice, :]
ret_pts_intensity = pts_intensity[choice] - 0.5 # translate intensity to [-0.5, 0.5]
else:
ret_pts_rect = np.zeros((self.npoints, pts_rect.shape[1])).astype(pts_rect.dtype)
num_ = min(self.npoints, pts_rect.shape[0])
ret_pts_rect[:num_] = pts_rect[:num_]
ret_pts_intensity = pts_intensity - 0.5
pts_features = [ret_pts_intensity.reshape(-1, 1)]
ret_pts_features = np.concatenate(pts_features, axis=1) if pts_features.__len__() > 1 else pts_features[0]
sample_info = {'sample_id': sample_id, 'random_select': self.random_select}
if self.mode == 'TEST':
if cfg.RPN.USE_INTENSITY:
pts_input = np.concatenate((ret_pts_rect, ret_pts_features), axis=1) # (N, C)
else:
pts_input = ret_pts_rect
sample_info['pts_input'] = pts_input
sample_info['pts_rect'] = ret_pts_rect
sample_info['pts_features'] = ret_pts_features
return sample_info
gt_obj_list = self.filtrate_objects(self.get_label(sample_id))
if cfg.GT_AUG_ENABLED and self.mode == 'TRAIN' and gt_aug_flag:
gt_obj_list.extend(extra_gt_obj_list)
gt_boxes3d = kitti_utils.objs_to_boxes3d(gt_obj_list)
gt_alpha = np.zeros((gt_obj_list.__len__()), dtype=np.float32)
for k, obj in enumerate(gt_obj_list):
gt_alpha[k] = obj.alpha
# data augmentation
aug_pts_rect = ret_pts_rect.copy()
aug_gt_boxes3d = gt_boxes3d.copy()
if cfg.AUG_DATA and self.mode == 'TRAIN':
aug_pts_rect, aug_gt_boxes3d, aug_method = self.data_augmentation(aug_pts_rect, aug_gt_boxes3d, gt_alpha,
sample_id)
sample_info['aug_method'] = aug_method
# prepare input
if cfg.RPN.USE_INTENSITY:
pts_input = np.concatenate((aug_pts_rect, ret_pts_features), axis=1) # (N, C)
else:
pts_input = aug_pts_rect
if cfg.RPN.FIXED:
sample_info['pts_input'] = pts_input
sample_info['pts_rect'] = aug_pts_rect
sample_info['pts_features'] = ret_pts_features
sample_info['gt_boxes3d'] = aug_gt_boxes3d
return sample_info
if self.mode == 'EVAL' and aug_gt_boxes3d.shape[0] == 0:
aug_gt_boxes3d = np.zeros((1, aug_gt_boxes3d.shape[1]))
# generate training labels
rpn_cls_label, rpn_reg_label = self.generate_rpn_training_labels(aug_pts_rect, aug_gt_boxes3d)
sample_info['pts_input'] = pts_input
sample_info['pts_rect'] = aug_pts_rect
sample_info['pts_features'] = ret_pts_features
sample_info['rpn_cls_label'] = rpn_cls_label
sample_info['rpn_reg_label'] = rpn_reg_label
sample_info['gt_boxes3d'] = aug_gt_boxes3d
return sample_info
def apply_gt_aug_to_one_scene(self, sample_id, pts_rect, pts_intensity, all_gt_boxes3d):
"""
:param pts_rect: (N, 3)
        :param all_gt_boxes3d: (M2, 7)
:return:
"""
assert self.gt_database is not None
# extra_gt_num = np.random.randint(10, 15)
# try_times = 50
if cfg.GT_AUG_RAND_NUM:
extra_gt_num = np.random.randint(10, cfg.GT_EXTRA_NUM)
else:
extra_gt_num = cfg.GT_EXTRA_NUM
try_times = 100
cnt = 0
cur_gt_boxes3d = all_gt_boxes3d.copy()
cur_gt_boxes3d[:, 4] += 0.5 # TODO: consider different objects
cur_gt_boxes3d[:, 5] += 0.5 # enlarge new added box to avoid too nearby boxes
cur_gt_corners = kitti_utils.boxes3d_to_corners3d(cur_gt_boxes3d)
extra_gt_obj_list = []
extra_gt_boxes3d_list = []
new_pts_list, new_pts_intensity_list = [], []
src_pts_flag = np.ones(pts_rect.shape[0], dtype=np.int32)
road_plane = self.get_road_plane(sample_id)
a, b, c, d = road_plane
while try_times > 0:
if cnt > extra_gt_num:
break
try_times -= 1
if cfg.GT_AUG_HARD_RATIO > 0:
p = np.random.rand()
if p > cfg.GT_AUG_HARD_RATIO:
# use easy sample
rand_idx = np.random.randint(0, len(self.gt_database[0]))
new_gt_dict = self.gt_database[0][rand_idx]
else:
# use hard sample
rand_idx = np.random.randint(0, len(self.gt_database[1]))
new_gt_dict = self.gt_database[1][rand_idx]
else:
rand_idx = np.random.randint(0, self.gt_database.__len__())
new_gt_dict = self.gt_database[rand_idx]
new_gt_box3d = new_gt_dict['gt_box3d'].copy()
new_gt_points = new_gt_dict['points'].copy()
new_gt_intensity = new_gt_dict['intensity'].copy()
new_gt_obj = new_gt_dict['obj']
center = new_gt_box3d[0:3]
if cfg.PC_REDUCE_BY_RANGE and (self.check_pc_range(center) is False):
continue
if new_gt_points.__len__() < 5: # too few points
continue
# put it on the road plane
cur_height = (-d - a * center[0] - c * center[2]) / b
move_height = new_gt_box3d[1] - cur_height
new_gt_box3d[1] -= move_height
new_gt_points[:, 1] -= move_height
new_gt_obj.pos[1] -= move_height
new_enlarged_box3d = new_gt_box3d.copy()
new_enlarged_box3d[4] += 0.5
new_enlarged_box3d[5] += 0.5 # enlarge new added box to avoid too nearby boxes
cnt += 1
new_corners = kitti_utils.boxes3d_to_corners3d(new_enlarged_box3d.reshape(1, 7))
iou3d = kitti_utils.get_iou3d(new_corners, cur_gt_corners)
valid_flag = iou3d.max() < 1e-8
if not valid_flag:
continue
enlarged_box3d = new_gt_box3d.copy()
enlarged_box3d[3] += 2 # remove the points above and below the object
boxes_pts_mask_list = pts_utils.pts_in_boxes3d(pts_rect,
enlarged_box3d.reshape(1, 7))
pt_mask_flag = (boxes_pts_mask_list[0] == 1)
src_pts_flag[pt_mask_flag] = 0 # remove the original points which are inside the new box
new_pts_list.append(new_gt_points)
new_pts_intensity_list.append(new_gt_intensity)
cur_gt_boxes3d = np.concatenate((cur_gt_boxes3d, new_enlarged_box3d.reshape(1, 7)), axis=0)
cur_gt_corners = np.concatenate((cur_gt_corners, new_corners), axis=0)
extra_gt_boxes3d_list.append(new_gt_box3d.reshape(1, 7))
extra_gt_obj_list.append(new_gt_obj)
if new_pts_list.__len__() == 0:
return False, pts_rect, pts_intensity, None, None
extra_gt_boxes3d = np.concatenate(extra_gt_boxes3d_list, axis=0)
# remove original points and add new points
pts_rect = pts_rect[src_pts_flag == 1]
pts_intensity = pts_intensity[src_pts_flag == 1]
new_pts_rect = np.concatenate(new_pts_list, axis=0)
new_pts_intensity = np.concatenate(new_pts_intensity_list, axis=0)
pts_rect = np.concatenate((pts_rect, new_pts_rect), axis=0)
pts_intensity = np.concatenate((pts_intensity, new_pts_intensity), axis=0)
return True, pts_rect, pts_intensity, extra_gt_boxes3d, extra_gt_obj_list
def rotate_box3d_along_y(self, box3d, rot_angle):
old_x, old_z, ry = box3d[0], box3d[2], box3d[6]
old_beta = np.arctan2(old_z, old_x)
alpha = -np.sign(old_beta) * np.pi / 2 + old_beta + ry
box3d = kitti_utils.rotate_pc_along_y(box3d.reshape(1, 7), rot_angle=rot_angle)[0]
new_x, new_z = box3d[0], box3d[2]
new_beta = np.arctan2(new_z, new_x)
box3d[6] = np.sign(new_beta) * np.pi / 2 + alpha - new_beta
return box3d
def data_augmentation(self, aug_pts_rect, aug_gt_boxes3d, gt_alpha, sample_id=None, mustaug=False, stage=1):
"""
:param aug_pts_rect: (N, 3)
:param aug_gt_boxes3d: (N, 7)
:param gt_alpha: (N)
:return:
"""
aug_list = cfg.AUG_METHOD_LIST
aug_enable = 1 - np.random.rand(3)
if mustaug is True:
aug_enable[0] = -1
aug_enable[1] = -1
aug_method = []
if 'rotation' in aug_list and aug_enable[0] < cfg.AUG_METHOD_PROB[0]:
angle = np.random.uniform(-np.pi / cfg.AUG_ROT_RANGE, np.pi / cfg.AUG_ROT_RANGE)
aug_pts_rect = kitti_utils.rotate_pc_along_y(aug_pts_rect, rot_angle=angle)
if stage == 1:
# xyz change, hwl unchange
aug_gt_boxes3d = kitti_utils.rotate_pc_along_y(aug_gt_boxes3d, rot_angle=angle)
# calculate the ry after rotation
x, z = aug_gt_boxes3d[:, 0], aug_gt_boxes3d[:, 2]
beta = np.arctan2(z, x)
new_ry = np.sign(beta) * np.pi / 2 + gt_alpha - beta
aug_gt_boxes3d[:, 6] = new_ry # TODO: not in [-np.pi / 2, np.pi / 2]
elif stage == 2:
# for debug stage-2, this implementation has little float precision difference with the above one
assert aug_gt_boxes3d.shape[0] == 2
aug_gt_boxes3d[0] = self.rotate_box3d_along_y(aug_gt_boxes3d[0], angle)
aug_gt_boxes3d[1] = self.rotate_box3d_along_y(aug_gt_boxes3d[1], angle)
else:
raise NotImplementedError
aug_method.append(['rotation', angle])
if 'scaling' in aug_list and aug_enable[1] < cfg.AUG_METHOD_PROB[1]:
scale = np.random.uniform(0.95, 1.05)
aug_pts_rect = aug_pts_rect * scale
aug_gt_boxes3d[:, 0:6] = aug_gt_boxes3d[:, 0:6] * scale
aug_method.append(['scaling', scale])
if 'flip' in aug_list and aug_enable[2] < cfg.AUG_METHOD_PROB[2]:
# flip horizontal
aug_pts_rect[:, 0] = -aug_pts_rect[:, 0]
aug_gt_boxes3d[:, 0] = -aug_gt_boxes3d[:, 0]
# flip orientation: ry > 0: pi - ry, ry < 0: -pi - ry
if stage == 1:
aug_gt_boxes3d[:, 6] = np.sign(aug_gt_boxes3d[:, 6]) * np.pi - aug_gt_boxes3d[:, 6]
elif stage == 2:
assert aug_gt_boxes3d.shape[0] == 2
aug_gt_boxes3d[0, 6] = np.sign(aug_gt_boxes3d[0, 6]) * np.pi - aug_gt_boxes3d[0, 6]
aug_gt_boxes3d[1, 6] = np.sign(aug_gt_boxes3d[1, 6]) * np.pi - aug_gt_boxes3d[1, 6]
else:
raise NotImplementedError
aug_method.append('flip')
return aug_pts_rect, aug_gt_boxes3d, aug_method
@staticmethod
def generate_rpn_training_labels(pts_rect, gt_boxes3d):
cls_label = np.zeros((pts_rect.shape[0]), dtype=np.int32)
reg_label = np.zeros((pts_rect.shape[0], 7), dtype=np.float32) # dx, dy, dz, ry, h, w, l
gt_corners = kitti_utils.boxes3d_to_corners3d(gt_boxes3d, rotate=True)
extend_gt_boxes3d = kitti_utils.enlarge_box3d(gt_boxes3d, extra_width=0.2)
extend_gt_corners = kitti_utils.boxes3d_to_corners3d(extend_gt_boxes3d, rotate=True)
for k in range(gt_boxes3d.shape[0]):
box_corners = gt_corners[k]
fg_pt_flag = in_hull(pts_rect, box_corners)
fg_pts_rect = pts_rect[fg_pt_flag]
cls_label[fg_pt_flag] = 1
# enlarge the bbox3d, ignore nearby points
extend_box_corners = extend_gt_corners[k]
fg_enlarge_flag = in_hull(pts_rect, extend_box_corners)
ignore_flag = np.logical_xor(fg_pt_flag, fg_enlarge_flag)
cls_label[ignore_flag] = -1
# pixel offset of object center
center3d = gt_boxes3d[k][0:3].copy() # (x, y, z)
center3d[1] -= gt_boxes3d[k][3] / 2
reg_label[fg_pt_flag, 0:3] = center3d - fg_pts_rect # Now y is the true center of 3d box 20180928
# size and angle encoding
reg_label[fg_pt_flag, 3] = gt_boxes3d[k][3] # h
reg_label[fg_pt_flag, 4] = gt_boxes3d[k][4] # w
reg_label[fg_pt_flag, 5] = gt_boxes3d[k][5] # l
reg_label[fg_pt_flag, 6] = gt_boxes3d[k][6] # ry
return cls_label, reg_label
def get_rcnn_sample_jit(self, index):
sample_id = int(self.sample_id_list[index])
rpn_xyz, rpn_features, rpn_intensity, seg_mask = \
self.get_rpn_features(self.rcnn_training_feature_dir, sample_id)
# load rois and gt_boxes3d for this sample
roi_file = os.path.join(self.rcnn_training_roi_dir, '%06d.txt' % sample_id)
roi_obj_list = kitti_utils.get_objects_from_label(roi_file)
roi_boxes3d = kitti_utils.objs_to_boxes3d(roi_obj_list)
# roi_scores is not used currently
# roi_scores = kitti_utils.objs_to_scores(roi_obj_list)
gt_obj_list = self.filtrate_objects(self.get_label(sample_id))
gt_boxes3d = kitti_utils.objs_to_boxes3d(gt_obj_list)
sample_info = OrderedDict()
sample_info["sample_id"] = sample_id
sample_info['rpn_xyz'] = rpn_xyz
sample_info['rpn_features'] = rpn_features
sample_info['rpn_intensity'] = rpn_intensity
sample_info['seg_mask'] = seg_mask
sample_info['roi_boxes3d'] = roi_boxes3d
sample_info['pts_depth'] = np.linalg.norm(rpn_xyz, ord=2, axis=1)
sample_info['gt_boxes3d'] = gt_boxes3d
return sample_info
def sample_bg_inds(self, hard_bg_inds, easy_bg_inds, bg_rois_per_this_image):
if hard_bg_inds.size > 0 and easy_bg_inds.size > 0:
hard_bg_rois_num = int(bg_rois_per_this_image * cfg.RCNN.HARD_BG_RATIO)
easy_bg_rois_num = bg_rois_per_this_image - hard_bg_rois_num
# sampling hard bg
rand_num = np.floor(np.random.rand(hard_bg_rois_num) * hard_bg_inds.size).astype(np.int32)
hard_bg_inds = hard_bg_inds[rand_num]
# sampling easy bg
rand_num = np.floor(np.random.rand(easy_bg_rois_num) * easy_bg_inds.size).astype(np.int32)
easy_bg_inds = easy_bg_inds[rand_num]
bg_inds = np.concatenate([hard_bg_inds, easy_bg_inds], axis=0)
elif hard_bg_inds.size > 0 and easy_bg_inds.size == 0:
hard_bg_rois_num = bg_rois_per_this_image
# sampling hard bg
rand_num = np.floor(np.random.rand(hard_bg_rois_num) * hard_bg_inds.size).astype(np.int32)
bg_inds = hard_bg_inds[rand_num]
elif hard_bg_inds.size == 0 and easy_bg_inds.size > 0:
easy_bg_rois_num = bg_rois_per_this_image
# sampling easy bg
rand_num = np.floor(np.random.rand(easy_bg_rois_num) * easy_bg_inds.size).astype(np.int32)
bg_inds = easy_bg_inds[rand_num]
else:
raise NotImplementedError
return bg_inds
def aug_roi_by_noise_batch(self, roi_boxes3d, gt_boxes3d, aug_times=10):
"""
:param roi_boxes3d: (N, 7)
:param gt_boxes3d: (N, 7)
:return:
"""
iou_of_rois = np.zeros(roi_boxes3d.shape[0], dtype=np.float32)
for k in range(roi_boxes3d.__len__()):
temp_iou = cnt = 0
roi_box3d = roi_boxes3d[k]
gt_box3d = gt_boxes3d[k]
pos_thresh = min(cfg.RCNN.REG_FG_THRESH, cfg.RCNN.CLS_FG_THRESH)
gt_corners = kitti_utils.boxes3d_to_corners3d(gt_box3d.reshape(1, 7), True)
aug_box3d = roi_box3d
while temp_iou < pos_thresh and cnt < aug_times:
if np.random.rand() < 0.2:
aug_box3d = roi_box3d # p=0.2 to keep the original roi box
else:
aug_box3d = self.random_aug_box3d(roi_box3d)
aug_corners = kitti_utils.boxes3d_to_corners3d(aug_box3d.reshape(1, 7), True)
iou3d = kitti_utils.get_iou3d(aug_corners, gt_corners)
temp_iou = iou3d[0][0]
cnt += 1
roi_boxes3d[k] = aug_box3d
iou_of_rois[k] = temp_iou
return roi_boxes3d, iou_of_rois
@staticmethod
def canonical_transform_batch(pts_input, roi_boxes3d, gt_boxes3d):
"""
:param pts_input: (N, npoints, 3 + C)
:param roi_boxes3d: (N, 7)
:param gt_boxes3d: (N, 7)
:return:
"""
roi_ry = roi_boxes3d[:, 6] % (2 * np.pi) # 0 ~ 2pi
roi_center = roi_boxes3d[:, 0:3]
# shift to center
pts_input[:, :, [0, 1, 2]] = pts_input[:, :, [0, 1, 2]] - roi_center.reshape(-1, 1, 3)
gt_boxes3d_ct = np.copy(gt_boxes3d)
gt_boxes3d_ct[:, 0:3] = gt_boxes3d_ct[:, 0:3] - roi_center
# rotate to the direction of head
gt_boxes3d_ct = kitti_utils.rotate_pc_along_y_np(
gt_boxes3d_ct.reshape(-1, 1, 7),
roi_ry,
)
# TODO: check here
gt_boxes3d_ct = gt_boxes3d_ct.reshape(-1,7)
gt_boxes3d_ct[:, 6] = gt_boxes3d_ct[:, 6] - roi_ry
pts_input = kitti_utils.rotate_pc_along_y_np(
pts_input,
roi_ry
)
return pts_input, gt_boxes3d_ct
def get_rcnn_training_sample_batch(self, index):
sample_id = int(self.sample_id_list[index])
rpn_xyz, rpn_features, rpn_intensity, seg_mask = \
self.get_rpn_features(self.rcnn_training_feature_dir, sample_id)
# load rois and gt_boxes3d for this sample
roi_file = os.path.join(self.rcnn_training_roi_dir, '%06d.txt' % sample_id)
roi_obj_list = kitti_utils.get_objects_from_label(roi_file)
roi_boxes3d = kitti_utils.objs_to_boxes3d(roi_obj_list)
# roi_scores = kitti_utils.objs_to_scores(roi_obj_list)
gt_obj_list = self.filtrate_objects(self.get_label(sample_id))
gt_boxes3d = kitti_utils.objs_to_boxes3d(gt_obj_list)
# calculate original iou
iou3d = kitti_utils.get_iou3d(kitti_utils.boxes3d_to_corners3d(roi_boxes3d, True),
kitti_utils.boxes3d_to_corners3d(gt_boxes3d, True))
max_overlaps, gt_assignment = iou3d.max(axis=1), iou3d.argmax(axis=1)
max_iou_of_gt, roi_assignment = iou3d.max(axis=0), iou3d.argmax(axis=0)
roi_assignment = roi_assignment[max_iou_of_gt > 0].reshape(-1)
# sample fg, easy_bg, hard_bg
fg_rois_per_image = int(np.round(cfg.RCNN.FG_RATIO * cfg.RCNN.ROI_PER_IMAGE))
fg_thresh = min(cfg.RCNN.REG_FG_THRESH, cfg.RCNN.CLS_FG_THRESH)
fg_inds = np.nonzero(max_overlaps >= fg_thresh)[0]
fg_inds = np.concatenate((fg_inds, roi_assignment), axis=0) # consider the roi which has max_overlaps with gt as fg
easy_bg_inds = np.nonzero((max_overlaps < cfg.RCNN.CLS_BG_THRESH_LO))[0]
hard_bg_inds = np.nonzero((max_overlaps < cfg.RCNN.CLS_BG_THRESH) &
(max_overlaps >= cfg.RCNN.CLS_BG_THRESH_LO))[0]
fg_num_rois = fg_inds.size
bg_num_rois = hard_bg_inds.size + easy_bg_inds.size
if fg_num_rois > 0 and bg_num_rois > 0:
# sampling fg
fg_rois_per_this_image = min(fg_rois_per_image, fg_num_rois)
rand_num = np.random.permutation(fg_num_rois)
fg_inds = fg_inds[rand_num[:fg_rois_per_this_image]]
# sampling bg
bg_rois_per_this_image = cfg.RCNN.ROI_PER_IMAGE - fg_rois_per_this_image
bg_inds = self.sample_bg_inds(hard_bg_inds, easy_bg_inds, bg_rois_per_this_image)
elif fg_num_rois > 0 and bg_num_rois == 0:
# sampling fg
            # cast to integer indices (the reference implementation converted via torch.long)
            rand_num = np.floor(np.random.rand(cfg.RCNN.ROI_PER_IMAGE) * fg_num_rois).astype(np.int32)
            fg_inds = fg_inds[rand_num]
fg_rois_per_this_image = cfg.RCNN.ROI_PER_IMAGE
bg_rois_per_this_image = 0
elif bg_num_rois > 0 and fg_num_rois == 0:
# sampling bg
bg_rois_per_this_image = cfg.RCNN.ROI_PER_IMAGE
bg_inds = self.sample_bg_inds(hard_bg_inds, easy_bg_inds, bg_rois_per_this_image)
fg_rois_per_this_image = 0
        else:
            raise NotImplementedError
# augment the rois by noise
roi_list, roi_iou_list, roi_gt_list = [], [], []
if fg_rois_per_this_image > 0:
fg_rois_src = roi_boxes3d[fg_inds].copy()
gt_of_fg_rois = gt_boxes3d[gt_assignment[fg_inds]]
fg_rois, fg_iou3d = self.aug_roi_by_noise_batch(fg_rois_src, gt_of_fg_rois, aug_times=10)
roi_list.append(fg_rois)
roi_iou_list.append(fg_iou3d)
roi_gt_list.append(gt_of_fg_rois)
if bg_rois_per_this_image > 0:
bg_rois_src = roi_boxes3d[bg_inds].copy()
gt_of_bg_rois = gt_boxes3d[gt_assignment[bg_inds]]
bg_rois, bg_iou3d = self.aug_roi_by_noise_batch(bg_rois_src, gt_of_bg_rois, aug_times=1)
roi_list.append(bg_rois)
roi_iou_list.append(bg_iou3d)
roi_gt_list.append(gt_of_bg_rois)
rois = np.concatenate(roi_list, axis=0)
iou_of_rois = np.concatenate(roi_iou_list, axis=0)
gt_of_rois = np.concatenate(roi_gt_list, axis=0)
# collect extra features for point cloud pooling
if cfg.RCNN.USE_INTENSITY:
pts_extra_input_list = [rpn_intensity.reshape(-1, 1), seg_mask.reshape(-1, 1)]
else:
pts_extra_input_list = [seg_mask.reshape(-1, 1)]
if cfg.RCNN.USE_DEPTH:
pts_depth = (np.linalg.norm(rpn_xyz, ord=2, axis=1) / 70.0) - 0.5
pts_extra_input_list.append(pts_depth.reshape(-1, 1))
pts_extra_input = np.concatenate(pts_extra_input_list, axis=1)
# pts, pts_feature, boxes3d, pool_extra_width, sampled_pt_num
pts_input, pts_features, pts_empty_flag = roipool3d_utils.roipool3d_cpu(
rpn_xyz, rpn_features, rois, pts_extra_input,
cfg.RCNN.POOL_EXTRA_WIDTH,
sampled_pt_num=cfg.RCNN.NUM_POINTS,
#canonical_transform=False
)
# data augmentation
if cfg.AUG_DATA and self.mode == 'TRAIN':
for k in range(rois.__len__()):
aug_pts = pts_input[k, :, 0:3].copy()
aug_gt_box3d = gt_of_rois[k].copy()
aug_roi_box3d = rois[k].copy()
# calculate alpha by ry
temp_boxes3d = np.concatenate([aug_roi_box3d.reshape(1, 7), aug_gt_box3d.reshape(1, 7)], axis=0)
temp_x, temp_z, temp_ry = temp_boxes3d[:, 0], temp_boxes3d[:, 2], temp_boxes3d[:, 6]
temp_beta = np.arctan2(temp_z, temp_x).astype(np.float64)
temp_alpha = -np.sign(temp_beta) * np.pi / 2 + temp_beta + temp_ry
# data augmentation
aug_pts, aug_boxes3d, aug_method = self.data_augmentation(aug_pts, temp_boxes3d, temp_alpha,
mustaug=True, stage=2)
# assign to original data
pts_input[k, :, 0:3] = aug_pts
rois[k] = aug_boxes3d[0]
gt_of_rois[k] = aug_boxes3d[1]
valid_mask = (pts_empty_flag == 0).astype(np.int32)
# regression valid mask
reg_valid_mask = (iou_of_rois > cfg.RCNN.REG_FG_THRESH).astype(np.int32) & valid_mask
# classification label
cls_label = (iou_of_rois > cfg.RCNN.CLS_FG_THRESH).astype(np.int32)
invalid_mask = (iou_of_rois > cfg.RCNN.CLS_BG_THRESH) & (iou_of_rois < cfg.RCNN.CLS_FG_THRESH)
cls_label[invalid_mask] = -1
cls_label[valid_mask == 0] = -1
# canonical transform and sampling
pts_input_ct, gt_boxes3d_ct = self.canonical_transform_batch(pts_input, rois, gt_of_rois)
pts_input_ = np.concatenate((pts_input_ct, pts_features), axis=-1)
sample_info = OrderedDict()
sample_info['sample_id'] = sample_id
sample_info['pts_input'] = pts_input_
sample_info['pts_feature'] = pts_features
sample_info['roi_boxes3d'] = rois
sample_info['cls_label'] = cls_label
sample_info['reg_valid_mask'] = reg_valid_mask
sample_info['gt_boxes3d_ct'] = gt_boxes3d_ct
sample_info['gt_of_rois'] = gt_of_rois
return sample_info
@staticmethod
def random_aug_box3d(box3d):
"""
:param box3d: (7) [x, y, z, h, w, l, ry]
random shift, scale, orientation
"""
if cfg.RCNN.REG_AUG_METHOD == 'single':
pos_shift = (np.random.rand(3) - 0.5) # [-0.5 ~ 0.5]
hwl_scale = (np.random.rand(3) - 0.5) / (0.5 / 0.15) + 1.0 #
angle_rot = (np.random.rand(1) - 0.5) / (0.5 / (np.pi / 12)) # [-pi/12 ~ pi/12]
aug_box3d = np.concatenate([box3d[0:3] + pos_shift, box3d[3:6] * hwl_scale,
box3d[6:7] + angle_rot])
return aug_box3d
elif cfg.RCNN.REG_AUG_METHOD == 'multiple':
# pos_range, hwl_range, angle_range, mean_iou
range_config = [[0.2, 0.1, np.pi / 12, 0.7],
[0.3, 0.15, np.pi / 12, 0.6],
[0.5, 0.15, np.pi / 9, 0.5],
[0.8, 0.15, np.pi / 6, 0.3],
[1.0, 0.15, np.pi / 3, 0.2]]
idx = np.random.randint(len(range_config))
pos_shift = ((np.random.rand(3) - 0.5) / 0.5) * range_config[idx][0]
hwl_scale = ((np.random.rand(3) - 0.5) / 0.5) * range_config[idx][1] + 1.0
angle_rot = ((np.random.rand(1) - 0.5) / 0.5) * range_config[idx][2]
aug_box3d = np.concatenate([box3d[0:3] + pos_shift, box3d[3:6] * hwl_scale, box3d[6:7] + angle_rot])
return aug_box3d
elif cfg.RCNN.REG_AUG_METHOD == 'normal':
x_shift = np.random.normal(loc=0, scale=0.3)
y_shift = np.random.normal(loc=0, scale=0.2)
z_shift = np.random.normal(loc=0, scale=0.3)
h_shift = np.random.normal(loc=0, scale=0.25)
w_shift = np.random.normal(loc=0, scale=0.15)
l_shift = np.random.normal(loc=0, scale=0.5)
ry_shift = ((np.random.rand() - 0.5) / 0.5) * np.pi / 12
aug_box3d = np.array([box3d[0] + x_shift, box3d[1] + y_shift, box3d[2] + z_shift, box3d[3] + h_shift,
box3d[4] + w_shift, box3d[5] + l_shift, box3d[6] + ry_shift])
return aug_box3d
else:
raise NotImplementedError
def get_proposal_from_file(self, index):
sample_id = int(self.image_idx_list[index])
proposal_file = os.path.join(self.rcnn_eval_roi_dir, '%06d.txt' % sample_id)
roi_obj_list = kitti_utils.get_objects_from_label(proposal_file)
rpn_xyz, rpn_features, rpn_intensity, seg_mask = self.get_rpn_features(self.rcnn_eval_feature_dir, sample_id)
pts_rect, pts_rpn_features, pts_intensity = rpn_xyz, rpn_features, rpn_intensity
roi_box3d_list, roi_scores = [], []
for obj in roi_obj_list:
box3d = np.array([obj.pos[0], obj.pos[1], obj.pos[2], obj.h, obj.w, obj.l, obj.ry], dtype=np.float32)
roi_box3d_list.append(box3d.reshape(1, 7))
roi_scores.append(obj.score)
roi_boxes3d = np.concatenate(roi_box3d_list, axis=0) # (N, 7)
roi_scores = np.array(roi_scores, dtype=np.float32) # (N)
if cfg.RCNN.ROI_SAMPLE_JIT:
sample_dict = {'sample_id': sample_id,
'rpn_xyz': rpn_xyz,
'rpn_features': rpn_features,
'seg_mask': seg_mask,
'roi_boxes3d': roi_boxes3d,
'roi_scores': roi_scores,
'pts_depth': np.linalg.norm(rpn_xyz, ord=2, axis=1)}
if self.mode != 'TEST':
gt_obj_list = self.filtrate_objects(self.get_label(sample_id))
gt_boxes3d = kitti_utils.objs_to_boxes3d(gt_obj_list)
roi_corners = kitti_utils.boxes3d_to_corners3d(roi_boxes3d,True)
gt_corners = kitti_utils.boxes3d_to_corners3d(gt_boxes3d,True)
iou3d = kitti_utils.get_iou3d(roi_corners, gt_corners)
if gt_boxes3d.shape[0] > 0:
gt_iou = iou3d.max(axis=1)
else:
gt_iou = np.zeros(roi_boxes3d.shape[0]).astype(np.float32)
sample_dict['gt_boxes3d'] = gt_boxes3d
sample_dict['gt_iou'] = gt_iou
return sample_dict
if cfg.RCNN.USE_INTENSITY:
pts_extra_input_list = [pts_intensity.reshape(-1, 1), seg_mask.reshape(-1, 1)]
else:
pts_extra_input_list = [seg_mask.reshape(-1, 1)]
if cfg.RCNN.USE_DEPTH:
cur_depth = np.linalg.norm(pts_rect, axis=1, ord=2)
cur_depth_norm = (cur_depth / 70.0) - 0.5
pts_extra_input_list.append(cur_depth_norm.reshape(-1, 1))
pts_extra_input = np.concatenate(pts_extra_input_list, axis=1)
pts_input, pts_features, _ = roipool3d_utils.roipool3d_cpu(
pts_rect, pts_rpn_features, roi_boxes3d, pts_extra_input,
cfg.RCNN.POOL_EXTRA_WIDTH, sampled_pt_num=cfg.RCNN.NUM_POINTS,
canonical_transform=True
)
pts_input = np.concatenate((pts_input, pts_features), axis=-1)
sample_dict = OrderedDict()
sample_dict['sample_id'] = sample_id
sample_dict['pts_input'] = pts_input
sample_dict['pts_feature'] = pts_features
sample_dict['roi_boxes3d'] = roi_boxes3d
sample_dict['roi_scores'] = roi_scores
#sample_dict['roi_size'] = roi_boxes3d[:, 3:6]
if self.mode == 'TEST':
return sample_dict
gt_obj_list = self.filtrate_objects(self.get_label(sample_id))
gt_boxes3d = np.zeros((gt_obj_list.__len__(), 7), dtype=np.float32)
for k, obj in enumerate(gt_obj_list):
gt_boxes3d[k, 0:3], gt_boxes3d[k, 3], gt_boxes3d[k, 4], gt_boxes3d[k, 5], gt_boxes3d[k, 6] \
= obj.pos, obj.h, obj.w, obj.l, obj.ry
if gt_boxes3d.__len__() == 0:
gt_iou = np.zeros((roi_boxes3d.shape[0]), dtype=np.float32)
else:
roi_corners = kitti_utils.boxes3d_to_corners3d(roi_boxes3d,True)
gt_corners = kitti_utils.boxes3d_to_corners3d(gt_boxes3d,True)
iou3d = kitti_utils.get_iou3d(roi_corners, gt_corners)
gt_iou = iou3d.max(axis=1)
sample_dict['gt_iou'] = gt_iou
sample_dict['gt_boxes3d'] = gt_boxes3d
return sample_dict
def __len__(self):
if cfg.RPN.ENABLED:
return len(self.sample_id_list)
elif cfg.RCNN.ENABLED:
if self.mode == 'TRAIN':
return len(self.sample_id_list)
else:
return len(self.image_idx_list)
else:
raise NotImplementedError
def __getitem__(self, index):
if cfg.RPN.ENABLED:
return self.get_rpn_sample(index)
elif cfg.RCNN.ENABLED:
if self.mode == 'TRAIN':
if cfg.RCNN.ROI_SAMPLE_JIT:
return self.get_rcnn_sample_jit(index)
else:
return self.get_rcnn_training_sample_batch(index)
else:
return self.get_proposal_from_file(index)
else:
raise NotImplementedError
def padding_batch(self, batch_data, batch_size):
max_roi = 0
max_gt = 0
for k in range(batch_size):
# roi_boxes3d
max_roi = max(max_roi, batch_data[k][3].shape[0])
# gt_boxes3d
max_gt = max(max_gt, batch_data[k][-1].shape[0])
batch_roi_boxes3d = np.zeros((batch_size, max_roi, 7))
batch_gt_boxes3d = np.zeros((batch_size, max_gt, 7), dtype=np.float32)
for i, data in enumerate(batch_data):
roi_num = data[3].shape[0]
gt_num = data[-1].shape[0]
batch_roi_boxes3d[i,:roi_num,:] = data[3]
batch_gt_boxes3d[i,:gt_num,:] = data[-1]
new_batch = []
for i, data in enumerate(batch_data):
new_batch.append(data[:3])
# roi_boxes3d
new_batch[i].append(batch_roi_boxes3d[i])
# ...
new_batch[i].extend(data[4:7])
# gt_boxes3d
new_batch[i].append(batch_gt_boxes3d[i])
return new_batch
def padding_batch_eval(self, batch_data, batch_size):
max_pts = 0
max_feats = 0
max_roi = 0
max_score = 0
max_iou = 0
max_gt = 0
for k in range(batch_size):
# pts_input
max_pts = max(max_pts, batch_data[k][1].shape[0])
# pts_feature
max_feats = max(max_feats, batch_data[k][2].shape[0])
# roi_boxes3d
max_roi = max(max_roi, batch_data[k][3].shape[0])
# gt_iou
max_iou = max(max_iou, batch_data[k][-2].shape[0])
# gt_boxes3d
max_gt = max(max_gt, batch_data[k][-1].shape[0])
batch_pts_input = np.zeros((batch_size, max_pts, 512, 133), dtype=np.float32)
batch_pts_feat = np.zeros((batch_size, max_feats, 512, 128), dtype=np.float32)
batch_roi_boxes3d = np.zeros((batch_size, max_roi, 7), dtype=np.float32)
batch_gt_iou = np.zeros((batch_size, max_iou), dtype=np.float32)
batch_gt_boxes3d = np.zeros((batch_size, max_gt, 7), dtype=np.float32)
for i, data in enumerate(batch_data):
# num
pts_num = data[1].shape[0]
pts_feat_num = data[2].shape[0]
roi_num = data[3].shape[0]
iou_num = data[-2].shape[0]
gt_num = data[-1].shape[0]
# data
batch_pts_input[i, :pts_num, :, :] = data[1]
batch_pts_feat[i, :pts_feat_num, :, :] = data[2]
batch_roi_boxes3d[i,:roi_num,:] = data[3]
batch_gt_iou[i,:iou_num] = data[-2]
batch_gt_boxes3d[i,:gt_num,:] = data[-1]
new_batch = []
for i, data in enumerate(batch_data):
new_batch.append(data[:1])
new_batch[i].append(batch_pts_input[i])
new_batch[i].append(batch_pts_feat[i])
new_batch[i].append(batch_roi_boxes3d[i])
new_batch[i].append(data[4])
new_batch[i].append(batch_gt_iou[i])
new_batch[i].append(batch_gt_boxes3d[i])
return new_batch
def get_reader(self, batch_size, fields, drop_last=False):
def reader():
batch_out = []
idxs = np.arange(self.__len__())
if self.mode == 'TRAIN':
np.random.shuffle(idxs)
for idx in idxs:
sample_all = self.__getitem__(idx)
sample = [sample_all[f] for f in fields]
if has_empty(sample):
logger.info("sample field: %d has empty field"%len(sample))
continue
batch_out.append(sample)
if len(batch_out) >= batch_size:
if cfg.RPN.ENABLED:
yield batch_out
else:
if self.mode == 'TRAIN':
yield self.padding_batch(batch_out, batch_size)
elif self.mode == 'EVAL':
                            # batch_size should be 1 in rcnn_offline eval currently;
                            # if batch_size > 1, the batch should be padded as follows:
# yield self.padding_batch_eval(batch_out, batch_size)
yield batch_out
else:
logger.error("not only support train/eval padding")
batch_out = []
if not drop_last:
if len(batch_out) > 0:
yield batch_out
return reader
def get_multiprocess_reader(self, batch_size, fields, proc_num=8, max_queue_len=128, drop_last=False):
def read_to_queue(idxs, queue):
for idx in idxs:
sample_all = self.__getitem__(idx)
sample = [sample_all[f] for f in fields]
queue.put(sample)
queue.put(None)
def reader():
sample_num = self.__len__()
idxs = np.arange(self.__len__())
if self.mode == 'TRAIN':
np.random.shuffle(idxs)
proc_idxs = []
proc_sample_num = int(sample_num / proc_num)
start_idx = 0
for i in range(proc_num - 1):
proc_idxs.append(idxs[start_idx:start_idx + proc_sample_num])
start_idx += proc_sample_num
proc_idxs.append(idxs[start_idx:])
queue = multiprocessing.Queue(max_queue_len)
p_list = []
for i in range(proc_num):
p_list.append(multiprocessing.Process(
target=read_to_queue, args=(proc_idxs[i], queue,)))
p_list[-1].start()
finish_num = 0
batch_out = []
while finish_num < len(p_list):
sample = queue.get()
if sample is None:
finish_num += 1
else:
batch_out.append(sample)
if len(batch_out) == batch_size:
yield batch_out
batch_out = []
# join process
for p in p_list:
if p.is_alive():
p.join()
return reader
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import sys
import time
import shutil
import argparse
import logging
import multiprocessing
import numpy as np
from collections import OrderedDict
import paddle
import paddle.fluid as fluid
from models.point_rcnn import PointRCNN
from data.kitti_rcnn_reader import KittiRCNNReader
from utils.run_utils import *
from utils.config import cfg, load_config, set_config_from_list
from utils.metric_utils import calc_iou_recall, rpn_metric, rcnn_metric
logging.root.handlers = []
FORMAT = '%(asctime)s-%(levelname)s: %(message)s'
logging.basicConfig(level=logging.INFO, format=FORMAT, stream=sys.stdout)
logger = logging.getLogger(__name__)
np.random.seed(1024) # use same seed
METRIC_PROC_NUM = 4
def parse_args():
parser = argparse.ArgumentParser(
"PointRCNN semantic segmentation train script")
parser.add_argument(
'--cfg',
type=str,
default='cfgs/default.yml',
help='specify the config for training')
parser.add_argument(
'--eval_mode',
type=str,
default='rpn',
required=True,
        help='specify the evaluation mode')
parser.add_argument(
'--batch_size',
type=int,
default=1,
help='evaluation batch size, default 1')
parser.add_argument(
'--ckpt_dir',
type=str,
default='checkpoints/199',
help='specify a ckpt directory to be evaluated if needed')
parser.add_argument(
'--data_dir',
type=str,
default='./data',
help='KITTI dataset root directory')
parser.add_argument(
'--output_dir',
type=str,
default='output',
help='output directory')
parser.add_argument(
'--save_rpn_feature',
action='store_true',
default=False,
        help='save features for separate rcnn training and evaluation')
parser.add_argument(
'--save_result',
action='store_true',
default=False,
help='save roi and refine result of evaluation')
parser.add_argument(
'--rcnn_eval_roi_dir',
type=str,
default=None,
help='specify the saved rois for rcnn evaluation when using rcnn_offline mode')
parser.add_argument(
'--rcnn_eval_feature_dir',
type=str,
default=None,
help='specify the saved features for rcnn evaluation when using rcnn_offline mode')
parser.add_argument(
'--log_interval',
type=int,
default=1,
help='mini-batch interval to log.')
parser.add_argument(
'--set',
dest='set_cfgs',
default=None,
nargs=argparse.REMAINDER,
help='set extra config keys if needed.')
args = parser.parse_args()
return args
def eval():
args = parse_args()
print_arguments(args)
# check whether the installed paddle is compiled with GPU
# PointRCNN model can only run on GPU
check_gpu(True)
load_config(args.cfg)
if args.set_cfgs is not None:
set_config_from_list(args.set_cfgs)
if not os.path.isdir(args.output_dir):
os.makedirs(args.output_dir)
if args.eval_mode == 'rpn':
cfg.RPN.ENABLED = True
cfg.RCNN.ENABLED = False
elif args.eval_mode == 'rcnn':
cfg.RCNN.ENABLED = True
cfg.RPN.ENABLED = cfg.RPN.FIXED = True
assert args.batch_size, "batch size must be 1 in rcnn evaluation"
elif args.eval_mode == 'rcnn_offline':
cfg.RCNN.ENABLED = True
cfg.RPN.ENABLED = False
assert args.batch_size, "batch size must be 1 in rcnn_offline evaluation"
else:
raise NotImplementedError("unkown eval mode: {}".format(args.eval_mode))
place = fluid.CUDAPlace(0)
exe = fluid.Executor(place)
# build model
startup = fluid.Program()
eval_prog = fluid.Program()
with fluid.program_guard(eval_prog, startup):
with fluid.unique_name.guard():
eval_model = PointRCNN(cfg, args.batch_size, True, 'TEST')
eval_model.build()
eval_pyreader = eval_model.get_pyreader()
eval_feeds = eval_model.get_feeds()
eval_outputs = eval_model.get_outputs()
eval_prog = eval_prog.clone(True)
extra_keys = []
if args.eval_mode == 'rpn':
extra_keys.extend(['sample_id', 'rpn_cls_label', 'gt_boxes3d'])
if args.save_rpn_feature:
extra_keys.extend(['pts_rect', 'pts_features', 'pts_input',])
eval_keys, eval_values = parse_outputs(
eval_outputs, prog=eval_prog, extra_keys=extra_keys)
eval_compile_prog = fluid.compiler.CompiledProgram(
eval_prog).with_data_parallel()
exe.run(startup)
# load checkpoint
assert os.path.isdir(
args.ckpt_dir), "ckpt_dir {} not a directory".format(args.ckpt_dir)
def if_exist(var):
return os.path.exists(os.path.join(args.ckpt_dir, var.name))
fluid.io.load_vars(exe, args.ckpt_dir, eval_prog, predicate=if_exist)
kitti_feature_dir = os.path.join(args.output_dir, 'features')
kitti_output_dir = os.path.join(args.output_dir, 'detections', 'data')
seg_output_dir = os.path.join(args.output_dir, 'seg_result')
if args.save_rpn_feature:
if os.path.exists(kitti_feature_dir):
shutil.rmtree(kitti_feature_dir)
os.makedirs(kitti_feature_dir)
if os.path.exists(kitti_output_dir):
shutil.rmtree(kitti_output_dir)
os.makedirs(kitti_output_dir)
if os.path.exists(seg_output_dir):
shutil.rmtree(seg_output_dir)
os.makedirs(seg_output_dir)
    # must make sure these dirs exist
roi_output_dir = os.path.join('./result_dir', 'roi_result', 'data')
refine_output_dir = os.path.join('./result_dir', 'refine_result', 'data')
final_output_dir = os.path.join("./result_dir", 'final_result', 'data')
if not os.path.exists(final_output_dir):
os.makedirs(final_output_dir)
if args.save_result:
if not os.path.exists(roi_output_dir):
os.makedirs(roi_output_dir)
if not os.path.exists(refine_output_dir):
os.makedirs(refine_output_dir)
# get reader
kitti_rcnn_reader = KittiRCNNReader(data_dir=args.data_dir,
npoints=cfg.RPN.NUM_POINTS,
split=cfg.TEST.SPLIT,
mode='EVAL',
classes=cfg.CLASSES,
rcnn_eval_roi_dir=args.rcnn_eval_roi_dir,
rcnn_eval_feature_dir=args.rcnn_eval_feature_dir)
eval_reader = kitti_rcnn_reader.get_multiprocess_reader(args.batch_size, eval_feeds)
eval_pyreader.decorate_sample_list_generator(eval_reader, place)
thresh_list = [0.1, 0.3, 0.5, 0.7, 0.9]
queue = multiprocessing.Queue(128)
mgr = multiprocessing.Manager()
lock = multiprocessing.Lock()
mdict = mgr.dict()
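    # metric computation runs in METRIC_PROC_NUM worker processes that consume
    # fetched results from `queue`; counters are accumulated in the shared dict
    # `mdict` under `lock`, and 'exit_proc' counts workers that have finished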
if cfg.RPN.ENABLED:
mdict['exit_proc'] = 0
mdict['total_gt_bbox'] = 0
mdict['total_cnt'] = 0
mdict['total_rpn_iou'] = 0
for i in range(len(thresh_list)):
mdict['total_recalled_bbox_list_{}'.format(i)] = 0
p_list = []
for i in range(METRIC_PROC_NUM):
p_list.append(multiprocessing.Process(
target=rpn_metric,
args=(queue, mdict, lock, thresh_list, args.save_rpn_feature, kitti_feature_dir,
seg_output_dir, kitti_output_dir, kitti_rcnn_reader, cfg.CLASSES)))
p_list[-1].start()
if cfg.RCNN.ENABLED:
for i in range(len(thresh_list)):
mdict['total_recalled_bbox_list_{}'.format(i)] = 0
mdict['total_roi_recalled_bbox_list_{}'.format(i)] = 0
mdict['exit_proc'] = 0
mdict['total_cls_acc'] = 0
mdict['total_cls_acc_refined'] = 0
mdict['total_det_num'] = 0
mdict['total_gt_bbox'] = 0
p_list = []
for i in range(METRIC_PROC_NUM):
p_list.append(multiprocessing.Process(
target=rcnn_metric,
args=(queue, mdict, lock, thresh_list, kitti_rcnn_reader, roi_output_dir,
refine_output_dir, final_output_dir, args.save_result)
))
p_list[-1].start()
try:
eval_pyreader.start()
eval_iter = 0
start_time = time.time()
cur_time = time.time()
while True:
eval_outs = exe.run(eval_compile_prog, fetch_list=eval_values, return_numpy=False)
rets_dict = {k: (np.array(v), v.recursive_sequence_lengths())
for k, v in zip(eval_keys, eval_outs)}
run_time = time.time() - cur_time
cur_time = time.time()
queue.put(rets_dict)
eval_iter += 1
logger.info("[EVAL] iter {}, time: {:.2f}".format(
eval_iter, run_time))
except fluid.core.EOFException:
# terminate metric process
for i in range(METRIC_PROC_NUM):
queue.put(None)
while mdict['exit_proc'] < METRIC_PROC_NUM:
time.sleep(1)
for p in p_list:
if p.is_alive():
p.join()
end_time = time.time()
logger.info("[EVAL] total {} iter finished, average time: {:.2f}".format(
eval_iter, (end_time - start_time) / float(eval_iter)))
if cfg.RPN.ENABLED:
avg_rpn_iou = mdict['total_rpn_iou'] / max(len(kitti_rcnn_reader), 1.)
logger.info("average rpn iou: {:.3f}".format(avg_rpn_iou))
total_gt_bbox = float(max(mdict['total_gt_bbox'], 1.0))
for idx, thresh in enumerate(thresh_list):
recall = mdict['total_recalled_bbox_list_{}'.format(idx)] / total_gt_bbox
logger.info("total bbox recall(thresh={:.3f}): {} / {} = {:.3f}".format(
thresh, mdict['total_recalled_bbox_list_{}'.format(idx)], mdict['total_gt_bbox'], recall))
if cfg.RCNN.ENABLED:
cnt = float(max(eval_iter, 1.0))
avg_cls_acc = mdict['total_cls_acc'] / cnt
avg_cls_acc_refined = mdict['total_cls_acc_refined'] / cnt
avg_det_num = mdict['total_det_num'] / cnt
logger.info("avg_cls_acc: {}".format(avg_cls_acc))
logger.info("avg_cls_acc_refined: {}".format(avg_cls_acc_refined))
logger.info("avg_det_num: {}".format(avg_det_num))
total_gt_bbox = float(max(mdict['total_gt_bbox'], 1.0))
for idx, thresh in enumerate(thresh_list):
cur_roi_recall = mdict['total_roi_recalled_bbox_list_{}'.format(idx)] / total_gt_bbox
logger.info('total roi bbox recall(thresh=%.3f): %d / %d = %f' % (
thresh, mdict['total_roi_recalled_bbox_list_{}'.format(idx)], total_gt_bbox, cur_roi_recall))
for idx, thresh in enumerate(thresh_list):
cur_recall = mdict['total_recalled_bbox_list_{}'.format(idx)] / total_gt_bbox
logger.info('total bbox recall(thresh=%.2f) %d / %.2f = %.4f' % (
thresh, mdict['total_recalled_bbox_list_{}'.format(idx)], total_gt_bbox, cur_recall))
split_file = os.path.join('./data/KITTI', 'ImageSets', 'val.txt')
image_idx_list = [x.strip() for x in open(split_file).readlines()]
for k in range(image_idx_list.__len__()):
cur_file = os.path.join(final_output_dir, '%s.txt' % image_idx_list[k])
if not os.path.exists(cur_file):
with open(cur_file, 'w') as temp_f:
pass
        if sys.version_info >= (3, 6):
label_dir = os.path.join('./data/KITTI/object/training', 'label_2')
split_file = os.path.join('./data/KITTI', 'ImageSets', 'val.txt')
final_output_dir = os.path.join("./result_dir", 'final_result', 'data')
name_to_class = {'Car': 0, 'Pedestrian': 1, 'Cyclist': 2}
from tools.kitti_object_eval_python.evaluate import evaluate as kitti_evaluate
ap_result_str, ap_dict = kitti_evaluate(
label_dir, final_output_dir, label_split_file=split_file,
current_class=name_to_class["Car"])
logger.info("KITTI evaluate: {}, {}".format(ap_result_str, ap_dict))
else:
logger.info("KITTI mAP only support python version >= 3.6, users can "
"run 'python3 tools/kitti_eval.py' to evaluate KITTI mAP.")
finally:
eval_pyreader.reset()
if __name__ == "__main__":
eval()
../PointNet++/ext_op
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import numpy as np
import paddle.fluid as fluid
from paddle.fluid.param_attr import ParamAttr
from paddle.fluid.initializer import Constant
__all__ = ["get_reg_loss"]
def sigmoid_focal_loss(logits, labels, weights, gamma=2.0, alpha=0.25):
sce_loss = fluid.layers.sigmoid_cross_entropy_with_logits(logits, labels)
prob = fluid.layers.sigmoid(logits)
p_t = labels * prob + (1.0 - labels) * (1.0 - prob)
modulating_factor = fluid.layers.pow(1.0 - p_t, gamma)
alpha_weight_factor = labels * alpha + (1.0 - labels) * (1.0 - alpha)
return modulating_factor * alpha_weight_factor * sce_loss * weights
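# Per-element focal loss: FL = alpha_t * (1 - p_t)^gamma * CE(logits, labels),
# with p_t the predicted probability of the true class; callers pass `weights`
# to normalize by the number of foreground points.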
def get_reg_loss(pred_reg, reg_label, fg_mask, point_num, loc_scope,
loc_bin_size, num_head_bin, anchor_size,
get_xz_fine=True, get_y_by_bin=False, loc_y_scope=0.5,
loc_y_bin_size=0.25, get_ry_fine=False):
"""
Bin-based 3D bounding boxes regression loss. See https://arxiv.org/abs/1812.04244 for more details.
:param pred_reg: (N, C)
:param reg_label: (N, 7) [dx, dy, dz, h, w, l, ry]
:param loc_scope: constant
:param loc_bin_size: constant
:param num_head_bin: constant
:param anchor_size: (N, 3) or (3)
:param get_xz_fine:
:param get_y_by_bin:
:param loc_y_scope:
:param loc_y_bin_size:
:param get_ry_fine:
:return:
"""
fg_num = fluid.layers.cast(fluid.layers.reduce_sum(fg_mask), dtype=pred_reg.dtype)
fg_num = fluid.layers.clip(fg_num, min=1.0, max=point_num)
fg_scale = float(point_num) / fg_num
per_loc_bin_num = int(loc_scope / loc_bin_size) * 2
loc_y_bin_num = int(loc_y_scope / loc_y_bin_size) * 2
reg_loss_dict = {}
# xz localization loss
x_offset_label, y_offset_label, z_offset_label = reg_label[:, 0:1], reg_label[:, 1:2], reg_label[:, 2:3]
x_shift = fluid.layers.clip(x_offset_label + loc_scope, 0., loc_scope * 2 - 1e-3)
z_shift = fluid.layers.clip(z_offset_label + loc_scope, 0., loc_scope * 2 - 1e-3)
x_bin_label = fluid.layers.cast(x_shift / loc_bin_size, dtype='int64')
z_bin_label = fluid.layers.cast(z_shift / loc_bin_size, dtype='int64')
x_bin_l, x_bin_r = 0, per_loc_bin_num
z_bin_l, z_bin_r = per_loc_bin_num, per_loc_bin_num * 2
start_offset = z_bin_r
loss_x_bin = fluid.layers.softmax_with_cross_entropy(pred_reg[:, x_bin_l: x_bin_r], x_bin_label)
loss_x_bin = fluid.layers.reduce_mean(loss_x_bin * fg_mask) * fg_scale
loss_z_bin = fluid.layers.softmax_with_cross_entropy(pred_reg[:, z_bin_l: z_bin_r], z_bin_label)
loss_z_bin = fluid.layers.reduce_mean(loss_z_bin * fg_mask) * fg_scale
reg_loss_dict['loss_x_bin'] = loss_x_bin
reg_loss_dict['loss_z_bin'] = loss_z_bin
loc_loss = loss_x_bin + loss_z_bin
if get_xz_fine:
x_res_l, x_res_r = per_loc_bin_num * 2, per_loc_bin_num * 3
z_res_l, z_res_r = per_loc_bin_num * 3, per_loc_bin_num * 4
start_offset = z_res_r
x_res_label = x_shift - (fluid.layers.cast(x_bin_label, dtype=x_shift.dtype) * loc_bin_size + loc_bin_size / 2.)
z_res_label = z_shift - (fluid.layers.cast(z_bin_label, dtype=z_shift.dtype) * loc_bin_size + loc_bin_size / 2.)
x_res_norm_label = x_res_label / loc_bin_size
z_res_norm_label = z_res_label / loc_bin_size
x_bin_onehot = fluid.layers.one_hot(x_bin_label, depth=per_loc_bin_num)
z_bin_onehot = fluid.layers.one_hot(z_bin_label, depth=per_loc_bin_num)
loss_x_res = fluid.layers.smooth_l1(fluid.layers.reduce_sum(pred_reg[:, x_res_l: x_res_r] * x_bin_onehot, dim=1, keep_dim=True), x_res_norm_label)
loss_x_res = fluid.layers.reduce_mean(loss_x_res * fg_mask) * fg_scale
loss_z_res = fluid.layers.smooth_l1(fluid.layers.reduce_sum(pred_reg[:, z_res_l: z_res_r] * z_bin_onehot, dim=1, keep_dim=True), z_res_norm_label)
loss_z_res = fluid.layers.reduce_mean(loss_z_res * fg_mask) * fg_scale
reg_loss_dict['loss_x_res'] = loss_x_res
reg_loss_dict['loss_z_res'] = loss_z_res
loc_loss += loss_x_res + loss_z_res
# y localization loss
if get_y_by_bin:
y_bin_l, y_bin_r = start_offset, start_offset + loc_y_bin_num
y_res_l, y_res_r = y_bin_r, y_bin_r + loc_y_bin_num
start_offset = y_res_r
y_shift = fluid.layers.clip(y_offset_label + loc_y_scope, 0., loc_y_scope * 2 - 1e-3)
y_bin_label = fluid.layers.cast(y_shift / loc_y_bin_size, dtype='int64')
y_res_label = y_shift - (fluid.layers.cast(y_bin_label, dtype=y_shift.dtype) * loc_y_bin_size + loc_y_bin_size / 2.)
y_res_norm_label = y_res_label / loc_y_bin_size
        y_bin_onehot = fluid.layers.one_hot(y_bin_label, depth=loc_y_bin_num)
        loss_y_bin = fluid.layers.softmax_with_cross_entropy(pred_reg[:, y_bin_l: y_bin_r], y_bin_label)
loss_y_bin = fluid.layers.reduce_mean(loss_y_bin * fg_mask) * fg_scale
loss_y_res = fluid.layers.smooth_l1(fluid.layers.reduce_sum(pred_reg[:, y_res_l: y_res_r] * y_bin_onehot, dim=1, keep_dim=True), y_res_norm_label)
loss_y_res = fluid.layers.reduce_mean(loss_y_res * fg_mask) * fg_scale
reg_loss_dict['loss_y_bin'] = loss_y_bin
reg_loss_dict['loss_y_res'] = loss_y_res
loc_loss += loss_y_bin + loss_y_res
else:
y_offset_l, y_offset_r = start_offset, start_offset + 1
start_offset = y_offset_r
loss_y_offset = fluid.layers.smooth_l1(fluid.layers.reduce_sum(pred_reg[:, y_offset_l: y_offset_r], dim=1, keep_dim=True), y_offset_label)
loss_y_offset = fluid.layers.reduce_mean(loss_y_offset * fg_mask) * fg_scale
reg_loss_dict['loss_y_offset'] = loss_y_offset
loc_loss += loss_y_offset
# angle loss
ry_bin_l, ry_bin_r = start_offset, start_offset + num_head_bin
ry_res_l, ry_res_r = ry_bin_r, ry_bin_r + num_head_bin
ry_label = reg_label[:, 6:7]
if get_ry_fine:
# divide pi/2 into several bins
angle_per_class = (np.pi / 2) / num_head_bin
ry_label = ry_label % (2 * np.pi) # 0 ~ 2pi
opposite_flag = fluid.layers.logical_and(ry_label > np.pi * 0.5, ry_label < np.pi * 1.5)
opposite_flag = fluid.layers.cast(opposite_flag, dtype=ry_label.dtype)
shift_angle = (ry_label + opposite_flag * np.pi + np.pi * 0.5) % (2 * np.pi) # (0 ~ pi)
shift_angle.stop_gradient = True
shift_angle = fluid.layers.clip(shift_angle - np.pi * 0.25, min=1e-3, max=np.pi * 0.5 - 1e-3) # (0, pi/2)
# bin center is (5, 10, 15, ..., 85)
ry_bin_label = fluid.layers.cast(shift_angle / angle_per_class, dtype='int64')
ry_res_label = shift_angle - (fluid.layers.cast(ry_bin_label, dtype=shift_angle.dtype) * angle_per_class + angle_per_class / 2)
ry_res_norm_label = ry_res_label / (angle_per_class / 2)
else:
# divide 2pi into several bins
angle_per_class = (2 * np.pi) / num_head_bin
heading_angle = ry_label % (2 * np.pi) # 0 ~ 2pi
shift_angle = (heading_angle + angle_per_class / 2) % (2 * np.pi)
shift_angle.stop_gradient = True
ry_bin_label = fluid.layers.cast(shift_angle / angle_per_class, dtype='int64')
ry_res_label = shift_angle - (fluid.layers.cast(ry_bin_label, dtype=shift_angle.dtype) * angle_per_class + angle_per_class / 2)
ry_res_norm_label = ry_res_label / (angle_per_class / 2)
ry_bin_onehot = fluid.layers.one_hot(ry_bin_label, depth=num_head_bin)
loss_ry_bin = fluid.layers.softmax_with_cross_entropy(pred_reg[:, ry_bin_l:ry_bin_r], ry_bin_label)
loss_ry_bin = fluid.layers.reduce_mean(loss_ry_bin * fg_mask) * fg_scale
loss_ry_res = fluid.layers.smooth_l1(fluid.layers.reduce_sum(pred_reg[:, ry_res_l: ry_res_r] * ry_bin_onehot, dim=1, keep_dim=True), ry_res_norm_label)
loss_ry_res = fluid.layers.reduce_mean(loss_ry_res * fg_mask) * fg_scale
reg_loss_dict['loss_ry_bin'] = loss_ry_bin
reg_loss_dict['loss_ry_res'] = loss_ry_res
angle_loss = loss_ry_bin + loss_ry_res
# size loss
size_res_l, size_res_r = ry_res_r, ry_res_r + 3
assert pred_reg.shape[1] == size_res_r, '%d vs %d' % (pred_reg.shape[1], size_res_r)
anchor_size_var = fluid.layers.zeros(shape=[3], dtype=reg_label.dtype)
fluid.layers.assign(np.array(anchor_size).astype('float32'), anchor_size_var)
size_res_norm_label = (reg_label[:, 3:6] - anchor_size_var) / anchor_size_var
size_res_norm_label = fluid.layers.reshape(size_res_norm_label, shape=[-1, 1], inplace=True)
size_res_norm = pred_reg[:, size_res_l:size_res_r]
size_res_norm = fluid.layers.reshape(size_res_norm, shape=[-1, 1], inplace=True)
size_loss = fluid.layers.smooth_l1(size_res_norm, size_res_norm_label)
size_loss = fluid.layers.reduce_mean(fluid.layers.reshape(size_loss, [-1, 3]) * fg_mask) * fg_scale
# Total regression loss
reg_loss_dict['loss_loc'] = loc_loss
reg_loss_dict['loss_angle'] = angle_loss
reg_loss_dict['loss_size'] = size_loss
return loc_loss, angle_loss, size_loss, reg_loss_dict
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import numpy as np
from collections import OrderedDict
import paddle.fluid as fluid
from paddle.fluid.param_attr import ParamAttr
from paddle.fluid.initializer import Constant
from models.rpn import RPN
from models.rcnn import RCNN
__all__ = ["PointRCNN"]
class PointRCNN(object):
def __init__(self, cfg, batch_size, use_xyz=True, mode='TRAIN', prog=None):
self.cfg = cfg
self.batch_size = batch_size
self.use_xyz = use_xyz
self.mode = mode
self.is_train = mode == 'TRAIN'
self.num_points = self.cfg.RPN.NUM_POINTS
self.prog = prog
self.inputs = None
self.pyreader = None
def build_inputs(self):
self.inputs = OrderedDict()
if self.cfg.RPN.ENABLED:
self.inputs['sample_id'] = fluid.layers.data(name='sample_id', shape=[1], dtype='int32')
self.inputs['pts_input'] = fluid.layers.data(name='pts_input', shape=[self.num_points, 3], dtype='float32')
self.inputs['pts_rect'] = fluid.layers.data(name='pts_rect', shape=[self.num_points, 3], dtype='float32')
self.inputs['pts_features'] = fluid.layers.data(name='pts_features', shape=[self.num_points, 1], dtype='float32')
self.inputs['rpn_cls_label'] = fluid.layers.data(name='rpn_cls_label', shape=[self.num_points], dtype='int32')
self.inputs['rpn_reg_label'] = fluid.layers.data(name='rpn_reg_label', shape=[self.num_points, 7], dtype='float32')
self.inputs['gt_boxes3d'] = fluid.layers.data(name='gt_boxes3d', shape=[7], lod_level=1, dtype='float32')
if self.cfg.RCNN.ENABLED:
if self.cfg.RCNN.ROI_SAMPLE_JIT:
self.inputs['sample_id'] = fluid.layers.data(name='sample_id', shape=[1], dtype='int32', append_batch_size=False)
self.inputs['rpn_xyz'] = fluid.layers.data(name='rpn_xyz', shape=[self.num_points, 3], dtype='float32', append_batch_size=False)
self.inputs['rpn_features'] = fluid.layers.data(name='rpn_features', shape=[self.num_points,128], dtype='float32', append_batch_size=False)
self.inputs['rpn_intensity'] = fluid.layers.data(name='rpn_intensity', shape=[self.num_points], dtype='float32', append_batch_size=False)
self.inputs['seg_mask'] = fluid.layers.data(name='seg_mask', shape=[self.num_points], dtype='float32', append_batch_size=False)
self.inputs['roi_boxes3d'] = fluid.layers.data(name='roi_boxes3d', shape=[-1, -1, 7], dtype='float32', append_batch_size=False, lod_level=0)
self.inputs['pts_depth'] = fluid.layers.data(name='pts_depth', shape=[self.num_points], dtype='float32', append_batch_size=False)
self.inputs['gt_boxes3d'] = fluid.layers.data(name='gt_boxes3d', shape=[-1, -1, 7], dtype='float32', append_batch_size=False, lod_level=0)
else:
self.inputs['sample_id'] = fluid.layers.data(name='sample_id', shape=[-1], dtype='int32', append_batch_size=False)
self.inputs['pts_input'] = fluid.layers.data(name='pts_input', shape=[-1,512,133], dtype='float32', append_batch_size=False)
self.inputs['pts_feature'] = fluid.layers.data(name='pts_feature', shape=[-1,512,128], dtype='float32', append_batch_size=False)
self.inputs['roi_boxes3d'] = fluid.layers.data(name='roi_boxes3d', shape=[-1,7], dtype='float32', append_batch_size=False)
if self.is_train:
self.inputs['cls_label'] = fluid.layers.data(name='cls_label', shape=[-1], dtype='float32', append_batch_size=False)
self.inputs['reg_valid_mask'] = fluid.layers.data(name='reg_valid_mask', shape=[-1], dtype='float32', append_batch_size=False)
self.inputs['gt_boxes3d_ct'] = fluid.layers.data(name='gt_boxes3d_ct', shape=[-1,7], dtype='float32', append_batch_size=False)
self.inputs['gt_of_rois'] = fluid.layers.data(name='gt_of_rois', shape=[-1,7], dtype='float32', append_batch_size=False)
else:
self.inputs['roi_scores'] = fluid.layers.data(name='roi_scores', shape=[-1,], dtype='float32', append_batch_size=False)
self.inputs['gt_iou'] = fluid.layers.data(name='gt_iou', shape=[-1], dtype='float32', append_batch_size=False)
self.inputs['gt_boxes3d'] = fluid.layers.data(name='gt_boxes3d', shape=[-1,-1,7], dtype='float32', append_batch_size=False, lod_level=0)
self.pyreader = fluid.io.PyReader(
feed_list=list(self.inputs.values()),
capacity=64,
use_double_buffer=True,
iterable=False)
def build(self):
self.build_inputs()
if self.cfg.RPN.ENABLED:
self.rpn = RPN(self.cfg, self.batch_size, self.use_xyz,
self.mode, self.prog)
self.rpn.build(self.inputs)
self.rpn_outputs = self.rpn.get_outputs()
self.outputs = self.rpn_outputs
if self.cfg.RCNN.ENABLED:
self.rcnn = RCNN(self.cfg, 1, self.batch_size, self.mode)
self.rcnn.build_model(self.inputs)
self.outputs = self.rcnn.get_outputs()
if self.mode == 'TRAIN':
if self.cfg.RPN.ENABLED:
self.outputs['rpn_loss'], self.outputs['rpn_loss_cls'], \
self.outputs['rpn_loss_reg'] = self.rpn.get_loss()
if self.cfg.RCNN.ENABLED:
self.outputs['rcnn_loss'], self.outputs['rcnn_loss_cls'], \
self.outputs['rcnn_loss_reg'] = self.rcnn.get_loss()
self.outputs['loss'] = self.outputs.get('rpn_loss', 0.) \
+ self.outputs.get('rcnn_loss', 0.)
def get_feeds(self):
return list(self.inputs.keys())
def get_outputs(self):
return self.outputs
def get_loss(self):
rpn_loss, _, _ = self.rpn.get_loss()
rcnn_loss, _, _ = self.rcnn.get_loss()
return rpn_loss + rcnn_loss
def get_pyreader(self):
return self.pyreader
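# Typical usage (a minimal sketch mirroring eval.py above):
#     model = PointRCNN(cfg, batch_size=1, use_xyz=True, mode='TEST')
#     model.build()
#     pyreader = model.get_pyreader()
#     outputs = model.get_outputs()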
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
Contains PointNet++ utility functions.
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import numpy as np
import paddle.fluid as fluid
from paddle.fluid.param_attr import ParamAttr
from paddle.fluid.initializer import Constant
from ext_op import *
__all__ = ["conv_bn", "pointnet_sa_module", "pointnet_fp_module", "MLP"]
def query_and_group(xyz, new_xyz, radius, nsample, features=None, use_xyz=True):
"""
Perform query_ball and group_points
Args:
        xyz (Variable): xyz coordinates features with shape [B, N, 3]
        new_xyz (Variable): centroids features with shape [B, npoint, 3]
        radius (float32): radius of ball
        nsample (int32): maximum number of gather features
        features (Variable): features with shape [B, C, N]
        use_xyz (bool): whether use xyz coordinates features
    Returns:
        out (Variable): features with shape [B, C + 3, npoint, nsample]
"""
idx = query_ball(xyz, new_xyz, radius, nsample)
idx.stop_gradient = True
xyz = fluid.layers.transpose(xyz,perm=[0, 2, 1])
grouped_xyz = group_points(xyz, idx)
expand_new_xyz = fluid.layers.unsqueeze(fluid.layers.transpose(new_xyz, perm=[0, 2, 1]), axes=[-1])
expand_new_xyz = fluid.layers.expand(expand_new_xyz, [1, 1, 1, grouped_xyz.shape[3]])
grouped_xyz -= expand_new_xyz
if features is not None:
grouped_features = group_points(features, idx)
return fluid.layers.concat([grouped_xyz, grouped_features], axis=1) \
if use_xyz else grouped_features
else:
assert use_xyz, "use_xyz should be True when features is None"
return grouped_xyz
def group_all(xyz, features=None, use_xyz=True):
"""
Group all xyz and features when npoint is None
See query_and_group
"""
xyz = fluid.layers.transpose(xyz,perm=[0, 2, 1])
grouped_xyz = fluid.layers.unsqueeze(xyz, axes=[2])
if features is not None:
grouped_features = fluid.layers.unsqueeze(features, axes=[2])
return fluid.layers.concat([grouped_xyz, grouped_features], axis=1) if use_xyz else grouped_features
else:
return grouped_xyz
def conv_bn(input, out_channels, bn=True, bn_momentum=0.95, act='relu', name=None):
param_attr = ParamAttr(name='{}_conv_weight'.format(name),)
bias_attr = ParamAttr(name='{}_conv_bias'.format(name)) \
if not bn else False
out = fluid.layers.conv2d(input,
num_filters=out_channels,
filter_size=1,
stride=1,
padding=0,
dilation=1,
param_attr=param_attr,
bias_attr=bias_attr,
act=act if not bn else None)
if bn:
bn_name = name + "_bn"
out = fluid.layers.batch_norm(out,
act=act,
momentum=bn_momentum,
param_attr=ParamAttr(name=bn_name + "_scale"),
bias_attr=ParamAttr(name=bn_name + "_offset"),
moving_mean_name=bn_name + '_mean',
moving_variance_name=bn_name + '_var')
return out
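# With 1x1 kernels, conv_bn acts as a per-point shared fully connected layer
# over the channel dimension; MLP below simply stacks such layers.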
def MLP(features, out_channels_list, bn=True, bn_momentum=0.95, act='relu', name=None):
out = features
for i, out_channels in enumerate(out_channels_list):
out = conv_bn(out, out_channels, bn=bn, act=act, bn_momentum=bn_momentum, name=name + "_{}".format(i))
return out
def pointnet_sa_module(xyz,
npoint=None,
radiuss=[],
nsamples=[],
mlps=[],
feature=None,
bn=True,
bn_momentum=0.95,
use_xyz=True,
name=None):
"""
PointNet MSG(Multi-Scale Group) Set Abstraction Module.
Call with radiuss, nsamples, mlps as single element list for
SSG(Single-Scale Group).
Args:
        xyz (Variable): xyz coordinates features with shape [B, N, 3]
        radiuss ([float32]): list of radius of ball
        nsamples ([int32]): list of maximum number of gather features
        mlps ([[int32]]): list of out_channels_list
        feature (Variable): features with shape [B, C, N]
        bn (bool): whether perform batch norm after conv2d
        bn_momentum (float): momentum of batch norm
        use_xyz (bool): whether use xyz coordinates features
    Returns:
        new_xyz (Variable): centroids features with shape [B, npoint, 3]
        out (Variable): features with shape [B, \sum_i{mlps[i][-1]}, npoint]
"""
assert len(radiuss) == len(nsamples) == len(mlps), \
"radiuss, nsamples, mlps length should be same"
farthest_idx = farthest_point_sampling(xyz, npoint)
farthest_idx.stop_gradient = True
new_xyz = gather_point(xyz, farthest_idx) if npoint is not None else None
outs = []
for i, (radius, nsample, mlp) in enumerate(zip(radiuss, nsamples, mlps)):
out = query_and_group(xyz, new_xyz, radius, nsample, feature, use_xyz) if npoint is not None else group_all(xyz, feature, use_xyz)
out = MLP(out, mlp, bn=bn, bn_momentum=bn_momentum, name=name + '_mlp{}'.format(i))
out = fluid.layers.pool2d(out, pool_size=[1, out.shape[3]], pool_type='max')
out = fluid.layers.squeeze(out, axes=[-1])
outs.append(out)
out = fluid.layers.concat(outs, axis=1)
return (new_xyz, out)
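# Example with hypothetical shapes: a two-scale MSG SA layer
#     new_xyz, out = pointnet_sa_module(
#         xyz, npoint=1024, radiuss=[0.1, 0.5], nsamples=[16, 32],
#         mlps=[[16, 16, 32], [32, 32, 64]], feature=feat, name='sa_0')
# returns out with 32 + 64 channels (concatenation over the two scales).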
def pointnet_fp_module(unknown, known, unknown_feats, known_feats, mlp, bn=True, bn_momentum=0.95, name=None):
"""
PointNet Feature Propagation Module
Args:
        unknown (Variable): unknown xyz coordinates features with shape [B, N, 3]
        known (Variable): known xyz coordinates features with shape [B, M, 3]
unknown_feats (Variable): unknown features with shape [B, N, C1] to be propagated to
known_feats (Variable): known features with shape [B, M, C2] to be propagated from
mlp ([int32]): out_channels_list
bn (bool): whether perform batch norm after conv2d
Returns:
new_features (Variable): new features with shape [B, N, mlp[-1]]
"""
if known is None:
raise NotImplementedError("Not implement known as None currently.")
else:
dist, idx = three_nn(unknown, known, eps=0.)
dist.stop_gradient = True
idx.stop_gradient = True
dist = fluid.layers.sqrt(dist)
ones = fluid.layers.fill_constant_batch_size_like(dist, dist.shape, dist.dtype, 1)
        dist_recip = ones / (dist + 1e-8)  # 1.0 / dist
norm = fluid.layers.reduce_sum(dist_recip, dim=-1, keep_dim=True)
weight = dist_recip / norm
weight.stop_gradient = True
interp_feats = three_interp(known_feats, weight, idx)
new_features = interp_feats if unknown_feats is None else \
fluid.layers.concat([interp_feats, unknown_feats], axis=-1)
new_features = fluid.layers.transpose(new_features, perm=[0, 2, 1])
new_features = fluid.layers.unsqueeze(new_features, axes=[-1])
new_features = MLP(new_features, mlp, bn=bn, bn_momentum=bn_momentum, name=name + '_mlp')
new_features = fluid.layers.squeeze(new_features, axes=[-1])
new_features = fluid.layers.transpose(new_features, perm=[0, 2, 1])
return new_features
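# Interpolation uses inverse distance weighting over the 3 nearest neighbors:
#     w_i = (1 / d_i) / sum_j (1 / d_j)
# so closer known points contribute more to each propagated feature.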
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
Contains the PointNet++ MSG backbone model.
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import numpy as np
import paddle.fluid as fluid
from paddle.fluid.param_attr import ParamAttr
from paddle.fluid.initializer import Constant
from models.pointnet2_modules import *
__all__ = ["PointNet2MSG"]
class PointNet2MSG(object):
def __init__(self, cfg, xyz, feature=None, use_xyz=True):
self.cfg = cfg
self.xyz = xyz
self.feature = feature
self.use_xyz = use_xyz
self.model_config()
def model_config(self):
self.SA_confs = []
for i in range(self.cfg.RPN.SA_CONFIG.NPOINTS.__len__()):
self.SA_confs.append({
"npoint": self.cfg.RPN.SA_CONFIG.NPOINTS[i],
"radiuss": self.cfg.RPN.SA_CONFIG.RADIUS[i],
"nsamples": self.cfg.RPN.SA_CONFIG.NSAMPLE[i],
"mlps": self.cfg.RPN.SA_CONFIG.MLPS[i],
})
self.FP_confs = []
for i in range(self.cfg.RPN.FP_MLPS.__len__()):
self.FP_confs.append({"mlp": self.cfg.RPN.FP_MLPS[i]})
def build(self, bn_momentum=0.95):
xyzs, features = [self.xyz], [self.feature]
xyzi, featurei = self.xyz, self.feature
for i, SA_conf in enumerate(self.SA_confs):
xyzi, featurei = pointnet_sa_module(
xyz=xyzi,
feature=featurei,
bn_momentum=bn_momentum,
use_xyz=self.use_xyz,
name="sa_{}".format(i),
**SA_conf)
xyzs.append(xyzi)
features.append(fluid.layers.transpose(featurei, perm=[0, 2, 1]))
for i in range(-1, -(len(self.FP_confs) + 1), -1):
features[i - 1] = pointnet_fp_module(
unknown=xyzs[i - 1],
known=xyzs[i],
unknown_feats=features[i - 1],
known_feats=features[i],
bn_momentum=bn_momentum,
name="fp_{}".format(i + len(self.FP_confs)),
**self.FP_confs[i])
return xyzs[0], features[0]
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import numpy as np
import sys
import paddle.fluid as fluid
from paddle.fluid.param_attr import ParamAttr
from paddle.fluid.initializer import Constant
from models.pointnet2_modules import MLP, pointnet_sa_module, conv_bn
from models.loss_utils import sigmoid_focal_loss, get_reg_loss
from utils.proposal_target import get_proposal_target_func
from utils.cyops.kitti_utils import rotate_pc_along_y
__all__ = ['RCNN']
class RCNN(object):
def __init__(self, cfg, num_classes, batch_size, mode='TRAIN', use_xyz=True, input_channels=0):
self.cfg = cfg
self.use_xyz = use_xyz
self.num_classes = num_classes
self.input_channels = input_channels
self.inputs = None
self.training = mode == 'TRAIN'
self.batch_size = batch_size
def create_tmp_var(self, name, dtype, shape):
return fluid.default_main_program().current_block().create_var(
name=name, dtype=dtype, shape=shape
)
def build_model(self, inputs):
self.inputs = inputs
if self.cfg.RCNN.ROI_SAMPLE_JIT:
if self.training:
proposal_target = get_proposal_target_func(self.cfg)
tmp_list = [
self.inputs['seg_mask'],
self.inputs['rpn_features'],
self.inputs['gt_boxes3d'],
self.inputs['rpn_xyz'],
self.inputs['pts_depth'],
self.inputs['roi_boxes3d'],
self.inputs['rpn_intensity'],
]
                out_name = ['reg_valid_mask', 'sampled_pts', 'roi_boxes3d', 'gt_of_rois', 'pts_feature', 'cls_label', 'gt_iou']
reg_valid_mask = self.create_tmp_var(name="reg_valid_mask",dtype='float32',shape=[-1,])
sampled_pts = self.create_tmp_var(name="sampled_pts",dtype='float32',shape=[-1, self.cfg.RCNN.NUM_POINTS, 3])
new_roi_boxes3d = self.create_tmp_var(name="new_roi_boxes3d",dtype='float32',shape=[-1, 7])
gt_of_rois = self.create_tmp_var(name="gt_of_rois", dtype='float32', shape=[-1,7])
pts_feature = self.create_tmp_var(name="pts_feature", dtype='float32',shape=[-1,512,130])
cls_label = self.create_tmp_var(name="cls_label",dtype='int64',shape=[-1])
gt_iou = self.create_tmp_var(name="gt_iou",dtype='float32',shape=[-1])
out_list = [reg_valid_mask, sampled_pts, new_roi_boxes3d, gt_of_rois, pts_feature, cls_label, gt_iou]
out = fluid.layers.py_func(func=proposal_target,x=tmp_list,out=out_list)
self.target_dict = {}
for i,item in enumerate(out):
self.target_dict[out_name[i]] = item
pts = fluid.layers.concat(input=[self.target_dict['sampled_pts'],self.target_dict['pts_feature']], axis=2)
self.debug = pts
self.target_dict['pts_input'] = pts
else:
rpn_xyz, rpn_features = inputs['rpn_xyz'], inputs['rpn_features']
batch_rois = inputs['roi_boxes3d']
rpn_intensity = inputs['rpn_intensity']
rpn_intensity = fluid.layers.unsqueeze(rpn_intensity,axes=[2])
seg_mask = fluid.layers.unsqueeze(inputs['seg_mask'],axes=[2])
if self.cfg.RCNN.USE_INTENSITY:
pts_extra_input_list = [rpn_intensity, seg_mask]
else:
pts_extra_input_list = [seg_mask]
if self.cfg.RCNN.USE_DEPTH:
                    pts_depth = inputs['pts_depth'] / 70.0 - 0.5
pts_depth = fluid.layers.unsqueeze(pts_depth,axes=[2])
pts_extra_input_list.append(pts_depth)
pts_extra_input = fluid.layers.concat(pts_extra_input_list, axis=2)
pts_feature = fluid.layers.concat([pts_extra_input, rpn_features],axis=2)
pooled_features, pooled_empty_flag = fluid.layers.roi_pool_3d(rpn_xyz,pts_feature,batch_rois,
self.cfg.RCNN.POOL_EXTRA_WIDTH,
sampled_pt_num=self.cfg.RCNN.NUM_POINTS)
# canonical transformation
batch_size = batch_rois.shape[0]
roi_center = batch_rois[:, :, 0:3]
tmp = pooled_features[:, :, :, 0:3] - fluid.layers.unsqueeze(roi_center,axes=[2])
pooled_features = fluid.layers.concat(input=[tmp,pooled_features[:,:,:,3:]],axis=3)
concat_list = []
for i in range(batch_size):
tmp = rotate_pc_along_y(pooled_features[i, :, :, 0:3],
batch_rois[i, :, 6])
concat = fluid.layers.concat([tmp,pooled_features[i,:,:,3:]],axis=-1)
concat = fluid.layers.unsqueeze(concat,axes=[0])
concat_list.append(concat)
pooled_features = fluid.layers.concat(concat_list,axis=0)
pts = fluid.layers.reshape(pooled_features,shape=[-1,pooled_features.shape[2],pooled_features.shape[3]])
else:
pts = inputs['pts_input']
self.target_dict = {}
self.target_dict['pts_input'] = inputs['pts_input']
self.target_dict['roi_boxes3d'] = inputs['roi_boxes3d']
if self.training:
self.target_dict['cls_label'] = inputs['cls_label']
self.target_dict['reg_valid_mask'] = inputs['reg_valid_mask']
self.target_dict['gt_of_rois'] = inputs['gt_boxes3d_ct']
xyz = pts[:,:,0:3]
feature = fluid.layers.transpose(pts[:,:,3:], [0,2,1]) if pts.shape[-1]>3 else None
if self.cfg.RCNN.USE_RPN_FEATURES:
self.rcnn_input_channel = 3 + int(self.cfg.RCNN.USE_INTENSITY) + \
int(self.cfg.RCNN.USE_MASK) + int(self.cfg.RCNN.USE_DEPTH)
c_out = self.cfg.RCNN.XYZ_UP_LAYER[-1]
xyz_input = pts[:,:,:self.rcnn_input_channel]
xyz_input = fluid.layers.transpose(xyz_input, [0,2,1])
xyz_input = fluid.layers.unsqueeze(xyz_input, axes=[3])
rpn_feature = pts[:,:,self.rcnn_input_channel:]
rpn_feature = fluid.layers.transpose(rpn_feature, [0,2,1])
rpn_feature = fluid.layers.unsqueeze(rpn_feature,axes=[3])
xyz_feature = MLP(
xyz_input,
out_channels_list=self.cfg.RCNN.XYZ_UP_LAYER,
bn=self.cfg.RCNN.USE_BN,
name="xyz_up_layer")
merged_feature = fluid.layers.concat([xyz_feature, rpn_feature],axis=1)
merged_feature = MLP(
merged_feature,
out_channels_list=[c_out],
bn=self.cfg.RCNN.USE_BN,
name="xyz_down_layer")
xyzs = [xyz]
features = [fluid.layers.squeeze(merged_feature,axes=[3])]
else:
xyzs = [xyz]
features = [feature]
# forward
xyzi, featurei = xyzs[-1], features[-1]
for k in range(len(self.cfg.RCNN.SA_CONFIG.NPOINTS)):
mlps = self.cfg.RCNN.SA_CONFIG.MLPS[k]
npoint = self.cfg.RCNN.SA_CONFIG.NPOINTS[k] if self.cfg.RCNN.SA_CONFIG.NPOINTS[k] != -1 else None
xyzi, featurei = pointnet_sa_module(
xyz=xyzi,
feature = featurei,
bn = self.cfg.RCNN.USE_BN,
use_xyz = self.use_xyz,
name = "sa_{}".format(k),
npoint = npoint,
mlps = [mlps],
radiuss = [self.cfg.RCNN.SA_CONFIG.RADIUS[k]],
nsamples = [self.cfg.RCNN.SA_CONFIG.NSAMPLE[k]]
)
xyzs.append(xyzi)
features.append(featurei)
head_in = features[-1]
head_in = fluid.layers.unsqueeze(head_in, axes=[2])
cls_out = head_in
reg_out = cls_out
for i in range(0, self.cfg.RCNN.CLS_FC.__len__()):
cls_out = conv_bn(cls_out, self.cfg.RCNN.CLS_FC[i], bn=self.cfg.RCNN.USE_BN, name='rcnn_cls_{}'.format(i))
if i == 0 and self.cfg.RCNN.DP_RATIO >= 0:
cls_out = fluid.layers.dropout(cls_out, self.cfg.RCNN.DP_RATIO, dropout_implementation="upscale_in_train")
cls_channel = 1 if self.num_classes == 2 else self.num_classes
cls_out = conv_bn(cls_out, cls_channel, act=None, name="cls_out", bn=self.cfg.RCNN.USE_BN)
self.cls_out = fluid.layers.squeeze(cls_out,axes=[1,3])
per_loc_bin_num = int(self.cfg.RCNN.LOC_SCOPE / self.cfg.RCNN.LOC_BIN_SIZE) * 2
loc_y_bin_num = int(self.cfg.RCNN.LOC_Y_SCOPE / self.cfg.RCNN.LOC_Y_BIN_SIZE) * 2
reg_channel = per_loc_bin_num * 4 + self.cfg.RCNN.NUM_HEAD_BIN * 2 + 3
reg_channel += (1 if not self.cfg.RCNN.LOC_Y_BY_BIN else loc_y_bin_num * 2)
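        # reg_channel layout matches get_reg_loss: x/z bin + residual channels
        # (per_loc_bin_num * 4), ry bin + residual (NUM_HEAD_BIN * 2), 3 size
        # residuals, plus either one y offset or y bin + residual channels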
for i in range(0, self.cfg.RCNN.REG_FC.__len__()):
reg_out = conv_bn(reg_out, self.cfg.RCNN.REG_FC[i], bn=self.cfg.RCNN.USE_BN, name='rcnn_reg_{}'.format(i))
if i == 0 and self.cfg.RCNN.DP_RATIO >= 0:
reg_out = fluid.layers.dropout(reg_out, self.cfg.RCNN.DP_RATIO, dropout_implementation="upscale_in_train")
reg_out = conv_bn(reg_out, reg_channel, act=None, name="reg_out", bn=self.cfg.RCNN.USE_BN)
self.reg_out = fluid.layers.squeeze(reg_out, axes=[2,3])
self.outputs = {
'rcnn_cls':self.cls_out,
'rcnn_reg':self.reg_out,
}
if self.training:
self.outputs.update(self.target_dict)
elif not self.training:
self.outputs['sample_id'] = inputs['sample_id']
self.outputs['pts_input'] = inputs['pts_input']
self.outputs['roi_boxes3d'] = inputs['roi_boxes3d']
self.outputs['roi_scores'] = inputs['roi_scores']
self.outputs['gt_iou'] = inputs['gt_iou']
self.outputs['gt_boxes3d'] = inputs['gt_boxes3d']
if self.cls_out.shape[1] == 1:
raw_scores = fluid.layers.reshape(self.cls_out, shape=[-1])
norm_scores = fluid.layers.sigmoid(raw_scores)
else:
norm_scores = fluid.layers.softmax(self.cls_out, axis=1)
self.outputs['norm_scores'] = norm_scores
def get_outputs(self):
return self.outputs
def get_loss(self):
        assert self.inputs is not None, \
            "please call build_model() first"
rcnn_cls_label = self.outputs['cls_label']
reg_valid_mask = self.outputs['reg_valid_mask']
roi_boxes3d = self.outputs['roi_boxes3d']
roi_size = roi_boxes3d[:, 3:6]
gt_boxes3d_ct = self.outputs['gt_of_rois']
pts_input = self.outputs['pts_input']
rcnn_cls = self.cls_out
rcnn_reg = self.reg_out
# RCNN classification loss
assert self.cfg.RCNN.LOSS_CLS in ["SigmoidFocalLoss", "BinaryCrossEntropy"], \
"unsupported RCNN cls loss type {}".format(self.cfg.RCNN.LOSS_CLS)
if self.cfg.RCNN.LOSS_CLS == "SigmoidFocalLoss":
cls_flat = fluid.layers.reshape(self.cls_out, shape=[-1])
cls_label_flat = fluid.layers.reshape(rcnn_cls_label, shape=[-1])
cls_label_flat = fluid.layers.cast(cls_label_flat, dtype=cls_flat.dtype)
cls_target = fluid.layers.cast(cls_label_flat>0, dtype=cls_flat.dtype)
cls_label_flat.stop_gradient = True
pos = fluid.layers.cast(cls_label_flat > 0, dtype=cls_flat.dtype)
pos.stop_gradient = True
pos_normalizer = fluid.layers.reduce_sum(pos)
cls_weights = fluid.layers.cast(cls_label_flat >= 0, dtype=cls_flat.dtype)
cls_weights = cls_weights / fluid.layers.clip(pos_normalizer, min=1.0, max=1e10)
cls_weights.stop_gradient = True
rcnn_loss_cls = sigmoid_focal_loss(cls_flat, cls_target, cls_weights)
rcnn_loss_cls = fluid.layers.reduce_sum(rcnn_loss_cls)
else: # BinaryCrossEntropy
cls_label = fluid.layers.reshape(rcnn_cls_label, shape=self.cls_out.shape)
cls_valid_mask = fluid.layers.cast(cls_label >= 0, dtype=self.cls_out.dtype)
cls_label = fluid.layers.cast(cls_label, dtype=self.cls_out.dtype)
cls_label.stop_gradient = True
rcnn_loss_cls = fluid.layers.sigmoid_cross_entropy_with_logits(self.cls_out, cls_label)
cls_mask_normalzer = fluid.layers.reduce_sum(cls_valid_mask)
rcnn_loss_cls = fluid.layers.reduce_sum(rcnn_loss_cls * cls_valid_mask) \
/ fluid.layers.clip(cls_mask_normalzer, min=1.0, max=1e10)
# RCNN regression loss
reg_out = self.reg_out
fg_mask = fluid.layers.cast(reg_valid_mask > 0, dtype=reg_out.dtype)
fg_mask.stop_gradient = True
gt_boxes3d_ct = fluid.layers.reshape(gt_boxes3d_ct, [-1,7])
all_anchor_size = roi_size
anchor_size = all_anchor_size[fg_mask] if self.cfg.RCNN.SIZE_RES_ON_ROI else self.cfg.CLS_MEAN_SIZE[0]
loc_loss, angle_loss, size_loss, loss_dict = get_reg_loss(
reg_out * fg_mask,
gt_boxes3d_ct,
fg_mask,
point_num=float(self.batch_size*64),
loc_scope=self.cfg.RCNN.LOC_SCOPE,
loc_bin_size=self.cfg.RCNN.LOC_BIN_SIZE,
num_head_bin=self.cfg.RCNN.NUM_HEAD_BIN,
anchor_size=anchor_size,
get_xz_fine=True,
get_y_by_bin=self.cfg.RCNN.LOC_Y_BY_BIN,
loc_y_scope=self.cfg.RCNN.LOC_Y_SCOPE,
loc_y_bin_size=self.cfg.RCNN.LOC_Y_BIN_SIZE,
get_ry_fine=True
)
rcnn_loss_reg = loc_loss + angle_loss + size_loss * 3
rcnn_loss = rcnn_loss_cls + rcnn_loss_reg
return rcnn_loss, rcnn_loss_cls, rcnn_loss_reg
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import numpy as np
import paddle.fluid as fluid
from paddle.fluid.param_attr import ParamAttr
from paddle.fluid.initializer import Normal, Constant
from utils.proposal_utils import get_proposal_func
from models.pointnet2_msg import PointNet2MSG
from models.pointnet2_modules import conv_bn
from models.loss_utils import sigmoid_focal_loss, get_reg_loss
__all__ = ["RPN"]
class RPN(object):
def __init__(self, cfg, batch_size, use_xyz=True, mode='TRAIN', prog=None):
self.cfg = cfg
self.batch_size = batch_size
self.use_xyz = use_xyz
self.mode = mode
self.is_train = mode == 'TRAIN'
self.inputs = None
self.prog = fluid.default_main_program() if prog is None else prog
def build(self, inputs):
        assert self.cfg.RPN.BACKBONE == 'pointnet2_msg', \
            "RPN backbone only supports pointnet2_msg"
self.inputs = inputs
self.outputs = {}
xyz = inputs["pts_input"]
        assert not self.cfg.RPN.USE_INTENSITY, \
            "RPN.USE_INTENSITY is not supported now"
feature = None
msg = PointNet2MSG(self.cfg, xyz, feature, self.use_xyz)
backbone_xyz, backbone_feature = msg.build()
self.outputs['backbone_xyz'] = backbone_xyz
self.outputs['backbone_feature'] = backbone_feature
backbone_feature = fluid.layers.transpose(backbone_feature, perm=[0, 2, 1])
cls_out = fluid.layers.unsqueeze(backbone_feature, axes=[-1])
reg_out = cls_out
# classification branch
for i in range(self.cfg.RPN.CLS_FC.__len__()):
cls_out = conv_bn(cls_out, self.cfg.RPN.CLS_FC[i], bn=self.cfg.RPN.USE_BN, name='rpn_cls_{}'.format(i))
if i == 0 and self.cfg.RPN.DP_RATIO > 0:
cls_out = fluid.layers.dropout(cls_out, self.cfg.RPN.DP_RATIO, dropout_implementation="upscale_in_train")
cls_out = fluid.layers.conv2d(cls_out,
num_filters=1,
filter_size=1,
stride=1,
padding=0,
dilation=1,
param_attr=ParamAttr(name='rpn_cls_out_conv_weight'),
bias_attr=ParamAttr(name='rpn_cls_out_conv_bias',
initializer=Constant(-np.log(99))))
cls_out = fluid.layers.squeeze(cls_out, axes=[1, 3])
self.outputs['rpn_cls'] = cls_out
# regression branch
per_loc_bin_num = int(self.cfg.RPN.LOC_SCOPE / self.cfg.RPN.LOC_BIN_SIZE) * 2
if self.cfg.RPN.LOC_XZ_FINE:
reg_channel = per_loc_bin_num * 4 + self.cfg.RPN.NUM_HEAD_BIN * 2 + 3
else:
reg_channel = per_loc_bin_num * 2 + self.cfg.RPN.NUM_HEAD_BIN * 2 + 3
reg_channel += 1 # reg y
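        # channel layout matches get_reg_loss: x/z bins (plus fine residuals
        # when LOC_XZ_FINE), ry bin + residual, 3 size residuals, one y offset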
for i in range(self.cfg.RPN.REG_FC.__len__()):
reg_out = conv_bn(reg_out, self.cfg.RPN.REG_FC[i], bn=self.cfg.RPN.USE_BN, name='rpn_reg_{}'.format(i))
if i == 0 and self.cfg.RPN.DP_RATIO > 0:
reg_out = fluid.layers.dropout(reg_out, self.cfg.RPN.DP_RATIO, dropout_implementation="upscale_in_train")
reg_out = fluid.layers.conv2d(reg_out,
num_filters=reg_channel,
filter_size=1,
stride=1,
padding=0,
dilation=1,
param_attr=ParamAttr(name='rpn_reg_out_conv_weight',
initializer=Normal(0., 0.001),),
bias_attr=ParamAttr(name='rpn_reg_out_conv_bias'))
reg_out = fluid.layers.squeeze(reg_out, axes=[3])
reg_out = fluid.layers.transpose(reg_out, [0, 2, 1])
self.outputs['rpn_reg'] = reg_out
if self.mode != 'TRAIN' or self.cfg.RCNN.ENABLED:
rpn_scores_row = cls_out
rpn_scores_norm = fluid.layers.sigmoid(rpn_scores_row)
seg_mask = fluid.layers.cast(rpn_scores_norm > self.cfg.RPN.SCORE_THRESH, dtype='float32')
pts_depth = fluid.layers.sqrt(fluid.layers.reduce_sum(backbone_xyz * backbone_xyz, dim=2))
proposal_func = get_proposal_func(self.cfg, self.mode)
proposal_input = fluid.layers.concat([fluid.layers.unsqueeze(rpn_scores_row, axes=[-1]),
backbone_xyz, reg_out], axis=-1)
proposal = self.prog.current_block().create_var(name='proposal',
shape=[-1, proposal_input.shape[1], 8],
dtype='float32')
fluid.layers.py_func(proposal_func, proposal_input, proposal)
rois, roi_scores_row = proposal[:, :, :7], proposal[:, :, -1]
self.outputs['rois'] = rois
self.outputs['roi_scores_row'] = roi_scores_row
self.outputs['seg_mask'] = seg_mask
self.outputs['pts_depth'] = pts_depth
def get_outputs(self):
return self.outputs
def get_loss(self):
assert self.inputs is not None, \
"please call build() first"
rpn_cls_label = self.inputs['rpn_cls_label']
rpn_reg_label = self.inputs['rpn_reg_label']
rpn_cls = self.outputs['rpn_cls']
rpn_reg = self.outputs['rpn_reg']
# RPN classification loss
assert self.cfg.RPN.LOSS_CLS == "SigmoidFocalLoss", \
"unsupported RPN cls loss type {}".format(self.cfg.RPN.LOSS_CLS)
cls_flat = fluid.layers.reshape(rpn_cls, shape=[-1])
cls_label_flat = fluid.layers.reshape(rpn_cls_label, shape=[-1])
cls_label_pos = fluid.layers.cast(cls_label_flat > 0, dtype=cls_flat.dtype)
pos_normalizer = fluid.layers.reduce_sum(cls_label_pos)
cls_weights = fluid.layers.cast(cls_label_flat >= 0, dtype=cls_flat.dtype)
cls_weights = cls_weights / fluid.layers.clip(pos_normalizer, min=1.0, max=1e10)
cls_weights.stop_gradient = True
cls_label_flat = fluid.layers.cast(cls_label_flat, dtype=cls_flat.dtype)
cls_label_flat.stop_gradient = True
rpn_loss_cls = sigmoid_focal_loss(cls_flat, cls_label_pos, cls_weights)
rpn_loss_cls = fluid.layers.reduce_sum(rpn_loss_cls)
# RPN regression loss
rpn_reg = fluid.layers.reshape(rpn_reg, [-1, rpn_reg.shape[-1]])
reg_label = fluid.layers.reshape(rpn_reg_label, [-1, rpn_reg_label.shape[-1]])
fg_mask = fluid.layers.cast(cls_label_flat > 0, dtype=rpn_reg.dtype)
fg_mask.stop_gradient = True
loc_loss, angle_loss, size_loss, loss_dict = get_reg_loss(
rpn_reg * fg_mask, reg_label, fg_mask,
float(self.batch_size * self.cfg.RPN.NUM_POINTS),
loc_scope=self.cfg.RPN.LOC_SCOPE,
loc_bin_size=self.cfg.RPN.LOC_BIN_SIZE,
num_head_bin=self.cfg.RPN.NUM_HEAD_BIN,
anchor_size=self.cfg.CLS_MEAN_SIZE[0],
get_xz_fine=self.cfg.RPN.LOC_XZ_FINE,
get_y_by_bin=False,
get_ry_fine=False)
rpn_loss_reg = loc_loss + angle_loss + size_loss * 3
self.rpn_loss = rpn_loss_cls * self.cfg.RPN.LOSS_WEIGHT[0] + rpn_loss_reg * self.cfg.RPN.LOSS_WEIGHT[1]
return self.rpn_loss, rpn_loss_cls, rpn_loss_reg
Cython
opencv-python
shapely
scikit-image
Numba
fire
"""
Generate augmented scenes using the ground-truth database
This code is based on https://github.com/sshaoshuai/PointRCNN/blob/master/tools/generate_aug_scene.py
"""
import os
import numpy as np
import pickle
import pts_utils
import utils.cyops.kitti_utils as kitti_utils
from utils.box_utils import boxes_iou3d
from utils import calibration as calib
from data.kitti_dataset import KittiDataset
import argparse
np.random.seed(1024)
parser = argparse.ArgumentParser()
parser.add_argument('--mode', type=str, default='generator')
parser.add_argument('--class_name', type=str, default='Car')
parser.add_argument('--data_dir', type=str, default='./data')
parser.add_argument('--save_dir', type=str, default='./data/KITTI/aug_scene/training')
parser.add_argument('--split', type=str, default='train')
parser.add_argument('--gt_database_dir', type=str, default='./data/gt_database/train_gt_database_3level_Car.pkl')
parser.add_argument('--include_similar', action='store_true', default=False)
parser.add_argument('--aug_times', type=int, default=4)
args = parser.parse_args()
PC_REDUCE_BY_RANGE = True
if args.class_name == 'Car':
PC_AREA_SCOPE = np.array([[-40, 40], [-1, 3], [0, 70.4]]) # x, y, z scope in rect camera coords
else:
PC_AREA_SCOPE = np.array([[-30, 30], [-1, 3], [0, 50]])
def log_print(info, fp=None):
print(info)
if fp is not None:
# print(info, file=fp)
fp.write(info+"\n")
def save_kitti_format(calib, bbox3d, obj_list, img_shape, save_fp):
corners3d = kitti_utils.boxes3d_to_corners3d(bbox3d)
img_boxes, _ = calib.corners3d_to_img_boxes(corners3d)
img_boxes[:, 0] = np.clip(img_boxes[:, 0], 0, img_shape[1] - 1)
img_boxes[:, 1] = np.clip(img_boxes[:, 1], 0, img_shape[0] - 1)
img_boxes[:, 2] = np.clip(img_boxes[:, 2], 0, img_shape[1] - 1)
img_boxes[:, 3] = np.clip(img_boxes[:, 3], 0, img_shape[0] - 1)
# Discard boxes that are larger than 80% of the image width OR height
img_boxes_w = img_boxes[:, 2] - img_boxes[:, 0]
img_boxes_h = img_boxes[:, 3] - img_boxes[:, 1]
box_valid_mask = np.logical_and(img_boxes_w < img_shape[1] * 0.8, img_boxes_h < img_shape[0] * 0.8)
for k in range(bbox3d.shape[0]):
if box_valid_mask[k] == 0:
continue
x, z, ry = bbox3d[k, 0], bbox3d[k, 2], bbox3d[k, 6]
beta = np.arctan2(z, x)
alpha = -np.sign(beta) * np.pi / 2 + beta + ry
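        # KITTI observation angle: alpha = ry + beta - sign(beta) * pi / 2,
        # where beta is the azimuth of the box center in rect camera coordinates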
save_fp.write('%s %.2f %d %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f\n' %
(args.class_name, obj_list[k].trucation, int(obj_list[k].occlusion), alpha, img_boxes[k, 0], img_boxes[k, 1],
img_boxes[k, 2], img_boxes[k, 3],
bbox3d[k, 3], bbox3d[k, 4], bbox3d[k, 5], bbox3d[k, 0], bbox3d[k, 1], bbox3d[k, 2],
bbox3d[k, 6]))
class AugSceneGenerator(KittiDataset):
def __init__(self, root_dir, gt_database=None, split='train', classes=args.class_name):
super(AugSceneGenerator, self).__init__(root_dir, split=split)
self.gt_database = None
if classes == 'Car':
self.classes = ('Background', 'Car')
elif classes == 'People':
self.classes = ('Background', 'Pedestrian', 'Cyclist')
elif classes == 'Pedestrian':
self.classes = ('Background', 'Pedestrian')
elif classes == 'Cyclist':
self.classes = ('Background', 'Cyclist')
else:
assert False, "Invalid classes: %s" % classes
self.gt_database = gt_database
def __len__(self):
raise NotImplementedError
def __getitem__(self, item):
raise NotImplementedError
def filtrate_dc_objects(self, obj_list):
valid_obj_list = []
for obj in obj_list:
if obj.cls_type in ['DontCare']:
continue
valid_obj_list.append(obj)
return valid_obj_list
def filtrate_objects(self, obj_list):
valid_obj_list = []
type_whitelist = self.classes
if args.include_similar:
type_whitelist = list(self.classes)
if 'Car' in self.classes:
type_whitelist.append('Van')
if 'Pedestrian' in self.classes or 'Cyclist' in self.classes:
type_whitelist.append('Person_sitting')
for obj in obj_list:
if obj.cls_type in type_whitelist:
valid_obj_list.append(obj)
return valid_obj_list
@staticmethod
def get_valid_flag(pts_rect, pts_img, pts_rect_depth, img_shape):
"""
Valid point should be in the image (and in the PC_AREA_SCOPE)
:param pts_rect:
:param pts_img:
:param pts_rect_depth:
:param img_shape:
:return:
"""
val_flag_1 = np.logical_and(pts_img[:, 0] >= 0, pts_img[:, 0] < img_shape[1])
val_flag_2 = np.logical_and(pts_img[:, 1] >= 0, pts_img[:, 1] < img_shape[0])
val_flag_merge = np.logical_and(val_flag_1, val_flag_2)
pts_valid_flag = np.logical_and(val_flag_merge, pts_rect_depth >= 0)
if PC_REDUCE_BY_RANGE:
x_range, y_range, z_range = PC_AREA_SCOPE
pts_x, pts_y, pts_z = pts_rect[:, 0], pts_rect[:, 1], pts_rect[:, 2]
range_flag = (pts_x >= x_range[0]) & (pts_x <= x_range[1]) \
& (pts_y >= y_range[0]) & (pts_y <= y_range[1]) \
& (pts_z >= z_range[0]) & (pts_z <= z_range[1])
pts_valid_flag = pts_valid_flag & range_flag
return pts_valid_flag
@staticmethod
def check_pc_range(xyz):
"""
:param xyz: [x, y, z]
:return:
"""
x_range, y_range, z_range = PC_AREA_SCOPE
if (x_range[0] <= xyz[0] <= x_range[1]) and (y_range[0] <= xyz[1] <= y_range[1]) and \
(z_range[0] <= xyz[2] <= z_range[1]):
return True
return False
def aug_one_scene(self, sample_id, pts_rect, pts_intensity, all_gt_boxes3d):
"""
:param pts_rect: (N, 3)
:param gt_boxes3d: (M1, 7)
:param all_gt_boxex3d: (M2, 7)
:return:
"""
assert self.gt_database is not None
extra_gt_num = np.random.randint(10, 15)
try_times = 50
cnt = 0
cur_gt_boxes3d = all_gt_boxes3d.copy()
cur_gt_boxes3d[:, 4] += 0.5
cur_gt_boxes3d[:, 5] += 0.5 # enlarge existing boxes so newly added objects keep some clearance
extra_gt_obj_list = []
extra_gt_boxes3d_list = []
new_pts_list, new_pts_intensity_list = [], []
src_pts_flag = np.ones(pts_rect.shape[0], dtype=np.int32)
road_plane = self.get_road_plane(sample_id)
a, b, c, d = road_plane
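# sample objects from the GT database and try to insert them into the scene:
# at most try_times attempts, keeping roughly extra_gt_num non-overlapping objects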
while try_times > 0:
try_times -= 1
rand_idx = np.random.randint(0, self.gt_database.__len__() - 1)
new_gt_dict = self.gt_database[rand_idx]
new_gt_box3d = new_gt_dict['gt_box3d'].copy()
new_gt_points = new_gt_dict['points'].copy()
new_gt_intensity = new_gt_dict['intensity'].copy()
new_gt_obj = new_gt_dict['obj']
center = new_gt_box3d[0:3]
if PC_REDUCE_BY_RANGE and (self.check_pc_range(center) is False):
continue
if cnt > extra_gt_num:
break
if new_gt_points.__len__() < 5: # too few points
continue
# put it on the road plane
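# the plane satisfies a*x + b*y + c*z + d = 0; solve for y to get the road
# height under the box center, then shift the box and its points onto it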
cur_height = (-d - a * center[0] - c * center[2]) / b
move_height = new_gt_box3d[1] - cur_height
new_gt_box3d[1] -= move_height
new_gt_points[:, 1] -= move_height
cnt += 1
iou3d = boxes_iou3d(new_gt_box3d.reshape(1, 7), cur_gt_boxes3d)
valid_flag = iou3d.max() < 1e-8
if not valid_flag:
continue
enlarged_box3d = new_gt_box3d.copy()
enlarged_box3d[3] += 2 # remove the points above and below the object
boxes_pts_mask_list = pts_utils.pts_in_boxes3d(pts_rect, enlarged_box3d.reshape(1, 7))
pt_mask_flag = (boxes_pts_mask_list[0] == 1)
src_pts_flag[pt_mask_flag] = 0 # remove the original points which are inside the new box
new_pts_list.append(new_gt_points)
new_pts_intensity_list.append(new_gt_intensity)
enlarged_box3d = new_gt_box3d.copy()
enlarged_box3d[4] += 0.5
enlarged_box3d[5] += 0.5 # enlarge the newly added box so later additions keep some clearance
cur_gt_boxes3d = np.concatenate((cur_gt_boxes3d, enlarged_box3d.reshape(1, 7)), axis=0)
extra_gt_boxes3d_list.append(new_gt_box3d.reshape(1, 7))
extra_gt_obj_list.append(new_gt_obj)
if new_pts_list.__len__() == 0:
return False, pts_rect, pts_intensity, None, None
extra_gt_boxes3d = np.concatenate(extra_gt_boxes3d_list, axis=0)
# remove original points and add new points
pts_rect = pts_rect[src_pts_flag == 1]
pts_intensity = pts_intensity[src_pts_flag == 1]
new_pts_rect = np.concatenate(new_pts_list, axis=0)
new_pts_intensity = np.concatenate(new_pts_intensity_list, axis=0)
pts_rect = np.concatenate((pts_rect, new_pts_rect), axis=0)
pts_intensity = np.concatenate((pts_intensity, new_pts_intensity), axis=0)
return True, pts_rect, pts_intensity, extra_gt_boxes3d, extra_gt_obj_list
def aug_one_epoch_scene(self, base_id, data_save_dir, label_save_dir, split_list, log_fp=None):
for idx, sample_id in enumerate(self.image_idx_list):
sample_id = int(sample_id)
print('process gt sample (%s, id=%06d)' % (args.split, sample_id))
pts_lidar = self.get_lidar(sample_id)
calib = self.get_calib(sample_id)
pts_rect = calib.lidar_to_rect(pts_lidar[:, 0:3])
pts_img, pts_rect_depth = calib.rect_to_img(pts_rect)
img_shape = self.get_image_shape(sample_id)
pts_valid_flag = self.get_valid_flag(pts_rect, pts_img, pts_rect_depth, img_shape)
pts_rect = pts_rect[pts_valid_flag][:, 0:3]
pts_intensity = pts_lidar[pts_valid_flag][:, 3]
# all labels for checking overlapping
all_obj_list = self.filtrate_dc_objects(self.get_label(sample_id))
all_gt_boxes3d = np.zeros((all_obj_list.__len__(), 7), dtype=np.float32)
for k, obj in enumerate(all_obj_list):
all_gt_boxes3d[k, 0:3], all_gt_boxes3d[k, 3], all_gt_boxes3d[k, 4], all_gt_boxes3d[k, 5], \
all_gt_boxes3d[k, 6] = obj.pos, obj.h, obj.w, obj.l, obj.ry
# gt_boxes3d of current label
obj_list = self.filtrate_objects(self.get_label(sample_id))
if args.class_name != 'Car' and obj_list.__len__() == 0:
continue
# augment one scene
aug_flag, pts_rect, pts_intensity, extra_gt_boxes3d, extra_gt_obj_list = \
self.aug_one_scene(sample_id, pts_rect, pts_intensity, all_gt_boxes3d)
# save augment result to file
pts_info = np.concatenate((pts_rect, pts_intensity.reshape(-1, 1)), axis=1)
bin_file = os.path.join(data_save_dir, '%06d.bin' % (base_id + sample_id))
pts_info.astype(np.float32).tofile(bin_file)
# save filtered original gt_boxes3d
label_save_file = os.path.join(label_save_dir, '%06d.txt' % (base_id + sample_id))
with open(label_save_file, 'w') as f:
for obj in obj_list:
f.write(obj.to_kitti_format() + '\n')
if aug_flag:
# augment successfully
save_kitti_format(calib, extra_gt_boxes3d, extra_gt_obj_list, img_shape=img_shape, save_fp=f)
else:
extra_gt_boxes3d = np.zeros((0, 7), dtype=np.float32)
log_print('Save to file (new_obj: %s): %s' % (extra_gt_boxes3d.__len__(), label_save_file), fp=log_fp)
split_list.append('%06d' % (base_id + sample_id))
def generate_aug_scene(self, aug_times, log_fp=None):
data_save_dir = os.path.join(args.save_dir, 'rectified_data')
label_save_dir = os.path.join(args.save_dir, 'aug_label')
if not os.path.isdir(data_save_dir):
os.makedirs(data_save_dir)
if not os.path.isdir(label_save_dir):
os.makedirs(label_save_dir)
split_file = os.path.join(args.save_dir, '%s_aug.txt' % args.split)
split_list = self.image_idx_list[:]
for epoch in range(aug_times):
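# offset augmented sample ids by 10000 per round so they never collide with the original KITTI ids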
base_id = (epoch + 1) * 10000
self.aug_one_epoch_scene(base_id, data_save_dir, label_save_dir, split_list, log_fp=log_fp)
with open(split_file, 'w') as f:
for idx, sample_id in enumerate(split_list):
f.write(str(sample_id) + '\n')
log_print('Save split file to %s' % split_file, fp=log_fp)
target_dir = os.path.join(args.data_dir, 'KITTI/ImageSets/')
os.system('cp %s %s' % (split_file, target_dir))
log_print('Copy split file from %s to %s' % (split_file, target_dir), fp=log_fp)
if __name__ == '__main__':
if not os.path.isdir(args.save_dir):
os.makedirs(args.save_dir)
info_file = os.path.join(args.save_dir, 'log_info.txt')
if args.mode == 'generator':
log_fp = open(info_file, 'w')
gt_database = pickle.load(open(args.gt_database_dir, 'rb'))
log_print('Loading gt_database(%d) from %s' % (gt_database.__len__(), args.gt_database_dir), fp=log_fp)
dataset = AugSceneGenerator(root_dir=args.data_dir, gt_database=gt_database, split=args.split)
dataset.generate_aug_scene(aug_times=args.aug_times, log_fp=log_fp)
log_fp.close()
else:
pass
"""
Generate GT database
This code is based on https://github.com/sshaoshuai/PointRCNN/blob/master/tools/generate_gt_database.py
"""
import os
import numpy as np
import pickle
from data.kitti_dataset import KittiDataset
import pts_utils
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('--data_dir', type=str, default='./data')
parser.add_argument('--save_dir', type=str, default='./data/gt_database')
parser.add_argument('--class_name', type=str, default='Car')
parser.add_argument('--split', type=str, default='train')
args = parser.parse_args()
class GTDatabaseGenerator(KittiDataset):
def __init__(self, root_dir, split='train', classes=args.class_name):
super(GTDatabaseGenerator, self).__init__(root_dir, split=split)
self.gt_database = None
if classes == 'Car':
self.classes = ('Background', 'Car')
elif classes == 'People':
self.classes = ('Background', 'Pedestrian', 'Cyclist')
elif classes == 'Pedestrian':
self.classes = ('Background', 'Pedestrian')
elif classes == 'Cyclist':
self.classes = ('Background', 'Cyclist')
else:
assert False, "Invalid classes: %s" % classes
def __len__(self):
raise NotImplementedError
def __getitem__(self, item):
raise NotImplementedError
def filtrate_objects(self, obj_list):
valid_obj_list = []
for obj in obj_list:
if obj.cls_type not in self.classes:
continue
if obj.level_str not in ['Easy', 'Moderate', 'Hard']:
continue
valid_obj_list.append(obj)
return valid_obj_list
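# for every ground-truth box, crop the LiDAR points that fall inside it and store
# points, intensity, box and object label; the resulting pickle is the database
# consumed by AugSceneGenerator for scene augmentation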
def generate_gt_database(self):
gt_database = []
for idx, sample_id in enumerate(self.image_idx_list):
sample_id = int(sample_id)
print('process gt sample (id=%06d)' % sample_id)
pts_lidar = self.get_lidar(sample_id)
calib = self.get_calib(sample_id)
pts_rect = calib.lidar_to_rect(pts_lidar[:, 0:3])
pts_intensity = pts_lidar[:, 3]
obj_list = self.filtrate_objects(self.get_label(sample_id))
gt_boxes3d = np.zeros((obj_list.__len__(), 7), dtype=np.float32)
for k, obj in enumerate(obj_list):
gt_boxes3d[k, 0:3], gt_boxes3d[k, 3], gt_boxes3d[k, 4], gt_boxes3d[k, 5], gt_boxes3d[k, 6] \
= obj.pos, obj.h, obj.w, obj.l, obj.ry
if gt_boxes3d.__len__() == 0:
print('No gt object')
continue
boxes_pts_mask_list = pts_utils.pts_in_boxes3d(pts_rect, gt_boxes3d)
for k in range(boxes_pts_mask_list.shape[0]):
pt_mask_flag = (boxes_pts_mask_list[k] == 1)
cur_pts = pts_rect[pt_mask_flag].astype(np.float32)
cur_pts_intensity = pts_intensity[pt_mask_flag].astype(np.float32)
sample_dict = {'sample_id': sample_id,
'cls_type': obj_list[k].cls_type,
'gt_box3d': gt_boxes3d[k],
'points': cur_pts,
'intensity': cur_pts_intensity,
'obj': obj_list[k]}
gt_database.append(sample_dict)
save_file_name = os.path.join(args.save_dir, '%s_gt_database_3level_%s.pkl' % (args.split, self.classes[-1]))
with open(save_file_name, 'wb') as f:
pickle.dump(gt_database, f)
self.gt_database = gt_database
print('Saved ground truth database file to %s' % save_file_name)
if __name__ == '__main__':
dataset = GTDatabaseGenerator(root_dir=args.data_dir, split=args.split)
if not os.path.isdir(args.save_dir):
os.makedirs(args.save_dir)
dataset.generate_gt_database()
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import sys
import argparse
def parse_args():
parser = argparse.ArgumentParser(
"KITTI mAP evaluation script")
parser.add_argument(
'--result_dir',
type=str,
default='./result_dir',
help='detection result directory to evaluate')
parser.add_argument(
'--data_dir',
type=str,
default='./data',
help='KITTI dataset root directory')
parser.add_argument(
'--split',
type=str,
default='val',
help='evaluation split, default val')
parser.add_argument(
'--class_name',
type=str,
default='Car',
help='evaluation class name, default Car')
args = parser.parse_args()
return args
def kitti_eval():
if sys.version_info < (3, 6):
print("KITTI mAP evaluation can only run with python3.6+")
sys.exit(1)
args = parse_args()
label_dir = os.path.join(args.data_dir, 'KITTI/object/training', 'label_2')
split_file = os.path.join(args.data_dir, 'KITTI/ImageSets',
'{}.txt'.format(args.split))
final_output_dir = os.path.join(args.result_dir, 'final_result', 'data')
name_to_class = {'Car': 0, 'Pedestrian': 1, 'Cyclist': 2}
from tools.kitti_object_eval_python.evaluate import evaluate as kitti_evaluate
ap_result_str, ap_dict = kitti_evaluate(
label_dir, final_output_dir, label_split_file=split_file,
current_class=name_to_class[args.class_name])
print("KITTI evaluate: ", ap_result_str, ap_dict)
if __name__ == "__main__":
kitti_eval()
MIT License
Copyright (c) 2018
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
# kitti-object-eval-python
**NOTE**: This is borrowed from [traveller59/kitti-object-eval-python](https://github.com/traveller59/kitti-object-eval-python)
Fast KITTI object detection evaluation in Python (finishes evaluation in under 10 seconds), supporting 2D/BEV/3D/AOS metrics as well as COCO-style AP. When using the command line interface, numba needs some time to compile the JIT functions on first run.
## Dependencies
Only Python 3.6+ is supported; `numpy`, `skimage`, `numba` and `fire` are required. If you have Anaconda, just install `cudatoolkit` in Anaconda. Otherwise, please refer to this [page](https://github.com/numba/numba#custom-python-environments) to set up LLVM and CUDA for numba.
* Install by conda:
```
conda install -c numba cudatoolkit=x.x (8.0, 9.0, 9.1, depending on your environment)
```
## Usage
* commandline interface:
```
python evaluate.py evaluate --label_path=/path/to/your_gt_label_folder --result_path=/path/to/your_result_folder --label_split_file=/path/to/val.txt --current_class=0 --coco=False
```
* python interface:
```Python
import kitti_common as kitti
from eval import get_official_eval_result, get_coco_eval_result
def _read_imageset_file(path):
with open(path, 'r') as f:
lines = f.readlines()
return [int(line) for line in lines]
det_path = "/path/to/your_result_folder"
dt_annos = kitti.get_label_annos(det_path)
gt_path = "/path/to/your_gt_label_folder"
gt_split_file = "/path/to/val.txt" # from https://xiaozhichen.github.io/files/mv3d/imagesets.tar.gz
val_image_ids = _read_imageset_file(gt_split_file)
gt_annos = kitti.get_label_annos(gt_path, val_image_ids)
print(get_official_eval_result(gt_annos, dt_annos, 0)) # 6s in my computer
print(get_coco_eval_result(gt_annos, dt_annos, 0)) # 18s in my computer
```
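* Note: `current_class` takes an integer id following `0: Car, 1: Pedestrian, 2: Cyclist` (3: Van and 4: Person_sitting are also defined in `eval.py`).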
import numpy as np
import numba
import io as sysio
from tools.kitti_object_eval_python.rotate_iou import rotate_iou_gpu_eval
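# pick one score threshold per recall step of 1 / (num_sample_pts - 1);
# the default of 41 sample points matches the official KITTI protocol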
@numba.jit
def get_thresholds(scores: np.ndarray, num_gt, num_sample_pts=41):
scores.sort()
scores = scores[::-1]
current_recall = 0
thresholds = []
for i, score in enumerate(scores):
l_recall = (i + 1) / num_gt
if i < (len(scores) - 1):
r_recall = (i + 2) / num_gt
else:
r_recall = l_recall
if (((r_recall - current_recall) < (current_recall - l_recall))
and (i < (len(scores) - 1))):
continue
# recall = l_recall
thresholds.append(score)
current_recall += 1 / (num_sample_pts - 1.0)
return thresholds
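# ignored flags: 0 -> evaluate, 1 -> ignore without penalty (similar class,
# e.g. Van for Car, or exceeds the current difficulty), -1 -> discard (other class)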
def clean_data(gt_anno, dt_anno, current_class, difficulty):
CLASS_NAMES = ['car', 'pedestrian', 'cyclist']
MIN_HEIGHT = [40, 25, 25]
MAX_OCCLUSION = [0, 1, 2]
MAX_TRUNCATION = [0.15, 0.3, 0.5]
dc_bboxes, ignored_gt, ignored_dt = [], [], []
current_cls_name = CLASS_NAMES[current_class].lower()
num_gt = len(gt_anno["name"])
num_dt = len(dt_anno["name"])
num_valid_gt = 0
for i in range(num_gt):
bbox = gt_anno["bbox"][i]
gt_name = gt_anno["name"][i].lower()
height = bbox[3] - bbox[1]
valid_class = -1
if (gt_name == current_cls_name):
valid_class = 1
elif (current_cls_name == "Pedestrian".lower()
and "Person_sitting".lower() == gt_name):
valid_class = 0
elif (current_cls_name == "Car".lower() and "Van".lower() == gt_name):
valid_class = 0
else:
valid_class = -1
ignore = False
if ((gt_anno["occluded"][i] > MAX_OCCLUSION[difficulty])
or (gt_anno["truncated"][i] > MAX_TRUNCATION[difficulty])
or (height <= MIN_HEIGHT[difficulty])):
# if gt_anno["difficulty"][i] > difficulty or gt_anno["difficulty"][i] == -1:
ignore = True
if valid_class == 1 and not ignore:
ignored_gt.append(0)
num_valid_gt += 1
elif (valid_class == 0 or (ignore and (valid_class == 1))):
ignored_gt.append(1)
else:
ignored_gt.append(-1)
# for i in range(num_gt):
if gt_anno["name"][i] == "DontCare":
dc_bboxes.append(gt_anno["bbox"][i])
for i in range(num_dt):
if (dt_anno["name"][i].lower() == current_cls_name):
valid_class = 1
else:
valid_class = -1
height = abs(dt_anno["bbox"][i, 3] - dt_anno["bbox"][i, 1])
if height < MIN_HEIGHT[difficulty]:
ignored_dt.append(1)
elif valid_class == 1:
ignored_dt.append(0)
else:
ignored_dt.append(-1)
return num_valid_gt, ignored_gt, ignored_dt, dc_bboxes
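# criterion selects the denominator: -1 -> union (standard IoU),
# 0 -> area of the first box, 1 -> area of the query box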
@numba.jit(nopython=True)
def image_box_overlap(boxes, query_boxes, criterion=-1):
N = boxes.shape[0]
K = query_boxes.shape[0]
overlaps = np.zeros((N, K), dtype=boxes.dtype)
for k in range(K):
qbox_area = ((query_boxes[k, 2] - query_boxes[k, 0]) *
(query_boxes[k, 3] - query_boxes[k, 1]))
for n in range(N):
iw = (min(boxes[n, 2], query_boxes[k, 2]) -
max(boxes[n, 0], query_boxes[k, 0]))
if iw > 0:
ih = (min(boxes[n, 3], query_boxes[k, 3]) -
max(boxes[n, 1], query_boxes[k, 1]))
if ih > 0:
if criterion == -1:
ua = (
(boxes[n, 2] - boxes[n, 0]) *
(boxes[n, 3] - boxes[n, 1]) + qbox_area - iw * ih)
elif criterion == 0:
ua = ((boxes[n, 2] - boxes[n, 0]) *
(boxes[n, 3] - boxes[n, 1]))
elif criterion == 1:
ua = qbox_area
else:
ua = 1.0
overlaps[n, k] = iw * ih / ua
return overlaps
def bev_box_overlap(boxes, qboxes, criterion=-1):
riou = rotate_iou_gpu_eval(boxes, qboxes, criterion)
return riou
@numba.jit(nopython=True, parallel=True)
def d3_box_overlap_kernel(boxes, qboxes, rinc, criterion=-1):
# Only supports overlap in CAMERA coordinates, not LiDAR.
N, K = boxes.shape[0], qboxes.shape[0]
for i in range(N):
for j in range(K):
if rinc[i, j] > 0:
# iw = (min(boxes[i, 1] + boxes[i, 4], qboxes[j, 1] +
# qboxes[j, 4]) - max(boxes[i, 1], qboxes[j, 1]))
iw = (min(boxes[i, 1], qboxes[j, 1]) - max(
boxes[i, 1] - boxes[i, 4], qboxes[j, 1] - qboxes[j, 4]))
if iw > 0:
area1 = boxes[i, 3] * boxes[i, 4] * boxes[i, 5]
area2 = qboxes[j, 3] * qboxes[j, 4] * qboxes[j, 5]
inc = iw * rinc[i, j]
if criterion == -1:
ua = (area1 + area2 - inc)
elif criterion == 0:
ua = area1
elif criterion == 1:
ua = area2
else:
ua = inc
rinc[i, j] = inc / ua
else:
rinc[i, j] = 0.0
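# 3D overlap: rotate_iou_gpu_eval with criterion=2 returns rotated BEV intersection
# areas on the x-z ground plane; the kernel scales them by the overlap along the
# camera y (height) axis and rewrites rinc in place as 3D IoU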
def d3_box_overlap(boxes, qboxes, criterion=-1):
rinc = rotate_iou_gpu_eval(boxes[:, [0, 2, 3, 5, 6]],
qboxes[:, [0, 2, 3, 5, 6]], 2)
d3_box_overlap_kernel(boxes, qboxes, rinc, criterion)
return rinc
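# greedily match detections to ground truths under the min_overlap threshold and
# count tp/fp/fn; with compute_aos, the orientation similarity
# (1 + cos(alpha_gt - alpha_dt)) / 2 is accumulated for matched pairs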
@numba.jit(nopython=True)
def compute_statistics_jit(overlaps,
gt_datas,
dt_datas,
ignored_gt,
ignored_det,
dc_bboxes,
metric,
min_overlap,
thresh=0,
compute_fp=False,
compute_aos=False):
det_size = dt_datas.shape[0]
gt_size = gt_datas.shape[0]
dt_scores = dt_datas[:, -1]
dt_alphas = dt_datas[:, 4]
gt_alphas = gt_datas[:, 4]
dt_bboxes = dt_datas[:, :4]
gt_bboxes = gt_datas[:, :4]
assigned_detection = [False] * det_size
ignored_threshold = [False] * det_size
if compute_fp:
for i in range(det_size):
if (dt_scores[i] < thresh):
ignored_threshold[i] = True
NO_DETECTION = -10000000
tp, fp, fn, similarity = 0, 0, 0, 0
# thresholds = [0.0]
# delta = [0.0]
thresholds = np.zeros((gt_size, ))
thresh_idx = 0
delta = np.zeros((gt_size, ))
delta_idx = 0
for i in range(gt_size):
if ignored_gt[i] == -1:
continue
det_idx = -1
valid_detection = NO_DETECTION
max_overlap = 0
assigned_ignored_det = False
for j in range(det_size):
if (ignored_det[j] == -1):
continue
if (assigned_detection[j]):
continue
if (ignored_threshold[j]):
continue
overlap = overlaps[j, i]
dt_score = dt_scores[j]
if (not compute_fp and (overlap > min_overlap)
and dt_score > valid_detection):
det_idx = j
valid_detection = dt_score
elif (compute_fp and (overlap > min_overlap)
and (overlap > max_overlap or assigned_ignored_det)
and ignored_det[j] == 0):
max_overlap = overlap
det_idx = j
valid_detection = 1
assigned_ignored_det = False
elif (compute_fp and (overlap > min_overlap)
and (valid_detection == NO_DETECTION)
and ignored_det[j] == 1):
det_idx = j
valid_detection = 1
assigned_ignored_det = True
if (valid_detection == NO_DETECTION) and ignored_gt[i] == 0:
fn += 1
elif ((valid_detection != NO_DETECTION)
and (ignored_gt[i] == 1 or ignored_det[det_idx] == 1)):
assigned_detection[det_idx] = True
elif valid_detection != NO_DETECTION:
tp += 1
# thresholds.append(dt_scores[det_idx])
thresholds[thresh_idx] = dt_scores[det_idx]
thresh_idx += 1
if compute_aos:
# delta.append(gt_alphas[i] - dt_alphas[det_idx])
delta[delta_idx] = gt_alphas[i] - dt_alphas[det_idx]
delta_idx += 1
assigned_detection[det_idx] = True
if compute_fp:
for i in range(det_size):
if (not (assigned_detection[i] or ignored_det[i] == -1
or ignored_det[i] == 1 or ignored_threshold[i])):
fp += 1
nstuff = 0
if metric == 0:
overlaps_dt_dc = image_box_overlap(dt_bboxes, dc_bboxes, 0)
for i in range(dc_bboxes.shape[0]):
for j in range(det_size):
if (assigned_detection[j]):
continue
if (ignored_det[j] == -1 or ignored_det[j] == 1):
continue
if (ignored_threshold[j]):
continue
if overlaps_dt_dc[j, i] > min_overlap:
assigned_detection[j] = True
nstuff += 1
fp -= nstuff
if compute_aos:
tmp = np.zeros((fp + delta_idx, ))
# tmp = [0] * fp
for i in range(delta_idx):
tmp[i + fp] = (1.0 + np.cos(delta[i])) / 2.0
# tmp.append((1.0 + np.cos(delta[i])) / 2.0)
# assert len(tmp) == fp + tp
# assert len(delta) == tp
if tp > 0 or fp > 0:
similarity = np.sum(tmp)
else:
similarity = -1
return tp, fp, fn, similarity, thresholds[:thresh_idx]
def get_split_parts(num, num_part):
same_part = num // num_part
remain_num = num % num_part
if remain_num == 0:
return [same_part] * num_part
else:
return [same_part] * num_part + [remain_num]
@numba.jit(nopython=True)
def fused_compute_statistics(overlaps,
pr,
gt_nums,
dt_nums,
dc_nums,
gt_datas,
dt_datas,
dontcares,
ignored_gts,
ignored_dets,
metric,
min_overlap,
thresholds,
compute_aos=False):
gt_num = 0
dt_num = 0
dc_num = 0
for i in range(gt_nums.shape[0]):
for t, thresh in enumerate(thresholds):
overlap = overlaps[dt_num:dt_num + dt_nums[i],
gt_num:gt_num + gt_nums[i]]
gt_data = gt_datas[gt_num:gt_num + gt_nums[i]]
dt_data = dt_datas[dt_num:dt_num + dt_nums[i]]
ignored_gt = ignored_gts[gt_num:gt_num + gt_nums[i]]
ignored_det = ignored_dets[dt_num:dt_num + dt_nums[i]]
dontcare = dontcares[dc_num:dc_num + dc_nums[i]]
tp, fp, fn, similarity, _ = compute_statistics_jit(
overlap,
gt_data,
dt_data,
ignored_gt,
ignored_det,
dontcare,
metric,
min_overlap=min_overlap,
thresh=thresh,
compute_fp=True,
compute_aos=compute_aos)
pr[t, 0] += tp
pr[t, 1] += fp
pr[t, 2] += fn
if similarity != -1:
pr[t, 3] += similarity
gt_num += gt_nums[i]
dt_num += dt_nums[i]
dc_num += dc_nums[i]
def calculate_iou_partly(gt_annos, dt_annos, metric, num_parts=50):
"""fast iou algorithm. this function can be used independently to
do result analysis. Must be used in CAMERA coordinate system.
Args:
gt_annos: dict, must from get_label_annos() in kitti_common.py
dt_annos: dict, must from get_label_annos() in kitti_common.py
metric: eval type. 0: bbox, 1: bev, 2: 3d
num_parts: int. a parameter for fast calculate algorithm
"""
assert len(gt_annos) == len(dt_annos)
total_dt_num = np.stack([len(a["name"]) for a in dt_annos], 0)
total_gt_num = np.stack([len(a["name"]) for a in gt_annos], 0)
num_examples = len(gt_annos)
split_parts = get_split_parts(num_examples, num_parts)
parted_overlaps = []
example_idx = 0
for num_part in split_parts:
gt_annos_part = gt_annos[example_idx:example_idx + num_part]
dt_annos_part = dt_annos[example_idx:example_idx + num_part]
if metric == 0:
gt_boxes = np.concatenate([a["bbox"] for a in gt_annos_part], 0)
dt_boxes = np.concatenate([a["bbox"] for a in dt_annos_part], 0)
overlap_part = image_box_overlap(gt_boxes, dt_boxes)
elif metric == 1:
loc = np.concatenate(
[a["location"][:, [0, 2]] for a in gt_annos_part], 0)
dims = np.concatenate(
[a["dimensions"][:, [0, 2]] for a in gt_annos_part], 0)
rots = np.concatenate([a["rotation_y"] for a in gt_annos_part], 0)
gt_boxes = np.concatenate(
[loc, dims, rots[..., np.newaxis]], axis=1)
loc = np.concatenate(
[a["location"][:, [0, 2]] for a in dt_annos_part], 0)
dims = np.concatenate(
[a["dimensions"][:, [0, 2]] for a in dt_annos_part], 0)
rots = np.concatenate([a["rotation_y"] for a in dt_annos_part], 0)
dt_boxes = np.concatenate(
[loc, dims, rots[..., np.newaxis]], axis=1)
overlap_part = bev_box_overlap(gt_boxes, dt_boxes).astype(
np.float64)
elif metric == 2:
loc = np.concatenate([a["location"] for a in gt_annos_part], 0)
dims = np.concatenate([a["dimensions"] for a in gt_annos_part], 0)
rots = np.concatenate([a["rotation_y"] for a in gt_annos_part], 0)
gt_boxes = np.concatenate(
[loc, dims, rots[..., np.newaxis]], axis=1)
loc = np.concatenate([a["location"] for a in dt_annos_part], 0)
dims = np.concatenate([a["dimensions"] for a in dt_annos_part], 0)
rots = np.concatenate([a["rotation_y"] for a in dt_annos_part], 0)
dt_boxes = np.concatenate(
[loc, dims, rots[..., np.newaxis]], axis=1)
overlap_part = d3_box_overlap(gt_boxes, dt_boxes).astype(
np.float64)
else:
raise ValueError("unknown metric")
parted_overlaps.append(overlap_part)
example_idx += num_part
overlaps = []
example_idx = 0
for j, num_part in enumerate(split_parts):
gt_annos_part = gt_annos[example_idx:example_idx + num_part]
dt_annos_part = dt_annos[example_idx:example_idx + num_part]
gt_num_idx, dt_num_idx = 0, 0
for i in range(num_part):
gt_box_num = total_gt_num[example_idx + i]
dt_box_num = total_dt_num[example_idx + i]
overlaps.append(
parted_overlaps[j][gt_num_idx:gt_num_idx + gt_box_num,
dt_num_idx:dt_num_idx + dt_box_num])
gt_num_idx += gt_box_num
dt_num_idx += dt_box_num
example_idx += num_part
return overlaps, parted_overlaps, total_gt_num, total_dt_num
def _prepare_data(gt_annos, dt_annos, current_class, difficulty):
gt_datas_list = []
dt_datas_list = []
total_dc_num = []
ignored_gts, ignored_dets, dontcares = [], [], []
total_num_valid_gt = 0
for i in range(len(gt_annos)):
rets = clean_data(gt_annos[i], dt_annos[i], current_class, difficulty)
num_valid_gt, ignored_gt, ignored_det, dc_bboxes = rets
ignored_gts.append(np.array(ignored_gt, dtype=np.int64))
ignored_dets.append(np.array(ignored_det, dtype=np.int64))
if len(dc_bboxes) == 0:
dc_bboxes = np.zeros((0, 4)).astype(np.float64)
else:
dc_bboxes = np.stack(dc_bboxes, 0).astype(np.float64)
total_dc_num.append(dc_bboxes.shape[0])
dontcares.append(dc_bboxes)
total_num_valid_gt += num_valid_gt
gt_datas = np.concatenate(
[gt_annos[i]["bbox"], gt_annos[i]["alpha"][..., np.newaxis]], 1)
dt_datas = np.concatenate([
dt_annos[i]["bbox"], dt_annos[i]["alpha"][..., np.newaxis],
dt_annos[i]["score"][..., np.newaxis]
], 1)
gt_datas_list.append(gt_datas)
dt_datas_list.append(dt_datas)
total_dc_num = np.stack(total_dc_num, axis=0)
return (gt_datas_list, dt_datas_list, ignored_gts, ignored_dets, dontcares,
total_dc_num, total_num_valid_gt)
def eval_class(gt_annos,
dt_annos,
current_classes,
difficultys,
metric,
min_overlaps,
compute_aos=False,
num_parts=50):
"""Kitti eval. support 2d/bev/3d/aos eval. support 0.5:0.05:0.95 coco AP.
Args:
gt_annos: dict, must from get_label_annos() in kitti_common.py
dt_annos: dict, must from get_label_annos() in kitti_common.py
current_classes: list of int, 0: car, 1: pedestrian, 2: cyclist
difficultys: list of int. eval difficulty, 0: easy, 1: normal, 2: hard
metric: eval type. 0: bbox, 1: bev, 2: 3d
min_overlaps: float, min overlap. format: [num_overlap, metric, class].
num_parts: int. a parameter for fast calculate algorithm
Returns:
dict of recall, precision and aos
"""
assert len(gt_annos) == len(dt_annos)
num_examples = len(gt_annos)
split_parts = get_split_parts(num_examples, num_parts)
rets = calculate_iou_partly(dt_annos, gt_annos, metric, num_parts)
overlaps, parted_overlaps, total_dt_num, total_gt_num = rets
N_SAMPLE_PTS = 41
num_minoverlap = len(min_overlaps)
num_class = len(current_classes)
num_difficulty = len(difficultys)
precision = np.zeros(
[num_class, num_difficulty, num_minoverlap, N_SAMPLE_PTS])
recall = np.zeros(
[num_class, num_difficulty, num_minoverlap, N_SAMPLE_PTS])
aos = np.zeros([num_class, num_difficulty, num_minoverlap, N_SAMPLE_PTS])
for m, current_class in enumerate(current_classes):
for l, difficulty in enumerate(difficultys):
rets = _prepare_data(gt_annos, dt_annos, current_class, difficulty)
(gt_datas_list, dt_datas_list, ignored_gts, ignored_dets,
dontcares, total_dc_num, total_num_valid_gt) = rets
for k, min_overlap in enumerate(min_overlaps[:, metric, m]):
thresholdss = []
for i in range(len(gt_annos)):
rets = compute_statistics_jit(
overlaps[i],
gt_datas_list[i],
dt_datas_list[i],
ignored_gts[i],
ignored_dets[i],
dontcares[i],
metric,
min_overlap=min_overlap,
thresh=0.0,
compute_fp=False)
tp, fp, fn, similarity, thresholds = rets
thresholdss += thresholds.tolist()
thresholdss = np.array(thresholdss)
thresholds = get_thresholds(thresholdss, total_num_valid_gt)
thresholds = np.array(thresholds)
pr = np.zeros([len(thresholds), 4])
idx = 0
for j, num_part in enumerate(split_parts):
gt_datas_part = np.concatenate(
gt_datas_list[idx:idx + num_part], 0)
dt_datas_part = np.concatenate(
dt_datas_list[idx:idx + num_part], 0)
dc_datas_part = np.concatenate(
dontcares[idx:idx + num_part], 0)
ignored_dets_part = np.concatenate(
ignored_dets[idx:idx + num_part], 0)
ignored_gts_part = np.concatenate(
ignored_gts[idx:idx + num_part], 0)
fused_compute_statistics(
parted_overlaps[j],
pr,
total_gt_num[idx:idx + num_part],
total_dt_num[idx:idx + num_part],
total_dc_num[idx:idx + num_part],
gt_datas_part,
dt_datas_part,
dc_datas_part,
ignored_gts_part,
ignored_dets_part,
metric,
min_overlap=min_overlap,
thresholds=thresholds,
compute_aos=compute_aos)
idx += num_part
for i in range(len(thresholds)):
recall[m, l, k, i] = pr[i, 0] / (pr[i, 0] + pr[i, 2])
precision[m, l, k, i] = pr[i, 0] / (pr[i, 0] + pr[i, 1])
if compute_aos:
aos[m, l, k, i] = pr[i, 3] / (pr[i, 0] + pr[i, 1])
for i in range(len(thresholds)):
precision[m, l, k, i] = np.max(
precision[m, l, k, i:], axis=-1)
recall[m, l, k, i] = np.max(recall[m, l, k, i:], axis=-1)
if compute_aos:
aos[m, l, k, i] = np.max(aos[m, l, k, i:], axis=-1)
ret_dict = {
"recall": recall,
"precision": precision,
"orientation": aos,
}
return ret_dict
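# 11-point interpolated AP: average the precision at every 4th of the 41 recall
# sample points (indices 0, 4, ..., 40)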
def get_mAP(prec):
sums = 0
for i in range(0, prec.shape[-1], 4):
sums = sums + prec[..., i]
return sums / 11 * 100
def print_str(value, *arg, sstream=None):
if sstream is None:
sstream = sysio.StringIO()
sstream.truncate(0)
sstream.seek(0)
print(value, *arg, file=sstream)
return sstream.getvalue()
def do_eval(gt_annos,
dt_annos,
current_classes,
min_overlaps,
compute_aos=False):
# min_overlaps: [num_minoverlap, metric, num_class]
difficultys = [0, 1, 2]
ret = eval_class(gt_annos, dt_annos, current_classes, difficultys, 0,
min_overlaps, compute_aos)
# ret: [num_class, num_diff, num_minoverlap, num_sample_points]
mAP_bbox = get_mAP(ret["precision"])
mAP_aos = None
if compute_aos:
mAP_aos = get_mAP(ret["orientation"])
ret = eval_class(gt_annos, dt_annos, current_classes, difficultys, 1,
min_overlaps)
mAP_bev = get_mAP(ret["precision"])
ret = eval_class(gt_annos, dt_annos, current_classes, difficultys, 2,
min_overlaps)
mAP_3d = get_mAP(ret["precision"])
return mAP_bbox, mAP_bev, mAP_3d, mAP_aos
def do_coco_style_eval(gt_annos, dt_annos, current_classes, overlap_ranges,
compute_aos):
# overlap_ranges: [range, metric, num_class]
min_overlaps = np.zeros([10, *overlap_ranges.shape[1:]])
for i in range(overlap_ranges.shape[1]):
for j in range(overlap_ranges.shape[2]):
min_overlaps[:, i, j] = np.linspace(*overlap_ranges[:, i, j])
mAP_bbox, mAP_bev, mAP_3d, mAP_aos = do_eval(
gt_annos, dt_annos, current_classes, min_overlaps, compute_aos)
# ret: [num_class, num_diff, num_minoverlap]
mAP_bbox = mAP_bbox.mean(-1)
mAP_bev = mAP_bev.mean(-1)
mAP_3d = mAP_3d.mean(-1)
if mAP_aos is not None:
mAP_aos = mAP_aos.mean(-1)
return mAP_bbox, mAP_bev, mAP_3d, mAP_aos
def get_official_eval_result(gt_annos, dt_annos, current_classes):
overlap_0_7 = np.array([[0.7, 0.5, 0.5, 0.7,
0.5], [0.7, 0.5, 0.5, 0.7, 0.5],
[0.7, 0.5, 0.5, 0.7, 0.5]])
overlap_0_5 = np.array([[0.7, 0.5, 0.5, 0.7,
0.5], [0.5, 0.25, 0.25, 0.5, 0.25],
[0.5, 0.25, 0.25, 0.5, 0.25]])
min_overlaps = np.stack([overlap_0_7, overlap_0_5], axis=0) # [2, 3, 5]
class_to_name = {
0: 'Car',
1: 'Pedestrian',
2: 'Cyclist',
3: 'Van',
4: 'Person_sitting',
}
name_to_class = {v: n for n, v in class_to_name.items()}
if not isinstance(current_classes, (list, tuple)):
current_classes = [current_classes]
current_classes_int = []
for curcls in current_classes:
if isinstance(curcls, str):
current_classes_int.append(name_to_class[curcls])
else:
current_classes_int.append(curcls)
current_classes = current_classes_int
min_overlaps = min_overlaps[:, :, current_classes]
result = ''
# check whether alpha is valid
compute_aos = False
for anno in dt_annos:
if anno['alpha'].shape[0] != 0:
if anno['alpha'][0] != -10:
compute_aos = True
break
mAPbbox, mAPbev, mAP3d, mAPaos = do_eval(
gt_annos, dt_annos, current_classes, min_overlaps, compute_aos)
ret_dict = {}
for j, curcls in enumerate(current_classes):
# mAP threshold array: [num_minoverlap, metric, class]
# mAP result: [num_class, num_diff, num_minoverlap]
for i in range(min_overlaps.shape[0]):
result += print_str(
(f"{class_to_name[curcls]} "
"AP@{:.2f}, {:.2f}, {:.2f}:".format(*min_overlaps[i, :, j])))
result += print_str((f"bbox AP:{mAPbbox[j, 0, i]:.4f}, "
f"{mAPbbox[j, 1, i]:.4f}, "
f"{mAPbbox[j, 2, i]:.4f}"))
result += print_str((f"bev AP:{mAPbev[j, 0, i]:.4f}, "
f"{mAPbev[j, 1, i]:.4f}, "
f"{mAPbev[j, 2, i]:.4f}"))
result += print_str((f"3d AP:{mAP3d[j, 0, i]:.4f}, "
f"{mAP3d[j, 1, i]:.4f}, "
f"{mAP3d[j, 2, i]:.4f}"))
if compute_aos:
result += print_str((f"aos AP:{mAPaos[j, 0, i]:.2f}, "
f"{mAPaos[j, 1, i]:.2f}, "
f"{mAPaos[j, 2, i]:.2f}"))
ret_dict['Car_3d_easy'] = mAP3d[0, 0, 0]
ret_dict['Car_3d_moderate'] = mAP3d[0, 1, 0]
ret_dict['Car_3d_hard'] = mAP3d[0, 2, 0]
ret_dict['Car_bev_easy'] = mAPbev[0, 0, 0]
ret_dict['Car_bev_moderate'] = mAPbev[0, 1, 0]
ret_dict['Car_bev_hard'] = mAPbev[0, 2, 0]
ret_dict['Car_image_easy'] = mAPbbox[0, 0, 0]
ret_dict['Car_image_moderate'] = mAPbbox[0, 1, 0]
ret_dict['Car_image_hard'] = mAPbbox[0, 2, 0]
return result, ret_dict
def get_coco_eval_result(gt_annos, dt_annos, current_classes):
class_to_name = {
0: 'Car',
1: 'Pedestrian',
2: 'Cyclist',
3: 'Van',
4: 'Person_sitting',
}
class_to_range = {
0: [0.5, 0.95, 10],
1: [0.25, 0.7, 10],
2: [0.25, 0.7, 10],
3: [0.5, 0.95, 10],
4: [0.25, 0.7, 10],
}
name_to_class = {v: n for n, v in class_to_name.items()}
if not isinstance(current_classes, (list, tuple)):
current_classes = [current_classes]
current_classes_int = []
for curcls in current_classes:
if isinstance(curcls, str):
current_classes_int.append(name_to_class[curcls])
else:
current_classes_int.append(curcls)
current_classes = current_classes_int
overlap_ranges = np.zeros([3, 3, len(current_classes)])
for i, curcls in enumerate(current_classes):
overlap_ranges[:, :, i] = np.array(
class_to_range[curcls])[:, np.newaxis]
result = ''
# check whether alpha is valid
compute_aos = False
for anno in dt_annos:
if anno['alpha'].shape[0] != 0:
if anno['alpha'][0] != -10:
compute_aos = True
break
mAPbbox, mAPbev, mAP3d, mAPaos = do_coco_style_eval(
gt_annos, dt_annos, current_classes, overlap_ranges, compute_aos)
for j, curcls in enumerate(current_classes):
# mAP threshold array: [num_minoverlap, metric, class]
# mAP result: [num_class, num_diff, num_minoverlap]
o_range = np.array(class_to_range[curcls])[[0, 2, 1]]
o_range[1] = (o_range[2] - o_range[0]) / (o_range[1] - 1)
result += print_str((f"{class_to_name[curcls]} "
"coco AP@{:.2f}:{:.2f}:{:.2f}:".format(*o_range)))
result += print_str((f"bbox AP:{mAPbbox[j, 0]:.2f}, "
f"{mAPbbox[j, 1]:.2f}, "
f"{mAPbbox[j, 2]:.2f}"))
result += print_str((f"bev AP:{mAPbev[j, 0]:.2f}, "
f"{mAPbev[j, 1]:.2f}, "
f"{mAPbev[j, 2]:.2f}"))
result += print_str((f"3d AP:{mAP3d[j, 0]:.2f}, "
f"{mAP3d[j, 1]:.2f}, "
f"{mAP3d[j, 2]:.2f}"))
if compute_aos:
result += print_str((f"aos AP:{mAPaos[j, 0]:.2f}, "
f"{mAPaos[j, 1]:.2f}, "
f"{mAPaos[j, 2]:.2f}"))
return result
import time
import fire
import tools.kitti_object_eval_python.kitti_common as kitti
from tools.kitti_object_eval_python.eval import get_official_eval_result, get_coco_eval_result
def _read_imageset_file(path):
with open(path, 'r') as f:
lines = f.readlines()
return [int(line) for line in lines]
def evaluate(label_path,
result_path,
label_split_file,
current_class=0,
coco=False,
score_thresh=-1):
dt_annos = kitti.get_label_annos(result_path)
if score_thresh > 0:
dt_annos = kitti.filter_annos_low_score(dt_annos, score_thresh)
val_image_ids = _read_imageset_file(label_split_file)
gt_annos = kitti.get_label_annos(label_path, val_image_ids)
if coco:
return get_coco_eval_result(gt_annos, dt_annos, current_class)
else:
return get_official_eval_result(gt_annos, dt_annos, current_class)
if __name__ == '__main__':
fire.Fire()
import concurrent.futures as futures
import os
import pathlib
import re
from collections import OrderedDict
import numpy as np
from skimage import io
def get_image_index_str(img_idx):
return "{:06d}".format(img_idx)
def get_kitti_info_path(idx,
prefix,
info_type='image_2',
file_tail='.png',
training=True,
relative_path=True):
img_idx_str = get_image_index_str(idx)
img_idx_str += file_tail
prefix = pathlib.Path(prefix)
if training:
file_path = pathlib.Path('training') / info_type / img_idx_str
else:
file_path = pathlib.Path('testing') / info_type / img_idx_str
if not (prefix / file_path).exists():
raise ValueError("file not exist: {}".format(file_path))
if relative_path:
return str(file_path)
else:
return str(prefix / file_path)
def get_image_path(idx, prefix, training=True, relative_path=True):
return get_kitti_info_path(idx, prefix, 'image_2', '.png', training,
relative_path)
def get_label_path(idx, prefix, training=True, relative_path=True):
return get_kitti_info_path(idx, prefix, 'label_2', '.txt', training,
relative_path)
def get_velodyne_path(idx, prefix, training=True, relative_path=True):
return get_kitti_info_path(idx, prefix, 'velodyne', '.bin', training,
relative_path)
def get_calib_path(idx, prefix, training=True, relative_path=True):
return get_kitti_info_path(idx, prefix, 'calib', '.txt', training,
relative_path)
def _extend_matrix(mat):
mat = np.concatenate([mat, np.array([[0., 0., 0., 1.]])], axis=0)
return mat
def get_kitti_image_info(path,
training=True,
label_info=True,
velodyne=False,
calib=False,
image_ids=7481,
extend_matrix=True,
num_worker=8,
relative_path=True,
with_imageshape=True):
# image_infos = []
root_path = pathlib.Path(path)
if not isinstance(image_ids, list):
image_ids = list(range(image_ids))
def map_func(idx):
image_info = {'image_idx': idx}
annotations = None
if velodyne:
image_info['velodyne_path'] = get_velodyne_path(
idx, path, training, relative_path)
image_info['img_path'] = get_image_path(idx, path, training,
relative_path)
if with_imageshape:
img_path = image_info['img_path']
if relative_path:
img_path = str(root_path / img_path)
image_info['img_shape'] = np.array(
io.imread(img_path).shape[:2], dtype=np.int32)
if label_info:
label_path = get_label_path(idx, path, training, relative_path)
if relative_path:
label_path = str(root_path / label_path)
annotations = get_label_anno(label_path)
if calib:
calib_path = get_calib_path(
idx, path, training, relative_path=False)
with open(calib_path, 'r') as f:
lines = f.readlines()
P0 = np.array(
[float(info) for info in lines[0].split(' ')[1:13]]).reshape(
[3, 4])
P1 = np.array(
[float(info) for info in lines[1].split(' ')[1:13]]).reshape(
[3, 4])
P2 = np.array(
[float(info) for info in lines[2].split(' ')[1:13]]).reshape(
[3, 4])
P3 = np.array(
[float(info) for info in lines[3].split(' ')[1:13]]).reshape(
[3, 4])
if extend_matrix:
P0 = _extend_matrix(P0)
P1 = _extend_matrix(P1)
P2 = _extend_matrix(P2)
P3 = _extend_matrix(P3)
image_info['calib/P0'] = P0
image_info['calib/P1'] = P1
image_info['calib/P2'] = P2
image_info['calib/P3'] = P3
R0_rect = np.array([
float(info) for info in lines[4].split(' ')[1:10]
]).reshape([3, 3])
if extend_matrix:
rect_4x4 = np.zeros([4, 4], dtype=R0_rect.dtype)
rect_4x4[3, 3] = 1.
rect_4x4[:3, :3] = R0_rect
else:
rect_4x4 = R0_rect
image_info['calib/R0_rect'] = rect_4x4
Tr_velo_to_cam = np.array([
float(info) for info in lines[5].split(' ')[1:13]
]).reshape([3, 4])
Tr_imu_to_velo = np.array([
float(info) for info in lines[6].split(' ')[1:13]
]).reshape([3, 4])
if extend_matrix:
Tr_velo_to_cam = _extend_matrix(Tr_velo_to_cam)
Tr_imu_to_velo = _extend_matrix(Tr_imu_to_velo)
image_info['calib/Tr_velo_to_cam'] = Tr_velo_to_cam
image_info['calib/Tr_imu_to_velo'] = Tr_imu_to_velo
if annotations is not None:
image_info['annos'] = annotations
add_difficulty_to_annos(image_info)
return image_info
with futures.ThreadPoolExecutor(num_worker) as executor:
image_infos = executor.map(map_func, image_ids)
return list(image_infos)
def filter_kitti_anno(image_anno,
used_classes,
used_difficulty=None,
dontcare_iou=None):
if not isinstance(used_classes, (list, tuple)):
used_classes = [used_classes]
img_filtered_annotations = {}
relevant_annotation_indices = [
i for i, x in enumerate(image_anno['name']) if x in used_classes
]
for key in image_anno.keys():
img_filtered_annotations[key] = (
image_anno[key][relevant_annotation_indices])
if used_difficulty is not None:
relevant_annotation_indices = [
i for i, x in enumerate(img_filtered_annotations['difficulty'])
if x in used_difficulty
]
for key in image_anno.keys():
img_filtered_annotations[key] = (
img_filtered_annotations[key][relevant_annotation_indices])
if 'DontCare' in used_classes and dontcare_iou is not None:
dont_care_indices = [
i for i, x in enumerate(img_filtered_annotations['name'])
if x == 'DontCare'
]
# bounding box format [y_min, x_min, y_max, x_max]
all_boxes = img_filtered_annotations['bbox']
ious = iou(all_boxes, all_boxes[dont_care_indices])
# Remove all bounding boxes that overlap with a dontcare region.
if ious.size > 0:
boxes_to_remove = np.amax(ious, axis=1) > dontcare_iou
for key in image_anno.keys():
img_filtered_annotations[key] = (img_filtered_annotations[key][
np.logical_not(boxes_to_remove)])
return img_filtered_annotations
def filter_annos_low_score(image_annos, thresh):
new_image_annos = []
for anno in image_annos:
img_filtered_annotations = {}
relevant_annotation_indices = [
i for i, s in enumerate(anno['score']) if s >= thresh
]
for key in anno.keys():
img_filtered_annotations[key] = (
anno[key][relevant_annotation_indices])
new_image_annos.append(img_filtered_annotations)
return new_image_annos
def kitti_result_line(result_dict, precision=4):
prec_float = "{" + ":.{}f".format(precision) + "}"
res_line = []
all_field_default = OrderedDict([
('name', None),
('truncated', -1),
('occluded', -1),
('alpha', -10),
('bbox', None),
('dimensions', [-1, -1, -1]),
('location', [-1000, -1000, -1000]),
('rotation_y', -10),
('score', None),
])
res_dict = [(key, None) for key, val in all_field_default.items()]
res_dict = OrderedDict(res_dict)
for key, val in result_dict.items():
if all_field_default[key] is None and val is None:
raise ValueError("you must specify a value for {}".format(key))
res_dict[key] = val
for key, val in res_dict.items():
if key == 'name':
res_line.append(val)
elif key in ['truncated', 'alpha', 'rotation_y', 'score']:
if val is None:
res_line.append(str(all_field_default[key]))
else:
res_line.append(prec_float.format(val))
elif key == 'occluded':
if val is None:
res_line.append(str(all_field_default[key]))
else:
res_line.append('{}'.format(val))
elif key in ['bbox', 'dimensions', 'location']:
if val is None:
res_line += [str(v) for v in all_field_default[key]]
else:
res_line += [prec_float.format(v) for v in val]
else:
raise ValueError("unknown key. supported key:{}".format(
res_dict.keys()))
return ' '.join(res_line)
def add_difficulty_to_annos(info):
min_height = [40, 25,
25] # minimum height for evaluated groundtruth/detections
max_occlusion = [
0, 1, 2
] # maximum occlusion level of the groundtruth used for evaluation
max_trunc = [
0.15, 0.3, 0.5
] # maximum truncation level of the groundtruth used for evaluation
annos = info['annos']
dims = annos['dimensions'] # lhw format
bbox = annos['bbox']
height = bbox[:, 3] - bbox[:, 1]
occlusion = annos['occluded']
truncation = annos['truncated']
diff = []
easy_mask = np.ones((len(dims), ), dtype=np.bool)
moderate_mask = np.ones((len(dims), ), dtype=np.bool)
hard_mask = np.ones((len(dims), ), dtype=np.bool)
i = 0
for h, o, t in zip(height, occlusion, truncation):
if o > max_occlusion[0] or h <= min_height[0] or t > max_trunc[0]:
easy_mask[i] = False
if o > max_occlusion[1] or h <= min_height[1] or t > max_trunc[1]:
moderate_mask[i] = False
if o > max_occlusion[2] or h <= min_height[2] or t > max_trunc[2]:
hard_mask[i] = False
i += 1
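# assign each object the easiest difficulty level it satisfies;
# objects failing even the hard criteria get -1 (ignored)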
is_easy = easy_mask
is_moderate = np.logical_xor(easy_mask, moderate_mask)
is_hard = np.logical_xor(hard_mask, moderate_mask)
for i in range(len(dims)):
if is_easy[i]:
diff.append(0)
elif is_moderate[i]:
diff.append(1)
elif is_hard[i]:
diff.append(2)
else:
diff.append(-1)
annos["difficulty"] = np.array(diff, np.int32)
return diff
def get_label_anno(label_path):
annotations = {}
annotations.update({
'name': [],
'truncated': [],
'occluded': [],
'alpha': [],
'bbox': [],
'dimensions': [],
'location': [],
'rotation_y': []
})
with open(label_path, 'r') as f:
lines = f.readlines()
# if len(lines) == 0 or len(lines[0]) < 15:
# content = []
# else:
content = [line.strip().split(' ') for line in lines]
annotations['name'] = np.array([x[0] for x in content])
annotations['truncated'] = np.array([float(x[1]) for x in content])
annotations['occluded'] = np.array([int(x[2]) for x in content])
annotations['alpha'] = np.array([float(x[3]) for x in content])
annotations['bbox'] = np.array(
[[float(info) for info in x[4:8]] for x in content]).reshape(-1, 4)
# dimensions will convert hwl format to standard lhw(camera) format.
annotations['dimensions'] = np.array(
[[float(info) for info in x[8:11]] for x in content]).reshape(
-1, 3)[:, [2, 0, 1]]
annotations['location'] = np.array(
[[float(info) for info in x[11:14]] for x in content]).reshape(-1, 3)
annotations['rotation_y'] = np.array(
[float(x[14]) for x in content]).reshape(-1)
if len(content) != 0 and len(content[0]) == 16: # have score
annotations['score'] = np.array([float(x[15]) for x in content])
else:
annotations['score'] = np.zeros([len(annotations['bbox'])])
return annotations
def get_label_annos(label_folder, image_ids=None):
if image_ids is None:
filepaths = pathlib.Path(label_folder).glob('*.txt')
prog = re.compile(r'^\d{6}.txt$')
filepaths = filter(lambda f: prog.match(f.name), filepaths)
image_ids = [int(p.stem) for p in filepaths]
image_ids = sorted(image_ids)
if not isinstance(image_ids, list):
image_ids = list(range(image_ids))
annos = []
label_folder = pathlib.Path(label_folder)
for idx in image_ids:
image_idx = get_image_index_str(idx)
label_filename = label_folder / (image_idx + '.txt')
annos.append(get_label_anno(label_filename))
return annos
def area(boxes, add1=False):
"""Computes area of boxes.
Args:
boxes: Numpy array with shape [N, 4] holding N boxes
Returns:
a numpy array with shape [N*1] representing box areas
"""
if add1:
return (boxes[:, 2] - boxes[:, 0] + 1.0) * (
boxes[:, 3] - boxes[:, 1] + 1.0)
else:
return (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
def intersection(boxes1, boxes2, add1=False):
"""Compute pairwise intersection areas between boxes.
Args:
boxes1: a numpy array with shape [N, 4] holding N boxes
boxes2: a numpy array with shape [M, 4] holding M boxes
Returns:
a numpy array with shape [N, M] representing pairwise intersection areas
"""
[y_min1, x_min1, y_max1, x_max1] = np.split(boxes1, 4, axis=1)
[y_min2, x_min2, y_max2, x_max2] = np.split(boxes2, 4, axis=1)
all_pairs_min_ymax = np.minimum(y_max1, np.transpose(y_max2))
all_pairs_max_ymin = np.maximum(y_min1, np.transpose(y_min2))
if add1:
all_pairs_min_ymax += 1.0
intersect_heights = np.maximum(
np.zeros(all_pairs_max_ymin.shape),
all_pairs_min_ymax - all_pairs_max_ymin)
all_pairs_min_xmax = np.minimum(x_max1, np.transpose(x_max2))
all_pairs_max_xmin = np.maximum(x_min1, np.transpose(x_min2))
if add1:
all_pairs_min_xmax += 1.0
intersect_widths = np.maximum(
np.zeros(all_pairs_max_xmin.shape),
all_pairs_min_xmax - all_pairs_max_xmin)
return intersect_heights * intersect_widths
def iou(boxes1, boxes2, add1=False):
"""Computes pairwise intersection-over-union between box collections.
Args:
boxes1: a numpy array with shape [N, 4] holding N boxes.
boxes2: a numpy array with shape [M, 4] holding M boxes.
Returns:
a numpy array with shape [N, M] representing pairwise iou scores.
"""
intersect = intersection(boxes1, boxes2, add1)
area1 = area(boxes1, add1)
area2 = area(boxes2, add1)
union = np.expand_dims(
area1, axis=1) + np.expand_dims(
area2, axis=0) - intersect
return intersect / union
#####################
# Based on https://github.com/hongzhenwang/RRPN-revise
# Licensed under The MIT License
# Author: yanyan, scrin@foxmail.com
#####################
import math
import numba
import numpy as np
from numba import cuda
@numba.jit(nopython=True)
def div_up(m, n):
return m // n + (m % n > 0)
@cuda.jit('(float32[:], float32[:], float32[:])', device=True, inline=True)
def trangle_area(a, b, c):
return ((a[0] - c[0]) * (b[1] - c[1]) - (a[1] - c[1]) *
(b[0] - c[0])) / 2.0
@cuda.jit('(float32[:], int32)', device=True, inline=True)
def area(int_pts, num_of_inter):
area_val = 0.0
for i in range(num_of_inter - 2):
area_val += abs(
trangle_area(int_pts[:2], int_pts[2 * i + 2:2 * i + 4],
int_pts[2 * i + 4:2 * i + 6]))
return area_val
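# order the intersection vertices angularly around their centroid; the key is a
# monotonic pseudo-angle built from the normalized x component (avoiding atan2),
# followed by an insertion sort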
@cuda.jit('(float32[:], int32)', device=True, inline=True)
def sort_vertex_in_convex_polygon(int_pts, num_of_inter):
if num_of_inter > 0:
center = cuda.local.array((2, ), dtype=numba.float32)
center[:] = 0.0
for i in range(num_of_inter):
center[0] += int_pts[2 * i]
center[1] += int_pts[2 * i + 1]
center[0] /= num_of_inter
center[1] /= num_of_inter
v = cuda.local.array((2, ), dtype=numba.float32)
vs = cuda.local.array((16, ), dtype=numba.float32)
for i in range(num_of_inter):
v[0] = int_pts[2 * i] - center[0]
v[1] = int_pts[2 * i + 1] - center[1]
d = math.sqrt(v[0] * v[0] + v[1] * v[1])
v[0] = v[0] / d
v[1] = v[1] / d
if v[1] < 0:
v[0] = -2 - v[0]
vs[i] = v[0]
j = 0
temp = 0
for i in range(1, num_of_inter):
if vs[i - 1] > vs[i]:
temp = vs[i]
tx = int_pts[2 * i]
ty = int_pts[2 * i + 1]
j = i
while j > 0 and vs[j - 1] > temp:
vs[j] = vs[j - 1]
int_pts[j * 2] = int_pts[j * 2 - 2]
int_pts[j * 2 + 1] = int_pts[j * 2 - 1]
j -= 1
vs[j] = temp
int_pts[j * 2] = tx
int_pts[j * 2 + 1] = ty
@cuda.jit(
'(float32[:], float32[:], int32, int32, float32[:])',
device=True,
inline=True)
def line_segment_intersection(pts1, pts2, i, j, temp_pts):
A = cuda.local.array((2, ), dtype=numba.float32)
B = cuda.local.array((2, ), dtype=numba.float32)
C = cuda.local.array((2, ), dtype=numba.float32)
D = cuda.local.array((2, ), dtype=numba.float32)
A[0] = pts1[2 * i]
A[1] = pts1[2 * i + 1]
B[0] = pts1[2 * ((i + 1) % 4)]
B[1] = pts1[2 * ((i + 1) % 4) + 1]
C[0] = pts2[2 * j]
C[1] = pts2[2 * j + 1]
D[0] = pts2[2 * ((j + 1) % 4)]
D[1] = pts2[2 * ((j + 1) % 4) + 1]
BA0 = B[0] - A[0]
BA1 = B[1] - A[1]
DA0 = D[0] - A[0]
CA0 = C[0] - A[0]
DA1 = D[1] - A[1]
CA1 = C[1] - A[1]
acd = DA1 * CA0 > CA1 * DA0
bcd = (D[1] - B[1]) * (C[0] - B[0]) > (C[1] - B[1]) * (D[0] - B[0])
if acd != bcd:
abc = CA1 * BA0 > BA1 * CA0
abd = DA1 * BA0 > BA1 * DA0
if abc != abd:
DC0 = D[0] - C[0]
DC1 = D[1] - C[1]
ABBA = A[0] * B[1] - B[0] * A[1]
CDDC = C[0] * D[1] - D[0] * C[1]
DH = BA1 * DC0 - BA0 * DC1
Dx = ABBA * DC0 - BA0 * CDDC
Dy = ABBA * DC1 - BA1 * CDDC
temp_pts[0] = Dx / DH
temp_pts[1] = Dy / DH
return True
return False
@cuda.jit(
'(float32[:], float32[:], int32, int32, float32[:])',
device=True,
inline=True)
def line_segment_intersection_v1(pts1, pts2, i, j, temp_pts):
a = cuda.local.array((2, ), dtype=numba.float32)
b = cuda.local.array((2, ), dtype=numba.float32)
c = cuda.local.array((2, ), dtype=numba.float32)
d = cuda.local.array((2, ), dtype=numba.float32)
a[0] = pts1[2 * i]
a[1] = pts1[2 * i + 1]
b[0] = pts1[2 * ((i + 1) % 4)]
b[1] = pts1[2 * ((i + 1) % 4) + 1]
c[0] = pts2[2 * j]
c[1] = pts2[2 * j + 1]
d[0] = pts2[2 * ((j + 1) % 4)]
d[1] = pts2[2 * ((j + 1) % 4) + 1]
area_abc = trangle_area(a, b, c)
area_abd = trangle_area(a, b, d)
if area_abc * area_abd >= 0:
return False
area_cda = trangle_area(c, d, a)
area_cdb = area_cda + area_abc - area_abd
if area_cda * area_cdb >= 0:
return False
t = area_cda / (area_abd - area_abc)
dx = t * (b[0] - a[0])
dy = t * (b[1] - a[1])
temp_pts[0] = a[0] + dx
temp_pts[1] = a[1] + dy
return True
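# a point P lies inside rectangle ABCD iff its projections onto two adjacent
# edges satisfy 0 <= AP.AB <= AB.AB and 0 <= AP.AD <= AD.AD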
@cuda.jit('(float32, float32, float32[:])', device=True, inline=True)
def point_in_quadrilateral(pt_x, pt_y, corners):
ab0 = corners[2] - corners[0]
ab1 = corners[3] - corners[1]
ad0 = corners[6] - corners[0]
ad1 = corners[7] - corners[1]
ap0 = pt_x - corners[0]
ap1 = pt_y - corners[1]
abab = ab0 * ab0 + ab1 * ab1
abap = ab0 * ap0 + ab1 * ap1
adad = ad0 * ad0 + ad1 * ad1
adap = ad0 * ap0 + ad1 * ap1
return abab >= abap and abap >= 0 and adad >= adap and adap >= 0
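# vertices of the intersection polygon: corners of either box lying inside the
# other, plus all pairwise edge-edge intersection points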
@cuda.jit('(float32[:], float32[:], float32[:])', device=True, inline=True)
def quadrilateral_intersection(pts1, pts2, int_pts):
num_of_inter = 0
for i in range(4):
if point_in_quadrilateral(pts1[2 * i], pts1[2 * i + 1], pts2):
int_pts[num_of_inter * 2] = pts1[2 * i]
int_pts[num_of_inter * 2 + 1] = pts1[2 * i + 1]
num_of_inter += 1
if point_in_quadrilateral(pts2[2 * i], pts2[2 * i + 1], pts1):
int_pts[num_of_inter * 2] = pts2[2 * i]
int_pts[num_of_inter * 2 + 1] = pts2[2 * i + 1]
num_of_inter += 1
temp_pts = cuda.local.array((2, ), dtype=numba.float32)
for i in range(4):
for j in range(4):
has_pts = line_segment_intersection(pts1, pts2, i, j, temp_pts)
if has_pts:
int_pts[num_of_inter * 2] = temp_pts[0]
int_pts[num_of_inter * 2 + 1] = temp_pts[1]
num_of_inter += 1
return num_of_inter
@cuda.jit('(float32[:], float32[:])', device=True, inline=True)
def rbbox_to_corners(corners, rbbox):
# generate clockwise corners and rotate them clockwise
angle = rbbox[4]
a_cos = math.cos(angle)
a_sin = math.sin(angle)
center_x = rbbox[0]
center_y = rbbox[1]
x_d = rbbox[2]
y_d = rbbox[3]
corners_x = cuda.local.array((4, ), dtype=numba.float32)
corners_y = cuda.local.array((4, ), dtype=numba.float32)
corners_x[0] = -x_d / 2
corners_x[1] = -x_d / 2
corners_x[2] = x_d / 2
corners_x[3] = x_d / 2
corners_y[0] = -y_d / 2
corners_y[1] = y_d / 2
corners_y[2] = y_d / 2
corners_y[3] = -y_d / 2
for i in range(4):
corners[2 * i] = a_cos * corners_x[i] + a_sin * corners_y[i] + center_x
corners[2 * i + 1] = -a_sin * corners_x[i] + a_cos * corners_y[i] + center_y
@cuda.jit('(float32[:], float32[:])', device=True, inline=True)
def inter(rbbox1, rbbox2):
corners1 = cuda.local.array((8, ), dtype=numba.float32)
corners2 = cuda.local.array((8, ), dtype=numba.float32)
intersection_corners = cuda.local.array((16, ), dtype=numba.float32)
rbbox_to_corners(corners1, rbbox1)
rbbox_to_corners(corners2, rbbox2)
num_intersection = quadrilateral_intersection(corners1, corners2,
intersection_corners)
sort_vertex_in_convex_polygon(intersection_corners, num_intersection)
# print(intersection_corners.reshape([-1, 2])[:num_intersection])
return area(intersection_corners, num_intersection)
@cuda.jit('(float32[:], float32[:], int32)', device=True, inline=True)
def devRotateIoUEval(rbox1, rbox2, criterion=-1):
area1 = rbox1[2] * rbox1[3]
area2 = rbox2[2] * rbox2[3]
area_inter = inter(rbox1, rbox2)
if criterion == -1:
return area_inter / (area1 + area2 - area_inter)
elif criterion == 0:
return area_inter / area1
elif criterion == 1:
return area_inter / area2
else:
return area_inter
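# each CUDA block processes a 64x64 tile of (box, query) pairs: 64 boxes and
# 64 query boxes are staged in shared memory (5 floats each), then each thread
# computes one box's IoU against every query in the tile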
@cuda.jit('(int64, int64, float32[:], float32[:], float32[:], int32)', fastmath=False)
def rotate_iou_kernel_eval(N, K, dev_boxes, dev_query_boxes, dev_iou, criterion=-1):
threadsPerBlock = 8 * 8
row_start = cuda.blockIdx.x
col_start = cuda.blockIdx.y
tx = cuda.threadIdx.x
row_size = min(N - row_start * threadsPerBlock, threadsPerBlock)
col_size = min(K - col_start * threadsPerBlock, threadsPerBlock)
block_boxes = cuda.shared.array(shape=(64 * 5, ), dtype=numba.float32)
block_qboxes = cuda.shared.array(shape=(64 * 5, ), dtype=numba.float32)
dev_query_box_idx = threadsPerBlock * col_start + tx
dev_box_idx = threadsPerBlock * row_start + tx
if (tx < col_size):
block_qboxes[tx * 5 + 0] = dev_query_boxes[dev_query_box_idx * 5 + 0]
block_qboxes[tx * 5 + 1] = dev_query_boxes[dev_query_box_idx * 5 + 1]
block_qboxes[tx * 5 + 2] = dev_query_boxes[dev_query_box_idx * 5 + 2]
block_qboxes[tx * 5 + 3] = dev_query_boxes[dev_query_box_idx * 5 + 3]
block_qboxes[tx * 5 + 4] = dev_query_boxes[dev_query_box_idx * 5 + 4]
if (tx < row_size):
block_boxes[tx * 5 + 0] = dev_boxes[dev_box_idx * 5 + 0]
block_boxes[tx * 5 + 1] = dev_boxes[dev_box_idx * 5 + 1]
block_boxes[tx * 5 + 2] = dev_boxes[dev_box_idx * 5 + 2]
block_boxes[tx * 5 + 3] = dev_boxes[dev_box_idx * 5 + 3]
block_boxes[tx * 5 + 4] = dev_boxes[dev_box_idx * 5 + 4]
cuda.syncthreads()
if tx < row_size:
for i in range(col_size):
offset = row_start * threadsPerBlock * K + col_start * threadsPerBlock + tx * K + i
dev_iou[offset] = devRotateIoUEval(block_qboxes[i * 5:i * 5 + 5],
block_boxes[tx * 5:tx * 5 + 5], criterion)
def rotate_iou_gpu_eval(boxes, query_boxes, criterion=-1, device_id=0):
"""rotated box iou running in gpu. 500x faster than cpu version
(take 5ms in one example with numba.cuda code).
convert from [this project](
https://github.com/hongzhenwang/RRPN-revise/tree/master/lib/rotation).
Args:
boxes (float tensor: [N, 5]): rbboxes. format: centers, dims,
angles(clockwise when positive)
query_boxes (float tensor: [K, 5]): [description]
device_id (int, optional): Defaults to 0. [description]
Returns:
[type]: [description]
"""
box_dtype = boxes.dtype
boxes = boxes.astype(np.float32)
query_boxes = query_boxes.astype(np.float32)
N = boxes.shape[0]
K = query_boxes.shape[0]
iou = np.zeros((N, K), dtype=np.float32)
if N == 0 or K == 0:
return iou
threadsPerBlock = 8 * 8
cuda.select_device(device_id)
blockspergrid = (div_up(N, threadsPerBlock), div_up(K, threadsPerBlock))
stream = cuda.stream()
with stream.auto_synchronize():
boxes_dev = cuda.to_device(boxes.reshape([-1]), stream)
query_boxes_dev = cuda.to_device(query_boxes.reshape([-1]), stream)
iou_dev = cuda.to_device(iou.reshape([-1]), stream)
rotate_iou_kernel_eval[blockspergrid, threadsPerBlock, stream](
N, K, boxes_dev, query_boxes_dev, iou_dev, criterion)
iou_dev.copy_to_host(iou.reshape([-1]), stream=stream)
return iou.astype(boxes.dtype)
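# Illustrative usage (a minimal sketch; the box values below are made up):
#   boxes = np.array([[0., 0., 4., 2., 0.]], dtype=np.float32)  # (cx, cy, w, h, angle)
#   query = np.array([[1., 0., 4., 2., 0.]], dtype=np.float32)
#   iou = rotate_iou_gpu_eval(boxes, query)  # -> (1, 1) matrix, here 3*2/(8+8-6) = 0.6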
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
#Licensed under the Apache License, Version 2.0 (the "License");
#you may not use this file except in compliance with the License.
#You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
#Unless required by applicable law or agreed to in writing, software
#distributed under the License is distributed on an "AS IS" BASIS,
#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
#See the License for the specific language governing permissions and
#limitations under the License.
import os
import sys
import time
import shutil
import argparse
import logging
import numpy as np
import paddle
import paddle.fluid as fluid
from models.point_rcnn import PointRCNN
from data.kitti_rcnn_reader import KittiRCNNReader
from utils.run_utils import *
from utils.config import cfg, load_config, set_config_from_list
from utils.optimizer import optimize
logging.root.handlers = []
FORMAT = '%(asctime)s-%(levelname)s: %(message)s'
logging.basicConfig(level=logging.INFO, format=FORMAT, stream=sys.stdout)
logger = logging.getLogger(__name__)
def parse_args():
parser = argparse.ArgumentParser("PointRCNN semantic segmentation train script")
parser.add_argument(
'--cfg',
type=str,
default='cfgs/default.yml',
help='specify the config for training')
parser.add_argument(
'--train_mode',
type=str,
default='rpn',
required=True,
help='specify the training mode')
parser.add_argument(
'--batch_size',
type=int,
default=16,
required=True,
help='training batch size, default 16')
parser.add_argument(
'--epoch',
type=int,
default=200,
required=True,
help='epoch number. default 200.')
parser.add_argument(
'--save_dir',
type=str,
default='checkpoints',
        help='directory name to save training snapshots')
parser.add_argument(
'--resume',
type=str,
default=None,
help='path to resume training based on previous checkpoints. '
'None for not resuming any checkpoints.')
parser.add_argument(
'--resume_epoch',
type=int,
default=0,
help='resume epoch id')
parser.add_argument(
'--data_dir',
type=str,
default='./data',
help='KITTI dataset root directory')
parser.add_argument(
'--gt_database',
type=str,
default='data/gt_database/train_gt_database_3level_Car.pkl',
help='generated gt database for augmentation')
parser.add_argument(
'--rcnn_training_roi_dir',
type=str,
default=None,
help='specify the saved rois for rcnn training when using rcnn_offline mode')
parser.add_argument(
'--rcnn_training_feature_dir',
type=str,
default=None,
help='specify the saved features for rcnn training when using rcnn_offline mode')
parser.add_argument(
'--log_interval',
type=int,
default=1,
help='mini-batch interval to log.')
parser.add_argument(
'--set',
dest='set_cfgs',
default=None,
nargs=argparse.REMAINDER,
help='set extra config keys if needed.')
args = parser.parse_args()
return args
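# A typical invocation of this script (RPN stage, default config) might be:
#   python train.py --cfg cfgs/default.yml --train_mode rpn --batch_size 16 --epoch 200
# and extra config keys can be overridden at the end of the command, e.g.
#   --set TRAIN.LR 0.001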
def train():
args = parse_args()
print_arguments(args)
# check whether the installed paddle is compiled with GPU
# PointRCNN model can only run on GPU
check_gpu(True)
load_config(args.cfg)
if args.set_cfgs is not None:
set_config_from_list(args.set_cfgs)
if args.train_mode == 'rpn':
cfg.RPN.ENABLED = True
cfg.RCNN.ENABLED = False
elif args.train_mode == 'rcnn':
cfg.RCNN.ENABLED = True
cfg.RPN.ENABLED = cfg.RPN.FIXED = True
elif args.train_mode == 'rcnn_offline':
cfg.RCNN.ENABLED = True
cfg.RPN.ENABLED = False
else:
raise NotImplementedError("unknown train mode: {}".format(args.train_mode))
checkpoints_dir = os.path.join(args.save_dir, args.train_mode)
if not os.path.isdir(checkpoints_dir):
os.makedirs(checkpoints_dir)
kitti_rcnn_reader = KittiRCNNReader(data_dir=args.data_dir,
npoints=cfg.RPN.NUM_POINTS,
split=cfg.TRAIN.SPLIT,
mode='TRAIN',
classes=cfg.CLASSES,
rcnn_training_roi_dir=args.rcnn_training_roi_dir,
rcnn_training_feature_dir=args.rcnn_training_feature_dir,
gt_database_dir=args.gt_database)
num_samples = len(kitti_rcnn_reader)
steps_per_epoch = int(num_samples / args.batch_size)
logger.info("Total {} samples, {} batch per epoch.".format(num_samples, steps_per_epoch))
boundaries = [i * steps_per_epoch for i in cfg.TRAIN.DECAY_STEP_LIST]
values = [cfg.TRAIN.LR * (cfg.TRAIN.LR_DECAY ** i) for i in range(len(boundaries) + 1)]
place = fluid.CUDAPlace(0)
exe = fluid.Executor(place)
# build model
startup = fluid.Program()
train_prog = fluid.Program()
with fluid.program_guard(train_prog, startup):
with fluid.unique_name.guard():
train_model = PointRCNN(cfg, args.batch_size, True, 'TRAIN')
train_model.build()
train_pyreader = train_model.get_pyreader()
train_feeds = train_model.get_feeds()
train_outputs = train_model.get_outputs()
train_loss = train_outputs['loss']
lr = optimize(train_loss,
learning_rate=cfg.TRAIN.LR,
warmup_factor=1. / cfg.TRAIN.DIV_FACTOR,
decay_factor=1e-5,
total_step=steps_per_epoch * args.epoch,
warmup_pct=cfg.TRAIN.PCT_START,
train_program=train_prog,
startup_prog=startup,
weight_decay=cfg.TRAIN.WEIGHT_DECAY,
clip_norm=cfg.TRAIN.GRAD_NORM_CLIP)
train_keys, train_values = parse_outputs(train_outputs, 'loss')
exe.run(startup)
if args.resume:
assert os.path.exists(args.resume), \
"Given resume weight dir {} not exist.".format(args.resume)
def if_exist(var):
logger.debug("{}: {}".format(var.name, os.path.exists(os.path.join(args.resume, var.name))))
return os.path.exists(os.path.join(args.resume, var.name))
fluid.io.load_vars(
exe, args.resume, predicate=if_exist, main_program=train_prog)
build_strategy = fluid.BuildStrategy()
build_strategy.memory_optimize = False
build_strategy.enable_inplace = False
build_strategy.fuse_all_optimizer_ops = False
train_compile_prog = fluid.compiler.CompiledProgram(
train_prog).with_data_parallel(loss_name=train_loss.name,
build_strategy=build_strategy)
def save_model(exe, prog, path):
if os.path.isdir(path):
shutil.rmtree(path)
logger.info("Save model to {}".format(path))
fluid.io.save_persistables(exe, path, prog)
# get reader
train_reader = kitti_rcnn_reader.get_multiprocess_reader(args.batch_size, train_feeds, drop_last=True)
train_pyreader.decorate_sample_list_generator(train_reader, place)
train_stat = Stat()
for epoch_id in range(args.resume_epoch, args.epoch):
try:
train_pyreader.start()
train_iter = 0
train_periods = []
while True:
cur_time = time.time()
train_outs = exe.run(train_compile_prog, fetch_list=train_values + [lr.name])
period = time.time() - cur_time
train_periods.append(period)
train_stat.update(train_keys, train_outs[:-1])
if train_iter % args.log_interval == 0:
log_str = ""
for name, values in zip(train_keys + ['learning_rate'], train_outs):
log_str += "{}: {:.6f}, ".format(name, np.mean(values))
logger.info("[TRAIN] Epoch {}, batch {}: {}time: {:.2f}".format(epoch_id, train_iter, log_str, period))
train_iter += 1
except fluid.core.EOFException:
logger.info("[TRAIN] Epoch {} finished, {}average time: {:.2f}".format(epoch_id, train_stat.get_mean_log(), np.mean(train_periods[2:])))
save_model(exe, train_prog, os.path.join(checkpoints_dir, str(epoch_id)))
train_stat.reset()
train_periods = []
finally:
train_pyreader.reset()
if __name__ == "__main__":
train()
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
#Licensed under the Apache License, Version 2.0 (the "License");
#you may not use this file except in compliance with the License.
#You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
#Unless required by applicable law or agreed to in writing, software
#distributed under the License is distributed on an "AS IS" BASIS,
#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
#See the License for the specific language governing permissions and
#limitations under the License.
"""
Contains proposal functions
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import numpy as np
import paddle.fluid as fluid
from utils.config import cfg
__all__ = ["boxes3d_to_bev", "box_overlap_rotate", "boxes3d_to_bev", "box_iou", "box_nms"]
def boxes3d_to_bev(boxes3d):
"""
Args:
boxes3d: [N, 7], (x, y, z, h, w, l, ry)
Return:
boxes_bev: [N, 5], (x1, y1, x2, y2, ry)
"""
boxes_bev = np.zeros((boxes3d.shape[0], 5), dtype='float32')
cu, cv = boxes3d[:, 0], boxes3d[:, 2]
half_l, half_w = boxes3d[:, 5] / 2, boxes3d[:, 4] / 2
boxes_bev[:, 0], boxes_bev[:, 1] = cu - half_l, cv - half_w
boxes_bev[:, 2], boxes_bev[:, 3] = cu + half_l, cv + half_w
boxes_bev[:, 4] = boxes3d[:, 6]
return boxes_bev
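# Worked example with illustrative numbers: a car box at x=1, z=10 with
# h=1.5, w=1.6, l=3.9 and ry=0.3, i.e. [1., 1.5, 10., 1.5, 1.6, 3.9, 0.3],
# maps to the BEV rectangle
#   [1 - 3.9/2, 10 - 1.6/2, 1 + 3.9/2, 10 + 1.6/2, 0.3] = [-0.95, 9.2, 2.95, 10.8, 0.3]
# so the (x, z) plane becomes the BEV plane and ry is carried through unchanged.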
def rotate_around_center(center, angle_cos, angle_sin, corners):
new_x = (corners[:, 0] - center[0]) * angle_cos + \
(corners[:, 1] - center[1]) * angle_sin + center[0]
new_y = -(corners[:, 0] - center[0]) * angle_sin + \
(corners[:, 1] - center[1]) * angle_cos + center[1]
return np.concatenate([new_x[:, np.newaxis], new_y[:, np.newaxis]], axis=-1)
def check_rect_cross(p1, p2, q1, q2):
return min(p1[0], p2[0]) <= max(q1[0], q2[0]) and \
min(q1[0], q2[0]) <= max(p1[0], p2[0]) and \
min(p1[1], p2[1]) <= max(q1[1], q2[1]) and \
min(q1[1], q2[1]) <= max(p1[1], p2[1])
def cross(p1, p2, p0):
    return (p1[0] - p0[0]) * (p2[1] - p0[1]) - (p2[0] - p0[0]) * (p1[1] - p0[1])
def cross_area(a, b):
return a[0] * b[1] - a[1] * b[0]
def intersection(p1, p0, q1, q0):
if not check_rect_cross(p1, p0, q1, q0):
return None
s1 = cross(q0, p1, p0)
s2 = cross(p1, q1, p0)
s3 = cross(p0, q1, q0)
s4 = cross(q1, p1, q0)
if not (s1 * s2 > 0 and s3 * s4 > 0):
return None
s5 = cross(q1, p1, p0)
if np.abs(s5 - s1) > 1e-8:
return np.array([(s5 * q0[0] - s1 * q1[0]) / (s5 - s1),
(s5 * q0[1] - s1 * q1[1]) / (s5 - s1)], dtype='float32')
else:
        a0 = p0[1] - p1[1]
        b0 = p1[0] - p0[0]
        c0 = p0[0] * p1[1] - p1[0] * p0[1]
        a1 = q0[1] - q1[1]
        b1 = q1[0] - q0[0]
        c1 = q0[0] * q1[1] - q1[0] * q0[1]
D = a0 * b1 - a1 * b0
return np.array([(b0 * c1 - b1 * c0) / D, (a1 * c0 - a0 * c1) / D], dtype='float32')
def check_in_box2d(box, p):
center_x = (box[0] + box[2]) / 2.
center_y = (box[1] + box[3]) / 2.
angle_cos = np.cos(-box[4])
angle_sin = np.sin(-box[4])
rot_x = (p[0] - center_x) * angle_cos + (p[1] - center_y) * angle_sin + center_x
rot_y = -(p[0] - center_x) * angle_sin + (p[1] - center_y) * angle_cos + center_y
return rot_x > box[0] - 1e-5 and rot_x < box[2] + 1e-5 and \
rot_y > box[1] - 1e-5 and rot_y < box[3] + 1e-5
def point_cmp(a, b, center):
return np.arctan2(a[1] - center[1], a[0] - center[0]) > \
np.arctan2(b[1] - center[1], b[0] - center[0])
def box_overlap_rotate(cur_box, boxes):
"""
Calculate box overlap with rotate, box: [x1, y1, x2, y2, angle]
"""
areas = np.zeros((len(boxes), ), dtype='float32')
cur_center = [(cur_box[0] + cur_box[2]) / 2., (cur_box[1] + cur_box[3]) / 2.]
cur_corners = np.array([
[cur_box[0], cur_box[1]], # (x1, y1)
[cur_box[2], cur_box[1]], # (x2, y1)
[cur_box[2], cur_box[3]], # (x2, y2)
[cur_box[0], cur_box[3]], # (x1, y2)
[cur_box[0], cur_box[1]], # (x1, y1)
], dtype='float32')
cur_angle_cos = np.cos(cur_box[4])
cur_angle_sin = np.sin(cur_box[4])
cur_corners = rotate_around_center(cur_center, cur_angle_cos, cur_angle_sin, cur_corners)
for i, box in enumerate(boxes):
box_center = [(box[0] + box[2]) / 2., (box[1] + box[3]) / 2.]
box_corners = np.array([
[box[0], box[1]],
[box[2], box[1]],
[box[2], box[3]],
[box[0], box[3]],
[box[0], box[1]],
], dtype='float32')
box_angle_cos = np.cos(box[4])
box_angle_sin = np.sin(box[4])
box_corners = rotate_around_center(box_center, box_angle_cos, box_angle_sin, box_corners)
cross_points = np.zeros((16, 2), dtype='float32')
cnt = 0
# get intersection of lines
for j in range(4):
for k in range(4):
inters = intersection(cur_corners[j + 1], cur_corners[j],
box_corners[k + 1], box_corners[k])
if inters is not None:
cross_points[cnt, :] = inters
cnt += 1
# check corners
for l in range(4):
if check_in_box2d(cur_box, box_corners[l]):
cross_points[cnt, :] = box_corners[l]
cnt += 1
if check_in_box2d(box, cur_corners[l]):
cross_points[cnt, :] = cur_corners[l]
cnt += 1
if cnt > 0:
poly_center = np.sum(cross_points[:cnt, :], axis=0) / cnt
else:
poly_center = np.zeros((2,))
# sort the points of polygon
for j in range(cnt - 1):
for k in range(cnt - j - 1):
if point_cmp(cross_points[k], cross_points[k + 1], poly_center):
cross_points[k], cross_points[k + 1] = \
cross_points[k + 1].copy(), cross_points[k].copy()
# get the overlap areas
area = 0.
for j in range(cnt - 1):
area += cross_area(cross_points[j] - cross_points[0],
cross_points[j + 1] - cross_points[0])
areas[i] = np.abs(area) / 2.
return areas
def box_iou(cur_box, boxes, box_type='normal'):
cur_S = (cur_box[2] - cur_box[0]) * (cur_box[3] - cur_box[1])
boxes_S = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
if box_type == 'normal':
inter_x1 = np.maximum(cur_box[0], boxes[:, 0])
inter_y1 = np.maximum(cur_box[1], boxes[:, 1])
inter_x2 = np.minimum(cur_box[2], boxes[:, 2])
inter_y2 = np.minimum(cur_box[3], boxes[:, 3])
inter_w = np.maximum(inter_x2 - inter_x1, 0.)
inter_h = np.maximum(inter_y2 - inter_y1, 0.)
inter_area = inter_w * inter_h
elif box_type == 'rotate':
inter_area = box_overlap_rotate(cur_box, boxes)
else:
raise NotImplementedError
return inter_area / np.maximum(cur_S + boxes_S - inter_area, 1e-8)
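# Note for the 'rotate' branch above: cur_S and boxes_S are computed from the
# axis-aligned (x1, y1, x2, y2) extents, which is still exact because these
# boxes store an unrotated rectangle plus an angle, and rotation preserves area.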
def box_nms(boxes, scores, proposals, thresh, topk, nms_type='normal'):
assert nms_type in ['normal', 'rotate'], \
"unknown nms type {}".format(nms_type)
order = np.argsort(-scores)
boxes = boxes[order]
scores = scores[order]
proposals = proposals[order]
nmsed_scores = []
nmsed_proposals = []
cnt = 0
while boxes.shape[0]:
nmsed_scores.append(scores[0])
nmsed_proposals.append(proposals[0])
        cnt += 1
if cnt >= topk or boxes.shape[0] == 1:
break
iou = box_iou(boxes[0], boxes[1:], nms_type)
boxes = boxes[1:][iou < thresh]
scores = scores[1:][iou < thresh]
proposals = proposals[1:][iou < thresh]
return nmsed_scores, nmsed_proposals
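# Minimal usage sketch (the threshold/topk values below just mirror the
# TRAIN.RPN_* defaults in utils/config.py and are not mandated here):
#   nmsed_scores, nmsed_proposals = box_nms(
#       bev_boxes, scores, proposals, thresh=0.85, topk=2048, nms_type='normal')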
def box_nms_eval(boxes, scores, proposals, thresh, nms_type='rotate'):
assert nms_type in ['normal', 'rotate'], \
"unknown nms type {}".format(nms_type)
order = np.argsort(-scores)
boxes = boxes[order]
scores = scores[order]
proposals = proposals[order]
nmsed_scores = []
nmsed_proposals = []
while boxes.shape[0]:
nmsed_scores.append(scores[0])
nmsed_proposals.append(proposals[0])
iou = box_iou(boxes[0], boxes[1:], nms_type)
inds = iou < thresh
boxes = boxes[1:][inds]
scores = scores[1:][inds]
proposals = proposals[1:][inds]
nmsed_scores = np.asarray(nmsed_scores)
nmsed_proposals = np.asarray(nmsed_proposals)
return nmsed_scores, nmsed_proposals
def boxes_iou3d(boxes1, boxes2):
boxes1_bev = boxes3d_to_bev(boxes1)
boxes2_bev = boxes3d_to_bev(boxes2)
# bev overlap
overlaps_bev = np.zeros((boxes1_bev.shape[0], boxes2_bev.shape[0]))
for i in range(boxes1_bev.shape[0]):
overlaps_bev[i, :] = box_overlap_rotate(boxes1_bev[i], boxes2_bev)
# height overlap
boxes1_height_min = (boxes1[:, 1] - boxes1[:, 3]).reshape(-1, 1)
boxes1_height_max = boxes1[:, 1].reshape(-1, 1)
boxes2_height_min = (boxes2[:, 1] - boxes2[:, 3]).reshape(1, -1)
boxes2_height_max = boxes2[:, 1].reshape(1, -1)
max_of_min = np.maximum(boxes1_height_min, boxes2_height_min)
min_of_max = np.minimum(boxes1_height_max, boxes2_height_max)
overlaps_h = np.maximum(min_of_max - max_of_min, 0.)
# 3d iou
overlaps_3d = overlaps_bev * overlaps_h
vol_a = (boxes1[:, 3] * boxes1[:, 4] * boxes1[:, 5]).reshape(-1, 1)
vol_b = (boxes2[:, 3] * boxes2[:, 4] * boxes2[:, 5]).reshape(1, -1)
iou3d = overlaps_3d / np.maximum(vol_a + vol_b - overlaps_3d, 1e-7)
return iou3d
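# The height overlap above relies on the KITTI rect-camera convention: y points
# down and boxes3d[:, 1] is the bottom face of the box, so its top is y - h.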
"""
This code is borrowed from https://github.com/sshaoshuai/PointRCNN/blob/master/lib/utils/kitti_utils.py
"""
import numpy as np
import os
def get_calib_from_file(calib_file):
with open(calib_file) as f:
lines = f.readlines()
obj = lines[2].strip().split(' ')[1:]
P2 = np.array(obj, dtype=np.float32)
obj = lines[3].strip().split(' ')[1:]
P3 = np.array(obj, dtype=np.float32)
obj = lines[4].strip().split(' ')[1:]
R0 = np.array(obj, dtype=np.float32)
obj = lines[5].strip().split(' ')[1:]
Tr_velo_to_cam = np.array(obj, dtype=np.float32)
return {'P2': P2.reshape(3, 4),
'P3': P3.reshape(3, 4),
'R0': R0.reshape(3, 3),
'Tr_velo2cam': Tr_velo_to_cam.reshape(3, 4)}
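# Note on the hard-coded line indices above: a KITTI calib file lists, one
# matrix per line ("name: v0 v1 ..."), P0, P1, P2, P3, R0_rect and
# Tr_velo_to_cam, so lines[2]..lines[5] select P2, P3, R0 and Tr_velo_to_cam.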
class Calibration(object):
def __init__(self, calib_file):
if isinstance(calib_file, str):
calib = get_calib_from_file(calib_file)
else:
calib = calib_file
self.P2 = calib['P2'] # 3 x 4
self.R0 = calib['R0'] # 3 x 3
self.V2C = calib['Tr_velo2cam'] # 3 x 4
# Camera intrinsics and extrinsics
self.cu = self.P2[0, 2]
self.cv = self.P2[1, 2]
self.fu = self.P2[0, 0]
self.fv = self.P2[1, 1]
self.tx = self.P2[0, 3] / (-self.fu)
self.ty = self.P2[1, 3] / (-self.fv)
def cart_to_hom(self, pts):
"""
:param pts: (N, 3 or 2)
:return pts_hom: (N, 4 or 3)
"""
pts_hom = np.hstack((pts, np.ones((pts.shape[0], 1), dtype=np.float32)))
return pts_hom
def lidar_to_rect(self, pts_lidar):
"""
:param pts_lidar: (N, 3)
:return pts_rect: (N, 3)
"""
pts_lidar_hom = self.cart_to_hom(pts_lidar)
pts_rect = np.dot(pts_lidar_hom, np.dot(self.V2C.T, self.R0.T))
# pts_rect = reduce(np.dot, (pts_lidar_hom, self.V2C.T, self.R0.T))
return pts_rect
def rect_to_img(self, pts_rect):
"""
:param pts_rect: (N, 3)
:return pts_img: (N, 2)
"""
pts_rect_hom = self.cart_to_hom(pts_rect)
pts_2d_hom = np.dot(pts_rect_hom, self.P2.T)
pts_img = (pts_2d_hom[:, 0:2].T / pts_rect_hom[:, 2]).T # (N, 2)
pts_rect_depth = pts_2d_hom[:, 2] - self.P2.T[3, 2] # depth in rect camera coord
return pts_img, pts_rect_depth
def lidar_to_img(self, pts_lidar):
"""
:param pts_lidar: (N, 3)
:return pts_img: (N, 2)
"""
pts_rect = self.lidar_to_rect(pts_lidar)
pts_img, pts_depth = self.rect_to_img(pts_rect)
return pts_img, pts_depth
def img_to_rect(self, u, v, depth_rect):
"""
:param u: (N)
:param v: (N)
:param depth_rect: (N)
:return:
"""
x = ((u - self.cu) * depth_rect) / self.fu + self.tx
y = ((v - self.cv) * depth_rect) / self.fv + self.ty
pts_rect = np.concatenate((x.reshape(-1, 1), y.reshape(-1, 1), depth_rect.reshape(-1, 1)), axis=1)
return pts_rect
def depthmap_to_rect(self, depth_map):
"""
:param depth_map: (H, W), depth_map
:return:
"""
x_range = np.arange(0, depth_map.shape[1])
y_range = np.arange(0, depth_map.shape[0])
x_idxs, y_idxs = np.meshgrid(x_range, y_range)
x_idxs, y_idxs = x_idxs.reshape(-1), y_idxs.reshape(-1)
depth = depth_map[y_idxs, x_idxs]
pts_rect = self.img_to_rect(x_idxs, y_idxs, depth)
return pts_rect, x_idxs, y_idxs
def corners3d_to_img_boxes(self, corners3d):
"""
:param corners3d: (N, 8, 3) corners in rect coordinate
        :return: boxes: (N, 4) [x1, y1, x2, y2] in rgb coordinate
        :return: boxes_corner: (N, 8) [xi, yi] in rgb coordinate
"""
sample_num = corners3d.shape[0]
corners3d_hom = np.concatenate((corners3d, np.ones((sample_num, 8, 1))), axis=2) # (N, 8, 4)
img_pts = np.matmul(corners3d_hom, self.P2.T) # (N, 8, 3)
x, y = img_pts[:, :, 0] / img_pts[:, :, 2], img_pts[:, :, 1] / img_pts[:, :, 2]
x1, y1 = np.min(x, axis=1), np.min(y, axis=1)
x2, y2 = np.max(x, axis=1), np.max(y, axis=1)
boxes = np.concatenate((x1.reshape(-1, 1), y1.reshape(-1, 1), x2.reshape(-1, 1), y2.reshape(-1, 1)), axis=1)
boxes_corner = np.concatenate((x.reshape(-1, 8, 1), y.reshape(-1, 8, 1)), axis=2)
return boxes, boxes_corner
def camera_dis_to_rect(self, u, v, d):
"""
        Can only process valid u, v, d: u and v must lie within the image bounds (reprojection error about 0.02)
:param u: (N)
:param v: (N)
:param d: (N), the distance between camera and 3d points, d^2 = x^2 + y^2 + z^2
:return:
"""
assert self.fu == self.fv, '%.8f != %.8f' % (self.fu, self.fv)
fd = np.sqrt((u - self.cu)**2 + (v - self.cv)**2 + self.fu**2)
x = ((u - self.cu) * d) / fd + self.tx
y = ((v - self.cv) * d) / fd + self.ty
z = np.sqrt(d**2 - x**2 - y**2)
pts_rect = np.concatenate((x.reshape(-1, 1), y.reshape(-1, 1), z.reshape(-1, 1)), axis=1)
return pts_rect
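# Illustrative round trip (the calib path is hypothetical; pts_lidar is an
# (N, 3) array of lidar points):
#   calib = Calibration('data/KITTI/object/training/calib/000000.txt')
#   pts_img, pts_depth = calib.lidar_to_img(pts_lidar)             # (N, 2), (N,)
#   pts_rect = calib.img_to_rect(pts_img[:, 0], pts_img[:, 1], pts_depth)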
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
#Licensed under the Apache License, Version 2.0 (the "License");
#you may not use this file except in compliance with the License.
#You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
#Unless required by applicable law or agreed to in writing, software
#distributed under the License is distributed on an "AS IS" BASIS,
#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
#See the License for the specific language governing permissions and
#limitations under the License.
"""
This code is based on https://github.com/sshaoshuai/PointRCNN/blob/master/lib/config.py
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import yaml
import numpy as np
from ast import literal_eval
__all__ = ["load_config", "cfg"]
class AttrDict(dict):
def __init__(self, *args, **kwargs):
for arg in args:
for k, v in arg.items():
if isinstance(v, dict):
arg[k] = AttrDict(v)
else:
arg[k] = v
super(AttrDict, self).__init__(*args, **kwargs)
def __getattr__(self, name):
if name in self.__dict__:
return self.__dict__[name]
elif name in self:
return self[name]
else:
raise AttributeError(name)
def __setattr__(self, name, value):
if name in self.__dict__:
self.__dict__[name] = value
else:
self[name] = value
__C = AttrDict()
cfg = __C
# 0. basic config
__C.TAG = 'default'
__C.CLASSES = 'Car'
__C.INCLUDE_SIMILAR_TYPE = False
# config of augmentation
__C.AUG_DATA = True
__C.AUG_METHOD_LIST = ['rotation', 'scaling', 'flip']
__C.AUG_METHOD_PROB = [0.5, 0.5, 0.5]
__C.AUG_ROT_RANGE = 18
__C.GT_AUG_ENABLED = False
__C.GT_EXTRA_NUM = 15
__C.GT_AUG_RAND_NUM = False
__C.GT_AUG_APPLY_PROB = 0.75
__C.GT_AUG_HARD_RATIO = 0.6
__C.PC_REDUCE_BY_RANGE = True
__C.PC_AREA_SCOPE = np.array([[-40, 40],
[-1, 3],
[0, 70.4]]) # x, y, z scope in rect camera coords
__C.CLS_MEAN_SIZE = np.array([[1.52, 1.63, 3.88]], dtype=np.float32)
# 1. config of rpn network
__C.RPN = AttrDict()
__C.RPN.ENABLED = True
__C.RPN.FIXED = False
__C.RPN.USE_INTENSITY = True
# config of bin-based loss
__C.RPN.LOC_XZ_FINE = False
__C.RPN.LOC_SCOPE = 3.0
__C.RPN.LOC_BIN_SIZE = 0.5
__C.RPN.NUM_HEAD_BIN = 12
# config of network structure
__C.RPN.BACKBONE = 'pointnet2_msg'
__C.RPN.USE_BN = True
__C.RPN.NUM_POINTS = 16384
__C.RPN.SA_CONFIG = AttrDict()
__C.RPN.SA_CONFIG.NPOINTS = [4096, 1024, 256, 64]
__C.RPN.SA_CONFIG.RADIUS = [[0.1, 0.5], [0.5, 1.0], [1.0, 2.0], [2.0, 4.0]]
__C.RPN.SA_CONFIG.NSAMPLE = [[16, 32], [16, 32], [16, 32], [16, 32]]
__C.RPN.SA_CONFIG.MLPS = [[[16, 16, 32], [32, 32, 64]],
[[64, 64, 128], [64, 96, 128]],
[[128, 196, 256], [128, 196, 256]],
[[256, 256, 512], [256, 384, 512]]]
__C.RPN.FP_MLPS = [[128, 128], [256, 256], [512, 512], [512, 512]]
__C.RPN.CLS_FC = [128]
__C.RPN.REG_FC = [128]
__C.RPN.DP_RATIO = 0.5
# config of training
__C.RPN.LOSS_CLS = 'DiceLoss'
__C.RPN.FG_WEIGHT = 15
__C.RPN.FOCAL_ALPHA = [0.25, 0.75]
__C.RPN.FOCAL_GAMMA = 2.0
__C.RPN.REG_LOSS_WEIGHT = [1.0, 1.0, 1.0, 1.0]
__C.RPN.LOSS_WEIGHT = [1.0, 1.0]
__C.RPN.NMS_TYPE = 'normal' # normal, rotate
# config of testing
__C.RPN.SCORE_THRESH = 0.3
# 2. config of rcnn network
__C.RCNN = AttrDict()
__C.RCNN.ENABLED = False
# config of input
__C.RCNN.USE_RPN_FEATURES = True
__C.RCNN.USE_MASK = True
__C.RCNN.MASK_TYPE = 'seg'
__C.RCNN.USE_INTENSITY = False
__C.RCNN.USE_DEPTH = True
__C.RCNN.USE_SEG_SCORE = False
__C.RCNN.ROI_SAMPLE_JIT = False
__C.RCNN.ROI_FG_AUG_TIMES = 10
__C.RCNN.REG_AUG_METHOD = 'multiple' # multiple, single, normal
__C.RCNN.POOL_EXTRA_WIDTH = 1.0
# config of bin-based loss
__C.RCNN.LOC_SCOPE = 1.5
__C.RCNN.LOC_BIN_SIZE = 0.5
__C.RCNN.NUM_HEAD_BIN = 9
__C.RCNN.LOC_Y_BY_BIN = False
__C.RCNN.LOC_Y_SCOPE = 0.5
__C.RCNN.LOC_Y_BIN_SIZE = 0.25
__C.RCNN.SIZE_RES_ON_ROI = False
# config of network structure
__C.RCNN.USE_BN = False
__C.RCNN.DP_RATIO = 0.0
__C.RCNN.BACKBONE = 'pointnet' # pointnet, pointsift
__C.RCNN.XYZ_UP_LAYER = [128, 128]
__C.RCNN.NUM_POINTS = 512
__C.RCNN.SA_CONFIG = AttrDict()
__C.RCNN.SA_CONFIG.NPOINTS = [128, 32, -1]
__C.RCNN.SA_CONFIG.RADIUS = [0.2, 0.4, 100]
__C.RCNN.SA_CONFIG.NSAMPLE = [64, 64, 64]
__C.RCNN.SA_CONFIG.MLPS = [[128, 128, 128],
[128, 128, 256],
[256, 256, 512]]
__C.RCNN.CLS_FC = [256, 256]
__C.RCNN.REG_FC = [256, 256]
# config of training
__C.RCNN.LOSS_CLS = 'BinaryCrossEntropy'
__C.RCNN.FOCAL_ALPHA = [0.25, 0.75]
__C.RCNN.FOCAL_GAMMA = 2.0
__C.RCNN.CLS_WEIGHT = np.array([1.0, 1.0, 1.0], dtype=np.float32)
__C.RCNN.CLS_FG_THRESH = 0.6
__C.RCNN.CLS_BG_THRESH = 0.45
__C.RCNN.CLS_BG_THRESH_LO = 0.05
__C.RCNN.REG_FG_THRESH = 0.55
__C.RCNN.FG_RATIO = 0.5
__C.RCNN.ROI_PER_IMAGE = 64
__C.RCNN.HARD_BG_RATIO = 0.6
# config of testing
__C.RCNN.SCORE_THRESH = 0.3
__C.RCNN.NMS_THRESH = 0.1
# general training config
__C.TRAIN = AttrDict()
__C.TRAIN.SPLIT = 'train'
__C.TRAIN.VAL_SPLIT = 'smallval'
__C.TRAIN.LR = 0.002
__C.TRAIN.LR_CLIP = 0.00001
__C.TRAIN.LR_DECAY = 0.5
__C.TRAIN.DECAY_STEP_LIST = [50, 100, 150, 200, 250, 300]
__C.TRAIN.LR_WARMUP = False
__C.TRAIN.WARMUP_MIN = 0.0002
__C.TRAIN.WARMUP_EPOCH = 5
__C.TRAIN.BN_MOMENTUM = 0.9
__C.TRAIN.BN_DECAY = 0.5
__C.TRAIN.BNM_CLIP = 0.01
__C.TRAIN.BN_DECAY_STEP_LIST = [50, 100, 150, 200, 250, 300]
__C.TRAIN.OPTIMIZER = 'adam'
__C.TRAIN.WEIGHT_DECAY = 0.0 # "L2 regularization coeff [default: 0.0]"
__C.TRAIN.MOMENTUM = 0.9
__C.TRAIN.MOMS = [0.95, 0.85]
__C.TRAIN.DIV_FACTOR = 10.0
__C.TRAIN.PCT_START = 0.4
__C.TRAIN.GRAD_NORM_CLIP = 1.0
__C.TRAIN.RPN_PRE_NMS_TOP_N = 12000
__C.TRAIN.RPN_POST_NMS_TOP_N = 2048
__C.TRAIN.RPN_NMS_THRESH = 0.85
__C.TRAIN.RPN_DISTANCE_BASED_PROPOSE = True
__C.TEST = AttrDict()
__C.TEST.SPLIT = 'val'
__C.TEST.RPN_PRE_NMS_TOP_N = 9000
__C.TEST.RPN_POST_NMS_TOP_N = 300
__C.TEST.RPN_NMS_THRESH = 0.7
__C.TEST.RPN_DISTANCE_BASED_PROPOSE = True
def load_config(fname):
"""
Load config from yaml file and merge into global cfg
"""
with open(fname) as f:
yml_cfg = AttrDict(yaml.load(f.read(), Loader=yaml.Loader))
_merge_cfg_a_to_b(yml_cfg, __C)
def set_config_from_list(cfg_list):
assert len(cfg_list) % 2 == 0, "cfgs list length invalid"
for k, v in zip(cfg_list[0::2], cfg_list[1::2]):
key_list = k.split('.')
d = __C
for subkey in key_list[:-1]:
assert subkey in d
d = d[subkey]
subkey = key_list[-1]
assert subkey in d
try:
value = literal_eval(v)
        except (ValueError, SyntaxError):
            # v is a plain string literal
value = v
assert type(value) == type(d[subkey]), \
'type {} does not match original type {}'.format(type(value), type(d[subkey]))
d[subkey] = value
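# For example, `--set RPN.SCORE_THRESH 0.2 TRAIN.RPN_NMS_THRESH 0.8` walks each
# dotted key down the nested AttrDict and replaces the leaf value; literal_eval
# is what lets numbers and lists keep their proper types from the command line.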
def _merge_cfg_a_to_b(a, b):
assert isinstance(a, AttrDict), \
"unknown type {}".format(type(a))
for k, v in a.items():
assert k in b, "unknown key {}".format(k)
if type(v) is not type(b[k]):
if isinstance(b[k], np.ndarray):
b[k] = np.array(v, dtype=b[k].dtype)
else:
raise TypeError("Config type mismatch")
if isinstance(v, AttrDict):
_merge_cfg_a_to_b(v, b[k])
else:
b[k] = v
if __name__ == "__main__":
load_config("./cfgs/default.yml")
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
#Licensed under the Apache License, Version 2.0 (the "License");
#you may not use this file except in compliance with the License.
#You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
#Unless required by applicable law or agreed to in writing, software
#distributed under the License is distributed on an "AS IS" BASIS,
#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
#See the License for the specific language governing permissions and
#limitations under the License.
import cython
from math import pi, cos, sin
import numpy as np
cimport numpy as np
cdef class Point:
cdef float x, y
def __cinit__(self, x, y):
self.x = x
self.y = y
def __add__(self, v):
if not isinstance(v, Point):
return NotImplemented
return Point(self.x + v.x, self.y + v.y)
def __sub__(self, v):
if not isinstance(v, Point):
return NotImplemented
return Point(self.x - v.x, self.y - v.y)
def cross(self, v):
if not isinstance(v, Point):
return NotImplemented
return self.x*v.y - self.y*v.x
cdef class Line:
cdef float a, b, c
# ax + by + c = 0
def __cinit__(self, v1, v2):
self.a = v2.y - v1.y
self.b = v1.x - v2.x
self.c = v2.cross(v1)
def __call__(self, p):
return self.a*p.x + self.b*p.y + self.c
def intersection(self, other):
if not isinstance(other, Line):
return NotImplemented
w = self.a*other.b - self.b*other.a
return Point(
(self.b*other.c - self.c*other.b)/w,
(self.c*other.a - self.a*other.c)/w
)
@cython.boundscheck(False)
@cython.wraparound(False)
def rectangle_vertices_(x1, y1, x2, y2, r):
cx = (x1 + x2) / 2
cy = (y1 + y2) / 2
angle = r
cr = cos(angle)
sr = sin(angle)
# rotate around center
return (
Point(
x=(x1-cx)*cr+(y1-cy)*sr+cx,
y=-(x1-cx)*sr+(y1-cy)*cr+cy
),
Point(
x=(x2-cx)*cr+(y1-cy)*sr+cx,
y=-(x2-cx)*sr+(y1-cy)*cr+cy
),
Point(
x=(x2-cx)*cr+(y2-cy)*sr+cx,
y=-(x2-cx)*sr+(y2-cy)*cr+cy
),
Point(
x=(x1-cx)*cr+(y2-cy)*sr+cx,
y=-(x1-cx)*sr+(y2-cy)*cr+cy
)
)
@cython.boundscheck(False)
@cython.wraparound(False)
def intersection_area(r1, r2):
    # r1 and r2 are in (x1, y1, x2, y2, rotation) representation
# First convert these into a sequence of vertices
rect1 = rectangle_vertices_(*r1)
rect2 = rectangle_vertices_(*r2)
# Use the vertices of the first rectangle as
# starting vertices of the intersection polygon.
intersection = rect1
# Loop over the edges of the second rectangle
for p, q in zip(rect2, rect2[1:] + rect2[:1]):
if len(intersection) <= 2:
break # No intersection
line = Line(p, q)
# Any point p with line(p) <= 0 is on the "inside" (or on the boundary),
# any point p with line(p) > 0 is on the "outside".
# Loop over the edges of the intersection polygon,
# and determine which part is inside and which is outside.
new_intersection = []
line_values = [line(t) for t in intersection]
for s, t, s_value, t_value in zip(
intersection, intersection[1:] + intersection[:1],
line_values, line_values[1:] + line_values[:1]):
if s_value <= 0:
new_intersection.append(s)
if s_value * t_value < 0:
# Points are on opposite sides.
# Add the intersection of the lines to new_intersection.
intersection_point = line.intersection(Line(s, t))
new_intersection.append(intersection_point)
intersection = new_intersection
# Calculate area
if len(intersection) <= 2:
return 0
return 0.5 * sum(p.x*q.y - p.y*q.x for p, q in zip(intersection, intersection[1:] + intersection[:1]))
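# The loop above is Sutherland-Hodgman clipping (rect1 clipped against each
# edge of rect2) and the final sum is the shoelace formula. Sanity check with
# axis-aligned boxes: a unit square against a copy shifted by 0.5 in x leaves
# a 0.5 x 1 strip, so intersection_area returns 0.5.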
def boxes3d_to_bev_(boxes3d):
"""
Args:
boxes3d: [N, 7], (x, y, z, h, w, l, ry)
Return:
boxes_bev: [N, 5], (x1, y1, x2, y2, ry)
"""
boxes_bev = np.zeros((boxes3d.shape[0], 5), dtype='float32')
cu, cv = boxes3d[:, 0], boxes3d[:, 2]
half_l, half_w = boxes3d[:, 5] / 2, boxes3d[:, 4] / 2
boxes_bev[:, 0], boxes_bev[:, 1] = cu - half_l, cv - half_w
boxes_bev[:, 2], boxes_bev[:, 3] = cu + half_l, cv + half_w
boxes_bev[:, 4] = boxes3d[:, 6]
return boxes_bev
def boxes_iou3d(boxes_a, boxes_b):
"""
:param boxes_a: (N, 7) [x, y, z, h, w, l, ry]
:param boxes_b: (M, 7) [x, y, z, h, w, l, ry]
:return:
ans_iou: (M, N)
"""
boxes_a_bev = boxes3d_to_bev_(boxes_a)
boxes_b_bev = boxes3d_to_bev_(boxes_b)
# bev overlap
num_a = boxes_a_bev.shape[0]
num_b = boxes_b_bev.shape[0]
overlaps_bev = np.zeros((num_a, num_b), dtype=np.float32)
for i in range(num_a):
for j in range(num_b):
overlaps_bev[i][j] = intersection_area(boxes_a_bev[i], boxes_b_bev[j])
# height overlap
boxes_a_height_min = (boxes_a[:, 1] - boxes_a[:, 3]).reshape(-1, 1)
boxes_a_height_max = boxes_a[:, 1].reshape(-1, 1)
boxes_b_height_min = (boxes_b[:, 1] - boxes_b[:, 3]).reshape(1, -1)
boxes_b_height_max = boxes_b[:, 1].reshape(1, -1)
max_of_min = np.maximum(boxes_a_height_min, boxes_b_height_min)
min_of_max = np.minimum(boxes_a_height_max, boxes_b_height_max)
overlaps_h = np.clip(min_of_max - max_of_min, a_min=0, a_max=np.inf)
# 3d iou
overlaps_3d = overlaps_bev * overlaps_h
vol_a = (boxes_a[:, 3] * boxes_a[:, 4] * boxes_a[:, 5]).reshape(-1, 1)
vol_b = (boxes_b[:, 3] * boxes_b[:, 4] * boxes_b[:, 5]).reshape(1, -1)
iou3d = overlaps_3d / np.clip(vol_a + vol_b - overlaps_3d, a_min=1e-7, a_max=np.inf)
return iou3d
#if __name__ == '__main__':
# # (x1, y1, x2, y2, rotation)
# r1 = (10, 15, 15, 10, 30)
# r2 = (15, 15, 20, 10, 0)
# print(intersection_area(r1, r2))
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
#Licensed under the Apache License, Version 2.0 (the "License");
#you may not use this file except in compliance with the License.
#You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
#Unless required by applicable law or agreed to in writing, software
#distributed under the License is distributed on an "AS IS" BASIS,
#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
#See the License for the specific language governing permissions and
#limitations under the License.
import cython
import numpy as np
cimport numpy as np
@cython.boundscheck(False)
@cython.wraparound(False)
def pts_in_boxes3d(np.ndarray pts_rect, np.ndarray boxes3d):
"""
    :param pts_rect: (N, 3) in rect-camera coords
:param boxes3d: (M, 7)
:return: boxes_pts_mask_list: (M), list with [(N), (N), ..]
"""
cdef float MAX_DIS = 10.0
cdef np.ndarray boxes_pts_mask_list = np.zeros((boxes3d.shape[0], pts_rect.shape[0]), dtype='int32')
cdef int boxes3d_num = boxes3d.shape[0]
cdef int pts_rect_num = pts_rect.shape[0]
cdef float cx, by, cz, h, w, l, angle, cy, cosa, sina, x_rot, z_rot
    cdef float x, y, z
for i in range(boxes3d_num):
cx, by, cz, h, w, l, angle = boxes3d[i, :]
cy = by - h / 2.
cosa = np.cos(angle)
sina = np.sin(angle)
for j in range(pts_rect_num):
x, y, z = pts_rect[j, :]
if np.abs(x - cx) > MAX_DIS or np.abs(y - cy) > h / 2. or np.abs(z - cz) > MAX_DIS:
continue
x_rot = (x - cx) * cosa + (z - cz) * (-sina)
z_rot = (x - cx) * sina + (z - cz) * cosa
boxes_pts_mask_list[i, j] = int(x_rot >= -l / 2. and x_rot <= l / 2. and
z_rot >= -w / 2. and z_rot <= w / 2.)
return boxes_pts_mask_list
@cython.boundscheck(False)
@cython.wraparound(False)
def rotate_pc_along_y(np.ndarray pc, float rot_angle):
"""
params pc: (N, 3+C), (N, 3) is in the rectified camera coordinate
params rot_angle: rad scalar
Output pc: updated pc with XYZ rotated
"""
cosval = np.cos(rot_angle)
sinval = np.sin(rot_angle)
rotmat = np.array([[cosval, -sinval], [sinval, cosval]])
pc[:, [0, 2]] = np.dot(pc[:, [0, 2]], np.transpose(rotmat))
return pc
@cython.boundscheck(False)
@cython.wraparound(False)
def rotate_pc_along_y_np(np.ndarray pc, np.ndarray rot_angle):
"""
:param pc: (N, 512, 3 + C)
:param rot_angle: (N)
:return:
TODO: merge with rotate_pc_along_y_torch in bbox_transform.py
"""
cdef np.ndarray cosa, sina, raw_1, raw_2, R, pc_temp
cosa = np.cos(rot_angle).reshape(-1, 1)
sina = np.sin(rot_angle).reshape(-1, 1)
raw_1 = np.concatenate([cosa, -sina], axis=1)
raw_2 = np.concatenate([sina, cosa], axis=1)
    # (N, 2, 2)
R = np.concatenate((np.expand_dims(raw_1, axis=1), np.expand_dims(raw_2, axis=1)), axis=1)
pc_temp = pc[:, :, [0, 2]]
pc[:, :, [0, 2]] = np.matmul(pc_temp, R.transpose(0, 2, 1))
return pc
@cython.boundscheck(False)
@cython.wraparound(False)
def enlarge_box3d(np.ndarray boxes3d, float extra_width):
"""
:param boxes3d: (N, 7) [x, y, z, h, w, l, ry]
"""
cdef np.ndarray large_boxes3d
if isinstance(boxes3d, np.ndarray):
large_boxes3d = boxes3d.copy()
else:
large_boxes3d = boxes3d.clone()
large_boxes3d[:, 3:6] += extra_width * 2
large_boxes3d[:, 1] += extra_width
return large_boxes3d
@cython.boundscheck(False)
@cython.wraparound(False)
def boxes3d_to_corners3d(np.ndarray boxes3d, bint rotate=True):
"""
:param boxes3d: (N, 7) [x, y, z, h, w, l, ry]
:param rotate:
:return: corners3d: (N, 8, 3)
"""
cdef int boxes_num = boxes3d.shape[0]
cdef np.ndarray h, w, l
h, w, l = boxes3d[:, 3], boxes3d[:, 4], boxes3d[:, 5]
    cdef np.ndarray x_corners, y_corners, z_corners
x_corners = np.array([l / 2., l / 2., -l / 2., -l / 2., l / 2., l / 2., -l / 2., -l / 2.], dtype=np.float32).T # (N, 8)
z_corners = np.array([w / 2., -w / 2., -w / 2., w / 2., w / 2., -w / 2., -w / 2., w / 2.], dtype=np.float32).T # (N, 8)
y_corners = np.zeros((boxes_num, 8), dtype=np.float32)
y_corners[:, 4:8] = -h.reshape(boxes_num, 1).repeat(4, axis=1) # (N, 8)
cdef np.ndarray ry, zeros, ones, rot_list, R_list, temp_corners, rotated_corners
if rotate:
ry = boxes3d[:, 6]
zeros, ones = np.zeros(ry.size, dtype=np.float32), np.ones(ry.size, dtype=np.float32)
rot_list = np.array([[np.cos(ry), zeros, -np.sin(ry)],
[zeros, ones, zeros],
[np.sin(ry), zeros, np.cos(ry)]]) # (3, 3, N)
R_list = np.transpose(rot_list, (2, 0, 1)) # (N, 3, 3)
temp_corners = np.concatenate((x_corners.reshape(-1, 8, 1), y_corners.reshape(-1, 8, 1),
z_corners.reshape(-1, 8, 1)), axis=2) # (N, 8, 3)
rotated_corners = np.matmul(temp_corners, R_list) # (N, 8, 3)
x_corners, y_corners, z_corners = rotated_corners[:, :, 0], rotated_corners[:, :, 1], rotated_corners[:, :, 2]
cdef np.ndarray x_loc, y_loc, z_loc
x_loc, y_loc, z_loc = boxes3d[:, 0], boxes3d[:, 1], boxes3d[:, 2]
cdef np.ndarray x, y, z, corners
x = x_loc.reshape(-1, 1) + x_corners.reshape(-1, 8)
y = y_loc.reshape(-1, 1) + y_corners.reshape(-1, 8)
z = z_loc.reshape(-1, 1) + z_corners.reshape(-1, 8)
corners = np.concatenate((x.reshape(-1, 8, 1), y.reshape(-1, 8, 1), z.reshape(-1, 8, 1)), axis=2).astype(np.float32)
return corners
@cython.boundscheck(False)
@cython.wraparound(False)
def objs_to_boxes3d(obj_list):
cdef np.ndarray boxes3d = np.zeros((obj_list.__len__(), 7), dtype=np.float32)
cdef int k
for k, obj in enumerate(obj_list):
boxes3d[k, 0:3], boxes3d[k, 3], boxes3d[k, 4], boxes3d[k, 5], boxes3d[k, 6] \
= obj.pos, obj.h, obj.w, obj.l, obj.ry
return boxes3d
@cython.boundscheck(False)
@cython.wraparound(False)
def objs_to_scores(obj_list):
cdef np.ndarray scores = np.zeros((obj_list.__len__()), dtype=np.float32)
cdef int k
for k, obj in enumerate(obj_list):
scores[k] = obj.score
return scores
def get_iou3d(np.ndarray corners3d, np.ndarray query_corners3d, bint need_bev=False):
"""
:param corners3d: (N, 8, 3) in rect coords
:param query_corners3d: (M, 8, 3)
:return:
"""
from shapely.geometry import Polygon
A, B = corners3d, query_corners3d
N, M = A.shape[0], B.shape[0]
iou3d = np.zeros((N, M), dtype=np.float32)
iou_bev = np.zeros((N, M), dtype=np.float32)
# for height overlap, since y face down, use the negative y
min_h_a = -A[:, 0:4, 1].sum(axis=1) / 4.0
max_h_a = -A[:, 4:8, 1].sum(axis=1) / 4.0
min_h_b = -B[:, 0:4, 1].sum(axis=1) / 4.0
max_h_b = -B[:, 4:8, 1].sum(axis=1) / 4.0
for i in range(N):
for j in range(M):
max_of_min = np.max([min_h_a[i], min_h_b[j]])
min_of_max = np.min([max_h_a[i], max_h_b[j]])
h_overlap = np.max([0, min_of_max - max_of_min])
if h_overlap == 0:
continue
bottom_a, bottom_b = Polygon(A[i, 0:4, [0, 2]].T), Polygon(B[j, 0:4, [0, 2]].T)
if bottom_a.is_valid and bottom_b.is_valid:
                # validity check: a valid Polygon may not have overlapping exterior or interior rings
bottom_overlap = bottom_a.intersection(bottom_b).area
else:
bottom_overlap = 0.
overlap3d = bottom_overlap * h_overlap
union3d = bottom_a.area * (max_h_a[i] - min_h_a[i]) + bottom_b.area * (max_h_b[j] - min_h_b[j]) - overlap3d
iou3d[i][j] = overlap3d / union3d
iou_bev[i][j] = bottom_overlap / (bottom_a.area + bottom_b.area - bottom_overlap)
if need_bev:
return iou3d, iou_bev
return iou3d
def get_objects_from_label(label_file):
import utils.object3d as object3d
with open(label_file, 'r') as f:
lines = f.readlines()
objects = [object3d.Object3d(line) for line in lines]
return objects
@cython.boundscheck(False)
@cython.wraparound(False)
def _rotate_pc_along_y(np.ndarray pc, np.ndarray angle):
cdef np.ndarray cosa = np.cos(angle)
cosa=cosa.reshape(-1, 1)
cdef np.ndarray sina = np.sin(angle)
sina = sina.reshape(-1, 1)
cdef np.ndarray R = np.concatenate([cosa, -sina, sina, cosa], axis=-1)
R = R.reshape(-1, 2, 2)
cdef np.ndarray pc_temp = pc[:, [0, 2]]
pc_temp = pc_temp.reshape(-1, 1, 2)
cdef np.ndarray pc_temp_1 = np.matmul(pc_temp, R.transpose(0, 2, 1))
pc_temp_1 = pc_temp_1.reshape(-1, 2)
pc[:,[0,2]] = pc_temp_1
return pc
@cython.boundscheck(False)
@cython.wraparound(False)
def decode_bbox_target(
np.ndarray roi_box3d,
np.ndarray pred_reg,
np.ndarray anchor_size,
float loc_scope,
float loc_bin_size,
int num_head_bin,
bint get_xz_fine=True,
float loc_y_scope=0.5,
float loc_y_bin_size=0.25,
bint get_y_by_bin=False,
bint get_ry_fine=False):
cdef int per_loc_bin_num = int(loc_scope / loc_bin_size) * 2
cdef int loc_y_bin_num = int(loc_y_scope / loc_y_bin_size) * 2
# recover xz localization
cdef int x_bin_l = 0
cdef int x_bin_r = per_loc_bin_num
    cdef int z_bin_l = per_loc_bin_num
cdef int z_bin_r = per_loc_bin_num * 2
cdef int start_offset = z_bin_r
cdef np.ndarray x_bin = np.argmax(pred_reg[:, x_bin_l: x_bin_r], axis=1)
cdef np.ndarray z_bin = np.argmax(pred_reg[:, z_bin_l: z_bin_r], axis=1)
cdef np.ndarray pos_x = x_bin.astype('float32') * loc_bin_size + loc_bin_size / 2 - loc_scope
cdef np.ndarray pos_z = z_bin.astype('float32') * loc_bin_size + loc_bin_size / 2 - loc_scope
if get_xz_fine:
x_res_l, x_res_r = per_loc_bin_num * 2, per_loc_bin_num * 3
z_res_l, z_res_r = per_loc_bin_num * 3, per_loc_bin_num * 4
start_offset = z_res_r
x_res_norm = pred_reg[:, x_res_l:x_res_r][np.arange(len(x_bin)), x_bin]
z_res_norm = pred_reg[:, z_res_l:z_res_r][np.arange(len(z_bin)), z_bin]
x_res = x_res_norm * loc_bin_size
z_res = z_res_norm * loc_bin_size
pos_x += x_res
pos_z += z_res
# recover y localization
if get_y_by_bin:
y_bin_l, y_bin_r = start_offset, start_offset + loc_y_bin_num
y_res_l, y_res_r = y_bin_r, y_bin_r + loc_y_bin_num
start_offset = y_res_r
y_bin = np.argmax(pred_reg[:, y_bin_l: y_bin_r], axis=1)
y_res_norm = pred_reg[:, y_res_l:y_res_r][np.arange(len(y_bin)), y_bin]
y_res = y_res_norm * loc_y_bin_size
pos_y = y_bin.astype('float32') * loc_y_bin_size + loc_y_bin_size / 2 - loc_y_scope + y_res
pos_y = pos_y + np.array(roi_box3d[:, 1]).reshape(-1)
else:
y_offset_l, y_offset_r = start_offset, start_offset + 1
start_offset = y_offset_r
pos_y = np.array(roi_box3d[:, 1]) + np.array(pred_reg[:, y_offset_l])
pos_y = pos_y.reshape(-1)
# recover ry rotation
    cdef int ry_bin_l = start_offset
    cdef int ry_bin_r = start_offset + num_head_bin
    cdef int ry_res_l = ry_bin_r
    cdef int ry_res_r = ry_bin_r + num_head_bin
cdef np.ndarray ry_bin = np.argmax(pred_reg[:, ry_bin_l: ry_bin_r], axis=1)
cdef np.ndarray ry_res_norm = pred_reg[:, ry_res_l:ry_res_r][np.arange(len(ry_bin)), ry_bin]
if get_ry_fine:
# divide pi/2 into several bins
angle_per_class = (np.pi / 2) / num_head_bin
ry_res = ry_res_norm * (angle_per_class / 2)
ry = (ry_bin.astype('float32') * angle_per_class + angle_per_class / 2) + ry_res - np.pi / 4
else:
angle_per_class = (2 * np.pi) / num_head_bin
ry_res = ry_res_norm * (angle_per_class / 2)
# bin_center is (0, 30, 60, 90, 120, ..., 270, 300, 330)
ry = np.fmod(ry_bin.astype('float32') * angle_per_class + ry_res, 2 * np.pi)
ry[ry > np.pi] -= 2 * np.pi
# recover size
cdef int size_res_l = ry_res_r
cdef int size_res_r = ry_res_r + 3
assert size_res_r == pred_reg.shape[1]
cdef np.ndarray size_res_norm = pred_reg[:, size_res_l: size_res_r]
cdef np.ndarray hwl = size_res_norm * anchor_size + anchor_size
# shift to original coords
cdef np.ndarray roi_center = np.array(roi_box3d[:, 0:3])
cdef np.ndarray shift_ret_box3d = np.concatenate((
pos_x.reshape(-1, 1),
pos_y.reshape(-1, 1),
pos_z.reshape(-1, 1),
hwl, ry.reshape(-1, 1)), axis=1)
ret_box3d = shift_ret_box3d
if roi_box3d.shape[1] == 7:
roi_ry = np.array(roi_box3d[:, 6]).reshape(-1)
ret_box3d = _rotate_pc_along_y(np.array(shift_ret_box3d), -roi_ry)
ret_box3d[:, 6] += roi_ry
ret_box3d[:, [0, 2]] += roi_center[:, [0, 2]]
return ret_box3d
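# Channel layout of pred_reg decoded above, for the default Car RCNN head
# (LOC_SCOPE=1.5, LOC_BIN_SIZE=0.5 -> per_loc_bin_num=6, NUM_HEAD_BIN=9,
# get_xz_fine=True, get_y_by_bin=False):
#   [0:6) x_bin    [6:12) z_bin    [12:18) x_res    [18:24) z_res
#   [24] y offset
#   [25:34) ry_bin    [34:43) ry_res
#   [43:46) size residuals (h, w, l)  -> 46 channels in total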
"""
This code is borrowed from https://github.com/sshaoshuai/PointRCNN/blob/master/lib/utils/object3d.py
"""
import numpy as np
def cls_type_to_id(cls_type):
type_to_id = {'Car': 1, 'Pedestrian': 2, 'Cyclist': 3, 'Van': 4}
if cls_type not in type_to_id.keys():
return -1
return type_to_id[cls_type]
class Object3d(object):
def __init__(self, line):
label = line.strip().split(' ')
self.src = line
self.cls_type = label[0]
self.cls_id = cls_type_to_id(self.cls_type)
self.trucation = float(label[1])
self.occlusion = float(label[2]) # 0:fully visible 1:partly occluded 2:largely occluded 3:unknown
self.alpha = float(label[3])
self.box2d = np.array((float(label[4]), float(label[5]), float(label[6]), float(label[7])), dtype=np.float32)
self.h = float(label[8])
self.w = float(label[9])
self.l = float(label[10])
self.pos = np.array((float(label[11]), float(label[12]), float(label[13])), dtype=np.float32)
self.dis_to_cam = np.linalg.norm(self.pos)
self.ry = float(label[14])
self.score = float(label[15]) if label.__len__() == 16 else -1.0
self.level_str = None
self.level = self.get_obj_level()
def get_obj_level(self):
height = float(self.box2d[3]) - float(self.box2d[1]) + 1
if height >= 40 and self.trucation <= 0.15 and self.occlusion <= 0:
self.level_str = 'Easy'
return 1 # Easy
elif height >= 25 and self.trucation <= 0.3 and self.occlusion <= 1:
self.level_str = 'Moderate'
return 2 # Moderate
elif height >= 25 and self.trucation <= 0.5 and self.occlusion <= 2:
self.level_str = 'Hard'
return 3 # Hard
else:
self.level_str = 'UnKnown'
return 4
def generate_corners3d(self):
"""
generate corners3d representation for this object
:return corners_3d: (8, 3) corners of box3d in camera coord
"""
l, h, w = self.l, self.h, self.w
x_corners = [l / 2, l / 2, -l / 2, -l / 2, l / 2, l / 2, -l / 2, -l / 2]
y_corners = [0, 0, 0, 0, -h, -h, -h, -h]
z_corners = [w / 2, -w / 2, -w / 2, w / 2, w / 2, -w / 2, -w / 2, w / 2]
R = np.array([[np.cos(self.ry), 0, np.sin(self.ry)],
[0, 1, 0],
[-np.sin(self.ry), 0, np.cos(self.ry)]])
corners3d = np.vstack([x_corners, y_corners, z_corners]) # (3, 8)
corners3d = np.dot(R, corners3d).T
corners3d = corners3d + self.pos
return corners3d
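    # Corner ordering produced above: indices 0-3 form the bottom face (y = 0
    # before the self.pos shift) and 4-7 the top face (y = -h), which matches
    # the convention assumed by boxes3d_to_corners3d and get_iou3d.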
def to_bev_box2d(self, oblique=True, voxel_size=0.1):
"""
:param bev_shape: (2) for bev shape (h, w), => (y_max, x_max) in image
:param voxel_size: float, 0.1m
:param oblique:
:return: box2d (4, 2)/ (4) in image coordinate
"""
if oblique:
corners3d = self.generate_corners3d()
xz_corners = corners3d[0:4, [0, 2]]
box2d = np.zeros((4, 2), dtype=np.int32)
box2d[:, 0] = ((xz_corners[:, 0] - Object3d.MIN_XZ[0]) / voxel_size).astype(np.int32)
box2d[:, 1] = Object3d.BEV_SHAPE[0] - 1 - ((xz_corners[:, 1] - Object3d.MIN_XZ[1]) / voxel_size).astype(np.int32)
box2d[:, 0] = np.clip(box2d[:, 0], 0, Object3d.BEV_SHAPE[1])
box2d[:, 1] = np.clip(box2d[:, 1], 0, Object3d.BEV_SHAPE[0])
else:
box2d = np.zeros(4, dtype=np.int32)
# discrete_center = np.floor((self.pos / voxel_size)).astype(np.int32)
cu = np.floor((self.pos[0] - Object3d.MIN_XZ[0]) / voxel_size).astype(np.int32)
cv = Object3d.BEV_SHAPE[0] - 1 - ((self.pos[2] - Object3d.MIN_XZ[1]) / voxel_size).astype(np.int32)
half_l, half_w = int(self.l / voxel_size / 2), int(self.w / voxel_size / 2)
box2d[0], box2d[1] = cu - half_l, cv - half_w
box2d[2], box2d[3] = cu + half_l, cv + half_w
return box2d
def to_str(self):
print_str = '%s %.3f %.3f %.3f box2d: %s hwl: [%.3f %.3f %.3f] pos: %s ry: %.3f' \
% (self.cls_type, self.trucation, self.occlusion, self.alpha, self.box2d, self.h, self.w, self.l,
self.pos, self.ry)
return print_str
def to_kitti_format(self):
kitti_str = '%s %.2f %d %.2f %.2f %.2f %.2f %.2f %.2f %.2f %.2f %.2f %.2f %.2f %.2f' \
% (self.cls_type, self.trucation, int(self.occlusion), self.alpha, self.box2d[0], self.box2d[1],
self.box2d[2], self.box2d[3], self.h, self.w, self.l, self.pos[0], self.pos[1], self.pos[2],
self.ry)
return kitti_str
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
#Licensed under the Apache License, Version 2.0 (the "License");
#you may not use this file except in compliance with the License.
#You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
#Unless required by applicable law or agreed to in writing, software
#distributed under the License is distributed on an "AS IS" BASIS,
#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
#See the License for the specific language governing permissions and
#limitations under the License.
import numpy as np
cimport numpy as np
cimport cython
from libc.math cimport sin, cos
@cython.boundscheck(False)
@cython.wraparound(False)
cdef enlarge_box3d(np.ndarray boxes3d, float extra_width):
"""
:param boxes3d: (N, 7) [x, y, z, h, w, l, ry]
"""
if isinstance(boxes3d, np.ndarray):
large_boxes3d = boxes3d.copy()
else:
large_boxes3d = boxes3d.clone()
large_boxes3d[:, 3:6] += extra_width * 2
large_boxes3d[:, 1] += extra_width
return large_boxes3d
@cython.boundscheck(False)
@cython.wraparound(False)
cdef pt_in_box(float x, float y, float z, float cx, float bottom_y, float cz, float h, float w, float l, float angle):
    cdef float max_dis = 10.0
    cdef float cy = bottom_y - h / 2.0
    if ((abs(x - cx) > max_dis) or (abs(y - cy) > h / 2.0) or (abs(z - cz) > max_dis)):
return 0
cdef float cosa = cos(angle)
cdef float sina = sin(angle)
cdef float x_rot = (x - cx) * cosa + (z - cz) * (-sina)
cdef float z_rot = (x - cx) * sina + (z - cz) * cosa
cdef float flag = (x_rot >= -l / 2.0) and (x_rot <= l / 2.0) and (z_rot >= -w / 2.0) and (z_rot <= w / 2.0)
return flag
@cython.boundscheck(False)
@cython.wraparound(False)
cdef _rotate_pc_along_y(np.ndarray pc, float rot_angle):
"""
params pc: (N, 3+C), (N, 3) is in the rectified camera coordinate
params rot_angle: rad scalar
Output pc: updated pc with XYZ rotated
"""
cosval = np.cos(rot_angle)
sinval = np.sin(rot_angle)
rotmat = np.array([[cosval, -sinval], [sinval, cosval]])
pc[:, [0, 2]] = np.dot(pc[:, [0, 2]], np.transpose(rotmat))
return pc
@cython.boundscheck(False)
@cython.wraparound(False)
def roipool3d_cpu(
np.ndarray[float, ndim=2] pts,
np.ndarray[float, ndim=2] pts_feature,
np.ndarray[float, ndim=2] boxes3d,
np.ndarray[float, ndim=2] pts_extra_input,
    float pool_extra_width, int sampled_pt_num, int batch_size=1, bint canonical_transform=False):
cdef np.ndarray pts_feature_all = np.concatenate((pts_extra_input, pts_feature), axis=1)
cdef np.ndarray larged_boxes3d = enlarge_box3d(boxes3d.reshape(-1, 7), pool_extra_width).reshape(batch_size, -1, 7)
    cdef int pts_num = pts.shape[0]
cdef int boxes_num = boxes3d.shape[0]
cdef int feature_len = pts_feature_all.shape[1]
cdef np.ndarray pts_data = np.zeros((batch_size, boxes_num, sampled_pt_num, 3))
cdef np.ndarray features_data = np.zeros((batch_size, boxes_num, sampled_pt_num, feature_len))
cdef np.ndarray empty_flag_data = np.zeros((batch_size, boxes_num))
cdef int cnt = 0
cdef float cx = 0.
cdef float bottom_y = 0.
cdef float cz = 0.
cdef float h = 0.
cdef float w = 0.
cdef float l = 0.
cdef float ry = 0.
cdef float x = 0.
cdef float y = 0.
cdef float z = 0.
cdef np.ndarray x_i
cdef np.ndarray feat_i
cdef int bs
cdef int i
cdef int j
for bs in range(batch_size):
# boxes: 64,7
for i in range(boxes_num):
cnt = 0
# box
box = larged_boxes3d[bs][i]
cx = box[0]
bottom_y = box[1]
cz = box[2]
h = box[3]
w = box[4]
l = box[5]
ry = box[6]
# points: 16384,3
x_i = pts
# features: 16384, 128
feat_i = pts_feature_all
for j in range(pts_num):
x = x_i[j][0]
y = x_i[j][1]
z = x_i[j][2]
                cur_in_flag = pt_in_box(x, y, z, cx, bottom_y, cz, h, w, l, ry)
if cur_in_flag:
if cnt < sampled_pt_num:
pts_data[bs][i][cnt][:] = x_i[j]
features_data[bs][i][cnt][:] = feat_i[j]
cnt += 1
else:
break
if cnt == 0:
empty_flag_data[bs][i] = 1
elif (cnt < sampled_pt_num):
for k in range(cnt, sampled_pt_num):
pts_data[bs][i][k] = pts_data[bs][i][k % cnt]
features_data[bs][i][k] = features_data[bs][i][k % cnt]
pooled_pts = pts_data.astype("float32")[0]
pooled_features = features_data.astype('float32')[0]
pooled_empty_flag = empty_flag_data.astype('int64')[0]
cdef int extra_input_len = pts_extra_input.shape[1]
    pooled_pts = np.concatenate((pooled_pts, pooled_features[:, :, 0:extra_input_len]), axis=2)
    pooled_features = pooled_features[:, :, extra_input_len:]
if canonical_transform:
# Translate to the roi coordinates
roi_ry = boxes3d[:, 6] % (2 * np.pi) # 0~2pi
roi_center = boxes3d[:, 0:3]
# shift to center
pooled_pts[:, :, 0:3] = pooled_pts[:, :, 0:3] - roi_center[:, np.newaxis, :]
for k in range(pooled_pts.shape[0]):
pooled_pts[k] = _rotate_pc_along_y(pooled_pts[k], roi_ry[k])
return pooled_pts, pooled_features, pooled_empty_flag
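# Output shapes for the single-batch path taken above (batch_size == 1):
#   pooled_pts:        (boxes_num, sampled_pt_num, 3 + pts_extra_input.shape[1])
#   pooled_features:   (boxes_num, sampled_pt_num, pts_feature.shape[1])
#   pooled_empty_flag: (boxes_num,), 1 where a box contains no points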
# Copyright (c) 2017-present, Facebook, Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
##############################################################################
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from Cython.Build import cythonize
from setuptools import Extension
from setuptools import setup
import numpy as np
_NP_INCLUDE_DIRS = np.get_include()
# Extension modules
ext_modules = [
Extension(
name='utils.cyops.roipool3d_utils',
sources=[
'utils/cyops/roipool3d_utils.pyx'
],
extra_compile_args=[
'-Wno-cpp'
],
include_dirs=[
_NP_INCLUDE_DIRS
]
),
Extension(
name='utils.cyops.iou3d_utils',
sources=[
'utils/cyops/iou3d_utils.pyx'
],
extra_compile_args=[
'-Wno-cpp'
],
include_dirs=[
_NP_INCLUDE_DIRS
]
),
Extension(
name='utils.cyops.kitti_utils',
sources=[
'utils/cyops/kitti_utils.pyx'
],
extra_compile_args=[
'-Wno-cpp'
],
include_dirs=[
_NP_INCLUDE_DIRS
]
),
]
setup(
name='pp_pointrcnn',
ext_modules=cythonize(ext_modules)
)
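A quick smoke test after building the extensions (a sketch; it assumes `python setup.py build_ext --inplace` was run from the PointRCNN root so the `utils.cyops.*` modules are importable):
```
import numpy as np
from utils.cyops import kitti_utils, iou3d_utils, roipool3d_utils

# a single degenerate box is enough to check the modules load and run
boxes = np.zeros((1, 7), dtype=np.float32)
corners = kitti_utils.boxes3d_to_corners3d(boxes)
print(corners.shape)  # expected: (1, 8, 3)
```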
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import sys
import logging
import numpy as np
import utils.cyops.kitti_utils as kitti_utils
from utils.config import cfg
from utils.box_utils import boxes_iou3d, box_nms_eval, boxes3d_to_bev
from utils.save_utils import save_rpn_feature, save_kitti_result, save_kitti_format
__all__ = ['calc_iou_recall', 'rpn_metric', 'rcnn_metric']
logging.root.handlers = []
FORMAT = '%(asctime)s-%(levelname)s: %(message)s'
logging.basicConfig(level=logging.INFO, format=FORMAT, stream=sys.stdout)
logger = logging.getLogger(__name__)
def calc_iou_recall(rets, thresh_list):
rpn_cls_label = rets['rpn_cls_label'][0]
boxes3d = rets['rois'][0]
seg_mask = rets['seg_mask'][0]
sample_id = rets['sample_id'][0]
gt_boxes3d = rets['gt_boxes3d'][0]
gt_boxes3d_num = rets['gt_boxes3d'][1]
gt_box_idx = 0
recalled_bbox_list = [0] * len(thresh_list)
gt_box_num = 0
rpn_iou_sum = 0.
for i in range(len(gt_boxes3d_num)):
cur_rpn_cls_label = rpn_cls_label[i]
cur_boxes3d = boxes3d[i]
cur_seg_mask = seg_mask[i]
cur_sample_id = sample_id[i]
cur_gt_boxes3d = gt_boxes3d[gt_box_idx: gt_box_idx +
gt_boxes3d_num[0][i]]
gt_box_idx += gt_boxes3d_num[0][i]
k = len(cur_gt_boxes3d) - 1
while k >= 0 and np.sum(cur_gt_boxes3d[k]) == 0:
k -= 1
cur_gt_boxes3d = cur_gt_boxes3d[:k + 1]
if cur_gt_boxes3d.shape[0] > 0:
iou3d = boxes_iou3d(cur_boxes3d, cur_gt_boxes3d[:, 0:7])
gt_max_iou = iou3d.max(axis=0)
for idx, thresh in enumerate(thresh_list):
recalled_bbox_list[idx] += np.sum(gt_max_iou > thresh)
gt_box_num += len(cur_gt_boxes3d)
fg_mask = cur_rpn_cls_label > 0
correct = np.sum(np.logical_and(
cur_seg_mask == cur_rpn_cls_label, fg_mask))
union = np.sum(fg_mask) + np.sum(cur_seg_mask > 0) - correct
rpn_iou = float(correct) / max(float(union), 1.0)
rpn_iou_sum += rpn_iou
logger.debug('sample_id:{}, rpn_iou:{}, gt_box_num:{}, recalled_bbox_list:{}'.format(
cur_sample_id, rpn_iou, gt_box_num, str(recalled_bbox_list)))
return len(gt_boxes3d_num), gt_box_num, rpn_iou_sum, recalled_bbox_list
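The recall bookkeeping above reduces to thresholding the per-ground-truth maximum IoU. A small NumPy sketch with a made-up IoU matrix (rows are proposals, columns are ground-truth boxes):
```
import numpy as np

iou3d = np.array([[0.1, 0.8],
                  [0.6, 0.2],
                  [0.4, 0.9]])
gt_max_iou = iou3d.max(axis=0)  # best proposal IoU per gt: [0.6, 0.9]
thresh_list = [0.1, 0.3, 0.5, 0.7, 0.9]
recalled = [int(np.sum(gt_max_iou > t)) for t in thresh_list]
print(recalled)  # [2, 2, 2, 1, 0]
```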
def rpn_metric(queue, mdict, lock, thresh_list, is_save_rpn_feature, kitti_feature_dir,
seg_output_dir, kitti_output_dir, kitti_rcnn_reader, classes):
while True:
rets_dict = queue.get()
if rets_dict is None:
lock.acquire()
mdict['exit_proc'] += 1
lock.release()
return
cnt, gt_box_num, rpn_iou_sum, recalled_bbox_list = calc_iou_recall(
rets_dict, thresh_list)
lock.acquire()
mdict['total_cnt'] += cnt
mdict['total_gt_bbox'] += gt_box_num
mdict['total_rpn_iou'] += rpn_iou_sum
for i, bbox_num in enumerate(recalled_bbox_list):
mdict['total_recalled_bbox_list_{}'.format(i)] += bbox_num
logger.debug("rpn_metric: {}".format(str(mdict)))
lock.release()
if is_save_rpn_feature:
save_rpn_feature(rets_dict, kitti_feature_dir)
save_kitti_result(
rets_dict, seg_output_dir, kitti_output_dir, kitti_rcnn_reader, classes)
def rcnn_metric(queue, mdict, lock, thresh_list, kitti_rcnn_reader, roi_output_dir,
refine_output_dir, final_output_dir, is_save_result=False):
while True:
rets_dict = queue.get()
if rets_dict is None:
lock.acquire()
mdict['exit_proc'] += 1
lock.release()
return
for k,v in rets_dict.items():
rets_dict[k] = v[0]
rcnn_cls = rets_dict['rcnn_cls']
rcnn_reg = rets_dict['rcnn_reg']
roi_boxes3d = rets_dict['roi_boxes3d']
roi_scores = rets_dict['roi_scores']
# bounding box regression
anchor_size = cfg.CLS_MEAN_SIZE[0]
pred_boxes3d = kitti_utils.decode_bbox_target(
roi_boxes3d,
rcnn_reg,
anchor_size=np.array(anchor_size),
loc_scope=cfg.RCNN.LOC_SCOPE,
loc_bin_size=cfg.RCNN.LOC_BIN_SIZE,
num_head_bin=cfg.RCNN.NUM_HEAD_BIN,
get_xz_fine=True,
get_y_by_bin=cfg.RCNN.LOC_Y_BY_BIN,
loc_y_scope=cfg.RCNN.LOC_Y_SCOPE,
loc_y_bin_size=cfg.RCNN.LOC_Y_BIN_SIZE,
get_ry_fine=True
)
# scoring
if rcnn_cls.shape[1] == 1:
raw_scores = rcnn_cls.reshape(-1)
norm_scores = rets_dict['norm_scores']
pred_classes = norm_scores > cfg.RCNN.SCORE_THRESH
pred_classes = pred_classes.astype(np.float32)
else:
pred_classes = np.argmax(rcnn_cls, axis=1).reshape(-1)
# gather the score of the predicted class for each RoI
raw_scores = rcnn_cls[np.arange(rcnn_cls.shape[0]), pred_classes]
# evaluation
gt_iou = rets_dict['gt_iou']
gt_boxes3d = rets_dict['gt_boxes3d']
# recall
if gt_boxes3d.size > 0:
gt_num = gt_boxes3d.shape[1]
gt_boxes3d = gt_boxes3d.reshape((-1,7))
iou3d = boxes_iou3d(pred_boxes3d, gt_boxes3d)
gt_max_iou = iou3d.max(axis=0)
refined_iou = iou3d.max(axis=1)
recalled_num = (gt_max_iou > 0.7).sum()
roi_boxes3d = roi_boxes3d.reshape((-1,7))
iou3d_in = boxes_iou3d(roi_boxes3d, gt_boxes3d)
gt_max_iou_in = iou3d_in.max(axis=0)
lock.acquire()
mdict['total_gt_bbox'] += gt_num
for idx, thresh in enumerate(thresh_list):
recalled_bbox_num = (gt_max_iou > thresh).sum()
mdict['total_recalled_bbox_list_{}'.format(idx)] += recalled_bbox_num
for idx, thresh in enumerate(thresh_list):
roi_recalled_bbox_num = (gt_max_iou_in > thresh).sum()
mdict['total_roi_recalled_bbox_list_{}'.format(idx)] += roi_recalled_bbox_num
lock.release()
# classification accuracy
cls_label = gt_iou > cfg.RCNN.CLS_FG_THRESH
cls_label = cls_label.astype(np.float32)
cls_valid_mask = (gt_iou >= cfg.RCNN.CLS_FG_THRESH) | (gt_iou <= cfg.RCNN.CLS_BG_THRESH)
cls_valid_mask = cls_valid_mask.astype(np.float32)
cls_acc = (pred_classes == cls_label).astype(np.float32)
cls_acc = (cls_acc * cls_valid_mask).sum() / max(cls_valid_mask.sum(), 1.0) * 1.0
iou_thresh = 0.7 if cfg.CLASSES == 'Car' else 0.5
cls_label_refined = (gt_iou >= iou_thresh)
cls_label_refined = cls_label_refined.astype(np.float32)
cls_acc_refined = (pred_classes == cls_label_refined).astype(np.float32).sum() / max(cls_label_refined.shape[0], 1.0)
sample_id = rets_dict['sample_id']
image_shape = kitti_rcnn_reader.get_image_shape(sample_id)
if is_save_result:
roi_boxes3d_np = roi_boxes3d
pred_boxes3d_np = pred_boxes3d
calib = kitti_rcnn_reader.get_calib(sample_id)
save_kitti_format(sample_id, calib, roi_boxes3d_np, roi_output_dir, roi_scores, image_shape)
save_kitti_format(sample_id, calib, pred_boxes3d_np, refine_output_dir, raw_scores, image_shape)
inds = norm_scores > cfg.RCNN.SCORE_THRESH
if inds.astype(np.float32).sum() == 0:
logger.debug("The num of 'norm_scores > thresh' of sample {} is 0".format(sample_id))
continue
pred_boxes3d_selected = pred_boxes3d[inds]
raw_scores_selected = raw_scores[inds]
# NMS thresh
boxes_bev_selected = boxes3d_to_bev(pred_boxes3d_selected)
scores_selected, pred_boxes3d_selected = box_nms_eval(boxes_bev_selected, raw_scores_selected, pred_boxes3d_selected, cfg.RCNN.NMS_THRESH)
calib = kitti_rcnn_reader.get_calib(sample_id)
save_kitti_format(sample_id, calib, pred_boxes3d_selected, final_output_dir, scores_selected, image_shape)
lock.acquire()
mdict['total_det_num'] += pred_boxes3d_selected.shape[0]
mdict['total_cls_acc'] += cls_acc
mdict['total_cls_acc_refined'] += cls_acc_refined
lock.release()
logger.debug("rcnn_metric: {}".format(str(mdict)))
"""
This code is borrow from https://github.com/sshaoshuai/PointRCNN/blob/master/lib/utils/object3d.py
"""
import numpy as np
def cls_type_to_id(cls_type):
type_to_id = {'Car': 1, 'Pedestrian': 2, 'Cyclist': 3, 'Van': 4}
if cls_type not in type_to_id.keys():
return -1
return type_to_id[cls_type]
def get_objects_from_label(label_file):
with open(label_file, 'r') as f:
lines = f.readlines()
objects = [Object3d(line) for line in lines]
return objects
class Object3d(object):
def __init__(self, line):
label = line.strip().split(' ')
self.src = line
self.cls_type = label[0]
self.cls_id = cls_type_to_id(self.cls_type)
self.trucation = float(label[1]) # KITTI truncation ratio in [0, 1]
self.occlusion = float(label[2]) # 0:fully visible 1:partly occluded 2:largely occluded 3:unknown
self.alpha = float(label[3])
self.box2d = np.array((float(label[4]), float(label[5]), float(label[6]), float(label[7])), dtype=np.float32)
self.h = float(label[8])
self.w = float(label[9])
self.l = float(label[10])
self.pos = np.array((float(label[11]), float(label[12]), float(label[13])), dtype=np.float32)
self.dis_to_cam = np.linalg.norm(self.pos)
self.ry = float(label[14])
self.score = float(label[15]) if len(label) == 16 else -1.0
self.level_str = None
self.level = self.get_obj_level()
def get_obj_level(self):
height = float(self.box2d[3]) - float(self.box2d[1]) + 1
if height >= 40 and self.trucation <= 0.15 and self.occlusion <= 0:
self.level_str = 'Easy'
return 1 # Easy
elif height >= 25 and self.trucation <= 0.3 and self.occlusion <= 1:
self.level_str = 'Moderate'
return 2 # Moderate
elif height >= 25 and self.trucation <= 0.5 and self.occlusion <= 2:
self.level_str = 'Hard'
return 3 # Hard
else:
self.level_str = 'UnKnown'
return 4
def generate_corners3d(self):
"""
generate corners3d representation for this object
:return corners_3d: (8, 3) corners of box3d in camera coord
"""
l, h, w = self.l, self.h, self.w
x_corners = [l / 2, l / 2, -l / 2, -l / 2, l / 2, l / 2, -l / 2, -l / 2]
y_corners = [0, 0, 0, 0, -h, -h, -h, -h]
z_corners = [w / 2, -w / 2, -w / 2, w / 2, w / 2, -w / 2, -w / 2, w / 2]
R = np.array([[np.cos(self.ry), 0, np.sin(self.ry)],
[0, 1, 0],
[-np.sin(self.ry), 0, np.cos(self.ry)]])
corners3d = np.vstack([x_corners, y_corners, z_corners]) # (3, 8)
corners3d = np.dot(R, corners3d).T
corners3d = corners3d + self.pos
return corners3d
def to_bev_box2d(self, oblique=True, voxel_size=0.1):
"""
:param oblique: bool, use the rotated box corners if True, else an axis-aligned box
:param voxel_size: float, BEV grid resolution in meters (0.1m)
:return: box2d (4, 2) / (4,) in BEV image coordinates
Note: Object3d.MIN_XZ and Object3d.BEV_SHAPE ((h, w) of the BEV image,
i.e. (y_max, x_max)) must be set as class attributes by the caller
before this method is used.
"""
if oblique:
corners3d = self.generate_corners3d()
xz_corners = corners3d[0:4, [0, 2]]
box2d = np.zeros((4, 2), dtype=np.int32)
box2d[:, 0] = ((xz_corners[:, 0] - Object3d.MIN_XZ[0]) / voxel_size).astype(np.int32)
box2d[:, 1] = Object3d.BEV_SHAPE[0] - 1 - ((xz_corners[:, 1] - Object3d.MIN_XZ[1]) / voxel_size).astype(np.int32)
box2d[:, 0] = np.clip(box2d[:, 0], 0, Object3d.BEV_SHAPE[1])
box2d[:, 1] = np.clip(box2d[:, 1], 0, Object3d.BEV_SHAPE[0])
else:
box2d = np.zeros(4, dtype=np.int32)
# discrete_center = np.floor((self.pos / voxel_size)).astype(np.int32)
cu = np.floor((self.pos[0] - Object3d.MIN_XZ[0]) / voxel_size).astype(np.int32)
cv = Object3d.BEV_SHAPE[0] - 1 - ((self.pos[2] - Object3d.MIN_XZ[1]) / voxel_size).astype(np.int32)
half_l, half_w = int(self.l / voxel_size / 2), int(self.w / voxel_size / 2)
box2d[0], box2d[1] = cu - half_l, cv - half_w
box2d[2], box2d[3] = cu + half_l, cv + half_w
return box2d
def to_str(self):
print_str = '%s %.3f %.3f %.3f box2d: %s hwl: [%.3f %.3f %.3f] pos: %s ry: %.3f' \
% (self.cls_type, self.trucation, self.occlusion, self.alpha, self.box2d, self.h, self.w, self.l,
self.pos, self.ry)
return print_str
def to_kitti_format(self):
kitti_str = '%s %.2f %d %.2f %.2f %.2f %.2f %.2f %.2f %.2f %.2f %.2f %.2f %.2f %.2f' \
% (self.cls_type, self.trucation, int(self.occlusion), self.alpha, self.box2d[0], self.box2d[1],
self.box2d[2], self.box2d[3], self.h, self.w, self.l, self.pos[0], self.pos[1], self.pos[2],
self.ry)
return kitti_str
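`generate_corners3d` places the corners relative to the KITTI bottom-center origin (the camera y axis points down, so the top face sits at -h before translation) and rotates them around y. A worked sketch for an axis-aligned box, bypassing the label parser only for the demo:
```
import numpy as np

obj = Object3d.__new__(Object3d)  # skip __init__; set only what the method reads
obj.l, obj.h, obj.w = 4.0, 1.5, 2.0
obj.ry = 0.0
obj.pos = np.array([10.0, 1.0, 20.0], dtype=np.float32)
corners = obj.generate_corners3d()
print(corners.shape)        # (8, 3)
print(corners[:, 1].min())  # -0.5, i.e. pos_y - h (top face)
```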
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Optimization and learning rate scheduling."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import numpy as np
import paddle.fluid as fluid
import paddle.fluid.layers.learning_rate_scheduler as lr_scheduler
from paddle.fluid.layers import control_flow
import logging
logger = logging.getLogger(__name__)
def cosine_warmup_decay(learning_rate, betas, warmup_factor, decay_factor,
total_step, warmup_pct):
def annealing_cos(start, end, pct):
"Cosine anneal from `start` to `end` as pct goes from 0.0 to 1.0."
cos_out = fluid.layers.cos(pct * np.pi) + 1.
return cos_out * (start - end) / 2. + end
warmup_start_lr = learning_rate * warmup_factor
decay_end_lr = learning_rate * decay_factor
warmup_step = total_step * warmup_pct
global_step = lr_scheduler._decay_step_counter()
lr = fluid.layers.create_global_var(
shape=[1],
value=float(learning_rate),
dtype='float32',
persistable=True,
name="learning_rate")
beta1 = fluid.layers.create_global_var(
shape=[1],
value=float(betas[0]),
dtype='float32',
persistable=True,
name="beta1")
warmup_step_var = fluid.layers.fill_constant(
shape=[1], dtype='float32', value=float(warmup_step), force_cpu=True)
with control_flow.Switch() as switch:
with switch.case(global_step < warmup_step_var):
cur_lr = annealing_cos(warmup_start_lr, learning_rate,
global_step / warmup_step_var)
fluid.layers.assign(cur_lr, lr)
cur_beta1 = annealing_cos(betas[0], betas[1],
global_step / warmup_step_var)
fluid.layers.assign(cur_beta1, beta1)
with switch.case(global_step >= warmup_step_var):
cur_lr = annealing_cos(learning_rate, decay_end_lr,
(global_step - warmup_step_var) / (total_step - warmup_step))
fluid.layers.assign(cur_lr, lr)
cur_beta1 = annealing_cos(betas[1], betas[0],
(global_step - warmup_step_var) / (total_step - warmup_step))
fluid.layers.assign(cur_beta1, beta1)
return lr, beta1
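The schedule anneals the learning rate from warmup_start_lr up to learning_rate over the warmup span, then down to decay_end_lr, with beta1 following the mirrored curve. A plain-NumPy sketch of the same curve, using made-up hyperparameters, handy for checking what the graph version should produce:
```
import numpy as np

def annealing_cos_np(start, end, pct):
    return (np.cos(pct * np.pi) + 1.) * (start - end) / 2. + end

lr, warmup_factor, decay_factor = 0.002, 0.1, 1e-5
total_step, warmup_pct = 1000, 0.3
warmup_step = total_step * warmup_pct
for step in [0, 150, 300, 1000]:
    if step < warmup_step:
        cur = annealing_cos_np(lr * warmup_factor, lr, step / warmup_step)
    else:
        cur = annealing_cos_np(lr, lr * decay_factor,
                               (step - warmup_step) / (total_step - warmup_step))
    print(step, cur)  # 0.0002 -> 0.002 -> 2e-08
```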
def optimize(loss,
learning_rate,
warmup_factor,
decay_factor,
total_step,
warmup_pct,
train_program,
startup_prog,
weight_decay,
clip_norm,
beta1=[0.95, 0.85],
beta2=0.99,
scheduler='cosine_warmup_decay'):
scheduled_lr = None
if scheduler == 'cosine_warmup_decay':
scheduled_lr, scheduled_beta1 = cosine_warmup_decay(learning_rate, beta1, warmup_factor,
decay_factor, total_step,
warmup_pct)
else:
raise ValueError("Unknown learning rate scheduler, should be "
"'cosine_warmup_decay'")
optimizer = fluid.optimizer.Adam(learning_rate=scheduled_lr,
beta1=scheduled_beta1,
beta2=beta2)
fluid.clip.set_gradient_clip(
clip=fluid.clip.GradientClipByGlobalNorm(clip_norm=clip_norm))
param_list = dict()
if weight_decay > 0:
for param in train_program.global_block().all_parameters():
param_list[param.name] = param * 1.0
param_list[param.name].stop_gradient = True
_, param_grads = optimizer.minimize(loss)
if weight_decay > 0:
for param, grad in param_grads:
with param.block.program._optimized_guard(
[param, grad]), fluid.framework.name_scope("weight_decay"):
updated_param = param - param_list[
param.name] * weight_decay * scheduled_lr
fluid.layers.assign(output=param, input=updated_param)
return scheduled_lr
import numpy as np
from utils.cyops import kitti_utils, roipool3d_utils, iou3d_utils
CLOSE_RANDOM = False  # set True to disable random sampling for deterministic debugging
def get_proposal_target_func(cfg, mode='TRAIN'):
def sample_rois_for_rcnn(roi_boxes3d, gt_boxes3d):
"""
:param roi_boxes3d: (B, M, 7)
:param gt_boxes3d: (B, N, 8) [x, y, z, h, w, l, ry, cls]
:return
batch_rois: (B, N, 7)
batch_gt_of_rois: (B, N, 8)
batch_roi_iou: (B, N)
"""
batch_size = roi_boxes3d.shape[0]
#batch_size = 1
fg_rois_per_image = int(np.round(cfg.RCNN.FG_RATIO * cfg.RCNN.ROI_PER_IMAGE))
batch_rois = np.zeros((batch_size, cfg.RCNN.ROI_PER_IMAGE, 7))
batch_gt_of_rois = np.zeros((batch_size, cfg.RCNN.ROI_PER_IMAGE, 7))
batch_roi_iou = np.zeros((batch_size, cfg.RCNN.ROI_PER_IMAGE))
for idx in range(batch_size):
cur_roi, cur_gt = roi_boxes3d[idx], gt_boxes3d[idx]
k = cur_gt.shape[0] - 1
while cur_gt[k].sum() == 0:
k -= 1
cur_gt = cur_gt[:k + 1]
# include gt boxes in the candidate rois
iou3d = iou3d_utils.boxes_iou3d(cur_roi, cur_gt[:, 0:7]) # (M, N)
max_overlaps = np.max(iou3d, axis=1)
gt_assignment = np.argmax(iou3d, axis=1)
# sample fg, easy_bg, hard_bg
fg_thresh = min(cfg.RCNN.REG_FG_THRESH, cfg.RCNN.CLS_FG_THRESH)
fg_inds = np.where(max_overlaps >= fg_thresh)[0].reshape(-1)
# TODO: this will mix the fg and bg when CLS_BG_THRESH_LO < iou < CLS_BG_THRESH
# fg_inds = torch.cat((fg_inds, roi_assignment), dim=0) # consider the roi which has max_iou with gt as fg
easy_bg_inds = np.where(max_overlaps < cfg.RCNN.CLS_BG_THRESH_LO)[0].reshape(-1)
hard_bg_inds = np.where((max_overlaps < cfg.RCNN.CLS_BG_THRESH) & (max_overlaps >= cfg.RCNN.CLS_BG_THRESH_LO))[0].reshape(-1)
fg_num_rois = fg_inds.shape[0]
bg_num_rois = hard_bg_inds.shape[0] + easy_bg_inds.shape[0]
if fg_num_rois > 0 and bg_num_rois > 0:
# sampling fg
fg_rois_per_this_image = min(fg_rois_per_image, fg_num_rois)
if CLOSE_RANDOM:
fg_inds = fg_inds[:fg_rois_per_this_image]
else:
rand_num = np.random.permutation(fg_num_rois)
fg_inds = fg_inds[rand_num[:fg_rois_per_this_image]]
# sampling bg
bg_rois_per_this_image = cfg.RCNN.ROI_PER_IMAGE - fg_rois_per_this_image
bg_inds = sample_bg_inds(hard_bg_inds, easy_bg_inds, bg_rois_per_this_image)
elif fg_num_rois > 0 and bg_num_rois == 0:
# sampling fg
rand_num = np.floor(np.random.rand(cfg.RCNN.ROI_PER_IMAGE) * fg_num_rois).astype(np.int64)
fg_inds = fg_inds[rand_num]
fg_rois_per_this_image = cfg.RCNN.ROI_PER_IMAGE
bg_rois_per_this_image = 0
elif bg_num_rois > 0 and fg_num_rois == 0:
# sampling bg
bg_rois_per_this_image = cfg.RCNN.ROI_PER_IMAGE
bg_inds = sample_bg_inds(hard_bg_inds, easy_bg_inds, bg_rois_per_this_image)
fg_rois_per_this_image = 0
else:
raise NotImplementedError("no foreground or background RoIs to sample")
# augment the rois by noise
roi_list, roi_iou_list, roi_gt_list = [], [], []
if fg_rois_per_this_image > 0:
fg_rois_src = cur_roi[fg_inds]
gt_of_fg_rois = cur_gt[gt_assignment[fg_inds]]
iou3d_src = max_overlaps[fg_inds]
fg_rois, fg_iou3d = aug_roi_by_noise(
fg_rois_src, gt_of_fg_rois, iou3d_src, aug_times=cfg.RCNN.ROI_FG_AUG_TIMES)
roi_list.append(fg_rois)
roi_iou_list.append(fg_iou3d)
roi_gt_list.append(gt_of_fg_rois)
if bg_rois_per_this_image > 0:
bg_rois_src = cur_roi[bg_inds]
gt_of_bg_rois = cur_gt[gt_assignment[bg_inds]]
iou3d_src = max_overlaps[bg_inds]
aug_times = 1 if cfg.RCNN.ROI_FG_AUG_TIMES > 0 else 0
bg_rois, bg_iou3d = aug_roi_by_noise(
bg_rois_src, gt_of_bg_rois, iou3d_src, aug_times=aug_times)
roi_list.append(bg_rois)
roi_iou_list.append(bg_iou3d)
roi_gt_list.append(gt_of_bg_rois)
rois = np.concatenate(roi_list, axis=0)
iou_of_rois = np.concatenate(roi_iou_list, axis=0)
gt_of_rois = np.concatenate(roi_gt_list, axis=0)
batch_rois[idx] = rois
batch_gt_of_rois[idx] = gt_of_rois
batch_roi_iou[idx] = iou_of_rois
return batch_rois, batch_gt_of_rois, batch_roi_iou
def sample_bg_inds(hard_bg_inds, easy_bg_inds, bg_rois_per_this_image):
if hard_bg_inds.shape[0] > 0 and easy_bg_inds.shape[0] > 0:
hard_bg_rois_num = int(bg_rois_per_this_image * cfg.RCNN.HARD_BG_RATIO)
easy_bg_rois_num = bg_rois_per_this_image - hard_bg_rois_num
# sampling hard bg
if CLOSE_RANDOM:
rand_idx = list(np.arange(0,hard_bg_inds.shape[0]))*hard_bg_rois_num
rand_idx = rand_idx[:hard_bg_rois_num]
else:
rand_idx = np.random.randint(low=0, high=hard_bg_inds.shape[0], size=(hard_bg_rois_num,))
hard_bg_inds = hard_bg_inds[rand_idx]
# sampling easy bg
if CLOSE_RANDOM:
rand_idx = list(np.arange(0,easy_bg_inds.shape[0]))*easy_bg_rois_num
rand_idx = rand_idx[:easy_bg_rois_num]
else:
rand_idx = np.random.randint(low=0, high=easy_bg_inds.shape[0], size=(easy_bg_rois_num,))
easy_bg_inds = easy_bg_inds[rand_idx]
bg_inds = np.concatenate([hard_bg_inds, easy_bg_inds], axis=0)
elif hard_bg_inds.shape[0] > 0 and easy_bg_inds.shape[0] == 0:
hard_bg_rois_num = bg_rois_per_this_image
# sampling hard bg
rand_idx = np.random.randint(low=0, high=hard_bg_inds.shape[0], size=(hard_bg_rois_num,))
bg_inds = hard_bg_inds[rand_idx]
elif hard_bg_inds.shape[0] == 0 and easy_bg_inds.shape[0] > 0:
easy_bg_rois_num = bg_rois_per_this_image
# sampling easy bg
rand_idx = np.random.randint(low=0, high=easy_bg_inds.shape[0], size=(easy_bg_rois_num,))
bg_inds = easy_bg_inds[rand_idx]
else:
raise NotImplementedError
return bg_inds
def aug_roi_by_noise(roi_boxes3d, gt_boxes3d, iou3d_src, aug_times=10):
iou_of_rois = np.zeros(roi_boxes3d.shape[0]).astype(gt_boxes3d.dtype)
pos_thresh = min(cfg.RCNN.REG_FG_THRESH, cfg.RCNN.CLS_FG_THRESH)
for k in range(roi_boxes3d.shape[0]):
temp_iou = cnt = 0
roi_box3d = roi_boxes3d[k]
gt_box3d = gt_boxes3d[k].reshape(1, 7)
aug_box3d = roi_box3d
keep = True
while temp_iou < pos_thresh and cnt < aug_times:
if True:  # np.random.rand() < 0.2 -- the random-jitter branch is disabled, so the original roi box is always kept
aug_box3d = roi_box3d
keep = True
else:
aug_box3d = random_aug_box3d(roi_box3d)
keep = False
aug_box3d = aug_box3d.reshape((1, 7))
iou3d = iou3d_utils.boxes_iou3d(aug_box3d, gt_box3d)
temp_iou = iou3d[0][0]
cnt += 1
roi_boxes3d[k] = aug_box3d.reshape(-1)
if cnt == 0 or keep:
iou_of_rois[k] = iou3d_src[k]
else:
iou_of_rois[k] = temp_iou
return roi_boxes3d, iou_of_rois
def random_aug_box3d(box3d):
"""
:param box3d: (7) [x, y, z, h, w, l, ry]
random shift, scale, orientation
"""
if cfg.RCNN.REG_AUG_METHOD == 'single':
pos_shift = (np.random.rand(3) - 0.5)  # [-0.5, 0.5]
hwl_scale = (np.random.rand(3) - 0.5) / (0.5 / 0.15) + 1.0  # [0.85, 1.15]
angle_rot = (np.random.rand(1) - 0.5) / (0.5 / (np.pi / 12))  # [-pi/12, pi/12]
aug_box3d = np.concatenate([box3d[0:3] + pos_shift, box3d[3:6] * hwl_scale, box3d[6:7] + angle_rot], axis=0)
return aug_box3d
elif cfg.RCNN.REG_AUG_METHOD == 'multiple':
# pos_range, hwl_range, angle_range, mean_iou
range_config = [[0.2, 0.1, np.pi / 12, 0.7],
[0.3, 0.15, np.pi / 12, 0.6],
[0.5, 0.15, np.pi / 9, 0.5],
[0.8, 0.15, np.pi / 6, 0.3],
[1.0, 0.15, np.pi / 3, 0.2]]
idx = np.random.randint(low=0, high=len(range_config), size=(1,))[0]
pos_shift = ((np.random.rand(3) - 0.5) / 0.5) * range_config[idx][0]
hwl_scale = ((np.random.rand(3) - 0.5) / 0.5) * range_config[idx][1] + 1.0
angle_rot = ((np.random.rand(1) - 0.5) / 0.5) * range_config[idx][2]
aug_box3d = np.concatenate([box3d[0:3] + pos_shift, box3d[3:6] * hwl_scale, box3d[6:7] + angle_rot], axis=0)
return aug_box3d
elif cfg.RCNN.REG_AUG_METHOD == 'normal':
x_shift = np.random.normal(loc=0, scale=0.3)
y_shift = np.random.normal(loc=0, scale=0.2)
z_shift = np.random.normal(loc=0, scale=0.3)
h_shift = np.random.normal(loc=0, scale=0.25)
w_shift = np.random.normal(loc=0, scale=0.15)
l_shift = np.random.normal(loc=0, scale=0.5)
ry_shift = ((np.random.rand() - 0.5) / 0.5) * np.pi / 12
aug_box3d = np.array([box3d[0] + x_shift, box3d[1] + y_shift, box3d[2] + z_shift, box3d[3] + h_shift,
box3d[4] + w_shift, box3d[5] + l_shift, box3d[6] + ry_shift], dtype=np.float32)
aug_box3d = aug_box3d.astype(box3d.dtype)
return aug_box3d
else:
raise NotImplementedError
def data_augmentation(pts, rois, gt_of_rois):
"""
:param pts: (B, M, 512, 3)
:param rois: (B, M, 7)
:param gt_of_rois: (B, M, 7)
:return:
"""
batch_size, boxes_num = pts.shape[0], pts.shape[1]
# rotation augmentation
angles = ((np.random.rand(batch_size, boxes_num) - 0.5) / 0.5) * (np.pi / cfg.AUG_ROT_RANGE)  # symmetric range [-pi/AUG_ROT_RANGE, pi/AUG_ROT_RANGE]
# calculate gt alpha from gt_of_rois
temp_x, temp_z, temp_ry = gt_of_rois[:, :, 0], gt_of_rois[:, :, 2], gt_of_rois[:, :, 6]
temp_beta = np.arctan2(temp_z, temp_x)
gt_alpha = -np.sign(temp_beta) * np.pi / 2 + temp_beta + temp_ry # (B, M)
temp_x, temp_z, temp_ry = rois[:, :, 0], rois[:, :, 2], rois[:, :, 6]
temp_beta = np.arctan2(temp_z, temp_x)
roi_alpha = -np.sign(temp_beta) * np.pi / 2 + temp_beta + temp_ry # (B, M)
for k in range(batch_size):
pts[k] = kitti_utils.rotate_pc_along_y_np(pts[k], angles[k])
gt_of_rois[k] = np.squeeze(kitti_utils.rotate_pc_along_y_np(
np.expand_dims(gt_of_rois[k], axis=1), angles[k]), axis=1)
rois[k] = np.squeeze(kitti_utils.rotate_pc_along_y_np(
np.expand_dims(rois[k], axis=1), angles[k]),axis=1)
# calculate the ry after rotation
temp_x, temp_z = gt_of_rois[:, :, 0], gt_of_rois[:, :, 2]
temp_beta = np.arctan2(temp_z, temp_x)
gt_of_rois[:, :, 6] = np.sign(temp_beta) * np.pi / 2 + gt_alpha - temp_beta
temp_x, temp_z = rois[:, :, 0], rois[:, :, 2]
temp_beta = np.arctan2(temp_z, temp_x)
rois[:, :, 6] = np.sign(temp_beta) * np.pi / 2 + roi_alpha - temp_beta
# scaling augmentation
scales = 1 + ((np.random.rand(batch_size, boxes_num) - 0.5) / 0.5) * 0.05
pts = pts * np.expand_dims(np.expand_dims(scales, axis=2), axis=3)
gt_of_rois[:, :, 0:6] = gt_of_rois[:, :, 0:6] * np.expand_dims(scales, axis=2)
rois[:, :, 0:6] = rois[:, :, 0:6] * np.expand_dims(scales, axis=2)
# flip augmentation
flip_flag = np.sign(np.random.rand(batch_size, boxes_num) - 0.5)
pts[:, :, :, 0] = pts[:, :, :, 0] * np.expand_dims(flip_flag, axis=2)
gt_of_rois[:, :, 0] = gt_of_rois[:, :, 0] * flip_flag
# flip orientation: ry > 0: pi - ry, ry < 0: -pi - ry
src_ry = gt_of_rois[:, :, 6]
ry = (flip_flag == 1).astype(np.float32) * src_ry + (flip_flag == -1).astype(np.float32) * (np.sign(src_ry) * np.pi - src_ry)
gt_of_rois[:, :, 6] = ry
rois[:, :, 0] = rois[:, :, 0] * flip_flag
# flip orientation: ry > 0: pi - ry, ry < 0: -pi - ry
src_ry = rois[:, :, 6]
ry = (flip_flag == 1).astype(np.float32) * src_ry + (flip_flag == -1).astype(np.float32) * (np.sign(src_ry) * np.pi - src_ry)
rois[:, :, 6] = ry
return pts, rois, gt_of_rois
def generate_proposal_target(seg_mask, rpn_features, gt_boxes3d, rpn_xyz, pts_depth, roi_boxes3d, rpn_intensity):
seg_mask = np.array(seg_mask)
features = np.array(rpn_features)
gt_boxes3d = np.array(gt_boxes3d)
rpn_xyz = np.array(rpn_xyz)
pts_depth = np.array(pts_depth)
roi_boxes3d = np.array(roi_boxes3d)
rpn_intensity = np.array(rpn_intensity)
batch_rois, batch_gt_of_rois, batch_roi_iou = sample_rois_for_rcnn(roi_boxes3d, gt_boxes3d)
if cfg.RCNN.USE_INTENSITY:
pts_extra_input_list = [np.expand_dims(rpn_intensity, axis=2),
np.expand_dims(seg_mask, axis=2)]
else:
pts_extra_input_list = [np.expand_dims(seg_mask, axis=2)]
if cfg.RCNN.USE_DEPTH:
pts_depth = pts_depth / 70.0 - 0.5
pts_extra_input_list.append(np.expand_dims(pts_depth, axis=2))
pts_extra_input = np.concatenate(pts_extra_input_list, axis=2)
# point cloud pooling
pts_feature = np.concatenate((pts_extra_input, rpn_features), axis=2)
batch_rois = batch_rois.astype(np.float32)
pooled_features, pooled_empty_flag = roipool3d_utils.roipool3d_gpu(
rpn_xyz, pts_feature, batch_rois, cfg.RCNN.POOL_EXTRA_WIDTH,
sampled_pt_num=cfg.RCNN.NUM_POINTS
)
sampled_pts, sampled_features = pooled_features[:, :, :, 0:3], pooled_features[:, :, :, 3:]
# data augmentation
if cfg.AUG_DATA:
# data augmentation
sampled_pts, batch_rois, batch_gt_of_rois = \
data_augmentation(sampled_pts, batch_rois, batch_gt_of_rois)
# canonical transformation
batch_size = batch_rois.shape[0]
roi_ry = batch_rois[:, :, 6] % (2 * np.pi)
roi_center = batch_rois[:, :, 0:3]
sampled_pts = sampled_pts - np.expand_dims(roi_center, axis=2) # (B, M, 512, 3)
batch_gt_of_rois[:, :, 0:3] = batch_gt_of_rois[:, :, 0:3] - roi_center
batch_gt_of_rois[:, :, 6] = batch_gt_of_rois[:, :, 6] - roi_ry
for k in range(batch_size):
sampled_pts[k] = kitti_utils.rotate_pc_along_y_np(sampled_pts[k], batch_rois[k, :, 6])
batch_gt_of_rois[k] = np.squeeze(kitti_utils.rotate_pc_along_y_np(
np.expand_dims(batch_gt_of_rois[k], axis=1), roi_ry[k]), axis=1)
# regression valid mask
valid_mask = (pooled_empty_flag == 0)
reg_valid_mask = ((batch_roi_iou > cfg.RCNN.REG_FG_THRESH) & valid_mask).astype(np.float32)
# classification label
batch_cls_label = (batch_roi_iou > cfg.RCNN.CLS_FG_THRESH).astype(np.int64)
invalid_mask = (batch_roi_iou > cfg.RCNN.CLS_BG_THRESH) & (batch_roi_iou < cfg.RCNN.CLS_FG_THRESH)
batch_cls_label[valid_mask == 0] = -1
batch_cls_label[invalid_mask > 0] = -1
output_dict = {'sampled_pts': sampled_pts.reshape(-1, cfg.RCNN.NUM_POINTS, 3).astype(np.float32),
'pts_feature': sampled_features.reshape(-1, cfg.RCNN.NUM_POINTS, sampled_features.shape[3]).astype(np.float32),
'cls_label': batch_cls_label.reshape(-1),
'reg_valid_mask': reg_valid_mask.reshape(-1).astype(np.float32),
'gt_of_rois': batch_gt_of_rois.reshape(-1, 7).astype(np.float32),
'gt_iou': batch_roi_iou.reshape(-1).astype(np.float32),
'roi_boxes3d': batch_rois.reshape(-1, 7).astype(np.float32)}
# dict preserves insertion order on Python 3.7+, so callers receive the
# outputs in the order inserted above
return output_dict.values()
return generate_proposal_target
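The canonical transformation above re-expresses each RoI's pooled points in the RoI's local frame: subtract the RoI center, then rotate by the RoI heading around y. A minimal sketch for a single RoI, following the same rotation convention as rotate_pc_along_y in this codebase (names are illustrative):
```
import numpy as np

def canonical_transform_np(pts, roi_box):
    # pts: (N, 3); roi_box: (7,) as [x, y, z, h, w, l, ry]
    ry = roi_box[6] % (2 * np.pi)
    local = pts - roi_box[0:3]                 # shift to the RoI center
    cosa, sina = np.cos(ry), np.sin(ry)
    R = np.array([[cosa, -sina], [sina, cosa]])
    local[:, [0, 2]] = local[:, [0, 2]] @ R.T  # rotate in the x-z plane
    return local
```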
if __name__ == "__main__":
input_dict = {}
input_dict['roi_boxes3d'] = np.load("models/rpn_data/roi_boxes3d.npy")
input_dict['gt_boxes3d'] = np.load("models/rpn_data/gt_boxes3d.npy")
input_dict['rpn_xyz'] = np.load("models/rpn_data/rpn_xyz.npy")
input_dict['rpn_features'] = np.load("models/rpn_data/rpn_features.npy")
input_dict['rpn_intensity'] = np.load("models/rpn_data/rpn_intensity.npy")
input_dict['seg_mask'] = np.load("models/rpn_data/seg_mask.npy")
input_dict['pts_depth'] = np.load("models/rpn_data/pts_depth.npy")
for k, v in input_dict.items():
print(k, v.shape, np.sum(np.abs(v)))
input_dict[k] = np.expand_dims(v, axis=0)
from utils.config import cfg
cfg.RPN.LOC_XZ_FINE = True
cfg.TEST.RPN_DISTANCE_BASED_PROPOSE = False
cfg.RPN.NMS_TYPE = 'rotate'
proposal_target_func = get_proposal_target_func(cfg)
out_dict = proposal_target_func(input_dict['seg_mask'],input_dict['rpn_features'],input_dict['gt_boxes3d'],
input_dict['rpn_xyz'],input_dict['pts_depth'],input_dict['roi_boxes3d'],input_dict['rpn_intensity'])
# generate_proposal_target returns a dict values view, so iterate by position
for i, v in enumerate(out_dict):
print("output {}: shape {}".format(i, v.shape))
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
Contains proposal functions
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import numpy as np
import paddle.fluid as fluid
import utils.box_utils as box_utils
from utils.config import cfg
__all__ = ["get_proposal_func"]
def get_proposal_func(cfg, mode='TRAIN'):
def decode_bbox_target(roi_box3d, pred_reg, anchor_size, loc_scope,
loc_bin_size, num_head_bin, get_xz_fine=True,
loc_y_scope=0.5, loc_y_bin_size=0.25,
get_y_by_bin=False, get_ry_fine=False):
per_loc_bin_num = int(loc_scope / loc_bin_size) * 2
loc_y_bin_num = int(loc_y_scope / loc_y_bin_size) * 2
# recover xz localization
x_bin_l, x_bin_r = 0, per_loc_bin_num
z_bin_l, z_bin_r = per_loc_bin_num, per_loc_bin_num * 2
start_offset = z_bin_r
x_bin = np.argmax(pred_reg[:, x_bin_l: x_bin_r], axis=1)
z_bin = np.argmax(pred_reg[:, z_bin_l: z_bin_r], axis=1)
pos_x = x_bin.astype('float32') * loc_bin_size + loc_bin_size / 2 - loc_scope
pos_z = z_bin.astype('float32') * loc_bin_size + loc_bin_size / 2 - loc_scope
if get_xz_fine:
x_res_l, x_res_r = per_loc_bin_num * 2, per_loc_bin_num * 3
z_res_l, z_res_r = per_loc_bin_num * 3, per_loc_bin_num * 4
start_offset = z_res_r
x_res_norm = pred_reg[:, x_res_l:x_res_r][np.arange(len(x_bin)), x_bin]
z_res_norm = pred_reg[:, z_res_l:z_res_r][np.arange(len(z_bin)), z_bin]
x_res = x_res_norm * loc_bin_size
z_res = z_res_norm * loc_bin_size
pos_x += x_res
pos_z += z_res
# recover y localization
if get_y_by_bin:
y_bin_l, y_bin_r = start_offset, start_offset + loc_y_bin_num
y_res_l, y_res_r = y_bin_r, y_bin_r + loc_y_bin_num
start_offset = y_res_r
y_bin = np.argmax(pred_reg[:, y_bin_l: y_bin_r], axis=1)
y_res_norm = pred_reg[:, y_res_l:y_res_r][np.arange(len(y_bin)), y_bin]
y_res = y_res_norm * loc_y_bin_size
pos_y = y_bin.astype('float32') * loc_y_bin_size + loc_y_bin_size / 2 - loc_y_scope + y_res
pos_y = pos_y + np.array(roi_box3d[:, 1]).reshape(-1)
else:
y_offset_l, y_offset_r = start_offset, start_offset + 1
start_offset = y_offset_r
pos_y = np.array(roi_box3d[:, 1]) + np.array(pred_reg[:, y_offset_l])
pos_y = pos_y.reshape(-1)
# recover ry rotation
ry_bin_l, ry_bin_r = start_offset, start_offset + num_head_bin
ry_res_l, ry_res_r = ry_bin_r, ry_bin_r + num_head_bin
ry_bin = np.argmax(pred_reg[:, ry_bin_l: ry_bin_r], axis=1)
ry_res_norm = pred_reg[:, ry_res_l:ry_res_r][np.arange(len(ry_bin)), ry_bin]
if get_ry_fine:
# divide pi/2 into several bins
angle_per_class = (np.pi / 2) / num_head_bin
ry_res = ry_res_norm * (angle_per_class / 2)
ry = (ry_bin.astype('float32') * angle_per_class + angle_per_class / 2) + ry_res - np.pi / 4
else:
angle_per_class = (2 * np.pi) / num_head_bin
ry_res = ry_res_norm * (angle_per_class / 2)
# bin_center is (0, 30, 60, 90, 120, ..., 270, 300, 330)
ry = np.fmod(ry_bin.astype('float32') * angle_per_class + ry_res, 2 * np.pi)
ry[ry > np.pi] -= 2 * np.pi
# recover size
size_res_l, size_res_r = ry_res_r, ry_res_r + 3
assert size_res_r == pred_reg.shape[1]
size_res_norm = pred_reg[:, size_res_l: size_res_r]
hwl = size_res_norm * anchor_size + anchor_size
def rotate_pc_along_y(pc, angle):
cosa = np.cos(angle).reshape(-1, 1)
sina = np.sin(angle).reshape(-1, 1)
R = np.concatenate([cosa, -sina, sina, cosa], axis=-1).reshape(-1, 2, 2)
pc_temp = pc[:, [0, 2]].reshape(-1, 1, 2)
pc[:, [0, 2]] = np.matmul(pc_temp, R.transpose(0, 2, 1)).reshape(-1, 2)
return pc
# shift to original coords
roi_center = np.array(roi_box3d[:, 0:3])
shift_ret_box3d = np.concatenate((
pos_x.reshape(-1, 1),
pos_y.reshape(-1, 1),
pos_z.reshape(-1, 1),
hwl, ry.reshape(-1, 1)), axis=1)
ret_box3d = shift_ret_box3d
if roi_box3d.shape[1] == 7:
roi_ry = np.array(roi_box3d[:, 6]).reshape(-1)
ret_box3d = rotate_pc_along_y(np.array(shift_ret_box3d), -roi_ry)
ret_box3d[:, 6] += roi_ry
ret_box3d[:, [0, 2]] += roi_center[:, [0, 2]]
return ret_box3d
def distance_based_proposal(scores, proposals, sorted_idxs):
nms_range_list = [0, 40.0, 80.0]
pre_tot_top_n = cfg[mode].RPN_PRE_NMS_TOP_N
pre_top_n_list = [0, int(pre_tot_top_n * 0.7), pre_tot_top_n - int(pre_tot_top_n * 0.7)]
post_tot_top_n = cfg[mode].RPN_POST_NMS_TOP_N
post_top_n_list = [0, int(post_tot_top_n * 0.7), post_tot_top_n - int(post_tot_top_n * 0.7)]
batch_size = scores.shape[0]
ret_proposals = np.zeros((batch_size, cfg[mode].RPN_POST_NMS_TOP_N, 7), dtype='float32')
ret_scores = np.zeros((batch_size, cfg[mode].RPN_POST_NMS_TOP_N, 1), dtype='float32')
for b, (score, proposal, sorted_idx) in enumerate(zip(scores, proposals, sorted_idxs)):
# sort by score
score_ord = score[sorted_idx]
proposal_ord = proposal[sorted_idx]
dist = proposal_ord[:, 2]
first_mask = (dist > nms_range_list[0]) & (dist <= nms_range_list[1])
scores_single_list, proposals_single_list = [], []
for i in range(1, len(nms_range_list)):
# get proposal distance mask
dist_mask = ((dist > nms_range_list[i - 1]) & (dist <= nms_range_list[i]))
if dist_mask.sum() != 0:
# this area has points, reduce by mask
cur_scores = score_ord[dist_mask]
cur_proposals = proposal_ord[dist_mask]
# fetch pre nms top K
cur_scores = cur_scores[:pre_top_n_list[i]]
cur_proposals = cur_proposals[:pre_top_n_list[i]]
else:
assert i == 2, '%d' % i
# this area doesn't have any points, so use rois of first area
cur_scores = score_ord[first_mask]
cur_proposals = proposal_ord[first_mask]
# fetch top K of first area
cur_scores = cur_scores[pre_top_n_list[i - 1]:][:pre_top_n_list[i]]
cur_proposals = cur_proposals[pre_top_n_list[i - 1]:][:pre_top_n_list[i]]
# oriented nms
boxes_bev = box_utils.boxes3d_to_bev(cur_proposals)
s_scores, s_proposals = box_utils.box_nms(
boxes_bev, cur_scores, cur_proposals,
cfg[mode].RPN_NMS_THRESH, post_top_n_list[i],
cfg.RPN.NMS_TYPE)
if len(s_scores) > 0:
scores_single_list.append(s_scores)
proposals_single_list.append(s_proposals)
scores_single = np.concatenate(scores_single_list, axis=0)
proposals_single = np.concatenate(proposals_single_list, axis=0)
prop_num = proposals_single.shape[0]
ret_scores[b, :prop_num, 0] = scores_single
ret_proposals[b, :prop_num] = proposals_single
# ret_proposals.tofile("proposal.data")
# ret_scores.tofile("score.data")
return np.concatenate([ret_proposals, ret_scores], axis=-1)
def score_based_proposal(scores, proposals, sorted_idxs):
batch_size = scores.shape[0]
ret_proposals = np.zeros((batch_size, cfg[mode].RPN_POST_NMS_TOP_N, 7), dtype='float32')
ret_scores = np.zeros((batch_size, cfg[mode].RPN_POST_NMS_TOP_N, 1), dtype='float32')
for b, (score, proposal, sorted_idx) in enumerate(zip(scores, proposals, sorted_idxs)):
# sort by score
score_ord = score[sorted_idx]
proposal_ord = proposal[sorted_idx]
# pre nms top K
cur_scores = score_ord[:cfg[mode].RPN_PRE_NMS_TOP_N]
cur_proposals = proposal_ord[:cfg[mode].RPN_PRE_NMS_TOP_N]
boxes_bev = box_utils.boxes3d_to_bev(cur_proposals)
s_scores, s_proposals = box_utils.box_nms(
boxes_bev, cur_scores, cur_proposals,
cfg[mode].RPN_NMS_THRESH,
cfg[mode].RPN_POST_NMS_TOP_N,
'rotate')
prop_num = len(s_proposals)
ret_scores[b, :prop_num, 0] = s_scores
ret_proposals[b, :prop_num] = s_proposals
# ret_proposals.tofile("proposal.data")
# ret_scores.tofile("score.data")
return np.concatenate([ret_proposals, ret_scores], axis=-1)
def generate_proposal(x):
rpn_scores = np.array(x[:, :, 0])[:, :, 0]
roi_box3d = x[:, :, 1:4]
pred_reg = x[:, :, 4:]
proposals = decode_bbox_target(
np.array(roi_box3d).reshape(-1, roi_box3d.shape()[-1]),
np.array(pred_reg).reshape(-1, pred_reg.shape()[-1]),
anchor_size=np.array(cfg.CLS_MEAN_SIZE[0], dtype='float32'),
loc_scope=cfg.RPN.LOC_SCOPE,
loc_bin_size=cfg.RPN.LOC_BIN_SIZE,
num_head_bin=cfg.RPN.NUM_HEAD_BIN,
get_xz_fine=cfg.RPN.LOC_XZ_FINE,
get_y_by_bin=False,
get_ry_fine=False)
proposals[:, 1] += proposals[:, 3] / 2  # shift y from box center to the bottom center (KITTI convention)
proposals = proposals.reshape(rpn_scores.shape[0], -1, proposals.shape[-1])
sorted_idxs = np.argsort(-rpn_scores, axis=-1)
if cfg.TEST.RPN_DISTANCE_BASED_PROPOSE:
ret = distance_based_proposal(rpn_scores, proposals, sorted_idxs)
else:
ret = score_based_proposal(rpn_scores, proposals, sorted_idxs)
return ret
return generate_proposal
if __name__ == "__main__":
np.random.seed(3333)
x_np = np.random.random((4, 256, 84)).astype('float32')
from config import cfg
cfg.RPN.LOC_XZ_FINE = True
# cfg.TEST.RPN_DISTANCE_BASED_PROPOSE = False
# cfg.RPN.NMS_TYPE = 'rotate'
proposal_func = get_proposal_func(cfg)
x = fluid.layers.data(name="x", shape=[256, 84], dtype='float32')
proposal = fluid.default_main_program().current_block().create_var(
name="proposal", dtype='float32', shape=[256, 7])
fluid.layers.py_func(proposal_func, x, proposal)
loss = fluid.layers.reduce_mean(proposal)
place = fluid.CUDAPlace(0)
exe = fluid.Executor(place)
exe.run(fluid.default_startup_program())
ret = exe.run(fetch_list=[proposal.name, loss.name], feed={'x': x_np})
print(ret)
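decode_bbox_target recovers each coordinate from a classified bin plus a regressed residual within that bin. The arithmetic for x alone, as a worked NumPy example (the bin scores and the residual are made up):
```
import numpy as np

loc_scope, loc_bin_size = 3.0, 0.5
per_loc_bin_num = int(loc_scope / loc_bin_size) * 2  # 12 bins covering [-3, 3)
x_bin_scores = np.zeros(per_loc_bin_num)
x_bin_scores[7] = 1.0                                # pretend bin 7 won
x_bin = np.argmax(x_bin_scores)
pos_x = x_bin * loc_bin_size + loc_bin_size / 2 - loc_scope  # bin center: 0.75
x_res_norm = 0.2                                     # regressed, in bin units
pos_x += x_res_norm * loc_bin_size                   # 0.75 + 0.1 = 0.85
print(pos_x)
```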
cmake_minimum_required(VERSION 2.8.12)
project(pts_utils)
add_subdirectory(pybind11)
pybind11_add_module(pts_utils pts_utils.cpp)
#include <pybind11/pybind11.h>
#include <pybind11/numpy.h>
#include <math.h>
namespace py = pybind11;
int pt_in_box3d(float x, float y, float z, float cx, float cy, float cz, float h, float w, float l, float cosa, float sina) {
if ((fabsf(x - cx) > 10.) || (fabsf(y - cy) > h / 2.0) || (fabsf(z - cz) > 10.)){
return 0;
}
float x_rot = (x - cx) * cosa + (z - cz) * (-sina);
float z_rot = (x - cx) * sina + (z - cz) * cosa;
int in_flag = static_cast<int>((x_rot >= -l / 2.0) & (x_rot <= l / 2.0) & (z_rot >= -w / 2.0) & (z_rot <= w / 2.0));
return in_flag;
}
py::array_t<int> pts_in_boxes3d(py::array_t<float> pts, py::array_t<float> boxes) {
py::buffer_info pts_buf = pts.request(), boxes_buf = boxes.request();
if (pts_buf.ndim != 2 || boxes_buf.ndim != 2) {
throw std::runtime_error("Number of dimensions must be 2");
}
if (pts_buf.shape[1] != 3) {
throw std::runtime_error("pts 2nd dimension must be 3");
}
if (boxes_buf.shape[1] != 7) {
throw std::runtime_error("boxes 2nd dimension must be 7");
}
auto pts_num = pts_buf.shape[0];
auto boxes_num = boxes_buf.shape[0];
auto mask = py::array_t<int>(pts_num * boxes_num);
py::buffer_info mask_buf = mask.request();
float *pts_ptr = (float *) pts_buf.ptr,
*boxes_ptr = (float *) boxes_buf.ptr;
int *mask_ptr = (int *) mask_buf.ptr;
for (ssize_t i = 0; i < boxes_num; i++) {
float cx = boxes_ptr[i * 7];
float cy = boxes_ptr[i * 7 + 1] - boxes_ptr[i * 7 + 3] / 2.;
float cz = boxes_ptr[i * 7 + 2];
float h = boxes_ptr[i * 7 + 3];
float w = boxes_ptr[i * 7 + 4];
float l = boxes_ptr[i * 7 + 5];
float angle = boxes_ptr[i * 7 + 6];
float cosa = cosf(angle);
float sina = sinf(angle);
for (ssize_t j = 0; j < pts_num; j++) {
mask_ptr[i * pts_num + j] = pt_in_box3d(pts_ptr[j * 3], pts_ptr[j * 3 + 1], pts_ptr[j * 3 + 2], cx, cy, cz, h, w, l, cosa, sina);
}
}
mask.resize({boxes_num, pts_num});
return mask;
}
PYBIND11_MODULE(pts_utils, m) {
m.def("pts_in_boxes3d", &pts_in_boxes3d, "Calculate mask for whether points in boxes3d");
}
from setuptools import setup
from setuptools import Extension
setup(
name='pts_utils',
ext_modules = [Extension(
name='pts_utils',
sources=['pts_utils.cpp'],
include_dirs=[r'../../pybind11/include'],
extra_compile_args=['-std=c++11']
)],
)
import numpy as np
import pts_utils
a = np.random.random((16384, 3)).astype('float32')
b = np.random.random((64, 7)).astype('float32')
c = pts_utils.pts_in_boxes3d(a, b)
print(a, b, c, c.shape, np.sum(c))
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
Contains common utility functions.
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import sys
import six
import logging
import numpy as np
import paddle.fluid as fluid
__all__ = ["check_gpu", "print_arguments", "parse_outputs", "Stat"]
logger = logging.getLogger(__name__)
def check_gpu(use_gpu):
"""
Log error and exit when set use_gpu=True in paddlepaddle
cpu version.
"""
err = "Config use_gpu cannot be set as True while you are " \
"using paddlepaddle cpu version ! \nPlease try: \n" \
"\t1. Install paddlepaddle-gpu to run model on GPU \n" \
"\t2. Set --use_gpu=False to run model on CPU"
try:
if use_gpu and not fluid.is_compiled_with_cuda():
logger.error(err)
sys.exit(1)
except Exception:
pass
def print_arguments(args):
"""Print argparse's arguments.
Usage:
.. code-block:: python
parser = argparse.ArgumentParser()
parser.add_argument("name", default="Jonh", type=str, help="User name.")
args = parser.parse_args()
print_arguments(args)
:param args: Input argparse.Namespace for printing.
:type args: argparse.Namespace
"""
logger.info("----------- Configuration Arguments -----------")
for arg, value in sorted(six.iteritems(vars(args))):
logger.info("%s: %s" % (arg, value))
logger.info("------------------------------------------------")
def parse_outputs(outputs, filter_key=None, extra_keys=None, prog=None):
keys, values = [], []
for k, v in outputs.items():
if filter_key is not None and k.find(filter_key) < 0:
continue
keys.append(k)
v.persistable = True
values.append(v.name)
if prog is not None and extra_keys is not None:
for k in extra_keys:
try:
v = fluid.framework._get_var(k, prog)
keys.append(k)
v.persistable = True
values.append(v.name)
except Exception:
pass
return keys, values
class Stat(object):
def __init__(self):
self.stats = {}
def update(self, keys, values):
for k, v in zip(keys, values):
if k not in self.stats:
self.stats[k] = []
self.stats[k].append(v)
def reset(self):
self.stats = {}
def get_mean_log(self):
log = ""
for k, v in self.stats.items():
log += "avg_{}: {:.4f}, ".format(k, np.mean(v))
return log
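Stat accumulates fetched values per key and reports running means; a short usage sketch:
```
stat = Stat()
stat.update(['loss', 'rpn_iou'], [0.8, 0.5])
stat.update(['loss', 'rpn_iou'], [0.6, 0.7])
print(stat.get_mean_log())  # avg_loss: 0.7000, avg_rpn_iou: 0.6000,
stat.reset()
```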
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import numpy as np
from utils.config import cfg
from utils import calibration as calib
import utils.cyops.kitti_utils as kitti_utils
__all__ = ['save_rpn_feature', 'save_kitti_result', 'save_kitti_format']
def save_rpn_feature(rets, kitti_features_dir):
"""
save rpn features for RCNN offline training
"""
sample_id = rets['sample_id'][0]
backbone_xyz = rets['backbone_xyz'][0]
backbone_feature = rets['backbone_feature'][0]
pts_features = rets['pts_features'][0]
seg_mask = rets['seg_mask'][0]
rpn_cls = rets['rpn_cls'][0]
for i in range(len(sample_id)):
pts_intensity = pts_features[i, :, 0]
s_id = sample_id[i, 0]
output_file = os.path.join(kitti_features_dir, '%06d.npy' % s_id)
xyz_file = os.path.join(kitti_features_dir, '%06d_xyz.npy' % s_id)
seg_file = os.path.join(kitti_features_dir, '%06d_seg.npy' % s_id)
intensity_file = os.path.join(
kitti_features_dir, '%06d_intensity.npy' % s_id)
np.save(output_file, backbone_feature[i])
np.save(xyz_file, backbone_xyz[i])
np.save(seg_file, seg_mask[i])
np.save(intensity_file, pts_intensity)
rpn_scores_raw_file = os.path.join(
kitti_features_dir, '%06d_rawscore.npy' % s_id)
np.save(rpn_scores_raw_file, rpn_cls[i])
def save_kitti_result(rets, seg_output_dir, kitti_output_dir, reader, classes):
sample_id = rets['sample_id'][0]
roi_scores_row = rets['roi_scores_row'][0]
bboxes3d = rets['rois'][0]
pts_rect = rets['pts_rect'][0]
seg_mask = rets['seg_mask'][0]
rpn_cls_label = rets['rpn_cls_label'][0]
gt_boxes3d = rets['gt_boxes3d'][0]
gt_boxes3d_num = rets['gt_boxes3d'][1]
for i in range(len(sample_id)):
s_id = sample_id[i, 0]
seg_result_data = np.concatenate((pts_rect[i].reshape(-1, 3),
rpn_cls_label[i].reshape(-1, 1),
seg_mask[i].reshape(-1, 1)),
axis=1).astype('float16')
seg_output_file = os.path.join(seg_output_dir, '%06d.npy' % s_id)
np.save(seg_output_file, seg_result_data)
scores = roi_scores_row[i, :]
bbox3d = bboxes3d[i, :]
img_shape = reader.get_image_shape(s_id)
calib = reader.get_calib(s_id)
corners3d = kitti_utils.boxes3d_to_corners3d(bbox3d)
img_boxes, _ = calib.corners3d_to_img_boxes(corners3d)
img_boxes[:, 0] = np.clip(img_boxes[:, 0], 0, img_shape[1] - 1)
img_boxes[:, 1] = np.clip(img_boxes[:, 1], 0, img_shape[0] - 1)
img_boxes[:, 2] = np.clip(img_boxes[:, 2], 0, img_shape[1] - 1)
img_boxes[:, 3] = np.clip(img_boxes[:, 3], 0, img_shape[0] - 1)
img_boxes_w = img_boxes[:, 2] - img_boxes[:, 0]
img_boxes_h = img_boxes[:, 3] - img_boxes[:, 1]
box_valid_mask = np.logical_and(
img_boxes_w < img_shape[1] * 0.8, img_boxes_h < img_shape[0] * 0.8)
kitti_output_file = os.path.join(kitti_output_dir, '%06d.txt' % s_id)
with open(kitti_output_file, 'w') as f:
for k in range(bbox3d.shape[0]):
if box_valid_mask[k] == 0:
continue
x, z, ry = bbox3d[k, 0], bbox3d[k, 2], bbox3d[k, 6]
beta = np.arctan2(z, x)
alpha = -np.sign(beta) * np.pi / 2 + beta + ry
f.write('{} -1 -1 {:.4f} {:.4f} {:.4f} {:.4f} {:.4f} {:.4f} {:.4f} {:.4f} {:.4f} {:.4f} {:.4f} {:.4f} {:.4f}\n'.format(
classes, alpha, img_boxes[k, 0], img_boxes[k, 1], img_boxes[k, 2], img_boxes[k, 3],
bbox3d[k, 3], bbox3d[k, 4], bbox3d[k, 5], bbox3d[k, 0], bbox3d[k, 1], bbox3d[k, 2],
bbox3d[k, 6], scores[k]))
def save_kitti_format(sample_id, calib, bbox3d, kitti_output_dir, scores, img_shape):
corners3d = kitti_utils.boxes3d_to_corners3d(bbox3d)
img_boxes, _ = calib.corners3d_to_img_boxes(corners3d)
img_boxes[:, 0] = np.clip(img_boxes[:, 0], 0, img_shape[1] - 1)
img_boxes[:, 1] = np.clip(img_boxes[:, 1], 0, img_shape[0] - 1)
img_boxes[:, 2] = np.clip(img_boxes[:, 2], 0, img_shape[1] - 1)
img_boxes[:, 3] = np.clip(img_boxes[:, 3], 0, img_shape[0] - 1)
img_boxes_w = img_boxes[:, 2] - img_boxes[:, 0]
img_boxes_h = img_boxes[:, 3] - img_boxes[:, 1]
box_valid_mask = np.logical_and(img_boxes_w < img_shape[1] * 0.8, img_boxes_h < img_shape[0] * 0.8)
kitti_output_file = os.path.join(kitti_output_dir, '%06d.txt' % sample_id)
with open(kitti_output_file, 'w') as f:
for k in range(bbox3d.shape[0]):
if box_valid_mask[k] == 0:
continue
x, z, ry = bbox3d[k, 0], bbox3d[k, 2], bbox3d[k, 6]
beta = np.arctan2(z, x)
alpha = -np.sign(beta) * np.pi / 2 + beta + ry
f.write('%s -1 -1 %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f\n' %
(cfg.CLASSES, alpha, img_boxes[k, 0], img_boxes[k, 1], img_boxes[k, 2], img_boxes[k, 3],
bbox3d[k, 3], bbox3d[k, 4], bbox3d[k, 5], bbox3d[k, 0], bbox3d[k, 1], bbox3d[k, 2],
bbox3d[k, 6], scores[k]))
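Both writers convert the heading ry into the KITTI observation angle alpha through beta = arctan2(z, x). A worked check of that formula with illustrative numbers:
```
import numpy as np

x, z, ry = 10.0, 10.0, np.pi / 2
beta = np.arctan2(z, x)  # pi/4, the viewing angle toward the box
alpha = -np.sign(beta) * np.pi / 2 + beta + ry
print(alpha)  # pi/4 ~= 0.7854
```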