diff --git a/.gitignore b/.gitignore
index 61a80a88edb71e9ba4192f84ab7821ba139bb9ce..60b49517f73affaf8da00009ba11684dc1a352c0 100644
--- a/.gitignore
+++ b/.gitignore
@@ -4,3 +4,4 @@
*.pyc
*~
*.vscode
+*.idea
\ No newline at end of file
diff --git a/PaddleCV/Paddle3D/PointNet++/.gitignore b/PaddleCV/Paddle3D/PointNet++/.gitignore
new file mode 100644
index 0000000000000000000000000000000000000000..b278bea2f2de85ff3778008fe1302b9e3fdfba81
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointNet++/.gitignore
@@ -0,0 +1,8 @@
+checkpoints*
+ext_op/src/*.o
+ext_op/src/*.so
+*.log*
+dataset/Indoor3DSemSeg/*
+!dataset/Indoor3DSemSeg/*.sh
+dataset/ModelNet40/*
+!dataset/ModelNet40/*.sh
diff --git a/PaddleCV/Paddle3D/PointNet++/README.md b/PaddleCV/Paddle3D/PointNet++/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..cb5842bce83e7bc0f7a510fb185e4972670b04ac
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointNet++/README.md
@@ -0,0 +1,247 @@
+# PointNet++ classification and semantic segmentation models
+
+---
+## Table of Contents
+
+- [Introduction](#introduction)
+- [Quick Start](#quick-start)
+- [Reference](#reference)
+- [Update](#update)
+
+## Introduction
+
+[PointNet++](https://arxiv.org/abs/1706.02413) is a model for 3D point cloud classification and semantic segmentation proposed by Charles R. Qi, Li Yi, Hao Su, and Leonidas J. Guibas. It extends PointNet with hierarchical point set feature learning: the input points are first grouped and sampled to extract local region patterns, and a multi-layer perceptron is then applied to obtain point features. For semantic segmentation, PointNet++ adds feature propagation, a hierarchical strategy based on distance-weighted interpolation and across-level skip links, which upsamples point features to recover features for all of the original points.
+
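+To make the feature propagation step concrete, the following is a minimal NumPy sketch of distance-based interpolation over the three nearest neighbors. It is an illustration only: a simplified, batch-free analogue of the `three_nn`/`three_interp` operators provided in `ext_op`, with function and variable names of our own choosing.
+
+```
+import numpy as np
+
+def three_interp_np(known_xyz, known_feat, unknown_xyz, eps=1e-10):
+    """Upsample features by inverse-distance weighting the 3 nearest known points."""
+    # squared distances between every unknown point and every known point, (N, M)
+    d = ((unknown_xyz[:, None, :] - known_xyz[None, :, :]) ** 2).sum(-1)
+    idx = np.argsort(d, axis=1)[:, :3]                 # 3 nearest neighbors, (N, 3)
+    w = 1.0 / (np.take_along_axis(d, idx, axis=1) + eps)
+    w /= w.sum(axis=1, keepdims=True)                  # normalize weights per point
+    return (known_feat[idx] * w[..., None]).sum(axis=1)  # interpolated features, (N, C)
+```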
+
+The network structure is shown below:
+
+
+
+PointNet++ network architecture for point cloud classification and segmentation
+
+
+The set abstraction layer is the basic building block of the network. Each set abstraction layer consists of three key layers: a sampling layer, a grouping layer, and a feature extraction (PointNet) layer.
+
+- **Sampling layer**: The sampling layer uses farthest point sampling (FPS) to select a subset of the input points, which defines the centroids of local regions. Compared with random sampling, FPS covers the entire point set better for the same number of centroids; see the NumPy sketch after this list.
+
+- **Grouping layer**: The grouping layer builds local region sets by finding the points "neighboring" each centroid. For a point set sampled from a metric space, a point's neighborhood is defined by metric distance. This method is called "ball query", and it makes local region features more generalizable across space.
+
+- **Feature extraction layer**: The feature extraction layer applies a mini-PointNet to each region produced by the grouping layer to extract local features.
+
+
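+The following is a minimal NumPy sketch of farthest point sampling, for illustration only; it is independent of the `farthest_point_sampling` CUDA operator shipped in `ext_op`, and the function and variable names are our own.
+
+```
+import numpy as np
+
+def farthest_point_sampling_np(points, m):
+    """Greedily pick m centroid indices from an (N, 3) array of xyz coordinates."""
+    n = points.shape[0]
+    idxs = np.zeros(m, dtype=np.int64)   # start from point 0
+    dists = np.full(n, np.inf)           # distance of each point to the selected set
+    for i in range(1, m):
+        # update distances against the most recently selected centroid
+        diff = points - points[idxs[i - 1]]
+        dists = np.minimum(dists, (diff * diff).sum(-1))
+        idxs[i] = int(np.argmax(dists))  # the farthest remaining point is selected next
+    return idxs
+```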
+
+**NOTE:** The PointNet++ model depends on custom C++ operators, which currently can only be compiled for GPU devices on Linux/Unix systems. This model **cannot run on Windows systems or on CPU devices**.
+
+
+## Quick Start
+
+### Installation
+
+**Install [PaddlePaddle](https://github.com/PaddlePaddle/Paddle):**
+
+Running the sample code in this directory requires the PaddlePaddle Fluid [develop daily version](https://www.paddlepaddle.org.cn/install/doc/tables#多版本whl包列表-dev-11), or PaddlePaddle compiled and installed from the [develop branch](https://github.com/PaddlePaddle/Paddle/tree/develop) source.
+
+To keep the custom operators compatible with your Paddle version, we recommend that you **compile Paddle from source first**; see [Compile and Install](https://www.paddlepaddle.org.cn/install/doc/source/ubuntu) for build instructions.
+
+
+### Compile the custom operators
+
+Please make sure your Paddle is a PaddlePaddle Fluid develop daily version or was compiled from the Paddle develop branch source; **installing a build compiled from source is recommended**.
+
+The custom operators are compiled as follows:
+
+ Enter the `ext_op/src` directory and run the build script:
+ ```
+ cd ext_op/src
+ sh make.sh
+ ```
+
+ After a successful build, `pointnet_lib.so` will be generated under `ext_op/src`.
+
+ Run the following to make sure the custom operators compiled correctly:
+
+ ```
+ # add the Paddle dynamic library path to LD_LIBRARY_PATH
+ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:`python -c 'import paddle; print(paddle.sysconfig.get_lib())'`
+
+ # go back to the ext_op directory and add it to PYTHONPATH
+ cd ..
+ export PYTHONPATH=$PYTHONPATH:`pwd`
+
+ # run the unit tests
+ python tests/test_farthest_point_sampling_op.py
+ python tests/test_gather_point_op.py
+ python tests/test_group_points_op.py
+ python tests/test_query_ball_op.py
+ python tests/test_three_interp_op.py
+ python tests/test_three_nn_op.py
+ ```
+ A successful unit test run prints a message like the following:
+
+ ```
+ .
+ ----------------------------------------------------------------------
+ Ran 1 test in 13.205s
+
+ OK
+ ```
+
+**Note:** For more details on compiling the custom operators, please refer to [Custom operator compilation](./ext_op/README.md).
+
+
+### Data preparation
+
+**ModelNet40 dataset:**
+
+The PointNet++ classification model is trained on the [ModelNet40 dataset](https://shapenet.cs.stanford.edu/media/modelnet40_ply_hdf5_2048.zip). We provide a download script:
+
+```
+cd dataset/ModelNet40
+sh download.sh
+```
+
+The dataset directory structure is as follows:
+
+```
+ dataset/ModelNet40/modelnet40_ply_hdf5_2048
+ ├── train_files.txt
+ ├── test_files.txt
+ ├── shape_names.txt
+ ├── ply_data_train0.h5
+ ├── ply_data_train_0_id2file.json
+ ├── ply_data_test0.h5
+ ├── ply_data_test_0_id2file.json
+ | ...
+
+```
+
+**Indoor3DSemSeg dataset:**
+
+The PointNet++ segmentation model is trained on the [Indoor3DSemSeg dataset](https://shapenet.cs.stanford.edu/media/indoor3d_sem_seg_hdf5_data.zip). We provide a download script:
+
+```
+cd dataset/Indoor3DSemSeg
+sh download.sh
+```
+
+The dataset directory structure is as follows:
+
+```
+ dataset/Indoor3DSemSeg/
+ ├── all_files.txt
+ ├── room_filelist.txt
+ ├── ply_data_all_0.h5
+ ├── ply_data_all_1.h5
+ | ...
+
+```
+
+### Training
+
+The classification/segmentation models are trained on a single GPU by default. Before launching training, select a GPU and add the dynamic library path to LD_LIBRARY_PATH:
+
+```
+# train on GPU card 0
+export CUDA_VISIBLE_DEVICES=0
+
+# add the Paddle dynamic library path to LD_LIBRARY_PATH
+export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:`python -c 'import paddle; print(paddle.sysconfig.get_lib())'`
+
+```
+
+**Classification model:**
+
+Training of the PointNet++ classification model can be started as follows:
+
+```
+# start training
+python train_cls.py --model=MSG --batch_size=16 --save_dir=checkpoints_msg_cls
+```
+
+We also provide a quick-start script for training the classification model:
+
+```
+sh scripts/train_cls.sh
+```
+
+**Semantic segmentation model:**
+
+Training of the PointNet++ semantic segmentation model can be started as follows:
+
+```
+# start training
+python train_seg.py --model=MSG --batch_size=32 --save_dir=checkpoints_msg_seg
+```
+
+We also provide a quick-start script for training the semantic segmentation model:
+
+```
+sh scripts/train_seg.sh
+```
+
+### Evaluation
+
+
+The classification/segmentation models are evaluated on a single GPU by default. First select a GPU and add the dynamic library path to LD_LIBRARY_PATH:
+
+```
+# evaluate on GPU card 0
+export CUDA_VISIBLE_DEVICES=0
+
+# add the Paddle dynamic library path to LD_LIBRARY_PATH
+export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:`python -c 'import paddle; print(paddle.sysconfig.get_lib())'`
+
+```
+
+**Classification model:**
+
+Evaluation of the PointNet++ classification model can be started as follows:
+
+```
+# evaluate with the given weights
+python eval_cls.py --model=MSG --weights=checkpoints_cls/200
+```
+
+We also provide a quick-start script for evaluating the classification model:
+
+```
+sh scripts/eval_cls.sh
+```
+
+Classification model evaluation results are shown below:
+
+| model | Top-1 | download |
+| :----- | :---: | :---: |
+| SSG(Single-Scale Group) | 89.3 | [model](https://paddlemodels.bj.bcebos.com/Paddle3D/pointnet2_ssg_cls.tar) |
+| MSG(Multi-Scale Group) | 90.0 | [model](https://paddlemodels.bj.bcebos.com/Paddle3D/pointnet2_msg_cls.tar) |
+
+**Semantic segmentation model:**
+
+Evaluation of the PointNet++ semantic segmentation model can be started as follows:
+
+```
+# evaluate with the given weights
+python eval_seg.py --model=MSG --weights=checkpoints_seg/200
+```
+
+We also provide a quick-start script for evaluating the semantic segmentation model:
+
+```
+sh scripts/eval_seg.sh
+```
+
+Semantic segmentation model evaluation results are shown below:
+
+| model | Top-1 | download |
+| :----- | :---: | :---: |
+| SSG(Single-Scale Group) | 86.1 | [model](https://paddlemodels.bj.bcebos.com/Paddle3D/pointnet2_ssg_seg.tar) |
+| MSG(Multi-Scale Group) | 86.6 | [model](https://paddlemodels.bj.bcebos.com/Paddle3D/pointnet2_msg_seg.tar) |
+
+## Reference
+
+- [PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space](https://arxiv.org/abs/1706.02413), Charles R. Qi, Li Yi, Hao Su, Leonidas J. Guibas.
+- [PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation](https://www.semanticscholar.org/paper/PointNet%3A-Deep-Learning-on-Point-Sets-for-3D-and-Qi-Su/d997beefc0922d97202789d2ac307c55c2c52fba), Charles Ruizhongtai Qi, Hao Su, Kaichun Mo, Leonidas J. Guibas.
+
+## Update
+
+- 11/2019: Added PointNet++ classification and semantic segmentation models.
diff --git a/PaddleCV/Paddle3D/PointNet++/README_en.md b/PaddleCV/Paddle3D/PointNet++/README_en.md
new file mode 100644
index 0000000000000000000000000000000000000000..f42fe9dbf5037a685fb0afbd920465fbcf8e4406
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointNet++/README_en.md
@@ -0,0 +1,253 @@
+# PointNet++ classification and semantic segmentation models
+
+---
+## Table of Contents
+
+- [Introduction](#introduction)
+- [Quick Start](#quick-start)
+- [Reference](#reference)
+- [Update](#update)
+
+## Introduction
+
+[PointNet++](https://arxiv.org/abs/1706.02413) is a point classification and segmentation model for 3D data proposed by Charles R. Qi, Li Yi, Hao Su, and Leonidas J. Guibas.
+The model is an extension of PointNet: it extracts features of point cloud data with hierarchical point set feature learning, first performing set abstraction by grouping and sampling points to extract
+local region patterns, then using a multi-layer perceptron to obtain point features. PointNet++ also uses point feature propagation for the semantic segmentation model, adopting a hierarchical
+propagation strategy with distance-based interpolation and across-level skip links; point features are upsampled to obtain features for all of the original points.
+
+The network structure is shown as below.
+
+
+
+PointNet++ architecture for Point set Segmentation and Classification
+
+
+The set abstraction layer is the basic module of the network; each set abstraction layer is made of three key layers: a sampling layer, a grouping layer and a PointNet layer.
+
+- **Sampling layer**: The sampling layer uses farthest point sampling (FPS) to select a set of points from the input points, which defines the centroids of local regions. Compared with random sampling, it has better coverage of the entire point set given the same number of centroids.
+
+- **Grouping layer**: The grouping layer constructs local region sets by finding "neighboring" points around the centroids. In a point set sampled from a metric space, the neighborhood of a point is defined by metric distance. This method is called "ball query", which makes local region features more generalizable across space; see the NumPy sketch after this list.
+
+- **PointNet layer**: The PointNet layer uses a mini-PointNet to encode local region patterns into feature vectors.
+
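+As an illustration of the grouping step, here is a minimal NumPy sketch of ball query. It is independent of the custom `query_ball` CUDA operator in `ext_op`; the function name and the padding convention below are our own.
+
+```
+import numpy as np
+
+def ball_query_np(points, centroids, radius, n_sample):
+    """For each (M, 3) centroid, collect up to n_sample indices of (N, 3) points within radius."""
+    sq_dist = ((centroids[:, None, :] - points[None, :, :]) ** 2).sum(-1)  # (M, N)
+    groups = []
+    for row in sq_dist:
+        idx = np.where(row <= radius ** 2)[0][:n_sample]
+        # pad by repeating the first hit so every group has exactly n_sample entries
+        pad = np.full(n_sample - len(idx), idx[0] if len(idx) else 0)
+        groups.append(np.concatenate([idx, pad]))
+    return np.stack(groups).astype(np.int64)  # (M, n_sample)
+```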
+
+**NOTE:** The PointNet++ model is built on custom C++ operators, which currently only support GPU devices and can only be compiled on Linux/Unix. This model **cannot run on Windows or on CPU devices**.
+
+
+## Quick Start
+
+### Installation
+
+**Install [PaddlePaddle](https://github.com/PaddlePaddle/Paddle):**
+
+Running the sample code in this directory requires the PaddlePaddle Fluid develop [daily version wheel](https://www.paddlepaddle.org.cn/install/doc/tables#多版本whl包列表-dev-11) or PaddlePaddle compiled from the [develop branch](https://github.com/PaddlePaddle/Paddle/tree/develop).
+
+In order to make the custom OP compatible with the Paddle version, it is recommended to **compile from PaddlePaddle develop branch source code**. For source code compilation, please refer to [Compile and Install](https://www.paddlepaddle.org.cn/install/doc/source/ubuntu)
+
+### Compile custom operations
+
+Please make sure you are using a PaddlePaddle Fluid develop daily version or a build compiled from the PaddlePaddle develop branch.
+The custom operators can be compiled as follows:
+
+```
+cd ext_op/src
+sh make.sh
+```
+
+If the compilation finishes successfully, `pointnet_lib.so` will be generated under `ext_op/src`.
+
+Verify that the custom operators pass their unit tests as follows:
+
+```
+# export paddle libs to LD_LIBRARY_PATH for custom op library
+export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:`python -c 'import paddle; print(paddle.sysconfig.get_lib())'`
+
+# back to ext_op and add PYTHONPATH
+cd ..
+export PYTHONPATH=$PYTHONPATH:`pwd`
+
+# Run unit tests
+python tests/test_farthest_point_sampling_op.py
+python tests/test_gather_point_op.py
+python tests/test_group_points_op.py
+python tests/test_query_ball_op.py
+python tests/test_three_interp_op.py
+python tests/test_three_nn_op.py
+```
+A successful run prints a message like the following:
+
+```
+.
+----------------------------------------------------------------------
+Ran 1 test in 13.205s
+
+OK
+```
+
+### Data preparation
+
+**ModelNet40 dataset:**
+
+PointNet++ classification models are trained on [ModelNet40 dataset](https://shapenet.cs.stanford.edu/media/modelnet40_ply_hdf5_2048.zip), we also provide download scripts as follows:
+
+```
+cd dataset/ModelNet40
+sh download.sh
+```
+
+The dataset directory structure is as follows:
+
+```
+ dataset/ModelNet40/modelnet40_ply_hdf5_2048
+ ├── train_files.txt
+ ├── test_files.txt
+ ├── shape_names.txt
+ ├── ply_data_train0.h5
+ ├── ply_data_train_0_id2file.json
+ ├── ply_data_test0.h5
+ ├── ply_data_test_0_id2file.json
+ | ...
+
+```
+
+**Indoor3DSemSeg dataset:**
+
+PointNet++ semantic segmentation models are trained on [Indoor3DSemSeg dataset](https://shapenet.cs.stanford.edu/media/indoor3d_sem_seg_hdf5_data.zip), we also provide download scripts as follows:
+
+```
+cd dataset/Indoor3DSemSeg
+sh download.sh
+```
+
+The dataset directory structure is as follows:
+
+```
+ dataset/Indoor3DSemSeg/
+ ├── all_files.txt
+ ├── room_filelist.txt
+ ├── ply_data_all_0.h5
+ ├── ply_data_all_1.h5
+ | ...
+
+```
+
+### Training
+
+**Classification Model:**
+
+For the PointNet++ classification model, training can be started as follows:
+
+```
+# For single GPU devices
+export CUDA_VISIBLE_DEVICES=0
+
+# enable gc to save GPU memory
+export FLAGS_fast_eager_deletion_mode=1
+export FLAGS_eager_delete_tensor_gb=0.0
+export FLAGS_fraction_of_gpu_memory_to_use=0.98
+
+# export paddle libs to LD_LIBRARY_PATH for custom op library
+export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:`python -c 'import paddle; print(paddle.sysconfig.get_lib())'`
+
+# start training
+python train_cls.py --model=MSG --batch_size=16 --save_dir=checkpoints_msg_cls
+```
+
+We also provide a quick-start script for training the classification model:
+
+```
+sh scripts/train_cls.sh
+```
+
+**Semantic Segmentation Model:**
+
+For the PointNet++ semantic segmentation model, training can be started as follows:
+
+```
+# For single GPU devices
+export CUDA_VISIBLE_DEVICES=0
+
+# enable gc to save GPU memory
+export FLAGS_fast_eager_deletion_mode=1
+export FLAGS_eager_delete_tensor_gb=0.0
+export FLAGS_fraction_of_gpu_memory_to_use=0.98
+
+# export paddle libs to LD_LIBRARY_PATH for custom op library
+export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:`python -c 'import paddle; print(paddle.sysconfig.get_lib())'`
+
+# start training
+python train_seg.py --model=MSG --batch_size=32 --save_dir=checkpoints_msg_seg
+```
+
+We also provide a quick-start script for training the semantic segmentation model:
+
+```
+sh scripts/train_seg.sh
+```
+
+### Evaluation
+
+**Classification Model:**
+
+For the PointNet++ classification model, evaluation can be started as follows:
+
+```
+# For single GPU devices
+export CUDA_VISIBLE_DEVICES=0
+
+# export paddle libs to LD_LIBRARY_PATH for custom op library
+export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:`python -c 'import paddle; print(paddle.sysconfig.get_lib())'`
+
+# start evaluation with given weights
+python eval_cls.py --model=MSG --weights=checkpoints_cls/200
+```
+
+We also provide a quick-start script for evaluating the classification model:
+
+```
+sh scripts/eval_cls.sh
+```
+
+Classification model evaluation results are shown below:
+
+| model | Top-1 | download |
+| :----- | :---: | :---: |
+| SSG(Single-Scale Group) | 89.3 | [model](https://paddlemodels.bj.bcebos.com/Paddle3D/pointnet2_ssg_cls.tar) |
+| MSG(Multi-Scale Group) | 90.0 | [model](https://paddlemodels.bj.bcebos.com/Paddle3D/pointnet2_msg_cls.tar) |
+
+**Semantic Segmentation Model:**
+
+For the PointNet++ semantic segmentation model, evaluation can be started as follows:
+
+```
+# For single GPU devices
+export CUDA_VISIBLE_DEVICES=0
+
+# export paddle libs to LD_LIBRARY_PATH for custom op library
+export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:`python -c 'import paddle; print(paddle.sysconfig.get_lib())'`
+
+# start evaluation with given weights
+python eval_seg.py --model=MSG --weights=checkpoints_seg/200
+```
+
+We also provide a quick-start script for evaluating the semantic segmentation model:
+
+```
+sh scripts/eval_seg.sh
+```
+
+Semantic segmentation model evaluation results are shown below:
+
+| model | Top-1 | download |
+| :----- | :---: | :---: |
+| SSG(Single-Scale Group) | 86.1 | [model](https://paddlemodels.bj.bcebos.com/Paddle3D/pointnet2_ssg_seg.tar) |
+| MSG(Multi-Scale Group) | 86.8 | [model](https://paddlemodels.bj.bcebos.com/Paddle3D/pointnet2_msg_seg.tar) |
+
+## Reference
+
+- [PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space](https://arxiv.org/abs/1706.02413), Charles R. Qi, Li Yi, Hao Su, Leonidas J. Guibas.
+- [PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation](https://www.semanticscholar.org/paper/PointNet%3A-Deep-Learning-on-Point-Sets-for-3D-and-Qi-Su/d997beefc0922d97202789d2ac307c55c2c52fba), Charles Ruizhongtai Qi, Hao Su, Kaichun Mo, Leonidas J. Guibas.
+
+## Update
+
+- 11/2019: Added PointNet++ classification and semantic segmentation models.
diff --git a/PaddleCV/Paddle3D/PointNet++/data/__init__.py b/PaddleCV/Paddle3D/PointNet++/data/__init__.py
new file mode 100644
index 0000000000000000000000000000000000000000..939af885c1485aaef3d6a85ba950cecb310827c5
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointNet++/data/__init__.py
@@ -0,0 +1,21 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
+#
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
+
+from . import indoor3d_reader
+from . import modelnet40_reader
+from .indoor3d_reader import *
+from .modelnet40_reader import *
+
+# build a new list so we do not mutate indoor3d_reader.__all__ in place
+__all__ = indoor3d_reader.__all__ + modelnet40_reader.__all__
diff --git a/PaddleCV/Paddle3D/PointNet++/data/data_utils.py b/PaddleCV/Paddle3D/PointNet++/data/data_utils.py
new file mode 100644
index 0000000000000000000000000000000000000000..0f0872895bd55be2d9f84837e28671d5486d8b40
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointNet++/data/data_utils.py
@@ -0,0 +1,127 @@
+"""
+This code is based on https://github.com/erikwijmans/Pointnet2_PyTorch/blob/master/pointnet2/data/data_utils.py
+"""
+
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+from __future__ import unicode_literals
+
+import numpy as np
+
+def angle_axis(angle, axis):
+    """
+    Returns a 3x3 rotation matrix that performs a rotation around axis by angle
+    Parameters
+    ----------
+    angle : float
+        Angle to rotate by
+    axis: np.ndarray
+        Axis to rotate about
+    Returns
+    -------
+    np.ndarray
+        3x3 rotation matrix
+    """
+ u = axis / np.linalg.norm(axis)
+ cosval, sinval = np.cos(angle), np.sin(angle)
+
+ # yapf: disable
+ cross_prod_mat = np.array([[0.0, -u[2], u[1]],
+ [u[2], 0.0, -u[0]],
+ [-u[1], u[0], 0.0]])
+
+ R = np.array(
+ cosval * np.eye(3)
+ + sinval * cross_prod_mat
+ + (1.0 - cosval) * np.outer(u, u)).astype("float32")
+ return R
+
+class PointcloudScale(object):
+ def __init__(self, lo=0.8, hi=1.25):
+ self.lo, self.hi = lo, hi
+
+ def __call__(self, points):
+ scaler = np.random.uniform(self.lo, self.hi)
+ points[:, 0:3] *= scaler
+ return points
+
+class PointcloudRotate(object):
+ def __init__(self, axis=np.array([0.0, 1.0, 0.0])):
+ self.axis = axis
+
+ def __call__(self, points):
+ rotation_angle = np.random.uniform() * 2 * np.pi
+ rotation_matrix = angle_axis(rotation_angle, self.axis)
+
+ normals = points.shape[1] > 3
+ if not normals:
+ return np.matmul(points, rotation_matrix.T)
+ else:
+ pc_xyz = points[:, 0:3]
+ pc_normals = points[:, 3:]
+ points[:, 0:3] = np.matmul(pc_xyz, rotation_matrix.T)
+ points[:, 3:] = np.matmul(pc_normals, rotation_matrix.T)
+ return points
+
+class PointcloudTranslate(object):
+ def __init__(self, translate_range=0.1):
+ self.translate_range = translate_range
+
+ def __call__(self, points):
+ translation = np.random.uniform(-self.translate_range, self.translate_range)
+ points[:, 0:3] += translation
+ return points
+
+
+class PointcloudJitter(object):
+ def __init__(self, std=0.01, clip=0.05):
+ self.std, self.clip = std, clip
+
+ def __call__(self, points):
+ jittered_data = np.random.normal(loc=0,scale=self.std,size=(points.shape[0],3))
+ jittered_data = np.clip(jittered_data, -self.clip, self.clip)
+
+ points[:, 0:3] += jittered_data
+ return points
+
+class PointcloudRotatePerturbation(object):
+ def __init__(self, angle_sigma=0.06, angle_clip=0.18):
+ self.angle_sigma, self.angle_clip = angle_sigma, angle_clip
+
+ def _get_angles(self):
+ angles = np.clip(
+ self.angle_sigma * np.random.randn(3), -self.angle_clip, self.angle_clip
+ )
+ return angles
+ def __call__(self, points):
+ angles = self._get_angles()
+ Rx = angle_axis(angles[0], np.array([1.0, 0.0, 0.0]))
+ Ry = angle_axis(angles[1], np.array([0.0, 1.0, 0.0]))
+ Rz = angle_axis(angles[2], np.array([0.0, 0.0, 1.0]))
+
+ rotation_matrix = np.matmul(np.matmul(Rz, Ry), Rx)
+
+ normals = points.shape[1] > 3
+ if not normals:
+ return np.matmul(points, rotation_matrix.T)
+ else:
+ pc_xyz = points[:, 0:3]
+ pc_normals = points[:, 3:]
+ points[:, 0:3] = np.matmul(pc_xyz, rotation_matrix.T)
+ points[:, 3:] = np.matmul(pc_normals, rotation_matrix.T)
+ return points
+
+
+class PointcloudRandomInputDropout(object):
+ def __init__(self, max_dropout_ratio=0.875):
+ assert max_dropout_ratio >= 0 and max_dropout_ratio < 1
+ self.max_dropout_ratio = max_dropout_ratio
+
+ def __call__(self, points):
+ dropout_ratio = np.random.random() * self.max_dropout_ratio # 0~0.875
+ drop_idx = np.where(np.random.random((points.shape[0])) <= dropout_ratio)[0]
+ if len(drop_idx) > 0:
+ points[drop_idx] = points[0] # set to the first point
+
+ return points
diff --git a/PaddleCV/Paddle3D/PointNet++/data/indoor3d_reader.py b/PaddleCV/Paddle3D/PointNet++/data/indoor3d_reader.py
new file mode 100644
index 0000000000000000000000000000000000000000..c27a37963ecb808f51ed65235e739d89c70c0ac6
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointNet++/data/indoor3d_reader.py
@@ -0,0 +1,129 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+from __future__ import unicode_literals
+
+import os
+import os.path as osp
+import signal
+import numpy as np
+import h5py
+import random
+import logging
+
+__all__ = ["Indoor3DReader"]
+
+logger = logging.getLogger(__name__)
+
+
+class Indoor3DReader(object):
+ def __init__(self, data_dir, test_area="Area_5"):
+ self.data_dir = data_dir
+ self.test_area = test_area
+ self.load_data()
+
+ def _read_data_file(self, fname):
+ assert osp.isfile(fname), \
+ "{} is not a file".format(fname)
+ with open(fname) as f:
+ return [line.strip() for line in f]
+
+ def _load_h5_file(self, fname):
+ assert osp.isfile(fname), \
+ "{} is not a file".format(fname)
+ f = h5py.File(fname, mode='r')
+ return f['data'][:], f['label'][:]
+
+ def load_data(self):
+ logger.info("Loading Indoor3D dataset from {} ...".format(self.data_dir))
+ # read all_files.txt
+ all_files_fname = osp.join(self.data_dir, 'all_files.txt')
+ all_files = self._read_data_file(all_files_fname)
+
+ # read room_filelist.txt
+ room_fname = osp.join(self.data_dir, 'room_filelist.txt')
+ room_filelist = self._read_data_file(room_fname)
+
+ points, labels = [], []
+ for f in all_files:
+ h5_fname = osp.join(self.data_dir, osp.split(f)[-1])
+ point, label = self._load_h5_file(h5_fname)
+ points.append(point)
+ labels.append(label)
+ points = np.concatenate(points, 0)
+ labels = np.concatenate(labels, 0)
+
+ train_idxs, test_idxs = [], []
+ for i, room in enumerate(room_filelist):
+ if self.test_area in room:
+ test_idxs.append(i)
+ else:
+ train_idxs.append(i)
+
+ self.data = {}
+ self.data['train'] = {}
+ self.data['train']['points'] = points[train_idxs, ...]
+ self.data['train']['labels'] = labels[train_idxs, ...]
+ self.data['test'] = {}
+ self.data['test']['points'] = points[test_idxs, ...]
+ self.data['test']['labels'] = labels[test_idxs, ...]
+ logger.info("Load data finished")
+
+ def get_reader(self, batch_size, num_points, mode='train', shuffle=True):
+ assert mode in ['train', 'test'], \
+ "mode can only be 'train' or 'test'"
+ data = self.data[mode]
+ points = data['points']
+ labels = data['labels']
+
+ if mode == 'train' and shuffle:
+ idxs = np.arange(len(points))
+ np.random.shuffle(idxs)
+ points = points[idxs]
+ labels = labels[idxs]
+
+ def reader():
+ batch_out = []
+ for point, label in zip(points, labels):
+ # shuffle points
+ p = point.copy()
+ l = label.copy()
+ pt_idxs = np.arange(num_points)
+ np.random.shuffle(pt_idxs)
+ p = p[pt_idxs]
+ l = l[pt_idxs]
+
+ xyz = p[:, :3]
+ feature = p[:, 3:]
+ label = l[:, np.newaxis]
+ batch_out.append((xyz, feature, label))
+
+ if len(batch_out) == batch_size:
+ yield batch_out
+ batch_out = []
+
+ return reader
+
+
+def _term_reader(signum, frame):
+ logger.info('pid {} terminated, terminate reader process '
+ 'group {}...'.format(os.getpid(), os.getpgrp()))
+ os.killpg(os.getpgid(os.getpid()), signal.SIGKILL)
+
+signal.signal(signal.SIGINT, _term_reader)
+signal.signal(signal.SIGTERM, _term_reader)
+
diff --git a/PaddleCV/Paddle3D/PointNet++/data/modelnet40_reader.py b/PaddleCV/Paddle3D/PointNet++/data/modelnet40_reader.py
new file mode 100644
index 0000000000000000000000000000000000000000..e32f10ab719db0742df127f243c8064bf4dfdd48
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointNet++/data/modelnet40_reader.py
@@ -0,0 +1,116 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+from __future__ import unicode_literals
+
+import os
+import os.path as osp
+import signal
+import numpy as np
+import h5py
+import random
+import logging
+
+__all__ = ["ModelNet40ClsReader"]
+
+logger = logging.getLogger(__name__)
+
+
+class ModelNet40ClsReader(object):
+ def __init__(self, data_dir, mode='train', transforms=None):
+ assert mode in ['train', 'test'], \
+ "mode can only be 'train' or 'test'"
+ self.data_dir = data_dir
+ self.mode = mode
+ self.transforms = transforms
+ self.load_data()
+
+ def _read_data_file(self, fname):
+ assert osp.isfile(fname), \
+ "{} is not a file".format(fname)
+ with open(fname) as f:
+            # drop the leading 'data/' prefix from each listed file path
+            return [line.strip()[5:] for line in f]
+
+ def _load_h5_file(self, fname):
+ assert osp.isfile(fname), \
+ "{} is not a file".format(fname)
+ f = h5py.File(fname, mode='r')
+ return f['data'][:], f['label'][:]
+
+ def load_data(self):
+ logger.info("Loading ModelNet40 dataset {} split from {} "
+ "...".format(self.mode, self.data_dir))
+ if self.mode == 'train':
+ files_fname = osp.join(self.data_dir, 'train_files.txt')
+ files = self._read_data_file(files_fname)
+ else:
+ files_fname = osp.join(self.data_dir, 'test_files.txt')
+ files = self._read_data_file(files_fname)
+
+ points, labels = [], []
+ for f in files:
+ h5_fname = osp.join(self.data_dir, osp.split(f)[-1])
+ point, label = self._load_h5_file(h5_fname)
+ points.append(point)
+ labels.append(label)
+ self.points = np.concatenate(points, 0)
+ self.labels = np.concatenate(labels, 0)
+ logger.info("Load {} data finished".format(self.mode))
+
+ def get_reader(self, batch_size, num_points, shuffle=True):
+ self.num_points = min(num_points, self.points.shape[1])
+ points = self.points
+ labels = self.labels
+ if shuffle and self.mode == 'train':
+ idxs = np.arange(len(self.points))
+ np.random.shuffle(idxs)
+ points = points[idxs]
+ labels = labels[idxs]
+
+ def reader():
+ batch_out = []
+ for point, label in zip(points, labels):
+ p = point.copy()
+ l = label.copy()
+ pt_idxs = np.arange(self.num_points)
+ if shuffle:
+ np.random.shuffle(pt_idxs)
+ c_points = p[pt_idxs]
+ if self.transforms is not None:
+ for trans in self.transforms:
+ c_points = trans(c_points)
+
+ xyz = c_points[:, :3]
+ # modelnet40 only have xyz features
+ # feature = c_points[:, 3:]
+ label = l[:, np.newaxis]
+ batch_out.append((xyz, label))
+
+ if len(batch_out) == batch_size:
+ yield batch_out
+ batch_out = []
+ return reader
+
+
+def _term_reader(signum, frame):
+ logger.info('pid {} terminated, terminate reader process '
+ 'group {}...'.format(os.getpid(), os.getpgrp()))
+ os.killpg(os.getpgid(os.getpid()), signal.SIGKILL)
+
+signal.signal(signal.SIGINT, _term_reader)
+signal.signal(signal.SIGTERM, _term_reader)
+
diff --git a/PaddleCV/Paddle3D/PointNet++/dataset/Indoor3DSemSeg/download.sh b/PaddleCV/Paddle3D/PointNet++/dataset/Indoor3DSemSeg/download.sh
new file mode 100644
index 0000000000000000000000000000000000000000..27a889806416ef56e09660058d5db1da1f0de725
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointNet++/dataset/Indoor3DSemSeg/download.sh
@@ -0,0 +1,8 @@
+DIR="$( cd "$(dirname "$0")" ; pwd -P )"
+cd "$DIR"
+
+echo "Downloading https://shapenet.cs.stanford.edu/media/indoor3d_sem_seg_hdf5_data.zip"
+wget https://shapenet.cs.stanford.edu/media/indoor3d_sem_seg_hdf5_data.zip
+
+echo "Unzip indoor3d_sem_seg_hdf5_data.zip"
+unzip indoor3d_sem_seg_hdf5_data.zip
diff --git a/PaddleCV/Paddle3D/PointNet++/dataset/ModelNet40/download.sh b/PaddleCV/Paddle3D/PointNet++/dataset/ModelNet40/download.sh
new file mode 100644
index 0000000000000000000000000000000000000000..0a6e95328eac4188cb2fee6b7f331be6e76ae16d
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointNet++/dataset/ModelNet40/download.sh
@@ -0,0 +1,8 @@
+DIR="$( cd "$(dirname "$0")" ; pwd -P )"
+cd "$DIR"
+
+echo "Downloading https://shapenet.cs.stanford.edu/media/modelnet40_ply_hdf5_2048.zip"
+wget https://shapenet.cs.stanford.edu/media/modelnet40_ply_hdf5_2048.zip
+
+echo "Unzip modelnet40_ply_hdf5_2048.zip"
+unzip modelnet40_ply_hdf5_2048.zip
diff --git a/PaddleCV/Paddle3D/PointNet++/eval_cls.py b/PaddleCV/Paddle3D/PointNet++/eval_cls.py
new file mode 100644
index 0000000000000000000000000000000000000000..a25731a658b18ec8814b8521303a90b6f5dcf02b
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointNet++/eval_cls.py
@@ -0,0 +1,148 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
+#
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
+
+import os
+import sys
+import time
+import shutil
+import argparse
+import ast
+import logging
+import numpy as np
+import paddle.fluid as fluid
+
+from models import *
+from data.data_utils import *
+from data.modelnet40_reader import ModelNet40ClsReader
+from utils import *
+
+logging.root.handlers = []
+FORMAT = '%(asctime)s-%(levelname)s: %(message)s'
+logging.basicConfig(level=logging.INFO, format=FORMAT, stream=sys.stdout)
+logger = logging.getLogger(__name__)
+
+np.random.seed(1024)
+
+
+def parse_args():
+    parser = argparse.ArgumentParser("PointNet++ classification evaluation script")
+ parser.add_argument(
+ '--model',
+ type=str,
+ default='MSG',
+        help='SSG or MSG model to evaluate, default MSG')
+ parser.add_argument(
+ '--use_gpu',
+ type=ast.literal_eval,
+ default=True,
+ help='default use gpu.')
+ parser.add_argument(
+ '--batch_size',
+ type=int,
+ default=1,
+ help='evaluation batch size, default 1')
+ parser.add_argument(
+ '--num_points',
+ type=int,
+ default=4096,
+ help='number of points in a sample, default: 4096')
+ parser.add_argument(
+ '--num_classes',
+ type=int,
+ default=40,
+        help='number of classes in dataset, default: 40')
+ parser.add_argument(
+ '--weights',
+ type=str,
+ default='checkpoints/200',
+        help='directory name of the trained snapshot to load')
+ parser.add_argument(
+ '--data_dir',
+ type=str,
+ default='dataset/ModelNet40/modelnet40_ply_hdf5_2048',
+ help='dataset directory')
+ parser.add_argument(
+ '--log_interval',
+ type=int,
+ default=100,
+ help='mini-batch interval for logging.')
+ args = parser.parse_args()
+ return args
+
+
+def eval():
+ args = parse_args()
+ print_arguments(args)
+ # check whether the installed paddle is compiled with GPU
+ check_gpu(args.use_gpu)
+
+ assert args.model in ['MSG', 'SSG'], \
+ "--model can only be 'MSG' or 'SSG'"
+
+ # build model
+ startup = fluid.Program()
+ eval_prog = fluid.Program()
+ with fluid.program_guard(eval_prog, startup):
+ with fluid.unique_name.guard():
+ eval_model = PointNet2ClsMSG(args.num_classes, args.num_points) \
+ if args.model == 'MSG' else \
+ PointNet2ClsSSG(args.num_classes, args.num_points)
+ eval_model.build_model()
+ eval_feeds = eval_model.get_feeds()
+ eval_outputs = eval_model.get_outputs()
+ eval_pyreader = eval_model.get_pyreader()
+ eval_prog = eval_prog.clone(True)
+ eval_keys, eval_values = parse_outputs(eval_outputs)
+
+ place = fluid.CUDAPlace(0) if args.use_gpu else fluid.CPUPlace()
+ exe = fluid.Executor(place)
+ exe.run(startup)
+
+    assert os.path.exists(args.weights), "weights {} does not exist.".format(args.weights)
+ def if_exist(var):
+ return os.path.exists(os.path.join(args.weights, var.name))
+ fluid.io.load_vars(exe, args.weights, eval_prog, predicate=if_exist)
+
+ eval_compile_prog = fluid.compiler.CompiledProgram(eval_prog)
+
+ # get reader
+ modelnet_reader = ModelNet40ClsReader(args.data_dir, mode='test')
+ eval_reader = modelnet_reader.get_reader(args.batch_size, args.num_points)
+ eval_pyreader.decorate_sample_list_generator(eval_reader, place)
+
+ eval_stat = Stat()
+ try:
+ eval_pyreader.start()
+ eval_iter = 0
+ eval_periods = []
+ while True:
+ cur_time = time.time()
+ eval_outs = exe.run(eval_compile_prog, fetch_list=eval_values)
+ period = time.time() - cur_time
+ eval_periods.append(period)
+ eval_stat.update(eval_keys, eval_outs)
+ if eval_iter % args.log_interval == 0:
+ log_str = ""
+ for name, value in zip(eval_keys, eval_outs):
+ log_str += "{}: {:.4f}, ".format(name, np.mean(value))
+ logger.info("[EVAL] batch {}: {}time: {:.2f}".format(eval_iter, log_str, period))
+ eval_iter += 1
+ except fluid.core.EOFException:
+ logger.info("[EVAL] Eval finished, {}average time: {:.2f}".format(eval_stat.get_mean_log(), np.mean(eval_periods[1:])))
+ finally:
+ eval_pyreader.reset()
+
+
+if __name__ == "__main__":
+ eval()
diff --git a/PaddleCV/Paddle3D/PointNet++/eval_seg.py b/PaddleCV/Paddle3D/PointNet++/eval_seg.py
new file mode 100644
index 0000000000000000000000000000000000000000..56c257bb6dee2027a49d3abe48097bb7bfd4a610
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointNet++/eval_seg.py
@@ -0,0 +1,147 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
+#
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
+
+import os
+import sys
+import time
+import shutil
+import argparse
+import ast
+import logging
+import numpy as np
+import paddle.fluid as fluid
+
+from models import *
+from data.indoor3d_reader import Indoor3DReader
+from utils import *
+
+logging.root.handlers = []
+FORMAT = '%(asctime)s-%(levelname)s: %(message)s'
+logging.basicConfig(level=logging.INFO, format=FORMAT, stream=sys.stdout)
+logger = logging.getLogger(__name__)
+
+np.random.seed(1024)
+
+
+def parse_args():
+    parser = argparse.ArgumentParser("PointNet++ semantic segmentation evaluation script")
+ parser.add_argument(
+ '--model',
+ type=str,
+ default='MSG',
+        help='SSG or MSG model to evaluate, default MSG')
+ parser.add_argument(
+ '--use_gpu',
+ type=ast.literal_eval,
+ default=True,
+ help='default use gpu.')
+ parser.add_argument(
+ '--batch_size',
+ type=int,
+ default=1,
+ help='evaluation batch size, default 1')
+ parser.add_argument(
+ '--num_points',
+ type=int,
+ default=4096,
+ help='number of points in a sample, default: 4096')
+ parser.add_argument(
+ '--num_classes',
+ type=int,
+ default=13,
+ help='number of classes in dataset, default: 13')
+ parser.add_argument(
+ '--weights',
+ type=str,
+ default='checkpoints/200',
+        help='directory name of the trained snapshot to load')
+ parser.add_argument(
+ '--data_dir',
+ type=str,
+ default='dataset/Indoor3DSemSeg/indoor3d_sem_seg_hdf5_data',
+ help='dataset directory')
+ parser.add_argument(
+ '--log_interval',
+ type=int,
+ default=100,
+ help='mini-batch interval for logging.')
+ args = parser.parse_args()
+ return args
+
+
+def eval():
+ args = parse_args()
+ print_arguments(args)
+ # check whether the installed paddle is compiled with GPU
+ check_gpu(args.use_gpu)
+
+ assert args.model in ['MSG', 'SSG'], \
+ "--model can only be 'MSG' or 'SSG'"
+
+ # build model
+ startup = fluid.Program()
+ eval_prog = fluid.Program()
+ with fluid.program_guard(eval_prog, startup):
+ with fluid.unique_name.guard():
+ eval_model = PointNet2SemSegMSG(args.num_classes, args.num_points) \
+ if args.model == 'MSG' else \
+ PointNet2SemSegSSG(args.num_classes, args.num_points)
+ eval_model.build_model()
+ eval_feeds = eval_model.get_feeds()
+ eval_outputs = eval_model.get_outputs()
+ eval_pyreader = eval_model.get_pyreader()
+ eval_prog = eval_prog.clone(True)
+ eval_keys, eval_values = parse_outputs(eval_outputs)
+
+ place = fluid.CUDAPlace(0) if args.use_gpu else fluid.CPUPlace()
+ exe = fluid.Executor(place)
+ exe.run(startup)
+
+    assert os.path.exists(args.weights), "weights {} does not exist.".format(args.weights)
+ def if_exist(var):
+ return os.path.exists(os.path.join(args.weights, var.name))
+ fluid.io.load_vars(exe, args.weights, eval_prog, predicate=if_exist)
+
+ eval_compile_prog = fluid.compiler.CompiledProgram(eval_prog)
+
+ # get reader
+ indoor_reader = Indoor3DReader(args.data_dir)
+ eval_reader = indoor_reader.get_reader(args.batch_size, args.num_points, mode='test')
+ eval_pyreader.decorate_sample_list_generator(eval_reader, place)
+
+ eval_stat = Stat()
+ try:
+ eval_pyreader.start()
+ eval_iter = 0
+ eval_periods = []
+ while True:
+ cur_time = time.time()
+ eval_outs = exe.run(eval_compile_prog, fetch_list=eval_values)
+ period = time.time() - cur_time
+ eval_periods.append(period)
+ eval_stat.update(eval_keys, eval_outs)
+ if eval_iter % args.log_interval == 0:
+ log_str = ""
+ for name, value in zip(eval_keys, eval_outs):
+ log_str += "{}: {:.4f}, ".format(name, np.mean(value))
+ logger.info("[EVAL] batch {}: {}time: {:.2f}".format(eval_iter, log_str, period))
+ eval_iter += 1
+ except fluid.core.EOFException:
+ logger.info("[EVAL] Eval finished, {}average time: {:.2f}".format(eval_stat.get_mean_log(), np.mean(eval_periods[1:])))
+ finally:
+ eval_pyreader.reset()
+
+
+if __name__ == "__main__":
+ eval()
diff --git a/PaddleCV/Paddle3D/PointNet++/ext_op/README.md b/PaddleCV/Paddle3D/PointNet++/ext_op/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..3316b51a0194f4e006d1c9455504f7d712c35c6e
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointNet++/ext_op/README.md
@@ -0,0 +1,96 @@
+# Compiling the custom operators
+
+## Code structure
+
+    - src: C++/CUDA source code of the custom operators
+    - pointnet_lib.py: Python wrappers
+    - tests: unit tests for each operator
+
+## Install PaddlePaddle
+
+Please install PaddlePaddle in one of the following ways:
+
+- Compile and install from the [Paddle develop branch](https://github.com/PaddlePaddle/Paddle/tree/develop) source; build instructions:
+
+    1. [Ubuntu](https://www.paddlepaddle.org.cn/install/doc/source/ubuntu)
+    1. [CentOS](https://www.paddlepaddle.org.cn/install/doc/source/centos)
+    1. [macOS](https://www.paddlepaddle.org.cn/install/doc/source/macos)
+    1. [Windows](https://www.paddlepaddle.org.cn/install/doc/source/windows)
+
+    **Note:** Building inside docker is recommended.
+
+- Install a Paddle develop [daily version wheel](https://www.paddlepaddle.org.cn/install/doc/tables#多版本whl包列表-dev-11)
+
+    **NOTE:** The gcc version used to compile the custom operators must match the gcc version used to build Paddle. The Paddle develop daily version is currently built with **gcc 4.8.2**; if you use the daily version, please also compile the custom operators with **gcc 4.8.2**, otherwise compatibility problems may occur.
+
+## Compile the custom operators
+
+The C++/CUDA code of the custom operators must be compiled into a dynamic library. make.sh compiles it with g++/nvcc; you can also write your own Makefile or CMake configuration.
+
+Compilation needs to include PaddlePaddle's header files and link against PaddlePaddle's libraries. The header and library paths can be obtained with the following commands:
+
+```
+# python
+>>> import paddle
+>>> print(paddle.sysconfig.get_include())
+/paddle/pyenv/local/lib/python2.7/site-packages/paddle/include
+>>> print(paddle.sysconfig.get_lib())
+/paddle/pyenv/local/lib/python2.7/site-packages/paddle/libs
+```
+
+We provide the following build script for the dynamic library:
+
+```
+cd src
+sh make.sh
+```
+
+The build finally produces `pointnet_lib.so`.
+
+**Note:** If PaddlePaddle was compiled from source without setting `WITH_MKLDNN` in `cmake`, compiling the custom operators will fail with errors such as `mkldnn.h` not found. In that case, remove the `-DPADDLE_WITH_MKLDNN` option from the compile commands in `make.sh`.
+
+## Set environment variables
+
+Paddle's core libraries must be added to `LD_LIBRARY_PATH`. First run the following to obtain the library path:
+
+```
+import paddle
+print(paddle.sysconfig.get_lib())
+```
+
+The dynamic library path can then be added as follows:
+
+```
+export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:`python -c 'import paddle; print(paddle.sysconfig.get_lib())'`
+```
+
+## Run the unit tests
+
+Run the following unit tests to make sure the custom operators can be used correctly in a network:
+
+```
+# go back to the ext_op directory and add it to PYTHONPATH
+cd ..
+export PYTHONPATH=$PYTHONPATH:`pwd`
+
+# run the unit tests
+python tests/test_farthest_point_sampling_op.py
+python tests/test_gather_point_op.py
+python tests/test_group_points_op.py
+python tests/test_query_ball_op.py
+python tests/test_three_interp_op.py
+python tests/test_three_nn_op.py
+```
+
+A successful unit test run prints a message like the following:
+
+```
+.
+----------------------------------------------------------------------
+Ran 1 test in 13.205s
+
+OK
+```
+
+For more on how to create custom C++ operators outside the framework, see the [official documentation](https://www.paddlepaddle.org.cn/documentation/docs/zh/advanced_usage/index_cn.html).
diff --git a/PaddleCV/Paddle3D/PointNet++/ext_op/__init__.py b/PaddleCV/Paddle3D/PointNet++/ext_op/__init__.py
new file mode 100644
index 0000000000000000000000000000000000000000..4afd4332108896cd690efda3fc6f0b1eb86fb086
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointNet++/ext_op/__init__.py
@@ -0,0 +1,18 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
+#
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
+
+from . import pointnet_lib
+from .pointnet_lib import *
+
+__all__ = pointnet_lib.__all__
diff --git a/PaddleCV/Paddle3D/PointNet++/ext_op/pointnet_lib.py b/PaddleCV/Paddle3D/PointNet++/ext_op/pointnet_lib.py
new file mode 100644
index 0000000000000000000000000000000000000000..5f607bf8775a6c0b20440f1635ae1c05b2ad8f07
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointNet++/ext_op/pointnet_lib.py
@@ -0,0 +1,264 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
+#
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
+
+import os
+import paddle.fluid as fluid
+
+file_dir = os.path.dirname(os.path.abspath(__file__))
+fluid.load_op_library(os.path.join(file_dir, 'src/pointnet_lib.so'))
+
+from paddle.fluid.layer_helper import LayerHelper
+
+__all__ = ['three_nn', 'three_interp', 'query_ball', 'gather_point',
+ 'farthest_point_sampling', 'group_points']
+
+
+def three_nn(input, known, eps=1e-10, name=None):
+ """
+ **Three Nearest Neighbor Layer**
+
+    This operator finds the top-3 nearest neighbors of each point
+    coordinate specified by Input(X) among the known point coordinates
+    specified by Input(Known), and calculates the distances to these
+    nearest neighbors.
+
+ Args:
+ input (Variable): The input tensor of three_nn operator. This
+ is a 3-D tensor with shape of [B, N, 3].
+ known (Variable): The input tensor of known points of three_nn
+ operator. This is a 3-D tensor with shape of
+ [B, M, 3].
+ name(str|None): A name for this layer(optional). If set None, the layer
+ will be named automatically.
+
+ Returns:
+ distance (Variable): The output distance tensor of three_nn operator.
+ This is a 3-D tensor with shape of [B, N, 3].
+ idx (Variable): The output index tensor of three_nn operator.
+ This is a 3-D tensor with shape of [B, N, 3].
+
+ Examples:
+
+ .. code-block:: python
+
+ import paddle.fluid as fluid
+ x = fluid.layers.data(name='x', shape=[16, 3], dtype='float32')
+ known = fluid.layers.data(name='known', shape=[32, 3], dtype='float32')
+            distance, idx = three_nn(x, known)
+ """
+ helper = LayerHelper('three_nn', **locals())
+ dtype = helper.input_dtype()
+ dist = helper.create_variable_for_type_inference(dtype)
+ idx = helper.create_variable_for_type_inference(dtype)
+ helper.append_op(
+ type="three_nn",
+ inputs={"X": input,
+ "Known": known},
+ outputs={"Distance": dist,
+ "Idx": idx},
+ attrs={'eps': eps})
+ return (dist, idx)
+
+
+def three_interp(input, weight, idx, name=None):
+ """
+ **Three Interpolate Layer**
+
+    This operator calculates interpolation results from input, weight and
+    index: out[b, n, c] = sum_k weight[b, n, k] * input[b, idx[b, n, k], c].
+
+ Args:
+ input (Variable): The input tensor of three_interp operator. This
+ is a 3-D tensor with shape of [B, M, C].
+ weight (Variable): The weight tensor of three_interp operator. This
+ is a 3-D tensor with shape of [B, N, 3].
+ idx (Variable): The index tensor of three_interp operator. This
+ is a 3-D tensor with shape of [B, N, 3].
+ name(str|None): A name for this layer(optional). If set None, the layer
+ will be named automatically.
+
+ Returns:
+ output (Variable): The output tensor of three_interp operator.
+ This is a 3-D tensor with shape of [B, N, C].
+
+ Examples:
+
+ .. code-block:: python
+
+ import paddle.fluid as fluid
+ x = fluid.layers.data(name='x', shape=[16, 3], dtype='float32')
+ weight = fluid.layers.data(name='weight', shape=[32, 3], dtype='float32')
+ index = fluid.layers.data(name='index', shape=[32, 3], dtype='int32')
+            out = three_interp(x, weight, index)
+ """
+ helper = LayerHelper('three_interp', **locals())
+ dtype = helper.input_dtype()
+ out = helper.create_variable_for_type_inference(dtype)
+ helper.append_op(
+ type="three_interp",
+ inputs={"X": input,
+ "Weight": weight,
+ "Idx": idx},
+ outputs={"Out": out, })
+ return out
+
+
+def query_ball(input, new_points, radius, n_sample):
+ """
+ **Query Ball Layer**
+
+    Output is a tensor with the indices of the features that form the query balls.
+
+ Args:
+ input(Variable): XYZ coordinates of features with shape of [B,N,3].
+ new_points(Variable): Centers coordinates of the ball query with shape of [B,M,3].
+ radius(float|Variable): Radius of the balls.
+ n_sample(int|Variable): Maximum number of features in the balls.
+    Return:
+        output(Variable): Tensor with the indices of the features that form the query balls, with shape of [B, M, n_sample]
+
+    Examples:
+        .. code-block:: python
+
+            import paddle.fluid as fluid
+            x = fluid.layers.data(name='points', shape=[-1, 5, 3], dtype='float32')
+            new_points = fluid.layers.data(name='new_points', shape=[-1, 2, 3], dtype='float32')
+            output = query_ball(x, new_points, radius=4.0, n_sample=5)
+    """
+ helper = LayerHelper('query_ball', **locals())
+ dtype = helper.input_dtype()
+ out = helper.create_variable_for_type_inference(dtype)
+ helper.append_op(
+ type="query_ball",
+ inputs={"Points": input,
+ "New_Points": new_points},
+ attrs={"N_sample": n_sample,
+ "Radius": radius},
+ outputs={"Output": out})
+ return out
+
+
+def farthest_point_sampling(input, sampled_point_num):
+    '''
+    Sample points based on their maximum euclidean distance to the
+    already-selected points.
+
+    Args:
+        input (Variable): input point cloud dataset with shape (B, N, 3)
+                          B is batch size, N is the number of points, 3 is the (x, y, z) coordinate
+        sampled_point_num (int): number of points to sample
+
+    Return:
+        output (Variable): int32 index tensor of the sampled points, with shape (B, M)
+                           B is batch size, M is sampled_point_num
+
+    Examples:
+        .. code-block:: python
+
+            x = fluid.layers.data(name='data', shape=(2, 100, 3), dtype='float32')
+            sampled_point_index = farthest_point_sampling(x, 50)
+    '''
+
+ helper = LayerHelper('farthest_point_sampling', **locals())
+ dtype = input.dtype
+ op_out = helper.create_variable_for_type_inference(dtype)
+ helper.append_op(
+ type='farthest_point_sampling',
+ inputs={'X': input},
+ outputs={'Output': op_out},
+ attrs={'sampled_point_num': sampled_point_num})
+ return op_out
+
+
+def gather_point(input, index):
+    """
+    **Gather Point Layer**
+    Output is obtained by gathering entries of X indexed by `index`
+    and concatenating them together.
+    .. math::
+        Out = X[Index]
+    .. code-block:: text
+        Given:
+        X = [[1, 2, 3],
+             [3, 4, 5],
+             [5, 6, 7]]
+        Index = [1, 2]
+        Then:
+        Out = [[3, 4, 5],
+               [5, 6, 7]]
+    Args:
+        input (Variable): The source input with rank>=1. This
+                          is a 3-D tensor with shape of [B, N, 3].
+        index (Variable): The index input with shape of [B, M].
+
+    Returns:
+        output (Variable): The output is a tensor with shape of [B, M, 3].
+    Examples:
+        .. code-block:: python
+            import paddle.fluid as fluid
+            x = fluid.layers.data(name='x', shape=[-1, 5, 3], dtype='float32')
+            index = fluid.layers.data(name='index', shape=[-1, 1], dtype='int32')
+            output = gather_point(x, index)
+    """
+
+ helper = LayerHelper('gather_point', **locals())
+ dtype = helper.input_dtype()
+ out = helper.create_variable_for_type_inference(dtype)
+ helper.append_op(
+ type="gather_point",
+ inputs={"X": input,
+ "Index": index},
+ outputs={"Output": out})
+ return out
+
+
+def group_points(input, idx, name=None):
+    """
+    **Group Points Layer**
+
+    This operator groups input points by index.
+
+    Args:
+        input (Variable): The input tensor of the group_points operator. This
+                          is a 3-D tensor with shape of [B, N, C].
+        idx (Variable): The index tensor of the group_points operator. This
+                        is a 3-D tensor with shape of [B, M, S].
+        name(str|None): A name for this layer(optional). If set None, the layer
+                        will be named automatically.
+
+    Returns:
+        output (Variable): The output tensor of the group_points operator.
+                           This is a 4-D tensor with shape of [B, M, S, C].
+
+    Examples:
+
+        .. code-block:: python
+
+            import paddle.fluid as fluid
+            x = fluid.layers.data(name='x', shape=[16, 3], dtype='float32')
+            index = fluid.layers.data(name='index', shape=[32, 3], dtype='int32')
+            out = group_points(x, index)
+    """
+ helper = LayerHelper('group_points', **locals())
+ dtype = helper.input_dtype()
+ out = helper.create_variable_for_type_inference(dtype)
+ helper.append_op(
+ type="group_points",
+ inputs={"X": input,
+ "Idx": idx},
+ outputs={"Out": out, })
+ return out
diff --git a/PaddleCV/Paddle3D/PointNet++/ext_op/src/farthest_point_sampling_op.cc b/PaddleCV/Paddle3D/PointNet++/ext_op/src/farthest_point_sampling_op.cc
new file mode 100644
index 0000000000000000000000000000000000000000..ace1e01c9475b20bb13019d211e39770f7160bac
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointNet++/ext_op/src/farthest_point_sampling_op.cc
@@ -0,0 +1,69 @@
+/* Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+ http://www.apache.org/licenses/LICENSE-2.0
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License. */
+
+#include <memory>
+#include <string>
+#include <vector>
+#include "paddle/fluid/framework/op_registry.h"
+
+namespace paddle {
+namespace operators {
+
+using Tensor = framework::Tensor;
+
+class FarthestPointSamplingOpMaker : public framework::OpProtoAndCheckerMaker {
+public:
+  void Make() override {
+    AddInput("X",
+             "(Tensor) input point cloud dataset with shape (B, N, 3), "
+             "B is batch size, N is the number of points, 3 is the (x, y, z) coordinate");
+    AddOutput("Output",
+              "(Tensor) index tensor of the sampled points with shape (B, M), "
+              "B is batch size, M is the number of sampled points");
+    AddAttr<int>("sampled_point_num", "number of points to sample")
+        .SetDefault(0)
+        .EqualGreaterThan(0);
+    AddComment(
+        R"Doc(
+        Sample points based on
+        their max euclidean distance to other points.)Doc");
+  }
+};
+
+class FarthestPointSamplingOp : public framework::OperatorWithKernel {
+public:
+ using framework::OperatorWithKernel::OperatorWithKernel;
+
+protected:
+  void InferShape(framework::InferShapeContext *ctx) const override {
+    PADDLE_ENFORCE(ctx->HasInput("X"), "Input(X) should not be null");
+    auto x_dims = ctx->GetInputDim("X");
+    PADDLE_ENFORCE(x_dims.size() == 3,
+                   "Input(X) of FarthestPointSamplingOp should be a 3-D Tensor");
+    const int m = ctx->Attrs().Get<int>("sampled_point_num");
+    ctx->SetOutputDim("Output", {x_dims[0], m});
+  }
+
+protected:
+  framework::OpKernelType GetExpectedKernelType(
+      const framework::ExecutionContext &ctx) const override {
+    auto input_data_type = ctx.Input<Tensor>("X")->type();
+    return framework::OpKernelType(input_data_type, ctx.GetPlace());
+  }
+};
+
+} // namespace operators
+} // namespace paddle
+
+namespace ops = paddle::operators;
+REGISTER_OPERATOR(farthest_point_sampling,
+ ops::FarthestPointSamplingOp,
+ ops::FarthestPointSamplingOpMaker);
diff --git a/PaddleCV/Paddle3D/PointNet++/ext_op/src/farthest_point_sampling_op.cu b/PaddleCV/Paddle3D/PointNet++/ext_op/src/farthest_point_sampling_op.cu
new file mode 100644
index 0000000000000000000000000000000000000000..56515254991d09b66335f202546a176411ade2f2
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointNet++/ext_op/src/farthest_point_sampling_op.cu
@@ -0,0 +1,151 @@
+/* Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+ http://www.apache.org/licenses/LICENSE-2.0
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License. */
+
+#include "paddle/fluid/framework/eigen.h"
+#include "paddle/fluid/framework/op_registry.h"
+
+namespace paddle {
+namespace operators {
+
+using Tensor = framework::Tensor;
+
+template <typename T, unsigned int block_size>
+__global__ void farthestpointsamplingKernel(int b,
+                                            int n,
+                                            int m,
+                                            const T *__restrict__ dataset,
+                                            T *__restrict__ temp,
+                                            int *__restrict__ idxs) {
+ // 1. add first point
+ // 2. add the point having farthest distance with first point's
+ // 3. make second point as first point, repeat 1,2
+ if (m <= 0) return;
+ const int BlockSize = block_size;
+ __shared__ float dists[BlockSize];
+ __shared__ int dists_i[BlockSize];
+ const int BufferSize = 3072;
+ __shared__ float buf[BufferSize * 3];
+
+ // one block one batch, n points
+ // one thread one point
+ for (int i = blockIdx.x; i < b; i += gridDim.x) {
+ // can select old point as first point randomly
+ int old = 0;
+ if (threadIdx.x == 0) idxs[i * m + 0] = old;
+
+ for (int j = threadIdx.x; j < n; j += blockDim.x) {
+ temp[blockIdx.x * n + j] = 1e38;
+ }
+ for (int j = threadIdx.x; j < min(BufferSize, n) * 3; j += blockDim.x) {
+ buf[j] = dataset[i * n * 3 + j];
+ }
+ // wait all threads do this in the same block
+ __syncthreads();
+
+ // out m points
+ for (int j = 1; j < m; j++) {
+ // Step 1.
+      // farthest distance
+ int besti = 0;
+ float best = -1;
+ // first point in m points
+ float x1 = dataset[i * n * 3 + old * 3 + 0];
+ float y1 = dataset[i * n * 3 + old * 3 + 1];
+ float z1 = dataset[i * n * 3 + old * 3 + 2];
+
+ // Step 2.
+ // find farthest point of (x1, y1, z1)
+ for (int k = threadIdx.x; k < n; k += blockDim.x) {
+ float td = temp[blockIdx.x * n + k];
+ float x2, y2, z2;
+ if (k < BufferSize) {
+ x2 = buf[k * 3 + 0];
+ y2 = buf[k * 3 + 1];
+ z2 = buf[k * 3 + 2];
+ } else {
+ x2 = dataset[i * n * 3 + k * 3 + 0];
+ y2 = dataset[i * n * 3 + k * 3 + 1];
+ z2 = dataset[i * n * 3 + k * 3 + 2];
+ }
+        // compute euclidean distance
+ float d = (x2 - x1) * (x2 - x1) + (y2 - y1) * (y2 - y1) +
+ (z2 - z1) * (z2 - z1);
+ float d2 = min(d, td);
+ if (d2 != td) temp[blockIdx.x * n + k] = d2;
+ if (d2 > best) {
+ best = d2;
+ besti = k;
+ }
+ }
+
+ // step 3.
+ dists[threadIdx.x] = best;
+ dists_i[threadIdx.x] = besti;
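+      // parallel max-reduction over the shared-memory buffers: after the
+      // loop, dists[0] / dists_i[0] hold the farthest distance and its index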
+ for (int u = 0; (1 << u) < blockDim.x; u++) {
+ __syncthreads();
+ if (threadIdx.x < (blockDim.x >> (u + 1))) {
+ int i1 = (threadIdx.x * 2) << u;
+ int i2 = (threadIdx.x * 2 + 1) << u;
+ if (dists[i1] < dists[i2]) {
+ dists[i1] = dists[i2];
+ dists_i[i1] = dists_i[i2];
+ }
+ }
+ }
+ __syncthreads();
+ // store the found node index
+ old = dists_i[0];
+ if (threadIdx.x == 0) idxs[i * m + j] = old;
+ }
+ }
+}
+
+template <typename T>
+class FarthestPointSamplingOpCUDAKernel : public framework::OpKernel<T> {
+public:
+ void Compute(const framework::ExecutionContext &ctx) const override {
+ PADDLE_ENFORCE(platform::is_gpu_place(ctx.GetPlace()),
+ "This kernel only runs on GPU device.");
+    auto *input = ctx.Input<Tensor>("X");
+    auto *output = ctx.Output<Tensor>("Output");
+    if (input->numel() == 0) return;
+    // allocate memory
+    auto *ptr_out_points_index = output->mutable_data<int>(ctx.GetPlace());
+
+ // b, n, m
+ int batch_size = input->dims()[0];
+ int in_n_points = input->dims()[1];
+    int out_m_points = ctx.Attr<int>("sampled_point_num");
+
+    const T *ptr_in_points = input->data<T>();
+
+ Tensor tmp;
+    auto *ptr_tmp_e =
+        tmp.mutable_data<T>({batch_size, in_n_points}, ctx.GetPlace());
+
+    // run farthest point sampling kernel
+    // P40 supports at most 512 threads per block
+    farthestpointsamplingKernel<T, 512><<<32, 512>>>(batch_size,
+ in_n_points,
+ out_m_points,
+ ptr_in_points,
+ ptr_tmp_e,
+ ptr_out_points_index);
+ }
+};
+
+} // namespace operators
+} // namespace paddle
+
+namespace ops = paddle::operators;
+REGISTER_OP_CUDA_KERNEL(farthest_point_sampling,
+                        ops::FarthestPointSamplingOpCUDAKernel<float>,
+                        ops::FarthestPointSamplingOpCUDAKernel<double>);
diff --git a/PaddleCV/Paddle3D/PointNet++/ext_op/src/gather_point_op.cc b/PaddleCV/Paddle3D/PointNet++/ext_op/src/gather_point_op.cc
new file mode 100644
index 0000000000000000000000000000000000000000..0f41f1b3ad7cfc22e7fa7abfa8cbfa277ad9b136
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointNet++/ext_op/src/gather_point_op.cc
@@ -0,0 +1,118 @@
+/* Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+ http://www.apache.org/licenses/LICENSE-2.0
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License. */
+
+#include "paddle/fluid/framework/op_registry.h"
+namespace paddle {
+namespace operators {
+
+using Tensor = framework::Tensor;
+
+class GatherPointOp : public framework::OperatorWithKernel {
+public:
+ using framework::OperatorWithKernel::OperatorWithKernel;
+
+ void InferShape(framework::InferShapeContext* ctx) const override {
+ PADDLE_ENFORCE(ctx->HasInput("X"), "Input(X) shoud not be null");
+ auto x_dims = ctx->GetInputDim("X");
+ PADDLE_ENFORCE(x_dims.size() == 3 && x_dims[2] == 3,
+ "Input(X) of GatherPointOp should be 3-D Tensor, the last "
+ "dimension must be 3");
+ auto index_dims = ctx->GetInputDim("Index");
+ PADDLE_ENFORCE(index_dims.size() == 2 && index_dims[0] == x_dims[0],
+ "Index of GatherPointop should be 2-D Tensor");
+ ctx->SetOutputDim("Output", {x_dims[0], index_dims[1], 3});
+ }
+
+protected:
+ framework::OpKernelType GetExpectedKernelType(
+ const framework::ExecutionContext& ctx) const override {
+    auto input_data_type = ctx.Input<Tensor>("X")->type();
+ return framework::OpKernelType(input_data_type, ctx.GetPlace());
+ }
+};
+
+class GatherPointOpMaker : public framework::OpProtoAndCheckerMaker {
+public:
+ void Make() override {
+ AddInput("X",
+ "Input points with shape (batch, n, 3), n is input "
+ "points's num");
+ AddInput("Index",
+ "input index with shape (batch, m), m is output points's num");
+ AddOutput("Output", "output points with shape(batch, m, 3)");
+ AddComment(
+ R"Doc(
+ Gather Point Operator.
+ Out is obtained by gathering entries of X indexed by Index and
+ concatenate them together.
+
+ Example:
+ X = [[1, 2, 3],
+ [3, 4, 5],
+ [5, 6, 7]]
+ Index = [[1, 2]]
+
+ Then:
+ Out = [[3, 4, 5],[5, 6, 7]])Doc");
+ }
+};
+
+class GatherPointOpGrad : public framework::OperatorWithKernel {
+public:
+ using framework::OperatorWithKernel::OperatorWithKernel;
+
+protected:
+ void InferShape(framework::InferShapeContext* ctx) const override {
+ PADDLE_ENFORCE(ctx->HasInput("Index"), "Input(Index) should not be null");
+ PADDLE_ENFORCE(ctx->HasInput(framework::GradVarName("Output")),
+ "Input(Output@GRAD) should not be null");
+ auto dim_x = ctx->GetInputDim("X");
+ if (ctx->HasOutput(framework::GradVarName("X"))) {
+ ctx->SetOutputDim(framework::GradVarName("X"), dim_x);
+ }
+ }
+
+ framework::OpKernelType GetExpectedKernelType(
+ const framework::ExecutionContext& ctx) const override {
+ return framework::OpKernelType(
+ ctx.Input(framework::GradVarName("Output"))->type(),
+ ctx.GetPlace());
+ }
+};
+
+template <typename T>
+class GatherPointGradDescMaker : public framework::SingleGradOpMaker<T> {
+public:
+  using framework::SingleGradOpMaker<T>::SingleGradOpMaker;
+
+protected:
+  std::unique_ptr<T> Apply() const override {
+ auto* op = new T();
+ op->SetType("gather_point_grad");
+ op->SetInput("X", this->Input("X"));
+ op->SetInput("Index", this->Input("Index"));
+ op->SetInput(framework::GradVarName("Output"), this->OutputGrad("Output"));
+ op->SetOutput(framework::GradVarName("X"), this->InputGrad("X"));
+ op->SetAttrMap(this->Attrs());
+    return std::unique_ptr<T>(op);
+ }
+};
+
+} // namespace operators
+} // namespace paddle
+
+namespace ops = paddle::operators;
+REGISTER_OPERATOR(gather_point,
+ ops::GatherPointOp,
+ ops::GatherPointOpMaker,
+                  ops::GatherPointGradDescMaker<paddle::framework::OpDesc>,
+                  ops::GatherPointGradDescMaker<paddle::imperative::OpBase>);
+REGISTER_OPERATOR(gather_point_grad, ops::GatherPointOpGrad);
diff --git a/PaddleCV/Paddle3D/PointNet++/ext_op/src/gather_point_op.cu b/PaddleCV/Paddle3D/PointNet++/ext_op/src/gather_point_op.cu
new file mode 100644
index 0000000000000000000000000000000000000000..fa3a96f8a7061b8340d58265d9a876900bf0f234
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointNet++/ext_op/src/gather_point_op.cu
@@ -0,0 +1,126 @@
+/* Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+ http://www.apache.org/licenses/LICENSE-2.0
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License. */
+
+#include "paddle/fluid/framework/eigen.h"
+#include "paddle/fluid/framework/op_registry.h"
+#include "paddle/fluid/platform/cuda_primitives.h"
+
+#include "util.cu.h"
+
+namespace paddle {
+namespace operators {
+
+using Tensor = framework::Tensor;
+
+template <typename T>
+__global__ void GatherPointKernel(int b,
+ int n,
+ int m,
+ const T *__restrict__ inp,
+ const int *__restrict__ idx,
+ T *__restrict__ out) {
+ for (int i = blockIdx.x; i < b; i += gridDim.x) {
+ for (int j = blockIdx.y * blockDim.x + threadIdx.x; j < m;
+ j += blockDim.x * gridDim.y) {
+ int a = idx[i * m + j];
+ for (int k = 0; k < 3; k++) {
+ out[(i * m + j) * 3 + k] = inp[(i * n + a) * 3 + k];
+ }
+ }
+ }
+}
+
+template <typename T>
+__global__ void GatherPointGradKernel(int b,
+ int n,
+ int m,
+ const T *__restrict__ out_grad,
+ const int *__restrict__ idx,
+ T *__restrict__ in_grad) {
+ for (int i = blockIdx.x; i < b; i += gridDim.x) {
+ for (int j = blockIdx.y * blockDim.x + threadIdx.x; j < m;
+ j += blockDim.x * gridDim.y) {
+ int a = idx[i * m + j];
+ const T *out_grad_pos = &out_grad[(i * m + j) * 3];
+ T *in_grad_pos = &in_grad[(i * n + a) * 3];
+ for (int k = 0; k < 3; k++) {
+ platform::CudaAtomicAdd(&in_grad_pos[k], out_grad_pos[k]);
+ }
+ }
+ }
+}
+
+template <typename T>
+class GatherPointOpCUDAKernel : public framework::OpKernel<T> {
+public:
+ void Compute(const framework::ExecutionContext &ctx) const override {
+ PADDLE_ENFORCE(platform::is_gpu_place(ctx.GetPlace()),
+ "This kernel only runs on GPU device.");
+ auto *points = ctx.Input("X");
+ auto *index = ctx.Input("Index");
+ auto *output = ctx.Output("Output");
+
+ if (points->numel() == 0) return;
+
+    const T *p_points = points->data<T>();
+    const int *p_index = index->data<int>();
+    T *p_out_points = output->mutable_data<T>(ctx.GetPlace());
+
+ int batch_size = points->dims()[0];
+ int n_points = points->dims()[1];
+ int m_points = index->dims()[1];
+
+    GatherPointKernel<T><<<dim3(2, 8, 1), 512>>>(
+ batch_size, n_points, m_points, p_points, p_index, p_out_points);
+ }
+};
+
+template <typename T>
+class GatherPointGradOpCUDAKernel : public framework::OpKernel<T> {
+public:
+ void Compute(const framework::ExecutionContext &ctx) const override {
+ auto *points = ctx.Input("X");
+ auto *index = ctx.Input("Index");
+ auto *output_grad = ctx.Input(framework::GradVarName("Output"));
+ auto *points_grad = ctx.Output(framework::GradVarName("X"));
+
+ if (points->numel() == 0) return;
+
+    const T *p_output_grad = output_grad->data<T>();
+    const int *p_index = index->data<int>();
+    T *p_points_grad = points_grad->mutable_data<T>(ctx.GetPlace());
+ int pnum = points_grad->numel();
+
+ auto &dev_ctx = ctx.template device_context();
+ Zero<<<(pnum + 512 - 1) / 512, 512, 0, dev_ctx.stream()>>>(p_points_grad,
+ pnum);
+
+ int batch_size = points->dims()[0];
+ int n_points = points->dims()[1];
+ int m_points = index->dims()[1];
+
+    GatherPointGradKernel<T><<<dim3(2, 8, 1), 512>>>(
+ batch_size, n_points, m_points, p_output_grad, p_index, p_points_grad);
+ }
+};
+
+} // namespace operators
+} // namespace paddle
+
+namespace ops = paddle::operators;
+REGISTER_OP_CUDA_KERNEL(gather_point,
+                        ops::GatherPointOpCUDAKernel<float>,
+                        ops::GatherPointOpCUDAKernel<double>,
+                        ops::GatherPointOpCUDAKernel<int>);
+REGISTER_OP_CUDA_KERNEL(gather_point_grad,
+                        ops::GatherPointGradOpCUDAKernel<float>,
+                        ops::GatherPointGradOpCUDAKernel<double>,
+                        ops::GatherPointGradOpCUDAKernel<int>);
diff --git a/PaddleCV/Paddle3D/PointNet++/ext_op/src/group_points_op.cc b/PaddleCV/Paddle3D/PointNet++/ext_op/src/group_points_op.cc
new file mode 100644
index 0000000000000000000000000000000000000000..7266c553b2d2da95a8fa6355a0ffa2250ba01f71
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointNet++/ext_op/src/group_points_op.cc
@@ -0,0 +1,124 @@
+/* Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+ http://www.apache.org/licenses/LICENSE-2.0
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License. */
+
+#include <memory>
+#include <string>
+#include <vector>
+#include "paddle/fluid/framework/op_registry.h"
+
+namespace paddle {
+namespace operators {
+
+using framework::Tensor;
+
+class GroupPointsOp : public framework::OperatorWithKernel {
+ public:
+ using framework::OperatorWithKernel::OperatorWithKernel;
+
+ protected:
+ void InferShape(framework::InferShapeContext* ctx) const override {
+ PADDLE_ENFORCE(ctx->HasInput("X"),
+ "Input(X) of GroupPointsOp should not be null.");
+ PADDLE_ENFORCE(ctx->HasInput("Idx"),
+ "Input(Idx) of GroupPointsOp should not be null.");
+ PADDLE_ENFORCE(ctx->HasOutput("Out"),
+ "Output(Out) of GroupPointsOp should not be null.");
+
+ auto dim_x = ctx->GetInputDim("X"); // [B, C, N]
+ PADDLE_ENFORCE_EQ(dim_x.size(), 3, "X's dimension must be 3");
+
+ auto dim_idx = ctx->GetInputDim("Idx"); // [B, npoints, nsample]
+ PADDLE_ENFORCE_EQ(dim_idx.size(), 3, "Idx's dimension must be 3");
+
+ PADDLE_ENFORCE_EQ(dim_x[0], dim_idx[0],
+ "X and Idx dim[0] should be equal.");
+
+ // output: [B, C, M, S]
+    std::vector<int64_t> dim_out({dim_x[0], dim_x[1], dim_idx[1], dim_idx[2]});
+ ctx->SetOutputDim("Out", framework::make_ddim(dim_out));
+ }
+
+ protected:
+ framework::OpKernelType GetExpectedKernelType(
+ const framework::ExecutionContext& ctx) const override {
+ return framework::OpKernelType(ctx.Input("X")->type(),
+ ctx.GetPlace());
+ }
+};
+
+class GroupPointsOpMaker : public framework::OpProtoAndCheckerMaker {
+ public:
+ void Make() override {
+ AddInput("X",
+ "The input tensor of group_points operator. "
+ "This is a 3-D tensor with shape of [B, C, N].");
+ AddInput("Idx",
+ "The input tensor of nearest neighbor index of group_points "
+ "operator. This is a 3-D tensor with shape of [B, M, S].");
+ AddOutput("Out",
+ "The output tensor of group_points operator. "
+ "This is a 4-D tensor with shape of [B, C, M, S].");
+
+ AddComment(R"DOC(
+          This operator groups the input points by index.
+ )DOC");
+ }
+};
+
+class GroupPointsOpGrad : public framework::OperatorWithKernel {
+ public:
+ using framework::OperatorWithKernel::OperatorWithKernel;
+
+ protected:
+ void InferShape(framework::InferShapeContext* ctx) const override {
+ PADDLE_ENFORCE(ctx->HasInput("Idx"), "Input(Idx) should not be null");
+ PADDLE_ENFORCE(ctx->HasInput(framework::GradVarName("Out")),
+ "Input(Out@GRAD) should not be null");
+ auto dim_x = ctx->GetInputDim("X");
+ if (ctx->HasOutput(framework::GradVarName("X"))) {
+ ctx->SetOutputDim(framework::GradVarName("X"), dim_x);
+ }
+ }
+
+ framework::OpKernelType GetExpectedKernelType(
+ const framework::ExecutionContext& ctx) const override {
+ return framework::OpKernelType(
+ ctx.Input(framework::GradVarName("Out"))->type(),
+ ctx.GetPlace());
+ }
+};
+
+template
+class GroupPointsGradDescMaker : public framework::SingleGradOpMaker {
+ public:
+ using framework::SingleGradOpMaker::SingleGradOpMaker;
+
+ protected:
+ std::unique_ptr Apply() const override {
+ auto* op = new T();
+ op->SetType("group_points_grad");
+ op->SetInput("X", this->Input("X"));
+ op->SetInput("Idx", this->Input("Idx"));
+ op->SetInput(framework::GradVarName("Out"), this->OutputGrad("Out"));
+ op->SetOutput(framework::GradVarName("X"), this->InputGrad("X"));
+ op->SetAttrMap(this->Attrs());
+ return std::unique_ptr(op);
+ }
+};
+
+} // namespace operators
+} // namespace paddle
+
+namespace ops = paddle::operators;
+REGISTER_OPERATOR(group_points, ops::GroupPointsOp, ops::GroupPointsOpMaker,
+ ops::GroupPointsGradDescMaker,
+ ops::GroupPointsGradDescMaker);
+REGISTER_OPERATOR(group_points_grad, ops::GroupPointsOpGrad);
diff --git a/PaddleCV/Paddle3D/PointNet++/ext_op/src/group_points_op.cu b/PaddleCV/Paddle3D/PointNet++/ext_op/src/group_points_op.cu
new file mode 100644
index 0000000000000000000000000000000000000000..0d7e02898f3dc68f59215b89356bb56b957b524a
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointNet++/ext_op/src/group_points_op.cu
@@ -0,0 +1,144 @@
+/* Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+ http://www.apache.org/licenses/LICENSE-2.0
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License. */
+
+#include "paddle/fluid/framework/op_registry.h"
+#include "paddle/fluid/platform/cuda_primitives.h"
+
+#include "util.cu.h"
+
+#define TOTAL_THREADS 1024
+#define THREADS_PER_BLOCK 256
+#define DIVUP(m, n) ((m) / (n) + ((m) % (n) > 0))
+
+namespace paddle {
+namespace operators {
+
+using framework::Tensor;
+
+template <typename T>
+__global__ void KeGroupPointsFW(int b, int c, int n, int npoints, int nsample,
+ const T* __restrict__ points,
+ const int* __restrict__ idx,
+ T* __restrict__ out) {
+ // points: (B, C, N)
+ // idx: (B, npoints, nsample)
+ // output:
+ // out: (B, C, npoints, nsample)
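+  // each thread copies one (batch, channel, point, sample) element; the
+  // launch grid is (ceil(npoints * nsample / blockDim.x), C, B)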
+ int bs_idx = blockIdx.z;
+ int c_idx = blockIdx.y;
+ int index = blockIdx.x * blockDim.x + threadIdx.x;
+ int pt_idx = index / nsample;
+ if (bs_idx >= b || c_idx >= c || pt_idx >= npoints) return;
+
+ int sample_idx = index % nsample;
+
+ idx += bs_idx * npoints * nsample + pt_idx * nsample + sample_idx;
+ int in_idx = bs_idx * c * n + c_idx * n + idx[0];
+ int out_idx = bs_idx * c * npoints * nsample + c_idx * npoints * nsample +
+ pt_idx * nsample + sample_idx;
+
+ out[out_idx] = points[in_idx];
+}
+
+template <typename T>
+__global__ void KeGroupPointsBW(int b, int c, int n, int npoints, int nsample,
+ const T* __restrict__ grad_out,
+ const int* __restrict__ idx,
+ T* __restrict__ grad_points) {
+ // grad_out: (B, C, npoints, nsample)
+ // idx: (B, npoints, nsample)
+ // output:
+ // grad_points: (B, C, N)
+ int bs_idx = blockIdx.z;
+ int c_idx = blockIdx.y;
+ int index = blockIdx.x * blockDim.x + threadIdx.x;
+ int pt_idx = index / nsample;
+ if (bs_idx >= b || c_idx >= c || pt_idx >= npoints) return;
+
+ int sample_idx = index % nsample;
+ grad_out += bs_idx * c * npoints * nsample + c_idx * npoints * nsample +
+ pt_idx * nsample + sample_idx;
+ idx += bs_idx * npoints * nsample + pt_idx * nsample + sample_idx;
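+  // the same input point can appear in several groups, so its gradient
+  // is accumulated with an atomic add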
+
+ platform::CudaAtomicAdd(grad_points + bs_idx * c * n + c_idx * n + idx[0],
+ grad_out[0]);
+}
+
+template <typename T>
+class GroupPointsOpCUDAKernel : public framework::OpKernel<T> {
+ public:
+ void Compute(const framework::ExecutionContext& ctx) const override {
+ PADDLE_ENFORCE(platform::is_gpu_place(ctx.GetPlace()),
+ "This kernel only runs on GPU device.");
+ auto* input = ctx.Input("X");
+ auto* idx = ctx.Input("Idx");
+ auto* output = ctx.Output("Out");
+ auto* input_data = input->data();
+ auto* idx_data = idx->data();
+
+ const int b = input->dims()[0];
+ const int c = input->dims()[1];
+ const int n = input->dims()[2];
+ const int m = idx->dims()[1];
+ const int s = idx->dims()[2];
+
+    auto* output_data = output->mutable_data<T>({b, c, m, s}, ctx.GetPlace());
+
+ dim3 blocks(DIVUP(m * s, THREADS_PER_BLOCK), c, b);
+ dim3 threads(THREADS_PER_BLOCK);
+    KeGroupPointsFW<T><<<blocks, threads>>>(
+ b, c, n, m, s, input_data, idx_data, output_data);
+ }
+};
+
+template <typename T>
+class GroupPointsGradOpCUDAKernel : public framework::OpKernel<T> {
+ public:
+ void Compute(const framework::ExecutionContext& ctx) const override {
+ auto* input = ctx.Input("X");
+ auto* idx = ctx.Input("Idx");
+ auto* output_grad = ctx.Input(framework::GradVarName("Out"));
+ auto* input_grad = ctx.Output(framework::GradVarName("X"));
+ auto* idx_data = idx->data();
+ auto output_grad_data = output_grad->data();
+
+ const int b = input->dims()[0];
+ const int c = input->dims()[1];
+ const int n = input->dims()[2];
+ const int m = idx->dims()[1];
+ const int s = idx->dims()[2];
+
+    auto* input_grad_data =
+        input_grad->mutable_data<T>({b, c, n}, ctx.GetPlace());
+    auto& dev_ctx =
+        ctx.template device_context<platform::CUDADeviceContext>();
+ int pnum = input_grad->numel();
+ Zero<<<(pnum + 512 - 1) / 512, 512, 0, dev_ctx.stream()>>>(input_grad_data,
+ pnum);
+
+ dim3 blocks(DIVUP(m * s, THREADS_PER_BLOCK), c, b);
+ dim3 threads(THREADS_PER_BLOCK);
+
+    KeGroupPointsBW<T><<<blocks, threads, 0, dev_ctx.stream()>>>(
+ b, c, n, m, s, output_grad_data, idx_data, input_grad_data);
+ }
+};
+} // namespace operators
+} // namespace paddle
+
+namespace ops = paddle::operators;
+REGISTER_OP_CUDA_KERNEL(group_points, ops::GroupPointsOpCUDAKernel<float>,
+                        ops::GroupPointsOpCUDAKernel<double>);
+REGISTER_OP_CUDA_KERNEL(group_points_grad,
+                        ops::GroupPointsGradOpCUDAKernel<float>,
+                        ops::GroupPointsGradOpCUDAKernel<double>);
diff --git a/PaddleCV/Paddle3D/PointNet++/ext_op/src/make.sh b/PaddleCV/Paddle3D/PointNet++/ext_op/src/make.sh
new file mode 100644
index 0000000000000000000000000000000000000000..79505635c0392a32c065a99502614128397bf3e7
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointNet++/ext_op/src/make.sh
@@ -0,0 +1,21 @@
+include_dir=$( python -c 'import paddle; print(paddle.sysconfig.get_include())' )
+lib_dir=$( python -c 'import paddle; print(paddle.sysconfig.get_lib())' )
+
+echo $include_dir
+echo $lib_dir
+
+OPS='farthest_point_sampling_op gather_point_op group_points_op query_ball_op three_interp_op three_nn_op'
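+# compile each op's CUDA kernel to an object file with nvcc, then link all
+# object files and .cc sources into a single pointnet_lib.so against Paddle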
+for op in ${OPS}
+do
+nvcc ${op}.cu -c -o ${op}.cu.o -ccbin cc -DPADDLE_WITH_CUDA -DEIGEN_USE_GPU -DPADDLE_USE_DSO -DPADDLE_WITH_MKLDNN -Xcompiler -fPIC -std=c++11 -Xcompiler -fPIC -w --expt-relaxed-constexpr -O0 -g -DNVCC \
+ -I ${include_dir}/third_party/ \
+ -I ${include_dir}
+done
+
+g++ farthest_point_sampling_op.cc farthest_point_sampling_op.cu.o gather_point_op.cc gather_point_op.cu.o group_points_op.cc group_points_op.cu.o query_ball_op.cu.o query_ball_op.cc three_interp_op.cu.o three_interp_op.cc three_nn_op.cu.o three_nn_op.cc -o pointnet_lib.so -DPADDLE_WITH_MKLDNN -shared -fPIC -std=c++11 -O0 -g \
+ -I ${include_dir}/third_party/ \
+ -I ${include_dir} \
+ -L ${lib_dir} \
+ -L /usr/local/cuda/lib64 -lpaddle_framework -lcudart
+
+rm *.cu.o
diff --git a/PaddleCV/Paddle3D/PointNet++/ext_op/src/query_ball_op.cc b/PaddleCV/Paddle3D/PointNet++/ext_op/src/query_ball_op.cc
new file mode 100644
index 0000000000000000000000000000000000000000..c473b0d325db422a25fac4a133c7127311418557
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointNet++/ext_op/src/query_ball_op.cc
@@ -0,0 +1,82 @@
+/* Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+ http://www.apache.org/licenses/LICENSE-2.0
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License. */
+
+#include "paddle/fluid/framework/op_registry.h"
+namespace paddle {
+namespace operators {
+
+using Tensor = framework::Tensor;
+
+class QueryBallOp : public framework::OperatorWithKernel {
+public:
+ using framework::OperatorWithKernel::OperatorWithKernel;
+
+ void InferShape(framework::InferShapeContext *ctx) const override {
+ // points: [b,n,3]
+ PADDLE_ENFORCE(ctx->HasInput("Points"), "Input(Points) shoud not be null");
+ auto p_dims = ctx->GetInputDim("Points");
+ PADDLE_ENFORCE(p_dims.size() == 3 && p_dims[2] == 3,
+ "Input(Points) of QueryBallOp should be 3-D Tensor, the "
+ "last dimension must be 3");
+ // new_points: [b,m,3]
+ PADDLE_ENFORCE(ctx->HasInput("New_Points"),
+ "Input(New_Points) shoud not be null");
+ auto np_dims = ctx->GetInputDim("New_Points");
+ PADDLE_ENFORCE(np_dims.size() == 3 && np_dims[2] == 3,
+ "Input(New_Points) of QueryBallOp should be 3-D Tensor, the "
+ "last dimension must be 3");
+ int n_sample = ctx->Attrs().Get("N_sample");
+ PADDLE_ENFORCE(n_sample >= 0,
+ "The n_sample should be greater than or equal to 0.");
+ float radius = ctx->Attrs().Get("Radius");
+ PADDLE_ENFORCE(radius >= 0,
+ "The radius should be greater than or equal to 0.");
+ // output: [b,m,nsample]
+    std::vector<int64_t> dim_out({p_dims[0], np_dims[1], n_sample});
+ ctx->SetOutputDim("Output", framework::make_ddim(dim_out));
+ }
+
+protected:
+ framework::OpKernelType GetExpectedKernelType(
+ const framework::ExecutionContext &ctx) const override {
+ auto input_data_type = ctx.Input("Points")->type();
+ return framework::OpKernelType(input_data_type, ctx.GetPlace());
+ }
+};
+
+class QueryBallOpMaker : public framework::OpProtoAndCheckerMaker {
+public:
+ void Make() override {
+ AddInput("Points",
+ "Input points with shape (batch, n, 3), n is input "
+ "points's num");
+ AddInput("New_Points",
+ "Query points with shape (batch, m, 3), m is query points's num");
+ AddOutput("Output", "output points with shape(batch, m, nsample)");
+ AddAttr("N_sample",
+ R"Doc(Number of points selected in each ball region")Doc")
+ .SetDefault(0)
+ .EqualGreaterThan(0);
+ AddAttr("Radius",
+ R"Doc(Ball search radius with shape(1))Doc")
+ .SetDefault(0)
+ .EqualGreaterThan(0);
+
+ AddComment(
+ R"Doc(Query Ball Points)Doc");
+ }
+};
+
+} // namespace operators
+} // namespace paddle
+
+namespace ops = paddle::operators;
+REGISTER_OPERATOR(query_ball, ops::QueryBallOp, ops::QueryBallOpMaker);
diff --git a/PaddleCV/Paddle3D/PointNet++/ext_op/src/query_ball_op.cu b/PaddleCV/Paddle3D/PointNet++/ext_op/src/query_ball_op.cu
new file mode 100644
index 0000000000000000000000000000000000000000..8e8917f1b0e39bbf5793999fe5ebb5d5df699863
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointNet++/ext_op/src/query_ball_op.cu
@@ -0,0 +1,113 @@
+/* Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+ http://www.apache.org/licenses/LICENSE-2.0
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License. */
+
+#include "paddle/fluid/framework/eigen.h"
+#include "paddle/fluid/framework/op_registry.h"
+
+#include "util.cu.h"
+
+namespace paddle {
+namespace operators {
+
+using Tensor = framework::Tensor;
+
+template <typename T>
+// input: radius (1), nsample (1), points (b,n,3), new_points (b,m,3)
+// output: idx (b,m,nsample)
+__global__ void QueryBall(int b,
+ int n,
+ int m,
+ T radius,
+ int nsample,
+ const T *points,
+ const T *new_points,
+ int *idx) {
+ int batch_index = blockIdx.x;
+ points += n * 3 * batch_index;
+ new_points += m * 3 * batch_index;
+ idx += m * nsample * batch_index;
+
+ int index = threadIdx.x;
+ int stride = blockDim.x;
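+  // one thread block per batch element; threads stride over the m query
+  // centers and scan all n candidate points in order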
+
+ for (int j = index; j < m; j += stride) {
+ int cnt = 0;
+ for (int k = 0; k < n; ++k) {
+ if (cnt == nsample)
+ break; // only pick the FIRST nsample points in the ball
+ float x2 = new_points[j * 3 + 0];
+ float y2 = new_points[j * 3 + 1];
+ float z2 = new_points[j * 3 + 2];
+ float x1 = points[k * 3 + 0];
+ float y1 = points[k * 3 + 1];
+ float z1 = points[k * 3 + 2];
+ float d =
+ (x2 - x1) * (x2 - x1) + (y2 - y1) * (y2 - y1) + (z2 - z1) * (z2 - z1);
+ if (d < radius * radius) {
+ if (cnt == 0) { // set ALL indices to k, s.t. if there are less points
+ // in ball than nsample, we still have valid
+ // (repeating) indices
+ for (int l = 0; l < nsample; ++l) idx[j * nsample + l] = k;
+ }
+ idx[j * nsample + cnt] = k;
+ cnt += 1;
+ }
+ }
+ }
+}
+
+template <typename T>
+class QueryBallOpCUDAKernel : public framework::OpKernel<T> {
+public:
+ void Compute(const framework::ExecutionContext &ctx) const override {
+ PADDLE_ENFORCE(platform::is_gpu_place(ctx.GetPlace()),
+ "This kernel only runs on GPU device.");
+ // input: radius (1), nsample (1), points (b,n,3), new_points (b,m,3)
+ // output: idx (b,m,nsample)
+ auto *points = ctx.Input("Points");
+ auto *new_points = ctx.Input("New_Points");
+ auto *output = ctx.Output("Output");
+
+ float radius = ctx.Attr("Radius");
+ int nsample = ctx.Attr("N_sample");
+
+ if (points->numel() == 0 || new_points->numel() == 0) return;
+
+ int batch_size = points->dims()[0];
+ int n = points->dims()[1];
+ int m = new_points->dims()[1];
+ // allocate memory
+    int* p_out_points =
+        output->mutable_data<int>({batch_size, m, nsample}, ctx.GetPlace());
+
+ auto& dev_ctx = ctx.template device_context();
+ int pnum = output->numel();
+ Zero<<<(pnum + 512 - 1) / 512, 512, 0, dev_ctx.stream()>>>(p_out_points,
+ pnum);
+
+    const T *p_points = points->data<T>();
+    const T *p_new_points = new_points->data<T>();
+
+    QueryBall<T><<<batch_size, 256>>>(batch_size,
+ n,
+ m,
+ radius,
+ nsample,
+ p_points,
+ p_new_points,
+ p_out_points);
+ }
+};
+
+} // namespace operators
+} // namespace paddle
+
+namespace ops = paddle::operators;
+REGISTER_OP_CUDA_KERNEL(query_ball, ops::QueryBallOpCUDAKernel<float>);
diff --git a/PaddleCV/Paddle3D/PointNet++/ext_op/src/three_interp_op.cc b/PaddleCV/Paddle3D/PointNet++/ext_op/src/three_interp_op.cc
new file mode 100644
index 0000000000000000000000000000000000000000..b7bfbe7f935b74c46a795dd5370c814e4f5350c4
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointNet++/ext_op/src/three_interp_op.cc
@@ -0,0 +1,142 @@
+/* Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+ http://www.apache.org/licenses/LICENSE-2.0
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License. */
+
+#include <memory>
+#include <string>
+#include <vector>
+#include "paddle/fluid/framework/op_registry.h"
+
+namespace paddle {
+namespace operators {
+
+using framework::Tensor;
+
+class ThreeInterpOp : public framework::OperatorWithKernel {
+public:
+ using framework::OperatorWithKernel::OperatorWithKernel;
+
+protected:
+ void InferShape(framework::InferShapeContext* ctx) const override {
+ PADDLE_ENFORCE(ctx->HasInput("X"),
+ "Input(X) of ThreeInterpOp should not be null.");
+ PADDLE_ENFORCE(ctx->HasInput("Weight"),
+ "Input(Weight) of ThreeInterpOp should not be null.");
+ PADDLE_ENFORCE(ctx->HasInput("Idx"),
+ "Input(Idx) of ThreeInterpOp should not be null.");
+ PADDLE_ENFORCE(ctx->HasOutput("Out"),
+ "Output(Out) of ThreeInterpOp should not be null.");
+
+ auto dim_x = ctx->GetInputDim("X"); // [B, M, C]
+ PADDLE_ENFORCE_EQ(dim_x.size(), 3, "X's dimension must be 3");
+
+ auto dim_weight = ctx->GetInputDim("Weight"); // [B, N, 3]
+ PADDLE_ENFORCE_EQ(dim_weight.size(), 3, "Weight's dimension must be 3");
+
+ PADDLE_ENFORCE_EQ(
+ dim_x[0], dim_weight[0], "X and Weight dim[0] should be equal.");
+
+ auto dim_idx = ctx->GetInputDim("Idx"); // [B, N, 3]
+ PADDLE_ENFORCE_EQ(dim_idx.size(), 3, "Idx's dimension must be 3");
+
+ for (int i = 0; i < 3; i++) {
+ PADDLE_ENFORCE_EQ(
+ dim_weight[i], dim_idx[i], "Weight and Idx shape should be same.");
+ }
+
+ // output: [B, N, C]
+    std::vector<int64_t> dim_out({dim_x[0], dim_idx[1], dim_x[2]});
+ ctx->SetOutputDim("Out", framework::make_ddim(dim_out));
+ }
+
+protected:
+ framework::OpKernelType GetExpectedKernelType(
+ const framework::ExecutionContext& ctx) const override {
+ return framework::OpKernelType(ctx.Input("X")->type(),
+ ctx.GetPlace());
+ }
+};
+
+class ThreeInterpOpMaker : public framework::OpProtoAndCheckerMaker {
+public:
+ void Make() override {
+ AddInput("X",
+ "The input tensor of three_interp operator. "
+ "This is a 3-D tensor with shape of [B, M, C].");
+ AddInput("Weight",
+ "The input tensor of point weight of three_interp operator. "
+ "This is a 3-D tensor with shape of [B, N, 3].");
+ AddInput("Idx",
+ "The input tensor of nearest neighbor index of three_interp "
+ "operator. This is a 3-D tensor with shape of [B, N, 3].");
+ AddOutput("Out",
+ "The output tensor of three_interp operator. "
+ "This is a 3-D tensor with shape of [B, N, 3].");
+
+ AddComment(R"DOC(
+          This operator calculates interpolation results from input, weight
+          and index.
+ )DOC");
+ }
+};
+
+class ThreeInterpOpGrad : public framework::OperatorWithKernel {
+public:
+ using framework::OperatorWithKernel::OperatorWithKernel;
+
+protected:
+ void InferShape(framework::InferShapeContext* ctx) const override {
+ PADDLE_ENFORCE(ctx->HasInput("Weight"), "Input(Weight) should not be null");
+ PADDLE_ENFORCE(ctx->HasInput("Idx"), "Input(Idx) should not be null");
+ PADDLE_ENFORCE(ctx->HasInput(framework::GradVarName("Out")),
+ "Input(Out@GRAD) should not be null");
+ auto dim_x = ctx->GetInputDim("X");
+ if (ctx->HasOutput(framework::GradVarName("X"))) {
+ ctx->SetOutputDim(framework::GradVarName("X"), dim_x);
+ }
+ }
+
+ framework::OpKernelType GetExpectedKernelType(
+ const framework::ExecutionContext& ctx) const override {
+ return framework::OpKernelType(
+ ctx.Input(framework::GradVarName("Out"))->type(),
+ ctx.GetPlace());
+ }
+};
+
+template <typename T>
+class ThreeInterpGradDescMaker : public framework::SingleGradOpMaker<T> {
+public:
+  using framework::SingleGradOpMaker<T>::SingleGradOpMaker;
+
+protected:
+  std::unique_ptr<T> Apply() const override {
+ auto* op = new T();
+ op->SetType("three_interp_grad");
+ op->SetInput("X", this->Input("X"));
+ op->SetInput("Weight", this->Input("Weight"));
+ op->SetInput("Idx", this->Input("Idx"));
+ op->SetInput(framework::GradVarName("Out"), this->OutputGrad("Out"));
+ op->SetOutput(framework::GradVarName("X"), this->InputGrad("X"));
+ op->SetAttrMap(this->Attrs());
+    return std::unique_ptr<T>(op);
+ }
+};
+
+} // namespace operators
+} // namespace paddle
+
+namespace ops = paddle::operators;
+REGISTER_OPERATOR(three_interp,
+ ops::ThreeInterpOp,
+ ops::ThreeInterpOpMaker,
+                ops::ThreeInterpGradDescMaker<paddle::framework::OpDesc>,
+                ops::ThreeInterpGradDescMaker<paddle::imperative::OpBase>);
+REGISTER_OPERATOR(three_interp_grad, ops::ThreeInterpOpGrad);
diff --git a/PaddleCV/Paddle3D/PointNet++/ext_op/src/three_interp_op.cu b/PaddleCV/Paddle3D/PointNet++/ext_op/src/three_interp_op.cu
new file mode 100644
index 0000000000000000000000000000000000000000..0e23440b70da75fbcfa32d32be741904bdfedaa0
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointNet++/ext_op/src/three_interp_op.cu
@@ -0,0 +1,152 @@
+/* Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+ http://www.apache.org/licenses/LICENSE-2.0
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License. */
+
+#include "paddle/fluid/framework/op_registry.h"
+#include "paddle/fluid/platform/cuda_primitives.h"
+
+#include "util.cu.h"
+
+namespace paddle {
+namespace operators {
+
+using framework::Tensor;
+
+template <typename T>
+__global__ void KeThreeInterpFw(T* output,
+ const T* input,
+ const T* weight,
+ const int* idx,
+ const int b,
+ const int m,
+ const int c,
+ const int n) {
+ int nthreads = b * n * c;
+ int tid = blockIdx.x * blockDim.x + threadIdx.x;
+ int stride = blockDim.x * gridDim.x;
+ for (; tid < nthreads; tid += stride) {
+ int bi = tid / n / c;
+ int ni = (tid % (n * c)) / c;
+ int ci = tid % c;
+
+ int input_base_idx = bi * m * c;
+ int w_idx = bi * n * 3 + ni * 3;
+ output[tid] =
+ input[input_base_idx + idx[w_idx] * c + ci] * weight[w_idx] +
+ input[input_base_idx + idx[w_idx + 1] * c + ci] * weight[w_idx + 1] +
+ input[input_base_idx + idx[w_idx + 2] * c + ci] * weight[w_idx + 2];
+ }
+}
+
+template <typename T>
+__global__ void KeThreeInterpBw(T* input_grad,
+ const T* output_grad,
+ const T* weight,
+ const int* idx,
+ const int b,
+ const int m,
+ const int c,
+ const int n) {
+ int nthreads = b * n * c;
+ int tid = blockIdx.x * blockDim.x + threadIdx.x;
+ int stride = blockDim.x * gridDim.x;
+ for (; tid < nthreads; tid += stride) {
+ int bi = tid / n / c;
+ int ni = (tid % (c * n)) / c;
+ int ci = tid % c;
+
+ int input_base_idx = bi * m * c;
+ int w_idx = bi * n * 3 + ni * 3;
+ platform::CudaAtomicAdd(&input_grad[input_base_idx + idx[w_idx] * c + ci],
+ output_grad[tid] * weight[w_idx]);
+ platform::CudaAtomicAdd(
+ &input_grad[input_base_idx + idx[w_idx + 1] * c + ci],
+ output_grad[tid] * weight[w_idx + 1]);
+ platform::CudaAtomicAdd(
+ &input_grad[input_base_idx + idx[w_idx + 2] * c + ci],
+ output_grad[tid] * weight[w_idx + 2]);
+ }
+}
+
+template <typename T>
+class ThreeInterpOpCUDAKernel : public framework::OpKernel<T> {
+public:
+ void Compute(const framework::ExecutionContext& ctx) const override {
+ PADDLE_ENFORCE(platform::is_gpu_place(ctx.GetPlace()),
+ "This kernel only runs on GPU device.");
+ auto* input = ctx.Input("X");
+ auto* weight = ctx.Input("Weight");
+ auto* idx = ctx.Input("Idx");
+ auto* output = ctx.Output("Out");
+ auto* input_data = input->data();
+ auto* weight_data = weight->data();
+ auto* idx_data = idx->data();
+
+ const int b = input->dims()[0];
+ const int m = input->dims()[1];
+ const int c = input->dims()[2];
+ const int n = weight->dims()[1];
+
+    auto* output_data = output->mutable_data<T>({b, n, c}, ctx.GetPlace());
+
+ int pixelNum = b * n * c;
+ int grid_dim = (pixelNum + 512 - 1) / 512;
+ grid_dim = grid_dim > 8 ? 8 : grid_dim;
+
+    KeThreeInterpFw<T><<<grid_dim, 512>>>(
+ output_data, input_data, weight_data, idx_data, b, m, c, n);
+ }
+};
+
+template <typename T>
+class ThreeInterpGradOpCUDAKernel : public framework::OpKernel<T> {
+public:
+ void Compute(const framework::ExecutionContext& ctx) const override {
+ auto* input = ctx.Input("X");
+ auto* weight = ctx.Input("Weight");
+ auto* idx = ctx.Input("Idx");
+ auto* output_grad = ctx.Input(framework::GradVarName("Out"));
+ auto* input_grad = ctx.Output(framework::GradVarName("X"));
+ auto* weight_data = weight->data();
+ auto* idx_data = idx->data();
+ auto output_grad_data = output_grad->data();
+
+ const int b = input->dims()[0];
+ const int m = input->dims()[1];
+ const int c = input->dims()[2];
+ const int n = weight->dims()[1];
+
+    auto* input_grad_data =
+        input_grad->mutable_data<T>({b, m, c}, ctx.GetPlace());
+    auto& dev_ctx = ctx.template device_context<platform::CUDADeviceContext>();
+ int pnum = input_grad->numel();
+ Zero<<<(pnum + 512 - 1) / 512, 512, 0, dev_ctx.stream()>>>(input_grad_data,
+ pnum);
+
+ int pixelNum = b * n * c;
+ int grid_dim = (pixelNum + 512 - 1) / 512;
+ grid_dim = grid_dim > 8 ? 8 : grid_dim;
+
+    KeThreeInterpBw<T><<<grid_dim, 512, 0, dev_ctx.stream()>>>(
+ input_grad_data, output_grad_data, weight_data, idx_data, b, m, c, n);
+ }
+};
+} // namespace operators
+} // namespace paddle
+
+namespace ops = paddle::operators;
+REGISTER_OP_CUDA_KERNEL(three_interp,
+                        ops::ThreeInterpOpCUDAKernel<float>,
+                        ops::ThreeInterpOpCUDAKernel<double>);
+REGISTER_OP_CUDA_KERNEL(three_interp_grad,
+                        ops::ThreeInterpGradOpCUDAKernel<float>,
+                        ops::ThreeInterpGradOpCUDAKernel<double>);
diff --git a/PaddleCV/Paddle3D/PointNet++/ext_op/src/three_nn_op.cc b/PaddleCV/Paddle3D/PointNet++/ext_op/src/three_nn_op.cc
new file mode 100644
index 0000000000000000000000000000000000000000..5ca8b261c79cb2c7a28d1e2e8064dabc3eba921b
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointNet++/ext_op/src/three_nn_op.cc
@@ -0,0 +1,93 @@
+/* Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+ http://www.apache.org/licenses/LICENSE-2.0
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License. */
+
+#include <memory>
+#include <string>
+#include <vector>
+#include "paddle/fluid/framework/op_registry.h"
+
+namespace paddle {
+namespace operators {
+
+using framework::Tensor;
+
+class ThreeNNOp : public framework::OperatorWithKernel {
+public:
+ using framework::OperatorWithKernel::OperatorWithKernel;
+
+protected:
+ void InferShape(framework::InferShapeContext* ctx) const override {
+ PADDLE_ENFORCE(ctx->HasInput("X"),
+ "Input(X) of ThreeNNOp should not be null.");
+ PADDLE_ENFORCE(ctx->HasInput("Known"),
+ "Input(Known) of ThreeNNOp should not be null.");
+ PADDLE_ENFORCE(ctx->HasOutput("Distance"),
+ "Output(Distance) of ThreeNNOp should not be null.");
+ PADDLE_ENFORCE(ctx->HasOutput("Idx"),
+ "Output(Idx) of ThreeNNOp should not be null.");
+
+ auto dim_x = ctx->GetInputDim("X"); // [B, N, 3]
+ PADDLE_ENFORCE_EQ(dim_x.size(), 3, "X's dimension must be 3");
+ PADDLE_ENFORCE_EQ(dim_x[2], 3, "X dim[2] must be 3");
+
+ auto dim_known = ctx->GetInputDim("Known"); // [B, M, 3]
+ PADDLE_ENFORCE_EQ(dim_known.size(), 3, "Known's dimension must be 3");
+ PADDLE_ENFORCE_EQ(dim_known[2], 3, "Known dim[2] must be 3");
+
+ PADDLE_ENFORCE_EQ(
+ dim_x[0], dim_known[0], "X and Known dim[0] should be equal.");
+    PADDLE_ENFORCE_GE(
+        dim_known[1], 3, "Known dim[1] should be greater than or equal to 3.");
+
+ ctx->SetOutputDim("Distance", dim_x);
+ ctx->SetOutputDim("Idx", dim_x);
+ }
+
+protected:
+ framework::OpKernelType GetExpectedKernelType(
+ const framework::ExecutionContext& ctx) const override {
+ return framework::OpKernelType(ctx.Input("X")->type(),
+ ctx.GetPlace());
+ }
+};
+
+class ThreeNNOpMaker : public framework::OpProtoAndCheckerMaker {
+public:
+ void Make() override {
+ AddInput("X",
+ "The input tensor of three_nn operator. "
+ "This is a 3-D tensor with shape of [B, N, 3].");
+ AddInput("Known",
+ "The input tensor of known points of three_nn operator. "
+ "This is a 3-D tensor with shape of [B, M, 3].");
+ AddOutput("Distance",
+ "The output distance tensor of three_nn operator. "
+ "This is a 3-D tensor with shape of [B, N, 3].");
+ AddOutput("Idx",
+ "The output index tensor of three_nn operator. "
+ "This is a 3-D tensor with shape of [B, N, 3].");
+
+ AddAttr("eps", "minimum value of distance.").SetDefault(1e-10);
+
+ AddComment(R"DOC(
+          This operator finds the top-3 nearest neighbors of each point
+          specified by Input(X) among the known point coordinates specified
+          by Input(Known), and calculates the distances to these nearest
+          neighbors.
+ )DOC");
+ }
+};
+
+} // namespace operators
+} // namespace paddle
+
+namespace ops = paddle::operators;
+REGISTER_OPERATOR(three_nn, ops::ThreeNNOp, ops::ThreeNNOpMaker);
diff --git a/PaddleCV/Paddle3D/PointNet++/ext_op/src/three_nn_op.cu b/PaddleCV/Paddle3D/PointNet++/ext_op/src/three_nn_op.cu
new file mode 100644
index 0000000000000000000000000000000000000000..4120599a9dfde4d75fcceb6f3f17a002d3d896d9
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointNet++/ext_op/src/three_nn_op.cu
@@ -0,0 +1,110 @@
+/* Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+ http://www.apache.org/licenses/LICENSE-2.0
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License. */
+
+#include "paddle/fluid/framework/op_registry.h"
+#include "paddle/fluid/platform/cuda_primitives.h"
+
+namespace paddle {
+namespace operators {
+
+using framework::Tensor;
+
+template <typename T>
+__global__ void KeThreeNNFw(T* distance,
+ int* idx,
+ const T* input,
+ const T* known,
+ const float eps,
+ const int b,
+ const int n,
+ const int m) {
+ int nthreads = b * n;
+ int tid = blockIdx.x * blockDim.x + threadIdx.x;
+ int stride = blockDim.x * gridDim.x;
+ for (; tid < nthreads; tid += stride) {
+ int bi = tid / n;
+ int ni = tid % n;
+
+ int input_idx = tid * 3;
+ T x1 = input[input_idx];
+ T y1 = input[input_idx + 1];
+ T z1 = input[input_idx + 2];
+
+ distance[input_idx] = 1e40;
+ distance[input_idx + 1] = 1e40;
+ distance[input_idx + 2] = 1e40;
+ idx[input_idx] = 0;
+ idx[input_idx + 1] = 0;
+ idx[input_idx + 2] = 0;
+ for (int i = 0; i < m; i++) {
+ int known_idx = bi * m * 3 + i * 3;
+ double dist = (x1 - known[known_idx]) * (x1 - known[known_idx]) +
+ (y1 - known[known_idx + 1]) * (y1 - known[known_idx + 1]) +
+ (z1 - known[known_idx + 2]) * (z1 - known[known_idx + 2]);
+      T valid_dist = dist > eps ? static_cast<T>(dist) : eps;
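+      // keep the three smallest distances in sorted order; stored values
+      // are clamped below by eps, matching the numpy reference in the tests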
+ if (dist < distance[input_idx]) {
+ distance[input_idx + 2] = distance[input_idx + 1];
+ idx[input_idx + 2] = idx[input_idx + 1];
+ distance[input_idx + 1] = distance[input_idx];
+ idx[input_idx + 1] = idx[input_idx];
+        distance[input_idx] = valid_dist;
+ idx[input_idx] = i;
+ } else if (dist < distance[input_idx + 1]) {
+ distance[input_idx + 2] = distance[input_idx + 1];
+ idx[input_idx + 2] = idx[input_idx + 1];
+        distance[input_idx + 1] = valid_dist;
+ idx[input_idx + 1] = i;
+ } else if (dist < distance[input_idx + 2]) {
+        distance[input_idx + 2] = valid_dist;
+ idx[input_idx + 2] = i;
+ }
+ }
+ }
+}
+
+template <typename T>
+class ThreeNNOpCUDAKernel : public framework::OpKernel<T> {
+public:
+ void Compute(const framework::ExecutionContext& ctx) const override {
+ PADDLE_ENFORCE(platform::is_gpu_place(ctx.GetPlace()),
+ "This kernel only runs on GPU device.");
+ auto* input = ctx.Input("X");
+ auto* known = ctx.Input("Known");
+ auto* distance = ctx.Output("Distance");
+ auto* idx = ctx.Output("Idx");
+ auto* input_data = input->data();
+ auto* known_data = known->data();
+
+ const float eps = ctx.Attr("eps");
+
+ const int b = input->dims()[0];
+ const int n = input->dims()[1];
+ const int m = known->dims()[1];
+
+    auto* idx_data = idx->mutable_data<int>({b, n, 3}, ctx.GetPlace());
+    auto* distance_data = distance->mutable_data<T>({b, n, 3}, ctx.GetPlace());
+
+ int pixelNum = b * n;
+ int grid_dim = (pixelNum + 512 - 1) / 512;
+ grid_dim = grid_dim > 8 ? 8 : grid_dim;
+
+    KeThreeNNFw<T><<<grid_dim, 512>>>(
+ distance_data, idx_data, input_data, known_data, eps, b, n, m);
+ }
+};
+
+} // namespace operators
+} // namespace paddle
+
+namespace ops = paddle::operators;
+REGISTER_OP_CUDA_KERNEL(three_nn,
+                        ops::ThreeNNOpCUDAKernel<float>,
+                        ops::ThreeNNOpCUDAKernel<double>);
diff --git a/PaddleCV/Paddle3D/PointNet++/ext_op/src/util.cu.h b/PaddleCV/Paddle3D/PointNet++/ext_op/src/util.cu.h
new file mode 100644
index 0000000000000000000000000000000000000000..05f1e9f9046644df1d92c4aca592d4a9017e4d5b
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointNet++/ext_op/src/util.cu.h
@@ -0,0 +1,18 @@
+/* Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+ http://www.apache.org/licenses/LICENSE-2.0
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License. */
+
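+// grid-stride loop that fills a device buffer of num elements with zero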
+template <typename T>
+__global__ void Zero(T* x, int num) {
+ for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < num;
+ i += blockDim.x * gridDim.x) {
+ x[i] = static_cast(0);
+ }
+}
diff --git a/PaddleCV/Paddle3D/PointNet++/ext_op/tests/test_farthest_point_sampling_op.py b/PaddleCV/Paddle3D/PointNet++/ext_op/tests/test_farthest_point_sampling_op.py
new file mode 100644
index 0000000000000000000000000000000000000000..76df7c77f100070a75f253cfe9ac005663ecb3d7
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointNet++/ext_op/tests/test_farthest_point_sampling_op.py
@@ -0,0 +1,63 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from __future__ import print_function
+
+import unittest
+import numpy as np
+import paddle.fluid as fluid
+import pointnet_lib
+
+
+def farthest_point_sampling_np(xyz, npoint):
+ B, N, C = xyz.shape
+ S = npoint
+
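+    # greedy FPS: at each step pick the point with the largest distance to
+    # its nearest already-selected centroid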
+ centroids = np.zeros((B, S))
+ distance = np.ones((B, N)) * 1e10
+ farthest = 0
+ batch_indices = np.arange(B).astype('int32')
+ for i in range(S):
+ centroids[:, i] = farthest
+ centroid = xyz[batch_indices, farthest, :].reshape((B, 1, 3))
+ dist = np.sum((xyz - centroid)**2, -1)
+ mask = dist < distance
+ distance[mask] = dist[mask]
+ farthest = np.argmax(distance, -1)
+ return centroids.astype('int32')
+
+
+class TestFarthestPointSamplingOp(unittest.TestCase):
+ def test_check_output(self):
+ x_shape = (1, 512, 3)
+ x_type = 'float32'
+ sampled_point_num = 256
+
+ x = fluid.layers.data(
+ name='x', shape=x_shape, dtype=x_type, append_batch_size=False)
+ y = pointnet_lib.farthest_point_sampling(x, sampled_point_num)
+
+ x_np = np.random.randint(1, 100, (x_shape[0] * x_shape[1] *
+ 3, )).reshape(x_shape).astype(x_type)
+ out_np = farthest_point_sampling_np(x_np, sampled_point_num)
+
+ place = fluid.CUDAPlace(0)
+ exe = fluid.Executor(place)
+ outs = exe.run(feed={'x': x_np}, fetch_list=[y])
+
+ self.assertTrue(np.allclose(outs[0], out_np))
+
+
+if __name__ == "__main__":
+ unittest.main()
diff --git a/PaddleCV/Paddle3D/PointNet++/ext_op/tests/test_gather_point_op.py b/PaddleCV/Paddle3D/PointNet++/ext_op/tests/test_gather_point_op.py
new file mode 100644
index 0000000000000000000000000000000000000000..ff01bc8ad70e1a20ff854b270fa4d4fc2c2f08e1
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointNet++/ext_op/tests/test_gather_point_op.py
@@ -0,0 +1,56 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from __future__ import print_function
+
+import unittest
+import numpy as np
+import paddle.fluid as fluid
+import pointnet_lib
+
+
+def gather_point_np(points, index):
+ result = []
+ for i in range(len(index)):
+ a = points[i][index[i]]
+ result.append(a.tolist())
+ return result
+
+
+class TestGatherPointOp(unittest.TestCase):
+ def test_check_output(self):
+ x_shape = (1, 512, 3)
+ x_type = 'float32'
+ idx_shape = (1, 32)
+ idx_type = 'int32'
+
+ x = fluid.layers.data(
+ name='x', shape=x_shape, dtype=x_type, append_batch_size=False)
+ idx = fluid.layers.data(
+ name='idx', shape=idx_shape, dtype=idx_type, append_batch_size=False)
+ y = pointnet_lib.gather_point(x, idx)
+
+ x_np = np.random.uniform(-10, 10, x_shape).astype(x_type)
+ idx_np = np.random.randint(0, x_shape[1], idx_shape).astype(idx_type)
+ out_np = gather_point_np(x_np, idx_np)
+
+ place = fluid.CUDAPlace(0)
+ exe = fluid.Executor(place)
+ outs = exe.run(feed={'x': x_np, 'idx': idx_np}, fetch_list=[y])
+
+ self.assertTrue(np.allclose(outs[0], out_np))
+
+
+if __name__ == "__main__":
+ unittest.main()
diff --git a/PaddleCV/Paddle3D/PointNet++/ext_op/tests/test_group_points_op.py b/PaddleCV/Paddle3D/PointNet++/ext_op/tests/test_group_points_op.py
new file mode 100644
index 0000000000000000000000000000000000000000..8ab4fb7a9c5040bf2c8130d1bd211243038c046f
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointNet++/ext_op/tests/test_group_points_op.py
@@ -0,0 +1,60 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from __future__ import print_function
+
+import unittest
+import numpy as np
+import paddle.fluid as fluid
+import pointnet_lib
+
+
+def group_points_np(x, idx):
+ b, m, s = idx.shape
+ _, c, n = x.shape
+
+ output = np.zeros((b, c, m, s)).astype(x.dtype)
+ for i in range(b):
+ for j in range(m):
+ for k in range(s):
+ output[i, :, j, k] = x[i, :, idx[i, j, k]]
+ return output
+
+
+class TestGroupPointsOp(unittest.TestCase):
+ def test_check_output(self):
+ x_shape = [8, 43, 29]
+ x_type = 'float32'
+ idx_shape = [8, 37, 41]
+ idx_type = 'int32'
+
+ x = fluid.layers.data(
+ name='x', shape=x_shape, dtype=x_type, append_batch_size=False)
+ idx = fluid.layers.data(
+ name='idx', shape=idx_shape, dtype=idx_type, append_batch_size=False)
+ y = pointnet_lib.group_points(x, idx)
+
+ x_np = np.random.uniform(-10, 10, x_shape).astype(x_type)
+ idx_np = np.random.randint(0, x_shape[2], idx_shape).astype(idx_type)
+ out_np = group_points_np(x_np, idx_np)
+
+ place = fluid.CUDAPlace(0)
+ exe = fluid.Executor(place)
+ outs = exe.run(feed={'x': x_np, 'idx': idx_np}, fetch_list=[y])
+
+ self.assertTrue(np.allclose(outs[0], out_np))
+
+
+if __name__ == "__main__":
+ unittest.main()
diff --git a/PaddleCV/Paddle3D/PointNet++/ext_op/tests/test_query_ball_op.py b/PaddleCV/Paddle3D/PointNet++/ext_op/tests/test_query_ball_op.py
new file mode 100644
index 0000000000000000000000000000000000000000..ab3ea1821f388108edf753420e90d37c8abfcbc0
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointNet++/ext_op/tests/test_query_ball_op.py
@@ -0,0 +1,69 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from __future__ import print_function
+
+import unittest
+import numpy as np
+import paddle.fluid as fluid
+import pointnet_lib
+
+
+def query_ball_point_np(points, new_points, radius, nsample):
+ b, n, c = points.shape
+ _, m, _ = new_points.shape
+ out = np.zeros(shape=(b, m, nsample)).astype('int32')
+ radius_2 = radius * radius
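+    # mirror the CUDA kernel: take at most nsample points inside the ball and
+    # pre-fill the row with the first hit so unused slots hold a valid index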
+ for i in range(b):
+ for j in range(m):
+ cnt = 0
+ for k in range(n):
+ if (cnt == nsample):
+ break
+ dist = np.sum(np.square(points[i][k] - new_points[i][j]))
+ if (dist < radius_2):
+ if cnt == 0:
+ out[i][j] = np.ones(shape=(nsample)) * k
+ out[i][j][cnt] = k
+ cnt += 1
+ return out
+
+
+class TestQueryBallOp(unittest.TestCase):
+ def test_check_output(self):
+ points_shape = [2, 5, 3]
+ new_points_shape = [2, 4, 3]
+ points_type = 'float32'
+ radius = 6
+ nsample = 5
+
+ points = fluid.layers.data(
+ name='points', shape=points_shape, dtype=points_type, append_batch_size=False)
+ new_points = fluid.layers.data(
+ name='new_points', shape=new_points_shape, dtype=points_type, append_batch_size=False)
+ y = pointnet_lib.query_ball(points, new_points, radius, nsample)
+
+ points_np = np.random.randint(1, 5, points_shape).astype(points_type)
+ new_points_np = np.random.randint(1, 5, new_points_shape).astype(points_type)
+ out_np = query_ball_point_np(points_np, new_points_np, radius, nsample)
+
+ place = fluid.CUDAPlace(0)
+ exe = fluid.Executor(place)
+ outs = exe.run(feed={'points': points_np, 'new_points': new_points_np}, fetch_list=[y])
+
+ self.assertTrue(np.allclose(outs[0], out_np))
+
+
+if __name__ == "__main__":
+ unittest.main()
diff --git a/PaddleCV/Paddle3D/PointNet++/ext_op/tests/test_three_interp_op.py b/PaddleCV/Paddle3D/PointNet++/ext_op/tests/test_three_interp_op.py
new file mode 100644
index 0000000000000000000000000000000000000000..e73fbad756ac5e5d3703f8354e2b0641b7cc9383
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointNet++/ext_op/tests/test_three_interp_op.py
@@ -0,0 +1,66 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from __future__ import print_function
+
+import unittest
+import numpy as np
+import paddle.fluid as fluid
+import pointnet_lib
+
+
+def three_interp_np(x, weight, idx):
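+    """NumPy reference implementation of the three_interp op: each output
+    feature is the weighted sum of the three input features selected by idx,
+    with weights taken from the matching entries of weight.
+    """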
+ b, m, c = x.shape
+ n = weight.shape[1]
+
+ output = np.zeros((b, n, c)).astype('float32')
+ for i in range(b):
+ for j in range(n):
+ w1, w2, w3 = weight[i, j, :]
+ i1, i2, i3 = idx[i, j, :]
+ output[i, j, :] = w1 * x[i, i1, :] \
+ + w2 * x[i, i2, :] \
+ + w3 * x[i, i3, :]
+ return output
+
+
+class TestThreeInterpOp(unittest.TestCase):
+ def test_check_output(self):
+ input_shape = [8, 21, 29]
+ input_type = 'float32'
+ weight_shape = [8, 37, 3]
+ weight_type = 'float32'
+
+ x = fluid.layers.data(
+ name='x', shape=input_shape, dtype=input_type, append_batch_size=False)
+ weight = fluid.layers.data(
+ name='weight', shape=weight_shape, dtype=weight_type, append_batch_size=False)
+ idx = fluid.layers.data(
+ name='idx', shape=weight_shape, dtype="int32", append_batch_size=False)
+ y = pointnet_lib.three_interp(x, weight, idx)
+
+ x_np = np.random.random(input_shape).astype(input_type)
+ weight_np = np.random.random(weight_shape).astype(weight_type)
+ idx_np = np.random.uniform(0, input_shape[1], weight_shape).astype("int32")
+ out_np = three_interp_np(x_np, weight_np, idx_np)
+
+ place = fluid.CUDAPlace(0)
+ exe = fluid.Executor(place)
+ outs = exe.run(feed={'x': x_np, 'weight': weight_np, 'idx': idx_np}, fetch_list=[y])
+
+ self.assertTrue(np.allclose(outs[0], out_np))
+
+
+if __name__ == "__main__":
+ unittest.main()
diff --git a/PaddleCV/Paddle3D/PointNet++/ext_op/tests/test_three_nn_op.py b/PaddleCV/Paddle3D/PointNet++/ext_op/tests/test_three_nn_op.py
new file mode 100644
index 0000000000000000000000000000000000000000..c6468e8b8cf881e3bbd1a16b1f8a6896fce07333
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointNet++/ext_op/tests/test_three_nn_op.py
@@ -0,0 +1,79 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from __future__ import print_function
+
+import unittest
+import numpy as np
+import paddle.fluid as fluid
+import pointnet_lib
+
+
+def three_nn_np(x, known, eps=1e-10):
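+    """NumPy reference implementation of the three_nn op: for each point in x,
+    keep the squared distances and indices of its three nearest neighbors in
+    known, maintained in ascending order; distances are clipped from below by
+    eps.
+    """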
+ distance = np.ones_like(x).astype('float32') * 1e40
+ idx = np.zeros_like(x).astype('int32')
+
+ b, n, _ = x.shape
+ m = known.shape[1]
+ for i in range(b):
+ for j in range(n):
+ for k in range(m):
+ sub = x[i, j, :] - known[i, k, :]
+ d = float(np.sum(sub * sub))
+ valid_d = max(d, eps)
+ if d < distance[i, j, 0]:
+ distance[i, j, 2] = distance[i, j, 1]
+ idx[i, j, 2] = idx[i, j, 1]
+ distance[i, j, 1] = distance[i, j, 0]
+ idx[i, j, 1] = idx[i, j, 0]
+ distance[i, j, 0] = valid_d
+ idx[i, j, 0] = k
+ elif d < distance[i, j, 1]:
+ distance[i, j, 2] = distance[i, j, 1]
+ idx[i, j, 2] = idx[i, j, 1]
+ distance[i, j, 1] = valid_d
+ idx[i, j, 1] = k
+ elif d < distance[i, j, 2]:
+ distance[i, j, 2] = valid_d
+ idx[i, j, 2] = k
+ return distance, idx
+
+
+class TestThreeNNOp(unittest.TestCase):
+ def test_check_output(self):
+ input_shape = [16, 32, 3]
+ known_shape = [16, 8, 3]
+ input_type = 'float32'
+ eps = 1e-10
+
+ x = fluid.layers.data(
+ name='x', shape=input_shape, dtype=input_type, append_batch_size=False)
+ known = fluid.layers.data(
+ name='known', shape=known_shape, dtype=input_type, append_batch_size=False)
+ dist, idx = pointnet_lib.three_nn(x, known, eps)
+
+ x_np = np.random.random(input_shape).astype(input_type)
+ known_np = np.random.random(known_shape).astype(input_type)
+ dist_np, idx_np = three_nn_np(x_np, known_np, eps)
+
+ place = fluid.CUDAPlace(0)
+ exe = fluid.Executor(place)
+ outs = exe.run(feed={'x': x_np, 'known': known_np}, fetch_list=[dist, idx])
+
+ self.assertTrue(np.allclose(outs[0], dist_np))
+ self.assertTrue(np.allclose(outs[1], idx_np))
+
+
+if __name__ == "__main__":
+ unittest.main()
diff --git a/PaddleCV/Paddle3D/PointNet++/image/pointnet2.jpg b/PaddleCV/Paddle3D/PointNet++/image/pointnet2.jpg
new file mode 100644
index 0000000000000000000000000000000000000000..5d3b0f4ec8f614332ea258e7bd8d64d261264a92
Binary files /dev/null and b/PaddleCV/Paddle3D/PointNet++/image/pointnet2.jpg differ
diff --git a/PaddleCV/Paddle3D/PointNet++/models/__init__.py b/PaddleCV/Paddle3D/PointNet++/models/__init__.py
new file mode 100644
index 0000000000000000000000000000000000000000..2435d63de22342ce5f3219f6f0347e0afb27c014
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointNet++/models/__init__.py
@@ -0,0 +1,27 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
+#
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
+
+from __future__ import absolute_import
+
+from . import pointnet2_modules
+from . import pointnet2_seg
+from . import pointnet2_cls
+
+from .pointnet2_modules import *
+from .pointnet2_seg import *
+from .pointnet2_cls import *
+
+__all__ = pointnet2_modules.__all__
+__all__ += pointnet2_seg.__all__
+__all__ += pointnet2_cls.__all__
diff --git a/PaddleCV/Paddle3D/PointNet++/models/pointnet2_cls.py b/PaddleCV/Paddle3D/PointNet++/models/pointnet2_cls.py
new file mode 100644
index 0000000000000000000000000000000000000000..778433c17794ebfeb520f59655c2c4772ce23b0a
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointNet++/models/pointnet2_cls.py
@@ -0,0 +1,151 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
+#
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
+"""
+Contains PointNet++ classification models
+"""
+
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
+import numpy as np
+
+import paddle.fluid as fluid
+from paddle.fluid.param_attr import ParamAttr
+from paddle.fluid.initializer import Constant
+from .pointnet2_modules import *
+
+__all__ = ["PointNet2ClsSSG", "PointNet2ClsMSG"]
+
+
+class PointNet2Cls(object):
+ def __init__(self, num_classes, num_points, use_xyz=True):
+ self.num_classes = num_classes
+ self.num_points = num_points
+ self.use_xyz = use_xyz
+ self.out_feature = None
+ self.pyreader = None
+ self.model_config()
+
+ def model_config(self):
+ self.SA_confs = []
+
+ def build_input(self):
+ self.xyz = fluid.layers.data(name='xyz', shape=[self.num_points, 3], dtype='float32', lod_level=0)
+ self.label = fluid.layers.data(name='label', shape=[1], dtype='int64', lod_level=0)
+ self.pyreader = fluid.io.PyReader(
+ feed_list=[self.xyz, self.label],
+ capacity=64,
+ use_double_buffer=True,
+ iterable=False)
+ self.feed_vars = [self.xyz, self.label]
+
+ def build_model(self, bn_momentum=0.99):
+ self.build_input()
+
+ xyz, feature = self.xyz, None
+ for i, SA_conf in enumerate(self.SA_confs):
+ xyz, feature = pointnet_sa_module(
+ xyz=xyz,
+ feature=feature,
+ bn_momentum=bn_momentum,
+ use_xyz=self.use_xyz,
+ name="sa_{}".format(i),
+ **SA_conf)
+
+ out = fluid.layers.squeeze(feature, axes=[-1])
+        out = fc_bn(out, out_channels=512, bn=True, bn_momentum=bn_momentum, name="fc_1")
+        out = fluid.layers.dropout(out, 0.5, dropout_implementation="upscale_in_train")
+        out = fc_bn(out, out_channels=256, bn=True, bn_momentum=bn_momentum, name="fc_2")
+        out = fluid.layers.dropout(out, 0.5, dropout_implementation="upscale_in_train")
+        out = fc_bn(out, out_channels=self.num_classes, act=None, name="fc_3")
+ pred = fluid.layers.softmax(out)
+
+ # calc loss
+ self.loss = fluid.layers.cross_entropy(pred, self.label)
+ self.loss = fluid.layers.reduce_mean(self.loss)
+
+ # calc acc
+ pred = fluid.layers.reshape(pred, shape=[-1, self.num_classes])
+ label = fluid.layers.reshape(self.label, shape=[-1, 1])
+ self.acc1 = fluid.layers.accuracy(pred, label, k=1)
+
+ def get_feeds(self):
+ return self.feed_vars
+
+ def get_outputs(self):
+ return {"loss": self.loss, "accuracy": self.acc1}
+
+ def get_pyreader(self):
+ return self.pyreader
+
+
+class PointNet2ClsSSG(PointNet2Cls):
+ def __init__(self, num_classes, num_points, use_xyz=True):
+ super(PointNet2ClsSSG, self).__init__(num_classes, num_points, use_xyz)
+
+ def model_config(self):
+ self.SA_confs = [
+ {
+ "npoint": 512,
+ "radiuss": [0.2],
+ "nsamples": [64],
+ "mlps": [[64, 64, 128]],
+ },
+ {
+ "npoint": 128,
+ "radiuss": [0.4],
+ "nsamples": [64],
+ "mlps": [[128, 128, 256]],
+ },
+ {
+ "npoint":None,
+ "radiuss": [None],
+ "nsamples":[None],
+ "mlps": [[256, 512, 1024]],
+ },
+ ]
+
+
+class PointNet2ClsMSG(PointNet2Cls):
+ def __init__(self, num_classes, num_points, use_xyz=True):
+ super(PointNet2ClsMSG, self).__init__(num_classes, num_points, use_xyz)
+
+ def model_config(self):
+ self.SA_confs = [
+ {
+ "npoint": 512,
+ "radiuss": [0.1, 0.2, 0.4],
+ "nsamples": [16, 32, 128],
+ "mlps": [[32, 32, 64],
+ [64, 64, 128],
+                         [64, 96, 128]],
+ },
+ {
+ "npoint": 128,
+ "radiuss": [0.2, 0.4, 0.8],
+ "nsamples": [32, 64, 128],
+ "mlps": [[64, 64, 128],
+ [128, 128, 256],
+                         [128, 128, 256]],
+ },
+ {
+ "npoint":None,
+ "radiuss": [None],
+ "nsamples":[None],
+ "mlps": [[256, 512, 1024]],
+ },
+ ]
+
+
diff --git a/PaddleCV/Paddle3D/PointNet++/models/pointnet2_modules.py b/PaddleCV/Paddle3D/PointNet++/models/pointnet2_modules.py
new file mode 100644
index 0000000000000000000000000000000000000000..08cc15ae1ea9730da670738c79c14b80c5557361
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointNet++/models/pointnet2_modules.py
@@ -0,0 +1,219 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
+#
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
+"""
+Contains PointNet++ utility functions.
+"""
+
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
+import numpy as np
+
+import paddle.fluid as fluid
+from paddle.fluid.param_attr import ParamAttr
+from paddle.fluid.initializer import Constant
+from ext_op import *
+
+__all__ = ["conv_bn", "pointnet_sa_module", "pointnet_fp_module","fc_bn"]
+
+
+def query_and_group(xyz, new_xyz, radius, nsample, features=None, use_xyz=True):
+ """
+ Perform query_ball and group_points
+
+    Args:
+        xyz (Variable): xyz coordinates features with shape [B, N, 3]
+        new_xyz (Variable): centroid features with shape [B, npoint, 3]
+        radius (float32): radius of the ball
+        nsample (int32): maximum number of features to gather
+        features (Variable): features with shape [B, C, N]
+        use_xyz (bool): whether to use xyz coordinates as features
+
+    Returns:
+        out (Variable): grouped features with shape [B, C + 3, npoint, nsample]
+    """
+ idx = query_ball(xyz, new_xyz, radius, nsample)
+ idx.stop_gradient = True
+    xyz = fluid.layers.transpose(xyz, perm=[0, 2, 1])
+ grouped_xyz = group_points(xyz, idx)
+ expand_new_xyz = fluid.layers.unsqueeze(fluid.layers.transpose(new_xyz, perm=[0, 2, 1]), axes=[-1])
+ expand_new_xyz = fluid.layers.expand(expand_new_xyz, [1, 1, 1, grouped_xyz.shape[3]])
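+    # translate the grouped coordinates into each centroid's local frame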
+ grouped_xyz -= expand_new_xyz
+
+ if features is not None:
+ grouped_features = group_points(features, idx)
+ return fluid.layers.concat([grouped_xyz, grouped_features], axis=1) \
+ if use_xyz else grouped_features
+ else:
+ assert use_xyz, "use_xyz should be True when features is None"
+ return grouped_xyz
+
+
+def group_all(xyz, features=None, use_xyz=True):
+ """
+ Group all xyz and features when npoint is None
+ See query_and_group
+ """
+    xyz = fluid.layers.transpose(xyz, perm=[0, 2, 1])
+ grouped_xyz = fluid.layers.unsqueeze(xyz, axes=[2])
+ if features is not None:
+ grouped_features = fluid.layers.unsqueeze(features, axes=[2])
+ return fluid.layers.concat([grouped_xyz, grouped_features], axis=1) if use_xyz else grouped_features
+ else:
+ return grouped_xyz
+
+
+def conv_bn(input, out_channels, bn=True, bn_momentum=0.99, act='relu', name=None):
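+    # a conv2d with filter_size=1 over [B, C, npoint, nsample] feature maps
+    # acts as a per-point shared MLP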
+ param_attr = ParamAttr(name='{}_conv_weight'.format(name),)
+ bias_attr = ParamAttr(name='{}_conv_bias'.format(name)) \
+ if not bn else False
+ out = fluid.layers.conv2d(input,
+ num_filters=out_channels,
+ filter_size=1,
+ stride=1,
+ padding=0,
+ dilation=1,
+ param_attr=param_attr,
+ bias_attr=bias_attr,
+ act=act if not bn else None)
+ if bn:
+ bn_name = name + "_bn"
+ out = fluid.layers.batch_norm(out,
+ act=act,
+ momentum=bn_momentum,
+ param_attr=ParamAttr(name=bn_name + "_scale"),
+ bias_attr=ParamAttr(name=bn_name + "_offset"),
+ moving_mean_name=bn_name + '_mean',
+ moving_variance_name=bn_name + '_var')
+
+ return out
+
+def fc_bn(input, out_channels, bn=False, bn_momentum=0.99, act='relu', name=None):
+ param_attr = ParamAttr(name='{}_fc_weight'.format(name))
+ if not bn:
+ bias_attr = ParamAttr(name='{}_fc_bias'.format(name))
+ else:
+ bias_attr = False
+ out = fluid.layers.fc(input,
+ size=out_channels,
+ param_attr=param_attr,
+ bias_attr=bias_attr)
+ if bn:
+ bn_name = name + "_bn"
+ out = fluid.layers.batch_norm(out,
+ momentum=bn_momentum,
+ param_attr=ParamAttr(name=bn_name + "_scale"),
+ bias_attr=ParamAttr(name=bn_name + "_offset"),
+ moving_mean_name=bn_name + '_mean',
+ moving_variance_name=bn_name + '_var')
+ if act == "relu":
+ out = fluid.layers.relu(out)
+ return out
+
+def MLP(features, out_channels_list, bn=True, bn_momentum=0.99, act='relu', name=None):
+ out = features
+ for i, out_channels in enumerate(out_channels_list):
+ out = conv_bn(out, out_channels, bn=bn, act=act, bn_momentum=bn_momentum, name=name + "_{}".format(i))
+ return out
+
+
+def pointnet_sa_module(xyz,
+ npoint=None,
+ radiuss=[],
+ nsamples=[],
+ mlps=[],
+ feature=None,
+ bn=True,
+ bn_momentum=0.99,
+ use_xyz=True,
+ name=None):
+ """
+ PointNet MSG(Multi-Scale Group) Set Abstraction Module.
+ Call with radiuss, nsamples, mlps as single element list for
+ SSG(Single-Scale Group).
+
+    Args:
+        xyz (Variable): xyz coordinates features with shape [B, N, 3]
+        npoint (int32): number of sampled centroids, None to group all points
+        radiuss ([float32]): list of ball radii
+        nsamples ([int32]): list of maximum numbers of features to gather
+        mlps ([[int32]]): list of out_channels_list
+        feature (Variable): features with shape [B, C, N]
+        bn (bool): whether to apply batch norm after conv2d
+        bn_momentum (float): momentum of batch norm
+        use_xyz (bool): whether to use xyz coordinates as features
+
+    Returns:
+        new_xyz (Variable): centroid features with shape [B, npoint, 3]
+        out (Variable): features with shape [B, \sum_i{mlps[i][-1]}, npoint]
+ """
+    assert len(radiuss) == len(nsamples) == len(mlps), \
+        "radiuss, nsamples and mlps should have the same length"
+
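+    # sample npoint centroids with farthest point sampling; the sampled indices are not differentiable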
+ farthest_idx = farthest_point_sampling(xyz, npoint)
+ farthest_idx.stop_gradient = True
+ new_xyz = gather_point(xyz, farthest_idx) if npoint is not None else None
+
+ outs = []
+ for i, (radius, nsample, mlp) in enumerate(zip(radiuss, nsamples, mlps)):
+        out = query_and_group(xyz, new_xyz, radius, nsample, feature, use_xyz) \
+            if npoint is not None else group_all(xyz, feature, use_xyz)
+ out = MLP(out, mlp, bn=bn, bn_momentum=bn_momentum, name=name + '_mlp{}'.format(i))
+ out = fluid.layers.pool2d(out, pool_size=[1, out.shape[3]], pool_type='max')
+ out = fluid.layers.squeeze(out, axes=[-1])
+ outs.append(out)
+ out = fluid.layers.concat(outs, axis=1)
+
+ return (new_xyz, out)
+
+
+def pointnet_fp_module(unknown, known, unknown_feats, known_feats, mlp, bn=True, bn_momentum=0.99, name=None):
+ """
+ PointNet Feature Propagation Module
+
+ Args:
+        unknown (Variable): unknown xyz coordinates features with shape [B, N, 3]
+        known (Variable): known xyz coordinates features with shape [B, M, 3]
+ unknown_feats (Variable): unknown features with shape [B, N, C1] to be propagated to
+ known_feats (Variable): known features with shape [B, M, C2] to be propagated from
+ mlp ([int32]): out_channels_list
+ bn (bool): whether perform batch norm after conv2d
+ bn_momentum (float): momentum of batch norm
+
+ Returns:
+ new_features (Variable): new features with shape [B, N, mlp[-1]]
+ """
+ if known is None:
+        raise NotImplementedError("known as None is not implemented currently.")
+ else:
+ dist, idx = three_nn(unknown, known, eps=0)
+ dist.stop_gradient = True
+ idx.stop_gradient = True
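+        # inverse distance weighting: weight_i = (1 / dist_i) / sum_j (1 / dist_j)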
+ dist = fluid.layers.sqrt(dist)
+ ones = fluid.layers.fill_constant_batch_size_like(dist, dist.shape, dist.dtype, 1)
+        dist_recip = ones / (dist + 1e-8)  # 1.0 / dist, guarded against division by zero
+ norm = fluid.layers.reduce_sum(dist_recip, dim=-1, keep_dim=True)
+ weight = dist_recip / norm
+ weight.stop_gradient = True
+ interp_feats = three_interp(known_feats, weight, idx)
+
+ new_features = interp_feats if unknown_feats is None else \
+ fluid.layers.concat([interp_feats, unknown_feats], axis=-1)
+ new_features = fluid.layers.transpose(new_features, perm=[0, 2, 1])
+ new_features = fluid.layers.unsqueeze(new_features, axes=[-1])
+ new_features = MLP(new_features, mlp, bn=bn, bn_momentum=bn_momentum, name=name + '_mlp')
+ new_features = fluid.layers.squeeze(new_features, axes=[-1])
+ new_features = fluid.layers.transpose(new_features, perm=[0, 2, 1])
+
+ return new_features
+
diff --git a/PaddleCV/Paddle3D/PointNet++/models/pointnet2_seg.py b/PaddleCV/Paddle3D/PointNet++/models/pointnet2_seg.py
new file mode 100644
index 0000000000000000000000000000000000000000..04d6d73e2b6d066aa940ab176698af3738b4de94
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointNet++/models/pointnet2_seg.py
@@ -0,0 +1,188 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
+#
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
+"""
+Contains PointNet++ SSG/MSG semantic segmentation models
+"""
+
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
+import numpy as np
+
+import paddle.fluid as fluid
+from paddle.fluid.param_attr import ParamAttr
+from paddle.fluid.initializer import Constant
+from .pointnet2_modules import *
+
+__all__ = ["PointNet2SemSegSSG", "PointNet2SemSegMSG"]
+
+
+class PointNet2SemSeg(object):
+ def __init__(self, num_classes, num_points, use_xyz=True):
+ self.num_classes = num_classes
+ self.num_points = num_points
+ self.use_xyz = use_xyz
+ self.feed_vars = []
+ self.out_feature = None
+ self.pyreader = None
+ self.model_config()
+
+ def model_config(self):
+ self.SA_confs = []
+ self.FP_confs = []
+
+ def build_input(self):
+ self.xyz = fluid.layers.data(name='xyz', shape=[self.num_points, 3], dtype='float32', lod_level=0)
+ self.feature = fluid.layers.data(name='feature', shape=[self.num_points, 6], dtype='float32', lod_level=0)
+ self.label = fluid.layers.data(name='label', shape=[self.num_points, 1], dtype='int64', lod_level=0)
+ self.pyreader = fluid.io.PyReader(
+ feed_list=[self.xyz, self.feature, self.label],
+ capacity=64,
+ use_double_buffer=True,
+ iterable=False)
+ self.feed_vars = [self.xyz, self.feature, self.label]
+
+ def build_model(self, bn_momentum=0.99):
+ self.build_input()
+
+ xyzs, features = [self.xyz], [self.feature]
+ xyzi, featurei = xyzs[-1], fluid.layers.transpose(self.feature, perm=[0, 2, 1])
+ for i, SA_conf in enumerate(self.SA_confs):
+ xyzi, featurei = pointnet_sa_module(
+ xyz=xyzi,
+ feature=featurei,
+ bn_momentum=bn_momentum,
+ use_xyz=self.use_xyz,
+ name="sa_{}".format(i),
+ **SA_conf)
+ xyzs.append(xyzi)
+ features.append(fluid.layers.transpose(featurei, perm=[0, 2, 1]))
+ for i in range(-1, -(len(self.FP_confs) + 1), -1):
+ features[i - 1] = pointnet_fp_module(
+ unknown=xyzs[i - 1],
+ known=xyzs[i],
+ unknown_feats=features[i - 1],
+ known_feats=features[i],
+ bn_momentum=bn_momentum,
+ name="fp_{}".format(i+len(self.FP_confs)),
+ **self.FP_confs[i])
+
+ out = fluid.layers.transpose(features[0], perm=[0, 2, 1])
+ out = fluid.layers.unsqueeze(out, axes=[-1])
+ out = conv_bn(out, out_channels=128, bn=True, bn_momentum=bn_momentum, name="output_1")
+ out = fluid.layers.dropout(out, 0.5, dropout_implementation="upscale_in_train")
+ out = conv_bn(out, out_channels=self.num_classes, bn=False, act=None, name="output_2")
+ out = fluid.layers.squeeze(out, axes=[-1])
+ out = fluid.layers.transpose(out, perm=[0, 2, 1])
+ pred = fluid.layers.softmax(out)
+
+ # calc loss
+ self.loss = fluid.layers.cross_entropy(pred, self.label)
+ self.loss = fluid.layers.reduce_mean(self.loss)
+
+ # calc acc
+ pred = fluid.layers.reshape(pred, shape=[-1, self.num_classes])
+ label = fluid.layers.reshape(self.label, shape=[-1, 1])
+ self.acc1 = fluid.layers.accuracy(pred, label, k=1)
+
+ def get_feeds(self):
+ return self.feed_vars
+
+ def get_outputs(self):
+ return {"loss": self.loss, "accuracy": self.acc1}
+
+ def get_pyreader(self):
+ return self.pyreader
+
+
+class PointNet2SemSegSSG(PointNet2SemSeg):
+    def __init__(self, num_classes, num_points, use_xyz=True):
+        super(PointNet2SemSegSSG, self).__init__(num_classes, num_points, use_xyz)
+
+ def model_config(self):
+ self.SA_confs = [
+ {
+ "npoint": 1024,
+ "radiuss": [0.1],
+ "nsamples": [32],
+ "mlps": [[32, 32, 64]],
+ },
+ {
+ "npoint": 256,
+ "radiuss": [0.2],
+ "nsamples": [32],
+ "mlps": [[64, 64, 128]],
+ },
+ {
+ "npoint": 64,
+ "radiuss": [0.4],
+ "nsamples": [32],
+ "mlps": [[128, 128, 256]],
+ },
+ {
+ "npoint": 16,
+ "radiuss": [0.8],
+ "nsamples": [32],
+ "mlps": [[256, 256, 512]],
+ },
+ ]
+
+ self.FP_confs = [
+ {"mlp": [128, 128, 128]},
+ {"mlp": [256, 128]},
+ {"mlp": [256, 256]},
+ {"mlp": [256, 256]},
+ ]
+
+
+class PointNet2SemSegMSG(PointNet2SemSeg):
+    def __init__(self, num_classes, num_points, use_xyz=True):
+        super(PointNet2SemSegMSG, self).__init__(num_classes, num_points, use_xyz)
+
+ def model_config(self):
+ self.SA_confs = [
+ {
+ "npoint": 1024,
+ "radiuss": [0.05, 0.1],
+ "nsamples": [16, 32],
+ "mlps": [[16, 16, 32], [32, 32, 64]],
+ },
+ {
+ "npoint": 256,
+ "radiuss": [0.1, 0.2],
+ "nsamples": [16, 32],
+ "mlps": [[64, 64, 128], [64, 96, 128]],
+ },
+ {
+ "npoint": 64,
+ "radiuss": [0.2, 0.4],
+ "nsamples": [16, 32],
+ "mlps": [[128, 196, 256], [128, 196, 256]],
+ },
+ {
+ "npoint": 16,
+ "radiuss": [0.4, 0.8],
+ "nsamples": [16, 32],
+ "mlps": [[256, 256, 512], [256, 384, 512]],
+ },
+ ]
+
+ self.FP_confs = [
+ {"mlp": [128, 128]},
+ {"mlp": [256, 256]},
+ {"mlp": [512, 512]},
+ {"mlp": [512, 512]},
+ ]
+
diff --git a/PaddleCV/Paddle3D/PointNet++/scripts/eval_cls.sh b/PaddleCV/Paddle3D/PointNet++/scripts/eval_cls.sh
new file mode 100644
index 0000000000000000000000000000000000000000..b8ddac6e94dad30794ddb2e0d21f1afb7cf385bd
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointNet++/scripts/eval_cls.sh
@@ -0,0 +1,4 @@
+export CUDA_VISIBLE_DEVICES=0
+
+export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:`python -c 'import paddle; print(paddle.sysconfig.get_lib())'`
+python eval_cls.py --model=MSG --weights=checkpoints/200
diff --git a/PaddleCV/Paddle3D/PointNet++/scripts/eval_seg.sh b/PaddleCV/Paddle3D/PointNet++/scripts/eval_seg.sh
new file mode 100644
index 0000000000000000000000000000000000000000..3fb7583cc7f54fcedacebcf96df1a61aa121fc54
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointNet++/scripts/eval_seg.sh
@@ -0,0 +1,4 @@
+export CUDA_VISIBLE_DEVICES=0
+
+export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:`python -c 'import paddle; print(paddle.sysconfig.get_lib())'`
+python eval_seg.py --model=MSG --weights=checkpoints/200
diff --git a/PaddleCV/Paddle3D/PointNet++/scripts/train_cls.sh b/PaddleCV/Paddle3D/PointNet++/scripts/train_cls.sh
new file mode 100644
index 0000000000000000000000000000000000000000..fdcd8d42b6570c463808d5e96b6344bb892211e2
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointNet++/scripts/train_cls.sh
@@ -0,0 +1,4 @@
+export CUDA_VISIBLE_DEVICES=0
+
+export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:`python -c 'import paddle; print(paddle.sysconfig.get_lib())'`
+python train_cls.py --model=MSG --batch_size=16 --num_points=4096 --epoch=200
diff --git a/PaddleCV/Paddle3D/PointNet++/scripts/train_seg.sh b/PaddleCV/Paddle3D/PointNet++/scripts/train_seg.sh
new file mode 100644
index 0000000000000000000000000000000000000000..2b2055056c2eb821f685edb330bf56cdbe19713f
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointNet++/scripts/train_seg.sh
@@ -0,0 +1,4 @@
+export CUDA_VISIBLE_DEVICES=0
+
+export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:`python -c 'import paddle; print(paddle.sysconfig.get_lib())'`
+python train_seg.py --model=MSG --batch_size=32 --num_points=4096 --epoch=201
diff --git a/PaddleCV/Paddle3D/PointNet++/train_cls.py b/PaddleCV/Paddle3D/PointNet++/train_cls.py
new file mode 100644
index 0000000000000000000000000000000000000000..f9b49b9dceacc48848a9fa9c3570e2fbf8d79a76
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointNet++/train_cls.py
@@ -0,0 +1,302 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
+#
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
+
+import os
+import sys
+import time
+import shutil
+import argparse
+import ast
+import logging
+import numpy as np
+import paddle.fluid as fluid
+import paddle.fluid.framework as framework
+
+from models import *
+from data.modelnet40_reader import ModelNet40ClsReader
+from data.data_utils import *
+from utils import *
+
+logging.root.handlers = []
+FORMAT = '%(asctime)s-%(levelname)s: %(message)s'
+logging.basicConfig(level=logging.INFO, format=FORMAT, stream=sys.stdout)
+logger = logging.getLogger(__name__)
+
+
+def parse_args():
+ parser = argparse.ArgumentParser("PointNet++ classification train script")
+ parser.add_argument(
+ '--model',
+ type=str,
+ default='MSG',
+ help='SSG or MSG model to train, default MSG')
+ parser.add_argument(
+ '--use_gpu',
+ type=ast.literal_eval,
+ default=True,
+ help='default use gpu.')
+ parser.add_argument(
+ '--batch_size',
+ type=int,
+ default=16,
+ help='training batch size, default 16')
+ parser.add_argument(
+ '--num_points',
+ type=int,
+ default=4096,
+ help='number of points in a sample, default: 4096')
+ parser.add_argument(
+ '--num_classes',
+ type=int,
+ default=40,
+ help='number of classes in dataset, default: 40')
+ parser.add_argument(
+ '--lr',
+ type=float,
+ default=0.01,
+ help='initial learning rate, default 0.01')
+ parser.add_argument(
+ '--lr_decay',
+ type=float,
+ default=0.7,
+        help='learning rate decay gamma, default 0.7')
+ parser.add_argument(
+ '--bn_momentum',
+ type=float,
+ default=0.99,
+ help='initial batch norm momentum, default 0.99')
+ parser.add_argument(
+ '--decay_steps',
+ type=int,
+ default=12500,
+ help='learning rate and batch norm momentum decay steps, default 12500')
+ parser.add_argument(
+ '--weight_decay',
+ type=float,
+ default=1e-5,
+ help='L2 regularization weight decay coeff, default 1e-5.')
+ parser.add_argument(
+ '--epoch',
+ type=int,
+ default=201,
+ help='epoch number. default 201.')
+ parser.add_argument(
+ '--data_dir',
+ type=str,
+ default='dataset/ModelNet40/modelnet40_ply_hdf5_2048',
+ help='dataset directory')
+ parser.add_argument(
+ '--save_dir',
+ type=str,
+ default='checkpoints_cls',
+        help='directory name to save training snapshots')
+ parser.add_argument(
+ '--resume',
+ type=str,
+ default=None,
+ help='path to resume training based on previous checkpoints. '
+ 'None for not resuming any checkpoints.')
+ parser.add_argument(
+ '--log_interval',
+ type=int,
+ default=1,
+ help='mini-batch interval for logging.')
+ parser.add_argument(
+ '--enable_ce',
+ action='store_true',
+ help='The flag indicating whether to run the task '
+ 'for continuous evaluation.')
+ args = parser.parse_args()
+ return args
+
+
+def train():
+ args = parse_args()
+ print_arguments(args)
+ # check whether the installed paddle is compiled with GPU
+ check_gpu(args.use_gpu)
+
+ if not os.path.isdir(args.save_dir):
+ os.makedirs(args.save_dir)
+
+ assert args.model in ['MSG', 'SSG'], \
+ "--model can only be 'MSG' or 'SSG'"
+
+ # build model
+ if args.enable_ce:
+ SEED = 102
+ fluid.default_main_program().random_seed = SEED
+ framework.default_startup_program().random_seed = SEED
+
+ startup = fluid.Program()
+ train_prog = fluid.Program()
+ with fluid.program_guard(train_prog, startup):
+ with fluid.unique_name.guard():
+ train_model = PointNet2ClsMSG(args.num_classes, args.num_points) \
+ if args.model == "MSG" else \
+ PointNet2ClsSSG(args.num_classes, args.num_points)
+ train_model.build_model(bn_momentum=args.bn_momentum)
+ train_feeds = train_model.get_feeds()
+ train_pyreader = train_model.get_pyreader()
+ train_outputs = train_model.get_outputs()
+ train_loss = train_outputs['loss']
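+            # staircase exponential decay, clipped below at 1e-5 so the learning rate never vanishes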
+ lr = fluid.layers.exponential_decay(
+ learning_rate=args.lr,
+ decay_steps=args.decay_steps,
+ decay_rate=args.lr_decay,
+ staircase=True)
+ lr = fluid.layers.clip(lr, 1e-5, args.lr)
+ optimizer = fluid.optimizer.Adam(learning_rate=lr,
+ regularization=fluid.regularizer.L2Decay(args.weight_decay))
+ optimizer.minimize(train_loss)
+ train_keys, train_values = parse_outputs(train_outputs)
+
+ test_prog = fluid.Program()
+ with fluid.program_guard(test_prog, startup):
+ with fluid.unique_name.guard():
+ test_model = PointNet2ClsMSG(args.num_classes, args.num_points) \
+ if args.model == "MSG" else \
+ PointNet2ClsSSG(args.num_classes, args.num_points)
+ test_model.build_model()
+ test_feeds = test_model.get_feeds()
+ test_outputs = test_model.get_outputs()
+ test_pyreader = test_model.get_pyreader()
+ test_prog = test_prog.clone(True)
+ test_keys, test_values = parse_outputs(test_outputs)
+
+ place = fluid.CUDAPlace(0) if args.use_gpu else fluid.CPUPlace()
+ exe = fluid.Executor(place)
+ exe.run(startup)
+
+ if args.resume:
+        assert os.path.exists(args.resume), \
+            "Given resume weight dir {} does not exist.".format(args.resume)
+ def if_exist(var):
+ return os.path.exists(os.path.join(args.resume, var.name))
+ fluid.io.load_vars(
+ exe, args.resume, predicate=if_exist, main_program=train_prog)
+
+ build_strategy = fluid.BuildStrategy()
+ build_strategy.memory_optimize = False
+ build_strategy.enable_inplace = False
+ build_strategy.fuse_all_optimizer_ops = False
+ train_compile_prog = fluid.compiler.CompiledProgram(
+ train_prog).with_data_parallel(loss_name=train_loss.name,
+ build_strategy=build_strategy)
+ test_compile_prog = fluid.compiler.CompiledProgram(test_prog)
+
+ def save_model(exe, prog, path):
+ if os.path.isdir(path):
+ shutil.rmtree(path)
+ logger.info("Save model to {}".format(path))
+ fluid.io.save_persistables(exe, path, prog)
+
+ # get reader
+ trans_list = [
+ PointcloudScale(),
+ PointcloudRotate(),
+ PointcloudRotatePerturbation(),
+ PointcloudTranslate(),
+ PointcloudJitter(),
+ PointcloudRandomInputDropout(),
+ ]
+ modelnet_reader = ModelNet40ClsReader(args.data_dir, mode='train', transforms=trans_list)
+ train_reader = modelnet_reader.get_reader(args.batch_size, args.num_points)
+ train_pyreader.decorate_sample_list_generator(train_reader, place)
+ modelnet_reader = ModelNet40ClsReader(args.data_dir, mode='test', transforms=None)
+ test_reader = modelnet_reader.get_reader(args.batch_size, args.num_points)
+ test_pyreader.decorate_sample_list_generator(test_reader, place)
+
+ train_stat = Stat()
+ test_stat = Stat()
+
+ ce_time = 0
+ ce_loss = []
+
+ for epoch_id in range(args.epoch):
+ try:
+ train_pyreader.start()
+ train_iter = 0
+ train_periods = []
+ while True:
+ cur_time = time.time()
+ train_outs = exe.run(train_compile_prog, fetch_list=train_values + [lr.name])
+ period = time.time() - cur_time
+ train_periods.append(period)
+ train_stat.update(train_keys, train_outs[:-1])
+ if train_iter % args.log_interval == 0:
+ log_str = ""
+ for name, values in zip(train_keys + ['learning_rate'], train_outs):
+ log_str += "{}: {:.5f}, ".format(name, np.mean(values))
+ if name == 'loss':
+ ce_loss.append(np.mean(values))
+ logger.info("[TRAIN] Epoch {}, batch {}: {}time: {:.2f}".format(epoch_id, train_iter, log_str, period))
+ train_iter += 1
+ except fluid.core.EOFException:
+ logger.info("[TRAIN] Epoch {} finished, {}average time: {:.2f}".format(epoch_id, train_stat.get_mean_log(), np.mean(train_periods[1:])))
+ ce_time = np.mean(train_periods[1:])
+ save_model(exe, train_prog, os.path.join(args.save_dir, str(epoch_id)))
+
+ # evaluation
+ if not args.enable_ce:
+ try:
+ test_pyreader.start()
+ test_iter = 0
+ test_periods = []
+ while True:
+ cur_time = time.time()
+ test_outs = exe.run(test_compile_prog, fetch_list=test_values)
+ period = time.time() - cur_time
+ test_periods.append(period)
+ test_stat.update(test_keys, test_outs)
+ if test_iter % args.log_interval == 0:
+ log_str = ""
+ for name, value in zip(test_keys, test_outs):
+ log_str += "{}: {:.4f}, ".format(name, np.mean(value))
+ logger.info("[TEST] Epoch {}, batch {}: {}time: {:.2f}".format(epoch_id, test_iter, log_str, period))
+ test_iter += 1
+ except fluid.core.EOFException:
+ logger.info("[TEST] Epoch {} finished, {}average time: {:.2f}".format(epoch_id, test_stat.get_mean_log(), np.mean(test_periods[1:])))
+ finally:
+ test_pyreader.reset()
+ test_stat.reset()
+ test_periods = []
+
+ finally:
+ train_pyreader.reset()
+ train_stat.reset()
+ train_periods = []
+
+ # only for ce
+ if args.enable_ce:
+ card_num = get_cards()
+ _loss = 0
+ _time = 0
+ try:
+ _time = ce_time
+ _loss = np.mean(ce_loss[1:])
+        except Exception:
+ print("ce info error")
+ print("kpis\ttrain_cls_%s_duration_card%s\t%s" % (args.model, card_num, _time))
+ print("kpis\ttrain_cls_%s_loss_card%s\t%f" % (args.model, card_num, _loss))
+
+def get_cards():
+ num = 0
+ cards = os.environ.get('CUDA_VISIBLE_DEVICES', '')
+ if cards != '':
+ num = len(cards.split(","))
+ return num
+
+if __name__ == "__main__":
+ train()
diff --git a/PaddleCV/Paddle3D/PointNet++/train_seg.py b/PaddleCV/Paddle3D/PointNet++/train_seg.py
new file mode 100644
index 0000000000000000000000000000000000000000..11eaabc7b298f70c9b64faa630541ce8d1d89ec6
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointNet++/train_seg.py
@@ -0,0 +1,292 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
+#
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
+
+import os
+import sys
+import time
+import shutil
+import argparse
+import ast
+import logging
+import numpy as np
+import paddle.fluid as fluid
+import paddle.fluid.framework as framework
+
+from models import *
+from data.indoor3d_reader import Indoor3DReader
+from utils import *
+
+logging.root.handlers = []
+FORMAT = '%(asctime)s-%(levelname)s: %(message)s'
+logging.basicConfig(level=logging.INFO, format=FORMAT, stream=sys.stdout)
+logger = logging.getLogger(__name__)
+
+
+def parse_args():
+ parser = argparse.ArgumentParser("PointNet++ semantic segmentation train script")
+ parser.add_argument(
+ '--model',
+ type=str,
+ default='MSG',
+ help='SSG or MSG model to train, default MSG')
+ parser.add_argument(
+ '--use_gpu',
+ type=ast.literal_eval,
+ default=True,
+ help='default use gpu.')
+ parser.add_argument(
+ '--batch_size',
+ type=int,
+ default=32,
+ help='training batch size, default 32')
+ parser.add_argument(
+ '--num_points',
+ type=int,
+ default=4096,
+ help='number of points in a sample, default: 4096')
+ parser.add_argument(
+ '--num_classes',
+ type=int,
+ default=13,
+ help='number of classes in dataset, default: 13')
+ parser.add_argument(
+ '--lr',
+ type=float,
+ default=0.01,
+ help='initial learning rate, default 0.01')
+ parser.add_argument(
+ '--lr_decay',
+ type=float,
+ default=0.5,
+ help='learning rate decay gamma, default 0.5')
+ parser.add_argument(
+ '--bn_momentum',
+ type=float,
+ default=0.99,
+ help='initial batch norm momentum, default 0.99')
+ parser.add_argument(
+ '--decay_steps',
+ type=int,
+ default=6250,
+ help='learning rate and batch norm momentum decay steps, default 6250')
+ parser.add_argument(
+ '--weight_decay',
+ type=float,
+ default=0.,
+ help='L2 regularization weight decay coeff, default 0.')
+ parser.add_argument(
+ '--epoch',
+ type=int,
+ default=201,
+ help='epoch number. default 201.')
+ parser.add_argument(
+ '--data_dir',
+ type=str,
+ default='dataset/Indoor3DSemSeg/indoor3d_sem_seg_hdf5_data',
+ help='dataset directory')
+ parser.add_argument(
+ '--save_dir',
+ type=str,
+ default='checkpoints_seg',
+        help='directory name to save training snapshots')
+ parser.add_argument(
+ '--resume',
+ type=str,
+ default=None,
+ help='path to resume training based on previous checkpoints. '
+ 'None for not resuming any checkpoints.')
+ parser.add_argument(
+ '--log_interval',
+ type=int,
+ default=1,
+ help='mini-batch interval for logging.')
+ parser.add_argument(
+ '--enable_ce',
+ action='store_true',
+ help='The flag indicating whether to run the task '
+ 'for continuous evaluation.')
+ args = parser.parse_args()
+ return args
+
+
+def train():
+ args = parse_args()
+ print_arguments(args)
+ # check whether the installed paddle is compiled with GPU
+ check_gpu(args.use_gpu)
+
+ if not os.path.isdir(args.save_dir):
+ os.makedirs(args.save_dir)
+
+ assert args.model in ['MSG', 'SSG'], \
+ "--model can only be 'MSG' or 'SSG'"
+
+ # build model
+ if args.enable_ce:
+ SEED = 102
+ fluid.default_main_program().random_seed = SEED
+ framework.default_startup_program().random_seed = SEED
+
+ startup = fluid.Program()
+ train_prog = fluid.Program()
+ with fluid.program_guard(train_prog, startup):
+ with fluid.unique_name.guard():
+ train_model = PointNet2SemSegMSG(args.num_classes, args.num_points) \
+ if args.model == "MSG" else \
+ PointNet2SemSegSSG(args.num_classes, args.num_points)
+ train_model.build_model(bn_momentum=args.bn_momentum)
+ train_feeds = train_model.get_feeds()
+ train_pyreader = train_model.get_pyreader()
+ train_outputs = train_model.get_outputs()
+ train_loss = train_outputs['loss']
+ lr = fluid.layers.exponential_decay(
+ learning_rate=args.lr,
+ decay_steps=args.decay_steps,
+ decay_rate=args.lr_decay,
+ staircase=True)
+ lr = fluid.layers.clip(lr, 1e-5, args.lr)
+ optimizer = fluid.optimizer.Adam(learning_rate=lr,
+ regularization=fluid.regularizer.L2Decay(args.weight_decay))
+ optimizer.minimize(train_loss)
+ train_keys, train_values = parse_outputs(train_outputs)
+
+ test_prog = fluid.Program()
+ with fluid.program_guard(test_prog, startup):
+ with fluid.unique_name.guard():
+ test_model = PointNet2SemSegMSG(args.num_classes, args.num_points) \
+ if args.model == "MSG" else \
+ PointNet2SemSegSSG(args.num_classes, args.num_points)
+ test_model.build_model()
+ test_feeds = test_model.get_feeds()
+ test_outputs = test_model.get_outputs()
+ test_pyreader = test_model.get_pyreader()
+ test_prog = test_prog.clone(True)
+ test_keys, test_values = parse_outputs(test_outputs)
+
+ place = fluid.CUDAPlace(0) if args.use_gpu else fluid.CPUPlace()
+ exe = fluid.Executor(place)
+ exe.run(startup)
+
+ if args.resume:
+        assert os.path.exists(args.resume), \
+            "Given resume weight dir {} does not exist.".format(args.resume)
+ def if_exist(var):
+ return os.path.exists(os.path.join(args.resume, var.name))
+ fluid.io.load_vars(
+ exe, args.resume, predicate=if_exist, main_program=train_prog)
+
+ build_strategy = fluid.BuildStrategy()
+ build_strategy.memory_optimize = False
+ build_strategy.enable_inplace = False
+ build_strategy.fuse_all_optimizer_ops = False
+ train_compile_prog = fluid.compiler.CompiledProgram(
+ train_prog).with_data_parallel(loss_name=train_loss.name,
+ build_strategy=build_strategy)
+ test_compile_prog = fluid.compiler.CompiledProgram(test_prog)
+
+ def save_model(exe, prog, path):
+ if os.path.isdir(path):
+ shutil.rmtree(path)
+ logger.info("Save model to {}".format(path))
+ fluid.io.save_persistables(exe, path, prog)
+
+ # get reader
+ indoor_reader = Indoor3DReader(args.data_dir)
+ train_reader = indoor_reader.get_reader(args.batch_size, args.num_points, mode='train')
+ test_reader = indoor_reader.get_reader(args.batch_size, args.num_points, mode='test')
+ train_pyreader.decorate_sample_list_generator(train_reader, place)
+ test_pyreader.decorate_sample_list_generator(test_reader, place)
+
+ train_stat = Stat()
+ test_stat = Stat()
+
+ ce_time = 0
+ ce_loss = []
+
+ for epoch_id in range(args.epoch):
+ try:
+ train_pyreader.start()
+ train_iter = 0
+ train_periods = []
+ while True:
+ cur_time = time.time()
+ train_outs = exe.run(train_compile_prog, fetch_list=train_values + [lr.name])
+ period = time.time() - cur_time
+ train_periods.append(period)
+ train_stat.update(train_keys, train_outs[:-1])
+ if train_iter % args.log_interval == 0:
+ log_str = ""
+ for name, values in zip(train_keys + ['learning_rate'], train_outs):
+ log_str += "{}: {:.5f}, ".format(name, np.mean(values))
+ if name == 'loss':
+ ce_loss.append(np.mean(values))
+ logger.info("[TRAIN] Epoch {}, batch {}: {}time: {:.2f}".format(epoch_id, train_iter, log_str, period))
+ train_iter += 1
+ except fluid.core.EOFException:
+ logger.info("[TRAIN] Epoch {} finished, {}average time: {:.2f}".format(epoch_id, train_stat.get_mean_log(), np.mean(train_periods[1:])))
+ ce_time = np.mean(train_periods[1:])
+ save_model(exe, train_prog, os.path.join(args.save_dir, str(epoch_id)))
+
+ # evaluation
+ if not args.enable_ce:
+ try:
+ test_pyreader.start()
+ test_iter = 0
+ test_periods = []
+ while True:
+ cur_time = time.time()
+ test_outs = exe.run(test_compile_prog, fetch_list=test_values)
+ period = time.time() - cur_time
+ test_periods.append(period)
+ test_stat.update(test_keys, test_outs)
+ if test_iter % args.log_interval == 0:
+ log_str = ""
+ for name, value in zip(test_keys, test_outs):
+ log_str += "{}: {:.4f}, ".format(name, np.mean(value))
+ logger.info("[TEST] Epoch {}, batch {}: {}time: {:.2f}".format(epoch_id, test_iter, log_str, period))
+ test_iter += 1
+ except fluid.core.EOFException:
+ logger.info("[TEST] Epoch {} finished, {}average time: {:.2f}".format(epoch_id, test_stat.get_mean_log(), np.mean(test_periods[1:])))
+ finally:
+ test_pyreader.reset()
+ test_stat.reset()
+ test_periods = []
+
+ finally:
+ train_pyreader.reset()
+ train_stat.reset()
+ train_periods = []
+
+ # only for ce
+ if args.enable_ce:
+ card_num = get_cards()
+ _loss = 0
+ _time = 0
+ try:
+ _time = ce_time
+ _loss = np.mean(ce_loss[1:])
+        except Exception:
+ print("ce info error")
+ print("kpis\ttrain_seg_%s_duration_card%s\t%s" % (args.model, card_num, _time))
+ print("kpis\ttrain_seg_%s_loss_card%s\t%f" % (args.model, card_num, _loss))
+
+def get_cards():
+ num = 0
+ cards = os.environ.get('CUDA_VISIBLE_DEVICES', '')
+ if cards != '':
+ num = len(cards.split(","))
+ return num
+
+if __name__ == "__main__":
+ train()
diff --git a/PaddleCV/Paddle3D/PointNet++/utils.py b/PaddleCV/Paddle3D/PointNet++/utils.py
new file mode 100644
index 0000000000000000000000000000000000000000..2629bebf2869ab8316fe8cada38c26f198ce9dcf
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointNet++/utils.py
@@ -0,0 +1,98 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
+#
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
+"""
+Contains common utility functions.
+"""
+
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
+import sys
+import six
+import logging
+import numpy as np
+import paddle.fluid as fluid
+
+__all__ = ["check_gpu", "print_arguments", "parse_outputs", "Stat"]
+
+logger = logging.getLogger(__name__)
+
+
+def check_gpu(use_gpu):
+ """
+    Log an error and exit when use_gpu is set to True but the installed
+    PaddlePaddle is a CPU-only build.
+ """
+ err = "Config use_gpu cannot be set as True while you are " \
+ "using paddlepaddle cpu version ! \nPlease try: \n" \
+ "\t1. Install paddlepaddle-gpu to run model on GPU \n" \
+ "\t2. Set --use_gpu=False to run model on CPU"
+
+ try:
+ if use_gpu and not fluid.is_compiled_with_cuda():
+ logger.error(err)
+ sys.exit(1)
+    except Exception:
+        pass
+
+
+def print_arguments(args):
+ """Print argparse's arguments.
+
+ Usage:
+
+ .. code-block:: python
+
+ parser = argparse.ArgumentParser()
+ parser.add_argument("name", default="Jonh", type=str, help="User name.")
+ args = parser.parse_args()
+ print_arguments(args)
+
+ :param args: Input argparse.Namespace for printing.
+ :type args: argparse.Namespace
+ """
+ logger.info("----------- Configuration Arguments -----------")
+ for arg, value in sorted(six.iteritems(vars(args))):
+ logger.info("%s: %s" % (arg, value))
+ logger.info("------------------------------------------------")
+
+
+def parse_outputs(outputs):
+ keys, values = [], []
+ for k, v in outputs.items():
+ keys.append(k)
+ v.persistable = True
+ values.append(v.name)
+ return keys, values
+
+
+class Stat(object):
+ def __init__(self):
+ self.stats = {}
+
+ def update(self, keys, values):
+ for k, v in zip(keys, values):
+ if k not in self.stats:
+ self.stats[k] = []
+ self.stats[k].append(v)
+
+ def reset(self):
+ self.stats = {}
+
+ def get_mean_log(self):
+ log = ""
+ for k, v in self.stats.items():
+ log += "avg_{}: {:.4f}, ".format(k, np.mean(v))
+ return log
diff --git a/PaddleCV/Paddle3D/PointRCNN/.gitignore b/PaddleCV/Paddle3D/PointRCNN/.gitignore
new file mode 100644
index 0000000000000000000000000000000000000000..9ea6e75c687e4ac93fa06d18bd0d1444e5d3b054
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/.gitignore
@@ -0,0 +1,14 @@
+*log*
+checkpoints*
+build
+output
+result_dir
+pp_pointrcnn*
+data/gt_database
+utils/pts_utils/dist
+utils/pts_utils/build
+utils/pts_utils/pts_utils.egg-info
+utils/cyops/*.c
+utils/cyops/*.so
+ext_op/src/*.o
+ext_op/src/*.so
diff --git a/PaddleCV/Paddle3D/PointRCNN/README.md b/PaddleCV/Paddle3D/PointRCNN/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..5b2d82920bf702146589879291a2de9ececf1371
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/README.md
@@ -0,0 +1,339 @@
+# PointRCNN 3D Object Detection Model
+
+---
+## Contents
+
+- [Introduction](#introduction)
+- [Quick Start](#quick-start)
+- [References](#references)
+- [Release Notes](#release-notes)
+
+## Introduction
+
+[PointRCNN](https://arxiv.org/abs/1812.04244), proposed by Shaoshuai Shi, Xiaogang Wang, and Hongsheng Li, is the first two-stage 3D object detector that works on raw point clouds only. The first stage uses PointNet++ with MSG (Multi-Scale Grouping) as the backbone to segment the raw point cloud directly into foreground and background points and generates bounding box proposals from the foreground points. The second stage refines and re-scores the generated bounding boxes in a canonical coordinate system. The paper also proposes a bin-based scheme that recasts regression as classification, which proved effective for 3D bounding box regression. PointRCNN is evaluated on the KITTI dataset and achieved the best performance on the KITTI 3D object detection leaderboard at the time of publication.
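+
+As a rough illustration (not part of the released code), the bin-based idea can be sketched as follows: split the localization search range into fixed-width bins, classify which bin the target center falls into, and regress only the residual inside that bin. The sketch below assumes a 1-D offset; `loc_scope` and `bin_size` are named after the `LOC_SCOPE`/`LOC_BIN_SIZE` entries in `cfgs/default.yml`, and `encode_bin_target` is a hypothetical helper, not a function from this repository.
+
+```python
+import numpy as np
+
+def encode_bin_target(offset, loc_scope=3.0, bin_size=0.5):
+    """Turn a continuous center offset into (bin class, in-bin residual)."""
+    # shift into [0, 2 * loc_scope) so every offset maps to a valid bin
+    shifted = np.clip(offset + loc_scope, 0.0, 2.0 * loc_scope - 1e-3)
+    bin_id = np.floor(shifted / bin_size).astype('int64')  # classification target
+    # residual to the bin center, normalized by the bin width (regression target)
+    residual = (shifted - (bin_id * bin_size + bin_size / 2.0)) / bin_size
+    return bin_id, residual
+
+print(encode_bin_target(np.array([-2.9, 0.0, 1.3])))
+```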
+
+The network structure is shown below:
+
+
+
+The PointRCNN object detector for point clouds
+
+
+**Note:** the PointRCNN model depends on custom C++ operators, which currently can only be compiled for GPU devices on Linux/Unix systems; this model **cannot run on Windows systems or CPU devices**
+
+
+## Quick Start
+
+### Installation
+
+**Install [PaddlePaddle](https://github.com/PaddlePaddle/Paddle):**
+
+Running the sample code in this directory requires the PaddlePaddle Fluid [develop daily build](https://www.paddlepaddle.org.cn/install/doc/tables#多版本whl包列表-dev-11) or an installation compiled from the PaddlePaddle [develop branch](https://github.com/PaddlePaddle/Paddle/tree/develop) source.
+
+To keep the custom operators compatible with your Paddle version, we recommend **compiling Paddle from source first**; see [Compile and Install](https://www.paddlepaddle.org.cn/install/doc/source/ubuntu) for instructions.
+
+**Install PointRCNN:**
+
+1. Download the [PaddlePaddle/models](https://github.com/PaddlePaddle/models) repository
+
+Download the Paddle models repository with the following command:
+
+```
+git clone https://github.com/PaddlePaddle/models
+```
+
+2. Download [pybind11](https://github.com/pybind/pybind11) into the `PaddleCV/Paddle3D/PointRCNN` directory
+
+Compiling `pts_utils` depends on `pybind11`, so the `pybind11` repository must be downloaded into the `PaddleCV/Paddle3D/PointRCNN` directory, for example with:
+
+```
+cd PaddleCV/Paddle3D/PointRCNN
+git clone https://github.com/pybind/pybind11
+```
+
+3. Install the Python dependencies
+
+Install the Python dependencies with:
+
+```
+pip install -r requirement.txt
+```
+
+**Note:** the KITTI mAP evaluation tool only works on Python 3.6 and above, and `scikit-image`, `Numba`, `fire`, etc. must be installed in the Python 3 environment.
+The `scikit-image`, `Numba`, and `fire` entries in `requirement.txt` are exactly the dependencies required by the KITTI mAP evaluation tool.
+
+4. Build and install the `pts_utils`, `kitti_utils`, `roipool3d_utils`, and `iou_utils` modules
+
+Build and install the `pts_utils`, `kitti_utils`, `roipool3d_utils`, and `iou_utils` modules with:
+```
+sh build_and_install.sh
+```
+
+### Compiling the Custom Operators
+
+Please make sure your Paddle is the PaddlePaddle Fluid develop daily build or was compiled from the Paddle develop branch source; **installing from source is recommended**.
+
+Compile the custom operators as follows:
+
+ Enter the `ext_op/src` directory and run the build script
+ ```
+ cd ext_op/src
+ sh make.sh
+ ```
+
+ After a successful build, `pointnet_lib.so` is generated in the `ext_op/src` directory
+
+ Run the following to verify that the custom operators compiled correctly:
+
+ ```
+ # add the dynamic library path to LD_LIBRARY_PATH
+ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:`python -c 'import paddle; print(paddle.sysconfig.get_lib())'`
+
+ # go back to the ext_op directory and add it to PYTHONPATH
+ cd ..
+ export PYTHONPATH=$PYTHONPATH:`pwd`
+
+ # run the unit tests
+ python tests/test_farthest_point_sampling_op.py
+ python tests/test_gather_point_op.py
+ python tests/test_group_points_op.py
+ python tests/test_query_ball_op.py
+ python tests/test_three_interp_op.py
+ python tests/test_three_nn_op.py
+ ```
+ A successful unit test run prints a message like:
+
+ ```
+ .
+ ----------------------------------------------------------------------
+ Ran 1 test in 13.205s
+
+ OK
+ ```
+
+**Note:** the custom operators here are compiled in the same way as in [PointNet++](../PointNet++); for more details on compiling them, see [Compiling Custom Operators](../PointNet++/ext_op/README.md)
+
+### Data Preparation
+
+**KITTI 3D object detection dataset:**
+
+PointRCNN is trained on the [KITTI 3D object detection](http://www.cvlibs.net/datasets/kitti/eval_object.php?obj_benchmark=3d)
+dataset.
+
+The dataset can be downloaded as follows:
+
+```
+cd data/KITTI/object
+sh download.sh
+```
+
+The images here are only used for visualization; training uses the [road planes](https://drive.google.com/file/d/1d5mq0RXRnvHPVeKx6Q612z0YRO1t2wAp/view?usp=sharing) data for augmentation during training,
+so please download and extract it into the `./data/KITTI/object/training` directory.
+
+The data directory layout is as follows:
+
+```
+PointRCNN
+├── data
+│ ├── KITTI
+│ │ ├── ImageSets
+│ │ ├── object
+│ │ │ ├──training
+│ │ │ │ ├──calib & velodyne & label_2 & image_2 & planes
+│ │ │ ├──testing
+│ │ │ │ ├──calib & velodyne & image_2
+
+```
+
+
+### Training
+
+**PointRCNN model:**
+
+Training of the PointRCNN model can be launched as follows:
+
+1. Select a single GPU and set the dynamic library path
+
+```
+# train on a single GPU
+export CUDA_VISIBLE_DEVICES=0
+
+# add the dynamic library path to LD_LIBRARY_PATH
+export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:`python -c 'import paddle; print(paddle.sysconfig.get_lib())'`
+```
+
+2. Generate the ground-truth sampling database with:
+
+```
+python tools/generate_gt_database.py --class_name 'Car' --split train
+```
+
+3. Train the RPN model
+
+```
+python train.py --cfg=./cfgs/default.yml \
+                --train_mode=rpn \
+                --batch_size=16 \
+                --epoch=200 \
+                --save_dir=checkpoints
+```
+
+RPN checkpoints are saved to the `checkpoints/rpn` directory by default; the location can be changed with `--save_dir`.
+
+4. Generate augmented offline scene data and save the RPN output features and ROIs for offline RCNN training
+
+Generate the augmented offline scene data with:
+
+```
+python tools/generate_aug_scene.py --class_name 'Car' --split train --aug_times 4
+```
+
+Save the RPN output features and ROIs for the offline augmented data. Use `--ckpt_dir` to point at the final RPN weights, which are saved to the `checkpoints/rpn` directory by default.
+When saving the output features and ROIs, `TEST.SPLIT` must be set to `train_aug`, `TEST.RPN_POST_NMS_TOP_N` to `300`, and `TEST.RPN_NMS_THRESH` to `0.85`.
+Use `--output_dir` to choose where the output features and ROIs are saved; the default is the `./output` directory.
+
+```
+python eval.py --cfg=cfgs/default.yml \
+               --eval_mode=rpn \
+               --ckpt_dir=./checkpoints/rpn/199 \
+               --save_rpn_feature \
+               --output_dir=output \
+               --set TEST.SPLIT train_aug TEST.RPN_POST_NMS_TOP_N 300 TEST.RPN_NMS_THRESH 0.85
+```
+
+The data saved under `--output_dir` is laid out as follows:
+
+```
+output
+├── detections
+│   ├── data                  # saved ROI data
+│ │ ├── 000000.txt
+│ │ ├── 000003.txt
+│ │ ├── ...
+├── features                  # saved output features
+│ ├── 000000_intensity.npy
+│ ├── 000000.npy
+│ ├── 000000_rawscore.npy
+│ ├── 000000_seg.npy
+│ ├── 000000_xyz.npy
+│ ├── ...
+├── seg_result                # saved semantic segmentation results
+│ ├── 000000.npy
+│ ├── 000003.npy
+│ ├── ...
+```
+
+5. Train the RCNN model offline, using `--rcnn_training_roi_dir` and `--rcnn_training_feature_dir` to point at the output features and ROIs saved by the RPN model.
+
+```
+python train.py --cfg=./cfgs/default.yml \
+ --train_mode=rcnn_offline \
+ --batch_size=4 \
+ --epoch=30 \
+ --save_dir=checkpoints \
+ --rcnn_training_roi_dir=output/detections/data \
+ --rcnn_training_feature_dir=output/features \
+ --set TRAIN.SPLIT train_aug
+```
+
+RCNN checkpoints are saved to the `checkpoints/rcnn` directory by default; the location can be changed with `--save_dir`.
+
+**Note**: the best model is obtained by saving the RPN output features and ROIs and training the RCNN model offline on augmented data; currently only this mode is supported by default.
+
+
+### Evaluation
+
+**PointRCNN model:**
+
+Evaluation of the PointRCNN model can be launched as follows:
+
+1. Select a single GPU and set the dynamic library path
+
+```
+# run on a single GPU
+export CUDA_VISIBLE_DEVICES=0
+
+# add the dynamic library path to LD_LIBRARY_PATH
+export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:`python -c 'import paddle; print(paddle.sysconfig.get_lib())'`
+
+```
+
+2. Save the RPN output features and ROIs for the evaluation data
+
+Save the RPN output features and ROIs for the evaluation data with the command below. Use `--ckpt_dir` to point at the final RPN weights, which are saved to the `checkpoints/rpn` directory by default.
+Use `--output_dir` to choose where the output features and ROIs are saved; the default is the `./output` directory.
+
+```
+python eval.py --cfg=cfgs/default.yml \
+               --eval_mode=rpn \
+               --ckpt_dir=./checkpoints/rpn/199 \
+               --save_rpn_feature \
+               --output_dir=output/val
+```
+
+The directory layout of the saved output features and ROIs for the evaluation data matches the layout described above for the offline augmented data.
+
+3. Evaluate the offline RCNN model
+
+Evaluate the offline RCNN model with:
+
+```
+python eval.py --cfg=cfgs/default.yml \
+               --eval_mode=rcnn_offline \
+               --ckpt_dir=./checkpoints/rcnn_offline/29 \
+               --rcnn_eval_roi_dir=output/val/detections/data \
+               --rcnn_eval_feature_dir=output/val/features \
+               --save_result
+```
+
+The final detection results are saved in the `final_result` folder under the `./result_dir` directory; passing `--save_result` additionally saves the `roi_output` and `refine_output` result files.
+The `result_dir` layout is as follows:
+
+```
+result_dir
+├── final_result
+│   ├── data              # final detection results
+│ │ ├── 000001.txt
+│ │ ├── 000002.txt
+│ │ ├── ...
+├── roi_output
+│   ├── data              # detection ROIs output by the RCNN model
+│ │ ├── 000001.txt
+│ │ ├── 000002.txt
+│ │ ├── ...
+├── refine_output
+│   ├── data              # decoded and refined detection results
+│ │ ├── 000001.txt
+│ │ ├── 000002.txt
+│ │ ├── ...
+```
+
+4. Compute the evaluation metrics with the KITTI mAP tool
+
+If the Python version used during evaluation is 3.6 or later, the KITTI mAP evaluation runs automatically. If it is older than 3.6,
+since the KITTI mAP tool only supports Python 3.6 and above, run the evaluation with a suitable Python version as follows:
+
+```
+python3 tools/kitti_eval.py
+```
+
+Evaluation results with the final trained weights ([RPN model](https://paddlemodels.bj.bcebos.com/Paddle3D/pointrcnn_rpn.tar) and [RCNN model](https://paddlemodels.bj.bcebos.com/Paddle3D/pointrcnn_rcnn_offline.tar)) are shown below:
+
+| Car AP@ | 0.70(easy) | 0.70(moderate) | 0.70(hard) |
+| :------- | :--------: | :------------: | :--------: |
+| bbox AP: | 90.20 | 88.85 | 88.59 |
+| bev AP: | 89.50 | 86.97 | 85.58 |
+| 3d AP: | 86.66 | 76.65 | 75.90 |
+| aos AP: | 90.10 | 88.64 | 88.26 |
+
+
+## 参考文献
+
+- [PointRCNN: 3D Object Proposal Generation and Detection From Point Cloud](https://arxiv.org/abs/1812.04244), Shaoshuai Shi, Xiaogang Wang, Hongsheng Li.
+- [PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space](https://arxiv.org/abs/1706.02413), Charles R. Qi, Li Yi, Hao Su, Leonidas J. Guibas.
+- [PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation](https://www.semanticscholar.org/paper/PointNet%3A-Deep-Learning-on-Point-Sets-for-3D-and-Qi-Su/d997beefc0922d97202789d2ac307c55c2c52fba), Charles Ruizhongtai Qi, Hao Su, Kaichun Mo, Leonidas J. Guibas.
+
+## Version updates
+
+- 11/2019: Added the PointRCNN model.
+
diff --git a/PaddleCV/Paddle3D/PointRCNN/build_and_install.sh b/PaddleCV/Paddle3D/PointRCNN/build_and_install.sh
new file mode 100644
index 0000000000000000000000000000000000000000..83aaef84704445cf9c7bf3e87cc453e0daa708cd
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/build_and_install.sh
@@ -0,0 +1,7 @@
+# compile cyops
+python utils/cyops/setup.py develop
+
+# compile and install pts_utils
+cd utils/pts_utils
+python setup.py install
+cd ../..
diff --git a/PaddleCV/Paddle3D/PointRCNN/cfgs/default.yml b/PaddleCV/Paddle3D/PointRCNN/cfgs/default.yml
new file mode 100644
index 0000000000000000000000000000000000000000..33dc45086ca48128174fc341e7f9fdee9374d53e
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/cfgs/default.yml
@@ -0,0 +1,167 @@
+# This config is based on https://github.com/sshaoshuai/PointRCNN/blob/master/tools/cfgs/default.yaml
+CLASSES: Car
+
+INCLUDE_SIMILAR_TYPE: True
+
+# config of augmentation
+AUG_DATA: True
+AUG_METHOD_LIST: ['rotation', 'scaling', 'flip']
+AUG_METHOD_PROB: [1.0, 1.0, 0.5]
+AUG_ROT_RANGE: 18
+
+GT_AUG_ENABLED: True
+GT_EXTRA_NUM: 15
+GT_AUG_RAND_NUM: True
+GT_AUG_APPLY_PROB: 1.0
+GT_AUG_HARD_RATIO: 0.6
+
+PC_REDUCE_BY_RANGE: True
+PC_AREA_SCOPE: [[-40, 40], [-1, 3], [0, 70.4]] # x, y, z scope in rect camera coords
+CLS_MEAN_SIZE: [[1.52563191462, 1.62856739989, 3.88311640418]]
+
+
+# 1. config of rpn network
+RPN:
+ ENABLED: True
+ FIXED: False
+
+ # config of input
+ USE_INTENSITY: False
+
+ # config of bin-based loss
+ LOC_XZ_FINE: True
+ LOC_SCOPE: 3.0
+ LOC_BIN_SIZE: 0.5
+ NUM_HEAD_BIN: 12
+
+ # config of network structure
+ BACKBONE: pointnet2_msg
+ USE_BN: True
+ NUM_POINTS: 16384
+
+ SA_CONFIG:
+ NPOINTS: [4096, 1024, 256, 64]
+ RADIUS: [[0.1, 0.5], [0.5, 1.0], [1.0, 2.0], [2.0, 4.0]]
+ NSAMPLE: [[16, 32], [16, 32], [16, 32], [16, 32]]
+ MLPS: [[[16, 16, 32], [32, 32, 64]],
+ [[64, 64, 128], [64, 96, 128]],
+ [[128, 196, 256], [128, 196, 256]],
+ [[256, 256, 512], [256, 384, 512]]]
+ FP_MLPS: [[128, 128], [256, 256], [512, 512], [512, 512]]
+ CLS_FC: [128]
+ REG_FC: [128]
+ DP_RATIO: 0.5
+
+ # config of training
+ LOSS_CLS: SigmoidFocalLoss
+ FG_WEIGHT: 15
+ FOCAL_ALPHA: [0.25, 0.75]
+ FOCAL_GAMMA: 2.0
+ REG_LOSS_WEIGHT: [1.0, 1.0, 1.0, 1.0]
+ LOSS_WEIGHT: [1.0, 1.0]
+ NMS_TYPE: normal
+
+ # config of testing
+ SCORE_THRESH: 0.3
+
+# 2. config of rcnn network
+RCNN:
+ ENABLED: True
+
+ # config of input
+ ROI_SAMPLE_JIT: False
+ REG_AUG_METHOD: multiple # multiple, single, normal
+ ROI_FG_AUG_TIMES: 10
+
+ USE_RPN_FEATURES: True
+ USE_MASK: True
+ MASK_TYPE: seg
+ USE_INTENSITY: False
+ USE_DEPTH: True
+ USE_SEG_SCORE: False
+
+ POOL_EXTRA_WIDTH: 1.0
+
+ # config of bin-based loss
+ LOC_SCOPE: 1.5
+ LOC_BIN_SIZE: 0.5
+ NUM_HEAD_BIN: 9
+ LOC_Y_BY_BIN: False
+ LOC_Y_SCOPE: 0.5
+ LOC_Y_BIN_SIZE: 0.25
+ SIZE_RES_ON_ROI: False
+
+ # config of network structure
+ USE_BN: False
+ DP_RATIO: 0.0
+
+ BACKBONE: pointnet # pointnet
+ XYZ_UP_LAYER: [128, 128]
+
+ NUM_POINTS: 512
+ SA_CONFIG:
+ NPOINTS: [128, 32, -1]
+ RADIUS: [0.2, 0.4, 100]
+ NSAMPLE: [64, 64, 64]
+ MLPS: [[128, 128, 128],
+ [128, 128, 256],
+ [256, 256, 512]]
+ CLS_FC: [256, 256]
+ REG_FC: [256, 256]
+
+ # config of training
+ LOSS_CLS: BinaryCrossEntropy
+ FOCAL_ALPHA: [0.25, 0.75]
+ FOCAL_GAMMA: 2.0
+ CLS_WEIGHT: [1.0, 1.0, 1.0]
+ CLS_FG_THRESH: 0.6
+ CLS_BG_THRESH: 0.45
+ CLS_BG_THRESH_LO: 0.05
+ REG_FG_THRESH: 0.55
+ FG_RATIO: 0.5
+ ROI_PER_IMAGE: 64
+ HARD_BG_RATIO: 0.8
+
+ # config of testing
+ SCORE_THRESH: 0.3
+ NMS_THRESH: 0.1
+
+# general training config
+TRAIN:
+ SPLIT: train
+ VAL_SPLIT: smallval
+
+ LR: 0.002
+ LR_CLIP: 0.00001
+ LR_DECAY: 0.5
+ DECAY_STEP_LIST: [100, 150, 180, 200]
+ LR_WARMUP: True
+ WARMUP_MIN: 0.0002
+ WARMUP_EPOCH: 1
+
+ BN_MOMENTUM: 0.1
+ BN_DECAY: 0.5
+ BNM_CLIP: 0.01
+ BN_DECAY_STEP_LIST: [1000]
+
+ OPTIMIZER: adam # adam, adam_onecycle
+ WEIGHT_DECAY: 0.001 # L2 regularization
+ MOMENTUM: 0.9
+
+ MOMS: [0.95, 0.85]
+ DIV_FACTOR: 10.0
+ PCT_START: 0.4
+
+ GRAD_NORM_CLIP: 1.0
+
+ RPN_PRE_NMS_TOP_N: 9000
+ RPN_POST_NMS_TOP_N: 512
+ RPN_NMS_THRESH: 0.85
+ RPN_DISTANCE_BASED_PROPOSE: True
+
+TEST:
+ SPLIT: val
+ RPN_PRE_NMS_TOP_N: 9000
+ RPN_POST_NMS_TOP_N: 100
+ RPN_NMS_THRESH: 0.8
+ RPN_DISTANCE_BASED_PROPOSE: True
diff --git a/PaddleCV/Paddle3D/PointRCNN/data/KITTI/object/download.sh b/PaddleCV/Paddle3D/PointRCNN/data/KITTI/object/download.sh
new file mode 100644
index 0000000000000000000000000000000000000000..1f5818d38323c5cc7349022ba82d2a55315a59a7
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/data/KITTI/object/download.sh
@@ -0,0 +1,25 @@
+DIR="$( cd "$(dirname "$0")" ; pwd -P )"
+cd "$DIR"
+
+echo "Downloading https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_velodyne.zip"
+wget https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_velodyne.zip
+echo "https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_image_2.zip"
+wget https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_image_2.zip
+echo "https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_calib.zip"
+wget https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_calib.zip
+echo "https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_label_2.zip"
+wget https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_label_2.zip
+
+echo "Decompressing data_object_velodyne.zip"
+unzip data_object_velodyne.zip
+echo "Decompressing data_object_image_2.zip"
+unzip "data_object_image_2.zip"
+echo "Decompressing data_object_calib.zip"
+unzip data_object_calib.zip
+echo "Decompressing data_object_label_2.zip"
+unzip data_object_label_2.zip
+
+echo "Download KITTI ImageSets"
+wget https://paddlemodels.bj.bcebos.com/Paddle3D/pointrcnn_kitti_imagesets.tar
+tar xf pointrcnn_kitti_imagesets.tar
+mv ImageSets ..
diff --git a/PaddleCV/Paddle3D/PointRCNN/data/__init__.py b/PaddleCV/Paddle3D/PointRCNN/data/__init__.py
new file mode 100644
index 0000000000000000000000000000000000000000..46a4f6ee220f10f50a182f4a2ed510b0551f64a8
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/data/__init__.py
@@ -0,0 +1,13 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
+#
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
diff --git a/PaddleCV/Paddle3D/PointRCNN/data/kitti_dataset.py b/PaddleCV/Paddle3D/PointRCNN/data/kitti_dataset.py
new file mode 100644
index 0000000000000000000000000000000000000000..0765a5045f6e330646fde26fe391eb313d022124
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/data/kitti_dataset.py
@@ -0,0 +1,77 @@
+"""
+This code is based on https://github.com/sshaoshuai/PointRCNN/blob/master/lib/datasets/kitti_dataset.py
+"""
+
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
+import os
+import cv2
+import numpy as np
+import utils.calibration as calibration
+from utils.object3d import get_objects_from_label
+from PIL import Image
+
+__all__ = ["KittiDataset"]
+
+
+class KittiDataset(object):
+ def __init__(self, data_dir, split='train'):
+ assert split in ['train', 'train_aug', 'val', 'test'], "unknown split {}".format(split)
+ self.split = split
+ self.is_test = self.split == 'test'
+ self.imageset_dir = os.path.join(data_dir, 'KITTI', 'object', 'testing' if self.is_test else 'training')
+
+ split_dir = os.path.join(data_dir, 'KITTI', 'ImageSets', split + '.txt')
+        with open(split_dir, 'r') as f:
+            self.image_idx_list = [x.strip() for x in f.readlines()]
+        self.num_sample = len(self.image_idx_list)
+
+ self.image_dir = os.path.join(self.imageset_dir, 'image_2')
+ self.lidar_dir = os.path.join(self.imageset_dir, 'velodyne')
+ self.calib_dir = os.path.join(self.imageset_dir, 'calib')
+ self.label_dir = os.path.join(self.imageset_dir, 'label_2')
+ self.plane_dir = os.path.join(self.imageset_dir, 'planes')
+
+ def get_image(self, idx):
+ img_file = os.path.join(self.image_dir, '%06d.png' % idx)
+ assert os.path.exists(img_file)
+ return cv2.imread(img_file) # (H, W, 3) BGR mode
+
+ def get_image_shape(self, idx):
+ img_file = os.path.join(self.image_dir, '%06d.png' % idx)
+ assert os.path.exists(img_file)
+ im = Image.open(img_file)
+ width, height = im.size
+ return height, width, 3
+
+ def get_lidar(self, idx):
+ lidar_file = os.path.join(self.lidar_dir, '%06d.bin' % idx)
+ assert os.path.exists(lidar_file)
+ return np.fromfile(lidar_file, dtype=np.float32).reshape(-1, 4)
+
+ def get_calib(self, idx):
+ calib_file = os.path.join(self.calib_dir, '%06d.txt' % idx)
+ assert os.path.exists(calib_file)
+ return calibration.Calibration(calib_file)
+
+ def get_label(self, idx):
+ label_file = os.path.join(self.label_dir, '%06d.txt' % idx)
+ assert os.path.exists(label_file)
+ return get_objects_from_label(label_file)
+
+ def get_road_plane(self, idx):
+ plane_file = os.path.join(self.plane_dir, '%06d.txt' % idx)
+ with open(plane_file, 'r') as f:
+ lines = f.readlines()
+ lines = [float(i) for i in lines[3].split()]
+ plane = np.asarray(lines)
+
+ # Ensure normal is always facing up, this is in the rectified camera coordinate
+ if plane[1] > 0:
+ plane = -plane
+
+ norm = np.linalg.norm(plane[0:3])
+ plane = plane / norm
+ return plane
diff --git a/PaddleCV/Paddle3D/PointRCNN/data/kitti_rcnn_reader.py b/PaddleCV/Paddle3D/PointRCNN/data/kitti_rcnn_reader.py
new file mode 100644
index 0000000000000000000000000000000000000000..57367d2c6ff5abd3c21a15cbe7c0a90ba9e64e62
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/data/kitti_rcnn_reader.py
@@ -0,0 +1,1193 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
+#
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
+"""
+This code is based on https://github.com/sshaoshuai/PointRCNN/blob/master/lib/datasets/kitti_rcnn_dataset.py
+"""
+
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
+import os
+import signal
+import logging
+import multiprocessing
+import numpy as np
+import scipy
+from scipy.spatial import Delaunay
+try:
+ import cPickle as pickle
+except:
+ import pickle
+
+import pts_utils
+import utils.cyops.kitti_utils as kitti_utils
+import utils.cyops.roipool3d_utils as roipool3d_utils
+from data.kitti_dataset import KittiDataset
+from utils.config import cfg
+from collections import OrderedDict
+
+__all__ = ["KittiRCNNReader"]
+
+logger = logging.getLogger(__name__)
+
+
+def has_empty(data):
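+    # returns True if any field of the sample is an empty ndarray; such
+    # samples cannot be batched and are skipped by the readers below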
+ for d in data:
+ if isinstance(d, np.ndarray) and len(d) == 0:
+ return True
+ return False
+
+
+def in_hull(p, hull):
+ """
+ :param p: (N, K) test points
+ :param hull: (M, K) M corners of a box
+ :return (N) bool
+ """
+ try:
+ if not isinstance(hull, Delaunay):
+ hull = Delaunay(hull)
+ flag = hull.find_simplex(p) >= 0
+ except scipy.spatial.qhull.QhullError:
+ logger.debug('Warning: not a hull.')
+ flag = np.zeros(p.shape[0], dtype=np.bool)
+
+ return flag
+
+
+class KittiRCNNReader(KittiDataset):
+ def __init__(self, data_dir, npoints=16384, split='train', classes='Car', mode='TRAIN',
+ random_select=True, rcnn_training_roi_dir=None, rcnn_training_feature_dir=None,
+ rcnn_eval_roi_dir=None, rcnn_eval_feature_dir=None, gt_database_dir=None):
+ super(KittiRCNNReader, self).__init__(data_dir=data_dir, split=split)
+ if classes == 'Car':
+ self.classes = ('Background', 'Car')
+ aug_scene_data_dir = os.path.join(data_dir, 'KITTI', 'aug_scene')
+ elif classes == 'People':
+ self.classes = ('Background', 'Pedestrian', 'Cyclist')
+ elif classes == 'Pedestrian':
+ self.classes = ('Background', 'Pedestrian')
+ aug_scene_data_dir = os.path.join(data_dir, 'KITTI', 'aug_scene_ped')
+ elif classes == 'Cyclist':
+ self.classes = ('Background', 'Cyclist')
+ aug_scene_data_dir = os.path.join(data_dir, 'KITTI', 'aug_scene_cyclist')
+ else:
+ assert False, "Invalid classes: %s" % classes
+
+ self.num_classes = len(self.classes)
+
+ self.npoints = npoints
+ self.sample_id_list = []
+ self.random_select = random_select
+
+        # 'train_aug' and the other splits currently read augmented labels
+        # and points from the same directories
+        self.aug_label_dir = os.path.join(aug_scene_data_dir, 'training', 'aug_label')
+        self.aug_pts_dir = os.path.join(aug_scene_data_dir, 'training', 'rectified_data')
+
+ # for rcnn training
+ self.rcnn_training_bbox_list = []
+ self.rpn_feature_list = {}
+ self.pos_bbox_list = []
+ self.neg_bbox_list = []
+ self.far_neg_bbox_list = []
+ self.rcnn_eval_roi_dir = rcnn_eval_roi_dir
+ self.rcnn_eval_feature_dir = rcnn_eval_feature_dir
+ self.rcnn_training_roi_dir = rcnn_training_roi_dir
+ self.rcnn_training_feature_dir = rcnn_training_feature_dir
+
+ self.gt_database = None
+
+ if not self.random_select:
+ logger.warning('random select is False')
+
+ assert mode in ['TRAIN', 'EVAL', 'TEST'], 'Invalid mode: %s' % mode
+ self.mode = mode
+
+ if cfg.RPN.ENABLED:
+ if gt_database_dir is not None:
+ self.gt_database = pickle.load(open(gt_database_dir, 'rb'))
+
+ if cfg.GT_AUG_HARD_RATIO > 0:
+ easy_list, hard_list = [], []
+ for k in range(self.gt_database.__len__()):
+ obj = self.gt_database[k]
+ if obj['points'].shape[0] > 100:
+ easy_list.append(obj)
+ else:
+ hard_list.append(obj)
+ self.gt_database = [easy_list, hard_list]
+ logger.info('Loading gt_database(easy(pt_num>100): %d, hard(pt_num<=100): %d) from %s'
+ % (len(easy_list), len(hard_list), gt_database_dir))
+ else:
+ logger.info('Loading gt_database(%d) from %s' % (len(self.gt_database), gt_database_dir))
+
+ if mode == 'TRAIN':
+ self.preprocess_rpn_training_data()
+ else:
+ self.sample_id_list = [int(sample_id) for sample_id in self.image_idx_list]
+ logger.info('Load testing samples from %s' % self.imageset_dir)
+ logger.info('Done: total test samples %d' % len(self.sample_id_list))
+ elif cfg.RCNN.ENABLED:
+ for idx in range(0, self.num_sample):
+ sample_id = int(self.image_idx_list[idx])
+ obj_list = self.filtrate_objects(self.get_label(sample_id))
+ if len(obj_list) == 0:
+ # logger.info('No gt classes: %06d' % sample_id)
+ continue
+ self.sample_id_list.append(sample_id)
+
+ logger.info('Done: filter %s results for rcnn training: %d / %d\n' %
+ (self.mode, len(self.sample_id_list), len(self.image_idx_list)))
+
+ def preprocess_rpn_training_data(self):
+ """
+ Discard samples which don't have current classes, which will not be used for training.
+ Valid sample_id is stored in self.sample_id_list
+ """
+ logger.info('Loading %s samples from %s ...' % (self.mode, self.label_dir))
+ for idx in range(0, self.num_sample):
+ sample_id = int(self.image_idx_list[idx])
+ obj_list = self.filtrate_objects(self.get_label(sample_id))
+ if len(obj_list) == 0:
+ logger.debug('No gt classes: %06d' % sample_id)
+ continue
+ self.sample_id_list.append(sample_id)
+
+ logger.info('Done: filter %s results: %d / %d\n' % (self.mode, len(self.sample_id_list),
+ len(self.image_idx_list)))
+
+ def get_label(self, idx):
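+        # sample ids below 10000 are original KITTI frames; larger ids refer
+        # to offline-augmented scenes whose labels live in aug_label_dir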
+ if idx < 10000:
+ label_file = os.path.join(self.label_dir, '%06d.txt' % idx)
+ else:
+ label_file = os.path.join(self.aug_label_dir, '%06d.txt' % idx)
+
+ assert os.path.exists(label_file)
+ return kitti_utils.get_objects_from_label(label_file)
+
+ def get_image(self, idx):
+ return super(KittiRCNNReader, self).get_image(idx % 10000)
+
+ def get_image_shape(self, idx):
+ return super(KittiRCNNReader, self).get_image_shape(idx % 10000)
+
+ def get_calib(self, idx):
+ return super(KittiRCNNReader, self).get_calib(idx % 10000)
+
+ def get_road_plane(self, idx):
+ return super(KittiRCNNReader, self).get_road_plane(idx % 10000)
+
+ @staticmethod
+ def get_rpn_features(rpn_feature_dir, idx):
+ rpn_feature_file = os.path.join(rpn_feature_dir, '%06d.npy' % idx)
+ rpn_xyz_file = os.path.join(rpn_feature_dir, '%06d_xyz.npy' % idx)
+ rpn_intensity_file = os.path.join(rpn_feature_dir, '%06d_intensity.npy' % idx)
+ if cfg.RCNN.USE_SEG_SCORE:
+ rpn_seg_file = os.path.join(rpn_feature_dir, '%06d_rawscore.npy' % idx)
+ rpn_seg_score = np.load(rpn_seg_file).reshape(-1)
+            rpn_seg_score = 1.0 / (1.0 + np.exp(-rpn_seg_score))  # sigmoid; avoids the undefined torch dependency
+ else:
+ rpn_seg_file = os.path.join(rpn_feature_dir, '%06d_seg.npy' % idx)
+ rpn_seg_score = np.load(rpn_seg_file).reshape(-1)
+ return np.load(rpn_xyz_file), np.load(rpn_feature_file), np.load(rpn_intensity_file).reshape(-1), rpn_seg_score
+
+ def filtrate_objects(self, obj_list):
+ """
+ Discard objects which are not in self.classes (or its similar classes)
+ :param obj_list: list
+ :return: list
+ """
+ type_whitelist = self.classes
+ if self.mode == 'TRAIN' and cfg.INCLUDE_SIMILAR_TYPE:
+ type_whitelist = list(self.classes)
+ if 'Car' in self.classes:
+ type_whitelist.append('Van')
+ if 'Pedestrian' in self.classes: # or 'Cyclist' in self.classes:
+ type_whitelist.append('Person_sitting')
+
+ valid_obj_list = []
+ for obj in obj_list:
+ if obj.cls_type not in type_whitelist: # rm Van, 20180928
+ continue
+ if self.mode == 'TRAIN' and cfg.PC_REDUCE_BY_RANGE and (self.check_pc_range(obj.pos) is False):
+ continue
+ valid_obj_list.append(obj)
+ return valid_obj_list
+
+ @staticmethod
+ def filtrate_dc_objects(obj_list):
+ valid_obj_list = []
+ for obj in obj_list:
+ if obj.cls_type in ['DontCare']:
+ continue
+ valid_obj_list.append(obj)
+
+ return valid_obj_list
+
+ @staticmethod
+ def check_pc_range(xyz):
+ """
+ :param xyz: [x, y, z]
+ :return:
+ """
+ x_range, y_range, z_range = cfg.PC_AREA_SCOPE
+ if (x_range[0] <= xyz[0] <= x_range[1]) and (y_range[0] <= xyz[1] <= y_range[1]) and \
+ (z_range[0] <= xyz[2] <= z_range[1]):
+ return True
+ return False
+
+ @staticmethod
+ def get_valid_flag(pts_rect, pts_img, pts_rect_depth, img_shape):
+ """
+ Valid point should be in the image (and in the PC_AREA_SCOPE)
+ :param pts_rect:
+ :param pts_img:
+ :param pts_rect_depth:
+ :param img_shape:
+ :return:
+ """
+ val_flag_1 = np.logical_and(pts_img[:, 0] >= 0, pts_img[:, 0] < img_shape[1])
+ val_flag_2 = np.logical_and(pts_img[:, 1] >= 0, pts_img[:, 1] < img_shape[0])
+ val_flag_merge = np.logical_and(val_flag_1, val_flag_2)
+ pts_valid_flag = np.logical_and(val_flag_merge, pts_rect_depth >= 0)
+
+ if cfg.PC_REDUCE_BY_RANGE:
+ x_range, y_range, z_range = cfg.PC_AREA_SCOPE
+ pts_x, pts_y, pts_z = pts_rect[:, 0], pts_rect[:, 1], pts_rect[:, 2]
+ range_flag = (pts_x >= x_range[0]) & (pts_x <= x_range[1]) \
+ & (pts_y >= y_range[0]) & (pts_y <= y_range[1]) \
+ & (pts_z >= z_range[0]) & (pts_z <= z_range[1])
+ pts_valid_flag = pts_valid_flag & range_flag
+ return pts_valid_flag
+
+ def get_rpn_sample(self, index):
+ sample_id = int(self.sample_id_list[index])
+ if sample_id < 10000:
+ calib = self.get_calib(sample_id)
+ # img = self.get_image(sample_id)
+ img_shape = self.get_image_shape(sample_id)
+ pts_lidar = self.get_lidar(sample_id)
+
+ # get valid point (projected points should be in image)
+ pts_rect = calib.lidar_to_rect(pts_lidar[:, 0:3])
+ pts_intensity = pts_lidar[:, 3]
+ else:
+ calib = self.get_calib(sample_id % 10000)
+ # img = self.get_image(sample_id % 10000)
+ img_shape = self.get_image_shape(sample_id % 10000)
+
+ pts_file = os.path.join(self.aug_pts_dir, '%06d.bin' % sample_id)
+ assert os.path.exists(pts_file), '%s' % pts_file
+ aug_pts = np.fromfile(pts_file, dtype=np.float32).reshape(-1, 4)
+ pts_rect, pts_intensity = aug_pts[:, 0:3], aug_pts[:, 3]
+
+ pts_img, pts_rect_depth = calib.rect_to_img(pts_rect)
+ pts_valid_flag = self.get_valid_flag(pts_rect, pts_img, pts_rect_depth, img_shape)
+
+ pts_rect = pts_rect[pts_valid_flag][:, 0:3]
+ pts_intensity = pts_intensity[pts_valid_flag]
+
+ if cfg.GT_AUG_ENABLED and self.mode == 'TRAIN':
+ # all labels for checking overlapping
+ all_gt_obj_list = self.filtrate_dc_objects(self.get_label(sample_id))
+ all_gt_boxes3d = kitti_utils.objs_to_boxes3d(all_gt_obj_list)
+
+ gt_aug_flag = False
+ if np.random.rand() < cfg.GT_AUG_APPLY_PROB:
+ # augment one scene
+ gt_aug_flag, pts_rect, pts_intensity, extra_gt_boxes3d, extra_gt_obj_list = \
+ self.apply_gt_aug_to_one_scene(sample_id, pts_rect, pts_intensity, all_gt_boxes3d)
+
+ # generate inputs
+ if self.mode == 'TRAIN' or self.random_select:
+ if self.npoints < len(pts_rect):
+ pts_depth = pts_rect[:, 2]
+ pts_near_flag = pts_depth < 40.0
+ far_idxs_choice = np.where(pts_near_flag == 0)[0]
+ near_idxs = np.where(pts_near_flag == 1)[0]
+ near_idxs_choice = np.random.choice(near_idxs, self.npoints - len(far_idxs_choice), replace=False)
+
+ choice = np.concatenate((near_idxs_choice, far_idxs_choice), axis=0) \
+ if len(far_idxs_choice) > 0 else near_idxs_choice
+ np.random.shuffle(choice)
+ else:
+ choice = np.arange(0, len(pts_rect), dtype=np.int32)
+ if self.npoints > len(pts_rect):
+ extra_choice = np.random.choice(choice, self.npoints - len(pts_rect), replace=False)
+ choice = np.concatenate((choice, extra_choice), axis=0)
+ np.random.shuffle(choice)
+
+ ret_pts_rect = pts_rect[choice, :]
+ ret_pts_intensity = pts_intensity[choice] - 0.5 # translate intensity to [-0.5, 0.5]
+ else:
+ ret_pts_rect = np.zeros((self.npoints, pts_rect.shape[1])).astype(pts_rect.dtype)
+ num_ = min(self.npoints, pts_rect.shape[0])
+ ret_pts_rect[:num_] = pts_rect[:num_]
+
+ ret_pts_intensity = pts_intensity - 0.5
+
+ pts_features = [ret_pts_intensity.reshape(-1, 1)]
+ ret_pts_features = np.concatenate(pts_features, axis=1) if pts_features.__len__() > 1 else pts_features[0]
+
+ sample_info = {'sample_id': sample_id, 'random_select': self.random_select}
+
+ if self.mode == 'TEST':
+ if cfg.RPN.USE_INTENSITY:
+ pts_input = np.concatenate((ret_pts_rect, ret_pts_features), axis=1) # (N, C)
+ else:
+ pts_input = ret_pts_rect
+ sample_info['pts_input'] = pts_input
+ sample_info['pts_rect'] = ret_pts_rect
+ sample_info['pts_features'] = ret_pts_features
+ return sample_info
+
+ gt_obj_list = self.filtrate_objects(self.get_label(sample_id))
+ if cfg.GT_AUG_ENABLED and self.mode == 'TRAIN' and gt_aug_flag:
+ gt_obj_list.extend(extra_gt_obj_list)
+ gt_boxes3d = kitti_utils.objs_to_boxes3d(gt_obj_list)
+
+ gt_alpha = np.zeros((gt_obj_list.__len__()), dtype=np.float32)
+ for k, obj in enumerate(gt_obj_list):
+ gt_alpha[k] = obj.alpha
+
+ # data augmentation
+ aug_pts_rect = ret_pts_rect.copy()
+ aug_gt_boxes3d = gt_boxes3d.copy()
+ if cfg.AUG_DATA and self.mode == 'TRAIN':
+ aug_pts_rect, aug_gt_boxes3d, aug_method = self.data_augmentation(aug_pts_rect, aug_gt_boxes3d, gt_alpha,
+ sample_id)
+ sample_info['aug_method'] = aug_method
+
+ # prepare input
+ if cfg.RPN.USE_INTENSITY:
+ pts_input = np.concatenate((aug_pts_rect, ret_pts_features), axis=1) # (N, C)
+ else:
+ pts_input = aug_pts_rect
+
+ if cfg.RPN.FIXED:
+ sample_info['pts_input'] = pts_input
+ sample_info['pts_rect'] = aug_pts_rect
+ sample_info['pts_features'] = ret_pts_features
+ sample_info['gt_boxes3d'] = aug_gt_boxes3d
+ return sample_info
+
+ if self.mode == 'EVAL' and aug_gt_boxes3d.shape[0] == 0:
+ aug_gt_boxes3d = np.zeros((1, aug_gt_boxes3d.shape[1]))
+
+ # generate training labels
+ rpn_cls_label, rpn_reg_label = self.generate_rpn_training_labels(aug_pts_rect, aug_gt_boxes3d)
+ sample_info['pts_input'] = pts_input
+ sample_info['pts_rect'] = aug_pts_rect
+ sample_info['pts_features'] = ret_pts_features
+ sample_info['rpn_cls_label'] = rpn_cls_label
+ sample_info['rpn_reg_label'] = rpn_reg_label
+ sample_info['gt_boxes3d'] = aug_gt_boxes3d
+ return sample_info
+
+ def apply_gt_aug_to_one_scene(self, sample_id, pts_rect, pts_intensity, all_gt_boxes3d):
+ """
+ :param pts_rect: (N, 3)
+        :param all_gt_boxes3d: (M2, 7)
+ :return:
+ """
+ assert self.gt_database is not None
+ # extra_gt_num = np.random.randint(10, 15)
+ # try_times = 50
+ if cfg.GT_AUG_RAND_NUM:
+ extra_gt_num = np.random.randint(10, cfg.GT_EXTRA_NUM)
+ else:
+ extra_gt_num = cfg.GT_EXTRA_NUM
+ try_times = 100
+ cnt = 0
+ cur_gt_boxes3d = all_gt_boxes3d.copy()
+ cur_gt_boxes3d[:, 4] += 0.5 # TODO: consider different objects
+ cur_gt_boxes3d[:, 5] += 0.5 # enlarge new added box to avoid too nearby boxes
+ cur_gt_corners = kitti_utils.boxes3d_to_corners3d(cur_gt_boxes3d)
+
+ extra_gt_obj_list = []
+ extra_gt_boxes3d_list = []
+ new_pts_list, new_pts_intensity_list = [], []
+ src_pts_flag = np.ones(pts_rect.shape[0], dtype=np.int32)
+
+ road_plane = self.get_road_plane(sample_id)
+ a, b, c, d = road_plane
+
+ while try_times > 0:
+ if cnt > extra_gt_num:
+ break
+
+ try_times -= 1
+ if cfg.GT_AUG_HARD_RATIO > 0:
+ p = np.random.rand()
+ if p > cfg.GT_AUG_HARD_RATIO:
+ # use easy sample
+ rand_idx = np.random.randint(0, len(self.gt_database[0]))
+ new_gt_dict = self.gt_database[0][rand_idx]
+ else:
+ # use hard sample
+ rand_idx = np.random.randint(0, len(self.gt_database[1]))
+ new_gt_dict = self.gt_database[1][rand_idx]
+ else:
+ rand_idx = np.random.randint(0, self.gt_database.__len__())
+ new_gt_dict = self.gt_database[rand_idx]
+
+ new_gt_box3d = new_gt_dict['gt_box3d'].copy()
+ new_gt_points = new_gt_dict['points'].copy()
+ new_gt_intensity = new_gt_dict['intensity'].copy()
+ new_gt_obj = new_gt_dict['obj']
+ center = new_gt_box3d[0:3]
+ if cfg.PC_REDUCE_BY_RANGE and (self.check_pc_range(center) is False):
+ continue
+
+ if new_gt_points.__len__() < 5: # too few points
+ continue
+
+ # put it on the road plane
+ cur_height = (-d - a * center[0] - c * center[2]) / b
+ move_height = new_gt_box3d[1] - cur_height
+ new_gt_box3d[1] -= move_height
+ new_gt_points[:, 1] -= move_height
+ new_gt_obj.pos[1] -= move_height
+
+ new_enlarged_box3d = new_gt_box3d.copy()
+ new_enlarged_box3d[4] += 0.5
+ new_enlarged_box3d[5] += 0.5 # enlarge new added box to avoid too nearby boxes
+
+ cnt += 1
+ new_corners = kitti_utils.boxes3d_to_corners3d(new_enlarged_box3d.reshape(1, 7))
+ iou3d = kitti_utils.get_iou3d(new_corners, cur_gt_corners)
+ valid_flag = iou3d.max() < 1e-8
+ if not valid_flag:
+ continue
+
+ enlarged_box3d = new_gt_box3d.copy()
+ enlarged_box3d[3] += 2 # remove the points above and below the object
+
+ boxes_pts_mask_list = pts_utils.pts_in_boxes3d(pts_rect,
+ enlarged_box3d.reshape(1, 7))
+ pt_mask_flag = (boxes_pts_mask_list[0] == 1)
+ src_pts_flag[pt_mask_flag] = 0 # remove the original points which are inside the new box
+
+ new_pts_list.append(new_gt_points)
+ new_pts_intensity_list.append(new_gt_intensity)
+ cur_gt_boxes3d = np.concatenate((cur_gt_boxes3d, new_enlarged_box3d.reshape(1, 7)), axis=0)
+ cur_gt_corners = np.concatenate((cur_gt_corners, new_corners), axis=0)
+ extra_gt_boxes3d_list.append(new_gt_box3d.reshape(1, 7))
+ extra_gt_obj_list.append(new_gt_obj)
+
+ if new_pts_list.__len__() == 0:
+ return False, pts_rect, pts_intensity, None, None
+
+ extra_gt_boxes3d = np.concatenate(extra_gt_boxes3d_list, axis=0)
+ # remove original points and add new points
+ pts_rect = pts_rect[src_pts_flag == 1]
+ pts_intensity = pts_intensity[src_pts_flag == 1]
+ new_pts_rect = np.concatenate(new_pts_list, axis=0)
+ new_pts_intensity = np.concatenate(new_pts_intensity_list, axis=0)
+ pts_rect = np.concatenate((pts_rect, new_pts_rect), axis=0)
+ pts_intensity = np.concatenate((pts_intensity, new_pts_intensity), axis=0)
+
+ return True, pts_rect, pts_intensity, extra_gt_boxes3d, extra_gt_obj_list
+
+ def rotate_box3d_along_y(self, box3d, rot_angle):
+ old_x, old_z, ry = box3d[0], box3d[2], box3d[6]
+ old_beta = np.arctan2(old_z, old_x)
+ alpha = -np.sign(old_beta) * np.pi / 2 + old_beta + ry
+ box3d = kitti_utils.rotate_pc_along_y(box3d.reshape(1, 7), rot_angle=rot_angle)[0]
+ new_x, new_z = box3d[0], box3d[2]
+ new_beta = np.arctan2(new_z, new_x)
+ box3d[6] = np.sign(new_beta) * np.pi / 2 + alpha - new_beta
+ return box3d
+
+ def data_augmentation(self, aug_pts_rect, aug_gt_boxes3d, gt_alpha, sample_id=None, mustaug=False, stage=1):
+ """
+ :param aug_pts_rect: (N, 3)
+ :param aug_gt_boxes3d: (N, 7)
+ :param gt_alpha: (N)
+ :return:
+ """
+ aug_list = cfg.AUG_METHOD_LIST
+ aug_enable = 1 - np.random.rand(3)
+ if mustaug is True:
+ aug_enable[0] = -1
+ aug_enable[1] = -1
+ aug_method = []
+ if 'rotation' in aug_list and aug_enable[0] < cfg.AUG_METHOD_PROB[0]:
+ angle = np.random.uniform(-np.pi / cfg.AUG_ROT_RANGE, np.pi / cfg.AUG_ROT_RANGE)
+ aug_pts_rect = kitti_utils.rotate_pc_along_y(aug_pts_rect, rot_angle=angle)
+ if stage == 1:
+ # xyz change, hwl unchange
+ aug_gt_boxes3d = kitti_utils.rotate_pc_along_y(aug_gt_boxes3d, rot_angle=angle)
+
+ # calculate the ry after rotation
+ x, z = aug_gt_boxes3d[:, 0], aug_gt_boxes3d[:, 2]
+ beta = np.arctan2(z, x)
+ new_ry = np.sign(beta) * np.pi / 2 + gt_alpha - beta
+ aug_gt_boxes3d[:, 6] = new_ry # TODO: not in [-np.pi / 2, np.pi / 2]
+ elif stage == 2:
+ # for debug stage-2, this implementation has little float precision difference with the above one
+ assert aug_gt_boxes3d.shape[0] == 2
+ aug_gt_boxes3d[0] = self.rotate_box3d_along_y(aug_gt_boxes3d[0], angle)
+ aug_gt_boxes3d[1] = self.rotate_box3d_along_y(aug_gt_boxes3d[1], angle)
+ else:
+ raise NotImplementedError
+
+ aug_method.append(['rotation', angle])
+
+ if 'scaling' in aug_list and aug_enable[1] < cfg.AUG_METHOD_PROB[1]:
+ scale = np.random.uniform(0.95, 1.05)
+ aug_pts_rect = aug_pts_rect * scale
+ aug_gt_boxes3d[:, 0:6] = aug_gt_boxes3d[:, 0:6] * scale
+ aug_method.append(['scaling', scale])
+
+ if 'flip' in aug_list and aug_enable[2] < cfg.AUG_METHOD_PROB[2]:
+ # flip horizontal
+ aug_pts_rect[:, 0] = -aug_pts_rect[:, 0]
+ aug_gt_boxes3d[:, 0] = -aug_gt_boxes3d[:, 0]
+ # flip orientation: ry > 0: pi - ry, ry < 0: -pi - ry
+ if stage == 1:
+ aug_gt_boxes3d[:, 6] = np.sign(aug_gt_boxes3d[:, 6]) * np.pi - aug_gt_boxes3d[:, 6]
+ elif stage == 2:
+ assert aug_gt_boxes3d.shape[0] == 2
+ aug_gt_boxes3d[0, 6] = np.sign(aug_gt_boxes3d[0, 6]) * np.pi - aug_gt_boxes3d[0, 6]
+ aug_gt_boxes3d[1, 6] = np.sign(aug_gt_boxes3d[1, 6]) * np.pi - aug_gt_boxes3d[1, 6]
+ else:
+ raise NotImplementedError
+
+ aug_method.append('flip')
+
+ return aug_pts_rect, aug_gt_boxes3d, aug_method
+
+ @staticmethod
+ def generate_rpn_training_labels(pts_rect, gt_boxes3d):
+ cls_label = np.zeros((pts_rect.shape[0]), dtype=np.int32)
+ reg_label = np.zeros((pts_rect.shape[0], 7), dtype=np.float32) # dx, dy, dz, ry, h, w, l
+ gt_corners = kitti_utils.boxes3d_to_corners3d(gt_boxes3d, rotate=True)
+ extend_gt_boxes3d = kitti_utils.enlarge_box3d(gt_boxes3d, extra_width=0.2)
+ extend_gt_corners = kitti_utils.boxes3d_to_corners3d(extend_gt_boxes3d, rotate=True)
+ for k in range(gt_boxes3d.shape[0]):
+ box_corners = gt_corners[k]
+ fg_pt_flag = in_hull(pts_rect, box_corners)
+ fg_pts_rect = pts_rect[fg_pt_flag]
+ cls_label[fg_pt_flag] = 1
+
+ # enlarge the bbox3d, ignore nearby points
+ extend_box_corners = extend_gt_corners[k]
+ fg_enlarge_flag = in_hull(pts_rect, extend_box_corners)
+ ignore_flag = np.logical_xor(fg_pt_flag, fg_enlarge_flag)
+ cls_label[ignore_flag] = -1
+
+ # pixel offset of object center
+ center3d = gt_boxes3d[k][0:3].copy() # (x, y, z)
+ center3d[1] -= gt_boxes3d[k][3] / 2
+ reg_label[fg_pt_flag, 0:3] = center3d - fg_pts_rect # Now y is the true center of 3d box 20180928
+
+ # size and angle encoding
+ reg_label[fg_pt_flag, 3] = gt_boxes3d[k][3] # h
+ reg_label[fg_pt_flag, 4] = gt_boxes3d[k][4] # w
+ reg_label[fg_pt_flag, 5] = gt_boxes3d[k][5] # l
+ reg_label[fg_pt_flag, 6] = gt_boxes3d[k][6] # ry
+
+ return cls_label, reg_label
+
+ def get_rcnn_sample_jit(self, index):
+ sample_id = int(self.sample_id_list[index])
+ rpn_xyz, rpn_features, rpn_intensity, seg_mask = \
+ self.get_rpn_features(self.rcnn_training_feature_dir, sample_id)
+
+ # load rois and gt_boxes3d for this sample
+ roi_file = os.path.join(self.rcnn_training_roi_dir, '%06d.txt' % sample_id)
+ roi_obj_list = kitti_utils.get_objects_from_label(roi_file)
+ roi_boxes3d = kitti_utils.objs_to_boxes3d(roi_obj_list)
+ # roi_scores is not used currently
+ # roi_scores = kitti_utils.objs_to_scores(roi_obj_list)
+
+ gt_obj_list = self.filtrate_objects(self.get_label(sample_id))
+ gt_boxes3d = kitti_utils.objs_to_boxes3d(gt_obj_list)
+ sample_info = OrderedDict()
+ sample_info["sample_id"] = sample_id
+ sample_info['rpn_xyz'] = rpn_xyz
+ sample_info['rpn_features'] = rpn_features
+ sample_info['rpn_intensity'] = rpn_intensity
+ sample_info['seg_mask'] = seg_mask
+ sample_info['roi_boxes3d'] = roi_boxes3d
+ sample_info['pts_depth'] = np.linalg.norm(rpn_xyz, ord=2, axis=1)
+ sample_info['gt_boxes3d'] = gt_boxes3d
+
+ return sample_info
+
+ def sample_bg_inds(self, hard_bg_inds, easy_bg_inds, bg_rois_per_this_image):
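+        # sample background rois: when both pools are non-empty, draw hard bg
+        # (iou in [CLS_BG_THRESH_LO, CLS_BG_THRESH)) and easy bg
+        # (iou < CLS_BG_THRESH_LO) at the ratio cfg.RCNN.HARD_BG_RATIO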
+ if hard_bg_inds.size > 0 and easy_bg_inds.size > 0:
+ hard_bg_rois_num = int(bg_rois_per_this_image * cfg.RCNN.HARD_BG_RATIO)
+ easy_bg_rois_num = bg_rois_per_this_image - hard_bg_rois_num
+
+ # sampling hard bg
+ rand_num = np.floor(np.random.rand(hard_bg_rois_num) * hard_bg_inds.size).astype(np.int32)
+ hard_bg_inds = hard_bg_inds[rand_num]
+ # sampling easy bg
+ rand_num = np.floor(np.random.rand(easy_bg_rois_num) * easy_bg_inds.size).astype(np.int32)
+ easy_bg_inds = easy_bg_inds[rand_num]
+
+ bg_inds = np.concatenate([hard_bg_inds, easy_bg_inds], axis=0)
+ elif hard_bg_inds.size > 0 and easy_bg_inds.size == 0:
+ hard_bg_rois_num = bg_rois_per_this_image
+ # sampling hard bg
+ rand_num = np.floor(np.random.rand(hard_bg_rois_num) * hard_bg_inds.size).astype(np.int32)
+ bg_inds = hard_bg_inds[rand_num]
+ elif hard_bg_inds.size == 0 and easy_bg_inds.size > 0:
+ easy_bg_rois_num = bg_rois_per_this_image
+ # sampling easy bg
+ rand_num = np.floor(np.random.rand(easy_bg_rois_num) * easy_bg_inds.size).astype(np.int32)
+ bg_inds = easy_bg_inds[rand_num]
+ else:
+ raise NotImplementedError
+
+ return bg_inds
+
+ def aug_roi_by_noise_batch(self, roi_boxes3d, gt_boxes3d, aug_times=10):
+ """
+ :param roi_boxes3d: (N, 7)
+ :param gt_boxes3d: (N, 7)
+ :return:
+ """
+ iou_of_rois = np.zeros(roi_boxes3d.shape[0], dtype=np.float32)
+ for k in range(roi_boxes3d.__len__()):
+ temp_iou = cnt = 0
+ roi_box3d = roi_boxes3d[k]
+ gt_box3d = gt_boxes3d[k]
+ pos_thresh = min(cfg.RCNN.REG_FG_THRESH, cfg.RCNN.CLS_FG_THRESH)
+ gt_corners = kitti_utils.boxes3d_to_corners3d(gt_box3d.reshape(1, 7), True)
+ aug_box3d = roi_box3d
+ while temp_iou < pos_thresh and cnt < aug_times:
+ if np.random.rand() < 0.2:
+ aug_box3d = roi_box3d # p=0.2 to keep the original roi box
+ else:
+ aug_box3d = self.random_aug_box3d(roi_box3d)
+ aug_corners = kitti_utils.boxes3d_to_corners3d(aug_box3d.reshape(1, 7), True)
+ iou3d = kitti_utils.get_iou3d(aug_corners, gt_corners)
+ temp_iou = iou3d[0][0]
+ cnt += 1
+ roi_boxes3d[k] = aug_box3d
+ iou_of_rois[k] = temp_iou
+ return roi_boxes3d, iou_of_rois
+
+ @staticmethod
+ def canonical_transform_batch(pts_input, roi_boxes3d, gt_boxes3d):
+ """
+ :param pts_input: (N, npoints, 3 + C)
+ :param roi_boxes3d: (N, 7)
+ :param gt_boxes3d: (N, 7)
+ :return:
+ """
+ roi_ry = roi_boxes3d[:, 6] % (2 * np.pi) # 0 ~ 2pi
+ roi_center = roi_boxes3d[:, 0:3]
+ # shift to center
+ pts_input[:, :, [0, 1, 2]] = pts_input[:, :, [0, 1, 2]] - roi_center.reshape(-1, 1, 3)
+ gt_boxes3d_ct = np.copy(gt_boxes3d)
+ gt_boxes3d_ct[:, 0:3] = gt_boxes3d_ct[:, 0:3] - roi_center
+ # rotate to the direction of head
+ gt_boxes3d_ct = kitti_utils.rotate_pc_along_y_np(
+ gt_boxes3d_ct.reshape(-1, 1, 7),
+ roi_ry,
+ )
+ # TODO: check here
+ gt_boxes3d_ct = gt_boxes3d_ct.reshape(-1,7)
+ gt_boxes3d_ct[:, 6] = gt_boxes3d_ct[:, 6] - roi_ry
+ pts_input = kitti_utils.rotate_pc_along_y_np(
+ pts_input,
+ roi_ry
+ )
+ return pts_input, gt_boxes3d_ct
+
+ def get_rcnn_training_sample_batch(self, index):
+ sample_id = int(self.sample_id_list[index])
+ rpn_xyz, rpn_features, rpn_intensity, seg_mask = \
+ self.get_rpn_features(self.rcnn_training_feature_dir, sample_id)
+
+ # load rois and gt_boxes3d for this sample
+ roi_file = os.path.join(self.rcnn_training_roi_dir, '%06d.txt' % sample_id)
+ roi_obj_list = kitti_utils.get_objects_from_label(roi_file)
+ roi_boxes3d = kitti_utils.objs_to_boxes3d(roi_obj_list)
+ # roi_scores = kitti_utils.objs_to_scores(roi_obj_list)
+
+ gt_obj_list = self.filtrate_objects(self.get_label(sample_id))
+ gt_boxes3d = kitti_utils.objs_to_boxes3d(gt_obj_list)
+
+ # calculate original iou
+ iou3d = kitti_utils.get_iou3d(kitti_utils.boxes3d_to_corners3d(roi_boxes3d, True),
+ kitti_utils.boxes3d_to_corners3d(gt_boxes3d, True))
+ max_overlaps, gt_assignment = iou3d.max(axis=1), iou3d.argmax(axis=1)
+ max_iou_of_gt, roi_assignment = iou3d.max(axis=0), iou3d.argmax(axis=0)
+ roi_assignment = roi_assignment[max_iou_of_gt > 0].reshape(-1)
+
+ # sample fg, easy_bg, hard_bg
+ fg_rois_per_image = int(np.round(cfg.RCNN.FG_RATIO * cfg.RCNN.ROI_PER_IMAGE))
+ fg_thresh = min(cfg.RCNN.REG_FG_THRESH, cfg.RCNN.CLS_FG_THRESH)
+ fg_inds = np.nonzero(max_overlaps >= fg_thresh)[0]
+ fg_inds = np.concatenate((fg_inds, roi_assignment), axis=0) # consider the roi which has max_overlaps with gt as fg
+
+ easy_bg_inds = np.nonzero((max_overlaps < cfg.RCNN.CLS_BG_THRESH_LO))[0]
+ hard_bg_inds = np.nonzero((max_overlaps < cfg.RCNN.CLS_BG_THRESH) &
+ (max_overlaps >= cfg.RCNN.CLS_BG_THRESH_LO))[0]
+
+ fg_num_rois = fg_inds.size
+ bg_num_rois = hard_bg_inds.size + easy_bg_inds.size
+
+ if fg_num_rois > 0 and bg_num_rois > 0:
+ # sampling fg
+ fg_rois_per_this_image = min(fg_rois_per_image, fg_num_rois)
+ rand_num = np.random.permutation(fg_num_rois)
+ fg_inds = fg_inds[rand_num[:fg_rois_per_this_image]]
+
+ # sampling bg
+ bg_rois_per_this_image = cfg.RCNN.ROI_PER_IMAGE - fg_rois_per_this_image
+ bg_inds = self.sample_bg_inds(hard_bg_inds, easy_bg_inds, bg_rois_per_this_image)
+
+ elif fg_num_rois > 0 and bg_num_rois == 0:
+ # sampling fg
+            rand_num = np.floor(np.random.rand(cfg.RCNN.ROI_PER_IMAGE) * fg_num_rois).astype(np.int32)
+            fg_inds = fg_inds[rand_num]
+ fg_rois_per_this_image = cfg.RCNN.ROI_PER_IMAGE
+ bg_rois_per_this_image = 0
+ elif bg_num_rois > 0 and fg_num_rois == 0:
+ # sampling bg
+ bg_rois_per_this_image = cfg.RCNN.ROI_PER_IMAGE
+ bg_inds = self.sample_bg_inds(hard_bg_inds, easy_bg_inds, bg_rois_per_this_image)
+ fg_rois_per_this_image = 0
+        else:
+            # neither fg nor bg rois are available for this sample
+            raise NotImplementedError
+
+ # augment the rois by noise
+ roi_list, roi_iou_list, roi_gt_list = [], [], []
+ if fg_rois_per_this_image > 0:
+ fg_rois_src = roi_boxes3d[fg_inds].copy()
+ gt_of_fg_rois = gt_boxes3d[gt_assignment[fg_inds]]
+ fg_rois, fg_iou3d = self.aug_roi_by_noise_batch(fg_rois_src, gt_of_fg_rois, aug_times=10)
+ roi_list.append(fg_rois)
+ roi_iou_list.append(fg_iou3d)
+ roi_gt_list.append(gt_of_fg_rois)
+
+ if bg_rois_per_this_image > 0:
+ bg_rois_src = roi_boxes3d[bg_inds].copy()
+ gt_of_bg_rois = gt_boxes3d[gt_assignment[bg_inds]]
+ bg_rois, bg_iou3d = self.aug_roi_by_noise_batch(bg_rois_src, gt_of_bg_rois, aug_times=1)
+ roi_list.append(bg_rois)
+ roi_iou_list.append(bg_iou3d)
+ roi_gt_list.append(gt_of_bg_rois)
+
+ rois = np.concatenate(roi_list, axis=0)
+ iou_of_rois = np.concatenate(roi_iou_list, axis=0)
+ gt_of_rois = np.concatenate(roi_gt_list, axis=0)
+
+ # collect extra features for point cloud pooling
+ if cfg.RCNN.USE_INTENSITY:
+ pts_extra_input_list = [rpn_intensity.reshape(-1, 1), seg_mask.reshape(-1, 1)]
+ else:
+ pts_extra_input_list = [seg_mask.reshape(-1, 1)]
+
+ if cfg.RCNN.USE_DEPTH:
+ pts_depth = (np.linalg.norm(rpn_xyz, ord=2, axis=1) / 70.0) - 0.5
+ pts_extra_input_list.append(pts_depth.reshape(-1, 1))
+ pts_extra_input = np.concatenate(pts_extra_input_list, axis=1)
+
+ # pts, pts_feature, boxes3d, pool_extra_width, sampled_pt_num
+ pts_input, pts_features, pts_empty_flag = roipool3d_utils.roipool3d_cpu(
+ rpn_xyz, rpn_features, rois, pts_extra_input,
+ cfg.RCNN.POOL_EXTRA_WIDTH,
+ sampled_pt_num=cfg.RCNN.NUM_POINTS,
+ #canonical_transform=False
+ )
+
+ # data augmentation
+ if cfg.AUG_DATA and self.mode == 'TRAIN':
+ for k in range(rois.__len__()):
+ aug_pts = pts_input[k, :, 0:3].copy()
+ aug_gt_box3d = gt_of_rois[k].copy()
+ aug_roi_box3d = rois[k].copy()
+
+ # calculate alpha by ry
+ temp_boxes3d = np.concatenate([aug_roi_box3d.reshape(1, 7), aug_gt_box3d.reshape(1, 7)], axis=0)
+ temp_x, temp_z, temp_ry = temp_boxes3d[:, 0], temp_boxes3d[:, 2], temp_boxes3d[:, 6]
+ temp_beta = np.arctan2(temp_z, temp_x).astype(np.float64)
+ temp_alpha = -np.sign(temp_beta) * np.pi / 2 + temp_beta + temp_ry
+
+ # data augmentation
+ aug_pts, aug_boxes3d, aug_method = self.data_augmentation(aug_pts, temp_boxes3d, temp_alpha,
+ mustaug=True, stage=2)
+
+ # assign to original data
+ pts_input[k, :, 0:3] = aug_pts
+ rois[k] = aug_boxes3d[0]
+ gt_of_rois[k] = aug_boxes3d[1]
+
+ valid_mask = (pts_empty_flag == 0).astype(np.int32)
+ # regression valid mask
+ reg_valid_mask = (iou_of_rois > cfg.RCNN.REG_FG_THRESH).astype(np.int32) & valid_mask
+
+ # classification label
+ cls_label = (iou_of_rois > cfg.RCNN.CLS_FG_THRESH).astype(np.int32)
+ invalid_mask = (iou_of_rois > cfg.RCNN.CLS_BG_THRESH) & (iou_of_rois < cfg.RCNN.CLS_FG_THRESH)
+ cls_label[invalid_mask] = -1
+ cls_label[valid_mask == 0] = -1
+
+ # canonical transform and sampling
+ pts_input_ct, gt_boxes3d_ct = self.canonical_transform_batch(pts_input, rois, gt_of_rois)
+
+ pts_input_ = np.concatenate((pts_input_ct, pts_features), axis=-1)
+ sample_info = OrderedDict()
+
+ sample_info['sample_id'] = sample_id
+ sample_info['pts_input'] = pts_input_
+ sample_info['pts_feature'] = pts_features
+ sample_info['roi_boxes3d'] = rois
+ sample_info['cls_label'] = cls_label
+ sample_info['reg_valid_mask'] = reg_valid_mask
+ sample_info['gt_boxes3d_ct'] = gt_boxes3d_ct
+ sample_info['gt_of_rois'] = gt_of_rois
+ return sample_info
+
+ @staticmethod
+ def random_aug_box3d(box3d):
+ """
+ :param box3d: (7) [x, y, z, h, w, l, ry]
+ random shift, scale, orientation
+ """
+ if cfg.RCNN.REG_AUG_METHOD == 'single':
+ pos_shift = (np.random.rand(3) - 0.5) # [-0.5 ~ 0.5]
+ hwl_scale = (np.random.rand(3) - 0.5) / (0.5 / 0.15) + 1.0 #
+ angle_rot = (np.random.rand(1) - 0.5) / (0.5 / (np.pi / 12)) # [-pi/12 ~ pi/12]
+
+ aug_box3d = np.concatenate([box3d[0:3] + pos_shift, box3d[3:6] * hwl_scale,
+ box3d[6:7] + angle_rot])
+ return aug_box3d
+ elif cfg.RCNN.REG_AUG_METHOD == 'multiple':
+ # pos_range, hwl_range, angle_range, mean_iou
+ range_config = [[0.2, 0.1, np.pi / 12, 0.7],
+ [0.3, 0.15, np.pi / 12, 0.6],
+ [0.5, 0.15, np.pi / 9, 0.5],
+ [0.8, 0.15, np.pi / 6, 0.3],
+ [1.0, 0.15, np.pi / 3, 0.2]]
+ idx = np.random.randint(len(range_config))
+
+ pos_shift = ((np.random.rand(3) - 0.5) / 0.5) * range_config[idx][0]
+ hwl_scale = ((np.random.rand(3) - 0.5) / 0.5) * range_config[idx][1] + 1.0
+ angle_rot = ((np.random.rand(1) - 0.5) / 0.5) * range_config[idx][2]
+
+ aug_box3d = np.concatenate([box3d[0:3] + pos_shift, box3d[3:6] * hwl_scale, box3d[6:7] + angle_rot])
+ return aug_box3d
+ elif cfg.RCNN.REG_AUG_METHOD == 'normal':
+ x_shift = np.random.normal(loc=0, scale=0.3)
+ y_shift = np.random.normal(loc=0, scale=0.2)
+ z_shift = np.random.normal(loc=0, scale=0.3)
+ h_shift = np.random.normal(loc=0, scale=0.25)
+ w_shift = np.random.normal(loc=0, scale=0.15)
+ l_shift = np.random.normal(loc=0, scale=0.5)
+ ry_shift = ((np.random.rand() - 0.5) / 0.5) * np.pi / 12
+
+ aug_box3d = np.array([box3d[0] + x_shift, box3d[1] + y_shift, box3d[2] + z_shift, box3d[3] + h_shift,
+ box3d[4] + w_shift, box3d[5] + l_shift, box3d[6] + ry_shift])
+ return aug_box3d
+ else:
+ raise NotImplementedError
+
+ def get_proposal_from_file(self, index):
+ sample_id = int(self.image_idx_list[index])
+ proposal_file = os.path.join(self.rcnn_eval_roi_dir, '%06d.txt' % sample_id)
+ roi_obj_list = kitti_utils.get_objects_from_label(proposal_file)
+
+ rpn_xyz, rpn_features, rpn_intensity, seg_mask = self.get_rpn_features(self.rcnn_eval_feature_dir, sample_id)
+ pts_rect, pts_rpn_features, pts_intensity = rpn_xyz, rpn_features, rpn_intensity
+
+ roi_box3d_list, roi_scores = [], []
+ for obj in roi_obj_list:
+ box3d = np.array([obj.pos[0], obj.pos[1], obj.pos[2], obj.h, obj.w, obj.l, obj.ry], dtype=np.float32)
+ roi_box3d_list.append(box3d.reshape(1, 7))
+ roi_scores.append(obj.score)
+
+ roi_boxes3d = np.concatenate(roi_box3d_list, axis=0) # (N, 7)
+ roi_scores = np.array(roi_scores, dtype=np.float32) # (N)
+
+ if cfg.RCNN.ROI_SAMPLE_JIT:
+ sample_dict = {'sample_id': sample_id,
+ 'rpn_xyz': rpn_xyz,
+ 'rpn_features': rpn_features,
+ 'seg_mask': seg_mask,
+ 'roi_boxes3d': roi_boxes3d,
+ 'roi_scores': roi_scores,
+ 'pts_depth': np.linalg.norm(rpn_xyz, ord=2, axis=1)}
+
+ if self.mode != 'TEST':
+ gt_obj_list = self.filtrate_objects(self.get_label(sample_id))
+ gt_boxes3d = kitti_utils.objs_to_boxes3d(gt_obj_list)
+
+ roi_corners = kitti_utils.boxes3d_to_corners3d(roi_boxes3d,True)
+ gt_corners = kitti_utils.boxes3d_to_corners3d(gt_boxes3d,True)
+ iou3d = kitti_utils.get_iou3d(roi_corners, gt_corners)
+ if gt_boxes3d.shape[0] > 0:
+ gt_iou = iou3d.max(axis=1)
+ else:
+ gt_iou = np.zeros(roi_boxes3d.shape[0]).astype(np.float32)
+
+ sample_dict['gt_boxes3d'] = gt_boxes3d
+ sample_dict['gt_iou'] = gt_iou
+ return sample_dict
+
+ if cfg.RCNN.USE_INTENSITY:
+ pts_extra_input_list = [pts_intensity.reshape(-1, 1), seg_mask.reshape(-1, 1)]
+ else:
+ pts_extra_input_list = [seg_mask.reshape(-1, 1)]
+
+ if cfg.RCNN.USE_DEPTH:
+ cur_depth = np.linalg.norm(pts_rect, axis=1, ord=2)
+ cur_depth_norm = (cur_depth / 70.0) - 0.5
+ pts_extra_input_list.append(cur_depth_norm.reshape(-1, 1))
+
+ pts_extra_input = np.concatenate(pts_extra_input_list, axis=1)
+ pts_input, pts_features, _ = roipool3d_utils.roipool3d_cpu(
+ pts_rect, pts_rpn_features, roi_boxes3d, pts_extra_input,
+ cfg.RCNN.POOL_EXTRA_WIDTH, sampled_pt_num=cfg.RCNN.NUM_POINTS,
+ canonical_transform=True
+ )
+ pts_input = np.concatenate((pts_input, pts_features), axis=-1)
+
+ sample_dict = OrderedDict()
+ sample_dict['sample_id'] = sample_id
+ sample_dict['pts_input'] = pts_input
+ sample_dict['pts_feature'] = pts_features
+ sample_dict['roi_boxes3d'] = roi_boxes3d
+ sample_dict['roi_scores'] = roi_scores
+ #sample_dict['roi_size'] = roi_boxes3d[:, 3:6]
+
+ if self.mode == 'TEST':
+ return sample_dict
+
+ gt_obj_list = self.filtrate_objects(self.get_label(sample_id))
+ gt_boxes3d = np.zeros((gt_obj_list.__len__(), 7), dtype=np.float32)
+
+ for k, obj in enumerate(gt_obj_list):
+ gt_boxes3d[k, 0:3], gt_boxes3d[k, 3], gt_boxes3d[k, 4], gt_boxes3d[k, 5], gt_boxes3d[k, 6] \
+ = obj.pos, obj.h, obj.w, obj.l, obj.ry
+
+ if gt_boxes3d.__len__() == 0:
+ gt_iou = np.zeros((roi_boxes3d.shape[0]), dtype=np.float32)
+ else:
+ roi_corners = kitti_utils.boxes3d_to_corners3d(roi_boxes3d,True)
+ gt_corners = kitti_utils.boxes3d_to_corners3d(gt_boxes3d,True)
+ iou3d = kitti_utils.get_iou3d(roi_corners, gt_corners)
+ gt_iou = iou3d.max(axis=1)
+
+ sample_dict['gt_iou'] = gt_iou
+ sample_dict['gt_boxes3d'] = gt_boxes3d
+
+ return sample_dict
+
+ def __len__(self):
+ if cfg.RPN.ENABLED:
+ return len(self.sample_id_list)
+ elif cfg.RCNN.ENABLED:
+ if self.mode == 'TRAIN':
+ return len(self.sample_id_list)
+ else:
+ return len(self.image_idx_list)
+ else:
+ raise NotImplementedError
+
+ def __getitem__(self, index):
+ if cfg.RPN.ENABLED:
+ return self.get_rpn_sample(index)
+ elif cfg.RCNN.ENABLED:
+ if self.mode == 'TRAIN':
+ if cfg.RCNN.ROI_SAMPLE_JIT:
+ return self.get_rcnn_sample_jit(index)
+ else:
+ return self.get_rcnn_training_sample_batch(index)
+ else:
+ return self.get_proposal_from_file(index)
+ else:
+ raise NotImplementedError
+
+ def padding_batch(self, batch_data, batch_size):
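+        # pad roi_boxes3d (field 3) and gt_boxes3d (last field) of each sample
+        # with zero rows up to the per-batch maximum so the fields can be stacked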
+ max_roi = 0
+ max_gt = 0
+
+ for k in range(batch_size):
+ # roi_boxes3d
+ max_roi = max(max_roi, batch_data[k][3].shape[0])
+ # gt_boxes3d
+ max_gt = max(max_gt, batch_data[k][-1].shape[0])
+ batch_roi_boxes3d = np.zeros((batch_size, max_roi, 7))
+ batch_gt_boxes3d = np.zeros((batch_size, max_gt, 7), dtype=np.float32)
+
+ for i, data in enumerate(batch_data):
+ roi_num = data[3].shape[0]
+ gt_num = data[-1].shape[0]
+ batch_roi_boxes3d[i,:roi_num,:] = data[3]
+ batch_gt_boxes3d[i,:gt_num,:] = data[-1]
+
+ new_batch = []
+ for i, data in enumerate(batch_data):
+ new_batch.append(data[:3])
+ # roi_boxes3d
+ new_batch[i].append(batch_roi_boxes3d[i])
+ # ...
+ new_batch[i].extend(data[4:7])
+ # gt_boxes3d
+ new_batch[i].append(batch_gt_boxes3d[i])
+ return new_batch
+
+ def padding_batch_eval(self, batch_data, batch_size):
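+        # eval-time variant of padding_batch: additionally pads pts_input,
+        # pts_feature and gt_iou to the per-batch maximum sizes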
+ max_pts = 0
+ max_feats = 0
+ max_roi = 0
+ max_score = 0
+ max_iou = 0
+ max_gt = 0
+
+ for k in range(batch_size):
+ # pts_input
+ max_pts = max(max_pts, batch_data[k][1].shape[0])
+ # pts_feature
+ max_feats = max(max_feats, batch_data[k][2].shape[0])
+ # roi_boxes3d
+ max_roi = max(max_roi, batch_data[k][3].shape[0])
+ # gt_iou
+ max_iou = max(max_iou, batch_data[k][-2].shape[0])
+ # gt_boxes3d
+ max_gt = max(max_gt, batch_data[k][-1].shape[0])
+ batch_pts_input = np.zeros((batch_size, max_pts, 512, 133), dtype=np.float32)
+ batch_pts_feat = np.zeros((batch_size, max_feats, 512, 128), dtype=np.float32)
+ batch_roi_boxes3d = np.zeros((batch_size, max_roi, 7), dtype=np.float32)
+ batch_gt_iou = np.zeros((batch_size, max_iou), dtype=np.float32)
+ batch_gt_boxes3d = np.zeros((batch_size, max_gt, 7), dtype=np.float32)
+
+ for i, data in enumerate(batch_data):
+ # num
+ pts_num = data[1].shape[0]
+ pts_feat_num = data[2].shape[0]
+ roi_num = data[3].shape[0]
+ iou_num = data[-2].shape[0]
+ gt_num = data[-1].shape[0]
+ # data
+ batch_pts_input[i, :pts_num, :, :] = data[1]
+ batch_pts_feat[i, :pts_feat_num, :, :] = data[2]
+ batch_roi_boxes3d[i,:roi_num,:] = data[3]
+ batch_gt_iou[i,:iou_num] = data[-2]
+ batch_gt_boxes3d[i,:gt_num,:] = data[-1]
+
+ new_batch = []
+ for i, data in enumerate(batch_data):
+ new_batch.append(data[:1])
+ new_batch[i].append(batch_pts_input[i])
+ new_batch[i].append(batch_pts_feat[i])
+ new_batch[i].append(batch_roi_boxes3d[i])
+ new_batch[i].append(data[4])
+ new_batch[i].append(batch_gt_iou[i])
+ new_batch[i].append(batch_gt_boxes3d[i])
+ return new_batch
+
+ def get_reader(self, batch_size, fields, drop_last=False):
+ def reader():
+ batch_out = []
+ idxs = np.arange(self.__len__())
+ if self.mode == 'TRAIN':
+ np.random.shuffle(idxs)
+ for idx in idxs:
+ sample_all = self.__getitem__(idx)
+ sample = [sample_all[f] for f in fields]
+ if has_empty(sample):
+                    logger.info("sample with %d fields has an empty field" % len(sample))
+ continue
+ batch_out.append(sample)
+ if len(batch_out) >= batch_size:
+ if cfg.RPN.ENABLED:
+ yield batch_out
+ else:
+ if self.mode == 'TRAIN':
+ yield self.padding_batch(batch_out, batch_size)
+ elif self.mode == 'EVAL':
+                        # batch_size should be 1 in rcnn_offline eval currently;
+                        # if batch_size > 1, the batch should be padded as follows:
+                        # yield self.padding_batch_eval(batch_out, batch_size)
+ yield batch_out
+ else:
+                        logger.error("padding is only supported in TRAIN/EVAL modes")
+ batch_out = []
+ if not drop_last:
+ if len(batch_out) > 0:
+ yield batch_out
+ return reader
+
+ def get_multiprocess_reader(self, batch_size, fields, proc_num=8, max_queue_len=128, drop_last=False):
+ def read_to_queue(idxs, queue):
+ for idx in idxs:
+ sample_all = self.__getitem__(idx)
+ sample = [sample_all[f] for f in fields]
+ queue.put(sample)
+ queue.put(None)
+
+ def reader():
+ sample_num = self.__len__()
+ idxs = np.arange(self.__len__())
+ if self.mode == 'TRAIN':
+ np.random.shuffle(idxs)
+
+ proc_idxs = []
+ proc_sample_num = int(sample_num / proc_num)
+ start_idx = 0
+ for i in range(proc_num - 1):
+ proc_idxs.append(idxs[start_idx:start_idx + proc_sample_num])
+ start_idx += proc_sample_num
+ proc_idxs.append(idxs[start_idx:])
+
+ queue = multiprocessing.Queue(max_queue_len)
+ p_list = []
+ for i in range(proc_num):
+ p_list.append(multiprocessing.Process(
+ target=read_to_queue, args=(proc_idxs[i], queue,)))
+ p_list[-1].start()
+
+ finish_num = 0
+ batch_out = []
+ while finish_num < len(p_list):
+ sample = queue.get()
+ if sample is None:
+ finish_num += 1
+ else:
+ batch_out.append(sample)
+ if len(batch_out) == batch_size:
+ yield batch_out
+ batch_out = []
+
+ # join process
+ for p in p_list:
+ if p.is_alive():
+ p.join()
+
+ return reader
+
+
+def _term_reader(signum, frame):
+ logger.info('pid {} terminated, terminate reader process '
+ 'group {}...'.format(os.getpid(), os.getpgrp()))
+ os.killpg(os.getpgid(os.getpid()), signal.SIGKILL)
+
+signal.signal(signal.SIGINT, _term_reader)
+
diff --git a/PaddleCV/Paddle3D/PointRCNN/eval.py b/PaddleCV/Paddle3D/PointRCNN/eval.py
new file mode 100644
index 0000000000000000000000000000000000000000..7ee5d37f40bbee8a5486090b1ebda05f0d5928a8
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/eval.py
@@ -0,0 +1,343 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import os
+import sys
+import time
+import shutil
+import argparse
+import logging
+import multiprocessing
+import numpy as np
+from collections import OrderedDict
+import paddle
+import paddle.fluid as fluid
+
+from models.point_rcnn import PointRCNN
+from data.kitti_rcnn_reader import KittiRCNNReader
+from utils.run_utils import *
+from utils.config import cfg, load_config, set_config_from_list
+from utils.metric_utils import calc_iou_recall, rpn_metric, rcnn_metric
+
+logging.root.handlers = []
+FORMAT = '%(asctime)s-%(levelname)s: %(message)s'
+logging.basicConfig(level=logging.INFO, format=FORMAT, stream=sys.stdout)
+logger = logging.getLogger(__name__)
+
+np.random.seed(1024) # use same seed
+METRIC_PROC_NUM = 4
+
+
+def parse_args():
+    parser = argparse.ArgumentParser(
+        "PointRCNN evaluation script")
+ parser.add_argument(
+ '--cfg',
+ type=str,
+ default='cfgs/default.yml',
+ help='specify the config for training')
+ parser.add_argument(
+ '--eval_mode',
+ type=str,
+ default='rpn',
+ required=True,
+        help='specify the evaluation mode')
+ parser.add_argument(
+ '--batch_size',
+ type=int,
+ default=1,
+ help='evaluation batch size, default 1')
+ parser.add_argument(
+ '--ckpt_dir',
+ type=str,
+ default='checkpoints/199',
+ help='specify a ckpt directory to be evaluated if needed')
+ parser.add_argument(
+ '--data_dir',
+ type=str,
+ default='./data',
+ help='KITTI dataset root directory')
+ parser.add_argument(
+ '--output_dir',
+ type=str,
+ default='output',
+ help='output directory')
+ parser.add_argument(
+ '--save_rpn_feature',
+ action='store_true',
+ default=False,
+        help='save features for separate rcnn training and evaluation')
+ parser.add_argument(
+ '--save_result',
+ action='store_true',
+ default=False,
+ help='save roi and refine result of evaluation')
+ parser.add_argument(
+ '--rcnn_eval_roi_dir',
+ type=str,
+ default=None,
+ help='specify the saved rois for rcnn evaluation when using rcnn_offline mode')
+ parser.add_argument(
+ '--rcnn_eval_feature_dir',
+ type=str,
+ default=None,
+ help='specify the saved features for rcnn evaluation when using rcnn_offline mode')
+ parser.add_argument(
+ '--log_interval',
+ type=int,
+ default=1,
+ help='mini-batch interval to log.')
+ parser.add_argument(
+ '--set',
+ dest='set_cfgs',
+ default=None,
+ nargs=argparse.REMAINDER,
+ help='set extra config keys if needed.')
+ args = parser.parse_args()
+ return args
+
+
+def eval():
+ args = parse_args()
+ print_arguments(args)
+ # check whether the installed paddle is compiled with GPU
+ # PointRCNN model can only run on GPU
+ check_gpu(True)
+
+ load_config(args.cfg)
+ if args.set_cfgs is not None:
+ set_config_from_list(args.set_cfgs)
+
+ if not os.path.isdir(args.output_dir):
+ os.makedirs(args.output_dir)
+
+ if args.eval_mode == 'rpn':
+ cfg.RPN.ENABLED = True
+ cfg.RCNN.ENABLED = False
+ elif args.eval_mode == 'rcnn':
+ cfg.RCNN.ENABLED = True
+ cfg.RPN.ENABLED = cfg.RPN.FIXED = True
+        assert args.batch_size == 1, "batch size must be 1 in rcnn evaluation"
+ elif args.eval_mode == 'rcnn_offline':
+ cfg.RCNN.ENABLED = True
+ cfg.RPN.ENABLED = False
+        assert args.batch_size == 1, "batch size must be 1 in rcnn_offline evaluation"
+ else:
+ raise NotImplementedError("unkown eval mode: {}".format(args.eval_mode))
+
+ place = fluid.CUDAPlace(0)
+ exe = fluid.Executor(place)
+
+ # build model
+ startup = fluid.Program()
+ eval_prog = fluid.Program()
+ with fluid.program_guard(eval_prog, startup):
+ with fluid.unique_name.guard():
+ eval_model = PointRCNN(cfg, args.batch_size, True, 'TEST')
+ eval_model.build()
+ eval_pyreader = eval_model.get_pyreader()
+ eval_feeds = eval_model.get_feeds()
+ eval_outputs = eval_model.get_outputs()
+ eval_prog = eval_prog.clone(True)
+
+ extra_keys = []
+ if args.eval_mode == 'rpn':
+ extra_keys.extend(['sample_id', 'rpn_cls_label', 'gt_boxes3d'])
+ if args.save_rpn_feature:
+ extra_keys.extend(['pts_rect', 'pts_features', 'pts_input',])
+ eval_keys, eval_values = parse_outputs(
+ eval_outputs, prog=eval_prog, extra_keys=extra_keys)
+
+ eval_compile_prog = fluid.compiler.CompiledProgram(
+ eval_prog).with_data_parallel()
+
+ exe.run(startup)
+
+ # load checkpoint
+ assert os.path.isdir(
+ args.ckpt_dir), "ckpt_dir {} not a directory".format(args.ckpt_dir)
+
+ def if_exist(var):
+ return os.path.exists(os.path.join(args.ckpt_dir, var.name))
+ fluid.io.load_vars(exe, args.ckpt_dir, eval_prog, predicate=if_exist)
+
+ kitti_feature_dir = os.path.join(args.output_dir, 'features')
+ kitti_output_dir = os.path.join(args.output_dir, 'detections', 'data')
+ seg_output_dir = os.path.join(args.output_dir, 'seg_result')
+ if args.save_rpn_feature:
+ if os.path.exists(kitti_feature_dir):
+ shutil.rmtree(kitti_feature_dir)
+ os.makedirs(kitti_feature_dir)
+ if os.path.exists(kitti_output_dir):
+ shutil.rmtree(kitti_output_dir)
+ os.makedirs(kitti_output_dir)
+ if os.path.exists(seg_output_dir):
+ shutil.rmtree(seg_output_dir)
+ os.makedirs(seg_output_dir)
+
+    # make sure these dirs exist
+    roi_output_dir = os.path.join('./result_dir', 'roi_result', 'data')
+    refine_output_dir = os.path.join('./result_dir', 'refine_result', 'data')
+    final_output_dir = os.path.join('./result_dir', 'final_result', 'data')
+ if not os.path.exists(final_output_dir):
+ os.makedirs(final_output_dir)
+ if args.save_result:
+ if not os.path.exists(roi_output_dir):
+ os.makedirs(roi_output_dir)
+ if not os.path.exists(refine_output_dir):
+ os.makedirs(refine_output_dir)
+
+ # get reader
+ kitti_rcnn_reader = KittiRCNNReader(data_dir=args.data_dir,
+ npoints=cfg.RPN.NUM_POINTS,
+ split=cfg.TEST.SPLIT,
+ mode='EVAL',
+ classes=cfg.CLASSES,
+ rcnn_eval_roi_dir=args.rcnn_eval_roi_dir,
+ rcnn_eval_feature_dir=args.rcnn_eval_feature_dir)
+ eval_reader = kitti_rcnn_reader.get_multiprocess_reader(args.batch_size, eval_feeds)
+ eval_pyreader.decorate_sample_list_generator(eval_reader, place)
+
+ thresh_list = [0.1, 0.3, 0.5, 0.7, 0.9]
+ queue = multiprocessing.Queue(128)
+ mgr = multiprocessing.Manager()
+ lock = multiprocessing.Lock()
+ mdict = mgr.dict()
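+    # Metric aggregation runs in METRIC_PROC_NUM worker processes: the eval loop
+    # below pushes fetched batches into `queue`, the workers accumulate results
+    # into the shared `mdict`, and one None sentinel per worker (see the
+    # EOFException handler) tells them to exit.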
+ if cfg.RPN.ENABLED:
+ mdict['exit_proc'] = 0
+ mdict['total_gt_bbox'] = 0
+ mdict['total_cnt'] = 0
+ mdict['total_rpn_iou'] = 0
+ for i in range(len(thresh_list)):
+ mdict['total_recalled_bbox_list_{}'.format(i)] = 0
+
+ p_list = []
+ for i in range(METRIC_PROC_NUM):
+ p_list.append(multiprocessing.Process(
+ target=rpn_metric,
+ args=(queue, mdict, lock, thresh_list, args.save_rpn_feature, kitti_feature_dir,
+ seg_output_dir, kitti_output_dir, kitti_rcnn_reader, cfg.CLASSES)))
+ p_list[-1].start()
+
+ if cfg.RCNN.ENABLED:
+ for i in range(len(thresh_list)):
+ mdict['total_recalled_bbox_list_{}'.format(i)] = 0
+ mdict['total_roi_recalled_bbox_list_{}'.format(i)] = 0
+ mdict['exit_proc'] = 0
+ mdict['total_cls_acc'] = 0
+ mdict['total_cls_acc_refined'] = 0
+ mdict['total_det_num'] = 0
+ mdict['total_gt_bbox'] = 0
+ p_list = []
+ for i in range(METRIC_PROC_NUM):
+ p_list.append(multiprocessing.Process(
+ target=rcnn_metric,
+ args=(queue, mdict, lock, thresh_list, kitti_rcnn_reader, roi_output_dir,
+ refine_output_dir, final_output_dir, args.save_result)
+ ))
+ p_list[-1].start()
+
+ try:
+ eval_pyreader.start()
+ eval_iter = 0
+ start_time = time.time()
+
+ cur_time = time.time()
+ while True:
+ eval_outs = exe.run(eval_compile_prog, fetch_list=eval_values, return_numpy=False)
+ rets_dict = {k: (np.array(v), v.recursive_sequence_lengths())
+ for k, v in zip(eval_keys, eval_outs)}
+ run_time = time.time() - cur_time
+ cur_time = time.time()
+ queue.put(rets_dict)
+ eval_iter += 1
+
+ logger.info("[EVAL] iter {}, time: {:.2f}".format(
+ eval_iter, run_time))
+
+ except fluid.core.EOFException:
+ # terminate metric process
+ for i in range(METRIC_PROC_NUM):
+ queue.put(None)
+ while mdict['exit_proc'] < METRIC_PROC_NUM:
+ time.sleep(1)
+ for p in p_list:
+ if p.is_alive():
+ p.join()
+
+ end_time = time.time()
+ logger.info("[EVAL] total {} iter finished, average time: {:.2f}".format(
+ eval_iter, (end_time - start_time) / float(eval_iter)))
+
+ if cfg.RPN.ENABLED:
+ avg_rpn_iou = mdict['total_rpn_iou'] / max(len(kitti_rcnn_reader), 1.)
+ logger.info("average rpn iou: {:.3f}".format(avg_rpn_iou))
+ total_gt_bbox = float(max(mdict['total_gt_bbox'], 1.0))
+ for idx, thresh in enumerate(thresh_list):
+ recall = mdict['total_recalled_bbox_list_{}'.format(idx)] / total_gt_bbox
+ logger.info("total bbox recall(thresh={:.3f}): {} / {} = {:.3f}".format(
+ thresh, mdict['total_recalled_bbox_list_{}'.format(idx)], mdict['total_gt_bbox'], recall))
+
+ if cfg.RCNN.ENABLED:
+ cnt = float(max(eval_iter, 1.0))
+ avg_cls_acc = mdict['total_cls_acc'] / cnt
+ avg_cls_acc_refined = mdict['total_cls_acc_refined'] / cnt
+ avg_det_num = mdict['total_det_num'] / cnt
+
+ logger.info("avg_cls_acc: {}".format(avg_cls_acc))
+ logger.info("avg_cls_acc_refined: {}".format(avg_cls_acc_refined))
+ logger.info("avg_det_num: {}".format(avg_det_num))
+
+ total_gt_bbox = float(max(mdict['total_gt_bbox'], 1.0))
+ for idx, thresh in enumerate(thresh_list):
+ cur_roi_recall = mdict['total_roi_recalled_bbox_list_{}'.format(idx)] / total_gt_bbox
+ logger.info('total roi bbox recall(thresh=%.3f): %d / %d = %f' % (
+ thresh, mdict['total_roi_recalled_bbox_list_{}'.format(idx)], total_gt_bbox, cur_roi_recall))
+
+ for idx, thresh in enumerate(thresh_list):
+ cur_recall = mdict['total_recalled_bbox_list_{}'.format(idx)] / total_gt_bbox
+                logger.info('total bbox recall(thresh=%.3f): %d / %d = %.4f' % (
+ thresh, mdict['total_recalled_bbox_list_{}'.format(idx)], total_gt_bbox, cur_recall))
+
+ split_file = os.path.join('./data/KITTI', 'ImageSets', 'val.txt')
+            with open(split_file) as f:
+                image_idx_list = [x.strip() for x in f.readlines()]
+            for k in range(len(image_idx_list)):
+ cur_file = os.path.join(final_output_dir, '%s.txt' % image_idx_list[k])
+ if not os.path.exists(cur_file):
+ with open(cur_file, 'w') as temp_f:
+ pass
+
+            if sys.version_info >= (3, 6):
+ label_dir = os.path.join('./data/KITTI/object/training', 'label_2')
+ split_file = os.path.join('./data/KITTI', 'ImageSets', 'val.txt')
+                final_output_dir = os.path.join('./result_dir', 'final_result', 'data')
+ name_to_class = {'Car': 0, 'Pedestrian': 1, 'Cyclist': 2}
+
+ from tools.kitti_object_eval_python.evaluate import evaluate as kitti_evaluate
+ ap_result_str, ap_dict = kitti_evaluate(
+ label_dir, final_output_dir, label_split_file=split_file,
+ current_class=name_to_class["Car"])
+
+ logger.info("KITTI evaluate: {}, {}".format(ap_result_str, ap_dict))
+
+ else:
+ logger.info("KITTI mAP only support python version >= 3.6, users can "
+ "run 'python3 tools/kitti_eval.py' to evaluate KITTI mAP.")
+
+ finally:
+ eval_pyreader.reset()
+
+
+if __name__ == "__main__":
+ eval()
diff --git a/PaddleCV/Paddle3D/PointRCNN/ext_op b/PaddleCV/Paddle3D/PointRCNN/ext_op
new file mode 120000
index 0000000000000000000000000000000000000000..dca99c677c8fa26e7cbf3ce1d50a8e6af0621655
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/ext_op
@@ -0,0 +1 @@
+../PointNet++/ext_op
\ No newline at end of file
diff --git a/PaddleCV/Paddle3D/PointRCNN/images/teaser.png b/PaddleCV/Paddle3D/PointRCNN/images/teaser.png
new file mode 100644
index 0000000000000000000000000000000000000000..21ae7e98165074ef93dc34fc643b3fddc5fe6c36
Binary files /dev/null and b/PaddleCV/Paddle3D/PointRCNN/images/teaser.png differ
diff --git a/PaddleCV/Paddle3D/PointRCNN/models/__init__.py b/PaddleCV/Paddle3D/PointRCNN/models/__init__.py
new file mode 100644
index 0000000000000000000000000000000000000000..46a4f6ee220f10f50a182f4a2ed510b0551f64a8
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/models/__init__.py
@@ -0,0 +1,13 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
+#
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
diff --git a/PaddleCV/Paddle3D/PointRCNN/models/loss_utils.py b/PaddleCV/Paddle3D/PointRCNN/models/loss_utils.py
new file mode 100644
index 0000000000000000000000000000000000000000..d70d1bd69c63b4e31616c1325e69187710f23961
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/models/loss_utils.py
@@ -0,0 +1,202 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
+#
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
+
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
+import numpy as np
+
+import paddle.fluid as fluid
+from paddle.fluid.param_attr import ParamAttr
+from paddle.fluid.initializer import Constant
+
+__all__ = ["get_reg_loss"]
+
+
+def sigmoid_focal_loss(logits, labels, weights, gamma=2.0, alpha=0.25):
+ sce_loss = fluid.layers.sigmoid_cross_entropy_with_logits(logits, labels)
+ prob = fluid.layers.sigmoid(logits)
+ p_t = labels * prob + (1.0 - labels) * (1.0 - prob)
+ modulating_factor = fluid.layers.pow(1.0 - p_t, gamma)
+ alpha_weight_factor = labels * alpha + (1.0 - labels) * (1.0 - alpha)
+ return modulating_factor * alpha_weight_factor * sce_loss * weights
+
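+# A minimal sketch of the weighting in sigmoid_focal_loss (illustrative numbers):
+#   p_t  = y * p + (1 - y) * (1 - p)                # probability of the true class
+#   loss = (1 - p_t)**gamma * alpha_t * BCE * weights
+# e.g. with gamma=2, an easy point (p_t = 0.9) is scaled by (1 - 0.9)**2 = 0.01,
+# while a hard one (p_t = 0.1) keeps (1 - 0.1)**2 = 0.81 of its loss.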
+
+def get_reg_loss(pred_reg, reg_label, fg_mask, point_num, loc_scope,
+ loc_bin_size, num_head_bin, anchor_size,
+ get_xz_fine=True, get_y_by_bin=False, loc_y_scope=0.5,
+ loc_y_bin_size=0.25, get_ry_fine=False):
+
+ """
+ Bin-based 3D bounding boxes regression loss. See https://arxiv.org/abs/1812.04244 for more details.
+
+ :param pred_reg: (N, C)
+ :param reg_label: (N, 7) [dx, dy, dz, h, w, l, ry]
+ :param loc_scope: constant
+ :param loc_bin_size: constant
+ :param num_head_bin: constant
+ :param anchor_size: (N, 3) or (3)
+ :param get_xz_fine:
+ :param get_y_by_bin:
+ :param loc_y_scope:
+ :param loc_y_bin_size:
+ :param get_ry_fine:
+ :return:
+ """
+ fg_num = fluid.layers.cast(fluid.layers.reduce_sum(fg_mask), dtype=pred_reg.dtype)
+ fg_num = fluid.layers.clip(fg_num, min=1.0, max=point_num)
+ fg_scale = float(point_num) / fg_num
+
+ per_loc_bin_num = int(loc_scope / loc_bin_size) * 2
+ loc_y_bin_num = int(loc_y_scope / loc_y_bin_size) * 2
+
+ reg_loss_dict = {}
+
+ # xz localization loss
+ x_offset_label, y_offset_label, z_offset_label = reg_label[:, 0:1], reg_label[:, 1:2], reg_label[:, 2:3]
+ x_shift = fluid.layers.clip(x_offset_label + loc_scope, 0., loc_scope * 2 - 1e-3)
+ z_shift = fluid.layers.clip(z_offset_label + loc_scope, 0., loc_scope * 2 - 1e-3)
+ x_bin_label = fluid.layers.cast(x_shift / loc_bin_size, dtype='int64')
+ z_bin_label = fluid.layers.cast(z_shift / loc_bin_size, dtype='int64')
+
+ x_bin_l, x_bin_r = 0, per_loc_bin_num
+ z_bin_l, z_bin_r = per_loc_bin_num, per_loc_bin_num * 2
+ start_offset = z_bin_r
+
+ loss_x_bin = fluid.layers.softmax_with_cross_entropy(pred_reg[:, x_bin_l: x_bin_r], x_bin_label)
+ loss_x_bin = fluid.layers.reduce_mean(loss_x_bin * fg_mask) * fg_scale
+ loss_z_bin = fluid.layers.softmax_with_cross_entropy(pred_reg[:, z_bin_l: z_bin_r], z_bin_label)
+ loss_z_bin = fluid.layers.reduce_mean(loss_z_bin * fg_mask) * fg_scale
+ reg_loss_dict['loss_x_bin'] = loss_x_bin
+ reg_loss_dict['loss_z_bin'] = loss_z_bin
+ loc_loss = loss_x_bin + loss_z_bin
+
+ if get_xz_fine:
+ x_res_l, x_res_r = per_loc_bin_num * 2, per_loc_bin_num * 3
+ z_res_l, z_res_r = per_loc_bin_num * 3, per_loc_bin_num * 4
+ start_offset = z_res_r
+
+ x_res_label = x_shift - (fluid.layers.cast(x_bin_label, dtype=x_shift.dtype) * loc_bin_size + loc_bin_size / 2.)
+ z_res_label = z_shift - (fluid.layers.cast(z_bin_label, dtype=z_shift.dtype) * loc_bin_size + loc_bin_size / 2.)
+ x_res_norm_label = x_res_label / loc_bin_size
+ z_res_norm_label = z_res_label / loc_bin_size
+
+ x_bin_onehot = fluid.layers.one_hot(x_bin_label, depth=per_loc_bin_num)
+ z_bin_onehot = fluid.layers.one_hot(z_bin_label, depth=per_loc_bin_num)
+
+ loss_x_res = fluid.layers.smooth_l1(fluid.layers.reduce_sum(pred_reg[:, x_res_l: x_res_r] * x_bin_onehot, dim=1, keep_dim=True), x_res_norm_label)
+ loss_x_res = fluid.layers.reduce_mean(loss_x_res * fg_mask) * fg_scale
+ loss_z_res = fluid.layers.smooth_l1(fluid.layers.reduce_sum(pred_reg[:, z_res_l: z_res_r] * z_bin_onehot, dim=1, keep_dim=True), z_res_norm_label)
+ loss_z_res = fluid.layers.reduce_mean(loss_z_res * fg_mask) * fg_scale
+ reg_loss_dict['loss_x_res'] = loss_x_res
+ reg_loss_dict['loss_z_res'] = loss_z_res
+ loc_loss += loss_x_res + loss_z_res
+
+ # y localization loss
+ if get_y_by_bin:
+ y_bin_l, y_bin_r = start_offset, start_offset + loc_y_bin_num
+ y_res_l, y_res_r = y_bin_r, y_bin_r + loc_y_bin_num
+ start_offset = y_res_r
+
+ y_shift = fluid.layers.clip(y_offset_label + loc_y_scope, 0., loc_y_scope * 2 - 1e-3)
+ y_bin_label = fluid.layers.cast(y_shift / loc_y_bin_size, dtype='int64')
+ y_res_label = y_shift - (fluid.layers.cast(y_bin_label, dtype=y_shift.dtype) * loc_y_bin_size + loc_y_bin_size / 2.)
+ y_res_norm_label = y_res_label / loc_y_bin_size
+
+        # depth must match the y-bin slice width (loc_y_bin_num, not per_loc_bin_num)
+        y_bin_onehot = fluid.layers.one_hot(y_bin_label, depth=loc_y_bin_num)
+
+        loss_y_bin = fluid.layers.softmax_with_cross_entropy(pred_reg[:, y_bin_l: y_bin_r], y_bin_label)
+ loss_y_bin = fluid.layers.reduce_mean(loss_y_bin * fg_mask) * fg_scale
+ loss_y_res = fluid.layers.smooth_l1(fluid.layers.reduce_sum(pred_reg[:, y_res_l: y_res_r] * y_bin_onehot, dim=1, keep_dim=True), y_res_norm_label)
+ loss_y_res = fluid.layers.reduce_mean(loss_y_res * fg_mask) * fg_scale
+
+ reg_loss_dict['loss_y_bin'] = loss_y_bin
+ reg_loss_dict['loss_y_res'] = loss_y_res
+
+ loc_loss += loss_y_bin + loss_y_res
+ else:
+ y_offset_l, y_offset_r = start_offset, start_offset + 1
+ start_offset = y_offset_r
+
+ loss_y_offset = fluid.layers.smooth_l1(fluid.layers.reduce_sum(pred_reg[:, y_offset_l: y_offset_r], dim=1, keep_dim=True), y_offset_label)
+ loss_y_offset = fluid.layers.reduce_mean(loss_y_offset * fg_mask) * fg_scale
+ reg_loss_dict['loss_y_offset'] = loss_y_offset
+ loc_loss += loss_y_offset
+
+ # angle loss
+ ry_bin_l, ry_bin_r = start_offset, start_offset + num_head_bin
+ ry_res_l, ry_res_r = ry_bin_r, ry_bin_r + num_head_bin
+
+ ry_label = reg_label[:, 6:7]
+
+ if get_ry_fine:
+ # divide pi/2 into several bins
+ angle_per_class = (np.pi / 2) / num_head_bin
+
+ ry_label = ry_label % (2 * np.pi) # 0 ~ 2pi
+ opposite_flag = fluid.layers.logical_and(ry_label > np.pi * 0.5, ry_label < np.pi * 1.5)
+ opposite_flag = fluid.layers.cast(opposite_flag, dtype=ry_label.dtype)
+ shift_angle = (ry_label + opposite_flag * np.pi + np.pi * 0.5) % (2 * np.pi) # (0 ~ pi)
+ shift_angle.stop_gradient = True
+
+ shift_angle = fluid.layers.clip(shift_angle - np.pi * 0.25, min=1e-3, max=np.pi * 0.5 - 1e-3) # (0, pi/2)
+
+ # bin center is (5, 10, 15, ..., 85)
+ ry_bin_label = fluid.layers.cast(shift_angle / angle_per_class, dtype='int64')
+ ry_res_label = shift_angle - (fluid.layers.cast(ry_bin_label, dtype=shift_angle.dtype) * angle_per_class + angle_per_class / 2)
+ ry_res_norm_label = ry_res_label / (angle_per_class / 2)
+
+ else:
+ # divide 2pi into several bins
+ angle_per_class = (2 * np.pi) / num_head_bin
+ heading_angle = ry_label % (2 * np.pi) # 0 ~ 2pi
+
+ shift_angle = (heading_angle + angle_per_class / 2) % (2 * np.pi)
+ shift_angle.stop_gradient = True
+ ry_bin_label = fluid.layers.cast(shift_angle / angle_per_class, dtype='int64')
+ ry_res_label = shift_angle - (fluid.layers.cast(ry_bin_label, dtype=shift_angle.dtype) * angle_per_class + angle_per_class / 2)
+ ry_res_norm_label = ry_res_label / (angle_per_class / 2)
+
+ ry_bin_onehot = fluid.layers.one_hot(ry_bin_label, depth=num_head_bin)
+ loss_ry_bin = fluid.layers.softmax_with_cross_entropy(pred_reg[:, ry_bin_l:ry_bin_r], ry_bin_label)
+ loss_ry_bin = fluid.layers.reduce_mean(loss_ry_bin * fg_mask) * fg_scale
+ loss_ry_res = fluid.layers.smooth_l1(fluid.layers.reduce_sum(pred_reg[:, ry_res_l: ry_res_r] * ry_bin_onehot, dim=1, keep_dim=True), ry_res_norm_label)
+ loss_ry_res = fluid.layers.reduce_mean(loss_ry_res * fg_mask) * fg_scale
+
+ reg_loss_dict['loss_ry_bin'] = loss_ry_bin
+ reg_loss_dict['loss_ry_res'] = loss_ry_res
+ angle_loss = loss_ry_bin + loss_ry_res
+
+ # size loss
+ size_res_l, size_res_r = ry_res_r, ry_res_r + 3
+ assert pred_reg.shape[1] == size_res_r, '%d vs %d' % (pred_reg.shape[1], size_res_r)
+
+ anchor_size_var = fluid.layers.zeros(shape=[3], dtype=reg_label.dtype)
+ fluid.layers.assign(np.array(anchor_size).astype('float32'), anchor_size_var)
+ size_res_norm_label = (reg_label[:, 3:6] - anchor_size_var) / anchor_size_var
+ size_res_norm_label = fluid.layers.reshape(size_res_norm_label, shape=[-1, 1], inplace=True)
+ size_res_norm = pred_reg[:, size_res_l:size_res_r]
+ size_res_norm = fluid.layers.reshape(size_res_norm, shape=[-1, 1], inplace=True)
+ size_loss = fluid.layers.smooth_l1(size_res_norm, size_res_norm_label)
+ size_loss = fluid.layers.reshape(size_loss, shape=[-1, 3])
+ size_loss = fluid.layers.reduce_mean(size_loss * fg_mask) * fg_scale
+
+ # Total regression loss
+ reg_loss_dict['loss_loc'] = loc_loss
+ reg_loss_dict['loss_angle'] = angle_loss
+ reg_loss_dict['loss_size'] = size_loss
+
+ return loc_loss, angle_loss, size_loss, reg_loss_dict
+
diff --git a/PaddleCV/Paddle3D/PointRCNN/models/point_rcnn.py b/PaddleCV/Paddle3D/PointRCNN/models/point_rcnn.py
new file mode 100644
index 0000000000000000000000000000000000000000..890ef897405722f9cc1ba1d129bea2c80fce17a1
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/models/point_rcnn.py
@@ -0,0 +1,125 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
+#
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
+import numpy as np
+from collections import OrderedDict
+
+import paddle.fluid as fluid
+from paddle.fluid.param_attr import ParamAttr
+from paddle.fluid.initializer import Constant
+
+from models.rpn import RPN
+from models.rcnn import RCNN
+
+
+__all__ = ["PointRCNN"]
+
+
+class PointRCNN(object):
+ def __init__(self, cfg, batch_size, use_xyz=True, mode='TRAIN', prog=None):
+ self.cfg = cfg
+ self.batch_size = batch_size
+ self.use_xyz = use_xyz
+ self.mode = mode
+ self.is_train = mode == 'TRAIN'
+ self.num_points = self.cfg.RPN.NUM_POINTS
+ self.prog = prog
+ self.inputs = None
+ self.pyreader = None
+
+ def build_inputs(self):
+ self.inputs = OrderedDict()
+
+ if self.cfg.RPN.ENABLED:
+ self.inputs['sample_id'] = fluid.layers.data(name='sample_id', shape=[1], dtype='int32')
+ self.inputs['pts_input'] = fluid.layers.data(name='pts_input', shape=[self.num_points, 3], dtype='float32')
+ self.inputs['pts_rect'] = fluid.layers.data(name='pts_rect', shape=[self.num_points, 3], dtype='float32')
+ self.inputs['pts_features'] = fluid.layers.data(name='pts_features', shape=[self.num_points, 1], dtype='float32')
+ self.inputs['rpn_cls_label'] = fluid.layers.data(name='rpn_cls_label', shape=[self.num_points], dtype='int32')
+ self.inputs['rpn_reg_label'] = fluid.layers.data(name='rpn_reg_label', shape=[self.num_points, 7], dtype='float32')
+ self.inputs['gt_boxes3d'] = fluid.layers.data(name='gt_boxes3d', shape=[7], lod_level=1, dtype='float32')
+
+ if self.cfg.RCNN.ENABLED:
+ if self.cfg.RCNN.ROI_SAMPLE_JIT:
+ self.inputs['sample_id'] = fluid.layers.data(name='sample_id', shape=[1], dtype='int32', append_batch_size=False)
+ self.inputs['rpn_xyz'] = fluid.layers.data(name='rpn_xyz', shape=[self.num_points, 3], dtype='float32', append_batch_size=False)
+ self.inputs['rpn_features'] = fluid.layers.data(name='rpn_features', shape=[self.num_points,128], dtype='float32', append_batch_size=False)
+ self.inputs['rpn_intensity'] = fluid.layers.data(name='rpn_intensity', shape=[self.num_points], dtype='float32', append_batch_size=False)
+ self.inputs['seg_mask'] = fluid.layers.data(name='seg_mask', shape=[self.num_points], dtype='float32', append_batch_size=False)
+ self.inputs['roi_boxes3d'] = fluid.layers.data(name='roi_boxes3d', shape=[-1, -1, 7], dtype='float32', append_batch_size=False, lod_level=0)
+ self.inputs['pts_depth'] = fluid.layers.data(name='pts_depth', shape=[self.num_points], dtype='float32', append_batch_size=False)
+ self.inputs['gt_boxes3d'] = fluid.layers.data(name='gt_boxes3d', shape=[-1, -1, 7], dtype='float32', append_batch_size=False, lod_level=0)
+ else:
+ self.inputs['sample_id'] = fluid.layers.data(name='sample_id', shape=[-1], dtype='int32', append_batch_size=False)
+ self.inputs['pts_input'] = fluid.layers.data(name='pts_input', shape=[-1,512,133], dtype='float32', append_batch_size=False)
+ self.inputs['pts_feature'] = fluid.layers.data(name='pts_feature', shape=[-1,512,128], dtype='float32', append_batch_size=False)
+ self.inputs['roi_boxes3d'] = fluid.layers.data(name='roi_boxes3d', shape=[-1,7], dtype='float32', append_batch_size=False)
+ if self.is_train:
+ self.inputs['cls_label'] = fluid.layers.data(name='cls_label', shape=[-1], dtype='float32', append_batch_size=False)
+ self.inputs['reg_valid_mask'] = fluid.layers.data(name='reg_valid_mask', shape=[-1], dtype='float32', append_batch_size=False)
+ self.inputs['gt_boxes3d_ct'] = fluid.layers.data(name='gt_boxes3d_ct', shape=[-1,7], dtype='float32', append_batch_size=False)
+ self.inputs['gt_of_rois'] = fluid.layers.data(name='gt_of_rois', shape=[-1,7], dtype='float32', append_batch_size=False)
+ else:
+ self.inputs['roi_scores'] = fluid.layers.data(name='roi_scores', shape=[-1,], dtype='float32', append_batch_size=False)
+ self.inputs['gt_iou'] = fluid.layers.data(name='gt_iou', shape=[-1], dtype='float32', append_batch_size=False)
+ self.inputs['gt_boxes3d'] = fluid.layers.data(name='gt_boxes3d', shape=[-1,-1,7], dtype='float32', append_batch_size=False, lod_level=0)
+
+
+ self.pyreader = fluid.io.PyReader(
+ feed_list=list(self.inputs.values()),
+ capacity=64,
+ use_double_buffer=True,
+ iterable=False)
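+        # The PyReader is fed by the train/eval scripts via
+        # decorate_sample_list_generator(); get_feeds() below exposes the input
+        # names in the matching order.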
+
+ def build(self):
+ self.build_inputs()
+ if self.cfg.RPN.ENABLED:
+ self.rpn = RPN(self.cfg, self.batch_size, self.use_xyz,
+ self.mode, self.prog)
+ self.rpn.build(self.inputs)
+ self.rpn_outputs = self.rpn.get_outputs()
+ self.outputs = self.rpn_outputs
+
+ if self.cfg.RCNN.ENABLED:
+ self.rcnn = RCNN(self.cfg, 1, self.batch_size, self.mode)
+ self.rcnn.build_model(self.inputs)
+ self.outputs = self.rcnn.get_outputs()
+
+ if self.mode == 'TRAIN':
+ if self.cfg.RPN.ENABLED:
+ self.outputs['rpn_loss'], self.outputs['rpn_loss_cls'], \
+ self.outputs['rpn_loss_reg'] = self.rpn.get_loss()
+ if self.cfg.RCNN.ENABLED:
+ self.outputs['rcnn_loss'], self.outputs['rcnn_loss_cls'], \
+ self.outputs['rcnn_loss_reg'] = self.rcnn.get_loss()
+ self.outputs['loss'] = self.outputs.get('rpn_loss', 0.) \
+ + self.outputs.get('rcnn_loss', 0.)
+
+ def get_feeds(self):
+ return list(self.inputs.keys())
+
+ def get_outputs(self):
+ return self.outputs
+
+ def get_loss(self):
+ rpn_loss, _, _ = self.rpn.get_loss()
+ rcnn_loss, _, _ = self.rcnn.get_loss()
+ return rpn_loss + rcnn_loss
+
+ def get_pyreader(self):
+ return self.pyreader
+
diff --git a/PaddleCV/Paddle3D/PointRCNN/models/pointnet2_modules.py b/PaddleCV/Paddle3D/PointRCNN/models/pointnet2_modules.py
new file mode 100644
index 0000000000000000000000000000000000000000..6f92bb5f77afc50cdb1c92ab694b82c6ac64479f
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/models/pointnet2_modules.py
@@ -0,0 +1,203 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
+#
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
+"""
+Contains PointNet++ utility functions.
+"""
+
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
+import numpy as np
+
+import paddle.fluid as fluid
+from paddle.fluid.param_attr import ParamAttr
+from paddle.fluid.initializer import Constant, Normal
+from ext_op import *
+
+__all__ = ["conv_bn", "pointnet_sa_module", "pointnet_fp_module", "MLP"]
+
+
+def query_and_group(xyz, new_xyz, radius, nsample, features=None, use_xyz=True):
+ """
+ Perform query_ball and group_points
+
+ Args:
+        xyz (Variable): xyz coordinates features with shape [B, N, 3]
+        new_xyz (Variable): centroids features with shape [B, npoint, 3]
+        radius (float32): radius of ball
+        nsample (int32): maximum number of gather features
+        features (Variable): features with shape [B, N, C]
+        use_xyz (bool): whether to use xyz coordinates features
+
+ Returns:
+        out (Variable): features with shape [B, C + 3, npoint, nsample]
+ """
+ idx = query_ball(xyz, new_xyz, radius, nsample)
+ idx.stop_gradient = True
+ xyz = fluid.layers.transpose(xyz,perm=[0, 2, 1])
+ grouped_xyz = group_points(xyz, idx)
+ expand_new_xyz = fluid.layers.unsqueeze(fluid.layers.transpose(new_xyz, perm=[0, 2, 1]), axes=[-1])
+ expand_new_xyz = fluid.layers.expand(expand_new_xyz, [1, 1, 1, grouped_xyz.shape[3]])
+ grouped_xyz -= expand_new_xyz
+
+ if features is not None:
+ grouped_features = group_points(features, idx)
+ return fluid.layers.concat([grouped_xyz, grouped_features], axis=1) \
+ if use_xyz else grouped_features
+ else:
+ assert use_xyz, "use_xyz should be True when features is None"
+ return grouped_xyz
+
+
+def group_all(xyz, features=None, use_xyz=True):
+ """
+ Group all xyz and features when npoint is None
+ See query_and_group
+ """
+ xyz = fluid.layers.transpose(xyz,perm=[0, 2, 1])
+ grouped_xyz = fluid.layers.unsqueeze(xyz, axes=[2])
+ if features is not None:
+ grouped_features = fluid.layers.unsqueeze(features, axes=[2])
+ return fluid.layers.concat([grouped_xyz, grouped_features], axis=1) if use_xyz else grouped_features
+ else:
+ return grouped_xyz
+
+
+def conv_bn(input, out_channels, bn=True, bn_momentum=0.95, act='relu', name=None):
+ def _get_kaiming_init():
+ fan_in = input.shape[1]
+ std = (1.0 / fan_in / 3.0) ** 0.5
+ return Normal(0., std, 0.)
+
+ param_attr = ParamAttr(name='{}_conv_weight'.format(name),
+ initializer=_get_kaiming_init())
+ bias_attr = ParamAttr(name='{}_conv_bias'.format(name)) \
+ if not bn else False
+ out = fluid.layers.conv2d(input,
+ num_filters=out_channels,
+ filter_size=1,
+ stride=1,
+ padding=0,
+ dilation=1,
+ param_attr=param_attr,
+ bias_attr=bias_attr,
+ act=act if not bn else None)
+ if bn:
+ bn_name = name + "_bn"
+ out = fluid.layers.batch_norm(out,
+ act=act,
+ momentum=bn_momentum,
+ param_attr=ParamAttr(name=bn_name + "_scale"),
+ bias_attr=ParamAttr(name=bn_name + "_offset"),
+ moving_mean_name=bn_name + '_mean',
+ moving_variance_name=bn_name + '_var')
+
+ return out
+
+
+def MLP(features, out_channels_list, bn=True, bn_momentum=0.95, act='relu', name=None):
+ out = features
+ for i, out_channels in enumerate(out_channels_list):
+ out = conv_bn(out, out_channels, bn=bn, act=act, bn_momentum=bn_momentum, name=name + "_{}".format(i))
+ return out
+
+
+def pointnet_sa_module(xyz,
+ npoint=None,
+ radiuss=[],
+ nsamples=[],
+ mlps=[],
+ feature=None,
+ bn=True,
+ bn_momentum=0.95,
+ use_xyz=True,
+ name=None):
+ """
+ PointNet MSG(Multi-Scale Group) Set Abstraction Module.
+ Call with radiuss, nsamples, mlps as single element list for
+ SSG(Single-Scale Group).
+
+ Args:
+        xyz (Variable): xyz coordinates features with shape [B, N, 3]
+        radiuss ([float32]): list of radius of ball
+        nsamples ([int32]): list of maximum number of gather features
+        mlps ([[int32]]): list of out_channels_list
+        feature (Variable): features with shape [B, C, N]
+        bn (bool): whether perform batch norm after conv2d
+        bn_momentum (float): momentum of batch norm
+        use_xyz (bool): whether to use xyz coordinates features
+
+    Returns:
+        new_xyz (Variable): centroids features with shape [B, npoint, 3]
+        out (Variable): features with shape [B, \sum_i{mlps[i][-1]}, npoint]
+ """
+ assert len(radiuss) == len(nsamples) == len(mlps), \
+ "radiuss, nsamples, mlps length should be same"
+
+ farthest_idx = farthest_point_sampling(xyz, npoint)
+ farthest_idx.stop_gradient = True
+ new_xyz = gather_point(xyz, farthest_idx) if npoint is not None else None
+
+ outs = []
+ for i, (radius, nsample, mlp) in enumerate(zip(radiuss, nsamples, mlps)):
+ out = query_and_group(xyz, new_xyz, radius, nsample, feature, use_xyz) if npoint is not None else group_all(xyz, feature, use_xyz)
+ out = MLP(out, mlp, bn=bn, bn_momentum=bn_momentum, name=name + '_mlp{}'.format(i))
+ out = fluid.layers.pool2d(out, pool_size=[1, out.shape[3]], pool_type='max')
+ out = fluid.layers.squeeze(out, axes=[-1])
+ outs.append(out)
+ out = fluid.layers.concat(outs, axis=1)
+
+ return (new_xyz, out)
+
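+# Illustrative usage (shapes assume `xyz` is a [B, N, 3] Variable):
+#   MSG: pointnet_sa_module(xyz, npoint=512, radiuss=[0.1, 0.5], nsamples=[16, 32],
+#                           mlps=[[32, 64], [64, 128]], name="sa") returns
+#        new_xyz with shape [B, 512, 3] and out with shape [B, 64 + 128, 512].
+#   SSG: pass single-element lists for radiuss/nsamples/mlps.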
+
+def pointnet_fp_module(unknown, known, unknown_feats, known_feats, mlp, bn=True, bn_momentum=0.95, name=None):
+ """
+ PointNet Feature Propagation Module
+
+ Args:
+        unknown (Variable): unknown xyz coordinates features with shape [B, N, 3]
+        known (Variable): known xyz coordinates features with shape [B, M, 3]
+ unknown_feats (Variable): unknown features with shape [B, N, C1] to be propagated to
+ known_feats (Variable): known features with shape [B, M, C2] to be propagated from
+ mlp ([int32]): out_channels_list
+ bn (bool): whether perform batch norm after conv2d
+
+ Returns:
+ new_features (Variable): new features with shape [B, N, mlp[-1]]
+ """
+ if known is None:
+ raise NotImplementedError("Not implement known as None currently.")
+ else:
+ dist, idx = three_nn(unknown, known, eps=0.)
+ dist.stop_gradient = True
+ idx.stop_gradient = True
+ dist = fluid.layers.sqrt(dist)
+ ones = fluid.layers.fill_constant_batch_size_like(dist, dist.shape, dist.dtype, 1)
+        dist_recip = ones / (dist + 1e-8)  # 1.0 / dist
+ norm = fluid.layers.reduce_sum(dist_recip, dim=-1, keep_dim=True)
+ weight = dist_recip / norm
+ weight.stop_gradient = True
+ interp_feats = three_interp(known_feats, weight, idx)
+
+ new_features = interp_feats if unknown_feats is None else \
+ fluid.layers.concat([interp_feats, unknown_feats], axis=-1)
+ new_features = fluid.layers.transpose(new_features, perm=[0, 2, 1])
+ new_features = fluid.layers.unsqueeze(new_features, axes=[-1])
+ new_features = MLP(new_features, mlp, bn=bn, bn_momentum=bn_momentum, name=name + '_mlp')
+ new_features = fluid.layers.squeeze(new_features, axes=[-1])
+ new_features = fluid.layers.transpose(new_features, perm=[0, 2, 1])
+
+ return new_features
+
diff --git a/PaddleCV/Paddle3D/PointRCNN/models/pointnet2_msg.py b/PaddleCV/Paddle3D/PointRCNN/models/pointnet2_msg.py
new file mode 100644
index 0000000000000000000000000000000000000000..b4d5f98c3b320663111cf9eceef4f2649f44007d
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/models/pointnet2_msg.py
@@ -0,0 +1,78 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
+#
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
+"""
+Contains the PointNet++ MSG backbone network used by the PointRCNN RPN.
+"""
+
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
+import numpy as np
+
+import paddle.fluid as fluid
+from paddle.fluid.param_attr import ParamAttr
+from paddle.fluid.initializer import Constant
+from models.pointnet2_modules import *
+
+__all__ = ["PointNet2MSG"]
+
+
+class PointNet2MSG(object):
+ def __init__(self, cfg, xyz, feature=None, use_xyz=True):
+ self.cfg = cfg
+ self.xyz = xyz
+ self.feature = feature
+ self.use_xyz = use_xyz
+ self.model_config()
+
+ def model_config(self):
+ self.SA_confs = []
+        for i in range(len(self.cfg.RPN.SA_CONFIG.NPOINTS)):
+ self.SA_confs.append({
+ "npoint": self.cfg.RPN.SA_CONFIG.NPOINTS[i],
+ "radiuss": self.cfg.RPN.SA_CONFIG.RADIUS[i],
+ "nsamples": self.cfg.RPN.SA_CONFIG.NSAMPLE[i],
+ "mlps": self.cfg.RPN.SA_CONFIG.MLPS[i],
+ })
+
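+        # For reference, a hypothetical MSG config in the yml could look like:
+        #   RPN.SA_CONFIG.NPOINTS: [4096, 1024, 256, 64]
+        #   RPN.SA_CONFIG.RADIUS:  [[0.1, 0.5], [0.5, 1.0], [1.0, 2.0], [2.0, 4.0]]
+        # i.e. one npoint plus per-scale radius/nsample/mlp lists per SA level.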
+ self.FP_confs = []
+        for i in range(len(self.cfg.RPN.FP_MLPS)):
+ self.FP_confs.append({"mlp": self.cfg.RPN.FP_MLPS[i]})
+
+ def build(self, bn_momentum=0.95):
+ xyzs, features = [self.xyz], [self.feature]
+ xyzi, featurei = self.xyz, self.feature
+ for i, SA_conf in enumerate(self.SA_confs):
+ xyzi, featurei = pointnet_sa_module(
+ xyz=xyzi,
+ feature=featurei,
+ bn_momentum=bn_momentum,
+ use_xyz=self.use_xyz,
+ name="sa_{}".format(i),
+ **SA_conf)
+ xyzs.append(xyzi)
+ features.append(fluid.layers.transpose(featurei, perm=[0, 2, 1]))
+ for i in range(-1, -(len(self.FP_confs) + 1), -1):
+ features[i - 1] = pointnet_fp_module(
+ unknown=xyzs[i - 1],
+ known=xyzs[i],
+ unknown_feats=features[i - 1],
+ known_feats=features[i],
+ bn_momentum=bn_momentum,
+ name="fp_{}".format(i + len(self.FP_confs)),
+ **self.FP_confs[i])
+
+ return xyzs[0], features[0]
+
diff --git a/PaddleCV/Paddle3D/PointRCNN/models/rcnn.py b/PaddleCV/Paddle3D/PointRCNN/models/rcnn.py
new file mode 100644
index 0000000000000000000000000000000000000000..cb2f65332fbf14517abec0c257330fab1c834155
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/models/rcnn.py
@@ -0,0 +1,303 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
+#
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
+
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
+import numpy as np
+import sys
+
+import paddle.fluid as fluid
+from paddle.fluid.param_attr import ParamAttr
+from paddle.fluid.initializer import Constant
+
+from models.pointnet2_modules import MLP, pointnet_sa_module, conv_bn
+from models.loss_utils import sigmoid_focal_loss, get_reg_loss
+from utils.proposal_target import get_proposal_target_func
+from utils.cyops.kitti_utils import rotate_pc_along_y
+
+__all__ = ['RCNN']
+
+
+class RCNN(object):
+ def __init__(self, cfg, num_classes, batch_size, mode='TRAIN', use_xyz=True, input_channels=0):
+ self.cfg = cfg
+ self.use_xyz = use_xyz
+ self.num_classes = num_classes
+ self.input_channels = input_channels
+ self.inputs = None
+ self.training = mode == 'TRAIN'
+ self.batch_size = batch_size
+
+ def create_tmp_var(self, name, dtype, shape):
+ return fluid.default_main_program().current_block().create_var(
+ name=name, dtype=dtype, shape=shape
+ )
+
+ def build_model(self, inputs):
+ self.inputs = inputs
+ if self.cfg.RCNN.ROI_SAMPLE_JIT:
+ if self.training:
+ proposal_target = get_proposal_target_func(self.cfg)
+
+ tmp_list = [
+ self.inputs['seg_mask'],
+ self.inputs['rpn_features'],
+ self.inputs['gt_boxes3d'],
+ self.inputs['rpn_xyz'],
+ self.inputs['pts_depth'],
+ self.inputs['roi_boxes3d'],
+ self.inputs['rpn_intensity'],
+ ]
+                out_name = ['reg_valid_mask', 'sampled_pts', 'roi_boxes3d', 'gt_of_rois', 'pts_feature', 'cls_label', 'gt_iou']
+                reg_valid_mask = self.create_tmp_var(name="reg_valid_mask", dtype='float32', shape=[-1])
+                sampled_pts = self.create_tmp_var(name="sampled_pts", dtype='float32', shape=[-1, self.cfg.RCNN.NUM_POINTS, 3])
+                new_roi_boxes3d = self.create_tmp_var(name="new_roi_boxes3d", dtype='float32', shape=[-1, 7])
+                gt_of_rois = self.create_tmp_var(name="gt_of_rois", dtype='float32', shape=[-1, 7])
+                pts_feature = self.create_tmp_var(name="pts_feature", dtype='float32', shape=[-1, 512, 130])
+                cls_label = self.create_tmp_var(name="cls_label", dtype='int64', shape=[-1])
+                gt_iou = self.create_tmp_var(name="gt_iou", dtype='float32', shape=[-1])
+
+                out_list = [reg_valid_mask, sampled_pts, new_roi_boxes3d, gt_of_rois, pts_feature, cls_label, gt_iou]
+                out = fluid.layers.py_func(func=proposal_target, x=tmp_list, out=out_list)
+
+ self.target_dict = {}
+                for i, item in enumerate(out):
+                    self.target_dict[out_name[i]] = item
+
+                pts = fluid.layers.concat(input=[self.target_dict['sampled_pts'], self.target_dict['pts_feature']], axis=2)
+                self.target_dict['pts_input'] = pts
+ else:
+ rpn_xyz, rpn_features = inputs['rpn_xyz'], inputs['rpn_features']
+ batch_rois = inputs['roi_boxes3d']
+ rpn_intensity = inputs['rpn_intensity']
+                rpn_intensity = fluid.layers.unsqueeze(rpn_intensity, axes=[2])
+                seg_mask = fluid.layers.unsqueeze(inputs['seg_mask'], axes=[2])
+ if self.cfg.RCNN.USE_INTENSITY:
+ pts_extra_input_list = [rpn_intensity, seg_mask]
+ else:
+ pts_extra_input_list = [seg_mask]
+
+ if self.cfg.RCNN.USE_DEPTH:
+                    pts_depth = inputs['pts_depth'] / 70.0 - 0.5
+                    pts_depth = fluid.layers.unsqueeze(pts_depth, axes=[2])
+ pts_extra_input_list.append(pts_depth)
+ pts_extra_input = fluid.layers.concat(pts_extra_input_list, axis=2)
+ pts_feature = fluid.layers.concat([pts_extra_input, rpn_features],axis=2)
+
+ pooled_features, pooled_empty_flag = fluid.layers.roi_pool_3d(rpn_xyz,pts_feature,batch_rois,
+ self.cfg.RCNN.POOL_EXTRA_WIDTH,
+ sampled_pt_num=self.cfg.RCNN.NUM_POINTS)
+ # canonical transformation
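+            # (shift each pooled point into its ROI's local frame: subtract the
+            # ROI center, then rotate around the camera Y axis by the ROI heading
+            # ry so the box is canonically aligned before the RCNN head sees it)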
+ batch_size = batch_rois.shape[0]
+ roi_center = batch_rois[:, :, 0:3]
+ tmp = pooled_features[:, :, :, 0:3] - fluid.layers.unsqueeze(roi_center,axes=[2])
+ pooled_features = fluid.layers.concat(input=[tmp,pooled_features[:,:,:,3:]],axis=3)
+ concat_list = []
+ for i in range(batch_size):
+ tmp = rotate_pc_along_y(pooled_features[i, :, :, 0:3],
+ batch_rois[i, :, 6])
+                    concat = fluid.layers.concat([tmp, pooled_features[i, :, :, 3:]], axis=-1)
+                    concat = fluid.layers.unsqueeze(concat, axes=[0])
+                    concat_list.append(concat)
+                pooled_features = fluid.layers.concat(concat_list, axis=0)
+                pts = fluid.layers.reshape(pooled_features, shape=[-1, pooled_features.shape[2], pooled_features.shape[3]])
+
+ else:
+ pts = inputs['pts_input']
+ self.target_dict = {}
+ self.target_dict['pts_input'] = inputs['pts_input']
+ self.target_dict['roi_boxes3d'] = inputs['roi_boxes3d']
+
+ if self.training:
+ self.target_dict['cls_label'] = inputs['cls_label']
+ self.target_dict['reg_valid_mask'] = inputs['reg_valid_mask']
+ self.target_dict['gt_of_rois'] = inputs['gt_boxes3d_ct']
+
+ xyz = pts[:,:,0:3]
+ feature = fluid.layers.transpose(pts[:,:,3:], [0,2,1]) if pts.shape[-1]>3 else None
+ if self.cfg.RCNN.USE_RPN_FEATURES:
+ self.rcnn_input_channel = 3 + int(self.cfg.RCNN.USE_INTENSITY) + \
+ int(self.cfg.RCNN.USE_MASK) + int(self.cfg.RCNN.USE_DEPTH)
+ c_out = self.cfg.RCNN.XYZ_UP_LAYER[-1]
+
+ xyz_input = pts[:,:,:self.rcnn_input_channel]
+ xyz_input = fluid.layers.transpose(xyz_input, [0,2,1])
+ xyz_input = fluid.layers.unsqueeze(xyz_input, axes=[3])
+
+ rpn_feature = pts[:,:,self.rcnn_input_channel:]
+ rpn_feature = fluid.layers.transpose(rpn_feature, [0,2,1])
+ rpn_feature = fluid.layers.unsqueeze(rpn_feature,axes=[3])
+
+ xyz_feature = MLP(
+ xyz_input,
+ out_channels_list=self.cfg.RCNN.XYZ_UP_LAYER,
+ bn=self.cfg.RCNN.USE_BN,
+ name="xyz_up_layer")
+
+ merged_feature = fluid.layers.concat([xyz_feature, rpn_feature],axis=1)
+ merged_feature = MLP(
+ merged_feature,
+ out_channels_list=[c_out],
+ bn=self.cfg.RCNN.USE_BN,
+ name="xyz_down_layer")
+
+ xyzs = [xyz]
+ features = [fluid.layers.squeeze(merged_feature,axes=[3])]
+ else:
+ xyzs = [xyz]
+ features = [feature]
+
+ # forward
+ xyzi, featurei = xyzs[-1], features[-1]
+ for k in range(len(self.cfg.RCNN.SA_CONFIG.NPOINTS)):
+ mlps = self.cfg.RCNN.SA_CONFIG.MLPS[k]
+ npoint = self.cfg.RCNN.SA_CONFIG.NPOINTS[k] if self.cfg.RCNN.SA_CONFIG.NPOINTS[k] != -1 else None
+
+            xyzi, featurei = pointnet_sa_module(
+                xyz=xyzi,
+                feature=featurei,
+                bn=self.cfg.RCNN.USE_BN,
+                use_xyz=self.use_xyz,
+                name="sa_{}".format(k),
+                npoint=npoint,
+                mlps=[mlps],
+                radiuss=[self.cfg.RCNN.SA_CONFIG.RADIUS[k]],
+                nsamples=[self.cfg.RCNN.SA_CONFIG.NSAMPLE[k]])
+ xyzs.append(xyzi)
+ features.append(featurei)
+
+ head_in = features[-1]
+ head_in = fluid.layers.unsqueeze(head_in, axes=[2])
+
+ cls_out = head_in
+ reg_out = cls_out
+
+        for i in range(len(self.cfg.RCNN.CLS_FC)):
+ cls_out = conv_bn(cls_out, self.cfg.RCNN.CLS_FC[i], bn=self.cfg.RCNN.USE_BN, name='rcnn_cls_{}'.format(i))
+ if i == 0 and self.cfg.RCNN.DP_RATIO >= 0:
+ cls_out = fluid.layers.dropout(cls_out, self.cfg.RCNN.DP_RATIO, dropout_implementation="upscale_in_train")
+ cls_channel = 1 if self.num_classes == 2 else self.num_classes
+ cls_out = conv_bn(cls_out, cls_channel, act=None, name="cls_out", bn=self.cfg.RCNN.USE_BN)
+ self.cls_out = fluid.layers.squeeze(cls_out,axes=[1,3])
+
+ per_loc_bin_num = int(self.cfg.RCNN.LOC_SCOPE / self.cfg.RCNN.LOC_BIN_SIZE) * 2
+ loc_y_bin_num = int(self.cfg.RCNN.LOC_Y_SCOPE / self.cfg.RCNN.LOC_Y_BIN_SIZE) * 2
+ reg_channel = per_loc_bin_num * 4 + self.cfg.RCNN.NUM_HEAD_BIN * 2 + 3
+ reg_channel += (1 if not self.cfg.RCNN.LOC_Y_BY_BIN else loc_y_bin_num * 2)
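+        # reg_channel layout (consumed by get_reg_loss): x/z bin logits and
+        # residuals (4 * per_loc_bin_num), heading bin logits and residuals
+        # (2 * NUM_HEAD_BIN), 3 size residuals, plus either 1 y-offset channel
+        # or 2 * loc_y_bin_num y bin/residual channels.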
+        for i in range(len(self.cfg.RCNN.REG_FC)):
+ reg_out = conv_bn(reg_out, self.cfg.RCNN.REG_FC[i], bn=self.cfg.RCNN.USE_BN, name='rcnn_reg_{}'.format(i))
+ if i == 0 and self.cfg.RCNN.DP_RATIO >= 0:
+ reg_out = fluid.layers.dropout(reg_out, self.cfg.RCNN.DP_RATIO, dropout_implementation="upscale_in_train")
+
+ reg_out = conv_bn(reg_out, reg_channel, act=None, name="reg_out", bn=self.cfg.RCNN.USE_BN)
+ self.reg_out = fluid.layers.squeeze(reg_out, axes=[2,3])
+
+
+        self.outputs = {
+            'rcnn_cls': self.cls_out,
+            'rcnn_reg': self.reg_out,
+        }
+        if self.training:
+            self.outputs.update(self.target_dict)
+        else:
+ self.outputs['sample_id'] = inputs['sample_id']
+ self.outputs['pts_input'] = inputs['pts_input']
+ self.outputs['roi_boxes3d'] = inputs['roi_boxes3d']
+ self.outputs['roi_scores'] = inputs['roi_scores']
+ self.outputs['gt_iou'] = inputs['gt_iou']
+ self.outputs['gt_boxes3d'] = inputs['gt_boxes3d']
+
+ if self.cls_out.shape[1] == 1:
+ raw_scores = fluid.layers.reshape(self.cls_out, shape=[-1])
+ norm_scores = fluid.layers.sigmoid(raw_scores)
+ else:
+ norm_scores = fluid.layers.softmax(self.cls_out, axis=1)
+ self.outputs['norm_scores'] = norm_scores
+
+ def get_outputs(self):
+ return self.outputs
+
+ def get_loss(self):
+        assert self.inputs is not None, \
+            "please call build_model() first"
+ rcnn_cls_label = self.outputs['cls_label']
+ reg_valid_mask = self.outputs['reg_valid_mask']
+ roi_boxes3d = self.outputs['roi_boxes3d']
+ roi_size = roi_boxes3d[:, 3:6]
+ gt_boxes3d_ct = self.outputs['gt_of_rois']
+ pts_input = self.outputs['pts_input']
+
+ rcnn_cls = self.cls_out
+ rcnn_reg = self.reg_out
+
+ # RCNN classification loss
+ assert self.cfg.RCNN.LOSS_CLS in ["SigmoidFocalLoss", "BinaryCrossEntropy"], \
+ "unsupported RCNN cls loss type {}".format(self.cfg.RCNN.LOSS_CLS)
+
+ if self.cfg.RCNN.LOSS_CLS == "SigmoidFocalLoss":
+ cls_flat = fluid.layers.reshape(self.cls_out, shape=[-1])
+ cls_label_flat = fluid.layers.reshape(rcnn_cls_label, shape=[-1])
+ cls_label_flat = fluid.layers.cast(cls_label_flat, dtype=cls_flat.dtype)
+ cls_target = fluid.layers.cast(cls_label_flat>0, dtype=cls_flat.dtype)
+ cls_label_flat.stop_gradient = True
+ pos = fluid.layers.cast(cls_label_flat > 0, dtype=cls_flat.dtype)
+ pos.stop_gradient = True
+ pos_normalizer = fluid.layers.reduce_sum(pos)
+ cls_weights = fluid.layers.cast(cls_label_flat >= 0, dtype=cls_flat.dtype)
+ cls_weights = cls_weights / fluid.layers.clip(pos_normalizer, min=1.0, max=1e10)
+ cls_weights.stop_gradient = True
+ rcnn_loss_cls = sigmoid_focal_loss(cls_flat, cls_target, cls_weights)
+ rcnn_loss_cls = fluid.layers.reduce_sum(rcnn_loss_cls)
+ else: # BinaryCrossEntropy
+ cls_label = fluid.layers.reshape(rcnn_cls_label, shape=self.cls_out.shape)
+ cls_valid_mask = fluid.layers.cast(cls_label >= 0, dtype=self.cls_out.dtype)
+ cls_label = fluid.layers.cast(cls_label, dtype=self.cls_out.dtype)
+ cls_label.stop_gradient = True
+ rcnn_loss_cls = fluid.layers.sigmoid_cross_entropy_with_logits(self.cls_out, cls_label)
+            cls_mask_normalizer = fluid.layers.reduce_sum(cls_valid_mask)
+            rcnn_loss_cls = fluid.layers.reduce_sum(rcnn_loss_cls * cls_valid_mask) \
+                            / fluid.layers.clip(cls_mask_normalizer, min=1.0, max=1e10)
+
+ # RCNN regression loss
+ reg_out = self.reg_out
+ fg_mask = fluid.layers.cast(reg_valid_mask > 0, dtype=reg_out.dtype)
+ fg_mask = fluid.layers.unsqueeze(fg_mask, axes=[1])
+ fg_mask.stop_gradient = True
+ gt_boxes3d_ct = fluid.layers.reshape(gt_boxes3d_ct, [-1,7])
+ all_anchor_size = roi_size
+ anchor_size = all_anchor_size[fg_mask] if self.cfg.RCNN.SIZE_RES_ON_ROI else self.cfg.CLS_MEAN_SIZE[0]
+
+ loc_loss, angle_loss, size_loss, loss_dict = get_reg_loss(
+ reg_out * fg_mask,
+ gt_boxes3d_ct,
+ fg_mask,
+ point_num=float(self.batch_size*64),
+ loc_scope=self.cfg.RCNN.LOC_SCOPE,
+ loc_bin_size=self.cfg.RCNN.LOC_BIN_SIZE,
+ num_head_bin=self.cfg.RCNN.NUM_HEAD_BIN,
+ anchor_size=anchor_size,
+ get_xz_fine=True,
+ get_y_by_bin=self.cfg.RCNN.LOC_Y_BY_BIN,
+ loc_y_scope=self.cfg.RCNN.LOC_Y_SCOPE,
+ loc_y_bin_size=self.cfg.RCNN.LOC_Y_BIN_SIZE,
+ get_ry_fine=True
+ )
+ rcnn_loss_reg = loc_loss + angle_loss + size_loss * 3
+ rcnn_loss = rcnn_loss_cls + rcnn_loss_reg
+ return rcnn_loss, rcnn_loss_cls, rcnn_loss_reg
+
diff --git a/PaddleCV/Paddle3D/PointRCNN/models/rpn.py b/PaddleCV/Paddle3D/PointRCNN/models/rpn.py
new file mode 100644
index 0000000000000000000000000000000000000000..5432e0c2b2b5d6f3e6e492bd34d9f2a8ab14c49f
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/models/rpn.py
@@ -0,0 +1,171 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
+#
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
+
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
+import numpy as np
+
+import paddle.fluid as fluid
+from paddle.fluid.param_attr import ParamAttr
+from paddle.fluid.initializer import Normal, Constant
+
+from utils.proposal_utils import get_proposal_func
+from models.pointnet2_msg import PointNet2MSG
+from models.pointnet2_modules import conv_bn
+from models.loss_utils import sigmoid_focal_loss, get_reg_loss
+
+__all__ = ["RPN"]
+
+
+class RPN(object):
+ def __init__(self, cfg, batch_size, use_xyz=True, mode='TRAIN', prog=None):
+ self.cfg = cfg
+ self.batch_size = batch_size
+ self.use_xyz = use_xyz
+ self.mode = mode
+ self.is_train = mode == 'TRAIN'
+ self.inputs = None
+ self.prog = fluid.default_main_program() if prog is None else prog
+
+ def build(self, inputs):
+ assert self.cfg.RPN.BACKBONE == 'pointnet2_msg', \
+ "RPN backbone only support pointnet2_msg"
+ self.inputs = inputs
+ self.outputs = {}
+
+ xyz = inputs["pts_input"]
+ assert not self.cfg.RPN.USE_INTENSITY, \
+ "RPN.USE_INTENSITY not support now"
+ feature = None
+ msg = PointNet2MSG(self.cfg, xyz, feature, self.use_xyz)
+ backbone_xyz, backbone_feature = msg.build()
+ self.outputs['backbone_xyz'] = backbone_xyz
+ self.outputs['backbone_feature'] = backbone_feature
+
+ backbone_feature = fluid.layers.transpose(backbone_feature, perm=[0, 2, 1])
+ cls_out = fluid.layers.unsqueeze(backbone_feature, axes=[-1])
+ reg_out = cls_out
+
+ # classification branch
+        for i in range(len(self.cfg.RPN.CLS_FC)):
+ cls_out = conv_bn(cls_out, self.cfg.RPN.CLS_FC[i], bn=self.cfg.RPN.USE_BN, name='rpn_cls_{}'.format(i))
+ if i == 0 and self.cfg.RPN.DP_RATIO > 0:
+ cls_out = fluid.layers.dropout(cls_out, self.cfg.RPN.DP_RATIO, dropout_implementation="upscale_in_train")
+ cls_out = fluid.layers.conv2d(cls_out,
+ num_filters=1,
+ filter_size=1,
+ stride=1,
+ padding=0,
+ dilation=1,
+ param_attr=ParamAttr(name='rpn_cls_out_conv_weight'),
+ bias_attr=ParamAttr(name='rpn_cls_out_conv_bias',
+ initializer=Constant(-np.log(99))))
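+        # Constant(-np.log(99)) biases the classifier toward background at start:
+        # sigmoid(-log(99)) = 1/100, the usual focal-loss prior (pi = 0.01).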
+ cls_out = fluid.layers.squeeze(cls_out, axes=[1, 3])
+ self.outputs['rpn_cls'] = cls_out
+
+ # regression branch
+ per_loc_bin_num = int(self.cfg.RPN.LOC_SCOPE / self.cfg.RPN.LOC_BIN_SIZE) * 2
+ if self.cfg.RPN.LOC_XZ_FINE:
+ reg_channel = per_loc_bin_num * 4 + self.cfg.RPN.NUM_HEAD_BIN * 2 + 3
+ else:
+ reg_channel = per_loc_bin_num * 2 + self.cfg.RPN.NUM_HEAD_BIN * 2 + 3
+ reg_channel += 1 # reg y
+
+        for i in range(len(self.cfg.RPN.REG_FC)):
+ reg_out = conv_bn(reg_out, self.cfg.RPN.REG_FC[i], bn=self.cfg.RPN.USE_BN, name='rpn_reg_{}'.format(i))
+ if i == 0 and self.cfg.RPN.DP_RATIO > 0:
+ reg_out = fluid.layers.dropout(reg_out, self.cfg.RPN.DP_RATIO, dropout_implementation="upscale_in_train")
+ reg_out = fluid.layers.conv2d(reg_out,
+ num_filters=reg_channel,
+ filter_size=1,
+ stride=1,
+ padding=0,
+ dilation=1,
+ param_attr=ParamAttr(name='rpn_reg_out_conv_weight',
+ initializer=Normal(0., 0.001),),
+ bias_attr=ParamAttr(name='rpn_reg_out_conv_bias'))
+ reg_out = fluid.layers.squeeze(reg_out, axes=[3])
+ reg_out = fluid.layers.transpose(reg_out, [0, 2, 1])
+ self.outputs['rpn_reg'] = reg_out
+
+ if self.mode != 'TRAIN' or self.cfg.RCNN.ENABLED:
+ rpn_scores_row = cls_out
+ rpn_scores_norm = fluid.layers.sigmoid(rpn_scores_row)
+ seg_mask = fluid.layers.cast(rpn_scores_norm > self.cfg.RPN.SCORE_THRESH, dtype='float32')
+ pts_depth = fluid.layers.sqrt(fluid.layers.reduce_sum(backbone_xyz * backbone_xyz, dim=2))
+ proposal_func = get_proposal_func(self.cfg, self.mode)
+ proposal_input = fluid.layers.concat([fluid.layers.unsqueeze(rpn_scores_row, axes=[-1]),
+ backbone_xyz, reg_out], axis=-1)
+ proposal = self.prog.current_block().create_var(name='proposal',
+ shape=[-1, proposal_input.shape[1], 8],
+ dtype='float32')
+ fluid.layers.py_func(proposal_func, proposal_input, proposal)
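+            # py_func bridges to the numpy-side proposal generation in
+            # utils.proposal_utils; each row of `proposal` holds a decoded box
+            # (7 values) followed by its score, sliced apart just below.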
+ rois, roi_scores_row = proposal[:, :, :7], proposal[:, :, -1]
+ self.outputs['rois'] = rois
+ self.outputs['roi_scores_row'] = roi_scores_row
+ self.outputs['seg_mask'] = seg_mask
+ self.outputs['pts_depth'] = pts_depth
+
+ def get_outputs(self):
+ return self.outputs
+
+ def get_loss(self):
+ assert self.inputs is not None, \
+ "please call build() first"
+ rpn_cls_label = self.inputs['rpn_cls_label']
+ rpn_reg_label = self.inputs['rpn_reg_label']
+ rpn_cls = self.outputs['rpn_cls']
+ rpn_reg = self.outputs['rpn_reg']
+
+ # RPN classification loss
+ assert self.cfg.RPN.LOSS_CLS == "SigmoidFocalLoss", \
+ "unsupported RPN cls loss type {}".format(self.cfg.RPN.LOSS_CLS)
+ cls_flat = fluid.layers.reshape(rpn_cls, shape=[-1])
+ cls_label_flat = fluid.layers.reshape(rpn_cls_label, shape=[-1])
+ cls_label_pos = fluid.layers.cast(cls_label_flat > 0, dtype=cls_flat.dtype)
+ pos_normalizer = fluid.layers.reduce_sum(cls_label_pos)
+ cls_weights = fluid.layers.cast(cls_label_flat >= 0, dtype=cls_flat.dtype)
+ cls_weights = cls_weights / fluid.layers.clip(pos_normalizer, min=1.0, max=1e10)
+ cls_weights.stop_gradient = True
+ cls_label_flat = fluid.layers.cast(cls_label_flat, dtype=cls_flat.dtype)
+ cls_label_flat.stop_gradient = True
+ rpn_loss_cls = sigmoid_focal_loss(cls_flat, cls_label_pos, cls_weights)
+ rpn_loss_cls = fluid.layers.reduce_sum(rpn_loss_cls)
+
+ # RPN regression loss
+ rpn_reg = fluid.layers.reshape(rpn_reg, [-1, rpn_reg.shape[-1]])
+ reg_label = fluid.layers.reshape(rpn_reg_label, [-1, rpn_reg_label.shape[-1]])
+ fg_mask = fluid.layers.cast(cls_label_flat > 0, dtype=rpn_reg.dtype)
+ fg_mask = fluid.layers.unsqueeze(fg_mask, axes=[1])
+ fg_mask.stop_gradient = True
+ loc_loss, angle_loss, size_loss, loss_dict = get_reg_loss(
+ rpn_reg * fg_mask,
+ reg_label,
+ fg_mask,
+ float(self.batch_size * self.cfg.RPN.NUM_POINTS),
+ loc_scope=self.cfg.RPN.LOC_SCOPE,
+ loc_bin_size=self.cfg.RPN.LOC_BIN_SIZE,
+ num_head_bin=self.cfg.RPN.NUM_HEAD_BIN,
+ anchor_size=self.cfg.CLS_MEAN_SIZE[0],
+ get_xz_fine=self.cfg.RPN.LOC_XZ_FINE,
+ get_y_by_bin=False,
+ get_ry_fine=False)
+ rpn_loss_reg = loc_loss + angle_loss + size_loss * 3
+
+ self.rpn_loss = rpn_loss_cls * self.cfg.RPN.LOSS_WEIGHT[0] \
+ + rpn_loss_reg * self.cfg.RPN.LOSS_WEIGHT[1]
+ return self.rpn_loss, rpn_loss_cls, rpn_loss_reg
+
diff --git a/PaddleCV/Paddle3D/PointRCNN/requirement.txt b/PaddleCV/Paddle3D/PointRCNN/requirement.txt
new file mode 100644
index 0000000000000000000000000000000000000000..6ff347ab06c588b507fd6b5f1442e2375afb032a
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/requirement.txt
@@ -0,0 +1,6 @@
+Cython
+opencv-python
+shapely
+scikit-image
+Numba
+fire
diff --git a/PaddleCV/Paddle3D/PointRCNN/tools/generate_aug_scene.py b/PaddleCV/Paddle3D/PointRCNN/tools/generate_aug_scene.py
new file mode 100644
index 0000000000000000000000000000000000000000..59cfa4abc0629c71d150f750e8f32400c6c361b9
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/tools/generate_aug_scene.py
@@ -0,0 +1,330 @@
+"""
+Generate GT database
+This code is based on https://github.com/sshaoshuai/PointRCNN/blob/master/tools/generate_aug_scene.py
+"""
+
+import os
+import numpy as np
+import pickle
+
+import pts_utils
+import utils.cyops.kitti_utils as kitti_utils
+from utils.box_utils import boxes_iou3d
+from utils import calibration as calib
+from data.kitti_dataset import KittiDataset
+import argparse
+
+np.random.seed(1024)
+
+parser = argparse.ArgumentParser()
+parser.add_argument('--mode', type=str, default='generator')
+parser.add_argument('--class_name', type=str, default='Car')
+parser.add_argument('--data_dir', type=str, default='./data')
+parser.add_argument('--save_dir', type=str, default='./data/KITTI/aug_scene/training')
+parser.add_argument('--split', type=str, default='train')
+parser.add_argument('--gt_database_dir', type=str, default='./data/gt_database/train_gt_database_3level_Car.pkl')
+parser.add_argument('--include_similar', action='store_true', default=False)
+parser.add_argument('--aug_times', type=int, default=4)
+args = parser.parse_args()
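+# Example usage (paths assume the default ./data layout):
+#   python tools/generate_aug_scene.py --class_name Car --split train --aug_times 4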
+
+PC_REDUCE_BY_RANGE = True
+if args.class_name == 'Car':
+ PC_AREA_SCOPE = np.array([[-40, 40], [-1, 3], [0, 70.4]]) # x, y, z scope in rect camera coords
+else:
+ PC_AREA_SCOPE = np.array([[-30, 30], [-1, 3], [0, 50]])
+
+
+def log_print(info, fp=None):
+ print(info)
+ if fp is not None:
+ # print(info, file=fp)
+ fp.write(info+"\n")
+
+
+def save_kitti_format(calib, bbox3d, obj_list, img_shape, save_fp):
+ corners3d = kitti_utils.boxes3d_to_corners3d(bbox3d)
+ img_boxes, _ = calib.corners3d_to_img_boxes(corners3d)
+
+ img_boxes[:, 0] = np.clip(img_boxes[:, 0], 0, img_shape[1] - 1)
+ img_boxes[:, 1] = np.clip(img_boxes[:, 1], 0, img_shape[0] - 1)
+ img_boxes[:, 2] = np.clip(img_boxes[:, 2], 0, img_shape[1] - 1)
+ img_boxes[:, 3] = np.clip(img_boxes[:, 3], 0, img_shape[0] - 1)
+
+ # Discard boxes that are larger than 80% of the image width OR height
+ img_boxes_w = img_boxes[:, 2] - img_boxes[:, 0]
+ img_boxes_h = img_boxes[:, 3] - img_boxes[:, 1]
+ box_valid_mask = np.logical_and(img_boxes_w < img_shape[1] * 0.8, img_boxes_h < img_shape[0] * 0.8)
+
+ for k in range(bbox3d.shape[0]):
+ if box_valid_mask[k] == 0:
+ continue
+ x, z, ry = bbox3d[k, 0], bbox3d[k, 2], bbox3d[k, 6]
+ beta = np.arctan2(z, x)
+ alpha = -np.sign(beta) * np.pi / 2 + beta + ry
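+            # convert rotation_y to the KITTI observation angle alpha using
+            # the box-center direction beta = arctan2(z, x) in rect camera coords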
+
+ save_fp.write('%s %.2f %d %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f\n' %
+ (args.class_name, obj_list[k].trucation, int(obj_list[k].occlusion), alpha, img_boxes[k, 0], img_boxes[k, 1],
+ img_boxes[k, 2], img_boxes[k, 3],
+ bbox3d[k, 3], bbox3d[k, 4], bbox3d[k, 5], bbox3d[k, 0], bbox3d[k, 1], bbox3d[k, 2],
+ bbox3d[k, 6]))
+
+
+class AugSceneGenerator(KittiDataset):
+ def __init__(self, root_dir, gt_database=None, split='train', classes=args.class_name):
+ super(AugSceneGenerator, self).__init__(root_dir, split=split)
+ self.gt_database = None
+ if classes == 'Car':
+ self.classes = ('Background', 'Car')
+ elif classes == 'People':
+ self.classes = ('Background', 'Pedestrian', 'Cyclist')
+ elif classes == 'Pedestrian':
+ self.classes = ('Background', 'Pedestrian')
+ elif classes == 'Cyclist':
+ self.classes = ('Background', 'Cyclist')
+ else:
+ assert False, "Invalid classes: %s" % classes
+
+ self.gt_database = gt_database
+
+ def __len__(self):
+ raise NotImplementedError
+
+ def __getitem__(self, item):
+ raise NotImplementedError
+
+ def filtrate_dc_objects(self, obj_list):
+ valid_obj_list = []
+ for obj in obj_list:
+ if obj.cls_type in ['DontCare']:
+ continue
+ valid_obj_list.append(obj)
+
+ return valid_obj_list
+
+ def filtrate_objects(self, obj_list):
+ valid_obj_list = []
+ type_whitelist = self.classes
+ if args.include_similar:
+ type_whitelist = list(self.classes)
+ if 'Car' in self.classes:
+ type_whitelist.append('Van')
+ if 'Pedestrian' in self.classes or 'Cyclist' in self.classes:
+ type_whitelist.append('Person_sitting')
+
+ for obj in obj_list:
+ if obj.cls_type in type_whitelist:
+ valid_obj_list.append(obj)
+ return valid_obj_list
+
+ @staticmethod
+ def get_valid_flag(pts_rect, pts_img, pts_rect_depth, img_shape):
+ """
+ Valid point should be in the image (and in the PC_AREA_SCOPE)
+ :param pts_rect:
+ :param pts_img:
+ :param pts_rect_depth:
+ :param img_shape:
+ :return:
+ """
+ val_flag_1 = np.logical_and(pts_img[:, 0] >= 0, pts_img[:, 0] < img_shape[1])
+ val_flag_2 = np.logical_and(pts_img[:, 1] >= 0, pts_img[:, 1] < img_shape[0])
+ val_flag_merge = np.logical_and(val_flag_1, val_flag_2)
+ pts_valid_flag = np.logical_and(val_flag_merge, pts_rect_depth >= 0)
+
+ if PC_REDUCE_BY_RANGE:
+ x_range, y_range, z_range = PC_AREA_SCOPE
+ pts_x, pts_y, pts_z = pts_rect[:, 0], pts_rect[:, 1], pts_rect[:, 2]
+ range_flag = (pts_x >= x_range[0]) & (pts_x <= x_range[1]) \
+ & (pts_y >= y_range[0]) & (pts_y <= y_range[1]) \
+ & (pts_z >= z_range[0]) & (pts_z <= z_range[1])
+ pts_valid_flag = pts_valid_flag & range_flag
+ return pts_valid_flag
+
+ @staticmethod
+ def check_pc_range(xyz):
+ """
+ :param xyz: [x, y, z]
+ :return:
+ """
+ x_range, y_range, z_range = PC_AREA_SCOPE
+ if (x_range[0] <= xyz[0] <= x_range[1]) and (y_range[0] <= xyz[1] <= y_range[1]) and \
+ (z_range[0] <= xyz[2] <= z_range[1]):
+ return True
+ return False
+
+ def aug_one_scene(self, sample_id, pts_rect, pts_intensity, all_gt_boxes3d):
+ """
+ :param pts_rect: (N, 3)
+ :param gt_boxes3d: (M1, 7)
+ :param all_gt_boxex3d: (M2, 7)
+ :return:
+ """
+ assert self.gt_database is not None
+ extra_gt_num = np.random.randint(10, 15)
+ try_times = 50
+ cnt = 0
+ cur_gt_boxes3d = all_gt_boxes3d.copy()
+ cur_gt_boxes3d[:, 4] += 0.5
+ cur_gt_boxes3d[:, 5] += 0.5 # enlarge new added box to avoid too nearby boxes
+
+ extra_gt_obj_list = []
+ extra_gt_boxes3d_list = []
+ new_pts_list, new_pts_intensity_list = [], []
+ src_pts_flag = np.ones(pts_rect.shape[0], dtype=np.int32)
+
+ road_plane = self.get_road_plane(sample_id)
+ a, b, c, d = road_plane
+
+ while try_times > 0:
+ try_times -= 1
+
+            # note: randint's upper bound is exclusive
+            rand_idx = np.random.randint(0, len(self.gt_database))
+
+ new_gt_dict = self.gt_database[rand_idx]
+ new_gt_box3d = new_gt_dict['gt_box3d'].copy()
+ new_gt_points = new_gt_dict['points'].copy()
+ new_gt_intensity = new_gt_dict['intensity'].copy()
+ new_gt_obj = new_gt_dict['obj']
+ center = new_gt_box3d[0:3]
+ if PC_REDUCE_BY_RANGE and (self.check_pc_range(center) is False):
+ continue
+ if cnt > extra_gt_num:
+ break
+ if new_gt_points.__len__() < 5: # too few points
+ continue
+
+ # put it on the road plane
+ cur_height = (-d - a * center[0] - c * center[2]) / b
+ move_height = new_gt_box3d[1] - cur_height
+ new_gt_box3d[1] -= move_height
+ new_gt_points[:, 1] -= move_height
+
+ cnt += 1
+
+ iou3d = boxes_iou3d(new_gt_box3d.reshape(1, 7), cur_gt_boxes3d)
+
+ valid_flag = iou3d.max() < 1e-8
+ if not valid_flag:
+ continue
+
+ enlarged_box3d = new_gt_box3d.copy()
+ enlarged_box3d[3] += 2 # remove the points above and below the object
+ boxes_pts_mask_list = pts_utils.pts_in_boxes3d(pts_rect, enlarged_box3d.reshape(1, 7))
+ pt_mask_flag = (boxes_pts_mask_list[0] == 1)
+ src_pts_flag[pt_mask_flag] = 0 # remove the original points which are inside the new box
+
+ new_pts_list.append(new_gt_points)
+ new_pts_intensity_list.append(new_gt_intensity)
+ enlarged_box3d = new_gt_box3d.copy()
+ enlarged_box3d[4] += 0.5
+ enlarged_box3d[5] += 0.5 # enlarge new added box to avoid too nearby boxes
+ cur_gt_boxes3d = np.concatenate((cur_gt_boxes3d, enlarged_box3d.reshape(1, 7)), axis=0)
+ extra_gt_boxes3d_list.append(new_gt_box3d.reshape(1, 7))
+ extra_gt_obj_list.append(new_gt_obj)
+
+ if new_pts_list.__len__() == 0:
+ return False, pts_rect, pts_intensity, None, None
+
+ extra_gt_boxes3d = np.concatenate(extra_gt_boxes3d_list, axis=0)
+ # remove original points and add new points
+ pts_rect = pts_rect[src_pts_flag == 1]
+ pts_intensity = pts_intensity[src_pts_flag == 1]
+ new_pts_rect = np.concatenate(new_pts_list, axis=0)
+ new_pts_intensity = np.concatenate(new_pts_intensity_list, axis=0)
+ pts_rect = np.concatenate((pts_rect, new_pts_rect), axis=0)
+ pts_intensity = np.concatenate((pts_intensity, new_pts_intensity), axis=0)
+
+ return True, pts_rect, pts_intensity, extra_gt_boxes3d, extra_gt_obj_list
+
+ def aug_one_epoch_scene(self, base_id, data_save_dir, label_save_dir, split_list, log_fp=None):
+ for idx, sample_id in enumerate(self.image_idx_list):
+ sample_id = int(sample_id)
+ print('process gt sample (%s, id=%06d)' % (args.split, sample_id))
+
+ pts_lidar = self.get_lidar(sample_id)
+ calib = self.get_calib(sample_id)
+ pts_rect = calib.lidar_to_rect(pts_lidar[:, 0:3])
+ pts_img, pts_rect_depth = calib.rect_to_img(pts_rect)
+ img_shape = self.get_image_shape(sample_id)
+
+ pts_valid_flag = self.get_valid_flag(pts_rect, pts_img, pts_rect_depth, img_shape)
+ pts_rect = pts_rect[pts_valid_flag][:, 0:3]
+ pts_intensity = pts_lidar[pts_valid_flag][:, 3]
+
+ # all labels for checking overlapping
+ all_obj_list = self.filtrate_dc_objects(self.get_label(sample_id))
+ all_gt_boxes3d = np.zeros((all_obj_list.__len__(), 7), dtype=np.float32)
+ for k, obj in enumerate(all_obj_list):
+ all_gt_boxes3d[k, 0:3], all_gt_boxes3d[k, 3], all_gt_boxes3d[k, 4], all_gt_boxes3d[k, 5], \
+ all_gt_boxes3d[k, 6] = obj.pos, obj.h, obj.w, obj.l, obj.ry
+
+ # gt_boxes3d of current label
+ obj_list = self.filtrate_objects(self.get_label(sample_id))
+ if args.class_name != 'Car' and obj_list.__len__() == 0:
+ continue
+
+ # augment one scene
+ aug_flag, pts_rect, pts_intensity, extra_gt_boxes3d, extra_gt_obj_list = \
+ self.aug_one_scene(sample_id, pts_rect, pts_intensity, all_gt_boxes3d)
+
+ # save augment result to file
+ pts_info = np.concatenate((pts_rect, pts_intensity.reshape(-1, 1)), axis=1)
+ bin_file = os.path.join(data_save_dir, '%06d.bin' % (base_id + sample_id))
+ pts_info.astype(np.float32).tofile(bin_file)
+
+ # save filtered original gt_boxes3d
+ label_save_file = os.path.join(label_save_dir, '%06d.txt' % (base_id + sample_id))
+            with open(label_save_file, 'w') as f:
+                for obj in obj_list:
+                    f.write(obj.to_kitti_format() + '\n')
+
+                if aug_flag:
+                    # augment successfully
+                    save_kitti_format(calib, extra_gt_boxes3d, extra_gt_obj_list, img_shape=img_shape, save_fp=f)
+                else:
+                    extra_gt_boxes3d = np.zeros((0, 7), dtype=np.float32)
+ log_print('Save to file (new_obj: %s): %s' % (extra_gt_boxes3d.__len__(), label_save_file), fp=log_fp)
+ split_list.append('%06d' % (base_id + sample_id))
+
+ def generate_aug_scene(self, aug_times, log_fp=None):
+ data_save_dir = os.path.join(args.save_dir, 'rectified_data')
+ label_save_dir = os.path.join(args.save_dir, 'aug_label')
+ if not os.path.isdir(data_save_dir):
+ os.makedirs(data_save_dir)
+ if not os.path.isdir(label_save_dir):
+ os.makedirs(label_save_dir)
+
+ split_file = os.path.join(args.save_dir, '%s_aug.txt' % args.split)
+ split_list = self.image_idx_list[:]
+ for epoch in range(aug_times):
+ base_id = (epoch + 1) * 10000
+ self.aug_one_epoch_scene(base_id, data_save_dir, label_save_dir, split_list, log_fp=log_fp)
+
+ with open(split_file, 'w') as f:
+ for idx, sample_id in enumerate(split_list):
+ f.write(str(sample_id) + '\n')
+ log_print('Save split file to %s' % split_file, fp=log_fp)
+ target_dir = os.path.join(args.data_dir, 'KITTI/ImageSets/')
+        shutil.copy(split_file, target_dir)
+ log_print('Copy split file from %s to %s' % (split_file, target_dir), fp=log_fp)
+
+
+if __name__ == '__main__':
+ if not os.path.isdir(args.save_dir):
+ os.makedirs(args.save_dir)
+ info_file = os.path.join(args.save_dir, 'log_info.txt')
+
+ if args.mode == 'generator':
+ log_fp = open(info_file, 'w')
+
+        with open(args.gt_database_dir, 'rb') as f:
+            gt_database = pickle.load(f)
+        log_print('Loading gt_database(%d) from %s' % (len(gt_database), args.gt_database_dir), fp=log_fp)
+
+ dataset = AugSceneGenerator(root_dir=args.data_dir, gt_database=gt_database, split=args.split)
+ dataset.generate_aug_scene(aug_times=args.aug_times, log_fp=log_fp)
+
+ log_fp.close()
+
+ else:
+ pass
+
diff --git a/PaddleCV/Paddle3D/PointRCNN/tools/generate_gt_database.py b/PaddleCV/Paddle3D/PointRCNN/tools/generate_gt_database.py
new file mode 100644
index 0000000000000000000000000000000000000000..43290db734c9734fef8120031cab44a394f4323b
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/tools/generate_gt_database.py
@@ -0,0 +1,104 @@
+"""
+Generate GT database
+This code is based on https://github.com/sshaoshuai/PointRCNN/blob/master/tools/generate_gt_database.py
+"""
+
+import os
+import numpy as np
+import pickle
+
+from data.kitti_dataset import KittiDataset
+import pts_utils
+import argparse
+
+parser = argparse.ArgumentParser()
+parser.add_argument('--data_dir', type=str, default='./data')
+parser.add_argument('--save_dir', type=str, default='./data/gt_database')
+parser.add_argument('--class_name', type=str, default='Car')
+parser.add_argument('--split', type=str, default='train')
+args = parser.parse_args()
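+# Example usage (paths assume the default ./data layout):
+#   python tools/generate_gt_database.py --class_name Car --split train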
+
+
+class GTDatabaseGenerator(KittiDataset):
+ def __init__(self, root_dir, split='train', classes=args.class_name):
+ super(GTDatabaseGenerator, self).__init__(root_dir, split=split)
+ self.gt_database = None
+ if classes == 'Car':
+ self.classes = ('Background', 'Car')
+ elif classes == 'People':
+ self.classes = ('Background', 'Pedestrian', 'Cyclist')
+ elif classes == 'Pedestrian':
+ self.classes = ('Background', 'Pedestrian')
+ elif classes == 'Cyclist':
+ self.classes = ('Background', 'Cyclist')
+ else:
+ assert False, "Invalid classes: %s" % classes
+
+ def __len__(self):
+ raise NotImplementedError
+
+ def __getitem__(self, item):
+ raise NotImplementedError
+
+ def filtrate_objects(self, obj_list):
+ valid_obj_list = []
+ for obj in obj_list:
+ if obj.cls_type not in self.classes:
+ continue
+ if obj.level_str not in ['Easy', 'Moderate', 'Hard']:
+ continue
+ valid_obj_list.append(obj)
+
+ return valid_obj_list
+
+ def generate_gt_database(self):
+ gt_database = []
+ for idx, sample_id in enumerate(self.image_idx_list):
+ sample_id = int(sample_id)
+ print('process gt sample (id=%06d)' % sample_id)
+
+ pts_lidar = self.get_lidar(sample_id)
+ calib = self.get_calib(sample_id)
+ pts_rect = calib.lidar_to_rect(pts_lidar[:, 0:3])
+ pts_intensity = pts_lidar[:, 3]
+
+ obj_list = self.filtrate_objects(self.get_label(sample_id))
+
+ gt_boxes3d = np.zeros((obj_list.__len__(), 7), dtype=np.float32)
+ for k, obj in enumerate(obj_list):
+ gt_boxes3d[k, 0:3], gt_boxes3d[k, 3], gt_boxes3d[k, 4], gt_boxes3d[k, 5], gt_boxes3d[k, 6] \
+ = obj.pos, obj.h, obj.w, obj.l, obj.ry
+
+ if gt_boxes3d.__len__() == 0:
+ print('No gt object')
+ continue
+
+ boxes_pts_mask_list = pts_utils.pts_in_boxes3d(pts_rect, gt_boxes3d)
+
+ for k in range(boxes_pts_mask_list.shape[0]):
+ pt_mask_flag = (boxes_pts_mask_list[k] == 1)
+ cur_pts = pts_rect[pt_mask_flag].astype(np.float32)
+ cur_pts_intensity = pts_intensity[pt_mask_flag].astype(np.float32)
+ sample_dict = {'sample_id': sample_id,
+ 'cls_type': obj_list[k].cls_type,
+ 'gt_box3d': gt_boxes3d[k],
+ 'points': cur_pts,
+ 'intensity': cur_pts_intensity,
+ 'obj': obj_list[k]}
+ gt_database.append(sample_dict)
+
+ save_file_name = os.path.join(args.save_dir, '%s_gt_database_3level_%s.pkl' % (args.split, self.classes[-1]))
+ with open(save_file_name, 'wb') as f:
+ pickle.dump(gt_database, f)
+
+ self.gt_database = gt_database
+ print('Save refine training sample info file to %s' % save_file_name)
+
+
+if __name__ == '__main__':
+ dataset = GTDatabaseGenerator(root_dir=args.data_dir, split=args.split)
+ if not os.path.isdir(args.save_dir):
+ os.makedirs(args.save_dir)
+
+ dataset.generate_gt_database()
+
diff --git a/PaddleCV/Paddle3D/PointRCNN/tools/kitti_eval.py b/PaddleCV/Paddle3D/PointRCNN/tools/kitti_eval.py
new file mode 100644
index 0000000000000000000000000000000000000000..6d16ef487301fb7ba45b71c64cd3af337cef13c5
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/tools/kitti_eval.py
@@ -0,0 +1,71 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import os
+import sys
+import argparse
+
+
+def parse_args():
+ parser = argparse.ArgumentParser(
+ "KITTI mAP evaluation script")
+ parser.add_argument(
+ '--result_dir',
+ type=str,
+ default='./result_dir',
+ help='detection result directory to evaluate')
+ parser.add_argument(
+ '--data_dir',
+ type=str,
+ default='./data',
+ help='KITTI dataset root directory')
+ parser.add_argument(
+ '--split',
+ type=str,
+ default='val',
+ help='evaluation split, default val')
+ parser.add_argument(
+ '--class_name',
+ type=str,
+ default='Car',
+ help='evaluation class name, default Car')
+ args = parser.parse_args()
+ return args
+
+
+def kitti_eval():
+    if sys.version_info < (3, 6):
+ print("KITTI mAP evaluation can only run with python3.6+")
+ sys.exit(1)
+
+ args = parse_args()
+
+ label_dir = os.path.join(args.data_dir, 'KITTI/object/training', 'label_2')
+ split_file = os.path.join(args.data_dir, 'KITTI/ImageSets',
+ '{}.txt'.format(args.split))
+ final_output_dir = os.path.join(args.result_dir, 'final_result', 'data')
+ name_to_class = {'Car': 0, 'Pedestrian': 1, 'Cyclist': 2}
+
+ from tools.kitti_object_eval_python.evaluate import evaluate as kitti_evaluate
+ ap_result_str, ap_dict = kitti_evaluate(
+ label_dir, final_output_dir, label_split_file=split_file,
+ current_class=name_to_class[args.class_name])
+
+ print("KITTI evaluate: ", ap_result_str, ap_dict)
+
+
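+# Example usage (flags correspond to the argparse options above):
+#   python tools/kitti_eval.py --result_dir=./result_dir --data_dir=./data --split=val --class_name=Car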
+if __name__ == "__main__":
+ kitti_eval()
+
+
diff --git a/PaddleCV/Paddle3D/PointRCNN/tools/kitti_object_eval_python/LICENSE b/PaddleCV/Paddle3D/PointRCNN/tools/kitti_object_eval_python/LICENSE
new file mode 100644
index 0000000000000000000000000000000000000000..ab602974d200aa6849e6ad8220951ef9a78d9f08
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/tools/kitti_object_eval_python/LICENSE
@@ -0,0 +1,21 @@
+MIT License
+
+Copyright (c) 2018
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
diff --git a/PaddleCV/Paddle3D/PointRCNN/tools/kitti_object_eval_python/README.md b/PaddleCV/Paddle3D/PointRCNN/tools/kitti_object_eval_python/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..0e0e0c307c2db3f0486e594deae1c04ac49f55f3
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/tools/kitti_object_eval_python/README.md
@@ -0,0 +1,32 @@
+# kitti-object-eval-python
+**NOTE**: This is borrowed from [traveller59/kitti-object-eval-python](https://github.com/traveller59/kitti-object-eval-python)
+
+Fast KITTI object detection evaluation in Python (finishes in under 10 seconds); supports 2d/bev/3d/aos and coco-style AP. If you use the command line interface, numba needs some time to compile JIT functions.
+## Dependencies
+Only Python 3.6+ is supported; `numpy`, `skimage`, `numba` and `fire` are required. If you have Anaconda, just install `cudatoolkit` in Anaconda. Otherwise, please refer to this [page](https://github.com/numba/numba#custom-python-environments) to set up llvm and cuda for numba.
+* Install by conda:
+```
+conda install -c numba cudatoolkit=x.x  # 8.0, 9.0 or 9.1, depending on your environment
+```
+## Usage
+* commandline interface:
+```
+python evaluate.py evaluate --label_path=/path/to/your_gt_label_folder --result_path=/path/to/your_result_folder --label_split_file=/path/to/val.txt --current_class=0 --coco=False
+```
+* python interface:
+```Python
+import kitti_common as kitti
+from eval import get_official_eval_result, get_coco_eval_result
+def _read_imageset_file(path):
+ with open(path, 'r') as f:
+ lines = f.readlines()
+ return [int(line) for line in lines]
+det_path = "/path/to/your_result_folder"
+dt_annos = kitti.get_label_annos(det_path)
+gt_path = "/path/to/your_gt_label_folder"
+gt_split_file = "/path/to/val.txt" # from https://xiaozhichen.github.io/files/mv3d/imagesets.tar.gz
+val_image_ids = _read_imageset_file(gt_split_file)
+gt_annos = kitti.get_label_annos(gt_path, val_image_ids)
+print(get_official_eval_result(gt_annos, dt_annos, 0)) # 6s in my computer
+print(get_coco_eval_result(gt_annos, dt_annos, 0)) # 18s in my computer
+```
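+
+In this repository, `tools/kitti_eval.py` wraps this module to run the evaluation on PointRCNN detection results; see the `evaluate` function in `evaluate.py` for the underlying entry point.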
diff --git a/PaddleCV/Paddle3D/PointRCNN/tools/kitti_object_eval_python/eval.py b/PaddleCV/Paddle3D/PointRCNN/tools/kitti_object_eval_python/eval.py
new file mode 100644
index 0000000000000000000000000000000000000000..38101ca69a59cdc0603ebc82cac0338432457550
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/tools/kitti_object_eval_python/eval.py
@@ -0,0 +1,740 @@
+import numpy as np
+import numba
+import io as sysio
+from tools.kitti_object_eval_python.rotate_iou import rotate_iou_gpu_eval
+
+
+@numba.jit
+def get_thresholds(scores: np.ndarray, num_gt, num_sample_pts=41):
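+    """Pick score thresholds that sample the precision/recall curve at
+    roughly `num_sample_pts` (41 for KITTI) evenly spaced recall positions."""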
+ scores.sort()
+ scores = scores[::-1]
+ current_recall = 0
+ thresholds = []
+ for i, score in enumerate(scores):
+ l_recall = (i + 1) / num_gt
+ if i < (len(scores) - 1):
+ r_recall = (i + 2) / num_gt
+ else:
+ r_recall = l_recall
+ if (((r_recall - current_recall) < (current_recall - l_recall))
+ and (i < (len(scores) - 1))):
+ continue
+ # recall = l_recall
+ thresholds.append(score)
+ current_recall += 1 / (num_sample_pts - 1.0)
+ return thresholds
+
+
+def clean_data(gt_anno, dt_anno, current_class, difficulty):
+ CLASS_NAMES = ['car', 'pedestrian', 'cyclist']
+ MIN_HEIGHT = [40, 25, 25]
+ MAX_OCCLUSION = [0, 1, 2]
+ MAX_TRUNCATION = [0.15, 0.3, 0.5]
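+    # difficulty 0/1/2 = easy/moderate/hard: thresholds on minimum box height,
+    # maximum occlusion and maximum truncation per the KITTI protocol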
+ dc_bboxes, ignored_gt, ignored_dt = [], [], []
+ current_cls_name = CLASS_NAMES[current_class].lower()
+ num_gt = len(gt_anno["name"])
+ num_dt = len(dt_anno["name"])
+ num_valid_gt = 0
+ for i in range(num_gt):
+ bbox = gt_anno["bbox"][i]
+ gt_name = gt_anno["name"][i].lower()
+ height = bbox[3] - bbox[1]
+ valid_class = -1
+ if (gt_name == current_cls_name):
+ valid_class = 1
+ elif (current_cls_name == "Pedestrian".lower()
+ and "Person_sitting".lower() == gt_name):
+ valid_class = 0
+ elif (current_cls_name == "Car".lower() and "Van".lower() == gt_name):
+ valid_class = 0
+ else:
+ valid_class = -1
+ ignore = False
+ if ((gt_anno["occluded"][i] > MAX_OCCLUSION[difficulty])
+ or (gt_anno["truncated"][i] > MAX_TRUNCATION[difficulty])
+ or (height <= MIN_HEIGHT[difficulty])):
+ # if gt_anno["difficulty"][i] > difficulty or gt_anno["difficulty"][i] == -1:
+ ignore = True
+ if valid_class == 1 and not ignore:
+ ignored_gt.append(0)
+ num_valid_gt += 1
+ elif (valid_class == 0 or (ignore and (valid_class == 1))):
+ ignored_gt.append(1)
+ else:
+ ignored_gt.append(-1)
+ # for i in range(num_gt):
+ if gt_anno["name"][i] == "DontCare":
+ dc_bboxes.append(gt_anno["bbox"][i])
+ for i in range(num_dt):
+ if (dt_anno["name"][i].lower() == current_cls_name):
+ valid_class = 1
+ else:
+ valid_class = -1
+ height = abs(dt_anno["bbox"][i, 3] - dt_anno["bbox"][i, 1])
+ if height < MIN_HEIGHT[difficulty]:
+ ignored_dt.append(1)
+ elif valid_class == 1:
+ ignored_dt.append(0)
+ else:
+ ignored_dt.append(-1)
+
+ return num_valid_gt, ignored_gt, ignored_dt, dc_bboxes
+
+
+@numba.jit(nopython=True)
+def image_box_overlap(boxes, query_boxes, criterion=-1):
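+    # criterion -1: intersection / union (IoU); 0: intersection / area(boxes);
+    # 1: intersection / area(query_boxes)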
+ N = boxes.shape[0]
+ K = query_boxes.shape[0]
+ overlaps = np.zeros((N, K), dtype=boxes.dtype)
+ for k in range(K):
+ qbox_area = ((query_boxes[k, 2] - query_boxes[k, 0]) *
+ (query_boxes[k, 3] - query_boxes[k, 1]))
+ for n in range(N):
+ iw = (min(boxes[n, 2], query_boxes[k, 2]) -
+ max(boxes[n, 0], query_boxes[k, 0]))
+ if iw > 0:
+ ih = (min(boxes[n, 3], query_boxes[k, 3]) -
+ max(boxes[n, 1], query_boxes[k, 1]))
+ if ih > 0:
+ if criterion == -1:
+ ua = (
+ (boxes[n, 2] - boxes[n, 0]) *
+ (boxes[n, 3] - boxes[n, 1]) + qbox_area - iw * ih)
+ elif criterion == 0:
+ ua = ((boxes[n, 2] - boxes[n, 0]) *
+ (boxes[n, 3] - boxes[n, 1]))
+ elif criterion == 1:
+ ua = qbox_area
+ else:
+ ua = 1.0
+ overlaps[n, k] = iw * ih / ua
+ return overlaps
+
+
+def bev_box_overlap(boxes, qboxes, criterion=-1):
+ riou = rotate_iou_gpu_eval(boxes, qboxes, criterion)
+ return riou
+
+
+@numba.jit(nopython=True, parallel=True)
+def d3_box_overlap_kernel(boxes, qboxes, rinc, criterion=-1):
+    # ONLY supports overlap in CAMERA coordinates, not lidar.
+ N, K = boxes.shape[0], qboxes.shape[0]
+ for i in range(N):
+ for j in range(K):
+ if rinc[i, j] > 0:
+ # iw = (min(boxes[i, 1] + boxes[i, 4], qboxes[j, 1] +
+ # qboxes[j, 4]) - max(boxes[i, 1], qboxes[j, 1]))
+ iw = (min(boxes[i, 1], qboxes[j, 1]) - max(
+ boxes[i, 1] - boxes[i, 4], qboxes[j, 1] - qboxes[j, 4]))
+
+ if iw > 0:
+ area1 = boxes[i, 3] * boxes[i, 4] * boxes[i, 5]
+ area2 = qboxes[j, 3] * qboxes[j, 4] * qboxes[j, 5]
+ inc = iw * rinc[i, j]
+ if criterion == -1:
+ ua = (area1 + area2 - inc)
+ elif criterion == 0:
+ ua = area1
+ elif criterion == 1:
+ ua = area2
+ else:
+ ua = inc
+ rinc[i, j] = inc / ua
+ else:
+ rinc[i, j] = 0.0
+
+
+def d3_box_overlap(boxes, qboxes, criterion=-1):
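+    # 3D IoU: rotated BEV overlap in the x-z plane combined with the height
+    # overlap along y (camera coords) inside d3_box_overlap_kernel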
+ rinc = rotate_iou_gpu_eval(boxes[:, [0, 2, 3, 5, 6]],
+ qboxes[:, [0, 2, 3, 5, 6]], 2)
+ d3_box_overlap_kernel(boxes, qboxes, rinc, criterion)
+ return rinc
+
+
+@numba.jit(nopython=True)
+def compute_statistics_jit(overlaps,
+ gt_datas,
+ dt_datas,
+ ignored_gt,
+ ignored_det,
+ dc_bboxes,
+ metric,
+ min_overlap,
+ thresh=0,
+ compute_fp=False,
+ compute_aos=False):
+
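+    # greedily match detections to ground-truth boxes at `min_overlap`;
+    # returns (tp, fp, fn, AOS similarity, scores of matched detections)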
+ det_size = dt_datas.shape[0]
+ gt_size = gt_datas.shape[0]
+ dt_scores = dt_datas[:, -1]
+ dt_alphas = dt_datas[:, 4]
+ gt_alphas = gt_datas[:, 4]
+ dt_bboxes = dt_datas[:, :4]
+ gt_bboxes = gt_datas[:, :4]
+
+ assigned_detection = [False] * det_size
+ ignored_threshold = [False] * det_size
+ if compute_fp:
+ for i in range(det_size):
+ if (dt_scores[i] < thresh):
+ ignored_threshold[i] = True
+ NO_DETECTION = -10000000
+ tp, fp, fn, similarity = 0, 0, 0, 0
+ # thresholds = [0.0]
+ # delta = [0.0]
+ thresholds = np.zeros((gt_size, ))
+ thresh_idx = 0
+ delta = np.zeros((gt_size, ))
+ delta_idx = 0
+ for i in range(gt_size):
+ if ignored_gt[i] == -1:
+ continue
+ det_idx = -1
+ valid_detection = NO_DETECTION
+ max_overlap = 0
+ assigned_ignored_det = False
+
+ for j in range(det_size):
+ if (ignored_det[j] == -1):
+ continue
+ if (assigned_detection[j]):
+ continue
+ if (ignored_threshold[j]):
+ continue
+ overlap = overlaps[j, i]
+ dt_score = dt_scores[j]
+ if (not compute_fp and (overlap > min_overlap)
+ and dt_score > valid_detection):
+ det_idx = j
+ valid_detection = dt_score
+ elif (compute_fp and (overlap > min_overlap)
+ and (overlap > max_overlap or assigned_ignored_det)
+ and ignored_det[j] == 0):
+ max_overlap = overlap
+ det_idx = j
+ valid_detection = 1
+ assigned_ignored_det = False
+ elif (compute_fp and (overlap > min_overlap)
+ and (valid_detection == NO_DETECTION)
+ and ignored_det[j] == 1):
+ det_idx = j
+ valid_detection = 1
+ assigned_ignored_det = True
+
+ if (valid_detection == NO_DETECTION) and ignored_gt[i] == 0:
+ fn += 1
+ elif ((valid_detection != NO_DETECTION)
+ and (ignored_gt[i] == 1 or ignored_det[det_idx] == 1)):
+ assigned_detection[det_idx] = True
+ elif valid_detection != NO_DETECTION:
+ tp += 1
+ # thresholds.append(dt_scores[det_idx])
+ thresholds[thresh_idx] = dt_scores[det_idx]
+ thresh_idx += 1
+ if compute_aos:
+ # delta.append(gt_alphas[i] - dt_alphas[det_idx])
+ delta[delta_idx] = gt_alphas[i] - dt_alphas[det_idx]
+ delta_idx += 1
+
+ assigned_detection[det_idx] = True
+ if compute_fp:
+ for i in range(det_size):
+ if (not (assigned_detection[i] or ignored_det[i] == -1
+ or ignored_det[i] == 1 or ignored_threshold[i])):
+ fp += 1
+ nstuff = 0
+ if metric == 0:
+ overlaps_dt_dc = image_box_overlap(dt_bboxes, dc_bboxes, 0)
+ for i in range(dc_bboxes.shape[0]):
+ for j in range(det_size):
+ if (assigned_detection[j]):
+ continue
+ if (ignored_det[j] == -1 or ignored_det[j] == 1):
+ continue
+ if (ignored_threshold[j]):
+ continue
+ if overlaps_dt_dc[j, i] > min_overlap:
+ assigned_detection[j] = True
+ nstuff += 1
+ fp -= nstuff
+ if compute_aos:
+ tmp = np.zeros((fp + delta_idx, ))
+ # tmp = [0] * fp
+ for i in range(delta_idx):
+ tmp[i + fp] = (1.0 + np.cos(delta[i])) / 2.0
+ # tmp.append((1.0 + np.cos(delta[i])) / 2.0)
+ # assert len(tmp) == fp + tp
+ # assert len(delta) == tp
+ if tp > 0 or fp > 0:
+ similarity = np.sum(tmp)
+ else:
+ similarity = -1
+ return tp, fp, fn, similarity, thresholds[:thresh_idx]
+
+
+def get_split_parts(num, num_part):
+ same_part = num // num_part
+ remain_num = num % num_part
+ if remain_num == 0:
+ return [same_part] * num_part
+ else:
+ return [same_part] * num_part + [remain_num]
+
+
+@numba.jit(nopython=True)
+def fused_compute_statistics(overlaps,
+ pr,
+ gt_nums,
+ dt_nums,
+ dc_nums,
+ gt_datas,
+ dt_datas,
+ dontcares,
+ ignored_gts,
+ ignored_dets,
+ metric,
+ min_overlap,
+ thresholds,
+ compute_aos=False):
+ gt_num = 0
+ dt_num = 0
+ dc_num = 0
+ for i in range(gt_nums.shape[0]):
+ for t, thresh in enumerate(thresholds):
+ overlap = overlaps[dt_num:dt_num + dt_nums[i], gt_num:
+ gt_num + gt_nums[i]]
+
+ gt_data = gt_datas[gt_num:gt_num + gt_nums[i]]
+ dt_data = dt_datas[dt_num:dt_num + dt_nums[i]]
+ ignored_gt = ignored_gts[gt_num:gt_num + gt_nums[i]]
+ ignored_det = ignored_dets[dt_num:dt_num + dt_nums[i]]
+ dontcare = dontcares[dc_num:dc_num + dc_nums[i]]
+ tp, fp, fn, similarity, _ = compute_statistics_jit(
+ overlap,
+ gt_data,
+ dt_data,
+ ignored_gt,
+ ignored_det,
+ dontcare,
+ metric,
+ min_overlap=min_overlap,
+ thresh=thresh,
+ compute_fp=True,
+ compute_aos=compute_aos)
+ pr[t, 0] += tp
+ pr[t, 1] += fp
+ pr[t, 2] += fn
+ if similarity != -1:
+ pr[t, 3] += similarity
+ gt_num += gt_nums[i]
+ dt_num += dt_nums[i]
+ dc_num += dc_nums[i]
+
+
+def calculate_iou_partly(gt_annos, dt_annos, metric, num_parts=50):
+ """fast iou algorithm. this function can be used independently to
+ do result analysis. Must be used in CAMERA coordinate system.
+ Args:
+ gt_annos: dict, must from get_label_annos() in kitti_common.py
+ dt_annos: dict, must from get_label_annos() in kitti_common.py
+ metric: eval type. 0: bbox, 1: bev, 2: 3d
+ num_parts: int. a parameter for fast calculate algorithm
+ """
+ assert len(gt_annos) == len(dt_annos)
+ total_dt_num = np.stack([len(a["name"]) for a in dt_annos], 0)
+ total_gt_num = np.stack([len(a["name"]) for a in gt_annos], 0)
+ num_examples = len(gt_annos)
+ split_parts = get_split_parts(num_examples, num_parts)
+ parted_overlaps = []
+ example_idx = 0
+
+ for num_part in split_parts:
+ gt_annos_part = gt_annos[example_idx:example_idx + num_part]
+ dt_annos_part = dt_annos[example_idx:example_idx + num_part]
+ if metric == 0:
+ gt_boxes = np.concatenate([a["bbox"] for a in gt_annos_part], 0)
+ dt_boxes = np.concatenate([a["bbox"] for a in dt_annos_part], 0)
+ overlap_part = image_box_overlap(gt_boxes, dt_boxes)
+ elif metric == 1:
+ loc = np.concatenate(
+ [a["location"][:, [0, 2]] for a in gt_annos_part], 0)
+ dims = np.concatenate(
+ [a["dimensions"][:, [0, 2]] for a in gt_annos_part], 0)
+ rots = np.concatenate([a["rotation_y"] for a in gt_annos_part], 0)
+ gt_boxes = np.concatenate(
+ [loc, dims, rots[..., np.newaxis]], axis=1)
+ loc = np.concatenate(
+ [a["location"][:, [0, 2]] for a in dt_annos_part], 0)
+ dims = np.concatenate(
+ [a["dimensions"][:, [0, 2]] for a in dt_annos_part], 0)
+ rots = np.concatenate([a["rotation_y"] for a in dt_annos_part], 0)
+ dt_boxes = np.concatenate(
+ [loc, dims, rots[..., np.newaxis]], axis=1)
+ overlap_part = bev_box_overlap(gt_boxes, dt_boxes).astype(
+ np.float64)
+ elif metric == 2:
+ loc = np.concatenate([a["location"] for a in gt_annos_part], 0)
+ dims = np.concatenate([a["dimensions"] for a in gt_annos_part], 0)
+ rots = np.concatenate([a["rotation_y"] for a in gt_annos_part], 0)
+ gt_boxes = np.concatenate(
+ [loc, dims, rots[..., np.newaxis]], axis=1)
+ loc = np.concatenate([a["location"] for a in dt_annos_part], 0)
+ dims = np.concatenate([a["dimensions"] for a in dt_annos_part], 0)
+ rots = np.concatenate([a["rotation_y"] for a in dt_annos_part], 0)
+ dt_boxes = np.concatenate(
+ [loc, dims, rots[..., np.newaxis]], axis=1)
+ overlap_part = d3_box_overlap(gt_boxes, dt_boxes).astype(
+ np.float64)
+ else:
+ raise ValueError("unknown metric")
+ parted_overlaps.append(overlap_part)
+ example_idx += num_part
+ overlaps = []
+ example_idx = 0
+ for j, num_part in enumerate(split_parts):
+ gt_annos_part = gt_annos[example_idx:example_idx + num_part]
+ dt_annos_part = dt_annos[example_idx:example_idx + num_part]
+ gt_num_idx, dt_num_idx = 0, 0
+ for i in range(num_part):
+ gt_box_num = total_gt_num[example_idx + i]
+ dt_box_num = total_dt_num[example_idx + i]
+ overlaps.append(
+ parted_overlaps[j][gt_num_idx:gt_num_idx + gt_box_num,
+ dt_num_idx:dt_num_idx + dt_box_num])
+ gt_num_idx += gt_box_num
+ dt_num_idx += dt_box_num
+ example_idx += num_part
+
+ return overlaps, parted_overlaps, total_gt_num, total_dt_num
+
+
+def _prepare_data(gt_annos, dt_annos, current_class, difficulty):
+ gt_datas_list = []
+ dt_datas_list = []
+ total_dc_num = []
+ ignored_gts, ignored_dets, dontcares = [], [], []
+ total_num_valid_gt = 0
+ for i in range(len(gt_annos)):
+ rets = clean_data(gt_annos[i], dt_annos[i], current_class, difficulty)
+ num_valid_gt, ignored_gt, ignored_det, dc_bboxes = rets
+ ignored_gts.append(np.array(ignored_gt, dtype=np.int64))
+ ignored_dets.append(np.array(ignored_det, dtype=np.int64))
+ if len(dc_bboxes) == 0:
+ dc_bboxes = np.zeros((0, 4)).astype(np.float64)
+ else:
+ dc_bboxes = np.stack(dc_bboxes, 0).astype(np.float64)
+ total_dc_num.append(dc_bboxes.shape[0])
+ dontcares.append(dc_bboxes)
+ total_num_valid_gt += num_valid_gt
+ gt_datas = np.concatenate(
+ [gt_annos[i]["bbox"], gt_annos[i]["alpha"][..., np.newaxis]], 1)
+ dt_datas = np.concatenate([
+ dt_annos[i]["bbox"], dt_annos[i]["alpha"][..., np.newaxis],
+ dt_annos[i]["score"][..., np.newaxis]
+ ], 1)
+ gt_datas_list.append(gt_datas)
+ dt_datas_list.append(dt_datas)
+ total_dc_num = np.stack(total_dc_num, axis=0)
+ return (gt_datas_list, dt_datas_list, ignored_gts, ignored_dets, dontcares,
+ total_dc_num, total_num_valid_gt)
+
+
+def eval_class(gt_annos,
+ dt_annos,
+ current_classes,
+ difficultys,
+ metric,
+ min_overlaps,
+ compute_aos=False,
+ num_parts=50):
+ """Kitti eval. support 2d/bev/3d/aos eval. support 0.5:0.05:0.95 coco AP.
+ Args:
+ gt_annos: dict, must from get_label_annos() in kitti_common.py
+ dt_annos: dict, must from get_label_annos() in kitti_common.py
+ current_classes: list of int, 0: car, 1: pedestrian, 2: cyclist
+ difficultys: list of int. eval difficulty, 0: easy, 1: normal, 2: hard
+ metric: eval type. 0: bbox, 1: bev, 2: 3d
+ min_overlaps: float, min overlap. format: [num_overlap, metric, class].
+ num_parts: int. a parameter for fast calculate algorithm
+
+ Returns:
+ dict of recall, precision and aos
+ """
+ assert len(gt_annos) == len(dt_annos)
+ num_examples = len(gt_annos)
+ split_parts = get_split_parts(num_examples, num_parts)
+
+ rets = calculate_iou_partly(dt_annos, gt_annos, metric, num_parts)
+ overlaps, parted_overlaps, total_dt_num, total_gt_num = rets
+ N_SAMPLE_PTS = 41
+ num_minoverlap = len(min_overlaps)
+ num_class = len(current_classes)
+ num_difficulty = len(difficultys)
+ precision = np.zeros(
+ [num_class, num_difficulty, num_minoverlap, N_SAMPLE_PTS])
+ recall = np.zeros(
+ [num_class, num_difficulty, num_minoverlap, N_SAMPLE_PTS])
+ aos = np.zeros([num_class, num_difficulty, num_minoverlap, N_SAMPLE_PTS])
+ for m, current_class in enumerate(current_classes):
+ for l, difficulty in enumerate(difficultys):
+ rets = _prepare_data(gt_annos, dt_annos, current_class, difficulty)
+ (gt_datas_list, dt_datas_list, ignored_gts, ignored_dets,
+ dontcares, total_dc_num, total_num_valid_gt) = rets
+ for k, min_overlap in enumerate(min_overlaps[:, metric, m]):
+ thresholdss = []
+ for i in range(len(gt_annos)):
+ rets = compute_statistics_jit(
+ overlaps[i],
+ gt_datas_list[i],
+ dt_datas_list[i],
+ ignored_gts[i],
+ ignored_dets[i],
+ dontcares[i],
+ metric,
+ min_overlap=min_overlap,
+ thresh=0.0,
+ compute_fp=False)
+ tp, fp, fn, similarity, thresholds = rets
+ thresholdss += thresholds.tolist()
+ thresholdss = np.array(thresholdss)
+ thresholds = get_thresholds(thresholdss, total_num_valid_gt)
+ thresholds = np.array(thresholds)
+ pr = np.zeros([len(thresholds), 4])
+ idx = 0
+ for j, num_part in enumerate(split_parts):
+ gt_datas_part = np.concatenate(
+ gt_datas_list[idx:idx + num_part], 0)
+ dt_datas_part = np.concatenate(
+ dt_datas_list[idx:idx + num_part], 0)
+ dc_datas_part = np.concatenate(
+ dontcares[idx:idx + num_part], 0)
+ ignored_dets_part = np.concatenate(
+ ignored_dets[idx:idx + num_part], 0)
+ ignored_gts_part = np.concatenate(
+ ignored_gts[idx:idx + num_part], 0)
+ fused_compute_statistics(
+ parted_overlaps[j],
+ pr,
+ total_gt_num[idx:idx + num_part],
+ total_dt_num[idx:idx + num_part],
+ total_dc_num[idx:idx + num_part],
+ gt_datas_part,
+ dt_datas_part,
+ dc_datas_part,
+ ignored_gts_part,
+ ignored_dets_part,
+ metric,
+ min_overlap=min_overlap,
+ thresholds=thresholds,
+ compute_aos=compute_aos)
+ idx += num_part
+ for i in range(len(thresholds)):
+ recall[m, l, k, i] = pr[i, 0] / (pr[i, 0] + pr[i, 2])
+ precision[m, l, k, i] = pr[i, 0] / (pr[i, 0] + pr[i, 1])
+ if compute_aos:
+ aos[m, l, k, i] = pr[i, 3] / (pr[i, 0] + pr[i, 1])
+ for i in range(len(thresholds)):
+ precision[m, l, k, i] = np.max(
+ precision[m, l, k, i:], axis=-1)
+ recall[m, l, k, i] = np.max(recall[m, l, k, i:], axis=-1)
+ if compute_aos:
+ aos[m, l, k, i] = np.max(aos[m, l, k, i:], axis=-1)
+ ret_dict = {
+ "recall": recall,
+ "precision": precision,
+ "orientation": aos,
+ }
+ return ret_dict
+
+
+def get_mAP(prec):
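+    # 11-point interpolated AP: average precision at every 4th of the 41
+    # recall sample positions (recall = 0.0, 0.1, ..., 1.0), as a percentage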
+ sums = 0
+ for i in range(0, prec.shape[-1], 4):
+ sums = sums + prec[..., i]
+ return sums / 11 * 100
+
+
+def print_str(value, *arg, sstream=None):
+ if sstream is None:
+ sstream = sysio.StringIO()
+ sstream.truncate(0)
+ sstream.seek(0)
+ print(value, *arg, file=sstream)
+ return sstream.getvalue()
+
+
+def do_eval(gt_annos,
+ dt_annos,
+ current_classes,
+ min_overlaps,
+ compute_aos=False):
+ # min_overlaps: [num_minoverlap, metric, num_class]
+ difficultys = [0, 1, 2]
+ ret = eval_class(gt_annos, dt_annos, current_classes, difficultys, 0,
+ min_overlaps, compute_aos)
+ # ret: [num_class, num_diff, num_minoverlap, num_sample_points]
+ mAP_bbox = get_mAP(ret["precision"])
+ mAP_aos = None
+ if compute_aos:
+ mAP_aos = get_mAP(ret["orientation"])
+ ret = eval_class(gt_annos, dt_annos, current_classes, difficultys, 1,
+ min_overlaps)
+ mAP_bev = get_mAP(ret["precision"])
+ ret = eval_class(gt_annos, dt_annos, current_classes, difficultys, 2,
+ min_overlaps)
+ mAP_3d = get_mAP(ret["precision"])
+ return mAP_bbox, mAP_bev, mAP_3d, mAP_aos
+
+
+def do_coco_style_eval(gt_annos, dt_annos, current_classes, overlap_ranges,
+ compute_aos):
+ # overlap_ranges: [range, metric, num_class]
+ min_overlaps = np.zeros([10, *overlap_ranges.shape[1:]])
+ for i in range(overlap_ranges.shape[1]):
+ for j in range(overlap_ranges.shape[2]):
+ min_overlaps[:, i, j] = np.linspace(*overlap_ranges[:, i, j])
+ mAP_bbox, mAP_bev, mAP_3d, mAP_aos = do_eval(
+ gt_annos, dt_annos, current_classes, min_overlaps, compute_aos)
+ # ret: [num_class, num_diff, num_minoverlap]
+ mAP_bbox = mAP_bbox.mean(-1)
+ mAP_bev = mAP_bev.mean(-1)
+ mAP_3d = mAP_3d.mean(-1)
+ if mAP_aos is not None:
+ mAP_aos = mAP_aos.mean(-1)
+ return mAP_bbox, mAP_bev, mAP_3d, mAP_aos
+
+
+def get_official_eval_result(gt_annos, dt_annos, current_classes):
+ overlap_0_7 = np.array([[0.7, 0.5, 0.5, 0.7,
+ 0.5], [0.7, 0.5, 0.5, 0.7, 0.5],
+ [0.7, 0.5, 0.5, 0.7, 0.5]])
+ overlap_0_5 = np.array([[0.7, 0.5, 0.5, 0.7,
+ 0.5], [0.5, 0.25, 0.25, 0.5, 0.25],
+ [0.5, 0.25, 0.25, 0.5, 0.25]])
+ min_overlaps = np.stack([overlap_0_7, overlap_0_5], axis=0) # [2, 3, 5]
+ class_to_name = {
+ 0: 'Car',
+ 1: 'Pedestrian',
+ 2: 'Cyclist',
+ 3: 'Van',
+ 4: 'Person_sitting',
+ }
+ name_to_class = {v: n for n, v in class_to_name.items()}
+ if not isinstance(current_classes, (list, tuple)):
+ current_classes = [current_classes]
+ current_classes_int = []
+ for curcls in current_classes:
+ if isinstance(curcls, str):
+ current_classes_int.append(name_to_class[curcls])
+ else:
+ current_classes_int.append(curcls)
+ current_classes = current_classes_int
+ min_overlaps = min_overlaps[:, :, current_classes]
+ result = ''
+ # check whether alpha is valid
+ compute_aos = False
+ for anno in dt_annos:
+ if anno['alpha'].shape[0] != 0:
+ if anno['alpha'][0] != -10:
+ compute_aos = True
+ break
+ mAPbbox, mAPbev, mAP3d, mAPaos = do_eval(
+ gt_annos, dt_annos, current_classes, min_overlaps, compute_aos)
+
+ ret_dict = {}
+ for j, curcls in enumerate(current_classes):
+ # mAP threshold array: [num_minoverlap, metric, class]
+ # mAP result: [num_class, num_diff, num_minoverlap]
+ for i in range(min_overlaps.shape[0]):
+ result += print_str(
+ (f"{class_to_name[curcls]} "
+ "AP@{:.2f}, {:.2f}, {:.2f}:".format(*min_overlaps[i, :, j])))
+ result += print_str((f"bbox AP:{mAPbbox[j, 0, i]:.4f}, "
+ f"{mAPbbox[j, 1, i]:.4f}, "
+ f"{mAPbbox[j, 2, i]:.4f}"))
+ result += print_str((f"bev AP:{mAPbev[j, 0, i]:.4f}, "
+ f"{mAPbev[j, 1, i]:.4f}, "
+ f"{mAPbev[j, 2, i]:.4f}"))
+ result += print_str((f"3d AP:{mAP3d[j, 0, i]:.4f}, "
+ f"{mAP3d[j, 1, i]:.4f}, "
+ f"{mAP3d[j, 2, i]:.4f}"))
+
+ if compute_aos:
+ result += print_str((f"aos AP:{mAPaos[j, 0, i]:.2f}, "
+ f"{mAPaos[j, 1, i]:.2f}, "
+ f"{mAPaos[j, 2, i]:.2f}"))
+ ret_dict['Car_3d_easy'] = mAP3d[0, 0, 0]
+ ret_dict['Car_3d_moderate'] = mAP3d[0, 1, 0]
+ ret_dict['Car_3d_hard'] = mAP3d[0, 2, 0]
+ ret_dict['Car_bev_easy'] = mAPbev[0, 0, 0]
+ ret_dict['Car_bev_moderate'] = mAPbev[0, 1, 0]
+ ret_dict['Car_bev_hard'] = mAPbev[0, 2, 0]
+ ret_dict['Car_image_easy'] = mAPbbox[0, 0, 0]
+ ret_dict['Car_image_moderate'] = mAPbbox[0, 1, 0]
+ ret_dict['Car_image_hard'] = mAPbbox[0, 2, 0]
+
+ return result, ret_dict
+
+
+def get_coco_eval_result(gt_annos, dt_annos, current_classes):
+ class_to_name = {
+ 0: 'Car',
+ 1: 'Pedestrian',
+ 2: 'Cyclist',
+ 3: 'Van',
+ 4: 'Person_sitting',
+ }
+ class_to_range = {
+ 0: [0.5, 0.95, 10],
+ 1: [0.25, 0.7, 10],
+ 2: [0.25, 0.7, 10],
+ 3: [0.5, 0.95, 10],
+ 4: [0.25, 0.7, 10],
+ }
+ name_to_class = {v: n for n, v in class_to_name.items()}
+ if not isinstance(current_classes, (list, tuple)):
+ current_classes = [current_classes]
+ current_classes_int = []
+ for curcls in current_classes:
+ if isinstance(curcls, str):
+ current_classes_int.append(name_to_class[curcls])
+ else:
+ current_classes_int.append(curcls)
+ current_classes = current_classes_int
+ overlap_ranges = np.zeros([3, 3, len(current_classes)])
+ for i, curcls in enumerate(current_classes):
+ overlap_ranges[:, :, i] = np.array(
+ class_to_range[curcls])[:, np.newaxis]
+ result = ''
+ # check whether alpha is valid
+ compute_aos = False
+ for anno in dt_annos:
+ if anno['alpha'].shape[0] != 0:
+ if anno['alpha'][0] != -10:
+ compute_aos = True
+ break
+ mAPbbox, mAPbev, mAP3d, mAPaos = do_coco_style_eval(
+ gt_annos, dt_annos, current_classes, overlap_ranges, compute_aos)
+ for j, curcls in enumerate(current_classes):
+ # mAP threshold array: [num_minoverlap, metric, class]
+ # mAP result: [num_class, num_diff, num_minoverlap]
+ o_range = np.array(class_to_range[curcls])[[0, 2, 1]]
+ o_range[1] = (o_range[2] - o_range[0]) / (o_range[1] - 1)
+ result += print_str((f"{class_to_name[curcls]} "
+ "coco AP@{:.2f}:{:.2f}:{:.2f}:".format(*o_range)))
+ result += print_str((f"bbox AP:{mAPbbox[j, 0]:.2f}, "
+ f"{mAPbbox[j, 1]:.2f}, "
+ f"{mAPbbox[j, 2]:.2f}"))
+ result += print_str((f"bev AP:{mAPbev[j, 0]:.2f}, "
+ f"{mAPbev[j, 1]:.2f}, "
+ f"{mAPbev[j, 2]:.2f}"))
+ result += print_str((f"3d AP:{mAP3d[j, 0]:.2f}, "
+ f"{mAP3d[j, 1]:.2f}, "
+ f"{mAP3d[j, 2]:.2f}"))
+ if compute_aos:
+ result += print_str((f"aos AP:{mAPaos[j, 0]:.2f}, "
+ f"{mAPaos[j, 1]:.2f}, "
+ f"{mAPaos[j, 2]:.2f}"))
+ return result
diff --git a/PaddleCV/Paddle3D/PointRCNN/tools/kitti_object_eval_python/evaluate.py b/PaddleCV/Paddle3D/PointRCNN/tools/kitti_object_eval_python/evaluate.py
new file mode 100644
index 0000000000000000000000000000000000000000..e822ae464618eb05c4123b7bd05cec875a567b70
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/tools/kitti_object_eval_python/evaluate.py
@@ -0,0 +1,32 @@
+import time
+import fire
+
+import tools.kitti_object_eval_python.kitti_common as kitti
+from tools.kitti_object_eval_python.eval import get_official_eval_result, get_coco_eval_result
+
+
+def _read_imageset_file(path):
+ with open(path, 'r') as f:
+ lines = f.readlines()
+ return [int(line) for line in lines]
+
+
+def evaluate(label_path,
+ result_path,
+ label_split_file,
+ current_class=0,
+ coco=False,
+ score_thresh=-1):
+ dt_annos = kitti.get_label_annos(result_path)
+ if score_thresh > 0:
+ dt_annos = kitti.filter_annos_low_score(dt_annos, score_thresh)
+ val_image_ids = _read_imageset_file(label_split_file)
+ gt_annos = kitti.get_label_annos(label_path, val_image_ids)
+ if coco:
+ return get_coco_eval_result(gt_annos, dt_annos, current_class)
+ else:
+ return get_official_eval_result(gt_annos, dt_annos, current_class)
+
+
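+# Example (see README.md):
+#   python evaluate.py evaluate --label_path=/path/to/gt_labels \
+#       --result_path=/path/to/results --label_split_file=/path/to/val.txt --current_class=0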
+if __name__ == '__main__':
+ fire.Fire()
diff --git a/PaddleCV/Paddle3D/PointRCNN/tools/kitti_object_eval_python/kitti_common.py b/PaddleCV/Paddle3D/PointRCNN/tools/kitti_object_eval_python/kitti_common.py
new file mode 100644
index 0000000000000000000000000000000000000000..e7e254ea4a27af9656757bbfb1f932c1348f59fe
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/tools/kitti_object_eval_python/kitti_common.py
@@ -0,0 +1,411 @@
+import concurrent.futures as futures
+import os
+import pathlib
+import re
+from collections import OrderedDict
+
+import numpy as np
+from skimage import io
+
+def get_image_index_str(img_idx):
+ return "{:06d}".format(img_idx)
+
+
+def get_kitti_info_path(idx,
+ prefix,
+ info_type='image_2',
+ file_tail='.png',
+ training=True,
+ relative_path=True):
+ img_idx_str = get_image_index_str(idx)
+ img_idx_str += file_tail
+ prefix = pathlib.Path(prefix)
+ if training:
+ file_path = pathlib.Path('training') / info_type / img_idx_str
+ else:
+ file_path = pathlib.Path('testing') / info_type / img_idx_str
+ if not (prefix / file_path).exists():
+ raise ValueError("file not exist: {}".format(file_path))
+ if relative_path:
+ return str(file_path)
+ else:
+ return str(prefix / file_path)
+
+
+def get_image_path(idx, prefix, training=True, relative_path=True):
+ return get_kitti_info_path(idx, prefix, 'image_2', '.png', training,
+ relative_path)
+
+
+def get_label_path(idx, prefix, training=True, relative_path=True):
+ return get_kitti_info_path(idx, prefix, 'label_2', '.txt', training,
+ relative_path)
+
+
+def get_velodyne_path(idx, prefix, training=True, relative_path=True):
+ return get_kitti_info_path(idx, prefix, 'velodyne', '.bin', training,
+ relative_path)
+
+
+def get_calib_path(idx, prefix, training=True, relative_path=True):
+ return get_kitti_info_path(idx, prefix, 'calib', '.txt', training,
+ relative_path)
+
+
+def _extend_matrix(mat):
+ mat = np.concatenate([mat, np.array([[0., 0., 0., 1.]])], axis=0)
+ return mat
+
+
+def get_kitti_image_info(path,
+ training=True,
+ label_info=True,
+ velodyne=False,
+ calib=False,
+ image_ids=7481,
+ extend_matrix=True,
+ num_worker=8,
+ relative_path=True,
+ with_imageshape=True):
+ # image_infos = []
+ root_path = pathlib.Path(path)
+ if not isinstance(image_ids, list):
+ image_ids = list(range(image_ids))
+
+ def map_func(idx):
+ image_info = {'image_idx': idx}
+ annotations = None
+ if velodyne:
+ image_info['velodyne_path'] = get_velodyne_path(
+ idx, path, training, relative_path)
+ image_info['img_path'] = get_image_path(idx, path, training,
+ relative_path)
+ if with_imageshape:
+ img_path = image_info['img_path']
+ if relative_path:
+ img_path = str(root_path / img_path)
+ image_info['img_shape'] = np.array(
+ io.imread(img_path).shape[:2], dtype=np.int32)
+ if label_info:
+ label_path = get_label_path(idx, path, training, relative_path)
+ if relative_path:
+ label_path = str(root_path / label_path)
+ annotations = get_label_anno(label_path)
+ if calib:
+ calib_path = get_calib_path(
+ idx, path, training, relative_path=False)
+ with open(calib_path, 'r') as f:
+ lines = f.readlines()
+ P0 = np.array(
+ [float(info) for info in lines[0].split(' ')[1:13]]).reshape(
+ [3, 4])
+ P1 = np.array(
+ [float(info) for info in lines[1].split(' ')[1:13]]).reshape(
+ [3, 4])
+ P2 = np.array(
+ [float(info) for info in lines[2].split(' ')[1:13]]).reshape(
+ [3, 4])
+ P3 = np.array(
+ [float(info) for info in lines[3].split(' ')[1:13]]).reshape(
+ [3, 4])
+ if extend_matrix:
+ P0 = _extend_matrix(P0)
+ P1 = _extend_matrix(P1)
+ P2 = _extend_matrix(P2)
+ P3 = _extend_matrix(P3)
+ image_info['calib/P0'] = P0
+ image_info['calib/P1'] = P1
+ image_info['calib/P2'] = P2
+ image_info['calib/P3'] = P3
+ R0_rect = np.array([
+ float(info) for info in lines[4].split(' ')[1:10]
+ ]).reshape([3, 3])
+ if extend_matrix:
+ rect_4x4 = np.zeros([4, 4], dtype=R0_rect.dtype)
+ rect_4x4[3, 3] = 1.
+ rect_4x4[:3, :3] = R0_rect
+ else:
+ rect_4x4 = R0_rect
+ image_info['calib/R0_rect'] = rect_4x4
+ Tr_velo_to_cam = np.array([
+ float(info) for info in lines[5].split(' ')[1:13]
+ ]).reshape([3, 4])
+ Tr_imu_to_velo = np.array([
+ float(info) for info in lines[6].split(' ')[1:13]
+ ]).reshape([3, 4])
+ if extend_matrix:
+ Tr_velo_to_cam = _extend_matrix(Tr_velo_to_cam)
+ Tr_imu_to_velo = _extend_matrix(Tr_imu_to_velo)
+ image_info['calib/Tr_velo_to_cam'] = Tr_velo_to_cam
+ image_info['calib/Tr_imu_to_velo'] = Tr_imu_to_velo
+ if annotations is not None:
+ image_info['annos'] = annotations
+ add_difficulty_to_annos(image_info)
+ return image_info
+
+ with futures.ThreadPoolExecutor(num_worker) as executor:
+ image_infos = executor.map(map_func, image_ids)
+ return list(image_infos)
+
+
+def filter_kitti_anno(image_anno,
+ used_classes,
+ used_difficulty=None,
+ dontcare_iou=None):
+ if not isinstance(used_classes, (list, tuple)):
+ used_classes = [used_classes]
+ img_filtered_annotations = {}
+ relevant_annotation_indices = [
+ i for i, x in enumerate(image_anno['name']) if x in used_classes
+ ]
+ for key in image_anno.keys():
+ img_filtered_annotations[key] = (
+ image_anno[key][relevant_annotation_indices])
+ if used_difficulty is not None:
+ relevant_annotation_indices = [
+ i for i, x in enumerate(img_filtered_annotations['difficulty'])
+ if x in used_difficulty
+ ]
+ for key in image_anno.keys():
+ img_filtered_annotations[key] = (
+ img_filtered_annotations[key][relevant_annotation_indices])
+
+ if 'DontCare' in used_classes and dontcare_iou is not None:
+ dont_care_indices = [
+ i for i, x in enumerate(img_filtered_annotations['name'])
+ if x == 'DontCare'
+ ]
+ # bounding box format [y_min, x_min, y_max, x_max]
+ all_boxes = img_filtered_annotations['bbox']
+ ious = iou(all_boxes, all_boxes[dont_care_indices])
+
+ # Remove all bounding boxes that overlap with a dontcare region.
+ if ious.size > 0:
+ boxes_to_remove = np.amax(ious, axis=1) > dontcare_iou
+ for key in image_anno.keys():
+ img_filtered_annotations[key] = (img_filtered_annotations[key][
+ np.logical_not(boxes_to_remove)])
+ return img_filtered_annotations
+
+def filter_annos_low_score(image_annos, thresh):
+ new_image_annos = []
+ for anno in image_annos:
+ img_filtered_annotations = {}
+ relevant_annotation_indices = [
+ i for i, s in enumerate(anno['score']) if s >= thresh
+ ]
+ for key in anno.keys():
+ img_filtered_annotations[key] = (
+ anno[key][relevant_annotation_indices])
+ new_image_annos.append(img_filtered_annotations)
+ return new_image_annos
+
+def kitti_result_line(result_dict, precision=4):
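+    # emit one KITTI label line:
+    # name truncated occluded alpha bbox(4) dimensions(3) location(3) rotation_y [score]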
+ prec_float = "{" + ":.{}f".format(precision) + "}"
+ res_line = []
+ all_field_default = OrderedDict([
+ ('name', None),
+ ('truncated', -1),
+ ('occluded', -1),
+ ('alpha', -10),
+ ('bbox', None),
+ ('dimensions', [-1, -1, -1]),
+ ('location', [-1000, -1000, -1000]),
+ ('rotation_y', -10),
+ ('score', None),
+ ])
+ res_dict = [(key, None) for key, val in all_field_default.items()]
+ res_dict = OrderedDict(res_dict)
+ for key, val in result_dict.items():
+ if all_field_default[key] is None and val is None:
+ raise ValueError("you must specify a value for {}".format(key))
+ res_dict[key] = val
+
+ for key, val in res_dict.items():
+ if key == 'name':
+ res_line.append(val)
+ elif key in ['truncated', 'alpha', 'rotation_y', 'score']:
+ if val is None:
+ res_line.append(str(all_field_default[key]))
+ else:
+ res_line.append(prec_float.format(val))
+ elif key == 'occluded':
+ if val is None:
+ res_line.append(str(all_field_default[key]))
+ else:
+ res_line.append('{}'.format(val))
+ elif key in ['bbox', 'dimensions', 'location']:
+ if val is None:
+ res_line += [str(v) for v in all_field_default[key]]
+ else:
+ res_line += [prec_float.format(v) for v in val]
+ else:
+ raise ValueError("unknown key. supported key:{}".format(
+ res_dict.keys()))
+ return ' '.join(res_line)
+
+
+def add_difficulty_to_annos(info):
+ min_height = [40, 25,
+ 25] # minimum height for evaluated groundtruth/detections
+ max_occlusion = [
+ 0, 1, 2
+ ] # maximum occlusion level of the groundtruth used for evaluation
+ max_trunc = [
+ 0.15, 0.3, 0.5
+ ] # maximum truncation level of the groundtruth used for evaluation
+ annos = info['annos']
+ dims = annos['dimensions'] # lhw format
+ bbox = annos['bbox']
+ height = bbox[:, 3] - bbox[:, 1]
+ occlusion = annos['occluded']
+ truncation = annos['truncated']
+ diff = []
+    easy_mask = np.ones((len(dims), ), dtype=bool)
+    moderate_mask = np.ones((len(dims), ), dtype=bool)
+    hard_mask = np.ones((len(dims), ), dtype=bool)
+ i = 0
+ for h, o, t in zip(height, occlusion, truncation):
+ if o > max_occlusion[0] or h <= min_height[0] or t > max_trunc[0]:
+ easy_mask[i] = False
+ if o > max_occlusion[1] or h <= min_height[1] or t > max_trunc[1]:
+ moderate_mask[i] = False
+ if o > max_occlusion[2] or h <= min_height[2] or t > max_trunc[2]:
+ hard_mask[i] = False
+ i += 1
+ is_easy = easy_mask
+ is_moderate = np.logical_xor(easy_mask, moderate_mask)
+ is_hard = np.logical_xor(hard_mask, moderate_mask)
+
+ for i in range(len(dims)):
+ if is_easy[i]:
+ diff.append(0)
+ elif is_moderate[i]:
+ diff.append(1)
+ elif is_hard[i]:
+ diff.append(2)
+ else:
+ diff.append(-1)
+ annos["difficulty"] = np.array(diff, np.int32)
+ return diff
+
+
+def get_label_anno(label_path):
+ annotations = {}
+ annotations.update({
+ 'name': [],
+ 'truncated': [],
+ 'occluded': [],
+ 'alpha': [],
+ 'bbox': [],
+ 'dimensions': [],
+ 'location': [],
+ 'rotation_y': []
+ })
+ with open(label_path, 'r') as f:
+ lines = f.readlines()
+ # if len(lines) == 0 or len(lines[0]) < 15:
+ # content = []
+ # else:
+ content = [line.strip().split(' ') for line in lines]
+ annotations['name'] = np.array([x[0] for x in content])
+ annotations['truncated'] = np.array([float(x[1]) for x in content])
+ annotations['occluded'] = np.array([int(x[2]) for x in content])
+ annotations['alpha'] = np.array([float(x[3]) for x in content])
+ annotations['bbox'] = np.array(
+ [[float(info) for info in x[4:8]] for x in content]).reshape(-1, 4)
+ # dimensions will convert hwl format to standard lhw(camera) format.
+ annotations['dimensions'] = np.array(
+ [[float(info) for info in x[8:11]] for x in content]).reshape(
+ -1, 3)[:, [2, 0, 1]]
+ annotations['location'] = np.array(
+ [[float(info) for info in x[11:14]] for x in content]).reshape(-1, 3)
+ annotations['rotation_y'] = np.array(
+ [float(x[14]) for x in content]).reshape(-1)
+ if len(content) != 0 and len(content[0]) == 16: # have score
+ annotations['score'] = np.array([float(x[15]) for x in content])
+ else:
+ annotations['score'] = np.zeros([len(annotations['bbox'])])
+ return annotations
+
+def get_label_annos(label_folder, image_ids=None):
+ if image_ids is None:
+ filepaths = pathlib.Path(label_folder).glob('*.txt')
+        prog = re.compile(r'^\d{6}\.txt$')
+ filepaths = filter(lambda f: prog.match(f.name), filepaths)
+ image_ids = [int(p.stem) for p in filepaths]
+ image_ids = sorted(image_ids)
+ if not isinstance(image_ids, list):
+ image_ids = list(range(image_ids))
+ annos = []
+ label_folder = pathlib.Path(label_folder)
+ for idx in image_ids:
+ image_idx = get_image_index_str(idx)
+ label_filename = label_folder / (image_idx + '.txt')
+ annos.append(get_label_anno(label_filename))
+ return annos
+
+def area(boxes, add1=False):
+ """Computes area of boxes.
+
+ Args:
+ boxes: Numpy array with shape [N, 4] holding N boxes
+
+ Returns:
+ a numpy array with shape [N*1] representing box areas
+ """
+ if add1:
+ return (boxes[:, 2] - boxes[:, 0] + 1.0) * (
+ boxes[:, 3] - boxes[:, 1] + 1.0)
+ else:
+ return (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
+
+
+def intersection(boxes1, boxes2, add1=False):
+ """Compute pairwise intersection areas between boxes.
+
+ Args:
+ boxes1: a numpy array with shape [N, 4] holding N boxes
+ boxes2: a numpy array with shape [M, 4] holding M boxes
+
+ Returns:
+ a numpy array with shape [N*M] representing pairwise intersection area
+ """
+ [y_min1, x_min1, y_max1, x_max1] = np.split(boxes1, 4, axis=1)
+ [y_min2, x_min2, y_max2, x_max2] = np.split(boxes2, 4, axis=1)
+
+ all_pairs_min_ymax = np.minimum(y_max1, np.transpose(y_max2))
+ all_pairs_max_ymin = np.maximum(y_min1, np.transpose(y_min2))
+ if add1:
+ all_pairs_min_ymax += 1.0
+ intersect_heights = np.maximum(
+ np.zeros(all_pairs_max_ymin.shape),
+ all_pairs_min_ymax - all_pairs_max_ymin)
+
+ all_pairs_min_xmax = np.minimum(x_max1, np.transpose(x_max2))
+ all_pairs_max_xmin = np.maximum(x_min1, np.transpose(x_min2))
+ if add1:
+ all_pairs_min_xmax += 1.0
+ intersect_widths = np.maximum(
+ np.zeros(all_pairs_max_xmin.shape),
+ all_pairs_min_xmax - all_pairs_max_xmin)
+ return intersect_heights * intersect_widths
+
+
+def iou(boxes1, boxes2, add1=False):
+ """Computes pairwise intersection-over-union between box collections.
+
+ Args:
+ boxes1: a numpy array with shape [N, 4] holding N boxes.
+ boxes2: a numpy array with shape [M, 4] holding M boxes.
+
+ Returns:
+ a numpy array with shape [N, M] representing pairwise iou scores.
+ """
+ intersect = intersection(boxes1, boxes2, add1)
+ area1 = area(boxes1, add1)
+ area2 = area(boxes2, add1)
+ union = np.expand_dims(
+ area1, axis=1) + np.expand_dims(
+ area2, axis=0) - intersect
+ return intersect / union
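+
+
+# A quick sanity check (hypothetical boxes in [y_min, x_min, y_max, x_max]):
+#   iou(np.array([[0., 0., 2., 2.]]), np.array([[1., 1., 3., 3.]]))
+#   -> array([[0.14285714]])  # intersection 1 over union 4 + 4 - 1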
diff --git a/PaddleCV/Paddle3D/PointRCNN/tools/kitti_object_eval_python/rotate_iou.py b/PaddleCV/Paddle3D/PointRCNN/tools/kitti_object_eval_python/rotate_iou.py
new file mode 100644
index 0000000000000000000000000000000000000000..cd694ef5c5a0c9fac9595a17743a35db37d48820
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/tools/kitti_object_eval_python/rotate_iou.py
@@ -0,0 +1,329 @@
+#####################
+# Based on https://github.com/hongzhenwang/RRPN-revise
+# Licensed under The MIT License
+# Author: yanyan, scrin@foxmail.com
+#####################
+import math
+
+import numba
+import numpy as np
+from numba import cuda
+
+@numba.jit(nopython=True)
+def div_up(m, n):
+ return m // n + (m % n > 0)
+
+@cuda.jit('(float32[:], float32[:], float32[:])', device=True, inline=True)
+def trangle_area(a, b, c):
+ return ((a[0] - c[0]) * (b[1] - c[1]) - (a[1] - c[1]) *
+ (b[0] - c[0])) / 2.0
+
+
+@cuda.jit('(float32[:], int32)', device=True, inline=True)
+def area(int_pts, num_of_inter):
+ area_val = 0.0
+ for i in range(num_of_inter - 2):
+ area_val += abs(
+ trangle_area(int_pts[:2], int_pts[2 * i + 2:2 * i + 4],
+ int_pts[2 * i + 4:2 * i + 6]))
+ return area_val
+
+
+@cuda.jit('(float32[:], int32)', device=True, inline=True)
+def sort_vertex_in_convex_polygon(int_pts, num_of_inter):
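+ # Sort the intersection vertices into a consistent angular order around
+ # their centroid so that area() can sum them as a triangle fan. A
+ # monotonic pseudo-angle built from the normalized x component (mirrored
+ # to [-3, -1] when y < 0) stands in for atan2 to keep device code cheap.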
+ if num_of_inter > 0:
+ center = cuda.local.array((2, ), dtype=numba.float32)
+ center[:] = 0.0
+ for i in range(num_of_inter):
+ center[0] += int_pts[2 * i]
+ center[1] += int_pts[2 * i + 1]
+ center[0] /= num_of_inter
+ center[1] /= num_of_inter
+ v = cuda.local.array((2, ), dtype=numba.float32)
+ vs = cuda.local.array((16, ), dtype=numba.float32)
+ for i in range(num_of_inter):
+ v[0] = int_pts[2 * i] - center[0]
+ v[1] = int_pts[2 * i + 1] - center[1]
+ d = math.sqrt(v[0] * v[0] + v[1] * v[1])
+ v[0] = v[0] / d
+ v[1] = v[1] / d
+ if v[1] < 0:
+ v[0] = -2 - v[0]
+ vs[i] = v[0]
+ j = 0
+ temp = 0
+ for i in range(1, num_of_inter):
+ if vs[i - 1] > vs[i]:
+ temp = vs[i]
+ tx = int_pts[2 * i]
+ ty = int_pts[2 * i + 1]
+ j = i
+ while j > 0 and vs[j - 1] > temp:
+ vs[j] = vs[j - 1]
+ int_pts[j * 2] = int_pts[j * 2 - 2]
+ int_pts[j * 2 + 1] = int_pts[j * 2 - 1]
+ j -= 1
+
+ vs[j] = temp
+ int_pts[j * 2] = tx
+ int_pts[j * 2 + 1] = ty
+
+
+@cuda.jit(
+ '(float32[:], float32[:], int32, int32, float32[:])',
+ device=True,
+ inline=True)
+def line_segment_intersection(pts1, pts2, i, j, temp_pts):
+ A = cuda.local.array((2, ), dtype=numba.float32)
+ B = cuda.local.array((2, ), dtype=numba.float32)
+ C = cuda.local.array((2, ), dtype=numba.float32)
+ D = cuda.local.array((2, ), dtype=numba.float32)
+
+ A[0] = pts1[2 * i]
+ A[1] = pts1[2 * i + 1]
+
+ B[0] = pts1[2 * ((i + 1) % 4)]
+ B[1] = pts1[2 * ((i + 1) % 4) + 1]
+
+ C[0] = pts2[2 * j]
+ C[1] = pts2[2 * j + 1]
+
+ D[0] = pts2[2 * ((j + 1) % 4)]
+ D[1] = pts2[2 * ((j + 1) % 4) + 1]
+ BA0 = B[0] - A[0]
+ BA1 = B[1] - A[1]
+ DA0 = D[0] - A[0]
+ CA0 = C[0] - A[0]
+ DA1 = D[1] - A[1]
+ CA1 = C[1] - A[1]
+ acd = DA1 * CA0 > CA1 * DA0
+ bcd = (D[1] - B[1]) * (C[0] - B[0]) > (C[1] - B[1]) * (D[0] - B[0])
+ if acd != bcd:
+ abc = CA1 * BA0 > BA1 * CA0
+ abd = DA1 * BA0 > BA1 * DA0
+ if abc != abd:
+ DC0 = D[0] - C[0]
+ DC1 = D[1] - C[1]
+ ABBA = A[0] * B[1] - B[0] * A[1]
+ CDDC = C[0] * D[1] - D[0] * C[1]
+ DH = BA1 * DC0 - BA0 * DC1
+ Dx = ABBA * DC0 - BA0 * CDDC
+ Dy = ABBA * DC1 - BA1 * CDDC
+ temp_pts[0] = Dx / DH
+ temp_pts[1] = Dy / DH
+ return True
+ return False
+
+
+@cuda.jit(
+ '(float32[:], float32[:], int32, int32, float32[:])',
+ device=True,
+ inline=True)
+def line_segment_intersection_v1(pts1, pts2, i, j, temp_pts):
+ a = cuda.local.array((2, ), dtype=numba.float32)
+ b = cuda.local.array((2, ), dtype=numba.float32)
+ c = cuda.local.array((2, ), dtype=numba.float32)
+ d = cuda.local.array((2, ), dtype=numba.float32)
+
+ a[0] = pts1[2 * i]
+ a[1] = pts1[2 * i + 1]
+
+ b[0] = pts1[2 * ((i + 1) % 4)]
+ b[1] = pts1[2 * ((i + 1) % 4) + 1]
+
+ c[0] = pts2[2 * j]
+ c[1] = pts2[2 * j + 1]
+
+ d[0] = pts2[2 * ((j + 1) % 4)]
+ d[1] = pts2[2 * ((j + 1) % 4) + 1]
+
+ area_abc = trangle_area(a, b, c)
+ area_abd = trangle_area(a, b, d)
+
+ if area_abc * area_abd >= 0:
+ return False
+
+ area_cda = trangle_area(c, d, a)
+ area_cdb = area_cda + area_abc - area_abd
+
+ if area_cda * area_cdb >= 0:
+ return False
+ t = area_cda / (area_abd - area_abc)
+
+ dx = t * (b[0] - a[0])
+ dy = t * (b[1] - a[1])
+ temp_pts[0] = a[0] + dx
+ temp_pts[1] = a[1] + dy
+ return True
+
+
+@cuda.jit('(float32, float32, float32[:])', device=True, inline=True)
+def point_in_quadrilateral(pt_x, pt_y, corners):
+ ab0 = corners[2] - corners[0]
+ ab1 = corners[3] - corners[1]
+
+ ad0 = corners[6] - corners[0]
+ ad1 = corners[7] - corners[1]
+
+ ap0 = pt_x - corners[0]
+ ap1 = pt_y - corners[1]
+
+ abab = ab0 * ab0 + ab1 * ab1
+ abap = ab0 * ap0 + ab1 * ap1
+ adad = ad0 * ad0 + ad1 * ad1
+ adap = ad0 * ap0 + ad1 * ap1
+
+ return abab >= abap and abap >= 0 and adad >= adap and adap >= 0
+
+
+@cuda.jit('(float32[:], float32[:], float32[:])', device=True, inline=True)
+def quadrilateral_intersection(pts1, pts2, int_pts):
+ num_of_inter = 0
+ for i in range(4):
+ if point_in_quadrilateral(pts1[2 * i], pts1[2 * i + 1], pts2):
+ int_pts[num_of_inter * 2] = pts1[2 * i]
+ int_pts[num_of_inter * 2 + 1] = pts1[2 * i + 1]
+ num_of_inter += 1
+ if point_in_quadrilateral(pts2[2 * i], pts2[2 * i + 1], pts1):
+ int_pts[num_of_inter * 2] = pts2[2 * i]
+ int_pts[num_of_inter * 2 + 1] = pts2[2 * i + 1]
+ num_of_inter += 1
+ temp_pts = cuda.local.array((2, ), dtype=numba.float32)
+ for i in range(4):
+ for j in range(4):
+ has_pts = line_segment_intersection(pts1, pts2, i, j, temp_pts)
+ if has_pts:
+ int_pts[num_of_inter * 2] = temp_pts[0]
+ int_pts[num_of_inter * 2 + 1] = temp_pts[1]
+ num_of_inter += 1
+
+ return num_of_inter
+
+
+@cuda.jit('(float32[:], float32[:])', device=True, inline=True)
+def rbbox_to_corners(corners, rbbox):
+ # generate clockwise corners and rotate them clockwise
+ angle = rbbox[4]
+ a_cos = math.cos(angle)
+ a_sin = math.sin(angle)
+ center_x = rbbox[0]
+ center_y = rbbox[1]
+ x_d = rbbox[2]
+ y_d = rbbox[3]
+ corners_x = cuda.local.array((4, ), dtype=numba.float32)
+ corners_y = cuda.local.array((4, ), dtype=numba.float32)
+ corners_x[0] = -x_d / 2
+ corners_x[1] = -x_d / 2
+ corners_x[2] = x_d / 2
+ corners_x[3] = x_d / 2
+ corners_y[0] = -y_d / 2
+ corners_y[1] = y_d / 2
+ corners_y[2] = y_d / 2
+ corners_y[3] = -y_d / 2
+ for i in range(4):
+ corners[2 *
+ i] = a_cos * corners_x[i] + a_sin * corners_y[i] + center_x
+ corners[2 * i
+ + 1] = -a_sin * corners_x[i] + a_cos * corners_y[i] + center_y
+
+
+@cuda.jit('(float32[:], float32[:])', device=True, inline=True)
+def inter(rbbox1, rbbox2):
+ corners1 = cuda.local.array((8, ), dtype=numba.float32)
+ corners2 = cuda.local.array((8, ), dtype=numba.float32)
+ intersection_corners = cuda.local.array((16, ), dtype=numba.float32)
+
+ rbbox_to_corners(corners1, rbbox1)
+ rbbox_to_corners(corners2, rbbox2)
+
+ num_intersection = quadrilateral_intersection(corners1, corners2,
+ intersection_corners)
+ sort_vertex_in_convex_polygon(intersection_corners, num_intersection)
+ # print(intersection_corners.reshape([-1, 2])[:num_intersection])
+
+ return area(intersection_corners, num_intersection)
+
+
+@cuda.jit('(float32[:], float32[:], int32)', device=True, inline=True)
+def devRotateIoUEval(rbox1, rbox2, criterion=-1):
+ area1 = rbox1[2] * rbox1[3]
+ area2 = rbox2[2] * rbox2[3]
+ area_inter = inter(rbox1, rbox2)
+ if criterion == -1:
+ return area_inter / (area1 + area2 - area_inter)
+ elif criterion == 0:
+ return area_inter / area1
+ elif criterion == 1:
+ return area_inter / area2
+ else:
+ return area_inter
+
+@cuda.jit('(int64, int64, float32[:], float32[:], float32[:], int32)', fastmath=False)
+def rotate_iou_kernel_eval(N, K, dev_boxes, dev_query_boxes, dev_iou, criterion=-1):
+ threadsPerBlock = 8 * 8
+ row_start = cuda.blockIdx.x
+ col_start = cuda.blockIdx.y
+ tx = cuda.threadIdx.x
+ row_size = min(N - row_start * threadsPerBlock, threadsPerBlock)
+ col_size = min(K - col_start * threadsPerBlock, threadsPerBlock)
+ block_boxes = cuda.shared.array(shape=(64 * 5, ), dtype=numba.float32)
+ block_qboxes = cuda.shared.array(shape=(64 * 5, ), dtype=numba.float32)
+
+ dev_query_box_idx = threadsPerBlock * col_start + tx
+ dev_box_idx = threadsPerBlock * row_start + tx
+ if (tx < col_size):
+ block_qboxes[tx * 5 + 0] = dev_query_boxes[dev_query_box_idx * 5 + 0]
+ block_qboxes[tx * 5 + 1] = dev_query_boxes[dev_query_box_idx * 5 + 1]
+ block_qboxes[tx * 5 + 2] = dev_query_boxes[dev_query_box_idx * 5 + 2]
+ block_qboxes[tx * 5 + 3] = dev_query_boxes[dev_query_box_idx * 5 + 3]
+ block_qboxes[tx * 5 + 4] = dev_query_boxes[dev_query_box_idx * 5 + 4]
+ if (tx < row_size):
+ block_boxes[tx * 5 + 0] = dev_boxes[dev_box_idx * 5 + 0]
+ block_boxes[tx * 5 + 1] = dev_boxes[dev_box_idx * 5 + 1]
+ block_boxes[tx * 5 + 2] = dev_boxes[dev_box_idx * 5 + 2]
+ block_boxes[tx * 5 + 3] = dev_boxes[dev_box_idx * 5 + 3]
+ block_boxes[tx * 5 + 4] = dev_boxes[dev_box_idx * 5 + 4]
+ cuda.syncthreads()
+ if tx < row_size:
+ for i in range(col_size):
+ offset = row_start * threadsPerBlock * K + col_start * threadsPerBlock + tx * K + i
+ dev_iou[offset] = devRotateIoUEval(block_qboxes[i * 5:i * 5 + 5],
+ block_boxes[tx * 5:tx * 5 + 5], criterion)
+
+
+def rotate_iou_gpu_eval(boxes, query_boxes, criterion=-1, device_id=0):
+ """rotated box iou running in gpu. 500x faster than cpu version
+ (take 5ms in one example with numba.cuda code).
+ convert from [this project](
+ https://github.com/hongzhenwang/RRPN-revise/tree/master/lib/rotation).
+
+ Args:
+ boxes (float tensor: [N, 5]): rbboxes. format: centers, dims,
+ angles(clockwise when positive)
+ query_boxes (float tensor: [K, 5]): [description]
+ device_id (int, optional): Defaults to 0. [description]
+
+ Returns:
+ [type]: [description]
+ """
+ box_dtype = boxes.dtype
+ boxes = boxes.astype(np.float32)
+ query_boxes = query_boxes.astype(np.float32)
+ N = boxes.shape[0]
+ K = query_boxes.shape[0]
+ iou = np.zeros((N, K), dtype=np.float32)
+ if N == 0 or K == 0:
+ return iou
+ threadsPerBlock = 8 * 8
+ cuda.select_device(device_id)
+ blockspergrid = (div_up(N, threadsPerBlock), div_up(K, threadsPerBlock))
+
+ stream = cuda.stream()
+ with stream.auto_synchronize():
+ boxes_dev = cuda.to_device(boxes.reshape([-1]), stream)
+ query_boxes_dev = cuda.to_device(query_boxes.reshape([-1]), stream)
+ iou_dev = cuda.to_device(iou.reshape([-1]), stream)
+ rotate_iou_kernel_eval[blockspergrid, threadsPerBlock, stream](
+ N, K, boxes_dev, query_boxes_dev, iou_dev, criterion)
+ iou_dev.copy_to_host(iou.reshape([-1]), stream=stream)
+ # restore the caller's dtype (boxes itself was cast to float32 above)
+ return iou.astype(box_dtype)
diff --git a/PaddleCV/Paddle3D/PointRCNN/train.py b/PaddleCV/Paddle3D/PointRCNN/train.py
new file mode 100644
index 0000000000000000000000000000000000000000..41a6f0981b5222b940eb23aca548fbf0672723ba
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/train.py
@@ -0,0 +1,248 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
+#
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
+
+import os
+import sys
+import time
+import shutil
+import argparse
+import logging
+import numpy as np
+import paddle
+import paddle.fluid as fluid
+from paddle.fluid.layers import control_flow
+from paddle.fluid.contrib.extend_optimizer import extend_with_decoupled_weight_decay
+import paddle.fluid.layers.learning_rate_scheduler as lr_scheduler
+
+from models.point_rcnn import PointRCNN
+from data.kitti_rcnn_reader import KittiRCNNReader
+from utils.run_utils import *
+from utils.config import cfg, load_config, set_config_from_list
+from utils.optimizer import optimize
+
+logging.root.handlers = []
+FORMAT = '%(asctime)s-%(levelname)s: %(message)s'
+logging.basicConfig(level=logging.INFO, format=FORMAT, stream=sys.stdout)
+logger = logging.getLogger(__name__)
+
+
+def parse_args():
+ parser = argparse.ArgumentParser("PointRCNN 3D object detection train script")
+ parser.add_argument(
+ '--cfg',
+ type=str,
+ default='cfgs/default.yml',
+ help='specify the config for training')
+ parser.add_argument(
+ '--train_mode',
+ type=str,
+ default='rpn',
+ required=True,
+ help='specify the training mode')
+ parser.add_argument(
+ '--batch_size',
+ type=int,
+ default=16,
+ help='training batch size, default 16')
+ parser.add_argument(
+ '--epoch',
+ type=int,
+ default=200,
+ help='epoch number, default 200')
+ parser.add_argument(
+ '--save_dir',
+ type=str,
+ default='checkpoints',
+ help='directory name to save train snapshots')
+ parser.add_argument(
+ '--resume',
+ type=str,
+ default=None,
+ help='path to resume training based on previous checkpoints. '
+ 'None for not resuming any checkpoints.')
+ parser.add_argument(
+ '--resume_epoch',
+ type=int,
+ default=0,
+ help='resume epoch id')
+ parser.add_argument(
+ '--data_dir',
+ type=str,
+ default='./data',
+ help='KITTI dataset root directory')
+ parser.add_argument(
+ '--gt_database',
+ type=str,
+ default='data/gt_database/train_gt_database_3level_Car.pkl',
+ help='generated gt database for augmentation')
+ parser.add_argument(
+ '--rcnn_training_roi_dir',
+ type=str,
+ default=None,
+ help='specify the saved rois for rcnn training when using rcnn_offline mode')
+ parser.add_argument(
+ '--rcnn_training_feature_dir',
+ type=str,
+ default=None,
+ help='specify the saved features for rcnn training when using rcnn_offline mode')
+ parser.add_argument(
+ '--worker_num',
+ type=int,
+ default=16,
+ help='multiprocess reader process num, default 16')
+ parser.add_argument(
+ '--log_interval',
+ type=int,
+ default=1,
+ help='mini-batch interval to log.')
+ parser.add_argument(
+ '--set',
+ dest='set_cfgs',
+ default=None,
+ nargs=argparse.REMAINDER,
+ help='set extra config keys if needed.')
+ args = parser.parse_args()
+ return args
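+
+
+# Example invocation (values mirror the defaults above):
+#   python train.py --cfg cfgs/default.yml --train_mode rpn \
+#       --batch_size 16 --epoch 200 --save_dir checkpoints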
+
+
+def train():
+ args = parse_args()
+ print_arguments(args)
+ # check whether the installed paddle is compiled with GPU
+ # PointRCNN model can only run on GPU
+ check_gpu(True)
+
+ load_config(args.cfg)
+ if args.set_cfgs is not None:
+ set_config_from_list(args.set_cfgs)
+
+ if args.train_mode == 'rpn':
+ cfg.RPN.ENABLED = True
+ cfg.RCNN.ENABLED = False
+ elif args.train_mode == 'rcnn':
+ cfg.RCNN.ENABLED = True
+ cfg.RPN.ENABLED = cfg.RPN.FIXED = True
+ elif args.train_mode == 'rcnn_offline':
+ cfg.RCNN.ENABLED = True
+ cfg.RPN.ENABLED = False
+ else:
+ raise NotImplementedError("unknown train mode: {}".format(args.train_mode))
+
+ checkpoints_dir = os.path.join(args.save_dir, args.train_mode)
+ if not os.path.isdir(checkpoints_dir):
+ os.makedirs(checkpoints_dir)
+
+ kitti_rcnn_reader = KittiRCNNReader(data_dir=args.data_dir,
+ npoints=cfg.RPN.NUM_POINTS,
+ split=cfg.TRAIN.SPLIT,
+ mode='TRAIN',
+ classes=cfg.CLASSES,
+ rcnn_training_roi_dir=args.rcnn_training_roi_dir,
+ rcnn_training_feature_dir=args.rcnn_training_feature_dir,
+ gt_database_dir=args.gt_database)
+ num_samples = len(kitti_rcnn_reader)
+ steps_per_epoch = int(num_samples / args.batch_size)
+ logger.info("Total {} samples, {} batch per epoch.".format(num_samples, steps_per_epoch))
+ boundaries = [i * steps_per_epoch for i in cfg.TRAIN.DECAY_STEP_LIST]
+ values = [cfg.TRAIN.LR * (cfg.TRAIN.LR_DECAY ** i) for i in range(len(boundaries) + 1)]
+
+ place = fluid.CUDAPlace(0)
+ exe = fluid.Executor(place)
+
+ # build model
+ startup = fluid.Program()
+ train_prog = fluid.Program()
+ with fluid.program_guard(train_prog, startup):
+ with fluid.unique_name.guard():
+ train_model = PointRCNN(cfg, args.batch_size, True, 'TRAIN')
+ train_model.build()
+ train_pyreader = train_model.get_pyreader()
+ train_feeds = train_model.get_feeds()
+ train_outputs = train_model.get_outputs()
+ train_loss = train_outputs['loss']
+ lr = optimize(train_loss,
+ learning_rate=cfg.TRAIN.LR,
+ warmup_factor=1. / cfg.TRAIN.DIV_FACTOR,
+ decay_factor=1e-5,
+ total_step=steps_per_epoch * args.epoch,
+ warmup_pct=cfg.TRAIN.PCT_START,
+ train_program=train_prog,
+ startup_prog=startup,
+ weight_decay=cfg.TRAIN.WEIGHT_DECAY,
+ clip_norm=cfg.TRAIN.GRAD_NORM_CLIP)
+ train_keys, train_values = parse_outputs(train_outputs, 'loss')
+
+ exe.run(startup)
+
+ if args.resume:
+ assert os.path.exists(args.resume), \
+ "Given resume weight dir {} not exist.".format(args.resume)
+ def if_exist(var):
+ logger.debug("{}: {}".format(var.name, os.path.exists(os.path.join(args.resume, var.name))))
+ return os.path.exists(os.path.join(args.resume, var.name))
+ fluid.io.load_vars(
+ exe, args.resume, predicate=if_exist, main_program=train_prog)
+
+ build_strategy = fluid.BuildStrategy()
+ build_strategy.memory_optimize = False
+ build_strategy.enable_inplace = False
+ build_strategy.fuse_all_optimizer_ops = False
+ train_compile_prog = fluid.compiler.CompiledProgram(
+ train_prog).with_data_parallel(loss_name=train_loss.name,
+ build_strategy=build_strategy)
+
+ def save_model(exe, prog, path):
+ if os.path.isdir(path):
+ shutil.rmtree(path)
+ logger.info("Save model to {}".format(path))
+ fluid.io.save_persistables(exe, path, prog)
+
+ # get reader
+ train_reader = kitti_rcnn_reader.get_multiprocess_reader(args.batch_size,
+ train_feeds,
+ proc_num=args.worker_num,
+ drop_last=True)
+ train_pyreader.decorate_sample_list_generator(train_reader, place)
+
+ train_stat = Stat()
+ for epoch_id in range(args.resume_epoch, args.epoch):
+ try:
+ train_pyreader.start()
+ train_iter = 0
+ train_periods = []
+ while True:
+ cur_time = time.time()
+ train_outs = exe.run(train_compile_prog, fetch_list=train_values + [lr.name])
+ period = time.time() - cur_time
+ train_periods.append(period)
+ train_stat.update(train_keys, train_outs[:-1])
+ if train_iter % args.log_interval == 0:
+ log_str = ""
+ for name, values in zip(train_keys + ['learning_rate'], train_outs):
+ log_str += "{}: {:.6f}, ".format(name, np.mean(values))
+ logger.info("[TRAIN] Epoch {}, batch {}: {}time: {:.2f}".format(epoch_id, train_iter, log_str, period))
+ train_iter += 1
+ except fluid.core.EOFException:
+ logger.info("[TRAIN] Epoch {} finished, {}average time: {:.2f}".format(epoch_id, train_stat.get_mean_log(), np.mean(train_periods[2:])))
+ save_model(exe, train_prog, os.path.join(checkpoints_dir, str(epoch_id)))
+ train_stat.reset()
+ train_periods = []
+ finally:
+ train_pyreader.reset()
+
+
+if __name__ == "__main__":
+ train()
diff --git a/PaddleCV/Paddle3D/PointRCNN/utils/__init__.py b/PaddleCV/Paddle3D/PointRCNN/utils/__init__.py
new file mode 100644
index 0000000000000000000000000000000000000000..cad1d5d9ab5b0e5ed0724ddfc65ef53d14044b76
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/utils/__init__.py
@@ -0,0 +1,14 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
+#
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
+
diff --git a/PaddleCV/Paddle3D/PointRCNN/utils/box_utils.py b/PaddleCV/Paddle3D/PointRCNN/utils/box_utils.py
new file mode 100644
index 0000000000000000000000000000000000000000..49c9ee74a64634e1836d081220996919ffae16a4
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/utils/box_utils.py
@@ -0,0 +1,275 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
+#
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
+"""
+Contains proposal functions
+"""
+
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
+import numpy as np
+import paddle.fluid as fluid
+
+from utils.config import cfg
+
+__all__ = ["boxes3d_to_bev", "box_overlap_rotate", "boxes3d_to_bev", "box_iou", "box_nms"]
+
+
+def boxes3d_to_bev(boxes3d):
+ """
+ Args:
+ boxes3d: [N, 7], (x, y, z, h, w, l, ry)
+ Return:
+ boxes_bev: [N, 5], (x1, y1, x2, y2, ry)
+ """
+ boxes_bev = np.zeros((boxes3d.shape[0], 5), dtype='float32')
+
+ cu, cv = boxes3d[:, 0], boxes3d[:, 2]
+ half_l, half_w = boxes3d[:, 5] / 2, boxes3d[:, 4] / 2
+ boxes_bev[:, 0], boxes_bev[:, 1] = cu - half_l, cv - half_w
+ boxes_bev[:, 2], boxes_bev[:, 3] = cu + half_l, cv + half_w
+ boxes_bev[:, 4] = boxes3d[:, 6]
+ return boxes_bev
+
+
+def rotate_around_center(center, angle_cos, angle_sin, corners):
+ new_x = (corners[:, 0] - center[0]) * angle_cos + \
+ (corners[:, 1] - center[1]) * angle_sin + center[0]
+ new_y = -(corners[:, 0] - center[0]) * angle_sin + \
+ (corners[:, 1] - center[1]) * angle_cos + center[1]
+ return np.concatenate([new_x[:, np.newaxis], new_y[:, np.newaxis]], axis=-1)
+
+
+def check_rect_cross(p1, p2, q1, q2):
+ return min(p1[0], p2[0]) <= max(q1[0], q2[0]) and \
+ min(q1[0], q2[0]) <= max(p1[0], p2[0]) and \
+ min(p1[1], p2[1]) <= max(q1[1], q2[1]) and \
+ min(q1[1], q2[1]) <= max(p1[1], p2[1])
+
+
+def cross(p1, p2, p0):
+ return (p1[0] - p0[0]) * (p2[1] - p0[1]) - (p2[0] - p0[0]) * (p1[1] - p0[1])
+
+
+def cross_area(a, b):
+ return a[0] * b[1] - a[1] * b[0]
+
+
+def intersection(p1, p0, q1, q0):
+ if not check_rect_cross(p1, p0, q1, q0):
+ return None
+
+ s1 = cross(q0, p1, p0)
+ s2 = cross(p1, q1, p0)
+ s3 = cross(p0, q1, q0)
+ s4 = cross(q1, p1, q0)
+ if not (s1 * s2 > 0 and s3 * s4 > 0):
+ return None
+
+ s5 = cross(q1, p1, p0)
+ if np.abs(s5 - s1) > 1e-8:
+ return np.array([(s5 * q0[0] - s1 * q1[0]) / (s5 - s1),
+ (s5 * q0[1] - s1 * q1[1]) / (s5 - s1)], dtype='float32')
+ else:
+ # degenerate case: fall back to solving the two line equations
+ # a*x + b*y + c = 0 directly via Cramer's rule
+ a0 = p0[1] - p1[1]
+ b0 = p1[0] - p0[0]
+ c0 = p0[0] * p1[1] - p1[0] * p0[1]
+ a1 = q0[1] - q1[1]
+ b1 = q1[0] - q0[0]
+ c1 = q0[0] * q1[1] - q1[0] * q0[1]
+ D = a0 * b1 - a1 * b0
+ return np.array([(b0 * c1 - b1 * c0) / D, (a1 * c0 - a0 * c1) / D], dtype='float32')
+
+
+def check_in_box2d(box, p):
+ center_x = (box[0] + box[2]) / 2.
+ center_y = (box[1] + box[3]) / 2.
+ angle_cos = np.cos(-box[4])
+ angle_sin = np.sin(-box[4])
+ rot_x = (p[0] - center_x) * angle_cos + (p[1] - center_y) * angle_sin + center_x
+ rot_y = -(p[0] - center_x) * angle_sin + (p[1] - center_y) * angle_cos + center_y
+ return rot_x > box[0] - 1e-5 and rot_x < box[2] + 1e-5 and \
+ rot_y > box[1] - 1e-5 and rot_y < box[3] + 1e-5
+
+
+def point_cmp(a, b, center):
+ return np.arctan2(a[1] - center[1], a[0] - center[0]) > \
+ np.arctan2(b[1] - center[1], b[0] - center[0])
+
+
+def box_overlap_rotate(cur_box, boxes):
+ """
+ Calculate box overlap with rotate, box: [x1, y1, x2, y2, angle]
+ """
+ areas = np.zeros((len(boxes), ), dtype='float32')
+ cur_center = [(cur_box[0] + cur_box[2]) / 2., (cur_box[1] + cur_box[3]) / 2.]
+ cur_corners = np.array([
+ [cur_box[0], cur_box[1]], # (x1, y1)
+ [cur_box[2], cur_box[1]], # (x2, y1)
+ [cur_box[2], cur_box[3]], # (x2, y2)
+ [cur_box[0], cur_box[3]], # (x1, y2)
+ [cur_box[0], cur_box[1]], # (x1, y1)
+ ], dtype='float32')
+ cur_angle_cos = np.cos(cur_box[4])
+ cur_angle_sin = np.sin(cur_box[4])
+ cur_corners = rotate_around_center(cur_center, cur_angle_cos, cur_angle_sin, cur_corners)
+
+ for i, box in enumerate(boxes):
+ box_center = [(box[0] + box[2]) / 2., (box[1] + box[3]) / 2.]
+ box_corners = np.array([
+ [box[0], box[1]],
+ [box[2], box[1]],
+ [box[2], box[3]],
+ [box[0], box[3]],
+ [box[0], box[1]],
+ ], dtype='float32')
+ box_angle_cos = np.cos(box[4])
+ box_angle_sin = np.sin(box[4])
+ box_corners = rotate_around_center(box_center, box_angle_cos, box_angle_sin, box_corners)
+
+ cross_points = np.zeros((16, 2), dtype='float32')
+ cnt = 0
+ # get intersection of lines
+ for j in range(4):
+ for k in range(4):
+ inters = intersection(cur_corners[j + 1], cur_corners[j],
+ box_corners[k + 1], box_corners[k])
+ if inters is not None:
+ cross_points[cnt, :] = inters
+ cnt += 1
+ # check corners
+ for l in range(4):
+ if check_in_box2d(cur_box, box_corners[l]):
+ cross_points[cnt, :] = box_corners[l]
+ cnt += 1
+ if check_in_box2d(box, cur_corners[l]):
+ cross_points[cnt, :] = cur_corners[l]
+ cnt += 1
+
+ if cnt > 0:
+ poly_center = np.sum(cross_points[:cnt, :], axis=0) / cnt
+ else:
+ poly_center = np.zeros((2,))
+
+ # sort the points of polygon
+ for j in range(cnt - 1):
+ for k in range(cnt - j - 1):
+ if point_cmp(cross_points[k], cross_points[k + 1], poly_center):
+ cross_points[k], cross_points[k + 1] = \
+ cross_points[k + 1].copy(), cross_points[k].copy()
+
+ # get the overlap areas
+ area = 0.
+ for j in range(cnt - 1):
+ area += cross_area(cross_points[j] - cross_points[0],
+ cross_points[j + 1] - cross_points[0])
+ areas[i] = np.abs(area) / 2.
+
+ return areas
+
+
+def box_iou(cur_box, boxes, box_type='normal'):
+ cur_S = (cur_box[2] - cur_box[0]) * (cur_box[3] - cur_box[1])
+ boxes_S = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
+
+ if box_type == 'normal':
+ inter_x1 = np.maximum(cur_box[0], boxes[:, 0])
+ inter_y1 = np.maximum(cur_box[1], boxes[:, 1])
+ inter_x2 = np.minimum(cur_box[2], boxes[:, 2])
+ inter_y2 = np.minimum(cur_box[3], boxes[:, 3])
+ inter_w = np.maximum(inter_x2 - inter_x1, 0.)
+ inter_h = np.maximum(inter_y2 - inter_y1, 0.)
+ inter_area = inter_w * inter_h
+ elif box_type == 'rotate':
+ inter_area = box_overlap_rotate(cur_box, boxes)
+ else:
+ raise NotImplementedError
+
+ return inter_area / np.maximum(cur_S + boxes_S - inter_area, 1e-8)
+
+
+def box_nms(boxes, scores, proposals, thresh, topk, nms_type='normal'):
+ assert nms_type in ['normal', 'rotate'], \
+ "unknown nms type {}".format(nms_type)
+ order = np.argsort(-scores)
+ boxes = boxes[order]
+ scores = scores[order]
+ proposals = proposals[order]
+
+ nmsed_scores = []
+ nmsed_proposals = []
+ cnt = 0
+ while boxes.shape[0]:
+ nmsed_scores.append(scores[0])
+ nmsed_proposals.append(proposals[0])
+ cnt += 1
+ if cnt >= topk or boxes.shape[0] == 1:
+ break
+ iou = box_iou(boxes[0], boxes[1:], nms_type)
+ boxes = boxes[1:][iou < thresh]
+ scores = scores[1:][iou < thresh]
+ proposals = proposals[1:][iou < thresh]
+ return nmsed_scores, nmsed_proposals
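+
+
+# A minimal sketch (hypothetical inputs): greedily keep up to `topk` boxes,
+# dropping any box whose IoU with a kept higher-scoring box reaches thresh:
+#   scores_kept, props_kept = box_nms(boxes_bev, scores, proposals,
+#                                     thresh=0.85, topk=2048, nms_type='normal')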
+
+
+def box_nms_eval(boxes, scores, proposals, thresh, nms_type='rotate'):
+ assert nms_type in ['normal', 'rotate'], \
+ "unknown nms type {}".format(nms_type)
+ order = np.argsort(-scores)
+ boxes = boxes[order]
+ scores = scores[order]
+ proposals = proposals[order]
+
+ nmsed_scores = []
+ nmsed_proposals = []
+ while boxes.shape[0]:
+ nmsed_scores.append(scores[0])
+ nmsed_proposals.append(proposals[0])
+ iou = box_iou(boxes[0], boxes[1:], nms_type)
+ inds = iou < thresh
+ boxes = boxes[1:][inds]
+ scores = scores[1:][inds]
+ proposals = proposals[1:][inds]
+ nmsed_scores = np.asarray(nmsed_scores)
+ nmsed_proposals = np.asarray(nmsed_proposals)
+ return nmsed_scores, nmsed_proposals
+
+def boxes_iou3d(boxes1, boxes2):
+ boxes1_bev = boxes3d_to_bev(boxes1)
+ boxes2_bev = boxes3d_to_bev(boxes2)
+
+ # bev overlap
+ overlaps_bev = np.zeros((boxes1_bev.shape[0], boxes2_bev.shape[0]))
+ for i in range(boxes1_bev.shape[0]):
+ overlaps_bev[i, :] = box_overlap_rotate(boxes1_bev[i], boxes2_bev)
+
+ # height overlap
+ boxes1_height_min = (boxes1[:, 1] - boxes1[:, 3]).reshape(-1, 1)
+ boxes1_height_max = boxes1[:, 1].reshape(-1, 1)
+ boxes2_height_min = (boxes2[:, 1] - boxes2[:, 3]).reshape(1, -1)
+ boxes2_height_max = boxes2[:, 1].reshape(1, -1)
+
+ max_of_min = np.maximum(boxes1_height_min, boxes2_height_min)
+ min_of_max = np.minimum(boxes1_height_max, boxes2_height_max)
+ overlaps_h = np.maximum(min_of_max - max_of_min, 0.)
+
+ # 3d iou
+ overlaps_3d = overlaps_bev * overlaps_h
+
+ vol_a = (boxes1[:, 3] * boxes1[:, 4] * boxes1[:, 5]).reshape(-1, 1)
+ vol_b = (boxes2[:, 3] * boxes2[:, 4] * boxes2[:, 5]).reshape(1, -1)
+ iou3d = overlaps_3d / np.maximum(vol_a + vol_b - overlaps_3d, 1e-7)
+
+ return iou3d
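+
+
+# boxes_iou3d decomposes 3D IoU into rotated BEV overlap in the x-z plane
+# times height overlap along y, normalized by the union of the box volumes.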
diff --git a/PaddleCV/Paddle3D/PointRCNN/utils/calibration.py b/PaddleCV/Paddle3D/PointRCNN/utils/calibration.py
new file mode 100644
index 0000000000000000000000000000000000000000..41fcf279db5a194c5dcc81ae8dafa48b088a42bc
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/utils/calibration.py
@@ -0,0 +1,143 @@
+"""
+This code is borrowed from https://github.com/sshaoshuai/PointRCNN/blob/master/lib/utils/kitti_utils.py
+"""
+import numpy as np
+import os
+
+
+def get_calib_from_file(calib_file):
+ with open(calib_file) as f:
+ lines = f.readlines()
+
+ obj = lines[2].strip().split(' ')[1:]
+ P2 = np.array(obj, dtype=np.float32)
+ obj = lines[3].strip().split(' ')[1:]
+ P3 = np.array(obj, dtype=np.float32)
+ obj = lines[4].strip().split(' ')[1:]
+ R0 = np.array(obj, dtype=np.float32)
+ obj = lines[5].strip().split(' ')[1:]
+ Tr_velo_to_cam = np.array(obj, dtype=np.float32)
+
+ return {'P2': P2.reshape(3, 4),
+ 'P3': P3.reshape(3, 4),
+ 'R0': R0.reshape(3, 3),
+ 'Tr_velo2cam': Tr_velo_to_cam.reshape(3, 4)}
+
+
+class Calibration(object):
+ def __init__(self, calib_file):
+ if isinstance(calib_file, str):
+ calib = get_calib_from_file(calib_file)
+ else:
+ calib = calib_file
+
+ self.P2 = calib['P2'] # 3 x 4
+ self.R0 = calib['R0'] # 3 x 3
+ self.V2C = calib['Tr_velo2cam'] # 3 x 4
+
+ # Camera intrinsics and extrinsics
+ self.cu = self.P2[0, 2]
+ self.cv = self.P2[1, 2]
+ self.fu = self.P2[0, 0]
+ self.fv = self.P2[1, 1]
+ self.tx = self.P2[0, 3] / (-self.fu)
+ self.ty = self.P2[1, 3] / (-self.fv)
+
+ def cart_to_hom(self, pts):
+ """
+ :param pts: (N, 3 or 2)
+ :return pts_hom: (N, 4 or 3)
+ """
+ pts_hom = np.hstack((pts, np.ones((pts.shape[0], 1), dtype=np.float32)))
+ return pts_hom
+
+ def lidar_to_rect(self, pts_lidar):
+ """
+ :param pts_lidar: (N, 3)
+ :return pts_rect: (N, 3)
+ """
+ pts_lidar_hom = self.cart_to_hom(pts_lidar)
+ pts_rect = np.dot(pts_lidar_hom, np.dot(self.V2C.T, self.R0.T))
+ # pts_rect = reduce(np.dot, (pts_lidar_hom, self.V2C.T, self.R0.T))
+ return pts_rect
+
+ def rect_to_img(self, pts_rect):
+ """
+ :param pts_rect: (N, 3)
+ :return pts_img: (N, 2)
+ """
+ pts_rect_hom = self.cart_to_hom(pts_rect)
+ pts_2d_hom = np.dot(pts_rect_hom, self.P2.T)
+ pts_img = (pts_2d_hom[:, 0:2].T / pts_rect_hom[:, 2]).T # (N, 2)
+ pts_rect_depth = pts_2d_hom[:, 2] - self.P2.T[3, 2] # depth in rect camera coord
+ return pts_img, pts_rect_depth
+
+ def lidar_to_img(self, pts_lidar):
+ """
+ :param pts_lidar: (N, 3)
+ :return pts_img: (N, 2)
+ """
+ pts_rect = self.lidar_to_rect(pts_lidar)
+ pts_img, pts_depth = self.rect_to_img(pts_rect)
+ return pts_img, pts_depth
+
+ def img_to_rect(self, u, v, depth_rect):
+ """
+ :param u: (N)
+ :param v: (N)
+ :param depth_rect: (N)
+ :return:
+ """
+ x = ((u - self.cu) * depth_rect) / self.fu + self.tx
+ y = ((v - self.cv) * depth_rect) / self.fv + self.ty
+ pts_rect = np.concatenate((x.reshape(-1, 1), y.reshape(-1, 1), depth_rect.reshape(-1, 1)), axis=1)
+ return pts_rect
+
+ def depthmap_to_rect(self, depth_map):
+ """
+ :param depth_map: (H, W), depth_map
+ :return:
+ """
+ x_range = np.arange(0, depth_map.shape[1])
+ y_range = np.arange(0, depth_map.shape[0])
+ x_idxs, y_idxs = np.meshgrid(x_range, y_range)
+ x_idxs, y_idxs = x_idxs.reshape(-1), y_idxs.reshape(-1)
+ depth = depth_map[y_idxs, x_idxs]
+ pts_rect = self.img_to_rect(x_idxs, y_idxs, depth)
+ return pts_rect, x_idxs, y_idxs
+
+ def corners3d_to_img_boxes(self, corners3d):
+ """
+ :param corners3d: (N, 8, 3) corners in rect coordinate
+ :return: boxes: (None, 4) [x1, y1, x2, y2] in rgb coordinate
+ :return: boxes_corner: (None, 8) [xi, yi] in rgb coordinate
+ """
+ sample_num = corners3d.shape[0]
+ corners3d_hom = np.concatenate((corners3d, np.ones((sample_num, 8, 1))), axis=2) # (N, 8, 4)
+
+ img_pts = np.matmul(corners3d_hom, self.P2.T) # (N, 8, 3)
+
+ x, y = img_pts[:, :, 0] / img_pts[:, :, 2], img_pts[:, :, 1] / img_pts[:, :, 2]
+ x1, y1 = np.min(x, axis=1), np.min(y, axis=1)
+ x2, y2 = np.max(x, axis=1), np.max(y, axis=1)
+
+ boxes = np.concatenate((x1.reshape(-1, 1), y1.reshape(-1, 1), x2.reshape(-1, 1), y2.reshape(-1, 1)), axis=1)
+ boxes_corner = np.concatenate((x.reshape(-1, 8, 1), y.reshape(-1, 8, 1)), axis=2)
+
+ return boxes, boxes_corner
+
+ def camera_dis_to_rect(self, u, v, d):
+ """
+ Can only process valid u, v, d: u and v must not lie beyond the image shape; reprojection error is about 0.02
+ :param u: (N)
+ :param v: (N)
+ :param d: (N), the distance between camera and 3d points, d^2 = x^2 + y^2 + z^2
+ :return:
+ """
+ assert self.fu == self.fv, '%.8f != %.8f' % (self.fu, self.fv)
+ fd = np.sqrt((u - self.cu)**2 + (v - self.cv)**2 + self.fu**2)
+ x = ((u - self.cu) * d) / fd + self.tx
+ y = ((v - self.cv) * d) / fd + self.ty
+ z = np.sqrt(d**2 - x**2 - y**2)
+ pts_rect = np.concatenate((x.reshape(-1, 1), y.reshape(-1, 1), z.reshape(-1, 1)), axis=1)
+ return pts_rect
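+
+
+# Typical usage (the calib file path is a placeholder):
+#   calib = Calibration('data/KITTI/object/training/calib/000000.txt')
+#   pts_rect = calib.lidar_to_rect(pts_lidar)         # (N, 3) velodyne -> rect camera
+#   pts_img, pts_depth = calib.rect_to_img(pts_rect)  # (N, 2) pixels + depth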
diff --git a/PaddleCV/Paddle3D/PointRCNN/utils/config.py b/PaddleCV/Paddle3D/PointRCNN/utils/config.py
new file mode 100644
index 0000000000000000000000000000000000000000..dc24aee5253576e3e5f78b8ed246af51c06279ba
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/utils/config.py
@@ -0,0 +1,279 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
+#
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
+"""
+This code is based on https://github.com/sshaoshuai/PointRCNN/blob/master/lib/config.py
+"""
+
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
+import yaml
+import numpy as np
+from ast import literal_eval
+
+__all__ = ["load_config", "cfg"]
+
+
+class AttrDict(dict):
+ def __init__(self, *args, **kwargs):
+ for arg in args:
+ for k, v in arg.items():
+ if isinstance(v, dict):
+ arg[k] = AttrDict(v)
+ else:
+ arg[k] = v
+ super(AttrDict, self).__init__(*args, **kwargs)
+
+ def __getattr__(self, name):
+ if name in self.__dict__:
+ return self.__dict__[name]
+ elif name in self:
+ return self[name]
+ else:
+ raise AttributeError(name)
+
+ def __setattr__(self, name, value):
+ if name in self.__dict__:
+ self.__dict__[name] = value
+ else:
+ self[name] = value
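+
+ # Nested dicts passed to the constructor become attribute-accessible too,
+ # e.g. AttrDict({'RPN': {'ENABLED': True}}).RPN.ENABLED -> True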
+
+
+__C = AttrDict()
+cfg = __C
+
+# 0. basic config
+__C.TAG = 'default'
+__C.CLASSES = 'Car'
+
+__C.INCLUDE_SIMILAR_TYPE = False
+
+# config of augmentation
+__C.AUG_DATA = True
+__C.AUG_METHOD_LIST = ['rotation', 'scaling', 'flip']
+__C.AUG_METHOD_PROB = [0.5, 0.5, 0.5]
+__C.AUG_ROT_RANGE = 18
+
+__C.GT_AUG_ENABLED = False
+__C.GT_EXTRA_NUM = 15
+__C.GT_AUG_RAND_NUM = False
+__C.GT_AUG_APPLY_PROB = 0.75
+__C.GT_AUG_HARD_RATIO = 0.6
+
+__C.PC_REDUCE_BY_RANGE = True
+__C.PC_AREA_SCOPE = np.array([[-40, 40],
+ [-1, 3],
+ [0, 70.4]]) # x, y, z scope in rect camera coords
+
+__C.CLS_MEAN_SIZE = np.array([[1.52, 1.63, 3.88]], dtype=np.float32)
+
+
+# 1. config of rpn network
+__C.RPN = AttrDict()
+__C.RPN.ENABLED = True
+__C.RPN.FIXED = False
+
+__C.RPN.USE_INTENSITY = True
+
+# config of bin-based loss
+__C.RPN.LOC_XZ_FINE = False
+__C.RPN.LOC_SCOPE = 3.0
+__C.RPN.LOC_BIN_SIZE = 0.5
+__C.RPN.NUM_HEAD_BIN = 12
+
+# config of network structure
+__C.RPN.BACKBONE = 'pointnet2_msg'
+
+__C.RPN.USE_BN = True
+__C.RPN.NUM_POINTS = 16384
+
+__C.RPN.SA_CONFIG = AttrDict()
+__C.RPN.SA_CONFIG.NPOINTS = [4096, 1024, 256, 64]
+__C.RPN.SA_CONFIG.RADIUS = [[0.1, 0.5], [0.5, 1.0], [1.0, 2.0], [2.0, 4.0]]
+__C.RPN.SA_CONFIG.NSAMPLE = [[16, 32], [16, 32], [16, 32], [16, 32]]
+__C.RPN.SA_CONFIG.MLPS = [[[16, 16, 32], [32, 32, 64]],
+ [[64, 64, 128], [64, 96, 128]],
+ [[128, 196, 256], [128, 196, 256]],
+ [[256, 256, 512], [256, 384, 512]]]
+__C.RPN.FP_MLPS = [[128, 128], [256, 256], [512, 512], [512, 512]]
+__C.RPN.CLS_FC = [128]
+__C.RPN.REG_FC = [128]
+__C.RPN.DP_RATIO = 0.5
+
+# config of training
+__C.RPN.LOSS_CLS = 'DiceLoss'
+__C.RPN.FG_WEIGHT = 15
+__C.RPN.FOCAL_ALPHA = [0.25, 0.75]
+__C.RPN.FOCAL_GAMMA = 2.0
+__C.RPN.REG_LOSS_WEIGHT = [1.0, 1.0, 1.0, 1.0]
+__C.RPN.LOSS_WEIGHT = [1.0, 1.0]
+__C.RPN.NMS_TYPE = 'normal' # normal, rotate
+
+# config of testing
+__C.RPN.SCORE_THRESH = 0.3
+
+
+# 2. config of rcnn network
+__C.RCNN = AttrDict()
+__C.RCNN.ENABLED = False
+
+# config of input
+__C.RCNN.USE_RPN_FEATURES = True
+__C.RCNN.USE_MASK = True
+__C.RCNN.MASK_TYPE = 'seg'
+__C.RCNN.USE_INTENSITY = False
+__C.RCNN.USE_DEPTH = True
+__C.RCNN.USE_SEG_SCORE = False
+__C.RCNN.ROI_SAMPLE_JIT = False
+__C.RCNN.ROI_FG_AUG_TIMES = 10
+
+__C.RCNN.REG_AUG_METHOD = 'multiple' # multiple, single, normal
+__C.RCNN.POOL_EXTRA_WIDTH = 1.0
+
+# config of bin-based loss
+__C.RCNN.LOC_SCOPE = 1.5
+__C.RCNN.LOC_BIN_SIZE = 0.5
+__C.RCNN.NUM_HEAD_BIN = 9
+__C.RCNN.LOC_Y_BY_BIN = False
+__C.RCNN.LOC_Y_SCOPE = 0.5
+__C.RCNN.LOC_Y_BIN_SIZE = 0.25
+__C.RCNN.SIZE_RES_ON_ROI = False
+
+# config of network structure
+__C.RCNN.USE_BN = False
+__C.RCNN.DP_RATIO = 0.0
+
+__C.RCNN.BACKBONE = 'pointnet' # pointnet, pointsift
+__C.RCNN.XYZ_UP_LAYER = [128, 128]
+
+__C.RCNN.NUM_POINTS = 512
+__C.RCNN.SA_CONFIG = AttrDict()
+__C.RCNN.SA_CONFIG.NPOINTS = [128, 32, -1]
+__C.RCNN.SA_CONFIG.RADIUS = [0.2, 0.4, 100]
+__C.RCNN.SA_CONFIG.NSAMPLE = [64, 64, 64]
+__C.RCNN.SA_CONFIG.MLPS = [[128, 128, 128],
+ [128, 128, 256],
+ [256, 256, 512]]
+__C.RCNN.CLS_FC = [256, 256]
+__C.RCNN.REG_FC = [256, 256]
+
+# config of training
+__C.RCNN.LOSS_CLS = 'BinaryCrossEntropy'
+__C.RCNN.FOCAL_ALPHA = [0.25, 0.75]
+__C.RCNN.FOCAL_GAMMA = 2.0
+__C.RCNN.CLS_WEIGHT = np.array([1.0, 1.0, 1.0], dtype=np.float32)
+__C.RCNN.CLS_FG_THRESH = 0.6
+__C.RCNN.CLS_BG_THRESH = 0.45
+__C.RCNN.CLS_BG_THRESH_LO = 0.05
+__C.RCNN.REG_FG_THRESH = 0.55
+__C.RCNN.FG_RATIO = 0.5
+__C.RCNN.ROI_PER_IMAGE = 64
+__C.RCNN.HARD_BG_RATIO = 0.6
+
+# config of testing
+__C.RCNN.SCORE_THRESH = 0.3
+__C.RCNN.NMS_THRESH = 0.1
+
+
+# general training config
+__C.TRAIN = AttrDict()
+__C.TRAIN.SPLIT = 'train'
+__C.TRAIN.VAL_SPLIT = 'smallval'
+
+__C.TRAIN.LR = 0.002
+__C.TRAIN.LR_CLIP = 0.00001
+__C.TRAIN.LR_DECAY = 0.5
+__C.TRAIN.DECAY_STEP_LIST = [50, 100, 150, 200, 250, 300]
+__C.TRAIN.LR_WARMUP = False
+__C.TRAIN.WARMUP_MIN = 0.0002
+__C.TRAIN.WARMUP_EPOCH = 5
+
+__C.TRAIN.BN_MOMENTUM = 0.9
+__C.TRAIN.BN_DECAY = 0.5
+__C.TRAIN.BNM_CLIP = 0.01
+__C.TRAIN.BN_DECAY_STEP_LIST = [50, 100, 150, 200, 250, 300]
+
+__C.TRAIN.OPTIMIZER = 'adam'
+__C.TRAIN.WEIGHT_DECAY = 0.0 # "L2 regularization coeff [default: 0.0]"
+__C.TRAIN.MOMENTUM = 0.9
+
+__C.TRAIN.MOMS = [0.95, 0.85]
+__C.TRAIN.DIV_FACTOR = 10.0
+__C.TRAIN.PCT_START = 0.4
+
+__C.TRAIN.GRAD_NORM_CLIP = 1.0
+
+__C.TRAIN.RPN_PRE_NMS_TOP_N = 12000
+__C.TRAIN.RPN_POST_NMS_TOP_N = 2048
+__C.TRAIN.RPN_NMS_THRESH = 0.85
+__C.TRAIN.RPN_DISTANCE_BASED_PROPOSE = True
+
+
+__C.TEST = AttrDict()
+__C.TEST.SPLIT = 'val'
+__C.TEST.RPN_PRE_NMS_TOP_N = 9000
+__C.TEST.RPN_POST_NMS_TOP_N = 300
+__C.TEST.RPN_NMS_THRESH = 0.7
+__C.TEST.RPN_DISTANCE_BASED_PROPOSE = True
+
+
+def load_config(fname):
+ """
+ Load config from yaml file and merge into global cfg
+ """
+ with open(fname) as f:
+ yml_cfg = AttrDict(yaml.load(f.read(), Loader=yaml.Loader))
+ _merge_cfg_a_to_b(yml_cfg, __C)
+
+
+def set_config_from_list(cfg_list):
+ assert len(cfg_list) % 2 == 0, "cfgs list length invalid"
+ for k, v in zip(cfg_list[0::2], cfg_list[1::2]):
+ key_list = k.split('.')
+ d = __C
+ for subkey in key_list[:-1]:
+ assert subkey in d
+ d = d[subkey]
+ subkey = key_list[-1]
+ assert subkey in d
+ try:
+ value = literal_eval(v)
+ except (ValueError, SyntaxError):
+ # handle the case when v is a plain string literal
+ value = v
+ assert type(value) == type(d[subkey]), \
+ 'type {} does not match original type {}'.format(type(value), type(d[subkey]))
+ d[subkey] = value
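+
+# Example: set_config_from_list(['TRAIN.LR', '0.001', 'RPN.SCORE_THRESH', '0.5'])
+# overrides cfg.TRAIN.LR and cfg.RPN.SCORE_THRESH after load_config().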
+
+
+def _merge_cfg_a_to_b(a, b):
+ assert isinstance(a, AttrDict), \
+ "unknown type {}".format(type(a))
+
+ for k, v in a.items():
+ assert k in b, "unknown key {}".format(k)
+ if type(v) is not type(b[k]):
+ if isinstance(b[k], np.ndarray):
+ b[k] = np.array(v, dtype=b[k].dtype)
+ else:
+ raise TypeError("Config type mismatch")
+ if isinstance(v, AttrDict):
+ _merge_cfg_a_to_b(v, b[k])
+ else:
+ b[k] = v
+
+
+if __name__ == "__main__":
+ load_config("./cfgs/default.yml")
diff --git a/PaddleCV/Paddle3D/PointRCNN/utils/cyops/__init__.py b/PaddleCV/Paddle3D/PointRCNN/utils/cyops/__init__.py
new file mode 100644
index 0000000000000000000000000000000000000000..e02c54922625934fe1ab74a8c29e435f44f4d302
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/utils/cyops/__init__.py
@@ -0,0 +1,15 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
+#
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
+
+
diff --git a/PaddleCV/Paddle3D/PointRCNN/utils/cyops/iou3d_utils.pyx b/PaddleCV/Paddle3D/PointRCNN/utils/cyops/iou3d_utils.pyx
new file mode 100644
index 0000000000000000000000000000000000000000..b2c7f3c7169c0a0f5da1adeeb029eec423daf39e
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/utils/cyops/iou3d_utils.pyx
@@ -0,0 +1,195 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
+#
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
+
+import cython
+from math import pi, cos, sin
+import numpy as np
+cimport numpy as np
+
+
+cdef class Point:
+ cdef float x, y
+ def __cinit__(self, x, y):
+ self.x = x
+ self.y = y
+
+ def __add__(self, v):
+ if not isinstance(v, Point):
+ return NotImplemented
+ return Point(self.x + v.x, self.y + v.y)
+
+ def __sub__(self, v):
+ if not isinstance(v, Point):
+ return NotImplemented
+ return Point(self.x - v.x, self.y - v.y)
+
+ def cross(self, v):
+ if not isinstance(v, Point):
+ return NotImplemented
+ return self.x*v.y - self.y*v.x
+
+
+cdef class Line:
+ cdef float a, b, c
+ # ax + by + c = 0
+ def __cinit__(self, v1, v2):
+ self.a = v2.y - v1.y
+ self.b = v1.x - v2.x
+ self.c = v2.cross(v1)
+
+ def __call__(self, p):
+ return self.a*p.x + self.b*p.y + self.c
+
+ def intersection(self, other):
+ if not isinstance(other, Line):
+ return NotImplemented
+ w = self.a*other.b - self.b*other.a
+ return Point(
+ (self.b*other.c - self.c*other.b)/w,
+ (self.c*other.a - self.a*other.c)/w
+ )
+
+
+@cython.boundscheck(False)
+@cython.wraparound(False)
+def rectangle_vertices_(x1, y1, x2, y2, r):
+
+ cx = (x1 + x2) / 2
+ cy = (y1 + y2) / 2
+ angle = r
+ cr = cos(angle)
+ sr = sin(angle)
+ # rotate around center
+ return (
+ Point(
+ x=(x1-cx)*cr+(y1-cy)*sr+cx,
+ y=-(x1-cx)*sr+(y1-cy)*cr+cy
+ ),
+ Point(
+ x=(x2-cx)*cr+(y1-cy)*sr+cx,
+ y=-(x2-cx)*sr+(y1-cy)*cr+cy
+ ),
+ Point(
+ x=(x2-cx)*cr+(y2-cy)*sr+cx,
+ y=-(x2-cx)*sr+(y2-cy)*cr+cy
+ ),
+ Point(
+ x=(x1-cx)*cr+(y2-cy)*sr+cx,
+ y=-(x1-cx)*sr+(y2-cy)*cr+cy
+ )
+ )
+
+@cython.boundscheck(False)
+@cython.wraparound(False)
+def intersection_area(r1, r2):
+ # r1 and r2 are in (x1, y1, x2, y2, rotation) representation
+ # First convert these into a sequence of vertices
+
+ rect1 = rectangle_vertices_(*r1)
+ rect2 = rectangle_vertices_(*r2)
+
+ # Use the vertices of the first rectangle as
+ # starting vertices of the intersection polygon.
+ intersection = rect1
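+
+ # This is Sutherland-Hodgman clipping: the current polygon is clipped in
+ # turn against the half-plane defined by each edge of the second rectangle.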
+
+ # Loop over the edges of the second rectangle
+ for p, q in zip(rect2, rect2[1:] + rect2[:1]):
+ if len(intersection) <= 2:
+ break # No intersection
+
+ line = Line(p, q)
+
+ # Any point p with line(p) <= 0 is on the "inside" (or on the boundary),
+ # any point p with line(p) > 0 is on the "outside".
+
+ # Loop over the edges of the intersection polygon,
+ # and determine which part is inside and which is outside.
+ new_intersection = []
+ line_values = [line(t) for t in intersection]
+ for s, t, s_value, t_value in zip(
+ intersection, intersection[1:] + intersection[:1],
+ line_values, line_values[1:] + line_values[:1]):
+ if s_value <= 0:
+ new_intersection.append(s)
+ if s_value * t_value < 0:
+ # Points are on opposite sides.
+ # Add the intersection of the lines to new_intersection.
+ intersection_point = line.intersection(Line(s, t))
+ new_intersection.append(intersection_point)
+
+ intersection = new_intersection
+
+ # Calculate area
+ if len(intersection) <= 2:
+ return 0
+
+ return 0.5 * sum(p.x*q.y - p.y*q.x for p, q in zip(intersection, intersection[1:] + intersection[:1]))
+
+
+def boxes3d_to_bev_(boxes3d):
+ """
+ Args:
+ boxes3d: [N, 7], (x, y, z, h, w, l, ry)
+ Return:
+ boxes_bev: [N, 5], (x1, y1, x2, y2, ry)
+ """
+ boxes_bev = np.zeros((boxes3d.shape[0], 5), dtype='float32')
+ cu, cv = boxes3d[:, 0], boxes3d[:, 2]
+ half_l, half_w = boxes3d[:, 5] / 2, boxes3d[:, 4] / 2
+ boxes_bev[:, 0], boxes_bev[:, 1] = cu - half_l, cv - half_w
+ boxes_bev[:, 2], boxes_bev[:, 3] = cu + half_l, cv + half_w
+ boxes_bev[:, 4] = boxes3d[:, 6]
+ return boxes_bev
+
+
+def boxes_iou3d(boxes_a, boxes_b):
+ """
+ :param boxes_a: (N, 7) [x, y, z, h, w, l, ry]
+ :param boxes_b: (M, 7) [x, y, z, h, w, l, ry]
+ :return:
+ ans_iou: (M, N)
+ """
+ boxes_a_bev = boxes3d_to_bev_(boxes_a)
+ boxes_b_bev = boxes3d_to_bev_(boxes_b)
+ # bev overlap
+ num_a = boxes_a_bev.shape[0]
+ num_b = boxes_b_bev.shape[0]
+ overlaps_bev = np.zeros((num_a, num_b), dtype=np.float32)
+ for i in range(num_a):
+ for j in range(num_b):
+ overlaps_bev[i][j] = intersection_area(boxes_a_bev[i], boxes_b_bev[j])
+
+ # height overlap
+ boxes_a_height_min = (boxes_a[:, 1] - boxes_a[:, 3]).reshape(-1, 1)
+ boxes_a_height_max = boxes_a[:, 1].reshape(-1, 1)
+ boxes_b_height_min = (boxes_b[:, 1] - boxes_b[:, 3]).reshape(1, -1)
+ boxes_b_height_max = boxes_b[:, 1].reshape(1, -1)
+
+ max_of_min = np.maximum(boxes_a_height_min, boxes_b_height_min)
+ min_of_max = np.minimum(boxes_a_height_max, boxes_b_height_max)
+ overlaps_h = np.clip(min_of_max - max_of_min, a_min=0, a_max=np.inf)
+ # 3d iou
+ overlaps_3d = overlaps_bev * overlaps_h
+
+ vol_a = (boxes_a[:, 3] * boxes_a[:, 4] * boxes_a[:, 5]).reshape(-1, 1)
+ vol_b = (boxes_b[:, 3] * boxes_b[:, 4] * boxes_b[:, 5]).reshape(1, -1)
+
+ iou3d = overlaps_3d / np.clip(vol_a + vol_b - overlaps_3d, a_min=1e-7, a_max=np.inf)
+ return iou3d
+
+#if __name__ == '__main__':
+# # (x1, y1, x2, y2, rotation)
+# r1 = (10, 15, 15, 10, 30)
+# r2 = (15, 15, 20, 10, 0)
+# print(intersection_area(r1, r2))
diff --git a/PaddleCV/Paddle3D/PointRCNN/utils/cyops/kitti_utils.pyx b/PaddleCV/Paddle3D/PointRCNN/utils/cyops/kitti_utils.pyx
new file mode 100644
index 0000000000000000000000000000000000000000..593dd0c9354516a2861701c5103f8e9b10ae46b1
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/utils/cyops/kitti_utils.pyx
@@ -0,0 +1,346 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
+#
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
+
+import cython
+import numpy as np
+cimport numpy as np
+
+@cython.boundscheck(False)
+@cython.wraparound(False)
+def pts_in_boxes3d(np.ndarray pts_rect, np.ndarray boxes3d):
+ """
+ :param pts: (N, 3) in rect-camera coords
+ :param boxes3d: (M, 7)
+ :return: boxes_pts_mask_list: (M), list with [(N), (N), ..]
+ """
+ cdef float MAX_DIS = 10.0
+ cdef np.ndarray boxes_pts_mask_list = np.zeros((boxes3d.shape[0], pts_rect.shape[0]), dtype='int32')
+ cdef int boxes3d_num = boxes3d.shape[0]
+ cdef int pts_rect_num = pts_rect.shape[0]
+ cdef float cx, by, cz, h, w, l, angle, cy, cosa, sina, x_rot, z_rot
+ cdef float x, y, z  # pts_rect coords are floats; int would truncate them
+
+ for i in range(boxes3d_num):
+ cx, by, cz, h, w, l, angle = boxes3d[i, :]
+ cy = by - h / 2.
+ cosa = np.cos(angle)
+ sina = np.sin(angle)
+ for j in range(pts_rect_num):
+ x, y, z = pts_rect[j, :]
+
+ if np.abs(x - cx) > MAX_DIS or np.abs(y - cy) > h / 2. or np.abs(z - cz) > MAX_DIS:
+ continue
+
+ x_rot = (x - cx) * cosa + (z - cz) * (-sina)
+ z_rot = (x - cx) * sina + (z - cz) * cosa
+ boxes_pts_mask_list[i, j] = int(x_rot >= -l / 2. and x_rot <= l / 2. and
+ z_rot >= -w / 2. and z_rot <= w / 2.)
+ return boxes_pts_mask_list
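+
+
+# Example (hypothetical arrays): mask = pts_in_boxes3d(pts_rect, boxes3d)
+# yields an (M, N) int32 mask; pts_rect[mask[i] > 0] selects points in box i.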
+
+
+@cython.boundscheck(False)
+@cython.wraparound(False)
+def rotate_pc_along_y(np.ndarray pc, float rot_angle):
+ """
+ params pc: (N, 3+C), (N, 3) is in the rectified camera coordinate
+ params rot_angle: rad scalar
+ Output pc: updated pc with XYZ rotated
+ """
+ cosval = np.cos(rot_angle)
+ sinval = np.sin(rot_angle)
+ rotmat = np.array([[cosval, -sinval], [sinval, cosval]])
+ pc[:, [0, 2]] = np.dot(pc[:, [0, 2]], np.transpose(rotmat))
+ return pc
+
+
+@cython.boundscheck(False)
+@cython.wraparound(False)
+def rotate_pc_along_y_np(np.ndarray pc, np.ndarray rot_angle):
+ """
+ :param pc: (N, 512, 3 + C)
+ :param rot_angle: (N)
+ :return:
+ TODO: merge with rotate_pc_along_y_torch in bbox_transform.py
+ """
+ cdef np.ndarray cosa, sina, raw_1, raw_2, R, pc_temp
+ cosa = np.cos(rot_angle).reshape(-1, 1)
+ sina = np.sin(rot_angle).reshape(-1, 1)
+ raw_1 = np.concatenate([cosa, -sina], axis=1)
+ raw_2 = np.concatenate([sina, cosa], axis=1)
+ # # (N, 2, 2)
+ R = np.concatenate((np.expand_dims(raw_1, axis=1), np.expand_dims(raw_2, axis=1)), axis=1)
+ pc_temp = pc[:, :, [0, 2]]
+ pc[:, :, [0, 2]] = np.matmul(pc_temp, R.transpose(0, 2, 1))
+
+ return pc
+
+
+@cython.boundscheck(False)
+@cython.wraparound(False)
+def enlarge_box3d(np.ndarray boxes3d, float extra_width):
+ """
+ :param boxes3d: (N, 7) [x, y, z, h, w, l, ry]
+ """
+ cdef np.ndarray large_boxes3d
+ if isinstance(boxes3d, np.ndarray):
+ large_boxes3d = boxes3d.copy()
+ else:
+ large_boxes3d = boxes3d.clone()
+ large_boxes3d[:, 3:6] += extra_width * 2
+ large_boxes3d[:, 1] += extra_width
+
+ return large_boxes3d
+
+
+@cython.boundscheck(False)
+@cython.wraparound(False)
+def boxes3d_to_corners3d(np.ndarray boxes3d, bint rotate=True):
+ """
+ :param boxes3d: (N, 7) [x, y, z, h, w, l, ry]
+ :param rotate:
+ :return: corners3d: (N, 8, 3)
+ """
+ cdef int boxes_num = boxes3d.shape[0]
+ cdef np.ndarray h, w, l
+ h, w, l = boxes3d[:, 3], boxes3d[:, 4], boxes3d[:, 5]
+ cdef np.ndarray x_corners, y_corners, z_corners
+ x_corners = np.array([l / 2., l / 2., -l / 2., -l / 2., l / 2., l / 2., -l / 2., -l / 2.], dtype=np.float32).T # (N, 8)
+ z_corners = np.array([w / 2., -w / 2., -w / 2., w / 2., w / 2., -w / 2., -w / 2., w / 2.], dtype=np.float32).T # (N, 8)
+
+ y_corners = np.zeros((boxes_num, 8), dtype=np.float32)
+ y_corners[:, 4:8] = -h.reshape(boxes_num, 1).repeat(4, axis=1) # (N, 8)
+
+ cdef np.ndarray ry, zeros, ones, rot_list, R_list, temp_corners, rotated_corners
+ if rotate:
+ ry = boxes3d[:, 6]
+ zeros, ones = np.zeros(ry.size, dtype=np.float32), np.ones(ry.size, dtype=np.float32)
+ rot_list = np.array([[np.cos(ry), zeros, -np.sin(ry)],
+ [zeros, ones, zeros],
+ [np.sin(ry), zeros, np.cos(ry)]]) # (3, 3, N)
+ R_list = np.transpose(rot_list, (2, 0, 1)) # (N, 3, 3)
+
+ temp_corners = np.concatenate((x_corners.reshape(-1, 8, 1), y_corners.reshape(-1, 8, 1),
+ z_corners.reshape(-1, 8, 1)), axis=2) # (N, 8, 3)
+ rotated_corners = np.matmul(temp_corners, R_list) # (N, 8, 3)
+ x_corners, y_corners, z_corners = rotated_corners[:, :, 0], rotated_corners[:, :, 1], rotated_corners[:, :, 2]
+
+ cdef np.ndarray x_loc, y_loc, z_loc
+ x_loc, y_loc, z_loc = boxes3d[:, 0], boxes3d[:, 1], boxes3d[:, 2]
+
+ cdef np.ndarray x, y, z, corners
+ x = x_loc.reshape(-1, 1) + x_corners.reshape(-1, 8)
+ y = y_loc.reshape(-1, 1) + y_corners.reshape(-1, 8)
+ z = z_loc.reshape(-1, 1) + z_corners.reshape(-1, 8)
+
+ corners = np.concatenate((x.reshape(-1, 8, 1), y.reshape(-1, 8, 1), z.reshape(-1, 8, 1)), axis=2).astype(np.float32)
+
+ return corners
+
+
+@cython.boundscheck(False)
+@cython.wraparound(False)
+def objs_to_boxes3d(obj_list):
+ cdef np.ndarray boxes3d = np.zeros((len(obj_list), 7), dtype=np.float32)
+ cdef int k
+ for k, obj in enumerate(obj_list):
+ boxes3d[k, 0:3], boxes3d[k, 3], boxes3d[k, 4], boxes3d[k, 5], boxes3d[k, 6] \
+ = obj.pos, obj.h, obj.w, obj.l, obj.ry
+ return boxes3d
+
+
+@cython.boundscheck(False)
+@cython.wraparound(False)
+def objs_to_scores(obj_list):
+ cdef np.ndarray scores = np.zeros((len(obj_list)), dtype=np.float32)
+ cdef int k
+ for k, obj in enumerate(obj_list):
+ scores[k] = obj.score
+ return scores
+
+
+def get_iou3d(np.ndarray corners3d, np.ndarray query_corners3d, bint need_bev=False):
+ """
+ :param corners3d: (N, 8, 3) in rect coords
+ :param query_corners3d: (M, 8, 3)
+ :return:
+ """
+ from shapely.geometry import Polygon
+ A, B = corners3d, query_corners3d
+ N, M = A.shape[0], B.shape[0]
+ iou3d = np.zeros((N, M), dtype=np.float32)
+ iou_bev = np.zeros((N, M), dtype=np.float32)
+
+ # for height overlap, since y face down, use the negative y
+ min_h_a = -A[:, 0:4, 1].sum(axis=1) / 4.0
+ max_h_a = -A[:, 4:8, 1].sum(axis=1) / 4.0
+ min_h_b = -B[:, 0:4, 1].sum(axis=1) / 4.0
+ max_h_b = -B[:, 4:8, 1].sum(axis=1) / 4.0
+
+ for i in range(N):
+ for j in range(M):
+ max_of_min = np.max([min_h_a[i], min_h_b[j]])
+ min_of_max = np.min([max_h_a[i], max_h_b[j]])
+ h_overlap = np.max([0, min_of_max - max_of_min])
+ if h_overlap == 0:
+ continue
+
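+            # bird's-eye-view footprints built from the four bottom corners (x, z)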
+ bottom_a, bottom_b = Polygon(A[i, 0:4, [0, 2]].T), Polygon(B[j, 0:4, [0, 2]].T)
+ if bottom_a.is_valid and bottom_b.is_valid:
+                # validity check: a Polygon with self-intersecting rings is invalid, so treat its overlap as zero
+ bottom_overlap = bottom_a.intersection(bottom_b).area
+ else:
+ bottom_overlap = 0.
+ overlap3d = bottom_overlap * h_overlap
+ union3d = bottom_a.area * (max_h_a[i] - min_h_a[i]) + bottom_b.area * (max_h_b[j] - min_h_b[j]) - overlap3d
+ iou3d[i][j] = overlap3d / union3d
+ iou_bev[i][j] = bottom_overlap / (bottom_a.area + bottom_b.area - bottom_overlap)
+
+ if need_bev:
+ return iou3d, iou_bev
+
+ return iou3d
+
+
+def get_objects_from_label(label_file):
+ import utils.object3d as object3d
+
+ with open(label_file, 'r') as f:
+ lines = f.readlines()
+ objects = [object3d.Object3d(line) for line in lines]
+ return objects
+
+
+@cython.boundscheck(False)
+@cython.wraparound(False)
+def _rotate_pc_along_y(np.ndarray pc, np.ndarray angle):
+    cdef np.ndarray cosa = np.cos(angle).reshape(-1, 1)
+    cdef np.ndarray sina = np.sin(angle).reshape(-1, 1)
+
+ cdef np.ndarray R = np.concatenate([cosa, -sina, sina, cosa], axis=-1)
+ R = R.reshape(-1, 2, 2)
+    cdef np.ndarray pc_temp = pc[:, [0, 2]].reshape(-1, 1, 2)
+    cdef np.ndarray pc_temp_1 = np.matmul(pc_temp, R.transpose(0, 2, 1)).reshape(-1, 2)
+    pc[:, [0, 2]] = pc_temp_1
+
+ return pc
+
+@cython.boundscheck(False)
+@cython.wraparound(False)
+def decode_bbox_target(
+ np.ndarray roi_box3d,
+ np.ndarray pred_reg,
+ np.ndarray anchor_size,
+ float loc_scope,
+ float loc_bin_size,
+ int num_head_bin,
+ bint get_xz_fine=True,
+ float loc_y_scope=0.5,
+ float loc_y_bin_size=0.25,
+ bint get_y_by_bin=False,
+ bint get_ry_fine=False):
+
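+    # pred_reg channel layout per ROI: [x_bin | z_bin | (x_res | z_res if get_xz_fine)
+    # | y bins+residuals or a single y offset | ry_bin | ry_res | 3 size residuals]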
+ cdef int per_loc_bin_num = int(loc_scope / loc_bin_size) * 2
+ cdef int loc_y_bin_num = int(loc_y_scope / loc_y_bin_size) * 2
+
+ # recover xz localization
+ cdef int x_bin_l = 0
+ cdef int x_bin_r = per_loc_bin_num
+    cdef int z_bin_l = per_loc_bin_num
+ cdef int z_bin_r = per_loc_bin_num * 2
+ cdef int start_offset = z_bin_r
+ cdef np.ndarray x_bin = np.argmax(pred_reg[:, x_bin_l: x_bin_r], axis=1)
+ cdef np.ndarray z_bin = np.argmax(pred_reg[:, z_bin_l: z_bin_r], axis=1)
+
+ cdef np.ndarray pos_x = x_bin.astype('float32') * loc_bin_size + loc_bin_size / 2 - loc_scope
+ cdef np.ndarray pos_z = z_bin.astype('float32') * loc_bin_size + loc_bin_size / 2 - loc_scope
+
+ if get_xz_fine:
+ x_res_l, x_res_r = per_loc_bin_num * 2, per_loc_bin_num * 3
+ z_res_l, z_res_r = per_loc_bin_num * 3, per_loc_bin_num * 4
+ start_offset = z_res_r
+
+ x_res_norm = pred_reg[:, x_res_l:x_res_r][np.arange(len(x_bin)), x_bin]
+ z_res_norm = pred_reg[:, z_res_l:z_res_r][np.arange(len(z_bin)), z_bin]
+
+ x_res = x_res_norm * loc_bin_size
+ z_res = z_res_norm * loc_bin_size
+ pos_x += x_res
+ pos_z += z_res
+
+ # recover y localization
+ if get_y_by_bin:
+ y_bin_l, y_bin_r = start_offset, start_offset + loc_y_bin_num
+ y_res_l, y_res_r = y_bin_r, y_bin_r + loc_y_bin_num
+ start_offset = y_res_r
+
+ y_bin = np.argmax(pred_reg[:, y_bin_l: y_bin_r], axis=1)
+ y_res_norm = pred_reg[:, y_res_l:y_res_r][np.arange(len(y_bin)), y_bin]
+ y_res = y_res_norm * loc_y_bin_size
+ pos_y = y_bin.astype('float32') * loc_y_bin_size + loc_y_bin_size / 2 - loc_y_scope + y_res
+ pos_y = pos_y + np.array(roi_box3d[:, 1]).reshape(-1)
+ else:
+ y_offset_l, y_offset_r = start_offset, start_offset + 1
+ start_offset = y_offset_r
+
+ pos_y = np.array(roi_box3d[:, 1]) + np.array(pred_reg[:, y_offset_l])
+ pos_y = pos_y.reshape(-1)
+
+ # recover ry rotation
+    cdef int ry_bin_l = start_offset
+    cdef int ry_bin_r = start_offset + num_head_bin
+    cdef int ry_res_l = ry_bin_r
+    cdef int ry_res_r = ry_bin_r + num_head_bin
+
+ cdef np.ndarray ry_bin = np.argmax(pred_reg[:, ry_bin_l: ry_bin_r], axis=1)
+ cdef np.ndarray ry_res_norm = pred_reg[:, ry_res_l:ry_res_r][np.arange(len(ry_bin)), ry_bin]
+ if get_ry_fine:
+ # divide pi/2 into several bins
+ angle_per_class = (np.pi / 2) / num_head_bin
+ ry_res = ry_res_norm * (angle_per_class / 2)
+ ry = (ry_bin.astype('float32') * angle_per_class + angle_per_class / 2) + ry_res - np.pi / 4
+ else:
+ angle_per_class = (2 * np.pi) / num_head_bin
+ ry_res = ry_res_norm * (angle_per_class / 2)
+
+ # bin_center is (0, 30, 60, 90, 120, ..., 270, 300, 330)
+ ry = np.fmod(ry_bin.astype('float32') * angle_per_class + ry_res, 2 * np.pi)
+ ry[ry > np.pi] -= 2 * np.pi
+
+ # recover size
+ cdef int size_res_l = ry_res_r
+ cdef int size_res_r = ry_res_r + 3
+ assert size_res_r == pred_reg.shape[1]
+
+ cdef np.ndarray size_res_norm = pred_reg[:, size_res_l: size_res_r]
+ cdef np.ndarray hwl = size_res_norm * anchor_size + anchor_size
+
+ # shift to original coords
+ cdef np.ndarray roi_center = np.array(roi_box3d[:, 0:3])
+ cdef np.ndarray shift_ret_box3d = np.concatenate((
+ pos_x.reshape(-1, 1),
+ pos_y.reshape(-1, 1),
+ pos_z.reshape(-1, 1),
+ hwl, ry.reshape(-1, 1)), axis=1)
+ ret_box3d = shift_ret_box3d
+ if roi_box3d.shape[1] == 7:
+ roi_ry = np.array(roi_box3d[:, 6]).reshape(-1)
+ ret_box3d = _rotate_pc_along_y(np.array(shift_ret_box3d), -roi_ry)
+ ret_box3d[:, 6] += roi_ry
+ ret_box3d[:, [0, 2]] += roi_center[:, [0, 2]]
+
+ return ret_box3d
diff --git a/PaddleCV/Paddle3D/PointRCNN/utils/cyops/object3d.py b/PaddleCV/Paddle3D/PointRCNN/utils/cyops/object3d.py
new file mode 100644
index 0000000000000000000000000000000000000000..97d81421afa89a0e26daa4f956c4d835763cb966
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/utils/cyops/object3d.py
@@ -0,0 +1,107 @@
+"""
+This code is borrow from https://github.com/sshaoshuai/PointRCNN/blob/master/lib/utils/object3d.py
+"""
+import numpy as np
+
+
+def cls_type_to_id(cls_type):
+    type_to_id = {'Car': 1, 'Pedestrian': 2, 'Cyclist': 3, 'Van': 4}
+    return type_to_id.get(cls_type, -1)
+
+
+class Object3d(object):
+
+ def __init__(self, line):
+ label = line.strip().split(' ')
+ self.src = line
+ self.cls_type = label[0]
+ self.cls_id = cls_type_to_id(self.cls_type)
+ self.trucation = float(label[1])
+ self.occlusion = float(label[2]) # 0:fully visible 1:partly occluded 2:largely occluded 3:unknown
+ self.alpha = float(label[3])
+ self.box2d = np.array((float(label[4]), float(label[5]), float(label[6]), float(label[7])), dtype=np.float32)
+ self.h = float(label[8])
+ self.w = float(label[9])
+ self.l = float(label[10])
+ self.pos = np.array((float(label[11]), float(label[12]), float(label[13])), dtype=np.float32)
+ self.dis_to_cam = np.linalg.norm(self.pos)
+ self.ry = float(label[14])
+        self.score = float(label[15]) if len(label) == 16 else -1.0
+ self.level_str = None
+ self.level = self.get_obj_level()
+
+ def get_obj_level(self):
+ height = float(self.box2d[3]) - float(self.box2d[1]) + 1
+
+ if height >= 40 and self.trucation <= 0.15 and self.occlusion <= 0:
+ self.level_str = 'Easy'
+ return 1 # Easy
+ elif height >= 25 and self.trucation <= 0.3 and self.occlusion <= 1:
+ self.level_str = 'Moderate'
+ return 2 # Moderate
+ elif height >= 25 and self.trucation <= 0.5 and self.occlusion <= 2:
+ self.level_str = 'Hard'
+ return 3 # Hard
+ else:
+ self.level_str = 'UnKnown'
+ return 4
+
+ def generate_corners3d(self):
+ """
+ generate corners3d representation for this object
+ :return corners_3d: (8, 3) corners of box3d in camera coord
+ """
+ l, h, w = self.l, self.h, self.w
+ x_corners = [l / 2, l / 2, -l / 2, -l / 2, l / 2, l / 2, -l / 2, -l / 2]
+ y_corners = [0, 0, 0, 0, -h, -h, -h, -h]
+ z_corners = [w / 2, -w / 2, -w / 2, w / 2, w / 2, -w / 2, -w / 2, w / 2]
+
+ R = np.array([[np.cos(self.ry), 0, np.sin(self.ry)],
+ [0, 1, 0],
+ [-np.sin(self.ry), 0, np.cos(self.ry)]])
+ corners3d = np.vstack([x_corners, y_corners, z_corners]) # (3, 8)
+ corners3d = np.dot(R, corners3d).T
+ corners3d = corners3d + self.pos
+ return corners3d
+
+ def to_bev_box2d(self, oblique=True, voxel_size=0.1):
+ """
+ :param bev_shape: (2) for bev shape (h, w), => (y_max, x_max) in image
+ :param voxel_size: float, 0.1m
+ :param oblique:
+ :return: box2d (4, 2)/ (4) in image coordinate
+ """
+ if oblique:
+ corners3d = self.generate_corners3d()
+ xz_corners = corners3d[0:4, [0, 2]]
+ box2d = np.zeros((4, 2), dtype=np.int32)
+ box2d[:, 0] = ((xz_corners[:, 0] - Object3d.MIN_XZ[0]) / voxel_size).astype(np.int32)
+ box2d[:, 1] = Object3d.BEV_SHAPE[0] - 1 - ((xz_corners[:, 1] - Object3d.MIN_XZ[1]) / voxel_size).astype(np.int32)
+ box2d[:, 0] = np.clip(box2d[:, 0], 0, Object3d.BEV_SHAPE[1])
+ box2d[:, 1] = np.clip(box2d[:, 1], 0, Object3d.BEV_SHAPE[0])
+ else:
+ box2d = np.zeros(4, dtype=np.int32)
+ # discrete_center = np.floor((self.pos / voxel_size)).astype(np.int32)
+ cu = np.floor((self.pos[0] - Object3d.MIN_XZ[0]) / voxel_size).astype(np.int32)
+ cv = Object3d.BEV_SHAPE[0] - 1 - ((self.pos[2] - Object3d.MIN_XZ[1]) / voxel_size).astype(np.int32)
+ half_l, half_w = int(self.l / voxel_size / 2), int(self.w / voxel_size / 2)
+ box2d[0], box2d[1] = cu - half_l, cv - half_w
+ box2d[2], box2d[3] = cu + half_l, cv + half_w
+
+ return box2d
+
+ def to_str(self):
+ print_str = '%s %.3f %.3f %.3f box2d: %s hwl: [%.3f %.3f %.3f] pos: %s ry: %.3f' \
+ % (self.cls_type, self.trucation, self.occlusion, self.alpha, self.box2d, self.h, self.w, self.l,
+ self.pos, self.ry)
+ return print_str
+
+ def to_kitti_format(self):
+ kitti_str = '%s %.2f %d %.2f %.2f %.2f %.2f %.2f %.2f %.2f %.2f %.2f %.2f %.2f %.2f' \
+ % (self.cls_type, self.trucation, int(self.occlusion), self.alpha, self.box2d[0], self.box2d[1],
+ self.box2d[2], self.box2d[3], self.h, self.w, self.l, self.pos[0], self.pos[1], self.pos[2],
+ self.ry)
+ return kitti_str
+
diff --git a/PaddleCV/Paddle3D/PointRCNN/utils/cyops/roipool3d_utils.pyx b/PaddleCV/Paddle3D/PointRCNN/utils/cyops/roipool3d_utils.pyx
new file mode 100644
index 0000000000000000000000000000000000000000..3efa83135fed11d3e3a3daceb821c63424beb524
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/utils/cyops/roipool3d_utils.pyx
@@ -0,0 +1,160 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
+#
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
+
+import numpy as np
+cimport numpy as np
+cimport cython
+from libc.math cimport sin, cos
+
+@cython.boundscheck(False)
+@cython.wraparound(False)
+cdef enlarge_box3d(np.ndarray boxes3d, float extra_width):
+ """
+ :param boxes3d: (N, 7) [x, y, z, h, w, l, ry]
+ """
+ if isinstance(boxes3d, np.ndarray):
+ large_boxes3d = boxes3d.copy()
+ else:
+ large_boxes3d = boxes3d.clone()
+ large_boxes3d[:, 3:6] += extra_width * 2
+ large_boxes3d[:, 1] += extra_width
+ return large_boxes3d
+
+@cython.boundscheck(False)
+@cython.wraparound(False)
+cdef pt_in_box(float x, float y, float z, float cx, float bottom_y, float cz, float h, float w, float l, float angle):
+    cdef float max_dis = 10.0
+    cdef float cy = bottom_y - h / 2.0
+    if ((abs(x - cx) > max_dis) or (abs(y - cy) > h / 2.0) or (abs(z - cz) > max_dis)):
+ return 0
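+    # rotate the point into the box's local (l, w) frame before the extent test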
+ cdef float cosa = cos(angle)
+ cdef float sina = sin(angle)
+    cdef float x_rot = (x - cx) * cosa + (z - cz) * (-sina)
+    cdef float z_rot = (x - cx) * sina + (z - cz) * cosa
+    cdef bint flag = (x_rot >= -l / 2.0) and (x_rot <= l / 2.0) and (z_rot >= -w / 2.0) and (z_rot <= w / 2.0)
+ return flag
+
+@cython.boundscheck(False)
+@cython.wraparound(False)
+cdef _rotate_pc_along_y(np.ndarray pc, float rot_angle):
+ """
+ params pc: (N, 3+C), (N, 3) is in the rectified camera coordinate
+ params rot_angle: rad scalar
+ Output pc: updated pc with XYZ rotated
+ """
+ cosval = np.cos(rot_angle)
+ sinval = np.sin(rot_angle)
+ rotmat = np.array([[cosval, -sinval], [sinval, cosval]])
+ pc[:, [0, 2]] = np.dot(pc[:, [0, 2]], np.transpose(rotmat))
+ return pc
+
+@cython.boundscheck(False)
+@cython.wraparound(False)
+def roipool3d_cpu(
+ np.ndarray[float, ndim=2] pts,
+ np.ndarray[float, ndim=2] pts_feature,
+ np.ndarray[float, ndim=2] boxes3d,
+ np.ndarray[float, ndim=2] pts_extra_input,
+    float pool_extra_width, int sampled_pt_num, int batch_size=1, bint canonical_transform=False):
+ cdef np.ndarray pts_feature_all = np.concatenate((pts_extra_input, pts_feature), axis=1)
+
+    cdef np.ndarray enlarged_boxes3d = enlarge_box3d(boxes3d.reshape(-1, 7), pool_extra_width).reshape(batch_size, -1, 7)
+
+    cdef int pts_num = pts.shape[0]
+ cdef int boxes_num = boxes3d.shape[0]
+ cdef int feature_len = pts_feature_all.shape[1]
+ cdef np.ndarray pts_data = np.zeros((batch_size, boxes_num, sampled_pt_num, 3))
+ cdef np.ndarray features_data = np.zeros((batch_size, boxes_num, sampled_pt_num, feature_len))
+ cdef np.ndarray empty_flag_data = np.zeros((batch_size, boxes_num))
+
+ cdef int cnt = 0
+ cdef float cx = 0.
+ cdef float bottom_y = 0.
+ cdef float cz = 0.
+ cdef float h = 0.
+ cdef float w = 0.
+ cdef float l = 0.
+ cdef float ry = 0.
+ cdef float x = 0.
+ cdef float y = 0.
+ cdef float z = 0.
+ cdef np.ndarray x_i
+ cdef np.ndarray feat_i
+ cdef int bs
+ cdef int i
+ cdef int j
+ for bs in range(batch_size):
+ # boxes: 64,7
+ for i in range(boxes_num):
+ cnt = 0
+ # box
+            box = enlarged_boxes3d[bs][i]
+ cx = box[0]
+ bottom_y = box[1]
+ cz = box[2]
+ h = box[3]
+ w = box[4]
+ l = box[5]
+ ry = box[6]
+ # points: 16384,3
+ x_i = pts
+ # features: 16384, 128
+ feat_i = pts_feature_all
+
+ for j in range(pts_num):
+ x = x_i[j][0]
+ y = x_i[j][1]
+ z = x_i[j][2]
+                cur_in_flag = pt_in_box(x, y, z, cx, bottom_y, cz, h, w, l, ry)
+ if cur_in_flag:
+ if cnt < sampled_pt_num:
+ pts_data[bs][i][cnt][:] = x_i[j]
+ features_data[bs][i][cnt][:] = feat_i[j]
+ cnt += 1
+ else:
+ break
+
+ if cnt == 0:
+ empty_flag_data[bs][i] = 1
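+            # fewer points than sampled_pt_num: pad by cycling the points already collected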
+ elif (cnt < sampled_pt_num):
+ for k in range(cnt, sampled_pt_num):
+ pts_data[bs][i][k] = pts_data[bs][i][k % cnt]
+ features_data[bs][i][k] = features_data[bs][i][k % cnt]
+
+
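+    # only the first batch element is returned, matching the batch_size=1 default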
+ pooled_pts = pts_data.astype("float32")[0]
+ pooled_features = features_data.astype('float32')[0]
+ pooled_empty_flag = empty_flag_data.astype('int64')[0]
+
+ cdef int extra_input_len = pts_extra_input.shape[1]
+ pooled_pts = np.concatenate((pooled_pts, pooled_features[:,:,0:extra_input_len]),axis=2)
+ pooled_features = pooled_features[:,:,extra_input_len:]
+
+ if canonical_transform:
+ # Translate to the roi coordinates
+ roi_ry = boxes3d[:, 6] % (2 * np.pi) # 0~2pi
+ roi_center = boxes3d[:, 0:3]
+ # shift to center
+ pooled_pts[:, :, 0:3] = pooled_pts[:, :, 0:3] - roi_center[:, np.newaxis, :]
+ for k in range(pooled_pts.shape[0]):
+ pooled_pts[k] = _rotate_pc_along_y(pooled_pts[k], roi_ry[k])
+
+    return pooled_pts, pooled_features, pooled_empty_flag
+
diff --git a/PaddleCV/Paddle3D/PointRCNN/utils/cyops/setup.py b/PaddleCV/Paddle3D/PointRCNN/utils/cyops/setup.py
new file mode 100644
index 0000000000000000000000000000000000000000..0d775017468bbb683d0ea0f0058062e5de12da73
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/utils/cyops/setup.py
@@ -0,0 +1,74 @@
+# Copyright (c) 2017-present, Facebook, Inc.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+##############################################################################
+
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
+from Cython.Build import cythonize
+from setuptools import Extension
+from setuptools import setup
+
+import numpy as np
+
+_NP_INCLUDE_DIRS = np.get_include()
+
+
+# Extension modules
+ext_modules = [
+ Extension(
+ name='utils.cyops.roipool3d_utils',
+ sources=[
+ 'utils/cyops/roipool3d_utils.pyx'
+ ],
+ extra_compile_args=[
+ '-Wno-cpp'
+ ],
+ include_dirs=[
+ _NP_INCLUDE_DIRS
+ ]
+ ),
+
+ Extension(
+ name='utils.cyops.iou3d_utils',
+ sources=[
+ 'utils/cyops/iou3d_utils.pyx'
+ ],
+ extra_compile_args=[
+ '-Wno-cpp'
+ ],
+ include_dirs=[
+ _NP_INCLUDE_DIRS
+ ]
+ ),
+
+ Extension(
+ name='utils.cyops.kitti_utils',
+ sources=[
+ 'utils/cyops/kitti_utils.pyx'
+ ],
+ extra_compile_args=[
+ '-Wno-cpp'
+ ],
+ include_dirs=[
+ _NP_INCLUDE_DIRS
+ ]
+ ),
+]
+
+setup(
+ name='pp_pointrcnn',
+ ext_modules=cythonize(ext_modules)
+)
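+
+# Typical usage (assumed workflow): build the extensions in place from the
+# PointRCNN root so the 'utils.cyops' module names resolve, e.g.:
+#     python setup.py build_ext --inplace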
diff --git a/PaddleCV/Paddle3D/PointRCNN/utils/metric_utils.py b/PaddleCV/Paddle3D/PointRCNN/utils/metric_utils.py
new file mode 100644
index 0000000000000000000000000000000000000000..aa7ee70652ac4e76aef9f4d755ec057ef2bc9123
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/utils/metric_utils.py
@@ -0,0 +1,216 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import os
+import sys
+import logging
+import numpy as np
+import utils.cyops.kitti_utils as kitti_utils
+from utils.config import cfg
+from utils.box_utils import boxes_iou3d, box_nms_eval, boxes3d_to_bev
+from utils.save_utils import save_rpn_feature, save_kitti_result, save_kitti_format
+
+__all__ = ['calc_iou_recall', 'rpn_metric', 'rcnn_metric']
+
+logging.root.handlers = []
+FORMAT = '%(asctime)s-%(levelname)s: %(message)s'
+logging.basicConfig(level=logging.INFO, format=FORMAT, stream=sys.stdout)
+logger = logging.getLogger(__name__)
+
+
+def calc_iou_recall(rets, thresh_list):
+ rpn_cls_label = rets['rpn_cls_label'][0]
+ boxes3d = rets['rois'][0]
+ seg_mask = rets['seg_mask'][0]
+ sample_id = rets['sample_id'][0]
+ gt_boxes3d = rets['gt_boxes3d'][0]
+ gt_boxes3d_num = rets['gt_boxes3d'][1]
+
+ gt_box_idx = 0
+ recalled_bbox_list = [0] * len(thresh_list)
+ gt_box_num = 0
+ rpn_iou_sum = 0.
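+    # walk the batch: count GT boxes recalled at each IoU threshold and
+    # accumulate the RPN foreground segmentation IoU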
+ for i in range(len(gt_boxes3d_num)):
+ cur_rpn_cls_label = rpn_cls_label[i]
+ cur_boxes3d = boxes3d[i]
+ cur_seg_mask = seg_mask[i]
+ cur_sample_id = sample_id[i]
+ cur_gt_boxes3d = gt_boxes3d[gt_box_idx: gt_box_idx +
+ gt_boxes3d_num[0][i]]
+ gt_box_idx += gt_boxes3d_num[0][i]
+
+        k = len(cur_gt_boxes3d) - 1
+ while k >= 0 and np.sum(cur_gt_boxes3d[k]) == 0:
+ k -= 1
+ cur_gt_boxes3d = cur_gt_boxes3d[:k + 1]
+
+ if cur_gt_boxes3d.shape[0] > 0:
+ iou3d = boxes_iou3d(cur_boxes3d, cur_gt_boxes3d[:, 0:7])
+ gt_max_iou = iou3d.max(axis=0)
+
+ for idx, thresh in enumerate(thresh_list):
+ recalled_bbox_list[idx] += np.sum(gt_max_iou > thresh)
+            gt_box_num += len(cur_gt_boxes3d)
+
+ fg_mask = cur_rpn_cls_label > 0
+ correct = np.sum(np.logical_and(
+ cur_seg_mask == cur_rpn_cls_label, fg_mask))
+ union = np.sum(fg_mask) + np.sum(cur_seg_mask > 0) - correct
+ rpn_iou = float(correct) / max(float(union), 1.0)
+ rpn_iou_sum += rpn_iou
+        logger.debug('sample_id:{}, rpn_iou:{}, gt_box_num:{}, recalled_bbox_list:{}'.format(
+            cur_sample_id, rpn_iou, gt_box_num, str(recalled_bbox_list)))
+
+ return len(gt_boxes3d_num), gt_box_num, rpn_iou_sum, recalled_bbox_list
+
+
+def rpn_metric(queue, mdict, lock, thresh_list, is_save_rpn_feature, kitti_feature_dir,
+ seg_output_dir, kitti_output_dir, kitti_rcnn_reader, classes):
+ while True:
+ rets_dict = queue.get()
+ if rets_dict is None:
+ lock.acquire()
+ mdict['exit_proc'] += 1
+ lock.release()
+ return
+
+ cnt, gt_box_num, rpn_iou_sum, recalled_bbox_list = calc_iou_recall(
+ rets_dict, thresh_list)
+ lock.acquire()
+ mdict['total_cnt'] += cnt
+ mdict['total_gt_bbox'] += gt_box_num
+ mdict['total_rpn_iou'] += rpn_iou_sum
+ for i, bbox_num in enumerate(recalled_bbox_list):
+ mdict['total_recalled_bbox_list_{}'.format(i)] += bbox_num
+ logger.debug("rpn_metric: {}".format(str(mdict)))
+ lock.release()
+
+ if is_save_rpn_feature:
+ save_rpn_feature(rets_dict, kitti_feature_dir)
+ save_kitti_result(
+ rets_dict, seg_output_dir, kitti_output_dir, kitti_rcnn_reader, classes)
+
+
+def rcnn_metric(queue, mdict, lock, thresh_list, kitti_rcnn_reader, roi_output_dir,
+ refine_output_dir, final_output_dir, is_save_result=False):
+ while True:
+ rets_dict = queue.get()
+ if rets_dict is None:
+ lock.acquire()
+ mdict['exit_proc'] += 1
+ lock.release()
+ return
+
+        for k, v in rets_dict.items():
+            rets_dict[k] = v[0]
+
+ rcnn_cls = rets_dict['rcnn_cls']
+ rcnn_reg = rets_dict['rcnn_reg']
+ roi_boxes3d = rets_dict['roi_boxes3d']
+ roi_scores = rets_dict['roi_scores']
+
+ # bounding box regression
+ anchor_size = cfg.CLS_MEAN_SIZE[0]
+ pred_boxes3d = kitti_utils.decode_bbox_target(
+ roi_boxes3d,
+ rcnn_reg,
+ anchor_size=np.array(anchor_size),
+ loc_scope=cfg.RCNN.LOC_SCOPE,
+ loc_bin_size=cfg.RCNN.LOC_BIN_SIZE,
+ num_head_bin=cfg.RCNN.NUM_HEAD_BIN,
+ get_xz_fine=True,
+ get_y_by_bin=cfg.RCNN.LOC_Y_BY_BIN,
+ loc_y_scope=cfg.RCNN.LOC_Y_SCOPE,
+ loc_y_bin_size=cfg.RCNN.LOC_Y_BIN_SIZE,
+ get_ry_fine=True
+ )
+
+ # scoring
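+        # (a single-logit head is thresholded on its normalized score; a
+        # multi-class head takes the argmax class and that class's logit)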
+ if rcnn_cls.shape[1] == 1:
+ raw_scores = rcnn_cls.reshape(-1)
+ norm_scores = rets_dict['norm_scores']
+ pred_classes = norm_scores > cfg.RCNN.SCORE_THRESH
+ pred_classes = pred_classes.astype(np.float32)
+        else:
+            pred_classes = np.argmax(rcnn_cls, axis=1).reshape(-1)
+            raw_scores = rcnn_cls[np.arange(rcnn_cls.shape[0]), pred_classes]
+            # softmax so norm_scores is also defined for the multi-class head
+            exp_cls = np.exp(rcnn_cls - rcnn_cls.max(axis=1, keepdims=True))
+            norm_scores = (exp_cls / exp_cls.sum(axis=1, keepdims=True))[np.arange(rcnn_cls.shape[0]), pred_classes]
+
+ # evaluation
+ gt_iou = rets_dict['gt_iou']
+ gt_boxes3d = rets_dict['gt_boxes3d']
+
+ # recall
+ if gt_boxes3d.size > 0:
+ gt_num = gt_boxes3d.shape[1]
+ gt_boxes3d = gt_boxes3d.reshape((-1,7))
+ iou3d = boxes_iou3d(pred_boxes3d, gt_boxes3d)
+ gt_max_iou = iou3d.max(axis=0)
+ refined_iou = iou3d.max(axis=1)
+
+ recalled_num = (gt_max_iou > 0.7).sum()
+ roi_boxes3d = roi_boxes3d.reshape((-1,7))
+ iou3d_in = boxes_iou3d(roi_boxes3d, gt_boxes3d)
+ gt_max_iou_in = iou3d_in.max(axis=0)
+
+ lock.acquire()
+ mdict['total_gt_bbox'] += gt_num
+ for idx, thresh in enumerate(thresh_list):
+ recalled_bbox_num = (gt_max_iou > thresh).sum()
+ mdict['total_recalled_bbox_list_{}'.format(idx)] += recalled_bbox_num
+ for idx, thresh in enumerate(thresh_list):
+ roi_recalled_bbox_num = (gt_max_iou_in > thresh).sum()
+ mdict['total_roi_recalled_bbox_list_{}'.format(idx)] += roi_recalled_bbox_num
+ lock.release()
+
+ # classification accuracy
+ cls_label = gt_iou > cfg.RCNN.CLS_FG_THRESH
+ cls_label = cls_label.astype(np.float32)
+ cls_valid_mask = (gt_iou >= cfg.RCNN.CLS_FG_THRESH) | (gt_iou <= cfg.RCNN.CLS_BG_THRESH)
+ cls_valid_mask = cls_valid_mask.astype(np.float32)
+ cls_acc = (pred_classes == cls_label).astype(np.float32)
+ cls_acc = (cls_acc * cls_valid_mask).sum() / max(cls_valid_mask.sum(), 1.0) * 1.0
+
+ iou_thresh = 0.7 if cfg.CLASSES == 'Car' else 0.5
+ cls_label_refined = (gt_iou >= iou_thresh)
+ cls_label_refined = cls_label_refined.astype(np.float32)
+ cls_acc_refined = (pred_classes == cls_label_refined).astype(np.float32).sum() / max(cls_label_refined.shape[0], 1.0)
+
+ sample_id = rets_dict['sample_id']
+ image_shape = kitti_rcnn_reader.get_image_shape(sample_id)
+
+ if is_save_result:
+ roi_boxes3d_np = roi_boxes3d
+ pred_boxes3d_np = pred_boxes3d
+ calib = kitti_rcnn_reader.get_calib(sample_id)
+ save_kitti_format(sample_id, calib, roi_boxes3d_np, roi_output_dir, roi_scores, image_shape)
+ save_kitti_format(sample_id, calib, pred_boxes3d_np, refine_output_dir, raw_scores, image_shape)
+
+ inds = norm_scores > cfg.RCNN.SCORE_THRESH
+ if inds.astype(np.float32).sum() == 0:
+ logger.debug("The num of 'norm_scores > thresh' of sample {} is 0".format(sample_id))
+ continue
+ pred_boxes3d_selected = pred_boxes3d[inds]
+ raw_scores_selected = raw_scores[inds]
+ # NMS thresh
+ boxes_bev_selected = boxes3d_to_bev(pred_boxes3d_selected)
+ scores_selected, pred_boxes3d_selected = box_nms_eval(boxes_bev_selected, raw_scores_selected, pred_boxes3d_selected, cfg.RCNN.NMS_THRESH)
+ calib = kitti_rcnn_reader.get_calib(sample_id)
+ save_kitti_format(sample_id, calib, pred_boxes3d_selected, final_output_dir, scores_selected, image_shape)
+ lock.acquire()
+ mdict['total_det_num'] += pred_boxes3d_selected.shape[0]
+ mdict['total_cls_acc'] += cls_acc
+ mdict['total_cls_acc_refined'] += cls_acc_refined
+ lock.release()
+ logger.debug("rcnn_metric: {}".format(str(mdict)))
+
diff --git a/PaddleCV/Paddle3D/PointRCNN/utils/object3d.py b/PaddleCV/Paddle3D/PointRCNN/utils/object3d.py
new file mode 100644
index 0000000000000000000000000000000000000000..7b5703bdbfba1c1bf239c2a2c9f2179ea908a7e5
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/utils/object3d.py
@@ -0,0 +1,113 @@
+"""
+This code is borrow from https://github.com/sshaoshuai/PointRCNN/blob/master/lib/utils/object3d.py
+"""
+import numpy as np
+
+
+def cls_type_to_id(cls_type):
+    type_to_id = {'Car': 1, 'Pedestrian': 2, 'Cyclist': 3, 'Van': 4}
+    return type_to_id.get(cls_type, -1)
+
+
+def get_objects_from_label(label_file):
+ with open(label_file, 'r') as f:
+ lines = f.readlines()
+ objects = [Object3d(line) for line in lines]
+ return objects
+
+
+class Object3d(object):
+ def __init__(self, line):
+ label = line.strip().split(' ')
+ self.src = line
+ self.cls_type = label[0]
+ self.cls_id = cls_type_to_id(self.cls_type)
+ self.trucation = float(label[1])
+ self.occlusion = float(label[2]) # 0:fully visible 1:partly occluded 2:largely occluded 3:unknown
+ self.alpha = float(label[3])
+ self.box2d = np.array((float(label[4]), float(label[5]), float(label[6]), float(label[7])), dtype=np.float32)
+ self.h = float(label[8])
+ self.w = float(label[9])
+ self.l = float(label[10])
+ self.pos = np.array((float(label[11]), float(label[12]), float(label[13])), dtype=np.float32)
+ self.dis_to_cam = np.linalg.norm(self.pos)
+ self.ry = float(label[14])
+        self.score = float(label[15]) if len(label) == 16 else -1.0
+ self.level_str = None
+ self.level = self.get_obj_level()
+
+ def get_obj_level(self):
+ height = float(self.box2d[3]) - float(self.box2d[1]) + 1
+
+ if height >= 40 and self.trucation <= 0.15 and self.occlusion <= 0:
+ self.level_str = 'Easy'
+ return 1 # Easy
+ elif height >= 25 and self.trucation <= 0.3 and self.occlusion <= 1:
+ self.level_str = 'Moderate'
+ return 2 # Moderate
+ elif height >= 25 and self.trucation <= 0.5 and self.occlusion <= 2:
+ self.level_str = 'Hard'
+ return 3 # Hard
+ else:
+ self.level_str = 'UnKnown'
+ return 4
+
+ def generate_corners3d(self):
+ """
+ generate corners3d representation for this object
+ :return corners_3d: (8, 3) corners of box3d in camera coord
+ """
+ l, h, w = self.l, self.h, self.w
+ x_corners = [l / 2, l / 2, -l / 2, -l / 2, l / 2, l / 2, -l / 2, -l / 2]
+ y_corners = [0, 0, 0, 0, -h, -h, -h, -h]
+ z_corners = [w / 2, -w / 2, -w / 2, w / 2, w / 2, -w / 2, -w / 2, w / 2]
+
+ R = np.array([[np.cos(self.ry), 0, np.sin(self.ry)],
+ [0, 1, 0],
+ [-np.sin(self.ry), 0, np.cos(self.ry)]])
+ corners3d = np.vstack([x_corners, y_corners, z_corners]) # (3, 8)
+ corners3d = np.dot(R, corners3d).T
+ corners3d = corners3d + self.pos
+ return corners3d
+
+ def to_bev_box2d(self, oblique=True, voxel_size=0.1):
+ """
+ :param bev_shape: (2) for bev shape (h, w), => (y_max, x_max) in image
+ :param voxel_size: float, 0.1m
+ :param oblique:
+ :return: box2d (4, 2)/ (4) in image coordinate
+ """
+ if oblique:
+ corners3d = self.generate_corners3d()
+ xz_corners = corners3d[0:4, [0, 2]]
+ box2d = np.zeros((4, 2), dtype=np.int32)
+ box2d[:, 0] = ((xz_corners[:, 0] - Object3d.MIN_XZ[0]) / voxel_size).astype(np.int32)
+ box2d[:, 1] = Object3d.BEV_SHAPE[0] - 1 - ((xz_corners[:, 1] - Object3d.MIN_XZ[1]) / voxel_size).astype(np.int32)
+ box2d[:, 0] = np.clip(box2d[:, 0], 0, Object3d.BEV_SHAPE[1])
+ box2d[:, 1] = np.clip(box2d[:, 1], 0, Object3d.BEV_SHAPE[0])
+ else:
+ box2d = np.zeros(4, dtype=np.int32)
+ # discrete_center = np.floor((self.pos / voxel_size)).astype(np.int32)
+ cu = np.floor((self.pos[0] - Object3d.MIN_XZ[0]) / voxel_size).astype(np.int32)
+ cv = Object3d.BEV_SHAPE[0] - 1 - ((self.pos[2] - Object3d.MIN_XZ[1]) / voxel_size).astype(np.int32)
+ half_l, half_w = int(self.l / voxel_size / 2), int(self.w / voxel_size / 2)
+ box2d[0], box2d[1] = cu - half_l, cv - half_w
+ box2d[2], box2d[3] = cu + half_l, cv + half_w
+
+ return box2d
+
+ def to_str(self):
+ print_str = '%s %.3f %.3f %.3f box2d: %s hwl: [%.3f %.3f %.3f] pos: %s ry: %.3f' \
+ % (self.cls_type, self.trucation, self.occlusion, self.alpha, self.box2d, self.h, self.w, self.l,
+ self.pos, self.ry)
+ return print_str
+
+ def to_kitti_format(self):
+ kitti_str = '%s %.2f %d %.2f %.2f %.2f %.2f %.2f %.2f %.2f %.2f %.2f %.2f %.2f %.2f' \
+ % (self.cls_type, self.trucation, int(self.occlusion), self.alpha, self.box2d[0], self.box2d[1],
+ self.box2d[2], self.box2d[3], self.h, self.w, self.l, self.pos[0], self.pos[1], self.pos[2],
+ self.ry)
+ return kitti_str
+
diff --git a/PaddleCV/Paddle3D/PointRCNN/utils/optimizer.py b/PaddleCV/Paddle3D/PointRCNN/utils/optimizer.py
new file mode 100644
index 0000000000000000000000000000000000000000..e32d1df862de7692e520168a2b35f482535f3ac6
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/utils/optimizer.py
@@ -0,0 +1,122 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""Optimization and learning rate scheduling."""
+
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
+import numpy as np
+import paddle.fluid as fluid
+import paddle.fluid.layers.learning_rate_scheduler as lr_scheduler
+from paddle.fluid.layers import control_flow
+
+import logging
+logger = logging.getLogger(__name__)
+
+def cosine_warmup_decay(learning_rate, betas, warmup_factor, decay_factor,
+ total_step, warmup_pct):
+ def annealing_cos(start, end, pct):
+ "Cosine anneal from `start` to `end` as pct goes from 0.0 to 1.0."
+ cos_out = fluid.layers.cos(pct * np.pi) + 1.
+ return cos_out * (start - end) / 2. + end
+
+ warmup_start_lr = learning_rate * warmup_factor
+ decay_end_lr = learning_rate * decay_factor
+ warmup_step = total_step * warmup_pct
+
+ global_step = lr_scheduler._decay_step_counter()
+
+ lr = fluid.layers.create_global_var(
+ shape=[1],
+ value=float(learning_rate),
+ dtype='float32',
+ persistable=True,
+ name="learning_rate")
+ beta1 = fluid.layers.create_global_var(
+ shape=[1],
+ value=float(betas[0]),
+ dtype='float32',
+ persistable=True,
+ name="beta1")
+
+ warmup_step_var = fluid.layers.fill_constant(
+ shape=[1], dtype='float32', value=float(warmup_step), force_cpu=True)
+
+ with control_flow.Switch() as switch:
+ with switch.case(global_step < warmup_step_var):
+ cur_lr = annealing_cos(warmup_start_lr, learning_rate,
+ global_step / warmup_step_var)
+ fluid.layers.assign(cur_lr, lr)
+ cur_beta1 = annealing_cos(betas[0], betas[1],
+ global_step / warmup_step_var)
+ fluid.layers.assign(cur_beta1, beta1)
+ with switch.case(global_step >= warmup_step_var):
+ cur_lr = annealing_cos(learning_rate, decay_end_lr,
+ (global_step - warmup_step_var) / (total_step - warmup_step))
+ fluid.layers.assign(cur_lr, lr)
+ cur_beta1 = annealing_cos(betas[1], betas[0],
+ (global_step - warmup_step_var) / (total_step - warmup_step))
+ fluid.layers.assign(cur_beta1, beta1)
+
+ return lr, beta1
+
+
+def optimize(loss,
+ learning_rate,
+ warmup_factor,
+ decay_factor,
+ total_step,
+ warmup_pct,
+ train_program,
+ startup_prog,
+ weight_decay,
+ clip_norm,
+ beta1=[0.95, 0.85],
+ beta2=0.99,
+ scheduler='cosine_warmup_decay'):
+
+    scheduled_lr = None
+ if scheduler == 'cosine_warmup_decay':
+ scheduled_lr, scheduled_beta1 = cosine_warmup_decay(learning_rate, beta1, warmup_factor,
+ decay_factor, total_step,
+ warmup_pct)
+    else:
+        raise ValueError("Unknown learning rate scheduler, should be "
+                         "'cosine_warmup_decay'")
+
+ optimizer = fluid.optimizer.Adam(learning_rate=scheduled_lr,
+ beta1=scheduled_beta1,
+ beta2=beta2)
+ fluid.clip.set_gradient_clip(
+ clip=fluid.clip.GradientClipByGlobalNorm(clip_norm=clip_norm))
+
+ param_list = dict()
+
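+    # decoupled weight decay: snapshot each parameter before the Adam step,
+    # then subtract weight_decay * lr * snapshot after the update below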
+ if weight_decay > 0:
+ for param in train_program.global_block().all_parameters():
+ param_list[param.name] = param * 1.0
+ param_list[param.name].stop_gradient = True
+
+ _, param_grads = optimizer.minimize(loss)
+
+ if weight_decay > 0:
+ for param, grad in param_grads:
+ with param.block.program._optimized_guard(
+ [param, grad]), fluid.framework.name_scope("weight_decay"):
+ updated_param = param - param_list[
+ param.name] * weight_decay * scheduled_lr
+ fluid.layers.assign(output=param, input=updated_param)
+
+ return scheduled_lr
diff --git a/PaddleCV/Paddle3D/PointRCNN/utils/proposal_target.py b/PaddleCV/Paddle3D/PointRCNN/utils/proposal_target.py
new file mode 100644
index 0000000000000000000000000000000000000000..deda51180bfb9007f1dadd265c3f33f397b1cccf
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/utils/proposal_target.py
@@ -0,0 +1,369 @@
+import numpy as np
+from utils.cyops import kitti_utils, roipool3d_utils, iou3d_utils
+
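+# set True to replace random ROI sampling with deterministic picks (debugging aid)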
+CLOSE_RANDOM = False
+
+def get_proposal_target_func(cfg, mode='TRAIN'):
+
+ def sample_rois_for_rcnn(roi_boxes3d, gt_boxes3d):
+ """
+ :param roi_boxes3d: (B, M, 7)
+ :param gt_boxes3d: (B, N, 8) [x, y, z, h, w, l, ry, cls]
+ :return
+ batch_rois: (B, N, 7)
+            batch_gt_of_rois: (B, N, 7)
+ batch_roi_iou: (B, N)
+ """
+
+ batch_size = roi_boxes3d.shape[0]
+
+ fg_rois_per_image = int(np.round(cfg.RCNN.FG_RATIO * cfg.RCNN.ROI_PER_IMAGE))
+
+ batch_rois = np.zeros((batch_size, cfg.RCNN.ROI_PER_IMAGE, 7))
+ batch_gt_of_rois = np.zeros((batch_size, cfg.RCNN.ROI_PER_IMAGE, 7))
+ batch_roi_iou = np.zeros((batch_size, cfg.RCNN.ROI_PER_IMAGE))
+ for idx in range(batch_size):
+ cur_roi, cur_gt = roi_boxes3d[idx], gt_boxes3d[idx]
+ k = cur_gt.shape[0] - 1
+ while cur_gt[k].sum() == 0:
+ k -= 1
+ cur_gt = cur_gt[:k + 1]
+ # include gt boxes in the candidate rois
+ iou3d = iou3d_utils.boxes_iou3d(cur_roi, cur_gt[:, 0:7]) # (M, N)
+ max_overlaps = np.max(iou3d, axis=1)
+ gt_assignment = np.argmax(iou3d, axis=1)
+ # sample fg, easy_bg, hard_bg
+ fg_thresh = min(cfg.RCNN.REG_FG_THRESH, cfg.RCNN.CLS_FG_THRESH)
+ fg_inds = np.where(max_overlaps >= fg_thresh)[0].reshape(-1)
+
+ # TODO: this will mix the fg and bg when CLS_BG_THRESH_LO < iou < CLS_BG_THRESH
+ # fg_inds = torch.cat((fg_inds, roi_assignment), dim=0) # consider the roi which has max_iou with gt as fg
+ easy_bg_inds = np.where(max_overlaps < cfg.RCNN.CLS_BG_THRESH_LO)[0].reshape(-1)
+ hard_bg_inds = np.where((max_overlaps < cfg.RCNN.CLS_BG_THRESH) & (max_overlaps >= cfg.RCNN.CLS_BG_THRESH_LO))[0].reshape(-1)
+
+ fg_num_rois = fg_inds.shape[0]
+ bg_num_rois = hard_bg_inds.shape[0] + easy_bg_inds.shape[0]
+
+ if fg_num_rois > 0 and bg_num_rois > 0:
+ # sampling fg
+ fg_rois_per_this_image = min(fg_rois_per_image, fg_num_rois)
+ if CLOSE_RANDOM:
+ fg_inds = fg_inds[:fg_rois_per_this_image]
+ else:
+ rand_num = np.random.permutation(fg_num_rois)
+ fg_inds = fg_inds[rand_num[:fg_rois_per_this_image]]
+
+ # sampling bg
+ bg_rois_per_this_image = cfg.RCNN.ROI_PER_IMAGE - fg_rois_per_this_image
+ bg_inds = sample_bg_inds(hard_bg_inds, easy_bg_inds, bg_rois_per_this_image)
+
+ elif fg_num_rois > 0 and bg_num_rois == 0:
+ # sampling fg
+ rand_num = np.floor(np.random.rand(cfg.RCNN.ROI_PER_IMAGE) * fg_num_rois)
+ # rand_num = torch.from_numpy(rand_num).type_as(gt_boxes3d).long()
+ fg_inds = fg_inds[rand_num]
+ fg_rois_per_this_image = cfg.RCNN.ROI_PER_IMAGE
+ bg_rois_per_this_image = 0
+ elif bg_num_rois > 0 and fg_num_rois == 0:
+ # sampling bg
+ bg_rois_per_this_image = cfg.RCNN.ROI_PER_IMAGE
+ bg_inds = sample_bg_inds(hard_bg_inds, easy_bg_inds, bg_rois_per_this_image)
+
+ fg_rois_per_this_image = 0
+            else:
+                raise NotImplementedError
+ # augment the rois by noise
+ roi_list, roi_iou_list, roi_gt_list = [], [], []
+ if fg_rois_per_this_image > 0:
+ fg_rois_src = cur_roi[fg_inds]
+ gt_of_fg_rois = cur_gt[gt_assignment[fg_inds]]
+ iou3d_src = max_overlaps[fg_inds]
+ fg_rois, fg_iou3d = aug_roi_by_noise(
+ fg_rois_src, gt_of_fg_rois, iou3d_src, aug_times=cfg.RCNN.ROI_FG_AUG_TIMES)
+ roi_list.append(fg_rois)
+ roi_iou_list.append(fg_iou3d)
+ roi_gt_list.append(gt_of_fg_rois)
+
+ if bg_rois_per_this_image > 0:
+ bg_rois_src = cur_roi[bg_inds]
+ gt_of_bg_rois = cur_gt[gt_assignment[bg_inds]]
+ iou3d_src = max_overlaps[bg_inds]
+ aug_times = 1 if cfg.RCNN.ROI_FG_AUG_TIMES > 0 else 0
+ bg_rois, bg_iou3d = aug_roi_by_noise(
+ bg_rois_src, gt_of_bg_rois, iou3d_src, aug_times=aug_times)
+ roi_list.append(bg_rois)
+ roi_iou_list.append(bg_iou3d)
+ roi_gt_list.append(gt_of_bg_rois)
+
+
+ rois = np.concatenate(roi_list, axis=0)
+ iou_of_rois = np.concatenate(roi_iou_list, axis=0)
+ gt_of_rois = np.concatenate(roi_gt_list, axis=0)
+ batch_rois[idx] = rois
+ batch_gt_of_rois[idx] = gt_of_rois
+ batch_roi_iou[idx] = iou_of_rois
+
+ return batch_rois, batch_gt_of_rois, batch_roi_iou
+
+ def sample_bg_inds(hard_bg_inds, easy_bg_inds, bg_rois_per_this_image):
+
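+        # the background budget is split between hard and easy negatives by cfg.RCNN.HARD_BG_RATIO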
+ if hard_bg_inds.shape[0] > 0 and easy_bg_inds.shape[0] > 0:
+ hard_bg_rois_num = int(bg_rois_per_this_image * cfg.RCNN.HARD_BG_RATIO)
+ easy_bg_rois_num = bg_rois_per_this_image - hard_bg_rois_num
+ # sampling hard bg
+ if CLOSE_RANDOM:
+ rand_idx = list(np.arange(0,hard_bg_inds.shape[0]))*hard_bg_rois_num
+ rand_idx = rand_idx[:hard_bg_rois_num]
+ else:
+ rand_idx = np.random.randint(low=0, high=hard_bg_inds.shape[0], size=(hard_bg_rois_num,))
+ hard_bg_inds = hard_bg_inds[rand_idx]
+ # sampling easy bg
+ if CLOSE_RANDOM:
+ rand_idx = list(np.arange(0,easy_bg_inds.shape[0]))*easy_bg_rois_num
+ rand_idx = rand_idx[:easy_bg_rois_num]
+ else:
+ rand_idx = np.random.randint(low=0, high=easy_bg_inds.shape[0], size=(easy_bg_rois_num,))
+ easy_bg_inds = easy_bg_inds[rand_idx]
+ bg_inds = np.concatenate([hard_bg_inds, easy_bg_inds], axis=0)
+ elif hard_bg_inds.shape[0] > 0 and easy_bg_inds.shape[0] == 0:
+ hard_bg_rois_num = bg_rois_per_this_image
+ # sampling hard bg
+ rand_idx = np.random.randint(low=0, high=hard_bg_inds.shape[0], size=(hard_bg_rois_num,))
+ bg_inds = hard_bg_inds[rand_idx]
+ elif hard_bg_inds.shape[0] == 0 and easy_bg_inds.shape[0] > 0:
+ easy_bg_rois_num = bg_rois_per_this_image
+ # sampling easy bg
+ rand_idx = np.random.randint(low=0, high=easy_bg_inds.shape[0], size=(easy_bg_rois_num,))
+ bg_inds = easy_bg_inds[rand_idx]
+ else:
+ raise NotImplementedError
+
+ return bg_inds
+
+ def aug_roi_by_noise(roi_boxes3d, gt_boxes3d, iou3d_src, aug_times=10):
+ iou_of_rois = np.zeros(roi_boxes3d.shape[0]).astype(gt_boxes3d.dtype)
+ pos_thresh = min(cfg.RCNN.REG_FG_THRESH, cfg.RCNN.CLS_FG_THRESH)
+
+ for k in range(roi_boxes3d.shape[0]):
+ temp_iou = cnt = 0
+ roi_box3d = roi_boxes3d[k]
+
+ gt_box3d = gt_boxes3d[k].reshape(1, 7)
+ aug_box3d = roi_box3d
+ keep = True
+ while temp_iou < pos_thresh and cnt < aug_times:
+                if CLOSE_RANDOM or np.random.rand() < 0.2:
+ aug_box3d = roi_box3d # p=0.2 to keep the original roi box
+ keep = True
+ else:
+ aug_box3d = random_aug_box3d(roi_box3d)
+ keep = False
+ aug_box3d = aug_box3d.reshape((1, 7))
+ iou3d = iou3d_utils.boxes_iou3d(aug_box3d, gt_box3d)
+ temp_iou = iou3d[0][0]
+ cnt += 1
+ roi_boxes3d[k] = aug_box3d.reshape(-1)
+ if cnt == 0 or keep:
+ iou_of_rois[k] = iou3d_src[k]
+ else:
+ iou_of_rois[k] = temp_iou
+ return roi_boxes3d, iou_of_rois
+
+ def random_aug_box3d(box3d):
+ """
+ :param box3d: (7) [x, y, z, h, w, l, ry]
+ random shift, scale, orientation
+ """
+ if cfg.RCNN.REG_AUG_METHOD == 'single':
+
+ pos_shift = (np.random.rand(3) - 0.5) # [-0.5 ~ 0.5]
+ hwl_scale = (np.random.rand(3) - 0.5) / (0.5 / 0.15) + 1.0 #
+ angle_rot = (np.random.rand(1) - 0.5) / (0.5 / (np.pi / 12)) # [-pi/12 ~ pi/12]
+ aug_box3d = np.concatenate([box3d[0:3] + pos_shift, box3d[3:6] * hwl_scale, box3d[6:7] + angle_rot], axis=0)
+ return aug_box3d
+ elif cfg.RCNN.REG_AUG_METHOD == 'multiple':
+ # pos_range, hwl_range, angle_range, mean_iou
+ range_config = [[0.2, 0.1, np.pi / 12, 0.7],
+ [0.3, 0.15, np.pi / 12, 0.6],
+ [0.5, 0.15, np.pi / 9, 0.5],
+ [0.8, 0.15, np.pi / 6, 0.3],
+ [1.0, 0.15, np.pi / 3, 0.2]]
+ idx = np.random.randint(low=0, high=len(range_config), size=(1,))[0]
+ pos_shift = ((np.random.rand(3) - 0.5) / 0.5) * range_config[idx][0]
+ hwl_scale = ((np.random.rand(3) - 0.5) / 0.5) * range_config[idx][1] + 1.0
+ angle_rot = ((np.random.rand(1) - 0.5) / 0.5) * range_config[idx][2]
+ aug_box3d = np.concatenate([box3d[0:3] + pos_shift, box3d[3:6] * hwl_scale, box3d[6:7] + angle_rot], axis=0)
+ return aug_box3d
+ elif cfg.RCNN.REG_AUG_METHOD == 'normal':
+ x_shift = np.random.normal(loc=0, scale=0.3)
+ y_shift = np.random.normal(loc=0, scale=0.2)
+ z_shift = np.random.normal(loc=0, scale=0.3)
+ h_shift = np.random.normal(loc=0, scale=0.25)
+ w_shift = np.random.normal(loc=0, scale=0.15)
+ l_shift = np.random.normal(loc=0, scale=0.5)
+ ry_shift = ((np.random.rand() - 0.5) / 0.5) * np.pi / 12
+ aug_box3d = np.array([box3d[0] + x_shift, box3d[1] + y_shift, box3d[2] + z_shift, box3d[3] + h_shift,
+ box3d[4] + w_shift, box3d[5] + l_shift, box3d[6] + ry_shift], dtype=np.float32)
+ aug_box3d = aug_box3d.astype(box3d.dtype)
+ return aug_box3d
+ else:
+ raise NotImplementedError
+
+ def data_augmentation(pts, rois, gt_of_rois):
+ """
+ :param pts: (B, M, 512, 3)
+ :param rois: (B, M. 7)
+ :param gt_of_rois: (B, M, 7)
+ :return:
+ """
+ batch_size, boxes_num = pts.shape[0], pts.shape[1]
+
+ # rotation augmentation
+        angles = ((np.random.rand(batch_size, boxes_num) - 0.5) / 0.5) * (np.pi / cfg.AUG_ROT_RANGE)
+ # calculate gt alpha from gt_of_rois
+ temp_x, temp_z, temp_ry = gt_of_rois[:, :, 0], gt_of_rois[:, :, 2], gt_of_rois[:, :, 6]
+ temp_beta = np.arctan2(temp_z, temp_x)
+ gt_alpha = -np.sign(temp_beta) * np.pi / 2 + temp_beta + temp_ry # (B, M)
+
+ temp_x, temp_z, temp_ry = rois[:, :, 0], rois[:, :, 2], rois[:, :, 6]
+ temp_beta = np.arctan2(temp_z, temp_x)
+ roi_alpha = -np.sign(temp_beta) * np.pi / 2 + temp_beta + temp_ry # (B, M)
+
+ for k in range(batch_size):
+ pts[k] = kitti_utils.rotate_pc_along_y_np(pts[k], angles[k])
+ gt_of_rois[k] = np.squeeze(kitti_utils.rotate_pc_along_y_np(
+ np.expand_dims(gt_of_rois[k], axis=1), angles[k]), axis=1)
+ rois[k] = np.squeeze(kitti_utils.rotate_pc_along_y_np(
+ np.expand_dims(rois[k], axis=1), angles[k]),axis=1)
+
+ # calculate the ry after rotation
+ temp_x, temp_z = gt_of_rois[:, :, 0], gt_of_rois[:, :, 2]
+ temp_beta = np.arctan2(temp_z, temp_x)
+ gt_of_rois[:, :, 6] = np.sign(temp_beta) * np.pi / 2 + gt_alpha - temp_beta
+ temp_x, temp_z = rois[:, :, 0], rois[:, :, 2]
+ temp_beta = np.arctan2(temp_z, temp_x)
+ rois[:, :, 6] = np.sign(temp_beta) * np.pi / 2 + roi_alpha - temp_beta
+ # scaling augmentation
+ scales = 1 + ((np.random.rand(batch_size, boxes_num) - 0.5) / 0.5) * 0.05
+ pts = pts * np.expand_dims(np.expand_dims(scales, axis=2), axis=3)
+ gt_of_rois[:, :, 0:6] = gt_of_rois[:, :, 0:6] * np.expand_dims(scales, axis=2)
+ rois[:, :, 0:6] = rois[:, :, 0:6] * np.expand_dims(scales, axis=2)
+
+ # flip augmentation
+ flip_flag = np.sign(np.random.rand(batch_size, boxes_num) - 0.5)
+ pts[:, :, :, 0] = pts[:, :, :, 0] * np.expand_dims(flip_flag, axis=2)
+ gt_of_rois[:, :, 0] = gt_of_rois[:, :, 0] * flip_flag
+ # flip orientation: ry > 0: pi - ry, ry < 0: -pi - ry
+ src_ry = gt_of_rois[:, :, 6]
+ ry = (flip_flag == 1).astype(np.float32) * src_ry + (flip_flag == -1).astype(np.float32) * (np.sign(src_ry) * np.pi - src_ry)
+ gt_of_rois[:, :, 6] = ry
+
+ rois[:, :, 0] = rois[:, :, 0] * flip_flag
+ # flip orientation: ry > 0: pi - ry, ry < 0: -pi - ry
+ src_ry = rois[:, :, 6]
+ ry = (flip_flag == 1).astype(np.float32) * src_ry + (flip_flag == -1).astype(np.float32) * (np.sign(src_ry) * np.pi - src_ry)
+ rois[:, :, 6] = ry
+
+ return pts, rois, gt_of_rois
+
+    def generate_proposal_target(seg_mask, rpn_features, gt_boxes3d, rpn_xyz, pts_depth, roi_boxes3d, rpn_intensity):
+ seg_mask = np.array(seg_mask)
+ features = np.array(rpn_features)
+ gt_boxes3d = np.array(gt_boxes3d)
+ rpn_xyz = np.array(rpn_xyz)
+ pts_depth = np.array(pts_depth)
+ roi_boxes3d = np.array(roi_boxes3d)
+ rpn_intensity = np.array(rpn_intensity)
+ batch_rois, batch_gt_of_rois, batch_roi_iou = sample_rois_for_rcnn(roi_boxes3d, gt_boxes3d)
+
+ if cfg.RCNN.USE_INTENSITY:
+ pts_extra_input_list = [np.expand_dims(rpn_intensity, axis=2),
+ np.expand_dims(seg_mask, axis=2)]
+ else:
+ pts_extra_input_list = [np.expand_dims(seg_mask, axis=2)]
+
+ if cfg.RCNN.USE_DEPTH:
+ pts_depth = pts_depth / 70.0 - 0.5
+ pts_extra_input_list.append(np.expand_dims(pts_depth, axis=2))
+ pts_extra_input = np.concatenate(pts_extra_input_list, axis=2)
+
+ # point cloud pooling
+ pts_feature = np.concatenate((pts_extra_input, rpn_features), axis=2)
+
+ batch_rois = batch_rois.astype(np.float32)
+
+ pooled_features, pooled_empty_flag = roipool3d_utils.roipool3d_gpu(
+ rpn_xyz, pts_feature, batch_rois, cfg.RCNN.POOL_EXTRA_WIDTH,
+ sampled_pt_num=cfg.RCNN.NUM_POINTS
+ )
+
+ sampled_pts, sampled_features = pooled_features[:, :, :, 0:3], pooled_features[:, :, :, 3:]
+ # data augmentation
+ if cfg.AUG_DATA:
+ # data augmentation
+ sampled_pts, batch_rois, batch_gt_of_rois = \
+ data_augmentation(sampled_pts, batch_rois, batch_gt_of_rois)
+
+ # canonical transformation
+ batch_size = batch_rois.shape[0]
+ roi_ry = batch_rois[:, :, 6] % (2 * np.pi)
+ roi_center = batch_rois[:, :, 0:3]
+ sampled_pts = sampled_pts - np.expand_dims(roi_center, axis=2) # (B, M, 512, 3)
+ batch_gt_of_rois[:, :, 0:3] = batch_gt_of_rois[:, :, 0:3] - roi_center
+ batch_gt_of_rois[:, :, 6] = batch_gt_of_rois[:, :, 6] - roi_ry
+
+ for k in range(batch_size):
+ sampled_pts[k] = kitti_utils.rotate_pc_along_y_np(sampled_pts[k], batch_rois[k, :, 6])
+ batch_gt_of_rois[k] = np.squeeze(kitti_utils.rotate_pc_along_y_np(
+ np.expand_dims(batch_gt_of_rois[k], axis=1), roi_ry[k]), axis=1)
+
+ # regression valid mask
+ valid_mask = (pooled_empty_flag == 0)
+ reg_valid_mask = ((batch_roi_iou > cfg.RCNN.REG_FG_THRESH) & valid_mask).astype(np.float32)
+
+ # classification label
+ batch_cls_label = (batch_roi_iou > cfg.RCNN.CLS_FG_THRESH).astype(np.int64)
+ invalid_mask = (batch_roi_iou > cfg.RCNN.CLS_BG_THRESH) & (batch_roi_iou < cfg.RCNN.CLS_FG_THRESH)
+ batch_cls_label[valid_mask == 0] = -1
+ batch_cls_label[invalid_mask > 0] = -1
+
+ output_dict = {'sampled_pts': sampled_pts.reshape(-1, cfg.RCNN.NUM_POINTS, 3).astype(np.float32),
+ 'pts_feature': sampled_features.reshape(-1, cfg.RCNN.NUM_POINTS, sampled_features.shape[3]).astype(np.float32),
+ 'cls_label': batch_cls_label.reshape(-1),
+ 'reg_valid_mask': reg_valid_mask.reshape(-1).astype(np.float32),
+ 'gt_of_rois': batch_gt_of_rois.reshape(-1, 7).astype(np.float32),
+ 'gt_iou': batch_roi_iou.reshape(-1).astype(np.float32),
+ 'roi_boxes3d': batch_rois.reshape(-1, 7).astype(np.float32)}
+
+ return output_dict.values()
+
+ return generate_proposal_target
+
+
+if __name__ == "__main__":
+
+ input_dict = {}
+ input_dict['roi_boxes3d'] = np.load("models/rpn_data/roi_boxes3d.npy")
+ input_dict['gt_boxes3d'] = np.load("models/rpn_data/gt_boxes3d.npy")
+ input_dict['rpn_xyz'] = np.load("models/rpn_data/rpn_xyz.npy")
+ input_dict['rpn_features'] = np.load("models/rpn_data/rpn_features.npy")
+ input_dict['rpn_intensity'] = np.load("models/rpn_data/rpn_intensity.npy")
+ input_dict['seg_mask'] = np.load("models/rpn_data/seg_mask.npy")
+ input_dict['pts_depth'] = np.load("models/rpn_data/pts_depth.npy")
+ for k, v in input_dict.items():
+ print(k, v.shape, np.sum(np.abs(v)))
+ input_dict[k] = np.expand_dims(v, axis=0)
+
+ from utils.config import cfg
+ cfg.RPN.LOC_XZ_FINE = True
+ cfg.TEST.RPN_DISTANCE_BASED_PROPOSE = False
+ cfg.RPN.NMS_TYPE = 'rotate'
+
+    proposal_target_func = get_proposal_target_func(cfg)
+    # generate_proposal_target returns the dict values, not the dict itself
+    outs = proposal_target_func(input_dict['seg_mask'], input_dict['rpn_features'], input_dict['gt_boxes3d'],
+                                input_dict['rpn_xyz'], input_dict['pts_depth'], input_dict['roi_boxes3d'],
+                                input_dict['rpn_intensity'])
+    for out in outs:
+        print("shape: {}".format(out.shape))
diff --git a/PaddleCV/Paddle3D/PointRCNN/utils/proposal_utils.py b/PaddleCV/Paddle3D/PointRCNN/utils/proposal_utils.py
new file mode 100644
index 0000000000000000000000000000000000000000..9160ffe8e4e4a1aff7f8e8984e5ddd3711d1ffb0
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/utils/proposal_utils.py
@@ -0,0 +1,270 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
+#
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
+"""
+Contains proposal functions
+"""
+
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
+import numpy as np
+import paddle.fluid as fluid
+
+import utils.box_utils as box_utils
+from utils.config import cfg
+
+__all__ = ["get_proposal_func"]
+
+
+def get_proposal_func(cfg, mode='TRAIN'):
+ def decode_bbox_target(roi_box3d, pred_reg, anchor_size, loc_scope,
+ loc_bin_size, num_head_bin, get_xz_fine=True,
+ loc_y_scope=0.5, loc_y_bin_size=0.25,
+ get_y_by_bin=False, get_ry_fine=False):
+ per_loc_bin_num = int(loc_scope / loc_bin_size) * 2
+ loc_y_bin_num = int(loc_y_scope / loc_y_bin_size) * 2
+
+ # recover xz localization
+ x_bin_l, x_bin_r = 0, per_loc_bin_num
+ z_bin_l, z_bin_r = per_loc_bin_num, per_loc_bin_num * 2
+ start_offset = z_bin_r
+
+ x_bin = np.argmax(pred_reg[:, x_bin_l: x_bin_r], axis=1)
+ z_bin = np.argmax(pred_reg[:, z_bin_l: z_bin_r], axis=1)
+
+ pos_x = x_bin.astype('float32') * loc_bin_size + loc_bin_size / 2 - loc_scope
+ pos_z = z_bin.astype('float32') * loc_bin_size + loc_bin_size / 2 - loc_scope
+ if get_xz_fine:
+ x_res_l, x_res_r = per_loc_bin_num * 2, per_loc_bin_num * 3
+ z_res_l, z_res_r = per_loc_bin_num * 3, per_loc_bin_num * 4
+ start_offset = z_res_r
+
+ x_res_norm = pred_reg[:, x_res_l:x_res_r][np.arange(len(x_bin)), x_bin]
+ z_res_norm = pred_reg[:, z_res_l:z_res_r][np.arange(len(z_bin)), z_bin]
+
+ x_res = x_res_norm * loc_bin_size
+ z_res = z_res_norm * loc_bin_size
+ pos_x += x_res
+ pos_z += z_res
+
+ # recover y localization
+ if get_y_by_bin:
+ y_bin_l, y_bin_r = start_offset, start_offset + loc_y_bin_num
+ y_res_l, y_res_r = y_bin_r, y_bin_r + loc_y_bin_num
+ start_offset = y_res_r
+
+ y_bin = np.argmax(pred_reg[:, y_bin_l: y_bin_r], axis=1)
+ y_res_norm = pred_reg[:, y_res_l:y_res_r][np.arange(len(y_bin)), y_bin]
+ y_res = y_res_norm * loc_y_bin_size
+ pos_y = y_bin.astype('float32') * loc_y_bin_size + loc_y_bin_size / 2 - loc_y_scope + y_res
+ pos_y = pos_y + np.array(roi_box3d[:, 1]).reshape(-1)
+ else:
+ y_offset_l, y_offset_r = start_offset, start_offset + 1
+ start_offset = y_offset_r
+
+ pos_y = np.array(roi_box3d[:, 1]) + np.array(pred_reg[:, y_offset_l])
+ pos_y = pos_y.reshape(-1)
+
+ # recover ry rotation
+ ry_bin_l, ry_bin_r = start_offset, start_offset + num_head_bin
+ ry_res_l, ry_res_r = ry_bin_r, ry_bin_r + num_head_bin
+
+ ry_bin = np.argmax(pred_reg[:, ry_bin_l: ry_bin_r], axis=1)
+ ry_res_norm = pred_reg[:, ry_res_l:ry_res_r][np.arange(len(ry_bin)), ry_bin]
+ if get_ry_fine:
+ # divide pi/2 into several bins
+ angle_per_class = (np.pi / 2) / num_head_bin
+ ry_res = ry_res_norm * (angle_per_class / 2)
+ ry = (ry_bin.astype('float32') * angle_per_class + angle_per_class / 2) + ry_res - np.pi / 4
+ else:
+ angle_per_class = (2 * np.pi) / num_head_bin
+ ry_res = ry_res_norm * (angle_per_class / 2)
+
+ # bin_center is (0, 30, 60, 90, 120, ..., 270, 300, 330)
+ ry = np.fmod(ry_bin.astype('float32') * angle_per_class + ry_res, 2 * np.pi)
+ ry[ry > np.pi] -= 2 * np.pi
+
+ # recover size
+ size_res_l, size_res_r = ry_res_r, ry_res_r + 3
+ assert size_res_r == pred_reg.shape[1]
+
+ size_res_norm = pred_reg[:, size_res_l: size_res_r]
+ hwl = size_res_norm * anchor_size + anchor_size
+
+ def rotate_pc_along_y(pc, angle):
+ cosa = np.cos(angle).reshape(-1, 1)
+ sina = np.sin(angle).reshape(-1, 1)
+
+ R = np.concatenate([cosa, -sina, sina, cosa], axis=-1).reshape(-1, 2, 2)
+ pc_temp = pc[:, [0, 2]].reshape(-1, 1, 2)
+ pc[:, [0, 2]] = np.matmul(pc_temp, R.transpose(0, 2, 1)).reshape(-1, 2)
+
+ return pc
+
+ # shift to original coords
+ roi_center = np.array(roi_box3d[:, 0:3])
+ shift_ret_box3d = np.concatenate((
+ pos_x.reshape(-1, 1),
+ pos_y.reshape(-1, 1),
+ pos_z.reshape(-1, 1),
+ hwl, ry.reshape(-1, 1)), axis=1)
+ ret_box3d = shift_ret_box3d
+ if roi_box3d.shape[1] == 7:
+ roi_ry = np.array(roi_box3d[:, 6]).reshape(-1)
+ ret_box3d = rotate_pc_along_y(np.array(shift_ret_box3d), -roi_ry)
+ ret_box3d[:, 6] += roi_ry
+ ret_box3d[:, [0, 2]] += roi_center[:, [0, 2]]
+ return ret_box3d
+
+ def distance_based_proposal(scores, proposals, sorted_idxs):
+ nms_range_list = [0, 40.0, 80.0]
+ pre_tot_top_n = cfg[mode].RPN_PRE_NMS_TOP_N
+ pre_top_n_list = [0, int(pre_tot_top_n * 0.7), pre_tot_top_n - int(pre_tot_top_n * 0.7)]
+ post_tot_top_n = cfg[mode].RPN_POST_NMS_TOP_N
+ post_top_n_list = [0, int(post_tot_top_n * 0.7), post_tot_top_n - int(post_tot_top_n * 0.7)]
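+        # 70% of the pre/post NMS budget goes to near-range (0-40m) proposals, the rest to 40-80m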
+
+ batch_size = scores.shape[0]
+ ret_proposals = np.zeros((batch_size, cfg[mode].RPN_POST_NMS_TOP_N, 7), dtype='float32')
+        ret_scores = np.zeros((batch_size, cfg[mode].RPN_POST_NMS_TOP_N, 1), dtype='float32')
+
+ for b, (score, proposal, sorted_idx) in enumerate(zip(scores, proposals, sorted_idxs)):
+ # sort by score
+ score_ord = score[sorted_idx]
+ proposal_ord = proposal[sorted_idx]
+
+ dist = proposal_ord[:, 2]
+ first_mask = (dist > nms_range_list[0]) & (dist <= nms_range_list[1])
+
+ scores_single_list, proposals_single_list = [], []
+ for i in range(1, len(nms_range_list)):
+ # get proposal distance mask
+ dist_mask = ((dist > nms_range_list[i - 1]) & (dist <= nms_range_list[i]))
+
+ if dist_mask.sum() != 0:
+ # this area has points, reduce by mask
+ cur_scores = score_ord[dist_mask]
+ cur_proposals = proposal_ord[dist_mask]
+
+ # fetch pre nms top K
+ cur_scores = cur_scores[:pre_top_n_list[i]]
+ cur_proposals = cur_proposals[:pre_top_n_list[i]]
+ else:
+ assert i == 2, '%d' % i
+ # this area doesn't have any points, so use rois of first area
+ cur_scores = score_ord[first_mask]
+ cur_proposals = proposal_ord[first_mask]
+
+ # fetch top K of first area
+ cur_scores = cur_scores[pre_top_n_list[i - 1]:][:pre_top_n_list[i]]
+ cur_proposals = cur_proposals[pre_top_n_list[i - 1]:][:pre_top_n_list[i]]
+
+ # oriented nms
+ boxes_bev = box_utils.boxes3d_to_bev(cur_proposals)
+ s_scores, s_proposals = box_utils.box_nms(
+ boxes_bev, cur_scores, cur_proposals,
+ cfg[mode].RPN_NMS_THRESH, post_top_n_list[i],
+ cfg.RPN.NMS_TYPE)
+ if len(s_scores) > 0:
+ scores_single_list.append(s_scores)
+ proposals_single_list.append(s_proposals)
+
+ scores_single = np.concatenate(scores_single_list, axis=0)
+ proposals_single = np.concatenate(proposals_single_list, axis=0)
+
+ prop_num = proposals_single.shape[0]
+ ret_scores[b, :prop_num, 0] = scores_single
+ ret_proposals[b, :prop_num] = proposals_single
+ # ret_proposals.tofile("proposal.data")
+ # ret_scores.tofile("score.data")
+ return np.concatenate([ret_proposals, ret_scores], axis=-1)
+
+ def score_based_proposal(scores, proposals, sorted_idxs):
+ batch_size = scores.shape[0]
+ ret_proposals = np.zeros((batch_size, cfg[mode].RPN_POST_NMS_TOP_N, 7), dtype='float32')
+ ret_scores = np.zeros((batch_size, cfg[mode].RPN_POST_NMS_TOP_N, 1), dtype='float32')
+ for b, (score, proposal, sorted_idx) in enumerate(zip(scores, proposals, sorted_idxs)):
+ # sort by score
+ score_ord = score[sorted_idx]
+ proposal_ord = proposal[sorted_idx]
+
+ # pre nms top K
+ cur_scores = score_ord[:cfg[mode].RPN_PRE_NMS_TOP_N]
+ cur_proposals = proposal_ord[:cfg[mode].RPN_PRE_NMS_TOP_N]
+
+ boxes_bev = box_utils.boxes3d_to_bev(cur_proposals)
+ s_scores, s_proposals = box_utils.box_nms(
+ boxes_bev, cur_scores, cur_proposals,
+ cfg[mode].RPN_NMS_THRESH,
+ cfg[mode].RPN_POST_NMS_TOP_N,
+ 'rotate')
+ prop_num = len(s_proposals)
+ ret_scores[b, :prop_num, 0] = s_scores
+ ret_proposals[b, :prop_num] = s_proposals
+ # ret_proposals.tofile("proposal.data")
+ # ret_scores.tofile("score.data")
+ return np.concatenate([ret_proposals, ret_scores], axis=-1)
+
+ def generate_proposal(x):
+ # py_func passes an LoDTensor; convert once, then slice as numpy
+ x = np.array(x)
+ rpn_scores = x[:, :, 0]
+ roi_box3d = x[:, :, 1:4]
+ pred_reg = x[:, :, 4:]
+
+ proposals = decode_bbox_target(
+ roi_box3d.reshape(-1, roi_box3d.shape[-1]),
+ pred_reg.reshape(-1, pred_reg.shape[-1]),
+ anchor_size=np.array(cfg.CLS_MEAN_SIZE[0], dtype='float32'),
+ loc_scope=cfg.RPN.LOC_SCOPE,
+ loc_bin_size=cfg.RPN.LOC_BIN_SIZE,
+ num_head_bin=cfg.RPN.NUM_HEAD_BIN,
+ get_xz_fine=cfg.RPN.LOC_XZ_FINE,
+ get_y_by_bin=False,
+ get_ry_fine=False)
+ proposals[:, 1] += proposals[:, 3] / 2
+ proposals = proposals.reshape(rpn_scores.shape[0], -1, proposals.shape[-1])
+
+ sorted_idxs = np.argsort(-rpn_scores, axis=-1)
+
+ if cfg.TEST.RPN_DISTANCE_BASED_PROPOSE:
+ ret = distance_based_proposal(rpn_scores, proposals, sorted_idxs)
+ else:
+ ret = score_based_proposal(rpn_scores, proposals, sorted_idxs)
+
+ return ret
+
+
+ return generate_proposal
+
+
+if __name__ == "__main__":
+ np.random.seed(3333)
+ x_np = np.random.random((4, 256, 84)).astype('float32')
+
+ from config import cfg
+ cfg.RPN.LOC_XZ_FINE = True
+ # cfg.TEST.RPN_DISTANCE_BASED_PROPOSE = False
+ # cfg.RPN.NMS_TYPE = 'rotate'
+ proposal_func = get_proposal_func(cfg)
+
+ x = fluid.layers.data(name="x", shape=[256, 84], dtype='float32')
+ proposal = fluid.default_main_program().current_block().create_var(
+ name="proposal", dtype='float32', shape=[256, 7])
+ fluid.layers.py_func(proposal_func, x, proposal)
+ loss = fluid.layers.reduce_mean(proposal)
+
+ place = fluid.CUDAPlace(0)
+ exe = fluid.Executor(place)
+ exe.run(fluid.default_startup_program())
+ ret = exe.run(fetch_list=[proposal.name, loss.name], feed={'x': x_np})
+ print(ret)
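The decoding above recovers the heading angle as a bin center plus an un-normalized residual. A minimal numpy sketch of the coarse branch, for checking the arithmetic — the bin count here is an assumed example value; the real one comes from `cfg.RPN.NUM_HEAD_BIN`:

```
import numpy as np

num_head_bin = 12                          # assumed example value
angle_per_class = (2 * np.pi) / num_head_bin

ry_bin = np.array([0, 3, 11])              # argmax over the bin logits
ry_res_norm = np.array([0.1, -0.4, 0.25])  # residuals normalized to [-1, 1]

ry_res = ry_res_norm * (angle_per_class / 2)   # undo the normalization
ry = np.fmod(ry_bin * angle_per_class + ry_res, 2 * np.pi)
ry[ry > np.pi] -= 2 * np.pi                    # wrap into (-pi, pi]
print(ry)
```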
diff --git a/PaddleCV/Paddle3D/PointRCNN/utils/pts_utils/CMakeLists.txt b/PaddleCV/Paddle3D/PointRCNN/utils/pts_utils/CMakeLists.txt
new file mode 100644
index 0000000000000000000000000000000000000000..044bbed5d020464250810601ec2dcdacdec0cd18
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/utils/pts_utils/CMakeLists.txt
@@ -0,0 +1,6 @@
+
+cmake_minimum_required(VERSION 2.8.12)
+project(pts_utils)
+
+add_subdirectory(pybind11)
+pybind11_add_module(pts_utils pts_utils.cpp)
diff --git a/PaddleCV/Paddle3D/PointRCNN/utils/pts_utils/pts_utils.cpp b/PaddleCV/Paddle3D/PointRCNN/utils/pts_utils/pts_utils.cpp
new file mode 100644
index 0000000000000000000000000000000000000000..356b02baa5288903e218c8fca1b17118ef8ea72b
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/utils/pts_utils/pts_utils.cpp
@@ -0,0 +1,62 @@
+#include <pybind11/pybind11.h>
+#include <pybind11/numpy.h>
+#include <math.h>
+
+namespace py = pybind11;
+
+int pt_in_box3d(float x, float y, float z, float cx, float cy, float cz, float h, float w, float l, float cosa, float sina) {
+ if ((fabsf(x - cx) > 10.) || (fabsf(y - cy) > h / 2.0) || (fabsf(z - cz) > 10.)){
+ return 0;
+ }
+
+ float x_rot = (x - cx) * cosa + (z - cz) * (-sina);
+ float z_rot = (x - cx) * sina + (z - cz) * cosa;
+
+ int in_flag = static_cast<int>((x_rot >= -l / 2.0) & (x_rot <= l / 2.0) & (z_rot >= -w / 2.0) & (z_rot <= w / 2.0));
+ return in_flag;
+}
+
+py::array_t<int> pts_in_boxes3d(py::array_t<float> pts, py::array_t<float> boxes) {
+ py::buffer_info pts_buf = pts.request(), boxes_buf = boxes.request();
+
+ if (pts_buf.ndim != 2 || boxes_buf.ndim != 2) {
+ throw std::runtime_error("Number of dimensions must be 2");
+ }
+ if (pts_buf.shape[1] != 3) {
+ throw std::runtime_error("pts 2nd dimension must be 3");
+ }
+ if (boxes_buf.shape[1] != 7) {
+ throw std::runtime_error("boxes 2nd dimension must be 7");
+ }
+
+ auto pts_num = pts_buf.shape[0];
+ auto boxes_num = boxes_buf.shape[0];
+ auto mask = py::array_t<int>(pts_num * boxes_num);
+ py::buffer_info mask_buf = mask.request();
+
+ float *pts_ptr = (float *) pts_buf.ptr,
+ *boxes_ptr = (float *) boxes_buf.ptr;
+ int *mask_ptr = (int *) mask_buf.ptr;
+
+ for (ssize_t i = 0; i < boxes_num; i++) {
+ float cx = boxes_ptr[i * 7];
+ float cy = boxes_ptr[i * 7 + 1] - boxes_ptr[i * 7 + 3] / 2.;
+ float cz = boxes_ptr[i * 7 + 2];
+ float h = boxes_ptr[i * 7 + 3];
+ float w = boxes_ptr[i * 7 + 4];
+ float l = boxes_ptr[i * 7 + 5];
+ float angle = boxes_ptr[i * 7 + 6];
+ float cosa = cosf(angle);
+ float sina = sinf(angle);
+ for (ssize_t j = 0; j < pts_num; j++) {
+ mask_ptr[i * pts_num + j] = pt_in_box3d(pts_ptr[j * 3], pts_ptr[j * 3 + 1], pts_ptr[j * 3 + 2], cx, cy, cz, h, w, l, cosa, sina);
+ }
+ }
+
+ mask.resize({boxes_num, pts_num});
+ return mask;
+}
+
+PYBIND11_MODULE(pts_utils, m) {
+ m.def("pts_in_boxes3d", &pts_in_boxes3d, "Calculate mask for whether points in boxes3d");
+}
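A pure-numpy reference of the same point-in-box test can help validate the extension. This is a sketch under the box convention used above — rows of `boxes` are `[cx, cy, cz, h, w, l, ry]` with `y` measured at the box bottom — and it omits the 10-unit quick-reject, which is only a shortcut in the C++ code:

```
import numpy as np

def pts_in_boxes3d_ref(pts, boxes):
    """Reference mask: pts is (N, 3) xyz, boxes is (M, 7)."""
    mask = np.zeros((boxes.shape[0], pts.shape[0]), dtype='int32')
    for i, (cx, cy, cz, h, w, l, ry) in enumerate(boxes):
        cy = cy - h / 2.0                 # move y from bottom edge to center
        cosa, sina = np.cos(ry), np.sin(ry)
        x, y, z = pts[:, 0] - cx, pts[:, 1] - cy, pts[:, 2] - cz
        x_rot = x * cosa - z * sina       # rotate into the box frame
        z_rot = x * sina + z * cosa
        inside = (np.abs(y) <= h / 2.0) & \
                 (np.abs(x_rot) <= l / 2.0) & (np.abs(z_rot) <= w / 2.0)
        mask[i] = inside.astype('int32')
    return mask
```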
diff --git a/PaddleCV/Paddle3D/PointRCNN/utils/pts_utils/setup.py b/PaddleCV/Paddle3D/PointRCNN/utils/pts_utils/setup.py
new file mode 100644
index 0000000000000000000000000000000000000000..e44e80ea703c0b2b3d1938fadc3c1befadb1dad0
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/utils/pts_utils/setup.py
@@ -0,0 +1,12 @@
+from setuptools import setup
+from setuptools import Extension
+
+setup(
+ name='pts_utils',
+ ext_modules = [Extension(
+ name='pts_utils',
+ sources=['pts_utils.cpp'],
+ include_dirs=[r'../../pybind11/include'],
+ extra_compile_args=['-std=c++11']
+ )],
+)
diff --git a/PaddleCV/Paddle3D/PointRCNN/utils/pts_utils/test.py b/PaddleCV/Paddle3D/PointRCNN/utils/pts_utils/test.py
new file mode 100644
index 0000000000000000000000000000000000000000..e4e3be285e3363a2193102732f1c0d9894eb497d
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/utils/pts_utils/test.py
@@ -0,0 +1,7 @@
+import numpy as np
+import pts_utils
+
+a = np.random.random((16384, 3)).astype('float32')
+b = np.random.random((64, 7)).astype('float32')
+c = pts_utils.pts_in_boxes3d(a, b)
+print(a, b, c, c.shape, np.sum(c))
diff --git a/PaddleCV/Paddle3D/PointRCNN/utils/run_utils.py b/PaddleCV/Paddle3D/PointRCNN/utils/run_utils.py
new file mode 100644
index 0000000000000000000000000000000000000000..0df37e5658f86c0cfc416e8a0185c5556bffe9f9
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/utils/run_utils.py
@@ -0,0 +1,110 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""
+Contains common utility functions.
+"""
+
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
+import sys
+import six
+import logging
+import numpy as np
+import paddle.fluid as fluid
+
+__all__ = ["check_gpu", "print_arguments", "parse_outputs", "Stat"]
+
+logger = logging.getLogger(__name__)
+
+
+def check_gpu(use_gpu):
+ """
+ Log error and exit when set use_gpu=True in paddlepaddle
+ cpu version.
+ """
+ err = "Config use_gpu cannot be set as True while you are " \
+ "using paddlepaddle cpu version ! \nPlease try: \n" \
+ "\t1. Install paddlepaddle-gpu to run model on GPU \n" \
+ "\t2. Set --use_gpu=False to run model on CPU"
+
+ try:
+ if use_gpu and not fluid.is_compiled_with_cuda():
+ logger.error(err)
+ sys.exit(1)
+ except Exception:
+ # is_compiled_with_cuda may be absent in early Paddle versions
+ pass
+
+
+def print_arguments(args):
+ """Print argparse's arguments.
+
+ Usage:
+
+ .. code-block:: python
+
+ parser = argparse.ArgumentParser()
+ parser.add_argument("name", default="Jonh", type=str, help="User name.")
+ args = parser.parse_args()
+ print_arguments(args)
+
+ :param args: Input argparse.Namespace for printing.
+ :type args: argparse.Namespace
+ """
+ logger.info("----------- Configuration Arguments -----------")
+ for arg, value in sorted(six.iteritems(vars(args))):
+ logger.info("%s: %s" % (arg, value))
+ logger.info("------------------------------------------------")
+
+
+def parse_outputs(outputs, filter_key=None, extra_keys=None, prog=None):
+ keys, values = [], []
+ for k, v in outputs.items():
+ if filter_key is not None and k.find(filter_key) < 0:
+ continue
+ keys.append(k)
+ v.persistable = True
+ values.append(v.name)
+
+ if prog is not None and extra_keys is not None:
+ for k in extra_keys:
+ try:
+ v = fluid.framework._get_var(k, prog)
+ keys.append(k)
+ v.persistable = True
+ values.append(v.name)
+ except Exception:
+ # skip names that do not exist in the given program
+ pass
+ return keys, values
+
+
+class Stat(object):
+ def __init__(self):
+ self.stats = {}
+
+ def update(self, keys, values):
+ for k, v in zip(keys, values):
+ if k not in self.stats:
+ self.stats[k] = []
+ self.stats[k].append(v)
+
+ def reset(self):
+ self.stats = {}
+
+ def get_mean_log(self):
+ log = ""
+ for k, v in self.stats.items():
+ log += "avg_{}: {:.4f}, ".format(k, np.mean(v))
+ return log
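For reference, the Stat helper just accumulates per-key values and reports running means; a quick use of the class above, with made-up numbers:

```
stat = Stat()
stat.update(["loss", "acc"], [0.7, 0.91])
stat.update(["loss", "acc"], [0.5, 0.93])
print(stat.get_mean_log())  # avg_loss: 0.6000, avg_acc: 0.9200,
stat.reset()
```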
diff --git a/PaddleCV/Paddle3D/PointRCNN/utils/save_utils.py b/PaddleCV/Paddle3D/PointRCNN/utils/save_utils.py
new file mode 100644
index 0000000000000000000000000000000000000000..c24a89a2429bd5f45386efa1176f8c8770500120
--- /dev/null
+++ b/PaddleCV/Paddle3D/PointRCNN/utils/save_utils.py
@@ -0,0 +1,132 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import os
+import numpy as np
+from utils.config import cfg
+from utils import calibration as calib
+import utils.cyops.kitti_utils as kitti_utils
+
+__all__ = ['save_rpn_feature', 'save_kitti_result', 'save_kitti_format']
+
+
+def save_rpn_feature(rets, kitti_features_dir):
+ """
+ save rpn features for RCNN offline training
+ """
+
+ sample_id = rets['sample_id'][0]
+ backbone_xyz = rets['backbone_xyz'][0]
+ backbone_feature = rets['backbone_feature'][0]
+ pts_features = rets['pts_features'][0]
+ seg_mask = rets['seg_mask'][0]
+ rpn_cls = rets['rpn_cls'][0]
+
+ for i in range(len(sample_id)):
+ pts_intensity = pts_features[i, :, 0]
+ s_id = sample_id[i, 0]
+
+ output_file = os.path.join(kitti_features_dir, '%06d.npy' % s_id)
+ xyz_file = os.path.join(kitti_features_dir, '%06d_xyz.npy' % s_id)
+ seg_file = os.path.join(kitti_features_dir, '%06d_seg.npy' % s_id)
+ intensity_file = os.path.join(
+ kitti_features_dir, '%06d_intensity.npy' % s_id)
+ np.save(output_file, backbone_feature[i])
+ np.save(xyz_file, backbone_xyz[i])
+ np.save(seg_file, seg_mask[i])
+ np.save(intensity_file, pts_intensity)
+ rpn_scores_raw_file = os.path.join(
+ kitti_features_dir, '%06d_rawscore.npy' % s_id)
+ np.save(rpn_scores_raw_file, rpn_cls[i])
+
+
+def save_kitti_result(rets, seg_output_dir, kitti_output_dir, reader, classes):
+ sample_id = rets['sample_id'][0]
+ roi_scores_row = rets['roi_scores_row'][0]
+ bboxes3d = rets['rois'][0]
+ pts_rect = rets['pts_rect'][0]
+ seg_mask = rets['seg_mask'][0]
+ rpn_cls_label = rets['rpn_cls_label'][0]
+ gt_boxes3d = rets['gt_boxes3d'][0]
+ gt_boxes3d_num = rets['gt_boxes3d'][1]
+
+ for i in range(len(sample_id)):
+ s_id = sample_id[i, 0]
+
+ seg_result_data = np.concatenate((pts_rect[i].reshape(-1, 3),
+ rpn_cls_label[i].reshape(-1, 1),
+ seg_mask[i].reshape(-1, 1)),
+ axis=1).astype('float16')
+ seg_output_file = os.path.join(seg_output_dir, '%06d.npy' % s_id)
+ np.save(seg_output_file, seg_result_data)
+
+ scores = roi_scores_row[i, :]
+ bbox3d = bboxes3d[i, :]
+ img_shape = reader.get_image_shape(s_id)
+ calib = reader.get_calib(s_id)
+
+ corners3d = kitti_utils.boxes3d_to_corners3d(bbox3d)
+ img_boxes, _ = calib.corners3d_to_img_boxes(corners3d)
+
+ img_boxes[:, 0] = np.clip(img_boxes[:, 0], 0, img_shape[1] - 1)
+ img_boxes[:, 1] = np.clip(img_boxes[:, 1], 0, img_shape[0] - 1)
+ img_boxes[:, 2] = np.clip(img_boxes[:, 2], 0, img_shape[1] - 1)
+ img_boxes[:, 3] = np.clip(img_boxes[:, 3], 0, img_shape[0] - 1)
+
+ img_boxes_w = img_boxes[:, 2] - img_boxes[:, 0]
+ img_boxes_h = img_boxes[:, 3] - img_boxes[:, 1]
+ box_valid_mask = np.logical_and(
+ img_boxes_w < img_shape[1] * 0.8, img_boxes_h < img_shape[0] * 0.8)
+
+ kitti_output_file = os.path.join(kitti_output_dir, '%06d.txt' % s_id)
+ with open(kitti_output_file, 'w') as f:
+ for k in range(bbox3d.shape[0]):
+ if box_valid_mask[k] == 0:
+ continue
+ x, z, ry = bbox3d[k, 0], bbox3d[k, 2], bbox3d[k, 6]
+ beta = np.arctan2(z, x)
+ alpha = -np.sign(beta) * np.pi / 2 + beta + ry
+
+ f.write('{} -1 -1 {:.4f} {:.4f} {:.4f} {:.4f} {:.4f} {:.4f} {:.4f} {:.4f} {:.4f} {:.4f} {:.4f} {:.4f} {:.4f}\n'.format(
+ classes, alpha, img_boxes[k, 0], img_boxes[k, 1], img_boxes[k, 2], img_boxes[k, 3],
+ bbox3d[k, 3], bbox3d[k, 4], bbox3d[k, 5], bbox3d[k, 0], bbox3d[k, 1], bbox3d[k, 2],
+ bbox3d[k, 6], scores[k]))
+
+
+def save_kitti_format(sample_id, calib, bbox3d, kitti_output_dir, scores, img_shape):
+ corners3d = kitti_utils.boxes3d_to_corners3d(bbox3d)
+ img_boxes, _ = calib.corners3d_to_img_boxes(corners3d)
+ img_boxes[:, 0] = np.clip(img_boxes[:, 0], 0, img_shape[1] - 1)
+ img_boxes[:, 1] = np.clip(img_boxes[:, 1], 0, img_shape[0] - 1)
+ img_boxes[:, 2] = np.clip(img_boxes[:, 2], 0, img_shape[1] - 1)
+ img_boxes[:, 3] = np.clip(img_boxes[:, 3], 0, img_shape[0] - 1)
+
+ img_boxes_w = img_boxes[:, 2] - img_boxes[:, 0]
+ img_boxes_h = img_boxes[:, 3] - img_boxes[:, 1]
+ box_valid_mask = np.logical_and(img_boxes_w < img_shape[1] * 0.8, img_boxes_h < img_shape[0] * 0.8)
+
+ kitti_output_file = os.path.join(kitti_output_dir, '%06d.txt' % sample_id)
+ with open(kitti_output_file, 'w') as f:
+ for k in range(bbox3d.shape[0]):
+ if box_valid_mask[k] == 0:
+ continue
+ x, z, ry = bbox3d[k, 0], bbox3d[k, 2], bbox3d[k, 6]
+ beta = np.arctan2(z, x)
+ alpha = -np.sign(beta) * np.pi / 2 + beta + ry
+
+ f.write('%s -1 -1 %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f\n' %
+ (cfg.CLASSES, alpha, img_boxes[k, 0], img_boxes[k, 1], img_boxes[k, 2], img_boxes[k, 3],
+ bbox3d[k, 3], bbox3d[k, 4], bbox3d[k, 5], bbox3d[k, 0], bbox3d[k, 1], bbox3d[k, 2],
+ bbox3d[k, 6], scores[k]))
+
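Both writers derive KITTI's observation angle alpha from the global heading ry. Pulled out as a standalone function — the same formula as above, assuming KITTI camera coordinates (x right, z forward):

```
import numpy as np

def ry_to_alpha(x, z, ry):
    # beta is the azimuth of the object center as seen from the camera
    beta = np.arctan2(z, x)
    # alpha re-expresses the heading relative to the viewing ray
    return -np.sign(beta) * np.pi / 2 + beta + ry
```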
diff --git a/PaddleCV/PaddleDetection/docs/GETTING_STARTED_cn.md b/PaddleCV/PaddleDetection/docs/GETTING_STARTED_cn.md
index b5dd6041033e539e18d59dc8669a3658ba395da2..15cb8cdb239e72f204d598adbf78627000cb5bec 100644
--- a/PaddleCV/PaddleDetection/docs/GETTING_STARTED_cn.md
+++ b/PaddleCV/PaddleDetection/docs/GETTING_STARTED_cn.md
@@ -5,7 +5,7 @@
## Training/Evaluation/Inference
-PaddleDetection provides run scripts for three functions: training/training/evaluation, and supports enabling specific features via optional arguments
+PaddleDetection provides run scripts for three functions: training/evaluation/inference, and supports enabling specific features via optional arguments
```bash
# set the PYTHONPATH
diff --git a/PaddleCV/PaddleDetection/ppdet/data/data_feed.py b/PaddleCV/PaddleDetection/ppdet/data/data_feed.py
index c384b2cb3241cc5c012cedc02dfc9cbeab524bf6..cbaebc2e4860e40481a8e1defdeea3edde22eb7e 100644
--- a/PaddleCV/PaddleDetection/ppdet/data/data_feed.py
+++ b/PaddleCV/PaddleDetection/ppdet/data/data_feed.py
@@ -452,7 +452,7 @@ class FasterRCNNTrainFeed(DataFeed):
'image', 'im_info', 'im_id', 'gt_box', 'gt_label',
'is_crowd'
],
- image_shape=[3, 800, 1333],
+ image_shape=[None, 3, None, None],
sample_transforms=[
DecodeImage(to_rgb=True),
RandomFlipImage(prob=0.5),
@@ -504,7 +504,7 @@ class FasterRCNNEvalFeed(DataFeed):
COCO_VAL_IMAGE_DIR).__dict__,
fields=['image', 'im_info', 'im_id', 'im_shape', 'gt_box',
'gt_label', 'is_difficult'],
- image_shape=[3, 800, 1333],
+ image_shape=[None, 3, None, None],
sample_transforms=[
DecodeImage(to_rgb=True),
NormalizeImage(mean=[0.485, 0.456, 0.406],
@@ -551,7 +551,7 @@ class FasterRCNNTestFeed(DataFeed):
dataset=SimpleDataSet(COCO_VAL_ANNOTATION,
COCO_VAL_IMAGE_DIR).__dict__,
fields=['image', 'im_info', 'im_id', 'im_shape'],
- image_shape=[3, 800, 1333],
+ image_shape=[None, 3, None, None],
sample_transforms=[
DecodeImage(to_rgb=True),
NormalizeImage(mean=[0.485, 0.456, 0.406],
@@ -598,7 +598,7 @@ class MaskRCNNTrainFeed(DataFeed):
'image', 'im_info', 'im_id', 'gt_box', 'gt_label',
'is_crowd', 'gt_mask'
],
- image_shape=[3, 800, 1333],
+ image_shape=[None, 3, None, None],
sample_transforms=[
DecodeImage(to_rgb=True),
RandomFlipImage(prob=0.5, is_mask_flip=True),
@@ -644,7 +644,7 @@ class MaskRCNNEvalFeed(DataFeed):
dataset=CocoDataSet(COCO_VAL_ANNOTATION,
COCO_VAL_IMAGE_DIR).__dict__,
fields=['image', 'im_info', 'im_id', 'im_shape'],
- image_shape=[3, 800, 1333],
+ image_shape=[None, 3, None, None],
sample_transforms=[
DecodeImage(to_rgb=True),
NormalizeImage(mean=[0.485, 0.456, 0.406],
@@ -696,7 +696,7 @@ class MaskRCNNTestFeed(DataFeed):
dataset=SimpleDataSet(COCO_VAL_ANNOTATION,
COCO_VAL_IMAGE_DIR).__dict__,
fields=['image', 'im_info', 'im_id', 'im_shape'],
- image_shape=[3, 800, 1333],
+ image_shape=[None, 3, None, None],
sample_transforms=[
DecodeImage(to_rgb=True),
NormalizeImage(
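The data_feed changes above replace the fixed 3x800x1333 image shape with fully dynamic dimensions. In fluid terms this corresponds to a placeholder like the following sketch, where None marks dimensions resolved at runtime (illustrative only, not a line from the patch):

```
import paddle.fluid as fluid

# batch size, height and width stay dynamic; only the channel count is fixed
image = fluid.data(name='image', shape=[None, 3, None, None], dtype='float32')
```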
diff --git a/PaddleCV/PaddleDetection/ppdet/data/tools/x2coco.py b/PaddleCV/PaddleDetection/ppdet/data/tools/x2coco.py
index da8e4aef4011ef1a23e7459bc473301e171b9fea..0379fab6335cb7886da8fe9f5170717a4453c6d6 100644
--- a/PaddleCV/PaddleDetection/ppdet/data/tools/x2coco.py
+++ b/PaddleCV/PaddleDetection/ppdet/data/tools/x2coco.py
@@ -277,13 +277,16 @@ def main():
indent=4,
cls=MyEncoder)
if args.val_proportion != 0:
- val_data_coco = deal_json(args.output_dir + '/val', args.json_input_dir)
+ val_data_coco = deal_json(args.dataset_type,
+ args.output_dir + '/val',
+ args.json_input_dir)
val_json_path = osp.join(args.output_dir + '/annotations',
'instance_val.json')
json.dump(
val_data_coco, open(val_json_path, 'w'), indent=4, cls=MyEncoder)
if args.test_proportion != 0:
- test_data_coco = deal_json(args.output_dir + '/test',
+ test_data_coco = deal_json(args.dataset_type,
+ args.output_dir + '/test',
args.json_input_dir)
test_json_path = osp.join(args.output_dir + '/annotations',
'instance_test.json')
diff --git a/PaddleCV/PaddleDetection/slim/distillation/README.md b/PaddleCV/PaddleDetection/slim/distillation/README.md
index e970cc42b54c17a6131c4873662fb2be46767b60..e46e6a2c92ac502f48d7d929a81b61228ed10d7a 100755
--- a/PaddleCV/PaddleDetection/slim/distillation/README.md
+++ b/PaddleCV/PaddleDetection/slim/distillation/README.md
@@ -135,7 +135,7 @@ python ../infer.py \
| Model |Box AP|
|---|---|
|baseline|76.2 |
-|after distillation|- |
+|after distillation|76.27 |
## FAQ
diff --git a/PaddleCV/PaddleDetection/slim/quantization/README.md b/PaddleCV/PaddleDetection/slim/quantization/README.md
index acb4c9efcbd49bccc4682c7eb7af294885e5d42a..d451e959a8828c24fcafb9ac52b8c5a2a3ce8de5 100644
--- a/PaddleCV/PaddleDetection/slim/quantization/README.md
+++ b/PaddleCV/PaddleDetection/slim/quantization/README.md
@@ -4,7 +4,7 @@
## Overview
-This example uses the [quantization strategy](https://github.com/PaddlePaddle/models/blob/develop/PaddleSlim/docs/tutorial.md#1-quantization-aware-training%E9%87%8F%E5%8C%96%E4%BB%8B%E7%BB%8D) provided by PaddleSlim to compress classification models.
+This example uses the [quantization strategy](https://github.com/PaddlePaddle/models/blob/develop/PaddleSlim/docs/tutorial.md#1-quantization-aware-training%E9%87%8F%E5%8C%96%E4%BB%8B%E7%BB%8D) provided by PaddleSlim to compress detection models.
Before reading this example, it is recommended that you first be familiar with:
- [Standard training of detection models](https://github.com/PaddlePaddle/models/tree/develop/PaddleCV/PaddleDetection)
@@ -41,10 +41,11 @@
step1: set the GPU cards
```
-export CUDA_VISIBLE_DEVICES=0
+export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
```
step2: start training
+
-Train with 8 cards in using the configuration files provided by PaddleDetection:
+Train with 8 cards using the configuration files provided by PaddleDetection:
```
python compress.py \
@@ -234,8 +235,11 @@ FP32模型可使用PaddleLite进行加载预测,可参见教程[Paddle-Lite如
|---|---|---|---|---|
|baseline|- |76.2%|- |-|
|abs_max|abs_max|- |- |-|
-|abs_max|moving_average_abs_max|- |- |-|
+|abs_max|moving_average_abs_max|74.48%|10.99|3348.68|
|channel_wise_abs_max|abs_max|- |- |-|
+> Note: phone used for the Paddle-Lite runs: Android phone,
+model: BKL-AL20, RAM: 4GB/6GB, CPU: octa-core, 4*A73 2.36GHz + 4*A53 1.8GHz, OS: EMUI 8.0, SoC: Kirin 970
+
## FAQ
diff --git a/PaddleCV/PaddleDetection/slim/quantization/compress.py b/PaddleCV/PaddleDetection/slim/quantization/compress.py
index b4a5553cf46eabcd25f7cc1ce6c50fccefd2e5df..0e145abcf70c54a3b7960243e20c0cb8cb6d39d9 100644
--- a/PaddleCV/PaddleDetection/slim/quantization/compress.py
+++ b/PaddleCV/PaddleDetection/slim/quantization/compress.py
@@ -49,7 +49,7 @@ from ppdet.data.data_feed import create_reader
from ppdet.utils.eval_utils import parse_fetches, eval_results
from ppdet.utils.stats import TrainingStats
from ppdet.utils.cli import ArgsParser, print_total_cfg
-from ppdet.utils.check import check_gpu, check_version
+from ppdet.utils.check import check_gpu
import ppdet.utils.checkpoint as checkpoint
from ppdet.modeling.model_input import create_feed
@@ -121,8 +121,7 @@ def main():
# check if set use_gpu=True in paddlepaddle cpu version
check_gpu(cfg.use_gpu)
- # print_total_cfg(cfg)
- #check_version()
+
if cfg.use_gpu:
devices_num = fluid.core.get_cuda_device_count()
else:
diff --git a/PaddleCV/PaddleDetection/slim/quantization/freeze.py b/PaddleCV/PaddleDetection/slim/quantization/freeze.py
index 38c06578e3d22e1cc4f2bdcc933298553c1c1f37..42c7bc62fd771366430f3658d9446a0f12fe2125 100644
--- a/PaddleCV/PaddleDetection/slim/quantization/freeze.py
+++ b/PaddleCV/PaddleDetection/slim/quantization/freeze.py
@@ -195,19 +195,6 @@ def main():
model_filename='model',
params_filename='weights')
- logger.info("convert the freezed pass to paddle-lite execution")
- mobile_pass = TransformForMobilePass()
- mobile_pass.apply(test_graph)
- mobile_program = test_graph.to_program()
- fluid.io.save_inference_model(
- dirname=os.path.join(FLAGS.save_path, 'mobile'),
- feeded_var_names=feed_names,
- target_vars=fetch_targets,
- executor=exe,
- main_program=mobile_program,
- model_filename='model',
- params_filename='weights')
-
if __name__ == '__main__':
parser = ArgsParser()
diff --git a/PaddleCV/PaddleDetection/slim/quantization/yolov3_mobilenet_v1_slim.yaml b/PaddleCV/PaddleDetection/slim/quantization/yolov3_mobilenet_v1_slim.yaml
index 60a66f656f9e419cd862231654ab4eaca6057ea2..9d453450d91edf4d10c6aa5fd9fd29f21953e5d3 100644
--- a/PaddleCV/PaddleDetection/slim/quantization/yolov3_mobilenet_v1_slim.yaml
+++ b/PaddleCV/PaddleDetection/slim/quantization/yolov3_mobilenet_v1_slim.yaml
@@ -5,7 +5,6 @@ strategies:
start_epoch: 0
end_epoch: 4
float_model_save_path: './output/yolov3/float'
- mobile_model_save_path: './output/yolov3/mobile'
int8_model_save_path: './output/yolov3/int8'
weight_bits: 8
activation_bits: 8
diff --git a/PaddleCV/PaddleDetection/tools/train.py b/PaddleCV/PaddleDetection/tools/train.py
index 08e1fc63437c78722e11429d94468dcf2e5eee2c..6d04c665ecbae873a043624a80661c385194fe9e 100644
--- a/PaddleCV/PaddleDetection/tools/train.py
+++ b/PaddleCV/PaddleDetection/tools/train.py
@@ -19,6 +19,7 @@ from __future__ import print_function
import os
import time
import numpy as np
+import random
import datetime
from collections import deque
@@ -36,6 +37,7 @@ set_paddle_flags(
)
from paddle import fluid
+from paddle.fluid import profiler
from ppdet.experimental import mixed_precision_context
from ppdet.core.workspace import load_config, merge_config, create
@@ -61,10 +63,13 @@ def main():
FLAGS.dist = 'PADDLE_TRAINER_ID' in env and 'PADDLE_TRAINERS_NUM' in env
if FLAGS.dist:
trainer_id = int(env['PADDLE_TRAINER_ID'])
- import random
local_seed = (99 + trainer_id)
random.seed(local_seed)
np.random.seed(local_seed)
+
+ if FLAGS.enable_ce:
+ random.seed(0)
+ np.random.seed(0)
cfg = load_config(FLAGS.config)
if 'architecture' in cfg:
@@ -111,6 +116,9 @@ def main():
# build program
startup_prog = fluid.Program()
train_prog = fluid.Program()
+ if FLAGS.enable_ce:
+ startup_prog.random_seed = 1000
+ train_prog.random_seed = 1000
with fluid.program_guard(train_prog, startup_prog):
with fluid.unique_name.guard():
model = create(main_arch)
@@ -257,6 +265,18 @@ def main():
strs = 'iter: {}, lr: {:.6f}, {}, time: {:.3f}, eta: {}'.format(
it, np.mean(outs[-1]), logs, time_cost, eta)
logger.info(strs)
+
+ #only for continuous evaluation
+ if FLAGS.enable_ce and it == cfg.max_iters - 1:
+ print("kpis\t{}_train_loss\t{}".format(cfg.architecture, stats['loss']))
+ print("kpis\t{}_train_time\t{}".format(cfg.architecture, time_cost))
+
+ # profiler tools, used for benchmark
+ if FLAGS.is_profiler and it == 5:
+ profiler.start_profiler("All")
+ elif FLAGS.is_profiler and it == 10:
+ profiler.stop_profiler("total", FLAGS.profiler_path)
+ return
if (it > 0 and it % cfg.snapshot_iter == 0 or it == cfg.max_iters - 1) \
and (not FLAGS.dist or trainer_id == 0):
@@ -334,5 +354,23 @@ if __name__ == '__main__':
type=str,
default="tb_log_dir/scalar",
help='Tensorboard logging directory for scalar.')
+ parser.add_argument(
+ '--enable_ce',
+ action='store_true',
+ help="If set, enable continuous evaluation job. "
+ "This flag is only used for internal test.")
+
+ #NOTE:args for profiler tools, used for benchmark
+ parser.add_argument(
+ '--is_profiler',
+ type=int,
+ default=0,
+ help='The switch of profiler tools. (used for benchmark)')
+ parser.add_argument(
+ '--profiler_path',
+ type=str,
+ default="./",
+ help='The profiler output file path. (used for benchmark)')
FLAGS = parser.parse_args()
main()
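The profiler hook added above samples a fixed window of iterations after warm-up. The pattern, reduced to its core — num_iters, train_one_iter, is_profiler, and profiler_path are placeholders standing in for the script's own loop and flags:

```
from paddle.fluid import profiler

for it in range(num_iters):
    train_one_iter()
    if is_profiler and it == 5:       # let a few warm-up iterations pass
        profiler.start_profiler("All")
    elif is_profiler and it == 10:    # profile iterations 6..10, then dump
        profiler.stop_profiler("total", profiler_path)
        break
```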
diff --git a/PaddleCV/PaddleGAN/README.md b/PaddleCV/PaddleGAN/README.md
index 97b0f3985b149b30bab7d1123032f428dd8bc5a0..cb5453984bef0b7bb46b320ff24172413e52c124 100644
--- a/PaddleCV/PaddleGAN/README.md
+++ b/PaddleCV/PaddleGAN/README.md
@@ -12,7 +12,6 @@
- [FAQ](#faq)
- [References](#参考论文)
- [Release Notes](#版本更新)
-- [Authors](#作者)
## Model Overview
@@ -312,9 +311,5 @@ SPADE整体的网络结构[10]
- 6/2019: Added CGAN, DCGAN, Pix2Pix, CycleGAN, StarGAN, AttGAN, STGAN
-## Authors
-- [ceci3](https://github.com/ceci3)
-- [zhumanyu](https://github.com/zhumanyu)
-
## How to Contribute Code
If you can fix an issue or add a new feature, feel free to submit a PR. If the PR is accepted, we will score the contribution by quality and difficulty (0-5, higher is better). Once you accumulate 10 points, you may contact us for an interview opportunity or a recommendation letter.
diff --git a/PaddleCV/PaddleGAN/cycle_gan/train.py b/PaddleCV/PaddleGAN/cycle_gan/train.py
index a85da0ae2c97e95aa7d8de5a6ef5661988c84971..5fadd201ef250f31023b3858c1dffeb992f3b19a 100644
--- a/PaddleCV/PaddleGAN/cycle_gan/train.py
+++ b/PaddleCV/PaddleGAN/cycle_gan/train.py
@@ -69,6 +69,11 @@ add_arg('save_checkpoints', bool, True, "Whether to save checkpoints.")
add_arg('run_test', bool, True, "Whether to run test.")
add_arg('use_gpu', bool, True, "Whether to use GPU to train.")
add_arg('profile', bool, False, "Whether to profile.")
+
+# NOTE: args for profiler, used for benchmark
+add_arg('profiler_path', str, './profiler_cyclegan', "the path of profiler output files. used for benchmark")
+add_arg('max_iter', int, 0, "the max batch nums to train. used for benchmark")
+
add_arg('run_ce', bool, False, "Whether to run for model ce.")
# yapf: enable
@@ -214,9 +219,14 @@ def train(args):
loss_name=d_A_trainer.d_loss_A.name,
build_strategy=build_strategy,
exec_strategy=exec_strategy)
+
+ total_batch_num = 0 # this is for benchmark
+
for epoch in range(args.epoch):
batch_id = 0
for i in range(max_images_num):
+ if args.max_iter and total_batch_num == args.max_iter: # this for benchmark
+ return
data_A = next(A_reader)
data_B = next(B_reader)
tensor_A = fluid.LoDTensor()
@@ -265,6 +275,12 @@ def train(args):
losses[1].append(d_A_loss[0])
sys.stdout.flush()
batch_id += 1
+ total_batch_num = total_batch_num + 1 # this is for benchmark
+ # profiler tools for benchmark
+ if args.profile and epoch == 0 and batch_id == 10:
+ profiler.reset_profiler()
+ elif args.profile and epoch == 0 and batch_id == 15:
+ return
if args.run_test and not args.run_ce:
test(epoch)
@@ -281,7 +297,7 @@ if __name__ == "__main__":
print_arguments(args)
if args.profile:
if args.use_gpu:
- with profiler.cuda_profiler("cuda_profiler.txt", 'csv') as nvprof:
+ with profiler.profiler('All', 'total', args.profiler_path) as prof:
train(args)
else:
with profiler.profiler("CPU", sorted_key='total') as cpuprof:
diff --git a/PaddleCV/PaddleGAN/data_reader.py b/PaddleCV/PaddleGAN/data_reader.py
index ef18d7e05a70dfcef605597da55d80c1db794c94..407855abca1c3328e931841f404a3f80a9b6cc36 100644
--- a/PaddleCV/PaddleGAN/data_reader.py
+++ b/PaddleCV/PaddleGAN/data_reader.py
@@ -308,7 +308,7 @@ class triplex_reader_creator(reader_creator):
input_label = np.zeros(
(args.label_nc, index.shape[1], index.shape[2]))
np.put_along_axis(input_label, index, 1.0, 0)
- img1 = input_label
+ img1 = input_label.astype('float32')
img2 = (np.array(img2).astype('float32') / 255.0 - 0.5) / 0.5
img2 = img2.transpose([2, 0, 1])
if not args.no_instance:
@@ -630,6 +630,7 @@ class data_reader(object):
batch_size=self.cfg.batch_size,
mode="TRAIN")
reader_test = None
+ id2name = None
if self.cfg.run_test:
test_list = os.path.join(dataset_dir, "test.txt")
if self.cfg.test_list is not None:
diff --git a/PaddleCV/PaddleGAN/infer.py b/PaddleCV/PaddleGAN/infer.py
index 04c2fc9226ba27b59bbfc5da9cb625d29b08cdf9..4eed0a5deef604a68352985fe00fb0ba76b35c2f 100644
--- a/PaddleCV/PaddleGAN/infer.py
+++ b/PaddleCV/PaddleGAN/infer.py
@@ -305,16 +305,22 @@ def infer(args):
id2name = test_reader.id2name
for data in loader():
real_img, image_name = data[0]['input'], data[0]['image_name']
- image_name = id2name[np.array(image_name).astype('int32')[0]]
- print("read: ", image_name)
+ image_names = []
+ for name in image_name:
+ image_names.append(id2name[np.array(name).astype('int32')[0]])
+ print("read: ", image_names)
fake_temp = exe.run(fetch_list=[fake.name],
feed={"input": real_img})
- fake_temp = np.squeeze(fake_temp[0]).transpose([1, 2, 0])
- input_temp = np.squeeze(np.array(real_img)[0]).transpose([1, 2, 0])
+ fake_temp = save_batch_image(fake_temp[0])
+ input_temp = save_batch_image(np.array(real_img))
- imageio.imwrite(
- os.path.join(args.output, "fake_" + image_name), (
- (fake_temp + 1) * 127.5).astype(np.uint8))
+ for i, name in enumerate(image_names):
+ imageio.imwrite(
+ os.path.join(args.output, "fake_" + name), (
+ (fake_temp[i] + 1) * 127.5).astype(np.uint8))
+ imageio.imwrite(
+ os.path.join(args.output, "input_" + name), (
+ (input_temp[i] + 1) * 127.5).astype(np.uint8))
elif args.model_net == 'SPADE':
test_reader = triplex_reader_creator(
image_dir=args.dataset_dir,
diff --git a/PaddleCV/PaddleGAN/network/AttGAN_network.py b/PaddleCV/PaddleGAN/network/AttGAN_network.py
index 7d5640ec680730e612f5db39b076dcc0018b33f2..a447e66d16001c834f082e28fcee9da809ae6616 100755
--- a/PaddleCV/PaddleGAN/network/AttGAN_network.py
+++ b/PaddleCV/PaddleGAN/network/AttGAN_network.py
@@ -62,7 +62,7 @@ class AttGAN_model(object):
"""Concatenate attribute vector on feature map axis."""
ones = fluid.layers.fill_constant_batch_size_like(
z, [-1, a.shape[1], z.shape[2], z.shape[3]], "float32", 1.0)
- return fluid.layers.concat([z, ones * a], axis=1)
+ return fluid.layers.concat([z, fluid.layers.elementwise_mul(ones, a, axis=0)], axis=1)
def Genc(self, input, dim=64, n_layers=5, name='G_enc_', is_test=False):
z = input
diff --git a/PaddleCV/PaddleGAN/network/DCGAN_network.py b/PaddleCV/PaddleGAN/network/DCGAN_network.py
index c0e67bdc523fd2563aa1f9470db19ecae10607f5..13ba14d452f81ce5f931f1a343d5d60213e42eb6 100644
--- a/PaddleCV/PaddleGAN/network/DCGAN_network.py
+++ b/PaddleCV/PaddleGAN/network/DCGAN_network.py
@@ -89,5 +89,5 @@ class DCGAN_model(object):
norm=self.norm,
activation_fn='leaky_relu',
name=name + '_l1')
- out = linear(o_l1, 1, activation_fn='sigmoid', name=name + '_l2')
+ out = linear(o_l1, 1, activation_fn=None, name=name + '_l2')
return out
diff --git a/PaddleCV/PaddleGAN/network/STGAN_network.py b/PaddleCV/PaddleGAN/network/STGAN_network.py
index 6ea82687d1f0bd51b0188a62f78fcf390b45b0dd..75da511ec817375269002c42ee4788d57e6bffad 100755
--- a/PaddleCV/PaddleGAN/network/STGAN_network.py
+++ b/PaddleCV/PaddleGAN/network/STGAN_network.py
@@ -84,7 +84,7 @@ class STGAN_model(object):
"""Concatenate attribute vector on feature map axis."""
ones = fluid.layers.fill_constant_batch_size_like(
z, [-1, a.shape[1], z.shape[2], z.shape[3]], "float32", 1.0)
- return fluid.layers.concat([z, ones * a], axis=1)
+ return fluid.layers.concat([z, fluid.layers.elementwise_mul(ones, a, axis=0)], axis=1)
def Genc(self, input, dim=64, n_layers=5, name='G_enc_', is_test=False):
z = input
diff --git a/PaddleCV/PaddleGAN/network/base_network.py b/PaddleCV/PaddleGAN/network/base_network.py
index e3125a32b08be0d2441d95a6e02fd63374d8619a..50b2d86449f878e1f480fb66e63c1dbe9ad2836b 100644
--- a/PaddleCV/PaddleGAN/network/base_network.py
+++ b/PaddleCV/PaddleGAN/network/base_network.py
@@ -64,12 +64,6 @@ def norm_layer(input,
moving_variance_name=name + '_var')
elif norm_type == 'instance_norm':
- helper = fluid.layer_helper.LayerHelper("instance_norm", **locals())
- dtype = helper.input_dtype()
- epsilon = 1e-5
- mean = fluid.layers.reduce_mean(input, dim=[2, 3], keep_dim=True)
- var = fluid.layers.reduce_mean(
- fluid.layers.square(input - mean), dim=[2, 3], keep_dim=True)
if name is not None:
scale_name = name + "_scale"
offset_name = name + "_offset"
@@ -91,15 +85,8 @@ def norm_layer(input,
name=offset_name,
initializer=fluid.initializer.Constant(0.0),
trainable=False)
- scale = helper.create_parameter(
- attr=scale_param, shape=input.shape[1:2], dtype=dtype)
- offset = helper.create_parameter(
- attr=offset_param, shape=input.shape[1:2], dtype=dtype)
-
- tmp = fluid.layers.elementwise_mul(x=(input - mean), y=scale, axis=1)
- tmp = tmp / fluid.layers.sqrt(var + epsilon)
- tmp = fluid.layers.elementwise_add(tmp, offset, axis=1)
- return tmp
+ return fluid.layers.instance_norm(
+ input, param_attr=scale_param, bias_attr=offset_param)
else:
raise NotImplementedError("norm type: [%s] is not support" % norm_type)
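The hand-rolled instance norm is swapped for the fused fluid.layers.instance_norm op. For reference, the computation the deleted lines performed, per sample and per channel (numpy sketch; eps matches the removed 1e-5):

```
import numpy as np

def instance_norm_ref(x, scale, offset, eps=1e-5):
    """x: N x C x H x W; scale, offset: per-channel (C,) parameters."""
    mean = x.mean(axis=(2, 3), keepdims=True)
    var = ((x - mean) ** 2).mean(axis=(2, 3), keepdims=True)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return x_hat * scale[None, :, None, None] + offset[None, :, None, None]
```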
diff --git a/PaddleCV/PaddleGAN/train.py b/PaddleCV/PaddleGAN/train.py
index 3008dd24873588e990d1a0e235719d7a13b988ca..a5339021baa039d2c22ea23f735e80b546848d2f 100644
--- a/PaddleCV/PaddleGAN/train.py
+++ b/PaddleCV/PaddleGAN/train.py
@@ -70,7 +70,7 @@ if __name__ == "__main__":
if cfg.profile:
if cfg.use_gpu:
with fluid.profiler.profiler('All', 'total',
- '/tmp/profile') as prof:
+ cfg.profiler_path) as prof:
train(cfg)
else:
with fluid.profiler.profiler("CPU", sorted_key='total') as cpuprof:
diff --git a/PaddleCV/PaddleGAN/trainer/AttGAN.py b/PaddleCV/PaddleGAN/trainer/AttGAN.py
index 81fe56977e6d4e64776188801b1cb4d717b9d7b6..02d840d6f163bfe0d62e9631274b20d10d7cf192 100644
--- a/PaddleCV/PaddleGAN/trainer/AttGAN.py
+++ b/PaddleCV/PaddleGAN/trainer/AttGAN.py
@@ -156,8 +156,13 @@ class DTrainer():
def gradient_penalty(self, f, real, fake=None, cfg=None, name=None):
def _interpolate(a, b=None):
if b is None:
- beta = fluid.layers.uniform_random_batch_size_like(
- input=a, shape=a.shape, min=0.0, max=1.0)
+ if cfg.enable_ce:
+ beta = fluid.layers.uniform_random_batch_size_like(
+ input=a, shape=a.shape, min=0.0, max=1.0, seed=1)
+ else:
+ beta = fluid.layers.uniform_random_batch_size_like(
+ input=a, shape=a.shape, min=0.0, max=1.0)
+
mean = fluid.layers.reduce_mean(
a, dim=list(range(len(a.shape))), keep_dim=True)
input_sub_mean = fluid.layers.elementwise_sub(a, mean, axis=0)
@@ -167,9 +172,14 @@ class DTrainer():
keep_dim=True)
b = beta * fluid.layers.sqrt(var) * 0.5 + a
shape = [a.shape[0]]
- alpha = fluid.layers.uniform_random_batch_size_like(
- input=a, shape=shape, min=0.0, max=1.0)
- inner = (b - a) * alpha + a
+ if cfg.enable_ce:
+ alpha = fluid.layers.uniform_random_batch_size_like(
+ input=a, shape=shape, min=0.0, max=1.0, seed=1)
+ else:
+ alpha = fluid.layers.uniform_random_batch_size_like(
+ input=a, shape=shape, min=0.0, max=1.0)
+
+ inner = fluid.layers.elementwise_mul((b-a), alpha, axis=0) + a
return inner
x = _interpolate(real, fake)
@@ -254,6 +264,10 @@ class AttGAN(object):
default=None,
help="the normalization in discriminator, choose in [None, instance_norm]"
)
+ parser.add_argument(
+ '--enable_ce',
+ action='store_true',
+ help="if set, run the tasks with continuous evaluation logs")
return parser
@@ -282,6 +296,9 @@ class AttGAN(object):
name='label_org_', shape=[None, self.cfg.c_dim], dtype='float32')
label_trg_ = fluid.data(
name='label_trg_', shape=[None, self.cfg.c_dim], dtype='float32')
+ # used for continuous evaluation
+ if self.cfg.enable_ce:
+ fluid.default_startup_program().random_seed = 90
py_reader = fluid.io.PyReader(
feed_list=[image_real, label_org, label_trg],
@@ -325,7 +342,11 @@ class AttGAN(object):
dis_trainer.program).with_data_parallel(
loss_name=dis_trainer.d_loss.name,
build_strategy=build_strategy)
-
+ # used for continuous evaluation
+ if self.cfg.enable_ce:
+ gen_trainer_program.random_seed = 90
+ dis_trainer_program.random_seed = 90
+
t_time = 0
for epoch_id in range(self.cfg.epoch):
@@ -367,6 +388,8 @@ class AttGAN(object):
d_loss_gp[0], batch_time))
sys.stdout.flush()
batch_id += 1
+ if self.cfg.enable_ce and batch_id == 100:
+ break
if self.cfg.run_test:
image_name = fluid.data(
@@ -393,3 +416,13 @@ class AttGAN(object):
"net_G")
utility.checkpoints(epoch_id, self.cfg, exe, dis_trainer,
"net_D")
+ # used for continuous evaluation
+ if self.cfg.enable_ce:
+ device_num = fluid.core.get_cuda_device_count() if self.cfg.use_gpu else 1
+ print("kpis\tattgan_g_loss_fake_card{}\t{}".format(device_num, g_loss_fake[0]))
+ print("kpis\tattgan_g_loss_rec_card{}\t{}".format(device_num, g_loss_rec[0]))
+ print("kpis\tattgan_g_loss_cls_card{}\t{}".format(device_num, g_loss_cls[0]))
+ print("kpis\tattgan_d_loss_real_card{}\t{}".format(device_num, d_loss_real[0]))
+ print("kpis\tattgan_d_loss_fake_card{}\t{}".format(device_num,d_loss_fake[0]))
+ print("kpis\tattgan_d_loss_gp_card{}\t{}".format(device_num,d_loss_gp[0]))
+ print("kpis\tattgan_Batch_time_cost_card{}\t{}".format(device_num,batch_time))
diff --git a/PaddleCV/PaddleGAN/trainer/CycleGAN.py b/PaddleCV/PaddleGAN/trainer/CycleGAN.py
index 5c6d7909114270b10d017425fa814d11bc477418..62b118eba21c8c2c7d5d221c9effd0c28da4fa54 100644
--- a/PaddleCV/PaddleGAN/trainer/CycleGAN.py
+++ b/PaddleCV/PaddleGAN/trainer/CycleGAN.py
@@ -207,7 +207,10 @@ class CycleGAN(object):
type=int,
default=3,
help="only used when CycleGAN discriminator is nlayers")
-
+ parser.add_argument(
+ '--enable_ce',
+ action='store_true',
+ help="if set, run the tasks with continuous evaluation logs")
return parser
def __init__(self,
@@ -237,6 +240,9 @@ class CycleGAN(object):
name='fake_pool_A', shape=data_shape, dtype='float32')
fake_pool_B = fluid.data(
name='fake_pool_B', shape=data_shape, dtype='float32')
+ # used for continuous evaluation
+ if self.cfg.enable_ce:
+ fluid.default_startup_program().random_seed = 90
A_py_reader = fluid.io.PyReader(
feed_list=[input_A],
@@ -317,6 +323,10 @@ class CycleGAN(object):
fake_pool_B = B_pool.pool_image(fake_B_tmp)
fake_pool_A = A_pool.pool_image(fake_A_tmp)
+ if self.cfg.enable_ce:
+ fake_pool_B = fake_B_tmp
+ fake_pool_A = fake_A_tmp
+
# optimize the d_A network
d_A_loss = exe.run(
d_A_trainer_program,
@@ -344,6 +354,9 @@ class CycleGAN(object):
sys.stdout.flush()
batch_id += 1
+ # used for continuous evaluation
+ if self.cfg.enable_ce and batch_id == 10:
+ break
if self.cfg.run_test:
A_image_name = fluid.data(
@@ -390,3 +403,26 @@ class CycleGAN(object):
"net_DA")
utility.checkpoints(epoch_id, self.cfg, exe, d_B_trainer,
"net_DB")
+
+ # used for continuous evaluation
+ if self.cfg.enable_ce:
+ device_num = fluid.core.get_cuda_device_count(
+ ) if self.cfg.use_gpu else 1
+ print("kpis\tcyclegan_g_A_loss_card{}\t{}".format(device_num,
+ g_A_loss[0]))
+ print("kpis\tcyclegan_g_A_cyc_loss_card{}\t{}".format(
+ device_num, g_A_cyc_loss[0]))
+ print("kpis\tcyclegan_g_A_idt_loss_card{}\t{}".format(
+ device_num, g_A_idt_loss[0]))
+ print("kpis\tcyclegan_d_A_loss_card{}\t{}".format(device_num,
+ d_A_loss[0]))
+ print("kpis\tcyclegan_g_B_loss_card{}\t{}".format(device_num,
+ g_B_loss[0]))
+ print("kpis\tcyclegan_g_B_cyc_loss_card{}\t{}".format(
+ device_num, g_B_cyc_loss[0]))
+ print("kpis\tcyclegan_g_B_idt_loss_card{}\t{}".format(
+ device_num, g_B_idt_loss[0]))
+ print("kpis\tcyclegan_d_B_loss_card{}\t{}".format(device_num,
+ d_B_loss[0]))
+ print("kpis\tcyclegan_Batch_time_cost_card{}\t{}".format(
+ device_num, batch_time))
diff --git a/PaddleCV/PaddleGAN/trainer/DCGAN.py b/PaddleCV/PaddleGAN/trainer/DCGAN.py
index 4301f4d906ac46ce9f1540da11174321e2003258..a14ecb431fb7f912e27e0767b70d1c43b7780c79 100644
--- a/PaddleCV/PaddleGAN/trainer/DCGAN.py
+++ b/PaddleCV/PaddleGAN/trainer/DCGAN.py
@@ -27,6 +27,7 @@ import matplotlib
matplotlib.use('agg')
import matplotlib.pyplot as plt
import paddle.fluid as fluid
+import random
class GTrainer():
@@ -78,7 +79,10 @@ class DCGAN(object):
def add_special_args(self, parser):
parser.add_argument(
'--noise_size', type=int, default=100, help="the noise dimension")
-
+ parser.add_argument(
+ '--enable_ce',
+ action='store_true',
+ help="if set, run the tasks with continuous evaluation logs")
return parser
def __init__(self, cfg=None, train_reader=None):
@@ -90,6 +94,11 @@ class DCGAN(object):
noise = fluid.data(
name='noise', shape=[None, self.cfg.noise_size], dtype='float32')
label = fluid.data(name='label', shape=[None, 1], dtype='float32')
+ # used for continuous evaluation
+ if self.cfg.enable_ce:
+ fluid.default_startup_program().random_seed = 90
+ random.seed(0)
+ np.random.seed(0)
g_trainer = GTrainer(noise, label, self.cfg)
d_trainer = DTrainer(img, label, self.cfg)
@@ -200,3 +209,11 @@ class DCGAN(object):
if self.cfg.save_checkpoints:
utility.checkpoints(epoch_id, self.cfg, exe, g_trainer, "net_G")
utility.checkpoints(epoch_id, self.cfg, exe, d_trainer, "net_D")
+ # used for continuous evaluation
+ if self.cfg.enable_ce:
+ device_num = fluid.core.get_cuda_device_count(
+ ) if self.cfg.use_gpu else 1
+ print("kpis\tdcgan_d_loss_card{}\t{}".format(device_num, d_loss[0]))
+ print("kpis\tdcgan_g_loss_card{}\t{}".format(device_num, g_loss[0]))
+ print("kpis\tdcgan_Batch_time_cost_card{}\t{}".format(device_num,
+ batch_time))
diff --git a/PaddleCV/PaddleGAN/trainer/Pix2pix.py b/PaddleCV/PaddleGAN/trainer/Pix2pix.py
index 3595bea04fdab1f8cb76897151a57513c04ef1c5..1c340a57c3769de9448e4dae1c6352d7f424212f 100644
--- a/PaddleCV/PaddleGAN/trainer/Pix2pix.py
+++ b/PaddleCV/PaddleGAN/trainer/Pix2pix.py
@@ -18,8 +18,10 @@ from __future__ import print_function
from network.Pix2pix_network import Pix2pix_model
from util import utility
import paddle.fluid as fluid
+from paddle.fluid import profiler
import sys
import time
+import numpy as np
class GTrainer():
@@ -195,7 +197,10 @@ class Pix2pix(object):
type=int,
default=3,
help="only used when Pix2pix discriminator is nlayers")
-
+ parser.add_argument(
+ '--enable_ce',
+ action='store_true',
+ help="if set, run the tasks with continuous evaluation logs")
return parser
def __init__(self,
@@ -217,6 +222,9 @@ class Pix2pix(object):
input_B = fluid.data(name='input_B', shape=data_shape, dtype='float32')
input_fake = fluid.data(
name='input_fake', shape=data_shape, dtype='float32')
+ # used for continuous evaluation
+ if self.cfg.enable_ce:
+ fluid.default_startup_program().random_seed = 90
loader = fluid.io.DataLoader.from_generator(
feed_list=[input_A, input_B],
@@ -255,12 +263,15 @@ class Pix2pix(object):
t_time = 0
+ total_train_batch = 0 # used for benchmark
+
for epoch_id in range(self.cfg.epoch):
batch_id = 0
for tensor in loader():
+ if self.cfg.max_iter and total_train_batch == self.cfg.max_iter: # used for benchmark
+ return
s_time = time.time()
- tensor_A, tensor_B = tensor[0]['input_A'], tensor[0]['input_B']
# optimize the generator network
g_loss_gan, g_loss_l1, fake_B_tmp = exe.run(
gen_trainer_program,
@@ -270,17 +281,18 @@ class Pix2pix(object):
],
feed=tensor)
+ devices_num = utility.get_device_num(self.cfg)
+ fake_per_device = int(len(fake_B_tmp) / devices_num)
+ for dev in range(devices_num):
+ tensor[dev]['input_fake'] = fake_B_tmp[dev * fake_per_device : (dev+1) * fake_per_device]
+
# optimize the discriminator network
d_loss_real, d_loss_fake = exe.run(dis_trainer_program,
fetch_list=[
dis_trainer.d_loss_real,
dis_trainer.d_loss_fake
],
- feed={
- "input_A": tensor_A,
- "input_B": tensor_B,
- "input_fake": fake_B_tmp
- })
+ feed=tensor)
batch_time = time.time() - s_time
t_time += batch_time
@@ -294,6 +306,12 @@ class Pix2pix(object):
sys.stdout.flush()
batch_id += 1
+ total_train_batch += 1 # used for benchmark
+ # profiler tools
+ if self.cfg.profile and epoch_id == 0 and batch_id == self.cfg.print_freq:
+ profiler.reset_profiler()
+ elif self.cfg.profile and epoch_id == 0 and batch_id == self.cfg.print_freq + 5:
+ return
if self.cfg.run_test:
image_name = fluid.data(
@@ -325,3 +343,16 @@ class Pix2pix(object):
"net_G")
utility.checkpoints(epoch_id, self.cfg, exe, dis_trainer,
"net_D")
+ if self.cfg.enable_ce:
+ device_num = fluid.core.get_cuda_device_count(
+ ) if self.cfg.use_gpu else 1
+ print("kpis\tpix2pix_g_loss_gan_card{}\t{}".format(device_num,
+ g_loss_gan[0]))
+ print("kpis\tpix2pix_g_loss_l1_card{}\t{}".format(device_num,
+ g_loss_l1[0]))
+ print("kpis\tpix2pix_d_loss_real_card{}\t{}".format(device_num,
+ d_loss_real[0]))
+ print("kpis\tpix2pix_d_loss_fake_card{}\t{}".format(device_num,
+ d_loss_fake[0]))
+ print("kpis\tpix2pix_Batch_time_cost_card{}\t{}".format(device_num,
+ batch_time))
diff --git a/PaddleCV/PaddleGAN/trainer/SPADE.py b/PaddleCV/PaddleGAN/trainer/SPADE.py
index b11c9b6c556e90ce3db50bd5de8e22c506ec6b1a..59d9df64334e6ec2f254dfdb018266047f4ccd1f 100644
--- a/PaddleCV/PaddleGAN/trainer/SPADE.py
+++ b/PaddleCV/PaddleGAN/trainer/SPADE.py
@@ -268,7 +268,11 @@ class SPADE(object):
type=bool,
default=False,
help="Whether to use instance label.")
-
+ parser.add_argument(
+ '--enable_ce',
+ action='store_true',
+ help="If set, enable continuous evaluation job.")
return parser
def __init__(self,
@@ -298,6 +302,9 @@ class SPADE(object):
name='input_ins', shape=edge_shape, dtype='float32')
input_fake = fluid.data(
name='input_fake', shape=data_shape, dtype='float32')
+ # used for continuous evaluation
+ if self.cfg.enable_ce:
+ fluid.default_startup_program().random_seed = 90
gen_trainer = GTrainer(input_A, input_B, input_C, self.cfg,
self.batch_num)
@@ -343,7 +350,11 @@ class SPADE(object):
dis_trainer.program).with_data_parallel(
loss_name=dis_trainer.d_loss.name,
build_strategy=build_strategy)
-
+ # used for continuous evaluation
+ if self.cfg.enable_ce:
+ gen_trainer_program.random_seed = 90
+ dis_trainer_program.random_seed = 90
+
t_time = 0
for epoch_id in range(self.cfg.epoch):
@@ -391,7 +402,6 @@ class SPADE(object):
sys.stdout.flush()
batch_id += 1
-
if self.cfg.run_test:
test_program = gen_trainer.infer_program
image_name = fluid.data(
@@ -422,3 +432,12 @@ class SPADE(object):
"net_G")
utility.checkpoints(epoch_id, self.cfg, exe, dis_trainer,
"net_D")
+ # used for continuous evaluation
+ if self.cfg.enable_ce:
+ device_num = fluid.core.get_cuda_device_count() if self.cfg.use_gpu else 1
+ print("kpis\tspade_g_loss_gan_card{}\t{}".format(device_num, g_loss_gan[0]))
+ print("kpis\tspade_g_loss_vgg_card{}\t{}".format(device_num,g_loss_vgg[0]))
+ print("kpis\tspade_g_loss_feat_card{}\t{}".format(device_num,g_loss_feat[0]))
+ print("kpis\tspade_d_loss_real_card{}\t{}".format(device_num,d_loss_real[0]))
+ print("kpis\tspade_d_loss_fake_card{}\t{}".format(device_num,d_loss_fake[0]))
+ print("kpis\tspade_Batch_time_cost_card{}\t{}".format(device_num,batch_time))
diff --git a/PaddleCV/PaddleGAN/trainer/STGAN.py b/PaddleCV/PaddleGAN/trainer/STGAN.py
index 6e3c6156ae2f19d277aae0a89d68ed6db40e05da..7d4275c0dd3f64ac1924b64bcc15d677f4e8a1e3 100644
--- a/PaddleCV/PaddleGAN/trainer/STGAN.py
+++ b/PaddleCV/PaddleGAN/trainer/STGAN.py
@@ -17,10 +17,12 @@ from __future__ import print_function
from network.STGAN_network import STGAN_model
from util import utility
import paddle.fluid as fluid
+from paddle.fluid import profiler
import sys
import time
import copy
import numpy as np
+import ast
class GTrainer():
@@ -162,8 +164,13 @@ class DTrainer():
def gradient_penalty(self, f, real, fake=None, cfg=None, name=None):
def _interpolate(a, b=None):
if b is None:
- beta = fluid.layers.uniform_random_batch_size_like(
- input=a, shape=a.shape, min=0.0, max=1.0)
+ if cfg.enable_ce:
+ beta = fluid.layers.uniform_random_batch_size_like(
+ input=a, shape=a.shape, min=0.0, max=1.0, seed=1)
+ else:
+ beta = fluid.layers.uniform_random_batch_size_like(
+ input=a, shape=a.shape, min=0.0, max=1.0)
+
mean = fluid.layers.reduce_mean(
a, dim=list(range(len(a.shape))), keep_dim=True)
input_sub_mean = fluid.layers.elementwise_sub(a, mean, axis=0)
@@ -173,9 +180,14 @@ class DTrainer():
keep_dim=True)
b = beta * fluid.layers.sqrt(var) * 0.5 + a
shape = [a.shape[0]]
- alpha = fluid.layers.uniform_random_batch_size_like(
- input=a, shape=shape, min=0.0, max=1.0)
- inner = (b - a) * alpha + a
+ if cfg.enable_ce:
+ alpha = fluid.layers.uniform_random_batch_size_like(
+ input=a, shape=shape, min=0.0, max=1.0, seed=1)
+ else:
+ alpha = fluid.layers.uniform_random_batch_size_like(
+ input=a, shape=shape, min=0.0, max=1.0)
+
+ inner = fluid.layers.elementwise_mul((b-a), alpha, axis=0) + a
return inner
x = _interpolate(real, fake)
@@ -223,7 +235,7 @@ class STGAN(object):
default=1024,
help="the base fc dim in discriminator")
parser.add_argument(
- '--use_gru', type=bool, default=True, help="whether to use GRU")
+ '--use_gru', type=ast.literal_eval, default=True, help="whether to use GRU")
parser.add_argument(
'--lambda_cls',
type=float,
@@ -267,7 +279,10 @@ class STGAN(object):
default=None,
help="the normalization in discriminator, choose in [None, instance_norm]"
)
-
+ parser.add_argument(
+ '--enable_ce',
+ action='store_true',
+ help="if set, run the tasks with continuous evaluation logs")
return parser
def __init__(self,
@@ -294,6 +309,9 @@ class STGAN(object):
name='label_org_', shape=[None, self.cfg.c_dim], dtype='float32')
label_trg_ = fluid.data(
name='label_trg_', shape=[None, self.cfg.c_dim], dtype='float32')
+ # used for continuous evaluation
+ if self.cfg.enable_ce:
+ fluid.default_startup_program().random_seed = 90
test_gen_trainer = GTrainer(image_real, label_org, label_org_,
label_trg, label_trg_, self.cfg,
@@ -337,12 +355,20 @@ class STGAN(object):
dis_trainer.program).with_data_parallel(
loss_name=dis_trainer.d_loss.name,
build_strategy=build_strategy)
-
+ # used for continuous evaluation
+ if self.cfg.enable_ce:
+ gen_trainer_program.random_seed = 90
+ dis_trainer_program.random_seed = 90
+
t_time = 0
+ total_train_batch = 0 # used for benchmark
+
for epoch_id in range(self.cfg.epoch):
batch_id = 0
for data in py_reader():
+ if self.cfg.max_iter and total_train_batch == self.cfg.max_iter: # used for benchmark
+ return
s_time = time.time()
# optimize the discriminator network
fetches = [
@@ -376,6 +402,15 @@ class STGAN(object):
d_loss_gp[0], batch_time))
sys.stdout.flush()
batch_id += 1
+ if self.cfg.enable_ce and batch_id == 100:
+ break
+
+ total_train_batch += 1 # used for benchmark
+ # profiler tools
+ if self.cfg.profile and epoch_id == 0 and batch_id == self.cfg.print_freq:
+ profiler.reset_profiler()
+ elif self.cfg.profile and epoch_id == 0 and batch_id == self.cfg.print_freq + 5:
+ return
if self.cfg.run_test:
image_name = fluid.data(
@@ -401,3 +436,15 @@ class STGAN(object):
"net_G")
utility.checkpoints(epoch_id, self.cfg, exe, dis_trainer,
"net_D")
+ # used for continuous evaluation
+ if self.cfg.enable_ce:
+ device_num = fluid.core.get_cuda_device_count() if self.cfg.use_gpu else 1
+ print("kpis\tstgan_g_loss_fake_card{}\t{}".format(device_num, g_loss_fake[0]))
+ print("kpis\tstgan_g_loss_rec_card{}\t{}".format(device_num, g_loss_rec[0]))
+ print("kpis\tstgan_g_loss_cls_card{}\t{}".format(device_num, g_loss_cls[0]))
+ print("kpis\tstgan_d_loss_card{}\t{}".format(device_num, d_loss[0]))
+ print("kpis\tstgan_d_loss_real_card{}\t{}".format(device_num, d_loss_real[0]))
+ print("kpis\tstgan_d_loss_fake_card{}\t{}".format(device_num,d_loss_fake[0]))
+ print("kpis\tstgan_d_loss_cls_card{}\t{}".format(device_num, d_loss_cls[0]))
+ print("kpis\tstgan_d_loss_gp_card{}\t{}".format(device_num,d_loss_gp[0]))
+ print("kpis\tstgan_Batch_time_cost_card{}\t{}".format(device_num,batch_time))
diff --git a/PaddleCV/PaddleGAN/trainer/StarGAN.py b/PaddleCV/PaddleGAN/trainer/StarGAN.py
index b4fce5952fa0d41300e474e0dd9919b097ba5fd2..6fa72be7578b082b84fb2f7486ae7991981e9545 100644
--- a/PaddleCV/PaddleGAN/trainer/StarGAN.py
+++ b/PaddleCV/PaddleGAN/trainer/StarGAN.py
@@ -17,6 +17,7 @@ from __future__ import print_function
from network.StarGAN_network import StarGAN_model
from util import utility
import paddle.fluid as fluid
+from paddle.fluid import profiler
import sys
import time
import copy
@@ -158,10 +159,14 @@ class DTrainer():
def gradient_penalty(self, f, real, fake, cfg=None, name=None):
def _interpolate(a, b):
shape = [a.shape[0]]
- alpha = fluid.layers.uniform_random_batch_size_like(
- input=a, shape=shape, min=0.0, max=1.0)
+ if cfg.enable_ce:
+ alpha = fluid.layers.uniform_random_batch_size_like(
+ input=a, shape=shape, min=0.0, max=1.0, seed=1)
+ else:
+ alpha = fluid.layers.uniform_random_batch_size_like(
+ input=a, shape=shape, min=0.0, max=1.0)
- inner = b * (1.0 - alpha) + a * alpha
+ inner = fluid.layers.elementwise_mul(b, (1.0-alpha), axis=0) + fluid.layers.elementwise_mul(a, alpha, axis=0)
return inner
x = _interpolate(real, fake)
@@ -244,6 +249,10 @@ class StarGAN(object):
help="the attributes we selected to change")
parser.add_argument(
'--n_samples', type=int, default=1, help="batch size when testing")
+ parser.add_argument(
+ '--enable_ce',
+ action='store_true',
+ help="if set, run the tasks with continuous evaluation logs")
return parser
@@ -267,6 +276,9 @@ class StarGAN(object):
name='label_org', shape=[None, self.cfg.c_dim], dtype='float32')
label_trg = fluid.data(
name='label_trg', shape=[None, self.cfg.c_dim], dtype='float32')
+ # used for continuous evaluation
+ if self.cfg.enable_ce:
+ fluid.default_startup_program().random_seed = 90
py_reader = fluid.io.PyReader(
feed_list=[image_real, label_org, label_trg],
@@ -303,12 +315,18 @@ class StarGAN(object):
dis_trainer.program).with_data_parallel(
loss_name=dis_trainer.d_loss.name,
build_strategy=build_strategy)
+ # used for continuous evaluation
+ if self.cfg.enable_ce:
+ gen_trainer_program.random_seed = 90
+ dis_trainer_program.random_seed = 90
t_time = 0
-
+ total_train_batch = 0 # used for benchmark
for epoch_id in range(self.cfg.epoch):
batch_id = 0
for data in py_reader():
+ if self.cfg.max_iter and total_train_batch == self.cfg.max_iter: # used for benchmark
+ return
s_time = time.time()
d_loss_real, d_loss_fake, d_loss, d_loss_cls, d_loss_gp = exe.run(
dis_trainer_program,
@@ -344,6 +362,16 @@ class StarGAN(object):
sys.stdout.flush()
batch_id += 1
+ # used for ce
+ if self.cfg.enable_ce and batch_id == 100:
+ break
+
+ total_train_batch += 1 # used for benchmark
+ # profiler tools
+ if self.cfg.profile and epoch_id == 0 and batch_id == self.cfg.print_freq:
+ profiler.reset_profiler()
+ elif self.cfg.profile and epoch_id == 0 and batch_id == self.cfg.print_freq + 5:
+ return
if self.cfg.run_test:
image_name = fluid.data(
@@ -369,3 +397,14 @@ class StarGAN(object):
"net_G")
utility.checkpoints(epoch_id, self.cfg, exe, dis_trainer,
"net_D")
+ # used for continuous evaluation
+ if self.cfg.enable_ce:
+ device_num = fluid.core.get_cuda_device_count() if self.cfg.use_gpu else 1
+ print("kpis\tstargan_g_loss_fake_card{}\t{}".format(device_num, g_loss_fake[0]))
+ print("kpis\tstargan_g_loss_rec_card{}\t{}".format(device_num, g_loss_rec[0]))
+ print("kpis\tstargan_g_loss_cls_card{}\t{}".format(device_num, g_loss_cls[0]))
+ print("kpis\tstargan_d_loss_real_card{}\t{}".format(device_num, d_loss_real[0]))
+ print("kpis\tstargan_d_loss_fake_card{}\t{}".format(device_num,d_loss_fake[0]))
+ print("kpis\tstargan_d_loss_cls_card{}\t{}".format(device_num, d_loss_cls[0]))
+ print("kpis\tstargan_d_loss_gp_card{}\t{}".format(device_num,d_loss_gp[0]))
+ print("kpis\tstargan_Batch_time_cost_card{}\t{}".format(device_num,batch_time))
diff --git a/PaddleCV/PaddleGAN/util/config.py b/PaddleCV/PaddleGAN/util/config.py
index 55666012515dfe2c631b1ac6ec4ed0909e12cc1d..3708cda9b8f7e64d26dcfb8d19abbbe9bbe0d701 100644
--- a/PaddleCV/PaddleGAN/util/config.py
+++ b/PaddleCV/PaddleGAN/util/config.py
@@ -85,6 +85,11 @@ def base_parse_args(parser):
add_arg('run_test', bool, True, "Whether to run test.")
add_arg('use_gpu', bool, True, "Whether to use GPU to train.")
add_arg('profile', bool, False, "Whether to profile.")
+
+ # NOTE: add args for profiler, used for benchmark
+ add_arg('profiler_path', str, '/tmp/profile', "the path for profiler output files. (used for benchmark)")
+ add_arg('max_iter', int, 0, "the max number of iterations to train. (used for benchmark)")
+
add_arg('dropout', bool, False, "Whether to use dropout.")
add_arg('drop_last', bool, False,
"Whether to drop the last images that cannot form a batch")
diff --git a/PaddleCV/PaddleGAN/util/utility.py b/PaddleCV/PaddleGAN/util/utility.py
index d9465107f4451e1a2687910a8e646b5df32cf8d4..d28961a7c6c3026604fc893f3532be0a87a3308f 100644
--- a/PaddleCV/PaddleGAN/util/utility.py
+++ b/PaddleCV/PaddleGAN/util/utility.py
@@ -425,3 +425,12 @@ def check_version():
except Exception as e:
print(err)
sys.exit(1)
+
+def get_device_num(args):
+ if args.use_gpu:
+ gpus = os.environ.get("CUDA_VISIBLE_DEVICES", "1")
+ gpu_num = len(gpus.split(','))
+ return gpu_num
+ else:
+ cpu_num = os.environ.get("CPU_NUM", 1)
+ return int(cpu_num)
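`get_device_num` reads the device count straight from the environment, so the GPU branch depends on `CUDA_VISIBLE_DEVICES` being a comma-separated string (hence the string default above). A quick illustrative check:

```
import os

os.environ["CUDA_VISIBLE_DEVICES"] = "0,1,2"
gpus = os.environ.get("CUDA_VISIBLE_DEVICES", "1")
print(len(gpus.split(",")))  # 3, which is what get_device_num returns when use_gpu is set
```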
diff --git a/PaddleCV/PaddleVideo/models/bmn/bmn_utils.py b/PaddleCV/PaddleVideo/models/bmn/bmn_utils.py
index da2ceb20e428c940a83270521852cd846eadf07f..d35dd43ba6b46ad07bcc58e4169244ddfd9d488c 100644
--- a/PaddleCV/PaddleVideo/models/bmn/bmn_utils.py
+++ b/PaddleCV/PaddleVideo/models/bmn/bmn_utils.py
@@ -100,6 +100,7 @@ def soft_nms(df, alpha, t1, t2):
def video_process(video_list,
video_dict,
output_path,
+ result_dict,
snms_alpha=0.4,
snms_t1=0.55,
snms_t2=0.9):
@@ -134,15 +135,13 @@ def bmn_post_processing(video_dict, subset, output_path, result_path):
num_videos_per_thread]
p = mp.Process(
target=video_process,
- args=(
- tmp_video_list,
- video_dict,
- output_path, ))
+ args=(tmp_video_list, video_dict, output_path, result_dict))
p.start()
processes.append(p)
tmp_video_list = video_list[(pp_num - 1) * num_videos_per_thread:]
p = mp.Process(
- target=video_process, args=(tmp_video_list, video_dict, output_path))
+ target=video_process,
+ args=(tmp_video_list, video_dict, output_path, result_dict))
p.start()
processes.append(p)
for p in processes:
diff --git a/PaddleCV/PaddleVideo/models/bsn/bsn_utils.py b/PaddleCV/PaddleVideo/models/bsn/bsn_utils.py
index cee44ebfa20c3290921dc615ebef36c0ab353d0f..d8dc46af2ddea9d3305e04204bf27731452e34d2 100644
--- a/PaddleCV/PaddleVideo/models/bsn/bsn_utils.py
+++ b/PaddleCV/PaddleVideo/models/bsn/bsn_utils.py
@@ -104,6 +104,7 @@ def soft_nms(df, alpha, t1, t2):
def video_process(video_list,
video_dict,
output_path_pem,
+ result_dict,
snms_alpha=0.75,
snms_t1=0.65,
snms_t2=0.9):
@@ -139,19 +140,13 @@ def bsn_post_processing(video_dict, subset, output_path_pem, result_path_pem):
num_videos_per_thread]
p = mp.Process(
target=video_process,
- args=(
- tmp_video_list,
- video_dict,
- output_path_pem, ))
+ args=(tmp_video_list, video_dict, output_path_pem, result_dict))
p.start()
processes.append(p)
tmp_video_list = video_list[(pp_num - 1) * num_videos_per_thread:]
p = mp.Process(
target=video_process,
- args=(
- tmp_video_list,
- video_dict,
- output_path_pem, ))
+ args=(tmp_video_list, video_dict, output_path_pem, result_dict))
p.start()
processes.append(p)
for p in processes:
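Both post-processing rewrites now hand a `result_dict` to every worker, which only aggregates correctly if the dict is shareable across processes. A minimal sketch of that pattern, assuming the callers build it with `multiprocessing.Manager()`:

```
import multiprocessing as mp

def worker(video_list, result_dict):
    for name in video_list:
        result_dict[name] = len(name)  # stand-in for per-video proposal results

if __name__ == "__main__":
    result_dict = mp.Manager().dict()
    p = mp.Process(target=worker, args=(["v1", "v2"], result_dict))
    p.start()
    p.join()
    print(dict(result_dict))  # {'v1': 2, 'v2': 2}
```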
diff --git a/PaddleCV/PaddleVideo/train.py b/PaddleCV/PaddleVideo/train.py
index 467523d88d8878684ff217f741a0a85778d1327d..4adac34374bd8025f988569899e4cb45ed769ed1 100644
--- a/PaddleCV/PaddleVideo/train.py
+++ b/PaddleCV/PaddleVideo/train.py
@@ -104,6 +104,17 @@ def parse_args():
type=ast.literal_eval,
default=False,
help='If set True, enable continuous evaluation job.')
+ # NOTE: args for profiler, used for benchmark
+ parser.add_argument(
+ '--profiler_path',
+ type=str,
+ default='./',
+ help='the path to store profiler output file. used for benchmark.')
+ parser.add_argument(
+ '--is_profiler',
+ type=int,
+ default=0,
+ help='the switch to enable the profiler. used for benchmark.')
args = parser.parse_args()
return args
@@ -236,7 +247,9 @@ def train(args):
compiled_test_prog=compiled_valid_prog, #test_exe=valid_exe,
test_dataloader=valid_dataloader,
test_fetch_list=valid_fetch_list,
- test_metrics=valid_metrics)
+ test_metrics=valid_metrics,
+ is_profiler=args.is_profiler,
+ profiler_path=args.profiler_path)
if __name__ == "__main__":
diff --git a/PaddleCV/PaddleVideo/utils/train_utils.py b/PaddleCV/PaddleVideo/utils/train_utils.py
index 4168abbb86eb0675779d570457708b72b41089a7..f7e489183ef91fed9898fdd456932f8cc8264967 100644
--- a/PaddleCV/PaddleVideo/utils/train_utils.py
+++ b/PaddleCV/PaddleVideo/utils/train_utils.py
@@ -18,6 +18,7 @@ import time
import numpy as np
import paddle
import paddle.fluid as fluid
+from paddle.fluid import profiler
import logging
import shutil
@@ -76,7 +77,8 @@ def train_with_dataloader(exe, train_prog, compiled_train_prog, train_dataloader
log_interval = 0, valid_interval = 0, save_dir = './', \
save_model_name = 'model', fix_random_seed = False, \
compiled_test_prog = None, test_dataloader = None, \
- test_fetch_list = None, test_metrics = None):
+ test_fetch_list = None, test_metrics = None, \
+ is_profiler = None, profiler_path = None):
if not train_dataloader:
logger.error("[TRAIN] get dataloader failed.")
epoch_periods = []
@@ -98,6 +100,13 @@ def train_with_dataloader(exe, train_prog, compiled_train_prog, train_dataloader
train_metrics.calculate_and_log_out(train_outs, \
info = '[TRAIN] Epoch {}, iter {} '.format(epoch, train_iter))
train_iter += 1
+
+ # NOTE: profiler tools, used for benchmark
+ if is_profiler and epoch == 0 and train_iter == log_interval:
+ profiler.start_profiler("All")
+ elif is_profiler and epoch == 0 and train_iter == log_interval + 5:
+ profiler.stop_profiler("total", profiler_path)
+ return
if len(epoch_periods) < 1:
logger.info(
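The inserted profiler block samples a fixed window: profiling starts once `train_iter` reaches `log_interval`, runs five more iterations, then dumps the report and returns. A standalone sketch of the same pattern (the loop body and output path are placeholders):

```
from paddle.fluid import profiler

log_interval = 10
for it in range(100):
    # ... one training step ...
    if it == log_interval:
        profiler.start_profiler("All")
    elif it == log_interval + 5:
        profiler.stop_profiler("total", "/tmp/video_profile")
        break
```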
diff --git a/PaddleCV/Research/PWCNet/AverageMeter.py b/PaddleCV/Research/PWCNet/AverageMeter.py
new file mode 100644
index 0000000000000000000000000000000000000000..633e6c067d465559d2da61913342da2e521ac731
--- /dev/null
+++ b/PaddleCV/Research/PWCNet/AverageMeter.py
@@ -0,0 +1,18 @@
+
+
+class AverageMeter(object):
+ """Computes and stores the average and current value"""
+ def __init__(self):
+ self.reset()
+
+ def reset(self):
+ self.val = 0
+ self.avg = 0
+ self.sum = 0
+ self.count = 0
+
+ def update(self, val, n=1):
+ self.val = val
+ self.sum += val * n
+ self.count += n
+ self.avg = self.sum / self.count
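Usage sketch for `AverageMeter`: `val` keeps the latest reading while `avg` is the count-weighted running mean, so passing the batch size as `n` weights each update correctly.

```
meter = AverageMeter()
for loss, batch_size in [(0.9, 8), (0.7, 8), (0.5, 4)]:
    meter.update(loss, n=batch_size)
print(meter.val, meter.avg)  # 0.5 and (0.9*8 + 0.7*8 + 0.5*4) / 20 = 0.74
```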
diff --git a/PaddleCV/Research/PWCNet/README.md b/PaddleCV/Research/PWCNet/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..b3335013b641836c47b61dd31f8a6f5459188254
--- /dev/null
+++ b/PaddleCV/Research/PWCNet/README.md
@@ -0,0 +1,86 @@
+# PWCNet reimplemented using PaddlePaddle DyGraph
+PWC-Net: CNNs for Optical Flow Using Pyramid, Warping, and Cost Volume.
+# Environment
+```
+centos7
+paddle develop version (after 20191201), installed from source
+python3.7
+SciPy 1.1.0
+```
+The code will be updated for paddle v1.7 later.
+# Compile correlation op
+```
+cd correlation_op
+sh make.sh
+```
+# Datasets
+1. Please download the `FlyingChairs dataset` and `FlyingChairs_train_val.txt` from https://lmb.informatik.uni-freiburg.de/resources/datasets
+
+Or you can use `./data/download.sh` to download the datasets.
+
+We split the data into train and val sets using `FlyingChairs_train_val.txt`, where `1` marks a train sample and `2` marks a val sample, as in the sketch below.
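+
+A minimal sketch of reading that split file (the helper name is ours, purely illustrative):
+
+```
+def read_split(txt_path):
+    """Return (train_ids, val_ids) from FlyingChairs_train_val.txt."""
+    train_ids, val_ids = [], []
+    with open(txt_path) as f:
+        for idx, line in enumerate(f):
+            (train_ids if line.strip() == "1" else val_ids).append(idx)
+    return train_ids, val_ids
+```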
+# Inference
+Note that the paddle models `pwc_net_paddle.pdparams` and `pwc_net_chairs_paddle.pdparams` are converted from the PyTorch pth files `pwc_net.pth.tar` and `pwc_net_chairs.pth.tar`.
+
+Run
+```
+python infer.py
+```
+
+| Input img1 | Input img2 |
+|-------|------------|
+| | |
+
+|prediction with pwc_net_paddle.pdparams| prediction with pwc_net_chairs_paddle.pdparams|
+|-------------|-------------|
+| | |
+
+# First Train with L2 loss
+Only a single GPU is supported for now; multi-GPU training will be added later.
+
+Check the parameters in `my_args.py` and adjust them as needed.
+
+Then change the following in `train.sh`:
+```
+--data_root
+--train_val_txt
+--batch_size
+```
+Then run
+```
+./train.sh
+```
+Some intermediate results during training can be seen at:
+```
+./img1.png
+./img2.png
+./hsv_pd.png # ground truth
+./hsv_predict.png # output of model
+```
+
+# Finetune with L1 loss
+Finetune from your best pretrained model by adding `--pretrained your_best_model_name`, e.g. `--pretrained epoch_7_pwc_net_paddle`.
+
+Run
+```
+./finetune.sh
+```
+# Note
+This code reimplements PWCNet following the code of `https://github.com/NVlabs/PWC-Net`.
+If you want to train exactly as in the paper
+```
+@InProceedings{Sun2018PWC-Net,
+ author = {Deqing Sun and Xiaodong Yang and Ming-Yu Liu and Jan Kautz},
+ title = {{PWC-Net}: CNNs for Optical Flow Using Pyramid, Warping, and Cost Volume},
+ booktitle = CVPR,
+ year = {2018},
+}
+```
+please use all the datasets in `./data/download.sh` and the dataset code in `./data/datasets.py`.
+
+Reference works:
+```
+https://github.com/NVlabs/PWC-Net
+https://github.com/ClementPinard/FlowNetPytorch
+https://github.com/NVIDIA/flownet2-pytorch/blob/master/datasets.py
+```
\ No newline at end of file
diff --git a/PaddleCV/Research/PWCNet/__init__.py b/PaddleCV/Research/PWCNet/__init__.py
new file mode 100644
index 0000000000000000000000000000000000000000..e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
diff --git a/PaddleCV/Research/PWCNet/correlation_op/README.md b/PaddleCV/Research/PWCNet/correlation_op/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..d83c6fe61d6fef1d01139289b69605628e689d72
--- /dev/null
+++ b/PaddleCV/Research/PWCNet/correlation_op/README.md
@@ -0,0 +1,14 @@
+Building the custom OP:
+1. Use a paddle develop version later than Dec 1, 2019.
+2. Run `sh make.sh` to compile the op into the dynamic library `correlation_lib.so`.
+3. Add the library path to LD_LIBRARY_PATH:
+```
+export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:`python3.7 -c 'import paddle; print(paddle.sysconfig.get_lib())'`
+```
+4. Add the python path of the correlation op:
+```
+export PYTHONPATH=$PYTHONPATH:`pwd`
+```
+5. Run `python test_correlation.py` to execute the unit test and verify that the op loads correctly.
+
+PS: If the paddle whl package was downloaded from the official website, gcc 4.8 is required, i.e. change `g++` to `g++-4.8` in make.sh.
diff --git a/PaddleCV/Research/PWCNet/correlation_op/correlation.py b/PaddleCV/Research/PWCNet/correlation_op/correlation.py
new file mode 100644
index 0000000000000000000000000000000000000000..05e9267d1fcb51344e096592ad86d22223b99f75
--- /dev/null
+++ b/PaddleCV/Research/PWCNet/correlation_op/correlation.py
@@ -0,0 +1,25 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+import paddle.fluid as fluid
+import os
+file_dir = os.path.dirname(os.path.abspath(__file__))
+fluid.load_op_library(os.path.join(file_dir, 'correlation_lib.so'))
+
+from paddle.fluid.layer_helper import LayerHelper
+
+def correlation(input1, input2, pad_size, kernel_size, max_displacement, stride1, stride2, corr_type_multiply=1):
+ helper = LayerHelper("correlation", **locals())
+ output = helper.create_variable_for_type_inference(dtype=input1.dtype)
+ helper.append_op(type="correlation", inputs={"Input1": input1, "Input2": input2}, attrs={"pad_size": pad_size, "kernel_size": kernel_size, "max_displacement": max_displacement, "stride1": stride1, "stride2": stride2, "corr_type_multiply": corr_type_multiply}, outputs = {"Output": output})
+ return output
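A usage sketch for the wrapper above (static graph; requires the compiled `correlation_lib.so` and a GPU). The hyper-parameters mirror the usual PWC-Net setting but are illustrative only:

```
import numpy as np
import paddle.fluid as fluid
from correlation import correlation

x1 = fluid.data(name="x1", shape=[None, 3, 32, 32], dtype="float32")
x2 = fluid.data(name="x2", shape=[None, 3, 32, 32], dtype="float32")
out = correlation(x1, x2, pad_size=4, kernel_size=1,
                  max_displacement=4, stride1=1, stride2=1)

exe = fluid.Executor(fluid.CUDAPlace(0))
exe.run(fluid.default_startup_program())
feed = {"x1": np.random.rand(2, 3, 32, 32).astype("float32"),
        "x2": np.random.rand(2, 3, 32, 32).astype("float32")}
res, = exe.run(feed=feed, fetch_list=[out])
print(res.shape)  # (2, 81, 32, 32): (2*4/1 + 1)**2 = 81 displacement channels
```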
diff --git a/PaddleCV/Research/PWCNet/correlation_op/correlation_op.cc b/PaddleCV/Research/PWCNet/correlation_op/correlation_op.cc
new file mode 100644
index 0000000000000000000000000000000000000000..4902db3ed7115d0d315ae2f2cbab5ea1a5ee6528
--- /dev/null
+++ b/PaddleCV/Research/PWCNet/correlation_op/correlation_op.cc
@@ -0,0 +1,140 @@
+/* Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
+
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License. */
+
+#include <cmath>
+#include <memory>
+#include <vector>
+#include "paddle/fluid/framework/op_registry.h"
+
+namespace paddle {
+namespace operators {
+
+using Tensor = framework::Tensor;
+
+inline std::vector<int> CorrelationOutputSize(int batch, int input_height, int input_width, int stride1, int stride2, int kernel_size, int pad_size, int max_displacement) {
+
+ std::vector<int> output_shape({batch});
+ int kernel_radius = (kernel_size - 1) / 2;
+ int border_radius = kernel_radius + max_displacement;
+ int padded_input_height = input_height + 2 * pad_size;
+ int padded_input_width = input_width + 2 * pad_size;
+ int output_channel = ((max_displacement/stride2) * 2 + 1) * ((max_displacement/stride2) * 2 + 1);
+ output_shape.push_back(output_channel);
+ int output_height = std::ceil(static_cast<float>(padded_input_height - 2 * border_radius) / static_cast<float>(stride1));
+ int output_width = std::ceil(static_cast<float>(padded_input_width - 2 * border_radius) / static_cast<float>(stride1));
+ output_shape.push_back(output_height);
+ output_shape.push_back(output_width);
+ return output_shape;
+}
+
+class CorrelationOpMaker : public framework::OpProtoAndCheckerMaker {
+ public:
+ void Make() override{
+ AddInput("Input1", "input1");
+ AddInput("Input2", "input2");
+ AddOutput("Output", "output");
+ AddAttr("pad_size", "pad size for input1 and input2");
+ AddAttr("kernel_size", "kernel size of input1 and input2");
+ AddAttr("max_displacement", "max displacement of input1 and input2");
+ AddAttr("stride1", "Input1 stride");
+ AddAttr("stride2", "Input2 stride");
+ AddAttr("corr_type_multiply", "correlation coefficient").SetDefault(1);
+ AddComment(R"DOC(Correlation of two feature map. Only support NCHW data format.)DOC");
+ }
+};
+
+class CorrelationOp : public framework::OperatorWithKernel {
+ public:
+ using framework::OperatorWithKernel::OperatorWithKernel;
+
+ void InferShape(framework::InferShapeContext* ctx) const override{
+ PADDLE_ENFORCE_EQ(ctx->HasInput("Input1"), true, "Input(input1) cannot be null");
+ PADDLE_ENFORCE_EQ(ctx->HasInput("Input2"), true, "Input(input2) cannot be null");
+ int stride1 = ctx->Attrs().Get("stride1");
+ int stride2 = ctx->Attrs().Get("stride2");
+ int max_displacement = ctx->Attrs().Get("max_displacement");
+ int pad_size = ctx->Attrs().Get("pad_size");
+ int kernel_size = ctx->Attrs().Get("kernel_size");
+
+ auto in_dims = ctx->GetInputDim("Input1");
+ auto in2_dims = ctx->GetInputDim("Input2");
+ PADDLE_ENFORCE_EQ(in_dims.size() == 4, true, "input1 must be 4-dims");
+ PADDLE_ENFORCE_EQ(in2_dims.size() == 4, true, "input2 must be 4-dims");
+ std::vector<int> output_shape = CorrelationOutputSize(in_dims[0], in_dims[2], in_dims[3], stride1, stride2, kernel_size, pad_size, max_displacement);
+ ctx->SetOutputDim("Output", framework::make_ddim(output_shape));
+ }
+
+ protected:
+ framework::OpKernelType GetExpectedKernelType(
+ const framework::ExecutionContext& ctx) const override{
+ auto input_data_type = OperatorWithKernel::IndicateVarDataType(ctx, "Input1");
+ PADDLE_ENFORCE_EQ(input_data_type, ctx.Input<Tensor>("Input2")->type(), "Input1 and Input2 should have the same type");
+ return framework::OpKernelType(input_data_type, ctx.GetPlace());
+ }
+};
+
+template <typename T>
+class CorrelationOpGradMaker : public framework::SingleGradOpMaker<T> {
+ public:
+ using framework::SingleGradOpMaker<T>::SingleGradOpMaker;
+
+ protected:
+ std::unique_ptr<T> Apply() const override {
+ auto* op = new T();
+ op->SetType("correlation_grad");
+ op->SetInput("Input1", this->Input("Input1"));
+ op->SetInput("Input2", this->Input("Input2"));
+ op->SetInput(framework::GradVarName("Output"), this->OutputGrad("Output"));
+ op->SetOutput(framework::GradVarName("Input1"), this->InputGrad("Input1"));
+ op->SetOutput(framework::GradVarName("Input2"), this->InputGrad("Input2"));
+ op->SetAttrMap(this->Attrs());
+
+ return std::unique_ptr<T>(op);
+ }
+};
+
+class CorrelationOpGrad : public framework::OperatorWithKernel {
+ public:
+ using framework::OperatorWithKernel::OperatorWithKernel;
+
+ void InferShape(framework::InferShapeContext* ctx) const override{
+ PADDLE_ENFORCE_EQ(ctx->HasInput("Input1"), true, "Input(Input1) should not be null");
+ PADDLE_ENFORCE_EQ(ctx->HasInput("Input2"), true, "Input(Input2) should not be null");
+ PADDLE_ENFORCE_EQ(ctx->HasInput(framework::GradVarName("Output")), true, "Input(Output@GRAD) should not be null");
+
+ auto in1_dims = ctx->GetInputDim("Input1");
+ auto in2_dims = ctx->GetInputDim("Input2");
+ ctx->SetOutputDim(framework::GradVarName("Input1"), in1_dims);
+ ctx->SetOutputDim(framework::GradVarName("Input2"), in1_dims);
+ }
+
+ protected:
+ framework::OpKernelType GetExpectedKernelType(
+ const framework::ExecutionContext& ctx) const override{
+ const auto* var = ctx.InputVar(framework::GradVarName("Output"));
+ if (var == nullptr) {
+ PADDLE_THROW("cannot find Output@GRAD");
+ }
+ return framework::OpKernelType(OperatorWithKernel::IndicateVarDataType(ctx, "Input1"), ctx.GetPlace());
+ }
+};
+
+} // namespace operators
+} // namespace paddle
+
+namespace ops = paddle::operators;
+REGISTER_OPERATOR(correlation, ops::CorrelationOp, ops::CorrelationOpMaker,
+ ops::CorrelationOpGradMaker<paddle::framework::OpDesc>,
+ ops::CorrelationOpGradMaker<paddle::imperative::OpBase>);
+REGISTER_OPERATOR(correlation_grad, ops::CorrelationOpGrad);
diff --git a/PaddleCV/Research/PWCNet/correlation_op/correlation_op.cu b/PaddleCV/Research/PWCNet/correlation_op/correlation_op.cu
new file mode 100644
index 0000000000000000000000000000000000000000..161844430fe4b9dfeaf80dbe127d802d67a6de76
--- /dev/null
+++ b/PaddleCV/Research/PWCNet/correlation_op/correlation_op.cu
@@ -0,0 +1,434 @@
+/* Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
+
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License. */
+
+#pragma once
+#include
+#include "paddle/fluid/framework/op_registry.h"
+
+#define THREADS_PER_BLOCK 32
+#define FULL_MASK 0xffffffff
+
+namespace paddle {
+namespace operators {
+
+using Tensor = framework::Tensor;
+
+template <typename T>
+__forceinline__ __device__ T warpReduceSum(T val) {
+ for (int offset = 16; offset > 0; offset /= 2) {
+ val += __shfl_down_sync(FULL_MASK, val, offset);
+ }
+ return val;
+}
+
+template <typename T>
+__forceinline__ __device__ T blockReduceSum(T val) {
+ static __shared__ T shared[32];
+ int lane = threadIdx.x % warpSize;
+ int wid = threadIdx.x / warpSize;
+
+ val = warpReduceSum(val);
+ if (lane == 0)
+ shared[wid] = val;
+
+ __syncthreads();
+ val = (threadIdx.x < blockDim.x / warpSize) ? shared[lane] : 0;
+
+ if (wid == 0)
+ val = warpReduceSum(val);
+
+ return val;
+}
+
+template <typename T>
+__global__ void set_zero(T *x, int num) {
+ for(int i = blockIdx.x * blockDim.x + threadIdx.x; i < num; i += blockDim.x * gridDim.x)
+ x[i] = static_cast<T>(0);
+}
+
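+// channel_first repacks an NCHW input into a zero-padded NHWC buffer
+// (one block per (n, h, w) position, threads striding over channels) so
+// that the correlation kernel below can read channels contiguously.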
+template <typename T>
+__global__ void channel_first(const T *input, T *rinput, const int channel, const int height, const int width, const int pad_size) {
+ int n = blockIdx.x;
+ int h = blockIdx.y;
+ int w = blockIdx.z;
+
+ int ch_off = threadIdx.x;
+ T value;
+ int dimchw = channel * height * width;
+ int dimhw = height * width;
+
+ int p_dimw = (width + 2 * pad_size);
+ int p_dimh = (height + 2 * pad_size);
+ int p_dimchw = channel * p_dimw * p_dimh;
+ int p_dimcw = channel * p_dimw;
+
+ for (int c = ch_off; c < channel; c += THREADS_PER_BLOCK) {
+ value = input[n * dimchw + c * dimhw + h * width + w];
+ rinput[n * p_dimchw + (h + pad_size) * p_dimcw + (w + pad_size) * channel + c] = value;
+ }
+}
+
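+// correlation_forward: one block per (batch, out_y, out_x) position. For
+// each displacement (tj, ti), threads accumulate the patch dot product
+// over channels, reduce across the warp/block, and thread 0 writes the
+// mean (acc0 / nelems) to that displacement's output channel.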
+template <typename T>
+__global__ void correlation_forward(T *output, const int output_channel, const int output_height, const int output_width, const T *rinput1, const int input_channel, const int input_height, const int input_width, const T *rinput2, const int pad_size, const int kernel_size, const int max_displacement, const int stride1, const int stride2) {
+
+ int p_input_width = input_width + 2 * pad_size;
+ int p_input_height = input_height + 2 * pad_size;
+
+ int kernel_rad = (kernel_size - 1) / 2;
+ int displacement_rad = max_displacement / stride2;
+
+ int displacement_size = 2 * displacement_rad + 1;
+
+ int n = blockIdx.x;
+ int h1 = blockIdx.y * stride1 + max_displacement;
+ int w1 = blockIdx.z * stride1 + max_displacement;
+ int c = threadIdx.x;
+
+ int p_dimchw = p_input_height * p_input_width * input_channel;
+ int p_dimcw = p_input_width * input_channel;
+ int p_dimc = input_channel;
+
+ int t_dimchw = output_channel * output_height * output_width;
+ int t_dimhw = output_height * output_width;
+ int t_dimw = output_width;
+
+ int nelems = kernel_size * kernel_size * p_dimc;
+
+ for (int tj = -displacement_rad; tj <= displacement_rad; ++tj) {
+ for(int ti = -displacement_rad; ti <= displacement_rad; ++ti) {
+ int w2 = w1 + ti * stride2;
+ int h2 = h1 + tj * stride2;
+
+ T acc0 = 0;
+ for(int j = -kernel_rad; j <= kernel_rad; ++j) {
+ for(int i = -kernel_rad; i <= kernel_rad; ++i) {
+ for(int ch = c; ch < p_dimc; ch += blockDim.x) {
+ int index1 = n * p_dimchw + (h1 + j) * p_dimcw + (w1 + i) * p_dimc + ch;
+ int index2 = n * p_dimchw + (h2 + j) * p_dimcw + (w2 + i) * p_dimc + ch;
+ acc0 += static_cast<T>(rinput1[index1] * rinput2[index2]);
+ }
+ }
+ }
+ if (blockDim.x == warpSize) {
+ __syncwarp();
+ acc0 = warpReduceSum(acc0);
+ } else {
+ __syncthreads();
+ acc0 = blockReduceSum(acc0);
+ }
+
+ if (threadIdx.x == 0) {
+ int tc = (tj + displacement_rad) * displacement_size + (ti + displacement_rad);
+ const int t_index = n * t_dimchw + tc * t_dimhw + blockIdx.y * t_dimw + blockIdx.z;
+ output[t_index] = static_cast<T>(acc0 / nelems);
+ }
+ }
+ }
+
+}
+
+//class CorrelationKernel
+template <typename T>
+class CorrelationKernel : public framework::OpKernel<T> {
+ public:
+ void Compute(const framework::ExecutionContext &ctx) const override {
+ PADDLE_ENFORCE_EQ(platform::is_gpu_place(ctx.GetPlace()), true, "It must be CUDAPlace");
+
+ auto *input1 = ctx.Input("Input1");
+ auto *input2 = ctx.Input("Input2");
+ int pad_size = ctx.Attr("pad_size");
+ int kernel_size = ctx.Attr("kernel_size");
+ int stride1 = ctx.Attr("stride1");
+ int stride2 = ctx.Attr("stride2");
+ int max_displacement = ctx.Attr("max_displacement");
+ int corr_type_multiply = ctx.Attr("corr_type_multiply");
+
+ auto *output = ctx.Output("Output");
+ output->mutable_data(ctx.GetPlace());
+ auto &dev_ctx = ctx.template device_context();
+
+ // base on input1, NCHW
+ auto in_dims = input1->dims();
+ int N = in_dims[0];
+ int C = in_dims[1];
+ int H = in_dims[2];
+ int W = in_dims[3];
+
+ int padded_input_height = H + 2 * pad_size;
+ int padded_input_width = W + 2 * pad_size;
+
+ Tensor rinput1 = ctx.AllocateTmpTensor<T, platform::CUDADeviceContext>({N, padded_input_height, padded_input_width, C}, dev_ctx);
+ rinput1.mutable_data<T>(ctx.GetPlace());
+
+ Tensor rinput2 = ctx.AllocateTmpTensor<T, platform::CUDADeviceContext>({N, padded_input_height, padded_input_width, C}, dev_ctx);
+ rinput2.mutable_data<T>(ctx.GetPlace());
+
+ set_zero<<<(rinput1.numel() + 512 - 1)/512, 512, 0, dev_ctx.stream()>>>(rinput1.data<T>(), rinput1.numel());
+ set_zero<<<(rinput2.numel() + 512 - 1)/512, 512, 0, dev_ctx.stream()>>>(rinput2.data<T>(), rinput2.numel());
+ set_zero<<<(output->numel() + 512 - 1)/512, 512, 0, dev_ctx.stream()>>>(output->data<T>(), output->numel());
+
+ auto out_dims = output->dims();
+ int OC = out_dims[1];
+ int OH = out_dims[2];
+ int OW = out_dims[3];
+
+ dim3 blocks_grid(N, H, W);
+ dim3 threads_block(THREADS_PER_BLOCK);
+
+ channel_first<T><<<blocks_grid, threads_block, 0, dev_ctx.stream()>>>(input1->data<T>(), rinput1.data<T>(), C, H, W, pad_size);
+ channel_first