Commit 3c9441ae, authored by Gines Hidalgo

Slightly improved mAP and reduced false positives, but reduced mAR

Parent 80fc1144
...@@ -17,6 +17,7 @@ OpenPose - Frequently Asked Question (FAQ) ...@@ -17,6 +17,7 @@ OpenPose - Frequently Asked Question (FAQ)
11. [CUDA_cublas_device_LIBRARY Not Found](#cuda_cublas_device_library-not-found) 11. [CUDA_cublas_device_LIBRARY Not Found](#cuda_cublas_device_library-not-found)
12. [CMake-GUI Error While Getting Default Caffe](#cmake-gui-error-while-getting-default-caffe) 12. [CMake-GUI Error While Getting Default Caffe](#cmake-gui-error-while-getting-default-caffe)
13. [Libgomp Out of Memory Error](#libgomp-out-of-memory-error) 13. [Libgomp Out of Memory Error](#libgomp-out-of-memory-error)
14. [Runtime Error with Turing GPU (Tesla T4) or Volta GPU](#runtime-error-with-turing-gpu-tesla-t4-or-volta-gpu)
2. [Speed Performance Issues](#speed-performance-issues) 2. [Speed Performance Issues](#speed-performance-issues)
1. [Speed Up, Memory Reduction, and Benchmark](#speed-up-memory-reduction-and-benchmark) 1. [Speed Up, Memory Reduction, and Benchmark](#speed-up-memory-reduction-and-benchmark)
2. [How to Measure the Latency Time?](#how-to-measure-the-latency-time) 2. [How to Measure the Latency Time?](#how-to-measure-the-latency-time)
...@@ -39,7 +40,7 @@ OpenPose - Frequently Asked Question (FAQ) ...@@ -39,7 +40,7 @@ OpenPose - Frequently Asked Question (FAQ)
#### Out of Memory Error #### Out of Memory Error
**Q: Out of memory error** - I get an error similar to: `Check failed: error == cudaSuccess (2 vs. 0) out of memory`. **Q: Out of memory error** - I get an error similar to: `Check failed: error == cudaSuccess (2 vs. 0) out of memory`.
**A**: Most probably cuDNN is not installed/enabled, the default Caffe model uses >12 GB of GPU memory, cuDNN reduces it to ~2 GB for BODY_25 and ~1.5 GB for COCO. **A**: Most probably cuDNN is not installed/enabled, the default Caffe model uses >12 GB of GPU memory, cuDNN reduces it to ~2.2 GB for BODY_25 (default) and ~1.5 GB for COCO (`--model_pose COCO`). Note that you still need at least about 2.2 GB free for the default OpenPose to run. I.e., GPUs with only 2 GB will not fit the default OpenPose, and you will have to either switch to the `COCO` model (slower and less accurate), or reduce the `--net_resolution` (faster speed but also lower accuracy).
...@@ -162,6 +163,14 @@ git submodle update ...@@ -162,6 +163,14 @@ git submodle update
#### Runtime Error with Turing GPU (Tesla T4) or Volta GPU
**Q**: When I start OpenPose, I receive a runtime error for new GPU architectures.
**A**: To solve this problem, 1) make sure you are using CUDA 10 or higher, and 2) change line 7 in `{OPENPOSE_PATH}/3rdparty/caffe/cmake/Cuda.cmake`, from `set(Caffe_known_gpu_archs "30 35 50 52 60 61")` to `set(Caffe_known_gpu_archs "30 35 50 52 60 61 75")`.
### Speed Performance Issues ### Speed Performance Issues
#### Speed Up, Memory Reduction, and Benchmark #### Speed Up, Memory Reduction, and Benchmark
......
...@@ -54,7 +54,12 @@ We add links to some community-based work based on OpenPose. Note: We do not sup ...@@ -54,7 +54,12 @@ We add links to some community-based work based on OpenPose. Note: We do not sup
- [ROS example](https://github.com/firephinx/openpose_ros) (based on a very old OpenPose version). For questions and more details, read and post ONLY on [issue thread #51](https://github.com/CMU-Perceptual-Computing-Lab/openpose/issues/51). - [ROS example](https://github.com/firephinx/openpose_ros) (based on a very old OpenPose version). For questions and more details, read and post ONLY on [issue thread #51](https://github.com/CMU-Perceptual-Computing-Lab/openpose/issues/51).
- Docker Images. For questions and more details, read and post ONLY on [issue thread #347](https://github.com/CMU-Perceptual-Computing-Lab/openpose/issues/347). - Docker Images. For questions and more details, read and post ONLY on [issue thread #347](https://github.com/CMU-Perceptual-Computing-Lab/openpose/issues/347).
- Dockerfile working with CUDA 10: [link 1](https://github.com/ExSidius/openpose-docker/blob/master/Dockerfile) and [link 2](https://cloud.docker.com/repository/docker/exsidius/openpose/general). - Dockerfile working also with CUDA 10:
- [Link 1](https://github.com/esemeniuc/openpose-docker), it claims to also include Python support. Read and post ONLY on [issue thread #1102](https://github.com/CMU-Perceptual-Computing-Lab/openpose/issues/1102).
- [Link 2](https://github.com/ExSidius/openpose-docker/blob/master/Dockerfile).
- [Link 3](https://cloud.docker.com/repository/docker/exsidius/openpose/general).
- Dockerfile working only with CUDA 8:
- [Dockerfile - OpenPose v1.4.0, OpenCV, CUDA 8, CuDNN 5, Python2.7](https://github.com/tlkh/openpose). Read and post ONLY on [issue thread #1102](https://github.com/CMU-Perceptual-Computing-Lab/openpose/issues/1102).
- [Dockerfile - OpenPose v1.4.0, OpenCV, CUDA 8, CuDNN 6, Python2.7](https://gist.github.com/moiseevigor/11c02c694fc0c22fccd59521793aeaa6). - [Dockerfile - OpenPose v1.4.0, OpenCV, CUDA 8, CuDNN 6, Python2.7](https://gist.github.com/moiseevigor/11c02c694fc0c22fccd59521793aeaa6).
- [Dockerfile - OpenPose v1.2.1](https://gist.github.com/sberryman/6770363f02336af82cb175a83b79de33). - [Dockerfile - OpenPose v1.2.1](https://gist.github.com/sberryman/6770363f02336af82cb175a83b79de33).
...@@ -168,7 +173,25 @@ make -j`nproc` ...@@ -168,7 +173,25 @@ make -j`nproc`
``` ```
#### Windows #### Windows
In order to build the project, open the Visual Studio solution (Windows), called `build/OpenPose.sln`. Then, set the configuration from `Debug` to `Release` and press the green triangle icon (alternatively press <kbd>F5</kbd>). In order to build the project, select and run only one of the 2 following alternatives.
1. **CMake-GUI alternative (recommended)**: Open the Visual Studio solution (Windows), called `build/OpenPose.sln`. Then, set the configuration from `Debug` to `Release` and press the green triangle icon (alternatively press <kbd>F5</kbd>).
2. Command-line build alternative (not recommended). NOTE: The command-line alternative is not officially supported, but it was added in [GitHub issue #1198](https://github.com/CMU-Perceptual-Computing-Lab/openpose/issues/1198). For any questions or bug reports about this command-line version, comment on that GitHub issue.
1. Run "MSVS 2017 Developer Command Console"
```
mkdir build
cd build
cmake .. -G "Visual Studio 15 2017 Win64" -T v140
cmake --build . --config Release
copy x64\Release\* bin\
```
2. If you want a clean build
```
cmake --build . --config Release --clean-first
copy x64\Release\* bin\
```
**VERY IMPORTANT NOTE**: In order to use OpenPose outside Visual Studio, and assuming you have not unchecked the `BUILD_BIN_FOLDER` flag in CMake, copy all DLLs from `{build_directory}/bin` into the folder where the generated `openpose.dll` and `*.exe` demos are, e.g., `{build_directory}/x64/Release` for the 64-bit release version.
......
...@@ -23,13 +23,13 @@ This module performs 3-D keypoint (body, face, and hand) reconstruction and rend ...@@ -23,13 +23,13 @@ This module performs 3-D keypoint (body, face, and hand) reconstruction and rend
## Installation ## Installation
Check [doc/installation.md#3d-reconstruction-module](./installation.md#3d-reconstruction-module) for installation steps. Check [doc/installation.md#3d-reconstruction-module](../installation.md#3d-reconstruction-module) for installation steps.
## Non Linear Optimization ## Non Linear Optimization
In order to increase the 3-D reconstruction accuracy, OpenPose optionally performs non-linear optimization if Ceres solver support is enabled (only available in Ubuntu for now). To enable it, check [doc/installation.md#3d-reconstruction-module](./installation.md#3d-reconstruction-module) for more details. In order to increase the 3-D reconstruction accuracy, OpenPose optionally performs non-linear optimization if Ceres solver support is enabled (only available in Ubuntu for now). To enable it, check [doc/installation.md#3d-reconstruction-module](../installation.md#3d-reconstruction-module) for more details.
......
...@@ -369,10 +369,14 @@ OpenPose Library - Release Notes ...@@ -369,10 +369,14 @@ OpenPose Library - Release Notes
1. Main improvements: 1. Main improvements:
1. Highly improved 3D triangulation for >3 cameras by fixing some small bugs. 1. Highly improved 3D triangulation for >3 cameras by fixing some small bugs.
2. Added community-based support for Nvidia NVCaffe. 2. Added community-based support for Nvidia NVCaffe.
3. Increased accuracy very slightly for the CUDA version (about 0.01%) by adapting the threshold in `process()` in `bodyPartConnectorBase.cu` to `defaultNmsThreshold`. This also removes any possibility of future bugs in that function when using a default NMS threshold higher than 0.15 (the value previously hard-coded).
4. Increased mAP but reduced mAR (both by about 0.01%), and reduced false positives. Step 1: removed legs where only knee/ankle/feet are found. Step 2: If no people are found in an image, `removePeopleBelowThresholds` is re-run with `maximizePositives = true`.
5. Number of maximum people is not limited by the maximum number of max peaks anymore. However, the number of body part candidates for a specific keypoint (e.g., nose) is still limited to the number of max peaks.
2. Functions or parameters renamed: 2. Functions or parameters renamed:
1. `--3d_min_views` default value (-1) no longer means that all camera views are required. Instead, it will be equal to max(2, min(4, #cameras-1)). This should provide a good trade-off between recall and precision.
3. Main bugs fixed: 3. Main bugs fixed:
1. Windows: Added back support for OpenGL and Spinnaker, as well as DLLs for debug compilation. 1. Windows: Added back support for OpenGL and Spinnaker, as well as DLLs for debug compilation.
2. `06_face_from_image.cpp` and `07_hand_from_image.cpp` working again, they stopped working in version 1.5.0 with the GPU image resize for the GUI.
4. Changes/additions that affect the compatibility with the OpenPose Unity Plugin: 4. Changes/additions that affect the compatibility with the OpenPose Unity Plugin:
......
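The new `--3d_min_views` default described in the release notes above reduces to a small formula. The following is a minimal sketch of it, not OpenPose code: the function name `resolveMinViews` is hypothetical, and treating every value other than -1 as a literal view count is an assumption.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper (not part of OpenPose): resolves the effective minimum
// number of camera views from the flag value and the number of cameras.
inline int resolveMinViews(const int minViewsFlag, const int numberCameras)
{
    assert(numberCameras >= 1);
    // Assumption: any value other than the default (-1) is used as given.
    if (minViewsFlag != -1)
        return minViewsFlag;
    // Default (-1): max(2, min(4, #cameras-1)), as stated in the release note.
    return std::max(2, std::min(4, numberCameras - 1));
}
```

Under this sketch, a 3-camera rig requires 2 views, a 4-camera rig requires 3, and any rig with 5 or more cameras requires 4.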
...@@ -18,10 +18,10 @@ namespace op ...@@ -18,10 +18,10 @@ namespace op
void connectBodyPartsGpu( void connectBodyPartsGpu(
Array<T>& poseKeypoints, Array<T>& poseScores, const T* const heatMapGpuPtr, const T* const peaksPtr, Array<T>& poseKeypoints, Array<T>& poseScores, const T* const heatMapGpuPtr, const T* const peaksPtr,
const PoseModel poseModel, const Point<int>& heatMapSize, const int maxPeaks, const T interMinAboveThreshold, const PoseModel poseModel, const Point<int>& heatMapSize, const int maxPeaks, const T interMinAboveThreshold,
const T interThreshold, const int minSubsetCnt, const T minSubsetScore, const T scaleFactor = 1.f, const T interThreshold, const int minSubsetCnt, const T minSubsetScore, const T scaleFactor,
const bool maximizePositives = false, Array<T> pairScoresCpu = Array<T>{}, T* pairScoresGpuPtr = nullptr, const bool maximizePositives, Array<T> pairScoresCpu, T* pairScoresGpuPtr,
const unsigned int* const bodyPartPairsGpuPtr = nullptr, const unsigned int* const mapIdxGpuPtr = nullptr, const unsigned int* const bodyPartPairsGpuPtr, const unsigned int* const mapIdxGpuPtr,
const T* const peaksGpuPtr = nullptr); const T* const peaksGpuPtr, const T defaultNmsThreshold);
template <typename T> template <typename T>
void connectBodyPartsOcl( void connectBodyPartsOcl(
...@@ -41,16 +41,16 @@ namespace op ...@@ -41,16 +41,16 @@ namespace op
const unsigned int numberBodyPartPairs, const Array<T>& precomputedPAFs = Array<T>()); const unsigned int numberBodyPartPairs, const Array<T>& precomputedPAFs = Array<T>());
template <typename T> template <typename T>
void removePeopleBelowThresholds(std::vector<int>& validSubsetIndexes, int& numberPeople, void removePeopleBelowThresholdsAndFillFaces(
const std::vector<std::pair<std::vector<int>, T>>& subsets, std::vector<int>& validSubsetIndexes, int& numberPeople,
const unsigned int numberBodyParts, const int minSubsetCnt, std::vector<std::pair<std::vector<int>, T>>& subsets, const unsigned int numberBodyParts,
const T minSubsetScore, const int maxPeaks, const bool maximizePositives); const int minSubsetCnt, const T minSubsetScore, const bool maximizePositives, const T* const peaksPtr);
template <typename T> template <typename T>
void peopleVectorToPeopleArray(Array<T>& poseKeypoints, Array<T>& poseScores, const T scaleFactor, void peopleVectorToPeopleArray(
const std::vector<std::pair<std::vector<int>, T>>& subsets, Array<T>& poseKeypoints, Array<T>& poseScores, const T scaleFactor,
const std::vector<int>& validSubsetIndexes, const T* const peaksPtr, const std::vector<std::pair<std::vector<int>, T>>& subsets, const std::vector<int>& validSubsetIndexes,
const int numberPeople, const unsigned int numberBodyParts, const T* const peaksPtr, const int numberPeople, const unsigned int numberBodyParts,
const unsigned int numberBodyPartPairs); const unsigned int numberBodyPartPairs);
template <typename T> template <typename T>
......
...@@ -25,6 +25,8 @@ namespace op ...@@ -25,6 +25,8 @@ namespace op
void setMaximizePositives(const bool maximizePositives); void setMaximizePositives(const bool maximizePositives);
void setDefaultNmsThreshold(const T defaultNmsThreshold);
void setInterMinAboveThreshold(const T interMinAboveThreshold); void setInterMinAboveThreshold(const T interMinAboveThreshold);
void setInterThreshold(const T interThreshold); void setInterThreshold(const T interThreshold);
...@@ -56,6 +58,7 @@ namespace op ...@@ -56,6 +58,7 @@ namespace op
private: private:
PoseModel mPoseModel; PoseModel mPoseModel;
bool mMaximizePositives; bool mMaximizePositives;
T mDefaultNmsThreshold;
T mInterMinAboveThreshold; T mInterMinAboveThreshold;
T mInterThreshold; T mInterThreshold;
int mMinSubsetCnt; int mMinSubsetCnt;
......
...@@ -210,10 +210,12 @@ namespace op ...@@ -210,10 +210,12 @@ namespace op
1.f,1.f,1.f,1.f,1.f,1.f, \ 1.f,1.f,1.f,1.f,1.f,1.f, \
0.60f,0.60f,0.60f,0.60f,0.60f, 0.60f,0.60f,0.60f,0.60f,0.60f, 0.60f,0.60f,0.60f,0.60f,0.60f, 0.60f,0.60f,0.60f,0.60f,0.60f, \ 0.60f,0.60f,0.60f,0.60f,0.60f, 0.60f,0.60f,0.60f,0.60f,0.60f, 0.60f,0.60f,0.60f,0.60f,0.60f, 0.60f,0.60f,0.60f,0.60f,0.60f, \
0.60f,0.60f,0.60f,0.60f,0.60f, 0.60f,0.60f,0.60f,0.60f,0.60f, 0.60f,0.60f,0.60f,0.60f,0.60f, 0.60f,0.60f,0.60f,0.60f,0.60f, \ 0.60f,0.60f,0.60f,0.60f,0.60f, 0.60f,0.60f,0.60f,0.60f,0.60f, 0.60f,0.60f,0.60f,0.60f,0.60f, 0.60f,0.60f,0.60f,0.60f,0.60f, \
0.00f,0.00f,0.00f,0.00f,0.00f, 0.00f,0.00f,0.00f,0.00f,0.00f, 0.00f,0.00f,0.00f,0.00f,0.00f, 0.00f,0.00f,0.45f,0.45f,0.45f, \ 0.45f,0.45f,0.45f,0.45f,0.45f, 0.45f,0.45f,0.45f,0.45f,0.45f, 0.45f,0.45f,0.45f,0.45f,0.45f, 0.45f,0.45f,0.45f,0.45f,0.45f, \
0.45f,0.45f,0.45f,0.45f,0.45f, 0.45f,0.45f,0.45f,0.45f,0.45f, 0.45f,0.45f,0.45f,0.45f,0.45f, 0.45f,0.45f,0.45f,0.45f,0.45f, \ 0.45f,0.45f,0.45f,0.45f,0.45f, 0.45f,0.45f,0.45f,0.45f,0.45f, 0.45f,0.45f,0.45f,0.45f,0.45f, 0.45f,0.45f,0.45f,0.45f,0.45f, \
0.45f,0.45f,0.45f,0.45f,0.45f, 0.45f,0.45f,0.45f,0.45f,0.45f, 0.45f,0.45f,0.45f,0.45f,0.45f, 0.45f,0.45f,0.45f,0.45f,0.45f, \ 0.45f,0.45f,0.45f,0.45f,0.45f, 0.45f,0.45f,0.45f,0.45f,0.45f, 0.45f,0.45f,0.45f,0.45f,0.45f, 0.45f,0.45f,0.45f,0.45f,0.45f, \
0.45f,0.45f,0.45f,0.45f,0.45f, 0.45f,0.45f,0.45f,0.45f,0.45f 0.45f,0.45f,0.45f,0.45f,0.45f, 0.45f,0.45f,0.45f,0.45f,0.45f
// First 0.45f row:
// 0.00f,0.00f,0.00f,0.00f,0.00f, 0.00f,0.00f,0.00f,0.00f,0.00f, 0.00f,0.00f,0.00f,0.00f,0.00f, 0.00f,0.00f,0.45f,0.45f,0.45f,
#define POSE_BODY_135_COLORS_RENDER_GPU \ #define POSE_BODY_135_COLORS_RENDER_GPU \
255.f, 0.f, 85.f, \ 255.f, 0.f, 85.f, \
170.f, 0.f, 255.f, \ 170.f, 0.f, 255.f, \
......
...@@ -21,12 +21,15 @@ namespace op ...@@ -21,12 +21,15 @@ namespace op
void scaleKeypoints2d(Array<T>& keypoints, const T scaleX, const T scaleY, const T offsetX, const T offsetY); void scaleKeypoints2d(Array<T>& keypoints, const T scaleX, const T scaleY, const T offsetX, const T offsetY);
template <typename T> template <typename T>
void renderKeypointsCpu(Array<T>& frameArray, const Array<T>& keypoints, const std::vector<unsigned int>& pairs, void renderKeypointsCpu(
const std::vector<T> colors, const T thicknessCircleRatio, Array<T>& frameArray, const Array<T>& keypoints, const std::vector<unsigned int>& pairs,
const T thicknessLineRatioWRTCircle, const std::vector<T>& poseScales, const T threshold); const std::vector<T> colors, const T thicknessCircleRatio, const T thicknessLineRatioWRTCircle,
const std::vector<T>& poseScales, const T threshold);
template <typename T> template <typename T>
Rectangle<T> getKeypointsRectangle(const Array<T>& keypoints, const int person, const T threshold); Rectangle<T> getKeypointsRectangle(
const Array<T>& keypoints, const int person, const T threshold, const int firstIndex = 0,
const int lastIndex = -1);
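The declaration above only shows that `getKeypointsRectangle` gained `firstIndex` and `lastIndex` parameters; the diff does not show how they are interpreted. The sketch below illustrates one plausible reading on simplified types. `Keypoint`, `Box`, and `keypointsBox` are hypothetical names, and treating `lastIndex` as an exclusive bound where -1 means "through the last keypoint" is an assumption, not confirmed behavior.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-ins for op::Array<T> rows and op::Rectangle<T>.
struct Keypoint { float x; float y; float score; };
struct Box { float x; float y; float width; float height; };

// Bounding box of the keypoints in [firstIndex, lastIndex) whose score is above
// the threshold. Assumption: lastIndex < 0 means "up to the last keypoint".
inline Box keypointsBox(
    const std::vector<Keypoint>& keypoints, const float threshold, const int firstIndex = 0,
    const int lastIndex = -1)
{
    const int size = static_cast<int>(keypoints.size());
    const int end = (lastIndex < 0 ? size : lastIndex);
    assert(0 <= firstIndex && firstIndex <= end && end <= size);
    bool found = false;
    float minX = 0.f, maxX = 0.f, minY = 0.f, maxY = 0.f;
    for (int i = firstIndex; i < end; i++)
    {
        const Keypoint& keypoint = keypoints[i];
        // Skip keypoints that were not detected confidently enough
        if (keypoint.score <= threshold)
            continue;
        minX = (found ? std::min(minX, keypoint.x) : keypoint.x);
        maxX = (found ? std::max(maxX, keypoint.x) : keypoint.x);
        minY = (found ? std::min(minY, keypoint.y) : keypoint.y);
        maxY = (found ? std::max(maxY, keypoint.y) : keypoint.y);
        found = true;
    }
    // All zeros if no keypoint in the range passed the threshold
    return Box{minX, minY, maxX - minX, maxY - minY};
}
```

Restricting the range this way would let a caller compute, for example, a box over only the face or hand keypoints of a whole-body model.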
template <typename T> template <typename T>
T getAverageScore(const Array<T>& keypoints, const int person); T getAverageScore(const Array<T>& keypoints, const int person);
...@@ -44,7 +47,8 @@ namespace op ...@@ -44,7 +47,8 @@ namespace op
T getDistanceAverage(const Array<T>& keypoints, const int personA, const int personB, const T threshold); T getDistanceAverage(const Array<T>& keypoints, const int personA, const int personB, const T threshold);
template <typename T> template <typename T>
T getDistanceAverage(const Array<T>& keypointsA, const int personA, const Array<T>& keypointsB, const int personB, T getDistanceAverage(
const Array<T>& keypointsA, const int personA, const Array<T>& keypointsB, const int personB,
const T threshold); const T threshold);
/** /**
......
...@@ -267,7 +267,7 @@ namespace op ...@@ -267,7 +267,7 @@ namespace op
// Input cvMat to OpenPose input & output format // Input cvMat to OpenPose input & output format
// Note: resize on GPU reduces accuracy about 0.1% // Note: resize on GPU reduces accuracy about 0.1%
bool resizeOnCpu = true; bool resizeOnCpu = true;
// const auto resizeOnCpu = (numberGpuThreads < 3); // const auto resizeOnCpu = (wrapperStructPose.poseMode != PoseMode::Enabled);
if (resizeOnCpu) if (resizeOnCpu)
{ {
const auto gpuResize = false; const auto gpuResize = false;
...@@ -277,7 +277,8 @@ namespace op ...@@ -277,7 +277,8 @@ namespace op
} }
// Note: We realized that somehow doing it on GPU for any number of GPUs does speedup the whole OP // Note: We realized that somehow doing it on GPU for any number of GPUs does speedup the whole OP
resizeOnCpu = false; resizeOnCpu = false;
addCvMatToOpOutputInCpu = addCvMatToOpOutput && (resizeOnCpu || !renderOutputGpu); addCvMatToOpOutputInCpu = addCvMatToOpOutput
&& (resizeOnCpu || !renderOutputGpu || wrapperStructPose.poseMode != PoseMode::Enabled);
if (addCvMatToOpOutputInCpu) if (addCvMatToOpOutputInCpu)
{ {
const auto gpuResize = false; const auto gpuResize = false;
...@@ -618,7 +619,7 @@ namespace op ...@@ -618,7 +619,7 @@ namespace op
{ {
const auto gpuResize = true; const auto gpuResize = true;
opOutputToCvMats.emplace_back(std::make_shared<OpOutputToCvMat>(gpuResize)); opOutputToCvMats.emplace_back(std::make_shared<OpOutputToCvMat>(gpuResize));
poseExtractorsWs[i].emplace_back( poseExtractorsWs.at(i).emplace_back(
std::make_shared<WOpOutputToCvMat<TDatumsSP>>(opOutputToCvMats.back())); std::make_shared<WOpOutputToCvMat<TDatumsSP>>(opOutputToCvMats.back()));
// Assign shared parameters // Assign shared parameters
opOutputToCvMats.back()->setSharedParameters( opOutputToCvMats.back()->setSharedParameters(
......
...@@ -33,7 +33,7 @@ LIBRARY_NAME := $(PROJECT) ...@@ -33,7 +33,7 @@ LIBRARY_NAME := $(PROJECT)
LIB_BUILD_DIR := $(BUILD_DIR)/lib LIB_BUILD_DIR := $(BUILD_DIR)/lib
STATIC_NAME := $(LIB_BUILD_DIR)/lib$(LIBRARY_NAME).a STATIC_NAME := $(LIB_BUILD_DIR)/lib$(LIBRARY_NAME).a
DYNAMIC_VERSION_MAJOR := 1 DYNAMIC_VERSION_MAJOR := 1
DYNAMIC_VERSION_MINOR := 4 DYNAMIC_VERSION_MINOR := 5
DYNAMIC_VERSION_REVISION := 0 DYNAMIC_VERSION_REVISION := 0
DYNAMIC_NAME_SHORT := lib$(LIBRARY_NAME).so DYNAMIC_NAME_SHORT := lib$(LIBRARY_NAME).so
#DYNAMIC_SONAME_SHORT := $(DYNAMIC_NAME_SHORT).$(DYNAMIC_VERSION_MAJOR) #DYNAMIC_SONAME_SHORT := $(DYNAMIC_NAME_SHORT).$(DYNAMIC_VERSION_MAJOR)
......
#include <set> #include <set>
#include <openpose/utilities/check.hpp> #include <openpose/utilities/check.hpp>
#include <openpose/utilities/fastMath.hpp> #include <openpose/utilities/fastMath.hpp>
#include <openpose/utilities/keypoint.hpp>
#include <openpose/pose/poseParameters.hpp> #include <openpose/pose/poseParameters.hpp>
#include <openpose/net/bodyPartConnectorBase.hpp> #include <openpose/net/bodyPartConnectorBase.hpp>
namespace op namespace op
{ {
template <typename T> template <typename T>
inline T getScoreAB(const int i, const int j, const T* const candidateAPtr, const T* const candidateBPtr, inline T getScoreAB(
const T* const mapX, const T* const mapY, const Point<int>& heatMapSize, const int i, const int j, const T* const candidateAPtr, const T* const candidateBPtr, const T* const mapX,
const T interThreshold, const T interMinAboveThreshold) const T* const mapY, const Point<int>& heatMapSize, const T interThreshold, const T interMinAboveThreshold)
{ {
try try
{ {
...@@ -57,6 +58,27 @@ namespace op ...@@ -57,6 +58,27 @@ namespace op
} }
} }
template <typename T>
void getKeypointCounter(
int& personCounter, const std::vector<std::pair<std::vector<int>, T>>& peopleVector,
const unsigned int index, const int indexFirst, const int indexLast, const int minimum)
{
try
{
// Count keypoints
auto keypointCounter = 0;
for (auto i = indexFirst ; i < indexLast ; i++)
keypointCounter += (peopleVector[index].first.at(i) > 0);
// If enough keypoints --> subtract them and keep them at least as big as minimum
if (keypointCounter > minimum)
personCounter += minimum-keypointCounter; // personCounter = non-considered keypoints + minimum
}
catch (const std::exception& e)
{
error(e.what(), __LINE__, __FUNCTION__, __FILE__);
}
}
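The effect of `getKeypointCounter` is easier to see on a single person. The sketch below is a simplified stand-in, not the function itself: `adjustCounter` is a hypothetical name, it returns the new counter instead of updating a reference, and `person` replaces `peopleVector[index].first` (one peak index per keypoint, 0 meaning not found).

```cpp
#include <cassert>
#include <vector>

// Simplified stand-in for getKeypointCounter (hypothetical name and signature).
// Keypoints found in [indexFirst, indexLast) contribute at most `minimum`
// to the returned counter.
inline int adjustCounter(
    const int personCounter, const std::vector<int>& person, const int indexFirst,
    const int indexLast, const int minimum)
{
    assert(0 <= indexFirst && indexFirst <= indexLast
           && indexLast <= static_cast<int>(person.size()));
    // Count keypoints found in the range
    int found = 0;
    for (int i = indexFirst; i < indexLast; i++)
        found += (person[i] > 0);
    // More than `minimum` found --> they count only as `minimum`
    if (found > minimum)
        return personCounter + minimum - found;
    return personCounter;
}
```

With `minimum = 0` the range is ignored entirely, which matches how the foot keypoints (19 to 25) are handled later in this diff; with `minimum = 1` a whole group such as the face counts as a single keypoint.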
template <typename T> template <typename T>
std::vector<std::pair<std::vector<int>, T>> createPeopleVector( std::vector<std::pair<std::vector<int>, T>> createPeopleVector(
const T* const heatMapPtr, const T* const peaksPtr, const PoseModel poseModel, const Point<int>& heatMapSize, const T* const heatMapPtr, const T* const peaksPtr, const PoseModel poseModel, const Point<int>& heatMapSize,
...@@ -211,8 +233,9 @@ namespace op ...@@ -211,8 +233,9 @@ namespace op
for (auto j = 1; j <= numberPeaksB; j++) for (auto j = 1; j <= numberPeaksB; j++)
{ {
// Initial PAF // Initial PAF
auto scoreAB = getScoreAB(i, j, candidateAPtr, candidateBPtr, mapX, mapY, auto scoreAB = getScoreAB(
heatMapSize, interThreshold, interMinAboveThreshold); i, j, candidateAPtr, candidateBPtr, mapX, mapY, heatMapSize, interThreshold,
interMinAboveThreshold);
// E.g., neck-nose connection. If possible PAF between neck i, nose j --> add // E.g., neck-nose connection. If possible PAF between neck i, nose j --> add
// parts score + connection score // parts score + connection score
...@@ -263,9 +286,8 @@ namespace op ...@@ -263,9 +286,8 @@ namespace op
const auto indexB = std::get<2>(aBConnection); const auto indexB = std::get<2>(aBConnection);
if (!occurA[indexA-1] && !occurB[indexB-1]) if (!occurA[indexA-1] && !occurB[indexB-1])
{ {
abConnections.emplace_back(std::make_tuple(bodyPartA*peaksOffset + indexA*3 + 2, abConnections.emplace_back(std::make_tuple(
bodyPartB*peaksOffset + indexB*3 + 2, bodyPartA*peaksOffset+indexA*3+2, bodyPartB*peaksOffset+indexB*3+2, score));
score));
counter++; counter++;
if (counter==minAB) if (counter==minAB)
break; break;
...@@ -298,8 +320,8 @@ namespace op ...@@ -298,8 +320,8 @@ namespace op
// Add ears connections (in case person is looking to opposite direction to camera) // Add ears connections (in case person is looking to opposite direction to camera)
// Note: This has some issues: // Note: This has some issues:
// - It does not prevent repeating the same keypoint in different people // - It does not prevent repeating the same keypoint in different people
// - Assuming I have nose,eye,ear as 1 person subset, and whole arm as another one, it will not // - Assuming I have nose,eye,ear as 1 person subset, and whole arm as another one, it
// merge them both // will not merge them both
else if ( else if (
(numberBodyParts == 18 && (pairIndex==17 || pairIndex==18)) (numberBodyParts == 18 && (pairIndex==17 || pairIndex==18))
|| ((numberBodyParts == 19 || (numberBodyParts == 25) || ((numberBodyParts == 19 || (numberBodyParts == 25)
...@@ -622,49 +644,139 @@ namespace op ...@@ -622,49 +644,139 @@ namespace op
} }
template <typename T> template <typename T>
void removePeopleBelowThresholds( void getRoiDiameterAndBounds(
Rectangle<int>& roi, int& diameter, int& indexFirstNon0, int& indexLastNon0,
const std::vector<int>& personVector, const T* const peaksPtr,
const int indexInit, const int indexEnd)
{
try
{
roi = Rectangle<int>{0,0,0,0};
for (auto index = 0u ; index < personVector.size()-1 ; index++)
{
const auto x = peaksPtr[personVector[index]-2];
const auto y = peaksPtr[personVector[index]-1];
const auto score = peaksPtr[personVector[index]];
if (roi.x > x)
roi.x = x;
if (roi.y > y)
roi.y = y;
}
}
catch (const std::exception& e)
{
error(e.what(), __LINE__, __FUNCTION__, __FILE__);
}
}
template <typename T>
void removePeopleBelowThresholdsAndFillFaces(
std::vector<int>& validSubsetIndexes, int& numberPeople, std::vector<int>& validSubsetIndexes, int& numberPeople,
const std::vector<std::pair<std::vector<int>, T>>& peopleVector, const unsigned int numberBodyParts, std::vector<std::pair<std::vector<int>, T>>& peopleVector, const unsigned int numberBodyParts,
const int minSubsetCnt, const T minSubsetScore, const int maxPeaks, const bool maximizePositives) const int minSubsetCnt, const T minSubsetScore, const bool maximizePositives, const T* const peaksPtr)
// const int minSubsetCnt, const T minSubsetScore, const int maxPeaks, const bool maximizePositives)
{ {
try try
{ {
// Delete people below the following thresholds: // Delete people below the following thresholds:
// a) minSubsetCnt: removed if less than minSubsetCnt body parts // a) minSubsetCnt: removed if less than minSubsetCnt body parts
// b) minSubsetScore: removed if global score smaller than this // b) minSubsetScore: removed if global score smaller than this
// c) maxPeaks (POSE_MAX_PEOPLE): keep first maxPeaks people above thresholds // c) maxPeaks (POSE_MAX_PEOPLE): keep first maxPeaks people above thresholds -> Not required
numberPeople = 0; numberPeople = 0;
validSubsetIndexes.clear(); validSubsetIndexes.clear();
validSubsetIndexes.reserve(fastMin((size_t)maxPeaks, peopleVector.size())); // validSubsetIndexes.reserve(fastMin((size_t)maxPeaks, peopleVector.size())); // maxPeaks is not required
validSubsetIndexes.reserve(peopleVector.size());
// Face valid sets
std::vector<int> faceValidSubsetIndexes;
faceValidSubsetIndexes.reserve(peopleVector.size());
// Face invalid sets
std::vector<int> faceInvalidSubsetIndexes;
faceInvalidSubsetIndexes.reserve(peopleVector.size());
// For each person candidate
for (auto index = 0u ; index < peopleVector.size() ; index++) for (auto index = 0u ; index < peopleVector.size() ; index++)
{ {
auto personCounter = peopleVector[index].first.back(); auto personCounter = peopleVector[index].first.back();
// Analog for hand/face keypoints
if (numberBodyParts >= 135)
{
// No consider face keypoints for personCounter
const auto currentCounter = personCounter;
getKeypointCounter(personCounter, peopleVector, index, 65, 135, 1);
const auto newCounter = personCounter;
if (personCounter == 0)
{
faceInvalidSubsetIndexes.emplace_back(index);
continue;
}
// If body is still valid and facial points were removed, then add to valid faces
else if (currentCounter != newCounter)
faceValidSubsetIndexes.emplace_back(index);
// No consider right hand keypoints for personCounter
getKeypointCounter(personCounter, peopleVector, index, 45, 65, 1);
// No consider left hand keypoints for personCounter
getKeypointCounter(personCounter, peopleVector, index, 25, 45, 1);
}
// Foot keypoints do not affect personCounter (too many false positives, // Foot keypoints do not affect personCounter (too many false positives,
// same foot usually appears as both left and right keypoints) // same foot usually appears as both left and right keypoints)
// Pros: Removed tons of false positives // Pros: Removed tons of false positives
// Cons: Standalone leg will never be recorded // Cons: Standalone leg will never be recorded
// Solution: No consider foot keypoints for that
if (!maximizePositives && (numberBodyParts == 25 || numberBodyParts > 70)) if (!maximizePositives && (numberBodyParts == 25 || numberBodyParts > 70))
{ {
// No consider foot keypoints for that const auto currentCounter = personCounter;
for (auto i = 19 ; i < 25 ; i++) getKeypointCounter(personCounter, peopleVector, index, 19, 25, 0);
personCounter -= (peopleVector[index].first.at(i) > 0); const auto newCounter = personCounter;
// No consider hand keypoints for that // Problem: Same leg/foot keypoints are considered for both left and right keypoints.
if (numberBodyParts > 70) // Solution: Remove legs that are duplicated and that do not have upper torso
for (auto i = 25 ; i < 65 ; i++) // Result: Slight increase in COCO mAP and decrease in mAR + reducing a lot false positives!
personCounter -= (peopleVector[index].first.at(i) > 0); if (newCounter != currentCounter && newCounter <= 4)
continue;
} }
// Add only valid people
const auto personScore = peopleVector[index].second; const auto personScore = peopleVector[index].second;
if (personCounter >= minSubsetCnt && (personScore/personCounter) >= minSubsetScore) if (personCounter >= minSubsetCnt && (personScore/personCounter) >= minSubsetScore)
{ {
numberPeople++; numberPeople++;
validSubsetIndexes.emplace_back(index); validSubsetIndexes.emplace_back(index);
if (numberPeople == maxPeaks) // // This is not required, it is OK if there are more people. No more GPU memory used.
break; // if (numberPeople == maxPeaks)
// break;
} }
// Sanity check
else if ((personCounter < 1 && numberBodyParts != 25 && numberBodyParts < 70) || personCounter < 0) else if ((personCounter < 1 && numberBodyParts != 25 && numberBodyParts < 70) || personCounter < 0)
error("Bad personCounter (" + std::to_string(personCounter) + "). Bug in this" error("Bad personCounter (" + std::to_string(personCounter) + "). Bug in this"
" function if this happens.", __LINE__, __FUNCTION__, __FILE__); " function if this happens.", __LINE__, __FUNCTION__, __FILE__);
} }
// // Random standalone facial keypoints --> Merge into a more complete face
// if (numberPeople > 0 && faceInvalidSubsetIndexes.size() > 0)
// {
// for (auto faceId = 0u ; faceId < faceInvalidSubsetIndexes.size() ; faceId++)
// {
// // Get ROI
// Rectangle<int> roi;
// int diameter;
// int indexFirstNon0;
// int indexLastNon0;
// const auto index = faceValidSubsetIndexes[faceId];
// getRoiDiameterAndBounds(
// roi, diameter, indexFirstNon0, indexLastNon0, peopleVector[index].first, peaksPtr, 65, 135);
// // const auto personCounter = peopleVector[index].first.back();
// // const auto x = peaksPtr[peopleVector[index].first[part]-2];
// // const auto y = peaksPtr[peopleVector[index].first[part]-1];
// // const auto score = peaksPtr[peopleVector[index].first[part]];
// }
// }
// If no people found --> Repeat with maximizePositives = true
// Result: Increased COCO mAP because we catch more foot-only images
if (numberPeople == 0 && !maximizePositives)
{
removePeopleBelowThresholdsAndFillFaces(
validSubsetIndexes, numberPeople, peopleVector, numberBodyParts, minSubsetCnt, minSubsetScore,
true, peaksPtr);
// // Debugging
// if (numberPeople > 0)
// log("Found " + std::to_string(numberPeople) + " people in second iteration");
}
} }
catch (const std::exception& e) catch (const std::exception& e)
{ {
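The filtering rule in this hunk (keep a person only if it has at least `minSubsetCnt` parts and its mean score per part reaches `minSubsetScore`) can be sketched in isolation. This is a minimal illustration, not the function from the commit: `Person` and `filterPeople` are hypothetical names, the face-keypoint adjustment and the `maximizePositives` retry are left out, and the part count is assumed to sit in the last element of the index vector, as the `[body parts locations, #body parts found]` comment in the source describes.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-in for one peopleVector entry:
// first  = [body part locations..., #body parts found]
// second = accumulated person score
using Person = std::pair<std::vector<int>, float>;

// Returns the indexes of the people that pass both thresholds.
std::vector<int> filterPeople(
    const std::vector<Person>& peopleVector, const int minSubsetCnt, const float minSubsetScore)
{
    std::vector<int> validSubsetIndexes;
    // Reserve for every candidate, since the new code no longer caps at maxPeaks
    validSubsetIndexes.reserve(peopleVector.size());
    for (auto index = 0u; index < peopleVector.size(); index++)
    {
        const auto& keypointIndexes = peopleVector[index].first;
        assert(!keypointIndexes.empty());
        // Assumption: the last element stores the number of parts found
        const auto personCounter = keypointIndexes.back();
        const auto personScore = peopleVector[index].second;
        // Same test as in the diff: enough parts and a high enough mean score
        if (personCounter >= minSubsetCnt && (personScore / personCounter) >= minSubsetScore)
            validSubsetIndexes.emplace_back((int)index);
    }
    return validSubsetIndexes;
}
```

With `minSubsetCnt = 2` and `minSubsetScore = 0.4`, a person with 2 parts and a total score of 1.5 is kept (mean 0.75), while a person with 3 parts and a total score of 0.6 is dropped (mean 0.2).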
...@@ -673,30 +785,35 @@ namespace op ...@@ -673,30 +785,35 @@ namespace op
} }
template <typename T> template <typename T>
void peopleVectorToPeopleArray(Array<T>& poseKeypoints, Array<T>& poseScores, const T scaleFactor, void peopleVectorToPeopleArray(
const std::vector<std::pair<std::vector<int>, T>>& peopleVector, Array<T>& poseKeypoints, Array<T>& poseScores, const T scaleFactor,
const std::vector<int>& validSubsetIndexes, const T* const peaksPtr, const std::vector<std::pair<std::vector<int>, T>>& peopleVector, const std::vector<int>& validSubsetIndexes,
const int numberPeople, const unsigned int numberBodyParts, const T* const peaksPtr, const int numberPeople, const unsigned int numberBodyParts,
const unsigned int numberBodyPartPairs) const unsigned int numberBodyPartPairs)
{ {
try try
{ {
// Allocate memory (initialized to 0)
if (numberPeople > 0) if (numberPeople > 0)
{ {
// Initialized to 0 for non-found keypoints in people // Initialized to 0 for non-found keypoints in people
poseKeypoints.reset({numberPeople, (int)numberBodyParts, 3}, 0.f); poseKeypoints.reset({numberPeople, (int)numberBodyParts, 3}, 0.f);
poseScores.reset(numberPeople); poseScores.reset(numberPeople);
} }
// No people --> Empty Arrays
else else
{ {
poseKeypoints.reset(); poseKeypoints.reset();
poseScores.reset(); poseScores.reset();
} }
// Fill people keypoints
const auto oneOverNumberBodyPartsAndPAFs = 1/T(numberBodyParts + numberBodyPartPairs); const auto oneOverNumberBodyPartsAndPAFs = 1/T(numberBodyParts + numberBodyPartPairs);
// For each person
for (auto person = 0u ; person < validSubsetIndexes.size() ; person++) for (auto person = 0u ; person < validSubsetIndexes.size() ; person++)
{ {
const auto& personPair = peopleVector[validSubsetIndexes[person]]; const auto& personPair = peopleVector[validSubsetIndexes[person]];
const auto& personVector = personPair.first; const auto& personVector = personPair.first;
// For each body part
for (auto bodyPart = 0u; bodyPart < numberBodyParts; bodyPart++) for (auto bodyPart = 0u; bodyPart < numberBodyParts; bodyPart++)
{ {
const auto baseOffset = (person*numberBodyParts + bodyPart) * 3; const auto baseOffset = (person*numberBodyParts + bodyPart) * 3;
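The `baseOffset` expression above addresses a flat buffer laid out as `{numberPeople, numberBodyParts, 3}`, where the last dimension holds x, y and score. A small sketch of that indexing follows. `keypointOffset` and `setKeypoint` are hypothetical helpers written for illustration, and a plain `std::vector<float>` stands in for `Array<T>`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Offset of (person, bodyPart) in a flat {numberPeople, numberBodyParts, 3} buffer.
// Same arithmetic as baseOffset in the diff.
std::size_t keypointOffset(
    const std::size_t person, const std::size_t bodyPart, const std::size_t numberBodyParts)
{
    assert(bodyPart < numberBodyParts);
    return (person * numberBodyParts + bodyPart) * 3;
}

// Writes x, y and score for one keypoint into the flat buffer.
void setKeypoint(
    std::vector<float>& flat, const std::size_t person, const std::size_t bodyPart,
    const std::size_t numberBodyParts, const float x, const float y, const float score)
{
    const auto base = keypointOffset(person, bodyPart, numberBodyParts);
    assert(base + 2 < flat.size());
    flat[base] = x;
    flat[base + 1] = y;
    flat[base + 2] = score;
}
```

For a 25-part model, person 1 and body part 4 map to offset `(1*25 + 4) * 3 = 87`, so x, y and score land in elements 87, 88 and 89.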
...@@ -1109,10 +1226,10 @@ namespace op ...@@ -1109,10 +1226,10 @@ namespace op
// } // }
template <typename T> template <typename T>
void connectBodyPartsCpu(Array<T>& poseKeypoints, Array<T>& poseScores, const T* const heatMapPtr, void connectBodyPartsCpu(
const T* const peaksPtr, const PoseModel poseModel, const Point<int>& heatMapSize, Array<T>& poseKeypoints, Array<T>& poseScores, const T* const heatMapPtr, const T* const peaksPtr,
const int maxPeaks, const T interMinAboveThreshold, const T interThreshold, const PoseModel poseModel, const Point<int>& heatMapSize, const int maxPeaks, const T interMinAboveThreshold,
const int minSubsetCnt, const T minSubsetScore, const T scaleFactor, const T interThreshold, const int minSubsetCnt, const T minSubsetScore, const T scaleFactor,
const bool maximizePositives) const bool maximizePositives)
{ {
try try
...@@ -1124,29 +1241,27 @@ namespace op ...@@ -1124,29 +1241,27 @@ namespace op
if (numberBodyParts == 0) if (numberBodyParts == 0)
error("Invalid value of numberBodyParts, it must be positive, not " + std::to_string(numberBodyParts), error("Invalid value of numberBodyParts, it must be positive, not " + std::to_string(numberBodyParts),
__LINE__, __FUNCTION__, __FILE__); __LINE__, __FUNCTION__, __FILE__);
// std::vector<std::pair<std::vector<int>, double>> refers to: // std::vector<std::pair<std::vector<int>, double>> refers to:
// - std::vector<int>: [body parts locations, #body parts found] // - std::vector<int>: [body parts locations, #body parts found]
// - double: person subset score // - double: person subset score
const auto peopleVector = createPeopleVector( auto peopleVector = createPeopleVector(
heatMapPtr, peaksPtr, poseModel, heatMapSize, maxPeaks, interThreshold, interMinAboveThreshold, heatMapPtr, peaksPtr, poseModel, heatMapSize, maxPeaks, interThreshold, interMinAboveThreshold,
bodyPartPairs, numberBodyParts, numberBodyPartPairs); bodyPartPairs, numberBodyParts, numberBodyPartPairs);
// Delete people below the following thresholds: // Delete people below the following thresholds:
// a) minSubsetCnt: removed if less than minSubsetCnt body parts // a) minSubsetCnt: removed if less than minSubsetCnt body parts
// b) minSubsetScore: removed if global score smaller than this // b) minSubsetScore: removed if global score smaller than this
// c) maxPeaks (POSE_MAX_PEOPLE): keep first maxPeaks people above thresholds // c) maxPeaks (POSE_MAX_PEOPLE): keep first maxPeaks people above thresholds
int numberPeople; int numberPeople;
std::vector<int> validSubsetIndexes; std::vector<int> validSubsetIndexes;
validSubsetIndexes.reserve(fastMin((size_t)maxPeaks, peopleVector.size())); // validSubsetIndexes.reserve(fastMin((size_t)maxPeaks, peopleVector.size()));
removePeopleBelowThresholds( validSubsetIndexes.reserve(peopleVector.size());
removePeopleBelowThresholdsAndFillFaces(
validSubsetIndexes, numberPeople, peopleVector, numberBodyParts, minSubsetCnt, minSubsetScore, validSubsetIndexes, numberPeople, peopleVector, numberBodyParts, minSubsetCnt, minSubsetScore,
maxPeaks, maximizePositives); maximizePositives, peaksPtr);
// Fill and return poseKeypoints // Fill and return poseKeypoints
peopleVectorToPeopleArray(poseKeypoints, poseScores, scaleFactor, peopleVector, validSubsetIndexes, peopleVectorToPeopleArray(
peaksPtr, numberPeople, numberBodyParts, numberBodyPartPairs); poseKeypoints, poseScores, scaleFactor, peopleVector, validSubsetIndexes, peaksPtr, numberPeople,
numberBodyParts, numberBodyPartPairs);
// Experimental code // Experimental code
if (poseModel == PoseModel::BODY_25D) if (poseModel == PoseModel::BODY_25D)
error("BODY_25D is an experimental branch which is not usable.", __LINE__, __FUNCTION__, __FILE__); error("BODY_25D is an experimental branch which is not usable.", __LINE__, __FUNCTION__, __FILE__);
...@@ -1185,16 +1300,16 @@ namespace op ...@@ -1185,16 +1300,16 @@ namespace op
const unsigned int numberBodyParts, const unsigned int numberBodyPartPairs, const unsigned int numberBodyParts, const unsigned int numberBodyPartPairs,
const Array<double>& precomputedPAFs); const Array<double>& precomputedPAFs);
template OP_API void removePeopleBelowThresholds( template OP_API void removePeopleBelowThresholdsAndFillFaces(
std::vector<int>& validSubsetIndexes, int& numberPeople, std::vector<int>& validSubsetIndexes, int& numberPeople,
const std::vector<std::pair<std::vector<int>, float>>& peopleVector, std::vector<std::pair<std::vector<int>, float>>& peopleVector,
const unsigned int numberBodyParts, const unsigned int numberBodyParts, const int minSubsetCnt, const float minSubsetScore,
const int minSubsetCnt, const float minSubsetScore, const int maxPeaks, const bool maximizePositives); const bool maximizePositives, const float* const peaksPtr);
template OP_API void removePeopleBelowThresholds( template OP_API void removePeopleBelowThresholdsAndFillFaces(
std::vector<int>& validSubsetIndexes, int& numberPeople, std::vector<int>& validSubsetIndexes, int& numberPeople,
const std::vector<std::pair<std::vector<int>, double>>& peopleVector, std::vector<std::pair<std::vector<int>, double>>& peopleVector,
const unsigned int numberBodyParts, const unsigned int numberBodyParts, const int minSubsetCnt, const double minSubsetScore,
const int minSubsetCnt, const double minSubsetScore, const int maxPeaks, const bool maximizePositives); const bool maximizePositives, const double* const peaksPtr);
template OP_API void peopleVectorToPeopleArray( template OP_API void peopleVectorToPeopleArray(
Array<float>& poseKeypoints, Array<float>& poseScores, const float scaleFactor, Array<float>& poseKeypoints, Array<float>& poseScores, const float scaleFactor,
......
...@@ -14,7 +14,7 @@ namespace op ...@@ -14,7 +14,7 @@ namespace op
template <typename T> template <typename T>
inline __device__ T process( inline __device__ T process(
const T* bodyPartA, const T* bodyPartB, const T* mapX, const T* mapY, const int heatmapWidth, const T* bodyPartA, const T* bodyPartB, const T* mapX, const T* mapY, const int heatmapWidth,
const int heatmapHeight, const T interThreshold, const T interMinAboveThreshold) const int heatmapHeight, const T interThreshold, const T interMinAboveThreshold, const T defaultNmsThreshold)
{ {
const auto vectorAToBX = bodyPartB[0] - bodyPartA[0]; const auto vectorAToBX = bodyPartB[0] - bodyPartA[0];
const auto vectorAToBY = bodyPartB[1] - bodyPartA[1]; const auto vectorAToBY = bodyPartB[1] - bodyPartA[1];
...@@ -59,7 +59,7 @@ namespace op ...@@ -59,7 +59,7 @@ namespace op
const auto l2Dist = sqrtf(vectorAToBX*vectorAToBX + vectorAToBY*vectorAToBY); const auto l2Dist = sqrtf(vectorAToBX*vectorAToBX + vectorAToBY*vectorAToBY);
const auto threshold = sqrtf(heatmapWidth*heatmapHeight)/150; // 3.3 for 368x656, 6.6 for 2x resolution const auto threshold = sqrtf(heatmapWidth*heatmapHeight)/150; // 3.3 for 368x656, 6.6 for 2x resolution
if (l2Dist < threshold) if (l2Dist < threshold)
return T(0.15); return T(defaultNmsThreshold+1e-6); // The 1e-6 is required because the score is compared with a strict greater-than
} }
} }
return -1; return -1;
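The reason for the `1e-6` offset can be illustrated with a host-side sketch. It assumes, based only on the comment in the diff, that the returned score is later accepted with a strict `>` against the same threshold. `closeKeypointsScore` and `passesStrictThreshold` are hypothetical names, and only the short-distance branch of `process()` is reproduced.

```cpp
#include <cassert>
#include <cmath>

// Simplified version of the short-distance branch: if two keypoints are closer
// than sqrt(width*height)/150, return a score just above the NMS threshold.
template <typename T>
T closeKeypointsScore(
    const T ax, const T ay, const T bx, const T by, const int heatmapWidth,
    const int heatmapHeight, const T defaultNmsThreshold)
{
    assert(heatmapWidth > 0 && heatmapHeight > 0);
    const T vectorAToBX = bx - ax;
    const T vectorAToBY = by - ay;
    const T l2Dist = std::sqrt(vectorAToBX*vectorAToBX + vectorAToBY*vectorAToBY);
    const T threshold = std::sqrt(T(heatmapWidth) * T(heatmapHeight)) / 150;
    if (l2Dist < threshold)
        return T(defaultNmsThreshold + 1e-6);
    return T(-1);
}

// Assumed consumer: a strict greater-than check, as the source comment suggests.
template <typename T>
bool passesStrictThreshold(const T score, const T defaultNmsThreshold)
{
    return score > defaultNmsThreshold;
}
```

Returning exactly `defaultNmsThreshold` would fail the strict check, which is why the offset is added; the previous hard-coded `0.15` only worked when the threshold in use was below it.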
...@@ -69,7 +69,8 @@ namespace op ...@@ -69,7 +69,8 @@ namespace op
// __global__ void pafScoreKernelOld( // __global__ void pafScoreKernelOld(
// T* pairScoresPtr, const T* const heatMapPtr, const T* const peaksPtr, const unsigned int* const bodyPartPairsPtr, // T* pairScoresPtr, const T* const heatMapPtr, const T* const peaksPtr, const unsigned int* const bodyPartPairsPtr,
// const unsigned int* const mapIdxPtr, const unsigned int maxPeaks, const int numberBodyPartPairs, // const unsigned int* const mapIdxPtr, const unsigned int maxPeaks, const int numberBodyPartPairs,
// const int heatmapWidth, const int heatmapHeight, const T interThreshold, const T interMinAboveThreshold) // const int heatmapWidth, const int heatmapHeight, const T interThreshold, const T interMinAboveThreshold,
// const T defaultNmsThreshold)
// { // {
// const auto pairIndex = (blockIdx.x * blockDim.x) + threadIdx.x; // const auto pairIndex = (blockIdx.x * blockDim.x) + threadIdx.x;
// const auto peakA = (blockIdx.y * blockDim.y) + threadIdx.y; // const auto peakA = (blockIdx.y * blockDim.y) + threadIdx.y;
...@@ -96,7 +97,7 @@ namespace op ...@@ -96,7 +97,7 @@ namespace op
// const T* const mapY = heatMapPtr + mapIdxY*heatmapWidth*heatmapHeight; // const T* const mapY = heatMapPtr + mapIdxY*heatmapWidth*heatmapHeight;
// pairScoresPtr[outputIndex] = process( // pairScoresPtr[outputIndex] = process(
// bodyPartA, bodyPartB, mapX, mapY, heatmapWidth, heatmapHeight, interThreshold, // bodyPartA, bodyPartB, mapX, mapY, heatmapWidth, heatmapHeight, interThreshold,
// interMinAboveThreshold); // interMinAboveThreshold, defaultNmsThreshold);
// } // }
// else // else
// pairScoresPtr[outputIndex] = -1; // pairScoresPtr[outputIndex] = -1;
...@@ -107,7 +108,8 @@ namespace op ...@@ -107,7 +108,8 @@ namespace op
__global__ void pafScoreKernel( __global__ void pafScoreKernel(
T* pairScoresPtr, const T* const heatMapPtr, const T* const peaksPtr, const unsigned int* const bodyPartPairsPtr, T* pairScoresPtr, const T* const heatMapPtr, const T* const peaksPtr, const unsigned int* const bodyPartPairsPtr,
const unsigned int* const mapIdxPtr, const unsigned int maxPeaks, const int numberBodyPartPairs, const unsigned int* const mapIdxPtr, const unsigned int maxPeaks, const int numberBodyPartPairs,
const int heatmapWidth, const int heatmapHeight, const T interThreshold, const T interMinAboveThreshold) const int heatmapWidth, const int heatmapHeight, const T interThreshold, const T interMinAboveThreshold,
const T defaultNmsThreshold)
{ {
const auto peakB = (blockIdx.x * blockDim.x) + threadIdx.x; const auto peakB = (blockIdx.x * blockDim.x) + threadIdx.x;
const auto peakA = (blockIdx.y * blockDim.y) + threadIdx.y; const auto peakA = (blockIdx.y * blockDim.y) + threadIdx.y;
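The `peakB` and `peakA` expressions are the usual CUDA mapping from block and thread coordinates to a global index along each grid axis. A host-side sketch of that arithmetic is given below. `globalIndex` mirrors the expression in the kernel, while `blocksNeeded` is a hypothetical ceiling-division helper; the diff does not show how `numBlocks` is actually computed.

```cpp
#include <cassert>

// Same arithmetic as (blockIdx.x * blockDim.x) + threadIdx.x in the kernel.
unsigned int globalIndex(
    const unsigned int blockIdx, const unsigned int blockDim, const unsigned int threadIdx)
{
    assert(threadIdx < blockDim);
    return blockIdx * blockDim + threadIdx;
}

// Hypothetical helper: number of blocks needed to cover totalThreads elements.
unsigned int blocksNeeded(const unsigned int totalThreads, const unsigned int blockDim)
{
    assert(blockDim > 0);
    return (totalThreads + blockDim - 1) / blockDim;
}
```

Because the block count is rounded up, some threads can fall past the last valid peak, so a kernel using this scheme needs a bounds check before it reads or writes.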
...@@ -135,191 +137,21 @@ namespace op ...@@ -135,191 +137,21 @@ namespace op
const T* const mapY = heatMapPtr + mapIdxY*heatmapWidth*heatmapHeight; const T* const mapY = heatMapPtr + mapIdxY*heatmapWidth*heatmapHeight;
pairScoresPtr[outputIndex] = process( pairScoresPtr[outputIndex] = process(
bodyPartA, bodyPartB, mapX, mapY, heatmapWidth, heatmapHeight, interThreshold, bodyPartA, bodyPartB, mapX, mapY, heatmapWidth, heatmapHeight, interThreshold,
interMinAboveThreshold); interMinAboveThreshold, defaultNmsThreshold);
} }
else else
pairScoresPtr[outputIndex] = -1; pairScoresPtr[outputIndex] = -1;
} }
} }
// template <typename T>
// std::vector<std::pair<std::vector<int>, T>> pafVectorIntoPeopleVectorOld(
// const std::vector<std::tuple<T, T, int, int, int>>& pairConnections, const T* const peaksPtr,
// const int maxPeaks, const std::vector<unsigned int>& bodyPartPairs, const unsigned int numberBodyParts)
// {
// try
// {
// // std::vector<std::pair<std::vector<int>, double>> refers to:
// // - std::vector<int>: [body parts locations, #body parts found]
// // - double: person subset score
// std::vector<std::pair<std::vector<int>, T>> peopleVector;
// const auto vectorSize = numberBodyParts+1;
// const auto peaksOffset = (maxPeaks+1);
// // Save which body parts have been already assigned
// std::vector<int> personAssigned(numberBodyParts*maxPeaks, -1);
// // Iterate over each PAF pair connection detected
// // E.g., neck1-nose2, neck5-Lshoulder0, etc.
// for (const auto& pairConnection : pairConnections)
// {
// // Read pairConnection
// // // Total score - only required for previous sort
// // const auto totalScore = std::get<0>(pairConnection);
// const auto pafScore = std::get<1>(pairConnection);
// const auto pairIndex = std::get<2>(pairConnection);
// const auto indexA = std::get<3>(pairConnection);
// const auto indexB = std::get<4>(pairConnection);
// // Derived data
// const auto bodyPartA = bodyPartPairs[2*pairIndex];
// const auto bodyPartB = bodyPartPairs[2*pairIndex+1];
// const auto indexScoreA = (bodyPartA*peaksOffset + indexA)*3 + 2;
// const auto indexScoreB = (bodyPartB*peaksOffset + indexB)*3 + 2;
// // -1 because indexA and indexB are 1-based
// auto& aAssigned = personAssigned[bodyPartA*maxPeaks+indexA-1];
// auto& bAssigned = personAssigned[bodyPartB*maxPeaks+indexB-1];
// // Debugging
// #ifdef DEBUG
// if (indexA-1 > peaksOffset || indexA <= 0)
// error("Something is wrong: " + std::to_string(indexA)
// + " vs. " + std::to_string(peaksOffset) + ". Contact us.",
// __LINE__, __FUNCTION__, __FILE__);
// if (indexB-1 > peaksOffset || indexB <= 0)
// error("Something is wrong: " + std::to_string(indexB)
// + " vs. " + std::to_string(peaksOffset) + ". Contact us.",
// __LINE__, __FUNCTION__, __FILE__);
// #endif
// // Different cases:
// // 1. A & B not assigned yet: Create new person
// // 2. A assigned but not B: Add B to person with A (if there is no other B there)
// // 3. B assigned but not A: Add A to person with B (if there is no other A there)
// // 4. A & B already assigned to same person (circular/redundant PAF): Update person score
// // 5. A & B already assigned to different people: Merge people if keypoint intersection is null
// // 1. A & B not assigned yet: Create new person
// if (aAssigned < 0 && bAssigned < 0)
// {
// // Keypoint indexes
// std::vector<int> rowVector(vectorSize, 0);
// rowVector[bodyPartA] = indexScoreA;
// rowVector[bodyPartB] = indexScoreB;
// // Number keypoints
// rowVector.back() = 2;
// // Score
// const auto personScore = peaksPtr[indexScoreA] + peaksPtr[indexScoreB] + pafScore;
// // Set associated personAssigned as assigned
// aAssigned = (int)peopleVector.size();
// bAssigned = aAssigned;
// // Create new personVector
// peopleVector.emplace_back(std::make_pair(rowVector, personScore));
// }
// // 2. A assigned but not B: Add B to person with A (if there is no other B there)
// // or
// // 3. B assigned but not A: Add A to person with B (if there is no other A there)
// else if ((aAssigned >= 0 && bAssigned < 0)
// || (aAssigned < 0 && bAssigned >= 0))
// {
// // Assign person1 to one where xAssigned >= 0
// const auto assigned1 = (aAssigned >= 0 ? aAssigned : bAssigned);
// auto& assigned2 = (aAssigned >= 0 ? bAssigned : aAssigned);
// const auto bodyPart2 = (aAssigned >= 0 ? bodyPartB : bodyPartA);
// const auto indexScore2 = (aAssigned >= 0 ? indexScoreB : indexScoreA);
// // Person index
// auto& personVector = peopleVector[assigned1];
// // Debugging
// #ifdef DEBUG
// const auto bodyPart1 = (aAssigned >= 0 ? bodyPartA : bodyPartB);
// const auto indexScore1 = (aAssigned >= 0 ? indexScoreA : indexScoreB);
// const auto index1 = (aAssigned >= 0 ? indexA : indexB);
// if ((unsigned int)personVector.first.at(bodyPart1) != indexScore1)
// error("Something is wrong: "
// + std::to_string((personVector.first[bodyPart1]-2)/3-bodyPart1*peaksOffset)
// + " vs. " + std::to_string((indexScore1-2)/3-bodyPart1*peaksOffset) + " vs. "
// + std::to_string(index1) + ". Contact us.",
// __LINE__, __FUNCTION__, __FILE__);
// #endif
// // If person with 1 does not have a 2 yet
// if (personVector.first[bodyPart2] == 0)
// {
// // Update keypoint indexes
// personVector.first[bodyPart2] = indexScore2;
// // Update number keypoints
// personVector.first.back()++;
// // Update score
// personVector.second += peaksPtr[indexScore2] + pafScore;
// // Set associated personAssigned as assigned
// assigned2 = assigned1;
// }
// // Otherwise, ignore this B because the previous one came from a PAF with a higher confidence score
// }
// // 4. A & B already assigned to same person (circular/redundant PAF): Update person score
// else if (aAssigned >=0 && bAssigned >=0 && aAssigned == bAssigned)
// peopleVector[aAssigned].second += pafScore;
// // 5. A & B already assigned to different people: Merge people if keypoint intersection is null
// // I.e., that the keypoints in person A and B do not overlap
// else if (aAssigned >=0 && bAssigned >=0 && aAssigned != bAssigned)
// {
// // Assign person1 to the one with lowest index for 2 reasons:
// // 1. Speed up: Removing an element from std::vector is cheaper for the last elements
// // 2. Avoid harder index update: Updated elements in personAssigned would depend on
// // whether person1 > person2 or not: element = aAssigned - (person2 > person1 ? 1 : 0)
// const auto assigned1 = (aAssigned < bAssigned ? aAssigned : bAssigned);
// const auto assigned2 = (aAssigned < bAssigned ? bAssigned : aAssigned);
// auto& person1 = peopleVector[assigned1].first;
// const auto& person2 = peopleVector[assigned2].first;
// // Check if complementary
// // Defining found keypoint indexes in personA as kA, and analogously kB
// // Complementary if and only if kA intersection kB = empty. I.e., no common keypoints
// bool complementary = true;
// for (auto part = 0u ; part < numberBodyParts ; part++)
// {
// if (person1[part] > 0 && person2[part] > 0)
// {
// complementary = false;
// break;
// }
// }
// // If complementary, merge both people into 1
// if (complementary)
// {
// // Update keypoint indexes
// for (auto part = 0u ; part < numberBodyParts ; part++)
// if (person1[part] == 0)
// person1[part] = person2[part];
// // Update number keypoints
// person1.back() += person2.back();
// // Update score
// peopleVector[assigned1].second += peopleVector[assigned2].second + pafScore;
// // Erase the non-merged person
// peopleVector.erase(peopleVector.begin()+assigned2);
// // Update associated personAssigned (person indexes have changed)
// for (auto& element : personAssigned)
// {
// if (element == assigned2)
// element = assigned1;
// else if (element > assigned2)
// element--;
// }
// }
// }
// }
// // Return result
// return peopleVector;
// }
// catch (const std::exception& e)
// {
// error(e.what(), __LINE__, __FUNCTION__, __FILE__);
// return {};
// }
// }
template <typename T> template <typename T>
void connectBodyPartsGpu(Array<T>& poseKeypoints, Array<T>& poseScores, const T* const heatMapGpuPtr, void connectBodyPartsGpu(
const T* const peaksPtr, const PoseModel poseModel, const Point<int>& heatMapSize, Array<T>& poseKeypoints, Array<T>& poseScores, const T* const heatMapGpuPtr, const T* const peaksPtr,
const int maxPeaks, const T interMinAboveThreshold, const T interThreshold, const PoseModel poseModel, const Point<int>& heatMapSize, const int maxPeaks, const T interMinAboveThreshold,
const int minSubsetCnt, const T minSubsetScore, const T scaleFactor, const T interThreshold, const int minSubsetCnt, const T minSubsetScore, const T scaleFactor,
const bool maximizePositives, Array<T> pairScoresCpu, T* pairScoresGpuPtr, const bool maximizePositives, Array<T> pairScoresCpu, T* pairScoresGpuPtr,
const unsigned int* const bodyPartPairsGpuPtr, const unsigned int* const mapIdxGpuPtr, const unsigned int* const bodyPartPairsGpuPtr, const unsigned int* const mapIdxGpuPtr,
const T* const peaksGpuPtr) const T* const peaksGpuPtr, const T defaultNmsThreshold)
{ {
try try
{ {
...@@ -352,27 +184,10 @@ namespace op ...@@ -352,27 +184,10 @@ namespace op
// pafScoreKernelOld<<<numBlocks, THREADS_PER_BLOCK>>>( // pafScoreKernelOld<<<numBlocks, THREADS_PER_BLOCK>>>(
// pairScoresGpuPtr, heatMapGpuPtr, peaksGpuPtr, bodyPartPairsGpuPtr, mapIdxGpuPtr, // pairScoresGpuPtr, heatMapGpuPtr, peaksGpuPtr, bodyPartPairsGpuPtr, mapIdxGpuPtr,
// maxPeaks, (int)numberBodyPartPairs, heatMapSize.x, heatMapSize.y, interThreshold, // maxPeaks, (int)numberBodyPartPairs, heatMapSize.x, heatMapSize.y, interThreshold,
// interMinAboveThreshold); // interMinAboveThreshold, defaultNmsThreshold);
// // pairScoresCpu <-- pairScoresGpu // // pairScoresCpu <-- pairScoresGpu
// cudaMemcpy(pairScoresCpu.getPtr(), pairScoresGpuPtr, totalComputations * sizeof(T), // cudaMemcpy(pairScoresCpu.getPtr(), pairScoresGpuPtr, totalComputations * sizeof(T),
// cudaMemcpyDeviceToHost); // cudaMemcpyDeviceToHost);
// // Get pair connections and their scores
// const auto pairConnections = pafPtrIntoVector(
// pairScoresCpu, peaksPtr, maxPeaks, bodyPartPairs, numberBodyPartPairs);
// const auto peopleVector = pafVectorIntoPeopleVectorOld(
// pairConnections, peaksPtr, maxPeaks, bodyPartPairs, numberBodyParts);
// // Delete people below the following thresholds:
// // a) minSubsetCnt: removed if less than minSubsetCnt body parts
// // b) minSubsetScore: removed if global score smaller than this
// // c) maxPeaks (POSE_MAX_PEOPLE): keep first maxPeaks people above thresholds
// int numberPeople;
// std::vector<int> validSubsetIndexes;
// validSubsetIndexes.reserve(fastMin((size_t)maxPeaks, peopleVector.size()));
// removePeopleBelowThresholds(validSubsetIndexes, numberPeople, peopleVector, numberBodyParts, minSubsetCnt,
// minSubsetScore, maxPeaks, maximizePositives);
// // Fill and return poseKeypoints
// peopleVectorToPeopleArray(poseKeypoints, poseScores, scaleFactor, peopleVector, validSubsetIndexes,
// peaksPtr, numberPeople, numberBodyParts, numberBodyPartPairs);
// OP_PROFILE_END(timeNormalize1, 1e3, REPS); // OP_PROFILE_END(timeNormalize1, 1e3, REPS);
// Efficient code // Efficient code
...@@ -386,14 +201,16 @@ namespace op ...@@ -386,14 +201,16 @@ namespace op
pafScoreKernel<<<numBlocks, THREADS_PER_BLOCK>>>( pafScoreKernel<<<numBlocks, THREADS_PER_BLOCK>>>(
pairScoresGpuPtr, heatMapGpuPtr, peaksGpuPtr, bodyPartPairsGpuPtr, mapIdxGpuPtr, pairScoresGpuPtr, heatMapGpuPtr, peaksGpuPtr, bodyPartPairsGpuPtr, mapIdxGpuPtr,
maxPeaks, (int)numberBodyPartPairs, heatMapSize.x, heatMapSize.y, interThreshold, maxPeaks, (int)numberBodyPartPairs, heatMapSize.x, heatMapSize.y, interThreshold,
interMinAboveThreshold); interMinAboveThreshold, defaultNmsThreshold);
// pairScoresCpu <-- pairScoresGpu // pairScoresCpu <-- pairScoresGpu
cudaMemcpy(pairScoresCpu.getPtr(), pairScoresGpuPtr, totalComputations * sizeof(T), cudaMemcpy(pairScoresCpu.getPtr(), pairScoresGpuPtr, totalComputations * sizeof(T),
cudaMemcpyDeviceToHost); cudaMemcpyDeviceToHost);
// OP_PROFILE_END(timeNormalize2, 1e3, REPS);
// Get pair connections and their scores // Get pair connections and their scores
const auto pairConnections = pafPtrIntoVector( const auto pairConnections = pafPtrIntoVector(
pairScoresCpu, peaksPtr, maxPeaks, bodyPartPairs, numberBodyPartPairs); pairScoresCpu, peaksPtr, maxPeaks, bodyPartPairs, numberBodyPartPairs);
const auto peopleVector = pafVectorIntoPeopleVector( auto peopleVector = pafVectorIntoPeopleVector(
pairConnections, peaksPtr, maxPeaks, bodyPartPairs, numberBodyParts); pairConnections, peaksPtr, maxPeaks, bodyPartPairs, numberBodyParts);
// // Old code: Get pair connections and their scores // // Old code: Get pair connections and their scores
// // std::vector<std::pair<std::vector<int>, double>> refers to: // // std::vector<std::pair<std::vector<int>, double>> refers to:
...@@ -409,13 +226,15 @@ namespace op ...@@ -409,13 +226,15 @@ namespace op
// c) maxPeaks (POSE_MAX_PEOPLE): keep first maxPeaks people above thresholds // c) maxPeaks (POSE_MAX_PEOPLE): keep first maxPeaks people above thresholds
int numberPeople; int numberPeople;
std::vector<int> validSubsetIndexes; std::vector<int> validSubsetIndexes;
validSubsetIndexes.reserve(fastMin((size_t)maxPeaks, peopleVector.size())); // validSubsetIndexes.reserve(fastMin((size_t)maxPeaks, peopleVector.size()));
removePeopleBelowThresholds(validSubsetIndexes, numberPeople, peopleVector, numberBodyParts, minSubsetCnt, validSubsetIndexes.reserve(peopleVector.size());
minSubsetScore, maxPeaks, maximizePositives); removePeopleBelowThresholdsAndFillFaces(
validSubsetIndexes, numberPeople, peopleVector, numberBodyParts, minSubsetCnt, minSubsetScore,
maximizePositives, peaksPtr);
// Fill and return poseKeypoints // Fill and return poseKeypoints
peopleVectorToPeopleArray(poseKeypoints, poseScores, scaleFactor, peopleVector, validSubsetIndexes, peopleVectorToPeopleArray(
peaksPtr, numberPeople, numberBodyParts, numberBodyPartPairs); poseKeypoints, poseScores, scaleFactor, peopleVector, validSubsetIndexes, peaksPtr, numberPeople,
// OP_PROFILE_END(timeNormalize2, 1e3, REPS); numberBodyParts, numberBodyPartPairs);
// // Profiling verbose // // Profiling verbose
// log(" BPC(ori)=" + std::to_string(timeNormalize1) + "ms"); // log(" BPC(ori)=" + std::to_string(timeNormalize1) + "ms");
...@@ -436,12 +255,12 @@ namespace op ...@@ -436,12 +255,12 @@ namespace op
const float interMinAboveThreshold, const float interThreshold, const int minSubsetCnt, const float interMinAboveThreshold, const float interThreshold, const int minSubsetCnt,
const float minSubsetScore, const float scaleFactor, const bool maximizePositives, const float minSubsetScore, const float scaleFactor, const bool maximizePositives,
Array<float> pairScoresCpu, float* pairScoresGpuPtr, const unsigned int* const bodyPartPairsGpuPtr, Array<float> pairScoresCpu, float* pairScoresGpuPtr, const unsigned int* const bodyPartPairsGpuPtr,
const unsigned int* const mapIdxGpuPtr, const float* const peaksGpuPtr); const unsigned int* const mapIdxGpuPtr, const float* const peaksGpuPtr, const float defaultNmsThreshold);
template void connectBodyPartsGpu( template void connectBodyPartsGpu(
Array<double>& poseKeypoints, Array<double>& poseScores, const double* const heatMapGpuPtr, Array<double>& poseKeypoints, Array<double>& poseScores, const double* const heatMapGpuPtr,
const double* const peaksPtr, const PoseModel poseModel, const Point<int>& heatMapSize, const int maxPeaks, const double* const peaksPtr, const PoseModel poseModel, const Point<int>& heatMapSize, const int maxPeaks,
const double interMinAboveThreshold, const double interThreshold, const int minSubsetCnt, const double interMinAboveThreshold, const double interThreshold, const int minSubsetCnt,
const double minSubsetScore, const double scaleFactor, const bool maximizePositives, const double minSubsetScore, const double scaleFactor, const bool maximizePositives,
Array<double> pairScoresCpu, double* pairScoresGpuPtr, const unsigned int* const bodyPartPairsGpuPtr, Array<double> pairScoresCpu, double* pairScoresGpuPtr, const unsigned int* const bodyPartPairsGpuPtr,
const unsigned int* const mapIdxGpuPtr, const double* const peaksGpuPtr); const unsigned int* const mapIdxGpuPtr, const double* const peaksGpuPtr, const double defaultNmsThreshold);
} }
...@@ -156,16 +156,15 @@ namespace op ...@@ -156,16 +156,15 @@ namespace op
pairScoresGpuPtrBuffer, heatMapGpuPtrBuffer, peaksGpuPtrBuffer, bodyPartPairsGpuPtrBuffer, mapIdxGpuPtrBuffer, pairScoresGpuPtrBuffer, heatMapGpuPtrBuffer, peaksGpuPtrBuffer, bodyPartPairsGpuPtrBuffer, mapIdxGpuPtrBuffer,
maxPeaks, (int)numberBodyPartPairs, heatMapSize.x, heatMapSize.y, interThreshold, maxPeaks, (int)numberBodyPartPairs, heatMapSize.x, heatMapSize.y, interThreshold,
interMinAboveThreshold); interMinAboveThreshold);
OpenCL::getInstance(gpuID)->getQueue().enqueueReadBuffer(pairScoresGpuPtrBuffer, CL_TRUE, 0, OpenCL::getInstance(gpuID)->getQueue().enqueueReadBuffer(
totalComputations * sizeof(T), pairScoresCpu.getPtr()); pairScoresGpuPtrBuffer, CL_TRUE, 0, totalComputations * sizeof(T), pairScoresCpu.getPtr());
// New code // New code
// Get pair connections and their scores // Get pair connections and their scores
const auto pairConnections = pafPtrIntoVector( const auto pairConnections = pafPtrIntoVector(
pairScoresCpu, peaksPtr, maxPeaks, bodyPartPairs, numberBodyPartPairs); pairScoresCpu, peaksPtr, maxPeaks, bodyPartPairs, numberBodyPartPairs);
const auto peopleVector = pafVectorIntoPeopleVector( auto peopleVector = pafVectorIntoPeopleVector(
pairConnections, peaksPtr, maxPeaks, bodyPartPairs, numberBodyParts); pairConnections, peaksPtr, maxPeaks, bodyPartPairs, numberBodyParts);
// // Old code // // Old code
// // Get pair connections and their scores // // Get pair connections and their scores
// // std::vector<std::pair<std::vector<int>, double>> refers to: // // std::vector<std::pair<std::vector<int>, double>> refers to:
...@@ -175,7 +174,6 @@ namespace op ...@@ -175,7 +174,6 @@ namespace op
// const auto peopleVector = createPeopleVector( // const auto peopleVector = createPeopleVector(
// tNullptr, peaksPtr, poseModel, heatMapSize, maxPeaks, interThreshold, interMinAboveThreshold, // tNullptr, peaksPtr, poseModel, heatMapSize, maxPeaks, interThreshold, interMinAboveThreshold,
// bodyPartPairs, numberBodyParts, numberBodyPartPairs, pairScoresCpu); // bodyPartPairs, numberBodyParts, numberBodyPartPairs, pairScoresCpu);
// Delete people below the following thresholds: // Delete people below the following thresholds:
// a) minSubsetCnt: removed if less than minSubsetCnt body parts // a) minSubsetCnt: removed if less than minSubsetCnt body parts
// b) minSubsetScore: removed if global score smaller than this // b) minSubsetScore: removed if global score smaller than this
@@ -183,15 +181,13 @@ namespace op
     int numberPeople;
     std::vector<int> validSubsetIndexes;
     validSubsetIndexes.reserve(fastMin((size_t)maxPeaks, peopleVector.size()));
-    removePeopleBelowThresholds(validSubsetIndexes, numberPeople, peopleVector, numberBodyParts, minSubsetCnt,
-                                minSubsetScore, maxPeaks, maximizePositives);
+    removePeopleBelowThresholdsAndFillFaces(
+        validSubsetIndexes, numberPeople, peopleVector, numberBodyParts, minSubsetCnt, minSubsetScore,
+        maximizePositives, peaksPtr);
     // Fill and return poseKeypoints
-    peopleVectorToPeopleArray(poseKeypoints, poseScores, scaleFactor, peopleVector, validSubsetIndexes,
-                              peaksPtr, numberPeople, numberBodyParts, numberBodyPartPairs);
+    peopleVectorToPeopleArray(
+        poseKeypoints, poseScores, scaleFactor, peopleVector, validSubsetIndexes, peaksPtr, numberPeople,
+        numberBodyParts, numberBodyPartPairs);
-    // // Sanity check
-    // cudaCheck(__LINE__, __FUNCTION__, __FILE__);
 #else
     UNUSED(poseKeypoints);
     UNUSED(poseScores);
......
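The threshold comments in the hunk above describe a filter over candidate people: drop anyone with too few body parts or too low a score. A minimal sketch of that idea follows. It is not the OpenPose implementation: `PersonCandidate` and `filterPeople` are hypothetical names, and reading the "global score" as the mean score per detected part is an assumption the diff does not confirm.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical summary of one candidate person (not an OpenPose type).
struct PersonCandidate
{
    int partCount;    // number of detected body parts
    float totalScore; // accumulated score of this candidate
};

// Returns the indexes of the candidates that pass both thresholds:
//   a) at least minSubsetCnt body parts
//   b) mean score per part >= minSubsetScore (assumed meaning of "global score")
std::vector<int> filterPeople(
    const std::vector<PersonCandidate>& people, const int minSubsetCnt, const float minSubsetScore)
{
    // With minSubsetCnt >= 1, every surviving candidate has partCount >= 1,
    // so the division below is safe.
    assert(minSubsetCnt >= 1);
    std::vector<int> validIndexes;
    validIndexes.reserve(people.size());
    for (std::size_t i = 0; i < people.size(); ++i)
    {
        const auto& person = people[i];
        if (person.partCount < minSubsetCnt)
            continue;
        const float meanScore = person.totalScore / static_cast<float>(person.partCount);
        if (meanScore < minSubsetScore)
            continue;
        validIndexes.push_back(static_cast<int>(i));
    }
    return validIndexes;
}
```

Returning indexes rather than copying the survivors mirrors the `validSubsetIndexes` vector in the diff, which is later used to pick rows out of `peopleVector`.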
@@ -108,6 +108,19 @@ namespace op
         }
     }
+    template <typename T>
+    void BodyPartConnectorCaffe<T>::setDefaultNmsThreshold(const T defaultNmsThreshold)
+    {
+        try
+        {
+            mDefaultNmsThreshold = {defaultNmsThreshold};
+        }
+        catch (const std::exception& e)
+        {
+            error(e.what(), __LINE__, __FUNCTION__, __FILE__);
+        }
+    }
     template <typename T>
     void BodyPartConnectorCaffe<T>::setInterMinAboveThreshold(const T interMinAboveThreshold)
     {
@@ -300,8 +313,8 @@ namespace op
     }
     template <typename T>
-    void BodyPartConnectorCaffe<T>::Forward_gpu(const std::vector<ArrayCpuGpu<T>*>& bottom, Array<T>& poseKeypoints,
-                                                Array<T>& poseScores)
+    void BodyPartConnectorCaffe<T>::Forward_gpu(
+        const std::vector<ArrayCpuGpu<T>*>& bottom, Array<T>& poseKeypoints, Array<T>& poseScores)
     {
         try
         {
@@ -354,12 +367,12 @@ namespace op
     }
     // Run body part connector
-    connectBodyPartsGpu(poseKeypoints, poseScores, heatMapsGpuPtr, peaksPtr, mPoseModel,
-                        Point<int>{heatMapsBlob->shape(3), heatMapsBlob->shape(2)},
-                        maxPeaks, mInterMinAboveThreshold, mInterThreshold,
-                        mMinSubsetCnt, mMinSubsetScore, mScaleNetToOutput, mMaximizePositives,
-                        mFinalOutputCpu, pFinalOutputGpuPtr, pBodyPartPairsGpuPtr, pMapIdxGpuPtr,
-                        peaksGpuPtr);
+    connectBodyPartsGpu(
+        poseKeypoints, poseScores, heatMapsGpuPtr, peaksPtr, mPoseModel,
+        Point<int>{heatMapsBlob->shape(3), heatMapsBlob->shape(2)}, maxPeaks, mInterMinAboveThreshold,
+        mInterThreshold, mMinSubsetCnt, mMinSubsetScore, mScaleNetToOutput, mMaximizePositives,
+        mFinalOutputCpu, pFinalOutputGpuPtr, pBodyPartPairsGpuPtr, pMapIdxGpuPtr, peaksGpuPtr,
+        mDefaultNmsThreshold);
 #else
     UNUSED(bottom);
     UNUSED(poseKeypoints);
......
@@ -317,6 +317,7 @@ namespace op
     // OP_CUDA_PROFILE_END(timeNormalize3, 1e3, REPS);
     // OP_CUDA_PROFILE_INIT(REPS);
     spBodyPartConnectorCaffe->setScaleNetToOutput(mScaleNetToOutput);
+    spBodyPartConnectorCaffe->setDefaultNmsThreshold((float)get(PoseProperty::NMSThreshold));
     spBodyPartConnectorCaffe->setInterMinAboveThreshold(
         (float)get(PoseProperty::ConnectInterMinAboveThreshold));
     spBodyPartConnectorCaffe->setInterThreshold((float)get(PoseProperty::ConnectInterThreshold));
......
@@ -174,9 +174,10 @@ namespace op
         const double offsetY);
     template <typename T>
-    void renderKeypointsCpu(Array<T>& frameArray, const Array<T>& keypoints, const std::vector<unsigned int>& pairs,
-                            const std::vector<T> colors, const T thicknessCircleRatio,
-                            const T thicknessLineRatioWRTCircle, const std::vector<T>& poseScales, const T threshold)
+    void renderKeypointsCpu(
+        Array<T>& frameArray, const Array<T>& keypoints, const std::vector<unsigned int>& pairs,
+        const std::vector<T> colors, const T thicknessCircleRatio, const T thicknessLineRatioWRTCircle,
+        const std::vector<T>& poseScales, const T threshold)
     {
         try
         {
@@ -209,8 +210,9 @@ namespace op
     const auto personRectangle = getKeypointsRectangle(keypoints, person, thresholdRectangle);
     if (personRectangle.area() > 0)
     {
-        const auto ratioAreas = fastMin(T(1), fastMax(personRectangle.width/(T)width,
-                                                      personRectangle.height/(T)height));
+        const auto ratioAreas = fastMin(
+            T(1), fastMax(
+                personRectangle.width/(T)width, personRectangle.height/(T)height));
         // Size-dependent variables
         const auto thicknessRatio = fastMax(
             positiveIntRound(std::sqrt(area)* thicknessCircleRatio * ratioAreas), 2);
@@ -283,21 +285,32 @@ namespace op
         const std::vector<double>& poseScales, const double threshold);
     template <typename T>
-    Rectangle<T> getKeypointsRectangle(const Array<T>& keypoints, const int person, const T threshold)
+    Rectangle<T> getKeypointsRectangle(
+        const Array<T>& keypoints, const int person, const T threshold, const int firstIndex, const int lastIndex)
     {
         try
         {
+            // Params
             const auto numberKeypoints = keypoints.getSize(1);
-            // Sanity check
+            const auto lastIndexClean = (lastIndex < 0 ? numberKeypoints : lastIndex);
+            // Sanity checks
             if (numberKeypoints < 1)
                 error("Number body parts must be > 0.", __LINE__, __FUNCTION__, __FILE__);
+            if (lastIndexClean > numberKeypoints)
+                error("The value of `lastIndex` must be less or equal than `numberKeypoints`. Currently: "
+                      + std::to_string(lastIndexClean) + " vs. " + std::to_string(numberKeypoints),
+                      __LINE__, __FUNCTION__, __FILE__);
+            if (firstIndex > lastIndexClean)
+                error("The value of `firstIndex` must be less or equal than `lastIndex`. Currently: "
+                      + std::to_string(firstIndex) + " vs. " + std::to_string(lastIndex),
+                      __LINE__, __FUNCTION__, __FILE__);
             // Define keypointPtr
             const auto keypointPtr = keypoints.getConstPtr() + person * keypoints.getSize(1) * keypoints.getSize(2);
             T minX = std::numeric_limits<T>::max();
             T maxX = std::numeric_limits<T>::lowest();
             T minY = minX;
             T maxY = maxX;
-            for (auto part = 0 ; part < numberKeypoints ; part++)
+            for (auto part = firstIndex ; part < lastIndexClean ; part++)
             {
                 const auto score = keypointPtr[3*part + 2];
                 if (score > threshold)
@@ -328,9 +341,11 @@ namespace op
         }
     }
     template OP_API Rectangle<float> getKeypointsRectangle(
-        const Array<float>& keypoints, const int person, const float threshold);
+        const Array<float>& keypoints, const int person, const float threshold, const int firstIndex,
+        const int lastIndex);
     template OP_API Rectangle<double> getKeypointsRectangle(
-        const Array<double>& keypoints, const int person, const double threshold);
+        const Array<double>& keypoints, const int person, const double threshold, const int firstIndex,
+        const int lastIndex);
     template <typename T>
     T getAverageScore(const Array<T>& keypoints, const int person)
......
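The `getKeypointsRectangle` change above restricts the bounding box to keypoints in the half-open range `[firstIndex, lastIndex)`, with a negative `lastIndex` meaning "up to the last keypoint". The standalone sketch below illustrates that behavior without any OpenPose types. `Rect` and `keypointsRectangle` are hypothetical stand-ins, the default argument values are assumptions (the diff does not show the header), and throwing `std::invalid_argument` replaces the library's own `error(...)` helper.

```cpp
#include <cassert>
#include <limits>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for op::Rectangle<float>.
struct Rect
{
    float x, y, width, height;
};

// keypoints: flat (x, y, score) triplets for a single person.
// Only parts in [firstIndex, lastIndex) with score > threshold are considered.
Rect keypointsRectangle(
    const std::vector<float>& keypoints, const float threshold, const int firstIndex = 0, const int lastIndex = -1)
{
    assert(keypoints.size() % 3 == 0);
    const int numberKeypoints = static_cast<int>(keypoints.size() / 3);
    // A negative lastIndex selects every keypoint up to the end
    const int lastIndexClean = (lastIndex < 0 ? numberKeypoints : lastIndex);
    // Sanity checks on the requested range
    if (firstIndex < 0 || lastIndexClean > numberKeypoints || firstIndex > lastIndexClean)
        throw std::invalid_argument("Invalid keypoint index range.");
    float minX = std::numeric_limits<float>::max();
    float maxX = std::numeric_limits<float>::lowest();
    float minY = minX;
    float maxY = maxX;
    for (int part = firstIndex; part < lastIndexClean; ++part)
    {
        const float score = keypoints[3 * part + 2];
        if (score > threshold)
        {
            const float x = keypoints[3 * part];
            const float y = keypoints[3 * part + 1];
            if (x < minX) minX = x;
            if (x > maxX) maxX = x;
            if (y < minY) minY = y;
            if (y > maxY) maxY = y;
        }
    }
    // No keypoint passed the threshold: return an empty rectangle
    if (maxX < minX || maxY < minY)
        return Rect{0.f, 0.f, 0.f, 0.f};
    return Rect{minX, minY, maxX - minX, maxY - minY};
}
```

With three keypoints at (0, 0), (10, 20), and (100, 100), calling `keypointsRectangle(kp, 0.1f, 0, 2)` would cover only the first two, which is the kind of sub-range query (for example, a face-only box) that the new parameters make possible.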