Unverified commit 03932a84, authored by lidanqing, committed by GitHub

Update mkldnn_quant readme (#657)

* Update mkldnn_quant readme
Co-authored-by: cc <52520497+juncaipeng@users.noreply.github.com>
Parent 22c4608d
@@ -9,6 +9,7 @@ message("flags" ${CMAKE_CXX_FLAGS})
if(NOT DEFINED PADDLE_LIB)
  message(FATAL_ERROR "please set PADDLE_LIB with -DPADDLE_LIB=/path/paddle/lib")
endif()
+set(DEMO_NAME sample_tester)
if(NOT DEFINED DEMO_NAME)
  message(FATAL_ERROR "please set DEMO_NAME with -DDEMO_NAME=demo_name")
endif()
......
@@ -118,20 +118,17 @@ val/ILSVRC2012_val_00000002.jpg 0
- Users can also download the published [inference library](https://www.paddlepaddle.org.cn/documentation/docs/zh/develop/advanced_guide/inference_deployment/inference/build_and_install_lib_cn.html) from the Paddle website. Please choose the latest release or the develop version of `ubuntu14.04_cpu_avx_mkl`.
-You can unpack the prepared inference library, rename it to fluid_inference and place it in the current directory (`/PATH_TO_PaddleSlim/demo/mkldnn_quant/`), or specify the location of the Paddle inference library by setting PADDLE_ROOT at cmake time.
#### Compile the application
-The sample directory is `demo/mkldnn_quant/` under PaddleSlim; the sample `sample_tester.cc` and the `cmake` folder needed for compilation are both located there.
+The sample `sample_tester.cc` is located in `demo/mkldnn_quant/` under PaddleSlim. When compiling, set `PADDLE_LIB` to the inference library built from Paddle source or to the downloaded inference library.
```
cd /PATH/TO/PaddleSlim
cd demo/mkldnn_quant/
-mkdir build
-cd build
-cmake -DPADDLE_ROOT=$PADDLE_ROOT ..
+mkdir build && cd build
+cmake -DPADDLE_LIB=path/to/paddle_inference_install_dir ..
make -j
```
-If you downloaded and unpacked the [inference library](https://www.paddlepaddle.org.cn/documentation/docs/zh/develop/advanced_guide/inference_deployment/inference/build_and_install_lib_cn.html) from the official website into the current directory, `-DPADDLE_ROOT` can be left unset, because `-DPADDLE_ROOT` defaults to `demo/mkldnn_quant/fluid_inference`.
#### Run the test
```
@@ -140,26 +137,24 @@ export KMP_AFFINITY=granularity=fine,compact,1,0
export KMP_BLOCKTIME=1
# Turbo Boost could be set to OFF using the command
echo 1 | sudo tee /sys/devices/system/cpu/intel_pstate/no_turbo
-# In the file run.sh, set `MODEL_DIR` to `/PATH/TO/FLOAT32/MODEL` or `/PATH/TO/SAVE/INT8/MODEL`
-# In the file run.sh, set `DATA_FILE` to `/PATH/TO/SAVE/BINARY/FILE`
-# For 1 thread performance:
-./run.sh
-# For 20 thread performance:
-./run.sh -1 20
+# For 1 thread performance; by default the script uses 1 thread
+# Set `MODEL_DIR` to `/PATH/TO/FLOAT32/MODEL` or `/PATH/TO/SAVE/INT8/MODEL`
+# Set `DATA_FILE` to `/PATH/TO/SAVE/BINARY/FILE`
+./run.sh path/to/MODEL_DIR path/to/DATA_FILE
+# For 20 thread performance, set the third parameter to 20
+./run.sh path/to/MODEL_DIR path/to/DATA_FILE 20
```
运行时需要配置以下参数 `run.sh`中所有可选配置参数注释
- **infer_model:** 模型所在目录,注意模型参数当前必须是分开保存成多个文件的。可以设置为`PATH/TO/SAVE/INT8/MODEL`, `PATH/TO/SAVE/FLOAT32/MODEL`。无默认值。 - **infer_model:** 模型所在目录,注意模型参数当前必须是分开保存成多个文件的。可以设置为`PATH/TO/SAVE/INT8/MODEL`, `PATH/TO/SAVE/FLOAT32/MODEL`。无默认值。
- **infer_data:** 测试数据文件所在路径。注意需要是经`full_ILSVRC2012_val_preprocess`转化后的binary文件。 - **infer_data:** 测试数据文件所在路径。注意需要是经`full_ILSVRC2012_val_preprocess`转化后的binary文件。
- **batch_size:** 预测batch size大小。默认值为50 - **batch_size:** 预测batch size大小。默认值为1
- **iterations:** batches迭代数。默认为0,0表示预测infer_data中所有batches (image numbers/batch_size) - **iterations:** batches迭代数。默认为0,0表示预测infer_data中所有batches (image numbers/batch_size)
- **num_threads:** 预测使用CPU 线程数,默认为单核一个线程。 - **num_threads:** 预测使用CPU 线程数,默认为单核一个线程。
- **with_accuracy_layer:** 模型为包含精度计算层的测试模型还是不包含精度计算层的预测模型,默认为true。 - **with_accuracy_layer:** 模型为包含精度计算层的测试模型还是不包含精度计算层的预测模型,默认为true。
- **use_analysis** 是否使用`paddle::AnalysisConfig`对模型优化、融合(fuse),加速。默认为false - **use_analysis** 是否使用`paddle::AnalysisConfig`对模型优化、融合(fuse),加速。默认为false
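For orientation, here is a minimal sketch of how the options above could map onto gflags definitions in `sample_tester.cc`. Only `with_accuracy_layer` and `use_analysis` are visible in the `sample_tester.cc` diff later in this commit; the remaining names, types and defaults are assumptions for illustration.

```
#include <gflags/gflags.h>

// Illustrative sketch only: flag names follow the option list above; apart from
// with_accuracy_layer and use_analysis (shown in the sample_tester.cc diff below),
// the exact definitions and defaults are assumptions.
DEFINE_string(infer_model, "", "Directory of the tested model (separately saved parameter files)");
DEFINE_string(infer_data, "", "Binary data file produced by full_ILSVRC2012_val_preprocess");
DEFINE_int32(batch_size, 1, "Inference batch size");
DEFINE_int32(iterations, 0, "0 means running all batches in infer_data");
DEFINE_int32(num_threads, 1, "Number of CPU threads");
DEFINE_bool(with_accuracy_layer, true,
            "Set with_accuracy_layer to true if provided model has accuracy "
            "layer and requires label input");
DEFINE_bool(use_analysis, false,
            "If use_analysis is set to true, the model will be optimized");

int main(int argc, char *argv[]) {
  // Parses --infer_model=..., --batch_size=..., etc. from the command line.
  gflags::ParseCommandLineFlags(&argc, &argv, true);
  return 0;
}
```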
-You can directly modify MODEL_DIR and DATA_DIR in `run.sh` under the `/PATH_TO_PaddleSlim/demo/mkldnn_quant/` directory and then execute `./run.sh` to run CPU inference.
-### 4.3 Writing your own tests:
+### 4.3 Writing your own tests
If users write their own tests:
1. Test the INT8 model
@@ -180,7 +175,7 @@ static void SetConfig(paddle::AnalysisConfig *cfg) {
- If infer_model points to a quant model produced by PaddleSlim, `use_analysis` has no effect even when set to true, because the quant model contains fake_quantize/fake_dequantize ops that cannot be fused or optimized.
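As a rough illustration of the `use_analysis=true` path described above, the following sketch configures `paddle::AnalysisConfig` for CPU inference with MKL-DNN. The method names follow the `SetIrOptimConfig` context shown in the `sample_tester.cc` diff in this commit; the helper name, paths and thread count are placeholders, not the demo's exact code.

```
#include <memory>
#include <string>
#include "paddle_inference_api.h"

// Minimal sketch (assumptions noted above): build an optimized CPU predictor.
std::unique_ptr<paddle::PaddlePredictor> CreateOptimizedPredictor(
    const std::string &model_dir, int num_threads) {
  paddle::AnalysisConfig cfg;
  cfg.SetModel(model_dir);                       // model directory with separately saved parameters
  cfg.DisableGpu();                              // CPU-only inference
  cfg.SwitchIrOptim();                           // enable IR optimization / fuse passes
  cfg.EnableMKLDNN();                            // execute fused ops with MKL-DNN kernels
  cfg.SetCpuMathLibraryNumThreads(num_threads);  // matches the num_threads option
  return paddle::CreatePaddlePredictor<paddle::AnalysisConfig>(cfg);
}
```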
## 5. Accuracy and performance data
-For INT8 model accuracy and performance results, see [Accuracy and performance of INT8 models for CPU inference deployment](https://github.com/PaddlePaddle/PaddleSlim/tree/develop/docs/zh_cn/tutorials/image_classification_mkldnn_quant_tutorial.md)
+For INT8 model accuracy and performance results, see [Accuracy and performance of INT8 models for CPU inference deployment](https://github.com/PaddlePaddle/PaddleSlim/blob/release/2.0-alpha/docs/zh_cn/tutorials/image_classification_mkldnn_quant_tutorial.md)
## FAQ
......
@@ -114,20 +114,18 @@ Users can compile the Paddle inference library from the source code or download
- For instructions on how to compile the Paddle inference library from source, see [Compile from Source](https://www.paddlepaddle.org.cn/documentation/docs/zh/develop/advanced_guide/inference_deployment/inference/build_and_install_lib_cn.html#id12); check out the release/2.0 or develop branch and compile it.
-- Users can also download the published [inference library](https://www.paddlepaddle.org.cn/documentation/docs/zh/develop/advanced_guide/inference_deployment/inference/build_and_install_lib_cn.html). Please select the `ubuntu14.04_cpu_avx_mkl` latest release or develop version. The downloaded library has to be decompressed, renamed into a `fluid_inference` directory and placed in the current directory (`/PATH_TO_PaddleSlim/demo/mkldnn_quant/`) for the library to be available. Another option is to set the `PADDLE_ROOT` cmake variable to the `fluid_inference` directory location to link the tests with the Paddle inference library properly.
+- Users can also download the published [inference library](https://www.paddlepaddle.org.cn/documentation/docs/zh/develop/advanced_guide/inference_deployment/inference/build_and_install_lib_cn.html). Please select the `ubuntu14.04_cpu_avx_mkl` latest release or develop version.
#### Compile the application
-The source code file of the sample test (`sample_tester.cc`) and the `cmake` files are all located in the `demo/mkldnn_quant/` directory.
+The source code file of the sample test (`sample_tester.cc`) is located in the `demo/mkldnn_quant/` directory.
```
cd /PATH/TO/PaddleSlim
cd demo/mkldnn_quant/
-mkdir build
-cd build
-cmake -DPADDLE_ROOT=$PADDLE_ROOT ..
+mkdir build && cd build
+cmake -DPADDLE_LIB=path/to/paddle_inference_install_dir ..
make -j
```
-- `-DPADDLE_ROOT` default value is `demo/mkldnn_quant/fluid_inference`. If users download and unzip the [inference library from the official website](https://www.paddlepaddle.org.cn/documentation/docs/zh/develop/advanced_guide/inference_deployment/inference/build_and_install_lib_cn.html) into the current directory `demo/mkldnn_quant/`, they can skip this option.
#### Run the test
```
@@ -136,25 +134,23 @@ export KMP_AFFINITY=granularity=fine,compact,1,0
export KMP_BLOCKTIME=1
# Turbo Boost could be set to OFF using the command
echo 1 | sudo tee /sys/devices/system/cpu/intel_pstate/no_turbo
-# In the file run.sh, set `MODEL_DIR` to `/PATH/TO/FLOAT32/MODEL` or `/PATH/TO/SAVE/INT8/MODEL`
-# In the file run.sh, set `DATA_FILE` to `/PATH/TO/SAVE/BINARY/FILE`
-# For 1 thread performance:
-./run.sh
-# For 20 thread performance:
-./run.sh -1 20
+# For 1 thread performance; by default the script uses 1 thread
+# Set `MODEL_DIR` to `/PATH/TO/FLOAT32/MODEL` or `/PATH/TO/SAVE/INT8/MODEL`
+# Set `DATA_FILE` to `/PATH/TO/SAVE/BINARY/FILE`
+./run.sh path/to/MODEL_DIR path/to/DATA_FILE
+# For 20 thread performance, set the third parameter to 20
+./run.sh path/to/MODEL_DIR path/to/DATA_FILE 20
```
-**Available options in the above command and their descriptions are as follows:**
+**Available options in `run.sh` and their descriptions are as follows:**
- **infer_model:** Required. Path of the tested model. Note that the model parameter files currently need to be saved as multiple separate files.
- **infer_data:** Required. The path of the tested data file. Note that it needs to be a binary file converted by `full_ILSVRC2012_val_preprocess`.
-- **batch_size:** Batch size. The default value is 50.
+- **batch_size:** Batch size. The default value is 1.
- **iterations:** Batch iterations. The default is 0, which means predicting all batches (image number / batch size) in infer_data.
- **num_threads:** Number of CPU threads used. The default value is 1.
- **with_accuracy_layer:** Whether the model contains an accuracy layer. The default value is true.
- **use_analysis:** Whether to use `paddle::AnalysisConfig` to optimize the model. The default value is false.
-One can directly modify MODEL_DIR and DATA_DIR in `run.sh` under the `/PATH_TO_PaddleSlim/demo/mkldnn_quant/` directory, then execute `./run.sh` for CPU inference.
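The `with_accuracy_layer` option decides whether a label tensor has to be fed next to the image tensor. A hedged sketch of what the two inputs could look like with the `paddle::PaddleTensor` API is shown below; the tensor names, shapes and the helper function are illustrative assumptions, not the demo's exact code.

```
#include <cstdint>
#include <cstring>
#include <vector>
#include "paddle_inference_api.h"

// Illustrative only: build one batch of inputs, adding a label tensor when the
// model contains an accuracy layer (with_accuracy_layer=true).
std::vector<paddle::PaddleTensor> MakeInputs(const std::vector<float> &image_data,
                                             const std::vector<int64_t> &labels,
                                             int batch_size,
                                             bool with_accuracy_layer) {
  paddle::PaddleTensor image;
  image.name = "image";                     // assumed input name
  image.shape = {batch_size, 3, 224, 224};  // ImageNet-style NCHW shape
  image.dtype = paddle::PaddleDType::FLOAT32;
  image.data.Resize(image_data.size() * sizeof(float));
  std::memcpy(image.data.data(), image_data.data(), image_data.size() * sizeof(float));

  std::vector<paddle::PaddleTensor> inputs{image};
  if (with_accuracy_layer) {
    paddle::PaddleTensor label;
    label.name = "label";                   // assumed input name
    label.shape = {batch_size, 1};
    label.dtype = paddle::PaddleDType::INT64;
    label.data.Resize(labels.size() * sizeof(int64_t));
    std::memcpy(label.data.data(), labels.data(), labels.size() * sizeof(int64_t));
    inputs.push_back(label);
  }
  return inputs;
}
```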
### 4.3 Writing your own tests:
When writing their own tests, users can:
1. Test the resulting INT8 model - then paddle::NativeConfig should be used (without applying additional optimizations) and the option `use_analysis` should be set to `false` in the demo.
@@ -175,7 +171,7 @@ static void SetConfig(paddle::AnalysisConfig *cfg) {
- If `infer_model` is a path to a fakely quantized model generated by PaddleSlim, `use_analysis` will not work even if it is set to true, because the fake quantized model contains fake quantize/dequantize ops, which cannot be fused or optimized.
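To make point 1 above concrete, here is a minimal sketch of the `use_analysis=false` path, creating a predictor through `paddle::NativeConfig` so the saved INT8 (or fake-quantized) model runs as-is, without fuse or optimization passes. The field and function names come from the public inference API; the helper name and path handling are assumptions.

```
#include <memory>
#include <string>
#include "paddle_inference_api.h"

// Sketch of the use_analysis=false path: no AnalysisConfig optimizations are applied.
std::unique_ptr<paddle::PaddlePredictor> CreateNativePredictor(
    const std::string &model_dir) {
  paddle::NativeConfig cfg;
  cfg.model_dir = model_dir;  // directory containing separately saved parameter files
  cfg.use_gpu = false;        // this demo runs CPU inference only
  return paddle::CreatePaddlePredictor<paddle::NativeConfig>(cfg);
}
```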
## 5. Accuracy and performance benchmark
-For INT8 models accuracy and performance results see [CPU deployment predicts the accuracy and performance of INT8 model](https://github.com/PaddlePaddle/PaddleSlim/tree/develop/docs/zh_cn/tutorials/image_classification_mkldnn_quant_tutorial.md)
+For INT8 model accuracy and performance results, see [Accuracy and performance of INT8 models deployed on CPU](https://github.com/PaddlePaddle/PaddleSlim/blob/release/2.0-alpha/docs/zh_cn/tutorials/image_classification_mkldnn_quant_tutorial.md)
## FAQ
......
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License
set(PADDLE_FOUND OFF)
if(NOT PADDLE_ROOT)
set(PADDLE_ROOT $ENV{PADDLE_ROOT} CACHE PATH "Paddle Path")
endif()
if(NOT PADDLE_ROOT)
message(FATAL_ERROR "Set PADDLE_ROOT as your root directory installed PaddlePaddle")
endif()
set(THIRD_PARTY_ROOT ${PADDLE_ROOT}/third_party)
if(USE_GPU)
set(CUDA_ROOT $ENV{CUDA_ROOT} CACHE PATH "CUDA root Path")
set(CUDNN_ROOT $ENV{CUDNN_ROOT} CACHE PATH "CUDNN root Path")
endif()
# Supported directory organizations
find_path(PADDLE_INC_DIR NAMES paddle_inference_api.h PATHS ${PADDLE_ROOT}/paddle/include)
if(PADDLE_INC_DIR)
set(LIB_PATH "paddle/lib")
else()
find_path(PADDLE_INC_DIR NAMES paddle/fluid/inference/paddle_inference_api.h PATHS ${PADDLE_ROOT})
if(PADDLE_INC_DIR)
include_directories(${PADDLE_ROOT}/paddle/fluid/inference)
endif()
set(LIB_PATH "paddle/fluid/inference")
endif()
include_directories(${PADDLE_INC_DIR})
find_library(PADDLE_FLUID_SHARED_LIB NAMES "libpaddle_fluid.so" PATHS
${PADDLE_ROOT}/${LIB_PATH})
find_library(PADDLE_FLUID_STATIC_LIB NAMES "libpaddle_fluid.a" PATHS
${PADDLE_ROOT}/${LIB_PATH})
if(USE_SHARED AND PADDLE_INC_DIR AND PADDLE_FLUID_SHARED_LIB)
set(PADDLE_FOUND ON)
add_library(paddle_fluid_shared SHARED IMPORTED)
set_target_properties(paddle_fluid_shared PROPERTIES IMPORTED_LOCATION
${PADDLE_FLUID_SHARED_LIB})
set(PADDLE_LIBRARIES paddle_fluid_shared)
message(STATUS "Found PaddlePaddle Fluid (include: ${PADDLE_INC_DIR}; "
"library: ${PADDLE_FLUID_SHARED_LIB}")
elseif(PADDLE_INC_DIR AND PADDLE_FLUID_STATIC_LIB)
set(PADDLE_FOUND ON)
add_library(paddle_fluid_static STATIC IMPORTED)
set_target_properties(paddle_fluid_static PROPERTIES IMPORTED_LOCATION
${PADDLE_FLUID_STATIC_LIB})
set(PADDLE_LIBRARIES paddle_fluid_static)
message(STATUS "Found PaddlePaddle Fluid (include: ${PADDLE_INC_DIR}; "
"library: ${PADDLE_FLUID_STATIC_LIB}")
else()
set(PADDLE_FOUND OFF)
message(WARNING "Cannot find PaddlePaddle Fluid under ${PADDLE_ROOT}")
return()
endif()
# including directory of third_party libraries
set(PADDLE_THIRD_PARTY_INC_DIRS)
function(third_party_include TARGET_NAME HEADER_NAME TARGET_DIRNAME)
find_path(PADDLE_${TARGET_NAME}_INC_DIR NAMES ${HEADER_NAME} PATHS
${TARGET_DIRNAME}
NO_DEFAULT_PATH)
if(PADDLE_${TARGET_NAME}_INC_DIR)
message(STATUS "Found PaddlePaddle third_party including directory: " ${PADDLE_${TARGET_NAME}_INC_DIR})
set(PADDLE_THIRD_PARTY_INC_DIRS ${PADDLE_THIRD_PARTY_INC_DIRS} ${PADDLE_${TARGET_NAME}_INC_DIR} PARENT_SCOPE)
endif()
endfunction()
third_party_include(glog glog/logging.h ${THIRD_PARTY_ROOT}/install/glog/include)
third_party_include(protobuf google/protobuf/message.h ${THIRD_PARTY_ROOT}/install/protobuf/include)
third_party_include(gflags gflags/gflags.h ${THIRD_PARTY_ROOT}/install/gflags/include)
third_party_include(eigen unsupported/Eigen/CXX11/Tensor ${THIRD_PARTY_ROOT}/eigen3)
third_party_include(boost boost/config.hpp ${THIRD_PARTY_ROOT}/boost)
if(USE_GPU)
third_party_include(cuda cuda.h ${CUDA_ROOT}/include)
third_party_include(cudnn cudnn.h ${CUDNN_ROOT}/include)
endif()
message(STATUS "PaddlePaddle need to include these third party directories: ${PADDLE_THIRD_PARTY_INC_DIRS}")
include_directories(${PADDLE_THIRD_PARTY_INC_DIRS})
set(PADDLE_THIRD_PARTY_LIBRARIES)
function(third_party_library TARGET_NAME TARGET_DIRNAME)
set(library_names ${ARGN})
set(local_third_party_libraries)
foreach(lib ${library_names})
string(REGEX REPLACE "^lib" "" lib_noprefix ${lib})
if(${lib} MATCHES "${CMAKE_STATIC_LIBRARY_SUFFIX}$")
set(libtype STATIC)
string(REGEX REPLACE "${CMAKE_STATIC_LIBRARY_SUFFIX}$" "" libname ${lib_noprefix})
elseif(${lib} MATCHES "${CMAKE_SHARED_LIBRARY_SUFFIX}(\\.[0-9]+)?$")
set(libtype SHARED)
string(REGEX REPLACE "${CMAKE_SHARED_LIBRARY_SUFFIX}(\\.[0-9]+)?$" "" libname ${lib_noprefix})
else()
message(FATAL_ERROR "Unknown library type: ${lib}")
endif()
#message(STATUS "libname: ${libname}")
find_library(${libname}_LIBRARY NAMES "${lib}" PATHS
${TARGET_DIRNAME}
NO_DEFAULT_PATH)
if(${libname}_LIBRARY)
set(${TARGET_NAME}_FOUND ON PARENT_SCOPE)
add_library(${libname} ${libtype} IMPORTED)
set_target_properties(${libname} PROPERTIES IMPORTED_LOCATION ${${libname}_LIBRARY})
set(local_third_party_libraries ${local_third_party_libraries} ${libname})
message(STATUS "Found PaddlePaddle third_party library: " ${${libname}_LIBRARY})
else()
set(${TARGET_NAME}_FOUND OFF PARENT_SCOPE)
message(WARNING "Cannot find ${lib} under ${THIRD_PARTY_ROOT}")
endif()
endforeach()
set(PADDLE_THIRD_PARTY_LIBRARIES ${PADDLE_THIRD_PARTY_LIBRARIES} ${local_third_party_libraries} PARENT_SCOPE)
endfunction()
third_party_library(mklml ${THIRD_PARTY_ROOT}/install/mklml/lib libiomp5.so libmklml_intel.so)
third_party_library(mkldnn ${THIRD_PARTY_ROOT}/install/mkldnn/lib libmkldnn.so)
if(NOT mkldnn_FOUND)
third_party_library(mkldnn ${THIRD_PARTY_ROOT}/install/mkldnn/lib libmkldnn.so.0)
endif()
if(NOT USE_SHARED)
third_party_library(glog ${THIRD_PARTY_ROOT}/install/glog/lib libglog.a)
third_party_library(protobuf ${THIRD_PARTY_ROOT}/install/protobuf/lib libprotobuf.a)
third_party_library(gflags ${THIRD_PARTY_ROOT}/install/gflags/lib libgflags.a)
if(NOT mklml_FOUND)
third_party_library(openblas ${THIRD_PARTY_ROOT}/install/openblas/lib libopenblas.a)
endif()
third_party_library(zlib ${THIRD_PARTY_ROOT}/install/zlib/lib libz.a)
third_party_library(snappystream ${THIRD_PARTY_ROOT}/install/snappystream/lib libsnappystream.a)
third_party_library(snappy ${THIRD_PARTY_ROOT}/install/snappy/lib libsnappy.a)
third_party_library(xxhash ${THIRD_PARTY_ROOT}/install/xxhash/lib libxxhash.a)
if(USE_GPU)
third_party_library(cudart ${CUDA_ROOT}/lib64 libcudart.so)
endif()
endif()
\ No newline at end of file
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License
# Tries to find Gperftools.
#
# Usage of this module as follows:
#
# find_package(Gperftools)
#
# Variables used by this module, they can change the default behaviour and need
# to be set before calling find_package:
#
# Gperftools_ROOT_DIR Set this variable to the root installation of
# Gperftools if the module has problems finding
# the proper installation path.
#
# Variables defined by this module:
#
# GPERFTOOLS_FOUND System has Gperftools libs/headers
# GPERFTOOLS_LIBRARIES The Gperftools libraries (tcmalloc & profiler)
# GPERFTOOLS_INCLUDE_DIR The location of Gperftools headers
find_library(GPERFTOOLS_TCMALLOC
NAMES tcmalloc
HINTS ${Gperftools_ROOT_DIR}/lib)
find_library(GPERFTOOLS_PROFILER
NAMES profiler
HINTS ${Gperftools_ROOT_DIR}/lib)
find_library(GPERFTOOLS_TCMALLOC_AND_PROFILER
NAMES tcmalloc_and_profiler
HINTS ${Gperftools_ROOT_DIR}/lib)
find_path(GPERFTOOLS_INCLUDE_DIR
NAMES gperftools/heap-profiler.h
HINTS ${Gperftools_ROOT_DIR}/include)
set(GPERFTOOLS_LIBRARIES ${GPERFTOOLS_TCMALLOC_AND_PROFILER})
include(FindPackageHandleStandardArgs)
find_package_handle_standard_args(
Gperftools
DEFAULT_MSG
GPERFTOOLS_LIBRARIES
GPERFTOOLS_INCLUDE_DIR)
mark_as_advanced(
Gperftools_ROOT_DIR
GPERFTOOLS_TCMALLOC
GPERFTOOLS_PROFILER
GPERFTOOLS_TCMALLOC_AND_PROFILER
GPERFTOOLS_LIBRARIES
GPERFTOOLS_INCLUDE_DIR)
# create IMPORTED targets
if (Gperftools_FOUND AND NOT TARGET gperftools::tcmalloc)
add_library(gperftools::tcmalloc UNKNOWN IMPORTED)
set_target_properties(gperftools::tcmalloc PROPERTIES
IMPORTED_LOCATION ${GPERFTOOLS_TCMALLOC}
INTERFACE_INCLUDE_DIRECTORIES "${GPERFTOOLS_INCLUDE_DIR}")
add_library(gperftools::profiler UNKNOWN IMPORTED)
set_target_properties(gperftools::profiler PROPERTIES
IMPORTED_LOCATION ${GPERFTOOLS_PROFILER}
INTERFACE_INCLUDE_DIRECTORIES "${GPERFTOOLS_INCLUDE_DIR}")
endif()
#!/bin/bash
-MODEL_DIR=/home/li/models/ResNet50_4th_qat_int8
-DATA_FILE=/mnt/disk500/data/int8_full_val.bin
-num_threads=1
-with_accuracy_layer=false
-use_profile=true
+MODEL_DIR=$1
+DATA_FILE=$2
+default_num_threads=1
+default_with_accuracy=false
+num_threads=${3:-$default_num_threads}
+with_accuracy_layer=${4:-$default_with_accuracy}
ITERATIONS=0
GLOG_logtostderr=1 ./build/sample_tester \
@@ -13,5 +14,4 @@ GLOG_logtostderr=1 ./build/sample_tester \
    --num_threads=${num_threads} \
    --iterations=${ITERATIONS} \
    --with_accuracy_layer=${with_accuracy_layer} \
-    --use_profile=${use_profile} \
    --use_analysis=false
@@ -36,9 +36,6 @@ DEFINE_bool(with_accuracy_layer,
            true,
            "Set with_accuracy_layer to true if provided model has accuracy "
            "layer and requires label input");
-DEFINE_bool(use_profile,
-            false,
-            "Set use_profile to true to get profile information");
DEFINE_bool(use_analysis,
            false,
            "If use_analysis is set to true, the model will be optimized");
@@ -151,7 +148,7 @@ void SetInput(std::vector<std::vector<paddle::PaddleTensor>> *inputs,
    }
    inputs->push_back(std::move(tmp_vec));
    if (i > 0 && i % 100==0) {
-      LOG(INFO) << "Read " << i * 100 * FLAGS_batch_size << " samples";
+      LOG(INFO) << "Read " << i * FLAGS_batch_size << " samples";
    }
  }
}
@@ -183,10 +180,6 @@ void PredictionRun(paddle::PaddlePredictor *predictor,
  outputs->resize(iterations);
  Timer run_timer;
  double elapsed_time = 0;
-#ifdef WITH_GPERFTOOLS
-  ResetProfiler();
-  ProfilerStart("paddle_inference.prof");
-#endif
  int predicted_num = 0;
  for (int i = 0; i < iterations; i++) {
@@ -200,10 +193,6 @@ void PredictionRun(paddle::PaddlePredictor *predictor,
    }
  }
-#ifdef WITH_GPERFTOOLS
-  ProfilerStop();
-#endif
  auto batch_latency = elapsed_time / iterations;
  PrintTime(FLAGS_batch_size, num_threads, batch_latency, iterations);
@@ -277,9 +266,6 @@ static void SetIrOptimConfig(paddle::AnalysisConfig *cfg) {
  cfg->DisableGpu();
  cfg->SwitchIrOptim();
  cfg->EnableMKLDNN();
-  if (FLAGS_use_profile) {
-    cfg->EnableProfile();
-  }
}
std::unique_ptr<paddle::PaddlePredictor> CreatePredictor(
......