Commit e5bf8616
Authored Nov 13, 2018 by nhzlx

Merge branch 'develop' of https://github.com/paddlepaddle/paddle into add_trt_plugin

test=develop

Parents: d38fd6a0, 38f499df

Showing 73 changed files with 2,361 additions and 223 deletions.
CMakeLists.txt  +4 -0
cmake/external/mkldnn.cmake  +0 -1
cmake/external/ngraph.cmake  +92 -0
cmake/external/protobuf.cmake  +49 -53
paddle/fluid/API.spec  +5 -1
paddle/fluid/framework/ir/graph_pattern_detector.cc  +20 -16
paddle/fluid/framework/operator.cc  +2 -0
paddle/fluid/framework/var_type_inference.h  +25 -0
paddle/fluid/inference/analysis/analyzer.cc  +3 -1
paddle/fluid/inference/analysis/data_flow_graph.cc  +2 -2
paddle/fluid/inference/tensorrt/convert/activation_op.cc  +1 -1
paddle/fluid/inference/tensorrt/convert/batch_norm_op.cc  +1 -1
paddle/fluid/inference/tensorrt/convert/concat_op.cc  +1 -1
paddle/fluid/inference/tensorrt/convert/conv2d_op.cc  +1 -2
paddle/fluid/inference/tensorrt/convert/dropout_op.cc  +1 -1
paddle/fluid/inference/tensorrt/convert/elementwise_op.cc  +2 -2
paddle/fluid/inference/tensorrt/convert/fc_op.cc  +1 -1
paddle/fluid/inference/tensorrt/convert/mul_op.cc  +1 -1
paddle/fluid/inference/tensorrt/convert/pad_op.cc  +1 -1
paddle/fluid/inference/tensorrt/convert/pool2d_op.cc  +1 -1
paddle/fluid/inference/tensorrt/convert/softmax_op.cc  +1 -1
paddle/fluid/inference/tensorrt/helper.h  +1 -1
paddle/fluid/inference/tests/api/CMakeLists.txt  +1 -1
paddle/fluid/operators/CMakeLists.txt  +22 -0
paddle/fluid/operators/activation_op.cc  +6 -10
paddle/fluid/operators/batch_norm_op.cc  +10 -1
paddle/fluid/operators/conv_op.cc  +12 -0
paddle/fluid/operators/cross_entropy_op.cc  +11 -0
paddle/fluid/operators/detection/CMakeLists.txt  +1 -0
paddle/fluid/operators/detection/density_prior_box_op.cc  +175 -0
paddle/fluid/operators/detection/density_prior_box_op.h  +146 -0
paddle/fluid/operators/elementwise_op.h  +6 -10
paddle/fluid/operators/gather.cu.h  +3 -1
paddle/fluid/operators/gather.h  +2 -1
paddle/fluid/operators/gather_op.cc  +4 -2
paddle/fluid/operators/mean_op.cc  +19 -2
paddle/fluid/operators/mul_op.cc  +10 -1
paddle/fluid/operators/pool_op.cc  +14 -4
paddle/fluid/operators/reduce_max_op.cu  +0 -9
paddle/fluid/operators/reduce_max_op.part.cu  +25 -0
paddle/fluid/operators/reduce_mean_op.cu  +0 -10
paddle/fluid/operators/reduce_mean_op.part.cu  +26 -0
paddle/fluid/operators/reduce_min_op.cu  +0 -9
paddle/fluid/operators/reduce_min_op.part.cu  +25 -0
paddle/fluid/operators/reduce_prod_op.cu  +0 -9
paddle/fluid/operators/reduce_prod_op.part.cu  +25 -0
paddle/fluid/operators/reduce_sum_op.cu  +0 -10
paddle/fluid/operators/reduce_sum_op.part.cu  +26 -0
paddle/fluid/operators/scatter.cu.h  +2 -1
paddle/fluid/operators/scatter.h  +2 -1
paddle/fluid/operators/similarity_focus_op.cc  +87 -0
paddle/fluid/operators/similarity_focus_op.h  +168 -0
paddle/fluid/operators/softmax_op.cc  +9 -1
paddle/fluid/operators/stack_op.cc  +6 -2
paddle/fluid/operators/stack_op.cu  +6 -2
paddle/fluid/operators/tensor_array_to_tensor_op.cc  +246 -0
paddle/scripts/paddle_build.sh  +2 -0
python/paddle/fluid/__init__.py  +1 -0
python/paddle/fluid/distribute_lookup_table.py  +39 -0
python/paddle/fluid/layers/detection.py  +130 -0
python/paddle/fluid/layers/nn.py  +198 -4
python/paddle/fluid/layers/tensor.py  +58 -4
python/paddle/fluid/optimizer.py  +59 -7
python/paddle/fluid/tests/book/test_label_semantic_roles.py  +1 -1
python/paddle/fluid/tests/test_detection.py  +18 -0
python/paddle/fluid/tests/unittests/test_density_prior_box_op.py  +142 -0
python/paddle/fluid/tests/unittests/test_dist_transpiler.py  +11 -2
python/paddle/fluid/tests/unittests/test_layers.py  +14 -0
python/paddle/fluid/tests/unittests/test_parallel_executor_dry_run.py  +3 -3
python/paddle/fluid/tests/unittests/test_similarity_focus_op.py  +217 -0
python/paddle/fluid/tests/unittests/test_tensor_array_to_tensor.py  +142 -0
python/paddle/fluid/transpiler/distribute_transpiler.py  +4 -27
python/setup.py.in  +12 -0
CMakeLists.txt  +4 -0

@@ -41,6 +41,7 @@ option(WITH_GPU "Compile PaddlePaddle with NVIDIA GPU" ${CUDA_F
 option(WITH_AMD_GPU "Compile PaddlePaddle with AMD GPU" OFF)
 option(WITH_AVX "Compile PaddlePaddle with AVX intrinsics" ${AVX_FOUND})
 option(WITH_MKL "Compile PaddlePaddle with MKL support." ${AVX_FOUND})
+option(WITH_NGRAPH "Compile PaddlePaddle with nGraph support." OFF)
 option(WITH_DSO "Compile PaddlePaddle with dynamic linked CUDA" ON)
 option(WITH_TESTING "Compile PaddlePaddle with unit testing" OFF)
 option(WITH_SWIG_PY "Compile PaddlePaddle with inference api" ON)

@@ -103,6 +104,8 @@ if(ANDROID OR IOS)
         "Disable RDMA when cross-compiling for Android and iOS" FORCE)
     set(WITH_MKL OFF CACHE STRING
         "Disable MKL when cross-compiling for Android and iOS" FORCE)
+    set(WITH_NGRAPH OFF CACHE STRING
+        "Disable nGraph when cross-compiling for Android and iOS" FORCE)
     set(WITH_GOLANG OFF CACHE STRING
         "Disable golang when cross-compiling for Android and iOS" FORCE)

@@ -171,6 +174,7 @@ include(external/protobuf)   # download, build, install protobuf
 include(external/python)     # download, build, install python
 include(external/openblas)   # download, build, install openblas
 include(external/mkldnn)     # download, build, install mkldnn
+include(external/ngraph)     # download, build, install nGraph
 include(external/swig)       # download, build, install swig
 include(external/boost)      # download boost
 include(external/any)        # download libn::any
cmake/external/mkldnn.cmake  +0 -1

@@ -37,7 +37,6 @@ SET(CMAKE_INSTALL_RPATH_USE_LINK_PATH TRUE)
 SET(CMAKE_INSTALL_RPATH "${CMAKE_INSTALL_RPATH}" "${MKLDNN_INSTALL_DIR}/lib")
 INCLUDE_DIRECTORIES(${MKLDNN_INC_DIR}) # For MKLDNN code to include internal headers.
-INCLUDE_DIRECTORIES(${THIRD_PARTY_PATH}/install) # For Paddle code to include mkldnn.h
 IF(${CBLAS_PROVIDER} STREQUAL "MKLML")
   SET(MKLDNN_DEPENDS ${MKLML_PROJECT})
cmake/external/ngraph.cmake  (new file, +92 -0)

# Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

add_library(ngraph INTERFACE)

IF(WIN32 OR APPLE)
    MESSAGE(WARNING
        "Windows or Mac is not supported with nGraph in Paddle yet."
        "Force WITH_NGRAPH=OFF")
    SET(WITH_NGRAPH OFF CACHE STRING "Disable nGraph in Windows and MacOS" FORCE)
ENDIF()

IF(${WITH_NGRAPH} AND NOT ${WITH_MKLDNN})
    MESSAGE(WARNING
        "nGraph needs mkl-dnn to be enabled."
        "Force WITH_NGRAPH=OFF")
    SET(WITH_NGRAPH OFF CACHE STRING "Disable nGraph if mkl-dnn is disabled" FORCE)
ENDIF()

IF(NOT ${WITH_NGRAPH})
    return()
ENDIF()

INCLUDE(ExternalProject)

SET(NGRAPH_PROJECT         "extern_ngraph")
SET(NGRAPH_VERSION         "0.9")
SET(NGRAPH_GIT_TAG         "f9fd9d4cc318dc59dd4b68448e7fbb5f67a28bd0")
SET(NGRAPH_SOURCES_DIR     ${THIRD_PARTY_PATH}/ngraph)
SET(NGRAPH_INSTALL_DIR     ${THIRD_PARTY_PATH}/install/ngraph)
SET(NGRAPH_INC_DIR         ${NGRAPH_INSTALL_DIR}/include)
SET(NGRAPH_SHARED_LIB_NAME libngraph.so.${NGRAPH_VERSION})
SET(NGRAPH_CPU_LIB_NAME    libcpu_backend.so)
SET(NGRAPH_TBB_LIB_NAME    libtbb.so.2)
SET(NGRAPH_GIT_REPO        "https://github.com/NervanaSystems/ngraph.git")

ExternalProject_Add(
    ${NGRAPH_PROJECT}
    ${EXTERNAL_PROJECT_LOG_ARGS}
    DEPENDS             ${MKLDNN_PROJECT} ${MKLML_PROJECT}
    GIT_REPOSITORY      ${NGRAPH_GIT_REPO}
    GIT_TAG             ${NGRAPH_GIT_TAG}
    PREFIX              ${NGRAPH_SOURCES_DIR}
    UPDATE_COMMAND      ""
    CMAKE_ARGS          -DCMAKE_INSTALL_PREFIX=${NGRAPH_INSTALL_DIR}
    CMAKE_ARGS          -DNGRAPH_UNIT_TEST_ENABLE=FALSE
    CMAKE_ARGS          -DNGRAPH_TOOLS_ENABLE=FALSE
    CMAKE_ARGS          -DNGRAPH_INTERPRETER_ENABLE=FALSE
    CMAKE_ARGS          -DNGRAPH_DEX_ONLY=TRUE
    CMAKE_ARGS          -DCMAKE_BUILD_TYPE=${CMAKE_BUILD_TYPE}
    CMAKE_ARGS          -DMKLDNN_INCLUDE_DIR=${MKLDNN_INC_DIR}
    CMAKE_ARGS          -DMKLDNN_LIB_DIR=${MKLDNN_INSTALL_DIR}/lib)

if(UNIX AND NOT APPLE)
    include(GNUInstallDirs)
    SET(NGRAPH_LIB_DIR ${NGRAPH_INSTALL_DIR}/${CMAKE_INSTALL_LIBDIR})
else()
    SET(NGRAPH_LIB_DIR ${NGRAPH_INSTALL_DIR}/lib)
endif()
MESSAGE(STATUS "nGraph lib will be installed at: ${NGRAPH_LIB_DIR}")

SET(NGRAPH_SHARED_LIB ${NGRAPH_LIB_DIR}/${NGRAPH_SHARED_LIB_NAME})
SET(NGRAPH_CPU_LIB    ${NGRAPH_LIB_DIR}/${NGRAPH_CPU_LIB_NAME})
SET(NGRAPH_TBB_LIB    ${NGRAPH_LIB_DIR}/${NGRAPH_TBB_LIB_NAME})

# Workaround for nGraph expecting mklml to be in mkldnn install directory.
ExternalProject_Add_Step(
    ${NGRAPH_PROJECT}
    PrepareMKL
    COMMAND ${CMAKE_COMMAND} -E create_symlink ${MKLML_LIB} ${MKLDNN_INSTALL_DIR}/lib/libmklml_intel.so
    COMMAND ${CMAKE_COMMAND} -E create_symlink ${MKLML_IOMP_LIB} ${MKLDNN_INSTALL_DIR}/lib/libiomp5.so
    DEPENDEES download
    DEPENDERS configure)

add_dependencies(ngraph ${NGRAPH_PROJECT})
target_compile_definitions(ngraph INTERFACE -DPADDLE_WITH_NGRAPH)
target_include_directories(ngraph INTERFACE ${NGRAPH_INC_DIR})
target_link_libraries(ngraph INTERFACE ${NGRAPH_SHARED_LIB})
LIST(APPEND external_project_dependencies ngraph)
cmake/external/protobuf.cmake  +49 -53

@@ -30,66 +30,61 @@ UNSET_VAR(PROTOBUF_LITE_LIBRARY)
 UNSET_VAR(PROTOBUF_LIBRARY)
 UNSET_VAR(PROTOBUF_INCLUDE_DIR)
 UNSET_VAR(Protobuf_PROTOC_EXECUTABLE)

This hunk rewrites protobuf_generate_python: the pre-CMake-3.4 guard if(NOT COMMAND protobuf_generate_python) (with its comment "before cmake 3.4, protobuf_genrerate_python is not defined.") is dropped and the body is re-indented, so the helper is always defined. After the change the function reads:

function(protobuf_generate_python SRCS)
  # shameless copy from https://github.com/Kitware/CMake/blob/master/Modules/FindProtobuf.cmake
  if(NOT ARGN)
    message(SEND_ERROR "Error: PROTOBUF_GENERATE_PYTHON() called without any proto files")
    return()
  endif()

  if(PROTOBUF_GENERATE_CPP_APPEND_PATH)
    # Create an include path for each file specified
    foreach(FIL ${ARGN})
      get_filename_component(ABS_FIL ${FIL} ABSOLUTE)
      get_filename_component(ABS_PATH ${ABS_FIL} PATH)
      list(FIND _protobuf_include_path ${ABS_PATH} _contains_already)
      if(${_contains_already} EQUAL -1)
        list(APPEND _protobuf_include_path -I ${ABS_PATH})
      endif()
    endforeach()
  else()
    set(_protobuf_include_path -I ${CMAKE_CURRENT_SOURCE_DIR})
  endif()

  if(DEFINED PROTOBUF_IMPORT_DIRS AND NOT DEFINED Protobuf_IMPORT_DIRS)
    set(Protobuf_IMPORT_DIRS "${PROTOBUF_IMPORT_DIRS}")
  endif()

  if(DEFINED Protobuf_IMPORT_DIRS)
    foreach(DIR ${Protobuf_IMPORT_DIRS})
      get_filename_component(ABS_PATH ${DIR} ABSOLUTE)
      list(FIND _protobuf_include_path ${ABS_PATH} _contains_already)
      if(${_contains_already} EQUAL -1)
        list(APPEND _protobuf_include_path -I ${ABS_PATH})
      endif()
    endforeach()
  endif()

  set(${SRCS})
  foreach(FIL ${ARGN})
    get_filename_component(ABS_FIL ${FIL} ABSOLUTE)
    get_filename_component(FIL_WE ${FIL} NAME_WE)
    if(NOT PROTOBUF_GENERATE_CPP_APPEND_PATH)
      get_filename_component(FIL_DIR ${FIL} DIRECTORY)
      if(FIL_DIR)
        set(FIL_WE "${FIL_DIR}/${FIL_WE}")
      endif()
    endif()

    list(APPEND ${SRCS} "${CMAKE_CURRENT_BINARY_DIR}/${FIL_WE}_pb2.py")
    add_custom_command(
      OUTPUT "${CMAKE_CURRENT_BINARY_DIR}/${FIL_WE}_pb2.py"
      COMMAND ${PROTOBUF_PROTOC_EXECUTABLE} --python_out ${CMAKE_CURRENT_BINARY_DIR} ${_protobuf_include_path} ${ABS_FIL}
      DEPENDS ${ABS_FIL} ${PROTOBUF_PROTOC_EXECUTABLE}
      COMMENT "Running Python protocol buffer compiler on ${FIL}"
      VERBATIM)
  endforeach()

  set(${SRCS} ${${SRCS}} PARENT_SCOPE)
endfunction()

# Print and set the protobuf library information,
# finish this cmake process and exit from this file.

@@ -126,6 +121,7 @@ macro(PROMPT_PROTOBUF_LIB)
   # FIND_Protobuf.cmake uses `Protobuf_PROTOC_EXECUTABLE`.
   # make `protobuf_generate_cpp` happy.
   SET(Protobuf_PROTOC_EXECUTABLE ${PROTOBUF_PROTOC_EXECUTABLE})
   FOREACH(dep ${protobuf_DEPS})
     ADD_DEPENDENCIES(protobuf ${dep})
     ADD_DEPENDENCIES(protobuf_lite ${dep})
...
paddle/fluid/API.spec  +5 -1

@@ -103,7 +103,7 @@ paddle.fluid.layers.beam_search ArgSpec(args=['pre_ids', 'pre_scores', 'ids', 's
 paddle.fluid.layers.row_conv ArgSpec(args=['input', 'future_context_size', 'param_attr', 'act'], varargs=None, keywords=None, defaults=(None, None))
 paddle.fluid.layers.multiplex ArgSpec(args=['inputs', 'index'], varargs=None, keywords=None, defaults=None)
 paddle.fluid.layers.layer_norm ArgSpec(args=['input', 'scale', 'shift', 'begin_norm_axis', 'epsilon', 'param_attr', 'bias_attr', 'act', 'name'], varargs=None, keywords=None, defaults=(True, True, 1, 1e-05, None, None, None, None))
-paddle.fluid.layers.softmax_with_cross_entropy ArgSpec(args=['logits', 'label', 'soft_label', 'ignore_index', 'numeric_stable_mode'], varargs=None, keywords=None, defaults=(False, -100, False))
+paddle.fluid.layers.softmax_with_cross_entropy ArgSpec(args=['logits', 'label', 'soft_label', 'ignore_index', 'numeric_stable_mode', 'return_softmax'], varargs=None, keywords=None, defaults=(False, -100, False, False))
 paddle.fluid.layers.smooth_l1 ArgSpec(args=['x', 'y', 'inside_weight', 'outside_weight', 'sigma'], varargs=None, keywords=None, defaults=(None, None, None))
 paddle.fluid.layers.one_hot ArgSpec(args=['input', 'depth'], varargs=None, keywords=None, defaults=None)
 paddle.fluid.layers.autoincreased_step_counter ArgSpec(args=['counter_name', 'begin', 'step'], varargs=None, keywords=None, defaults=(None, 1, 1))

@@ -179,10 +179,12 @@ paddle.fluid.layers.space_to_depth ArgSpec(args=['x', 'blocksize', 'name'], vara
 paddle.fluid.layers.affine_grid ArgSpec(args=['theta', 'out_shape', 'name'], varargs=None, keywords=None, defaults=(None,))
 paddle.fluid.layers.sequence_reverse ArgSpec(args=['x', 'name'], varargs=None, keywords=None, defaults=(None,))
 paddle.fluid.layers.affine_channel ArgSpec(args=['x', 'scale', 'bias', 'data_layout', 'name'], varargs=None, keywords=None, defaults=(None, None, 'NCHW', None))
+paddle.fluid.layers.similarity_focus ArgSpec(args=['input', 'axis', 'indexes', 'name'], varargs=None, keywords=None, defaults=(None,))
 paddle.fluid.layers.hash ArgSpec(args=['input', 'hash_size', 'num_hash', 'name'], varargs=None, keywords=None, defaults=(1, None))
 paddle.fluid.layers.grid_sampler ArgSpec(args=['x', 'grid', 'name'], varargs=None, keywords=None, defaults=(None,))
 paddle.fluid.layers.log_loss ArgSpec(args=['input', 'label', 'epsilon', 'name'], varargs=None, keywords=None, defaults=(0.0001, None))
 paddle.fluid.layers.add_position_encoding ArgSpec(args=['input', 'alpha', 'beta', 'name'], varargs=None, keywords=None, defaults=(None,))
+paddle.fluid.layers.bilinear_tensor_product ArgSpec(args=['x', 'y', 'size', 'act', 'name', 'param_attr', 'bias_attr'], varargs=None, keywords=None, defaults=(None, None, None, None))
 paddle.fluid.layers.data ArgSpec(args=['name', 'shape', 'append_batch_size', 'dtype', 'lod_level', 'type', 'stop_gradient'], varargs=None, keywords=None, defaults=(True, 'float32', 0, VarType.LOD_TENSOR, True))
 paddle.fluid.layers.open_files ArgSpec(args=['filenames', 'shapes', 'lod_levels', 'dtypes', 'thread_num', 'buffer_size', 'pass_num', 'is_test'], varargs=None, keywords=None, defaults=(None, None, 1, None))
 paddle.fluid.layers.read_file ArgSpec(args=['reader'], varargs=None, keywords=None, defaults=None)

@@ -201,6 +203,7 @@ paddle.fluid.layers.create_tensor ArgSpec(args=['dtype', 'name', 'persistable'],
 paddle.fluid.layers.create_parameter ArgSpec(args=['shape', 'dtype', 'name', 'attr', 'is_bias', 'default_initializer'], varargs=None, keywords=None, defaults=(None, None, False, None))
 paddle.fluid.layers.create_global_var ArgSpec(args=['shape', 'value', 'dtype', 'persistable', 'force_cpu', 'name'], varargs=None, keywords=None, defaults=(False, False, None))
 paddle.fluid.layers.cast ArgSpec(args=['x', 'dtype'], varargs=None, keywords=None, defaults=None)
+paddle.fluid.layers.tensor_array_to_tensor ArgSpec(args=['input', 'axis', 'name'], varargs=None, keywords=None, defaults=(1, None))
 paddle.fluid.layers.concat ArgSpec(args=['input', 'axis', 'name'], varargs=None, keywords=None, defaults=(0, None))
 paddle.fluid.layers.sums ArgSpec(args=['input', 'out'], varargs=None, keywords=None, defaults=(None,))
 paddle.fluid.layers.assign ArgSpec(args=['input', 'output'], varargs=None, keywords=None, defaults=(None,))

@@ -271,6 +274,7 @@ paddle.fluid.layers.hard_shrink ArgSpec(args=['x', 'threshold'], varargs=None, k
 paddle.fluid.layers.cumsum ArgSpec(args=['x', 'axis', 'exclusive', 'reverse'], varargs=None, keywords=None, defaults=(None, None, None))
 paddle.fluid.layers.thresholded_relu ArgSpec(args=['x', 'threshold'], varargs=None, keywords=None, defaults=(None,))
 paddle.fluid.layers.prior_box ArgSpec(args=['input', 'image', 'min_sizes', 'max_sizes', 'aspect_ratios', 'variance', 'flip', 'clip', 'steps', 'offset', 'name', 'min_max_aspect_ratios_order'], varargs=None, keywords=None, defaults=(None, [1.0], [0.1, 0.1, 0.2, 0.2], False, False, [0.0, 0.0], 0.5, None, False))
+paddle.fluid.layers.density_prior_box ArgSpec(args=['input', 'image', 'densities', 'fixed_sizes', 'fixed_ratios', 'variance', 'clip', 'steps', 'offset', 'name'], varargs=None, keywords=None, defaults=(None, None, None, [0.1, 0.1, 0.2, 0.2], False, [0.0, 0.0], 0.5, None))
 paddle.fluid.layers.multi_box_head ArgSpec(args=['inputs', 'image', 'base_size', 'num_classes', 'aspect_ratios', 'min_ratio', 'max_ratio', 'min_sizes', 'max_sizes', 'steps', 'step_w', 'step_h', 'offset', 'variance', 'flip', 'clip', 'kernel_size', 'pad', 'stride', 'name', 'min_max_aspect_ratios_order'], varargs=None, keywords=None, defaults=(None, None, None, None, None, None, None, 0.5, [0.1, 0.1, 0.2, 0.2], True, False, 1, 0, 1, None, False))
 paddle.fluid.layers.bipartite_match ArgSpec(args=['dist_matrix', 'match_type', 'dist_threshold', 'name'], varargs=None, keywords=None, defaults=(None, None, None))
 paddle.fluid.layers.target_assign ArgSpec(args=['input', 'matched_indices', 'negative_indices', 'mismatch_value', 'name'], varargs=None, keywords=None, defaults=(None, None, None))
paddle/fluid/framework/ir/graph_pattern_detector.cc  +20 -16

@@ -167,10 +167,12 @@ struct HitGroup {
   bool Match(Node *node, PDNode *pat) {
     if (nodes_.count(node)) {
-      if (!roles.count(pat)) return false;
-      return roles[pat] == node;
+      if (roles.count(pat) && roles[pat] == node) return true;
+      return false;
+    } else {
+      if (roles.count(pat) && roles[pat] != node) return false;
+      return true;
     }
-    return !roles.count(pat) || roles.at(pat) == node;
   }

   void Register(Node *node, PDNode *pat) {

@@ -198,7 +200,6 @@ GraphPatternDetector::DetectPatterns() {
   std::vector<GraphPatternDetector::subgraph_t> result;
   std::vector<HitGroup> init_groups;
   std::array<std::vector<HitGroup>, 2> bi_records;
-  // PADDLE_ENFORCE(!pattern_.edges().empty(), "At least one edge is needed");
   auto *first_pnode = pattern_.edges().empty() ? pattern().nodes().front().get()
                                                : pattern_.edges().front().first;
   if (!pdnodes2nodes_.count(first_pnode)) return result;

@@ -228,11 +229,12 @@ GraphPatternDetector::DetectPatterns() {
       VLOG(80) << "check " << source->id() << " -- " << target->id();
       // TODO(Superjomn) add some prune strategies.
       for (const auto &group : pre_groups) {
         HitGroup new_group = group;
-        if (IsNodesLink(source, target)) {
-          bool flag = new_group.Match(source, edge.first) &&
-                      new_group.Match(target, edge.second);
-          if (flag) {
-            new_group.Register(source, edge.first);
+        if (IsNodesLink(source, target) &&
+            new_group.Match(source, edge.first)) {
+          new_group.Register(source, edge.first);
+          if (new_group.Match(target, edge.second)) {
             new_group.Register(target, edge.second);
             cur_groups.push_back(new_group);
             // TODO(Superjomn) need to unique

@@ -261,14 +263,16 @@ GraphPatternDetector::DetectPatterns() {
   return result;
 }

-bool GraphItemCMP(const std::pair<PDNode *, Node *> &a,
-                  const std::pair<PDNode *, Node *> &b) {
-  if (a.first != b.first) {
-    return a.first < b.first;
-  } else {
-    return a.second < b.second;
-  }
-}
+struct GraphItemLessThan {
+  bool operator()(const std::pair<PDNode *, Node *> &a,
+                  const std::pair<PDNode *, Node *> &b) {
+    if (a.first != b.first) {
+      return a.first < b.first;
+    } else {
+      return a.second < b.second;
+    }
+  }
+};

 // TODO(Superjomn) enhance the function as it marks unique unique as duplicates
 // see https://github.com/PaddlePaddle/Paddle/issues/13550

@@ -282,7 +286,7 @@ void GraphPatternDetector::UniquePatterns(
   for (auto &g : *subgraphs) {
     // Sort the items in the sub-graph, and transform to a string key.
     std::vector<std::pair<PDNode *, Node *>> sorted_keys(g.begin(), g.end());
-    std::sort(sorted_keys.begin(), sorted_keys.end(), GraphItemCMP);
+    std::sort(sorted_keys.begin(), sorted_keys.end(), GraphItemLessThan());
     std::stringstream ss;
     for (auto &item : sorted_keys) {
       ss << item.first << ":" << item.second;
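The last two hunks swap the free-function comparator GraphItemCMP for a GraphItemLessThan functor, an instance of which is passed to std::sort. A minimal standalone sketch of the same pattern (the pair types and values below are illustrative, not Paddle's):

    #include <algorithm>
    #include <iostream>
    #include <utility>
    #include <vector>

    // Comparator functor: order pairs by first element, then by second,
    // mirroring the GraphItemLessThan ordering in the hunk above.
    struct PairLessThan {
      bool operator()(const std::pair<int, int>& a,
                      const std::pair<int, int>& b) const {
        if (a.first != b.first) return a.first < b.first;
        return a.second < b.second;
      }
    };

    int main() {
      std::vector<std::pair<int, int>> items = {{2, 1}, {1, 3}, {1, 2}};
      // A functor instance is passed, as in the new std::sort call above.
      std::sort(items.begin(), items.end(), PairLessThan());
      for (const auto& p : items) std::cout << p.first << ":" << p.second << "\n";
      return 0;  // prints 1:2, 1:3, 2:1
    }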
paddle/fluid/framework/operator.cc  +2 -0

@@ -259,6 +259,8 @@ std::string OperatorBase::DebugStringEx(const Scope* scope) const {
         if (row_size >= 0) {
           ss << "[row_size=" << row_size << "]";
         }
+        std::string dtype = GetDtype(*scope, output.second[i]);
+        ss << ":" << dtype;
         ss << "[" << GetDims(*scope, var_name, true) << "]";
         ss << "(" << GetLoD(*scope, var_name) << ")";
       }
paddle/fluid/framework/var_type_inference.h  +25 -0

@@ -13,6 +13,9 @@ See the License for the specific language governing permissions and
 limitations under the License. */

 #pragma once
+#include <string>
+#include "paddle/fluid/framework/block_desc.h"
+#include "paddle/fluid/framework/op_desc.h"
 #include "paddle/fluid/framework/type_defs.h"

 namespace paddle {

@@ -24,5 +27,27 @@ class VarTypeInference {
   virtual void operator()(const OpDesc& op_desc, BlockDesc* block) const = 0;
 };

+class PassInDtypeAndVarTypeToOutput : public framework::VarTypeInference {
+ public:
+  void operator()(const framework::OpDesc& op_desc,
+                  framework::BlockDesc* block) const final {
+    auto in_out_var_names = this->GetInputOutputWithSameType();
+
+    for (auto& i_o_n : in_out_var_names) {
+      auto& x_name = op_desc.Input(i_o_n.first).at(0);
+      auto& out_name = op_desc.Output(i_o_n.second).at(0);
+
+      auto& x = block->FindRecursiveOrCreateVar(x_name);
+      auto& out = block->FindRecursiveOrCreateVar(out_name);
+      out.SetType(x.GetType());
+      out.SetDataType(x.GetDataType());
+    }
+  }
+
+ protected:
+  virtual std::unordered_map<std::string, std::string>
+  GetInputOutputWithSameType() const = 0;
+};
+
 }  // namespace framework
 }  // namespace paddle
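The new PassInDtypeAndVarTypeToOutput base class centralizes var-type and dtype propagation: a concrete op only declares which input/output pair share a type, and the operator diffs later in this commit (activation_op.cc, batch_norm_op.cc, conv_op.cc, cross_entropy_op.cc) all take exactly this shape. A minimal sketch for a hypothetical operator "my_op", assuming the header above is available:

    // Sketch only: "my_op" and MyOpInferVarType are illustrative names.
    // The base class propagates type and dtype from input "X" to output "Out".
    #include <string>
    #include <unordered_map>
    #include "paddle/fluid/framework/var_type_inference.h"

    namespace paddle {
    namespace operators {

    class MyOpInferVarType : public framework::PassInDtypeAndVarTypeToOutput {
     protected:
      std::unordered_map<std::string, std::string> GetInputOutputWithSameType()
          const override {
        return std::unordered_map<std::string, std::string>{{"X", /*->*/ "Out"}};
      }
    };

    }  // namespace operators
    }  // namespace paddle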
paddle/fluid/inference/analysis/analyzer.cc  +3 -1

@@ -113,7 +113,9 @@ void Analyzer::Run(Argument* argument) {
     passes.push_back("infer_clean_graph_pass");
     passes.push_back("graph_viz_pass");  // add graphviz for debug.
     for (auto& pass : ir_passes_) {
-      if (!disabled_ir_passes_.count(pass)) {
+      // skip mkldnn pass when use_mkldnn_ = false;
+      bool skip_pass = (!use_mkldnn_) && pass.find("mkldnn") != std::string::npos;
+      if (!disabled_ir_passes_.count(pass) && !skip_pass) {
         passes.push_back(pass);
         passes.push_back("graph_viz_pass");  // add graphviz for debug.
       }
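The gating above boils down to a substring test on the pass name. A self-contained sketch of the same logic (the pass names below are illustrative):

    #include <iostream>
    #include <string>
    #include <vector>

    int main() {
      // Same rule as the analyzer hunk above: when MKL-DNN is off, any pass
      // whose name contains "mkldnn" is skipped.
      bool use_mkldnn = false;
      std::vector<std::string> ir_passes = {"conv_bn_fuse_pass",
                                            "conv_relu_mkldnn_fuse_pass"};
      for (const auto& pass : ir_passes) {
        bool skip_pass =
            !use_mkldnn && pass.find("mkldnn") != std::string::npos;
        if (!skip_pass) std::cout << "run " << pass << "\n";
      }
      return 0;  // prints only "run conv_bn_fuse_pass"
    }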
paddle/fluid/inference/analysis/data_flow_graph.cc  +2 -2

@@ -112,8 +112,8 @@ void DataFlowGraph::Build(const framework::proto::ProgramDesc &prog) {
         out_alias->SetPbMsg(out->pb_msg());
         var2id[out_alias->name()] =
             out_alias->id();  // update variable's alias Node
-        LOG(INFO) << "loop found in graph, create SSA alias node ["
-                  << out_alias->repr() << "] for [" << out->repr() << "]";
+        VLOG(40) << "loop found in graph, create SSA alias node ["
+                 << out_alias->repr() << "] for [" << out->repr() << "]";
         out = out_alias;
       }
       out->inlinks.push_back(o);
paddle/fluid/inference/tensorrt/convert/activation_op.cc  +1 -1

@@ -27,7 +27,7 @@ class ActivationOpConverter : public OpConverter {
     // Here the two nullptr looks strange, that's because the
     // framework::OpDesc's constructor is strange.
     framework::OpDesc op_desc(op, nullptr);
-    LOG(INFO)
+    VLOG(3)
         << "convert a fluid Activation op to tensorrt activation layer whose "
            "type is "
         << op_type_;

paddle/fluid/inference/tensorrt/convert/batch_norm_op.cc  +1 -1

@@ -23,7 +23,7 @@ class BatchNormOpConverter : public OpConverter {
  public:
   void operator()(const framework::proto::OpDesc& op,
                   const framework::Scope& scope, bool test_mode) override {
-    LOG(INFO) << "convert a fluid batch norm op to tensorrt batch_norm";
+    VLOG(3) << "convert a fluid batch norm op to tensorrt batch_norm";
     framework::OpDesc op_desc(op, nullptr);
     PADDLE_ENFORCE_EQ(op_desc.Input("X").size(), 1);

paddle/fluid/inference/tensorrt/convert/concat_op.cc  +1 -1

@@ -25,7 +25,7 @@ class ConcatOpConverter : public OpConverter {
  public:
   void operator()(const framework::proto::OpDesc& op,
                   const framework::Scope& scope, bool test_mode) override {
-    VLOG(40) << "convert a fluid mul op to tensorrt mul layer without bias";
+    VLOG(3) << "convert a fluid mul op to tensorrt mul layer without bias";
     framework::OpDesc op_desc(op, nullptr);
     // Declare inputs

paddle/fluid/inference/tensorrt/convert/conv2d_op.cc  +1 -2

@@ -37,8 +37,7 @@ class Conv2dOpConverter : public OpConverter {
  public:
   void operator()(const framework::proto::OpDesc& op,
                   const framework::Scope& scope, bool test_mode) override {
-    LOG(INFO)
-        << "convert a fluid conv2d op to tensorrt conv layer without bias";
+    VLOG(3) << "convert a fluid conv2d op to tensorrt conv layer without bias";
     framework::OpDesc op_desc(op, nullptr);
     PADDLE_ENFORCE_EQ(op_desc.Input("Input").size(), 1);

paddle/fluid/inference/tensorrt/convert/dropout_op.cc  +1 -1

@@ -25,7 +25,7 @@ class DropoutOpConverter : public OpConverter {
  public:
   void operator()(const framework::proto::OpDesc& op,
                   const framework::Scope& scope, bool test_mode) override {
-    VLOG(40) << "convert a fluid dropout op to tensorrt dropout layer";
+    VLOG(3) << "convert a fluid dropout op to tensorrt dropout layer";
     framework::OpDesc op_desc(op, nullptr);
     // Declare inputs
     auto* input1 = engine_->GetITensor(op_desc.Input("X")[0]);

paddle/fluid/inference/tensorrt/convert/elementwise_op.cc  +2 -2

@@ -26,7 +26,7 @@ class ElementwiseWeightOpConverter : public OpConverter {
     // Here the two nullptr looks strange, that's because the
     // framework::OpDesc's constructor is strange.
     framework::OpDesc op_desc(op, nullptr);
-    LOG(INFO) << "convert a fluid elementwise op to tensorrt IScaleLayer";
+    VLOG(3) << "convert a fluid elementwise op to tensorrt IScaleLayer";
     PADDLE_ENFORCE_EQ(op_desc.Input("X").size(), 1);
     PADDLE_ENFORCE_EQ(op_desc.Input("Y").size(), 1);  // Y is a weight

@@ -108,7 +108,7 @@ class ElementwiseTensorOpConverter : public OpConverter {
     // Here the two nullptr looks strange, that's because the
     // framework::OpDesc's constructor is strange.
     framework::OpDesc op_desc(op, nullptr);
-    LOG(INFO) << "convert a fluid elementwise op to tensorrt IScaleLayer";
+    VLOG(3) << "convert a fluid elementwise op to tensorrt IScaleLayer";
     PADDLE_ENFORCE_EQ(op_desc.Input("X").size(), 1);
     PADDLE_ENFORCE_EQ(op_desc.Input("Y").size(), 1);  // Y is a weight

paddle/fluid/inference/tensorrt/convert/fc_op.cc  +1 -1

@@ -52,7 +52,7 @@ class FcOpConverter : public OpConverter {
  public:
   void operator()(const framework::proto::OpDesc& op,
                   const framework::Scope& scope, bool test_mode) override {
-    VLOG(40) << "convert a fluid fc op to tensorrt fc layer without bias";
+    VLOG(3) << "convert a fluid fc op to tensorrt fc layer without bias";
     framework::OpDesc op_desc(op, nullptr);
     PADDLE_ENFORCE_EQ(op_desc.Input("X").size(), 1);

paddle/fluid/inference/tensorrt/convert/mul_op.cc  +1 -1

@@ -25,7 +25,7 @@ class MulOpConverter : public OpConverter {
  public:
   void operator()(const framework::proto::OpDesc& op,
                   const framework::Scope& scope, bool test_mode) override {
-    VLOG(40) << "convert a fluid mul op to tensorrt mul layer without bias";
+    VLOG(3) << "convert a fluid mul op to tensorrt mul layer without bias";
     framework::OpDesc op_desc(op, nullptr);
     // Declare inputs

paddle/fluid/inference/tensorrt/convert/pad_op.cc  +1 -1

@@ -25,7 +25,7 @@ class PadOpConverter : public OpConverter {
  public:
   void operator()(const framework::proto::OpDesc& op,
                   const framework::Scope& scope, bool test_mode) override {
-    VLOG(40) << "convert a fluid transpose op to tensorrt tranpose layer";
+    VLOG(3) << "convert a fluid transpose op to tensorrt tranpose layer";
     framework::OpDesc op_desc(op, nullptr);
     // Declare inputs

paddle/fluid/inference/tensorrt/convert/pool2d_op.cc  +1 -1

@@ -25,7 +25,7 @@ class Pool2dOpConverter : public OpConverter {
  public:
   void operator()(const framework::proto::OpDesc& op,
                   const framework::Scope& scope, bool test_mode) override {
-    VLOG(40)
+    VLOG(3)
         << "convert a fluid pool2d op to tensorrt pool2d layer without bias";
     framework::OpDesc op_desc(op, nullptr);
     // Declare inputs

paddle/fluid/inference/tensorrt/convert/softmax_op.cc  +1 -1

@@ -25,7 +25,7 @@ class SoftMaxOpConverter : public OpConverter {
  public:
   void operator()(const framework::proto::OpDesc& op,
                   const framework::Scope& scope, bool test_mode) override {
-    VLOG(40)
+    VLOG(3)
         << "convert a fluid softmax op to tensorrt softmax layer without bias";
     framework::OpDesc op_desc(op, nullptr);
     // Declare inputs
paddle/fluid/inference/tensorrt/helper.h  +1 -1

@@ -52,7 +52,7 @@ class NaiveLogger : public nvinfer1::ILogger {
   void log(nvinfer1::ILogger::Severity severity, const char* msg) override {
     switch (severity) {
       case Severity::kINFO:
-        LOG(INFO) << msg;
+        VLOG(3) << msg;
         break;
       case Severity::kWARNING:
         LOG(WARNING) << msg;
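This hunk, like the converter hunks above, demotes TensorRT INFO chatter from LOG(INFO) to VLOG(3). With glog, VLOG(n) only emits when the runtime verbosity is at least n, so these messages become opt-in rather than always-on. A minimal glog sketch, assuming glog is available:

    #include <glog/logging.h>

    int main(int argc, char* argv[]) {
      google::InitGoogleLogging(argv[0]);
      // Emitted unconditionally at INFO severity.
      LOG(INFO) << "always printed";
      // Emitted only when verbosity >= 3, e.g. run the binary with GLOG_v=3
      // in the environment (or --v=3 when gflags integration is enabled).
      VLOG(3) << "printed only in verbose mode";
      return 0;
    }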
paddle/fluid/inference/tests/api/CMakeLists.txt  +1 -1

@@ -110,5 +110,5 @@ if(WITH_GPU AND TENSORRT_FOUND)
   endif()
   cc_test(test_trt_models SRCS trt_models_tester.cc
     ARGS --dirname=${TRT_MODEL_INSTALL_DIR}/trt_test_models
-    DEPS paddle_inference_tensorrt_subgraph_engine)
+    DEPS paddle_inference_tensorrt_subgraph_engine SERIAL)
 endif()
paddle/fluid/operators/CMakeLists.txt  +22 -0

@@ -5,6 +5,8 @@ list(REMOVE_DUPLICATES GENERAL_OPS)
 set(DEPS_OPS "")
 set(pybind_file ${PADDLE_BINARY_DIR}/paddle/fluid/pybind/pybind.h)
 file(WRITE ${pybind_file} "// Generated by the paddle/fluid/operator/CMakeLists.txt.  DO NOT EDIT!\n\n")
+
+set(PART_CUDA_KERNEL_FILES)
 function(op_library TARGET)
   # op_library is a function to create op library. The interface is same as
   # cc_library. But it handle split GPU/CPU code and link some common library

@@ -37,6 +39,12 @@ function(op_library TARGET)
     if (EXISTS ${CMAKE_CURRENT_SOURCE_DIR}/${TARGET}.cu)
       list(APPEND cu_srcs ${TARGET}.cu)
     endif()
+    if (EXISTS ${CMAKE_CURRENT_SOURCE_DIR}/${TARGET}.part.cu)
+      set(PART_CUDA_KERNEL_FILES ${CMAKE_CURRENT_SOURCE_DIR}/${TARGET}.part.cu
+        ${PART_CUDA_KERNEL_FILES} PARENT_SCOPE)
+      list(APPEND cu_srcs ${CMAKE_CURRENT_SOURCE_DIR}/${TARGET}.part.cu)
+    endif()
     if (EXISTS ${CMAKE_CURRENT_SOURCE_DIR}/${TARGET}.hip.cu)
       list(APPEND hip_cu_srcs ${TARGET}.hip.cu)
     endif()

@@ -317,6 +325,7 @@ op_library(save_op DEPS lod_tensor)
 op_library(load_op DEPS lod_tensor)
 op_library(save_combine_op DEPS lod_tensor)
 op_library(load_combine_op DEPS lod_tensor)
+op_library(tensor_array_to_tensor_op DEPS concat_op)
 op_library(concat_op DEPS concat_and_split)

 list(REMOVE_ITEM GENERAL_OPS ${DEPS_OPS})

@@ -326,6 +335,8 @@ foreach(src ${GENERAL_OPS})
 endforeach()

 file(APPEND ${pybind_file} "USE_OP(less_than);\nUSE_OP(logical_and);\nUSE_NO_KERNEL_OP(read_from_array);\n")

 if (NOT WIN32)
   add_subdirectory(reader)
 endif(NOT WIN32)

@@ -352,3 +363,14 @@ if(NOT WIN32)
   nv_test(nccl_op_test SRCS nccl_op_test.cu.cc DEPS nccl_op gpu_info device_context)
 endif()
 nv_test(dropout_op_test SRCS dropout_op_test.cc DEPS dropout_op tensor)
+
+if(WITH_GPU)
+  foreach(CUDA_KERNEL_FILE ${PART_CUDA_KERNEL_FILES})
+    file(READ ${CUDA_KERNEL_FILE} TARGET_CONTENT)
+    string(REGEX MATCH "REGISTER_OP_CUDA_KERNEL\\(\\n?([^,]+),.*" MATCHED ${TARGET_CONTENT})
+    if (MATCHED)
+      string(STRIP ${CMAKE_MATCH_1} MATCHED)
+      file(APPEND ${pybind_file} "USE_OP_DEVICE_KERNEL(${MATCHED}, CUDA);\n")
+    endif()
+  endforeach()
+endif()
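The new if(WITH_GPU) block scans every collected *.part.cu file for the first argument of REGISTER_OP_CUDA_KERNEL and appends a matching USE_OP_DEVICE_KERNEL(..., CUDA) line to the generated pybind header. A sketch of what such a file looks like, modeled on the reduce_sum_op.part.cu added in this commit (the kernel list is abridged and illustrative, not the exact upstream body):

    // Sketch of a *.part.cu translation unit; splitting the grad kernels into
    // a .part.cu file lets them build as a separate nvcc unit. The CMake regex
    // above captures the first macro argument, here "reduce_sum_grad".
    #include "paddle/fluid/operators/reduce_sum_op.h"

    namespace ops = paddle::operators;

    REGISTER_OP_CUDA_KERNEL(
        reduce_sum_grad,
        ops::ReduceGradKernel<paddle::platform::CUDADeviceContext, float,
                              ops::SumGradFunctor>);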
paddle/fluid/operators/activation_op.cc  +6 -10

@@ -91,16 +91,12 @@ class ActivationOp : public framework::OperatorWithKernel {
   }
 };

-class ActivationOpInferVarType : public framework::VarTypeInference {
- public:
-  void operator()(const framework::OpDesc& op_desc,
-                  framework::BlockDesc* block) const override {
-    auto x_name = op_desc.Input("X")[0];
-    auto out_name = op_desc.Output("Out")[0];
-    auto& x = block->FindRecursiveOrCreateVar(x_name);
-    auto& out = block->FindRecursiveOrCreateVar(out_name);
-    out.SetType(x.GetType());
-    out.SetDataType(x.GetDataType());
+class ActivationOpInferVarType
+    : public framework::PassInDtypeAndVarTypeToOutput {
+ protected:
+  std::unordered_map<std::string, std::string> GetInputOutputWithSameType()
+      const override {
+    return std::unordered_map<std::string, std::string>{{"X", /*->*/ "Out"}};
   }
 };
paddle/fluid/operators/batch_norm_op.cc  +10 -1

@@ -170,6 +170,15 @@ The required data format for this layer is one of the following:
   }
 };

+class BatchNormOpInferVarType
+    : public framework::PassInDtypeAndVarTypeToOutput {
+ protected:
+  std::unordered_map<std::string, std::string> GetInputOutputWithSameType()
+      const override {
+    return std::unordered_map<std::string, std::string>{{"X", /*->*/ "Y"}};
+  }
+};
+
 template <typename T>
 class BatchNormKernel<platform::CPUDeviceContext, T>
     : public framework::OpKernel<T> {

@@ -525,7 +534,7 @@ class BatchNormGradMaker : public framework::SingleGradOpDescMaker {
 namespace ops = paddle::operators;
 REGISTER_OPERATOR(batch_norm, ops::BatchNormOp, ops::BatchNormOpMaker,
-                  ops::BatchNormGradMaker);
+                  ops::BatchNormOpInferVarType, ops::BatchNormGradMaker);
 REGISTER_OPERATOR(batch_norm_grad, ops::BatchNormGradOp);
 REGISTER_OP_CPU_KERNEL(
paddle/fluid/operators/conv_op.cc
...
@@ -224,6 +224,15 @@ $$
 )DOC");
 }
 
+class ConvOpInferVarType : public framework::PassInDtypeAndVarTypeToOutput {
+ protected:
+  std::unordered_map<std::string, std::string> GetInputOutputWithSameType()
+      const override {
+    return std::unordered_map<std::string, std::string>{
+        {"Input", /*->*/ "Output"}};
+  }
+};
+
 void Conv3DOpMaker::Make() {
   AddInput(
       "Input",
...
@@ -365,6 +374,7 @@ framework::OpKernelType ConvOpGrad::GetExpectedKernelType(
 namespace ops = paddle::operators;
 REGISTER_OPERATOR(conv2d, ops::ConvOp, ops::Conv2DOpMaker,
+                  ops::ConvOpInferVarType,
                   paddle::framework::DefaultGradOpDescMaker<true>);
 REGISTER_OPERATOR(conv2d_grad, ops::ConvOpGrad);
...
@@ -372,7 +382,9 @@ REGISTER_OPERATOR(conv2d_grad, ops::ConvOpGrad);
 REGISTER_OPERATOR(depthwise_conv2d, ops::ConvOp, ops::Conv2DOpMaker,
                   paddle::framework::DefaultGradOpDescMaker<true>);
 REGISTER_OPERATOR(depthwise_conv2d_grad, ops::ConvOpGrad);
 
 REGISTER_OPERATOR(conv3d, ops::ConvOp, ops::Conv3DOpMaker,
+                  ops::ConvOpInferVarType,
                   paddle::framework::DefaultGradOpDescMaker<true>);
 REGISTER_OPERATOR(conv3d_grad, ops::ConvOpGrad);
...
paddle/fluid/operators/cross_entropy_op.cc
...
@@ -13,6 +13,7 @@ See the License for the specific language governing permissions and
 limitations under the License. */
 
 #include "paddle/fluid/operators/cross_entropy_op.h"
+#include <string>
 
 namespace paddle {
 namespace operators {
...
@@ -179,6 +180,15 @@ or not. But the output only shares the LoD information with input X.
 )DOC");
   }
 };
+
+class CrossEntropyOpInferVarType
+    : public framework::PassInDtypeAndVarTypeToOutput {
+ protected:
+  std::unordered_map<std::string, std::string> GetInputOutputWithSameType()
+      const override {
+    return std::unordered_map<std::string, std::string>{{"X", /*->*/ "Y"}};
+  }
+};
 }  // namespace operators
 }  // namespace paddle
...
@@ -186,6 +196,7 @@ namespace ops = paddle::operators;
 using CPUCtx = paddle::platform::CPUDeviceContext;
 
 REGISTER_OPERATOR(cross_entropy, ops::CrossEntropyOp, ops::CrossEntropyOpMaker,
+                  ops::CrossEntropyOpInferVarType,
                   paddle::framework::DefaultGradOpDescMaker<true>);
 REGISTER_OPERATOR(cross_entropy_grad, ops::CrossEntropyGradientOp);
 REGISTER_OP_CPU_KERNEL(cross_entropy, ops::CrossEntropyOpKernel<CPUCtx, float>,
...
paddle/fluid/operators/detection/CMakeLists.txt
...
@@ -22,6 +22,7 @@ iou_similarity_op.cu)
 detection_library(mine_hard_examples_op SRCS mine_hard_examples_op.cc)
 detection_library(multiclass_nms_op SRCS multiclass_nms_op.cc poly_util.cc gpc.cc)
 detection_library(prior_box_op SRCS prior_box_op.cc prior_box_op.cu)
+detection_library(density_prior_box_op SRCS density_prior_box_op.cc)
 detection_library(anchor_generator_op SRCS anchor_generator_op.cc
 anchor_generator_op.cu)
 detection_library(target_assign_op SRCS target_assign_op.cc
...
paddle/fluid/operators/detection/density_prior_box_op.cc
0 → 100644
/*Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserved.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */

#include "paddle/fluid/operators/detection/density_prior_box_op.h"

namespace paddle {
namespace operators {

class DensityPriorBoxOp : public framework::OperatorWithKernel {
 public:
  using framework::OperatorWithKernel::OperatorWithKernel;

  void InferShape(framework::InferShapeContext* ctx) const override {
    PADDLE_ENFORCE(ctx->HasInput("Input"),
                   "Input(Input) of DensityPriorBoxOp should not be null.");
    PADDLE_ENFORCE(ctx->HasInput("Image"),
                   "Input(Image) of DensityPriorBoxOp should not be null.");

    auto image_dims = ctx->GetInputDim("Image");
    auto input_dims = ctx->GetInputDim("Input");
    PADDLE_ENFORCE(image_dims.size() == 4, "The layout of image is NCHW.");
    PADDLE_ENFORCE(input_dims.size() == 4, "The layout of input is NCHW.");

    PADDLE_ENFORCE_LT(input_dims[2], image_dims[2],
                      "The height of input must smaller than image.");
    PADDLE_ENFORCE_LT(input_dims[3], image_dims[3],
                      "The width of input must smaller than image.");
    auto variances = ctx->Attrs().Get<std::vector<float>>("variances");
    auto fixed_sizes = ctx->Attrs().Get<std::vector<float>>("fixed_sizes");
    auto fixed_ratios = ctx->Attrs().Get<std::vector<float>>("fixed_ratios");
    auto densities = ctx->Attrs().Get<std::vector<int>>("densities");

    PADDLE_ENFORCE_EQ(fixed_sizes.size(), densities.size(),
                      "The number of fixed_sizes and densities must be equal.");
    size_t num_priors = 0;
    if ((fixed_sizes.size() > 0) && (densities.size() > 0)) {
      for (size_t i = 0; i < densities.size(); ++i) {
        if (fixed_ratios.size() > 0) {
          num_priors += (fixed_ratios.size()) * (pow(densities[i], 2));
        }
      }
    }
    std::vector<int64_t> dim_vec(4);
    dim_vec[0] = input_dims[2];
    dim_vec[1] = input_dims[3];
    dim_vec[2] = num_priors;
    dim_vec[3] = 4;
    ctx->SetOutputDim("Boxes", framework::make_ddim(dim_vec));
    ctx->SetOutputDim("Variances", framework::make_ddim(dim_vec));
  }

 protected:
  framework::OpKernelType GetExpectedKernelType(
      const framework::ExecutionContext& ctx) const override {
    return framework::OpKernelType(
        framework::ToDataType(ctx.Input<framework::Tensor>("Input")->type()),
        platform::CPUPlace());
  }
};

class DensityPriorBoxOpMaker : public framework::OpProtoAndCheckerMaker {
 public:
  void Make() override {
    AddInput("Input",
             "(Tensor, default Tensor<float>), "
             "the input feature data of DensityPriorBoxOp, the layout is NCHW.");
    AddInput("Image",
             "(Tensor, default Tensor<float>), "
             "the input image data of DensityPriorBoxOp, the layout is NCHW.");
    AddOutput("Boxes",
              "(Tensor, default Tensor<float>), the output prior boxes of "
              "DensityPriorBoxOp. The layout is [H, W, num_priors, 4]. "
              "H is the height of input, W is the width of input, num_priors "
              "is the box count of each position.");
    AddOutput("Variances",
              "(Tensor, default Tensor<float>), the expanded variances of "
              "DensityPriorBoxOp. The layout is [H, W, num_priors, 4]. "
              "H is the height of input, W is the width of input, num_priors "
              "is the box count of each position.");
    AddAttr<std::vector<float>>("variances",
                                "(vector<float>) List of variances to be "
                                "encoded in density prior boxes.")
        .AddCustomChecker([](const std::vector<float>& variances) {
          PADDLE_ENFORCE_EQ(variances.size(), 4,
                            "Must and only provide 4 variance.");
          for (size_t i = 0; i < variances.size(); ++i) {
            PADDLE_ENFORCE_GT(variances[i], 0.0,
                              "variance[%d] must be greater than 0.", i);
          }
        });
    AddAttr<bool>("clip", "(bool) Whether to clip out-of-boundary boxes.")
        .SetDefault(true);

    AddAttr<float>("step_w",
                   "Density prior boxes step across width, 0.0 for auto calculation.")
        .SetDefault(0.0)
        .AddCustomChecker([](const float& step_w) {
          PADDLE_ENFORCE_GE(step_w, 0.0, "step_w should be larger than 0.");
        });
    AddAttr<float>("step_h",
                   "Density prior boxes step across height, 0.0 for auto calculation.")
        .SetDefault(0.0)
        .AddCustomChecker([](const float& step_h) {
          PADDLE_ENFORCE_GE(step_h, 0.0, "step_h should be larger than 0.");
        });

    AddAttr<float>("offset",
                   "(float) "
                   "Density prior boxes center offset.")
        .SetDefault(0.5);
    AddAttr<std::vector<float>>("fixed_sizes",
                                "(vector<float>) List of fixed sizes "
                                "of generated density prior boxes.")
        .SetDefault(std::vector<float>{})
        .AddCustomChecker([](const std::vector<float>& fixed_sizes) {
          for (size_t i = 0; i < fixed_sizes.size(); ++i) {
            PADDLE_ENFORCE_GT(fixed_sizes[i], 0.0,
                              "fixed_sizes[%d] should be larger than 0.", i);
          }
        });
    AddAttr<std::vector<float>>("fixed_ratios",
                                "(vector<float>) List of fixed ratios "
                                "of generated density prior boxes.")
        .SetDefault(std::vector<float>{})
        .AddCustomChecker([](const std::vector<float>& fixed_ratios) {
          for (size_t i = 0; i < fixed_ratios.size(); ++i) {
            PADDLE_ENFORCE_GT(fixed_ratios[i], 0.0,
                              "fixed_ratios[%d] should be larger than 0.", i);
          }
        });
    AddAttr<std::vector<int>>("densities",
                              "(vector<float>) List of densities "
                              "of generated density prior boxes.")
        .SetDefault(std::vector<int>{})
        .AddCustomChecker([](const std::vector<int>& densities) {
          for (size_t i = 0; i < densities.size(); ++i) {
            PADDLE_ENFORCE_GT(densities[i], 0,
                              "densities[%d] should be larger than 0.", i);
          }
        });
    AddComment(R"DOC(
Density Prior box operator
Each position of the input produce N density prior boxes, N is determined by
the count of fixed_ratios, densities, the calculation of N is as follows:
for density in densities:
N += size(fixed_ratios)*density^2
)DOC");
  }
};

}  // namespace operators
}  // namespace paddle

namespace ops = paddle::operators;
REGISTER_OPERATOR(density_prior_box, ops::DensityPriorBoxOp,
                  ops::DensityPriorBoxOpMaker,
                  paddle::framework::EmptyGradOpMaker);

REGISTER_OP_CPU_KERNEL(density_prior_box, ops::DensityPriorBoxOpKernel<float>,
                       ops::DensityPriorBoxOpKernel<double>);
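Given the DOC formula above, the number of priors per spatial position can be checked independently of the op. A tiny worked example with made-up attribute values:

#include <iostream>
#include <vector>

int main() {
  // Hypothetical attributes: two fixed ratios, densities 4 and 2.
  std::vector<float> fixed_ratios = {1.0f, 2.0f};
  std::vector<int> densities = {4, 2};
  size_t num_priors = 0;
  // N += size(fixed_ratios) * density^2, as in the DOC comment.
  for (int d : densities) num_priors += fixed_ratios.size() * d * d;
  std::cout << num_priors << std::endl;  // 2*16 + 2*4 = 40
}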
paddle/fluid/operators/detection/density_prior_box_op.h
0 → 100644
/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserved.

licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */

#pragma once
#include <algorithm>
#include <vector>
#include "paddle/fluid/operators/detection/prior_box_op.h"

namespace paddle {
namespace operators {

template <typename T>
class DensityPriorBoxOpKernel : public framework::OpKernel<T> {
 public:
  void Compute(const framework::ExecutionContext& ctx) const override {
    auto* input = ctx.Input<paddle::framework::Tensor>("Input");
    auto* image = ctx.Input<paddle::framework::Tensor>("Image");
    auto* boxes = ctx.Output<paddle::framework::Tensor>("Boxes");
    auto* vars = ctx.Output<paddle::framework::Tensor>("Variances");

    auto variances = ctx.Attr<std::vector<float>>("variances");
    auto clip = ctx.Attr<bool>("clip");

    auto fixed_sizes = ctx.Attr<std::vector<float>>("fixed_sizes");
    auto fixed_ratios = ctx.Attr<std::vector<float>>("fixed_ratios");
    auto densities = ctx.Attr<std::vector<int>>("densities");

    T step_w = static_cast<T>(ctx.Attr<float>("step_w"));
    T step_h = static_cast<T>(ctx.Attr<float>("step_h"));
    T offset = static_cast<T>(ctx.Attr<float>("offset"));

    auto img_width = image->dims()[3];
    auto img_height = image->dims()[2];

    auto feature_width = input->dims()[3];
    auto feature_height = input->dims()[2];

    T step_width, step_height;
    if (step_w == 0 || step_h == 0) {
      step_width = static_cast<T>(img_width) / feature_width;
      step_height = static_cast<T>(img_height) / feature_height;
    } else {
      step_width = step_w;
      step_height = step_h;
    }
    int num_priors = 0;
    if (fixed_sizes.size() > 0 && densities.size() > 0) {
      for (size_t i = 0; i < densities.size(); ++i) {
        if (fixed_ratios.size() > 0) {
          num_priors += (fixed_ratios.size()) * (pow(densities[i], 2));
        }
      }
    }

    boxes->mutable_data<T>(ctx.GetPlace());
    vars->mutable_data<T>(ctx.GetPlace());

    auto e_boxes = framework::EigenTensor<T, 4>::From(*boxes).setConstant(0.0);

    int step_average = static_cast<int>((step_width + step_height) * 0.5);

    for (int h = 0; h < feature_height; ++h) {
      for (int w = 0; w < feature_width; ++w) {
        T center_x = (w + offset) * step_width;
        T center_y = (h + offset) * step_height;
        int idx = 0;
        // Generate density prior boxes with fixed sizes.
        for (size_t s = 0; s < fixed_sizes.size(); ++s) {
          auto fixed_size = fixed_sizes[s];
          int density = densities[s];
          // Generate density prior boxes with fixed ratios.
          if (fixed_ratios.size() > 0) {
            for (size_t r = 0; r < fixed_ratios.size(); ++r) {
              float ar = fixed_ratios[r];
              int shift = step_average / density;
              float box_width_ratio = fixed_size * sqrt(ar);
              float box_height_ratio = fixed_size / sqrt(ar);
              for (int di = 0; di < density; ++di) {
                for (int dj = 0; dj < density; ++dj) {
                  float center_x_temp =
                      center_x - step_average / 2. + shift / 2. + dj * shift;
                  float center_y_temp =
                      center_y - step_average / 2. + shift / 2. + di * shift;
                  e_boxes(h, w, idx, 0) =
                      (center_x_temp - box_width_ratio / 2.) / img_width >= 0
                          ? (center_x_temp - box_width_ratio / 2.) / img_width
                          : 0;
                  e_boxes(h, w, idx, 1) =
                      (center_y_temp - box_height_ratio / 2.) / img_height >= 0
                          ? (center_y_temp - box_height_ratio / 2.) / img_height
                          : 0;
                  e_boxes(h, w, idx, 2) =
                      (center_x_temp + box_width_ratio / 2.) / img_width <= 1
                          ? (center_x_temp + box_width_ratio / 2.) / img_width
                          : 1;
                  e_boxes(h, w, idx, 3) =
                      (center_y_temp + box_height_ratio / 2.) / img_height <= 1
                          ? (center_y_temp + box_height_ratio / 2.) / img_height
                          : 1;
                  idx++;
                }
              }
            }
          }
        }
      }
    }
    if (clip) {
      platform::Transform<platform::CPUDeviceContext> trans;
      ClipFunctor<T> clip_func;
      trans(ctx.template device_context<platform::CPUDeviceContext>(),
            boxes->data<T>(), boxes->data<T>() + boxes->numel(),
            boxes->data<T>(), clip_func);
    }
    framework::Tensor var_t;
    var_t.mutable_data<T>(
        framework::make_ddim({1, static_cast<int>(variances.size())}),
        ctx.GetPlace());
    auto var_et = framework::EigenTensor<T, 2>::From(var_t);
    for (size_t i = 0; i < variances.size(); ++i) {
      var_et(0, i) = variances[i];
    }

    int box_num = feature_height * feature_width * num_priors;
    auto var_dim = vars->dims();
    vars->Resize({box_num, static_cast<int>(variances.size())});

    auto e_vars = framework::EigenMatrix<T, Eigen::RowMajor>::From(*vars);
    e_vars = var_et.broadcast(Eigen::DSizes<int, 2>(box_num, 1));
    vars->Resize(var_dim);
  }
};  // namespace operators

}  // namespace operators
}  // namespace paddle
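Each inner-loop box above is shifted inside its step cell, normalized, and clamped to [0, 1]. A standalone sketch of that clamping arithmetic with hypothetical numbers (not the kernel itself):

#include <algorithm>
#include <iostream>

int main() {
  // Hypothetical values: a 300-px wide image, a 16-px step cell,
  // density 2 (so shift = 8), and a 32-px wide box.
  float img_width = 300.f, step_average = 16.f, shift = 8.f;
  float center_x = 24.f, box_width = 32.f;
  int dj = 1;
  // Shifted center inside the cell, as in the kernel's inner loop.
  float cx = center_x - step_average / 2.f + shift / 2.f + dj * shift;
  // Normalized corners clamped to [0, 1], mirroring the ternary expressions.
  float xmin = std::max((cx - box_width / 2.f) / img_width, 0.f);
  float xmax = std::min((cx + box_width / 2.f) / img_width, 1.f);
  std::cout << xmin << " " << xmax << std::endl;  // 0.04 0.146667
}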
paddle/fluid/operators/elementwise_op.h
...
@@ -75,16 +75,12 @@ class ElementwiseOp : public framework::OperatorWithKernel {
   }
 };
 
-class ElementwiseOpInferVarType : public framework::VarTypeInference {
- public:
-  void operator()(const framework::OpDesc& op_desc,
-                  framework::BlockDesc* block) const override {
-    auto x_name = op_desc.Input("X")[0];
-    auto out_name = op_desc.Output("Out")[0];
-    auto& x = block->FindRecursiveOrCreateVar(x_name);
-    auto& out = block->FindRecursiveOrCreateVar(out_name);
-    out.SetType(x.GetType());
-    out.SetDataType(x.GetDataType());
-  }
+class ElementwiseOpInferVarType
+    : public framework::PassInDtypeAndVarTypeToOutput {
+ protected:
+  std::unordered_map<std::string, std::string> GetInputOutputWithSameType()
+      const override {
+    return std::unordered_map<std::string, std::string>{{"X", /*->*/ "Out"}};
+  }
 };
...
paddle/fluid/operators/gather.cu.h
...
@@ -50,7 +50,9 @@ void GPUGather(const platform::DeviceContext& ctx, const Tensor& src,
                const Tensor& index, Tensor* output) {
   // PADDLE_ENFORCE(platform::is_gpu_place(place));
   // check index of shape 1-D
-  PADDLE_ENFORCE(index.dims().size() == 1);
+  PADDLE_ENFORCE(index.dims().size() == 1 ||
+                 (index.dims().size() == 2 && index.dims()[1] == 1));
   int index_size = index.dims()[0];
 
   auto src_dims = src.dims();
...
paddle/fluid/operators/gather.h
...
@@ -38,7 +38,8 @@ void CPUGather(const platform::DeviceContext& ctx, const Tensor& src,
                const Tensor& index, Tensor* output) {
   PADDLE_ENFORCE(platform::is_cpu_place(ctx.GetPlace()));
   // check index of shape 1-D
-  PADDLE_ENFORCE(index.dims().size() == 1);
+  PADDLE_ENFORCE(index.dims().size() == 1 ||
+                 (index.dims().size() == 2 && index.dims()[1] == 1));
   int64_t index_size = index.dims()[0];
 
   auto src_dims = src.dims();
...
paddle/fluid/operators/gather_op.cc
...
@@ -31,7 +31,8 @@ class GatherOp : public framework::OperatorWithKernel {
                    "Output(Out) of GatherOp should not be null.");
 
     auto index_dims = ctx->GetInputDim("Index");
-    PADDLE_ENFORCE(index_dims.size() == 1);
+    PADDLE_ENFORCE(index_dims.size() == 1 ||
+                   (index_dims.size() == 2 && index_dims[1] == 1));
     int batch_size = ctx->GetInputDim("Index")[0];
     framework::DDim output_dims(ctx->GetInputDim("X"));
     output_dims[0] = batch_size;
...
@@ -53,6 +54,7 @@ class GatherGradOp : public framework::OperatorWithKernel {
   void InferShape(framework::InferShapeContext* ctx) const override {
     ctx->SetOutputDim(framework::GradVarName("X"), ctx->GetInputDim("X"));
+    ctx->ShareLoD("X", /*-->*/ framework::GradVarName("X"));
   }
 
  protected:
...
@@ -75,7 +77,7 @@ Gather Operator.
 $Out = X[Index]$
 
 Out is obtained by gathering entries of the outer-most dimension
 of X indexed by Index and concatenate them together.
 
 Example:
...
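The DOC above defines $Out = X[Index]$ over the outer-most dimension. A minimal standalone illustration with made-up data:

#include <iostream>
#include <vector>

int main() {
  // X has shape [4, 2]; Index = [1, 3] picks rows 1 and 3, so Out is [2, 2].
  std::vector<std::vector<int>> X = {{0, 1}, {2, 3}, {4, 5}, {6, 7}};
  std::vector<int> index = {1, 3};
  for (int i : index) {
    std::cout << X[i][0] << " " << X[i][1] << "\n";  // prints: 2 3 / 6 7
  }
}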
paddle/fluid/operators/mean_op.cc
...
@@ -13,7 +13,7 @@ See the License for the specific language governing permissions and
 limitations under the License. */
 
 #include "paddle/fluid/operators/mean_op.h"
+#include <string>
 
 namespace paddle {
 namespace operators {
...
@@ -42,6 +42,14 @@ Mean Operator calculates the mean of all elements in X.
   }
 };
 
+class MeanOpInferVarType : public framework::PassInDtypeAndVarTypeToOutput {
+ protected:
+  std::unordered_map<std::string, std::string> GetInputOutputWithSameType()
+      const override {
+    return std::unordered_map<std::string, std::string>{{"X", /*->*/ "Out"}};
+  }
+};
+
 class MeanGradOp : public framework::OperatorWithKernel {
  public:
   using framework::OperatorWithKernel::OperatorWithKernel;
...
@@ -50,6 +58,14 @@ class MeanGradOp : public framework::OperatorWithKernel {
     ctx->SetOutputDim(framework::GradVarName("X"), ctx->GetInputDim("X"));
     ctx->ShareLoD("X", framework::GradVarName("X"));
   }
+
+  framework::OpKernelType GetExpectedKernelType(
+      const framework::ExecutionContext& ctx) const override {
+    auto input_data_type =
+        framework::ToDataType(ctx.Input<Tensor>("X")->type());
+    return framework::OpKernelType(input_data_type, ctx.GetPlace());
+  }
 };
 
 class MeanGradMaker : public framework::SingleGradOpDescMaker {
...
@@ -71,7 +87,8 @@ class MeanGradMaker : public framework::SingleGradOpDescMaker {
 }  // namespace paddle
 
 namespace ops = paddle::operators;
-REGISTER_OPERATOR(mean, ops::MeanOp, ops::MeanOpMaker, ops::MeanGradMaker);
+REGISTER_OPERATOR(mean, ops::MeanOp, ops::MeanOpMaker, ops::MeanOpInferVarType,
+                  ops::MeanGradMaker);
 REGISTER_OPERATOR(mean_grad, ops::MeanGradOp);
 REGISTER_OP_CPU_KERNEL(
     mean, ops::MeanKernel<paddle::platform::CPUDeviceContext, float>,
...
paddle/fluid/operators/mul_op.cc
...
@@ -126,6 +126,14 @@ or not. But the output only shares the LoD information with input $X$.
   }
 };
 
+class MulOpInferVarType : public framework::PassInDtypeAndVarTypeToOutput {
+ protected:
+  std::unordered_map<std::string, std::string> GetInputOutputWithSameType()
+      const override {
+    return std::unordered_map<std::string, std::string>{{"X", /*->*/ "Out"}};
+  }
+};
+
 class MulGradOp : public framework::OperatorWithKernel {
  public:
   using framework::OperatorWithKernel::OperatorWithKernel;
...
@@ -178,7 +186,8 @@ class MulOpGradMaker : public framework::SingleGradOpDescMaker {
 }  // namespace paddle
 
 namespace ops = paddle::operators;
-REGISTER_OPERATOR(mul, ops::MulOp, ops::MulOpMaker, ops::MulOpGradMaker);
+REGISTER_OPERATOR(mul, ops::MulOp, ops::MulOpMaker, ops::MulOpInferVarType,
+                  ops::MulOpGradMaker);
 REGISTER_OPERATOR(mul_grad, ops::MulGradOp);
 REGISTER_OP_CPU_KERNEL(
     mul, ops::MulKernel<paddle::platform::CPUDeviceContext, float>,
...
paddle/fluid/operators/pool_op.cc
...
@@ -40,7 +40,7 @@ int PoolOutputSize(int input_size, int filter_size, int padding, int stride,
   return output_size;
 }
 
 void PoolOp::InferShape(framework::InferShapeContext *ctx) const {
   PADDLE_ENFORCE(ctx->HasInput("X"), "X(Input) of Pooling should not be null.");
   PADDLE_ENFORCE(ctx->HasOutput("Out"),
                  "Out(Output) of Pooling should not be null.");
...
@@ -81,7 +81,7 @@ void PoolOp::InferShape(framework::InferShapeContext *ctx) const {
 }
 
 framework::OpKernelType PoolOp::GetExpectedKernelType(
     const framework::ExecutionContext &ctx) const {
   framework::LibraryType library_{framework::LibraryType::kPlain};
   std::string data_format = ctx.Attr<std::string>("data_format");
   framework::DataLayout layout_ = framework::StringToDataLayout(data_format);
...
@@ -104,7 +104,7 @@ framework::OpKernelType PoolOp::GetExpectedKernelType(
                                  layout_, library_);
 }
 
 void PoolOpGrad::InferShape(framework::InferShapeContext *ctx) const {
   PADDLE_ENFORCE(ctx->HasInput("X"), "Input(X) must not be null.");
   PADDLE_ENFORCE(ctx->HasOutput(framework::GradVarName("X")),
                  "Input(X@GRAD) should not be null.");
...
@@ -112,7 +112,7 @@ void PoolOpGrad::InferShape(framework::InferShapeContext *ctx) const {
 }
 
 framework::OpKernelType PoolOpGrad::GetExpectedKernelType(
     const framework::ExecutionContext &ctx) const {
   framework::LibraryType library_{framework::LibraryType::kPlain};
   std::string data_format = ctx.Attr<std::string>("data_format");
   framework::DataLayout layout_ = framework::StringToDataLayout(data_format);
...
@@ -262,6 +262,14 @@ Example:
 )DOC");
 }
 
+class PoolOpInferVarType : public framework::PassInDtypeAndVarTypeToOutput {
+ protected:
+  std::unordered_map<std::string, std::string> GetInputOutputWithSameType()
+      const override {
+    return std::unordered_map<std::string, std::string>{{"X", /*->*/ "Out"}};
+  }
+};
+
 void Pool3dOpMaker::Make() {
   AddInput("X",
            "(Tensor) The input tensor of pooling operator. "
...
@@ -372,6 +380,7 @@ Example:
 namespace ops = paddle::operators;
 
 REGISTER_OPERATOR(pool2d, ops::PoolOp, ops::Pool2dOpMaker,
+                  ops::PoolOpInferVarType,
                   paddle::framework::DefaultGradOpDescMaker<true>);
 REGISTER_OPERATOR(pool2d_grad, ops::PoolOpGrad);
...
@@ -383,6 +392,7 @@ REGISTER_OP_CPU_KERNEL(
     ops::PoolGradKernel<paddle::platform::CPUDeviceContext, double>);
 
 REGISTER_OPERATOR(pool3d, ops::PoolOp, ops::Pool3dOpMaker,
+                  ops::PoolOpInferVarType,
                   paddle::framework::DefaultGradOpDescMaker<true>);
 REGISTER_OPERATOR(pool3d_grad, ops::PoolOpGrad);
...
paddle/fluid/operators/reduce_max_op.cu
...
@@ -23,12 +23,3 @@ REGISTER_OP_CUDA_KERNEL(reduce_max,
                                       int, ops::MaxFunctor>,
                         ops::ReduceKernel<paddle::platform::CUDADeviceContext,
                                           int64_t, ops::MaxFunctor>);
-REGISTER_OP_CUDA_KERNEL(
-    reduce_max_grad, ops::ReduceGradKernel<paddle::platform::CUDADeviceContext,
-                                           float, ops::MaxOrMinGradFunctor>,
-    ops::ReduceGradKernel<paddle::platform::CUDADeviceContext, double,
-                          ops::MaxOrMinGradFunctor>,
-    ops::ReduceGradKernel<paddle::platform::CUDADeviceContext, int,
-                          ops::MaxOrMinGradFunctor>,
-    ops::ReduceGradKernel<paddle::platform::CUDADeviceContext, int64_t,
-                          ops::MaxOrMinGradFunctor>);
paddle/fluid/operators/reduce_max_op.part.cu
0 → 100644
// Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//     http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

#include "paddle/fluid/operators/reduce_min_max_op.h"

REGISTER_OP_CUDA_KERNEL(
    reduce_max_grad, ops::ReduceGradKernel<paddle::platform::CUDADeviceContext,
                                           float, ops::MaxOrMinGradFunctor>,
    ops::ReduceGradKernel<paddle::platform::CUDADeviceContext, double,
                          ops::MaxOrMinGradFunctor>,
    ops::ReduceGradKernel<paddle::platform::CUDADeviceContext, int,
                          ops::MaxOrMinGradFunctor>,
    ops::ReduceGradKernel<paddle::platform::CUDADeviceContext, int64_t,
                          ops::MaxOrMinGradFunctor>);
paddle/fluid/operators/reduce_mean_op.cu
...
@@ -69,13 +69,3 @@ REGISTER_OP_CUDA_KERNEL(reduce_mean, ops::ReduceMeanKernel<float>,
                         ops::ReduceMeanKernel<double>,
                         ops::ReduceMeanKernel<int>,
                         ops::ReduceMeanKernel<int64_t>);
-
-REGISTER_OP_CUDA_KERNEL(
-    reduce_mean_grad, ops::ReduceGradKernel<paddle::platform::CUDADeviceContext,
-                                            float, ops::MeanGradFunctor>,
-    ops::ReduceGradKernel<paddle::platform::CUDADeviceContext, double,
-                          ops::MeanGradFunctor>,
-    ops::ReduceGradKernel<paddle::platform::CUDADeviceContext, int,
-                          ops::MeanGradFunctor>,
-    ops::ReduceGradKernel<paddle::platform::CUDADeviceContext, int64_t,
-                          ops::MeanGradFunctor>);
paddle/fluid/operators/reduce_mean_op.part.cu
0 → 100644
// Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//     http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

// .part used to speed up nvcc compile
#include "paddle/fluid/operators/reduce_mean_op.h"

REGISTER_OP_CUDA_KERNEL(
    reduce_mean_grad, ops::ReduceGradKernel<paddle::platform::CUDADeviceContext,
                                            float, ops::MeanGradFunctor>,
    ops::ReduceGradKernel<paddle::platform::CUDADeviceContext, double,
                          ops::MeanGradFunctor>,
    ops::ReduceGradKernel<paddle::platform::CUDADeviceContext, int,
                          ops::MeanGradFunctor>,
    ops::ReduceGradKernel<paddle::platform::CUDADeviceContext, int64_t,
                          ops::MeanGradFunctor>);
paddle/fluid/operators/reduce_min_op.cu
...
@@ -23,12 +23,3 @@ REGISTER_OP_CUDA_KERNEL(reduce_min,
                                       int, ops::MinFunctor>,
                         ops::ReduceKernel<paddle::platform::CUDADeviceContext,
                                           int64_t, ops::MinFunctor>);
-REGISTER_OP_CUDA_KERNEL(
-    reduce_min_grad, ops::ReduceGradKernel<paddle::platform::CUDADeviceContext,
-                                           float, ops::MaxOrMinGradFunctor>,
-    ops::ReduceGradKernel<paddle::platform::CUDADeviceContext, double,
-                          ops::MaxOrMinGradFunctor>,
-    ops::ReduceGradKernel<paddle::platform::CUDADeviceContext, int,
-                          ops::MaxOrMinGradFunctor>,
-    ops::ReduceGradKernel<paddle::platform::CUDADeviceContext, int64_t,
-                          ops::MaxOrMinGradFunctor>);
paddle/fluid/operators/reduce_min_op.part.cu
0 → 100644
// Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//     http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

#include "paddle/fluid/operators/reduce_min_max_op.h"

REGISTER_OP_CUDA_KERNEL(
    reduce_min_grad, ops::ReduceGradKernel<paddle::platform::CUDADeviceContext,
                                           float, ops::MaxOrMinGradFunctor>,
    ops::ReduceGradKernel<paddle::platform::CUDADeviceContext, double,
                          ops::MaxOrMinGradFunctor>,
    ops::ReduceGradKernel<paddle::platform::CUDADeviceContext, int,
                          ops::MaxOrMinGradFunctor>,
    ops::ReduceGradKernel<paddle::platform::CUDADeviceContext, int64_t,
                          ops::MaxOrMinGradFunctor>);
paddle/fluid/operators/reduce_prod_op.cu
...
@@ -23,12 +23,3 @@ REGISTER_OP_CUDA_KERNEL(reduce_prod,
                                       int, ops::ProdFunctor>,
                         ops::ReduceKernel<paddle::platform::CUDADeviceContext,
                                           int64_t, ops::ProdFunctor>);
-REGISTER_OP_CUDA_KERNEL(
-    reduce_prod_grad, ops::ReduceGradKernel<paddle::platform::CUDADeviceContext,
-                                            float, ops::ProdGradFunctor>,
-    ops::ReduceGradKernel<paddle::platform::CUDADeviceContext, double,
-                          ops::ProdGradFunctor>,
-    ops::ReduceGradKernel<paddle::platform::CUDADeviceContext, int,
-                          ops::ProdGradFunctor>,
-    ops::ReduceGradKernel<paddle::platform::CUDADeviceContext, int64_t,
-                          ops::ProdGradFunctor>);
paddle/fluid/operators/reduce_prod_op.part.cu
0 → 100644
// Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//     http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

#include "paddle/fluid/operators/reduce_prod_op.h"

REGISTER_OP_CUDA_KERNEL(
    reduce_prod_grad, ops::ReduceGradKernel<paddle::platform::CUDADeviceContext,
                                            float, ops::ProdGradFunctor>,
    ops::ReduceGradKernel<paddle::platform::CUDADeviceContext, double,
                          ops::ProdGradFunctor>,
    ops::ReduceGradKernel<paddle::platform::CUDADeviceContext, int,
                          ops::ProdGradFunctor>,
    ops::ReduceGradKernel<paddle::platform::CUDADeviceContext, int64_t,
                          ops::ProdGradFunctor>);
paddle/fluid/operators/reduce_sum_op.cu
...
@@ -64,13 +64,3 @@ class ReduceSumKernel : public framework::OpKernel<T> {
 REGISTER_OP_CUDA_KERNEL(reduce_sum, ops::ReduceSumKernel<float>,
                         ops::ReduceSumKernel<double>, ops::ReduceSumKernel<int>,
                         ops::ReduceSumKernel<int64_t>);
-
-REGISTER_OP_CUDA_KERNEL(
-    reduce_sum_grad, ops::ReduceGradKernel<paddle::platform::CUDADeviceContext,
-                                           float, ops::SumGradFunctor>,
-    ops::ReduceGradKernel<paddle::platform::CUDADeviceContext, double,
-                          ops::SumGradFunctor>,
-    ops::ReduceGradKernel<paddle::platform::CUDADeviceContext, int,
-                          ops::SumGradFunctor>,
-    ops::ReduceGradKernel<paddle::platform::CUDADeviceContext, int64_t,
-                          ops::SumGradFunctor>);
paddle/fluid/operators/reduce_sum_op.part.cu
0 → 100644
// Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//     http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

#include "paddle/fluid/operators/cub_reduce.h"
#include "paddle/fluid/operators/reduce_sum_op.h"

REGISTER_OP_CUDA_KERNEL(
    reduce_sum_grad, ops::ReduceGradKernel<paddle::platform::CUDADeviceContext,
                                           float, ops::SumGradFunctor>,
    ops::ReduceGradKernel<paddle::platform::CUDADeviceContext, double,
                          ops::SumGradFunctor>,
    ops::ReduceGradKernel<paddle::platform::CUDADeviceContext, int,
                          ops::SumGradFunctor>,
    ops::ReduceGradKernel<paddle::platform::CUDADeviceContext, int64_t,
                          ops::SumGradFunctor>);
paddle/fluid/operators/scatter.cu.h
...
@@ -51,7 +51,8 @@ void GPUScatterAssign(const platform::DeviceContext& ctx, const Tensor& src,
                       const Tensor& index, Tensor* output) {
   // PADDLE_ENFORCE(platform::is_gpu_place(place));
   // check index of shape 1-D
-  PADDLE_ENFORCE(index.dims().size() == 1);
+  PADDLE_ENFORCE(index.dims().size() == 1 ||
+                 (index.dims().size() == 2 && index.dims()[1] == 1));
   int index_size = index.dims()[0];
 
   auto src_dims = src.dims();
...
paddle/fluid/operators/scatter.h
...
@@ -37,7 +37,8 @@ void ScatterAssign(const platform::DeviceContext& ctx, const Tensor& src,
                    const Tensor& index, Tensor* output) {
   PADDLE_ENFORCE(platform::is_cpu_place(ctx.GetPlace()));
   // check index of shape 1-D
-  PADDLE_ENFORCE(index.dims().size() == 1);
+  PADDLE_ENFORCE(index.dims().size() == 1 ||
+                 (index.dims().size() == 2 && index.dims()[1] == 1));
   int index_size = index.dims()[0];
 
   auto src_dims = src.dims();
...
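The same relaxed check appears in gather.h, gather.cu.h, scatter.h, and scatter.cu.h: the index may now be 1-D, or a 2-D column of width 1. A standalone restatement of that predicate (names are illustrative):

#include <cassert>
#include <cstdint>
#include <vector>

// Mirrors the PADDLE_ENFORCE condition above: an index tensor is accepted
// when it is 1-D, or 2-D with a trailing dimension of 1.
bool IndexShapeOk(const std::vector<int64_t>& dims) {
  return dims.size() == 1 || (dims.size() == 2 && dims[1] == 1);
}

int main() {
  assert(IndexShapeOk({5}));      // shape [5]    -> ok
  assert(IndexShapeOk({5, 1}));   // shape [5, 1] -> ok
  assert(!IndexShapeOk({5, 2}));  // shape [5, 2] -> rejected
}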
paddle/fluid/operators/similarity_focus_op.cc
0 → 100644
/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserved.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */

#include "paddle/fluid/operators/similarity_focus_op.h"

namespace paddle {
namespace operators {
class SimilarityFocusOpMaker : public framework::OpProtoAndCheckerMaker {
 public:
  void Make() override {
    AddInput("X",
             "(Tensor, default Tensor<float>), a 4-D tensor with shape,"
             " [BatchSize, X, Y, Z]");
    AddOutput("Out",
              "(Tensor, default Tensor<float>), the similarity focus mask"
              " with the same shape of input X.");
    AddAttr<int>("axis",
                 "(int32), indicating the dimension to be select. It can"
                 " only be 1, 2, or 3.");
    AddAttr<std::vector<int>>("indexes",
                              "(std::vector<int32>), indicating the indexes"
                              " of the selected dimension.");
    AddComment(R"DOC(
SimilarityFocus Operator.

Generate a similarity focus mask with the same shape of input using the following method:
1. Extract the 3-D tensor(here the first dimension is BatchSize) corresponding
   to the axis according to the indexes. For example, if axis=1 and indexes=[a],
   it will get the matrix T=X[:, a, :, :]. In this case, if the shape of input X
   is (BatchSize, A, B, C), the shape of tensor T is (BatchSize, B, C).
2. For each index, find the largest numbers in the tensor T, so that the same
   row and same column has at most one number(what it means is that if the
   largest number has been found in the i-th row and the j-th column, then
   the numbers in the i-th row or j-th column will be skipped. And then the
   next largest number will be selected from the remaining numbers. Obviously
   there will be min(B, C) numbers), and mark the corresponding position of the
   3-D similarity focus mask as 1, otherwise as 0. Do elementwise-or for
   each index.
3. Broadcast the 3-D similarity focus mask to the same shape of input X.

Refer to `Similarity Focus Layer <http://www.aclweb.org/anthology/N16-1108>`_
)DOC");
  }
};

class SimilarityFocusOp : public framework::OperatorWithKernel {
 public:
  using framework::OperatorWithKernel::OperatorWithKernel;

  void InferShape(framework::InferShapeContext* ctx) const override {
    PADDLE_ENFORCE(ctx->HasInput("X"), "Input(X) should be not null.");
    PADDLE_ENFORCE(ctx->HasOutput("Out"), "Output(Out) should be not null.");
    auto x_dims = ctx->GetInputDim("X");
    PADDLE_ENFORCE_EQ(x_dims.size(), 4, "Input(X)'s rank should be 4.");
    ctx->SetOutputDim("Out", x_dims);
    ctx->ShareLoD("X", /*->*/ "Out");
  }

 protected:
  framework::OpKernelType GetExpectedKernelType(
      const framework::ExecutionContext& ctx) const override {
    return framework::OpKernelType(
        framework::ToDataType(ctx.Input<Tensor>("X")->type()),
        platform::CPUPlace());
  }
};

}  // namespace operators
}  // namespace paddle

namespace ops = paddle::operators;
REGISTER_OPERATOR(similarity_focus, ops::SimilarityFocusOp,
                  ops::SimilarityFocusOpMaker,
                  paddle::framework::EmptyGradOpMaker);
REGISTER_OP_CPU_KERNEL(similarity_focus, ops::SimilarityFocusKernel<float>,
                       ops::SimilarityFocusKernel<double>);
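Step 2 of the DOC above is a greedy selection: repeatedly take the largest remaining value whose row and column are both unused. A self-contained sketch on a hypothetical 3x3 slice (not the kernel code itself):

#include <algorithm>
#include <iostream>
#include <utility>
#include <vector>

int main() {
  const int R = 3, C = 3;
  float t[R][C] = {{0.1f, 0.9f, 0.3f},
                   {0.8f, 0.2f, 0.4f},
                   {0.5f, 0.7f, 0.6f}};
  // Flatten to (value, position) pairs and sort descending by value,
  // as the kernel's cmp lambda does.
  std::vector<std::pair<float, int>> flat;
  for (int r = 0; r < R; ++r)
    for (int c = 0; c < C; ++c) flat.push_back({t[r][c], r * C + c});
  std::sort(flat.begin(), flat.end(),
            [](const std::pair<float, int>& a, const std::pair<float, int>& b) {
              return a.first > b.first;
            });
  std::vector<bool> row_used(R, false), col_used(C, false);
  int mask[R][C] = {};
  int picked = 0;
  for (const auto& p : flat) {
    int r = p.second / C, c = p.second % C;
    if (row_used[r] || col_used[c]) continue;  // row or column already taken
    row_used[r] = col_used[c] = true;
    mask[r][c] = 1;
    if (++picked == std::min(R, C)) break;  // at most min(B, C) selections
  }
  // Picks 0.9 (r0,c1), then 0.8 (r1,c0), then 0.6 (r2,c2).
  for (int r = 0; r < R; ++r) {
    for (int c = 0; c < C; ++c) std::cout << mask[r][c] << " ";
    std::cout << "\n";
  }
}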
paddle/fluid/operators/similarity_focus_op.h
0 → 100644
/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserved.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */

#pragma once
#include <algorithm>
#include <cstring>
#include <utility>
#include <vector>
#include "paddle/fluid/framework/eigen.h"
#include "paddle/fluid/framework/op_registry.h"

namespace paddle {
namespace operators {
using Tensor = framework::Tensor;

template <typename T>
class SimilarityFocusKernel : public framework::OpKernel<T> {
 public:
  void Compute(const framework::ExecutionContext& context) const override {
    Tensor* out = context.Output<Tensor>("Out");
    const Tensor* x = context.Input<Tensor>("X");
    T* out_data = out->mutable_data<T>(context.GetPlace());
    const T* x_data = x->data<T>();

    int axis = context.Attr<int>("axis");
    std::vector<int> indexes = context.Attr<std::vector<int>>("indexes");

    int64_t batch_size = x->dims()[0];
    int64_t dim[4];
    for (int i = 1; i <= 3; ++i) {
      dim[i] = x->dims()[i];
    }

    if (indexes.size() < 1) {
      PADDLE_THROW("Indexes' size can not be 0.");
    }
    for (auto index : indexes) {
      if (dim[axis] < index) {
        PADDLE_THROW("Index exceeds tensor shape limit.");
      }
    }

    int64_t array_size = 1;
    for (int i = 1; i <= 3; ++i) {
      if (i != axis) {
        array_size *= dim[i];
      }
    }

    std::vector<std::pair<T, int64_t>> array(array_size);

    bool (*cmp)(std::pair<T, int64_t>, std::pair<T, int64_t>) = [](
        std::pair<T, int64_t> x, std::pair<T, int64_t> y) {
      return x.first > y.first;
    };

    int64_t (*compute_index)(int64_t*, int, int, int, int) = [](
        int64_t* dim, int d1, int d2, int d3, int d4) {
      return d1 * dim[1] * dim[2] * dim[3] + d2 * dim[2] * dim[3] +
             d3 * dim[3] + d4;
    };

    memset(out_data, 0, sizeof(T) * batch_size * dim[1] * dim[2] * dim[3]);

    for (int i = 0; i < batch_size; ++i) {
      for (auto index : indexes) {
        if (axis == 1) {
          for (int j = 0; j < dim[2]; ++j) {
            for (int k = 0; k < dim[3]; ++k) {
              array[j * dim[3] + k] = std::make_pair(
                  x_data[compute_index(dim, i, index, j, k)], j * dim[3] + k);
            }
          }

          std::sort(array.begin(), array.end(), cmp);
          int tag_num = 0;
          std::vector<bool> tag2(dim[2]), tag3(dim[3]);
          for (auto x : array) {
            int idx2 = x.second / dim[3];
            int idx3 = x.second % dim[3];
            if (tag2[idx2] || tag3[idx3]) {
              continue;
            }
            tag_num++;
            tag2[idx2] = true;
            tag3[idx3] = true;
            for (int j = 0; j < dim[1]; ++j) {
              out_data[compute_index(dim, i, j, idx2, idx3)] = 1;
            }
            if (tag_num == std::min(dim[2], dim[3])) {
              break;
            }
          }
        } else if (axis == 2) {
          for (int j = 0; j < dim[1]; ++j) {
            for (int k = 0; k < dim[3]; ++k) {
              array[j * dim[3] + k] = std::make_pair(
                  x_data[compute_index(dim, i, j, index, k)], j * dim[3] + k);
            }
          }

          std::sort(array.begin(), array.end(), cmp);
          int tag_num = 0;
          std::vector<bool> tag1(dim[1]), tag3(dim[3]);
          for (auto x : array) {
            int idx1 = x.second / dim[3];
            int idx3 = x.second % dim[3];
            if (tag1[idx1] || tag3[idx3]) {
              continue;
            }
            tag_num++;
            tag1[idx1] = true;
            tag3[idx3] = true;
            for (int j = 0; j < dim[2]; ++j) {
              out_data[compute_index(dim, i, idx1, j, idx3)] = 1;
            }
            if (tag_num == std::min(dim[1], dim[3])) {
              break;
            }
          }
        } else if (axis == 3) {
          for (int j = 0; j < dim[1]; ++j) {
            for (int k = 0; k < dim[2]; ++k) {
              array[j * dim[2] + k] = std::make_pair(
                  x_data[compute_index(dim, i, j, k, index)], j * dim[2] + k);
            }
          }

          std::sort(array.begin(), array.end(), cmp);
          int tag_num = 0;
          std::vector<bool> tag1(dim[1]), tag2(dim[2]);
          for (auto x : array) {
            int idx1 = x.second / dim[2];
            int idx2 = x.second % dim[2];
            if (tag1[idx1] || tag2[idx2]) {
              continue;
            }
            tag_num++;
            tag1[idx1] = true;
            tag2[idx2] = true;
            for (int j = 0; j < dim[3]; ++j) {
              out_data[compute_index(dim, i, idx1, idx2, j)] = 1;
            }
            if (tag_num == std::min(dim[1], dim[2])) {
              break;
            }
          }
        } else {
          PADDLE_THROW("Axis must be 1 or 2 or 3");
        }
      }
    }
  }
};
}  // namespace operators
}  // namespace paddle
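The compute_index lambda above is plain row-major flattening of a [d0, d1, d2, d3] shape. A standalone check with a hypothetical shape:

#include <cstdint>
#include <iostream>

int main() {
  int64_t dim[4] = {2, 3, 4, 5};  // hypothetical [batch, d1, d2, d3]
  // Same arithmetic as the kernel's compute_index lambda.
  auto compute_index = [](int64_t* dim, int d1, int d2, int d3, int d4) {
    return d1 * dim[1] * dim[2] * dim[3] + d2 * dim[2] * dim[3] +
           d3 * dim[3] + d4;
  };
  std::cout << compute_index(dim, 1, 2, 3, 4) << std::endl;
  // 1*60 + 2*20 + 3*5 + 4 = 119
}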
paddle/fluid/operators/softmax_op.cc
...
@@ -124,6 +124,14 @@ For each row $i$ and each column $j$ in the matrix, we have:
   }
 };
 
+class SoftmaxOpInferVarType : public framework::PassInDtypeAndVarTypeToOutput {
+ protected:
+  std::unordered_map<std::string, std::string> GetInputOutputWithSameType()
+      const override {
+    return std::unordered_map<std::string, std::string>{{"X", /*->*/ "Out"}};
+  }
+};
+
 class SoftmaxOpGrad : public framework::OperatorWithKernel {
  public:
   using framework::OperatorWithKernel::OperatorWithKernel;
...
@@ -196,7 +204,7 @@ class SoftmaxOpGradMaker : public framework::SingleGradOpDescMaker {
 namespace ops = paddle::operators;
 
 REGISTER_OPERATOR(softmax, ops::SoftmaxOp, ops::SoftmaxOpMaker,
-                  ops::SoftmaxOpGradMaker);
+                  ops::SoftmaxOpInferVarType, ops::SoftmaxOpGradMaker);
 REGISTER_OPERATOR(softmax_grad, ops::SoftmaxOpGrad);
 REGISTER_OP_CPU_KERNEL(
     softmax, ops::SoftmaxKernel<paddle::platform::CPUDeviceContext, float>,
...
paddle/fluid/operators/stack_op.cc
...
@@ -21,8 +21,12 @@ REGISTER_OPERATOR(stack, ops::StackOp, ops::StackOpMaker,
 REGISTER_OPERATOR(stack_grad, ops::StackOpGrad);
 
 REGISTER_OP_CPU_KERNEL(stack, ops::StackKernel<plat::CPUDeviceContext, float>,
-                       ops::StackKernel<plat::CPUDeviceContext, double>);
+                       ops::StackKernel<plat::CPUDeviceContext, double>,
+                       ops::StackKernel<plat::CPUDeviceContext, int>,
+                       ops::StackKernel<plat::CPUDeviceContext, int64_t>);
 
 REGISTER_OP_CPU_KERNEL(stack_grad,
                        ops::StackGradKernel<plat::CPUDeviceContext, float>,
-                       ops::StackGradKernel<plat::CPUDeviceContext, double>);
+                       ops::StackGradKernel<plat::CPUDeviceContext, double>,
+                       ops::StackGradKernel<plat::CPUDeviceContext, int>,
+                       ops::StackGradKernel<plat::CPUDeviceContext, int64_t>);
paddle/fluid/operators/stack_op.cu
...
@@ -18,8 +18,12 @@ namespace plat = paddle::platform;
 namespace ops = paddle::operators;
 
 REGISTER_OP_CUDA_KERNEL(stack, ops::StackKernel<plat::CUDADeviceContext, float>,
-                        ops::StackKernel<plat::CUDADeviceContext, double>);
+                        ops::StackKernel<plat::CUDADeviceContext, double>,
+                        ops::StackKernel<plat::CUDADeviceContext, int>,
+                        ops::StackKernel<plat::CUDADeviceContext, int64_t>);
 
 REGISTER_OP_CUDA_KERNEL(stack_grad,
                         ops::StackGradKernel<plat::CUDADeviceContext, float>,
-                        ops::StackGradKernel<plat::CUDADeviceContext, double>);
+                        ops::StackGradKernel<plat::CUDADeviceContext, double>,
+                        ops::StackGradKernel<plat::CUDADeviceContext, int>,
+                        ops::StackGradKernel<plat::CUDADeviceContext, int64_t>);
paddle/fluid/operators/tensor_array_to_tensor_op.cc
0 → 100644
浏览文件 @
e5bf8616
/* Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserved.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#include <string>
#include <vector>
#include "paddle/fluid/framework/lod_tensor_array.h"
#include "paddle/fluid/framework/op_registry.h"
#include "paddle/fluid/framework/variable.h"
namespace paddle {
namespace operators {

using framework::Tensor;

void LodTensorArray2LodTensorVector(const framework::Scope &scope,
                                    const std::string &base_name,
                                    const std::string &lod_tensor_array_name,
                                    std::vector<std::string> *res_names) {
  auto &inx =
      scope.FindVar(lod_tensor_array_name)->Get<framework::LoDTensorArray>();
  for (size_t i = 0; i < inx.size(); i++) {
    std::string var_name = base_name + std::to_string(i);
    framework::Variable *g_feed_value =
        const_cast<framework::Scope &>(scope).Var(var_name);
    auto &feed_input =
        *(g_feed_value->GetMutable<paddle::framework::LoDTensor>());
    feed_input.ShareDataWith(inx[i]);
    res_names->push_back(var_name);
  }
}

void LodTensorVectorResizeFromLodTensorArray(
    const framework::Scope &scope, const std::string &base_name,
    const std::string &lod_tensor_array_name,
    std::vector<std::string> *res_names) {
  auto &inx =
      scope.FindVar(lod_tensor_array_name)->Get<framework::LoDTensorArray>();
  for (size_t i = 0; i < inx.size(); i++) {
    std::string var_name = base_name + std::to_string(i);
    framework::Variable *g_feed_value =
        const_cast<framework::Scope &>(scope).Var(var_name);
    auto &feed_input =
        *(g_feed_value->GetMutable<paddle::framework::LoDTensor>());
    auto dims = inx[i].dims();
    feed_input.Resize(dims);
    res_names->push_back(var_name);
  }
}

void LodTensorArrayCreateFromLodTensorArray(
    const framework::Scope &scope,
    const std::string &input_lod_tensor_array_name,
    const std::string &output_lod_tensor_array_name) {
  auto &inx = scope.FindVar(input_lod_tensor_array_name)
                  ->Get<framework::LoDTensorArray>();
  auto &grad_inx = *scope.FindVar(output_lod_tensor_array_name)
                        ->GetMutable<framework::LoDTensorArray>();

  for (size_t i = 0; i < inx.size(); i++) {
    std::string var_name = output_lod_tensor_array_name + std::to_string(i);
    framework::Variable *g_feed_value =
        const_cast<framework::Scope &>(scope).Var(var_name);
    auto &feed_input =
        *(g_feed_value->GetMutable<paddle::framework::LoDTensor>());
    grad_inx.push_back(feed_input);
  }
}

class LoDTensorArray2TensorOp : public framework::OperatorBase {
 public:
  using OperatorBase::OperatorBase;

 private:
  void RunImpl(const framework::Scope &scope,
               const platform::Place &place) const override {
    auto axis = Attr<int>("axis");

    framework::AttributeMap attrs;
    attrs["axis"] = axis;

    auto &inx = scope.FindVar(Input("X"))->Get<framework::LoDTensorArray>();
    auto &out =
        *scope.FindVar(Output("Out"))->GetMutable<framework::LoDTensor>();
    auto &out_inx =
        *scope.FindVar(Output("OutIndex"))->GetMutable<framework::LoDTensor>();

    const size_t n = inx.size();
    PADDLE_ENFORCE_GT(n, 0, "Input tensorarray size should > 0.");

    std::string base_name = Inputs("X")[0];
    std::vector<std::string> names;

    // get the input tensorarray items' dim in out_inx
    auto out_inx_dim = out_inx.dims();
    out_inx_dim[0] = inx.size();
    out_inx.Resize(out_inx_dim);

    std::string var_name = "out_index";
    framework::Variable *tmp_index_var =
        const_cast<framework::Scope &>(scope).Var(var_name);
    auto &tmp_index_tensor =
        *(tmp_index_var->GetMutable<paddle::framework::LoDTensor>());
    tmp_index_tensor.Resize(out_inx_dim);
    int *tmp_index_data =
        tmp_index_tensor.mutable_data<int>(platform::CPUPlace());

    auto out_dims = inx[0].dims();
    size_t out_dim_sum = 0;
    for (size_t index = 0; index < inx.size(); index++) {
      auto inx_dims = inx[index].dims();
      out_dim_sum += inx_dims[axis];
      tmp_index_data[index] = inx_dims[axis];
    }
    out_inx.ShareDataWith(tmp_index_tensor);

    // resize the output to the sum of the items' extents along axis
    out_dims[axis] = out_dim_sum;
    out.Resize(out_dims);

    LodTensorArray2LodTensorVector(scope, base_name, Input("X"), &names);
    // Invoke the concat op to do the real concatenation work
    auto concat_op = framework::OpRegistry::CreateOp(
        "concat", {{"X", names}}, {{"Out", {Output("Out")}}}, attrs);

    concat_op->Run(scope, place);
  }
};

class LoDTensorArray2TensorOpMaker : public framework::OpProtoAndCheckerMaker {
 public:
  void Make() override {
    AddInput("X", "Input LoDTensorArray of tensor_array_to_tensor operator.");
    AddOutput("Out", "Output tensor of tensor_array_to_tensor operator.");
    AddOutput("OutIndex",
              "Output input LoDTensorArray items' dims of "
              "tensor_array_to_tensor operator.");
    AddAttr<int>("axis",
                 "The axis along which the input tensors will be concatenated.")
        .SetDefault(0);
    AddComment(R"DOC(
tensor_array_to_tensor Operator.

Concatenate the input LoDTensorArray along dimension axis to the output Tensor.

Examples:
  Input = {[1,2], [3,4], [5,6]}
  axis = 0
  Output = [[1,2],
            [3,4],
            [5,6]]
  OutputIndex = [1,1,1]
)DOC");
  }
};

class LoDTensorArray2TensorOpInferShape : public framework::InferShapeBase {
 public:
  void operator()(framework::InferShapeContext *ctx) const override {}
};

class LoDTensorArray2TensorGradInferShape : public framework::InferShapeBase {
 public:
  void operator()(framework::InferShapeContext *context) const override {}
};

class LoDTensorArray2TensorGradInferVarType
    : public framework::VarTypeInference {
 public:
  void operator()(const framework::OpDesc &op_desc,
                  framework::BlockDesc *block) const override {
    for (auto &out_var : op_desc.Output(framework::GradVarName("X"))) {
      block->Var(out_var)->SetType(
          framework::proto::VarType::LOD_TENSOR_ARRAY);
    }
  }
};

class LoDTensorArray2TensorGradOp : public framework::OperatorBase {
 public:
  using OperatorBase::OperatorBase;

 private:
  void RunImpl(const framework::Scope &scope,
               const platform::Place &place) const override {
    auto axis = Attr<int>("axis");
    framework::AttributeMap attrs;
    attrs["axis"] = axis;

    auto &inx = scope.FindVar(Input("X"))->Get<framework::LoDTensorArray>();
    const size_t n = inx.size();
    PADDLE_ENFORCE_GT(n, 0, "Input tensorarray size should > 0.");

    std::string base_name = Inputs("X")[0];
    std::vector<std::string> names;

    LodTensorArray2LodTensorVector(scope, base_name, Input("X"), &names);

    // grad
    auto dx_name = Output(framework::GradVarName("X"));
    auto dout_name = Input(framework::GradVarName("Out"));

    std::vector<std::string> grad_names;

    LodTensorVectorResizeFromLodTensorArray(scope, "grad_name", Input("X"),
                                            &grad_names);

    auto concat_grad_op = framework::OpRegistry::CreateOp(
        "concat_grad", {{"X", names}, {"Out@GRAD", {dout_name}}},
        {{"X@GRAD", grad_names}}, attrs);

    concat_grad_op->Run(scope, place);

    LodTensorArrayCreateFromLodTensorArray(scope, Input("X"), dx_name);
    auto &grad_inx =
        *scope.FindVar(dx_name)->GetMutable<framework::LoDTensorArray>();

    for (size_t i = 0; i < grad_names.size(); i++) {
      std::string var_name = grad_names[i];
      auto &feed_input = scope.FindVar(var_name)->Get<framework::LoDTensor>();
      grad_inx[i].ShareDataWith(feed_input);
    }
  }
};

}  // namespace operators
}  // namespace paddle
USE_OP(concat);

namespace ops = paddle::operators;
REGISTER_OPERATOR(tensor_array_to_tensor, ops::LoDTensorArray2TensorOp,
                  ops::LoDTensorArray2TensorOpMaker,
                  ops::LoDTensorArray2TensorOpInferShape,
                  paddle::framework::DefaultGradOpDescMaker<true>);
REGISTER_OPERATOR(tensor_array_to_tensor_grad,
                  ops::LoDTensorArray2TensorGradOp,
                  ops::LoDTensorArray2TensorGradInferShape,
                  ops::LoDTensorArray2TensorGradInferVarType);
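The DOC block above pins down the forward contract: Out is the concatenation of the array items along axis, and OutIndex records each item's extent along that axis. A pure-numpy restatement of that contract (not the operator itself), using the sample values from the Python layer's docstring further below:

import numpy as np

items = [np.array([[0.6, 0.1, 0.3],
                   [0.5, 0.3, 0.2]]),
         np.array([[1.3],
                   [1.8]]),
         np.array([[2.3, 2.1],
                   [2.5, 2.4]])]
axis = 1
out = np.concatenate(items, axis=axis)
out_index = np.array([t.shape[axis] for t in items])
print(out)        # [[0.6 0.1 0.3 1.3 2.3 2.1] [0.5 0.3 0.2 1.8 2.5 2.4]]
print(out_index)  # [3 1 2]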
paddle/scripts/paddle_build.sh
...
@@ -139,6 +139,7 @@ function cmake_gen() {
         -DWITH_AMD_GPU=${WITH_AMD_GPU:-OFF}
         -DWITH_DISTRIBUTE=${WITH_DISTRIBUTE:-OFF}
         -DWITH_MKL=${WITH_MKL:-ON}
+        -DWITH_NGRAPH=${WITH_NGRAPH:-OFF}
         -DWITH_AVX=${WITH_AVX:-OFF}
         -DWITH_GOLANG=${WITH_GOLANG:-OFF}
         -DCUDA_ARCH_NAME=${CUDA_ARCH_NAME:-All}
...
@@ -171,6 +172,7 @@ EOF
        -DWITH_AMD_GPU=${WITH_AMD_GPU:-OFF} \
        -DWITH_DISTRIBUTE=${WITH_DISTRIBUTE:-OFF} \
        -DWITH_MKL=${WITH_MKL:-ON} \
+       -DWITH_NGRAPH=${WITH_NGRAPH:-OFF} \
        -DWITH_AVX=${WITH_AVX:-OFF} \
        -DWITH_GOLANG=${WITH_GOLANG:-OFF} \
        -DCUDA_ARCH_NAME=${CUDA_ARCH_NAME:-All} \
...
python/paddle/fluid/__init__.py
...
@@ -34,6 +34,7 @@ from . import regularizer
 from . import average
 from . import metrics
 from . import transpiler
+from . import distribute_lookup_table
 from .param_attr import ParamAttr, WeightNormParamAttr
 from .data_feeder import DataFeeder
 from .core import LoDTensor, LoDTensorArray, CPUPlace, CUDAPlace, CUDAPinnedPlace, Scope
...
python/paddle/fluid/distribute_lookup_table.py
0 → 100644
# Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
LOOKUP_TABLE_TYPE = "lookup_table"


def find_distributed_lookup_table(program):
    """
    Find the distributed lookup table in a program.
    We only support one distributed table for now.
    :param program:
    :return: table_name or None
    """
    table_name = None

    for op in program.global_block().ops:
        if op.type == LOOKUP_TABLE_TYPE:
            if op.attr('is_distributed') is True:
                if table_name is None:
                    table_name = op.input("W")[0]
                if table_name != op.input("W")[0]:
                    raise RuntimeError("all distributed lookup_table_ops"
                                       " should have only one table")
            else:
                if table_name is not None:
                    assert op.input("W")[0] != table_name

    return table_name
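A hedged sketch of how this helper is meant to be called; the embedding layer and the parameter name 'shared_w' are illustrative, and only a lookup table created with is_distributed=True is picked up:

import paddle.fluid as fluid
from paddle.fluid.distribute_lookup_table import find_distributed_lookup_table

main = fluid.Program()
with fluid.program_guard(main):
    ids = fluid.layers.data(name='ids', shape=[1], dtype='int64')
    emb = fluid.layers.embedding(
        input=ids, size=[10000, 64], is_distributed=True,
        param_attr=fluid.ParamAttr(name='shared_w'))

# returns 'shared_w', the single shared distributed table, or None
print(find_distributed_lookup_table(main))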
python/paddle/fluid/layers/detection.py
...
@@ -31,6 +31,7 @@ from functools import reduce
 __all__ = [
     'prior_box',
+    'density_prior_box',
     'multi_box_head',
     'bipartite_match',
     'target_assign',
...
@@ -1023,6 +1024,135 @@ def prior_box(input,
    return box, var


def density_prior_box(input,
                      image,
                      densities=None,
                      fixed_sizes=None,
                      fixed_ratios=None,
                      variance=[0.1, 0.1, 0.2, 0.2],
                      clip=False,
                      steps=[0.0, 0.0],
                      offset=0.5,
                      name=None):
    """
    **Density Prior Box Operator**

    Generate density prior boxes for the SSD (Single Shot MultiBox Detector)
    algorithm. Each position of the input produces N prior boxes, where N is
    determined by the counts of densities, fixed_sizes and fixed_ratios.
    This operator generates boxes centered at grid points around each input
    position; the grid points are determined by densities, and the count of
    density prior boxes is determined by fixed_sizes and fixed_ratios. Note
    that the number of fixed_sizes must equal the number of densities.
    For each densities_i in densities:

        N_density_prior_box = sum(N_fixed_ratios * densities_i^2)

    Args:
       input(Variable): The input feature data, the layout is NCHW.
       image(Variable): The input image data of PriorBoxOp, the layout
            is NCHW.
       densities(list|tuple|None): the densities of generated density prior
            boxes, this attribute should be a list or tuple of integers.
            Default: None.
       fixed_sizes(list|tuple|None): the fixed sizes of generated density
            prior boxes, this attribute should be a list or tuple of the same
            length as :attr:`densities`. Default: None.
       fixed_ratios(list|tuple|None): the fixed ratios of generated density
            prior boxes, if this attribute is not set and :attr:`densities`
            and :attr:`fix_sizes` is set, :attr:`aspect_ratios` will be used
            to generate density prior boxes.
       variance(list|tuple): the variances to be encoded in density prior
            boxes. Default: [0.1, 0.1, 0.2, 0.2].
       clip(bool): Whether to clip out-of-boundary boxes. Default: False.
       steps(list|tuple): Prior boxes step across width and height. If
            steps[0] == 0.0 / steps[1] == 0.0, the density prior boxes step
            across height/width of the input will be automatically calculated.
            Default: [0., 0.]
       offset(float): Prior boxes center offset. Default: 0.5
       name(str): Name of the density prior box op. Default: None.

    Returns:
        tuple: A tuple with two Variables (boxes, variances)

        boxes: the output density prior boxes of PriorBox.
            The layout is [H, W, num_priors, 4]. H is the height of input,
            W is the width of input, and num_priors is the total box count
            of each position of input.

        variances: the expanded variances of PriorBox.
            The layout is [H, W, num_priors, 4]. H is the height of input,
            W is the width of input, and num_priors is the total box count
            of each position of input.

    Examples:
        .. code-block:: python

            box, var = fluid.layers.density_prior_box(
                input=conv1,
                image=images,
                densities=[3, 4],
                fixed_sizes=[50., 60.],
                fixed_ratios=[1.0, 3.0, 1.0 / 3.0],
                clip=True)
    """
    helper = LayerHelper("density_prior_box", **locals())
    dtype = helper.input_dtype()

    def _is_list_or_tuple_(data):
        return (isinstance(data, list) or isinstance(data, tuple))

    if not _is_list_or_tuple_(densities):
        raise TypeError('densities should be a list or a tuple or None.')
    if not _is_list_or_tuple_(fixed_sizes):
        raise TypeError('fixed_sizes should be a list or a tuple or None.')
    if not _is_list_or_tuple_(fixed_ratios):
        raise TypeError('fixed_ratios should be a list or a tuple or None.')
    if len(densities) != len(fixed_sizes):
        raise ValueError('densities and fixed_sizes length should be equal.')
    if not (_is_list_or_tuple_(steps) and len(steps) == 2):
        raise ValueError('steps should be a list or tuple '
                         'with length 2, (step_width, step_height).')

    densities = list(map(int, densities))
    fixed_sizes = list(map(float, fixed_sizes))
    fixed_ratios = list(map(float, fixed_ratios))
    steps = list(map(float, steps))

    attrs = {
        'variances': variance,
        'clip': clip,
        'step_w': steps[0],
        'step_h': steps[1],
        'offset': offset,
    }
    if densities is not None and len(densities) > 0:
        attrs['densities'] = densities
    if fixed_sizes is not None and len(fixed_sizes) > 0:
        attrs['fixed_sizes'] = fixed_sizes
    if fixed_ratios is not None and len(fixed_ratios) > 0:
        attrs['fixed_ratios'] = fixed_ratios

    box = helper.create_variable_for_type_inference(dtype)
    var = helper.create_variable_for_type_inference(dtype)
    helper.append_op(
        type="density_prior_box",
        inputs={"Input": input,
                "Image": image},
        outputs={"Boxes": box,
                 "Variances": var},
        attrs=attrs, )
    box.stop_gradient = True
    var.stop_gradient = True
    return box, var


def multi_box_head(inputs,
                   image,
                   base_size,
...
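Per the docstring's formula, each input position yields sum(len(fixed_ratios) * density^2) boxes over all densities. A two-line numpy check using the example's settings:

densities = [3, 4]
fixed_ratios = [1.0, 3.0, 1.0 / 3.0]
num_priors = sum(len(fixed_ratios) * (d ** 2) for d in densities)
print(num_priors)  # 3 * 9 + 3 * 16 = 75 boxes per input position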
python/paddle/fluid/layers/nn.py
...
@@ -160,10 +160,12 @@ __all__ = [
     'affine_grid',
     'sequence_reverse',
     'affine_channel',
+    'similarity_focus',
     'hash',
     'grid_sampler',
     'log_loss',
     'add_position_encoding',
+    'bilinear_tensor_product',
 ]
...
@@ -4065,8 +4067,8 @@ def edit_distance(input, label, normalized=True, ignored_tokens=None):
     Examples:
         .. code-block:: python

-            x = fluid.layers.data(name='x', shape=[8], dtype='float32')
-            y = fluid.layers.data(name='y', shape=[7], dtype='float32')
+            x = fluid.layers.data(name='x', shape=[1], dtype='float32')
+            y = fluid.layers.data(name='y', shape=[1], dtype='float32')
             cost = fluid.layers.edit_distance(input=x, label=y)
     """
     helper = LayerHelper("edit_distance", **locals())
...
@@ -4740,7 +4742,8 @@ def softmax_with_cross_entropy(logits,
                                label,
                                soft_label=False,
                                ignore_index=-100,
-                               numeric_stable_mode=False):
+                               numeric_stable_mode=False,
+                               return_softmax=False):
     """
     **Softmax With Cross Entropy Operator.**
...
@@ -4804,9 +4807,15 @@ def softmax_with_cross_entropy(logits,
                                          the algorithm is always numerically stable.
                                          Note that the speed may be slower when use
                                          stable algorithm. Default: False
+        return_softmax (bool): A flag indicating whether to return the softmax
+                               along with the cross entropy loss. Default: False

     Returns:
-        Variable: The cross entropy loss is a 2-D tensor with shape [N x 1].
+        Variable or Tuple of two Variables: Return the cross entropy loss if
+                              `return_softmax` is False, otherwise the tuple
+                              (loss, softmax), where the cross entropy loss is
+                              a 2-D tensor with shape [N x 1], and softmax is a
+                              2-D tensor with shape [N x K].

     Examples:
         .. code-block:: python
...
@@ -4831,6 +4840,10 @@ def softmax_with_cross_entropy(logits,
               'ignore_index': ignore_index,
               'numeric_stable_mode': numeric_stable_mode
         })
+
+    if return_softmax:
+        return loss, softmax
+
     return loss
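A minimal sketch of the new flag in use (the data layers are illustrative): with return_softmax=True the call yields the (loss, softmax) pair, so callers no longer need a separate softmax op to inspect the probabilities.

import paddle.fluid as fluid

logits = fluid.layers.data(name='logits', shape=[10], dtype='float32')
label = fluid.layers.data(name='label', shape=[1], dtype='int64')
loss, softmax = fluid.layers.softmax_with_cross_entropy(
    logits, label, return_softmax=True)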
...
@@ -7933,6 +7946,118 @@ def affine_channel(x, scale=None, bias=None, data_layout='NCHW', name=None):
    return out


def similarity_focus(input, axis, indexes, name=None):
    """
    SimilarityFocus Operator

    Generate a similarity focus mask with the same shape of input using the following method:

    1. Extract the 3-D tensor (here the first dimension is BatchSize) corresponding
       to the axis according to the indexes. For example, if axis=1 and indexes=[a],
       it will get the matrix T=X[:, a, :, :]. In this case, if the shape of input X
       is (BatchSize, A, B, C), the shape of tensor T is (BatchSize, B, C).
    2. For each index, find the largest numbers in the tensor T, so that the same
       row and same column has at most one number (that is, once the largest number
       has been found in the i-th row and the j-th column, the remaining numbers in
       the i-th row and j-th column are skipped, and the next largest number is
       selected from the remaining numbers; obviously there will be min(B, C)
       numbers), and mark the corresponding position of the 3-D similarity focus
       mask as 1, otherwise as 0. Do an elementwise-or over the masks of all
       indexes.
    3. Broadcast the 3-D similarity focus mask to the same shape of input X.

    Refer to `Similarity Focus Layer <http://www.aclweb.org/anthology/N16-1108>`_

    .. code-block:: text

        * Example :

            Given a 4-D tensor x with the shape (BatchSize, C, A, B), where C is
            the number of channels and the shape of feature map is (A, B):
                x.shape = (2, 3, 2, 2)
                x.data = [[[[0.8, 0.1],
                            [0.4, 0.5]],
                           [[0.9, 0.7],
                            [0.9, 0.9]],
                           [[0.8, 0.9],
                            [0.1, 0.2]]],
                          [[[0.2, 0.5],
                            [0.3, 0.4]],
                           [[0.9, 0.7],
                            [0.8, 0.4]],
                           [[0.0, 0.2],
                            [0.4, 0.7]]]]

            Given axis: 1 (the axis of the channel)
            Given indexes: [0]

            then we get a 4-D tensor out with the same shape of input x:
                out.shape = (2, 3, 2, 2)
                out.data = [[[[1.0, 0.0],
                              [0.0, 1.0]],
                             [[1.0, 0.0],
                              [0.0, 1.0]],
                             [[1.0, 0.0],
                              [0.0, 1.0]]],
                            [[[0.0, 1.0],
                              [1.0, 0.0]],
                             [[0.0, 1.0],
                              [1.0, 0.0]],
                             [[0.0, 1.0],
                              [1.0, 0.0]]]]

    Args:
        input(Variable): The input tensor variable (default float). It should
            be a 4-D tensor with shape [BatchSize, A, B, C].
        axis(int): Indicating the dimension to be selected. It can only be
            1, 2 or 3.
        indexes(list): Indicating the indexes of the selected dimension.

    Returns:
        Variable: A tensor variable with the same shape and same type
            as the input.

    Examples:
        .. code-block:: python

            data = fluid.layers.data(
                name='data', shape=[2, 3, 2, 2], dtype='float32')
            x = fluid.layers.similarity_focus(input=data, axis=1, indexes=[0])
    """
    helper = LayerHelper('similarity_focus', **locals())
    # check attrs
    if isinstance(axis, int) is False:
        raise TypeError("axis must be int type.")
    if isinstance(indexes, list) is False:
        raise TypeError("indexes must be list type.")
    if axis != 1 and axis != 2 and axis != 3:
        raise ValueError("axis must be 1, 2 or 3.")
    if len(indexes) == 0:
        raise ValueError("indexes can not be empty.")

    if name is None:
        out = helper.create_variable_for_type_inference(dtype=input.dtype)
    else:
        out = helper.create_variable(
            name=name, dtype=input.dtype, persistable=False)
    helper.append_op(
        type='similarity_focus',
        inputs={'X': input},
        outputs={'Out': out},
        attrs={"axis": axis,
               "indexes": indexes})
    return out


def hash(input, hash_size, num_hash=1, name=None):
    """
    Hash the input to an integer whose value is less than the given hash size.
...
@@ -8176,3 +8301,72 @@ def add_position_encoding(input, alpha, beta, name=None):
        attrs={"alpha": alpha,
               "beta": beta})
    return out


def bilinear_tensor_product(x,
                            y,
                            size,
                            act=None,
                            name=None,
                            param_attr=None,
                            bias_attr=None):
    """
    **Add Bilinear Tensor Product Layer**

    This layer performs bilinear tensor product on two inputs.
    For example:

    .. math::
       out_{i} = x * W_{i} * {y^\mathrm{T}}, i=0,1,...,size-1

    In this formula:
      - :math:`x`: the first input, which contains M elements; shape is [batch_size, M].
      - :math:`y`: the second input, which contains N elements; shape is [batch_size, N].
      - :math:`W_{i}`: the i-th learned weight; shape is [M, N].
      - :math:`out_{i}`: the i-th element of out; shape is [batch_size, size].
      - :math:`y^\mathrm{T}`: the transpose of :math:`y`.

    Args:
        x (Variable): 2-D input tensor with shape [batch_size, M]
        y (Variable): 2-D input tensor with shape [batch_size, N]
        size (int): The dimension of this layer.
        act (str, default None): Activation to be applied to the output of this layer.
        name (str, default None): The name of this layer.
        param_attr (ParamAttr, default None): The parameter attribute for the
            learnable weights of this layer.
        bias_attr (ParamAttr, default None): The parameter attribute for the bias
            of this layer. If it is set to False, no bias will be added to the
            output units. If it is set to None, the bias is initialized zero.
            Default: None.

    Returns:
        Variable: A 2-D Tensor of shape [batch_size, size].

    Examples:
        .. code-block:: python

          tensor = bilinear_tensor_product(x=layer1, y=layer2, size=1000)
    """
    helper = LayerHelper('bilinear_tensor_product', **locals())
    dtype = helper.input_dtype('x')

    param_shape = [size, x.shape[1], y.shape[1]]

    w = helper.create_parameter(
        attr=helper.param_attr, shape=param_shape, dtype=dtype, is_bias=False)

    if name is None:
        out = helper.create_variable_for_type_inference(dtype=dtype)
    else:
        out = helper.create_variable(
            name=name, dtype=dtype, persistable=False)

    inputs = {"X": x, "Y": y, "Weight": w}
    if helper.bias_attr:
        bias_size = [1, size]
        bias = helper.create_parameter(
            attr=helper.bias_attr, shape=bias_size, dtype=dtype, is_bias=True)
        inputs["Bias"] = bias
    helper.append_op(
        type="bilinear_tensor_product", inputs=inputs, outputs={"Out": out})

    # add activation
    return helper.append_activation(out)
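A quick numpy check of the formula above under the stated shapes (pure numpy, independent of the layer; the einsum computes out[b, i] = x[b] @ W[i] @ y[b]^T):

import numpy as np

batch_size, M, N, size = 4, 3, 5, 2
x = np.random.rand(batch_size, M)
y = np.random.rand(batch_size, N)
W = np.random.rand(size, M, N)  # W[i] has shape [M, N]

out = np.einsum('bm,imn,bn->bi', x, W, y)
print(out.shape)  # (4, 2)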
python/paddle/fluid/layers/tensor.py
...
@@ -24,10 +24,10 @@ from .layer_function_generator import templatedoc
 import numpy

 __all__ = [
-    'create_tensor', 'create_parameter', 'create_global_var', 'cast', 'concat',
-    'sums', 'assign', 'fill_constant_batch_size_like', 'fill_constant',
-    'argmin', 'argmax', 'argsort', 'ones', 'zeros', 'reverse', 'has_inf',
-    'has_nan', 'isfinite'
+    'create_tensor', 'create_parameter', 'create_global_var', 'cast',
+    'tensor_array_to_tensor', 'concat', 'sums', 'assign',
+    'fill_constant_batch_size_like', 'fill_constant', 'argmin', 'argmax',
+    'argsort', 'ones', 'zeros', 'reverse', 'has_inf', 'has_nan', 'isfinite'
 ]
...
@@ -193,6 +193,60 @@ def concat(input, axis=0, name=None):
    return out


def tensor_array_to_tensor(input, axis=1, name=None):
    """
    This function concatenates the input LodTensorArray along the given axis
    and returns the result as the output.

    A simple example as below:

    .. code-block:: text

        Given:

        input.data = {[[0.6, 0.1, 0.3],
                       [0.5, 0.3, 0.2]],
                      [[1.3],
                       [1.8]],
                      [[2.3, 2.1],
                       [2.5, 2.4]]}

        axis = 1

        Then:

        output.data = [[0.6, 0.1, 0.3, 1.3, 2.3, 2.1],
                       [0.5, 0.3, 0.2, 1.8, 2.5, 2.4]]

        output_index.data = [3, 1, 2]

    Args:
        input(list): Input LodTensorArray
        axis(int): Integer axis along which the tensors will be concatenated
        name(str|None): A name for this layer (optional). If set None, the
                        layer will be named automatically.

    Returns:
        Variable: Output variable of the concatenation
        Variable: The input LodTensorArray items' dims along the axis

    Examples:
        .. code-block:: python

           output, output_index = fluid.layers.tensor_array_to_tensor(input=tensor_array)
    """
    helper = LayerHelper('tensor_array_to_tensor', **locals())
    out = helper.create_variable_for_type_inference(dtype=helper.input_dtype())
    out_index = helper.create_variable_for_type_inference(dtype="int32")
    helper.append_op(
        type='tensor_array_to_tensor',
        inputs={'X': input},
        outputs={'Out': [out],
                 'OutIndex': [out_index]},
        attrs={'axis': axis})
    return out, out_index


def sums(input, out=None):
    """
    This function performs the sum operation on the input and returns the
...
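A hedged end-to-end sketch of the new tensor_array_to_tensor layer above; building the array via array_write/increment is illustrative, and the data matches the docstring example:

import numpy as np
import paddle.fluid as fluid

x0 = fluid.layers.assign(
    np.array([[0.6, 0.1, 0.3], [0.5, 0.3, 0.2]], dtype='float32'))
x1 = fluid.layers.assign(np.array([[1.3], [1.8]], dtype='float32'))

i = fluid.layers.fill_constant(shape=[1], dtype='int64', value=0)
array = fluid.layers.array_write(x0, i)
i = fluid.layers.increment(i)
fluid.layers.array_write(x1, i, array=array)

output, output_index = fluid.layers.tensor_array_to_tensor(input=array, axis=1)

exe = fluid.Executor(fluid.CPUPlace())
out, idx = exe.run(fetch_list=[output, output_index])
print(out.shape)  # (2, 4)
print(idx)        # [3 1]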
python/paddle/fluid/optimizer.py
...
@@ -13,21 +13,23 @@
 # limitations under the License.

from __future__ import print_function

import re
import sys
from collections import defaultdict
from contextlib import contextmanager

from paddle.fluid.framework import Program, Variable, name_scope, default_main_program
from paddle.fluid.distribute_lookup_table import find_distributed_lookup_table

from . import framework
from . import layers
from . import unique_name
from .backward import append_backward
from .clip import append_gradient_clip_ops, error_clip_callback
from .framework import program_guard
from .initializer import Constant
from .layer_helper import LayerHelper
from .layers import ops
from .regularizer import append_regularization_ops

__all__ = [
    'SGD', 'Momentum', 'Adagrad', 'Adam', 'Adamax', 'DecayedAdagrad', 'Ftrl',
...
@@ -85,7 +87,7 @@ class Optimizer(object):
                 name=unique_name.generate("learning_rate"),
                 shape=[1],
                 value=float(self._learning_rate),
-                dtype='float32' if self._dtype == None else self._dtype,
+                dtype='float32' if self._dtype is None else self._dtype,
                 persistable=True)

     def _global_learning_rate(self, program=None):
...
@@ -245,6 +247,50 @@ class Optimizer(object):
        end = len(global_block.ops)
        return global_block._slice_ops(start, end)

    def _process_distribute_lookuptable(self, param_grads, loss,
                                        startup_program):
        """
        Because the distributed lookup table only supports the SGD optimizer
        for now, and supports neither other optimizers nor regularization, we
        must single the table parameter out, avoid adding regularization and
        other ops for it, and append an independent sgd optimize op for it.
        :param param_grads(list((Var, Var))): list of (param, grad) pairs.
        :param loss: the loss variable.
        :param startup_program: the startup program
        """
        program = loss.block.program
        table_name = find_distributed_lookup_table(program)
        table_param = None
        table_grad = None
        new_param_grads = []
        for p, g in param_grads:
            if p.name == table_name:
                if table_param is not None:
                    raise RuntimeError(
                        "multi dist table var found, only support one now!")
                table_param = p
                table_grad = g
            else:
                new_param_grads.append((p, g))
        sgd_op = None
        if table_param is not None:
            with program_guard(program, startup_program):
                param_and_grad = [table_param, table_grad]
                with table_param.block.program._optimized_guard(param_and_grad), \
                        framework.name_scope("optimizer"):
                    self._create_global_learning_rate()
                    # create the optimize op
                    sgd_op = loss.block.append_op(
                        type='sgd',
                        inputs={
                            "Param": table_param,
                            "Grad": table_grad,
                            "LearningRate":
                            self._create_param_lr(param_and_grad)
                        },
                        outputs={"ParamOut": param_and_grad[0]})
        return new_param_grads, (table_param, table_grad), sgd_op

    def minimize(self,
                 loss,
                 startup_program=None,
...
@@ -260,6 +306,9 @@ class Optimizer(object):
         params_grads = sorted(params_grads, key=lambda x: x[0].name)

+        params_grads, table_param_and_grad, table_optimize_op = \
+            self._process_distribute_lookuptable(params_grads, loss,
+                                                 startup_program)
+
         params_grads = append_gradient_clip_ops(params_grads)

         # Add regularization if any
...
@@ -268,6 +317,9 @@ class Optimizer(object):
         optimize_ops = self._create_optimization_pass(params_grads, loss,
                                                       startup_program)
+        if table_optimize_op is not None:
+            optimize_ops.append(table_optimize_op)
+            params_grads.append(table_param_and_grad)
+
         return optimize_ops, params_grads
...
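Taken together, minimize() now detours the distributed table parameter to a dedicated sgd op and reports it back in its return values. A hedged sketch of the user-visible effect (the network is illustrative; only a table built with is_distributed=True triggers this path):

import paddle.fluid as fluid

ids = fluid.layers.data(name='ids', shape=[1], dtype='int64')
emb = fluid.layers.embedding(
    input=ids, size=[10000, 64], is_distributed=True,
    param_attr=fluid.ParamAttr(name='shared_w'))
pred = fluid.layers.fc(input=emb, size=1)
loss = fluid.layers.mean(pred)

# 'shared_w' is optimized by a dedicated, unregularized sgd op, which is
# appended to the returned optimize_ops
optimize_ops, params_grads = fluid.optimizer.SGD(
    learning_rate=0.01).minimize(loss)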
python/paddle/fluid/tests/book/test_label_semantic_roles.py
...
@@ -38,7 +38,7 @@ depth = 8
 mix_hidden_lr = 1e-3

 IS_SPARSE = True
-PASS_NUM = 10
+PASS_NUM = 1
 BATCH_SIZE = 10

 embedding_name = 'emb'
...
python/paddle/fluid/tests/test_detection.py
...
@@ -128,6 +128,24 @@ class TestPriorBox(unittest.TestCase):
        assert box.shape[3] == 4


class TestDensityPriorBox(unittest.TestCase):
    def test_density_prior_box(self):
        data_shape = [3, 224, 224]
        images = fluid.layers.data(
            name='pixel', shape=data_shape, dtype='float32')
        conv1 = fluid.layers.conv2d(images, 3, 3, 2)
        box, var = layers.density_prior_box(
            input=conv1,
            image=images,
            densities=[3, 4],
            fixed_sizes=[50., 60.],
            fixed_ratios=[1.0],
            clip=True)
        assert len(box.shape) == 4
        assert box.shape == var.shape
        assert box.shape[3] == 4


class TestAnchorGenerator(unittest.TestCase):
    def test_anchor_generator(self):
        data_shape = [3, 224, 224]
...
python/paddle/fluid/tests/unittests/test_density_prior_box_op.py
0 → 100644
# Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from __future__ import print_function

import unittest
import numpy as np
import sys
import math
from op_test import OpTest


class TestDensityPriorBoxOp(OpTest):
    def set_data(self):
        self.init_test_params()
        self.init_test_input()
        self.init_test_output()
        self.inputs = {'Input': self.input, 'Image': self.image}

        self.attrs = {
            'variances': self.variances,
            'clip': self.clip,
            'step_w': self.step_w,
            'step_h': self.step_h,
            'offset': self.offset,
            'densities': self.densities,
            'fixed_sizes': self.fixed_sizes,
            'fixed_ratios': self.fixed_ratios
        }

        self.outputs = {'Boxes': self.out_boxes, 'Variances': self.out_var}

    def test_check_output(self):
        self.check_output()

    def setUp(self):
        self.op_type = "density_prior_box"
        self.set_data()

    def set_density(self):
        self.densities = []
        self.fixed_sizes = []
        self.fixed_ratios = []

    def init_test_params(self):
        self.layer_w = 32
        self.layer_h = 32

        self.image_w = 40
        self.image_h = 40

        self.step_w = float(self.image_w) / float(self.layer_w)
        self.step_h = float(self.image_h) / float(self.layer_h)

        self.input_channels = 2
        self.image_channels = 3
        self.batch_size = 10

        self.variances = [0.1, 0.1, 0.2, 0.2]
        self.variances = np.array(self.variances, dtype=np.float).flatten()

        self.set_density()

        self.clip = True
        self.num_priors = 0
        if len(self.fixed_sizes) > 0 and len(self.densities) > 0:
            for density in self.densities:
                if len(self.fixed_ratios) > 0:
                    self.num_priors += len(self.fixed_ratios) * (pow(density,
                                                                     2))
        self.offset = 0.5

    def init_test_input(self):
        self.image = np.random.random(
            (self.batch_size, self.image_channels, self.image_w,
             self.image_h)).astype('float32')

        self.input = np.random.random(
            (self.batch_size, self.input_channels, self.layer_w,
             self.layer_h)).astype('float32')

    def init_test_output(self):
        out_dim = (self.layer_h, self.layer_w, self.num_priors, 4)
        out_boxes = np.zeros(out_dim).astype('float32')
        out_var = np.zeros(out_dim).astype('float32')

        step_average = int((self.step_w + self.step_h) * 0.5)
        for h in range(self.layer_h):
            for w in range(self.layer_w):
                idx = 0
                c_x = (w + self.offset) * self.step_w
                c_y = (h + self.offset) * self.step_h
                # Generate density prior boxes with fixed size
                for density, fixed_size in zip(self.densities,
                                               self.fixed_sizes):
                    if (len(self.fixed_ratios) > 0):
                        for ar in self.fixed_ratios:
                            shift = int(step_average / density)
                            box_width_ratio = fixed_size * math.sqrt(ar)
                            box_height_ratio = fixed_size / math.sqrt(ar)
                            for di in range(density):
                                for dj in range(density):
                                    c_x_temp = c_x - step_average / 2.0 + shift / 2.0 + dj * shift
                                    c_y_temp = c_y - step_average / 2.0 + shift / 2.0 + di * shift
                                    out_boxes[h, w, idx, :] = [
                                        max((c_x_temp - box_width_ratio / 2.0)
                                            / self.image_w, 0),
                                        max((c_y_temp - box_height_ratio / 2.0)
                                            / self.image_h, 0),
                                        min((c_x_temp + box_width_ratio / 2.0)
                                            / self.image_w, 1),
                                        min((c_y_temp + box_height_ratio / 2.0)
                                            / self.image_h, 1)
                                    ]
                                    idx += 1
        if self.clip:
            out_boxes = np.clip(out_boxes, 0.0, 1.0)
        out_var = np.tile(self.variances,
                          (self.layer_h, self.layer_w, self.num_priors, 1))
        self.out_boxes = out_boxes.astype('float32')
        self.out_var = out_var.astype('float32')


class TestDensityPriorBox(TestDensityPriorBoxOp):
    def set_density(self):
        self.densities = [3, 4]
        self.fixed_sizes = [1.0, 2.0]
        self.fixed_ratios = [1.0]


if __name__ == '__main__':
    unittest.main()
python/paddle/fluid/tests/unittests/test_dist_transpiler.py
...
@@ -567,7 +567,6 @@ class TestDistLookupTable(TestDistLookupTableBase):
            'fill_constant', 'fill_constant', 'fill_constant', 'fill_constant',
            'fill_constant', 'fill_constant', 'fill_constant', 'fill_constant',
            'fill_constant', 'fill_constant', 'fill_constant', 'fill_constant',
            'fill_constant', 'fill_constant', 'fill_constant', 'uniform_random',
            'fill_constant', 'fill_constant', 'uniform_random', 'uniform_random',
            'recv', 'recv', 'recv', 'fetch_barrier', 'concat',
            'fake_init'
...
@@ -639,7 +638,7 @@ class TestAsyncDistLookupTable(TestDistLookupTableBase):
        # 5 save table
        self.assertEqual([op.type for op in pserver1.blocks[5].ops], ["save"])

-       trainer, _ = self.get_trainer(config)
+       trainer, trainer_startup = self.get_trainer(config)
        self.assertEqual(len(trainer.blocks), 1)
        ops = [
            'split_ids', 'prefetch', 'merge_ids', 'sequence_pool',
...
@@ -653,6 +652,16 @@ class TestAsyncDistLookupTable(TestDistLookupTableBase):
            'recv', 'concat'
        ]
        self.assertEqual([op.type for op in trainer.blocks[0].ops], ops)
        startup_ops = [
            'fill_constant', 'fill_constant', 'fill_constant', 'fill_constant',
            'fill_constant', 'fill_constant', 'fill_constant', 'fill_constant',
            'fill_constant', 'fill_constant', 'fill_constant', 'fill_constant',
            'fill_constant', 'fill_constant', 'uniform_random',
            'uniform_random', 'recv', 'recv', 'recv', 'fetch_barrier', 'concat',
            'fake_init'
        ]
        self.assertEqual([op.type for op in trainer_startup.blocks[0].ops],
                         startup_ops)


class TestDistLookupTableSliceSize(TestDistLookupTableBase):
...
python/paddle/fluid/tests/unittests/test_layers.py
...
@@ -369,6 +369,10 @@ class TestBook(unittest.TestCase):
        with program_guard(program):
            x = layers.data(name='x', shape=[16], dtype='float32')
            y = layers.data(name='label', shape=[1], dtype='int64')
            loss, softmax = layers.softmax_with_cross_entropy(
                x, y, return_softmax=True)
            self.assertIsNotNone(loss)
            self.assertIsNotNone(softmax)
            loss = layers.softmax_with_cross_entropy(x, y)
            self.assertIsNotNone(loss)
            print(str(program))
...
@@ -911,6 +915,16 @@ class TestBook(unittest.TestCase):
            self.assertIsNotNone(data_1)
            print(str(program))

    def test_bilinear_tensor_product_layer(self):
        program = Program()
        with program_guard(program):
            data = layers.data(name='data', shape=[4], dtype="float32")

            theta = layers.data(name="theta", shape=[5], dtype="float32")
            out = layers.bilinear_tensor_product(data, theta, 6)

            print(str(program))


if __name__ == '__main__':
    unittest.main()
python/paddle/fluid/tests/unittests/test_parallel_executor_dry_run.py
...
@@ -21,8 +21,8 @@ import six
 class TestBase(unittest.TestCase):
     def main(self,
              network_func,
-             iter=100,
-             iter_per_pe=100,
+             iter=10,
+             iter_per_pe=10,
              use_gpu=True,
              use_experimental_executor=False):
         if use_gpu and not fluid.core.is_compiled_with_cuda():
...
@@ -45,7 +45,7 @@ class TestBase(unittest.TestCase):
             exe_strategy._dry_run = True
             exe_strategy.use_experimental_executor = use_experimental_executor
             pe = fluid.ParallelExecutor(
-                use_cuda=True,
+                use_cuda=use_gpu,
                 loss_name=loss.name,
                 main_program=main_prog,
                 exec_strategy=exe_strategy)
...
python/paddle/fluid/tests/unittests/test_similarity_focus_op.py
0 → 100755
# Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from __future__ import print_function

import unittest
import numpy as np
import paddle.fluid.core as core
from op_test import OpTest


class TestSimilarityFocusOp(OpTest):
    def setUp(self):
        self.op_type = "similarity_focus"
        batch_size = 2
        x_dim, y_dim, z_dim = 3, 2, 2
        self.inputs = {
            'X': np.array([[[[0.8, 0.1], [0.4, 0.5]],
                            [[0.9, 0.7], [0.9, 0.9]],
                            [[0.8, 0.9], [0.1, 0.2]]],
                           [[[0.2, 0.5], [0.3, 0.4]],
                            [[0.9, 0.7], [0.8, 0.4]],
                            [[0.0, 0.2], [0.4, 0.7]]]]),
        }
        self.attrs = {
            'axis': 1,
            'indexes': [0],
        }

        output = None
        for batch in range(batch_size):
            res = np.zeros((1, y_dim, z_dim)).astype("float32").reshape(-1)
            for index in self.attrs['indexes']:
                channel = self.inputs['X'][batch, index, :, :].reshape(
                    -1).copy()
                tag1 = [0 for i in range(y_dim)]
                tag2 = [0 for i in range(z_dim)]
                cnt = 0
                for i in range(channel.size):
                    index = channel.argmax()
                    idx1 = index // z_dim
                    idx2 = index % z_dim
                    if tag1[idx1] + tag2[idx2] == 0:
                        tag1[idx1] = 1
                        tag2[idx2] = 1
                        res[index] = 1
                        cnt += 1
                        if cnt == min(y_dim, z_dim):
                            break
                    channel[index] = -1
            res = res.reshape(1, y_dim, z_dim).repeat([x_dim], axis=0)
            res = res.reshape(1, x_dim, y_dim, z_dim)
            if output is not None:
                output = np.concatenate((output, res), axis=0)
            else:
                output = res
        self.outputs = {'Out': output}

    def test_check_output(self):
        self.check_output()


class TestSimilarityFocusOp_axis1(OpTest):
    def setUp(self):
        self.op_type = "similarity_focus"
        batch_size = 3
        x_dim, y_dim, z_dim = 4, 5, 6
        self.inputs = {
            'X': np.random.random(
                (batch_size, x_dim, y_dim, z_dim)).astype("float32"),
        }
        self.attrs = {
            'axis': 1,
            'indexes': [0, 3],
        }

        output = None
        for batch in range(batch_size):
            res = np.zeros((1, y_dim, z_dim)).astype("float32").reshape(-1)
            for index in self.attrs['indexes']:
                channel = self.inputs['X'][batch, index, :, :].reshape(
                    -1).copy()
                tag1 = [0 for i in range(y_dim)]
                tag2 = [0 for i in range(z_dim)]
                cnt = 0
                for i in range(channel.size):
                    index = channel.argmax()
                    idx1 = index // z_dim
                    idx2 = index % z_dim
                    if tag1[idx1] + tag2[idx2] == 0:
                        tag1[idx1] = 1
                        tag2[idx2] = 1
                        res[index] = 1
                        cnt += 1
                        if cnt == min(y_dim, z_dim):
                            break
                    channel[index] = -1
            res = res.reshape(1, y_dim, z_dim)
            res = res.repeat([x_dim], axis=0)
            res = res.reshape(1, x_dim, y_dim, z_dim)
            if output is not None:
                output = np.concatenate((output, res), axis=0)
            else:
                output = res
        self.outputs = {'Out': output}

    def test_check_output(self):
        self.check_output()


class TestSimilarityFocusOp_axis2(OpTest):
    def setUp(self):
        self.op_type = "similarity_focus"
        batch_size = 6
        x_dim, y_dim, z_dim = 7, 8, 9
        self.inputs = {
            'X': np.random.random(
                (batch_size, x_dim, y_dim, z_dim)).astype("float32"),
        }
        self.attrs = {
            'axis': 2,
            'indexes': [0, 3, 5],
        }

        output = None
        for batch in range(batch_size):
            res = np.zeros((x_dim, 1, z_dim)).astype("float32").reshape(-1)
            for index in self.attrs['indexes']:
                channel = self.inputs['X'][batch, :, index, :].reshape(
                    -1).copy()
                tag1 = [0 for i in range(x_dim)]
                tag2 = [0 for i in range(z_dim)]
                cnt = 0
                for i in range(channel.size):
                    index = channel.argmax()
                    idx1 = index // z_dim
                    idx2 = index % z_dim
                    if tag1[idx1] + tag2[idx2] == 0:
                        tag1[idx1] = 1
                        tag2[idx2] = 1
                        res[index] = 1
                        cnt += 1
                        if cnt == min(x_dim, z_dim):
                            break
                    channel[index] = -1
            res = res.reshape(x_dim, 1, z_dim)
            res = res.repeat([y_dim], axis=1)
            res = res.reshape(1, x_dim, y_dim, z_dim)
            if output is not None:
                output = np.concatenate((output, res), axis=0)
            else:
                output = res
        self.outputs = {'Out': output}

    def test_check_output(self):
        self.check_output()


class TestSimilarityFocusOp_axis3(OpTest):
    def setUp(self):
        self.op_type = "similarity_focus"
        batch_size = 64
        x_dim, y_dim, z_dim = 48, 48, 13
        self.inputs = {
            'X': np.random.random(
                (batch_size, x_dim, y_dim, z_dim)).astype("float32"),
        }
        self.attrs = {
            'axis': 3,
            'indexes': [0, 2, 7, 9],
        }

        output = None
        for batch in range(batch_size):
            res = np.zeros((x_dim, y_dim, 1)).astype("float32").reshape(-1)
            for index in self.attrs['indexes']:
                channel = self.inputs['X'][batch, :, :, index].reshape(
                    -1).copy()
                tag1 = [0 for i in range(x_dim)]
                tag2 = [0 for i in range(y_dim)]
                cnt = 0
                for i in range(channel.size):
                    index = channel.argmax()
                    idx1 = index // y_dim
                    idx2 = index % y_dim
                    if tag1[idx1] + tag2[idx2] == 0:
                        tag1[idx1] = 1
                        tag2[idx2] = 1
                        res[index] = 1
                        cnt += 1
                        if cnt == min(x_dim, y_dim):
                            break
                    channel[index] = -1
            res = res.reshape(x_dim, y_dim, 1)
            res = res.repeat([z_dim], axis=2)
            res = res.reshape(1, x_dim, y_dim, z_dim)
            if output is not None:
                output = np.concatenate((output, res), axis=0)
            else:
                output = res
        self.outputs = {'Out': output}

    def test_check_output(self):
        self.check_output()


if __name__ == "__main__":
    unittest.main()
python/paddle/fluid/tests/unittests/test_tensor_array_to_tensor.py
0 → 100644
# Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from __future__ import print_function

import unittest
import numpy
import paddle.fluid as fluid
import paddle.fluid.core as core
from paddle.fluid.op import Operator
from paddle.fluid.executor import Executor


class TestLoDTensorArrayConcat(unittest.TestCase):
    def setUp(self):
        self.op_type = "tensor_array_to_tensor"
        self.attrs = {"axis": 0}
        self.outputs = ["Out"]

    def test_get_set(self):
        scope = core.Scope()
        program = fluid.Program()
        block = program.global_block()

        input_arr = block.create_var(
            name="tmp_lod_tensor_array",
            type=core.VarDesc.VarType.LOD_TENSOR_ARRAY)
        input_arr.persistable = True
        input_arr_var = scope.var('tmp_lod_tensor_array')
        input_tensor_array = input_arr_var.get_lod_tensor_array()
        self.assertEqual(0, len(input_tensor_array))

        cpu = core.CPUPlace()
        for i in range(10):
            t = core.LoDTensor()
            if i == 0:
                t.set(numpy.array([[i], [i]], dtype='float32'), cpu)
            else:
                t.set(numpy.array([[i]], dtype='float32'), cpu)
            input_tensor_array.append(t)

        self.assertEqual(10, len(input_tensor_array))

        random_grad = numpy.random.random_sample([11]).astype(numpy.float32)

        y_out = block.create_var(name="Out")
        y_out.persistable = True
        y_out_index = block.create_var(name="OutIndex")
        y_out_index.persistable = True

        y_grad_arr = block.create_var(
            name='Out@GRAD', dtype='float32', shape=[11])
        y_grad_arr.persistable = True
        y_grad = scope.var('Out@GRAD')
        y_grad_tensor = y_grad.get_tensor()
        y_grad_tensor.set(random_grad, cpu)

        op = block.append_op(
            type=self.op_type,
            inputs={"X": input_arr},
            outputs={"Out": y_out,
                     "OutIndex": y_out_index},
            attrs=self.attrs)

        out_grad = block.create_var(
            name="tmp_lod_tensor_array@GRAD",
            type=core.VarDesc.VarType.LOD_TENSOR_ARRAY)
        out_grad.persistable = True

        grad_op_desc_list, op_grad_to_var = core.get_grad_op_desc(op.desc,
                                                                  set(), [])
        grad_op_desc = grad_op_desc_list[0]
        new_op_desc = block.desc.append_op()
        new_op_desc.copy_from(grad_op_desc)
        for var_name in grad_op_desc.output_arg_names():
            block.desc.var(var_name.encode("ascii"))

        grad_op_desc.infer_var_type(block.desc)
        grad_op_desc.infer_shape(block.desc)
        for arg in grad_op_desc.output_arg_names():
            grad_var = block.desc.find_var(arg.encode("ascii"))
            grad_var.set_dtype(core.VarDesc.VarType.FP32)

        fetch_list = []
        fetch_list.append(block.var('Out'))
        fetch_list.append(block.var('OutIndex'))

        exe = fluid.Executor(fluid.CPUPlace())
        out = exe.run(program, fetch_list=fetch_list, scope=scope)
        #print ("index: ", numpy.array(out[1]))

        # test forward
        tensor_res = numpy.array(out[0])
        tensor_res_out_idx = numpy.array(out[1])
        tensor_gt = numpy.array(
            [0] + [0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype='float32')

        self.assertEqual(len(tensor_res), len(tensor_gt))
        self.assertEqual(len(tensor_res_out_idx), 10)

        for i in range(len(tensor_res)):
            self.assertEqual(tensor_res[i], tensor_gt[i])

        for i in range(len(tensor_res_out_idx)):
            if i == 0:
                self.assertEqual(tensor_res_out_idx[i], 2)
            else:
                self.assertEqual(tensor_res_out_idx[i], 1)

        # test backward
        grad_tensor = scope.var('tmp_lod_tensor_array@GRAD')
        grad_tensor_array = grad_tensor.get_lod_tensor_array()

        self.assertEqual(10, len(grad_tensor_array))

        for i in range(len(grad_tensor_array)):
            if i == 0:
                self.assertEqual(
                    numpy.array(grad_tensor_array[i])[0],
                    numpy.array(random_grad[i]))
                self.assertEqual(
                    numpy.array(grad_tensor_array[i])[1],
                    numpy.array(random_grad[i + 1]))
            if i == 1:
                self.assertEqual(
                    numpy.array(grad_tensor_array[i]),
                    numpy.array(random_grad[i + 1]))


if __name__ == '__main__':
    unittest.main()
python/paddle/fluid/transpiler/distribute_transpiler.py
...
@@ -31,18 +31,17 @@ Steps to transpile pserver:
 """

 import math
-import sys
 import numpy as np
 import collections
-import six
 import logging

-from .ps_dispatcher import RoundRobin, HashName, PSDispatcher
+from .ps_dispatcher import RoundRobin, PSDispatcher
 from .. import core, framework, unique_name
 from ..framework import Program, default_main_program, \
                         default_startup_program, Block, \
                         Parameter, grad_var_name
 from .details import *
+from ..distribute_lookup_table import find_distributed_lookup_table
 from functools import reduce

 LOOKUP_TABLE_TYPE = "lookup_table"
...
@@ -292,7 +291,8 @@ class DistributeTranspiler(object):
         self.optimize_ops, self.params_grads = self._get_optimize_pass()

         ps_dispatcher = self.config.split_method(self.pserver_endpoints)
-        self.has_distributed_lookup_table = self._has_distributed_lookup_table()
+        self.table_name = find_distributed_lookup_table(self.origin_program)
+        self.has_distributed_lookup_table = self.table_name != None
         self.param_name_to_grad_name = dict()
         self.grad_name_to_param_name = dict()
         for param_var, grad_var in self.params_grads:
...
@@ -966,28 +966,6 @@ to transpile() call.")

     # ====================== private transpiler functions =====================

-    def _has_distributed_lookup_table(self):
-        # process lookup_table_op
-        # 1. check all lookup_table_op is distributed
-        # 2. check all lookup_table_op share the same table.
-        distributed_lookup_table_ops = []
-        # support only one distributed_lookup_table now
-        self.table_name = None
-        for op in self.origin_program.global_block().ops:
-            if op.type == LOOKUP_TABLE_TYPE:
-                if op.attr('is_distributed') is True:
-                    if self.table_name is None:
-                        self.table_name = op.input("W")[0]
-                    if self.table_name != op.input("W")[0]:
-                        raise RuntimeError("all distributed lookup_table_ops"
-                                           " should have only one table")
-                    distributed_lookup_table_ops.append(op)
-                else:
-                    if self.table_name is not None:
-                        assert op.input("W")[0] != self.table_name
-
-        return len(distributed_lookup_table_ops) > 0
-
     def _update_dist_lookup_table_vars(self, param_list, grad_list,
                                        params_grads):
         # TODO(wuyi): put find a way to put dist lookup table stuff all together.
...
@@ -1341,7 +1319,6 @@ to transpile() call.")
         """
         create a new block to handle save checkpoint.
         """
-        import os

         pserver_program.global_block().create_var(
             name="kLookupTablePath",
...
python/setup.py.in
...
@@ -174,6 +174,18 @@ if '${CMAKE_BUILD_TYPE}' == 'Release':
                 raise Exception("patch libmkldnn.so failed, command: %s" % command)
         package_data['paddle.libs']+=['libmkldnn.so.0']
         shutil.copy('${MKLDNN_SHARED_LIB}', libs_path)
+if '${WITH_NGRAPH}' == 'ON':
+    if '${CMAKE_BUILD_TYPE}' == 'Release':
+        # only change rpath in Release mode.
+        command = "patchelf --set-rpath '$ORIGIN/' ${NGRAPH_SHARED_LIB}"
+        if os.system(command) != 0:
+            raise Exception("patch ${NGRAPH_SHARED_LIB_NAME} failed, command: %s" % command)
+    shutil.copy('${NGRAPH_SHARED_LIB}', libs_path)
+    shutil.copy('${NGRAPH_CPU_LIB}', libs_path)
+    shutil.copy('${NGRAPH_TBB_LIB}', libs_path)
+    package_data['paddle.libs']+=['${NGRAPH_SHARED_LIB_NAME}',
+                                  '${NGRAPH_CPU_LIB_NAME}',
+                                  '${NGRAPH_TBB_LIB_NAME}']
 # remove unused  paddle/libs/__init__.py
 os.remove(libs_path+'/__init__.py')
 package_dir['paddle.libs']=libs_path
...