Commit 31363c3f
Authored Mar 23, 2022 by phlrain
Repository: Crayon鑫 / Paddle (forked from PaddlePaddle / Paddle)

    Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle
    into add_some_yaml_config

Parents: da478d1e, 3980e222
Showing 50 changed files with 3,144 additions and 3,606 deletions.
paddle/fluid/framework/infershape_utils.cc                                +158 −197
paddle/fluid/framework/infershape_utils.h                                 +59  −1
paddle/fluid/framework/new_executor/workqueue/workqueue.h                 +58  −0
paddle/fluid/framework/new_executor/workqueue/workqueue_test.cc           +6   −0
paddle/fluid/operators/deformable_conv_func.h                             +0   −149
paddle/fluid/operators/deformable_conv_op.cc                              +9   −159
paddle/fluid/operators/deformable_conv_op.cu                              +0   −643
paddle/fluid/operators/deformable_conv_op.h                               +0   −509
paddle/fluid/operators/deformable_conv_v1_op.cc                           +9   −132
paddle/fluid/operators/deformable_conv_v1_op.cu                           +0   −604
paddle/fluid/operators/deformable_conv_v1_op.h                            +0   −556
paddle/fluid/operators/flatten_op.cc                                      +13  −83
paddle/fluid/operators/flatten_op.cu.cc                                   +0   −31
paddle/fluid/operators/flatten_op.h                                       +0   −41
paddle/fluid/operators/flatten_op_xpu.cc                                  +0   −23
paddle/fluid/pybind/eager_method.cc                                       +56  −9
paddle/fluid/pybind/imperative.cc                                         +1   −0
paddle/fluid/pybind/pybind.cc                                             +2   −0
paddle/phi/api/include/tensor.h                                           +1   −3
paddle/phi/api/lib/tensor_method.cc                                       +6   −6
paddle/phi/infermeta/multiary.cc                                          +209 −0
paddle/phi/infermeta/multiary.h                                           +13  −0
paddle/phi/infermeta/unary.cc                                             +16  −0
paddle/phi/infermeta/unary.h                                              +6   −0
paddle/phi/kernels/CMakeLists.txt                                         +3   −1
paddle/phi/kernels/cpu/deformable_conv_grad_kernel.cc                     +333 −0
paddle/phi/kernels/cpu/deformable_conv_kernel.cc                          +0   −120
paddle/phi/kernels/deformable_conv_grad_kernel.h                          +39  −0
paddle/phi/kernels/deformable_conv_kernel.h                               +2   −1
paddle/phi/kernels/flatten_grad_kernel.cc                                 +1   −0
paddle/phi/kernels/flatten_kernel.cc                                      +1   −1
paddle/phi/kernels/funcs/CMakeLists.txt                                   +1   −0
paddle/phi/kernels/funcs/deformable_conv_functor.cc                       +172 −0
paddle/phi/kernels/funcs/deformable_conv_functor.cu                       +185 −0
paddle/phi/kernels/funcs/deformable_conv_functor.h                        +74  −0
paddle/phi/kernels/gpu/deformable_conv_grad_kernel.cu                     +366 −0
paddle/phi/kernels/gpu/deformable_conv_kernel.cu                          +0   −134
paddle/phi/kernels/impl/deformable_conv_grad_kernel_impl.h                +364 −0
paddle/phi/kernels/impl/deformable_conv_kernel_impl.h                     +22  −68
paddle/phi/ops/compat/deformable_conv_sig.cc                              +28  −0
python/paddle/distributed/auto_parallel/dist_loader.py                    +26  −6
python/paddle/distributed/auto_parallel/dist_saver.py                     +241 −0
python/paddle/distributed/auto_parallel/engine.py                         +236 −117
python/paddle/distributed/auto_parallel/utils.py                          +8   −0
python/paddle/fluid/dygraph/varbase_patch_methods.py                      +32  −0
python/paddle/fluid/tests/unittests/auto_parallel/engine_api.py           +10  −5
python/paddle/fluid/tests/unittests/auto_parallel/engine_predict_api.py   +122 −0
python/paddle/fluid/tests/unittests/auto_parallel/test_engine_api.py      +28  −0
python/paddle/fluid/tests/unittests/test_egr_python_api.py                +54  −7
python/paddle/fluid/tests/unittests/test_inplace_eager_fluid.py           +174 −0
paddle/fluid/framework/infershape_utils.cc  (+158 −197)  (view file @ 31363c3f)

@@ -27,7 +27,6 @@ limitations under the License. */
 #include "paddle/phi/core/compat/op_utils.h"
 #include "paddle/phi/core/dense_tensor.h"
 #include "paddle/phi/core/infermeta_utils.h"
-#include "paddle/phi/core/meta_tensor.h"
 #include "paddle/phi/core/tensor_utils.h"

 namespace paddle {

@@ -101,235 +100,197 @@ class InferShapeArgumentMappingContext : public phi::ArgumentMappingContext {
   const InferShapeContext& ctx_;
 };

Removed: the inline definition of `class CompatMetaTensor : public phi::MetaTensor`
(constructor, defaulted copy/move constructors, deleted assignment operators, the
private GetRuntimeLoD/GetCompileTimeLoD/GetSelectedRows helpers and the
var_/is_runtime_ members). It moves verbatim into infershape_utils.h, shown in the
next section; its member functions are now defined out of line here.

Added:

int64_t CompatMetaTensor::numel() const {
  if (is_runtime_) {
    auto* var = BOOST_GET_CONST(Variable*, var_);
    return var->Get<Tensor>().numel();
  } else {
    auto* var = BOOST_GET_CONST(VarDesc*, var_);
    return var->ElementSize();
  }
}

DDim CompatMetaTensor::dims() const {
  if (is_runtime_) {
    auto* var = BOOST_GET_CONST(Variable*, var_);
    if (var->IsType<phi::DenseTensor>()) {
      return var->Get<phi::DenseTensor>().dims();
    } else if (var->IsType<phi::SelectedRows>()) {
      return var->Get<phi::SelectedRows>().dims();
    } else if (var->IsType<framework::LoDTensorArray>()) {
      // use tensor array size as dims
      auto& tensor_array = var->Get<framework::LoDTensorArray>();
      return phi::make_ddim({static_cast<int64_t>(tensor_array.size())});
    } else {
      PADDLE_THROW(platform::errors::Unimplemented(
          "Currently, only can get dims from DenseTensor or SelectedRows or "
          "DenseTensorArray."));
    }
  } else {
    auto* var = BOOST_GET_CONST(VarDesc*, var_);
    return var->GetShape().empty() ? phi::make_ddim({0UL})
                                   : phi::make_ddim(var->GetShape());
  }
}

phi::DataType CompatMetaTensor::dtype() const {
  if (is_runtime_) {
    auto* var = BOOST_GET_CONST(Variable*, var_);
    if (var->IsType<phi::DenseTensor>()) {
      return var->Get<phi::DenseTensor>().dtype();
    } else if (var->IsType<phi::SelectedRows>()) {
      return var->Get<phi::SelectedRows>().dtype();
    } else if (var->IsType<framework::LoDTensorArray>()) {
      // NOTE(chenweihang): do nothing
      // Unsupported get dtype from LoDTensorArray now
      return phi::DataType::UNDEFINED;
    } else {
      PADDLE_THROW(platform::errors::Unimplemented(
          "Currently, only can get dtype from DenseTensor or SelectedRows."));
    }
  } else {
    auto* var = BOOST_GET_CONST(VarDesc*, var_);
    return paddle::framework::TransToPhiDataType(var->GetDataType());
  }
}

DataLayout CompatMetaTensor::layout() const {
  if (is_runtime_) {
    auto* var = BOOST_GET_CONST(Variable*, var_);
    if (var->IsType<phi::DenseTensor>()) {
      return var->Get<phi::DenseTensor>().layout();
    } else if (var->IsType<phi::SelectedRows>()) {
      return var->Get<phi::SelectedRows>().layout();
    } else if (var->IsType<framework::LoDTensorArray>()) {
      // NOTE(chenweihang): do nothing
      // Unsupported get layout from LoDTensorArray now
      return phi::DataLayout::UNDEFINED;
    } else {
      PADDLE_THROW(platform::errors::Unimplemented(
          "Currently, only can get layout from DenseTensor or "
          "SelectedRows."));
    }
  } else {
    // NOTE(chenweihang): do nothing
    // Unsupported get layout for VarDesc now
    return DataLayout::UNDEFINED;
  }
}

void CompatMetaTensor::set_dims(const DDim& dims) {
  if (is_runtime_) {
    auto* var = BOOST_GET(Variable*, var_);
    if (var->IsType<phi::DenseTensor>()) {
      auto* tensor = var->GetMutable<phi::DenseTensor>();
      phi::DenseTensorUtils::GetMutableMeta(tensor)->dims = dims;
    } else if (var->IsType<phi::SelectedRows>()) {
      auto* tensor = var->GetMutable<phi::SelectedRows>()->mutable_value();
      phi::DenseTensorUtils::GetMutableMeta(tensor)->dims = dims;
    } else if (var->IsType<framework::LoDTensorArray>()) {
      auto* tensor_array = var->GetMutable<framework::LoDTensorArray>();
      // Note: Here I want enforce `tensor_array->size() == 0UL`, because
      // inplace using on LoDTensorArray is dangerous, but the unittest
      // `test_list` contains this behavior
      PADDLE_ENFORCE_EQ(dims.size(), 1UL,
                        platform::errors::InvalidArgument(
                            "LoDTensorArray can only have one dimension."));
      // only set the array size for LoDTensorArray input
      tensor_array->resize(dims[0]);
    } else {
      PADDLE_THROW(platform::errors::Unimplemented(
          "Currently, only can set dims from DenseTensor or SelectedRows."));
    }
  } else {
    auto* var = BOOST_GET(VarDesc*, var_);
    var->SetShape(vectorize(dims));
  }
}

void CompatMetaTensor::set_dtype(phi::DataType dtype) {
  if (is_runtime_) {
    auto* var = BOOST_GET(Variable*, var_);
    if (var->IsType<phi::DenseTensor>()) {
      auto* tensor = var->GetMutable<phi::DenseTensor>();
      phi::DenseTensorUtils::GetMutableMeta(tensor)->dtype = dtype;
    } else if (var->IsType<phi::SelectedRows>()) {
      auto* tensor = var->GetMutable<phi::SelectedRows>()->mutable_value();
      phi::DenseTensorUtils::GetMutableMeta(tensor)->dtype = dtype;
    } else if (var->IsType<framework::LoDTensorArray>()) {
      // NOTE(chenweihang): do nothing
      // Unsupported set dtype for LoDTensorArray now
    } else {
      PADDLE_THROW(platform::errors::Unimplemented(
          "Currently, only can set dtype from DenseTensor or SelectedRows."));
    }
  } else {
    auto* var = BOOST_GET(VarDesc*, var_);
    var->SetDataType(paddle::framework::TransToProtoVarType(dtype));
  }
}

void CompatMetaTensor::set_layout(DataLayout layout) {
  if (is_runtime_) {
    auto* var = BOOST_GET(Variable*, var_);
    if (var->IsType<phi::DenseTensor>()) {
      auto* tensor = var->GetMutable<phi::DenseTensor>();
      phi::DenseTensorUtils::GetMutableMeta(tensor)->layout = layout;
    } else if (var->IsType<phi::SelectedRows>()) {
      auto* tensor = var->GetMutable<phi::SelectedRows>()->mutable_value();
      phi::DenseTensorUtils::GetMutableMeta(tensor)->layout = layout;
    } else if (var->IsType<framework::LoDTensorArray>()) {
      // NOTE(chenweihang): do nothing
      // Unsupported set dtype for LoDTensorArray now
    } else {
      PADDLE_THROW(platform::errors::Unimplemented(
          "Currently, only can set layout from DenseTensor or "
          "SelectedRows."));
    }
  } else {
    // NOTE(chenweihang): do nothing
    // Unsupported set layout for VarDesc now
  }
}

void CompatMetaTensor::share_lod(const MetaTensor& meta_tensor) {
  if (is_runtime_) {
    auto* var = BOOST_GET(Variable*, var_);
    if (var->IsType<phi::DenseTensor>()) {
      auto* tensor = var->GetMutable<phi::DenseTensor>();
      phi::DenseTensorUtils::GetMutableMeta(tensor)->lod =
          static_cast<const CompatMetaTensor&>(meta_tensor).GetRuntimeLoD();
    } else {
      // NOTE(chenweihang): do nothing
      // only LoDTensor need to share lod
    }
  } else {
    auto* var = BOOST_GET(VarDesc*, var_);
    var->SetLoDLevel(
        static_cast<const CompatMetaTensor&>(meta_tensor).GetCompileTimeLoD());
  }
}

void CompatMetaTensor::share_dims(const MetaTensor& meta_tensor) {
  set_dims(meta_tensor.dims());
  if (is_runtime_) {
    auto* var = BOOST_GET(Variable*, var_);
    if (var->IsType<phi::SelectedRows>()) {
      auto* selected_rows = var->GetMutable<phi::SelectedRows>();
      auto& input_selected_rows =
          static_cast<const CompatMetaTensor&>(meta_tensor).GetSelectedRows();
      selected_rows->set_rows(input_selected_rows.rows());
      selected_rows->set_height(input_selected_rows.height());
    }
  }
}

void CompatMetaTensor::share_meta(const MetaTensor& meta_tensor) {
  share_dims(meta_tensor);
  set_dtype(meta_tensor.dtype());
  set_layout(meta_tensor.layout());
  // special case: share lod of LoDTensor
  share_lod(meta_tensor);
}

 phi::InferMetaContext BuildInferMetaContext(InferShapeContext* ctx,
                                             const std::string& op_type) {
 ...
paddle/fluid/framework/infershape_utils.h  (+59 −1)  (view file @ 31363c3f)

@@ -18,7 +18,7 @@ limitations under the License. */
 #include "paddle/fluid/framework/op_info.h"
 #include "paddle/fluid/framework/shape_inference.h"
+#include "paddle/phi/core/meta_tensor.h"

 namespace phi {
 class InferMetaContext;
 }  // namespace phi

@@ -39,5 +39,63 @@ phi::InferMetaContext BuildInferMetaContext(InferShapeContext* ctx,
   } \
 }

Added (moved here from infershape_utils.cc, with the method bodies now defined
out of line in that file):

// TODO(chenweihang): Support TensorArray later
class CompatMetaTensor : public phi::MetaTensor {
 public:
  CompatMetaTensor(InferShapeVarPtr var, bool is_runtime)
      : var_(std::move(var)), is_runtime_(is_runtime) {}

  CompatMetaTensor() = default;
  CompatMetaTensor(const CompatMetaTensor&) = default;
  CompatMetaTensor(CompatMetaTensor&&) = default;
  CompatMetaTensor& operator=(const CompatMetaTensor&) = delete;
  CompatMetaTensor& operator=(CompatMetaTensor&&) = delete;

  int64_t numel() const override;
  DDim dims() const override;
  phi::DataType dtype() const override;
  DataLayout layout() const override;
  void set_dims(const DDim& dims) override;
  void set_dtype(phi::DataType dtype) override;
  void set_layout(DataLayout layout) override;
  void share_lod(const MetaTensor& meta_tensor) override;
  void share_dims(const MetaTensor& meta_tensor) override;
  void share_meta(const MetaTensor& meta_tensor) override;

 private:
  const LoD& GetRuntimeLoD() const {
    auto* var = BOOST_GET_CONST(Variable*, var_);
    return var->Get<LoDTensor>().lod();
  }

  int32_t GetCompileTimeLoD() const {
    auto* var = BOOST_GET_CONST(VarDesc*, var_);
    return var->GetLoDLevel();
  }

  const phi::SelectedRows& GetSelectedRows() const {
    PADDLE_ENFORCE_EQ(is_runtime_, true,
                      platform::errors::Unavailable(
                          "Only can get Tensor from MetaTensor in rumtime."));
    auto* var = BOOST_GET_CONST(Variable*, var_);
    PADDLE_ENFORCE_EQ(var->IsType<phi::SelectedRows>(), true,
                      platform::errors::Unavailable(
                          "The Tensor in MetaTensor is not SelectedRows."));
    return var->Get<phi::SelectedRows>();
  }

  InferShapeVarPtr var_;
  bool is_runtime_;
};

 }  // namespace framework
 }  // namespace paddle
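The class above adapts a single fluid variable to the phi::MetaTensor
interface: var_ is an InferShapeVarPtr, a variant that holds a Variable* at
runtime and a VarDesc* at compile time, and every accessor branches on
is_runtime_ before extracting the matching alternative with BOOST_GET_CONST.
A minimal self-contained sketch of that dual-mode dispatch, using std::variant
and illustrative stand-in types (RuntimeVar and CompileTimeVar are not Paddle
types):

#include <cstdint>
#include <iostream>
#include <variant>
#include <vector>

struct RuntimeVar { std::vector<int64_t> shape; };      // stands in for Variable
struct CompileTimeVar { std::vector<int64_t> shape; };  // stands in for VarDesc

using VarPtr = std::variant<RuntimeVar*, CompileTimeVar*>;

// Mirrors the shape of CompatMetaTensor::dims(): pick the alternative that
// matches the phase, then read the metadata from it.
std::vector<int64_t> Dims(VarPtr var, bool is_runtime) {
  if (is_runtime) {
    return std::get<RuntimeVar*>(var)->shape;
  }
  return std::get<CompileTimeVar*>(var)->shape;
}

int main() {
  RuntimeVar rv{{2, 3}};
  std::cout << Dims(&rv, /*is_runtime=*/true)[1] << "\n";  // prints 3
  return 0;
}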
paddle/fluid/framework/new_executor/workqueue/workqueue.h  (+58 −0)  (view file @ 31363c3f)

@@ -15,9 +15,12 @@
 #pragma once

 #include <functional>
+#include <future>
 #include <memory>
 #include <string>
+#include <type_traits>
 #include <vector>
+#include "paddle/fluid/platform/enforce.h"

 namespace paddle {
 namespace framework {

@@ -25,6 +28,29 @@ namespace framework {
 constexpr const char* kQueueEmptyEvent = "QueueEmpty";
 constexpr const char* kQueueDestructEvent = "QueueDestruct";

+// For std::function
+// https://stackoverflow.com/questions/25421346/how-to-create-an-stdfunction-from-a-move-capturing-lambda-expression
+template <typename OnlyMovable>
+class FakeCopyable {
+ public:
+  explicit FakeCopyable(OnlyMovable&& obj) : obj_(std::move(obj)) {
+    static_assert(std::is_copy_constructible<OnlyMovable>::value == false,
+                  "Need not to use FakeCopyable");
+  }
+
+  FakeCopyable(FakeCopyable&& other) : obj_(std::move(other.obj_)) {}
+
+  FakeCopyable(const FakeCopyable& other) {
+    PADDLE_THROW(platform::errors::Unavailable(
+        "Never use the copy constructor of FakeCopyable."));
+  }
+
+  OnlyMovable& Get() { return obj_; }
+
+ private:
+  OnlyMovable obj_;
+};
+
 class EventsWaiter;

 struct WorkQueueOptions {

@@ -78,6 +104,22 @@ class WorkQueue {
   virtual void AddTask(std::function<void()> fn) = 0;

+  // Higher cost than AddTask
+  template <typename F, typename... Args>
+  std::future<typename std::result_of<F(Args...)>::type> AddAwaitableTask(
+      F&& f, Args&&... args) {
+    using ReturnType = typename std::result_of<F(Args...)>::type;
+    std::function<ReturnType()> task =
+        std::bind(std::forward<F>(f), std::forward<Args>(args)...);
+    std::promise<ReturnType> prom;
+    std::future<ReturnType> res = prom.get_future();
+    AddTask([t = std::move(task),
+             p = FakeCopyable<std::promise<ReturnType>>(
+                 std::move(prom))]() mutable { p.Get().set_value(t()); });
+    return res;
+  }
+
   // See WorkQueueOptions.track_task for details
   // virtual void WaitQueueEmpty() = 0;

@@ -102,6 +144,22 @@ class WorkQueueGroup {
   virtual void AddTask(size_t queue_idx, std::function<void()> fn) = 0;

+  // Higher cost than AddTask
+  template <typename F, typename... Args>
+  std::future<typename std::result_of<F(Args...)>::type> AddAwaitableTask(
+      size_t queue_idx, F&& f, Args&&... args) {
+    using ReturnType = typename std::result_of<F(Args...)>::type;
+    std::function<ReturnType()> task =
+        std::bind(std::forward<F>(f), std::forward<Args>(args)...);
+    std::promise<ReturnType> prom;
+    std::future<ReturnType> res = prom.get_future();
+    AddTask(queue_idx,
+            [t = std::move(task),
+             p = FakeCopyable<std::promise<ReturnType>>(
+                 std::move(prom))]() mutable { p.Get().set_value(t()); });
+    return res;
+  }
+
   // See WorkQueueOptions.track_task for details
   // virtual void WaitQueueGroupEmpty() = 0;
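A note on why FakeCopyable exists: std::function requires its stored callable
to be copy-constructible, but std::promise is move-only, so a lambda that
captures a promise by move cannot be stored in a std::function directly. The
wrapper satisfies the type checker with a copy constructor that must never
actually run. A self-contained sketch of the same trick outside Paddle
(CopyableWrapper is an illustrative stand-in, not the Paddle class):

#include <functional>
#include <future>
#include <iostream>
#include <stdexcept>
#include <utility>

// Gives a move-only object a copy constructor that type-checks but throws if
// it is ever invoked, so move-capturing lambdas fit into std::function.
template <typename OnlyMovable>
class CopyableWrapper {
 public:
  explicit CopyableWrapper(OnlyMovable&& obj) : obj_(std::move(obj)) {}
  CopyableWrapper(CopyableWrapper&& other) : obj_(std::move(other.obj_)) {}
  CopyableWrapper(const CopyableWrapper&) {
    throw std::logic_error("copy constructor must never run");
  }
  OnlyMovable& Get() { return obj_; }

 private:
  OnlyMovable obj_;
};

int main() {
  std::promise<int> prom;  // move-only
  std::future<int> fut = prom.get_future();
  // Without the wrapper this assignment would not compile: std::function
  // demands a copyable closure, and a captured promise is not copyable.
  std::function<void()> task =
      [p = CopyableWrapper<std::promise<int>>(std::move(prom))]() mutable {
        p.Get().set_value(42);
      };
  task();                          // normally run by a worker thread
  std::cout << fut.get() << "\n";  // prints 42
  return 0;
}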
paddle/fluid/framework/new_executor/workqueue/workqueue_test.cc  (+6 −0)  (view file @ 31363c3f)

@@ -60,11 +60,13 @@ TEST(WorkQueue, TestSingleThreadedWorkQueue) {
     }
     finished = true;
   });
+  auto handle = work_queue->AddAwaitableTask([]() { return 1234; });
   // WaitQueueEmpty
   EXPECT_EQ(finished.load(), false);
   events_waiter.WaitEvent();
   EXPECT_EQ(finished.load(), true);
   EXPECT_EQ(counter.load(), kLoopNum);
+  EXPECT_EQ(handle.get(), 1234);
 }

 TEST(WorkQueue, TestMultiThreadedWorkQueue) {

@@ -146,6 +148,9 @@ TEST(WorkQueue, TestWorkQueueGroup) {
       ++counter;
     }
   });
+  int random_num = 123456;
+  auto handle =
+      queue_group->AddAwaitableTask(1, [random_num]() { return random_num; });
   // WaitQueueGroupEmpty
   events_waiter.WaitEvent();
   EXPECT_EQ(counter.load(), kLoopNum * kExternalLoopNum + kLoopNum);

@@ -154,4 +159,5 @@ TEST(WorkQueue, TestWorkQueueGroup) {
   events_waiter.WaitEvent();
   queue_group.reset();
   EXPECT_EQ(events_waiter.WaitEvent(), paddle::framework::kQueueDestructEvent);
+  EXPECT_EQ(handle.get(), random_num);
 }
paddle/fluid/operators/deformable_conv_func.h  (deleted; 100644 → 0)  (view file @ da478d1e)

Removed file contents:

// Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
//
// Part of the following code in this file refs to
// https://github.com/msracver/Deformable-ConvNets/blob/master/faster_rcnn/operator_cxx/deformable_convolution.cu
//
// Copyright (c) 2017 Microsoft
// Licensed under The Apache-2.0 License [see LICENSE for details]
// \file deformable_psroi_pooling.cu
// \brief
// \author Yi Li, Guodong Zhang, Jifeng Dai

#pragma once
#include "paddle/phi/core/hostdevice.h"
#include "paddle/phi/kernels/funcs/blas/blas.h"
#include "paddle/phi/kernels/funcs/math_function.h"

template <typename T>
HOSTDEVICE T DmcnGetGradientWeight(T argmax_h, T argmax_w, const int h,
                                   const int w, const int height,
                                   const int width) {
  if (argmax_h <= -1 || argmax_h >= height || argmax_w <= -1 ||
      argmax_w >= width) {
    return 0;
  }

  int argmax_h_low = floor(argmax_h);
  int argmax_w_low = floor(argmax_w);
  int argmax_h_high = argmax_h_low + 1;
  int argmax_w_high = argmax_w_low + 1;

  T weight = 0;
  weight = (h == argmax_h_low && w == argmax_w_low)
               ? (h + 1 - argmax_h) * (w + 1 - argmax_w)
               : weight;
  weight = (h == argmax_h_low && w == argmax_w_high)
               ? (h + 1 - argmax_h) * (argmax_w + 1 - w)
               : weight;
  weight = (h == argmax_h_high && w == argmax_w_low)
               ? (argmax_h + 1 - h) * (w + 1 - argmax_w)
               : weight;
  weight = (h == argmax_h_high && w == argmax_w_high)
               ? (argmax_h + 1 - h) * (argmax_w + 1 - w)
               : weight;
  return weight;
}

template <typename T>
HOSTDEVICE T DmcnGetCoordinateWeight(T argmax_h, T argmax_w, const int height,
                                     const int width, const T* im_data,
                                     const int data_width, const int bp_dir) {
  if (argmax_h <= -1 || argmax_h >= height || argmax_w <= -1 ||
      argmax_w >= width) {
    return 0;
  }

  int argmax_h_low = floor(argmax_h);
  int argmax_w_low = floor(argmax_w);
  int argmax_h_high = argmax_h_low + 1;
  int argmax_w_high = argmax_w_low + 1;

  T weight = 0;

  if (bp_dir == 0) {
    weight += (argmax_h_low >= 0 && argmax_w_low >= 0)
                  ? -1 * (argmax_w_low + 1 - argmax_w) *
                        im_data[argmax_h_low * data_width + argmax_w_low]
                  : 0;
    weight += (argmax_h_low >= 0 && argmax_w_high <= width - 1)
                  ? -1 * (argmax_w - argmax_w_low) *
                        im_data[argmax_h_low * data_width + argmax_w_high]
                  : 0;
    weight += (argmax_h_high <= height - 1 && argmax_w_low >= 0)
                  ? (argmax_w_low + 1 - argmax_w) *
                        im_data[argmax_h_high * data_width + argmax_w_low]
                  : 0;
    weight += (argmax_h_high <= height - 1 && argmax_w_high <= width - 1)
                  ? (argmax_w - argmax_w_low) *
                        im_data[argmax_h_high * data_width + argmax_w_high]
                  : 0;
  } else if (bp_dir == 1) {
    weight += (argmax_h_low >= 0 && argmax_w_low >= 0)
                  ? -1 * (argmax_h_low + 1 - argmax_h) *
                        im_data[argmax_h_low * data_width + argmax_w_low]
                  : 0;
    weight += (argmax_h_low >= 0 && argmax_w_high <= width - 1)
                  ? (argmax_h_low + 1 - argmax_h) *
                        im_data[argmax_h_low * data_width + argmax_w_high]
                  : 0;
    weight += (argmax_h_high <= height - 1 && argmax_w_low >= 0)
                  ? -1 * (argmax_h - argmax_h_low) *
                        im_data[argmax_h_high * data_width + argmax_w_low]
                  : 0;
    weight += (argmax_h_high <= height - 1 && argmax_w_high <= width - 1)
                  ? (argmax_h - argmax_h_low) *
                        im_data[argmax_h_high * data_width + argmax_w_high]
                  : 0;
  }
  return weight;
}

template <typename T>
HOSTDEVICE T DmcnIm2colBilinear(const T* bottom_data, const int data_width,
                                const int height, const int width, T h, T w) {
  int h_low = floor(h);
  int w_low = floor(w);
  int h_high = h_low + 1;
  int w_high = w_low + 1;

  T lh = h - h_low;
  T lw = w - w_low;
  T hh = 1 - lh;
  T hw = 1 - lw;

  T v1 =
      (h_low >= 0 && w_low >= 0) ? bottom_data[h_low * data_width + w_low] : 0;
  T v2 = (h_low >= 0 && w_high <= width - 1)
             ? bottom_data[h_low * data_width + w_high]
             : 0;
  T v3 = (h_high <= height - 1 && w_low >= 0)
             ? bottom_data[h_high * data_width + w_low]
             : 0;
  T v4 = (h_high <= height - 1 && w_high <= width - 1)
             ? bottom_data[h_high * data_width + w_high]
             : 0;

  T w1 = hh * hw;
  T w2 = hh * lw;
  T w3 = lh * hw;
  T w4 = lh * lw;

  return w1 * v1 + w2 * v2 + w3 * v3 + w4 * v4;
}
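For reference, DmcnIm2colBilinear above computed standard bilinear
interpolation at a fractional sampling location $(h, w)$. Writing
$l_h = h - \lfloor h \rfloor$, $l_w = w - \lfloor w \rfloor$, and
$v_1, v_2, v_3, v_4$ for the pixel values at (h_low, w_low), (h_low, w_high),
(h_high, w_low), (h_high, w_high), with out-of-range pixels taken as 0:

$$
v(h, w) = (1 - l_h)(1 - l_w)\,v_1 + (1 - l_h)\,l_w\,v_2
        + l_h (1 - l_w)\,v_3 + l_h\,l_w\,v_4
$$

DmcnGetGradientWeight and DmcnGetCoordinateWeight are, as far as the code
shows, the partial derivatives of this expression with respect to the pixel
values and the sampling coordinates (bp_dir 0 for $h$, 1 for $w$), used in the
backward pass.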
paddle/fluid/operators/deformable_conv_op.cc  (+9 −159)  (view file @ 31363c3f)

@@ -12,9 +12,11 @@
 // See the License for the specific language governing permissions and
 // limitations under the License.

-#include "paddle/fluid/operators/deformable_conv_op.h"
 #include <memory>
-#include "paddle/fluid/operators/conv_op.h"
+#include "paddle/fluid/framework/infershape_utils.h"
+#include "paddle/fluid/framework/op_registry.h"
+#include "paddle/phi/core/infermeta_utils.h"
+#include "paddle/phi/infermeta/multiary.h"

 namespace paddle {
 namespace operators {

@@ -108,158 +110,6 @@ $$
 class DeformableConvOp : public framework::OperatorWithKernel {
  public:
   using framework::OperatorWithKernel::OperatorWithKernel;

Removed (the hand-written InferShape, now replaced by phi::DeformableConvInferMeta
registered below):

  void InferShape(framework::InferShapeContext *ctx) const override {
    OP_INOUT_CHECK(ctx->HasInput("Input"), "Input", "Input",
                   "deformable_conv");
    OP_INOUT_CHECK(ctx->HasInput("Offset"), "Input", "Offset",
                   "deformable_conv)");
    OP_INOUT_CHECK(ctx->HasInput("Mask"), "Input", "Mask", "deformable_conv");
    OP_INOUT_CHECK(ctx->HasInput("Filter"), "Input", "Filter",
                   "deformable_conv");
    OP_INOUT_CHECK(ctx->HasOutput("Output"), "Output", "Output",
                   "deformable_conv");

    auto in_dims = ctx->GetInputDim("Input");
    auto filter_dims = ctx->GetInputDim("Filter");
    auto offset_dims = ctx->GetInputDim("Offset");
    auto mask_dims = ctx->GetInputDim("Mask");

    std::vector<int> strides = ctx->Attrs().Get<std::vector<int>>("strides");
    std::vector<int> paddings = ctx->Attrs().Get<std::vector<int>>("paddings");
    std::vector<int> dilations =
        ctx->Attrs().Get<std::vector<int>>("dilations");
    int groups = ctx->Attrs().Get<int>("groups");
    int deformable_groups = ctx->Attrs().Get<int>("deformable_groups");
    int im2col_step = ctx->Attrs().Get<int>("im2col_step");

    PADDLE_ENFORCE_EQ(
        in_dims.size(), 4,
        platform::errors::InvalidArgument(
            "Conv input should be 4-D tensor, get %u", in_dims.size()));
    PADDLE_ENFORCE_EQ(in_dims.size(), filter_dims.size(),
                      platform::errors::InvalidArgument(
                          "Conv input dimension and filter dimension should be "
                          "the same. The difference is [%d]: [%d]",
                          in_dims.size(), filter_dims.size()));
    PADDLE_ENFORCE_EQ(in_dims.size() - strides.size(), 2U,
                      platform::errors::InvalidArgument(
                          "Conv input dimension and strides "
                          "dimension should be consistent. But received input "
                          "dimension:[%d], strides dimension:[%d]",
                          in_dims.size(), strides.size()));
    PADDLE_ENFORCE_EQ(paddings.size(), strides.size(),
                      platform::errors::InvalidArgument(
                          "Conv paddings dimension and Conv strides dimension "
                          "should be the same. The difference is [%d]: [%d]",
                          paddings.size(), strides.size()));
    PADDLE_ENFORCE_EQ(
        in_dims[1], filter_dims[1] * groups,
        platform::errors::InvalidArgument(
            "The number of input channels should be equal to filter "
            "channels * groups. The difference is [%d]: [%d]",
            in_dims[1], filter_dims[1] * groups));
    PADDLE_ENFORCE_EQ(
        filter_dims[0] % groups, 0,
        platform::errors::InvalidArgument(
            "The number of output channels should be divided by groups. But "
            "received output channels:[%d], groups:[%d]",
            filter_dims[0], groups));
    PADDLE_ENFORCE_EQ(
        filter_dims[0] % deformable_groups, 0,
        platform::errors::InvalidArgument(
            "The number of output channels should be "
            "divided by deformable groups. The difference is [%d]: [%d]",
            filter_dims[0] % groups, 0));

    if (in_dims[0] > im2col_step) {
      PADDLE_ENFORCE_EQ(
          in_dims[0] % im2col_step, 0U,
          platform::errors::InvalidArgument(
              "Input batchsize must be smaller than or divide im2col_step. But "
              "received Input batchsize:[%d], im2col_step:[%d]",
              in_dims[0], im2col_step));
    }

    for (size_t i = 0; i < strides.size(); ++i) {
      PADDLE_ENFORCE_GT(strides[i], 0U,
                        platform::errors::InvalidArgument(
                            "stride %d size incorrect", i));
    }
    for (size_t i = 0; i < dilations.size(); ++i) {
      PADDLE_ENFORCE_GT(dilations[i], 0U,
                        platform::errors::InvalidArgument(
                            "dilation %d size incorrect", i));
    }

    std::vector<int64_t> output_shape({in_dims[0], filter_dims[0]});
    for (size_t i = 0; i < strides.size(); ++i) {
      if ((!ctx->IsRuntime()) &&
          (in_dims[i + 2] <= 0 || filter_dims[i + 2] <= 0)) {
        output_shape.push_back(-1);
      } else {
        output_shape.push_back(ConvOutputSize(in_dims[i + 2],
                                              filter_dims[i + 2], dilations[i],
                                              paddings[i], strides[i]));
      }
    }

    PADDLE_ENFORCE_EQ(
        output_shape[1] % deformable_groups, 0U,
        platform::errors::InvalidArgument(
            "output num_filter must divide deformable group size. But received "
            "output num_filter:[%d], deformable group size:[%d]",
            output_shape[1], deformable_groups));

    if (ctx->IsRuntime()) {
      PADDLE_ENFORCE_EQ(output_shape[2], offset_dims[2],
                        platform::errors::InvalidArgument(
                            "output height must equal to offset map height. "
                            "The difference is [%d]: [%d]",
                            output_shape[2], offset_dims[2]));
      PADDLE_ENFORCE_EQ(output_shape[3], offset_dims[3],
                        platform::errors::InvalidArgument(
                            "output width must equal to offset map width. The "
                            "difference is [%d]: [%d]",
                            output_shape[3], offset_dims[3]));
      PADDLE_ENFORCE_EQ(offset_dims[1] % (filter_dims[2] * filter_dims[3]), 0U,
                        platform::errors::InvalidArgument(
                            "offset filter must divide deformable group size. "
                            "But received [%d]: [%d]",
                            offset_dims[1], filter_dims[2] * filter_dims[3]));
      PADDLE_ENFORCE_EQ(
          offset_dims[1] / (2 * filter_dims[2] * filter_dims[3]),
          deformable_groups,
          platform::errors::InvalidArgument(
              "offset filter must divide deformable group size. But received "
              "[%d]: [%d]",
              offset_dims[1] / (2 * filter_dims[2] * filter_dims[3]),
              deformable_groups));
      PADDLE_ENFORCE_EQ(output_shape[2], mask_dims[2],
                        platform::errors::InvalidArgument(
                            "output height must equal to mask map height. The "
                            "difference is [%d] vs [%d]",
                            output_shape[2], mask_dims[2]));
      PADDLE_ENFORCE_EQ(output_shape[3], mask_dims[3],
                        platform::errors::InvalidArgument(
                            "output width must equal to mask map width. The "
                            "difference is [%d] vs [%d]",
                            output_shape[3], mask_dims[3]));
      PADDLE_ENFORCE_EQ(mask_dims[1] % (filter_dims[2] * filter_dims[3]), 0U,
                        platform::errors::InvalidArgument(
                            "mask filter must divide deformable group size. "
                            "But received [%d]: [%d]",
                            mask_dims[1], filter_dims[2] * filter_dims[3]));
      PADDLE_ENFORCE_EQ(mask_dims[1] / (filter_dims[2] * filter_dims[3]),
                        deformable_groups,
                        platform::errors::InvalidArgument(
                            "mask filter must divide deformable group size. "
                            "But received [%d]: [%d]",
                            mask_dims[1] / (filter_dims[2] * filter_dims[3]),
                            deformable_groups));
    }

    ctx->SetOutputDim("Output", phi::make_ddim(output_shape));
  }

  protected:
   framework::OpKernelType GetExpectedKernelType(
 ...

@@ -331,13 +181,13 @@ class DeformableConvGradOp : public framework::OperatorWithKernel {
 }  // namespace paddle

 namespace ops = paddle::operators;
+DECLARE_INFER_SHAPE_FUNCTOR(deformable_conv, DeformableConvInferShapeFunctor,
+                            PD_INFER_META(phi::DeformableConvInferMeta));
 REGISTER_OPERATOR(deformable_conv, ops::DeformableConvOp,
                   ops::DeformableConvOpMaker,
                   ops::DeformableConvGradOpMaker<paddle::framework::OpDesc>,
-                  ops::DeformableConvGradOpMaker<paddle::imperative::OpBase>);
+                  ops::DeformableConvGradOpMaker<paddle::imperative::OpBase>,
+                  DeformableConvInferShapeFunctor);

 REGISTER_OPERATOR(deformable_conv_grad, ops::DeformableConvGradOp);

-REGISTER_OP_CPU_KERNEL(deformable_conv_grad,
-                       ops::DeformableConvGradCPUKernel<float>,
-                       ops::DeformableConvGradCPUKernel<double>);
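The removed InferShape computed each spatial output dimension with
ConvOutputSize from conv_op.h. A self-contained sketch of that standard
convolution output-size arithmetic (the local function name is illustrative,
and the exact Paddle helper signature is an assumption here):

#include <iostream>

// out = (in + 2 * pad - (dilation * (ksize - 1) + 1)) / stride + 1
int ConvOutputSizeSketch(int input, int ksize, int dilation, int pad,
                         int stride) {
  const int dkernel = dilation * (ksize - 1) + 1;  // effective kernel extent
  return (input + 2 * pad - dkernel) / stride + 1;
}

int main() {
  // A 3x3 kernel with stride 1, padding 1, dilation 1 preserves spatial size.
  std::cout << ConvOutputSizeSketch(32, 3, 1, 1, 1) << "\n";  // prints 32
  return 0;
}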
paddle/fluid/operators/deformable_conv_op.cu  (deleted; 100644 → 0)  (view file @ da478d1e)
This diff is collapsed in the source.

paddle/fluid/operators/deformable_conv_op.h  (deleted; 100644 → 0)  (view file @ da478d1e)
This diff is collapsed in the source.
paddle/fluid/operators/deformable_conv_v1_op.cc  (+9 −132)  (view file @ 31363c3f)

@@ -12,9 +12,11 @@
 // See the License for the specific language governing permissions and
 // limitations under the License.

-#include "paddle/fluid/operators/deformable_conv_v1_op.h"
 #include <memory>
-#include "paddle/fluid/operators/conv_op.h"
+#include "paddle/fluid/framework/infershape_utils.h"
+#include "paddle/fluid/framework/op_registry.h"
+#include "paddle/phi/core/infermeta_utils.h"
+#include "paddle/phi/infermeta/multiary.h"

 namespace paddle {
 namespace operators {

@@ -113,128 +115,6 @@ $$
 class DeformableConvV1Op : public framework::OperatorWithKernel {
  public:
   using framework::OperatorWithKernel::OperatorWithKernel;

Removed (the hand-written InferShape, now replaced by phi::DeformableConvInferMeta
registered below):

  void InferShape(framework::InferShapeContext *ctx) const override {
    OP_INOUT_CHECK(ctx->HasInput("Input"), "Input", "Input",
                   "deformable_conv_v1");
    OP_INOUT_CHECK(ctx->HasInput("Offset"), "Input", "Offset",
                   "deformable_conv_v1");
    OP_INOUT_CHECK(ctx->HasInput("Filter"), "Input", "Filter",
                   "deformable_conv_v1");
    OP_INOUT_CHECK(ctx->HasOutput("Output"), "Output", "Output",
                   "deformable_conv_v1");

    auto in_dims = ctx->GetInputDim("Input");
    auto filter_dims = ctx->GetInputDim("Filter");
    auto offset_dims = ctx->GetInputDim("Offset");

    std::vector<int> strides = ctx->Attrs().Get<std::vector<int>>("strides");
    std::vector<int> paddings = ctx->Attrs().Get<std::vector<int>>("paddings");
    std::vector<int> dilations =
        ctx->Attrs().Get<std::vector<int>>("dilations");
    int groups = ctx->Attrs().Get<int>("groups");
    int deformable_groups = ctx->Attrs().Get<int>("deformable_groups");
    int im2col_step = ctx->Attrs().Get<int>("im2col_step");

    PADDLE_ENFORCE_EQ(
        in_dims.size(), 4,
        platform::errors::InvalidArgument(
            "Conv input should be 4-D tensor, get %u", in_dims.size()));
    PADDLE_ENFORCE_EQ(in_dims.size(), filter_dims.size(),
                      platform::errors::InvalidArgument(
                          "Conv input dimension and filter dimension should be "
                          "the same. the difference is [%d] vs [%d]",
                          in_dims.size(), filter_dims.size()));
    PADDLE_ENFORCE_EQ(in_dims.size() - strides.size(), 2U,
                      platform::errors::InvalidArgument(
                          "Conv input dimension and strides "
                          "dimension should be consistent., But received "
                          "[%d]: [%d]",
                          in_dims.size(), strides.size()));
    PADDLE_ENFORCE_EQ(paddings.size(), strides.size(),
                      platform::errors::InvalidArgument(
                          "Conv paddings dimension and Conv strides dimension "
                          "should be the same. The difference is [%d] vs [%d]",
                          paddings.size(), strides.size()));
    PADDLE_ENFORCE_EQ(
        in_dims[1], filter_dims[1] * groups,
        platform::errors::InvalidArgument(
            "The number of input channels should be equal to filter "
            "channels * groups. The difference is [%d]: [%d]",
            in_dims[1], filter_dims[1] * groups));
    PADDLE_ENFORCE_EQ(
        filter_dims[0] % groups, 0,
        platform::errors::InvalidArgument(
            "The number of output channels should be divided by groups. But"
            "received output channels: [%d], groups: [%d]",
            filter_dims[0], groups));
    PADDLE_ENFORCE_EQ(
        filter_dims[0] % deformable_groups, 0,
        platform::errors::InvalidArgument(
            "The number of output channels should be "
            "divided by deformable groups. But received [%d]: [%d]",
            filter_dims[0], deformable_groups));

    if (in_dims[0] > im2col_step) {
      PADDLE_ENFORCE_EQ(
          in_dims[0] % im2col_step, 0U,
          platform::errors::InvalidArgument(
              "Input batchsize must be smaller than or divide "
              "im2col_step, But received [%d]: [%d]",
              in_dims[0], im2col_step));
    }

    for (size_t i = 0; i < strides.size(); ++i) {
      PADDLE_ENFORCE_GT(strides[i], 0U,
                        platform::errors::InvalidArgument(
                            "stride %d size incorrect", i));
    }
    for (size_t i = 0; i < dilations.size(); ++i) {
      PADDLE_ENFORCE_GT(dilations[i], 0U,
                        platform::errors::InvalidArgument(
                            "dilation %d size incorrect", i));
    }

    std::vector<int64_t> output_shape({in_dims[0], filter_dims[0]});
    for (size_t i = 0; i < strides.size(); ++i) {
      if ((!ctx->IsRuntime()) &&
          (in_dims[i + 2] <= 0 || filter_dims[i + 2] <= 0)) {
        output_shape.push_back(-1);
      } else {
        output_shape.push_back(ConvOutputSize(in_dims[i + 2],
                                              filter_dims[i + 2], dilations[i],
                                              paddings[i], strides[i]));
      }
    }

    if (ctx->IsRuntime()) {
      PADDLE_ENFORCE_EQ(output_shape[1] % deformable_groups, 0U,
                        platform::errors::InvalidArgument(
                            "output num_filter must divide deformable group "
                            "size. But received [%d]: [%d]",
                            output_shape[1], deformable_groups));
      PADDLE_ENFORCE_EQ(output_shape[2], offset_dims[2],
                        platform::errors::InvalidArgument(
                            "output height must equal to offset map height. "
                            "The difference is [%d]: [%d]",
                            output_shape[2], offset_dims[2]));
      PADDLE_ENFORCE_EQ(output_shape[3], offset_dims[3],
                        platform::errors::InvalidArgument(
                            "output width must equal to offset map width. The "
                            "difference is [%d]: [%d]",
                            output_shape[3], offset_dims[3]));
      PADDLE_ENFORCE_EQ(offset_dims[1] % (filter_dims[2] * filter_dims[3]), 0U,
                        platform::errors::InvalidArgument(
                            "offset filter must divide deformable group size. "
                            "But received [%d]: [%d]",
                            offset_dims[1], filter_dims[2] * filter_dims[3]));
      PADDLE_ENFORCE_EQ(
          offset_dims[1] / (2 * filter_dims[2] * filter_dims[3]),
          deformable_groups,
          platform::errors::InvalidArgument(
              "offset filter must divide deformable group size. But received "
              "[%d]: [%d]",
              offset_dims[1] / (2 * filter_dims[2] * filter_dims[3]),
              deformable_groups));
    }

    ctx->SetOutputDim("Output", phi::make_ddim(output_shape));
  }

  protected:
   framework::OpKernelType GetExpectedKernelType(
 ...

@@ -300,15 +180,12 @@ class DeformableConvV1GradOp : public framework::OperatorWithKernel {
 }  // namespace paddle

 namespace ops = paddle::operators;
+DECLARE_INFER_SHAPE_FUNCTOR(deformable_conv, DeformableConvV1InferShapeFunctor,
+                            PD_INFER_META(phi::DeformableConvInferMeta));
 REGISTER_OPERATOR(deformable_conv_v1, ops::DeformableConvV1Op,
                   ops::DeformableConvV1OpMaker,
                   ops::DeformableConvV1GradOpMaker<paddle::framework::OpDesc>,
-                  ops::DeformableConvV1GradOpMaker<paddle::imperative::OpBase>);
+                  ops::DeformableConvV1GradOpMaker<paddle::imperative::OpBase>,
+                  DeformableConvV1InferShapeFunctor);

 REGISTER_OPERATOR(deformable_conv_v1_grad, ops::DeformableConvV1GradOp);

-REGISTER_OP_CPU_KERNEL(deformable_conv_v1,
-                       ops::DeformableConvV1CPUKernel<float>,
-                       ops::DeformableConvV1CPUKernel<double>);
-REGISTER_OP_CPU_KERNEL(deformable_conv_v1_grad,
-                       ops::DeformableConvV1GradCPUKernel<float>,
-                       ops::DeformableConvV1GradCPUKernel<double>);
paddle/fluid/operators/deformable_conv_v1_op.cu  (deleted; 100644 → 0)  (view file @ da478d1e)
This diff is collapsed in the source.

paddle/fluid/operators/deformable_conv_v1_op.h  (deleted; 100644 → 0)  (view file @ da478d1e)
This diff is collapsed in the source.
paddle/fluid/operators/flatten_op.cc  (+13 −83)  (view file @ 31363c3f)

@@ -17,7 +17,10 @@ limitations under the License. */
 #include <string>
 #include <unordered_map>
 #include <vector>
+#include "paddle/fluid/framework/infershape_utils.h"
 #include "paddle/fluid/framework/op_registry.h"
+#include "paddle/phi/core/infermeta_utils.h"
+#include "paddle/phi/infermeta/unary.h"

 namespace paddle {
 namespace operators {

@@ -270,70 +273,24 @@ class Flatten2GradOp : public framework::OperatorWithKernel {
 class FlattenContiguousRangeOp : public framework::OperatorWithKernel {
  public:
   using framework::OperatorWithKernel::OperatorWithKernel;

   void InferShape(framework::InferShapeContext *ctx) const override {
     OP_INOUT_CHECK(ctx->HasInput("X"), "Input", "X", "FlattenContiguousRange");
     OP_INOUT_CHECK(ctx->HasOutput("Out"), "Output", "Out",
                    "FlattenContiguousRange");
     const auto &start_axis = ctx->Attrs().Get<int>("start_axis");
     const auto &stop_axis = ctx->Attrs().Get<int>("stop_axis");

Removed (the hand-written shape computation):

    const auto &in_dims = ctx->GetInputDim("X");
    int in_dims_size = in_dims.size();
    int real_start_axis = start_axis, real_stop_axis = stop_axis;
    if (start_axis < 0) {
      real_start_axis = start_axis + in_dims_size;
    }
    if (stop_axis < 0) {
      real_stop_axis = stop_axis + in_dims_size;
    }
    PADDLE_ENFORCE_GE(real_stop_axis, real_start_axis,
                      platform::errors::InvalidArgument(
                          "The stop_axis should be greater"
                          "than or equal to start_axis."));

    const auto &out_dims =
        GetOutputShape(real_start_axis, real_stop_axis, in_dims);
    ctx->SetOutputDim("Out", phi::make_ddim(out_dims));
    if (in_dims[0] == out_dims[0]) {
      // Only pass LoD when the first dimension of output and Input(X)
      // are the same.
      ctx->ShareLoD("X", "Out");
    }
    if (!ctx->HasOutput("XShape")) return;
    // OP_INOUT_CHECK(ctx->HasOutput("XShape"), "Output", "XShape", "Flatten2");
    std::vector<int64_t> xshape_dims(in_dims.size() + 1);
    xshape_dims[0] = 0;
    for (int i = 0; i < in_dims.size(); ++i) {
      xshape_dims[i + 1] = in_dims[i];
    }
    ctx->SetOutputDim("XShape", phi::make_ddim(xshape_dims));
    ctx->ShareLoD("X", "XShape");
  }

  static std::vector<int32_t> GetOutputShape(const int start_axis,
                                             const int stop_axis,
                                             const framework::DDim &in_dims) {
    int64_t outer = 1;
    std::vector<int32_t> out_shape;
    int in_dims_size = in_dims.size();
    out_shape.reserve(in_dims_size - stop_axis + start_axis);

    for (int i = 0; i < start_axis; ++i) {
      out_shape.push_back(in_dims[i]);
    }
    for (int i = start_axis; i <= stop_axis; i++) {
      if (in_dims[i] == -1 || outer == -1) {
        outer = -1;
      } else {
        outer *= in_dims[i];
      }
    }
    out_shape.push_back(outer);
    for (int i = stop_axis + 1; i < in_dims_size; i++) {
      out_shape.push_back(in_dims[i]);
    }
    return out_shape;
  }

Added (delegate to the phi InferMeta function):

    // Construct MetaTensor for InferMeta Func
    using CompatMetaTensor = framework::CompatMetaTensor;
    CompatMetaTensor x(ctx->GetInputVarPtrs("X")[0], ctx->IsRuntime());
    CompatMetaTensor out(ctx->GetOutputVarPtrs("Out")[0], ctx->IsRuntime());
    std::unique_ptr<CompatMetaTensor> xshape(nullptr);
    if (ctx->HasOutput("XShape")) {
      xshape = std::move(std::unique_ptr<CompatMetaTensor>(new CompatMetaTensor(
          ctx->GetOutputVarPtrs("XShape")[0], ctx->IsRuntime())));
    }
    phi::FlattenWithXShapeInferMeta(x, start_axis, stop_axis, &out,
                                    xshape.get());
  }
 };

@@ -487,30 +444,3 @@ REGISTER_OP_CPU_KERNEL(
     ops::Flatten2GradKernel<paddle::platform::CPUDeviceContext, int>,
     ops::Flatten2GradKernel<paddle::platform::CPUDeviceContext, int8_t>,
     ops::Flatten2GradKernel<paddle::platform::CPUDeviceContext, int64_t>);
-REGISTER_OP_CPU_KERNEL(
-    flatten_contiguous_range,
-    ops::FlattenContiguousRangeKernel<paddle::platform::CPUDeviceContext,
-                                      float>,
-    ops::FlattenContiguousRangeKernel<paddle::platform::CPUDeviceContext,
-                                      double>,
-    ops::FlattenContiguousRangeKernel<paddle::platform::CPUDeviceContext,
-                                      uint8_t>,
-    ops::FlattenContiguousRangeKernel<paddle::platform::CPUDeviceContext, int>,
-    ops::FlattenContiguousRangeKernel<paddle::platform::CPUDeviceContext,
-                                      int8_t>,
-    ops::FlattenContiguousRangeKernel<paddle::platform::CPUDeviceContext,
-                                      int64_t>);
-REGISTER_OP_CPU_KERNEL(
-    flatten_contiguous_range_grad,
-    ops::FlattenContiguousRangeGradKernel<paddle::platform::CPUDeviceContext,
-                                          float>,
-    ops::FlattenContiguousRangeGradKernel<paddle::platform::CPUDeviceContext,
-                                          double>,
-    ops::FlattenContiguousRangeGradKernel<paddle::platform::CPUDeviceContext,
-                                          uint8_t>,
-    ops::FlattenContiguousRangeGradKernel<paddle::platform::CPUDeviceContext,
-                                          int>,
-    ops::FlattenContiguousRangeGradKernel<paddle::platform::CPUDeviceContext,
-                                          int8_t>,
-    ops::FlattenContiguousRangeGradKernel<paddle::platform::CPUDeviceContext,
-                                          int64_t>);
paddle/fluid/operators/flatten_op.cu.cc
...
@@ -47,34 +47,3 @@ REGISTER_OP_CUDA_KERNEL(
     ops::Flatten2GradKernel<paddle::platform::CUDADeviceContext, int>,
     ops::Flatten2GradKernel<paddle::platform::CUDADeviceContext, int8_t>,
     ops::Flatten2GradKernel<paddle::platform::CUDADeviceContext, int64_t>);
-REGISTER_OP_CUDA_KERNEL(
-    flatten_contiguous_range,
-    ops::FlattenContiguousRangeKernel<paddle::platform::CUDADeviceContext, float>,
-    ops::FlattenContiguousRangeKernel<paddle::platform::CUDADeviceContext, plat::float16>,
-    ops::FlattenContiguousRangeKernel<paddle::platform::CUDADeviceContext, double>,
-    ops::FlattenContiguousRangeKernel<paddle::platform::CUDADeviceContext, uint8_t>,
-    ops::FlattenContiguousRangeKernel<paddle::platform::CUDADeviceContext, int>,
-    ops::FlattenContiguousRangeKernel<paddle::platform::CUDADeviceContext, int8_t>,
-    ops::FlattenContiguousRangeKernel<paddle::platform::CUDADeviceContext, int64_t>);
-REGISTER_OP_CUDA_KERNEL(
-    flatten_contiguous_range_grad,
-    ops::FlattenContiguousRangeGradKernel<paddle::platform::CUDADeviceContext, float>,
-    ops::FlattenContiguousRangeGradKernel<paddle::platform::CUDADeviceContext, plat::float16>,
-    ops::FlattenContiguousRangeGradKernel<paddle::platform::CUDADeviceContext, double>,
-    ops::FlattenContiguousRangeGradKernel<paddle::platform::CUDADeviceContext, uint8_t>,
-    ops::FlattenContiguousRangeGradKernel<paddle::platform::CUDADeviceContext, int>,
-    ops::FlattenContiguousRangeGradKernel<paddle::platform::CUDADeviceContext, int8_t>,
-    ops::FlattenContiguousRangeGradKernel<paddle::platform::CUDADeviceContext, int64_t>);
paddle/fluid/operators/flatten_op.h
...
@@ -119,46 +119,5 @@ class Flatten2GradKernel : public framework::OpKernel<T> {
   }
 };
-
-template <typename DeviceContext, typename T>
-class FlattenContiguousRangeKernel : public framework::OpKernel<T> {
- public:
-  void Compute(const framework::ExecutionContext &context) const override {
-    auto *in = context.Input<framework::LoDTensor>("X");
-    auto *out = context.Output<framework::LoDTensor>("Out");
-    out->mutable_data(context.GetPlace(), in->type());
-    auto &start_axis = context.Attr<int>("start_axis");
-    auto &stop_axis = context.Attr<int>("stop_axis");
-    auto &dev_ctx = context.device_context<DeviceContext>();
-    // call new kernel
-    phi::FlattenKernel<T, typename paddle::framework::ConvertToPhiContext<
-                              DeviceContext>::TYPE>(
-        static_cast<const typename paddle::framework::ConvertToPhiContext<
-            DeviceContext>::TYPE &>(dev_ctx),
-        *in, start_axis, stop_axis, out);
-  }
-};
-
-template <typename DeviceContext, typename T>
-class FlattenContiguousRangeGradKernel : public framework::OpKernel<T> {
- public:
-  void Compute(const framework::ExecutionContext &ctx) const override {
-    auto *d_x = ctx.Output<framework::LoDTensor>(framework::GradVarName("X"));
-    auto *d_out =
-        ctx.Input<framework::LoDTensor>(framework::GradVarName("Out"));
-    auto *xshape = ctx.Input<framework::LoDTensor>("XShape");
-    d_x->mutable_data(ctx.GetPlace(), d_out->type());
-    auto &dev_ctx = ctx.device_context<DeviceContext>();
-    // call new kernel
-    phi::FlattenGradKernel<T, typename paddle::framework::ConvertToPhiContext<
-                                  DeviceContext>::TYPE>(
-        static_cast<const typename paddle::framework::ConvertToPhiContext<
-            DeviceContext>::TYPE &>(dev_ctx),
-        *d_out, *xshape, d_x);
-  }
-};
 }  // namespace operators
 }  // namespace paddle
paddle/fluid/operators/flatten_op_xpu.cc
...
@@ -41,27 +41,4 @@ REGISTER_OP_XPU_KERNEL(
     ops::Flatten2GradKernel<paddle::platform::XPUDeviceContext, int>,
     ops::Flatten2GradKernel<paddle::platform::XPUDeviceContext, int8_t>,
     ops::Flatten2GradKernel<paddle::platform::XPUDeviceContext, int64_t>);
-REGISTER_OP_XPU_KERNEL(
-    flatten_contiguous_range,
-    ops::FlattenContiguousRangeKernel<paddle::platform::XPUDeviceContext, float>,
-    ops::FlattenContiguousRangeKernel<paddle::platform::XPUDeviceContext, plat::float16>,
-    ops::FlattenContiguousRangeKernel<paddle::platform::XPUDeviceContext, int>,
-    ops::FlattenContiguousRangeKernel<paddle::platform::XPUDeviceContext, int8_t>,
-    ops::FlattenContiguousRangeKernel<paddle::platform::XPUDeviceContext, int64_t>);
-REGISTER_OP_XPU_KERNEL(
-    flatten_contiguous_range_grad,
-    ops::FlattenContiguousRangeGradKernel<paddle::platform::XPUDeviceContext, float>,
-    ops::FlattenContiguousRangeGradKernel<paddle::platform::XPUDeviceContext, plat::float16>,
-    ops::FlattenContiguousRangeGradKernel<paddle::platform::XPUDeviceContext, int>,
-    ops::FlattenContiguousRangeGradKernel<paddle::platform::XPUDeviceContext, int8_t>,
-    ops::FlattenContiguousRangeGradKernel<paddle::platform::XPUDeviceContext, int64_t>);
 #endif
paddle/fluid/pybind/eager_method.cc
...
@@ -868,16 +868,22 @@ static PyObject* tensor_register_grad_hook(TensorObject* self, PyObject* args,
   int64_t hook_id;
   if (egr::egr_utils_api::IsLeafTensor(self->tensor)) {
     VLOG(6) << "Register hook for leaf tensor: " << self->tensor.name();
+
+    auto autograd_meta = egr::EagerUtils::unsafe_autograd_meta(self->tensor);
+    if (autograd_meta && !autograd_meta->StopGradient()) {
+      if (!autograd_meta->GetMutableGradNode()) {
+        VLOG(6) << "Detected NULL grad_node, Leaf tensor should have had "
+                   "grad_node with type: GradNodeAccumulation.";
+        autograd_meta->SetGradNode(
+            std::make_shared<egr::GradNodeAccumulation>(autograd_meta));
+      }
+    }
+
     std::shared_ptr<egr::GradNodeBase> grad_node =
         egr::EagerUtils::grad_node(self->tensor);
-    PADDLE_ENFORCE(grad_node.get() != nullptr,
-                   paddle::platform::errors::Fatal(
-                       "Detected NULL grad_node,"
-                       "Leaf tensor should have had grad_node "
-                       "with type: GradNodeAccumulation."));
     auto rank_info =
         egr::EagerUtils::unsafe_autograd_meta(self->tensor)->OutRankInfo();
     PyObject* hook_func = PyTuple_GET_ITEM(args, 0);
     auto accumulation_grad_node =
...
@@ -948,8 +954,8 @@ static PyObject* tensor_register_reduce_hook(TensorObject* self, PyObject* args,
   EAGER_CATCH_AND_THROW_RETURN_NULL
 }
-static PyObject* set_grad_type(TensorObject* self, PyObject* args,
-                               PyObject* kwargs) {
+static PyObject* tensor__set_grad_type(TensorObject* self, PyObject* args,
+                                       PyObject* kwargs) {
   EAGER_TRY
   auto var_type = pybind::CastPyArg2ProtoType(PyTuple_GET_ITEM(args, 0), 0);
   auto grad_tensor =
...
@@ -963,6 +969,42 @@ static PyObject* set_grad_type(TensorObject* self, PyObject* args,
   EAGER_CATCH_AND_THROW_RETURN_NULL
 }
+static PyObject* tensor__clear(TensorObject* self, PyObject* args,
+                               PyObject* kwargs) {
+  EAGER_TRY
+  self->tensor.reset();
+  return Py_None;
+  EAGER_CATCH_AND_THROW_RETURN_NULL
+}
+
+static PyObject* tensor__copy_gradient_from(TensorObject* self, PyObject* args,
+                                            PyObject* kwargs) {
+  EAGER_TRY
+  auto src = CastPyArg2Tensor(PyTuple_GET_ITEM(args, 0), 0);
+  if (self->tensor.is_initialized()) {
+    PADDLE_ENFORCE_EQ(self->tensor.dtype(), src.dtype(),
+                      platform::errors::PreconditionNotMet(
+                          "Tensor %s has different data type with Tensor %s",
+                          self->tensor.name(), src.name()));
+    PADDLE_ENFORCE_EQ(self->tensor.impl()->type_info().id(),
+                      src.impl()->type_info().id(),
+                      platform::errors::PreconditionNotMet(
+                          "Tensor %s has different type with Tensor %s, Tensor "
+                          "ShareGradientDataWith cannot be performed!",
+                          self->tensor.name(), src.name()));
+  }
+  VLOG(6) << "Tensor copy gradient from: " << src.name();
+  auto* p_grad = egr::EagerUtils::mutable_grad(self->tensor);
+  if (p_grad) {
+    PADDLE_ENFORCE_EQ(src.initialized(), true,
+                      platform::errors::InvalidArgument(
+                          "Tensor %s has not been initialized", src.name()));
+    p_grad->set_impl(src.impl());
+  }
+  Py_INCREF(Py_None);
+  return Py_None;
+  EAGER_CATCH_AND_THROW_RETURN_NULL
+}
+
 static PyObject* tensor_method_get_non_zero_indices(TensorObject* self,
                                                     PyObject* args,
                                                     PyObject* kwargs) {
...
@@ -1117,7 +1159,12 @@ PyMethodDef variable_methods[] = {
     {"_register_backward_hook",
      (PyCFunction)(void (*)(void))tensor_register_reduce_hook,
      METH_VARARGS | METH_KEYWORDS, NULL},
-    {"_set_grad_type", (PyCFunction)(void (*)(void))set_grad_type,
+    {"_set_grad_type", (PyCFunction)(void (*)(void))tensor__set_grad_type,
+     METH_VARARGS | METH_KEYWORDS, NULL},
+    {"_clear", (PyCFunction)(void (*)(void))tensor__clear,
+     METH_VARARGS | METH_KEYWORDS, NULL},
+    {"_copy_gradient_from", (PyCFunction)(void (*)(void))tensor__copy_gradient_from,
+     METH_VARARGS | METH_KEYWORDS, NULL},
     /***the method of sparse tensor****/
    {"non_zero_indices",
...
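
The method-table entries above all use the same CPython idiom: a handler taking (self, args, kwargs) must be stored as a plain PyCFunction, and the double cast through void(*)(void) silences the mismatched-signature warning. A minimal compilable sketch of the pattern (Demo, kMethods, and the "demo" module are illustrative names, not Paddle's):

#include <Python.h>

// Handler with the METH_VARARGS | METH_KEYWORDS calling convention.
static PyObject* Demo(PyObject* self, PyObject* args, PyObject* kwargs) {
  Py_RETURN_NONE;
}

static PyMethodDef kMethods[] = {
    // Cast via void(*)(void) first, then to PyCFunction, as above.
    {"demo", (PyCFunction)(void (*)(void))Demo, METH_VARARGS | METH_KEYWORDS,
     nullptr},
    {nullptr, nullptr, 0, nullptr}};

static struct PyModuleDef kModule = {PyModuleDef_HEAD_INIT, "demo", nullptr,
                                     -1, kMethods};

PyMODINIT_FUNC PyInit_demo() { return PyModule_Create(&kModule); }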
paddle/fluid/pybind/imperative.cc
...
@@ -655,6 +655,7 @@ void BindImperative(py::module *m_ptr) {
        } else {
          act_name = name.cast<std::string>();
        }
+       VLOG(4) << "Init VarBase :" << act_name;
        new (&self) imperative::VarBase(act_name);
        self.SetPersistable(persistable);
        self.SetType(type);
...
paddle/fluid/pybind/pybind.cc
...
@@ -829,6 +829,8 @@ PYBIND11_MODULE(core_noavx, m) {
           [](const framework::Tensor &self) {
             return reinterpret_cast<uintptr_t>(self.data());
           })
+      .def("_slice", &framework::Tensor::Slice)
+      .def("_numel", &framework::Tensor::numel)
      .def("_is_initialized",
           [](const framework::Tensor &self) { return self.IsInitialized(); })
      .def("_get_dims",
...
paddle/phi/api/include/tensor.h
...
@@ -427,9 +427,7 @@ class PADDLE_API Tensor final {
    * @param blocking, Should we copy this in sync way.
    * @return void
    */
-  void copy_(const Tensor& src,
-             const phi::Place& target_place,
-             const bool blocking);
+  void copy_(const Tensor& src, const phi::Place& target_place, bool blocking);
   /**
    * @brief Cast datatype from one to another
    *
...
paddle/phi/api/lib/tensor_method.cc
...
@@ -84,26 +84,26 @@ void Tensor::copy_(const Tensor &src,
   if (is_initialized()) {
     PADDLE_ENFORCE_EQ(dtype(),
                       src.dtype(),
-                      platform::errors::PreconditionNotMet(
+                      phi::errors::PreconditionNotMet(
                           "Tensor %s has different data type with Tensor %s, "
                           "Tensor Copy cannot be performed!",
                           name(),
                           src.name()));
     PADDLE_ENFORCE_EQ(impl()->type_info().id(),
                       src.impl()->type_info().id(),
-                      platform::errors::PreconditionNotMet(
+                      phi::errors::PreconditionNotMet(
                           "Tensor %s has different type with Tensor %s, Tensor "
                           "Copy cannot be performed!",
                           name(),
                           src.name()));
     PADDLE_ENFORCE_EQ(target_place,
                       inner_place(),
-                      platform::errors::PreconditionNotMet(
+                      phi::errors::PreconditionNotMet(
                           "Place is different of dst tensor and args %s, which "
                           "current tensor holds %s "
                           "Copy cannot be performed!",
-                          target_place.DebugString(),
-                          inner_place().DebugString()));
+                          target_place,
+                          inner_place()));
     kernel_key_set.backend_set =
         kernel_key_set.backend_set |
         BackendSet(phi::TransToPhiBackend(inner_place()));
...
@@ -177,7 +177,7 @@ void Tensor::copy_(const Tensor &src,
                  blocking,
                  static_cast<phi::SelectedRows *>(impl_.get()));
   } else {
-    PADDLE_THROW(paddle::platform::errors::InvalidArgument(
+    PADDLE_THROW(phi::errors::InvalidArgument(
        "We currently only support dense tensor copy for now and if u need to "
        "copy selected rows please raise a issue."));
   }
...
paddle/phi/infermeta/multiary.cc
...
@@ -516,6 +516,215 @@ void ConcatInferMeta(const std::vector<MetaTensor*>& x,
  out->share_lod(*x.at(0));
}
+inline int ConvOutputSize(
+    int input_size, int filter_size, int dilation, int padding, int stride) {
+  const int dkernel = dilation * (filter_size - 1) + 1;
+  int output_size = (input_size + 2 * padding - dkernel) / stride + 1;
+  PADDLE_ENFORCE_GT(
+      output_size, 0,
+      phi::errors::InvalidArgument(
+          "The output's size is expected to be greater than 0. "
+          "But recieved: output's size is %d. The output's size is computed by "
+          "((input_size + 2 * padding - (dilation * (filter_size - 1) + 1)) / "
+          "stride + 1), where input_size is %d, padding is %d, "
+          "filter_size is %d, dilation is %d, stride is %d.",
+          output_size, input_size, padding, filter_size, dilation, stride));
+  return output_size;
+}
+
+void DeformableConvInferMeta(const MetaTensor& x,
+                             const MetaTensor& offset,
+                             const MetaTensor& filter,
+                             paddle::optional<const MetaTensor&> mask,
+                             const std::vector<int>& strides,
+                             const std::vector<int>& paddings,
+                             const std::vector<int>& dilations,
+                             int deformable_groups,
+                             int groups,
+                             int im2col_step,
+                             MetaTensor* out,
+                             MetaConfig config) {
+  auto in_dims = x.dims();
+  auto offset_dims = offset.dims();
+  auto filter_dims = filter.dims();
+
+  PADDLE_ENFORCE_EQ(
+      in_dims.size(), 4,
+      phi::errors::InvalidArgument("Conv input should be 4-D tensor, get %u",
+                                   in_dims.size()));
+  PADDLE_ENFORCE_EQ(in_dims.size(), filter_dims.size(),
+                    phi::errors::InvalidArgument(
+                        "Conv input dimension and filter dimension should be "
+                        "the same. The difference is [%d]: [%d]",
+                        in_dims.size(), filter_dims.size()));
+  PADDLE_ENFORCE_EQ(in_dims.size() - strides.size(), 2U,
+                    phi::errors::InvalidArgument(
+                        "Conv input dimension and strides "
+                        "dimension should be consistent. But received input "
+                        "dimension:[%d], strides dimension:[%d]",
+                        in_dims.size(), strides.size()));
+  PADDLE_ENFORCE_EQ(paddings.size(), strides.size(),
+                    phi::errors::InvalidArgument(
+                        "Conv paddings dimension and Conv strides dimension "
+                        "should be the same. The difference is [%d]: [%d]",
+                        paddings.size(), strides.size()));
+  PADDLE_ENFORCE_EQ(in_dims[1], filter_dims[1] * groups,
+                    phi::errors::InvalidArgument(
+                        "The number of input channels should be equal to filter "
+                        "channels * groups. The difference is [%d]: [%d]",
+                        in_dims[1], filter_dims[1] * groups));
+  PADDLE_ENFORCE_EQ(
+      filter_dims[0] % groups, 0,
+      phi::errors::InvalidArgument(
+          "The number of output channels should be divided by groups. But "
+          "received output channels:[%d], groups:[%d]",
+          filter_dims[0], groups));
+  PADDLE_ENFORCE_EQ(
+      filter_dims[0] % deformable_groups, 0,
+      phi::errors::InvalidArgument(
+          "The number of output channels should be "
+          "divided by deformable groups. The difference is [%d]: [%d]",
+          filter_dims[0] % groups, 0));
+
+  if (in_dims[0] > im2col_step) {
+    PADDLE_ENFORCE_EQ(
+        in_dims[0] % im2col_step, 0U,
+        phi::errors::InvalidArgument(
+            "Input batchsize must be smaller than or divide im2col_step. But "
+            "received Input batchsize:[%d], im2col_step:[%d]",
+            in_dims[0], im2col_step));
+  }
+
+  for (size_t i = 0; i < strides.size(); ++i) {
+    PADDLE_ENFORCE_GT(
+        strides[i], 0U,
+        phi::errors::InvalidArgument("stride %d size incorrect", i));
+  }
+  for (size_t i = 0; i < dilations.size(); ++i) {
+    PADDLE_ENFORCE_GT(
+        dilations[i], 0U,
+        phi::errors::InvalidArgument("dilation %d size incorrect", i));
+  }
+
+  std::vector<int64_t> output_shape({in_dims[0], filter_dims[0]});
+  for (size_t i = 0; i < strides.size(); ++i) {
+    if (!config.is_runtime &&
+        (in_dims[i + 2] <= 0 || filter_dims[i + 2] <= 0)) {
+      output_shape.push_back(-1);
+    } else {
+      output_shape.push_back(ConvOutputSize(in_dims[i + 2], filter_dims[i + 2],
+                                            dilations[i], paddings[i],
+                                            strides[i]));
+    }
+  }
+
+  PADDLE_ENFORCE_EQ(
+      output_shape[1] % deformable_groups, 0U,
+      phi::errors::InvalidArgument(
+          "output num_filter must divide deformable group size. But received "
+          "output num_filter:[%d], deformable group size:[%d]",
+          output_shape[1], deformable_groups));
+
+  if (config.is_runtime) {
+    PADDLE_ENFORCE_EQ(output_shape[2], offset_dims[2],
+                      phi::errors::InvalidArgument(
+                          "output height must equal to offset map height. "
+                          "The difference is [%d]: [%d]",
+                          output_shape[2], offset_dims[2]));
+    PADDLE_ENFORCE_EQ(output_shape[3], offset_dims[3],
+                      phi::errors::InvalidArgument(
+                          "output width must equal to offset map width. The "
+                          "difference is [%d]: [%d]",
+                          output_shape[3], offset_dims[3]));
+    PADDLE_ENFORCE_EQ(offset_dims[1] % (filter_dims[2] * filter_dims[3]), 0U,
+                      phi::errors::InvalidArgument(
+                          "offset filter must divide deformable group size. "
+                          "But received [%d]: [%d]",
+                          offset_dims[1], filter_dims[2] * filter_dims[3]));
+    PADDLE_ENFORCE_EQ(
+        offset_dims[1] / (2 * filter_dims[2] * filter_dims[3]),
+        deformable_groups,
+        phi::errors::InvalidArgument(
+            "offset filter must divide deformable group size. But received "
+            "[%d]: [%d]",
+            offset_dims[1] / (2 * filter_dims[2] * filter_dims[3]),
+            deformable_groups));
+
+    if (mask) {
+      auto mask_dims = mask->dims();
+      PADDLE_ENFORCE_EQ(output_shape[2], mask_dims[2],
+                        phi::errors::InvalidArgument(
+                            "output height must equal to mask map height. The "
+                            "difference is [%d] vs [%d]",
+                            output_shape[2], mask_dims[2]));
+      PADDLE_ENFORCE_EQ(output_shape[3], mask_dims[3],
+                        phi::errors::InvalidArgument(
+                            "output width must equal to mask map width. The "
+                            "difference is [%d] vs [%d]",
+                            output_shape[3], mask_dims[3]));
+      PADDLE_ENFORCE_EQ(mask_dims[1] % (filter_dims[2] * filter_dims[3]), 0U,
+                        phi::errors::InvalidArgument(
+                            "mask filter must divide deformable group size. "
+                            "But received [%d]: [%d]",
+                            mask_dims[1], filter_dims[2] * filter_dims[3]));
+      PADDLE_ENFORCE_EQ(mask_dims[1] / (filter_dims[2] * filter_dims[3]),
+                        deformable_groups,
+                        phi::errors::InvalidArgument(
+                            "mask filter must divide deformable group size. "
+                            "But received [%d]: [%d]",
+                            mask_dims[1] / (filter_dims[2] * filter_dims[3]),
+                            deformable_groups));
+    }
+  }
+
+  out->set_dims(phi::make_ddim(output_shape));
+  out->set_dtype(x.dtype());
+}
+
 void HierarchicalSigmoidInferMeta(const MetaTensor& x,
                                   const MetaTensor& w,
                                   const MetaTensor& label,
...
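
A quick worked instance of the ConvOutputSize formula added above, out = (in + 2*pad - (dilation*(k-1)+1)) / stride + 1, as a standalone sketch (ConvOut is a hypothetical helper, not Paddle code):

#include <iostream>

int ConvOut(int in, int k, int dilation, int pad, int stride) {
  const int dkernel = dilation * (k - 1) + 1;  // effective (dilated) kernel
  return (in + 2 * pad - dkernel) / stride + 1;
}

int main() {
  // 32x32 input, 3x3 kernel, dilation 1, pad 1, stride 1 -> 32 (same size).
  std::cout << ConvOut(32, 3, 1, 1, 1) << "\n";
  // Stride 2 halves it: (32 + 2 - 3) / 2 + 1 = 16.
  std::cout << ConvOut(32, 3, 1, 1, 2) << "\n";
}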
paddle/phi/infermeta/multiary.h
...
@@ -120,6 +120,19 @@ void ConcatInferMeta(const std::vector<MetaTensor*>& x,
                     MetaTensor* out,
                     MetaConfig config = MetaConfig());
+void DeformableConvInferMeta(const MetaTensor& x,
+                             const MetaTensor& offset,
+                             const MetaTensor& filter,
+                             paddle::optional<const MetaTensor&> mask,
+                             const std::vector<int>& strides,
+                             const std::vector<int>& paddings,
+                             const std::vector<int>& dilations,
+                             int deformable_groups,
+                             int groups,
+                             int im2col_step,
+                             MetaTensor* out,
+                             MetaConfig config = MetaConfig());
+
 void HierarchicalSigmoidInferMeta(const MetaTensor& x,
                                   const MetaTensor& w,
                                   const MetaTensor& label,
...
paddle/phi/infermeta/unary.cc
...
@@ -352,6 +352,14 @@ void FlattenInferMeta(const MetaTensor& x,
                      int start_axis,
                      int stop_axis,
                      MetaTensor* out) {
+  FlattenWithXShapeInferMeta(x, start_axis, stop_axis, out, nullptr);
+}
+
+void FlattenWithXShapeInferMeta(const MetaTensor& x,
+                                int start_axis,
+                                int stop_axis,
+                                MetaTensor* out,
+                                MetaTensor* xshape) {
   auto x_dims = x.dims();
   int in_dims_size = x_dims.size();
   if (start_axis < 0) {
...
@@ -394,6 +402,14 @@ void FlattenInferMeta(const MetaTensor& x,
    // are the same.
    out->share_lod(x);
  }
+  if (xshape == nullptr) return;
+  std::vector<int64_t> xshape_dims(x_dims.size() + 1);
+  xshape_dims[0] = 0;
+  for (int i = 0; i < x_dims.size(); ++i) {
+    xshape_dims[i + 1] = x_dims[i];
+  }
+  xshape->set_dims(phi::make_ddim(xshape_dims));
+  xshape->share_lod(x);
}

void GumbelSoftmaxInferMeta(const MetaTensor& x,
...
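
The XShape block added above records the forward input's shape as [0, d0, d1, ...]; the leading 0 appears to act as a placeholder dim for a shape-only tensor (the grad path, see FlattenGradKernel below, reads only dims 1..n via slice_ddim). A standalone sketch of the encoding (MakeXShape is a hypothetical helper):

#include <cstdint>
#include <iostream>
#include <vector>

std::vector<int64_t> MakeXShape(const std::vector<int64_t>& x_dims) {
  std::vector<int64_t> xshape(x_dims.size() + 1, 0);  // xshape[0] stays 0
  for (size_t i = 0; i < x_dims.size(); ++i) xshape[i + 1] = x_dims[i];
  return xshape;
}

int main() {
  for (int64_t d : MakeXShape({2, 3, 4})) std::cout << d << ' ';
  // prints: 0 2 3 4
}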
paddle/phi/infermeta/unary.h
...
@@ -86,6 +86,12 @@ void FlattenInferMeta(const MetaTensor& x,
                      int stop_axis,
                      MetaTensor* out);
+void FlattenWithXShapeInferMeta(const MetaTensor& x,
+                                int start_axis,
+                                int stop_axis,
+                                MetaTensor* out,
+                                MetaTensor* xshape);
+
 void GumbelSoftmaxInferMeta(const MetaTensor& x,
                             float temperature,
                             bool hard,
...
paddle/phi/kernels/CMakeLists.txt
...
@@ -27,12 +27,14 @@ kernel_library(full_kernel DEPS ${COMMON_KERNEL_DEPS} empty_kernel)
 # Some kernels depend on some targets that are not commonly used.
 # These targets are not suitable for common dependencies.
 # In this case, you need to manually generate them here.
-set(MANUAL_BUILD_KERNELS eigh_kernel gumbel_softmax_kernel gumbel_softmax_grad_kernel
+set(MANUAL_BUILD_KERNELS deformable_conv_kernel deformable_conv_grad_kernel
+  eigh_kernel gumbel_softmax_kernel gumbel_softmax_grad_kernel
   hierarchical_sigmoid_kernel hierarchical_sigmoid_grad_kernel
   matrix_power_kernel matrix_power_grad_kernel maxout_kernel maxout_grad_kernel pool_kernel
   put_along_axis_kernel put_along_axis_grad_kernel segment_pool_kernel segment_pool_grad_kernel
   softmax_kernel softmax_grad_kernel take_along_axis_kernel take_along_axis_grad_kernel
   triangular_solve_grad_kernel determinant_grad_kernel reduce_kernel)
+kernel_library(deformable_conv_kernel DEPS ${COMMON_KERNEL_DEPS} deformable_conv_functor)
+kernel_library(deformable_conv_grad_kernel DEPS ${COMMON_KERNEL_DEPS} deformable_conv_functor)
 kernel_library(eigh_kernel DEPS ${COMMON_KERNEL_DEPS} lapack_function)
 kernel_library(hierarchical_sigmoid_kernel DEPS ${COMMON_KERNEL_DEPS} matrix_bit_code)
 kernel_library(hierarchical_sigmoid_grad_kernel DEPS ${COMMON_KERNEL_DEPS} matrix_bit_code)
...
paddle/phi/kernels/cpu/deformable_conv_grad_kernel.cc
(new file)
// Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
#include "paddle/phi/kernels/deformable_conv_grad_kernel.h"
#include "paddle/phi/backends/cpu/cpu_context.h"
#include "paddle/phi/core/kernel_registry.h"
#include "paddle/phi/kernels/impl/deformable_conv_grad_kernel_impl.h"
namespace phi {

template <typename T>
inline void ModulatedDeformableCol2imCPUKernel(
    const int num_kernels, const T* data_col, const T* data_offset,
    const T* data_mask, const int channels, const int height, const int width,
    const int kernel_h, const int kernel_w, const int pad_h, const int pad_w,
    const int stride_h, const int stride_w, const int dilation_h,
    const int dilation_w, const int channel_per_deformable_group,
    const int batch_size, const int deformable_group, const int height_col,
    const int width_col, T* grad_im) {
  for (int thread = 0; thread < num_kernels; thread++) {
    const int j = (thread / width_col / height_col / batch_size) % kernel_w;
    const int i =
        (thread / width_col / height_col / batch_size / kernel_w) % kernel_h;
    const int c =
        thread / width_col / height_col / batch_size / kernel_w / kernel_h;

    const int deformable_group_index = c / channel_per_deformable_group;

    int w_out = thread % width_col;
    int h_out = (thread / width_col) % height_col;
    int b = (thread / width_col / height_col) % batch_size;
    int w_in = w_out * stride_w - pad_w;
    int h_in = h_out * stride_h - pad_h;

    const T* data_offset_ptr =
        data_offset + (b * deformable_group + deformable_group_index) * 2 *
                          kernel_h * kernel_w * height_col * width_col;
    const int data_offset_h_ptr =
        ((2 * (i * kernel_w + j)) * height_col + h_out) * width_col + w_out;
    const int data_offset_w_ptr =
        ((2 * (i * kernel_w + j) + 1) * height_col + h_out) * width_col + w_out;
    const int data_mask_hw_ptr =
        ((i * kernel_w + j) * height_col + h_out) * width_col + w_out;
    const T offset_h = data_offset_ptr[data_offset_h_ptr];
    const T offset_w = data_offset_ptr[data_offset_w_ptr];
    const T cur_inv_h_data = h_in + i * dilation_h + offset_h;
    const T cur_inv_w_data = w_in + j * dilation_w + offset_w;

    T cur_top_grad = data_col[thread];
    if (data_mask) {
      const T* data_mask_ptr =
          data_mask + (b * deformable_group + deformable_group_index) *
                          kernel_h * kernel_w * height_col * width_col;
      const T mask = data_mask_ptr[data_mask_hw_ptr];
      cur_top_grad *= mask;
    }
    const int cur_h = static_cast<int>(cur_inv_h_data);
    const int cur_w = static_cast<int>(cur_inv_w_data);
    for (int dy = -2; dy <= 2; dy++) {
      for (int dx = -2; dx <= 2; dx++) {
        if (cur_h + dy >= 0 && cur_h + dy < height && cur_w + dx >= 0 &&
            cur_w + dx < width && abs(cur_inv_h_data - (cur_h + dy)) < 1 &&
            abs(cur_inv_w_data - (cur_w + dx)) < 1) {
          int cur_bottom_grad_pos =
              ((b * channels + c) * height + cur_h + dy) * width + cur_w + dx;
          T weight =
              DmcnGetGradientWeight(cur_inv_h_data, cur_inv_w_data, cur_h + dy,
                                    cur_w + dx, height, width);
          *(grad_im + cur_bottom_grad_pos) =
              *(grad_im + cur_bottom_grad_pos) + weight * cur_top_grad;
        }
      }
    }
  }
}

template <typename T, typename Context>
void ModulatedDeformableCol2im(const Context& dev_ctx, const T* data_col,
                               const T* data_offset, const T* data_mask,
                               const std::vector<int64_t>& im_shape,
                               const std::vector<int64_t>& col_shape,
                               const std::vector<int64_t>& kernel_shape,
                               const std::vector<int>& pad,
                               const std::vector<int>& stride,
                               const std::vector<int>& dilation,
                               const int deformable_group, T* grad_im) {
  int channel_per_deformable_group = im_shape[0] / deformable_group;
  int num_kernels = col_shape[0] * col_shape[1] * col_shape[2] * col_shape[3];

  ModulatedDeformableCol2imCPUKernel(
      num_kernels, data_col, data_offset, data_mask, im_shape[0], im_shape[1],
      im_shape[2], kernel_shape[2], kernel_shape[3], pad[0], pad[1], stride[0],
      stride[1], dilation[0], dilation[1], channel_per_deformable_group,
      col_shape[1], deformable_group, col_shape[2], col_shape[3], grad_im);
}

template <typename T>
void ModulatedDeformableCol2imCoordCPUKernel(
    const int num_kernels, const T* data_col, const T* data_im,
    const T* data_offset, const T* data_mask, const int channels,
    const int height, const int width, const int kernel_h, const int kernel_w,
    const int pad_h, const int pad_w, const int stride_h, const int stride_w,
    const int dilation_h, const int dilation_w,
    const int channel_per_deformable_group, const int batch_size,
    const int offset_channels, const int deformable_group,
    const int height_col, const int width_col, T* grad_offset, T* grad_mask) {
  for (int i = 0; i < num_kernels; i++) {
    T val = 0, mval = 0;
    const int w = i % width_col;
    const int h = (i / width_col) % height_col;
    const int c = (i / width_col / height_col) % offset_channels;
    const int b = (i / width_col / height_col) / offset_channels;

    const int deformable_group_index = c / (2 * kernel_h * kernel_w);
    const int col_step = kernel_h * kernel_w;
    int cnt = 0;
    const T* data_col_ptr = data_col + deformable_group_index *
                                           channel_per_deformable_group *
                                           batch_size * width_col * height_col;
    const T* data_im_ptr =
        data_im + (b * deformable_group + deformable_group_index) *
                      channel_per_deformable_group / kernel_h / kernel_w *
                      height * width;
    const T* data_offset_ptr =
        data_offset + (b * deformable_group + deformable_group_index) * 2 *
                          kernel_h * kernel_w * height_col * width_col;
    const T* data_mask_ptr =
        data_mask
            ? data_mask + (b * deformable_group + deformable_group_index) *
                              kernel_h * kernel_w * height_col * width_col
            : nullptr;

    const int offset_c = c - deformable_group_index * 2 * kernel_h * kernel_w;

    for (int col_c = offset_c / 2; col_c < channel_per_deformable_group;
         col_c += col_step) {
      const int col_pos =
          (((col_c * batch_size + b) * height_col) + h) * width_col + w;
      const int bp_dir = offset_c % 2;

      int j = (col_pos / width_col / height_col / batch_size) % kernel_w;
      int i =
          (col_pos / width_col / height_col / batch_size / kernel_w) % kernel_h;
      int w_out = col_pos % width_col;
      int h_out = (col_pos / width_col) % height_col;
      int w_in = w_out * stride_w - pad_w;
      int h_in = h_out * stride_h - pad_h;
      const int data_offset_h_ptr =
          (((2 * (i * kernel_w + j)) * height_col + h_out) * width_col + w_out);
      const int data_offset_w_ptr =
          (((2 * (i * kernel_w + j) + 1) * height_col + h_out) * width_col +
           w_out);
      const T offset_h = data_offset_ptr[data_offset_h_ptr];
      const T offset_w = data_offset_ptr[data_offset_w_ptr];
      T inv_h = h_in + i * dilation_h + offset_h;
      T inv_w = w_in + j * dilation_w + offset_w;
      if (inv_h <= -1 || inv_w <= -1 || inv_h >= height || inv_w >= width) {
        inv_h = inv_w = -2;
      } else {
        mval += data_col_ptr[col_pos] *
                funcs::DmcnIm2colBilinear(data_im_ptr + cnt * height * width,
                                          width, height, width, inv_h, inv_w);
      }
      const T weight = DmcnGetCoordinateWeight(
          inv_h, inv_w, height, width, data_im_ptr + cnt * height * width,
          width, bp_dir);
      if (data_mask_ptr) {
        const int data_mask_hw_ptr =
            (((i * kernel_w + j) * height_col + h_out) * width_col + w_out);
        const T mask = data_mask_ptr[data_mask_hw_ptr];
        val += weight * data_col_ptr[col_pos] * mask;
      } else {
        val += weight * data_col_ptr[col_pos];
      }
      cnt += 1;
    }
    grad_offset[i] = val;
    if (grad_mask && offset_c % 2 == 0)
      grad_mask[(((b * deformable_group + deformable_group_index) * kernel_h *
                      kernel_w +
                  offset_c / 2) *
                     height_col +
                 h) *
                    width_col +
                w] = mval;
  }
}

template <typename T, typename Context>
void ModulatedDeformableCol2imCoord(
    const Context& dev_ctx, const T* data_col, const T* data_im,
    const T* data_offset, const T* data_mask,
    const std::vector<int64_t>& im_shape,
    const std::vector<int64_t>& col_shape,
    const std::vector<int64_t>& kernel_shape, const std::vector<int>& paddings,
    const std::vector<int>& strides, const std::vector<int>& dilations,
    const int deformable_groups, T* grad_offset, T* grad_mask) {
  int num_kernels = 2 * kernel_shape[2] * kernel_shape[3] * col_shape[1] *
                    col_shape[2] * col_shape[3] * deformable_groups;
  int channel_per_deformable_group = col_shape[0] / deformable_groups;

  ModulatedDeformableCol2imCoordCPUKernel(
      num_kernels, data_col, data_im, data_offset, data_mask, im_shape[0],
      im_shape[1], im_shape[2], kernel_shape[2], kernel_shape[3], paddings[0],
      paddings[1], strides[0], strides[1], dilations[0], dilations[1],
      channel_per_deformable_group, col_shape[1],
      2 * kernel_shape[2] * kernel_shape[3] * deformable_groups,
      deformable_groups, col_shape[2], col_shape[3], grad_offset, grad_mask);
}

template <typename T, typename Context>
void FilterGradAddup(const Context& dev_ctx, const int nthreads, const int n,
                     const int height, const int width, const T* dweight_3d,
                     T* filter_grad) {
  for (int i = 0; i < nthreads; i++) {
    filter_grad[i] = filter_grad[i] + dweight_3d[i];
  }
}

}  // namespace phi

PD_REGISTER_KERNEL(deformable_conv_grad,
                   CPU,
                   ALL_LAYOUT,
                   phi::DeformableConvGradKernel,
                   float,
                   double) {}
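
DmcnGetGradientWeight, called in the col2im loop above, is defined in the impl header (collapsed below). As a hedged sketch of what it computes in the reference deformable-conv implementation: the bilinear weight of integer grid point (h, w) for the fractional sample location, with points farther than one pixel contributing zero. Standalone illustration (GradientWeight is a hypothetical stand-in):

#include <cmath>
#include <iostream>

template <typename T>
T GradientWeight(T argmax_h, T argmax_w, int h, int w, int height, int width) {
  // Samples that fall outside the image contribute no gradient.
  if (argmax_h <= -1 || argmax_h >= height || argmax_w <= -1 ||
      argmax_w >= width) {
    return 0;
  }
  const T dh = 1 - std::abs(argmax_h - h);
  const T dw = 1 - std::abs(argmax_w - w);
  return (dh > 0 && dw > 0) ? dh * dw : static_cast<T>(0);
}

int main() {
  // Sample at (1.25, 2.5): grid point (1, 2) gets 0.75 * 0.5 = 0.375.
  std::cout << GradientWeight<float>(1.25f, 2.5f, 1, 2, 4, 4) << "\n";
}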
paddle/phi/kernels/cpu/deformable_conv_kernel.cc
...
@@ -18,126 +18,6 @@
 #include "paddle/phi/core/kernel_registry.h"
 #include "paddle/phi/kernels/impl/deformable_conv_kernel_impl.h"

Removed: the file-local im2col helpers. ModulatedDeformableIm2colCPUKernel and
ModulatedDeformableIm2col moved, essentially verbatim, into
paddle/phi/kernels/funcs/deformable_conv_functor.cc (reproduced in full below).
The only behavioral difference is the mask handling: the removed CPU kernel
required data_mask and always applied it,

-        const T mask = data_mask_ptr[data_mask_hw_ptr];
-        ...
-        *data_col_ptr = val * mask;

whereas the functor version accepts a null data_mask (for deformable_conv_v1)
and multiplies by the mask only when one is present.

 PD_REGISTER_KERNEL(deformable_conv,
                    CPU,
                    ALL_LAYOUT,
...
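
Both the removed helper and the functor sample the input at fractional, offset-shifted locations through DmcnIm2colBilinear. A hedged standalone sketch of that bilinear read (Bilinear is a hypothetical stand-in with the same argument order as the calls above: data, data_width, height, width, h, w):

#include <cmath>
#include <iostream>

template <typename T>
T Bilinear(const T* img, int data_width, int height, int width, T h, T w) {
  int h_low = static_cast<int>(std::floor(h));
  int w_low = static_cast<int>(std::floor(w));
  int h_high = h_low + 1, w_high = w_low + 1;
  T lh = h - h_low, lw = w - w_low, hh = 1 - lh, hw = 1 - lw;
  // Out-of-range neighbors read as zero.
  T v1 = (h_low >= 0 && w_low >= 0) ? img[h_low * data_width + w_low] : 0;
  T v2 = (h_low >= 0 && w_high <= width - 1) ? img[h_low * data_width + w_high]
                                             : 0;
  T v3 = (h_high <= height - 1 && w_low >= 0) ? img[h_high * data_width + w_low]
                                              : 0;
  T v4 = (h_high <= height - 1 && w_high <= width - 1)
             ? img[h_high * data_width + w_high]
             : 0;
  return hh * hw * v1 + hh * lw * v2 + lh * hw * v3 + lh * lw * v4;
}

int main() {
  const float img[4] = {0.f, 1.f, 2.f, 3.f};  // 2x2 image
  std::cout << Bilinear(img, 2, 2, 2, 0.5f, 0.5f) << "\n";  // 1.5
}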
paddle/phi/kernels/deformable_conv_grad_kernel.h
(new file)
// Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
#pragma once
#include "paddle/phi/core/dense_tensor.h"
namespace phi {

template <typename T, typename Context>
void DeformableConvGradKernel(const Context& dev_ctx,
                              const DenseTensor& x,
                              const DenseTensor& offset,
                              const DenseTensor& filter,
                              paddle::optional<const DenseTensor&> mask,
                              const DenseTensor& out_grad,
                              const std::vector<int>& strides,
                              const std::vector<int>& paddings,
                              const std::vector<int>& dilations,
                              int deformable_groups,
                              int groups,
                              int im2col_step,
                              DenseTensor* dx,
                              DenseTensor* offset_grad,
                              DenseTensor* filter_grad,
                              DenseTensor* mask_grad);

}  // namespace phi
paddle/phi/kernels/deformable_conv_kernel.h
...
@@ -15,6 +15,7 @@
 #pragma once

 #include "paddle/phi/core/dense_tensor.h"
+#include "paddle/utils/optional.h"

 namespace phi {
...
@@ -23,7 +24,7 @@ void DeformableConvKernel(const Context& dev_ctx,
                          const DenseTensor& x,
                          const DenseTensor& offset,
                          const DenseTensor& filter,
-                         const DenseTensor& mask,
+                         paddle::optional<const DenseTensor&> mask,
                          const std::vector<int>& strides,
                          const std::vector<int>& paddings,
                          const std::vector<int>& dilations,
...
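
The mask parameter becomes optional so one kernel can serve both deformable_conv (masked, v2) and deformable_conv_v1 (no mask). A sketch of the call-site pattern this enables, using std::optional<std::reference_wrapper<...>> as a stand-in for paddle::optional (both expose operator bool and member access; all names here are illustrative, not Paddle API):

#include <functional>
#include <iostream>
#include <optional>
#include <vector>

using Tensor = std::vector<float>;

float SumWithOptionalMask(
    const Tensor& x,
    std::optional<std::reference_wrapper<const Tensor>> mask) {
  float s = 0;
  for (size_t i = 0; i < x.size(); ++i) {
    // With no mask (the v1 path), behave as if the mask were all ones.
    s += x[i] * (mask ? mask->get()[i] : 1.0f);
  }
  return s;
}

int main() {
  Tensor x{1, 2, 3}, m{0, 1, 1};
  std::cout << SumWithOptionalMask(x, std::cref(m)) << " "    // 5
            << SumWithOptionalMask(x, std::nullopt) << "\n";  // 6
}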
paddle/phi/kernels/flatten_grad_kernel.cc
...
@@ -25,6 +25,7 @@ void FlattenGradKernel(const Context& dev_ctx,
                        const DenseTensor& xshape,
                        DenseTensor* x_grad) {
   auto xshape_dims = xshape.dims();
+  dev_ctx.Alloc(x_grad, out_grad.dtype());
   auto x_dims = phi::slice_ddim(xshape_dims, 1, xshape_dims.size());
   phi::Copy(dev_ctx, out_grad, dev_ctx.GetPlace(), false, x_grad);
   x_grad->Resize(x_dims);
...
paddle/phi/kernels/flatten_kernel.cc
...
@@ -27,6 +27,7 @@ void FlattenKernel(const Context& dev_ctx,
                    int start_axis,
                    int stop_axis,
                    DenseTensor* out) {
+  dev_ctx.Alloc(out, x.dtype());
   auto out_dims = out->dims();
   phi::Copy(dev_ctx, x, dev_ctx.GetPlace(), false, out);
   out->Resize(out_dims);
...
@@ -43,7 +44,6 @@ void FlattenWithXShape(const Context& dev_ctx,
                        DenseTensor* out,
                        DenseTensor* xshape) {
   FlattenKernel<T, Context>(dev_ctx, x, start_axis, stop_axis, out);
-  funcs::SetXShape(x, xshape);
 }

 }  // namespace phi
...
paddle/phi/kernels/funcs/CMakeLists.txt
...
@@ -3,6 +3,7 @@ add_subdirectory(blas)
 add_subdirectory(lapack)
 add_subdirectory(detail)

+math_library(deformable_conv_functor DEPS dense_tensor)
 math_library(concat_and_split_functor DEPS dense_tensor)
 math_library(gru_compute DEPS activation_functions math_function)
 math_library(lstm_compute DEPS activation_functions)
...
paddle/phi/kernels/funcs/deformable_conv_functor.cc
(new file)
// Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
#include "paddle/phi/kernels/funcs/deformable_conv_functor.h"
#include "paddle/phi/backends/cpu/cpu_context.h"
namespace phi {
namespace funcs {

template <typename T>
inline void ModulatedDeformableIm2colCPUKernel(
    const int num_kernels, const T* data_im, const T* data_offset,
    const T* data_mask, const int height, const int width, const int kernel_h,
    const int kernel_w, const int pad_h, const int pad_w, const int stride_h,
    const int stride_w, const int dilation_h, const int dilation_w,
    const int channel_per_deformable_group, const int batch_size,
    const int num_channels, const int deformable_group, const int height_col,
    const int width_col, T* data_col) {
  for (int i = 0; i < num_kernels; i++) {
    const int w_col = i % width_col;
    const int h_col = (i / width_col) % height_col;
    const int b_col = (i / width_col) / height_col % batch_size;
    const int c_im = (i / width_col / height_col) / batch_size;
    const int c_col = c_im * kernel_h * kernel_w;

    const int deformable_group_index = c_im / channel_per_deformable_group;

    const int h_in = h_col * stride_h - pad_h;
    const int w_in = w_col * stride_w - pad_w;

    T* data_col_ptr =
        data_col +
        ((c_col * batch_size + b_col) * height_col + h_col) * width_col + w_col;
    const T* data_im_ptr =
        data_im + (b_col * num_channels + c_im) * height * width;
    const T* data_offset_ptr =
        data_offset + (b_col * deformable_group + deformable_group_index) * 2 *
                          kernel_h * kernel_w * height_col * width_col;
    const T* data_mask_ptr =
        data_mask
            ? data_mask + (b_col * deformable_group + deformable_group_index) *
                              kernel_h * kernel_w * height_col * width_col
            : nullptr;

    for (int i = 0; i < kernel_h; ++i) {
      for (int j = 0; j < kernel_w; ++j) {
        const int data_offset_h_ptr =
            ((2 * (i * kernel_w + j)) * height_col + h_col) * width_col + w_col;
        const int data_offset_w_ptr =
            ((2 * (i * kernel_w + j) + 1) * height_col + h_col) * width_col +
            w_col;
        const T offset_h = data_offset_ptr[data_offset_h_ptr];
        const T offset_w = data_offset_ptr[data_offset_w_ptr];
        T val = static_cast<T>(0);
        const T h_im = h_in + i * dilation_h + offset_h;
        const T w_im = w_in + j * dilation_w + offset_w;
        if (h_im > -1 && w_im > -1 && h_im < height && w_im < width) {
          val =
              DmcnIm2colBilinear(data_im_ptr, width, height, width, h_im, w_im);
        }
        *data_col_ptr = val;
        if (data_mask_ptr) {
          const int data_mask_hw_ptr =
              ((i * kernel_w + j) * height_col + h_col) * width_col + w_col;
          const T mask = data_mask_ptr[data_mask_hw_ptr];
          *data_col_ptr *= mask;
        }
        data_col_ptr += batch_size * height_col * width_col;
      }
    }
  }
}

template <typename T, typename Context>
void ModulatedDeformableIm2col(const Context& dev_ctx, const T* data_im,
                               const T* data_offset, const T* data_mask,
                               const std::vector<int64_t>& im_shape,
                               const std::vector<int64_t>& col_shape,
                               const std::vector<int64_t>& filter_shape,
                               const std::vector<int>& paddings,
                               const std::vector<int>& strides,
                               const std::vector<int>& dilations,
                               const int deformable_groups, T* data_col) {
  int channel_per_deformable_group = im_shape[0] / deformable_groups;
  int num_kernels = im_shape[0] * col_shape[1] * col_shape[2] * col_shape[3];

  // get outputs of im2col with offset by bilinear interpolation
  ModulatedDeformableIm2colCPUKernel(
      num_kernels, data_im, data_offset, data_mask, im_shape[1], im_shape[2],
      filter_shape[2], filter_shape[3], paddings[0], paddings[1], strides[0],
      strides[1], dilations[0], dilations[1], channel_per_deformable_group,
      col_shape[1], im_shape[0], deformable_groups, col_shape[2], col_shape[3],
      data_col);
}

template void ModulatedDeformableIm2col(
    const phi::CPUContext& dev_ctx, const float* data_im,
    const float* data_offset, const float* data_mask,
    const std::vector<int64_t>& im_shape, const std::vector<int64_t>& col_shape,
    const std::vector<int64_t>& filter_shape, const std::vector<int>& paddings,
    const std::vector<int>& strides, const std::vector<int>& dilations,
    const int deformable_groups, float* data_col);

template void ModulatedDeformableIm2col(
    const phi::CPUContext& dev_ctx, const double* data_im,
    const double* data_offset, const double* data_mask,
    const std::vector<int64_t>& im_shape, const std::vector<int64_t>& col_shape,
    const std::vector<int64_t>& filter_shape, const std::vector<int>& paddings,
    const std::vector<int>& strides, const std::vector<int>& dilations,
    const int deformable_groups, double* data_col);

}  // namespace funcs
}  // namespace phi
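
The column buffer written above is laid out as [channels * kernel_h * kernel_w, batch, height_col, width_col], which is why data_col_ptr advances by batch_size * height_col * width_col per kernel tap. A standalone sketch of that addressing (ColIndex is a hypothetical helper):

#include <cstdint>
#include <iostream>

// Flat index into a [c_col, batch, height_col, width_col] column buffer.
int64_t ColIndex(int64_t c_col, int64_t b, int64_t h, int64_t w, int64_t batch,
                 int64_t height_col, int64_t width_col) {
  return ((c_col * batch + b) * height_col + h) * width_col + w;
}

int main() {
  // Second tap (c_col = 1) starts one batch*H*W block later: 4*8*8 = 256.
  std::cout << ColIndex(0, 0, 0, 1, 4, 8, 8) << " "   // 1
            << ColIndex(1, 0, 0, 0, 4, 8, 8) << "\n"; // 256
}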
paddle/phi/kernels/funcs/deformable_conv_functor.cu
(new file)
// Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
#include "paddle/phi/kernels/funcs/deformable_conv_functor.h"
#include "paddle/phi/backends/gpu/gpu_context.h"
namespace
phi
{
namespace
funcs
{
static
constexpr
int
kNumCUDAThreads
=
512
;
static
constexpr
int
kNumMaximumNumBlocks
=
4096
;
static
inline
int
NumBlocks
(
const
int
N
)
{
return
std
::
min
((
N
+
kNumCUDAThreads
-
1
)
/
kNumCUDAThreads
,
kNumMaximumNumBlocks
);
}
template
<
typename
T
>
__global__
void
ModulatedDeformableIm2colGpuKernel
(
const
int
nthreads
,
const
T
*
data_im
,
const
T
*
data_offset
,
const
T
*
data_mask
,
const
int
height
,
const
int
width
,
const
int
kernel_h
,
const
int
kernel_w
,
const
int
pad_h
,
const
int
pad_w
,
const
int
stride_h
,
const
int
stride_w
,
const
int
dilation_h
,
const
int
dilation_w
,
const
int
channel_per_deformable_group
,
const
int
batch_size
,
const
int
num_channels
,
const
int
deformable_group
,
const
int
height_col
,
const
int
width_col
,
T
*
data_col
)
{
int
index
=
blockIdx
.
x
*
blockDim
.
x
+
threadIdx
.
x
;
int
offset
=
blockDim
.
x
*
gridDim
.
x
;
for
(
size_t
i
=
index
;
i
<
nthreads
;
i
+=
offset
)
{
const
int
w_col
=
i
%
width_col
;
const
int
h_col
=
(
i
/
width_col
)
%
height_col
;
const
int
b_col
=
(
i
/
width_col
)
/
height_col
%
batch_size
;
const
int
c_im
=
(
i
/
width_col
/
height_col
)
/
batch_size
;
const
int
c_col
=
c_im
*
kernel_h
*
kernel_w
;
const
int
deformable_group_index
=
c_im
/
channel_per_deformable_group
;
const
int
h_in
=
h_col
*
stride_h
-
pad_h
;
const
int
w_in
=
w_col
*
stride_w
-
pad_w
;
T
*
data_col_ptr
=
data_col
+
((
c_col
*
batch_size
+
b_col
)
*
height_col
+
h_col
)
*
width_col
+
w_col
;
const
T
*
data_im_ptr
=
data_im
+
(
b_col
*
num_channels
+
c_im
)
*
height
*
width
;
const
T
*
data_offset_ptr
=
data_offset
+
(
b_col
*
deformable_group
+
deformable_group_index
)
*
2
*
kernel_h
*
kernel_w
*
height_col
*
width_col
;
const
T
*
data_mask_ptr
=
data_mask
?
data_mask
+
(
b_col
*
deformable_group
+
deformable_group_index
)
*
kernel_h
*
kernel_w
*
height_col
*
width_col
:
nullptr
;
for
(
int
i
=
0
;
i
<
kernel_h
;
++
i
)
{
for
(
int
j
=
0
;
j
<
kernel_w
;
++
j
)
{
const
int
data_offset_h_ptr
=
((
2
*
(
i
*
kernel_w
+
j
))
*
height_col
+
h_col
)
*
width_col
+
w_col
;
const
int
data_offset_w_ptr
=
((
2
*
(
i
*
kernel_w
+
j
)
+
1
)
*
height_col
+
h_col
)
*
width_col
+
w_col
;
const
T
offset_h
=
data_offset_ptr
[
data_offset_h_ptr
];
const
T
offset_w
=
data_offset_ptr
[
data_offset_w_ptr
];
T
val
=
static_cast
<
T
>
(
0
);
const
T
h_im
=
h_in
+
i
*
dilation_h
+
offset_h
;
const
T
w_im
=
w_in
+
j
*
dilation_w
+
offset_w
;
if
(
h_im
>
-
1
&&
w_im
>
-
1
&&
h_im
<
height
&&
w_im
<
width
)
{
val
=
DmcnIm2colBilinear
(
data_im_ptr
,
width
,
height
,
width
,
h_im
,
w_im
);
}
*
data_col_ptr
=
val
;
if
(
data_mask_ptr
)
{
const
int
data_mask_hw_ptr
=
((
i
*
kernel_w
+
j
)
*
height_col
+
h_col
)
*
width_col
+
w_col
;
const
T
mask
=
data_mask_ptr
[
data_mask_hw_ptr
];
*
data_col_ptr
*=
mask
;
}
data_col_ptr
+=
batch_size
*
height_col
*
width_col
;
}
}
}
}
template <typename T, typename Context>
void ModulatedDeformableIm2col(const Context& dev_ctx,
                               const T* data_im,
                               const T* data_offset,
                               const T* data_mask,
                               const std::vector<int64_t>& im_shape,
                               const std::vector<int64_t>& col_shape,
                               const std::vector<int64_t>& filter_shape,
                               const std::vector<int>& paddings,
                               const std::vector<int>& strides,
                               const std::vector<int>& dilations,
                               const int deformable_groups,
                               T* data_col) {
  int channel_per_deformable_group = im_shape[0] / deformable_groups;
  int num_kernels = im_shape[0] * col_shape[1] * col_shape[2] * col_shape[3];

  int blocks = NumBlocks(num_kernels);
  int threads = kNumCUDAThreads;

  ModulatedDeformableIm2colGpuKernel<T>
      <<<blocks, threads, 0, dev_ctx.stream()>>>(num_kernels,
                                                 data_im,
                                                 data_offset,
                                                 data_mask,
                                                 im_shape[1],
                                                 im_shape[2],
                                                 filter_shape[2],
                                                 filter_shape[3],
                                                 paddings[0],
                                                 paddings[1],
                                                 strides[0],
                                                 strides[1],
                                                 dilations[0],
                                                 dilations[1],
                                                 channel_per_deformable_group,
                                                 col_shape[1],
                                                 im_shape[0],
                                                 deformable_groups,
                                                 col_shape[2],
                                                 col_shape[3],
                                                 data_col);
}
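Reading the call site above, the shape conventions appear to be (inferred from usage, not from a header): im_shape = {C, H, W} and col_shape = {C * kernel_h * kernel_w, im2col_step, height_col, width_col}, so num_kernels covers one work item per (input channel, batch slice, output pixel). A small hypothetical helper computing the element count the caller's data_col buffer must provide:

#include <cstdint>
#include <iostream>
#include <vector>

// Hypothetical helper: element count of the im2col output buffer, given
// col_shape = {C * kernel_h * kernel_w, im2col_step, height_col, width_col}.
int64_t ColBufferElements(const std::vector<int64_t>& col_shape) {
  int64_t n = 1;
  for (int64_t d : col_shape) n *= d;
  return n;
}

int main() {
  // E.g. C = 64, 3x3 kernel, im2col_step = 1, 32x32 output:
  std::vector<int64_t> col_shape = {64 * 3 * 3, 1, 32, 32};
  std::cout << ColBufferElements(col_shape) << "\n";  // 589824
  return 0;
}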
template void ModulatedDeformableIm2col(
    const phi::GPUContext& dev_ctx,
    const float* data_im,
    const float* data_offset,
    const float* data_mask,
    const std::vector<int64_t>& im_shape,
    const std::vector<int64_t>& col_shape,
    const std::vector<int64_t>& filter_shape,
    const std::vector<int>& paddings,
    const std::vector<int>& strides,
    const std::vector<int>& dilations,
    const int deformable_groups,
    float* data_col);

template void ModulatedDeformableIm2col(
    const phi::GPUContext& dev_ctx,
    const double* data_im,
    const double* data_offset,
    const double* data_mask,
    const std::vector<int64_t>& im_shape,
    const std::vector<int64_t>& col_shape,
    const std::vector<int64_t>& filter_shape,
    const std::vector<int>& paddings,
    const std::vector<int>& strides,
    const std::vector<int>& dilations,
    const int deformable_groups,
    double* data_col);

}  // namespace funcs
}  // namespace phi
paddle/phi/kernels/funcs/deformable_conv_functor.h (new file, mode 100644; diff collapsed)
paddle/phi/kernels/gpu/deformable_conv_grad_kernel.cu (new file, mode 100644; diff collapsed)
paddle/phi/kernels/gpu/deformable_conv_kernel.cu (diff collapsed)
paddle/phi/kernels/impl/deformable_conv_grad_kernel_impl.h (new file, mode 100644; diff collapsed)
paddle/phi/kernels/impl/deformable_conv_kernel_impl.h (diff collapsed)
paddle/phi/ops/compat/deformable_conv_sig.cc
@@ -29,6 +29,34 @@ KernelSignature DeformableConvOpArgumentMapping(
                         {"Output"});
}

KernelSignature DeformableConvGradOpArgumentMapping(
    const ArgumentMappingContext& ctx) {
  return KernelSignature(
      "deformable_conv_grad",
      {"Input", "Offset", "Filter", "Mask", GradVarName("Output")},
      {"strides",
       "paddings",
       "dilations",
       "deformable_groups",
       "groups",
       "im2col_step"},
      {GradVarName("Input"),
       GradVarName("Offset"),
       GradVarName("Filter"),
       GradVarName("Mask")});
}

}  // namespace phi

PD_REGISTER_BASE_KERNEL_NAME(deformable_conv_v1, deformable_conv);
PD_REGISTER_BASE_KERNEL_NAME(deformable_conv_v1_grad, deformable_conv_grad);

PD_REGISTER_ARG_MAPPING_FN(deformable_conv,
                           phi::DeformableConvOpArgumentMapping);
PD_REGISTER_ARG_MAPPING_FN(deformable_conv_grad,
                           phi::DeformableConvGradOpArgumentMapping);
PD_REGISTER_ARG_MAPPING_FN(deformable_conv_v1,
                           phi::DeformableConvOpArgumentMapping);
PD_REGISTER_ARG_MAPPING_FN(deformable_conv_v1_grad,
                           phi::DeformableConvGradOpArgumentMapping);
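For comparison, the body of the forward mapping is collapsed in this diff hunk; a sketch of what it presumably looks like, with the input and attribute names assumed to match the grad mapping above (sketch only, not the confirmed source):

// Sketch: forward argument mapping, routing the fluid op's named inputs,
// attributes, and outputs to the phi "deformable_conv" kernel.
KernelSignature DeformableConvOpArgumentMapping(
    const ArgumentMappingContext& ctx) {
  return KernelSignature("deformable_conv",
                         {"Input", "Offset", "Filter", "Mask"},
                         {"strides",
                          "paddings",
                          "dilations",
                          "deformable_groups",
                          "groups",
                          "im2col_step"},
                         {"Output"});
}

Registering both PD_REGISTER_BASE_KERNEL_NAME aliases above lets the legacy deformable_conv_v1 ops resolve to the same unified phi kernels through these mappings.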
python/paddle/distributed/auto_parallel/dist_loader.py (diff collapsed)
python/paddle/distributed/auto_parallel/dist_saver.py (new file, mode 100644; diff collapsed)
python/paddle/distributed/auto_parallel/engine.py (diff collapsed)
python/paddle/distributed/auto_parallel/utils.py (diff collapsed)
python/paddle/fluid/dygraph/varbase_patch_methods.py (diff collapsed)
python/paddle/fluid/tests/unittests/auto_parallel/engine_api.py (diff collapsed)
python/paddle/fluid/tests/unittests/auto_parallel/engine_predict_api.py (new file, mode 100644; diff collapsed)
python/paddle/fluid/tests/unittests/auto_parallel/test_engine_api.py (diff collapsed)
python/paddle/fluid/tests/unittests/test_egr_python_api.py (diff collapsed)
python/paddle/fluid/tests/unittests/test_inplace_eager_fluid.py (diff collapsed)