PaddlePaddle / Paddle
Commit 31363c3f
Authored on Mar 23, 2022 by phlrain

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_some_yaml_config

Parents: da478d1e, 3980e222
Showing 50 changed files, with 3,144 additions and 3,606 deletions (+3144, -3606).
paddle/fluid/framework/infershape_utils.cc (+158, -197)
paddle/fluid/framework/infershape_utils.h (+59, -1)
paddle/fluid/framework/new_executor/workqueue/workqueue.h (+58, -0)
paddle/fluid/framework/new_executor/workqueue/workqueue_test.cc (+6, -0)
paddle/fluid/operators/deformable_conv_func.h (+0, -149)
paddle/fluid/operators/deformable_conv_op.cc (+9, -159)
paddle/fluid/operators/deformable_conv_op.cu (+0, -643)
paddle/fluid/operators/deformable_conv_op.h (+0, -509)
paddle/fluid/operators/deformable_conv_v1_op.cc (+9, -132)
paddle/fluid/operators/deformable_conv_v1_op.cu (+0, -604)
paddle/fluid/operators/deformable_conv_v1_op.h (+0, -556)
paddle/fluid/operators/flatten_op.cc (+13, -83)
paddle/fluid/operators/flatten_op.cu.cc (+0, -31)
paddle/fluid/operators/flatten_op.h (+0, -41)
paddle/fluid/operators/flatten_op_xpu.cc (+0, -23)
paddle/fluid/pybind/eager_method.cc (+56, -9)
paddle/fluid/pybind/imperative.cc (+1, -0)
paddle/fluid/pybind/pybind.cc (+2, -0)
paddle/phi/api/include/tensor.h (+1, -3)
paddle/phi/api/lib/tensor_method.cc (+6, -6)
paddle/phi/infermeta/multiary.cc (+209, -0)
paddle/phi/infermeta/multiary.h (+13, -0)
paddle/phi/infermeta/unary.cc (+16, -0)
paddle/phi/infermeta/unary.h (+6, -0)
paddle/phi/kernels/CMakeLists.txt (+3, -1)
paddle/phi/kernels/cpu/deformable_conv_grad_kernel.cc (+333, -0)
paddle/phi/kernels/cpu/deformable_conv_kernel.cc (+0, -120)
paddle/phi/kernels/deformable_conv_grad_kernel.h (+39, -0)
paddle/phi/kernels/deformable_conv_kernel.h (+2, -1)
paddle/phi/kernels/flatten_grad_kernel.cc (+1, -0)
paddle/phi/kernels/flatten_kernel.cc (+1, -1)
paddle/phi/kernels/funcs/CMakeLists.txt (+1, -0)
paddle/phi/kernels/funcs/deformable_conv_functor.cc (+172, -0)
paddle/phi/kernels/funcs/deformable_conv_functor.cu (+185, -0)
paddle/phi/kernels/funcs/deformable_conv_functor.h (+74, -0)
paddle/phi/kernels/gpu/deformable_conv_grad_kernel.cu (+366, -0)
paddle/phi/kernels/gpu/deformable_conv_kernel.cu (+0, -134)
paddle/phi/kernels/impl/deformable_conv_grad_kernel_impl.h (+364, -0)
paddle/phi/kernels/impl/deformable_conv_kernel_impl.h (+22, -68)
paddle/phi/ops/compat/deformable_conv_sig.cc (+28, -0)
python/paddle/distributed/auto_parallel/dist_loader.py (+26, -6)
python/paddle/distributed/auto_parallel/dist_saver.py (+241, -0)
python/paddle/distributed/auto_parallel/engine.py (+236, -117)
python/paddle/distributed/auto_parallel/utils.py (+8, -0)
python/paddle/fluid/dygraph/varbase_patch_methods.py (+32, -0)
python/paddle/fluid/tests/unittests/auto_parallel/engine_api.py (+10, -5)
python/paddle/fluid/tests/unittests/auto_parallel/engine_predict_api.py (+122, -0)
python/paddle/fluid/tests/unittests/auto_parallel/test_engine_api.py (+28, -0)
python/paddle/fluid/tests/unittests/test_egr_python_api.py (+54, -7)
python/paddle/fluid/tests/unittests/test_inplace_eager_fluid.py (+174, -0)
paddle/fluid/framework/infershape_utils.cc

@@ -27,7 +27,6 @@ limitations under the License. */
 #include "paddle/phi/core/compat/op_utils.h"
 #include "paddle/phi/core/dense_tensor.h"
 #include "paddle/phi/core/infermeta_utils.h"
-#include "paddle/phi/core/meta_tensor.h"
 #include "paddle/phi/core/tensor_utils.h"

 namespace paddle {

@@ -101,235 +100,197 @@ class InferShapeArgumentMappingContext : public phi::ArgumentMappingContext {
   const InferShapeContext& ctx_;
 };

The CompatMetaTensor class definition (constructors, copy/move rules, private helpers, and the `// TODO(chenweihang): Support TensorArray later` note) moves from this file to infershape_utils.h; its member functions stay here as out-of-line definitions with unchanged bodies:

int64_t CompatMetaTensor::numel() const {
  if (is_runtime_) {
    auto* var = BOOST_GET_CONST(Variable*, var_);
    return var->Get<Tensor>().numel();
  } else {
    auto* var = BOOST_GET_CONST(VarDesc*, var_);
    return var->ElementSize();
  }
}

DDim CompatMetaTensor::dims() const {
  if (is_runtime_) {
    auto* var = BOOST_GET_CONST(Variable*, var_);
    if (var->IsType<phi::DenseTensor>()) {
      return var->Get<phi::DenseTensor>().dims();
    } else if (var->IsType<phi::SelectedRows>()) {
      return var->Get<phi::SelectedRows>().dims();
    } else if (var->IsType<framework::LoDTensorArray>()) {
      // use tensor array size as dims
      auto& tensor_array = var->Get<framework::LoDTensorArray>();
      return phi::make_ddim({static_cast<int64_t>(tensor_array.size())});
    } else {
      PADDLE_THROW(platform::errors::Unimplemented(
          "Currently, only can get dims from DenseTensor or SelectedRows or "
          "DenseTensorArray."));
    }
  } else {
    auto* var = BOOST_GET_CONST(VarDesc*, var_);
    return var->GetShape().empty() ? phi::make_ddim({0UL})
                                   : phi::make_ddim(var->GetShape());
  }
}

phi::DataType CompatMetaTensor::dtype() const {
  if (is_runtime_) {
    auto* var = BOOST_GET_CONST(Variable*, var_);
    if (var->IsType<phi::DenseTensor>()) {
      return var->Get<phi::DenseTensor>().dtype();
    } else if (var->IsType<phi::SelectedRows>()) {
      return var->Get<phi::SelectedRows>().dtype();
    } else if (var->IsType<framework::LoDTensorArray>()) {
      // NOTE(chenweihang): do nothing
      // Unsupported get dtype from LoDTensorArray now
      return phi::DataType::UNDEFINED;
    } else {
      PADDLE_THROW(platform::errors::Unimplemented(
          "Currently, only can get dtype from DenseTensor or SelectedRows."));
    }
  } else {
    auto* var = BOOST_GET_CONST(VarDesc*, var_);
    return paddle::framework::TransToPhiDataType(var->GetDataType());
  }
}

DataLayout CompatMetaTensor::layout() const {
  if (is_runtime_) {
    auto* var = BOOST_GET_CONST(Variable*, var_);
    if (var->IsType<phi::DenseTensor>()) {
      return var->Get<phi::DenseTensor>().layout();
    } else if (var->IsType<phi::SelectedRows>()) {
      return var->Get<phi::SelectedRows>().layout();
    } else if (var->IsType<framework::LoDTensorArray>()) {
      // NOTE(chenweihang): do nothing
      // Unsupported get layout from LoDTensorArray now
      return phi::DataLayout::UNDEFINED;
    } else {
      PADDLE_THROW(platform::errors::Unimplemented(
          "Currently, only can get layout from DenseTensor or "
          "SelectedRows."));
    }
  } else {
    // NOTE(chenweihang): do nothing
    // Unsupported get layout for VarDesc now
    return DataLayout::UNDEFINED;
  }
}

void CompatMetaTensor::set_dims(const DDim& dims) {
  if (is_runtime_) {
    auto* var = BOOST_GET(Variable*, var_);
    if (var->IsType<phi::DenseTensor>()) {
      auto* tensor = var->GetMutable<phi::DenseTensor>();
      phi::DenseTensorUtils::GetMutableMeta(tensor)->dims = dims;
    } else if (var->IsType<phi::SelectedRows>()) {
      auto* tensor = var->GetMutable<phi::SelectedRows>()->mutable_value();
      phi::DenseTensorUtils::GetMutableMeta(tensor)->dims = dims;
    } else if (var->IsType<framework::LoDTensorArray>()) {
      auto* tensor_array = var->GetMutable<framework::LoDTensorArray>();
      // Note: Here I want enforce `tensor_array->size() == 0UL`, because
      // inplace using on LoDTensorArray is dangerous, but the unittest
      // `test_list` contains this behavior
      PADDLE_ENFORCE_EQ(dims.size(), 1UL,
                        platform::errors::InvalidArgument(
                            "LoDTensorArray can only have one dimension."));
      // only set the array size for LoDTensorArray input
      tensor_array->resize(dims[0]);
    } else {
      PADDLE_THROW(platform::errors::Unimplemented(
          "Currently, only can set dims from DenseTensor or SelectedRows."));
    }
  } else {
    auto* var = BOOST_GET(VarDesc*, var_);
    var->SetShape(vectorize(dims));
  }
}

void CompatMetaTensor::set_dtype(phi::DataType dtype) {
  if (is_runtime_) {
    auto* var = BOOST_GET(Variable*, var_);
    if (var->IsType<phi::DenseTensor>()) {
      auto* tensor = var->GetMutable<phi::DenseTensor>();
      phi::DenseTensorUtils::GetMutableMeta(tensor)->dtype = dtype;
    } else if (var->IsType<phi::SelectedRows>()) {
      auto* tensor = var->GetMutable<phi::SelectedRows>()->mutable_value();
      phi::DenseTensorUtils::GetMutableMeta(tensor)->dtype = dtype;
    } else if (var->IsType<framework::LoDTensorArray>()) {
      // NOTE(chenweihang): do nothing
      // Unsupported set dtype for LoDTensorArray now
    } else {
      PADDLE_THROW(platform::errors::Unimplemented(
          "Currently, only can set dtype from DenseTensor or SelectedRows."));
    }
  } else {
    auto* var = BOOST_GET(VarDesc*, var_);
    var->SetDataType(paddle::framework::TransToProtoVarType(dtype));
  }
}

void CompatMetaTensor::set_layout(DataLayout layout) {
  if (is_runtime_) {
    auto* var = BOOST_GET(Variable*, var_);
    if (var->IsType<phi::DenseTensor>()) {
      auto* tensor = var->GetMutable<phi::DenseTensor>();
      phi::DenseTensorUtils::GetMutableMeta(tensor)->layout = layout;
    } else if (var->IsType<phi::SelectedRows>()) {
      auto* tensor = var->GetMutable<phi::SelectedRows>()->mutable_value();
      phi::DenseTensorUtils::GetMutableMeta(tensor)->layout = layout;
    } else if (var->IsType<framework::LoDTensorArray>()) {
      // NOTE(chenweihang): do nothing
      // Unsupported set dtype for LoDTensorArray now
    } else {
      PADDLE_THROW(platform::errors::Unimplemented(
          "Currently, only can set layout from DenseTensor or "
          "SelectedRows."));
    }
  } else {
    // NOTE(chenweihang): do nothing
    // Unsupported set layout for VarDesc now
  }
}

void CompatMetaTensor::share_lod(const MetaTensor& meta_tensor) {
  if (is_runtime_) {
    auto* var = BOOST_GET(Variable*, var_);
    if (var->IsType<phi::DenseTensor>()) {
      auto* tensor = var->GetMutable<phi::DenseTensor>();
      phi::DenseTensorUtils::GetMutableMeta(tensor)->lod =
          static_cast<const CompatMetaTensor&>(meta_tensor).GetRuntimeLoD();
    } else {
      // NOTE(chenweihang): do nothing
      // only LoDTensor need to share lod
    }
  } else {
    auto* var = BOOST_GET(VarDesc*, var_);
    var->SetLoDLevel(
        static_cast<const CompatMetaTensor&>(meta_tensor).GetCompileTimeLoD());
  }
}

void CompatMetaTensor::share_dims(const MetaTensor& meta_tensor) {
  set_dims(meta_tensor.dims());
  if (is_runtime_) {
    auto* var = BOOST_GET(Variable*, var_);
    if (var->IsType<phi::SelectedRows>()) {
      auto* selected_rows = var->GetMutable<phi::SelectedRows>();
      auto& input_selected_rows =
          static_cast<const CompatMetaTensor&>(meta_tensor).GetSelectedRows();
      selected_rows->set_rows(input_selected_rows.rows());
      selected_rows->set_height(input_selected_rows.height());
    }
  }
}

void CompatMetaTensor::share_meta(const MetaTensor& meta_tensor) {
  share_dims(meta_tensor);
  set_dtype(meta_tensor.dtype());
  set_layout(meta_tensor.layout());
  // special case: share lod of LoDTensor
  share_lod(meta_tensor);
}

 phi::InferMetaContext BuildInferMetaContext(InferShapeContext* ctx,
                                             const std::string& op_type) {
paddle/fluid/framework/infershape_utils.h

@@ -18,7 +18,7 @@ limitations under the License. */
 #include "paddle/fluid/framework/op_info.h"
 #include "paddle/fluid/framework/shape_inference.h"
+#include "paddle/phi/core/meta_tensor.h"

 namespace phi {
 class InferMetaContext;
 }  // namespace phi

@@ -39,5 +39,63 @@ phi::InferMetaContext BuildInferMetaContext(InferShapeContext* ctx,
   }                                                                    \
   }

+// TODO(chenweihang): Support TensorArray later
+class CompatMetaTensor : public phi::MetaTensor {
+ public:
+  CompatMetaTensor(InferShapeVarPtr var, bool is_runtime)
+      : var_(std::move(var)), is_runtime_(is_runtime) {}
+
+  CompatMetaTensor() = default;
+  CompatMetaTensor(const CompatMetaTensor&) = default;
+  CompatMetaTensor(CompatMetaTensor&&) = default;
+  CompatMetaTensor& operator=(const CompatMetaTensor&) = delete;
+  CompatMetaTensor& operator=(CompatMetaTensor&&) = delete;
+
+  int64_t numel() const override;
+
+  DDim dims() const override;
+
+  phi::DataType dtype() const override;
+
+  DataLayout layout() const override;
+
+  void set_dims(const DDim& dims) override;
+
+  void set_dtype(phi::DataType dtype) override;
+
+  void set_layout(DataLayout layout) override;
+
+  void share_lod(const MetaTensor& meta_tensor) override;
+
+  void share_dims(const MetaTensor& meta_tensor) override;
+
+  void share_meta(const MetaTensor& meta_tensor) override;
+
+ private:
+  const LoD& GetRuntimeLoD() const {
+    auto* var = BOOST_GET_CONST(Variable*, var_);
+    return var->Get<LoDTensor>().lod();
+  }
+
+  int32_t GetCompileTimeLoD() const {
+    auto* var = BOOST_GET_CONST(VarDesc*, var_);
+    return var->GetLoDLevel();
+  }
+
+  const phi::SelectedRows& GetSelectedRows() const {
+    PADDLE_ENFORCE_EQ(is_runtime_, true,
+                      platform::errors::Unavailable(
+                          "Only can get Tensor from MetaTensor in rumtime."));
+    auto* var = BOOST_GET_CONST(Variable*, var_);
+    PADDLE_ENFORCE_EQ(var->IsType<phi::SelectedRows>(), true,
+                      platform::errors::Unavailable(
+                          "The Tensor in MetaTensor is not SelectedRows."));
+    return var->Get<phi::SelectedRows>();
+  }
+
+  InferShapeVarPtr var_;
+  bool is_runtime_;
+};
+
 }  // namespace framework
 }  // namespace paddle
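The core idea of CompatMetaTensor is one interface over two representations: at compile time var_ holds a VarDesc* (only metadata is known), at runtime a Variable* (a live tensor exists), and every accessor dispatches on is_runtime_. A minimal standalone sketch of that adapter pattern, not Paddle code, with std::variant standing in for InferShapeVarPtr and two hypothetical stand-in types for VarDesc and Variable:

#include <cstdint>
#include <iostream>
#include <variant>
#include <vector>

struct CompileTimeVar {          // like VarDesc: only declared metadata
  std::vector<int64_t> shape;
};

struct RuntimeVar {              // like Variable: an allocated tensor's dims
  std::vector<int64_t> dims;
};

class MetaAdapter {
 public:
  MetaAdapter(std::variant<CompileTimeVar*, RuntimeVar*> var, bool is_runtime)
      : var_(var), is_runtime_(is_runtime) {}

  // One interface, two representations: dispatch on is_runtime_,
  // just as CompatMetaTensor::dims() does above.
  std::vector<int64_t> dims() const {
    if (is_runtime_) {
      return std::get<RuntimeVar*>(var_)->dims;
    }
    return std::get<CompileTimeVar*>(var_)->shape;
  }

 private:
  std::variant<CompileTimeVar*, RuntimeVar*> var_;
  bool is_runtime_;
};

int main() {
  CompileTimeVar desc{{-1, 3, 224, 224}};  // batch dim unknown before run
  MetaAdapter meta(&desc, /*is_runtime=*/false);
  for (int64_t d : meta.dims()) std::cout << d << ' ';
  std::cout << '\n';
}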
paddle/fluid/framework/new_executor/workqueue/workqueue.h

@@ -15,9 +15,12 @@
 #pragma once

 #include <functional>
+#include <future>
 #include <memory>
 #include <string>
+#include <type_traits>
 #include <vector>
+#include "paddle/fluid/platform/enforce.h"

 namespace paddle {
 namespace framework {

@@ -25,6 +28,29 @@ namespace framework {
 constexpr const char* kQueueEmptyEvent = "QueueEmpty";
 constexpr const char* kQueueDestructEvent = "QueueDestruct";

+// For std::function
+// https://stackoverflow.com/questions/25421346/how-to-create-an-stdfunction-from-a-move-capturing-lambda-expression
+template <typename OnlyMovable>
+class FakeCopyable {
+ public:
+  explicit FakeCopyable(OnlyMovable&& obj) : obj_(std::move(obj)) {
+    static_assert(std::is_copy_constructible<OnlyMovable>::value == false,
+                  "Need not to use FakeCopyable");
+  }
+
+  FakeCopyable(FakeCopyable&& other) : obj_(std::move(other.obj_)) {}
+
+  FakeCopyable(const FakeCopyable& other) {
+    PADDLE_THROW(platform::errors::Unavailable(
+        "Never use the copy constructor of FakeCopyable."));
+  }
+
+  OnlyMovable& Get() { return obj_; }
+
+ private:
+  OnlyMovable obj_;
+};
+
 class EventsWaiter;

 struct WorkQueueOptions {

@@ -78,6 +104,22 @@ class WorkQueue {
   virtual void AddTask(std::function<void()> fn) = 0;

+  // Higher cost than AddTask
+  template <typename F, typename... Args>
+  std::future<typename std::result_of<F(Args...)>::type> AddAwaitableTask(
+      F&& f, Args&&... args) {
+    using ReturnType = typename std::result_of<F(Args...)>::type;
+    std::function<ReturnType()> task =
+        std::bind(std::forward<F>(f), std::forward<Args>(args)...);
+    std::promise<ReturnType> prom;
+    std::future<ReturnType> res = prom.get_future();
+    AddTask([t = std::move(task),
+             p = FakeCopyable<std::promise<ReturnType>>(
+                 std::move(prom))]() mutable { p.Get().set_value(t()); });
+    return res;
+  }
+
   // See WorkQueueOptions.track_task for details
   // virtual void WaitQueueEmpty() = 0;

@@ -102,6 +144,22 @@ class WorkQueueGroup {
   virtual void AddTask(size_t queue_idx, std::function<void()> fn) = 0;

+  // Higher cost than AddTask
+  template <typename F, typename... Args>
+  std::future<typename std::result_of<F(Args...)>::type> AddAwaitableTask(
+      size_t queue_idx, F&& f, Args&&... args) {
+    using ReturnType = typename std::result_of<F(Args...)>::type;
+    std::function<ReturnType()> task =
+        std::bind(std::forward<F>(f), std::forward<Args>(args)...);
+    std::promise<ReturnType> prom;
+    std::future<ReturnType> res = prom.get_future();
+    AddTask(queue_idx,
+            [t = std::move(task),
+             p = FakeCopyable<std::promise<ReturnType>>(
+                 std::move(prom))]() mutable { p.Get().set_value(t()); });
+    return res;
+  }
+
   // See WorkQueueOptions.track_task for details
   // virtual void WaitQueueGroupEmpty() = 0;
paddle/fluid/framework/new_executor/workqueue/workqueue_test.cc

@@ -60,11 +60,13 @@ TEST(WorkQueue, TestSingleThreadedWorkQueue) {
     }
     finished = true;
   });
+  auto handle = work_queue->AddAwaitableTask([]() { return 1234; });
   // WaitQueueEmpty
   EXPECT_EQ(finished.load(), false);
   events_waiter.WaitEvent();
   EXPECT_EQ(finished.load(), true);
   EXPECT_EQ(counter.load(), kLoopNum);
+  EXPECT_EQ(handle.get(), 1234);
 }

 TEST(WorkQueue, TestMultiThreadedWorkQueue) {

@@ -146,6 +148,9 @@ TEST(WorkQueue, TestWorkQueueGroup) {
       ++counter;
     }
   });
+  int random_num = 123456;
+  auto handle =
+      queue_group->AddAwaitableTask(1, [random_num]() { return random_num; });
   // WaitQueueGroupEmpty
   events_waiter.WaitEvent();
   EXPECT_EQ(counter.load(), kLoopNum * kExternalLoopNum + kLoopNum);

@@ -154,4 +159,5 @@ TEST(WorkQueue, TestWorkQueueGroup) {
   events_waiter.WaitEvent();
   queue_group.reset();
   EXPECT_EQ(events_waiter.WaitEvent(), paddle::framework::kQueueDestructEvent);
+  EXPECT_EQ(handle.get(), random_num);
 }
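The added assertions exercise AddAwaitableTask end to end: the queued lambda runs on a worker thread and its return value travels back through the future. A minimal hedged usage sketch against the API reconstructed in workqueue.h above (Add is a hypothetical callable; work_queue is constructed as in these tests):

  int Add(int a, int b) { return a + b; }

  std::future<int> f = work_queue->AddAwaitableTask(Add, 2, 3);
  EXPECT_EQ(f.get(), 5);  // blocks until the queued task has run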
paddle/fluid/operators/deformable_conv_func.h
Deleted (100644 → 0). Removed content:

// Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
//
// Part of the following code in this file refs to
// https://github.com/msracver/Deformable-ConvNets/blob/master/faster_rcnn/operator_cxx/deformable_convolution.cu
//
// Copyright (c) 2017 Microsoft
// Licensed under The Apache-2.0 License [see LICENSE for details]
// \file deformable_psroi_pooling.cu
// \brief
// \author Yi Li, Guodong Zhang, Jifeng Dai

#pragma once
#include "paddle/phi/core/hostdevice.h"
#include "paddle/phi/kernels/funcs/blas/blas.h"
#include "paddle/phi/kernels/funcs/math_function.h"

template <typename T>
HOSTDEVICE T DmcnGetGradientWeight(T argmax_h, T argmax_w, const int h,
                                   const int w, const int height,
                                   const int width) {
  if (argmax_h <= -1 || argmax_h >= height || argmax_w <= -1 ||
      argmax_w >= width) {
    return 0;
  }

  int argmax_h_low = floor(argmax_h);
  int argmax_w_low = floor(argmax_w);
  int argmax_h_high = argmax_h_low + 1;
  int argmax_w_high = argmax_w_low + 1;

  T weight = 0;

  weight = (h == argmax_h_low && w == argmax_w_low)
               ? (h + 1 - argmax_h) * (w + 1 - argmax_w)
               : weight;
  weight = (h == argmax_h_low && w == argmax_w_high)
               ? (h + 1 - argmax_h) * (argmax_w + 1 - w)
               : weight;
  weight = (h == argmax_h_high && w == argmax_w_low)
               ? (argmax_h + 1 - h) * (w + 1 - argmax_w)
               : weight;
  weight = (h == argmax_h_high && w == argmax_w_high)
               ? (argmax_h + 1 - h) * (argmax_w + 1 - w)
               : weight;

  return weight;
}

template <typename T>
HOSTDEVICE T DmcnGetCoordinateWeight(T argmax_h, T argmax_w, const int height,
                                     const int width, const T* im_data,
                                     const int data_width, const int bp_dir) {
  if (argmax_h <= -1 || argmax_h >= height || argmax_w <= -1 ||
      argmax_w >= width) {
    return 0;
  }

  int argmax_h_low = floor(argmax_h);
  int argmax_w_low = floor(argmax_w);
  int argmax_h_high = argmax_h_low + 1;
  int argmax_w_high = argmax_w_low + 1;

  T weight = 0;

  if (bp_dir == 0) {
    weight += (argmax_h_low >= 0 && argmax_w_low >= 0)
                  ? -1 * (argmax_w_low + 1 - argmax_w) *
                        im_data[argmax_h_low * data_width + argmax_w_low]
                  : 0;
    weight += (argmax_h_low >= 0 && argmax_w_high <= width - 1)
                  ? -1 * (argmax_w - argmax_w_low) *
                        im_data[argmax_h_low * data_width + argmax_w_high]
                  : 0;
    weight += (argmax_h_high <= height - 1 && argmax_w_low >= 0)
                  ? (argmax_w_low + 1 - argmax_w) *
                        im_data[argmax_h_high * data_width + argmax_w_low]
                  : 0;
    weight += (argmax_h_high <= height - 1 && argmax_w_high <= width - 1)
                  ? (argmax_w - argmax_w_low) *
                        im_data[argmax_h_high * data_width + argmax_w_high]
                  : 0;
  } else if (bp_dir == 1) {
    weight += (argmax_h_low >= 0 && argmax_w_low >= 0)
                  ? -1 * (argmax_h_low + 1 - argmax_h) *
                        im_data[argmax_h_low * data_width + argmax_w_low]
                  : 0;
    weight += (argmax_h_low >= 0 && argmax_w_high <= width - 1)
                  ? (argmax_h_low + 1 - argmax_h) *
                        im_data[argmax_h_low * data_width + argmax_w_high]
                  : 0;
    weight += (argmax_h_high <= height - 1 && argmax_w_low >= 0)
                  ? -1 * (argmax_h - argmax_h_low) *
                        im_data[argmax_h_high * data_width + argmax_w_low]
                  : 0;
    weight += (argmax_h_high <= height - 1 && argmax_w_high <= width - 1)
                  ? (argmax_h - argmax_h_low) *
                        im_data[argmax_h_high * data_width + argmax_w_high]
                  : 0;
  }

  return weight;
}

template <typename T>
HOSTDEVICE T DmcnIm2colBilinear(const T* bottom_data, const int data_width,
                                const int height, const int width, T h, T w) {
  int h_low = floor(h);
  int w_low = floor(w);
  int h_high = h_low + 1;
  int w_high = w_low + 1;

  T lh = h - h_low;
  T lw = w - w_low;
  T hh = 1 - lh;
  T hw = 1 - lw;

  T v1 = (h_low >= 0 && w_low >= 0) ? bottom_data[h_low * data_width + w_low]
                                    : 0;
  T v2 = (h_low >= 0 && w_high <= width - 1)
             ? bottom_data[h_low * data_width + w_high]
             : 0;
  T v3 = (h_high <= height - 1 && w_low >= 0)
             ? bottom_data[h_high * data_width + w_low]
             : 0;
  T v4 = (h_high <= height - 1 && w_high <= width - 1)
             ? bottom_data[h_high * data_width + w_high]
             : 0;

  T w1 = hh * hw;
  T w2 = hh * lw;
  T w3 = lh * hw;
  T w4 = lh * lw;

  return w1 * v1 + w2 * v2 + w3 * v3 + w4 * v4;
}
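DmcnIm2colBilinear implements ordinary bilinear interpolation: a fractional sample point (h, w) is read as the area-weighted average of its four integer-grid neighbours. A worked, self-contained sketch of the same formula (boundary checks omitted for brevity; this is an illustration, not the deleted helper):

#include <cmath>
#include <iostream>

double Bilinear(const double* data, int width, double h, double w) {
  int h_low = static_cast<int>(std::floor(h));
  int w_low = static_cast<int>(std::floor(w));
  double lh = h - h_low, lw = w - w_low;  // distances to the low corner
  double hh = 1 - lh, hw = 1 - lw;
  double v1 = data[h_low * width + w_low];
  double v2 = data[h_low * width + w_low + 1];
  double v3 = data[(h_low + 1) * width + w_low];
  double v4 = data[(h_low + 1) * width + w_low + 1];
  return hh * hw * v1 + hh * lw * v2 + lh * hw * v3 + lh * lw * v4;
}

int main() {
  const double patch[4] = {1, 2, 3, 4};  // a 2x2 image
  // Sampling the exact centre averages all four pixels: 2.5.
  std::cout << Bilinear(patch, 2, 0.5, 0.5) << '\n';
}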
paddle/fluid/operators/deformable_conv_op.cc

@@ -12,9 +12,11 @@
 // See the License for the specific language governing permissions and
 // limitations under the License.

-#include "paddle/fluid/operators/deformable_conv_op.h"
 #include <memory>
-#include "paddle/fluid/operators/conv_op.h"
+#include "paddle/fluid/framework/infershape_utils.h"
+#include "paddle/fluid/framework/op_registry.h"
+#include "paddle/phi/core/infermeta_utils.h"
+#include "paddle/phi/infermeta/multiary.h"

 namespace paddle {
 namespace operators {

@@ -108,158 +110,6 @@ $$
 class DeformableConvOp : public framework::OperatorWithKernel {
  public:
  using framework::OperatorWithKernel::OperatorWithKernel;

Removed: the hand-written InferShape of DeformableConvOp (shape inference now comes from phi::DeformableConvInferMeta via the functor registered below):

  void InferShape(framework::InferShapeContext *ctx) const override {
    OP_INOUT_CHECK(ctx->HasInput("Input"), "Input", "Input",
                   "deformable_conv");
    OP_INOUT_CHECK(ctx->HasInput("Offset"), "Input", "Offset",
                   "deformable_conv)");
    OP_INOUT_CHECK(ctx->HasInput("Mask"), "Input", "Mask", "deformable_conv");
    OP_INOUT_CHECK(ctx->HasInput("Filter"), "Input", "Filter",
                   "deformable_conv");
    OP_INOUT_CHECK(ctx->HasOutput("Output"), "Output", "Output",
                   "deformable_conv");

    auto in_dims = ctx->GetInputDim("Input");
    auto filter_dims = ctx->GetInputDim("Filter");
    auto offset_dims = ctx->GetInputDim("Offset");
    auto mask_dims = ctx->GetInputDim("Mask");

    std::vector<int> strides = ctx->Attrs().Get<std::vector<int>>("strides");
    std::vector<int> paddings = ctx->Attrs().Get<std::vector<int>>("paddings");
    std::vector<int> dilations =
        ctx->Attrs().Get<std::vector<int>>("dilations");
    int groups = ctx->Attrs().Get<int>("groups");
    int deformable_groups = ctx->Attrs().Get<int>("deformable_groups");
    int im2col_step = ctx->Attrs().Get<int>("im2col_step");

    PADDLE_ENFORCE_EQ(
        in_dims.size(), 4,
        platform::errors::InvalidArgument(
            "Conv input should be 4-D tensor, get %u", in_dims.size()));
    PADDLE_ENFORCE_EQ(
        in_dims.size(), filter_dims.size(),
        platform::errors::InvalidArgument(
            "Conv input dimension and filter dimension should be "
            "the same. The difference is [%d]: [%d]",
            in_dims.size(), filter_dims.size()));
    PADDLE_ENFORCE_EQ(
        in_dims.size() - strides.size(), 2U,
        platform::errors::InvalidArgument(
            "Conv input dimension and strides "
            "dimension should be consistent. But received input "
            "dimension:[%d], strides dimension:[%d]",
            in_dims.size(), strides.size()));
    PADDLE_ENFORCE_EQ(
        paddings.size(), strides.size(),
        platform::errors::InvalidArgument(
            "Conv paddings dimension and Conv strides dimension "
            "should be the same. The difference is [%d]: [%d]",
            paddings.size(), strides.size()));

    PADDLE_ENFORCE_EQ(
        in_dims[1], filter_dims[1] * groups,
        platform::errors::InvalidArgument(
            "The number of input channels should be equal to filter "
            "channels * groups. The difference is [%d]: [%d]",
            in_dims[1], filter_dims[1] * groups));
    PADDLE_ENFORCE_EQ(
        filter_dims[0] % groups, 0,
        platform::errors::InvalidArgument(
            "The number of output channels should be divided by groups. But "
            "received output channels:[%d], groups:[%d]",
            filter_dims[0], groups));
    PADDLE_ENFORCE_EQ(
        filter_dims[0] % deformable_groups, 0,
        platform::errors::InvalidArgument(
            "The number of output channels should be "
            "divided by deformable groups. The difference is [%d]: [%d]",
            filter_dims[0] % groups, 0));

    if (in_dims[0] > im2col_step) {
      PADDLE_ENFORCE_EQ(
          in_dims[0] % im2col_step, 0U,
          platform::errors::InvalidArgument(
              "Input batchsize must be smaller than or divide im2col_step. But "
              "received Input batchsize:[%d], im2col_step:[%d]",
              in_dims[0], im2col_step));
    }

    for (size_t i = 0; i < strides.size(); ++i) {
      PADDLE_ENFORCE_GT(strides[i], 0U,
                        platform::errors::InvalidArgument(
                            "stride %d size incorrect", i));
    }
    for (size_t i = 0; i < dilations.size(); ++i) {
      PADDLE_ENFORCE_GT(dilations[i], 0U,
                        platform::errors::InvalidArgument(
                            "dilation %d size incorrect", i));
    }

    std::vector<int64_t> output_shape({in_dims[0], filter_dims[0]});
    for (size_t i = 0; i < strides.size(); ++i) {
      if ((!ctx->IsRuntime()) &&
          (in_dims[i + 2] <= 0 || filter_dims[i + 2] <= 0)) {
        output_shape.push_back(-1);
      } else {
        output_shape.push_back(ConvOutputSize(in_dims[i + 2],
                                              filter_dims[i + 2], dilations[i],
                                              paddings[i], strides[i]));
      }
    }

    PADDLE_ENFORCE_EQ(
        output_shape[1] % deformable_groups, 0U,
        platform::errors::InvalidArgument(
            "output num_filter must divide deformable group size. But received "
            "output num_filter:[%d], deformable group size:[%d]",
            output_shape[1], deformable_groups));

    if (ctx->IsRuntime()) {
      PADDLE_ENFORCE_EQ(output_shape[2], offset_dims[2],
                        platform::errors::InvalidArgument(
                            "output height must equal to offset map height. "
                            "The difference is [%d]: [%d]",
                            output_shape[2], offset_dims[2]));
      PADDLE_ENFORCE_EQ(output_shape[3], offset_dims[3],
                        platform::errors::InvalidArgument(
                            "output width must equal to offset map width. The "
                            "difference is [%d]: [%d]",
                            output_shape[3], offset_dims[3]));
      PADDLE_ENFORCE_EQ(offset_dims[1] % (filter_dims[2] * filter_dims[3]), 0U,
                        platform::errors::InvalidArgument(
                            "offset filter must divide deformable group size. "
                            "But received [%d]: [%d]",
                            offset_dims[1], filter_dims[2] * filter_dims[3]));
      PADDLE_ENFORCE_EQ(
          offset_dims[1] / (2 * filter_dims[2] * filter_dims[3]),
          deformable_groups,
          platform::errors::InvalidArgument(
              "offset filter must divide deformable group size. But received "
              "[%d]: [%d]",
              offset_dims[1] / (2 * filter_dims[2] * filter_dims[3]),
              deformable_groups));
      PADDLE_ENFORCE_EQ(output_shape[2], mask_dims[2],
                        platform::errors::InvalidArgument(
                            "output height must equal to mask map height. The "
                            "difference is [%d] vs [%d]",
                            output_shape[2], mask_dims[2]));
      PADDLE_ENFORCE_EQ(output_shape[3], mask_dims[3],
                        platform::errors::InvalidArgument(
                            "output width must equal to mask map width. The "
                            "difference is [%d] vs [%d]",
                            output_shape[3], mask_dims[3]));
      PADDLE_ENFORCE_EQ(mask_dims[1] % (filter_dims[2] * filter_dims[3]), 0U,
                        platform::errors::InvalidArgument(
                            "mask filter must divide deformable group size. "
                            "But received [%d]: [%d]",
                            mask_dims[1], filter_dims[2] * filter_dims[3]));
      PADDLE_ENFORCE_EQ(mask_dims[1] / (filter_dims[2] * filter_dims[3]),
                        deformable_groups,
                        platform::errors::InvalidArgument(
                            "mask filter must divide deformable group size. "
                            "But received [%d]: [%d]",
                            mask_dims[1] / (filter_dims[2] * filter_dims[3]),
                            deformable_groups));
    }

    ctx->SetOutputDim("Output", phi::make_ddim(output_shape));
  }

  protected:
   framework::OpKernelType GetExpectedKernelType(

@@ -331,13 +181,13 @@ class DeformableConvGradOp : public framework::OperatorWithKernel {
 }  // namespace paddle

 namespace ops = paddle::operators;

+DECLARE_INFER_SHAPE_FUNCTOR(deformable_conv, DeformableConvInferShapeFunctor,
+                            PD_INFER_META(phi::DeformableConvInferMeta));
+
 REGISTER_OPERATOR(deformable_conv, ops::DeformableConvOp,
                   ops::DeformableConvOpMaker,
                   ops::DeformableConvGradOpMaker<paddle::framework::OpDesc>,
-                  ops::DeformableConvGradOpMaker<paddle::imperative::OpBase>);
+                  ops::DeformableConvGradOpMaker<paddle::imperative::OpBase>,
+                  DeformableConvInferShapeFunctor);

 REGISTER_OPERATOR(deformable_conv_grad, ops::DeformableConvGradOp);
-
-REGISTER_OP_CPU_KERNEL(deformable_conv_grad,
-                       ops::DeformableConvGradCPUKernel<float>,
-                       ops::DeformableConvGradCPUKernel<double>);
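The removed InferShape called ConvOutputSize once per spatial dimension to derive the output height and width. A sketch of the standard convolution output-size arithmetic that helper applies (assuming the usual definition; the real helper lives in paddle/fluid/operators/conv_op.h):

#include <iostream>

int ConvOutputSizeDemo(int input, int filter, int dilation, int padding,
                       int stride) {
  int dkernel = dilation * (filter - 1) + 1;  // effective kernel extent
  return (input + 2 * padding - dkernel) / stride + 1;
}

int main() {
  // 64x64 input, 3x3 filter, dilation 1, padding 1, stride 1 -> 64x64 output.
  std::cout << ConvOutputSizeDemo(64, 3, 1, 1, 1) << '\n';
}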
paddle/fluid/operators/deformable_conv_op.cu
Deleted (100644 → 0). Diff collapsed, not shown.

paddle/fluid/operators/deformable_conv_op.h
Deleted (100644 → 0). Diff collapsed, not shown.
paddle/fluid/operators/deformable_conv_v1_op.cc

@@ -12,9 +12,11 @@
 // See the License for the specific language governing permissions and
 // limitations under the License.

-#include "paddle/fluid/operators/deformable_conv_v1_op.h"
 #include <memory>
-#include "paddle/fluid/operators/conv_op.h"
+#include "paddle/fluid/framework/infershape_utils.h"
+#include "paddle/fluid/framework/op_registry.h"
+#include "paddle/phi/core/infermeta_utils.h"
+#include "paddle/phi/infermeta/multiary.h"

 namespace paddle {
 namespace operators {

@@ -113,128 +115,6 @@ $$
 class DeformableConvV1Op : public framework::OperatorWithKernel {
  public:
  using framework::OperatorWithKernel::OperatorWithKernel;

Removed: the hand-written InferShape of DeformableConvV1Op (the v1 op has no Mask input; shape inference now also comes from phi::DeformableConvInferMeta):

  void InferShape(framework::InferShapeContext *ctx) const override {
    OP_INOUT_CHECK(ctx->HasInput("Input"), "Input", "Input",
                   "deformable_conv_v1");
    OP_INOUT_CHECK(ctx->HasInput("Offset"), "Input", "Offset",
                   "deformable_conv_v1");
    OP_INOUT_CHECK(ctx->HasInput("Filter"), "Input", "Filter",
                   "deformable_conv_v1");
    OP_INOUT_CHECK(ctx->HasOutput("Output"), "Output", "Output",
                   "deformable_conv_v1");

    auto in_dims = ctx->GetInputDim("Input");
    auto filter_dims = ctx->GetInputDim("Filter");
    auto offset_dims = ctx->GetInputDim("Offset");

    std::vector<int> strides = ctx->Attrs().Get<std::vector<int>>("strides");
    std::vector<int> paddings = ctx->Attrs().Get<std::vector<int>>("paddings");
    std::vector<int> dilations =
        ctx->Attrs().Get<std::vector<int>>("dilations");
    int groups = ctx->Attrs().Get<int>("groups");
    int deformable_groups = ctx->Attrs().Get<int>("deformable_groups");
    int im2col_step = ctx->Attrs().Get<int>("im2col_step");

    PADDLE_ENFORCE_EQ(
        in_dims.size(), 4,
        platform::errors::InvalidArgument(
            "Conv input should be 4-D tensor, get %u", in_dims.size()));
    PADDLE_ENFORCE_EQ(
        in_dims.size(), filter_dims.size(),
        platform::errors::InvalidArgument(
            "Conv input dimension and filter dimension should be "
            "the same. the difference is [%d] vs [%d]",
            in_dims.size(), filter_dims.size()));
    PADDLE_ENFORCE_EQ(
        in_dims.size() - strides.size(), 2U,
        platform::errors::InvalidArgument(
            "Conv input dimension and strides "
            "dimension should be consistent., But received [%d]: [%d]",
            in_dims.size(), strides.size()));
    PADDLE_ENFORCE_EQ(
        paddings.size(), strides.size(),
        platform::errors::InvalidArgument(
            "Conv paddings dimension and Conv strides dimension "
            "should be the same. The difference is [%d] vs [%d]",
            paddings.size(), strides.size()));

    PADDLE_ENFORCE_EQ(
        in_dims[1], filter_dims[1] * groups,
        platform::errors::InvalidArgument(
            "The number of input channels should be equal to filter "
            "channels * groups. The difference is [%d]: [%d]",
            in_dims[1], filter_dims[1] * groups));
    PADDLE_ENFORCE_EQ(
        filter_dims[0] % groups, 0,
        platform::errors::InvalidArgument(
            "The number of output channels should be divided by groups. But"
            "received output channels: [%d], groups: [%d]",
            filter_dims[0], groups));
    PADDLE_ENFORCE_EQ(
        filter_dims[0] % deformable_groups, 0,
        platform::errors::InvalidArgument(
            "The number of output channels should be "
            "divided by deformable groups. But received [%d]: [%d]",
            filter_dims[0], deformable_groups));

    if (in_dims[0] > im2col_step) {
      PADDLE_ENFORCE_EQ(in_dims[0] % im2col_step, 0U,
                        platform::errors::InvalidArgument(
                            "Input batchsize must be smaller than or divide "
                            "im2col_step, But received [%d]: [%d]",
                            in_dims[0], im2col_step));
    }

    for (size_t i = 0; i < strides.size(); ++i) {
      PADDLE_ENFORCE_GT(strides[i], 0U,
                        platform::errors::InvalidArgument(
                            "stride %d size incorrect", i));
    }
    for (size_t i = 0; i < dilations.size(); ++i) {
      PADDLE_ENFORCE_GT(dilations[i], 0U,
                        platform::errors::InvalidArgument(
                            "dilation %d size incorrect", i));
    }

    std::vector<int64_t> output_shape({in_dims[0], filter_dims[0]});
    for (size_t i = 0; i < strides.size(); ++i) {
      if ((!ctx->IsRuntime()) &&
          (in_dims[i + 2] <= 0 || filter_dims[i + 2] <= 0)) {
        output_shape.push_back(-1);
      } else {
        output_shape.push_back(ConvOutputSize(in_dims[i + 2],
                                              filter_dims[i + 2], dilations[i],
                                              paddings[i], strides[i]));
      }
    }

    if (ctx->IsRuntime()) {
      PADDLE_ENFORCE_EQ(output_shape[1] % deformable_groups, 0U,
                        platform::errors::InvalidArgument(
                            "output num_filter must divide deformable group "
                            "size. But received [%d]: [%d]",
                            output_shape[1], deformable_groups));
      PADDLE_ENFORCE_EQ(output_shape[2], offset_dims[2],
                        platform::errors::InvalidArgument(
                            "output height must equal to offset map height. "
                            "The difference is [%d]: [%d]",
                            output_shape[2], offset_dims[2]));
      PADDLE_ENFORCE_EQ(output_shape[3], offset_dims[3],
                        platform::errors::InvalidArgument(
                            "output width must equal to offset map width. The "
                            "difference is [%d]: [%d]",
                            output_shape[3], offset_dims[3]));
      PADDLE_ENFORCE_EQ(offset_dims[1] % (filter_dims[2] * filter_dims[3]), 0U,
                        platform::errors::InvalidArgument(
                            "offset filter must divide deformable group size. "
                            "But received [%d]: [%d]",
                            offset_dims[1], filter_dims[2] * filter_dims[3]));
      PADDLE_ENFORCE_EQ(
          offset_dims[1] / (2 * filter_dims[2] * filter_dims[3]),
          deformable_groups,
          platform::errors::InvalidArgument(
              "offset filter must divide deformable group size. But received "
              "[%d]: [%d]",
              offset_dims[1] / (2 * filter_dims[2] * filter_dims[3]),
              deformable_groups));
    }
    ctx->SetOutputDim("Output", phi::make_ddim(output_shape));
  }

  protected:
   framework::OpKernelType GetExpectedKernelType(

@@ -300,15 +180,12 @@ class DeformableConvV1GradOp : public framework::OperatorWithKernel {
 }  // namespace paddle

 namespace ops = paddle::operators;

+DECLARE_INFER_SHAPE_FUNCTOR(deformable_conv, DeformableConvV1InferShapeFunctor,
+                            PD_INFER_META(phi::DeformableConvInferMeta));
+
 REGISTER_OPERATOR(deformable_conv_v1, ops::DeformableConvV1Op,
                   ops::DeformableConvV1OpMaker,
                   ops::DeformableConvV1GradOpMaker<paddle::framework::OpDesc>,
-                  ops::DeformableConvV1GradOpMaker<paddle::imperative::OpBase>);
+                  ops::DeformableConvV1GradOpMaker<paddle::imperative::OpBase>,
+                  DeformableConvV1InferShapeFunctor);

 REGISTER_OPERATOR(deformable_conv_v1_grad, ops::DeformableConvV1GradOp);
-
-REGISTER_OP_CPU_KERNEL(deformable_conv_v1,
-                       ops::DeformableConvV1CPUKernel<float>,
-                       ops::DeformableConvV1CPUKernel<double>);
-REGISTER_OP_CPU_KERNEL(deformable_conv_v1_grad,
-                       ops::DeformableConvV1GradCPUKernel<float>,
-                       ops::DeformableConvV1GradCPUKernel<double>);
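As the runtime checks above encode, the Offset input carries 2 * deformable_groups * kh * kw channels: an x and a y offset for every kernel tap in every deformable group. For example, with a 3x3 filter and deformable_groups = 1, Offset must have 2 * 1 * 3 * 3 = 18 channels, so offset_dims[1] / (2 * filter_dims[2] * filter_dims[3]) = 18 / 18 = 1 satisfies the enforced equality with deformable_groups.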
paddle/fluid/operators/deformable_conv_v1_op.cu
Deleted (100644 → 0). Diff collapsed, not shown.

paddle/fluid/operators/deformable_conv_v1_op.h
Deleted (100644 → 0). Diff collapsed, not shown.
paddle/fluid/operators/flatten_op.cc
浏览文件 @
31363c3f
...
@@ -17,7 +17,10 @@ limitations under the License. */
...
@@ -17,7 +17,10 @@ limitations under the License. */
#include <string>
#include <string>
#include <unordered_map>
#include <unordered_map>
#include <vector>
#include <vector>
#include "paddle/fluid/framework/infershape_utils.h"
#include "paddle/fluid/framework/op_registry.h"
#include "paddle/fluid/framework/op_registry.h"
#include "paddle/phi/core/infermeta_utils.h"
#include "paddle/phi/infermeta/unary.h"
namespace
paddle
{
namespace
paddle
{
namespace
operators
{
namespace
operators
{
...
@@ -270,70 +273,24 @@ class Flatten2GradOp : public framework::OperatorWithKernel {
...
@@ -270,70 +273,24 @@ class Flatten2GradOp : public framework::OperatorWithKernel {
class
FlattenContiguousRangeOp
:
public
framework
::
OperatorWithKernel
{
class
FlattenContiguousRangeOp
:
public
framework
::
OperatorWithKernel
{
public:
public:
using
framework
::
OperatorWithKernel
::
OperatorWithKernel
;
using
framework
::
OperatorWithKernel
::
OperatorWithKernel
;
void
InferShape
(
framework
::
InferShapeContext
*
ctx
)
const
override
{
void
InferShape
(
framework
::
InferShapeContext
*
ctx
)
const
override
{
OP_INOUT_CHECK
(
ctx
->
HasInput
(
"X"
),
"Input"
,
"X"
,
"FlattenContiguousRange"
);
OP_INOUT_CHECK
(
ctx
->
HasInput
(
"X"
),
"Input"
,
"X"
,
"FlattenContiguousRange"
);
OP_INOUT_CHECK
(
ctx
->
HasOutput
(
"Out"
),
"Output"
,
"Out"
,
OP_INOUT_CHECK
(
ctx
->
HasOutput
(
"Out"
),
"Output"
,
"Out"
,
"FlattenContiguousRange"
);
"FlattenContiguousRange"
);
const
auto
&
start_axis
=
ctx
->
Attrs
().
Get
<
int
>
(
"start_axis"
);
const
auto
&
start_axis
=
ctx
->
Attrs
().
Get
<
int
>
(
"start_axis"
);
const
auto
&
stop_axis
=
ctx
->
Attrs
().
Get
<
int
>
(
"stop_axis"
);
const
auto
&
stop_axis
=
ctx
->
Attrs
().
Get
<
int
>
(
"stop_axis"
);
const
auto
&
in_dims
=
ctx
->
GetInputDim
(
"X"
);
int
in_dims_size
=
in_dims
.
size
();
int
real_start_axis
=
start_axis
,
real_stop_axis
=
stop_axis
;
if
(
start_axis
<
0
)
{
real_start_axis
=
start_axis
+
in_dims_size
;
}
if
(
stop_axis
<
0
)
{
real_stop_axis
=
stop_axis
+
in_dims_size
;
}
PADDLE_ENFORCE_GE
(
real_stop_axis
,
real_start_axis
,
platform
::
errors
::
InvalidArgument
(
"The stop_axis should be greater"
"than or equal to start_axis."
));
const
auto
&
out_dims
=
// Construct MetaTensor for InferMeta Func
GetOutputShape
(
real_start_axis
,
real_stop_axis
,
in_dims
);
using
CompatMetaTensor
=
framework
::
CompatMetaTensor
;
ctx
->
SetOutputDim
(
"Out"
,
phi
::
make_ddim
(
out_dims
));
CompatMetaTensor
x
(
ctx
->
GetInputVarPtrs
(
"X"
)[
0
],
ctx
->
IsRuntime
());
if
(
in_dims
[
0
]
==
out_dims
[
0
])
{
CompatMetaTensor
out
(
ctx
->
GetOutputVarPtrs
(
"Out"
)[
0
],
ctx
->
IsRuntime
());
// Only pass LoD when the first dimension of output and Input(X)
std
::
unique_ptr
<
CompatMetaTensor
>
xshape
(
nullptr
);
// are the same.
if
(
ctx
->
HasOutput
(
"XShape"
))
{
ctx
->
ShareLoD
(
"X"
,
"Out"
);
xshape
=
std
::
move
(
std
::
unique_ptr
<
CompatMetaTensor
>
(
new
CompatMetaTensor
(
}
ctx
->
GetOutputVarPtrs
(
"XShape"
)[
0
],
ctx
->
IsRuntime
())));
if
(
!
ctx
->
HasOutput
(
"XShape"
))
return
;
// OP_INOUT_CHECK(ctx->HasOutput("XShape"), "Output", "XShape", "Flatten2");
std
::
vector
<
int64_t
>
xshape_dims
(
in_dims
.
size
()
+
1
);
xshape_dims
[
0
]
=
0
;
for
(
int
i
=
0
;
i
<
in_dims
.
size
();
++
i
)
{
xshape_dims
[
i
+
1
]
=
in_dims
[
i
];
}
}
ctx
->
SetOutputDim
(
"XShape"
,
phi
::
make_ddim
(
xshape_dims
));
phi
::
FlattenWithXShapeInferMeta
(
x
,
start_axis
,
stop_axis
,
&
out
,
ctx
->
ShareLoD
(
"X"
,
"XShape"
);
xshape
.
get
());
}
static
std
::
vector
<
int32_t
>
GetOutputShape
(
const
int
start_axis
,
const
int
stop_axis
,
const
framework
::
DDim
&
in_dims
)
{
int64_t
outer
=
1
;
std
::
vector
<
int32_t
>
out_shape
;
int
in_dims_size
=
in_dims
.
size
();
out_shape
.
reserve
(
in_dims_size
-
stop_axis
+
start_axis
);
for
(
int
i
=
0
;
i
<
start_axis
;
++
i
)
{
out_shape
.
push_back
(
in_dims
[
i
]);
}
for
(
int
i
=
start_axis
;
i
<=
stop_axis
;
i
++
)
{
if
(
in_dims
[
i
]
==
-
1
||
outer
==
-
1
)
{
outer
=
-
1
;
}
else
{
outer
*=
in_dims
[
i
];
}
}
out_shape
.
push_back
(
outer
);
for
(
int
i
=
stop_axis
+
1
;
i
<
in_dims_size
;
i
++
)
{
out_shape
.
push_back
(
in_dims
[
i
]);
}
return
out_shape
;
}
}
};
};
...
@@ -487,30 +444,3 @@ REGISTER_OP_CPU_KERNEL(
     ops::Flatten2GradKernel<paddle::platform::CPUDeviceContext, int>,
     ops::Flatten2GradKernel<paddle::platform::CPUDeviceContext, int8_t>,
     ops::Flatten2GradKernel<paddle::platform::CPUDeviceContext, int64_t>);
-REGISTER_OP_CPU_KERNEL(
-    flatten_contiguous_range,
-    ops::FlattenContiguousRangeKernel<paddle::platform::CPUDeviceContext, float>,
-    ops::FlattenContiguousRangeKernel<paddle::platform::CPUDeviceContext, double>,
-    ops::FlattenContiguousRangeKernel<paddle::platform::CPUDeviceContext, uint8_t>,
-    ops::FlattenContiguousRangeKernel<paddle::platform::CPUDeviceContext, int>,
-    ops::FlattenContiguousRangeKernel<paddle::platform::CPUDeviceContext, int8_t>,
-    ops::FlattenContiguousRangeKernel<paddle::platform::CPUDeviceContext, int64_t>);
-REGISTER_OP_CPU_KERNEL(
-    flatten_contiguous_range_grad,
-    ops::FlattenContiguousRangeGradKernel<paddle::platform::CPUDeviceContext, float>,
-    ops::FlattenContiguousRangeGradKernel<paddle::platform::CPUDeviceContext, double>,
-    ops::FlattenContiguousRangeGradKernel<paddle::platform::CPUDeviceContext, uint8_t>,
-    ops::FlattenContiguousRangeGradKernel<paddle::platform::CPUDeviceContext, int>,
-    ops::FlattenContiguousRangeGradKernel<paddle::platform::CPUDeviceContext, int8_t>,
-    ops::FlattenContiguousRangeGradKernel<paddle::platform::CPUDeviceContext, int64_t>);
paddle/fluid/operators/flatten_op.cu.cc
...
@@ -47,34 +47,3 @@ REGISTER_OP_CUDA_KERNEL(
     ops::Flatten2GradKernel<paddle::platform::CUDADeviceContext, int>,
     ops::Flatten2GradKernel<paddle::platform::CUDADeviceContext, int8_t>,
     ops::Flatten2GradKernel<paddle::platform::CUDADeviceContext, int64_t>);
-REGISTER_OP_CUDA_KERNEL(
-    flatten_contiguous_range,
-    ops::FlattenContiguousRangeKernel<paddle::platform::CUDADeviceContext, float>,
-    ops::FlattenContiguousRangeKernel<paddle::platform::CUDADeviceContext, plat::float16>,
-    ops::FlattenContiguousRangeKernel<paddle::platform::CUDADeviceContext, double>,
-    ops::FlattenContiguousRangeKernel<paddle::platform::CUDADeviceContext, uint8_t>,
-    ops::FlattenContiguousRangeKernel<paddle::platform::CUDADeviceContext, int>,
-    ops::FlattenContiguousRangeKernel<paddle::platform::CUDADeviceContext, int8_t>,
-    ops::FlattenContiguousRangeKernel<paddle::platform::CUDADeviceContext, int64_t>);
-REGISTER_OP_CUDA_KERNEL(
-    flatten_contiguous_range_grad,
-    ops::FlattenContiguousRangeGradKernel<paddle::platform::CUDADeviceContext, float>,
-    ops::FlattenContiguousRangeGradKernel<paddle::platform::CUDADeviceContext, plat::float16>,
-    ops::FlattenContiguousRangeGradKernel<paddle::platform::CUDADeviceContext, double>,
-    ops::FlattenContiguousRangeGradKernel<paddle::platform::CUDADeviceContext, uint8_t>,
-    ops::FlattenContiguousRangeGradKernel<paddle::platform::CUDADeviceContext, int>,
-    ops::FlattenContiguousRangeGradKernel<paddle::platform::CUDADeviceContext, int8_t>,
-    ops::FlattenContiguousRangeGradKernel<paddle::platform::CUDADeviceContext, int64_t>);
paddle/fluid/operators/flatten_op.h
...
@@ -119,46 +119,5 @@ class Flatten2GradKernel : public framework::OpKernel<T> {
   }
 };
-
-template <typename DeviceContext, typename T>
-class FlattenContiguousRangeKernel : public framework::OpKernel<T> {
- public:
-  void Compute(const framework::ExecutionContext &context) const override {
-    auto *in = context.Input<framework::LoDTensor>("X");
-    auto *out = context.Output<framework::LoDTensor>("Out");
-    out->mutable_data(context.GetPlace(), in->type());
-    auto &start_axis = context.Attr<int>("start_axis");
-    auto &stop_axis = context.Attr<int>("stop_axis");
-    auto &dev_ctx = context.device_context<DeviceContext>();
-
-    // call new kernel
-    phi::FlattenKernel<T, typename paddle::framework::ConvertToPhiContext<
-                              DeviceContext>::TYPE>(
-        static_cast<const typename paddle::framework::ConvertToPhiContext<
-            DeviceContext>::TYPE &>(dev_ctx),
-        *in, start_axis, stop_axis, out);
-  }
-};
-
-template <typename DeviceContext, typename T>
-class FlattenContiguousRangeGradKernel : public framework::OpKernel<T> {
- public:
-  void Compute(const framework::ExecutionContext &ctx) const override {
-    auto *d_x = ctx.Output<framework::LoDTensor>(framework::GradVarName("X"));
-    auto *d_out =
-        ctx.Input<framework::LoDTensor>(framework::GradVarName("Out"));
-    auto *xshape = ctx.Input<framework::LoDTensor>("XShape");
-    d_x->mutable_data(ctx.GetPlace(), d_out->type());
-    auto &dev_ctx = ctx.device_context<DeviceContext>();
-
-    // call new kernel
-    phi::FlattenGradKernel<T, typename paddle::framework::ConvertToPhiContext<
-                                  DeviceContext>::TYPE>(
-        static_cast<const typename paddle::framework::ConvertToPhiContext<
-            DeviceContext>::TYPE &>(dev_ctx),
-        *d_out, *xshape, d_x);
-  }
-};
 }  // namespace operators
 }  // namespace paddle
paddle/fluid/operators/flatten_op_xpu.cc
...
@@ -41,27 +41,4 @@ REGISTER_OP_XPU_KERNEL(
     ops::Flatten2GradKernel<paddle::platform::XPUDeviceContext, int>,
     ops::Flatten2GradKernel<paddle::platform::XPUDeviceContext, int8_t>,
     ops::Flatten2GradKernel<paddle::platform::XPUDeviceContext, int64_t>);
-REGISTER_OP_XPU_KERNEL(
-    flatten_contiguous_range,
-    ops::FlattenContiguousRangeKernel<paddle::platform::XPUDeviceContext, float>,
-    ops::FlattenContiguousRangeKernel<paddle::platform::XPUDeviceContext, plat::float16>,
-    ops::FlattenContiguousRangeKernel<paddle::platform::XPUDeviceContext, int>,
-    ops::FlattenContiguousRangeKernel<paddle::platform::XPUDeviceContext, int8_t>,
-    ops::FlattenContiguousRangeKernel<paddle::platform::XPUDeviceContext, int64_t>);
-REGISTER_OP_XPU_KERNEL(
-    flatten_contiguous_range_grad,
-    ops::FlattenContiguousRangeGradKernel<paddle::platform::XPUDeviceContext, float>,
-    ops::FlattenContiguousRangeGradKernel<paddle::platform::XPUDeviceContext, plat::float16>,
-    ops::FlattenContiguousRangeGradKernel<paddle::platform::XPUDeviceContext, int>,
-    ops::FlattenContiguousRangeGradKernel<paddle::platform::XPUDeviceContext, int8_t>,
-    ops::FlattenContiguousRangeGradKernel<paddle::platform::XPUDeviceContext, int64_t>);
 #endif
paddle/fluid/pybind/eager_method.cc
...
@@ -868,16 +868,22 @@ static PyObject* tensor_register_grad_hook(TensorObject* self, PyObject* args,
   int64_t hook_id;
   if (egr::egr_utils_api::IsLeafTensor(self->tensor)) {
     VLOG(6) << "Register hook for leaf tensor: " << self->tensor.name();
+
+    auto autograd_meta = egr::EagerUtils::unsafe_autograd_meta(self->tensor);
+    if (autograd_meta && !autograd_meta->StopGradient()) {
+      if (!autograd_meta->GetMutableGradNode()) {
+        VLOG(6) << "Detected NULL grad_node, Leaf tensor should have had "
+                   "grad_node with type: GradNodeAccumulation.";
+        autograd_meta->SetGradNode(
+            std::make_shared<egr::GradNodeAccumulation>(autograd_meta));
+      }
+    }
+
     std::shared_ptr<egr::GradNodeBase> grad_node =
         egr::EagerUtils::grad_node(self->tensor);
-    PADDLE_ENFORCE(
-        grad_node.get() != nullptr,
-        paddle::platform::errors::Fatal("Detected NULL grad_node,"
-                                        "Leaf tensor should have had grad_node "
-                                        "with type: GradNodeAccumulation."));
     auto rank_info =
         egr::EagerUtils::unsafe_autograd_meta(self->tensor)->OutRankInfo();
     PyObject* hook_func = PyTuple_GET_ITEM(args, 0);

     auto accumulation_grad_node =
...
@@ -948,8 +954,8 @@ static PyObject* tensor_register_reduce_hook(TensorObject* self, PyObject* args,
   EAGER_CATCH_AND_THROW_RETURN_NULL
 }

-static PyObject* set_grad_type(TensorObject* self, PyObject* args,
-                               PyObject* kwargs) {
+static PyObject* tensor__set_grad_type(TensorObject* self, PyObject* args,
+                                       PyObject* kwargs) {
   EAGER_TRY
   auto var_type = pybind::CastPyArg2ProtoType(PyTuple_GET_ITEM(args, 0), 0);
   auto grad_tensor =
...
@@ -963,6 +969,42 @@ static PyObject* set_grad_type(TensorObject* self, PyObject* args,
   EAGER_CATCH_AND_THROW_RETURN_NULL
 }

+static PyObject* tensor__clear(TensorObject* self, PyObject* args,
+                               PyObject* kwargs) {
+  EAGER_TRY
+  self->tensor.reset();
+  return Py_None;
+  EAGER_CATCH_AND_THROW_RETURN_NULL
+}
+
+static PyObject* tensor__copy_gradient_from(TensorObject* self, PyObject* args,
+                                            PyObject* kwargs) {
+  EAGER_TRY
+  auto src = CastPyArg2Tensor(PyTuple_GET_ITEM(args, 0), 0);
+  if (self->tensor.is_initialized()) {
+    PADDLE_ENFORCE_EQ(self->tensor.dtype(), src.dtype(),
+                      platform::errors::PreconditionNotMet(
+                          "Tensor %s has different data type with Tensor %s",
+                          self->tensor.name(), src.name()));
+    PADDLE_ENFORCE_EQ(self->tensor.impl()->type_info().id(),
+                      src.impl()->type_info().id(),
+                      platform::errors::PreconditionNotMet(
+                          "Tensor %s has different type with Tensor %s, Tensor "
+                          "ShareGradientDataWith cannot be performed!",
+                          self->tensor.name(), src.name()));
+  }
+  VLOG(6) << "Tensor copy gradient from: " << src.name();
+  auto* p_grad = egr::EagerUtils::mutable_grad(self->tensor);
+  if (p_grad) {
+    PADDLE_ENFORCE_EQ(src.initialized(), true,
+                      platform::errors::InvalidArgument(
+                          "Tensor %s has not been initialized", src.name()));
+    p_grad->set_impl(src.impl());
+  }
+  Py_INCREF(Py_None);
+  return Py_None;
+  EAGER_CATCH_AND_THROW_RETURN_NULL
+}
+
 static PyObject* tensor_method_get_non_zero_indices(TensorObject* self,
                                                     PyObject* args,
                                                     PyObject* kwargs) {
...
@@ -1117,7 +1159,12 @@ PyMethodDef variable_methods[] = {
     {"_register_backward_hook",
      (PyCFunction)(void (*)(void))tensor_register_reduce_hook,
      METH_VARARGS | METH_KEYWORDS, NULL},
-    {"_set_grad_type", (PyCFunction)(void (*)(void))set_grad_type,
+    {"_set_grad_type", (PyCFunction)(void (*)(void))tensor__set_grad_type,
+     METH_VARARGS | METH_KEYWORDS, NULL},
+    {"_clear", (PyCFunction)(void (*)(void))tensor__clear,
+     METH_VARARGS | METH_KEYWORDS, NULL},
+    {"_copy_gradient_from",
+     (PyCFunction)(void (*)(void))tensor__copy_gradient_from,
     METH_VARARGS | METH_KEYWORDS, NULL},
     /***the method of sparse tensor****/
     {"non_zero_indices",
...
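The method-table entries above cast every handler through (PyCFunction)(void (*)(void)) before storing it. A minimal standalone analog of that two-step cast, with no Python headers required (the names below are illustrative only): a function pointer may be converted to another function-pointer type and back, and the round trip is well defined as long as the call happens through the real type; the intermediate void (*)(void) cast is what silences -Wcast-function-type.

#include <cstdio>

static int real_handler(int x, int y, int z) { return x + y + z; }

using GenericFn = void (*)(void);
using SlotFn = int (*)(int, int);  // deliberately a different signature

int main() {
  // Stored under a generic slot type; never actually called as SlotFn.
  SlotFn slot = (SlotFn)(GenericFn)&real_handler;
  // Cast back to the true type before the call.
  auto back = (int (*)(int, int, int))(GenericFn)slot;
  printf("%d\n", back(1, 2, 3));  // prints 6
  return 0;
}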
paddle/fluid/pybind/imperative.cc
...
@@ -655,6 +655,7 @@ void BindImperative(py::module *m_ptr) {
             } else {
               act_name = name.cast<std::string>();
             }
+            VLOG(4) << "Init VarBase :" << act_name;
             new (&self) imperative::VarBase(act_name);
             self.SetPersistable(persistable);
             self.SetType(type);
...
paddle/fluid/pybind/pybind.cc
...
@@ -829,6 +829,8 @@ PYBIND11_MODULE(core_noavx, m) {
           [](const framework::Tensor &self) {
             return reinterpret_cast<uintptr_t>(self.data());
           })
+      .def("_slice", &framework::Tensor::Slice)
+      .def("_numel", &framework::Tensor::numel)
       .def("_is_initialized",
            [](const framework::Tensor &self) { return self.IsInitialized(); })
       .def("_get_dims",
...
paddle/phi/api/include/tensor.h
...
@@ -427,9 +427,7 @@ class PADDLE_API Tensor final {
    * @param blocking, Should we copy this in sync way.
    * @return void
    */
-  void copy_(const Tensor& src, const phi::Place& target_place,
-             bool blocking);
+  void copy_(const Tensor& src, const phi::Place& target_place,
+             const bool blocking);

   /**
    * @brief Cast datatype from one to another
    *
...
paddle/phi/api/lib/tensor_method.cc
...
@@ -84,26 +84,26 @@ void Tensor::copy_(const Tensor &src,
   if (is_initialized()) {
     PADDLE_ENFORCE_EQ(dtype(),
                       src.dtype(),
-                      platform::errors::PreconditionNotMet(
+                      phi::errors::PreconditionNotMet(
                           "Tensor %s has different data type with Tensor %s, "
                           "Tensor Copy cannot be performed!",
                           name(), src.name()));
     PADDLE_ENFORCE_EQ(impl()->type_info().id(),
                       src.impl()->type_info().id(),
-                      platform::errors::PreconditionNotMet(
+                      phi::errors::PreconditionNotMet(
                           "Tensor %s has different type with Tensor %s, Tensor "
                           "Copy cannot be performed!",
                           name(), src.name()));
     PADDLE_ENFORCE_EQ(target_place,
                       inner_place(),
-                      platform::errors::PreconditionNotMet(
+                      phi::errors::PreconditionNotMet(
                           "Place is different of dst tensor and args %s, which "
                           "current tensor holds %s "
                           "Copy cannot be performed!",
-                          target_place.DebugString(),
-                          inner_place().DebugString()));
+                          target_place,
+                          inner_place()));
     kernel_key_set.backend_set =
         kernel_key_set.backend_set |
         BackendSet(phi::TransToPhiBackend(inner_place()));
...
@@ -177,7 +177,7 @@ void Tensor::copy_(const Tensor &src,
         blocking,
         static_cast<phi::SelectedRows *>(impl_.get()));
   } else {
-    PADDLE_THROW(paddle::platform::errors::InvalidArgument(
+    PADDLE_THROW(phi::errors::InvalidArgument(
         "We currently only support dense tensor copy for now and if u need to "
         "copy selected rows please raise a issue."));
   }
...
paddle/phi/infermeta/multiary.cc
...
@@ -516,6 +516,215 @@ void ConcatInferMeta(const std::vector<MetaTensor*>& x,
   out->share_lod(*x.at(0));
 }

+inline int ConvOutputSize(
+    int input_size, int filter_size, int dilation, int padding, int stride) {
+  const int dkernel = dilation * (filter_size - 1) + 1;
+  int output_size = (input_size + 2 * padding - dkernel) / stride + 1;
+  PADDLE_ENFORCE_GT(
+      output_size, 0,
+      phi::errors::InvalidArgument(
+          "The output's size is expected to be greater than 0. "
+          "But recieved: output's size is %d. The output's size is computed by "
+          "((input_size + 2 * padding - (dilation * (filter_size - 1) + 1)) / "
+          "stride + 1), where input_size is %d, padding is %d, "
+          "filter_size is %d, dilation is %d, stride is %d.",
+          output_size, input_size, padding, filter_size, dilation, stride));
+  return output_size;
+}
+
+void DeformableConvInferMeta(const MetaTensor& x,
+                             const MetaTensor& offset,
+                             const MetaTensor& filter,
+                             paddle::optional<const MetaTensor&> mask,
+                             const std::vector<int>& strides,
+                             const std::vector<int>& paddings,
+                             const std::vector<int>& dilations,
+                             int deformable_groups,
+                             int groups,
+                             int im2col_step,
+                             MetaTensor* out,
+                             MetaConfig config) {
+  auto in_dims = x.dims();
+  auto offset_dims = offset.dims();
+  auto filter_dims = filter.dims();
+
+  PADDLE_ENFORCE_EQ(
+      in_dims.size(), 4,
+      phi::errors::InvalidArgument("Conv input should be 4-D tensor, get %u",
+                                   in_dims.size()));
+  PADDLE_ENFORCE_EQ(in_dims.size(), filter_dims.size(),
+                    phi::errors::InvalidArgument(
+                        "Conv input dimension and filter dimension should be "
+                        "the same. The difference is [%d]: [%d]",
+                        in_dims.size(), filter_dims.size()));
+  PADDLE_ENFORCE_EQ(in_dims.size() - strides.size(), 2U,
+                    phi::errors::InvalidArgument(
+                        "Conv input dimension and strides "
+                        "dimension should be consistent. But received input "
+                        "dimension:[%d], strides dimension:[%d]",
+                        in_dims.size(), strides.size()));
+  PADDLE_ENFORCE_EQ(paddings.size(), strides.size(),
+                    phi::errors::InvalidArgument(
+                        "Conv paddings dimension and Conv strides dimension "
+                        "should be the same. The difference is [%d]: [%d]",
+                        paddings.size(), strides.size()));
+  PADDLE_ENFORCE_EQ(
+      in_dims[1], filter_dims[1] * groups,
+      phi::errors::InvalidArgument(
+          "The number of input channels should be equal to filter "
+          "channels * groups. The difference is [%d]: [%d]",
+          in_dims[1], filter_dims[1] * groups));
+  PADDLE_ENFORCE_EQ(
+      filter_dims[0] % groups, 0,
+      phi::errors::InvalidArgument(
+          "The number of output channels should be divided by groups. But "
+          "received output channels:[%d], groups:[%d]",
+          filter_dims[0], groups));
+  PADDLE_ENFORCE_EQ(
+      filter_dims[0] % deformable_groups, 0,
+      phi::errors::InvalidArgument(
+          "The number of output channels should be "
+          "divided by deformable groups. The difference is [%d]: [%d]",
+          filter_dims[0] % groups, 0));
+
+  if (in_dims[0] > im2col_step) {
+    PADDLE_ENFORCE_EQ(
+        in_dims[0] % im2col_step, 0U,
+        phi::errors::InvalidArgument(
+            "Input batchsize must be smaller than or divide im2col_step. But "
+            "received Input batchsize:[%d], im2col_step:[%d]",
+            in_dims[0], im2col_step));
+  }
+
+  for (size_t i = 0; i < strides.size(); ++i) {
+    PADDLE_ENFORCE_GT(
+        strides[i], 0U,
+        phi::errors::InvalidArgument("stride %d size incorrect", i));
+  }
+  for (size_t i = 0; i < dilations.size(); ++i) {
+    PADDLE_ENFORCE_GT(
+        dilations[i], 0U,
+        phi::errors::InvalidArgument("dilation %d size incorrect", i));
+  }
+
+  std::vector<int64_t> output_shape({in_dims[0], filter_dims[0]});
+  for (size_t i = 0; i < strides.size(); ++i) {
+    if (!config.is_runtime &&
+        (in_dims[i + 2] <= 0 || filter_dims[i + 2] <= 0)) {
+      output_shape.push_back(-1);
+    } else {
+      output_shape.push_back(ConvOutputSize(in_dims[i + 2],
+                                            filter_dims[i + 2], dilations[i],
+                                            paddings[i], strides[i]));
+    }
+  }
+
+  PADDLE_ENFORCE_EQ(
+      output_shape[1] % deformable_groups, 0U,
+      phi::errors::InvalidArgument(
+          "output num_filter must divide deformable group size. But received "
+          "output num_filter:[%d], deformable group size:[%d]",
+          output_shape[1], deformable_groups));
+
+  if (config.is_runtime) {
+    PADDLE_ENFORCE_EQ(output_shape[2], offset_dims[2],
+                      phi::errors::InvalidArgument(
+                          "output height must equal to offset map height. "
+                          "The difference is [%d]: [%d]",
+                          output_shape[2], offset_dims[2]));
+    PADDLE_ENFORCE_EQ(output_shape[3], offset_dims[3],
+                      phi::errors::InvalidArgument(
+                          "output width must equal to offset map width. The "
+                          "difference is [%d]: [%d]",
+                          output_shape[3], offset_dims[3]));
+    PADDLE_ENFORCE_EQ(offset_dims[1] % (filter_dims[2] * filter_dims[3]), 0U,
+                      phi::errors::InvalidArgument(
+                          "offset filter must divide deformable group size. "
+                          "But received [%d]: [%d]",
+                          offset_dims[1], filter_dims[2] * filter_dims[3]));
+    PADDLE_ENFORCE_EQ(
+        offset_dims[1] / (2 * filter_dims[2] * filter_dims[3]),
+        deformable_groups,
+        phi::errors::InvalidArgument(
+            "offset filter must divide deformable group size. But received "
+            "[%d]: [%d]",
+            offset_dims[1] / (2 * filter_dims[2] * filter_dims[3]),
+            deformable_groups));
+
+    if (mask) {
+      auto mask_dims = mask->dims();
+      PADDLE_ENFORCE_EQ(output_shape[2], mask_dims[2],
+                        phi::errors::InvalidArgument(
+                            "output height must equal to mask map height. The "
+                            "difference is [%d] vs [%d]",
+                            output_shape[2], mask_dims[2]));
+      PADDLE_ENFORCE_EQ(output_shape[3], mask_dims[3],
+                        phi::errors::InvalidArgument(
+                            "output width must equal to mask map width. The "
+                            "difference is [%d] vs [%d]",
+                            output_shape[3], mask_dims[3]));
+      PADDLE_ENFORCE_EQ(mask_dims[1] % (filter_dims[2] * filter_dims[3]), 0U,
+                        phi::errors::InvalidArgument(
+                            "mask filter must divide deformable group size. "
+                            "But received [%d]: [%d]",
+                            mask_dims[1], filter_dims[2] * filter_dims[3]));
+      PADDLE_ENFORCE_EQ(mask_dims[1] / (filter_dims[2] * filter_dims[3]),
+                        deformable_groups,
+                        phi::errors::InvalidArgument(
+                            "mask filter must divide deformable group size. "
+                            "But received [%d]: [%d]",
+                            mask_dims[1] / (filter_dims[2] * filter_dims[3]),
+                            deformable_groups));
+    }
+  }
+
+  out->set_dims(phi::make_ddim(output_shape));
+  out->set_dtype(x.dtype());
+}
+
 void HierarchicalSigmoidInferMeta(const MetaTensor& x,
                                   const MetaTensor& w,
                                   const MetaTensor& label,
...
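As a quick sanity check of the formula ConvOutputSize enforces, a standalone worked example, not part of the patch:

#include <cstdio>

// output = (input + 2 * padding - (dilation * (filter - 1) + 1)) / stride + 1
int main() {
  int input_size = 32, filter_size = 3, dilation = 1, padding = 1, stride = 1;
  int dkernel = dilation * (filter_size - 1) + 1;  // effective kernel extent
  int output_size = (input_size + 2 * padding - dkernel) / stride + 1;
  printf("output_size = %d\n", output_size);  // prints 32 ("same" padding)
  return 0;
}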
paddle/phi/infermeta/multiary.h
...
@@ -120,6 +120,19 @@ void ConcatInferMeta(const std::vector<MetaTensor*>& x,
                     MetaTensor* out,
                     MetaConfig config = MetaConfig());

+void DeformableConvInferMeta(const MetaTensor& x,
+                             const MetaTensor& offset,
+                             const MetaTensor& filter,
+                             paddle::optional<const MetaTensor&> mask,
+                             const std::vector<int>& strides,
+                             const std::vector<int>& paddings,
+                             const std::vector<int>& dilations,
+                             int deformable_groups,
+                             int groups,
+                             int im2col_step,
+                             MetaTensor* out,
+                             MetaConfig config = MetaConfig());
+
 void HierarchicalSigmoidInferMeta(const MetaTensor& x,
                                   const MetaTensor& w,
                                   const MetaTensor& label,
...
paddle/phi/infermeta/unary.cc
...
@@ -352,6 +352,14 @@ void FlattenInferMeta(const MetaTensor& x,
                      int start_axis,
                      int stop_axis,
                      MetaTensor* out) {
+  FlattenWithXShapeInferMeta(x, start_axis, stop_axis, out, nullptr);
+}
+
+void FlattenWithXShapeInferMeta(const MetaTensor& x,
+                                int start_axis,
+                                int stop_axis,
+                                MetaTensor* out,
+                                MetaTensor* xshape) {
   auto x_dims = x.dims();
   int in_dims_size = x_dims.size();
   if (start_axis < 0) {
...
@@ -394,6 +402,14 @@ void FlattenInferMeta(const MetaTensor& x,
     // are the same.
     out->share_lod(x);
   }
+  if (xshape == nullptr) return;
+  std::vector<int64_t> xshape_dims(x_dims.size() + 1);
+  xshape_dims[0] = 0;
+  for (int i = 0; i < x_dims.size(); ++i) {
+    xshape_dims[i + 1] = x_dims[i];
+  }
+  xshape->set_dims(phi::make_ddim(xshape_dims));
+  xshape->share_lod(x);
 }

 void GumbelSoftmaxInferMeta(const MetaTensor& x,
...
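The XShape encoding built above is worth spelling out: the real input dims are stored shifted right by one with a 0 in front, so the grad kernel can recover them with slice_ddim(xshape_dims, 1, xshape_dims.size()) without ever allocating memory for XShape. A standalone illustration, not part of the patch:

#include <cstdio>
#include <vector>

int main() {
  std::vector<int64_t> x_dims = {2, 3, 4, 5};
  std::vector<int64_t> xshape_dims(x_dims.size() + 1);
  xshape_dims[0] = 0;  // leading 0 marks a dims-only placeholder tensor
  for (size_t i = 0; i < x_dims.size(); ++i) xshape_dims[i + 1] = x_dims[i];
  for (int64_t d : xshape_dims) printf("%lld ", (long long)d);  // 0 2 3 4 5
  printf("\n");
  return 0;
}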
paddle/phi/infermeta/unary.h
...
@@ -86,6 +86,12 @@ void FlattenInferMeta(const MetaTensor& x,
                      int stop_axis,
                      MetaTensor* out);

+void FlattenWithXShapeInferMeta(const MetaTensor& x,
+                                int start_axis,
+                                int stop_axis,
+                                MetaTensor* out,
+                                MetaTensor* xshape);
+
 void GumbelSoftmaxInferMeta(const MetaTensor& x,
                             float temperature,
                             bool hard,
...
paddle/phi/kernels/CMakeLists.txt
...
@@ -27,12 +27,14 @@ kernel_library(full_kernel DEPS ${COMMON_KERNEL_DEPS} empty_kernel)
 # Some kernels depend on some targets that are not commonly used.
 # These targets are not suitable for common dependencies.
 # In this case, you need to manually generate them here.
-set(MANUAL_BUILD_KERNELS eigh_kernel gumbel_softmax_kernel gumbel_softmax_grad_kernel
+set(MANUAL_BUILD_KERNELS deformable_conv_kernel deformable_conv_grad_kernel
+  eigh_kernel gumbel_softmax_kernel gumbel_softmax_grad_kernel
   hierarchical_sigmoid_kernel hierarchical_sigmoid_grad_kernel
   matrix_power_kernel matrix_power_grad_kernel maxout_kernel maxout_grad_kernel pool_kernel
   put_along_axis_kernel put_along_axis_grad_kernel segment_pool_kernel segment_pool_grad_kernel
   softmax_kernel softmax_grad_kernel take_along_axis_kernel take_along_axis_grad_kernel
   triangular_solve_grad_kernel determinant_grad_kernel reduce_kernel)
+kernel_library(deformable_conv_kernel DEPS ${COMMON_KERNEL_DEPS} deformable_conv_functor)
+kernel_library(deformable_conv_grad_kernel DEPS ${COMMON_KERNEL_DEPS} deformable_conv_functor)
 kernel_library(eigh_kernel DEPS ${COMMON_KERNEL_DEPS} lapack_function)
 kernel_library(hierarchical_sigmoid_kernel DEPS ${COMMON_KERNEL_DEPS} matrix_bit_code)
 kernel_library(hierarchical_sigmoid_grad_kernel DEPS ${COMMON_KERNEL_DEPS} matrix_bit_code)
...
paddle/phi/kernels/cpu/deformable_conv_grad_kernel.cc (new file, mode 100644)
// Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//     http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

#include "paddle/phi/kernels/deformable_conv_grad_kernel.h"
#include "paddle/phi/backends/cpu/cpu_context.h"
#include "paddle/phi/core/kernel_registry.h"
#include "paddle/phi/kernels/impl/deformable_conv_grad_kernel_impl.h"

namespace phi {

template <typename T>
inline void ModulatedDeformableCol2imCPUKernel(
    const int num_kernels, const T* data_col, const T* data_offset,
    const T* data_mask, const int channels, const int height, const int width,
    const int kernel_h, const int kernel_w, const int pad_h, const int pad_w,
    const int stride_h, const int stride_w, const int dilation_h,
    const int dilation_w, const int channel_per_deformable_group,
    const int batch_size, const int deformable_group, const int height_col,
    const int width_col, T* grad_im) {
  for (int thread = 0; thread < num_kernels; thread++) {
    const int j = (thread / width_col / height_col / batch_size) % kernel_w;
    const int i =
        (thread / width_col / height_col / batch_size / kernel_w) % kernel_h;
    const int c =
        thread / width_col / height_col / batch_size / kernel_w / kernel_h;
    const int deformable_group_index = c / channel_per_deformable_group;

    int w_out = thread % width_col;
    int h_out = (thread / width_col) % height_col;
    int b = (thread / width_col / height_col) % batch_size;
    int w_in = w_out * stride_w - pad_w;
    int h_in = h_out * stride_h - pad_h;

    const T* data_offset_ptr =
        data_offset + (b * deformable_group + deformable_group_index) * 2 *
                          kernel_h * kernel_w * height_col * width_col;
    const int data_offset_h_ptr =
        ((2 * (i * kernel_w + j)) * height_col + h_out) * width_col + w_out;
    const int data_offset_w_ptr =
        ((2 * (i * kernel_w + j) + 1) * height_col + h_out) * width_col +
        w_out;
    const int data_mask_hw_ptr =
        ((i * kernel_w + j) * height_col + h_out) * width_col + w_out;
    const T offset_h = data_offset_ptr[data_offset_h_ptr];
    const T offset_w = data_offset_ptr[data_offset_w_ptr];
    const T cur_inv_h_data = h_in + i * dilation_h + offset_h;
    const T cur_inv_w_data = w_in + j * dilation_w + offset_w;

    T cur_top_grad = data_col[thread];
    if (data_mask) {
      const T* data_mask_ptr =
          data_mask + (b * deformable_group + deformable_group_index) *
                          kernel_h * kernel_w * height_col * width_col;
      const T mask = data_mask_ptr[data_mask_hw_ptr];
      cur_top_grad *= mask;
    }
    const int cur_h = static_cast<int>(cur_inv_h_data);
    const int cur_w = static_cast<int>(cur_inv_w_data);
    for (int dy = -2; dy <= 2; dy++) {
      for (int dx = -2; dx <= 2; dx++) {
        if (cur_h + dy >= 0 && cur_h + dy < height && cur_w + dx >= 0 &&
            cur_w + dx < width && abs(cur_inv_h_data - (cur_h + dy)) < 1 &&
            abs(cur_inv_w_data - (cur_w + dx)) < 1) {
          int cur_bottom_grad_pos =
              ((b * channels + c) * height + cur_h + dy) * width + cur_w + dx;
          T weight =
              DmcnGetGradientWeight(cur_inv_h_data, cur_inv_w_data,
                                    cur_h + dy, cur_w + dx, height, width);
          *(grad_im + cur_bottom_grad_pos) =
              *(grad_im + cur_bottom_grad_pos) + weight * cur_top_grad;
        }
      }
    }
  }
}

template <typename T, typename Context>
void ModulatedDeformableCol2im(const Context& dev_ctx, const T* data_col,
                               const T* data_offset, const T* data_mask,
                               const std::vector<int64_t>& im_shape,
                               const std::vector<int64_t>& col_shape,
                               const std::vector<int64_t>& kernel_shape,
                               const std::vector<int>& pad,
                               const std::vector<int>& stride,
                               const std::vector<int>& dilation,
                               const int deformable_group, T* grad_im) {
  int channel_per_deformable_group = im_shape[0] / deformable_group;
  int num_kernels = col_shape[0] * col_shape[1] * col_shape[2] * col_shape[3];

  ModulatedDeformableCol2imCPUKernel(
      num_kernels, data_col, data_offset, data_mask, im_shape[0], im_shape[1],
      im_shape[2], kernel_shape[2], kernel_shape[3], pad[0], pad[1],
      stride[0], stride[1], dilation[0], dilation[1],
      channel_per_deformable_group, col_shape[1], deformable_group,
      col_shape[2], col_shape[3], grad_im);
}

template <typename T>
void ModulatedDeformableCol2imCoordCPUKernel(
    const int num_kernels, const T* data_col, const T* data_im,
    const T* data_offset, const T* data_mask, const int channels,
    const int height, const int width, const int kernel_h, const int kernel_w,
    const int pad_h, const int pad_w, const int stride_h, const int stride_w,
    const int dilation_h, const int dilation_w,
    const int channel_per_deformable_group, const int batch_size,
    const int offset_channels, const int deformable_group,
    const int height_col, const int width_col, T* grad_offset, T* grad_mask) {
  for (int i = 0; i < num_kernels; i++) {
    T val = 0, mval = 0;
    const int w = i % width_col;
    const int h = (i / width_col) % height_col;
    const int c = (i / width_col / height_col) % offset_channels;
    const int b = (i / width_col / height_col) / offset_channels;

    const int deformable_group_index = c / (2 * kernel_h * kernel_w);
    const int col_step = kernel_h * kernel_w;
    int cnt = 0;
    const T* data_col_ptr =
        data_col + deformable_group_index * channel_per_deformable_group *
                       batch_size * width_col * height_col;
    const T* data_im_ptr =
        data_im + (b * deformable_group + deformable_group_index) *
                      channel_per_deformable_group / kernel_h / kernel_w *
                      height * width;
    const T* data_offset_ptr =
        data_offset + (b * deformable_group + deformable_group_index) * 2 *
                          kernel_h * kernel_w * height_col * width_col;
    const T* data_mask_ptr =
        data_mask
            ? data_mask + (b * deformable_group + deformable_group_index) *
                              kernel_h * kernel_w * height_col * width_col
            : nullptr;

    const int offset_c = c - deformable_group_index * 2 * kernel_h * kernel_w;

    for (int col_c = offset_c / 2; col_c < channel_per_deformable_group;
         col_c += col_step) {
      const int col_pos =
          (((col_c * batch_size + b) * height_col) + h) * width_col + w;
      const int bp_dir = offset_c % 2;

      int j = (col_pos / width_col / height_col / batch_size) % kernel_w;
      int i = (col_pos / width_col / height_col / batch_size / kernel_w) %
              kernel_h;
      int w_out = col_pos % width_col;
      int h_out = (col_pos / width_col) % height_col;
      int w_in = w_out * stride_w - pad_w;
      int h_in = h_out * stride_h - pad_h;
      const int data_offset_h_ptr =
          (((2 * (i * kernel_w + j)) * height_col + h_out) * width_col +
           w_out);
      const int data_offset_w_ptr =
          (((2 * (i * kernel_w + j) + 1) * height_col + h_out) * width_col +
           w_out);
      const T offset_h = data_offset_ptr[data_offset_h_ptr];
      const T offset_w = data_offset_ptr[data_offset_w_ptr];
      T inv_h = h_in + i * dilation_h + offset_h;
      T inv_w = w_in + j * dilation_w + offset_w;
      if (inv_h <= -1 || inv_w <= -1 || inv_h >= height || inv_w >= width) {
        inv_h = inv_w = -2;
      } else {
        mval += data_col_ptr[col_pos] *
                funcs::DmcnIm2colBilinear(data_im_ptr + cnt * height * width,
                                          width, height, width, inv_h, inv_w);
      }
      const T weight =
          DmcnGetCoordinateWeight(inv_h, inv_w, height, width,
                                  data_im_ptr + cnt * height * width, width,
                                  bp_dir);
      if (data_mask_ptr) {
        const int data_mask_hw_ptr =
            (((i * kernel_w + j) * height_col + h_out) * width_col + w_out);
        const T mask = data_mask_ptr[data_mask_hw_ptr];
        val += weight * data_col_ptr[col_pos] * mask;
      } else {
        val += weight * data_col_ptr[col_pos];
      }
      cnt += 1;
    }
    grad_offset[i] = val;
    if (grad_mask && offset_c % 2 == 0)
      grad_mask[(((b * deformable_group + deformable_group_index) * kernel_h *
                      kernel_w +
                  offset_c / 2) *
                     height_col +
                 h) *
                    width_col +
                w] = mval;
  }
}

template <typename T, typename Context>
void ModulatedDeformableCol2imCoord(
    const Context& dev_ctx, const T* data_col, const T* data_im,
    const T* data_offset, const T* data_mask,
    const std::vector<int64_t>& im_shape,
    const std::vector<int64_t>& col_shape,
    const std::vector<int64_t>& kernel_shape,
    const std::vector<int>& paddings, const std::vector<int>& strides,
    const std::vector<int>& dilations, const int deformable_groups,
    T* grad_offset, T* grad_mask) {
  int num_kernels = 2 * kernel_shape[2] * kernel_shape[3] * col_shape[1] *
                    col_shape[2] * col_shape[3] * deformable_groups;
  int channel_per_deformable_group = col_shape[0] / deformable_groups;

  ModulatedDeformableCol2imCoordCPUKernel(
      num_kernels, data_col, data_im, data_offset, data_mask, im_shape[0],
      im_shape[1], im_shape[2], kernel_shape[2], kernel_shape[3], paddings[0],
      paddings[1], strides[0], strides[1], dilations[0], dilations[1],
      channel_per_deformable_group, col_shape[1],
      2 * kernel_shape[2] * kernel_shape[3] * deformable_groups,
      deformable_groups, col_shape[2], col_shape[3], grad_offset, grad_mask);
}

template <typename T, typename Context>
void FilterGradAddup(const Context& dev_ctx, const int nthreads, const int n,
                     const int height, const int width, const T* dweight_3d,
                     T* filter_grad) {
  for (int i = 0; i < nthreads; i++) {
    filter_grad[i] = filter_grad[i] + dweight_3d[i];
  }
}

}  // namespace phi

PD_REGISTER_KERNEL(deformable_conv_grad,
                   CPU,
                   ALL_LAYOUT,
                   phi::DeformableConvGradKernel,
                   float,
                   double) {}
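DmcnGetGradientWeight is referenced above but defined in a part of the patch not shown here (the impl header). A plausible reading, matching the reference deformable-conv implementation, is the bilinear weight that the fractional sample assigns to each integer neighbor, which is exactly the factor used when scattering the column gradient back into grad_im. A hedged standalone sketch:

#include <cmath>
#include <cstdio>

// Sketch (assumption, not the patch's code): weight of integer pixel
// (argmax_h, argmax_w) under bilinear sampling at fractional (h, w).
template <typename T>
T GradientWeightSketch(T h, T w, int argmax_h, int argmax_w, int height,
                       int width) {
  if (argmax_h < 0 || argmax_h >= height || argmax_w < 0 || argmax_w >= width)
    return 0;
  T dh = 1 - std::abs(h - argmax_h);
  T dw = 1 - std::abs(w - argmax_w);
  return (dh > 0 && dw > 0) ? dh * dw : 0;
}

int main() {
  // Sample at (1.25, 2.5): neighbor (1, 2) receives 0.75 * 0.5 = 0.375.
  printf("%f\n", GradientWeightSketch(1.25f, 2.5f, 1, 2, 4, 4));
  return 0;
}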
paddle/phi/kernels/cpu/deformable_conv_kernel.cc
...
@@ -18,126 +18,6 @@
 #include "paddle/phi/core/kernel_registry.h"
 #include "paddle/phi/kernels/impl/deformable_conv_kernel_impl.h"

-namespace phi {
-
-template <typename T>
-inline void ModulatedDeformableIm2colCPUKernel(
-    const int num_kernels, const T* data_im, const T* data_offset,
-    const T* data_mask, const int height, const int width, const int kernel_h,
-    const int kernel_w, const int pad_h, const int pad_w, const int stride_h,
-    const int stride_w, const int dilation_h, const int dilation_w,
-    const int channel_per_deformable_group, const int batch_size,
-    const int num_channels, const int deformable_group, const int height_col,
-    const int width_col, T* data_col) {
-  for (int i = 0; i < num_kernels; i++) {
-    const int w_col = i % width_col;
-    const int h_col = (i / width_col) % height_col;
-    const int b_col = (i / width_col) / height_col % batch_size;
-    const int c_im = (i / width_col / height_col) / batch_size;
-    const int c_col = c_im * kernel_h * kernel_w;
-    const int deformable_group_index = c_im / channel_per_deformable_group;
-    const int h_in = h_col * stride_h - pad_h;
-    const int w_in = w_col * stride_w - pad_w;
-    T* data_col_ptr =
-        data_col +
-        ((c_col * batch_size + b_col) * height_col + h_col) * width_col +
-        w_col;
-    const T* data_im_ptr =
-        data_im + (b_col * num_channels + c_im) * height * width;
-    const T* data_offset_ptr =
-        data_offset + (b_col * deformable_group + deformable_group_index) *
-                          2 * kernel_h * kernel_w * height_col * width_col;
-    const T* data_mask_ptr =
-        data_mask + (b_col * deformable_group + deformable_group_index) *
-                        kernel_h * kernel_w * height_col * width_col;
-    for (int i = 0; i < kernel_h; ++i) {
-      for (int j = 0; j < kernel_w; ++j) {
-        const int data_offset_h_ptr =
-            ((2 * (i * kernel_w + j)) * height_col + h_col) * width_col +
-            w_col;
-        const int data_offset_w_ptr =
-            ((2 * (i * kernel_w + j) + 1) * height_col + h_col) * width_col +
-            w_col;
-        const int data_mask_hw_ptr =
-            ((i * kernel_w + j) * height_col + h_col) * width_col + w_col;
-        const T offset_h = data_offset_ptr[data_offset_h_ptr];
-        const T offset_w = data_offset_ptr[data_offset_w_ptr];
-        const T mask = data_mask_ptr[data_mask_hw_ptr];
-        T val = static_cast<T>(0);
-        const T h_im = h_in + i * dilation_h + offset_h;
-        const T w_im = w_in + j * dilation_w + offset_w;
-        if (h_im > -1 && w_im > -1 && h_im < height && w_im < width) {
-          val =
-              DmcnIm2colBilinear(data_im_ptr, width, height, width, h_im, w_im);
-        }
-        *data_col_ptr = val * mask;
-        data_col_ptr += batch_size * height_col * width_col;
-      }
-    }
-  }
-}
-
-template <typename T, typename Context>
-void ModulatedDeformableIm2col(
-    const Context& dev_ctx, const T* data_im, const T* data_offset,
-    const T* data_mask, const std::vector<int64_t>& im_shape,
-    const std::vector<int64_t>& col_shape,
-    const std::vector<int64_t>& filter_shape,
-    const std::vector<int>& paddings, const std::vector<int>& strides,
-    const std::vector<int>& dilations, const int deformable_groups,
-    T* data_col) {
-  int channel_per_deformable_group = im_shape[0] / deformable_groups;
-  int num_kernels = im_shape[0] * col_shape[1] * col_shape[2] * col_shape[3];
-
-  // get outputs of im2col with offset by bilinear interpolation
-  ModulatedDeformableIm2colCPUKernel(
-      num_kernels, data_im, data_offset, data_mask, im_shape[1], im_shape[2],
-      filter_shape[2], filter_shape[3], paddings[0], paddings[1], strides[0],
-      strides[1], dilations[0], dilations[1], channel_per_deformable_group,
-      col_shape[1], im_shape[0], deformable_groups, col_shape[2],
-      col_shape[3], data_col);
-}
-
-}  // namespace phi
-
 PD_REGISTER_KERNEL(deformable_conv,
                    CPU,
                    ALL_LAYOUT,
...
paddle/phi/kernels/deformable_conv_grad_kernel.h (new file, mode 100644)
// Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//     http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

#pragma once

#include "paddle/phi/core/dense_tensor.h"

namespace phi {

template <typename T, typename Context>
void DeformableConvGradKernel(const Context& dev_ctx,
                              const DenseTensor& x,
                              const DenseTensor& offset,
                              const DenseTensor& filter,
                              paddle::optional<const DenseTensor&> mask,
                              const DenseTensor& out_grad,
                              const std::vector<int>& strides,
                              const std::vector<int>& paddings,
                              const std::vector<int>& dilations,
                              int deformable_groups,
                              int groups,
                              int im2col_step,
                              DenseTensor* dx,
                              DenseTensor* offset_grad,
                              DenseTensor* filter_grad,
                              DenseTensor* mask_grad);

}  // namespace phi
paddle/phi/kernels/deformable_conv_kernel.h
...
@@ -15,6 +15,7 @@
 #pragma once

 #include "paddle/phi/core/dense_tensor.h"
+#include "paddle/utils/optional.h"

 namespace phi {

...
@@ -23,7 +24,7 @@ void DeformableConvKernel(const Context& dev_ctx,
                           const DenseTensor& x,
                           const DenseTensor& offset,
                           const DenseTensor& filter,
-                          const DenseTensor& mask,
+                          paddle::optional<const DenseTensor&> mask,
                           const std::vector<int>& strides,
                           const std::vector<int>& paddings,
                           const std::vector<int>& dilations,
...
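Making mask a paddle::optional lets deformable_conv (v2, modulated) and deformable_conv_v1 (no mask) share one kernel. A minimal standalone analog of the access pattern the functor code in this commit uses, with std::optional standing in for paddle::optional (an assumption for illustration only):

#include <cstdio>
#include <optional>
#include <vector>

// Dereference only when present; otherwise hand the functor a nullptr so it
// skips the mask multiply.
void RunConv(const float* mask_ptr) {
  printf(mask_ptr ? "modulated (v2) path\n" : "mask-free (v1) path\n");
}

int main() {
  std::vector<float> mask_data = {1.0f, 0.5f};
  std::optional<std::vector<float>> mask = mask_data;  // v2: mask present
  RunConv(mask ? mask->data() : nullptr);

  mask.reset();  // v1: no mask
  RunConv(mask ? mask->data() : nullptr);
  return 0;
}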
paddle/phi/kernels/flatten_grad_kernel.cc
...
@@ -25,6 +25,7 @@ void FlattenGradKernel(const Context& dev_ctx,
                       const DenseTensor& xshape,
                       DenseTensor* x_grad) {
   auto xshape_dims = xshape.dims();
+  dev_ctx.Alloc(x_grad, out_grad.dtype());
   auto x_dims = phi::slice_ddim(xshape_dims, 1, xshape_dims.size());
   phi::Copy(dev_ctx, out_grad, dev_ctx.GetPlace(), false, x_grad);
   x_grad->Resize(x_dims);
...
paddle/phi/kernels/flatten_kernel.cc
...
@@ -27,6 +27,7 @@ void FlattenKernel(const Context& dev_ctx,
                   int start_axis,
                   int stop_axis,
                   DenseTensor* out) {
+  dev_ctx.Alloc(out, x.dtype());
   auto out_dims = out->dims();
   phi::Copy(dev_ctx, x, dev_ctx.GetPlace(), false, out);
   out->Resize(out_dims);
...
@@ -43,7 +44,6 @@ void FlattenWithXShape(const Context& dev_ctx,
                       DenseTensor* out,
                       DenseTensor* xshape) {
   FlattenKernel<T, Context>(dev_ctx, x, start_axis, stop_axis, out);
-  funcs::SetXShape(x, xshape);
 }

 }  // namespace phi
...
paddle/phi/kernels/funcs/CMakeLists.txt
...
@@ -3,6 +3,7 @@ add_subdirectory(blas)
 add_subdirectory(lapack)
 add_subdirectory(detail)

+math_library(deformable_conv_functor DEPS dense_tensor)
 math_library(concat_and_split_functor DEPS dense_tensor)
 math_library(gru_compute DEPS activation_functions math_function)
 math_library(lstm_compute DEPS activation_functions)
...
paddle/phi/kernels/funcs/deformable_conv_functor.cc (new file, mode 100644)
// Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//     http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

#include "paddle/phi/kernels/funcs/deformable_conv_functor.h"
#include "paddle/phi/backends/cpu/cpu_context.h"

namespace phi {
namespace funcs {

template <typename T>
inline void ModulatedDeformableIm2colCPUKernel(
    const int num_kernels, const T* data_im, const T* data_offset,
    const T* data_mask, const int height, const int width, const int kernel_h,
    const int kernel_w, const int pad_h, const int pad_w, const int stride_h,
    const int stride_w, const int dilation_h, const int dilation_w,
    const int channel_per_deformable_group, const int batch_size,
    const int num_channels, const int deformable_group, const int height_col,
    const int width_col, T* data_col) {
  for (int i = 0; i < num_kernels; i++) {
    const int w_col = i % width_col;
    const int h_col = (i / width_col) % height_col;
    const int b_col = (i / width_col) / height_col % batch_size;
    const int c_im = (i / width_col / height_col) / batch_size;
    const int c_col = c_im * kernel_h * kernel_w;
    const int deformable_group_index = c_im / channel_per_deformable_group;
    const int h_in = h_col * stride_h - pad_h;
    const int w_in = w_col * stride_w - pad_w;

    T* data_col_ptr =
        data_col +
        ((c_col * batch_size + b_col) * height_col + h_col) * width_col +
        w_col;
    const T* data_im_ptr =
        data_im + (b_col * num_channels + c_im) * height * width;
    const T* data_offset_ptr =
        data_offset + (b_col * deformable_group + deformable_group_index) *
                          2 * kernel_h * kernel_w * height_col * width_col;
    const T* data_mask_ptr =
        data_mask
            ? data_mask + (b_col * deformable_group + deformable_group_index) *
                              kernel_h * kernel_w * height_col * width_col
            : nullptr;
    for (int i = 0; i < kernel_h; ++i) {
      for (int j = 0; j < kernel_w; ++j) {
        const int data_offset_h_ptr =
            ((2 * (i * kernel_w + j)) * height_col + h_col) * width_col +
            w_col;
        const int data_offset_w_ptr =
            ((2 * (i * kernel_w + j) + 1) * height_col + h_col) * width_col +
            w_col;
        const T offset_h = data_offset_ptr[data_offset_h_ptr];
        const T offset_w = data_offset_ptr[data_offset_w_ptr];
        T val = static_cast<T>(0);
        const T h_im = h_in + i * dilation_h + offset_h;
        const T w_im = w_in + j * dilation_w + offset_w;
        if (h_im > -1 && w_im > -1 && h_im < height && w_im < width) {
          val =
              DmcnIm2colBilinear(data_im_ptr, width, height, width, h_im, w_im);
        }
        *data_col_ptr = val;
        if (data_mask_ptr) {
          const int data_mask_hw_ptr =
              ((i * kernel_w + j) * height_col + h_col) * width_col + w_col;
          const T mask = data_mask_ptr[data_mask_hw_ptr];
          *data_col_ptr *= mask;
        }
        data_col_ptr += batch_size * height_col * width_col;
      }
    }
  }
}

template <typename T, typename Context>
void ModulatedDeformableIm2col(
    const Context& dev_ctx, const T* data_im, const T* data_offset,
    const T* data_mask, const std::vector<int64_t>& im_shape,
    const std::vector<int64_t>& col_shape,
    const std::vector<int64_t>& filter_shape,
    const std::vector<int>& paddings, const std::vector<int>& strides,
    const std::vector<int>& dilations, const int deformable_groups,
    T* data_col) {
  int channel_per_deformable_group = im_shape[0] / deformable_groups;
  int num_kernels = im_shape[0] * col_shape[1] * col_shape[2] * col_shape[3];

  // get outputs of im2col with offset by bilinear interpolation
  ModulatedDeformableIm2colCPUKernel(
      num_kernels, data_im, data_offset, data_mask, im_shape[1], im_shape[2],
      filter_shape[2], filter_shape[3], paddings[0], paddings[1], strides[0],
      strides[1], dilations[0], dilations[1], channel_per_deformable_group,
      col_shape[1], im_shape[0], deformable_groups, col_shape[2],
      col_shape[3], data_col);
}

template void ModulatedDeformableIm2col(
    const phi::CPUContext& dev_ctx, const float* data_im,
    const float* data_offset, const float* data_mask,
    const std::vector<int64_t>& im_shape,
    const std::vector<int64_t>& col_shape,
    const std::vector<int64_t>& filter_shape,
    const std::vector<int>& paddings, const std::vector<int>& strides,
    const std::vector<int>& dilations, const int deformable_groups,
    float* data_col);

template void ModulatedDeformableIm2col(
    const phi::CPUContext& dev_ctx, const double* data_im,
    const double* data_offset, const double* data_mask,
    const std::vector<int64_t>& im_shape,
    const std::vector<int64_t>& col_shape,
    const std::vector<int64_t>& filter_shape,
    const std::vector<int>& paddings, const std::vector<int>& strides,
    const std::vector<int>& dilations, const int deformable_groups,
    double* data_col);

}  // namespace funcs
}  // namespace phi
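The two `template void ModulatedDeformableIm2col(...)` lines at the end exist because the template's definition lives in this .cc file rather than the header. A minimal standalone sketch of the pattern (illustrative names; compiles as a single file, or split along the marked boundaries):

// --- mylib.h: declaration only; the template body is not visible here.
template <typename T>
T TripleSum(T a, T b, T c);

// --- mylib.cc: definition plus explicit instantiations, mirroring the
// float/double instantiations above so the linker can find the symbols.
template <typename T>
T TripleSum(T a, T b, T c) { return a + b + c; }

template float TripleSum(float, float, float);
template double TripleSum(double, double, double);

// --- main.cc: links against mylib.cc without ever seeing the template body.
#include <cstdio>
int main() {
  printf("%f\n", TripleSum(1.0, 2.0, 3.0));  // uses the double instantiation
  return 0;
}

Instantiating for any other type would fail at link time, which is exactly the contract the functor library wants: only float and double are supported.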
paddle/phi/kernels/funcs/deformable_conv_functor.cu (new file, mode 100644)
// Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
#include "paddle/phi/kernels/funcs/deformable_conv_functor.h"
#include "paddle/phi/backends/gpu/gpu_context.h"
namespace
phi
{
namespace
funcs
{
static
constexpr
int
kNumCUDAThreads
=
512
;
static
constexpr
int
kNumMaximumNumBlocks
=
4096
;
static
inline
int
NumBlocks
(
const
int
N
)
{
return
std
::
min
((
N
+
kNumCUDAThreads
-
1
)
/
kNumCUDAThreads
,
kNumMaximumNumBlocks
);
}
template
<
typename
T
>
__global__
void
ModulatedDeformableIm2colGpuKernel
(
const
int
nthreads
,
const
T
*
data_im
,
const
T
*
data_offset
,
const
T
*
data_mask
,
const
int
height
,
const
int
width
,
const
int
kernel_h
,
const
int
kernel_w
,
const
int
pad_h
,
const
int
pad_w
,
const
int
stride_h
,
const
int
stride_w
,
const
int
dilation_h
,
const
int
dilation_w
,
const
int
channel_per_deformable_group
,
const
int
batch_size
,
const
int
num_channels
,
const
int
deformable_group
,
const
int
height_col
,
const
int
width_col
,
T
*
data_col
)
{
int
index
=
blockIdx
.
x
*
blockDim
.
x
+
threadIdx
.
x
;
int
offset
=
blockDim
.
x
*
gridDim
.
x
;
for
(
size_t
i
=
index
;
i
<
nthreads
;
i
+=
offset
)
{
const
int
w_col
=
i
%
width_col
;
const
int
h_col
=
(
i
/
width_col
)
%
height_col
;
const
int
b_col
=
(
i
/
width_col
)
/
height_col
%
batch_size
;
const
int
c_im
=
(
i
/
width_col
/
height_col
)
/
batch_size
;
const
int
c_col
=
c_im
*
kernel_h
*
kernel_w
;
const
int
deformable_group_index
=
c_im
/
channel_per_deformable_group
;
const
int
h_in
=
h_col
*
stride_h
-
pad_h
;
const
int
w_in
=
w_col
*
stride_w
-
pad_w
;
T
*
data_col_ptr
=
data_col
+
((
c_col
*
batch_size
+
b_col
)
*
height_col
+
h_col
)
*
width_col
+
w_col
;
const
T
*
data_im_ptr
=
data_im
+
(
b_col
*
num_channels
+
c_im
)
*
height
*
width
;
const
T
*
data_offset_ptr
=
data_offset
+
(
b_col
*
deformable_group
+
deformable_group_index
)
*
2
*
kernel_h
*
kernel_w
*
height_col
*
width_col
;
const
T
*
data_mask_ptr
=
data_mask
?
data_mask
+
(
b_col
*
deformable_group
+
deformable_group_index
)
*
kernel_h
*
kernel_w
*
height_col
*
width_col
:
nullptr
;
for
(
int
i
=
0
;
i
<
kernel_h
;
++
i
)
{
for
(
int
j
=
0
;
j
<
kernel_w
;
++
j
)
{
const
int
data_offset_h_ptr
=
((
2
*
(
i
*
kernel_w
+
j
))
*
height_col
+
h_col
)
*
width_col
+
w_col
;
const
int
data_offset_w_ptr
=
((
2
*
(
i
*
kernel_w
+
j
)
+
1
)
*
height_col
+
h_col
)
*
width_col
+
w_col
;
const
T
offset_h
=
data_offset_ptr
[
data_offset_h_ptr
];
const
T
offset_w
=
data_offset_ptr
[
data_offset_w_ptr
];
T
val
=
static_cast
<
T
>
(
0
);
const
T
h_im
=
h_in
+
i
*
dilation_h
+
offset_h
;
const
T
w_im
=
w_in
+
j
*
dilation_w
+
offset_w
;
if
(
h_im
>
-
1
&&
w_im
>
-
1
&&
h_im
<
height
&&
w_im
<
width
)
{
val
=
DmcnIm2colBilinear
(
data_im_ptr
,
width
,
height
,
width
,
h_im
,
w_im
);
}
*
data_col_ptr
=
val
;
if
(
data_mask_ptr
)
{
const
int
data_mask_hw_ptr
=
((
i
*
kernel_w
+
j
)
*
height_col
+
h_col
)
*
width_col
+
w_col
;
const
T
mask
=
data_mask_ptr
[
data_mask_hw_ptr
];
*
data_col_ptr
*=
mask
;
}
data_col_ptr
+=
batch_size
*
height_col
*
width_col
;
}
}
}
}
template <typename T, typename Context>
void ModulatedDeformableIm2col(const Context& dev_ctx,
                               const T* data_im,
                               const T* data_offset,
                               const T* data_mask,
                               const std::vector<int64_t>& im_shape,
                               const std::vector<int64_t>& col_shape,
                               const std::vector<int64_t>& filter_shape,
                               const std::vector<int>& paddings,
                               const std::vector<int>& strides,
                               const std::vector<int>& dilations,
                               const int deformable_groups,
                               T* data_col) {
  int channel_per_deformable_group = im_shape[0] / deformable_groups;
  int num_kernels = im_shape[0] * col_shape[1] * col_shape[2] * col_shape[3];

  int blocks = NumBlocks(num_kernels);
  int threads = kNumCUDAThreads;

  ModulatedDeformableIm2colGpuKernel<T>
      <<<blocks, threads, 0, dev_ctx.stream()>>>(num_kernels,
                                                 data_im,
                                                 data_offset,
                                                 data_mask,
                                                 im_shape[1],
                                                 im_shape[2],
                                                 filter_shape[2],
                                                 filter_shape[3],
                                                 paddings[0],
                                                 paddings[1],
                                                 strides[0],
                                                 strides[1],
                                                 dilations[0],
                                                 dilations[1],
                                                 channel_per_deformable_group,
                                                 col_shape[1],
                                                 im_shape[0],
                                                 deformable_groups,
                                                 col_shape[2],
                                                 col_shape[3],
                                                 data_col);
}
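Reading off how the launcher forwards its arguments: im_shape is {C, H, W} for one im2col step, and col_shape is {C * kh * kw, im2col_step, H_col, W_col}, so col_shape[1] lands in the kernel's batch_size slot and one logical thread is spent per (channel, step, y, x) column cell. A small runnable sketch of that sizing arithmetic, with every concrete size invented for illustration:

    #include <algorithm>
    #include <cstdio>
    #include <vector>

    int main() {
      // Illustrative sizes only: C=8 channels, 32x32 image, 3x3 filter,
      // im2col_step = 2, one deformable group.
      std::vector<long long> im_shape = {8, 32, 32};  // {C, H, W}
      std::vector<long long> col_shape = {8 * 3 * 3, 2, 32, 32};
      // col_shape layout: {C * kh * kw, im2col_step, H_col, W_col}

      long long channel_per_group = im_shape[0] / 1;  // deformable_groups = 1
      long long num_kernels =
          im_shape[0] * col_shape[1] * col_shape[2] * col_shape[3];
      // Same sizing rule as NumBlocks: ceil(n / 512), capped at 4096 blocks.
      long long blocks = std::min((num_kernels + 511) / 512, 4096LL);

      std::printf("channel_per_group=%lld num_kernels=%lld blocks=%lld\n",
                  channel_per_group, num_kernels, blocks);
      return 0;
    }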
template void ModulatedDeformableIm2col(
    const phi::GPUContext& dev_ctx,
    const float* data_im,
    const float* data_offset,
    const float* data_mask,
    const std::vector<int64_t>& im_shape,
    const std::vector<int64_t>& col_shape,
    const std::vector<int64_t>& filter_shape,
    const std::vector<int>& paddings,
    const std::vector<int>& strides,
    const std::vector<int>& dilations,
    const int deformable_groups,
    float* data_col);

template void ModulatedDeformableIm2col(
    const phi::GPUContext& dev_ctx,
    const double* data_im,
    const double* data_offset,
    const double* data_mask,
    const std::vector<int64_t>& im_shape,
    const std::vector<int64_t>& col_shape,
    const std::vector<int64_t>& filter_shape,
    const std::vector<int>& paddings,
    const std::vector<int>& strides,
    const std::vector<int>& dilations,
    const int deformable_groups,
    double* data_col);

}  // namespace funcs
}  // namespace phi
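The two explicit instantiations above pin down the float and double variants so the template definition can stay in this .cu file instead of a header. A minimal illustration of the same pattern, with names invented for the example:

    // scale.cu (hypothetical): the template definition lives in the .cu file...
    template <typename T>
    void Scale(T* data, int n, T factor) {
      for (int i = 0; i < n; ++i) data[i] *= factor;
    }

    // ...and explicit instantiation emits the symbols that other translation
    // units link against without ever seeing the definition.
    template void Scale<float>(float*, int, float);
    template void Scale<double>(double*, int, double);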
paddle/phi/kernels/funcs/deformable_conv_functor.h
0 → 100644
(diff collapsed)
paddle/phi/kernels/gpu/deformable_conv_grad_kernel.cu
0 → 100644
(diff collapsed)
paddle/phi/kernels/gpu/deformable_conv_kernel.cu
(diff collapsed)
paddle/phi/kernels/impl/deformable_conv_grad_kernel_impl.h
0 → 100644
(diff collapsed)
paddle/phi/kernels/impl/deformable_conv_kernel_impl.h
(diff collapsed)
paddle/phi/ops/compat/deformable_conv_sig.cc
...
@@ -29,6 +29,34 @@ KernelSignature DeformableConvOpArgumentMapping(
                         {"Output"});
}
KernelSignature DeformableConvGradOpArgumentMapping(
    const ArgumentMappingContext& ctx) {
  return KernelSignature(
      "deformable_conv_grad",
      {"Input", "Offset", "Filter", "Mask", GradVarName("Output")},
      {"strides",
       "paddings",
       "dilations",
       "deformable_groups",
       "groups",
       "im2col_step"},
      {GradVarName("Input"),
       GradVarName("Offset"),
       GradVarName("Filter"),
       GradVarName("Mask")});
}
}  // namespace phi
PD_REGISTER_BASE_KERNEL_NAME(deformable_conv_v1, deformable_conv);
PD_REGISTER_BASE_KERNEL_NAME(deformable_conv_v1_grad, deformable_conv_grad);
PD_REGISTER_ARG_MAPPING_FN(deformable_conv,
                           phi::DeformableConvOpArgumentMapping);
PD_REGISTER_ARG_MAPPING_FN(deformable_conv_grad,
                           phi::DeformableConvGradOpArgumentMapping);
PD_REGISTER_ARG_MAPPING_FN(deformable_conv_v1,
                           phi::DeformableConvOpArgumentMapping);
PD_REGISTER_ARG_MAPPING_FN(deformable_conv_v1_grad,
                           phi::DeformableConvGradOpArgumentMapping);
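Worth noting: deformable_conv_v1 has no Mask input, yet both v1 ops are remapped onto the same deformable_conv kernels here. The CUDA functor above tolerates this because it branches on a null mask pointer (the data_mask ? ... : nullptr test). A toy, runnable version of that optional-mask convention, with all names invented for the example:

    #include <cstdio>
    #include <vector>

    // Toy version of the shared-kernel convention: deformable_conv passes a
    // real mask, deformable_conv_v1 passes nullptr and skips modulation.
    template <typename T>
    void ModulateColumn(const T* mask, int n, T* col) {
      for (int i = 0; i < n; ++i) {
        if (mask) col[i] *= mask[i];  // v2: modulated
        // v1: mask == nullptr, columns pass through unchanged
      }
    }

    int main() {
      std::vector<float> col{1.f, 2.f, 3.f};
      std::vector<float> mask{0.5f, 0.5f, 0.5f};
      ModulateColumn(mask.data(), 3, col.data());     // deformable_conv path
      ModulateColumn<float>(nullptr, 3, col.data());  // deformable_conv_v1 path
      std::printf("%g %g %g\n", col[0], col[1], col[2]);
      return 0;
    }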
python/paddle/distributed/auto_parallel/dist_loader.py
(diff collapsed)
python/paddle/distributed/auto_parallel/dist_saver.py
0 → 100644
(diff collapsed)
python/paddle/distributed/auto_parallel/engine.py
(diff collapsed)
python/paddle/distributed/auto_parallel/utils.py
(diff collapsed)
python/paddle/fluid/dygraph/varbase_patch_methods.py
(diff collapsed)
python/paddle/fluid/tests/unittests/auto_parallel/engine_api.py
(diff collapsed)
python/paddle/fluid/tests/unittests/auto_parallel/engine_predict_api.py
0 → 100644
(diff collapsed)
python/paddle/fluid/tests/unittests/auto_parallel/test_engine_api.py
(diff collapsed)
python/paddle/fluid/tests/unittests/test_egr_python_api.py
(diff collapsed)
python/paddle/fluid/tests/unittests/test_inplace_eager_fluid.py
(diff collapsed)