Commits · 9a53477c56db87b509f8786f67b84130d0ea9e94 · Crayon鑫 / Paddle

Nov 05, 2021 (3 commits)
- [PTen] Organize pten unitests and directory (#36948) · 9a53477c
  Committed by Chen Weihang on Nov 05, 2021
```
* organize pten unitests

* fix detail errors
```
  9a53477c
- Refactor apply transformer (#36899) · 9d2dd727
  Committed by xiongkun on Nov 05, 2021
```
* bugfix: ps mode can't set backend automatically

* refactor

* fix

* refact

* refine code

* refine

* push
```
  9d2dd727
- Update the `VizGraph` method of CinnCompiler and add more debug info. (#36975) · d572fa27
  Committed by Zhen Wang on Nov 05, 2021
```
* Use a more appropriate `Compile` method in cinn_launch_op.

* Update the VizGraph method of CinnCompiler.

* Add resnet50 model training with CINN.
```
  d572fa27
Nov 03, 2021 (5 commits)

Fix PTen thread safety error (#36960) · 9c81a9bb
Committed by Zeng Jinle on Nov 03, 2021
```
* fix pten thread safety error

* improve coverage
```
9c81a9bb

Add FLAGS_allow_cinn_ops & FLAGS_deny_cinn_ops for controlling op types used... · 2479664a

Committed by Zhen Wang on Nov 03, 2021

Add FLAGS_allow_cinn_ops & FLAGS_deny_cinn_ops for controlling op types used in training with CINN. (#36842)

* Update UT test_parallel_executor_run_cinn.py.

* Add FLAGS_allow_cinn_ops & FLAGS_deny_cinn_ops & FLAGS_cinn_ops_delim.

* Use the custom StringSplit function and remove the FLAGS_cinn_ops_delim flag.

* Add FlagController test.

* Apply lock to the cache_ only in CinnCompiler.

* Add VizGraph & ReadableKey method for CinnCompiler.

* Update the dot style of VizGraph in CinnCompiler.

2479664a
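
As a rough illustration of how an allow list and a deny list of op types could interact, here is a minimal Python sketch. It is not Paddle's implementation: the `;` delimiter, the rule that a non-empty allow list takes precedence over the deny list, and the names `split_ops` and `is_op_allowed` are all assumptions made for this example.

```python
def split_ops(flag_value, delim=";"):
    """Split a flag string into a set of op names (delimiter is an assumption)."""
    return {name.strip() for name in flag_value.split(delim) if name.strip()}


def is_op_allowed(op_type, supported_ops, allow_flag="", deny_flag=""):
    """Decide whether an op type may be handed to the compiler.

    Hypothetical policy: the op must be supported at all; if an allow list is
    given, only listed ops pass; otherwise any op not in the deny list passes.
    """
    if op_type not in supported_ops:
        return False
    allow = split_ops(allow_flag)
    deny = split_ops(deny_flag)
    if allow:
        return op_type in allow
    if deny:
        return op_type not in deny
    return True


supported = {"elementwise_add", "relu", "mul"}
print(is_op_allowed("relu", supported, allow_flag="relu;mul"))
print(is_op_allowed("mul", supported, deny_flag="mul"))
```

Parsing the flag once into a set keeps each lookup constant-time, which matters when the check runs for every node of a graph.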

change api->include and hapi->api (#36938) · 3121f889
Committed by Chen Weihang on Nov 03, 2021

3121f889

executor framework (#36892) · 10b039b7
Committed by LiYuRio on Nov 03, 2021

10b039b7

Fix persistable var is GC as unexpected (#36932) · 2b2e8b85
Committed by Aurelius84 on Nov 03, 2021
```
* Fix persistable var is GC as unexpected

* polish code and rebase develop
```
2b2e8b85

Nov 02, 2021 (6 commits)

[PTen] Add lock to kernel signature map init (#36923) · 71d375bb
Committed by Chen Weihang on Nov 02, 2021
```
* add lock to kernel sig map

* add lock for map emplace
```
71d375bb
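
The general idea of guarding a shared map's lazy initialization and insertion with a lock can be sketched in Python as follows. This is only an illustration under assumptions: the class `KernelSignatureMap` and the method `get_or_create` are hypothetical names, not taken from the Paddle sources.

```python
import threading


class KernelSignatureMap:
    """Hypothetical map from op type to kernel signature, safe to fill from many threads."""

    def __init__(self):
        self._lock = threading.Lock()
        self._map = {}

    def get_or_create(self, op_type, make_signature):
        # The lock covers both the lookup and the insert, so two threads
        # cannot both observe a miss and insert for the same op type.
        with self._lock:
            if op_type not in self._map:
                self._map[op_type] = make_signature(op_type)
            return self._map[op_type]


sig_map = KernelSignatureMap()
print(sig_map.get_or_create("scale", lambda op: (op, ("X",), ("Out",))))
```

Holding the lock across the check and the insert is the simplest correct choice. A finer-grained scheme would only be worth it if signature construction turned out to be expensive.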

[Need review] Added conv + hard_sigmoid oneDNN fuse pass (#36869) · 53690719
Committed by jakpiase on Nov 02, 2021
```
* added conv + hard_sigmoid fuse pass

* Removed IsOptional() statements

* Reverted removing optional
```
53690719

memory sparse table (#36909) · 703487c6
Committed by zhaocaibei123 on Nov 02, 2021

703487c6

Add Intermediate Kernel API for refactor Tensor Lib (#36914) · 4a7f1a0d

Committed by YuanRisheng on Nov 02, 2021

* initial tensor design & sign kernel demo

* add move constructor for meta & add lodtensor

* add dirs & sign xpu kernel

* add mean cpu&cuda kernel impl

* move sign & mean xpu & npu kernel

* add selected_rows basic impl

* refactor design, BaseTensor to DenseTensor, etc.

* add scale mkldnn kernel

* polish xpu & npu impl details

* fix mkldnn reuse compile failed

* change tensor operation lib name

* rename util filename

* add more comments

* change TensorImplInterface to TensorInterface

* add kernel key and factory

* remove MKLDNNTensorMeta, add MKLDNNDenseTensor

* change XXDeviceContext to XXContext

* add base kernel registrar utils & test on sign

* replace boost::any by paddle::any

* fix several ci failed

* fix npu compile error

* add ordered map util

* fix multiple ordered_map compile errors

* move dev into include dir

* support sign op in static op run

* fix static op run error

* fix new executor compile failed

* add dygraph branch & remove sign_op.h

* fix test_infer_no_need_buffer_slots

* fix rocm compile link error

* fix unitybuild error & clear glog

* fix npu compile failed

* skip quant trans test

* fix part windows compile problem

* fix xpu enforce error

* fix inference test failed

* remove ordered_map to solve quant failed

* fix part of rocm compile failed

* add more register kernels

* revert scale kernel temporarily

* fix code format error

* add new kernel registrar macro

* rename top to tcmpt

* revert xpu, npu, mkldnn impl & remove op def

* add kernel args parse functor to auto parse args

* revert some change & add scale kernels

* add op proto in dygraph kernelcontext building

* polish kernel dispatch logic & naming rule

* fix scale kernel match error

* fix scale test failed

* add mean API and unittest

* test mean api success

* add branch to solve compiled error

* skip clang format error

* add mean skip rule in op_library

* add dot kernel, api and unittest (#6)

* remove old kernel and add symbol link

* fix dot compiled failed

* add macro for module declare

* fix npu and xpu compile error

* revert sign, mean, scale, dot kernel removing

* add comment for keeping old kernel impl

* fix mutable_data error

* fix bfloat16 conflict

* fix inference undef error

* adapt to msvc compile rules

* polish comment for template inst

* add cmake template instantiation for win

* fix backend to place device id bug

* fix ifdef error

* Op2functor (#7)

* add kernel args maker class

* make args maker non-const

* remove debug log

* modify codes by review options

* split constructPrKernelContext function

* fix output name bug

* fix test_mean_op test_sign_op failed

* fill_any_like kernel refactor (#10)

* fill_any_like kernel refactor

* remove useless code of full_like c++ api

* skip dtype for fill_any_like

* add attrs for kernel key construct

* add use_pt_kernel Flags to control whether to use pt kernel (#13)

* add use_pt_kernel Flags to control whether to use pt kernel

* change the default value to true for checking pt kernels

* fix mutable_data cuda place error

* move high level apis into hapi

* remove selectedrows adapting temporarily

* Support Scalar in Tensor Compute Library (#14)

* fill_any_like kernel refactor

* remove useless code of full_like c++ api

* Support Scalar in Tensor Compute Library

* add scalar in dygraph and static graph mode

* keep the basic type for attr, instead of using scalar for all

* merge the code

* remove mkldnn tensor & polish details

* use flat_hash_map and small_vector in kernel factory

* Refactor flatten kernel (#12)

* refactor flatten kernel

* update infershape function

* fix compile bugs

* fix bugs when merge

* fix compiler bugs

* fix bugs when run test_flatten_api

* fix bugs when run test

* Revert "use flat_hash_map and small_vector in kernel factory"

This reverts commit 23091495cfdd3df8cc1be592d30f09ea66a7c72b.

* Move cpu, cuda and other device code into kernels (#15)

* fill_any_like kernel refactor

* remove useless code of full_like c++ api

* Support Scalar in Tensor Compute Library

* add scalar in dygraph and static graph mode

* keep the basic type for attr, instead of using scalar for all

* merge the code

* start refactor matmul

* move cpu, cuda and other device modules into kernels

* merge code

* polish code in operator.cc

* Perfect unitests (#16)

* perfect unittest

* update license

* replace with flat_hash_map, small_vector (#19)

* fix small_vector build error on windows platform

* replace with flat_hash_map, small_vector

* remove todo

* Perfect unitests (#20)

* perfect unittest

* update license

* fix bug when run tcmpt_utils_test

* refactor execution adapting impl

* fix insert conflict

* Fix CI bug of test_yolov3 (#21)

* fill_any_like kernel refactor

* remove useless code of full_like c++ api

* Support Scalar in Tensor Compute Library

* add scalar in dygraph and static graph mode

* keep the basic type for attr, instead of using scalar for all

* merge the code

* start refactor matmul

* move cpu, cuda and other device modules into kernels

* merge code

* polish code in operator.cc

* Fix CI bug of test_yolov3

* add the tensor base class, test=develop (#17)

* update the tensor base class, test=develop

* remove two funcs, test=develop

* update the error msg, test=develop
Co-authored-by: Chen Weihang <chenweihang@baidu.com>

* [no-verify] commit backend and tensor signature changes

* Rename tcmpt to pten (#23)

* rename tcmpt to pten

* update omitted files for rename to pten

* update omitted file for rename to pten

* remove k of all enum var

* remove kernel_instantiate (#26)

* remove symbols and spatial_tensor

* change common to functions

* readd share tensor impl methods

* add a candidate dense tensor class, test=develop (#28)

* change all Pt to Pten

* resolve conflict with xiaowei

* Op2functor opt1 (#27)

* replace to small vector and change to const &

* add std::move
Co-authored-by: Chen Weihang <chenweihang@baidu.com>

* polish kernel factory and kernel registry

* fix operator test error msg mismatch

* remove tensor signature and backend set member

* move scalar and polish enforce

* revert dtype layout change to fix error

* fix enum operator override error

* Add Intermediate API layer

* add several base unittests

* add pten utils tests

* polish some details

* Dev/op2func refactor 3 (#30)

* add a candidate dense tensor class, test=develop

* remove TensorBase::backend(), test=develop

* remove some ops, test=develop

* cherry-pick the pr of tensor meta, test=develop

* moves the dense tensor and some ops, test=develop

* update the linalg operator, test=develop

* update other operators, test=develop

* fix errors, test=develop

* fix bugs, test=develop

* try to resolve the problem of windows ci, test=develop

* updates codes, test=develop

* fix the tensor_utils.cc, test=develop

* modify the dense tensor, test=develop

* fix the data type, test=develop
Co-authored-by: shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>

* intermediate api adapt to new dense tensor

* add some TODO and delete include header
Co-authored-by: Chen Weihang <chenweihang@baidu.com>
Co-authored-by: chentianyu03 <ctychentianyu@gmail.com>
Co-authored-by: zyfncg <1370305206@qq.com>
Co-authored-by: 石晓伟 <39303645+Shixiaowei02@users.noreply.github.com>

4a7f1a0d

fix some bug, test=develop (#36888) · b0941102
Committed by wanghuancoder on Nov 02, 2021

b0941102

Add matmul_v2 kernel in pten (#36844) · e11ecfce

Committed by zyfncg on Nov 02, 2021

* initial tensor design & sign kernel demo

* add move constructor for meta & add lodtensor

* add dirs & sign xpu kernel

* add mean cpu&cuda kernel impl

* move sign & mean xpu & npu kernel

* add selected_rows basic impl

* refactor design, BaseTensor to DenseTensor, etc.

* add scale mkldnn kernel

* polish xpu & npu impl details

* fix mkldnn reuse compile failed

* change tensor operation lib name

* rename util filename

* add more comments

* change TensorImplInterface to TensorInterface

* add kernel key and factory

* remove MKLDNNTensorMeta, add MKLDNNDenseTensor

* change XXDeviceContext to XXContext

* add base kernel registrar utils & test on sign

* replace boost::any by paddle::any

* fix several ci failed

* fix npu compile error

* add ordered map util

* fix multiple ordered_map compile errors

* move dev into include dir

* support sign op in static op run

* fix static op run error

* fix new executor compile failed

* add dygraph branch & remove sign_op.h

* fix test_infer_no_need_buffer_slots

* fix rocm compile link error

* fix unitybuild error & clear glog

* fix npu compile failed

* skip quant trans test

* fix part windows compile problem

* fix xpu enforce error

* fix inference test failed

* remove ordered_map to solve quant failed

* fix part of rocm compile failed

* add more register kernels

* revert scale kernel temporarily

* fix code format error

* add new kernel registrar macro

* rename top to tcmpt

* revert xpu, npu, mkldnn impl & remove op def

* add kernel args parse functor to auto parse args

* revert some change & add scale kernels

* add op proto in dygraph kernelcontext building

* polish kernel dispatch logic & naming rule

* fix scale kernel match error

* fix scale test failed

* add mean API and unittest

* test mean api success

* add branch to solve compiled error

* skip clang format error

* add mean skip rule in op_library

* add dot kernel, api and unittest (#6)

* remove old kernel and add symbol link

* fix dot compiled failed

* add macro for module declare

* fix npu and xpu compile error

* revert sign, mean, scale, dot kernel removing

* add comment for keeping old kernel impl

* fix mutable_data error

* fix bfloat16 conflict

* fix inference undef error

* adapt to msvc compile rules

* polish comment for template inst

* add cmake template instantiation for win

* fix backend to place device id bug

* fix ifdef error

* Op2functor (#7)

* add kernel args maker class

* make args maker non-const

* remove debug log

* modify codes by review options

* split constructPrKernelContext function

* fix output name bug

* fix test_mean_op test_sign_op failed

* fill_any_like kernel refactor (#10)

* fill_any_like kernel refactor

* remove useless code of full_like c++ api

* skip dtype for fill_any_like

* add attrs for kernel key construct

* add use_pt_kernel Flags to control whether to use pt kernel (#13)

* add use_pt_kernel Flags to control whether to use pt kernel

* change the default value to true for checking pt kernels

* fix mutable_data cuda place error

* move high level apis into hapi

* remove selectedrows adapting temporarily

* Support Scalar in Tensor Compute Library (#14)

* fill_any_like kernel refactor

* remove useless code of full_like c++ api

* Support Scalar in Tensor Compute Library

* add scalar in dygraph and static graph mode

* keep the basic type for attr, instead of using scalar for all

* merge the code

* remove mkldnn tensor & polish details

* use flat_hash_map and small_vector in kernel factory

* Refactor flatten kernel (#12)

* refactor flatten kernel

* update infershape function

* fix compile bugs

* fix bugs when merge

* fix compiler bugs

* fix bugs when run test_flatten_api

* fix bugs when run test

* Revert "use flat_hash_map and small_vector in kernel factory"

This reverts commit 23091495cfdd3df8cc1be592d30f09ea66a7c72b.

* Move cpu, cuda and other device code into kernels (#15)

* fill_any_like kernel refactor

* remove useless code of full_like c++ api

* Support Scalar in Tensor Compute Library

* add scalar in dygraph and static graph mode

* keep the basic type for attr, instead of using scalar for all

* merge the code

* start refactor matmul

* move cpu, cuda and other device modules into kernels

* merge code

* polish code in operator.cc

* Perfect unitests (#16)

* perfect unittest

* update license

* replace with flat_hash_map, small_vector (#19)

* fix small_vector build error on windows platform

* replace with flat_hash_map, small_vector

* remove todo

* Perfect unitests (#20)

* perfect unittest

* update license

* fix bug when run tcmpt_utils_test

* refactor execution adapting impl

* fix insert conflict

* Fix CI bug of test_yolov3 (#21)

* fill_any_like kernel refactor

* remove useless code of full_like c++ api

* Support Scalar in Tensor Compute Library

* add scalar in dygraph and static graph mode

* keep the basic type for attr, instead of using scalar for all

* merge the code

* start refactor matmul

* move cpu, cuda and other device modules into kernels

* merge code

* polish code in operator.cc

* Fix CI bug of test_yolov3

* add the tensor base class, test=develop (#17)

* update the tensor base class, test=develop

* remove two funcs, test=develop

* update the error msg, test=develop
Co-authored-by: Chen Weihang <chenweihang@baidu.com>

* [no-verify] commit backend and tensor signature changes

* Rename tcmpt to pten (#23)

* rename tcmpt to pten

* update omitted files for rename to pten

* update omitted file for rename to pten

* remove k of all enum var

* remove kernel_instantiate (#26)

* remove symbols and spatial_tensor

* change common to functions

* readd share tensor impl methods

* add a candidate dense tensor class, test=develop (#28)

* change all Pt to Pten

* resolve conflict with xiaowei

* Op2functor opt1 (#27)

* replace to small vector and change to const &

* add std::move
Co-authored-by: Chen Weihang <chenweihang@baidu.com>

* polish kernel factory and kernel registry

* fix operator test error msg mismatch

* remove tensor signature and backend set member

* move scalar and polish enforce

* revert dtype layout change to fix error

* fix enum operator override error

* add several base unittests

* add pten utils tests

* polish some details

* Dev/op2func refactor 3 (#30)

* add a candidate dense tensor class, test=develop

* remove TensorBase::backend(), test=develop

* remove some ops, test=develop

* cherry-pick the pr of tensor meta, test=develop

* moves the dense tensor and some ops, test=develop

* update the linalg operator, test=develop

* update other operators, test=develop

* fix errors, test=develop

* fix bugs, test=develop

* try to resolve the problem of windows ci, test=develop

* updates codes, test=develop

* fix the tensor_utils.cc, test=develop

* modify the dense tensor, test=develop

* fix the data type, test=develop
Co-authored-by: shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>

* polish some details

* polish kernel signature details

* fix a bug about offsets of the tensor, test=develop (#31)
Co-authored-by: shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>

* add matmul kernel in pten

* add unittest for new matmul_v2 kernel

* fix bug of CI compile

* fix bug of CI compile

* merge conflict

* remove useless file
Co-authored-by: Chen Weihang <chenweihang@baidu.com>
Co-authored-by: chentianyu03 <ctychentianyu@gmail.com>
Co-authored-by: YuanRisheng <yuanrisheng@baidu.com>
Co-authored-by: 石晓伟 <39303645+Shixiaowei02@users.noreply.github.com>

e11ecfce

Nov 01, 2021 (5 commits)

[new-exec] refine vlog of interpretercore (#36865) · 4c93c4c3
Committed by Leo Chen on Nov 01, 2021
```
* refine vlog of interpretercore

* fix ut
```
4c93c4c3

Paddle Tensor Operation Library initial implementation (#34425) · b9fdd3bc

Committed by Chen Weihang on Nov 01, 2021

* initial tensor design & sign kernel demo

* add move constructor for meta & add lodtensor

* add dirs & sign xpu kernel

* add mean cpu&cuda kernel impl

* move sign & mean xpu & npu kernel

* add selected_rows basic impl

* refactor design, BaseTensor to DenseTensor, etc.

* add scale mkldnn kernel

* polish xpu & npu impl details

* fix mkldnn reuse compile failed

* change tensor operation lib name

* rename util filename

* add more comments

* change TensorImplInterface to TensorInterface

* add kernel key and factory

* remove MKLDNNTensorMeta, add MKLDNNDenseTensor

* change XXDeviceContext to XXContext

* add base kernel registrar utils & test on sign

* replace boost::any by paddle::any

* fix several ci failed

* fix npu compile error

* add ordered map util

* fix multiple ordered_map compile errors

* move dev into include dir

* support sign op in static op run

* fix static op run error

* fix new executor compile failed

* add dygraph branch & remove sign_op.h

* fix test_infer_no_need_buffer_slots

* fix rocm compile link error

* fix unitybuild error & clear glog

* fix npu compile failed

* skip quant trans test

* fix part windows compile problem

* fix xpu enforce error

* fix inference test failed

* remove ordered_map to solve quant failed

* fix part of rocm compile failed

* add more register kernels

* revert scale kernel temporarily

* fix code format error

* add new kernel registrar macro

* rename top to tcmpt

* revert xpu, npu, mkldnn impl & remove op def

* add kernel args parse functor to auto parse args

* revert some change & add scale kernels

* add op proto in dygraph kernelcontext building

* polish kernel dispatch logic & naming rule

* fix scale kernel match error

* fix scale test failed

* add mean API and unittest

* test mean api success

* add branch to solve compiled error

* skip clang format error

* add mean skip rule in op_library

* add dot kernel, api and unittest (#6)

* remove old kernel and add symbol link

* fix dot compiled failed

* add macro for module declare

* fix npu and xpu compile error

* revert sign, mean, scale, dot kernel removing

* add comment for keeping old kernel impl

* fix mutable_data error

* fix bfloat16 conflict

* fix inference undef error

* adapt to msvc compile rules

* polish comment for template inst

* add cmake template instantiation for win

* fix backend to place device id bug

* fix ifdef error

* Op2functor (#7)

* add kernel args maker class

* make args maker non-const

* remove debug log

* modify codes by review options

* split constructPrKernelContext function

* fix output name bug

* fix test_mean_op test_sign_op failed

* fill_any_like kernel refactor (#10)

* fill_any_like kernel refactor

* remove useless code of full_like c++ api

* skip dtype for fill_any_like

* add attrs for kernel key construct

* add use_pt_kernel Flags to control whether to use pt kernel (#13)

* add use_pt_kernel Flags to control whether to use pt kernel

* change the default value to true for checking pt kernels

* fix mutable_data cuda place error

* move high level apis into hapi

* remove selectedrows adapting temporarily

* Support Scalar in Tensor Compute Library (#14)

* fill_any_like kernel refactor

* remove useless code of full_like c++ api

* Support Scalar in Tensor Compute Library

* add scalar in dygraph and static graph mode

* keep the basic type for attr, instead of using scalar for all

* merge the code

* remove mkldnn tensor & polish details

* use flat_hash_map and small_vector in kernel factory

* Refactor flatten kernel (#12)

* refactor flatten kernel

* update infershape function

* fix compile bugs

* fix bugs when merge

* fix compiler bugs

* fix bugs when run test_flatten_api

* fix bugs when run test

* Revert "use flat_hash_map and small_vector in kernel factory"

This reverts commit 23091495cfdd3df8cc1be592d30f09ea66a7c72b.

* Move cpu, cuda and other device code into kernels (#15)

* fill_any_like kernel refactor

* remove useless code of full_like c++ api

* Support Scalar in Tensor Compute Library

* add scalar in dygraph and static graph mode

* keep the basic type for attr, instead of using scalar for all

* merge the code

* start refactor matmul

* move cpu, cuda and other device modules into kernels

* merge code

* polish code in operator.cc

* Perfect unitests (#16)

* perfect unittest

* update license

* replace with flat_hash_map, small_vector (#19)

* fix small_vector build error on windows platform

* replace with flat_hash_map, small_vector

* remove todo

* Perfect unitests (#20)

* perfect unittest

* update license

* fix bug when run tcmpt_utils_test

* refactor execution adapting impl

* fix insert conflict

* Fix CI bug of test_yolov3 (#21)

* fill_any_like kernel refactor

* remove useless code of full_like c++ api

* Support Scalar in Tensor Compute Library

* add scalar in dygraph and static graph mode

* keep the basic type for attr, instead of using scalar for all

* merge the code

* start refactor matmul

* move cpu, cuda and other device modules into kernels

* merge code

* polish code in operator.cc

* Fix CI bug of test_yolov3

* add the tensor base class, test=develop (#17)

* update the tensor base class, test=develop

* remove two funcs, test=develop

* update the error msg, test=develop
Co-authored-by: Chen Weihang <chenweihang@baidu.com>

* [no-verify] commit backend and tensor signature changes

* Rename tcmpt to pten (#23)

* rename tcmpt to pten

* update omitted files for rename to pten

* update omitted file for rename to pten

* remove k of all enum var

* remove kernel_instantiate (#26)

* remove symbols and spatial_tensor

* change common to functions

* readd share tensor impl methods

* add a candidate dense tensor class, test=develop (#28)

* change all Pt to Pten

* resolve conflict with xiaowei

* Op2functor opt1 (#27)

* replace to small vector and change to const &

* add std::move
Co-authored-by: Chen Weihang <chenweihang@baidu.com>

* polish kernel factory and kernel registry

* fix operator test error msg mismatch

* remove tensor signature and backend set member

* move scalar and polish enforce

* revert dtype layout change to fix error

* fix enum operator override error

* add several base unittests

* add pten utils tests

* polish some details

* Dev/op2func refactor 3 (#30)

* add a candidate dense tensor class, test=develop

* remove TensorBase::backend(), test=develop

* remove some ops, test=develop

* cherry-pick the pr of tensor meta, test=develop

* moves the dense tensor and some ops, test=develop

* update the linalg operator, test=develop

* update other operators, test=develop

* fix errors, test=develop

* fix bugs, test=develop

* try to resolve the problem of windows ci, test=develop

* updates codes, test=develop

* fix the tensor_utils.cc, test=develop

* modify the dense tensor, test=develop

* fix the data type, test=develop
Co-authored-by: shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>

* polish some details

* polish kernel signature details

* fix a bug about offsets of the tensor, test=develop (#31)
Co-authored-by: shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>

* polish some details
Co-authored-by: chentianyu03 <ctychentianyu@gmail.com>
Co-authored-by: zyfncg <1370305206@qq.com>
Co-authored-by: YuanRisheng <yuanrisheng@baidu.com>
Co-authored-by: 石晓伟 <39303645+Shixiaowei02@users.noreply.github.com>

b9fdd3bc

add debug information for build_cinn_pass and graph symbolization (#36867) · 813e7526
Committed by jiangcheng on Nov 01, 2021

813e7526

memory sparse table & brpc communication upgrade dependency (#36734) · 29c6bcbf
Committed by zhaocaibei123 on Nov 01, 2021

29c6bcbf

add cinn_launch_op for using CINN to optimize graph (#36600) · 0a963ee9

Committed by CtfGo on Nov 01, 2021

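Adds CinnLaunchOp, which executes the result of compiling a CINN subgraph. Key points:
1. In BuildCinnPass, which partitions the graph into subgraphs, each subgraph is replaced in the original graph by a CinnLaunchOp, which calls CINN to compile and execute the subgraph.
2. The inputs and outputs of CinnLaunchOp are the inputs and outputs of the subgraph. It also has a `compilation_key` attribute, used as the key to fetch the subgraph object and its compilation result from the global cache. BuildCinnPass sets this attribute when it creates the op.
3. CinnLaunchOp works as follows:
        - Fetch the subgraph object from the global cache.
        - Fetch the subgraph's compilation result from the global cache, compiling just in time on a cache miss.
        - Initialize the runtime data and allocate host/device memory according to the variable information (data type, shape) in the compilation result.
        - Pack the runtime data into arguments and call CINN's executable object, the runtime program, to perform the computation.
        - Sync the subgraph's results to the Paddle-side tensors through the argument pointers.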

0a963ee9
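
The flow described in this commit message (look up by key, compile on a cache miss, allocate outputs from the compiled variable info, run, copy results back) can be sketched in plain Python. Everything here is a stand-in chosen for illustration: `CompilationCache`, `toy_compile`, and `cinn_launch` are hypothetical names, the "graph" is a small dict, and no real CINN or Paddle API is used.

```python
class CompilationCache:
    """Hypothetical global cache holding subgraphs and their compiled results."""

    def __init__(self, compile_fn):
        self._graphs = {}
        self._compiled = {}
        self._compile_fn = compile_fn
        self.compile_count = 0

    def add_graph(self, key, graph):
        self._graphs[key] = graph

    def compile(self, key):
        # Compile just in time on a cache miss, then reuse the result.
        if key not in self._compiled:
            self._compiled[key] = self._compile_fn(self._graphs[key])
            self.compile_count += 1
        return self._compiled[key]


def toy_compile(graph):
    """Stand-in for subgraph compilation: returns output size info and a runnable."""

    def run(args):
        x, out = args["x"], args["out"]
        for i, value in enumerate(x):
            out[i] = max(value + graph["bias"], 0.0)  # add followed by relu

    return {"out_size": graph["size"], "run": run}


def cinn_launch(cache, compilation_key, x):
    compiled = cache.compile(compilation_key)   # cache hit, or compile on miss
    out = [0.0] * compiled["out_size"]          # allocate from compiled var info
    compiled["run"]({"x": x, "out": out})       # pack arguments and execute
    return out                                  # results handed back to the caller


cache = CompilationCache(toy_compile)
cache.add_graph("subgraph_0", {"bias": 1.0, "size": 3})
print(cinn_launch(cache, "subgraph_0", [-2.0, 0.0, 1.5]))
```

Keying the cache by a string attribute keeps the op itself stateless: the pass that creates the op only has to record the key.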

Oct 29, 2021 (2 commits)

fix some bug in new executor (#36822) · b5af9575
Committed by wanghuancoder on Oct 29, 2021
```
* fix some bug in new executor, test=develop

* fix error message, test=develop
```
b5af9575

[new-exec] enable check_nan_inf (#36802) · be55bac3

Committed by Leo Chen on Oct 29, 2021

* enable check_nan_inf and fix variable scope

* add ut

* fix bug

* update ut

* revert doc change

* fix npu compile

be55bac3

Oct 28, 2021 (4 commits)

Fix several bugs for enabling Paddle to train with CINN. (#36739) · c93331c5

Committed by Zhen Wang on Oct 28, 2021

* Update the content of `test_parallel_executor_run_cinn.py`.

* Fix some bugs in the topological sort and `CreateNewSubGraph`.

* Update the CINN commit id used by Paddle.

* Update the unit test to `add+relu`.

* Update according to reviewers' suggestion.

c93331c5

support inference for quantized matmul_v2 (#36594) · b151a451
Committed by XGZhang on Oct 28, 2021
```
* support inference for quantized matmul_v2

* update code style

* code style
```
b151a451

Fix cancel (#36740) · 704e454f

Committed by liutiexing on Oct 28, 2021

* add align for WorkQueue

* add spinlock

* merge develop

* merge

* Add EventsWaiter

* update

* update

* update Error MSG

* update EventsWaiter

* Add Cancel For ThreadPool

* Add UT for Cancel

* fix Cancel

704e454f

Modify Struct into Class to improve encapsulation and Polish code exception (#36797) · 9516108a
Committed by Aurelius84 on Oct 28, 2021
```
* Refactor InterpreterCore code

* make tuple
```
9516108a

Oct 27, 2021 (3 commits)
- [ROCM] add custom op support, test=develop (#36771) · dd1d3789
  Committed by Qi Li on Oct 27, 2021
```
* [ROCM] add custom op support, test=develop

* remove debug codes, test=develop
```
  dd1d3789
- GeneratePass support attr condition and mapping (#36747) · 5c569aef
  Committed by wuhuanzhou on Oct 27, 2021
```
* GeneratePass support attr condition and mapping, test=develop

* fix coverage, test=develop
```
  5c569aef
- enable trt test check and fix trt ut error (3/3) (#36581) · 8c1c72af
  Committed by Wilber on Oct 27, 2021
  
  8c1c72af
Oct 26, 2021 (5 commits)

[new-exec] cache exception in child thread (#36692) · 87fbbd36
Committed by Leo Chen on Oct 26, 2021
```
* cache exception in child thread

* add ut

* fix ut
```
87fbbd36
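
The commit message only names the technique, so the following is a generic Python sketch of the pattern rather than the new executor's code: a worker thread stores the first exception it hits, and the parent thread re-raises it after joining. `ExceptionHolder` and `run_in_child` are hypothetical names introduced for this example.

```python
import threading


class ExceptionHolder:
    """Stores the first exception raised in a worker so the parent can re-raise it."""

    def __init__(self):
        self._lock = threading.Lock()
        self._exc = None

    def catch(self, exc):
        with self._lock:
            if self._exc is None:  # keep only the first failure
                self._exc = exc

    def rethrow_if_any(self):
        with self._lock:
            exc, self._exc = self._exc, None
        if exc is not None:
            raise exc


def run_in_child(task, holder):
    """Run task in a child thread; surface its exception in the calling thread."""

    def wrapper():
        try:
            task()
        except Exception as exc:  # an uncaught error here would otherwise be lost
            holder.catch(exc)

    worker = threading.Thread(target=wrapper)
    worker.start()
    worker.join()
    holder.rethrow_if_any()


run_in_child(lambda: None, ExceptionHolder())
```

Without such a holder, an exception in a worker thread ends that thread but never reaches the code that is waiting on it.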

[new-exec] Add cancel for thread pool (#36688) · fe6dbdd3

Committed by liutiexing on Oct 26, 2021

* add align for WorkQueue

* add spinlock

* merge develop

* merge

* Add EventsWaiter

* update

* update

* update Error MSG

* update EventsWaiter

* Add Cancel For ThreadPool

* Add UT for Cancel

fe6dbdd3

Fix the null ptr bug in build_cinn_pass. (#36698) · 28bab073
Committed by Zhen Wang on Oct 26, 2021
```
* Fix the null ptr bug in build_cinn_pass.

* Add test for empty&ctrl var.
```
28bab073

[Paddle-Inference]Add MatmulV2ToMatmul convert Pass, fix (matmul_v2, matmul,... · 93c591e2

Committed by Wangzheee on Oct 26, 2021

[Paddle-Inference]Add MatmulV2ToMatmul convert Pass, fix (matmul_v2, matmul, mul) convert pass, fix (matmul, mul) op_teller (#36652)

* new_Matmul2ToMatmulToMul

* new_Matmul2ToMatmulToMul

* fix paddle_pass_builder

* fix paddle_pass_builder

* fix paddle_pass_builder

* tem

* tem

* Add MatmulV2ToMatmul convert Pass; MatmulV2ToMul convert Pass

* Add MatmulV2ToMatmul convert Pass; MatmulV2ToMul convert Pass

* add matmul_broadcast_unitest

* fix op_teller

93c591e2

Support various length support for SelectedRows in GLOO::AllGather (#36637) · eca78a9f

Committed by xiongkun on Oct 26, 2021

* In cpu parallel using gloo, add various length support for SelectedRows

* fix bug

* fix bugs

* fix by code review

* remove timeout

eca78a9f

Oct 25, 2021 (2 commits)

Create CinnCompiler class for compiling subgraphs found by build_cinn_pass. (#36562) · 4c460378

Committed by Zhen Wang on Oct 25, 2021

* Init the functions of CinnCompiler.

* Add the unit test for CinnCompiler.

* Fix some compilation errors.

* Update the UT of cinn_compiler.

* Use Decomposer&OpFusion passes in CinnCompiler::CompileGraph.

* Update some comments.

* Uncomment some includes in build_cinn_pass.cc.

* Use refs instead of ptrs as returned types of FindGraph & Compile in CinnCompiler.

* Use the merged CinnGraphSymbolization functions in CinnCompiler.

4c460378

[new-exec] Add events waiter (#36480) · cdb9bfa3

Committed by liutiexing on Oct 25, 2021

* add align for WorkQueue

* add spinlock

* merge develop

* merge

* Add EventsWaiter

* update

* update

* update Error MSG

* update EventsWaiter

cdb9bfa3

Oct 24, 2021 (1 commit)
- Add the macro `-DPADDLE_WITH_CINN`. (#36660) · e2173b68
  Committed by Zhen Wang on Oct 24, 2021
  
  e2173b68
Oct 23, 2021 (3 commits)

add cinn graph symbolization (#36417) · bbd4bd73

Committed by jiangcheng on Oct 23, 2021

* add cinn graph symbolization

* fix some bug

* add paddle scope to cinn scope

* add paddle scope to CINN scope in Symbolization, and add feed op when build cinn pass

* fix some bug

* fix some bug by review advices

* optimize code problem

* revert build_cinn_pass and move the change to https://github.com/PaddlePaddle/Paddle/pull/36503

* fix some bug after co-compilation

* perfect single test script

* remove scope and rename feed_target to input_tensor

* using std::unordered_map instead of absl::flat_hash_map

* fix single test bug

* revert to previous version, since WITH_CINN has been added in a later PR

* full error information for CI

* full enforce information for CI pass

bbd4bd73

Add transformer of paddle desc and cinn desc (#36100) · 3cb6f65e

Committed by jiangcheng on Oct 23, 2021

* add transformer of paddle desc and cinn desc

* change LOG(FATAL) to PADDLE_THROW for ci

* full error information for ci

* fix some problem as review advice

* fix some bug

* move var type utils to transform_desc header file

* add if NOT WITH_CINN control whether compile

* build_strategy check whether open WITH_CINN

* add control WITH_CINN in cmake

3cb6f65e

New Paddle-CINN Compile PR (#36584) · ab732884

Committed by Huihuang Zheng on Oct 23, 2021

This PR added some changes to match the CINN change for compilation. It also tried to fix JiangCheng's Problem in PR: https://github.com/PaddlePaddle/Paddle/pull/36100

These changes include:
1. Set `CINN_GIT_TAG` to a newer tag
2. CINN now just `make cinnapi -j`
3. We have to add `-DPY_VERSION=${PY_VERSION} -DWITH_TESTING=ON` to CINN cmake args
4. For CINN's third party dependencies, we could just include headers without target_link_libraries
5. Moved `cinn.cmake` from `paddle/cmake` to `paddle/cmake/external` to match old style. External folder contains `lite`, which is the same level of `cinn`
6. CINN added `-DNAMESPACE=cinn_gflags` in `gflags.cmake` to have different gflag namespaces between CINN and Paddle. It solved re-define problem.
7. Change namespace of `::google::` in gflags to `::GFLAGS_NAMESPACE`

ab732884

Oct 22, 2021 (1 commit)

[hapi] support dygraph amp O2 (#36441) · 08248db0

Committed by Leo Chen on Oct 22, 2021

* [hapi] support dygraph amp O2

* fix problem of static pure fp16 in hapi

* fix bug

* fix format

* fix ut

* follow comments

* update ut

* update amp save/load

* fix ut

* refine code format

08248db0

Crayon鑫 / Paddle is in sync with its fork source