Commits · 5653c3a4880a100fda7e1aab2b02f237645bff20 · BaiXuePrincess / Paddle

18 Feb, 2021 1 commit

[CustomOp] Check Compiler ABI compatibility (#30869) · 5653c3a4

Committed by Aurelius84 on Feb 18, 2021

* support setup.py to compile custom op

* move file into paddle.utils.cpp_extension

* support python setup.py install

* refine code style

* Enrich code and add unittest

5653c3a4

10 Feb, 2021 2 commits

delay timeout of unnittest 'test_static_save_load'. (#30975) · 8ab29f4b
Committed by WeiXin on Feb 10, 2021

8ab29f4b

New custom operator extension mechanism (#30690) · f649442d

Committed by Chen Weihang on Feb 09, 2021

* initial commit: simple demo

* polish copyright format

* add grap op simple demo

* adapt uncertain number of argument

* change trait marco name

* add place & dtype support for add kernel

* add dispath and infershape func

* poish code & add notes

* add dynamic_loader dep for paddle_framework

* add new custom op test dir

* polish impl details

* add unittest for new custom op

* fix failed unittest

* Costum op (#1)

* fix compile error

* wrap framework tensor with LoDTensor

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* add CustomTensor default constructor

* add size() for CustomTensor

* make size const for CustomTensor

* refactor place related api to circle the concept

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* make place const

* make Tensor copy

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* remove additional head of framework

* use back to shared ptr for custom tensor

* use back to shared ptr for custom tensor

* use back to shared ptr for custom tensor

* use back to shared ptr for custom tensor

* use back to shared ptr for custom tensor

* use back to shared ptr for custom tensor

* add gpu test

* merge latest cwh code in

* adjust ut code of custom op

* adjust ut code of custom op

* adjust ut code of custom op

* Remove ShareData from user && Change CustomTensor to Tensor && Support more data type (#2)

* fix compile error

* wrap framework tensor with LoDTensor

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* add CustomTensor default constructor

* add size() for CustomTensor

* make size const for CustomTensor

* refactor place related api to circle the concept

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* make place const

* make Tensor copy

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* remove additional head of framework

* use back to shared ptr for custom tensor

* use back to shared ptr for custom tensor

* use back to shared ptr for custom tensor

* use back to shared ptr for custom tensor

* use back to shared ptr for custom tensor

* use back to shared ptr for custom tensor

* add gpu test

* merge latest cwh code in

* adjust ut code of custom op

* adjust ut code of custom op

* adjust ut code of custom op

* adjust ut code of custom op

* adjust ut code of custom op

* hid share data from and to

* rename CustomTensor to Tensor

* refactor register design & add test

* change op_funtion to op_meta_info

* split op meta info into .h and .cc

* move get methods into friend class

* move OpMetaInfoHelper into framework space

* move CustomTensorUtils into framework space

* change pybind api name

* move PD C API into op meta info

* add register custom op api

* remove inference cmake change

* refactor copy to api && change Reshape to lowercase && support more dtype && add more test (#3)

* fix compile error

* wrap framework tensor with LoDTensor

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* add CustomTensor default constructor

* add size() for CustomTensor

* make size const for CustomTensor

* refactor place related api to circle the concept

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* make place const

* make Tensor copy

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* remove additional head of framework

* use back to shared ptr for custom tensor

* use back to shared ptr for custom tensor

* use back to shared ptr for custom tensor

* use back to shared ptr for custom tensor

* use back to shared ptr for custom tensor

* use back to shared ptr for custom tensor

* add gpu test

* merge latest cwh code in

* adjust ut code of custom op

* adjust ut code of custom op

* adjust ut code of custom op

* adjust ut code of custom op

* adjust ut code of custom op

* hid share data from and to

* rename CustomTensor to Tensor

* support multi dtype

* remove lod, make reshape lowercase, add copy test and refactor copy api

* remove lod, make reshape lowercase, add copy test and refactor copy api

* remove lod, make reshape lowercase, add copy test and refactor copy api

* remove lod, make reshape lowercase, add copy test and refactor copy api

* fix copy to error

* add more test

* add more test

* add more test

* add more test

* add more test

* add more test

* add more test

* add more test

* add more test

* add more test

* add more test

* add more test

* add more test

* add more test

* add more test

* add more test

* polish detail & error message

* polish test details

* Add cast api && Change copy related api to copy_to && add more test (#4)

* fix compile error

* wrap framework tensor with LoDTensor

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* add CustomTensor default constructor

* add size() for CustomTensor

* make size const for CustomTensor

* refactor place related api to circle the concept

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* make place const

* make Tensor copy

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* remove additional head of framework

* use back to shared ptr for custom tensor

* use back to shared ptr for custom tensor

* use back to shared ptr for custom tensor

* use back to shared ptr for custom tensor

* use back to shared ptr for custom tensor

* use back to shared ptr for custom tensor

* add gpu test

* merge latest cwh code in

* adjust ut code of custom op

* adjust ut code of custom op

* adjust ut code of custom op

* adjust ut code of custom op

* adjust ut code of custom op

* hid share data from and to

* rename CustomTensor to Tensor

* support multi dtype

* remove lod, make reshape lowercase, add copy test and refactor copy api

* remove lod, make reshape lowercase, add copy test and refactor copy api

* remove lod, make reshape lowercase, add copy test and refactor copy api

* remove lod, make reshape lowercase, add copy test and refactor copy api

* fix copy to error

* add more test

* add more test

* add more test

* add more test

* add more test

* add more test

* add more test

* add more test

* add more test

* add more test

* add more test

* add more test

* add more test

* add more test

* add more test

* add more test

* add type cast

* add cast and make copy to api

* add cast and make copy to api

* add cast and make copy to api

* add cast and make copy to api

* merge cwh code

* merge cwh code

* merge cwh code

* merge cwh code

* merge cwh code

* add more error log

* add more error log

* polish code

* used for test

* remove test comment

* remove test comment

* fix uint8 type error

* fix lost uint8 type error

* add test for coverage

* polish details by reviewer comments

* add prefix for DISABLE_COPY_AND_ASSIGN
Co-authored-by: Jiabin Yang <360788950@qq.com>

f649442d

09 Feb, 2021 1 commit
- try to fix reader and signal test failed (#30960) · 010f2caa
  Committed by Chen Weihang on Feb 08, 2021
  
  010f2caa
08 Feb, 2021 2 commits
- [Static setitem] Support index is ellipsis for setitem in static mode (#30836) · 12c15beb
  Committed by liym27 on Feb 08, 2021
  
  12c15beb
- [kunlun]fix sync in multi kunlun xpu dygraph training. (#30943) · 87197f8c
  Committed by liuyuhui on Feb 08, 2021
  
  87197f8c
07 Feb, 2021 1 commit
- fix a bug of Sequential::__getitem__ (#30899) · 823f499a
  Committed by wanghuancoder on Feb 07, 2021
```
* fix a bug of Sequential::__getitem__, test=develop

* add testcase, test=develop
```
  823f499a
06 Feb, 2021 1 commit
- [oneDNN] Added basic changes for elementwise_add_grad bf16 (#30925) · 9e527d99
  Committed by Jacek Czaja on Feb 06, 2021
  
  9e527d99
05 Feb, 2021 1 commit
- [Kunlun] add gen_bkcl_id_op, support multi XPU cards training using multiprocess (#30858) · 4a8b8b45
  Committed by liuyuhui on Feb 05, 2021
  
  4a8b8b45
04 Feb, 2021 1 commit
- [oneDNN]Extended adaptive pooling support for oneDNN pool kernel (#30757) · abfa8226
  Committed by Jacek Czaja on Feb 04, 2021
  
  abfa8226
03 Feb, 2021 9 commits
- add clip_by_norm on kunlun, *test=kunlun (#30862) · ac2e2e6b
  Committed by cucuzg on Feb 03, 2021
  
  ac2e2e6b
- fix the broadcast for the large second input (#30818) · b7560a59
  Committed by wawltor on Feb 03, 2021
```
fix the broadcast for the large second input 
```
  b7560a59
- Implement cuda kernel for index_sample. (#30380) · 6e1e036a
  Committed by JamesLim on Feb 03, 2021
  
  6e1e036a
- Call new cudnn batch norm API regardless of data type and data layout (#30157) · 666efc23
  Committed by AshburnLee on Feb 03, 2021
  
  666efc23
- support xpu with analysis predictor, test=develop (#30832) · 2ac4143b
  Committed by 石晓伟 on Feb 03, 2021
```
* support xpu inference with analysis predictor, test=develop

* merge the cmake of the xpu toolchain, test=develop

* add c-apis, test=develop

* fix a bug in extern_xpu, test=develop
```
  2ac4143b
- Update paddle.static.Print with paddle2.0 api (#30846) · 05d2b7a3
  Committed by joejiong on Feb 03, 2021
```
As the title
```
  05d2b7a3
- [CustomOp] Support install as Package and Add load interface (#30798) · e49d0746
  Committed by Aurelius84 on Feb 03, 2021
```
* support setup.py to compile custom op

* move file into paddle.utils.cpp_extension

* support python setup.py install

* refine code style

* Enrich code and add unittest

* Polish code and api doc

* fix cpp_extension not include in package

* fix relative import

* fix os.makedirs exist_ok param compatibility PY2

* add compile flags in test_jit_load
```
  e49d0746
- Layer normalization fuse pass. (#30721) · 4f066e31
  Committed by Adam Osewski on Feb 03, 2021
  
  4f066e31
- 【kunlun】dygraph supports multi xpu card training (#30671) · b1026f64
  Committed by WangXi on Feb 03, 2021
  
  b1026f64
02 Feb, 2021 1 commit
- fix trt plugin clone and initialize bugs in TRT7.1+ (#30709) · b9094509
  Committed by Shang Zhizhou on Feb 02, 2021
```
* fix trt plugin clone and initialize bugs

* fix unit test error

* enable trt in ci py3

* update unittest timeout
```
  b9094509
01 Feb, 2021 3 commits
- fix unittest random error (#30808) · 200ee33d
  Committed by Shang Zhizhou on Feb 01, 2021
  
  200ee33d
- Optimize the encoder of Transformer. (#30439) · db870872
  Committed by xiemoyuan on Feb 01, 2021
```
* Add cache for Transformer encoder.

* Bug fixed.

* add unittests for transformer encoder.
```
  db870872
- Fleet distributed strategy support pure fp16 (#30754) · 31ed9c9e
  Committed by WangXi on Feb 01, 2021
  
  31ed9c9e
29 Jan, 2021 1 commit
- 【CustomOp】support setup.py to compile custom op (#30753) · 2c974cc3
  Committed by Aurelius84 on Jan 29, 2021
  
  2c974cc3
28 Jan, 2021 2 commits
- A fix for oneDNN matmul kernel. Fixes issue #30309 (#30723) · fc002405
  Committed by Wojciech Uss on Jan 28, 2021
  
  fc002405
- Split unittest. (#30727) · 3491acfb
  Committed by WeiXin on Jan 28, 2021
  
  3491acfb
27 Jan, 2021 3 commits

upgrade gather_tree to core.ops (#30697) · fef3654b
Committed by liu zhengxi on Jan 27, 2021
```
* upgrade gather_tree to core.ops

* update gather_tree unittests
```
fef3654b

REUPLOAD Added vanilla LSTM and LSTM with peepholes oneDNN fp32 kernel (#30719) · f8da5536

Committed by jakpiase on Jan 27, 2021

* added external reorder to profiler

* resolved conflict

* added enable_static

* initial version of lstm, not working yet

* added lstm to operators.cmake

* added vanilla lstm mkldnn op

* added peephole weights integration

* minor changes

* added formatting

* added fusion_lstm_mkldnn to static_whitelist

* added formatting

* removed comment

* moved use_peepholes attribute inside is_cached block

* reverted wrong changes

* minor formatting change

* minor changes

* changed stream handling

* minor change

* added datatype to GetExpectedKernelType()

* added reading stream from TLS

f8da5536

[Dy2Stat] Fix error message when the message has more than one lines. (#30714) · 13ef444f
Committed by liym27 on Jan 27, 2021

13ef444f

26 Jan, 2021 3 commits

Revert "Added vanilla LSTM and LSTM with peepholes oneDNN fp32 kernel (#30661)" (#30708) · 824a79d3
Committed by Tao Luo on Jan 26, 2021
```
This reverts commit d834f4e6.
```
824a79d3

Added vanilla LSTM and LSTM with peepholes oneDNN fp32 kernel (#30661) · d834f4e6

Committed by jakpiase on Jan 26, 2021

* added external reorder to profiler

* resolved conflict

* added enable_static

* initial version of lstm, not working yet

* added lstm to operators.cmake

* added vanilla lstm mkldnn op

* added peephole weights integration

* minor changes

* added formatting

* added fusion_lstm_mkldnn to static_whitelist

* added formatting

* removed comment

* moved use_peepholes attribute inside is_cached block

* reverted wrong changes

* minor formatting change

* minor changes

d834f4e6

polish printing dtype (#30682) · 1a13626f
Committed by Leo Chen on Jan 26, 2021
```
* polish printing dtype

* fix special case
```
1a13626f

25 Jan, 2021 2 commits
- fix test_gen_nccl_id_op failed (#30686) · a28a2026
  Committed by WangXi on Jan 25, 2021
  
  a28a2026
- fix abs bug and add abs test case (#30637) · fb7fbc7a
  Committed by chentianyu03 on Jan 25, 2021
```
* add abs test case

* use std::abs to fix abs bug

* fix the abs bug

* fix abs bug
```
  fb7fbc7a
22 Jan, 2021 1 commit
- Fix scatter grad bug (#30604) · 9514b4aa
  Committed by ShenLiang on Jan 22, 2021
  
  9514b4aa
21 Jan, 2021 2 commits
- [ROCM] update cmake and dockerfile, test=develop (#30598) · 1f5841c2
  Committed by Qi Li on Jan 21, 2021
  
  1f5841c2
- Fix the bug in fleet amp_init. (#30606) · 4a9de931
  Committed by Zhen Wang on Jan 21, 2021
```
* Fix the bug in fleet amp_init.

* Fix the amp_init unit test.
```
  4a9de931
20 Jan, 2021 3 commits

support reduce_max op on kunlun (#30581) · 10271ddf

Committed by TTerror on Jan 20, 2021

* support reduce_max op on kunlun

* support reduce_max op on kunlun

* support reduce_max op on kunlun

* support reduce_max op on kunlun

10271ddf

Extend the timeout of unit test 'test_static_save_load' (#30599) · ca338214
Committed by WeiXin on Jan 20, 2021
```
* delay the 'timeout' of 'test_static_save_load'.

* delay the 'timeout' of 'test_static_save_load'.
```
ca338214

make abs op support complex types (#30375) · 358106fc

Committed by chentianyu03 on Jan 20, 2021

* rewrite abs op

* rewrite abs op and remove abs in activation

* remove abs register in old codes

* fix abs_grad type error

* fix abs double_grad output name error

* modify abs_grad, abs_grad_grad functor for windows building

* format code style

* fix the bug of result is nan when the divisor is zero

* add missing abs attr and add abs for float16

358106fc
