Commits · b95eb38b8a797f3995162e7558e0bd2f0b22efd9 · PaddlePaddle / Paddle

22 Feb, 2021 · 1 commit
- fix the bug in backward OP of index_sample. (#31026) · b95eb38b
  Committed by JamesLim on Feb 22, 2021
20 Feb, 2021 · 8 commits

Remove PE special profiler (#30886) · 6b3371e0
Committed by Chengmo on Feb 20, 2021
```
* remove pe special profiler

* add profiler info
```

[CustomOp] Add more dispatch marco for users (#31058) · 6beeafe7

Committed by Chen Weihang on Feb 20, 2021

* add more dispatch marco

* add more dispatch marco

* add more tests

* revert unneeded change

* add timeout for test dispatch

* add float and complex test

* remove and marco


add squeeze_op/unsqueeze_op on kunlun;fix conv op and parallel... · d5323dab

Committed by TTerror on Feb 20, 2021

add squeeze_op/unsqueeze_op on kunlun;fix conv op and parallel executor;optimize lookup_table op (#31056)

* add squeeze_op/unsqueeze_op on kunlun; fix conv op and parallel executor on kunlun; optimize lookup_table op on kunlun

* update squeeze/unsqueeze op


test=develop, save/load, shrink (#30625) · 16b4260b
Committed by 123malin on Feb 20, 2021
```
* test=develop, save/load, shrink
Co-authored-by: seiriosPlus <tangwei12@baidu.com>
```

hide useless headers and add complex support (#31074) · 628451af
Committed by Jiabin Yang on Feb 20, 2021

update paddle_fluid.so to paddle_inference.so (#30850) · 463eae03
Committed by Wilber on Feb 20, 2021
```
* update paddle_fluid.so to paddle_inference.so
```

[static setitem] Support the index is Tensor; step>1; step<0 .(#30949) · 5b367dab

Committed by liym27 on Feb 20, 2021

* [static setitem] support the index step > 1. tensor_a[::3] = value

* [static setitem] support the index step < 0. Eg: tensor_a[::-3] = value

* [static setitem] support the index is Tensor. eg: tensor_a[tensor_3:0:-1] = value

* Add op version.


[ROCM] update fluid inference for rocm (part1), test=develop (#31018) · eb3050fa
Committed by Qi Li on Feb 20, 2021


19 Feb, 2021 · 9 commits
- Added reshape grad bf16 (#31035) · f7465641
  Committed by Jacek Czaja on Feb 19, 2021
```
* - added Reshape grad bf16

* - Added reshape grad bf16

* - cosmetics in py
```
- Modify relu native implementation 2 (#30996) · 615d8a22
  Committed by Wojciech Uss on Feb 18, 2021
```
* Modify relu native implementation

* fix GPU performance
```
- Remove scale loss before reduce in dygraph (#30807) · 9401173e
  Committed by ShenLiang on Feb 19, 2021
- fix python pass builder error. (#30946) · 0020d915
  Committed by Wilber on Feb 18, 2021
- fix jetson problem (#30939) · 39aeaa16
  Committed by Wilber on Feb 18, 2021
- update trt error message when input height or width is -1 (#31019) · 01ccfbcd
  Committed by Wilber on Feb 18, 2021
- resolve memory leak in cudnn8.0 (#31029) · cf8b8f9c
  Committed by Wilber on Feb 18, 2021
- add offset parameter in roi_align,generate_proposals.etc ops (#30864) · 5b267474
  Committed by Guanghua Yu on Feb 19, 2021
```
* add  parameter in roi_align op
```
- fix regex error & simplify marco name (#31031) · 75f81233
  Committed by Chen Weihang on Feb 18, 2021
18 Feb, 2021 · 3 commits

enable exhaustive_search for forward and backward algos when dtype is float16 (#30959) · f0ee1592
Committed by Zhang Ting on Feb 18, 2021
```
* enable exhaustive_search for input_grad when dtype is float16

* enable exhaustive_search for forward algos
```

add trt transpose and flatten converter (#31022) · 9b54fe41
Committed by Pei Yang on Feb 18, 2021


Add Conv Transpose BF16 (#30877) · caf9d398

Committed by joanna.wozna.intel on Feb 18, 2021

* Add conv transpose BF16

* Share function GetWeightsTz

* Adjust to review and fix op compatibility

* Add bias to unique handler name

* Remove errors related to paddle enforce

* Add conv2d_transpose to bf16 list and kernel refator


10 Feb, 2021 · 1 commit

New custom operator extension mechanism (#30690) · f649442d

Committed by Chen Weihang on Feb 09, 2021

* initial commit: simple demo

* polish copyright format

* add grap op simple demo

* adapt uncertain number of argument

* change trait marco name

* add place & dtype support for add kernel

* add dispath and infershape func

* poish code & add notes

* add dynamic_loader dep for paddle_framework

* add new custom op test dir

* polish impl details

* add unittest for new custom op

* fix failed unittest

* Costum op (#1)

* fix compile error

* wrap framework tensor with LoDTensor

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* add CustomTensor default constructor

* add size() for CustomTensor

* make size const for CustomTensor

* refactor place related api to circle the concept

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* make place const

* make Tensor copy

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* remove additional head of framework

* use back to shared ptr for custom tensor

* use back to shared ptr for custom tensor

* use back to shared ptr for custom tensor

* use back to shared ptr for custom tensor

* use back to shared ptr for custom tensor

* use back to shared ptr for custom tensor

* add gpu test

* merge latest cwh code in

* adjust ut code of custom op

* adjust ut code of custom op

* adjust ut code of custom op

* Remove ShareData from user && Change CustomTensor to Tensor && Support more data type (#2)

* fix compile error

* wrap framework tensor with LoDTensor

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* add CustomTensor default constructor

* add size() for CustomTensor

* make size const for CustomTensor

* refactor place related api to circle the concept

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* make place const

* make Tensor copy

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* remove additional head of framework

* use back to shared ptr for custom tensor

* use back to shared ptr for custom tensor

* use back to shared ptr for custom tensor

* use back to shared ptr for custom tensor

* use back to shared ptr for custom tensor

* use back to shared ptr for custom tensor

* add gpu test

* merge latest cwh code in

* adjust ut code of custom op

* adjust ut code of custom op

* adjust ut code of custom op

* adjust ut code of custom op

* adjust ut code of custom op

* hid share data from and to

* rename CustomTensor to Tensor

* refactor register design & add test

* change op_funtion to op_meta_info

* split op meta info into .h and .cc

* move get methods into friend class

* move OpMetaInfoHelper into framework space

* move CustomTensorUtils into framework space

* change pybind api name

* move PD C API into op meta info

* add register custom op api

* remove inference cmake change

* refactor copy to api && change Reshape to lowercase && support more dtype && add more test (#3)

* fix compile error

* wrap framework tensor with LoDTensor

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* add CustomTensor default constructor

* add size() for CustomTensor

* make size const for CustomTensor

* refactor place related api to circle the concept

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* make place const

* make Tensor copy

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* remove additional head of framework

* use back to shared ptr for custom tensor

* use back to shared ptr for custom tensor

* use back to shared ptr for custom tensor

* use back to shared ptr for custom tensor

* use back to shared ptr for custom tensor

* use back to shared ptr for custom tensor

* add gpu test

* merge latest cwh code in

* adjust ut code of custom op

* adjust ut code of custom op

* adjust ut code of custom op

* adjust ut code of custom op

* adjust ut code of custom op

* hid share data from and to

* rename CustomTensor to Tensor

* support multi dtype

* remove lod, make reshape lowercase, add copy test and refactor copy api

* remove lod, make reshape lowercase, add copy test and refactor copy api

* remove lod, make reshape lowercase, add copy test and refactor copy api

* remove lod, make reshape lowercase, add copy test and refactor copy api

* fix copy to error

* add more test

* add more test

* add more test

* add more test

* add more test

* add more test

* add more test

* add more test

* add more test

* add more test

* add more test

* add more test

* add more test

* add more test

* add more test

* add more test

* polish detail & error message

* polish test details

* Add cast api && Change copy related api to copy_to && add more test (#4)

* fix compile error

* wrap framework tensor with LoDTensor

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* add CustomTensor default constructor

* add size() for CustomTensor

* make size const for CustomTensor

* refactor place related api to circle the concept

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* fix compile error

* make place const

* make Tensor copy

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* debug CustomTensor core

* remove additional head of framework

* use back to shared ptr for custom tensor

* use back to shared ptr for custom tensor

* use back to shared ptr for custom tensor

* use back to shared ptr for custom tensor

* use back to shared ptr for custom tensor

* use back to shared ptr for custom tensor

* add gpu test

* merge latest cwh code in

* adjust ut code of custom op

* adjust ut code of custom op

* adjust ut code of custom op

* adjust ut code of custom op

* adjust ut code of custom op

* hid share data from and to

* rename CustomTensor to Tensor

* support multi dtype

* remove lod, make reshape lowercase, add copy test and refactor copy api

* remove lod, make reshape lowercase, add copy test and refactor copy api

* remove lod, make reshape lowercase, add copy test and refactor copy api

* remove lod, make reshape lowercase, add copy test and refactor copy api

* fix copy to error

* add more test

* add more test

* add more test

* add more test

* add more test

* add more test

* add more test

* add more test

* add more test

* add more test

* add more test

* add more test

* add more test

* add more test

* add more test

* add more test

* add type cast

* add cast and make copy to api

* add cast and make copy to api

* add cast and make copy to api

* add cast and make copy to api

* merge cwh code

* merge cwh code

* merge cwh code

* merge cwh code

* merge cwh code

* add more error log

* add more error log

* polish code

* used for test

* remove test comment

* remove test comment

* fix uint8 type error

* fix lost uint8 type error

* add test for coverage

* polish details by reviewer comments

* add prefix for DISABLE_COPY_AND_ASSIGN
Co-authored-by: Jiabin Yang <360788950@qq.com>


09 Feb, 2021 · 6 commits
- fix bug of Linux UT parallel level (#30971) · 5c033271
  Committed by Zhou Wei on Feb 09, 2021
- update eigen version on Windows (#30573) · 9b3c80c8
  Committed by wuhuanzhou on Feb 09, 2021
```
* update eigen version on Windows, test=develop

* add /bigobj for cl, test=develop
```
- Solve inconsistent order in each card in dynamic graph (#30931) · dae3e1f3
  Committed by ShenLiang on Feb 09, 2021
- Fix the problem that the number of ops executed by xpu is wrong (#30961) · 14d039e4
  Committed by WangXi on Feb 09, 2021
- try to fix reader and signal test failed (#30960) · 010f2caa
  Committed by Chen Weihang on Feb 08, 2021
- Fix LayerNorm tester for gcc4.8 (#30962) · 3ba69809
  Committed by Adam Osewski on Feb 09, 2021
08 Feb, 2021 · 4 commits
- [ROCM] update fluid platform for rocm39 (part3), test=develop (#30913) · 93c1d9e7
  Committed by Qi Li on Feb 08, 2021
- fix depends of kunlun bkcl (#30945) · 15297a06
  Committed by QingshuChen on Feb 08, 2021
- Add error message for slice op(#30851) · 97f7a70c
  Committed by liym27 on Feb 08, 2021
- [kunlun]fix sync in multi kunlun xpu dygraph training. (#30943) · 87197f8c
  Committed by liuyuhui on Feb 08, 2021
07 Feb, 2021 · 3 commits
- bug fix of xpu lite engine, test=develop (#30918) · 99bd16eb
  Committed by 石晓伟 on Feb 07, 2021
```
* bug fix of xpu lite engine, test=develop

* xpu zero copy tensor, test=develop

* revert paddle/fluid/inference/tests/api/CMakeLists.txt
```
- Add WITH_XPU_BKCL in Kunlun-CI (#30919) · 2e932338
  Committed by tianshuo78520a on Feb 07, 2021
- [ROCM] update fluid platform for rocm39 (part2), test=develop (#30774) · 34f1628c
  Committed by Qi Li on Feb 07, 2021
06 Feb, 2021 · 1 commit
- [oneDNN] Added basic changes for elementwise_add_grad bf16 (#30925) · 9e527d99
  Committed by Jacek Czaja on Feb 06, 2021
05 Feb, 2021 · 4 commits
- add truncated gaussian random (#30922) · c98f144f
  Committed by Chengmo on Feb 05, 2021
```
add truncated gaussian random
```
- [Kunlun] add gen_bkcl_id_op, support multi XPU cards training using multiprocess (#30858) · 4a8b8b45
  Committed by liuyuhui on Feb 05, 2021
- Performance optimization for dynamic setitem: Call op set_value to speed up... · 39f41cb4
  Committed by liym27 on Feb 05, 2021
```
Performance optimization for dynamic setitem: Call op set_value to speed up because the original call to TensorToPyArray will introduce unnecessary data copy. (#30817)
```
- [Kunlun]fix include files of gen_comm_id_helper.cc (#30917) · bef46ccf
  Committed by liuyuhui on Feb 05, 2021

PaddlePaddle / Paddle · last synced successfully more than a year ago