Commits · f2dc29a9fabcfd0d9d5f277019e5290483a8c650 · 机器未来 / Paddle

19 Feb, 2021 · 10 commits
- [CustomOp] Support output dtypes in generated Python API (#31045) · f2dc29a9
  Committed by Aurelius84 on Feb 19, 2021
- Modify relu native implementation 2 (#30996) · 615d8a22
  Committed by Wojciech Uss on Feb 18, 2021
  ```
  * Modify relu native implementation

  * fix GPU performance
  ```
- Remove scale loss before reduce in dygraph (#30807) · 9401173e
  Committed by ShenLiang on Feb 19, 2021
- fix python pass builder error. (#30946) · 0020d915
  Committed by Wilber on Feb 18, 2021
- fix jetson problem (#30939) · 39aeaa16
  Committed by Wilber on Feb 18, 2021
- update trt error message when input height or width is -1 (#31019) · 01ccfbcd
  Committed by Wilber on Feb 18, 2021
- resolve memory leak in cudnn8.0 (#31029) · cf8b8f9c
  Committed by Wilber on Feb 18, 2021
- fix dataloader collate return list mix tensor and numpy array (#30904) · c4ddc3ab
  Committed by Kaipeng Deng on Feb 19, 2021
  ```
  * fix dataloader collate return list mix tensor and numpy array. test=develop
  ```
- add offset parameter in roi_align, generate_proposals, etc. ops (#30864) · 5b267474
  Committed by Guanghua Yu on Feb 19, 2021
  ```
  * add  parameter in roi_align op
  ```
- fix regex error & simplify macro name (#31031) · 75f81233
  Committed by Chen Weihang on Feb 18, 2021
18 Feb, 2021 · 8 commits

enable exhaustive_search for forward and backward algos when dtype is float16 (#30959) · f0ee1592
Committed by Zhang Ting on Feb 18, 2021
```
* enable exhaustive_search for input_grad when dtype is float16

* enable exhaustive_search for forward algos
```

add trt transpose and flatten converter (#31022) · 9b54fe41
Committed by Pei Yang on Feb 18, 2021

[CustomOp] Support Compile multi ops at same time (#30920) · 4c9f96c9
Committed by Aurelius84 on Feb 18, 2021

* add more unittest for ABI compatibility

* add more unittest

* refine warning style

* support compile multi custom ops in same time

* fix not import paddle in unittest

* fix typo

* add more unittest

* add comment for details

Add Conv Transpose BF16 (#30877) · caf9d398
Committed by joanna.wozna.intel on Feb 18, 2021

* Add conv transpose BF16

* Share function GetWeightsTz

* Adjust to review and fix op compatibility

* Add bias to unique handler name

* Remove errors related to paddle enforce

* Add conv2d_transpose to bf16 list and kernel refactor

Refine fake_interface Error Message (#30981) · cbbe1274
Committed by Huihuang Zheng on Feb 18, 2021
```
Refine fake_interface Error Message
```

Add Support for Tuple in for Loop (#30998) · c1375783
Committed by Huihuang Zheng on Feb 18, 2021

Previously, Dy2stat did not support tuples as iteration variables. This PR adds support for three main cases:

1). Non-enumerate case: for var1, var2 in var|var.numpy() will be rewritten as:
for FOR_ITER_TUPLE_PREFIX_x in var | var.numpy():
    var1 = FOR_ITER_TUPLE_PREFIX_x[0]
    var2 = FOR_ITER_TUPLE_PREFIX_x[1]
2). Enumerate outer tuple case: for t in enumerate(var|var.numpy()) will be rewritten as:
for FOR_ITER_TUPLE_INDEX_PREFIX_x, FOR_ITER_TUPLE_PREFIX_x in enumerate(var|var.numpy()):
    t = (FOR_ITER_TUPLE_INDEX_PREFIX_x, FOR_ITER_TUPLE_PREFIX_x)
3). Enumerate inner tuple case: for i, (var1, (var2, var3)) in enumerate(var|var.numpy()) will
be rewritten as:
for i, FOR_ITER_TUPLE_PREFIX_x in enumerate(var | var.numpy()):
    var1 = FOR_ITER_TUPLE_PREFIX_x[0]
    var2 = FOR_ITER_TUPLE_PREFIX_x[1][0]
    var3 = FOR_ITER_TUPLE_PREFIX_x[1][1]
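
The three rewrites described in this commit message can be sanity-checked in plain Python. The following is only a sketch under stated assumptions: it does not use Paddle or Dy2stat, ordinary lists stand in for `var` / `var.numpy()`, and the function names and the `_iter_tuple` / `_iter_index` variables are hypothetical stand-ins for the generated `FOR_ITER_TUPLE_PREFIX_x` and `FOR_ITER_TUPLE_INDEX_PREFIX_x` names. The third rewrite is assumed to keep the `enumerate(...)` call, since `i` needs an index source.

```python
# Plain-Python sketch: each "rewritten" loop should behave like its "original".

def pairs_original(seq):
    out = []
    for var1, var2 in seq:                      # case 1: tuple as loop target
        out.append(var1 + var2)
    return out

def pairs_rewritten(seq):
    out = []
    for _iter_tuple in seq:                     # single temporary loop variable
        var1 = _iter_tuple[0]
        var2 = _iter_tuple[1]
        out.append(var1 + var2)
    return out

def enum_original(seq):
    out = []
    for t in enumerate(seq):                    # case 2: t is an (index, item) tuple
        out.append(t)
    return out

def enum_rewritten(seq):
    out = []
    for _iter_index, _iter_tuple in enumerate(seq):
        t = (_iter_index, _iter_tuple)          # rebuild the tuple explicitly
        out.append(t)
    return out

def nested_original(seq):
    out = []
    for i, (var1, (var2, var3)) in enumerate(seq):   # case 3: nested unpacking
        out.append((i, var1 + var2 + var3))
    return out

def nested_rewritten(seq):
    out = []
    for i, _iter_tuple in enumerate(seq):
        var1 = _iter_tuple[0]
        var2 = _iter_tuple[1][0]
        var3 = _iter_tuple[1][1]
        out.append((i, var1 + var2 + var3))
    return out

pairs = [(1, 2), (3, 4)]
nested = [(1, (2, 3)), (4, (5, 6))]

print(pairs_original(pairs) == pairs_rewritten(pairs))
print(enum_original(pairs) == enum_rewritten(pairs))
print(nested_original(nested) == nested_rewritten(nested))
```

Each rewritten loop uses only a flat loop target plus index expressions, which is presumably what makes it easier to convert than direct tuple unpacking.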


Handle missing symlink method on Windows (#31006) · 2497f439
Committed by Wojciech Uss on Feb 17, 2021

[CustomOp] Check Compiler ABI compatibility (#30869) · 5653c3a4
Committed by Aurelius84 on Feb 18, 2021

* support setup.py to compile custom op

* move file into paddle.utils.cpp_extension

* support python setup.py install

* refine code style

* Enrich code and add unittest

11 Feb, 2021 · 1 commit
- fix lrn bug in reshape size, test=develop (#30968) · 20e300e2
  Committed by huangjun12 on Feb 11, 2021
10 Feb, 2021 · 2 commits

delay timeout of unittest 'test_static_save_load'. (#30975) · 8ab29f4b
Committed by WeiXin on Feb 10, 2021

New custom operator extension mechanism (#30690) · f649442d
Committed by Chen Weihang on Feb 09, 2021

* initial commit: simple demo

* polish copyright format

* add grad op simple demo

* adapt uncertain number of argument

* change trait macro name

* add place & dtype support for add kernel

* add dispatch and infershape func

* polish code & add notes

* add dynamic_loader dep for paddle_framework

* add new custom op test dir

* polish impl details

* add unittest for new custom op

* fix failed unittest

* Custom op (#1)

* fix compile error

* wrap framework tensor with LoDTensor

* fix compile error

* add CustomTensor default constructor

* add size() for CustomTensor

* make size const for CustomTensor

* refactor place related api to circle the concept

* fix compile error

* make place const

* make Tensor copy

* debug CustomTensor core

* remove additional head of framework

* use back to shared ptr for custom tensor

* add gpu test

* merge latest cwh code in

* adjust ut code of custom op

* Remove ShareData from user && Change CustomTensor to Tensor && Support more data type (#2)

* fix compile error

* wrap framework tensor with LoDTensor

* fix compile error

* add CustomTensor default constructor

* add size() for CustomTensor

* make size const for CustomTensor

* refactor place related api to circle the concept

* fix compile error

* make place const

* make Tensor copy

* debug CustomTensor core

* remove additional head of framework

* use back to shared ptr for custom tensor

* add gpu test

* merge latest cwh code in

* adjust ut code of custom op

* hide share data from and to

* rename CustomTensor to Tensor

* refactor register design & add test

* change op_function to op_meta_info

* split op meta info into .h and .cc

* move get methods into friend class

* move OpMetaInfoHelper into framework space

* move CustomTensorUtils into framework space

* change pybind api name

* move PD C API into op meta info

* add register custom op api

* remove inference cmake change

* refactor copy to api && change Reshape to lowercase && support more dtype && add more test (#3)

* fix compile error

* wrap framework tensor with LoDTensor

* fix compile error

* add CustomTensor default constructor

* add size() for CustomTensor

* make size const for CustomTensor

* refactor place related api to circle the concept

* fix compile error

* make place const

* make Tensor copy

* debug CustomTensor core

* remove additional head of framework

* use back to shared ptr for custom tensor

* add gpu test

* merge latest cwh code in

* adjust ut code of custom op

* hide share data from and to

* rename CustomTensor to Tensor

* support multi dtype

* remove lod, make reshape lowercase, add copy test and refactor copy api

* fix copy to error

* add more test

* polish detail & error message

* polish test details

* Add cast api && Change copy related api to copy_to && add more test (#4)

* fix compile error

* wrap framework tensor with LoDTensor

* fix compile error

* add CustomTensor default constructor

* add size() for CustomTensor

* make size const for CustomTensor

* refactor place related api to circle the concept

* fix compile error

* make place const

* make Tensor copy

* debug CustomTensor core

* remove additional head of framework

* use back to shared ptr for custom tensor

* add gpu test

* merge latest cwh code in

* adjust ut code of custom op

* hide share data from and to

* rename CustomTensor to Tensor

* support multi dtype

* remove lod, make reshape lowercase, add copy test and refactor copy api

* fix copy to error

* add more test

* add type cast

* add cast and make copy to api

* merge cwh code

* add more error log

* polish code

* used for test

* remove test comment

* fix uint8 type error

* fix lost uint8 type error

* add test for coverage

* polish details by reviewer comments

* add prefix for DISABLE_COPY_AND_ASSIGN

Co-authored-by: Jiabin Yang <360788950@qq.com>

09 Feb, 2021 · 9 commits
- fix bug of Linux UT parallel level (#30971) · 5c033271
  Committed by Zhou Wei on Feb 09, 2021
- support label with float input of cross_entropy, test=develop (#30929) · f5ca2db2
  Committed by chajchaj on Feb 09, 2021
  ```
  * support label with float input of cross_entropy, test=develop

  * fix code style in nn/functional/loss.py, test=develop
  ```
- modify dockerfile: support cuda11 and delete gcc8.2 in cpu version (#30746) · 52edaecc
  Committed by pangyoki on Feb 09, 2021
  ```
  * support cuda11 and delete gcc8.2 in cpu version

  * change method

  * fix pip

  * change 11 to 11.0
  ```
- update eigen version on Windows (#30573) · 9b3c80c8
  Committed by wuhuanzhou on Feb 09, 2021
  ```
  * update eigen version on Windows, test=develop

  * add /bigobj for cl, test=develop
  ```
- Solve inconsistent order in each card in dynamic graph (#30931) · dae3e1f3
  Committed by ShenLiang on Feb 09, 2021
- Fix the problem that the number of ops executed by xpu is wrong (#30961) · 14d039e4
  Committed by WangXi on Feb 09, 2021
- Update gast requirement, test=develop (#30932) · 8e72e031
  Committed by Huihuang Zheng on Feb 09, 2021
  ```
  The gast version can conflict with other software that users have installed. We set the version requirement to be higher than 0.3.3
  ```
- try to fix reader and signal test failed (#30960) · 010f2caa
  Committed by Chen Weihang on Feb 08, 2021
- Fix LayerNorm tester for gcc4.8 (#30962) · 3ba69809
  Committed by Adam Osewski on Feb 09, 2021
08 Feb, 2021 · 5 commits
- [ROCM] update fluid platform for rocm39 (part3), test=develop (#30913) · 93c1d9e7
  Committed by Qi Li on Feb 08, 2021
- fix depends of kunlun bkcl (#30945) · 15297a06
  Committed by QingshuChen on Feb 08, 2021
- [Static setitem] Support index is ellipsis for setitem in static mode (#30836) · 12c15beb
  Committed by liym27 on Feb 08, 2021
- Add error message for slice op (#30851) · 97f7a70c
  Committed by liym27 on Feb 08, 2021
- [kunlun]fix sync in multi kunlun xpu dygraph training. (#30943) · 87197f8c
  Committed by liuyuhui on Feb 08, 2021
07 Feb, 2021 · 5 commits
- op benchmark ci retry with specified id (#30743) · 99bf6228
  Committed by wuhuanzhou on Feb 07, 2021
  ```
  * op benchmark ci retry with specified id, notest, test=op_benchmark

  * fix parse case name with case id, notest, test=op_benchmark

  * remove test code, test=develop
  ```
- bug fix of xpu lite engine, test=develop (#30918) · 99bd16eb
  Committed by 石晓伟 on Feb 07, 2021
  ```
  * bug fix of xpu lite engine, test=develop

  * xpu zero copy tensor, test=develop

  * revert paddle/fluid/inference/tests/api/CMakeLists.txt
  ```
- Add WITH_XPU_BKCL in Kunlun-CI (#30919) · 2e932338
  Committed by tianshuo78520a on Feb 07, 2021
- fix a bug of Sequential::__getitem__ (#30899) · 823f499a
  Committed by wanghuancoder on Feb 07, 2021
  ```
  * fix a bug of Sequential::__getitem__, test=develop

  * add testcase, test=develop
  ```
- [ROCM] update fluid platform for rocm39 (part2), test=develop (#30774) · 34f1628c
  Committed by Qi Li on Feb 07, 2021
