Commits · 7ba85acad7c927f0ba65b2e44f2cc5402275fcf9 · 机器未来 / Paddle

14 Apr, 2021 · 8 commits
- Add inner register backward hook method for Tensor (#32171) · 7ba85aca
  Committed by Chen Weihang on Apr 14, 2021
```
* add register backward hook method

* add leaf grad accumullated test
```
  7ba85aca
- Fix rocm cmake (#32230) · f3e49c40
  Committed by Qi Li on Apr 14, 2021
```
* [ROCM] fix some typo in cmake, test=develop

* [ROCM] fix rccl in paddle build script, test=develop
```
  f3e49c40
- Delete grpc.cmake/distribeted/distributed_ops (#32166) · 22ea4c30
  Committed by tianshuo78520a on Apr 14, 2021
```
* Delete grpc.cmake/distribeted/distributed_ops

* reset operators/CMakeLists.txt

* rm test_transpiler_ops.py

* del test_transpiler_ops.py
```
  22ea4c30
- fix matrix_inverse_op with rocm (#32128) · 995b5f2c
  Committed by zhulei on Apr 14, 2021
```
* fix matrix_inverse_op with rocm

* fix matrix_inverse_op with rocm

* fix matrix_inverse_op with rocm

* fix matrix_inverse_op with rocm
```
  995b5f2c
- Add model benchmark ci (#32247) · 279b653c
  Committed by xiegegege on Apr 14, 2021
  
  279b653c
- add common dtypes as paddle's dtypes (#32012) · 95939b52
  Committed by Feiyu Chan on Apr 14, 2021
```
* add common dtypes as paddle's dtypes

* import paddle.fluid.core_avx.VarDesc.VarType as paddle.dtype
```
  95939b52
- fix expand op lack of float16 (#32238) · f4b2ce44
  Committed by Thomas Young on Apr 14, 2021
  
  f4b2ce44
- add new post-quant methods (#32208) · 4281eb49
  Committed by XGZhang on Apr 14, 2021
  
  4281eb49
13 Apr, 2021 · 8 commits

extend multiclass_nms unittest timeout threshold (#32214) · cb81826a

Committed by Pei Yang on Apr 13, 2021

* extend multiclass_nms unittest timeout threshold

* adjust timeout to 200s

* temporarily disable multiclass_nms trt op teller

cb81826a

upgrade to oneDNN2.2.1 (fix when prim descriptor or attr contain NaN) (#32227) · b9e543f8
Committed by lidanqing on Apr 13, 2021

b9e543f8

add statistics_UT_resource.sh for imporving UT parallel level (#32220) · 1d5d3e47
Committed by Zhou Wei on Apr 13, 2021

1d5d3e47

Fix prec on windows for long args (#32218) · 7ab47e8d
Committed by YUNSHEN XIE on Apr 13, 2021
```
* fix error for long args

* remove unneccessary code
```
7ab47e8d

add layer.to api (#32040) · 6e946e9d

Committed by chentianyu03 on Apr 13, 2021

* add layer.to api

* add layer.to api

* add layer.to api

* add the doc for Layer.to

* add input type checking

* modify assert and import bug

* format code style

* format code style

* make place support str type

* add SetGradVarBase method to set the gradient after conversion

* modify argument palce to device

* modify argument palce to device

* modify doc of layers.to API

* add xpuplace to device argument

6e946e9d

[ROCM] fix depth conv2d in rocm, test=develop (#32170) · 693c7629
Committed by Qi Li on Apr 13, 2021

693c7629

optimize check_finite_and_unscale_op by fused kernel, test=develop (#31954) · fdf63b4e
Committed by jiangcheng on Apr 13, 2021

fdf63b4e

run the sample codes added by `add_sample_code` in ops.py (#31863) · 4a09c1a1

Committed by Ren Wei (任卫) on Apr 13, 2021

* skip paddle.Tensor.<lambda>

* some file may not exists. such as version.py, it's generated by setup.py

* debug mode

* add unittests for sampcd_processor.py

* add test cases for sampcd_processor

* add test cases for sampcd_processor

* add testcases

* add test cases

* add testcases

* add testcases

* refactor, add testcases

* add import

* all files map to pool. dont split manually

* __all__ += another list

* add testcases

* add testcases

* handle个锤子啊

* this line should not removed

https://github.com/wadefelix/Paddle/commit/882e7f7c3be6c2415f58550f82be338b84f0c0ef#diff-cb0679475bf60202fd803ae05b9146989437c3f787d1502616be6c71c69d0fb1

* print -> logger

* regulate the logging infomation

* regulate the logging infomation

* logger to file

* logger

* threads or subprocesses number config

* follow the good code style

don't touch wlist.json

* run test_sampcd_processor.py, it's a unittest for sampcd_processor.py

* update unittest for sampcd_processor.py

test=document_fix

4a09c1a1

12 Apr, 2021 · 9 commits

polish custom api content for performence (#32209) · 0624ea56
Committed by Chen Weihang on Apr 12, 2021

0624ea56

[Rocm] fix python test of multinomial (#32158) · 4b5cb22f

Committed by zhulei on Apr 12, 2021

* [Rocm] fix python test of multinomial

* [Rocm] fix python test of multinomial

* [Rocm] fix python test of multinomial

* [Rocm] fix python test of multinomial

4b5cb22f

Optimize the process of obtaining prec_list on windows (#32123) · 8dacfb5e

Committed by YUNSHEN XIE on Apr 12, 2021

* test,test,notest,test=windows_ci

* test,notest,test=windows_ci

* test,notest,test=windows_ci

* test,notest,test=windows_ci

* remove test code

* delete some unnecessary logs

* fix format error

* turn on added ut check on windows

8dacfb5e

[CustomOp]Fix description of supporting MacOS (#32192) · bb3b7906
Committed by Aurelius84 on Apr 12, 2021

bb3b7906

[ROCM] fix some unittests (#32129) · bd2a4e23

Committed by ronnywang on Apr 12, 2021

* [ROCM] fix test_gru_rnn_op

* [ROCM] fix test_expand_op

* [ROCM] fix test_cross_entropy_loss

* [ROCM] fix test_conv_nn_grad

* [ROCM] fix test_bilinear_tensor_product_op

* [ROCM] fix elementwise_op_function

* [ROCM] fix test_lstm_cudnn_op

* [ROCM] fix test_gpu_package_without_gpu_device

* [ROCM] fix test_gru_unit_op

* [ROCM] fix test_imperative_optimizer

* [ROCM] fix rnn

* [ROCM] fix group_norm_op

* [ROCM] fix test_pool3d_api

* [ROCM] fix test_pool3d_op

bd2a4e23

Optimization of bilinear backward OP CUDA kernel. (#30950) · d8afe407
Committed by limingshu on Apr 12, 2021

d8afe407

follow comments to refine PR 32144 (#32174) · af374ae6
Committed by Leo Chen on Apr 12, 2021

af374ae6

remove PYTHON_ABI, test=document_fix (#32190) · 80698cad
Committed by wuhuanzhou on Apr 12, 2021

80698cad

fix concat_grad on kunlun (#32151) · a2387ef2
Committed by TTerror on Apr 12, 2021
```
* fix concat_grad on kunlun

* fix concat_grad on kunlun
```
a2387ef2

10 Apr, 2021 · 2 commits
- Optimize the performance of the forward of log_softmax when axis is -1 and dim <= 1024 (#31630) · f8bab5b0
  Committed by AshburnLee on Apr 10, 2021
  
  f8bab5b0
- Ci py3 gcc5.4 (#32045) · afa3720c
  Committed by tianshuo78520a on Apr 10, 2021
  
  afa3720c
09 Apr, 2021 · 9 commits

make high precision for avg_pool and adaptive_avg_pool when data_type is float16 (#31887) · ec2ffb68
Committed by niuliling123 on Apr 09, 2021
```
* make high precision for avg_pool
```
ec2ffb68

[NPU] cherry-pick basic NPU components/allocator/operator/executor supports from ascendrc (#32144) · ccf5709d

Committed by Leo Chen on Apr 09, 2021

* [feature] support npu allocator (#30840)

[feature] support npu allocator

* [feature] support npu operator (#30951)

[feature] support npu operator

* [feature] support npu allocator, part 2 (#30972)

* support npu allocator

* add npu device context

* fix some compile problem

* fix some compile problem

* add npu info

* compile ok

* fix include dir

* support naive_best_fit_allocator

* run ut ok, bug failed to exit

* call aclrtResetDevice before exit

* fix aclFinilize

* add system allocatot test

* add selected_gpus in gtest

* add tensor_test for npu

* support npu op, initial commit

* add npu stream

* add elementwise_add_op

* compile ok

* fix typo

* fix elementwise_add_op_npu_test

* support op run

* test can run but failed

* change aclopExecuteV2 to aclopCompileAndExecute

* support parsing ascend rank table file (#31000)

support parsing ascend rank table file

* Fix reshape on GE graph. (#31084)

Fix reshape on GE graph

* add npu kernel for elementwise_sub and elementwise_sub_grad (#30973)

* add npu sub op

* fix typo

* rename test

* fix bug

* fix bug

* add fp16 kernel

* fix typo

* support sub grad op

* support elementwise_sub_grad op
Co-authored-by: frankwhzhang <frankwhzhang@126.com>

* Fix compilation problem (#31100)

Fix compilation problem (#31100)

* fix compile

* fix code stype

* remove const_cast

* support adding correct npu op in pybind.h (#31143)

* support adding correct npu op in pybind.h

* refine code

* [NPU] Support executor with NPU (#31057)

* [NPU] Support executor with NPU

* Fix code according to reviews

* Fix code

* Add unittest for sub op npu

* refactor npu device manager (#31154)

refactor npu device manager (#31154)

* fix selected npus

* fix compile

* fix reading flags from env

* format
Co-authored-by: xiayanming <41795079@qq.com>
Co-authored-by: gongweibao <weibao.gong@gmail.com>
Co-authored-by: frankwhzhang <frankwhzhang@126.com>
Co-authored-by: liym27 <33742067+liym27@users.noreply.github.com>

ccf5709d

fix unittest timeour (#32161) · a73cb679
Committed by Shang Zhizhou on Apr 09, 2021

a73cb679

[Dy2Stat] Fix undefined var used in For (#32153) · 4636d136
Committed by Aurelius84 on Apr 09, 2021
```
* fix undefind var in For

* fix code style
```
4636d136

Advoid CPU -> CPU memory copy when start, end, step is already on CPU. (#29088) · 95122ebe
Committed by Yiqun Liu on Apr 09, 2021

95122ebe

[CustomOp]Support MacOS platform and Remove libpaddle_custom_op.so dependency (#31976) · d815fbf9
Committed by Aurelius84 on Apr 09, 2021
```
* Remove old custom OP to reduce whl package volume

* [Custom OP]Remove old custom OP to reduce whl package volume

* support macos
```
d815fbf9

[Dy2Stat] Support DictCmp and zip grammer (#32159) · 55730d95
Committed by Aurelius84 on Apr 09, 2021
```
* support DictCmp and zip grammar

* fix code style
```
55730d95

Candidate fix to #31992 (#32136) · dabaca00
Committed by Jacek Czaja on Apr 09, 2021

dabaca00

[ROCM] update rocm skip ut list, test=develop (#32149) · 3822247f
Committed by Lei.C on Apr 09, 2021

3822247f

08 Apr, 2021 · 4 commits
- Support converting the model from fp32 to fp16 (#32112) · 1bae1e74
  Committed by cc on Apr 08, 2021
```
* Support converting the model from fp32 to fp16
```
  1bae1e74
- Add LayerDict class (#31951) · e45c3fa5
  Committed by chentianyu03 on Apr 08, 2021
```
* add layerdict class

* add docs and test cases for LayerDict class

* remove the arguments type in function define

* add update inputs type check
```
  e45c3fa5
- 4D Hybrid Parallelism (#32134) · 54344964
  Committed by JZ-LIANG on Apr 08, 2021
  
  54344964
- The unsupported_fp16_list using in AMP will be created automatically during the runtime. (#32102) · 6e65fe02
  Committed by Zhen Wang on Apr 08, 2021
```
* Use the runtime to create the unsupported_fp16_list using in AMP.

* Add more infos about supported ops.

* Add some comments for the function of OpSupportedInfos.

* Fix the unit test of test_multi_precision_fp16_train.
```
  6e65fe02

机器未来 / Paddle is in sync with the fork's source project
