Commits · 29f65225a212c843ffa81dcf87f417120f8c7ee4 · Crayon鑫 / Paddle

15 Apr, 2021 1 commit

Customizable Python Layer in Dygraph (#32130) · 29f65225

Committed by WeiXin on Apr 14, 2021

* custom python backward

* polish up the code

* polish up the code

* polish up the code.

* Fix code format and comments.

* Delete redundant files.

* add unittest.

* edit unittest.

* edit unittest.

* Remove redundant header files.

* Improve coverage and remove redundant code.

* support saving for backward.

* polish code according to comments.

* Add support type for PyLayer.

* Modify the DOC.

* polish Doc.

* polish Doc.

* polish Doc.

* polish Doc.

* polish Doc.

* polish Doc.

* polish code and make the code robust.

* Modify the code format.

29f65225

14 Apr, 2021 1 commit

Add inner register backward hook method for Tensor (#32171) · 7ba85aca

Committed by Chen Weihang on Apr 14, 2021
```
* add register backward hook method

* add leaf grad accumulated test
```
  7ba85aca
13 Apr, 2021 1 commit

add layer.to api (#32040) · 6e946e9d

Committed by chentianyu03 on Apr 13, 2021

* add layer.to api

* add layer.to api

* add layer.to api

* add the doc for Layer.to

* add input type checking

* modify assert and import bug

* format code style

* format code style

* make place support str type

* add SetGradVarBase method to set the gradient after conversion

* modify argument place to device

* modify argument place to device

* modify doc of layers.to API

* add xpuplace to device argument

6e946e9d

09 Apr, 2021 1 commit

[NPU] cherry-pick basic NPU components/allocator/operator/executor supports from ascendrc (#32144) · ccf5709d

Committed by Leo Chen on Apr 09, 2021

* [feature] support npu allocator (#30840)

[feature] support npu allocator

* [feature] support npu operator (#30951)

[feature] support npu operator

* [feature] support npu allocator, part 2 (#30972)

* support npu allocator

* add npu device context

* fix some compile problem

* fix some compile problem

* add npu info

* compile ok

* fix include dir

* support naive_best_fit_allocator

* run ut ok, but failed to exit

* call aclrtResetDevice before exit

* fix aclFinalize

* add system allocator test

* add selected_gpus in gtest

* add tensor_test for npu

* support npu op, initial commit

* add npu stream

* add elementwise_add_op

* compile ok

* fix typo

* fix elementwise_add_op_npu_test

* support op run

* test can run but failed

* change aclopExecuteV2 to aclopCompileAndExecute

* support parsing ascend rank table file (#31000)

support parsing ascend rank table file

* Fix reshape on GE graph. (#31084)

Fix reshape on GE graph

* add npu kernel for elementwise_sub and elementwise_sub_grad (#30973)

* add npu sub op

* fix typo

* rename test

* fix bug

* fix bug

* add fp16 kernel

* fix typo

* support sub grad op

* support elementwise_sub_grad op
Co-authored-by: frankwhzhang <frankwhzhang@126.com>

* Fix compilation problem (#31100)

Fix compilation problem (#31100)

* fix compile

* fix code style

* remove const_cast

* support adding correct npu op in pybind.h (#31143)

* support adding correct npu op in pybind.h

* refine code

* [NPU] Support executor with NPU (#31057)

* [NPU] Support executor with NPU

* Fix code according to reviews

* Fix code

* Add unittest for sub op npu

* refactor npu device manager (#31154)

refactor npu device manager (#31154)

* fix selected npus

* fix compile

* fix reading flags from env

* format
Co-authored-by: xiayanming <41795079@qq.com>
Co-authored-by: gongweibao <weibao.gong@gmail.com>
Co-authored-by: frankwhzhang <frankwhzhang@126.com>
Co-authored-by: liym27 <33742067+liym27@users.noreply.github.com>

ccf5709d

01 Apr, 2021 4 commits

Support control flow in DataParallel (#31625) · 8460698b

Committed by ShenLiang on Apr 01, 2021
```
* support control flow

* support sync_parameters_buffers

* fix the bug of sparse embedding
```
8460698b

add custom init grad for backward function (#31540) · 83b953f5

Committed by chentianyu03 on Apr 01, 2021

* add custom init grad for backward function

* add custom init grad for backward function

* handle when the grad_tensor is none

* handle when the grad_tensor is none

* fix the args type error on windows platform

* modify the args order and doc

* format code

* add grad_tensor to xpu

* modify the grad_tensor type check

* add paddle.backward api to support multi tensors gradient compute

* add paddle.backward api to support multi tensors gradient compute

* add paddle.autograd module and backward api

* change tensor.backward func args

* modify tensor backward api

* remove create_graph inputs args

* add doc and example code for backward api

* when have the same tensor, throw error

* modify test Init func args

* modify the execute.Init func args in test files

* add paddle.autograd package in setup.py.in

* modify error msg, remove _run_backward method in class Tensor

* add test cases for backward api

83b953f5

new group (#31682) · 07741593

Committed by kuizhiqing on Apr 01, 2021
```
* new group

* ci compatible fix

* assert nccl
```
07741593

Refactor and simplify hook design & add Tensor.register_hook API (#31775) · dbeb3ea4

Committed by Chen Weihang on Mar 31, 2021

* refactor and simplify hook design

* fix reducer add hook error

* add Tensor.register_hook basic impl

* refine prepare data impl

* revert prepare data change

* support register_hook for Tensor

* add hook test in model

* polish tests and doc example

* fix double grad test failed

* remove reduce hook func

* fix set empty error

* polish code by comments

* change reduce_hook to mutable_hook

* remove useless tmp_ins

* fix shape code format error

* fix shape code format error

dbeb3ea4

26 Mar, 2021 1 commit

delete include framework.pb.h (#31859) · e804f085

Committed by tianshuo78520a on Mar 26, 2021
```
* delete include framework.pb.h

* fix error
```
  e804f085
15 Mar, 2021 1 commit

DataLoader supprot dict str (#31481) · a32e8bf1

Committed by Kaipeng Deng on Mar 15, 2021
```
* add dict/str/list supprot for DataLoader. test=develop
```
  a32e8bf1
12 Mar, 2021 1 commit

Make CreateProgramDesc more robust (#31543) · da9dda5c

Committed by whs on Mar 12, 2021

da9dda5c
09 Mar, 2021 1 commit

[ROCM] fix reduce op, test=develop (#31478) · b85c8e03

Committed by Qi Li on Mar 09, 2021

b85c8e03
05 Mar, 2021 1 commit

[Kunlun]Multi xpu dygraph performance optimization , add distributed.spawn... · 9ebf05b0

Committed by liuyuhui on Mar 05, 2021
```
[Kunlun]Multi xpu dygraph performance optimization , add distributed.spawn support for multi xpu and some bug-fixes (#31130)
```
  9ebf05b0
26 Feb, 2021 1 commit

[Custom OP] Support stream set on Custom Op (#31257) · 038ce70d

Committed by Jiabin Yang on Feb 26, 2021

038ce70d
25 Feb, 2021 1 commit

add cache for VariableWrapper (#30880) · ca3b6bcf

Committed by chentianyu03 on Feb 25, 2021

* add cache for VariableWrapper

* modify args names and vlog level

* format code style

* add log when set cache to variable_wrapper

* add log when set cache to variable_wrapper

* add comment to variableWrapper cache

* format code style

ca3b6bcf

24 Feb, 2021 1 commit

fix the modification of set_expected_place (#31177) · 0f1fde51

Committed by Leo Chen on Feb 24, 2021
```
* revert the modification of set_expected_place

* set device before op run

* add ut
```
  0f1fde51
22 Feb, 2021 1 commit

[ROCM] update fluid imperative for rocm (part1), test=develop (#31017) · 1d996637

Committed by Qi Li on Feb 22, 2021

* [ROCM] update fluid imperative for rocm (part1), test=develop

* [ROCM] update reducer.cc after merge, test=develop

* update reducer cmake after merge, test=develop

1d996637

19 Feb, 2021 1 commit

Remove scale loss before reduce in dygraph (#30807) · 9401173e

Committed by ShenLiang on Feb 19, 2021

9401173e
09 Feb, 2021 1 commit

Solve inconsistent order in each card in dynamic graph (#30931) · dae3e1f3

Committed by ShenLiang on Feb 09, 2021

dae3e1f3
08 Feb, 2021 1 commit

[kunlun]fix sync in multi kunlun xpu dygraph training. (#30943) · 87197f8c

Committed by liuyuhui on Feb 08, 2021

87197f8c
04 Feb, 2021 2 commits

fix xpu dygraph place (#30868) · 6e3856d3

Committed by WangXi on Feb 04, 2021

6e3856d3

use iwyu clean include second time, test=develop (#30829) · 35c5b23f

Committed by wanghuancoder on Feb 04, 2021
```
* use iwyu clean include second time, test=develop
```
  35c5b23f
03 Feb, 2021 1 commit

【kunlun】dygraph supports multi xpu card training (#30671) · b1026f64

Committed by WangXi on Feb 03, 2021

b1026f64
29 Jan, 2021 1 commit

rm Singleton of reducer (#30775) · 3858f458

Committed by ShenLiang on Jan 29, 2021

3858f458
20 Jan, 2021 1 commit

add some RecordEvent, for dygraph timeline (#30299) · d1b25ed9

Committed by wanghuancoder on Jan 20, 2021

* add some RecordEvent, for dygraph timeline, test=develop

* change GpuMemcpySync to memory::Copy, test=develop

* fix compile problem, test=develop

* fix compile problem, test=develop

* fix, test=develop

* fix, test=develop

d1b25ed9

19 Jan, 2021 4 commits

[Prepare for MultiProcess xpu] unified gen nccl id, refine imperative reducer (#30455) · 572c466d

Committed by WangXi on Jan 19, 2021

572c466d

fix bug of multicard grad ncclAllReduce (#30553) · fb20ec9a

Committed by Zhou Wei on Jan 19, 2021

fb20ec9a

fix error message of Inplace strategy (#30520) · 00554b3f

Committed by pangyoki on Jan 19, 2021

00554b3f

support layer_norm fp16 in dygraph amp (#30430) · 7043b8cf

Committed by Leo Chen on Jan 19, 2021
```
* support layer_norm fp16 in dygraph amp

* add ut

* refine code
```
  7043b8cf
15 Jan, 2021 1 commit

Add Inplace strategy (Output reuse Input Varbase) in dygraph (#30103) · 13d75736

Committed by pangyoki on Jan 15, 2021

* add view strategy on squeeze,unsqueeze,reshape,flatten

* add squeeze unittest

* add unittests

* use View strategy as name rather than Reuse Allocation

* fix view api doc

* fix format

* use core.ops when input of reshape2 is Tensor

* fix test_cross_entropy_loss error because of reshape2

* fix test_cross_entropy_loss error because of reshape2

* add inplace strategy

* add elementwise_add sub

* let backward op not use inplace

* grad op do not use inplace

* fix memory increase error and add leaf error message

* delete selected_rows

* change op_function

* little change

* solve HandleViewBetweenInputAndOutput

* add unittest and leaf error message

* merge view error

* optimize op_function_generator format and support sum inplace op

* fix format of basic_engine

* fix format for framework

* little change of variable wrapper

* add reshape, squeeze, unsqueeze, scatter api

* add relu elu tanh softmax inplace api

* fix test_squeeze_op unittest

* fix test_relu_op unittest

* fix comment problems

* delete sample code of inplace api

* add reference of grad_pending_nodes in basic_engine

* fix unittest name

* add inplace apis into wlist

* fix error message

* add PADDLE_ENFORCE for set grad op twice

* fix head file error

13d75736

13 Jan, 2021 1 commit

Support unused parameters in dynamic graph distributed (#30224) · a60f17b8

Committed by ShenLiang on Jan 13, 2021

a60f17b8
11 Jan, 2021 1 commit

fix header file paths of gflags, commit 1, test=develop (#30271) · 8ce2482b

Committed by 石晓伟 on Jan 11, 2021

8ce2482b
08 Jan, 2021 2 commits

Fix dtype of ungenerated grad var (#28511) · 8696335f

Committed by Leo Chen on Jan 08, 2021

* fix dtype of ungenerated grad var

* update ut

* refine code

* set default dtype

* fix could_use_cudnn bug

* remove debug code

* re-implement

* fix bug

8696335f

Add callback after TensorCopy (#30123) · 1f97d61c

Committed by Leo Chen on Jan 08, 2021

* change to tensor copy sync

* change to tensor copy sync

* make copy_to safe when use TensorCopy

* refine code

* add ut

* add cudapinned garbagecollector

* add testcase: cpu place -> cuda pinned place

1f97d61c

07 Jan, 2021 1 commit

[Complex] Simplify prepared op impl to improve performance (#30153) · d0fb06b2

Committed by Chen Weihang on Jan 07, 2021

* simplify prepared op impl to improve performance

* fix kunlun compile error

* continue fix kunlun compile error

* only transform diff place when dtype diff

* fix failed unittests

* remove useless file

* polish impl by review comment

d0fb06b2

05 Jan, 2021 1 commit

support dygraph in xpu place (#30051) · 297fff1a

Committed by hong on Jan 05, 2021

* support dygraph in xpu place; test=develop

* fix cpu/gpu compile error; test=develop

* fix compile error; test=develop

* fix xpu compile error; test=develop

297fff1a

29 Dec, 2020 1 commit

support grad accumulated across batch (#29942) · a1d9a14e

Committed by Chen Weihang on Dec 28, 2020

a1d9a14e
25 Dec, 2020 2 commits

[Complex] Handle complex to real after type promotion (#29855) · a6072055

Committed by Chen Weihang on Dec 25, 2020

* try to add fwd op input dtypes

* refactor base impl

* return tmp_ins after dygraph prepare data

* fix typo found in debug

* polish comment & add complex net test

* revert detail change

* fix unittest failed

* add complex kernel condition control

* fix xpu test failed & polish comment

* polish details by review comments

a6072055

[Complex] Add support for complex grad accumulated (#29889) · 1a304e6c

Committed by Chen Weihang on Dec 25, 2020

* add support for complex grad accumulated

* add unittest for coverage

* update test dtype

* remove useless blank line

1a304e6c

22 Dec, 2020 1 commit

opt sparse allreduce using ncclgather (#29819) · f65f1caa

Committed by ShenLiang on Dec 22, 2020

f65f1caa

Crayon鑫 / Paddle is in sync with the fork's source project