Commits · 06de4891de30806c049c5e315dab3c74241ae7bb · Crayon鑫 / Paddle

10 Jun, 2022 1 commit
- Add option for executor profiler (#43355) · 6aee6410
  Committed by Weilong Wu on Jun 10, 2022
```
* Add option for test executor profiler

* Change option for test executor_profiler
```
07 Jun, 2022 1 commit
- [Dygraph] Fix bugs of EagerReducer for complex control flows (#43252) · 2922985a
  Committed by Haohongxiang on Jun 07, 2022
```
* fix bugs of reducer

* update

* update
```
04 Jun, 2022 1 commit
- 【code format check upgrade】 step2：cmake-format (#43057) · 92568edb
  Committed by Sing_chan on Jun 04, 2022
  
30 May, 2022 3 commits
- Add fused_bias_dropout_residual_ln op and layer. (#43062) · dceccd9d
  Committed by Li Min on May 30, 2022
```
* add fused_bias_dropout_residual_ln op and layer.
```
- Implement fused_gate_attention operator for AlphaFold. (#42018) · fdcdbec5
  Committed by crystal on May 30, 2022
  
- rm serial mode in exclusive case (#43073) · a1d87776
  Committed by zhangchunle on May 30, 2022
  
28 May, 2022 1 commit
- [Bug Fix]Fix global_scatter/global_gather in ProcessGroup (#43027) · 8cc2e28c
  Committed by ShenLiang on May 28, 2022
```
* fix alltoall

* rename utest
```
27 May, 2022 1 commit
- fix_sharding_timeout (#43002) · 905d857c
  Committed by Baibaifan on May 27, 2022
  
23 May, 2022 1 commit
- Reduce test case for test_tensordot (#42885) · 9aed8327
  Committed by Ruibiao Chen on May 23, 2022
```
* Reduce test case for test_tensordot

* Fix CI errors
```
16 May, 2022 1 commit
- make some test run with old executor in specified windows server (#42777) · 4b355ff9
  Committed by Sing_chan on May 16, 2022
  
13 May, 2022 1 commit
- Refactor test_tensordot (#42650) · 757b5d31
  Committed by Ruibiao Chen on May 13, 2022
```
* Refactor test_tensordot

* Add test_static

* Fix CI errors
```
10 May, 2022 1 commit

[Eager] Refactor several sharding test (#42608) · 668a0a41

Committed by Weilong Wu on May 10, 2022

* [Eager] fix sharding under eager mode

* [Eager] fix several sharding test under eager mode

* Recover using _test_eager_guard

* Ensured fleet.init under legacy

* Ensured fleet.init under legacy

* Fix CI issue, re-definition strategy and call fleet.init() in stage2_offload

* Modified dygraph_group_sharded_api.py, move fleet.init to a better line

05 May, 2022 1 commit
- Disable standalone executor for test_tensordot (#42476) · e51fad5f
  Committed by Ruibiao Chen on May 05, 2022
  
28 Apr, 2022 1 commit

Add gradient merge for DistributedFusedLamb optimizer (#40177) · 108aeb28

Committed by sneaxiy on Apr 28, 2022

* add gradient merge for DistributedFusedLamb

* use master acc gradient

* fix CI ut

* polish

* remove math_function_impl.h change

* fix test_update_loss_scaling_op.py

* try to fix XPU/NPU CI

* add gm ut

26 Apr, 2022 2 commits
- Add fused_multi_transformer op to optimize transformer generation performance (#41814) · 9dadf7df
  Committed by WangXi on Apr 26, 2022
  
- Add C++ EinsumOp which support 2 operands einsum. (#42105) · c7302f96
  Committed by xiongkun on Apr 26, 2022
```
* full api fix

* when out is None, go old dygraph mode

* by static check

* first version: support 2-inputs forwards. TODO: 1. backward  2. BroadCast  3. MultiVariable

* time out -> 120
```
22 Apr, 2022 1 commit

[WIP] Algorithm Cache of cuBlasLt Epilogue (#41010) · 19650d72

Committed by Ming-Xu Huang on Apr 22, 2022

* Fix leading dimension setting error in fused_gemm_epilogue_grad_op.

* Add dyload to cuBlasLt functions.

* Added cublasLtMatmulAlgoGetHeuristic to improve performance.

* Added FLAGS_cublaslt_exhaustive_search_times to cublasLt epilogue

* Added UTs to FLAGS_cublaslt_exhaustive_search_times

* Added warmup runs in algo searching of Gemm epilogue.

* Update copyright and documents.

* Fixed error handling.

19 Apr, 2022 1 commit
- Fix pipeline in new dygraph (#41937) · 6b690d89
  Committed by ShenLiang on Apr 19, 2022
```
* fix utest

* fix time
```
13 Apr, 2022 2 commits
- Add yaml for deformable_conv and deformable_conv_v1 OPs (#41644) · b8968390
  Committed by Ruibiao Chen on Apr 13, 2022
```
* Add yaml for deformable_conv and deformable_conv_v1 OPs

* Add UT

* Add to skipped_phi_api list for infrt
```
- sharding_for_eager_tensor (#41415) · f4cc5def
  Committed by Baibaifan on Apr 13, 2022
  
12 Apr, 2022 1 commit
- use standalone executor for test_nn_grad/test_norm_nn_grad (#41574) · bb427a3d
  Committed by Leo Chen on Apr 12, 2022
  
08 Apr, 2022 1 commit
- update (#41309) · ab137a84
  Committed by lilong12 on Apr 08, 2022
  
07 Apr, 2022 2 commits

Switch some dy2st UT to eager mode (#41382) · edbb3986

Committed by 0x45f on Apr 07, 2022

* Switch some dy2st UT to eager mode

* Fix test_lstm and remove test_transformer

* Run test_resnet_v2 in old dy mode

ignore some failed test for KL2 (#41342) · 81389c51
Committed by QingshuChen on Apr 07, 2022
```
* ignore some failed test for KL2
*test=kunlun

* minor
*test=kunlun

* minor
*test=kunlun
```
06 Apr, 2022 1 commit

[Eager] Support test_layers's test cases switch to eager mode (#41216) · 5ae8babb

Committed by Weilong Wu on Apr 06, 2022

* [Eager] Support test_layers's test cases switch to eager mode

* Update batch_norm _C_ops action to fix CI

* Use None instead of new EmptyTensor

* Updated var name

* Make sure to switch eager mode, Fix Coverage_CI

* Remove _non_static_mode statement

* Remove batch_norm dispensable input statement

* Polish batch_norm code

* Fix CI issue

05 Apr, 2022 4 commits

[Dygraph] Support process group in dp with fleet api (#41119) · 1f829f6e
Committed by Haohongxiang on Apr 05, 2022
```
* support process group in dp with fleet api

* update

* fix uts

* update
```
Add nms op and batched_nms api (#40962) · 7554f428
Committed by RichardWooSJTU on Apr 05, 2022
```
* add nms op and batched_nms api
```

[new-exec] enable the new standalone executor by default (#41179) · 93ea1297

Committed by Leo Chen on Apr 05, 2022

* enable new executor by default

* enable stream safe allocator

* test=document_fix;test=coverage

* do not use scope in op kernel

* fit empty program for new executor

* fix communication depend

* fix test_sync_batch_norm

* skip unsupported place

* refine datatransfer

* fit for distributed program

* fix dependency

* fix some ut

add test time, test=document_fix (#41405) · feaa9798
Committed by Chen Weihang on Apr 05, 2022

04 Apr, 2022 3 commits

[Dygraph] Support sparse tensor in refactored reducer (#40836) · 1b031987
Committed by Haohongxiang on Apr 04, 2022
```
* [Dygraph] Support sparse tensor in refactored reducer

* add uts

* refactor

* update

* fix bugs
```

[Phi] Add softmax with cross entropy infershape & yaml (#41351) · a6b6bcbf

Committed by Chen Weihang on Apr 04, 2022

* add infershape and forward yaml

* add final_state call

* add base unittests

* add backward yaml and test

* fix without softmax test error

* add cross_entropy test


Add yaml for reduce_sum OP (#41295) · 5936fa6e

Committed by From00 on Apr 04, 2022

* Add yaml for reduce_sum OP

* Fix CI errors

* Fix CI errors

* Fix CI errors

* Fix CI errors

02 Apr, 2022 1 commit
- wrapper the usage of distributed functions (#39720) · 8df46229
  Committed by lilong12 on Apr 02, 2022
  
30 Mar, 2022 2 commits

Add new APIs for GPU memory monitoring (max_memory_allocated, max_memory_reserved, memory_allocated, memory_reserved) (#38657) · afe02e9d

Committed by From00 on Mar 30, 2022

* Add new API memory_reserved

* Add memory_allocated, max_memory_reserved and max_memory_allocated

* Fix CI error

* Fix CI error

* Enhance UT

* Add FLAGS_memory_stats_opt

* Add STATS macro functions

* Add StatAllocator

* Fix CI errors

* Add UT

* Fix CI errors


suppor inplace in tensor_method_setitem (#40915) · 7170c687

Committed by pangyoki on Mar 30, 2022

* support inplace in tensor_method_setitem

* delete bump_inplace_version

* optimize inplace unittest

* fix

* fix setitem bug

* update eager_generator

* optimize inplace unittest

* little change

28 Mar, 2022 1 commit

[Dygraph] Add unittests for DataParallel in eager mode (#40709) · 62af5903

Committed by Haohongxiang on Mar 28, 2022

* add uts for EagerReducer

* add more uts

* fix bugs

* fix bugs

* modify

* modify uts

* fix bugs

* update

* update

* update

* solve conflicts and merge

* add some other uts

* modify time of uts

* update

* update

* update

* remove uts of resnet

25 Mar, 2022 1 commit

Refactor Dygraph Flags (#40786) · 3085d5e4

Committed by Jiabin Yang on Mar 25, 2022

* refactor eager flags

* fix flags error when we switch from eager to dygraph

* fix ci problem

* fix ci

* fix ci

* merge develop and fix code style

* merge develop and fix code style

* fix op test error

* fix op test error

* fix op test error

* fix op test error

* fix op test error

* merge develop

24 Mar, 2022 1 commit
- Wrap dist api for dygraph mode (#40408) · 9d8cfc1b
  Committed by lilong12 on Mar 24, 2022
  
21 Mar, 2022 1 commit
- fleetrun launch in legacy mode (#40568) · c54c60de
  Committed by kuizhiqing on Mar 21, 2022
  
19 Mar, 2022 1 commit

support inplace in dygraph eager_fluid state (#40400) · 8e612903

Committed by pangyoki on Mar 19, 2022

* [Eager] Support eager grad interface, draft version

* Support eager grad interface with allow_unused and multi startup_op

* Fix code format

* Fix allow_unused case, return PyNone if tensor not initialize

* Support output's stop_gradient related to create_graph

* Support grad exception case in eager mode, fix coverage CI

* Update ToPyObject, return PyNone if not initialize

* AccumulationNode add FLAGS_retain_grad_for_all_tensor

* Fix ci issue

* Fix CI issue

* fix, use core.eager.Tensor

* Add func SetBufferSlotRankZeros for GradTensorHolder

* Support retain_graph by using ClearTensorWrappers

* Support retain_graph by using ClearTensorWrappers

* Update retain_graph and no_grad_vars related test case

* Update code gen logic for ClearTensorWrappers

* Fix by override statement

* fix override func args

* Support retain_graph, update unit tests

* Updated ClearTensorWrappers logic

* fix grad python interface

* Use deep copy and update unit tests

* Polish code

* Polish code

* Fix CI issue, Deep copy only use when user set grad_tensors

* Fix CI, use Backward instead RunBackward

* Fix CI, Declare kernel explicitly in test file

* Polish, remove vector of TensorWrapper

* Refactor the logic of grad/backward, polish codes

* Update code after merge upstream develop

* Polish after merge upstream develop

* Update to adapt new GradNodeBase superclass

* Fix error introduced during conflict resolution

* support inplace strategy in eager_fluid state

* solve conflict

* nothing

* Update purify potential_startup_nodes logic

* Fix errors

* Polish code

* Remove useless args for ToPyObject

* Remove useless TensorWrappersSet

* fix record conflict

* Fix code-format, re-install pre-commit

* fix tensor_wrapper bug

* Fix pre-process logic for potential_startup_ops

* Update unit tests, use eager mode

* Fix conflicts

* fix unittest timeout

* little change

Co-authored-by: Weilong Wu <veyron_wu@163.com>


Crayon鑫 / Paddle · in sync with the fork's source project
