Commits · 12339fa0b9914da7abbd83a3f68a5792b9af4792 · Crayon鑫 / Paddle

15 Nov, 2021 · 5 commits
- Add distributed pass framework: including PassBase/PassTest/PassUtils (#36643) · 12339fa0
  Committed by Zeng Jinle on Nov 15, 2021
```
* add split_program

* make ut faster

* increase ut timeout

* make result deterministic

* add fuse_all_reduce pass

* add ut framework, update

* fix ut framework

* remove useless code

* add coverage support

* update

* fix CI

* fix some bugs and fix ci coverage

* fix conflict
```
  12339fa0
- fix bug of indexing with ellipsis (#37182) · f2a56c6a
  Committed by zyfncg on Nov 15, 2021
  
  f2a56c6a
- add fetch op for cinn graph output node of build_cinn_pass (#37172) · 10cc040d
  Committed by jiangcheng on Nov 15, 2021
  
  10cc040d
- modify sparse_attention docs, test=document_fix (#36554) · 6b0cc2b1
  Committed by Liu-xiandong on Nov 15, 2021
```
* modify sparse_attention docs, test=develop

* add warning

* add warning ,test=document_fix
```
  6b0cc2b1
- [heterps]bug fix for local training with --heter_worker_num (#37166) · 31cd9145
  Committed by zmx on Nov 15, 2021
```
* fix. test=develop

* fix. test=develop

* fix. test=develop

* fix. test=develop

* fix. test=develop

* fix ut. test=develop

* fix ut. test=develop

* fix ut. test=develop
```
  31cd9145
12 Nov, 2021 · 7 commits
- [fix]fix the bug of fused_attention and fused_feedforward (#36972) · 6486e242
  Committed by zhangkaihuo on Nov 12, 2021
```
* fix bug:
1. atten: set the default value of attn_dropout_rate to None
2. ffn: add activation parameter
```
  6486e242
- fix test_scale_op skipped test (#37153) · ca7f1cd2
  Committed by Chen Weihang on Nov 12, 2021
  
  ca7f1cd2
- [fleet_executor] handle empty addr for single card train (#37150) · 2c7870e0
  Committed by Yuang Liu on Nov 12, 2021
  
  2c7870e0
- Refine new executor (#37074) · 1fe4513c
  Committed by Leo Chen on Nov 12, 2021
```
* split declaration and implementation

* remove initdevices

* refine VariableMetaInfo

* add ut

* fix compile
```
  1fe4513c
- [CPU-PSLIB] Fix bug for consistency insepection of op's embedding name and... · 9574bcd7
  Committed by Fan Zhang on Nov 12, 2021
```
[CPU-PSLIB] Fix bug for consistency insepection of op's embedding name and sparse table name in config_fleet.py (#36753)

* [CPU-PSLIB] Fix bug for consistency insepection of op's embedding name and sparse table name in config_fleet.py

* [CPU-PSLIB] Fix bug for consistency insepection of op's embedding name and sparse table name in config_fleet.py
```
  9574bcd7
- [NPU] fix fill_constant and test_memcpy_op_npu (#37144) · 9396f286
  Committed by Aganlengzi on Nov 12, 2021
  
  9396f286
- [AutoParallel] Add AutoConvert (#36958) · 1773afd7
  Committed by zhaoyingli on Nov 12, 2021
```
* add AutoConvert

* add unitest

* amend merge&slice

* amend default dist_attr

* update doc&improve coverage

* add interface dist_context

* tiny modify
```
  1773afd7
11 Nov, 2021 · 11 commits

remove repeated linalg in __all__ (#37117) · 357425d8
Committed by zhouweiwei2014 on Nov 11, 2021

357425d8

[Bug fixes] Add default arg to enhance varbase ClearGradient func (#36837) · 63f5c2d4

Committed by Weilong Wu on Nov 11, 2021

* Add default arg to enhance varbase ClearGradient func

* Removed default arg, use a Flag to enhance varbase ClearGradient func

* Renamed Flags to FLAGS_real_release

* Use default arg to enhance varbase ClearGradient func and expose two func to set/get gradient isEmpty

* Removed DECLARE_bool statement

* Polished Code

63f5c2d4

add where/where_index/masked_select for kunlun (#37053) · f5e7b02a
Committed by TTerror on Nov 11, 2021
```
* add where/where_index/masked_select for kunlun

* fix where/where_index

* update where/masked_select
```
f5e7b02a

Added softplus + activation oneDNN fuse pass (#36657) · a346c4dc

Committed by jakpiase on Nov 11, 2021

* added softplus + activation fuse plass

* minor change

* implemented reviewer suggestion

* minor fix

* minor fix

* added scale_out parameter

* minor fix

* fix for iScan CI

* conditionally disabled logs

* refactored pass builder

a346c4dc

fleet support elastic scale up/down (#36684) · 6af531b7

Committed by xiayanming on Nov 11, 2021

* fleet support elastic train

* fleet support elastic train

* support elastic

* add unittest

* fix unitest bug

* fix unittest bug

* fix unittest bug

* fix unittest coverage

* fix unittest coverage

* fix unittest coverage

* fix unittest coverage

* fix unittest coverage

* fix elastic bug

* fix ci fail

* fix ci fail

* fix elastic bug

* fix elastic bug

* fix joint debugging bug

* fix joint debugging bug

* fix windows ci failed

* fix windows ci failed

6af531b7

[Heterps]Refactor Heter Pipeline Parameter Server (#36845) · a2da1efa

Committed by zmx on Nov 11, 2021

* change username

* fix

* fix

* fix

* fix

* fix

* update

* update

* update unittests

* fix

* update

* fix

* update

* fix

* fix

* fix

* update

* update. test=develop

* update. test=develop

* update. test=develop

* update. test=develop

* update. test=develop

* update. test=develop

* update. test=develop

* update. test=develop

* update. test=develop

* update. test=develop

* update. test=develop

* update. test=develop

* update. test=develop

* update. test=develop

* update. test=develop

* update. test=develop

* update. test=develop

* update. test=develop

* update. test=develop

* update. test=develop

* update. test=develop

* update. test=develop

* update. test=develop

* update. test=develop

* update. test=develop

* update. test=develop

* update send_and_recv op. test=develop

* update. test=develop

* update. test=develop

* update. test=develop

* update. test=develop

* update. test=develop

* update. test=develop

* update. test=develop

* update. test=develop

* fix. test=develop

* fix. test=develop

* fix. test=develop

* fix. test=develop

* fix ut. test=develop

* fix unit. notest,test=coverage

* fix ut. notest, test=coverage

* update. notest,test=coverage

* fix ut. notest, test=coverage

* fix ut. notest, test=coverage

* fix. notest, test=coverage

* fix. notest, test=coverage

* fix ut. notest, test=coverage

* fix ut. notest, test=coverage

* fix ut. notest, test=coverage

* fix ut. notest, test=coverage

* add func. notest, test=coverage

* fix ut. notest, test=coverage

* fix. test=develop

* fix. test=develop

a2da1efa

[New features] Support VarBase to expose func (#36965) · 52645667

Committed by Weilong Wu on Nov 11, 2021

* Expose func for varbase

* Expose func for varbase and enhance varbase init func

* Change func name and add test case for _CopyGradientWith

* Rename func

* Add test cases to increase coverage

* Refine the logic of _to func

* Replace numel() with _numel(), Add test code

52645667

Get global cluster information (#37084) · 31673a92
Committed by LiYuRio on Nov 11, 2021

31673a92

update ut (#37089) · 6c183a8e
Committed by Wilber on Nov 11, 2021

6c183a8e

fix 2 bug: 1.skip lodtensorarray; 2.delete feed op (#37090) · d5df6bdf
Committed by wanghuancoder on Nov 11, 2021
```
* fix 2 bug: 1.skip lodtensorarray; 2.delete feed op, test=develop

* program clone, test=develop
```
d5df6bdf

[PaddlePaddle Hackathon] add WideResNet (#36952) · 8395f573
Committed by Nyakku Shigure on Nov 11, 2021
```
* add wide resnet
* update pretrained weights link
```
8395f573

10 Nov, 2021 · 5 commits
- Added stack FP32 FWD oneDNN kernel (#37002) · 99f9224c
  Committed by jakpiase on Nov 10, 2021
```
* added stack oneDNN FP32 op

* minor change

* CI fix

* added skipping for gpus

* fix for stack op

* CI fix

* CI fix

* Added comment

* CI fix
```
  99f9224c
- Fix inner_program in Executor (#37083) · 8a2ce0f2
  Committed by Aurelius84 on Nov 10, 2021
  
  8a2ce0f2
- Fix fused_attention_op scope. (#37065) · ad44a40c
  Committed by Li Min on Nov 10, 2021
```
att, bug fix
```
  ad44a40c
- fix multihead_matmul ut for tensorrt6 (#37073) · 48d53cfc
  Committed by baoachun on Nov 10, 2021
  
  48d53cfc
- Fix rnn grad bug in cpu when dropout is zero (#37080) · 211940eb
  Committed by Jack Zhou on Nov 10, 2021
```
* fix rnn grad bug when num_layers is set 2 and dropout_prob is set 0

* add more test for rnn
```
  211940eb
09 Nov, 2021 · 5 commits
- Refine param conversion logic in layer.to (#36862) · 993ec76a
  Committed by zhangbo9674 on Nov 09, 2021
```
* refine layer to

* delete comment

* refine logic

* refine code

* refine pure_fp16_init

* refine comment
```
  993ec76a
- fix CompileProgram in Executor (#37036) · 77a8c94b
  Committed by Aurelius84 on Nov 09, 2021
  
  77a8c94b
- delete profiler.cuda_profiler (#36524) · d817388e
  Committed by wanghuancoder on Nov 09, 2021
```
* delete profiler.cuda_profiler, test=develop

* delete nvprof, test=develop

* add required: gpu, test=develop

* remove cuda_profiler, test=develop
```
  d817388e
- Try to fix CUDA Graph H2D copy bug (#36987) · 2a143f84
  Committed by Zeng Jinle on Nov 09, 2021
```
* try to fix CUDA Graph H2D copy bug

* remove useless code

* fix ci

* fix ROCM CI

* fix CUDA_VERSION

* improve CI coverage
```
  2a143f84
- add gather_nd/tile op for kunlun (#37029) · 819b9589
  Committed by TTerror on Nov 09, 2021
  
  819b9589
08 Nov, 2021 · 7 commits

Use cuda virtual memory management and merge blocks (#36189) · a1ec1d5a

Committed by wanghuancoder on Nov 08, 2021

* Use cuda virtual memory management and merge blocks, test=develop

* refine, test=develop

* refine, test=develop

* refine, test=develop

* refine, test=develop

* refine, test=develop

* refine, test=develop

* refine, test=develop

* window dll, test=develop

* fix cuda error of CUDA_ERROR_NOT_INITIALIZED, test=develop

* use autogrowthv2 for system allocator, test=develop

* remove ~CUDAVirtualMemAllocator(), test=develop

* refine, test=develop

* fix cuda error of CUDA_ERROR_NOT_INITIALIZED, test=develop

* fix cuda error of CUDA_ERROR_NOT_INITIALIZED, test=develop

* fix bug, test=develop

* revert system allocator, test =develop

* revert multiprocessing, test=develop

* fix AutoGrowthBestFitAllocatorV2 mutxt, test=develop

* catch cudaErrorInitializationError when create allocator, test=develop

* fix cuMemSetAccess use, test=develop

* refine cuda api use, test=develop

* refine, test=develop

* for test, test=develop

* for test, test=develop

* switch to v2, test=develop

* refine virtual allocator, test=develop

* Record cuMemCreate and cuMemRelease, test=develop

* refine, test=develop

* avoid out of bounds, test=develop

* rename allocator, test=develop

* refine, test=develop

* use PADDLE_ENFORCE_CUDA_SUCCESS, test=develop

* for test,test=develop

* refine, test=develop

* refine, test=develop

* refine, test=develop

* refine, test=develop

* refine, test=develop

* refine, test=develop

a1ec1d5a

【fix-bug】Support attn_mask=None input cases for fused_attention_op. (#36951) · 472dcca4
Committed by Li Min on Nov 08, 2021
```
The current fused_attention_op does not support attn_mask=None as input. This PR adds that support and the corresponding unit-test logic.
```
472dcca4

add pass and mkldnn base ut. (#36967) · b7e88308
Committed by Wilber on Nov 08, 2021

b7e88308

avoid setting logging.basicConfig (#37031) · 1305b4f5
Committed by kuizhiqing on Nov 08, 2021

1305b4f5

set net.forward to original forward function in flops (#36852) · 94bcc2ab
Committed by 0x45f on Nov 08, 2021

94bcc2ab

setitem support passing stop_gradient from value to tensor (#37023) · aef8bf2a
Committed by zyfncg on Nov 08, 2021

aef8bf2a

Add Support for OperatorBase in new executor (#36945) · 251f68e7

Committed by xiongkun on Nov 08, 2021

* add scope as membership

* functions complete

* fix bugs: garbage collectior

* deal unknow variable holder

* add

* 1. add unittest for operator_base

* code format

251f68e7
