Commits · 172f27191002b21a31c1cbb2df092e4446b67606 · PaddlePaddle / Paddle

16 Jun, 2021 · 1 commit
- bug fix, test=develop (#33595) · 172f2719
  Committed by lilong12 on Jun 16, 2021
15 Jun, 2021 · 1 commit

[cherry-pick] fix gather bug && fix hang of new_group (#33553) · a4e841e0

Committed by ShenLiang on Jun 15, 2021

* Fix gather infer shape using axis (#33413)

* fix gather shape bug

* fix None

* fix topo

* Fix hang of hybrid parallel in new_group  (#33141)

* fix hang of hybrid parallel

* fix new_group for hang problem

* fix hang

11 Jun, 2021 · 1 commit
- update 2.0 public api in all left files (#33314) · e48f7a5b
  Committed by zhiboniu on Jun 11, 2021
```
* update 2.0 public api in all left files

* reverse device.py all list;
fix some flake8 errors
```
10 Jun, 2021 · 1 commit
- fix the bug in the creation of pp groups to avoid hang (#32890) (#33473) · fe841790
  Committed by lilong12 on Jun 10, 2021
```
* update, test=develop
```
02 Jun, 2021 · 1 commit
- [Cherry-pick] Fix spawn default nprocs get error (#33215) (#33249) · 5d8e4395
  Committed by Chen Weihang on Jun 02, 2021
```
* fix spawn default nprocs get error

* polish error message
```
26 May, 2021 · 1 commit

[Cherry-Pick][HybridParallel]Fix pipeline in dygraph (#33097) · d7d3090f

Committed by ShenLiang on May 26, 2021

* [HybridParallel]Fix pipeline in dygraph (#33007)

* fix pipeline

* fix mp pp dp

* fix utest of hybrid parallel

* add utest for tuple

* fix utest (#33108)

25 May, 2021 · 1 commit
- [HybridParallel]Fix precision problem of model parallel (#32897) (#33087) · 4026e227
  Committed by ShenLiang on May 25, 2021
```
* fix precision of mp

* fix bug of seed

* fix dp

* print group
```
24 May, 2021 · 1 commit
- update 2.0 public api in distributed (#32990) · 7c0b96e6
  Committed by zhiboniu on May 24, 2021
19 May, 2021 · 1 commit

【cherrypick】support cuda11 for heterps; add profiler in oneps (#32957) · ab1a4df9

Committed by danleifeng on May 19, 2021

* cherrypick for #32640 :add profile and fix dataset hang in heterps;test=develop

* cherrypick for #32640 :add profile and fix dataset hang in heterps;test=develop

* cherrypick for #32640 :add profile and fix dataset hang in heterps;test=develop

11 May, 2021 · 1 commit
- fix find_unused_parameters default value (#32829) · 02513207
  Committed by ShenLiang on May 11, 2021
```
fix error log for reducer

fix doc

fix bug of utest

fix spawn

fix coverage
```
07 May, 2021 · 1 commit
- bug fix, test=develop (#32753) · 5fdd85ba
  Committed by lilong12 on May 07, 2021
06 May, 2021 · 2 commits
- avoid polluting logging's root logger (#32673) (#32706) · 0bb079cd
  Committed by Feiyu Chan on May 06, 2021
```
avoid polluting logging's root logger
```
- update, test=develop (#32731) · df00636b
  Committed by lilong12 on May 06, 2021
05 May, 2021 · 1 commit
- Fix the bug in pipeline for dygraph mode (#32716) (#32728) · 6b86e966
  Committed by lilong12 on May 05, 2021
```
* update, test=develop
```
01 May, 2021 · 1 commit
- solve develop bugs (#32560) (#32684) · 6a1957e7
  Committed by Baibaifan on May 01, 2021
27 Apr, 2021 · 4 commits
- [Docs] Modified the docs of some api for supporting list/tuple args. (#32360) · 15158927
  Committed by xiemoyuan on Apr 27, 2021
```
* fixed docs.

* Fixed docs. test=document_fix

code bak.

fixed docs. test=document_fix

* Revert to previous version of python/paddle/fluid/backward.py

* fixed bugs.

* test=document_fix. Fixed examples.
```
- Support list and tuple for args. (#32344) · a08a118d
  Committed by xiemoyuan on Apr 27, 2021
```
* Support list and tuple for parameters of layer_norm, multiprocess_reader, DatasetFolder and ImageFolder.

* add unittest for layer_norm.

* add require gpu for example.
```
- Revert "[PsCore] optimize performance of large kv (#32535)" (#32599) · 809ac036
  Committed by tianshuo78520a on Apr 27, 2021
```
This reverts commit 4b7242b0.
```
- [HybridParallel] Fix amp bug in ModelParallel (#32579) · c1db7e32
  Committed by ShenLiang on Apr 27, 2021
```
* fix amp bug

* fix name of wordsize
```
26 Apr, 2021 · 4 commits
- add send/recv api (#32504) · c47bafc6
  Committed by lilong12 on Apr 26, 2021
```
* add sendrecv, test=develop
```
- add barrier for new group (#32572) · 4ba49af5
  Committed by ShenLiang on Apr 26, 2021
- [PsCore] optimize performance of large kv (#32535) · 4b7242b0
  Committed by Thunderbrook on Apr 26, 2021
```
* optimize pull sparse

* optimize pull sparse

* change macro

* format
```
- [HybridParallel]Fix model parallel bug by using C++ op (#32536) · ea465fa5
  Committed by ShenLiang on Apr 26, 2021
```
* fix model parallel

* rm parallel_help.py

* add embedding
```
25 Apr, 2021 · 4 commits
- add pipeline for dynamic graph (#32511) · 561dc719
  Committed by lilong12 on Apr 25, 2021
```
* add pp dygraph, test=develop
```
- Dygraph Recompute (#32516) · 583ebab7
  Committed by JZ-LIANG on Apr 25, 2021
```
* Dygraph recompute

* unittest for Dygraph recompute

* dy recompute remove unittest for win and mac
```
- [HybridParallel] Add pipeline layer in dygraph (#32449) · 7ef1de67
  Committed by ShenLiang on Apr 25, 2021
```
* add pipeline layer
```
- Fix the bug in mp (#31996) · 976fe6f9
  Committed by lilong12 on Apr 25, 2021
```
* update
```
23 Apr, 2021 · 1 commit
- solve hccl communicate conflict (#32447) · 0e74eea2
  Committed by Baibaifan on Apr 23, 2021
```
solve hccl communicate conflict (#32447)
```
22 Apr, 2021 · 2 commits
- Add fleet get_loss_scaling doc and update alert message (#32419) · d03b0b16
  Committed by Yuang Liu on Apr 22, 2021
- [HybridParallel] Add ClipGradByGlobalNorm & check_finite_and_unscale in Dygraph (#32354) · 7ea999fd
  Committed by ShenLiang on Apr 22, 2021
```
* add clip/check

* add amp & clip grad in dygraph

* add logging
```
21 Apr, 2021 · 3 commits

【NPU】Merge NPU ccl code (#32381) · c3158527

Committed by zhang wenhui on Apr 21, 2021

* add allreduce and broadcast without test (#31024)

add allreduce and broadcast without test

* Refactor HCCLCommContext to be compatible with Paddle (#31359)

Refactor HCCLCommContext to be compatible with Paddle (#31359)

* [NPU] add npu kernel for communication op (#31437)

* add allreduce and broadcast without test

* add c_broadcast_test case

* build c_comm_init and c_create_group operators

* make the whole thing compile

* add broadcast and init op test case but run failed

* make unit test compile

* fix broadcast test bug and change into hcom for ccl

* change c_comm_init and c_create_group ops accordingly

* make tests compile

* transfer code to 27

* compiled successfully in 28, but run failed

* test broadcast in 28, but failed

* make hcom primitives work

* change hccl data type for base.h

* fix broadcast bug

* make attributes work

* fix group name bug

* add allreduce but test failed

* allreduce bug for qiuliang

* allreduce finished

* add allgather and reducescatter

* merge all op code

* add allgather test

* finish run all ccl op test exclude send/recv

* all all op and test exclude send/recv

* send_v2_npu.cc recv_v2_npiu.cc compiled

* fix ccl core dump bug and test allgather, reducescatter, broadcast op

* fix allreduce bug just for test

* hcom send&recv test pass, without hcom_destroy

* for qiuliang test

* Ascend Send&Recv Test Pass

* all op (ex send/recv) ok

* fix bug

* merge all ccl op

* style merge to PaddlePaddle

* merge style

* new merge style

* merge style 2

* insert an empty at the end

* disable ctest for hcom to pass ci
Co-authored-by: void-main <voidmain1313113@gmail.com>
Co-authored-by: f2hkop <f2huestc@outlook.com>

* Add auto-increasing tag id for Hcom OPs (#31702)

* add c_reduce_sum op (#31793)

add c_reduce_sum op

* update Ascendrc hccl to 20.3 (#32126)

update Ascendrc hccl to 20.3 (#32126)

* fix merge code

* change cmake.txt1

* [NPU] Support npu kernel for c sync stream op (#31386)

* sync stream npu op

* add with_ascend_acl

* update c++ unittest

* compile all failed

* try to pre commit

* after pre commit

* merge&compile&test hccl successfully!

* fix code style

* fix code style

* fix bugs about hccl

* fix some bugs

* fix code style

* fix style

* fix style

* fix

* fixed

* merge develop
Co-authored-by: lw921014 <liuwei921014@yeah.net>
Co-authored-by: Void Main <voidmain1313113@gmail.com>
Co-authored-by: f2hkop <f2huestc@outlook.com>
Co-authored-by: xiayanming <41795079@qq.com>

add get_loss_scaling to fleet (#32401) · 37bb3342

Committed by Yuang Liu on Apr 21, 2021

add test=develop (#32380) · 4898c38d

Committed by gongweibao on Apr 21, 2021

20 Apr, 2021 · 1 commit
- [Sharding]: update config DOC (#32299) · e3489013
  Committed by JZ-LIANG on Apr 20, 2021
```
* sharding: update config DOC

* update pipeline config

* sharding update doc
```
19 Apr, 2021 · 1 commit
- [Hybrid Parallel] Support dp & mp in dygraph (#32323) · ffd40860
  Committed by ShenLiang on Apr 19, 2021
```
* support dp & mp
```
17 Apr, 2021 · 1 commit
- [Hybrid Parallel] Add model parallel support in dygraph (#32248) · 66d46221
  Committed by ShenLiang on Apr 17, 2021
```
* add model parallel support in dygraph
```
15 Apr, 2021 · 3 commits

tree-based-model (#31696) · a8c3a902

Committed by 123malin on Apr 15, 2021
```
* add index_dataset and index_sampler for tree-based model
```

heterps support pscore (#32093) · 9f8c8f96

Committed by Thunderbrook on Apr 15, 2021

* pscore support heterps

* fleet cmake

* fleet wrapper

* macro

* solve conflict

* solve conflict

* add unittest

* paddle enforce

* unittest

* unittest

* unittest

【NPU】Cherry-pick ascendrc ops code by 0325 to develop (#32197) · e6bc358d

Committed by zhang wenhui on Apr 15, 2021

* merge 31065

* Fix typo of selected_npus (#31230)

* merge 31249

* [NPU] Support npu op pow and pow grad (#31247)

* [NPU] Support npu op: (1) pow (2) pow_grad

* Support fp16

* Fix pow npu fp16 test (#31256)

* support list of list attribute for NPU (#31299)

* support list of list attribute for NPU

* fix compile problem

* fix reference

* [NPU] Support npu op: (1) slice (2) slice_grad (#31275)

* fix reading flags from env (#31329)

* merge 31347

* [NPU] Support npu op layer_norm and layer_norm_grad (#31310)

* init commit, add layer_norm npu kernel

* fix typo

* add unittest

* add unittest

* fix bug

* fix bug

* refine ut

* [NPU] add npu kernel for equal op (#31393)

* add npu kernel for equal op

* refine code

* add more ut

* update year

* [NPU] Support npu kernel for shape op  (#31427)

* add shape npu

* fix

* fix

* fix endif (#31431)

* Fix pow, use fillD instead of broadcast (#31433)

* Fix pow, refine code (#31440)

* fix cmake of cryptopp to avoid downloading every time (#31451)

* [NPU] squeeze and unsqueeze op for ascend (#31452)
Co-authored-by: root <xiayanming@baidu.com>

* Support npu kernel for gather op (#31458)

* add gather npu op

* code review done

* update python new line

* precommit

* fix review

* del commit

* 【NPU】add scale op for npu (#31499)

* add scale npu

* fix

* fix

* Support TensorFromVector, TensorToVector of bool type (#31518)

* support TensorFromVector, TensorToVector of bool type

* add ut

* fix compile problem

* 【NPU】support npu kernel for fill_constant op (#31521)

* add fill_constant npu

* add fill_constant npu

* fix

* cherry-pick 31422, solve conflict

* 【NPU】Support npu kernel for matmul op (#31544)

* add matmulv2_npu

* add matmul

* add matmul

* [NPU] Support npu op elementwise_mul and elementwise_mul_grad (#31571)

* [NPU] Support npu op elementwise_max (#31574)

* 【NPU】add relu op for  npu (#31515)

* add relu npu

* fixed

* fix

* 【NPU】Support npu kernel for reshape2 op (#31524)

* add reshape2 npu

* add reshape2

* [NPU] Support npu kernel for gather op fix bug (#31541)

* add gather npu op

* code review done

* update python new line

* precommit

* fix review

* del commit

* update gather_grad

* fix bug

* fix bug

* [NPU] Support npu kernel for amp_check_finite_and_unscale_npu op (#31457)

* Support npu kernel for amp_check_finite_and_unscale_npu op

* support EnforceNotMet exception

* fix exception bug

* modify python unittest

* precommit

* update c++ unittest

* fix review

* fix review

* [NPU] accuracy op (#31492)

* accuracy op

* fix license

* fix

* add test and fix bug

* [NPU] add Assign OP (#31561)

* add assign op

* add test assign npu test

* dele if def
Co-authored-by: oyjxer <1728722986@qq.com>

* [NPU] fix npu op elementwise_mul_grad (#31592)

* 【NPU】Support npu op gelu and gelu_grad (#31530)

* Support npu op gelu and gelu_grad

* Support npu op gelu and gelu_grad

* [NPU] fix assign cmake (#31595)

* fix gather_grad bug (#31607)

* [NPU] add range op (#31560)

* add range op

* fix codestyle; call GetSize directly
Co-authored-by: oyjxer <1728722986@qq.com>

* 【NPU】Support npu op elementwise_div and elementwise_div_grad (#31573)

* Support npu op elementwise_div and elementwise_div_grad

* Support npu op elementwise_div and elementwise_div_grad

* Support npu op elementwise_div and elementwise_div_grad

* [NPU] Support npu op log, log_grad, sqrt, sqrt_grad, square, tanh and tanh_grad (#31600)

* [NPU] Support npu op logicalnot_op (#31534)

* [NPU] Support npu op elementwise_min (#31575)

* [NPU] Support npu op elementwise_pow (#31576)

* [NPU] Support npu op table_lookup_v2 and table_lookup_v2_grad (#31399)

* [npu] support npu kernel `table_lookup_v2`

* clean up

* +python test

* +cmake

* clean up

* remove int8 kernel
+ python unittest for fp16

* clean up

* [NPU] support npu kernel for `less_than` (#31327)

* [npu] support npu kernel for `less than`

* remove int* kernel

* cleanup

* [NPU] Support npu kernel scatter op (#31624)

* Support npu kernel scatter op

* Add more test

* [NPU] fix allocator min chunk size (#31632)

* [NPU] Support NPU kernel cast op (#31635)
Co-authored-by: frankwhzhang <frankwhzhang@126.com>

* [NPU] add npu kernel for sgd (#31639)

* 【NPU】Support NPU kernel for reduce_sum op v2 (#31620)

* add reduce_sum

* fix broadcastd

* fix test

* fix

* add unsqueeze in reduce_sum

* add template

* add unittest for keep_dim

* test reduce_all
Co-authored-by: frankwhzhang <frankwhzhang@126.com>

* [NPU] add npu kernel for adam (#31644)

* add npu kernel for adam

* refine code

* disable test

* modify atol

* 【NPU】Support npu kernel for mul op (#31584)

* add mul

* add test mul

* [NPU] add npu kernel for softmax_with_cross_entropy (#31656)

* init

* fix bugs

* [NPU] add npu kernel for mean Op (#31562)

* update mean op

* update mean op

* give a better test activation
Co-authored-by: oyjxer <1728722986@qq.com>

* Revert "[NPU] add npu kernel for mean Op (#31562)" (#31665)

This reverts commit 468ac699.

* 【NPU】Add TensorCopy to NPU kernel for reduce_sum op  (#31667)

* update unittest

* add TensorCopy in npu grad kernel

* [NPU] Support npu op `expand` (#31405)

* [npu] support npu kernel  for `expand`

* [NPU] fix shape of dx in mul_grad (#31675)

* fix shape of dx

* refine code

* [NPU] add Increment op (#31563)

* add increment

* fix

* update test increment op inplace

* update increment op

* increment b = 2
Co-authored-by: oyjxer <1728722986@qq.com>

* [NPU] add NPU add topk  (#31596)

* add topk op

* add cmake

* update topk npu op

* refactor func

* fix test not go npu TopKD bug

* NPUPlace(4) to NPUPlace(0)

* update comment
Co-authored-by: oyjxer <1728722986@qq.com>

* [NPU] Support NPU kernel sum op (#31671)

* [NPU] npu support `transpose` (#31486)

* cherry-pick 31564, solve conflict

* [NPU] Fix bug: Fix calculation errors of pow grad npu kernel (#31699)

* [NPU] Support testing grad of NPU ops in OpTest (#31697)

* [NPU] Support NPU kernel of stack op (#31711)

* [NPU] Remove redundant ctest of top_k_op_npu_test (#31718)

* [NPU] fix reshape npu op kernel (#31726)

* rename npu op file

* fix reshape

* [NPU] change transpose to transpose2 (#31734)

* change transpose to transpose2

* fix bug

* [NPU] Support  mean npu kernel (#31729)

* [NPU] fix some bugs of npu op (#31739)

* fix softmax

* fix mean

* fix lookup_table_v2

* 【NPU】Fix npu kernel elementwise_div_grad  (#31753)

* [NPU] fix the grad kernel diff bug of gather op (#31757)

* fix gather grad kernel diff

* fix gather grad kernel diff

* fix gather review bug

* 【NPU】Fix reshape test & add grad test (#31776)

* fix

* fix

* [NPU] support fp16 for npu accuracy op (#31797)

* [NPU] support list of tensor input (#31801)

* support list of tensor as npu input

* add comment

* fix typo

* fix typo

* [NPU] add npu kernel for concat op (#31695)

* add npu kernel for concat op

* add npu kernel for concat op

* refine code

* update

* refine concat_grad

* [NPU] Support npu kernel for op elementwise_floordiv (#31822)

* [NPU] fix bug of lookup_table_v2_grad (#31834)

* [NPU] support default stream (#31510)

* [NPU] support mixed precision input for npu layer norm (#31847)

* support mixed precision input for npu layer norm

* fix layer_norm npu kernel
Co-authored-by: zhiqiu <chenqiuliang@baidu.com>

* 【NPU】Support npu kernel for update_loss_scaling op (#31830)

* add update_loss_scaling_npu NPU kernel

* change TensorFromVec to Memset

* fix compile problem (#31850)

* [NPU] support npu for conditional_block op (#31854)

* 【NPU】Add int dtype kernel for reshape2 op (#31864)

* fix

* fix

* [NPU] fix some op bugs (#31855)

* fix some op bugs

* fix some bugs

* follow comments

* fix log level

* add ut

* [NPU] support fp16 of input for api pow (#31871)

* [NPU] add npu kernel for truncated_gaussian_random op (#31654)

* init

* add todo

* add npu kernel for truncated_gaussian_random

* add sync

* fix concat_grad

* fix typo

* fix compile

* fix compile

* fix compile

* fix compile

* fix compile

* fix compile

* fix code style

* fix code style

* fix code

* Fix op test (#32231)

* fix conditional block (#32243)

* fix style code
Co-authored-by: xiayanming <41795079@qq.com>
Co-authored-by: Leo Chen <chenqiuliang@baidu.com>
Co-authored-by: liym27 <33742067+liym27@users.noreply.github.com>
Co-authored-by: Reventon_L <luyuxiang1994@qq.com>
Co-authored-by: root <xiayanming@baidu.com>
Co-authored-by: oyjxer <1728722986@qq.com>
Co-authored-by: yinhaofeng <66763551+yinhaofeng@users.noreply.github.com>
Co-authored-by: OleNet <olenet@126.com>
Co-authored-by: Meiyim <chen_xuyi@outlook.com>
Co-authored-by: oyxuan-11 <963650125@qq.com>
Co-authored-by: pangyoki <pangyoki@126.com>

09 Apr, 2021 · 1 commit

[NPU] cherry-pick basic NPU components/allocator/operator/executor supports from ascendrc (#32144) · ccf5709d

Committed by Leo Chen on Apr 09, 2021

* [feature] support npu allocator (#30840)

[feature] support npu allocator

* [feature] support npu operator (#30951)

[feature] support npu operator

* [feature] support npu allocator, part 2 (#30972)

* support npu allocator

* add npu device context

* fix some compile problem

* fix some compile problem

* add npu info

* compile ok

* fix include dir

* support naive_best_fit_allocator

* run ut ok, but failed to exit

* call aclrtResetDevice before exit

* fix aclFinalize

* add system allocator test

* add selected_gpus in gtest

* add tensor_test for npu

* support npu op, initial commit

* add npu stream

* add elementwise_add_op

* compile ok

* fix typo

* fix elementwise_add_op_npu_test

* support op run

* test can run but failed

* change aclopExecuteV2 to aclopCompileAndExecute

* support parsing ascend rank table file (#31000)

support parsing ascend rank table file

* Fix reshape on GE graph. (#31084)

Fix reshape on GE graph

* add npu kernel for elementwise_sub and elementwise_sub_grad (#30973)

* add npu sub op

* fix typo

* rename test

* fix bug

* fix bug

* add fp16 kernel

* fix typo

* support sub grad op

* support elementwise_sub_grad op
Co-authored-by: frankwhzhang <frankwhzhang@126.com>

* Fix compilation problem (#31100)

Fix compilation problem (#31100)

* fix compile

* fix code style

* remove const_cast

* support adding correct npu op in pybind.h (#31143)

* support adding correct npu op in pybind.h

* refine code

* [NPU] Support executor with NPU (#31057)

* [NPU] Support executor with NPU

* Fix code according to reviews

* Fix code

* Add unittest for sub op npu

* refactor npu device manager (#31154)

refactor npu device manager (#31154)

* fix selected npus

* fix compile

* fix reading flags from env

* format
Co-authored-by: xiayanming <41795079@qq.com>
Co-authored-by: gongweibao <weibao.gong@gmail.com>
Co-authored-by: frankwhzhang <frankwhzhang@126.com>
Co-authored-by: liym27 <33742067+liym27@users.noreply.github.com>

PaddlePaddle / Paddle · synced successfully about 1 year ago