Commits · 65478332e1e165b0a367afaa9b2cbfd5762e9b16 · PaddlePaddle / Paddle

Mar 25, 2022 · 8 commits
- 王
  
  [infrt] add phi_dt.create_inited_dense_tensor.cpu.f32 kernel. (#40902) · 65478332
  Committed by 王明冬 on Mar 25, 2022
  
  65478332
- F
  
  move elementwise_max/min/mod into phi (#40590) · cfadf61b
  Committed by FlyingQianMM on Mar 25, 2022
  
  cfadf61b
- 0
  Fix loop index for FillZeroForEmptyGradInputs (#40909) · 3228fc34
  Committed by 0x45f on Mar 25, 2022
```
* Fix loop index for FillZeroForEmptyGradInputs

* Call fill zero in run_program_grad
```
  3228fc34
- S
  
  fix dependency (#40901) · c7b69fd2
  Committed by seemingwang on Mar 25, 2022
  
  c7b69fd2
- A
  [NPU] add merged_momentum (#40875) · 2b74b739
  Committed by Aganlengzi on Mar 25, 2022
```
* [NPU] add merged_momentum

* fix

* fix device
```
  2b74b739
- Z
  
  modify unit test in bn, stack and split. *test=kunlun (#40880) · 139a30ec
  Committed by Zhangjingyu06 on Mar 25, 2022
  
  139a30ec
- Z
  Scalar support marking data_type in yaml (#40867) · 04087012
  Committed by zyfncg on Mar 25, 2022
```
* Scalar support marking data_type in yaml

* fix code-gene bug
```
  04087012
- F
  support get_item where the index is a bool scalar tensor (#40829) · 0f5e90a2
  Committed by FlyingQianMM on Mar 25, 2022
```
* support get_item where the index is a bool scalar tensor

* add unittests for supporting get_item where the index is a bool scalar tensor
```
  0f5e90a2
Mar 24, 2022 · 32 commits
- C
  [Phi] Move mean op kernel into phi (#40872) · 8df91763
  Committed by Chen Weihang on Mar 24, 2022
```
* add mean phi kernel

* remove original mean kernel

* add alias name
```
  8df91763
- C
  [Phi] Move batch size like infershape into phi (#40847) · 6d3db9c7
  Committed by Chen Weihang on Mar 24, 2022
```
* move batch size like infershape

* revert other op change

* call infermeta in infershape

* adjust batchsize like pos
```
  6d3db9c7
- Z
  
  p_norm transfer to phi kernels (#40819) · 92afe146
  Committed by zhiboniu on Mar 24, 2022
  
  92afe146
- L
  
  [new-exec] enable standalone_executor_test in coverage (#40846) · 22a5035e
  Committed by Leo Chen on Mar 24, 2022
  
  22a5035e
- J
  fix build_cinn_pass internal var may be control var problem (#40812) · 310b7dba
  Committed by jiangcheng on Mar 24, 2022
```
* fix build_cinn_pass internal var may be control var problem

* add annotation and vlog by review advice
```
  310b7dba
- Z
  Support intermediate for Sparse API (#40840) · 98244a9a
  Committed by zyfncg on Mar 24, 2022
```
* support intermediate for saprse api

* close intermediate in yaml

* fix dygraph_api dep for eager
```
  98244a9a
- Z
  [AMP] Support amp for Intermediate_dygraph (#40623) · c12f7d48
  Committed by zhangbo9674 on Mar 24, 2022
```
* approve amp for intermediate_dygraph

* add amp_utils for intermediate_dygraph

* add amp needcast check for mlu & npu

* test unittest

* add SetGradNode for set_stop_gradient && add checktensor for GradientHooks

* refine code

* refien unittest of imperative_amp for new dygraph

* inplace api skip amp

* add test_imperative_qat_amp for intermediate amp

* refine code

* refine test_amp ci strategy

* refine unittest code

* refine amp_utils code

* refine amp getpromotetype for some special op

* refine unittest code
```
  c12f7d48
- A
  
  [phi] Remove usless cmake message (#40884) · 38d1fe34
  Committed by Aurelius84 on Mar 24, 2022
  
  38d1fe34
- J
  Correct MultipleQuantizeSquash (#40717) · 753964a2
  Committed by joanna.wozna.intel on Mar 24, 2022
```
* Correct MultipleQuantizeSquash

* Correct logging
```
  753964a2
- R
  
  the `defaults` in FullArgSpec may be `None` (#40882) · 99541895
  Committed by Ren Wei (任卫) on Mar 24, 2022
  
  99541895
- R
  [MoE]Assign pos op (#40580) · 305f32d1
  Committed by Roc on Mar 24, 2022
```
* # This is a combination of 10 commits.
# The first commit's message is:
add expert count op

add ut for expert_count

# This is the 2nd commit message:

update UT only for cuda

# This is the 3rd commit message:

fix for rocm

# This is the 4th commit message:

update ut

# This is the 5th commit message:

add moe module

# This is the 6th commit message:

add expert count op

add ut for expert_count

# This is the 7th commit message:

update UT only for cuda

# This is the 8th commit message:

update ut

# This is the 9th commit message:

add moe module

# This is the 10th commit message:

make expert count private

* add assign pos op

* fix upper num name

* add api _assign pos

* add ut for assign pos op

* update date

* fix for win

* update for test (timeout)

* fix ut

* update

* fix ut for number count
Co-authored-by: hlygit66666 <2570058140@qq.com>
```
  305f32d1
- L
  
  Wrap dist api for dygraph mode (#40408) · 9d8cfc1b
  Committed by lilong12 on Mar 24, 2022
  
  9d8cfc1b
- G
  
  support dp for class_center_sample and margin_cross_entropy (#39852) · bff9e28e
  Committed by Guoxia Wang on Mar 24, 2022
  
  bff9e28e
- S
  make vcvars64 and cuda_version can be set in xly pipe (#40870) · a9164245
  Committed by Sing_chan on Mar 24, 2022
```
* make vcvars64 and cuda_version can be set in xly pipe

* make third_party_path reused by ci and build pipe;test=windows_ci_inference;test=windows_op;test=windows_ci
```
  a9164245
- T
  
  Clean api workspace (#40885) · 83906bcf
  Committed by tianshuo78520a on Mar 24, 2022
  
  83906bcf
- L
  Refine events waiter (#40876) · 36ee6dd3
  Committed by liutiexing on Mar 24, 2022
```
* add align for WorkQueue

* add spinlock

* merge develop

* merge

* Add EventsWaiter

* Add EventsWaiter

* update

* Revert "Add EventsWaiter"

This reverts commit e206173aa9be7401b83a53581627bfaf557c8fb2.

* update

* update Error MSG

* update EventsWaiter

* update
Co-authored-by: liutiexing <liutiexing@google.com>
```
  36ee6dd3
- K
  test=document_fix , fix launch doc (#40848) · 2e8f9882
  Committed by kuizhiqing on Mar 24, 2022
```
* test=document_fix , fix launch doc

* test=document_fix , fix typo
```
  2e8f9882
- J
  Fix rnn, wmt16 docs;test=document_fix (#40783) · cc8e98c7
  Committed by Jack Zhou on Mar 24, 2022
```
* Fix rnn, wmt16 docs;test=document_fix

* Fix wmt14 docs;test=document_fix

* Add more description;test=document_fix
```
  cc8e98c7
- X
  [Auto Parallel] Gradient merge pass support dist attribute (#40737) · 0443c6f4
  Committed by xiayanming on Mar 24, 2022
```
* [Auto Parallel] gradient merge pass support dist attribute
```
  0443c6f4
- Z
  
  Add sparse convertion api and sparse creation api (#40780) · a8f86600
  Committed by zhangkaihuo on Mar 24, 2022
  
  a8f86600
- Z
  
  modify communicator api (#40881) · f95f3a65
  Committed by zhaocaibei123 on Mar 24, 2022
  
  f95f3a65
- C
  [Phi] Migrate InferShape of multiplex, qr, tril_triu (#40102) · 2e736531
  Committed by caozhou on Mar 24, 2022
```
* migrate infershape

* fix tril_triu infershape error

* fix qr_op infershape

* add parse qr mode func

* move order
```
  2e736531
- H
  
  [Infrt] add method for automatically scanning pass and kernel info (#40822) · f51a5791
  Committed by huzhiqiang on Mar 24, 2022
  
  f51a5791
- Z
  [Refactor] refactored eager_gen.py PR #1 (#40815) · 68c9e3e4
  Committed by Zhanlue Yang on Mar 24, 2022
```
* [Refactor] refactored eager_gen.py PR #1

* [Refactor] refactored eager_gen.py PR #1

* Refactored version 2

* Added automatic code generation utils

* Fixed merge issues
```
  68c9e3e4
- H
  
  [Infrt] upgrade kernel launcher fun generator (#40826) · 7fa3a724
  Committed by huzhiqiang on Mar 24, 2022
  
  7fa3a724
- S
  
  smaller the retry_times since random failure in windows is rare (#40857) · 0bcb4f85
  Committed by Sing_chan on Mar 24, 2022
  
  0bcb4f85
- R
  
  [custom runtime] clear headers (#40845) · d3a43477
  Committed by ronnywang on Mar 24, 2022
  
  d3a43477
- J
  
  test=document_fix (#40861) · 01339433
  Committed by Jiabin Yang on Mar 24, 2022
  
  01339433
- K
  
  fix device id env (#40844) · 8562668e
  Committed by kuizhiqing on Mar 24, 2022
  
  8562668e
- Z
  
  Modified paddle build script for 2.3 release (#40863) · 1d60e819
  Committed by Zhanlue Yang on Mar 24, 2022
  
  1d60e819
- 王
  
  [infrt] fix bug in emit si32 attribute. (#40860) · d5bebf0b
  Committed by 王明冬 on Mar 24, 2022
  
  d5bebf0b
- S
  test gpu graph engine's performance (#40775) · 83ae1619
  Committed by seemingwang on Mar 24, 2022
```
* extract sub-graph

* graph-engine merging

* fix

* fix

* fix heter-ps config

* test performance

* test performance

* test performance

* test

* test

* update bfs

* change cmake
```
  83ae1619

PaddlePaddle / Paddle · last synced successfully more than 1 year ago