Commits · 7003dcaa2f5814da8584d5cf3b9b1a97cffdc8f2 · PaddlePaddle / Paddle

21 Apr, 2022 11 commits
- S
  Support FP16 argmax/argmin kernel (#42038) · 7003dcaa
  Committed by sneaxiy on Apr 21, 2022
```
* support int16 argmax kernel

* add fp16 test
```
  7003dcaa
- Z
  
  modify batch_norm and batch_norm_grad. *test=kunlun (#41976) · 9774f965
  Committed by Zhangjingyu06 on Apr 21, 2022
  
  9774f965
- J
  update ampere sm (#42023) · c3b0b680
  Committed by JingZhuangzhuang on Apr 21, 2022
```
* update ampere sm

* update ampere sm

* update ampere sm
```
  c3b0b680
- S
  
  block kernel_signature in windows (#42033) · 2f283997
  Committed by Sing_chan on Apr 21, 2022
  
  2f283997
- Z
  Move pass optimizations into CINN. (#42047) · 83d6e315
  Committed by Zhen Wang on Apr 20, 2022
```
* Move pass optimizations into CINN.

* Update the commit id of used cinn codes.
```
  83d6e315
- W
  infer add io stream. (#42031) · 0d28ee29
  Committed by Wilber on Apr 21, 2022
```
* infer add io stream.

* add macro
```
  0d28ee29
- R
  Support cinn_launch op in standalone executor (#42046) · f2f1de7b
  Committed by Ruibiao Chen on Apr 21, 2022
```
* Support cinn_launch OP in standalone executor

* Remove some redundant code
```
  f2f1de7b
- W
  
  [Eager] Support numpy.narray as input for eager expand (#42043) · 3da8066a
  Committed by Weilong Wu on Apr 21, 2022
  
  3da8066a
- P
  add _grad_name and _grad_value for eager tensor (#41990) · 1bf2eeab
  Committed by pangyoki on Apr 21, 2022
```
* add _grad_name and _grad_value for eager tensor

* fix paddle_enforce

* fix paddle_enforce 2

* fix grad_name

* _grad_value return lodtensor rather than tensor

* fix
```
  1bf2eeab
- D
  
  fix api math equation dispaly issue; test=document_fix (#42058) · f5ac9961
  Committed by David Nicolas on Apr 21, 2022
  
  f5ac9961
- A
  
  [Eager]Fix SetDeviceId in eager_final_state_api from python_c_gen.py (#42025) · 94ffda57
  Committed by Aurelius84 on Apr 21, 2022
  
  94ffda57
20 Apr, 2022 12 commits
- J
  
  fix adaptive pool pass (#42019) · 747ba3f8
  Committed by JingZhuangzhuang on Apr 20, 2022
  
  747ba3f8
- W
  
  [Eager] remove useless logic (#42020) · d67abac6
  Committed by Weilong Wu on Apr 20, 2022
  
  d67abac6
- L
  
  be compatible with the old version of alltoall (#42007) · c6a084ef
  Committed by lilong12 on Apr 20, 2022
  
  c6a084ef
- B
  [PaddlePaddle Hackathon 2] No. 9: Add logspace API to Paddle (#41261) · a3c50c42
  Committed by BrilliantYuKaimin on Apr 20, 2022
```
* Add operator description for logspace

* Add shape inference for logspace

* Add kernel implementation for logspace

* Add logspace API in Python

* Add unit tests for logspace

* Add logspace

* Update logspace_kernel.cu

* Update logspace_op.cc

* Adjust code formatting

* Update doc of logspace

* Update tensor.py

* Update logspace_op.cc

* Update logspace_kernel.cc

* Update logspace_kernel.cu

* Update test_logspace.py

* Adjust the position of logspace

* Adjust code formatting
```
  a3c50c42
- Y
  Fix paddle.t doc en and the annotation display on 4 en docs (#41699) · 885171e3
  Committed by Yilingyelu on Apr 20, 2022
```
* gradients; test=document_fix

* fix VarType; test=document_fix

* fix vartype; test=document_fix

* cumsum; test=document_fix

* t; test=document_fix
```
  885171e3
- B
  
  update demo_ci ut threshold (#41981) · 65a5492a
  Committed by baoachun on Apr 20, 2022
  
  65a5492a
- H
  
  windows compile add onnxruntime switch (#41988) · 0f72c72c
  Committed by heliqi on Apr 20, 2022
  
  0f72c72c
- L
  [new-exec] clear the scope listener after run (#41947) · d4cf5666
  Committed by Leo Chen on Apr 20, 2022
```
* clear the listener after run

* only sync variables in program

* refine code

* fit for lod_tensor_blocking_queue
```
  d4cf5666
- F
  
  [MLU] add gather mlu kernel (#41969) · 23ad2166
  Committed by fwenguang on Apr 20, 2022
  
  23ad2166
- C
  
  fix inference custom op (#41999) · 30d8d114
  Committed by Chen Weihang on Apr 20, 2022
  
  30d8d114
- C
  [CustomOp] Fix custom op pinned input error (#41972) · f1711f24
  Committed by Chen Weihang on Apr 20, 2022
```
* fix custom op pinned input error

* fix compile error
```
  f1711f24
- T
  enable auto-tune when using cinn (#41795) · d70104e5
  Committed by TeFeng Chen on Apr 20, 2022
```
* optimize preparation overhead before executing cinn compiled program

* update code notes

* fix flag annotation

* enable auto-tune when using CINN

* update cinn commit tag

* skip test

* fix lacking header file
```
  d70104e5
19 Apr, 2022 17 commits
- C
  
  polish tensor api details (#41971) · e5c61b15
  Committed by Chen Weihang on Apr 19, 2022
  
  e5c61b15
- W
  double accessor and show_scale (#41943) · 8113c913
  Committed by wangguanqun on Apr 19, 2022
```
* double accessor and show_scale

* double accessor and show_scale

* rename

* fix bug in pslib config

* add unittest
```
  8113c913
- C
  reduce performance influence by RecordEvent in Python (#41822) · d3f95e5a
  Committed by chenjian on Apr 19, 2022
```
* reduce performance influence

* add unit test

* fix
```
  d3f95e5a
- J
  OneDNN md-in-tensor refactoring part 1: Added main changes for md-in-tensor (#41303) · c9f4fcf3
  Committed by jakpiase on Apr 19, 2022
```
* changes for md in tensor

* ci fix

* Temporarily limited dims for test

* ci fix

* removed unnecessary includes

* added reviewers suggestions

* checkouted two files to avoid changing more than 19 in single PR

* minor fix

* reverted one file to reduce files changed to 19
```
  c9f4fcf3
- B
  
  fix_nccl_barrier (#41970) · 3e8b6bbc
  Committed by Baibaifan on Apr 19, 2022
  
  3e8b6bbc
- C
  Rebase for profiler statistic ratio (#41939) · 9b54bf93
  Committed by chenjian on Apr 19, 2022
```
* fix according to suggestion

* add kernel summary

* improve coverage
```
  9b54bf93
- X
  
  fix StickBreakingTransform forward error when input rank is over 2 (#41940) · 2b55290e
  Committed by Xiaoxu Chen on Apr 19, 2022
  
  2b55290e
- K
  
  rm distri env (#41961) · 469e3198
  Committed by kuizhiqing on Apr 19, 2022
  
  469e3198
- J
  [Eager] make fast through to linear (#41945) · 8631d73a
  Committed by Jiabin Yang on Apr 19, 2022
```
* make fast through to linear

* make fast through to linear

* add to do for later upgrades

* support build once for now
```
  8631d73a
- Z
  
  Implement Amp Layout AutoTune (#41884) · c2bcb141
  Committed by Zhang Ting on Apr 19, 2022
  
  c2bcb141
- support bmm&bmm_grad for KL2, *test=kunlun (#41935) · 60bec700
  Committed by z8hanghuan on Apr 19, 2022
  
  60bec700
- S
  Cpu gpu graph engine (#41942) · 4f461ab9
  Committed by seemingwang on Apr 19, 2022
```
* extract sub-graph

* graph-engine merging

* fix

* fix

* fix heter-ps config

* test performance

* test performance

* test performance

* test

* test

* update bfs

* change cmake

* test

* test gpu speed

* gpu_graph_engine optimization

* add ssd layer to graph_engine

* fix allocation

* fix syntax error

* fix syntax error

* fix pscore class

* fix

* recover test

* recover test

* fix spelling

* recover

* fix

* fix linking problem

* remove comment
```
  4f461ab9
- G
  fix bug for MultiplicativeDecay (#41850) · 14573629
  Committed by guguguzi on Apr 19, 2022
```
* fix bug for MultiplicativeDecay

* remove changes to test_lr_scheduler.py
```
  14573629
- H
  
  [infrt] support resnet50 on gpu backend (#41473) · aa67c292
  Committed by huzhiqiang on Apr 19, 2022
  
  aa67c292
- A
  [Eager]Fix full_like/clip with np.generic type as attribute (#41808) · 9ac6b7ed
  Committed by Aurelius84 on Apr 19, 2022
```
* [Eager]Fix full_like/clip with np.generic type as attribute

* support numpy genertic

* remove usless code
```
  9ac6b7ed
- B
  update gpu fp16 op blacklist (#41703) · 55096a1c
  Committed by baoachun on Apr 19, 2022
```
* update gpu fp16 op blacklist

* update blacklist
```
  55096a1c
- Q
  
  [MLU]add op: cumsum, fill_any_like, unsqueeze (#41791) · 6da637e8
  Committed by qipengh on Apr 19, 2022
  
  6da637e8

PaddlePaddle / Paddle: synced successfully more than 1 year ago