Commits · b4a938840967f8e0d59a588a50a77f54f63961c8 · Crayon鑫 / Paddle

10 Jun, 2022 · 14 commits
- optimize bwd layer_norm kernel with fast method (#42491) · b4a93884
  Committed by limingshu on Jun 10, 2022
- [MLU] add mlu kernel for clip (#43229) · 798e2e7e
  Committed by 光明和真理 on Jun 10, 2022
- [MLU]add mlu kernel for scatter op (#43292) · 9ad05afd
  Committed by fuyou765 on Jun 10, 2022
- revert PR43039 (#43384) · ac75617a
  Committed by Wilber on Jun 10, 2022
- [BugFix]Fix dims mismatch when run rec_svtrnet model in eager mode (#43373) · cdeb3167
  Committed by YuanRisheng on Jun 10, 2022
```
* change tensor name

* fix unittest bugs
```
- fix nullptr (#43370) · acfd7129
  Committed by sneaxiy on Jun 10, 2022
- make all phi kernels to 2(host/device) static libraries directly (#43247) · 5781999d
  Committed by Leo Chen on Jun 10, 2022
```
* make all phi kernels to 2(host/device) static libraries directly

* fix calling kernel_declare

* fix compile

* fix cpu compile

* fix rocm compile

* fix xpu compile

* fix xpu kp compile

* fix inference compile
```
- [Hackathon No.28] implement logcumsumexp (#42267) · 19a7524f
  Committed by tiancaishaonvjituizi on Jun 10, 2022
- add new field for event node (#43223) · 06de4891
  Committed by chenjian on Jun 10, 2022
```
* add new field for event node

* fix

* fix bug

* fix bug

* fix clang

* fix clang format

* fix code format
```
- [MLU]add mlu kernel for sqrt op (#43326) · 6d3a68cb
  Committed by cambriconhsq on Jun 10, 2022
- add unary ops (#773) (#43363) · 8045fcfd
  Committed by Allen Guo on Jun 10, 2022
```
* add unary ops

* move to activation_ops
```
- Refactor DeviceContextPool (#42901) · 114723c9
  Committed by Ruibiao Chen on Jun 10, 2022
```
* Refactor DeviceContextPool

* Adjust header file order
```
- Re-implemented check_finite_and_unscale_op with newly added xdnn api (#42960) · 6197fbf6
  Committed by enzodechine on Jun 10, 2022
```
* Re-implemented check_finite_and_unscale_op  with newly added xdnn api
*test=kunlun

* Re-implemented check_finite_and_unscale_op  with newly added xdnn api

*test=kunlun
```
- [MLU] add randperm kernel and reduce_prod kernel (#43357) · b07f469b
  Committed by fwenguang on Jun 10, 2022
09 Jun, 2022 · 13 commits
- [MLU] add mlu meshgrid kernel (#43271) · 0d719718
  Committed by fwenguang on Jun 09, 2022
- [sparse inference] Supporting 2:4 sparse inference (#43179) · 20b38cfa
  Committed by minghaoBD on Jun 09, 2022
- [MLU]add mlu kernel for range op (#43296) · 1a80b484
  Committed by fuyou765 on Jun 09, 2022
- add mlu gather_nd kernel (#43344) · 0454b777
  Committed by cifar10 on Jun 09, 2022
- [Bug fix] Do not quantize weights Y when matmul X and Y both other ops outputs (#43297) · 06d999f6
  Committed by lidanqing on Jun 09, 2022
```
* fix some matmul that X and Y both other ops outputs, do not dequantize the Y.

* fix CI format

* fix according to review
```
- Add nproc_per_node for DistributedFusedLamb (#43295) · 6678def9
  Committed by sneaxiy on Jun 09, 2022
```
* add nproc_per_node for DistributedFusedLamb

* fix nproc_per_node communicator bug

* fix ring_id = 1 init bug

* fix ci

* fix test_parallel_executor_mnist.py
```
- Fix shrink downstream op bugs for standalone executor (#43330) · 04294f80
  Committed by Ruibiao Chen on Jun 09, 2022
- [MLU]add mlu kernel for conv2dtransposed op (#43233) · c96f7a29
  Committed by cambriconhsq on Jun 09, 2022
- [part1] fix sign-compare warning (#43276) · c49f35cf
  Committed by zhangchunle on Jun 09, 2022
```
* fix sign-compare warning

* fix sign-compare 2
```
- Implement dropout_nd operator to optimize dropout with axis not None. (#42463) · caa57498
  Committed by crystal on Jun 09, 2022
```
Co-authored-by: Liu Yiqun <liuyiqun01@baidu.com>
```
- fix scale pass when "conditional_block" or "while" is before "scale" (#43323) · 8585279f
  Committed by zhupengyang on Jun 09, 2022
- Export symbols of phi operator library (#43336) · c43a1ff6
  Committed by weishengying on Jun 09, 2022
- [Eager] fix pylayer forward output code (#43331) · e81f28f0
  Committed by wanghuancoder on Jun 09, 2022
```
* fix pylayer forward output code

* refine
```
08 Jun, 2022 · 13 commits
- thread_local method to support predictor stream. (#42785) · cab0f2f5
  Committed by Wilber on Jun 08, 2022
- disable lite gpu (#43177) · 811d57d8
  Committed by zhupengyang on Jun 08, 2022
- [NPU] fix reduce_max (#43230) · 07ede118
  Committed by Aganlengzi on Jun 08, 2022
- test=document_fix (#43322) · 3adeea60
  Committed by zhangchunle on Jun 08, 2022
- call_once (#43206) · cad139a7
  Committed by xiaoxiaohehe001 on Jun 08, 2022
- fix_ernie_unitest (#43283) · 0ffaf049
  Committed by Wangzheee on Jun 08, 2022
- fix tensor copy bug (#43299) · 88216f63
  Committed by zyfncg on Jun 08, 2022
- [Phi]Move group op kernel into PHI and add yaml / unittest (#43104) · 99c6497b
  Committed by YuanRisheng on Jun 08, 2022
```
* move_group_norm

* move group norm backward

* fix code format

* modify code according comment
```
- [MLU] add logical ops (#43286) · 8bd3514c
  Committed by fwenguang on Jun 08, 2022
- [Paddle-Inference]support matmulv2 in multihead (#43269) · 1fbd4440
  Committed by Wangzheee on Jun 08, 2022
```
* support matmulv2 in multihead
```
- fix bugs (#43294) · e1a34bc4
  Committed by YuanRisheng on Jun 08, 2022
- Fix wrong reduce_dims in fused_gate_attention and optimize the memory usage. (#43216) · 10f8637c
  Committed by Yiqun Liu on Jun 08, 2022
```
* Polish codes and memory usage for fused_gate_attention.

* Fix wrong reduce_dims in fused_gate_attention when computing gradient of nonbatched_bias.
```
- [GPUPS]Optimize dump_pool_to_cpu for dymf (#43219) · 9c17688a
  Committed by zmxdream on Jun 08, 2022
```
* optimize dump_to_cpu for dymf

* code clean. test=develop

* fix func. test=develop

* fix code style. test=develop

* fix. test=develop
```

Crayon鑫 / Paddle is in sync with the fork's source project