Commits · 9070d5c5d85e15a04324b6a5f2f1e2c9a7ecc1b6 · PaddlePaddle / Paddle

02 Mar, 2022 13 commits
- test=document_fix;record py3 case time (#40018) · 9070d5c5
  Committed by zhangchunle on Mar 02, 2022
- [infrt] speed up the infrt ci. test=devvelop (#40032) · 36660d4c
  Committed by 王明冬 on Mar 02, 2022
- [MLU] add transpose2 mlu kernel (#39994) · 4cab812e
  Committed by fwenguang on Mar 02, 2022
- add_new_comm_primitive (#40040) · 4e00d2bb
  Committed by Baibaifan on Mar 02, 2022
- fix unittests for eignvalsh (#39841) · aa47297a
  Committed by lkylkylky on Mar 02, 2022
- optimize CUDA implementaion of randint OP (#39952) · fb635089
  Committed by zhouweiwei2014 on Mar 02, 2022
```
* change CUDA implementaion of randint OP,move distribution common func to phi

* fix CI

* fix CI
```
- [Eager] open eager when WITH_PYTHON (#39979) · 9af72957
  Committed by wanghuancoder on Mar 02, 2022
```
* open eager when WITH_PYTHON, test=develop

* refine, test=develop

* refine, test=develop

* add DWITH_PYTHON for gen_fluid_lib, test=develop
```
- ernie: revert skip_layernorm_fp16 (#39991) · 26e2b918
  Committed by Wangzheee on Mar 02, 2022
- add share external data interface (#39809) · 1ff1c1e0
  Committed by JingZhuangzhuang on Mar 02, 2022
- [Pten] Gru lstm migration (#39729) · e4dba69a
  Committed by Feiyu Chan on Mar 02, 2022
```
* move sequence2batch

* move lstm and gru

* Add phi/kernels directory into exclusion to stop using hipcc to compile non .cu files in it.
```
- [Eager] Support gnn ptb_rnn in eager mode (#39993) · dbcf8797
  Committed by Weilong Wu on Mar 02, 2022
- Fix bug for prepare phi OP (#40033) · fb0cadfd
  Committed by From00 on Mar 02, 2022
- update pd_2_trt lower pass (#40019) · acdf0663
  Committed by Shang Zhizhou on Mar 02, 2022
```
* update pd_2_trt lower pass

* update pd_2_trt lower pass

* update style

* udpate

* change trt.graph to trt.create_engine

* update comments

* update comments

* add test
```
01 Mar, 2022 27 commits
- Added attr & tensor type mapping for final state codegen (#39997) · 852a872f
  Committed by Zhanlue Yang on Mar 01, 2022
- [ROCM] fix to get rocm number in script, test=develop (#39938) · 72e462cd
  Committed by Qi Li on Mar 01, 2022
- fix bug of paddle.to_tensor and paddle.moveaxis (#39662) · 4617c1b2
  Committed by zhouweiwei2014 on Mar 01, 2022
```
* fix bug of paddle.to_tensor and paddle.moveaxis

* fix CI
```
- fix compiling and running with ipu (#39920) · 69ab2700
  Committed by Allen Guo on Mar 01, 2022
- [Phi]rm reduce infershape (#39820) · 09039636
  Committed by chentianyu03 on Mar 01, 2022
```
* modify infershape utils and rm reduce infershape

* merge develop

* fix infermete bug

* add IsForInferShape func in ArgumentMappingContext

* add reduce_mean infermeta

* modify annotation

* add default dims
```
- [phi] tranfer the selu_op and pass the CI (#39819) · 197da15a
  Committed by xiongkun on Mar 01, 2022
```
* tranfer the selu_op and pass the CI

* add sig files

* fix code

* fix by code review

* remove TOOD

* change the include position

* change the head position
```
- Add function description for Kernel Primitive API (#39884) · 255bf609
  Committed by niuliling123 on Mar 01, 2022
```
* Add function description for Kernel Primitive API
1. Set cumsum and sort share memory size = 1024
2.sort and cumsum api limitation : blockDim.x must be less than 512 (blockDim.x <= 512)
```
- Fixed auto codegen for intermediate tensors (#39797) · 2592805b
  Committed by Zhanlue Yang on Mar 01, 2022
```
* Refactored GradNodeAccumulation data structure and behaviour

* Fixed CI issues

* Fix compilation issues

* Fixed minor issues

* Reverted changes for intermediate and OverwriteOutput

* fixed minor issue

* Fixed auto codegen for intermediate tensors

* Removed restriction on AccumulationNode modification

* Fixed CI Coverage issues

* Adjusted Log contents

* Fixed CI issues
```
- Add mobilenetv3_large performance test for bf16 and int8 (#39738) · eb7c211a
  Committed by joanna.wozna.intel on Mar 01, 2022
```
* Add mobilenetv3_large performance test

* Disable the BF16 test if the device does not support BF16 computations

* Change test timeout
```
- [bf16] add bf16 kernel: layer_norm p_norm reduce_sum (#39843) · ce8ed978
  Committed by zhangbo9674 on Mar 01, 2022
```
* add layer norm

* add p norm

* add reduce sum

* refine layer norm register bf16 for cudnn811

* add bf16 cast for hip

* add unittest

* refine rocm

* refine layer_norm unittest

* refine reduce op

* refine unittest

* enhance atol for reduce unittest
```
- remove conv_affine_channel_fuse_pass (#39817) · fc06be9d
  Committed by wenbin on Mar 01, 2022
```
* remove

* pass

* more pass
```
- add test_warpctc_op in mac (#39983) · 25650774
  Committed by zhangchunle on Mar 01, 2022
- [bf16] add bf16 kernel: scale gather sum (#39683) · 6d26b332
  Committed by zhangbo9674 on Mar 01, 2022
```
* add scale gather sum

* refine CUDA_ATOMIC_WRAPPER ADD for bf16

* add gather unittest

* solve conflict

* add scale uinttest

* add sum unittest

* solve conflict

* refine gather unittest

* refine unittest
```
- add MasterParam and MasterParamOut for sparse_momentum op (#39969) · 9de79892
  Committed by Guoxia Wang on Mar 01, 2022
- change tests_v2 to dynamic_tests_v2 in CI op benchmark (#39995) · 4204b97a
  Committed by pangyoki on Mar 01, 2022
- update error_string when target is out of bound (#40001) · a7acfc5b
  Committed by HydrogenSulfate on Mar 01, 2022
- [phi] migrate where kernel into phi (#39811) · 468a2a17
  Committed by ronnywang on Mar 01, 2022
- [PHI] Remove reseting dtype, layout and allocation by arg_def for outputs in executor (#39781) · 4fbcf6f4
  Committed by zyfncg on Mar 01, 2022
```
* remove SetAllocationForOutputTenosr

* add place param for copy kernel

* recover SetAllocationForOutputTenosr

* polish code

* fix empty_dev api bug

* remove reseting dtype and layout for output in executor

* fix merge bug

* [Phi] Add ClearHolder when re-alloc on new place in DeviceContext

* fix hostAlloc

* remove setting output allocation

* remove full_kernel_impl.h

* fix bug of xpu full_like
Co-authored-by: NAurelius84 <zhangliujie@baidu.com>
```
- [phi] move uniform_random to phi (#39937) · b3466387
  Committed by Leo Chen on Mar 01, 2022
```
* move uniform_random to phi

* fit selected_rows

* replace mutable_data
```
- [Phi] Support kps backend and kernel registry (#39941) · 08b43cce
  Committed by Chen Weihang on Mar 01, 2022
```
* support kps backend and compile

* resolve conflict

* fix kps backend trans

* test in xpu2 device

* remove dummy kernel
```
- optimize mergeadd for sparse_adam,*test=kunlun (#39966) · d4911594
  Committed by z8hanghuan on Mar 01, 2022
```
* optimize mergeadd for sparse_adam,*test=kunlun

* optimize mergeadd for sparse_adam,*test=kunlun

* optimize mergeadd for sparse_adam, *test=kunlun
```
- [PHI] Support Multi Input and Output for InferShape (#39870) · e8d45583
  Committed by zyfncg on Mar 01, 2022
```
* add multi input for infer_shape

* support multi output for infershape

* fix split bug

* fix bug of concat

* support vector<MetaTensor*> in infrt

* fix bug
```
- [Phi] Migrate logical_and/or/not/xor into Phi (#39942) · 8c237973
  Committed by Aurelius84 on Mar 01, 2022
```
* [Phi] Migrate logical_and/or/not/xor into Phi

* fix unittest

* fix function name
```
- [DP] Construct reducer group (#39987) · 4da841e0
  Committed by ShenLiang on Mar 01, 2022
```
* add reducer
```
- Optimize group_norm op forward (#39596) · 657dd5a9
  Committed by crystal on Mar 01, 2022
```
* optimize group norm forward

* use vectorized optimization

* add scalar calculation code

* optimize code
```
- remove dot infershape (#39945) · 75280d36
  Committed by chentianyu03 on Mar 01, 2022
- add type constrait for DenseTensor (#39967) · 4149cabe
  Committed by 王明冬 on Mar 01, 2022

PaddlePaddle / Paddle synced successfully about 1 year ago
