Commits · 5a2e5179148119c745065bfeec6cce9b0ce1700c · BaiXuePrincess / Paddle

Oct 20, 2022 · 10 commits
- Add FusedMultiTransformer fuse pass for GPT3 (#45907) · 5a2e5179
  Committed by Kaipeng Deng on Oct 20, 2022
```
* add fused_multi_transformer_encoder/decoder pass, run GPT-3 success
```
- [MKLDNN] Delete mkldnn hard code of fc (#47138) · 4dc4d5fc
  Committed by HongyuJia on Oct 20, 2022
```
* remove fc mkldnn hardcode

* remove useless enum of kFCMKLDNN

* fix macro error

* update operators.cmake
```
- log only if > 0 (#47181) · d6208aad
  Committed by Sylwester Fraczek on Oct 20, 2022
- support compiling: with_distribute=on and with_pscore=off (#47192) · ad7aeb9e
  Committed by Xinger on Oct 20, 2022
- Add infer prune function (#47046) · af9486fc
  Committed by JingZhuangzhuang on Oct 20, 2022
```
* Add infer prune function

* Update phi.cmake

* Update operators.cmake

* add fusion op
```
- Add _get_phi_kernel_name interface (#47032) · 0508b94b
  Committed by JingZhuangzhuang on Oct 20, 2022
```
* add _get_phi_kernel_name interface

* remove inference interface

* Revert "remove inference interface"

This reverts commit 784a8a6c51fa2dc49a01c8699525298ac21b178f.
```
- fix gcc54 compile failed (#47172) · 68e27f35
  Committed by Chen Weihang on Oct 19, 2022
- [CodeStyle][W605] Add escape symbols to some strings (#46752) · e1c0461d
  Committed by Tony Cao on Oct 20, 2022
```
* Fix W605 in tools folder by adding escape symbols

* Fix W605 in incubate and some other folders

* Fix W605 in /fluid/test folders

* Update tools/analysisPyXml.py

Co-authored-by: Nyakku Shigure <sigure.qaq@gmail.com>

* Add some changes to manual and auto escape symbols

* revert changes in transformer.py

* Fix new code with W605 error: add escape symbols

* revert changes in transformer.py

* revert changes in transformer.py

Co-authored-by: Nyakku Shigure <sigure.qaq@gmail.com>
```
- [Eager, Performance optimization] support not equal to sink to cpp layer (#47174) · f0408778
  Committed by Weilong Wu on Oct 20, 2022
- [Wsign-compare] Close Wno-error=sign_compare (#47163) · 2246e884
  Committed by Li-fAngyU on Oct 20, 2022
```
* close Wno-error=sign_compare

* close Wno-error=sign_compare

* Update CMakeLists.txt
```
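
For context on the W605 cleanup in e1c0461d: the W605 check (pycodestyle/flake8) flags string literals containing a backslash sequence that Python does not recognize as an escape, such as `\d` inside a non-raw regex pattern. The sketch below is not taken from the Paddle diff; the pattern and the `find_versions` helper are hypothetical, and it only illustrates the two usual ways of making such a string W605-clean.

```python
import re

# A non-raw literal containing \d or \. would trigger W605 (invalid escape
# sequence). Both spellings below are W605-clean and produce the same string.
PATTERN_RAW = r"\d+\.\d+"          # raw string: backslashes are kept literally
PATTERN_ESCAPED = "\\d+\\.\\d+"    # explicit escaping of every backslash


def find_versions(text):
    """Return all 'major.minor' version-like substrings (hypothetical helper)."""
    return re.findall(PATTERN_RAW, text)
```

Raw strings are usually the more readable choice for regular expressions; explicit `\\` escaping is the alternative when a raw string is not practical.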
Oct 19, 2022 · 17 commits
- [Eager] polish general_grad (#47151) · b9c8c1b1
  Committed by Weilong Wu on Oct 19, 2022
- fix clang warning of [-Wformat] (#47137) · 7a2489e6
  Committed by Wang Xin on Oct 19, 2022
- modify timeout limitation, test=infer-coverage (#46831) · 89d481db
  Committed by RichardWooSJTU on Oct 19, 2022
- fix rpc compile bug (#47026) · e89d7297
  Committed by Xinger on Oct 19, 2022
- [CodeStyle][py2] remove `six` package (part 1) (#46965) · e6fb551c
  Committed by Nyakku Shigure on Oct 19, 2022
- add nvtxRangePush/Pop for naive_executor and refine some code (#47139) · de6e7431
  Committed by Yuanle Liu on Oct 19, 2022
- move the logic of mkldnn layout in GetKernelTypeForVar from ActivationOp to base class (#47104) · 95ca886c
  Committed by zyfncg on Oct 19, 2022
- Rename name of op and op_args in yaml to align python api (#46343) · 85489d39
  Committed by zyfncg on Oct 19, 2022
```
* rename op in yaml

* fix test_layout_autotune

* fix layout autotune of transpose
```
- Support stream overlap for c_allreduce_sum (#47030) · d00b7d83
  Committed by Ruibiao Chen on Oct 19, 2022
```
* Support stream overlap for c_allreduce_sum

* Test CI

* Add notes

* Add SingleStreamGuard for BuildOpFuncList
```
- Enable to record whether the conv algo is got by exhaustive search to fix... · 3bc4b850
  Committed by Yiqun Liu on Oct 19, 2022
```
Enable to record whether the conv algo is got by exhaustive search to fix autotune cache bug. (#47065)
```
- Support uniform api and sigmoid api in new AD (#46960) · af4bdede
  Committed by Charles-hit on Oct 19, 2022
```
* support uniform api in new ad

* add unit test for uniform_random_p

* resolve conflict

* fix uniform_random orig2prim

* fix primrules

* remove ShapeTensor and ShapeTensorList input in uniform_random_p op and add sigmoid orig2prim rules
```
- [Dy2St]Fix recurrent op eager deletion pass error in dy2st (#47105) · 94132190
  Committed by WangZhen on Oct 19, 2022
```
* Fix recurrent op eager deletion pass error in dy2st

* Polish code

* Refine error message
```
- slice op supports uint8_t (#47067) · 1e1c7275
  Committed by will-jl944 on Oct 19, 2022
- Construct exec and ctx only once in cond op to speed up (#47092) · 2814d7f6
  Committed by Hui Zhang on Oct 19, 2022
```
* cond infer apply exec separate

* fix bugs

* fix as comment
```
- clean unused code: piece.cc/h (#47103) · e435d695
  Committed by Leo Chen on Oct 19, 2022
```
* clean unused code: piece.cc/h

* clean usage
```
- fix old dygraph a vlog bug (#47115) · 3c39475d
  Committed by wanghuancoder on Oct 19, 2022
- fix build warning: [Wsign-compare] on linux (#46644) · be273ea9
  Committed by Li-fAngyU on Oct 19, 2022
Oct 18, 2022 · 10 commits
- Fix bugs in the General Plugin Mechanism (#47072) · 75b16781
  Committed by weishengying on Oct 18, 2022
- [Paddle-TRT]Rewrite strided_slice converter using shape tensor (#46819) · 5c0bfc18
  Committed by zhoutianzi666 on Oct 18, 2022
```
* Rewrite strided_slice converter using shape tensor
* clean code
```
- Merge layernorm trt fuse (#46320) · 5e9f491e
  Committed by Wang Bojun on Oct 18, 2022
```
* first version, accuracy corrected

* disable debug print

* use blockReduceSum in phi

* add UT

* add opCompat

* code style

* code refine

* bug fix

* code refine

* test fix

* bugfix

* code style fix

* code style

* code-style

* code-style

* code-style
```
- FC + activation fuse passes (#45183) · b7a23adb
  Committed by Sławomir Siwek on Oct 18, 2022
```
* git

* style

* leave default relu in kernel

* style

* cleanup FCMKLDNN pattern

* merge conflicts

* update develop

* update develop

* add const

* rename to oneDNN and adjust attributes

* whitespace
```
- Construct exec and ctx only once in cond op to speed up (#47009) · 42e312a1
  Committed by Hui Zhang on Oct 18, 2022
```
* cond infer apply exec separate

* fix bugs
```
- reconstruct code for convert_fp16 (#46428) · 1cc482b0
  Committed by Wilber on Oct 18, 2022
- [Paddle Inference] Add_expand_v2_trt_layer (#47002) · a21a2b5b
  Committed by xiaoxiaohehe001 on Oct 18, 2022
- [Eager, Performance optimization] support pow( ** operator) to sink to Cpp layer (#47077) · 62c0abac
  Committed by Weilong Wu on Oct 18, 2022
- [code-gen] Support code-gen for opmaker of sparse op (#46993) · bdd3dde3
  Committed by zyfncg on Oct 18, 2022
```
* support generating code of opmaker for backward op invoke forward op

* support code-gen of opmaker for sparse op

* refine logic of choosing phi kernel

* fix compile bug

* fix code_gen bug

* fix bug

* fix kernel signature code-gen

* fix compile bug of VarType

* fix compile bug of VarType

* fix test_sparse_conv_op

* fix test_sparse_norm_op
```
- delete GetExpectedKernelType mkldnn of conv_op (#47044) · a9c20660
  Committed by HongyuJia on Oct 18, 2022
Oct 17, 2022 · 3 commits

Add enable_partial_send_recv switch in pipeline_configs (#46992) · b9a2f29c

Committed by Ghost Screaming on Oct 17, 2022

* Fix bug of reduce_sum op. When input.numel() > INT32_MAX, its result is wrong.

* Support allow_partial switch, which can be configured in pipeline_configs. If the tensors sent from different hosts are not the same, they shouldn't be sent partially and then concatenated into a whole tensor.

* Change name allow_partial to enable_partial_send_recv.

* Add global variable _enable_partial_send_recv
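
The reduce_sum note above says results are wrong once `input.numel()` exceeds INT32_MAX. The commit message does not say where the 32-bit value lived or how it was fixed, so the following is only a generic sketch of why an element count stored in a signed 32-bit integer breaks past that limit. The `as_int32` helper and the example count are hypothetical and not Paddle code.

```python
import ctypes

INT32_MAX = 2**31 - 1


def as_int32(n):
    """Value that n takes when stored in a signed 32-bit integer.

    ctypes integer types perform no overflow checking, so out-of-range
    values wrap around the way a C int32_t would on common platforms.
    """
    return ctypes.c_int32(n).value


def fits_in_int32(numel):
    """True if an element count can safely be indexed with int32."""
    return 0 <= numel <= INT32_MAX


# Hypothetical element count just past the 32-bit limit.
numel = 2**31 + 8
wrapped = as_int32(numel)  # negative: a loop bounded by this never runs
```

A guard such as `fits_in_int32` is one common way to decide when a kernel must switch to 64-bit indexing; whether the actual fix took that form is not stated in the log.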

Support BF16 training for sharding (#46846) · 0b39b244

Committed by Ghost Screaming on Oct 17, 2022

* Fix bug of reduce_sum op. When input.numel() > INT32_MAX, its result is wrong.

* support pure bfloat16

* support bf16 linear

* update PR to pass CI

* tiny fix where_grad_kernel.cu

* Support bfloat16 type for reducer and sharding.

* Fix some bugs.

* Polish code.

* Polish code.

* Add bfloat16 datatype in fill_grad kernels.

Co-authored-by: sneaxiy <sneaxiy@126.com>

Revert "add common subexpression elimination (#44386)" (#47062) · 7c6835ca
Committed by hong on Oct 17, 2022
```
This reverts commit 166ff39a.
```

BaiXuePrincess / Paddle is in sync with its fork source project.