Commits · eb0e4d4bea2af2bae34511474c382a0daced0ad3 · PaddlePaddle / Paddle

Aug 22, 2023 · 6 commits
- [XPU] modify add_layernorm_xpu kernel (#56429) · eb0e4d4b
  Committed by jiangfan06 on Aug 22, 2023
- [Fluid] NO.4 Migrate c_split to PHI (#56327) · 5dc7ff04
  Committed by Ruibin Cheung on Aug 22, 2023
- [XPU][PHI Kernels] add index_put kernel for xpu (#56169) · 332a73b1
  Committed by lijin23 on Aug 22, 2023
```
* add inverse kernel for xpu

* add more kernels

* add index_put kernel for xpu

* add index_put kernel for xpu

* remove unused headers

* refine test

* wait to avoid memory bugs for xpu

* refine inverse
```
- fix delete_repeated_ops_pass, fix multiclass_nms3 (#56434) · 3e55f255
  Committed by zhupengyang on Aug 22, 2023
- [AutoParallel] Polish dist tensor design (#56368) · 8495377a
  Committed by Chen Weihang on Aug 22, 2023
```
* polish dist tensor design

* adjust constructor

* polish details

* polish details design

* fix compile error

* refactor init tensor impl

* fix reshard test

* polish details

* add unittest for coverage
```
- [Paddle Inference] refactor linear_compress (#55490) · ffff3da0
  Committed by FormlessUnit on Aug 22, 2023
```
* Modify kernels to support quantized_matmul

---------
Co-authored-by: superxf <1208713646@qq.com>
```
Aug 21, 2023 · 9 commits
- bug fix of operator "interp_linear" · 752f29a1
  Committed by idontkonwher on Aug 21, 2023
- bugfix, read and write race at fast_ln_fwd (#56435) · 1f987a75
  Committed by Jeng Bai-Cheng on Aug 21, 2023
- [GLCC]Part-1: Add pylayer op to Support @to_static (#56108) · 7577a67a
  Committed by Lu Qi on Aug 21, 2023
- [XPU] Add xpu plugin for reduce ops (#56389) · c6757bd3
  Committed by jiangfan06 on Aug 21, 2023
- 【Complex op】add complex support for numel (#56412) · f8cba26d
  Committed by Ryan on Aug 21, 2023
```
* add complex numel

* change test && add doc
```
- Add c_embedding forward compat op. (#56377) · 0668650f
  Committed by Ghost Screaming on Aug 21, 2023
```
* Add c_embedding forward compat op.

* Fix some bugs.

* Polish code style.
```
- remove namespace for dist attr and process mesh (#56449) · 1f94081d
  Committed by LiYuRio on Aug 21, 2023
- fix dynamic to static when export LLM inference model (#56390) · 95c4bb41
  Committed by RichardWooSJTU on Aug 21, 2023
- fix strided slice compute bug (#56428) · 14abff89
  Committed by wanghuancoder on Aug 21, 2023
```
* fix strided slice compute bug
```
Aug 18, 2023 · 8 commits
- fix stride with ir bug (#56420) · 9a7dc249
  Committed by wanghuancoder on Aug 18, 2023
- fix stride legacy inplace bug (#56418) · ed9ec699
  Committed by wanghuancoder on Aug 18, 2023
- fix gpt bug (#56419) · e5b71671
  Committed by zhangbo9674 on Aug 18, 2023
- [Docs] add some Tensor API en doc (#56402) · 1322cd92
  Committed by zhouweiwei2014 on Aug 18, 2023
- move dgc_momentum InferShape to phi (#56358) · a533dae3
  Committed by huangjiyi on Aug 18, 2023
- [Inference] Make share_external_data supports bf16 and bool; fix while_op... · c65ef07c
  Committed by lzy on Aug 18, 2023
```
[Inference] Make share_external_data supports bf16 and bool; fix while_op cache_inference_while_scope when using fleet_executor. (#56055)

* 1. make share_external_data supports bf16 and bool; 2. don't drop_kids when cache_inference_while_scope

* fix FLAGS_cache_inference_while_scope

* add unittest

* add unittest

* skip unittest when cudnn_version < 8100

* skip test share_external_data_bf16 when CUDA_ARCH < 80
```
- [NewIR] new ir support assert op (#56353) · 7f5c14bc
  Committed by hong on Aug 18, 2023
```
* fix op translator reshape type

* update

* new ir support vector type place transfer

* add test case

* update

* revert code

* add test assert new ir test

* update

* update
```
- fix fft bug in DCU (#56340) · d084a236
  Committed by yuguo on Aug 18, 2023
Aug 17, 2023 · 3 commits
- Revert "add some Tensor API en doc (#55958)" (#56375) · d1ea359b
  Committed by tianshuo78520a on Aug 17, 2023
```
This reverts commit fd765f61.
```
- add some Tensor API en doc (#55958) · fd765f61
  Committed by zhouweiwei2014 on Aug 17, 2023
- add lu_unpack data check (#56311) · 6fdb316c
  Committed by zhiboniu on Aug 17, 2023
```
* add lu_unpack data check

* add error input api test

* add error type info
```
Aug 16, 2023 · 12 commits
- [NewIR] support c_broadcast (#56284) · a8981be0
  Committed by Leo Chen on Aug 16, 2023
```
* [NewIR] support c_broadcast

* add legacyOpList
```
- Refine FusedNorm comment (#56305) · 12547fb4
  Committed by MarDino on Aug 16, 2023
```
* refine static op return val
```
- fix bug (#56324) · 1af1178a
  Committed by zhangbo9674 on Aug 16, 2023
- move dgc_momentum kernel to phi (#56158) · baa4fb42
  Committed by huangjiyi on Aug 16, 2023
```
* update

* update
```
- [ROCM]:Delete the special target and fix compiler options (#55507) · 4d501872
  Committed by onepick on Aug 16, 2023

runtime compiler api will only build special target if it is bind.

'--include-path' is not supported by hipcc and "-I/include/folder"
is better choice

fix ut:
        * device_code_test
        * test_code_generator
        * test_fusion_group_pass
        * test_fusion_group_op
Signed-off-by: jiajuku <jiajuku12@163.com>

- [Fluid] move assign_pos to phi (#55794) · 9d899273
  Committed by Sonder on Aug 16, 2023
- [Fluid] NO.1 Migrate c_embedding to PHI (#56129) · 7c9abfb2
  Committed by Ruibin Cheung on Aug 16, 2023

* [Fluid] Migrate c_embedding to PHI

* fix

* add python_api

* fix ut

* migrate xpu kernel

* fix windows compile error

- [XPU] Add fast_layernorm_xpu_fuse_pass and fast_layernorm_xpu plugin (#56269) · f16e1869
  Committed by jiangfan06 on Aug 16, 2023
- fix bmm op bugs in static mode with dynamic shape (#56135) · be22021c
  Committed by xiongkun on Aug 16, 2023
- fix yaml error name (#56319) · edba06e1
  Committed by wanghuancoder on Aug 16, 2023
- [AutoParallel] Dygraph basic impl for semi auto parallel (#55698) · 7039bef3
  Committed by Chen Weihang on Aug 16, 2023

* add phi forward api gen impl

* add phi backward gen code

* polish api code gen impl

* polish code gen impl

* remove auto_parallel namespace

* add dygraph forward impl

* add for_auto_parallel cond

* fix code gen errors

* add dygraph backward impl

* resolve conflict with develop

* refactor dist api gen impl

* revert origin api gen impl

* replace template for override func

* fix dnnl macro error

* revert third_party change

* add with distributed macro

* Update grad_tensor_holder.cc details

* merge dist tensor constructor

* change test tensor to replicate

* fx typo

* resolve conflict with develop

* fix out dim error

- [NewIR]New ir support c concat (#56243) · fcde3991
  Committed by hong on Aug 16, 2023

* support new ir load combine

* update

* polish code

* remove print

* support c concat

* update

* polish code

* fix bug

* polish code

* fix compile bug

* polish code

* remove useless code

Aug 15, 2023 · 2 commits

- Add flash attention backward grad check (#56249) · 1509a036
  Committed by yinwei on Aug 15, 2023
```
---------
Co-authored-by: tianhaodongbd <tianhaodong@baidu.com>
```
- [Paddle Inference] Add masked multihead attention kernel and export API. (#55344) · 989c5e87
  Committed by xiaoxiaohehe001 on Aug 15, 2023

* support_mmha
* add_python_api
* add_api_doc
* fix_doc_error
* fix_infermeta
* add_infermeta
* add_bf16_cuda_check
* add_bf16_check
* fix_ci_windows
* fix_ci_windows_kernel_register
* fix_test_mmha
* add_cumoffsets
* remove_bias
* delete_mmha_reshape_input_output
* rename_delete_hfile
* remove_fluid

---------
Co-authored-by: yangjianfengo1 <yangjianfeng01@baidu.com>

PaddlePaddle / Paddle · synced successfully about 1 year ago