Commits · d03ef0541959391e7414d6d8780f6248383fef18 · PaddlePaddle / Paddle

22 Aug, 2022 2 commits
- Extend conv_concat_relu to support all activations (#45089) · d03ef054
  Committed by Sławomir Siwek on Aug 22, 2022
```
* merge conv_concat_relu to conv_act

* fix typo

* extend unit test

* reuse existing gpd

* codestyle

* enforce mkldnn conv
```
- remove trt_skip_layernorm_fuse_pass from gpu passes (#45293) · 25d58db6
  Committed by Yuanle Liu on Aug 22, 2022
19 Aug, 2022 1 commit
- Fix random op dependency and lr_shedule bugs for standalone executor (#45265) · 6d4ae007
  Committed by Ruibiao Chen on Aug 19, 2022
```
* Fix random op dependency and lr_shedule bugs for standalone executor

* Fix CI errors

* Fix CI errors

* Fix CI errors
```
18 Aug, 2022 3 commits

apply buffer_shared_inplace_pass and inplace_addto_op_pass pass to program in Standalone Executor (#45085) · d8d124b6

Committed by pangyoki on Aug 18, 2022

* apply inplace addto in python apply_pass

* fix

* apply inplace pass for program

* skip feed and fetch var

* fix block_desc.move_from

* fix block desc

* alltoall remove inplace

* fix

change to async mode for xpu multi-card training in static graph mode, test=kunlun (#45024) · 41bdf41d

Committed by zhangxiaoci on Aug 18, 2022

* change to async mode for xpu multi-card training in static graph mode

* minor bugfix

* irrelevant. move to another pr

* move change to other pr

* fix stream issue

* fix 'stream not meet with current context' error

* fix branch diverge, test=kunlun

fix infer trans scope (#45203) · 2d0bb2c3

Committed by JingZhuangzhuang on Aug 18, 2022

* fix infer trans scope

* fix infer trans scope

* fix infer trans scope

* fix infer trans scope

Co-authored-by: dingjiawei <327396238@qq.com>

17 Aug, 2022 2 commits
- [OpAttr]Add SupportTensor for OpMaker with whitelist mechanism (#45084) · 2594935a
  Committed by Aurelius84 on Aug 17, 2022
```
* [OpAttr]Add SupportTensor for OpMaker

* fix typo

* fix code style

* add SupportTensor for concat op

* add unittest for register Tensor

* add shape checker and split attribute
```
- fix:op version (#45192) · d0cd0a11
  Committed by feng_shuai on Aug 17, 2022
16 Aug, 2022 4 commits

[Phi] Move amp ops into phi (#45079) · b4f67757

Committed by Chen Weihang on Aug 16, 2022

* move check finite and unscale kernel into phi

* move infershape into phi

* move update_loss_scaling kernel into phi

* remove original kernels

* move update loss scaling infershape into phi

* add header for xpu and npu

* solve coverage failed

* fix npu test failed

* remove mutable data in cu file

* fix new executor failed

* add valid check for meta tensor output

convert multihead to oss (#45019) · f706d95d

Committed by feng_shuai on Aug 16, 2022

* convert multihead to oss

* fix:bug

* fix:delete const cast

* fix:don't support bias_qk

* add vit pass

* fix:convert bug and add preln_residual_bias

* support length=-1

* add UT for convert

* add no_bias_qk support for gpu_multihead_op

* delete infer_shape depends on bias_qk

* oss just can be used in T4 and A*

* fix:change api for ROCM CI

fix new quant (#45155) · 2fb65e44

Committed by Wangzheee on Aug 16, 2022

add strongly typed functions to set attributes to avoid unexpected type conversions. (#45107) · 307801d5

Committed by Feiyu Chan on Aug 16, 2022

15 Aug, 2022 2 commits

fused_embedding_eltwise_layernorm_op and skip_layernorm_op support fp16 (#44969) · ac0553a0

Committed by Yuanle Liu on Aug 15, 2022

[Auto Parallel] Move the distributed info from python to c++ (#44510) · a52357fe

Committed by Yulong Ao on Aug 15, 2022

* [Auto Parallel] Move the distributed info from python to c++

* [Auto Parallel] Add dist_attrs for VarDesc and OpDesc

* [Auto Parallel] Add the lost file

* [Auto Parallel] Make the dist attr be unique_ptr

* [Auto Parallel] Add the proto conversion

* [Auto Parallel] Improve the proto support

* [Auto Parallel] Fix the bugs for adding a device or a link

* [Auto Parallel] Add the C++ ProcessMesh and DistributedMapper

* [Auto Parallel] Improve the impl of these dist attrs

* [Auto Parallel] Pybind11 ProcessMesh and DeviceMesh

* [Auto Parallel] Fix the unittest problem

* [Auto Parallel] Explicitly add the src file for auto_parallel target

* [Auto Parallel] Add the proto dependency explicitly

* [Auto Parallel] Fix the cmake bug on windows and mac

* [Auto Parallel] Remove the pybind11 header file in process_mesh.h

* [Auto Parallel] Remove unused codes

* [Auto Parallel] Check whether the dist attr is null

* [Auto Parallel] Implement the assign operator for OpDesc explicitly

14 Aug, 2022 1 commit
- Revert "[Paddle Inference] Support cuda_graph. (#44878)" (#45115) · b0e7681f
  Committed by xiaoxiaohehe001 on Aug 14, 2022
```
This reverts commit 84bf5c31.
```
13 Aug, 2022 2 commits

Refine program cache (#45005) · e96dae8b

Committed by Leo Chen on Aug 13, 2022

* add cached_serialize_str_

* support program hash

* add sha

* add ut

* use hash_str only for new_exe

* fix attr order

fl-ps: support split sparse params in local & remote (#44864) · 3f5c405f

Committed by ziyoujiyi on Aug 13, 2022

* back fl

* delete ssl cert

* .

* make warning

* .

* unittest paral degree

* solve unittest

* heter & multi cloud comm ready

* .

* .

* fl-ps v1.0

* .

* support N + N mode

* .

* .

* .

* .

* delete print

* .

* .

* .

* .

* fix bug

* .

* .

* fl-ps with coordinator ready

* merge dev

* update message parse only

* update fl client scheduler

* fix bug

* update multithreads sync

* fix ci errors

* update role_maker.py

* update role_maker.py

* fix ci error: windows py import error

* fix ci error: windows py import error

* fix windows ci pylib import error

* add dump fields & params

* try to fix windows import fleet error

* fix ps FLAGS error

* fix logging risk

* fix logging possible risk

* write trainer_desc file

* support split sparse params in local & remote

* fix import paddle.fluid.core.PSGPU

* fix import paddle.fluid.core.PSGPU

* add remote_sparse & local_sparse config

* fix unittest

* fix test_dist_fleet_geo table error

* fix PADDLE_ENFORCE error

* fix other's pr conflict

12 Aug, 2022 2 commits

Offload calculations from matmul op to fuse pass (#44941) · acb78ea2

Committed by Sławomir Siwek on Aug 12, 2022

* remove v2_transpose_reshape

* matmul_transpose_reshape

* reshape_transpose_matmul

* Add int8 support for matmulV2

* restore ut

* adjust old ut

* restore parallel UT rules

* remove mkldnn code from base ops

* move enforces to pass

* remove duplicated functions

* delete duplicated enforces

* feedback from review

* add comments to variables

* enable eltwise support

* dynamic attribute

* remove fusepass tests from op test

* remove fuse pass cases from op test

* revert introduction of dynamic attributes

* style
Co-authored-by: wozna <joanna.wozna@intel.com>

transfer memcpy_h2d from fluid to phi (#44932) · 7bc57d35

Committed by kangguangli on Aug 12, 2022

* transfer memcpy_h2d from fluid to phi

* use UnchangedInferMeta instead

* restore test_standalone_executor

* add newline to fix codestyle check

* rename pt -> phi

* simplify logic and add check

* make the comment more clear

* remove useless comment

* refine code

11 Aug, 2022 1 commit
- Change bias to persistable in preln_residual_bias_fuse_pass (#45037) · 26c573de
  Committed by whs on Aug 11, 2022
10 Aug, 2022 4 commits
- [Paddle Inference] Support cuda_graph. (#44878) · 84bf5c31
  Committed by xiaoxiaohehe001 on Aug 10, 2022
```
* cuda_graph

* cuda_graph_

* cuda_graph_

* cuda_graph_
```
- [new-exec] set cuda device before run (#44985) · 68b06ba6
  Committed by Leo Chen on Aug 10, 2022
```
* set cuda device before run

* add header file

* fix compile
```
- fix proto consistency bug (#45017) · 9c98ee3e
  Committed by Leo Chen on Aug 10, 2022
```
* fix proto bug

* add ut

* reset need_update for var_desc

* refine code

* fix var desc order issue
```
- [OpAttr]Support VarDesc* and vector<VarDesc*> in Attribute (#44737) · 81d6fa6c
  Committed by Aurelius84 on Aug 10, 2022
```
* [OpAttr]Support VarDesc* and vector<VarDesc*> in Attribute

* add unittest for inference predictor
```
09 Aug, 2022 2 commits
- fix mkldnn conv add pass when the dims of res and out are not equal (#45018) · 42c694df
  Committed by yeliang2258 on Aug 09, 2022
- Fix a bug in transpose2 when run native cpu (#44659) · 8185cecd
  Committed by yeliang2258 on Aug 09, 2022
```
* fix a bug in transpose2 about mkldnn

* fix bug
```
08 Aug, 2022 1 commit
- clean includes of tensor.h (#44928) · ee9ea48d
  Committed by Leo Chen on Aug 08, 2022
```
* clean tensor.h

* fix gather_nd
```
05 Aug, 2022 4 commits

fix 5 operator makers with typos which pass string literal to argument 'generated', remove generated as parameter of AddAttr (#44935) · ce9d2a9e

Committed by Feiyu Chan on Aug 05, 2022

[MKLDNN]Move mkldnn activation kernel to phi (#44365) · 2dfa88d2

Committed by YuanRisheng on Aug 05, 2022

* move mkldnn activation kernel

* fix compile bugs

* fix compile bugs

* deal with conflict

* fix compile bugs

* fix windows compile bugs

* mkldnn unittest fix

* change mutable to alloc

* fix unittest bugs

* modify code according to comment

Add feed&fetch as default deny ops. (#44708) · d4ca7ffb

Committed by Zhen Wang on Aug 05, 2022

Merge matmul_v1 and matmul_v2 fuse passes (#44870) · d0cf9d9d

Committed by Sławomir Siwek on Aug 05, 2022

* remove v2_transpose_reshape

* matmul_transpose_reshape

* reshape_transpose_matmul

* restore ut

* adjust old ut

* restore parallel UT rules

* feedback from review

04 Aug, 2022 2 commits

Matmuls with activation and elementwise_add fuses (#44655) · 0420d514

Committed by Sławomir Siwek on Aug 04, 2022

* Add unit tests

* matmul_v2 + activation

* matmuls + elementwise_add

* matmul_v2 postops

* transform matmul to v2

* opcompat

* fix fusing matmul with multiple outs

* add shape constraints

* remove unused vars

* change pass order

* - Unit tests to be debugged

- fix

- refactor

- diagnostic

- more diagnostic

- fix

- Fix number two

- fix

- fix

- fix

- alpha added

- more fixes

- compilation fix

- removed diagnostic code

- cosmetic fixes

* lint

* add alpha constraint

* merge matmul refactor

* trigger CI

* - fix

* - another fix

* code style

* add support for matmul+elementwise_add+activation

* code style

* fix bfloat16 bugs

* change append_binary to append_sum
Co-authored-by: Jacek Czaja <jacek.czaja@intel.com>

add xpu garbage collector for standalone executor. (#44572) · 0e26361c

Committed by 王明冬 on Aug 04, 2022

03 Aug, 2022 2 commits
- [jit] c++ property deserialization & Variable support vector of int, float (#44727) · 9735d1b8
  Committed by Hui Zhang on Aug 03, 2022
```
* c++ property deserialization

* fix for comment

* more error info

* fix exception info

* fix ci

* fix compile

* fix layer test ci
```
- fix trt and gpu pass: emb_elt_layn (#44842) · 2ea1c134
  Committed by Wangzheee on Aug 03, 2022
02 Aug, 2022 5 commits
- fix namespace of GPUContext (#44822) · 65f38869
  Committed by Leo Chen on Aug 02, 2022
- Multihead matmul fp16 (#44792) · 0fd8ee63
  Committed by Wilber on Aug 02, 2022
```
* multihead matmul add fp16

* fix windows error

* fix rocm error

* fix rocm error
```
- fix gpups CUDADeviceContext to phi-GPUContext;test=develop (#44804) · 3491d183
  Committed by danleifeng on Aug 02, 2022
- [Phi] polish and rename, pt* -> phi* (#44697) · 942ff89f
  Committed by Weilong Wu on Aug 02, 2022
```
* polish and rename, pt* -> phi*

* fix code format
```
- Skip inplace for coalesce_tensor_op outputs (#44795) · bb22e59c
  Committed by Ruibiao Chen on Aug 02, 2022
```
* Skip inplace for coalesce_tensor_op outputs

* Fix typos

* Add UTs

* Fix typos
```

PaddlePaddle / Paddle synced successfully about 1 year ago