Commits · 9c98ee3efd0bfd051385ecabe0b2e3c982d84540 · BaiXuePrincess / Paddle

10 Aug, 2022 · 2 commits
- fix proto consistency bug (#45017) · 9c98ee3e
  Committed by Leo Chen on Aug 10, 2022
```
* fix proto bug

* add ut

* reset need_update for var_desc

* refine code

* fix var desc order issue
```
- [OpAttr]Support VarDesc* and vector<VarDesc*> in Attribute (#44737) · 81d6fa6c
  Committed by Aurelius84 on Aug 10, 2022
```
* [OpAttr]Support VarDesc* and vector<VarDesc*> in Attribute

* add unittest for inference predictor
```
09 Aug, 2022 · 2 commits
- fix mkldnn conv add pass when the dims of res and out are not equal (#45018) · 42c694df
  Committed by yeliang2258 on Aug 09, 2022
- Fix a bug in transpose2 when run native cpu (#44659) · 8185cecd
  Committed by yeliang2258 on Aug 09, 2022
```
* fix a bug in transpose2 about mkldnn

* fix bug
```
08 Aug, 2022 · 1 commit
- clean includes of tensor.h (#44928) · ee9ea48d
  Committed by Leo Chen on Aug 08, 2022
```
* clean tensor.h

* fix gather_nd
```
05 Aug, 2022 · 4 commits

fix 5 operator makers with typos which pass string literal to argument... · ce9d2a9e

Committed by Feiyu Chan on Aug 05, 2022

fix 5 operator makers with typos which pass string literal to argument 'generated', remove generated as parameter of AddAttr (#44935)

[MKLDNN]Move mkldnn activation kernel to phi (#44365) · 2dfa88d2

Committed by YuanRisheng on Aug 05, 2022

* move mkldnn activation kernel

* fix compile bugs

* fix compile bugs

* deal with conflict

* fix compile bugs

* fix windows compile bugs

* mkldnn unittest fix

* change mutable to alloc

* fix unittest bugs

* modify code according to comment

Add feed&fetch as default deny ops. (#44708) · d4ca7ffb

Committed by Zhen Wang on Aug 05, 2022

Merge matmul_v1 and matmul_v2 fuse passes (#44870) · d0cf9d9d

Committed by Sławomir Siwek on Aug 05, 2022

* remove v2_transpose_reshape

* matmul_transpose_reshape

* reshape_transpose_matmul

* restore ut

* adjust old ut

* restore parallel UT rules

* feedback from review

04 Aug, 2022 · 2 commits

Matmuls with activation and elementwise_add fuses (#44655) · 0420d514

Committed by Sławomir Siwek on Aug 04, 2022

* Add unit tests

* matmul_v2 + activation

* matmuls + elementwise_add

* matmul_v2 postops

* transform matmul to v2

* opcompat

* fix fusing matmul with multiple outs

* add shape constraints

* remove unused vars

* change pass order

* - Unit tests to be debugged

- fix

- refactor

- diagnostic

- more diagnostic

- fix

- Fix number two

- fix

- fix

- fix

- alpha added

- more fixes

- compilation fix

- removed diagnostic code

- cosmetic fixes

* lint

* add alpha constraint

* merge matmul refactor

* trigger CI

* - fix

* - another fix

* code style

* add support for matmul+elementwise_add+activation

* code style

* fix bfloat16 bugs

* change append_binary to append_sum
Co-authored-by: Jacek Czaja <jacek.czaja@intel.com>

add xpu garbage collector for standalone executor. (#44572) · 0e26361c

Committed by 王明冬 on Aug 04, 2022

03 Aug, 2022 · 2 commits
- [jit] c++ property deserialization & Variable support vector of int, float (#44727) · 9735d1b8
  Committed by Hui Zhang on Aug 03, 2022
```
* c++ property deserialization

* fix for comment

* more error info

* fix exception info

* fix ci

* fix compile

* fix layer test ci
```
- fix trt and gpu pass: emb_elt_layn (#44842) · 2ea1c134
  Committed by Wangzheee on Aug 03, 2022
02 Aug, 2022 · 6 commits
- fix namespace of GPUContext (#44822) · 65f38869
  Committed by Leo Chen on Aug 02, 2022
- Multihead matmul fp16 (#44792) · 0fd8ee63
  Committed by Wilber on Aug 02, 2022
```
* multihead matmul add fp16

* fix windows error

* fix rocm error

* fix rocm error
```
- fix gpups CUDADeviceContext to phi-GPUContext;test=develop (#44804) · 3491d183
  Committed by danleifeng on Aug 02, 2022
- [Phi] polish and rename, pt* -> phi* (#44697) · 942ff89f
  Committed by Weilong Wu on Aug 02, 2022
```
* polish and rename, pt* -> phi*

* fix code format
```
- Skip inplace for coalesce_tensor_op outputs (#44795) · bb22e59c
  Committed by Ruibiao Chen on Aug 02, 2022
```
* Skip inplace for coalesce_tensor_op outputs

* Fix typos

* Add UTs

* Fix typos
```
- Refactor build_op_downstream_map for standalone executor (#44729) · 9b97ac70
  Committed by Ruibiao Chen on Aug 02, 2022
```
* Refactor build_op_downstream_map for standalone executor

* Add some comments
```
01 Aug, 2022 · 3 commits

unify gpu context (#44740) · 86763023

Committed by Leo Chen on Aug 01, 2022

* remove cudaDeviceContext

* remove more template

* fix rocm compile

* remove alias name CUDADeviceContext

* fix compile

* fix tests

* revert changes

GPUGraph merge to develop (#44594) · 798670bb

Committed by danleifeng on Aug 01, 2022

[Paddle Inference] add varlen_token_prune plugin, pass, convert (#44733) · 24187fcb

Committed by Wangzheee on Aug 01, 2022
```
* add varlen_token_prune plugin, pass, convert
```
29 Jul, 2022 · 4 commits

unify fluid::CUDADeviceContext and phi::GpuContext (#44723) · 88490567

Committed by Leo Chen on Jul 29, 2022
```
* remove cudaDeviceContext

* remove more template

* fix rocm compile
```
[Auto parallel] Optimization Tuning (#43782) · 72f2ed43

Committed by JZ-LIANG on Jul 29, 2022

* fixed bug for pass & engine

* fixed bug for benchmark GPT-3

* add tuner & profiler

* add algorithms & config

move CUDAStream to phi (#44529) · da3743fd

Committed by Leo Chen on Jul 29, 2022

* init

* move CUDAStream to phi

* fix compilation

* merge develop

* add stream_owned_ member

* split cuda_stream.h

* fix cpu compile

* fix constructor

* fix bug

* fix windows compile

* fix inference test_levit

* fix windows tests

[XPU] add sampling_id op, add top_k op, update xdnn api. test=kunlun (#44704) · e61f48c1

Committed by houj04 on Jul 29, 2022

27 Jul, 2022 · 1 commit
- fix RemoveIntermediateOut in fuse_elewise_add_act_pass while converting graph to program (#44593) · be132719
  Committed by pangyoki on Jul 27, 2022
```
* fix RemoveNode in fuse_elewise_add_act_pass

* fix

* change pointer to shared_ptr

* fix

* fix

* fix format

* fix

* fix graph_safe_remove_nodes
```
26 Jul, 2022 · 5 commits

Add a feed op before each input parameter var. (#44499) · 9b662bef

Committed by Zhen Wang on Jul 26, 2022

* Add a feed op before each input parameter var.

* Fix some issues about the unit test build_cinn_pass_test.

Merge kProgramDescs in GraphToProgram (#44526) · b6e84806

Committed by Ruibiao Chen on Jul 26, 2022

add horizontal federation learning ps feature (#44327) · 4bc22b69

Committed by ziyoujiyi on Jul 26, 2022

* back fl

* delete ssl cert

* .

* make warning

* .

* unittest paral degree

* solve unittest

* heter & multi cloud comm ready

* .

* .

* fl-ps v1.0

* .

* support N + N mode

* .

* .

* .

* .

* delete print

* .

* .

* .

* .

* fix bug

* .

* .

* fl-ps with coordinator ready

* merge dev

* update message parse only

* update fl client scheduler

* fix bug

* update multithreads sync

* fix ci errors

* update role_maker.py

* update role_maker.py

* fix ci error: windows py import error

* fix ci error: windows py import error

* fix windows ci pylib import error

* add dump fields & params

* try to fix windows import fleet error

* fix ps FLAGS error

Set more attrs in ReplaceScaleLossGradOp (#44576) · ab198b45

Committed by Ruibiao Chen on Jul 26, 2022
```
* Set more attrs in ReplaceScaleLossGradOp

* Fix typos

* Fix CI errors

* Add UT
```
Remove ControlDepVar in GraphToBlock (#44591) · 8d3672f0

Committed by Ruibiao Chen on Jul 26, 2022

25 Jul, 2022 · 1 commit
- [Phi] Migrate squared_l2_norm_op to phi (#44492) · 3e170163
  Committed by lyq on Jul 25, 2022
21 Jul, 2022 · 2 commits
- add slot attr for push sparse op (#44422) · 85c6937b
  Committed by zhaocaibei123 on Jul 21, 2022
```
* add slot attr for push sparse op

* add pybind

* remove fleet

* add unittest

* fix
```
- [Paddle inference] Add conv_fusion_fp16 (#44435) · 37455714
  Committed by xiaoxiaohehe001 on Jul 21, 2022
```
* convfusionfp16

* convfusionfp16

* convfusionfp16
```
20 Jul, 2022 · 5 commits
- [GPUPS]Fix psgpuwrapper initialization (#44468) · 99bf7007
  Committed by zmxdream on Jul 20, 2022
```
* Update ps_gpu_wrapper.h

* Update ps_gpu_wrapper.h

* Update ps_gpu_wrapper.cc
```
- 【GPUPS】Adam accessor (#43919) · b8d106e1
  Committed by danleifeng on Jul 20, 2022
```
* add adam/sharedadam optimizer for gpups;edit optimizer struct;test=develop
```
- transfer block_id to CreateVarNode in multi_devices_graph_pass (#44366) · 1882ffd5
  Committed by pangyoki on Jul 20, 2022
```
* fix CreateVarNode in multi_devices_graph_pass

* Revert "Fix var duplication bug for graph_to_program_pass (#44278)"

This reverts commit a2c4c86b.
```
- [XPU][NPU] (1) add device_guard. (2) add support for LoDTensorArray of sum op. (#44367) · 8753a2bf
  Committed by houj04 on Jul 20, 2022
```
* device_guard support xpu. test=kunlun

* sum op of xpu support LoDTensorArray. add test for while op of xpu. test=kunlun.
```
- [GPUPS]FleetWrapper initialize (#44441) · 28cb0067
  Committed by zmxdream on Jul 20, 2022
```
* fix FleetWrapper initialize
```

BaiXuePrincess / Paddle is in sync with its fork source project
