Commits · 72f2ed43756218cb125d9cb3ba3b949c94e636a0 · BaiXuePrincess / Paddle

29 Jul, 2022 3 commits

[Auto parallel] Optimization Tuning (#43782) · 72f2ed43

Committed by JZ-LIANG on Jul 29, 2022

* fixed bug for pass & engine

* fixed bug for benchmark GPT-3

* add tuner & profiler

* add algorithms & config

72f2ed43

move CUDAStream to phi (#44529) · da3743fd

Committed by Leo Chen on Jul 29, 2022

* init

* move CUDAStream to phi

* fix compilation

* merge develop

* add stream_owned_ member

* split cuda_stream.h

* fix cpu compile

* fix constructor

* fix bug

* fix windows compile

* fix inference test_levit

* fix windows tests

da3743fd

[XPU] add sampling_id op, add top_k op, update xdnn api. test=kunlun (#44704) · e61f48c1
Committed by houj04 on Jul 29, 2022

e61f48c1

27 Jul, 2022 1 commit
- fix RemoveIntermediateOut in fuse_elewise_add_act_pass while converting graph to program (#44593) · be132719
  Committed by pangyoki on Jul 27, 2022
```
* fix RemoveNode in fuse_elewise_add_act_pass

* fix

* change pointer to shared_ptr

* fix

* fix

* fix format

* fix

* fix graph_safe_remove_nodes
```
  be132719
26 Jul, 2022 5 commits

Add a feed op before each input parameter var. (#44499) · 9b662bef

Committed by Zhen Wang on Jul 26, 2022

* Add a feed op before each input parameter var.

* Fix some issues about the unit test build_cinn_pass_test.

9b662bef

Merge kProgramDescs in GraphToProgram (#44526) · b6e84806
Committed by Ruibiao Chen on Jul 26, 2022

b6e84806

add horizontal federation learning ps feature (#44327) · 4bc22b69

Committed by ziyoujiyi on Jul 26, 2022

* back fl

* delete ssl cert

* .

* make warning

* .

* unittest paral degree

* solve unittest

* heter & multi cloud comm ready

* .

* .

* fl-ps v1.0

* .

* support N + N mode

* .

* .

* .

* .

* delete print

* .

* .

* .

* .

* fix bug

* .

* .

* fl-ps with coordinator ready

* merge dev

* update message parse only

* update fl client scheduler

* fix bug

* update multithreads sync

* fix ci errors

* update role_maker.py

* update role_maker.py

* fix ci error: windows py import error

* fix ci error: windows py import error

* fix windows ci pylib import error

* add dump fields & params

* try to fix windows import fleet error

* fix ps FLAGS error

4bc22b69

Set more attrs in ReplaceScaleLossGradOp (#44576) · ab198b45
Committed by Ruibiao Chen on Jul 26, 2022
```
* Set more attrs in ReplaceScaleLossGradOp

* Fix typos

* Fix CI errors

* Add UT
```
ab198b45

Remove ControlDepVar in GraphToBlock (#44591) · 8d3672f0
Committed by Ruibiao Chen on Jul 26, 2022

8d3672f0

25 Jul, 2022 1 commit
- [Phi] Migrate squared_l2_norm_op to phi (#44492) · 3e170163
  Committed by lyq on Jul 25, 2022
  
  3e170163
21 Jul, 2022 2 commits
- add slot attr for push sparse op (#44422) · 85c6937b
  Committed by zhaocaibei123 on Jul 21, 2022
```
* add slot attr for push sparse op

* add pybind

* remove fleet

* add unittest

* fix
```
  85c6937b
- [Paddle inference] Add conv_fusion_fp16 (#44435) · 37455714
  Committed by xiaoxiaohehe001 on Jul 21, 2022
```
* convfusionfp16

* convfusionfp16

* convfusionfp16
```
  37455714
20 Jul, 2022 7 commits
- [GPUPS]Fix psgpuwrapper initialization (#44468) · 99bf7007
  Committed by zmxdream on Jul 20, 2022
```
* Update ps_gpu_wrapper.h

* Update ps_gpu_wrapper.h

* Update ps_gpu_wrapper.cc
```
  99bf7007
- 【GPUPS】Adam accessor (#43919) · b8d106e1
  Committed by danleifeng on Jul 20, 2022
```
* add adam/sharedadam optimizer for gpups;edit optimizer struct;test=develop
```
  b8d106e1
- transfer block_id to CreateVarNode in multi_devices_graph_pass (#44366) · 1882ffd5
  Committed by pangyoki on Jul 20, 2022
```
* fix CreateVarNode in multi_devices_graph_pass

* Revert "Fix var duplication bug for graph_to_program_pass (#44278)"

This reverts commit a2c4c86b.
```
  1882ffd5
- [XPU][NPU] (1) add device_guard. (2) add support for LoDTensorArray of sum op. (#44367) · 8753a2bf
  Committed by houj04 on Jul 20, 2022
```
* device_guard support xpu. test=kunlun

* sum op of xpu support LoDTensorArray. add test for while op of xpu. test=kunlun.
```
  8753a2bf
- [GPUPS]FleetWrapper initialize (#44441) · 28cb0067
  Committed by zmxdream on Jul 20, 2022
```
* fix FleetWrapper initialize
```
  28cb0067
- Add dependency for read op in standalone executor (#44362) · 2ee32028
  Committed by Ruibiao Chen on Jul 20, 2022
```
* Add dependency for read op in standalone executor

* Fix CI errors

* Add UT

* add_dependency -> dependency_utils

* Fix CI errors
```
  2ee32028
- Clean CI_SKIP_CPP_TEST (#44412) · 3ed53280
  Committed by tianshuo78520a on Jul 20, 2022
  
  3ed53280
19 Jul, 2022 2 commits
- Accelerate inference period in op Cache method (#43857) · a8680f54
  Committed by huzhiqiang on Jul 19, 2022
  
  a8680f54
- Rename BOOST_GET macros (#44368) · 4b085c57
  Committed by Ruibiao Chen on Jul 19, 2022
```
* Rename BOOST_GET macros

* Fix conflicts
```
  4b085c57
18 Jul, 2022 1 commit
- add ipu support for standalone executor. (#44342) · fbedf77e
  Committed by 王明冬 on Jul 18, 2022
  
  fbedf77e
16 Jul, 2022 1 commit
- Not rename pb file to avoid re-compile (#44370) · 6f7550e4
  Committed by Leo Chen on Jul 15, 2022
  
  6f7550e4
15 Jul, 2022 1 commit
- Remove boost library (#44092) · d2e59e15
  Committed by Ruibiao Chen on Jul 15, 2022
  
  d2e59e15
14 Jul, 2022 4 commits
- Compilation optimization (#44242) · 4baf0dbe
  Committed by wanghuancoder on Jul 14, 2022
```
* Compilation optimization
```
  4baf0dbe
- [Phi]Improve the mechanism for mkldnn kernel in PHI (#43941) · e9b4d0be
  Committed by YuanRisheng on Jul 14, 2022
```
* adapt mkldnn kernel in PHI

* fix ci compile bugs

* fix compile bugs

* fix compile bugs

* fix compile bugs

* fix compile bugs

* delete comment

* fix compile bugs in windows-inference

* delete code for coverage

* modify code by review

* modify code by review

* add todo

* fix compile bugs

* fix compile bugs

* fix compile bugs

* fix unittest bugs
```
  e9b4d0be
- Fix var duplication bug for graph_to_program_pass (#44278) · a2c4c86b
  Committed by Ruibiao Chen on Jul 14, 2022
  
  a2c4c86b
- fixed glog (#44316) · cb44b694
  Committed by WJJ1995 on Jul 14, 2022
  
  cb44b694
13 Jul, 2022 3 commits
- add shape attribute in fill_constant op converted from scale_loss_grad after... · 7cf72a38
  Committed by pangyoki on Jul 13, 2022
```
add shape attribute in fill_constant op converted from scale_loss_grad after convert graph to program (#43898)

* fix grad loss shape

* little change

* delete for_test

* add unittest for FLAGS_CONVERT_GRAPH_TO_PROGRAM

* avoid conflict
```
  7cf72a38
- fix device optimizer config (#44282) · bcf57274
  Committed by zmxdream on Jul 13, 2022
  
  bcf57274
- fix bug of data transform on xpu (#44262) · 469d5ab4
  Committed by zyfncg on Jul 13, 2022
  
  469d5ab4
12 Jul, 2022 4 commits

Add pool avg to quantization and concat scales correction (#44186) · c797e64d
Committed by joanna.wozna.intel on Jul 12, 2022

c797e64d

add xpu_kp support for standalone executor. test=develop (#44231) · 015532b4
Committed by 王明冬 on Jul 12, 2022

015532b4

matmul+activation fuse pass (#43519) · 3333a439

Committed by Sławomir Siwek on Jul 12, 2022

* add method for post ops

* format code

* gpd

* format style

* add matmul+act test

* implement matmul+activation

* whitespaces

* code style

* python code format

* Increase UT timeout

* code format

* update style

* generalize activation fuse passes

* change order

* Unify activation GPD

* Revert changes with op_act

* remove softmax mkldnn attrs

* set common name for act attributes

* whitespace

* append postops by helper function

* ut style

* revert changes related to quantization

* Reduce redundancy

* reduce number of parameters

* trigger CI

* validate attribute

* trim unit test

3333a439

fix_convfusion (#44226) · 636c6347
Committed by xiaoxiaohehe001 on Jul 12, 2022

636c6347

11 Jul, 2022 4 commits

[NPU] add npu support for new executor. test=develop (#43403) · 5988553f
Committed by 王明冬 on Jul 11, 2022

5988553f

[IPU] support more ops 0/N (#44204) · 0a04b8a9

Committed by Allen Guo on Jul 11, 2022

* add authors
Co-authored-by: Allen Guo <alleng@graphcore.ai>
Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai>
Co-authored-by: Zhaorui Chen <zhaoruic@graphcore.ai>

* squash cpp changes 1/N

* clean code
Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai>
Co-authored-by: Zhaorui Chen <zhaoruic@graphcore.ai>

0a04b8a9

Quantize shape operator (#44124) · d4372a1e
Committed by Zuza Gawrysiak on Jul 11, 2022
```
* Quantize shape operator

* Add shape op to propagate scales pass
```
d4372a1e

Unify and generalize activation fuse passes (#44185) · 826e2781
Committed by Sławomir Siwek on Jul 11, 2022
```
* reduce redundancy

* python code style

* fix int8 ut
```
826e2781

07 Jul, 2022 1 commit

[Windows CI] copy onnxruntime.dll to c++ test folder in windows (#44121) · 05b7ef8d

Committed by Sing_chan on Jul 07, 2022

* copy onnxruntime.dll to c++ test folder in windows

* remove ut that failed due to onnxruntime.dll

* test_api_impl failed of diff

* use TARGET to make sure if the test exist; use POST_BUILD to add copy command

05b7ef8d

BaiXuePrincess / Paddle · In sync with the fork source project