Commits · b2390438b2c70fa13897e0edb263512e89bd3ccf · PaddlePaddle / Paddle

13 Apr, 2022 3 commits
- Z
  Fix problem of infermeta with vector output (#41646) · b2390438
  Committed by zyfncg on Apr 13, 2022
```
* remove stack_grad infershape

* fix bug of output with null

* fix bug
```
  b2390438
- T
  optimize hbm (#41623) · d95280c7
  Committed by Thunderbrook on Apr 13, 2022
```
* optimize hbm

* format

* format
```
  d95280c7
- C
  [Phi&CustomOp] Remove deprecated enum PlaceType for custom op & add warning (#41647) · 78ef1071
  Committed by Chen Weihang on Apr 13, 2022
```
* remove old custom op placetype

* replace dist  placetype using

* add with gpu macro

* fix mutable_data error

* fix set value error

* add comment
```
  78ef1071
12 Apr, 2022 3 commits
- D
  【heterps】datafeed puttofeedvec performance (#40168) · c202a613
  Committed by danleifeng on Apr 12, 2022
```
* perform SlotRecordInMemoryDataFeed feedvec;test=develop
```
  c202a613
- L
  
  add dependency for send/recv to support pp parallel (#41652) · a058b474
  Committed by Leo Chen on Apr 12, 2022
  
  a058b474
- C
  [CustomOp]Add new method for custom double grad (#41538) · 362c7c80
  Committed by Chen Weihang on Apr 12, 2022
```
* add new method for custom double grad

* add tanh double grad unittest

* change year

* revert tensor init method
```
  362c7c80
10 Apr, 2022 3 commits
- L
  [KP]fix bug when TruncatedNormal cannot fall back in cpu (#41565) · c1394c6a
  Committed by Liu-xiandong on Apr 10, 2022
```
* [KP]fix bug when TruncatedNormal cannot fall back in cpu

* delete useless comment

* delete useless comment
```
  c1394c6a
- B
  
  add mkldnn compute_propagate_scales int8 pass (#41592) · c00d869b
  Committed by baoachun on Apr 10, 2022
  
  c00d869b
- B
  add mkldnn int8 pass [step1] (#41579) · e68da187
  Committed by baoachun on Apr 10, 2022
```
* add mkldnn int8 pass

* add mkldnn int8 pass

* update pass
```
  e68da187
09 Apr, 2022 3 commits

Committed by zhaocaibei123 on Apr 09, 2022

* update name

* update name

* fix test

* fix fleet bind

* update name

* update name

* fix test

* fix gpups wrapper

* remove Push/Pull/Load/Save with context in client and wrapper base class

* fix

* fix

* remove some interface

* fix

* remove

* code style

* recover

* fix

* remove code unused

* remove some unused table & accessor & CommonDenseTable => MemoryDenseTable

* fix

* fix

* fix

* recover

* remove unused code

* recover unittest

* fix

* remove

* fix

* remove code unuseful

* remove

* fix

* recover

* remove
Co-authored-by: esythan <esythan@126.com>

7a07c4a5

Autotune the workspace_size_limit in conv. (#40338) · b937cdc5

Committed by limingshu on Apr 09, 2022

* Use the maximum workspace_size of all algorithms to limit the workspace size in exhaustive search mode.

* Use the system cudaMalloc and cudaFree to allocate workspace during searching.

* Enable switching between the two kinds of workspace setting methods.
Co-authored-by: Liu Yiqun <liuyiqun01@baidu.com>

b937cdc5

L
[new-exec] fix bug that no thread is waked up when adding task to threadpool (#41567) · f581f5bf
Committed by Leo Chen on Apr 09, 2022
```
* fix bug that no thread is waked up when adding task to threadpool

* fix typo
```
f581f5bf

07 Apr, 2022 3 commits
- T
  [GPUPS] bind afs wrpper (#41227) · b3bcebbe
  Committed by Thunderbrook on Apr 07, 2022
```
* afs wrapper

* format

* format

* macro
```
  b3bcebbe
- L
  Profile Executors (#41100) · dfb47986
  Committed by liutiexing on Apr 07, 2022
```
* Profile Executors

* update

* fix ut

* fix names

* update

* update
```
  dfb47986
- H
  momentum support l2decay for xpu. test=kunlun (#41325) · 533c649f
  Committed by houj04 on Apr 07, 2022
```
* momentum support l2decay for xpu. test=kunlun

* fix include file. test=kunlun

* fix cmake for device_worker. test=kunlun
```
  533c649f
06 Apr, 2022 1 commit
- A
  [IPU] remove paddle_ipu shared library (#41307) · 229e91bf
  Committed by Allen Guo on Apr 06, 2022
```
* remove paddle_ipu shared library

* fix unique_name
```
  229e91bf
05 Apr, 2022 2 commits

Z
Fix bug of data transform in inference executor (#41349) · 91212104
Committed by zyfncg on Apr 05, 2022
```
* fix bug of data transform in inference executor

* fix bug
```
91212104

[new-exec] enable the new standalone executor by default (#41179) · 93ea1297

Committed by Leo Chen on Apr 05, 2022

* enable new executor by default

* enable stream safe allocator

* test=document_fix;test=coverage

* do not use scope in op kernel

* fit empty program for new executor

* fix communication depend

* fix test_sync_batch_norm

* skip unsupported place

* refine datatransfer

* fit for distributed program

* fix dependency

* fix some ut

93ea1297

04 Apr, 2022 2 commits

S
conv + elementwise_add refactor (#41286) · e5e0b726
Committed by Sławomir Siwek on Apr 04, 2022
```
* DRY

* change nodes names

* add const prefix

* change asX to as_x in all files
```
e5e0b726

Add dropout yaml (#41355) · 1c7001e7

Committed by hong on Apr 04, 2022

* add dropout slice yaml

* remove useless code

* fix infer shape error

* skip infrt compile for dropout

1c7001e7

03 Apr, 2022 1 commit

[Phi]Concat grad (#41112) · 3f57ef7a

Committed by chentianyu03 on Apr 03, 2022

* add concat_grad kernel

* fix error

* remove comment code

* fix outs nullptr error

* change to phi header

* add concat_grad declare for standalone_executor_test

3f57ef7a

02 Apr, 2022 5 commits
- L
  
  [new-exec] fit empty program for new executor (#41328) · e0ccaeaf
  Committed by Leo Chen on Apr 02, 2022
  
  e0ccaeaf
- W
  [Paddle inference] support new quant_model (#41049) · 1b58ce14
  Committed by Wangzheee on Apr 02, 2022
```
* paddle inference support new quant_model
```
  1b58ce14
- L
  [KP] fix bug in phi static graph mode (#41269) · d0f46aac
  Committed by Liu-xiandong on Apr 02, 2022
```
* [KP] fix bug in phi static graph mode

* modify the useless code
```
  d0f46aac
- Z
  Unified ps refine (#41234) · b3270adf
  Committed by zhaocaibei123 on Apr 02, 2022
```
* update name

* update name

* fix test

* fix fleet bind

* update name

* update name

* fix test

* fix gpups wrapper

* remove Push/Pull/Load/Save with context in client and wrapper base class

* fix

* fix
Co-authored-by: esythan <esythan@126.com>
```
  b3270adf
- L
  
  [new-exec] support to enable mkldnn by flags (#41274) · cb124156
  Committed by Leo Chen on Apr 02, 2022
  
  cb124156
01 Apr, 2022 6 commits

L
fix mac c++ version (#41172) · a2c01db1
Committed by liutiexing on Apr 01, 2022
```
* fix mac c++ version

* update

* fix apple systems
```
a2c01db1

[Phi] Move softmax with cross entropy kernel into phi (#40832) · e6ec98fe

Committed by Chen Weihang on Apr 01, 2022

* add cross_entropy_with_softmax phi kernel

* remove softmax_with_cross_entropy kernel

* add softmax_with_cross_entropy grad kernel

* remove original op kernel

* refine cross entropy impl

* fix pointer error

* revert kernel cu change

* fix xpu failed

* fix cinn failed

* fix npu failed

* add forward sig

* add check_nan_inf for pt kernel

* remove repeat cmake item

* fix unittest error

e6ec98fe

[Phi]Interploatd kernels into phi (#40855) · d65a7a46

Committed by chentianyu03 on Apr 01, 2022

* add interpolate cpu kernel

* fix nullptr bug

* add interpolate gpu kernel

* fix unit test error

* remove raw kernels

* add cuda kernel impl

* add infermeta

* recover accidentally deleted kernels in interpolate op

* fix grad x_grad name error

* remove interpolate_v2_op.h

* rm unused codes

* fix xpu build error

* fix build error

* fix namespace error

* add register header for npu

* fix infermeta error

* modify by review

* add the missing args in test_trt_convert_nearest_interp_v2

d65a7a46

[GPUPS]fix CMakeLists with pslib (#41225) · 4da4265a

Committed by zmxdream on Apr 01, 2022

* fix cmake. test=develop

* fix. test=develop

* fix dep for graphs_ps_gpu. test=develop

* update. test=develop

* update. test=develop

4da4265a

A

[custom kernel] support fallback (#41212) · 9c2a9afd
Committed by Aganlengzi on Apr 01, 2022

9c2a9afd
L
[new-exec] move WaitEvent/RecordEvent into try-catch (#41222) · 5dae6da0
Committed by Leo Chen on Apr 01, 2022
```
* move WaitEvent/RecordEvent into try-catch

* refine supportNpu
```
5dae6da0

31 Mar, 2022 8 commits

heter & multi-cloud brpc communication (#40965) · 2f41f389

Committed by ziyoujiyi on Mar 31, 2022

* back fl

* delete ssl cert

* .

* make warning

* .

* unittest paral degree

* solve unittest

* heter & multi cloud comm ready

* .

* .

2f41f389

Z
[Phi] Rename ScalarArray to IntArray (#40975) · e559fe41
Committed by zyfncg on Mar 31, 2022
```
* rename scalar_array to int_array

* update cmake

* fix conflict

* remove useless log
```
e559fe41

[new-exec] fit mkldnn op (#41058) · 02cf6764

Committed by Leo Chen on Mar 31, 2022

* fix bug that some op has no op_role attr

* add mkldnn support for new executor

* fit for mkldnn data_transfer

* fit for mkldnn data_transfer

02cf6764

Maintain old profiler (#41132) · a6bf2218

Committed by chenjian on Mar 31, 2022

* no

* maintain old profiler

* exclude new python record events for old profiler

* maintain old profiler

* maintain

* maintain old profiler

* maintain

* fix cmakes

a6bf2218

add flatten2,reshape2,squueze2_trt_fuse_pass test cast (#41031) · 7ef69202

Committed by heliqi on Mar 31, 2022

* add flatten2,reshape2,squueze2_trt_fuse_pass  test cast

* add flatten2,reshape2,squueze2_trt_fuse_pass  test cast

* add flatten2,reshape2,squueze2_trt_fuse_pass  test cast

7ef69202

W
[phi] move yolov3_loss to phi (#40944) · fb93bd5c
Committed by wuyefeilin on Mar 31, 2022
```
* mv yolov3_loss op to phi

* fix as review

* update operator.h
```
fb93bd5c

fix load bug and add distributed strategy from pslib (#40883) · 47383dca

Committed by wangguanqun on Mar 31, 2022

* fix load bug and add distributed strategy from pslib

* add unittest

* use cvm config

* trainer and worker config

* add unittest

* add unittest

* add test

* code style

47383dca

L
add depend when doing fuse_all_optimizer on program (#41178) · 3b00dc92
Committed by Leo Chen on Mar 31, 2022
```
* fix dependency of fused optimizer

* add ut
```
3b00dc92
