Commits · 801159ceeb20e05ca71582f640cba05ee2651ac5 · PaddlePaddle / Paddle

Jan 26, 2022 · 25 commits

Add FuseBatchNormAddActPass and unittest. (#39178) · 801159ce

Committed by hlygit66666 on Jan 26, 2022

* add fuse_relu_depthwise_conv_pass unittest

* fix atol and rtol

* fix according to review

* add FuseBatchNormAddActPass and unittest

* Update test_dist_fuse_bn_add_act_pass.py

* solve conflict

801159ce

[pten] Cast xpu kernel (#39179) · 93d2f0a6

Committed by chentianyu03 on Jan 26, 2022

* cast xpu kernel init

* cast xpu kernel

* replace with raw cast xpu kernel

* fix cast kernel bug

* add the missing break

* modify namespace and header file

93d2f0a6

add dependences of enforce (#39237) · 2c0160e5

Committed by xiongkun on Jan 26, 2022

2c0160e5

[Eager] Support imperative selected_rows_to_lod_tensor and the opposite case (#39223) · 787980b1

Committed by Weilong Wu on Jan 26, 2022

* Added selected_rows and rw_lock to pten

* Renamed the unit test target to fix CI

* Removed Class SelectedRows in Fluid, changed include/cmake relationship, use pten::SelectedRows in Fluid

* Remove rw_lock.h,rw_lock_test.cc in fluid

* Use pten::RWLock and pten::AutoRDLock, fix CI

* Use pten::SelectedRows

* Use pten::SelectedRows

* Fix to pass NPU CI

* Selected_Rows inherits from TensorBase

* Use pten::SelectedRows, to pass NPU CI

* To fix NPU CI

* To fix NPU CI again

* Use paddle/pten/core/enforce and polish code

* Support imperative selected_rows_to_lod_tensor

* Polish code

787980b1

[MLU]Add conv2d op (#39110) · 71634a61

Committed by qipengh on Jan 26, 2022

* [MLU]Add conv2d op

* [MLU]fix comment

* [MLU]adapt NCHW of conv2d op

71634a61

[IPU] sync misc changes 01 (#38876) · 4efbebea

Committed by Allen Guo on Jan 26, 2022

* sync misc changes

* apply comments 01

* fix compile error

* remove is_ipu_place check

* add authors
Co-authored-by: Xiaobing Wang <xiaobingw@graphcore.ai>
Co-authored-by: Allen Guo <alleng@graphcore.ai>
Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai>
Co-authored-by: Haicheng Jiang <haichengj@graphcore.ai>
Co-authored-by: Han Zhao <hanzhao@graphcore.ai>

* sync changes

* restore cmake

* update ir cmake and setup.py

* update inference_lib cmake

* split PR
Co-authored-by: Xiaobing Wang <xiaobingw@graphcore.ai>
Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai>
Co-authored-by: Haicheng Jiang <haichengj@graphcore.ai>
Co-authored-by: Han Zhao <hanzhao@graphcore.ai>

4efbebea

update uts p2 (#39232) · 83d0d853

Committed by yaozhixin on Jan 26, 2022

83d0d853

[Move selected_rows PR ] VisitDataType use Pten::DataType (#39236) · 42a0947e

Committed by Weilong Wu on Jan 26, 2022

* Added selected_rows and rw_lock to pten

* Renamed the unit test target to fix CI

* Removed Class SelectedRows in Fluid, changed include/cmake relationship, use pten::SelectedRows in Fluid

* Remove rw_lock.h,rw_lock_test.cc in fluid

* Use pten::RWLock and pten::AutoRDLock, fix CI

* Use pten::SelectedRows

* Use pten::SelectedRows

* Fix to pass NPU CI

* Selected_Rows inherits from TensorBase

* Use pten::SelectedRows, to pass NPU CI

* To fix NPU CI

* To fix NPU CI again

* Use paddle/pten/core/enforce and polish code

* Use pten::DataType instead of using proto_type

* Move part of data_type to pten

* Polish Code

42a0947e

[Pten]Move kernel_primitives lib to Pten directory (#39169) · 452bcbe2

Committed by YuanRisheng on Jan 26, 2022

* move kernel_primitives

* use pten's errors

452bcbe2

[PTEN] cpu_context add eigen deps (#39234) · bd5c962d

Committed by Wilber on Jan 26, 2022

* add eigen deps

* update

bd5c962d

[IPU] sync misc changes 02 (#39189) · 5df78366

Committed by Allen Guo on Jan 26, 2022

* sync misc changes

* apply comments 01

* fix compile error

* remove is_ipu_place check

* add authors
Co-authored-by: Xiaobing Wang <xiaobingw@graphcore.ai>
Co-authored-by: Allen Guo <alleng@graphcore.ai>
Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai>
Co-authored-by: Haicheng Jiang <haichengj@graphcore.ai>
Co-authored-by: Han Zhao <hanzhao@graphcore.ai>

* sync changes

* restore cmake

* update ir cmake and setup.py

* update inference_lib cmake

* restore for split PR
Co-authored-by: Xiaobing Wang <xiaobingw@graphcore.ai>
Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai>
Co-authored-by: Haicheng Jiang <haichengj@graphcore.ai>
Co-authored-by: Han Zhao <hanzhao@graphcore.ai>

5df78366

[AMP] support setting amp_level in multi-thread (#39198) · 04285ab4

Committed by Leo Chen on Jan 26, 2022

04285ab4

[Move selected_rows PR #4] SelectedRows inherits from TensorBase. (#39162) · 3e80253a

Committed by Weilong Wu on Jan 26, 2022

* Added selected_rows and rw_lock to pten

* Renamed the unit test target to fix CI

* Removed Class SelectedRows in Fluid, changed include/cmake relationship, use pten::SelectedRows in Fluid

* Remove rw_lock.h,rw_lock_test.cc in fluid

* Use pten::RWLock and pten::AutoRDLock, fix CI

* Use pten::SelectedRows

* Use pten::SelectedRows

* Fix to pass NPU CI

* Selected_Rows inherits from TensorBase

* Use pten::SelectedRows, to pass NPU CI

* To fix NPU CI

* To fix NPU CI again

* Use paddle/pten/core/enforce and polish code

3e80253a

add profile record (infer_shape, compute) for dygraph (#39023) · d9acc87e

Committed by pangyoki on Jan 26, 2022

* add profile record for dygraph

* add op type in record

* fix little bug

* solve conflict

d9acc87e

update uts p3 (#39214) · eb45bb4e

Committed by yaozhixin on Jan 26, 2022

eb45bb4e

change output of backward_api (#39229) · 33b3e28a

Committed by zyfncg on Jan 26, 2022

33b3e28a

[Refactoring Tensor PR #7] differentiate deprecated interfaces (#39228) · 30470853

Committed by 石晓伟 on Jan 26, 2022

30470853

Optimize layer norm forward when cols is 1024. (#39167) · 01d04be6

Committed by Li Min on Jan 26, 2022

* Optimize layer_norm fwd when cols is 1024.

01d04be6

update uts p1 (#39210) · 6efb9f59

Committed by yaozhixin on Jan 26, 2022

6efb9f59

add sigmoid cross entropy with logits to kl2 (#38915) · fd44de58

Committed by houj04 on Jan 26, 2022

* add sigmoid cross entropy with logits to kl2. test=kunlun

* add sigmoid cross entropy with logits to kl2. test=kunlun

* follow comments. test=kunlun

fd44de58

support npu weight unified H2D copy before inference (#39160) · 106b5514

Committed by baoachun on Jan 26, 2022

* support npu weight unified H2D copy

* remove redundant variable

106b5514

fix gradient accumulator bug. test=kunlun (#39127) · b1a458ac

Committed by houj04 on Jan 26, 2022

* fix gradient accumulator bug. test=kunlun

* fix typo. test=kunlun

* fix typo. test=kunlun

* fix unit tests. test=kunlun

* using TensorCopySync. test=kunlun

* only fix for xpu place. test=kunlun

b1a458ac

[fleet_executor] Dist model bug fixer (#39207) · 02d3f232

Committed by Yuang Liu on Jan 26, 2022

02d3f232

sum op (#39165) · 55d6b87c

Committed by joeqiao12 on Jan 26, 2022

55d6b87c

[PTen] Unify InferMeta(Shape) Function in pten and fluid op (#38976) · b75507d3

Committed by Chen Weihang on Jan 26, 2022

* infermeta context init design

* support infermeta called in fluid op

* add hasattr and attr methods

* add dygraah GetVarPtrs support

* rename arg_map_context to arg_map_utils

* add registry for arg map func

* resolve conflit

* refactor op utils design

* polish meta config

* fix details

* remove hasattr method

* resolve conflit

* revert cmake order change

* revert some change

* change init pos

* fix compile faileed

* fix typo

* fix inference failed

* fix windows ccompile failed

* polish format
Co-authored-by: Wang Huan <wanghuan29@baidu.com>

b75507d3

Jan 25, 2022 · 15 commits
- reconstruct directory of ps (#39191) · 2bf9b844
  Committed by yaoxuefeng on Jan 25, 2022

  2bf9b844
- fix compile problem cause by api code_gen (#39199) · 39238275
  Committed by zyfncg on Jan 25, 2022

  39238275
- change infermeta and remove makePtenTenosr in reshape (#39186) · 7613129e
  Committed by YuanRisheng on Jan 25, 2022

  7613129e
- Add FuseBatchNormActPass and unittest. (#39176) · 09104d02
  Committed by hlygit66666 on Jan 25, 2022

  * add fuse_relu_depthwise_conv_pass unittest

  * fix atol and rtol

  * fix according to review

  * Add fuse_bn_act_pass unittest

  * rm others

  * add fuse_bn_act_pass

  09104d02
- GetWorkspaceSize trigger modfication in heuristic cudnn conv (#39184) · 4c61e141
  Committed by limingshu on Jan 25, 2022

  * first commit

  * add more changes

  4c61e141
- add trace event data structure definition (#39109) · 57b2033b
  Committed by chenjian on Jan 25, 2022

  * add trace event data structure definition

  * convert enum item to string for cupti enum explaination

  * modify paddle_enforce_eq description

  57b2033b
- [inference] update trt convert reduce op&ut,test=develop (#39088) · 80753755
  Committed by Zhang Jun on Jan 25, 2022

  * [inference] update convert reduce op&ut,test=develop

  * update

  * update

  * update

  * add int32 support

  * add int32 support

  * add comments

  * trt < 7.0 do not support int32

  * test=develop

  * update

  * test=develop

  80753755
- [MLU]add mlu kernel for fill_constant op (#39069) · 6e871dbc
  Committed by joeqiao12 on Jan 25, 2022

  * [MLU]add mlu kernel for fill_constant op

  * delete device_context DEPS

  6e871dbc
- Revert "Replace EigenBroadcast with ElementwiseBroadcast in ReduceGrad (#38959)" (#39205) · 978558be
  Committed by niuliling123 on Jan 25, 2022

  This reverts commit 9059ef69.

  978558be
- fix custom ops, test=develop (#39153) · 712ccfbf
  Committed by 石晓伟 on Jan 25, 2022

  712ccfbf
- fix:the axis must be 1(channel), when the dims of bias is 1 (#39052) · f07b8cbe
  Committed by feng_shuai on Jan 25, 2022

  f07b8cbe
- [Custom Ops]Assert _compile_dir/includes.txt existence (#39183) · 1e515aa8
  Committed by sneaxiy on Jan 25, 2022

  * assert _compile_dir include file existence

  * polish

  1e515aa8
- [MLU]add mlu batch_norm kernel pytest (#39071) · 55164761
  Committed by fwenguang on Jan 25, 2022

  55164761
- [MLU]add mlu kernel for split and concat (#39020) · ac3dc0bb
  Committed by joeqiao12 on Jan 25, 2022

  * [MLU]add mlu kernel for concat and split op

  * delete device_context DEPS

  ac3dc0bb
- [fleet_executor] Dist model run method Implementation (#39194) · 20e23e1b
  Committed by Yuang Liu on Jan 25, 2022

  20e23e1b

PaddlePaddle / Paddle · last synced successfully about 1 year ago