Commits · ef5d216e2ad190ae6dc83f9e4293bfa36b6e0e8c · Crayon鑫 / Paddle

16 Feb, 2022 4 commits
- change the format of api yaml (#39532) · ef5d216e
  Committed by zyfncg on Feb 16, 2022
  
  ef5d216e
- Support nce in eager mode (#39589) · 672def6c
  Committed by Weilong Wu on Feb 16, 2022
  
  672def6c
- [PTen] Rename general grad infermeta func (#39578) · 12ca438e
  Committed by Chen Weihang on Feb 16, 2022
```
* rename general grad infermeta func

* remove useless code
```
  12ca438e
- [Dy2Stat]Refine ProgramCache.last and Return recent one (#39541) · 4157579e
  Committed by Aurelius84 on Feb 16, 2022
```
* Refine ProgramCache.last and Return recent one

* add comment

* fix unittest
```
  4157579e
15 Feb, 2022 9 commits

[PluggableDevice] Add custom runtime support (#38740) · 3e7825f3

Committed by ronnywang on Feb 15, 2022

* [CustomRuntime] Add DeviceManager

* [CustomRuntime] Add DeviceInterface

* [CustomRuntime] Add Stream, Event, DeviceGuard, CallbackManager

* [CustomRuntime] Add plug-in device

* [CustomRuntime] Memory module support PluggableDevice

* [CustomRuntime] Add WITH_PLUGGABLE_DEVICE cmake option

* update

* [API] update API doc based on comments, test=develop
Co-authored-by: qili93 <qili93@qq.com>

3e7825f3

Added hapi BF16 lenet script (#39298) · 70714d1b
Committed by arlesniak on Feb 15, 2022
```
* hapi lenet BF16

* ops list updated

* year typo fix

* tests updated for CI
```
70714d1b
pool2d_coonvert_ut (#39545) · cf8a5573
Committed by feng_shuai on Feb 15, 2022

cf8a5573
[Paddle-TRT] Replace GeLU plugin with TensorRT built-in layer for TensorRT 7.0. (#38399) · a3689d8c
Committed by Leo Chen on Feb 15, 2022
```
* Replace GeLU plugin with TRT built-in layers for approximate GeLU

* Add TensorRT built-in layer for nonapproximate GeLU
```
a3689d8c
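
The commit above distinguishes approximate and non-approximate GeLU. As a rough illustration of the difference, here is a minimal pure-Python sketch of the two standard formulas. It is not the Paddle-TRT converter code, and the function names `gelu_exact` and `gelu_tanh` are made up for this example.

```python
import math


def gelu_exact(x: float) -> float:
    """Non-approximate GeLU: x * Phi(x), where Phi is the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_tanh(x: float) -> float:
    """Approximate GeLU using the commonly cited tanh formulation."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


if __name__ == "__main__":
    # Print both variants side by side for a few sample inputs.
    for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
        print(f"{x:+.1f}  exact={gelu_exact(x):+.6f}  tanh={gelu_tanh(x):+.6f}")
```

Both variants agree closely on typical activations, so which one an inference engine builds from its built-in layers mostly matters for bit-level parity with the training framework.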
Support test_layers 51/55 tests with _test_eager_guard() (#39515) · 536a55fa
Committed by Weilong Wu on Feb 15, 2022

536a55fa

[Pten] Support SelectedRows in C++ API (#39497) · 5bb3b668

Committed by zyfncg on Feb 15, 2022

* add data_transform in pten api

* support GetKernelTypeForVar

* fix compile problem of bfloat16

* add scale_sr in api

* support select_row in C++ api

* merge code

5bb3b668

delete mish_convert_ut skip (#39432) · 8cedcd3e
Committed by feng_shuai on Feb 15, 2022

8cedcd3e

new way of test case, 2nd, *test=kunlun (#39478) · 4745234f

Committed by z8hanghuan on Feb 15, 2022

* new way of test case, 2nd, *test=kunlun

* new way of test case, 2nd, *test=kunlun

* new way of test case, 2nd, *test=kunlun

4745234f

[PTen]Migrate proto::VarType outside of Pten (#39411) · 7e7e9404

Committed by Aurelius84 on Feb 15, 2022

* #1 migrate dist-related type()-> dtype()

* move datatype function from pten -> fluid/framework

* change type() in imperative into convert(dtype())

* modify xx_tensor->type into xx_tensor->dtype

* change the set_type interface and the caller

* modify xx_tensor.type into xx_tensor.dtype

* fix mutable_data(place, dtype())

* change caller of mutable_data in pten and distributed

* change the caller of mutable_data in fluid/framework

* change the caller of mutable_data in imperative directory

* mutable_data: inference

* update the call of mutable_data

* transfer MakePenScalarArray MakePtenScalar ResetHolderWithType

* pass the compile. the next step is remove VarType in Pten

* fix all and remove VarType from pten. success in linux. Next task is other platform

* fix conflict with develop

* fix compiled error

* Fix reset conversion

* fix conflict

* fix compiled problem

* fix typo

* Fix << in tensor_utils.cc

* fix type->dtype

* fix unittest

* fix tensor init constructor

* fix DataTypeSize for BFloat16

* fix code style

* fix npu compiled error

* fix npu

* compile npu successfully

* fix conflict

* fix conflict
Co-authored-by: xiongkun <xiongkun03@baidu.com>

7e7e9404

14 Feb, 2022 8 commits

Add Inplace addto pass and unittest. (#39433) · 52af0a60

Committed by hlygit66666 on Feb 14, 2022

* add fuse_relu_depthwise_conv_pass unittest

* fix atol and rtol

* fix according to review

* Update test_dist_fuse_relu_depthwise_conv_pass.py

* add inplace_addto pass and unittest

52af0a60

[UT] mish op, conv+mish, fc+mish fuse passes (#39340) · 02938b3d

Committed by Sławomir Siwek on Feb 14, 2022

* mish unit tests

* code format

* remove unused imports

* code format

* remove hard-coded shape values

* remove timeouts

* remove timeouts v2

* restore timeouts

02938b3d
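
For readers unfamiliar with the mish op that these unit tests and fuse passes target, the activation is usually defined as `x * tanh(softplus(x))`. The following is a minimal pure-Python sketch of that definition. It is only an illustration and not the Paddle or oneDNN kernel; the helper names are chosen for this example.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable softplus: log(1 + exp(x))."""
    # max(x, 0) + log1p(exp(-|x|)) avoids overflow for large positive x.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x: float) -> float:
    """Mish activation: x * tanh(softplus(x))."""
    return x * math.tanh(softplus(x))


if __name__ == "__main__":
    # Show the activation on a few sample inputs.
    for x in (-5.0, -1.0, 0.0, 1.0, 5.0):
        print(f"mish({x:+.1f}) = {mish(x):+.6f}")
```

A reference implementation like this is the kind of scalar baseline a unit test could compare a fused conv+mish or fc+mish output against, within a tolerance.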

Unified PS: heter PS two-stage unit tests pass (#39468) · 765a2ada

Committed by ziyoujiyi on Feb 14, 2022

* delete gloo connect retry

* the_one_ps dirs reconstruct

* .

* .

* create the_one_ps dirs

* create the_one_ps dirs

* create the_one_ps dirs

* create the_one_ps dirs

* create the_one_ps dirs

* create the_one_ps dirs

* the one ps dirs modify

* the one ps dirs modify

* the one ps dirs modify

* the one ps dirs modify

* refactor ps optimize

* refactor ps optimize

* refactor ps optimize

* .

* .

* .

* .

* .

* .

* refactor theoneps

* the_one_ps

* add ps pass unittest

* add ps pass unittest

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* add cpu_async_ps_mode test

* add cpu_async_ps_mode test

* add cpu_async_ps_mode test

* ps unittest ready

* ps unittest ready

* solve dist_pass init conflict

* solve import CommContext error

* unittest ok

* implement AllocateFrom

* solve setup.py.in conflict

* solve conflict

* solve conflict

* solve conflict

* .

* .

* cpu-async-ps minimize test ok & gpu minimize test ok

* add heter 2stage unittest

* add heter 2stage unittest

* add heter 2stage unittest
Co-authored-by: zkh2016 <zhangkaihuo@baidu.com>

765a2ada

new may of test cases, *test=kunlun (#39444) · e07420b9

Committed by z8hanghuan on Feb 14, 2022

* new may of test cases, *test=kunlun

* new may of test cases, *test=kunlun

* new may of test cases, *test=kunlun

e07420b9

fix gather_nd, *test=kunlun (#39283) · d12c3636
Committed by TTerror on Feb 14, 2022

d12c3636
update xpu test build script and fix get_test_cover_info, *test=kunlun (#39235) · 9ba3f429
Committed by TTerror on Feb 14, 2022

9ba3f429
Fixed get_tensor method for EagerTensor (#39414) · 97229944
Committed by Zhanlue Yang on Feb 14, 2022
```
* Enabled Eager OpTest #1

* Enabled Eager OpTest #1

* Fixed get_tensor method for EagerTensor
```
97229944

Adjusted python-level trace_op to accomodate final state Eager Dygraph (#39319) · ec8a0c1d

Committed by Zhanlue Yang on Feb 14, 2022

* Removed debug info

* Added automatic code generation for final state Eager Dygraph

* Modified backward yaml

* Added EagerUtils helper functions for final state CodeGen

* Adjusted CMakeFiles to support compilation for final state auto generated codes

* Added python-c code generation for final state Eager Dygraph

* Fixed minor issue

* Fixed yaml.load() method failure

* Fixed minor issues

* Refactored Python-C Attributes Parsing Functions

* Fixed minor issue with Python-C AddFunctions

* Adjusted python-level trace_op to accomodate final state Eager Dygraph

* Added Logs for final state Eager Dygraph

* Fixed merge issues

* Fixed minor issue

ec8a0c1d

13 Feb, 2022 1 commit

[Pten] Generate Wrapped InferMeta by Yaml (#39482) · 74a150fe

Committed by zyfncg on Feb 13, 2022

* generate wrapped_infer_meta

* add test for wrapped_infer_meta

* Update test_meta_fn_utils.cc

* change the dir of generated file
Co-authored-by: Chen Weihang <chenweihang@baidu.com>
Co-authored-by: Chen Weihang <chenwhpro@163.com>

74a150fe

11 Feb, 2022 9 commits

Add TensorRT inspector into Paddle-TRT (#38362) · 69793a27
Committed by Leo Chen on Feb 11, 2022

69793a27

Added shape (U)INT8/BF16/FP32 oneDNN kernel (#36033) · 52bbaae9

Committed by jakpiase on Feb 11, 2022

* added shape oneDNN kernel

* removed unnecessary import from test

* added skipping tests for GPU

* refactoring

* refactored shape kernel

* added tests in new framework

* removed one line

* minor change

* added newline at EOF

* added formatting

* added attributes as extra

52bbaae9

[MLU] add pool2d pytest (#39454) · 2db25f0d
Committed by fwenguang on Feb 11, 2022

2db25f0d
uniform_random op for mlu (#39450) · 02f06708
Committed by joeqiao12 on Feb 11, 2022

02f06708
[bf16] add bf16 kernel: transpose & unbind (#39457) · 1e6047f1
Committed by zhangbo9674 on Feb 11, 2022
```
* add transpose unbind

* add unittest

* refine transpose unittest
```
1e6047f1
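
Several commits in this log add bf16 kernels. As background, bfloat16 keeps the sign bit, the 8 exponent bits, and the top 7 mantissa bits of an IEEE 754 float32. The sketch below shows that relationship in plain Python using simple truncation. It is an assumption-laden illustration only: real kernels typically use round-to-nearest-even rather than truncation, and the function names here are hypothetical, not Paddle APIs.

```python
import struct


def float_to_bf16_bits(value: float) -> int:
    """Return the 16-bit bfloat16 pattern for value by truncating its float32 bits."""
    (bits32,) = struct.unpack("<I", struct.pack("<f", value))
    # bfloat16 is the upper half of the float32 bit pattern.
    return bits32 >> 16


def bf16_bits_to_float(bits16: int) -> float:
    """Expand a 16-bit bfloat16 pattern back to a Python float."""
    (value,) = struct.unpack("<f", struct.pack("<I", (bits16 & 0xFFFF) << 16))
    return value


if __name__ == "__main__":
    # Round-trip a few values to show the precision lost in bfloat16.
    for v in (1.0, 3.14159, -0.1):
        bits = float_to_bf16_bits(v)
        print(f"{v!r} -> 0x{bits:04X} -> {bf16_bits_to_float(bits)!r}")
```

Because the exponent width matches float32, shape-only ops such as transpose, unbind, squeeze, unsqueeze, and stack can move bf16 data around without any numeric conversion.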
[MLU]support c_gen_cncl_id_op run on MLU device (#39336) · 89aa8b1a
Committed by zn on Feb 11, 2022
```
Co-authored-by: zhangna <zhangna@cambricon.com>
```
89aa8b1a
fix prelu trt convert (#39389) · c86765ed
Committed by JingZhuangzhuang on Feb 11, 2022

c86765ed

Unified PS development - Python (#39431) · 22c67d14

Committed by ziyoujiyi on Feb 11, 2022

* delete gloo connect retry

* the_one_ps dirs reconstruct

* .

* .

* create the_one_ps dirs

* create the_one_ps dirs

* create the_one_ps dirs

* create the_one_ps dirs

* create the_one_ps dirs

* create the_one_ps dirs

* the one ps dirs modify

* the one ps dirs modify

* the one ps dirs modify

* the one ps dirs modify

* refactor ps optimize

* refactor ps optimize

* refactor ps optimize

* .

* .

* .

* .

* .

* .

* refactor theoneps

* the_one_ps

* add ps pass unittest

* add ps pass unittest

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* add cpu_async_ps_mode test

* add cpu_async_ps_mode test

* add cpu_async_ps_mode test

* ps unittest ready

* ps unittest ready

* solve dist_pass init conflict

* solve import CommContext error

* unittest ok

* implement AllocateFrom

* solve setup.py.in conflict

* solve conflict

* solve conflict

* solve conflict

* .

* .

* cpu-async-ps minimize test ok & gpu minimize test ok
Co-authored-by: zkh2016 <zhangkaihuo@baidu.com>

22c67d14

【Pten】Auto-Generate InterMeta register (#39436) · 7d6096ff

Committed by zyfncg on Feb 11, 2022

* fix code conflict

* generate inter_meta register

* clear cache

* just try

* add sign c++ api

* polish some code

7d6096ff

10 Feb, 2022 9 commits
- [Dy2St]Handle `a, b = paddle.shape(x)` in Static Analysis (#39245) · 1252f4bb
  Committed by 0x45f on Feb 10, 2022
```
* refine Assign

* add UT
```
  1252f4bb
- [MLU] add mlu kernel for accuracy op (#39337) · 383de295
  Committed by fwenguang on Feb 10, 2022
```
* [MLU] add mlu kernel for accuracy op

* fix license format

* fix error message
```
  383de295
- [NPU] add reduce_min (#39019) · 2b8b16d7
  Committed by furnace on Feb 10, 2022
```
[NPU] add reduce_min
```
  2b8b16d7
- change dtype of pooling mask to 'int32' for Paddle2ONNX (#39314) · 29d31606
  Committed by Wei Shengyu on Feb 10, 2022
```
* change dtype of pooling mask to 'int32' for Paddle2ONNX

* empty commit to rerun ci

* fix format
```
  29d31606
- mkldnn layout issue fix (#39422) · 52d6b306
  Committed by wenbin on Feb 10, 2022
```
* mkldnn conv fix

* definition
```
  52d6b306
- Add _get_parameter method to Lamb optimizer (#39416) · c47d6729
  Committed by sneaxiy on Feb 10, 2022
```
* add _get_parameter func to lamb

* remove duplicate code
```
  c47d6729
- 【Pten】Refactor C++ API code-gen (#39408) · 7b70b792
  Committed by zyfncg on Feb 10, 2022
```
* refactor C++ API code-gen

* fix windows problem of C++ API
```
  7b70b792
- Modify the unsqueeze dimension of input data in conv1d NCL And NLC format (#38425) · 224bc511
  Committed by crystal on Feb 10, 2022
```
* optimize conv1d forward

* add conv opt

* Optimize memory copy

* delete share data with

* set num_filters=512

* add nlc optimize

* Optimize num_filter=512 data on A100 and V100

* Fix the workspace_size size setting of filter
```
  224bc511
- [bf16] add bf16 kernel: squeeze & unsqueeze & stack (#39402) · 59c7aea5
  Committed by zhangbo9674 on Feb 10, 2022
```
* add squeeze unsqueeze stack

* add unittest

* add cpu kernel
```
  59c7aea5

Crayon鑫 / Paddle: in sync with the fork source project