Commits · 5bb3b66834a1038e6f10a92ccf228f2d2a3b922a · Crayon鑫 / Paddle

Feb 15, 2022 (4 commits)

[Pten] Support SelectedRows in C++ API (#39497) · 5bb3b668

Committed by zyfncg on Feb 15, 2022

* add data_transform in pten api

* support GetKernelTypeForVar

* fix compile problem of bfloat16

* add scale_sr in api

* support select_row in C++ api

* merge code

5bb3b668

delete mish_convert_ut skip (#39432) · 8cedcd3e
Committed by feng_shuai on Feb 15, 2022

8cedcd3e

new way of test case, 2nd, *test=kunlun (#39478) · 4745234f

Committed by z8hanghuan on Feb 15, 2022

* new way of test case, 2nd, *test=kunlun

* new way of test case, 2nd, *test=kunlun

* new way of test case, 2nd, *test=kunlun

4745234f

[PTen]Migrate proto::VarType outside of Pten (#39411) · 7e7e9404

Committed by Aurelius84 on Feb 15, 2022

* #1 migrate dist-related type()-> dtype()

* move datatype function from pten -> fluid/framework

* change type() in imperative into convert(dtype())

* modify xx_tensor->type into xx_tensor->dtype

* change the set_type interface and the caller

* modify xx_tensor.type into xx_tensor.dtype

* fix mutable_data(place, dtype())

* change caller of mutable_data in pten and distributed

* change the caller of mutable_data in fluid/framework

* change the caller of mutable_data in imperative directory

* mutable_data: inference

* update the call of mutable_data

* transfer MakePtenScalarArray MakePtenScalar ResetHolderWithType

* pass the compile. the next step is remove VarType in Pten

* fix all and remove VarType from pten. success in linux. Next task is other platform

* fix conflict with develop

* fix compiled error

* Fix reset conversion

* fix conflict

* fix compiled problem

* fix typo

* Fix << in tensor_utils.cc

* fix type->dtype

* fix unittest

* fix tensor init constructor

* fix DataTypeSize for BFloat16

* fix code style

* fix npu compiled error

* fix npu

* compile npu successfully

* fix conflict

* fix conflict
Co-authored-by: xiongkun <xiongkun03@baidu.com>

7e7e9404

Feb 14, 2022 (8 commits)

Add Inplace addto pass and unittest. (#39433) · 52af0a60

Committed by hlygit66666 on Feb 14, 2022

* add fuse_relu_depthwise_conv_pass unittest

* fix atol and rtol

* fix according to review

* Update test_dist_fuse_relu_depthwise_conv_pass.py

* add inplace_addto pass and unittest

52af0a60

[UT] mish op, conv+mish, fc+mish fuse passes (#39340) · 02938b3d

Committed by Sławomir Siwek on Feb 14, 2022

* mish unit tests

* code format

* remove unused imports

* code format

* remove hard-coded shape values

* remove timeouts

* remove timeouts v2

* restore timeouts

02938b3d

Unified PS: heter PS two-stage unit tests pass (#39468) · 765a2ada

Committed by ziyoujiyi on Feb 14, 2022

* delete gloo connect retry

* the_one_ps dirs reconstruct

* .

* .

* create the_one_ps dirs

* create the_one_ps dirs

* create the_one_ps dirs

* create the_one_ps dirs

* create the_one_ps dirs

* create the_one_ps dirs

* the one ps dirs modify

* the one ps dirs modify

* the one ps dirs modify

* the one ps dirs modify

* refactor ps optimize

* refactor ps optimize

* refactor ps optimize

* .

* .

* .

* .

* .

* .

* refactor theoneps

* the_one_ps

* add ps pass unittest

* add ps pass unittest

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* add cpu_async_ps_mode test

* add cpu_async_ps_mode test

* add cpu_async_ps_mode test

* ps unittest ready

* ps unittest ready

* solve dist_pass init conflict

* solve import CommContext error

* unittest ok

* implement AllocateFrom

* solve setup.py.in conflict

* solve conflict

* solve conflict

* solve conflict

* .

* .

* cpu-async-ps minimize test ok & gpu minimize test ok

* add heter 2stage unittest

* add heter 2stage unittest

* add heter 2stage unittest
Co-authored-by: zkh2016 <zhangkaihuo@baidu.com>

765a2ada

new way of test cases, *test=kunlun (#39444) · e07420b9

Committed by z8hanghuan on Feb 14, 2022

* new way of test cases, *test=kunlun

* new way of test cases, *test=kunlun

* new way of test cases, *test=kunlun

e07420b9

fix gather_nd, *test=kunlun (#39283) · d12c3636
Committed by TTerror on Feb 14, 2022

d12c3636

update xpu test build script and fix get_test_cover_info, *test=kunlun (#39235) · 9ba3f429
Committed by TTerror on Feb 14, 2022

9ba3f429

Fixed get_tensor method for EagerTensor (#39414) · 97229944
Committed by Zhanlue Yang on Feb 14, 2022
```
* Enabled Eager OpTest #1

* Enabled Eager OpTest #1

* Fixed get_tensor method for EagerTensor
```
97229944

Adjusted python-level trace_op to accommodate final state Eager Dygraph (#39319) · ec8a0c1d

Committed by Zhanlue Yang on Feb 14, 2022

* Removed debug info

* Added automatic code generation for final state Eager Dygraph

* Modified backward yaml

* Added EagerUtils helper functions for final state CodeGen

* Adjusted CMakeFiles to support compilation for final state auto generated codes

* Added python-c code generation for final state Eager Dygraph

* Fixed minor issue

* Fixed yaml.load() method failure

* Fixed minor issues

* Refactored Python-C Attributes Parsing Functions

* Fixed minor issue with Python-C AddFunctions

* Adjusted python-level trace_op to accommodate final state Eager Dygraph

* Added Logs for final state Eager Dygraph

* Fixed merge issues

* Fixed minor issue

ec8a0c1d

Feb 13, 2022 (1 commit)

[Pten] Generate Wrapped InferMeta by Yaml (#39482) · 74a150fe

Committed by zyfncg on Feb 13, 2022

* generate wrapped_infer_meta

* add test for wrapped_infer_meta

* Update test_meta_fn_utils.cc

* change the dir of generated file
Co-authored-by: Chen Weihang <chenweihang@baidu.com>
Co-authored-by: Chen Weihang <chenwhpro@163.com>

74a150fe

Feb 11, 2022 (9 commits)

Add TensorRT inspector into Paddle-TRT (#38362) · 69793a27
Committed by Leo Chen on Feb 11, 2022

69793a27

Added shape (U)INT8/BF16/FP32 oneDNN kernel (#36033) · 52bbaae9

Committed by jakpiase on Feb 11, 2022

* added shape oneDNN kernel

* removed unnecessary import from test

* added skipping tests for GPU

* refactoring

* refactored shape kernel

* added tests in new framework

* removed one line

* minor change

* added newline at EOF

* added formatting

* added attributes as extra

52bbaae9

[MLU] add pool2d pytest (#39454) · 2db25f0d
Committed by fwenguang on Feb 11, 2022

2db25f0d

uniform_random op for mlu (#39450) · 02f06708
Committed by joeqiao12 on Feb 11, 2022

02f06708

[bf16] add bf16 kernel: transpose & unbind (#39457) · 1e6047f1
Committed by zhangbo9674 on Feb 11, 2022
```
* add transpose unbind

* add unittest

* refine transpose unittest
```
1e6047f1

[MLU]support c_gen_cncl_id_op run on MLU device (#39336) · 89aa8b1a
Committed by zn on Feb 11, 2022
```
Co-authored-by: zhangna <zhangna@cambricon.com>
```
89aa8b1a

fix prelu trt convert (#39389) · c86765ed
Committed by JingZhuangzhuang on Feb 11, 2022

c86765ed

Unified PS development - Python (#39431) · 22c67d14

Committed by ziyoujiyi on Feb 11, 2022

* delete gloo connect retry

* the_one_ps dirs reconstruct

* .

* .

* create the_one_ps dirs

* create the_one_ps dirs

* create the_one_ps dirs

* create the_one_ps dirs

* create the_one_ps dirs

* create the_one_ps dirs

* the one ps dirs modify

* the one ps dirs modify

* the one ps dirs modify

* the one ps dirs modify

* refactor ps optimize

* refactor ps optimize

* refactor ps optimize

* .

* .

* .

* .

* .

* .

* refactor theoneps

* the_one_ps

* add ps pass unittest

* add ps pass unittest

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* ps unittest frame

* add cpu_async_ps_mode test

* add cpu_async_ps_mode test

* add cpu_async_ps_mode test

* ps unittest ready

* ps unittest ready

* solve dist_pass init conflict

* solve import CommContext error

* unittest ok

* implement AllocateFrom

* solve setup.py.in conflict

* solve conflict

* solve conflict

* solve conflict

* .

* .

* cpu-async-ps minimize test ok & gpu minimize test ok
Co-authored-by: zkh2016 <zhangkaihuo@baidu.com>

22c67d14

【Pten】Auto-Generate InferMeta register (#39436) · 7d6096ff

Committed by zyfncg on Feb 11, 2022

* fix code conflict

* generate infer_meta register

* clear cache

* just try

* add sign c++ api

* polish some code

7d6096ff

Feb 10, 2022 (11 commits)
- [Dy2St]Handle `a, b = paddle.shape(x)` in Static Analysis (#39245) · 1252f4bb
  Committed by 0x45f on Feb 10, 2022
```
* refine Assign

* add UT
```
  1252f4bb
- [MLU] add mlu kernel for accuracy op (#39337) · 383de295
  Committed by fwenguang on Feb 10, 2022
```
* [MLU] add mlu kernel for accuracy op

* fix license format

* fix error message
```
  383de295
- [NPU] add reduce_min (#39019) · 2b8b16d7
  Committed by furnace on Feb 10, 2022
```
[NPU] add reduce_min
```
  2b8b16d7
- change dtype of pooling mask to 'int32' for Paddle2ONNX (#39314) · 29d31606
  Committed by Wei Shengyu on Feb 10, 2022
```
* change dtype of pooling mask to 'int32' for Paddle2ONNX

* empty commit to rerun ci

* fix format
```
  29d31606
- mkldnn layout issue fix (#39422) · 52d6b306
  Committed by wenbin on Feb 10, 2022
```
* mkldnn conv fix

* definition
```
  52d6b306
- Add _get_parameter method to Lamb optimizer (#39416) · c47d6729
  Committed by sneaxiy on Feb 10, 2022
```
* add _get_parameter func to lamb

* remove duplicate code
```
  c47d6729
- 【Pten】Refactor C++ API code-gen (#39408) · 7b70b792
  Committed by zyfncg on Feb 10, 2022
```
* refactor C++ API code-gen

* fix windows problem of C++ API
```
  7b70b792
- Modify the unsqueeze dimension of input data in conv1d NCL and NLC format (#38425) · 224bc511
  Committed by crystal on Feb 10, 2022
```
* optimize conv1d forward

* add conv opt

* Optimize memory copy

* delete share data with

* set num_filters=512

* add nlc optimize

* Optimize num_filter=512 data on A100 and V100

* Fix the workspace_size size setting of filter
```
  224bc511
- [bf16] add bf16 kernel: squeeze & unsqueeze & stack (#39402) · 59c7aea5
  Committed by zhangbo9674 on Feb 10, 2022
```
* add squeeze unsqueeze stack

* add unittest

* add cpu kernel
```
  59c7aea5
- [bf16] add bf16 kernel: dropout & reshape & slice (#39395) · e8ac7fc3
  Committed by zhangbo9674 on Feb 10, 2022
```
* add dropout

* add reshape

* add slice

* refine slice unittest

* refine slice unittest

* add cpu bf16 kernel
```
  e8ac7fc3
- [PluggableDevice] custom kernel supports multi cpp_dtype registering (#39385) · 63d2333e
  Committed by Aganlengzi on Feb 10, 2022
  
  63d2333e
Feb 9, 2022 (7 commits)
- optimize sharding stage3 offload (#39397) · b292dfb8
  Committed by Baibaifan on Feb 9, 2022
  
  b292dfb8
- [Paddle-Inference] rebuild matmul pass: trt and gpu_cpu (#39369) · db7d129e
  Committed by Wangzheee on Feb 9, 2022
```
* rebuild matmul pass: trt and gpu_cpu

* rebuild matmul pass: trt and gpu_cpu

* rebuild matmul pass: trt and gpu_cpu

* rebuild matmul pass: trt and gpu_cpu
```
  db7d129e
- [MLU] add gaussian_random mlu kernel (#39338) · c35b4b8e
  Committed by fwenguang on Feb 9, 2022
  
  c35b4b8e
- [mlu] add mlu kernel for momentum op (#39331) · f8ba12e5
  Committed by fwenguang on Feb 9, 2022
  
  f8ba12e5
- [mlu] add mlu kernel for elementwise_add (#39313) · d47a511a
  Committed by fwenguang on Feb 9, 2022
  
  d47a511a
- Replace EagerTensor with Tensor (#39376) · 945a3ce9
  Committed by Jiabin Yang on Feb 9, 2022
```
* merge legacy to fluid

* Remove legacy code

* Remove legacy code

* Remove DataType test

* Using Tensor directly instead of using EagerTensor

* support gradient_accumulation

* make test_imperative_lod_tensor_to_selected_rows longer

* make test_imperative_lod_tensor_to_selected_rows longer
```
  945a3ce9
- Move trace op to pten (#39227) · d7dddf94
  Committed by hong on Feb 9, 2022
```
* add trace op

* bug fix

* bug fix; test=develop

* thrust bug fix; test=develop

* remove useless register; test=develop

* fix bug; test=develop

* update trace kernel; test=develop

* move kernel args to trace_sig; test=develop
```
  d7dddf94

Crayon鑫 / Paddle is in sync with its fork source project