Commits · 5d5d84503f1c2241255601a188e78d33f092d8e2 · PaddlePaddle / Paddle

20 Jan, 2022 · 10 commits

fix mac ci bug (#38964) · 5d5d8450
Committed by YUNSHEN XIE on Jan 20, 2022
```
* test=allcases;notest,test=mac_py3

* fix bug in mac ci

* fix format issue
```

[Auto Parallel] Improve the dist op interface and the compatible computation (#39014) · 9acc26ca

Committed by Yulong Ao on Jan 20, 2022

* Add the backward support for QR

* Remove unnecessary comments

* [Auto Parallel] Improve the dist op interface and compatible computation

* Remove unnecessary modification

* Recover some modifications

* Add lost files

* Fix a minor bug

* Fix the bug of the planner

* Fix the format problem

mod communicator (#39064) · 2a9c993e
Committed by yaoxuefeng on Jan 20, 2022

Fix master weight bug for multi_tensor optimizer(momentum, adam) (#38991) · 6b0c57cf
Committed by zhangbo9674 on Jan 20, 2022
```
* fix mp

* support merged_momentum for mp
```
[Paddle-ASP]Make test_asp_sharding running on non-mac platform (#39034) · c0f27282
Committed by minghaoBD on Jan 20, 2022
```
* [Paddle-ASP]Make test_asp_sharding running on non-mac platform

* syntax check

* syntax check
```

【PTen】Remove code of converting Tensor to DensoeTensor (#38926) · 8784ec65

Committed by zyfncg on Jan 20, 2022

* remove MakePtenTensor in BuildKernelContext

* fix a bug caused by storage

* remove WriteBackOutput in dynamic and static mode

* fix complie error of std::max

* fix complie error of std::max

* fix date_type bug

* fix memory alloc bug

* add some debug info

* fix compile problem

* fix problem of data_type check

* comment out some unreached code

remove if !defined(WIN32) (#39058) · 90e9233a
Committed by sneaxiy on Jan 20, 2022

[Eager] Support Eager mode for some testcase (#38783) · d21074cd

Committed by wanghuancoder on Jan 20, 2022

* Rearranged Eager AutoCodeGen directory structure

* Removed USE_OP in Eager AutoCodeGen

* Enabled generation for Operators without Grad/Inputs/Outputs

* Resolved operators without input

* Fixed merge conflicts

* Enabled Eager AutoCodeGen for 10+ more operators

* Refactored Eager AutoCodeGen with more organized helper objects

* Enabled Eager AutoCodeGen for operators with multiple OpBases

* Adjusted Eager AutoCodeGen to Enable Passing Output Tensor as Input Argument

* Handled Dispensable Inputs/Outputs in Eager AutoCodeGen

* Adjusted function generation/call between Python-C API & Dygraph API

* Synchronized auto-generated Python-C API with Dygraph Forward Functions

* support more eager tensor api

* fix merge compile error

* fix compile error and fit develop code

* support pure CPU

* fix some logic error in eager_mode

* support _varbase_creator in eager mode

* Added safe_initialized interface to EagerTensor for use in processing dispensable inputs

* for eager mode

* refine

* support multiple constructor for eager tensor

* add place related code

* polish code

* specific randint with dtype of int64

* Support pure cpu test

* eager logic

* refine test in pure cpu

* eager logic

* eager logic

* eager logic, test=develop

* skip core.eager when in inference, test=develop

* refine, test=develop

* refine, test=develop

* call RetainGrad after run forward kernel, test=develop

* refine, test=develop

* support dygraph util, meta, guard test

* eager test case

* support inference test

* refine test and fix initializer failed

* modify eagertensor patch method

* add eagertensor.clear_grandint, test=develop

* refine, test=develop

* refine, test=develop

* refine, test=develop

* support create varbase and fix retain grad error

* call monkey_patch_varbase in _test_eager_guard, test=develop

* fix windows error

* split clear_gradient to clear_gradient and zero_grads, test=develop

* refine, test=develop

* refine, test=develop

* support test_imperative_basic test in eager mode

* remove additional log in variable.h

* remove additional log in variable.h

* remove additional code create in merge

* eager

* fix some eager logic, test=develop

* refine, test=develop

* refine, test=develop

* refine, test=develop

* patch_tensor_method_func, test=develop

* refine, test=develop

* eager test case, test=develop

* refine, test=develop

* eager, test=develop

* eager, test=develop

* eager optimizer, test=develop

* eager optimizer, test=develop

* eager test_imperative_optimizer_v2, test=develop

* eager, test=develop

* refine, test=develop

* refine, test=develop

* eager, test=develop

* add resize in share buffer to, test=develop

* eager, test=develop

* fix _share_buffer_to, test=develop

* refine, test=develop

* refine, test=develop

* support eager for dataloader,test=develop

Co-authored-by: jim19930609 <jim19930609@gmail.com>
Co-authored-by: JiabinYang <360788950@qq.com>

revert cached kernel context removing (#39055) · 4d413d02
Committed by Chen Weihang on Jan 20, 2022

fix gelu compile on CUDA 10 (#39045) · 0617a3ed
Committed by sneaxiy on Jan 20, 2022

19 Jan, 2022 · 6 commits

ipu python interface p1 (#38096) · 0837a2cc

Committed by jianghaicheng on Jan 19, 2022

* ipu_commit_tests p1

* resolve comments

* resolve comments

* resolve comments

* resolve comments

* resolve comments

* resolve comments

* resolve comments

* update lint and ipustrategy introduction

* update ipu_config

* update __init__ of static

* update doc

* update doc 2

* update doc 3

* update doc 4

* update doc 5

* update doc 5

* update doc 6

* update lint

* update lint 2

* update ipustrategy

* add IpuStrategy to all

* update ipustrategy

* update ipu_shard_guard

* update ipu_shard_guard 2

Co-authored-by: yaozhixin <522190855@qq.com>

Fix paddle.flops AttributeError (#38850) · ae1e71b3

Committed by yingyibiao on Jan 19, 2022

* Fix AttributeError when output y is a tuple which has no attribute 'shape'

* Add unit test for dynamic_flops with multiple outputs

* Add unit test for dynamic_flops with multiple outputs

[hybrid] Fix out of memory bug (#39009) · 01222f52
Committed by wuhuachaocoding on Jan 19, 2022

[fleet executor] Init fleet exe and prepare feed&fetch (#39032) · e43b6f65
Committed by Yuang Liu on Jan 19, 2022

convert paddle op definations into pd dialect in infrt (#38708) · f37a23a7
Committed by huzhiqiang on Jan 19, 2022

Add conv2d_transpose and conv2d_transpose_grad for XPU,test=kunlun (#38956) · c7de7440
Committed by zhangyikun02 on Jan 19, 2022

18 Jan, 2022 · 18 commits

- Mish FP32/BF16 kernel, conv and fc fuse passes (#38623) · 1d18bc2c
  Committed by Sławomir Siwek on Jan 18, 2022
```
* Mish

* Change exp() library

* mish fuse pass

* mish attrs

* fixes

* mishop maker

* remove attrs

* mish kernal for bf16

* fc+mish fuse

* fix code format error

* Resolve merge conflicts

* Update mish operator version

* update mish variable to new naming convention
```
- [fleet executor] add comm init for dist model inf (#39012) · 4c46eed0
  Committed by Yuang Liu on Jan 18, 2022
  
- add CI check from pten including fluid files (#39013) · 8b77f870
  Committed by chentianyu03 on Jan 18, 2022
  
- change CUDA implementaion of uniform/gaussian OP (#38611) · bbbd75e4
  Committed by zhouweiwei2014 on Jan 18, 2022
```
* change CUDA implementaion of uniform/gaussian OP

* fix unittest
```
- fix http gloo bug (#39017) · a998c077
  Committed by kuizhiqing on Jan 18, 2022
  
- [Unify Tensors PR #8] Merged Tensor into DenseTensor, test=allcases (#38914) · 2052f1e3
  Committed by Zhanlue Yang on Jan 18, 2022
```
* Merged LoDTensor with Tensor,test=allcases

* Patched python level LoDTensor

* Patched python level LoDTensor

* Merge Tensor into DenseTensor

* Fixed namespace issues,test=allcases

* Fixed merge issues

* Fixed inference issues

* Fixed NPU test issues

* Fixed merge issues
```
- add the uva function for the Tensor (#38950) · bfacd706
  Committed by wawltor on Jan 18, 2022
```
* add the uva api for the tensor

* fix the compiler problem for the uva

* fix the example for the _uva

* fix the compile problem in the pten library

* update the enviroment support for the uva

* use the make_shared replace the shared_ptr
```
- fix lookup_table_v2 error in kunlun2 (#38855) · df898f8b
  Committed by taixiurong on Jan 18, 2022
  
- Unify the functor of elementwise and logical ops. (#35767) · b1365d25
  Committed by Yiqun Liu on Jan 18, 2022
  
- fix trt convert conv2d skip (#38999) · dfa242e4
  Committed by JingZhuangzhuang on Jan 18, 2022
```
* fix trt convert conv2d skip

* fix trt convert conv2d skip
```
- modify transpose params check (#39006) · 27f8460a
  Committed by wenbin on Jan 18, 2022
```
* modify params check

* correct compile
```
- Fixed python-level LoDTensor patch (#38996) · a17e51dd
  Committed by Zhanlue Yang on Jan 18, 2022
  
- Fix pad api docs (#38988) · 5406e6f8
  Committed by duanboqiang on Jan 18, 2022
  
- fix build bug (#38997) · 5eab7dab
  Committed by Shang Zhizhou on Jan 18, 2022
```
* fix build bug

* fix code style
```
- break the circular dependency between reduce and elementwise (#38951) · a1980d9c
  Committed by YuanRisheng on Jan 18, 2022
  
- [AutoParallel] Recompute Pass (#38920) · 30845734
  Committed by zhaoyingli on Jan 18, 2022
```
* [AutoParallel] Recompute Pass

* update unittest

* reshard for amp

* add comment
```
- [GPUPS]Fix ps_gpu_wrapper (#38993) · 4aa91fd6
  Committed by zmxdream on Jan 18, 2022
```
* update

* fix ps_gpu_wrapper. test=develop

* fix ps_gpu_wrapper. test=develop
```
- Speedup FP16 Gelu op using fast math and vectorized 8 kernel (#38980) · 8c20d668
  Committed by sneaxiy on Jan 18, 2022
```
* speedup gelu using fast math

* add bwd part
```
17 Jan, 2022 · 6 commits

disable unsupported trt dimension (#38962) · 55e9087f
Committed by wenbin on Jan 17, 2022
```
* develop test

* throw

* ne

* wrong cnt
```
fix for conv2D training error (#38938) · 944ea436
Committed by jakpiase on Jan 17, 2022

update ipu_executor, remove ipu_optimizer (#38986) · 05c98ec7

Committed by Allen Guo on Jan 17, 2022

Co-authored-by: Xiaobing Wang <xiaobingw@graphcore.ai>
Co-authored-by: Allen Guo <alleng@graphcore.ai>
Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai>
Co-authored-by: Haicheng Jiang <haichengj@graphcore.ai>
Co-authored-by: Han Zhao <hanzhao@graphcore.ai>
Co-authored-by: Xiaobing Wang <xiaobingw@graphcore.ai>
Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai>
Co-authored-by: Haicheng Jiang <haichengj@graphcore.ai>
Co-authored-by: Han Zhao <hanzhao@graphcore.ai>

[IPU] update ipu_backend p0 (#38854) · b2aee3e3

Committed by Allen Guo on Jan 17, 2022

* update ipu_backend

* sync with paddle internal

Co-authored-by: Xiaobing Wang <xiaobingw@graphcore.ai>
Co-authored-by: Allen Guo <alleng@graphcore.ai>
Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai>
Co-authored-by: Haicheng Jiang <haichengj@graphcore.ai>
Co-authored-by: Han Zhao <hanzhao@graphcore.ai>

* apply comments 01

* update error messag

* restore ipu_executor and ipu_optimizer

* add clang-format on

Co-authored-by: Xiaobing Wang <xiaobingw@graphcore.ai>
Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai>
Co-authored-by: Haicheng Jiang <haichengj@graphcore.ai>
Co-authored-by: Han Zhao <hanzhao@graphcore.ai>

expose input variables that only shape needed in each subgraph that compiled by CINN (#38367) · b4cb3589

Committed by CtfGo on Jan 17, 2022

collecting input variables that only shape needed of each subgraph that compiled by CINN in build_cinn_pass, and expose them to memory optimization of framework passes by declaring DECLARE_INPLACE_OP_INFERER in cinn_launch op.

remove MakePtenDenseTensor in op compute (#38910) · 04f042a5
Committed by zyfncg on Jan 17, 2022

PaddlePaddle / Paddle · last synced successfully more than 1 year ago
