Commits · 92125870edb510182d24a5b1af3ed4bad2489cec · PaddlePaddle / Paddle

Aug 18, 2022 · 5 commits

fix typo of pybind.cc (#45239) · 41294cb5
Committed by OccupyMars2025 on Aug 18, 2022

apply buffer_shared_inplace_pass and inplace_addto_op_pass pass to program in Standalone Executor (#45085) · d8d124b6

Committed by pangyoki on Aug 18, 2022

* apply inplace addto in python apply_pass

* fix

* apply inplace pass for program

* skip feed and fetch var

* fix block_desc.move_from

* fix block desc

* alltoall remove inplace

* fix

[OpAttr]Squeeze axes support Tensor (#45189) · c93451f4
Committed by Aurelius84 on Aug 18, 2022
```
* [OpAttr]Squeeze axes support Tensor

* add support_tensor

* fix unittest

* fix coverage
```

change to async mode for xpu multi-card training in static graph mode, test=kunlun (#45024) · 41bdf41d

Committed by zhangxiaoci on Aug 18, 2022

* change to async mode for xpu multi-card training in static graph mode

* minor bugfix

* irrelevant. move to another pr

* move change to other pr

* fix stream issue

* fix 'stream not meet with current context' error

* fix branch diverge, test=kunlun

fix infer tans scope (#45203) · 2d0bb2c3

Committed by JingZhuangzhuang on Aug 18, 2022

* fix infer tans scop

* fix infer trans scope

* fic infer trans scope

* fic infer trans scope

Co-authored-by: dingjiawei <327396238@qq.com>

Aug 17, 2022 · 10 commits
- refine eager_gen for amp (#45211) · e31a0a50
  Committed by zyfncg on Aug 17, 2022
- add dependency of phi_enforce (#45191) · aa96f70e
  Committed by Sing_chan on Aug 17, 2022
- [OpAttr]Add SupportTensor for OpMaker with whitelist mechanism (#45084) · 2594935a
  Committed by Aurelius84 on Aug 17, 2022
```
* [OpAttr]Add SupportTensor for OpMaker

* fix typo

* fix code style

* add SupportTensor for concat op

* add unittest for register Tensor

* add shape checker and split attribute
```
- fix multi stream error. (#45196) · a79d4a75
  Committed by Wilber on Aug 17, 2022
```
* fix multi stream error.
```
- Reuse addKernel to replace TensorAdd (#45161) · 0e3b49d4
  Committed by Leo Chen on Aug 17, 2022
```
* use addKernel

* fix compile

* remove elementwiseAddto

* add return

* fix custom place
```
- fix:op version (#45192) · d0cd0a11
  Committed by feng_shuai on Aug 17, 2022
- [Eager]fix_stop_gradient (#45154) · cccba68c
  Committed by wanghuancoder on Aug 17, 2022
```
* fix_stop_gradient
```
- [MLU] fix copy error (#45194) · 75690584
  Committed by fwenguang on Aug 17, 2022
- add instance norm op for xpu (#45097) · 216d25ac
  Committed by ykkk2333 on Aug 17, 2022
```
* xpu unittest grad compute supports more types, *test=kunlun

* add instance norm xpu, *test=kunlun
```
- Fix squared_l2_norm wrong stream bug (#45174) · 951010a2
  Committed by sneaxiy on Aug 17, 2022
```
* fix squared_l2_norm bug

* update buffer.h
```

Aug 16, 2022 · 11 commits

[Phi] Move amp ops into phi (#45079) · b4f67757

Committed by Chen Weihang on Aug 16, 2022

* move check finite and unscale kernel into phi

* move infershape into phi

* move update_loss_scaling kernel into phi

* remove original kernels

* move update loss scaling infershape into phi

* add header for xpu and npu

* solve coverage failed

* fix npu test failed

* remove mutable data in cu file

* fix new executor failed

* add valid check for meta tensor output

[Eager] Forword only add dygraph func (#45153) · 933db9d4

Committed by Weilong Wu on Aug 16, 2022

* [Eager draft] forward_only interface migrate to autograd_api

* strings api add dygraph forward function

* rm useless comments

* draft version for check CI

* fix ci

* forward-only no need compute_require_grad and pass stop_gradient, rm useless comments

* polish yaml and using CPUPlace = phi::CPUPlace

* rm useless comments

* polish yaml and update some test case

* rm useless funcs

* polish eager_gen code

* polish code

convert multihead to oss (#45019) · f706d95d

Committed by feng_shuai on Aug 16, 2022

* convert multihead to oss

* fix:bug

* fix:delete const cast

* fix:don't support bias_qk

* add vit pass

* fix:convert bug and add preln_residual_bias

* support length=-1

* add UT for convert

* add no_bias_qk support for gpu_multihead_op

* delete infer_shape depends on bias_qk

* oss just can be used in T4 and A*

* fix:change api for ROCM CI

support fp16 softmax on custom place (#45177) · a0bbfbd4
Committed by Aganlengzi on Aug 16, 2022

Fix problem that the shape of tensor is not inited correctly when backward in static graph (#45030) · e26f80ad
Committed by feifei-111 on Aug 16, 2022
```
* fix_shape

* code style

* fix assert

* fix to_tensor badreturn
```

fix new quant (#45155) · 2fb65e44
Committed by Wangzheee on Aug 16, 2022

[XPU] add truncated_gaussian_random op. (#45152) · 5bcabf78
Committed by houj04 on Aug 16, 2022

memoptim and fp16 mixed precision (#45132) · fa890092
Committed by Wilber on Aug 16, 2022

【autograd】add select_p、eq_p、pow_p primitive operator for new autograd (#44813) · b681c88c

Committed by Sing_chan on Aug 16, 2022

* add select_p

* fix bugs

* add custom test for select_p; modify select_p primrules

* modify according to xiaoxu's comment

* add eq_p, select_p, pow_p, use autograd to test grad of high order

* add requirement of autograd, modify expected type of eq

* modify according to xiaoxu's comment

* import primops to use primops.pow

add strongly typed functions to set attributes to avoid unexpected type conversions. (#45107) · 307801d5
Committed by Feiyu Chan on Aug 16, 2022

[Auto Parallel] Move the tests to a standalone folder (#45136) · 59241336
Committed by Yulong Ao on Aug 16, 2022

Aug 15, 2022 · 7 commits

fused_embedding_eltwise_layernorm_op and skip_layernorm_op support fp16 (#44969) · ac0553a0
Committed by Yuanle Liu on Aug 15, 2022

Refine TRT unit test (#45102) · 3512bf11

Committed by zlsh80826 on Aug 15, 2022

* Reduce pool2d test configuration

* Reduce depthwise_conv2d test configuration

* Reduce trt_convert_conv2d_fusion test configuration

* Reduce trt_convert_conv2d test configuration

* Reduce trt_convert_conv2d_transpose test configuration

* Reduce trt_convert_hard_swish test configuration

* Enhance trt auto scan test error message and mechanism

* Increase FP16 trt ut tolerance

add mish and mish_grad for XPU, test=kunlun (#45098) · 6815c8ab
Committed by zhangyikun02 on Aug 15, 2022

[jit] rm useless property pybind (#44962) · 8788513b
Committed by Hui Zhang on Aug 15, 2022
```
* rm useless pybind

* rm useless ut
```

[Auto Parallel] Move the distributed info from python to c++ (#44510) · a52357fe

Committed by Yulong Ao on Aug 15, 2022

* [Auto Parallel] Move the distributed info from python to c++

* [Auto Parallel] Add dist_attrs for VarDesc and OpDesc

* [Auto Parallel] Add the lost file

* [Auto Parallel] Make the dist attr be unique_ptr

* [Auto Parallel] Add the proto conversion

* [Auto Parallel] Improve the proto support

* [Auto Parallel] Fix the bugs for adding a device or a link

* [Auto Parallel] Add the C++ ProcessMesh and DistributedMapper

* [Auto Parallel] Improve the impl of these dist attrs

* [Auto Parallel] Pybind11 ProcessMesh and DeviceMesh

* [Auto Parallel] Fix the unittest problem

* [Auto Parallel] Explicitly add the src file for auto_parallel target

* [Auto Parallel] Add the proto depedency explicitly

* [Auto Parallel] Fix the cmake bug on windows and mac

* [Auto Parallel] Remove the pybind11 header file in process_mesh.h

* [Auto Parallel] Remove unused codes

* [Auto Parallel] Check whether the dist attr is null

* [Auto Parallel] Implement the assign operator for OpDesc explicitly

[XPU] add some collective ops. (#45049) · 7e2a20d5

Committed by houj04 on Aug 15, 2022

* [XPU] add some collective ops. test=kunlun

* use XPUOpTestWrapper. test=kunlun

* skip kl1 for collective ops. fix typo: deivce -> device. test=kunlun

convert_fp16 support multi block (#45050) · 9aecf286
Committed by Wilber on Aug 15, 2022
```
* convert_fp16 support multi block

* update

* update
```

Aug 14, 2022 · 1 commit
- Revert "[Paddle Inference] Support cuda_graph. (#44878)" (#45115) · b0e7681f
  Committed by xiaoxiaohehe001 on Aug 14, 2022
```
This reverts commit 84bf5c31.
```

Aug 13, 2022 · 2 commits

Refine program cache (#45005) · e96dae8b

Committed by Leo Chen on Aug 13, 2022

* add cached_serialize_str_

* support program hash

* add sha

* add ut

* use hash_str only for new_exe

* fix attr order

fl-ps: support split sparse params in local & remote (#44864) · 3f5c405f

Committed by ziyoujiyi on Aug 13, 2022

* back fl

* delete ssl cert

* .

* make warning

* .

* unittest paral degree

* solve unittest

* heter & multi cloud commm ready

* .

* .

* fl-ps v1.0

* .

* support N + N mode

* .

* .

* .

* .

* delete print

* .

* .

* .

* .

* fix bug

* .

* .

* fl-ps with coordinator ready

* merge dev

* update message parse only

* update fl client scheduler

* fix bug

* update multithreads sync

* fix ci errors

* update role_maker.py

* update role_maker.py

* fix ci error: windows py import error

* fix ci error: windows py import error

* fix windows ci pylib import error

* add dump fields & params

* try to fix windows import fleet error

* fix ps FLAGS error

* fix logging risk

* fix logging possible risk

* write trainer_desc file

* support split sparse params in local & remote

* fix import paddle.fluid.core.PSGPU

* fix import paddle.fluid.core.PSGPU

* add remote_sparse & local_sparse config

* fix unittest

* fix test_dist_fleet_geo table error

* fix PADDLE_ENFORCE error

* fix other's pr conflict

Aug 12, 2022 · 4 commits

fix nccl comm in sync_bn (#45100) · 1e965756
Committed by LiYuRio on Aug 12, 2022

Offload calculations from matmul op to fuse pass (#44941) · acb78ea2

Committed by Sławomir Siwek on Aug 12, 2022

* remove v2_transpose_reshape

* matmul_transpose_reshape

* reshape_transpose_matmul

* Add int8 support for matmulV2

* restore ut

* adjust old ut

* restore parallel UT ruels

* remove mkldnn code from base ops

* move enforces to pass

* remove duplicated functions

* delete duplicated enforces

* feedback from review

* add comments to variables

* enable eltwise support

* dynamic attribute

* remove fusepass tests from op test

* remove fuse pass cases from op test

* revert introduction of dynamic attributes

* style

Co-authored-by: wozna <joanna.wozna@intel.com>

[phi] Transfer linear_interp_v2 yaml to phi (#45072) · c737232f

Committed by HongyuJia on Aug 12, 2022

* support optional<vector<Tensor>> in yaml and eager

* delete useless comments in eager_gen.py

* fix api_base.py support optional<vector<TTensor>>

* python_c_gen.py support optional<vector<tensor>>

* transfer linear_interp_v2 yaml from fluid to phi

* fix op_test typo error

* change linear_interp_v2 testcase

* fix args in final_state_linear_interp_v2

* fix zeropad2d typo. test=document_fix

fix compilation (#45087) · 4eec94dd
Committed by Allen Guo on Aug 12, 2022

PaddlePaddle / Paddle synced successfully about 1 year ago
