Commits · f4da2d4dced0cce8eaa8e173a281463ee8faa1c3 · PaddlePaddle / Paddle

Aug 17, 2022 · 3 commits
- [phi] Transfer fluid bicubic_interp_v2 to phi bicubic_interp (add yaml) (#45151) · f4da2d4d
  Authored by HongyuJia on Aug 17, 2022
```
* transfer bicubic_interp op to phi, change name from bicubic_interp_v2 to bicubic_interp

* test final_state_bicubic_interp api

* testcase match imperative case
```
- Fix squared_l2_norm wrong stream bug (#45174) · 951010a2
  Authored by sneaxiy on Aug 17, 2022
```
* fix squared_l2_norm bug

* update buffer.h
```
- Optimize performance of amp (#45188) · 5e1a20bf
  Authored by Zhang Zheng on Aug 17, 2022
  
Aug 16, 2022 · 22 commits

[Phi] Move amp ops into phi (#45079) · b4f67757

Authored by Chen Weihang on Aug 16, 2022

* move check finite and unscale kernel into phi

* move infershape into phi

* move update_loss_scaling kernel into phi

* remove original kernels

* move update loss scaling infershape into phi

* add header for xpu and npu

* solve coverage failed

* fix npu test failed

* remove mutable data in cu file

* fix new executor failed

* add valid check for meta tensor output
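The two kernels named in this commit (check-finite-and-unscale and update-loss-scaling) correspond to the two steps of dynamic loss scaling in mixed-precision training. The following is a minimal pure-Python sketch of that idea under stated assumptions, not Paddle's kernel code: the function and class names and the small default values are illustrative, the parameter names `incr_every_n_steps`, `decr_every_n_nan_or_inf`, `incr_ratio`, and `decr_ratio` mirror those of `paddle.amp.GradScaler`, and any extra guards the real kernels apply are omitted.

```python
import math
from dataclasses import dataclass


def check_finite_and_unscale(grads, scale):
    """Divide every gradient by `scale` and report whether any is inf/nan."""
    found_inf = any(not math.isfinite(g) for g in grads)
    unscaled = [g / scale for g in grads]
    return unscaled, found_inf


@dataclass
class LossScaleState:
    scale: float = 1024.0  # illustrative starting value (assumption)
    good_steps: int = 0    # consecutive steps with finite gradients
    bad_steps: int = 0     # consecutive steps with inf/nan gradients


def update_loss_scaling(state, found_inf, incr_every_n_steps=2,
                        decr_every_n_nan_or_inf=1,
                        incr_ratio=2.0, decr_ratio=0.5):
    """Shrink the scale after overflows, grow it after a run of clean steps."""
    if found_inf:
        state.good_steps = 0
        state.bad_steps += 1
        if state.bad_steps >= decr_every_n_nan_or_inf:
            state.scale *= decr_ratio
            state.bad_steps = 0
    else:
        state.bad_steps = 0
        state.good_steps += 1
        if state.good_steps >= incr_every_n_steps:
            state.scale *= incr_ratio
            state.good_steps = 0
    return state


# Usage: two clean steps double the scale, one overflow halves it again.
state = LossScaleState()
for grads in ([2048.0, 1024.0], [512.0, 256.0], [float("inf"), 1.0]):
    _, found_inf = check_finite_and_unscale(grads, state.scale)
    update_loss_scaling(state, found_inf)
```

In a real training loop the optimizer step would be skipped whenever `found_inf` is true; the sketch only tracks the scale.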

[geometric]Add paddle.geometric.send_uv API (#44848) · 88724a53

Authored by Siming Dai on Aug 16, 2022

* initial commit

* fix op maker bug

* fix mul grad bug

* add unittest

* fix add grad bug, add cpu kernel

* add paddle.geometric.message_passing

* add paddle.geometric.send_uv api, add unittest

* add fp16 judgement

* fix file typo, move compute_type to message_op

* add impl file

* fix unittest timeout time

* add review revise
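As documented for `paddle.geometric.send_uv`, the API gathers rows of `x` by `src_index` and rows of `y` by `dst_index`, then combines each pair with `message_op` to produce one message per edge. A minimal pure-Python sketch of that semantics follows. `send_uv_reference` is a hypothetical helper written for illustration, not Paddle's implementation, and it works on nested lists rather than tensors.

```python
import operator

# Element-wise operations selectable through `message_op`.
_MESSAGE_OPS = {
    "add": operator.add,
    "sub": operator.sub,
    "mul": operator.mul,
    "div": operator.truediv,
}


def send_uv_reference(x, y, src_index, dst_index, message_op="add"):
    """Return one message per edge: op(x[src], y[dst]), element-wise."""
    if len(src_index) != len(dst_index):
        raise ValueError("src_index and dst_index must have the same length")
    op = _MESSAGE_OPS[message_op]
    return [
        [op(a, b) for a, b in zip(x[s], y[d])]
        for s, d in zip(src_index, dst_index)
    ]


# Usage: three nodes with three features each, four edges (src -> dst).
x = [[0, 2, 3], [1, 4, 5], [2, 6, 7]]
y = [[0, 1, 2], [2, 3, 4], [4, 5, 6]]
src_index = [0, 1, 2, 0]
dst_index = [1, 2, 1, 0]
out = send_uv_reference(x, y, src_index, dst_index, message_op="add")
# out == [[2, 5, 7], [5, 9, 11], [4, 9, 11], [0, 3, 5]]
```

The output has one row per edge rather than one row per node, which is what distinguishes this operation from node-level aggregation.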

[Auto Paralle]Add reshard cost and update estimator (#45118) · 6a15d407

Authored by caozhou on Aug 16, 2022

* update reshard cost and cost estimator

* add unittest

* add dropout cost

* fix import error

* fix reshard code style error

* improve unittest coverage

[Eager] Forword only add dygraph func (#45153) · 933db9d4

Authored by Weilong Wu on Aug 16, 2022

* [Eager draft] forward_only interface migrate to autograd_api

* strings api add dygraph forward function

* rm useless comments

* draft version for check CI

* fix ci

* forward-only no need compute_require_grad and pass stop_gradient, rm useless comments

* polish yaml and using CPUPlace = phi::CPUPlace

* rm useless comments

* polish yaml and update some test case

* rm useless funcs

* polish eager_gen code

* polish code

convert multihead to oss (#45019) · f706d95d

Authored by feng_shuai on Aug 16, 2022

* convert multihead to oss

* fix:bug

* fix:delete const cast

* fix:don't support bias_qk

* add vit pass

* fix:convert bug and add preln_residual_bias

* support length=-1

* add UT for convert

* add no_bias_qk support for gpu_multihead_op

* delete infer_shape depends on bias_qk

* oss just can be used in T4 and A*

* fix:change api for ROCM CI

support fp16 softmax on custom place (#45177) · a0bbfbd4
Authored by Aganlengzi on Aug 16, 2022

Fix problem that the shape of tensor is not inited correctly when backward in static graph (#45030) · e26f80ad
Authored by feifei-111 on Aug 16, 2022
```
* fix_shape

* code style

* fix assert

* fix to_tensor badreturn
```

Enable Profiler to add nvtx tag. (#45162) · ff2f1373
Authored by Yiqun Liu on Aug 16, 2022

fix the document for the add api (#45101) · d85cf4d4

Authored by wawltor on Aug 16, 2022

* fix the api for the add

* update the document for the api add

* update add docs; test=document_fix
Co-authored-by: Ligoml <39876205+Ligoml@users.noreply.github.com>

[CodeStyle] Add CI for self.assertTrue(np.allclose(...)) (#45126) · 87ff40b7

Authored by Yulv-git on Aug 16, 2022

* Add CI for assert-allclose.

* Update CI script.

* Update check_approval.

* Specify the destination path for the git diff.

* Add test samples.

* Add CI for assert-allclose with \n.

* Update test samples.

* Update ALL_ADDED_LINES_IN_TARGET_PATH.

* update GitHub username to userid, test=document_fix

* add rfc as a specification, test=document_fix

* try to integrate single and multiple rows together, test=document_fix

* remove duplicate dirs, test=document_fix

* add anchor `#background`, test=document_fix

* remove original scripts, test=document_fix

* remove test files, test=document_fix
Co-authored-by: SigureMo <sigure.qaq@gmail.com>

[Fleet] Reconstruct of Fleet API in Dygraph Mode (#44922) · c17e6af8
Authored by Haohongxiang on Aug 16, 2022
```
* reconstruct_of_fleet_api

* update
```

transfer nearest_interp op to phi, change name from nearest_interp_v2 to nearest_interp (#45148) · 6452ab3b
Authored by HongyuJia on Aug 16, 2022

fix new quant (#45155) · 2fb65e44
Authored by Wangzheee on Aug 16, 2022

Use base visit in cpu kernel (#45062) · ab583173
Authored by zhangkaihuo on Aug 16, 2022

[Ops] Support more dtype for gather kernel (#45142) · 0b4268a6
Authored by Siming Dai on Aug 16, 2022

[XPU] add truncated_gaussian_random op. (#45152) · 5bcabf78
Authored by houj04 on Aug 16, 2022

[AutoParallel] Prune D2H memcpy for fp16 pass (#45159) · e2b924bf
Authored by JZ-LIANG on Aug 16, 2022
```
* prune d2h memcpy for fp16 pass
```

memoptim and fp16 mixed precision (#45132) · fa890092
Authored by Wilber on Aug 16, 2022

[autograd] add select_p, eq_p, pow_p primitive operator for new autograd (#44813) · b681c88c

Authored by Sing_chan on Aug 16, 2022

* add select_p

* fix bugs

* add custom test for select_p; modify select_p primrules

* modify according to xiaoxu's comment

* add eq_p, select_p, pow_p, use autograd to test grad of high order

* add requirement of autograd, modify expected type of eq

* modify according to xiaoxu's comment

* import primops to use primops.pow

add strongly typed functions to set attributes to avoid unexpected type conversions. (#45107) · 307801d5
Authored by Feiyu Chan on Aug 16, 2022

support momentum op auto generation (#45163) · 642f6df9
Authored by Charles-hit on Aug 16, 2022

[Auto Parallel] Move the tests to a standalone folder (#45136) · 59241336
Authored by Yulong Ao on Aug 16, 2022

Aug 15, 2022 · 15 commits

support adamw generation (#45149) · 1353761a
Authored by Charles-hit on Aug 15, 2022

refactor fleet. (#44833) · 8636d2a2

Authored by wuhuachaocoding on Aug 15, 2022

* refactor fleet.

* refact fleet.py.

* update fleet/__init__.py.

* update fleet.py

* update code style.

* update fleet

* update fleet

* update fleet

* update fleet

* update model.py

* update fleet.

* update __init__.py

* update fleet.

* update fleet.

* update fleet

* update fleet

* update fleet

* update fleet.

* update optimizer.py

* update optimizer

* update fleet.py

* update scaler.py

* update setup.py.in

modify atol and rtol to solve unnittest failure (#45139) · f30c7bd6
Authored by RichardWooSJTU on Aug 15, 2022
```
Co-authored-by: minghaoBD <liminghao03@baidu.com>
```
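For context on the two tolerances this commit adjusts: NumPy-style closeness checks such as `np.allclose` accept a pair of values when `|actual - desired| <= atol + rtol * |desired|`. A minimal pure-Python sketch of that criterion follows. `is_close` is a hypothetical helper name, and whether the affected unit test uses exactly this check is an assumption.

```python
def is_close(actual, desired, rtol=1e-5, atol=1e-8):
    """NumPy-style closeness: absolute slack plus slack relative to `desired`."""
    return abs(actual - desired) <= atol + rtol * abs(desired)


# rtol dominates for values of ordinary magnitude.
strict = is_close(1.0001, 1.0)              # difference 1e-4 exceeds ~1e-5
relaxed = is_close(1.0001, 1.0, rtol=1e-3)  # difference 1e-4 within ~1e-3

# atol is what matters when the expected value is zero or near zero.
near_zero_strict = is_close(1e-7, 0.0)              # 1e-7 exceeds atol=1e-8
near_zero_relaxed = is_close(1e-7, 0.0, atol=1e-6)  # 1e-7 within atol=1e-6
```

Loosening `rtol` helps when results differ proportionally (for example across precisions), while loosening `atol` helps when expected values are close to zero.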

[phi] change op name linear_interp_v2 to linear_interp (#45128) · 6de3bdb3

Authored by HongyuJia on Aug 15, 2022

* change name linear_interp_v2 to linear_interp

* fix deprecated_op_names

* deprecated_op_names add linear_interp_grad

fused_embedding_eltwise_layernorm_op and skip_layernorm_op support fp16 (#44969) · ac0553a0
Authored by Yuanle Liu on Aug 15, 2022

Refine TRT unit test (#45102) · 3512bf11

Authored by zlsh80826 on Aug 15, 2022

* Reduce pool2d test configuration

* Reduce depthwise_conv2d test configuration

* Reduce trt_convert_conv2d_fusion test configuration

* Reduce trt_convert_conv2d test configuration

* Reduce trt_convert_conv2d_transpose test configuration

* Reduce trt_convert_hard_swish test configuration

* Enhance trt auto scan test error message and mechanism

* Increase FP16 trt ut tolerance

[Eager] fix sync batch norm to inplace (#45028) · c75b091b
Authored by wanghuancoder on Aug 15, 2022
```
* fix sync batch norm to inplace
```
Fix compile error of windows platform(atomicAdd in grid_sample_grad_kernel) (#45131) · 05f7d0c5
Authored by duanyanhui on Aug 15, 2022
```
* fix compile error
```

add mish and mish_grad for XPU, test=kunlun (#45098) · 6815c8ab
Authored by zhangyikun02 on Aug 15, 2022

[AutoParallel] add collate_fn for dist_loader (#45053) · 3649099f
Authored by zhaoyingli on Aug 15, 2022
```
* add collate_fn

* fix number of inputs
```
[jit] rm useless property pybind (#44962) · 8788513b
Authored by Hui Zhang on Aug 15, 2022
```
* rm useless pybind

* rm useless ut
```

[Auto Parallel] Move the distributed info from python to c++ (#44510) · a52357fe

Authored by Yulong Ao on Aug 15, 2022

* [Auto Parallel] Move the distributed info from python to c++

* [Auto Parallel] Add dist_attrs for VarDesc and OpDesc

* [Auto Parallel] Add the lost file

* [Auto Parallel] Make the dist attr be unique_ptr

* [Auto Parallel] Add the proto conversion

* [Auto Parallel] Improve the proto support

* [Auto Parallel] Fix the bugs for adding a device or a link

* [Auto Parallel] Add the C++ ProcessMesh and DistributedMapper

* [Auto Parallel] Improve the impl of these dist attrs

* [Auto Parallel] Pybind11 ProcessMesh and DeviceMesh

* [Auto Parallel] Fix the unittest problem

* [Auto Parallel] Explicitly add the src file for auto_parallel target

* [Auto Parallel] Add the proto depedency explicitly

* [Auto Parallel] Fix the cmake bug on windows and mac

* [Auto Parallel] Remove the pybind11 header file in process_mesh.h

* [Auto Parallel] Remove unused codes

* [Auto Parallel] Check whether the dist attr is null

* [Auto Parallel] Implement the assign operator for OpDesc explicitly

[XPU] add some collective ops. (#45049) · 7e2a20d5

Authored by houj04 on Aug 15, 2022

* [XPU] add some collective ops. test=kunlun

* use XPUOpTestWrapper. test=kunlun

* skip kl1 for collective ops. fix typo: deivce -> device. test=kunlun

Update FLAGS for standalone executor (#45127) · 566bbf0c
Authored by Ruibiao Chen on Aug 15, 2022
```
* Update FLAGS for standalone executor

* Update FLAGS_FORCE_USE_PROGRAM_CACHE
```
convert_fp16 support multi block (#45050) · 9aecf286
Authored by Wilber on Aug 15, 2022
```
* convert_fp16 support multi block

* update

* update
```

PaddlePaddle / Paddle · synced successfully about 1 year ago