Commits · 92faeedfa54ef9e795b10245df202cb64824b8ee · Crayon鑫 / Paddle

31 Mar, 2022 1 commit
  
  Pg heter cloud (#40911) · 92faeedf
  Authored by lilong12 on Mar 31, 2022
  
  92faeedf
30 Mar, 2022 25 commits

Fix test_jit_save_load (#41114) · 4b61918d
Authored by 0x45f on Mar 30, 2022

4b61918d

[Phi] Move Rnn Op from fluid to phi (#41007) · 66cf8b08

Authored by zyfncg on Mar 30, 2022

* move rnn kernel to phi

* move infershape of rnn to phi

* fix HIP bug

* rename function

* fix HIP bug

* fix hip bug

66cf8b08

[MoE] Moe apis (#41092) · aac7879a

Authored by Roc on Mar 30, 2022

* add random routing op

add _random_routing api in utils

add random routing ut

* # This is a combination of 10 commits.
# The first commit's message is:
add expert count op

add ut for expert_count

# This is the 2nd commit message:

update UT only for cuda

# This is the 3rd commit message:

fix for rocm

# This is the 4th commit message:

update ut

# This is the 5th commit message:

add moe module

# This is the 6th commit message:

add expert count op

add ut for expert_count

# This is the 7th commit message:

update UT only for cuda

# This is the 8th commit message:

update ut

# This is the 9th commit message:

add moe module

# This is the 10th commit message:

make expert count private

* add assign pos op

* fix upper num name

* add api _assign pos

* add ut for assign pos op

* update date

* add op about moe gate

update utils

add limit by capacity op

add ut for limit_by_capacity

add ut for prune_gate_by_capacity

add ut for limit_by_capacity

add ut for prune_gate_by_capacity

* fix for win

* fix bugs in test_limit_by_capacity_op

* update ut

* update for test (timeout)

* fix ut

* update

* update(fix) ut for win

* moe apis in incubate

* # This is a combination of 10 commits.
# The first commit's message is:
add expert count op

add ut for expert_count

# This is the 2nd commit message:

update UT only for cuda

# This is the 3rd commit message:

fix for rocm

# This is the 4th commit message:

update ut

# This is the 5th commit message:

add moe module

# This is the 6th commit message:

add expert count op

add ut for expert_count

# This is the 7th commit message:

update UT only for cuda

# This is the 8th commit message:

update ut

# This is the 9th commit message:

add moe module

# This is the 10th commit message:

make expert count private

* add assign pos op

* fix upper num name

* add api _assign pos

* add ut for assign pos op

* update date

* fix for win

* update for test (timeout)

* fix ut

* update

* fix ut for number count

* add apis and utils

* add gate apis

* add moe and grad clip apis

* update moe apis

* add ops for moe gate

* fix

* update for base moe layer api

* add random routing op

add _random_routing api in utils

add random routing ut

* fix for dygraph

* update with random routing

* update

* fix ut for limit by capacity

* update

* update limit by capacity for easily to switch to single thread mode

* update api docs
Co-authored-by: hlygit66666 <2570058140@qq.com>

aac7879a

Revert "Revert "[Phi] Move elementwise_floordiv and elementwise_pow to phi... · eef46770

Authored by Chen Weihang on Mar 30, 2022

Revert "Revert "[Phi] Move elementwise_floordiv and elementwise_pow to phi (#40993)" (#41065)" (#41110)

This reverts commit 3a6f1135.

eef46770

Add new APIs for GPU memory monitoring (max_memory_allocated,... · afe02e9d

Authored by From00 on Mar 30, 2022

Add new APIs for GPU memory monitoring (max_memory_allocated, max_memory_reserved, memory_allocated, memory_reserved) (#38657)

* Add new API memory_reserved

* Add memory_allocated, max_memory_reserved and max_memory_allocated

* Fix CI error

* Fix CI error

* Enhance UT

* Add FLAGS_memory_stats_opt

* Add STATS macro functions

* Add StatAllocator

* Fix CI errors

* Add UT

* Fix CI errors

afe02e9d

Revert "Revert "[Phi] trans logsumexp op (#40790)" (#41068)" (#41109) · ee8eeb45
Authored by Chen Weihang on Mar 30, 2022
```
This reverts commit 054fc997.
```
ee8eeb45
Revert "Revert "Move some activation to phi (#40727)" (#41056)" (#41095) · 91bb52cd
Authored by hong on Mar 30, 2022
```
This reverts commit 05f3d48e.
```
91bb52cd

[DoubleGrad PR #3] Supported higher-order GradNode generation (#41051) · abd2df4c

Authored by Zhanlue Yang on Mar 30, 2022

* [Refactor] refactored eager_gen.py PR #2

* [DoubleGrad PR #1] Decoupled code generation logics for Dygraph ForwardFunctions and GradNodes

* Fixed minor issue

* Adjusted logics of GenerateNodeCreationCodes and GenerateForwardDefinition

* Fixed issues

* Supported higher-order grad node generation

* [DoubleGrad PR #4] Supported higher-order GradNode generation

* Fixed yaml typo

abd2df4c

add _reset_grad_inplace_version (#41101) · cb8afc24
Authored by pangyoki on Mar 30, 2022

cb8afc24
[Yaml] Fix topk yaml compilation problem on Windows (#41082) · 95265d5c
Authored by Aurelius84 on Mar 30, 2022
```
* [Yaml] Fix topk yaml compilation on Windows

* fix make_shared

* fix conflict
```
95265d5c

add bilinear interpolate v2 to xpu list and unittest, *test=kunlun (#41037) · 4e86dff2

Authored by ykkk2333 on Mar 30, 2022

* add bilinear interpolate v2 to xpu list and unittest, *test=kunlun

* Delete ps_usr_print_log

* Delete ps_usr_print_log

* Delete xpu_op_test

4e86dff2

Apply TransposeFolding & GemmRewriter passes. (#41084) · c761b48b
Authored by Zhen Wang on Mar 29, 2022

c761b48b

[Eager] dlpack (#40811) · 4d300224

Authored by wanghuancoder on Mar 30, 2022

* dlpack eager, test=develop

* eager test_base_layer, test=develop

* fix error report, test=develop

* eager _getitem_from_offset, test=develop

* refine, test=develop

* refine offset, test=develop

* add test_inner test_outer, test=develop

* refine, test=develop

* refine, test=develop

4d300224

move elementwise_mul selected rows input (#41042) · 13f1641d
Authored by YuanRisheng on Mar 30, 2022

13f1641d

Optimize the perf of top_k when k is too large (#40941) · 45078d9f

Authored by Zhang Zheng on Mar 30, 2022

* Optimize the perf of top_k when k is too large

* fix rocm compile

* fix

* only compile in cuda

* fix log info

45078d9f

swish and pow op for xpu test=kunlun (#40654) · d951f3af

Authored by houj04 on Mar 30, 2022

* swish and pow op for xpu. test=kunlun

* fix code style. test=kunlun.

* use pow_grad xdnn api. test=kunlun.

d951f3af

Optimize the onnxruntime code (#41044) · f12b5260
Authored by heliqi on Mar 30, 2022

f12b5260

support inplace in tensor_method_setitem (#40915) · 7170c687

Authored by pangyoki on Mar 30, 2022

* support inplace in tensor_method_setitem

* delete bump_inplace_version

* optimize inplace unittest

* fix

* fix setitem bug

* update eager_generator

* optimize inplace unittest

* little change

7170c687

Refactor code auto-gene for no_need_buffer (#41025) · 97cd0f51
Authored by zyfncg on Mar 30, 2022
```
* refactor code auto-gene for no_need_buffer

* fix some bug

* delete test code
```
97cd0f51
[Phi]fix pad3d infermeta bug (#41020) · 9219495c
Authored by chentianyu03 on Mar 30, 2022
```
* fix pad3d infermeta bug

* add check for construct ScalarArray
```
9219495c
change to new api in ssync mode (#41022) · 2089b485
Authored by yaoxuefeng on Mar 30, 2022
```
* change to new api in ssync mode

* fix

* fix

* fix

* fix
```
2089b485

support view strategy in dygraph eager_final state (#40891) · 495ca4aa

Authored by pangyoki on Mar 30, 2022

* support view strategy in eager_final state

* perfect reshape kernel

* fix bugs of sig

* add unittest for reshape_sig

* fix bugs when run coverage

* fix inplace bug in final_state eager_gen

* fix python_c_gen

* support view strategy for final state

* fix order of out and xshape in reshape

* fix Coverage_CI unittest timeout error

* support reshape view

* fix reshape_sig

* fix yml and api_base
Co-authored-by: YuanRisheng <yuanrisheng@baidu.com>

495ca4aa

fix double grad var judging (#41072) · 775ddb5a
Authored by Chen Weihang on Mar 30, 2022

775ddb5a
fix bug that some op has no op_role attr (#41040) · 040d3386
Authored by Leo Chen on Mar 30, 2022

040d3386

[Eager] Pylayer (#39989) · 157c1a28

Authored by wanghuancoder on Mar 30, 2022

* Supported Complex2Real Conversion for Eager Dygraph

* Supported Complex2Real Conversion for Eager Dygraph

* Enabled complex type promotion test for matmul_v2

* pylayer, test=develop

* Fix CI issues

* Support initializing specific grad tensors to zero for selected operators

* finish forward, test=develop

* create grad node finish, test=develop

* Merged adj_edges_ with GradSlotMeta

* Fixed minor issue

* backward finish, start dbg, test=develop

* Adjusted num runs

* Recovered Eager performance tests configurations

* Recovered Eager performance tests configurations

* finish, test=develop

* polish, test=develop

* polish, test=develop

* refine, test=develop

* eager, test=develop

* Adjusted performance tests configurations

* Fixed Minor Issues with performance tests

* [Phi] Fix macro name typo

* support set_materialize_grads, test=develop

* support mark_non_differentiable, test=develop

* support once_differentiable, test=develop

* refine, test=develop

* refine, test=develop

* Moved out Edge from GradSlotMeta

* Fixed issues from merge

* Fixed typo

* Addressed review comments

* Fixed merge issues

* Fixed minor issues

* Fixed minor issue

* refine, test=develop

* refine, test=develop

* refine, test=develop

* Fixed major issues and enabled auto_prune test cases

* Fixed issues from merge

* refine, test=develop

* refine, test=develop

* refine, test=develop

* refine, test=develop

* refine, test=develop
Co-authored-by: jim19930609 <jim19930609@gmail.com>
Co-authored-by: Aurelius84 <zhangliujie@baidu.com>

157c1a28

29 Mar, 2022 14 commits

[MoE] Moe apis (#40895) · aeade538

Authored by Roc on Mar 29, 2022

* add random routing op

add _random_routing api in utils

add random routing ut

* # This is a combination of 10 commits.
# The first commit's message is:
add expert count op

add ut for expert_count

# This is the 2nd commit message:

update UT only for cuda

# This is the 3rd commit message:

fix for rocm

# This is the 4th commit message:

update ut

# This is the 5th commit message:

add moe module

# This is the 6th commit message:

add expert count op

add ut for expert_count

# This is the 7th commit message:

update UT only for cuda

# This is the 8th commit message:

update ut

# This is the 9th commit message:

add moe module

# This is the 10th commit message:

make expert count private

* add assign pos op

* fix upper num name

* add api _assign pos

* add ut for assign pos op

* update date

* add op about moe gate

update utils

add limit by capacity op

add ut for limit_by_capacity

add ut for prune_gate_by_capacity

add ut for limit_by_capacity

add ut for prune_gate_by_capacity

* fix for win

* fix bugs in test_limit_by_capacity_op

* update ut

* update for test (timeout)

* fix ut

* update

* update(fix) ut for win

* moe apis in incubate

* # This is a combination of 10 commits.
# The first commit's message is:
add expert count op

add ut for expert_count

# This is the 2nd commit message:

update UT only for cuda

# This is the 3rd commit message:

fix for rocm

# This is the 4th commit message:

update ut

# This is the 5th commit message:

add moe module

# This is the 6th commit message:

add expert count op

add ut for expert_count

# This is the 7th commit message:

update UT only for cuda

# This is the 8th commit message:

update ut

# This is the 9th commit message:

add moe module

# This is the 10th commit message:

make expert count private

* add assign pos op

* fix upper num name

* add api _assign pos

* add ut for assign pos op

* update date

* fix for win

* update for test (timeout)

* fix ut

* update

* fix ut for number count

* add apis and utils

* add gate apis

* add moe and grad clip apis

* update moe apis

* add ops for moe gate

* fix

* update for base moe layer api

* add random routing op

add _random_routing api in utils

add random routing ut

* fix for dygraph

* update with random routing

* update

* fix ut for limit by capacity

* update
Co-authored-by: hlygit66666 <2570058140@qq.com>

aeade538

add elementwise sub and elementwise div in tensorrt op teller (#40806) · f3022dfa
Authored by wangxinxin08 on Mar 29, 2022
```
* add elementwise sub and elementwise div in tensorrt op teller

* add unittest of elementwise mul, sub and div
```
f3022dfa
Revert "[Phi] trans logsumexp op (#40790)" (#41068) · 054fc997
Authored by tianshuo78520a on Mar 29, 2022
```
This reverts commit 9c0eaada.
```
054fc997
Revert "[Phi] Move elementwise_floordiv and elementwise_pow to phi (#40993)" (#41065) · 3a6f1135
Authored by tianshuo78520a on Mar 29, 2022
```
This reverts commit b532315d.
```
3a6f1135
refine AsyncWorkQueue (#40977) · 63471c83
Authored by liutiexing on Mar 29, 2022

63471c83
Fix test_reinforcement_learning.py for eager run_program OP (#41018) · 733d8168
Authored by 0x45f on Mar 29, 2022
```
* Fix test_reinforcement_learning.py for eager run_program OP

* Add comment
```
733d8168
Revert "Move some activation to phi (#40727)" (#41056) · 05f3d48e
Authored by tianshuo78520a on Mar 29, 2022
```
This reverts commit e77a947e.
```
05f3d48e

[Phi] trans logsumexp op (#40790) · 9c0eaada

Authored by 津 on Mar 29, 2022

* [Phi] trans logsumexp op

* fix bugs

* fix bugs

* fix bugs

* fix bugs

* fix bugs

* fix bugs

* add sig

* fix sig bugs

* fix sig bugs

* fix xpu bugs

* fix review bugs

* test=develop

9c0eaada

[Phi] Move elementwise_floordiv and elementwise_pow to phi (#40993) · b532315d
Authored by wuyefeilin on Mar 29, 2022
```
* mv floordiv to phi

* mv elementwise_pow to phi

* fix as review
```
b532315d
pool2d support fp16 on xpu and update pool2d unittest, test=kunlun (#40841) · 4d198acb
Authored by zhangyikun02 on Mar 29, 2022

4d198acb
[MLU]add reduce op mlu kernel (#41028) · d1c1d731
Authored by zn on Mar 29, 2022

d1c1d731
softmax_with_cross_entropy support fp16 on xpu, test=kunlun (#40869) · 649948a6
Authored by zhangyikun02 on Mar 29, 2022

649948a6

Use _C_ops.yolov3_loss in eager mode for test_yolov3.py (#40831) · 3b381aac

Authored by 0x45f on Mar 29, 2022

* Use _C_ops.yolov3_loss in eager mode for test_yolov3.py

* fix code for test_yolov3_loss_op

* remove useless import

* Fix dygraph_mode flag

3b381aac

Determine execution sequence of random OPs in new executor (#41012) · fe8acb67
Authored by From00 on Mar 29, 2022

fe8acb67
