Commits · 9714878cc76b6db1e1fdec2a81dabc4874f25ea6 · PaddlePaddle / Paddle

07 Apr, 2022 7 commits
- remove FLAGS_use_curand and change all random op CUDA implementation (#41308) · 9714878c
  Committed by zhouweiwei2014 on Apr 07, 2022
- [Phi]Add hard_swish/kron/linspace/logit yaml file (#41298) · 90cb337e
  Committed by YuanRisheng on Apr 07, 2022
```
* add yaml

* perfect coverage
```
- add send/recv to/from switch module for ProcessGroupHeter (#41285) · 633ac4e6
  Committed by lilong12 on Apr 07, 2022
- Add Output(Step) to DistributedFusedLamb optimizer (#41249) · e4459a40
  Committed by sneaxiy on Apr 07, 2022
```
* add Output(Step) to distributed fused lamb op

* add _set_step
```
- ignore some failed test for KL2 (#41342) · 81389c51
  Committed by QingshuChen on Apr 07, 2022
```
* ignore some failed test for KL2
*test=kunlun

* minor
*test=kunlun

* minor
*test=kunlun
```
- momentum support l2decay for xpu. test=kunlun (#41325) · 533c649f
  Committed by houj04 on Apr 07, 2022
```
* momentum support l2decay for xpu. test=kunlun

* fix include file. test=kunlun

* fix cmake for device_worker. test=kunlun
```
- fix bugs of reshape double grad infermeta (#41459) · 53409bcd
  Committed by YuanRisheng on Apr 07, 2022

06 Apr, 2022 2 commits

[Eager] Support test_layers's test cases switch to eager mode (#41216) · 5ae8babb

Committed by Weilong Wu on Apr 06, 2022

* [Eager] Support test_layers's test cases switch to eager mode

* Update batch_norm _C_ops action to fix CI

* Use None instead of new EmptyTensor

* Updated var name

* Make sure to switch eager mode, Fix Coverage_CI

* Remove _non_static_mode statement

* Remove batch_norm dispensable input statement

* Polish batch_norm code

* Fix CI issue

Add conv yaml (#41354) · 7ed7c6c7

Committed by hong on Apr 06, 2022

* update

* add conv yaml

* add backward

* remove useless code

* fix bug

* fix bug

* revert fluid dygraph conv2d

* remove useless infermeta function

* fix meta fn duplicate error

* conv using custom impl

* remove amp include

* fix bug

* use cudnn = true

* fix test mkldnn caching bug

05 Apr, 2022 4 commits

Table refine: remove unused table/accessor (#41400) · a288fcab

Committed by zhaocaibei123 on Apr 05, 2022

* update name

* update name

* fix test

* fix fleet bind

* update name

* update name

* fix test

* fix gpups wrapper

* remove Push/Pull/Load/Save with context in client and wrapper base class

* fix

* fix

* remove some interface

* fix

* remove

* code style

* recover

* fix

* remove code unused

* remove some unused table & accessor & CommonDenseTable => MemoryDenseTable

* fix

* fix

* fix

* recover

* remove unused code

Co-authored-by: esythan <esythan@126.com>

fix linspace (#41404) · 84e8ae77

Committed by Leo Chen on Apr 05, 2022

add new format of quantization (#41041) · b72a7ebb

Committed by Guanghua Yu on Apr 05, 2022

Add nms op and batched_nms api (#40962) · 7554f428

Committed by RichardWooSJTU on Apr 05, 2022
```
* add nms op and batched_nms api
```

04 Apr, 2022 2 commits
- Add batch norm yaml (#41386) · 77cf305f
  Committed by hong on Apr 04, 2022
```
* update

* fix bug
```
- Add dropout yaml (#41355) · 1c7001e7
  Committed by hong on Apr 04, 2022
```
* add dropout slice yaml

* remove useless code

* fix infer shape error

* skip infrt compile for dropout
```
03 Apr, 2022 3 commits

[Phi]Concat grad (#41112) · 3f57ef7a

Committed by chentianyu03 on Apr 03, 2022

* add concat_grad kernel

* fix error

* remove comment code

* fix outs nullptr error

* change to phi header

* add concat_grad declare for standalone_executor_test

Add infer meta (#41054) · 868a3203

Committed by hong on Apr 03, 2022

* add some infer meta

* fix bug

* fix bugs;

* fix bug and add set data type

* revert infer shape of lookup table

* recover test

Add randperm and range yaml (#41265) · fd1ecfc5

Committed by zyfncg on Apr 03, 2022
```
* add randperm and range yaml

* add eager test for randperm
```

02 Apr, 2022 5 commits

Add graph apis (#40809) · b0398c8e

Committed by Siming Dai on Apr 02, 2022

* Add graph_reindex API

* add graph_sample_neighbors api

* Add buffer

* delete VLOG

* delete thrust::copy for output

* add ShareDataWith

* delete graph_reindex hashtable output

* add graph_reindex dispensable

* add reindex unittest, move memset to cuda kernel, change api

* fix conflict

* add reindex buffer for gpu version note

* fix conflicts for op_func_generator

* Add fisher_yates sampling, add dispensable, change infermeta

* add dtype for edge_id

* fix rocm ci and static check ci

* add unittest

* fix unittest

* fix unittest

* fix bug

do not use scope in op kernel (#41316) · 0f6412c0

Committed by Leo Chen on Apr 02, 2022

[Paddle inference] support new quant_model (#41049) · 1b58ce14

Committed by Wangzheee on Apr 02, 2022
```
* paddle inference support new quant_model
```

[phi] Move clip op to phi (#40602) · c0658045

Committed by wuyefeilin on Apr 02, 2022

* move clip op to phi

* fix as review

* update hierarchical_sigmoid_kernel.cc

* update selected_rows

* update clip_kernel.cu

* fix as review

xpu add dropout&cast unittest (#41120) · acec26a1

Committed by taixiurong on Apr 02, 2022

01 Apr, 2022 6 commits

Add nll_loss yaml (#41126) · 8e032db8

Committed by zyfncg on Apr 01, 2022

* add nll_loss yaml

* fix nll loss

* fix nll loss bug

* fix bug

* fix bug

* fix infrt problem

Co-authored-by: xiongkun <xiongkun03@baidu.com>

[Eager] Support pinned (#41035) · f3270fc8

Committed by wanghuancoder on Apr 01, 2022

* support pinned, test=develop

* support async_write, test=develop

* refine, test=develop

* refine, test=develop

* refine, test=develop

* refine,test=develop

* refine, test=develop

* refine, test=develop

* refine, test=develop

* refine, test=develop

[Phi] Move softmax with cross entropy kernel into phi (#40832) · e6ec98fe

Committed by Chen Weihang on Apr 01, 2022

* add cross_entropy_with_softmax phi kernel

* remove softmax_with_cross_entropy kernel

* add softmax_with_cross_entropy grad kernel

* remove original op kernel

* refine cross entropy impl

* fix pointer error

* revert kernel cu change

* fix xpu failed

* fix cinn failed

* fix npu failed

* add forward sig

* add check_nan_inf for pt kernel

* remove repeat cmake item

* fix unittest error

[Phi]Interpolate kernels into phi (#40855) · d65a7a46

Committed by chentianyu03 on Apr 01, 2022

* add interpolate cpu kernel

* fix nullptr bug

* add interpolate gpu kernel

* fix unit test error

* remove raw kernels

* add cuda kernel impl

* add infermeta

* recover accidentally deleted kernels in interpolate op

* fix grad x_grad name error

* remove interpolate_v2_op.h

* rm unused codes

* fix xpu build error

* fix build error

* fix namespace error

* add register header for npu

* fix infermeta error

* modify by review

* add the missing args in test_trt_convert_nearest_interp_v2

support multi_layer of bilstm,*test=kunlun (#41151) · 00d23897

Committed by z8hanghuan on Apr 01, 2022

* support multi_layer of bilstm,*test=kunlun

* support multi_layer of bilstm, *test=kunlun

* support multi_layer of bilstm, *test=kunlun

* support multi_layer of bilstm, *test=kunlun

[Phi] Add shape and strided_slice yaml & Adapt eager mode (#41131) · 9b6a02d4

Committed by Chen Weihang on Apr 01, 2022

* add several yaml

* polish strided slice kernel & add yaml

* reorder yaml

* add several yaml

* revert yaml config change

* resolve conflict

* Update test_strided_slice_op.py

31 Mar, 2022 9 commits
- heter & multi-cloud brpc communication (#40965) · 2f41f389
  Committed by ziyoujiyi on Mar 31, 2022
```
* back fl

* delete ssl cert

* .

* make warning

* .

* unittest paral degree

* solve unittest

* heter & multi cloud comm ready

* .

* .
```
- fix conflict (#40851) · 74894cd7
  Committed by csy0225 on Mar 31, 2022
- [Phi] Rename ScalarArray to IntArray (#40975) · e559fe41
  Committed by zyfncg on Mar 31, 2022
```
* rename scalar_array to int_array

* update cmake

* fix conflict

* remove useless log
```
- [Yaml] Migrate sqrt/square/reciprocal yaml (#41164) · 2d69abd2
  Committed by Aurelius84 on Mar 31, 2022
```
* [Yaml] Migrate sqrt/square/reciprocal yaml

* clean file

* fix unittest error
```
- [new-exec] fit mkldnn op (#41058) · 02cf6764
  Committed by Leo Chen on Mar 31, 2022
```
* fix bug that some op has no op_role attr

* add mkldnn support for new executor

* fit for mkldnn data_transfer

* fit for mkldnn data_transfer
```
- [phi] move yolov3_loss to phi (#40944) · fb93bd5c
  Committed by wuyefeilin on Mar 31, 2022
```
* mv yolov3_loss op to phi

* fix as review

* update operator.h
```
- add depend when doing fuse_all_optimizer on program (#41178) · 3b00dc92
  Committed by Leo Chen on Mar 31, 2022
```
* fix dependency of fused optimizer

* add ut
```
- Restrict compilation conditions of optimized topk kernel (#41153) · dea24544
  Committed by Zhang Zheng on Mar 31, 2022
```
* Restrict compilation conditions of optimized topk kernel

* fix
```
- Pg heter cloud (#40911) · 92faeedf
  Committed by lilong12 on Mar 31, 2022

30 Mar, 2022 2 commits

[Phi] Move Rnn Op from fluid to phi (#41007) · 66cf8b08

Committed by zyfncg on Mar 30, 2022

* move rnn kernel to phi

* move infershape of rnn to phi

* fix HIP bug

* rename function

* fix HIP bug

* fix hip bug

[MoE] Moe apis (#41092) · aac7879a

Committed by Roc on Mar 30, 2022

* add random routing op

add _random_routing api in utils

add random routing ut

* # This is a combination of 10 commits.
# The first commit's message is:
add expert count op

add ut for expert_count

# This is the 2nd commit message:

update UT only for cuda

# This is the 3rd commit message:

fix for rocm

# This is the 4th commit message:

update ut

# This is the 5th commit message:

add moe module

# This is the 6th commit message:

add expert count op

add ut for expert_count

# This is the 7th commit message:

update UT only for cuda

# This is the 8th commit message:

update ut

# This is the 9th commit message:

add moe module

# This is the 10th commit message:

make expert count private

* add assign pos op

* fix upper num name

* add api _assign pos

* add ut for assign pos op

* update date

* add op about moe gate

update utils

add limit by capacity op

add ut for limit_by_capacity

add ut for prune_gate_by_capacity

add ut for limit_by_capacity

add ut for prune_gate_by_capacity

* fix for win

* fix bugs in test_limit_by_capacity_op

* update ut

* update for test (timeout)

* fix ut

* update

* update(fix) ut for win

* moe apis in incubate

* # This is a combination of 10 commits.
# The first commit's message is:
add expert count op

add ut for expert_count

# This is the 2nd commit message:

update UT only for cuda

# This is the 3rd commit message:

fix for rocm

# This is the 4th commit message:

update ut

# This is the 5th commit message:

add moe module

# This is the 6th commit message:

add expert count op

add ut for expert_count

# This is the 7th commit message:

update UT only for cuda

# This is the 8th commit message:

update ut

# This is the 9th commit message:

add moe module

# This is the 10th commit message:

make expert count private

* add assign pos op

* fix upper num name

* add api _assign pos

* add ut for assign pos op

* update date

* fix for win

* update for test (timeout)

* fix ut

* update

* fix ut for number count

* add apis and utils

* add gate apis

* add moe and grad clip apis

* update moe apis

* add ops for moe gate

* fix

* update for base moe layer api

* add random routing op

add _random_routing api in utils

add random routing ut

* fix for dygraph

* update with random routing

* update

* fix ut for limit by capacity

* update

* update limit by capacity for easily to switch to single thread mode

* update api docs

Co-authored-by: hlygit66666 <2570058140@qq.com>

PaddlePaddle / Paddle synced successfully about 1 year ago