Commits · 1f1a7835b9cc5499d19e860a0aba250f5e3cd2c2 · 机器未来 / Paddle

26 Aug, 2022 · 18 commits

Move conv2d_transpose XPU kernel to PHI, test=kunlun (#45419) · 1f1a7835
Committed by Ruibiao Chen on Aug 26, 2022

[Phi] Delete xpu kernel of fill_any_like and fill_constant in fluid (#45420) · 6ab80b64
Committed by zyfncg on Aug 26, 2022
```
* delete fill xpu op in fluid

* delete fill_constant header, test=kunlun

* fix npu header, test=kunlun
```

test=document_fix (#45454) · a6c4d976
Committed by tianshuo78520a on Aug 26, 2022

Layernorm shape bugfix (#45431) · 3ca8cf44

Committed by Wang Bojun on Aug 26, 2022

* fix bug fix

* add shape size check

* polish code

* multi -1 shape fix

* code style improve

* bug fix

* code style fix

fix ptq unittest (#45447) · 14f6c74b
Committed by Guanghua Yu on Aug 26, 2022

[Eager] delete final state pre-name (#45306) · 126940b3
Committed by wanghuancoder on Aug 26, 2022

fix en docs in fft and io.dataset (#44948) · 2dca718a

Committed by Liyulingyue on Aug 26, 2022

* irfftn; test = docutment_fix

* fft; test=document_fix

* fft; test=document_fix

* fft; test=document_fix

* subdata; test=document_fix

* adaptive_avg_pool2d; test=document_fix

* adaptive_avg_pool3d; test = document_fix

* ftt; test=document_fix

* ftt; test=document_fix

* AvgPool1D; test=document_fix

* avg_pool1d; test=document_fix

* test=document_fix

* test=document_fix

* test=document_fix

* test=document_fix

* fft; test=document_fix

* emb; test=document_fix

* emb; test=document_fix

* emb;test=document_fix

* fold; test=document_fix

* fold; test=document_fix

* fold; test=document_fix

* fold;test=document_fix

* GELU;test=document_fix

* update irfftn docs;test=document_fix

* Update fft.py

* Update fft.py

* Update common.py

* Update common.py

* Update fft.py

* Update input.py

* Update pooling.py

* dropout2d; test=document_fix

* Fold; test=document_fix

* update fold math;test=document_fix

* Update common.py
Co-authored-by: NLigoml <39876205+Ligoml@users.noreply.github.com>

fix_multihead (#45429) · fa06d9c3
Committed by Wangzheee on Aug 26, 2022

fix brpc update compile error; test=develop (#45438) · a5e9ccda
Committed by danleifeng on Aug 26, 2022

[XPU] add load_combine_op_xpu. test=kunlun (#45436) · 3055d71a
Committed by houj04 on Aug 26, 2022

From fluid package import APIs of quantization to paddle package (#45403) · 79f7f16d
Committed by whs on Aug 26, 2022

fix temporary tensor erase in new exe (#45404) · fda183d3
Committed by ceci3 on Aug 26, 2022

Transfer transfer_layout from fluid to phi (#45261) · 985f2a4a

Committed by kangguangli on Aug 26, 2022

* remove fluid kernel and activate phi kernel

* fix parameter error

* transfer mkldnn part

* modify header file path

* fix compile error

* transfer special case

* fix lod setting and special case for layout setting

* add testcase and refine code

Modify PE Engine thread from 2 into 1 in JitLayer (#45356) · 9382159d
Committed by Hui Zhang on Aug 26, 2022

[dygraph hybrid pp for interleave] Virtual pp stage layer split (#45402) · 04c15e79
Committed by Yuang Liu on Aug 26, 2022

fix reduce mean grad bug *test=kunlun (#45401) · 2a992178
Committed by haosicheng on Aug 26, 2022
```
* add temporal shift and grad *test=kunlun

* fix reduce mean grad bug *test=kunlun
```

[ Dy2static ] select input fix and while_op memory bug fixed. (#45380) · 91298884

Committed by xiongkun on Aug 26, 2022

* while support for python container.
It is convenient to convert more dynamic graph codes into static graphs.

* cond support python container

* 1. make select_input output shape = input[1]
2. add warning in while_loop risky assign

* fix 2 problem in GPT export:
1. a bug in while_op no_need_copy_var, which causes gpu memory leakage
2. a bug in undefined_var where the stop_gradient should be False.

* change name by code review

* format

[NPU] fix CI error in new executor. (#45432) · f4193eac
Committed by 王明冬 on Aug 26, 2022

25 Aug, 2022 · 22 commits

add support for double attributes (#45390) · efab2eb4
Committed by Feiyu Chan on Aug 25, 2022

Enable OMP multithreading in lookup_table_v2 (#45249) · 0c363de8

Committed by piotrekobi on Aug 25, 2022

* Add omp parallel for directives

* Revert "Add omp parallel for directives"

This reverts commit f4e4f8ddb12454018d9c1e49c074af2543659de6.

* Add #pragma omp parallel for to correct file

* Add check for _OPENMP definition

* Disable omp on gpu

* Trigger CI

* Readd check for _OPENMP definition

* Change macro disabling changes on GPU

* Improve macro readability

[OpAttr]axis of Reverse Support Tensor type (#45391) · 91110661
Committed by Aurelius84 on Aug 25, 2022
```
* [OpAttr]axis of Reverse Support Tensor type

* fix coverage

* fix unittest
```

update brpc version to 1.2.0 (#45351) · 9b5b005e
Committed by danleifeng on Aug 25, 2022
```
* update brpc version;test=develop
```

fix auto tune unitest assert (#45421) · cb0b53cb
Committed by hong on Aug 25, 2022

[OpAttr]min/max of uniform_random support Tensor type (#45417) · c8955d0d
Committed by Aurelius84 on Aug 25, 2022
```
* [OpAttr]min/max of Uniform_rand support Tensor type

* fix typo
```

Fix record operator input shapes segment fault in new dygraph (#45360) · 4d78390e
Committed by chenjian on Aug 25, 2022
```
* fix segment fault

* fix
```

Transfer memcpy d2h from fluid to phi (#45150) · 0d14e74a

Committed by kangguangli on Aug 25, 2022

* transfer memcpy_d2h from fluid to phi

* refine arg check and add comment

* fix cannot fallback to phi kernel

* fix gpu_context host alloc when tensor size = 0

* add kernel for std::vector<DenseTensor> args

* fix bugs in MemcpyD2HMultiIOKernel

* remove useless header file

* polish format

* fix typo

* add testcase for cudapinned place

* refine check condition in test

* polish error message

* polish error message

* remove header in fluid  directory

* merge memcpy_h2d and memcpy_d2h into one file, change register method to simplify implementation

* fix code style check

[NPU] add run_program_op_npu (#45349) · 64afa638
Committed by ronnywang on Aug 25, 2022
```
* [NPU] add run_program_op_npu

* add run_program_op_npu ut
```

make full_like support double_max in dygraph (#45385) · edd66f2e
Committed by Sing_chan on Aug 25, 2022
```
* make full_like support double_max in dygraph

* fix bug
```

[Eager] sync_batch_norm_grad delete mean and variance (#45411) · 5df464fe
Committed by wanghuancoder on Aug 25, 2022
```
* sync_batch_norm_grad delete mean and variance
```

optimize conv algo cache (#41891) · 1cd7e68b

Committed by hong on Aug 25, 2022

* optimizer conv alog speed

* code polish

* remove useless code

* fix compile error

* fix cpu compile error

* not use cudnn alog t

* add search cache max number

* polish code

* fix cache test bug

* add groups data format to conv args

* fix cache test bug

* fix cudnn_deterministic bug

* fix test switch auto tune bug

* fix test swith autotune bug;

* fix conv cache bug

* fix cache test error

* fix cache test bug

* fix windows mac compile error

* fix workspace search error

* update cudnn cache

* fix cache test bug; test=develop

* fix autotune swith test error

* polish code

* oplish code

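The bullets of the "optimize conv algo cache" commit mention a maximum number of cached search results and adding groups and data format to the conv arguments. As a rough illustration only, the Python sketch below shows a bounded cache keyed by the full conv configuration. `ConvAlgoCache`, `make_key`, `get_or_search`, and the least-recently-used eviction are all hypothetical choices made for this sketch; the commit does not state its eviction policy, and the actual change lives in Paddle's C++ code.

```python
from collections import OrderedDict


class ConvAlgoCache:
    """Hypothetical bounded cache: conv configuration -> chosen algorithm."""

    def __init__(self, max_entries=1024):
        if max_entries < 1:
            raise ValueError("max_entries must be at least 1")
        self.max_entries = max_entries
        self._entries = OrderedDict()

    @staticmethod
    def make_key(input_shape, filter_shape, strides, paddings, dilations,
                 groups, data_format, dtype):
        # groups and data_format are part of the key, so two convs that
        # differ only in those fields never share a cached algorithm.
        return (tuple(input_shape), tuple(filter_shape), tuple(strides),
                tuple(paddings), tuple(dilations), groups, data_format, dtype)

    def get_or_search(self, key, search_fn):
        if key in self._entries:
            self._entries.move_to_end(key)  # mark as most recently used
            return self._entries[key]
        algo = search_fn()  # the expensive search runs only on a miss
        self._entries[key] = algo
        if len(self._entries) > self.max_entries:
            # Assumed policy: evict the least recently used entry.
            self._entries.popitem(last=False)
        return algo

    def __len__(self):
        return len(self._entries)
```

Keeping every field that can change the best algorithm inside the key is the point of the sketch: a key that omitted `groups` or `data_format` could return a cached result tuned for a different convolution.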

Fl-PS bug fix (#45413) · f2f3f6e7

Committed by ziyoujiyi on Aug 25, 2022

* back fl

* delete ssl cert

* .

* make warning

* .

* unittest paral degree

* solve unittest

* heter & multi cloud commm ready

* .

* .

* fl-ps v1.0

* .

* support N + N mode

* .

* .

* .

* .

* delete print

* .

* .

* .

* .

* fix bug

* .

* .

* fl-ps with coordinator ready

* merge dev

* update message parse only

* update fl client scheduler

* fix bug

* update multithreads sync

* fix ci errors

* update role_maker.py

* update role_maker.py

* fix ci error: windows py import error

* fix ci error: windows py import error

* fix windows ci pylib import error

* add dump fields & params

* try to fix windows import fleet error

* fix ps FLAGS error

* fix logging risk

* fix logging possible risk

* write trainer_desc file

* support split sparse params in local & remote

* fix import paddle.fluid.core.PSGPU

* fix import paddle.fluid.core.PSGPU

* add remote_sparse & local_sparse config

* fix unittest

* fix test_dist_fleet_geo table error

* fix PADDLE_ENFORCE error

* fix other's pr conflict

* forbidden ssd table

* .

* recover ssd table code

* recover file mode

[triu_indices] add triu_indices_op (#45168) · a410c397
Committed by Rayman on Aug 25, 2022

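For readers unfamiliar with the operation named in the `triu_indices` commit, the pure-Python sketch below shows what an upper-triangular index query conventionally returns: the row and column indices of every element on or above a chosen diagonal. It assumes the same convention as NumPy's `triu_indices`, where `offset` shifts the diagonal; `triu_indices_sketch` is a hypothetical name and is not the operator added in this commit.

```python
def triu_indices_sketch(row, col=None, offset=0):
    """Return (rows, cols) of elements with col_index - row_index >= offset."""
    if col is None:
        col = row  # assume a square matrix when only one size is given
    rows, cols = [], []
    for i in range(row):
        # The first kept column in row i is i + offset, clamped at 0.
        for j in range(max(i + offset, 0), col):
            rows.append(i)
            cols.append(j)
    return rows, cols


rows, cols = triu_indices_sketch(3)
print(rows)  # [0, 0, 0, 1, 1, 2]
print(cols)  # [0, 1, 2, 1, 2, 2]
```

A positive `offset` drops diagonals from the result and a negative one adds diagonals below the main diagonal.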

fix params sync multi times problem (#45406) · 20d38664
Committed by Wilber on Aug 25, 2022

add new function ptq first then initialize qat scale with ptq scale (#44747) · 9ac27ac3
Committed by handiz on Aug 25, 2022

[Auto Parallel] Support High Order Differential with Data Parallel Calc-Comm Overlaping (#45388) · bdd0b0f1
Committed by JZ-LIANG on Aug 25, 2022
```
* support high order differential with data parallel overlap

* update unitest
```

fix roi_align_op_npu to pass the unittest (#45310) · 256bf6ff
Committed by USTCKAY on Aug 25, 2022

Fix unique_kernel bugs (#45032) · ea1f4702
Committed by sprouteer on Aug 25, 2022
```
* fix unique_kernel bugs

* fix unique kernel cu bugs
```

Fix relu python call (#45082) · 839fac65

Committed by hong on Aug 25, 2022

* add python final state

* fix bug

* fix bugs

* fix bug

* fix bug

* revert impl, final state mul not support selected rows

* fix softmax use cudnn error

* add softlable false unitest

* revert loss.py

add temporal shift and grad *test=kunlun (#45300) · 63d9a175
Committed by haosicheng on Aug 25, 2022

enforce_reshape (#45386) · 0bf40070
Committed by zhoutianzi666 on Aug 25, 2022

机器未来 / Paddle · In sync with the upstream fork source