Commits · cd2a4cdf4762751431f3469c72922c1b6ff326c8 · 机器未来 / Paddle

11 Apr, 2022 · 1 commit
- fix some ops (#41577) · 795d7121
  Committed by sneaxiy on Apr 11, 2022
10 Apr, 2022 · 1 commit
- fix warpctc grad kernel dep error (#41598) · 91d6f47a
  Committed by Chen Weihang on Apr 10, 2022
09 Apr, 2022 · 2 commits

add depthwise conv hip support (#41537) · b3b8d345
Committed by hong on Apr 09, 2022

Autotune the workspace_size_limit in conv. (#40338) · b937cdc5

Committed by limingshu on Apr 09, 2022

* Using the maximum workspace_size of all algorithms to limit the workspace size in exhaustive search mode.

* Use the system cudaMalloc and cudaFree to allocate workspace during searching.

* Enable switch of two kinds of workspace setting methods.

Co-authored-by: Liu Yiqun <liuyiqun01@baidu.com>

08 Apr, 2022 · 1 commit
- Fix RNN OP multi-threads predict bug (#41529) · 09203e46
  Committed by Jack Zhou on Apr 08, 2022
07 Apr, 2022 · 8 commits
- remove FLAGS_use_curand and change all random op CUDA implementation (#41308) · 9714878c
  Committed by zhouweiwei2014 on Apr 07, 2022
- [Phi]Add hard_swish/kron/linspace/logit yaml file (#41298) · 90cb337e
  Committed by YuanRisheng on Apr 07, 2022
```
* add yaml

* perfect coverage
```
- fix compile bug of windows cuda11.5 (#41433) · eea85814
  Committed by zhouweiwei2014 on Apr 07, 2022
- Add Sparse API to_dense, to_sparse_coo and values (#41394) · f78cc3da
  Committed by zhangkaihuo on Apr 07, 2022
- [BugFix] Add error hint for one_hot gpu version (#41335) · 91266b96
  Committed by Siming Dai on Apr 07, 2022
```
* add one_hot gpu hint

* move allow_out_of_range judgement

* delete useless unittest
```
- fix p_norm gpu nan bug while divide zero (#41359) · dfa63126
  Committed by zhiboniu on Apr 07, 2022
- [Phi] Polish truncated normal kernel and add yaml (#41280) · d39e7896
  Committed by Chen Weihang on Apr 07, 2022
```
* polish truncated normal kernel

* add yaml

* add truncated normal kernel and add yaml

* polish unittests and yaml

* import dygraph method
```
- fix bugs of reshape double grad infermeta (#41459) · 53409bcd
  Committed by YuanRisheng on Apr 07, 2022
06 Apr, 2022 · 4 commits

[Phi]Add graph_send_recv yaml file (#41206) · 6f4bd0ea
Committed by YuanRisheng on Apr 06, 2022
```
* add graph_send_recv yaml

* deal with conflict

* fix compile bugs
```

fix bug of missing boost when compile cache.cc (#41430) · 5c6e4bff
Committed by Sing_chan on Apr 06, 2022

Add conv yaml (#41354) · 7ed7c6c7

Committed by hong on Apr 06, 2022

* update

* add conv yaml

* add backward

* remove useless code

* fix bug

* fix bug

* revert fluid dygraph conv2d

* remove useless infermeta function

* fix meta fn duplicate error

* conv using custom impl

* remove amp include

* fix bug

* use cudnn = true

* fix test mkldnn caching bug


[Dygraph TestsFix] Test some tests in new dygraph final_state mode. (#41363) · 0b96793e
Committed by xiongkun on Apr 06, 2022
```
* fix less than

* fix some tests

* fix additional 3 unittest case
```

05 Apr, 2022 · 4 commits

Fix bug of data transform in inference executor (#41349) · 91212104
Committed by zyfncg on Apr 05, 2022
```
* fix bug of data transform in inference executor

* fix bug
```

[DoubleGrad PR #8] Enabled triple grads for sigmoid and matmul (#41387) · d8a10977

Committed by Zhanlue Yang on Apr 05, 2022

* [Refactor] refactored eager_gen.py PR #2

* [DoubleGrad PR #1] Decoupled code generation logics for Dygraph ForwardFunctions and GradNodes

* Fixed minor issue

* Adjusted logics of GenerateNodeCreationCodes and GenerateForwardDefinition

* Fixed issues

* Supported higher-order grad node generation

* [DoubleGrad PR #4] Supported higher-order GradNode generation

* [DoubleGrad #4] Bug Fixes to Double Grad Node Generation

* Fixed yaml typo

* Fixed yaml typo

* fixed minor issues

* [DoubleGrad PR #5] Enabled gradient computations for grad_tensors passed to paddle.grad()

* Fixed minor issue

* Fixed CI-Inference issue

* Fixed CI-inference issues

* [DoubleGrad PR #7] paddle.grad() to copy backward graph before backward run

* Fixed minor issues

* Fixed issue with backward graph construction logic

* Fixed implementation issues with backward graph reconstruction

* Fixed unittest issue

* Fixed issues

* [DoubleGrad PR #8] Enabled triple grads for sigmoid and matmul

* Fixed issues with phi kernel

* Added triple grad test case

* Fixed minor issue


add new format of quantization (#41041) · b72a7ebb
Committed by Guanghua Yu on Apr 05, 2022

Implement AutoTuneStatus class for Kernel Auto Tune (#41218) · b0f8000e

Committed by Zhang Ting on Apr 05, 2022

* switch autotune

* implement AutoTuneCache

* implement AutoTuneCache class

* add pybind api

* add dygraph test

* support static mode and eager mode and improve unittests

* rename the SwitchAutoTune Class and improve tests

* improve AutoTuneStatus and reduce the cost of tests


04 Apr, 2022 · 4 commits
- Fix Warpctc error when using multi-gpu (#41389) · f8b3e576
  Committed by 0x45f on Apr 04, 2022
- fix index_select kernel configuration error where input numel is 0 (#41383) · 3e9ad093
  Committed by FlyingQianMM on Apr 04, 2022
- Add batch norm yaml (#41386) · 77cf305f
  Committed by hong on Apr 04, 2022
```
* update

* fix bug
```
- Add yaml for flatten_contiguous_range OP (#41345) · c5285cc5
  Committed by From00 on Apr 04, 2022
```
* Add yaml for flatten_contiguous_range OP

* update

* Fix typos

Co-authored-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
```
03 Apr, 2022 · 4 commits

[Phi]Concat grad (#41112) · 3f57ef7a

Committed by chentianyu03 on Apr 03, 2022

* add concat_grad kernel

* fix error

* remove comment code

* fix outs nullptr error

* change to phi header

* add concat_grad declare for standalone_executor_test


add maximum limit for grid of index_select (#41127) · af8d2482

Committed by FlyingQianMM on Apr 03, 2022

* limit grid dim for index select

* mv LimitGridDim into gpu_launch_config.h

* fix conflicts

* fix conflicts

* fix code style

* set block to 256

* fix grid setting

* set dtype of block_dim to unsigned int


Add randperm and range yaml (#41265) · fd1ecfc5
Committed by zyfncg on Apr 03, 2022
```
* add randperm and range yaml

* add eager test for randperm
```

Add some yaml config (#41053) · e4914734

Committed by From00 on Apr 03, 2022

* Add yaml config

* Add yaml for flatten_contiguous_range_op

* Remove h_sigmoid yaml

* Fix CI errors

* Fix code format

* Fix flatten OP errors

* Fix conflicts

* Fix CI errors

* Remove flatten_contiguous_range OP

* Remove redundant code

* Fix typos


02 Apr, 2022 · 9 commits

Add graph apis (#40809) · b0398c8e

Committed by Siming Dai on Apr 02, 2022

* Add graph_reindex API

* add graph_sample_neighbors api

* Add buffer

* delete VLOG

* delete thrust::copy for output

* add ShareDataWith

* delete graph_reindex hashtable output

* add graph_reindex dispensable

* add reindex unittest, move memset to cuda kernel, change api

* fix conflict

* add reindex buffer for gpu version note

* fix conflicts for op_func_generator

* Add fisher_yates sampling, add dispensable, change infermeta

* add dtype for edge_id

* fix rocm ci and static check ci

* add unittest

* fix unittest

* fix unittest

* fix bug


[Yaml] add yaml for 5 ops [ elementwise_pow, expm1, floor_divide, logsumexp, mish ] (#41288) · 36f97cdc
Committed by xiongkun on Apr 02, 2022
```
* add yaml for ele_max ele_min

* add yaml for: mish / logsumexp / expm1 / elementwise_pow / elementwise_floordiv
```

[phi] Move clip op to phi (#40602) · c0658045

Committed by wuyefeilin on Apr 02, 2022

* move clip op to phi

* fix as review

* update hierarchical_sigmoid_kernel.cc

* update selected_rows

* update clip_kernel.cu

* fix as review


enable new-executor on windows to test it (#41301) · e59a693e
Committed by Leo Chen on Apr 02, 2022
```
* enable new-executor on windows to test it

* add message

* fix ut
```

Sparse conv and pool support indices as template (#41137) · 5d3fd4fe
Committed by zhangkaihuo on Apr 02, 2022

Fix a bug when reduceHigherDim in HIP (#41273) · 7dd4a9fe
Committed by niuliling123 on Apr 02, 2022

Limit the condition of entering optimized kernel (#41296) · 3b686b18
Committed by Zhang Zheng on Apr 02, 2022
```
Co-authored-by: root <root@yq01-sys-hic-k8s-v100-box-a225-0186.yq01.baidu.com>
```

[Yaml] transfer around 22 ops yaml file and pass the final state OpTest. (#41024) · 16bfcd18

Committed by xiongkun on Apr 02, 2022

* 1. add the python api grad 2. add final and intermediate state vlog 3. change the python_api error logic

* add python api or close the check_eager=True

* fix the compatibility


Fix sparse conv and verify sparse conv backward (#40961) · ad0c106c
Committed by zhangkaihuo on Apr 02, 2022

01 Apr, 2022 · 2 commits

update (#41245) · 99029dc9
Committed by hong on Apr 01, 2022

[Eager] Support pinned (#41035) · f3270fc8

Committed by wanghuancoder on Apr 01, 2022

* support pinned, test=develop

* support async_write, test=develop

* refine, test=develop

* refine, test=develop

* refine, test=develop

* refine, test=develop

* refine, test=develop

* refine, test=develop

* refine, test=develop

* refine, test=develop


机器未来 / Paddle · In sync with the fork's source project
