Commits · 79f8eeca3b8b527700efd49096f4c367f4dbf333 · Crayon鑫 / Paddle

Feb 19, 2022 · 8 commits

[Pten] Add selected_rows kernel for Full (#39465) · 79f8eeca

Committed by zyfncg on Feb 19, 2022

* Add selected_rows kernel for full

* remove fill_constant register in fluid

* fix bug without GPU

* add jit_kernel_helper dependency for fc

* do some refactor

* add unittest for ops signatures

* add coverage unittest

* fix merge conflict

* fix full selected_rows bug

79f8eeca

Update record interface using part1 (#39693) · eec6ef81

Committed by chenjian on Feb 19, 2022

* fix RecordEvent interface

* modify default level to 4

* update interface use

* add const default trace level

* update record event interface using

* update operator.cc

* update part1

* fix include profiler.h header in ps server

* fix include profiler.h header in ps server

eec6ef81

Enabled test_matmul_v2_op for final state Eager Dygraph (#39504) · 77625d7d

Committed by Zhanlue Yang on Feb 19, 2022

* Enabled test_matmul_v2_op for final state Eager Dygraph

* Fixed minor issue

* Fixed format issue

77625d7d

[PTen] Support parse cc file in gpu (#39691) · b29c05c7

Committed by Chen Weihang on Feb 19, 2022

* support parse cc in gpu

* change file name

b29c05c7

fix RecordEvent interface (#39675) · 019a552b

Committed by chenjian on Feb 19, 2022

* fix RecordEvent interface

* modify default level to 4

* update interface use

* add const default trace level

* update operator.cc

019a552b

[Pten] Adjust the params of creation kernel for inference (#39573) · 4e5d6743

Committed by zyfncg on Feb 19, 2022

* remove manual_api

* change sig map of full and empty

* fix fill_any_like_xpu_op

* fix fill_any_like_xpu_op

* fix problem of fill_any_like_xpu_op

* fix conflict

* polish code

4e5d6743

[Eager Hook] Support ReduceHook in GradNodeAccumulation (#39674) · 06b177c0

Committed by Weilong Wu on Feb 19, 2022

* [Eager] Support GradientHook before running separate GradNode

* Fix CI issue

* Support eager ReduceHook in accumulation_node

* Fix CI issue

* Add some tests to fix coverage CI issue

06b177c0

Add the DistributedFusedLamb optimizer (#39148) · 5df3cd61

Committed by sneaxiy on Feb 19, 2022

* add DistributedFusedLamb op

* polish code

* fix compile error

* compatible with pten changes

* fix rocm compile error

* improve coverage

* update upstream/develop

* fix cast_with_ptr.h

* add FLAGS_distributed_lamb_divide_nranks_when_allreduce=1

* fix clip before allreduce

* add use_master_param_norm

* code polish

* fix bug

* fix ROCM ci

5df3cd61

Feb 18, 2022 · 21 commits

Shared selected rows (#39608) · 7fc04070

Committed by Jiabin Yang on Feb 18, 2022

* merge legacy to fluid

* Remove legacy code

* Remove legacy code

* Remove DataType test

* Using Tensor directly instead of using EagerTensor

* support gradient_accumulation

* make test_imperative_lod_tensor_to_selected_rows longer

* make test_imperative_lod_tensor_to_selected_rows longer

* refine code

* Rename all EagerTensor to Tensor

* Rename some EagerTensor to Tensor

* rename EagerTensor to EagerVariable

* add more test

* Support copiable selected rows and merge develop

7fc04070

[Pten] blas and lapack migration (#39587) · 8c7ee8c2

Committed by Feiyu Chan on Feb 18, 2022

* move blas related files

* move lapack related files

8c7ee8c2

Fix wrong inputs (#39700) · 1d6fd81d

Committed by zlsh80826 on Feb 18, 2022

1d6fd81d

cinn_instruction_run_op test (#39576) · fdc4fe3b

Committed by TeFeng Chen on Feb 18, 2022

* add cinn_instruction_run_op test code

* update several interfaces of CinnLaunchContext

* update several interfaces and add detail comments in CinnLaunchContext class

* to skip the bug of error message check

* fix ut test failed due to reliant interface updated

fdc4fe3b

[pten] trans diagonal kernel into pten (#39575) · 5c66338f

Committed by xiongkun on Feb 18, 2022

* trans diagonal kernel into pten

* fix by code review

5c66338f

[AMP] support GPU BF16 amp for dygraph (#39029) · 7d6d3848

Committed by zhangbo9674 on Feb 18, 2022

* support dtype param for auto_cast

* add amp_dtype for tracer

* add unsupported bf16 list

* support bf16 amp for O2

* refine python interface for bfloat16

* refine code

* refine code

* refine unittest

* refine code

* refine code

* add bf16 o1

* refine code by comment

* add gradient accumulator

* add recompute

7d6d3848

[CustomDevice]Improved custom device initialization (#39634) · 7e4ed848

Committed by ronnywang on Feb 18, 2022

7e4ed848

[CustomRuntime] add pten::Backend support (#39606) · d6d0820e

Committed by ronnywang on Feb 18, 2022

d6d0820e

[IPU] Update IpuStrategy (#39644) · 46161679

Committed by Allen Guo on Feb 18, 2022

* Update IpuStrategy

* fix ci

* rerun ci

46161679

refactor the forward implementation of shape npu op (#39613) · e674af23

Committed by baoachun on Feb 18, 2022

e674af23

Infrt registers pten kernels (#39588) · dc39eb18

Committed by Wilber on Feb 18, 2022

* the mlir representation of pten, test=develop

* fixes an error, test=develop

* infrt registers pten kernels

Co-authored-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>

dc39eb18

[Pten] Support inplace and intermediate in C++ API (#39651) · 638aab6e

Committed by zyfncg on Feb 18, 2022

* support inplace and intermediate in yaml

* add cmake for dygraph_api

638aab6e

dropout support Seed, fix elementwise_add_grad bug, test=kunlun (#39656) · 70b9f2ac

Committed by taixiurong on Feb 18, 2022

70b9f2ac

[pten]add T, remove default value of DataType in DeviceContext::Alloc (#39620) · 8363406a

Committed by chentianyu03 on Feb 18, 2022

* add T to Alloc and remove default value of DataType in DeviceContext::Alloc

* add dtype

8363406a

add tool: print kernel signatures (#39670) · 03b875a8

Committed by Shang Zhizhou on Feb 18, 2022

* add tool: print kernel signatures

* fix windows compile

03b875a8

fix compile error in jetson (#39669) · c8c24460

Committed by Wilber on Feb 18, 2022

c8c24460

[MLU]add matmul and matmul_v2 op (#39539) · 229ec32a

Committed by qipengh on Feb 18, 2022

* [MLU]add matmul and matmul_v2 op

* [MLU] fix data_type and del matmul

* [MLU] fix compile error

* [MLU] fix ci_check error

229ec32a

add flatten op for mlu (#39530) · 4c5cec5c

Committed by joeqiao12 on Feb 18, 2022

4c5cec5c

[MLU]add sync stream ops and broadcast pytest (#39518) · d2bd05b9

Committed by zn on Feb 18, 2022

* [MLU]add sync stream ops and broadcast pytest

* [MLU]fix broadcast pytest to add data type

d2bd05b9

[Bug Fix]Fix gradient accumulator (#39577) · a7cbd3ef

Committed by Jiabin Yang on Feb 18, 2022

* merge legacy to fluid

* Remove legacy code

* Remove legacy code

* Remove DataType test

* Using Tensor directly instead of using EagerTensor

* support gradient_accumulation

* make test_imperative_lod_tensor_to_selected_rows longer

* make test_imperative_lod_tensor_to_selected_rows longer

* refine code

* Rename all EagerTensor to Tensor

* Rename some EagerTensor to Tensor

* rename EagerTensor to EagerVariable

* add more test

* fix different device gradient_accumulator bug

* merge develop

* remove useless tests

a7cbd3ef

[Eager] Support GradientHook before running separate GradNode (#39638) · adf4b98f

Committed by Weilong Wu on Feb 18, 2022

* [Eager] Support GradientHook before running separate GradNode

* Fix CI issue

* Fix CI issue

adf4b98f

Feb 17, 2022 · 11 commits

avoid custom kernel deps on pten_function_api (#39661) · cbce0e60

Committed by Leo Chen on Feb 17, 2022

* pten matmul cuda kernel support bf16

* avoid custom kernel deps on pten_function_api

* Revert "pten matmul cuda kernel support bf16"

This reverts commit 5d520845b9a189375677276efb673235ed8e5ee0.

* refine code

* fix compile

* fix test_split_api

cbce0e60

[pten] move bernoulli kernel to pten (#39590) · f86073c4

Committed by Leo Chen on Feb 17, 2022

* move bernoulli kernel to pten

* follow comments

f86073c4

[new-exec] refactor code of interpretercore gc (#39617) · c3135426

Committed by Leo Chen on Feb 17, 2022

* relocate code of interpretercore gc

c3135426

[bugfix] to concat input squash (#39593) · f29da150

Committed by Sylwester Fraczek on Feb 17, 2022

* fix and add more tests

* remove unwanted changes

* check only concat and elementwise

* move check to a function

* add todo comment

* Revert "fix ptq fc attr name fuse_activation->activation_type"

This reverts commit ffd023353a5e9b0fd15e81b9e9f9fe1794035017.

f29da150

add reshape2 op for mlu (#39562) · 2d2f11d1

Committed by joeqiao12 on Feb 17, 2022

2d2f11d1

fix selected_rows bug in C++ API (#39658) · b72d4cb4

Committed by zyfncg on Feb 17, 2022

b72d4cb4

save the name lists of variables of a cinn subgraph as its attributes (#39622) · a1ad003c

Committed by TeFeng Chen on Feb 17, 2022

* save the name lists of the input, internal and output variables of a subgraph as its attribute

* fix compile error

a1ad003c

move trunc to pten (#39543) · 4501abd6

Committed by Sing_chan on Feb 17, 2022

* move trunc to pten

* modify according to YuanRisheng's comment

4501abd6

[PTen] Clean useless header in pten core (#39560) · c05cd7ed

Committed by Chen Weihang on Feb 17, 2022

* clean useless header in pten core

* fix compiled failed

* fix cmake target

* fix typo

* resolve conflict

c05cd7ed

add softplus op for kunlun2. test=kunlun (#39555) · 9f99b591

Committed by houj04 on Feb 17, 2022

* add softplus op for kunlun2. test=kunlun

* add softplus op for kunlun2. test=kunlun

* fix code style. test=kunlun

* fix code style. test=kunlun

* add more test cases. test=kunlun

9f99b591

adaptive pool2d pass fix (#39600) · c1c5c1fc

Committed by wenbin on Feb 17, 2022

* first commit

* teller fix

* bug fix

* enable for pool2d only

* fix global_pooling issue

* pooling_type

* fix test

c1c5c1fc

Crayon鑫 / Paddle is in sync with its fork source project