Commits · 715fd051d03935afcbcaaa8ddfe9daca0d2fa5cf · PaddlePaddle / Paddle

19 Nov, 2021 1 commit
fix cmake dependence error (#37304) · 6653ac5e
Authored by LiYuRio on Nov 19, 2021

18 Nov, 2021 7 commits

Fix for wrong results in segmentation models (#37310) · c1802f91
Authored by jakpiase on Nov 18, 2021
```
* fix

* ci rerun

* ci rerun

* ci Rerun
```
optimize the data structure to speed up sampling in graph engine. (#37315) · 521a274e
Authored by Webbley on Nov 18, 2021
```
* optimize the data structure from c++ to python to speed up sampling in graph engine

* update test
```
fix bug to support dropout eval grad computing. (#37305) · c3d3001f
Authored by Li Min on Nov 18, 2021
```
* fix bug to support dropout eval grad computing.

* Remove useless code.
```

[PTen]elementwise_sub kernel refactor (#37260) · 36a95654

Authored by YuanRisheng on Nov 18, 2021

* elementwise_add kernel refactor

* fix compile bugs in elementwise_add refactor

* fix compile bugs when run in npu/xpu

* fix bugs when run unit test

* fix bugs when run ci-windows

* modify code as recommended

* code format adjust

* fix bugs when run ci

* fix compile bug when run in ci-windows

* elementwise_sub refactor

* add PD_DLL_DECL for elementwise_sub

* fix bugs when compile


[fleet_executor] Parse runtime graph to start carrier (#37282) · f85bd5c9
Authored by Yuang Liu on Nov 18, 2021


Add the `GetFetchNames` method in CinnGraphSymbolization. (#37218) · 3ad495e8

Authored by Zhen Wang on Nov 18, 2021

* Add the `GetFetchNames` method in CinnGraphSymbolization.

* Use unordered_set instead of vector as the type of fetch_var_names.

* Reuse the definition of kCompilationKey.

* Use CompileOptions to set fetch_var_ids.

* Update the argument passing of GraphCompiler.Build.

* Fix some bugs in CinnGraphSymbolization::GetFetchIds.


Opt topk (#37256) · c4862d99

Authored by zhangkaihuo on Nov 18, 2021

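topk has two implementations: one based on cub and one hand-written kernel. The cub path obtains the top k by sorting. Measurements over several data sets show that it outperforms the hand-written kernel only when input_width >= 128 and k exceeds 75% of input_width.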
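
The selection rule in this message can be sketched as a small predicate. This is a minimal illustration, not the dispatch code from the commit: the function name `use_cub_topk` and the two constant names are hypothetical, and only the thresholds (input_width >= 128, k above 75% of input_width) are taken from the message.

```python
# Hypothetical names; only the two thresholds come from the commit message.
MIN_WIDTH_FOR_CUB = 128   # input_width must be at least this large
K_RATIO_FOR_CUB = 0.75    # k must exceed this fraction of input_width


def use_cub_topk(input_width: int, k: int) -> bool:
    """Return True when the sort-based cub path is expected to be faster."""
    if not 0 < k <= input_width:
        raise ValueError("k must satisfy 0 < k <= input_width")
    return input_width >= MIN_WIDTH_FOR_CUB and k > K_RATIO_FOR_CUB * input_width
```

Under these assumptions, a row of width 128 would take the cub path only for k of 97 or more, and any row narrower than 128 would stay on the hand-written kernel regardless of k.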


17 Nov, 2021 12 commits

Replace custom IOHW -> OIHW reorder with build-in oneDNN reorder (#37175) · 162ac048

Authored by Sławomir Siwek on Nov 17, 2021

* Use oneDNN reorder instead of custom one

* Fix whitespace typo

* Fix Code format error

* Incorporating feedback

* Remove unnecessary reorder

* Support GIOHW format

* Fix code format error

[new-exec] Refine standalone executor (#37278) · 6d6642c8
Authored by Leo Chen on Nov 17, 2021
```
* init

* add feed ops in python side

* import LRScheduler

* update_feed

* refine code format
```

Changed first batch of deprecated mkldnn headers and function names to new oneDNN names (#37040) · ce3ee9bb

Authored by piotrekobiIntel on Nov 17, 2021

* Change first batch of mkldnn headers and namespace names to dnnl

* Revert changes to tensor.h, which require approval

* Format changes with pre-commit

* Add int32 tests

* Fix int32 tests and call GetDataFromTensor for int32

* Fix test

Modify reduce_op.op.h for xpu2 with kernel primitive api (#36904) · 9c5d5665
Authored by niuliling123 on Nov 17, 2021
```
* Modify reduce_op.op.h for xpu2 with kernel primitive api
```

Fix data transform bug in new executor (#37280) · 1460b761
Authored by Aurelius84 on Nov 17, 2021


update dataset (#37194) · ca8c4f3e
Authored by zhaocaibei123 on Nov 17, 2021


[heterps]Refactor heterogenous worker (#37244) · 54d2626a

Authored by zmx on Nov 17, 2021

* fix. test=develop

* fix. test=develop

* fix. test=develop

* fix. test=develop

* fix. test=develop

* fix ut. test=develop

* fix ut. test=develop

* fix ut. test=develop

* refactor heter trainer. test=develop

* fix. test=develop

* fix ut. test=develop

* fix ut. test=develop

* fix ut. test=develop

* fix ut. test=develop

* fix ut. test=develop

* fix ut. test=develop

* fix ut. test=develop

* fix ut. test=develop

* fix. test=develop

* fix. test=develop

* fix. test=develop

* fix. test=develop

* fix ut. test=develop

* fix ut. test=develop

* fix ut. test=develop


fix compile error when pslib use cpu branch;test=develop (#37248) · 0057c12d
Authored by danleifeng on Nov 17, 2021

copy beta pow to same place when skip_update=1 (#37245) · 5e4b419b
Authored by Leo Chen on Nov 17, 2021
```
* copy beta pow to same place when skip_update=1

* fix xpu
```

[Fleet Executor] Construct runtime graph (#37158) · 0daa69d4
Authored by LiYuRio on Nov 17, 2021


[npu][hybrid] support offload (#37224) · 762819a8
Authored by WangXi on Nov 17, 2021


Dependence analysis (#37231) · d943459b

Authored by xiongkun on Nov 17, 2021

* add

* add BuildOperatorDependences

* fix bug

* add unittest for write after write

* fix merge bug

* fix


16 Nov, 2021 10 commits


decrease pten log level (#37239) · d8982c52
Authored by Chen Weihang on Nov 16, 2021

Added BF16 Pool2d grad (#37081) · f95d44a2
Authored by arlesniak on Nov 16, 2021
```
* Added BF16 Pool2d grad

* upstream pulled

* fix for CI

* fixes after review
```

[psgpu]fix pipe bug:save and pull overlap; test=develop (#37233) · 62ec644f
Authored by danleifeng on Nov 16, 2021


Removed unnecessary ENFORCE statement (#37219) · 70b7c7ed
Authored by Weilong Wu on Nov 16, 2021


Add API and unit test for reshape (#37232) · 79b49c20

Authored by YuanRisheng on Nov 16, 2021

* reshape kernel refactor

* fix compile bugs when run ci

* support xpu for reshape

* fix bugs when run unittest in kunlun ci

* fix compile bugs when run kunlun

* perfect code according to suggestion

* add api and unit test for reshape

for pure fp16 (#37230) · 6ebc318e
Authored by zhangkaihuo on Nov 16, 2021
```
Add pure fp16 support for fused transformer.
```
Make FLAGS_determinstic effective in conv2d forward. (#37173) · ea47d211
Authored by Yiqun Liu on Nov 16, 2021
```
* Make FLAGS_determinstic effective in conv2d forward.

* Add call of SetCinnCudnnDeterministic in cinn_launch op.
```

added onednn elu kernel (#37149) · ae40ee32
Authored by jakpiase on Nov 16, 2021


Fix attn_bias_add bug. (#37147) · a9e7a854

Authored by Li Min on Nov 16, 2021

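The implementation of fused_attention_op uses bias_add, which is itself built on the kernel primitive API. The kernel primitive WriteData API later changed in both its interface and its internal implementation: the out-of-bounds check was moved into a template parameter. As a result, the call took the wrong branch and performed out-of-bounds writes, corrupting the contents of other GPU memory. Symptoms: a single run of test_fused_attention_op_api.py almost never fails, but running it repeatedly in a loop with inputs of different shapes produces incorrect results. The failure is intermittent, which makes the bug hard to notice.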
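
The failure mode described in this message can be illustrated with a small host-side simulation. It is a sketch only: `write_tile`, `bias_add`, the `is_boundary` flag, and the flat list standing in for GPU memory are assumptions made for this example, and they do not reproduce the real kernel primitive `WriteData` signature.

```python
def write_tile(memory, base, tile, valid, is_boundary):
    """Copy one tile into memory starting at index base.

    With is_boundary=True only the first `valid` elements are written.
    With is_boundary=False the whole tile is written unconditionally.
    """
    count = min(valid, len(tile)) if is_boundary else len(tile)
    for i in range(count):
        memory[base + i] = tile[i]


def bias_add(memory, out_base, x, bias, tile_size, check_last_tile):
    """Write x + bias tile by tile (scalar bias, for brevity)."""
    n = len(x)
    for start in range(0, n, tile_size):
        tile = [0.0] * tile_size
        valid = min(tile_size, n - start)
        for i in range(valid):
            tile[i] = x[start + i] + bias
        partial = valid < tile_size
        # Passing the wrong flag for a partial tile models the wrong branch.
        write_tile(memory, out_base + start, tile, valid,
                   is_boundary=partial and check_last_tile)
```

With a 6-element output, a tile size of 4, and `check_last_tile=False`, the second tile writes four elements and overwrites the two slots that follow the output. Whether that damage is visible depends on what happens to live in those slots, which matches the intermittent, shape-dependent failures reported above.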


[fleet_executor] Add sync method (#37167) · f49c2c23
Authored by Yuang Liu on Nov 16, 2021


15 Nov, 2021 10 commits

[Pten] Refactor the implementation of custom operator (#37122) · 1e598f1a

Authored by Chen Weihang on Nov 15, 2021

* move extension into pten [no-verify]

* append tensor methods by ext_tensor [no-verify]

* append other tensor methods [no-verify]

* ext related files tidy [no-verify]

* include relation tidy [no-verify]

* add pten tensor test [no-verify]

* replace tensor in custom op & compile success

* refine tensor constructor for unittest

* custom relu jit run success

* fix all custom op unittests

* add inference cmake adapt [no-verify]

* fix failed unittests

* fix windows failed unittests

* try to fix kunlun and inference failed

* fix test_elementwise_api error

* try to fix win compile failed

* fix kunlun fp16 type error

* remove useless handle error macro

* add custom linear op test

* fix compile failed & add win symbols

* fix non pten kernel cast failed

* add dll decl for api

* polish several details

* polish details by review comment

* add dll_decl for register


[new-exec] fix stream analysis (#37161) · 584b4b24

Authored by Leo Chen on Nov 15, 2021

* fix record_event

* refine class Instruction

* refine Instruction and InterpreterCore

* make instruction and operator_base consistent

* support NoNeedBufferVar in stream_analyzer

* fix place of event

* add vlog before continue


remove input dim check in op_teller and update ut (#37097) · 6b21bb0b

Authored by baoachun on Nov 15, 2021

* remove input dim check of activation in op_teller

* remove input dim check of concat in op_teller

* remove input dim check of clip in op_teller

* remove input dim check of scale in op_teller

* remove input dim check in op_teller

* update attr check of slice in op_teller


fix ctest depent probs (#37203) · cf958f2f
Authored by Yuang Liu on Nov 15, 2021

fix 3 bug of new_executor (#37142) · 8358d614
Authored by wanghuancoder on Nov 15, 2021
```
* fix 3 bug, test=develop

* refine, test=develop
```

fix:delete macro INFERENCE (#37130) · b628c316
Authored by feng_shuai on Nov 15, 2021

Added BF16 to mean op (#37104) · df7cc457
Authored by arlesniak on Nov 15, 2021
```
* Added BF16 to mean op

* fix for CI

* fix for CI

* fix for CI
```

fix cinn_compile_test not pass problem (#37190) · 83eef6d2
Authored by jiangcheng on Nov 15, 2021

[New features] Add elementwise_mul triple grad kernel (#37152) · 59fdf4da
Authored by Weilong Wu on Nov 15, 2021
```
* Add elementwise_mul triple grad kernel

* Removed InplaceInferer and polished code
```

Accessor 20211112 2 (#37181) · 84b0ec97
Authored by zhaocaibei123 on Nov 15, 2021


PaddlePaddle / Paddle synced successfully about 1 year ago
