Commits · e683e537aadb1e0db8e92e03b17d0d40159bd97e · Crayon鑫 / Paddle

01 Sep, 2022 · 4 commits

[phi] Migrate uniform_random XPU kernel to PHI (#45583) · ded33b58

Committed by HongyuJia on Sep 01, 2022

* copy kernel file to phi

* delete some code

* migrate uniform_random, test=kunlun

* fix input error, test=kunlun

* fix gpu register error, test=kunlun

* add include file, test=kunlun

* try fix error from CI, test=kunlun

* polish other PR

* fix CI-coverage error, test=kunlun

ded33b58

add deps on mkldnn for var_type_traits (#45629) · 13d62e12
Committed by Leo Chen on Sep 01, 2022

13d62e12

ps optimizer default config (#45563) · ae217373

Committed by wangguanqun on Sep 01, 2022

* config

* fix unittest

* zero init & cache & patch config

* add barrier to save and load

* add unittest

ae217373

remove circular dependency of device_context and allocator (#45455) · 934171ae

Committed by Leo Chen on Sep 01, 2022
```
* refine cmake of framework

* add deps for dense tensor

* fix deps

* remove alloc(ctx)

* add depends on mkldnn
```
934171ae

31 Aug, 2022 · 3 commits
- Fix UT failures (#45099) · 1ac8ca4d
  Committed by Leo Chen on Aug 31, 2022
  
  1ac8ca4d
- add del dropout op pass to jit pe enigne (#45439) · 46bc06b5
  Committed by Hui Zhang on Aug 31, 2022
```
* add del dropout op pass to jit pe enigne

* add delete dropout test
```
  46bc06b5
- fix bug that Op with id 0 can not be lauched (#45577) · 9c1aa6c7
  Committed by Leo Chen on Aug 31, 2022
  
  9c1aa6c7
30 Aug, 2022 · 2 commits

Remove extra attribute in OpMaker (#44310) · fe321f9a

Committed by zyfncg on Aug 30, 2022

* add runtime config in phi

* add runtime attr for op desc and op

* fix no proto error

* adjust opdesc set_attr impl

* try to remove conv_op extra attrs

* add init runtime attr map

* change extra header path

* fix runtime_attr

* fix trace_op

* fix bug of pass

* fix merge conflict

* fix dygraph attrs

* fix bug of pass

* fix dygraph bug

* fix unittest module

* delete extra attr default

* fix dropout kernel

* polish code

* fix extra output of instance_norm

* fix merge confilct

* fix op_desc bug

* add extra attr in yaml for conv3d_transpose

* don't remove extra input and output

* fix save_inference_model

* fix bug of batch_norm

* revert some change

* polish log

* polish code

* add code comment

Co-authored-by: Chen Weihang <chenweihang@baidu.com>

fe321f9a

[Paddle-TRT] constant-folding (#45494) · 97f43a8e

Committed by zhoutianzi666 on Aug 30, 2022
```
add constant folding pass, for some model, it will get less latency;
```
97f43a8e

29 Aug, 2022 · 2 commits

[new_exe] Dy2Static support new_executor (#44450) · aba1295b

Committed by zhangbo9674 on Aug 29, 2022

* add interpretercore

* refine backward program id

* add code

* refine program

* refine code

* create forward/backward_program by prog2graph2prog method

* test, do not care

* refine code

* refine code

* refine code

* test, do not care

* add interpretorcore

* add scope

* refine scope create method

* add jit for new_exe

* solve conflict

* delete unused code

* polish code

* polish code

* refine scope in inplace

* refine for datatransfer

* refine _rebuild_from_desc

* refine control eager deletion attr

* refine used_for_jit

* refine jit for infer

* op size0 use ori program

* polish code

* refine jit

* refine run_program_op ut

* refine inplace

* refine control

* refine graph helper

* refine control

* refine inplace

* refine buffer_share_inplace_pass

* polish code

* polish code

* refine usage for compilerProgram

* refine control

* test

* test core cache

* refine code

* refine io.py

* increase test_seq2seq timeout

* refine convert program

* refine interpretercore_cache release

* delete buildinplace

* refine partial_program && io

* refine code for io

* test

* test

* test

aba1295b

[OpAttr]num_rows/num_colums of eye support Tensor type (#45427) · b93b710a

Committed by Aurelius84 on Aug 29, 2022
```
* [OpAttr]num_rows/num_colums of eye support Tensor type

* fix attr cast with long type
```
b93b710a

26 Aug, 2022 · 2 commits

Transfer transfer_layout from fluid to phi (#45261) · 985f2a4a

Committed by kangguangli on Aug 26, 2022

* remove fluid kernel and activate phi kernel

* fix parameter error

* transfer mkldnn part

* modify header file path

* fix compile error

* transfer special case

* fix lod setting and special case for layout setting

* add testcase and refine code

985f2a4a

[NPU] fix CI error in new executor. (#45432) · f4193eac

Committed by 王明冬 on Aug 26, 2022

f4193eac

25 Aug, 2022 · 3 commits
- add support for double attributes (#45390) · efab2eb4
  Committed by Feiyu Chan on Aug 25, 2022
  
  efab2eb4
- update brpc version to 1.2.0 (#45351) · 9b5b005e
  Committed by danleifeng on Aug 25, 2022
```
* update brpc version;test=develop
```
  9b5b005e
- [NPU] add run_program_op_npu (#45349) · 64afa638
  Committed by ronnywang on Aug 25, 2022
```
* [NPU] add run_program_op_npu

* add run_program_op_npu ut
```
  64afa638
24 Aug, 2022 · 3 commits

Solve the random state serialization (#45327) · 73e41c89

Committed by ShenLiang on Aug 24, 2022
```
* fix utest

* fix utest

* fix utest

* fix log

* fix random utest
```
73e41c89

make tensor_util contains no cuda code (#45256) · 78916a7a

Committed by Leo Chen on Aug 24, 2022

* make tensor_util contains no cuda code

* refine isfinite

* revert ut

* move isfinite function to its op

* fix test

* fix compile

* std::isnan is not defined for int type on windows

* fix windows compile

* fix fp16

* fix rocm compile

* revert gradient node

78916a7a

conv_eltwiseadd_bn_fuse support fp16 (#45379) · 62b5452d

Committed by Wilber on Aug 24, 2022

62b5452d

23 Aug, 2022 · 4 commits
- print log while use new exe (#45335) · fac8a260
  Committed by pangyoki on Aug 23, 2022
  
  fac8a260
- [AutoParallel] Add Quant Pass (#44877) · 61bc016c
  Committed by zhaoyingli on Aug 23, 2022
```
* add quant pass
```
  61bc016c
- Update scope.h (#45270) · 60e072d3
  Committed by OccupyMars2025 on Aug 23, 2022
  
  60e072d3
- modify something unimportant when I read source code (#45273) · 5edc96e6
  Committed by OccupyMars2025 on Aug 23, 2022
```
* Update scope.h

* typo

* Update dense_tensor.inl
```
  5edc96e6
22 Aug, 2022 · 3 commits
- Add int8 support for matmul+elementwise_add fuse pass (#45077) · 9e5f3a38
  Committed by joanna.wozna.intel on Aug 22, 2022
```
* Add int8 support for matmul+elementwiae_add fuse

* Corrections after review and ernie test fix
```
  9e5f3a38
- Extend conv_concat_relu to support all activations (#45089) · d03ef054
  Committed by Sławomir Siwek on Aug 22, 2022
```
* merge conv_concat_relu to conv_act

* fix typo

* extend unit test

* reuse existing gpd

* codestyle

* enforce mkldnn conv
```
  d03ef054
- remove trt_skip_layernorm_fuse_pass from gpu passes (#45293) · 25d58db6
  Committed by Yuanle Liu on Aug 22, 2022
  
  25d58db6
19 Aug, 2022 · 1 commit
- Fix random op dependency and lr_shedule bugs for standalone executor (#45265) · 6d4ae007
  Committed by Ruibiao Chen on Aug 19, 2022
```
* Fix random op depenency and lr_shedule bugs for standalone executor

* Fix CI errors

* Fix CI errors

* Fix CI errors
```
  6d4ae007
18 Aug, 2022 · 3 commits

apply buffer_shared_inplace_pass and inplace_addto_op_pass pass to program in... · d8d124b6

Committed by pangyoki on Aug 18, 2022

apply buffer_shared_inplace_pass and inplace_addto_op_pass pass to program in Standalone Executor (#45085)

* apply inplace addto in python apply_pass

* fix

* apply inplace pass for program

* skip feed and fetch var

* fix block_desc.move_from

* fix block desc

* alltoall remove inplace

* fix

d8d124b6

change to async mode for xpu multi-card training in static graph mode, test=kunlun (#45024) · 41bdf41d

Committed by zhangxiaoci on Aug 18, 2022

* change to async mode for xpu multi-card training in static graph mode

* minor bugfix

* irrelevant. move to another pr

* move change to other pr

* fix stream issue

* fix 'stream not meet with current context' error

* fix branch diverge, test=kunlun

41bdf41d

fix infer tans scope (#45203) · 2d0bb2c3

Committed by JingZhuangzhuang on Aug 18, 2022

* fix infer tans scop

* fix infer trans scope

* fic infer trans scope

* fic infer trans scope

Co-authored-by: dingjiawei <327396238@qq.com>

2d0bb2c3

17 Aug, 2022 · 2 commits
- [OpAttr]Add SupportTensor for OpMaker with whitelist mechanism (#45084) · 2594935a
  Committed by Aurelius84 on Aug 17, 2022
```
* [OpAttr]Add SupportTensor for OpMaker

* fix typo

* fix code style

* add SupportTensor for concat op

* add unittest for register Tensor

* add shape checker and split attribute
```
  2594935a
- fix:op version (#45192) · d0cd0a11
  Committed by feng_shuai on Aug 17, 2022
  
  d0cd0a11
16 Aug, 2022 · 4 commits

[Phi] Move amp ops into phi (#45079) · b4f67757

Committed by Chen Weihang on Aug 16, 2022

* move check finite and unscale kernel into phi

* move infershape into phi

* move update_loss_scaling kernel into phi

* remove original kernels

* move update loss scaling infershape into phi

* add header for xpu and npu

* solve coverage failed

* fix npu test failed

* remove mutable data in cu file

* fix new executor failed

* add valid check for meta tensor output

b4f67757

convert multihead to oss (#45019) · f706d95d

Committed by feng_shuai on Aug 16, 2022

* convert multihead to oss

* fix:bug

* fix:delete const cast

* fix:don't support bias_qk

* add vit pass

* fix:convert bug and add preln_residual_bias

* support length=-1

* add UT for convert

* add no_bias_qk support for gpu_multihead_op

* delete infer_shape depends on bias_qk

* oss just can be used in T4 and A*

* fix:change api for ROCM CI

f706d95d

fix new quant (#45155) · 2fb65e44

Committed by Wangzheee on Aug 16, 2022

2fb65e44

add strongly typed functions to set attributes to avoid unexpected type conversions. (#45107) · 307801d5

Committed by Feiyu Chan on Aug 16, 2022

307801d5

15 Aug, 2022 · 2 commits

fused_embedding_eltwise_layernorm_op and skip_layernorm_op support fp16 (#44969) · ac0553a0

Committed by Yuanle Liu on Aug 15, 2022

ac0553a0

[Auto Parallel] Move the distributed info from python to c++ (#44510) · a52357fe

Committed by Yulong Ao on Aug 15, 2022

* [Auto Parallel] Move the distributed info from python to c++

* [Auto Parallel] Add dist_attrs for VarDesc and OpDesc

* [Auto Parallel] Add the lost file

* [Auto Parallel] Make the dist attr be unique_ptr

* [Auto Parallel] Add the proto conversion

* [Auto Parallel] Improve the proto support

* [Auto Parallel] Fix the bugs for adding a device or a link

* [Auto Parallel] Add the C++ ProcessMesh and DistributedMapper

* [Auto Parallel] Improve the impl of these dist attrs

* [Auto Parallel] Pybind11 ProcessMesh and DeviceMesh

* [Auto Parallel] Fix the unittest problem

* [Auto Parallel] Explicitly add the src file for auto_parallel target

* [Auto Parallel] Add the proto depedency explicitly

* [Auto Parallel] Fix the cmake bug on windows and mac

* [Auto Parallel] Remove the pybind11 header file in process_mesh.h

* [Auto Parallel] Remove unused codes

* [Auto Parallel] Check whether the dist attr is null

* [Auto Parallel] Implement the assign operator for OpDesc explicitly

a52357fe

14 Aug, 2022 · 1 commit
- Revert "[Paddle Inference] Support cuda_graph. (#44878)" (#45115) · b0e7681f
  Committed by xiaoxiaohehe001 on Aug 14, 2022
```
This reverts commit 84bf5c31.
```
  b0e7681f
13 Aug, 2022 · 1 commit

Refine program cache (#45005) · e96dae8b

Committed by Leo Chen on Aug 13, 2022

* add cached_serialize_str_

* support program hash

* add sha

* add ut

* use hash_str only for new_exe

* fix attr order

e96dae8b

Crayon鑫 / Paddle is in sync with the project it was forked from