Commits · 092839d64a2302093dc831177eab7d99cb9be81c · 机器未来 / Paddle

16 Dec, 2021 (9 commits)

[psgpu]add checknan print and fix trainer device (#38131) · 092839d6
Committed by danleifeng on Dec 16, 2021
```
* trainer_device fix and checknan tool for psgpu;test=develop

* disable show_one_table;test=develop
```
092839d6

Adapt host event recorder to profiler (#37766) · 5b6be4d7

Committed by liutiexing on Dec 16, 2021

* add align for WorkQueue

* add spinlock

* merge develop

* merge

* Add EventsWaiter

* Revert "Add EventsWaiter"

This reverts commit e206173aa9be7401b83a53581627bfaf557c8fb2.

* add os_info

* update

* update

* update

* update

* update

* update for bugfix

* update

* update

* update
Co-authored-by: liutiexing <liutiexing@google.com>

5b6be4d7

Add fmax and fmin operators (#37826) · dd3afc9d
Committed by LJQ❤️ on Dec 16, 2021
```
Add elementwise_fmax and elementwise_fmin operators
```
dd3afc9d
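
To make the element-wise fmax/fmin behaviour concrete, here is a minimal pure-Python sketch. It assumes the new operators follow the C `fmax`/`fmin` convention, where a NaN operand is ignored in favour of the other operand; the commit message itself does not state this. The helper names `fmax`, `fmin`, `elementwise_fmax`, and `elementwise_fmin` below are illustrative only and are not Paddle's kernels.

```python
import math


def fmax(x, y):
    """Return the larger value, preferring the non-NaN operand (assumed C fmax semantics)."""
    if math.isnan(x):
        return y
    if math.isnan(y):
        return x
    return max(x, y)


def fmin(x, y):
    """Return the smaller value, preferring the non-NaN operand (assumed C fmin semantics)."""
    if math.isnan(x):
        return y
    if math.isnan(y):
        return x
    return min(x, y)


def elementwise_fmax(xs, ys):
    """Apply fmax pairwise to two equal-length sequences (no broadcasting in this sketch)."""
    return [fmax(x, y) for x, y in zip(xs, ys)]


def elementwise_fmin(xs, ys):
    """Apply fmin pairwise to two equal-length sequences (no broadcasting in this sketch)."""
    return [fmin(x, y) for x, y in zip(xs, ys)]
```

Under that assumption, `elementwise_fmax([1.0, float("nan")], [2.0, 5.0])` gives `[2.0, 5.0]`, whereas a plain maximum would typically propagate the NaN.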

Add sparse_attention mask ,test=develop (#37973) · fa463b90

Committed by Liu-xiandong on Dec 16, 2021

Add key_padding_mask and attn_mask to the sparse_attention API

1. The key padding mask is a tensor of shape [batch_size, seq_len], and the attention mask is a tensor of shape [seq_len, seq_len]. Both masks use the same data type as Q, K, and V, which is float32 or float64. A value of 0 in a mask means that the corresponding position is masked out.

2. The main changed files are paddle/fluid/operators/sparse_attention_op.cu and python/paddle/fluid/tests/unittests/test_sparse_attention_op.py. sparse_attention has three parts: sddmm, softmax, and dsd. Adding the mask only requires modifying the softmax part; the other two parts are unaffected. Tests covering the mask have also been added.

fa463b90
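
The mask convention described in this commit (a 0 in either mask removes that position from the softmax) can be sketched for a single batch element as follows. This is a dense, pure-Python illustration under stated assumptions, not the sparse CUDA kernel in sparse_attention_op.cu; the function names and the choice to return all zeros for a fully masked row are hypothetical.

```python
import math


def masked_softmax_row(scores, key_padding_row, attn_row):
    """Softmax over one query row, skipping positions where either mask is 0."""
    keep = [kp != 0 and am != 0 for kp, am in zip(key_padding_row, attn_row)]
    kept_scores = [s for s, k in zip(scores, keep) if k]
    if not kept_scores:
        # Assumption: a fully masked row yields all zeros.
        return [0.0] * len(scores)
    top = max(kept_scores)  # subtract the row maximum for numerical stability
    exps = [math.exp(s - top) if k else 0.0 for s, k in zip(scores, keep)]
    total = sum(exps)
    return [e / total for e in exps]


def masked_softmax(scores, key_padding_mask, attn_mask):
    """Apply the masked softmax to every row of a [seq_len, seq_len] score matrix.

    key_padding_mask has shape [seq_len] here (one row of the batched
    [batch_size, seq_len] mask); attn_mask has shape [seq_len, seq_len].
    """
    return [
        masked_softmax_row(row, key_padding_mask, attn_mask[i])
        for i, row in enumerate(scores)
    ]
```

For example, with `key_padding_mask = [1.0, 1.0, 0.0]` the last key is dropped from every row, so a row of equal scores is split evenly between the first two positions.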

Add the transformop parameter in TensorReduceFunctorImpl (#38135) · 524389ee
Committed by niuliling123 on Dec 16, 2021
```
* Add the transformop parameter in TensorReduceFunctorImpl
```
524389ee

[Pten]Modify registered kernel name (#38109) · be874c08

Committed by YuanRisheng on Dec 16, 2021

* Reduce reshape kernel functions in pten

* delete notes

* fix bugs when compile

* modify register name

* fix compile bugs

be874c08

pylayer support tuple/list type args and fix check args bug (#38146) · 861053eb

Committed by chentianyu03 on Dec 16, 2021

* Revert "Revert "pylayer support tuple/list type args (#37727)" (#37956)"

This reverts commit d848ff04.

* move check args,kwargs before forward execute

861053eb

Add float16 type for scatter op. (#38136) · 9bac4a76

Committed by Li Min on Dec 16, 2021

* Add float16 type for scatter op.

* Add fp16 test for scatter op.

* Add int and int64 support for scatter_grad on gpu.

* Add int and int64 for check_variable_and_dtype routine.

* Minors.

* Code format.

9bac4a76

Enabled Eager AutoCodeGen for All Existing Operators & Possible Future Operators (#37969) · 08482a86

Committed by Zhanlue Yang on Dec 16, 2021

* Rearranged Eager AutoCodeGen directory structure

* Removed USE_OP in Eager AutoCodeGen

* Enabled generation for Operators without Grad/Inputs/Outputs

* Resolved operators without input

* Fixed merge conflicts

* Enabled Eager AutoCodeGen for 10+ more operators

* Refactored Eager AutoCodeGen with more organized helper objects

* Enabled Eager AutoCodeGen for operators with multiple OpBases

* Adjusted Eager AutoCodeGen to Enable Passing Output Tensor as Input Argument

* Handled Dispensable Inputs/Outputs in Eager AutoCodeGen

* Enabled Eager AutoCodeGen for All Existing Operators & Possible Future Operators

* Fixed CI issues

08482a86

15 Dec, 2021 (11 commits)

add mkldnn conv3d_bias_mkldnn_fuse_pass ut (#37700) · 0456e003

Committed by baoachun on Dec 15, 2021

* add mkldnn conv3d_bias_mkldnn_fuse_pass ut

* update conv3d_bias_mkldnn_fuse_pass ut

* disable conv3d_bias_mkldnn_fuse_pass

0456e003

ipu_inference (#37102) · 141b2854

Committed by jianghaicheng on Dec 15, 2021

* add ipu_inference

* resolve comments

* resolve comments

* add EnableIpu introduction

* rm line

* restore npu update

* add ernie and resnet50 test

* fix copyright time
Co-authored-by: yaozhixin <522190855@qq.com>

141b2854

[new-exec] add standalone executor test (#38101) · ab6daf84

Committed by Leo Chen on Dec 15, 2021

* refine test

* add download_program target

* update ut code

* refine code

* disable profiler

* add comments

* refine cmake

* skip coverage ci

ab6daf84

update mkldnn conv_concat_relu_mkldnn_fuse_pass ut (#37606) · bb51d6dc

Committed by baoachun on Dec 15, 2021

* update mkldnn conv_concat_relu_mkldnn_fuse_pass ut

* update conv_concat_relu_mkldnn_fuse_pass ut

* restrict conv2d data_format in conv_concat_relu_mkldnn_fuse_pass

bb51d6dc

replace moves_storage and alloc_construct (#38134) · e78eb3f4
Committed by Chen Weihang on Dec 14, 2021

e78eb3f4
remove bf16 (#38133) · 49108efa
Committed by wenbin on Dec 15, 2021
```
* remove bf16

* remove comments

* remove wrong return

* fix UT
```
49108efa
Change a comment to avoid the disturb to op benchmark ci. (#38148) · 4d8242df
Committed by Yiqun Liu on Dec 15, 2021
```
test=document_fix
```
4d8242df
Add cinn_launch_op_test into Paddle-CINN ci (#38076) · e5a838f8
Committed by Huihuang Zheng on Dec 15, 2021
```
As the title.
```
e5a838f8
move tensor using to single header (#38142) · c23afce1
Committed by Chen Weihang on Dec 14, 2021

c23afce1

replace with pten kernel in cast cuda compute and remove unused codes (#38074) · 75332401

Committed by chentianyu03 on Dec 15, 2021

* replace with pten kernel in cast cuda compute and remove unused codes

* rm unused header file

* replace CastCUDAOpKernel with CastOpKernel

75332401

Synchronized auto-generated Python-C API with Dygraph Forward Functions (#38017) · 77dfb2e8

Committed by Zhanlue Yang on Dec 15, 2021

* Rearranged Eager AutoCodeGen directory structure

* Removed USE_OP in Eager AutoCodeGen

* Enabled generation for Operators without Grad/Inputs/Outputs

* Resolved operators without input

* Fixed merge conflicts

* Enabled Eager AutoCodeGen for 10+ more operators

* Refactored Eager AutoCodeGen with more organized helper objects

* Enabled Eager AutoCodeGen for operators with multiple OpBases

* Adjusted Eager AutoCodeGen to Enable Passing Output Tensor as Input Argument

* Handled Dispensable Inputs/Outputs in Eager AutoCodeGen

* Adjusted function generation/call between Python-C API & Dygraph API

* Synchronized auto-generated Python-C API with Dygraph Forward Functions

* Added safe_initialized interface to EagerTensor for use in processing dispensable inputs

77dfb2e8

14 Dec, 2021 (13 commits)
- add map_matmul and fc_act_fuse passes to quant2_int8_mkldnn_pass (#38023) · 8f800dc0
  Committed by Sylwester Fraczek on Dec 14, 2021
```
* add map_matmul passes to quant2_int8_mkldnn_pass

* fix fc+act fuse (activation scale)

* ci fix, c++17 structured bindings not available

* fix ci static check
```
  8f800dc0
- fix memory leak problen of set_value op (#38098) · f8202941
  Committed by zyfncg on Dec 14, 2021
```
* fix bug of set_value op

* fix BumpInplaceVersion

* polish some comments

* revert change of full_like
```
  f8202941
- add conv_gelu_mkldnn_fuse_pass (#38107) · 206a33b3
  Committed by baoachun on Dec 14, 2021
```
* add conv_gelu_mkldnn_fuse_pass

* add post ops
```
  206a33b3
- Add const in GetInput/OutputVarPtrs in InferShapeContext (#38066) · 22f14e74
  Committed by Aurelius84 on Dec 14, 2021
  
  22f14e74
- modify the fix_seed attribute in dropout op is a def attribute.test=develop (#38100) · f44add7b
  Committed by weishengying on Dec 14, 2021
  
  f44add7b
- remove KernelName (#38082) · 8198cad7
  Committed by YuanRisheng on Dec 14, 2021
  
  8198cad7
- [fleet_executor] Take task node from python side (#38083) · 7eb121df
  Committed by Yuang Liu on Dec 14, 2021
  
  7eb121df
- [PTen] Reduce reshape kernel functions in pten (#38055) · a3c8abc7
  Committed by YuanRisheng on Dec 14, 2021
```
* Reduce reshape kernel functions in pten

* delete notes

* fix bugs when compile
```
  a3c8abc7
- Mkldnn depthwise conv pass (#37798) · 19a833c8
  Committed by feng_shuai on Dec 14, 2021
```
* test_mkldnn_depthwise_conv_pass

* test: add TimeOut

* set TIMEOUT

* fix:add random num for dilation and group
```
  19a833c8
- Handled Dispensable Inputs/Outputs in Eager AutoCodeGen (#37959) · f2043bd1
  Committed by Zhanlue Yang on Dec 14, 2021
```
* Rearranged Eager AutoCodeGen directory structure

* Removed USE_OP in Eager AutoCodeGen

* Enabled generation for Operators without Grad/Inputs/Outputs

* Resolved operators without input

* Fixed merge conflicts

* Enabled Eager AutoCodeGen for 10+ more operators

* Refactored Eager AutoCodeGen with more organized helper objects

* Enabled Eager AutoCodeGen for operators with multiple OpBases

* Adjusted Eager AutoCodeGen to Enable Passing Output Tensor as Input Argument

* Handled Dispensable Inputs/Outputs in Eager AutoCodeGen
```
  f2043bd1
- add layer_norm_fuse_pass test case (#37830) · b95c9cf2
  Committed by heliqi on Dec 14, 2021
```
* add layer_norm_fuse_pass test case

* restore cmakelist code

* Merge branch 'develop' into layer_norm_fuse_pass

* Merge branch 'develop' into layer_norm_fuse_pass

* add bad case test
```
  b95c9cf2
- fix generate_proposals op doc (#38048) · c117dfba
  Committed by wangguanzhong on Dec 14, 2021
  
  c117dfba
- add reshape+transpose+matmul_v2 only (#37847) · a922168a
  Committed by Sylwester Fraczek on Dec 14, 2021
```
* reshape+transpose+matmul_v2

* in_name->input_name

* fix pr-ci-static-check
```
  a922168a
13 Dec, 2021 (7 commits)
- update 3 tests (#37922) · 33fbb66e
  Committed by zhenlin on Dec 13, 2021
```
* update 3 tests

* fix typo error
```
  33fbb66e
- disable bad case for shuffle pass (#38072) · e7f5d325
  Committed by wenbin on Dec 13, 2021
```
* disabled bad case

* int to size_t
```
  e7f5d325
- add popart_canonicalization p4 (#37967) · 69252fd8
  Committed by jianghaicheng on Dec 13, 2021
  
  69252fd8
- update xpu_memcpy (#38049) · bdf5834e
  Committed by taixiurong on Dec 13, 2021
  
  bdf5834e
- fix single card 8 unittests in new executor (#37957) · 9a4eec98
  Committed by xiongkun on Dec 13, 2021
```
* fix single card 8 unittests in new executor

* fix

* fix
```
  9a4eec98
- [pnorm] Optimize p_norm op for special cases (#37685) · 10d9ab4b
  Committed by Noel on Dec 13, 2021
  
  10d9ab4b
- fix mac import hang, test=develop (#38051) · d3569c7e
  Committed by wanghuancoder on Dec 13, 2021
  
  d3569c7e
