Commits · 44b45b9f07ba4dcbc97abd0c45feebb694bbe93e · PaddlePaddle / Paddle

06 Feb, 2020 1 commit

Correct the use of DeviceContext in unittest sequence_pooling_test and... · 44b45b9f

Committed by Yiqun Liu on Feb 06, 2020

Correct the use of DeviceContext in unittest sequence_pooling_test and sequence_padding_test (#22456)

* Add log in memory::Copy for debug purpose.

* Change to use the context in DeviceContextPool directly in sequence_pooling_test, instead of creating a new one.

* Change to use the context in DeviceContextPool directly in sequence_padding_test, instead of creating a new one.
test=develop

* Change the type of second_dim from size_t to int64_t.
test=develop

44b45b9f

19 Jan, 2020 1 commit
- Optimize the depthwise op test=develop (#22265) · 0d8b222b
  Committed by wangchaochaohu on Jan 19, 2020
  
  0d8b222b
07 Jan, 2020 1 commit
- replace CUDNN_ENFORCE with PADDLE_ENFORCE_CUDA_SUCCESS, test=develop (#22109) · ba8414d3
  Committed by Chen Weihang on Jan 07, 2020
  
  ba8414d3
04 Jan, 2020 1 commit
- polish cross_entropy ENFORCE (#22056) · 34c57120
  Committed by Kaipeng Deng on Jan 04, 2020
  
  34c57120
23 Dec, 2019 1 commit
- optimize fc jit (#21878) · d4dda862
  Committed by GaoWei8 on Dec 23, 2019
```
test=develop
```
  d4dda862
11 Dec, 2019 1 commit
- Modify padding strategy: remove weight copy in fc padding (#21650) · 5af0c7ba
  Committed by GaoWei8 on Dec 11, 2019
```
test=develop
```
  5af0c7ba
02 Dec, 2019 1 commit

fix -Wno-error=sign-compare warning in gcc8 (#21434) · 01fa4ead

Committed by Tao Luo on Dec 02, 2019

* fix -Wno-error=sign-compare warning in gcc8

test=develop

* fix warning in distributed codes

test=develop

01fa4ead

28 Nov, 2019 1 commit

remove -Wno-error=sign-compare, make warning as error (#21358) · c0656dcb

Committed by Tao Luo on Nov 28, 2019

* remove -Wno-error=sign-compare, make warning as error

test=develop test=document_fix

* fix existing compile warnings

test=develop

c0656dcb

27 Nov, 2019 1 commit
- Polish the codes of fc when needs padding (#21378) · 8493f20e
  Committed by GaoWei8 on Nov 27, 2019
```
test=develop
```
  8493f20e
26 Nov, 2019 1 commit

Add fc padding to improve mkl GEMM's performance when N and K are multiple of 128. (#20972) · 234060f8

Committed by GaoWei8 on Nov 26, 2019

* Add fc padding to solve mkl performance
test=develop

* fix gpu pass and error information
test=develop

* fix fc_fuse_pass_test
test=develop

* fix error information
test=develop

* fix error information
test=develop

* fix name and add fc op padding test
test=develop

* fix attributes
test=develop

* optimize fc padding
test=develop

* fix test
test=develop

234060f8

22 Nov, 2019 1 commit

add dequantize_abs_max op and modify lookup_table op (#20899) · f0b15184

Committed by Liufang Sang on Nov 22, 2019

* add int8 kernel to lookup_table op and add dequantize op test=develop

* change paddle_enforce to paddle_enforce_eq test=develop

* change copyright and change some not suitable code test=develop

* remove debug log test=develop

* replace GetInputType with IndicateVarDataType test=develop

* fix EmptyGradMaker test=develop

* fix diff between cpu and gpu test=develop

* use memcopy when int8_t test=develop

f0b15184

14 Nov, 2019 1 commit
- Fix warpctc in padding mode. (#21033) · cfdd1fc2
  Committed by whs on Nov 14, 2019
  
  cfdd1fc2
12 Nov, 2019 1 commit

fix the computation for dx (grad for x) for prelu operation. (#20949) · e249d9a3

Committed by lilong12 on Nov 12, 2019

* set the default value of alpha for prelu to 0.25, test=develop

* add the call to __syncthreads(), test=develop

* fix the implementation of cpu prelu, test=develop

* repair the implementation of element mode prelu, test=develop

* modify test_prelu_op.py, test=develop

e249d9a3
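
For readers unfamiliar with the operator touched by commit e249d9a3, here is a minimal pure-Python sketch of the standard PReLU forward pass and its gradients, assuming a single alpha shared by all elements. The function names are hypothetical and this is not Paddle's kernel code; the only detail taken from the commit message is the default alpha of 0.25.

```python
def prelu_forward(x, alpha=0.25):
    """PReLU: pass positive inputs through, scale the rest by alpha."""
    return [v if v > 0 else alpha * v for v in x]


def prelu_backward(x, dout, alpha=0.25):
    """Return (dx, dalpha) for a single alpha shared by all elements.

    dx[i]  = dout[i]          if x[i] > 0
           = alpha * dout[i]  otherwise
    dalpha = sum of dout[i] * x[i] over the non-positive inputs
    """
    dx = [g if v > 0 else alpha * g for v, g in zip(x, dout)]
    dalpha = sum(g * v for v, g in zip(x, dout) if v <= 0)
    return dx, dalpha


if __name__ == "__main__":
    x = [-2.0, 3.0]
    print(prelu_forward(x))
    print(prelu_backward(x, [1.0, 1.0]))
```

A per-channel or per-element alpha would index alpha by channel or by element instead of sharing one value, and would accumulate dalpha per index.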

08 Nov, 2019 1 commit

Add dependency for error_codes.proto (#21084) · 2f27b103

Committed by Chen Weihang on Nov 08, 2019

* fix activation_functions deps, test=develop, test=document_fix

* add error_codes_proto deps, test=develop, test=document_fix

* try delete enforce.h, test=develop, test=document_fix

2f27b103

05 Nov, 2019 2 commits
- Fix ce ocr_recognition test fails (#20987) · 0059404e
  Committed by zhaoyuchen2018 on Nov 05, 2019
```
ocr_recognition fails, so add a path to handle small frame_size.

test=develop
```
  0059404e
- refine murmurhash3_x64_128 for bloom_filter (#20996) · 25ffa844
  Committed by Tao Luo on Nov 05, 2019
```
test=develop
```
  25ffa844
01 Nov, 2019 1 commit

Fix gru as small frame_size has error. (#20922) · 7f3a445e

Committed by zhaoyuchen2018 on Oct 31, 2019

seems shuffle_sync cannot handle small size

test=develop
Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>

7f3a445e

31 Oct, 2019 2 commits
- maxout supports channel_last input (#20846) · 8d1e9f0f
  Committed by Zhang Ting on Oct 31, 2019
```
* maxout support channel_last input, test=develop

* modified details of Input(X) and Attr(groups, axis) in doc, test=develop
```
  8d1e9f0f
- fix the bug of conv_transpose:compatible with Anylayout setting, test=develop (#20897) · c18f1bd7
  Committed by Zhang Ting on Oct 31, 2019
  
  c18f1bd7
30 Oct, 2019 1 commit
- fix select_rows mergeadd bug, test=develop (#20876) · d4289125
  Committed by zhang wenhui on Oct 30, 2019
  
  d4289125
28 Oct, 2019 1 commit
- add pyramid_hash_op (#20698) · aacd16db
  Committed by Aurelius84 on Oct 28, 2019
  
  aacd16db
23 Oct, 2019 1 commit

Bug Fix: Paddle-TRT cannot handle adaptive pooling in pool2d op converter and... · e89c16b9

Committed by Pei Yang on Oct 23, 2019

Bug Fix: Paddle-TRT cannot handle adaptive pooling in pool2d op converter and "num" attribute in split op converter (#20733)

* fix pool2d trt converter, test=develop

* add fix for split op converter, test=develop

e89c16b9

16 Oct, 2019 1 commit
- Support fp16 in GPU impl of fused_elemwise_activation_op. (#20636) · 01eddc1a
  Committed by qingqing01 on Oct 16, 2019
```
* Support fp16 in fused_elemwise_activation_op.
* Fix unit testing in ONLY-CPU mode.
```
  01eddc1a
13 Oct, 2019 1 commit
- fix conv_transpose's bug: compatible with Anylayout setting, test=develop (#20589) · 78910480
  Committed by Zhang Ting on Oct 13, 2019
  
  78910480
09 Oct, 2019 1 commit

mv two function in conv op for good code style (#20116) · ad60b3b8

Committed by liym27 on Oct 09, 2019

* Delete PadFuntion, include padding.h instead. test=develop

* move function(IsSymmetricPadding) from conv_cudnn_op.cu/conv_transpose_cudnn_op.cu to padding.h, test=develop

ad60b3b8

07 Oct, 2019 1 commit
- conv_transpose supports channel_last input, test=develop, test=document_preview (#20072) · cf6919bf
  Committed by Zhang Ting on Oct 07, 2019
  
  cf6919bf
30 Sep, 2019 1 commit
- Improve elementwise operators performance in same dimensions. (#19763) · 425279a5
  Committed by danleifeng on Sep 30, 2019
```
Improve elementwise operators performance in same dimensions
```
  425279a5
29 Sep, 2019 1 commit

fix conv2d and conv3d: (#20042) · 3aa331d9

Committed by liym27 on Sep 29, 2019

1. support asymmetric padding;
2. support padding algorithm: "SAME" and "VALID";
3. support channel_last: data_format NHWC and NDHWC;
4. change doc of python API and c++;

test=develop, test=document_preview

3aa331d9
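
To make the terms "asymmetric padding", "SAME" and "VALID" in commit 3aa331d9 concrete, the following is a minimal sketch of how the padding on one spatial axis is commonly resolved. It follows the widely used convention (output size `ceil(in / stride)` for "SAME", no padding for "VALID"), ignores dilation, and uses a hypothetical function name and an assumed "EXPLICIT" mode label; it is not a copy of Paddle's implementation.

```python
def resolve_padding(in_size, kernel, stride, algorithm="EXPLICIT", pads=(0, 0)):
    """Return (pad_before, pad_after, out_size) for one spatial axis.

    algorithm:
      "VALID"    - no padding at all
      "SAME"     - pad so that out_size == ceil(in_size / stride)
      "EXPLICIT" - use the (possibly asymmetric) pair given in `pads`
    """
    if algorithm == "VALID":
        pad_before, pad_after = 0, 0
    elif algorithm == "SAME":
        out = (in_size + stride - 1) // stride  # ceil(in_size / stride)
        total = max((out - 1) * stride + kernel - in_size, 0)
        pad_before = total // 2
        pad_after = total - pad_before  # the extra element goes at the end
    elif algorithm == "EXPLICIT":
        pad_before, pad_after = pads
    else:
        raise ValueError("unknown padding algorithm: %r" % (algorithm,))

    out_size = (in_size + pad_before + pad_after - kernel) // stride + 1
    return pad_before, pad_after, out_size


if __name__ == "__main__":
    print(resolve_padding(5, 2, 2, "SAME"))
    print(resolve_padding(5, 3, 2, "VALID"))
    print(resolve_padding(5, 3, 1, "EXPLICIT", pads=(1, 2)))
```

Note that "SAME" can itself yield asymmetric padding when the total is odd, which is one reason an operator needs separate before/after values per axis.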

28 Sep, 2019 1 commit

fix pool2d pool3d,support asymmetric padding and channel_last (#19739) · 24010472

Committed by liym27 on Sep 28, 2019

* fix pool2d pool3d:
1. support asymmetric padding;
2. support padding algorithm:"SAME" and "VALID";
3. support channel_last: data_format NHWC and NDHWC;
4. support inferring shape when input with negative dims in compile time;
5. change doc of python API and c++;
6. fix bug in cuda kernel when Attr(adaptive) is true.

test=develop,test=document_preview

* fix 'tensors' to 'Tensors'. test=develop,test=document_preview

* add test for coverage ValueError. test=develop,test=document_preview

* resolve conflict in test_pool2d. test=develop

24010472

27 Sep, 2019 1 commit
- Add fp16 support for pad and split (#19881) · fb2a9cdf
  Committed by chengduo on Sep 27, 2019
```
* make pad and split support fp16
test=develop
```
  fb2a9cdf
25 Sep, 2019 1 commit

add support of matmul with multiple head even different width and height (#19708) · c670058a

Committed by Bob Zhu on Sep 25, 2019

* add support of matmul with multiple head even different width and height

The original matmul with multiple heads supports only the case mat_a.width == mat_b.height,
in which mat_b is split horizontally. This patch extends the support to the case
mat_a.width != mat_b.height but mat_a.width/head_number == mat_b.height,
in which mat_b is split vertically.

One example: A is [3, 8], B is [2, 16], head_number is 4. In this
case, A will be split into blocks of [3, 2] and B will be (vertically) split
into blocks of [2, 4]. The final result will be 4 matrices of [3, 4], i.e. [3, 16]

test=develop

* refactor the code of matmul with multiple head even different width and height

test=develop

c670058a
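
The split described in commit c670058a can be illustrated with a small pure-Python sketch. It assumes that "vertically split" means cutting B into `head_number` equal column blocks, which is what the [3, 8] x [2, 16] example in the message implies; the helper names are hypothetical and this is not the operator's actual code.

```python
def matmul(a, b):
    """Plain dense matrix product of two lists-of-rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def split_columns(m, parts):
    """Cut a matrix into `parts` equal-width column blocks."""
    width = len(m[0]) // parts
    return [[row[i * width:(i + 1) * width] for row in m]
            for i in range(parts)]


def multihead_matmul(a, b, head_number):
    """Multiply per-head column blocks of A and B, then concatenate.

    Requires a.width / head_number == b.height, e.g.
    A [3, 8], B [2, 16], head_number 4 -> four [3, 2] x [2, 4] products,
    concatenated along columns into a [3, 16] result.
    """
    k, n = len(a[0]), len(b[0])
    if k % head_number or n % head_number:
        raise ValueError("widths of A and B must be divisible by head_number")
    if k // head_number != len(b):
        raise ValueError("a.width / head_number must equal b.height")
    blocks = [matmul(x, y) for x, y in zip(split_columns(a, head_number),
                                           split_columns(b, head_number))]
    # Concatenate the per-head results row by row.
    return [sum((blk[r] for blk in blocks), []) for r in range(len(a))]


if __name__ == "__main__":
    a = [[1] * 8 for _ in range(3)]
    b = [list(range(16)) for _ in range(2)]
    out = multihead_matmul(a, b, 4)
    print(len(out), len(out[0]))
```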

23 Sep, 2019 1 commit
- fix softmax CE time limit check failed (#19846) · 3f021781
  Committed by Kaipeng Deng on Sep 23, 2019
```
* fix softmax ce time limit check failed. test=develop

* refine softmax calc. test=develop
```
  3f021781
20 Sep, 2019 1 commit
- support 2-level lod of input in sequence_pool (#19839) · fcf53e55
  Committed by Aurelius84 on Sep 20, 2019
```
* support 2-level lod of input in sequence_pool test=develop

* fix lod level bug in .cu test=develop
```
  fcf53e55
16 Sep, 2019 1 commit
- fix softmax axis!=-1. test=develop (#19800) · 99c78b77
  Committed by Kaipeng Deng on Sep 16, 2019
  
  99c78b77
11 Sep, 2019 2 commits

Replace TemporaryAllocator by CUDADeviceContextAllocator (#18989) · 12542320

Committed by Huihuang Zheng on Sep 11, 2019

TemporaryAllocator is a singleton used for allocating memory for Cudnn. Since it is a singleton, we can delete it for better performance in memory.

We replace TemporaryAllocator by CUDADeviceContextAllocator and CUDADeviceContextAllocation, which uses stream callback to delete the memory allocated for the stream to avoid singleton.

Also added data_feed_proto to operator to fix CI in CPU compilation

12542320

Implement the GPU kernel of fc operator (#19687) · a65c728e

Committed by Yiqun Liu on Sep 11, 2019

* Refine the codes related to fc op.

* Add GPU implementation for fc functor.

* Apply fc_fuse_pass in GPU inference.
test=develop

* Change the cmake for fc op.

* Change PADDLE_ENFORCE to PADDLE_ENFORCE_EQ.

* Add an attribute to set the activation type in fc_op.

* Enhance the unittest of fc_op.
test=develop

* Remove the declaration of FCOpGrad back to the header file.
test=develop

* Set default value for newly added arguments in test_fc_op.
test=develop

a65c728e

05 Sep, 2019 3 commits
- fix the diff between async mode and async_half mode (#19535) · 2f037c31
  Committed by 123malin on Sep 05, 2019
```
* test=develop,  communicator merge add => merge average
```
  2f037c31
- unify PADDLE_ASSERT_MSG into PADDLE_ENFORCE(error_message) (#19631) · 3ae939e4
  Committed by Tao Luo on Sep 05, 2019
```
* remove assert.h

* change PADDLE_ASSERT_MSG to PADDLE_ENFORCE

test=develop

* fix tensorrt paddle_enforce

test=develop
```
  3ae939e4
- paddle::framework::vectorize() templatization (#19627) · d6c85c96
  Committed by Tao Luo on Sep 05, 2019
```
test=develop
```
  d6c85c96
04 Sep, 2019 1 commit
- refine some PADDLE_ENFORCE codes for unify PADDLE_ASSERT_MSG (#19607) · 0a46d345
  Committed by Tao Luo on Sep 04, 2019
```
test=develop
```
  0a46d345

PaddlePaddle / Paddle synced successfully about 1 year ago