Commits · 1840c1652c1a67043884e3cdd0b5c39a05c9474d · 机器未来 / Paddle

28 Nov, 2019 (2 commits)
- fix lod_reset bug, test=develop (#21392) · b97fc16d
  Committed by Zeng Jinle on Nov 28, 2019
- Polish reference count pass (#21324) · 89966525
  Committed by Zeng Jinle on Nov 28, 2019
```
* fix ref_cnt pass, test=develop

* add cpp unittests to reference_count_pass, test=develop

* follow comments, test=develop
```
27 Nov, 2019 (4 commits)

Support data_norm gpu kernel (#21325) · 47a82e38

Committed by hutuxian on Nov 27, 2019

* support data_norm_op run in CUDA
* add two parameters sync_stats & summary_decay_rate
* add UT

Polish the codes of fc when needs padding (#21378) · 8493f20e
Committed by GaoWei8 on Nov 27, 2019
```
test=develop
```

INT8 Fully-connected (#17641) · 5d7d5482

Committed by Michał Gallus on Nov 27, 2019

* Implement Int8 FC

* Integrate FC into INT8v2

test=develop

* int8 FC: transpose weights before computing scales

test=develop

* Add support for activation_type string in FC

test=develop

* Disable MKL-DNN's FC in VGG16 and 19

test=develop

* Disable FC quantization when mkldnn FC is disabled

test=develop

* Solve PADDLE_ENFORCES in FC int8

* Fix Paddle enforces and remove const cast

test=develop

* Fix style changes

test=develop

* Fix quantizer_tester test and add fc quantization

test=develop

* Fix FC test fail on CUDA

* Remove unnecessary log from quantize placement pass

test=develop

* Add Thread ID to FC hash key

test=develop

* Add comments to MKL-DNN FC Kernel

test=develop

* Refactor quantizer

test=develop

* Fix linter issues

test=develop

* Fix crash in slim googlenet

test=develop

* Fix PADDLE_ENFORCE messages

test=develop

fix syn bn grad maker, test=develop, test=document_fix (#21317) · b639a882
Committed by Zeng Jinle on Nov 27, 2019

26 Nov, 2019 (7 commits)

add axis check for concat op (#21288) · 4d0f5ab1

Committed by Youwei Song on Nov 26, 2019

* add axis check for concat op
test=develop

* fix PADDLE_ENFORCE format
test=develop

* move to ComputeAxis for InferShape check
test=develop

Fix ernie python infer diff (#21311) · afb13484
Committed by zhaoyuchen2018 on Nov 26, 2019
```
* Fix ernie python infer diff
* Refine mask

test=develop
```
Fix mistake of batch norm op (#21237) · b6ce4f8b
Committed by Lv Mengsi on Nov 26, 2019
```
* fix_bn

* revert unittest,test=develop
```

add the framework support for distfc (#21197) · 41d13209

Committed by lilong12 on Nov 26, 2019

* add the framework support for distfc and ut, test=develop
* fix the implementation of shard_index_op, test=develop

Add fc padding to improve mkl GEMM's performance when N and K are multiple of 128. (#20972) · 234060f8

Committed by GaoWei8 on Nov 26, 2019

* Add fc padding to solve mkl performance
test=develop

* fix gpu pass and error information
test=develop

* fix fc_fuse_pass_test
test=develop

* fix error information
test=develop

* fix error information
test=develop

* fix name and add fc op padding test
test=develop

* fix attributes
test=develop

* optimize fc padding
test=develop

* fix test
test=develop

[MKL-DNN] Error throwing for NHWC layout for MKL-DNN ops (#21207) · f4cf028a
Committed by Jacek Czaja on Nov 26, 2019

Refactor MKL-DNN ElementwiseMul (#21061) · ed9ceb9f

Committed by Michał Gallus on Nov 26, 2019

* Refactor MKL-DNN ElementwiseMul

remove manual fallback, remove format attrs
test=develop

* Refine PADDLE_ENFORCEs in eltwise_mul_op.h

test=develop

* Make ElementwiseMulOp inherit from ElementwiseOp

* Change type of simd_width to int

test=develop

* Remove Constructor extensions in ElementwiseOp and ElementwiseMulOp

test=develop

* Restore attributes

test=develop

* Fix test coverage for mkldnn eltwise mul

test=develop

* Conform to new is_run_common_broadcast API

test=develop

* Add UT for AreDimsAndFormatCorrect

test=develop

25 Nov, 2019 (4 commits)
- remove warning LNK4006 and warning LNK4221 (#21226) · 345b67b5
  Committed by zhouwei25 on Nov 25, 2019
- fix the fill_constant op precision problem (#21322) · 6514f52e
  Committed by wangchaochaohu on Nov 25, 2019
```
* fix the fill_constant op precision problem test=develop
```
- Improve argsort performance. (#21267) · 08c19c58
  Committed by zhaoyuchen2018 on Nov 25, 2019
```
* Improve argsort performance.

- Give 200000 data to compute argsort on v100,
can speed up ~190x
before opt cost: 0.53s
after opt cost:0.0027s

- Add fp16 support

* Refine error message
* Refine code

test=develop
Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>
```
- Fix dgc accuracy by mv regularization to local (#21278) · 8ac7687e
  Committed by WangXi on Nov 25, 2019
24 Nov, 2019 (2 commits)
- use prefetch to load next mem into cache (#21206) · b19e1a1b
  Committed by Leo Zhao on Nov 24, 2019
```
* use prefetch to load next mem into cache

test=develop

* remove hard code memcpy in pyramid_hash_ff

test=develop
```
- optimize nhwc for tensor core in ConvOp and ConvGradOp (#20597) · ed2a1852
  Committed by gongweibao on Nov 24, 2019
22 Nov, 2019 (5 commits)

Fix the crash issue when scale or bias was null-pointer. (#21284) · 69dd5152

Committed by Yihua Xu on Nov 22, 2019

* Fix the crash issue when scale or bias was null-pointer.

test=develop

* Add the error message for passing CI.

test=develop

optimize lod_reset op to avoid data transform · 698b8b73
Committed by Zhang Ting on Nov 22, 2019

add dequantize_abs_max op and modify lookup_table op (#20899) · f0b15184

Committed by Liufang Sang on Nov 22, 2019

* add int8 kernel to lookup_table op and add dequantize op test=develop

* change paddle_enforce to paddle_enforce_eq test=develop

* change copyright and change some not suitable code test=develop

* remove debug log test=develop

* replace GetInputType with IndicateVarDataType test=develop

* fix EmptyGradMaker test=develop

* fix diff between cpu and gpu test=develop

* use memcopy when int8_t test=develop

support cvm_op run in gpu (#21300) · a6ce2306

Committed by hutuxian on Nov 22, 2019

Previously, the CVM op could only run on CPU. This PR implements its GPU kernel.
It also improves the unit tests for the CVM op.

Avoid the string as the key of map to improve the jit performance (#21292) · b085ecc2

Committed by Yihua Xu on Nov 22, 2019

* Avoid the string as the key of map to improve the jit performance.

test=develop

* Use map to replace unordered_map.

test=develop

21 Nov, 2019 (1 commit)

open dygraph op test, test=develop (#19787) · c4ede95c

Committed by zhongpu on Nov 21, 2019

* open dygraph op test, test=develop

* modify to_variable, test=develop

* modify input and output for dygraph, test=develop

* modify input and output for dygraph(fix bug), test=develop

* fix input processing of dygraph op test, test=develop

* fix bug, test=develop

* fix op test, test=develop

* fix forward bug for dygraph, test=develop

* fix mkldnn op test for forward, test=develop

* update nn.py for dygraph, test=develop

* fix crop_tensor_op, test=develop

* fix elementwise_mul_op, test=develop

* fix fill_op, test=develop

* fix some mkldnn op, test=develop

* open backward op test for dygraph, test=develop

* delete log, test=develop

* close backward op test for dygraph, test=develop

* fix bug for edit_distance_op and test_lstm_cudnn_op, test=develop

* fix optest backward bug for dygraph, test=develop

* fix optest backward bug for dygraph, test=develop

* close backward op test for dygraph, test=develop

* close backward op test for dygraph, test=develop

* open dygraph op test, test=develop

* fix op test for dygraph, fix GradOpDescMaker, test=develop

* fix bug for linear_chain_crf_op.h, test=develop

* remove log, test=develop

* remove log, test=develop

* remove log for op_test.py, test=develop

* remove log for op_test.py, test=develop

* fix bug for var_conv_2d_op, change PADDLE_ENFORCE, test=develop

* fix PADDLE_ENFORCE_EQ for hierarchical_sigmoid_op.cc, test=develop

* fix bug for test_increment_ngraph_op.py, test=develop

* fix lod for op test in dygraph, test=develop

* refactor op_test.py to reduce redundant code, test=develop

* fix lod optest, modify InputVar/OutputVar to HasInput/HasOutput, test=develop

* remove debug log, test=develop

* remove redundant code in base.py, test=develop

* fix some error in optest, test=develop

* fix ClearNoNeedBufferInputs function's bug for LoDTensor, test=develop

* refactor op_test.py, test=develop

* remove redundant writing, test=develop

* fix error(get tensor of the grad variable), test=develop

* fix test_concat_mkldnn test_conv2d_mkldnn, test=develop

* fix optest.py for get tensor of LoDTensor, test=develop

* fix optest.py for get tensor of LoDTensor, test=develop

* fix optest.py for get tensor of LoDTensor, test=develop

* fix some redundant code, test=develop

* resolve conflict and rewrite paddle error message, test=develop

20 Nov, 2019 (3 commits)
- edit elementwise_mul doublegrad inplace (#21245) · 6fc3e8ec
  Committed by danleifeng on Nov 20, 2019
- Fix topk compile failed on windows (#21243) · 3ff5cc2d
  Committed by zhaoyuchen2018 on Nov 20, 2019
```
* Fix topk compile failed on windows
* Use explicit cast for assign data
```
- optimize assign op to avoid copy data from GPU to GPU (#21181) · 01a96463
  Committed by Zhang Ting on Nov 20, 2019
```
* optimize assign op to avoid copy data from GPU to GPU, test=develop

* modified GetkernelTypeForVar and just avoid device transform, test=develop
```
19 Nov, 2019 (3 commits)

extend elementwise broadcast function (#20957) · 0e7baabe
Committed by danleifeng on Nov 19, 2019

Fix GELU grad error (#21204) · d623e863
Committed by Adam on Nov 19, 2019
```
test=develop
```

fix data_norm op to avoid impractical normalization result test=develop (#21152) · b5d8ba83

Committed by yaoxuefeng on Nov 19, 2019

* fix auc drop first commit test=develop

* update datanorm op

* update datanorm with enforce test=develop

* update test=develop

* update format test=develop

* update format

* update format test=develop

* add unit test test=develop

* update unit test test=develop

* update format test=develop

* update format test=develop

* update API description test=develop

* update API description test=develop

* update format test=develop

* fix codes as comments test=develop

* fix description as comments test=develop

* fix description as comments test=develop

* update codes.. test=develop

18 Nov, 2019 (3 commits)

modified error message and API doc for channel_last supported Op (#21002) · 9cbe7bcc

Committed by Zhang Ting on Nov 18, 2019

* modified error message for conv and conv_transpose, test=develop

* modified doc of conv and conv_transpose op, test=develop

* modified the expression for error message, test=develop

* modified error message for group_norm op, test=develop

* modified detail of Attr(data_format) or Attr(data_layout)

* add ValueError in API doc for maxout op, test=develop

Fix the error of init variable in StaticRNN when stop_gradient=ON (#21118) · 56b5d147
Committed by guofei on Nov 18, 2019

Fix INF bug of softmax_cross_entropy_op (#21165) · 3c98ec90
Committed by WangXi on Nov 18, 2019

15 Nov, 2019 (2 commits)
- Fix jit tls issue (#21151) · eec9c9cb
  Committed by Yihua Xu on Nov 15, 2019
- Refine edit distance cn (#21121) · aeb88791
  Committed by ruri on Nov 15, 2019
14 Nov, 2019 (4 commits)

fix elementwise_mod float point kernel. test=develop (#21183) · 98b59cb8
Committed by Kaipeng Deng on Nov 14, 2019

Fix warpctc in padding mode. (#21033) · cfdd1fc2
Committed by whs on Nov 14, 2019

Add examples for error message writing specification - NotFound, OutOfRange,... · 8da0cd53

Committed by Chen Weihang on Nov 14, 2019

Add examples for error message writing specification - NotFound, OutOfRange, AlreadyExists, PermissionDenied (#21134)

* add examples for error msg spec, test=develop

* change ENFORCE to ENFORCE_**, test=develop

* add more already exists examples, test=develop

Improve topk performance. (#21087) · b93870e6

Committed by zhaoyuchen2018 on Nov 13, 2019

* Improve topk performance.

give 200000 data to compute topk,
before opt: cost 1s
after opt: cost 0.0028s.

* Refine return value.
* Add cuda util functions.
* Fix ComputeBlockSize bug & refine comments.
Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>

机器未来 / Paddle is in sync with the upstream project it was forked from
