Commits · b39f947698c90219d3321d4a3ddf114c0927a632 · PaddlePaddle / Paddle

28 Nov, 2019 2 commits
- Eliminate the impact on incremental compilation (#21410) · b39f9476
  Committed by zhouwei25 on Nov 28, 2019
- optimization check_api_approvals (#21371) · e0da2bcd
  Committed by tianshuo78520a on Nov 28, 2019
```
* optimization check_api_approvals

* change echo line

* echo_line

* update

* test=develop;test=document_fix
```
27 Nov, 2019 6 commits

fix C++ multicard inference bug. (#20955) · d1a6e112
Committed by Zhaolong Xing on Nov 27, 2019
```
test=develop
```

Support data_norm gpu kernel (#21325) · 47a82e38

Committed by hutuxian on Nov 27, 2019

* support data_norm_op run in CUDA
* add two parameters sync_stats & summary_decay_rate
* add UT

Support numpy bridge (enabled by default in dygraph mode) (#20983) · d5ff79e5

Committed by Youwei Song on Nov 27, 2019

* add numpy bridge

* fix template compile

* add unittest, add default
test=develop

* fix unittest
test=develop

* fix unittest
test=develop

* zero_copy=True for to_variable,
test=develop

* bug fix
test=develop

* disable deprecated NumPy API
test=develop

* use better design of NumpyAllocator
test=develop

* fix Py_None check
test=develop

* reset c++ tracer when jump out dygraph guard
test=develop

* refine PADDLE_ENFORCE_xx format
test=develop

* bug fix of tracer switch
test=develop

* update decref
test=develop

Polish the codes of fc when needs padding (#21378) · 8493f20e
Committed by GaoWei8 on Nov 27, 2019
```
test=develop
```

INT8 Fully-connected (#17641) · 5d7d5482

Committed by Michał Gallus on Nov 27, 2019

* Implement Int8 FC

* Integrate FC into INT8v2

test=develop

* int8 FC: transpose weights before computing scales

test=develop

* Add support for activation_type string in FC

test=develop

* Disable MKL-DNN's FC in VGG16 and 19

test=develop

* Disable FC quantization when mkldnn FC is disabled

test=develop

* Solve PADDLE_ENFORCES in FC int8

* Fix Paddle enforces and remove const cast

test=develop

* Fix style changes

test=develop

* Fix quantizer_tester test and add fc quantization

test=develop

* Fix FC test fail on CUDA

* Remove unnecessary log from quantize placement pass

test=develop

* Add Thread ID to FC hash key

test=develop

* Add comments to MKL-DNN FC Kernel

test=develop

* Refactor quantizer

test=develop

* Fix linter issues

test=develop

* Fix crash in slim googlenet

test=develop

* Fix PADDLE_ENFORCE messages

test=develop

fix syn bn grad maker, test=develop, test=document_fix (#21317) · b639a882
Committed by Zeng Jinle on Nov 27, 2019

26 Nov, 2019 16 commits
- add axis check for concat op (#21288) · 4d0f5ab1
  Committed by Youwei Song on Nov 26, 2019
```
* add axis check for concat op
test=develop

* fix PADDLE_ENFORCE format
test=develop

* move to ComputeAxis for InferShape check
test=develop
```
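As a rough illustration of the kind of axis check this commit message describes, the sketch below validates a concat axis against the tensor rank and maps a negative axis to its non-negative index. The function name `compute_axis` and its exact behavior are assumptions for illustration only; the actual `ComputeAxis` helper in Paddle may differ.

```python
def compute_axis(axis, rank):
    """Hypothetical axis check: validate `axis` and normalize negative values.

    `rank` is the number of dimensions of the tensors being concatenated.
    """
    # A valid axis lies in [-rank, rank), as is conventional for NumPy-style ops.
    if not -rank <= axis < rank:
        raise ValueError(
            f"axis must be in [{-rank}, {rank}), but got {axis}"
        )
    # Negative axes count from the last dimension.
    return axis + rank if axis < 0 else axis
```

With this sketch, `compute_axis(-1, 3)` refers to the last of three dimensions, while an out-of-range value such as `compute_axis(3, 3)` raises an error instead of silently indexing past the shape.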
- paddleslim quantization skip pattern support list of string (#21141) · 07e6a942
  Committed by itminner on Nov 26, 2019
- make CUDA_ARCH_NAME default Auto (#21352) · d8e7d252
  Committed by Tao Luo on Nov 26, 2019
```
* make CUDA_ARCH_NAME default Auto

test=develop

* refine warning

test=develop
```
- Fix some typos in AMP. (#21354) · be2e3e67
  Committed by Zhen Wang on Nov 26, 2019
```
* fix some typos in AMP. test=develop

* delete useless codes. test=develop
```
- Fix ernie python infer diff (#21311) · afb13484
  Committed by zhaoyuchen2018 on Nov 26, 2019
```
* Fix ernie python infer diff
* Refine mask

test=develop
```
- Fix mistake of batch norm op (#21237) · b6ce4f8b
  Committed by Lv Mengsi on Nov 26, 2019
```
* fix_bn

* revert unittest,test=develop
```
- add the framework support for distfc (#21197) · 41d13209
  Committed by lilong12 on Nov 26, 2019
```
* add the framework support for distfc and ut, test=develop
* fix the implementation of shard_index_op, test=develop
```
- polish global_value_getter_setter, test=develop (#21332) · dbba9c7e
  Committed by Zeng Jinle on Nov 26, 2019
- change download log format (#21290) · a214a308
  Committed by hong on Nov 26, 2019
```
* change download log format; test=develop

* add unittest for data download; test=develop

* remove cache before download; test=develop
```
- Add fc padding to improve mkl GEMM's performance when N and K are multiple of 128. (#20972) · 234060f8
  Committed by GaoWei8 on Nov 26, 2019
```
* Add fc padding to solve mkl performance
test=develop

* fix gpu pass and error information
test=develop

* fix fc_fuse_pass_test
test=develop

* fix error information
test=develop

* fix error information
test=develop

* fix name and add fc op padding test
test=develop

* fix attributes
test=develop

* optimize fc padding
test=develop

* fix test
test=develop
```
- reduce interp op input size to pass CI, test=develop (#21341) · 6cfcbe05
  Committed by ruri on Nov 26, 2019
- add prediction demo and script on windows (#21248) · 45c1e7bb
  Committed by silingtong123 on Nov 26, 2019
- package the CAPI inference library and third_party (#21299) · 4b429c19
  Committed by silingtong123 on Nov 26, 2019
- [MKL-DNN] Error throwing for NHWC layout for MKL-DNN ops (#21207) · f4cf028a
  Committed by Jacek Czaja on Nov 26, 2019
- Refactor MKL-DNN ElementwiseMul (#21061) · ed9ceb9f
  Committed by Michał Gallus on Nov 26, 2019
```
* Refactor MKL-DNN ElementwiseMul

remove manual fallback, remove format attrs
test=develop

* Refine PADDLE_ENFORCEs in eltwise_mul_op.h

test=develop

* Make ElementwiseMulOp inherit from ElementwiseOp

* Change type of simd_width to int

test=develop

* Remove Constructor extensions in ElementwiseOp and ElementwiseMulOp

test=develop

* Restore attributes

test=develop

* Fix test coverage for mkldnn eltwise mul

test=develop

* Conform to new is_run_common_broadcast API

test=develop

* Add UT for AreDimsAndFormatCorrect

test=develop
```
- fix logger problem (#21342) · 0a93635b
  Committed by Dong Daxiang on Nov 26, 2019
```
* fix logger problem
test=develop

* refine logger
test=develop
```
25 Nov, 2019 9 commits
- remove warning LNK4006 and warning LNK4221 (#21226) · 345b67b5
  Committed by zhouwei25 on Nov 25, 2019
- fix the fill_constant op precision problem (#21322) · 6514f52e
  Committed by wangchaochaohu on Nov 25, 2019
```
* fix the fill_constant op precision problem test=develop
```
- Improve argsort performance. (#21267) · 08c19c58
  Committed by zhaoyuchen2018 on Nov 25, 2019
```
* Improve argsort performance.

- Give 200000 data to compute argsort on v100,
can speed up ~190x
before opt cost: 0.53s
after opt cost:0.0027s

- Add fp16 support

* Refine error message
* Refine code

test=develop
Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>
```
- fix Print_op input dtype list error test=develop (#21326) · 7fcaa39b
  Committed by lijianshe02 on Nov 25, 2019
- add resnet50 test for post training quantization, test=develop (#21272) · 84865b80
  Committed by juncaipeng on Nov 25, 2019
- print table stat info for pslib (#21296) · 9a7832f8
  Committed by Thunderbrook on Nov 25, 2019
```
* print table stat
test=develop

* notes
test=develop

* notes
test=develop
```
- Cache 3rd source code, improve stability, reduce the compilation time (#21190) · 341dee06
  Committed by zhouwei25 on Nov 25, 2019
- Fix dgc accuracy by mv regularization to local (#21278) · 8ac7687e
  Committed by WangXi on Nov 25, 2019
- Add global value getter setter (#21285) · b9f8ae84
  Committed by Zeng Jinle on Nov 25, 2019
```
* add global value getter setter, test=develop

* fix error messages, test=develop
```
24 Nov, 2019 5 commits

use prefetch to load next mem into cache (#21206) · b19e1a1b

Committed by Leo Zhao on Nov 24, 2019

* use prefetch to load next mem into cache

test=develop

* remove hard code memcpy om pyramid_hash_ff

test=develop

Refactor fetch handler (#21264) · 691ced87

Committed by Dong Daxiang on Nov 24, 2019

* fix fetch handler problem and refactor
When a user defines a FetchHandler class, they should initialize the handler
with a variable dict. Each key of the variable dict is a user-defined name, and
each value is a Variable generated from the Python API.

For each fetch, the user should implement a handler function in which
fetched_result_dict is available, so the user can access the fetched values
by the user-defined keys.
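
The pattern described in this commit message can be sketched in plain Python as follows. This is only an illustration of the idea, not Paddle's implementation: `SketchFetchHandler`, `LossRecorder`, and `run_one_fetch` are hypothetical names, and the variables are stood in for by arbitrary objects.

```python
class SketchFetchHandler:
    """Hypothetical stand-in for the FetchHandler described above."""

    def __init__(self, var_dict):
        # var_dict maps a user-defined name to a variable object.
        self.var_dict = dict(var_dict)

    def handler(self, fetched_result_dict):
        # Subclasses implement this to consume the fetched values.
        raise NotImplementedError


class LossRecorder(SketchFetchHandler):
    """Example user-defined handler that records the value fetched under "loss"."""

    def __init__(self, var_dict):
        super().__init__(var_dict)
        self.seen = []

    def handler(self, fetched_result_dict):
        # The fetched value is looked up by the same user-defined key.
        self.seen.append(fetched_result_dict["loss"])


def run_one_fetch(fetch_handler, fetch_fn):
    """Hypothetical driver: fetch every registered variable, then call the handler."""
    fetched = {
        name: fetch_fn(var) for name, var in fetch_handler.var_dict.items()
    }
    fetch_handler.handler(fetched)
    return fetched
```

For example, `run_one_fetch(LossRecorder({"loss": loss_var}), fetch_fn)` would pass the handler a dict keyed by `"loss"`, where `loss_var` and `fetch_fn` are whatever variable object and fetching routine the surrounding framework provides.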

adapt test_collective_base.py for only two GPU cards available. (#21307) · f1b09ba3
Committed by Yi Liu on Nov 24, 2019
```
* adapt test_collective_base.py for only two GPU cards available.
test=develop

* fix bug of issue #21259
test=develop
```

optimize nhwc for tensor core in ConvOp and ConvGradOp (#20597) · ed2a1852
Committed by gongweibao on Nov 24, 2019

Disable fusion_group pass for windows and mac. We will do some experiments on Linux first. (#21310) · c918788b

Committed by Yiqun Liu on Nov 24, 2019

* Disable fusion_group pass for windows and mac. We will do some experiments on Linux first.
test=develop

* Print the subgraph when check failed.
test=develop

22 Nov, 2019 2 commits
- Fix the crash issue when scale or bias was null-pointer. (#21284) · 69dd5152
  Committed by Yihua Xu on Nov 22, 2019
```
* Fix the crash issue when scale or bias was null-pointer.

test=develop

* Add the error message for passing CI.

test=develop
```
- optimize lod_reset op to avoid data transform · 698b8b73
  Committed by Zhang Ting on Nov 22, 2019

PaddlePaddle / Paddle: synced successfully about 1 year ago