Commits · b16274556acb4411681a85f45aa56b950151b5e0 · Crayon鑫 / Paddle

29 Nov, 2019 3 commits

Add descending for argsort (#21400) · b1627455

Committed by zhaoyuchen2018 on Nov 29, 2019

* Add ascending for argsort

* Refine api doc description.

* Refine descending description

* Add int32 logic to speedup when data is small size.

* Remove int32 opt as not support in python

b1627455
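
The effect of a `descending` switch on argsort can be illustrated with a minimal pure-Python sketch. The helper `argsort_1d` below is hypothetical and only mirrors the idea of returning sorted values together with their original indices; it is not the Paddle kernel or its API.

```python
from typing import List, Sequence, Tuple


def argsort_1d(values: Sequence[float],
               descending: bool = False) -> Tuple[List[float], List[int]]:
    """Return (sorted_values, indices) for a 1-D sequence.

    Hypothetical helper for illustration only; not Paddle's implementation.
    """
    # Sort the positions by the value stored at each position.
    indices = sorted(range(len(values)),
                     key=lambda i: values[i],
                     reverse=descending)
    return [values[i] for i in indices], indices


data = [3.0, 1.0, 2.0]
print(argsort_1d(data))                   # ascending order
print(argsort_1d(data, descending=True))  # descending order
```

With `descending=True` the largest element comes first, and the returned indices still refer to positions in the original input.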

Add dygraph execution context (#20157) · ac854670

Committed by hong on Nov 29, 2019

* add_dygraph_execution_context

* add dygraph infershape context and execution context; test=develop

* fix imperative bug; test=develop

* remove inputs outputs interface from execution context,
because it has the same function as inputNames;
test=develop

* remove tracer_test ctest; test=develop

* fix split op bug; test=develop

* fix unittests bug; test=develop

* fix distribute test bug; test=develop

* fix ngraph compile bug; test=develop

* fix grad maker bug; test=develop

* fix load op bugs; test=develop

* fix operator.cc construct bug; test=develop

* remove useless name find in operator; test=develop

* add tracer_test; test=develop

* fix concat, split bug; test=develop

* remove tracer_test unittest; test=develop

* fix attribute check bug; test=develop

* add test code to fix coverage; test=develop

* remove useless code, change check backward input in engine; test=develop

* unlock var type infer shape;test=develop

* add ShareAllLoD api; test=develop

* add dygraph infershape context unittest; test=develop

* remove increase and decrease lod in dygraph; test=develop

* add override; test=develop

* fix increase decrease lod; test=develop

* fix paddle_enforce; test=develop

* disable lod op dygraph check; test=develop

* fix paddle enforce error; test=develop

* add comment for op_registry and OperatorBase; test=develop

* optimize the comment of op_registry; test=develop

* fix format of comment; test=develop

* fix format of comment; test=develop

* optimize the format of comment; test=develop

* optimize the format of the comment; test=develop

* optimize comment of op_registry; test=develop

ac854670

add macro to ban windows (#21422) · a6b089c6
Committed by hutuxian on Nov 29, 2019
```
remove nccl related code in windows
```
a6b089c6

28 Nov, 2019 8 commits

add Adam beta1/beta2 support Variable (#21234) · ebfb720a
Committed by Kaipeng Deng on Nov 28, 2019
```
* add Adam beta1/beta2 support Variable. test=develop
```
ebfb720a

Use system allocator in OpTest (#21335) · 09696d5d

Committed by Zeng Jinle on Nov 28, 2019

* use system allocator in unittests, test=develop

* fix op bugs, test=develop

* fix tensor copy bug when src and dst are the same, test=develop

09696d5d

Add masked select api (#21172) · 007c9975
Committed by ruri on Nov 28, 2019

007c9975

batch_norm momentum support variable (#21246) · 67c836fb

Committed by Kaipeng Deng on Nov 28, 2019

* batch_norm momentum support variable. test=develop

* fix format. test=develop

* add batch_norm momentum variable example. test=develop

* move MomentumTensor to training branch. test=develop

* split example. test=develop

* fix doc. test=develop

* fix PADDLE_ENFORCE ci. test=develop

* fix format. test=develop

67c836fb
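
As a rough illustration of why a per-step momentum can be useful, the sketch below applies the usual running-statistic update `running = running * momentum + batch * (1 - momentum)` with a momentum value supplied at every step. The function names and the schedule callback are assumptions made for this example; they are not part of the batch_norm op.

```python
from typing import Callable, Sequence


def update_running_mean(running_mean: float, batch_mean: float,
                        momentum: float) -> float:
    """One exponential-moving-average step for a running statistic."""
    return running_mean * momentum + batch_mean * (1.0 - momentum)


def track(batch_means: Sequence[float],
          momentum_at_step: Callable[[int], float]) -> float:
    """Fold a sequence of batch means into a running mean.

    momentum_at_step is a hypothetical callback standing in for a
    momentum that is provided as a value at run time instead of being
    fixed when the network is built.
    """
    running = 0.0
    for step, batch_mean in enumerate(batch_means):
        running = update_running_mean(running, batch_mean,
                                      momentum_at_step(step))
    return running


# Constant momentum of 0.5 over two identical batches.
print(track([10.0, 10.0], lambda step: 0.5))
```

A schedule such as `lambda step: min(0.9, 0.5 + 0.1 * step)` would let early batches influence the running mean more strongly than later ones.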

Fp32 vs int8 qat C++ performance (#21244) · c0aa1367

Committed by lidanqing on Nov 28, 2019

* add ut for comparing FP32 and QAT INT8

* add save qat transformed model python script
test=develop

* updated

* added missing file

* add "with_label"
test=develop

* performance benchmark as unit test
test=develop

* change names of unnecessary thing

* Change CMakeList.txt for model downloading and UT
test=develop

* change names of functions and params for more readable code
test=develop

* Change PADDLE_ENFORCE messages
test=develop

* fix indent problems
test=develop

* indent problems
test=develop

c0aa1367

fix fleet save bug (#21362) · f1178e9d
Committed by xujiaqi01 on Nov 28, 2019
```
* fix fleet save bug of save_inference_model
* test=develop
```
f1178e9d

add config file to avoid load checkpoint test=develop (#21373) · 1840c165
Committed by Liufang Sang on Nov 28, 2019

1840c165

fix lod_reset bug, test=develop (#21392) · b97fc16d
Committed by Zeng Jinle on Nov 28, 2019

b97fc16d

27 Nov, 2019 3 commits

Support data_norm gpu kernel (#21325) · 47a82e38

Committed by hutuxian on Nov 27, 2019

* support data_norm_op run in CUDA
* add two parameters sync_stats & summary_decay_rate
* add UT

47a82e38

Support numpy bridge (enabled by default in dygraph mode) (#20983) · d5ff79e5

Committed by Youwei Song on Nov 27, 2019

* add numpy bridge

* fix template compile

* add unittest, add default
test=develop

* fix unittest
test=develop

* fix unittest
test=develop

* zero_copy=True for to_variable,
test=develop

* bug fix
test=develop

* disable deprecated NumPy API
test=develop

* use better design of NumpyAllocator
test=develop

* fix Py_None check
test=develop

* reset c++ tracer when jump out dygraph guard
test=develop

* refine PADDLE_ENFORCE_xx format
test=develop

* bug fix of tracer switch
test=develop

* update decref
test=develop

d5ff79e5

INT8 Fully-connected (#17641) · 5d7d5482

Committed by Michał Gallus on Nov 27, 2019

* Implement Int8 FC

* Integrate FC into INT8v2

test=develop

* int8 FC: transpose weights before computing scales

test=develop

* Add support for activation_type string in FC

test=develop

* Disable MKL-DNN's FC in VGG16 and 19

test=develop

* Disable FC quantization when mkldnn FC is disabled

test=develop

* Solve PADDLE_ENFORCES in FC int8

* Fix Paddle enforces and remove const cast

test=develop

* Fix style changes

test=develop

* Fix quantizer_tester test and add fc quantization

test=develop

* Fix FC test fail on CUDA

* Remove unnecessary log from quantize placement pass

test=develop

* Add Thread ID to FC hash key

test=develop

* Add comments to MKL-DNN FC Kernel

test=develop

* Refactor quantizer

test=develop

* Fix linter issues

test=develop

* Fix crash in slim googlenet

test=develop

* Fix PADDLE_ENFORCE messages

test=develop

5d7d5482

26 Nov, 2019 9 commits

paddleslim quantization skip pattern support list of string (#21141) · 07e6a942
Committed by itminner on Nov 26, 2019

07e6a942

Fix some typos in AMP. (#21354) · be2e3e67
Committed by Zhen Wang on Nov 26, 2019
```
* fix some typos in AMP. test=develop

* delete useless codes. test=develop
```
be2e3e67

add the framework support for distfc (#21197) · 41d13209

Committed by lilong12 on Nov 26, 2019

* add the framework support for distfc and ut, test=develop
* fix the implementation of shard_index_op, test=develop

41d13209

change download log format (#21290) · a214a308

Committed by hong on Nov 26, 2019

* change download log format; test=develop

* add unittest for data download; test=develop

* remove cache before download; test=develop

a214a308

Add fc padding to improve mkl GEMM's performance when N and K are multiple of 128. (#20972) · 234060f8

Committed by GaoWei8 on Nov 26, 2019

* Add fc padding to solve mkl performance
test=develop

* fix gpu pass and error information
test=develop

* fix fc_fuse_pass_test
test=develop

* fix error information
test=develop

* fix error information
test=develop

* fix name and add fc op padding test
test=develop

* fix attributes
test=develop

* optimize fc padding
test=develop

* fix test
test=develop

234060f8

reduce interp op input size to pass CI, test=develop (#21341) · 6cfcbe05
Committed by ruri on Nov 26, 2019

6cfcbe05

[MKL-DNN] Error throwing for NHWC layout for MKL-DNN ops (#21207) · f4cf028a
Committed by Jacek Czaja on Nov 26, 2019

f4cf028a

Refactor MKL-DNN ElementwiseMul (#21061) · ed9ceb9f

Committed by Michał Gallus on Nov 26, 2019

* Refactor MKL-DNN ElementwiseMul

remove manual fallback, remove format attrs
test=develop

* Refine PADDLE_ENFORCEs in eltwise_mul_op.h

test=develop

* Make ElementwiseMulOp inherit from ElementwiseOp

* Change type of simd_width to int

test=develop

* Remove Constructor extensions in ElementwiseOp and ElementwiseMulOp

test=develop

* Restore attributes

test=develop

* Fix test coverage for mkldnn eltwise mul

test=develop

* Conform to new is_run_common_broadcast API

test=develop

* Add UT for AreDimsAndFormatCorrect

test=develop

ed9ceb9f

fix logger problem (#21342) · 0a93635b
Committed by Dong Daxiang on Nov 26, 2019
```
* fix logger problem
test=develop

* refine logger
test=develop
```
0a93635b

25 Nov, 2019 7 commits
- fix the fill_constant op precision problem (#21322) · 6514f52e
  Committed by wangchaochaohu on Nov 25, 2019
```
* fix the fill_constant op precision problem test=develop
```
  6514f52e
- Improve argsort performance. (#21267) · 08c19c58
  Committed by zhaoyuchen2018 on Nov 25, 2019
```
* Improve argsort performance.

- Running argsort on 200000 elements on a V100
is ~190x faster:
cost before optimization: 0.53s
cost after optimization: 0.0027s

- Add fp16 support

* Refine error message
* Refine code

test=develop
Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>
```
  08c19c58
- fix Print_op input dtype list error test=develop (#21326) · 7fcaa39b
  Committed by lijianshe02 on Nov 25, 2019
  
  7fcaa39b
- add resnet50 test for post training quantization, test=develop (#21272) · 84865b80
  Committed by juncaipeng on Nov 25, 2019
  
  84865b80
- print table stat info for pslib (#21296) · 9a7832f8
  Committed by Thunderbrook on Nov 25, 2019
```
* print table stat
test=develop

* notes
test=develop

* notes
test=develop
```
  9a7832f8
- Fix dgc accuracy by mv regularization to local (#21278) · 8ac7687e
  Committed by WangXi on Nov 25, 2019
  
  8ac7687e
- Add global value getter setter (#21285) · b9f8ae84
  Committed by Zeng Jinle on Nov 25, 2019
```
* add global value getter setter, test=develop

* fix error messages, test=develop
```
  b9f8ae84

24 Nov, 2019 3 commits

Refactor fetch handler (#21264) · 691ced87

Committed by Dong Daxiang on Nov 24, 2019

* fix fetch handler problem and refactor
When a user defines a FetchHandler class, they should initialize the handler
with a variable dict. Each key of the variable dict is a user-defined name,
and each value is a Variable generated from the Python API.

For each fetch, the user should implement a handler function in which
fetched_result_dict is available, so the fetched values can be accessed
with the user-defined keys.

691ced87
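
The usage pattern described in this commit message can be sketched in plain Python. Everything below is a hypothetical stand-in: `FetchHandlerSketch`, `LossRecorder`, and `run_one_fetch` only imitate the described contract (a variable dict passed at construction, a `handler` method that receives `fetched_result_dict`) and are not the framework's actual classes or signatures.

```python
from typing import Any, Callable, Dict, List


class FetchHandlerSketch:
    """Hypothetical base class imitating the described FetchHandler contract."""

    def __init__(self, var_dict: Dict[str, Any]) -> None:
        # key: user-defined name; value: a Variable created via the Python API
        self.var_dict = var_dict

    def handler(self, fetched_result_dict: Dict[str, Any]) -> None:
        raise NotImplementedError("subclasses implement handler()")


class LossRecorder(FetchHandlerSketch):
    """User-defined handler that records the value fetched under 'loss'."""

    def __init__(self, var_dict: Dict[str, Any]) -> None:
        super().__init__(var_dict)
        self.seen: List[Any] = []

    def handler(self, fetched_result_dict: Dict[str, Any]) -> None:
        # Fetched values are looked up with the same user-defined keys.
        self.seen.append(fetched_result_dict["loss"])


def run_one_fetch(h: FetchHandlerSketch,
                  fetch_fn: Callable[[Any], Any]) -> None:
    """Simulate one fetch; fetch_fn stands in for reading a Variable's value."""
    results = {name: fetch_fn(var) for name, var in h.var_dict.items()}
    h.handler(results)


recorder = LossRecorder({"loss": "loss_var_placeholder"})
run_one_fetch(recorder, lambda var: 0.25)
print(recorder.seen)
```

The point of the design is that user code never touches internal variable names: it only sees the keys it chose when building the variable dict.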

adapt test_collective_base.py for only two GPU cards available. (#21307) · f1b09ba3
Committed by Yi Liu on Nov 24, 2019
```
* adapt test_collective_base.py for only two GPU cards available.
test=develop

* fix bug of issue #21259
test=develop
```
f1b09ba3

optimize nhwc for tensor core in ConvOp and ConvGradOp (#20597) · ed2a1852
Committed by gongweibao on Nov 24, 2019

ed2a1852

22 Nov, 2019 4 commits

add dequantize_abs_max op and modify lookup_table op (#20899) · f0b15184

Committed by Liufang Sang on Nov 22, 2019

* add int8 kernel to lookup_table op and add dequantize op test=develop

* change paddle_enforce to paddle_enforce_eq test=develop

* change copyright and change some not suitable code test=develop

* remove debug log test=develop

* replace GetInputType with IndicateVarDataType test=develop

* fix EmptyGradMaker test=develop

* fix diff between cpu and gpu test=develop

* use memcopy when int8_t test=develop

f0b15184

support cvm_op run in gpu (#21300) · a6ce2306

Committed by hutuxian on Nov 22, 2019

Previously, the CVM op could only run on CPU. This PR implements its GPU kernel
and also improves the unit tests for the CVM op.

a6ce2306

Polish some PE code details (#21274) · 95250852
Committed by Chen Weihang on Nov 22, 2019
```
* polish code details, test=develop

* further polish hint msg, test=develop
```
95250852

fix bug of issue #21259 (#21287) · 0fd1281e
Committed by Yi Liu on Nov 22, 2019
```
Pass the `allow_out_of_range` argument of the one_hot op to the C++ backend.
```
0fd1281e
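
To show what an `allow_out_of_range` flag typically controls, here is a minimal pure-Python sketch of one-hot encoding. It assumes the commonly documented behavior (out-of-range indices produce an all-zero row when the flag is set, and an error otherwise); the function is illustrative and is not the op's C++ kernel.

```python
from typing import List, Sequence


def one_hot(indices: Sequence[int], depth: int,
            allow_out_of_range: bool = False) -> List[List[float]]:
    """One-hot encode indices into rows of length depth.

    Assumed semantics for illustration: an index outside [0, depth)
    yields an all-zero row if allow_out_of_range is True, and raises
    ValueError otherwise.
    """
    rows: List[List[float]] = []
    for idx in indices:
        row = [0.0] * depth
        if 0 <= idx < depth:
            row[idx] = 1.0
        elif not allow_out_of_range:
            raise ValueError(f"index {idx} is outside [0, {depth})")
        rows.append(row)
    return rows


print(one_hot([1, 3], depth=3, allow_out_of_range=True))
```

If a front end accepts such a flag but never forwards it to the backend, the backend keeps its default and the flag silently has no effect, which is the kind of bug this commit addresses.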

21 Nov, 2019 3 commits

fix fs_client_param bug (#21212) · 319d2ba9

Committed by xujiaqi01 on Nov 21, 2019

* fix fs_client_param bug, user can set this config through fleet_desc_file or fleet config
* test=develop

319d2ba9

solve pslib core in stop worker (#21263) · 0d17c1b8

Committed by Thunderbrook on Nov 21, 2019

* general table

* add sparse table
test=develop

* no cvm
test=develop

* add no_cvm
test=develop

* add note
test=develop

* code style
test=develop

* code style
test=develop

* code style
test=develop

* code style
test=develop

* code style
test=develop

* add key of optimizer
test=develop

* solve pslib stop core
test=develop

* barrier
test=develop

* add notes
test=develop

0d17c1b8

fix bug for python/paddle/fluid/tests/unittests/test_elementwise_mul_op.py, test=develop (#21289) · fa4d0550
Committed by zhongpu on Nov 21, 2019

fa4d0550
