Commits · 78716128ae543140b797cc24822026ab2881a0d6 · 机器未来 / Paddle

24 Feb, 2020 1 commit
- SYNC with communicaotor (#22344) (#22725) · 78716128
  Authored by tangwei12 on Feb 24, 2020
```
* add sync communicator and implement
```
  78716128
18 Feb, 2020 2 commits

Add TopK Op Grad CPU&GPU Kernel test=develop (#22628) (#22656) · 471971a3

Authored by Jiawei Wang on Feb 18, 2020

* Add TopK Op Grad CPU&GPU Kernel test=develop

* Add TopK Op Grad, modify grad op maker test=develop

* Add TopK Op Grad, modify grad op maker test=develop

* Add TopK Op Grad, modify PADDLE_ENFORCE test=develop

* Add TopK Op Grad, modify PADDLE_THROW test=develop

* Add TopK Op Grad, modify unittest test=develop

* fix ngraph top k op unittest test=develop

471971a3

register fp16 kernel for some ops (#22650) · 8d8527fb
Authored by Yibing Liu on Feb 18, 2020
```
test=release/1.7
```
8d8527fb

17 Feb, 2020 1 commit
- [cherry-pick] :Refine the error log about runtime batch and max_batch… (#22537) · c35413bf
  Authored by Zhaolong Xing on Feb 17, 2020
```
* [cherry-pick] :Refine the error log about runtime batch and max_batch_size. #22535
test=release/1.7

* fix comments
test=release/1.7
```
  c35413bf
13 Feb, 2020 3 commits

load inference model from memory buffer, test=release/1.7 (#22562) · baec7a35

Authored by 石晓伟 on Feb 13, 2020

* 1. load model from memory

2. scale is no longer added when saving inference model

test=develop

* raise ci coverage, test=develop

* supports saving weights to memory. test=develop

* raise ci coverage, test=develop

* fix PADDLE_ENFORCE messages, test=develop

baec7a35

Add support for dynamic_decode(while) training. (#22231) (#22574) · 4e3c535a

Authored by Guo Sheng on Feb 13, 2020

* Add support for dynamic_decode(while) training. test=develop

* Fix assign_op and tensor_array_read_write_op after solving conflict. test=develop

* Fix test_rnn_decode_api.py. test=develop

* Refine docs for apis in rnn.py. test=develop

* Adjust outputs of dynamic_decode. test=develop

* Remove the force_cpu update in assign_op. test=develop

* Remove the force_cpu update in assign_op. test=develop

* Make RNNCell.get_initial_states support batch_dim_idx argument. test=develop

* Rename _create_array_outof_while as _create_array_out_of_while in rnn.py.

test=release/1.7

4e3c535a

[Cherry-Pick][1.7] Enable MKL-DNN Quantization of Ernie in Slim (#22575) · e78858f1

Authored by Michał Gallus on Feb 13, 2020

* Introduce Ernie NLP

* Fix error regarding incorrect attr type

test=release/1.7
Co-authored-by: Wojciech Uss <wojciech.uss@intel.com>

e78858f1

11 Feb, 2020 3 commits

[cherry-pick] Add weight quantization in post_training_quanzitaion (#22445) (#22493) · ad2813b1

Authored by cc on Feb 11, 2020

* Add weight quantization in post_training_quanzitaion (#22445)

* [cherry-pick]Support int16 for Tensor (#22423)

* add int16 support, test=develop, test=release/1.7
Co-authored-by: Leo Chen <chenqiuliang@baidu.com>

ad2813b1

cherry-pick 22509. test=develop test=release/1.7 (#22527) · 49a80b45

Authored by Wilber on Feb 11, 2020

[cherry-pick] #22509

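Support compiling without depending on NCCL.

In a multi-GPU setup, if the build was not compiled with the WITH_NCCL switch enabled, the GPUs cannot communicate with each other, so only a single GPU can be selected for use.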

49a80b45

update. test=develop test=release/1.7 (#22518) · 59bb29db

Authored by Wilber on Feb 11, 2020

[cherry-pick] #22484

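Support compiling without depending on NCCL.

In a multi-GPU setup, if the build was not compiled with the WITH_NCCL switch enabled, only a single GPU can be used.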

59bb29db

07 Feb, 2020 1 commit
- optimize performance of interpolate op (#22436) (#22489) · 0d0ea9b7
  Authored by LielinJiang on Feb 07, 2020
```
* optimize interpolate op.
test=develop
```
  0d0ea9b7
05 Feb, 2020 3 commits
- fix sigmoid cudnn bug (#22439) (#22449) · 0be820cb
  Authored by Tao Luo on Feb 05, 2020
```
Co-authored-by: Manjunath Bhat <manjunathbhat9920@gmail.com>
```
  0be820cb
- cherry-pick 22384 and 22371. test=develop test=release/1.7 (#22453) · fb98116c
  Authored by Wilber on Feb 05, 2020
```
[cherry-pick] #22384 and #22371

#22384 adds the WITH_NCCL switch

#22371 updates the commit id of Lite that fluid depends on
```
  fb98116c
- [Cherry-pick]Fix geo init & send (#22413) · 13dca2c9
  Authored by Chengmo on Feb 05, 2020
```
* Fix GEO-SGD init & send Bug (#22375)

* test=develop, fix geo Send & Init

* test=release/1.7,test=develop, cherry-pick 8f36c395
```
  13dca2c9
04 Feb, 2020 2 commits

remove anakin from code, test=release/1.7 (#22421) · 399bda2b
Authored by 石晓伟 on Feb 04, 2020

399bda2b

[DNNL] Fix accuracy in INT8 FC (#22404) (#22410) · a13490a0

Authored by Michał Gallus on Feb 04, 2020

test=release/1.7

* Enable quantize to reorder to nchw as well

* Correct FC MKL-DNN input dim requirements to accept 3D

* Improve DNNL FC format, error and 3D input handling

* Improve error checking in FC

* Improve PADDLE_ENFORCE messages in fc-related files

* Remove data layout attribute from obligatory pass args

* Fix message in fc_mkldnn_pass to be logically correct

a13490a0

21 Jan, 2020 1 commit
- change std::cout to log(INFO), vlog (#22316) (#22337) · 90ce4aea
  Authored by lidanqing on Jan 21, 2020
  
  90ce4aea
20 Jan, 2020 1 commit
- integrated HALF_ASYNC to communicator (#21869) (#22343) · fa4e0e82
  Authored by tangwei12 on Jan 20, 2020
```
* add half_async in the communicator
* fix DistributedStrategy
```
  fa4e0e82
17 Jan, 2020 1 commit
- Fix infer_shape in compling for elementwise_op (#22291) (#22353) · 410a5356
  Authored by qingqing01 on Jan 17, 2020
  
  410a5356
16 Jan, 2020 1 commit
- Add caching mechanizm to requantize_mkldnn_op (#22267) · 35bab4f2
  Authored by Adam on Jan 16, 2020
  
  35bab4f2
14 Jan, 2020 3 commits
- Bug fix for sparse recorder (#21969) (#22245) · 2e834eab
  Authored by 123malin on Jan 14, 2020
```
* test=develop, bug fix for sparse recorder
```
  2e834eab
- add backward gradient computation for op argsort. cherry-pick #22203. test=release/1.7 (#22233) · 681d908e
  Authored by FlyingQianMM on Jan 14, 2020
  
  681d908e
- support the fusion of batch_norm and relu for AMP. test=release/1.7 (#22210) · c63a63d5
  Authored by Zhen Wang on Jan 14, 2020
  
  c63a63d5
10 Jan, 2020 1 commit
- Improve ngraph file line coverage (#22155) · 298ee7d2
  Authored by baojun on Jan 09, 2020
  
  298ee7d2
09 Jan, 2020 2 commits

test Optimizer in dygraph (#21949) · d0f0a252

Authored by zhongpu on Jan 09, 2020

* test Optimizer in dygraph, test=develop

* add optest for Optimizer in dygraph, test=develop

* fix adagrad optimizer, test=develop

* fix dpsgd optimizer, test=develop

* fix test_optimizer.py, test=develop

* fix dpsgd optimizer, this op only support cpu, test=develop

* add optest for optimizer, test=develop

* add description for dpsgd, test=develop

* add rmsprop to white_list in unused_var_check.cc, test=develop

* polish code style, test=develop

* polish code style, test=develop

* delete seed attribute for DpsgdOptimizer, test=develop

* change testing to debugging, test=develop

d0f0a252

[Feature] Lite subgraph (#22114) · ad0dfb17
Authored by 石晓伟 on Jan 09, 2020

ad0dfb17

08 Jan, 2020 3 commits

Refine stack op to improve xlnet performance, test=develop (#22142) · 3d4f2aa6

Authored by zhaoyuchen2018 on Jan 08, 2020

The wait in stack costs a lot of CPU time; using a CUDA kernel to do the memory copy
reduces CPU time.
Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>

3d4f2aa6

add double register op_data_type of pad2d and fix compile error, test=develop (#22075) · 64a40442
Authored by liu zhengxi on Jan 08, 2020

64a40442

Support prroi_pool_op with Tensor and LoDTensor rois (#20649) · 6ea38091

Authored by Double_V on Jan 08, 2020

1. Add a new input named batch_roi_nums for prroi_pool_op. When rois is a Tensor, batch_roi_nums holds the number of ROIs for each image in the batch; when rois is a LoDTensor, this information is stored in the LoD of rois.
2. Add a grad check to prroi_pool_op and fix the abnormal X grad diff on CPU.

6ea38091
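
The relationship between the two ROI layouts described in this commit can be sketched in plain Python. This is only an illustration, assuming the LoD stores cumulative offsets and batch_roi_nums stores per-image counts; the helper names `roi_nums_to_lod_offsets` and `lod_offsets_to_roi_nums` are hypothetical and are not part of the Paddle API.

```python
from itertools import accumulate


def roi_nums_to_lod_offsets(batch_roi_nums):
    """Turn per-image ROI counts into cumulative, LoD-style offsets.

    Hypothetical helper for illustration only (not a Paddle function).
    """
    return [0] + list(accumulate(batch_roi_nums))


def lod_offsets_to_roi_nums(offsets):
    """Recover per-image ROI counts from cumulative offsets."""
    return [end - start for start, end in zip(offsets, offsets[1:])]


# Three images holding 2, 3 and 1 ROIs respectively.
batch_roi_nums = [2, 3, 1]
offsets = roi_nums_to_lod_offsets(batch_roi_nums)
print(offsets)                           # [0, 2, 5, 6]
print(lod_offsets_to_roi_nums(offsets))  # [2, 3, 1]
```

Under that assumption, ROIs with row indices in `[offsets[i], offsets[i + 1])` belong to image `i`, so both layouts carry the same information.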

07 Jan, 2020 4 commits
- Fix windows build not kernel issue, test=develop (#22105) · 3dbd4087
  Authored by zhaoyuchen2018 on Jan 07, 2020
```
windows conv_fusion failed as no kernel, explicitly declare lambda
Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>
```
  3dbd4087
- Update pyramid related OP (#21372) · 418abc92
  Authored by Chengmo on Jan 07, 2020
```
* add special way to add distribute vars, Update Pyramid hash op
```
  418abc92
- add erf op (#21785) · 14aebc7a
  Authored by Feiyu Chan on Jan 07, 2020
```
* add erf op and python interface.

* add fp16 support for erf op.

* add unit tests for erf op and its python interface.
```
  14aebc7a
- replace CUDNN_ENFORCE with PADDLE_ENFORCE_CUDA_SUCCESS, test=develop (#22109) · ba8414d3
  Authored by Chen Weihang on Jan 07, 2020
  
  ba8414d3
06 Jan, 2020 4 commits

support elu_op double grad (#21822) · fab4b076

Authored by Double_V on Jan 06, 2020

* support elu activation double grad,test=develop

* delete the code commit in .cc,test=develop

* fix relu test unpass, test=develop

* add elu double grad kernel and unit test

* add calculate dX in elu double grad functor, test=develop

* update the commit code,test=develop

fab4b076

Add TRT support for BERT (#21135) · 0a51098a

Authored by Pei Yang on Jan 06, 2020

* add gelu plugin

* align trt bert with gpu

* add support for fused fc with relu,

* add unittest for bert trt

0a51098a

[MKL-DNN] Conv grad and Batch Norm grad NHWC support (#22088) · b0b27ff6
Authored by Jacek Czaja on Jan 06, 2020

b0b27ff6
add distributed_strategy (#21710) · 7fb817d4
Authored by 123malin on Jan 06, 2020
```
* add distributed_strategy
```
7fb817d4

05 Jan, 2020 1 commit
- [MKL-DNN] Pool & LRN Grad Ops NHWC support (#21747) · ad8a9cb8
  Authored by Jacek Czaja on Jan 05, 2020
  
  ad8a9cb8
04 Jan, 2020 1 commit
- polish cross_entropy ENFORCE (#22056) · 34c57120
  Authored by Kaipeng Deng on Jan 04, 2020
  
  34c57120
03 Jan, 2020 1 commit
- register int/int64_t/float16 in pow/square kernel,test=develop (#22023) · 7f4abaf2
  Authored by SunAhong1993 on Jan 03, 2020
```
* register int/int64_t/float16 in  pow/square kernel,test=develop

* add abs/square/exp type,test=develop
```
  7f4abaf2

机器未来 / Paddle is in sync with the upstream project it was forked from
