Commits · 433cef03e5b7aa8ccf65531ae77ed11d51460563 · Crayon鑫 / Paddle

28 Feb, 2020 2 commits
- fix typo word (#22784) · 433cef03
  Committed by tianshuo78520a on Feb 28, 2020
  
  433cef03
- fix detection_map. test=develop (#22705) · ebc7ffc3
  Committed by Kaipeng Deng on Feb 28, 2020
  
  ebc7ffc3
27 Feb, 2020 3 commits

Refine adam op to improve performance, test=develop (#22346) · 72dde4ab

Committed by zhaoyuchen2018 on Feb 27, 2020

* Refine adam op, test=develop

* Fuse kernels together to reduce cpu time.

* Refine paddle enforce, test=develop

* Remove some comments, test=develop

* Refine code,test=develop

* Refine cuda kernel, test=develop

* Refine code according to comments, test=develop

72dde4ab

fix lod level, test=develop (#22755) · f2d1cd11

Committed by wangguanzhong on Feb 27, 2020

f2d1cd11

Correct CPU gradients of the argsort op (#22739) · 79d71234

Committed by FlyingQianMM on Feb 27, 2020

* Correct CPU gradients of the argsort op, form a network to test its forward and backward process, test=develop

* fix dynamic threshold error in test_argsort_op, test=develop

79d71234
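
As a rough illustration of what a correct argsort gradient looks like, the sketch below routes each upstream gradient back to the input position it was sorted from. It is a 1-D, pure-Python sketch under stated assumptions: `argsort_forward` and `argsort_backward` are hypothetical names, not Paddle's kernels, and the real op sorts along a chosen axis of an N-D tensor.

```python
# Hypothetical 1-D sketch; not the Paddle argsort kernel.
def argsort_forward(x, descending=False):
    """Return the sorted values and the source index of each sorted value."""
    indices = sorted(range(len(x)), key=lambda i: x[i], reverse=descending)
    out = [x[i] for i in indices]
    return out, indices


def argsort_backward(grad_out, indices):
    """Scatter the gradient of each sorted slot back to its source position."""
    grad_x = [0.0] * len(indices)
    for sorted_pos, source_pos in enumerate(indices):
        grad_x[source_pos] = grad_out[sorted_pos]
    return grad_x


x = [0.3, -1.2, 2.5]
out, idx = argsort_forward(x)
# Upstream gradient for the sorted output, one value per sorted slot.
grad_x = argsort_backward([10.0, 20.0, 30.0], idx)
```

Because sorting only permutes values, the backward pass is the inverse permutation applied to the upstream gradient.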

26 Feb, 2020 1 commit
- Change ShareDataWith() to TensorCopy() in ref_by_trainer_id (#22717) · ae8b5f11
  Committed by guofei on Feb 26, 2020
```
As the title
```
  ae8b5f11
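
The practical difference between sharing a buffer and copying it can be shown with a small Python analogy. The helpers `share_data_with` and `tensor_copy` are hypothetical stand-ins written for this note; they are not Paddle's C++ `ShareDataWith()` or `TensorCopy()`, and Python lists stand in for tensors.

```python
# Hypothetical stand-ins for illustration; not Paddle's C++ API.
def share_data_with(src):
    """Return a handle that aliases the same underlying storage as src."""
    return src  # same list object: a write through either name is seen by both


def tensor_copy(src):
    """Return an independent copy of src's contents."""
    return list(src)  # new list: later writes to src do not reach the copy


source = [1.0, 2.0, 3.0]
shared = share_data_with(source)
copied = tensor_copy(source)

# Mutate the original buffer after the output has been produced.
source[0] = 99.0
# shared[0] now reads 99.0, while copied[0] still reads 1.0.
```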
25 Feb, 2020 3 commits
- register fp16 for assign op (#22744) · 15c26671
  Committed by chengjuntao on Feb 25, 2020
```
* register fp16 for assign op, test=develop

* add op test for fp16, test=develop
```
  15c26671
- fix generate_mask_labels lod level (#22743) · 1c065346
  Committed by dyning on Feb 25, 2020
  
  1c065346
- fix compile&runtime lod_equality of lod_reset (#22737) · ba140222
  Committed by GaoWei8 on Feb 25, 2020
  
  ba140222
24 Feb, 2020 2 commits

add partial_sum op in contrib (#22292) · 3132681e

Committed by ShenLiang on Feb 24, 2020

* add partial_sum_op, test=develop

* modify the Paddle Error Message, test=develop

* modify the Paddle Error Message, test=develop

* modify the bug for python3, test=develop

* modify the ut for ci, test=develop

* mv to contrib, test=develop

* use check_variable_and_dtype, test=develop

* fix ci, test=develop

* fix conflict, test=dvelop

* add partial concat, test=develop

* fix the conflict, test=develop

* fix the error, test=develop

* rm SSE4, test=develop

3132681e

add partial_concat op in contrib (#22528) · e1366613

Committed by ShenLiang on Feb 24, 2020

* add partial_concat, test=develop

* fix the grids and blocks, test=develop

* fix the Paddle_Enforce, test=develop

* fix the doc of op, test=develop

* fix the doc, test=develop

* fix the doc of the op, test=develop

* replace -1 with None, test=develop

e1366613

23 Feb, 2020 1 commit
- fix typo words (#22653) · d2ba91aa
  Committed by tianshuo78520a on Feb 23, 2020
  
  d2ba91aa
22 Feb, 2020 2 commits
- register fp16 kernel for some ops (#22650) (#22696) · 6e7bfe30
  Committed by Yibing Liu on Feb 22, 2020
```
test=develop
```
  6e7bfe30
- SYNC with communicaotor (#22344) · 66a31501
  Committed by tangwei12 on Feb 22, 2020
```
* add sync communicator and implement
```
  66a31501
21 Feb, 2020 2 commits
- Add the support of fp16 in fusion_group (#22239) · 22bbd547
  Committed by Yiqun Liu on Feb 21, 2020
  
  22bbd547
- Add PADDLE_ENFORCE to Check Sequence Length of RecurrentOp (#22673) · adfa5b83
  Committed by Huihuang Zheng on Feb 21, 2020
```
1. Add PADDLE_ENFORCE to Check Sequence Length of RecurrentOp.
2. Also enrich PADDLE_ENFORCE error messages.
```
  adfa5b83
18 Feb, 2020 1 commit

[UT coverage] improve the mul_mkldnn_op line coverage (#22408) · d9262145

Committed by lidanqing on Feb 18, 2020

* improve the mul_mkldnn_op line coverage
test=develop

* remove fp32 mul mkldnn kernel
test=develop

* locally refactoring
test=develop

* change according to reviews
test=develop

d9262145

17 Feb, 2020 4 commits
- [Paddle-TRT] Refine the error log about runtime batch and max_batch_size. (#22535) · a06d75a2
  Committed by Zhaolong Xing on Feb 17, 2020
```
* fix trt log
test=develop

* fix comments
test=develop
```
  a06d75a2
- Update MKLDNN to v1.2 (#22521) · 608447bf
  Committed by Adam on Feb 17, 2020
  
  608447bf
- transpose_mkldnn code change to meet Paddle standards (#22591) · ab610a34
  Committed by Adam on Feb 17, 2020
  
  ab610a34
- Add TopK Op Grad CPU&GPU Kernel test=develop (#22628) · 8f035fb6
  Committed by Jiawei Wang on Feb 17, 2020
```
* Add TopK Op Grad CPU&GPU Kernel test=develop

* Add TopK Op Grad, modify grad op maker test=develop

* Add TopK Op Grad, modify grad op maker test=develop

* Add TopK Op Grad, modify PADDLE_ENFORCE test=develop

* Add TopK Op Grad, modify PADDLE_THROW test=develop

* Add TopK Op Grad, modify unittest test=develop

* fix ngraph top k op unittest test=develop
```
  8f035fb6
15 Feb, 2020 1 commit

update ops's unittest data type from float32 to float64 and shape over 100 (#22544) · 90ee3666

Committed by Steffy-zxf on Feb 15, 2020

* update ops's unittest of elementwise_pow, elementwise_max, elementwise_min, scale and sqrt
1. update elementwise_pow, elementwise_max and scale's unitests with input data type (float32 -> float64)
2. fix bug that the elementwise_pow doesn't meet threshold requirements with tackling float64 data
3. remove sqrt from op_accuracy_white_list.py
4. update the unittests of elementwise_pow, elementwise_max and elementwise_min ops that their input data shape over 100
5. test=develop

* modify the writing style according suggestions
test=develop

90ee3666

13 Feb, 2020 1 commit

[Ernie GPU Optim]: Fuse three fc to multihtead matmul (#22486) · 8acd745c

Committed by Zhaolong Xing on Feb 13, 2020

* 1. optim multihead matmul: fuse three fc to multihtead matmul

test=develop

* fix conflict
test=develop

* fix comments
test=develop

8acd745c

12 Feb, 2020 3 commits

Add support for dynamic_decode(while) training. (#22231) · 31b54646

Committed by Guo Sheng on Feb 12, 2020

* Add support for dynamic_decode(while) training. test=develop

* Fix assign_op and tensor_array_read_write_op after solving conflict. test=develop

* Fix test_rnn_decode_api.py. test=develop

* Refine docs for apis in rnn.py. test=develop

* Adjust outputs of dynamic_decode. test=develop

* Remove the force_cpu update in assign_op. test=develop

* Remove the force_cpu update in assign_op. test=develop

* Make RNNCell.get_initial_states support batch_dim_idx argument. test=develop

* Rename _create_array_outof_while as _create_array_out_of_while in rnn.py.
test=develop

31b54646

Add support for Ernie NLP model to the Slim QAT (#22506) · 4cddb43c

Committed by Wojciech Uss on Feb 12, 2020

* a test for Ernie QAT INT8 accuracy check

test=develop

* Remove NLP comparison test to split PRs

test=develop

* Fix typo and tabs, delete commented lines

test=develop

* re-combine the 2 PRs, test=develop
Co-authored-by: Michał Gallus <sand3r@interia.eu>
Co-authored-by: bingyanghuang <33643817+bingyanghuang@users.noreply.github.com>

4cddb43c

support slice double grad, test=develop (#22166) · 58d99247

Committed by Double_V on Feb 12, 2020

* support slice double grad, test=develop
* merge two doublegradopmaker to one doublegradopmaker,test=develop
* change the shape of slice_OP's unittest, test=develop

58d99247

11 Feb, 2020 4 commits

Paddlebox about box_wrapper (#22497) · 1a7962be

Committed by hutuxian on Feb 11, 2020

Refine PaddleBox Framework, Main functions: 
* Add MetricMsg util class, which can calculate metrics like AUC, bucket_error, COPC.
* Replace FeedPass with new interface: BeginFeedPass & EndFeedPass
* Refactor Pull/Push Sparse Function in box_wrapper.
* Use CUDA Kernel to copy keys and copy feasign between tensor and boxps struct.
* Cache copied keys in pull sparse in order to reuse it in push period.

1a7962be
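
Two of the metrics named in the MetricMsg bullet can be sketched as follows. This is an illustration under stated assumptions, not the MetricMsg implementation: `auc` and `copc` are hypothetical helpers, AUC is computed by brute-force pair counting, and COPC is assumed to mean observed clicks divided by predicted clicks. bucket_error is left out because the commit does not define it.

```python
# Hypothetical helpers for illustration; not PaddleBox's MetricMsg class.
def auc(labels, preds):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    positives = [p for y, p in zip(labels, preds) if y == 1]
    negatives = [p for y, p in zip(labels, preds) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for pos_score in positives:
        for neg_score in negatives:
            if pos_score > neg_score:
                wins += 1.0
            elif pos_score == neg_score:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def copc(labels, preds):
    """Assumed definition: total observed clicks over total predicted clicks."""
    return sum(labels) / sum(preds)


labels = [1, 0, 1, 0]
preds = [0.9, 0.2, 0.4, 0.5]
auc_value = auc(labels, preds)
copc_value = copc(labels, preds)
```

Pair counting is quadratic in the number of samples, so it only suits small checks; a production metric would more likely use sorted ranks or score buckets.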

【OpPorting Example】DEMO OF FIX COMPILE&RUNTIME LOD_EQUALITY (#22460) · 9e29d3eb

Committed by huzhiqiang on Feb 11, 2020

9e29d3eb

Improve transpose performance with tile sm copy, test=develop (#22311) · 54970444

Committed by zhaoyuchen2018 on Feb 11, 2020


* Refine code, fix select tile error,test=develop

* Refine element type and some comments, test=develop

* Refine comments and gpu utils, test=develop

* Remove some useless condition

* Refine floor and ceil, test=develop

* refine for loop. test=develop
Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>

54970444
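
The general idea of a tiled transpose can be sketched in plain Python. This is not the CUDA kernel from the commit: `tiled_transpose` and the tile size are hypothetical, and the `scratch` list only stands in for the shared-memory tile a GPU kernel would stage data in.

```python
# Hypothetical sketch of tile-by-tile transposition; not the Paddle CUDA kernel.
def tiled_transpose(matrix, tile=2):
    """Transpose a rectangular list-of-lists one tile at a time."""
    rows, cols = len(matrix), len(matrix[0])
    out = [[None] * rows for _ in range(cols)]
    for row_start in range(0, rows, tile):
        for col_start in range(0, cols, tile):
            row_end = min(row_start + tile, rows)
            col_end = min(col_start + tile, cols)
            # Stage the tile in a scratch buffer (stand-in for shared memory).
            scratch = [matrix[r][col_start:col_end] for r in range(row_start, row_end)]
            # Write the staged tile back with row and column indices swapped.
            for dr, tile_row in enumerate(scratch):
                for dc, value in enumerate(tile_row):
                    out[col_start + dc][row_start + dr] = value
    return out


transposed = tiled_transpose([[1, 2, 3], [4, 5, 6]], tile=2)
```

On a GPU the point of staging is that both the read from the input and the write to the output can touch contiguous memory; the Python version only shows the index arithmetic, including the partial tiles at the edges.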

Compile without nccl deps. [1/2] (#22509) · a90fa540

Committed by Wilber on Feb 11, 2020

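Support compiling without the nccl dependency. [1/2]

On a multi-GPU machine, if Paddle is compiled without the WITH_NCCL switch enabled, the GPUs cannot communicate with each other, so only a single GPU can be used.
Co-authored-by: 石晓伟 <39303645+Shixiaowei02@users.noreply.github.com>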

a90fa540

10 Feb, 2020 3 commits
- Compile without nccl deps. [2/2] (#22484) · de009152
  Committed by Wilber on Feb 10, 2020
```
Compile without nccl deps. [1/2]
Co-authored-by: 石晓伟 <39303645+Shixiaowei02@users.noreply.github.com>
```
  de009152
- Fix dismatch of std::max's arguments type on windows. (#22507) · 4b2227e9
  Committed by Yiqun Liu on Feb 10, 2020
```
test=develop
```
  4b2227e9
- fix test_fusion_seqpool_concat lod level between compile and runtime (#22488) · 870f4658
  Committed by Wilber on Feb 10, 2020
  
  870f4658
07 Feb, 2020 4 commits
- Fix the integer overflow problem of sequence2batch (#22479) · a61d0952
  Committed by Zhong Hui on Feb 07, 2020
```
Fix the integer overflow problem in the sequence2batch op by changing int32_t to size_t
in /paddle/fluid/operators/math/sequence2batch.h#L122.
```
  a61d0952
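
Why a 32-bit index overflows where a 64-bit one does not can be shown with a short Python sketch. The sizes are hypothetical values picked for illustration, not taken from the commit, and `as_int32` is a helper written for this note that mimics two's-complement wraparound via `ctypes`.

```python
import ctypes

INT32_MAX = 2**31 - 1


def as_int32(value):
    """Wrap value the way a 32-bit signed integer would (two's complement)."""
    return ctypes.c_int32(value).value


# Hypothetical sizes for illustration only.
num_rows = 70_000
width = 40_000
offset = num_rows * width  # 2_800_000_000, which exceeds INT32_MAX

wrapped = as_int32(offset)               # what an int32_t offset would hold
widened = ctypes.c_uint64(offset).value  # what a 64-bit unsigned offset would hold
```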
- Add weight quantization in post_training_quanzitaion (#22445) · 197913eb
  Committed by cc on Feb 07, 2020
```
* support weight quantization in post_training_quanzitaion, test=develop
* add test for weight quantization, test=develop
```
  197913eb
- refine reshape_op shape error message (#22480) · 7c9ce097
  Committed by Tao Luo on Feb 07, 2020
```
test=develop
```
  7c9ce097
- optimize performance of interpolate op (#22436) · 2b1386b2
  Committed by LielinJiang on Feb 07, 2020
```
* optimize interpolate op, test=develop
```
  2b1386b2
06 Feb, 2020 1 commit

Correct the use of DeviceContext in unittest sequence_pooling_test and... · 44b45b9f

Committed by Yiqun Liu on Feb 06, 2020

Correct the use of DeviceContext in unittest sequence_pooling_test and sequence_padding_test (#22456)

* Add log in memory::Copy for debug purpose.

* Change to use context in DeviceContextPool directly in sequence_pooling_test, instead to new one.

* Change to use context in DeviceContextPool directly in sequence_padding_test, instead to new one.
test=develop

* Change the type of second_dim from size_t to int64_t.
test=develop

44b45b9f

05 Feb, 2020 2 commits

add WITH_NCCL option for cmake. (#22384) · 7bc4b095

Committed by Wilber on Feb 05, 2020

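Added a WITH_NCCL option to cmake that explicitly specifies whether the NCCL-related code is compiled. WITH_NCCL is ON by default, but it is turned OFF if WITH_GPU is OFF.

Added the PADDLE_WITH_NCCL definition.

A single-machine, single-GPU setup can be built with NCCL disabled. Multi-GPU setups need NCCL enabled, which is the default; if NCCL is disabled, only a single GPU can be used.
Co-authored-by: 石晓伟 <39303645+Shixiaowei02@users.noreply.github.com>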

7bc4b095
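
The option rules stated in this commit message can be modeled as a small truth table. This is a Python model of the described behavior only, not the CMake code from the commit; `resolve_with_nccl` and `usable_gpu_count` are hypothetical names.

```python
# Hypothetical model of the described build-option rules; not Paddle's CMake.
def resolve_with_nccl(with_gpu, with_nccl=True):
    """WITH_NCCL defaults to ON but is forced OFF when WITH_GPU is OFF."""
    return with_nccl and with_gpu


def usable_gpu_count(visible_gpus, with_gpu, with_nccl=True):
    """Without NCCL the GPUs cannot communicate, so at most one is usable."""
    if not with_gpu:
        return 0
    if resolve_with_nccl(with_gpu, with_nccl):
        return visible_gpus
    return min(visible_gpus, 1)


default_build = usable_gpu_count(4, with_gpu=True)                   # NCCL on
no_nccl_build = usable_gpu_count(4, with_gpu=True, with_nccl=False)  # NCCL off
cpu_only_build = usable_gpu_count(4, with_gpu=False)                 # no GPU at all
```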

fix sigmoid cudnn bug (#22439) · 943cb8c6

Committed by Tao Luo on Feb 05, 2020

* Sigmoid bug fix, test=develop

* fix code format

test=develop
Co-authored-by: Manjunath Bhat <manjunathbhat9920@gmail.com>

943cb8c6

Crayon鑫 / Paddle is in sync with the fork source project