Commits · fb7f85291ba79d4e89d28f32aae42ee700e425e1 · BaiXuePrincess / Paddle

26 Oct, 2020 · 2 commits
- fix print tensor place,add cpu/cuda/pin_memory API for Tensor (#28200) · fb7f8529
  Committed by Zhou Wei on Oct 26, 2020
- add sharding strategy in fleet(#27900) · 81244fbf
  Committed by mapingshuo on Oct 26, 2020
```
* add sharding
```
23 Oct, 2020 · 2 commits
- Add compile limit for PADDLE_ENFORCE without error message (#28221) · 2babd6ff
  Committed by Chen Weihang on Oct 23, 2020
```
* add compile limit for paddle enforce

* polish elementwise_op_function.cu.h

* fix failed unittest

* fix windows compile failed

* detail polish

* revert no type constructor
```
- use FLAGS_use_mkldnn to prevent unnecessary attrs copy (#28146) · 4ea23307
  Committed by lidanqing on Oct 23, 2020
22 Oct, 2020 · 4 commits
- fix wrong data type, test=develop (#28203) · 2db77be4
  Committed by Double_V on Oct 22, 2020
- fix strided_slice_op's GetExpectedKernelType (#28192) · efe6e284
  Committed by Feiyu Chan on Oct 22, 2020
```
* fix strided_slice_op's GetExpectedKernelType when input tensor is at CUDAPinnedPlace

* add unittest for tensors in cuda pinned place

* skip test for cuda pinned place on cpu machines
```
- Fix bug of fetch_async_op_handle when fetching the feed variable (#28194) · 1f3be859
  Committed by Leo Chen on Oct 22, 2020
```
* fix bug of fetch_async_op_handle

* revert some changes of test_buffer_shared_memory_reuse_pass

* revert some changes of test_buffer_shared_memory_reuse_pass
```
- Fix nccl op test failed, test=develop (#28172) · e450823b
  Committed by WangXi on Oct 22, 2020
21 Oct, 2020 · 6 commits
- [lite-xpu-subgraph] Fix xpu compile and test xpu ci. (#27932) · f935ca8a
  Committed by Wilber on Oct 21, 2020
- dygraph nccl init support host domain name (#28107) · f29fb396
  Committed by danleifeng on Oct 21, 2020
```
* nccl init support hostname and ip; test=develop
```
- support multiclass nms for multi-batch, test=develop (#28154) · 5cd97a1c
  Committed by wangguanzhong on Oct 21, 2020
- change avg pooling from trt plugin to trt layer (#28032) · 602d2ce5
  Committed by Pei Yang on Oct 21, 2020
- fix Wmaybe-uninitialized warning in pooling.cc, test=develop (#28126) · 5289b72a
  Committed by Double_V on Oct 21, 2020
- fix dynamic_loader more safe and error message on windows (#28117) · 5d700021
  Committed by Zhou Wei on Oct 21, 2020
20 Oct, 2020 · 8 commits
- fix generate_proposal_labels in cascade-rcnn series model, test=develop (#27892) · d1e1f174
  Committed by wangguanzhong on Oct 20, 2020
```
* fix generate_proposal_labels in cascade-rcnn series model, test=develop

* fix example code & unittest, test=develop

* update code from review comments, test=develop
```
- fill_constant op supports NaN and Inf (#28109) · a911c19e
  Committed by Leo Chen on Oct 20, 2020
```
* fill_constant supports nan and inf

* add ut
```
- randperm run error in multi-gpus (#27942) · 6dd64b0a
  Committed by zhupengyang on Oct 20, 2020
- add rois_num for roi_align xpu OP (#28077) · d43f75e4
  Committed by Double_V on Oct 20, 2020
```
* add stack pool2d roi_align xpu op,test=kunlun

* error message opt, test=kunlun

* add xpu unittest,test=kunlun

* skip check grad,test=kunlun

* fix boostget , test=kunlun

* error message opt for XPU, test=kunlun

* add rois_num for roi_align xpu OP, test=develop
```
- rm max_input in conv2d for kunlun, test=kunlun (#28062) · e3d02c95
  Committed by xiaoting on Oct 20, 2020
- Add AVX512 instruction check for C-API (#28087) · a21b5710
  Committed by joanna.wozna.intel on Oct 20, 2020
```
* Add AVX512 instruction check for C-API

* Fix formatting
```
- refine gpu kernel config for Paddle (#28085) · 463c72c2
  Committed by wangchaochaohu on Oct 20, 2020
- lookup_table_v2_op_xpu report errors;test=kunlun (#28064) · 2cb1ecb9
  Committed by yinhaofeng on Oct 20, 2020
```
* lookup_table_v2_op_xpu report errors;test=kunlun

* lookup_table_v2_op_xpu report errors;test=kunlun
```
19 Oct, 2020 · 13 commits

Committed by yinhaofeng on Oct 19, 2020

* lookup_table_xpu op report errors;test=kunlun

* add adam xpu op;test=kunlun

* reset lookup

* change adam wrong;test=kunlun

6f0c3d1f

Add xpu transpose2 op.test=kunlun (#28086) · a5c95cd5
Committed by TeslaZhao on Oct 19, 2020

Fix xpu error message (#28061) · 5f04875c
Committed by Chengmo on Oct 19, 2020
```
* fix error message,test=kunlun

* fix, test=kunlun
```
Fix diag OP bug on Windows Python3.8 · c8d32c8c
Committed by LutaoChu on Oct 19, 2020
```
Fix diag OP bug on Windows Python3.8, remove the std::min
```
reduce trt warning message (#28011) · a0b2f936
Committed by Pei Yang on Oct 19, 2020


Allclose op (#27891) · d4668938

Committed by huangxu96 on Oct 19, 2020

* Still has bugs.

* Fixed allclose_op bug, which cannot deal with some cases of fp64 inputs.

* improved CUDA kernel performance.

* Changed CUDA code.

* Fixed a bug in cuda kernel which cannot deal with large dimension input, and added an unittest for it.

* Add a test case for float32 input.


Fix error message of multinomial op (#27946) · 975bd887

Committed by pangyoki on Oct 19, 2020

* fix multinomial doc

* fix multinomial error message

* little doc change

* fix Categorical class doc

* optimize format of error message

* fix CPU Kernel error message format

* fix isinf and isnan error in WindowsOPENBLAS CI

* delete inf and nan

* add manual_seed in sample code

* little error message change

* change error message to InvalidArgument

* add full point for error message and add manual_seed in CPU environment


update yolo_box support h != w. test=develop (#27327) · b6eff442
Committed by Kaipeng Deng on Oct 19, 2020


error message opt for XPU, test=kunlun (#27972) · c1eed1fa

Committed by Double_V on Oct 19, 2020

* add stack pool2d roi_align xpu op,test=kunlun

* error message opt, test=kunlun

* add xpu unittest,test=kunlun

* skip check grad,test=kunlun

* fix boostget , test=kunlun

* error message opt for XPU, test=kunlun


Add truncated_gaussian_random XPU kernel (#27861) · 4c5b779a

Committed by pangyoki on Oct 19, 2020

* Add truncated_gaussian_random_op XPU kernel

* Add truncated_gaussian_random_op XPU kernel, test=kunlun

* little change, test=kunlun

* change boost_get to BOOST_GET_CONST

* change boost_get to BOOST_GET_CONST, test=kunlun

* little change, test=kunlun

* use Generator to generate random number and optimize format, test=kunlun

* little change, test=kunlun

* add TODO, test=kunlun


Add gaussian_random XPU kernels (#27853) · 5b8e5001

Committed by pangyoki on Oct 19, 2020

* Add gaussian_random XPU kernels

* commit kunlun, test=kunlun

* new version, test=kunlun

* change boost_get to BOOST_GET_CONST, test=kunlun

* use Generator to generate random number and optimize format, test=kunlun

* add TODO, test=kunlun


Add uniform_random XPU kernel (#27846) · 74ce0397

Committed by pangyoki on Oct 19, 2020

* support uniform_random op on Baidu Kunlun

* change dtype of attr shape from int to int64_t

* kunlun ci, test=kunlun

* new version, test=kunlun

* change boost_get to BOOST_GET_CONST

* change boost_get to BOOST_GET_CONST, test=kunlun

* use Generator to generate random number and optimize format

* run Kunlun CI, test=kunlun

* add TODO, test=kunlun


Polish kunlun error (#27974) · abf4d52a

Committed by xiaoting on Oct 19, 2020

* polish error message,test=kunlun

* polish error,test=kunlun

* polish error,test=kunlun

* polish error,test=kunlun


18 Oct, 2020 · 1 commit

add cast/concat/assign xpu op (#27911) · 3e956865

Committed by liuyuhui on Oct 18, 2020

* addd

* add cast_op_xpu, test=kunlun

* fix bug for cast_op_xpu,test=kunlun

* add concat_op_xpu, test=kunlun

* slove conflicts, test=kunlun

* fix bug,test=kunlun

* add assign_op_xpu, test=kunlun

* fix bug,test=kunlun

* test=kunlun;test=develop

* fix concat bug,test=kunlun

* fix check_dygraph set in test_concat_op_xpu.py,test=kunlun

* fix error message,test=kunlun
Co-authored-by: mapingshuo <mps2012@yeah.net>


16 Oct, 2020 · 4 commits

Incorporate cudnn_lstm into LSTM api (#27217) · fa9d3fa5

Committed by Guo Sheng on Oct 16, 2020

* Incorporate cudnn_lstm into LSTM api.
test=develop

* Make coalesce_tensor support alignment optionally.
test=develop

* Reorganize RNN apis. test=develop

* Fix cudnn rnn layout conversion.
test=develop

* Add sequence_length support for RNN cudnn implement.
Add optional init_h and init_c gradient for cudnn_lstm_op.
test=develop

* Use create_parameter for rnn cudnn impl.
test=develop

* Move `self._flat_weight = self.create_parameter()` in RNNBase to main_program.
test=develop

* Update RNN api unittest to use set_device.
test=develop

* Fix set_place for unit tests of RNN apis.
test=develop

* Fix use_align in coalesce_tensor_op.
test=develop

* Adjust RNN apis arguments according to comments.
test=develop

* Polish documents for SimpleRNN apis.
test=develop

* Refine random seed in cudnn_lstm_op.
Expose rnn params from sublayers to RNN.
test=develop

* Fix RNN saving for jit.save.
Refine cudnn_lstm dropout behavior.
test=develop

* Fix doc of GRU. test=develop

* Use ShareDataWith to avoid copying for cudnn_lstm_op test.
test=develop

* Remove updates on cudnn_lstm temporarily.
test=develop

* Use ShareDataWith to avoid copying for cudnn_lstm_op test.
test=develop

* Refine random seed in cudnn_lstm_op.
test=develop

* Fix test_lstm by adjust ConcreteProgram buffer getter.
test=develop

* Use create_parameter instead of create_var for rnn._flat_weight for static graph usage.
test=develop

* Remove W input for cudnn_lstm to pass unused_var_check.
test=develop

* Add test_predict for RNN unit tests coverage.
test=develop

* Fix code style of rnn.
test=develop

* Fix F.rnn usage in rnn.py.
test=develop


change paddle.fluid.layers.reduce_sum to paddle.sum in sample codes (#27998) · 05fd49e9
Committed by chentianyu03 on Oct 16, 2020
```
* change paddle.fluid.layers.reduce_sum to paddle.sum in sample codes

* format codes
```
error message optimization in mean_xpu,softmax_with_cross_entropy_op_xpu,test=kunlun (#27967) · f94d0537
Committed by Guanghua Yu on Oct 16, 2020


Fix xpu enforce (#27978) · d330cf66

Committed by Jack Zhou on Oct 16, 2020

* test=kunlun;

Add elementwise XPU OP kernel for KUNLUN core, including (but still cannot process common broadcast):

    * elementwise_div op
    * elementwise_max op
    * elementwise_mul op (with grad op)
    * elementwise_sub op (with grad op)

* 0.05->0.01

* add xpu error message description;test=kunlun

