Commits · 1bd9cfef4e27baa84fd40ed1e65e80017d0cf232 · PaddlePaddle / Paddle

08 Oct, 2021 2 commits
- Added oneDNN BF16 relu (#36265) · 1bd9cfef
  Authored by arlesniak on Oct 08, 2021
```
* Added oneDNN BF16 relu

* fixed typo

* refactored test, review fixes
```
  1bd9cfef
- fix cast cuda implementation (#36266) · 9814f895
  Authored by Zeng Jinle on Oct 08, 2021
  
  9814f895
07 Oct, 2021 2 commits
- fix bugs in HybridParallelClipGrad of hybrid_parallel_optimizer (#36237) · 730dcaf4
  Authored by Haohongxiang on Oct 07, 2021
```
* fix bugs in HybridParallelClipGrad of hybrid_parallel_optimizer

* update

* update
```
  730dcaf4
- [OneDNN] Conv op refactor. (#36252) · e9288340
  Authored by Adam Osewski on Oct 07, 2021
```
* Remove unused header.

* Use ConvMKLDNNHandlerT for conv2d INT8.

* Use absolute module path to import.
```
  e9288340
05 Oct, 2021 1 commit

Added concat BF16/FP32 BWD OneDNN kernel (#35889) · dc4d5719

Authored by jakpiase on Oct 05, 2021

* tmp

* added concat BF16/FP32 BWD oneDNN kernel

* minor change

* minor change

* fix for CI

* added formatting

* Reverted deleting static keyword

* added reviewers suggestions

* reverted deleting concat bf16 test file

* fixed concat tests

dc4d5719

04 Oct, 2021 1 commit
- added Piotr to authors.md and updated Intel-related paddle authors image (#36254) · 2cee0ea7
  Authored by jakpiase on Oct 04, 2021
  
  2cee0ea7
30 Sep, 2021 6 commits
- add slotrecord datafeed (#36099) · 0a3dbe8a
  Authored by yaoxuefeng on Sep 30, 2021
  
  0a3dbe8a
- fix yolo (#36240) · c12176e8
  Authored by wenbin on Sep 30, 2021
  
  c12176e8
- add test_hessian time out (#36234) · 56b04bc1
  Authored by levi131 on Sep 30, 2021
  
  56b04bc1
- [NPU] modify transpose2 and index_select_grad kernels for model xlnet (#36214) · a66b9fba
  Authored by Aganlengzi on Sep 30, 2021
```
* [NPU] modify transpose2 and index_select_grad kernels for model xlnet

* add transpose2 int64_t unit test

* add more transpose2 unit tests

* update test_transpose_op_npu.py
```
  a66b9fba
- Fix raw optim (#36176) · 5e0f199a
  Authored by 李季 on Sep 30, 2021
```
* fix raw optim

* pre-commit test file
Co-authored-by: sneaxiy <sneaxiy@126.com>
```
  5e0f199a
- fix the undefined variable bug in dist_transformer file (#36211) · 8af939f1
  Authored by 李季 on Sep 30, 2021
  
  8af939f1
29 Sep, 2021 20 commits

Add basic support for CUDA Graph (#36190) · 21b93c3d

Authored by Zeng Jinle on Sep 29, 2021

* add basic support for CUDA Graph

* fix ci compile error

* fix LOG print, fix windows CI

* follow comments and update

* small fix for default ctor

* fix rocm compile error

* fix CPU compile error

21b93c3d

add optest for adamw (#36148) · 69eed34d
Authored by zhaoyingli on Sep 29, 2021
```
* update func name

* skip cpu

* update unittest

* update unittest
```
69eed34d
fix cusparse compile problem, test=develop (#36199) · 3eb50715
Authored by Liu-xiandong on Sep 29, 2021
```
* fix cusparse compile problem, test=develop

* Modify file permissions
```
3eb50715

Add functional autograd API:hessian (#36108) · 1f93582c

Authored by levi131 on Sep 29, 2021

* init functional jacobian api

* finish test with dtype float32

* add float64 test case

* polish code

* use atol=1e-5 with dtype float64

* fix for ci

* set timeout for test_jacobian

* init hessian API

* save status

* polish API docstring

* modify docstring

* add utils.py

* save status

* fix dygraph double grad dtype error when calling for high differential senario

* reinvoke ci

* test_hessian.py is ok

* polish hessian API

* init vhp

* Revert "init vhp"

This reverts commit cbd4d3b66abe82b0ac10721b9eddeb7d82e0a1c8.

* add test for partial_engine.cc

* modify numerical_delta with dtype float32

* merge fix for dtype float64

* spell fix

* polish code

* rm _stop_gradient_pre_process
Co-authored-by: JiabinYang <360788950@qq.com>

1f93582c

Spinlock (#36030) · a9ea41c5
Authored by liutiexing on Sep 29, 2021
```
* add align for WorkQueue

* add spinlock

* merge spinlock
```
a9ea41c5
add slot record dataset (#36200) · 79bd5f90
Authored by yaoxuefeng on Sep 29, 2021

79bd5f90
[npu] add box coder (#36171) · 83578cfa
Authored by zhulei on Sep 29, 2021
```
* [npu] add box coder

* [npu] add box coder
```
83578cfa
fix bug of top_k npu op (#36175) · 2b8fd704
Authored by pangyoki on Sep 29, 2021

2b8fd704

[NPU] Add group norm (#35937) · c79de728

Authored by zhulei on Sep 29, 2021

* [NPU] Add group norm

* [NPU] Add group norm

* [NPU] Add group norm

* [NPU] Add group norm

* [NPU] Add group_norm op

c79de728

[NPU] mod for model bert (#36165) · 7bddf2e8

Authored by Aganlengzi on Sep 29, 2021

* merge conflict of paddle_gtest_main.cc

* modify FLAGS_npu_precision_mode and default not to call aclSetCompileopt

7bddf2e8

[hybrid] Fix model parallel non-distributed param broadcast (#36186) · bec9fc9a
Authored by WangXi on Sep 29, 2021

bec9fc9a

Add op paddle.device.cuda.get_device_name and paddle.device.cuda.get_device_capability. (#35672) · f703558d

Authored by hlygit66666 on Sep 29, 2021

* add op paddle.device.cuda.get_device_name

* fix some bugs

* fix some bugs

* fix error message bugs

* fix en docs

* fix bugs

* fix bugs

* fix bugs

* add error message test case

* add get_device_name and get_device_capability

* fix review

* fix docs bug

* fix docs

* fix docs

f703558d

fix paddle.device.cuda.get_device_properties doc (#36178) · 6d4435ac

Authored by Yanxing Shi on Sep 29, 2021

* Initial Commit

* add unittest and add error information

* modify doc

* fix some error

* fix some word

* fix bug cudaDeviceProp* and modify error explanation

* fix cudaDeviceProp* error and unnitest samples

* fix hip error and PADDLE_WITH_HIP

* update style

* fix error is_compiled_with_cuda

* fix paddle.device.cuda.get_device_properties

* fix error for multi thread safe

* update style

* merge conflict

* modify after mentor review

* update style

* delete word

* fix unittest error for windows

* support string input and modify some code

* modify doc to support string input

* fix error for express information

* fix error for express information

* fix unnitest for windows

* fix device.startswith('gpu:')

* format error and doc

* fix after review

* format code

* fix error for doc compile

* fix error for doc compile

* fix error for doc compile

* fix error for doc compile

* fix error for doc compile

* fix py2 error

* fix wrong words and doc

* fix _gpuDeviceProperties

* test=document_fix

6d4435ac

Implement the grad and enhance the cache of norm_convolution fusion ops. (#36168) · 767050d9
Authored by Yiqun Liu on Sep 29, 2021

767050d9
remove wait if no fetch (#36150) · b3d2dc7b
Authored by Zeng Jinle on Sep 29, 2021

b3d2dc7b
fix nullptr block in op_teller (#36197) · 667bf188
Authored by baoachun on Sep 29, 2021

667bf188
refine case when thread_num = 1 (#36201) · 7e60cc63
Authored by Zeng Jinle on Sep 29, 2021

7e60cc63
Add fused_dropout wrapper to ease use. (#36185) · 092d45c3
Authored by Li Min on Sep 29, 2021

092d45c3
[ROCM] bugfix for bilinear_interp_v2_grad (#36160) · 5e1d0b5c
Authored by ronnywang on Sep 29, 2021

5e1d0b5c
fix flags approval (#36192) · 1b1210ea
Authored by Zeng Jinle on Sep 29, 2021

1b1210ea

28 Sep, 2021 8 commits

add roi_align (#35102) · f068e08d
Authored by Feng Ni on Sep 28, 2021
```
* add roi_align in vision/ops.py
```
f068e08d
Add sparse_attention api, test=develop (#35676) · 6b587e93
Authored by Liu-xiandong on Sep 28, 2021
```
Add sparse_attention OPs, python api will be added in next pr
```
6b587e93

add API paddle.linalg.eig (#35674) · bc7e2b92

Authored by Lijunhui on Sep 28, 2021

* Add paddle.linalg.eig op

* remove comments

* remove comments

* extend batch_size to the origin

* add real times complex functor & destroy the backward complex output bug

* terminate output diff when input real tensors

* correct tiny doc errors

* move functions from eig_helper to svd_helper and remove eig_helper

* remove tensor.Resize

* remove no longer used code

* use existing lapack functions

* reply review comments 21/27

* remove .cu as this op is only executed on CPU

* remove const_cast & add const in argument list for read-only references

* fix sample code error in CI

* remove template typename Tbase and more

* remove eig exposure in paddle.*

* add 'name=None' in eig python implementation

* handle the unittest

* try to solve the unittest

* solve CI coverage

* remove no longer used code

* polish API doc and more

* reply review comments

* polish unittest, commit plan B

* polish unittest

bc7e2b92

[ROCM] bugfix for arg_min_max (#36098) · 36791fdd
Authored by ronnywang on Sep 28, 2021

36791fdd
[HeterPs]ps gpu dump (#36157) · 97d30602
Authored by Thunderbrook on Sep 28, 2021
```
* ps gpu dump

* remove log
```
97d30602

[hybrid] seed and dropout op support force-cpu (#35820) · 58c8f6b3

Authored by xiayanming on Sep 28, 2021

* [HIP] fix op not support AMD GPU bug, the flag PADDLE_WITH_ROCM is invalid

* [HIP] fix op not support AMD GPU bug, the flag PADDLE_WITH_ROCM is invalid

* [HIP] fix op not support AMD GPU bug

* [hybrid] seed and dropout op support force-cpu

* [hybrid] seed and dropout op support force-cpu

* [hybrid] seed and dropout op support force-cpu

* [hybrid] seed and dropout op support force-cpu

* [hybrid] seed and dropout op support force-cpu

* [hybrid] fix seed ci failed issue

* add AsExtra for force_cpu of seed op

58c8f6b3

remove new linalg api in paddle.__init__ (#36151) · 3bb4715e

Authored by zhiboniu on Sep 28, 2021

remove recent linalg api in paddle.init;
add args 'name' in some new linalg api interface
same change in develop branch to #36112

3bb4715e

【Bug fix】Fix dygraph double grad dtype error (#36125) · af4f018a

Authored by Jiabin Yang on Sep 28, 2021

* fix dygraph double grad dtype error when calling for high differential senario

* reinvoke ci

* add test for partial_engine.cc

af4f018a

PaddlePaddle / Paddle synced successfully about 1 year ago
