Commits · 83932715d17a9b8b4ea8362273d6872c30935ebe · BaiXuePrincess / Paddle

14 Sep, 2021 3 commits

Add api paddle.device.cuda.empty_cache to release idle gpu memory hold by allocator。 (#35427) · 83932715

Committed by chenenquan on Sep 14, 2021

* Add empty_cache api to release idle gpu memory hold by allocator,test=develop

* Add empty_cache api to release idle gpu memory hold by allocator,test=develop

* Add empty_cache api to release idle gpu memory hold by allocator,test=develop

* Fix test coverage problem for empty_cache

* delete redundant check for empty_cache

* fix the problem of empty_cache's doc

* delete the nvidia-smi comment in doc of empty_cache, test=document_fix

83932715

[Inference] Add tuned trt_dynamic_shape mode. (#34806) · 7c96efed
Committed by Wilber on Sep 14, 2021

7c96efed
slice_op support bool tensor. (#35586) · f5e430c5
Committed by WeiXin on Sep 14, 2021

f5e430c5

13 Sep, 2021 20 commits
- fix bug, test=document_fix (#35697) · 97a73e1d
  Committed by YUNSHEN XIE on Sep 13, 2021
  
  97a73e1d
- Change uts to nightly mode (#35541) · 2b0f9b51
  Committed by YUNSHEN XIE on Sep 13, 2021
```
* Change uts to nightly mode

* remove test_trt_pool_op from parallel_UT_rule.py,test=document_fix
```
  2b0f9b51
- fix instance norm index error (#35341) · e641c638
  Committed by ceci3 on Sep 13, 2021
```
* fix instance norm index error

* add unittest

* update

* fix
```
  e641c638
- refine svd; unexpose tensor.svd; fix english document; set timeout=40 (#35635) · f521a30d
  Committed by xiongkun on Sep 13, 2021
  
  f521a30d
- [RC22] Fix linear with matmul_op replace (#35445) · 53e294ca
  Committed by zhulei on Sep 13, 2021
```
* [RC22] Fix linear with matmul_op replace

* [RC22] Fix linear with matmul_op replace

* [RC22] Fix linear with matmul_op replace

* [RC22] Fix linear with matmul_op replace

* [RC22] Fix linear with matmul_op replace
```
  53e294ca
- add flatten/flatten2 converter test cases (#35462) · fb65268c
  Committed by baoachun on Sep 13, 2021
```
* add flatten/flatten2 converter test cases

* add fatten/flatten2 trt converter test cases
```
  fb65268c
- [Bugfix] reshape with zero input tensor (#35642) · cabc5f36
  Committed by JZ-LIANG on Sep 13, 2021
```
* reshape support zero-input

* add unitest

* revise error message
```
  cabc5f36
- upload global scatter and global gather operators related files (#35546) · ecfe8375
  Committed by 李季 on Sep 13, 2021
```
* upload global scatter and global gather operators related files
```
  ecfe8375
- Support int16_t in fill_constant_op (#35619) · 4b6f8099
  Committed by Zhang Zheng on Sep 13, 2021
  
  4b6f8099
- add gather trt converter test case (#35523) · 75d5e3bf
  Committed by baoachun on Sep 13, 2021
  
  75d5e3bf
- add gather_nd trt converter test cases (#35464) · 42559f72
  Committed by baoachun on Sep 13, 2021
  
  42559f72
- [NPU] add npu unit test if title has NPU key word, test=develop (#35566) · 666da145
  Committed by Qi Li on Sep 13, 2021
  
  666da145
- Add searchsorted op (#35159) · 66223048
  Committed by Yanxing Shi on Sep 13, 2021
```
* fix github name

* fix CI error

* fix review and CI error

* fix inf,nan error and modify unittest samples

* add unittest samples

* add unittest samples

* fix unittest error

* test=document_fix

* test=document_fix

* modify doc and add unittest samples

* fix error newline in constant

* modify doc after mentor review

* modify __all__ and doc

* modify doc
```
  66223048
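For readers unfamiliar with the operation named in the commit above: a searchsorted operation returns, for each query value, the index at which it could be inserted into a sorted sequence while keeping it sorted. The sketch below is plain Python built on the standard `bisect` module, not the Paddle op; the helper name and its `right` flag are assumptions made for illustration.

```python
from bisect import bisect_left, bisect_right


def searchsorted(sorted_seq, values, right=False):
    """Return, for each value, an insertion index that keeps sorted_seq sorted."""
    # right=False picks the leftmost valid index, right=True the rightmost.
    find = bisect_right if right else bisect_left
    return [find(sorted_seq, v) for v in values]


left_indices = searchsorted([1, 3, 5, 7], [3, 6])               # [1, 3]
right_indices = searchsorted([1, 3, 5, 7], [3, 6], right=True)  # [2, 3]
```

The two calls differ only for values already present in the sequence (here `3`), where `right=True` places the new element after the existing one.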
- [HybridParallel]Fix scaler bug in pipeline_parallel/model_parallel (#35556) · 2bb44317
  Committed by ShenLiang on Sep 13, 2021
```
* support grad group

* fix single card condition
```
  2bb44317
- add group_norm trt converter test case (#35524) · 787209f7
  Committed by baoachun on Sep 13, 2021
```
* add group_norm trt converter test case

* update group_norm trt converter test case
```
  787209f7
- Revert "change '/' method from scale Op to elementwise_div Op (#33279)" (#35650) · 03026cea
  Committed by chentianyu03 on Sep 13, 2021
```
This reverts commit ae93d9c2.
```
  03026cea
- catch dimentions error when input is empty in static.nn.group_norm (#35613) · 7b743ba2
  Committed by JYChen on Sep 13, 2021
  
  7b743ba2
- support hybrid parallel inference helper class (#35576) · dc3c845a
  Committed by Guoxia Wang on Sep 13, 2021
```
* support hybrid parallel inference helper class
```
  dc3c845a
- [ROCM] fix top_k_v2 with large shape (#33783) · b8c6e180
  Committed by zhulei on Sep 13, 2021
```
* [ROCM] fix top_k_v2 with large shape

* [ROCM] fix top_k_v2 with large shape
```
  b8c6e180
- Added clip BF16/FP32 FWD/BWD kernels (#35601) · 4e233712
  Committed by jakpiase on Sep 12, 2021
```
* implemented clip op bf16/fp32

* added skipping if not cpu or bf16

* CI rerun after bf16 package change

* added parentheses to ensure formatting
```
  4e233712
11 Sep, 2021 3 commits
- register the with_quant_attr attribute for all operattor. test=develop (#35591) · 8412d6c0
  Committed by 王明冬 on Sep 11, 2021
  
  8412d6c0
- Add cpu npu cembedding (#35467) · ec252914
  Committed by Baibaifan on Sep 11, 2021
  
  ec252914
- re-submit softmax_with_cross_entropy hard label (#35283) (#35660) · 4f4962cb
  Committed by Feng Xing on Sep 11, 2021
  
  4f4962cb
10 Sep, 2021 12 commits
- change metaclass of Layer from pybind11_builtins.pybind11_type to type (#35538) · 523f46fe
  Committed by Leo Chen on Sep 10, 2021
```
* change metaclass of Layer from pybind11_builtins.pybind11_type to type

* fix cast

* add ut
```
  523f46fe
- test=document_fix (#35655) · 49e243c9
  Committed by Feng Xing on Sep 10, 2021
  
  49e243c9
- re-submit softmax_with_cross_entropy hard label (#35283) · a4b67f78
  Committed by Feng Xing on Sep 10, 2021
  
  a4b67f78
- add api_op fill_diagonal_tensor (#34515) · 98d047d7
  Committed by zhiboniu on Sep 10, 2021
  
  98d047d7
- add cumprod op (#35185) · 4e509f46
  Committed by hlygit66666 on Sep 10, 2021
```
* add test_cumprod_op

* Revert "add test_cumprod_op"

This reverts commit c96cf6dff5d09ae7d8cc72c1e8ae4369a153aa19.

* recommit

* add error message

* test input(x) initialize

* test use cpu

* update test code

* add test type

* add test case

* solve ci problem

* add complex case test

* add complex case test

* fix review problem

* fix conflict

* fix some docs

* change test case

* change test case

* fix review problems again

* fix docs

* fix inclusivescan bug
```
  4e509f46
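As background for the commit above, a cumulative product is an inclusive scan: element `i` of the output is the product of input elements `0` through `i`. The following is a minimal plain-Python sketch using the standard library, not the Paddle kernel; the helper name `cumprod` is reused here only for illustration.

```python
from itertools import accumulate
from operator import mul


def cumprod(values):
    """Inclusive cumulative product: out[i] = values[0] * ... * values[i]."""
    return list(accumulate(values, mul))


running = cumprod([1, 2, 3, 4])  # [1, 2, 6, 24]
```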
- Support float16 when using ClipGradByGlobalNorm. (#33565) · 5bdca05b
  Committed by huangxu96 on Sep 10, 2021
```
This PR supports gradient clip (ClipGradByGlobalNorm) when training with AMP(auto mixed precision).
```
  5bdca05b
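To illustrate the kind of clipping the commit above refers to: clipping by global norm rescales every gradient by one shared factor whenever the combined L2 norm of all gradients exceeds a threshold. The sketch below is plain Python on lists of floats, not Paddle's implementation and not AMP-aware; the function name, the list-of-lists representation, and the `clip_norm` parameter are assumptions made for illustration.

```python
import math


def clip_by_global_norm(grads, clip_norm):
    """Scale all gradients so their combined L2 norm is at most clip_norm.

    grads is a list of flat lists of floats (a hypothetical stand-in for tensors).
    """
    # Global norm: square root of the sum of squares over every gradient.
    global_norm = math.sqrt(sum(g * g for grad in grads for g in grad))
    # The factor is exactly 1 when the norm is already within the limit.
    scale = clip_norm / max(global_norm, clip_norm)
    return [[g * scale for g in grad] for grad in grads]


# Global norm of [[3.0], [4.0]] is 5, so every entry is scaled by 1/5.
clipped = clip_by_global_norm([[3.0], [4.0]], clip_norm=1.0)
```

Because one factor is applied to all gradients, their relative directions are preserved; only the overall magnitude changes.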
- add prelu trt converter test case (#35512) · 749945b3
  Committed by baoachun on Sep 10, 2021
  
  749945b3
- change trt_tile_op half diff and add some func for CE (#35597) · 922e23bf
  Committed by feng_shuai on Sep 10, 2021
  
  922e23bf
- add elementwise trt converter test cases (#35552) · 29cacee4
  Committed by baoachun on Sep 10, 2021
  
  29cacee4
- Fix scatter and gather bug (#35595) · 6f7aca9e
  Committed by Zeng Jinle on Sep 10, 2021
```
* fix scatter gather bug:

* fix windows ci
```
  6f7aca9e
- conv3d (#35507) · 42847d2e
  Committed by wenbin on Sep 10, 2021
```
* conv3d

* remove const_cast

* modify ut

* disable dynamic shape for trt6.0

* remove trt5
```
  42847d2e
- add asExtra for nce op (#35474) · 512329b0
  Committed by pangyoki on Sep 10, 2021
```
* add asExtra for nce op

* fix unittest error in macos

* remove asExtra for is_test
```
  512329b0
09 Sep, 2021 1 commit

Add matrix_rank Op and it's GPU and CPU kernel (#34823) · eb1fbf12

Committed by 0x45f on Sep 09, 2021

* init matrix_rank op, add matrix_rank CPU code and test

* add GPU kernel, remove svd_eigen.h

* add CPU kernel when tol is tensor

* add cpu and gpu code when tol is tensor

* fix CI-ROCM error

* add matrix_rank API describe, fix PR-CI-Py3 error

* fix PR-CI-Windows error, add matrix_rank API test

* delete useless comments

* fix review

* add my code in svd_helper.h

* update doc commets

* remove spaces

eb1fbf12

08 Sep, 2021 1 commit
- add API Tensor.T for reverse dim of Tensor (#35379) · 2133f3dd
  Committed by zhouweiwei2014 on Sep 08, 2021
  
  2133f3dd

BaiXuePrincess / Paddle is in sync with the fork's source project
