Commits · 3a81805bbce4ea4a01a81d06c626c20fa9cfeed9 · PaddlePaddle / Paddle

03 Mar, 2021 1 commit

[ROCM] update fluid operators for rocm (part6), test=develop (#31301) · 946dbdae

Committed by Qi Li on Mar 03, 2021

21 Sep, 2020 1 commit

Optimize argsort Op performance on GPU · f11a53ee

Committed by LutaoChu on Sep 21, 2020

* argsort op acceleration on GPU when the input size is equal to the length of the ‘axis’ dimension


20 Apr, 2020 1 commit

Optimize the error messages of paddle CUDA API (#23816) · 78170037

Committed by Zhou Wei on Apr 20, 2020

* Optimize the error messages of paddle CUDA API, test=develop

* fix the error messages of paddle CUDA API, test=develop

* Refactoring PADDLE_ENFORCE_CUDA_SUCCESS, and apply to curand/cudnn/cublas/NCCL,test=develop

* remove build_ex_string,test=develop

* merge conflict,test=develop


10 Jan, 2020 1 commit

add backward gradient computation for op argsort (#22203) · 443a713c

Committed by FlyingQianMM on Jan 10, 2020

* add backward gradient computation for op argsort test=develop

* use pre-commit test=develop

25 Dec, 2019 1 commit

add register op_data_type of pad/expand_as et al. (#21718) · 5cb2c741

Committed by Aurelius84 on Dec 25, 2019

* add register op_data_type test=develop

* fix register bug in isfinite op test=develop

* rm int int64_t in pad2d gradKernel  test=develop


29 Nov, 2019 1 commit

Add descending for argsort (#21400) · b1627455

Committed by zhaoyuchen2018 on Nov 29, 2019

* Add ascending for argsort

* Refine api doc description.

* Refine descending description

* Add int32 logic to speedup when data is small size.

* Remove int32 opt as it is not supported in Python


25 Nov, 2019 1 commit

Improve argsort performance. (#21267) · 08c19c58

Committed by zhaoyuchen2018 on Nov 25, 2019

* Improve argsort performance.

- Running argsort on 200000 elements on a V100 is ~190x faster:
  before optimization: 0.53s
  after optimization: 0.0027s

- Add fp16 support

* Refine error message
* Refine code

test=develop
Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>


30 Aug, 2019 1 commit

remove unused assert.h (#19529) · 02270b3e

Committed by Tao Luo on Aug 30, 2019

test=develop

17 Jun, 2018 1 commit

Avoid using dynamic array in cuda kernel · 92cfa2be

Committed by Yibing Liu on Jun 17, 2018

12 Jun, 2018 3 commits

Support more negative axes in argsort_op · 94e72ea6

Committed by Yibing Liu on Jun 12, 2018

Compute target index on gpu · 42645ff7

Committed by Yibing Liu on Jun 12, 2018

Add gpu kernel for argsort op · 6ee22c4f

Committed by Yibing Liu on Jun 12, 2018

PaddlePaddle / Paddle: last synced successfully more than a year ago
