提交 · 93d017873f53eacaa88f2d91f2769cf269aec1b0 · PaddlePaddle / Paddle

30 3月, 2023 1 次提交

由 huangjiyi 提交于 3月 30, 2023

* update assign_pos

* update attention_lstm

* update barrier

* update batch_fc

* update beam_search

* update beam_search_decode

* update bilateral_slice

* fix bug

* Handle Structure kernel for InterpreterCore::RunOperator

* fix bug

* fix rocm compile

* fix rocm compile

* Revert "fix rocm compile"

* test

* revert test and update cmake

---------
Co-authored-by: Nchenruibiao <chenruibiao@baidu.com>

93d01787

19 8月, 2022 1 次提交

Support beam search decode op in XPU environment (#44917) · adaffb7b

由 mengqingchun02 提交于 8月 19, 2022

* support beam_search operator on xpu. test=kunlun

* support beam_search operator on xpu. test=kunlun

* support beam_search operator on xpu. test=kunlun

* support beam_search operator on xpu. test=kunlun

* support beam_search operator on xpu. test=kunlun

* support beam_search operator on xpu. test=kunlun

* support beam_search operator on xpu. test=kunlun

* fix beam_search operator bugs on xpu. test=kunlun

* fix beam_search operator bugs on xpu. test=kunlun

* fix beam_search operator bugs on xpu. test=kunlun

* fix beam_search operator bugs on xpu. test=kunlun

* support beam_search_decode operator on xpu. test=kunlun

* support beam_search_decode operator on xpu. test=kunlun

* support beam_search_decode operator on xpu. test=kunlun

* support beam_search_decode operator on xpu. test=kunlun

* support beam_search_decode operator on xpu. test=kunlun

* support beam_search_decode operator on xpu. test=kunlun

* support beam_search_decode operator on xpu. test=kunlun

* support beam_search_decode operator on xpu. test=kunlun

* support beam_search_decode operator on xpu. test=kunlun

* support beam_search_decode operator on xpu. test=kunlun

* support beam_search_decode operator on xpu. test=kunlun

* support beam_search_decode operator on xpu. test=kunlun

* support beam_search_decode operator on xpu. test=kunlun

* support beam_search_decode operator on xpu. test=kunlun

* support beam_search_decode operator on xpu. test=kunlun

adaffb7b

02 8月, 2022 1 次提交

support beam_search operator on xpu. test=kunlun (#44720) · 9bf80772

由 mengqingchun02 提交于 8月 02, 2022

* support beam_search operator on xpu. test=kunlun

* support beam_search operator on xpu. test=kunlun

* support beam_search operator on xpu. test=kunlun

* support beam_search operator on xpu. test=kunlun

* support beam_search operator on xpu. test=kunlun

9bf80772

05 6月, 2022 1 次提交
- S
  
  【code format check upgrade】 step2：clang-format (#42840) · a3730dc8
  由 Sing_chan 提交于 6月 05, 2022
  
  a3730dc8
15 9月, 2021 1 次提交

[NPU] add beam_search npu op (#34860) · 3760be06

由 pangyoki 提交于 9月 15, 2021

* add beam_search npu op

* fix CMakeList and add unittest

* fix bug of beam search npu op

* fix unittest

* let input ids become int64

* set output ids to int64_t

* delete check_dygraph

* fix beam_width=1

3760be06

24 1月, 2019 1 次提交

Add the CUDA kernel for beam_search op (#15020) · 3008fa12

由 Yiqun Liu 提交于 1月 24, 2019

* Refine the beam_search op and test.

* A basic CUDA implementation of beam_search for small batch_size.

* Implement CUDA kernel for beam_search_op.

* Use multiple CUDA threads in the same block to select the top beam.

* Update the python api of beam_search op.

* Enable extend function in CPU kernel of beam_search op.

* Unify the CUDA codes.
test=develop

* Unify the CPU kernel of beam_search op.

* Ensure the seletced items of beam_search_op's CPU kernel sorted by scores.

* Update the description of beam_search in API.spec.

* Enable the use of CUDA kernel in beam_search op.

* Exclude the beam_search's CUDA unittest when there is no CUDA gpu, and delete some debuging statements.
test=develop

* Follow comments.
test=develop

* Call the CPU kernel for beam_search op when batch_size > 4.
test=develop

* Remove the except of is_empty op in PrepareData.
test=develop

3008fa12

26 12月, 2018 1 次提交

Fp16 training (#14992) · 856f0da0

由 Wu Yi 提交于 12月 26, 2018

* wip

* wip

* wip

* wip for test

* add fp16 tests test=develop

* fix cpu build test=develop

* fix test=develop

* fix py3 tests test=develop

* fix lr_scheduler dtype test=develop

* fix test=dvelop

* test fix ci compile test=develop

* fix build and merge test=develop

* fallback momentumop change to general test=develop

* make fp16 lr schedule simple test=develop

* fix ut test=develop

* fix tests test=develop

* remove fp16 learning rate cast test=develop

856f0da0

20 12月, 2018 2 次提交

T
Revert "[Feature] Fp16 training for resnet50 (#14850)" · da87f7a6
由 typhoonzero 提交于 12月 20, 2018
```
This reverts commit 3d750f9c.
```
da87f7a6

[Feature] Fp16 training for resnet50 (#14850) · 3d750f9c

由 Wu Yi 提交于 12月 20, 2018

* wip

* wip

* wip

* wip for test

* add fp16 tests test=develop

* fix cpu build test=develop

* fix test=develop

* fix py3 tests test=develop

* fix lr_scheduler dtype test=develop

* fix test=dvelop

* test fix ci compile test=develop

* fix build and merge test=develop

* fallback momentumop change to general test=develop

3d750f9c

16 11月, 2018 1 次提交

Refine operator cmake (#14413) · a2d9b344

由 Wu Yi 提交于 11月 16, 2018

* wip simplify operator framework

* wip

* wip

* done test=develop

* clean test=develop

* fix test=develop

* fix deps test=develop

* fix cpu build test=develop

* fix tensorrt build test=develop

* fix tests test=develop

* fix test=develop

* fix cpu build test=develop

a2d9b344

17 10月, 2018 1 次提交
- D
  
  use binary search. test=develop · a9f5f822
  由 dzhwinter 提交于 10月 17, 2018
  
  a9f5f822
15 10月, 2018 1 次提交

Add check for opt op (#13840) · 8e2fdc54

由 chengduo 提交于 10月 15, 2018

* add check for opt op

* fix opt op
test=develop

* fix test fail
test=develop

* fix optimization doc
test=develop

* test=develop

8e2fdc54

14 10月, 2018 1 次提交
- D
  
  add sparse update momentum. test=develop · 8329a1f1
  由 dzhwinter 提交于 10月 14, 2018
  
  8329a1f1
20 7月, 2018 1 次提交
- Q
  Fix serious bug in nesterov momentum optimizer. (#12231) · 873a50ce
  由 qingqing01 提交于 7月 20, 2018
```
* Fix serious bug in nesterov momentum optimizer.
```
  873a50ce
12 4月, 2018 1 次提交
- S
  Fix cpplint errors for a set of operators (#9837) · 8d3ce01f
  由 Siddharth Goyal 提交于 4月 11, 2018
```
* Fix cpplint errors, round2

* Fix pointer issue
```
  8d3ce01f
12 2月, 2018 1 次提交
- Q
  
  Fix the grammar in copyright. (#8403) · 24509f4a
  由 qingqing01 提交于 2月 12, 2018
  
  24509f4a
10 2月, 2018 2 次提交
- Y
  
  Correct #include path · fc374821
  由 Yi Wang 提交于 2月 09, 2018
  
  fc374821
- Y
  
  Move file to fluid/; Edit CMakeLists.txt · 90648f33
  由 Yi Wang 提交于 2月 09, 2018
  
  90648f33
26 12月, 2017 1 次提交
- L
  
  unify the indentation of license · 761b3297
  由 Luo Tao 提交于 12月 26, 2017
  
  761b3297
12 12月, 2017 1 次提交

Refine device context (#6433) · 61ec0b95

由 QI JUN 提交于 12月 12, 2017

There are mainly following fixes:

- take `DeviceContext` as the template parameter of math functors and OpKernel instead of `Place`
- remove `eigen_device` interface in base class  `DeviceContext`
- remove `GetEigenDevice` interface in `ExecutionContext` and base class `DeviceContext`
- remove unused `platform::EigenDeviceConverter`
- rename `REGISTER_OP_GPU_KERNEL` to `REGISTER_OP_CUDA_KERNEL`
- rename `USE_GPU_ONLY_OP` to `USE_CUDA_ONLY_OP`

61ec0b95

05 12月, 2017 2 次提交
- D
  
  Update cuda kernel and doc. · d432b10d
  由 dangqingqing 提交于 12月 05, 2017
  
  d432b10d
- D
  
  Refine and speedup momentum operator. · 5bd1e73f
  由 dangqingqing 提交于 12月 05, 2017
  
  5bd1e73f
03 10月, 2017 1 次提交
- S
  
  Add momentum operator · d28b3094
  由 sidgoyal78 提交于 10月 02, 2017
  
  d28b3094
07 8月, 2017 1 次提交
- D
  
  "remove alias to more operators" · 6b23b91c
  由 dongzhihong 提交于 8月 07, 2017
  
  6b23b91c
04 8月, 2017 1 次提交
- L
  
  Add cpplint for *.h and cuda *.cu · b58725bd
  由 liaogang 提交于 8月 04, 2017
  
  b58725bd
31 7月, 2017 1 次提交
- Q
  
  add EIGEN_USE_GPU macro to op.cu file · 61f94f00
  由 qijun 提交于 7月 31, 2017
  
  61f94f00
25 7月, 2017 1 次提交
- Y
  Add type_alias to import framework into ops · efc119b4
  由 Yu Yang 提交于 7月 25, 2017
```
Make implement an operator less noisy.
```
  efc119b4
19 7月, 2017 1 次提交
- Q
  Add sgd op (#2950) · e3b27d19
  由 Qiao Longfei 提交于 7月 19, 2017
```
* a simplest SGD op
```
  e3b27d19

PaddlePaddle / Paddle 大约 1 年 前同步成功

PaddlePaddle / Paddle
大约 1 年前同步成功