Commits · dba694f471713137f6d9d5338bf230983b16bac2 · PaddlePaddle / Paddle

23 Feb, 2022 1 commit

[phi] move unbind to phi (#39789) · dba694f4

Committed by Leo Chen on Feb 23, 2022

* move unbind to phi

* revert infer shape

* add header file

* move concat_and_split to phi

20 Feb, 2022 1 commit

[PTen->Phi PR1] Change pten dirname and namespace to phi (#39748) · dcfe1986

Committed by Chen Weihang on Feb 20, 2022

* rename pten dir to phi

* rename namespace to phi

* rename infrt pten dir to phi

* resolve conflict

* rename pten to phi in cmake

* revert all infrt change

* change needed files

* fix infrt failed

* fix inference failed

21 Jan, 2022 1 commit

[pten] add concat pten kernel (#38955) · 06803c29

Committed by chentianyu03 on Jan 21, 2022

17 Jan, 2022 1 commit

[Pten] Replace platform::Place to pten::Place. (#38899) · c48a9ad5

Committed by Wilber on Jan 17, 2022

* add pten::Place data structure.

* update ci problem

* fix ci problem

* update

* using platform::Place=pten::Place

* remove BOOST_GET_CONST for CPUPlace and GPUPlace

* compile pass 25%.

* compile pass 45%

* compile pass 60%

* remove boost_get for xpu npu mlu and ipu

* compile pass on cpu and gpu.

* fix compile problem

* fix compile error.

* update

* fix ci problem

* update

* ci approve

* fix ci problem

* fix ci eager test problem

* remove BOOST_GET_CONST

* fix npu compile

13 Jan, 2022 1 commit

splits allocation for pten, test=develop (#38853) · 277cf900

Committed by 石晓伟 on Jan 13, 2022

08 Dec, 2021 1 commit

Fix CUDA Graph H2D bug by restore host memory (#37774) · a1ad3a63

Committed by sneaxiy on Dec 08, 2021

* fix CUDA Graph H2D bug again

* fix no return bug

03 Dec, 2021 1 commit

refine structure for cuda and rocm (#37202) · a6d2fddb

Committed by ronnywang on Dec 03, 2021

* refine structure for cuda and rocm

* update

* update

* update

* update

09 Nov, 2021 1 commit

Try to fix CUDA Graph H2D copy bug (#36987) · 2a143f84

Committed by Zeng Jinle on Nov 09, 2021

* try to fix CUDA Graph H2D copy bug

* remove useless code

* fix ci

* fix ROCM CI

* fix CUDA_VERSION

* improve CI coverage

09 Aug, 2021 1 commit

fix split on empty tensor (#34356) · 898acb1a

Committed by Leo Chen on Aug 09, 2021

22 Jul, 2021 1 commit

fix concat bug (#34319) · c342651e

Committed by wuhuachaocoding on Jul 22, 2021

07 Jul, 2021 1 commit

[HIP] Fix hipMemcpy being unable to overlap; after the change, AMD GPU performance improves by more than 10% (#33982) · 20da7703

Committed by xiayanming on Jul 07, 2021

31 Mar, 2021 1 commit

fix split core (#31892) · 393b3bd6

Committed by Thunderbrook on Mar 31, 2021

* fix split core

* format

11 May, 2020 1 commit

Add macro BOOST_GET to enrich the error information of boost :: get (#24175) · aa0f254f

Committed by Chen Weihang on May 11, 2020

* add new macro BOOST_GET_SAFELY & unittests, test=develop

* add different macro type, test=develop

* fix get macro type in executor, test=develop

* four macro part change backup

* using one macro for all case, test=develop

* revert attribute change, test=develop

* change to three func to solve gcc4.8 bug, test=develop

* polish some details, test=develop

11 Sep, 2019 1 commit

Replace TemporaryAllocator by CUDADeviceContextAllocator (#18989) · 12542320

Committed by Huihuang Zheng on Sep 11, 2019

TemporaryAllocator is a singleton used for allocating memory for Cudnn. Since it is a singleton, we can delete it for better performance in memory.

We replace TemporaryAllocator by CUDADeviceContextAllocator and CUDADeviceContextAllocation, which uses stream callback to delete the memory allocated for the stream to avoid singleton.

Also added data_feed_proto to operator to fix CI in CPU compilation

12 Jun, 2019 1 commit

Optimize the concat and split cuda implementation for cases when the number of... · 7e463c84

Committed by Yiqun Liu on Jun 12, 2019

Optimize the concat and split cuda implementation for cases when the number of inputs/outputs is less than 5. (#17979)

test=develop

29 May, 2019 1 commit

Optimize the concat and split kernel for specical cases when the number of... · 5782ddda

Committed by Yiqun Liu on May 29, 2019

Optimize the concat and split kernel for specical cases when the number of inputs/outputs is 2 (#17415)

* Optimize the concat and split kernel for special cases that the number of inputs/outputs is 2.
test=develop

* Refine codes.
test=develop

* Correct the condition.
test=develop

* Move the define of tmp_data outside the if statement.

* Print the cudnn minor version.
test=develop

* Fix the case when in_num/o_num is 1 in concat/split op.
test=develop

* Remove const_cast.
test=develop

25 Dec, 2018 1 commit

Move GetTensor to tensor_util (#15011) · b9fb03cf

Committed by chengduo on Dec 25, 2018

* refine tensor
test=develop

* refine tensor
test=develop

* fix device_context log
test=develop

21 Dec, 2018 2 commits

[Feature] Add Temporary Allocator (#14875) · 79bd6dfa

Committed by chengduo on Dec 21, 2018

* Add Temporal Allocator

* add Temporay Allocator to DeviceContext
test=develop

* code refine
test=develop

* fix mean_iou
test=develop

* Add DeviceTemporaryAllocator
test=develop

* fix conv_op bug
test=develop

* small fix
test=develop

* code refine
test=develop

* log refine
test=develop

* fix unit test
test=develop

* move double check

* refine concat_and_split
test=develop

* add limit_of_temporary_allocation
test=develop

* fix name
test=develop

Remove unnessesary code · 0a4b6fc0

Committed by minqiyang on Dec 21, 2018

test=develop

20 Dec, 2018 1 commit

Accelerate lstm · 454db666

Committed by minqiyang on Dec 20, 2018

23 Oct, 2018 1 commit

Refine Split op (#13967) · a7497653

Committed by chengduo on Oct 23, 2018

* speedup split_op
test=develop

* speedup split_op
test=develop

* rename ConcatGrad to Split

* refine concat and split
test=develop

* fix compile error

18 Sep, 2018 1 commit

speed up lod_tensor to array and array to lod_tensor · 0d751917

Committed by chengduoZH on Sep 18, 2018

17 Sep, 2018 1 commit

refine seq_concat · e7940141

Committed by chengduoZH on Sep 17, 2018

27 Aug, 2018 1 commit

fix concat synchronization bug · 6cc78705

Committed by dzhwinter on Aug 27, 2018

22 Aug, 2018 1 commit

Handle LoD for concat & seq_softmax ops · 2a36ad1a

Committed by Yu Yang on Aug 22, 2018

19 Jun, 2018 2 commits

fix concat grad kernel · 762160bd

Committed by qiaolongfei on Jun 19, 2018

Make the CUDA kernel of concat correct and fix unit tests. (#11541) · 9c90dc97

Committed by qingqing01 on Jun 19, 2018

* Make the CUDA kernel of concat correct and fix unit tests.

17 Jun, 2018 1 commit

add gpu support for concat · ad1ad738

Committed by qiaolongfei on Jun 17, 2018

02 May, 2018 1 commit

Fix more CPPLint issues in fluid/operators/math (#10276) · 73858547

Committed by Abhinav Arora on May 01, 2018

* Fix CPPLint issues in lstm_cpu_kernel.h

* Fix CPPLint issues in math/math_function_test

* Fix CPPLint issues in math/math_function_test

* Fix CPPLint issues in math/concat.cc

* Fix CPPLint issues in math/concat.cc

* Fix CPPLint issues in math/concat.cc

* Fix CPPLint issues in math/gru_cpu_kernel

* Fix CPPLint issues in math/selected_rows_functor_test.cu

* Fix compile error

* Fix compile error

30 Apr, 2018 1 commit

Feature/cuda9 cudnn7 (#10140) · eb6f9dd5

Committed by dzhwinter on Apr 30, 2018

* "re-commit "

* "picked up"

* "fix ci"

* "fix pdb hang up issue in cuda 9"

23 Mar, 2018 2 commits

code refine · 750aff10

Committed by chengduoZH on Mar 23, 2018

fix concat op · 043f47b2

Committed by chengduoZH on Mar 23, 2018

05 Mar, 2018 2 commits

fix bug for big number; float->double and code refine · 131ec276

Committed by chengduoZH on Mar 05, 2018

follow comments and refine code · 82bd82c1

Committed by chengduoZH on Mar 05, 2018

03 Mar, 2018 1 commit

get max threads of GPU · 00e596ed

Committed by chengduoZH on Mar 02, 2018

02 Mar, 2018 1 commit

refine concat_op · 60e7ee06

Committed by chengduoZH on Feb 28, 2018

PaddlePaddle / Paddle: last synced successfully about 1 year ago
