Commits · c48a9ad56e69a5d27d1b36df8c731c9c32f84d78 · Crayon鑫 / Paddle

17 Jan, 2022 · 1 commit

[Pten] Replace platform::Place to pten::Place. (#38899) · c48a9ad5

Committed by Wilber on Jan 17, 2022

* add pten::Place data structure.

* update ci problem

* fix ci problem

* update

* using platform::Place=pten::Place

* remove BOOST_GET_CONST for CPUPlace and GPUPlace

* compile pass 25%.

* compile pass 45%

* compile pass 60%

* remove boost_get for xpu npu mlu and ipu

* compile pass on cpu and gpu.

* fix compile problem

* fix compile error.

* update

* fix ci problem

* update

* ci approve

* fix ci problem

* fix ci eager test problem

* remove BOOST_GET_CONST

* fix npu compile


10 Jan, 2022 · 1 commit

[Unify Tensors PR #5] framework::Tensor inherits from DenseTensor,test=allcases (#38632) · 5c73a6ea

Committed by Zhanlue Yang on Jan 10, 2022

* Added shared_ptr<Allocation> member & corresponding interfaces to Storage

* Removed original pten::Allocation from Storage and adjusted the interfaces accordingly

* Fixed issues with storage offset

* Used place to malloc allocation for TensorStorage

* [Unify Tensors PR #3]Ported framework::Tensor interfaces to pten::DenseTensor

* Fixed issues with place

* Added comments

* Moved mutable_data with stream argument to DenseTensor

* Added set_offset interface

* Fixed CI issues,test=allcases

* [Unify Tensors PR #4] Port LoDTensor interfaces to DenseTensor

* Removed friend class EigenTensor/EigenMatrix/EigenVector from Tensor

* Modified framework::Tensor to inherit from DenseTensor

* Reverted changes to pten_layout() interface

* Removed friend classes

* Rearranged cfunction calls from tensor.data<void>() to tensor.data()

* Fixed CI issues

* Fixed lite issues

* Fixed data() interface issues,test=allcases

* Resolved IsInitialized() issues

* Fixed ResetHolder() issues

* Fixed MKLDNN & Storage issues

* Resolved ShareBufferWith() issues

* Fixed LoD issues


03 Dec, 2021 · 1 commit

refine structure for cuda and rocm (#37202) · a6d2fddb

Committed by ronnywang on Dec 03, 2021

* refine structure for cuda and rocm

* update

* update

* update

* update

27 Nov, 2021 · 1 commit

[NPU] reorganization for device API abstraction (#37110) · 72241a6a

Committed by Aganlengzi on Nov 27, 2021

* [NPU] reorganization for device API abstraction

* [NPU] delete old files

* [NPU] fix npu_collective_helper

* [NPU] fix collective_helper

* [NPU] fix ut

* [NPU] mod memory allocation and hccl_helper

* [NPU] fix place_type

* [NPU] split enfoce.h

* move acl* call into npu_info

* merge conflict

* fix merge

* merge conflict

* merge conflict


10 Jun, 2021 · 1 commit

Support diff dataset tensor place in single process dataloader (#33470) · dec63f1a

Committed by Chen Weihang on Jun 10, 2021

* support diff dataset tensor place in single process dataloader

* fix unittest failed

19 Apr, 2021 · 1 commit

[NPU] cherry-pick gc/dataloader/save&load/optimization from ascendrc to develop (#32294) · cbe5c9f8

Committed by Leo Chen on Apr 19, 2021

* [NPU] support GarbageCollector for npu (#31874)

* support GarbageCollector for npu

* fix typo

* fix gather_grad

* disable NPUDefaultStreamGarbageCollector on NPU

* [NPU] support npu for memcpy op (#31808)

* support npu for memcpy op

* add ut

* fix ut

* fix typo

* 【NPU】fix bug of using temp vector (#31963)

* fix bug when beta1_pow on cpu (#31995)

* [NPU] support npu profiler (#31684)

* support npu profiler

* add python api

* fix bugs

* add wrapper for incomplete type

* update profile proto

* record npu wait

* add xpu placeholder

* fix adam (#32016)

* [NPU] enable async copy and  add wait before sync operation (#31956)

* enable async copy and  add wait before sync operation

* remove unnecessary wait

* add FillNpuTensorWithConstant

* refine

* fix fill_constant

* make TensorFromVector/TensorToVector sync

* [NPU] Support dataloader on npu place. (#31867)

* [NPU] Wait on NPUPlace (#32086)

* [NPU] fix cast op (#32121)

* fix npu kernel of cast op to handle casting to same dtype

* add comments

* [NPU] support cann 20.3 (#32044)

* fix compile problem on cann 20.3

* fix ut

* fix test_mul

* fix check_finite_and_scale

* fix lookup_table_v2_grad

* fix cmake

* support print op

* [NPU] Support npu save load (#31893)

* support save load for NPU

* add save load npu unittest

* support np.array transform in NPU

* fix errors

* delete dygraph in unittest

* add Wait

* fix unittest

* fix review comment

* fix unittest problem

* fix little problem

* change aclrtSynchronizeDevice to aclrtSynchronizeStream for better performance (#32196)

* change aclrtSynchronizeDevice to aclrtSynchronizeStream for better performance

* refine code

* fix NPUDeviceContext in all c++ unittest (#32198)

* fix NPUDeviceContext in all c++ unittest

* refine log
Co-authored-by: pangyoki <pangyoki@126.com>

* [NPU] Remove TensorFromVector and avoid sync copy in npu op kernel for better performance (#31994)

* enable async copy and  add wait before sync operation

* remove unnecessary wait

* add FillNpuTensorWithConstant

* refine

* fix fill_constant

* change TensorFromVector to FillNpuTensorWithConstant

* fix ignored api

* delete extra unittest

* fix little error

* fix update_loss_scaling_op_npu and check_finite_and_unscale_op_npu

* change TensorCopySync to TensorCopy

* delete useless Wait and add StreamWait

* fix npu_stream error

* fix check_finite_and_unscale_op_npu TensorCopy

* only save stream wait

* fix NPUDeviceContext in all c++ unittest

* delete wait
Co-authored-by: zhiqiu <chenqiuliang@baidu.com>

* delete useless unittest file (#32206)

* Fix op test (#32231)

* fix conditional block (#32243)

* fix adam bug again (#32246)

* fix compile

* fix ut

* fix ut
Co-authored-by: liym27 <33742067+liym27@users.noreply.github.com>
Co-authored-by: pangyoki <pangyoki@126.com>


03 Mar, 2021 · 1 commit

[ROCM] update fluid operators for rocm (part3), test=develop (#31213) · 84639b61

Committed by Qi Li on Mar 03, 2021

* [ROCM] update fluid operators for rocm (part3), test=develop

* fix clang format error, test=develop

04 Feb, 2021 · 1 commit

use iwyu clean include second time, test=develop (#30829) · 35c5b23f

Committed by wanghuancoder on Feb 04, 2021

* use iwyu clean include second time, test=develop

20 Nov, 2020 · 1 commit

fix occupied 0 device memory bug (#28771) · b969c32a

Committed by Chen Weihang on Nov 20, 2020

10 Aug, 2020 · 1 commit

Add pin memory control for BufferedReader (#26026) · 3c8daa9b

Committed by Chen Weihang on Aug 10, 2020

* add pin memory control

* fix buffered reader init problem

* fix unittest error

* add unittest for coverage


29 Jul, 2020 · 1 commit

Simplify BufferedReader to improve DataLoader performance (#25648) · 1b3081b1

Committed by Chen Weihang on Jul 29, 2020

* simplify buffered reader to improve DataLoader performance

* fix 22 failed unittests

* fix cuda pinned context condition

* fix test_reader_reset failed

* fix two failed unittests

* change unittest place

* polish error message

* polish cast op GetExpecctedKernelType

* remove debug info in unittest


25 May, 2020 · 1 commit

Polish reader folder error message (#24698) · 7fa9f16c

Committed by Chen Weihang on May 25, 2020

* polish reader error message, test=develop

* fix detail error, test=develop

* reset activation dcudnn change, test=develop


11 May, 2020 · 1 commit

Add macro BOOST_GET to enrich the error information of boost :: get (#24175) · aa0f254f

Committed by Chen Weihang on May 11, 2020

* add new macro BOOST_GET_SAFELY & unittests, test=develop

* add different macro type, test=develop

* fix get macro type in executor, test=develop

* four macro part change backup

* using one macro for all case, test=develop

* revert attribute change, test=develop

* change to three func to solve gcc4.8 bug, test=develop

* polish some details, test=develop


20 Apr, 2020 · 1 commit

Optimize the error messages of paddle CUDA API (#23816) · 78170037

Committed by Zhou Wei on Apr 20, 2020

* Optimize the error messages of paddle CUDA API, test=develop

* fix the error messages of paddle CUDA API, test=develop

* Refactoring PADDLE_ENFORCE_CUDA_SUCCESS, and apply to curand/cudnn/cublas/NCCL,test=develop

* remove build_ex_string,test=develop

* merge conflict,test=develop


25 Mar, 2020 · 1 commit

add cuda resource pool for BufferedReader, test=develop (#23152) · bba74071

Committed by Zeng Jinle on Mar 25, 2020

14 Oct, 2019 · 1 commit

Refine py_reader exit (#20331) · 40effc61

Committed by Zeng Jinle on Oct 14, 2019

* refine py_reader exit, test=develop

* fix multiprocess_reader exception unittest, test=develop

* increase code coverage for legacy fluid.layers.py_reader, test=develop


14 Aug, 2019 · 1 commit

Use CUDAPinnedPlace in buffered_reader (#19112) · c70a97f4

Committed by chengduo on Aug 14, 2019

Use CUDAPinnedPlace in buffered_reader

17 Jun, 2019 · 1 commit

Fix py_reader iterable bug (#18108) · 6eec66a1

Committed by Zeng Jinle on Jun 17, 2019

* fix py_reader iterable bug, test=develop

* move data from buffered_reader,test=develop

30 Apr, 2019 · 1 commit

fix reader default stream,test=develop (#17106) · 08773b60

Committed by Zeng Jinle on Apr 29, 2019

11 Mar, 2019 · 1 commit

Revert "Revert "Add Event for TensorCopy"" (#16035) · ad80bde8

Committed by chengduo on Mar 11, 2019

* Revert "Revert "Add Event for TensorCopy" (#16022)"

This reverts commit e2da3a5b.

* use default stream
test=develop


04 Mar, 2019 · 3 commits

Revert "Add Event for TensorCopy" (#16022) · 92438f61

Committed by chengduo on Mar 03, 2019

* Revert "Add Event for TensorCopy (#15953)"

This reverts commit 7235fd66.
test=develop

* fix CI
test=develop

Add Event for TensorCopy (#15953) · 06f3c857

Committed by chengduo on Mar 01, 2019

Add Event for TensorCopy

Revert "Add Event for TensorCopy" (#16022) · e2da3a5b

Committed by chengduo on Mar 03, 2019

* Revert "Add Event for TensorCopy (#15953)"

This reverts commit 7235fd66.
test=develop

* fix CI
test=develop

01 Mar, 2019 · 3 commits

Add Event for TensorCopy (#15953) · 7235fd66

Committed by chengduo on Mar 01, 2019

Add Event for TensorCopy

pure async mode train · 847e4f4e

Committed by Qiao Longfei on Mar 01, 2019

add sample_generator · 3334c279

Committed by sneaxiy on Feb 27, 2019

test=develop

20 Feb, 2019 · 1 commit

decoupled reader · 7160cb0f

Committed by sneaxiy on Feb 19, 2019

test=develop

08 Feb, 2019 · 1 commit

Fix Pr #15296 · bc921927

Committed by Dun Liang on Feb 08, 2019

test=develop

01 Feb, 2019 · 1 commit

Revert "Async double buffered py reader" · 6f0f8045

Committed by kolinwei on Feb 01, 2019

20 Jan, 2019 · 1 commit

fix ci && test=develop · e5004f3c

Committed by Dun Liang on Jan 20, 2019

12 Jan, 2019 · 1 commit

add async copy and pinned place · a900015c

Committed by Dun Liang on Jan 12, 2019

10 Dec, 2018 · 1 commit

fix pyreader failed · 79082c94

Committed by Yancey1989 on Dec 10, 2018

07 Dec, 2018 · 1 commit

clean code · 220db4f3

Committed by Yancey1989 on Dec 07, 2018

06 Dec, 2018 · 1 commit

init parallel graph mode · c9de6f1b

Committed by Yancey1989 on Dec 06, 2018

20 Jul, 2018 · 1 commit

Some enhancement on readers · 060f4217

Committed by fengjiayi on Jul 20, 2018

1. Make the feeding thread of py_reader a daemon thread.
2. Update buffer_reader's destructor, fixing a bug.
3. Make pyreader demo script supporting CPU environment.


18 Jul, 2018 · 1 commit

Change code · b789a3a4

Committed by yuyang18 on Jul 18, 2018

16 Jul, 2018 · 1 commit

Try to speed up buffered reader · e576345f

Committed by yuyang18 on Jul 16, 2018

14 Jul, 2018 · 1 commit

Extract buffered reader · dc34effd

Committed by yuyang18 on Jul 14, 2018
