提交 · 68f99b787d9a7a07e04415c7e79a8797159fc7a7 · Crayon鑫 / Paddle

06 9月, 2022 1 次提交

[PHI]Add TensorArray for PHI (#45479) · 68f99b78

由 YuanRisheng 提交于 9月 06, 2022

* add tensor array

* fix ci bugs

* fix ci bugs

* fix ci bugs

* fix ci bugs

* update by comment

* update code

68f99b78

01 8月, 2022 1 次提交

unify gpu context (#44740) · 86763023

由 Leo Chen 提交于 8月 01, 2022

* remove cudaDeviceContext

* remove more template

* fix rocm compile

* remove alias name CUDADeviceContext

* fix compile

* fix tests

* revert changes

86763023

05 7月, 2022 1 次提交
- R
  Dataloader add custom device support (#44013) · a0dc361c
  由 ronnywang 提交于 7月 05, 2022
```
* Dataloader add custom device support

* update test=document_fix
```
  a0dc361c
26 6月, 2022 1 次提交
- S
  
  format all files in fluid using new config (#43776) · 576236a0
  由 Sing_chan 提交于 6月 26, 2022
  
  576236a0
05 6月, 2022 1 次提交
- S
  
  【code format check upgrade】 step2：clang-format (#42840) · a3730dc8
  由 Sing_chan 提交于 6月 05, 2022
  
  a3730dc8
12 5月, 2022 1 次提交

add xpu buffer_reader, *test=kunlun (#42578) · cc343a41

由 z8hanghuan 提交于 5月 12, 2022

* add xpu buffer_reader, *test=kunlun

* xpu buffer_reader, use XPUDeviceGuard, *test=kunlun

* modify xpu.cmake, *test=kunlun

* modify xpu.cmake, *test=kunlun

* modify xpu.cmake, *test=kunlun

* add xpu buffer_reader, *test=kunlun

* add xpu buffer reader, *test=kunlun

* add xpu buffer reader, *test=kunlun

cc343a41

09 3月, 2022 1 次提交
- F
  
  [MLU] add mlu buffer reader (#40131) · b5a8a0d9
  由 fwenguang 提交于 3月 09, 2022
  
  b5a8a0d9
23 2月, 2022 1 次提交

Update record interface using part3 (#39695) · 1fcaab45

由 chenjian 提交于 2月 23, 2022

* fix RecordEvent interface

* modify default level to 4

* update interface use

* add const default trace level

* update record event interface using

* update record event interface using

* update record event interface using

* update operator.cc

* update part2

* update part1

* update part3

* fix include profiler.h header in ps server

* fix include profiler.h header in ps server

* fix profiler.h header

* fix profiler.h header

* fix merge buf

* update

* fix bug

* fix bug

1fcaab45

15 2月, 2022 1 次提交

[PTen]Migrate proto::VarType outside of Pten (#39411) · 7e7e9404

由 Aurelius84 提交于 2月 15, 2022

* #1 migrate dist-related type()-> dtype()

* move datatype function from pten -> fluid/framework

* change type() in imperative into convert(dtype())

* modify xx_tensor->type into xx_tensor->dtype

* change the set_type interface and the caller

* modify xx_tensor.type into xx_tensor.dtype

* fix mutable_data(place, dtype())

* change caller of mutable_data in pten and distributed

* change the caller of mutable_data in fluid/framework

* change the caller of mutable_data in imperative directory

* mutable_data: inference

* update the call of mutable_data

* transfer MakePenScalarArray MakePtenScalar ResetHolderWithType

* pass the compile. the next step is remove VarType in Pten

* fix all and remove VarType from pten. success in linux. Next task is other platform

* fix conflict with develop

* fix compiled error

* Fix reset conversion

* fix conflict

* fix compiled problem

* fix typo

* Fix << in tensor_utils.cc

* fix type->dtype

* fix unittest

* fix tensor init constructor

* fix DataTypeSize for BFloat16

* fix code style

* fix npu compiled error

* fix npu

* compile npu sucessfully

* fix conflict

* fix conflict
Co-authored-by: Nxiongkun <xiongkun03@baidu.com>

7e7e9404

17 1月, 2022 1 次提交

[Pten] Replace platform::Place to pten::Place. (#38899) · c48a9ad5

由 Wilber 提交于 1月 17, 2022

* add pten::Place data structure.

* update ci problem

* fix ci problem

* update

* using platform::Place=pten::Place

* remove BOOST_GET_CONST for CPUPlace and GPUPlace

* compile pass 25%.

* compile pass 45%

* compile pass 60%

* remove boost_get for xpu npu mlu and ipu

* compile pass on cpu and gpu.

* fix compile problem

* fix compile error.

* update

* fix ci problem

* update

* ci approve

* fix ci problem

* fix ci eager test problem

* remove BOOST_GET_CONST

* fix npu compile

c48a9ad5

10 1月, 2022 1 次提交

[Unify Tensors PR #5] framework::Tensor inherits from DenseTensor,test=allcases (#38632) · 5c73a6ea

由 Zhanlue Yang 提交于 1月 10, 2022

* Added shared_ptr<Allocation> member & corresponding interfaces to Storage

* Removed original pten::Allocation from Storage and adjusted the interfaces accordingly

* Fixed issues with storage offset

* Used place to malloc allocation for TensorStorage

* [Unify Tensors PR #3]Ported framework::Tensor interfaces to pten::DenseTensor

* Fixed issues with place

* Added comments

* Moved mutable_data with stream argument to DenseTensor

* Added set_offset interface

* Fixed CI issues,test=allcases

* [Unify Tensors PR #4] Port LoDTensor interfaces to DenseTensor

* Removed friend class EigenTensor/EigenMatrix/EigenVector from Tensor

* Modified framework::Tensor to inherit from DenseTensor

* Reverted changes too pten_layout() interface

* Removed friend classes

* Rearranged cfunction calls from tensor.data<void>() to tensor.data()

* Fixed CI issues

* Fixed lite issues

* Fixed data() interface issues,test=allcases

* Resolved IsInitialized() issues

* Fixed ResetHolder() issues

* Fixed MKLDNN & Storage issues

* Resolved ShareBufferWith() issues

* Fixed LoD issues

5c73a6ea

03 12月, 2021 1 次提交
- R
  refine structure for cuda and rocm (#37202) · a6d2fddb
  由 ronnywang 提交于 12月 03, 2021
```
* refine structure for cuda and rocm

* update

* update

* update

* update
```
  a6d2fddb
27 11月, 2021 1 次提交

[NPU] reorganization for device API abstraction (#37110) · 72241a6a

由 Aganlengzi 提交于 11月 27, 2021

* [NPU] reorganization for device API abstraction

* [NPU] delete old files

* [NPU] fix npu_collective_helper

* [NPU] fix collective_helper

* [NPU] fix ut

* [NPU] mod memory allocation and hccl_helper

* [NPU] fix place_type

* [NPU] split enfoce.h

* move acl* call into npu_info

* merge conflict

* fix merge

* merge conflict

* merge conflict

72241a6a

10 6月, 2021 1 次提交
- C
  Support diff dataset tensor place in single process dataloader (#33470) · dec63f1a
  由 Chen Weihang 提交于 6月 10, 2021
```
* support diff dataset tensor place in single process dataloader

* fix unittest failed
```
  dec63f1a
19 4月, 2021 1 次提交

[NPU] cherry-pick gc/dataloader/save&load/optimization from ascendrc to develop (#32294) · cbe5c9f8

由 Leo Chen 提交于 4月 19, 2021

* [NPU] support GarbageCollector for npu (#31874)

* support GarbageCollector for npu

* fix typo

* fix gather_grad

* disable NPUDefaultStreamGarbageCollector on NPU

* [NPU] support npu for memcpy op (#31808)

* support npu for memcpy op

* add ut

* fix ut

* fix typo

* 【NPU】fix bug of using temp vector (#31963)

* fix bug when beta1_pow on cpu (#31995)

* [NPU] support npu profiler (#31684)

* support npu profiler

* add python api

* fix bugs

* add wrapper for incomplete type

* update profile proto

* record npu wait

* add xpu placeholder

* fix adam (#32016)

* [NPU] enable async copy and  add wait before sync operation (#31956)

* enable async copy and  add wait before sync operation

* remove unneccessary wait

* add FillNpuTensorWithConstant

* refine

* fix fill_constant

* make TensorFromVector/TensorToVector sync

* [NPU] Support dataloader on npu place. (#31867)

* [NPU] Wait on NPUPlace (#32086)

* [NPU] fix cast op (#32121)

* fix npu kernel of cast op to handle casting to same dtype

* add comments

* [NPU] support cann 20.3 (#32044)

* fix compile problem on cann 20.3

* fix ut

* fix test_mul

* fix check_finite_and_scale

* fix lookup_table_v2_grad

* fix cmake

* support print op

* [NPU] Support npu save load (#31893)

* support save load for NPU

* add save load npu unittest

* support np.array transform in NPU

* fix errors

* delete dygraph in unittest

* add Wait

* fix unittest

* fix review comment

* fix unittest problem

* fix little problem

* change aclrtSynchronizeDevice to aclrtSynchronizeStream for better performance (#32196)

* change aclrtSynchronizeDevice to aclrtSynchronizeStream for better performace

* refine code

* fix NPUDeviceContext in all c++ unittest (#32198)

* fix NPUDeviceContext in all c++ unittest

* refine log
Co-authored-by: Npangyoki <pangyoki@126.com>

* [NPU] Remove TensorFromVector and avoid sync copy in npu op kernel for better performance (#31994)

* enable async copy and  add wait before sync operation

* remove unneccessary wait

* add FillNpuTensorWithConstant

* refine

* fix fill_constant

* change TensorFromVector to FillNpuTensorWithConstant

* fix ignored api

* delete extra unittest

* fix little error

* fix update_loss_scaling_op_npu and check_finite_and_unscale_op_npu

* change TensorCopySync to TensorCopy

* delete useless Wait and add StreamWait

* fix npu_stream error

* fix check_finite_and_unscale_op_npu TensorCopy

* only save stream wait

* fix NPUDeviceContext in all c++ unittest

* delete wait
Co-authored-by: Nzhiqiu <chenqiuliang@baidu.com>

* delete useless unittest file (#32206)

* Fix op test (#32231)

* fix conditional block (#32243)

* fix adam bug again (#32246)

* fix compile

* fix ut

* fix ut
Co-authored-by: Nliym27 <33742067+liym27@users.noreply.github.com>
Co-authored-by: Npangyoki <pangyoki@126.com>

cbe5c9f8

03 3月, 2021 1 次提交
- Q
  [ROCM] update fluid operators for rocm (part3), test=develop (#31213) · 84639b61
  由 Qi Li 提交于 3月 03, 2021
```
* [ROCM] update fluid operators for rocm (part3), test=develop

* fix clang format error, test=develop
```
  84639b61
04 2月, 2021 1 次提交
- W
  use iwyu clean include second time, test=develop (#30829) · 35c5b23f
  由 wanghuancoder 提交于 2月 04, 2021
```
* use iwyu clean include second time, test=develop
```
  35c5b23f
20 11月, 2020 1 次提交
- C
  
  fix occupied 0 device memory bug (#28771) · b969c32a
  由 Chen Weihang 提交于 11月 20, 2020
  
  b969c32a
10 8月, 2020 1 次提交

Add pin memory control for BufferedReader (#26026) · 3c8daa9b

由 Chen Weihang 提交于 8月 10, 2020

* add pin memory control

* fix buffered reader init problem

* fix unittest error

* add unittest for coverage

3c8daa9b

29 7月, 2020 1 次提交

Simplify BufferedReader to improve DataLoader performance (#25648) · 1b3081b1

由 Chen Weihang 提交于 7月 29, 2020

* simplify buffered reader to improve DataLoader performance

* fix 22 failed unittests

* fix cuda pinned context condition

* fix test_reader_reset failed

* fix two failed unittests

* change unittest place

* polish error messaage

* polish cast op GetExpecctedKernelType

* remove debug info in unittest

1b3081b1

25 5月, 2020 1 次提交

Polish reader folder error message (#24698) · 7fa9f16c

由 Chen Weihang 提交于 5月 25, 2020

* polish reader error message, test=develop

* fix detail error, test=develop

* reset activation dcudnn change, test=develop

7fa9f16c

11 5月, 2020 1 次提交

Add macro BOOST_GET to enrich the error information of boost :: get (#24175) · aa0f254f

由 Chen Weihang 提交于 5月 11, 2020

* add new macro BOOST_GET_SAFELY & unittests, test=develop

* add different macro type, test=develop

* fix get macro type in executor, test=develop

* four macro part change backup

* using one macro for all case, test=develop

* revert attribute change, test=develop

* change to three func to solve gcc4.8 bug, test=develop

* polish some details, test=develop

aa0f254f

20 4月, 2020 1 次提交

Optimize the error messages of paddle CUDA API (#23816) · 78170037

由 Zhou Wei 提交于 4月 20, 2020

* Optimize the error messages of paddle CUDA API, test=develop

* fix the error messages of paddle CUDA API, test=develop

* Refactoring PADDLE_ENFORCE_CUDA_SUCCESS, and apply to curand/cudnn/cublas/NCCL,test=develop

* remove build_ex_string,test=develop

* merge conflict,test=develop

78170037

25 3月, 2020 1 次提交
- Z
  
  add cuda resource pool for BufferedReader, test=develop (#23152) · bba74071
  由 Zeng Jinle 提交于 3月 25, 2020
  
  bba74071
14 10月, 2019 1 次提交

Refine py_reader exit (#20331) · 40effc61

由 Zeng Jinle 提交于 10月 14, 2019

* refine py_reader exit, test=develop

* fix multiprocess_reader exception unittest, test=develop

* increase code coverage for legacy fluid.layers.py_reader, test=develop

40effc61

14 8月, 2019 1 次提交
- C
  Use CUDAPinnedPlace in buffered_reader (#19112) · c70a97f4
  由 chengduo 提交于 8月 14, 2019
```
Use CUDAPinnedPlace in buffered_reader
```
  c70a97f4
17 6月, 2019 1 次提交
- Z
  Fix py_reader iterable bug (#18108) · 6eec66a1
  由 Zeng Jinle 提交于 6月 17, 2019
```
* fix py_reader iterable bug, test=develop

* move data from buffered_reader,test=develop
```
  6eec66a1
30 4月, 2019 1 次提交
- Z
  
  fix reader default stream,test=develop (#17106) · 08773b60
  由 Zeng Jinle 提交于 4月 29, 2019
  
  08773b60
11 3月, 2019 1 次提交

Revert "Revert "Add Event for TensorCopy"" (#16035) · ad80bde8

由 chengduo 提交于 3月 11, 2019

* Revert "Revert "Add Event for TensorCopy" (#16022)"

This reverts commit e2da3a5b.

* use default stream
test=develop

ad80bde8

04 3月, 2019 3 次提交
- C
  Revert "Add Event for TensorCopy" (#16022) · 92438f61
  由 chengduo 提交于 3月 03, 2019
```
* Revert "Add Event for TensorCopy (#15953)"

This reverts commit 7235fd66.
test=develop

* fix CI
test=develop
```
  92438f61
- C
  Add Event for TensorCopy (#15953) · 06f3c857
  由 chengduo 提交于 3月 01, 2019
```
Add Event for TensorCopy 
```
  06f3c857
- C
  Revert "Add Event for TensorCopy" (#16022) · e2da3a5b
  由 chengduo 提交于 3月 03, 2019
```
* Revert "Add Event for TensorCopy (#15953)"

This reverts commit 7235fd66.
test=develop

* fix CI
test=develop
```
  e2da3a5b
01 3月, 2019 3 次提交
- C
  Add Event for TensorCopy (#15953) · 7235fd66
  由 chengduo 提交于 3月 01, 2019
```
Add Event for TensorCopy 
```
  7235fd66
- Q
  
  pure async mode train · 847e4f4e
  由 Qiao Longfei 提交于 3月 01, 2019
  
  847e4f4e
- S
  add sample_generator · 3334c279
  由 sneaxiy 提交于 2月 27, 2019
```
test=develop
```
  3334c279
20 2月, 2019 1 次提交
- S
  decoupled reader · 7160cb0f
  由 sneaxiy 提交于 2月 19, 2019
```
test=develop
```
  7160cb0f
08 2月, 2019 1 次提交
- D
  Fix Pr #15296 · bc921927
  由 Dun Liang 提交于 2月 08, 2019
```
test=develop
```
  bc921927
01 2月, 2019 1 次提交
- K
  
  Revert "Async double buffered py reader" · 6f0f8045
  由 kolinwei 提交于 2月 01, 2019
  
  6f0f8045
20 1月, 2019 1 次提交
- D
  
  fix ci && test=develop · e5004f3c
  由 Dun Liang 提交于 1月 20, 2019
  
  e5004f3c
12 1月, 2019 1 次提交
- D
  
  add async copy and pinned place · a900015c
  由 Dun Liang 提交于 1月 12, 2019
  
  a900015c

Crayon鑫 / Paddle 与 Fork 源项目一致

Crayon鑫 / Paddle
与 Fork 源项目一致