Commits · 1dbc863202f1229ac9f630586a2ccf785c74aee4 · Crayon鑫 / Paddle

10 Jan, 2022 1 commit

[Unify Tensors PR #5] framework::Tensor inherits from DenseTensor,test=allcases (#38632) · 5c73a6ea

Committed by Zhanlue Yang on Jan 10, 2022

* Added shared_ptr<Allocation> member & corresponding interfaces to Storage

* Removed original pten::Allocation from Storage and adjusted the interfaces accordingly

* Fixed issues with storage offset

* Used place to malloc allocation for TensorStorage

* [Unify Tensors PR #3]Ported framework::Tensor interfaces to pten::DenseTensor

* Fixed issues with place

* Added comments

* Moved mutable_data with stream argument to DenseTensor

* Added set_offset interface

* Fixed CI issues,test=allcases

* [Unify Tensors PR #4] Port LoDTensor interfaces to DenseTensor

* Removed friend class EigenTensor/EigenMatrix/EigenVector from Tensor

* Modified framework::Tensor to inherit from DenseTensor

* Reverted changes to pten_layout() interface

* Removed friend classes

* Rearranged function calls from tensor.data<void>() to tensor.data()

* Fixed CI issues

* Fixed lite issues

* Fixed data() interface issues,test=allcases

* Resolved IsInitialized() issues

* Fixed ResetHolder() issues

* Fixed MKLDNN & Storage issues

* Resolved ShareBufferWith() issues

* Fixed LoD issues


03 Dec, 2021 1 commit

refine structure for cuda and rocm (#37202) · a6d2fddb

Committed by ronnywang on Dec 03, 2021

* refine structure for cuda and rocm

* update

* update

* update

* update

25 Feb, 2021 1 commit

[ROCM] update fluid framework for rocm (part4), test=develop (#31013) · 580447d0

Committed by Qi Li on Feb 25, 2021

26 Dec, 2020 1 commit

[Kunlun] PR2: Support MultiDevicePass and BKCL in parallel executor (#29574) · 4427df37

Committed by liuyuhui on Dec 26, 2020

27 Sep, 2020 1 commit

Refine error msg in paddle/fluid/framework/details [part 2] (#27429) · 35074963

Committed by Leo Chen on Sep 27, 2020

* refine broadcast_op_handle

* refine some error messages

* refine some files

* fix bug

* fix bug

* fix bug

* follow comments

* follow comments


11 May, 2020 1 commit

Add macro BOOST_GET to enrich the error information of boost::get (#24175) · aa0f254f

Committed by Chen Weihang on May 11, 2020

* add new macro BOOST_GET_SAFELY & unittests, test=develop

* add different macro type, test=develop

* fix get macro type in executor, test=develop

* four macro part change backup

* using one macro for all case, test=develop

* revert attribute change, test=develop

* change to three func to solve gcc4.8 bug, test=develop

* polish some details, test=develop


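05 Feb, 2020 1 commit

add WITH_NCCL option for cmake. (#22384) · 7bc4b095

Committed by Wilber on Feb 05, 2020

Added a WITH_NCCL option to the CMake options to explicitly specify whether the NCCL-related code is compiled. WITH_NCCL is ON by default, but it is turned off if WITH_GPU is OFF.

Added the PADDLE_WITH_NCCL definition.

NCCL compilation can be turned off for single-machine, single-GPU use. Multi-GPU use requires NCCL to stay on (the default); if NCCL is turned off, only a single GPU can be used.

Co-authored-by: N石晓伟 <39303645+Shixiaowei02@users.noreply.github.com>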


20 Sep, 2019 1 commit

fix reduce and broadcast to avoid multi-stream, test=develop (#19889) · b754700f

Committed by Zeng Jinle on Sep 20, 2019

11 Sep, 2019 1 commit

Open fuse broadcast option (#18833) · e506c99c

Committed by chengduo on Sep 11, 2019

* fix vlog level and fuse option type
test=develop

11 Jul, 2019 1 commit

Feature/buffer_shared_inplace (#17911) · d3003a16

Committed by Zeng Jinle on Jul 11, 2019

* feature/buffer_shared_inplace, test=develop

* refine code, test=develop

* fix elementwise_add op cpu inplace and sum inplace bug, test=develop

* add unittest and debug log, test=develop

* fix parallel_executor scope bug, polish code, test=develop

* fix sum op, activation op, single_in_place_inference bug, test=develop

* remove kLocalExecScopeName, test=develop

* fix unittest,test=develop

* fix out_var first version bug, test=develop

* follow comments,test=develop


28 Mar, 2019 1 commit

Fuse Adam And SGD ops (#15933) · 1096746c

Committed by chengduo on Mar 28, 2019

* fuse optimizer

21 Feb, 2019 1 commit

Profiler refine and add CUDA runtime api tracer (#15301) · a83e4704

Committed by Dun on Feb 21, 2019

* refine profiler && add runtime tracer

* test=develop

* test=develop

* test=develop

* test=develop

* test=develop

* test=develop

* test=develop

* test=develop

* fix bug && test=develop

* add thread id map && test=develop

* test=develop

* testing

* bug fix

* remove cuda event && refine code && test=develop

* test=develop

* test=develop

* test=develop

* fix windows temp file && test=develop

* test=develop

* fix windows bug && test=develop

* fix start up issue && test=develop

* code polish &&  test=develop

* remove unused code && test=develop

* add some cupti cbid && test=develop

* add FLAGS_multiple_of_cupti_buffer_size && test=develop

* fix compile error && test=develop

* add keyword && test=develop

* fix && test=develop

* code polish && test=develop


19 Feb, 2019 1 commit

fix warnings (#15790) · e1c707fe

Committed by tensor-tang on Feb 19, 2019

* fix warnings

test=develop

* fix enforce test

test=develop

17 Jan, 2019 1 commit

Hide varhandle members. (#15382) · 7cd4dd7c

Committed by gongweibao on Jan 17, 2019

26 Nov, 2018 1 commit

Revert the changes of VLOG · 53433d7f

Committed by minqiyang on Nov 26, 2018

test=develop

22 Nov, 2018 1 commit

fix unit test cases · 7c8c9dc9

Committed by peizhilin on Nov 22, 2018

08 Nov, 2018 1 commit

Change the origin VLOG level to 10 times · 0c3227a5

Committed by minqiyang on Nov 08, 2018

Fix code to support cpplint syntax check

test=develop

29 Oct, 2018 2 commits

fix compile, optimize code test=develop · 3d4e0508

Committed by Qiao Longfei on Oct 29, 2018

[1.1] [project] train imagenet using large batch size (#13766) · 26200f2e

Committed by Wu Yi on Oct 29, 2018

* fix nccl2 lars dist support

* put lars in momentum op

* add tests lars

* fix ci

* fix cpu kernel

* soft warning

* remove lars in test_recognize_digits.py

* move to another op

* add file

* update api.spec test=develop

* update test=develop

* fix api.spec test=develop

* wip

* wip, finish grad merge ops

* wip, finish graph build

* wip test running

* work on 1 gpu

* workable version

* update

* fix tests

* fuse broadcast op

* fix compile failed

* refine

* add batch merge test mnist

* fix CI test=develop

* fix build

* use independent bn params for batch merge test=develop

* update api.spec

* follow comments and for test

* wip

* refine tests test=develop

* follow comments test=develop

* remove startup bn modify test=develop

* follow comments test=develop

* fix merge test=develop


27 Oct, 2018 1 commit

broadcast handle not inited parameter · fad42fe7

Committed by Qiao Longfei on Oct 27, 2018

13 Sep, 2018 1 commit

update by comment · 1e1b6622

Committed by Yancey1989 on Sep 13, 2018

12 Sep, 2018 1 commit

move bcast op into pass · 5ce1a960

Committed by Yancey1989 on Sep 12, 2018

22 Jun, 2018 1 commit

enhance ParallelExecutor stable (#11637) · da556ed6

Committed by chengduo on Jun 22, 2018

21 Jun, 2018 3 commits

Enhance Parallel Executor stable (#11634) · 7d26dd81

Committed by chengduo on Jun 21, 2018

* Fix Parallel Exe(VarHandel's version)

* Fix broadcast

* enhance ParallelExecutor stable

Add No Mutex · c99fca5f

Committed by chengduoZH on Jun 21, 2018

Fix broadcast · 13de7238

Committed by chengduoZH on Jun 21, 2018

09 May, 2018 2 commits

extract method from broadcast::RunImpl · 83053221

Committed by chengduoZH on May 09, 2018

refine pe · 9eec2c75

Committed by chengduoZH on May 09, 2018

05 May, 2018 1 commit

follow comments · 881e063e

Committed by chengduoZH on May 05, 2018

04 May, 2018 1 commit

follow comments and clean code · 7722baa8

Committed by chengduoZH on May 04, 2018

02 May, 2018 1 commit

update sparse parameter · 5ff1ef36

Committed by chengduoZH on May 02, 2018

20 Apr, 2018 1 commit

fix scope of gather broadcast · 9a4ae4df

Committed by chengduoZH on Apr 20, 2018

18 Apr, 2018 2 commits

check the generate_op is null or not and add DEPS of broadcast_op_handle and gather_op_handle · 4760ac44

Committed by chengduoZH on Apr 18, 2018

Clean Code · d24ef931

Committed by Yu Yang on Apr 18, 2018

17 Apr, 2018 1 commit

code refine · 4abef501

Committed by chengduoZH on Apr 16, 2018

16 Apr, 2018 1 commit

refine gather and broadcast · 690cd1f7

Committed by chengduoZH on Apr 14, 2018

13 Apr, 2018 3 commits

follow comments · 384d6ee8

Committed by chengduoZH on Apr 13, 2018

enhance broadcast_op_handle and gather_op_handle · 02842cfc

Committed by chengduoZH on Apr 13, 2018

refine broadcast op · b0267ac9

Committed by chengduoZH on Apr 13, 2018

12 Apr, 2018 1 commit

code refine · e26c6d78

Committed by chengduoZH on Apr 11, 2018
