Commits · ccf5709d3f9958dccedf1b79e3c834fc2398b9c2 · Crayon鑫 / Paddle

09 Apr, 2021 1 commit

[NPU] cherry-pick basic NPU components/allocator/operator/executor supports from ascendrc (#32144) · ccf5709d

Committed by Leo Chen on Apr 09, 2021

* [feature] support npu allocator (#30840)

[feature] support npu allocator

* [feature] support npu operator (#30951)

[feature] support npu operator

* [feature] support npu allocator, part 2 (#30972)

* support npu allocator

* add npu device context

* fix some compile problem

* fix some compile problem

* add npu info

* compile ok

* fix include dir

* support naive_best_fit_allocator

* run ut ok, but failed to exit

* call aclrtResetDevice before exit

* fix aclFinalize

* add system allocator test

* add selected_gpus in gtest

* add tensor_test for npu

* support npu op, initial commit

* add npu stream

* add elementwise_add_op

* compile ok

* fix typo

* fix elementwise_add_op_npu_test

* support op run

* test can run but failed

* change aclopExecuteV2 to aclopCompileAndExecute

* support parsing ascend rank table file (#31000)

support parsing ascend rank table file

* Fix reshape on GE graph. (#31084)

Fix reshape on GE graph

* add npu kernel for elementwise_sub and elementwise_sub_grad (#30973)

* add npu sub op

* fix typo

* rename test

* fix bug

* fix bug

* add fp16 kernel

* fix typo

* support sub grad op

* support elementwise_sub_grad op
Co-authored-by: frankwhzhang <frankwhzhang@126.com>

* Fix compilation problem (#31100)

Fix compilation problem (#31100)

* fix compile

* fix code style

* remove const_cast

* support adding correct npu op in pybind.h (#31143)

* support adding correct npu op in pybind.h

* refine code

* [NPU] Support executor with NPU (#31057)

* [NPU] Support executor with NPU

* Fix code according to reviews

* Fix code

* Add unittest for sub op npu

* refactor npu device manager (#31154)

refactor npu device manager (#31154)

* fix selected npus

* fix compile

* fix reading flags from env

* format
Co-authored-by: xiayanming <41795079@qq.com>
Co-authored-by: gongweibao <weibao.gong@gmail.com>
Co-authored-by: frankwhzhang <frankwhzhang@126.com>
Co-authored-by: liym27 <33742067+liym27@users.noreply.github.com>

ccf5709d

26 Feb, 2021 1 commit

[ROCM] update fluid framework for rocm (part6), test=develop (#31015) · 28b356b9

Committed by Qi Li on Feb 26, 2021

28b356b9

04 Feb, 2021 1 commit

fix xpu dygraph place (#30868) · 6e3856d3

Committed by WangXi on Feb 04, 2021

6e3856d3

03 Feb, 2021 1 commit

【kunlun】dygraph supports multi xpu card training (#30671) · b1026f64

Committed by WangXi on Feb 03, 2021

b1026f64

20 Jan, 2021 1 commit

add some RecordEvent, for dygraph timeline (#30299) · d1b25ed9

Committed by wanghuancoder on Jan 20, 2021

* add some RecordEvent, for dygraph timeline, test=develop

* change GpuMemcpySync to memory::Copy, test=develop

* fix compile problem, test=develop

* fix compile problem, test=develop

* fix, test=develop

* fix, test=develop

d1b25ed9

13 Jan, 2021 1 commit

Set expected place in child thread for dataloader to avoid costing cuda memory... · 3d015f1c

Committed by Leo Chen on Jan 13, 2021

Set expected place in child thread for dataloader to avoid costing cuda memory on other card (#30338)

* set expected place in child thread for dataloader

* set device id when set tensor from numpy

* revert tensor_py change

* add compile guard

* fix ci

* fix bug

3d015f1c

01 Dec, 2020 1 commit

add complex64 and complex128 type; add +-*/@ and slice operator for c… (#29199) · 8f45d142

Committed by chentianyu03 on Dec 01, 2020

* add complex64 and complex128 type; add +-*/@ and slice operator for complex types

* add test cases for complex elementwise, matmul and getitem unittest

* add test cases for complex types

* add test cases for complex matmul unittest

8f45d142

30 Oct, 2020 1 commit

update the version of pybind, test=develop (#28284) · d9b5f126

Committed by 石晓伟 on Oct 30, 2020

* update version pybind to v2.4.3, test=develop

* update unittests, test=develop

d9b5f126

26 Sep, 2020 1 commit

Add conv2d bfloat16 support (#27325) · b0ee1405

Committed by joanna.wozna.intel on Sep 26, 2020

b0ee1405

03 Sep, 2020 1 commit

Add bfloat16 data type (#25402) · 95e1434b

Committed by joanna.wozna.intel on Sep 03, 2020

95e1434b

25 Aug, 2020 1 commit

optimized transformation from tensor to numpy (#26447) · c1f5df52

Committed by wanghuancoder on Aug 25, 2020

* optimized transformation from tensor to numpy, test=develop

* optimized transformation from tensor to numpy, pass pre-commit, test=develop

* modify fetchophandle zerocopy to deepcopy in PE&CUP, test=develop

* modify py:array construct, test=develop

* fix _fetch_var to use deep copy, test=develop

c1f5df52

21 Aug, 2020 1 commit

support Baidu Kunlun AI Accelerator (#25959) · 138ecf24

Committed by QingshuChen on Aug 21, 2020

* support Baidu AI Accelerator
  * test=kunlun

* minor
  * test=kunlun

* support xpu op in separate file
  * test=kunlun

* update XPU error message and remove duplicated code
  * test=kunlun

* minor
  * test=kunlun

* minor
  * test=kunlun

138ecf24

15 Aug, 2020 1 commit

expose and unify the Tensor concepts to the user (#25978) · 6de463d3

Committed by Zhou Wei on Aug 15, 2020

* expose and unify the Tensor concepts to the user

* expose tensor to user

* add copy place for Tensor

* add copy place for Tensor

* add note

* add macro PADDLE_WITH_CUDA

* remove RUN_TYPE=DIST

* fix some error

6de463d3

19 Jun, 2020 1 commit

polish tensor set error message, test=develop (#25113) · b23801a2

Committed by Chen Weihang on Jun 19, 2020

b23801a2

08 Jun, 2020 1 commit

Refine error message in pybind folder (#24886) · 6190023a

Committed by Leo Chen on Jun 08, 2020

* refine err_msg of pybind.cc, test=develop

* refine err_msg in tensor_py.h, test=develop

* refine error msg, test=develop

* fix test_exception, test=develop

* follow comments, test=develop

6190023a

11 May, 2020 1 commit

Add macro BOOST_GET to enrich the error information of boost :: get (#24175) · aa0f254f

Committed by Chen Weihang on May 11, 2020

* add new macro BOOST_GET_SAFELY & unittests, test=develop

* add different macro type, test=develop

* fix get macro type in executor, test=develop

* four macro part change backup

* using one macro for all case, test=develop

* revert attribute change, test=develop

* change to three func to solve gcc4.8 bug, test=develop

* polish some details, test=develop

aa0f254f

09 Mar, 2020 1 commit

Fix model int8 quant fail, test=develop (#22891) · a020a257

Committed by zhaoyuchen2018 on Mar 09, 2020

The model fails when int8 quant is enabled, so disable CPU memory allocation
for small variables.

a020a257

27 Feb, 2020 1 commit

Refine adam op to improve performance, test=develop (#22346) · 72dde4ab

Committed by zhaoyuchen2018 on Feb 27, 2020

* Refine adam op, test=develop

* Fuse kernels together to reduce cpu time.

* Refine paddle enforce, test=develop

* Remove some comments, test=develop

* Refine code,test=develop

* Refine cuda kernel, test=develop

* Refine code according to comments, test=develop

72dde4ab

04 Feb, 2020 1 commit

Support int16 for Tensor (#22423) · 822e5b36

Committed by Leo Chen on Feb 04, 2020

* add int16 support, test=develop

* add test, test=develop

* fix typo, test=develop

* fix dtype error in slice, test=develop

822e5b36

09 Dec, 2019 1 commit

Refine VarBase init function (#21587) · 4f81d1bd

Committed by Leo Chen on Dec 09, 2019

* refine init function, test=develop

* add tests, test=develop

* remove extern, which may cause symbol error in gcc-4.8, test=develop

4f81d1bd

05 Dec, 2019 1 commit

Split VarBase from Python Variable for Dygraph (#21359) · cdd46d7e

Committed by Leo Chen on Dec 05, 2019

* test=develop, fix docker with paddle nccl problem

* don't expose numerous Tensor.set(), test=develop

* fix condition, test=develop

* fix float16 bug, test=develop

* feed should be Tensor or np.array, not Variable or number, test=develop

* use forcecast to copy numpy slice to new array, test=develop

* remove float16-uint16 hacking, test=develop

* add variable method to varbase and refactor to_variable to support return varbase

* support kwargs in varbase constructor

* add VarBase constructor to support default python args

* refine varbase initial method

* reset branch

* fix ut for change VarBase error info to PaddleEnforce

* cherry is parameter change before

* overload isinstance to replace too many change of is_variable

* rm useless files

* rm useless code merged by git

* test=develop, fix some ut failed error

* test=develop, fix test_graph_wrapper

* add some tests, test=develop

* refine __getitem__, test=develop

* add tests, test=develop

* fix err_msg, test=develop

cdd46d7e

27 Nov, 2019 1 commit

Support numpy bridge (enabled by default in dygraph mode) (#20983) · d5ff79e5

Committed by Youwei Song on Nov 27, 2019

* add numpy bridge

* fix template compile

* add unittest, add default
test=develop

* fix unittest
test=develop

* fix unittest
test=develop

* zero_copy=True for to_variable,
test=develop

* bug fix
test=develop

* disable deprecated NumPy API
test=develop

* use better design of NumpyAllocator
test=develop

* fix Py_None check
test=develop

* reset c++ tracer when jump out dygraph guard
test=develop

* refine PADDLE_ENFORCE_xx format
test=develop

* bug fix of tracer switch
test=develop

* update decref
test=develop

d5ff79e5

01 Nov, 2019 2 commits

tensor.set() supports array list and remove unused code, test=develop (#20959) · 2c3c579b

Committed by Leo Chen on Nov 01, 2019

2c3c579b

Update Tensor.set() to support float16 (#19964) · 9974e407

Committed by Leo Chen on Nov 01, 2019

* don't expose numerous Tensor.set(), test=develop

* fix condition, test=develop

* fix float16 bug, test=develop

* feed should be Tensor or np.array, not Variable or number, test=develop

* use forcecast to copy numpy slice to new array, test=develop

* remove float16-uint16 hacking, test=develop

9974e407

10 May, 2019 1 commit

Double backward of conv2d. (#17211) · e32c9888

Committed by qingqing01 on May 10, 2019

* Add conv2d_grad_grad_op
* Extract the cuDNN conv algo searching code in conv_cudnn_helper.h.
    - Now use it in conv2d_grad_grad.
    - Will simplify the searching code in conv2d and conv2d_grad in next PR.
* Enhance and fix bug in unit testing of gradient_checker.
* Support fetching empty variables, which return None in Python.

e32c9888

06 May, 2019 1 commit

Fix tensor_py.h (#17195) · c5eeecca

Committed by Zeng Jinle on May 06, 2019

* fix tensor_py,test=develop

* change class name,test=develop

c5eeecca

30 Apr, 2019 1 commit

Fix mem leak when converting Tensor to numpy array (#17182) · 5dfe2ab9

Committed by Zeng Jinle on Apr 30, 2019

* fix mem leak when converting Tensor to numpy array
test=develop

* remove unused unittest,test=develop

* follow comments, test=develop

* fix dygraph bug,test=develop

5dfe2ab9

22 Apr, 2019 1 commit

Speed unit testing. (#16978) · ea42e431

Committed by qingqing01 on Apr 22, 2019

* Speed affine_channel_op unit testing
* Add check in tensor_py
* Fix ONLY_CPU Compiling

ea42e431

27 Mar, 2019 1 commit

Tensor index (#16223) · c300b1ba

Committed by wopeizl on Mar 27, 2019

* extend the slice function for python
test=develop

c300b1ba

12 Dec, 2018 1 commit

Change tensor uses proto::VarType::type · 9bd70a1e

Committed by Yu Yang on Dec 11, 2018

test=develop

9bd70a1e

10 Dec, 2018 1 commit

add HasProtoAttr function in op_desc.h, clean node.h · 067ed70f

Committed by Tao Luo on Dec 10, 2018

test=develop

067ed70f

03 Dec, 2018 1 commit

fix bug · c47c451a

Committed by sneaxiy on Dec 03, 2018

c47c451a

24 Nov, 2018 1 commit

Change the include files because the version changes of pybind11 · 81994e84

Committed by minqiyang on Nov 24, 2018

test=develop

81994e84

19 Oct, 2018 1 commit

fix pinned allocator · 2002e71d

Committed by sneaxiy on Oct 19, 2018

2002e71d

02 Oct, 2018 1 commit

Add comments and polish code style · 15076c32

Committed by Yu Yang on Oct 02, 2018

15076c32

30 Sep, 2018 2 commits

Polish code · 29f66c24

Committed by Yu Yang on Sep 30, 2018

29f66c24

Refine prelu_op · 6ca37448

Committed by Yu Yang on Sep 30, 2018

6ca37448

29 Sep, 2018 3 commits

Refine PyBind · ae9378f6

Committed by Yu Yang on Sep 29, 2018

ae9378f6

Refine · a1a01899

Committed by Yu Yang on Sep 29, 2018

a1a01899

Add communication attr · 31270e58

Committed by Yu Yang on Sep 29, 2018

31270e58

Crayon鑫 / Paddle is up to date with the fork source project
