Commits · c552d1acc4a1bb289555cd70d925ae20d10151a2 · PaddlePaddle / Paddle

16 Mar, 2022 1 commit
- add forward case · c552d1ac
  Committed by phlrain on Mar 16, 2022
  
  c552d1ac
01 Mar, 2022 1 commit

optimize mergeadd for sparse_adam,*test=kunlun (#39966) · d4911594

Committed by z8hanghuan on Mar 01, 2022

* optimize mergeadd for sparse_adam,*test=kunlun

* optimize mergeadd for sparse_adam,*test=kunlun

* optimize mergeadd for sparse_adam, *test=kunlun

d4911594

22 Feb, 2022 1 commit

change Vector to std::vector and provide MixVector class as a helper … (#39559) · 728c0624

Committed by xiongkun on Feb 22, 2022

* change Vector to std::vector and provide MixVector class as a helper wrapper class

* solve the multi-gpu hang problem

* remove the duplicate template instantiation

* Copy vector to cpu

* add CopyToCPU

* xxx

* final version: fix the problem of all reduce

* remove mixvector dependence

* fix

* merge

* fix code

* fix by CI

728c0624

20 Feb, 2022 1 commit

[PTen->Phi PR1] Change pten dirname and namespace to phi (#39748) · dcfe1986

Committed by Chen Weihang on Feb 20, 2022

* rename pten dir to phi

* rename namespace to phi

* rename infrt pten dir to phi

* resolve conflict

* rename pten to phi in cmake

* revert all infrt change

* change needed files

* fix infrt failed

* fix inference failed

dcfe1986

19 Feb, 2022 1 commit

[Pten]Unify paddle/pten::framework::ddim into pten::ddim (#39614) · 2fe04264

Committed by Aurelius84 on Feb 19, 2022

* Unify paddle/pten::framework::ddim into pten::ddim

* fix paddle namespace

* compile successfully

* fix npu src file

* fix conflict

* fix conflict

* fix tensorrt compiler error

* fix conflict

* fix conflict

* fix test file conflict

* fix conflict

* fix mlu file conflict

* fix mlu file conflict

* fix cinn header file conflict

* fix conflict

* fix conflict

* fix conflict

* fix conflict

2fe04264

18 Feb, 2022 1 commit
- [Pten] blas and lapack migration (#39587) · 8c7ee8c2
  Committed by Feiyu Chan on Feb 18, 2022
```
* move blas related files
* move lapack related files
```
  8c7ee8c2
11 Feb, 2022 1 commit
- [Pten] move operators/math/math_function_* to pten/kernels/func (#39300) · d25a7f9e
  Committed by Feiyu Chan on Feb 11, 2022
```
* move operators/math/math_function_* to pten/kernels/func
* namespace from `paddle::operators::math` to `pten::funcs`
```
  d25a7f9e
25 Jan, 2022 1 commit

[Move selected_rows PR ] Change the relationship of [include/Cmake]. (#39128) · 2bafd338

Committed by Weilong Wu on Jan 25, 2022

* Added selected_rows and rw_lock to pten

* Renamed the unit test target to fix CI

* Removed Class SelectedRows in Fluid, changed include/cmake relationship, use pten::SelectedRows in Fluid

* Remove rw_lock.h,rw_lock_test.cc in fluid

* Use pten::RWLock and pten::AutoRDLock, fix CI

* Use pten::SelectedRows

* Use pten::SelectedRows

* Fix to pass NPU CI

* Use pten::SelectedRows, to pass NPU CI

* To fix NPU CI

* To fix NPU CI again

2bafd338

24 Jan, 2022 1 commit

support sparse of adam, *test=kunlun (#38483) · e106901e

Committed by z8hanghuan on Jan 24, 2022

* support sparse of adam, *test=kunlun

* add pre-commit-config.yaml

* support sparse of adam in KL2,*test=kunlun

* support sparse of adam in KL2, *test=kunlun

* modify xpu.cmake, *test=kunlun

* support sparse of adam, rm some wait, *test=kunlun

* support sparse of adam, rm some wait, *test=kunlun

* support sparse of adam, *test=kunlun

* support sparse of adam, *test=kunlun

* support sparse of adam, *test=kunlun

* support sparse of adam, *test=kunlun

* support sparse of adam, *test=kunlun

e106901e

17 Jan, 2022 1 commit

[Pten] Replace platform::Place to pten::Place. (#38899) · c48a9ad5

Committed by Wilber on Jan 17, 2022

* add pten::Place data structure.

* update ci problem

* fix ci problem

* update

* using platform::Place=pten::Place

* remove BOOST_GET_CONST for CPUPlace and GPUPlace

* compile pass 25%.

* compile pass 45%

* compile pass 60%

* remove boost_get for xpu npu mlu and ipu

* compile pass on cpu and gpu.

* fix compile problem

* fix compile error.

* update

* fix ci problem

* update

* ci approve

* fix ci problem

* fix ci eager test problem

* remove BOOST_GET_CONST

* fix npu compile

c48a9ad5

21 Sep, 2021 1 commit

Reuse OneDNN handler for SGD and SUM for SelectedRows input tensors. (#35510) · 799f3861

Committed by Adam Osewski on Sep 20, 2021

* Create stateful OneDNNAXPYHandler object.

This makes it possible to call it multiple times without recreating the
oneDNN primitives every time.

* Prepare SGDOpKernel to reuse its implementation from OneDNN kernel.

* OneDNN SGD kernel.

* Update call to use new OneDNNAXPYHandler object api.

* Setup seed in proper place.

* Enable OneDNN kernel only for single case.

* For dense param and sparse grad.

* Small refactor.

* Enable oneDNN by op attr or by cmd line flag.

* Use int64_t type for number of elements.

* Support dense param and grad from OneDNN kernel.

* Enable SGD OneDNN kernel when use MP BF16 optimizer.

* Force non-copyable/movable OneDNNAXPYHandler.

* Reuse OneDNNAXPYHandler for sparse tensors in SUM op.

* Fix SFINAE rules.

* Remove recording event inside AXPY.

* Get rid of internal primitive caching.

* Stop using the PP cache mechanism to store mem and primitive obj.
* Handler obj stores and reuses the needed desc & prim

* Do not derive from MKLDNNHandlerT

799f3861

09 Jul, 2021 1 commit

Use CBLAS for SelectedRows elementwise add operation. (#34008) · 1412d3bc

Committed by arlesniak on Jul 09, 2021

* Use CBLAS for SelectedRows elementwise add operation. It's faster.

* template compilation fix

* reverted template compilation fix

* slimmed template compilation fix

Co-authored-by: Adam Osewski <adam.osewski@intel.com>

1412d3bc

21 Jun, 2021 1 commit

Add AXPY oneDNN handler (#33632) · 773aabc7

Committed by lidanqing on Jun 21, 2021

* Add oneDNN AXPY handler.

* Add fallback for small tensors.

* Fix ifdefs

* Remove unnecessary namespace prefixes and add missing headers.

* Guard handler_axpy with proper ifdefs.

* Compilation of this function is possible only when Paddle is built
with neither CUDA nor HIP.

* Move AXPY handler code to separate files.

* Use oneDNN AXPY handler in SGD op.

* Use axpy handler only when Paddle is built with oneDNN.

* Add test for SUM BF16 with big rows.

* Fix SFINAE rules for elementwise_add_to.

* Add test case for SGD with big rows.

* update

* update

Co-authored-by: Adam Osewski <adam.osewski@intel.com>

773aabc7

26 May, 2021 1 commit
- modify matmul Op to complex template types (#33130) · 6c07cd7e
  Committed by chentianyu03 on May 26, 2021
```
* modify matmul Op to complex template types

* remove complex64/128 header file
```
  6c07cd7e
06 May, 2021 1 commit
- Sum kernel for CPU supporting BF16 and SelectedRows (#32631) · 9599c3b3
  Committed by Adam Osewski on May 06, 2021
  
  9599c3b3
04 Feb, 2021 1 commit
- use iwyu clean include second time, test=develop (#30829) · 35c5b23f
  Committed by wanghuancoder on Feb 04, 2021
```
* use iwyu clean include second time, test=develop
```
  35c5b23f
25 Dec, 2020 1 commit

[Complex] Add support for complex grad accumulated (#29889) · 1a304e6c

Committed by Chen Weihang on Dec 25, 2020

* add support for complex grad accumulated

* add unittest for coverage

* update test dtype

* remove useless blank line

1a304e6c

10 Sep, 2020 1 commit
- update error info for selected_rows_functor · 50e60e87
  Committed by Steffy-zxf on Sep 10, 2020
```
update error info for selected_rows_functor
```
  50e60e87
11 May, 2020 1 commit

Add macro BOOST_GET to enrich the error information of boost :: get (#24175) · aa0f254f

Committed by Chen Weihang on May 11, 2020

* add new macro BOOST_GET_SAFELY & unittests, test=develop

* add different macro type, test=develop

* fix get macro type in executor, test=develop

* four macro part change backup

* using one macro for all case, test=develop

* revert attribute change, test=develop

* change to three func to solve gcc4.8 bug, test=develop

* polish some details, test=develop

aa0f254f

30 Oct, 2019 1 commit
- fix select_rows mergeadd bug, test=develop (#20876) · d4289125
  Committed by zhang wenhui on Oct 30, 2019
  
  d4289125
05 Sep, 2019 1 commit
- fix the diff between async mode and async_half mode (#19535) · 2f037c31
  Committed by 123malin on Sep 05, 2019
```
* test=develop,  communicator merge add => merge average
```
  2f037c31
12 Apr, 2019 1 commit
- optimize merge add if input rows of all selected rows is not duplicated · 920a9609
  Committed by Qiao Longfei on Apr 12, 2019
  
  920a9609
09 Jan, 2019 1 commit
- follow comment test=develop · c3b9edf9
  Committed by Qiao Longfei on Jan 09, 2019
  
  c3b9edf9
28 Dec, 2018 1 commit
- sum op support empty selected rows as input · 25d44d40
  Committed by Qiao Longfei on Dec 28, 2018
  
  25d44d40
14 Dec, 2018 2 commits
- Add sorted_result parameter to SelectedRows Functor · 5fea8cd4
  Committed by minqiyang on Dec 14, 2018
```
test=develop
```
  5fea8cd4
- Remove BinarySearch from Adam Op · da796dfe
  Committed by minqiyang on Dec 14, 2018
```
test=develop
```
  da796dfe
26 Nov, 2018 1 commit
- Revert the changes of VLOG · 53433d7f
  Committed by minqiyang on Nov 26, 2018
```
test=develop
```
  53433d7f
14 Nov, 2018 1 commit
- fix some compiler warning · e0d4e04b
  Committed by Tao Luo on Nov 14, 2018
```
test=develop
```
  e0d4e04b
08 Nov, 2018 1 commit
- Change the origin VLOG level to 10 times · 0c3227a5
  Committed by minqiyang on Nov 08, 2018
```
Fix code to support cpplint syntax check

test=develop
```
  0c3227a5
27 Oct, 2018 2 commits
- optimize code · 96d55009
  Committed by Qiao Longfei on Oct 27, 2018
  
  96d55009
- sum op handle empty input · dd78b5df
  Committed by Qiao Longfei on Oct 27, 2018
  
  dd78b5df
17 Oct, 2018 1 commit
- change elementwise_add to elementwise_add_to test=develop · 02259575
  Committed by Qiao Longfei on Oct 17, 2018
  
  02259575
15 Oct, 2018 5 commits
- code optimize · 936926aa
  Committed by Qiao Longfei on Oct 15, 2018
```
test=develop
```
  936926aa
- clean code · c52ccbc1
  Committed by Qiao Longfei on Oct 15, 2018
  
  c52ccbc1
- optimize blas call · 6056d043
  Committed by Qiao Longfei on Oct 15, 2018
  
  6056d043
- optimize code · 5db75513
  Committed by Qiao Longfei on Oct 15, 2018
  
  5db75513
- change map to unordered_map · d5c64af2
  Committed by Qiao Longfei on Oct 15, 2018
  
  d5c64af2
12 Oct, 2018 1 commit
- Polish code · 3f6ec900
  Committed by minqiyang on Oct 12, 2018
```
test=develop
```
  3f6ec900
11 Oct, 2018 2 commits
- Accelerate SelectedRows Functors: · 8ec748cf
  Committed by minqiyang on Oct 11, 2018
```
  1. Accelerate SelectedRows MergeAdd functor

  2. Add SelectedRowsSumTo functor to support MergeAdd multiple SelectedRows into one

test=develop
```
  8ec748cf
- optimize code · 38568519
  Committed by Qiao Longfei on Oct 11, 2018
  
  38568519

PaddlePaddle / Paddle · synced successfully more than 1 year ago
