- 16 Mar, 2022 1 commit
-
-
Committed by phlrain
-
- 01 Mar, 2022 1 commit
-
-
Committed by z8hanghuan
* optimize mergeadd for sparse_adam, *test=kunlun
* optimize mergeadd for sparse_adam, *test=kunlun
* optimize mergeadd for sparse_adam, *test=kunlun
-
- 22 Feb, 2022 1 commit
-
-
Committed by xiongkun
* change Vector to std::vector and provide MixVector class as a helper wrapper class
* solve the multi-gpu hang problem
* remove the duplicate template instantiation
* Copy vector to cpu
* add CopyToCPU
* xxx
* final version: fix the problem of all reduce
* remove mixvector dependence
* fix
* merge
* fix code
* fix by CI
-
- 20 Feb, 2022 1 commit
-
-
Committed by Chen Weihang
* rename pten dir to phi
* rename namespace to phi
* rename infrt pten dir to phi
* resolve conflict
* rename pten to phi in cmake
* revert all infrt change
* change needed files
* fix infrt failed
* fix inference failed
-
- 19 Feb, 2022 1 commit
-
-
Committed by Aurelius84
* Unify paddle/pten::framework::ddim into pten::ddim
* fix paddle namespace
* compile successfully
* fix npu src file
* fix conflict
* fix conflict
* fix tensorrt compiler error
* fix conflict
* fix conflict
* fix test file conflict
* fix conflict
* fix mlu file conflict
* fix mlu file conflict
* fix cinn header file conflict
* fix conflict
* fix conflict
* fix conflict
* fix conflict
-
- 18 Feb, 2022 1 commit
-
-
Committed by Feiyu Chan
* move blas related files
* move lapack related files
-
- 11 Feb, 2022 1 commit
-
-
Committed by Feiyu Chan
* move operators/math/math_function_* to pten/kernels/func
* namespace from `paddle::operators::math` to `pten::funcs`
-
- 25 Jan, 2022 1 commit
-
-
Committed by Weilong Wu
* Added selected_rows and rw_lock to pten
* Renamed the unit test target to fix CI
* Removed Class SelectedRows in Fluid, changed include/cmake relationship, use pten::SelectedRows in Fluid
* Remove rw_lock.h, rw_lock_test.cc in fluid
* Use pten::RWLock and pten::AutoRDLock, fix CI
* Use pten::SelectedRows
* Use pten::SelectedRows
* Fix to pass NPU CI
* Use pten::SelectedRows, to pass NPU CI
* To fix NPU CI
* To fix NPU CI again
-
- 24 Jan, 2022 1 commit
-
-
Committed by z8hanghuan
* support sparse of adam, *test=kunlun
* add pre-commit-config.yaml
* support sparse of adam in KL2, *test=kunlun
* support sparse of adam in KL2, *test=kunlun
* modify xpu.cmake, *test=kunlun
* support sparse of adam, rm some wait, *test=kunlun
* support sparse of adam, rm some wait, *test=kunlun
* support sparse of adam, *test=kunlun
* support sparse of adam, *test=kunlun
* support sparse of adam, *test=kunlun
* support sparse of adam, *test=kunlun
* support sparse of adam, *test=kunlun
-
- 17 Jan, 2022 1 commit
-
-
Committed by Wilber
* add pten::Place data structure.
* update ci problem
* fix ci problem
* update
* using platform::Place=pten::Place
* remove BOOST_GET_CONST for CPUPlace and GPUPlace
* compile pass 25%.
* compile pass 45%
* compile pass 60%
* remove boost_get for xpu npu mlu and ipu
* compile pass on cpu and gpu.
* fix compile problem
* fix compile error.
* update
* fix ci problem
* update
* ci approve
* fix ci problem
* fix ci eager test problem
* remove BOOST_GET_CONST
* fix npu compile
-
- 21 Sep, 2021 1 commit
-
-
Committed by Adam Osewski
* Create stateful OneDNNAXPYHandler object. This makes it possible to call it multiple times without recreating the oneDNN primitives every time.
* Prepare SGDOpKernel to reuse its implementation from OneDNN kernel.
* OneDNN SGD kernel.
* Update call to use new OneDNNAXPYHandler object api.
* Setup seed in proper place.
* Enable OneDNN kernel only for single case.
* For dense param and sparse grad.
* Small refactor.
* Enable oneDNN by op attr or by cmd line flag.
* Use int64_t type for number of elements.
* Support dense param and grad from OneDNN kernel.
* Enable SGD OneDNN kernel when use MP BF16 optimizer.
* Force non-copyable/movable OneDNNAXPYHandler.
* Reuse OneDNNAXPYHandler for sparse tensors in SUM op.
* Fix SFINAE rules.
* Remove recording event inside AXPY.
* Get rid of internal primitive caching.
* Stop using PP cache mechanism to store mem and primitive obj.
* Handler obj store and reuse needed desc & prim
* Do not derive from MKLDNNHandlerT
-
- 09 Jul, 2021 1 commit
-
-
Committed by arlesniak
* Use CBLAS for SelectedRows elementwise add operation. It's faster.
* template compilation fix
* reverted template compilation fix
* slimmed template compilation fix
Co-authored-by: Adam Osewski <adam.osewski@intel.com>
-
- 21 Jun, 2021 1 commit
-
-
Committed by lidanqing
* Add oneDNN AXPY handler.
* Add fallback for small tensors.
* Fix ifdefs
* Remove unnecessary namespace prefixes and add missing headers.
* Guard handler_axpy with proper ifdefs.
* Compilation of this function is possible only when Paddle is not built with CUDA nor HIP.
* Move AXPY handler code to separate files.
* Use oneDNN AXPY handler in SGD op.
* Use axpy handler only when Paddle is built with oneDNN.
* Add test for SUM BF16 with big rows.
* Fix SFINAE rules for elementwise_add_to.
* Add test case for SGD with big rows.
* update
* update
Co-authored-by: Adam Osewski <adam.osewski@intel.com>
-
- 26 May, 2021 1 commit
-
-
Committed by chentianyu03
* modify matmul Op to complex template types
* remove complex64/128 header file
-
- 06 May, 2021 1 commit
-
-
Committed by Adam Osewski
-
- 04 Feb, 2021 1 commit
-
-
Committed by wanghuancoder
* use iwyu clean include second time, test=develop
-
- 25 Dec, 2020 1 commit
-
-
Committed by Chen Weihang
* add support for complex grad accumulated
* add unittest for coverage
* update test dtype
* remove useless blank line
-
- 10 Sep, 2020 1 commit
-
-
Committed by Steffy-zxf
update error info for selected_rows_functor
-
- 11 May, 2020 1 commit
-
-
Committed by Chen Weihang
* add new macro BOOST_GET_SAFELY & unittests, test=develop
* add different macro type, test=develop
* fix get macro type in executor, test=develop
* four macro part change backup
* using one macro for all case, test=develop
* revert attribute change, test=develop
* change to three func to solve gcc4.8 bug, test=develop
* polish some details, test=develop
-
- 30 Oct, 2019 1 commit
-
-
Committed by zhang wenhui
-
- 05 Sep, 2019 1 commit
-
-
Committed by 123malin
* test=develop, communicator merge add => merge average
-
- 12 Apr, 2019 1 commit
-
-
Committed by Qiao Longfei
-
- 09 Jan, 2019 1 commit
-
-
Committed by Qiao Longfei
-
- 28 Dec, 2018 1 commit
-
-
Committed by Qiao Longfei
-
- 14 Dec, 2018 2 commits
- 26 Nov, 2018 1 commit
-
-
Committed by minqiyang
test=develop
-
- 14 Nov, 2018 1 commit
-
-
Committed by Tao Luo
test=develop
-
- 08 Nov, 2018 1 commit
-
-
Committed by minqiyang
Fix code to support cpplint syntax check test=develop
-
- 27 Oct, 2018 2 commits
-
-
Committed by Qiao Longfei
-
Committed by Qiao Longfei
-
- 17 Oct, 2018 1 commit
-
-
Committed by Qiao Longfei
-
- 15 Oct, 2018 5 commits
-
-
Committed by Qiao Longfei
test=develop
-
Committed by Qiao Longfei
-
Committed by Qiao Longfei
-
Committed by Qiao Longfei
-
Committed by Qiao Longfei
-
- 12 Oct, 2018 1 commit
-
-
Committed by minqiyang
test=develop
-
- 11 Oct, 2018 2 commits
-
-
Committed by minqiyang
1. Accelerate SelectedRows MergeAdd functor
2. Add SelectedRowsSumTo functor to support MergeAdd of multiple SelectedRows into one
test=develop
-
Committed by Qiao Longfei
-