- 21 Sep, 2021 1 commit
-
-
Committed by Adam Osewski
* Create stateful OneDNNAXPYHandler object. This makes it possible to call it multiple times without recreating the oneDNN primitives every time.
* Prepare SGDOpKernel to reuse its implementation from OneDNN kernel.
* OneDNN SGD kernel.
* Update call to use new OneDNNAXPYHandler object api.
* Setup seed in proper place.
* Enable OneDNN kernel only for single case.
* For dense param and sparse grad.
* Small refactor.
* Enable oneDNN by op attr or by cmd line flag.
* Use int64_t type for number of elements.
* Support dense param and grad from OneDNN kernel.
* Enable SGD OneDNN kernel when use MP BF16 optimizer.
* Force non-copyable/movable OneDNNAXPYHandler.
* Reuse OneDNNAXPYHandler for sparse tensors in SUM op.
* Fix SFINAE rules.
* Remove recording event inside AXPY.
* Get rid of internal primitive caching.
* Stop using PP cache mechanism to store mem and primitive obj.
* Handler obj store and reuse needed desc & prim
* Do not derive from MKLDNNHandlerT
-
- 09 Jul, 2021 1 commit
-
-
Committed by arlesniak
* Use CBLAS for SelectedRows elementwise add operation. It's faster.
* template compilation fix
* reverted template compilation fix
* slimmed template compilation fix
Co-authored-by: Adam Osewski <adam.osewski@intel.com>
-
- 21 Jun, 2021 1 commit
-
-
Committed by lidanqing
* Add oneDNN AXPY handler.
* Add fallback for small tensors.
* Fix ifdefs
* Remove unnecessary namespace prefixes and add missing headers.
* Guard handler_axpy with proper ifdefs.
* Compilation of this function is possible only when Paddle is built with neither CUDA nor HIP.
* Move AXPY handler code to separate files.
* Use oneDNN AXPY handler in SGD op.
* Use axpy handler only when Paddle is built with oneDNN.
* Add test for SUM BF16 with big rows.
* Fix SFINAE rules for elementwise_add_to.
* Add test case for SGD with big rows.
* update
* update
Co-authored-by: Adam Osewski <adam.osewski@intel.com>
-
- 26 May, 2021 1 commit
-
-
Committed by chentianyu03
* modify matmul Op to complex template types
* remove complex64/128 header file
-
- 06 May, 2021 1 commit
-
-
Committed by Adam Osewski
-
- 04 Feb, 2021 1 commit
-
-
Committed by wanghuancoder
* use iwyu clean include second time, test=develop
-
- 25 Dec, 2020 1 commit
-
-
Committed by Chen Weihang
* add support for complex grad accumulated
* add unittest for coverage
* update test dtype
* remove useless blank line
-
- 10 Sep, 2020 1 commit
-
-
Committed by Steffy-zxf
update error info for selected_rows_functor
-
- 11 May, 2020 1 commit
-
-
Committed by Chen Weihang
* add new macro BOOST_GET_SAFELY & unittests, test=develop
* add different macro type, test=develop
* fix get macro type in executor, test=develop
* four macro part change backup
* using one macro for all cases, test=develop
* revert attribute change, test=develop
* change to three func to solve gcc4.8 bug, test=develop
* polish some details, test=develop
-
- 30 Oct, 2019 1 commit
-
-
Committed by zhang wenhui
-
- 05 Sep, 2019 1 commit
-
-
Committed by 123malin
* test=develop, communicator merge add => merge average
-
- 12 Apr, 2019 1 commit
-
-
Committed by Qiao Longfei
-
- 09 Jan, 2019 1 commit
-
-
Committed by Qiao Longfei
-
- 28 Dec, 2018 1 commit
-
-
Committed by Qiao Longfei
-
- 14 Dec, 2018 2 commits
- 26 Nov, 2018 1 commit
-
-
Committed by minqiyang
test=develop
-
- 14 Nov, 2018 1 commit
-
-
Committed by Tao Luo
test=develop
-
- 08 Nov, 2018 1 commit
-
-
Committed by minqiyang
Fix code to support cpplint syntax check
test=develop
-
- 27 Oct, 2018 2 commits
-
-
Committed by Qiao Longfei
-
Committed by Qiao Longfei
-
- 17 Oct, 2018 1 commit
-
-
Committed by Qiao Longfei
-
- 15 Oct, 2018 5 commits
-
-
Committed by Qiao Longfei
test=develop
-
Committed by Qiao Longfei
-
Committed by Qiao Longfei
-
Committed by Qiao Longfei
-
Committed by Qiao Longfei
-
- 12 Oct, 2018 1 commit
-
-
Committed by minqiyang
test=develop
-
- 11 Oct, 2018 2 commits
-
-
Committed by minqiyang
1. Accelerate SelectedRows MergeAdd functor
2. Add SelectedRowsSumTo functor to support MergeAdd multiple SelectedRows into one
test=develop
-
Committed by Qiao Longfei
-
- 08 Oct, 2018 1 commit
-
-
Committed by qiaolongfei
-
- 18 Sep, 2018 1 commit
-
-
Committed by sneaxiy
-
- 28 Apr, 2018 1 commit
-
-
Committed by Abhinav Arora
* Fix CPPLint errors
* Fix CPPLint errors in sequence2batch
* Fix compilation
* Fix LSTM op and GRU op
* Fix LSTMP op
* Fix more cpplint errors in operators/math
* Address Code review feedback
-
- 12 Feb, 2018 1 commit
-
-
Committed by qingqing01
-
- 10 Feb, 2018 2 commits
- 08 Feb, 2018 1 commit
-
-
Committed by Yu Yang
-
- 29 Dec, 2017 2 commits
-
-
Committed by typhoonzero
-
Committed by typhoonzero
-
- 27 Dec, 2017 1 commit
-
-
Committed by typhoonzero
-