- 25 3月, 2022 2 次提交
-
-
由 duanboqiang 提交于
* fix lars optitmizer bug * Update optimizer.py
-
由 Jiabin Yang 提交于
* refactor eager flags * fix flags error when we switch from eager to dygraph * fix ci problem * fix ci * fix ci * merge develop and fix code style * merge develop and fix code style * fix op test error * fix op test error * fix op test error * fix op test error * fix op test error * merge develop
-
- 20 1月, 2022 1 次提交
-
-
由 wanghuancoder 提交于
* Rearranged Eager AutoCodeGen directory structure * Removed USE_OP in Eager AutoCodeGen * Enabled generation for Operators without Grad/Inputs/Outputs * Resolved operators without input * Fixed merge conflicts * Enabled Eager AutoCodeGen for 10+ more operators * Refactored Eager AutoCodeGen with more organized helper objects * Enabled Eager AutoCodeGen for operators with multiple OpBases * Adjusted Eager AutoCodeGen to Enable Passing Output Tensor as Input Argument * Handled Dispensable Inputs/Outputs in Eager AutoCodeGen * Adjusted function generation/call between Python-C API & Dygraph API * Synchronized auto-generated Python-C API with Dygraph Forward Functions * support more eager tensor api * fix merge compile error * fix compile error and fit develop code * support pure CPU * fix some logic error in eager_mode * support _varbase_creator in eager mode * Added safe_initialized interface to EagerTensor for use in processing dispensable inputs * for eager mode * refine * support multiple constructor for eager tensor * add place related code * polish code * specific randint with dtype of int64 * Support pure cpu test * eager logic * refine test in pure cpu * eager logic * eager logic * eager logic, test=develop * skip core.eager when in inference, test=develop * refine, test=develop * refine, test=develop * call RetainGrad after run forward kernel, test=develop * refine, test=develop * support dygraph util, meta, guard test * eager test case * support inference test * refine test and fix initializer failed * modify eagertensor patch method * add eagertensor.clear_grandint, test=develop * refine, test=develop * refine, test=develop * refine, test=develop * support create varbase and fix retain grad error * call monkey_patch_varbase in _test_eager_guard, test=develop * fix windows error * split clear_gradient to clear_gradient and zero_grads, test=develop * refine, test=develop * refine, test=develop * support test_imperative_basic test in eager mode * remove additional log in variable.h * remove additional log in variable.h * remove additional code create in merge * eager * fix some eager logic, test=develop * refine, test=develop * refine, test=develop * refine, test=develop * patch_tensor_method_func, test=develop * refine, test=develop * eager test case, test=develop * refine, test=develop * eager, test=develop * eager, test=develop * eager optimizer, test=develop * eager optimizer, test=develop * eager test_imperative_optimizer_v2, test=develop * eager, test=develop * refine, test=develop * refine, test=develop * eager, test=develop * add resize in share buffer to, test=develop * eager, test=develop * fix _share_buffer_to, test=develop * refine, test=develop * refine, test=develop * support eager for dataloader,test=develop Co-authored-by: Njim19930609 <jim19930609@gmail.com> Co-authored-by: NJiabinYang <360788950@qq.com>
-
- 24 12月, 2021 1 次提交
-
-
由 zhangbo9674 提交于
-
- 17 12月, 2021 1 次提交
-
-
由 sneaxiy 提交于
* support multi precision update for LAMB * hide some api * fix ci uts * fix lamb output of dygraph * remove some changes to some PR * try to fix Py3 CI compile error * fix test_imperative_optimizer, add lars ut, add layer_norm ut * fix ut, fix format * fix ut * fix windows ci
-
- 08 12月, 2021 1 次提交
-
-
由 Yuang Liu 提交于
-
- 28 10月, 2021 1 次提交
-
-
由 limingshu 提交于
-
- 14 10月, 2021 2 次提交
-
-
由 Zeng Jinle 提交于
-
由 Yuang Liu 提交于
-
- 13 10月, 2021 1 次提交
-
-
由 limingshu 提交于
* A leap of try for cudaLaunchCooperativeKernel * fix bugs * Totally replace the lar cuda kernel * Fix bugs * a test for lars merge * Adding las_op_momentum infer_shape * Fix codes * use avg_numel instead of max_numel to acquire grid num * modify unittest files about lars op * Finally converge when merged-lars works * fix ctest files * add merged_operation kernel when cuda version is older than 11 * Fix code style * fix ctest failure * fix error * fix all ctest error and change lars compute code of cpu * fix bugs on v100. * revert python modififation about lars * revert python modification codes
-
- 12 10月, 2021 2 次提交
-
-
由 Zeng Jinle 提交于
This reverts commit b3f6eedb.
-
由 Zeng Jinle 提交于
-
- 21 9月, 2021 1 次提交
-
-
由 Adam Osewski 提交于
* Create stateful OneDNNAXPYHandler object. This makes it possible to call it multiple times without recreating the oneDNN primitives every time. * Prepare SGDOpKernel to reuse its implementation from OneDNN kernel. * OneDNN SGD kernel. * Update call to use new OneDNNAXPYHandler object api. * Setup seed in proper place. * Enable OneDNN kernel only for single case. * For dense param and sparse grad. * Small refactor. * Enable oneDNN by op attr or by cmd line flag. * Use int64_t type for number of elements. * Support dense param and grad from OneDNN kernel. * Enable SGD OneDNN kernel when use MP BF16 optimizer. * Force non-copyable/movable OneDNNAXPYHandler. * Reuse OneDNNAXPYHandler for spare tensors in SUM op. * Fix SFINAE rules. * Remove recording event inside AXPY. * Get rid of internal primitive caching. * Stop use PP cache mechanims to store mem and primitive obj. * Handler obj store and reuse needed desc & prim * Do not derive from MKLDNNHandlerT
-
- 18 9月, 2021 1 次提交
-
-
由 WangXi 提交于
-
- 17 9月, 2021 2 次提交
-
-
由 zhangbo9674 提交于
* add pure fp16 major function in auto_cast & tracer * support master weight in dygraph for pure fp16 * check mix dtype of fp16&fp32 for check_finite_and_unscale op * change pure fp16 funtion name * refine some bug in auto_cast * refine auto_cast interface logic * add param _casted_by_pure_fp16 for class Layer * support state_dict hook for save model by user appointed dtype in pure_fp16_decorator * refine pure_fp16_decorator as decorator * add unittest * add comment * add comment * support recompute * add comment for auto_cast and decorator * support to_static_state_dict for paddle.jit.save * unlimite models num and optimizers num * add lookup_table in black_list * fix momentum and layer state_dict * fix bug in layer state_dict * fix bug in layer state_dict_helper * refine unittest * refine test_momentun_op * refine interface and some code * refine amp_decorator interface * refine pure fp16 interface * refine master weight interface
-
由 Haohongxiang 提交于
* Support EMA in Paddle2.x and Fleet * update * update * update * modify ut of ema * modify docs * modify bugs * update * update * update * modify ut
-
- 16 9月, 2021 2 次提交
- 15 9月, 2021 1 次提交
-
-
由 WangXi 提交于
-
- 10 9月, 2021 1 次提交
-
-
由 Zhong Hui 提交于
-
- 08 9月, 2021 3 次提交
-
-
由 WangXi 提交于
-
由 Zeng Jinle 提交于
* add fleet api for program pass * turn on apply pass for CI test * fix disable fuse_all_optimizer bug * try to test ci * fix CI * fill unspecified op role * fix fuse_allreduce * add ut to improve coverage * remove useless change * improve c++ coverage * follow some comments * test ir pass pipeline * update doc * reduce ut time again
-
由 lilong12 提交于
* support weight sharing
-
- 02 9月, 2021 1 次提交
-
-
由 Yuang Liu 提交于
-
- 27 8月, 2021 1 次提交
-
-
由 WangXi 提交于
-
- 23 8月, 2021 1 次提交
-
-
由 Yuang Liu 提交于
-
- 20 8月, 2021 1 次提交
-
-
由 Yuang Liu 提交于
-
- 17 8月, 2021 1 次提交
-
-
由 Roc 提交于
-
- 14 8月, 2021 1 次提交
-
-
由 WangXi 提交于
-
- 11 8月, 2021 1 次提交
-
-
由 WangXi 提交于
-
- 09 8月, 2021 1 次提交
-
-
由 JZ-LIANG 提交于
-
- 28 7月, 2021 2 次提交
- 27 7月, 2021 1 次提交
-
-
由 WangXi 提交于
-
- 20 7月, 2021 2 次提交
- 19 7月, 2021 3 次提交
- 16 7月, 2021 1 次提交
-
-
由 WangXi 提交于
-