- 01 Mar, 2023 1 commit
-
-
Committed by niuliling123
-
- 17 Feb, 2023 1 commit
-
-
Committed by yuehuayingxueluo
* rename multi_tensor_adam to fused_adam * fix some bugs * fix CI coverage * rename test_fused_adam.py * fix some bug * add test_fused_adam_op.py * fix some bugs * fix fused_adam_op.cc * fix CI bugs * fix CI bug * fix CI bug
-
- 15 Feb, 2023 1 commit
-
-
Committed by lzy
* make FusedMultiTransformer support variable lengths. * modify ffn2 when cuda_version >= 11.6 because of #49392. * code style * delete remove_padding
-
- 09 Feb, 2023 1 commit
-
-
Committed by yuehuayingxueluo
* add multi_tensor_adam * update multi_tensor_base.py, test_multi_tensor_adam.py, adamw.py * fix adam.py optimizer.py * fix adamw.py * fix test_multi_tensor_adam.py * fix CI bug * fix CI coverage * fix ci bug * fix betapow * fix some bugs * fix test_adamw_op.py * fix CI coverage * fix multi_tensor_adam_kernel.cc * fix CI bug * fix multi_tensor_adam_op.cc and test_multi_tensor_adam.py * fix code style * update C++ parts * remove python parts modification temporarily * add C++ ut * update betapow copy code logic * fix ci ut * fix windows ci * fix coverage ci * improve coverage rate --------- Co-authored-by: sneaxiy <sneaxiy@126.com>
-
- 05 Jan, 2023 1 commit
-
-
Committed by 姜永久
* rm op_function_generator * rm op_func_generator.h * rm op_function * modify cmake * rm op_function.h * rm check for op_function_generator.cc * reset imperative * rm python part * fix imperative * lint * lint * modify legacy_c * review * modify * modify legacy * rm gen op_functions code * reset framework * rm core.ops for test * core.ops->core.eager.ops.legacy * not raiseError for xpu
-
- 23 Dec, 2022 1 commit
-
-
Committed by lzy
-
- 10 Oct, 2022 1 commit
-
-
Committed by carryyu
make fused_multi_transformer support dynamically setting the cache_kvs' shape and support input prefix_caches. (#46777)
-
- 18 Sep, 2022 1 commit
-
-
Committed by RichardWooSJTU
-
- 12 Aug, 2022 1 commit
-
-
Committed by Siming Dai
* add init file * add op definition and infermeta * add kernel definition funcs * add broadcast infer shape * add gpu forward kernel * delete SUB and DIV * add x_grad * add template * add e_grad for min and max * fix small bug * temp commit * temp commit * add e_grad for sum and mean * fix some compile bug * fix compile bugs * fix compile problem * add sum forward unittest * fix broadcast error, add kernel sig, register e_grad, change unit test * fix grad * add temp grad fix * temp commit * add min max unittest * add max, min unittest, fix mul bug * add cpu forward sum and mean * add forward min max, fix mean unittest * add cpu backward min max * fix code-style * add backward sum mean * fix rocm ci * set unittest timeout * fix bug of x broadcast to e, gpu grad * fix bug of x broadcast to e, cpu grad * rename BOOST_GET_CONST macro * fix rocm ci * mv graph_send_e_recv to graph_send_ue_recv * move out_size to IntArray * add eager op test * fix max pool type bug, add unittest for api * revise api doc * add fp16 for atomic min and max, add unittest * add unittest * add fp16 support for graph_send_recv * fix unittest fp16 bug * change OutSizeTensor to Out_size * move E to Y * add copyright, fix comment * review code * fix thread block size * fix thread block size * change api attribute name: pool_type to reduce_op, compute_type to message_op * change api attribute name, move pool_type to reduce_op, move compute_type to message_op
-
- 09 Aug, 2022 1 commit
-
-
Committed by Siming Dai
* change out_size to IntArray * fix out_size eager bug * add unittest for out_size tensor * add deprecated for paddle.incubate.graph_send_recv, add paddle.geometric.send_u_recv and unittests * fix lowest bug * fix according to review comments * add default value in yaml * change api file name * change name
-
- 29 Jul, 2022 1 commit
-
-
Committed by Haohongxiang
* migrate lstsq op * update * fix bugs for CIs * update * fix bugs * add uts * update * update * update * fix bugs of jip * fix bugs of hip * update * update according to review * update * update * update * update
-
- 07 Jul, 2022 1 commit
-
-
Committed by zhangyikun02
-
- 26 Jun, 2022 1 commit
-
-
Committed by Sing_chan
-
- 02 Jun, 2022 1 commit
-
-
Committed by sneaxiy
* support CUDAGraph for partial graph * add ut * fix ci * fix ut again because of eager mode * fix kunlun ci * fix win ci
-
- 31 May, 2022 1 commit
-
-
Committed by Chen Weihang
* polish append op using * fix var error * fix group norm impl
-
- 30 May, 2022 2 commits
- 26 Apr, 2022 1 commit
-
-
Committed by WangXi
-
- 15 Apr, 2022 1 commit
-
-
Committed by pangyoki
* support no_need_buffer in eager_fluid state * change no_need_buffer info from fwd_info to bwd_info * fix CI fail, gru_unit does not use no_need_buffer * fix conflict between no_need_buffer and dispensable * use tensor.define in dispensable * solve conflict * solve conflict
-
- 06 Apr, 2022 2 commits
-
-
Committed by Weilong Wu
* [Eager] Support test_layers's test cases switch to eager mode * Update batch_norm _C_ops action to fix CI * Use None instead of new EmptyTensor * Updated var name * Make sure to switch eager mode, Fix Coverage_CI * Remove _non_static_mode statement * Remove batch_norm dispensable input statement * Polish batch_norm code * Fix CI issue
-
Committed by wanghuancoder
-
- 05 Apr, 2022 1 commit
-
-
Committed by wanghuancoder
* eager math op, test=develop * eager support lookahead, test=develop * refine, test=develop * refine doc, test=develop * refine, test=develop * refine, test=develop * refine, test=develop * refine, test=develop * test_paddle_multiprocessing * refine, test=develop * refine, test=develop * fix bug, test=develop * refine, test=develop * dataloader, test=develop * refine, test=develop * refine, test=develop * refine, test=develop * test_datasets timeout, test=develop * refine, test=develop
-
- 04 Apr, 2022 1 commit
-
-
Committed by Weilong Wu
-
- 03 Apr, 2022 1 commit
-
-
Committed by Aurelius84
* [Eager]Enhance eager_trace_op logic to support Optimizer Op * fix AsDispensable * [Eager]Fix 17 unittest and open check_eager=True * remove print * fix unittests * fix op_testa * fix coverage CI failed * fix ci
-
- 02 Apr, 2022 1 commit
-
-
Committed by Siming Dai
* Add graph_reindex API * add graph_sample_neighbors api * Add buffer * delete VLOG * delete thrust::copy for output * add ShareDataWith * delete graph_reindex hashtable output * add graph_reindex dispensable * add reindex unittest, move memset to cuda kernel, change api * fix conflict * add reindex buffer for gpu version note * fix conflicts for op_func_generator * Add fisher_yates sampling, add dispensable, change infermeta * add dtype for edge_id * fix rocm ci and static check ci * add unittest * fix unittest * fix unittest * fix bug
-
- 01 Apr, 2022 2 commits
-
-
Committed by pangyoki
* support C_ops assign * open unittest * fix clone
-
Committed by Aurelius84
* [Eager]Enhance eager_trace_op logic to support Optimizer Op * fix AsDispensable
-
- 29 Mar, 2022 1 commit
-
-
Committed by 0x45f
* Use _C_ops.yolov3_loss in eager mode for test_yolov3.py * fix code for test_yolov3_loss_op * remove useless import * Fix dygraph_mode flag
-
- 28 Mar, 2022 1 commit
-
-
Committed by 0x45f
* Refine test_lac.py for eager mode * refine code * Fix test_program_translator for eager
-
- 23 Mar, 2022 2 commits
-
-
Committed by wanghuancoder
* fix some slice bug, test=develop * eager slice, test=develop * eager slice, test=develop * refine, test=develop * refine, test=develop * fix bug, test=develop * refine, test=develop * rename function name, test=develop
-
Committed by Weilong Wu
-
- 15 Mar, 2022 1 commit
-
-
Committed by furnace
* [NPU] add AMP O1 support * [NPU] fix NOTE and warnings
-
- 11 Mar, 2022 1 commit
-
-
Committed by Yuang Liu
-
- 01 Mar, 2022 1 commit
-
-
Committed by Guoxia Wang
-
- 16 Feb, 2022 1 commit
-
-
Committed by Weilong Wu
-
- 27 Jan, 2022 1 commit
-
-
Committed by Siming Dai
* add the test case for the UVA * add the context load for the uva * Add graph_sample kernel * Add graph_sample commit * add new commit for graph_sample * add unsigned long long int * delete some remarks * add cpu version * add cuda eids * add cpu eids * delete _uva * optimize speed: emplace_back, last_layer * add to_uva_tensor * add cpu return_eids choice * add gpu return_eids choice * add cpu reindex_nodes * add gpu reindex_nodes * rename op and add OMP for cpu * add incubate api * fix the compile problem for the PADDLE_ENFORCE and different device * fix the rocm and windows compile problem * add unittest for graph_sample_neighbors * fix cpu unittest and unique problem * fix uva unittest, fix cuda unique problem * fix the windows compile problem * fix the windows rand_r compile problem * add correct unittest, add src_eids dispensable * delete black * combine uva unittest * mv Sample_index to Sample_Index; check input shape; fix random sample func * delete memset & cudaMemset * fix according to PR comments * fix rocm ci * modify function names according to the specification * fix windows_openblas ci * refine annotations, fix windows unittest, add default value for uva device_id, fix bug for input nodes with empty neighbors * fix rocm ci * rename graph_sample_neighbors as graph_khop_sampler, add incubate api doc * add data type * fix conflict Co-authored-by: wawltor <fangzeyang0904@hotmail.com>
-
- 20 Jan, 2022 1 commit
-
-
Committed by wanghuancoder
* Rearranged Eager AutoCodeGen directory structure * Removed USE_OP in Eager AutoCodeGen * Enabled generation for Operators without Grad/Inputs/Outputs * Resolved operators without input * Fixed merge conflicts * Enabled Eager AutoCodeGen for 10+ more operators * Refactored Eager AutoCodeGen with more organized helper objects * Enabled Eager AutoCodeGen for operators with multiple OpBases * Adjusted Eager AutoCodeGen to Enable Passing Output Tensor as Input Argument * Handled Dispensable Inputs/Outputs in Eager AutoCodeGen * Adjusted function generation/call between Python-C API & Dygraph API * Synchronized auto-generated Python-C API with Dygraph Forward Functions * support more eager tensor api * fix merge compile error * fix compile error and fit develop code * support pure CPU * fix some logic error in eager_mode * support _varbase_creator in eager mode * Added safe_initialized interface to EagerTensor for use in processing dispensable inputs * for eager mode * refine * support multiple constructor for eager tensor * add place related code * polish code * specific randint with dtype of int64 * Support pure cpu test * eager logic * refine test in pure cpu * eager logic * eager logic * eager logic, test=develop * skip core.eager when in inference, test=develop * refine, test=develop * refine, test=develop * call RetainGrad after run forward kernel, test=develop * refine, test=develop * support dygraph util, meta, guard test * eager test case * support inference test * refine test and fix initializer failed * modify eagertensor patch method * add eagertensor.clear_grandint, test=develop * refine, test=develop * refine, test=develop * refine, test=develop * support create varbase and fix retain grad error * call monkey_patch_varbase in _test_eager_guard, test=develop * fix windows error * split clear_gradient to clear_gradient and zero_grads, test=develop * refine, test=develop * refine, test=develop * support test_imperative_basic test in eager mode * remove additional log in variable.h * remove additional log in variable.h * remove additional code create in merge * eager * fix some eager logic, test=develop * refine, test=develop * refine, test=develop * refine, test=develop * patch_tensor_method_func, test=develop * refine, test=develop * eager test case, test=develop * refine, test=develop * eager, test=develop * eager, test=develop * eager optimizer, test=develop * eager optimizer, test=develop * eager test_imperative_optimizer_v2, test=develop * eager, test=develop * refine, test=develop * refine, test=develop * eager, test=develop * add resize in share buffer to, test=develop * eager, test=develop * fix _share_buffer_to, test=develop * refine, test=develop * refine, test=develop * support eager for dataloader, test=develop Co-authored-by: jim19930609 <jim19930609@gmail.com> Co-authored-by: JiabinYang <360788950@qq.com>
-
- 07 Jan, 2022 1 commit
-
-
Committed by zhangbo9674
* add multi tensor for adam * add merged_adam op * refine code * refine adam compute logic
-
- 24 Dec, 2021 1 commit
-
-
Committed by zhangbo9674
-
- 20 Dec, 2021 1 commit
-
-
Committed by zhangbo9674
* add multi_tensor for momentum and clear_grads for optimizer * fix bug for dygraph * add unittest * refine comment * add param_group * refine regularization logic * del clear_grads * add clear_grads * add dispensable check of None * refine clear_grad * fix build bug * refine code by comment * refine code * add multi tensor check * refine param_group update * add multi tensor for static mode * refine comments * delete useless comma for momentum * refine comment for momentum * refine code by comment
-