- 18 Apr 2022, 11 commits
-
-
Committed by Yuang Liu
-
Committed by z8hanghuan
-
Committed by JZ-LIANG
* adapot dist op * [Auto Parallel] Support the auto completion of while_op * add dist_fill_constant_batch_size_like * align infer accuracy
-
Committed by furnace
[NPU] fix conv2d and top_k_v2 fp16
-
Committed by JingZhuangzhuang
-
Committed by Wilber
-
Committed by Aurelius84
* [Eager] add _fallback_legacy_dygraph for npu/xpu/rocm * fix import
-
Committed by TeFeng Chen
cinn_launch_op: optimize the overhead of preparing variables before executing cinn compiled program (#41777) * optimize preparation overhead before executing cinn compiled program * update code notes * fix flag annotation * add a flag of auto-tune feature beforehand
-
Committed by zhangkaihuo
-
Committed by Siming Dai
* add eids result for graph_sample_neighbors * fix bug * move fisher_yates sample to warp * add cpu eid output * delete comment * delete comment * change nullptr placeholder * optimize sample kernel * fix mutable_data
-
Committed by qipengh
* [MLU]add op: reduce_sum, elementwise_sub * [MLU]del unrelated code
-
- 17 Apr 2022, 3 commits
-
-
Committed by Fan Zhang
* Adapt XPUPS - 1st version - 3.24 * Adapt XPUPS - update XPU PushSparse - 2nd version - 3.24 * Adapt XPUPS - add XPU PullSparseOp - 3nd version - 3.25 * refactor heter comm kernel * update. test=develop * Adapt XPUPS - modify by compilation - 4th version - 3.27 * update calc_shard_offset. test=develop * update xpu kernel. test=develop * update args of calc_shard_offset * update. test=develop * remove customGradMerger * update. test=develop * heter_comm update * heter_comm update * update calc_shard_offset. test=develop * heter_comm update * update args of calc_shard_offset * update. test=develop * remove customGradMerger * update. test=develop * fix. test=develop * update. test=develop * update. test=develop * update optimizer kernel * Adapt XPUPS - use WITH_XPU_KP and modify wrapper kernel function - 5th version - 3.30 * update. test=develop * update pslib.cmake * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * Adapt XPUPS - modify by kp compilation - 6th version - 3.30 * update. test=develop * update. test=develop * update. test=develop * update optimizer kernel * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * fix. test=develop * fix. test=develop * used by minxu * update heter_comm_inl * fix. test=develop * Adapt XPUPS - modify by kp compilation - 7th version - 3.30 * fix. test=develop * add optimizer kernel. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * 3.31 update * Adapt XPUPS - update kp compilation path - 8th version - 3.31 * add optimizer kernel. test=develop * fix kunlun not support size_t. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix kunlun not support size_t. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. 
test=develop * update heter_comm_kernel.kps 3.31 * fix. test=develop * fix. test=develop * update heter_comm_kernel.kps 3.31 * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * update heter_comm.h 3.31 * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * update hashtable. test=develop * update. test=develop * Adapt XPUPS - update by kp compilation - 9th version - 4.1 * update hashtable. test=develop * fix. test=develop * update hashtable 4.1 * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * Adapt XPUPS - update by kp compilation - 10th version - 4.1 * fix. test=develop * fix. test=develop * fix. test=develop * update. test=develop * modify by compilation 4.1 * update. test=develop * update. test=develop * fix. test=develop * modify by compilation 4.1 * update. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * modify by compilation 4.1 * fix. test=develop * fix. test=develop * fix. test=develop * modify by compilation 4.1 19:30 * fix. test=develop * update ps_gpu_wrapper.kps 4.1 * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * Adapt XPUPS - update by kp compilation - 11th version - 4.1 * fix. test=develop * Adapt XPUPS - update by kp compilation - 12nd version - 4.2 * fix. test=develop * fix. test=develop * modify by compilation 4.2 * 4.2 update * fix. test=develop * template init. test=develop * update 4.6 * fix. test=develop * template init. test=develop * 4.6 modify by compilation * hashtable template init. test=develop * hashtable template init. test=develop * fix. test=develop * fix. test=develop * fix. test=devlop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. 
test=develop * fix. test=devlop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * Adapt XPUPS - update by kp compilation - 13nd version - 4.7 * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * 4.11 update * fix. test=develop * fix. test=develop * 4.11 update * update by pre-commit * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * 4.12 update * fix. test=develop * Adapt XPUPS - update by kp compilation - 14th version - 4.13 * 4.13 update * 4.14 update * 4.14 update * 4.14 update * 4.14 modify by merged latest compilation * retry CI 4.14 * 4.15 pass static check * 4.15 modify by gpups CI * 3.16 update by gpups CI - modify ps_gpu_wrapper.h * 4.16 update * 4.16 pass xpu compile * 4.16 retry CI * 4.16 update Co-authored-by: zmxdream <zhangminxu01@baidu.com>
-
Committed by Chen Weihang
* split phi and fluid infermeta context * resolve conflict * fix type error * optimize scheduling perf * spec small vector size * replace all grad var name * fix test failed * move init defalut signature * polish details * polish details * fix no init bug * init sig for tests * add init sig for infer * fix infrt error * fix infrt failed * fix kunlun error * fix infrt failed
-
Committed by Chen Weihang
* fix place type related compat error * fix test failed * remove dll decl * revert place type change * add dll decl
-
- 16 Apr 2022, 5 commits
-
-
Committed by 王明冬
-
Committed by z8hanghuan
* modify xpu.cmake,*test=kunlun * modify xpu.cmake,*test=kunlun * modify xpu.cmake,*test=kunlun * modify xpu.cmake,*test=kunlun
-
Committed by Baibaifan
-
Committed by Roc
* moe ref * ref commit; test=document_fix * update; test=document_fix * update test=document_fix * update; test=document_fix
-
Committed by levi131
* native commit for triple grad of sigmod * Updated unittests files * init functional jacobian api * Updated trible_test func * Updated gradient_checker & test_script * finish test with dtype float32 * add float64 test case * polish code * use atol=1e-5 with dtype float64 * fix for ci * set timeout for test_jacobian * fix dygraph grad to support high differential * polish API docstring * Updated gradient checker and some related files * fix double grad strip error for high differential * fix double grad strip error for high differential * Add Sigmoid triple grad tests * fix dygraph double grad dtype error when calling for high differential senario * Updated triple grad teses func * Use np.random to initialize ddx * Updated triple_grad_check func * add todo for gradient checker and refine some comments * remove additional code * add test for warnging in backward.py * format python code * support multi input in triple gradient checker * Add matmul triple grad kernel * Updated comments of TODO * Supported some special tests * Change code-format to follow CI std * Updated gradient_checker.py * Fix conflicts * Removed unnecessary printing log * Change code style to follow CI std * merge upstream * add priops.py * add_p * rm useless files * add sub_p mul_p div_p * add sqrt_p and tanh_p * add reshape_p * add broadcast_p * Add python primitive wrappers. * Jvp rules updated. * JVP rules done for all the 17 primops. * quick check and fixes. * add jvp(op, *args) * add broadcast_p fill_constant_p matmul_p reduce_p reshape_p transpose_p * add split_p and concat_p * add gather_p and scatter_add_p * add slice_select_p and slice_assign_p * Add transpose rules. * add multi input check for add_p, sub_p, mul_p, div_p * update concat_p * Linearize and transpose in progress.. * refine gather_p and scatter_add_p * updated. * update transpose. * refine slice_assign_p and slice_select_p * init commit for lower * Merged with primitive ops. 
* small update * add rules for orig2prim and prim2orig * add 9 test for prim ops * add more test and fix some bug * add more test * register proto * Adding primops test. * add shape valid check for broadcast_p op, and add keepdim attr into reduce_p op proto * support multi input and multi output for split_p and concat_p * Test updated. * update * fix slice bug for slice_select_p and slice_assign_p * updated. * Ops updated. * Refactor and bug fixes. * updated. * finish orig2prim and prim2orig rules * dtype for axis attr should be long int * update dtype for axis attr int64_t * update for iscan CI * Update primx. * Refactor vars in primx. * update for lower transform * update primx.py * update * Fix linearize and transpose. * Update is_dot * Update is_dot * Update is_dot * add gradient aggregation, fix add_transpose. * pass first linearize+transpose test. * update test * add_prim_op_pywrapper * Add primops UT * Fix set_value and update * Fix code format and PR-CI-Coverage Co-authored-by: veyron95 <veyron_wu@163.com> Co-authored-by: Jiabin Yang <360788950@qq.com> Co-authored-by: Tongxin Bai <waffle.bai@gmail.com> Co-authored-by: 0x45f <wangzhen45@baidu.com>
-
- 15 Apr 2022, 21 commits
-
-
Committed by ziyoujiyi
* back fl * delete ssl cert * . * make warning * . * unittest paral degree * solve unittest * heter & multi cloud commm ready * . * . * arm_brpc compile * . * . * . * . * . * . * . * . * . * . * . * . * . * . * only output is ok * base is ok * . * . * . * . * . * . * . * . * add switch server bin * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * adapt brpc ssl * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * .
-
Committed by seemingwang
* extract sub-graph * graph-engine merging * fix * fix * fix heter-ps config * test performance * test performance * test performance * test * test * update bfs * change cmake * test * test gpu speed * gpu_graph_engine optimization * add ssd layer to graph_engine * fix allocation * fix syntax error * fix syntax error * fix pscore class * fix * recover test * recover test * fix spelling * recover * fix
-
Committed by Roc
* moe ref * ref commit; test=document_fix * update; test=document_fix * update test=document_fix
-
Committed by huangxu96
As the title
-
Committed by chentianyu03
* add adamw yaml * fix test case error * make the name of weight and bias in linear1 and linear2 to be constant
-
Committed by chentianyu03
* split reduce_kernel * rm reduce_kernel in cmake * split reduce_grad kernels * fix cmake build error * format code * fix standalone_executor_test error
-
Committed by Zhanlue Yang
* [DoubleGrad] Enabled double grad test cases in eager_mode for test_imperative_double_grad * Fixed elementwise issue * Addressed CI failures * [DoubleGrad] Enabled test_imperative_triple_grad test cases under eager_mode * [DoubleGrad] Enabled test_autograd_functional_dynamic.py under eager mode * Enabled more test cases * [DoubleGrad] Enabled test_imperative_star_gan_with_gradient_penalty.py under eager mode * Adjusted test_imperative_star_gan_with_gradient_penalty.py
-
Committed by Haohongxiang
* refactor mp in eager mode * update * update * add uts
-
Committed by TTerror
-
Committed by lilong12
-
Committed by danleifeng
* add gpupsutil and afsclient; test=develop
-
Committed by fwenguang
-
Committed by Jack Zhou
* Add core.eager.StringTensor __init__ which pyarray args can be passed * Add the numpy method of core.eager.StringTensor * revert tensor.to_string modification * Add ToPyObject for core.eager.StringTensor * Add debug string for core.eager.StringTensor * Remove place args of core.eager.StringTensor temporarily * Fix check string_tensor error * remove dtype of core.eager.StringTensor * add core.eager.StringTensor unittest * remove pstring from VarDesc * Add InitStringTensorWithStringTensor * Remove to_string modification * Remove zero_copy arg from StringTensor creator
-
Committed by zmxdream
* refactor heter comm kernel * update. test=develop * update calc_shard_offset. test=develop * update xpu kernel. test=develop * update args of calc_shard_offset * update. test=develop * remove customGradMerger * update. test=develop * update. test=develop * fix. test=develop * update. test=develop * update. test=develop * update optimizer kernel * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * fix. test=develop * fix. test=develop * add optimizer kernel. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix kunlun not support size_t. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * update hashtable. test=develop * update. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * update. test=develop * update. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * template init. test=develop * hashtable template init. test=develop * fix. test=develop * fix. test=devlop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix hashtable_kernel. test=develop * fix. 
test=develop * fix. test=develop * fix. test=develop * fix. test=develop Co-authored-by: WorgenZhang <frank08081993@gmail.com>
-
Committed by Allen Guo
* add mixed-precission support for ipu * restore cast_model_to_fp16 api * update UTs
-
Committed by Chen Weihang
-
Committed by zhangkaihuo
-
Committed by pangyoki
* support no_need_buffer in eager_fluid state * change no_need_buffer info from fwd_info to bwd_info * fix CI fail, gru_unit donnot use no_need_buffer * fix conflict between no_need_buffer and dispensable * use tensor.define in dispensable * solve conflict * solve conflict
-
Committed by Asthestarsfalll
-
Committed by zhangxiaoci
-
Committed by limingshu
* change cudnn helper for auto-tune * Add FLAGS_use_autotune to set the global status of autotune and change the order of choosing algorithm. * Fix the bug in calculating and printing current step cache hit rate. * Improve the autotune cache and fix unittest. * Change the key from AlgorithmType to int64_t. * Fix unittest for cpu-only env. * change ChooseAlgoByWorkspace for heuristic mode Co-authored-by: Liu Yiqun <liuyiqun01@baidu.com>
-