- 15 Apr, 2022 24 commits
-
Committed by seemingwang
* extract sub-graph * graph-engine merging * fix * fix * fix heter-ps config * test performance * test performance * test performance * test * test * update bfs * change cmake * test * test gpu speed * gpu_graph_engine optimization * add ssd layer to graph_engine * fix allocation * fix syntax error * fix syntax error * fix pscore class * fix * recover test * recover test * fix spelling * recover * fix
-
Committed by Roc
* moe ref * ref commit; test=document_fix * update; test=document_fix * update test=document_fix
-
Committed by huangxu96
As the title
-
Committed by chentianyu03
* add adamw yaml * fix test case error * make the name of weight and bias in linear1 and linear2 to be constant
-
Committed by chentianyu03
* split reduce_kernel * rm reduce_kernel in cmake * split reduce_grad kernels * fix cmake build error * format code * fix standalone_executor_test error
-
Committed by Zhanlue Yang
* [DoubleGrad] Enabled double grad test cases in eager_mode for test_imperative_double_grad * Fixed elementwise issue * Addressed CI failures * [DoubleGrad] Enabled test_imperative_triple_grad test cases under eager_mode * [DoubleGrad] Enabled test_autograd_functional_dynamic.py under eager mode * Enabled more test cases * [DoubleGrad] Enabled test_imperative_star_gan_with_gradient_penalty.py under eager mode * Adjusted test_imperative_star_gan_with_gradient_penalty.py
-
Committed by Haohongxiang
* refactor mp in eager mode * update * update * add uts
-
Committed by TTerror
-
Committed by lilong12
-
Committed by danleifeng
* add gpupsutil and afsclient; test=develop
-
Committed by fwenguang
-
Committed by Jack Zhou
* Add core.eager.StringTensor __init__ which pyarray args can be passed * Add the numpy method of core.eager.StringTensor * revert tensor.to_string modification * Add ToPyObject for core.eager.StringTensor * Add debug string for core.eager.StringTensor * Remove place args of core.eager.StringTensor temporarily * Fix check string_tensor error * remove dtype of core.eager.StringTensor * add core.eager.StringTensor unittest * remove pstring from VarDesc * Add InitStringTensorWithStringTensor * Remove to_string modification * Remove zero_copy arg from StringTensor creator
-
Committed by zmxdream
* refactor heter comm kernel * update. test=develop * update calc_shard_offset. test=develop * update xpu kernel. test=develop * update args of calc_shard_offset * update. test=develop * remove customGradMerger * update. test=develop * update. test=develop * fix. test=develop * update. test=develop * update. test=develop * update optimizer kernel * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * fix. test=develop * fix. test=develop * add optimizer kernel. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix kunlun not support size_t. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * update hashtable. test=develop * update. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * update. test=develop * update. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * template init. test=develop * hashtable template init. test=develop * fix. test=develop * fix. test=devlop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix hashtable_kernel. test=develop * fix. 
test=develop * fix. test=develop * fix. test=develop * fix. test=develop Co-authored-by: WorgenZhang <frank08081993@gmail.com>
-
Committed by Allen Guo
* add mixed-precision support for ipu * restore cast_model_to_fp16 api * update UTs
-
Committed by Chen Weihang
-
Committed by zhangkaihuo
-
Committed by pangyoki
* support no_need_buffer in eager_fluid state * change no_need_buffer info from fwd_info to bwd_info * fix CI fail, gru_unit does not use no_need_buffer * fix conflict between no_need_buffer and dispensable * use tensor.define in dispensable * solve conflict * solve conflict
-
Committed by Asthestarsfalll
-
Committed by zhangxiaoci
-
Committed by limingshu
* change cudnn helper for auto-tune * Add FLAGS_use_autotune to set the global status of autotune and change the order of choosing algorithm. * Fix the bug in calculating and printing current step cache hit rate. * Improve the autotune cache and fix unittest. * Change the key from AlgorithmType to int64_t. * Fix unittest for cpu-only env. * change ChooseAlgoByWorkspace for heuristic mode Co-authored-by: Liu Yiqun <liuyiqun01@baidu.com>
-
Committed by fwenguang
-
Committed by fwenguang
* [MLU] add mlu new profiler * fix format
-
Committed by caozhou
* update cluster
-
Committed by hong
* try to fix batch norm memory issue * fix batch norm memory alloc bug * polish some code
-
- 14 Apr, 2022 16 commits
-
Committed by caozhou
-
Committed by chenjian
-
Committed by houj04
-
Committed by Lijunhui
* register elementwise_xxx
-
Committed by Chen Weihang
-
Committed by YuanRisheng
* support construct scalar using non-cpu tensor * fix bugs when run unittest * fix compile bugs * fix bugs when run ci * fix compile bugs * fix bugs when move copy * perfect unit test * perfect unittest * update according to comment * add target dependency * deal with conflict * fix bugs when run unit test * fix unit test bugs
-
Committed by Yiqun Liu
-
Committed by zhangkaihuo
-
Committed by liutiexing
* executor perf statistics * fix ut * fix ut * fix ut * add ut * add ut
-
Committed by Jacek Czaja
* Add UT - Added missed data_layout - Added missing conversions - NDHWC added - NDHWC support in data_transform - another fix - condddate change - fix u- fix - fix - fix - fix - fix - fix to hack - compilation fix - fix to automatic merge * - reduced UT * - fix * - lint * - fix to lint
-
Committed by Sławomir Siwek
* Change tensor name to match activation * declare fc_eltwise_add pass * merge conv_eltwise refactor PR * first compilable draft * unittest feedback tools * Fuse pass tester * Move IsReachable() to shared file * 100% coverage of fuse_pass_tester.cc * register pass * Add bias node * Improve unit tests / remove bias node from pattern * improve fc_eltwiseadd_unittest * cancel eltwise_add fuse if act is already fused * Add elementwise_input scale * Residual MVP * Add new FC attrs * Add more test cases * Add missing op attrs * Adapt code to new Elementwise pattern * reuse existing fcpattern * improve code style * remove unused arguments * fix typo * remove whitespace * remove int8 related code * Remove attributes from base ops * style * style check * Remove input from base op * Set attribute during fuse * ut timeout * download and test model * DRY * apply feedback from review * Style check * fix typo * cosmetic changes * explicitly set residual as output * VIT-OCR accuracy check * trigger CI * remove whitespaces * fix missing data file
-
Committed by Sing_chan
-
Committed by zmxdream
* modify xpu_kp.cmake with HETERPS&PSLIB * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop
-
Committed by Vigi Zhang
-
Committed by z8hanghuan
* support multi layer and bidirection of lstm_grad, *test=kunlun * support multi layer and bidirection of lstm_grad, *test=kunlun
-
Committed by Sing_chan
-