- 14 Dec, 2020 5 commits
-
-
Committed by arlesniak
-
Committed by WangXi
-
Committed by Leo Chen
* fix compile problem when cuda_arch < 6000
* refine code
* refine code
-
Committed by QingshuChen
* support roi_align & affine_channel for kunlun
* minor
-
Committed by Jacek Czaja
-
- 11 Dec, 2020 5 commits
-
-
Committed by Leo Chen
-
Committed by Zhang Ting
* improve drop out
* add VectorizedRandomGeneratorWithGenerator
* fix bug
* modify according to comments
-
Committed by Zhang Ting
-
Committed by LoveAn
* Add the strategy of skipping cc/cu test compilation and execution in CI, test=develop
* fix if error with CI_SKIP_TEST, test=develop
* fix add properties to test error on Linux/MAC, test=develop
* fix set test properties of test_code_generator error, test=develop
* remove test codes and advance judgment of file modification on Linux, test=develop
* rename CI_SKIP_TEST to CI_SKIP_CPP_TEST, test=document_fix
* Add branch judgement on Linux, test=develop
-
Committed by taixiurong
* 1. fix matmul bug 2. add one hot
* add xpu error msg
-
- 10 Dec, 2020 4 commits
-
-
Committed by Zhong Hui
fix p_norm with empty shape (#29500)
-
Committed by Leo Chen
* layernorm fw opt
* layernorm bw opt
* fix typo, test=develop
* remove const dim3 for windows CI compatibility
* merge develop
Co-authored-by: zlsh80826 <zlsh80826@gmail.com>
-
Committed by ShenLiang
-
Committed by Zhen Wang
* remove tensor copy in the update_loss_scaling op
* not use thrust.
* fix some cuda memory access error.
-
- 09 Dec, 2020 4 commits
- 08 Dec, 2020 5 commits
-
-
Committed by Zhang Ting
This reverts commit befd6d53.
-
Committed by jakpiase
* added external reorder to profiler
* added external and internal reorders to profiler
* added internal and external reorder to profiler
* added formatting to int/ext reorder commit
* removed unnecessary comment
-
Committed by taixiurong
Co-authored-by: root <root@bjhw-sys-rpm0223.bjhw.baidu.com>
-
Committed by TTerror
* update reduce_sum op on xpu
* update reduce_sum op on xpu
* support running on xpu
-
Committed by Jack Zhou
-
- 07 Dec, 2020 4 commits
-
-
Committed by Zhang Ting
-
Committed by Leo Chen
-
Committed by Leo Chen
-
Committed by LoveAn
* Compiling operator libraries with Unity Build on Windows CPU.
* Compiling operator libraries with Unity Build on Windows GPU, no_test, test=windows_ci
* Add option in windows ci script, no_test, test=windows_ci
* Optimize parallel compiling, test=develop
* remove limit of parallel compile and skip some ops in UB, test=develop
* remove changes of header file, test=develop
* remove changes of header file, test=develop
* fix test_eye_op unittest failed, test=develop
* Compiling operator libraries with Unity Build on Linux, test=develop
* set default WITH_UNITY_BUILD=OFF, test=develop
* Move unity build rules into a single file and add comment, test=develop
* optimize parallel compilation, test=develop
* fix undefined reference error on coverage ci, test=develop
-
- 04 Dec, 2020 4 commits
-
-
Committed by chentianyu03
* add complex64 and complex128 type; add +-*/@ and slice operator for complex types
* add test cases for complex elementwise, matmul and getitem unittest
* add test cases for complex types
* add test cases for complex matmul unittest
* kron, reshape, transpose support complex types
* sum and trace op support complex types
* add test case of sum and trace op
* fix the bug of imag part of complex not initialized
* format file
* format code style
* kron support type promotion; modify test cases
-
Committed by 卖鱼的哲学
* fix expand && concat/transpose to new api
* update uniform_random_op
* update xpu_header
-
Committed by QingshuChen
* test=kunlun
-
Committed by Chen Weihang
* basic impl of type promote
* add comment & another testcase
* fix complex bugs & support python op promote type
* fix failed unittests & polish code
* add unittest for coverage
* change to only promote complex type
* polish code details
* polish several comments
-
- 03 Dec, 2020 5 commits
-
-
Committed by tangwei12
* fix gpu emb out of range
  Change-Id: I5794ac73bd634d5ea069a6fbbd914274b6d6b7bf
* fix doc
  Change-Id: I5a3350b2930a9ab2f52116c192b087307faf8fdf
-
Committed by Zhang Ting
* improve performance of elementwise_sum_grad
-
Committed by Shang Zhizhou
* fix tensorrt output shape error
* fix unittest tensorrt_engine_op_test
* fix code style for unittest
-
Committed by Aurelius84
-
Committed by wangchaochaohu
-
- 02 Dec, 2020 4 commits
-
-
Committed by ShenLiang
-
Committed by Zhen Wang
-
Committed by Leo Chen
-
Committed by Zhen Wang
* add the weight decay func for the momentum op
* Add the multi_precision function in Momentum Optimizer.
* Make sure that the initial values of the master weights are the same as the fp16 weights.
* add static loss scaling.
* add the rescale_grad function in the pure fp16 training.
* use the original momentum updating method.
* Polish some codes, such as variable names.
* add docstring for apis.
* update the var creation details of _create_master_weight.
* not modify codes about imperative momentum updating.
* Fix the error of test_dist_sparse_tensor_load_momentum UT.
* add unit test for multi precision fp16 training.
* add more unit tests for CI.
* Use lower threshold values for allclose comparing in test_multi_precision_fp16_train UT.
* For CI Coverage Checking.
-