- 08 Dec, 2020 1 commit
-
-
Committed by Zhang Ting
-
- 07 Dec, 2020 2 commits
-
-
Committed by wangchaochaohu
-
Committed by tangwei12
* fix gpu emb out of range. Change-Id: I5794ac73bd634d5ea069a6fbbd914274b6d6b7bf
* fix doc. Change-Id: I5a3350b2930a9ab2f52116c192b087307faf8fdf
-
- 05 Dec, 2020 1 commit
-
-
Committed by chentianyu03
* fix random failures of complex matmul
* Make transpose, trace, kron, reshape, sum op support complex type (#29321)
* add complex64 and complex128 type; add +-*/@ and slice operator for complex types
* add test cases for complex elementwise, matmul and getitem unittest
* add test cases for complex types
* add test cases for complex matmul unittest
* kron, reshape, transpose support complex types
* sum and trace op support complex types
* add test case of sum and trace op
* fix the bug of imag part of complex not initialized
* format file
* format code style
* kron support type promotion; modify test cases
-
- 04 Dec, 2020 2 commits
-
-
Committed by Shang Zhizhou
* fix tensorrt output shape error
* fix unittest tensorrt_engine_op_test
* fix code style for unittest
-
Committed by Chen Weihang
* basic impl of type promote
* add comment & another testcase
* fix complex bugs & support python op promote type
* fix failed unittests & polish code
* add unittest for coverage
* change to only promote complex type
* polish code details
* polish several comments
-
- 03 Dec, 2020 2 commits
-
-
Committed by Leo Chen
-
Committed by Zhen Wang
* Add pure fp16 training with master weights. (#27712)
* add the weight decay func for the momentum op
* Add the multi_precision function in Momentum Optimizer.
* Make sure that the initial values of the master weights are the same as the fp16 weights.
* add static loss scaling.
* add the rescale_grad function in the pure fp16 training.
* use the original momentum updating method.
* Polish some code, such as variable names.
* add docstring for apis.
* update the var creation details of _create_master_weight.
* do not modify code related to imperative momentum updating.
* Fix the error of test_dist_sparse_tensor_load_momentum UT.
* add unit test for multi precision fp16 training.
* add more unit tests for CI.
* Use lower threshold values for the allclose comparison in test_multi_precision_fp16_train UT.
-
- 01 Dec, 2020 2 commits
-
-
Committed by chentianyu03
* add complex64 and complex128 type; add +-*/@ and slice operator for complex types
* add test cases for complex elementwise, matmul and getitem unittest
* add test cases for complex types
* add test cases for complex matmul unittest
-
Committed by Wilber
-
- 30 Nov, 2020 5 commits
-
-
Committed by Adam Osewski
- Make sure that oneDNN memory descriptors are created only once at first iteration.
-
Committed by 123malin
* fix parameter prefetch & device guard
Co-authored-by: MrChengmo <cmchengmo@163.com>
Co-authored-by: chengmo <chengmo@baidu.com>
-
Committed by 123malin
* test=develop, optimize async prefetch
-
Committed by WangXi
-
Committed by Jack Zhou
fix the gru compile bug under gcc7.4
-
- 28 Nov, 2020 1 commit
-
-
Committed by wangchaochaohu
-
- 27 Nov, 2020 4 commits
-
-
Committed by lilong12
update expand as op to use the shape of the target tensor instead of the target tensor itself. (#29020)
* update, test=develop
-
Committed by Jack Zhou
Add eigen gru and fix the dropout bug in the rnn
-
Committed by arlesniak
-
Committed by Shang Zhizhou
* remove -DSUPPORTS_CUDA_FP16 in cuda.cmake
* compile with cuda9
* add some unittest
* notest;test=coverage
* add unittest for trt plugin swish && split
* update ernie unittest
* fix some error message
* remove repeated judgement of CUDA version in mbEltwiseLayerNormOpConverter
* fix compile error when CUDA_ARCH_NAME < Pascal
* fix compile error
* update unittest timeout
* compile with cuda9
* update error msg
* fix code style
* add some comments
* add define IF_CUDA_ARCH_SUPPORT_FP16
* rename IF_CUDA_ARCH_SUPPORT_FP16 to CUDA_ARCH_FP16_SUPPORTED
-
- 26 Nov, 2020 2 commits
-
-
Committed by Noel
Fix ops doc for some ops
-
Committed by joanna.wozna.intel
* Add bf16 pool2d and unify bf16 unit tests
* Add change default ops test
-
- 25 Nov, 2020 4 commits
-
-
Committed by joejiong
add uint8 for reshape operator
-
Committed by taixiurong
-
Committed by joejiong
Simple code clean up
-
Committed by wawltor
remove eigen threadpool to speed up
-
- 24 Nov, 2020 2 commits
- 23 Nov, 2020 3 commits
-
-
Committed by furnace
* refactor momentum op to combine weight_decay (scale op and sum op)
-
Committed by Jacek Czaja
-
Committed by yaoxuefeng
-
- 20 Nov, 2020 9 commits
-
-
Committed by gongweibao
-
Committed by Chen Weihang
-
Committed by joejiong
Adding uint8 support for squeeze operator.
-
Committed by wangchaochaohu
-
Committed by joanna.wozna.intel
* Add bf16 matmul, fc, elementwise add and mul
* Correct unit test
-
Committed by yaoxuefeng
-
Committed by taixiurong
* 1. add xpu slice op 2. add xpu top_k op 3. modify xpu cast to new api
-
Committed by Jack Zhou
* add lstm, simple rnn op kernel
* fix the test_lstm for the rnn op
* change func name
* fix forward postprocess bug
* add gru forward, backward code
* remove unittest.skipIf; use a big rnn op instead of combination op
* fix input doesn't have gradient bug
* add eigen lstm forward, backward
Co-authored-by: wawltor <fangzeyang0904@hotmail.com>
-
Committed by QingshuChen
* adjust kunlun header file
* test=kunlun
* update kunlun unittest
* test=kunlun
* update xpu unittest
* test=kunlun
* update xpu unittest
* test=kunlun
* update xpu unittest
* test=kunlun
-