- 08 Jan, 2020 3 commits
-
-
Committed by zhupengyang
* [NPU] enhance unittest for shuffle_channel, unsqueeze, pool test=develop
-
Committed by huzhiqiang
* fix the issue that loading a model takes too much time test=develop
-
Committed by xiaogang
-
- 07 Jan, 2020 3 commits
-
-
Committed by yiicy
-
Committed by zhupengyang
test=develop
-
Committed by huzhiqiang
-
- 06 Jan, 2020 2 commits
-
-
Committed by 石晓伟
-
Committed by liu zhengxi
* alter the api name from cpu to x86, test=develop
* correct the step_rnn model test, test=develop
-
- 03 Jan, 2020 3 commits
-
-
Committed by zhupengyang
test=develop
-
Committed by Wilber
temporarily remove cuda fc fuse because we don't support cuda fc now
-
Committed by hong19860320
-
- 02 Jan, 2020 3 commits
-
-
Committed by 石晓伟
-
Committed by GaoWei8
* Enhance fc_fuse_pass to enable fusing relu to fc_op test=develop
* restrict fusing relu in x86 test=develop
-
Committed by hong19860320
-
- 31 Dec, 2019 3 commits
-
-
Committed by Wilber
X86 and CUDA compile simultaneously (#2708)
cmake .. -DCMAKE_BUILD_TYPE=RelWithDebInfo -DWITH_MKL=ON -DLITE_WITH_CUDA=ON -DWITH_MKLDNN=OFF -DLITE_WITH_X86=ON -DLITE_WITH_PROFILE=OFF -DWITH_LITE=OFF -DLITE_WITH_LIGHT_WEIGHT_FRAMEWORK=OFF -DWITH_PYTHON=OFF -DWITH_TESTING=ON -DLITE_WITH_ARM=OFF -DLITE_ON_TINY_PUBLISH=OFF -DCUDNN_ROOT=/usr/local/cudnn/ -DLITE_BUILD_EXTRA=ON
-
Committed by zhupengyang
test=develop
-
Committed by hong19860320
* Fix the compile error that occurs when specifying the ddk_root path and building for Huawei NPU.
* Refine the registration of op bridges and make it similar to the registration of ops and kernels.
* Refine the interfaces of the graph and node for op bridges, and support creating constant and data nodes automatically according to the attribute 'persistable' of the target tensor.
* Add unit tests for the scale and softmax op bridges for NPU.
-
- 30 Dec, 2019 2 commits
-
-
Committed by juncaipeng
* fix yolov3 bug when running several times, test=develop
-
Committed by Yiqun Liu
Optimize the execution of RuntimeProgram by saving a bool that records whether the op is a feed/fetch op. (#2703) test=develop
-
- 28 Dec, 2019 1 commit
-
-
Committed by huzhiqiang
-
- 27 Dec, 2019 4 commits
-
-
Committed by 石晓伟
-
Committed by yiicy
-
Committed by hong19860320
-
Committed by huzhiqiang
remove test_models ci projects, because these projects have been removed from the ci test test=develop (#2669)
-
- 26 Dec, 2019 3 commits
-
-
Committed by Wilber
* Fix fluid-lite-subgraph x86 compile error
* Replace FLAGS with environment variables
-
Committed by xiaogang
* feat: add multi_thread ut
-
Committed by zhupengyang
test=develop
-
- 25 Dec, 2019 6 commits
-
-
Committed by juncaipeng
add clear for tensor
-
Committed by juncaipeng
* fix op inputs and outputs type, test=develop
-
Committed by Wilber
optimize softmax cuda kernel
-
Committed by juncaipeng
-
Committed by hong19860320
-
Committed by Yiqun Liu
* Remove GEMM padding in fc_compute. test=develop
* Write a common ParallelFor function to run the for loop in parallel.
* Add the padding GEMM code back in fc.
* Refine the code of fc when padding_weight is false to avoid defining a temporary Tensor.
* Refine the unit test of fc and add test cases for padding and parallel. test=develop
* Enable more test cases in the common fc unittest, including padding and parallel for the x86 target.
* Remove the fc test under kernels/x86. test=develop
* Disable relu in the fc test for non-x86 targets. test=develop
* Change the eps of arm. test=develop
-
- 24 Dec, 2019 7 commits
-
-
Committed by zhupengyang
-
Committed by hong19860320
-
Committed by huzhiqiang
-
Committed by hong19860320
* Support multiple types for XPU and NPU op bridges
* Add lookup_table, gather, slice, stack and scale op bridges for supporting BERT
* Fix the definition of lookup_table kernel for X86
-
Committed by yiicy
-
Committed by zhupengyang
test=develop
-
Committed by zhupengyang
test=develop
-