- 07 May, 2021 1 commit
Committed by Jiawei Wang
-
- 20 October, 2020 1 commit
Committed by wangchaochaohu
-
- 11 May, 2020 1 commit
Committed by Chen Weihang
* add new macro BOOST_GET_SAFELY & unittests, test=develop
* add different macro type, test=develop
* fix get macro type in executor, test=develop
* four macro part change backup
* using one macro for all case, test=develop
* revert attribute change, test=develop
* change to three func to solve gcc4.8 bug, test=develop
* polish some details, test=develop
-
- 08 January, 2020 1 commit
Committed by zhaoyuchen2018
The stack op's wait costs a lot of CPU time; using a CUDA kernel to do the memory copy reduces CPU time.
Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>
-
- 30 January, 2019 1 commit
Committed by Yibing Liu
* Some improvements to support bert mixed precision training test=develop
* Revert the cast in layer_norm test=develop
-
- 12 November, 2018 1 commit
Committed by Yibing Liu
* Add int type support for stack_op
* Improve gather op to support index with shape N x 1 test=develop
* Fix stack_op kernel's registry test=develop
-