- 09 Apr, 2022 5 commits
-
-
Committed by limingshu
* Using the maximum workspace_size of all algorithms to limit the workspace size in exhaustive search mode.
* Use the system cudaMalloc and cudaFree to allocate workspace during searching.
* Enable switching between two kinds of workspace setting methods.
Co-authored-by: Liu Yiqun <liuyiqun01@baidu.com>
-
Committed by Jiabin Yang
* fix_ci_problem3
* support windows no default error
-
Committed by wanghuancoder
-
Committed by Leo Chen
* fix bug that no thread is woken up when adding task to threadpool
* fix typo
-
Committed by LiYuRio
-
- 08 Apr, 2022 10 commits
-
-
Committed by whs
-
Committed by crystal
fix group_norm vectorized address misalignment
-
Committed by Allen Guo
-
Committed by Jack Zhou
-
Committed by z8hanghuan
* modify unittest of lstm forward, *test=kunlun
* modify unittest of lstm forward, *test=kunlun
-
Committed by Aurelius84
* [Eager] Fix segment_pool/allclose/isclose/scale API bug
* fix kernel register problem
-
Committed by Qi Li
* [ROCm] fix dcu error in device event base, test=develop
* fix, test=develop
-
Committed by taixiurong
-
Committed by ronnywang
-
Committed by hong
* add conj flip yaml
* add flip conj pixel shuffle
-
- 07 Apr, 2022 25 commits
-
-
Committed by Thunderbrook
* afs wrapper
* format
* format
* macro
-
Committed by zhouweiwei2014
-
Committed by YuanRisheng
* add yaml
* perfect coverage
-
Committed by Ruibiao Chen
* modify matrix_rank
* add matrix_rank shape
* add matrix_rank shape
* Add yaml for matrix_rank OP
* Add UT
Co-authored-by: zhoujianqian <15205085056@163.com>
-
Committed by Chen Weihang
* add unbind yaml
* fix unittest
-
Committed by lilong12
-
Committed by liutiexing
* Profile Executors
* update
* fix ut
* fix names
* update
* update
-
Committed by zhouweiwei2014
-
Committed by lilong12
-
Committed by Chen Weihang
-
Committed by huzhiqiang
-
Committed by sneaxiy
* add Output(Step) to distributed fused lamb op
* add _set_step
-
Committed by zhangkaihuo
-
Committed by Siming Dai
* add one_hot gpu hint
* move allow_out_of_range judgement
* delete useless unittest
-
Committed by chenjian
* no
* maintain old profiler
* fix old dygraph record event
-
Committed by zhiboniu
-
Committed by QingshuChen
* ignore some failed test for KL2 *test=kunlun
* minor *test=kunlun
* minor *test=kunlun
-
Committed by Wilber
* add rewrite pattern from paddle op to trt op
* infrt-trt run resnet50.
Co-authored-by: weishengying <1343838695@qq.com>
-
Committed by Sing_chan
* change inference demo_test build method to ninja to choose visual studio version automatically
* notest;test=windows_ci_inference
* set cuda of demo_ci by arg,fix bug of ninja compile,test=document_fix;test=windows_ci;test=windows_ci_inference
* fix bug;test=document_fix;test=windows_ci;test=windows_ci_inference
* fix bug;test=document_fix;test=windows_ci_inference
* set lib_path according to generator
-
Committed by Zhang Jun
-
Committed by Chen Weihang
* polish truncated normal kernel
* add yaml
* add truncated normal kernel and add yaml
* polish unittests and yaml
* import dygraph method
-
Committed by houj04
* momentum support l2decay for xpu. test=kunlun
* fix include file. test=kunlun
* fix cmake for device_worker. test=kunlun
-
Committed by JingZhuangzhuang
* modify infer gpu memory strategy
* modify infer gpu memory strategy
-
Committed by YuanRisheng
-
Committed by Yiqun Liu
* Add GPU memory usage information in the print of profiler.
* Add ifdef.
-