- 09 Apr, 2022 5 commits
-
-
Committed by limingshu
* Using the maximum workspace_size of all algorithms to limit the workspace size in exhaustive search mode. * Use the system cudaMalloc and cudaFree to allocate workspace during searching. * Enable switch of two kinds of workspace setting methods. Co-authored-by: Liu Yiqun <liuyiqun01@baidu.com>
-
Committed by Jiabin Yang
* fix_ci_problem3 * support windows no default error
-
Committed by wanghuancoder
-
Committed by Leo Chen
* fix bug that no thread is woken up when adding task to threadpool * fix typo
-
Committed by LiYuRio
-
- 08 Apr, 2022 8 commits
-
-
Committed by whs
-
Committed by crystal
fix group_norm vectorized address misalignment
-
Committed by z8hanghuan
* modify unittest of lstm forward, *test=kunlun * modify unittest of lstm forward, *test=kunlun
-
Committed by Aurelius84
* [Eager]Fix segment_pool/allclose/isclose/scale API bug * fix kernel register problem
-
Committed by Qi Li
* [ROCm] fix dcu error in device event base, test=develop * fix, test=develop
-
Committed by taixiurong
-
Committed by ronnywang
-
Committed by hong
* add conj flip yaml * add flip conj pixel shuffle
-
- 07 Apr, 2022 16 commits
-
-
Committed by Thunderbrook
* afs wrapper * format * format * macro
-
Committed by zhouweiwei2014
-
Committed by YuanRisheng
* add yaml * perfect coverage
-
Committed by lilong12
-
Committed by liutiexing
* Profile Executors * update * fix ut * fix names * update * update
-
Committed by lilong12
-
Committed by sneaxiy
* add Output(Step) to distributed fused lamb op * add _set_step
-
Committed by zhangkaihuo
-
Committed by chenjian
* no * maintain old profiler * fix old dygraph record event
-
Committed by QingshuChen
* ignore some failed test for KL2 *test=kunlun * minor *test=kunlun * minor *test=kunlun
-
Committed by Sing_chan
* change inference demo_test build method to ninja to choose visual studio version automatically * notest;test=windows_ci_inference * set cuda of demo_ci by arg,fix bug of ninja compile,test=document_fix;test=windows_ci;test=windows_ci_inference * fix bug;test=document_fix;test=windows_ci;test=windows_ci_inference * fix bug;test=document_fix;test=windows_ci_inference * set lib_path according to generator
-
Committed by Zhang Jun
-
Committed by houj04
* momentum support l2decay for xpu. test=kunlun * fix include file. test=kunlun * fix cmake for device_worker. test=kunlun
-
Committed by JingZhuangzhuang
* modify infer gpu memory strategy * modify infer gpu memory strategy
-
Committed by YuanRisheng
-
Committed by Yiqun Liu
* Add GPU memory usage information in the print of profiler. * Add ifdef.
-
- 06 Apr, 2022 7 commits
-
-
Committed by 0x45f
-
Committed by pangyoki
* support final_state in multiprocess * fix no place.device * set device_id in eager_gen
-
Committed by feng_shuai
-
Committed by Allen Guo
* remove paddle_ipu shared library * fix unique_name
-
Committed by Weilong Wu
* [Eager] Support test_layers's test cases switch to eager mode * Update batch_norm _C_ops action to fix CI * Use None instead of new EmptyTensor * Updated var name * Make sure to switch eager mode, Fix Coverage_CI * Remove _non_static_mode statement * Remove batch_norm dispensable input statement * Polish batch_norm code * Fix CI issue
-
Committed by hong
* update * add conv yaml * add backward * remove useless code * fix bug * fix bug * revert fluid dygraph conv2d * remove useless infermeta function * fix meta fn duplicate error * conv using custom impl * remove amp include * fix bug * use cudnn = true * fix test mkldnn caching bug
-
Committed by wanghuancoder
-
- 05 Apr, 2022 4 commits
-
-
Committed by zyfncg
* fix bug of data transform in inference executor * fix bug
-
Committed by zhaocaibei123
* update name * update name * fix test * fix fleet bind * update name * update name * fix test * fix gpups wrapper * remove Push/Pull/Load/Save with context in client and wrapper base class * fix * fix * remove some interface * fix * remove * code style * recover * fix * remove code unused * remove some unused table & accessor & CommonDenseTable => MemoryDenseTable * fix * fix * fix * recover * remove unused code Co-authored-by: esythan <esythan@126.com>
-
Committed by wangxinxin08
* add fake index and unittest for multiclass_nms3 trt * modify unittest
-
Committed by Zhanlue Yang
* [Refactor] refactored eager_gen.py PR #2 * [DoubleGrad PR #1] Decoupled code generation logics for Dygraph ForwardFunctions and GradNodes * Fixed minor issue * Adjusted logics of GenerateNodeCreationCodes and GenerateForwardDefinition * Fixed issues * Supported higher-order grad node generation * [DoubleGrad PR #4] Supported higher-order GradNode generation * [DoubleGrad #4] Bug Fixes to Double Grad Node Generation * Fixed yaml typo * Fixed yaml typo * fixed minor issues * [DoubleGrad PR #5] Enabled gradient computations for grad_tensors passed to paddle.grad() * Fixed minor issue * Fixed CI-Inference issue * Fixed CI-inference issues * [DoubleGrad PR #7] paddle.grad() to copy backward graph before backward run * Fixed minor issues * Fixed issue with backward graph construction logic * Fixed implementation issues with backward graph reconstruction * Fixed unittest issue * Fixed issues * [DoubleGrad PR #8] Enabled triple grads for sigmoid and matmul * Fixed issues with phi kernel * Added triple grad test case * Fixed minor issue
-