- 28 Apr, 2021 10 commits
-
-
Committed by arlesniak
-
Committed by Chen Weihang
* add fake interface for hook in static mode * add unittests * fix failed unittests
-
Committed by denglin-github
* Add dlnne engine runtime * Fix log * Remove <const_cast> and remove unrelated modify with dlnne, +clang-format * Fix CMakeList format error * Add copyright message * Fix dlnne CMakeList.txt * Add some paddlepaddle_pass to support more networks * Fix some format bug * Add delete dropout_op pass * Fix some format bug * Fix format bug
-
Committed by wangna11BD
-
Committed by Thunderbrook
* Revert "Revert "[PsCore] optimize performance of large kv (#32535)" (#32599)" This reverts commit 809ac036. * brpc dep
-
Committed by Kqnonrime
* fix two error message * fix two error message * fix error * fix error * fix error * fix error * fix some error message * fix some error * fix error * fix some error * fix some error * fix some error * fix one error * fix some error * fix seven error message * fix error * fix error * fix error * fix error * fix some error message * fix error * fix some error * fix some error
-
Committed by zhulei
-
Committed by wawltor
Reduce the time cost for the elementwise_add test case (#32628)
-
Committed by Jacek Czaja
* - Added clearing oneDNN per executor * - Executor is not always having FLAGS_use_mkldnn set to true
-
Committed by jiangcheng
* optimize update_loss_scaling_op by fused for loop to one kernel, test=develop * remove useless while loop and optimize variable name, test=develop * optimize variable name from out_addrs_tensor to out_addrs_mem, test=develop * optimize variable name for readable by change prefix identifier from t_ to local_
-
- 27 Apr, 2021 23 commits
-
-
Committed by lilong12
* add alltoall api, test=develop
-
Committed by pangyoki
* support cuda11.2 and using gcc5.4 in cuda10.1 * fix manylinux py36 bug * support cuda11.2 * fix python36 pip version problem in ubuntu * save cuda11.0
-
Committed by zhiboniu
* update 2.0 public api in nn * replace Chinese character cause error in ci;synchronization with pr:#32588 to avoid 'ascii' codec in python2 * numbers used in paddle.nn.functional.norm but not imported
-
Committed by zhiboniu
Co-authored-by: XiaoguangHu <46782768+XiaoguangHu01@users.noreply.github.com>
-
Committed by WeiXin
* edit paddle.save/load API * Update io.py edit doc * delete cpython-37.pyc * Update io.py edit doc * Update io.py recommit * Update io.py recommit * Update io.py recommit * Update io.py recommit
-
Committed by WeiXin
* clear 'BasicEngine' when an exception occurs in the backward. * deal with conflict. * deal with conflict.
-
Committed by wenbin
-
Committed by Zhong Hui
* [OPs] Bug fix, fix the segment mean for illegal syncthreads usage.
-
Committed by Zhang Zheng
-
Committed by Baibaifan
-
Committed by WeiXin
* jit.save/load support function. * delete unittest test_jit_load_model_incomplete. * edit code according to CI * Modify the documentation. * add note to doc.
-
Committed by xiemoyuan
* fixed docs. * Fixed docs. test=document_fix code bak. fixed docs. test=document_fix * Revert to previous version of python/paddle/fluid/backward.py * fixed bugs. * test=document_fix. Fixed examples.
-
Committed by Guanghua Yu
* fix cross_entropy calculation error * add unittest and fix static
-
Committed by Ren Wei (任卫)
* UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 1788: ordinal not in range(128) test=document_fix str(doc) in python2 test=document_fix * update md5 function in count_api_without_core_ops.py str in py2 is different. test=document_fix
-
Committed by xiemoyuan
* Support list and tuple for parameters of layer_norm, multiprocess_reader, DatasetFolder and ImageFolder. * add unittest for layer_norm. * add require gpu for example.
-
Committed by Pei Yang
-
Committed by tianshuo78520a
This reverts commit 4b7242b0.
-
Committed by Aurelius84
-
Committed by XiangGao
Co-authored-by: Yang Zhang <yangzhang@live.com>
-
Committed by ShenLiang
* fix amp bug * fix name of wordsize
-
Committed by zhiboniu
-
Committed by zhiboniu
-
Committed by zhiboniu
-
- 26 Apr, 2021 7 commits
-
-
Committed by lilong12
* add sendrecv, test=develop
-
Committed by WeiXin
-
Committed by ShenLiang
-
Committed by zhangchunle
-
Committed by Zhou Wei
* clear CUDA compile environment on windows * fix Windows CI * fix Windows CI * fix Windows CI
-
Committed by jiangcheng
* new optimize for where_index_op with prefix sum version. * write a scan prefix sum kernel with stream for where index op. * optimize where_index by using cub::DeviceScan::InclusiveSum instead of imperfect self-kernel. * remove CheckTrue struct and rename stide_array for readable. * optimize variable name for readable. * optimize function name and annotation.
-
Committed by Thunderbrook
* optimize pull sparse * optimize pull sparse * change macro * format
-