- 01 Mar, 2021 2 commits
- 26 Feb, 2021 3 commits
-
-
Committed by liym27
-
Committed by liym27
* [NPU] Support npu op: (1) pow (2) pow_grad
* Support fp16
-
Committed by xiayanming
add ascend unittest
-
- 25 Feb, 2021 2 commits
-
-
Committed by xiayanming
Ascendrc
-
Committed by Leo Chen
refactor npu device manager (#31154)
-
- 23 Feb, 2021 1 commit
-
-
Committed by liym27
* [NPU] Support executor with NPU
* Fix code according to reviews
* Fix code
* Add unittest for sub op npu
-
- 18 Feb, 2021 1 commit
-
-
Committed by xiayanming
support parsing ascend rank table file
-
- 25 Jan, 2021 1 commit
-
-
Committed by Void Main
[Feature] Build parser to support distributed training
-
- 22 Jan, 2021 2 commits
-
-
Committed by gongweibao
cleanup test_ascend_group.py
-
Committed by gongweibao
Add startup bash files of test_ascend_group
-
- 21 Jan, 2021 4 commits
-
-
Committed by gongweibao
Add Hccl program group
-
Committed by gongweibao
Pass device_ids info from launch to trainer
-
Committed by Void Main
Build parser for Hcom* operators
-
Committed by gongweibao
Add distribution supported
-
- 15 Jan, 2021 3 commits
- 14 Jan, 2021 5 commits
-
-
Committed by taixiurong
-
Committed by Zhou Wei
-
Committed by Jiaqi Liu
* add auc into 'all' list
* alias acc, expose to users
* update sample code
-
Committed by 123malin
* test=develop, add distributed_infer
-
Committed by Chen Weihang
-
- 13 Jan, 2021 9 commits
-
-
Committed by Huihuang Zheng
As the title
-
Committed by cc
* skip quantizing ops in cpu inference, test=develop
-
Committed by Bai Yifan
-
Committed by huangxu96
-
Committed by Huihuang Zheng
As the title
-
Committed by Leo Chen
Set expected place in child thread for dataloader to avoid costing cuda memory on other card (#30338)
* set expected place in child thread for dataloader
* set device id when set tensor from numpy
* revert tensor_py change
* add compile guard
* fix ci
* fix bug
-
Committed by QingshuChen
* optimize memcpy perf for kunlun
* remove useless unittest for kunlun mean
* minor
-
Committed by huangxu96
* Implemented AddQuantDequantPass in imperative quantization.
* Supported LeakyReLU Quantization
* For meeting coverage rate.
* Changed the file name of test of AddQuantDequant
* Implemented more Quantized NoWeightLayers.
* Fix the problem that the loss cannot be aligned between static and dynamic model quantization; add swish as a supported quantized layer in imperative quantization.
* remove noweight_list
* support 2.0 API such as Pool2D and ReLU
-
Committed by ShenLiang
-
- 12 Jan, 2021 7 commits
-
-
Committed by JZ-LIANG
-
Committed by lidanqing
-
Committed by Wojciech Uss
* upgrade oneDNN version to 2.0 master branch
* Added workarounds for new lib onednn change
* fix regex
Co-authored-by: Jacek Czaja <jacek.czaja@intel.com>
-
Committed by tangwei12
* add sparse embedding & load vars for 2.0
  Change-Id: I36b59ed5f015189dc9d9d2e34a9357722d369f1b
* fix hdfs gloo
  Change-Id: Ia84d579053720ad804183e54c9a04b4f031c79c6
* fix gloo hdfs
  Change-Id: I5ab982fd483cddc10adcdef0b8aa83aca976cb9e
* move loadvar/sparse embedding from incubate to static
  Change-Id: I57081d3545ad2efab78c72420d2162c0eacaf3a0
-
Committed by YUNSHEN XIE
* disable test_pipeline
* fix error
-
Committed by chajchaj
* fix bug of using ignore_index and reduction, test=develop
* fix bug of celoss when using ignore_index and reduction, test=develop
* improve performance when ignore_index=-100, test=develop
* add test in test_cross_entropy_loss.py for coverage rate, test=develop
* rm comment in test_cross_entropy_loss.py, test=develop
* del hard code of "float64" in python/paddle/nn/functional/loss.py, test=develop
* change mask to a more simplified implementation, test=develop
* del comment in python/paddle/nn/functional/loss.py, test=develop
* del hard code and change mask to a more simplified implementation, test=develop
* change mask to a more simplified implementation, test=develop
* change mask to a more simplified implementation, test=develop
-
Committed by Double_V
* fix elugradgrad test fail and error message opt
* fix unittest, test=develop
* Update prroi_pool_op.h fix error message
* opt message, test=develop
* fix ci fail, test=develop
-