- 22 Jun, 2022 10 commits
-
-
Committed by Jackwaterveg
* fix conflict * improve the doc
-
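Committed by Yiqun Liu
cherry-pick #42750. QA reported that the optimization in #42750 can improve solov2 model performance by 6%, so it is cherry-picked to 2.3. Because #41096 moved the Python implementation of linspace from fluid.layers.tensor to paddle.tensor.creation, and that PR is not in the release/2.3 branch, the Python changes from #42750 are synced into fluid.layers.tensor.linspace.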
-
Committed by shiyutang
* merge_release_and_dev * merge_release_dev * update * Use tempfile to place the temporary files (#43237) * tempfile_fix * update * fix_CI * update_word2vec.inference.model * remove_change_in_word2vec_book * fix_word2vec_book * rm_affine * update
-
Committed by Zhang Ting
Fix the bug that _DataLoaderIterMultiProcess uses the time to generate the seed. Cherry-pick #43318
-
Committed by Zhang Ting
[cherry pick] Support optional residual add in fused ops and slice large tensor for cudnn_softmax (#43719) cherry-pick #43635 #43681 #43474
-
Committed by Sing_chan
Only the format tool upgrades (clang-format, yapf, cmake-format) are cherry-picked to release/2.3. Lint tools such as cpplint are not moved, because cpplint errors will not be fixed in release/2.3. pre_commit.sh is also moved to release/2.3 so that both PR-CI-pre-commit and PR-CI-pre-commit-23 can work. clang-format is pre-installed to avoid repeated installation caused by pre-commit's multi-threaded execution.
-
Committed by zyfncg
-
Committed by LielinJiang
* fix decode_jpeg example code * fix decode_jpeg example code
-
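Committed by zhangbo9674
During development of the amp-o2 feature, a state_dict hook was added to support specifying the data type in which a network is stored. However, Layer's set_state_dict obtains the network parameters through state_dict and loads into them, so the presence of the hook prevented set_state_dict from loading into the original network parameters. This PR fixes the problem by adding a switch that controls the hook and disabling the hook in set_state_dict. See pr43407 for details.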
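The fix described above can be sketched in plain Python. This is only an illustration of the idea, not Paddle's code: `TinyLayer`, the `use_hook` argument, and the list-based parameters are hypothetical stand-ins.

```python
from collections import OrderedDict


class TinyLayer:
    """Hypothetical stand-in for a layer; not Paddle's Layer class."""

    def __init__(self):
        self._params = OrderedDict(weight=[1.0, 2.0])
        self._state_dict_hooks = []

    def register_state_dict_hook(self, hook):
        self._state_dict_hooks.append(hook)

    def state_dict(self, use_hook=True):
        # Shallow copy: the values are the real parameter objects.
        dest = OrderedDict(self._params)
        if use_hook:
            for hook in self._state_dict_hooks:
                dest = hook(dest)  # a hook may return converted copies
        return dest

    def set_state_dict(self, new_state):
        # Hooks are switched off here so that the values are written into
        # the real parameters rather than into copies made by a hook.
        current = self.state_dict(use_hook=False)
        for name, value in new_state.items():
            current[name][:] = value


def copy_hook(state):
    # Hypothetical hook that returns copies, as a dtype-casting hook would.
    return OrderedDict((k, list(v)) for k, v in state.items())


layer = TinyLayer()
layer.register_state_dict_hook(copy_hook)
layer.set_state_dict({"weight": [3.0, 4.0]})
```

With the switch in place, `set_state_dict` updates `layer._params` even though a copying hook is registered; without it, the assignment would land on the hook's copies.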
-
Committed by zhangbo9674
Bug: when an entry in class Layer's _buffers is None, calling the to() method fails with the error layer to 'NoneType' object has no attribute 'place'. Fix: the to() method now checks for None entries in _buffers and skips any entry that is None.
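A minimal sketch of the guard described above, assuming a simplified model in which each buffer only carries a `place` attribute. `TinyBuffer` and `move_buffers` are hypothetical names and do not mirror the actual implementation of `to()`.

```python
class TinyBuffer:
    """Hypothetical buffer that only records where it lives."""

    def __init__(self, place):
        self.place = place


def move_buffers(buffers, place):
    """Return a dict of buffers moved to `place`, skipping None entries."""
    moved = {}
    for name, buf in buffers.items():
        if buf is None:
            # Nothing to move; reading buf.place here would raise
            # AttributeError: 'NoneType' object has no attribute 'place'.
            moved[name] = None
            continue
        if buf.place != place:
            buf = TinyBuffer(place)
        moved[name] = buf
    return moved


result = move_buffers({"running_mean": TinyBuffer("cpu"), "unused": None}, "gpu")
```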
-
- 21 Jun, 2022 5 commits
-
-
Committed by Jackwaterveg
* fix usage of prefetch_factor * add assert * add docstring and change prefetch_factor when num_workers=0 * fix doc
-
Committed by Guanghua Yu
* cherry pick #43088 #40664 * fix clang format
-
Committed by chalsliu
* Update CUDA and TensorRT version for CI * disable ut * Update TensorRT for CUDA 10.2
-
Committed by niuliling123
Remove redundant printing in layout autotune. Background: the layout autotune log increases the amount of information printed by the model.
-
Committed by zhoutianzi666
-
- 20 Jun, 2022 5 commits
-
-
Committed by z8hanghuan
* modify xpu.cmake,*test=kunlun (#41832) * modify xpu.cmake,*test=kunlun * modify xpu.cmake,*test=kunlun * modify xpu.cmake,*test=kunlun * modify xpu.cmake,*test=kunlun * support bilstm,*test=kunlun * [cherry-pick]support multi_layer of bilstm,*test=kunlun * [cherry-pick]refactor sum unit test,*test=kunlun (#43561)
-
Committed by xiongkun
* cherry pick from #43397 * fix code
-
Committed by Shang Zhizhou
-
Committed by zhaoyingli
-
Committed by zhaoyingli
* place all save/load path into temporary directory * rm no need unittest
-
- 18 Jun, 2022 1 commit
-
-
Committed by gongweibao
* fix test * fix test.
-
- 17 Jun, 2022 4 commits
-
-
Committed by weishengying
-
Committed by YuanRisheng
-
Committed by Haohongxiang
* fix pg bugs * update
-
Committed by WangXi
* Rename dropout is test (#43098) * replace dropout_is_test with is_test. * improve atol on a100. * fused_attention fused_feedforward api support Model Tensor Parallel (#42985) * fix is_test bug in fused_feedforward. (#43508) Co-authored-by: Li Min <11663212+limin2021@users.noreply.github.com>
-
- 16 Jun, 2022 5 commits
-
-
Committed by zhangbopd
Use tempfile in unit tests and custom op tests in place of ad hoc temporary files, so that all temporary files are deleted properly after each unit test finishes and do not take up disk space. The PR only involves unit test and op test modifications and does not affect existing functionality. The release/2.3 branch is modified in PR43521.
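The pattern described above can be sketched with the standard library alone. The test class and file name below are hypothetical and are not taken from the PR; the point is that `tempfile.TemporaryDirectory` owns every file the test writes and removes them in `tearDown`.

```python
import os
import tempfile
import unittest


class SaveLoadTest(unittest.TestCase):
    """Hypothetical unit test that writes only inside a temporary directory."""

    def setUp(self):
        # Every file created by the test lives under this directory.
        self.temp_dir = tempfile.TemporaryDirectory()

    def tearDown(self):
        # Deletes the directory and everything inside it.
        self.temp_dir.cleanup()

    def test_save(self):
        path = os.path.join(self.temp_dir.name, "model.txt")
        with open(path, "w") as f:
            f.write("weights")
        self.assertTrue(os.path.exists(path))
```

Because cleanup happens in `tearDown`, the files are removed whether the test passes or fails.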
-
Committed by Qi Li
* fix unit test temp file, test=develop (#43155) * add cleanup code, test=develop (#43305)
-
Committed by Qi Li
* Fix numpy 1.20+ deprecation warnings (#42929) * Replace np.bool/np.bool8 with np.bool_ * Replace np.object with np.object_ * Replace np.complex with np.complex128 * Replace np.float with np.float64 * Replace np.int with np.int_ * Rerun pre-commit for newer pre-commit configuration * Use builtin bool instead of np.bool_ based on the context * fix mode dtype Co-authored-by: zlsh80826 <rewang@nvidia.com>
-
Committed by zhaoyingli
-
Committed by Guanghua Yu
* Add progress bar and speed up Quantization Pass * fix typo
-
- 15 Jun, 2022 1 commit
-
-
Committed by zyfncg
* fix bug of strided_slice (#43388) * fix stride_slice bug * fix bug * fix bug of infer shape for slice (#43443)
-
- 14 Jun, 2022 3 commits
-
-
Committed by Shang Zhizhou
-
Committed by xiongkun
* [EinsumOp] Polish forward logic and backward logic for optimize (#42603) * change logic for optimize * modifty * merge * change einsum_v2 as default and add new flags: FLAG_einsum_opt=1|0 (#43010) * [EinsumOp] Make EinsumOp support bfloat16. (#43085) * change einsum_v2 as default and add new flags: FLAG_einsum_opt=1|0 * make EInsumOP support bf16 * add unittest for BF16 * add condition for test_BF16 * fix bugs * fix * change the backward api to fit einsum op
-
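Committed by freeliuzc
Use tempfile in place of temporary files so that all temporary files are deleted properly after the unit tests finish and do not take up disk space. This PR only modifies unit tests and does not affect existing functionality. The develop branch change is in PR 43376.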
-
- 13 Jun, 2022 1 commit
-
-
Committed by tianshuo78520a
Remove useless information.
-
- 09 Jun, 2022 3 commits
-
-
Committed by Guanghua Yu
* support fuse conv and bn in QAT (#42255) * support skip_op_list in PostTrainingQuantization (#42378) * fix unittest
-
Committed by Guanghua Yu
-
Committed by zhupengyang
-
- 08 Jun, 2022 2 commits
-
-
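Committed by niuliling123
The original implementations of Reduce amax/amin and frobenius_norm_kerne were based on Eigen and took a long time to compile, so this PR replaces them with KP implementations. It also removes duplicated functionality support in DefaultElementwiseOperator to reduce the compile time of the elementwise_double_grad OP.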
-
Committed by tianshuo78520a
Remove the whl package size comparison on 2.3.
-