- 28 Jun, 2022 6 commits
-
-
Committed by Ming-Xu Huang
1. `test_parallel_executor_seresnext_base_gpu` failed on 2 P100 GPUs with the `470.82` driver.
   ```
   ======================================================================
   FAIL: test_seresnext_with_learning_rate_decay (test_parallel_executor_seresnext_base_gpu.TestResnetGPU)
   ----------------------------------------------------------------------
   Traceback (most recent call last):
     File "/opt/paddle/paddle/build/python/paddle/fluid/tests/unittests/test_parallel_executor_seresnext_base_gpu.py", line 32, in test_seresnext_with_learning_rate_decay
       self._compare_result_with_origin_model(
     File "/opt/paddle/paddle/build/python/paddle/fluid/tests/unittests/seresnext_test_base.py", line 56, in _compare_result_with_origin_model
       self.assertAlmostEquals(
   AssertionError: 6.8825445 != 6.882531 within 1e-05 delta (1.335144e-05 difference)
   ----------------------------------------------------------------------
   ```
2. To evaluate loss convergence more accurately, we propose using IOU as the metric instead of comparing the first and last loss values.
3. Following an offline discussion, we also evaluated convergence on P100 and A100 over 1000 iterations to make sure this UT has the same convergence behavior on both devices. The curves are shown below.
   ![A100-Single, P100-Single and Diff (1)](https://user-images.githubusercontent.com/13541238/175461920-25df6101-6dd8-4387-862c-d1c8e9299c57.png)
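The commit message does not say how the IOU metric is computed. As a rough illustration only, one plausible reading is an intersection-over-union of the areas under two loss curves. The function name `curve_iou` and the min/max formulation below are assumptions, not code from the PaddlePaddle test suite.

```python
def curve_iou(losses_a, losses_b):
    """Hypothetical IoU of two non-negative loss curves sampled at the same steps.

    Intersection is the area under the pointwise minimum, union is the area
    under the pointwise maximum; identical curves give 1.0.
    """
    if len(losses_a) != len(losses_b):
        raise ValueError("curves must have the same number of steps")
    intersection = sum(min(a, b) for a, b in zip(losses_a, losses_b))
    union = sum(max(a, b) for a, b in zip(losses_a, losses_b))
    # Two all-zero curves overlap completely by convention.
    return intersection / union if union else 1.0


# The two final losses from the failing assertion differ by about 1.3e-05,
# which fails a 1e-05 delta check but barely moves a curve-level metric.
iou = curve_iou([7.0, 6.9, 6.8825445], [7.0, 6.9, 6.882531])
print(iou > 0.999)
```

A metric of this kind compares the whole training trajectory, so a small difference at a single step does not dominate the result the way an endpoint comparison does.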
-
Committed by fuyou765
-
Committed by zhouweiwei2014
* [Sparse] add SparseTensor mv kernel (csr*dense_vec->dense_vec, coo*dense_vec->dense_vec)
* fix CI
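For readers unfamiliar with the operation, the sparse matrix-vector product named in this commit can be sketched in plain Python as follows. This is not the Paddle kernel; the function names `csr_mv` and `coo_mv` and their argument layout are illustrative assumptions.

```python
def csr_mv(crows, cols, values, vec):
    """Multiply a CSR matrix by a dense vector, returning a dense vector.

    crows[i]:crows[i + 1] is the slice of cols/values that belongs to row i.
    """
    n_rows = len(crows) - 1
    out = [0.0] * n_rows
    for row in range(n_rows):
        for k in range(crows[row], crows[row + 1]):
            out[row] += values[k] * vec[cols[k]]
    return out


def coo_mv(rows, cols, values, n_rows, vec):
    """Multiply a COO matrix by a dense vector, returning a dense vector.

    Each (rows[k], cols[k], values[k]) triple is one non-zero entry.
    """
    out = [0.0] * n_rows
    for r, c, v in zip(rows, cols, values):
        out[r] += v * vec[c]
    return out


# Matrix [[1, 0, 2], [0, 3, 0]] times vector [1, 2, 3].
vec = [1.0, 2.0, 3.0]
print(csr_mv([0, 2, 3], [0, 2, 1], [1.0, 2.0, 3.0], vec))
print(coo_mv([0, 0, 1], [0, 2, 1], [1.0, 2.0, 3.0], 2, vec))
```

Both layouts visit only the stored non-zero entries, so the cost scales with the number of non-zeros rather than with the full matrix size.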
-
Committed by minghaoBD
-
Committed by zhangxiaoci
-
Committed by Xiaoxu Chen
* enable Jacobian, Hessian supporting new autograd
* fix prim mode failed in PR-CI-Windows
* add forward_gradients api
* add forward_gradients api
* skip test_autograd_functional_prim in windows ci
* fix test_autograd_functional_prim timeout
* remove the block parameter in prim2orig method
* remove duplicate to_tensors code snippet # test=allcases
-
- 27 Jun, 2022 4 commits
-
-
Committed by Aurelius84
* [Dy2Stat] Refactor convert_shape transformer logic
* clean useless unittest
-
Committed by wanghuancoder
* rename eagerpylayer
-
Committed by Aganlengzi
* [CustomDevice] add custom place support
* sync format
-
Committed by Aurelius84
-
- 24 Jun, 2022 12 commits
-
-
Committed by gongweibao
* tmp fix
* init
* compile ok
* compile ok
* add vlogs
* add test
* fix termination error
* add testfile
* add
* fix windows compile
* fix windows compile
* fix windows compile
* fix windows compile
* fix windows compile
* fix windows compile
* fix windows compile
* fix windows compile
* fix kunlun compile
* fix compilation
* fix compilation
* fix compilation
* tmp fix
* add windows
* add windows
* add more logs
* change timeout to protected
* SB
* add
* add
* fix timeout
* add
* fix test
* fix test
* fix test
* fix ut
* fix ut
* fix ut
-
Committed by Guanghua Yu
-
Committed by xiongkun
* add closure analysis for control flow and add some unittests
* fine-tune the design of FunctionScopeVisitor
* fix
* fix python check
* fix code per code review
-
Committed by ccrrong
* add slice plugin int32 support
-
Committed by zhouweiwei2014
-
Committed by fuyou765
-
Committed by z8hanghuan
* modify xpu unittest to support fp64, *test=kunlun
* modify xpu unittest to support fp64 for KL2, *test=kunlun
* modify xpu unittest to support fp64, *test=kunlun
* modify xpu unittest to support fp64, *test=kunlun
-
Committed by cifar10
-
Committed by Chen Weihang
* fix incompatible error
* remove default constructor
* add macro
* fix cpu make error
* add DefaultGPUPlace api
-
Committed by 光明和真理
-
Committed by Chenxiao Niu
-
Committed by Yulong Ao
* [Auto Parallel] Use a fast completion for data parallelism
* remove unused cuSparse function
* [Auto Parallel] Fix some bugs of the fast dp completion
* [Auto Parallel] Add the cmake statements
* [Auto Parallel] Make the unittest adapt to the new interface
* [Auto Parallel] Modify the timeout of the unittest
* [Auto Parallel] Remove unnecessary comments

Co-authored-by: zhouwei25 <zhouwei25@baidu.com>
-
- 23 Jun, 2022 12 commits
-
-
Committed by niuliling123
-
Committed by Matsumoto Ruko
-
Committed by taixiurong
-
Committed by Leo Chen
-
Committed by zyfncg
* move trace into api.yaml
* add trace unittest
* fix trace test
* fix generate op
-
Committed by zhangbo9674
-
Committed by Aurelius84
* [Dy2Stat] Support nonlocal mechanism in IF ast transformer
* support prune return vars in cond
* fix unittest
* fix unittest
* fix static check
-
Committed by ccrrong
* add cast trt converter
-
Committed by Shijie
-
Committed by Shijie
* Fix test_fuse_resnet_unit failure
* Fix test_imperative_auto_mixed_precision failure
* Fix sparse_attention_op error
* Fix sparse_attention_op error
-
Committed by zlsh80826
* Reduce gather op unit tests size and increase the timeout
* Add NVIDIA_TF32_OVERRIDE for multi-processes environment
* Remove record test for device event ut
-
Committed by Sylwester Fraczek
* sylwek prototype params to int8 pass
* trying to make warmup work
* wip
* wip
* change test to cpp test
* review fixes, refactoring
* more refactoring
* add erasevars
* change test to fixture
* rename pass and reorder erasevars and graphsaferemovenodes
* fix
* more refactoring and fixed bug
* formatting
* remove scale count
* enforce message too short
* remove erasevars; erasevars could be the cause of memory issues; some other fixes
* add count of successful fuses to name of new nodes
* FindVar -> GetVar and use ConvResidual pattern
* use tensor->clear() instead of new variable
* Update paddle/fluid/framework/ir/mkldnn/params_quantization_mkldnn_pass_tester.cc
  Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>
* Update paddle/fluid/framework/ir/mkldnn/params_quantization_mkldnn_pass_tester.cc
  Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>
* Update paddle/fluid/inference/tests/api/analyzer_lexical_analysis_gru_tester.cc
  Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>
* add log (review fix)
* review fix (2 functions to one)
* code review: Conv->QuantizeConv
* revert
* fix formatting
* remove unused functions
* add paddle enforce

Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>
-
- 22 Jun, 2022 6 commits
-
-
Committed by sneaxiy
-
Committed by ccrrong
* fix arg_max converter
-
Committed by WJJ1995
* fixed multihead matmul fuse pass
* Add unittests
* rm scale op
* fixed code style
* fixed code style
* resolve testcase failed
* add note
-
Committed by zhoutianzi666
* add fc, multihead_mul, shape tensor infer, slice
-
Committed by zhangkaihuo
-
Committed by tianshuo78520a
* test=gpups
-