- 30 Jun, 2022 12 commits
-
-
Committed by zhaoying9105
-
Committed by chenjian
* add code * add unit test
-
Committed by zmxdream
* Revert "[GPUPS]Optimize dymf kernel (#43911)"
-
Committed by kuizhiqing
-
Committed by Leo Chen
* support scope_guard * fix test
-
Committed by xiongkun
* merge and add base support for non-local for * for and while non-local support * fix ci errors: v1 * fix bug * fix * fix code * fix * fix * fix
-
Committed by Ruibiao Chen
* Remove boost::variant for FetchResultType * Fix pybind errors
-
Committed by JingZhuangzhuang
* modify graph_pattern to thread_local * modify graph_pattern to thread_local
-
Committed by Zhang Zheng
* Add new attr of fused_multi_transformer * fix format * add note * add in layer * fixfixfixfix
-
Committed by chentianyu03
* add relu6 kernel and yaml * format files * format code and fix bug * fix build failed
-
Committed by Chenxiao Niu
-
Committed by Jiabin Yang
-
- 29 Jun, 2022 17 commits
-
-
Committed by Sing_chan
-
Committed by zmxdream
-
Committed by zyfncg
* support complexd selected_rows kernel in yaml * support configuring optimizer api in yaml * fix data transform bug
-
Committed by zhangchunle
-
Committed by Wilber
-
Committed by JZ-LIANG
* fixed bug for pass & engine * fixed bug for benchmark GPT-3
-
Committed by Zhen Wang
* Update the lock logic used in CinnCompiler::Compile.
-
Committed by tianshuo78520a
-
Committed by zhangkaihuo
-
Committed by ccrrong
* add comparisons trt converter
-
Committed by Leo Chen
-
Committed by Leo Chen
* separate variable scope and scope * hot fix for lod_tensor_blocking_queue * fix bug that variable exists in global scope
-
Committed by Chen Weihang
-
Committed by Wilber
* inference add convert to mixed model ability.
-
Committed by zyfncg
* move cross from legacy_api.yaml to api.yaml * move diagonal to api.yaml
-
Committed by ronnywang
-
Committed by QingshuChen
* skip xpu conv2d fp16 unittest *test=kunlun * minor *test=kunlun
-
- 28 Jun, 2022 11 commits
-
-
Committed by Yuang Liu
-
Committed by Sing_chan
-
Committed by Aurelius84
-
Committed by Aurelius84
* [Dy2Stat]Polish all API name of _jst
-
Committed by xiongkun
* add unittest for PR43688
-
Committed by wangzhen38
* [UPDATE FLUID API] only reference in paddlerec * change lr * [UPDATE FLUID API] only reference in paddlerec * update by reviews
-
Committed by Feiyu Chan
* change the condition used to find the Python interpreter to avoid skipping the find process. PYTHONINTERP_FOUND is the best signal that the Python interpreter has been found.
-
Committed by Chen Long
-
Committed by Tomasz Socha
* Remove output arguments from functions. Replace pointers with references * Name used bool flags * Reorder functions * Enable bfloat16 data type * Give declarations some space * Style * Style
-
Committed by zhaoying9105
-
Committed by Ming-Xu Huang
1. test_parallel_executor_seresnext_base_gpu failed on 2 P100 GPUs with `470.82` driver.

```
======================================================================
FAIL: test_seresnext_with_learning_rate_decay (test_parallel_executor_seresnext_base_gpu.TestResnetGPU)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/opt/paddle/paddle/build/python/paddle/fluid/tests/unittests/test_parallel_executor_seresnext_base_gpu.py", line 32, in test_seresnext_with_learning_rate_decay
    self._compare_result_with_origin_model(
  File "/opt/paddle/paddle/build/python/paddle/fluid/tests/unittests/seresnext_test_base.py", line 56, in _compare_result_with_origin_model
    self.assertAlmostEquals(
AssertionError: 6.8825445 != 6.882531 within 1e-05 delta (1.335144e-05 difference)
----------------------------------------------------------------------
```

2. To evaluate loss convergence more accurately, we propose using IOU as the metric instead of comparing the first and last loss values.

3. As discussed offline, we also evaluated convergence on P100 and A100 over 1000 iterations to make sure this UT has the same convergence property on both devices. The curves are shown below.

![A100-Single, P100-Single and Diff (1)](https://user-images.githubusercontent.com/13541238/175461920-25df6101-6dd8-4387-862c-d1c8e9299c57.png)
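
For readers unfamiliar with the failure above, the following is a minimal sketch, not the code from this commit. `within_delta` mirrors the comparison that `unittest`'s `assertAlmostEqual` performs when `delta` is passed. `curve_iou` is a hypothetical helper: the commit message does not define how IOU is computed for loss curves, so the sketch assumes one plausible reading (sum of pointwise minima divided by sum of pointwise maxima, for non-negative losses of equal length).

```python
def within_delta(a, b, delta):
    """Same rule assertAlmostEqual applies when `delta` is given: |a - b| <= delta."""
    return abs(a - b) <= delta


def curve_iou(losses_a, losses_b):
    """Hypothetical overlap score for two loss curves (an assumption, not Paddle's code).

    Treats each curve as a sequence of non-negative per-iteration losses and
    returns sum(min) / sum(max); 1.0 means the curves coincide exactly.
    """
    if len(losses_a) != len(losses_b):
        raise ValueError("loss curves must have the same number of iterations")
    intersection = sum(min(a, b) for a, b in zip(losses_a, losses_b))
    union = sum(max(a, b) for a, b in zip(losses_a, losses_b))
    # Two all-zero curves are identical, so report full overlap.
    return 1.0 if union == 0 else intersection / union


# The two loss values from the traceback differ by more than 1e-05,
# so a single-point comparison with that delta fails.
single_point_ok = within_delta(6.8825445, 6.882531, 1e-05)  # False

# A curve-level score looks at every iteration instead of one or two points.
overlap = curve_iou([6.88, 5.10, 3.20], [6.88, 5.12, 3.19])
```

A curve-level score of this kind is less sensitive to a small numerical difference at one iteration than a strict comparison of individual loss values, which appears to be the motivation stated in the commit message.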
-