1. 13 Oct, 2021 8 commits
    • L
      Merge lars op (#35476) · 0c31579c
      limingshu committed
      * A leap of try for cudaLaunchCooperativeKernel
      
      * fix bugs
      
      * Totally replace the lar cuda kernel
      
      * Fix bugs
      
      * a test for lars merge
      
      * Adding las_op_momentum infer_shape
      
      * Fix codes
      
      * use avg_numel instead of max_numel to acquire grid num
      
      * modify unittest files about lars op
      
      * Finally converge when merged-lars works
      
      * fix ctest files
      
      * add merged_operation kernel when cuda version is older than 11
      
      * Fix code style
      
      * fix ctest failure
      
      * fix error
      
      * fix all ctest error and change lars compute code of cpu
      
      * fix bugs on v100.
      
      * revert python modification about lars
      
      * revert python modification codes
    • W
      24418479
    • W
      pool fix (#36388) · 192e08cb
      wenbin committed
      * pool fix
      
      * comments
    • J
      Implemented LRU based cache clearing (#36290) · bf748f24
      Jacek Czaja committed
      - Lint
      
      - Merge with develop
      
      - lint
    • L
      [Amp] refine code of amp level (#36362) · 59e425cd
      Leo Chen committed
      * refine amp level
      
      * fix typo
      
      * update tracer._amp_level
    • H
      Remove RunFromCinn in PE because We Will Call CinnRunner in Compute of SubgraphOp (#36385) · e051bba0
      Huihuang Zheng committed
      Remove RunFromCinn method in PE because We Will Call CinnRunner in Compute method of SubgraphOp
    • J
      [New Feature] Support triple grad in Paddle (#36187) · 2c44ee7e
      Jiabin Yang committed
      * native commit for triple grad of sigmoid
      
      * Updated unittests files
      
      * init functional jacobian api
      
      * Updated trible_test func
      
      * Updated gradient_checker & test_script
      
      * finish test with dtype float32
      
      * add float64 test case
      
      * polish code
      
      * use atol=1e-5 with dtype float64
      
      * fix for ci
      
      * set timeout for test_jacobian
      
      * fix dygraph grad to support high differential
      
      * polish API docstring
      
      * Updated gradient checker and some related files
      
      * fix double grad strip error for high differential
      
      * fix double grad strip error for high differential
      
      * Add Sigmoid triple grad tests
      
      * fix dygraph double grad dtype error when calling for high differential scenario
      
      * Updated triple grad tests func
      
      * Use np.random to initialize ddx
      
      * Updated triple_grad_check func
      
      * add todo for gradient checker and refine some comments
      
      * remove additional code
      
      * add test for warning in backward.py
      
      * format python code
      Co-authored-by: veyron95 <veyron_wu@163.com>
      Co-authored-by: levi131 <limaolin01@baidu.com>
    • W
      [PaddleInference] Pass: add int8 flag for op (#36042) · d7858c99
      Wangzheee committed
      * add_int_pass
      
      * add_int8_flag_pass
      
      * add_int8_flag_pass
      
      * fix CMakeLists.txt
      
      * fix test_trt_fc_fuse_quant_dequant_pass.py
      
      * fix python/paddle/fluid/tests/unittests/ir/inference/test_trt_fc_fuse_quant_dequant_pass.py
      
      * fix test_trt_fc_fuse_quant_dequant_pass.py
  2. 12 Oct, 2021 7 commits
  3. 11 Oct, 2021 20 commits
  4. 09 Oct, 2021 5 commits