1. 02 Aug, 2023 7 commits
    • [XPU]Add conv1d fuse pass (#55719) · 22c7a6eb
      wz1qqx committed
    • [IR] NewIr Interpreter Beta run regular (#55828) · 63b7fc80
      zhangbo9674 committed
      * add interface
      
      * add code
      
      * add code
      
      * add code
      
      * add code
      
      * fix bug
      
      * fix bug
      
      * add var prefix
      
      * add code
      
      * add code
      
      * add code
      
      * fix compile bug
      
      * fix bug
      
      * refine code
      
      * refine code
      
      * refine code
      
      * refine code
      
      * fix bug
      
      * add code
      
      * add code
      
      * fix bug
      
      * add code
      
      * add code
      
      * refine code
      
      * refine code
      
      * fix bug
      
      * add code
      
      * fix bug in phi__kernel_utils
      
      * refine code
      
      * fix bug
      
      * open flag
      
      * refine code
      
      * fix bug
      
      * fix bug
      
      * refine code
      
      * fix bug
    • [Inference] Replace groupNorm when data types are bf16 and fp16, and data format is NHWC implementation. (#55399) · e61d892a
      yangjianfengo1 committed
      * finish
      
      * cpergroup odd
      
      * fix bf16
      
      * single channel
      
      * code style
      
      * precision alignment
      
      * add head_file
      
      * add bf16 head file
      
      * bf16 2
      
      * bf16
      
      * bf16 head
      
      * bf16 compile
      
      * py test
      
      * bf16 compile
      
      * bf16 compile
      
      * unset py test
      
      * nhwc
      
      * test
      
      * mean var
      
      * bf16 success
      
      * su
      
      * ctest success
      
      * use is_same_as
      
      * is_same
      
      * use is_same
      
      * rtol
      
      * gpu_stream
      
      * del sigmoid
      
      * fix bfloat16 type
      
      * use cuda_bf16_hpp
      
      * use_cuda_arch
      
      * bfloat162float2
      
      * del inplace_tol
      
      * del max_releative_tol
      
      * temp store
      
      * precision alignment
      
      * temp store
      
      * plugin
      
      * precision alignment
      
      * alignment
      
      * include cuda.h
      
      * del half
      
      * half single
      
      * ci
      
      * add const
      
      * ci
      
      * cudamemset
      
      * del printf
      
      * fp16 test
      
      * add half compute
      
      * del bf16 ci
      
      * del ci
      
      * ci approve
      
      * del fluid include
    • fix security bug (#55866) · 92aa92fa
      wanghuancoder committed
      * fix security bug
    • Add FP16 & BF16 for erfinv (#55287) · 6d7efd09
      cyberslack_lee committed
    • fix security bug (#55782) · 19da5c0c
      wanghuancoder committed
      * fix security bug
    • [XPU] Add gather_squeeze_pass (#55605) · d13a49d6
      jiangfan06 committed
  2. 01 Aug, 2023 7 commits
  3. 31 Jul, 2023 7 commits
  4. 28 Jul, 2023 4 commits
  5. 27 Jul, 2023 4 commits
    • add int32/int64 for outer/matmul Kernel. (#55584) · ff2142f2
      zxcd committed
      * add int32/int64 for outer/matmul Kernel.
      
      * fix by comment.
      
      * fix by comment
    • [NewIR]Fix new ir dygraph 2 static concat grad bug (#55634) · 51ebcf68
      hong committed
      * add kernel dialect
      
      * change DenseTensorTypeStorage to DenseTensorType
      
      * add test case
      
      * add first pd_op to kernel dialect
      
      * lower pd op to kernel dialect
      
      * update
      
      * update
      
      * remove useless code
      
      * add attribute print test
      
      * fix bug
      
      * update
      
      * update
      
      * update
      
      * update
      
      * polish code
      
      * fix bug
      
      * polish code and add python test
      
      * add test
      
      * fix test error
      
      * relax constraint when inserting get_parameter
      
      * add env flag
      
      * fix bug
      
      * dygraph2static support new ir
      
      * fix bug
      
      * revert test env
      
      * change cc_test_old to cc_test
      
      * update
      
      * fix build_static bug
      
      * update test
      
      * fix type test error
      
      * update cmake
      
      * disable test in windows
      
      * fix inference compile
      
      * fix program translator error
      
      * only run on cpu, not support gpu yet
      
      * fix conflict
      
      * polish code
      
      * fix bug
      
      * add feed with place op
      
      * update
      
      * remove useless unittest
      
      * update mkldnn
      
      * update
      
      * update
      
      * align mkldnn version
      
      * new ir support builtin slice op
      
      * fix bug
      
      * fix phi kernel adaptor bug
      
      * add enable static
      
      * add enable_static
      
      * remove useless test case
      
      * change feed list to single variable
      
      * update
      
      * add feed with place and shadow output op
      
      * fix bug
      
      * remove useless code
      
      * support gpu
      
      * fix bug
      
      * fix bug
      
      * remove template
      
      * add more data type
      
      * fix compile bug
      
      * update
      
      * remove useless code
      
      * revert dygraph2st test
      
      * remove useless code
      
      * revert op
      
      * fix bug
      
      * remove instance norm
      
      * fix concat grad bug
      
      * revert code
      
      ---------
      Co-authored-by: kangguangli <kangguangli@hotmail.com>
    • 【inplace api】batch add inplace api paddle.log_, paddle.i0_, paddle.nn.functional.leaky_relu_... (#55576) · 58a03d41
      GGBond8488 committed
      * batch add inplace api
      
      * add inplace test
      
      * add activation inplace
      
      * fix test
      
      * remove atan2 ge, gt, le, lt, nq
      
      * remove atan2 ge, gt, le, lt, nq
      
      * fix windows ci error
      
      * rerun ci
      
      * fix typo
      
      * fix bugs
      
      ---------
      Co-authored-by: zhangrui34 <v_zhangrui34@baidu.com>
    • cbbd940e
  6. 26 Jul, 2023 5 commits
  7. 25 Jul, 2023 6 commits
    • 8db3ff1f
    • Bugfix, fast layer norm, OOB (#55639) · 017a6164
      Jeng Bai-Cheng committed
      * Fix LayerNormForward perf issue
      
      * Bugfix, fast_layer_norm OOB
      
      * apply pre-commit
      
      ---------
      Co-authored-by: Shijie Wang <jaywan@nvidia.com>
    • c737f0ae
    • fix bugs in rnn op (#55656) · 0cd422b6
      Lucas committed
    • fix div 0 bug (#55644) · 690ffe81
      wanghuancoder committed
    • [NewIR]new ir dygraph to static support gpu (#55620) · fb9bec5d
      hong committed
      * add kernel dialect
      
      * change DenseTensorTypeStorage to DenseTensorType
      
      * add test case
      
      * add first pd_op to kernel dialect
      
      * lower pd op to kernel dialect
      
      * update
      
      * update
      
      * remove useless code
      
      * add attribute print test
      
      * fix bug
      
      * update
      
      * update
      
      * update
      
      * update
      
      * polish code
      
      * fix bug
      
      * polish code and add python test
      
      * add test
      
      * fix test error
      
      * relax constraint when inserting get_parameter
      
      * add env flag
      
      * fix bug
      
      * dygraph2static support new ir
      
      * fix bug
      
      * revert test env
      
      * change cc_test_old to cc_test
      
      * update
      
      * fix build_static bug
      
      * update test
      
      * fix type test error
      
      * update cmake
      
      * disable test in windows
      
      * fix inference compile
      
      * fix program translator error
      
      * only run on cpu, not support gpu yet
      
      * fix conflict
      
      * polish code
      
      * fix bug
      
      * add feed with place op
      
      * update
      
      * remove useless unittest
      
      * update mkldnn
      
      * update
      
      * update
      
      * align mkldnn version
      
      * new ir support builtin slice op
      
      * fix bug
      
      * fix phi kernel adaptor bug
      
      * add enable static
      
      * add enable_static
      
      * remove useless test case
      
      * change feed list to single variable
      
      * update
      
      * add feed with place and shadow output op
      
      * fix bug
      
      * remove useless code
      
      * support gpu
      
      * fix bug
      
      * fix bug
      
      * remove template
      
      * add more data type
      
      * fix compile bug
      
      * update
      
      * remove useless code
      
      * revert dygraph2st test
      
      * remove useless code
      
      * revert op
      
      * fix bug
      
      * new ir dygraph2static support gpu
      
      * remove useless code
      
      * code polish
      
      * add const
      
      * revert code and remove useless code
      
      * revert code
      
      * revert legacy op yaml
      
      * remove useless code
      
      * delete std::move
      
      ---------
      Co-authored-by: kangguangli <kangguangli@hotmail.com>