1. 16 Feb, 2022 1 commit
  2. 15 Feb, 2022 9 commits
    • [PluggableDevice] Add custom runtime support (#38740) · 3e7825f3
      ronnywang committed
      * [CustomRuntime] Add DeviceManager
      
      * [CustomRuntime] Add DeviceInterface
      
      * [CustomRuntime] Add Stream, Event, DeviceGuard, CallbackManager
      
      * [CustomRuntime] Add plug-in device
      
      * [CustomRuntime] Memory module support PluggableDevice
      
      * [CustomRuntime] Add WITH_PLUGGABLE_DEVICE cmake option
      
      * update
      
      * [API] update API doc based on comments, test=develop
      Co-authored-by: qili93 <qili93@qq.com>
    • Added hapi BF16 lenet script (#39298) · 70714d1b
      arlesniak committed
      * hapi lenet BF16
      
      * ops list updated
      
      * year typo fix
      
      * tests updated for CI
    • pool2d_coonvert_ut (#39545) · cf8a5573
      feng_shuai committed
    • [Paddle-TRT] Replace GeLU plugin with TensorRT built-in layer for TensorRT 7.0. (#38399) · a3689d8c
      Leo Chen committed
      * Replace GeLU plugin with TRT built-in layers for approximate GeLU
      
      * Add TensorRT built-in layer for nonapproximate GeLU
    • 536a55fa
    • [Pten] Support SelectedRows in C++ API (#39497) · 5bb3b668
      zyfncg committed
      * add data_transform in pten api
      
      * support GetKernelTypeForVar
      
      * fix compile problem of bfloat16
      
      * add scale_sr in api
      
      * support select_row in C++ api
      
      * merge code
    • delete mish_convert_ut skip (#39432) · 8cedcd3e
      feng_shuai committed
    • new way of test case, 2nd, *test=kunlun (#39478) · 4745234f
      z8hanghuan committed
      * new way of test case, 2nd, *test=kunlun
      
      * new way of test case, 2nd, *test=kunlun
      
      * new way of test case, 2nd, *test=kunlun
    • [PTen]Migrate proto::VarType outside of Pten (#39411) · 7e7e9404
      Aurelius84 committed
      * #1 migrate dist-related type()-> dtype()
      
      * move datatype function from pten -> fluid/framework
      
      * change type() in imperative into convert(dtype())
      
      * modify xx_tensor->type into xx_tensor->dtype
      
      * change the set_type interface and the caller
      
      * modify xx_tensor.type into xx_tensor.dtype
      
      * fix mutable_data(place, dtype())
      
      * change caller of mutable_data in pten and distributed
      
      * change the caller of mutable_data in fluid/framework
      
      * change the caller of mutable_data in imperative directory
      
      * mutable_data: inference
      
      * update the call of mutable_data
      
      * transfer MakePenScalarArray MakePtenScalar ResetHolderWithType
      
      * pass the compile. the next step is remove VarType in Pten
      
      * fix all and remove VarType from pten. success in linux. Next task is other platform
      
      * fix conflict with develop
      
      * fix compiled error
      
      * Fix reset conversion
      
      * fix conflict
      
      * fix compiled problem
      
      * fix typo
      
      * Fix << in tensor_utils.cc
      
      * fix type->dtype
      
      * fix unittest
      
      * fix tensor init constructor
      
      * fix DataTypeSize for BFloat16
      
      * fix code style
      
      * fix npu compiled error
      
      * fix npu
      
      * compile npu successfully
      
      * fix conflict
      
      * fix conflict
      Co-authored-by: xiongkun <xiongkun03@baidu.com>
  3. 14 Feb, 2022 8 commits
    • Add Inplace addto pass and unittest. (#39433) · 52af0a60
      hlygit66666 committed
      * add fuse_relu_depthwise_conv_pass unittest
      
      * fix atol and rtol
      
      * fix according to review
      
      * Update test_dist_fuse_relu_depthwise_conv_pass.py
      
      * add inplace_addto pass and unittest
    • [UT] mish op, conv+mish, fc+mish fuse passes (#39340) · 02938b3d
      Sławomir Siwek committed
      * mish unit tests
      
      * code format
      
      * remove unused imports
      
      * code format
      
      * remove hard-coded shape values
      
      * remove timeouts
      
      * remove timeouts v2
      
      * restore timeouts
    • Unified PS: heter PS two-stage unit tests pass (#39468) · 765a2ada
      ziyoujiyi committed
      * delete gloo connect retry
      
      * the_one_ps dirs reconstruct
      
      * .
      
      * .
      
      * create the_one_ps dirs
      
      * create the_one_ps dirs
      
      * create the_one_ps dirs
      
      * create the_one_ps dirs
      
      * create the_one_ps dirs
      
      * create the_one_ps dirs
      
      * the one ps dirs modify
      
      * the one ps dirs modify
      
      * the one ps dirs modify
      
      * the one ps dirs modify
      
      * refactor ps optimize
      
      * refactor ps optimize
      
      * refactor ps optimize
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * refactor theoneps
      
      * the_one_ps
      
      * add ps pass unittest
      
      * add ps pass unittest
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * add cpu_async_ps_mode test
      
      * add cpu_async_ps_mode test
      
      * add cpu_async_ps_mode test
      
      * ps unittest ready
      
      * ps unittest ready
      
      * solve dist_pass init conflict
      
      * solve import CommContext error
      
      * unittest ok
      
      * implement AllocateFrom
      
      * solve setup.py.in conflict
      
      * solve conflict
      
      * solve conflict
      
      * solve conflict
      
      * .
      
      * .
      
      * cpu-async-ps minimize test ok & gpu minimize test ok
      
      * add heter 2stage unittest
      
      * add heter 2stage unittest
      
      * add heter 2stage unittest
      Co-authored-by: zkh2016 <zhangkaihuo@baidu.com>
    • new may of test cases, *test=kunlun (#39444) · e07420b9
      z8hanghuan committed
      * new may of test cases, *test=kunlun
      
      * new may of test cases, *test=kunlun
      
      * new may of test cases, *test=kunlun
    • fix gather_nd, *test=kunlun (#39283) · d12c3636
      TTerror committed
    • Fixed get_tensor method for EagerTensor (#39414) · 97229944
      Zhanlue Yang committed
      * Enabled Eager OpTest #1
      
      * Enabled Eager OpTest #1
      
      * Fixed get_tensor method for EagerTensor
    • Adjusted python-level trace_op to accommodate final state Eager Dygraph (#39319) · ec8a0c1d
      Zhanlue Yang committed
      * Removed debug info
      
      * Added automatic code generation for final state Eager Dygraph
      
      * Modified backward yaml
      
      * Added EagerUtils helper functions for final state CodeGen
      
      * Adjusted CMakeFiles to support compilation for final state auto generated codes
      
      * Added python-c code generation for final state Eager Dygraph
      
      * Fixed minor issue
      
      * Fixed yaml.load() method failure
      
      * Fixed minor issues
      
      * Refactored Python-C Attributes Parsing Functions
      
      * Fixed minor issue with Python-C AddFunctions
      
      * Adjusted python-level trace_op to accommodate final state Eager Dygraph
      
      * Added Logs for final state Eager Dygraph
      
      * Fixed merge issues
      
      * Fixed minor issue
  4. 13 Feb, 2022 1 commit
  5. 11 Feb, 2022 9 commits
    • Add TensorRT inspector into Paddle-TRT (#38362) · 69793a27
      Leo Chen committed
    • Added shape (U)INT8/BF16/FP32 oneDNN kernel (#36033) · 52bbaae9
      jakpiase committed
      * added shape oneDNN kernel
      
      * removed unnecessary import from test
      
      * added skipping tests for GPU
      
      * refactoring
      
      * refactored shape kernel
      
      * added tests in new framework
      
      * removed one line
      
      * minor change
      
      * added newline at EOF
      
      * added formatting
      
      * added attributes as extra
    • [MLU] add pool2d pytest (#39454) · 2db25f0d
      fwenguang committed
    • uniform_random op for mlu (#39450) · 02f06708
      joeqiao12 committed
    • [bf16] add bf16 kernel: transpose & unbind (#39457) · 1e6047f1
      zhangbo9674 committed
      * add transpose unbind
      
      * add unittest
      
      * refine transpose unittest
    • [MLU]support c_gen_cncl_id_op run on MLU device (#39336) · 89aa8b1a
      zn committed
      Co-authored-by: zhangna <zhangna@cambricon.com>
    • fix prelu trt convert (#39389) · c86765ed
      JingZhuangzhuang committed
    • Unified PS development - Python (#39431) · 22c67d14
      ziyoujiyi committed
      * delete gloo connect retry
      
      * the_one_ps dirs reconstruct
      
      * .
      
      * .
      
      * create the_one_ps dirs
      
      * create the_one_ps dirs
      
      * create the_one_ps dirs
      
      * create the_one_ps dirs
      
      * create the_one_ps dirs
      
      * create the_one_ps dirs
      
      * the one ps dirs modify
      
      * the one ps dirs modify
      
      * the one ps dirs modify
      
      * the one ps dirs modify
      
      * refactor ps optimize
      
      * refactor ps optimize
      
      * refactor ps optimize
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * refactor theoneps
      
      * the_one_ps
      
      * add ps pass unittest
      
      * add ps pass unittest
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * add cpu_async_ps_mode test
      
      * add cpu_async_ps_mode test
      
      * add cpu_async_ps_mode test
      
      * ps unittest ready
      
      * ps unittest ready
      
      * solve dist_pass init conflict
      
      * solve import CommContext error
      
      * unittest ok
      
      * implement AllocateFrom
      
      * solve setup.py.in conflict
      
      * solve conflict
      
      * solve conflict
      
      * solve conflict
      
      * .
      
      * .
      
      * cpu-async-ps minimize test ok & gpu minimize test ok
      Co-authored-by: zkh2016 <zhangkaihuo@baidu.com>
    • [Pten] Auto-Generate InterMeta register (#39436) · 7d6096ff
      zyfncg committed
      * fix code conflict
      
      * generate inter_meta register
      
      * clear cache
      
      * just try
      
      * add sign c++ api
      
      * polish some code
  6. 10 Feb, 2022 11 commits
  7. 09 Feb, 2022 1 commit