1. 26 Sep, 2022 1 commit
  2. 05 Sep, 2022 1 commit
    • move elementwise_sub and elementwise_sub_grad XPU kernel to PHI,test=kunlun (#45623) · fb42ba70
      risemeup1 committed
      * move elementwise_sub and elementwise_sub_grad XPU kernel to PHI,test=kunlun
      
      * modify code style,test=kunlun
      
      * modify elementwise_subtract_grad_kernel.cc,test=kunlun
      
      * modify elementwise_subtract_kernel.cc,test=kunlun
      
      * modify elementwise_subtract_grad_kernel.cc,test=kunlun
      
      * modify elementwise_kernel.cc and elementwise_subtract_kernel.cc,test=kunlun
      
      * modify codestyle,test=kunlun
      
      * modify elementwise_kernel.cc,test=kunlun
  3. 31 Aug, 2022 3 commits
  4. 29 Aug, 2022 1 commit
  5. 25 Aug, 2022 1 commit
  6. 01 Aug, 2022 1 commit
    • unify gpu context (#44740) · 86763023
      Leo Chen committed
      * remove cudaDeviceContext
      
      * remove more template
      
      * fix rocm compile
      
      * remove alias name CUDADeviceContext
      
      * fix compile
      
      * fix tests
      
      * revert changes
  7. 27 Jul, 2022 1 commit
  8. 12 Jul, 2022 1 commit
  9. 11 Jul, 2022 1 commit
  10. 06 Jul, 2022 1 commit
    • Performance fix for recommender model (#43803) · 48abaec6
      jakpiase committed
      * fix for binary kernels
      
      * fixed performance for elementwise, reduce and concat
      
      * added comment
      
      * CI fix
      
      * CI fix
      
      * added formatting
      
      * reverted one file
      
      * Revert "reverted one file"
      
      This reverts commit 54725e1c62318d3a18913821200e973816751019.
      
      * Revert "added formatting"
      
      This reverts commit b9795dd253d755a329376d7ab0542860aa7815c6.
      
      * added enforcing oneDNN BF16 reduce kernel
      
      * fix for eltwise and reenabled reshape kernels
      
      * fix for binary handler
      
      * added formatting
      
      * referted changes for flatten,squeeze and reshape ops
  11. 02 Jul, 2022 1 commit
  12. 26 Jun, 2022 1 commit
  13. 21 Jun, 2022 2 commits
    • Generalize conv+activation fuse pass (#43382) · 347e4b2e
      Sławomir Siwek committed
      * consolidate conv act passes
      
      * generalize conv_activation
      
      * integrate conv+act tests
      
      * code style format
      
      * whitespaces
      
      * remove timeout from old tests
      
      * implement comments from review
      
      * restore ut
      
      * whitespace
      
      * code style
      
      * transpose
      
      * fixes after review
      
      * method for gettin act
      
      * Change Paddle_enforce error type
      
      * code format
      
      * add missing opcompats
    • [MLU] add mlu kernel for elementwise_max_grad (#43608) · f586110d
      cambriconhsq committed
      * [MLU] add mlu kernel for elementwise_max_grad
      
      * [MLU] modify mlu kernel elementwise_min_grad impl
  14. 17 Jun, 2022 1 commit
  15. 14 Jun, 2022 1 commit
  16. 05 Jun, 2022 1 commit
  17. 04 Jun, 2022 1 commit
  18. 31 May, 2022 1 commit
  19. 25 May, 2022 1 commit
  20. 24 May, 2022 1 commit
  21. 12 May, 2022 1 commit
  22. 10 May, 2022 1 commit
  23. 09 May, 2022 1 commit
  24. 06 May, 2022 2 commits
    • bind elementwise_mod_op_xpu (#42175) · 6ea2f049
      enzodechine committed
      * bind elementwise_mod_op_xpu *test=kunlun
      
      * add more supported dtypes and UTs *test=kunlun
      
      * fix datatype error
      
      * add op to in xpu1_op_list
      
      * Update Mac cmake version >=3.15 (#41456)
      
      * Update Mac cmake version >=3.15
      
      * notest;read test1
      
      notest;read test2
      
      notest;read test3
      
      * fix inference link error
      
      * fix inference link error
      
      * fix windows link error
      
      * fix cmake_policy
      
      * fix build big size
      
      * Add paddle::variant and replace paddle::any (#42139)
      
      * add variant and replace any
      
      * split attribute
      
      * disable unittest failed in eager CI in temporary (#42101)
      
      * test=py3-eager
      
      * test=py3-eager
      
      * test=py3-eager
      
      * combine graph_table and feature_table in graph_engine (#42134)
      
      * extract sub-graph
      
      * graph-engine merging
      
      * fix
      
      * fix
      
      * fix heter-ps config
      
      * test performance
      
      * test performance
      
      * test performance
      
      * test
      
      * test
      
      * update bfs
      
      * change cmake
      
      * test
      
      * test gpu speed
      
      * gpu_graph_engine optimization
      
      * add dsm sample method
      
      * add graph_neighbor_sample_v2
      
      * Add graph_neighbor_sample_v2
      
      * fix for loop
      
      * add cpu sample interface
      
      * fix kernel judgement
      
      * add ssd layer to graph_engine
      
      * fix allocation
      
      * fix syntax error
      
      * fix syntax error
      
      * fix pscore class
      
      * fix
      
      * change index settings
      
      * recover test
      
      * recover test
      
      * fix spelling
      
      * recover
      
      * fix
      
      * move cudamemcpy after cuda stream sync
      
      * fix linking problem
      
      * remove comment
      
      * add cpu test
      
      * test
      
      * add cpu test
      
      * change comment
      
      * combine feature table and graph table
      
      * test
      
      * test
      
      * pybind
      
      * test
      
      * test
      
      * test
      
      * test
      
      * pybind
      
      * pybind
      
      * fix cmake
      
      * pybind
      
      * fix
      
      * fix
      
      * add pybind
      
      * add pybind
      Co-authored-by: DesmonDay <908660116@qq.com>
      
      * [CustomDevice] add eager mode support (#42034)
      
      * fix FlattenContiguousRangeOpConverter out dim error (#42087)
      
      * fix FlattenContiguousRangeOpConverter out dim error
      
      * update code
      
      * fix python3.10 compile bug on windows (#42140)
      
      * Optimize dygraph GetExpectedKernelType perf (#42154)
      
      * opt dygraph scheduling
      
      * revert part impl
      
      * fix incorrect usages of std::move and other compile errors (#41045)
      
      * fix bug of std::move and others
      
      * fix an compile error in debug mode
      
      * fix wrong copy assignment operator
      Signed-off-by: tiancaishaonvjituizi <452565578@qq.com>
      
      * reformat
      Signed-off-by: tiancaishaonvjituizi <452565578@qq.com>
      
      * reformat
      Signed-off-by: tiancaishaonvjituizi <452565578@qq.com>
      
      * fix ArrayRef constructor following llvm
      
      * fix format
      
      * fix conflict with master
      
      * fix variant compile error (#42203)
      
      * [Eager] Support numpy.ndarry in CastNumpy2Scalar (#42136)
      
      * [Eager] Remove redundancy code, fix fp16 case (#42169)
      
      * [Eager] Support div(scalar) in eager mode (#42148)
      
      * [Eager] Support div scalar in eager mode
      
      * Updated and remove debug logs
      
      * Remove list, use 'or' directly
      
      * Remove useless statement
      
      * fix recompute (#42128)
      
      * fix recompute
      
      * modify return
      
      * add LICENSE in wheel dist-info package (#42187)
      
      * replace any by variant in infermeta (#42181)
      
      * [PaddlePaddle Hackathon 2] No. 24: Add the nn.ChannelShuffle network-building API to Paddle (#40743)
      
      * Add infermeta for ChannelShuffle
      
      * Create channel_shuffle_grad_kernel.h
      
      * Create channel_shuffle_kernel.h
      
      * Create channel_shuffle_sig.cc
      
      * Create channel_shuffle_op.cc
      
      Description of the ChannelShuffle operator
      
      * Create channel_shuffle_kernel_impl.h
      
      Implementation of the ChannelShuffle kernel function
      
      * Create channel_shuffle_grad_kernel_impl.h
      
      Implementation of the ChannelShuffle backward kernel function
      
      * Add kernel register of channel shuffle and grad
      
      Register the kernel functions for ChannelShuffle and its backward pass
      
      * add nn.functional.channel_shuffle
      
      * add nn.ChannelShuffle
      
      * Create test_channel_shuffle.py
      
      * Update example of ChannelShuffle in vision.py
      
      * Update test_channel_shuffle.py
      
      * Change where the channel_shuffle kernel function is implemented
      
      * Fix code format
      
      * Remove extra spaces
      
      * Improve error checking for channel_shuffle
      
      * Update unary.cc
      
      * Update channel_shuffle_op.cc
      
      * Update test_channel_shuffle.py
      
      * Update unary.cc
      
      * add channel_shuffle
      
      * Update test_channel_shuffle.py
      
      * Update vision.py
      
      * Adjust code format
      
      * Update channel_shuffle_sig.cc
      
      * Update the ChannelShuffle documentation
      
      * Update the channel_shuffle documentation
      
      * remove ChannelShuffleOpArgumentMapping
      
      * add ChannelShuffleGradInferMeta
      
      * Update channel_shuffle_op.cc
      
      * Move the kernel functions for channel_shuffle and its gradient
      
      * Do not reset default stream for StreamSafeCUDAAllocator (#42149)
      
      * remove redundant computation in Categorical.probs (#42114)
      
      * Downloading data for test_analyzer_vit_ocr (#42041)
      
      * Change server URL
      
      * update config
      
      * add test to parallel UT rule
      
      * add checksum to ensure files are downloaded
      
      * change downloading target
      
      * reuse existing variable
      
      * change target directory
      
      * fix en docs of some Apis (gradients, scope_guard, cuda_places, name_scope, device_guard, load_program_state, scale, ParamAttr and WeightNormParamAttr) (#41604)
      
      * Update scope_guard; test=document_fix
      
      * gradients; test=document_fix
      
      * gradients; test=document_fix
      
      * name_scope; test=document_fix
      
      * cpu_places; test=document_fix
      
      * WeightNormParamAttr; test=document_fix
      
      * cuda_places; test=document_fix
      
      * load_program_state; test=document_fix
      
      * device_guard; test=document_fix
      
      * device_guard; test=document_fix
      
      * ParamAttr; test=document_fix
      
      * scale; test=document_fix
      
      * scale; test=document_fix
      
      * update code example;test=document_fix
      Co-authored-by: Chen Long <1300851984@qq.com>
      
      * fix datatype error
      
      add op to in xpu1_op_list
      
      *test=kunlun
      
      * fix elementwise_mod op path error  *test=kunlun
      
      * fix elementwise_mod UT error  *test=kunlun
      
      * fix datatype error
      
      add op to in xpu1_op_list
      
      *test=kunlun
      
      add op to in xpu1_op_list
      
      fix elementwise_mod op path error  *test=kunlun
      
      fix elementwise_mod UT error  *test=kunlun
      Co-authored-by: tianshuo78520a <707759223@qq.com>
      Co-authored-by: Chen Weihang <chenweihang@baidu.com>
      Co-authored-by: pangyoki <pangyoki@126.com>
      Co-authored-by: seemingwang <seemingwang@users.noreply.github.com>
      Co-authored-by: DesmonDay <908660116@qq.com>
      Co-authored-by: ronnywang <524019753@qq.com>
      Co-authored-by: baoachun <962571062@qq.com>
      Co-authored-by: Zhou Wei <1183042833@qq.com>
      Co-authored-by: tiancaishaonvjituizi <452565578@qq.com>
      Co-authored-by: Weilong Wu <veyron_wu@163.com>
      Co-authored-by: Roc <30228238+sljlp@users.noreply.github.com>
      Co-authored-by: BrilliantYuKaimin <91609464+BrilliantYuKaimin@users.noreply.github.com>
      Co-authored-by: Ruibiao Chen <chenruibiao@baidu.com>
      Co-authored-by: Feiyu Chan <chenfeiyu@baidu.com>
      Co-authored-by: Sławomir Siwek <slawomir.siwek@intel.com>
      Co-authored-by: Yilingyelu <103369238+Yilingyelu@users.noreply.github.com>
      Co-authored-by: Chen Long <1300851984@qq.com>
    • [NPU] support model PPO (#42484) · d73eb38c
      Aganlengzi committed
  25. 21 Apr, 2022 1 commit
  26. 19 Apr, 2022 1 commit
  27. 18 Apr, 2022 1 commit
  28. 14 Apr, 2022 1 commit
  29. 30 Mar, 2022 2 commits
  30. 29 Mar, 2022 2 commits
  31. 25 Mar, 2022 1 commit
  32. 18 Mar, 2022 1 commit
  33. 17 Mar, 2022 1 commit
  34. 16 Mar, 2022 1 commit