1. 17 4月, 2023 1 次提交
    • W
      [CINN] fix concat (#52341) · 31fc763a
      wangzhen38 提交于
      * [CINN] fix concat&pow
      
      * update concat
      
      * composite_backward_api
      
      * for ci
      
      * for ci
      
      * update test & fix opmaker
      31fc763a
  2. 04 4月, 2023 1 次提交
    • R
      Improve new executor static build (#51149) · 5bac67d4
      Ruibiao Chen 提交于
      * Improve new executor static build
      
      * Skip GC for static build
      
      * Skip infershape for static build
      
      * Handle read_op
      
      * Add fused_attention to OpsWithFluidKernelNeedMoveToPhi
      
      * Fix argsort typos
      
      * Add sequence_pool to OpsWithFluidKernelNeedMoveToPhi
      
      * Fix skip share lod errors
      
      * Fix errors for adam
      
      * Fix errors for eigvals, memcpy and fake_quantize
      
      * Add static_build.cc
      
      * Add black list
      
      * Fix CI errors
      
      * Fix CI errors
      
      * Fix CI errors
      
      * Fix TensorArray
      
      * Fix TensorArray
      
      * Add update_loss_scaling to OpsNeedSetOutputDtypeWhenRegisterPhiKernel
      
      * Fix copy
      
      * Fix errors
      
      * Fix momentum
      
      * Skip mkldnn
      
      * Fix CI errors
      
      * Fix c_sync_calc_stream_op
      
      * Fix CINN
      
      * Fix while op
      
      * All CI pass, disable FLAGS to merge code, enable it after more tests in future
      
      * Add UTs
      
      * Fix typos
      
      * Fix typos
      
      * Add mkldnn UT
      
      * Remove mkldnn test
      
      * Fix typos
      
      * Fix dist test
      
      * Fix typos
      
      * Fix CI errors
      
      * Fix CI errors
      
      * Add UTs
      
      * Fix typos
      
      * Fix typos
      
      * Add sparse tests
      
      * ToComplexType -> ToComplex
      
      * Add test_matmul_op_static_build to disable_win_inference_test
      5bac67d4
  3. 07 3月, 2023 1 次提交
  4. 02 3月, 2023 1 次提交
    • W
      Add concat grad cinn (#50972) · a4689c90
      wangzhen38 提交于
      * [cinn] concat_grad
      
      * [cinn] concat_grad
      
      * [cinn] concat_grad build success
      
      * [Add PGLBOX] fix unnitest
      
      * [Add PGLBOX] fix unnitest
      
      * [Add PGLBOX] fix codestyle
      
      * [cinn] update by comments
      
      * [cinn] update by comment
      
      * [cinn] add axis check
      a4689c90
  5. 04 1月, 2023 1 次提交
    • H
      [Unify KernelKey] change OpKernelType->KernelKey (#49138) · 4383494f
      HongyuJia 提交于
      * execute use kernel_key first
      
      * change OpKernelType->KernelKey
      
      * fix py3 compile error, remove redundant header files
      
      * fix build_strategy_test
      
      * fix DataType::RAW
      
      * fix custom_type test: operator_test.cc
      
      * fix transform place
      
      * fix backends_are_same_class
      
      * try fix place TransDataDevice
      
      * support all KernelKey
      
      * fix TransformData
      
      * fix place_are_same_class
      
      * fix merge
      
      * fix test_params_no_grad
      
      * fix specific place of GetExpectedKernelType
      
      * fix specific place of GetExpectedKernelType
      
      * fix GetKernelTypeForVar
      
      * fix dtype error
      
      * fix fetch_v2
      
      * change GetKernelTypeForVar
      
      * fix interpreter
      
      * fix typo error
      
      * polish codes
      
      * polish codes
      
      * polish codes
      
      * fix conflict
      4383494f
  6. 07 12月, 2022 1 次提交
  7. 13 10月, 2022 1 次提交
    • H
      [Kernel Selection] Remove hard code of PADDLE_WITH_MKLDNN (#46606) · ef1c8759
      HongyuJia 提交于
      * remove PADDLE_WITH_MKLDNN, test white_list=abs
      
      * fix unique_ptr
      
      * fix op.Type()
      
      * remove TODO in kernel_dispatch.h
      
      * remove IndicateVarDataType function, update white_list
      
      * remove mkldnn hard code
      
      * add comments
      
      * fix ==
      
      * update mkldnn_op_list
      
      * delete hard code of OPs
      
      * update mkldnn_op_list
      
      * update mkldnn_op_list, remove interp
      
      * add error check for ExecutionContext
      
      * update mkldnn_op_list, remove transpose2_grad
      
      * remove interpolate mkldnn
      
      * remove fill_constant mkldnn
      
      * opt HasAttr in DygraphExecutionContext
      
      * deprecated commit, test mkldnn_white_list
      
      * deprecated commit, test mkldnn_white_list
      
      * deprecated commit, test mkldnn_black_list
      
      * update mkldnn_op_list, add assert error op
      
      * solve cudnn related op
      
      * fix error
      
      * add mkldnn fallback in phi_utils.cc
      
      * remove mkldnn fallback in phi_utils.cc
      
      * opt code implementation
      
      * polish Copyright License
      ef1c8759
  8. 28 9月, 2022 1 次提交
    • C
      Remove the declaration of using Tensor in framework/tensor.h (#46432) · e12a905e
      Chen Weihang 提交于
      * remove needless using tensor
      
      * remove needless using tensor
      
      * resolve conflict
      
      * replace tensor using
      
      * fix format error
      
      * revert needless changing
      
      * fix rocm and npu compile error
      
      * fix cinn compile error
      
      * fix format error
      
      * fix mkldnn format error
      
      * fix mkldnn format error
      
      * fix cinn compile error
      
      * fix cinn compile error
      
      * fix cinn compile error
      
      * resolve conflict
      e12a905e
  9. 02 9月, 2022 1 次提交
    • Z
      Clear extra attributes of some Op in OpMaker (#45613) · 2a149741
      zyfncg 提交于
      * remove extra attr of abs in opmaker
      
      * remove extra attrs of some op in opmaker
      
      * remove is_test of conv
      
      * fix attr getting of interpretercore
      
      * fix inplace_abn
      
      * fix bug
      
      * fix bug of create_op
      
      * refine code format
      2a149741
  10. 17 8月, 2022 1 次提交
  11. 26 6月, 2022 1 次提交
  12. 05 6月, 2022 1 次提交
  13. 03 4月, 2022 1 次提交
    • C
      [Phi]Concat grad (#41112) · 3f57ef7a
      chentianyu03 提交于
      * add concat_grad kernel
      
      * fix error
      
      * remove comment code
      
      * fix outs nullptr error
      
      * change to phi header
      
      * add concat_grad declare for standalone_executor_test
      3f57ef7a
  14. 06 3月, 2022 1 次提交
  15. 01 3月, 2022 1 次提交
  16. 20 2月, 2022 1 次提交
  17. 19 2月, 2022 1 次提交
    • A
      [Pten]Unify paddle/pten::framework::ddim into pten::ddim (#39614) · 2fe04264
      Aurelius84 提交于
      * Unify paddle/pten::framework::ddim into pten::ddim
      
      * fix paddle namespace
      
      * compile sucessfully
      
      * fix npu src file
      
      * fix conflict
      
      * fix conflict
      
      * fix tensorrt compiler error
      
      * fix conflict
      
      * fix conflict
      
      * fix tesst file conflict
      
      * fix conflict
      
      * fix mlu file conflict
      
      * fix mlu file conflict
      
      * fix cinn header file conflict
      
      * fix conflict
      
      * fix conflict
      
      * fix conflict
      
      * fix conflict
      2fe04264
  18. 15 2月, 2022 1 次提交
    • A
      [PTen]Migrate proto::VarType outside of Pten (#39411) · 7e7e9404
      Aurelius84 提交于
      * #1 migrate dist-related type()-> dtype()
      
      * move datatype function from pten -> fluid/framework
      
      * change type() in imperative into convert(dtype())
      
      * modify xx_tensor->type into xx_tensor->dtype
      
      * change the set_type interface and the caller
      
      * modify xx_tensor.type into xx_tensor.dtype
      
      * fix mutable_data(place, dtype())
      
      * change caller of mutable_data in pten and distributed
      
      * change the caller of mutable_data in fluid/framework
      
      * change the caller of mutable_data in imperative directory
      
      * mutable_data: inference
      
      * update the call of mutable_data
      
      * transfer MakePenScalarArray MakePtenScalar ResetHolderWithType
      
      * pass the compile. the next step is remove VarType in Pten
      
      * fix all and remove VarType from pten. success in linux. Next task is other platform
      
      * fix conflict with develop
      
      * fix compiled error
      
      * Fix reset conversion
      
      * fix conflict
      
      * fix compiled problem
      
      * fix typo
      
      * Fix << in tensor_utils.cc
      
      * fix type->dtype
      
      * fix unittest
      
      * fix tensor init constructor
      
      * fix DataTypeSize for BFloat16
      
      * fix code style
      
      * fix npu compiled error
      
      * fix npu
      
      * compile npu sucessfully
      
      * fix conflict
      
      * fix conflict
      Co-authored-by: Nxiongkun <xiongkun03@baidu.com>
      7e7e9404
  19. 28 1月, 2022 1 次提交
  20. 27 1月, 2022 1 次提交
  21. 21 1月, 2022 1 次提交
  22. 05 10月, 2021 1 次提交
    • J
      Added concat BF16/FP32 BWD OneDNN kernel (#35889) · dc4d5719
      jakpiase 提交于
      * tmp
      
      * added concat BF16/FP32 BWD oneDNN kernel
      
      * minor change
      
      * minor change
      
      * fix for CI
      
      * added formatting
      
      * Reverted deleting static keyword
      
      * added reviewers suggestions
      
      * reverted deleting concat bf16 test file
      
      * fixed concat tests
      dc4d5719
  23. 18 9月, 2021 1 次提交
    • F
      Add FFT related operators and APIs (#35665) · 11518a43
      Feiyu Chan 提交于
      * 1. add interface for fft;
      2. add data type predicate;
      3. fix paddle.roll.
      
      * add fft c2c cufft kernel
      
      * implement argument checking & op calling parts for fft_c2c and fftn_c2c
      
      * add operator and opmaker definitions
      
      * only register float and double for cpu.
      
      * add common code for implementing FFT, add pocketfft as a dependency
      
      * add fft c2c cufft kernel function
      
      * fix bugs in python interface
      
      * add support for c2r, r2c operators, op makers, kernels and kernel functors.
      
      * test and fix bugs
      
      * 1. fft_c2c function: add support for onesided=False;
      2. add complex<float>, complex<double> support for concat and flip.
      
      * 1. fft: fix python api bugs;
      2. shape_op: add support for complex data types.
      
      * fft c2c cufft kernel done with complie and link
      
      * fix shape_op, add mkl placeholder
      
      * remove mkl
      
      * complete fft c2c in gpu
      
      * 1. implement mkl-based fft, FFTC2CFunctor and common function exec_fft;
      2. change the design, add input and output typename as template parameter for all FFTFunctors, update pocketfft-based implementation.
      
      * complete fft c2c on gpu in ND
      
      * complete fft c2c on gpu in ND
      
      * complete fft c2c backward in ND
      
      * fix MKL-based implementation
      
      * Add frame op and CPU/GPU kernels.
      
      * Add frame op forward unittest.
      
      * Add frame op forward unittest.
      
      * Remove axis parameter in FrameFunctor.
      
      * Add frame op grad CPU/GPU kernels and unittest.
      
      * Add frame op grad CPU/GPU kernels and unittest.
      
      * Update doc string.
      
      * Update after review and remove librosa requirement in unittest.
      
      * Update grad kernel.
      
      * add fft_c2r op
      
      * Remove data allocation in TransCompute function.
      
      * add fft r2c onesided with cpu(pocketfft/mkl) and gpu
      
      * last fft c2r functor
      
      * fix C2R and R2C for cufft, becase the direction is not an option in these cases.
      
      * add fft r2c onesided with cpu(pocketfft/mkl) and gpu
      
      * fix bugs in python APIs
      
      * fix fft_c2r grad kernal
      
      * fix bugs in python APIs
      
      * add cuda fft c2r grad kernal functor
      
      * clean code
      
      * fix fft_c2r python API
      
      * fill fft r2c result with conjugate symmetry (#19)
      
      fill fft r2c result with conjugate symmetry
      
      * add placeholder for unittests (#24)
      
      * simple parameterize test function by auto generate test case from parm list (#25)
      
      * miscellaneous fixes for python APIs (#26)
      
      * add placeholder for unittests
      
      * resize fft inputs before computation is n or s is provided.
      
      * add complex kernels for pad and pad_grad
      
      * simplify argument checking.
      
      * add type promotion
      
      * add int to float or complex promotion
      
      * fix output data type for static mode
      
      * fix fft's input dtype dispatch, import fft to paddle
      
      * fix typos in axes checking (#27)
      
      * fix typos in axes checking
      
      * fix argument checking (#28)
      
      * fix argument checking
      
      * Add C2R Python layer normal and abnormal use cases (#29)
      
      * documents and single case
      
      * test c2r case
      
      * New C2R Python layer normal and exception use cases
      
      * complete rfft,rfft2,rfftn,ihfft,ihfft2,ihfftn unittest and doc string (#30)
      
      * Documentation of the common interfaces of c2r and c2c (#31)
      
      * Documentation of the common interfaces of c2r and c2c
      
      * clean c++ code  (#32)
      
      * clean code
      
      * Add numpy-based implementation of spectral ops (#33)
      
      * add numpy reference implementation of spectral ops
      
      * Add fft_c2r numpy based implementation for unittest. (#34)
      
      * add fft_c2r numpy implementation
      
      * Add deframe op and stft/istft api. (#23)
      
      * Add frame api
      
      * Add deframe op and kernels.
      
      * Add stft and istft apis.
      
      * Add deframe api. Update stft and istft apis.
      
      * Fix bug in frame_from_librosa function when input dims >= 3
      
      * Rename deframe to overlap_add.
      
      * Update istft.
      
      * Update after code review.
      
      * Add overlap_add op and stft/istft api unittest (#35)
      
      * Add overlap_add op unittest.
      
      * Register complex kernels of squeeze/unsquuze op.
      
      * Add stft/istft api unittest.
      
      * Add unittest for fft helper functions (#36)
      
      * add unittests for fft helper functions. add complex kernel for roll op.
      
      * complete static graph unittest for all public api (#37)
      
      * Unittest of op with FFT C2C, C2R and r2c added (#38)
      
      * documents and single case
      
      * test c2r case
      
      * New C2R Python layer normal and exception use cases
      
      * Documentation of the common interfaces of c2r and c2c
      
      * Unittest of op with FFT C2C, C2R and r2c added
      Co-authored-by: lijiaqi0612's avatarlijiaqi <lijiaqi0612@163.com>
      
      * add fft related options to CMakeLists.txt
      
      * fix typos and clean code (#39)
      
      * fix invisible character in mkl branch and fix error in error message
      
      * clean code: remove docstring from unittest for signal.py.
      
      * always convert numpy array to paddle.Tensor to avoid comparing numpy dtype with paddle dtype. (#40)
      
      * always convert numpy array to paddle.Tensor to avoid comparing numpy dtype with paddle dtype.
      
      * fix CI Errors: numpy dtype comparison, thrust when cuda is not available (#41)
      
      1. always convert numpy array to paddle.Tensor to avoid comparing numpy dtype with paddle dtype.
      2. promote floating point tensor to complex tensor ior fft_c2c and fft_c2r;
      3. fix unittest to catch UnImplementedError and RuntimeError;
      4. fix compile error by avoid using thrust when cuda is not available.
      5.  fix sample code, use paddle.fft instead of paddle.tensor.fft
      
      * remove inclusion of thrust, add __all__ list for fft (#42)
      
      * Add api doc and update unittest. (#43)
      
      * Add doc strings.
      * Update overlap_add op unittest
      
      * fix MKL-based FFT implementation (#44)
      
      * fix MKL-based FFT implementation, MKL CDFT's FORWARD DOMAIN is always REAL for R2C and C2R
      
      * remove code for debug (#45)
      
      * use dynload for cufft (#46)
      
      * use std::ptrdiff_t as datatype of stride (instead of int64_t) to avoid argument mismatch on some platforms.
      
      * add complex support for fill_zeros_like
      
      * use dynload for cufft
      
      * Update doc and unittest. (#47)
      
      * Add doc of frame op and overlap_add op.
      
      * Update unittest.
      
      * use dynload for cufft (#48)
      
      1. use dynload for cufft
      2. fix unittest;
      3. temporarily disable Rocm.
      
      * fix conflicts and merge upstream (#49)
      
      fix conflicts and merge upstream
      
      * fix compile error: only link dyload_cuda when cuda is available (#50)
      
      * fix compile error: only link dyload_cuda when cuda is available
      
      * fix dynload for cufft on windows (#51)
      
      1. fix dynload for cufft on windows;
      2. fix unittests.
      
      * add NOMINMAX to compile on windows (#52)
      
       add NOMINMAX to compile on windows
      
      * explicitly specify capture mode for lambdas (#55)
      
       explicitly specify capture mode for lambdas
      
      * fix fft sample (#53)
      
      * fix fft sample
      
      * update scipy and numpy version for unittests of fft (#56)
      
      update scipy and numpy version for unittests of fft
      
      * Add static graph unittests of frame and overlap_add api. (#57)
      
      * Remove cache of cuFFT & Disable ONEMKL (#59)
      
      1. replace numpy.fft with scipy.fft as numpy<1.20 not support ortho norm
      2. remove cache of cufft plans;
      3. enhance error checking.
      4. default WITH_ONEMKL to OFF
      Co-authored-by: Njeff41404 <jeff41404@gmail.com>
      Co-authored-by: Nroot <root@bjyz-sys-gpu-kongming9.bjyz.baidu.com>
      Co-authored-by: NKP <109694228@qq.com>
      Co-authored-by: lijiaqi0612's avatarlijiaqi <lijiaqi0612@163.com>
      Co-authored-by: NXiaoxu Chen <chenxx_id@163.com>
      Co-authored-by: Nlijiaqi0612 <33169170+lijiaqi0612@users.noreply.github.com>
      11518a43
  24. 03 9月, 2021 1 次提交
    • Z
      add AsExtra to concat op (#35380) · 42d36504
      zmx 提交于
      add AsExtra to the  following attribute of concat op:
        1. use_mkldnn
        2. use_quantizer
        3. mkldnn_data_type
      42d36504
  25. 21 6月, 2021 1 次提交
  26. 18 5月, 2021 1 次提交
  27. 25 1月, 2021 1 次提交
  28. 16 12月, 2020 1 次提交
  29. 27 11月, 2020 1 次提交
  30. 22 9月, 2020 1 次提交
  31. 08 8月, 2020 1 次提交
  32. 04 8月, 2020 1 次提交
  33. 27 7月, 2020 1 次提交
  34. 28 5月, 2020 1 次提交
  35. 09 4月, 2020 1 次提交
  36. 25 3月, 2020 1 次提交
  37. 09 3月, 2020 1 次提交
  38. 29 11月, 2019 1 次提交
    • H
      Add dygraph execution context (#20157) · ac854670
      hong 提交于
      * add_dygraph_execution_context
      
      * add dygraph infershape context and execution context; test=develop
      
      * fix imperative bug; test=develop
      
      * remove inputs outputs interface from execution context,
      because it have same function with inputNames;
      test=develop
      
      * remove tracer_test ctest; test=develop
      
      * fix split op bug; test=develop
      
      * fix unitests bug; test=develop
      
      * fix distribute test bug; test=develop
      
      * fix ngraph compile bug; test=develop
      
      * fix grad maker bug; test=develop
      
      * fix load op bugs; test=develop
      
      * fix operator.cc construct bug; test=develop
      
      * remove useless name find in operator; test=develop
      
      * add tracer_test; test=develop
      
      * fix concat, split bug; test=develop
      
      * remove tracer_test unitest; test=develop
      
      * fix attribute check bug; test=develop
      
      * add test code to fix converage; test=develop
      
      * remove useless code, change check backward input in engin; test=develop
      
      * unlock var type infer shape;test=develop
      
      * add ShareAllLoD api; test=develop
      
      * add dygraph infershape context unitest; test=develop
      
      * remove increase and decrease lod in dygraph; test=develop
      
      * addd override; test=develop
      
      * fix increase descrease lod; test=develop
      
      * fix paddle_enforce; test=develop
      
      * disable lod op dygraph check; test=develop
      
      * fix paddle enforce error; test=develop
      
      * add comment for op_registry and OperatorBase; test=develop
      
      * optimize the comment of op_registry; test=develop
      
      * fix format of comment; test=develop
      
      * fix format of comment; test=develop
      
      * optimize the format of comment; test=develop
      
      * optimize the format of the comment; test=develop
      
      * optimize comment of op_registry; test=develop
      ac854670
  39. 31 10月, 2019 1 次提交
    • H
      GradMaker for dygraph (#19706) · 8c4573a3
      hong 提交于
      * refactor dygraph,test=develop
      
      * fix failed unittest,test=develop
      
      * polish code,test=develop
      
      * check windows ci error,test=develop
      try to fix windows ci error by np.allclose,test=develop
      
      * polish vlog and profiler, test=develop
      
      * try to fix preceding ops order,test=develop
      
      * test transformer in windows ci, test=develop
      
      * use python c-api to speed up tracer.trace,test=develop
      
      * test=develop, fix docker with paddle nccl problem
      
      * test=develop, add ut for debug string and gradient_accumulator
      
      * test=develop, add tests for layer/gradient_accumulator/prepared_op
      
      * test=develop, fix complie error for test_prepared_op
      
      * test=develop, add more ut for dygraph
      
      * test=develop, create API.spec for dygraph api change
      
      * optimize grad maker; test=develop
      
      * optimize grad maker
      
      * test
      
      * grad make optim; test=develop
      
      * fix unittest bugs; test=develop
      
      * add dygraph grad op maker and split_op
      
      * grad op maker refactor; test=develop
      
      * add dygraph grad maker; test=develop
      
      * fix op deformable_conv_v1_op bug; test=develop
      
      * fix deformable_conv prroi pool bugs;
      
      * fix new op grad op maker bug; test=develop
      
      * fix split by ref bug; test=develop
      
      * fix dygraph auto prune bug; test=develop
      
      * fix test_trace bug; test=develop
      
      * fix fused emb seq pool bug; test=develop
      
      * remove useless code in op_desc file; test=develop
      
      * remove useless code, StrVarBaseNode; test=develop
      
      * fix review issues; test=develop
      
      * fix rank_loss grad maker; test=develop
      
      * remove flag in VarBase; test=develop
      
      * fix distributed_notify_op compile bug ; test=develop
      
      * fix reshape op double grad; test=develop
      
      * fix expand as op; test=develop
      
      * add impertive type_defs.h for demo_train; test=develop
      
      * fix inference lib cmake; test=develop
      
      * fix inference lib; test=develop
      
      * fix infernce_lib; test=develop
      
      * fix inference cmake; test=develop
      
      * fix inference lib; test=develop
      
      * fix inference lib; test=develop
      
      * remove condition dygraph grad maker, modify local name; test=develop
      
      * fix split grad maker bug; test=develop
      
      * fix pyramid_op bug; test=develop
      
      * change travis time out limit; test=develop
      
      * restore travis; test=develop
      
      * change timeout limit; test=develop
      8c4573a3
  40. 29 10月, 2019 1 次提交
    • L
      support Tensor for split and concat, support -1 in num_or_sections, add check... · 6802539a
      liym27 提交于
      support Tensor for split and concat, support -1 in num_or_sections, add check num_or_sections (#20780)
      
      * improve split and concat op:
      1. support Tensor for argument 'dim' in split op.
      2. support Tensor for argument 'axis' in concat op.
      test=develop
      
      * redefine function GetDataFromTensor and set unknown output shape to - 1.
      test=develop
      
      * add check: Attr(sections) match Input(X). test=develop
      
      * support Tensor for attr(sections) and attr(sections) can contain -1.
      add check for attr(sections).
      test=develop
      
      * modify error message for concat and call Resize only when necessary. test=develop
      6802539a