Commits · 814e5ab4e837a1c1270f67c6ca491da68b281a11 · PaddlePaddle / Paddle

05 Oct, 2021 1 commit

Added concat BF16/FP32 BWD OneDNN kernel (#35889) · dc4d5719

Committed by jakpiase on Oct 05, 2021

* tmp

* added concat BF16/FP32 BWD oneDNN kernel

* minor change

* minor change

* fix for CI

* added formatting

* Reverted deleting static keyword

* added reviewers suggestions

* reverted deleting concat bf16 test file

* fixed concat tests

dc4d5719

18 Sep, 2021 1 commit

Committed by Feiyu Chan on Sep 18, 2021

* 1. add interface for fft;
2. add data type predicate;
3. fix paddle.roll.

* add fft c2c cufft kernel

* implement argument checking & op calling parts for fft_c2c and fftn_c2c

* add operator and opmaker definitions

* only register float and double for cpu.

* add common code for implementing FFT, add pocketfft as a dependency

* add fft c2c cufft kernel function

* fix bugs in python interface

* add support for c2r, r2c operators, op makers, kernels and kernel functors.

* test and fix bugs

* 1. fft_c2c function: add support for onesided=False;
2. add complex<float>, complex<double> support for concat and flip.

* 1. fft: fix python api bugs;
2. shape_op: add support for complex data types.

* fft c2c cufft kernel done with compile and link

* fix shape_op, add mkl placeholder

* remove mkl

* complete fft c2c in gpu

* 1. implement mkl-based fft, FFTC2CFunctor and common function exec_fft;
2. change the design, add input and output typename as template parameter for all FFTFunctors, update pocketfft-based implementation.

* complete fft c2c on gpu in ND

* complete fft c2c on gpu in ND

* complete fft c2c backward in ND

* fix MKL-based implementation

* Add frame op and CPU/GPU kernels.

* Add frame op forward unittest.

* Add frame op forward unittest.

* Remove axis parameter in FrameFunctor.

* Add frame op grad CPU/GPU kernels and unittest.

* Add frame op grad CPU/GPU kernels and unittest.

* Update doc string.

* Update after review and remove librosa requirement in unittest.

* Update grad kernel.

* add fft_c2r op

* Remove data allocation in TransCompute function.

* add fft r2c onesided with cpu(pocketfft/mkl) and gpu

* last fft c2r functor

* fix C2R and R2C for cufft, because the direction is not an option in these cases.

* add fft r2c onesided with cpu(pocketfft/mkl) and gpu

* fix bugs in python APIs

* fix fft_c2r grad kernel

* fix bugs in python APIs

* add cuda fft c2r grad kernel functor

* clean code

* fix fft_c2r python API

* fill fft r2c result with conjugate symmetry (#19)

fill fft r2c result with conjugate symmetry

* add placeholder for unittests (#24)

* simply parameterize test functions by auto-generating test cases from a param list (#25)

* miscellaneous fixes for python APIs (#26)

* add placeholder for unittests

* resize fft inputs before computation if n or s is provided.

* add complex kernels for pad and pad_grad

* simplify argument checking.

* add type promotion

* add int to float or complex promotion

* fix output data type for static mode

* fix fft's input dtype dispatch, import fft to paddle

* fix typos in axes checking (#27)

* fix typos in axes checking

* fix argument checking (#28)

* fix argument checking

* Add C2R Python layer normal and abnormal use cases (#29)

* documents and single case

* test c2r case

* New C2R Python layer normal and exception use cases

* complete rfft,rfft2,rfftn,ihfft,ihfft2,ihfftn unittest and doc string (#30)

* Documentation of the common interfaces of c2r and c2c (#31)

* Documentation of the common interfaces of c2r and c2c

* clean c++ code  (#32)

* clean code

* Add numpy-based implementation of spectral ops (#33)

* add numpy reference implementation of spectral ops

* Add fft_c2r numpy based implementation for unittest. (#34)

* add fft_c2r numpy implementation

* Add deframe op and stft/istft api. (#23)

* Add frame api

* Add deframe op and kernels.

* Add stft and istft apis.

* Add deframe api. Update stft and istft apis.

* Fix bug in frame_from_librosa function when input dims >= 3

* Rename deframe to overlap_add.

* Update istft.

* Update after code review.

* Add overlap_add op and stft/istft api unittest (#35)

* Add overlap_add op unittest.

* Register complex kernels of squeeze/unsqueeze op.

* Add stft/istft api unittest.

* Add unittest for fft helper functions (#36)

* add unittests for fft helper functions. add complex kernel for roll op.

* complete static graph unittest for all public api (#37)

* Unittest of op with FFT C2C, C2R and r2c added (#38)

* documents and single case

* test c2r case

* New C2R Python layer normal and exception use cases

* Documentation of the common interfaces of c2r and c2c

* Unittest of op with FFT C2C, C2R and r2c added
Co-authored-by: lijiaqi <lijiaqi0612@163.com>

* add fft related options to CMakeLists.txt

* fix typos and clean code (#39)

* fix invisible character in mkl branch and fix error in error message

* clean code: remove docstring from unittest for signal.py.

* always convert numpy array to paddle.Tensor to avoid comparing numpy dtype with paddle dtype. (#40)

* always convert numpy array to paddle.Tensor to avoid comparing numpy dtype with paddle dtype.

* fix CI Errors: numpy dtype comparison, thrust when cuda is not available (#41)

1. always convert numpy array to paddle.Tensor to avoid comparing numpy dtype with paddle dtype;
2. promote floating point tensor to complex tensor for fft_c2c and fft_c2r;
3. fix unittest to catch UnImplementedError and RuntimeError;
4. fix compile error by avoiding thrust when cuda is not available;
5. fix sample code, use paddle.fft instead of paddle.tensor.fft

* remove inclusion of thrust, add __all__ list for fft (#42)

* Add api doc and update unittest. (#43)

* Add doc strings.
* Update overlap_add op unittest

* fix MKL-based FFT implementation (#44)

* fix MKL-based FFT implementation, MKL CDFT's FORWARD DOMAIN is always REAL for R2C and C2R

* remove code for debug (#45)

* use dynload for cufft (#46)

* use std::ptrdiff_t as datatype of stride (instead of int64_t) to avoid argument mismatch on some platforms.

* add complex support for fill_zeros_like

* use dynload for cufft

* Update doc and unittest. (#47)

* Add doc of frame op and overlap_add op.

* Update unittest.

* use dynload for cufft (#48)

1. use dynload for cufft
2. fix unittest;
3. temporarily disable Rocm.

* fix conflicts and merge upstream (#49)

fix conflicts and merge upstream

* fix compile error: only link dyload_cuda when cuda is available (#50)

* fix compile error: only link dyload_cuda when cuda is available

* fix dynload for cufft on windows (#51)

1. fix dynload for cufft on windows;
2. fix unittests.

* add NOMINMAX to compile on windows (#52)

 add NOMINMAX to compile on windows

* explicitly specify capture mode for lambdas (#55)

 explicitly specify capture mode for lambdas

* fix fft sample (#53)

* fix fft sample

* update scipy and numpy version for unittests of fft (#56)

update scipy and numpy version for unittests of fft

* Add static graph unittests of frame and overlap_add api. (#57)

* Remove cache of cuFFT & Disable ONEMKL (#59)

1. replace numpy.fft with scipy.fft as numpy<1.20 does not support ortho norm
2. remove cache of cufft plans;
3. enhance error checking.
4. default WITH_ONEMKL to OFF
Co-authored-by: jeff41404 <jeff41404@gmail.com>
Co-authored-by: root <root@bjyz-sys-gpu-kongming9.bjyz.baidu.com>
Co-authored-by: KP <109694228@qq.com>
Co-authored-by: lijiaqi <lijiaqi0612@163.com>
Co-authored-by: Xiaoxu Chen <chenxx_id@163.com>
Co-authored-by: lijiaqi0612 <33169170+lijiaqi0612@users.noreply.github.com>

11518a43

03 Sep, 2021 1 commit

add AsExtra to concat op (#35380) · 42d36504

Committed by zmx on Sep 03, 2021

add AsExtra to the following attributes of concat op:
  1. use_mkldnn
  2. use_quantizer
  3. mkldnn_data_type

42d36504

21 Jun, 2021 1 commit

fix the bug that concat op can't support uint8 (#33666) · 0011450c

Committed by 李季 on Jun 21, 2021

0011450c

18 May, 2021 1 commit

add uint8 for concat (#32850) · 53580bb4

Committed by liuyuhui on May 18, 2021

53580bb4

25 Jan, 2021 1 commit

More precise mkldnn kernel rules in GetExpectedKernelType (#29840) · 5bf25d1e

Committed by arlesniak on Jan 25, 2021

* More precise mkldnn kernel choice in GetExpectedKernelType

* Fixes after review

* Refresh develop for CI

* CI experiment

* get back from CI exper

5bf25d1e

16 Dec, 2020 1 commit

add pad and concat double grad (#29549) · cc387159

Committed by ceci3 on Dec 16, 2020

* add constant pad double grad

cc387159

27 Nov, 2020 1 commit

Fixes mkldnn dygraph learning rate scheduler crashes (#28988) · bc902044

Committed by arlesniak on Nov 27, 2020

bc902044

22 Sep, 2020 1 commit

Enhance Op's Error Message (#27455) · a0452475

Committed by 123malin on Sep 22, 2020

* test=develop, update error message

a0452475

08 Aug, 2020 1 commit

Change use_quantizer attribute name and data type (#25838) · 734cf1c3

Committed by joanna.wozna.intel on Aug 08, 2020

* Change use_quantizer attribute name and data type

* Fix problem with setting attribute

* Add changes due to review

* Small change in function

* Restore use_quantizer attr for compatibility

734cf1c3

04 Aug, 2020 1 commit

Add support for tuple of concat Op test=develop (#25800) · ff717d51

Committed by wangchaochaohu on Aug 04, 2020

ff717d51

27 Jul, 2020 1 commit

refine the concat Op for API 2.0 test=develop (#25307) · 1e4ab728

Committed by wangchaochaohu on Jul 27, 2020

1e4ab728

28 May, 2020 1 commit

rename inplace/no_need_buffer inferer, part4, test=develop (#24781) · c0911fdd

Committed by Leo Chen on May 28, 2020

c0911fdd

09 Apr, 2020 1 commit

Op (concat) error message enhancement (#23523) · 2c4b57e9

Committed by GaoWei8 on Apr 09, 2020

2c4b57e9

25 Mar, 2020 1 commit

rename no_need_buffer_vars_macro, test=develop (#23159) · b8886bf1

Committed by Zeng Jinle on Mar 25, 2020

b8886bf1

09 Mar, 2020 1 commit

Imperative tracer refactoring (#22457) · d33c4343

Committed by Zeng Jinle on Mar 09, 2020

* refine grad maker, test=develop

* refactor tracer stage 1, test=develop

* merge develop to solve conflict a third time, test=develop

d33c4343

29 Nov, 2019 1 commit

Add dygraph execution context (#20157) · ac854670

Committed by hong on Nov 29, 2019

* add_dygraph_execution_context

* add dygraph infershape context and execution context; test=develop

* fix imperative bug; test=develop

* remove inputs outputs interface from execution context,
because it has the same function as inputNames;
test=develop

* remove tracer_test ctest; test=develop

* fix split op bug; test=develop

* fix unittests bug; test=develop

* fix distribute test bug; test=develop

* fix ngraph compile bug; test=develop

* fix grad maker bug; test=develop

* fix load op bugs; test=develop

* fix operator.cc construct bug; test=develop

* remove useless name find in operator; test=develop

* add tracer_test; test=develop

* fix concat, split bug; test=develop

* remove tracer_test unitest; test=develop

* fix attribute check bug; test=develop

* add test code to fix coverage; test=develop

* remove useless code, change check backward input in engine; test=develop

* unlock var type infer shape;test=develop

* add ShareAllLoD api; test=develop

* add dygraph infershape context unitest; test=develop

* remove increase and decrease lod in dygraph; test=develop

* add override; test=develop

* fix increase decrease lod; test=develop

* fix paddle_enforce; test=develop

* disable lod op dygraph check; test=develop

* fix paddle enforce error; test=develop

* add comment for op_registry and OperatorBase; test=develop

* optimize the comment of op_registry; test=develop

* fix format of comment; test=develop

* fix format of comment; test=develop

* optimize the format of comment; test=develop

* optimize the format of the comment; test=develop

* optimize comment of op_registry; test=develop

ac854670

31 Oct, 2019 1 commit

GradMaker for dygraph (#19706) · 8c4573a3

Committed by hong on Oct 31, 2019

* refactor dygraph,test=develop

* fix failed unittest,test=develop

* polish code,test=develop

* check windows ci error,test=develop
try to fix windows ci error by np.allclose,test=develop

* polish vlog and profiler, test=develop

* try to fix preceding ops order,test=develop

* test transformer in windows ci, test=develop

* use python c-api to speed up tracer.trace,test=develop

* test=develop, fix docker with paddle nccl problem

* test=develop, add ut for debug string and gradient_accumulator

* test=develop, add tests for layer/gradient_accumulator/prepared_op

* test=develop, fix compile error for test_prepared_op

* test=develop, add more ut for dygraph

* test=develop, create API.spec for dygraph api change

* optimize grad maker; test=develop

* optimize grad maker

* test

* grad make optim; test=develop

* fix unittest bugs; test=develop

* add dygraph grad op maker and split_op

* grad op maker refactor; test=develop

* add dygraph grad maker; test=develop

* fix op deformable_conv_v1_op bug; test=develop

* fix deformable_conv prroi pool bugs;

* fix new op grad op maker bug; test=develop

* fix split by ref bug; test=develop

* fix dygraph auto prune bug; test=develop

* fix test_trace bug; test=develop

* fix fused emb seq pool bug; test=develop

* remove useless code in op_desc file; test=develop

* remove useless code, StrVarBaseNode; test=develop

* fix review issues; test=develop

* fix rank_loss grad maker; test=develop

* remove flag in VarBase; test=develop

* fix distributed_notify_op compile bug ; test=develop

* fix reshape op double grad; test=develop

* fix expand as op; test=develop

* add imperative type_defs.h for demo_train; test=develop

* fix inference lib cmake; test=develop

* fix inference lib; test=develop

* fix inference_lib; test=develop

* fix inference cmake; test=develop

* fix inference lib; test=develop

* fix inference lib; test=develop

* remove condition dygraph grad maker, modify local name; test=develop

* fix split grad maker bug; test=develop

* fix pyramid_op bug; test=develop

* change travis time out limit; test=develop

* restore travis; test=develop

* change timeout limit; test=develop

8c4573a3

29 Oct, 2019 1 commit

support Tensor for split and concat, support -1 in num_or_sections, add check... · 6802539a

Committed by liym27 on Oct 29, 2019

support Tensor for split and concat, support -1 in num_or_sections, add check num_or_sections (#20780)

* improve split and concat op:
1. support Tensor for argument 'dim' in split op.
2. support Tensor for argument 'axis' in concat op.
test=develop

* redefine function GetDataFromTensor and set unknown output shape to -1.
test=develop

* add check: Attr(sections) match Input(X). test=develop

* support Tensor for attr(sections) and attr(sections) can contain -1.
add check for attr(sections).
test=develop

* modify error message for concat and call Resize only when necessary. test=develop

6802539a

28 Oct, 2019 1 commit

Replace risky GetInputType method with secure IndicateVarDataType interface (#20668) · 26cc1fe5

Committed by Chen Weihang on Oct 28, 2019

* replace part of the old implementation, test=develop

* restore concat op, test=develop

* update all ops implementation & delete GetDataTypeOfVar func, test=develop

26cc1fe5

11 Oct, 2019 1 commit

add input type and dtype check, enhance shape error message for concat_op (#20101) · 3997743a

Committed by zhupengyang on Oct 11, 2019

* add input type and dtype check, enhance shape error message for concat_op
test=develop

* enhance shape check
test=develop

* improve coverage

test=develop

3997743a

02 Aug, 2019 1 commit

fix concat check info typo (#18975) · b62c4f9b

Committed by hutuxian on Aug 02, 2019

b62c4f9b

13 Jun, 2019 1 commit

concat op support negative axis (#18045) · 566bf2ec

Committed by tensor-tang on Jun 13, 2019

test=develop

566bf2ec

10 Jun, 2019 1 commit

refine GetExpectedKernelType in concat op, test=develop (#17934) · aab4d12c

Committed by jerrywgz on Jun 10, 2019

aab4d12c

27 May, 2019 1 commit

add Concat quantization (#17448) · 96845d21

Committed by Sylwester Fraczek on May 27, 2019

* add Concat quantization
add unit test for quantizing concat
fix for wrong value when the input is not in map of calculated scales
add use_quantizer to concat_op.cc
add scale_algo rules for concat

test=develop

* missing fix for multiple inputs quantize-squash

* wojtuss review fix: adding comment

test=develop

96845d21

23 May, 2019 1 commit

Fix GetExpectedKernelType in Concat op (#17459) · c1aae8b8

Committed by jerrywgz on May 23, 2019

* fix concat op vartype check, test=develop

c1aae8b8

08 May, 2019 1 commit

Fix concat shape check (#17247) · c3195de5

Committed by Hongyu Liu on May 08, 2019

* fix shape_check; test=develop

* fix format; test=develop

* fix format; test=develop

* fix ddim bug; test=develop

* fix c++ format; test=develop

* change function name; test=develop

c3195de5

15 Apr, 2019 1 commit

fix concat; test=develop · 64bf752d

Committed by phlrain on Apr 15, 2019

64bf752d

11 Apr, 2019 1 commit

fix concat shape; test=develop · dc6e8146

Committed by phlrain on Apr 11, 2019

dc6e8146

26 Mar, 2019 1 commit

fix env variable setting bug · 78fb3a62

Committed by sneaxiy on Mar 26, 2019

test=develop

78fb3a62

25 Mar, 2019 1 commit

try to fix ci error · f8ed2c22

Committed by sneaxiy on Mar 24, 2019

test=develop

f8ed2c22

21 Mar, 2019 2 commits

fix concat shape check; test=develop · 8274d9d7

Committed by phlrain on Mar 21, 2019

8274d9d7

fix concat shape check; test=develop · 249546bf

Committed by phlrain on Mar 21, 2019

249546bf

19 Mar, 2019 1 commit

add allocator flags · 22715487

Committed by zhhsplendid on Mar 19, 2019

test=develop

22715487

18 Mar, 2019 2 commits

fix concat; test=develop · dcba2e72

Committed by phlrain on Mar 18, 2019

dcba2e72

fix concat; test=develop · a7fe3b50

Committed by phlrain on Mar 18, 2019

a7fe3b50

04 Dec, 2018 3 commits

Include MKL-DNN header to concat op only when flag is set · 6fdbb365

Committed by Michal Gallus on Dec 04, 2018

test=develop

6fdbb365

Fix style @ concat integration and tests · f2a88042

Committed by Michal Gallus on Dec 04, 2018

test=develop

f2a88042

Implement MKL-DNN Concat · 208f9125

Committed by Michal Gallus on Nov 30, 2018

test=develop

208f9125

26 Nov, 2018 1 commit

Revert the changes of VLOG · 53433d7f

Committed by minqiyang on Nov 26, 2018

test=develop

53433d7f

PaddlePaddle / Paddle synced successfully about 1 year ago
