Commits · c2a4a50eb29e1c8dab90c0a5403a700bfd1f4cc7 · BaiXuePrincess / Paddle

18 Jan, 2021 1 commit
- fix cache key for inplaced elementwise ops (#30404) (#30478) · c2a4a50e
  Committed by lidanqing on Jan 18, 2021
```
Co-authored-by: Wojciech Uss <wojciech.uss@intel.com>
```
  c2a4a50e
13 Jan, 2021 1 commit

git cherry-pick the commits of operator version registries, test=release/2.0 (#30292) · 5eab1a38

Committed by 石晓伟 on Jan 13, 2021

* Register op version for grid_sampler, test=op_version (#29916)

* add op version for fake_quant and fake_dequant ops, test=op_version (#29923)

* Register op version for print, test=op_version (#29945)

* add gru op_register_version; test=op_version; (#29931)

* Register op version for coalesce_tensor. (#29940)

* register op version for conv2d_transpose, conv3d_transpose and depthwise_conv2d_transpose, test=op_version (#29937)

* add op_register_version for allclose op; test=op_version (#29968)

* register ModifyAttr for instance_norm, test=op_version (#29938)

* add op_version for flip op [test=op_version] (#30019)

* add the op version check for the elementwise ops, test=op_version (#30010)

* add support for the op version check for matmul, test=op_version (#30011)

* Revert "register ModifyAttr for instance_norm, test=op_version (#29938)"

* add REGISTER_OP_VERSION for generate_proposals, roi_align, roi_pool test=op_version (#30034)

* Fix rank_attention op_version, test=op_version (#30006)

* fix rank_attention, test=op_version

* Register op version for linspace,test=op_version (#30025)

* fix op_register_version for compare ops, test=op_version (#30007)
Co-authored-by: zhoushunjie <zhoushunjie@baidu.com>

* register ModifyAttr for instance_norm, test=op_version (#30065)

* register instance norm, test=op_version

* add trace op_register_version and fix version bug; test=op_version (#30000)

* fix a bug in op_version_registry, test=develop, test=op_version (#29994)

* Add version checking, test=op_version (#30129)

* fix a bug in gaussian_random_op version, test=release/2.0
Co-authored-by: LielinJiang <50691816+LielinJiang@users.noreply.github.com>
Co-authored-by: cc <52520497+juncaipeng@users.noreply.github.com>
Co-authored-by: Qi Li <qili93@qq.com>
Co-authored-by: Jack Zhou <zhoushunjie@baidu.com>
Co-authored-by: Guo Sheng <whucsgs@163.com>
Co-authored-by: wangxinxin08 <69842442+wangxinxin08@users.noreply.github.com>
Co-authored-by: wawltor <fangzeyang0904@hotmail.com>
Co-authored-by: FlyingQianMM <245467267@qq.com>
Co-authored-by: ceci3 <ceci3@users.noreply.github.com>
Co-authored-by: hutuxian <hutuxian2011@sina.cn>
Co-authored-by: chalsliu <45041955+chalsliu@users.noreply.github.com>
Co-authored-by: wangguanzhong <jerrywgz@126.com>
Co-authored-by: ShenLiang <shenliang03@baidu.com>
Co-authored-by: yinhaofeng <66763551+yinhaofeng@users.noreply.github.com>
Co-authored-by: channings <chenlingchi@baidu.com>
Co-authored-by: chentianyu03 <chentianyu03@baidu.com>
Co-authored-by: ruri <shipeng1108@163.com>

5eab1a38

12 Jan, 2021 2 commits

[cherry-pick]memory optimization for fuse pattern of elemwise_add + act (#30303) · b207b8a7

Committed by wangchaochaohu on Jan 12, 2021

* reduce the memory occupied by the fused pattern of the elementwise_add Op and an activation Op (relu Op, for example) (#29885)

* register OPMaker and Infer Shape Check for fused_elementwise_add (#30259)

b207b8a7

[Cherry-pick] Complex grad for matmul, kron and type promotion (#30304) · 7346edc2

Committed by chentianyu03 on Jan 12, 2021

* complex gradient matmul (#29966)

* dot op support complex types

* matmul support complex types

* add test case

* matmul broadcast gradient support complex

* move conjFunctor to complex_functor.h

* change the kron gradient when complex types (#29995)

* type promotion for grad (#30177)

* type promotion for grad

* add type promotion for div op

7346edc2

11 Jan, 2021 2 commits

[cherry-pick]Elementwise add grad GPU kernel optimization (#30276) · e59524f8

Committed by wangchaochaohu on Jan 11, 2021

* elementwise_add_grad Op optimization (#29575)

* optimize for long width for elementwise (#29602)

* refine (#29622)

* delete the code for fp16 optimization because it is not faster than common template code (#29715)

* fix the shape selection for vectorization on cuda

* optimization for fp16 elementwise add (#29744)

* Fix the compiler error for half type (#29799)

* refine the compiler error for half2 operation (#29816)

* fix the compiler error with gcc4 and cuda9.0 (#29997)

e59524f8

add aarch64 and sunway kunlun lib (#30027) (#30237) · eacbd488

Committed by QingshuChen on Jan 11, 2021

* add aarch64 and sunway kunlun lib

* minor

* optimize elementwise_add for kunlun

* update kunlun dependence

* minor

* minor

eacbd488

07 Jan, 2021 1 commit

[cherry pick] Some optimizations of elementwise_add, gelu and dropout for AMP (#30152) · 07f68fad

Committed by Leo Chen on Jan 07, 2021

* Improve performance of elementwise_add grad op (#29187)

* pass stop_gradient for cast op

* improve performance of elementwise_add grad

* use tensor copy async

* dygraph branch

* fix dygraph branch

* add ut

* make gelu fp16 computing more robust (#29484)

* Add fast path for dropout when p == 0 (#29553)

* add fast path for p == 0 in dropout

* add ut

07f68fad
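
The "fast path for dropout when p == 0" item above can be illustrated with a minimal sketch. This is not Paddle's kernel: the function name `dropout`, its parameters, and the use of plain Python lists are assumptions made for illustration, and the scaling shown is ordinary inverted dropout.

```python
import random


def dropout(values, p, training=True, rng=None):
    """Inverted dropout over a flat list of floats (illustrative only)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    # Fast path: with p == 0 nothing is dropped and the scale factor is 1,
    # so the input can be passed through without sampling a mask.
    if not training or p == 0.0:
        return list(values)
    # With p == 1 every element is dropped; this also avoids dividing by zero.
    if p == 1.0:
        return [0.0 for _ in values]
    rng = rng or random.Random()
    scale = 1.0 / (1.0 - p)
    # Keep each element with probability 1 - p and rescale the survivors.
    return [v * scale if rng.random() >= p else 0.0 for v in values]
```

The point of the early return is that no random numbers are drawn and no mask is built when the result is known to equal the input.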

29 Dec, 2020 1 commit

[Cherry-pick] Complex network execute support (#29905) · 91ebc460

Committed by Chen Weihang on Dec 29, 2020

* [Complex] Add support for complex grad accumulated (#29889)

* add support for complex grad accumulated

* add unittest for coverage

* update test dtype

* remove useless blank line

* [Complex] Handle complex to real after type promotion (#29855)

* try to add fwd op input dtypes

* refactor base impl

* return tmp_ins after dygraph prepare data

* fix typo found in debug

* polish comment & add complex net test

* revert detail change

* fix unittest failed

* add complex kernel condition control

* fix xpu test failed & polish comment

* polish details by review comments

* Complex op test (#29753)

* delete inputs that do not need to be calculated in dygraph op_test

* delete inputs that do not need to be calculated in dygraph op_test

* change grad elementwise_mul for complex types (#29757)

* add conj op for complex types

* add conj for complex types

* add more test case

* add conj_op test

* modify conj api and impl

* add complex type for fill_constant_op xpu

* add setConstant for complex type

* remove complex conj test file

* user define grad for test_conj_op

* add test case for static mode of conj api

* modify conj doc

* change input args name to x

* remove useless codes

* conj support real types

* add conj test case for real number

* delete inputs that do not need to be calculated in dygraph op_test

* delete inputs that do not need to be calculated in dygraph op_test

* modify grad of mul for complex types

* fix the bug where the order of the input args' grads does not match

* change the grad of div when complex types (#29804)

* change the grad of div when complex types

* fix the bug where the order of the input args' grads does not match
Co-authored-by: chentianyu03 <chentianyu03@baidu.com>

91ebc460

08 Dec, 2020 1 commit

[2.0 rc1/cherrypick] cherry-pick kunlun PR:29234/29229/29293/29367/29280/29448 (#29466) · 6bfc5721

Committed by liuyuhui on Dec 08, 2020

* add deformable_conv op on xpu (#29234)

* rebase develop

* update deformable_conv op on xpu

* update deformable_conv op on xpu

* update kunlun conv2d/softmax/elementwise implementation (#29229)

* update conv2d & softmax to new xpu api
* test=kunlun

* remove useless comments
* test=kunlun

* remote softmax xpu op
* test=kunlun

* update kunlun softmax
* test=kunlun

* update xpu unittest
* test=kunlun

* fix elementwise_grad bug for kunlun
* test=kunlun

* support global pooling for kunlun (#29293)

* test=kunlun

* update reduce_sum op on xpu (#29367)

* update reduce_sum op on xpu

* update reduce_sum op on xpu

* support running on xpu

* fix expand/uniform_random && concat/transpose to new api on xpu (#29280)

* fix expand && concat/transpose to new api

* update uniform_random_op

* update xpu_header

* 1. fix elementwise ops' bug 2. fix softmax_with_cross_entropy_op 3. add biliner_interp_op (#29448)
Co-authored-by: root <root@bjhw-sys-rpm0223.bjhw.baidu.com>
Co-authored-by: 卖鱼的哲学 <tangzhiyi11@users.noreply.github.com>
Co-authored-by: QingshuChen <qingshu.chen714@gmail.com>
Co-authored-by: taixiurong <taixiurong@126.com>
Co-authored-by: root <root@bjhw-sys-rpm0223.bjhw.baidu.com>

6bfc5721

04 Dec, 2020 1 commit

Support type promote for basic math ops (quantum required) (#29265) (#29354) · 0e7539e7

Committed by Chen Weihang on Dec 04, 2020

* basic impl of type promote

* add comment & another testcase

* fix complex bugs & support python op promote type

* fix failed unittests & polish code

* add unittest for coverage

* change to only promote complex type

* polish code details

* polish several comments

0e7539e7

01 Dec, 2020 1 commit

add complex64 and complex128 type; add +-*/@ and slice operator for c… (#29199) · 8f45d142

Committed by chentianyu03 on Dec 01, 2020

* add complex64 and complex128 type; add +-*/@ and slice operator for complex types

* add test cases for complex elementwise, matmul and getitem unittest

* add test cases for complex types

* add test cases for complex matmul unittest

8f45d142

27 Nov, 2020 1 commit
- Fixes mkldnn dygraph learning rate scheduler crashes (#28988) · bc902044
  Committed by arlesniak on Nov 27, 2020
  
  bc902044
26 Nov, 2020 1 commit
- Fix ops doc for some ops · da71173b
  Committed by Noel on Nov 26, 2020
```
Fix ops doc for some ops 
```
  da71173b
25 Nov, 2020 2 commits
- add xpu elementwise ops (#29031) · a5aa4dc7
  Committed by taixiurong on Nov 25, 2020
  
  a5aa4dc7
- Update pow (#29000) · b04c78ef
  Committed by joejiong on Nov 25, 2020
```
Simple code clean up
```
  b04c78ef
20 Nov, 2020 1 commit
- Add bf16 matmul, fc, elementwise add and mul (#28729) · 8c0ea4bf
  Committed by joanna.wozna.intel on Nov 20, 2020
```
* Add bf16 matmul, fc, elementwise add and mul

* Correct unit test
```
  8c0ea4bf
19 Oct, 2020 1 commit
- Fix xpu error message (#28061) · 5f04875c
  Committed by Chengmo on Oct 19, 2020
```
* fix error message,test=kunlun

* fix, test=kunlun
```
  5f04875c
16 Oct, 2020 1 commit

Fix xpu enforce (#27978) · d330cf66

Committed by Jack Zhou on Oct 16, 2020

* test=kunlun;

Add elementwise XPU OP kernel for KUNLUN core, including (but still cannot process common broadcast):

    * elementwise_div op
    * elementwise_max op
    * elementwise_mul op (with grad op)
    * elementwise_sub op (with grad op)

* 0.05->0.01

* add xpu error message description;test=kunlun

d330cf66

14 Oct, 2020 1 commit
- Add elementwise XPU OP kernel for KUNLUN core, including (but still cannot process common broadcast · c791df09
  Committed by Jack Zhou on Oct 14, 2020
```
Add elementwise XPU OP kernel for KUNLUN core, including (but still cannot process common broadcast
```
  c791df09
27 Sep, 2020 1 commit

support elementwise add, activation, matmul on Baidu Kunlun (#27143) · 6b727e08

Committed by QingshuChen on Sep 27, 2020

* support elementwise add, activation, matmul on Baidu Kunlun
* test=kunlun

* minor
* test=kunlun

* reconstruct the xpu directory
* test=kunlun

* minor
* test=kunlun

* minor
* test=kunlun

* minor
* test=kunlun

* minor
* test=kunlun

* minor
* test=kunlun

6b727e08

24 Sep, 2020 1 commit

use iwyu clean include (#27267) · df43905f

Committed by wanghuancoder on Sep 24, 2020

* use iwyu clean include, test=develop, test=win

* compilation error, test=develop

* fix compilation error2, test=develop

* fix compilation error3, test=develop

* fix compilation error4, test=develop

* fix compilation error5, test=develop

* fix compilation error6, test=develop

* fix compilation error7, test=develop

* fix compilation error8, test=develop

* fix compilation error8, test=develop

* fix compilation error10, test=develop

* fix compilation error11, test=develop

df43905f

21 Sep, 2020 1 commit

[Feature] Enhance inplace addto strategy for gradient accumulation in static graph (#27112) · aba759ba

Committed by Leo Chen on Sep 21, 2020

* support use add instead of sum to do gradient accumulation

* add inplace addto pass

* add grad_add op and inplace addto pass

* remove debug code

* code refine

* fix bug when several sum ops are inserted at the same op_idx

* fix Flags type

* add addto attribute for conv3d

* fix ut

* code clean

* fix type

aba759ba
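
The idea named in this commit, using add instead of sum for gradient accumulation, can be sketched roughly as follows. This is a plain-Python illustration under assumptions, not the Paddle pass itself: `accumulate_with_sum` and `accumulate_with_addto` are hypothetical helper names, and lists stand in for gradient tensors.

```python
def accumulate_with_sum(grads):
    """Baseline: build a fresh output buffer holding the sum of all gradients."""
    return [sum(column) for column in zip(*grads)]


def accumulate_with_addto(buffer, grad):
    """Add `grad` into `buffer` in place and return the same list object."""
    if len(buffer) != len(grad):
        raise ValueError("gradient shapes must match")
    for i, g in enumerate(grad):
        buffer[i] += g
    return buffer


# Three partial gradients for the same variable (made-up values).
grads = [[1.0, 2.0], [0.5, 0.5], [3.0, -1.0]]

# Add-to style: reuse one buffer and fold each new gradient into it.
total = list(grads[0])
for g in grads[1:]:
    accumulate_with_addto(total, g)
```

Both routes give the same values; the add-to variant writes into an existing buffer rather than producing a separate output that holds the sum.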

17 Sep, 2020 1 commit
- Fix elementwise_floordiv op (#27352) · 9ee77b1f
  Committed by ShenLiang on Sep 17, 2020
```
* fix floordiv
```
  9ee77b1f
16 Sep, 2020 1 commit
- update the error message check for some ops · 4e8582fe
  Committed by wawltor on Sep 16, 2020
```
update the error message check for some ops
```
  4e8582fe
10 Sep, 2020 2 commits
- [oneDNN]Introducing oneDNN 1.6 (#27137) · e0058615
  Committed by Jacek Czaja on Sep 10, 2020
```
* - introducing oneDNN 1.6

test=develop

* - Removed redundant code

test=develop
```
  e0058615
- revert divide (#27202) · 5bd84b22
  Committed by ShenLiang on Sep 10, 2020
  
  5bd84b22
04 Sep, 2020 1 commit
- fix the remainder (#26995) · ff3dc8ac
  Committed by ShenLiang on Sep 04, 2020
  
  ff3dc8ac
28 Aug, 2020 1 commit
- fix remainder, floor_div (#26732) · 29494d70
  Committed by ShenLiang on Aug 28, 2020
```
* fix remainder, floordiv
```
  29494d70
27 Aug, 2020 1 commit
- Fix pow api type error with python side method, merge elementwise_pow and pow. (#26163) · f311d3c1
  Committed by joejiong on Aug 27, 2020
```
As the title
```
  f311d3c1
24 Aug, 2020 1 commit
- add div, floor_div, remainder (#26562) · 0e816260
  Committed by ShenLiang on Aug 24, 2020
```
* add div, floor_div, remainder
```
  0e816260
22 Aug, 2020 1 commit
- 【API】rename div to divide, add floor_divide, remainder (#26434) · 45711dad
  Committed by WangXi on Aug 22, 2020
  
  45711dad
13 Aug, 2020 1 commit

[OpDevOptimize] Add common infershape functions (#26096) · ffe52b44

Committed by Leo Chen on Aug 13, 2020

* add unchanged infershape function

* add broadcast infershape function

* fix bug

* rename infershape functions

* add UnaryOpUnchangedInferShapeCheckAxis

* add error message

* add test for common infer shape functions

* don't update existing ops

* don't update op_desc.h

* add more test

* add error check, refine error message

ffe52b44

12 Aug, 2020 1 commit
- Add the max, min, maximum, minimum api for the API 2.0 · 9c17b3c9
  Committed by wawltor on Aug 12, 2020
```
* Add the max, min, maximum, minimum api for the API 2.0, test=develop
```
  9c17b3c9
08 Aug, 2020 1 commit

Change use_quantizer attribute name and data type (#25838) · 734cf1c3

Committed by joanna.wozna.intel on Aug 08, 2020

* Change use_quantizer attribute name and data type

* Fix problem with setting attribute

* Add changes due to review

* Small change in function

* Restore use_quantizer attr for compatibility

734cf1c3

05 Aug, 2020 1 commit
- add eltwise clip cuda impl. (#25689) · 5970871a
  Committed by Zhaolong Xing on Aug 05, 2020
```
test=develop
```
  5970871a
18 Jun, 2020 1 commit

[oneDNN]elementwise_add and elementwise_mul int8 support (#24984) · a7944904

Committed by Jacek Czaja on Jun 18, 2020

* Start implementing int8 eltwise add

test=develop

* - Fix to Michal PR

* - Fix

test=develop

* - Lint fixes

test=develop

* - Added checking if elementwise_mul can be used

test=develop

* - Added attribs to skip_attrs_set

test=develop

* - Improved broadcasting

test=develop

- fixes to compilation

- fix

- fix

- Lint fixes

test=develop

* - removed redundant condition

test=develop
Co-authored-by: Michal Gallus <michal.gallus@intel.com>

a7944904

16 Jun, 2020 1 commit
- fix dtype error of compare op, test=develop (#25059) · 028de857
  Committed by Leo Chen on Jun 16, 2020
  
  028de857
03 Jun, 2020 1 commit
- Remove old mkldnn_elementwise_mul test (#24855) · 23a85f03
  Committed by Michał Gallus on Jun 03, 2020
```
test=develop
```
  23a85f03
27 May, 2020 1 commit
- rename inplace/no_need_buffer inferer, part2, test=develop (#24733) · b0e7439f
  Committed by Leo Chen on May 27, 2020
  
  b0e7439f
22 May, 2020 1 commit
- [oneDNN] Fix to elementwise_add grad (#24639) · ca68b13f
  Committed by Jacek Czaja on May 22, 2020
  
  ca68b13f

BaiXuePrincess / Paddle is up to date with the fork source project