Commits · 46f3139c7febde8f13860839e9fc9ff7cc8cc824 · 机器未来 / Paddle

17 Apr, 2020 1 commit
- OP error message enhancement of l2_normalize, matmul, mean, etc · 361c6ccc
  Committed by Zhong Hui on Apr 17, 2020
```
* fix error message of l2_normalize, matmul, mean, etc. 
* add the test case for those ops
```
11 Apr, 2020 1 commit

[DNNL][INT8][FP32] MatMul (#23395) · a63bcf9a

Committed by Michał Gallus on Apr 11, 2020

* Initial FP32 DNNL MatMul Implementation

* Implement int8 DNNL MatMul

* Unify in-kernel-naming, clean UTs

* MatMul: Introduce op caching

* Final adjustments

test=develop

* Remove dy_graph disablement

test=develop

* Change dnnl header name to new one

test=develop

* Constrain multi head check to prevent fails

test=develop

* Resolve dnnl header problems on MAC CI

* Variable namings to kernel and skip_grad_ci added

test=develop

* Prevent MAC CI from failing

* Prevent windows build from failing

test=develop

* Modify UTs to conform to the rules

* Modify MatMul aux functions namings

test=develop

04 Apr, 2020 1 commit

Delete Ref & VectorRef and add GetDataSafely (#22997) · 16315d3d

Committed by Chen Weihang on Apr 04, 2020

* delete invalid check interface Ref & VectorRef, test=develop

* fix vector ref delete error, test=develop

* try the new check interface, test=develop

* change all related code with new check macro, test=develop

* remove static assert, test=develop

* polish detail, test=develop

* skip coverage problem, test=develop

* add new check macro, test=develop

11 Mar, 2020 1 commit
- Speed up the matmul op by using gemm in place of batch gemm (#22926) · f154d586
  Committed by wawltor on Mar 11, 2020
```
In the matmul op, use gemm in place of batch gemm to speed up the op.
```
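The commit message does not say which input shapes take the new path. As a hedged illustration only, the sketch below assumes the common case where a rank-3 left operand is multiplied by a single rank-2 right operand with no transposition. In that case the batch axis can be folded into the row axis so that one gemm call replaces one call per batch entry. The helper names `gemm`, `batched_gemm`, and `folded_gemm` are hypothetical and are not Paddle APIs.

```python
def gemm(a, b):
    """Plain 2-D product of a (m x k) and b (k x n), both nested lists."""
    return [[sum(row[p] * b[p][j] for p in range(len(b)))
             for j in range(len(b[0]))] for row in a]


def batched_gemm(x, y):
    """Baseline: one gemm call per batch entry of x, all sharing the same y."""
    return [gemm(x_i, y) for x_i in x]


def folded_gemm(x, y):
    """Fold the batch axis of x into its row axis, run one gemm, then unfold."""
    batch, m = len(x), len(x[0])
    flat = [row for x_i in x for row in x_i]      # (batch * m) x k
    out = gemm(flat, y)                           # a single gemm call
    return [out[i * m:(i + 1) * m] for i in range(batch)]


x = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]   # shape [2, 2, 2]
y = [[1, 0, 2], [0, 1, 3]]                 # shape [2, 3]
same = folded_gemm(x, y) == batched_gemm(x, y)
```

Both functions return the same nested list. The folded version simply hands the underlying BLAS routine one larger problem instead of many small ones, which is the kind of saving the commit title points at.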
09 Mar, 2020 1 commit

Imperative tracer refactoring (#22457) · d33c4343

Committed by Zeng Jinle on Mar 09, 2020

* refine grad maker, test=develop

* refactor tracer stage 1, test=develop

* merge develop to resolve conflicts a third time, test=develop

25 Dec, 2019 1 commit
- fix matmul error message; test=develop (#21885) · 30d000f8
  Committed by hong on Dec 25, 2019
31 Oct, 2019 1 commit

GradMaker for dygraph (#19706) · 8c4573a3

Committed by hong on Oct 31, 2019

* refactor dygraph,test=develop

* fix failed unittest,test=develop

* polish code,test=develop

* check windows ci error,test=develop
try to fix windows ci error by np.allclose,test=develop

* polish vlog and profiler, test=develop

* try to fix preceding ops order,test=develop

* test transformer in windows ci, test=develop

* use python c-api to speed up tracer.trace,test=develop

* test=develop, fix docker with paddle nccl problem

* test=develop, add ut for debug string and gradient_accumulator

* test=develop, add tests for layer/gradient_accumulator/prepared_op

* test=develop, fix compile error for test_prepared_op

* test=develop, add more ut for dygraph

* test=develop, create API.spec for dygraph api change

* optimize grad maker; test=develop

* optimize grad maker

* test

* grad make optim; test=develop

* fix unittest bugs; test=develop

* add dygraph grad op maker and split_op

* grad op maker refactor; test=develop

* add dygraph grad maker; test=develop

* fix op deformable_conv_v1_op bug; test=develop

* fix deformable_conv prroi pool bugs;

* fix new op grad op maker bug; test=develop

* fix split by ref bug; test=develop

* fix dygraph auto prune bug; test=develop

* fix test_trace bug; test=develop

* fix fused emb seq pool bug; test=develop

* remove useless code in op_desc file; test=develop

* remove useless code, StrVarBaseNode; test=develop

* fix review issues; test=develop

* fix rank_loss grad maker; test=develop

* remove flag in VarBase; test=develop

* fix distributed_notify_op compile bug ; test=develop

* fix reshape op double grad; test=develop

* fix expand as op; test=develop

* add imperative type_defs.h for demo_train; test=develop

* fix inference lib cmake; test=develop

* fix inference lib; test=develop

* fix inference_lib; test=develop

* fix inference cmake; test=develop

* fix inference lib; test=develop

* fix inference lib; test=develop

* remove condition dygraph grad maker, modify local name; test=develop

* fix split grad maker bug; test=develop

* fix pyramid_op bug; test=develop

* change travis time out limit; test=develop

* restore travis; test=develop

* change timeout limit; test=develop

23 Oct, 2019 1 commit

update the infer shape of matmul, test=develop (#20717) · 37cd4354

Committed by 石晓伟 on Oct 23, 2019

* update the infer shape of matmul,  test=release/1.6

* add unittests of matmul, test=release/1.6

* change func names, test=develop

15 Oct, 2019 1 commit

Optimize error message of mean_op and matmul_op (#20413) · a4753f3a

Committed by 石晓伟 on Oct 15, 2019

* add data type check, test=develop

* polish error messages, test=develop

* polish error messages, test=develop

* Remove support for the CPU architecture matmul, test=develop

* fix syntax bug, test=develop

25 Sep, 2019 1 commit

add support of matmul with multiple head even different width and height (#19708) · c670058a

Committed by Bob Zhu on Sep 25, 2019

* add support of matmul with multiple head even different width and height

The original matmul with multiple heads supports only mat_a.width == mat_b.height;
in that case, mat_b is split horizontally. This patch extends the support to
mat_a.width != mat_b.height where mat_a.width/head_number == mat_b.height;
in this case, mat_b is split vertically.

For example, A is [3, 8], B is [2, 16], and head_number is 4. In this
case, A is split into 4 matrices of [3, 2] and B is (vertically) split into
4 matrices of [2, 4]. The final result is 4 matrices of [3, 4], i.e. [3, 16].

test=develop
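The two split modes described above can be sketched in plain Python. This is an illustration under stated assumptions, not the operator's actual kernel: `gemm` and `multi_head_matmul` are hypothetical helper names, matrices are nested lists, and the per-head results are concatenated along the column axis to match the [3, 16] result in the example.

```python
def gemm(a, b):
    """Plain 2-D product of a (m x k) and b (k x n), both nested lists."""
    return [[sum(row[p] * b[p][j] for p in range(len(b)))
             for j in range(len(b[0]))] for row in a]


def multi_head_matmul(a, b, head_number):
    """Multiply a and b head by head and concatenate the results column-wise."""
    m, k = len(a), len(a[0])
    b_rows, n = len(b), len(b[0])
    if k % head_number != 0:
        raise ValueError("width of a must be divisible by head_number")
    sub_k = k // head_number
    if b_rows == k:
        # Original case: b is cut into head_number blocks of rows.
        b_heads = [b[h * sub_k:(h + 1) * sub_k] for h in range(head_number)]
    elif b_rows == sub_k:
        # Extended case: b is cut into head_number blocks of columns.
        if n % head_number != 0:
            raise ValueError("width of b must be divisible by head_number")
        sub_n = n // head_number
        b_heads = [[row[h * sub_n:(h + 1) * sub_n] for row in b]
                   for h in range(head_number)]
    else:
        raise ValueError("shapes of a and b do not match either split mode")
    out = [[] for _ in range(m)]
    for h in range(head_number):
        a_h = [row[h * sub_k:(h + 1) * sub_k] for row in a]   # m x sub_k
        c_h = gemm(a_h, b_heads[h])
        for i in range(m):
            out[i].extend(c_h[i])                             # append head h
    return out


a = [[j for j in range(8)] for _ in range(3)]     # shape [3, 8]
b = [[q for q in range(16)] for _ in range(2)]    # shape [2, 16]
c = multi_head_matmul(a, b, 4)
shape = (len(c), len(c[0]))
```

With the shapes from the commit message, each head multiplies a [3, 2] slice of A by a [2, 4] slice of B, and the four [3, 4] blocks sit side by side in the output.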

* refactor the code of matmul with multiple head even different width and height

test=develop

24 Jul, 2019 1 commit

Extend Matmul to support matrix multiplication with multiple heads (#18570) · 220eef60

Committed by Bob Zhu on Jul 24, 2019

* extend matmul op to support multiple head multiplication

With multiple-head support, the multiplication of two big matrices is
split into the multiplication of several (head_number) small matrices. E.g. if
Mat A is [3, 24] and Mat B is [24, 4], then when multiplying A and B with head_number
as 4, Mat A is split into 4 matrices of [3, 6] and Mat B into 4 matrices of
[6, 4]. The final result is 4 matrices of [3, 4], i.e. [3, 16].
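A minimal sketch of the head-wise split, assuming nested lists for matrices and column-wise concatenation of the per-head results (which is what the [3, 16] output shape implies). The names `gemm` and `multi_head_matmul` are hypothetical and are not taken from the Paddle source.

```python
def gemm(a, b):
    """Plain 2-D product of a (m x k) and b (k x n), both nested lists."""
    return [[sum(row[p] * b[p][j] for p in range(len(b)))
             for j in range(len(b[0]))] for row in a]


def multi_head_matmul(a, b, head_number):
    """Split the shared dimension into head_number chunks, one gemm per chunk."""
    m, k = len(a), len(a[0])
    if k != len(b) or k % head_number != 0:
        raise ValueError("shared dimension must match and divide by head_number")
    sub_k = k // head_number
    out = [[] for _ in range(m)]
    for h in range(head_number):
        lo, hi = h * sub_k, (h + 1) * sub_k
        a_h = [row[lo:hi] for row in a]     # m x sub_k slice of A
        b_h = b[lo:hi]                      # sub_k x n slice of B
        c_h = gemm(a_h, b_h)                # m x n result for this head
        for i in range(m):
            out[i].extend(c_h[i])           # place head h next to the others
    return out


a = [[j for j in range(24)] for _ in range(3)]   # shape [3, 24]
b = [[1] * 4 for _ in range(24)]                 # shape [24, 4]
c = multi_head_matmul(a, b, 4)
shape = (len(c), len(c[0]))
```

Note that the result is not the ordinary product of A and B, which would be [3, 4]. Each head contributes its own [3, 4] block, so the output width grows with head_number.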

21 Mar, 2019 1 commit
- fix matmul shape check; test=develop · 0e402989
  Committed by phlrain on Mar 21, 2019
18 Sep, 2018 1 commit
- modification · 0718113a
  Committed by sneaxiy on Sep 18, 2018
17 Sep, 2018 1 commit
- tiny change to save memory · abf9832c
  Committed by sneaxiy on Sep 17, 2018
10 May, 2018 1 commit
- matmul support float16/double · 27197290
  Committed by yuyang18 on May 10, 2018
08 May, 2018 2 commits

Clean OpProtoAndCheckerMaker · 0e78cb69

Committed by Yu Yang on May 08, 2018

Do not use ctor

* Reduce lines of code.
* We can use virtual functions for Maker now.
* The implementation does not care what the maker holds, so it is easier to
refactor later.

Follow comments and polish code names · fcd31d61

Committed by Yu Yang on May 08, 2018

07 May, 2018 1 commit
- Rewrite Matmul, make code cleaner · c6a6d87f
  Committed by Yu Yang on May 07, 2018
19 Apr, 2018 1 commit
- add semicolon to op registry (#10034) · e04c43d5
  Committed by Yang Yang(Tony) on Apr 18, 2018
```
* script to add semicolon

* fix typo
```
17 Apr, 2018 1 commit
- script to fix all · ce7c2e86
  Committed by Yang Yang on Apr 16, 2018
12 Apr, 2018 1 commit
- Fix cpplint errors for a set of operators (#9837) · 8d3ce01f
  Committed by Siddharth Goyal on Apr 11, 2018
```
* Fix cpplint errors, round2

* Fix pointer issue
```
12 Feb, 2018 1 commit
- Fix the grammar in copyright. (#8403) · 24509f4a
  Committed by qingqing01 on Feb 12, 2018
10 Feb, 2018 2 commits
- Correct #include path · fc374821
  Committed by Yi Wang on Feb 09, 2018
- Move file to fluid/; Edit CMakeLists.txt · 90648f33
  Committed by Yi Wang on Feb 09, 2018
21 Jan, 2018 1 commit
- follow comments · 782ddc5f
  Committed by chengduoZH on Jan 21, 2018
19 Jan, 2018 1 commit
- follow comments · 0468422d
  Committed by chengduoZH on Jan 19, 2018
18 Jan, 2018 3 commits
- modify doc · 259858b4
  Committed by chengduoZH on Jan 18, 2018
- code refine · 578d60bf
  Committed by chengduoZH on Jan 18, 2018
- add 4-d for matmul_op · 2edc136c
  Committed by chengduoZH on Jan 18, 2018
20 Dec, 2017 1 commit
- Move framework.proto to proto namespace (#6718) · e445b3ff
  Committed by Yu Yang on Dec 20, 2017
```
* Move framework.proto to proto namespace

* Fix compile

* Fix compile

* Fix Compile
```
12 Dec, 2017 1 commit

Refine device context (#6433) · 61ec0b95

Committed by QI JUN on Dec 12, 2017

The main changes are as follows:

- take `DeviceContext` as the template parameter of math functors and OpKernel instead of `Place`
- remove `eigen_device` interface in base class  `DeviceContext`
- remove `GetEigenDevice` interface in `ExecutionContext` and base class `DeviceContext`
- remove unused `platform::EigenDeviceConverter`
- rename `REGISTER_OP_GPU_KERNEL` to `REGISTER_OP_CUDA_KERNEL`
- rename `USE_GPU_ONLY_OP` to `USE_CUDA_ONLY_OP`

05 Nov, 2017 1 commit
- Polish Operator Doc (m) (#5375) · cb0118f3
  Committed by kexinzhao on Nov 04, 2017
```
* fix m_ops

* fix activation op
```
18 Oct, 2017 1 commit

MatMul operator (#4856) · 16489827

Committed by Markus Kliegl on Oct 17, 2017

* initial matmul operator

Similar to np.matmul, but also has transpose_X and transpose_Y flags,
and only supports tensors from rank 1 to 3 inclusive.

For GPU, uses cublas?gemmStridedBatched. For CPU, uses
cblas_?gemm_batch if available via MKL; otherwise a simple serial
implementation that loops over the batch dimension is employed for now.
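The serial fallback described above can be sketched in pure Python. This is only an illustration of looping over the batch dimension, not the operator's code. It makes two assumptions that the commit message does not confirm: rank-1 operands are promoted the way np.matmul promotes them, and the transpose flags are ignored for rank-1 operands. The names `rank`, `gemm`, `transpose`, and `matmul` are hypothetical helpers.

```python
def rank(t):
    """Nesting depth of a non-empty rectangular nested list."""
    r = 0
    while isinstance(t, list):
        r, t = r + 1, t[0]
    return r


def gemm(a, b):
    """Plain 2-D product of a (m x k) and b (k x n)."""
    return [[sum(row[p] * b[p][j] for p in range(len(b)))
             for j in range(len(b[0]))] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(x, y, transpose_x=False, transpose_y=False):
    rx, ry = rank(x), rank(y)
    if not (1 <= rx <= 3 and 1 <= ry <= 3):
        raise ValueError("only ranks 1 to 3 are supported")
    if rx == 1:
        x = [x]                        # [k] -> [1, k]
    if ry == 1:
        y = [[v] for v in y]           # [k] -> [k, 1]
    xs = x if rx == 3 else [x]         # view both operands as batches
    ys = y if ry == 3 else [y]
    batch = max(len(xs), len(ys))
    if len(xs) not in (1, batch) or len(ys) not in (1, batch):
        raise ValueError("batch sizes do not match")
    out = []
    for i in range(batch):             # serial loop over the batch dimension
        a = xs[i % len(xs)]
        b = ys[i % len(ys)]
        if transpose_x and rx > 1:
            a = transpose(a)
        if transpose_y and ry > 1:
            b = transpose(b)
        out.append(gemm(a, b))

    def squeeze(c):
        if rx == 1:
            c = c[0]                   # drop the row axis added for x
        if ry == 1:
            c = c[0] if rx == 1 else [row[0] for row in c]
        return c

    out = [squeeze(c) for c in out]
    return out if max(rx, ry) == 3 else out[0]


dot = matmul([1, 2, 3], [4, 5, 6])
prod_t = matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]], transpose_y=True)
```

In the real operator the loop body is replaced by a batched BLAS call when one is available. The loop here only mirrors the fallback path the commit message mentions.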
