Commits · 340dfb26135aa9be903575aa29691402ccd40467 · 机器未来 / Paddle

28 Dec, 2021 · 2 commits
- Add Amax and Amin API (#38417) · 340dfb26
  Committed by Tao Luo on Dec 28, 2021
```
* add amax/amin

* support axis is list
```
- add reduce_prod_xpu. fix reduce_mean_xpu bug. (#38481) · 78836bb7
  Committed by houj04 on Dec 28, 2021
```
* add reduce_prod_xpu. fix reduce_mean_xpu bug.

* add reduce_prod_xpu. fix reduce_mean_xpu bug. test=kunlun
```
24 Dec, 2021 · 1 commit

[pten] combine reduce_cuda codes (#38328) · 08941eda

Committed by chentianyu03 on Dec 24, 2021

* combine reduce_cuda codes

* support float16 in pten reduce_mean

* replace ReduceCudaKernel impl with pten reduce impl

* mv reduce funcs into reduce_cuda_impl

* rm unused codes and headers

* mv GetReduceDim into reduce_cuda_impl

* recover GetReduceDim in reduce_op.h

* add new dispatch macro

* fix pool op output not inited and cause transform to pten::denseTensor error

* fix output tensor not initialized error

* rename new dispatch macro and format code style

* rm reduce_functor_op.h file

21 Dec, 2021 · 1 commit
- Support FP16 mean (#38289) · 643a268e
  Committed by sneaxiy on Dec 21, 2021
```
* mean first version

* fix scalar mean

* add fp16 dtype for api
```
17 Dec, 2021 · 2 commits
- [pten] modify reduce_sum reduce_mean args (#38216) · eaa2363e
  Committed by chentianyu03 on Dec 17, 2021
```
* modify sum mean args

* add GetExpectedPtenKernelArgs for reduce_op

* modify kernel args number

* modify kernel args number
```
- Delete cub_reduce.h and modified the TensorReduce to TensorReduceFunctorImpl (#38197) · 9a8a4c77
  Committed by niuliling123 on Dec 17, 2021
16 Dec, 2021 · 1 commit
- Add the transformop parameter in TensorReduceFunctorImpl (#38135) · 524389ee
  Committed by niuliling123 on Dec 16, 2021
```
* Add the transformop parameter in TensorReduceFunctorImpl
```
13 Dec, 2021 · 1 commit
- [pnorm] Optimize p_norm op for special cases (#37685) · 10d9ab4b
  Committed by Noel on Dec 13, 2021
09 Dec, 2021 · 1 commit
- adjust main dir (#37916) · 1911b6f0
  Committed by Chen Weihang on Dec 08, 2021
08 Dec, 2021 · 1 commit
- implementation of broadcast sub backward by reduce (#37754) · 567e6bbc
  Committed by crystal on Dec 08, 2021
```
* add broadcast_sub

* add broadcast_sub
```
03 Dec, 2021 · 1 commit
- refine structure for cuda and rocm (#37202) · a6d2fddb
  Committed by ronnywang on Dec 03, 2021
```
* refine structure for cuda and rocm

* update

* update

* update

* update
```
29 Nov, 2021 · 2 commits

[Pten] Add reduce mean kernel, replace with mean API (#37559) · f9e9fd19

Committed by chentianyu03 on Nov 29, 2021

* add pten reduce kernel

* add reduce_sum kernel

* update attribute args and order

* make out dtype undefined

* fix empty input error

* merge develop branch

* rename sum as reduce function

* rename sum as reduce function

* fix reducekernelImpl args error

* add reduce cuda kernel

* modify dims type to const &

* remove unused log

* fix reduce_all out eigen function error

* remove unused codes

* add the missing sum api define and testcase

* merge develop branch

* fix sum test axis value error

* replace pten mean kernel with reduce_mean

* recover mean cuda to original implementation

Add third batch of deprecated mkldnn namespace name changes (#37558) · 1ba81500

Committed by piotrekobiIntel on Nov 29, 2021

27 Nov, 2021 · 1 commit

[NPU] reorganization for device API abstraction (#37110) · 72241a6a

Committed by Aganlengzi on Nov 27, 2021

* [NPU] reorganization for device API abstraction

* [NPU] delete old files

* [NPU] fix npu_collective_helper

* [NPU] fix collective_helper

* [NPU] fix ut

* [NPU] mod memory allocation and hccl_helper

* [NPU] fix place_type

* [NPU] split enforce.h

* move acl* call into npu_info

* merge conflict

* fix merge

* merge conflict

* merge conflict

23 Nov, 2021 · 1 commit
- [XPU] Reorganize xpu device codes in platform, test=develop (#37428) · 79800978
  Committed by Qi Li on Nov 23, 2021
```
* [XPU] Reorganize xpu device codes in platform, test=develop

* fix xpu_header.h, test=develop
```
17 Nov, 2021 · 1 commit
- Modify reduce_op.op.h for xpu2 with kernel primitive api (#36904) · 9c5d5665
  Committed by niuliling123 on Nov 17, 2021
```
* Modify reduce_op.op.h for xpu2 with kernel primitive api
```
28 Oct, 2021 · 1 commit

[NPU] Add int64 supporting for expand_v2, reduce_max, scale and tests (#36582) · c038cc7a

Committed by ronnywang on Oct 28, 2021

* add TypeAdapter method for npu_op_runner

* add int64 supporting for elementwise_mul and reduce_sum

* add int64 supporting and UT for expand_v2, scale and reduce_max

* fix bug

26 Oct, 2021 · 1 commit

[NPU] fix argsort op, test=develop (#36576) · 3523bbe8

Committed by Qi Li on Oct 26, 2021

* [NPU] fix argsort op, test=develop

* remove debug files, test=develop

* fix typo, test=develop

* address review comments, test=develop

21 Oct, 2021 · 1 commit

Fix a bug in ReadData, ReadDataBc and ReadDataReduce when NX != 1 (#36373) · 921c0917

Committed by niuliling123 on Oct 21, 2021

* Update the implementation of reduceAnyKernel according to kernel primitive api
* Fix a bug in ReadData, ReadDataBc and ReadDataReduce when NX != 1

18 Oct, 2021 · 1 commit
- [XPU AMP] 1. xpu support gradient acc 2. xpu support create tensor in dygraph... · d19a9b39
  Committed by taixiurong on Oct 18, 2021
```
[XPU AMP] 1. xpu support gradient acc 2. xpu support create tensor in dygraph 3. xpu support update weight params in amp (#36439)
```
28 Sep, 2021 · 1 commit
- fix bug of reduce_sum when src_dtype != dst_dtype and reduce_num == 1 (#36123) · d5268a6e
  Committed by Guoxia Wang on Sep 28, 2021
18 Sep, 2021 · 1 commit

[oneDNN] Disable caching of Reorder operation (#35664) · e4c2a854

Committed by Jacek Czaja on Sep 18, 2021

* - Reorder disabling caching

* - compilation fix

* - another compilation fix

* - another compilation fix

* - compilation fix

* - Fix

* - yet another compilation fix

* - surprisingly another compilation fix

* - lint

* - fix after review

* - fix

08 Sep, 2021 · 2 commits
- Modify the reduce op according to the kernel primitive api (#35282) · 82b33be3
  Committed by niuliling123 on Sep 08, 2021
- Add op define extra for norm and frobenius norm op. (#35329) · 3dab2e20
  Committed by Zhong Hui on Sep 08, 2021
26 Aug, 2021 · 1 commit

[oneDNN] disable caching oneDNN primitives in matmul v2, Reduce grad and... · 31f0221f

Committed by Jacek Czaja on Aug 26, 2021

[oneDNN] disable caching oneDNN primitives in matmul v2, Reduce grad and elementwise_add grad, expand_v2 (#35132)

* - grad caching disabled of matmul_v1

- compilation fix

- compilation fix

* - reduction removed

* - Matmul v2 disabled caching

* Draft of further changes

* - workaround for reducegrad

* - fixes to UT

* - fix to compilation

* - another fix

* - fix

17 Aug, 2021 · 1 commit
- fix a bug in nlp: text_matching/sentence_transformers when last dim is 1 and... · 181f7cec
  Committed by niuliling123 on Aug 17, 2021
```
fix a bug in nlp: text_matching/sentence_transformers when last dim is 1 and reduce mid dim (#34941)
```
11 Aug, 2021 · 2 commits
- [NPU] add reduce_mean_op_npu and test (#34053) · f6fab559
  Committed by ronnywang on Aug 11, 2021
```
* add reduce_mean_op_npu and test

* remove skip.If

* update
```
- modified reduce_sum_op and reduce_mean_op for higher_performance (#32885) · 6a9fac14
  Committed by niuliling123 on Aug 11, 2021
06 Aug, 2021 · 1 commit

[NPU] add reduce_prod (#34182) · 47d81b09

Committed by furnace on Aug 06, 2021

* [NPU] add reduce_prod

* [NPU] delete check_dygraph=False

* [NPU] delete skipIf

* add attrs support or check

* [NPU] delete extra codes for test_reduce_max_op_npu

* [NPU] add attr out_dtype

05 Aug, 2021 · 2 commits

New executor dev (#34407) · 012d12b5

Committed by hong on Aug 05, 2021

* first test version

* add test exec;

* add data transfer; test=develop

* add new exec head;

* add memcpy; test=develop

* add python fetch

* add new test

* add graph node; test=develop

* remove useless new executor test; test=develop

* remove gperf dependency; test=develop

* fix compile bugs; test=develop

* remove useless code; test=develop

* remove useless code; test=develop

* add unit test; test=develop

* polish code; test=develop

* polish code; test=develop

* add interpreter cmakefile; test=develop

* remove useless code; test=develop

Support Ternary ops in elementwise and broadcast (#33976) · 1d7b75dd

Committed by limingshu on Aug 05, 2021

03 Aug, 2021 · 1 commit
- support Kunlun2 (#34459) · 2d0f3d9b
  Committed by QingshuChen on Aug 03, 2021
```
* support Kunlun2

* support KL2

* support KL2
```
02 Aug, 2021 · 2 commits
- Unify the block/grid strategy and implementation of ReduceLastDim and ReduceAny (#34436) · c7cc5ac2
  Committed by Zhang Zheng on Aug 02, 2021
- [NPU] add reduce_max (#34179) · de53f2bf
  Committed by furnace on Aug 02, 2021
```
* [NPU] add reduce_max

* [NPU] delete skipIf

* [NPU] add attrs support or check

* [NPU] add attr out_dtype

* [NPU] delete debug codes
```
30 Jul, 2021 · 1 commit

Added expand_v2 BF16/FP32 FWD/BWD kernels (#34284) · 41c4f723

Committed by jakpiase on Jul 30, 2021

* added expand_v2 bf16/fp32 kernel

* minor change

* CI fix

* added missing test file

* added formatting

* reduced binary size

* CI fix

12 Jul, 2021 · 1 commit
- optimize performance of multiple-dimension reduce (#33761) · 2dde0eb0
  Committed by Zhang Zheng on Jul 12, 2021
05 Jul, 2021 · 1 commit
- Reduce build time by deleting the template param BlockDim (#33901) · 7a476608
  Committed by Zhang Zheng on Jul 05, 2021
02 Jul, 2021 · 1 commit
- modified reduce_all_op reduce_any_op for higher performance (#33267) · 9b48199a
  Committed by niuliling123 on Jul 02, 2021
22 Jun, 2021 · 1 commit
- modified reduce_max, reduce_min, reduce_prod to higher_performance implementation. (#32974) · 480b284c
  Committed by niuliling123 on Jun 22, 2021
15 Jun, 2021 · 1 commit

Support reduce_sum_op float16 (#32966) · 606939de

Committed by jiangcheng on Jun 15, 2021

* add reduce_sum_op by add self-kernel

* set all ReduceKernel MPType for accuracy

* add float16 test script which input is integer number

* solve reduce sum float16 check_grad problem

* solve conflict and change test script for CI

* change kernel register for CI

* remove all useless template
