Commits · 4e5d6743436dbed747002c3e040aca8e9b23244d · Crayon鑫 / Paddle

19 Feb, 2022 · 2 commits

[Pten] Adjust the params of creation kernel for inference (#39573) · 4e5d6743

Committed by zyfncg on Feb 19, 2022

* remove manual_api

* change sig map of full and empty

* fix fill_any_like_xpu_op

* fix fill_any_like_xpu_op

* fix problem of fill_any_like_xpu_op

* fix conflict

* polish code

4e5d6743

Add the DistributedFusedLamb optimizer (#39148) · 5df3cd61

Committed by sneaxiy on Feb 19, 2022

* add DistributedFusedLamb op

* polish code

* fix compile error

* compatible with pten changes

* fix rocm compile error

* improve coverage

* update upstream/develop

* fix cast_with_ptr.h

* add FLAGS_distributed_lamb_divide_nranks_when_allreduce=1

* fix clip before allreduce

* add use_master_param_norm

* code polish

* fix bug

* fix ROCM ci

5df3cd61

18 Feb, 2022 · 9 commits
- [Pten] blas and lapack migration (#39587) · 8c7ee8c2
  Committed by Feiyu Chan on Feb 18, 2022
```
* move blas related files
* move lapack related files
```
  8c7ee8c2
- cinn_instruction_run_op test (#39576) · fdc4fe3b
  Committed by TeFeng Chen on Feb 18, 2022
```
* add cinn_instruction_run_op test code

* update several interfaces of CinnLaunchContext

* update several interfaces and add detail comments in CinnLaunchContext class

* to skip the bug of error message check

* fix ut test failed due to reliant interface updated
```
  fdc4fe3b
- [pten] trans diagonal kernel into pten (#39575) · 5c66338f
  Committed by xiongkun on Feb 18, 2022
```
* trans diagonal kernel into pten

* fix by code review
```
  5c66338f
- [IPU] Update IpuStrategy (#39644) · 46161679
  Committed by Allen Guo on Feb 18, 2022
```
* Update IpuStrategy

* fix ci

* rerun ci
```
  46161679
- refactor the forward implementation of shape npu op (#39613) · e674af23
  Committed by baoachun on Feb 18, 2022
  
  e674af23
- dropout support Seed, fix elementwise_add_grad bug, test=kunlun (#39656) · 70b9f2ac
  Committed by taixiurong on Feb 18, 2022
  
  70b9f2ac
- [MLU]add matmul and matmul_v2 op (#39539) · 229ec32a
  Committed by qipengh on Feb 18, 2022
```
* [MLU]add matmul and matmul_v2 op

* [MLU] fix data_type and del matmul

* [MLU] fix compile error

* [MLU] fix ci_check error
```
  229ec32a
- add flatten op for mlu (#39530) · 4c5cec5c
  Committed by joeqiao12 on Feb 18, 2022
  
  4c5cec5c
- [MLU]add sync stream ops and broadcast pytest (#39518) · d2bd05b9
  Committed by zn on Feb 18, 2022
```
* [MLU]add sync stream ops and broadcast pytest

* [MLU]fix broadcast pytest to add data type
```
  d2bd05b9
17 Feb, 2022 · 7 commits
- [pten] move bernoulli kernel to pten (#39590) · f86073c4
  Committed by Leo Chen on Feb 17, 2022
```
* move bernoulli kernel to pten

* follow comments
```
  f86073c4
- add reshape2 op for mlu (#39562) · 2d2f11d1
  Committed by joeqiao12 on Feb 17, 2022
  
  2d2f11d1
- move trunc to pten (#39543) · 4501abd6
  Committed by Sing_chan on Feb 17, 2022
```
* move trunc to pten

* modify according to YuanRisheng's comment
```
  4501abd6
- add softplus op for kunlun2. test=kunlun (#39555) · 9f99b591
  Committed by houj04 on Feb 17, 2022
```
* add softplus op for kunlun2. test=kunlun

* add softplus op for kunlun2. test=kunlun

* fix code style. test=kunlun

* fix code style. test=kunlun

* add more test cases. test=kunlun
```
  9f99b591
- [Pten] Remove register of matmul_v2 kernel (#39542) · db43b541
  Committed by zyfncg on Feb 17, 2022
```
* remove register of matmul_v2 kernel

* delete matmul_v2 grad register in fluid
```
  db43b541
- move trace infer shape (#39517) · 1c9b2483
  Committed by Chen Weihang on Feb 17, 2022
  
  1c9b2483
- Modified distribution kernel with Kernel Primitive API (#39563) · 1354652b
  Committed by niuliling123 on Feb 17, 2022
  
  1354652b
16 Feb, 2022 · 9 commits
- optimize prior_box for kunlun, *test=kunlun (#39477) · e254e7c6
  Committed by TTerror on Feb 16, 2022
  
  e254e7c6
- [MLU] support adaptive pooling (#39500) · f138371c
  Committed by fwenguang on Feb 16, 2022
  
  f138371c
- Move lerp OP to pten (#39524) · d480d7b1
  Committed by 0x45f on Feb 16, 2022
```
* move lerp to pten

* refine include

* move files

* refine code
```
  d480d7b1
- Add ConditionalBlockGradInferVarType (#39585) · ff7e3590
  Committed by Aurelius84 on Feb 16, 2022
  
  ff7e3590
- [bf16] pten matmul cuda kernel support bf16 (#39485) · d5a0d31a
  Committed by Leo Chen on Feb 16, 2022
```
* pten matmul cuda kernel support bf16

* fix pten kernel name

* add matmul_grad bf16 kernel

* add emptylike bf16 kernel

* fix compile

* support rocm

* fix error

* fix rocm

* add bf16 header file

* fix compile
```
  d5a0d31a
- [Pten] move complex_functors.h (#39558) · 5b5656d0
  Committed by Feiyu Chan on Feb 16, 2022
```
* move complex_functors.h and update all references to symbols within it
```
  5b5656d0
- [PTen] Rename general grad infermeta func (#39578) · 12ca438e
  Committed by Chen Weihang on Feb 16, 2022
```
* rename general grad infermeta func

* remove useless code
```
  12ca438e
- [Pten]Modify framework::VisitDataType into Pten::VisitDataType (#39550) · 6b756fb7
  Committed by Aurelius84 on Feb 16, 2022
```
* Modify framework::VisitDataType into Pten::VisitDataType

* migrate unittest
```
  6b756fb7
- [Pten]Remove reshape and elementwise_add's registry code in Fluid (#39317) · c6478270
  Committed by YuanRisheng on Feb 16, 2022
```
* remove reshape and elementwise_add registry

* delete code

* fix bugs when run ci ut

* remove log

* fix bugs when run unit test

* fix bugs when run unit test

* fix bugs when run cinn

* fix bugs when run ci-mac-python3

* fix compile bugs

* fix compile bugs

* fix compile bugs

* fix bugs when run kunlun

* fix bugs when compile

* update code according to comments
```
  c6478270
15 Feb, 2022 · 10 commits

disabled unnecessary int reorders profiling (#39498) · 3581c075
Committed by jakpiase on Feb 15, 2022

3581c075

[PluggableDevice] Add custom runtime support (#38740) · 3e7825f3

Committed by ronnywang on Feb 15, 2022

* [CustomRuntime] Add DeviceManager

* [CustomRuntime] Add DeviceInterface

* [CustomRuntime] Add Stream, Event, DeviceGuard, CallbackManager

* [CustomRuntime] Add plug-in device

* [CustomRuntime] Memory module support PluggableDevice

* [CustomRuntime] Add WITH_PLUGGABLE_DEVICE cmake option

* update

* [API] update API doc based on comments, test=develop
Co-authored-by: qili93 <qili93@qq.com>

3e7825f3

[Pten] move paddle/operators/math/functors.h and compound_functors.h (#39514) · 0d46a108
Committed by Feiyu Chan on Feb 15, 2022
```
* move paddle/operators/math/functors.h
* move paddle/operators/math/compound_functors.h
```
0d46a108

Add cinn_instruction_run_op for launching execution of a cinn instruction (#39435) · 9d0baeab

Committed by TeFeng Chen on Feb 15, 2022

* add cinn_instruction_run_op for launching execution of a cinn instruction

* fix multi definition compilation error

* update cmake

* fix bug at infershape

* fix compile error due to lacking header file

9d0baeab

move histogram to pten (#39496) · 556f6eb0

Committed by hong on Feb 15, 2022

* move histogram to pten; test=develop

* fix format error; test=develop

* fix histogram kernel format; test=develop

556f6eb0

Move Abs OP to pten (#39492) · fb473067

Committed by From00 on Feb 15, 2022

* Move Abs op to pten

* Fix NPU compilation error

* Fix CI error

* Use LaunchSameDimsElementwiseCudaKernel in pten

fb473067

add dropout fp32 (#39501) · b81358d1
Committed by sneaxiy on Feb 15, 2022

b81358d1

move algorithm.h (#39502) · 7eb9593e

Committed by Feiyu Chan on Feb 15, 2022

Move paddle/fluid/operators/math/algorithm.h to paddle/pten/kernels/funcs and rename all references to symbols in it.

7eb9593e

[Pten]Move expand_v2 to pten (#39471) · 2d16d69b

Committed by Linjie Chen on Feb 15, 2022

* move expand to pten

* move expand_v2 to pten

* move expand_v2 to pten

* fix grad register

* fix grad register

* fix tensorcopy

* fix tensorcopy

* fix tensorcopy

* fix tensorcopy

* fix tensorcopy

* fix ci

* fix tensorcopy

2d16d69b

[PTen]Migrate proto::VarType outside of Pten (#39411) · 7e7e9404

Committed by Aurelius84 on Feb 15, 2022

* #1 migrate dist-related type()-> dtype()

* move datatype function from pten -> fluid/framework

* change type() in imperative into convert(dtype())

* modify xx_tensor->type into xx_tensor->dtype

* change the set_type interface and the caller

* modify xx_tensor.type into xx_tensor.dtype

* fix mutable_data(place, dtype())

* change caller of mutable_data in pten and distributed

* change the caller of mutable_data in fluid/framework

* change the caller of mutable_data in imperative directory

* mutable_data: inference

* update the call of mutable_data

* transfer MakePenScalarArray MakePtenScalar ResetHolderWithType

* pass the compile. the next step is remove VarType in Pten

* fix all and remove VarType from pten. success in linux. Next task is other platform

* fix conflict with develop

* fix compiled error

* Fix reset conversion

* fix conflict

* fix compiled problem

* fix typo

* Fix << in tensor_utils.cc

* fix type->dtype

* fix unittest

* fix tensor init constructor

* fix DataTypeSize for BFloat16

* fix code style

* fix npu compiled error

* fix npu

* compile npu successfully

* fix conflict

* fix conflict
Co-authored-by: xiongkun <xiongkun03@baidu.com>

7e7e9404

14 Feb, 2022 · 3 commits

[PTen] Add HasAttr for ArgumentMappingContext (#39464) · ddb1e23f
Committed by Chen Weihang on Feb 14, 2022
```
* add has_attr for arg map context

* skip useless attr now

* skip attr if not exists

* fix typo
```
ddb1e23f

[pten] add split kernel (#39060) · d0df5632

Committed by chentianyu03 on Feb 14, 2022

* add split kernel

* add split kernel signature

* fix split bug

* modify MakePtenScalarArrayFromVarList

* modify MakePtenScalarArrayFromVarList

* fix split windows register error

* add test case for split kernel

* replace raw split kernel with pten kernel

* fix makeScalar/ScalarArray bug

* remove debug log

* remove int64_t type in buildPtcontext

* update by code review

* fix split dev test failed

* change DenseTensorMeta to MetaTensor

* change split api code from auto gen to manual

* split cuda kernel support bfloat16 type

* fix conflict

* rm raw split kernel

* merge develop branch

* change to pten::errors

d0df5632

fix gather_nd, *test=kunlun (#39283) · d12c3636
Committed by TTerror on Feb 14, 2022

d12c3636

Crayon鑫 / Paddle · in sync with the fork's source project
