Commits · c269a160078593d6f66eecab721870f30d3d972f · BaiXuePrincess / Paddle

21 Jun, 2021 · 1 commit

[NPU] flatten params and grads, fuse grad_clip and optimizer op (#33461) · c269a160

Committed by Leo Chen on Jun 21, 2021

* enable npu alignment

* support flatten_params/grads

* support clip by global norm

* remove memset in coalesce_tensor_op

* fix npu kernel of sum op when input is one tensor

* add ut for flatten_param_grads+regularizer

* fix ut

* fix typo

10 Jun, 2021 · 1 commit

dp c_allreduce_sum_fusion op (#33169) · 003b4616

Committed by Baibaifan on Jun 10, 2021

26 Feb, 2021 · 1 commit

xpu support fuse allreduce (#31104) · b8bce682

Committed by WangXi on Feb 26, 2021

22 Feb, 2021 · 1 commit

[ROCM] update fluid platform for rocm39 (part4), test=develop (#30936) · 33429630

Committed by Qi Li on Feb 22, 2021

03 Jul, 2020 · 1 commit

fix PADDLE_ENFORCE (#25297) · fb70682f

Committed by GaoWei8 on Jul 03, 2020

* fix PADDLE_ENFORCE and refine the description

test=develop

23 Jul, 2019 · 1 commit

Make fuse_optimizer_op_pass also work when the model contains sparse gradients. (#18664) · fd3aad6c

Committed by chengduo on Jul 23, 2019

* support sparse gradients

test=develop

03 Jul, 2019 · 1 commit

add size op (#17412) · 67b48d7f

Committed by zhoukunsheng on Jul 03, 2019

15 Aug, 2018 · 1 commit

Add flatten op interface and enhance APIs about detection to support... · 9333a627

Committed by Bai Yifan on Aug 15, 2018

Add flatten op interface and enhance APIs about detection to support variable-length image. (#12422)

* add flatten api&enhance detection api

* unify shape_op data type

* update API.spec

01 Jun, 2018 · 1 commit

Add shape op to get the shape of variable. (#11048) · 28dc9ba3

Committed by whs on Jun 01, 2018

* Add shape op to get the shape of variable.

* Rename get_shape to shape.

* Add checker for output and fix comments.

12 Feb, 2018 · 1 commit

Fix the grammar in copyright. (#8403) · 24509f4a

Committed by qingqing01 on Feb 12, 2018

10 Feb, 2018 · 2 commits

Correct #include path · fc374821

Committed by Yi Wang on Feb 09, 2018

Move file to fluid/; Edit CMakeLists.txt · 90648f33

Committed by Yi Wang on Feb 09, 2018

22 Dec, 2017 · 1 commit

add data layout (#6832) · 6b475981

Committed by QI JUN on Dec 22, 2017

* add data layout

* fix ci

12 Dec, 2017 · 1 commit

Refine device context (#6433) · 61ec0b95

Committed by QI JUN on Dec 12, 2017

The main fixes are as follows:

- take `DeviceContext` as the template parameter of math functors and OpKernel instead of `Place`
- remove `eigen_device` interface in base class  `DeviceContext`
- remove `GetEigenDevice` interface in `ExecutionContext` and base class `DeviceContext`
- remove unused `platform::EigenDeviceConverter`
- rename `REGISTER_OP_GPU_KERNEL` to `REGISTER_OP_CUDA_KERNEL`
- rename `USE_GPU_ONLY_OP` to `USE_CUDA_ONLY_OP`

25 Oct, 2017 · 1 commit

CPU Batch Norm Op (#4964) · ee998a9c

Committed by Qiao Longfei on Oct 24, 2017

* init batch norm op

* prepare input output

* compute mean_out var_out save_mean save_var on CPU

* active is test

* use eigen to do computation

* complete batch norm forward

* set default momentum to 0.9

* add batch norm grad op in CPU

* add tensor_format and NHWC support, add python test

* add test training

* add batch norm gradient test

* improve comment, fix forward Python UnitTest

* add gradient test

* fix eigen warning

* follow name style

* fix a bug

* change float to T

* add simple forward test

* test with different place

* add backward test

* refine python test

* remove old python test code

* code clean

* follow code style

* update comment

10 Oct, 2017 · 1 commit

Implementing the fill constant op for the executor · 6efacc14

Committed by Abhinav Arora on Oct 09, 2017

28 Sep, 2017 · 1 commit

Add Skeleton of Double support · 3a5693e0

Committed by Yu Yang on Sep 27, 2017

20 Sep, 2017 · 1 commit

Share LoD between input and output of each operator. · b65709e4

Committed by dangqingqing on Sep 19, 2017

23 Aug, 2017 · 1 commit

Remove set functor and add comapre_grad test · f188e22b

Committed by dangqingqing on Aug 23, 2017

11 Aug, 2017 · 1 commit

Fix python unit tests · c99f84ac

Committed by Yu Yang on Aug 11, 2017

08 Aug, 2017 · 1 commit

fix bug · 28476676

Committed by fengjiayi on Aug 07, 2017

07 Aug, 2017 · 1 commit

"remove type alias done." · 72fb86a2

Committed by dongzhihong on Aug 07, 2017

05 Aug, 2017 · 1 commit

Reformat paddle/operators/* strictly following Google Style Guide · 9620df44

Committed by Yi Wang on Aug 04, 2017

02 Aug, 2017 · 1 commit

Add unittest for `FillZerosLikeOp` · 8bd73159

Committed by fengjiayi on Aug 01, 2017

01 Aug, 2017 · 1 commit

Follow comments and merge develop · e2fd2bd0

Committed by Yu Yang on Aug 01, 2017

26 Jul, 2017 · 1 commit

Add fill_zeros_like op · a2dc9614

Committed by fengjiayi on Jul 26, 2017

25 Jul, 2017 · 1 commit

Add type_alias to import framework into ops · efc119b4

Committed by Yu Yang on Jul 25, 2017

Make implementing an operator less noisy.

19 Jul, 2017 · 2 commits

add Flatten method to EigenVector · d9fa6159

Committed by qijun on Jul 19, 2017

Update · 00ed5643

Committed by Yi Wang on Jul 18, 2017

17 Jul, 2017 · 3 commits

set correct place for output tensor · 2a03e380

Committed by qijun on Jul 17, 2017

Op variant inputs (#2901) · a0caf234

Committed by Yan Chunwei on Jul 17, 2017

* add inputs

* add ut for multiple inputs

* fix AddToLayer

* op_desc -> op_proto

* CreateArgumentOffsetMap -> CreateInOutOffsetMap

* move CreateInOutOffsetMap from OperatorBase to op registry

* arg_idxs_ -> in_out_idxs_

implement add_op kernel · d649dbf4

Committed by qijun on Jul 17, 2017

14 Jul, 2017 · 1 commit

add_op kernel implementation · bac1426d

Committed by qijun on Jul 14, 2017

13 Jul, 2017 · 2 commits

Follow comments · 79b70c2d

Committed by Yu Yang on Jul 13, 2017

* Convert `op` --> `operators`
* Remove AddType in OpProtoMaker, because type is part of registry.
* Rename CPU_OR_GPU --> DEVICE_TYPE in registry macro.

Add a sample op, `add_op` · a0aaafe9

Committed by Yu Yang on Jul 13, 2017

* Refine register methods so that Op can get rid of whole-archive
* `USE_OP` before an op is used.
* Add unittest for add_op.
