Commits · 3ac6c189a3ec5189c6d154945dee72c607da904a · BaiXuePrincess / Paddle

14 Apr, 2021 1 commit

adds new CPU kernel for SGD op supporting BF16 data type (#32162) · 3ac6c189

Committed by Adam Osewski on Apr 14, 2021

* Initial draft for SGD BF16 kernel.

* Unit tests for SGD with BF16 data type.

* Add VLOG message to SGD BF16 op CPU kernel.

* Enhance error messages and error types.

* Refactor SGD op kernels to leverage some common code.

* Make it easier to add new kernel invoke code.

* Fix SGD op kernel for sparse grad.

* Unify quotes style.

* Fix error for ROCM compilation.

* Use specialized PADDLE_ENFORCE_xx functions.
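
As a rough illustration of what an SGD update on BF16 data involves, here is a minimal pure-Python sketch. It is not the kernel added by this commit: the function names are hypothetical, the round-to-nearest-even conversion is an assumption about how values are narrowed to bfloat16, and NaN handling is left out.

```python
import struct


def float_to_bf16_bits(x: float) -> int:
    """Narrow a float to bfloat16 and return the 16-bit pattern.

    Assumption: round-to-nearest-even on the float32 bit pattern.
    NaN inputs are not handled in this sketch.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Add 0x7FFF plus the lowest kept bit so that ties round to even.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + rounding_bias) >> 16) & 0xFFFF


def bf16_bits_to_float(b: int) -> float:
    """Widen a bfloat16 bit pattern by zero-filling the low 16 bits."""
    return struct.unpack("<f", struct.pack("<I", (b & 0xFFFF) << 16))[0]


def sgd_step_bf16(param_bits, grad_bits, lr):
    """Compute param - lr * grad in float, then store the result as bf16."""
    out = []
    for p, g in zip(param_bits, grad_bits):
        updated = bf16_bits_to_float(p) - lr * bf16_bits_to_float(g)
        out.append(float_to_bf16_bits(updated))
    return out
```

For example, `sgd_step_bf16([float_to_bf16_bits(1.0)], [float_to_bf16_bits(0.5)], 0.5)` holds the bit pattern of 0.75. The arithmetic is done at higher precision and only the stored value is narrowed, which is one common way to handle 16-bit parameter storage.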

3ac6c189

27 Sep, 2020 1 commit
- fix error message (#27318) · d014e29f
  Committed by Chengmo on Sep 27, 2020
```
* fix sgd/momentum/dpsgd/rmsprop error message
```
  d014e29f
26 Apr, 2020 1 commit

improve efficiency of runtime InferVarType (#22778) · 9a93f6aa

Committed by liuwei1031 on Apr 26, 2020

* save InferVarType changes, test=develop

* remove code comments, test=develop

* tweak code, test=develop

* fix compilation warning, update merge_ids_op split_ids_op to new interface, test=develop

* modify fused_bn_activation_op, test=develop

* fix error of fused_bn_activation_op, test=develop

* fix PADDLE_ENFORCE and unittest coverage issue, test=develop

* tweak PADDLE_ENFORCE messages, test=develop

* improve unittest coverage, test=develop

* add StaticGraphInferVarType class, test=develop

* rebase develop branch, test=develop

* fix unittest error, test=develop

* remove comments, test=develop

* improve unittest coverage, test=develop

* improve error message and improve unittest coverage, test=develop

* upgrade InferVarType API, test=develop

* tweak pyfunc error message, test=develop

* fix compilation conflict - save_combine_op, test=develop

9a93f6aa

06 Dec, 2019 1 commit

Add Much Complex Test and Fix Bugs for Control Flow cond API (#21532) · 1dcf6a72

Committed by Huihuang Zheng on Dec 06, 2019

Add tests that use dy/dx to make sure the gradient values calculated by the control flow backward pass are correct. Also fix the bugs detected by those tests.

Fix bugs:

1. Unlike sum_op, optimizer ops don't allow uninitialized input tensors. In conditional_block_grad_op, however, the conditional_block may not run, so the output gradient tensor may be uninitialized, which causes an error in the optimizer op. There are two ways to fix this: let optimizer ops support uninitialized input as sum_op does, or assign 0 to the uninitialized gradient when the conditional_block_grad_op doesn't run. I found more than 10 optimizer ops. **To keep it simple, I just assign 0 to the output gradient of the conditional_block_grad_op in this PR**. It can be explored further whether optimizer ops can support uninitialized input tensors like sum_op, because in theory skipping the assignment in conditional_block_grad_op would be faster.

2. Infer parameter shapes during append_backward. I didn't know that all our parameters are in the global block. When op_desc infers shapes in a sub-block, it may not know the shapes of the gradients of parameters whose shape information is in the global block. I fixed it by inferring the shapes of gradients from the forward var.
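
The first fix can be sketched in plain Python. This is only an illustration under stated assumptions, not Paddle code: `cond_block_grad` and `sgd_update` are hypothetical stand-ins for conditional_block_grad_op and an optimizer op, lists stand in for tensors, and `None` stands in for an uninitialized tensor.

```python
def cond_block_grad(pred, param, upstream_grad):
    """Gradient w.r.t. param of a block that computes param * 2 when pred is true."""
    if pred:
        return [2.0 * g for g in upstream_grad]
    # The block did not run: emit zeros with the parameter's shape instead of
    # leaving the gradient uninitialized (which would be None in this sketch).
    return [0.0] * len(param)


def sgd_update(param, grad, lr):
    """Strict optimizer step that rejects uninitialized or mis-shaped gradients."""
    if grad is None:
        raise ValueError("gradient is uninitialized")
    if len(grad) != len(param):
        raise ValueError("param and grad shapes differ")
    return [p - lr * g for p, g in zip(param, grad)]
```

With the zero fill, `sgd_update(param, cond_block_grad(False, param, upstream), lr)` leaves `param` unchanged instead of raising, at the cost of writing a zero tensor that a more permissive optimizer op would not need.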

This PR also does some code cleanup:
1. Print the var name when sgd_op catches a shape error, so that it is easier to debug
2. Fix a typo: dicta -> dict

1dcf6a72

29 Nov, 2019 1 commit

Fix optimizer op infershape failed in dygraph multi-cards mode (#21374) · 664f958a

Committed by Chen Weihang on Nov 29, 2019

* add param & grad shape check for sgd op

* add _reshape_inplece interface for dygraph parallel

* refine unittest based paddle/models scripts, test=develop

* add unittest for parallel grad fuse, test=develop

664f958a

31 Oct, 2019 1 commit

GradMaker for dygraph (#19706) · 8c4573a3

Committed by hong on Oct 31, 2019

* refactor dygraph,test=develop

* fix failed unittest,test=develop

* polish code,test=develop

* check windows ci error,test=develop
try to fix windows ci error by np.allclose,test=develop

* polish vlog and profiler, test=develop

* try to fix preceding ops order,test=develop

* test transformer in windows ci, test=develop

* use python c-api to speed up tracer.trace,test=develop

* test=develop, fix docker with paddle nccl problem

* test=develop, add ut for debug string and gradient_accumulator

* test=develop, add tests for layer/gradient_accumulator/prepared_op

* test=develop, fix compile error for test_prepared_op

* test=develop, add more ut for dygraph

* test=develop, create API.spec for dygraph api change

* optimize grad maker; test=develop

* optimize grad maker

* test

* grad make optim; test=develop

* fix unittest bugs; test=develop

* add dygraph grad op maker and split_op

* grad op maker refactor; test=develop

* add dygraph grad maker; test=develop

* fix op deformable_conv_v1_op bug; test=develop

* fix deformable_conv prroi pool bugs;

* fix new op grad op maker bug; test=develop

* fix split by ref bug; test=develop

* fix dygraph auto prune bug; test=develop

* fix test_trace bug; test=develop

* fix fused emb seq pool bug; test=develop

* remove useless code in op_desc file; test=develop

* remove useless code, StrVarBaseNode; test=develop

* fix review issues; test=develop

* fix rank_loss grad maker; test=develop

* remove flag in VarBase; test=develop

* fix distributed_notify_op compile bug ; test=develop

* fix reshape op double grad; test=develop

* fix expand as op; test=develop

* add imperative type_defs.h for demo_train; test=develop

* fix inference lib cmake; test=develop

* fix inference lib; test=develop

* fix inference_lib; test=develop

* fix inference cmake; test=develop

* fix inference lib; test=develop

* fix inference lib; test=develop

* remove condition dygraph grad maker, modify local name; test=develop

* fix split grad maker bug; test=develop

* fix pyramid_op bug; test=develop

* change travis time out limit; test=develop

* restore travis; test=develop

* change timeout limit; test=develop

8c4573a3

28 Oct, 2019 1 commit

Replace risky GetInputType method with secure IndicateVarDataType interface (#20668) · 26cc1fe5

Committed by Chen Weihang on Oct 28, 2019

* replace part of the old implementation, test=develop

* restore concat op, test=develop

* update all ops implementation & delete GetDataTypeOfVar func, test=develop

26cc1fe5

24 Oct, 2019 1 commit
- Fix DGC algorithm flow to make it the same as paper (#20758) · 250e72d2
  Committed by WangXi on Oct 24, 2019
  
  250e72d2
04 Sep, 2019 1 commit

Add user-friendly error message in optimizer ops to give a hint about the... · 8cb54ede

Committed by Chen Weihang on Sep 04, 2019

Add user-friendly error message in optimizer ops to give a hint about the position sensitive problem of run(startup_program) (#19605)

* add extra error message hint in optimizer ops

* polish format & delete useless change, test=develop

* extract init judge from shape compare, test=develop

8cb54ede

04 Jul, 2019 1 commit
- Make fuse_all_reduce_op_pass support mix_precision (#17652) · 74538573
  Committed by chengduo on Jul 04, 2019
  
  74538573
19 Mar, 2019 1 commit
- add allocator flags · 22715487
  Committed by zhhsplendid on Mar 19, 2019
```
test=develop
```
  22715487
18 Mar, 2019 1 commit
- Polish code style · b40e41fb
  Committed by minqiyang on Mar 18, 2019
```
test=develop
```
  b40e41fb
15 Mar, 2019 1 commit
- Implement infer var type context · ca392c7e
  Committed by minqiyang on Mar 15, 2019
  
  ca392c7e
16 Nov, 2018 1 commit

Refine operator cmake (#14413) · a2d9b344

Committed by Wu Yi on Nov 16, 2018

* wip simplify operator framework

* wip

* wip

* done test=develop

* clean test=develop

* fix test=develop

* fix deps test=develop

* fix cpu build test=develop

* fix tensorrt build test=develop

* fix tests test=develop

* fix test=develop

* fix cpu build test=develop

a2d9b344

22 Oct, 2018 1 commit
- clean up after the changes have been stopped for so long. · 8f2116d8
  Committed by Xin Pan on Oct 18, 2018
```
test=develop
```
  8f2116d8
15 Oct, 2018 1 commit

Add check for opt op (#13840) · 8e2fdc54

Committed by chengduo on Oct 15, 2018

* add check for opt op

* fix opt op
test=develop

* fix test fail
test=develop

* fix optimization doc
test=develop

* test=develop

8e2fdc54

11 Jun, 2018 1 commit

add inplace attribute to op_proto_maker (#10665) · bfa3fd6f

Committed by dzhwinter on Jun 11, 2018

* "add inplace attribute"

* "register inplace attribute"

* "change se-next model for memory-reuse"

* "fix typo"

* repick

* fix merge conflict

* "fix stupid error"

bfa3fd6f

08 May, 2018 1 commit

Clean OpProtoAndCheckerMaker · 0e78cb69

Committed by Yu Yang on May 08, 2018

Do not use ctor

* Reduce lines of code.
* We can use virtual function for Maker now.
* The implementation does not care what the maker holds, so it is easier to refactor later.

0e78cb69

26 Apr, 2018 1 commit
- refine distribute transpiler · dccd013b
  Committed by Yancey1989 on Apr 26, 2018
  
  dccd013b
12 Apr, 2018 1 commit

Dist transpiler support prefetch (#9714) · 4c55a602

Committed by Qiao Longfei on Apr 12, 2018

* init

* add some check

* add dist transpile logic

* add insert op for block

* init change get_pserver_program

* optimize code

* fix a bug

* can run now

* start to do table split

* start to process table gradient

* complete pserver part

* can send_vars now

* revert cpplint

* fix a bug

* optimize code

* move dist test to models

* revert the interface of distribute_transpiler.transpile

* fix prefetch_block

* optimize transpiler code

* add comment to sum_op

* add warning log

* fix comment

* fix test_send_recv

* fix test_send_recv

* fix train with no distributed table

* optimize GetDims

4c55a602

04 Apr, 2018 1 commit
- add GetDataTypeOfVar · e66bd4cb
  Committed by qiaolongfei on Apr 04, 2018
  
  e66bd4cb
03 Apr, 2018 1 commit
- sgd_op support optimize SelectedRows · 2669aea6
  Committed by qiaolongfei on Apr 03, 2018
  
  2669aea6
09 Mar, 2018 1 commit
- Fix sparse update memory error for distributed training (#8837) · 84680379
  Committed by Yancey on Mar 09, 2018
```
Fix sparse update memory error for distributed training
```
  84680379
12 Feb, 2018 1 commit
- Fix the grammar in copyright. (#8403) · 24509f4a
  Committed by qingqing01 on Feb 12, 2018
  
  24509f4a
10 Feb, 2018 2 commits
- Correct #include path · fc374821
  Committed by Yi Wang on Feb 09, 2018
  
  fc374821
- Move file to fluid/; Edit CMakeLists.txt · 90648f33
  Committed by Yi Wang on Feb 09, 2018
  
  90648f33
23 Dec, 2017 1 commit
- refine sgd-op · 02fda711
  Committed by chengduoZH on Dec 23, 2017
  
  02fda711
20 Dec, 2017 1 commit
- Move framework.proto to proto namespace (#6718) · e445b3ff
  Committed by Yu Yang on Dec 20, 2017
```
* Move framework.proto to proto namespace

* Fix compile

* Fix compile

* Fix Compile
```
  e445b3ff
12 Dec, 2017 1 commit

Refine device context (#6433) · 61ec0b95

Committed by QI JUN on Dec 12, 2017

The main changes are:

- take `DeviceContext` as the template parameter of math functors and OpKernel instead of `Place`
- remove `eigen_device` interface in base class  `DeviceContext`
- remove `GetEigenDevice` interface in `ExecutionContext` and base class `DeviceContext`
- remove unused `platform::EigenDeviceConverter`
- rename `REGISTER_OP_GPU_KERNEL` to `REGISTER_OP_CUDA_KERNEL`
- rename `USE_GPU_ONLY_OP` to `USE_CUDA_ONLY_OP`

61ec0b95

27 Nov, 2017 1 commit
- Fix the latex comment syntax in sgd_op.cc (#5940) · ef3420e2
  Committed by Abhinav Arora on Nov 27, 2017
```
* Fix the latex comment syntax in sgd_op.cc

* Change \textunderscore to \_
```
  ef3420e2
04 Nov, 2017 1 commit
- Polish operator documentation (#5356) · b0b26dab
  Committed by Abhinav Arora on Nov 03, 2017
```
* Polish the documentation for uniform_random and top_k ops

* Polishing more operators
```
  b0b26dab
29 Oct, 2017 1 commit

support sparse output for lookup table grad op (#5145) · 008f40ce

Committed by QI JUN on Oct 28, 2017

* add sparse support for sum op

* typo fix

* fix gpu build error

* fix unittest error

* typo fix

* infer var type and shape in op_test

* follow comments

* fix build error

* bypass some unittests depend on NetOp

* support sparse output for lookup table grad op

* refine codes

* fix gpu build error

* fix lookup table grad gpu kernel

* fix ci

* fix ci

* fix ci

* fix bug in lookup_table_grad op

* fix bug in test_word2vec

* register double kernel for some operators

* set is_sparse=True in test_word2vec

* fix lookup table grad op CUDA kernel bug

* disable test_modified_huber_loss_op temporarily

* disable test_lstm_unit_op temporarily

008f40ce

18 Oct, 2017 2 commits
- fix gpu build error · f9681459
  Committed by qijun on Oct 17, 2017
  
  f9681459
- add sparse kernel of sgd operator · 182ce51c
  Committed by qijun on Oct 17, 2017
  
  182ce51c
17 Oct, 2017 1 commit
- Correct OpWithKernel's infershape (#4847) · 73a8b78a
  Committed by Yu Yang on Oct 16, 2017
```
They are public now
```
  73a8b78a
07 Oct, 2017 1 commit
- rename InferShapeContextBase to InferShapeContext · c0a34e1c
  Committed by qiaolongfei on Oct 07, 2017
  
  c0a34e1c
05 Oct, 2017 1 commit
- Changing SGD inputs and outputs to conform to Operator naming convention (#4586) · eed2c1e1
  Committed by Abhinav Arora on Oct 04, 2017
  
  eed2c1e1
04 Oct, 2017 1 commit
- Changing learning rate from type Input(float) to Input(tensor) (#4578) · 324876bb
  Committed by Abhinav Arora on Oct 03, 2017
  
  324876bb
03 Oct, 2017 1 commit
- Changing learning rate from attribute to input(float) (#4568) · 42e7fe05
  Committed by Abhinav Arora on Oct 02, 2017
```
* Changing learning rate from attribute to input(float)
* Removing obsolete code
```
  42e7fe05
27 Sep, 2017 1 commit

Refactoring InferShape (#3946) · 9a9d50a6

Committed by Qiao Longfei on Sep 26, 2017

* init Infershape

* add static InferShape interface

* refactor add-op infershape

* add AttrReader

* add all maker's infershape

* add all InferShape

* add python infer api

* add VarDesc interface

* add python VarDesc and OpDesc interface

* update python code

* use infershape function to do shape inference

* clean code

* do not use pointer

* refine code of op_proto_maker

* add get_dims to VarDesc

* refine the code

* remove the dependency from operator to op registry

* remove OpProtoAndCheckerMaker from operator

* restore complete_add_op

* add shape_infer_impl.h

* code optimization

* remove const return value

* add fake BlockDesc class

* optimize code

* remove infer function in op_info

* move InferShapeContextImpl to operator.h

* optimize the interface of InferShapeContextBase

* add temporary interface of new infershape

* change add_op, clip_op, conv2d_op and activation_op

* change all operators InferShape

* fix SetDim

* update cos_sim_op

* update crop_op

* update lookup_table_op

* allocate tensor when call GetDim in InferShapeContext

* update modified_huber_loss_op

* update rowwise_add_op

* update mean_op

* update sequence_avg_pool_op

* typo

* remove old InferShape interface

* can compile

* fix or unit test

* clean code

* clean code

* remove const before InferShapeContext

* change InferenceContextBase to pointer

* rename RunTime to Runtime, code clean

9a9d50a6

BaiXuePrincess / Paddle is up to date with the fork source project
