Commits · 203ac4f38e1e32e75a1977ca4e9c2ed78165fbc8 · 机器未来 / Paddle

19 Apr, 2021 1 commit

[NPU] cherry-pick gc/dataloader/save&load/optimization from ascendrc to develop (#32294) · cbe5c9f8

Committed by Leo Chen on Apr 19, 2021

* [NPU] support GarbageCollector for npu (#31874)

* support GarbageCollector for npu

* fix typo

* fix gather_grad

* disable NPUDefaultStreamGarbageCollector on NPU

* [NPU] support npu for memcpy op (#31808)

* support npu for memcpy op

* add ut

* fix ut

* fix typo

* 【NPU】fix bug of using temp vector (#31963)

* fix bug when beta1_pow on cpu (#31995)

* [NPU] support npu profiler (#31684)

* support npu profiler

* add python api

* fix bugs

* add wrapper for incomplete type

* update profile proto

* record npu wait

* add xpu placeholder

* fix adam (#32016)

* [NPU] enable async copy and  add wait before sync operation (#31956)

* enable async copy and  add wait before sync operation

* remove unnecessary wait

* add FillNpuTensorWithConstant

* refine

* fix fill_constant

* make TensorFromVector/TensorToVector sync

* [NPU] Support dataloader on npu place. (#31867)

* [NPU] Wait on NPUPlace (#32086)

* [NPU] fix cast op (#32121)

* fix npu kernel of cast op to handle casting to same dtype

* add comments

* [NPU] support cann 20.3 (#32044)

* fix compile problem on cann 20.3

* fix ut

* fix test_mul

* fix check_finite_and_scale

* fix lookup_table_v2_grad

* fix cmake

* support print op

* [NPU] Support npu save load (#31893)

* support save load for NPU

* add save load npu unittest

* support np.array transform in NPU

* fix errors

* delete dygraph in unittest

* add Wait

* fix unittest

* fix review comment

* fix unittest problem

* fix little problem

* change aclrtSynchronizeDevice to aclrtSynchronizeStream for better performance (#32196)

* change aclrtSynchronizeDevice to aclrtSynchronizeStream for better performance

* refine code

* fix NPUDeviceContext in all c++ unittest (#32198)

* fix NPUDeviceContext in all c++ unittest

* refine log
Co-authored-by: pangyoki <pangyoki@126.com>

* [NPU] Remove TensorFromVector and avoid sync copy in npu op kernel for better performance (#31994)

* enable async copy and  add wait before sync operation

* remove unnecessary wait

* add FillNpuTensorWithConstant

* refine

* fix fill_constant

* change TensorFromVector to FillNpuTensorWithConstant

* fix ignored api

* delete extra unittest

* fix little error

* fix update_loss_scaling_op_npu and check_finite_and_unscale_op_npu

* change TensorCopySync to TensorCopy

* delete useless Wait and add StreamWait

* fix npu_stream error

* fix check_finite_and_unscale_op_npu TensorCopy

* only save stream wait

* fix NPUDeviceContext in all c++ unittest

* delete wait
Co-authored-by: zhiqiu <chenqiuliang@baidu.com>

* delete useless unittest file (#32206)

* Fix op test (#32231)

* fix conditional block (#32243)

* fix adam bug again (#32246)

* fix compile

* fix ut

* fix ut
Co-authored-by: liym27 <33742067+liym27@users.noreply.github.com>
Co-authored-by: pangyoki <pangyoki@126.com>

cbe5c9f8

15 Apr, 2021 1 commit

【NPU】Cherry-pick ascendrc ops code by 0325 to develop (#32197) · e6bc358d

Committed by zhang wenhui on Apr 15, 2021

* merge 31065

* Fix typo of selected_npus (#31230)

* merge 31249

* [NPU] Support npu op pow and pow grad (#31247)

* [NPU] Support npu op: (1) pow (2) pow_grad

* Support fp16

* Fix pow npu fp16 test (#31256)

* support list of list attribute for NPU (#31299)

* support list of list attribute for NPU

* fix compile problem

* fix reference

* [NPU] Support npu op: (1) slice (2) slice_grad (#31275)

* fix reading flags from env (#31329)

* merge 31347

* [NPU] Support npu op layer_norm and layer_norm_grad (#31310)

* init commit, add layer_norm npu kernel

* fix typo

* add unittest

* add unittest

* fix bug

* fix bug

* refine ut

* [NPU] add npu kernel for equal op (#31393)

* add npu kernel for equal op

* refine code

* add more ut

* update year

* [NPU] Support npu kernel for shape op  (#31427)

* add shape npu

* fix

* fix

* fix endif (#31431)

* Fix pow, use fillD instead of broadcast (#31433)

* Fix pow, refine code (#31440)

* fix cmake of cryptopp to avoid downloading every time (#31451)

* [NPU] squeeze and unsqueeze op for ascend (#31452)
Co-authored-by: root <xiayanming@baidu.com>

* Support npu kernel for gather op (#31458)

* add gather npu op

* code review done

* update python new line

* precommit

* fix review

* del commit

* 【NPU】add scale op for npu (#31499)

* add scale npu

* fix

* fix

* Support TensorFromVector, TensorToVector of bool type (#31518)

* support TensorFromVector, TensorToVector of bool type

* add ut

* fix compile problem

* 【NPU】support npu kernel for fill_constant op (#31521)

* add fill_constant npu

* add fill_constant npu

* fix

* cherry-pick 31422, solve conflict

* 【NPU】Support npu kernel for matmul op (#31544)

* add matmulv2_npu

* add matmul

* add matmul

* [NPU] Support npu op elementwise_mul and elementwise_mul_grad (#31571)

* [NPU] Support npu op elementwise_max (#31574)

* 【NPU】add relu op for  npu (#31515)

* add relu npu

* fixed

* fix

* 【NPU】Support npu kernel for reshape2 op (#31524)

* add reshape2 npu

* add reshape2

* [NPU] Support npu kernel for gather op fix bug (#31541)

* add gather npu op

* code review done

* update python new line

* precommit

* fix review

* del commit

* update gather_grad

* fix bug

* fix bug

* [NPU] Support npu kernel for amp_check_finite_and_unscale_npu op (#31457)

* Support npu kernel for amp_check_finite_and_unscale_npu op

* support EnforceNotMet exception

* fix exception bug

* modify python unittest

* precommit

* update c++ unittest

* fix review

* fix review

* [NPU] accuracy op (#31492)

* accuracy op

* fix license

* fix

* add test and fix bug

* [NPU] add Assign OP (#31561)

* add assign op

* add test assign npu test

* dele if def
Co-authored-by: oyjxer <1728722986@qq.com>

* [NPU] fix npu op elementwise_mul_grad (#31592)

* 【NPU】Support npu op gelu and gelu_grad (#31530)

* Support npu op gelu and gelu_grad

* Support npu op gelu and gelu_grad

* [NPU] fix assign cmake (#31595)

* fix gather_grad bug (#31607)

* [NPU] add range op (#31560)

* add range op

* fix codestyle; call GetSize directly
Co-authored-by: oyjxer <1728722986@qq.com>

* 【NPU】Support npu op elementwise_div and elementwise_div_grad (#31573)

* Support npu op elementwise_div and elementwise_div_grad

* Support npu op elementwise_div and elementwise_div_grad

* Support npu op elementwise_div and elementwise_div_grad

* [NPU] Support npu op log, log_grad, sqrt, sqrt_grad, square, tanh and tanh_grad (#31600)

* [NPU] Support npu op logicalnot_op (#31534)

* [NPU] Support npu op elementwise_min (#31575)

* [NPU] Support npu op elementwise_pow (#31576)

* [NPU] Support npu op table_lookup_v2 and table_lookup_v2_grad (#31399)

* [npu] support npu kernel `table_lookup_v2`

* clean up

* +python test

* +cmake

* clean up

* remove int8 kernel
+ python unittest for fp16

* clean up

* [NPU] support npu kernel for `less_than` (#31327)

* [npu] support npu kernel for `less than`

* remove int* kernel

* cleanup

* [NPU] Support npu kernel scatter op (#31624)

* Support npu kernel scatter op

* Add more test

* [NPU] fix allocator min chunk size (#31632)

* [NPU] Support NPU kernel cast op (#31635)
Co-authored-by: frankwhzhang <frankwhzhang@126.com>

* [NPU] add npu kernel for sgd (#31639)

* 【NPU】Support NPU kernel for reduce_sum op v2 (#31620)

* add reduce_sum

* fix broadcastd

* fix test

* fix

* add unsqueeze in reduce_sum

* add template

* add unittest for keep_dim

* test reduce_all
Co-authored-by: frankwhzhang <frankwhzhang@126.com>

* [NPU] add npu kernel for adam (#31644)

* add npu kernel for adam

* refine code

* disable test

* modify atol

* 【NPU】Support npu kernel for mul op (#31584)

* add mul

* add test mul

* [NPU] add npu kernel for softmax_with_cross_entropy (#31656)

* init

* fix bugs

* [NPU] add npu kernel for mean Op (#31562)

* update mean op

* update mean op

* give a better test activation
Co-authored-by: oyjxer <1728722986@qq.com>

* Revert "[NPU] add npu kernel for mean Op (#31562)" (#31665)

This reverts commit 468ac699.

* 【NPU】Add TensorCopy to NPU kernel for reduce_sum op  (#31667)

* update unittest

* add TensorCopy in npu grad kernel

* [NPU] Support npu op `expand` (#31405)

* [npu] support npu kernel  for `expand`

* [NPU] fix shape of dx in mul_grad (#31675)

* fix shape of dx

* refine code

* [NPU] add Increment op (#31563)

* add increment

* fix

* update test increment op inplace

* update increment op

* increment b = 2
Co-authored-by: oyjxer <1728722986@qq.com>

* [NPU] add NPU add topk  (#31596)

* add topk op

* add cmake

* update topk npu op

* refactor func

* fix test not go npu TopKD bug

* NPUPlace(4) to NPUPlace(0)

* update comment
Co-authored-by: oyjxer <1728722986@qq.com>

* [NPU] Support NPU kernel sum op (#31671)

* [NPU] npu support `transpose` (#31486)

* cherry-pick 31564, solve conflict

* [NPU] Fix bug: Fix calculation errors of pow grad npu kernel (#31699)

* [NPU] Support testing grad of NPU ops in OpTest (#31697)

* [NPU] Support NPU kernel of stack op (#31711)

* [NPU] Remove redundant ctest of top_k_op_npu_test (#31718)

* [NPU] fix reshape npu op kernel (#31726)

* rename npu op file

* fix reshape

* [NPU] change transpose to transpose2 (#31734)

* change transpose to transpose2

* fix bug

* [NPU] Support  mean npu kernel (#31729)

* [NPU] fix some bugs of npu op (#31739)

* fix softmax

* fix mean

* fix lookup_table_v2

* 【NPU】Fix npu kernel elementwise_div_grad  (#31753)

* [NPU] fix the grad kernel diff bug of gather op (#31757)

* fix gather grad kernel diff

* fix gather grad kernel diff

* fix gather review bug

* 【NPU】Fix reshape test & add grad test (#31776)

* fix

* fix

* [NPU] support fp16 for npu accuracy op (#31797)

* [NPU] support list of tensor input (#31801)

* support list of tensor as npu input

* add comment

* fix typo

* fix typo

* [NPU] add npu kernel for concat op (#31695)

* add npu kernel for concat op

* add npu kernel for concat op

* refine code

* update

* refine concat_grad

* [NPU] Support npu kernel for op elementwise_floordiv (#31822)

* [NPU] fix bug of lookup_table_v2_grad (#31834)

* [NPU] support default stream (#31510)

* [NPU] support mixed precision input for npu layer norm (#31847)

* support mixed precision input for npu layer norm

* fix layer_norm npu kernel
Co-authored-by: zhiqiu <chenqiuliang@baidu.com>

* 【NPU】Support npu kernel for update_loss_scaling op (#31830)

* add update_loss_scaling_npu NPU kernel

* change TensorFromVec to Memset

* fix compile problem (#31850)

* [NPU] support npu for conditional_block op (#31854)

* 【NPU】Add int dtype kernel for reshape2 op (#31864)

* fix

* fix

* [NPU] fix some op bugs (#31855)

* fix some op bugs

* fix some bugs

* follow comments

* fix log level

* add ut

* [NPU] support fp16 of input for api pow (#31871)

* [NPU] add npu kernel for truncated_gaussian_random op (#31654)

* init

* add todo

* add npu kernel for truncated_gaussian_random

* add sync

* fix concat_grad

* fix typo

* fix compile

* fix compile

* fix compile

* fix compile

* fix compile

* fix compile

* fix code style

* fix code style

* fix code

* Fix op test (#32231)

* fix conditional block (#32243)

* fix style code
Co-authored-by: xiayanming <41795079@qq.com>
Co-authored-by: Leo Chen <chenqiuliang@baidu.com>
Co-authored-by: liym27 <33742067+liym27@users.noreply.github.com>
Co-authored-by: Reventon_L <luyuxiang1994@qq.com>
Co-authored-by: root <xiayanming@baidu.com>
Co-authored-by: oyjxer <1728722986@qq.com>
Co-authored-by: yinhaofeng <66763551+yinhaofeng@users.noreply.github.com>
Co-authored-by: OleNet <olenet@126.com>
Co-authored-by: Meiyim <chen_xuyi@outlook.com>
Co-authored-by: oyxuan-11 <963650125@qq.com>
Co-authored-by: pangyoki <pangyoki@126.com>

e6bc358d

14 Apr, 2021 1 commit

adds new CPU kernel for SGD op supporting BF16 data type (#32162) · 3ac6c189

Committed by Adam Osewski on Apr 14, 2021

* Initial draft for SGD BF16 kernel.

* Unit tests for SGD with BF16 data type.

* Add VLOG message to SGD BF16 op CPU kernel.

* Enhance error messages and error types.

* Refactor SGD op kernels to leverage some common code.

* Make it easier to add new kernel invoke code.

* Fix SGD op kernel for sparse grad.

* Unify quotes style.

* Fix error for ROCM compilation.

* Use specialized PADDLE_ENFORCE_xx functions.

3ac6c189

31 Mar, 2021 1 commit
- fix some bug in transformer training in xpu (#31918) · 52b05bac
  Committed by taixiurong on Mar 31, 2021
  
  52b05bac
02 Mar, 2021 1 commit

lamb_op_xpu;test=kunlun (#31012) · d79fdc3d

Committed by Gradie on Mar 02, 2021

* lamb_op_xpu;test=kunlun

* modify lamb_op_xpu.cc;test=kunlun

* delete atol lamb_op_xpu; test=kunlun

* update xpu.cmake;test=kunlun

* test_error 1e-5,lamb_op_xpu;test=kunlun

* error1e-5,lamb_op_xpu,test=kunlun

* delete atol lamb_xpu;test=kunlun

* modify atol,lamb_op_xpu;test=kunlun

* lamb_op_xpu;test=kunlun

* lamb_op_xpu;test=kunlun

* lamb_op_xpu, XPUOptest;test=kunlun

* lamb_op_xpu;test=kunlun

* lamb_op_xpu;test=kunlun

* lamb_op_xpu;test=kunlun

* lamb_op_xpu;test=kunlun

* lamb_op_xpu;test=kunlun

* lamb_op_xpu;test=kunlun

* lamb_op_xpu;test=kunlun

* lamb_op_xpu;test=kunlun

* lamb_op_xpu;test=kunlun

* lamb_op_xpu;test=kunlun

* lamb_op_xpu;test=kunlun

* lamb_op_xpu,modify xpu_cmake; test=kunlun

* lamb_op_xpu;test=kunlun

* lamb_op_xpu,modify xpucmake;test=kunlun

d79fdc3d

19 Jan, 2021 1 commit
- add rmsprop_op_xpu test=kunlun (#30493) · 549855ac
  Committed by ykkk2333 on Jan 19, 2021
```
* add rmsprop_op_xpu test=kunlun

* modified rmsprop_op_xpu error code. test=kunlun
```
  549855ac
17 Jan, 2021 1 commit
- Modify the calculation logic of LambOptimizer (#29313) · 11e78eba
  Committed by guofei on Jan 17, 2021
```
* Modify the calculation logic of LambOptimizer
```
  11e78eba
12 Jan, 2021 1 commit
- fix header file paths of gflags, commit 3, test=develop (#30273) · efa54629
  Committed by 石晓伟 on Jan 12, 2021
  
  efa54629
09 Jan, 2021 1 commit
- enhance error message, test=develop (#30220) · 5932fee6
  Committed by zhang wenhui on Jan 09, 2021
  
  5932fee6
08 Jan, 2021 1 commit

Support pure fp16 training for AMP API. (#29544) · 7f7dfccf

Committed by Zhen Wang on Jan 08, 2021

* add cast ops before and after unsupported fp16 ops.

* Keep partial net in FP32 pattern.

* Support check_finite_and_unscale and update_loss_scaling for FP16 calculation mode.

* Add fp16 support for adam op.

* add multi precision attr for adam.

* Fix the bug of test_multi_precision_fp16_train UT.

* Code format for CI.

* Fix the redefine error about MPTypeTrait on windows.

* fix bugs of the _create_accumulators func in Momentum.

* fix bug when inserting post cast op.

* Add the update_loss_scaling op in allow_set of UnusedVarCheck.

* Update for ci coverage.

* Add some doc for OptimizerWithMixedPrecision.

* Fix the code style.

* Improve the doc of `amp_init`.

* Change for fp16 testing if users have the infer program defined in separate way.

7f7dfccf

30 Dec, 2020 1 commit
- fix momentum op register (#29941) · 4cbcc9b6
  Committed by Chengmo on Dec 30, 2020
```
* fix momentum op register
```
  4cbcc9b6
21 Dec, 2020 1 commit

Optimize compilation time with Unity Build (#29733) · 2e5b4a21

Committed by LoveAn on Dec 21, 2020

* Test compilation time with less parallel count, notest, test=windows_ci

* optimize rules of Unity Build, notest, test=windows_ci, test=windows_op

* limit parallel counts used only on GPU, test=develop

* remove limit of argument /m:8 on Windows, test=develop

2e5b4a21

07 Dec, 2020 1 commit

Compiling operator libraries with Unity build (#29130) · 671555ed

Committed by LoveAn on Dec 07, 2020

* Compiling operator libraries with Unity Build on Windows CPU.

* Compiling operator libraries with Unity Build on Windows GPU, no_test, test=windows_ci

* Add option in windows ci script, no_test, test=windows_ci

* Optimize parallel compiling, test=develop

* remove limit of parallel compile and skip some ops in UB, test=develop

* remove changes of header file, test=develop

* remove changes of header file, test=develop

* fix test_eye_op unittest failed, test=develop

* Compiling operator libraries with Unity Build on Linux, test=develop

* set default WITH_UNITY_BUILD=OFF, test=develop

* Move unity build rules into a single file and add comment, test=develop

* optimize parallel compilation, test=develop

* fix undefined reference error on coverage ci, test=develop

671555ed

02 Dec, 2020 2 commits

Remove some useless log. (#29300) · 9b59a589

Committed by Zhen Wang on Dec 02, 2020

9b59a589

Add pure fp16 training with master weights. (#27712) · be3777a5

Committed by Zhen Wang on Dec 02, 2020

* add the weight decay func for the momentum op

* Add the multi_precision function in Momentum Optimizer.

* Make sure that the initial value of master weights are same with the fp16 weights.

* add static loss scaling.

* add the rescale_grad function in the pure fp16 training.

* use the original momentum updating method.

* Polish some codes, such as variable names.

* add docstring for apis.

* update the var creation details of _create_master_weight.

* not modify codes about imperative momentum updating.

* Fix the error of test_dist_sparse_tensor_load_momentum UT.

* add unit test for multi precision fp16 training.

* add more unit tests for CI.

* Use lower threshold values for allclose comparing in test_multi_precision_fp16_train UT.

* For CI Coverage Checking.
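As a rough illustration of why pure fp16 training keeps full-precision master weights, the sketch below runs a momentum update twice: once directly on an fp16 weight, and once on a master copy that is cast down to fp16 after each step. It is a standard-library toy, not Paddle's kernel. The names `to_fp16`, `step_fp16_only`, `step_with_master`, and `run` are hypothetical, the hyperparameters are arbitrary, and Python's double-precision float stands in for the fp32 master weight.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def step_fp16_only(w16, velocity, grad, lr, mu):
    """Momentum step whose updated weight is stored straight back into fp16."""
    velocity = mu * velocity + grad
    return to_fp16(w16 - lr * velocity), velocity


def step_with_master(master_w, velocity, grad, lr, mu):
    """Momentum step on the master weight; the fp16 weight is a cast-down copy."""
    velocity = mu * velocity + grad
    master_w = master_w - lr * velocity
    return master_w, to_fp16(master_w), velocity


def run(steps=10, lr=1e-4, mu=0.0, grad=1.0):
    w16_only = to_fp16(1.0)
    master_w = w16_only  # the master weight starts equal to the fp16 weight
    w16_from_master = w16_only
    v_a = v_b = 0.0
    for _ in range(steps):
        w16_only, v_a = step_fp16_only(w16_only, v_a, grad, lr, mu)
        master_w, w16_from_master, v_b = step_with_master(
            master_w, v_b, grad, lr, mu)
    return w16_only, master_w, w16_from_master


w16_only, master_w, w16_from_master = run()
print(w16_only, master_w, w16_from_master)
```

With these assumed values, each update of 1e-4 is smaller than half the fp16 spacing just below 1.0 (2^-11, about 4.9e-4), so the fp16-only weight never moves, while the master weight accumulates the updates and its fp16 copy eventually changes.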

be3777a5

23 Nov, 2020 1 commit
- refactor momentum op to combine weight (#27414) · 8ff35506
  Committed by furnace on Nov 23, 2020
```
* refactor momentum op to combine weight_decay (scale op and sum op)
```
  8ff35506
06 Nov, 2020 1 commit
- fix crash in adam in xpu, *test=kunlun (#28433) · fad4744a
  Committed by taixiurong on Nov 06, 2020
  
  fad4744a
19 Oct, 2020 2 commits

xpu adam op (#28031) · 6f0c3d1f

Committed by yinhaofeng on Oct 19, 2020

* lookup_table_xpu op report errors;test=kunlun

* add adam xpu op;test=kunlun

* reset lookup

* change adam wrong;test=kunlun

6f0c3d1f

Fix xpu error message (#28061) · 5f04875c

Committed by Chengmo on Oct 19, 2020
```
* fix error message,test=kunlun

* fix, test=kunlun
```
5f04875c

14 Oct, 2020 2 commits
- Fix adam (#27778) · 263a9e97
  Committed by MRXLT on Oct 14, 2020
```
* fix adam

* fix gpu adam

* fix code style

* fix ut

* update ut add cuda code
```
  263a9e97
- Polish some error message in operators (#27876) · 4ba977c7
  Committed by Chen Weihang on Oct 14, 2020
```
* polish some error message

* add white list

* revert shell script change
```
  4ba977c7
13 Oct, 2020 1 commit
- add xpu sgd & momentum (#27728) · 1607e87c
  Committed by Chengmo on Oct 13, 2020
```
* add xpu sgd & momentum
```
  1607e87c
27 Sep, 2020 1 commit
- fix error message (#27318) · d014e29f
  Committed by Chengmo on Sep 27, 2020
```
* fix sgd/momentum/dpsgd/rmsprop error message
```
  d014e29f
22 Sep, 2020 1 commit
- Enhance Op's Error Message (#27455) · a0452475
  Committed by 123malin on Sep 22, 2020
```
* test=develop, update error message
```
  a0452475
21 Sep, 2020 1 commit
- fix adam (#27343) · f936adbd
  Committed by MRXLT on Sep 21, 2020
```
* fix adam

* rmsprop support double
```
  f936adbd
09 Sep, 2020 1 commit
- modified the implement of Lars optimizer (#26733) · 5d039f40
  Committed by JZ-LIANG on Sep 09, 2020
```
add lars to fleet meta optimizer
```
  5d039f40
29 Aug, 2020 1 commit

Adadelta Optimizer (#26590) · a1b99fae

Committed by Jiawei Wang on Aug 29, 2020

* add doc; notest

* fix doc; notest

* update doc; notest

* refine optimizer && adam

* refine optimizer; notest

* add adam

* fix doc

* fix doc && add adamw; notest

* add error message

* bug fix

* refine rmsprop && adamax

* fix ci

* bug fix

* update comment

* unify arguments place; notest

* fix ut, test=develop

* bug fix

* fix conflicts, test=develop

* add examples code

* bug fix

* fix comments

* fix sample code

* add sample code for Optimizer

* add adamax ut, test=develop

* fix rmsprop ut, test=develop

* add ut for optimizer.py and adamw.py

* first commit of adadelta optimizer

* fix learning rate

* fix adadelta doc and add sgd momentum

* remove unused fluid

* fix codestyle

* Update test_adam_op.py

* Update test_adam_op.py

* fix SGD in 2 unittests

* fix SGD in 2 unittests

* fix ci

* fix ut
Co-authored-by: MRXLT <xlt2024@gmail.com>
Co-authored-by: mapingshuo <mps2012@yeah.net>

a1b99fae

28 Aug, 2020 1 commit
- modify error report message, test=develop (#26743) · 5f524efe
  Committed by lilong12 on Aug 28, 2020
  
  5f524efe
11 Jul, 2020 1 commit

Fix index overflow bug of the CUDA kernel loop increment (#25435) · 0b54d54f

Committed by Chen Weihang on Jul 11, 2020

* fix softmax_with_cross_entropy cuda kernel overflow bug, test=develop

* replace old macro & for condition, test=develop

* polish details, test=develop

0b54d54f

03 Jun, 2020 1 commit
- FTRL with sparse update, test=develop (#22092) · a6beb96d
  Committed by leesusu on Jun 03, 2020
  
  a6beb96d
13 May, 2020 3 commits
- Enhance error message of prefetch_op, proximal_adagrad_op, proximal_gd_op (#24436) · f1c57d64
  Committed by gongweibao on May 13, 2020
  
  f1c57d64
- update error message for unstack op and lamb op; test=develop (#24439) · 71ff32b6
  Committed by MRXLT on May 13, 2020
  
  71ff32b6
- enhance cvm bpr_loss adam adagrad adamax ftrl error message, test=develop (#24452) · 621a4085
  Committed by zhang wenhui on May 13, 2020
  
  621a4085
26 Apr, 2020 1 commit

improve efficiency of runtime InferVarType (#22778) · 9a93f6aa

Committed by liuwei1031 on Apr 26, 2020

* save InferVarType changes, test=develop

* remove code comments, test=develop

* tweak code, test=develop

* fix compilation warning, update merge_ids_op split_ids_op to new interface, test=develop

* modify fused_bn_activation_op, test=develop

* fix error of fused_bn_activation_op, test=develop

* fix PADDLE_ENFORCE and unittest coverage issue, test=develop

* tweak PADDLE_ENFORCE messages, test=develop

* improve unittest coverage, test=develop

* add StaticGraphInferVarType class, test=develop

* rebase develop branch, test=develop

* fix unittest error, test=develop

* remove comments, test=develop

* improve unittest coverage, test=develop

* improve error message and improve unittest coverage, test=develop

* upgrade InferVarType API, test=develop

* tweak pyfunc error message, test=develop

* fix compilation conflict - save_combine_op, test=develop

9a93f6aa

07 Apr, 2020 1 commit
- Tensor value support (#23491) · 29c4fae1
  Committed by wangchaochaohu on Apr 07, 2020
```
* add support for value tensor support of fill_constant Op
```
  29c4fae1
04 Apr, 2020 1 commit

Delete Ref & VectorRef and add GetDataSafely (#22997) · 16315d3d

Committed by Chen Weihang on Apr 04, 2020

* delete invalid check interface Ref & VectorRef, test=develop

* fix vector ref delete error, test=develop

* try the new check interface, test=develop

* change all related code with new check macro, test=develop

* remove static assert, test=develop

* polish detail, test=develop

* skip coverage problem, test=develop

* add new check macro, test=develop

16315d3d

27 Feb, 2020 1 commit

Refine adam op to improve performance, test=develop (#22346) · 72dde4ab

Committed by zhaoyuchen2018 on Feb 27, 2020

* Refine adam op, test=develop

* Fuse kernels together to reduce cpu time.

* Refine paddle enforce, test=develop

* Remove some comments, test=develop

* Refine code,test=develop

* Refine cuda kernel, test=develop

* Refine code according to comments, test=develop

72dde4ab

09 Jan, 2020 1 commit

test Optimizer in dygraph (#21949) · d0f0a252

Committed by zhongpu on Jan 09, 2020

* test Optimizer in dygraph, test=develop

* add optest for Optimizer in dygraph, test=develop

* fix adagrad optimizer, test=develop

* fix dpsgd optimizer, test=develop

* fix test_optimizer.py, test=develop

* fix dpsgd optimizer, this op only support cpu, test=develop

* add optest for optimizer, test=develop

* add description for dpsgd, test=develop

* add rmsprop to white_list in unused_var_check.cc, test=develop

* polish code style, test=develop

* polish code style, test=develop

* delete seed attribute for DpsgdOptimizer, test=develop

* change testing to debugging, test=develop

d0f0a252

24 Dec, 2019 1 commit

Optimize adam speed (#21777) · 51a86d2b

Committed by Aurelius84 on Dec 24, 2019

* optimize adam speed by removing _finish_update test=develop

* fix SparseAdamFunctor param list test=develop

* Remove scale_op in expect_list of adam_op test=develop

* fix test optimizer loss assert error test=develop

* fix test optimizer loss assert error test=develop

* modify PADDLE_ENFORCE usage test=develop

* fix op_type in lamb_op.cc test=develop

* fix errors ostream format bug test=develop

* add betaPowOut in ngraph op test=develop

* fix ngraph::op api for gcc8 test=develop

* clean code test=develop

* modify struct into class test=develop

* remove code of beta1Tensor in lamb_op test=develop

51a86d2b

06 Dec, 2019 1 commit

Add Much Complex Test and Fix Bugs for Control Flow cond API (#21532) · 1dcf6a72

Committed by Huihuang Zheng on Dec 06, 2019

Add tests that use dy/dx to check that the gradient values calculated by the control flow backward pass are correct. Also fix the bugs detected by those tests.

Fix bugs:

1. Unlike sum_op, optimizer ops don't allow uninitialized input tensors. But in conditional_block_grad_op, since the conditional_block may not run, the output gradient tensor may be uninitialized, which causes an error in the optimizer op. To fix this, we should either let optimizer ops support uninitialized inputs like sum_op does, or assign 0 to the uninitialized gradient when the conditional_block_grad_op doesn't run. I found that there are 10+ optimizer ops. **To keep it simple, I just assign 0 to the output gradient of the conditional_block_grad_op in this PR**. It can be explored further whether optimizer ops can support uninitialized input tensors like sum_op does, because in theory skipping the assignment in conditional_block_grad_op would be faster.

2. Infer parameter shapes during append_backward. I didn't know that all our parameters are in the global block. When op_desc infers shapes in a sub-block, it may not know the shapes of the gradients of parameters whose shape information is in the global block. I fixed it by inferring the shapes of gradients from the forward vars.
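The workaround in item 1 can be sketched in a few lines of plain Python. This is only an illustration under assumptions, not Paddle's C++ implementation: `conditional_block_grad` and `sgd_step` are hypothetical stand-ins, lists stand in for tensors, and `None` stands in for an uninitialized gradient tensor.

```python
from typing import List, Optional


def sgd_step(param: List[float], grad: Optional[List[float]],
             lr: float) -> List[float]:
    """A strict optimizer step: it refuses an uninitialized gradient."""
    if grad is None:
        raise ValueError("gradient is uninitialized")
    if len(grad) != len(param):
        raise ValueError("gradient shape does not match parameter shape")
    return [p - lr * g for p, g in zip(param, grad)]


def conditional_block_grad(branch_ran: bool, upstream_grad: List[float],
                           param: List[float]) -> List[float]:
    """Return the gradient flowing out of the conditional block.

    If the block did not run in the forward pass, write zeros shaped like the
    forward variable instead of leaving the gradient unset.
    """
    if not branch_ran:
        return [0.0] * len(param)
    return list(upstream_grad)


param = [1.0, 2.0]
# Block skipped: the gradient is all zeros, so the parameter is unchanged.
skipped = sgd_step(param, conditional_block_grad(False, [1.0, 1.0], param), lr=0.5)
# Block executed: the upstream gradient is passed through to the optimizer.
executed = sgd_step(param, conditional_block_grad(True, [1.0, 1.0], param), lr=0.5)
print(skipped, executed)
```

The zero fill costs an extra write when the block is skipped, which is the overhead the commit message suggests could be avoided if optimizer ops accepted uninitialized inputs directly.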

This PR also does some code cleanup:
1. Print the var name when sgd_op catches a shape error so that it is easier to debug
2. Fix a typo: dicta -> dict

1dcf6a72

机器未来 / Paddle is in sync with the fork's source project
