- 25 Nov, 2021 1 commit
-
-
Committed by pangyoki
Cherry-pick PR #37420: fix inplace bug when the first grad_var (loss_grad) is an inplace var (#37420) (#37488). Fix inplace bug; cherry-pick of PR #37420.
-
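- 16 Nov, 2021 1 commit
-
-
Committed by zhangkaihuo
Fixed several problems found while fine-tuning fused_transformer_encoder_layer:
* add support for attn_mask=None in fused_attention_op: PR
* pre_layer_norm handling problem: PR
* parameter handling and incorrect-computation problem: PR
* add_bias incorrect-computation problem: PR
* add pure fp16 support: PR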
-
- 27 Oct, 2021 1 commit
-
-
Committed by zhangkaihuo
This PR contains the layer-level code for fused_transformer, covering both the FusedFeedForward layer and the FusedTransformerEncoderLayer.
-
- 26 Oct, 2021 4 commits
-
-
Committed by Steffy-zxf
* Add FasterTokenizer Operator (#34491)
Add tokenizer-related functionality for Transformer models so that text processing is consistent between training and prediction.
* support a text string as an input Tensor
* support the "VOCAB" unordered_map<wstring, int> as an input Tensor to look up tokens
* Tokenizer used for BERT. This tokenizer applies end-to-end tokenization from text string to wordpiece.
* It first applies basic tokenization, followed by wordpiece tokenization.
* optimize fast tokenizer
* remove const_cast
Co-authored-by: zhoushunjie <zhoushunjie@baidu.com>
Co-authored-by: wawltor <fangzeyang0904@hotmail.com>
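The two-stage pipeline this commit describes (basic tokenization, then wordpiece tokenization against a vocabulary lookup) can be illustrated with a minimal pure-Python sketch. It is not the FasterTokenizer operator itself: the function names, the tiny `VOCAB` dictionary, the `##` continuation prefix, and the `[UNK]` fallback are assumptions chosen to mirror the usual BERT convention.

```python
import re

def basic_tokenize(text):
    """Lowercase, then split into runs of word characters and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text.lower())

def wordpiece_tokenize(word, vocab, unk_token="[UNK]"):
    """Greedy longest-match-first split of one word into vocabulary pieces."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        piece = None
        # Shrink the candidate from the right until it is found in the vocabulary.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # hypothetical continuation marker
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            # No prefix matched: the whole word maps to the unknown token.
            return [unk_token]
        pieces.append(piece)
        start = end
    return pieces

def tokenize(text, vocab):
    """Text string -> list of token ids (basic tokenization, then wordpiece)."""
    ids = []
    for word in basic_tokenize(text):
        for piece in wordpiece_tokenize(word, vocab):
            ids.append(vocab[piece])
    return ids

# Hypothetical toy vocabulary, for illustration only.
VOCAB = {"[UNK]": 0, "un": 1, "##aff": 2, "##able": 3, "hello": 4, ",": 5}
```

With this toy vocabulary, `tokenize("Hello, unaffable", VOCAB)` splits the text into `hello`, `,`, `un`, `##aff`, `##able` and looks up each piece's id.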
-
Committed by xiongkun
Support variable-length SelectedRows in GLOO::AllGather (#36637). For CPU parallel training using gloo, add variable-length support for SelectedRows.
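One common way to all-gather buffers of differing lengths is to exchange the lengths first, pad every buffer to the maximum, run a fixed-size gather, and then trim the padding. The sketch below simulates that in a single process with plain lists. It is an assumption about the general technique, not a description of what #36637 actually implements, and `all_gather_variable_length` is a hypothetical name.

```python
def all_gather_variable_length(per_rank_rows, pad_value=0):
    """Simulate an all-gather where each rank contributes a different number of rows.

    per_rank_rows: list indexed by rank, each entry a list of row values.
    Returns the concatenation of all ranks' rows, in rank order.
    """
    # Step 1: every rank shares its row count (itself a fixed-size gather).
    lengths = [len(rows) for rows in per_rank_rows]
    max_len = max(lengths, default=0)

    # Step 2: pad each buffer to a common length so a fixed-size gather applies.
    padded = [rows + [pad_value] * (max_len - len(rows)) for rows in per_rank_rows]

    # Step 3: the fixed-size gather, simulated here as concatenation in rank order.
    gathered = [value for buf in padded for value in buf]

    # Step 4: use the exchanged lengths to drop the padding again.
    result = []
    for rank, n in enumerate(lengths):
        offset = rank * max_len
        result.extend(gathered[offset:offset + n])
    return result
```

For example, four ranks holding `[1, 2, 3]`, `[4]`, `[]`, and `[5, 6]` would each end up with the six real rows in rank order, with no padding values left in the result.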
-
Committed by Leo Chen
* refine amp level
* fix typo
* update tracer._amp_level
-
Committed by xiongkun
[cherry-pick] Support CPU Parallel in DataParallel Interface by GLOO to speed up training (#35745) (#36605)
* User specified backend (#35745)
* remove tensordot
-
- 18 Sep, 2021 1 commit
-
-
Committed by Zeng Jinle
-
- 17 Sep, 2021 2 commits
-
-
Committed by zhangbo9674
* add pure fp16 major function in auto_cast & tracer
* support master weight in dygraph for pure fp16
* check mixed fp16 & fp32 dtypes for check_finite_and_unscale op
* change pure fp16 function name
* fix some bugs in auto_cast
* refine auto_cast interface logic
* add param _casted_by_pure_fp16 for class Layer
* support state_dict hook for saving the model with a user-specified dtype in pure_fp16_decorator
* refine pure_fp16_decorator as decorator
* add unittest
* add comment
* add comment
* support recompute
* add comment for auto_cast and decorator
* support to_static_state_dict for paddle.jit.save
* remove the limit on the number of models and optimizers
* add lookup_table to black_list
* fix momentum and layer state_dict
* fix bug in layer state_dict
* fix bug in layer state_dict_helper
* refine unittest
* refine test_momentun_op
* refine interface and some code
* refine amp_decorator interface
* refine pure fp16 interface
* refine master weight interface
-
Committed by Zeng Jinle
* make flag setter easier
* update
* rename macro name
* fix bug of public/writable
* update to pass CI
* polish
* fix CPU link error
-
- 14 Sep, 2021 1 commit
-
-
Committed by Haohongxiang
* Add solutions for PyLayer, which is unsupported in DataParallel
* modify note format for parallel.py
* modify docs of dataparallel
* add docs of dp with pylayer
* modify docs format
* modify example format
* change example of dp with pylayer
* add unittest for dp with pylayer
* modify ut
* merge latest codes
* update
* modify for CI-Coverage
* modify text-indent
-
- 10 Sep, 2021 1 commit
-
-
Committed by ronnywang
-
- 08 Sep, 2021 2 commits
-
-
Committed by Leo Chen
* add backward inplace for dygraph
* fix bug
* support gradient accumulation
-
Committed by xiongkun
* can pass the fake test
* add files
* modify cmake to pass windows-ci
* for ci pass
* WITH_GLOO=ON
* for pass coverage test
* add cpuonly testcase
* add
* disable nccl when compiling with cuda
* change python version in cpuonly
* add backend argument
* add required gpu
* add required:gpu
-
- 01 Sep, 2021 1 commit
-
-
Committed by QingshuChen
* support KL label smooth
* update UT for KL label_smooth
-
- 24 Aug, 2021 1 commit
-
-
Committed by Haohongxiang
* Add no_sync in data parallel for dynamic graph
* modify UT of no_sync
* delete test_parallel_dygraph_dataparallel_no_sync.py
* add test_parallel_dygraph_no_sync.py
* modify run_trainer_with_spawn in UTs
* Add UT of complex control flow in no_sync
* add specific descriptions and notes for no_sync
* check code style
* modify UT's TIMEOUT in CMakeLists.txt
-
- 12 Aug, 2021 1 commit
-
-
Committed by zhouweiwei2014
-
- 06 Aug, 2021 1 commit
-
-
Committed by QingshuChen
* support kunlun black list and add kl1 op
* add device_context dependency to xpu_op_list
-
- 05 Aug, 2021 1 commit
-
-
Committed by Zeng Jinle
-
- 04 Aug, 2021 1 commit
-
-
Committed by chentianyu03
* fix backward bug
* format code style
* add test case for grad tensor accumulator
-
- 03 Aug, 2021 3 commits
-
-
Committed by WangXi
-
Committed by QingshuChen
* support Kunlun2
* support KL2
* support KL2
-
Committed by wanghuancoder
-
- 09 Jul, 2021 1 commit
-
-
Committed by Zeng Jinle
-
- 02 Jul, 2021 1 commit
-
-
Committed by houj04
-
- 30 Jun, 2021 1 commit
-
-
Committed by houj04
* support set_device for NPU.
* minor doc update and add more unit tests.
-
- 29 Jun, 2021 1 commit
-
-
Committed by taixiurong
-
- 24 Jun, 2021 1 commit
-
-
Committed by houj04
* in NPU environment, use CPUPlace for missing operators.
* in NPU environment, use CPUPlace for missing operators.
* fix TensorCopy bug and add unit test.
* fix code style.
* add more unit tests.
-
- 23 Jun, 2021 1 commit
-
-
Committed by wanghuancoder
* optimize attr default value, test=develop
* refine, test=develop
* refine, test=develop
* refine, test=develop
* fix bug in AttrReader, test=develop
* fix bug, test=develop
* fix double_grad, test=develop
* refine, test=develop
* refine, test=develop
* fix checker null, test=develop
* for test, test=develop
* refine, test=develop
* refine, test=develop
* refine, test=develop
* refine, test=develop
* refine, test=develop
* refine, test=develop
-
- 21 Jun, 2021 1 commit
-
-
Committed by cc
* Combine amp and qat
* add unit test
-
- 10 Jun, 2021 1 commit
-
-
Committed by Chen Weihang
* add nan/inf check for dygraph
* add unittest for dygraph
* revert erroneous change
-
- 08 Jun, 2021 1 commit
-
-
Committed by WeiXin
* replace 'InnerSetOverridedStopGradient' with 'SetOverridedStopGradient'.
* improve coverage.
* polish error message.
-
- 26 May, 2021 1 commit
-
-
Committed by chentianyu03
* modify matmul Op to use complex template types
* remove complex64/128 header files
-
- 12 May, 2021 1 commit
-
-
Committed by liym27
-
- 11 May, 2021 1 commit
-
-
Committed by ShenLiang
* fix find_unused_parameters default value
-
- 10 May, 2021 1 commit
-
-
Committed by Roc
-
- 01 May, 2021 1 commit
-
-
Committed by ShenLiang
-
- 30 Apr, 2021 2 commits
-
-
Committed by WeiXin
-
Committed by pangyoki
* add relu6_ hardsigmoid_ leaky_relu_ Inplace APIs
* add softmax_with_cross_entropy_ Inplace API
* add clip_ scale_ add_ subtract_ Inplace APIs
* add wlist
* fix parameter of scale api
* add add_n_ Inplace API and remove log_ Inplace API
* fix elementwise_add_ and elementwise_sub_ broadcast problem
* elementwise inplace APIs give an error message before running the op
* use broadcast_shape in elementwise inplace op
* add 8 auto-generated inplace APIs
* add unittest for all inplace apis
* add decorator for inplace apis in static mode
* fix Windows BLAS failure of exp inplace api; change array_equal to allclose
* add flatten inplace api
* add flatten unittest
* fix flatten unittest
* add decorator
* fix grad.numpy in test_pylayer_op
* drop support for softmax_with_cross_entropy_
* add test_inplace_softmax_with_cross_entropy to static_mode_white_list
* delete __all__ in inplace_utils
* delete activation inplace function and add Tensor.inplace_func
* change paddle.inplace_ to Tensor.inplace_
* fix minor problem
* add paddle in inplace_utils
-
- 29 Apr, 2021 1 commit
-
-
Committed by liuyuhui
-