- 24 Aug, 2021 1 commit
-
-
Committed by Yulong Ao
* add auto_parallel dir * mv to paddle.distributed * add shard_xx api * add distributed attrs for var * add ut, test=develop * add dist * update * update * update * update * update * update, test=develop * update, test=develop * update, test=develop * update, test=develop * update, test=develop * update, test=develop * update, test=develop * update * update * update * update * update * update, test=develop * update, test=develop * update * update * delete unused proto * restore op_desc * restore type_defs * update var_desc * remove dimss_mapping for proto_pybind * update interface.py * update framework.py * update * update * add auto_parallel dir * mv to paddle.distributed * add shard_xx api * add distributed attrs for var * add ut, test=develop * [WIP] Add the auto completion feature and related codes * [WIP] Improve the auto completion and related codes * [WIP] Make the auto completion to support data-parallel * [WIP] Make the completion support mp and dp+mp * [WIP] Refactor auto completion unit test for MLP * [WIP] Refactor the implementation of DistributedOperatorImpl * [WIP] Improve dims_mapping update rule and fix a bug * [WIP] Support auto completion for one transformer decoder layer * [WIP] Add a minor change * [WIP] Fix a bug within the unit test * Shard XShape tensor, add embedding completion and refactor code * Add the distributed_operators dir to setup.py.in * Improve the completion process and add the unittest for gpt * fix process_mesh ut * fix process_mesh ut * update * update, test=develop * Add support for automatically completing distributed attrs of special ops * update * update * update * fix doc sample codes, test=develop * improve coverage, test=develop * add static_mode check, test=develop * Model the cluster for cost model and physical mapping * update, test=develop * add set_placement, test=develop * Add the check to make sure the candidate tensors' size is greater than zero * update doc, test=develop * update doc, test=develop * update doc, test=develop * update doc, test=develop * update, test=develop * Auto mark dist attrs annotated by user * update ndarray to nested list, test=develop * update, test=develop * Add auto-completion module for auto-parallel (based on PR#33804) * Remove unnecessary files * Remove unrelated files for the auto completion pr * Update the unit test to improve the coverage * Modify codes based on reviews * Minor changes for CI * Improve some codes based on new comments * Fix bugs caused by shallow copy in attributes.py * Improve amend_distributed_attr_for_program in context.py * Other changes for weihang's comments Co-authored-by: sandyhouse <lilong12@baidu.com>
-
- 23 Aug, 2021 7 commits
-
-
Committed by Bo Liu
-
Committed by Wilber
-
Committed by zyfncg
* Support getitem by Bool index * delete some debug info of bool index * support the case that the shape of bool index is different from indexed tensor
-
Committed by pangyoki
-
Committed by pangyoki
-
Committed by zhaoyingli
* adamw support cuda * adamw support cuda
-
Committed by Linjie Chen
* Add cuda device count api * update code format * fix unittest error * update code format * update comment
-
- 20 Aug, 2021 7 commits
-
-
Committed by Hao Lin
-
Committed by Yuang Liu
-
Committed by lzzyzlbb
* add rmsprop npu * add argsort npu * add argsort npu * modify according to review * modify sharedatawith according to review * modify reshape according to review * rm dygraph=false
-
Committed by Sing_chan
* [NPU] Support npu kernel for pad3d op * fix for comment of zhouwei25 * fix some bugs according to qili93's comments * add support and test for paddings in input * delete VLOG used for debug
-
Committed by zhaoyingli
* add depthwise_conv2d npu * add some tests * Delete test_unique_op_npu.py * delete trans input
-
Committed by zhaoyingli
* [NPU] Support npu op where and where grad * fix use const_cast * delete a test
-
Committed by JYChen
* add (N,C,*) input support for GroupNorm * --amend
-
- 19 Aug, 2021 3 commits
-
-
Committed by JingZhuangzhuang
* add npu sin op * [NPU] Support npu kernel for sin op * modify support npu kernel for sin op * modify support npu kernel for sin op * modify npu sin op * modify npu sin op * add sin op npu
-
Committed by lilong12
-
Committed by 王明冬
-
- 18 Aug, 2021 11 commits
-
-
Committed by lzzyzlbb
* [npu]add rmsprop op
-
Committed by xiongkun
* Add NPU kernel for norm Op: float16 and float32 * fix code for code review * fix for code review * add type for paddle_throw * remove unnecessary header file. Add more test cases * remove a broadcast
-
Committed by littletomatodonkey
* fix pad outliers err * fix pad api input type and doc * fix example of pad * add unittest for pad3d * fix unittest * fix error format * fix pad doc
-
Committed by wanghuancoder
* code refactoring, test=develop * refine, test=develop * refine, test=develop * refine, test=develop
-
Committed by Jackwaterveg
* test=develop * test=develop
-
Committed by Jackwaterveg
* test=develop * test=develop
-
Committed by WangXi
[Hybrid Performance] Move the AMP cast op that casts fp32 params to fp16 params into the optimizer (#34965)
-
Committed by Fan Zhang
[CPU-PSLIB] Add consistency inspection of use_var_list and data_generator data, test=develop (#34463)
-
Committed by Zhanlue Yang
* Add function to disable paddle signal handler. Paddle uses google::InstallFaultSignalHandler to handle selected system signals, mainly for debugging and bug reporting. However, this can conflict with other Python packages that capture similar signals, such as tvm. To resolve this issue, we provide a function to disable the signal handler * Remove signal test from WIN32 platform * Remove redundant return from disable_signal_handler() function * Add detailed messages to en_doc
-
Committed by Guoxia Wang
* support class center sample of PartialFC
-
Committed by Wangzheee
* unitest_quant_dequant * fix * fix * deleted: test_trt_quant_conv2d_dequant_fuse_pass.py * fix
-
- 17 Aug, 2021 8 commits
-
-
Committed by Roc
-
Committed by Aganlengzi
-
Committed by WeiXin
* polish unittest. * polish code * polish code
-
Committed by shangliang Xu
* [bug fix] fix unfold negative_size_param
-
Committed by Hui Zhang
* dygraph support more ctc grad scale * scale for 1.x * fix unittest * fix unittest * format code * fix unittest * fix log info * unittest cov * fix format;notest,test=cpu,coverage * skip ctc_loss egs;test=cpu * warpctc grad cov;test=coverage * add dygraph test;test=coverage * format;test=cpu,coverage * format;test=cpu * add api compat;test=cpu * add cpu test * rename * rename * fix * fix test * format * eigen cpu * eigen gpu grad pass * cuda gpu pass * format * fix ci
-
Committed by Zeng Jinle
* add inplace passes and tests * update * fix use_cuda undefined fix compile error of op compat * add more ut * fix CPU CI error * check adam unique * fix mac/windows ci, improve coverage * fix ci error * follow weihang's comment * fix BlockDesc::MoveFrom * follow qiuliang's comment * update * follow huihuang's comments
-
Committed by zhiboniu
-
Committed by Kaipeng Deng
* fix drop_last not working in IterableDataset. test=develop
-
- 16 Aug, 2021 3 commits
-
-
Committed by veyron95
* [NPU] Support npu op:(1)arg_min (2)arg_max * Modify and add unit test cases * Modify unit test cases
-
Committed by 0x45f
* add size npu op * modify support data type * no longer use NPU size OP * remove useless comments, add test case * fix copyright, remove useless include
-
Committed by zhangchunle
-