- 23 Apr, 2023 2 commits
-
-
Committed by risemeup1
* apply gcc12 to gpups * apply gcc12 to gpups * apply gcc12 to gpups * apply gcc12 to gpups * apply gcc12 to gpups * apply gcc12 to gpups * apply gcc12 to gpips * apply gcc12 to gpups * apply gcc12 to gpups * test * test * apply gcc12 to gpups * apply_gcc12_to_gpups * fix compiler bug * fix compiler bug * test * fix dangling-pointer compiler * fix dangling-pointer compiler * fix dangling-pointer compiler * apply_gcc12_to_gpups * apply gcc12 to gpups * Update cuda_streams_py.cc
-
Committed by niuliling123
* Delete temp param in eager_gen
-
- 21 Apr, 2023 1 commit
-
-
Committed by JYChen
* support 0-D output and 0-D as indice in __getitem__ * fix tests * fix inference and UT * add unittest for setitem * fix xpu test * fix xpu 0-d
-
- 20 Apr, 2023 2 commits
- 19 Apr, 2023 1 commit
-
-
Committed by ronnywang
* [CustomDevice] add recompute support * update
-
- 18 Apr, 2023 2 commits
-
-
Committed by niuliling123
-
Committed by 张春乔
-
- 17 Apr, 2023 3 commits
-
-
Committed by LiYuRio
* cherry-pick fleet executor from 2.4 * fix test case
-
Committed by JingZhuangzhuang
-
Committed by 张春乔
-
- 14 Apr, 2023 3 commits
-
-
Committed by Feiyu Chan
1. modify the set_value op to use Scalars to represent the attr `values`, instead of a bunch of attributes of various types; (#52408)
2. add a program converter, with the set_value op as an example, which converts `paddle::framework::ProgramDesc` between the old and new formats (the differences are mainly some operators with incompatible updates in their definitions);
3. the program version and operator version map are now always saved when serializing `paddle::framework::ProgramDesc`, to identify the version;
4. provide an option `legacy_format=false` in the serialization of `paddle::framework::ProgramDesc`, which decides whether to convert the ProgramDesc back to a legacy format that Paddle 2.4.2 or earlier versions can load and execute;
5. deserialization of `paddle::framework::ProgramDesc` now automatically detects whether the bytes it receives are in the legacy format (contains any operator that has been incompatibly updated and has any attribute of type `Scalar`) and converts them to the new format. If you want a faithful deserialization without the automatic conversion, you can use protobuf's deserialization instead. This is not recommended, but it can be used for testing.
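The round trip described in this commit message can be illustrated with a minimal sketch. It does not use Paddle or protobuf: the dict-based program, the JSON encoding, the `LEGACY_KEYS` table, and every function name below are assumptions made for illustration only. It shows one unified `values` attribute being split into a per-type attribute when `legacy_format=True`, and being folded back automatically on load.

```python
import json

# Hypothetical legacy attribute names, one per element type (illustrative only).
LEGACY_KEYS = {bool: "bool_values", int: "int64_values", float: "fp64_values"}


def to_legacy(op):
    """Return a copy of `op` with the unified `values` attr stored under a per-type key."""
    attrs = dict(op["attrs"])
    if op["type"] == "set_value" and attrs.get("values"):
        values = attrs.pop("values")
        attrs[LEGACY_KEYS[type(values[0])]] = values
    return {"type": op["type"], "attrs": attrs}


def to_new(op):
    """Return a copy of `op` with any non-empty per-type attr folded back into `values`."""
    attrs = dict(op["attrs"])
    if op["type"] == "set_value":
        for key in LEGACY_KEYS.values():
            values = attrs.pop(key, None)
            if values:
                attrs["values"] = values
    return {"type": op["type"], "attrs": attrs}


def serialize(program, legacy_format=False):
    """Encode the program; the version is always saved alongside the ops."""
    ops = [to_legacy(op) if legacy_format else op for op in program["ops"]]
    return json.dumps({"version": program["version"], "ops": ops}).encode()


def deserialize(data):
    """Decode the bytes and convert legacy-style ops to the new format."""
    raw = json.loads(data)
    raw["ops"] = [to_new(op) for op in raw["ops"]]
    return raw


program = {
    "version": 2,
    "ops": [{"type": "set_value", "attrs": {"axes": [0], "values": [1.5, 2.0]}}],
}
legacy_bytes = serialize(program, legacy_format=True)
restored = deserialize(legacy_bytes)
```

In this sketch `restored` equals `program` whichever format was written, which is the property the commit message describes for the real converter.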
-
Committed by Kim Yann
-
Committed by ronnywang
-
- 13 Apr, 2023 1 commit
-
-
Committed by Yuanle Liu
-
- 12 Apr, 2023 1 commit
-
-
Committed by liuruyan
-
- 11 Apr, 2023 3 commits
-
-
Committed by Yuanle Liu
-
Committed by Xiaoxu Chen
-
Committed by wangzhen38
-
- 10 Apr, 2023 4 commits
-
-
Committed by Zhang Ting
* support set master_grad * move register_hook to auto_cast * update unittest * fix fp16 test * update for review comments
-
Committed by HongyuJia
* [Opt Performance] Optimize custom operator performance, reconstruct python API auto-gen, add cache and use const inference * opt AutoGradMeta implementation * remove profiler codes * fix unit test * change year, 2021->2023 * fix int64_t parse bug
-
Committed by kangguangli
* add strategy force_sequential_run * remove flag * fix * fix * fix * fix * fix * fix * fix * fix * fix
-
Committed by 张春乔
* mv WITH_ASCEND_CL * mv WITH_ASCEND * rollback * remove WITH_ASCEND * remove WITH_ASCEND
-
- 08 Apr, 2023 2 commits
-
-
Committed by kangguangli
* add strategy force_sequential_run * fix * fix * fix * fix * fix
-
Committed by 张春乔
* mv WITH_ASCEND_CL * mv WITH_ASCEND * rollback
-
- 07 Apr, 2023 1 commit
-
-
Committed by Wang Xin
-
- 06 Apr, 2023 3 commits
- 05 Apr, 2023 1 commit
-
-
Committed by zhouweiwei2014
-
- 04 Apr, 2023 1 commit
-
-
Committed by yuehuayingxueluo
* add gloo gather * add gloo_tools * fix CI bug * use gloo gather * remove redundant code * fix process_group_gloo.py * rename send_recv * fix conflict * fix conflict * fix codestyle * fix CI bug * add PADDLE_ENFORCE_NE
-
- 03 Apr, 2023 2 commits
-
-
Committed by engineer1109
-
Committed by Kim Yann
* rem is_compiled_with_mlu * fix some mlu_place and mlu_device_coount * make lint happy
-
- 01 Apr, 2023 2 commits
-
-
Committed by jjyaoao
* Delete the /paddle/fluid/platform/device/npu directory * clear Cmakelists * Try removing npu in the header file
-
Committed by Feiyu Chan
-
- 31 Mar, 2023 2 commits
-
-
Committed by zhenhailiu
* gather with doc * resolve comment * polish * polish * code style * polish doc * add_test * polish * polish * add test check * add test check * polish * polish * polish * polish * fix_time_out * polish * fix timeout * fix_timeout * polish * polish * polish * polish * polish
-
Committed by HongyuJia
* [CustomOP Inplace] Automap inplace dtype and shape, prepare for vector<Tensor> output * delete custom_inplace_setup.py * [CustomOP Optional Inplace] Custom operator supports inplace optional Tensor input * fix bug for vector<Tensor> inplace test
-
- 30 Mar, 2023 3 commits
-
-
Committed by zhouweiwei2014
-
Committed by Feiyu Chan
1. add a type caster for paddle's complex type, to allow pybind to automatically cast it to and from python's complex type;
2. add complex64 and complex128 data types for `libpaddle.Tensor`'s element get and set (which is required to perturb an element to get the numerical derivative);
3. add support for cuda pinned place in `libpaddle.Tensor` element get and set.
---
4. fix a bug in op code generation (the output folder was created concurrently with the parsing of op yamls).
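Why element get and set matter for numerical derivatives can be shown with a small sketch. It uses a plain Python list of complex numbers in place of `libpaddle.Tensor`, and the function name `numerical_grad`, the step size, and the convention of packing the two partial derivatives into one complex number are assumptions for illustration, not Paddle's implementation. Each element is read, perturbed along its real and imaginary parts, written back, and restored.

```python
def numerical_grad(f, x, eps=1e-6):
    """Central-difference estimate for a real-valued f over a list of complex numbers.

    For each element, returns complex(df/d(real part), df/d(imaginary part)).
    """
    grad = []
    for i in range(len(x)):
        original = x[i]  # element get
        parts = []
        for step in (eps, eps * 1j):  # perturb the real part, then the imaginary part
            x[i] = original + step  # element set
            plus = f(x)
            x[i] = original - step
            minus = f(x)
            parts.append((plus - minus) / (2 * eps))
        x[i] = original  # restore the element before moving on
        grad.append(complex(parts[0], parts[1]))
    return grad


def squared_norm(v):
    """Sum of |z|^2, a real-valued function of complex inputs."""
    return sum(abs(z) ** 2 for z in v)


x = [1 + 2j, -3 + 0.5j]
g = numerical_grad(squared_norm, x)
```

For `squared_norm` the exact partial derivatives are twice the real and imaginary parts of each element, so `g` should be close to `[2+4j, -6+1j]`.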
-
Committed by Yiqun Liu
* [AMP] Add python API for collecting operator stats. * Fix import and polish codes. * Add more unittest. * Add doc for the new APIs.
-