- 01 Apr, 2021 2 commits
-
-
Committed by kuizhiqing
* new group * ci compatible fix * assert nccl
-
Committed by Chen Weihang
* refactor and simplify hook design * fix reducer add hook error * add Tensor.register_hook basic impl * refine prepare data impl * revert prepare data change * support register_hook for Tensor * add hook test in model * polish tests and doc example * fix double grad test failed * remove reduce hook func * fix set empty error * polish code by comments * change reduce_hook to mutable_hook * remove useless tmp_ins * fix shape code format error * fix shape code format error
-
- 31 Mar, 2021 2 commits
-
-
Committed by Zhou Wei
* [Parallel UT]improve Parallel UT level on Windows/Linux * [Parallel UT]improve Parallel UT level on Windows/Linux * [Parallel UT]Improve Parallel UT level on Windows/Linux * [Parallel UT]Improve Parallel UT level on Windows/Linux * fix CI
-
Committed by Kaipeng Deng
* polish tensor pipeline. test=develop
-
- 30 Mar, 2021 2 commits
- 29 Mar, 2021 1 commit
-
-
Committed by ronnywang
-
- 26 Mar, 2021 2 commits
-
-
Committed by cc
* Use layer to calculate output scale * add backward for moving_average_abs_max_scale and save output scales to op's attr
-
Committed by tianshuo78520a
* delete include framework.pb.h * fix error
-
- 23 Mar, 2021 1 commit
-
-
Committed by Zhou Wei
* modify windows CI to VS2017 * modify windows CI to VS2017 * modify windows CI to VS2017
-
- 18 Mar, 2021 1 commit
-
-
Committed by Chen Weihang
* support custom complex op * fix detail error * add inference support * fix setup windows failed
-
- 15 Mar, 2021 1 commit
-
-
Committed by Kaipeng Deng
* add dict/str/list support for DataLoader. test=develop
-
- 05 Mar, 2021 1 commit
-
-
Committed by liuyuhui
[Kunlun]Multi xpu dygraph performance optimization, add distributed.spawn support for multi xpu and some bug-fixes (#31130)
-
- 04 Mar, 2021 2 commits
-
-
Committed by Qi Li
-
Committed by wuhuanzhou
-
- 26 Feb, 2021 1 commit
-
-
Committed by Qi Li
-
- 22 Feb, 2021 1 commit
-
-
Committed by Thunderbrook
* save multi table one path * format
-
- 20 Feb, 2021 2 commits
-
-
Committed by 123malin
* test=develop, save/load, shrink Co-authored-by: seiriosPlus <tangwei12@baidu.com>
-
Committed by liym27
* [static setitem] support the index step > 1. E.g.: tensor_a[::3] = value * [static setitem] support the index step < 0. E.g.: tensor_a[::-3] = value * [static setitem] support a Tensor as the index. E.g.: tensor_a[tensor_3:0:-1] = value * Add op version.
-
- 19 Feb, 2021 1 commit
-
-
Committed by Wilber
-
- 10 Feb, 2021 1 commit
-
-
Committed by Chen Weihang
* initial commit: simple demo * polish copyright format * add grap op simple demo * adapt uncertain number of argument * change trait macro name * add place & dtype support for add kernel * add dispatch and infershape func * polish code & add notes * add dynamic_loader dep for paddle_framework * add new custom op test dir * polish impl details * add unittest for new custom op * fix failed unittest
* Custom op (#1) * fix compile error * wrap framework tensor with LoDTensor * fix compile error * add CustomTensor default constructor * add size() for CustomTensor * make size const for CustomTensor * refactor place related api to circle the concept * fix compile error * make place const * make Tensor copy * debug CustomTensor core * remove additional header of framework * use back to shared ptr for custom tensor * add gpu test * merge latest cwh code in * adjust ut code of custom op
* Remove ShareData from user && Change CustomTensor to Tensor && Support more data type (#2) * fix compile error * wrap framework tensor with LoDTensor * fix compile error * add CustomTensor default constructor * add size() for CustomTensor * make size const for CustomTensor * refactor place related api to circle the concept * fix compile error * make place const * make Tensor copy * debug CustomTensor core * remove additional header of framework * use back to shared ptr for custom tensor * add gpu test * merge latest cwh code in * adjust ut code of custom op * hide share data from and to * rename CustomTensor to Tensor * refactor register design & add test * change op_funtion to op_meta_info * split op meta info into .h and .cc * move get methods into friend class * move OpMetaInfoHelper into framework space * move CustomTensorUtils into framework space * change pybind api name * move PD C API into op meta info * add register custom op api * remove inference cmake change
* refactor copy to api && change Reshape to lowercase && support more dtype && add more test (#3) * fix compile error * wrap framework tensor with LoDTensor * fix compile error * add CustomTensor default constructor * add size() for CustomTensor * make size const for CustomTensor * refactor place related api to circle the concept * fix compile error * make place const * make Tensor copy * debug CustomTensor core * remove additional header of framework * use back to shared ptr for custom tensor * add gpu test * merge latest cwh code in * adjust ut code of custom op * hide share data from and to * rename CustomTensor to Tensor * support multi dtype * remove lod, make reshape lowercase, add copy test and refactor copy api * fix copy to error * add more test * polish detail & error message * polish test details
* Add cast api && Change copy related api to copy_to && add more test (#4) * fix compile error * wrap framework tensor with LoDTensor * fix compile error * add CustomTensor default constructor * add size() for CustomTensor * make size const for CustomTensor * refactor place related api to circle the concept * fix compile error * make place const * make Tensor copy * debug CustomTensor core * remove additional header of framework * use back to shared ptr for custom tensor * add gpu test * merge latest cwh code in * adjust ut code of custom op * hide share data from and to * rename CustomTensor to Tensor * support multi dtype * remove lod, make reshape lowercase, add copy test and refactor copy api * fix copy to error * add more test * add type cast * add cast and make copy to api * merge cwh code * add more error log * polish code * used for test * remove test comment * fix uint8 type error * fix lost uint8 type error * add test for coverage * polish details by reviewer comments * add prefix for DISABLE_COPY_AND_ASSIGN Co-authored-by: Jiabin Yang <360788950@qq.com>
-
- 05 Feb, 2021 1 commit
-
-
Committed by liym27
Performance optimization for dynamic setitem: Call op set_value to speed up because the original call to TensorToPyArray will introduce unnecessary data copy. (#30817)
-
- 04 Feb, 2021 2 commits
-
-
Committed by joanna.wozna.intel
* Update Xbyak and add bf16 fast performance verification * Fix formatting * Change LOG message * Trigger an update of a new tag
-
Committed by WangXi
-
- 03 Feb, 2021 2 commits
- 01 Feb, 2021 1 commit
-
-
Committed by Thunderbrook
* dump to cpu * format * format * format
-
- 29 Jan, 2021 1 commit
-
-
Committed by ShenLiang
-
- 25 Jan, 2021 1 commit
-
-
Committed by Shang Zhizhou
* add dla * add dla done * add python api Co-authored-by: shangzhizhou <root@szth-rp-fanyi-opera49.szth.baidu.com>
-
- 21 Jan, 2021 1 commit
-
-
Committed by Thunderbrook
* build gpu task core * format
-
- 20 Jan, 2021 2 commits
-
-
Committed by wanghuancoder
* delete empty line of pybind.cc, test=develop * use nvtx push pop in timeline, test=develop * change year, test=develop * add #ifdef PADDLE_WITH_CUDA, test=develop * add #ifndef WIN32, test=develop * is_pushed to is_pushed_, test=develop
-
Committed by wanghuancoder
* add some RecordEvent, for dygraph timeline, test=develop * change GpuMemcpySync to memory::Copy, test=develop * fix compile problem, test=develop * fix compile problem, test=develop * fix, test=develop * fix, test=develop
-
- 19 Jan, 2021 2 commits
-
-
Committed by Leo Chen
* support layer_norm fp16 in dygraph amp * add ut * refine code
-
Committed by wanghuancoder
-
- 18 Jan, 2021 2 commits
-
-
Committed by hutuxian
-
Committed by wanghuancoder
-
- 17 Jan, 2021 1 commit
-
-
Committed by guofei
* Modify the calculation logic of LambOptimizer
-
- 15 Jan, 2021 1 commit
-
-
Committed by pangyoki
* add view strategy on squeeze,unsqueeze,reshape,flatten * add squeeze unittest * add unittests * use View strategy as name rather than Reuse Allocation * fix view api doc * fix format * use core.ops when input of reshape2 is Tensor * fix test_cross_entropy_loss error because of reshape2 * fix test_cross_entropy_loss error because of reshape2 * add inplace strategy * add elementwise_add sub * let backward op not use inplace * grad op do not use inplace * fix memory increase error and add leaf error message * delete selected_rows * change op_function * little change * solve HandleViewBetweenInputAndOutput * add unittest and leaf error message * merge view error * optimize op_function_generator format and support sum inplace op * fix format of basic_engine * fix format for framework * little change of variable wrapper * add reshape, squeeze, unsqueeze, scatter api * add relu elu tanh softmax inplace api * fix test_squeeze_op unittest * fix test_relu_op unittest * fix comment problems * delete sample code of inplace api * add reference of grad_pending_nodes in basic_engine * fix unittest name * add inplace apis into wlist * fix error message * add PADDLE_ENFORCE for set grad op twice * fix header file error
-
- 14 Jan, 2021 1 commit
-
-
Committed by yaoxuefeng
-
- 13 Jan, 2021 1 commit
-
-
Committed by Leo Chen
Set expected place in child thread for dataloader to avoid costing cuda memory on other card (#30338) * set expected place in child thread for dataloader * set device id when set tensor from numpy * revert tensor_py change * add compile guard * fix ci * fix bug
-