- 10 2月, 2021 1 次提交
-
-
由 Chen Weihang 提交于
* initial commit: simple demo * polish copyright format * add grap op simple demo * adapt uncertain number of argument * change trait marco name * add place & dtype support for add kernel * add dispath and infershape func * poish code & add notes * add dynamic_loader dep for paddle_framework * add new custom op test dir * polish impl details * add unittest for new custom op * fix failed unittest * Costum op (#1) * fix compile error * wrap framework tensor with LoDTensor * fix compile error * fix compile error * fix compile error * fix compile error * fix compile error * add CustomTensor default constructor * add size() for CustomTensor * make size const for CustomTensor * refactor place related api to circle the concept * fix compile error * fix compile error * fix compile error * fix compile error * fix compile error * fix compile error * fix compile error * fix compile error * fix compile error * fix compile error * fix compile error * fix compile error * fix compile error * fix compile error * fix compile error * fix compile error * fix compile error * fix compile error * make place const * make Tensor copy * debug CustomTensor core * debug CustomTensor core * debug CustomTensor core * debug CustomTensor core * debug CustomTensor core * debug CustomTensor core * debug CustomTensor core * debug CustomTensor core * debug CustomTensor core * debug CustomTensor core * debug CustomTensor core * debug CustomTensor core * debug CustomTensor core * debug CustomTensor core * remove additional head of framework * use back to shared ptr for custom tensor * use back to shared ptr for custom tensor * use back to shared ptr for custom tensor * use back to shared ptr for custom tensor * use back to shared ptr for custom tensor * use back to shared ptr for custom tensor * add gpu test * merge latest cwh code in * adjust ut code of custom op * adjust ut code of custom op * adjust ut code of custom op * Remove ShareData from user && Change CustomTensor to Tensor && Support more data type (#2) * fix compile error * wrap framework tensor with LoDTensor * fix compile error * fix compile error * fix compile error * fix compile error * fix compile error * add CustomTensor default constructor * add size() for CustomTensor * make size const for CustomTensor * refactor place related api to circle the concept * fix compile error * fix compile error * fix compile error * fix compile error * fix compile error * fix compile error * fix compile error * fix compile error * fix compile error * fix compile error * fix compile error * fix compile error * fix compile error * fix compile error * fix compile error * fix compile error * fix compile error * fix compile error * make place const * make Tensor copy * debug CustomTensor core * debug CustomTensor core * debug CustomTensor core * debug CustomTensor core * debug CustomTensor core * debug CustomTensor core * debug CustomTensor core * debug CustomTensor core * debug CustomTensor core * debug CustomTensor core * debug CustomTensor core * debug CustomTensor core * debug CustomTensor core * debug CustomTensor core * remove additional head of framework * use back to shared ptr for custom tensor * use back to shared ptr for custom tensor * use back to shared ptr for custom tensor * use back to shared ptr for custom tensor * use back to shared ptr for custom tensor * use back to shared ptr for custom tensor * add gpu test * merge latest cwh code in * adjust ut code of custom op * adjust ut code of custom op * adjust ut code of custom op * adjust ut code of custom op * adjust ut code of custom op * hid share data from and to * rename CustomTensor to Tensor * refactor register design & add test * change op_funtion to op_meta_info * split op meta info into .h and .cc * move get methods into friend class * move OpMetaInfoHelper into framework space * move CustomTensorUtils into framework space * change pybind api name * move PD C API into op meta info * add register custom op api * remove inference cmake change * refactor copy to api && change Reshape to lowercase && support more dtype && add more test (#3) * fix compile error * wrap framework tensor with LoDTensor * fix compile error * fix compile error * fix compile error * fix compile error * fix compile error * add CustomTensor default constructor * add size() for CustomTensor * make size const for CustomTensor * refactor place related api to circle the concept * fix compile error * fix compile error * fix compile error * fix compile error * fix compile error * fix compile error * fix compile error * fix compile error * fix compile error * fix compile error * fix compile error * fix compile error * fix compile error * fix compile error * fix compile error * fix compile error * fix compile error * fix compile error * make place const * make Tensor copy * debug CustomTensor core * debug CustomTensor core * debug CustomTensor core * debug CustomTensor core * debug CustomTensor core * debug CustomTensor core * debug CustomTensor core * debug CustomTensor core * debug CustomTensor core * debug CustomTensor core * debug CustomTensor core * debug CustomTensor core * debug CustomTensor core * debug CustomTensor core * remove additional head of framework * use back to shared ptr for custom tensor * use back to shared ptr for custom tensor * use back to shared ptr for custom tensor * use back to shared ptr for custom tensor * use back to shared ptr for custom tensor * use back to shared ptr for custom tensor * add gpu test * merge latest cwh code in * adjust ut code of custom op * adjust ut code of custom op * adjust ut code of custom op * adjust ut code of custom op * adjust ut code of custom op * hid share data from and to * rename CustomTensor to Tensor * support multi dtype * remove lod, make reshape lowercase, add copy test and refactor copy api * remove lod, make reshape lowercase, add copy test and refactor copy api * remove lod, make reshape lowercase, add copy test and refactor copy api * remove lod, make reshape lowercase, add copy test and refactor copy api * fix copy to error * add more test * add more test * add more test * add more test * add more test * add more test * add more test * add more test * add more test * add more test * add more test * add more test * add more test * add more test * add more test * add more test * polish detail & error message * polish test details * Add cast api && Change copy related api to copy_to && add more test (#4) * fix compile error * wrap framework tensor with LoDTensor * fix compile error * fix compile error * fix compile error * fix compile error * fix compile error * add CustomTensor default constructor * add size() for CustomTensor * make size const for CustomTensor * refactor place related api to circle the concept * fix compile error * fix compile error * fix compile error * fix compile error * fix compile error * fix compile error * fix compile error * fix compile error * fix compile error * fix compile error * fix compile error * fix compile error * fix compile error * fix compile error * fix compile error * fix compile error * fix compile error * fix compile error * make place const * make Tensor copy * debug CustomTensor core * debug CustomTensor core * debug CustomTensor core * debug CustomTensor core * debug CustomTensor core * debug CustomTensor core * debug CustomTensor core * debug CustomTensor core * debug CustomTensor core * debug CustomTensor core * debug CustomTensor core * debug CustomTensor core * debug CustomTensor core * debug CustomTensor core * remove additional head of framework * use back to shared ptr for custom tensor * use back to shared ptr for custom tensor * use back to shared ptr for custom tensor * use back to shared ptr for custom tensor * use back to shared ptr for custom tensor * use back to shared ptr for custom tensor * add gpu test * merge latest cwh code in * adjust ut code of custom op * adjust ut code of custom op * adjust ut code of custom op * adjust ut code of custom op * adjust ut code of custom op * hid share data from and to * rename CustomTensor to Tensor * support multi dtype * remove lod, make reshape lowercase, add copy test and refactor copy api * remove lod, make reshape lowercase, add copy test and refactor copy api * remove lod, make reshape lowercase, add copy test and refactor copy api * remove lod, make reshape lowercase, add copy test and refactor copy api * fix copy to error * add more test * add more test * add more test * add more test * add more test * add more test * add more test * add more test * add more test * add more test * add more test * add more test * add more test * add more test * add more test * add more test * add type cast * add cast and make copy to api * add cast and make copy to api * add cast and make copy to api * add cast and make copy to api * merge cwh code * merge cwh code * merge cwh code * merge cwh code * merge cwh code * add more error log * add more error log * polish code * used for test * remove test comment * remove test comment * fix uint8 type error * fix lost uint8 type error * add test for coverage * polish details by reviewer comments * add prefix for DISABLE_COPY_AND_ASSIGN Co-authored-by: NJiabin Yang <360788950@qq.com>
-
- 05 2月, 2021 1 次提交
-
-
由 liym27 提交于
Performance optimization for dynamic setitem: Call op set_value to speed up because the original call to TensorToPyArray will introduce unnecessary data copy. (#30817)
-
- 04 2月, 2021 2 次提交
-
-
由 joanna.wozna.intel 提交于
* Update Xbyak and add bf16 fast performance verification * Fix formating * Change LOG message * Trigger an update of a new tag
-
由 WangXi 提交于
-
- 03 2月, 2021 2 次提交
- 01 2月, 2021 1 次提交
-
-
由 Thunderbrook 提交于
* dump to cpu * format * format * format
-
- 29 1月, 2021 1 次提交
-
-
由 ShenLiang 提交于
-
- 25 1月, 2021 1 次提交
-
-
由 Shang Zhizhou 提交于
* add dla * add dla done * add python api Co-authored-by: Nshangzhizhou <root@szth-rp-fanyi-opera49.szth.baidu.com>
-
- 21 1月, 2021 1 次提交
-
-
由 Thunderbrook 提交于
* build gpu task core * format
-
- 20 1月, 2021 2 次提交
-
-
由 wanghuancoder 提交于
* delete empty line of pybing.cc, test=develop * use nvtx push pop in timeline, test=develop * change year, test=develop * add #ifdef PADDLE_WITH_CUDA, test=develop * add #ifndef WIN32, test=develop * is_pushed to is_pushed_, test=develop
-
由 wanghuancoder 提交于
* add some RecordEvent, for dygraph timeline, test=develop * change GpuMemcpySync to memory::Copy, test=develop * fix compile problem, test=develop * fix compile problem, test=develop * fix, test=develop * fix, test=develop
-
- 19 1月, 2021 2 次提交
-
-
由 Leo Chen 提交于
* support layer_norm fp16 in dygraph amp * add ut * refine code
-
由 wanghuancoder 提交于
-
- 18 1月, 2021 2 次提交
-
-
由 hutuxian 提交于
-
由 wanghuancoder 提交于
-
- 17 1月, 2021 1 次提交
-
-
由 guofei 提交于
* Modify the calculation logic of LambOptimizer
-
- 15 1月, 2021 1 次提交
-
-
由 pangyoki 提交于
* add view strategy on squeeze,unsqueeze,reshape,flatten * add squeeze unittest * add unittests * use View strategy as name rather than Reuse Allacation * fix view api doc * fix format * use core.ops when input of reshape2 is Tensor * fix test_cross_entropy_loss error because of reshape2 * fix test_cross_entropy_loss error because of reshape2 * add inplace strategy * add elementwise_add sub * let backward op not use inplace * grad op do not use inplace * fix memory increase error and add leaf error message * delete selected_rows * change op_function * little change * solve HandleViewBetweenInputAndOutput * add unittest and leaf error message * merge view error * optimize op_function_generator format and support sum inplace op * fix format of basic_engine * fix format for framework * little change of variable wrapper * add reshape, squeeze, unsqueeze, scatter api * add relu elu tanh softmax inplace api * fix test_squeeze_op unittest * fix test_relu_op unittest * fix comment problems * delete sample code of inplace api * add reference of grad_pending_nodes in basic_engine * fix unittest name * add inplace apis into wlist * fix error message * add PADDLE_ENFORCE for set grad op twice * fix head file error
-
- 14 1月, 2021 1 次提交
-
-
由 yaoxuefeng 提交于
-
- 13 1月, 2021 2 次提交
- 12 1月, 2021 2 次提交
-
-
由 tangwei12 提交于
* rename sendrecv.proto to namespace paddle.distributed * split ps with distributed
-
由 Chengmo 提交于
* add save tensor support Co-authored-by: NseiriosPlus <tangwei12@baidu.com>
-
- 11 1月, 2021 1 次提交
-
-
由 AshburnLee 提交于
-
- 09 1月, 2021 1 次提交
-
-
由 pangyoki 提交于
* add view strategy on squeeze,unsqueeze,reshape,flatten * add squeeze unittest * add unittests * use View strategy as name rather than Reuse Allacation * fix view api doc * fix format * use core.ops when input of reshape2 is Tensor * fix test_cross_entropy_loss error because of reshape2 * delete selected_rows * change op_function * little change * solve HandleViewBetweenInputAndOutput
-
- 08 1月, 2021 3 次提交
-
-
由 Leo Chen 提交于
* fix dtype of ungenerated grad var * update ut * refine code * set default dtype * fix could_use_cudnn bug * remove debug code * re-implement * fix bug
-
由 Leo Chen 提交于
* change to tensor copy sync * change to tensor copy sync * make copy_to safe when use TensorCopy * refine code * add ut * add cudapinned garbagecollector * add testcase: cpu place -> cuda pinned place
-
由 Chengmo 提交于
* add tensor table
-
- 07 1月, 2021 1 次提交
-
-
由 123malin 提交于
* test=develop, add model_average and lookahead
-
- 06 1月, 2021 2 次提交
-
-
由 Leo Chen 提交于
* add dispenable input 'shape' for core.ops.reshape2 * add dispenable inputs for core.ops.reshape2/expand/slice * add ut
-
由 liym27 提交于
1. when slice_item is a slice: 1) the start of __getitem__ should be std::max(start, 0) if slice 2) the start of __getitem__ should be std::min(end, dim) 2. when slice_item is an integer, it should be in [-dim_len, dim_len) 3. Fix error message to use accurate data
-
- 05 1月, 2021 1 次提交
-
-
由 Thunderbrook 提交于
* add topo aware * resource.h * topo aware * format
-
- 04 1月, 2021 1 次提交
-
-
由 cc 提交于
* zero_copy_tensor supports int8_t
-
- 27 12月, 2020 1 次提交
-
- 26 12月, 2020 1 次提交
-
-
由 liuyuhui 提交于
-
- 24 12月, 2020 1 次提交
-
-
由 tangwei12 提交于
* oneps (3/4) Co-authored-by: NMrChengmo <cmchengmo@163.com> Co-authored-by: Nmalin10 <malin10@baidu.com> Co-authored-by: Nchengmo <chengmo@baidu.com>
-
- 23 12月, 2020 1 次提交
-
-
由 Thunderbrook 提交于
* add heter box * add trainer, worker, wrapper... * format * for ci * format * remove boost get * boost & copyright * rename * rename * format * format * format Co-authored-by: Nyaoxuefeng6 <yaoxuefeng@baidu.com>
-
- 22 12月, 2020 1 次提交
-
-
由 ShenLiang 提交于
* fix fleet for multi-stream * fix memcpy for ncclid * use sync to solve move operation
-
- 16 12月, 2020 2 次提交