- 27 Jan, 2022 10 commits
-
-
Committed by Siming Dai
* add the test case for the UVA * add the context load for the uva * Add graph_sample kernel * Add graph_sample commit * add new commit for graph_sample * add unsigned long long int * delete some remarks * add cpu version * add cuda eids * add cpu eids * delete _uva * optimize speed: emplace_back, last_layer * add to_uva_tensor * add cpu return_eids choice * add gpu return_eids choice * add cpu reindex_nodes * add gpu reindex_nodes * rename op and add OMP for cpu * add incubate api * fix the compile problem for the PADDLE_ENFORCE and different device * fix the rocm and windows compile problem * add unittest for graph_sample_neighbors * fix cpu unittest and unique problem * fix uva unittest, fix cuda unique problem * fix the windows compile problem * fix the windows rand_r compile problem * add correct unittest, add src_eids dispensable * delete black * combine uva unittest * mv Sample_index to Sample_Index; check input shape; fix random sample func * delete memset & cudaMemset * fix according to PR comments * fix rocm ci * modify function names according to the specification * fix windows_openblas ci * refine annotations, fix windows unittest, add default value for uva device_id, fix bug for input nodes with empty neighbors * fix rocm ci * rename graph_sample_neighbors as graph_khop_sampler, add incubate api doc * add data type * fix conflict Co-authored-by: wawltor <fangzeyang0904@hotmail.com>
-
Committed by Aurelius84
* Support allocate_from in Tensor and allocate_data in Context * fix #ifdef CUDA * fix cycle depends * fix test_xxx_dev_api failed * fix windows compiling error * fix unittest * modify into PImpl * fix selected rows * add TODO comment * refine interface according to reviewer
-
Committed by zhouweiwei2014
-
Committed by wenbin
* shuffle channel pass * add ut * timeout fix * makefile fix
-
Committed by caozhou
* update planner * update unittest * update dist matmul * update auto converter
-
Committed by QingshuChen
* optimize kunlun/xpu softmax_with_cross_entropy and add unittest *test=kunlun * minor *test=kunlun * minor *test=kunlun * minor *test=kunlun * minor *test=kunlun
-
Committed by caozhou
* update dist param grad for pass * update unittest * update unittests * fix conflict
-
Committed by Wangzheee
* Paddle-Inference:fix_concat_slice * Paddle-Inference:fix_concat_slice * Paddle-Inference:fix_concat_slice * Paddle-Inference:fix_concat_slice * [Paddle-Inference]: fix concat slice * [Paddle-Inference]: fix concat slice * [Paddle-Inference]: fix concat slice
-
Committed by huangxu96
Support the cases that the indices shape size is larger than the arr shape size
-
Committed by zhangbo9674
* add master weight for opt state_dict * check empty of master weight * strict gpu test * refine unittest
-
- 26 Jan, 2022 9 commits
-
-
Committed by hlygit66666
* add fuse_relu_depthwise_conv_pass unittest * fix atol and rtol * fix according to review * add FuseBatchNormAddActPass and unittest * Update test_dist_fuse_bn_add_act_pass.py * solve conflict
-
Committed by Weilong Wu
* Added selected_rows and rw_lock to pten * Renamed the unit test target to fix CI * Removed Class SelectedRows in Fluid, changed include/cmake relationship, use pten::SelectedRows in Fluid * Remove rw_lock.h,rw_lock_test.cc in fluid * Use pten::RWLock and pten::AutoRDLock, fix CI * Use pten::SelectedRows * Use pten::SelectedRows * Fix to pass NPU CI * Selected_Rows inherits from TensorBase * Use pten::SelectedRows, to pass NPU CI * To fix NPU CI * To fix NPU CI again * Use paddle/pten/core/enforce and polish code * Support imperative selected_rows_to_lod_tensor * Polish code
-
Committed by qipengh
* [MLU]Add conv2d op * [MLU]fix comment * [MLU]adapt NCHW of conv2d op
-
Committed by yaozhixin
-
Committed by yaozhixin
-
Committed by Li Min
* Optimize layer_norm fwd when cols is 1024.
-
Committed by yaozhixin
-
Committed by houj04
* add sigmoid cross entropy with logits to kl2. test=kunlun * add sigmoid cross entropy with logits to kl2. test=kunlun * follow comments. test=kunlun
-
Committed by joeqiao12
-
- 25 Jan, 2022 12 commits
-
-
Committed by hlygit66666
* add fuse_relu_depthwise_conv_pass unittest * fix atol and rtol * fix according to review * Add fuse_bn_act_pass unittest * rm others * add fuse_bn_act_pass
-
Committed by Zhang Jun
* [inference] update convert reduce op&ut,test=develop * update * update * update * add int32 support * add int32 support * add comments * trt < 7.0 do not support int32 * test=develop * update * test=develop
-
Committed by joeqiao12
* [MLU]add mlu kernel for fill_constant op * delete device_context DEPS
-
Committed by feng_shuai
-
Committed by fwenguang
-
Committed by joeqiao12
* [MLU]add mlu kernel for concat and split op * delete device_context DEPS
-
Committed by Yuang Liu
-
Committed by Haohongxiang
* support param groups in grad_clip * update * modify for review
-
Committed by TTerror
-
Committed by Noel
-
Committed by caozhou
* update reshard for newest completion * update unittest * merge newest
-
Committed by Zhanlue Yang
-
- 24 Jan, 2022 8 commits
-
-
Committed by Tongxin Bai
* [autograd] static Jacobian pass tests. * [autograd] apply CR suggested changes. * [autograd] more tests. * [autograd] add CPUPlace in tests. * [autograd] bug fixes. * [autograd] reformatted.
-
Committed by sneaxiy
-
Committed by Jiaqi Liu
-
Committed by Zhang Ting
-
Committed by Baibaifan
-
Committed by Baibaifan
-
Committed by z8hanghuan
* support sparse of adam, *test=kunlun * add pre-commit-config.yaml * support sparse of adam in KL2,*test=kunlun * support sparse of adam in KL2, *test=kunlun * modify xpu.cmake, *test=kunlun * support sparse of adam, rm some wait, *test=kunlun * support sparse of adam, rm some wait, *test=kunlun * support sparse of adam, *test=kunlun * support sparse of adam, *test=kunlun * support sparse of adam, *test=kunlun * support sparse of adam, *test=kunlun * support sparse of adam, *test=kunlun
-
Committed by Zhanlue Yang
Refactored python-level trace_op to call through _C_ops instead of Tracer::TraceOp, under eager_mode (#38338) * Replaced core.ops with _C_ops * Refactored python-level trace_op to call through _C_ops instead of Tracer::TraceOp, under eager_mode * Modified trace_op interface * Refactored trace_op logic for eager mode * Added Eager Dygraph support for OpTest * Fixed ci issues * Fixed CI failures * Fixed Coverage CI Issues * Fixed XPU CI Issues
-
- 23 Jan, 2022 1 commit
-
-
Committed by Weilong Wu
* Rearranged Eager AutoCodeGen directory structure * Removed USE_OP in Eager AutoCodeGen * Enabled generation for Operators without Grad/Inputs/Outputs * Resolved operators without input * Fixed merge conflicts * Enabled Eager AutoCodeGen for 10+ more operators * Refactored Eager AutoCodeGen with more organized helper objects * Enabled Eager AutoCodeGen for operators with multiple OpBases * Adjusted Eager AutoCodeGen to Enable Passing Output Tensor as Input Argument * Handled Dispensable Inputs/Outputs in Eager AutoCodeGen * Adjusted function generation/call between Python-C API & Dygraph API * Synchronized auto-generated Python-C API with Dygraph Forward Functions * support more eager tensor api * fix merge compile error * fix compile error and fit develop code * support pure CPU * fix some logic error in eager_mode * support _varbase_creator in eager mode * Added safe_initialized interface to EagerTensor for use in processing dispensable inputs * for eager mode * refine * support multiple constructor for eager tensor * add place related code * polish code * specific randint with dtype of int64 * Support pure cpu test * eager logic * refine test in pure cpu * eager logic * eager logic * eager logic, test=develop * skip core.eager when in inference, test=develop * refine, test=develop * refine, test=develop * call RetainGrad after run forward kernel, test=develop * refine, test=develop * support dygraph util, meta, guard test * eager test case * support inference test * refine test and fix initializer failed * modify eagertensor patch method * add eagertensor.clear_grandint, test=develop * refine, test=develop * refine, test=develop * refine, test=develop * support create varbase and fix retain grad error * call monkey_patch_varbase in _test_eager_guard, test=develop * fix windows error * split clear_gradient to clear_gradient and zero_grads, test=develop * refine, test=develop * refine, test=develop * support test_imperative_basic test in eager mode * remove additional log in variable.h * remove additional log in variable.h * remove additional code create in merge * eager * fix some eager logic, test=develop * refine, test=develop * refine, test=develop * refine, test=develop * patch_tensor_method_func, test=develop * refine, test=develop * eager test case, test=develop * refine, test=develop * eager, test=develop * eager, test=develop * eager optimizer, test=develop * eager optimizer, test=develop * eager test_imperative_optimizer_v2, test=develop * eager, test=develop * refine, test=develop * refine, test=develop * eager, test=develop * add resize in share buffer to, test=develop * eager, test=develop * fix _share_buffer_to, test=develop * refine, test=develop * refine, test=develop * support eager for dataloader,test=develop * Exposed EagerTensor's set func to implement set_value func * Rename set to _set_value, Supplement the corresponding test case * fix test concat dev api build failed * fix conflict * fix conflict * Use extern to Polish code Co-authored-by: jim19930609 <jim19930609@gmail.com> Co-authored-by: JiabinYang <360788950@qq.com> Co-authored-by: Wang Huan <wanghuan29@baidu.com> Co-authored-by: wanghuancoder <wanghuancoder@163.com> Co-authored-by: chentianyu03 <chentianyu03@baidu.com>
-