- 18 Aug, 2022 5 commits
-
-
Committed by pangyoki
apply buffer_shared_inplace_pass and inplace_addto_op_pass pass to program in Standalone Executor (#45085) * apply inplace addto in python apply_pass * fix * apply inplace pass for program * skip feed and fetch var * fix block_desc.move_from * fix block desc * alltoall remove inplace * fix
-
Committed by Aurelius84
* [OpAttr]Squeeze axes support Tensor * add support_tensor * fix unittest * fix coverage
-
Committed by Roc
-
Committed by HongyuJia
* transfer bilinear op to phi, change name from bilinear_interp_v2 to bilinear_interp * reserve linear_interp param * fix cross device import
-
Committed by zyfncg
-
- 17 Aug, 2022 7 commits
-
-
Committed by Allen Guo
-
Committed by Nyakku Shigure
[CodeStyle][NPU] use np.testing.assert_allclose instead of self.assertTrue(np.allclose(...)) (part 1) (#44988) * autofix * try resolve precision issues * revert some changes * clean some `err_msg` * 0.0001 -> 1e-4 * update commented assert code * try to fix some shape errors * `numpy` -> `np` * empty commit, trigger kunlun ci, test=kunlun * empty commit, retrigger kunlun ci, test=kunlun * empty commit, trigger kunlun ci, try fix npu memcpy_h2d, test=kunlun * try fix npu import error, test=kunlun
-
Committed by Aurelius84
* [OpAttr]Add SupportTensor for OpMaker * fix typo * fix code style * add SupportTensor for concat op * add unittest for register Tensor * add shape checker and split attribute
-
Committed by Aurelius84
* [Eager]Support Lazy initialization for nn.Layer
-
Committed by ykkk2333
* xpu unittest grad compute supports more types, *test=kunlun * add instance norm xpu, *test=kunlun
-
Committed by HongyuJia
* transfer bicubic_interp op to phi, change name from bicubic_interp_v2 to bicubic_interp * test final_state_bicubic_interp api * testcase match imperative case
-
Committed by Zhang Zheng
-
- 16 Aug, 2022 8 commits
-
-
Committed by Chen Weihang
* move check finite and unscale kernel into phi * move infershape into phi * move update_loss_scaling kernel into phi * remove original kernels * move update loss scaling infershape into phi * add header for xpu and npu * solve coverage failed * fix npu test failed * remove mutable data in cu file * fix new executor failed * add valid check for meta tensor output
-
Committed by Siming Dai
* initial commit * fix op maker bug * fix mul grad bug * add unittest * fix add grad bug, add cpu kernel * add paddle.geometric.message_passing * add paddle.geometric.send_uv api, add unittest * add fp16 judgement * fix file typo, move compute_type to message_op * add impl file * fix unittest timeout time * add review revise
-
Committed by caozhou
* update reshard cost and cost estimator * add unittest * add dropout cost * fix import error * fix reshard code style error * improve unittest coverage
-
Committed by feng_shuai
* convert multihead to oss * fix:bug * fix:delete const cast * fix:don't support bias_qk * add vit pass * fix:convert bug and add preln_residual_bias * support length=-1 * add UT for convert * add no_bias_qk support for gpu_multihead_op * delete infer_shape depends on bias_qk * oss just can be used in T4 and A* * fix:change api for ROCM CI
-
Committed by HongyuJia
-
Committed by houj04
-
Committed by Sing_chan
* add select_p * fix bugs * add custom test for select_p; modify select_p primrules * modify according to xiaoxu's comment * add eq_p, select_p, pow_p, use autograd to test grad of high order * add requirement of autograd, modify expected type of eq * modify according to xiaoxu's comment * import primops to use primops.pow
-
Committed by Feiyu Chan
-
- 15 Aug, 2022 9 commits
-
-
Committed by RichardWooSJTU
Co-authored-by: minghaoBD <liminghao03@baidu.com>
-
Committed by HongyuJia
* change name linear_interp_v2 to linear_interp * fix deprecated_op_names * deprecated_op_names add linear_interp_grad
-
Committed by zlsh80826
* Reduce pool2d test configuration * Reduce depthwise_conv2d test configuration * Reduce trt_convert_conv2d_fusion test configuration * Reduce trt_convert_conv2d test configuration * Reduce trt_convert_conv2d_transpose test configuration * Reduce trt_convert_hard_swish test configuration * Enhance trt auto scan test error message and mechanism * Increase FP16 trt ut tolerance
-
Committed by zhangyikun02
-
Committed by zhaoyingli
* add collate_fn * fix number of inputs
-
Committed by Hui Zhang
* rm useless pybind * rm useless ut
-
Committed by Yulong Ao
* [Auto Parallel] Move the distributed info from python to c++ * [Auto Parallel] Add dist_attrs for VarDesc and OpDesc * [Auto Parallel] Add the lost file * [Auto Parallel] Make the dist attr be unique_ptr * [Auto Parallel] Add the proto conversion * [Auto Parallel] Improve the proto support * [Auto Parallel] Fix the bugs for adding a device or a link * [Auto Parallel] Add the C++ ProcessMesh and DistributedMapper * [Auto Parallel] Improve the impl of these dist attrs * [Auto Parallel] Pybind11 ProcessMesh and DeviceMesh * [Auto Parallel] Fix the unittest problem * [Auto Parallel] Explicitly add the src file for auto_parallel target * [Auto Parallel] Add the proto dependency explicitly * [Auto Parallel] Fix the cmake bug on windows and mac * [Auto Parallel] Remove the pybind11 header file in process_mesh.h * [Auto Parallel] Remove unused codes * [Auto Parallel] Check whether the dist attr is null * [Auto Parallel] Implement the assign operator for OpDesc explicitly
-
Committed by houj04
* [XPU] add some collective ops. test=kunlun * use XPUOpTestWrapper. test=kunlun * skip kl1 for collective ops. fix typo: deivce -> device. test=kunlun
-
Committed by Ruibiao Chen
* Update FLAGS for standalone executor * Update FLAGS_FORCE_USE_PROGRAM_CACHE
-
- 13 Aug, 2022 2 commits
-
-
Committed by Leo Chen
* add cached_serialize_str_ * support program hash * add sha * add ut * use hash_str only for new_exe * fix attr order
-
Committed by ziyoujiyi
* back fl * delete ssl cert * . * make warning * . * unittest paral degree * solve unittest * heter & multi cloud comm ready * . * . * fl-ps v1.0 * . * support N + N mode * . * . * . * . * delete print * . * . * . * . * fix bug * . * . * fl-ps with coordinator ready * merge dev * update message parse only * update fl client scheduler * fix bug * update multithreads sync * fix ci errors * update role_maker.py * update role_maker.py * fix ci error: windows py import error * fix ci error: windows py import error * fix windows ci pylib import error * add dump fields & params * try to fix windows import fleet error * fix ps FLAGS error * fix logging risk * fix logging possible risk * write trainer_desc file * support split sparse params in local & remote * fix import paddle.fluid.core.PSGPU * fix import paddle.fluid.core.PSGPU * add remote_sparse & local_sparse config * fix unittest * fix test_dist_fleet_geo table error * fix PADDLE_ENFORCE error * fix other's pr conflict
-
- 12 Aug, 2022 9 commits
-
-
Committed by Sławomir Siwek
* remove v2_transpose_reshape * matmul_transpose_reshape * reshape_transpose_matmul * Add int8 support for matmulV2 * restore ut * adjust old ut * restore parallel UT rules * remove mkldnn code from base ops * move enforces to pass * remove duplicated functions * delete duplicated enforces * feedback from review * add comments to variables * enable eltwise support * dynamic attribute * remove fusepass tests from op test * remove fuse pass cases from op test * revert introduction of dynamic attributes * style Co-authored-by: wozna <joanna.wozna@intel.com>
-
Committed by HongyuJia
* support optional<vector<Tensor>> in yaml and eager * delete useless comments in eager_gen.py * fix api_base.py support optional<vector<Tensor>> * python_c_gen.py support optional<vector<tensor>> * transfer linear_interp_v2 yaml from fluid to phi * fix op_test typo error * change linear_interp_v2 testcase * fix args in final_state_linear_interp_v2 * fix zeropad2d typo. test=document_fix
-
Committed by caozhou
* update reshard for auto search * fix unittest bug * update dist tensor * update reshard output * fix unittests bug * merge develop
-
Committed by Chang Xu
-
Committed by Aurelius84
* Fix concat and tile attribute for ONNX * disable unittest
-
Committed by JZ-LIANG
* bugfix * remove scaling * support rescale_grad opt
-
Committed by Jiabin Yang
* support more final_state code * support more final_state code * fix api error * fix norm error * fix pool3d error * revert pool3d and max_pool_3d_adaptive * fix code check error * fix norm problem
-
Committed by Yulong Ao
* [Auto Parallel] Pybind11 ProcessMesh and DeviceMesh * [Auto Parallel] Fix the unittest problem * [Auto Parallel] Explicitly add the src file for auto_parallel target * [Auto Parallel] Add the proto dependency explicitly * [Auto Parallel] Fix the cmake bug on windows and mac * [Auto Parallel] Remove the pybind11 header file in process_mesh.h
-
Committed by duanyanhui
* enhance grid_sampler to support 3d input
-