- 17 Jan, 2023 17 commits
-
-
Committed by pangyoki
* new exe supports CUDA Graph * fix * fix * fix * fix FLAGS_use_stream_safe_cuda_allocator in unittest * insert output of coalesce_tensor op to skip_gc_var * fix
-
-
Committed by xiaoguoguo626807
* proto type of composite grad in paddle * proto type of composite grad in paddle * refactor composite api with phi * fix compile error * support static graph code-gen for squeeze op * generate static graph code of unsqueeze * refine op name * fix compile error * add extra output in op_compat * remove debug log * fix clang compile error * support prim switch flag * support prim switch flag * fix dygraph error * merge develop * add code_gen * add necessary files without codegen * fix code_gen bug * add deps * modify igmnore * add ignore * delete std cout * add composite logic for backward.py * add tanh first order grad composite * support enable_prim flag for static graph * throw expection when both GrapOpMaker and GradCompOpMaker not been registered * reorganize the directory of prim api tests * fix windows error * add eager_utils * add eager_utils * modify code gen * add composite parse * add unittest for get_grad_op_desc * code optimize * fix static test on windows * support generate static graph code for imag and real op * fix windows compile error in test_static_prim * merge develop * disable test eager in inference * prim code gen * disable eager compile in inference * origin_yaml codegen success * rm other file * rm gitignore file * code_style * add eager test * code_style * clear # * merge develop * clear # * remove useless files * modify static test * support bool flag from singlton * merge develop * recover git ignore * fix conflict * clear prim_gen * recover git ignore for generated op * parse_yaml success * fix test compile error * remove some tests * add python test * code_style * revert parse_utils+ clear prim_gen * fix some name issue * add composite code gen * modify backward yaml * fix static composite grad maker code gen * remove addtional files * add some static funcs unit test * fix some bugs * fix composite grad maker register code gen * optimize some functions * modify gen cmake * add more api gen * add header * modify static * add static 
expand unsqueeze * comments * modify compopmaker * revert * modify gen name Co-authored-by: JiabinYang <360788950@qq.com> Co-authored-by: zyfncg <zhangyunfei07@baidu.com> Co-authored-by: cxxly <chenxx_id@163.com> Co-authored-by: charles-hit <wanghao107@baidu.com>
-
Committed by YuanRisheng
* change feed_op to phi kernel * fix ci bugs * fix build bugs * fix ci bugs * fix compile bugs * fix ci bugs * perfect code * perfect comment code * fix install bugs * modify code according comment * remove visitor in feed_op * modify according comment * perfect code according comment * add infershape * fix py3 bugs * fix getexpected kernel type * fix getexpected kernel type * fix ci bugs * add registry for custom device * fix py3 bugs * fix floating point error * fix py3 test bugs
-
Committed by cyber-pioneer
* support @to_static+to_prime+cinn * fix code logic * debug4 * debug5 * debug6 * debug7 * debug 8 * debug 9 * debug10 * debug11 * debug11 * debug 12 Co-authored-by: Aurelius84 <zhangliujie@baidu.com>
-
Committed by WangZhen
* Fix translated layer fine-tune
-
Committed by danleifeng
-
Committed by Jiabin Yang
-
Committed by Huihuang Zheng
Support 0d Tensor in ConditionalBlockOp 1. Add dygraph 0d tensor support for ConditionalBlockOp 2. Set scalar loss shape when `append_backward`
-
Committed by 姜永久
* rm retain grad * fix zero_dim * fix zero_dim for xpu * reset zero dim for xpu * reset xpu * reset custom_relu * Reset flip * fix zero dim
-
Committed by risemeup1
* fix build ci bug * fix build ci bug,test=test=document_fix * fix build ci bug,test=document_fix
-
Committed by zhangkaihuo
-
Committed by zhouweiwei2014
-
Committed by HongyuJia
-
Committed by WangZhen
* Support call backward() without params in dy2st
-
Committed by LiYuRio
-
Committed by Xiaoxu Chen
* support elementwise base func * fix compiling error and add test * support vjp for div using comp * remove additional change * fix dy2st error with magic num * fix dy magic num * another magic * another magic * another magic * add skip rename strategy * support add vjp * support add with new axis cal * support sub vjp * [prim] add multiply vjp rules * [prim] add multiply vjp rules * [prim] fix no infershape with composite in _append_backward_ops * [prim] add expand vjp rule * [prim] add exp vjp rule * uncomment infer shape for reshape/sum static prim api * [prim] fix tanh nullptr error * remove some print message * fix magic number in run_program relative tests @JiaBinYang * [prim] add expand,multiply,exp vjp rules * fix only support single direction reduce error * infer reduce dims using out dims Co-authored-by: JiabinYang <360788950@qq.com>
-
- 16 Jan, 2023 22 commits
-
-
Committed by HappyHeavyRain
* support the 'data_transform' for generating static graph ops * reset 'pow' code * change the 'GetKernelTypeForVar'
-
Committed by zlsh80826
* Update warpctc for cuda-12 * Deprecate cudaProfilerInitialize for CUDA > 11 * Deprecate CUSPARSE_MV_ALG_DEFAULT for CUDA_VERSION >= 11040 * Add the missing thrust header
-
Committed by Zhang Jun
* add outvar name for nvtx mark * only network created with kEXPLICIT_BATCH can call setMaxBatchSize
-
Committed by Weilong Wu
-
Committed by wawltor
-
Committed by risemeup1
-
Committed by Aurelius84
* [CINN]Switch cinn GIT_TAG from v0.2 into develop * fix branch name * specify commit * disable unittest * disable unittest
-
Committed by Yuanle Liu
* add trt_support_nhwc_pass
-
Committed by wangxiaoning
-
Committed by wangxiaoning
* fix ctr_double_accessor.h * fix graph_brpc_client.h non-const reference to pointer * fix common_table.h * fix graph_py_service.cc, server.cc, server.h
-
Committed by Ghost Screaming
* Fix bug of reduce_sum op. When input.numel() > INT32_MAX, its result is wrong. * Remove climits. * Fix bug of paddle.save. It may cause bug for saving sharded optimizer state_dict() in parallel.
-
Committed by wangxiaoning
-
Committed by Jiabin Yang
This reverts commit 4d5265b8.
-
Committed by QingshuChen
-
Committed by zhouweiwei2014
-
Committed by Weilong Wu
-
Committed by Leo Chen
-
Committed by Yuanle Liu
* add gpu_cpu_map_matmul_to_mul_pass to kGpuLowerPrecisionPasses * disable fc_elementwise_layernorm_fuse_pass in mixed precision
-
Committed by Charles-hit
* polish static grad op maker gen * fix some bugs * fix static code gen * solve conflict * modify composite grad maker name
-
Committed by zqw_1997
-
Committed by xiaoguoguo626807
-
Committed by Yulong Ao
* [Auto Parallel] Rename methods of ProcessMesh * [Auto Parallel] Impl the python process_mesh by the c++ one * [Auto Parallel] Add some minor modifications * [Auto Parallel] Rename some methods * [Auto Parallel] Remove unnecessary codes * [Auto Parallel] Add back some removed files * [Auto Parallel] Fix bugs * [Auto Parallel] Fix a bug * Update process_mesh.cc * [Auto Parallel] Merge dist attrs of Python into C++ * [Auto Parallel] Add back deleted importing * [Auto Parallel] Add back removed unittest * [Auto Parallel] Remove type qualifiers of return types * [Auto Parallel] Fix some bugs * [Auto Parallel] Fix a bug of the quant pass * [Auto Parallel] Fix the code style * [Auto Parallel] Clear some fluid APIs
-
- 15 Jan, 2023 1 commit
-
-
Committed by Roc
1. update xccl lib 2. when using comm_ctx, the allocator should be set manually.
-