- 17 Jan, 2023 4 commits
-
-
Committed by Jiabin Yang
-
Committed by WangZhen
* Support call backward() without params in dy2st
-
Committed by LiYuRio
-
Committed by Xiaoxu Chen
* support elementwise base func * fix compiling error and add test * support vjp for div using comp * remove additional change * fix dy2st error with magic num * fix dy magic num * another magic * another magic * another magic * add skip rename strategy * support add vjp * support add with new axis cal * support sub vjp * [prim] add multiply vjp rules * [prim] add multiply vjp rules * [prim] fix no infershape with composite in _append_backward_ops * [prim] add expand vjp rule * [prim] add exp vjp rule * uncomment infer shape for reshape/sum static prim api * [prim] fix tanh nullptr error * remove some print message * fix magic number in run_program relative tests @JiaBinYang * [prim] add expand,multiply,exp vjp rules * fix only support single direction reduce error * infer reduce dims using out dims Co-authored-by: NJiabinYang <360788950@qq.com>
-
- 16 Jan, 2023 12 commits
-
-
Committed by HappyHeavyRain
* support the 'data_transform' for generating static graph ops * reset 'pow' code * change the 'GetKernelTypeForVar'
-
Committed by zlsh80826
* Update warpctc for cuda-12 * Deprecate cudaProfilerInitialize for CUDA > 11 * Deprecate CUSPARSE_MV_ALG_DEFAULT for CUDA_VERSION >= 11040 * Add the missing thrust header
-
Committed by Zhang Jun
* add outvar name for nvtx mark * only network created with kEXPLICIT_BATCH can call setMaxBatchSize
-
Committed by Aurelius84
* [CINN]Switch cinn GIT_TAG from v0.2 into develop * fix branch name * specify commit * disable unittest * disable unittest
-
Committed by Yuanle Liu
* add trt_support_nhwc_pass
-
Committed by wangxiaoning
* fix ctr_double_accessor.h * fix graph_brpc_client.h non-const reference to pointer * fix common_table.h * fix graph_py_service.cc, server.cc, server.h
-
Committed by wangxiaoning
-
Committed by Jiabin Yang
This reverts commit 4d5265b8.
-
Committed by Yuanle Liu
* add gpu_cpu_map_matmul_to_mul_pass to kGpuLowerPrecisionPasses * disable fc_elementwise_layernorm_fuse_pass in mixed precision
-
Committed by Charles-hit
* polish static grad op maker gen * fix some bugs * fix static code gen * solve conflict * modify composite grad maker name
-
Committed by zqw_1997
-
Committed by xiaoguoguo626807
-
- 15 Jan, 2023 2 commits
-
-
Committed by Roc
1. update xccl lib 2. when using comm_ctx, the allocator should be set manually.
-
Committed by Jiabin Yang
* support elementwise base func * fix compiling error and add test * remove additional param * support vjp for div using comp * remove additional change * fix dy2st error with magic num * fix dy magic num * another magic * another magic * add more test * fix windows problem * another magic * fix windows compile * invoke ci * add skip rename strategy * support add vjp * fix test_tanh * support add with new axis cal * fix resnet and some test * add composite log * support sub vjp * enhance_tests * support more dtype for full
-
- 13 Jan, 2023 13 commits
-
-
Committed by Wang Bojun
* add fmha_flashattention oss plugin
-
Committed by wanghuancoder
-
Committed by Zhang Jun
* update trt engine to set in/out data type * update * Update engine.cc * Update engine.cc * update * set engine output type before freeze the network * update * update trt autoscan ut * update * update ut * fix equal bug, update ut * fix cast and equal ut * update cast ut using TRT < 8.4 * set datatype from scope * check output var is nullptr * Update op_converter.h * update tensorrt_engine_op_test ut * update
-
Committed by duanyanhui
* clear ProcessGroupCustom manually * fix bug * fix bug * move destroy ProcessGroup to ProcessGroupIdMap * enable destroy to all device * remove unused comments * change to internal api * Update process_group.cc * Update process_group.cc
-
Committed by duanyanhui
* update get_device to custom * add custom_device api * rm is_compiled_with_custom_device from framework * add todo comments
-
Committed by Jiabin Yang
* support elementwise base func * fix compiling error and add test * remove additional param * support vjp for div using comp * remove additional change * fix dy2st error with magic num * fix dy magic num * another magic * another magic * add more test * fix windows problem * another magic * fix windows compile * invoke ci * add skip rename strategy * support add vjp * fix test_tanh * support add with new axis cal * fix resnet and some test * add composite log * support sub vjp
-
Committed by LiYuRio
-
Committed by jameszhang
* kunlun add support for c_concat and c_split * replace mutable_data() and ShareDataWith()
-
Committed by zyfncg
* generate static graph code of stack, unbind, unique_consecutive op * fix bug
-
Committed by Wilber
-
Committed by wangzhen38
* [cpplint fix] under ps
-
Committed by HongyuJia
-
Committed by Yuanle Liu
-
- 12 Jan, 2023 7 commits
-
-
Committed by xiaoxiaohehe001
-
Committed by Wen Sun
* refactor: migrate comm checks * refactor: add check in comm context * feat: add gloo static check * refactor: add place param in static check
-
Committed by jameszhang
* Fix reduce func bug in process_group_bkcl. Also catch up with a recent process_group PR that failed to add the XPU branch. Note that reduce is still accomplished by allreduce for XPU; fix this should the xccl lib be updated. * fix compile issue for non-XPU
-
Committed by gem5
-
Committed by wenbin
* compile fix * fix compile * compile fix * add more preln
-
Committed by jiangcheng
-
Committed by YuanRisheng
* rename kernel * delete sig * modify code according to comment * fix ci bugs
-
- 11 Jan, 2023 2 commits
-
-
Committed by zhangxin81
* fix paddle_infer_contrib include
-
Committed by niuliling123
-