- 11 Feb, 2022 15 commits
-
-
Committed by joeqiao12
-
Committed by zhangbo9674
* add transpose unbind * add unittest * refine transpose unittest
-
Committed by zn
Co-authored-by: zhangna <zhangna@cambricon.com>
-
Committed by fwenguang
-
Committed by Feiyu Chan
* move operators/math/math_function_* to pten/kernels/func * namespace from `paddle::operators::math` to `pten::funcs`
-
Committed by Chen Weihang
* remove xxx_info include * fix namespace error * resolve conflict * skip xpu context in registry * fix macro error * resolve conflict * resolve conflict * revert xpu convert * remove trans to fluid place * remove useless headers
-
Committed by Zhang Zheng
* Optimize performance of softmax_bwd when axis!=-1 * fix * fix * fix * fix
-
Committed by Lijunhui
* bilinear_fw init * optimize code * pre-compute linear_interp input index
-
Committed by JingZhuangzhuang
-
Committed by Chen Weihang
* move grad get expected pten kernel args * fix reduce sum error * fix element_sub_grad failed * revert kernel judge change
-
Committed by Wangzheee
* support ernie quant model with interleaved
-
Committed by liutiexing
* add align for WorkQueue * add spinlock * merge develop * merge * Add EventsWaiter * Revert "Add EventsWaiter" This reverts commit e206173aa9be7401b83a53581627bfaf557c8fb2. * add log for Executor Co-authored-by: liutiexing <liutiexing@google.com>
-
Committed by Leo Chen
-
Committed by chenjian
* add event node implementation * modify profiler.stop interface * fix according to review * fix file mode * modify class method name in event_node.cc * modify LLONG_MAX to ULLONG_MAX * fix ci error * fix ci error
-
Committed by Zhang Ting
* improve backward performance * support different dtypes for elementwise ops
-
- 10 Feb, 2022 13 commits
-
-
Committed by fwenguang
* [MLU] add mlu kernel for accuracy op * fix license format * fix error message
-
Committed by furnace
[NPU] add reduce_min
-
Committed by TeFeng Chen
* add a graph pass to share MemOptVarInfos of external variables into subgraph * update pass name * fix compile failed * add share_mem_opt_info_to_subgraph_pass test * share_mem_opt_info_to_subgraph_pass_test pass * modify some codes for better style and more robust * update cmake
-
Committed by Zhanlue Yang
* Removed debug info * Added automatic code generation for final state Eager Dygraph * Modified backward yaml * Added EagerUtils helper functions for final state CodeGen * Adjusted CMakeFiles to support compilation for final state auto generated codes * Added python-c code generation for final state Eager Dygraph * Fixed minor issue * Fixed yaml.load() method failure * Fixed minor issues * Refactored Python-C Attributes Parsing Functions * Fixed minor issue with Python-C AddFunctions * Fixed issues from merge * Fixed merge issues
-
Committed by chenyanlann
-
Committed by hong
* move masked select cpu kernel * add masked selected gpu kernel; test=develop * fix bugs; test=develop * bug fix; test=develop * bug fix; test=develop * add namespace to set mask array; test=develop * fix bug; test=develop * fix bugs; test=develop * fix ddim bug; test=develop * fix npu op bug; test=develop * fix xpu dependency bug; test=develop * move kernel args to sig.cc; test=develop
-
Committed by wenbin
* mkldnn conv fix * definition
-
Committed by Zhanlue Yang
-
Committed by crystal
* optimize conv1d forward * add conv opt * Optimize memory copy * delete share data with * set num_filters=512 * add nlc optimize * Optimize num_filter=512 data on A100 and V100 * Fix the workspace_size size setting of filter
-
Committed by zhangbo9674
* add squeeze unsqueeze stack * add unittest * add cpu kernel
-
Committed by zhangbo9674
* add dropout * add reshape * add slice * refine slice unittest * refine slice unittest * add cpu bf16 kernel
-
Committed by Leo Chen
* update isnan registration * fix compile
-
Committed by Aganlengzi
-
- 09 Feb, 2022 12 commits
-
-
Committed by Zhang Zheng
* Optimize performance of softmax_fwd when axis!=-1 * use functor * support hip * fix functor
-
Committed by Leo Chen
* fit pten for amp * fix typo
-
Committed by Wangzheee
* rebuild matmul pass: trt and gpu_cpu
-
Committed by niuliling123
-
Committed by mhhhh1
-
Committed by fwenguang
-
Committed by fwenguang
-
Committed by fwenguang
-
Committed by Jiabin Yang
* merge legacy to fluid * Remove legacy code * Remove legacy code * Remove DataType test * Using Tensor directly instead of using EagerTensor * support gradient_accumulation * make test_imperative_lod_tensor_to_selected_rows longer * make test_imperative_lod_tensor_to_selected_rows longer
-
Committed by Yiqun Liu
-
Committed by hong
* add trace op * bug fix * bug fix; test=develop * thrust bug fix; test=develop * remove useless register; test=develop * fix bug; test=develop * update trace kernel; test=develop * move kernel args to trace_sig; test=develop
-
Committed by Chen Weihang
* fix slice bug of custom op * add offset in check
-