- 14 Feb, 2022 6 commits
-
-
Committed by Chen Weihang
* add has_attr for arg map context * skip useless attr now * skip attr if not exists * fix typo
-
Committed by chentianyu03
* add split kernel * add split kernel signature * fix split bug * modify MakePtenScalarArrayFromVarList * modify MakePtenScalarArrayFromVarList * fix split windows register error * add test case for split kernel * replace raw split kernel with pten kernel * fix makeScalar/ScalarArray bug * remove debug log * remove int64_t type in buildPtcontext * update by code review * fix split dev test failed * change DenseTensorMeta to MetaTensor * change split api code from auto gen to manual * split cuda kernel support bfloat16 type * fix conflict * rm raw split kernel * merge develop branch * change to pten::errors
-
Committed by TTerror
-
Committed by mhhhh1
-
Committed by Zhanlue Yang
* Enabled Eager OpTest #1 * Enabled Eager OpTest #1 * Fixed get_tensor method for EagerTensor
-
Committed by Zhanlue Yang
* Removed debug info * Added automatic code generation for final state Eager Dygraph * Modified backward yaml * Added EagerUtils helper functions for final state CodeGen * Adjusted CMakeFiles to support compilation for final state auto generated codes * Added python-c code generation for final state Eager Dygraph * Fixed minor issue * Fixed yaml.load() method failure * Fixed minor issues * Refactored Python-C Attributes Parsing Functions * Fixed minor issue with Python-C AddFunctions * Adjusted python-level trace_op to accommodate final state Eager Dygraph * Added Logs for final state Eager Dygraph * Fixed merge issues * Fixed minor issue
-
- 11 Feb, 2022 18 commits
-
-
Committed by chenjian
* add event node implementation * modify profiler.stop interface * fix according to review * fix file mode * modify class method name in event_node.cc * modify LLONG_MAX to ULLONG_MAX * fix ci error * fix ci error * fix dependency error
-
Committed by Leo Chen
-
Committed by jakpiase
* added shape oneDNN kernel * removed unnecessary import from test * added skipping tests for GPU * refactoring * refactored shape kernel * added tests in new framework * removed one line * minor change * added newline at EOF * added formatting * added attributes as extra
-
Committed by joeqiao12
-
Committed by zhangbo9674
* add transpose unbind * add unittest * refine transpose unittest
-
Committed by zn
Co-authored-by: zhangna <zhangna@cambricon.com>
-
Committed by fwenguang
-
Committed by Feiyu Chan
* move operators/math/math_function_* to pten/kernels/func * namespace from `paddle::operators::math` to `pten::funcs`
-
Committed by Chen Weihang
* remove xxx_info include * fix namespace error * resolve conflict * skip xpu context in registry * fix macro error * resolve conflict * resolve conflict * revert xpu convert * remove trans to fluid place * remove useless headers
-
Committed by Zhang Zheng
* Optimize performance of softmax_bwd when axis!=-1 * fix * fix * fix * fix
-
Committed by Lijunhui
* bilinear_fw init * optimize code * pre-compute linear_interp input index
-
Committed by JingZhuangzhuang
-
Committed by Chen Weihang
* move grad get expected pten kernel args * fix reduce sum error * fix element_sub_grad failed * revert kernel judge change
-
Committed by Wangzheee
* support ernie quant model with interleaved * support ernie quant model with interleaved * support ernie quant model with interleaved * support ernie quant model with interleaved * support ernie quant model with interleaved * support ernie quant model with interleaved * support ernie quant model with interleaved
-
Committed by liutiexing
* add align for WorkQueue * add spinlock * merge develop * merge * Add EventsWaiter * Revert "Add EventsWaiter" This reverts commit e206173aa9be7401b83a53581627bfaf557c8fb2. * add log for Executor Co-authored-by: liutiexing <liutiexing@google.com>
-
Committed by Leo Chen
-
Committed by chenjian
* add event node implementation * modify profiler.stop interface * fix according to review * fix file mode * modify class method name in event_node.cc * modify LLONG_MAX to ULLONG_MAX * fix ci error * fix ci error
-
Committed by Zhang Ting
* improve backward performance * support different dtypes for elementwise ops
-
- 10 Feb, 2022 13 commits
-
-
Committed by fwenguang
* [MLU] add mlu kernel for accuracy op * fix license format * fix error message
-
Committed by furnace
[NPU] add reduce_min
-
Committed by TeFeng Chen
* add a graph pass to share MemOptVarInfos of external variables into subgraph * update pass name * fix compile failed * add share_mem_opt_info_to_subgraph_pass test * share_mem_opt_info_to_subgraph_pass_test pass * modify some codes for better style and more robust * update cmake
-
Committed by Zhanlue Yang
* Removed debug info * Added automatic code generation for final state Eager Dygraph * Modified backward yaml * Added EagerUtils helper functions for final state CodeGen * Adjusted CMakeFiles to support compilation for final state auto generated codes * Added python-c code generation for final state Eager Dygraph * Fixed minor issue * Fixed yaml.load() method failure * Fixed minor issues * Refactored Python-C Attributes Parsing Functions * Fixed minor issue with Python-C AddFunctions * Fixed issues from merge * Fixed merge issues
-
Committed by chenyanlann
-
Committed by hong
* move masked select cpu kernel * add masked selected gpu kernel; test=develop * fix bugs; test=develop * bug fix; test=develop * bug fix; test=develop * add namespace to set mask array; test=develop * fix bug; test=develop * fix bugs; test=develop * fix ddim bug; test=develop * fix npu op bug; test=develop * fix xpu dependency bug; test=develop * move kernel args to sig.cc; test=develop
-
Committed by wenbin
* mkldnn conv fix * definition
-
Committed by Zhanlue Yang
-
Committed by crystal
* optimize conv1d forward * add conv opt * Optimize memory copy * delete share data with * set num_filters=512 * add nlc optimize * Optimize num_filter=512 data on A100 and V100 * Fix the workspace_size size setting of filter
-
Committed by zhangbo9674
* add squeeze unsqueeze stack * add unittest * add cpu kernel
-
Committed by zhangbo9674
* add dropout * add reshape * add slice * refine slice unittest * refine slice unittest * add cpu bf16 kernel
-
Committed by Leo Chen
* update isnan registration * fix compile
-
Committed by Aganlengzi
-
- 09 Feb, 2022 3 commits
-
-
Committed by Zhang Zheng
* Optimize performance of softmax_fwd when axis!=-1 * use functor * support hip * fix functor
-
Committed by Leo Chen
* fit pten for amp * fix typo
-
Committed by Wangzheee
* rebuild matmul pass: trt and gpu_cpu * rebuild matmul pass: trt and gpu_cpu * rebuild matmul pass: trt and gpu_cpu * rebuild matmul pass: trt and gpu_cpu
-