- 15 4月, 2021 1 次提交
-
-
由 zhang wenhui 提交于
* merge 31065 * Fix typo of selected_npus (#31230) * merge 31249 * [NPU] Support npu op pow and pow grad (#31247) * [NPU] Support npu op: (1) pow (2) pow_grad * Support fp16 * Fix pow npu fp16 test (#31256) * support list of list attribute for NPU (#31299) * support list of list attribute for NPU * fix compile problem * fix reference * [NPU] Support npu op: (1) slice (2) slice_grad (#31275) * fix reading flags from env (#31329) * merge 31347 * [NPU] Support npu op layer_norm and layer_norm_grad (#31310) * init commit, add layer_norm npu kernel * fix typo * add unittest * add unittest * fix bug * fix bug * refine ut * [NPU] add npu kernel for equal op (#31393) * add npu kernel for equal op * refine code * add more ut * update year * [NPU] Support npu kernel for shape op (#31427) * add shape npu * fix * fix * fix endif (#31431) * Fix pow, use fillD instead of broadcast (#31433) * Fix pow, refine code (#31440) * fix cmake of cryptopp to avoid downloading every time (#31451) * [NPU] squeeze and unsqueeze op for ascend (#31452) Co-authored-by: Nroot <xiayanming@baidu.com> * Support npu kernel for gather op (#31458) * add gather npu op * code review done * update python new line * precommit * fix review * del commit * 【NPU】add scale op for npu (#31499) * add scale npu * fix * fix * Support TensorFormVector, TensorToVector of bool type (#31518) * support TensorFormVector, TensorToVector of bool type * add ut * fix compile problem * 【NPU】support npu kernel for fill_constant op (#31521) * add fill_constant npu * add fill_constant npu * fix * cherry-pick 31422, solve conflict * 【NPU】Support npu kernel for matmul op (#31544) * add matmulv2_npu * add matmul * add matmul * [NPU] Support npu op elementwise_mul and elementwise_mul_grad (#31571) * [NPU] Support npu op elementwise_max (#31574) * 【NPU】add relu op for npu (#31515) * add relu npu * fixed * fix * 【NPU】Suppert npu kernel for reshape2 op (#31524) * add reshape2 npu * add reshpe2 * [NPU] Support npu kernel for gather op fix bug (#31541) * add gather npu op * code review done * update python new line * precommit * fix review * del commit * update gather_grad * fix bug * fix bug * [NPU] Support npu kernel for amp_check_finite_and_unscale_npu op (#31457) * Support npu kernel for amp_check_finite_and_unscale_npu op * support EnforceNotMet exception * fix exception bug * modify python unittest * precommit * update c++ unittest * fix review * fix review * [NPU] accuracy op (#31492) * accuracy op * fix license * fix * add test and fix bug * [NPU] add Assign OP (#31561) * add assign op * add test assign npu test * dele if def Co-authored-by: Noyjxer <1728722986@qq.com> * [NPU] fix npu op elementwise_mul_grad (#31592) * 【NPU】Support npu op gelu and gelu_grad (#31530) * Support npu op gelu and gelu_grad * Support npu op gelu and gelu_grad * [NPU] fix assgin cmake (#31595) * fix gather_grad bug (#31607) * [NPU] add range op (#31560) * add range op * fix codestyle; call GetSize directly Co-authored-by: Noyjxer <1728722986@qq.com> * 【NPU】Support npu op elementwise_div and elementwise_div_grad (#31573) * Support npu op elementwise_div and elementwise_div_grad * Support npu op elementwise_div and elementwise_div_grad * Support npu op elementwise_div and elementwise_div_grad * [NPU] Support npu op log, log_grad, sqrt, sqrt_grad, square, tanh and tanh_grad (#31600) * [NPU] Support npu op logicalnot_op (#31534) * [NPU] Support npu op elementwise_min (#31575) * [NPU] Support npu op elementwise_pow (#31576) * [NPU] Support npu op table_lookup_v2 and table_lookup_v2_grad (#31399) * [npu] support npu kernel `table_lookup_v2` * clean up * +python test * +cmake * clean up * remove int8 kernel + python unitest for fp16 * clean up * [NPU] support npu kernel for `less_than` (#31327) * [npu] support npu kernel for `less than` * remove int* kernel * cleanup * [NPU] Support npu kernel scatter op (#31624) * Support npu kernel scatter op * Add more test * [NPU] fix allocator min chunk size (#31632) * [NPU] Support NPU kernel cast op (#31635) Co-authored-by: Nfrankwhzhang <frankwhzhang@126.com> * [NPU] add npu kernel for sgd (#31639) * 【NPU】Support NPU kernel for reduce_sum op v2 (#31620) * add reduce_sum * fix broadcastd * fix test * fix * add unsqueeze in reduce_sum * add template * add unittest for keep_dim * test reduce_all Co-authored-by: Nfrankwhzhang <frankwhzhang@126.com> * [NPU] add npu kernel for adam (#31644) * add npu kernel for adam * refine code * disable test * modify atol * 【NPU】Support npu kernel for mul op (#31584) * add mul * add test mul * [NPU] add npu kernel for softmax_with_cross_entropy (#31656) * init * fix bugs * [NPU] add npu kernel for mean Op (#31562) * update mean op * update mean op * give a better test activation Co-authored-by: Noyjxer <1728722986@qq.com> * Revert "[NPU] add npu kernel for mean Op (#31562)" (#31665) This reverts commit 468ac699. * 【NPU】Add TensorCopy to NPU kernel for reduce_sum op (#31667) * update unittest * add TensorCopy in npu grad kernel * [NPU] Support npu op `expand` (#31405) * [npu] support npu kernel for `expand` * [NPU] fix shape of dx in mul_grad (#31675) * fix shape of dx * refine code * [NPU] add Increment op (#31563) * add increment * fix * update test increment op inplace * update increment op * increment b = 2 Co-authored-by: Noyjxer <1728722986@qq.com> * [NPU] add NPU add topk (#31596) * add topk op * add cmake * update topk npu op * refactor func * fix test not go npu TopKD bug * NPUPlace(4) to NPUPlace(0) * update comment Co-authored-by: Noyjxer <1728722986@qq.com> * [NPU] Support NPU kernel sum op (#31671) * [NPU] npu support `transpose` (#31486) * cherry-pick 31564, solve conflict * [NPU] Fix bug: Fix calculation errors of pow grad npu kernel (#31699) * [NPU] Support testing grad of NPU ops in OpTest (#31697) * [NPU] Support NPU kernel of stack op (#31711) * [NPU] Remove redundant ctest of top_k_op_npu_test (#31718) * [NPU] fix reshape npu op kernel (#31726) * rename npu op file * fix reshape * [NPU] change transpose to transpose2 (#31734) * change transpose to transpose2 * fix bug * [NPU] Support mean npu kernel (#31729) * [NPU] fix some bugs of npu op (#31739) * fix softmax * fix mean * fix lookup_table_v2 * 【NPU】Fix npu kernel elementwise_div_grad (#31753) * [NPU] fix the grad kernel diff bug of gather op (#31757) * fix gather grad kernel diff * fix gather grad kernel diff * fix gather review bug * 【NPU】Fix reshape test & add grad test (#31776) * fix * fix * [NPU] support fp16 for npu accuracy op (#31797) * [NPU] support list of tensor input (#31801) * support list of tensor as npu input * add comment * fix typo * fix typo * [NPU] add npu kernel for concat op (#31695) * add npu kernel for concat op * add npu kernel for concat op * refine code * update * refine concat_grad * [NPU] Support npu kernel for op elementwise_floordiv (#31822) * [NPU] fix bug of lookup_table_v2_grad (#31834) * [NPU] support default stream (#31510) * [NPU] support mixed precision input for npu layer norm (#31847) * support mixed precision input for npu layer norm * fix layer_norm npu kernel Co-authored-by: Nzhiqiu <chenqiuliang@baidu.com> * 【NPU】Support npu kernel for update_loss_scaling op (#31830) * add update_loss_scaling_npu NPU kernel * change TensorFromVec to Memset * fix compile problem (#31850) * [NPU] support npu for conditional_block op (#31854) * 【NPU】Add int dtype kernel for reshape2 op (#31864) * fix * fix * [NPU] fix some op bugs (#31855) * fix some op bugs * fix some bugs * follow comments * fix log level * add ut * [NPU] support fp16 of input for api pow (#31871) * [NPU] add npu kernel for truncated_gaussian_random op (#31654) * init * add todo * add npu kernel for truncated_gaussian_random * add sync * fix concat_grad * fix typo * fix compile * fix compile * fix compile * fix compile * fix compile * fix compile * fix code style * fix code style * fix code * Fix op test (#32231) * fix conditional block (#32243) * fix style code Co-authored-by: Nxiayanming <41795079@qq.com> Co-authored-by: NLeo Chen <chenqiuliang@baidu.com> Co-authored-by: Nliym27 <33742067+liym27@users.noreply.github.com> Co-authored-by: NReventon_L <luyuxiang1994@qq.com> Co-authored-by: Nroot <xiayanming@baidu.com> Co-authored-by: Noyjxer <1728722986@qq.com> Co-authored-by: Nyinhaofeng <66763551+yinhaofeng@users.noreply.github.com> Co-authored-by: NOleNet <olenet@126.com> Co-authored-by: NMeiyim <chen_xuyi@outlook.com> Co-authored-by: Noyxuan-11 <963650125@qq.com> Co-authored-by: Npangyoki <pangyoki@126.com>
-
- 03 3月, 2021 1 次提交
-
-
由 Qi Li 提交于
-
- 25 1月, 2021 1 次提交
-
-
由 arlesniak 提交于
* More precise mkldnn kernel choice in GetExpectedKernelType * Fixes after review * Refresh develop for CI * CI experiment * get back from CI exper
-
- 27 11月, 2020 1 次提交
-
-
由 arlesniak 提交于
-
- 10 11月, 2020 1 次提交
-
-
由 zhupengyang 提交于
-
- 06 11月, 2020 1 次提交
-
-
由 joanna.wozna.intel 提交于
* Add bfloat16 softmax and gelu * Add pass attr bfloat16_enabled_op_types * Changes from review
-
- 31 8月, 2020 1 次提交
-
-
由 GaoWei8 提交于
* refine cudnn softmax
-
- 14 5月, 2020 1 次提交
-
-
由 suytingwan 提交于
* test=develop error message update
-
- 26 4月, 2020 1 次提交
-
-
由 liuwei1031 提交于
* save InferVarType changes, test=develop * remove code comments, test=develop * tweak code, test=develop * fix compilation warning, update merge_ids_op split_ids_op to new interface, test=develop * modify fused_bn_activation_op, test=develop * fix error of fused_bn_activation_op, test=develop * fix PADDLE_ENFORCE and unittest coverage issue, test=develop * tweak PADDLE_ENFORCE messages, test=develop * improve unittest coverage, test=develop * add StaticGraphInferVarType class, test=develop * rebase develop branch, test=develop * fix unittest error, test=develop * remove comments, test=develop * improve unittest coverage, test=develop * imporve error message and imporve unittest coverage, test=develop * upgrade InferVarType API, test=develop * tweak pyfunc error message, test=develop * fix compilation conflict - save_combine_op, test=develop
-
- 09 3月, 2020 1 次提交
-
-
由 Zeng Jinle 提交于
* refine grad maker, test=develop * refactor tracer stage 1, test=develop * merge develop to solve conflict third times, test=develop
-
- 05 11月, 2019 1 次提交
-
-
由 Zeng Jinle 提交于
* support no need buffer vars in dygraph, test=develop * fix inference compilation error, test=develop * update no_need_buffer_vars_inference, test=develop * add unittests for no_need_buffer_vars_context, test=develop * refine no_need_buffer_vars by return ref, test=develop * polish some codes, test=develop
-
- 31 10月, 2019 1 次提交
-
-
由 hong 提交于
* refactor dygraph,test=develop * fix failed unittest,test=develop * polish code,test=develop * check windows ci error,test=develop try to fix windows ci error by np.allclose,test=develop * polish vlog and profiler, test=develop * try to fix preceding ops order,test=develop * test transformer in windows ci, test=develop * use python c-api to speed up tracer.trace,test=develop * test=develop, fix docker with paddle nccl problem * test=develop, add ut for debug string and gradient_accumulator * test=develop, add tests for layer/gradient_accumulator/prepared_op * test=develop, fix complie error for test_prepared_op * test=develop, add more ut for dygraph * test=develop, create API.spec for dygraph api change * optimize grad maker; test=develop * optimize grad maker * test * grad make optim; test=develop * fix unittest bugs; test=develop * add dygraph grad op maker and split_op * grad op maker refactor; test=develop * add dygraph grad maker; test=develop * fix op deformable_conv_v1_op bug; test=develop * fix deformable_conv prroi pool bugs; * fix new op grad op maker bug; test=develop * fix split by ref bug; test=develop * fix dygraph auto prune bug; test=develop * fix test_trace bug; test=develop * fix fused emb seq pool bug; test=develop * remove useless code in op_desc file; test=develop * remove useless code, StrVarBaseNode; test=develop * fix review issues; test=develop * fix rank_loss grad maker; test=develop * remove flag in VarBase; test=develop * fix distributed_notify_op compile bug ; test=develop * fix reshape op double grad; test=develop * fix expand as op; test=develop * add impertive type_defs.h for demo_train; test=develop * fix inference lib cmake; test=develop * fix inference lib; test=develop * fix infernce_lib; test=develop * fix inference cmake; test=develop * fix inference lib; test=develop * fix inference lib; test=develop * remove condition dygraph grad maker, modify local name; test=develop * fix split grad maker bug; test=develop * fix pyramid_op bug; test=develop * change travis time out limit; test=develop * restore travis; test=develop * change timeout limit; test=develop
-
- 28 10月, 2019 1 次提交
-
-
由 Chen Weihang 提交于
* replace part of the old implementation, test=develop * restore concat op, test=develop * update all ops implemention & delete GetDataTypeOfVar func, test=develop
-
- 21 9月, 2019 1 次提交
-
-
由 Adam 提交于
* Initial, functional commit * Clean commit related files test=develop
-
- 05 9月, 2019 1 次提交
-
-
由 Zeng Jinle 提交于
* enable inplace for affine_channel op, dropout op, test=develop * remove dropout inplace for ngraph fails, test=develop
-
- 30 4月, 2019 1 次提交
-
-
由 Zeng Jinle 提交于
* fix op graph view test=develop * rewrite inplace pass and fix reference count pass bug test=develop * fix unittest failed test=develop * follow comments, test=develop
-
- 27 3月, 2019 1 次提交
-
-
由 liuwei1031 提交于
* fix cdn issue, test=develop * fix memory optimize bugs, test=develop * fix memory optimize bugs, test=develop * remove add/sub_2 op, test=develop * disable memory_optimize by default, test=develop * disable inplace activation in python, test=develop * fix unittests, test=develop * fix unittests, test=develop * bug-fix, test=develop
-
- 25 3月, 2019 1 次提交
-
-
由 dengkaipeng 提交于
-
- 20 3月, 2019 1 次提交
-
-
由 dengkaipeng 提交于
-
- 18 3月, 2019 5 次提交
-
-
由 dengkaipeng 提交于
-
由 dengkaipeng 提交于
-
由 dengkaipeng 提交于
-
由 dengkaipeng 提交于
-
由 dengkaipeng 提交于
-
- 21 1月, 2019 1 次提交
-
-
由 dzhwinter 提交于
-
- 25 12月, 2018 1 次提交
-
-
由 sneaxiy 提交于
test=develop
-
- 19 12月, 2018 1 次提交
-
-
由 sneaxiy 提交于
test=develop
-
- 12 12月, 2018 1 次提交
-
-
由 Yu Yang 提交于
test=develop
-
- 15 11月, 2018 1 次提交
-
-
由 Sylwester Fraczek 提交于
* add is_test to pooling and activations add prop_kind support for layers activation. conv and pooling add a pass that sets is_test to true add transpiler version of is_test pass test=develop * patch test and pass test=develop * add pass to analyzer.h test=develop * add is_test attr description & pass only on mkldnn in: activation_op.cc batch_norm_op.cc conv_op.cc dropout_op.cc lrn_op.cc pool_op.cc sequence_pool_op.cc softmax_op.cc * fix is_test handling for activation pool and conv * change description of is_test for all layers again * remove GetAttr(use_mkldnn) from pass * rename correct_mkldnn_test_phase to is_test and remove dependency on MKLDNN test=develop * review fix magic number * two if(..)s into one * Check is_test once and pass mkldnn forward prop kind * dereference shared_ptr with * (without get()) test=develop * add is_test_pass back test=develop
-
- 09 11月, 2018 1 次提交
-
-
由 chengduo 提交于
* add_infer_var_type test=develop * InferVarTypeHelper-> VarTypeInferenceHelper test=develop * PassInputTypeAndDTypeOnOutput test=develop * follow comment test=develop
-
- 22 10月, 2018 1 次提交
-
-
由 Xin Pan 提交于
test=develop
-
- 01 8月, 2018 1 次提交
-
-
由 dzhwinter 提交于
* "add gradient register" * "make some enhance" * "better format" * "fix typo" * "fix reuse" * "fix get expected kernel" * "change the mkldnn code" * "fix mkldnn" * "fix mkldnn failed test" * "add comment"
-
- 31 7月, 2018 1 次提交
-
-
由 fengjiayi 提交于
-
- 21 6月, 2018 1 次提交
-
-
由 Jacek Czaja 提交于
- Added hash function inside of MKLDNN softmax op to be used as handle for primitives stroing in a context - Style fixes to softmax mkldnn op - Fixes after review - Coding style - Fix to style - style fixes - style fix - style fixes - Fix to cody style check - Rephrasing a comment fix t obroken merge Fixes to rebase Conflicts: benchmark/fluid/models/machine_translation.py cmake/external/mkldnn.cmake paddle/fluid/operators/softmax_mkldnn_op.cc - Bumped revision of MKL-DNN up to have softmax backward primitive - Added choosing MKLDNN softmax grad operator - First reuse of softmax backward - Reinvented reusing for softmax - Fix to crash in reinvented reuse - Clang format fixes - Clang format fixes - Improved softmax mkldnn reuse mechanism - clang format fixes - Fix to broken merge - Fix
-
- 11 6月, 2018 1 次提交
-
-
由 dzhwinter 提交于
* "add inplace attribute" * "register inplace attribute" * "change se-next model for memory-reuse" * "fix typo" * repick * fix merge conflict * "fix stupid error"
-
- 07 6月, 2018 1 次提交
-
-
由 mozga-intel 提交于
* Add MKLDNN layout support in Paddle Add MKLDNN layout in Paddle so that MKLDNN friendly memory layout can be used in MKLDNN enabled OP kernel. Before this commit, NCHW is hardcode to be used in all MKLDNN op kernels. As a result, non-optimized execution path is selected in MKLDNN primitive which bring worse performance. Besides framework change, three MKLDNN OP kernels were updated for using new MKLDNN layout. They are conv/pool2d/batch_norm. Other MKLDNN OP kernels need be also updated in similar way to achieve best performance. * Add MKLDNN layout support in activation OP * Don't populate layout from input to output when kMKLDNN in * Refine pool mkldnn op kernel * MKLDNN layout * Remove the inferitance from tensor file * MKLDNN layout: refactoring * Remove additional #define to register new operator * Prepare mkldnn tests to work with layout
-
- 08 5月, 2018 1 次提交
-
-
由 Yu Yang 提交于
Do not use ctor * Reduce line of codes. * We can use virtual function for Maker now. * The implementation does not care what maker holds, it is easier to refactor later.
-
- 03 5月, 2018 1 次提交
-
-
由 dzhwinter 提交于
* "fix double type error" * "fix ci" * "softmax fp64" * "fix momentum" * "fix ci"
-
- 19 4月, 2018 1 次提交
-
-
由 Yang Yang(Tony) 提交于
* script to add semicolon * fix typo
-
- 17 4月, 2018 1 次提交
-
-
由 Jacek Czaja 提交于
- EPS added to softmax mkldnn primitive outcome is limited to training phase Fixes after review clang format fixes clang format fixes
-