    【NPU】Cherry-pick ascendrc ops code by 0325 to develop (#32197) · e6bc358d
    Committed by zhang wenhui
    * merge 31065
    
    * Fix typo of selected_npus (#31230)
    
    * merge 31249
    
    * [NPU] Support npu op pow and pow grad (#31247)
    
    * [NPU] Support npu op: (1) pow (2) pow_grad
    
    * Support fp16
    
    * Fix pow npu fp16 test (#31256)
    
    * support list of list attribute for NPU (#31299)
    
    * support list of list attribute for NPU
    
    * fix compile problem
    
    * fix reference
    
    * [NPU] Support npu op: (1) slice (2) slice_grad (#31275)
    
    * fix reading flags from env (#31329)
    
    * merge 31347
    
    * [NPU] Support npu op layer_norm and layer_norm_grad (#31310)
    
    * init commit, add layer_norm npu kernel
    
    * fix typo
    
    * add unittest
    
    * add unittest
    
    * fix bug
    
    * fix bug
    
    * refine ut
    
    * [NPU] add npu kernel for equal op (#31393)
    
    * add npu kernel for equal op
    
    * refine code
    
    * add more ut
    
    * update year
    
    * [NPU] Support npu kernel for shape op  (#31427)
    
    * add shape npu
    
    * fix
    
    * fix
    
    * fix endif (#31431)
    
    * Fix pow, use fillD instead of broadcast (#31433)
    
    * Fix pow, refine code (#31440)
    
    * fix cmake of cryptopp to avoid downloading every time (#31451)
    
    * [NPU] squeeze and unsqueeze op for ascend (#31452)
    Co-authored-by: root <xiayanming@baidu.com>
    
    * Support npu kernel for gather op (#31458)
    
    * add gather npu op
    
    * code review done
    
    * update python new line
    
    * precommit
    
    * fix review
    
    * del commit
    
    * 【NPU】add scale op for npu (#31499)
    
    * add scale npu
    
    * fix
    
    * fix
    
    * Support TensorFromVector, TensorToVector of bool type (#31518)
    
    * support TensorFromVector, TensorToVector of bool type
    
    * add ut
    
    * fix compile problem
    
    * 【NPU】support npu kernel for fill_constant op (#31521)
    
    * add fill_constant npu
    
    * add fill_constant npu
    
    * fix
    
    * cherry-pick 31422, solve conflict
    
    * 【NPU】Support npu kernel for matmul op (#31544)
    
    * add matmulv2_npu
    
    * add matmul
    
    * add matmul
    
    * [NPU] Support npu op elementwise_mul and elementwise_mul_grad (#31571)
    
    * [NPU] Support npu op elementwise_max (#31574)
    
    * 【NPU】add relu op for  npu (#31515)
    
    * add relu npu
    
    * fixed
    
    * fix
    
    * 【NPU】Support npu kernel for reshape2 op (#31524)
    
    * add reshape2 npu
    
    * add reshape2
    
    * [NPU] Support npu kernel for gather op fix bug (#31541)
    
    * add gather npu op
    
    * code review done
    
    * update python new line
    
    * precommit
    
    * fix review
    
    * del commit
    
    * update gather_grad
    
    * fix bug
    
    * fix bug
    
    * [NPU] Support npu kernel for amp_check_finite_and_unscale_npu op (#31457)
    
    * Support npu kernel for amp_check_finite_and_unscale_npu op
    
    * support EnforceNotMet exception
    
    * fix exception bug
    
    * modify python unittest
    
    * precommit
    
    * update c++ unittest
    
    * fix review
    
    * fix review
    
    * [NPU] accuracy op (#31492)
    
    * accuracy op
    
    * fix license
    
    * fix
    
    * add test and fix bug
    
    * [NPU] add Assign OP (#31561)
    
    * add assign op
    
    * add test assign npu test
    
    * delete ifdef
    Co-authored-by: oyjxer <1728722986@qq.com>
    
    * [NPU] fix npu op elementwise_mul_grad (#31592)
    
    * 【NPU】Support npu op gelu and gelu_grad (#31530)
    
    * Support npu op gelu and gelu_grad
    
    * Support npu op gelu and gelu_grad
    
    * [NPU] fix assign cmake (#31595)
    
    * fix gather_grad bug (#31607)
    
    * [NPU] add range op (#31560)
    
    * add range op
    
    * fix codestyle; call GetSize directly
    Co-authored-by: oyjxer <1728722986@qq.com>
    
    * 【NPU】Support npu op elementwise_div and elementwise_div_grad (#31573)
    
    * Support npu op elementwise_div and elementwise_div_grad
    
    * Support npu op elementwise_div and elementwise_div_grad
    
    * Support npu op elementwise_div and elementwise_div_grad
    
    * [NPU] Support npu op log, log_grad, sqrt, sqrt_grad, square, tanh and tanh_grad (#31600)
    
    * [NPU] Support npu op logicalnot_op (#31534)
    
    * [NPU] Support npu op elementwise_min (#31575)
    
    * [NPU] Support npu op elementwise_pow (#31576)
    
    * [NPU] Support npu op table_lookup_v2 and table_lookup_v2_grad (#31399)
    
    * [npu] support npu kernel `table_lookup_v2`
    
    * clean up
    
    * +python test
    
    * +cmake
    
    * clean up
    
    * remove int8 kernel
    + python unittest for fp16
    
    * clean up
    
    * [NPU] support npu kernel for `less_than` (#31327)
    
    * [npu] support npu kernel for `less than`
    
    * remove int* kernel
    
    * cleanup
    
    * [NPU] Support npu kernel scatter op (#31624)
    
    * Support npu kernel scatter op
    
    * Add more test
    
    * [NPU] fix allocator min chunk size (#31632)
    
    * [NPU] Support NPU kernel cast op (#31635)
    Co-authored-by: frankwhzhang <frankwhzhang@126.com>
    
    * [NPU] add npu kernel for sgd (#31639)
    
    * 【NPU】Support NPU kernel for reduce_sum op v2 (#31620)
    
    * add reduce_sum
    
    * fix broadcastd
    
    * fix test
    
    * fix
    
    * add unsqueeze in reduce_sum
    
    * add template
    
    * add unittest for keep_dim
    
    * test reduce_all
    Co-authored-by: frankwhzhang <frankwhzhang@126.com>
    
    * [NPU] add npu kernel for adam (#31644)
    
    * add npu kernel for adam
    
    * refine code
    
    * disable test
    
    * modify atol
    
    * 【NPU】Support npu kernel for mul op (#31584)
    
    * add mul
    
    * add test mul
    
    * [NPU] add npu kernel for softmax_with_cross_entropy (#31656)
    
    * init
    
    * fix bugs
    
    * [NPU] add npu kernel for mean Op (#31562)
    
    * update mean op
    
    * update mean op
    
    * give a better test activation
    Co-authored-by: oyjxer <1728722986@qq.com>
    
    * Revert "[NPU] add npu kernel for mean Op (#31562)" (#31665)
    
    This reverts commit 468ac699.
    
    * 【NPU】Add TensorCopy to NPU kernel for reduce_sum op  (#31667)
    
    * update unittest
    
    * add TensorCopy in npu grad kernel
    
    * [NPU] Support npu op `expand` (#31405)
    
    * [npu] support npu kernel  for `expand`
    
    * [NPU] fix shape of dx in mul_grad (#31675)
    
    * fix shape of dx
    
    * refine code
    
    * [NPU] add Increment op (#31563)
    
    * add increment
    
    * fix
    
    * update test increment op inplace
    
    * update increment op
    
    * increment b = 2
    Co-authored-by: oyjxer <1728722986@qq.com>
    
    * [NPU] add NPU add topk  (#31596)
    
    * add topk op
    
    * add cmake
    
    * update topk npu op
    
    * refactor func
    
    * fix test not go npu TopKD bug
    
    * NPUPlace(4) to NPUPlace(0)
    
    * update comment
    Co-authored-by: oyjxer <1728722986@qq.com>
    
    * [NPU] Support NPU kernel sum op (#31671)
    
    * [NPU] npu support `transpose` (#31486)
    
    * cherry-pick 31564, solve conflict
    
    * [NPU] Fix bug: Fix calculation errors of pow grad npu kernel (#31699)
    
    * [NPU] Support testing grad of NPU ops in OpTest (#31697)
    
    * [NPU] Support NPU kernel of stack op (#31711)
    
    * [NPU] Remove redundant ctest of top_k_op_npu_test (#31718)
    
    * [NPU] fix reshape npu op kernel (#31726)
    
    * rename npu op file
    
    * fix reshape
    
    * [NPU] change transpose to transpose2 (#31734)
    
    * change transpose to transpose2
    
    * fix bug
    
    * [NPU] Support  mean npu kernel (#31729)
    
    * [NPU] fix some bugs of npu op (#31739)
    
    * fix softmax
    
    * fix mean
    
    * fix lookup_table_v2
    
    * 【NPU】Fix npu kernel elementwise_div_grad  (#31753)
    
    * [NPU] fix the grad kernel diff bug of gather op (#31757)
    
    * fix gather grad kernel diff
    
    * fix gather grad kernel diff
    
    * fix gather review bug
    
    * 【NPU】Fix reshape test & add grad test (#31776)
    
    * fix
    
    * fix
    
    * [NPU] support fp16 for npu accuracy op (#31797)
    
    * [NPU] support list of tensor input (#31801)
    
    * support list of tensor as npu input
    
    * add comment
    
    * fix typo
    
    * fix typo
    
    * [NPU] add npu kernel for concat op (#31695)
    
    * add npu kernel for concat op
    
    * add npu kernel for concat op
    
    * refine code
    
    * update
    
    * refine concat_grad
    
    * [NPU] Support npu kernel for op elementwise_floordiv (#31822)
    
    * [NPU] fix bug of lookup_table_v2_grad (#31834)
    
    * [NPU] support default stream (#31510)
    
    * [NPU] support mixed precision input for npu layer norm (#31847)
    
    * support mixed precision input for npu layer norm
    
    * fix layer_norm npu kernel
    Co-authored-by: zhiqiu <chenqiuliang@baidu.com>
    
    * 【NPU】Support npu kernel for update_loss_scaling op (#31830)
    
    * add update_loss_scaling_npu NPU kernel
    
    * change TensorFromVec to Memset
    
    * fix compile problem (#31850)
    
    * [NPU] support npu for conditional_block op (#31854)
    
    * 【NPU】Add int dtype kernel for reshape2 op (#31864)
    
    * fix
    
    * fix
    
    * [NPU] fix some op bugs (#31855)
    
    * fix some op bugs
    
    * fix some bugs
    
    * follow comments
    
    * fix log level
    
    * add ut
    
    * [NPU] support fp16 of input for api pow (#31871)
    
    * [NPU] add npu kernel for truncated_gaussian_random op (#31654)
    
    * init
    
    * add todo
    
    * add npu kernel for truncated_gaussian_random
    
    * add sync
    
    * fix concat_grad
    
    * fix typo
    
    * fix compile
    
    * fix compile
    
    * fix compile
    
    * fix compile
    
    * fix compile
    
    * fix compile
    
    * fix code style
    
    * fix code style
    
    * fix code
    
    * Fix op test (#32231)
    
    * fix conditional block (#32243)
    
    * fix style code
    Co-authored-by: xiayanming <41795079@qq.com>
    Co-authored-by: Leo Chen <chenqiuliang@baidu.com>
    Co-authored-by: liym27 <33742067+liym27@users.noreply.github.com>
    Co-authored-by: Reventon_L <luyuxiang1994@qq.com>
    Co-authored-by: root <xiayanming@baidu.com>
    Co-authored-by: oyjxer <1728722986@qq.com>
    Co-authored-by: yinhaofeng <66763551+yinhaofeng@users.noreply.github.com>
    Co-authored-by: OleNet <olenet@126.com>
    Co-authored-by: Meiyim <chen_xuyi@outlook.com>
    Co-authored-by: oyxuan-11 <963650125@qq.com>
    Co-authored-by: pangyoki <pangyoki@126.com>
elementwise_mul_op_npu.cc 2.9 KB