1. 15 Apr, 2021 1 commit
    • Z
      【NPU】Cherry-pick ascendrc ops code by 0325 to develop (#32197) · e6bc358d
      zhang wenhui committed
      * merge 31065
      
      * Fix typo of selected_npus (#31230)
      
      * merge 31249
      
      * [NPU] Support npu op pow and pow grad (#31247)
      
      * [NPU] Support npu op: (1) pow (2) pow_grad
      
      * Support fp16
      
      * Fix pow npu fp16 test (#31256)
      
      * support list of list attribute for NPU (#31299)
      
      * support list of list attribute for NPU
      
      * fix compile problem
      
      * fix reference
      
      * [NPU] Support npu op: (1) slice (2) slice_grad (#31275)
      
      * fix reading flags from env (#31329)
      
      * merge 31347
      
      * [NPU] Support npu op layer_norm and layer_norm_grad (#31310)
      
      * init commit, add layer_norm npu kernel
      
      * fix typo
      
      * add unittest
      
      * add unittest
      
      * fix bug
      
      * fix bug
      
      * refine ut
      
      * [NPU] add npu kernel for equal op (#31393)
      
      * add npu kernel for equal op
      
      * refine code
      
      * add more ut
      
      * update year
      
      * [NPU] Support npu kernel for shape op  (#31427)
      
      * add shape npu
      
      * fix
      
      * fix
      
      * fix endif (#31431)
      
      * Fix pow, use fillD instead of broadcast (#31433)
      
      * Fix pow, refine code (#31440)
      
      * fix cmake of cryptopp to avoid downloading every time (#31451)
      
      * [NPU] squeeze and unsqueeze op for ascend (#31452)
      Co-authored-by: root <xiayanming@baidu.com>
      
      * Support npu kernel for gather op (#31458)
      
      * add gather npu op
      
      * code review done
      
      * update python new line
      
      * precommit
      
      * fix review
      
      * del commit
      
      * 【NPU】add scale op for npu (#31499)
      
      * add scale npu
      
      * fix
      
      * fix
      
      * Support TensorFromVector, TensorToVector of bool type (#31518)
      
      * support TensorFromVector, TensorToVector of bool type
      
      * add ut
      
      * fix compile problem
      
      * 【NPU】support npu kernel for fill_constant op (#31521)
      
      * add fill_constant npu
      
      * add fill_constant npu
      
      * fix
      
      * cherry-pick 31422, solve conflict
      
      * 【NPU】Support npu kernel for matmul op (#31544)
      
      * add matmulv2_npu
      
      * add matmul
      
      * add matmul
      
      * [NPU] Support npu op elementwise_mul and elementwise_mul_grad (#31571)
      
      * [NPU] Support npu op elementwise_max (#31574)
      
      * 【NPU】add relu op for  npu (#31515)
      
      * add relu npu
      
      * fixed
      
      * fix
      
      * 【NPU】Support npu kernel for reshape2 op (#31524)
      
      * add reshape2 npu
      
      * add reshape2
      
      * [NPU] Support npu kernel for gather op fix bug (#31541)
      
      * add gather npu op
      
      * code review done
      
      * update python new line
      
      * precommit
      
      * fix review
      
      * del commit
      
      * update gather_grad
      
      * fix bug
      
      * fix bug
      
      * [NPU] Support npu kernel for amp_check_finite_and_unscale_npu op (#31457)
      
      * Support npu kernel for amp_check_finite_and_unscale_npu op
      
      * support EnforceNotMet exception
      
      * fix exception bug
      
      * modify python unittest
      
      * precommit
      
      * update c++ unittest
      
      * fix review
      
      * fix review
      
      * [NPU] accuracy op (#31492)
      
      * accuracy op
      
      * fix license
      
      * fix
      
      * add test and fix bug
      
      * [NPU] add Assign OP (#31561)
      
      * add assign op
      
      * add test assign npu test
      
      * delete ifdef
      Co-authored-by: oyjxer <1728722986@qq.com>
      
      * [NPU] fix npu op elementwise_mul_grad (#31592)
      
      * 【NPU】Support npu op gelu and gelu_grad (#31530)
      
      * Support npu op gelu and gelu_grad
      
      * Support npu op gelu and gelu_grad
      
      * [NPU] fix assign cmake (#31595)
      
      * fix gather_grad bug (#31607)
      
      * [NPU] add range op (#31560)
      
      * add range op
      
      * fix codestyle; call GetSize directly
      Co-authored-by: oyjxer <1728722986@qq.com>
      
      * 【NPU】Support npu op elementwise_div and elementwise_div_grad (#31573)
      
      * Support npu op elementwise_div and elementwise_div_grad
      
      * Support npu op elementwise_div and elementwise_div_grad
      
      * Support npu op elementwise_div and elementwise_div_grad
      
      * [NPU] Support npu op log, log_grad, sqrt, sqrt_grad, square, tanh and tanh_grad (#31600)
      
      * [NPU] Support npu op logicalnot_op (#31534)
      
      * [NPU] Support npu op elementwise_min (#31575)
      
      * [NPU] Support npu op elementwise_pow (#31576)
      
      * [NPU] Support npu op table_lookup_v2 and table_lookup_v2_grad (#31399)
      
      * [npu] support npu kernel `table_lookup_v2`
      
      * clean up
      
      * +python test
      
      * +cmake
      
      * clean up
      
      * remove int8 kernel
      + python unittest for fp16
      
      * clean up
      
      * [NPU] support npu kernel for `less_than` (#31327)
      
      * [npu] support npu kernel for `less than`
      
      * remove int* kernel
      
      * cleanup
      
      * [NPU] Support npu kernel scatter op (#31624)
      
      * Support npu kernel scatter op
      
      * Add more test
      
      * [NPU] fix allocator min chunk size (#31632)
      
      * [NPU] Support NPU kernel cast op (#31635)
      Co-authored-by: frankwhzhang <frankwhzhang@126.com>
      
      * [NPU] add npu kernel for sgd (#31639)
      
      * 【NPU】Support NPU kernel for reduce_sum op v2 (#31620)
      
      * add reduce_sum
      
      * fix broadcastd
      
      * fix test
      
      * fix
      
      * add unsqueeze in reduce_sum
      
      * add template
      
      * add unittest for keep_dim
      
      * test reduce_all
      Co-authored-by: frankwhzhang <frankwhzhang@126.com>
      
      * [NPU] add npu kernel for adam (#31644)
      
      * add npu kernel for adam
      
      * refine code
      
      * disable test
      
      * modify atol
      
      * 【NPU】Support npu kernel for mul op (#31584)
      
      * add mul
      
      * add test mul
      
      * [NPU] add npu kernel for softmax_with_cross_entropy (#31656)
      
      * init
      
      * fix bugs
      
      * [NPU] add npu kernel for mean Op (#31562)
      
      * update mean op
      
      * update mean op
      
      * give a better test activation
      Co-authored-by: oyjxer <1728722986@qq.com>
      
      * Revert "[NPU] add npu kernel for mean Op (#31562)" (#31665)
      
      This reverts commit 468ac699.
      
      * 【NPU】Add TensorCopy to NPU kernel for reduce_sum op  (#31667)
      
      * update unittest
      
      * add TensorCopy in npu grad kernel
      
      * [NPU] Support npu op `expand` (#31405)
      
      * [npu] support npu kernel  for `expand`
      
      * [NPU] fix shape of dx in mul_grad (#31675)
      
      * fix shape of dx
      
      * refine code
      
      * [NPU] add Increment op (#31563)
      
      * add increment
      
      * fix
      
      * update test increment op inplace
      
      * update increment op
      
      * increment b = 2
      Co-authored-by: oyjxer <1728722986@qq.com>
      
      * [NPU] add NPU add topk  (#31596)
      
      * add topk op
      
      * add cmake
      
      * update topk npu op
      
      * refactor func
      
      * fix test not go npu TopKD bug
      
      * NPUPlace(4) to NPUPlace(0)
      
      * update comment
      Co-authored-by: oyjxer <1728722986@qq.com>
      
      * [NPU] Support NPU kernel sum op (#31671)
      
      * [NPU] npu support `transpose` (#31486)
      
      * cherry-pick 31564, solve conflict
      
      * [NPU] Fix bug: Fix calculation errors of pow grad npu kernel (#31699)
      
      * [NPU] Support testing grad of NPU ops in OpTest (#31697)
      
      * [NPU] Support NPU kernel of stack op (#31711)
      
      * [NPU] Remove redundant ctest of top_k_op_npu_test (#31718)
      
      * [NPU] fix reshape npu op kernel (#31726)
      
      * rename npu op file
      
      * fix reshape
      
      * [NPU] change transpose to transpose2 (#31734)
      
      * change transpose to transpose2
      
      * fix bug
      
      * [NPU] Support  mean npu kernel (#31729)
      
      * [NPU] fix some bugs of npu op (#31739)
      
      * fix softmax
      
      * fix mean
      
      * fix lookup_table_v2
      
      * 【NPU】Fix npu kernel elementwise_div_grad  (#31753)
      
      * [NPU] fix the grad kernel diff bug of gather op (#31757)
      
      * fix gather grad kernel diff
      
      * fix gather grad kernel diff
      
      * fix gather review bug
      
      * 【NPU】Fix reshape test & add grad test (#31776)
      
      * fix
      
      * fix
      
      * [NPU] support fp16 for npu accuracy op (#31797)
      
      * [NPU] support list of tensor input (#31801)
      
      * support list of tensor as npu input
      
      * add comment
      
      * fix typo
      
      * fix typo
      
      * [NPU] add npu kernel for concat op (#31695)
      
      * add npu kernel for concat op
      
      * add npu kernel for concat op
      
      * refine code
      
      * update
      
      * refine concat_grad
      
      * [NPU] Support npu kernel for op elementwise_floordiv (#31822)
      
      * [NPU] fix bug of lookup_table_v2_grad (#31834)
      
      * [NPU] support default stream (#31510)
      
      * [NPU] support mixed precision input for npu layer norm (#31847)
      
      * support mixed precision input for npu layer norm
      
      * fix layer_norm npu kernel
      Co-authored-by: zhiqiu <chenqiuliang@baidu.com>
      
      * 【NPU】Support npu kernel for update_loss_scaling op (#31830)
      
      * add update_loss_scaling_npu NPU kernel
      
      * change TensorFromVec to Memset
      
      * fix compile problem (#31850)
      
      * [NPU] support npu for conditional_block op (#31854)
      
      * 【NPU】Add int dtype kernel for reshape2 op (#31864)
      
      * fix
      
      * fix
      
      * [NPU] fix some op bugs (#31855)
      
      * fix some op bugs
      
      * fix some bugs
      
      * follow comments
      
      * fix log level
      
      * add ut
      
      * [NPU] support fp16 of input for api pow (#31871)
      
      * [NPU] add npu kernel for truncated_gaussian_random op (#31654)
      
      * init
      
      * add todo
      
      * add npu kernel for truncated_gaussian_random
      
      * add sync
      
      * fix concat_grad
      
      * fix typo
      
      * fix compile
      
      * fix compile
      
      * fix compile
      
      * fix compile
      
      * fix compile
      
      * fix compile
      
      * fix code style
      
      * fix code style
      
      * fix code
      
      * Fix op test (#32231)
      
      * fix conditional block (#32243)
      
      * fix style code
      Co-authored-by: xiayanming <41795079@qq.com>
      Co-authored-by: Leo Chen <chenqiuliang@baidu.com>
      Co-authored-by: liym27 <33742067+liym27@users.noreply.github.com>
      Co-authored-by: Reventon_L <luyuxiang1994@qq.com>
      Co-authored-by: root <xiayanming@baidu.com>
      Co-authored-by: oyjxer <1728722986@qq.com>
      Co-authored-by: yinhaofeng <66763551+yinhaofeng@users.noreply.github.com>
      Co-authored-by: OleNet <olenet@126.com>
      Co-authored-by: Meiyim <chen_xuyi@outlook.com>
      Co-authored-by: oyxuan-11 <963650125@qq.com>
      Co-authored-by: pangyoki <pangyoki@126.com>
      e6bc358d
  2. 14 Apr, 2021 10 commits
  3. 13 Apr, 2021 4 commits
  4. 12 Apr, 2021 4 commits
  5. 10 Apr, 2021 1 commit
  6. 09 Apr, 2021 5 commits
  7. 08 Apr, 2021 3 commits
  8. 07 Apr, 2021 7 commits
    • D
      add uint8 type for flatten op (#32120) · 297290a8
      danleifeng committed
      * add uint8 type for flatten;test=develop
      297290a8
    • S
      move graph files (#32103) · 4935b8e7
      seemingwang committed
      * graph engine demo
      
      * upload unsaved changes
      
      * fix dependency error
      
      * fix shard_num problem
      
      * py client
      
      * remove lock and graph-type
      
      * add load direct graph
      
      * add load direct graph
      
      * add load direct graph
      
      * batch random_sample
      
      * batch_sample_k
      
      * fix num_nodes size
      
      * batch brpc
      
      * batch brpc
      
      * add test
      
      * add test
      
      * add load_nodes; change add_node function
      
      * change sample return type to pair
      
      * resolve conflict
      
      * resolved conflict
      
      * resolved conflict
      
      * separate server and client
      
      * merge pair type
      
      * fix
      
      * resolved conflict
      
      * fixed segment fault; high-level VLOG for load edges and load nodes
      
      * random_sample return 0
      
      * rm useless loop
      
      * test:load edge
      
      * fix ret -1
      
      * test: rm sample
      
      * rm sample
      
      * random_sample return future
      
      * random_sample return int
      
      * test fake node
      
      * fixed here
      
      * memory leak
      
      * remove test code
      
      * fix return problem
      
      * add common_graph_table
      
      * random sample node &test & change data-structure from linkedList to vector
      
      * add common_graph_table
      
      * sample with srand
      
      * add node_types
      
      * optimize nodes sample
      
      * recover test
      
      * random sample
      
      * destruct weighted sampler
      
      * GraphEdgeBlob
      
      * WeightedGraphEdgeBlob to GraphEdgeBlob
      
      * WeightedGraphEdgeBlob to GraphEdgeBlob
      
      * pybind sample nodes api
      
      * pull nodes with step
      
      * fixed pull_graph_list bug; add test for pull_graph_list by step
      
      * add graph table;name
      
      * add graph table;name
      
      * add pybind
      
      * add pybind
      
      * add FeatureNode
      
      * add FeatureNode
      
      * add FeatureNode Serialize
      
      * add FeatureNode Serialize
      
      * get_feat_node
      
      * avoid local rpc
      
      * fix get_node_feat
      
      * fix get_node_feat
      
      * remove log
      
      * get_node_feat return  py:bytes
      
      * merge develop with graph_engine
      
      * fix threadpool.h head
      
      * fix
      
      * fix typo
      
      * resolve conflict
      
      * fix conflict
      
      * recover lost content
      
      * fix pybind of FeatureNode
      
      * recover cmake
      
      * recover tools
      
      * resolve conflict
      
      * resolve linking problem
      
      * code style
      
      * change test_server port
      
      * fix code problems
      
      * remove shard_num config
      
      * remove redundant threads
      
      * optimize start server
      
      * remove logs
      
      * fix code problems by reviewers' suggestions
      
      * move graph files into a folder
      
      * code style change
      
      * remove graph operations from base table
      Co-authored-by: Huang Zhengjie <270018958@qq.com>
      Co-authored-by: Weiyue Su <weiyue.su@gmail.com>
      Co-authored-by: suweiyue <suweiyue@baidu.com>
      Co-authored-by: luobin06 <luobin06@baidu.com>
      Co-authored-by: liweibin02 <liweibin02@baidu.com>
      Co-authored-by: tangwei12 <tangwei12@baidu.com>
      4935b8e7
    • F
      bugfix for unit test test_segment_ops (#32116) · d91faf29
      furnace committed
      d91faf29
    • Z
      【NPU】Merge ascend GE&distributed code by 0208 from ascendrc (#31957) · 8c7c53b3
      zhang wenhui committed
      * Ascend rc (#30483)
      
      * Fix compilation on CANN20.1 and older (#30494)
      
      Fix compilation on CANN20.1 and older
      
      * Add distribution supported (#30578)
      
      Add distribution supported
      
      * Build parser for Hcom* operators (#30627)
      
      Build parser for Hcom* operators
      
      * Pass device_ids info from launch to trainer. (#30632)
      
      Pass device_ids info from launch to trainer
      
      * Add Hccl program group (#30642)
      
      Add Hccl program group
      
      * Add startup bash files of test_ascend_group. (#30645)
      
      Add startup bash files of test_ascend_group
      
      * cleanup (#30646)
      
      cleanup test_ascend_group.py
      
      * [Feature] Build parser to support distributed training (#30658)
      
      [Feature] Build parser to support distributed training
      
      * fix compilation on ascend-20.1 (#30722)
      
      fix compilation on ascend-20.1
      
      * Dev/fix ascend string (#30749)
      
      Dev/fix ascend string
      
      * code style (#30781)
      
      code style
      
      * Merge ascend_optimizer and ascend_parser. (#30776)
      
      Merge ascend_optimizer and ascend_parser.
      
      * Ascendrc add converted op : [range/equal/range/uniform_random/expand/squeeze], fix cast op bug  (#30797)
      
      Ascendrc add converted op : [range/equal/range/uniform_random/expand/squeeze], fix cast op bug
      
      * Add paddle ascend distribution training supported (#30796)
      
      Add paddle ascend distribution training supported
      
      * pass cxx_flags to gloo cmake (#30857)
      
      * Destroy session first. (#30954)
      
      Destroy session first.
      
      * merge
      
      * fix, test=develop
      
      * fix, test=develop
      
      * fix style, test=develop
      
      * fix, test=develop
      
      * fix
      
      * fix log fatal, test=develop
      
      * fix enforce style, test=develop
      
      * fix, test=develop
      
      * fix, test=develop
      
      * fix rccl, test=develop
      
      * fix test, test=develop
      
      * fix, test=develop
      
      * fix, test=develop
      
      * fix, test=develop
      
      * fix node_num, test=develop
      
      * fix ids str, test=develop
      
      * fix ids str, test=develop
      
      * fix ids str, test=develop
      
      * fix, test=develop
      
      * fix, test=develop
      
      * fix, test=develop
      
      * fix, test=develop
      
      * fix, test=develop
      
      * fix, test=develop
      
      * fix, test=develop
      
      * fix, test=develop
      
      * fix style code, test=develop
      
      * fix style code, test=develop
      
      * fix style code, test=develop
      
      * fix style code, test=develop
      Co-authored-by: hutuxian <hutuxian2011@sina.cn>
      Co-authored-by: gongweibao <weibao.gong@gmail.com>
      Co-authored-by: Void Main <voidmain1313113@gmail.com>
      Co-authored-by: Leo Chen <chenqiuliang@baidu.com>
      Co-authored-by: dingsiyu <18369187719@163.com>
      Co-authored-by: OleNet <olenet@126.com>
      8c7c53b3
    • J
      [3D-parallelism] Hybrid Model Parallelism (#32074) · 1e60a0c4
      JZ-LIANG committed
      1e60a0c4
    • O
      improve performance of DepthwiseConv(NHWC) (#31677) · 363b25aa
      Ouyang Chao committed
      * improve performance of DepthwiseConv(NHWC)
      363b25aa
    • T
      Struct SparseValue && Bug Fix (#31721) · a881b4d5
      tangwei12 committed
      * add PullSparseValue for pull sparse
      
      * fix bug for PullSparseValue
      
      * add test mode in lookuptable
      
      * revert API change
      
      * add comment for is_training
      a881b4d5
  9. 06 Apr, 2021 5 commits