1. 17 May, 2019 2 commits
  2. 16 May, 2019 2 commits
  3. 15 May, 2019 6 commits
  4. 14 May, 2019 6 commits
  5. 13 May, 2019 4 commits
    • Optimize the computing kernel of sequence_reverse operator (#17349) · 218d8d8f
      Yihua Xu committed
      * Optimize the computing kernel of sequence_reverse operator.
      
      test=develop
      
      * Clean code
      
      test=develop
      
      * Fix for cpplint syntax checking.
      
      test=develop
      
      * Fix the compile warning issue.
      
      test=develop
    • Optimize the elementwise op using eigen (#15494) · dcda2023
      Yiqun Liu committed
      * Optimize the elementwise op with CUDA kernels.
      test=develop
      
      * Support setting of attr in op config file.
      test=develop
      
      * Add support for setting dtype and initializer in config.
      test=develop
      
      * Save workspace.
      
      * Add initializer "zeros".
      test=develop
      
      * Fix compiling error.
      
      * Support using an existing file to initialize a tensor in op_tester.
      
      * Use Eigen to optimize elementwise_add/mul for the case where x and y have the same dims.
      test=develop
    • add double grad for elementwise_mul op (#17255) · 8bae8590
      Kaipeng Deng committed
      * add double grad for elementwise_mul. test=develop
      
      * remove comment. test=develop
      
      * fix grad sum. test=develop
      
      * fix for axis expand. test=develop
      
      * add test for axis expand. test=develop
    • add double grad for square op (#17173) · 11d3a38f
      Kaipeng Deng committed
      * add double grad for square. test=develop
      
      * format code. test=develop
      
      * fix for grad sum. test=develop
      
      * refine shape. test=develop
      
      * refine extract. test=develop
  6. 10 May, 2019 4 commits
  7. 09 May, 2019 2 commits
  8. 08 May, 2019 8 commits
  9. 07 May, 2019 6 commits
    • Enhance inplace/mem-opt pass and enhance softmax_with_cross_entropy op inplace (#17225) · 4f859408
      Zeng Jinle committed
      * add use_cuda to inplace pass,test=develop
      
      * add test softmax_with_xe_inplace test,test=develop
      
      * fix potential inplace bug
      test=develop
      
      * add more skip vars in mem opt pass,test=develop
      
      * follow comment,test=develop
      
      * follow comments,move duplicate out arg check to program->graph,test=develop
    • update softmax with axis arg test=develop (#17190) · e782b54b
      baojun committed
    • Softmax_cross_entropy op add axis (#16806) · a71d8fdb
      Kaipeng Deng committed
      * add attr axis infershape. test=develop
      
      * add CUDA kernel. test=develop
      
      * fix unittest. test=develop
      
      * fix unittest for soft_label. test=develop
      
      * fix fp16 unittest. test=develop
      
      * remove comment code. test=develop
      
      * refine test for axis. test=develop
      
      * add python api. test=develop
      
      * fix doc. test=develop
      
      * fix fp16 unittest. test=develop
      
      * fix ngraph test. test=develop
      
      * fix ENFORCE for test_imperative_transformer. test=develop
      
      * fit for ngraph test. test=develop
      
      * fix after rebase develop. test=develop
      
      * fix doc. test=develop
      
      * fix API.spec. test=develop
      
      * fix test_layers. test=develop
      
      * fix format. test=develop
    • Quant output scale (#17215) · a914d9b1
      Zhen Wang committed
      * Add MovingAverageAbsMaxScale operator which is only used for calculating the quantization scale.
      
      * test=develop
      
      * change the output into inplace. test=develop
      
      * Revert "test=develop"
      
      This reverts commit 696cf626.
      
      * Revert "change the output into inplace. test=develop"
      
      This reverts commit a19acd20.
      
      * test=develop.
      
      * update the MovingAverageAbsMaxScaleOp test. test=develop
    • optimize sum op (#16820) · 32b62c25
      zhaoyuchen2018 committed
      * optimize sum op
      
      fuse multiple Eigen kernel calls into one CUDA kernel.
      refine code
      
      test=develop.
      Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>
      
      * Refine code.
      
      test=develop
      Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>
      
      * Refine code according to comments.
      
      test=develop
      
      * refine code
      
      delete sum_op_gpu.h
      test=develop
      
      * Fix test error.
      
      test=develop
      Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>
      
      * refine code in format.
      
      test=develop.
      
      * refine code
      
      test=develop
      Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>
      
      * refine code
      
      test=develop
      Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>
    • Cherry-pick benchmark related changes from release/1.4 (#17156) · a72dbe9a
      石晓伟 committed
      * cherry-pick commit from 88770542
      
      * cherry-pick commit from 3f0b97df
      
      * cherry-pick from 16691:Anakin subgraph support yolo_v3 and faster-rcnn
      
      (cherry picked from commit 8643dbc2)
      
      * Cherry-Pick from 16662 : Anakin subgraph cpu support
      
      (cherry picked from commit 7ad182e1)
      
      * Cherry-pick from 1662, 16797.. : add anakin int8 support
      
      (cherry picked from commit e14ab180)
      
      * Cherry-pick from 16813 : change singleton to graph RegistBlock
      test=release/1.4
      
      (cherry picked from commit 4b9fa423)
      
      * Cherry Pick : 16837 Support ShuffleNet and MobileNet-v2
      
      Support ShuffleNet and MobileNet-v2, test=release/1.4
      
      (cherry picked from commit a6fb066f)
      
      * Cherry-pick : anakin subgraph add opt config layout argument #16846
      test=release/1.4
      
      (cherry picked from commit 8121b3ec)
      
      * 1. add shuffle_channel_detect
      
      (cherry picked from commit 6efdea89)
      
      * update shuffle_channel op convert, test=release/1.4
      
      (cherry picked from commit e4726a06)
      
      * Modify symbol export rules
      
      test=develop