1. 24 Jul, 2019 8 commits
    • B
      Extend Matmul to support matrix multiplication with multiple heads (#18570) · 220eef60
      Bob Zhu committed
      * extend matmul op to support multiple head multiplication
      
      With multi-head support, the multiplication of two large matrices is
      split into head_number multiplications of smaller matrices. For example, if
      Mat A is [3, 24] and Mat B is [24, 4], then multiplying A and B with
      head_number = 4 splits Mat A into 4 matrices of [3, 6] and Mat B into 4
      matrices of [6, 4]. The result is 4 matrices of [3, 4], i.e. a final matrix of [3, 16].
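
      The shape arithmetic in this commit message can be sketched in plain Python.
      This is only an illustration, not the operator's actual kernel: the helper
      names `matmul` and `multi_head_matmul` are hypothetical, and splitting A by
      columns, B by rows, and concatenating the per-head results along the column
      axis is an assumption inferred from the shapes quoted above.

      ```python
      def matmul(a, b):
          """Plain matrix product of two lists of rows."""
          return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
                  for row in a]


      def multi_head_matmul(a, b, head_number):
          """Split the shared dimension K into head_number chunks, multiply each
          pair of chunks, and concatenate the per-head results column-wise
          (assumed layout)."""
          k = len(a[0])
          assert k == len(b), "inner dimensions must match"
          assert k % head_number == 0, "K must be divisible by head_number"
          step = k // head_number
          out = [[] for _ in a]
          for h in range(head_number):
              lo, hi = h * step, (h + 1) * step
              a_h = [row[lo:hi] for row in a]  # [M, K / head_number]
              b_h = b[lo:hi]                   # [K / head_number, N]
              c_h = matmul(a_h, b_h)           # [M, N]
              for out_row, c_row in zip(out, c_h):
                  out_row.extend(c_row)
          return out                           # [M, head_number * N]


      # Shapes from the commit message: A is [3, 24], B is [24, 4], 4 heads.
      a = [[float(i * 24 + j) for j in range(24)] for i in range(3)]
      b = [[float(i * 4 + j) for j in range(4)] for i in range(24)]
      c = multi_head_matmul(a, b, 4)
      print(len(c), len(c[0]))
      ```

      Under this layout, adding the four [3, 4] blocks together gives the ordinary
      product of A and B, which is one way to check that the split runs along the
      shared dimension.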
    • W
      Add python API for appending LoD level (#18702) · 075e1cf7
      whs committed
      * Make lod reset op support appending a LoD level.
      
      * Fix API.spec
      test=develop
      
      * Fix unit test.
      test=develop
      
      * Add python api for lod append.
      test=develop
      
      * Fix API.spec
      test=develop
      
      * Fix format of doc.
      test=develop
      
      * Fix unit test.
      test=develop
      
      * Fix doc.
      test=develop
    • T
      remove package.cmake (#18760) · 8de5aa1b
      Tao Luo committed
      test=develop
    • C
      Enhance backward process (#18700) · 8259f141
      chengduo committed
      * prune backward ops
      test=develop
    • J
      Modify auc doc. Add output variable description, previously was the scalar... · 25c9b57b
      JesseyXujin committed
      Modify auc doc. Add an output variable description: the output was previously a scalar type and is now a tuple type. test=develop (#18771)
      
    • Z
      Update trt5 for paddle-trt (#18645) · 26ae6d49
      Zhaolong Xing committed
      * update paddle-trt for:
          1. fix bug: core dump in the split plugin when batch > 2.
          2. add leaky_relu trt5.0 support (yolov3 goes from 65 ms to 42 ms).
          3. add a new attr to dropout.
          4. add shuffle channel, swish, and relu6 support
          test=develop
      
      * 1. fix ci
      test=develop
    • T
      add slot to sparse table (#18686) · d8396281
      Thunderbrook committed
      The change includes 2 things:

      1. Saving the delta model and shrinking the table were previously controlled by the same parameter; delete_after_unseen_days is now added to control table shrinking.
      2. Values in the sparse table previously had no slot; a slot is now added to the sparse table, and DownpureCtrAccessor is added to support the new meta.
      test=develop
    • X
      modify install GPU 97 (#18768) · f0cfc3c3
      xsrobin committed
      * modify install GPU97
      
      * modify install GPU97
  2. 23 Jul, 2019 6 commits
    • J
      [MKL-DNN] Extended LRN with reusing via Acquire API (#18675) · 95c1816e
      Jacek Czaja committed
      test=develop
      
      - compilation fix
      
      - Yet another compilation fix
      
      - Even yet another compilation fix
      
      - Surprise! Again compilation fix
      
      - lint fixes
      
      test=develop
      
      - Fix to workspace acquire of LRN
      
      test=develop
      
      - Fix to hash of BWD LRN
      
      test=develop
      
      - fix to lrn BWD PD acquire
      
      test=develop
      
      - Fixing LRN PD creation
      
      test=develop
      
      - cosmetic fix in comment
      
      test=develop
      
      - Fixes after review
      
      test=develop
    • T
      remove unused cmake file (#18744) · 0ae45f0b
      Tao Luo committed
      test=develop
    • J
      support patch data, add load_one_table, fix bug (#18509) · d18aabb4
      jiaqi committed
      (1) support patch data (merge slots of instances with the same line id, modify dense layers
      whose size changes)
      (2) add fleet load_one_table interface, which supports loading from a paddle model and from a pslib model
      (3) fix a push sparse bug that made push sparse take more time (about 10% in my test case)
      (4) when some slots are not in one of your networks (join/update, etc.), data feed, label info collection, and push/pull sparse skip these slots instead of throwing an error.
      (5) add more debug info in TrainFilesWithProfiler
    • C
      Make fuse_optimizer_op_pass also work when the model contains sparse gradients. (#18664) · fd3aad6c
      chengduo committed
      * support sparse gradients
      test=develop
    • W
      Cudnn convolution reconstruction (#18284) · 6b78e00d
      wangchaochaohu committed
      * rewrite the conv_op using cudnn_conv_helper
      
      * add workspace limit for v7 test=develop
      
      * fix test=develop
      
      * add half float test=develop
      
      * fix test=develop
      
      * fix test=develop
      
      * revise code style test=develop
      
      * fix test=develop
    • Y
      supports distributed classification (#18690) · 157211c4
      Yi Liu committed
      * supports distributed classification training
      * update API.spec
      * fix even division in python3
      * change "index_range" to "index_num" in shard_index operator
      test=document_preview
      test=develop
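
      The two fixes about integer division and `index_num` can be illustrated with
      a small sketch. This is not the operator's implementation: the function
      below is hypothetical, and its parameters `nshards`, `shard_id`, and
      `ignore_value`, along with the ceiling-division rule for the shard size, are
      assumptions made for illustration. The point it shows is that Python 3 needs
      `//` to keep the shard size an integer, because `/` returns a float.

      ```python
      def shard_index(values, index_num, nshards, shard_id, ignore_value=-1):
          """Map global indices to shard-local indices (illustrative sketch).

          index_num is the total number of indices. Values that fall outside
          the shard identified by shard_id are replaced with ignore_value.
          """
          # Ceiling division with // stays an int in Python 3; `/` would not.
          shard_size = (index_num + nshards - 1) // nshards
          return [v % shard_size if v // shard_size == shard_id else ignore_value
                  for v in values]


      labels = [1, 6, 12, 19]
      shard0 = shard_index(labels, index_num=20, nshards=2, shard_id=0)
      shard1 = shard_index(labels, index_num=20, nshards=2, shard_id=1)
      print(shard0)  # [1, 6, -1, -1]
      print(shard1)  # [-1, -1, 2, 9]
      ```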
  3. 22 Jul, 2019 11 commits
  4. 20 Jul, 2019 2 commits
  5. 19 Jul, 2019 5 commits
  6. 18 Jul, 2019 6 commits
  7. 17 Jul, 2019 2 commits