1. 19 Oct, 2018 1 commit
  2. 28 Sep, 2018 4 commits
    • refactor(op): polish generate_proposals_op · 593ad763
      Yu Yang committed
      Polish styles in generate_proposals_op.
      
      1. inline lambda functions rather than using std::function to save them in variables.
      2. add `static inline` to template functions in .cc files
         * Make them static to prevent generating symbols.
         * Make them inline to give the compiler a hint to inline them where possible.
         * Note: if a function is not static, it cannot be inlined since its
           symbol must be exported.
      3. add `static` to global functions in .cc files
         * Make them static to prevent generating symbols.
      4. Use Vector<uint64> instead of manually managing storage between devices.
      5. Prefer to use platform::ForRange, so we can optimize `ForRange` by
         just changing `for_range.h` if it is needed.
      6. Do not change the shape of inputs
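
A minimal sketch of the style points above, written as a standalone C++ translation unit. `ForEachIndex` and `ScaleBoxes` are hypothetical names invented for illustration; `ForEachIndex` only imitates the idea of a ForRange-style helper and is not the actual `platform::ForRange` API. The lambda is passed straight to a template parameter instead of being stored in a `std::function`, the helpers are marked `static` so they have internal linkage, and the input vector is left unmodified.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical ForRange-style helper (an assumption, not Paddle's API):
// calls fn(i) for every i in [0, n). Taking the callable as a template
// parameter lets the compiler see the lambda body directly; `static inline`
// keeps the instantiation local to this .cc file.
template <typename Functor>
static inline void ForEachIndex(std::size_t n, Functor fn) {
  for (std::size_t i = 0; i < n; ++i) {
    fn(i);
  }
}

// File-local helper: `static` gives it internal linkage, so no symbol is
// exported from this translation unit.
static std::vector<float> ScaleBoxes(const std::vector<float>& boxes,
                                     float scale) {
  // Write into a fresh output buffer; the input is not resized or changed.
  std::vector<float> out(boxes.size());
  ForEachIndex(boxes.size(),
               [&](std::size_t i) { out[i] = boxes[i] * scale; });
  return out;
}
```

Calling `ScaleBoxes({1.0f, 2.0f}, 2.0f)` returns a new vector holding the scaled values. If the loop strategy ever needs tuning, only the hypothetical `ForEachIndex` helper has to change, which is the motivation the commit gives for routing loops through one helper.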
      
      test=develop
    • Fix memory optimization with dist train (#13535) · 7a5f3f75
      Wu Yi committed
      * show detail error log on ci
      
      * test
      
      * fix memopt and dist
      
      * update apispec
      
      * will fix different batch issue test=develop
    • fea/infer executor and concurrency performance issue bug fix (#13451) · c8744d11
      Yan Chunwei committed
      - add naive executor
      - fix concurrency performance issue
    • Update API.spec · f189bf6a
      Dang Qingqing committed
      test=develop
  3. 27 Sep, 2018 8 commits
    • Add GraphChecker (#13580) · 5175b3cb
      chengduo committed
      * add GraphNum
      
      test=develop
      
      * add graph number check in parallelExecutor
      
      test=develop
      
      * fix transformer_model bug
      
      test=develop
      
      * fix graph num
    • refine sgd_op (#13626) · 43a3af86
      chengduo committed
      test=develop
    • add op frequence (#13328) · 4e81e228
      chengduo committed
    • Cuda speed for generate_proposals_op. (#13596) · fd4c4df9
      qingqing01 committed
      * Add CUDA implementation for generate_proposals_op.
      * Clean code.
      * Update code.
    • hide attention lstm fuse (#13615) · 9e8d372f
      Yan Chunwei committed
    • Add distributed unit tests about text_classification/simnet-bow/ctr (#12812) · 97cf1eb6
      tangwei12 committed
      * add dist ut for text_classification
      
      * add dist ut for text_classification
      
      * add simnet bow unittest
      
      * add dist ut for simnet bow
      
      * add training data url for simnet bow
      
      * add training data url for simnet bow
      
      * modify simnet test_reader to train reader
      
      * add test_dist_ctr
      
      * test_dist_ctr can run now
      
      * dense update is good
      
      * add unit test for selected rows
      
      * debug unit test
      
      * fix dist sparse update problem
      
      * Constant args at init
      
      * optimize code
      
      * simnet optimize
      
      * fix DebugStringEx
      
      * optimize sum_op.h
      
      * add ScaleOpVarTypeInference
      
      * clean code
      
      * fix test_dist_transpiler.py
      
      * code optimize
      
      * modify delta
      
      * fix sparse update bug
      
      * dist test use one cpu
      
      * update some data
      
      * remove unused code
      
      * add use cuda config
      
      * unit test fix
      
      * unit test fix
      
      * unit test fix
      
      * unit test fix
      
      * dist_word2vec use CPU
      
      * unit test fix
      
      * unit test fix
      
      * code clean
      
      * code clean
      
      * merge develop
      
      * api spec update
      
      * Revert: api spec update
      
      * replace simnet data with fake
      
      * replace simnet data with fake
      
      * update dim
      
      * add batch auc
      
      * code clean
      
      * code clean
      
      * modify print to stderr
      
      * update simnet delta -> 1e-5
      
      * update RUN_STEP
      
      * add use_reader_alloc
      
      * add use_reader_alloc
      
      * add use_reader_alloc
      
      * modify delta
      
      * add use_reader_alloc
      
      * fix stderr write
      
      * python3 compatibility
      
      test=develop
      
      * python3 compatibility, test=develop
      
      * Update dist_text_classification.py
      
      * test=develop
    • Revert "Some trivial optimization (#13530)" · a4f7696a
      typhoonzero committed
      This reverts commit 1d91a49d.
    • Batch AUC (#13567) · 85362e98
      tangwei12 committed
      * add distributed auc
      
      * add attr "is distributed" and config it
      
      * add distributed auc
      
      * add batch auc and code format
      
      * code format
      
      * auc optimize
      
      * metric_op optimize
      
      * code clean
      
      * bug fix and code clean
      
      * bug fix and code clean
      
      * code optimize
      
      * code optimize
      
      * api spec update
      
      * Comments optimized
      
      * add mutex
      
      * Revert: add mutex
      
      * remove distribute metric
      
      * remove distribute metric
      
      * spec modified
      
      * add annotation, test=develop
      
      * keep API compatibility
      test=develop
  4. 26 Sep, 2018 6 commits
  5. 25 Sep, 2018 21 commits