- 08 Oct, 2018 1 commit
-
-
Committed by Sylwester Fraczek
review fix review from hshen14 fix test=develop fix error in broadcast and code cleanup rename bias -> eltwise and added macro to shorten code formatting
-
- 30 Sep, 2018 5 commits
-
-
Committed by sneaxiy
-
Committed by tensor-tang
test=develop
-
Committed by Tao Luo
test=develop
-
Committed by dzhwinter
* "fix compile error" * "fix ci" * rerun ci test=develop * test=develop rerun ci
-
- 29 Sep, 2018 11 commits
-
-
Committed by luotao1
test=develop
-
Committed by wangguibao
test=develop
-
Committed by chengduo
test=develop
-
Committed by luotao1
test=develop
-
Committed by luotao1
-
Committed by Xin Pan
test=develop
-
Committed by Xin Pan
test=develop
-
Committed by Xin Pan
test=develop
-
Committed by Dun
* refine reduce by cub * optimize KernelDepthwiseConvFilterGrad * optimize depthwise conv and reduce mean and reduce sum * fix bug: dilation * cuda arch and cuda 8 compatible
-
Committed by Xin Pan
test=develop
-
Committed by Xin Pan
test=develop
-
- 28 Sep, 2018 10 commits
-
-
Committed by Xin Pan
Add API.spec test=develop
-
Committed by Tao Luo
test=develop
-
Committed by Jacek Czaja
test=develop
-
Committed by JiabinYang
-
Committed by dzhwinter
* flags * "follow comment"
-
Committed by Jacek Czaja
test=develop
-
Committed by Xin Pan
Scope's API modifies its internal state, and it can be called from multiple threads during training. Hence, we need locks to protect the scope's internal state. We can optimize it in the future, but the current solution is buggy. test=develop
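The locking idea in this commit message can be sketched roughly as follows. This is a minimal illustration in Python, not the code from the commit: the `Scope` class, its `var` and `local_var_names` methods, and the `create_from_threads` helper are hypothetical stand-ins, and one coarse lock is assumed to guard all of the scope's state.

```python
import threading


class Scope:
    """Hypothetical stand-in for a variable scope (not the project's real class)."""

    def __init__(self):
        self._vars = {}
        # One coarse lock guards every read and write of the internal dict.
        self._lock = threading.Lock()

    def var(self, name):
        # Find-or-create mutates internal state, so it runs under the lock.
        with self._lock:
            if name not in self._vars:
                self._vars[name] = object()
            return self._vars[name]

    def local_var_names(self):
        # Reads also take the lock so they never observe a half-updated dict.
        with self._lock:
            return sorted(self._vars)


def create_from_threads(scope, names, num_threads=8):
    """Call scope.var() for every name from several threads at once."""
    results = [None] * num_threads

    def worker(index):
        results[index] = [scope.var(name) for name in names]

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

With the lock in place, every thread that asks for the same name gets the same object back. A single coarse lock is the simplest correct choice and leaves room for the later optimization the message mentions.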
-
Committed by Wu Yi
* show detail error log on ci * test * fix memopt and dist * update apispec * will fix different batch issue test=develop
-
Committed by Yan Chunwei
- add naive executor - fix concurrency performance issue
-
Committed by Dang Qingqing
test=develop
-
- 27 Sep, 2018 13 commits
-
-
Committed by chengduo
* add GraphNum test=develop * add graph number check in parallelExecutor test=develop * fix transformer_model bug test=develop * fix graph num
-
Committed by Jacek Czaja
extended test_text_classification to use new op
-
Committed by minqiyang
test=develop
-
Committed by chengduo
test=develop
-
Committed by Jacek Czaja
-
Committed by Jacek Czaja
- Added draft of new operator - Added fused embedding fc lstm files - First time embedding_fc_lstm_fuse_pass was invoked in test_text_classification - Added Embedding pattern - Not crashing - Enabled draft of embedding_fc_lstm pass (does its job) - First working (Seqcompute only) version - Removed diagnostic comment - First enabling of BatchCompute - Disabling pass for embedding with is_sparse and is_distributed - Cosmetics - Style - Style
-
Committed by chengduo
-
Committed by qingqing01
* Add CUDA implementation for generate_proposals_op. * Clean code. * Update code.
-
Committed by wanghaoshuang
-
Committed by wanghaoshuang
-
Committed by Yan Chunwei
-
Committed by sneaxiy
-
Committed by tangwei12
* add dist ut for text_classification * add dist ut for text_classification * add simnet bow unittest * add dist ut for simnet bow * add training data url for simnet bow * add training data url for simnet bow * modify simnet test_reader to train reader * add test_dist_ctr * test_dist_ctr can run now * dense update is good * add unit test for selected rows * debug unit test * fix dist sparse update problem * Constant args at init * optimize code * simnet optimize * fix DebugStringEx * optimize sum_op.h * add ScaleOpVarTypeInference * clean code * fix test_dist_transpiler.py * code optimize * modify delta * fix sparse update bug * dist test use one cpu * update some data * remove unused code * add use cuda config * unit test fix * unit test fix * unit test fix * unit test fix * dist_word2vec use CPU * unit test fix * unit test fix * code clean * code clean * merge develop * api spec update * Revert: api spec update * replace simnet data with fake * replace simnet data with fake * update dim * add batch auc * code clean * code clean * modify print to stderr * update simnet delta -> 1e-5 * update RUN_STEP * add use_reader_alloc * add use_reader_alloc * add use_reader_alloc * modify delta * add use_reader_alloc * fix stderr write * python3 compatibility test=develop * python3 compatibility, test=develop * Update dist_text_classification.py * test=develop
-