- 24 Jul, 2019 5 commits
-
-
Committed by Bob Zhu
* Extend the matmul op to support multi-head multiplication. With multi-head support, the multiplication of two large matrices is split into the multiplication of several (head_number) smaller matrices. For example, if Mat A is [3, 24] and Mat B is [24, 4], multiplying A and B with head_number set to 4 splits Mat A into 4 matrices of [3, 6] and Mat B into 4 matrices of [6, 4]. The result is 4 matrices of [3, 4], i.e. [3, 16].
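The shape arithmetic in this commit message can be sketched in plain Python. This is only an illustration, not the operator's actual implementation: the function names `matmul` and `multihead_matmul` are hypothetical, and the layout (A split by columns, B split by rows, per-head results concatenated along columns) is an assumption inferred from the shapes quoted in the message.

```python
def matmul(a, b):
    """Plain matrix product of two row-major nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def multihead_matmul(a, b, head_number):
    """Multiply a [M, K] by b [K, N] head by head.

    Assumption: a is split along its columns and b along its rows into
    head_number blocks, and the per-head [M, N] products are placed side
    by side, giving [M, N * head_number].
    """
    k = len(b)
    if k % head_number != 0:
        raise ValueError("inner dimension must be divisible by head_number")
    step = k // head_number
    out = [[] for _ in a]
    for h in range(head_number):
        lo, hi = h * step, (h + 1) * step
        a_head = [row[lo:hi] for row in a]   # [M, K / head_number]
        b_head = b[lo:hi]                    # [K / head_number, N]
        for out_row, prod_row in zip(out, matmul(a_head, b_head)):
            out_row.extend(prod_row)
    return out


# Shapes from the commit message: A is [3, 24], B is [24, 4], 4 heads.
a = [[i * 24 + j for j in range(24)] for i in range(3)]
b = [[i * 4 + j for j in range(4)] for i in range(24)]
out = multihead_matmul(a, b, head_number=4)
print(len(out), len(out[0]))  # 3 16
```

Under this layout, summing the four [3, 4] blocks element-wise reproduces the ordinary product of A and B, which is a convenient sanity check.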
-
Committed by whs
* Make the lod reset op support appending a LoD level.
* Fix API.spec test=develop
* Fix unit test. test=develop
* Add Python API for LoD append. test=develop
* Fix API.spec test=develop
* Fix format of doc. test=develop
* Fix unit test. test=develop
* Fix doc. test=develop
-
Committed by JesseyXujin
Modify the auc doc. Add a description of the output variable, which was previously a scalar type and is now a tuple type. test=develop (#18771)
-
Committed by Zhaolong Xing
* Update paddle-trt:
  1. Fix bug: core dump in the split plugin when batch > 2.
  2. Add leaky_relu TRT 5.0 support (yolov3 from 65 ms to 42 ms).
  3. Add new attr to dropout.
  4. Support shuffle channel, swish, and relu6.
  test=develop
* 1. Fix CI. test=develop
-
Committed by Thunderbrook
The change includes two things:
1. Saving the delta model and shrinking the table were previously controlled by the same parameter; delete_after_unseen_days is now added to control table shrinking.
2. Values in the sparse table previously had no slot; a slot is now added to the sparse table, and DownpourCtrAccessor is added to support the new meta.
test=develop
-
- 23 Jul, 2019 5 commits
-
-
Committed by Jacek Czaja
test=develop
  - Compilation fix
  - Yet another compilation fix
  - Even yet another compilation fix
  - Surprise! Again compilation fix
  - Lint fixes test=develop
  - Fix to workspace acquire of LRN test=develop
  - Fix to hash of BWD LRN test=develop
  - Fix to LRN BWD PD acquire test=develop
  - Fixing LRN PD creation test=develop
  - Cosmetic fix in comment test=develop
  - Fixes after review test=develop
-
Committed by jiaqi
(1) Support patch data (merge slots of instances with the same line id; modify the dense layer that changes its size).
(2) Add the fleet load_one_table interface, which supports loading from a Paddle model and from a pslib model.
(3) Fix a push sparse bug that made push sparse take more time (about 10% in my test case).
(4) When some slots are not in one of your networks (join/update, etc.), data feed, collect label info, and push/pull sparse now skip these slots instead of throwing an error.
(5) Add more debug info in TrainFilesWithProfiler.
-
Committed by chengduo
* support sparse gradients test=develop
-
Committed by wangchaochaohu
* Rewrite the conv_op using cudnn_conv_helper
* Add workspace limit for v7 test=develop
* Fix test=develop
* Add half float test=develop
* Fix test=develop
* Fix test=develop
* Revise code style test=develop
* Fix test=develop
-
Committed by Yi Liu
* Support distributed classification training
* Update API.spec
* Fix even division in Python 3
* Change "index_range" to "index_num" in shard_index operator test=document_preview test=develop
-
- 22 Jul, 2019 4 commits
-
-
Committed by qingqing01
-
Committed by Tao Luo
test=develop
-
Committed by whs
test=develop
-
Committed by Bai Yifan
-
- 20 Jul, 2019 2 commits
-
-
Committed by cjt222
Add license
-
Committed by wangguanzhong
* fix clip_by_norm doc, test=develop
-
- 19 Jul, 2019 3 commits
-
-
Committed by Huihuang Zheng
Test PaddingRNN on a V100 GPU device. Test configuration: large model, padding mode (the mode that uses recurrentOp), one GPU.
GPU memory (MiB): 6414 (this PR) vs 6837 (without this PR)
Speed (steps/s): 10.28 (this PR) vs 9.89 (without this PR)
-
Committed by Jacek Czaja
test=develop
-
Committed by Adam
test=develop
-
- 18 Jul, 2019 4 commits
-
-
Committed by zhouwei25
Optimize the content of error reports: print the error code and the official documentation website (#18671)
Optimize the error reporting information of CUDA-related APIs
index on develop: 130ac177 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into develop
-
Committed by Zeng Jinle
* feature/auto_growth_allocator, test=develop
* Add unit test of AlignedAllocator, test=develop
* Try to turn on auto_growth to test on CI, test=develop
* Fix segmentation fault in mixed_vector.h, test=develop
* Add unit tests, test=develop
-
Committed by hutuxian
* Make hash_op support int64 hash_size
* Add corresponding UT
-
Committed by guru4elephant
* Remove ctr reader; all of its functions are covered by dataset
-
- 17 Jul, 2019 5 commits
-
-
Committed by guru4elephant
* Remove async executor and add data_feed.proto to the deps of the train demo
-
Committed by Yang Zhang
* Add GPU implementation for `prelu` backward pass test=develop
* Fix logic error in `prelu` GPU backward and simplify a bit test=develop
* Fix `prelu` backward CUDA implementation test=develop
  The CPU version was not actually used, so the test passed
-
Committed by 石晓伟
* Update anakin-engine interfaces for content-dnn test=develop
* Support GPU-only mode of Anakin; modify eltwise parse test=develop
* Modification for thread safety test=develop
* Integrated template instance test=develop
* Increase template parameters test=develop
* Support MLU predictor test=develop
* Update anakin cmake files test=develop
* Update TargetWrapper::set_device
* Update the initialization of anakin subgraph test=develop
* Use the default constructor of base class test=develop
* Load model from buffer with length test=develop
* Modify the access level of class test=develop
* Support anakin for bitmain arch test=develop
* Remove files
* Checkout cmakelists test=develop
* Modify interfaces test=develop
* Add cmake dependencies test=develop
* Enforce the outputs of net test=develop
-
Committed by Yihua Xu
-
Committed by baojun
-
- 16 Jul, 2019 4 commits
-
-
Committed by chengduo
test=develop
-
Committed by liuwei1031
-
Committed by Jacek Czaja
* - Added partial draft of pooling acquire
  - Workspace support
  - Compilation fix
  - Added draft of pooling backward reimplementation
  - Segfault fix
  - Reverted 'any' for diff_dst creation in pooling
  - Lint fixes test=develop
  - Lint fixes test=develop
  - Further lint fixes test=develop
* - Fixes after review test=develop
* - Lint fixes test=develop
* - Even more lint fixes test=develop
-
Committed by chengduo
test=develop
-
- 15 Jul, 2019 1 commit
-
-
Committed by guru4elephant
* make auc op compatible with 1 dim
-
- 12 Jul, 2019 4 commits
-
-
Committed by Leo Zhao
* Do not use the transferscope cache in the CPU case test=develop
* Adjust variable name and add comments test=develop
* Use correct format for class member in operator.h
* Use correct format for class member in operator.cc test=develop
-
Committed by 123malin
* Fix int64_t
* Update fill constant op unit test
* Add empty line
-
Committed by tangwei12
* delete m, test=develop
-
Committed by Kevin
-
- 11 Jul, 2019 3 commits
-
-
Committed by Tao Luo
* Add config.SetMkldnnCacheCapacity API for the mkldnn cache clear strategy test=develop
* Enhance MkldnnPostReset test=develop
* Add comments for the mkldnn_cache_capacity field test=develop
-
Committed by Hongyu Liu
-
Committed by gongweibao
-