- 06 8月, 2020 1 次提交
-
-
由 Thunderbrook 提交于
* add heter ps mode * code style test=develop * add with_pslib test=develop * unitest test=develop * code style test=develop * code style test=develop * code style test=develop * code style test=develop * code style test=develop * code style test=develop * code style test=develop * code style test=develop * test monitor test=develop * prepare trainer test=develop * code style test=develop
-
- 11 7月, 2020 1 次提交
-
-
由 Chen Weihang 提交于
* fix softmax_with_cross_entropy cuda kernel overflow bug, test=develop * replace old macro & for condition, test=develop * polish details, test=develop
-
- 10 6月, 2020 1 次提交
-
-
由 hutuxian 提交于
Support CMatchAucCalculator based on CMatchRankAucCalculator with a new parameter ignore_rank
-
- 04 6月, 2020 1 次提交
-
-
由 hutuxian 提交于
* Fix the field length in LoD scenario * Fix the missed lod info when copy tensor in dump field * Add some log to make debug easy
-
- 03 6月, 2020 1 次提交
-
-
由 Chen Weihang 提交于
* remove REPLACE_ENFORCE_GLOG compile option & add ci rule prohibit LOG(FATAL) using, test=develop * remove ci test case, test=develop * replace all LOG(FATAL) & polish message, test=develop * fix typo, test=develop * polish error info detail, test=develop
-
- 26 5月, 2020 1 次提交
-
-
由 ShenLiang 提交于
-
- 25 5月, 2020 1 次提交
-
-
由 hutuxian 提交于
* Support AucRunner in PaddleBox * update some code style
-
- 11 5月, 2020 2 次提交
-
-
由 hutuxian 提交于
* Add InitializeGPUAndLoadModel to solve random hang when downloading sparse parameters. * Update SaveBase to solve test problem.
-
由 Chen Weihang 提交于
* add new macro BOOST_GET_SAFELY & unittests, test=develop * add different macro type, test=develop * fix get macro type in executor, test=develop * four macro part change backup * using one macro for all case, test=develop * revert attribute change, test=develop * change to three func to solve gcc4.8 bug, test=develop * polish some details, test=develop
-
- 30 4月, 2020 1 次提交
-
-
由 xujiaqi01 提交于
* add timeout and http store in communication, add revert and confirm in fleet * test=develop
-
- 17 4月, 2020 1 次提交
-
-
由 hutuxian 提交于
-
- 11 4月, 2020 1 次提交
-
-
由 xujiaqi01 提交于
* add save with prefix * test=develop
-
- 10 4月, 2020 1 次提交
-
-
由 hutuxian 提交于
* Involves AfsAPI to resolve slow downloading. * Mainly used in PaddleBox
-
- 01 4月, 2020 1 次提交
-
-
由 xujiaqi01 提交于
* add fleet pslib pull and push sparse op and push dense op * test=develop
-
- 26 3月, 2020 2 次提交
-
-
由 xujiaqi01 提交于
* add clear_one_table * test=develop
-
由 danleifeng 提交于
* add maskauc in paddlebox; test=develop
-
- 20 3月, 2020 1 次提交
-
-
由 hutuxian 提交于
-
- 25 2月, 2020 1 次提交
-
-
由 hutuxian 提交于
* Add two types of Metric Calculator: MultiTaskCalculator & CmatchRankCalculator. * Add a config for DynamicAdjustChannelNum function to denote whether we will discard the remaining instances when they are not be distributed evenly. * Remove CPU code in Pull/PushSparse and we will add it back when testing it fully. * Fix some known issues: such as copying persistable vars after one epoch running.
-
- 11 2月, 2020 3 次提交
-
-
由 hutuxian 提交于
Refine PaddleBox Framework, Main functions: * Add MetricMsg util class, which can calculate metrics like AUC, bucket_error, COPC. * Replace FeedPass with new interface: BeginFeedPass & EndFeedPass * Refactor Pull/Push Sparse Function in box_wrapper. * Use CUDA Kernel to copy keys and copy feasign between tensor and boxps struct. * Cache copied keys in pull sparse in order to reuse it in push period.
-
由 yaoxuefeng 提交于
* update * update test=develop * update compile set test=develop * update compile set test=develop * update test=develop * update test=develop * update test=develop * update compile setting test=develop * update compile setting test=develop * update run demo test=develop * update test=develop * update test=develop * fix test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update format test=develop * update format test=develop * update style test=develop * update style test=develop * change style test=develop * change style test=develop * change style test=develop * add dataset unittest test=develop * update test=develop * update for record test=develop * udpate style for record test=develop * update for record test=develop * update for record test=develop * update for record test=develop * fix format test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop
-
由 Wilber 提交于
支持不依赖nccl进行编译。[1/2] 多卡下,如果没有打开WITH_NCCL开关编译,多卡不能通信,则只能选择一张卡使用。 Co-authored-by: N石晓伟 <39303645+Shixiaowei02@users.noreply.github.com>
-
- 05 2月, 2020 1 次提交
-
-
由 Wilber 提交于
cmake选项中添加了WITH_NCCL,显示指定是否编译NCCL的部分代码,WITH_NCCL默认打开,但如果WITH_GPU为OFF,则关闭WITH_NCCL 添加了PADDLE_WITH_NCCL定义 单机单卡能够关闭NCCL编译,多卡的话需要默认打开NCCL,如果关闭NCCL,则只能使用单卡 Co-authored-by: N石晓伟 <39303645+Shixiaowei02@users.noreply.github.com>
-
- 02 2月, 2020 1 次提交
-
-
由 xujiaqi01 提交于
* add GeneralRoleMaker which is for general usage * test=develop
-
- 14 1月, 2020 1 次提交
-
-
由 xujiaqi01 提交于
* add collective communication library in fleet to replace mpi * test=develop
-
- 20 12月, 2019 1 次提交
-
-
由 Thunderbrook 提交于
* general table * add sparse table test=develop * no cvm test=develop * add no_cvm test=develop * add note test=develop * code style test=develop * code style test=develop * code style test=develop * code style test=develop * code style test=develop * add key of optimizer test=develop * solve pslib stop core test=develop * barrier test=develop * add notes test=develop * add table id in cache shuffle test=develop * table id test=develop * code style test=develop
-
- 10 12月, 2019 1 次提交
-
-
由 xujiaqi01 提交于
* fix code style of fleet_wrapper * test=develop
-
- 25 11月, 2019 1 次提交
-
-
由 Thunderbrook 提交于
* print table stat test=develop * notes test=develop * notes test=develop
-
- 21 11月, 2019 1 次提交
-
-
由 Thunderbrook 提交于
* general table * add sparse table test=develop * no cvm test=develop * add no_cvm test=develop * add note test=develop * code style test=develop * code style test=develop * code style test=develop * code style test=develop * code style test=develop * add key of optimizer test=develop * solve pslib stop core test=develop * barrier test=develop * add notes test=develop
-
- 20 11月, 2019 1 次提交
-
-
由 Thunderbrook 提交于
* general table * add sparse table test=develop * no cvm test=develop * add no_cvm test=develop * add note test=develop * code style test=develop * code style test=develop * code style test=develop * code style test=develop * code style test=develop * add key of optimizer test=develop
-
- 15 11月, 2019 2 次提交
-
-
由 xujiaqi01 提交于
* fix cache table bug * add save_paddle_inference_model * fix hdfs util bug * test=develop
-
由 xujiaqi01 提交于
* copy some feasigns and corresponding embeddings from one sparse table to another * copy all feasigns and corresponding embeddings from one sparse table to another * copy all dense params from one table to another * copy some local vars to other local vars
-
- 25 10月, 2019 1 次提交
-
-
由 xujiaqi01 提交于
* no longer need to define all embedding layers (no one less) of all slots in each program. make trainer_param repeated in ps.proto. * add find_distributed_lookup_table_grads instead of hard code GRAD * support embedding stop gradient. push sparse has error before fix this.* * fix fill sparse, skip slots which do not have embedding. each slot's embedding in a sparse table should be used in all training programs before fix this. * fix pull sparse, skip slots which do not have embedding. * fix collect feasign label info, skip slots which do not have embedding. * support when there are multi sparse tables in one or multi training programs, each program can pull/push its own related sparse tables instead of all sparse tables. * test=develop
-
- 24 9月, 2019 1 次提交
-
-
由 xujiaqi01 提交于
* support change shuffle thread num * support change train thread num * fix receive shuffle data of each channel * data norm stop gradient * add check thread_tensor type and root_tensor type when merge metric * remove sleep in shuffle, add config * add config of pslib client to client communication * fix xbox str * add data norm op testcase * add flush in trainer finalize
-
- 17 9月, 2019 1 次提交
-
-
由 xujiaqi01 提交于
* support preload thread * sleep before fleet wrapper exit for pslib core dump * optimize hdfs log * fix master+patch bug
-
- 31 8月, 2019 1 次提交
-
-
由 hutuxian 提交于
* Support looking up embeddings from BoxPS. * Add a _pull_box_sparse op, for now this op is not exposed to users. * Add a BoxHelper class, providing 'BeginPass', 'EndPass', 'FeedPass' functions and so on. * Add 'BoxPSDataset' in python code. * Add a compile options WITH_BOX_PS and a MACRO PADDLE_WITH_BOX_PS. * Add UT. * More concrete information pls refer to: https://github.com/PaddlePaddle/Paddle/pull/18982
-
- 14 8月, 2019 1 次提交
-
-
由 jiaqi 提交于
* add get_last_save_xbox_base/get_last_save_xbox * fix fleet_util bug of load paddle model * add doc string in fleet api
-
- 11 8月, 2019 1 次提交
-
-
由 yaoxuefeng 提交于
add save cache model api in fleet& add slots shuffle in dataset module & add metric op to calculate ctr related metrics (#18871) * add ctr related metric layer test=develop * add save cache and slots shuffle test=develop * add save cache and slots shuffle test=develop * fix error * fix error * fix style for ci * fix for comments * change SlotsShuffle input to std::strinf for generality * fix style * fix style * fix style * fix style * fix style * fix style * fix stylr * fix style * fix style * fix style * fix style * fix style * fix style * fix style * fix style * fix style * fix style * fix style * fix style * fix style * change non-const reference to pointer * fix style * fix style * fix style test=develop * fix style test=develop * add return ins num in ctr metric op * change dtype to float in metric_op.py * fix error test=develop * fix style test=develop * fix API spec * fix API spec * fix API spec test=develop * add UT test=develop
-
- 29 7月, 2019 1 次提交
-
-
由 Thunderbrook 提交于
* dump slot * test * proto * dump slot * test * proto * code style * code style * code style * style * add delete after unseen days * add unseen days * code style * conflict solve test=develop * add clear model * code style test=develop * code style test=develop
-
- 25 7月, 2019 1 次提交
-
-
由 fuyinno4 提交于
Fix FleetWrapper: 1. fix shrink dense: just scale show 2. add datanorm scale: divide datanorm's gradient by batch_size
-
- 24 7月, 2019 1 次提交
-
-
由 Thunderbrook 提交于
The change includes 2 things: 1. save delta model and shrink table are control by the same parameter before, now add delete_after_unseen_days to control shrink table. 2. value in sparse table has no slot before, now add slot in sparse table, and add DownpureCtrAccessor to support the new meta. test=develop
-