Opened May 14, 2019 by saxon_zh (@saxon_zh, Guest)

CPU utilization is low in PaddleRec/ctr model

Created by: lzha106

When running the CTR training model in PaddleRec/ctr on a 48-core server, CPU utilization seems very low during training (about 200% on average). The behavior is the same in both local mode and distributed mode.

  1. Cmd (Run in docker)

# export NUM_THREADS=20
# python train.py --train_data_path ./data/train.txt
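The command exports `NUM_THREADS=20`, while the log in this report prints `num threads= 0`. One quick check is whether the variable is actually visible inside the Python process. The following is a minimal sketch only: how `train.py` reads the variable is not shown in this issue, and the helper name `read_int_env` is hypothetical.

```python
import os


def read_int_env(name, default=0, environ=None):
    """Return an environment variable as an int, or `default` if unset or invalid.

    `read_int_env` is a hypothetical helper for illustration; it is not
    part of PaddlePaddle or of train.py.
    """
    environ = os.environ if environ is None else environ
    raw = environ.get(name, "")
    try:
        return int(raw)
    except ValueError:
        # Unset (empty string) or non-numeric values fall back to the default.
        return default


if __name__ == "__main__":
    # Run inside the same shell (and container) that launches training.
    print("NUM_THREADS seen by Python:", read_int_env("NUM_THREADS"))
```

If this prints 0 in the shell that starts training, the export did not reach the process, for example because it was run in a different shell or container session.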

  1. Env
  • hub.baidubce.com/paddlepaddle/paddle:latest
  1. Log

2019-05-14 11:51:05,883-INFO: run local training
2019-05-14 11:51:05,884-INFO: num threads= 0
2019-05-14 11:51:05,884-INFO: cpu num = 48
ParallelExecutor is deprecated. Please use CompiledProgram and Executor. CompiledProgram is a central place for optimization and Executor is the unified executor. Example can be found in compiler.py.
W0514 11:51:05.886332 4342 graph.h:204] WARN: After a series of passes, the current graph can be quite different from OriginProgram. So, please avoid using the OriginProgram() method!
2019-05-14 11:51:14,050-INFO: TRAIN --> pass: 0 batch: 0 loss: 0.735086364746 auc: 0.503536689127, batch_auc: 0.504577935984
2019-05-14 11:51:16,557-INFO: TRAIN --> pass: 0 batch: 1 loss: 0.687324645996 auc: 0.503425833509, batch_auc: 0.502190308561
2019-05-14 11:51:25,562-INFO: TRAIN --> pass: 0 batch: 2 loss: 0.652316955566 auc: 0.49810737994, batch_auc: 0.511218654271
2019-05-14 11:51:28,201-INFO: TRAIN --> pass: 0 batch: 3 loss: 0.624634887695 auc: 0.492766345222, batch_auc: 0.516552284336
2019-05-14 11:51:36,096-INFO: TRAIN --> pass: 0 batch: 4 loss: 0.604645751953 auc: 0.491407935754, batch_auc: 0.521633943716
2019-05-14 11:51:38,219-INFO: TRAIN --> pass: 0 batch: 5 loss: 0.591483703613 auc: 0.49000279105, batch_auc: 0.516322304276
2019-05-14 11:51:46,069-INFO: TRAIN --> pass: 0 batch: 6 loss: 0.584241821289 auc: 0.489829021597, batch_auc: 0.519842867821
2019-05-14 11:51:48,183-INFO: TRAIN --> pass: 0 batch: 7 loss: 0.578797851562 auc: 0.490242734155, batch_auc: 0.524336059192
2019-05-14 11:51:56,606-INFO: TRAIN --> pass: 0 batch: 8 loss: 0.579120727539 auc: 0.490747555956, batch_auc: 0.526528383667
2019-05-14 11:51:58,903-INFO: TRAIN --> pass: 0 batch: 9 loss: 0.577897460938 auc: 0.491967175691, batch_auc: 0.531965010737
2019-05-14 11:52:06,682-INFO: TRAIN --> pass: 0 batch: 10 loss: 0.578246459961 auc: 0.493372356486, batch_auc: 0.52775468058
2019-05-14 11:52:08,846-INFO: TRAIN --> pass: 0 batch: 11 loss: 0.578046508789 auc: 0.495237219199, batch_auc: 0.531861205473
2019-05-14 11:52:17,071-INFO: TRAIN --> pass: 0 batch: 12 loss: 0.577464294434 auc: 0.497194691728, batch_auc: 0.536939430909
2019-05-14 11:52:19,243-INFO: TRAIN --> pass: 0 batch: 13 loss: 0.580472900391 auc: 0.499336215148, batch_auc: 0.539940881903
2019-05-14 11:52:27,277-INFO: TRAIN --> pass: 0 batch: 14 loss: 0.578672119141 auc: 0.501259237555, batch_auc: 0.545386792237
2019-05-14 11:52:29,362-INFO: TRAIN --> pass: 0 batch: 15 loss: 0.572959228516 auc: 0.503349640492, batch_auc: 0.546570044491
2019-05-14 11:52:37,339-INFO: TRAIN --> pass: 0 batch: 16 loss: 0.568415466309 auc: 0.505703493934, batch_auc: 0.553619374192
2019-05-14 11:52:39,439-INFO: TRAIN --> pass: 0 batch: 17 loss: 0.572214294434 auc: 0.507821561535, batch_auc: 0.553008822041
2019-05-14 11:52:47,451-INFO: TRAIN --> pass: 0 batch: 18 loss: 0.564864013672 auc: 0.509964511397, batch_auc: 0.556775745606
2019-05-14 11:52:49,572-INFO: TRAIN --> pass: 0 batch: 19 loss: 0.559789306641 auc: 0.512298856513, batch_auc: 0.570818080853
pass_id: 0, pass_time_cost: 105.293049
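The timestamps in this log can be turned into per-batch intervals to see how the roughly 105 s pass is spent. The sketch below uses the timestamps of batches 0 to 7 copied from the log; the function name `batch_gaps` is an illustrative choice, not something from `train.py`.

```python
from datetime import datetime

# Timestamps of batches 0..7, copied from the TRAIN lines of the log.
BATCH_TIMES = [
    "2019-05-14 11:51:14,050",
    "2019-05-14 11:51:16,557",
    "2019-05-14 11:51:25,562",
    "2019-05-14 11:51:28,201",
    "2019-05-14 11:51:36,096",
    "2019-05-14 11:51:38,219",
    "2019-05-14 11:51:46,069",
    "2019-05-14 11:51:48,183",
]


def batch_gaps(stamps):
    """Return the seconds elapsed between consecutive log timestamps."""
    fmt = "%Y-%m-%d %H:%M:%S,%f"
    times = [datetime.strptime(s, fmt) for s in stamps]
    return [round((b - a).total_seconds(), 3) for a, b in zip(times, times[1:])]


gaps = batch_gaps(BATCH_TIMES)
for i, gap in enumerate(gaps):
    print(f"batch {i} -> {i + 1}: {gap:.3f} s")
```

For these lines the gaps alternate between about 2 s and about 8 to 9 s. The log alone does not say what the longer gaps are spent on, so this is only a way to quantify the pattern before profiling further.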

  1. Top info

top - 19:51:50 up 188 days, 3:56, 2 users, load average: 42.83, 42.64, 42.54
Tasks: 787 total, 2 running, 785 sleeping, 0 stopped, 0 zombie
%Cpu(s): 4.3 us, 0.2 sy, 0.0 ni, 95.5 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 19779323+total, 13814598+free, 11023284 used, 48623964 buff/cache
KiB Swap: 4194300 total, 3760116 free, 434184 used. 18493142+avail Mem

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
60425 root 20 0 14.730g 2.927g 45432 S 170.9 1.6 1:30.33 python
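To relate the per-process %CPU figure to the 48 cores, the arithmetic can be sketched as follows. This assumes `top` is in its default mode, where 100% in the process row means one fully busy core; the function names are illustrative.

```python
def cores_busy(process_cpu_percent):
    """Convert a per-process %CPU value from top into an equivalent core count.

    Assumes top's default mode, where 100 means one fully busy core.
    """
    return process_cpu_percent / 100.0


def machine_share(process_cpu_percent, core_count):
    """Fraction of the whole machine's CPU capacity used by the process."""
    return process_cpu_percent / (100.0 * core_count)


busy = cores_busy(170.9)          # values taken from the top output above
share = machine_share(170.9, 48)  # 48 cores, per the "cpu num = 48" log line
print(f"{busy:.2f} cores busy, {share:.1%} of the machine")
```

With the numbers from this snapshot, the python process keeps fewer than two cores busy, which is under 4% of the machine and in line with the 4.3% user time on the %Cpu(s) line.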

Reference: paddlepaddle/models#2239