MNIST demo training speed problem
Created by: procurity
I'm training the MNIST demo on CPU with trainer=10. I've been training the model for over 15 minutes and it didn't even finish the first pass. Here is the debug output:
I1201 15:03:29.609876 5524 ThreadLocal.cpp:37] thread use undeterministic rand seed:5525
I1201 15:03:29.610935 5523 ThreadLocal.cpp:37] thread use undeterministic rand seed:5524
I1201 15:03:29.619053 5522 ThreadLocal.cpp:37] thread use undeterministic rand seed:5523
.........
I1201 15:06:11.298797 5367 TrainerInternal.cpp:165] Batch=100 samples=12800 AvgCost=1.85298 CurrentCost=1.85298 Eval: classification_error_evaluator=0.658594 CurrentEval: classification_error_evaluator=0.658594
.........
I1201 15:08:50.625761 5367 TrainerInternal.cpp:165] Batch=200 samples=25600 AvgCost=1.23572 CurrentCost=0.618452 Eval: classification_error_evaluator=0.432773 CurrentEval: classification_error_evaluator=0.206953
.........
I1201 15:11:24.153884 5367 TrainerInternal.cpp:165] Batch=300 samples=38400 AvgCost=0.927435 CurrentCost=0.310871 Eval: classification_error_evaluator=0.318724 CurrentEval: classification_error_evaluator=0.090625
.........
I1201 15:13:56.731485 5367 TrainerInternal.cpp:165] Batch=400 samples=51200 AvgCost=0.751423 CurrentCost=0.223385 Eval: classification_error_evaluator=0.254688 CurrentEval: classification_error_evaluator=0.0625781
......
I1201 15:15:42.951373 5367 TrainerInternal.cpp:182] Pass=0 Batch=469 samples=60000 AvgCost=0.666388 Eval: classification_error_evaluator=0.224417
I1201 15:16:18.152338 5367 Tester.cpp:127] Test samples=10000 cost=0.0651589 Eval: classification_error_evaluator=0.0198
I1201 15:16:18.152487 5367 GradientMachine.cpp:112] Saving parameters to ./mnist_vgg_model/pass-00000
I1201 15:16:18.207792 5367 Util.cpp:230] copy vgg_16_mnist.py to ./mnist_vgg_model/pass-00000
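To put a number on "slow", the timestamps in the log above can be turned into a throughput figure. The following is only a rough sketch: the `LOG` string holds the four `Batch=` lines from the log, shortened after `samples=`, and `throughput` is a hypothetical helper name, not part of Paddle.

```python
import re
from datetime import datetime

# The four periodic progress lines from the log above, shortened after samples=.
LOG = """\
I1201 15:06:11.298797 5367 TrainerInternal.cpp:165] Batch=100 samples=12800
I1201 15:08:50.625761 5367 TrainerInternal.cpp:165] Batch=200 samples=25600
I1201 15:11:24.153884 5367 TrainerInternal.cpp:165] Batch=300 samples=38400
I1201 15:13:56.731485 5367 TrainerInternal.cpp:165] Batch=400 samples=51200
"""

# glog prefix: severity + MMDD, then HH:MM:SS.microseconds, thread id, source location.
PATTERN = re.compile(
    r"^I\d{4} (\d{2}:\d{2}:\d{2}\.\d{6}) .*? Batch=(\d+) samples=(\d+)", re.M
)


def throughput(log):
    """Return (batch, samples_per_second, seconds_per_batch) for each log interval."""
    rows = [
        (datetime.strptime(t, "%H:%M:%S.%f"), int(b), int(s))
        for t, b, s in PATTERN.findall(log)
    ]
    result = []
    for (t0, b0, s0), (t1, b1, s1) in zip(rows, rows[1:]):
        secs = (t1 - t0).total_seconds()
        result.append((b1, (s1 - s0) / secs, secs / (b1 - b0)))
    return result


for batch, sps, spb in throughput(LOG):
    print(f"up to batch {batch}: {sps:.1f} samples/s, {spb:.2f} s/batch")
```

On these lines it works out to roughly 80 to 84 samples per second, or about 1.5 to 1.6 seconds per 128-sample batch.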
top - 15:16:48 up 433 days, 5:01, 8 users, load average: 9.56, 12.10, 9.16
Tasks: 575 total, 4 running, 571 sleeping, 0 stopped, 0 zombie
Cpu0 : 99.7% us, 0.3% sy, 0.0% ni, 0.0% id, 0.0% wa, 0.0% hi, 0.0% si
Cpu1 : 3.0% us, 4.0% sy, 0.0% ni, 93.0% id, 0.0% wa, 0.0% hi, 0.0% si
Cpu2 : 0.3% us, 0.7% sy, 1.0% ni, 97.3% id, 0.7% wa, 0.0% hi, 0.0% si
Cpu3 : 98.7% us, 1.3% sy, 0.0% ni, 0.0% id, 0.0% wa, 0.0% hi, 0.0% si
Cpu4 : 0.3% us, 0.3% sy, 0.0% ni, 99.3% id, 0.0% wa, 0.0% hi, 0.0% si
Cpu5 : 0.3% us, 1.0% sy, 0.7% ni, 98.0% id, 0.0% wa, 0.0% hi, 0.0% si
Cpu6 : 37.5% us, 45.8% sy, 0.0% ni, 16.6% id, 0.0% wa, 0.0% hi, 0.0% si
Cpu7 : 7.3% us, 4.6% sy, 0.0% ni, 87.7% id, 0.3% wa, 0.0% hi, 0.0% si
Cpu8 : 99.3% us, 0.7% sy, 0.0% ni, 0.0% id, 0.0% wa, 0.0% hi, 0.0% si
Cpu9 : 1.0% us, 2.3% sy, 0.7% ni, 96.0% id, 0.0% wa, 0.0% hi, 0.0% si
Cpu10 : 0.3% us, 0.3% sy, 0.3% ni, 98.7% id, 0.3% wa, 0.0% hi, 0.0% si
Cpu11 : 0.7% us, 0.3% sy, 0.0% ni, 99.0% id, 0.0% wa, 0.0% hi, 0.0% si
Mem: 65945556k total, 46134032k used, 19811524k free, 358432k buffers
Swap: 0k total, 0k used, 0k free, 29618384k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
5367 work 20 0 5236m 4.2g 9632 S 99.9 6.6 112:44.61 paddle_trainer
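The top snapshot above can be summarized the same way. This is a minimal sketch: it counts how many cores are mostly busy in user mode, using a 90% threshold chosen only for illustration, and `busy_cores` is a hypothetical helper. It cannot tell which process is occupying each core, and the machine has other users on it.

```python
import re

# Per-core lines copied from the top output above.
TOP = """\
Cpu0 : 99.7% us, 0.3% sy, 0.0% ni, 0.0% id, 0.0% wa, 0.0% hi, 0.0% si
Cpu1 : 3.0% us, 4.0% sy, 0.0% ni, 93.0% id, 0.0% wa, 0.0% hi, 0.0% si
Cpu2 : 0.3% us, 0.7% sy, 1.0% ni, 97.3% id, 0.7% wa, 0.0% hi, 0.0% si
Cpu3 : 98.7% us, 1.3% sy, 0.0% ni, 0.0% id, 0.0% wa, 0.0% hi, 0.0% si
Cpu4 : 0.3% us, 0.3% sy, 0.0% ni, 99.3% id, 0.0% wa, 0.0% hi, 0.0% si
Cpu5 : 0.3% us, 1.0% sy, 0.7% ni, 98.0% id, 0.0% wa, 0.0% hi, 0.0% si
Cpu6 : 37.5% us, 45.8% sy, 0.0% ni, 16.6% id, 0.0% wa, 0.0% hi, 0.0% si
Cpu7 : 7.3% us, 4.6% sy, 0.0% ni, 87.7% id, 0.3% wa, 0.0% hi, 0.0% si
Cpu8 : 99.3% us, 0.7% sy, 0.0% ni, 0.0% id, 0.0% wa, 0.0% hi, 0.0% si
Cpu9 : 1.0% us, 2.3% sy, 0.7% ni, 96.0% id, 0.0% wa, 0.0% hi, 0.0% si
Cpu10 : 0.3% us, 0.3% sy, 0.3% ni, 98.7% id, 0.3% wa, 0.0% hi, 0.0% si
Cpu11 : 0.7% us, 0.3% sy, 0.0% ni, 99.0% id, 0.0% wa, 0.0% hi, 0.0% si
"""

# Capture the core index and its user-mode percentage.
CPU_LINE = re.compile(r"^Cpu(\d+)\s*:\s*([\d.]+)% us", re.M)


def busy_cores(top_text, threshold=90.0):
    """Return indices of cores whose user-mode usage is at or above threshold."""
    return [int(cpu) for cpu, us in CPU_LINE.findall(top_text) if float(us) >= threshold]


cores = busy_cores(TOP)
total = len(CPU_LINE.findall(TOP))
print(f"{len(cores)} of {total} cores busy: {cores}")
```

In this snapshot only cores 0, 3 and 8 are above the threshold, and the `paddle_trainer` row itself shows 99.9 %CPU, which looks like roughly one core's worth of work rather than ten.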