On a multi-GPU server, training image classification on a single GPU gives the wrong number of batches per pass
Created by: youwei2567
The relevant parameter settings are:
add_arg('total_images', int, 777, "The number of total training images.")
add_arg('batch_size', int, 32, "Minibatch size on a device.")
The training output looks like this:
[Pass 1, train batch 186] loss 2.17508, acc1 0.50000, acc5 0.75000, lr 0.00188, elapse 0.0189 sec
[Pass 1, train batch 187] loss 2.31072, acc1 0.25000, acc5 0.50000, lr 0.00188, elapse 0.0188 sec
[Pass 1, train batch 188] loss 2.21176, acc1 0.25000, acc5 0.75000, lr 0.00188, elapse 0.0185 sec
[Pass 1, train batch 189] loss 2.14681, acc1 0.75000, acc5 0.75000, lr 0.00188, elapse 0.0201 sec
[Pass 1, train batch 190] loss 1.80597, acc1 0.75000, acc5 1.00000, lr 0.00188, elapse 0.0181 sec
[Pass 1, train batch 191] loss 1.95061, acc1 0.50000, acc5 0.75000, lr 0.00188, elapse 0.0194 sec
[Pass 1, train batch 192] loss 2.30522, acc1 0.25000, acc5 0.50000, lr 0.00188, elapse 0.0193 sec
[Pass 1, train batch 193] loss 2.49756, acc1 0.25000, acc5 0.50000, lr 0.00188, elapse 0.0199 sec
Each pass should contain 24 batches (777 images at a batch size of 32), but the log shows 193.
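
To make the mismatch concrete, here is a rough arithmetic check in Python. It is only a sketch: `batches_per_pass` is a hypothetical helper, not part of the training script, and the effective batch size of 4 is an assumption. That value is inferred from the logged acc1 values, which are all multiples of 0.25. One possible cause, not confirmed anywhere in this report, would be the configured batch size being divided by the number of visible GPUs (for example 32 / 8 = 4).

```python
import math

# Values taken from the add_arg settings in this report.
total_images = 777
batch_size = 32


def batches_per_pass(num_images, per_batch, drop_last=True):
    """Hypothetical helper: number of batches in one pass over num_images samples."""
    if drop_last:
        # The incomplete final batch is discarded.
        return num_images // per_batch
    # The incomplete final batch is kept.
    return math.ceil(num_images / per_batch)


# Expected count with the configured batch size.
expected = batches_per_pass(total_images, batch_size)

# ASSUMPTION: effective batch size of 4, guessed from acc1 being a multiple of 0.25.
assumed_effective_batch_size = 4
observed_estimate = batches_per_pass(total_images, assumed_effective_batch_size)

print("expected batches per pass:", expected)
print("batches per pass at batch size 4:", observed_estimate)
```

If the batch counter in the log starts at 0, a last index of 193 would correspond to 194 batches, which matches the estimate for a batch size of 4. Printing the actual size of each minibatch inside the training loop would confirm or rule this out.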