Created by: qingqing01
Fix #9571
- Use two ParallelExecutors, one for training, one for testing.
- The ParallelExecutor for testing shares the local scopes with the training one.
- When testing during training, set `run_startup=False`.
- There is no need to set `loss_name` for the testing ParallelExecutor.
Now, the following testing code can run successfully; its correctness will be verified later.
The usage is as follows:
    import numpy as np
    import paddle.fluid as fluid

    # data_file, net_conf and class_dim are assumed to be defined elsewhere in the script.
    image, label = fluid.layers.read_file(data_file)
    avg_cost, accuracy, accuracy5 = net_conf(image, label, class_dim)
    test_program = fluid.default_main_program().clone(for_test=True)
    optimizer = fluid.optimizer.Momentum(
        learning_rate=fluid.layers.piecewise_decay(
            boundaries=[100], values=[0.1, 0.2]),
        momentum=0.9,
        regularization=fluid.regularizer.L2Decay(1e-4))
    opts = optimizer.minimize(avg_cost)
    # Training executor (loss_name is required for training).
    exe = fluid.ParallelExecutor(loss_name=avg_cost.name,
                                 use_cuda=True)
    # Testing executor: reuses the training local scopes and skips the startup program.
    test_exe = fluid.ParallelExecutor(use_cuda=True,
                                      main_program=test_program,
                                      run_startup=False,
                                      local_scopes=exe.local_scopes())
    # Run the test program on the shared local scopes.
    def test():
        for i in xrange(10):
            loss, top1, top5 = test_exe.run([avg_cost.name, accuracy.name, accuracy5.name])
            l,t1,t5 = np.mean(np.array(loss)), np.mean(np.array(top1)), np.mean(np.array(top5))
            print('Test Loss {0}, Top1 {1}, Top5 {2}'.format(l, t1, t5))
    batch_id = 0
    time_record = []
    # Train for 20 mini-batches and run the test every 10 batches.
    for i in xrange(20):
        loss, = exe.run([avg_cost.name])
        loss_v = np.mean(np.array(loss))
        print('Batch {0}, Loss {1}'.format(batch_id, loss_v))
        if batch_id % 10 == 0:
            test()
        batch_id += 1
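
For completeness, here is a minimal sketch of the pieces the snippet assumes but does not define (`data_file`, `net_conf` and `class_dim`). The recordio file name, the input shapes and the tiny network below are illustrative placeholders only, not part of this change:

    import paddle.fluid as fluid

    class_dim = 102  # illustrative number of classes

    # Hypothetical reader; any reader that yields (image, label) batches works the same way.
    data_file = fluid.layers.open_recordio_file(
        filename='./flowers.recordio',
        shapes=[[-1, 3, 224, 224], [-1, 1]],
        lod_levels=[0, 0],
        dtypes=['float32', 'int64'])

    def net_conf(image, label, class_dim):
        # A deliberately small network; a real model (e.g. SE-ResNeXt) is wired up the same way.
        conv = fluid.layers.conv2d(input=image, num_filters=16, filter_size=3, act='relu')
        pool = fluid.layers.pool2d(input=conv, pool_size=2, pool_type='max', pool_stride=2)
        out = fluid.layers.fc(input=pool, size=class_dim, act='softmax')
        cost = fluid.layers.cross_entropy(input=out, label=label)
        avg_cost = fluid.layers.mean(x=cost)
        acc_top1 = fluid.layers.accuracy(input=out, label=label, k=1)
        acc_top5 = fluid.layers.accuracy(input=out, label=label, k=5)
        return avg_cost, acc_top1, acc_top5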
