test_parallel_executor fails
Created by: luotao1
test_parallel_executor fails randomly (intermittent failure). Log from the failing run:
[13:22:28] [Step 1/1] 96/125 Test #91: test_parallel_executor ..........................***Failed 21.84 sec
[13:22:28] [Step 1/1] test_parallel_testing (test_parallel_executor.ParallelExecutorTestingDuringTraining) ... FAIL
[13:22:28] [Step 1/1] test_batchnorm_fc (test_parallel_executor.TestMNIST) ... [3.5742419 3.3595216] [0.6401141 1.6629317]
[13:22:28] [Step 1/1] [2.302585 2.302585] [2.0785856 2.0785856]
[13:22:28] [Step 1/1] ok
[13:22:28] [Step 1/1] test_simple_fc (test_parallel_executor.TestMNIST) ... [3.6306825 2.4718497] [1.8946763 1.7671887]
[13:22:28] [Step 1/1] [2.784895 2.5584354] [1.6662283 1.7044266]
[13:22:28] [Step 1/1] [0.99775946 0.99775946] [0.00021562 0.00021562]
[13:22:28] [Step 1/1] ok
[13:22:28] [Step 1/1] test_resnet (test_parallel_executor.TestResnet) ... 19.2143 Instance per second
[13:22:28] [Step 1/1] [6.9077554 6.9077554] [6.8658004 6.8658004]
[13:22:28] [Step 1/1] ok
[13:22:28] [Step 1/1] test_main (test_parallel_executor.TestTransformer) ... skipped 'transformer is buggy in multi gpu'
[13:22:28] [Step 1/1]
[13:22:28] [Step 1/1] ======================================================================
[13:22:28] [Step 1/1] FAIL: test_parallel_testing (test_parallel_executor.ParallelExecutorTestingDuringTraining)
[13:22:28] [Step 1/1] ----------------------------------------------------------------------
[13:22:28] [Step 1/1] Traceback (most recent call last):
[13:22:28] [Step 1/1] File "test_parallel_executor.py", line 497, in test_parallel_testing
[13:22:28] [Step 1/1] self.assertTrue(numpy.allclose(train_loss, test_loss))
[13:22:28] [Step 1/1] AssertionError: False is not true
[13:22:28] [Step 1/1]
[13:22:28] [Step 1/1] ----------------------------------------------------------------------
[13:22:28] [Step 1/1] Ran 5 tests in 19.030s
[13:22:28] [Step 1/1]
[13:22:28] [Step 1/1] FAILED (failures=1, skipped=1)
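
The failing assertion, `self.assertTrue(numpy.allclose(train_loss, test_loss))`, only reports `False is not true`, so the log does not show how far apart the two losses were. Below is a minimal sketch of how the comparison behaves and how the mismatch could be surfaced. The loss values are hypothetical (the log does not print them), and `describe_mismatch` is an illustrative helper, not part of the test file. Whether a looser tolerance is acceptable here, or whether the difference points to a real bug, is not something this sketch can decide.

```python
import numpy as np

# Hypothetical values: the actual train_loss / test_loss are not in the log.
train_loss = np.array([2.302585])
test_loss = np.array([2.302685])

# numpy.allclose checks |a - b| <= atol + rtol * |b|,
# with defaults rtol=1e-05 and atol=1e-08.
print(np.allclose(train_loss, test_loss))             # default tolerances
print(np.allclose(train_loss, test_loss, rtol=1e-4))  # looser relative tolerance


def describe_mismatch(actual, desired, rtol=1e-5, atol=1e-8):
    """Return numpy's mismatch report as a string, or None if the arrays are close.

    Illustrative helper (an assumption, not from test_parallel_executor.py).
    """
    try:
        np.testing.assert_allclose(actual, desired, rtol=rtol, atol=atol)
    except AssertionError as exc:
        return str(exc)
    return None


# Prints the report produced by numpy.testing.assert_allclose,
# which includes the differing values.
print(describe_mismatch(train_loss, test_loss))
```

Calling `numpy.testing.assert_allclose(train_loss, test_loss)` directly in the test would make a future random failure print the differing values instead of a bare `False is not true`.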