Collector Overview¶
Profile of Speed¶
We conduct speed tests of our collector in comparison with Tianshou, under environments with different scales and environment managers with different types.
As the results show, our collector generally outperforms Tianshou’s, especially under environments with large scales or long reset times.
Settings
We design three environments with different scales, whose parameters are as follows:
Observation Dim
Reset Time
Step Time
Infer Time
Small Env
64
0.1
0.005
0.004
Middle Env
300
0.5
0.01
0.008
Big Env
3000
2
0.1
0.02
where Observation Dim means the dimension of the observation, Reset Time, Step Time and Infer Time are the times the collector needs to reset the environment, execute a step of the environment and use the policy to infer a step. In our tests, the three types of Time uniformly fluctuate in
[0.4, 1.6]
times of their original values in the above table.We conducts tests under sync and async modes, where the collector runs 8 environments in the environment manager. Under sync mode, we set
wait_num
as 8 and the collector waits for all 8 environments, while under async mode, it is 7. We retain the default values oftimeout
for both our and Tianshou’s env managers due to the different implementations of the async mode.Wait Num
Timeout
Sync
8
None
Async
7
0.01(nx) / None(ts)
Basically, for each environment, we conduct 4 tests with the env managers under sync and async modes in ours and Tianshou, namely ours-sync, ours-async, ts-sync, ts-async. Additionally, we include our base env manager as baseline, namely ours-base.
We simulate 300 collections during one test with each collection producing at least 80 samples, and repeat each test 3 times to get the averages and standard variances. To approximate the real-world scenes, we run a pure cpu task to raise the utilization of cpu to ~60% before each test.
Results
The results of speed tests are as follows(unit: second):
ts-async
ts-sync
ours-base
ours-async
ours-sync
Small Env
49.54+0.35
44.63+0.09
157.70+0.30
47.60+0.62
47.19+1.13
Middle Env
93.02+0.09
88.70+0.14
311.88+0.22
90.84+0.67
82.73+1.51
Big Env
545.07+1.55
520.52+0.30
2592.77+0.25
519.05+1.05
467.50+2.18
We conclude the findings as:
The speeds of our subprocess collector is 3~5 times of the base collector’s.
Our collector is faster than Tianshou’s in all of the six settings except the Small Env under sync mode.
To approximate some environments with large reset times, e.g. Carla and StarCraft2, we further conduct a group of tests with all reset times multiplied by 5 (unit: second):
ts-async
ts-sync
ours-base
ours-async
ours-sync
Small Env
55.11+0.10
45.10+0.08
176.55+0.05
50.71+0.80
50.55+1.39
Middle Env
130.49+0.09
112.27+0.03
407.65+0.14
98.49+1.29
94.18+1.74
Big Env
703.49+0.61
577.92+0.30
2976.80+0.39
555.15+1.90
520.31+1.05
As shown above, Tianshou’s collector bears drastic speed decreases in such environments compared with our collector, which uses reset threads to avoid busy waiting and remain high performance under large reset times.