Multi-thread computing on mobile devices
Created by: hedaoyuan
When single-threaded computation on a mobile device cannot meet the performance requirements of model inference, multi-threading is the natural way to speed it up. Most mobile phones today are multi-core systems, but their hardware architecture differs from that of a typical multi-core server. As a result, the multi-threading methods used on servers do not deliver the same speedup on mobile devices. The main factors that limit multi-thread acceleration on mobile are as follows.
- The big.LITTLE architecture. The big and little cores differ in computational power, so if the computational tasks are distributed evenly across the cores, overall performance is dragged down by the little cores. This was observed in previous experiments: https://github.com/PaddlePaddle/Paddle/wiki/2017-07-19#hedaoyuan
- Interactive mode. On mobile devices, the CPU frequency mode is generally set to `interactive` rather than `performance`. When a CPU core is woken up to do computation, it starts at a low frequency and takes some time to ramp up to a high frequency. So in multi-threaded computation on mobile, the other threads do not perform as well as the main thread.
- Power limits. Mobile devices generally operate under a power limit. Running multiple threads may exceed that limit, which causes the CPU frequency to be reduced and in turn hurts computing performance.
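
The first two factors can be illustrated with a small sketch. It is not taken from Paddle. The function names `split_by_weight` and `read_governor` are hypothetical, and the 2:1 big-to-little speed ratio in the usage note is an assumed value that would have to be measured on a real device. The first function divides a range of work items in proportion to per-core weights instead of evenly, so that a little core receives a smaller share. The second reads the frequency mode of a core from the standard Linux cpufreq sysfs file and returns `None` when that file is missing or unreadable, which may be the case on some devices.

```python
from pathlib import Path


def split_by_weight(total_items, weights):
    """Split range(total_items) into contiguous (start, end) chunks.

    Chunk sizes are proportional to `weights`, one weight per worker
    thread. The weights are an assumption supplied by the caller, for
    example a measured relative speed of each core.
    """
    if total_items < 0 or not weights or any(w <= 0 for w in weights):
        raise ValueError("need total_items >= 0 and positive weights")
    total_weight = sum(weights)
    chunks = []
    start = 0
    cumulative = 0
    for index, weight in enumerate(weights):
        cumulative += weight
        if index == len(weights) - 1:
            # The last chunk always ends at total_items, so no item is lost
            # to rounding.
            end = total_items
        else:
            end = min(total_items, round(total_items * cumulative / total_weight))
        chunks.append((start, end))
        start = end
    return chunks


def read_governor(cpu, sysfs_root="/sys/devices/system/cpu"):
    """Return the cpufreq governor name of one core, or None if unavailable."""
    path = Path(sysfs_root) / f"cpu{cpu}" / "cpufreq" / "scaling_governor"
    try:
        return path.read_text().strip()
    except OSError:
        # File missing or not readable from this process.
        return None
```

For example, `split_by_weight(100, [2, 2, 1, 1])` gives the two assumed big cores about a third of the items each and the two little cores about a sixth each, whereas an even split would leave the big cores idle while the little cores finish. This only addresses the uneven core speeds; it does nothing about frequency ramp-up or power limits.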