Paddle on Mobile (#5782) · Issue · PaddlePaddle / Paddle

Paddle on Mobile

Created by: hedaoyuan

Based on previous work and issues, I've listed the things Paddle needs to do on mobile and embedded devices.

Build

The Paddle mobile inference library needs to support a variety of computing platforms, including Linux, Android, and iOS, across different CPUs. So we need to keep refining the whole build setup, especially the Android and iOS builds. In addition, the binary size of the inference library still needs to be reduced.

Inference API

The C-API was not designed with mobile in mind, and it is not sufficient on the mobile side (Android, for example, needs a Java API). We need to decide whether to refactor or refine the C-API. It would also be more reasonable to rename the C-API to Inference API. In addition, we need to improve the inference programming model on mobile.
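To make the discussion of a programming model a bit more concrete, here is a rough sketch of the "load once, run many times" shape an inference API could take. Everything in it is hypothetical: `InferenceSession`, its constructor arguments, and `run` are illustration-only names and are not part of Paddle or its C-API. The "model" is a single dense layer so the example stays self-contained.

```python
class InferenceSession:
    """Hypothetical session object: build once, then call run() repeatedly."""

    def __init__(self, weights, bias):
        # A real library would parse a model file here. In this sketch the
        # "model" is one dense layer given directly as Python lists.
        self._weights = weights
        self._bias = bias

    def run(self, inputs):
        """Forward pass for one input vector; returns a new output list."""
        return [
            sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(self._weights, self._bias)
        ]


# Hypothetical usage: a 2x2 layer applied to one input vector.
session = InferenceSession(weights=[[1.0, 2.0], [0.5, -1.0]], bias=[0.0, 1.0])
output = session.run([3.0, 4.0])
```

The point of this shape is that model setup cost is paid once in the constructor, while `run` stays cheap and stateless, which is what a Java or other language binding on mobile would most likely want to wrap.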

Low Precision

Low-precision calculations allow for smaller models and faster inference. A lot of hardware is gaining better support for low-precision computing. Next year, there will be chips that support the ARMv8.2 instruction set architecture, and we can then use float16 calculations on mobile to speed up model inference. Issue #4853 (closed) covers support for float16 calculation.
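As a small illustration of the size and accuracy trade-off, the sketch below stores a few values as IEEE 754 half precision (float16) and as single precision (float32) using Python's standard `struct` module. The weight values are made up for the example, and this only shows storage and rounding; it says nothing about how Paddle would implement float16 kernels.

```python
import struct


def to_float16_bytes(values):
    """Pack floats as little-endian IEEE 754 half precision (2 bytes each)."""
    return struct.pack(f"<{len(values)}e", *values)


def to_float32_bytes(values):
    """Pack floats as little-endian IEEE 754 single precision (4 bytes each)."""
    return struct.pack(f"<{len(values)}f", *values)


def from_float16_bytes(buf):
    """Unpack half-precision bytes back into Python floats."""
    return list(struct.unpack(f"<{len(buf) // 2}e", buf))


# Hypothetical weights, chosen only for illustration.
weights = [0.1, -1.5, 3.14159, 1024.0]

half = to_float16_bytes(weights)
single = to_float32_bytes(weights)
restored = from_float16_bytes(half)

# Largest rounding error introduced by the float16 round trip.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

Here the float16 buffer is half the size of the float32 one, and values such as -1.5 and 1024.0 survive exactly, while 0.1 and 3.14159 are rounded. Whether that rounding is acceptable depends on the model, so accuracy would need to be checked per network.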

Multi-Thread

Multi-threaded computing can speed up computationally intensive operations. However, because of the big.LITTLE architecture and power consumption constraints, multi-threading on mobile rarely achieves the expected speedup. Issue #4678 (closed) explains the difficulties of multi-threaded computing on mobile in more detail.
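One way to see the big.LITTLE problem is a toy model in which work is split evenly across threads and each thread is pinned to one core. The core speeds below are hypothetical relative numbers, and the model ignores power limits, thermal throttling, and scheduler migration, so treat it as a sketch of the effect rather than a measurement.

```python
def static_split_time(total_work, core_speeds):
    """Finish time when work is split evenly; the slowest core finishes last."""
    share = total_work / len(core_speeds)
    return max(share / speed for speed in core_speeds)


def speedup_over_one_big_core(total_work, core_speeds):
    """Speedup relative to running all work on the fastest single core."""
    single_core_time = total_work / max(core_speeds)
    return single_core_time / static_split_time(total_work, core_speeds)


# Hypothetical relative speeds: two big cores at 1.0, two LITTLE cores at 0.4.
all_cores = [1.0, 1.0, 0.4, 0.4]
big_only = [1.0, 1.0]

speedup_all = speedup_over_one_big_core(100.0, all_cores)
speedup_big = speedup_over_one_big_core(100.0, big_only)
```

Under these assumptions, four threads give a speedup of about 1.6 instead of 4, because the LITTLE cores hold up the whole operation, while two threads on the big cores alone reach 2.0. An even split across unequal cores is therefore a poor default on such hardware.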

Mobile GPU

Mobile GPU performance has improved greatly in recent years. For some computationally intensive operations, the mobile GPU can be an order of magnitude faster than the CPU. We need to add GPU computing to Paddle Mobile. Issue #5469 (closed) explains in more detail why Paddle needs to support the mobile GPU.

Hardware Acceleration

On mobile, hardware acceleration for model inference is a trend. We need to learn about the libraries that can be used for hardware acceleration, such as Android NN, SNPE, and ARM NN, and work out how Paddle can use them for model inference. Here is a project for this work.


Paddle on Mobile

Paddle on Mobile

Build

Inference API

Low Precision

Multi-Thread

Mobile GPU

Hardware Acceleration

Convolution optimization

Matrix multiplication optimization

Document

Benchmark

Demo
