Support multi-thread inference for the same program and sharing the parameters (#9650) · Issue · PaddlePaddle / Paddle

Created by: Xreki

Users sometimes want to run inference on the same program from multiple threads, sharing the parameters among the threads to minimize memory usage.

In Fluid, it is easy to share variables among threads: the only requirement is to run with the same scope and the same type of place.
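The idea of sharing a scope can be illustrated with a toy model in plain Python. This is only a sketch and not the Fluid API: the `Scope` class, the `run_inference` function, and the parameter name `fc.w` are hypothetical stand-ins. It assumes the parameter is loaded once before the threads start and is only read afterwards.

```python
import threading


class Scope:
    """Hypothetical stand-in for a scope: a name -> variable map."""

    def __init__(self):
        self._vars = {}

    def var(self, name, value=None):
        # Create the variable on first use, then always return the same object.
        if name not in self._vars:
            self._vars[name] = value
        return self._vars[name]


def run_inference(scope, x, results, idx):
    w = scope.var("fc.w")  # looked up in the shared scope, not copied
    # Record the identity of the parameter object and a dot product as the "result".
    results[idx] = (id(w), sum(wi * xi for wi, xi in zip(w, x)))


scope = Scope()
scope.var("fc.w", [0.5, -1.0, 2.0])  # parameter loaded once, before any thread runs

inputs = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [0.0, 0.0, 1.0]]
results = [None] * len(inputs)
threads = [
    threading.Thread(target=run_inference, args=(scope, x, results, i))
    for i, x in enumerate(inputs)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Every thread sees the same parameter object because all of them were handed the same `scope`, so memory for the parameters is paid only once.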

Parameters and the feed and fetch holder variables are all defined as persistable, so they live in the global scope and are not created and destroyed in each Run. However, we do not want to share the feed and fetch holder variables among threads; otherwise the inference results will be overwritten by the slowest thread's result. If the program defined in the main thread contains feed_ops and fetch_ops, we need to clone the program in each thread and change the feed/fetch_holder_name to a unique name per thread.
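The clone-and-rename step might look like the following toy sketch in plain Python. It does not use the real Fluid API: the op dictionaries, `clone_with_unique_holders`, `run`, and the `feed_<id>` / `fetch_<id>` naming scheme are all assumptions made for illustration. The global scope is modeled as a dict that holds the shared parameter plus one feed holder and one fetch holder per thread.

```python
import copy
import threading

# Hypothetical toy "program": a list of op descriptions.
main_program = [
    {"type": "feed", "holder": "feed", "out": "x"},
    {"type": "scale", "in": "x", "param": "scale.w", "out": "y"},
    {"type": "fetch", "in": "y", "holder": "fetch"},
]


def clone_with_unique_holders(program, thread_id):
    # Deep-copy so the main program is left untouched, then give the
    # feed/fetch holders a name that is unique to this thread.
    cloned = copy.deepcopy(program)
    for op in cloned:
        if op["type"] in ("feed", "fetch"):
            op["holder"] = "{}_{}".format(op["holder"], thread_id)
    return cloned


def run(program, global_scope):
    local = {}  # temporaries are created and dropped in each run
    for op in program:
        if op["type"] == "feed":
            local[op["out"]] = global_scope[op["holder"]]
        elif op["type"] == "scale":
            local[op["out"]] = local[op["in"]] * global_scope[op["param"]]
        elif op["type"] == "fetch":
            global_scope[op["holder"]] = local[op["in"]]


def worker(thread_id, value, global_scope, outputs):
    program = clone_with_unique_holders(main_program, thread_id)
    feed_holder = program[0]["holder"]
    fetch_holder = program[-1]["holder"]
    global_scope[feed_holder] = value  # write this thread's input
    run(program, global_scope)
    outputs[thread_id] = global_scope[fetch_holder]  # read this thread's result


global_scope = {"scale.w": 10}  # shared, read-only parameter
outputs = {}
threads = [
    threading.Thread(target=worker, args=(i, i + 1, global_scope, outputs))
    for i in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Because each thread reads and writes only its own holder entries, no thread can overwrite another thread's result, while `scale.w` is still stored once in the shared scope.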
