Support multi-threaded inference on the same program with shared parameters
Created by: Xreki
Sometimes users want to run inference on the same program from multiple threads, and they want those threads to share the parameters to minimize memory usage.
In Fluid, it is easy to share variables among threads: the threads only need to run with the same scope and the same type of place.
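The idea of sharing one scope can be illustrated with a toy model. This is a minimal sketch in plain Python, not the Fluid API: ToyScope, run_inference, and the parameter name fc.w are hypothetical names invented for illustration. It only shows that threads running against the same scope object read a single copy of the parameters.

```python
import threading

class ToyScope:
    """Hypothetical stand-in for a scope: a mapping from variable name to value."""
    def __init__(self):
        self.vars = {}

def run_inference(scope, x, results, idx):
    # Every thread looks the parameter up in the same scope; nothing is copied.
    w = scope.vars["fc.w"]
    output = sum(wi * xi for wi, xi in zip(w, x))
    # Record the identity of the parameter object together with the output.
    results[idx] = (id(w), output)

scope = ToyScope()
scope.vars["fc.w"] = [0.5, -1.0, 2.0]  # parameters are loaded once

inputs = [[1.0, 2.0, 3.0], [2.0, 0.0, 1.0], [0.0, 1.0, 0.0], [4.0, 4.0, 4.0]]
results = [None] * len(inputs)
threads = [
    threading.Thread(target=run_inference, args=(scope, x, results, i))
    for i, x in enumerate(inputs)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

param_ids = {pid for pid, _ in results}   # one entry if all threads saw the same object
outputs = [out for _, out in results]
```

After the threads join, param_ids contains a single id, which is the toy equivalent of all threads using one copy of the parameters in memory.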
Both the parameters and the feed and fetch holder variables are defined as persistable, which means they are defined in the global scope and are not created and destroyed in each Run. However, we do not want to share the feed and fetch holder variables among threads; otherwise every thread writes to the same holders, and the inference results are overwritten by the slowest thread's result. If the program defined in the main thread contains feed_ops and fetch_ops, we need to clone a copy of the program and change the feed/fetch_holder_name to a unique one in each thread.
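The clone-and-rename step can be sketched as follows. This is a toy model under stated assumptions, not the Fluid implementation: the program is represented as a list of dictionaries, and MAIN_PROGRAM, clone_with_unique_holders, run, and the thread-index suffix used for holder names are all hypothetical. The parameter w stays shared in the scope, while each thread gets its own feed and fetch holders.

```python
import copy
import threading

# Hypothetical toy program: an ordered list of op descriptions.
MAIN_PROGRAM = [
    {"type": "feed", "holder": "feed"},
    {"type": "mul", "param": "w"},
    {"type": "fetch", "holder": "fetch"},
]

def clone_with_unique_holders(program, thread_id):
    """Deep-copy the program and give its feed/fetch ops per-thread holder names."""
    cloned = copy.deepcopy(program)
    for op in cloned:
        if op["type"] in ("feed", "fetch"):
            op["holder"] = "{}_{}".format(op["holder"], thread_id)
    return cloned

def run(program, scope, x):
    """Interpret the toy program against the shared scope."""
    value = None
    for op in program:
        if op["type"] == "feed":
            scope[op["holder"]] = x              # input goes into this thread's feed holder
            value = scope[op["holder"]]
        elif op["type"] == "mul":
            value = value * scope[op["param"]]   # shared parameter, read only
        elif op["type"] == "fetch":
            scope[op["holder"]] = value          # result goes into this thread's fetch holder

scope = {"w": 2.0}  # the shared parameter lives in the scope exactly once
inputs = [1.0, 2.0, 3.0]
programs = [clone_with_unique_holders(MAIN_PROGRAM, i) for i in range(len(inputs))]
threads = [
    threading.Thread(target=run, args=(programs[i], scope, inputs[i]))
    for i in range(len(inputs))
]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Each thread's result survives because no two threads wrote to the same holder.
fetched = [scope["fetch_{}".format(i)] for i in range(len(inputs))]
```

If every thread ran MAIN_PROGRAM directly, all of them would write to the single fetch holder and only one result would remain in the scope after the threads finish.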