[Enhancement] multi-thread training / prediction is unfriendly to users.
Created by: TomorrowIsAnOtherDay
In reinforcement learning algorithms, running training and prediction at the same time is necessary. However, it is hard to implement our parallel algorithm with the current Fluid API, and it is very easy to run into trouble when you try multi-threaded training in your algorithm. Here is an example showing how to use multiple threads safely in Fluid.
```python
#!/usr/bin/env python
# coding=utf8
# File: paddle_multi_thread.py
from paddle import fluid
import numpy as np
import threading as th
import time


def train_thread(idx, output):
    # Rule: each thread gets its own scope, derived from the global scope.
    local_scope = fluid.global_scope().new_scope()
    place = fluid.CPUPlace()
    # Rule: each thread gets its own executor.
    exe = fluid.Executor(place=place)
    feed = {'inputs': np.array(idx)}
    # Rule: give each run thread-unique feed/fetch variable names.
    output_np = exe.run(feed=feed,
                        fetch_list=[output],
                        scope=local_scope,
                        feed_var_name='feed' + str(idx),
                        fetch_var_name='fetch' + str(idx))[0]
    print('thread idx: {}, ideal output: {}, real output: {}'.format(
        idx, idx, output_np))


def main():
    with fluid.unique_name.guard():
        # The network simply passes the input through, so each thread
        # should fetch an output equal to its own index.
        inputs = fluid.layers.data(name='inputs', shape=[], dtype='int64')
        output = inputs
        ass = fluid.layers.assign(input=output)

        place = fluid.CPUPlace()
        exe = fluid.Executor(place=place)
        exe.run(fluid.default_startup_program())

        start = time.time()
        th_list = []
        train_thread_num = 10
        for i in range(train_thread_num):
            t = th.Thread(target=train_thread, args=(i, ass))
            t.start()
            th_list.append(t)
        for t in th_list:
            t.join()
        print('time: {}'.format(time.time() - start))


main()
```
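For contrast, here is a sketch of the unsafe pattern the example is working around: sharing one executor, the global scope, and the default feed/fetch variable names across threads. This is exactly how users get into trouble; `unsafe_train_thread` is an illustrative name invented for this sketch.

```python
import numpy as np

# UNSAFE sketch (do not use): all threads share one executor, the global
# scope, and the default 'feed'/'fetch' variable names, so concurrent
# exe.run calls can collide on the shared feed/fetch state.
def unsafe_train_thread(idx, output, exe):
    feed = {'inputs': np.array(idx)}
    # No per-thread scope and no per-thread feed/fetch prefix here.
    output_np = exe.run(feed=feed, fetch_list=[output])[0]
    print('thread idx: {}, output: {}'.format(idx, output_np))
```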
To use multiple threads safely in your algorithm, there are three rules you have to obey (a helper that packages all three is sketched after this list):
- each thread must have its own executor;
- each thread must have its own scope;
- each `executor.run` call must specify thread-unique `feed_var_name` / `fetch_var_name` prefixes, because of mechanisms underneath the executor.
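As a minimal sketch of how these rules could be bundled today, the helper below packages all three into one object. `ThreadSafeExecutor` is a hypothetical name invented for this issue, not a Fluid API; it only wraps the pattern from the example above.

```python
from paddle import fluid

class ThreadSafeExecutor(object):  # hypothetical helper, not a Fluid API
    def __init__(self, idx, place=None):
        self.idx = idx
        # Rule 2: each thread gets its own scope from the global scope.
        self.scope = fluid.global_scope().new_scope()
        # Rule 1: each thread gets its own executor.
        self.exe = fluid.Executor(place or fluid.CPUPlace())

    def run(self, feed, fetch_list):
        # Rule 3: prefix the feed/fetch variable names with the thread
        # index so concurrent runs do not collide inside the executor.
        return self.exe.run(feed=feed,
                            fetch_list=fetch_list,
                            scope=self.scope,
                            feed_var_name='feed' + str(self.idx),
                            fetch_var_name='fetch' + str(self.idx))
```

With such a helper, `train_thread` reduces to constructing a feed dict and calling `run`, but the user still has to know the helper is needed at all.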
This is really unfriendly to users who want to use multiple threads in their algorithms. Could we relax these limitations, or hide this bookkeeping beneath the Fluid API?
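To make the request concrete, here is a hypothetical sketch of what user code could look like if the executor handled the per-thread bookkeeping internally, for instance by keying scopes and feed/fetch names on the thread id. None of this is current Fluid behavior.

```python
import numpy as np
from paddle import fluid

# Hypothetical usage, NOT current Fluid behavior: a single shared executor
# that internally allocates a per-thread scope and thread-unique
# feed/fetch variable names (e.g. keyed on threading.get_ident()).
exe = fluid.Executor(fluid.CPUPlace())  # shared by every thread

def train_thread(idx, output):
    feed = {'inputs': np.array(idx)}
    # One call, with no scope or feed/fetch-name management in user code.
    output_np = exe.run(feed=feed, fetch_list=[output])[0]
    print('thread idx: {}, output: {}'.format(idx, output_np))
```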