Created by: wzzju
Allow ParallelExecutor to fetch unmerged LoDTensors.
Add return_merged as a parameter of Executor.run. It indicates whether the fetched variables (the variables specified in the fetch list) should be merged along the execution device dimension. If return_merged is False, the return value is a two-dimensional list of Tensor (when return_numpy is False) or a two-dimensional list of numpy.ndarray (when return_numpy is True). If return_merged is True, the return value is a one-dimensional list of Tensor (when return_numpy is False) or a one-dimensional list of numpy.ndarray (when return_numpy is True). If the fetched results differ in length across devices, set return_merged to False so that the results are not merged. The default is True, kept only for backward compatibility; a future version may change the default to False.
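To make the two return shapes concrete, here is a minimal pure-Python sketch of the merging behaviour. It is not the Paddle implementation: `collect_fetches` is a hypothetical helper, nested lists stand in for Tensor or numpy.ndarray, and the layout of the unmerged result (outer index is the fetch-list entry, inner index is the device) is an assumption made for illustration.

```python
from typing import List

Rows = List[List[float]]  # stand-in for one tensor: a list of rows


def collect_fetches(per_device: List[List[Rows]], return_merged: bool = True):
    """Hypothetical helper; per_device[d][v] is device d's result for fetch variable v."""
    num_vars = len(per_device[0])
    # Regroup so the outer index is the fetch variable and the inner index is the device.
    by_var = [[dev[v] for dev in per_device] for v in range(num_vars)]
    if not return_merged:
        # Two-dimensional list: one entry per fetch variable, one sub-entry per device.
        return by_var
    # One-dimensional list: each variable's per-device parts are concatenated row-wise.
    return [[row for part in parts for row in part] for parts in by_var]


# Two devices, one fetch variable; the devices return a different number of rows.
per_device = [
    [[[1.0, 2.0], [3.0, 4.0]]],  # device 0: two rows
    [[[5.0, 6.0]]],              # device 1: one row
]

merged = collect_fetches(per_device, return_merged=True)
unmerged = collect_fetches(per_device, return_merged=False)

print(len(merged), len(merged[0]))      # 1 variable, 3 rows after merging
print(len(unmerged), len(unmerged[0]))  # 1 variable, 2 per-device parts
```

With return_merged set to False, the caller can still tell which device produced which rows, which is the information lost once the parts are concatenated.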


