Model Parallelism
Created by: yingfeng
Does Paddle plan to support model parallelism in a distributed environment? Currently, model parallelism is limited to a single machine, and it applies only when the layers themselves can run in parallel, with different GPUs assigned different layers. Distributed model parallelism would instead split certain layers into sublayers that run on different machines. This introduces a much larger communication cost, but it would be a useful feature when a model cannot fit on a single machine.
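
To make the request concrete, here is a minimal sketch of what "splitting a layer into sublayers" could mean for a fully connected layer: the weight matrix is cut column-wise into shards, each shard computes its slice of the output, and the slices are gathered back together. This is plain Python, not Paddle's API. The helpers `matmul`, `split_columns`, and `sharded_forward` are hypothetical names used only for illustration, and the "machines" are simulated by a loop in one process. The gather step is where the extra communication cost would show up in a real distributed setup.

```python
def matmul(a, b):
    """Plain-Python matrix product of a (n x k) and b (k x m)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def split_columns(weight, num_workers):
    """Cut the weight matrix into contiguous column shards, one per worker.

    Hypothetical helper: in a real system each shard would live on a
    different machine, so no single machine holds the full matrix.
    """
    cols = len(weight[0])
    base, extra = divmod(cols, num_workers)
    shards, start = [], 0
    for i in range(num_workers):
        width = base + (1 if i < extra else 0)  # spread the remainder evenly
        shards.append([row[start:start + width] for row in weight])
        start += width
    return shards


def sharded_forward(x, shards):
    """Compute x @ W shard by shard, then gather the output slices."""
    # Each partial product would run on its own machine.
    partials = [matmul(x, shard) for shard in shards]
    # Gather step: concatenate the column slices row by row.
    # In a distributed setting this is the network communication.
    return [sum((p[r] for p in partials), []) for r in range(len(x))]


# Small integer example so the comparison is exact.
x = [[1, 2], [3, 4]]
weight = [[1, 0, 2, 1], [0, 1, 1, 3]]

shards = split_columns(weight, num_workers=2)
y_sharded = sharded_forward(x, shards)
y_full = matmul(x, weight)
assert y_sharded == y_full
```

The sharded result matches the unsharded one, which is the property a distributed implementation would need to preserve while keeping each shard's parameters on a separate machine.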