Created by: sandyhouse
PR types: New features
PR changes: Others
Describe: Add support for a device index in device_guard, such as "gpu:1", to support pipeline parallelism. Currently, the device index is only used for pipeline parallelism; in non-pipeline mode, the device index is ignored.
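
To illustrate the intended semantics, here is a minimal sketch of how a device string such as "gpu:1" could be interpreted. It is not the code from this PR: the helper name `parse_device` and its `pipeline` flag are hypothetical, introduced only for illustration, and the sketch assumes the device string has the form `<type>` or `<type>:<index>`.

```python
from typing import Optional, Tuple


def parse_device(device: str, pipeline: bool) -> Tuple[str, Optional[int]]:
    """Hypothetical helper: split a device string into (type, index).

    The index is kept only when pipeline parallelism is enabled;
    otherwise it is dropped, mirroring the behavior described above.
    """
    kind, sep, index = device.partition(":")
    if not kind:
        raise ValueError(f"missing device type in {device!r}")
    if not sep:
        # No index given, e.g. "gpu" or "cpu".
        return kind, None
    if not index.isdigit():
        raise ValueError(f"invalid device index in {device!r}")
    # Assumption: outside of pipeline mode the index is silently ignored.
    return (kind, int(index)) if pipeline else (kind, None)


if __name__ == "__main__":
    print(parse_device("gpu:1", pipeline=True))
    print(parse_device("gpu:1", pipeline=False))
```

With pipeline enabled, "gpu:1" keeps its index; without it, the same string resolves to the device type alone.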