Created by: guru4elephant
Add a mirrored worker for multi-GPU execution. Add Collective Wrapper as a base class for collective operations. Add the corresponding Python API for the mirrored worker. More examples will be added.
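
For reviewers unfamiliar with the pattern, here is a rough sketch of how a collective base class and a mirrored update step can fit together. It is not the code in this PR: `CollectiveBase`, `AllReduceMean`, and `mirrored_step` are hypothetical names, and devices are simulated with plain Python lists rather than GPUs.

```python
from abc import ABC, abstractmethod
from typing import List


class CollectiveBase(ABC):
    """Hypothetical base class for a collective operation over per-device values."""

    def __init__(self, num_devices: int) -> None:
        if num_devices < 1:
            raise ValueError("num_devices must be at least 1")
        self.num_devices = num_devices

    @abstractmethod
    def run(self, per_device: List[List[float]]) -> List[List[float]]:
        """Take one vector per device and return one vector per device."""


class AllReduceMean(CollectiveBase):
    """Average the vectors element-wise and hand every device the same result."""

    def run(self, per_device: List[List[float]]) -> List[List[float]]:
        if len(per_device) != self.num_devices:
            raise ValueError("expected one vector per device")
        length = len(per_device[0])
        mean = [
            sum(vec[i] for vec in per_device) / self.num_devices
            for i in range(length)
        ]
        # Each simulated device receives its own copy of the averaged vector.
        return [list(mean) for _ in per_device]


def mirrored_step(
    weights: List[List[float]],
    grads: List[List[float]],
    collective: CollectiveBase,
    lr: float = 0.1,
) -> List[List[float]]:
    """Apply one SGD step per replica after synchronizing gradients."""
    synced = collective.run(grads)
    return [
        [w - lr * g for w, g in zip(ws, gs)]
        for ws, gs in zip(weights, synced)
    ]


# Two simulated devices start from identical weights but see different gradients.
weights = [[1.0, 2.0], [1.0, 2.0]]
grads = [[1.0, 0.0], [3.0, 2.0]]
updated = mirrored_step(weights, grads, AllReduceMean(num_devices=2), lr=0.5)
```

Because every replica applies the same averaged gradient, the weight copies stay identical after the step, which is the property a mirrored worker relies on.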