Data-parallel training also works with [Automatic Mixed Precision (AMP)](https://pytorch.org/docs/stable/notes/amp_examples.html#working-with-multiple-gpus).
Compared to `DataParallel`, DDP requires one more step to set up, i.e., calling [init_process_group](https://pytorch.org/docs/stable/distributed.html#torch.distributed.init_process_group).
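A minimal sketch of one training step combining the two, assuming one process per GPU (the `run_worker` name is illustrative; in practice the function would be launched via `torch.multiprocessing.spawn` or `torchrun`):

```python
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP

def run_worker(rank: int, world_size: int) -> None:
    # The extra setup step: join the process group before constructing DDP.
    dist.init_process_group("nccl", rank=rank, world_size=world_size)

    model = nn.Linear(10, 10).to(rank)
    ddp_model = DDP(model, device_ids=[rank])
    optimizer = torch.optim.SGD(ddp_model.parameters(), lr=0.001)
    scaler = torch.cuda.amp.GradScaler()  # rescales the loss to avoid fp16 underflow

    inputs = torch.randn(20, 10, device=rank)
    targets = torch.randn(20, 10, device=rank)

    optimizer.zero_grad()
    with torch.cuda.amp.autocast():  # forward pass runs in mixed precision
        loss = nn.functional.mse_loss(ddp_model(inputs), targets)
    scaler.scale(loss).backward()    # DDP all-reduces gradients during backward
    scaler.step(optimizer)
    scaler.update()

    dist.destroy_process_group()
```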
The [Implementing Batch RPC Processing Using Asynchronous Executions](../intermediate/rpc_async_execution.html)
tutorial demonstrates how to implement RPC batch processing using the [@rpc.functions.async_execution](https://pytorch.org/docs/stable/rpc.html#torch.distributed.rpc.functions.async_execution)
decorator, which can help speed up inference and training. It uses reinforcement learning (RL) and parameter server (PS) examples.
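A minimal sketch of the batching pattern the decorator enables (class and method names here are illustrative, not the tutorial's exact code): the parameter server returns a `Future` immediately, freeing the RPC thread, and fulfills it only after a full batch of trainers has reported gradients.

```python
import threading

import torch
from torch import Tensor
from torch.distributed import rpc
from torch.futures import Future

class BatchUpdateParameterServer:
    """Toy PS that applies one update per batch of trainer gradients."""

    def __init__(self, batch_size: int):
        self.batch_size = batch_size
        self.curr_count = 0
        self.lock = threading.Lock()
        self.params = torch.zeros(10)      # toy model parameters
        self.grad_sum = torch.zeros(10)
        self.future = Future()             # fulfilled once the batch completes

    @staticmethod
    @rpc.functions.async_execution
    def update_and_fetch(ps_rref, grad: Tensor) -> Future:
        # Returning a Future lets the RPC thread move on; the response is
        # sent later, when all `batch_size` trainers have reported gradients.
        self = ps_rref.local_value()
        with self.lock:
            self.grad_sum += grad
            self.curr_count += 1
            fut = self.future
            if self.curr_count >= self.batch_size:
                self.params -= 0.05 * self.grad_sum / self.batch_size
                fut.set_result(self.params.clone())  # unblock every trainer
                self.grad_sum.zero_()
                self.curr_count = 0
                self.future = Future()
        return fut
```

A trainer would then block on the batched update with something like `rpc.rpc_sync(ps_rref.owner(), BatchUpdateParameterServer.update_and_fetch, args=(ps_rref, grad))`.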