If you wish to resume from an existing model, please set ``--checkpoint_path`` and ``--transformer_step``.
**Note: For good training results, we recommend multi-GPU training to enlarge the overall batch size, with at least 16 samples per batch on each GPU.**
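
The arithmetic behind this recommendation can be sketched as follows. This is only an illustration, assuming plain data-parallel training in which every GPU processes its own batch on each step. The helper names ``effective_batch_size`` and ``meets_recommendation`` are hypothetical and not part of this repository.

```python
# Minimal sketch, assuming plain data parallelism (one batch per GPU per step).
# All names below are hypothetical helpers, not part of this repository.

MIN_PER_GPU_BATCH = 16  # recommended minimum per-GPU batch size from the note above


def effective_batch_size(per_gpu_batch: int, num_gpus: int) -> int:
    """Return the total number of samples processed per step across all GPUs."""
    if per_gpu_batch < 1 or num_gpus < 1:
        raise ValueError("per_gpu_batch and num_gpus must be positive")
    return per_gpu_batch * num_gpus


def meets_recommendation(per_gpu_batch: int) -> bool:
    """Check a per-GPU batch size against the recommended minimum."""
    return per_gpu_batch >= MIN_PER_GPU_BATCH


if __name__ == "__main__":
    # Show how the overall batch size grows as GPUs are added.
    for gpus in (1, 4, 8):
        print(f"{gpus} GPU(s) x 16 samples = {effective_batch_size(16, gpus)}")
```

Under this assumption, adding GPUs scales the overall batch size linearly while the per-GPU memory footprint stays the same.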