Convert the training process to the Once-for-All (OFA) training process, described in detail in the paper `Once-for-All: Train One Network and Specialize it for Efficient Deployment <https://arxiv.org/abs/1908.09791>`_. The paper proposes a training process named progressive shrinking (PS): training starts with the largest neural network, which has the maximum kernel size (e.g., 7), depth (e.g., 4), and width (e.g., 6). The network is then progressively fine-tuned to support smaller sub-networks by gradually adding them to the sampling space (larger sub-networks may still be sampled). Specifically, after the largest network is trained, elastic kernel size is supported first, with the kernel size chosen from {3, 5, 7} at each layer while depth and width stay at their maximum values. Elastic depth and elastic width are then supported in sequence.
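The progressive shrinking schedule described above can be sketched in plain Python. This is only an illustration of how the sampling space grows phase by phase, not PaddleSlim's implementation: the names ``FULL_SPACE``, ``sampling_space`` and ``sample_subnet`` are hypothetical, the depth and width choices are the ones used in the paper (they are not stated in this docstring), and the finer-grained shrinking within each phase is omitted.

```python
import random

# Assumed choices per dimension, largest value first (hypothetical names).
FULL_SPACE = {
    "kernel_size": [7, 5, 3],
    "depth": [4, 3, 2],
    "width": [6, 4, 3],
}

# Order in which dimensions become elastic: kernel size, then depth, then width.
DEFAULT_ORDER = ["kernel_size", "depth", "width"]


def sampling_space(phase, order=DEFAULT_ORDER):
    """Return the choices open to sampling in a given phase.

    Phase 0 trains only the largest network; each later phase makes one
    more dimension elastic while keeping the earlier ones elastic too.
    """
    # Start with every dimension fixed at its maximum value.
    space = {dim: [choices[0]] for dim, choices in FULL_SPACE.items()}
    # Open up the first `phase` dimensions in the given order.
    for dim in order[:phase]:
        space[dim] = list(FULL_SPACE[dim])
    return space


def sample_subnet(phase, rng):
    """Sample one sub-network configuration from the current space."""
    space = sampling_space(phase)
    return {dim: rng.choice(choices) for dim, choices in space.items()}


if __name__ == "__main__":
    rng = random.Random(0)
    for phase in range(len(DEFAULT_ORDER) + 1):
        print(phase, sampling_space(phase), sample_subnet(phase, rng))
```

Because the maximum value of every dimension stays in the space, the largest sub-network can still be sampled in later phases, which matches the note above that larger sub-networks may also be sampled.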
Parameters:
model(paddle.nn.Layer): instance of model.
run_config(paddleslim.ofa.RunConfig, optional): configuration for OFA training; see `<>`_. Default: None.
distill_config(paddleslim.ofa.DistillConfig, optional): configuration for distillation in OFA training; see `<>`_. Default: None.
elastic_order(list, optional): the training order of the elastic dimensions. If set to None, the default order from the paper is used. Default: None.
train_full(bool, optional): whether to train only the largest sub-network. Default: False.
Examples:
.. code-block:: python
from paddle.vision.models import mobilenet_v1
from paddleslim.nas.ofa import OFA
from paddleslim.nas.ofa.convert_super import Convert, supernet