Internal change

PiperOrigin-RevId: 357758634

a86917df · A. Unique TensorFlower · 2fb3c898
15 changed files
--- a/official/vision/beta/MODEL_GARDEN.md
+++ b/official/vision/beta/MODEL_GARDEN.md
@@ -6,7 +6,7 @@ TF Vision model garden provides a large collection of baselines and checkpoints

 ## Image Classification
 ### ImageNet Baselines
-#### Models trained with vanilla settings:
+#### ResNet models trained with vanilla settings:
 * Models are trained from scratch with batch size 4096 and 1.6 initial learning rate.
 * Linear warmup is applied for the first 5 epochs.
 * Models are trained with L2 weight regularization and ReLU activation.
@@ -18,17 +18,27 @@ TF Vision model garden provides a large collection of baselines and checkpoints
 | ResNet-101   | 224x224       |    200   | 78.3 | 94.2 | config |
 | ResNet-152   | 224x224       |    200   | 78.7 | 94.3 | config |

-#### Models trained with training features including:
-* Label smoothing 0.1.
-* Swish activation.
-
-| model        | resolution    | epochs  |   Top-1  |  Top-5  | download |
-| ------------ |:-------------:| ---------:|--------:|---------:|---------:|
-| ResNet-50    | 224x224       |    200    | 78.1 | 93.9 | [config](https://github.com/tensorflow/models/blob/master/official/vision/beta/configs/experiments/image_classification/imagenet_resnet50_tpu.yaml) |
-| ResNet-101   | 224x224       |    200    | 79.1 | 94.5 | [config](https://github.com/tensorflow/models/blob/master/official/vision/beta/configs/experiments/image_classification/imagenet_resnet101_tpu.yaml) |
-| ResNet-152   | 224x224       |    200    | 79.4 | 94.7 | [config](https://github.com/tensorflow/models/blob/master/official/vision/beta/configs/experiments/image_classification/imagenet_resnet152_tpu.yaml) |
-| ResNet-200   | 224x224       |    200    | 79.9 | 94.8 | [config](https://github.com/tensorflow/models/blob/master/official/vision/beta/configs/experiments/image_classification/imagenet_resnet200_tpu.yaml) |

+#### ResNet-RS models trained with settings including:
+
+*   ResNet-RS architectural changes and Swish activation.
+*   Regularization methods including RandAugment, 4e-5 weight decay, stochastic depth, label smoothing and dropout.
+*   New training methods including a 350-epoch schedule, cosine learning rate decay and
+    EMA.
+*   Configs are in this [directory](https://github.com/tensorflow/models/blob/master/official/vision/beta/configs/experiments/image_classification).
+
+model         | resolution | params (M) | Top-1 | Top-5 | download
+------------- | :--------: | ---------: | ----: | ----: | -------:
+ResNet-RS-50  | 160x160    | 35.7       | 79.1  | 94.5  |
+ResNet-RS-101 | 160x160    | 63.7       | 80.2  | 94.9  |
+ResNet-RS-101 | 192x192    | 63.7       | 81.3  | 95.6  |
+ResNet-RS-152 | 192x192    | 86.8       | 81.9  | 95.8  |
+ResNet-RS-152 | 224x224    | 86.8       | 82.5  | 96.1  |
+ResNet-RS-152 | 256x256    | 86.8       | 83.1  | 96.3  |
+ResNet-RS-200 | 256x256    | 93.4       | 83.5  | 96.6  |
+ResNet-RS-270 | 256x256    | 130.1      | 83.6  | 96.6  |
+ResNet-RS-350 | 256x256    | 164.3      | 83.7  | 96.7  |
+ResNet-RS-350 | 320x320    | 164.3      | 84.2  | 96.9  |


 ## Object Detection and Instance Segmentation

--- a/official/vision/beta/configs/experiments/image_classification/imagenet_resnet300_tpu.yaml
+++ b/official/vision/beta/configs/experiments/image_classification/imagenet_resnet300_tpu.yaml
-# ResNet-300 ImageNet classification. 82.6% top-1 and 96.3% top-5 accuracy.
+# ResNet-RS-101 ImageNet classification. 80.2% top-1 accuracy.
 runtime:
  distribution_strategy: 'tpu'
  mixed_precision_dtype: 'bfloat16'
 task:
  model:
    num_classes: 1001
-    input_size: [380, 380, 3]
+    input_size: [160, 160, 3]
    backbone:
      type: 'resnet'
      resnet:
-        model_id: 300
-        stem_type: 'v1'
+        model_id: 101
+        replace_stem_max_pool: true
+        resnetd_shortcut: true
        se_ratio: 0.25
-        stochastic_depth_drop_rate: 0.2
+        stem_type: 'v1'
+        stochastic_depth_drop_rate: 0.0
    norm_activation:
      activation: 'swish'
+      norm_momentum: 0.0
+      use_sync_bn: false
  losses:
-    l2_weight_decay: 0.0001
+    l2_weight_decay: 0.00004
    one_hot: true
    label_smoothing: 0.1
  train_data:
@@ -24,6 +28,8 @@ task:
    is_training: true
    global_batch_size: 4096
    dtype: 'bfloat16'
+    aug_policy: 'randaug'
+    randaug_magnitude: 15
  validation_data:
    input_path: 'imagenet-2012-tfrecord/valid*'
    is_training: false
@@ -31,13 +37,15 @@ task:
    dtype: 'bfloat16'
    drop_remainder: false
 trainer:
-  train_steps: 62400
+  train_steps: 109200
  validation_steps: 13
  validation_interval: 312
  steps_per_loop: 312
  summary_interval: 312
  checkpoint_interval: 312
  optimizer_config:
+    ema:
+      average_decay: 0.9999
    optimizer:
      type: 'sgd'
      sgd:
@@ -46,7 +54,7 @@ trainer:
      type: 'cosine'
      cosine:
        initial_learning_rate: 1.6
-        decay_steps: 62400
+        decay_steps: 109200
    warmup:
      type: 'linear'
      linear:

--- a/official/vision/beta/configs/experiments/image_classification/imagenet_resnetrs101_i192.yaml
+++ b/official/vision/beta/configs/experiments/image_classification/imagenet_resnetrs101_i192.yaml
+# ResNet-RS-101 ImageNet classification. 81.3% top-1 accuracy.
+runtime:
+  distribution_strategy: 'tpu'
+  mixed_precision_dtype: 'bfloat16'
+task:
+  model:
+    num_classes: 1001
+    input_size: [192, 192, 3]
+    backbone:
+      type: 'resnet'
+      resnet:
+        model_id: 101
+        replace_stem_max_pool: true
+        resnetd_shortcut: true
+        se_ratio: 0.25
+        stem_type: 'v1'
+        stochastic_depth_drop_rate: 0.0
+    norm_activation:
+      activation: 'swish'
+      norm_momentum: 0.0
+      use_sync_bn: false
+  losses:
+    l2_weight_decay: 0.00004
+    one_hot: true
+    label_smoothing: 0.1
+  train_data:
+    input_path: 'imagenet-2012-tfrecord/train*'
+    is_training: true
+    global_batch_size: 4096
+    dtype: 'bfloat16'
+    aug_policy: 'randaug'
+    randaug_magnitude: 15
+  validation_data:
+    input_path: 'imagenet-2012-tfrecord/valid*'
+    is_training: false
+    global_batch_size: 4096
+    dtype: 'bfloat16'
+    drop_remainder: false
+trainer:
+  train_steps: 109200
+  validation_steps: 13
+  validation_interval: 312
+  steps_per_loop: 312
+  summary_interval: 312
+  checkpoint_interval: 312
+  optimizer_config:
+    ema:
+      average_decay: 0.9999
+    optimizer:
+      type: 'sgd'
+      sgd:
+        momentum: 0.9
+    learning_rate:
+      type: 'cosine'
+      cosine:
+        initial_learning_rate: 1.6
+        decay_steps: 109200
+    warmup:
+      type: 'linear'
+      linear:
+        warmup_steps: 1560
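
The `learning_rate` and `warmup` blocks in these configs describe a cosine decay preceded by a linear warmup. A minimal sketch of that schedule in plain Python follows, using the values from the YAML above. It assumes the warmup ramps from 0 up to the cosine value at `warmup_steps`. The function names are illustrative only, and the Model Garden's own schedule classes may differ in detail.

```python
import math

INITIAL_LR = 1.6       # cosine.initial_learning_rate
DECAY_STEPS = 109200   # cosine.decay_steps
WARMUP_STEPS = 1560    # linear.warmup_steps


def cosine_lr(step: int) -> float:
    """Cosine decay from INITIAL_LR down to 0 over DECAY_STEPS."""
    progress = min(step, DECAY_STEPS) / DECAY_STEPS
    return 0.5 * INITIAL_LR * (1.0 + math.cos(math.pi * progress))


def learning_rate(step: int) -> float:
    """Linear warmup (assumed to start at 0), then cosine decay."""
    if step < WARMUP_STEPS:
        # Ramp linearly toward the value the cosine schedule has at the
        # end of warmup, so the two pieces join without a jump.
        return cosine_lr(WARMUP_STEPS) * step / WARMUP_STEPS
    return cosine_lr(step)


if __name__ == "__main__":
    for s in (0, 780, 1560, 54600, 109200):
        print(s, round(learning_rate(s), 4))
```

Under these assumptions the rate peaks just below 1.6 at step 1560 and reaches 0 at step 109200.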
--- a/official/vision/beta/configs/experiments/image_classification/imagenet_resnetrs152_i192.yaml
+++ b/official/vision/beta/configs/experiments/image_classification/imagenet_resnetrs152_i192.yaml
+# ResNet-RS-152 ImageNet classification. 81.9% top-1 accuracy.
+runtime:
+  distribution_strategy: 'tpu'
+  mixed_precision_dtype: 'bfloat16'
+task:
+  model:
+    num_classes: 1001
+    input_size: [192, 192, 3]
+    backbone:
+      type: 'resnet'
+      resnet:
+        model_id: 152
+        replace_stem_max_pool: true
+        resnetd_shortcut: true
+        se_ratio: 0.25
+        stem_type: 'v1'
+        stochastic_depth_drop_rate: 0.0
+    norm_activation:
+      activation: 'swish'
+      norm_momentum: 0.0
+      use_sync_bn: false
+  losses:
+    l2_weight_decay: 0.00004
+    one_hot: true
+    label_smoothing: 0.1
+  train_data:
+    input_path: 'imagenet-2012-tfrecord/train*'
+    is_training: true
+    global_batch_size: 4096
+    dtype: 'bfloat16'
+    aug_policy: 'randaug'
+    randaug_magnitude: 15
+  validation_data:
+    input_path: 'imagenet-2012-tfrecord/valid*'
+    is_training: false
+    global_batch_size: 4096
+    dtype: 'bfloat16'
+    drop_remainder: false
+trainer:
+  train_steps: 109200
+  validation_steps: 13
+  validation_interval: 312
+  steps_per_loop: 312
+  summary_interval: 312
+  checkpoint_interval: 312
+  optimizer_config:
+    ema:
+      average_decay: 0.9999
+    optimizer:
+      type: 'sgd'
+      sgd:
+        momentum: 0.9
+    learning_rate:
+      type: 'cosine'
+      cosine:
+        initial_learning_rate: 1.6
+        decay_steps: 109200
+    warmup:
+      type: 'linear'
+      linear:
+        warmup_steps: 1560
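
The `ema` entry with `average_decay: 0.9999` refers to an exponential moving average of the model weights. As a rough illustration of the update rule only, the sketch below applies it to a plain list of floats. `ema_update` is a hypothetical helper, and it ignores any start-step or dynamic-decay behavior the real optimizer wrapper may apply.

```python
from typing import List

AVERAGE_DECAY = 0.9999  # optimizer_config.ema.average_decay


def ema_update(shadow: List[float], weights: List[float],
               decay: float = AVERAGE_DECAY) -> List[float]:
    """Returns the new averaged weights after one training step."""
    return [decay * s + (1.0 - decay) * w for s, w in zip(shadow, weights)]


# Track a constant target for 1000 steps, starting from zeros.
shadow = [0.0, 0.0]
for _ in range(1000):
    shadow = ema_update(shadow, [1.0, -2.0])

if __name__ == "__main__":
    print(shadow)
```

With a decay of 0.9999 the average moves slowly: after 1000 updates toward a constant target it has covered only about 9.5% of the distance, which is why such a high decay is paired with a long training schedule.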
--- a/official/vision/beta/configs/experiments/image_classification/imagenet_resnet200_tpu.yaml
+++ b/official/vision/beta/configs/experiments/image_classification/imagenet_resnet200_tpu.yaml
-# ResNet-200 ImageNet classification. 79.9% top-1 and 94.8% top-5 accuracy.
+# ResNet-RS-152 ImageNet classification. 82.5% top-1 accuracy.
 runtime:
  distribution_strategy: 'tpu'
  mixed_precision_dtype: 'bfloat16'
@@ -9,11 +9,18 @@ task:
    backbone:
      type: 'resnet'
      resnet:
-        model_id: 200
+        model_id: 152
+        replace_stem_max_pool: true
+        resnetd_shortcut: true
+        se_ratio: 0.25
+        stem_type: 'v1'
+        stochastic_depth_drop_rate: 0.0
    norm_activation:
      activation: 'swish'
+      norm_momentum: 0.0
+      use_sync_bn: false
  losses:
-    l2_weight_decay: 0.0001
+    l2_weight_decay: 0.00004
    one_hot: true
    label_smoothing: 0.1
  train_data:
@@ -21,6 +28,8 @@ task:
    is_training: true
    global_batch_size: 4096
    dtype: 'bfloat16'
+    aug_policy: 'randaug'
+    randaug_magnitude: 15
  validation_data:
    input_path: 'imagenet-2012-tfrecord/valid*'
    is_training: false
@@ -28,13 +37,15 @@ task:
    dtype: 'bfloat16'
    drop_remainder: false
 trainer:
-  train_steps: 62400
+  train_steps: 109200
  validation_steps: 13
  validation_interval: 312
  steps_per_loop: 312
  summary_interval: 312
  checkpoint_interval: 312
  optimizer_config:
+    ema:
+      average_decay: 0.9999
    optimizer:
      type: 'sgd'
      sgd:
@@ -43,7 +54,7 @@ trainer:
      type: 'cosine'
      cosine:
        initial_learning_rate: 1.6
-        decay_steps: 62400
+        decay_steps: 109200
    warmup:
      type: 'linear'
      linear:

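The `losses` block combines `one_hot: true` with `label_smoothing: 0.1`. The sketch below shows what a smoothed target looks like for the 1001-class setup used in these configs. It assumes the common formulation in which the smoothing mass is spread uniformly over all classes; `smooth_one_hot` is a hypothetical helper, not a function from the repository.

```python
from typing import List

NUM_CLASSES = 1001      # task.model.num_classes
LABEL_SMOOTHING = 0.1   # task.losses.label_smoothing


def smooth_one_hot(label: int,
                   num_classes: int = NUM_CLASSES,
                   smoothing: float = LABEL_SMOOTHING) -> List[float]:
    """Builds a one-hot target and mixes it with a uniform distribution."""
    off_value = smoothing / num_classes
    on_value = 1.0 - smoothing + off_value
    return [on_value if i == label else off_value for i in range(num_classes)]


if __name__ == "__main__":
    target = smooth_one_hot(label=3)
    print(target[3], target[0], sum(target))
```

The true class keeps slightly more than 0.9 of the probability mass and the remainder is shared by the other classes, so the target still sums to 1.
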
--- a/official/vision/beta/configs/experiments/image_classification/imagenet_resnetrs152_i256.yaml
+++ b/official/vision/beta/configs/experiments/image_classification/imagenet_resnetrs152_i256.yaml
+# ResNet-RS-152 ImageNet classification. 83.1% top-1 accuracy.
+runtime:
+  distribution_strategy: 'tpu'
+  mixed_precision_dtype: 'bfloat16'
+task:
+  model:
+    num_classes: 1001
+    input_size: [256, 256, 3]
+    backbone:
+      type: 'resnet'
+      resnet:
+        model_id: 152
+        replace_stem_max_pool: true
+        resnetd_shortcut: true
+        se_ratio: 0.25
+        stem_type: 'v1'
+        stochastic_depth_drop_rate: 0.0
+    norm_activation:
+      activation: 'swish'
+      norm_momentum: 0.0
+      use_sync_bn: false
+  losses:
+    l2_weight_decay: 0.00004
+    one_hot: true
+    label_smoothing: 0.1
+  train_data:
+    input_path: 'imagenet-2012-tfrecord/train*'
+    is_training: true
+    global_batch_size: 4096
+    dtype: 'bfloat16'
+    aug_policy: 'randaug'
+    randaug_magnitude: 15
+  validation_data:
+    input_path: 'imagenet-2012-tfrecord/valid*'
+    is_training: false
+    global_batch_size: 4096
+    dtype: 'bfloat16'
+    drop_remainder: false
+trainer:
+  train_steps: 109200
+  validation_steps: 13
+  validation_interval: 312
+  steps_per_loop: 312
+  summary_interval: 312
+  checkpoint_interval: 312
+  optimizer_config:
+    ema:
+      average_decay: 0.9999
+    optimizer:
+      type: 'sgd'
+      sgd:
+        momentum: 0.9
+    learning_rate:
+      type: 'cosine'
+      cosine:
+        initial_learning_rate: 1.6
+        decay_steps: 109200
+    warmup:
+      type: 'linear'
+      linear:
+        warmup_steps: 1560
--- a/official/vision/beta/configs/experiments/image_classification/imagenet_resnetrs200_i256.yaml
+++ b/official/vision/beta/configs/experiments/image_classification/imagenet_resnetrs200_i256.yaml
+# ResNet-RS-200 ImageNet classification. 83.5% top-1 accuracy.
+runtime:
+  distribution_strategy: 'tpu'
+  mixed_precision_dtype: 'bfloat16'
+task:
+  model:
+    num_classes: 1001
+    input_size: [256, 256, 3]
+    backbone:
+      type: 'resnet'
+      resnet:
+        model_id: 200
+        replace_stem_max_pool: true
+        resnetd_shortcut: true
+        se_ratio: 0.25
+        stem_type: 'v1'
+        stochastic_depth_drop_rate: 0.1
+    norm_activation:
+      activation: 'swish'
+      norm_momentum: 0.0
+      use_sync_bn: false
+  losses:
+    l2_weight_decay: 0.00004
+    one_hot: true
+    label_smoothing: 0.1
+  train_data:
+    input_path: 'imagenet-2012-tfrecord/train*'
+    is_training: true
+    global_batch_size: 4096
+    dtype: 'bfloat16'
+    aug_policy: 'randaug'
+    randaug_magnitude: 15
+  validation_data:
+    input_path: 'imagenet-2012-tfrecord/valid*'
+    is_training: false
+    global_batch_size: 4096
+    dtype: 'bfloat16'
+    drop_remainder: false
+trainer:
+  train_steps: 109200
+  validation_steps: 13
+  validation_interval: 312
+  steps_per_loop: 312
+  summary_interval: 312
+  checkpoint_interval: 312
+  optimizer_config:
+    ema:
+      average_decay: 0.9999
+    optimizer:
+      type: 'sgd'
+      sgd:
+        momentum: 0.9
+    learning_rate:
+      type: 'cosine'
+      cosine:
+        initial_learning_rate: 1.6
+        decay_steps: 109200
+    warmup:
+      type: 'linear'
+      linear:
+        warmup_steps: 1560
--- a/official/vision/beta/configs/experiments/image_classification/imagenet_resnetrs270_i256.yaml
+++ b/official/vision/beta/configs/experiments/image_classification/imagenet_resnetrs270_i256.yaml
+# ResNet-RS-270 ImageNet classification. 83.6% top-1 accuracy.
+runtime:
+  distribution_strategy: 'tpu'
+  mixed_precision_dtype: 'bfloat16'
+task:
+  model:
+    num_classes: 1001
+    input_size: [256, 256, 3]
+    backbone:
+      type: 'resnet'
+      resnet:
+        model_id: 270
+        replace_stem_max_pool: true
+        resnetd_shortcut: true
+        se_ratio: 0.25
+        stem_type: 'v1'
+        stochastic_depth_drop_rate: 0.1
+    norm_activation:
+      activation: 'swish'
+      norm_momentum: 0.0
+      use_sync_bn: false
+  losses:
+    l2_weight_decay: 0.00004
+    one_hot: true
+    label_smoothing: 0.1
+  train_data:
+    input_path: 'imagenet-2012-tfrecord/train*'
+    is_training: true
+    global_batch_size: 4096
+    dtype: 'bfloat16'
+    aug_policy: 'randaug'
+    randaug_magnitude: 15
+  validation_data:
+    input_path: 'imagenet-2012-tfrecord/valid*'
+    is_training: false
+    global_batch_size: 4096
+    dtype: 'bfloat16'
+    drop_remainder: false
+trainer:
+  train_steps: 109200
+  validation_steps: 13
+  validation_interval: 312
+  steps_per_loop: 312
+  summary_interval: 312
+  checkpoint_interval: 312
+  optimizer_config:
+    ema:
+      average_decay: 0.9999
+    optimizer:
+      type: 'sgd'
+      sgd:
+        momentum: 0.9
+    learning_rate:
+      type: 'cosine'
+      cosine:
+        initial_learning_rate: 1.6
+        decay_steps: 109200
+    warmup:
+      type: 'linear'
+      linear:
+        warmup_steps: 1560
--- a/official/vision/beta/configs/experiments/image_classification/imagenet_resnetrs350_i256.yaml
+++ b/official/vision/beta/configs/experiments/image_classification/imagenet_resnetrs350_i256.yaml
+# ResNet-RS-350 ImageNet classification. 83.7% top-1 accuracy.
+runtime:
+  distribution_strategy: 'tpu'
+  mixed_precision_dtype: 'bfloat16'
+task:
+  model:
+    num_classes: 1001
+    input_size: [256, 256, 3]
+    backbone:
+      type: 'resnet'
+      resnet:
+        model_id: 350
+        replace_stem_max_pool: true
+        resnetd_shortcut: true
+        se_ratio: 0.25
+        stem_type: 'v1'
+        stochastic_depth_drop_rate: 0.1
+    norm_activation:
+      activation: 'swish'
+      norm_momentum: 0.0
+      use_sync_bn: false
+  losses:
+    l2_weight_decay: 0.00004
+    one_hot: true
+    label_smoothing: 0.1
+  train_data:
+    input_path: 'imagenet-2012-tfrecord/train*'
+    is_training: true
+    global_batch_size: 4096
+    dtype: 'bfloat16'
+    aug_policy: 'randaug'
+    randaug_magnitude: 15
+  validation_data:
+    input_path: 'imagenet-2012-tfrecord/valid*'
+    is_training: false
+    global_batch_size: 4096
+    dtype: 'bfloat16'
+    drop_remainder: false
+trainer:
+  train_steps: 109200
+  validation_steps: 13
+  validation_interval: 312
+  steps_per_loop: 312
+  summary_interval: 312
+  checkpoint_interval: 312
+  optimizer_config:
+    ema:
+      average_decay: 0.9999
+    optimizer:
+      type: 'sgd'
+      sgd:
+        momentum: 0.9
+    learning_rate:
+      type: 'cosine'
+      cosine:
+        initial_learning_rate: 1.6
+        decay_steps: 109200
+    warmup:
+      type: 'linear'
+      linear:
+        warmup_steps: 1560
--- a/official/vision/beta/configs/experiments/image_classification/imagenet_resnetrs350_i320.yaml
+++ b/official/vision/beta/configs/experiments/image_classification/imagenet_resnetrs350_i320.yaml
+# ResNet-RS-350 ImageNet classification. 84.2% top-1 accuracy.
+runtime:
+  distribution_strategy: 'tpu'
+  mixed_precision_dtype: 'bfloat16'
+task:
+  model:
+    num_classes: 1001
+    input_size: [320, 320, 3]
+    backbone:
+      type: 'resnet'
+      resnet:
+        model_id: 350
+        replace_stem_max_pool: true
+        resnetd_shortcut: true
+        se_ratio: 0.25
+        stem_type: 'v1'
+        stochastic_depth_drop_rate: 0.1
+    norm_activation:
+      activation: 'swish'
+      norm_momentum: 0.0
+      use_sync_bn: false
+  losses:
+    l2_weight_decay: 0.00004
+    one_hot: true
+    label_smoothing: 0.1
+  train_data:
+    input_path: 'imagenet-2012-tfrecord/train*'
+    is_training: true
+    global_batch_size: 4096
+    dtype: 'bfloat16'
+    aug_policy: 'randaug'
+    randaug_magnitude: 15
+  validation_data:
+    input_path: 'imagenet-2012-tfrecord/valid*'
+    is_training: false
+    global_batch_size: 4096
+    dtype: 'bfloat16'
+    drop_remainder: false
+trainer:
+  train_steps: 109200
+  validation_steps: 13
+  validation_interval: 312
+  steps_per_loop: 312
+  summary_interval: 312
+  checkpoint_interval: 312
+  optimizer_config:
+    ema:
+      average_decay: 0.9999
+    optimizer:
+      type: 'sgd'
+      sgd:
+        momentum: 0.9
+    learning_rate:
+      type: 'cosine'
+      cosine:
+        initial_learning_rate: 1.6
+        decay_steps: 109200
+    warmup:
+      type: 'linear'
+      linear:
+        warmup_steps: 1560
--- a/official/vision/beta/configs/experiments/image_classification/imagenet_resnet350_tpu.yaml
+++ b/official/vision/beta/configs/experiments/image_classification/imagenet_resnet350_tpu.yaml
-# ResNet-350 ImageNet classification. 84.2% top-1 accuracy.
 runtime:
  distribution_strategy: 'tpu'
  mixed_precision_dtype: 'bfloat16'
@@ -9,14 +8,16 @@ task:
    backbone:
      type: 'resnet'
      resnet:
-        model_id: 350
-        depth_multiplier: 1.25
-        stem_type: 'v1'
+        model_id: 420
+        replace_stem_max_pool: true
+        resnetd_shortcut: true
        se_ratio: 0.25
-        stochastic_depth_drop_rate: 0.2
+        stem_type: 'v1'
+        stochastic_depth_drop_rate: 0.1
    norm_activation:
      activation: 'swish'
-    dropout_rate: 0.5
+      norm_momentum: 0.0
+      use_sync_bn: false
  losses:
    l2_weight_decay: 0.00004
    one_hot: true
@@ -27,6 +28,7 @@ task:
    global_batch_size: 4096
    dtype: 'bfloat16'
    aug_policy: 'randaug'
+    randaug_magnitude: 15
  validation_data:
    input_path: 'imagenet-2012-tfrecord/valid*'
    is_training: false
@@ -41,6 +43,8 @@ trainer:
  summary_interval: 312
  checkpoint_interval: 312
  optimizer_config:
+    ema:
+      average_decay: 0.9999
    optimizer:
      type: 'sgd'
      sgd:
@@ -53,4 +57,4 @@ trainer:
    warmup:
      type: 'linear'
      linear:
-        warmup_steps: 5000
+        warmup_steps: 1560
--- a/official/vision/beta/configs/experiments/image_classification/imagenet_resnetrs50_i160.yaml
+++ b/official/vision/beta/configs/experiments/image_classification/imagenet_resnetrs50_i160.yaml
+# ResNet-RS-50 ImageNet classification. 79.1% top-1 accuracy.
+runtime:
+  distribution_strategy: 'tpu'
+  mixed_precision_dtype: 'bfloat16'
+task:
+  model:
+    num_classes: 1001
+    input_size: [160, 160, 3]
+    backbone:
+      type: 'resnet'
+      resnet:
+        model_id: 50
+        replace_stem_max_pool: true
+        resnetd_shortcut: true
+        se_ratio: 0.25
+        stem_type: 'v1'
+        stochastic_depth_drop_rate: 0.0
+    norm_activation:
+      activation: 'swish'
+      norm_momentum: 0.0
+      use_sync_bn: false
+  losses:
+    l2_weight_decay: 0.00004
+    one_hot: true
+    label_smoothing: 0.1
+  train_data:
+    input_path: 'imagenet-2012-tfrecord/train*'
+    is_training: true
+    global_batch_size: 4096
+    dtype: 'bfloat16'
+    aug_policy: 'randaug'
+    randaug_magnitude: 10
+  validation_data:
+    input_path: 'imagenet-2012-tfrecord/valid*'
+    is_training: false
+    global_batch_size: 4096
+    dtype: 'bfloat16'
+    drop_remainder: false
+trainer:
+  train_steps: 109200
+  validation_steps: 13
+  validation_interval: 312
+  steps_per_loop: 312
+  summary_interval: 312
+  checkpoint_interval: 312
+  optimizer_config:
+    ema:
+      average_decay: 0.9999
+    optimizer:
+      type: 'sgd'
+      sgd:
+        momentum: 0.9
+    learning_rate:
+      type: 'cosine'
+      cosine:
+        initial_learning_rate: 1.6
+        decay_steps: 109200
+    warmup:
+      type: 'linear'
+      linear:
+        warmup_steps: 1560
--- a/official/vision/beta/configs/image_classification.py
+++ b/official/vision/beta/configs/image_classification.py
@@ -35,6 +35,7 @@ class DataConfig(cfg.DataConfig):
  shuffle_buffer_size: int = 10000
  cycle_length: int = 10
  aug_policy: Optional[str] = None  # None, 'autoaug', or 'randaug'
+  randaug_magnitude: Optional[int] = 10
  file_type: str = 'tfrecord'


@@ -184,13 +185,17 @@ def image_classification_imagenet_resnetrs() -> cfg.ExperimentConfig:
                      stochastic_depth_drop_rate=0.0)),
              dropout_rate=0.25,
              norm_activation=common.NormActivation(
-                  norm_momentum=0.0, norm_epsilon=1e-5, use_sync_bn=False)),
+                  norm_momentum=0.0,
+                  norm_epsilon=1e-5,
+                  use_sync_bn=False,
+                  activation='swish')),
          losses=Losses(l2_weight_decay=4e-5, label_smoothing=0.1),
          train_data=DataConfig(
              input_path=os.path.join(IMAGENET_INPUT_PATH_BASE, 'train*'),
              is_training=True,
              global_batch_size=train_batch_size,
-              aug_policy='randaug'),
+              aug_policy='randaug',
+              randaug_magnitude=10),
          validation_data=DataConfig(
              input_path=os.path.join(IMAGENET_INPUT_PATH_BASE, 'valid*'),
              is_training=False,
@@ -199,7 +204,7 @@ def image_classification_imagenet_resnetrs() -> cfg.ExperimentConfig:
          steps_per_loop=steps_per_epoch,
          summary_interval=steps_per_epoch,
          checkpoint_interval=steps_per_epoch,
-          train_steps=360 * steps_per_epoch,
+          train_steps=350 * steps_per_epoch,
          validation_steps=IMAGENET_VAL_EXAMPLES // eval_batch_size,
          validation_interval=steps_per_epoch,
          optimizer_config=optimization.OptimizationConfig({
@@ -215,8 +220,8 @@ def image_classification_imagenet_resnetrs() -> cfg.ExperimentConfig:
              'learning_rate': {
                  'type': 'cosine',
                  'cosine': {
-                      'initial_learning_rate': 0.1,
-                      'decay_steps': 360 * steps_per_epoch
+                      'initial_learning_rate': 1.6,
+                      'decay_steps': 350 * steps_per_epoch
                  }
              },
              'warmup': {

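The step counts in the YAML files follow from the epoch arithmetic in `image_classification_imagenet_resnetrs()`. The sketch below reproduces that arithmetic. The example counts are the standard ImageNet-2012 split sizes and are an assumption here, since the constants are not shown in this diff.

```python
import math

# Assumed ImageNet-2012 split sizes (not shown in the diff).
IMAGENET_TRAIN_EXAMPLES = 1281167
IMAGENET_VAL_EXAMPLES = 50000

train_batch_size = 4096
eval_batch_size = 4096

steps_per_epoch = IMAGENET_TRAIN_EXAMPLES // train_batch_size

# 350-epoch ResNet-RS schedule (new value in this change).
train_steps = 350 * steps_per_epoch
# 200-epoch schedule used by the older ResNet configs.
old_train_steps = 200 * steps_per_epoch
# Five warmup epochs, matching `warmup_steps: 1560`.
warmup_steps = 5 * steps_per_epoch
# Rounding up covers the partial final validation batch.
validation_steps = math.ceil(IMAGENET_VAL_EXAMPLES / eval_batch_size)

if __name__ == "__main__":
    print(steps_per_epoch, train_steps, old_train_steps,
          warmup_steps, validation_steps)
```

With a batch size of 4096 this gives 312 steps per epoch, so `train_steps` and `decay_steps` are 109200, the earlier 200-epoch configs used 62400, and the YAML value `validation_steps: 13` matches rounding up rather than the floor division used in the Python config.
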
--- a/official/vision/beta/dataloaders/classification_input.py
+++ b/official/vision/beta/dataloaders/classification_input.py
@@ -49,6 +49,7 @@ class Parser(parser.Parser):
               num_classes: float,
               aug_rand_hflip: bool = True,
               aug_policy: Optional[str] = None,
+               randaug_magnitude: Optional[int] = 10,
               dtype: str = 'float32'):
    """Initializes parameters for parsing annotations in the dataset.

@@ -59,6 +60,7 @@ class Parser(parser.Parser):
      aug_rand_hflip: `bool`, if True, augment training with random
        horizontal flip.
      aug_policy: `str`, augmentation policies. None, 'autoaug', or 'randaug'.
+      randaug_magnitude: `int`, magnitude of the randaugment policy.
      dtype: `str`, cast output image in dtype. It can be 'float32', 'float16',
        or 'bfloat16'.
    """
@@ -77,7 +79,8 @@ class Parser(parser.Parser):
      if aug_policy == 'autoaug':
        self._augmenter = augment.AutoAugment()
      elif aug_policy == 'randaug':
-        self._augmenter = augment.RandAugment(num_layers=2, magnitude=20)
+        self._augmenter = augment.RandAugment(
+            num_layers=2, magnitude=randaug_magnitude)
      else:
        raise ValueError(
            'Augmentation policy {} not supported.'.format(aug_policy))

--- a/official/vision/beta/tasks/image_classification.py
+++ b/official/vision/beta/tasks/image_classification.py
@@ -93,6 +93,7 @@ class ImageClassificationTask(base_task.Task):
        output_size=input_size[:2],
        num_classes=num_classes,
        aug_policy=params.aug_policy,
+        randaug_magnitude=params.randaug_magnitude,
        dtype=params.dtype)

    reader = input_reader.InputReader(
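
The three Python changes above thread a new `randaug_magnitude` field from the data config through the task into the parser, replacing the previously hard-coded `magnitude=20`. The sketch below mirrors that flow with stand-in classes so it runs without TensorFlow. `StubRandAugment`, `StubAutoAugment` and `build_augmenter` are hypothetical names for illustration, not part of the repository.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class DataConfig:
    """Trimmed stand-in for the config fields touched by this change."""
    aug_policy: Optional[str] = None  # None, 'autoaug', or 'randaug'
    randaug_magnitude: Optional[int] = 10


@dataclass
class StubRandAugment:
    """Stand-in for the real RandAugment augmenter."""
    num_layers: int
    magnitude: Optional[int]


@dataclass
class StubAutoAugment:
    """Stand-in for the real AutoAugment augmenter."""


def build_augmenter(
    params: DataConfig) -> Union[StubRandAugment, StubAutoAugment, None]:
    """Selects an augmenter the way the parser does after this change."""
    if params.aug_policy is None:
        return None
    if params.aug_policy == 'autoaug':
        return StubAutoAugment()
    if params.aug_policy == 'randaug':
        # The magnitude now comes from the config instead of a constant.
        return StubRandAugment(
            num_layers=2, magnitude=params.randaug_magnitude)
    raise ValueError(
        'Augmentation policy {} not supported.'.format(params.aug_policy))


if __name__ == "__main__":
    print(build_augmenter(DataConfig('randaug', randaug_magnitude=15)))
```

A config that sets `aug_policy: 'randaug'` without a magnitude falls back to the default of 10, while the larger ResNet-RS configs override it to 15.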