Unverified commit 3a572f95, authored by Tingquan Gao, committed by GitHub

Add ViT and DeiT (#579)

* Update the description about ViT and DeiT

* Fix the error file

* Unified data format
Parent 1c81cc7d
@@ -7,6 +7,7 @@
PaddleClas is a toolset for image classification tasks prepared for the industry and academia. It helps users train better computer vision models and apply them in real scenarios.
**Recent update**
- 2021.01.27 Add `ViT` and `DeiT` pretrained models. On the ImageNet-1k dataset, `ViT` reaches a Top-1 Acc of 81.05% and `DeiT` reaches 85.5%.
- 2021.01.08 Add support for the whl package and its usage. Model inference can be done by simply installing paddleclas with pip.
- 2020.12.16 Add support for TensorRT when using cpp inference to obtain more obvious acceleration.
- 2020.12.06 Add `SE_HRNet_W64_C_ssld` pretrained model, whose Top-1 Acc on ImageNet-1k dataset reaches 84.75%.
@@ -66,6 +67,7 @@ PaddleClas is a toolset for image classification tasks prepared for the industry
- [Inception series](#Inception_series)
- [EfficientNet and ResNeXt101_wsl series](#EfficientNet_and_ResNeXt101_wsl_series)
- [ResNeSt and RegNet series](#ResNeSt_and_RegNet_series)
- [Transformer series](#Transformer)
- [Others](#Others)
- HS-ResNet: arxiv link: [https://arxiv.org/pdf/2010.07621.pdf](https://arxiv.org/pdf/2010.07621.pdf). Code and models are coming soon!
- Model training/evaluation
@@ -351,6 +353,37 @@ Accuracy and inference time metrics of ResNeSt and RegNet series models are show
| RegNetX_4GF | 0.785 | 0.9416 | 6.46478 | 11.19862 | 8 | 22.1 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/RegNetX_4GF_pretrained.pdparams) |
<a name="Transformer"></a>
### Transformer series
Accuracy and inference time metrics of ViT and DeiT series models are shown as follows. For more details, refer to the [Transformer series tutorial](./docs/en/models/Transformer.md).
| Model | Top-1 Acc | Top-5 Acc | time(ms)<br>bs=1 | time(ms)<br>bs=4 | Flops(G) | Params(M) | Download Address |
|------------------------|-----------|-----------|------------------|------------------|----------|------------------------|------------------------|
| ViT_small_<br/>patch16_224 | 0.7727 | 0.9319 | - | - | | | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/ViT_small_patch16_224_pretrained.pdparams) |
| ViT_base_<br/>patch16_224 | 0.8176 | 0.9613 | - | - | | 86 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/ViT_base_patch16_224_pretrained.pdparams) |
| ViT_base_<br/>patch16_384 | 0.8393 | 0.9710 | - | - | | | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/ViT_base_patch16_384_pretrained.pdparams) |
| ViT_base_<br/>patch32_384 | 0.8124 | 0.9598 | - | - | | | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/ViT_base_patch32_384_pretrained.pdparams) |
| ViT_large_<br/>patch16_224 | 0.8325 | 0.9658 | - | - | | 307 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/ViT_large_patch16_224_pretrained.pdparams) |
| ViT_large_<br/>patch16_384 | 0.8507 | 0.9741 | - | - | | | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/ViT_large_patch16_384_pretrained.pdparams) |
| ViT_large_<br/>patch32_384 | 0.8105 | 0.9596 | - | - | | | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/ViT_large_patch32_384_pretrained.pdparams) |
| | | | | | | | |
| Model | Top-1 Acc | Top-5 Acc | time(ms)<br>bs=1 | time(ms)<br>bs=4 | Flops(G) | Params(M) | Download Address |
|------------------------|-----------|-----------|------------------|------------------|----------|------------------------|------------------------|
| DeiT_tiny_<br>patch16_224 | 0.709 | 0.906 | - | - | | 5 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/DeiT_tiny_patch16_224_pretrained.pdparams) |
| DeiT_small_<br>patch16_224 | 0.794 | 0.948 | - | - | | 22 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/DeiT_small_patch16_224_pretrained.pdparams) |
| DeiT_base_<br>patch16_224 | 0.816 | 0.955 | - | - | | 86 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/DeiT_base_patch16_224_pretrained.pdparams) |
| DeiT_base_<br>patch16_384 | 0.831 | 0.962 | - | - | | 87 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/DeiT_base_patch16_384_pretrained.pdparams) |
| DeiT_tiny_<br>distilled_patch16_224 | 0.736 | 0.915 | - | - | | 6 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/DeiT_tiny_distilled_patch16_224_pretrained.pdparams) |
| DeiT_small_<br>distilled_patch16_224 | 0.810 | 0.953 | - | - | | 22 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/DeiT_small_distilled_patch16_224_pretrained.pdparams) |
| DeiT_base_<br>distilled_patch16_224 | 0.830 | 0.963 | - | - | | 87 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/DeiT_base_distilled_patch16_224_pretrained.pdparams) |
| DeiT_base_<br>distilled_patch16_384 | 0.855 | 0.974 | - | - | | 88 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/DeiT_base_distilled_patch16_384_pretrained.pdparams) |
| | | | | | | | |
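The download links in the two tables above share one visible pattern: the model name followed by `_pretrained.pdparams` under the same `dygraph/` prefix. A minimal sketch that rebuilds a link from a model name follows. The helper name `pretrained_url` is hypothetical and not part of PaddleClas, and the pattern is only inferred from the rows listed here.

```python
# Prefix copied from the download links in the tables above.
BASE_URL = "https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/"


def pretrained_url(model_name: str) -> str:
    """Hypothetical helper: rebuild a download link from a model name."""
    # The tables break long names with <br> tags, so strip them first.
    clean = model_name.replace("<br/>", "").replace("<br>", "").strip()
    return f"{BASE_URL}{clean}_pretrained.pdparams"


# Example with a name copied from the ViT table, including its <br/> tag.
print(pretrained_url("ViT_base_<br/>patch16_224"))
```

Whether a given file actually exists at the resulting address still has to be checked against the table itself.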
<a name="Others"></a>
### Others
......
@@ -8,6 +8,7 @@
**Recent updates**
- 2021.01.27 Add `ViT` and `DeiT` models. On ImageNet-1k, the `ViT` model reaches a Top-1 Acc of 81.05% and the `DeiT` model reaches 85.5%.
- 2021.01.08 Add the whl package and its usage instructions. Model inference can be done quickly by installing the paddleclas whl package directly.
- 2020.12.16 Add TensorRT support for cpp inference, giving a more noticeable inference speedup.
- 2020.12.06 Add the `SE_HRNet_W64_C_ssld` model, which reaches a Top-1 Acc of 84.75% on ImageNet-1k.
@@ -66,6 +67,7 @@
- [Inception series](#Inception系列)
- [EfficientNet and ResNeXt101_wsl series](#EfficientNet与ResNeXt101_wsl系列)
- [ResNeSt and RegNet series](#ResNeSt与RegNet系列)
- [Transformer series](#Transformer系列)
- [Other models](#其他模型)
- HS-ResNet: arxiv link: [https://arxiv.org/pdf/2010.07621.pdf](https://arxiv.org/pdf/2010.07621.pdf). Code and pretrained models will be open-sourced soon, stay tuned.
- Model training/evaluation
@@ -296,7 +298,7 @@ Accuracy and speed metrics of HRNet series models are shown in the table below. More about this series
| HRNet_W48_C | 0.7895 | 0.9442 | 13.70761 | 34.43572 | 34.58 | 77.47 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/HRNet_W48_C_pretrained.pdparams) |
| HRNet_W48_C_ssld | 0.8363 | 0.9682 | 13.70761 | 34.43572 | 34.58 | 77.47 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/HRNet_W48_C_ssld_pretrained.pdparams) |
| HRNet_W64_C | 0.7930 | 0.9461 | 17.57527 | 47.9533 | 57.83 | 128.06 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/HRNet_W64_C_pretrained.pdparams) |
| SE_HRNet_W64_C_ssld | 0.8475 | 0.9726 | 31.69770 | 94.99546 | 57.83 | 128.97 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/SE_HRNet_W64_C_ssld_pretrained.pdparams) | | SE_HRNet_W64_C_ssld | 0.8475 | 0.9726 | 31.69770 | 94.99546 | 57.83 | 128.97 | [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/SE_HRNet_W64_C_ssld_pretrained.pdparams) |
<a name="Inception系列"></a>
@@ -352,6 +354,38 @@ Accuracy and speed metrics of ResNeSt and RegNet series models are shown in the table below. More
| ResNeSt50 | 0.8083 | 0.9542 | 6.69042 | 8.01664 | 10.78 | 27.5 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/ResNeSt50_pretrained.pdparams) |
| RegNetX_4GF | 0.785 | 0.9416 | 6.46478 | 11.19862 | 8 | 22.1 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/RegNetX_4GF_pretrained.pdparams) |
<a name="Transformer系列"></a>
### Transformer series
Accuracy and speed metrics of the ViT (Vision Transformer) and DeiT (Data-efficient Image Transformers) series models are shown in the tables below. For more about these models, see the [Transformer series model documentation](./docs/zh_CN/models/Transformer.md).
| Model | Top-1 Acc | Top-5 Acc | time(ms)<br>bs=1 | time(ms)<br>bs=4 | Flops(G) | Params(M) | Download Address |
|------------------------|-----------|-----------|------------------|------------------|----------|------------------------|------------------------|
| ViT_small_<br/>patch16_224 | 0.7727 | 0.9319 | - | - | | | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/ViT_small_patch16_224_pretrained.pdparams) |
| ViT_base_<br/>patch16_224 | 0.8176 | 0.9613 | - | - | | 86 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/ViT_base_patch16_224_pretrained.pdparams) |
| ViT_base_<br/>patch16_384 | 0.8393 | 0.9710 | - | - | | | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/ViT_base_patch16_384_pretrained.pdparams) |
| ViT_base_<br/>patch32_384 | 0.8124 | 0.9598 | - | - | | | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/ViT_base_patch32_384_pretrained.pdparams) |
| ViT_large_<br/>patch16_224 | 0.8325 | 0.9658 | - | - | | 307 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/ViT_large_patch16_224_pretrained.pdparams) |
| ViT_large_<br/>patch16_384 | 0.8507 | 0.9741 | - | - | | | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/ViT_large_patch16_384_pretrained.pdparams) |
| ViT_large_<br/>patch32_384 | 0.8105 | 0.9596 | - | - | | | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/ViT_large_patch32_384_pretrained.pdparams) |
| | | | | | | | |
| Model | Top-1 Acc | Top-5 Acc | time(ms)<br>bs=1 | time(ms)<br>bs=4 | Flops(G) | Params(M) | Download Address |
|------------------------|-----------|-----------|------------------|------------------|----------|------------------------|------------------------|
| DeiT_tiny_<br>patch16_224 | 0.709 | 0.906 | - | - | | 5 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/DeiT_tiny_patch16_224_pretrained.pdparams) |
| DeiT_small_<br>patch16_224 | 0.794 | 0.948 | - | - | | 22 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/DeiT_small_patch16_224_pretrained.pdparams) |
| DeiT_base_<br>patch16_224 | 0.816 | 0.955 | - | - | | 86 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/DeiT_base_patch16_224_pretrained.pdparams) |
| DeiT_base_<br>patch16_384 | 0.831 | 0.962 | - | - | | 87 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/DeiT_base_patch16_384_pretrained.pdparams) |
| DeiT_tiny_<br>distilled_patch16_224 | 0.736 | 0.915 | - | - | | 6 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/DeiT_tiny_distilled_patch16_224_pretrained.pdparams) |
| DeiT_small_<br>distilled_patch16_224 | 0.810 | 0.953 | - | - | | 22 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/DeiT_small_distilled_patch16_224_pretrained.pdparams) |
| DeiT_base_<br>distilled_patch16_224 | 0.830 | 0.963 | - | - | | 87 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/DeiT_base_distilled_patch16_224_pretrained.pdparams) |
| DeiT_base_<br>distilled_patch16_384 | 0.855 | 0.974 | - | - | | 88 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/DeiT_base_distilled_patch16_384_pretrained.pdparams) |
| | | | | | | | |
<a name="其他模型"></a>
### Other models
......
mode: 'train'
ARCHITECTURE:
    name: 'ViT_small_patch16_224'
pretrained_model: ""
model_save_dir: "./output/"
classes_num: 1000
total_images: 1281167
save_interval: 1
validate: True
valid_interval: 1
epochs: 120
topk: 5
image_shape: [3, 224, 224]
use_mix: False
ls_epsilon: -1
LEARNING_RATE:
    function: 'Cosine'
    params:
        lr: 0.01
OPTIMIZER:
    function: 'Momentum'
    params:
        momentum: 0.9
    regularizer:
        function: 'L2'
        factor: 0.000100
TRAIN:
    batch_size: 64
    num_workers: 4
    file_list: "./dataset/ILSVRC2012/train_list.txt"
    data_dir: "./dataset/ILSVRC2012/"
    shuffle_seed: 0
    transforms:
        - DecodeImage:
            to_rgb: True
            to_np: False
            channel_first: False
        - RandCropImage:
            size: 224
        - RandFlipImage:
            flip_code: 1
        - NormalizeImage:
            scale: 1./255.
            mean: [0.485, 0.456, 0.406]
            std: [0.229, 0.224, 0.225]
            order: ''
        - ToCHWImage:
VALID:
    batch_size: 64
    num_workers: 4
    file_list: "./dataset/ILSVRC2012/val_list.txt"
    data_dir: "./dataset/ILSVRC2012/"
    shuffle_seed: 0
    transforms:
        - DecodeImage:
            to_rgb: True
            to_np: False
            channel_first: False
        - ResizeImage:
            size: 248
        - CropImage:
            size: 224
        - NormalizeImage:
            scale: 1.0/255.0
            mean: [0.485, 0.456, 0.406]
            std: [0.229, 0.224, 0.225]
            order: ''
        - ToCHWImage:
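The `LEARNING_RATE` section of the config above selects a `Cosine` schedule with `lr: 0.01`, and the run lasts `epochs: 120`. As a rough illustration of what such a schedule does, the sketch below evaluates the textbook cosine decay once per epoch. This is an assumption for illustration only: the config does not say whether the framework updates the rate per step or per epoch, and `cosine_lr` is a hypothetical name, not a PaddleClas function.

```python
import math

BASE_LR = 0.01  # LEARNING_RATE.params.lr in the config above
EPOCHS = 120    # epochs in the config above


def cosine_lr(epoch: int, base_lr: float = BASE_LR, total: int = EPOCHS) -> float:
    """Textbook cosine decay, evaluated per epoch (an assumption)."""
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * epoch / total))


# The rate starts at base_lr, halves at the midpoint, and ends near zero.
for epoch in (0, 30, 60, 90, 120):
    print(f"epoch {epoch:3d}: lr = {cosine_lr(epoch):.6f}")
```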
# ViT and DeiT series
## Overview
ViT (Vision Transformer) series models were proposed by Google in 2020. These models use only the standard transformer structure and abandon convolution entirely: the image is split into multiple patches, which are then fed into the transformer. The results show the potential of transformers in the CV field. [Paper](https://arxiv.org/abs/2010.11929)
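To make the patch splitting concrete, the sketch below counts how many patches a square image yields and how long each flattened RGB patch is. It assumes the model name suffix `patchP_S` encodes the patch size `P` and the input resolution `S`, which matches the names in this document but is not stated explicitly in it. `num_patches` and `patch_dim` are hypothetical helper names.

```python
def num_patches(image_size: int, patch_size: int) -> int:
    """Number of non-overlapping square patches in a square image."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    per_side = image_size // patch_size
    return per_side * per_side


def patch_dim(patch_size: int, channels: int = 3) -> int:
    """Length of one flattened patch before the linear projection."""
    return patch_size * patch_size * channels


# (input resolution, patch size) pairs taken from the model names.
for name, (size, patch) in {
    "patch16_224": (224, 16),
    "patch16_384": (384, 16),
    "patch32_384": (384, 32),
}.items():
    print(name, num_patches(size, patch), patch_dim(patch))
```

Under this reading, a smaller patch or a larger input gives the transformer a longer token sequence to process.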
DeiT (Data-efficient Image Transformers) series models were proposed by Facebook at the end of 2020. To address the problem that ViT models need training on large-scale datasets, DeiT improves on them and achieves 83.1% Top1 accuracy on ImageNet. More importantly, with a convolutional model as the teacher and knowledge distillation applied to these models, a Top1 accuracy of 85.2% can be achieved on the ImageNet dataset. [Paper](https://arxiv.org/abs/2012.12877)
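The text here only says that a convolutional teacher is used for knowledge distillation. As one possible reading, the sketch below follows the hard-label distillation variant described in the DeiT paper: the class head is trained against the true label, the distillation head against the teacher's predicted label, and the two cross-entropy terms are averaged. All function names are hypothetical, and plain Python lists stand in for tensors.

```python
import math


def cross_entropy(logits, target):
    """Cross-entropy of one sample, using a stable log-sum-exp."""
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_sum - logits[target]


def hard_distillation_loss(cls_logits, dist_logits, teacher_logits, label):
    """Average of the true-label loss and the teacher-label loss."""
    # The teacher's hard decision is the index of its largest logit.
    teacher_label = max(range(len(teacher_logits)), key=teacher_logits.__getitem__)
    return 0.5 * cross_entropy(cls_logits, label) + 0.5 * cross_entropy(
        dist_logits, teacher_label
    )


# Toy example with three classes; the teacher disagrees with the label.
loss = hard_distillation_loss(
    cls_logits=[2.0, 0.5, 0.1],
    dist_logits=[0.3, 1.8, 0.2],
    teacher_logits=[0.1, 3.0, 0.4],
    label=0,
)
print(f"{loss:.4f}")
```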
## Accuracy, FLOPS and Parameters
| Models | Top1 | Top5 | Reference<br>top1 | Reference<br>top5 | FLOPS<br>(G) |
|:--:|:--:|:--:|:--:|:--:|:--:|
| ViT_small_patch16_224 | 0.7727 | 0.9319 | 0.7785 | 0.9342 | |
| ViT_base_patch16_224 | 0.8176 | 0.9613 | 0.8178 | 0.9613 | |
| ViT_base_patch16_384 | 0.8393 | 0.9710 | 0.8420 | 0.9722 | |
| ViT_base_patch32_384 | 0.8124 | 0.9598 | 0.8166 | 0.9613 | |
| ViT_large_patch16_224 | 0.8325 | 0.9658 | 0.8306 | 0.9644 | |
| ViT_large_patch16_384 | 0.8507 | 0.9741 | 0.8517 | 0.9736 | |
| ViT_large_patch32_384 | 0.8105 | 0.9596 | 0.815 | - | |
| | | | | | |
| Models | Top1 | Top5 | Reference<br>top1 | Reference<br>top5 | FLOPS<br>(G) |
|:--:|:--:|:--:|:--:|:--:|:--:|
| DeiT_tiny_patch16_224 | 0.709 | 0.906 | 0.722 | 0.911 | |
| DeiT_small_patch16_224 | 0.794 | 0.948 | 0.799 | 0.950 | |
| DeiT_base_patch16_224 | 0.816 | 0.955 | 0.818 | 0.956 | |
| DeiT_base_patch16_384 | 0.831 | 0.962 | 0.829 | 0.972 | |
| DeiT_tiny_distilled_patch16_224 | 0.736 | 0.915 | 0.745 | 0.919 | |
| DeiT_small_distilled_patch16_224 | 0.810 | 0.953 | 0.812 | 0.954 | |
| DeiT_base_distilled_patch16_224 | 0.830 | 0.963 | 0.834 | 0.965 | |
| DeiT_base_distilled_patch16_384 | 0.855 | 0.974 | 0.852 | 0.972 | |
| | | | | | |
Params, FLOPs, inference speed and other information are coming soon.
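The tables above list each converted model's Top1 next to the reference Top1. A small sketch of that comparison, in percentage points, is shown below. The three rows are copied from the tables, and `top1_gap_points` is a hypothetical helper name.

```python
# (Top1, Reference top1) pairs copied from the tables above.
RESULTS = {
    "ViT_small_patch16_224": (0.7727, 0.7785),
    "ViT_base_patch16_224": (0.8176, 0.8178),
    "DeiT_base_distilled_patch16_384": (0.855, 0.852),
}


def top1_gap_points(ours: float, reference: float) -> float:
    """Difference from the reference in percentage points."""
    return round((ours - reference) * 100, 2)


# A negative value means the converted model is below the reference.
for name, (ours, reference) in RESULTS.items():
    print(f"{name}: {top1_gap_points(ours, reference):+.2f}")
```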
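# ViT and DeiT series
## Overview
ViT (Vision Transformer) series models were proposed by Google in 2020. The model uses only the standard Transformer structure and abandons convolution entirely: the image is split into multiple patches, which are then fed into the Transformer. This showed the potential of the Transformer in the CV field. [Paper](https://arxiv.org/abs/2010.11929)
DeiT (Data-efficient Image Transformers) series models were proposed by Facebook at the end of 2020. They address the problem that ViT models need training on large-scale datasets and achieve 83.1% Top1 accuracy on ImageNet. In addition, with a convolutional model as the teacher and knowledge distillation applied to the model, a Top1 accuracy of 85.2% can be reached on the ImageNet dataset. [Paper](https://arxiv.org/abs/2012.12877)
## Accuracy, FLOPS and Parameters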
| Models | Top1 | Top5 | Reference<br>top1 | Reference<br>top5 | FLOPS<br>(G) |
|:--:|:--:|:--:|:--:|:--:|:--:|
| ViT_small_patch16_224 | 0.7727 | 0.9319 | 0.7785 | 0.9342 | |
| ViT_base_patch16_224 | 0.8176 | 0.9613 | 0.8178 | 0.9613 | |
| ViT_base_patch16_384 | 0.8393 | 0.9710 | 0.8420 | 0.9722 | |
| ViT_base_patch32_384 | 0.8124 | 0.9598 | 0.8166 | 0.9613 | |
| ViT_large_patch16_224 | 0.8325 | 0.9658 | 0.8306 | 0.9644 | |
| ViT_large_patch16_384 | 0.8507 | 0.9741 | 0.8517 | 0.9736 | |
| ViT_large_patch32_384 | 0.8105 | 0.9596 | 0.815 | - | |
| | | | | | |
| Models | Top1 | Top5 | Reference<br>top1 | Reference<br>top5 | FLOPS<br>(G) |
|:--:|:--:|:--:|:--:|:--:|:--:|
| DeiT_tiny_patch16_224 | 0.709 | 0.906 | 0.722 | 0.911 | |
| DeiT_small_patch16_224 | 0.794 | 0.948 | 0.799 | 0.950 | |
| DeiT_base_patch16_224 | 0.816 | 0.955 | 0.818 | 0.956 | |
| DeiT_base_patch16_384 | 0.831 | 0.962 | 0.829 | 0.972 | |
| DeiT_tiny_distilled_patch16_224 | 0.736 | 0.915 | 0.745 | 0.919 | |
| DeiT_small_distilled_patch16_224 | 0.810 | 0.953 | 0.812 | 0.954 | |
| DeiT_base_distilled_patch16_224 | 0.830 | 0.963 | 0.834 | 0.965 | |
| DeiT_base_distilled_patch16_384 | 0.855 | 0.974 | 0.852 | 0.972 | |
| | | | | | |
Params, FLOPs, inference speed and other information are coming soon.