# [A ConvNet for the 2020s](

Official PyTorch implementation of **ConvNeXt**, from the following paper:

[A ConvNet for the 2020s]( arXiv 2022.\
[Zhuang Liu](, [Hanzi Mao](, [Chao-Yuan Wu](, [Christoph Feichtenhofer](, [Trevor Darrell]( and [Saining Xie](\
Facebook AI Research, UC Berkeley


<p align="center">
<img src="" width=100% height=100%>
</p>

We propose **ConvNeXt**, a pure ConvNet model constructed entirely from standard ConvNet modules. ConvNeXt is accurate, efficient, scalable and very simple in design.

## Catalog
- [x] ImageNet-1K Training Code  
- [x] ImageNet-22K Pre-training Code  
- [x] ImageNet-1K Fine-tuning Code  
- [x] Downstream Transfer (Detection, Segmentation) Code
<!-- ✅ ⬜️  -->

## Results and Pre-trained Models
### ImageNet-1K trained models

| name | resolution |acc@1 | #params | FLOPs | model |
|:---:|:---:|:---:|:---:| :---:|:---:|
| ConvNeXt-T | 224x224 | 82.1 | 28M | 4.5G | [model]( |
| ConvNeXt-S | 224x224 | 83.1 | 50M | 8.7G | [model]( |
| ConvNeXt-B | 224x224 | 83.8 | 89M | 15.4G | [model]( |
| ConvNeXt-B | 384x384 | 85.1 | 89M | 45.0G | [model]( |
| ConvNeXt-L | 224x224 | 84.3 | 198M | 34.4G | [model]( |
| ConvNeXt-L | 384x384 | 85.5 | 198M | 101.0G | [model]( |
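
The FLOPs in the 384x384 rows can be sanity-checked against the 224x224 rows. The sketch below assumes that the cost of a fully convolutional network grows in proportion to the number of input pixels; this is an approximation, not the counting method used to produce the table, and `scale_flops` is a hypothetical helper name.

```python
def scale_flops(flops_g: float, base_res: int, new_res: int) -> float:
    """Estimate GFLOPs at new_res from the GFLOPs measured at base_res.

    Assumes cost is proportional to pixel count (height * width).
    """
    return flops_g * (new_res / base_res) ** 2


# ConvNeXt-B and ConvNeXt-L values at 224x224, taken from the table above.
for name, flops_224 in [("ConvNeXt-B", 15.4), ("ConvNeXt-L", 34.4)]:
    print(f"{name}: ~{scale_flops(flops_224, 224, 384):.1f} GFLOPs at 384x384")
```

Under this assumption the estimates land within about 1% of the 45.0G and 101.0G listed for the 384x384 models.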

### ImageNet-22K trained models

| name | resolution |acc@1 | #params | FLOPs | 22k model | 1k model |
|:---:|:---:|:---:|:---:| :---:| :---:|:---:|
| ConvNeXt-B | 224x224 | 85.8 | 89M | 15.4G | [model](   | [model](
| ConvNeXt-B | 384x384 | 86.8 | 89M | 45.0G |     -          | [model](
| ConvNeXt-L | 224x224 | 86.6 | 198M | 34.4G | [model](  | [model](
| ConvNeXt-L | 384x384 | 87.5 | 198M | 101.0G |    -         | [model](
| ConvNeXt-XL | 224x224 | 87.0 | 350M | 60.9G | [model]( | [model](
| ConvNeXt-XL | 384x384 | 87.8 | 350M | 179.0G |  -          | [model](

### ImageNet-1K trained models (isotropic)
| name | resolution |acc@1 | #params | FLOPs | model |
|:---:|:---:|:---:|:---:| :---:|:---:|
| ConvNeXt-S | 224x224 | 78.7 | 22M | 4.3G | [model]( |
| ConvNeXt-B | 224x224 | 82.0 | 87M | 16.9G | [model]( |
| ConvNeXt-L | 224x224 | 82.6 | 306M | 59.7G | [model]( |

## Installation
Please check []( for installation instructions. 

## Evaluation
We give an example evaluation command for an ImageNet-22K pre-trained, then ImageNet-1K fine-tuned ConvNeXt-B:

Single-GPU
```
python --model convnext_base --eval true \
--resume \
--input_size 224 --drop_path 0.2 \
--data_path /path/to/imagenet-1k
```

Multi-GPU
```
python -m torch.distributed.launch --nproc_per_node=8 \
--model convnext_base --eval true \
--resume \
--input_size 224 --drop_path 0.2 \
--data_path /path/to/imagenet-1k
```

This should give 
* Acc@1 85.820 Acc@5 97.868 loss 0.563
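
Acc@1 and Acc@5 are top-1 and top-5 accuracy in percent: the share of validation images whose true label is among the 1 or 5 highest-scoring classes. The following is a minimal pure-Python sketch of the metric for illustration, not the repository's evaluation code; `topk_accuracy` and the toy scores are hypothetical.

```python
from typing import Sequence


def topk_accuracy(scores: Sequence[Sequence[float]],
                  labels: Sequence[int], k: int) -> float:
    """Percentage of samples whose true label is among the k highest scores."""
    hits = 0
    for row, label in zip(scores, labels):
        # Indices of the k largest scores for this sample.
        topk = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in topk
    return 100.0 * hits / len(labels)


# Toy example: 3 samples, 4 classes (made-up scores, not model outputs).
scores = [
    [0.1, 0.7, 0.1, 0.1],  # highest score: class 1
    [0.4, 0.3, 0.2, 0.1],  # highest score: class 0, class 1 is second
    [0.1, 0.2, 0.3, 0.4],  # highest score: class 3
]
labels = [1, 1, 0]
print(f"Acc@1 {topk_accuracy(scores, labels, k=1):.3f}")
print(f"Acc@2 {topk_accuracy(scores, labels, k=2):.3f}")
```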

- To evaluate other model variants, change `--model`, `--resume`, and `--input_size` accordingly. You can get the URLs of the pre-trained models from the tables above.
- Setting a model-specific `--drop_path` is not strictly required for evaluation, because the `DropPath` module in timm passes its input through unchanged in evaluation mode, whatever the rate; it is required for training. See []( or our paper for the values used for different models.
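
To illustrate why the `--drop_path` value does not affect evaluation, here is a minimal pure-Python sketch of stochastic depth on nested lists. It mirrors the general technique rather than timm's actual implementation; the `drop_path` function, its signature, and the list-based inputs are assumptions made for this example.

```python
import random
from typing import List, Optional


def drop_path(batch: List[List[float]], drop_prob: float, training: bool,
              rng: Optional[random.Random] = None) -> List[List[float]]:
    """Stochastic depth applied per sample to a residual-branch output."""
    # In evaluation mode (or with rate 0) the branch passes through unchanged.
    if not training or drop_prob == 0.0:
        return batch
    rng = rng or random.Random()
    keep_prob = 1.0 - drop_prob
    out = []
    for sample in batch:
        if rng.random() < keep_prob:
            # Kept: rescale so the expected value matches evaluation mode.
            out.append([v / keep_prob for v in sample])
        else:
            # Dropped: the whole branch output for this sample becomes zero.
            out.append([0.0 for _ in sample])
    return out


branch = [[1.0, 2.0], [3.0, 4.0]]
# Evaluation: the output is identical for any rate.
print(drop_path(branch, 0.2, training=False) == drop_path(branch, 0.8, training=False))
```

In training mode each sample's branch output is either zeroed or scaled by `1 / keep_prob`, so the rate matters there.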

## Training
See []( for training and fine-tuning instructions.

## Acknowledgement
This repository is built using the [timm]( library, [DeiT]( and [BEiT]( repositories.

## License
This project is released under the MIT license. Please see the [LICENSE](LICENSE) file for more information.

## Citation
If you find this repository helpful, please consider citing:
  author  = {Zhuang Liu and Hanzi Mao and Chao-Yuan Wu and Christoph Feichtenhofer and Trevor Darrell and Saining Xie},
  title   = {A ConvNet for the 2020s},
  journal = {arXiv preprint arXiv:2201.03545},
  year    = {2022},