diff --git a/modelcenter/PLSC-ViT/download_cn.md b/modelcenter/PLSC-ViT/download_cn.md
index 0ad84fc9438722f8f0c7fd294690957d14e6d0a2..c7353cd564139a8cc00dbbac37cf51b7b14dff6d 100644
--- a/modelcenter/PLSC-ViT/download_cn.md
+++ b/modelcenter/PLSC-ViT/download_cn.md
@@ -2,8 +2,7 @@
 
 |模型名称|模型简介|模型配置|预训练checkpoint下载地址|
 | --- | --- | --- | --- |
-| ViT-B_16_224 |输入size为224,layers=12|[config](./configs/ViT_base_patch16_224_in1k_1n8c_dp_fp16o2.yaml) |[download](https://plsc.bj.bcebos.com/models/vit/v2.4/imagenet2012-ViT-B_16-224.pdparams) |
-| ViT-B_16_384 |输入size为384,layers=12|[config](./configs/ViT_base_patch16_384_ft_in1k_1n8c_dp_fp16o2.yaml)| [download](https://plsc.bj.bcebos.com/models/vit/v2.4/imagenet2012-ViT-B_16-224.pdparams) |
-| ViT-L_16_224 |输入size为224,layers=24|[config](./configs/ViT_large_patch16_224_in21k_4n32c_dp_fp16o2.yaml)| [download](https://plsc.bj.bcebos.com/models/vit/v2.4/imagenet21k-ViT-L_16-224.pdparams) |
-| ViT-L_16_384 |输入size为384,layers=32|[config](./configs/ViT_large_patch16_384_in1k_ft_4n32c_dp_fp16o2.yaml) | [download](https://plsc.bj.bcebos.com/models/vit/v2.4/imagenet21k-ViT-L_16-224.pdparams) |
-
+| ViT-B_16_224 |输入size为224,layers=12|[config](https://github.com/PaddlePaddle/PLSC/blob/release/2.4/task/classification/vit/configs/ViT_base_patch16_224_in1k_1n8c_dp_fp16o2.yaml) |[download](https://plsc.bj.bcebos.com/models/vit/v2.4/imagenet2012-ViT-B_16-224.pdparams) |
+| ViT-B_16_384 |输入size为384,layers=12|[config](https://github.com/PaddlePaddle/PLSC/blob/release/2.4/task/classification/vit/configs/ViT_base_patch16_384_ft_in1k_1n8c_dp_fp16o2.yaml)| [download](https://plsc.bj.bcebos.com/models/vit/v2.4/imagenet2012-ViT-B_16-224.pdparams) |
+| ViT-L_16_224 |输入size为224,layers=24|[config](https://github.com/PaddlePaddle/PLSC/blob/release/2.4/task/classification/vit/configs/ViT_large_patch16_224_in21k_4n32c_dp_fp16o2.yaml)| [download](https://plsc.bj.bcebos.com/models/vit/v2.4/imagenet21k-ViT-L_16-224.pdparams) |
+| ViT-L_16_384 |输入size为384,layers=32|[config](https://github.com/PaddlePaddle/PLSC/blob/release/2.4/task/classification/vit/configs/ViT_large_patch16_384_in1k_ft_4n32c_dp_fp16o2.yaml) | [download](https://plsc.bj.bcebos.com/models/vit/v2.4/imagenet21k-ViT-L_16-224.pdparams) |
diff --git a/modelcenter/PLSC-ViT/download_en.md b/modelcenter/PLSC-ViT/download_en.md
index cab7db7c9fd6e7f2fd7c114ed55fe088e66c9eb8..1bdbc40e95f0f7f95230e1f416ff9f8dd97b6758 100644
--- a/modelcenter/PLSC-ViT/download_en.md
+++ b/modelcenter/PLSC-ViT/download_en.md
@@ -2,8 +2,7 @@
 
 |Model Name|Introduction|Config|Pretrained checkpoint Download|
 | --- | --- | --- | --- |
-| ViT-B_16_224 |input_size=224,layers=12|[config](./configs/ViT_base_patch16_224_in1k_1n8c_dp_fp16o2.yaml) |[download](https://plsc.bj.bcebos.com/models/vit/v2.4/imagenet2012-ViT-B_16-224.pdparams) |
-| ViT-B_16_384 |input_size=384,layers=12|[config](./configs/ViT_base_patch16_384_ft_in1k_1n8c_dp_fp16o2.yaml)| [download](https://plsc.bj.bcebos.com/models/vit/v2.4/imagenet2012-ViT-B_16-224.pdparams) |
-| ViT-L_16_224 |input_size=224,layers=24|[config](./configs/ViT_large_patch16_224_in21k_4n32c_dp_fp16o2.yaml)| [download](https://plsc.bj.bcebos.com/models/vit/v2.4/imagenet21k-ViT-L_16-224.pdparams) |
-| ViT-L_16_384 |input_size=384,layers=32|[config](./configs/ViT_large_patch16_384_in1k_ft_4n32c_dp_fp16o2.yaml) | [download](https://plsc.bj.bcebos.com/models/vit/v2.4/imagenet21k-ViT-L_16-224.pdparams) |
-
+| ViT-B_16_224 |input_size=224,layers=12|[config](https://github.com/PaddlePaddle/PLSC/blob/release/2.4/task/classification/vit/configs/ViT_base_patch16_224_in1k_1n8c_dp_fp16o2.yaml) |[download](https://plsc.bj.bcebos.com/models/vit/v2.4/imagenet2012-ViT-B_16-224.pdparams) |
+| ViT-B_16_384 |input_size=384,layers=12|[config](https://github.com/PaddlePaddle/PLSC/blob/release/2.4/task/classification/vit/configs/ViT_base_patch16_384_ft_in1k_1n8c_dp_fp16o2.yaml)| [download](https://plsc.bj.bcebos.com/models/vit/v2.4/imagenet2012-ViT-B_16-224.pdparams) |
+| ViT-L_16_224 |input_size=224,layers=24|[config](https://github.com/PaddlePaddle/PLSC/blob/release/2.4/task/classification/vit/configs/ViT_large_patch16_224_in21k_4n32c_dp_fp16o2.yaml)| [download](https://plsc.bj.bcebos.com/models/vit/v2.4/imagenet21k-ViT-L_16-224.pdparams) |
+| ViT-L_16_384 |input_size=384,layers=32|[config](https://github.com/PaddlePaddle/PLSC/blob/release/2.4/task/classification/vit/configs/ViT_large_patch16_384_in1k_ft_4n32c_dp_fp16o2.yaml) | [download](https://plsc.bj.bcebos.com/models/vit/v2.4/imagenet21k-ViT-L_16-224.pdparams) |
diff --git a/modelcenter/PLSC-ViT/introduction_cn.ipynb b/modelcenter/PLSC-ViT/introduction_cn.ipynb
index b65fffaa751074d09461d140018f8850b6eff57d..58b99fba4a81cfa2a13941c8aa86ca5d319e8162 100644
--- a/modelcenter/PLSC-ViT/introduction_cn.ipynb
+++ b/modelcenter/PLSC-ViT/introduction_cn.ipynb
@@ -56,14 +56,16 @@
    ]
   },
   {
-   "cell_type": "raw",
-   "id": "8d291263",
+   "cell_type": "markdown",
+   "id": "492fa769-2fe0-4220-b6d9-bbc32f8cca10",
    "metadata": {},
    "source": [
+    "```\n",
     "git clone https://github.com/PaddlePaddle/PLSC.git\n",
     "cd /path/to/PLSC/\n",
     "# [optional] pip install -r requirements.txt\n",
-    "python setup.py develop"
+    "python setup.py develop\n",
+    "```"
    ]
   },
   {
@@ -79,15 +81,11 @@
    "id": "d68ca5fb",
    "metadata": {},
    "source": [
-    "1. 进入任务目录"
-   ]
-  },
-  {
-   "cell_type": "raw",
-   "id": "59cd341e",
-   "metadata": {},
-   "source": [
-    "cd task/classification/vit\n"
+    "1. 进入任务目录\n",
+    "\n",
+    "```\n",
+    "cd task/classification/vit\n",
+    "```"
    ]
   },
   {
@@ -97,20 +95,15 @@
    "source": [
     "2. 准备数据\n",
     "\n",
-    "将数据整理成以下格式:"
-   ]
-  },
-  {
-   "cell_type": "raw",
-   "id": "ee32eb85",
-   "metadata": {},
-   "source": [
+    "将数据整理成以下格式:\n",
+    "```text\n",
     "dataset/\n",
     "└── ILSVRC2012\n",
     " ├── train\n",
     " ├── val\n",
     " ├── train_list.txt\n",
-    " └── val_list.txt"
+    " └── val_list.txt\n",
+    "```"
    ]
   },
   {
@@ -118,15 +111,9 @@
    "id": "bea743ea",
    "metadata": {},
    "source": [
-    "3. 执行训练命令"
-   ]
-  },
-  {
-   "cell_type": "raw",
-   "id": "ec8f9946",
-   "metadata": {},
-   "source": [
+    "3. 执行训练命令\n",
     "\n",
+    "```shell\n",
     "export PADDLE_NNODES=1\n",
     "export PADDLE_MASTER=\"127.0.0.1:12538\"\n",
     "export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7\n",
@@ -136,15 +123,10 @@
     " --master=$PADDLE_MASTER \\\n",
     " --devices=$CUDA_VISIBLE_DEVICES \\\n",
     " plsc-train \\\n",
-    " -c ./configs/ViT_base_patch16_224_in1k_1n8c_dp_fp16o2.yaml"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "c4b48418",
-   "metadata": {},
-   "source": [
-    "更多模型的训练教程可参考文档:[ViT训练文档](https://github.com/PaddlePaddle/PLSC/blob/master/task/classification/vit/README.md)\n"
+    " -c ./configs/ViT_base_patch16_224_in1k_1n8c_dp_fp16o2.yaml\n",
+    "```\n",
+    "\n",
+    "更多模型的训练教程可参考文档:[ViT训练文档](https://github.com/PaddlePaddle/PLSC/blob/master/task/classification/vit/README.md)"
    ]
   },
   {
@@ -160,16 +142,12 @@
    "id": "e97c527c",
    "metadata": {},
    "source": [
-    "1. 下载预训练模型"
-   ]
-  },
-  {
-   "cell_type": "raw",
-   "id": "838afffb",
-   "metadata": {},
-   "source": [
+    "1. 下载预训练模型\n",
+    "\n",
+    "```shell\n",
     "mkdir -p pretrained/vit/ViT_base_patch16_224/\n",
-    "wget -O ./pretrained/vit/ViT_base_patch16_224/imagenet2012-ViT-B_16-224.pdparams https://plsc.bj.bcebos.com/models/vit/v2.4/imagenet2012-ViT-B_16-224.pdparams"
+    "wget -O ./pretrained/vit/ViT_base_patch16_224/imagenet2012-ViT-B_16-224.pdparams https://plsc.bj.bcebos.com/models/vit/v2.4/imagenet2012-ViT-B_16-224.pdparams\n",
+    "```"
    ]
   },
   {
@@ -177,24 +155,11 @@
    "id": "a07c6549",
    "metadata": {},
    "source": [
-    "2. 导出推理模型\n"
-   ]
-  },
-  {
-   "cell_type": "raw",
-   "id": "c69963e7",
-   "metadata": {},
-   "source": [
-    "plsc-export -c ./configs/ViT_base_patch16_224_in1k_1n8c_dp_fp16o2.yaml -o Global.pretrained_model=./pretrained/vit/ViT_base_patch16_224/imagenet2012-ViT-B_16-224 -o Model.data_format=NCHW -o FP16.level=O0"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "02a8f5d5",
-   "metadata": {},
-   "source": [
-    "## 4.注意事项\n",
-    "ImageNet21集没有官方划分的验证集,所以我们使用了所有图像作为训练集。我们为了验证程序运行的可行性构建了一个模拟的验证集。"
+    "2. 导出推理模型\n",
+    "\n",
+    "```shell\n",
+    "plsc-export -c ./configs/ViT_base_patch16_224_in1k_1n8c_dp_fp16o2.yaml -o Global.pretrained_model=./pretrained/vit/ViT_base_patch16_224/imagenet2012-ViT-B_16-224 -o Model.data_format=NCHW -o FP16.level=O0\n",
+    "```\n"
    ]
   },
   {
@@ -202,26 +167,28 @@
    "id": "d375934d",
    "metadata": {},
    "source": [
-    "## 5. 相关论文及引用信息\n"
+    "## 4. 相关论文及引用信息\n"
    ]
   },
   {
-   "cell_type": "raw",
-   "id": "ae86fc47",
+   "cell_type": "markdown",
+   "id": "29f05b07-d323-45e4-b00d-0728eafb5af7",
    "metadata": {},
    "source": [
+    "```text\n",
     "@article{dosovitskiy2020,\n",
     " title={An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale},\n",
     " author={Dosovitskiy, Alexey and Beyer, Lucas and Kolesnikov, Alexander and Weissenborn, Dirk and Zhai, Xiaohua and Unterthiner, Thomas and Dehghani, Mostafa and Minderer, Matthias and Heigold, Georg and Gelly, Sylvain and Uszkoreit, Jakob and Houlsby, Neil},\n",
     " journal={arXiv preprint arXiv:2010.11929},\n",
     " year={2020}\n",
-    "}"
+    "}\n",
+    "```"
    ]
   }
  ],
  "metadata": {
   "kernelspec": {
-   "display_name": "Python 3 (ipykernel)",
+   "display_name": "Python 3",
    "language": "python",
    "name": "python3"
   },
diff --git a/modelcenter/PLSC-ViT/introduction_en.ipynb b/modelcenter/PLSC-ViT/introduction_en.ipynb
index fe65e740b597624ab58974fa8045b00847468dd9..c41199ed8954c07010d26953eb82f31f80e31a03 100644
--- a/modelcenter/PLSC-ViT/introduction_en.ipynb
+++ b/modelcenter/PLSC-ViT/introduction_en.ipynb
@@ -52,18 +52,14 @@
    "id": "186a0c17",
    "metadata": {},
    "source": [
-    "### 3.1 Install PLSC"
-   ]
-  },
-  {
-   "cell_type": "raw",
-   "id": "84ebdf32",
-   "metadata": {},
-   "source": [
+    "### 3.1 Install PLSC\n",
+    "\n",
+    "```shell\n",
     "git clone https://github.com/PaddlePaddle/PLSC.git\n",
     "cd /path/to/PLSC/\n",
     "# [optional] pip install -r requirements.txt\n",
-    "python setup.py develop\n"
+    "python setup.py develop\n",
+    "```"
    ]
   },
   {
@@ -79,15 +75,11 @@
    "id": "a562bf23",
    "metadata": {},
    "source": [
-    "1. Enter into the task directory"
-   ]
-  },
-  {
-   "cell_type": "raw",
-   "id": "e3cc0f53",
-   "metadata": {},
-   "source": [
-    "cd task/classification/vit"
+    "1. Enter into the task directory\n",
+    "\n",
+    "```shell\n",
+    "cd task/classification/vit\n",
+    "```"
    ]
   },
   {
@@ -97,20 +89,16 @@
    "source": [
     "2. Prepare the data\n",
     "\n",
-    "Organize the data into the following format:"
-   ]
-  },
-  {
-   "cell_type": "raw",
-   "id": "a26cdd37",
-   "metadata": {},
-   "source": [
+    "Organize the data into the following format:\n",
+    "\n",
+    "```text\n",
     "dataset/\n",
     "└── ILSVRC2012\n",
     " ├── train\n",
     " ├── val\n",
     " ├── train_list.txt\n",
-    " └── val_list.txt"
+    " └── val_list.txt\n",
+    "```"
    ]
   },
   {
@@ -118,15 +106,9 @@
    "id": "ec78efdf",
    "metadata": {},
    "source": [
-    "3. Run the command"
-   ]
-  },
-  {
-   "cell_type": "raw",
-   "id": "5facf542",
-   "metadata": {},
-   "source": [
+    "3. Run the command\n",
     "\n",
+    "```shell\n",
     "export PADDLE_NNODES=1\n",
     "export PADDLE_MASTER=\"127.0.0.1:12538\"\n",
     "export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7\n",
@@ -136,14 +118,9 @@
     " --master=$PADDLE_MASTER \\\n",
     " --devices=$CUDA_VISIBLE_DEVICES \\\n",
     " plsc-train \\\n",
-    " -c ./configs/ViT_base_patch16_224_in1k_1n8c_dp_fp16o2.yaml"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "9efae4fb",
-   "metadata": {},
-   "source": [
+    " -c ./configs/ViT_base_patch16_224_in1k_1n8c_dp_fp16o2.yaml\n",
+    "```\n",
+    "\n",
     "More courses about model training can be learned here [ViT readme](https://github.com/PaddlePaddle/PLSC/blob/master/task/classification/vit/README.md)"
    ]
   },
@@ -160,16 +137,12 @@
    "id": "7a3ce1ab",
    "metadata": {},
    "source": [
-    "1. Download pretrained model"
-   ]
-  },
-  {
-   "cell_type": "raw",
-   "id": "2bcf0a3f",
-   "metadata": {},
-   "source": [
+    "1. Download pretrained model\n",
+    "\n",
+    "```shell\n",
     "mkdir -p pretrained/vit/ViT_base_patch16_224/\n",
-    "wget -O ./pretrained/vit/ViT_base_patch16_224/imagenet2012-ViT-B_16-224.pdparams https://plsc.bj.bcebos.com/models/vit/v2.4/imagenet2012-ViT-B_16-224.pdparams"
+    "wget -O ./pretrained/vit/ViT_base_patch16_224/imagenet2012-ViT-B_16-224.pdparams https://plsc.bj.bcebos.com/models/vit/v2.4/imagenet2012-ViT-B_16-224.pdparams\n",
+    "```"
    ]
   },
   {
@@ -177,24 +150,11 @@
    "id": "cff5ac83",
    "metadata": {},
    "source": [
-    "2. Export model for inference"
-   ]
-  },
-  {
-   "cell_type": "raw",
-   "id": "a52873b1",
-   "metadata": {},
-   "source": [
-    "plsc-export -c ./configs/ViT_base_patch16_224_in1k_1n8c_dp_fp16o2.yaml -o Global.pretrained_model=./pretrained/vit/ViT_base_patch16_224/imagenet2012-ViT-B_16-224 -o Model.data_format=NCHW -o FP16.level=O0"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "02a8f5d5",
-   "metadata": {},
-   "source": [
-    "## 4. Attention\n",
-    "- Since ImageNet21K does not have an officially divided verification set, we use all the images as the training set. We construct the dummy verification set not for parameter adjustment and evaluation, but for the convenience of observing whether the training is ok."
+    "2. Export model for inference\n",
+    "\n",
+    "```shell\n",
+    "plsc-export -c ./configs/ViT_base_patch16_224_in1k_1n8c_dp_fp16o2.yaml -o Global.pretrained_model=./pretrained/vit/ViT_base_patch16_224/imagenet2012-ViT-B_16-224 -o Model.data_format=NCHW -o FP16.level=O0\n",
+    "```"
    ]
   },
   {
@@ -202,27 +162,22 @@
    "id": "d375934d",
    "metadata": {},
    "source": [
-    "## 5. Related papers and citations\n",
-    "\n"
-   ]
-  },
-  {
-   "cell_type": "raw",
-   "id": "4fd64e43",
-   "metadata": {},
-   "source": [
+    "## 4. Related papers and citations\n",
+    "\n",
+    "```text\n",
     "@article{dosovitskiy2020,\n",
     " title={An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale},\n",
     " author={Dosovitskiy, Alexey and Beyer, Lucas and Kolesnikov, Alexander and Weissenborn, Dirk and Zhai, Xiaohua and Unterthiner, Thomas and Dehghani, Mostafa and Minderer, Matthias and Heigold, Georg and Gelly, Sylvain and Uszkoreit, Jakob and Houlsby, Neil},\n",
     " journal={arXiv preprint arXiv:2010.11929},\n",
     " year={2020}\n",
-    "}"
+    "}\n",
+    "```"
    ]
   }
  ],
  "metadata": {
   "kernelspec": {
-   "display_name": "Python 3 (ipykernel)",
+   "display_name": "Python 3",
    "language": "python",
    "name": "python3"
   },