diff --git a/configs/vitdet/README.md b/configs/vitdet/README.md
index 699bb9454b978df50fc4de0c9321ceb173933e92..f42e2473202c02e0798ec8286bbee210a9ab1373 100644
--- a/configs/vitdet/README.md
+++ b/configs/vitdet/README.md
@@ -17,11 +17,12 @@ non-trivial when new architectures, such as Vision Transformer (ViT) models, arr
 |:------:|:--------:|:--------------:|:--------------:|:--------------:|:------:|:------:|:--------:|
 | ViT-base | CAE | Cascade RCNN | 1x | 1 | 52.7 | [config](./cascade_rcnn_vit_base_hrfpn_cae_1x_coco.yml) | [model](https://bj.bcebos.com/v1/paddledet/models/cascade_rcnn_vit_base_hrfpn_cae_1x_coco.pdparams) |
 | ViT-large | CAE | Cascade RCNN | 1x | 1 | 55.7 | [config](./cascade_rcnn_vit_large_hrfpn_cae_1x_coco.yml) | [model](https://bj.bcebos.com/v1/paddledet/models/cascade_rcnn_vit_large_hrfpn_cae_1x_coco.pdparams) |
+| ViT-base | CAE | PP-YOLOE | 36e | 2 | 52.2 | [config](./ppyoloe_vit_base_csppan_cae_36e_coco.yml) | [model](https://bj.bcebos.com/v1/paddledet/models/ppyoloe_vit_base_csppan_cae_36e_coco.pdparams) |
 
 **Notes:**
 - Model is trained on COCO train2017 dataset and evaluated on val2017 results of `mAP(IoU=0.5:0.95)
 - Base model is trained on 8x32G V100 GPU, large model on 8x80G A100
-- The above experiments are based on PaddlePaddle 2.2.2
+- The `Cascade RCNN` experiments are based on PaddlePaddle 2.2.2
 
 ## Citations
 ```