diff --git a/configs/rtdetr/README.md b/configs/rtdetr/README.md
index 92e8ba67a753190485c13dc110887fdb3fe7ebc4..1a8c00268636739824af2819d2a450579f4ef33f 100644
--- a/configs/rtdetr/README.md
+++ b/configs/rtdetr/README.md
@@ -18,6 +18,7 @@ RT-DETR is the first real-time end-to-end object detector. Specifically, we design
 | Model | Epoch | backbone | input shape | $AP^{val}$ | $AP^{val}_{50}$| Params(M) | FLOPs(G) | T4 TensorRT FP16(FPS) | Pretrained Model | config |
 |:--------------:|:-----:|:----------:| :-------:|:--------------------------:|:---------------------------:|:---------:|:--------:| :---------------------: |:------------------------------------------------------------------------------------:|:-------------------------------------------:|
+| RT-DETR-R50-scaled | 6x | ResNet-50 | 640 | 51.3 | - | - | - | 145 | [download](https://bj.bcebos.com/v1/paddledet/models/rtdetr_r50vd_m_6x_coco.pdparams) | [config](./rtdetr_r50vd_m_6x_coco.yml) |
 | RT-DETR-R50 | 6x | ResNet-50 | 640 | 53.1 | 71.3 | 42 | 136 | 108 | [download](https://bj.bcebos.com/v1/paddledet/models/rtdetr_r50vd_6x_coco.pdparams) | [config](./rtdetr_r50vd_6x_coco.yml) |
 | RT-DETR-R101 | 6x | ResNet-101 | 640 | 54.3 | 72.7 | 76 | 259 | 74 | [download](https://bj.bcebos.com/v1/paddledet/models/rtdetr_r101vd_6x_coco.pdparams) | [config](./rtdetr_r101vd_6x_coco.yml) |
 | RT-DETR-L | 6x | HGNetv2 | 640 | 53.0 | 71.6 | 32 | 110 | 114 | [download](https://bj.bcebos.com/v1/paddledet/models/rtdetr_hgnetv2_l_6x_coco.pdparams) | [config](rtdetr_hgnetv2_l_6x_coco.yml) |
diff --git a/configs/rtdetr/rtdetr_r50vd_m_6x_coco.yml b/configs/rtdetr/rtdetr_r50vd_m_6x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..d4ab6f9f318b2ce7749b5c8f0576d94d14157edf
--- /dev/null
+++ b/configs/rtdetr/rtdetr_r50vd_m_6x_coco.yml
@@ -0,0 +1,28 @@
+_BASE_: [
+  '../datasets/coco_detection.yml',
+  '../runtime.yml',
+  '_base_/optimizer_6x.yml',
+  '_base_/rtdetr_r50vd.yml',
+  '_base_/rtdetr_reader.yml',
+]
+
+weights: output/rtdetr_r50vd_m_6x_coco/model_final
+find_unused_parameters: True
+log_iter: 200
+
+HybridEncoder:
+  hidden_dim: 256
+  use_encoder_idx: [2]
+  num_encoder_layers: 1
+  encoder_layer:
+    name: TransformerLayer
+    d_model: 256
+    nhead: 8
+    dim_feedforward: 1024
+    dropout: 0.
+    activation: 'gelu'
+  expansion: 0.5
+  depth_mult: 1.0
+
+RTDETRTransformer:
+  eval_idx: 2  # use the 3rd decoder layer for evaluation
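
For context, the new `rtdetr_r50vd_m_6x_coco.yml` scales RT-DETR-R50 down by halving the expansion ratio in `HybridEncoder` (`expansion: 0.5`) and taking the detection output from an earlier decoder layer (`eval_idx: 2`). Below is a minimal sketch of how this config could be trained through PaddleDetection's `ppdet` API, assuming the standard `tools/train.py` workflow; `pretrain_weights` is inherited from `_base_/rtdetr_r50vd.yml` rather than defined in this diff.

```python
# Minimal sketch (assumption: standard PaddleDetection ppdet workflow,
# mirroring tools/train.py) for training the new scaled RT-DETR-R50 config.
from ppdet.core.workspace import load_config
from ppdet.engine import Trainer

cfg = load_config('configs/rtdetr/rtdetr_r50vd_m_6x_coco.yml')

trainer = Trainer(cfg, mode='train')        # builds model, dataloaders, optimizer from the YAML
trainer.load_weights(cfg.pretrain_weights)  # backbone weights come from the inherited base config
trainer.train(validate=True)                # periodic COCO eval; decoder output read at eval_idx=2
```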