Commit 0d3d0000 · Author: linjintao

Add json col

Parent 0fb6612f
......@@ -4,9 +4,9 @@
### ActivityNet feature
|config | gpus | pretrain | AR@100| AUC | gpu_mem(M) | iter time(s) | ckpt | log|
|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
|[bmn_400x100_9e_2x8_activitynet_feature](/configs/localization/bmn/bmn_400x100_2x8_9e_activitynet_feature.py) |x| None |75.28|67.22|5420|3.27|[ckpt]()| [log]()|
|config | gpus | pretrain | AR@100| AUC | gpu_mem(M) | iter time(s) | ckpt | log| json|
|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
|[bmn_400x100_9e_2x8_activitynet_feature](/configs/localization/bmn/bmn_400x100_2x8_9e_activitynet_feature.py) |x| None |75.28|67.22|5420|3.27|[ckpt]()| [log]()| [json]()|
Notes:
1. The **gpus** column indicates the number of GPUs we used to get the checkpoint. Note that the configs we provide assume 8 GPUs by default.
......
......@@ -4,9 +4,9 @@
### ActivityNet feature
|config | gpus| pretrain | AR@100| AUC | gpu_mem(M) | iter time(s) | ckpt | log|
|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
|bsn_400x100_1x16_20e_activitynet_feature |x| None |74.65|66.45|41(TEM)+25(PEM)|0.074(TEM)+0.036(PEM)|[ckpt_tem]() [ckpt_pem]()| [log_tem]() [log_pem]()|
|config | gpus| pretrain | AR@100| AUC | gpu_mem(M) | iter time(s) | ckpt | log| json|
|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
|bsn_400x100_1x16_20e_activitynet_feature |x| None |74.65|66.45|41(TEM)+25(PEM)|0.074(TEM)+0.036(PEM)|[ckpt_tem]() [ckpt_pem]()| [log_tem]() [log_pem]()| [json_tem]() [json_pem]()|
Notes:
1. The **gpus** column indicates the number of GPUs we used to get the checkpoint. Note that the configs we provide assume 8 GPUs by default.
......
......@@ -4,13 +4,13 @@
### Kinetics-400
|config | gpus | backbone |pretrain| top1 acc| top5 acc | inference_time(video/s) | gpu_mem(M)| ckpt | log|
|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
|[i3d_r34_32x2x1_100e_kinetics400_rgb](/configs/recognition/i3d/i3d_r34_32x2x1_100e_kinetics400_rgb.py) |x| ResNet34|ImageNet |68.37|88.15|1.6 (320x3 frames)| 3176| [ckpt]() | [log]()|
|[i3d_r50_32x2x1_100e_kinetics400_rgb](/configs/recognition/i3d/i3d_r50_32x2x1_100e_kinetics400_rgb.py) |x| ResNet50|ImageNet |72.68|90.78|1.7 (320x3 frames)| 5170|[ckpt]() | [log]()|
|[i3d_r50_dense_32x2x1_100e_kinetics400_rgb](/configs/recognition/i3d/i3d_r50_dense_32x2x1_100e_kinetics400_rgb.py) |x| ResNet50| ImageNet|72.77|90.57|1.7 (320x3 frames)| 5170| [ckpt]() | [log]()|
|[i3d_r50_fast_32x2x1_100e_kinetics400_rgb](/configs/recognition/i3d/i3d_r50_fast_32x2x1_100e_kinetics400_rgb.py) |x| ResNet50 |ImageNet|72.32|90.72|1.8 (320x3 frames)| 5170| [ckpt]() | [log]()|
|[i3d_r50_video_3d_32x2x1_100e_kinetics400_rgb](/configs/recognition/i3d/i3d_r50_video_32x2x1_100e_kinetics400_rgb.py) |x| ResNet50| ImageNet| x | x | x| x| [ckpt]() | [log]()|
|config | gpus | backbone |pretrain| top1 acc| top5 acc | inference_time(video/s) | gpu_mem(M)| ckpt | log| json|
|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
|[i3d_r34_32x2x1_100e_kinetics400_rgb](/configs/recognition/i3d/i3d_r34_32x2x1_100e_kinetics400_rgb.py) |x| ResNet34|ImageNet |68.37|88.15|1.6 (320x3 frames)| 3176| [ckpt]() | [log]()| [json]()|
|[i3d_r50_32x2x1_100e_kinetics400_rgb](/configs/recognition/i3d/i3d_r50_32x2x1_100e_kinetics400_rgb.py) |x| ResNet50|ImageNet |72.68|90.78|1.7 (320x3 frames)| 5170|[ckpt]() | [log]()| [json]()|
|[i3d_r50_dense_32x2x1_100e_kinetics400_rgb](/configs/recognition/i3d/i3d_r50_dense_32x2x1_100e_kinetics400_rgb.py) |x| ResNet50| ImageNet|72.77|90.57|1.7 (320x3 frames)| 5170| [ckpt]() | [log]()| [json]()|
|[i3d_r50_fast_32x2x1_100e_kinetics400_rgb](/configs/recognition/i3d/i3d_r50_fast_32x2x1_100e_kinetics400_rgb.py) |x| ResNet50 |ImageNet|72.32|90.72|1.8 (320x3 frames)| 5170| [ckpt]() | [log]()| [json]()|
|[i3d_r50_video_3d_32x2x1_100e_kinetics400_rgb](/configs/recognition/i3d/i3d_r50_video_32x2x1_100e_kinetics400_rgb.py) |x| ResNet50| ImageNet| x | x | x| x| [ckpt]() | [log]()| [json]()|
Notes:
1. The **gpus** column indicates the number of GPUs we used to get the checkpoint. Note that the configs we provide assume 8 GPUs by default.
......
......@@ -4,11 +4,11 @@
### Kinetics-400
|config | gpus | backbone | pretrain| top1 acc| top5 acc | inference_time(video/s) | gpu_mem(M) | ckpt | log|
|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
|[r2plus1d_r34_8x8x1_180e_kinetics400_rgb](/configs/recognition/r2plus1d/r2plus1d_r34_8x8x1_180e_kinetics400_rgb.py) |x| ResNet34|None |68.68|88.36|1.6 (80x3 frames)|5019|[ckpt]()| [log]()|
|[r2plus1d_r34_32x2x1_180e_kinetics400_rgb](/configs/recognition/r2plus1d/r2plus1d_r34_32x2x1_180e_kinetics400_rgb.py) |x| ResNet34|None |74.60|91.59|0.5 (320x3 frames)|12975| [ckpt]() | [log]()|
|[r2plus1d_r34_video_8x8x1_180e_kinetics400_rgb](/configs/recognition/r2plus1d/r2plus1d_r34_video_8x8x1_180e_kinetics400_rgb.py) |x| ResNet34|None |x|x|x|x| [ckpt]() | [log]()|
|config | gpus | backbone | pretrain| top1 acc| top5 acc | inference_time(video/s) | gpu_mem(M) | ckpt | log| json|
|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
|[r2plus1d_r34_8x8x1_180e_kinetics400_rgb](/configs/recognition/r2plus1d/r2plus1d_r34_8x8x1_180e_kinetics400_rgb.py) |x| ResNet34|None |68.68|88.36|1.6 (80x3 frames)|5019|[ckpt]()| [log]()| [json]()|
|[r2plus1d_r34_32x2x1_180e_kinetics400_rgb](/configs/recognition/r2plus1d/r2plus1d_r34_32x2x1_180e_kinetics400_rgb.py) |x| ResNet34|None |74.60|91.59|0.5 (320x3 frames)|12975| [ckpt]() | [log]()| [json]()|
|[r2plus1d_r34_video_8x8x1_180e_kinetics400_rgb](/configs/recognition/r2plus1d/r2plus1d_r34_video_8x8x1_180e_kinetics400_rgb.py) |x| ResNet34|None |x|x|x|x| [ckpt]() | [log]()| [json]()|
Notes:
1. The **gpus** column indicates the number of GPUs we used to get the checkpoint. Note that the configs we provide assume 8 GPUs by default.
......
......@@ -4,11 +4,11 @@
### Kinetics-400
|config | gpus | backbone |pretrain| top1 acc| top5 acc | inference_time(video/s) | gpu_mem(M) | ckpt | log|
|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
|[slowfast_r50_4x16x1_256e_kinetics400_rgb](/configs/recognition/slowfast/slowfast_r50_4x16x1_256e_kinetics400_rgb.py) |x| ResNet50|None |75.3|92.2|1.6 (320x3 frames)|6203|[ckpt]()| [log]()|
|[slowfast_r50_8x8x1_256e_kinetics400_rgb](/configs/recognition/slowfast/slowfast_r50_8x8x1_256e_kinetics400_rgb.py) |x| ResNet50 |None|76.36|92.56|1.3 (320x3 frames)|9062| [ckpt]() | [log]()|
|[slowfast_r50_video_4x16x1_256e_kinetics400_rgb](/configs/recognition/slowfast/slowfast_r50_video_4x16x1_256e_kinetics400_rgb.py) |x| ResNet50|None |x|x|x|x| [ckpt]() | [log]()|
|config | gpus | backbone |pretrain| top1 acc| top5 acc | inference_time(video/s) | gpu_mem(M) | ckpt | log| json|
|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
|[slowfast_r50_4x16x1_256e_kinetics400_rgb](/configs/recognition/slowfast/slowfast_r50_4x16x1_256e_kinetics400_rgb.py) |x| ResNet50|None |75.3|92.2|1.6 (320x3 frames)|6203|[ckpt]()| [log]()| [json]()|
|[slowfast_r50_8x8x1_256e_kinetics400_rgb](/configs/recognition/slowfast/slowfast_r50_8x8x1_256e_kinetics400_rgb.py) |x| ResNet50 |None|76.36|92.56|1.3 (320x3 frames)|9062| [ckpt]() | [log]()| [json]()|
|[slowfast_r50_video_4x16x1_256e_kinetics400_rgb](/configs/recognition/slowfast/slowfast_r50_video_4x16x1_256e_kinetics400_rgb.py) |x| ResNet50|None |x|x|x|x| [ckpt]() | [log]()| [json]()|
Notes:
1. The **gpus** column indicates the number of GPUs we used to get the checkpoint. Note that the configs we provide assume 8 GPUs by default.
......
......@@ -4,13 +4,13 @@
### Kinetics-400
|config | gpus | backbone |pretrain| top1 acc| top5 acc | inference_time(video/s) | gpu_mem(M) | ckpt | log|
|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
|[slowonly_r50_4x16x1_256e_kinetics400_rgb](/configs/recognition/slowonly/slowonly_r50_4x16x1_256e_kinetics400_rgb.py)|x| ResNet50 | None |73.02|90.77|4.0 (40x3 frames)|3168|[ckpt]()| [log]()|
|[slowonly_r50_8x8x1_256e_kinetics400_rgb](/configs/recognition/slowonly/slowonly_r50_8x8x1_256e_kinetics400_rgb.py) |x| ResNet50 | None |74.93|91.92|2.3 (80x3 frames)|5820| [ckpt]() | [log]()|
|[slowonly_r50_4x16x1_256e_kinetics400_flow](/configs/recognition/slowonly/slowonly_r50_4x16x1_256e_kinetics400_flow.py)|x| ResNet50 | ImageNet |61.79|83.62|x|8450| [ckpt]() | [log]() |
|[slowonly_r50_8x8x1_196e_kinetics400_flow](/configs/recognition/slowonly/slowonly_r50_8x8x1_196e_kinetics400_flow.py) |x| ResNet50 | ImageNet |65.76|86.25|x|8455| [ckpt]() | [log]() |
|[slowonly_r50_video_4x16x1_256e_kinetics400_rgb](/configs/recognition/slowonly/slowonly_r50_video_4x16x1_256e_kinetics400_rgb.py)|x| ResNet50 | None |x|x|x|x| [ckpt]() | [log]()|
|config | gpus | backbone |pretrain| top1 acc| top5 acc | inference_time(video/s) | gpu_mem(M) | ckpt | log| json|
|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
|[slowonly_r50_4x16x1_256e_kinetics400_rgb](/configs/recognition/slowonly/slowonly_r50_4x16x1_256e_kinetics400_rgb.py)|x| ResNet50 | None |73.02|90.77|4.0 (40x3 frames)|3168|[ckpt]()| [log]()| [json]()|
|[slowonly_r50_8x8x1_256e_kinetics400_rgb](/configs/recognition/slowonly/slowonly_r50_8x8x1_256e_kinetics400_rgb.py) |x| ResNet50 | None |74.93|91.92|2.3 (80x3 frames)|5820| [ckpt]() | [log]()| [json]()|
|[slowonly_r50_4x16x1_256e_kinetics400_flow](/configs/recognition/slowonly/slowonly_r50_4x16x1_256e_kinetics400_flow.py)|x| ResNet50 | ImageNet |61.79|83.62|x|8450| [ckpt]() | [log]() | [json]()|
|[slowonly_r50_8x8x1_196e_kinetics400_flow](/configs/recognition/slowonly/slowonly_r50_8x8x1_196e_kinetics400_flow.py) |x| ResNet50 | ImageNet |65.76|86.25|x|8455| [ckpt]() | [log]() | [json]()|
|[slowonly_r50_video_4x16x1_256e_kinetics400_rgb](/configs/recognition/slowonly/slowonly_r50_video_4x16x1_256e_kinetics400_rgb.py)|x| ResNet50 | None |x|x|x|x| [ckpt]() | [log]()| [json]()|
Notes:
1. The **gpus** column indicates the number of GPUs we used to get the checkpoint. Note that the configs we provide assume 8 GPUs by default.
......
......@@ -4,21 +4,21 @@
### Kinetics-400
|config | gpus | backbone| pretrain | top1 acc| top5 acc | inference_time(video/s) | gpu_mem(M)| ckpt | log|
|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
|[tin_r50_1x1x8_35e_kinetics400_rgb](/configs/recognition/tin/tin_r50_1x1x8_35e_kinetics400_rgb.py) |x| ResNet50| ImageNet |69.44|89.19|16.5 (8x1 frames)| 6173| [ckpt]() | [log]()|
|[tin_r50_finetune_1x1x8_35e_kinetics400_rgb](/configs/recognition/tin/tin_r50_finetune_1x1x8_35e_kinetics400_rgb.py) |x| ResNet50| ImageNet |71.00|89.98| x | 6174 | [ckpt]() | [log]()|
|[tin_r50_video_2d_1x1x8_35e_kinetics400_rgb](/configs/recognition/tin/tin_r50_video_1x1x8_35e_kinetics400_rgb.py) |x| ResNet50 | ImageNet | x | x | x | x | [ckpt]() | [log]()|
|config | gpus | backbone| pretrain | top1 acc| top5 acc | inference_time(video/s) | gpu_mem(M)| ckpt | log| json|
|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
|[tin_r50_1x1x8_35e_kinetics400_rgb](/configs/recognition/tin/tin_r50_1x1x8_35e_kinetics400_rgb.py) |x| ResNet50| ImageNet |69.44|89.19|16.5 (8x1 frames)| 6173| [ckpt]() | [log]()| [json]()|
|[tin_r50_finetune_1x1x8_35e_kinetics400_rgb](/configs/recognition/tin/tin_r50_finetune_1x1x8_35e_kinetics400_rgb.py) |x| ResNet50| ImageNet |71.00|89.98| x | 6174 | [ckpt]() | [log]()| [json]()|
|[tin_r50_video_2d_1x1x8_35e_kinetics400_rgb](/configs/recognition/tin/tin_r50_video_1x1x8_35e_kinetics400_rgb.py) |x| ResNet50 | ImageNet | x | x | x | x | [ckpt]() | [log]()| [json]()|
### Something-Something V1
|config | gpus | backbone| pretrain | top1 acc| top5 acc | gpu_mem(M) | ckpt | log|
|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
|[tin_r50_1x1x8_35e_sthv1_rgb](/configs/recognition/tin/tin_r50_1x1x8_35e_sthv1_rgb.py) |x| ResNet50 |ImageNet|41.59|71.94| x | [ckpt]() | [log]()|
|config | gpus | backbone| pretrain | top1 acc| top5 acc | gpu_mem(M) | ckpt | log| json|
|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
|[tin_r50_1x1x8_35e_sthv1_rgb](/configs/recognition/tin/tin_r50_1x1x8_35e_sthv1_rgb.py) |x| ResNet50 |ImageNet|41.59|71.94| x | [ckpt]() | [log]()| [json]()|
### Something-Something V2
|config | gpus | backbone | pretrain| top1 acc| top5 acc | gpu_mem(M) | ckpt | log|
|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
|[tin_r50_1x1x8_35e_sthv2_rgb](/configs/recognition/tin/tin_r50_1x1x8_35e_sthv2_rgb.py) |x| ResNet50|ImageNet |53.08|82.02| x | [ckpt]() | [log]()|
|config | gpus | backbone | pretrain| top1 acc| top5 acc | gpu_mem(M) | ckpt | log| json|
|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
|[tin_r50_1x1x8_35e_sthv2_rgb](/configs/recognition/tin/tin_r50_1x1x8_35e_sthv2_rgb.py) |x| ResNet50|ImageNet |53.08|82.02| x | [ckpt]() | [log]()| [json]()|
Notes:
1. The **gpus** column indicates the number of GPUs we used to get the checkpoint. Note that the configs we provide assume 8 GPUs by default.
......
......@@ -4,28 +4,28 @@
### Kinetics-400
|config | gpus | backbone | pretrain | top1 acc| top5 acc | inference_time(video/s) | gpu_mem(M)| ckpt | log|
|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
|[tsm_r50_1x1x8_100e_kinetics400_rgb](/configs/recognition/tsm/tsm_r50_1x1x8_50e_kinetics400_rgb.py) |x| ResNet50| ImageNet |70.24|89.56|74.0 (8x1 frames)| 7079 | [ckpt]() | [log]()|
|[tsm_r50_dense_1x1x8_100e_kinetics400_rgb](/configs/recognition/tsm/tsm_r50_dense_1x1x8_100e_kinetics400_rgb.py) |x| ResNet50 | ImageNet|71.84|90.18|11.5 (8x10 frames)| 7079 | [ckpt]() | [log]()|
|[tsm_r50_1x1x16_50e_kinetics400_rgb](/configs/recognition/tsm/tsm_r50_1x1x16_50e_kinetics400_rgb.py) |x| ResNet50| ImageNet |71.69|90.4|47.0 (16x1 frames)| 10404 | [ckpt]() | [log]()|
|[tsm_r50_video_1x1x8_100e_kinetics400_rgb](/configs/recognition/tsm/tsm_r50_video_1x1x8_100e_kinetics400_rgb.py) |x| ResNet50| ImageNet | x | x | x | 7077 | [ckpt]() | [log]()|
|config | gpus | backbone | pretrain | top1 acc| top5 acc | inference_time(video/s) | gpu_mem(M)| ckpt | log| json|
|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
|[tsm_r50_1x1x8_100e_kinetics400_rgb](/configs/recognition/tsm/tsm_r50_1x1x8_50e_kinetics400_rgb.py) |x| ResNet50| ImageNet |70.24|89.56|74.0 (8x1 frames)| 7079 | [ckpt]() | [log]()| [json]()|
|[tsm_r50_dense_1x1x8_100e_kinetics400_rgb](/configs/recognition/tsm/tsm_r50_dense_1x1x8_100e_kinetics400_rgb.py) |x| ResNet50 | ImageNet|71.84|90.18|11.5 (8x10 frames)| 7079 | [ckpt]() | [log]()| [json]()|
|[tsm_r50_1x1x16_50e_kinetics400_rgb](/configs/recognition/tsm/tsm_r50_1x1x16_50e_kinetics400_rgb.py) |x| ResNet50| ImageNet |71.69|90.4|47.0 (16x1 frames)| 10404 | [ckpt]() | [log]()| [json]()|
|[tsm_r50_video_1x1x8_100e_kinetics400_rgb](/configs/recognition/tsm/tsm_r50_video_1x1x8_100e_kinetics400_rgb.py) |x| ResNet50| ImageNet | x | x | x | 7077 | [ckpt]() | [log]()| [json]()|
### Something-Something V1
|config | gpus | backbone| pretrain | top1 acc| top5 acc | gpu_mem(M) | ckpt | log|
|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
|[tsm_r50_1x1x8_50e_sthv1_rgb](/configs/recognition/tsm/tsm_r50_1x1x8_50e_sthv1_rgb.py) |x| ResNet50 | ImageNet|44.62|75.51| 7077| [ckpt]() | [log]()|
|[tsm_r50_1x1x16_50e_sthv1_rgb](/configs/recognition/tsm/tsm_r50_1x1x16_50e_sthv1_rgb.py) |x| ResNet50 | ImageNet|43.81|74.73| x | [ckpt]() | [log]()|
|[tsm_r101_1x1x8_50e_sthv1_rgb](/configs/recognition/tsm/tsm_r101_1x1x8_50e_sthv1_rgb.py) |x| ResNet101| ImageNet |46.41|74.07| x | [ckpt]() | [log]()|
|config | gpus | backbone| pretrain | top1 acc| top5 acc | gpu_mem(M) | ckpt | log| json|
|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
|[tsm_r50_1x1x8_50e_sthv1_rgb](/configs/recognition/tsm/tsm_r50_1x1x8_50e_sthv1_rgb.py) |x| ResNet50 | ImageNet|44.62|75.51| 7077| [ckpt]() | [log]()| [json]()|
|[tsm_r50_1x1x16_50e_sthv1_rgb](/configs/recognition/tsm/tsm_r50_1x1x16_50e_sthv1_rgb.py) |x| ResNet50 | ImageNet|43.81|74.73| x | [ckpt]() | [log]()| [json]()|
|[tsm_r101_1x1x8_50e_sthv1_rgb](/configs/recognition/tsm/tsm_r101_1x1x8_50e_sthv1_rgb.py) |x| ResNet101| ImageNet |46.41|74.07| x | [ckpt]() | [log]()| [json]()|
### Something-Something V2
|config | gpus | backbone | pretrain| top1 acc| top5 acc | gpu_mem(M) | ckpt | log|
|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
|[tsm_r50_1x1x8_50e_sthv2_rgb](/configs/recognition/tsm/tsm_r50_1x1x8_50e_sthv2_rgb.py) |x| ResNet50| ImageNet |59.91|84.61| x| [ckpt]() | [log]()|
|[tsm_r50_1x1x16_50e_sthv2_rgb](/configs/recognition/tsm/tsm_r50_1x1x16_50e_sthv2_rgb.py) |x| ResNet50| ImageNet |56.10|84.43| 10400| [ckpt]() | [log]()|
|[tsm_r101_1x1x8_50e_sthv2_rgb](/configs/recognition/tsm/tsm_r101_1x1x8_50e_sthv2_rgb.py) |x| ResNet101 | ImageNet|59.12|85.74| 9784 | [ckpt]() | [log]()|
|config | gpus | backbone | pretrain| top1 acc| top5 acc | gpu_mem(M) | ckpt | log| json|
|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
|[tsm_r50_1x1x8_50e_sthv2_rgb](/configs/recognition/tsm/tsm_r50_1x1x8_50e_sthv2_rgb.py) |x| ResNet50| ImageNet |59.91|84.61| x| [ckpt]() | [log]()| [json]()|
|[tsm_r50_1x1x16_50e_sthv2_rgb](/configs/recognition/tsm/tsm_r50_1x1x16_50e_sthv2_rgb.py) |x| ResNet50| ImageNet |56.10|84.43| 10400| [ckpt]() | [log]()| [json]()|
|[tsm_r101_1x1x8_50e_sthv2_rgb](/configs/recognition/tsm/tsm_r101_1x1x8_50e_sthv2_rgb.py) |x| ResNet101 | ImageNet|59.12|85.74| 9784 | [ckpt]() | [log]()| [json]()|
Notes:
1. The **gpus** column indicates the number of GPUs we used to get the checkpoint. Note that the configs we provide assume 8 GPUs by default.
......
......@@ -4,54 +4,54 @@
### UCF-101
|config | gpus | backbone | pretrain | top1 acc| top5 acc | gpu_mem(M) | ckpt | log|
|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
|[tsn_r50_1x1x3_100e_ucf101_rgb](/configs/recognition/tsn/tsn_r50_1x1x3_80e_ucf101_rgb.py) |x| ResNet50 | ImageNet |80.12|96.09|8332| [ckpt]() | [log]()|
|config | gpus | backbone | pretrain | top1 acc| top5 acc | gpu_mem(M) | ckpt | log| json|
|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
|[tsn_r50_1x1x3_100e_ucf101_rgb](/configs/recognition/tsn/tsn_r50_1x1x3_80e_ucf101_rgb.py) |x| ResNet50 | ImageNet |80.12|96.09|8332| [ckpt]() | [log]()| [json]()|
### Kinetics-400
|config | gpus | backbone|pretrain | top1 acc| top5 acc | inference_time(video/s) | gpu_mem(M)| ckpt | log|
|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
|[tsn_r50_1x1x3_100e_kinetics400_rgb](/configs/recognition/tsn/tsn_r50_1x1x3_100e_kinetics400_rgb.py) |x| ResNet50 | ImageNet|70.60|89.26|4.3 (25x10 frames)|8344| [ckpt]() | [log]()|
|[tsn_r50_1x1x5_50e_kinetics400_rgb](/configs/recognition/tsn/tsn_r50_1x1x5_50e_kinetics400_rgb.py) |x| ResNet50| ImageNet |68.64|88.19|86.7 (8x1 frames)|7031| [ckpt]() | [log]()|
|[tsn_r50_dense_1x1x5_50e_kinetics400_rgb](/configs/recognition/tsn/tsn_r50_dense_1x1x5_100e_kinetics400_rgb.py) |x| ResNet50| ImageNet |68.59|88.31|12.7 (8x10 frames)|7028| [ckpt]() | [log]()|
|[tsn_r50_1x1x8_100e_kinetics400_rgb](/configs/recognition/tsn/tsn_r50_1x1x8_100e_kinetics400_rgb.py) |x| ResNet50| ImageNet |69.41|88.37|81.6 (8x1 frames)| x | [ckpt]() | [log]()|
|[tsn_r50_320p_1x1x3_100e_kinetics400_rgb](/configs/recognition/tsn/tsn_r50_320p_1x1x3_100e_kinetics400_rgb.py) |x| ResNet50| ImageNet |70.91|89.51|10.7 (25x3 frames)| 8344 | [ckpt]() | [log]() |
|[tsn_r50_320p_1x1x3_110e_kinetics400_flow](/configs/recognition/tsn/tsn_r50_320p_1x1x3_110e_kinetics400_flow.py) |x| ResNet50 | ImageNet|55.70|79.85|x| 8471 | [ckpt]() | [log]() |
|tsn_r50_320p_1x1x3_kinetics400_twostream [1: 1]* |x| ResNet50 | ImageNet|72.76|90.52| x | x | [ckpt]() | [log]() |
|[tsn_r50_320p_1x1x8_100e_kinetics400_rgb](/configs/recognition/tsn/tsn_r50_320p_1x1x8_100e_kinetics400_rgb.py) |x| ResNet50| ImageNet |72.41|90.55|11.1 (25x3 frames)| 8344 | [ckpt]() | [log]() |
|[tsn_r50_320p_1x1x8_110e_kinetics400_flow](/configs/recognition/tsn/tsn_r50_320p_1x1x8_110e_kinetics400_flow.py) |x| ResNet50 | ImageNet|57.76|80.99|x| 8473 | [ckpt]() | [log]() |
|tsn_r50_320p_1x1x8_kinetics400_twostream [1: 1]* |x| ResNet50| ImageNet |74.64|91.77| x | x | [ckpt]() | [log]() |
|[tsn_r50_dense_1x1x8_100e_kinetics400_rgb](/configs/recognition/tsn/tsn_r50_dense_1x1x8_100e_kinetics400_rgb.py) |x| ResNet50 | ImageNet|70.77|89.3|12.2 (8x10 frames)|8344| [ckpt]() | [log]()|
|[tsn_r50_video_1x1x3_100e_kinetics400_rgb](/configs/recognition/tsn/tsn_r50_video_1x1x3_100e_kinetics400_rgb.py) |x| ResNet50| ImageNet | x | x | x |8339| [ckpt]() | [log]()|
|config | gpus | backbone|pretrain | top1 acc| top5 acc | inference_time(video/s) | gpu_mem(M)| ckpt | log| json|
|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
|[tsn_r50_1x1x3_100e_kinetics400_rgb](/configs/recognition/tsn/tsn_r50_1x1x3_100e_kinetics400_rgb.py) |x| ResNet50 | ImageNet|70.60|89.26|4.3 (25x10 frames)|8344| [ckpt]() | [log]()| [json]()|
|[tsn_r50_1x1x5_50e_kinetics400_rgb](/configs/recognition/tsn/tsn_r50_1x1x5_50e_kinetics400_rgb.py) |x| ResNet50| ImageNet |68.64|88.19|86.7 (8x1 frames)|7031| [ckpt]() | [log]()| [json]()|
|[tsn_r50_dense_1x1x5_50e_kinetics400_rgb](/configs/recognition/tsn/tsn_r50_dense_1x1x5_100e_kinetics400_rgb.py) |x| ResNet50| ImageNet |68.59|88.31|12.7 (8x10 frames)|7028| [ckpt]() | [log]()| [json]()|
|[tsn_r50_1x1x8_100e_kinetics400_rgb](/configs/recognition/tsn/tsn_r50_1x1x8_100e_kinetics400_rgb.py) |x| ResNet50| ImageNet |69.41|88.37|81.6 (8x1 frames)| x | [ckpt]() | [log]()| [json]()|
|[tsn_r50_320p_1x1x3_100e_kinetics400_rgb](/configs/recognition/tsn/tsn_r50_320p_1x1x3_100e_kinetics400_rgb.py) |x| ResNet50| ImageNet |70.91|89.51|10.7 (25x3 frames)| 8344 | [ckpt]() | [log]() | [json]()|
|[tsn_r50_320p_1x1x3_110e_kinetics400_flow](/configs/recognition/tsn/tsn_r50_320p_1x1x3_110e_kinetics400_flow.py) |x| ResNet50 | ImageNet|55.70|79.85|x| 8471 | [ckpt]() | [log]() | [json]()|
|tsn_r50_320p_1x1x3_kinetics400_twostream [1: 1]* |x| ResNet50 | ImageNet|72.76|90.52| x | x | [ckpt]() | [log]() | [json]()|
|[tsn_r50_320p_1x1x8_100e_kinetics400_rgb](/configs/recognition/tsn/tsn_r50_320p_1x1x8_100e_kinetics400_rgb.py) |x| ResNet50| ImageNet |72.41|90.55|11.1 (25x3 frames)| 8344 | [ckpt]() | [log]() | [json]()|
|[tsn_r50_320p_1x1x8_110e_kinetics400_flow](/configs/recognition/tsn/tsn_r50_320p_1x1x8_110e_kinetics400_flow.py) |x| ResNet50 | ImageNet|57.76|80.99|x| 8473 | [ckpt]() | [log]() | [json]()|
|tsn_r50_320p_1x1x8_kinetics400_twostream [1: 1]* |x| ResNet50| ImageNet |74.64|91.77| x | x | [ckpt]() | [log]() | [json]()|
|[tsn_r50_dense_1x1x8_100e_kinetics400_rgb](/configs/recognition/tsn/tsn_r50_dense_1x1x8_100e_kinetics400_rgb.py) |x| ResNet50 | ImageNet|70.77|89.3|12.2 (8x10 frames)|8344| [ckpt]() | [log]()| [json]()|
|[tsn_r50_video_1x1x3_100e_kinetics400_rgb](/configs/recognition/tsn/tsn_r50_video_1x1x3_100e_kinetics400_rgb.py) |x| ResNet50| ImageNet | x | x | x |8339| [ckpt]() | [log]()| [json]()|
*We combine the RGB and flow scores with coefficients 1: 1 to get the two-stream prediction (without applying softmax).
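The footnote above describes a weighted sum of the per-class scores from the two streams. A minimal sketch of that fusion, assuming each stream yields one raw (pre-softmax) score per class for a video; the function names `fuse_two_stream` and `top1` and the sample scores are hypothetical and not taken from the repository:

```python
def fuse_two_stream(rgb_scores, flow_scores, rgb_weight=1.0, flow_weight=1.0):
    """Weighted sum of per-class scores from two streams (no softmax applied)."""
    if len(rgb_scores) != len(flow_scores):
        raise ValueError("both streams must score the same set of classes")
    return [rgb_weight * r + flow_weight * f
            for r, f in zip(rgb_scores, flow_scores)]


def top1(scores):
    """Index of the highest-scoring class."""
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical raw class scores for one video with three classes.
rgb = [2.0, 0.5, 1.0]
flow = [0.5, 1.0, 3.0]

fused = fuse_two_stream(rgb, flow)  # coefficients 1: 1, as in the footnote
print(fused, top1(fused))
```

With the default weights this reduces to an element-wise sum, so the fused prediction can differ from either single-stream prediction when the streams disagree.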
### Something-Something V1
|config | gpus| backbone |pretrain| top1 acc| top5 acc | gpu_mem(M) | ckpt | log|
|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
|[tsn_r50_1x1x8_50e_sthv1_rgb](/configs/recognition/tsn/tsn_r50_1x1x8_50e_sthv1_rgb.py) |x| ResNet50 | ImageNet|18.55|44.80| 10978 | [ckpt]() | [log]()|
|[tsn_r50_1x1x16_50e_sthv1_rgb](/configs/recognition/tsn/tsn_r50_1x1x16_50e_sthv1_rgb.py) |x| ResNet50| ImageNet |15.77|39.85| 5691 | [ckpt]() | [log]()|
|config | gpus| backbone |pretrain| top1 acc| top5 acc | gpu_mem(M) | ckpt | log| json|
|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
|[tsn_r50_1x1x8_50e_sthv1_rgb](/configs/recognition/tsn/tsn_r50_1x1x8_50e_sthv1_rgb.py) |x| ResNet50 | ImageNet|18.55|44.80| 10978 | [ckpt]() | [log]()| [json]()|
|[tsn_r50_1x1x16_50e_sthv1_rgb](/configs/recognition/tsn/tsn_r50_1x1x16_50e_sthv1_rgb.py) |x| ResNet50| ImageNet |15.77|39.85| 5691 | [ckpt]() | [log]()| [json]()|
### Something-Something V2
|config | gpus| backbone| pretrain | top1 acc| top5 acc | gpu_mem(M) | ckpt | log|
|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
|[tsn_r50_1x1x8_50e_sthv2_rgb](/configs/recognition/tsn/tsn_r50_1x1x8_50e_sthv2_rgb.py) |x| ResNet50| ImageNet |32.41|64.05| 10978 | [ckpt]() | [log]()|
|[tsn_r50_1x1x16_50e_sthv2_rgb](/configs/recognition/tsn/tsn_r50_1x1x16_50e_sthv2_rgb.py) |x| ResNet50| ImageNet |22.48|49.08|5698| [ckpt]() | [log]()|
|config | gpus| backbone| pretrain | top1 acc| top5 acc | gpu_mem(M) | ckpt | log| json|
|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
|[tsn_r50_1x1x8_50e_sthv2_rgb](/configs/recognition/tsn/tsn_r50_1x1x8_50e_sthv2_rgb.py) |x| ResNet50| ImageNet |32.41|64.05| 10978 | [ckpt]() | [log]()| [json]()|
|[tsn_r50_1x1x16_50e_sthv2_rgb](/configs/recognition/tsn/tsn_r50_1x1x16_50e_sthv2_rgb.py) |x| ResNet50| ImageNet |22.48|49.08|5698| [ckpt]() | [log]()| [json]()|
### Moments in Time
|config | gpus| backbone | pretrain | top1 acc| top5 acc | gpu_mem(M)| ckpt | log|
|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
|[tsn_r50_1x1x6_100e_mit_rgb](/configs/recognition/tsn/tsn_r50_1x1x6_100e_mit_rgb.py) |x| ResNet50| ImageNet |26.84|51.6| 8339| [ckpt]() | [log]()|
|config | gpus| backbone | pretrain | top1 acc| top5 acc | gpu_mem(M)| ckpt | log| json|
|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
|[tsn_r50_1x1x6_100e_mit_rgb](/configs/recognition/tsn/tsn_r50_1x1x6_100e_mit_rgb.py) |x| ResNet50| ImageNet |26.84|51.6| 8339| [ckpt]() | [log]()| [json]()|
### Multi-Moments in Time
|config | gpus| backbone | pretrain | mAP| gpu_mem(M) | ckpt | log|
|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
|[tsn_r101_1x1x5_50e_mmit_rgb](/configs/recognition/tsn/tsn_r101_1x1x5_50e_mmit_rgb.py) |x| ResNet101| ImageNet |61.09| 10467 | [ckpt]() | [log]()|
|config | gpus| backbone | pretrain | mAP| gpu_mem(M) | ckpt | log| json|
|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
|[tsn_r101_1x1x5_50e_mmit_rgb](/configs/recognition/tsn/tsn_r101_1x1x5_50e_mmit_rgb.py) |x| ResNet101| ImageNet |61.09| 10467 | [ckpt]() | [log]()| [json]()|
Notes:
1. The **gpus** column indicates the number of GPUs we used to get the checkpoint. Note that the configs we provide assume 8 GPUs by default.
......