Commit f5e723e7 authored by LiuChaoXD

add tsn

Parent fdcd1ff9
# TSN Video Classification Model
This directory contains the TSN video classification model implemented with the PaddlePaddle dynamic graph (DyGraph) API.
---
## Contents
@@ -19,34 +19,35 @@ Temporal Segment Network (TSN) is a classic 2D-CNN-based solution for video classification
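TSN's core idea is sparse segment sampling with a consensus over snippets: a clip is split into `seg_num` equal segments, a short snippet of `seglen` frames is drawn from each, a shared 2D CNN scores every snippet, and the snippet scores are averaged. Below is a minimal sketch of that sampling step, using the `seg_num`/`seglen` names from `./configs/tsn.yaml`; the repository's actual sampling lives in `reader/ucf101_reader.py`, so treat this only as an illustration.

```python
import numpy as np

def sample_indices(num_frames, seg_num=3, seglen=1, training=True):
    """TSN-style sparse sampling: one seglen-frame snippet per segment."""
    seg_size = num_frames // seg_num
    indices = []
    for seg in range(seg_num):
        start = seg * seg_size
        if training:
            # random offset inside the segment during training
            offset = np.random.randint(0, max(seg_size - seglen + 1, 1))
        else:
            # deterministic (roughly central) offset at test time
            offset = max((seg_size - seglen) // 2, 0)
        for i in range(seglen):
            indices.append(min(start + offset + i, num_frames - 1))
    return indices

# a 300-frame clip with the default config (seg_num=3, seglen=1)
print(sample_indices(300, training=False))  # -> [49, 149, 249]
```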
## Data Preparation
TSN is trained on the UCF101 action recognition dataset. For data download and preparation, see the [data description](./data/dataset/ucf101/README.md).
## Model Training
Once the data is prepared, training can be launched in either of the following two ways:
1. Multi-GPU training
```bash
bash multi-gpus-run.sh ./configs/tsn.yaml
```
The GPUs used for multi-GPU training are configured as follows:
- First, set `num_gpus` in `./configs/tsn.yaml` (default is 4, i.e. train with 4 GPUs).
- Then, edit `export CUDA_VISIBLE_DEVICES=0,1,2,3` in `multi-gpus-run.sh` (the default 0,1,2,3 means GPUs 0-3 are used).
- Note: if you change the batch size, scale the learning rate accordingly. For example, the defaults are batch_size=128 and lr=0.001; with batch_size=64 use lr=0.0005.
2. Single-GPU training
```bash
bash run.sh ./configs/tsn.yaml
```
The GPU used for single-GPU training is configured as follows:
- First, set `num_gpus: 1` in `./configs/tsn.yaml` (train with a single GPU).
- Then, edit `export CUDA_VISIBLE_DEVICES=0` in `run.sh` (train on GPU 0).
- Note: as above, if you change the batch size, scale the learning rate accordingly (default batch_size=128, lr=0.001; with batch_size=64 use lr=0.0005); see the short sketch below.
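The batch-size note above is plain linear scaling of the learning rate; a one-function sketch (the helper name is ours, not part of the repo):

```python
def scaled_lr(batch_size, base_lr=0.001, base_batch_size=128):
    """Keep lr / batch_size at the default ratio (linear scaling rule)."""
    return base_lr * batch_size / base_batch_size

print(scaled_lr(64))   # 0.0005, the value quoted in the note
print(scaled_lr(256))  # 0.002
```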
## Model Evaluation
The model can be evaluated as follows:
```bash
bash run-eval.sh ./configs/tsn-test.yaml ./weights/final.pdparams
```
- When evaluating with `run-eval.sh`, point the weights argument (or the `weights` variable inside the script) at the weights you want to evaluate.
@@ -57,7 +58,7 @@ bash run.sh eval ./configs/tsn-test.yaml ./weights/final.pdparams
Experimental results: with 4-GPU training and the default configuration, evaluation accuracy on the UCF101 validation split is as follows:
| | seg\_num | Top-1 | Top-5 |
| :------: | :----------: | :----: | :----: |
...
@@ -24,7 +24,7 @@ bash download_annotations.sh
Run the following command to extract frames from the UCF101 video files:
```bash
python extract_rawframes.py ./videos/ ./rawframes/ --level 2 --ext avi
```
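For intuition, the command above does roughly the following per clip; this is only a sketch (the output filename pattern and the example clip name are illustrative assumptions, and the provided `extract_rawframes.py` defines the real layout):

```python
import os
import cv2  # opencv-python

def extract_frames(video_path, out_dir):
    """Decode one video and dump its frames as numbered JPEGs."""
    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        idx += 1
        cv2.imwrite(os.path.join(out_dir, "img_{:05d}.jpg".format(idx)), frame)
    cap.release()
    return idx

# --level 2 corresponds to the two-level layout videos/<class>/<clip>.avi -> rawframes/<class>/<clip>/
n = extract_frames("./videos/Archery/v_Archery_g01_c01.avi",
                   "./rawframes/Archery/v_Archery_g01_c01")
print(n, "frames written")
```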
---
@@ -32,11 +32,11 @@ python extract_rawframes_opencv.py ./videos/ ./rawframes/ --level 2 --ext avi
To generate the file list of video paths, run:
```bash
python build_ucf101_file_list.py videos/ --level 2 --format videos --out_list_path ./ --shuffle
```
To generate the file list of frame paths, run:
```bash
python build_ucf101_file_list.py rawframes/ --level 2 --format rawframes --out_list_path ./ --shuffle
```
**Parameter description**
...
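For reference, frame-list tools of this kind typically emit one clip per line as `<frame_dir> <num_frames> <label>` (video lists drop the frame count). That format is an assumption here, so check the generated `ucf101_train_split_1_rawframes.txt` against it; a small sketch that parses such a list:

```python
def load_rawframes_list(list_path):
    """Parse lines of the assumed form '<frame_dir> <num_frames> <label>'."""
    items = []
    with open(list_path) as f:
        for line in f:
            parts = line.split()
            if len(parts) != 3:
                continue  # skip blank or unexpected lines
            frame_dir, num_frames, label = parts
            items.append((frame_dir, int(num_frames), int(label)))
    return items

samples = load_rawframes_list("./ucf101_train_split_1_rawframes.txt")
print(len(samples), samples[:2])
```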
1 ApplyEyeMakeup
2 ApplyLipstick
3 Archery
4 BabyCrawling
5 BalanceBeam
6 BandMarching
7 BaseballPitch
8 Basketball
9 BasketballDunk
10 BenchPress
11 Biking
12 Billiards
13 BlowDryHair
14 BlowingCandles
15 BodyWeightSquats
16 Bowling
17 BoxingPunchingBag
18 BoxingSpeedBag
19 BreastStroke
20 BrushingTeeth
21 CleanAndJerk
22 CliffDiving
23 CricketBowling
24 CricketShot
25 CuttingInKitchen
26 Diving
27 Drumming
28 Fencing
29 FieldHockeyPenalty
30 FloorGymnastics
31 FrisbeeCatch
32 FrontCrawl
33 GolfSwing
34 Haircut
35 Hammering
36 HammerThrow
37 HandstandPushups
38 HandstandWalking
39 HeadMassage
40 HighJump
41 HorseRace
42 HorseRiding
43 HulaHoop
44 IceDancing
45 JavelinThrow
46 JugglingBalls
47 JumpingJack
48 JumpRope
49 Kayaking
50 Knitting
51 LongJump
52 Lunges
53 MilitaryParade
54 Mixing
55 MoppingFloor
56 Nunchucks
57 ParallelBars
58 PizzaTossing
59 PlayingCello
60 PlayingDaf
61 PlayingDhol
62 PlayingFlute
63 PlayingGuitar
64 PlayingPiano
65 PlayingSitar
66 PlayingTabla
67 PlayingViolin
68 PoleVault
69 PommelHorse
70 PullUps
71 Punch
72 PushUps
73 Rafting
74 RockClimbingIndoor
75 RopeClimbing
76 Rowing
77 SalsaSpin
78 ShavingBeard
79 Shotput
80 SkateBoarding
81 Skiing
82 Skijet
83 SkyDiving
84 SoccerJuggling
85 SoccerPenalty
86 StillRings
87 SumoWrestling
88 Surfing
89 Swing
90 TableTennisShot
91 TaiChi
92 TennisSwing
93 ThrowDiscus
94 TrampolineJumping
95 Typing
96 UnevenBars
97 VolleyballSpiking
98 WalkingWithDog
99 WallPushups
100 WritingOnBoard
101 YoYo
#!/usr/bin/env bash
wget --no-check-certificate "https://www.crcv.ucf.edu/data/UCF101/UCF101.rar"
unrar x UCF101.rar
mv ./UCF-101 ./videos
rm -rf ./UCF101.rar
@@ -24,7 +24,7 @@ from paddle.fluid.dygraph.base import to_variable
from model import TSN_ResNet
from utils.config_utils import *
from reader.ucf101_reader import UCF101Reader
logging.root.handlers = []
FORMAT = '[%(levelname)s: %(filename)s: %(lineno)4d]: %(message)s'
@@ -50,7 +50,7 @@ def parse_args():
        default=True,
        help='default use gpu.')
    parser.add_argument(
        '--weights', type=str, default="./weights/final", help="weight path")
    args = parser.parse_args()
    return args
...
# multi-GPU training launcher; usage (see README): bash multi-gpus-run.sh ./configs/tsn.yaml
configs=${1:-"./configs/tsn.yaml"}   # config file path; the default follows the README
pretrain=""  # set pretrain model path if needed
resume=""    # set checkpoint path if you want to resume training
save_dir=""
use_gpu=True
use_data_parallel=True

export CUDA_VISIBLE_DEVICES=0,1,2,3
export FLAGS_fast_eager_deletion_mode=1
export FLAGS_eager_delete_tensor_gb=0.0
export FLAGS_fraction_of_gpu_memory_to_use=0.98

echo "train TSN" $configs $resume $pretrain
if [ "$resume"x != ""x ]; then
    python -m paddle.distributed.launch train.py \
        --config=$configs \
        --resume=$resume \
        --use_gpu=$use_gpu \
        --use_data_parallel=$use_data_parallel
elif [ "$pretrain"x != ""x ]; then
    python -m paddle.distributed.launch train.py \
        --config=$configs \
        --pretrain=$pretrain \
        --use_gpu=$use_gpu \
        --use_data_parallel=$use_data_parallel
else
    python -m paddle.distributed.launch train.py \
        --config=$configs \
        --use_gpu=$use_gpu \
        --use_data_parallel=$use_data_parallel
fi
# evaluation launcher; usage (see README): bash run-eval.sh ./configs/tsn-test.yaml ./weights/final.pdparams
configs=${1:-"./configs/tsn-test.yaml"}   # test config; the default follows the README
weights=$2   # path of the weights to evaluate; may also be hard-coded here
use_gpu=True
use_data_parallel=False

export CUDA_VISIBLE_DEVICES=0
export FLAGS_fast_eager_deletion_mode=1
export FLAGS_eager_delete_tensor_gb=0.0
export FLAGS_fraction_of_gpu_memory_to_use=0.98

echo "eval TSN" $configs $weights
if [ "$weights"x != ""x ]; then
    python eval.py --config=$configs \
        --weights=$weights \
        --use_gpu=$use_gpu
else
    python eval.py --config=$configs \
        --use_gpu=$use_gpu
fi
# single-GPU training launcher; usage (see README): bash run.sh ./configs/tsn.yaml
configs=${1:-"./configs/tsn.yaml"}   # config file path; the default follows the README
pretrain=""  # set pretrain model path if needed
resume=""    # set checkpoint path if you want to resume training
save_dir=""
use_gpu=True
use_data_parallel=False

export CUDA_VISIBLE_DEVICES=0
export FLAGS_fast_eager_deletion_mode=1
export FLAGS_eager_delete_tensor_gb=0.0
export FLAGS_fraction_of_gpu_memory_to_use=0.98

echo "train TSN" $configs $resume $pretrain
if [ "$resume"x != ""x ]; then
    python train.py --config=$configs \
        --resume=$resume \
        --use_gpu=$use_gpu \
        --use_data_parallel=$use_data_parallel
elif [ "$pretrain"x != ""x ]; then
    python train.py --config=$configs \
        --pretrain=$pretrain \
        --use_gpu=$use_gpu \
        --use_data_parallel=$use_data_parallel
else
    python train.py --config=$configs \
        --use_gpu=$use_gpu \
        --use_data_parallel=$use_data_parallel
fi
@@ -313,14 +313,11 @@ def train(args):
            total_acc5 += acc_top5.numpy()[0]
            total_sample += 1
            train_batch_cost = time.time() - batch_start
            print(
                'TRAIN Epoch: {}, iter: {}, batch_cost: {: .5f}s, reader_cost: {: .5f}s loss={: .6f}, acc1 {: .6f}, acc5 {: .6f} \t'.
                format(epoch, batch_id, train_batch_cost, train_reader_cost,
                       avg_loss.numpy()[0],
                       acc_top1.numpy()[0], acc_top5.numpy()[0]))
            batch_start = time.time()
        print(
...
MODEL:
    name: "TSN"
    format: "frames"
    num_classes: 101
    seg_num: 3
    seglen: 1
@@ -18,7 +18,7 @@ TRAIN:
    batch_size: 128
    use_gpu: True
    num_gpus: 4 #8
    filelist: "./data/dataset/ucf101/ucf101_train_split_1_rawframes.txt"
    learning_rate: 0.001
    learning_rate_decay: 0.1
    decay_epochs: [30, 60]
@@ -32,7 +32,7 @@ VALID:
    num_reader_threads: 12
    buf_size: 1024
    batch_size: 128
    filelist: "./data/dataset/ucf101/ucf101_val_split_1_rawframes.txt"
TEST:
    short_size: 256
@@ -40,4 +40,4 @@ TEST:
    num_reader_threads: 12
    buf_size: 1024
    batch_size: 64
    filelist: "./data/dataset/ucf101/ucf101_val_split_1_rawframes.txt"
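The TRAIN section pairs `learning_rate` with `learning_rate_decay` and `decay_epochs`; assuming these denote the usual piecewise-constant decay (multiply the rate by the decay factor at each listed epoch), the resulting schedule looks like this:

```python
def lr_at_epoch(epoch, base_lr=0.001, decay=0.1, decay_epochs=(30, 60)):
    """Piecewise-constant decay implied by the TRAIN config (assumed semantics)."""
    lr = base_lr
    for boundary in decay_epochs:
        if epoch >= boundary:
            lr *= decay
    return lr

for e in (0, 29, 30, 59, 60, 79):
    print(e, lr_at_epoch(e))  # 0.001 until epoch 30, 1e-4 until 60, then 1e-5
```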