diff --git a/PaddleCV/video/application/video_tag/README.md b/PaddleCV/video/application/video_tag/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..262d8d82cb8e83d591c5803de85d65cdc417e2a3
--- /dev/null
+++ b/PaddleCV/video/application/video_tag/README.md
@@ -0,0 +1,115 @@
+# VideoTag: PaddlePaddle Large-Scale Video Classification Model
+
+---
+## Contents
+
+- [Introduction](#introduction)
+- [Installation](#installation)
+- [Data Preparation](#data-preparation)
+- [Model Inference](#model-inference)
+- [Fine-Tuning](#fine-tuning)
+- [References](#references)
+
+
+## Introduction
+
+VideoTag, PaddlePaddle's large-scale video classification model, is built on tens of millions of videos from Baidu's short-video business. It supports 3,000 practical tags drawn from industrial practice, generalizes well, and is well suited to large-scale (tens of millions to billions of videos) short-video classification in production. VideoTag uses two-stage modeling: image modeling followed by sequence learning. In the first stage, a small set of video samples (hundreds of thousands) is used to train a large-scale video feature extractor (Extractor); in the second stage, tens of millions of samples are used to train a predictor (Predictor), enabling deployment on ultra-large-scale short-video collections. The pipeline is illustrated below.
+
+
+
+VideoTag pipeline overview
+
+
+- Data processing: a video is a set of images (frames) arranged in a particular order. The classification pipeline first decodes the short video, then feeds the resulting frame sequence into VideoTag for training and prediction.
+
+- Image modeling: a small number of samples per class are first drawn uniformly from the training data, forming a training set of hundreds of thousands of videos. A TSN network is trained on this set, and for every video frame the features of the layer just before TSN's classification layer are extracted. In this step each frame becomes a feature vector, and a video becomes a feature sequence.
+
+- Sequence learning: Attention clusters, LSTM, and NeXtVLAD model the feature sequences and learn how the features combine, further improving accuracy. Because sequence learning is much cheaper than image modeling, several complementary sequence models can be fused. The example code uses only the Attention\_LSTM network for sequence prediction.
+
+- Prediction: the results of multiple models are fused to produce the final video classification, further improving accuracy.
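
The two-stage flow above can be sketched as follows. This is a minimal illustration with stub components, not the actual VideoTag implementation; the function names are made up, and the 2048-dim feature size follows `feature_dims` in configs/attention_lstm.yaml:

```python
import numpy as np

def extract_features(frames):
    # Stage 1 (stub): an image model such as TSN maps each decoded frame
    # to a feature vector; here we fake it with random 2048-d features.
    return np.random.rand(len(frames), 2048).astype("float32")

def predict_tags(feature_seq, num_classes=3396, topk=20):
    # Stage 2 (stub): a sequence model such as Attention_LSTM consumes the
    # whole feature sequence; here we fake per-class scores with a softmax.
    logits = np.random.rand(num_classes)
    probs = np.exp(logits) / np.exp(logits).sum()
    topk_ids = probs.argsort()[::-1][:topk]
    return [(int(i), float(probs[i])) for i in topk_ids]

frames = ["frame"] * 300             # 300 uniformly sampled frames
features = extract_features(frames)  # shape (300, 2048): one vector per frame
tags = predict_tags(features)        # top-20 (class_id, probability) pairs
```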
+
+
+## Installation
+
+Running the example code requires PaddlePaddle >= 1.7.0; see the [installation guide](https://www.paddlepaddle.org.cn/documentation/docs/zh/1.7/install/index_cn.html) to install PaddlePaddle.
+
+- Dependencies:
+
+```
+    CUDA >= 9.0
+    cudnn >= 7.5
+    OpenCV >= 4.1.0 : pip install opencv-python
+```
+
+## Data Preparation
+
+- Pre-trained weights: we provide [TSN](https://videotag.bj.bcebos.com/video_tag_tsn.tar) and [AttentionLSTM](https://videotag.bj.bcebos.com/video_tag_lstm.tar) pre-trained weights. Download and extract them, then place the parameter files under the weights directory with the following layout:
+
+```
+video_tag
+ ├──weights
+ ├── attention_lstm.pdmodel
+ ├── attention_lstm.pdopt
+ ├── attention_lstm.pdparams
+ ├── tsn.pdmodel
+ ├── tsn.pdopt
+ └── tsn.pdparams
+```
+
+- Sample videos: we provide [sample videos](https://videotag.bj.bcebos.com/mp4.tar) for testing. Download and extract them, then place the video files under video\_tag/data/mp4 with the following layout:
+
+```
+video_tag
+ ├──data
+ ├── mp4
+ ├── 1.mp4
+ └── 2.mp4
+```
+
+- Supported input video formats: mp4, mkv, and webm;
+
+- The model uniformly samples 300 frames from each input video for prediction. For long videos, we recommend trimming to the relevant segment first to speed up prediction.
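
Uniform sampling of a fixed number of frames can be sketched with a simple index-selection helper (an illustrative example, not the repo's actual video reader):

```python
def sample_frame_indices(total_frames, num_samples=300):
    """Pick num_samples frame indices spread evenly across the video."""
    if total_frames <= num_samples:
        # Short video: keep every frame (a real reader might pad or loop).
        return list(range(total_frames))
    step = total_frames / num_samples
    # Take the midpoint of each of the num_samples equal-length segments.
    return [int(step * i + step / 2) for i in range(num_samples)]

indices = sample_frame_indices(9000, 300)  # e.g. a 5-minute 30fps video
```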
+
+
+## Model Inference
+
+Start inference with:
+
+    bash run_TSN_LSTM.sh
+
+- Edit video\_tag/data/tsn.list to specify the list of video files to run inference on;
+
+- --filelist sets the input list file path; the default is video\_tag/data/tsn.list;
+
+- --extractor\_weights sets the path of the feature-extractor parameters; the default is video\_tag/weights/tsn;
+
+- --predictor\_weights sets the path of the predictor parameters; the default is video\_tag/weights/attention\_lstm;
+
+- --use\_gpu sets whether inference runs on GPU; the GPU is used by default. For a short video of about 10s, GPU inference takes roughly 4s;
+
+- --save\_dir sets where prediction results are stored; the default is video\_tag/data/results. Results are saved as JSON files in the following format:
+
+```
+ [file_path,
+ {"class_name": class_name1, "probability": probability1, "class_id": class_id1},
+ {"class_name": class_name2, "probability": probability2, "class_id": class_id2},
+ ...
+ ]
+```
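
A saved result file can be loaded back with the standard json module. The sketch below builds a record in the same layout, serializes it the way the metrics code does, and filters it by probability; the label strings, ids, and probabilities are illustrative only:

```python
import json

# Same layout as the result files (one json file per video, e.g. result0.json);
# the tags and probabilities below are made up for illustration.
res_list = [
    "data/mp4/1.mp4",
    {"class_name": "游泳", "probability": 0.91, "class_id": 432},
    {"class_name": "水上乐园", "probability": 0.05, "class_id": 519},
]
payload = json.dumps(res_list, ensure_ascii=False)  # what gets written to disk

loaded = json.loads(payload)  # in practice: json.load() a file from --save_dir
video_path, preds = loaded[0], loaded[1:]
confident = [p for p in preds if p["probability"] > 0.5]
```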
+
+- --label\_file sets the label file path; the default is video\_tag/label\_3396.txt;
+
+- Model configurations live in the yaml files under video\_tag/configs.
+
+
+## Fine-Tuning
+
+- The TSN model in VideoTag only outputs video features and does not produce final classification results. To fine-tune it, refer to the [TSN video classification model](../../models/tsn/README.md) in the PaddleCV video library and modify the model files accordingly.
+
+- The attention\_lstm model in VideoTag takes only video features as input, with no audio features. To fine-tune it, refer to the [AttentionLSTM video classification model](../../models/attention_lstm/README.md) in the PaddleCV video library and modify the model files accordingly.
+
+## References
+
+- [Temporal Segment Networks: Towards Good Practices for Deep Action Recognition](https://arxiv.org/abs/1608.00859), Limin Wang, Yuanjun Xiong, Zhe Wang, Yu Qiao, Dahua Lin, Xiaoou Tang, Luc Van Gool
+
+- [Beyond Short Snippets: Deep Networks for Video Classification](https://arxiv.org/abs/1503.08909), Joe Yue-Hei Ng, Matthew Hausknecht, Sudheendra Vijayanarasimhan, Oriol Vinyals, Rajat Monga, George Toderici
diff --git a/PaddleCV/video/application/video_tag/configs/attention_lstm.yaml b/PaddleCV/video/application/video_tag/configs/attention_lstm.yaml
new file mode 100644
index 0000000000000000000000000000000000000000..389b1c0f9adc4f37268071839a4645e7a4f29002
--- /dev/null
+++ b/PaddleCV/video/application/video_tag/configs/attention_lstm.yaml
@@ -0,0 +1,15 @@
+MODEL:
+ name: "AttentionLSTM"
+ dataset: None
+ bone_nework: None
+ drop_rate: 0.5
+ feature_num: 2
+ feature_names: ['rgb']
+ feature_dims: [2048]
+ embedding_size: 1024
+ lstm_size: 512
+ num_classes: 3396
+ topk: 20
+
+INFER:
+ batch_size: 1
diff --git a/PaddleCV/video/application/video_tag/configs/tsn.yaml b/PaddleCV/video/application/video_tag/configs/tsn.yaml
new file mode 100644
index 0000000000000000000000000000000000000000..1ec4dfbf24b0f1c0a85e0960d8a59afde20cfb9b
--- /dev/null
+++ b/PaddleCV/video/application/video_tag/configs/tsn.yaml
@@ -0,0 +1,20 @@
+MODEL:
+ name: "TSN"
+ format: "mp4"
+ num_classes: 400
+ seglen: 1
+ image_mean: [0.485, 0.456, 0.406]
+ image_std: [0.229, 0.224, 0.225]
+ num_layers: 50
+ topk: 5
+
+INFER:
+ seg_num: 300
+ short_size: 256
+ target_size: 224
+ num_reader_threads: 1
+ buf_size: 1024
+ batch_size: 1
+ kinetics_labels: None
+ video_path: ""
+ filelist: "./data/tsn.list"
diff --git a/PaddleCV/video/application/video_tag/data/tsn.list b/PaddleCV/video/application/video_tag/data/tsn.list
new file mode 100644
index 0000000000000000000000000000000000000000..44f6e8e43acc88626ade03d0c8dae29610633d34
--- /dev/null
+++ b/PaddleCV/video/application/video_tag/data/tsn.list
@@ -0,0 +1 @@
+data/mp4/1.mp4
diff --git a/PaddleCV/video/application/video_tag/label_3396.txt b/PaddleCV/video/application/video_tag/label_3396.txt
new file mode 100644
index 0000000000000000000000000000000000000000..bcda50c015c15d0f0cbd129a251e4a58b1fc93bd
--- /dev/null
+++ b/PaddleCV/video/application/video_tag/label_3396.txt
@@ -0,0 +1,3396 @@
+胶合板
+坠楼
+空手道
+弹奏
+直升机
+罗盘
+健身
+羽毛球拍
+龙与地下城
+漆
+混合器
+学生
+安全气囊
+法庭
+游泳池
+潜艇
+穆斯林头巾
+奇葩
+绞狐大冒险
+飞行器
+演出
+喷枪
+萝莉
+暗黑血统
+彭彭丁满历险记
+出生
+嫩模
+流星雨
+超市
+StepMania
+自动扶梯
+讲座
+缝纫机
+自助餐
+衣服
+翼装飞行
+手语
+可爱颂
+复合弓
+列车
+欧洲模拟卡车
+吃豆人
+队长
+僵尸
+猩红
+战争片
+通关攻略
+横梁
+机场
+引体向上
+暴力片
+橱柜
+卡车
+美人
+仙境传说
+格斗
+奇趣蛋
+健美
+新能源
+佳能
+电视
+喊麦
+信件
+双胞胎
+膳食补充剂
+胸部
+碟子
+女排
+地铁:最后的曙光
+牛肉
+激光照明
+毛巾
+面包店
+时空之轮
+泰迪
+吉他
+绿茶
+自驾游
+签名会
+酱
+抽屉
+山火
+T台
+喝醉
+马桶
+巴松管
+皇帝
+沙丘
+主播
+炖汤
+糖
+球球大作战
+彩票
+中暑
+雷达
+独木舟
+星座
+弓箭
+跑车
+大豆
+妖怪
+激光
+中秋节
+风景
+橡皮筋
+固体
+音乐会
+幽灵
+救生员
+彩虹
+政治
+眼线
+柴
+医疗
+购物中心
+舰载机
+空战
+服装
+钢模
+拖鞋
+教室
+羽毛球
+烤肉
+煎饼
+金星
+火箭
+婴儿车
+黑暗之魂
+夏目友人帐
+图像处理
+恐龙
+柔术
+剪刀
+冒险任务世界
+冰雹
+木工刨
+白金汉宫
+可丽饼
+绅士
+盖瑞模组
+滑板
+游戏网站
+套房
+动作教学
+DOTA
+海盗传说
+小马慢跑
+怪物中学
+快闪
+冠军
+手风琴
+工具
+进击的巨人
+怀孕
+停车场
+舌钉
+自行车运动
+飞檐走壁
+滑雪板
+保健
+大蒜
+门
+咏春
+热火吉他手
+筷子
+饮料罐
+拳无虚发
+糗事
+豆腐
+动物园大亨
+佛兰肯斯坦
+动漫
+机长
+脱发
+石英
+医生
+母婴
+数码
+螳螂
+加仑
+核电站
+老鹰
+哑铃
+成语故事
+情景剧
+小提琴
+熊猫
+泥石流
+贴花
+合唱团
+质量效应
+东京食尸鬼
+流行音乐
+犁
+帆
+监拍
+城市
+液氮
+扳手
+卫星
+跳伞
+三维
+美味
+特种部队
+名模
+手帕
+瀑布
+教师
+风铃
+爱丽丝梦游仙境
+风光
+通用电气公司
+逗比
+豹子
+石油
+仙乐传说
+晴天
+皮革
+露台·天井
+实验室
+口琴
+驾车
+枕头
+鸡
+遥控器
+铁路运输
+瓦片
+原子弹
+偶像剧
+闯关
+西游记
+吉他音箱
+车速表
+甜品
+电源供应器
+人行道
+疲劳驾驶
+房车
+量子
+民工
+薄暮传说
+节日
+连连看
+遥控
+科学探索
+银河
+雨水沟
+小丑
+建造
+鹅
+地毯
+赛车俱乐部
+超级飞侠
+美女与野兽
+克兰娜德
+中央处理器
+儿童故事
+口罩
+警匪片
+美女直播
+海洋
+睡衣
+忍者
+烧伤
+裙子
+剪影
+生活大爆炸
+麦田怪圈
+勺子
+狮子王
+床戏
+导管
+冰雪奇缘
+彩泥
+货物
+驼铃
+牙膏
+高铁
+古风
+新娘
+深空传说
+鹰
+鹿
+铲车
+星际战甲
+怪物猎人
+转蛋
+香奈儿
+醉驾
+坦克世界
+新能源汽车
+幻想传奇
+纺织品
+超级英雄
+谍战片
+起重机
+钥匙·按键
+苹果商店
+河粉
+名侦探柯南
+蜂窝
+演唱会
+喷泉
+比基尼
+面粉
+日本食玩
+王子
+画画
+激情戏
+中国队
+帆船
+电商
+消防员
+美腿
+侏罗纪
+吃饭
+锯木机
+烤面包机
+土星
+珠子
+大头儿子
+穴位
+旅客
+演员
+短信
+擂台
+东方永夜抄
+龙之谷
+马路
+袜子
+神秘岛
+勋章
+斑马
+攻壳特工队
+激流回旋
+路易鬼屋
+飞盘
+汽车
+走秀
+异度之刃
+奥利奥
+相声
+房屋
+三国无双
+猫和老鼠
+高校
+鬼片
+维修
+巢
+煎蛋
+哪吒
+排球
+人体穿孔
+核武器
+明星
+水底
+水库
+海军陆战队
+景区
+陀螺战士
+战斗公主西娜
+教学
+火花塞
+收费站
+风力
+马里奥派对
+操作系统
+灼眼的夏娜
+古罗马
+哈士奇
+气象
+神魔之塔
+锁定:现代空战
+球接头娃娃
+神鬼寓言
+幽灵战车
+战争前线
+骡子
+出游
+早餐
+华为
+房间
+现代片
+海报
+游戏王
+咳嗽
+金丝雀
+音乐剧
+根
+灯泡
+星界边境
+视频教学
+剥
+钢铁
+星之卡比
+试驾
+车技
+剑
+树
+茄子
+轨道
+坠毁
+面团
+玩具屋
+拳击
+音乐中心
+行李
+长江
+花絮
+纯情罗曼史
+地精
+铁铲
+公园
+杠铃
+旅游团
+特斯拉线圈
+喷染术
+电子书
+猪猪侠
+骆驼
+假人挑战
+推杆
+图书馆
+洗澡
+耀西之岛
+武装突袭
+幼儿园
+印刷电路板
+头盔式相机
+金字塔
+双簧管
+养老院
+黎明杀机
+复活节兔子
+马棚
+枪杀
+二维码
+击杀
+刷子
+古筝
+财经
+武术
+影视周边
+游览车
+鳄鱼
+开箱
+水晶
+街头霸王
+恐怖袭击
+过生日
+陶瓷
+健身球
+慢镜头
+贝斯
+异形附身
+风扇
+时装秀
+海底
+奔驰小哥
+弹弓
+生化奇兵
+俱乐部
+人字拖
+推土机
+钞票
+救人
+派对
+土豆
+宿舍
+玉米
+乐动魔方
+国产剧
+柚子
+模子
+细菌
+背包
+婚礼
+菠菜
+遛狗
+东方红魔乡
+山口
+驴友
+偶像大师
+噬神者
+假面骑士
+瑞奇与叮当
+新郎
+坦克在线
+网吧
+酵母
+车手
+枪击
+杂志封面
+孩之宝
+猎人
+夜市
+黑岩射手
+王座
+雕塑粘土
+同人志
+浪客剑心
+车票
+重生娃娃
+驱逐舰
+反叛的鲁路修
+领带
+死亡空间
+幽默
+障碍技巧
+运输机
+铙钹
+条码
+采石场
+排骨
+壁橱
+高尔夫球
+恐怖主义
+圆号
+悠悠球
+科技奇趣
+陶轮
+石头
+枪战
+纸板
+斯诺克
+荒野大镖客
+吉祥物
+满月
+野蛮人柯南
+家电
+电子竞技
+但丁地狱
+天花板
+披萨
+车辆
+巨人
+风车
+高速公路
+婚房
+蛤蜊
+抢救
+兔子
+航展
+火山
+发动机
+装载机
+皮艇
+梳子
+维秘
+星际火狐
+嫦娥
+沼泽
+舞曲
+炒鸡蛋
+心灵杀手
+怪物
+中国风
+理发师
+悬崖
+铅笔
+博士
+海豚
+芥末
+磨刀
+卸妆
+黄牌
+魔法门
+飞行
+游泳
+羚羊
+自动售货机
+优惠券
+银行
+打车
+东北二人转
+演讲
+香槟酒
+油罐车
+海豹
+万智牌
+步枪
+造型师
+空间站
+大风
+鼻子
+外卖
+X战警
+田径
+外星人
+木材
+速度生活
+豪车
+鬼魂
+手榴弹
+海底隧道
+表演者
+木琴
+月饼
+活页乐谱
+红牛
+天才
+南瓜饼
+鸟
+离合器
+精灵复兴
+击倒
+农产品
+轰炸
+商家
+美貌
+狗粮
+绞盘
+虚构人物
+冰川
+怒之铁拳
+车祸
+星火
+陆战队
+太阳
+大学
+录音机
+全职猎人
+内衣
+赛车总动员
+同学会
+四重奏
+桨
+驾驶员
+健身房
+瓷器
+抢劫
+爆米花
+绿色
+蕾丝
+黑熊
+公主抱
+刀剑神域
+馒头
+圣诞礼物
+墙壁
+幼儿
+信用卡
+刀
+狂飙旧金山
+日历
+新生
+婚戒
+雪
+雨
+竹子
+美人鱼
+音乐键盘
+娃娃
+键盘
+动力火车
+骑兵·装甲兵
+立交桥
+散步
+成就
+荣誉勋章
+助攻
+沙滩
+蚯蚓
+动物
+汽车越野赛
+项链
+啤酒
+女装
+和尚
+乳清蛋白
+圣诞树
+手绘
+投篮
+大麦
+光头强
+工作会议
+苍蝇
+宝藏
+射击游戏
+粉笔
+杏仁
+碗
+神舟
+胭脂
+惊天动地
+马
+封面
+小学
+物联网
+沙子
+录音棚
+挖土机
+穿衣
+飞机
+大盘
+内涝
+恶魔
+鳄梨
+飞驰竞速
+西兰花
+实验
+录影机
+气球塔防
+跑酷
+交警
+熊
+桔梗
+解放军
+活动房屋
+相机
+数学
+特斯拉
+太空堡垒
+宅男女神
+安卓
+冰块
+鸡舍
+美妙天堂
+化石
+超时空要塞
+数字
+网球
+神秘海域
+艺考
+艺术节
+编织
+打字
+明星派
+二十一点
+护栏
+大海
+极光
+舞力全开
+广场
+神庙逃亡
+纽扣
+时装周
+西葫芦
+炊具和烤盘
+星巴克
+油炸
+划船
+创世纪
+摩托车越野赛
+星星
+金刚
+弹球
+美女
+三明治
+工艺
+冒险
+垃圾桶
+极限竞速
+加菲猫
+宝宝辅食
+首饰
+场地赛
+球
+幻想水浒
+生活剧
+希曼
+插图
+潜水
+秃鹫
+诺亚方舟
+少女
+比武
+糖果粉碎传奇
+拳皇
+墨水
+校园暴力
+引擎
+脱口秀
+路由·伐木
+牡蛎
+漂移
+熊出没
+校车
+牧羊人
+功夫
+植物大战僵尸
+朗诵
+娇妻
+镜框·画框
+百叶窗
+客流
+咖啡
+塑像
+生物学
+手电筒
+机器
+座位
+沙包·沙袋
+森林
+乐高主题公园
+视频制作
+充电器
+犬夜叉
+超级粉碎兄弟
+交通安全
+躲猫猫
+翼
+粘土动画
+山羊
+海王星
+导弹
+街头表演
+水獭
+访谈节目
+石榴
+讲解教学
+拥堵
+变形
+电饭煲
+星际公民
+猿
+头
+丝路传说
+极品飞车
+皮卡丘
+拍照
+化油器
+肥料
+鲨鱼
+星云
+冬奥会
+模拟器
+CD机
+中国梦
+捕食
+泰坦陨落
+白宫
+饺子
+光环
+火鸡
+男装
+火爆狂飙
+推钱机
+命令与征服
+大金刚国度
+古琴
+食堂
+消防站
+愤怒的小鸟
+护士
+母亲
+暗杀
+美妙旋律
+芦笋
+荷花
+弓猎
+超车
+松下
+宙斯
+生活记录
+公路
+模拟合成器
+时尚
+宾馆
+难民
+立体声扬声器
+旋转
+杯子
+模型
+坦克
+生食
+波西杰克逊
+气球
+峡谷
+锁
+粉蜡笔画
+铅笔盒
+收藏
+激光笔
+智能家居
+翻筋斗
+烤面包
+生化危机
+演奏
+百货公司
+屁股
+锯
+车站
+瓜
+极速前进
+篮子
+蹦极
+纸片马里奥
+秦时明月
+全面战争
+游乐园
+最终幻想
+水手
+水上乐园
+尾巴
+鸡蛋
+相声演员
+坚果
+硬盘驱动器
+吃货
+望远镜
+夹克
+僧侣
+山洪
+打斗
+仓库
+独奏
+毁灭战士
+牵手
+普乐路路轨道
+天鹅
+旅行社
+柔道
+景观
+古墓丽影
+蓝龙
+甜美
+拍手
+酒店
+膝盖
+歌曲
+滑翔伞
+小马宝莉
+修道院
+滑板公园
+旅馆
+云朵
+麦片
+灾区
+水槽
+卧室
+避暑
+小熊维尼
+棒球帽
+拖车
+四大名助
+铜管乐器
+沙画
+外太空
+模拟人生
+健身教练
+数字电子
+公寓
+乐迪
+枪战片
+便秘
+姑娘
+大宅门
+猪蹄
+山峰
+三国志大战
+灯
+锅炉
+火
+气球造型
+面部
+光标
+动作片
+上网本
+汽艇
+棉花
+雪橇
+热泵
+装修
+记者
+女警
+恐怖
+龙
+夜景
+民警
+算命
+手里剑
+夜晚
+笑傲江湖
+精灵
+炮弹
+表情包
+刮刮卡
+三轮车
+护目镜
+墙纸
+洗头
+红包
+星系
+运动鞋
+菌类
+冰
+拔牙
+腿
+肿瘤
+先锋
+开心农场
+迪士尼
+山体滑坡
+表格
+文物
+眉毛
+刷牙
+绝命毒师
+电子宠物
+咖啡机
+流苏花边
+素描
+超级跑跑
+搏击
+司机
+卡通
+灰姑娘
+晨练
+记号笔
+心脏
+大提琴
+卫生巾
+受灾
+任天堂
+珠宝
+英雄连
+溜冰场
+青岛大姨
+大灰熊
+骑车
+基督
+道具
+料理
+甜菜根
+鱼饵
+车床
+反曲弓
+影视
+网络直播
+车库
+波斯王子
+船厂
+捕食者
+青铜
+橄榄
+污点·着色
+咖啡屋
+水稻
+改装车
+小正太
+烧烤
+卡布奇诺
+蝴蝶结
+桥梁
+邮件
+数码宝贝
+手臂
+炉子
+学校
+霸王龙
+山
+客车
+焊接
+小车
+分裂细胞
+管道
+爱情剧
+摇滚名人堂
+游行
+完美世界
+开枪
+微波炉
+中学
+东方大头条
+香菇
+虾
+双眼皮
+椅子
+格雷少年
+相亲节目
+称重秤
+香精油
+小路
+压力清洗
+木头
+水彩画
+土豆泥
+电脑
+方舟
+乐高好友
+球体
+冷空气
+大闸蟹
+帽子
+涂料
+手提包
+战争
+水球
+汤
+西红柿
+唇妆
+商铺
+王者之剑
+腕表
+藤蔓
+钱包
+刀工
+平衡车
+奥斯卡金像奖
+抗日剧
+导游
+行星边际
+泡沫
+任务栏
+中药
+死侍
+小小大星球
+自行车
+签名
+胸肌
+太极
+儿童安全座椅
+口哨
+罗技
+休闲
+汉堡
+德军司令部
+变压器
+考拉
+动物之森
+手势
+竖琴
+椰子
+大炮
+医保
+杂技
+电影摄像机
+表演艺术
+话剧
+工作室
+黄河
+吸毒
+黄油
+无限试驾
+高空
+冬天
+酒
+洞穴
+甘薯
+流星体
+手表
+救护车
+金牌
+麦迪逊广场花园
+特技演员
+饼干
+垃圾车
+服装搭配
+出租车
+暴力
+女王
+盗墓
+手提箱
+丝巾
+化学反应
+海贼王
+淋浴
+选秀
+成型
+童话故事
+麦克风
+黑客
+无尽传说
+羊
+狙击手
+小轮车
+夺宝奇兵
+美食
+食品
+肥皂泡
+骑牛
+辫子
+重型设备
+战队
+制服诱惑
+法官
+蝎子
+小屋
+酒精灯
+青鬼
+马赛克
+南方公园
+无人机
+调酒师
+万万没想到
+粉底
+捕鱼
+初音未来
+毒贩
+矮人
+好莱坞
+六孔哨
+棺材
+猜拳
+潜水服
+搞笑
+火星
+盗窃
+DJ
+沐浴类产品
+长颈鹿
+整蛊
+围攻
+教堂
+黑带
+浮桥
+单眼皮
+陷
+软件
+过山车大亨
+围巾
+幸存者
+情感剧
+洗剂
+拆除
+星际迷航
+浮子
+雪地
+安保
+黄金眼
+追尾
+岩石
+电视广告
+行窃
+会计
+鸭子
+VR显示器
+莱克斯卢瑟
+反恐精英
+蒸汽机
+球场
+游戏动漫
+玉米卷
+漫威传奇
+腾讯
+亚洲
+卫生间
+吸烟
+战争机器
+青蛙
+喜羊羊与灰太狼
+飞艇
+猎犬
+招式
+拉伸
+连帽衫
+欧美音乐
+恶魔岛
+拳击之夜
+车
+大型强子对撞机
+舰艇
+枫之谷
+真功夫
+轴
+飞碟
+生物
+魔兽争霸
+欧巴
+平底锅
+石膏
+钢琴
+海关
+剪纸
+坐垫
+镜子
+夏令营
+战争之人
+简历
+彩排
+船
+真空管
+邮轮
+法制节目
+皇室战争
+小龙斯派罗
+博览会
+舞蹈革命
+生活
+圣诞贺卡
+拥抱
+飞飞全明星
+驾考
+卫生纸
+上市
+果酱
+儿子
+教会
+艺术团
+刷卡
+信封
+军阀
+军队
+黑塔利亚
+玉米饼
+滑雪
+猕猴桃
+提拉米苏
+航天
+芭蕾
+狮子
+跑步机
+杀出重围
+忍者龙剑传
+碰撞
+使命召唤
+自拍
+火柴
+火车站
+枫树
+咖啡师
+解说
+狒狒
+终极格斗冠军
+魔法禁书目录
+消防车
+极限运动
+电脑机箱
+兵
+家畜
+墨镜
+演技派
+大长腿
+功夫片
+梯子
+夏日
+排箫
+法师
+急救
+福尔摩斯
+农场
+发型
+决战之夜
+太子妃
+华夫饼
+刺猬索尼克
+赌博
+磨砂机
+办公室
+器官
+毕业
+军训
+带子
+治愈
+船长
+砂浆
+最游记
+绿野仙踪
+炉石传说
+数字录像机
+清洁
+喷气艇
+刺猬
+恒温器
+透视装
+黑执事
+基金
+守望者
+ATM取款机
+干墙
+曲棍球
+双节棍
+明胶
+锤子
+婚宴
+街道
+甜饼怪
+上帝模式
+狂神国度
+烈火战车
+麻将
+X音素
+液压机
+水杯
+扭曲
+魔界战记
+车评
+独角兽
+特种兵
+诱饵
+活动
+面具
+九阴真经
+实况足球
+护肤品
+游戏工作室
+榴莲
+马戏团
+原油
+蚁类
+分娩
+钓鱼
+游戏手柄
+影评
+虚幻竞技场
+神枪手
+架线工
+无线遥控飞机
+轮滑
+排气系统
+水管
+电源
+星之海洋
+摄像机
+纪录片
+优雅
+闺蜜
+曼妥思
+作曲家
+锡罐
+骑行
+快递
+电影节
+车队
+犀牛
+肌肉
+纽约时代广场
+敌人
+英雄
+八路
+纹身
+留声机唱片
+家常菜
+影视原声
+撞车
+达人秀
+古玩
+吊坠手链
+旅游
+录节目
+竞技
+黄梅戏
+村民
+昆虫
+旅行车
+草原
+毛衣
+叉车
+决斗大师
+灌木
+手工
+神之浩劫
+广场舞
+工厂
+练习室
+智能硬件
+龙珠
+龙梦幻境
+模仿
+枪支
+加速处理单元
+皮卡
+踏板车
+卡丁车
+歹徒
+跳跃
+大屠杀
+阀
+霍比特人
+煤矿
+遥控车
+女仆
+眼镜
+遇难者
+足球
+英雄工厂
+种族
+武打
+皇牌空战
+曲奇饼
+蜡像
+衬衫
+平衡木
+火灾
+水果蜜饯
+孔雀
+头文字D
+战国
+正手击打
+港台剧
+空中巴士
+部队
+挡风玻璃刮水器
+楼梯
+无人驾驶
+写作
+塑料袋
+灯塔
+徒步旅行
+埃菲尔铁塔
+快餐
+丛林
+怪兽
+灌篮高手
+导航
+台球
+裤子
+包子
+绘图仪
+宠物
+冲浪板
+厕所
+龙虾
+寿司
+海蜇
+赛车游戏
+下午茶
+跨栏
+图像扫描仪
+王者荣耀
+钢琴弹奏
+润肤膏
+真人快打
+橡皮泥
+二胡
+新封印传说
+衣服熨斗
+红烧肉
+除毛
+变脸
+泡菜
+酸奶
+中文
+甘蔗
+拉丁
+萨克斯
+鼓
+炸弹人
+壁炉
+球员
+角斗士
+轮缘
+病毒
+洛基
+科技数码
+梦想俱乐部
+私房菜
+平板
+灯光
+圆筒
+工人
+音乐
+灯具
+探险
+相亲
+传送门
+互联网
+喝
+鼠
+齿轮
+油脂
+旗
+糖霜酥皮
+光学错觉
+数字音频工作站
+击球
+截拳道
+指环王
+高达
+网球王子
+瘦腿
+神秘博士
+自行火炮
+向日葵
+纤维
+电视台
+羊肉
+飞行员
+电车
+按摩
+射箭
+欧洲杯
+戒指
+英雄传说
+棋牌
+魔术
+电动车
+体操
+毁灭公爵
+T恤
+宗教
+豚鼠
+精彩剪辑
+卡拉OK
+护肤
+海盗
+染发
+名人采访
+锐化
+午夜俱乐部
+吃鱼
+飙车
+吸管
+肾脏
+焙烧
+跑步
+紫罗兰
+海岛奇兵
+东京喵喵
+阅兵
+偷窃
+奶茶
+辣条
+特战先锋
+蝙蝠侠
+孤岛危机
+魔法王国
+挖掘机
+U盘
+荧光棒
+图章
+女婴
+光晕
+礼品
+会议
+车展
+电音
+家具
+木雕
+台锯
+终极奇迹
+草坪
+模拟城市
+画眉
+淑女
+酒馆
+唇膏
+手机数码
+橄榄球
+锻造
+水疗
+音悦台
+反导系统
+动感
+第二人生
+星空
+园艺
+稻草人
+无头骑士
+盔甲
+舞会
+蛋
+高空抛物
+无敌浩克
+姜饼
+印刷
+帝国时代
+黄山
+鲁邦三世
+盲人
+蛇
+睡眠
+战舰世界
+蟑螂
+面包车
+缝纫针
+脂肪
+纸模型
+室内装潢
+恐怖分子
+客机
+欧美影视
+便利店
+核弹
+双面人
+厨师
+跑道
+计算机
+灾难片
+飞哥与小佛
+放牧
+文艺演出
+肖像
+红绿灯
+锥体
+喇叭
+赛道狂飙
+全家福
+麻辣烫
+包包
+身体护甲
+航空
+毒品
+天空
+针织
+魔杖
+猪肉
+砖
+松糕
+圣诞装饰
+轰炸机
+无尽的任务
+摇滚史密斯
+网页
+汽车照明系统
+小镇
+巫师
+月球
+硬汉
+机车
+面食
+手术
+海鲜
+玩具熊的五夜后宫
+巧克力
+手机
+Vox
+画法
+莫妮卡的团伙
+大米
+全金属狂潮
+随声听
+旋律
+放生
+操场
+窗户
+恐怖喜剧
+大力水手
+惩罚者
+木工
+悬疑
+长方形
+木片
+电子电路
+查理与巧克力工厂
+不锈钢
+苍翼默示录
+盒子
+耐力赛
+保龄球
+海啸
+舰队收藏
+死亡岛
+歌手
+电话
+感染:幸存者故事
+真人秀
+恶魔城
+五佳球
+机械
+马里奥与路易吉
+饲养员
+滑水
+龙舟
+大理石
+港片
+葫芦娃
+武装分子
+奶油烤菜
+吓人
+斧头
+正义联盟
+超凡双生
+蜜蜂
+游艇
+头骨
+道路
+神奇四侠
+弓道
+呼啦圈
+拍客
+航空母舰
+狂热节拍
+宇宙
+美景
+健身队
+武侠
+武林高手
+测评
+薄樱鬼
+人物专访
+颈椎
+皮带
+少年泰坦
+黑色
+交响乐
+震荡
+火炉
+光盘
+喝水
+守望先锋
+烹饪
+装甲车
+棒球
+网游
+黄蜂
+安全带
+泰坦
+巴掌
+指南
+复活节彩蛋
+餐馆
+樱花
+溜冰鞋
+机甲战士
+耐克
+命运石之门
+装扮
+山水画
+耀斑
+贺卡
+日本团子
+月亮
+黑人
+科普
+钥匙扣
+甜瓜
+垃圾
+美食猎人
+头巾
+无线电遥控船
+骨牌
+单挑
+上古世纪
+覆盆子
+绳子
+海绵
+超模
+香肠
+奇观
+直线加速赛
+菜园
+雨伞
+十二生肖
+奶油
+汽车修理
+大号
+倒霉熊
+音乐节目
+唇彩
+几何冲刺
+视频游戏厅
+射击
+鬼屋
+手套
+驾驶
+青蛙军曹
+鞍
+港口
+彩灯
+广播公司
+摄影
+鞋
+我的世界
+大发
+马甲线
+模式·图案
+干衣机
+机器人战斗
+人工呼吸
+华尔兹
+水族馆
+国庆
+领奖
+巫师之怒
+火影忍者
+马克杯
+战鹰
+年会
+垂钓
+摩天大楼
+炸酱面
+企鹅
+整形
+睫毛
+暴走大事件
+教程
+钢铁侠
+日出
+国家公园
+戏剧
+折纸
+花
+说唱史诗战
+白娘子
+头盔
+威浮球
+热血无赖
+眼球
+香烟
+抗战片
+小鲜肉
+音响
+武功
+场地自行车
+稻田
+真侍魂
+海战英豪
+火焰之纹章
+婚纱摄影
+发布会
+损伤
+下水道
+雕刻
+制服
+延时摄影
+凯蒂猫
+截屏
+奇幻森林
+舞台剧
+雪糕
+飞车手罗德
+我想当爷们
+肉丸
+短号
+炮兵
+孩子
+搞怪
+军事
+对决
+战神
+菜花
+欧冠
+冰壶
+蓝莓
+帐篷
+幸运星
+化妆
+激战
+方便面
+旋转木马
+人物
+磁带
+恐怖片
+梦幻龙族
+牙齿
+海滩
+猛鬼街
+鲸
+唱片公司
+露营
+松饼
+安妮
+百乐门
+圣诞
+扬琴
+棚子
+调解
+发射
+体育
+通心粉
+热可可
+二次元
+迷人
+宇航员
+运钞车
+行车记录仪
+官员
+奥数
+玉米地
+音乐人
+彗星
+颁奖典礼
+表演
+粉丝
+军人
+堂吉诃德
+狙击枪
+减脂
+古装
+游戏机
+饥饿游戏
+撒旦
+邮票
+理发店
+网络主播
+身材火辣
+棒球
+兔八哥
+大巴车
+耳环
+数码产品
+游民星空
+泰拳
+配音秀
+机器人
+盛装舞步
+玩具人
+袋鼠
+酒吧
+蘑菇
+死亡边境
+世界杯
+驾驶舱
+海藻
+乐高
+艺术
+龙之信条
+开关
+武警
+日蚀·月蚀
+手机评测
+诛仙
+行李箱
+恐龙世界
+天宫
+滑板
+青贮饲料
+摄像头
+工程车
+阀门·龙头
+石工
+孤岛惊魂
+胫骨
+砸车
+迷你人形
+超级玛丽
+生活技巧
+武打片
+胡子
+苹果
+橙色
+灾害
+猫
+翅膀
+吵架
+唱诗班
+雷神
+扑克
+史酷比
+魔龙骑士
+人体
+拾音器
+圆圈·循环
+地狱
+运球
+游轮
+疯狂动物城
+战舰
+核反应堆
+雾霾
+版画
+真正的家庭主妇
+海龟
+烘培
+电容器
+核试验
+寒潮
+垂死之光
+橡木
+游乐场
+养生
+杀手
+魔法
+台阶·门廊
+倒塌
+法院
+硬币
+拳击比赛
+弩
+可爱
+笔记本
+花卉设计
+僵尸末日
+闹钟
+调制解调器
+狗窝
+萌妹
+部落战争
+聚会
+乐器
+劫匪
+腹语
+电动工具
+头发
+地下城与勇士
+卡牌
+卡片
+别墅
+地球冒险
+暴风雪
+瑜伽
+海狸
+安检
+绘画
+沙拉
+浴缸
+毛绒玩具
+海狮
+琵琶
+肯得基
+口红
+娱乐
+魔戒
+婴儿
+烫发器
+狂飙
+积水
+机动车
+奖
+椰奶
+芦荟
+刺客
+拖拉机
+蒙娜丽莎
+牛仔
+葡萄酒
+猴子
+潜水员
+盘式制动器
+比赛
+吸尘器
+豌豆
+拍摄现场
+帆布
+喜剧演员
+蜡笔小新
+香蕉
+全民健身
+牛排
+音响系统
+啦啦队
+街头采访
+视觉小说
+弹唱
+飞车
+装甲核心
+罐头
+哈利波特
+沉香
+举重
+纸
+拼图
+电视频道
+防护
+视频游戏
+家居
+平屋顶
+开车
+航拍
+特技
+杂货店
+拍卖
+薯条
+珍珠
+手指
+柔力球
+美少女战士
+游戏公司
+冰球
+天气预报
+充气船
+爆炒
+机油
+眼泪
+西区故事
+镶嵌
+仪表着陆系统
+鱼
+爆炸
+骑马
+礼服
+植物
+战地
+淘宝
+烟花
+求婚
+饮料
+蹲
+喜剧
+猎天使魔女
+潜行者
+船员
+汽油
+低音炮
+美甲
+无花果
+超级大金刚
+猩猩
+带锯
+国旗
+开幕式
+货运工具
+腹部
+泥潭
+秀逗魔导士
+交通
+小米
+钢琴家
+机票
+肉
+姜黄
+龙腾世纪
+杀戮空间
+婴儿吊带
+拿铁
+僵尸片
+孤儿院
+自爆
+马里奥赛车
+火锅
+冬季运动
+女巫
+大厦
+街头赛车
+快板
+驾校
+秀场
+侠盗猎车手
+杂志拍摄
+乌龟
+蜂蜜
+减肥操
+水上艇筏
+象
+播种
+单词
+偷车
+玻璃贴膜
+俄罗斯方块
+惊悚
+火车头托马斯
+净水器
+电影解说
+画家
+谷类
+机枪
+滑翔翼
+瓶子
+合唱
+超胆侠
+轮盘
+电气布线
+考古
+豆类
+集装箱
+异形
+洗碗机
+割草机
+茶
+计算器
+魔方
+宝莱坞
+辣妹
+军官
+牛人
+后备箱
+海边
+电磁线圈
+印度
+红酒
+食谱
+工地
+特技飞行
+家庭剧
+培乐多
+温泉
+钩针
+宫殿
+时装
+鹦鹉
+棕熊
+运动会
+空姐
+球星卡
+葱油饼
+洛奇
+女团
+老虎机
+记者会
+体育场
+票房
+无冬城
+浣熊
+洗衣服
+菜市场
+寂静岭
+肉汁
+大力士
+鼓棒
+金属加工
+壶铃
+德云社
+国际军事
+驾照
+面条
+手枪
+金条
+泰迪熊
+河马
+洗涤
+阁楼
+爆炸袭击
+桑拿
+踢打
+爱探险的朵拉
+葡萄园
+闪光
+妈妈
+骨头
+钓竿
+颜色
+摩托车头盔
+纱线
+驯鹿
+银魂
+独轮车
+虚拟玩家角色
+圣经
+毛笔字
+电影
+音乐影片
+西餐
+菠萝
+西湖
+清洁剂
+斗牛
+小红帽
+餐巾
+单杠
+地球
+爽肤水
+打印机
+吹风机
+记号笔
+小麦
+螺帽
+乐高都市
+白酒
+显卡
+都市
+画展
+光之美少女
+银行卡
+群星
+穿越火线
+古装剧
+单簧管
+网络
+洪水
+美容
+汤姆猫
+讲故事
+海底世界
+操作杆
+赛车方向盘
+倚天
+球赛
+海岸
+空调
+铁路
+怪物卡车大毁灭
+下巴
+票
+复仇者联盟
+新闻
+雪崩
+彩绘
+狂野飙车
+沙雕
+木偶
+轮椅
+文艺
+家电公司
+海岛
+苹果派
+降龙十八掌
+打结
+素食
+深渊传说
+骑士
+视频解说
+活塞
+小猪佩奇
+直播
+蟋蟀
+乘客
+英雄联盟
+大气污染
+硬石餐厅
+晶体管
+宝石
+奶酪
+图表
+鲜花
+背心
+反恐
+科学家
+种子
+喂食
+爪子
+火线精英
+体育用品
+照片
+军事武器
+直线
+电脑硬件
+开锁
+鼓手
+模型车
+航天器
+屏幕
+花生
+直排轮滑鞋
+军舰
+钻石
+橄榄油
+稻草
+蜡笔
+妆容
+杀手本能
+餐厅
+摔跤
+内裤
+蹦床
+樱兰高校男公关部
+跆拳道
+科幻
+豪宅
+停车
+冰淇淋
+钢盘·平底深锅
+大乱斗
+服装店
+千与千寻
+音标
+吉他英雄
+南瓜
+采访
+小吃
+漫画英雄
+最后生还者
+红薯
+镜之边缘
+燃脂
+葫芦丝
+篮球
+组装
+台球杆
+过滤器
+空翻
+壁画
+闪电
+海域
+红唇
+面试
+吊坠
+武侠剧
+睫毛膏
+香水
+舞蹈室
+资讯
+眼影
+军装
+躺骑车
+白色
+英魂之刃
+魔鬼
+饭团
+琴弦
+冰箱
+通灵王
+公交
+魔法之战
+泳装
+文本
+长号
+羊毛
+古诗
+马克思佩恩
+演习
+陀螺仪
+车牌
+静物写生
+木屋
+米饭
+萝卜
+高尔夫球
+散热器
+直播间
+星球大战
+黄金
+果汁
+疯狂橄榄球
+散打
+犰狳
+爱情故事
+决斗
+电动汽车
+缝纫
+餐饮
+魔兽世界
+设计师
+航班
+麻薯
+以撒的结合
+中提琴
+孢子
+说唱
+死神
+迷宫
+战斗
+警长
+手球
+睡袋
+镲片
+城堡
+性感
+酒精
+生化模式
+湖
+黑暗
+小小世界
+户外休闲
+球技
+同步带
+制动
+剧情片
+球鞋
+清纯
+聚餐
+刺绣
+减肥
+对唱
+睡美人
+儿童
+烤箱
+黄色
+干草
+神灵
+航空公司
+元素周期表
+电影院
+女神转生
+字典
+飞镖
+战锤
+失忆症
+死亡笔记
+亚马逊公司
+虐杀原形
+象棋
+虚幻引擎
+烧烤架
+奶粉
+悉尼歌剧院
+伐木
+草莓
+爆破
+忍者神龟
+银
+四轮车
+鬼泣
+娱乐八卦
+浴室
+鸡肉
+胡萝卜
+胎儿
+液体
+收割机
+铜
+玩具世界
+一字马
+飞船
+修剪器
+煤炭
+简笔图
+网剧
+小品
+洋葱
+便当
+百事
+蜘蛛
+警车
+马车
+尼姑
+河流
+斗牛士
+染色
+黄瓜
+跳水
+音乐大师课
+蜗牛
+钢笔
+故宫
+公益片
+渔船
+蓝色
+卷发器
+超级快递
+鞭炮
+珊瑚
+实战
+跳绳
+滑冰
+小行星
+翻车
+博物馆
+欧元
+哆啦A梦
+乐乐天使娃娃
+空难
+阴阳师
+辣椒
+青之驱魔师
+鸿雁
+SaGa
+凝胶
+池塘
+节拍器
+亲子节目
+播放机
+打印
+歌迷
+荒野星球
+农业
+地震
+时政
+吴哥窟
+拉面
+音乐节
+甜甜圈
+藤球
+灾难意外
+骑马与砍杀
+柑橘
+不明飞行物
+软管
+相册
+触摸屏
+飞行表演
+圣杯神器
+紫色
+笛子
+存储卡
+鸽赛
+蔬菜
+山地自行车
+哑剧大师
+双簧
+长椅
+松弛熊
+官兵
+巧克力
+动画
+侦探
+溜冰
+拉链
+警察局
+工程师
+分屏
+牧师
+球拍
+馅饼
+马展
+蜡烛
+游戏
+舌头
+增压器
+泰拉瑞亚
+三国
+污染
+管带夹
+丫鬟
+歌剧魅影
+温室
+八卦
+晚会
+多米诺骨牌
+西瓜
+无主之地
+薯片
+降落伞
+家具装饰
+螃蟹
+模拟山羊
+麦当劳
+传感器
+粉扑
+太阳能
+裁判
+保卫萝卜
+地铁
+松鼠
+猫女
+课堂
+木星
+耳机
+耳朵
+医学
+尼尔机械纪元
+驾驶证
+婚车
+砂锅
+死海
+海绵宝宝
+模拟农场
+警官
+调酒
+龙战士
+动车
+老鼠
+辛普森一家
+蜥蜴
+和服
+女生
+影视混剪
+长毛绒
+广告牌
+撒娇
+炒锅
+萌宝
+自然
+指甲油
+灰泥
+火腿
+桌子
+月姬格斗
+塑料
+大脑
+接线盒
+攀岩
+水果忍者
+货币
+秋千
+销售
+卷轴
+化妆品
+包裹
+斑马线
+面包超人
+蛋糕
+肉桂
+寺庙
+书法
+团队套牛
+仙人掌
+餐饮
+火箭炮
+视频直播
+鬼娃回魂
+画线骑士
+宜家
+春晚
+步行
+日落
+袋子
+击剑
+理发
+地下室
+斗地主
+打针
+喝酒
+喷漆
+柯南时代
+锦鲤
+凝乳
+杀戮地带
+恶霸鲁尼
+奖牌
+猫头鹰
+赛道
+战士
+美照
+购物
+蝴蝶
+字母表
+客厅
+乌鸦
+唢呐
+反串
+潘多拉
+监控
+烤鸭
+明星大乱斗
+葡萄
+飓风
+病人
+吊车
+蝙蝠
+伪装
+益智玩具
+舞蹈
+合金装备
+跳楼
+勇者斗恶龙
+油
+网站
+厨师机
+凯恩的遗产
+钱
+食材
+外交部
+酒厂
+显示器
+主持
+羽绒服
+牛仔布
+车模
+盐
+芝麻
+痘痘
+股票
+微笑
+菜单
+地板
+烤鸡
+自动唱机
+雪貂
+涡轮
+扎染
+歌剧
+变形金刚
+失火
+门票
+雪山
+风筝
+长袍·礼服
+书柜
+家庭教师
+死亡之屋
+DarkOrbit
+粮食
+公益活动
+藏獒
+渔民
+下一站巨星
+彩虹手环
+苦瓜
+冲浪
+卷心菜
+珠饰
+西贡小姐
+地铁酷跑
+训练营
+运输
+磁铁
+健康
+床垫
+摇摆
+街头恶搞
+糕点
+拳王
+肋骨
+猫
+曲艺
+加油站
+凉宫春日
+妖怪手表
+动力伞
+墓地
+工程
+民房
+胶片
+色带
+主教
+樱桃小丸子
+鸡翅
+轮子
+牛
+邻里
+萌
+音乐制作
+洛克人
+芒果
+地图
+劈木机
+勇士
+火锅店
+电梯
+吻
+弹球盘
+三角形
+粘土
+鸡尾酒
+慈善
+天天酷跑
+唱片骑师
+结婚
+家庭
+手机壳
+航线
+职业摔跤
+肥皂
+竞技场
+丧钟
+摩天轮
+天使
+台面
+外汇市场
+肉搏
+求生之路
+铜牌
+泡面
+流亡黯道
+灯笼
+谜题
+婴儿室
+捕猎
+尿布袋
+鱼鹰
+雪犁
+方块世界
+斑鸠
+建筑
+电视剧
+堆肥
+细胞
+邪恶力量
+零食
+湾岸竞速
+太鼓达人
+赛车
+金枪鱼
+司令
+皮肤
+马拉松
+末日
+垒球
+涂鸦
+充气城堡
+十字架
+食疗
+早教
+速叠杯
+纸牌
+披肩
+躲避球
+柠檬
+打牌
+抗战
+绕口令
+美容院
+惠普
+情感节目
+永恒之塔
+电脑鼠标
+虚拟现实
+特警
+吊床
+货车
+飞绑
+可乐
+运动
+双重国度
+多功能工具
+妹子
+农村
+眼睛
+干冰
+果冻
+相声小品
+电线杆
+战友
+影视配音
+孤岛生存大乱斗
+奥运
+沃尔玛
+太空
+星际之门
+装饰
+灰色
+樱桃
+电锯
+手铃
+科幻片
+身份证
+古墓
+乒乓
+溪流
+手链
+野外生存
+天线
+玻璃
+营地
+庆典
+玩具
+袭击事件
+美术
+橡皮
+加农
+镜头
+探测器
+洗发精
+彩虹岛
+武器
+装置艺术
+葱
+护理
+命运
+仓鼠
+碎石
+青蛙科密特
+螺旋桨
+七日杀
+整容
+行星
+小宝宝
+科技
+台风
+勇者前线
+皇家国教骑士团
+狂欢节
+热狗
+捉迷藏
+弦乐琴
+叶子
+床
+彼得潘
+写真
+托儿所
+设备
+冰桶挑战
+萌物
+变色龙
+花瓣
+伴郎
+打戏
+画报
+罪恶装备
+漫画
+瘫痪
+飞机失事
+奇闻趣事
+大选
+花瓶
+钢之炼金术师
+杂志
+鼠型车
+教育
+旺达与巨像
+插花
+城堡破坏者
+泵
+混音带
+字体
+超人
+倒计时
+恶作剧
+鹌鹑
+吸血鬼
+小朋友
+颤音琴
+符号
+调音台
+梦幻之星
+橘子
+奶昔
+面糊
+冬不拉
+北斗神拳
+越野
+灭火器
+水果
+婚纱
+上古卷轴
+007
+暮光之城
+蜘蛛侠
+冰沙
+下坡
+毡
+警察
+超市特工
+外套
+汉服
+女童
+筏流
+花园
+布丁
+花圈
+生菜
+新年
+清雪机
+气雾喷雾器
+暮蝉悲鸣时
+公主
+显微镜
+秋天
+模特
+收藏品
+咖喱
+空气净化器
+漫威宇宙
+混凝土
+育儿
+电子琴
+遮瑕膏
+火车
+芭比娃娃
+爵士
+音箱
+黑洞
+积木
+剑球
+奶爸
+监管
+美国队长
+爆笑
+闪电
+降世神通
+祷告
+家禽
+穿越时空
+分裂
+轮胎
+水坝
+索尼
+战斗机
+恶搞路人
+拍戏
+电池
+爆胎
+光棍
+俯卧撑
+摩斯
+饮用水
+狂热
+阅读器
+训练
+奥特曼
+王国之心
+学车
+快递员
+住宅
+袋狼大冒险
+悟空
+面包
+雷曼疯狂兔子
+杀手
+赛马
+啄木鸟伍迪
+国务院
+拖把
+壁虎
+铁拳
+高跟鞋
+动物园
+唱片
+金鹰节
+棒球公园
+宠物小精灵
+手游
+部落冲突
+兽人
+魔术师
+谷仓
+圣剑传说
+商场
+起火
+内饰
+暴龙
+鲸
+上课
+油画
+剧本
+武士
+村庄
+脖子
+卷饼
+蚊子
+狩猎
+保健品
+红毯
+总统
+塔罗牌
+偶像活动
+涂层
+合金弹头
+黑白
+沙漠
+白头鹰
+芝士
+宅男
+战利品
+军营
+围棋
+洗衣店
+教育部
+模糊
+国画
+菲比娃娃
+雕塑
+施工
+书呆子
+冬季
+F-Zero
+核桃
+狱警
+游戏人物
+旗袍
+笑话
+衣柜
+综艺
+迫击炮
+梨
+圣斗士
+媒体
+辩论
+健美操
+速降
+男团
+杀人
+圣诞老人
+圆顶
+海豚音
+特技表演
+耙
+探索
+僵尸围城
+银河战士
+长城
+雪人
+作画
+狼
+星际争霸
+立方体
+武装·装备
+被子
+自行车赛
+吃东西
+金属
+交易
+铲屎官
+培根
+档案
+飞去来器
+歌舞表演
+报纸
+仙女
+舞蹈中心
+亚瑟王传奇
+浏览器
+钟
+狗
+露营车
+艺术品
+洗衣机
+睡姿
+打野
+西装
+管风琴
+半机械人
+U型场地
+光
+鸽子
+窗帘
+练习生
+刺客信条
+黑道圣徒
+农民
+煤气灶
+播放器
+塞尔达传说
+消防
+黄铜
+胶带
+挡泥板
+越战越勇
+糖浆
+武装部队
+录像带
+倒车
+牛奶
+冰棍
+阳台
+饮品
+番茄
+灵异事件
+屋顶
+角色扮演
+大富翁
+饿狼传说
+玫瑰
+猪
+海马
+防汛抗洪
+水井
+书
+土地
+村长
+权力的游戏
+东方妖妖梦
+半条命
+国家队
+木瓜
+绿箭
+滑翔
+视频艺术
+人猿泰山
+国防部
+报警装置
+吉尼斯
+厢型布景
+突袭
+狐狸
+倒立
+搅拌机
+腹肌
+飙酷车神
+电子键盘
+惩罚
+失落的星球
+乐队
+丝绸
+冲突
+豆芽
+交通工具
+滑翔机
+亲子
+拳击手
+少儿
+厨房
+花栗鼠
+楼市
+卡通城
+夜店
+洗车
+广告
+饭店
+合气道
+雪地车
+留声机
+全民枪战
+毛皮
+迷你四驱车
+钻头
+生活常识
+少林
+校园
+拔河
+事故
+菊花
+小蛮腰
+过山车
+鸡腿
+暗黑破坏神
+炸鸡
+排版
+拼贴画
+制造业
+艺人
+选美
+猛兽
+英语
+手
+酥皮
+运动员
+卡士达酱
+内衣秀
+护照
+民航
+土匪
+监狱
+靴子
+积雪草
+沙发
+加勒比海盗
+咱们穿越吧
+极度恐慌
+拉力赛
+背部
+伴娘
+投影机
+面膜
+水
+玉·翡翠
+易拉罐
+度假村
+益智
+吻戏
+丈夫
+吊扇
+模具
+水泥
+火柴人
+公安部
+泥土
+地铁站
+打火机
+小小宠物店
+橙子
+子弹
+猴子岛
+闪电十一人
+雪碧
+指甲
+摩托车
+摄影师
+角色
+电人
+老虎
+音乐合奏
+塑料瓶
+发带
+标签·商标
+肉排
+桃子
+指板
+狼人
+分解动作
+读书
+志愿者
+灵魂能力
+星际宝贝
diff --git a/PaddleCV/video/application/video_tag/metrics/__init__.py b/PaddleCV/video/application/video_tag/metrics/__init__.py
new file mode 100644
index 0000000000000000000000000000000000000000..0d1df762bdf3d3b920fc1e00d15a3a2ecdcdbe55
--- /dev/null
+++ b/PaddleCV/video/application/video_tag/metrics/__init__.py
@@ -0,0 +1 @@
+from .metrics_util import get_metrics
diff --git a/PaddleCV/video/application/video_tag/metrics/metrics_util.py b/PaddleCV/video/application/video_tag/metrics/metrics_util.py
new file mode 100644
index 0000000000000000000000000000000000000000..730562e8055547ec6afb94790a6f414b1350f7e1
--- /dev/null
+++ b/PaddleCV/video/application/video_tag/metrics/metrics_util.py
@@ -0,0 +1,169 @@
+# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
+#
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
+
+from __future__ import absolute_import
+from __future__ import unicode_literals
+from __future__ import print_function
+from __future__ import division
+
+import os
+import io
+import logging
+
+import numpy as np
+import json
+from metrics.youtube8m import eval_util as youtube8m_metrics
+
+logger = logging.getLogger(__name__)
+
+
+class Metrics(object):
+ def __init__(self, name, mode, metrics_args):
+ """Not implemented"""
+ pass
+
+ def calculate_and_log_out(self, fetch_list, info=''):
+ """Not implemented"""
+ pass
+
+ def accumulate(self, fetch_list, info=''):
+ """Not implemented"""
+ pass
+
+ def finalize_and_log_out(self, info='', savedir='./'):
+ """Not implemented"""
+ pass
+
+ def reset(self):
+ """Not implemented"""
+ pass
+
+
+class Youtube8mMetrics(Metrics):
+ def __init__(self, name, mode, metrics_args):
+ self.name = name
+ self.mode = mode
+ self.num_classes = metrics_args['MODEL']['num_classes']
+ self.topk = metrics_args['MODEL']['topk']
+ self.calculator = youtube8m_metrics.EvaluationMetrics(self.num_classes,
+ self.topk)
+ if self.mode == 'infer':
+ self.infer_results = []
+
+ def calculate_and_log_out(self, fetch_list, info=''):
+ loss = np.mean(np.array(fetch_list[0]))
+ pred = np.array(fetch_list[1])
+ label = np.array(fetch_list[2])
+ hit_at_one = youtube8m_metrics.calculate_hit_at_one(pred, label)
+ perr = youtube8m_metrics.calculate_precision_at_equal_recall_rate(pred,
+ label)
+ gap = youtube8m_metrics.calculate_gap(pred, label)
+ logger.info(info + ' , loss = {0}, Hit@1 = {1}, PERR = {2}, GAP = {3}'.format(\
+ '%.6f' % loss, '%.2f' % hit_at_one, '%.2f' % perr, '%.2f' % gap))
+
+ def accumulate(self, fetch_list, info=''):
+ if self.mode == 'infer':
+ predictions = np.array(fetch_list[0])
+ video_id = fetch_list[1]
+ for i in range(len(predictions)):
+ topk_inds = predictions[i].argsort()[0 - self.topk:]
+ topk_inds = topk_inds[::-1]
+ preds = predictions[i][topk_inds]
+ self.infer_results.append(
+ (video_id[i], topk_inds.tolist(), preds.tolist()))
+ else:
+ loss = np.array(fetch_list[0])
+ pred = np.array(fetch_list[1])
+ label = np.array(fetch_list[2])
+ self.calculator.accumulate(loss, pred, label)
+
+ def finalize_and_log_out(self,
+ info='',
+ savedir='./data/results',
+ label_file='./label_3396.txt'):
+ if self.mode == 'infer':
+ for index, item in enumerate(self.infer_results):
+ video_id = item[0]
+ logger.info(
+ '========video_id [ {} ] , topk({}) preds: ========\n'.
+ format(video_id, self.topk))
+
+                with io.open(label_file, "r", encoding="utf-8") as f:
+                    fl = f.readlines()
+ res_list = []
+ res_list.append(video_id)
+ for i in range(len(item[1])):
+ class_id = item[1][i]
+ class_prob = item[2][i]
+ class_name = fl[class_id].split('\n')[0]
+ print('class_id: {},'.format(class_id), 'class_name:',
+ class_name,
+ ', probability: {} \n'.format(class_prob))
+ save_dict = {
+                        "class_id": class_id,
+ "class_name": class_name,
+ "probability": class_prob
+ }
+ res_list.append(save_dict)
+
+ # save infer result into output dir
+ with io.open(
+ os.path.join(savedir, 'result' + str(index) + '.json'),
+ 'w',
+ encoding='utf-8') as f:
+                    f.write(json.dumps(res_list, ensure_ascii=False))
+ else:
+ epoch_info_dict = self.calculator.get()
+ logger.info(info + '\tavg_hit_at_one: {0},\tavg_perr: {1},\tavg_loss :{2},\taps: {3},\tgap:{4}'\
+ .format(epoch_info_dict['avg_hit_at_one'], epoch_info_dict['avg_perr'], \
+ epoch_info_dict['avg_loss'], epoch_info_dict['aps'], epoch_info_dict['gap']))
+
+ def reset(self):
+ self.calculator.clear()
+ if self.mode == 'infer':
+ self.infer_results = []
+
+
+class MetricsZoo(object):
+ def __init__(self):
+ self.metrics_zoo = {}
+
+ def regist(self, name, metrics):
+        assert metrics.__base__ == Metrics, "Unknown metrics type {}".format(
+            type(metrics))
+ self.metrics_zoo[name] = metrics
+
+ def get(self, name, mode, cfg):
+ for k, v in self.metrics_zoo.items():
+ if k == name:
+ return v(name, mode, cfg)
+        raise KeyError("metrics {} not found, available metrics: {}".format(
+            name, list(self.metrics_zoo.keys())))
+
+
+# singleton metrics_zoo
+metrics_zoo = MetricsZoo()
+
+
+def regist_metrics(name, metrics):
+ metrics_zoo.regist(name, metrics)
+
+
+def get_metrics(name, mode, cfg):
+ return metrics_zoo.get(name, mode, cfg)
+
+
+# sort by alphabet
+regist_metrics("ATTENTIONCLUSTER", Youtube8mMetrics)
+regist_metrics("ATTENTIONLSTM", Youtube8mMetrics)
+regist_metrics("NEXTVLAD", Youtube8mMetrics)
diff --git a/PaddleCV/video/application/video_tag/metrics/youtube8m/__init__.py b/PaddleCV/video/application/video_tag/metrics/youtube8m/__init__.py
new file mode 100644
index 0000000000000000000000000000000000000000..e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
diff --git a/PaddleCV/video/application/video_tag/metrics/youtube8m/average_precision_calculator.py b/PaddleCV/video/application/video_tag/metrics/youtube8m/average_precision_calculator.py
new file mode 100644
index 0000000000000000000000000000000000000000..9bad69dd0aff1906e3548fb0322203f0bc5b408d
--- /dev/null
+++ b/PaddleCV/video/application/video_tag/metrics/youtube8m/average_precision_calculator.py
@@ -0,0 +1,275 @@
+# Copyright 2016 Google Inc. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS-IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""Calculate or keep track of the interpolated average precision.
+
+It provides an interface for calculating interpolated average precision for an
+entire list or the top-n ranked items. For the definition of the
+(non-)interpolated average precision:
+http://trec.nist.gov/pubs/trec15/appendices/CE.MEASURES06.pdf
+
+Example usages:
+1) Use it as a static function call to directly calculate average precision for
+a short ranked list in the memory.
+
+```
+import random
+
+import numpy as np
+
+p = np.array([random.random() for _ in range(10)])
+a = np.array([random.choice([0, 1]) for _ in range(10)])
+
+ap = average_precision_calculator.AveragePrecisionCalculator.ap(p, a)
+```
+
+2) Use it as an object for long ranked list that cannot be stored in memory or
+the case where partial predictions can be observed at a time (Tensorflow
+predictions). In this case, we first call the function accumulate many times
+to process parts of the ranked list. After processing all the parts, we call
+peek_ap_at_n.
+```
+p1 = np.array([random.random() for _ in range(5)])
+a1 = np.array([random.choice([0, 1]) for _ in range(5)])
+p2 = np.array([random.random() for _ in range(5)])
+a2 = np.array([random.choice([0, 1]) for _ in range(5)])
+
+# interpolated average precision at 10 using 1000 break points
+calculator = average_precision_calculator.AveragePrecisionCalculator(10)
+calculator.accumulate(p1, a1)
+calculator.accumulate(p2, a2)
+ap3 = calculator.peek_ap_at_n()
+```
+"""
+
+import heapq
+import random
+import numbers
+
+import numpy
+
+
+class AveragePrecisionCalculator(object):
+ """Calculate the average precision and average precision at n."""
+
+ def __init__(self, top_n=None):
+ """Construct an AveragePrecisionCalculator to calculate average precision.
+
+ This class is used to calculate the average precision for a single label.
+
+ Args:
+      top_n: A non-negative integer specifying the average precision at n, or
+        None to use all provided data points.
+
+ Raises:
+ ValueError: An error occurred when the top_n is not a positive integer.
+ """
+ if not ((isinstance(top_n, int) and top_n >= 0) or top_n is None):
+            raise ValueError("top_n must be a non-negative integer or None.")
+
+ self._top_n = top_n # average precision at n
+ self._total_positives = 0 # total number of positives have seen
+ self._heap = [] # max heap of (prediction, actual)
+
+ @property
+ def heap_size(self):
+ """Gets the heap size maintained in the class."""
+ return len(self._heap)
+
+ @property
+ def num_accumulated_positives(self):
+ """Gets the number of positive samples that have been accumulated."""
+ return self._total_positives
+
+ def accumulate(self, predictions, actuals, num_positives=None):
+ """Accumulate the predictions and their ground truth labels.
+
+ After the function call, we may call peek_ap_at_n to actually calculate
+ the average precision.
+ Note predictions and actuals must have the same shape.
+
+ Args:
+ predictions: a list storing the prediction scores.
+ actuals: a list storing the ground truth labels. Any value
+ larger than 0 will be treated as positives, otherwise as negatives.
+      num_positives: If the 'predictions' and 'actuals' inputs aren't complete,
+ then it's possible some true positives were missed in them. In that case,
+ you can provide 'num_positives' in order to accurately track recall.
+
+ Raises:
+ ValueError: An error occurred when the format of the input is not the
+ numpy 1-D array or the shape of predictions and actuals does not match.
+ """
+ if len(predictions) != len(actuals):
+ raise ValueError(
+ "the shape of predictions and actuals does not match.")
+
+        if num_positives is not None:
+            if not isinstance(num_positives,
+                              numbers.Number) or num_positives < 0:
+                raise ValueError(
+                    "'num_positives' was provided but it was not a "
+                    "nonnegative number.")
+            self._total_positives += num_positives
+        else:
+            self._total_positives += numpy.size(numpy.where(actuals > 0))
+ topk = self._top_n
+ heap = self._heap
+
+ for i in range(numpy.size(predictions)):
+ if topk is None or len(heap) < topk:
+ heapq.heappush(heap, (predictions[i], actuals[i]))
+ else:
+ if predictions[i] > heap[0][0]: # heap[0] is the smallest
+ heapq.heappop(heap)
+ heapq.heappush(heap, (predictions[i], actuals[i]))
+
+ def clear(self):
+ """Clear the accumulated predictions."""
+ self._heap = []
+ self._total_positives = 0
+
+ def peek_ap_at_n(self):
+ """Peek the non-interpolated average precision at n.
+
+ Returns:
+ The non-interpolated average precision at n (default 0).
+ If n is larger than the length of the ranked list,
+ the average precision will be returned.
+ """
+ if self.heap_size <= 0:
+ return 0
+ predlists = numpy.array(list(zip(*self._heap)))
+
+ ap = self.ap_at_n(
+ predlists[0],
+ predlists[1],
+ n=self._top_n,
+ total_num_positives=self._total_positives)
+ return ap
+
+ @staticmethod
+ def ap(predictions, actuals):
+ """Calculate the non-interpolated average precision.
+
+ Args:
+ predictions: a numpy 1-D array storing the sparse prediction scores.
+ actuals: a numpy 1-D array storing the ground truth labels. Any value
+ larger than 0 will be treated as positives, otherwise as negatives.
+
+ Returns:
+ The non-interpolated average precision at n.
+ If n is larger than the length of the ranked list,
+ the average precision will be returned.
+
+ Raises:
+ ValueError: An error occurred when the format of the input is not the
+ numpy 1-D array or the shape of predictions and actuals does not match.
+ """
+ return AveragePrecisionCalculator.ap_at_n(predictions, actuals, n=None)
+
+ @staticmethod
+ def ap_at_n(predictions, actuals, n=20, total_num_positives=None):
+ """Calculate the non-interpolated average precision.
+
+ Args:
+ predictions: a numpy 1-D array storing the sparse prediction scores.
+ actuals: a numpy 1-D array storing the ground truth labels. Any value
+ larger than 0 will be treated as positives, otherwise as negatives.
+ n: the top n items to be considered in ap@n.
+      total_num_positives: (optional) the total number of positives in the
+        list. If specified, it will be used in the calculation instead of
+        being inferred from 'actuals'.
+
+ Returns:
+ The non-interpolated average precision at n.
+ If n is larger than the length of the ranked list,
+ the average precision will be returned.
+
+ Raises:
+ ValueError: An error occurred when
+ 1) the format of the input is not the numpy 1-D array;
+ 2) the shape of predictions and actuals does not match;
+ 3) the input n is not a positive integer.
+ """
+ if len(predictions) != len(actuals):
+ raise ValueError(
+ "the shape of predictions and actuals does not match.")
+
+ if n is not None:
+ if not isinstance(n, int) or n <= 0:
+ raise ValueError("n must be 'None' or a positive integer."
+ " It was '%s'." % n)
+
+ ap = 0.0
+
+ predictions = numpy.array(predictions)
+ actuals = numpy.array(actuals)
+
+ # add a shuffler to avoid overestimating the ap
+ predictions, actuals = AveragePrecisionCalculator._shuffle(predictions,
+ actuals)
+ sortidx = sorted(
+ range(len(predictions)), key=lambda k: predictions[k], reverse=True)
+
+ if total_num_positives is None:
+ numpos = numpy.size(numpy.where(actuals > 0))
+ else:
+ numpos = total_num_positives
+
+ if numpos == 0:
+ return 0
+
+ if n is not None:
+ numpos = min(numpos, n)
+ delta_recall = 1.0 / numpos
+ poscount = 0.0
+
+ # calculate the ap
+ r = len(sortidx)
+ if n is not None:
+ r = min(r, n)
+ for i in range(r):
+ if actuals[sortidx[i]] > 0:
+ poscount += 1
+ ap += poscount / (i + 1) * delta_recall
+ return ap
+
+ @staticmethod
+ def _shuffle(predictions, actuals):
+ random.seed(0)
+ suffidx = random.sample(range(len(predictions)), len(predictions))
+ predictions = predictions[suffidx]
+ actuals = actuals[suffidx]
+ return predictions, actuals
+
+ @staticmethod
+ def _zero_one_normalize(predictions, epsilon=1e-7):
+ """Normalize the predictions to the range between 0.0 and 1.0.
+
+ For some predictions like SVM predictions, we need to normalize them before
+ calculate the interpolated average precision. The normalization will not
+ change the rank in the original list and thus won't change the average
+ precision.
+
+ Args:
+ predictions: a numpy 1-D array storing the sparse prediction scores.
+ epsilon: a small constant to avoid denominator being zero.
+
+ Returns:
+ The normalized prediction.
+ """
+        denominator = numpy.max(predictions) - numpy.min(predictions)
+        # numpy.maximum guards against a zero denominator when all
+        # predictions are equal.
+        ret = (predictions - numpy.min(predictions)) / numpy.maximum(
+            denominator, epsilon)
+        return ret
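As a sanity check on the ranking logic in `ap_at_n` above, here is a minimal standalone sketch of the same non-interpolated AP computation. `average_precision` is a hypothetical helper written for illustration only, not part of this module's API:

```python
# Standalone sketch of non-interpolated average precision: sort by score
# descending, then average precision-at-each-hit over the positives.
def average_precision(predictions, actuals, n=None):
    order = sorted(range(len(predictions)),
                   key=lambda k: predictions[k], reverse=True)
    numpos = sum(1 for a in actuals if a > 0)
    if numpos == 0:
        return 0.0
    if n is not None:
        numpos = min(numpos, n)   # AP@n normalizes by min(positives, n)
        order = order[:n]
    delta_recall = 1.0 / numpos
    poscount, ap = 0.0, 0.0
    for rank, idx in enumerate(order, start=1):
        if actuals[idx] > 0:
            poscount += 1
            ap += poscount / rank * delta_recall
    return ap

# Positives at ranks 1 and 3: AP = (1/1 + 2/3) / 2
print(round(average_precision([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0]), 4))  # 0.8333
```

Unlike the class above, this sketch skips the tie-breaking shuffle, so it can differ from `ap_at_n` when scores are tied.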
diff --git a/PaddleCV/video/application/video_tag/metrics/youtube8m/eval_util.py b/PaddleCV/video/application/video_tag/metrics/youtube8m/eval_util.py
new file mode 100644
index 0000000000000000000000000000000000000000..f7742236f1176073eae84fdc7c3a3a1a2e294fe0
--- /dev/null
+++ b/PaddleCV/video/application/video_tag/metrics/youtube8m/eval_util.py
@@ -0,0 +1,245 @@
+# Copyright 2016 Google Inc. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS-IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""Provides functions to help with evaluating models."""
+import datetime
+import numpy
+
+from . import mean_average_precision_calculator as map_calculator
+from . import average_precision_calculator as ap_calculator
+
+
+def flatten(l):
+ """ Merges a list of lists into a single list. """
+ return [item for sublist in l for item in sublist]
+
+
+def calculate_hit_at_one(predictions, actuals):
+ """Performs a local (numpy) calculation of the hit at one.
+
+ Args:
+ predictions: Matrix containing the outputs of the model.
+ Dimensions are 'batch' x 'num_classes'.
+ actuals: Matrix containing the ground truth labels.
+ Dimensions are 'batch' x 'num_classes'.
+
+ Returns:
+ float: The average hit at one across the entire batch.
+ """
+ top_prediction = numpy.argmax(predictions, 1)
+ hits = actuals[numpy.arange(actuals.shape[0]), top_prediction]
+ return numpy.average(hits)
+
+
+def calculate_precision_at_equal_recall_rate(predictions, actuals):
+ """Performs a local (numpy) calculation of the PERR.
+
+ Args:
+ predictions: Matrix containing the outputs of the model.
+ Dimensions are 'batch' x 'num_classes'.
+ actuals: Matrix containing the ground truth labels.
+ Dimensions are 'batch' x 'num_classes'.
+
+ Returns:
+ float: The average precision at equal recall rate across the entire batch.
+ """
+ aggregated_precision = 0.0
+ num_videos = actuals.shape[0]
+ for row in numpy.arange(num_videos):
+ num_labels = int(numpy.sum(actuals[row]))
+ top_indices = numpy.argpartition(predictions[row],
+ -num_labels)[-num_labels:]
+ item_precision = 0.0
+ for label_index in top_indices:
+ if predictions[row][label_index] > 0:
+ item_precision += actuals[row][label_index]
+ item_precision /= top_indices.size
+ aggregated_precision += item_precision
+ aggregated_precision /= num_videos
+ return aggregated_precision
+
+
+def calculate_gap(predictions, actuals, top_k=20):
+ """Performs a local (numpy) calculation of the global average precision.
+
+ Only the top_k predictions are taken for each of the videos.
+
+ Args:
+ predictions: Matrix containing the outputs of the model.
+ Dimensions are 'batch' x 'num_classes'.
+ actuals: Matrix containing the ground truth labels.
+ Dimensions are 'batch' x 'num_classes'.
+ top_k: How many predictions to use per video.
+
+ Returns:
+ float: The global average precision.
+ """
+ gap_calculator = ap_calculator.AveragePrecisionCalculator()
+ sparse_predictions, sparse_labels, num_positives = top_k_by_class(
+ predictions, actuals, top_k)
+ gap_calculator.accumulate(
+ flatten(sparse_predictions), flatten(sparse_labels), sum(num_positives))
+ return gap_calculator.peek_ap_at_n()
+
+
+def top_k_by_class(predictions, labels, k=20):
+ """Extracts the top k predictions for each video, sorted by class.
+
+  Args:
+    predictions: A numpy matrix containing the outputs of the model.
+      Dimensions are 'batch' x 'num_classes'.
+    labels: A numpy matrix containing the ground truth labels.
+      Dimensions are 'batch' x 'num_classes'.
+    k: the top k non-zero entries to preserve in each prediction.
+
+ Returns:
+ A tuple (predictions,labels, true_positives). 'predictions' and 'labels'
+ are lists of lists of floats. 'true_positives' is a list of scalars. The
+ length of the lists are equal to the number of classes. The entries in the
+ predictions variable are probability predictions, and
+ the corresponding entries in the labels variable are the ground truth for
+ those predictions. The entries in 'true_positives' are the number of true
+ positives for each class in the ground truth.
+
+ Raises:
+ ValueError: An error occurred when the k is not a positive integer.
+ """
+ if k <= 0:
+ raise ValueError("k must be a positive integer.")
+ k = min(k, predictions.shape[1])
+ num_classes = predictions.shape[1]
+ prediction_triplets = []
+ for video_index in range(predictions.shape[0]):
+ prediction_triplets.extend(
+ top_k_triplets(predictions[video_index], labels[video_index], k))
+ out_predictions = [[] for v in range(num_classes)]
+ out_labels = [[] for v in range(num_classes)]
+ for triplet in prediction_triplets:
+ out_predictions[triplet[0]].append(triplet[1])
+ out_labels[triplet[0]].append(triplet[2])
+ out_true_positives = [numpy.sum(labels[:, i]) for i in range(num_classes)]
+
+ return out_predictions, out_labels, out_true_positives
+
+
+def top_k_triplets(predictions, labels, k=20):
+    """Get the top_k entries of a 1-d numpy array. Returns a sparse list of
+    (class_index, prediction, label) triplets."""
+ m = len(predictions)
+ k = min(k, m)
+ indices = numpy.argpartition(predictions, -k)[-k:]
+ return [(index, predictions[index], labels[index]) for index in indices]
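`numpy.argpartition` above returns the indices of the k largest scores without fully sorting them, and in arbitrary order within the top k; a quick illustration on toy data:

```python
import numpy as np

# argpartition with a negative kth places the indices of the k largest
# elements last; within those k there is no ordering guarantee.
scores = np.array([0.1, 0.9, 0.3, 0.7])
k = 2
top = np.argpartition(scores, -k)[-k:]
print(sorted(top.tolist()))  # [1, 3]
```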
+
+
+class EvaluationMetrics(object):
+ """A class to store the evaluation metrics."""
+
+ def __init__(self, num_class, top_k):
+ """Construct an EvaluationMetrics object to store the evaluation metrics.
+
+ Args:
+ num_class: A positive integer specifying the number of classes.
+ top_k: A positive integer specifying how many predictions are considered per video.
+
+ Raises:
+      ValueError: An error occurred when MeanAveragePrecisionCalculator
+        cannot be constructed.
+ """
+ self.sum_hit_at_one = 0.0
+ self.sum_perr = 0.0
+ self.sum_loss = 0.0
+ self.map_calculator = map_calculator.MeanAveragePrecisionCalculator(
+ num_class)
+ self.global_ap_calculator = ap_calculator.AveragePrecisionCalculator()
+ self.top_k = top_k
+ self.num_examples = 0
+
+ def accumulate(self, loss, predictions, labels):
+ """Accumulate the metrics calculated locally for this mini-batch.
+
+        Args:
+          loss: A numpy array containing the loss for each sample.
+          predictions: A numpy matrix containing the outputs of the model.
+            Dimensions are 'batch' x 'num_classes'.
+          labels: A numpy matrix containing the ground truth labels.
+            Dimensions are 'batch' x 'num_classes'.
+
+ Returns:
+ dictionary: A dictionary storing the metrics for the mini-batch.
+
+ Raises:
+ ValueError: An error occurred when the shape of predictions and actuals
+ does not match.
+ """
+ batch_size = labels.shape[0]
+ mean_hit_at_one = calculate_hit_at_one(predictions, labels)
+ mean_perr = calculate_precision_at_equal_recall_rate(predictions,
+ labels)
+ mean_loss = numpy.mean(loss)
+
+        # Take the top_k predictions per sample.
+ sparse_predictions, sparse_labels, num_positives = top_k_by_class(
+ predictions, labels, self.top_k)
+ self.map_calculator.accumulate(sparse_predictions, sparse_labels,
+ num_positives)
+ self.global_ap_calculator.accumulate(
+ flatten(sparse_predictions),
+ flatten(sparse_labels), sum(num_positives))
+
+ self.num_examples += batch_size
+ self.sum_hit_at_one += mean_hit_at_one * batch_size
+ self.sum_perr += mean_perr * batch_size
+ self.sum_loss += mean_loss * batch_size
+
+ return {
+ "hit_at_one": mean_hit_at_one,
+ "perr": mean_perr,
+ "loss": mean_loss
+ }
+
+ def get(self):
+ """Calculate the evaluation metrics for the whole epoch.
+
+ Raises:
+ ValueError: If no examples were accumulated.
+
+ Returns:
+      dictionary: a dictionary storing the evaluation metrics for the epoch. The
+        dictionary has the fields: avg_hit_at_one, avg_perr, avg_loss, gap, and
+        aps (default nan).
+ """
+ if self.num_examples <= 0:
+ raise ValueError("total_sample must be positive.")
+ avg_hit_at_one = self.sum_hit_at_one / self.num_examples
+ avg_perr = self.sum_perr / self.num_examples
+ avg_loss = self.sum_loss / self.num_examples
+
+ aps = self.map_calculator.peek_map_at_n()
+ gap = self.global_ap_calculator.peek_ap_at_n()
+
+ return {
+ "avg_hit_at_one": avg_hit_at_one,
+ "avg_perr": avg_perr,
+ "avg_loss": avg_loss,
+ "aps": aps,
+ "gap": gap
+ }
+
+ def clear(self):
+ """Clear the evaluation metrics and reset the EvaluationMetrics object."""
+ self.sum_hit_at_one = 0.0
+ self.sum_perr = 0.0
+ self.sum_loss = 0.0
+ self.map_calculator.clear()
+ self.global_ap_calculator.clear()
+ self.num_examples = 0
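The Hit@1 computation accumulated above can be traced on a toy batch; the arrays below are made-up illustrative data, mirroring `calculate_hit_at_one`:

```python
import numpy as np

# Hit@1 is the fraction of samples whose top-scoring class is a true label.
predictions = np.array([[0.1, 0.7, 0.2],   # top class 1, which is a true label
                        [0.6, 0.3, 0.1]])  # top class 0, which is not
actuals = np.array([[0, 1, 1],
                    [0, 0, 1]])

top_prediction = np.argmax(predictions, 1)
hits = actuals[np.arange(actuals.shape[0]), top_prediction]
print(np.average(hits))  # 0.5
```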
diff --git a/PaddleCV/video/application/video_tag/metrics/youtube8m/mean_average_precision_calculator.py b/PaddleCV/video/application/video_tag/metrics/youtube8m/mean_average_precision_calculator.py
new file mode 100644
index 0000000000000000000000000000000000000000..0ae8b0ed3717aba13b7ed35b4af025be40423967
--- /dev/null
+++ b/PaddleCV/video/application/video_tag/metrics/youtube8m/mean_average_precision_calculator.py
@@ -0,0 +1,114 @@
+# Copyright 2016 Google Inc. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS-IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""Calculate the mean average precision.
+
+It provides an interface for calculating mean average precision
+for an entire list or the top-n ranked items.
+
+Example usages:
+We first call the function accumulate many times to process parts of the ranked
+list. After processing all the parts, we call peek_map_at_n
+to calculate the mean average precision.
+
+```
+import random
+import numpy as np
+
+p = np.array([[random.random() for _ in range(50)] for _ in range(1000)])
+a = np.array([[random.choice([0, 1]) for _ in range(50)]
+              for _ in range(1000)])
+
+# mean average precision for 50 classes.
+calculator = mean_average_precision_calculator.MeanAveragePrecisionCalculator(
+ num_class=50)
+calculator.accumulate(p, a)
+aps = calculator.peek_map_at_n()
+```
+"""
+
+import numpy
+from . import average_precision_calculator
+
+
+class MeanAveragePrecisionCalculator(object):
+ """This class is to calculate mean average precision.
+ """
+
+ def __init__(self, num_class):
+ """Construct a calculator to calculate the (macro) average precision.
+
+    Args:
+      num_class: A positive Integer specifying the number of classes.
+
+    Raises:
+      ValueError: An error occurred when num_class is not a positive integer.
+    """
+        if not isinstance(num_class, int) or num_class <= 1:
+            raise ValueError("num_class must be an integer greater than 1.")
+
+        self._ap_calculators = []  # list of AveragePrecisionCalculator instances
+ self._num_class = num_class # total number of classes
+ for i in range(num_class):
+ self._ap_calculators.append(
+ average_precision_calculator.AveragePrecisionCalculator())
+
+ def accumulate(self, predictions, actuals, num_positives=None):
+ """Accumulate the predictions and their ground truth labels.
+
+ Args:
+ predictions: A list of lists storing the prediction scores. The outer
+ dimension corresponds to classes.
+ actuals: A list of lists storing the ground truth labels. The dimensions
+ should correspond to the predictions input. Any value
+ larger than 0 will be treated as positives, otherwise as negatives.
+ num_positives: If provided, it is a list of numbers representing the
+ number of true positives for each class. If not provided, the number of
+ true positives will be inferred from the 'actuals' array.
+
+ Raises:
+ ValueError: An error occurred when the shape of predictions and actuals
+ does not match.
+ """
+        if not num_positives:
+            num_positives = [None for _ in range(len(predictions))]
+
+ calculators = self._ap_calculators
+ for i in range(len(predictions)):
+ calculators[i].accumulate(predictions[i], actuals[i],
+ num_positives[i])
+
+ def clear(self):
+ for calculator in self._ap_calculators:
+ calculator.clear()
+
+ def is_empty(self):
+ return ([calculator.heap_size for calculator in self._ap_calculators] ==
+ [0 for _ in range(self._num_class)])
+
+ def peek_map_at_n(self):
+ """Peek the non-interpolated mean average precision at n.
+
+ Returns:
+ An array of non-interpolated average precision at n (default 0) for each
+ class.
+ """
+ aps = [
+ self._ap_calculators[i].peek_ap_at_n()
+ for i in range(self._num_class)
+ ]
+ return aps
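The per-class mAP accumulated above is simply the unweighted mean of per-class APs; a toy sketch of that macro averaging, using a standalone `class_ap` helper written for illustration rather than this module's API:

```python
# Toy macro (per-class) mean average precision: compute AP independently
# per class, then average. Two classes, three samples each.
def class_ap(scores, labels):
    order = sorted(range(len(scores)), key=lambda k: scores[k], reverse=True)
    numpos = sum(labels)
    hits, total = 0, 0.0
    for rank, idx in enumerate(order, start=1):
        if labels[idx] > 0:
            hits += 1
            total += hits / rank
    return total / numpos if numpos else 0.0

scores_by_class = [[0.9, 0.2, 0.8], [0.1, 0.7, 0.3]]
labels_by_class = [[1, 0, 0], [1, 1, 0]]
aps = [class_ap(s, l) for s, l in zip(scores_by_class, labels_by_class)]
# Class 0: AP = 1.0; class 1: AP = (1/1 + 2/3) / 2 = 5/6; mean = 11/12
print(round(sum(aps) / len(aps), 4))  # 0.9167
```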
diff --git a/PaddleCV/video/application/video_tag/models/__init__.py b/PaddleCV/video/application/video_tag/models/__init__.py
new file mode 100644
index 0000000000000000000000000000000000000000..4a3adbbfb4ee895e532f03bb2d392ef88dcd4dcf
--- /dev/null
+++ b/PaddleCV/video/application/video_tag/models/__init__.py
@@ -0,0 +1,7 @@
+from .model import regist_model, get_model
+from .attention_lstm import AttentionLSTM
+from .tsn import TSN
+
+# register models, sorted alphabetically
+regist_model("AttentionLSTM", AttentionLSTM)
+regist_model("TSN", TSN)
diff --git a/PaddleCV/video/application/video_tag/models/attention_lstm/__init__.py b/PaddleCV/video/application/video_tag/models/attention_lstm/__init__.py
new file mode 100644
index 0000000000000000000000000000000000000000..cb872f0e43ab52054b42970896e5791a0eeb691d
--- /dev/null
+++ b/PaddleCV/video/application/video_tag/models/attention_lstm/__init__.py
@@ -0,0 +1 @@
+from .attention_lstm import *
diff --git a/PaddleCV/video/application/video_tag/models/attention_lstm/attention_lstm.py b/PaddleCV/video/application/video_tag/models/attention_lstm/attention_lstm.py
new file mode 100644
index 0000000000000000000000000000000000000000..dbf417ae98b9d27cd858d9b6ac66973ccde917f2
--- /dev/null
+++ b/PaddleCV/video/application/video_tag/models/attention_lstm/attention_lstm.py
@@ -0,0 +1,149 @@
+# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserve.
+#
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
+
+import numpy as np
+
+import paddle.fluid as fluid
+from paddle.fluid import ParamAttr
+
+from ..model import ModelBase
+from .lstm_attention import LSTMAttentionModel
+
+import logging
+logger = logging.getLogger(__name__)
+
+__all__ = ["AttentionLSTM"]
+
+
+class AttentionLSTM(ModelBase):
+ def __init__(self, name, cfg, mode='train'):
+ super(AttentionLSTM, self).__init__(name, cfg, mode)
+ self.get_config()
+
+ def get_config(self):
+ # get model configs
+ self.feature_num = self.cfg.MODEL.feature_num
+ self.feature_names = self.cfg.MODEL.feature_names
+ self.feature_dims = self.cfg.MODEL.feature_dims
+ self.num_classes = self.cfg.MODEL.num_classes
+ self.embedding_size = self.cfg.MODEL.embedding_size
+ self.lstm_size = self.cfg.MODEL.lstm_size
+ self.drop_rate = self.cfg.MODEL.drop_rate
+
+ # get mode configs
+ self.batch_size = self.get_config_from_sec(self.mode, 'batch_size', 1)
+ self.num_gpus = self.get_config_from_sec(self.mode, 'num_gpus', 1)
+
+ def build_input(self, use_dataloader):
+ self.feature_input = []
+ for name, dim in zip(self.feature_names, self.feature_dims):
+ self.feature_input.append(
+ fluid.data(
+ shape=[None, dim], lod_level=1, dtype='float32', name=name))
+ #video_tag without label_input
+ if use_dataloader:
+ assert self.mode != 'infer', \
+                'dataloader is not recommended in infer mode, please set use_dataloader to False.'
+ self.dataloader = fluid.io.DataLoader.from_generator(
+ feed_list=self.feature_input, #video_tag
+ capacity=8,
+ iterable=True)
+
+ def build_model(self):
+ att_outs = []
+ for i, (input_dim, feature
+ ) in enumerate(zip(self.feature_dims, self.feature_input)):
+ att = LSTMAttentionModel(input_dim, self.embedding_size,
+ self.lstm_size, self.drop_rate)
+ att_out = att.forward(feature, is_training=(self.mode == 'train'))
+ att_outs.append(att_out)
+ if len(att_outs) > 1:
+ out = fluid.layers.concat(att_outs, axis=1)
+ else:
+ out = att_outs[0]
+
+ fc1 = fluid.layers.fc(
+ input=out,
+ size=8192,
+ act='relu',
+ bias_attr=ParamAttr(
+ regularizer=fluid.regularizer.L2Decay(0.0),
+ initializer=fluid.initializer.NormalInitializer(scale=0.0)),
+ name='fc1')
+ fc2 = fluid.layers.fc(
+ input=fc1,
+ size=4096,
+ act='tanh',
+ bias_attr=ParamAttr(
+ regularizer=fluid.regularizer.L2Decay(0.0),
+ initializer=fluid.initializer.NormalInitializer(scale=0.0)),
+ name='fc2')
+
+ self.logit = fluid.layers.fc(input=fc2, size=self.num_classes, act=None, \
+ bias_attr=ParamAttr(regularizer=fluid.regularizer.L2Decay(0.0),
+ initializer=fluid.initializer.NormalInitializer(scale=0.0)),
+ name = 'output')
+
+ self.output = fluid.layers.sigmoid(self.logit)
+
+ def optimizer(self):
+        assert self.mode == 'train', "optimizer can only be built in train mode"
+ values = [
+ self.learning_rate * (self.decay_gamma**i)
+ for i in range(len(self.decay_epochs) + 1)
+ ]
+ iter_per_epoch = self.num_samples / self.batch_size
+ boundaries = [e * iter_per_epoch for e in self.decay_epochs]
+ return fluid.optimizer.RMSProp(
+ learning_rate=fluid.layers.piecewise_decay(
+ values=values, boundaries=boundaries),
+ centered=True,
+ regularization=fluid.regularizer.L2Decay(self.weight_decay))
+
+ def loss(self):
+        assert self.mode != 'infer', "invalid loss calculation in infer mode"
+ cost = fluid.layers.sigmoid_cross_entropy_with_logits(
+ x=self.logit, label=self.label_input)
+ cost = fluid.layers.reduce_sum(cost, dim=-1)
+ sum_cost = fluid.layers.reduce_sum(cost)
+ self.loss_ = fluid.layers.scale(
+ sum_cost, scale=self.num_gpus, bias_after_scale=False)
+ return self.loss_
+
+ def outputs(self):
+ return [self.output, self.logit]
+
+ def feeds(self):
+ return self.feature_input
+
+ def fetches(self):
+ fetch_list = [self.output]
+ return fetch_list
+
+ def weights_info(self):
+ return None
+
+ def load_pretrain_params(self, exe, pretrain, prog, place):
+ logger.info("Load pretrain weights from {}, exclude fc layer.".format(
+ pretrain))
+
+ state_dict = fluid.load_program_state(pretrain)
+ dict_keys = list(state_dict.keys())
+ for name in dict_keys:
+ if "fc_0" in name:
+ del state_dict[name]
+ logger.info(
+ 'Delete {} from pretrained parameters. Do not load it'.
+ format(name))
+ fluid.set_program_state(prog, state_dict)
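The pretrain-loading step above drops the final FC parameters so the classifier head can be re-initialized for a new label space; the filtering is just a dict prune, sketched here with hypothetical parameter names:

```python
# Sketch of pruning a parameter dict before loading (keys are made up).
state_dict = {"conv_1.w": 1, "fc_0.w_0": 2, "fc_0.b_0": 3}
for name in list(state_dict):  # list() so we can delete while iterating
    if "fc_0" in name:
        del state_dict[name]
print(sorted(state_dict))  # ['conv_1.w']
```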
diff --git a/PaddleCV/video/application/video_tag/models/attention_lstm/lstm_attention.py b/PaddleCV/video/application/video_tag/models/attention_lstm/lstm_attention.py
new file mode 100644
index 0000000000000000000000000000000000000000..baca36c13b663bd2c4589a2876f72a731a1ec487
--- /dev/null
+++ b/PaddleCV/video/application/video_tag/models/attention_lstm/lstm_attention.py
@@ -0,0 +1,87 @@
+# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserve.
+#
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
+
+import paddle.fluid as fluid
+from paddle.fluid import ParamAttr
+import numpy as np
+
+
+class LSTMAttentionModel(object):
+ """LSTM Attention Model"""
+
+    def __init__(self,
+                 input_dim,
+                 embedding_size=512,
+                 lstm_size=1024,
+                 drop_rate=0.5):
+ self.lstm_size = lstm_size
+ self.embedding_size = embedding_size
+ self.drop_rate = drop_rate
+
+ def forward(self, input, is_training):
+ input_fc = fluid.layers.fc(
+ input=input,
+ size=self.embedding_size,
+ act='tanh',
+ bias_attr=ParamAttr(
+ regularizer=fluid.regularizer.L2Decay(0.0),
+ initializer=fluid.initializer.NormalInitializer(scale=0.0)),
+ name='rgb_fc')
+
+ lstm_forward_fc = fluid.layers.fc(
+ input=input_fc,
+ size=self.lstm_size * 4,
+ act=None,
+ bias_attr=False, # video_tag
+ name='rgb_fc_forward')
+
+ lstm_forward, _ = fluid.layers.dynamic_lstm(
+ input=lstm_forward_fc,
+ size=self.lstm_size * 4,
+ is_reverse=False,
+ name='rgb_lstm_forward')
+
+        lstm_backward_fc = fluid.layers.fc(
+            input=input_fc,
+            size=self.lstm_size * 4,
+            act=None,
+            bias_attr=False,  #video_tag
+            name='rgb_fc_backward')
+
+        lstm_backward, _ = fluid.layers.dynamic_lstm(
+            input=lstm_backward_fc,
+            size=self.lstm_size * 4,
+            is_reverse=True,
+            name='rgb_lstm_backward')
+
+ lstm_concat = fluid.layers.concat(
+ input=[lstm_forward, lstm_backward], axis=1)
+
+ lstm_dropout = fluid.layers.dropout(
+ x=lstm_concat,
+ dropout_prob=self.drop_rate,
+ is_test=(not is_training))
+
+ lstm_weight = fluid.layers.fc(
+ input=lstm_dropout,
+ size=1,
+ act='sequence_softmax',
+ bias_attr=False, #video_tag
+ name='rgb_weight')
+
+ scaled = fluid.layers.elementwise_mul(
+ x=lstm_dropout, y=lstm_weight, axis=0)
+ lstm_pool = fluid.layers.sequence_pool(input=scaled, pool_type='sum')
+
+ return lstm_pool
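The pooling at the end of `forward` (a scalar weight per timestep via sequence softmax, then a weighted sum over the sequence) can be sketched in plain NumPy on dense toy shapes. The real layers operate on Paddle LoD tensors, so this is only an illustration of the math:

```python
import numpy as np

# Attention pooling: softmax over per-timestep logits, then a weighted
# sum of the timestep feature vectors.
def attention_pool(features, scores):
    # features: (T, D) per-timestep vectors; scores: (T,) unnormalized logits
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                 # sequence softmax
    return (features * weights[:, None]).sum(axis=0)

feats = np.array([[1.0, 0.0], [0.0, 1.0]])
pooled = attention_pool(feats, np.array([0.0, 0.0]))  # equal weights
print(pooled)  # [0.5 0.5]
```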
diff --git a/PaddleCV/video/application/video_tag/models/model.py b/PaddleCV/video/application/video_tag/models/model.py
new file mode 100644
index 0000000000000000000000000000000000000000..512502ae763355622ca2e6ec27a9187c905ac450
--- /dev/null
+++ b/PaddleCV/video/application/video_tag/models/model.py
@@ -0,0 +1,191 @@
+# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserve.
+#
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
+
+import os
+import wget
+import logging
+try:
+    from configparser import ConfigParser
+except ImportError:
+    from ConfigParser import ConfigParser
+
+import paddle.fluid as fluid
+from .utils import download, AttrDict
+
+WEIGHT_DIR = os.path.join(os.path.expanduser('~'), '.paddle', 'weights')
+
+logger = logging.getLogger(__name__)
+
+
+def is_parameter(var):
+ return isinstance(var, fluid.framework.Parameter)
+
+
+class NotImplementError(Exception):
+ "Error: model function not implement"
+
+ def __init__(self, model, function):
+ super(NotImplementError, self).__init__()
+ self.model = model.__class__.__name__
+ self.function = function.__name__
+
+ def __str__(self):
+ return "Function {}() is not implemented in model {}".format(
+ self.function, self.model)
+
+
+class ModelNotFoundError(Exception):
+ "Error: model not found"
+
+ def __init__(self, model_name, avail_models):
+ super(ModelNotFoundError, self).__init__()
+ self.model_name = model_name
+ self.avail_models = avail_models
+
+ def __str__(self):
+        msg = "Model {} Not Found.\nAvailable models:\n".format(
+ self.model_name)
+ for model in self.avail_models:
+ msg += " {}\n".format(model)
+ return msg
+
+
+class ModelBase(object):
+ def __init__(self, name, cfg, mode='train'):
+ assert mode in ['train', 'valid', 'test', 'infer'], \
+ "Unknown mode type {}".format(mode)
+ self.name = name
+ self.is_training = (mode == 'train')
+ self.mode = mode
+ self.cfg = cfg
+ self.dataloader = None
+
+ def build_model(self):
+ "build model struct"
+ raise NotImplementError(self, self.build_model)
+
+ def build_input(self, use_dataloader):
+ "build input Variable"
+ raise NotImplementError(self, self.build_input)
+
+ def optimizer(self):
+ "get model optimizer"
+ raise NotImplementError(self, self.optimizer)
+
+    def outputs(self):
+        "get output variable"
+        raise NotImplementError(self, self.outputs)
+
+    def loss(self):
+        "get loss variable"
+        raise NotImplementError(self, self.loss)
+
+ def feeds(self):
+ "get feed inputs list"
+ raise NotImplementError(self, self.feeds)
+
+ def fetches(self):
+ "get fetch list of model"
+ raise NotImplementError(self, self.fetches)
+
+ def weights_info(self):
+ "get model weight default path and download url"
+ raise NotImplementError(self, self.weights_info)
+
+ def get_weights(self):
+ "get model weight file path, download weight from Paddle if not exist"
+ path, url = self.weights_info()
+ path = os.path.join(WEIGHT_DIR, path)
+ if not os.path.isdir(WEIGHT_DIR):
+ logger.info('{} not exists, will be created automatically.'.format(
+ WEIGHT_DIR))
+ os.makedirs(WEIGHT_DIR)
+ if os.path.exists(path):
+ return path
+
+ logger.info("Download weights of {} from {}".format(self.name, url))
+ wget.download(url, path)
+ return path
+
+    def dataloader(self):
+        # Note: the 'self.dataloader' attribute assigned in __init__ shadows
+        # this method on instances.
+        return self.dataloader
+ def epoch_num(self):
+ "get train epoch num"
+ return self.cfg.TRAIN.epoch
+
+ def pretrain_info(self):
+ "get pretrain base model directory"
+ return (None, None)
+
+ def get_pretrain_weights(self):
+ "get model weight file path, download weight from Paddle if not exist"
+ path, url = self.pretrain_info()
+ if not path:
+ return None
+
+ path = os.path.join(WEIGHT_DIR, path)
+ if not os.path.isdir(WEIGHT_DIR):
+ logger.info('{} not exists, will be created automatically.'.format(
+ WEIGHT_DIR))
+ os.makedirs(WEIGHT_DIR)
+ if os.path.exists(path):
+ return path
+
+ logger.info("Download pretrain weights of {} from {}".format(self.name,
+ url))
+ download(url, path)
+ return path
+
+ def load_pretrain_params(self, exe, pretrain, prog, place):
+ logger.info("Load pretrain weights from {}".format(pretrain))
+ state_dict = fluid.load_program_state(pretrain)
+ fluid.set_program_state(prog, state_dict)
+
+ def load_test_weights(self, exe, weights, prog):
+ params_list = list(filter(is_parameter, prog.list_vars()))
+ fluid.load(prog, weights, executor=exe, var_list=params_list)
+
+ def get_config_from_sec(self, sec, item, default=None):
+ if sec.upper() not in self.cfg:
+ return default
+ return self.cfg[sec.upper()].get(item, default)
+
+
+class ModelZoo(object):
+ def __init__(self):
+ self.model_zoo = {}
+
+ def regist(self, name, model):
+ assert model.__base__ == ModelBase, "Unknown model type {}".format(
+ model)
+ self.model_zoo[name] = model
+
+ def get(self, name, cfg, mode='train'):
+ for k, v in self.model_zoo.items():
+ if k.upper() == name.upper():
+ return v(name, cfg, mode)
+ raise ModelNotFoundError(name, self.model_zoo.keys())
+
+
+# singleton model_zoo
+model_zoo = ModelZoo()
+
+
+def regist_model(name, model):
+ model_zoo.regist(name, model)
+
+
+def get_model(name, cfg, mode='train'):
+ return model_zoo.get(name, cfg, mode)
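The `ModelZoo`/`regist_model` pair above is a plain registry: subclasses of `ModelBase` are stored under a name and instantiated later by case-insensitive lookup. A minimal, self-contained sketch of the same pattern, with toy `Base`/`Zoo` classes standing in for the real ones:

```python
# Toy stand-ins for ModelBase/ModelZoo; illustrates the registry pattern only.
class Base(object):
    def __init__(self, name, cfg, mode='train'):
        self.name, self.cfg, self.mode = name, cfg, mode


class Zoo(object):
    def __init__(self):
        self._zoo = {}

    def regist(self, name, cls):
        # only direct subclasses of Base may be registered
        assert cls.__base__ == Base, "Unknown model type {}".format(cls)
        self._zoo[name] = cls

    def get(self, name, cfg, mode='train'):
        # lookup is case-insensitive, mirroring ModelZoo.get
        for k, v in self._zoo.items():
            if k.upper() == name.upper():
                return v(name, cfg, mode)
        raise KeyError(name)


class ToyTSN(Base):
    pass


zoo = Zoo()
zoo.regist('TSN', ToyTSN)
model = zoo.get('tsn', cfg={}, mode='infer')  # 'tsn' matches 'TSN'
print(type(model).__name__, model.mode)
```

The singleton-plus-module-function layout (`model_zoo`, `regist_model`, `get_model`) lets each model module register itself at import time.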
diff --git a/PaddleCV/video/application/video_tag/models/tsn/__init__.py b/PaddleCV/video/application/video_tag/models/tsn/__init__.py
new file mode 100644
index 0000000000000000000000000000000000000000..bd57d2687bc948e63dd88306e9d435bbbb5a7978
--- /dev/null
+++ b/PaddleCV/video/application/video_tag/models/tsn/__init__.py
@@ -0,0 +1 @@
+from .tsn import *
diff --git a/PaddleCV/video/application/video_tag/models/tsn/tsn.py b/PaddleCV/video/application/video_tag/models/tsn/tsn.py
new file mode 100644
index 0000000000000000000000000000000000000000..4bbce1874efa143c5a178455fa1765fa6e761e34
--- /dev/null
+++ b/PaddleCV/video/application/video_tag/models/tsn/tsn.py
@@ -0,0 +1,172 @@
+# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserve.
+#
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
+
+import numpy as np
+
+import paddle.fluid as fluid
+from paddle.fluid import ParamAttr
+
+from ..model import ModelBase
+from .tsn_res_model import TSN_ResNet
+
+import logging
+logger = logging.getLogger(__name__)
+
+__all__ = ["TSN"]
+
+
+class TSN(ModelBase):
+ def __init__(self, name, cfg, mode='train'):
+ super(TSN, self).__init__(name, cfg, mode=mode)
+ self.get_config()
+
+ def get_config(self):
+ self.num_classes = self.get_config_from_sec('model', 'num_classes')
+ self.seg_num = self.get_config_from_sec('model', 'seg_num')
+ self.seglen = self.get_config_from_sec('model', 'seglen')
+ self.image_mean = self.get_config_from_sec('model', 'image_mean')
+ self.image_std = self.get_config_from_sec('model', 'image_std')
+ self.num_layers = self.get_config_from_sec('model', 'num_layers')
+
+ self.num_epochs = self.get_config_from_sec('train', 'epoch')
+ self.total_videos = self.get_config_from_sec('train', 'total_videos')
+ self.base_learning_rate = self.get_config_from_sec('train',
+ 'learning_rate')
+ self.learning_rate_decay = self.get_config_from_sec(
+ 'train', 'learning_rate_decay')
+ self.l2_weight_decay = self.get_config_from_sec('train',
+ 'l2_weight_decay')
+ self.momentum = self.get_config_from_sec('train', 'momentum')
+
+ self.seg_num = self.get_config_from_sec(self.mode, 'seg_num',
+ self.seg_num)
+ self.target_size = self.get_config_from_sec(self.mode, 'target_size')
+ self.batch_size = self.get_config_from_sec(self.mode, 'batch_size')
+
+ def build_input(self, use_dataloader=True):
+ image_shape = [3, self.target_size, self.target_size]
+ image_shape[0] = image_shape[0] * self.seglen
+ image_shape = [None, self.seg_num] + image_shape
+ self.use_dataloader = use_dataloader
+
+ image = fluid.data(name='image', shape=image_shape, dtype='float32')
+ if self.mode != 'infer':
+ label = fluid.data(name='label', shape=[None, 1], dtype='int64')
+ else:
+ label = None
+
+ if use_dataloader:
+ assert self.mode != 'infer', \
+ 'dataloader is not recommended in infer mode, please set use_dataloader to False.'
+ self.dataloader = fluid.io.DataLoader.from_generator(
+ feed_list=[image, label], capacity=4, iterable=True)
+
+ self.feature_input = [image]
+ self.label_input = label
+
+ def create_model_args(self):
+ cfg = {}
+ cfg['layers'] = self.num_layers
+ cfg['class_dim'] = self.num_classes
+ cfg['seg_num'] = self.seg_num
+ return cfg
+
+ def build_model(self):
+ cfg = self.create_model_args()
+ videomodel = TSN_ResNet(
+ layers=cfg['layers'],
+ seg_num=cfg['seg_num'],
+ is_training=(self.mode == 'train'))
+ out = videomodel.net(input=self.feature_input[0],
+ class_dim=cfg['class_dim'])
+ # videotag just need extractor feature
+ self.feature_output = out
+
+ def optimizer(self):
+ assert self.mode == 'train', "optimizer can only be obtained in train mode"
+ epoch_points = [self.num_epochs / 3, self.num_epochs * 2 / 3]
+ total_videos = self.total_videos
+ step = int(total_videos / self.batch_size + 1)
+ bd = [e * step for e in epoch_points]
+ base_lr = self.base_learning_rate
+ lr_decay = self.learning_rate_decay
+ lr = [base_lr, base_lr * lr_decay, base_lr * lr_decay * lr_decay]
+ l2_weight_decay = self.l2_weight_decay
+ momentum = self.momentum
+ optimizer = fluid.optimizer.Momentum(
+ learning_rate=fluid.layers.piecewise_decay(
+ boundaries=bd, values=lr),
+ momentum=momentum,
+ regularization=fluid.regularizer.L2Decay(l2_weight_decay))
+
+ return optimizer
+
+ def loss(self):
+ assert self.mode != 'infer', "invalid loss calculation in infer mode"
+ cost = fluid.layers.cross_entropy(input=self.network_outputs[0], \
+ label=self.label_input, ignore_index=-1)
+ self.loss_ = fluid.layers.mean(x=cost)
+ return self.loss_
+
+ def outputs(self):
+ return self.network_outputs
+
+ def feeds(self):
+ return self.feature_input if self.mode == 'infer' else self.feature_input + [
+ self.label_input
+ ]
+
+ def fetches(self):
+ if self.mode == 'train' or self.mode == 'valid':
+ losses = self.loss()
+ fetch_list = [losses, self.network_outputs[0], self.label_input]
+ elif self.mode == 'test':
+ fetch_list = [self.feature_output, self.label_input]
+ elif self.mode == 'infer':
+ fetch_list = [self.feature_output]
+ else:
+ raise NotImplementedError('mode {} not implemented'.format(
+ self.mode))
+
+ return fetch_list
+
+ def pretrain_info(self):
+ return (
+ 'ResNet50_pretrained',
+ 'https://paddlemodels.bj.bcebos.com/video_classification/ResNet50_pretrained.tar.gz'
+ )
+
+ def weights_info(self):
+ return None
+
+ def load_pretrain_params(self, exe, pretrain, prog, place):
+ logger.info("Load pretrain weights from {}, exclude fc layer.".format(
+ pretrain))
+
+ state_dict = fluid.load_program_state(pretrain)
+ dict_keys = list(state_dict.keys())
+ for name in dict_keys:
+ if "fc_0" in name:
+ del state_dict[name]
+ logger.info('Delete {} from pretrained parameters. Do not load it.'.
+ format(name))
+ fluid.set_program_state(prog, state_dict)
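The fc-exclusion logic in `load_pretrain_params` can be seen in isolation with plain dicts: keys containing `fc_0` (the ImageNet classifier head, whose shape does not match the video model) are dropped before the remaining state is applied. A small sketch with made-up parameter names:

```python
# Hypothetical parameter names; the real state_dict maps names to ndarrays.
state_dict = {
    'conv1_weights': 'conv kernel',
    'bn_conv1_scale': 'bn scale',
    'fc_0.w_0': 'classifier weight',
    'fc_0.b_0': 'classifier bias',
}

# Iterate over a copy of the keys so deletion is safe during the loop.
for name in list(state_dict.keys()):
    if 'fc_0' in name:
        del state_dict[name]

print(sorted(state_dict.keys()))
```

Only the backbone weights survive, which is exactly what fine-tuning from an image-classification checkpoint requires.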
diff --git a/PaddleCV/video/application/video_tag/models/tsn/tsn_res_model.py b/PaddleCV/video/application/video_tag/models/tsn/tsn_res_model.py
new file mode 100644
index 0000000000000000000000000000000000000000..05027bb2bfeb3379095ee9b49483f7e8618686b8
--- /dev/null
+++ b/PaddleCV/video/application/video_tag/models/tsn/tsn_res_model.py
@@ -0,0 +1,150 @@
+# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserve.
+#
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
+
+import os
+import time
+import sys
+import paddle.fluid as fluid
+import math
+
+
+class TSN_ResNet():
+ def __init__(self, layers=50, seg_num=7, is_training=True):
+ self.layers = 101  # video_tag fixes the backbone to ResNet-101; the layers argument is ignored
+ self.seg_num = seg_num
+ self.is_training = is_training
+
+ def conv_bn_layer(self,
+ input,
+ num_filters,
+ filter_size,
+ stride=1,
+ groups=1,
+ act=None,
+ name=None):
+ conv = fluid.layers.conv2d(
+ input=input,
+ num_filters=num_filters,
+ filter_size=filter_size,
+ stride=stride,
+ padding=(filter_size - 1) // 2,
+ groups=groups,
+ act=None,
+ param_attr=fluid.param_attr.ParamAttr(name=name + "_weights"),
+ bias_attr=False)
+ if name == "conv1":
+ bn_name = "bn_" + name
+ else:
+ bn_name = "bn" + name[3:]
+
+ return fluid.layers.batch_norm(
+ input=conv,
+ act=act,
+ is_test=(not self.is_training),
+ param_attr=fluid.param_attr.ParamAttr(name=bn_name + "_scale"),
+ bias_attr=fluid.param_attr.ParamAttr(bn_name + '_offset'),
+ moving_mean_name=bn_name + "_mean",
+ moving_variance_name=bn_name + '_variance')
+
+ def shortcut(self, input, ch_out, stride, name):
+ ch_in = input.shape[1]
+ if ch_in != ch_out or stride != 1:
+ return self.conv_bn_layer(input, ch_out, 1, stride, name=name)
+ else:
+ return input
+
+ def bottleneck_block(self, input, num_filters, stride, name):
+ conv0 = self.conv_bn_layer(
+ input=input,
+ num_filters=num_filters,
+ filter_size=1,
+ act='relu',
+ name=name + "_branch2a")
+ conv1 = self.conv_bn_layer(
+ input=conv0,
+ num_filters=num_filters,
+ filter_size=3,
+ stride=stride,
+ act='relu',
+ name=name + "_branch2b")
+ conv2 = self.conv_bn_layer(
+ input=conv1,
+ num_filters=num_filters * 4,
+ filter_size=1,
+ act=None,
+ name=name + "_branch2c")
+
+ short = self.shortcut(
+ input, num_filters * 4, stride, name=name + "_branch1")
+
+ return fluid.layers.elementwise_add(x=short, y=conv2, act='relu')
+
+ def net(self, input, class_dim=101):
+ layers = self.layers
+ seg_num = self.seg_num
+ supported_layers = [50, 101, 152]
+ assert layers in supported_layers, \
+ "supported layers are {} but input layer is {}".format(supported_layers, layers)
+
+ # reshape input
+ channels = input.shape[2]
+ short_size = input.shape[3]
+ input = fluid.layers.reshape(
+ x=input, shape=[-1, channels, short_size, short_size])
+
+ if layers == 50:
+ depth = [3, 4, 6, 3]
+ elif layers == 101:
+ depth = [3, 4, 23, 3]
+ elif layers == 152:
+ depth = [3, 8, 36, 3]
+ num_filters = [64, 128, 256, 512]
+
+ conv = self.conv_bn_layer(
+ input=input,
+ num_filters=64,
+ filter_size=7,
+ stride=2,
+ act='relu',
+ name='conv1')
+ conv = fluid.layers.pool2d(
+ input=conv,
+ pool_size=3,
+ pool_stride=2,
+ pool_padding=1,
+ pool_type='max')
+
+ for block in range(len(depth)):
+ for i in range(depth[block]):
+ if layers in [101, 152] and block == 2:
+ if i == 0:
+ conv_name = "res" + str(block + 2) + "a"
+ else:
+ conv_name = "res" + str(block + 2) + "b" + str(i)
+ else:
+ conv_name = "res" + str(block + 2) + chr(97 + i)
+
+ conv = self.bottleneck_block(
+ input=conv,
+ num_filters=num_filters[block],
+ stride=2 if i == 0 and block != 0 else 1,
+ name=conv_name)
+
+ pool = fluid.layers.pool2d(
+ input=conv, pool_size=7, pool_type='avg', global_pooling=True)
+
+ # video_tag just need extractor feature
+ feature = fluid.layers.reshape(
+ x=pool, shape=[-1, seg_num, pool.shape[1]])
+ return feature
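The reshapes in `TSN_ResNet.net` are the heart of TSN: frames are folded into the batch dimension so a 2-D CNN can process them, then unfolded back into a per-video feature sequence. A numpy sketch of the shape bookkeeping, with assumed sizes (batch 2, `seg_num` 7, `seglen` 1, 224x224 frames, 2048-dim pooled features for a ResNet-101 backbone):

```python
import numpy as np

N, seg_num, seglen, size, feat_dim = 2, 7, 1, 224, 2048

# input as built in build_input: [N, seg_num, 3*seglen, H, W]
video = np.zeros((N, seg_num, 3 * seglen, size, size), dtype='float32')

# fold segments into the batch axis before the 2-D CNN
frames = video.reshape(-1, 3 * seglen, size, size)  # [N*seg_num, C, H, W]

# stand-in for the backbone + global average pool output
pooled = np.zeros((frames.shape[0], feat_dim), dtype='float32')

# unfold back into a feature sequence, as in the final reshape of net()
feature = pooled.reshape(-1, seg_num, feat_dim)  # [N, seg_num, feat_dim]
print(frames.shape, feature.shape)
```

The `-1` in both reshapes recovers the batch dimension, so the same graph serves any batch size.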
diff --git a/PaddleCV/video/application/video_tag/models/utils.py b/PaddleCV/video/application/video_tag/models/utils.py
new file mode 100644
index 0000000000000000000000000000000000000000..3480794285d0b2da3832c25ff3512c5678e2b0e1
--- /dev/null
+++ b/PaddleCV/video/application/video_tag/models/utils.py
@@ -0,0 +1,47 @@
+# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserve.
+#
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
+
+import os
+import wget
+import tarfile
+
+__all__ = ['decompress', 'download', 'AttrDict']
+
+
+def decompress(path):
+ t = tarfile.open(path)
+ t.extractall(path=os.path.split(path)[0])
+ t.close()
+ os.remove(path)
+
+
+def download(url, path):
+ weight_dir = os.path.split(path)[0]
+ if not os.path.exists(weight_dir):
+ os.makedirs(weight_dir)
+
+ path = path + ".tar.gz"
+ wget.download(url, path)
+ decompress(path)
+
+
+class AttrDict(dict):
+ def __getattr__(self, key):
+ return self[key]
+
+ def __setattr__(self, key, value):
+ if key in self.__dict__:
+ self.__dict__[key] = value
+ else:
+ self[key] = value
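`AttrDict` gives dict entries attribute-style access, and attribute assignment lands in the dict itself unless the key already exists in `__dict__`. A quick usage sketch (class body repeated here so the example is self-contained):

```python
class AttrDict(dict):
    def __getattr__(self, key):
        return self[key]

    def __setattr__(self, key, value):
        if key in self.__dict__:
            self.__dict__[key] = value
        else:
            self[key] = value


cfg = AttrDict({'MODEL': 'TSN'})
cfg.seg_num = 7  # stored as a dict item, not an instance attribute
print(cfg.MODEL, cfg['seg_num'], 'seg_num' in cfg)
```

One caveat of this implementation: a missing key raises `KeyError` rather than `AttributeError`, so `hasattr` checks on an `AttrDict` will let the `KeyError` propagate in Python 3.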
diff --git a/PaddleCV/video/application/video_tag/reader/__init__.py b/PaddleCV/video/application/video_tag/reader/__init__.py
new file mode 100644
index 0000000000000000000000000000000000000000..d419ab75df3a105329c65f7d96a78f3b1964823c
--- /dev/null
+++ b/PaddleCV/video/application/video_tag/reader/__init__.py
@@ -0,0 +1,5 @@
+from .reader_utils import regist_reader, get_reader
+from .kinetics_reader import KineticsReader
+
+# regist reader, sort by alphabet
+regist_reader("TSN", KineticsReader)
diff --git a/PaddleCV/video/application/video_tag/reader/kinetics_reader.py b/PaddleCV/video/application/video_tag/reader/kinetics_reader.py
new file mode 100644
index 0000000000000000000000000000000000000000..4eb560a111b2c47752226ba0776158d657280cf1
--- /dev/null
+++ b/PaddleCV/video/application/video_tag/reader/kinetics_reader.py
@@ -0,0 +1,255 @@
+# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserve.
+#
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
+
+import os
+import sys
+import cv2
+import math
+import random
+import functools
+import time
+try:
+ import cPickle as pickle
+ from cStringIO import StringIO
+except ImportError:
+ import pickle
+ from io import BytesIO
+import numpy as np
+import paddle.fluid as fluid
+from PIL import Image, ImageEnhance
+import logging
+
+from .reader_utils import DataReader
+
+logger = logging.getLogger(__name__)
+python_ver = sys.version_info
+
+
+class KineticsReader(DataReader):
+ """
+ Data reader for the Kinetics dataset, which comes in two formats:
+ 1. mp4, the original format of Kinetics-400
+ 2. pkl, mp4 videos decoded beforehand and stored as pickle files
+ In both cases, the reader loads the data, then returns the frame data as a numpy array and the label as an integer.
+ dataset cfg: format
+ num_classes
+ seg_num
+ short_size
+ target_size
+ num_reader_threads
+ buf_size
+ image_mean
+ image_std
+ batch_size
+ list
+ """
+
+ def __init__(self, name, mode, cfg):
+ super(KineticsReader, self).__init__(name, mode, cfg)
+ self.format = cfg.MODEL.format
+ self.num_classes = self.get_config_from_sec('model', 'num_classes')
+ self.seg_num = self.get_config_from_sec('model', 'seg_num')
+ self.seglen = self.get_config_from_sec('model', 'seglen')
+
+ self.seg_num = self.get_config_from_sec(mode, 'seg_num', self.seg_num)
+ self.short_size = self.get_config_from_sec(mode, 'short_size')
+ self.target_size = self.get_config_from_sec(mode, 'target_size')
+ self.num_reader_threads = self.get_config_from_sec(mode,
+ 'num_reader_threads')
+ self.buf_size = self.get_config_from_sec(mode, 'buf_size')
+ self.fix_random_seed = self.get_config_from_sec(mode, 'fix_random_seed')
+
+ self.img_mean = np.array(cfg.MODEL.image_mean).reshape(
+ [3, 1, 1]).astype(np.float32)
+ self.img_std = np.array(cfg.MODEL.image_std).reshape(
+ [3, 1, 1]).astype(np.float32)
+ # set batch size and file list
+ self.batch_size = cfg[mode.upper()]['batch_size']
+ self.filelist = cfg[mode.upper()]['filelist']
+ if self.fix_random_seed:
+ random.seed(0)
+ np.random.seed(0)
+ self.num_reader_threads = 1
+
+ def create_reader(self):
+ assert os.path.exists(self.filelist), \
+ '{} not exist, please check the data list'.format(self.filelist)
+ _reader = self._reader_creator(self.filelist, self.mode, seg_num=self.seg_num, seglen = self.seglen, \
+ short_size = self.short_size, target_size = self.target_size, \
+ img_mean = self.img_mean, img_std = self.img_std, \
+ shuffle = (self.mode == 'train'), \
+ num_threads = self.num_reader_threads, \
+ buf_size = self.buf_size, format = self.format)
+
+ def _batch_reader():
+ batch_out = []
+ for imgs, label in _reader():
+ if imgs is None:
+ continue
+ batch_out.append((imgs, label))
+ if len(batch_out) == self.batch_size:
+ yield batch_out
+ batch_out = []
+
+ return _batch_reader
+
+ def _reader_creator(self,
+ pickle_list,
+ mode,
+ seg_num,
+ seglen,
+ short_size,
+ target_size,
+ img_mean,
+ img_std,
+ shuffle=False,
+ num_threads=1,
+ buf_size=1024,
+ format='pkl'):
+ def decode_mp4(sample, mode, seg_num, seglen, short_size, target_size,
+ img_mean, img_std):
+ sample = sample[0].split(' ')
+ mp4_path = sample[0]
+ try:
+ imgs = mp4_loader(mp4_path, seg_num, seglen, mode)
+ if len(imgs) < 1:
+ logger.error('{} frame length {} less than 1.'.format(
+ mp4_path, len(imgs)))
+ return None, None
+ except Exception:
+ logger.error('Error when loading {}'.format(mp4_path))
+ return None, None
+
+ imgs = imgs_transform(
+ imgs,
+ mode,
+ seg_num,
+ seglen,
+ short_size,
+ target_size,
+ img_mean,
+ img_std,
+ name=self.name)
+ return imgs, mp4_path
+
+ def reader():
+ with open(pickle_list) as flist:
+ lines = [line.strip() for line in flist]
+ if shuffle:
+ random.shuffle(lines)
+ for line in lines:
+ pickle_path = line.strip()
+ yield [pickle_path]
+
+ mapper = functools.partial(
+ decode_mp4,
+ mode=mode,
+ seg_num=seg_num,
+ seglen=seglen,
+ short_size=short_size,
+ target_size=target_size,
+ img_mean=img_mean,
+ img_std=img_std)
+
+ return fluid.io.xmap_readers(mapper, reader, num_threads, buf_size)
+
+
+def imgs_transform(imgs,
+ mode,
+ seg_num,
+ seglen,
+ short_size,
+ target_size,
+ img_mean,
+ img_std,
+ name=''):
+ imgs = group_scale(imgs, short_size)
+
+ np_imgs = np.array([np.array(img).astype('float32') for img in imgs]) #dhwc
+ np_imgs = group_center_crop(np_imgs, target_size)
+ np_imgs = np_imgs.transpose(0, 3, 1, 2) / 255 #dchw
+ np_imgs -= img_mean
+ np_imgs /= img_std
+
+ return np_imgs
+
+
+def group_center_crop(np_imgs, target_size):
+ d, h, w, c = np_imgs.shape
+ th, tw = target_size, target_size
+ assert (w >= target_size) and (h >= target_size), \
+ "image width ({}) and height ({}) should be no smaller than crop size ({})".format(w, h, target_size)
+
+ h_off = int(round((h - th) / 2.))
+ w_off = int(round((w - tw) / 2.))
+
+ img_crop = np_imgs[:, h_off:h_off + target_size, w_off:w_off +
+ target_size, :]
+ return img_crop
+
+
+def group_scale(imgs, target_size):
+ resized_imgs = []
+ for i in range(len(imgs)):
+ img = imgs[i]
+ w, h = img.size
+ if (w <= h and w == target_size) or (h <= w and h == target_size):
+ resized_imgs.append(img)
+ continue
+
+ if w < h:
+ ow = target_size
+ oh = int(target_size * 4.0 / 3.0)
+ resized_imgs.append(img.resize((ow, oh), Image.BILINEAR))
+ else:
+ oh = target_size
+ ow = int(target_size * 4.0 / 3.0)
+ resized_imgs.append(img.resize((ow, oh), Image.BILINEAR))
+
+ return resized_imgs
+
+
+def mp4_loader(filepath, nsample, seglen, mode):
+ cap = cv2.VideoCapture(filepath)
+ videolen = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
+ sampledFrames = []
+ for i in range(videolen):
+ ret, frame = cap.read()
+ # skip frames that fail to decode (e.g. an empty first frame)
+ if not ret:
+ continue
+ img = frame[:, :, ::-1]
+ sampledFrames.append(img)
+ average_dur = int(len(sampledFrames) / nsample)
+ imgs = []
+ for i in range(nsample):
+ idx = 0
+ if average_dur >= seglen:
+ idx = (average_dur - 1) // 2
+ idx += i * average_dur
+ elif average_dur >= 1:
+ idx += i * average_dur
+ else:
+ idx = i
+
+ for jj in range(idx, idx + seglen):
+ imgbuf = sampledFrames[int(jj % len(sampledFrames))]
+ img = Image.fromarray(imgbuf, mode='RGB')
+ imgs.append(img)
+
+ return imgs
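The sampling arithmetic in `mp4_loader`, extracted here into a pure function so the pattern is easy to inspect: for each of `nsample` segments, take `seglen` consecutive frames starting near the segment centre (long videos), at the segment start (short videos), or simply at index `i`, wrapping around, when there are fewer frames than segments. This is an illustrative re-derivation, not code from the module:

```python
def sample_indices(num_frames, nsample, seglen):
    """Mirror of the index math in mp4_loader."""
    average_dur = num_frames // nsample
    indices = []
    for i in range(nsample):
        if average_dur >= seglen:
            idx = (average_dur - 1) // 2 + i * average_dur  # segment centre
        elif average_dur >= 1:
            idx = i * average_dur                           # segment start
        else:
            idx = i                                         # too few frames
        # wrap with modulo so short videos still yield nsample*seglen frames
        indices.extend(j % num_frames for j in range(idx, idx + seglen))
    return indices


print(sample_indices(10, 5, 1))  # one centred frame per 2-frame segment
print(sample_indices(3, 5, 1))   # fewer frames than segments: wraps around
```

The result always has `nsample * seglen` entries, matching the fixed input shape `[seg_num, 3 * seglen, H, W]` the model expects.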
diff --git a/PaddleCV/video/application/video_tag/reader/reader_utils.py b/PaddleCV/video/application/video_tag/reader/reader_utils.py
new file mode 100644
index 0000000000000000000000000000000000000000..b3741188e11350231600b50fb7fabad72340768c
--- /dev/null
+++ b/PaddleCV/video/application/video_tag/reader/reader_utils.py
@@ -0,0 +1,81 @@
+# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserve.
+#
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
+
+import pickle
+import cv2
+import numpy as np
+import random
+
+
+class ReaderNotFoundError(Exception):
+ "Error: reader not found"
+
+ def __init__(self, reader_name, avail_readers):
+ super(ReaderNotFoundError, self).__init__()
+ self.reader_name = reader_name
+ self.avail_readers = avail_readers
+
+ def __str__(self):
+ msg = "Reader {} Not Found.\nAvailable readers:\n".format(
+ self.reader_name)
+ for reader in self.avail_readers:
+ msg += " {}\n".format(reader)
+ return msg
+
+
+class DataReader(object):
+ """data reader for video input"""
+
+ def __init__(self, model_name, mode, cfg):
+ self.name = model_name
+ self.mode = mode
+ self.cfg = cfg
+
+ def create_reader(self):
+ """Not implemented"""
+ pass
+
+ def get_config_from_sec(self, sec, item, default=None):
+ if sec.upper() not in self.cfg:
+ return default
+ return self.cfg[sec.upper()].get(item, default)
+
+
+class ReaderZoo(object):
+ def __init__(self):
+ self.reader_zoo = {}
+
+ def regist(self, name, reader):
+ assert reader.__base__ == DataReader, "Unknown reader type {}".format(
+ reader)
+ self.reader_zoo[name] = reader
+
+ def get(self, name, mode, cfg):
+ for k, v in self.reader_zoo.items():
+ if k == name:
+ return v(name, mode, cfg)
+ raise ReaderNotFoundError(name, self.reader_zoo.keys())
+
+
+# singleton reader_zoo
+reader_zoo = ReaderZoo()
+
+
+def regist_reader(name, reader):
+ reader_zoo.regist(name, reader)
+
+
+def get_reader(name, mode, cfg):
+ reader_model = reader_zoo.get(name, mode, cfg)
+ return reader_model.create_reader()
diff --git a/PaddleCV/video/application/video_tag/run_TSN_LSTM.sh b/PaddleCV/video/application/video_tag/run_TSN_LSTM.sh
new file mode 100644
index 0000000000000000000000000000000000000000..8c2cf7087ec8543406e8404c74ff862069287222
--- /dev/null
+++ b/PaddleCV/video/application/video_tag/run_TSN_LSTM.sh
@@ -0,0 +1,4 @@
+export CUDA_VISIBLE_DEVICES=0
+
+# TSN + AttentionLSTM
+python videotag_main.py
diff --git a/PaddleCV/video/application/video_tag/utils/__init__.py b/PaddleCV/video/application/video_tag/utils/__init__.py
new file mode 100644
index 0000000000000000000000000000000000000000..e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
diff --git a/PaddleCV/video/application/video_tag/utils/config_utils.py b/PaddleCV/video/application/video_tag/utils/config_utils.py
new file mode 100644
index 0000000000000000000000000000000000000000..6ceb42ee5ef3b535325fa26a7d140edd767ac0b7
--- /dev/null
+++ b/PaddleCV/video/application/video_tag/utils/config_utils.py
@@ -0,0 +1,75 @@
+# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserve.
+#
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
+
+import yaml
+from .utility import AttrDict
+import logging
+logger = logging.getLogger(__name__)
+
+CONFIG_SECS = [
+ 'train',
+ 'valid',
+ 'test',
+ 'infer',
+]
+
+
+def parse_config(cfg_file):
+ """Load a config file into AttrDict"""
+ with open(cfg_file, 'r') as fopen:
+ yaml_config = AttrDict(yaml.load(fopen, Loader=yaml.Loader))
+ create_attr_dict(yaml_config)
+ return yaml_config
+
+
+def create_attr_dict(yaml_config):
+ from ast import literal_eval
+ for key, value in yaml_config.items():
+ if type(value) is dict:
+ yaml_config[key] = value = AttrDict(value)
+ if isinstance(value, str):
+ try:
+ value = literal_eval(value)
+ except BaseException:
+ pass
+ if isinstance(value, AttrDict):
+ create_attr_dict(yaml_config[key])
+ else:
+ yaml_config[key] = value
+ return
+
+
+def merge_configs(cfg, sec, args_dict):
+ assert sec in CONFIG_SECS, "invalid config section {}".format(sec)
+ sec_dict = getattr(cfg, sec.upper())
+ for k, v in args_dict.items():
+ if v is None:
+ continue
+ try:
+ if hasattr(sec_dict, k):
+ setattr(sec_dict, k, v)
+ except Exception:
+ pass
+ return cfg
+
+
+def print_configs(cfg, mode):
+ logger.info("---------------- {:>5} Arguments ----------------".format(
+ mode))
+ for sec, sec_items in cfg.items():
+ logger.info("{}:".format(sec))
+ for k, v in sec_items.items():
+ logger.info(" {}:{}".format(k, v))
+ logger.info("-------------------------------------------------")
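What `create_attr_dict` does to freshly parsed YAML can be seen on a small dict: nested dicts become `AttrDict`s recursively, and strings that parse as Python literals are converted in place. A self-contained sketch using the same logic (minimal `AttrDict` repeated for runnability):

```python
from ast import literal_eval


class AttrDict(dict):
    def __getattr__(self, key):
        return self[key]


def create_attr_dict(cfg):
    for key, value in cfg.items():
        if type(value) is dict:
            cfg[key] = value = AttrDict(value)
        if isinstance(value, str):
            try:
                value = literal_eval(value)  # '7' -> 7, '[1, 2]' -> [1, 2]
            except BaseException:
                pass                         # plain strings stay strings
        if isinstance(value, AttrDict):
            create_attr_dict(cfg[key])       # recurse into sub-sections
        else:
            cfg[key] = value


cfg = AttrDict({'MODEL': {'name': 'TSN', 'seg_num': '7',
                          'image_mean': '[0.485, 0.456, 0.406]'}})
create_attr_dict(cfg)
print(cfg.MODEL.seg_num, cfg.MODEL.image_mean)
```

This is why config code elsewhere can write `cfg.MODEL.seg_num` instead of `cfg['MODEL']['seg_num']` and still get typed values.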
diff --git a/PaddleCV/video/application/video_tag/utils/utility.py b/PaddleCV/video/application/video_tag/utils/utility.py
new file mode 100644
index 0000000000000000000000000000000000000000..fa94c0ddc4296100b206a4b4529774bd1c75c773
--- /dev/null
+++ b/PaddleCV/video/application/video_tag/utils/utility.py
@@ -0,0 +1,71 @@
+# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserve.
+#
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
+
+import os
+import sys
+import signal
+import logging
+import paddle
+import paddle.fluid as fluid
+
+__all__ = ['AttrDict']
+
+logger = logging.getLogger(__name__)
+
+
+def _term(sig_num, addition):
+ print('current pid is %s, group id is %s' % (os.getpid(), os.getpgrp()))
+ os.killpg(os.getpgid(os.getpid()), signal.SIGKILL)
+
+
+signal.signal(signal.SIGTERM, _term)
+signal.signal(signal.SIGINT, _term)
+
+
+class AttrDict(dict):
+ def __getattr__(self, key):
+ return self[key]
+
+ def __setattr__(self, key, value):
+ if key in self.__dict__:
+ self.__dict__[key] = value
+ else:
+ self[key] = value
+
+def check_cuda(use_cuda, err = \
+ "\nYou can not set use_gpu = True in the model because you are using paddlepaddle-cpu.\n \
+ Please: 1. Install paddlepaddle-gpu to run your models on GPU or 2. Set use_gpu = False to run models on CPU.\n"
+ ):
+ try:
+ if use_cuda and not fluid.is_compiled_with_cuda():
+ print(err)
+ sys.exit(1)
+ except Exception as e:
+ pass
+
+
+def check_version():
+ """
+ Log error and exit when the installed version of paddlepaddle is
+ not satisfied.
+ """
+ err = "PaddlePaddle version 1.6 or higher is required, " \
+ "or a suitable develop version is satisfied as well. \n" \
+ "Please make sure the version is good with your code." \
+
+ try:
+ fluid.require_version('1.6.0')
+ except Exception as e:
+ logger.error(err)
+ sys.exit(1)
diff --git a/PaddleCV/video/application/video_tag/video_tag.png b/PaddleCV/video/application/video_tag/video_tag.png
new file mode 100644
index 0000000000000000000000000000000000000000..50ada247073a8aaaf3d4004547f8772ddb31fcb2
Binary files /dev/null and b/PaddleCV/video/application/video_tag/video_tag.png differ
diff --git a/PaddleCV/video/application/video_tag/videotag_main.py b/PaddleCV/video/application/video_tag/videotag_main.py
new file mode 100644
index 0000000000000000000000000000000000000000..e5cb5fa9cc3a29a000748dc6078a69663d246c07
--- /dev/null
+++ b/PaddleCV/video/application/video_tag/videotag_main.py
@@ -0,0 +1,236 @@
+# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserve.
+#
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
+
+import os
+import sys
+import time
+import logging
+import argparse
+import ast
+import numpy as np
+import paddle.fluid as fluid
+
+from utils.config_utils import *
+import models
+from reader import get_reader
+from metrics import get_metrics
+from utils.utility import check_cuda
+from utils.utility import check_version
+
+logging.root.handlers = []
+FORMAT = '[%(levelname)s: %(filename)s: %(lineno)4d]: %(message)s'
+logging.basicConfig(level=logging.INFO, format=FORMAT, stream=sys.stdout)
+logger = logging.getLogger(__name__)
+
+
+def parse_args():
+ parser = argparse.ArgumentParser()
+ parser.add_argument(
+ '--extractor_config',
+ type=str,
+ default='configs/tsn.yaml',
+ help='path to config file of model')
+ parser.add_argument(
+ '--extractor_name',
+ type=str,
+ default='TSN',
+ help='extractor model name, default TSN')
+ parser.add_argument(
+ '--predictor_config',
+ '--pconfig',
+ type=str,
+ default='configs/attention_lstm.yaml',
+ help='path to config file of model')
+ parser.add_argument(
+ '--predictor_name',
+ '--pname',
+ type=str,
+ default='AttentionLSTM',
+ help='predictor model name, e.g. AttentionLSTM, AttentionCluster, NEXTVLAD'
+ )
+ parser.add_argument(
+ '--use_gpu',
+ type=ast.literal_eval,
+ default=True,
+ help='whether to use gpu, default True.')
+ parser.add_argument(
+ '--extractor_weights',
+ type=str,
+ default='weights/tsn',
+ help='extractor weight path')
+ parser.add_argument(
+ '--predictor_weights',
+ '--pweights',
+ type=str,
+ default='weights/attention_lstm',
+ help='predictor weight path')
+ parser.add_argument(
+ '--filelist',
+ type=str,
+ default=None,
+ help='path to a file listing the input videos, one path per line')
+ parser.add_argument(
+ '--save_dir', type=str, default='data/results', help='output file path')
+ parser.add_argument(
+ '--label_file',
+ type=str,
+ default='label_3396.txt',
+ help='chinese label file path')
+
+ args = parser.parse_args()
+ return args
+
+
+def main():
+ """
+ Video classification model of 3000 Chinese tags.
+ videotag_extractor_predictor (e.g. videotag_TSN_AttentionLSTM)
+ two stages in our model:
+ 1. extract feature from input video(mp4 format) using extractor
+ 2. predict classification results from extracted feature using predictor
+ we implement this using two name scopes, i.e. extractor_scope and predictor_scope.
+ """
+
+ if not os.path.isdir(args.save_dir):
+ os.makedirs(args.save_dir)
+ extractor_config = parse_config(args.extractor_config)
+ extractor_infer_config = merge_configs(extractor_config, 'infer',
+ vars(args))
+ extractor_start_time = time.time()
+ extractor_scope = fluid.Scope()
+ with fluid.scope_guard(extractor_scope):
+ extractor_startup_prog = fluid.Program()
+ extractor_main_prog = fluid.Program()
+ with fluid.program_guard(extractor_main_prog, extractor_startup_prog):
+ with fluid.unique_name.guard():
+ # build model
+ extractor_model = models.get_model(
+ args.extractor_name, extractor_infer_config, mode='infer')
+ extractor_model.build_input(use_dataloader=False)
+ extractor_model.build_model()
+ extractor_feeds = extractor_model.feeds()
+ extractor_fetch_list = extractor_model.fetches()
+
+ place = fluid.CUDAPlace(0) if args.use_gpu else fluid.CPUPlace()
+ exe = fluid.Executor(place)
+
+ exe.run(extractor_startup_prog)
+
+ logger.info('load extractor weights from {}'.format(
+ args.extractor_weights))
+ extractor_model.load_test_weights(exe, args.extractor_weights,
+ extractor_main_prog)
+
+ # get reader and metrics
+ extractor_reader = get_reader(args.extractor_name, 'infer',
+ extractor_infer_config)
+ extractor_feeder = fluid.DataFeeder(
+ place=place, feed_list=extractor_feeds)
+
+ feature_list = []
+ file_list = []
+ for idx, data in enumerate(extractor_reader()):
+ file_id = [item[-1] for item in data]
+ feed_data = [item[:-1] for item in data]
+ feature_out = exe.run(fetch_list=extractor_fetch_list,
+ feed=extractor_feeder.feed(feed_data))
+            feature_list.append(feature_out[0])  # exe.run returns a list of fetched vars; take the first
+ file_list.append(file_id)
+ logger.info(
+ '========[Stage 1 Sample {} ] Extractor finished======'.
+ format(idx))
+ extractor_end_time = time.time()
+ print('extractor_time', extractor_end_time - extractor_start_time)
+
+ predictor_config = parse_config(args.predictor_config)
+ predictor_infer_config = merge_configs(predictor_config, 'infer',
+ vars(args))
+
+ # get Predictor input from Extractor output
+ predictor_feed_list = []
+ for i in range(len(feature_list)):
+ feature_out = feature_list[i]
+ if args.predictor_name == "AttentionCluster":
+ extractor_seg_num = extractor_infer_config.INFER.seg_num
+ predictor_seg_num = predictor_infer_config.MODEL.seg_num
+ idxs = []
+ stride = float(extractor_seg_num) / predictor_seg_num
+ for j in range(predictor_seg_num):
+ pos = (j + np.random.random()) * stride
+ idxs.append(min(extractor_seg_num - 1, int(pos)))
+            extractor_feature = feature_out[:, idxs, :].astype(
+                float)  # resample along the segment dim, keep the batch dim
+ else:
+ extractor_feature = feature_out.astype(float)
+ predictor_feed_data = [extractor_feature]
+ predictor_feed_list.append((predictor_feed_data, file_list[i]))
+
+ predictor_start_time = time.time()
+ predictor_scope = fluid.Scope()
+ with fluid.scope_guard(predictor_scope):
+ predictor_startup_prog = fluid.Program()
+ predictor_main_prog = fluid.Program()
+ with fluid.program_guard(predictor_main_prog, predictor_startup_prog):
+ with fluid.unique_name.guard():
+ # parse config
+ predictor_model = models.get_model(
+ args.predictor_name, predictor_infer_config, mode='infer')
+ predictor_model.build_input(use_dataloader=False)
+ predictor_model.build_model()
+ predictor_feeds = predictor_model.feeds()
+
+ exe.run(predictor_startup_prog)
+
+ logger.info('load predictor weights from {}'.format(
+ args.predictor_weights))
+ predictor_model.load_test_weights(exe, args.predictor_weights,
+ predictor_main_prog)
+
+ predictor_feeder = fluid.DataFeeder(
+ place=place, feed_list=predictor_feeds)
+ predictor_fetch_list = predictor_model.fetches()
+ predictor_metrics = get_metrics(args.predictor_name.upper(),
+ 'infer', predictor_infer_config)
+ predictor_metrics.reset()
+
+ for idx, data in enumerate(predictor_feed_list):
+ file_id = data[1]
+ predictor_feed_data = data[0]
+ final_outs = exe.run(
+ fetch_list=predictor_fetch_list,
+ feed=predictor_feeder.feed(predictor_feed_data))
+ logger.info(
+ '=======[Stage 2 Sample {} ] Predictor finished========'.
+ format(idx))
+ final_result_list = [item
+ for item in final_outs] + [file_id]
+
+ predictor_metrics.accumulate(final_result_list)
+ predictor_metrics.finalize_and_log_out(
+ savedir=args.save_dir, label_file=args.label_file)
+ predictor_end_time = time.time()
+ print('predictor_time', predictor_end_time - predictor_start_time)
+
+
+if __name__ == '__main__':
+ start_time = time.time()
+ args = parse_args()
+ print(args)
+ check_cuda(args.use_gpu)
+ check_version()
+ logger.info(args)
+ main()
+ end_time = time.time()
+ period = end_time - start_time
+ print('[INFER] infer finished. cost time: {}'.format(period))
diff --git a/dygraph/mnist/train.py b/dygraph/mnist/train.py
index f81df8f26458c93c1f658a9bc783d14a3c5b8256..58db6f1d728090cc63b0b802e7f765c37c5036aa 100644
--- a/dygraph/mnist/train.py
+++ b/dygraph/mnist/train.py
@@ -99,11 +99,13 @@ class MNIST(fluid.dygraph.Layer):
self.pool_2_shape = 50 * 4 * 4
SIZE = 10
scale = (2.0 / (self.pool_2_shape**2 * SIZE))**0.5
- self._fc = Linear(self.pool_2_shape, 10,
- param_attr=fluid.param_attr.ParamAttr(
- initializer=fluid.initializer.NormalInitializer(
- loc=0.0, scale=scale)),
- act="softmax")
+ self._fc = Linear(
+ self.pool_2_shape,
+ 10,
+ param_attr=fluid.param_attr.ParamAttr(
+ initializer=fluid.initializer.NormalInitializer(
+ loc=0.0, scale=scale)),
+ act="softmax")
def forward(self, inputs, label=None):
x = self._simple_img_conv_pool_1(inputs)
@@ -117,17 +119,21 @@ class MNIST(fluid.dygraph.Layer):
return x
+def reader_decorator(reader):
+ def __reader__():
+ for item in reader():
+ img = np.array(item[0]).astype('float32').reshape(1, 28, 28)
+ label = np.array(item[1]).astype('int64').reshape(1)
+ yield img, label
+
+ return __reader__
+
+
def test_mnist(reader, model, batch_size):
acc_set = []
avg_loss_set = []
for batch_id, data in enumerate(reader()):
- dy_x_data = np.array([x[0].reshape(1, 28, 28)
- for x in data]).astype('float32')
- y_data = np.array(
- [x[1] for x in data]).astype('int64').reshape(batch_size, 1)
-
- img = to_variable(dy_x_data)
- label = to_variable(y_data)
+ img, label = data
label.stop_gradient = True
prediction, acc = model(img, label)
loss = fluid.layers.cross_entropy(input=prediction, label=label)
@@ -187,28 +193,33 @@ def train_mnist(args):
if args.use_data_parallel:
strategy = fluid.dygraph.parallel.prepare_context()
mnist = MNIST()
- adam = AdamOptimizer(learning_rate=0.001, parameter_list=mnist.parameters())
+ adam = AdamOptimizer(
+ learning_rate=0.001, parameter_list=mnist.parameters())
if args.use_data_parallel:
mnist = fluid.dygraph.parallel.DataParallel(mnist, strategy)
train_reader = paddle.batch(
- paddle.dataset.mnist.train(), batch_size=BATCH_SIZE, drop_last=True)
+ reader_decorator(paddle.dataset.mnist.train()),
+ batch_size=BATCH_SIZE,
+ drop_last=True)
if args.use_data_parallel:
train_reader = fluid.contrib.reader.distributed_batch_reader(
train_reader)
test_reader = paddle.batch(
- paddle.dataset.mnist.test(), batch_size=BATCH_SIZE, drop_last=True)
+ reader_decorator(paddle.dataset.mnist.test()),
+ batch_size=BATCH_SIZE,
+ drop_last=True)
+
+ train_loader = fluid.io.DataLoader.from_generator(capacity=10)
+ train_loader.set_sample_list_generator(train_reader, places=place)
+
+ test_loader = fluid.io.DataLoader.from_generator(capacity=10)
+ test_loader.set_sample_list_generator(test_reader, places=place)
for epoch in range(epoch_num):
- for batch_id, data in enumerate(train_reader()):
- dy_x_data = np.array([x[0].reshape(1, 28, 28)
- for x in data]).astype('float32')
- y_data = np.array(
- [x[1] for x in data]).astype('int64').reshape(-1, 1)
-
- img = to_variable(dy_x_data)
- label = to_variable(y_data)
+ for batch_id, data in enumerate(train_loader()):
+ img, label = data
label.stop_gradient = True
cost, acc = mnist(img, label)
@@ -231,7 +242,7 @@ def train_mnist(args):
epoch, batch_id, avg_loss.numpy()))
mnist.eval()
- test_cost, test_acc = test_mnist(test_reader, mnist, BATCH_SIZE)
+ test_cost, test_acc = test_mnist(test_loader, mnist, BATCH_SIZE)
mnist.train()
if args.ce:
print("kpis\ttest_acc\t%s" % test_acc)
@@ -244,7 +255,7 @@ def train_mnist(args):
fluid.dygraph.parallel.Env().local_rank == 0)
if save_parameters:
fluid.save_dygraph(mnist.state_dict(), "save_temp")
-
+
print("checkpoint saved")
inference_mnist()
diff --git a/dygraph/mobilenet/reader.py b/dygraph/mobilenet/reader.py
index bba33c355ba02983c5d9d54b3bc5f2535d53cfb1..e598d19a3b44fdfcea31abd3f909c5639ba22d45 100644
--- a/dygraph/mobilenet/reader.py
+++ b/dygraph/mobilenet/reader.py
@@ -239,7 +239,7 @@ def process_image(sample, settings, mode, color_jitter, rotate):
img /= img_std
if mode == 'train' or mode == 'val':
- return (img, sample[1])
+ return (img, [sample[1]])
elif mode == 'test':
return (img, )
diff --git a/dygraph/mobilenet/train.py b/dygraph/mobilenet/train.py
index 16e27dc4fbc22675e2446dbc5ff146e1b6b5b909..547e9d45506b7cec9f84e7543d8a60fea2fadc9c 100644
--- a/dygraph/mobilenet/train.py
+++ b/dygraph/mobilenet/train.py
@@ -116,10 +116,8 @@ def train_mobilenet():
optimizer.set_dict(opti_dict)
# 3. reader
- train_data_loader, train_data = utility.create_data_loader(
- is_train=True, args=args)
- test_data_loader, test_data = utility.create_data_loader(
- is_train=False, args=args)
+ train_data_loader = utility.create_data_loader(is_train=True, args=args)
+ test_data_loader = utility.create_data_loader(is_train=False, args=args)
num_trainers = int(os.environ.get('PADDLE_TRAINERS_NUM', 1))
imagenet_reader = reader.ImageNetReader(seed=0, place_num=place_num)
train_reader = imagenet_reader.train(settings=args)
@@ -145,8 +143,6 @@ def train_mobilenet():
t1 = time.time()
if args.max_iter and total_batch_num == args.max_iter:
return
- label = to_variable(label.numpy().astype('int64').reshape(
- int(args.batch_size // place_num), 1))
t_start = time.time()
# 4.1.1 call net()
diff --git a/dygraph/mobilenet/utils/utility.py b/dygraph/mobilenet/utils/utility.py
index a7bc9c883edba2e6115d3fe96a61e569b5d7407a..22314941adb4f5ee2399147562310054a3392448 100644
--- a/dygraph/mobilenet/utils/utility.py
+++ b/dygraph/mobilenet/utils/utility.py
@@ -309,32 +309,14 @@ def create_data_loader(is_train, args):
Returns:
data_loader and the input data of net,
"""
- image_shape = [int(m) for m in args.image_shape.split(",")]
-
- feed_image = fluid.data(
- name="feed_image",
- shape=[None] + image_shape,
- dtype="float32",
- lod_level=0)
-
- feed_label = fluid.data(
- name="feed_label", shape=[None, 1], dtype="int64", lod_level=0)
- feed_y_a = fluid.data(
- name="feed_y_a", shape=[None, 1], dtype="int64", lod_level=0)
-
if is_train and args.use_mixup:
- feed_y_b = fluid.data(
- name="feed_y_b", shape=[None, 1], dtype="int64", lod_level=0)
- feed_lam = fluid.data(
- name="feed_lam", shape=[None, 1], dtype="float32", lod_level=0)
-
data_loader = fluid.io.DataLoader.from_generator(
capacity=64,
use_double_buffer=True,
iterable=True,
return_list=True)
- return data_loader, [feed_image, feed_y_a, feed_y_b, feed_lam]
+ return data_loader
else:
data_loader = fluid.io.DataLoader.from_generator(
capacity=64,
@@ -342,7 +324,7 @@ def create_data_loader(is_train, args):
iterable=True,
return_list=True)
- return data_loader, [feed_image, feed_label]
+ return data_loader
def print_info(pass_id, batch_id, print_step, metrics, time_info, info_mode):
diff --git a/dygraph/ptb_lm/ptb_dy.py b/dygraph/ptb_lm/ptb_dy.py
index d33e64194c33c5a4c7ddedbda405daa58fe330ae..0a8ed9494e16937ac1fac4068e37b2a6415212bb 100644
--- a/dygraph/ptb_lm/ptb_dy.py
+++ b/dygraph/ptb_lm/ptb_dy.py
@@ -1,461 +1,474 @@
-# Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-# http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-from __future__ import print_function
-
-import os
-import unittest
-import paddle.fluid as fluid
-import paddle.fluid.core as core
-from paddle.fluid.dygraph.nn import Embedding
-import paddle.fluid.framework as framework
-from paddle.fluid.optimizer import SGDOptimizer
-from paddle.fluid.dygraph.base import to_variable
-import numpy as np
-import six
-import multiprocessing
-
-import reader
-import model_check
-import time
-
-from args import *
-
-#import fluid.clip as clip
-#from fluid.clip import *
-
-import sys
-if sys.version[0] == '2':
- reload(sys)
- sys.setdefaultencoding("utf-8")
-
-
-class SimpleLSTMRNN(fluid.Layer):
- def __init__(self,
- hidden_size,
- num_steps,
- num_layers=2,
- init_scale=0.1,
- dropout=None):
- super(SimpleLSTMRNN, self).__init__()
- self._hidden_size = hidden_size
- self._num_layers = num_layers
- self._init_scale = init_scale
- self._dropout = dropout
- self._num_steps = num_steps
- self.cell_array = []
- self.hidden_array = []
-
- self.weight_1_arr = []
- self.weight_2_arr = []
- self.bias_arr = []
- self.mask_array = []
-
- for i in range(self._num_layers):
- weight_1 = self.create_parameter(
- attr=fluid.ParamAttr(
- initializer=fluid.initializer.UniformInitializer(
- low=-self._init_scale, high=self._init_scale)),
- shape=[self._hidden_size * 2, self._hidden_size * 4],
- dtype="float32",
- default_initializer=fluid.initializer.UniformInitializer(
- low=-self._init_scale, high=self._init_scale))
- self.weight_1_arr.append(self.add_parameter('w_%d' % i, weight_1))
- bias_1 = self.create_parameter(
- attr=fluid.ParamAttr(
- initializer=fluid.initializer.UniformInitializer(
- low=-self._init_scale, high=self._init_scale)),
- shape=[self._hidden_size * 4],
- dtype="float32",
- default_initializer=fluid.initializer.Constant(0.0))
- self.bias_arr.append(self.add_parameter('b_%d' % i, bias_1))
-
- def forward(self, input_embedding, init_hidden=None, init_cell=None):
- cell_array = []
- hidden_array = []
-
- for i in range(self._num_layers):
- hidden_array.append(init_hidden[i])
- cell_array.append(init_cell[i])
-
- res = []
- for index in range(self._num_steps):
- step_input = input_embedding[:,index,:]
- for k in range(self._num_layers):
- pre_hidden = hidden_array[k]
- pre_cell = cell_array[k]
- weight_1 = self.weight_1_arr[k]
- bias = self.bias_arr[k]
-
- nn = fluid.layers.concat([step_input, pre_hidden], 1)
- gate_input = fluid.layers.matmul(x=nn, y=weight_1)
-
- gate_input = fluid.layers.elementwise_add(gate_input, bias)
- i, j, f, o = fluid.layers.split(
- gate_input, num_or_sections=4, dim=-1)
- c = pre_cell * fluid.layers.sigmoid(f) + fluid.layers.sigmoid(
- i) * fluid.layers.tanh(j)
- m = fluid.layers.tanh(c) * fluid.layers.sigmoid(o)
- hidden_array[k] = m
- cell_array[k] = c
- step_input = m
-
- if self._dropout is not None and self._dropout > 0.0:
- step_input = fluid.layers.dropout(
- step_input,
- dropout_prob=self._dropout,
- dropout_implementation='upscale_in_train')
- res.append(step_input)
- real_res = fluid.layers.concat(res, 1)
- real_res = fluid.layers.reshape(real_res, [ -1, self._num_steps, self._hidden_size])
- last_hidden = fluid.layers.concat(hidden_array, 1)
- last_hidden = fluid.layers.reshape(
- last_hidden, shape=[-1, self._num_layers, self._hidden_size])
- last_hidden = fluid.layers.transpose(x=last_hidden, perm=[1, 0, 2])
- last_cell = fluid.layers.concat(cell_array, 1)
- last_cell = fluid.layers.reshape(
- last_cell, shape=[-1, self._num_layers, self._hidden_size])
- last_cell = fluid.layers.transpose(x=last_cell, perm=[1, 0, 2])
- return real_res, last_hidden, last_cell
-
-
-class PtbModel(fluid.Layer):
- def __init__(self,
- hidden_size,
- vocab_size,
- num_layers=2,
- num_steps=20,
- init_scale=0.1,
- dropout=None):
- super(PtbModel, self).__init__()
- self.hidden_size = hidden_size
- self.vocab_size = vocab_size
- self.init_scale = init_scale
- self.num_layers = num_layers
- self.num_steps = num_steps
- self.dropout = dropout
- self.simple_lstm_rnn = SimpleLSTMRNN(
- hidden_size,
- num_steps,
- num_layers=num_layers,
- init_scale=init_scale,
- dropout=dropout)
- self.embedding = Embedding(
- size=[vocab_size, hidden_size],
- dtype='float32',
- is_sparse=False,
- param_attr=fluid.ParamAttr(
- name='embedding_para',
- initializer=fluid.initializer.UniformInitializer(
- low=-init_scale, high=init_scale)))
- self.softmax_weight = self.create_parameter(
- attr=fluid.ParamAttr(),
- shape=[self.hidden_size, self.vocab_size],
- dtype="float32",
- default_initializer=fluid.initializer.UniformInitializer(
- low=-self.init_scale, high=self.init_scale))
- self.softmax_bias = self.create_parameter(
- attr=fluid.ParamAttr(),
- shape=[self.vocab_size],
- dtype="float32",
- default_initializer=fluid.initializer.UniformInitializer(
- low=-self.init_scale, high=self.init_scale))
-
- def build_once(self, input, label, init_hidden, init_cell):
- pass
-
- def forward(self, input, label, init_hidden, init_cell):
-
- init_h = fluid.layers.reshape(
- init_hidden, shape=[self.num_layers, -1, self.hidden_size])
-
- init_c = fluid.layers.reshape(
- init_cell, shape=[self.num_layers, -1, self.hidden_size])
-
- x_emb = self.embedding(input)
-
- x_emb = fluid.layers.reshape(
- x_emb, shape=[-1, self.num_steps, self.hidden_size])
- if self.dropout is not None and self.dropout > 0.0:
- x_emb = fluid.layers.dropout(
- x_emb,
- dropout_prob=self.dropout,
- dropout_implementation='upscale_in_train')
- rnn_out, last_hidden, last_cell = self.simple_lstm_rnn(x_emb, init_h,
- init_c)
-
- projection = fluid.layers.matmul(rnn_out, self.softmax_weight)
- projection = fluid.layers.elementwise_add(projection, self.softmax_bias)
-
- loss = fluid.layers.softmax_with_cross_entropy(
- logits=projection, label=label, soft_label=False)
- loss = fluid.layers.reshape(loss, shape=[-1, self.num_steps])
- loss = fluid.layers.reduce_mean(loss, dim=[0])
- loss = fluid.layers.reduce_sum(loss)
-
- return loss, last_hidden, last_cell
-
- def debug_emb(self):
-
- np.save("emb_grad", self.x_emb.gradient())
-
-
-def train_ptb_lm():
- args = parse_args()
-
- # check if set use_gpu=True in paddlepaddle cpu version
- model_check.check_cuda(args.use_gpu)
-
- place = core.CPUPlace()
- if args.use_gpu:
- place = fluid.CUDAPlace(0)
- dev_count = fluid.core.get_cuda_device_count()
- else:
- place = fluid.CPUPlace()
- dev_count = int(os.environ.get('CPU_NUM', multiprocessing.cpu_count()))
-
- # check if paddlepaddle version is satisfied
- model_check.check_version()
-
- model_type = args.model_type
-
- vocab_size = 10000
- if model_type == "test":
- num_layers = 1
- batch_size = 2
- hidden_size = 10
- num_steps = 3
- init_scale = 0.1
- max_grad_norm = 5.0
- epoch_start_decay = 1
- max_epoch = 1
- dropout = 0.0
- lr_decay = 0.5
- base_learning_rate = 1.0
- elif model_type == "small":
- num_layers = 2
- batch_size = 20
- hidden_size = 200
- num_steps = 20
- init_scale = 0.1
- max_grad_norm = 5.0
- epoch_start_decay = 4
- max_epoch = 13
- dropout = 0.0
- lr_decay = 0.5
- base_learning_rate = 1.0
- elif model_type == "medium":
- num_layers = 2
- batch_size = 20
- hidden_size = 650
- num_steps = 35
- init_scale = 0.05
- max_grad_norm = 5.0
- epoch_start_decay = 6
- max_epoch = 39
- dropout = 0.5
- lr_decay = 0.8
- base_learning_rate = 1.0
- elif model_type == "large":
- num_layers = 2
- batch_size = 20
- hidden_size = 1500
- num_steps = 35
- init_scale = 0.04
- max_grad_norm = 10.0
- epoch_start_decay = 14
- max_epoch = 55
- dropout = 0.65
- lr_decay = 1.0 / 1.15
- base_learning_rate = 1.0
- else:
- print("model type not support")
- return
-
- with fluid.dygraph.guard(place):
- if args.ce:
- print("ce mode")
- seed = 33
- np.random.seed(seed)
- fluid.default_startup_program().random_seed = seed
- fluid.default_main_program().random_seed = seed
- max_epoch = 1
- ptb_model = PtbModel(
- hidden_size=hidden_size,
- vocab_size=vocab_size,
- num_layers=num_layers,
- num_steps=num_steps,
- init_scale=init_scale,
- dropout=dropout)
-
- if args.init_from_pretrain_model:
- if not os.path.exists(args.init_from_pretrain_model + '.pdparams'):
- print(args.init_from_pretrain_model)
- raise Warning("The pretrained params do not exist.")
- return
- fluid.load_dygraph(args.init_from_pretrain_model)
- print("finish initing model from pretrained params from %s" %
- (args.init_from_pretrain_model))
-
- dy_param_updated = dict()
- dy_param_init = dict()
- dy_loss = None
- last_hidden = None
- last_cell = None
-
- data_path = args.data_path
- print("begin to load data")
- ptb_data = reader.get_ptb_data(data_path)
- print("finished load data")
- train_data, valid_data, test_data = ptb_data
-
- batch_len = len(train_data) // batch_size
- total_batch_size = (batch_len - 1) // num_steps
- log_interval = 200
-
- bd = []
- lr_arr = [1.0]
- for i in range(1, max_epoch):
- bd.append(total_batch_size * i)
- new_lr = base_learning_rate * (lr_decay**
- max(i + 1 - epoch_start_decay, 0.0))
- lr_arr.append(new_lr)
-
- grad_clip = fluid.clip.GradientClipByGlobalNorm(max_grad_norm)
- sgd = SGDOptimizer(
- learning_rate=fluid.layers.piecewise_decay(boundaries=bd, values=lr_arr),
- parameter_list=ptb_model.parameters(),
- grad_clip=grad_clip)
-
- def eval(model, data):
- print("begin to eval")
- total_loss = 0.0
- iters = 0.0
- init_hidden_data = np.zeros(
- (num_layers, batch_size, hidden_size), dtype='float32')
- init_cell_data = np.zeros(
- (num_layers, batch_size, hidden_size), dtype='float32')
-
- model.eval()
- train_data_iter = reader.get_data_iter(data, batch_size, num_steps)
- for batch_id, batch in enumerate(train_data_iter):
- x_data, y_data = batch
- x_data = x_data.reshape((-1, num_steps, 1))
- y_data = y_data.reshape((-1, num_steps, 1))
- x = to_variable(x_data)
- y = to_variable(y_data)
- init_hidden = to_variable(init_hidden_data)
- init_cell = to_variable(init_cell_data)
- dy_loss, last_hidden, last_cell = ptb_model(x, y, init_hidden,
- init_cell)
-
- out_loss = dy_loss.numpy()
-
- init_hidden_data = last_hidden.numpy()
- init_cell_data = last_cell.numpy()
-
- total_loss += out_loss
- iters += num_steps
-
- print("eval finished")
- ppl = np.exp(total_loss / iters)
- print("ppl ", batch_id, ppl[0])
-
- ce_time = []
- ce_ppl = []
-
- total_batch_num = 0 #this is for benchmark
- for epoch_id in range(max_epoch):
- ptb_model.train()
- total_loss = 0.0
- iters = 0.0
- init_hidden_data = np.zeros(
- (num_layers, batch_size, hidden_size), dtype='float32')
- init_cell_data = np.zeros(
- (num_layers, batch_size, hidden_size), dtype='float32')
-
- train_data_iter = reader.get_data_iter(train_data, batch_size,
- num_steps)
- init_hidden = to_variable(init_hidden_data)
- init_cell = to_variable(init_cell_data)
- start_time = time.time()
- for batch_id, batch in enumerate(train_data_iter):
- if args.max_iter and total_batch_num == args.max_iter:
- return
- batch_start = time.time()
- x_data, y_data = batch
-
- x_data = x_data.reshape((-1, num_steps, 1))
- y_data = y_data.reshape((-1, num_steps, 1))
-
- x = to_variable(x_data)
- y = to_variable(y_data)
-
- dy_loss, last_hidden, last_cell = ptb_model(x, y, init_hidden,
- init_cell)
- init_hidden = last_hidden.detach()
- init_cell = last_cell.detach()
- out_loss = dy_loss.numpy()
-
- dy_loss.backward()
- sgd.minimize(dy_loss)
-
- ptb_model.clear_gradients()
- total_loss += out_loss
- batch_end = time.time()
- train_batch_cost = batch_end - batch_start
- iters += num_steps
- total_batch_num = total_batch_num + 1 #this is for benchmark
-
- if batch_id > 0 and batch_id % log_interval == 0:
- ppl = np.exp(total_loss / iters)
- print("-- Epoch:[%d]; Batch:[%d]; ppl: %.5f, lr: %.5f, loss: %.5f, batch cost: %.5f" %
- (epoch_id, batch_id, ppl[0],
- sgd._global_learning_rate().numpy(), out_loss, train_batch_cost))
-
- print("one epoch finished", epoch_id)
- print("time cost ", time.time() - start_time)
- ppl = np.exp(total_loss / iters)
- ce_time.append(time.time() - start_time)
- ce_ppl.append(ppl[0])
- print("-- Epoch:[%d]; ppl: %.5f" % (epoch_id, ppl[0]))
-
- if batch_size <= 20 and epoch_id == 0 and ppl[0] > 1000:
- # for bad init, after first epoch, the loss is over 1000
- # no more need to continue
- print("Parameters are randomly initialized and not good this time because the loss is over 1000 after the first epoch.")
- print("Abort this training process and please start again.")
- return
-
- save_model_dir = os.path.join(args.save_model_dir,
- str(epoch_id), 'params')
- fluid.save_dygraph(ptb_model.state_dict(), save_model_dir)
- print("Saved model to: %s.\n" % save_model_dir)
-
- eval(ptb_model, valid_data)
-
- if args.ce:
- _ppl = 0
- _time = 0
- try:
- _time = ce_time[-1]
- _ppl = ce_ppl[-1]
- except:
- print("ce info error")
- print("kpis\ttrain_duration_card%s\t%s" % (dev_count, _time))
- print("kpis\ttrain_ppl_card%s\t%f" % (dev_count, _ppl))
-
- eval(ptb_model, test_data)
-
-train_ptb_lm()
+# Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from __future__ import print_function
+
+import os
+import unittest
+import paddle.fluid as fluid
+import paddle.fluid.core as core
+from paddle.fluid.dygraph.nn import Embedding
+import paddle.fluid.framework as framework
+from paddle.fluid.optimizer import SGDOptimizer
+from paddle.fluid.dygraph.base import to_variable
+import numpy as np
+import six
+import multiprocessing
+
+import reader
+import model_check
+import time
+
+from args import *
+
+#import fluid.clip as clip
+#from fluid.clip import *
+
+import sys
+if sys.version[0] == '2':
+ reload(sys)
+ sys.setdefaultencoding("utf-8")
+
+
+class SimpleLSTMRNN(fluid.Layer):
+ def __init__(self,
+ hidden_size,
+ num_steps,
+ num_layers=2,
+ init_scale=0.1,
+ dropout=None):
+ super(SimpleLSTMRNN, self).__init__()
+ self._hidden_size = hidden_size
+ self._num_layers = num_layers
+ self._init_scale = init_scale
+ self._dropout = dropout
+ self._num_steps = num_steps
+ self.cell_array = []
+ self.hidden_array = []
+
+ self.weight_1_arr = []
+ self.weight_2_arr = []
+ self.bias_arr = []
+ self.mask_array = []
+
+ for i in range(self._num_layers):
+ weight_1 = self.create_parameter(
+ attr=fluid.ParamAttr(
+ initializer=fluid.initializer.UniformInitializer(
+ low=-self._init_scale, high=self._init_scale)),
+ shape=[self._hidden_size * 2, self._hidden_size * 4],
+ dtype="float32",
+ default_initializer=fluid.initializer.UniformInitializer(
+ low=-self._init_scale, high=self._init_scale))
+ self.weight_1_arr.append(self.add_parameter('w_%d' % i, weight_1))
+ bias_1 = self.create_parameter(
+ attr=fluid.ParamAttr(
+ initializer=fluid.initializer.UniformInitializer(
+ low=-self._init_scale, high=self._init_scale)),
+ shape=[self._hidden_size * 4],
+ dtype="float32",
+ default_initializer=fluid.initializer.Constant(0.0))
+ self.bias_arr.append(self.add_parameter('b_%d' % i, bias_1))
+
+ def forward(self, input_embedding, init_hidden=None, init_cell=None):
+ cell_array = []
+ hidden_array = []
+
+ for i in range(self._num_layers):
+ hidden_array.append(init_hidden[i])
+ cell_array.append(init_cell[i])
+
+ res = []
+ for index in range(self._num_steps):
+ step_input = input_embedding[:, index, :]
+ for k in range(self._num_layers):
+ pre_hidden = hidden_array[k]
+ pre_cell = cell_array[k]
+ weight_1 = self.weight_1_arr[k]
+ bias = self.bias_arr[k]
+
+ nn = fluid.layers.concat([step_input, pre_hidden], 1)
+ gate_input = fluid.layers.matmul(x=nn, y=weight_1)
+
+ gate_input = fluid.layers.elementwise_add(gate_input, bias)
+ i, j, f, o = fluid.layers.split(
+ gate_input, num_or_sections=4, dim=-1)
+ c = pre_cell * fluid.layers.sigmoid(f) + fluid.layers.sigmoid(
+ i) * fluid.layers.tanh(j)
+ m = fluid.layers.tanh(c) * fluid.layers.sigmoid(o)
+ hidden_array[k] = m
+ cell_array[k] = c
+ step_input = m
+
+ if self._dropout is not None and self._dropout > 0.0:
+ step_input = fluid.layers.dropout(
+ step_input,
+ dropout_prob=self._dropout,
+ dropout_implementation='upscale_in_train')
+ res.append(step_input)
+ real_res = fluid.layers.concat(res, 1)
+ real_res = fluid.layers.reshape(
+ real_res, [-1, self._num_steps, self._hidden_size])
+ last_hidden = fluid.layers.concat(hidden_array, 1)
+ last_hidden = fluid.layers.reshape(
+ last_hidden, shape=[-1, self._num_layers, self._hidden_size])
+ last_hidden = fluid.layers.transpose(x=last_hidden, perm=[1, 0, 2])
+ last_cell = fluid.layers.concat(cell_array, 1)
+ last_cell = fluid.layers.reshape(
+ last_cell, shape=[-1, self._num_layers, self._hidden_size])
+ last_cell = fluid.layers.transpose(x=last_cell, perm=[1, 0, 2])
+ return real_res, last_hidden, last_cell
+
+
+class PtbModel(fluid.Layer):
+ def __init__(self,
+ hidden_size,
+ vocab_size,
+ num_layers=2,
+ num_steps=20,
+ init_scale=0.1,
+ dropout=None):
+ super(PtbModel, self).__init__()
+ self.hidden_size = hidden_size
+ self.vocab_size = vocab_size
+ self.init_scale = init_scale
+ self.num_layers = num_layers
+ self.num_steps = num_steps
+ self.dropout = dropout
+ self.simple_lstm_rnn = SimpleLSTMRNN(
+ hidden_size,
+ num_steps,
+ num_layers=num_layers,
+ init_scale=init_scale,
+ dropout=dropout)
+ self.embedding = Embedding(
+ size=[vocab_size, hidden_size],
+ dtype='float32',
+ is_sparse=False,
+ param_attr=fluid.ParamAttr(
+ name='embedding_para',
+ initializer=fluid.initializer.UniformInitializer(
+ low=-init_scale, high=init_scale)))
+ self.softmax_weight = self.create_parameter(
+ attr=fluid.ParamAttr(),
+ shape=[self.hidden_size, self.vocab_size],
+ dtype="float32",
+ default_initializer=fluid.initializer.UniformInitializer(
+ low=-self.init_scale, high=self.init_scale))
+ self.softmax_bias = self.create_parameter(
+ attr=fluid.ParamAttr(),
+ shape=[self.vocab_size],
+ dtype="float32",
+ default_initializer=fluid.initializer.UniformInitializer(
+ low=-self.init_scale, high=self.init_scale))
+
+ def build_once(self, input, label, init_hidden, init_cell):
+ pass
+
+ def forward(self, input, label, init_hidden, init_cell):
+
+ init_h = fluid.layers.reshape(
+ init_hidden, shape=[self.num_layers, -1, self.hidden_size])
+
+ init_c = fluid.layers.reshape(
+ init_cell, shape=[self.num_layers, -1, self.hidden_size])
+
+ x_emb = self.embedding(input)
+
+ x_emb = fluid.layers.reshape(
+ x_emb, shape=[-1, self.num_steps, self.hidden_size])
+ if self.dropout is not None and self.dropout > 0.0:
+ x_emb = fluid.layers.dropout(
+ x_emb,
+ dropout_prob=self.dropout,
+ dropout_implementation='upscale_in_train')
+ rnn_out, last_hidden, last_cell = self.simple_lstm_rnn(x_emb, init_h,
+ init_c)
+
+ projection = fluid.layers.matmul(rnn_out, self.softmax_weight)
+ projection = fluid.layers.elementwise_add(projection, self.softmax_bias)
+
+ loss = fluid.layers.softmax_with_cross_entropy(
+ logits=projection, label=label, soft_label=False)
+ loss = fluid.layers.reshape(loss, shape=[-1, self.num_steps])
+ loss = fluid.layers.reduce_mean(loss, dim=[0])
+ loss = fluid.layers.reduce_sum(loss)
+
+ return loss, last_hidden, last_cell
+
+ def debug_emb(self):
+
+ np.save("emb_grad", self.x_emb.gradient())
+
+
+def train_ptb_lm():
+ args = parse_args()
+
+ # check if set use_gpu=True in paddlepaddle cpu version
+ model_check.check_cuda(args.use_gpu)
+
+    if args.use_gpu:
+        place = fluid.CUDAPlace(0)
+        dev_count = fluid.core.get_cuda_device_count()
+    else:
+        place = fluid.CPUPlace()
+        dev_count = int(os.environ.get('CPU_NUM', multiprocessing.cpu_count()))
+
+    # check that the installed PaddlePaddle version meets the requirement
+ model_check.check_version()
+
+ model_type = args.model_type
+
+ vocab_size = 10000
+ if model_type == "test":
+ num_layers = 1
+ batch_size = 2
+ hidden_size = 10
+ num_steps = 3
+ init_scale = 0.1
+ max_grad_norm = 5.0
+ epoch_start_decay = 1
+ max_epoch = 1
+ dropout = 0.0
+ lr_decay = 0.5
+ base_learning_rate = 1.0
+ elif model_type == "small":
+ num_layers = 2
+ batch_size = 20
+ hidden_size = 200
+ num_steps = 20
+ init_scale = 0.1
+ max_grad_norm = 5.0
+ epoch_start_decay = 4
+ max_epoch = 13
+ dropout = 0.0
+ lr_decay = 0.5
+ base_learning_rate = 1.0
+ elif model_type == "medium":
+ num_layers = 2
+ batch_size = 20
+ hidden_size = 650
+ num_steps = 35
+ init_scale = 0.05
+ max_grad_norm = 5.0
+ epoch_start_decay = 6
+ max_epoch = 39
+ dropout = 0.5
+ lr_decay = 0.8
+ base_learning_rate = 1.0
+ elif model_type == "large":
+ num_layers = 2
+ batch_size = 20
+ hidden_size = 1500
+ num_steps = 35
+ init_scale = 0.04
+ max_grad_norm = 10.0
+ epoch_start_decay = 14
+ max_epoch = 55
+ dropout = 0.65
+ lr_decay = 1.0 / 1.15
+ base_learning_rate = 1.0
+ else:
+        print("model type %s is not supported" % model_type)
+ return
+
+ with fluid.dygraph.guard(place):
+ if args.ce:
+ print("ce mode")
+ seed = 33
+ np.random.seed(seed)
+ fluid.default_startup_program().random_seed = seed
+ fluid.default_main_program().random_seed = seed
+ max_epoch = 1
+ ptb_model = PtbModel(
+ hidden_size=hidden_size,
+ vocab_size=vocab_size,
+ num_layers=num_layers,
+ num_steps=num_steps,
+ init_scale=init_scale,
+ dropout=dropout)
+
+ if args.init_from_pretrain_model:
+ if not os.path.exists(args.init_from_pretrain_model + '.pdparams'):
+                raise ValueError("The pretrained params do not exist at %s.pdparams" %
+                                 args.init_from_pretrain_model)
+            param_state, _ = fluid.load_dygraph(args.init_from_pretrain_model)
+            ptb_model.set_dict(param_state)
+            print("finished initializing the model from pretrained params at %s" %
+                  args.init_from_pretrain_model)
+
+ dy_param_updated = dict()
+ dy_param_init = dict()
+ dy_loss = None
+ last_hidden = None
+ last_cell = None
+
+ data_path = args.data_path
+ print("begin to load data")
+ ptb_data = reader.get_ptb_data(data_path)
+        print("finished loading data")
+ train_data, valid_data, test_data = ptb_data
+
+ batch_len = len(train_data) // batch_size
+ total_batch_size = (batch_len - 1) // num_steps
+ log_interval = 200
+
+ bd = []
+        lr_arr = [base_learning_rate]  # learning rate used before the first boundary
+ for i in range(1, max_epoch):
+ bd.append(total_batch_size * i)
+ new_lr = base_learning_rate * (lr_decay**
+ max(i + 1 - epoch_start_decay, 0.0))
+ lr_arr.append(new_lr)
+
+ grad_clip = fluid.clip.GradientClipByGlobalNorm(max_grad_norm)
+ sgd = SGDOptimizer(
+ learning_rate=fluid.layers.piecewise_decay(
+ boundaries=bd, values=lr_arr),
+ parameter_list=ptb_model.parameters(),
+ grad_clip=grad_clip)
+
+ def reader_decorator(reader):
+ def __reader__():
+ for item in reader:
+ x_data = item[0].reshape((-1, num_steps, 1))
+ y_data = item[1].reshape((-1, num_steps, 1))
+ yield x_data, y_data
+
+ return __reader__
+
+ def eval(model, data):
+ print("begin to eval")
+ total_loss = 0.0
+ iters = 0.0
+ init_hidden_data = np.zeros(
+ (num_layers, batch_size, hidden_size), dtype='float32')
+ init_cell_data = np.zeros(
+ (num_layers, batch_size, hidden_size), dtype='float32')
+
+ model.eval()
+            eval_data_iter = reader_decorator(
+                reader.get_data_iter(data, batch_size, num_steps))
+
+            eval_data_loader = fluid.io.DataLoader.from_generator(capacity=200)
+            eval_data_loader.set_batch_generator(eval_data_iter, places=place)
+
+ for batch_id, batch in enumerate(eval_data_loader):
+ x, y = batch
+ init_hidden = to_variable(init_hidden_data)
+ init_cell = to_variable(init_cell_data)
+                dy_loss, last_hidden, last_cell = model(x, y, init_hidden,
+                                                        init_cell)
+
+ out_loss = dy_loss.numpy()
+
+ init_hidden_data = last_hidden.numpy()
+ init_cell_data = last_cell.numpy()
+
+ total_loss += out_loss
+ iters += num_steps
+
+ print("eval finished")
+            ppl = np.exp(total_loss / iters)
+            print("eval ppl: %.5f" % ppl[0])
+
+ ce_time = []
+ ce_ppl = []
+
+        total_batch_num = 0  # iteration counter for the max_iter benchmark mode
+ for epoch_id in range(max_epoch):
+ ptb_model.train()
+ total_loss = 0.0
+ iters = 0.0
+ init_hidden_data = np.zeros(
+ (num_layers, batch_size, hidden_size), dtype='float32')
+ init_cell_data = np.zeros(
+ (num_layers, batch_size, hidden_size), dtype='float32')
+
+ train_data_iter = reader_decorator(
+ reader.get_data_iter(train_data, batch_size, num_steps))
+
+ train_data_loader = fluid.io.DataLoader.from_generator(capacity=200)
+ train_data_loader.set_batch_generator(train_data_iter, places=place)
+
+ init_hidden = to_variable(init_hidden_data)
+ init_cell = to_variable(init_cell_data)
+ start_time = time.time()
+ for batch_id, batch in enumerate(train_data_loader):
+ if args.max_iter and total_batch_num == args.max_iter:
+ return
+ batch_start = time.time()
+ x, y = batch
+
+ dy_loss, last_hidden, last_cell = ptb_model(x, y, init_hidden,
+ init_cell)
+ init_hidden = last_hidden.detach()
+ init_cell = last_cell.detach()
+ out_loss = dy_loss.numpy()
+
+ dy_loss.backward()
+ sgd.minimize(dy_loss)
+
+ ptb_model.clear_gradients()
+ total_loss += out_loss
+ batch_end = time.time()
+ train_batch_cost = batch_end - batch_start
+ iters += num_steps
+                total_batch_num = total_batch_num + 1  # benchmark iteration counter
+
+ if batch_id > 0 and batch_id % log_interval == 0:
+ ppl = np.exp(total_loss / iters)
+ print("-- Epoch:[%d]; Batch:[%d]; ppl: %.5f, lr: %.5f, loss: %.5f, batch cost: %.5f" %
+ (epoch_id, batch_id, ppl[0],
+ sgd._global_learning_rate().numpy(), out_loss, train_batch_cost))
+
+ print("one epoch finished", epoch_id)
+ print("time cost ", time.time() - start_time)
+ ppl = np.exp(total_loss / iters)
+ ce_time.append(time.time() - start_time)
+ ce_ppl.append(ppl[0])
+ print("-- Epoch:[%d]; ppl: %.5f" % (epoch_id, ppl[0]))
+
+            if batch_size <= 20 and epoch_id == 0 and ppl[0] > 1000:
+                # With an unlucky random initialization the ppl can still be
+                # above 1000 after the first epoch; restarting is cheaper than
+                # continuing from a bad starting point.
+                print(
+                    "Parameters were badly initialized this run: ppl is still over 1000 after the first epoch."
+                )
+                print("Aborting this training process; please start again.")
+
+ save_model_dir = os.path.join(args.save_model_dir,
+ str(epoch_id), 'params')
+ fluid.save_dygraph(ptb_model.state_dict(), save_model_dir)
+ print("Saved model to: %s.\n" % save_model_dir)
+
+ eval(ptb_model, valid_data)
+
+ if args.ce:
+ _ppl = 0
+ _time = 0
+ try:
+ _time = ce_time[-1]
+ _ppl = ce_ppl[-1]
+            except IndexError:
+                print("ce info error")
+ print("kpis\ttrain_duration_card%s\t%s" % (dev_count, _time))
+ print("kpis\ttrain_ppl_card%s\t%f" % (dev_count, _ppl))
+
+ eval(ptb_model, test_data)
+
+
+if __name__ == '__main__':
+    train_ptb_lm()
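As a standalone sketch (assumed values, not part of the patch), this is how the `boundaries`/`values` lists consumed by `fluid.layers.piecewise_decay` are built in `train_ptb_lm` for the `small` configuration; note that `piecewise_decay` needs exactly one more value than boundary:

```python
# Hyperparameters of the "small" configuration; total_batch_size is a
# hypothetical iterations-per-epoch count.
base_learning_rate = 1.0
lr_decay = 0.5
epoch_start_decay = 4
max_epoch = 13
total_batch_size = 100

bd = []
lr_arr = [base_learning_rate]  # learning rate before the first boundary
for i in range(1, max_epoch):
    bd.append(total_batch_size * i)
    lr_arr.append(base_learning_rate *
                  lr_decay ** max(i + 1 - epoch_start_decay, 0))

# piecewise_decay requires len(values) == len(boundaries) + 1
assert len(lr_arr) == len(bd) + 1
```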
diff --git a/dygraph/resnet/train.py b/dygraph/resnet/train.py
index e92a39bde5bce633dda9452d5c0dad3399092248..5339cadcc88954f63d482f535ad72e5305f30490 100644
--- a/dygraph/resnet/train.py
+++ b/dygraph/resnet/train.py
@@ -81,7 +81,6 @@ def optimizer_setting(parameter_list=None):
boundaries=bd, values=lr),
momentum=momentum_rate,
regularization=fluid.regularizer.L2Decay(l2_decay))
-
return optimizer
@@ -116,11 +115,7 @@ class ConvBNLayer(fluid.dygraph.Layer):
class BottleneckBlock(fluid.dygraph.Layer):
- def __init__(self,
- num_channels,
- num_filters,
- stride,
- shortcut=True):
+ def __init__(self, num_channels, num_filters, stride, shortcut=True):
super(BottleneckBlock, self).__init__()
self.conv0 = ConvBNLayer(
@@ -186,16 +181,9 @@ class ResNet(fluid.dygraph.Layer):
num_filters = [64, 128, 256, 512]
self.conv = ConvBNLayer(
- num_channels=3,
- num_filters=64,
- filter_size=7,
- stride=2,
- act='relu')
+ num_channels=3, num_filters=64, filter_size=7, stride=2, act='relu')
self.pool2d_max = Pool2D(
- pool_size=3,
- pool_stride=2,
- pool_padding=1,
- pool_type='max')
+ pool_size=3, pool_stride=2, pool_padding=1, pool_type='max')
self.bottleneck_block_list = []
for block in range(len(depth)):
@@ -220,11 +208,12 @@ class ResNet(fluid.dygraph.Layer):
import math
stdv = 1.0 / math.sqrt(2048 * 1.0)
- self.out = Linear(self.pool2d_avg_output,
- class_dim,
- act='softmax',
- param_attr=fluid.param_attr.ParamAttr(
- initializer=fluid.initializer.Uniform(-stdv, stdv)))
+ self.out = Linear(
+ self.pool2d_avg_output,
+ class_dim,
+ act='softmax',
+ param_attr=fluid.param_attr.ParamAttr(
+ initializer=fluid.initializer.Uniform(-stdv, stdv)))
def forward(self, inputs):
y = self.conv(inputs)
@@ -237,6 +226,16 @@ class ResNet(fluid.dygraph.Layer):
return y
+def reader_decorator(reader):
+ def __reader__():
+ for item in reader():
+ img = np.array(item[0]).astype('float32').reshape(3, 224, 224)
+ label = np.array(item[1]).astype('int64').reshape(1)
+ yield img, label
+
+ return __reader__
+
+
def eval(model, data):
model.eval()
@@ -245,15 +244,8 @@ def eval(model, data):
total_acc5 = 0.0
total_sample = 0
for batch_id, data in enumerate(data()):
- dy_x_data = np.array(
- [x[0].reshape(3, 224, 224) for x in data]).astype('float32')
- if len(np.array([x[1] for x in data]).astype('int64')) != batch_size:
- continue
- y_data = np.array([x[1] for x in data]).astype('int64').reshape(
- batch_size, 1)
-
- img = to_variable(dy_x_data)
- label = to_variable(y_data)
+ img = data[0]
+ label = data[1]
label.stop_gradient = True
out = model(img)
@@ -303,13 +295,24 @@ def train_resnet():
resnet = fluid.dygraph.parallel.DataParallel(resnet, strategy)
train_reader = paddle.batch(
- paddle.dataset.flowers.train(use_xmap=False), batch_size=batch_size)
+ reader_decorator(paddle.dataset.flowers.train(use_xmap=True)),
+ batch_size=batch_size,
+ drop_last=True)
+
if args.use_data_parallel:
train_reader = fluid.contrib.reader.distributed_batch_reader(
train_reader)
test_reader = paddle.batch(
- paddle.dataset.flowers.test(use_xmap=False), batch_size=batch_size)
+ reader_decorator(paddle.dataset.flowers.test(use_xmap=True)),
+ batch_size=batch_size,
+ drop_last=True)
+
+ train_loader = fluid.io.DataLoader.from_generator(capacity=10)
+ train_loader.set_sample_list_generator(train_reader, places=place)
+
+ test_loader = fluid.io.DataLoader.from_generator(capacity=10)
+ test_loader.set_sample_list_generator(test_reader, places=place)
#file_name = './model/epoch_0.npz'
#model_data = np.load( file_name )
@@ -331,23 +334,13 @@ def train_resnet():
print("load finished")
- for batch_id, data in enumerate(train_reader()):
-
+ for batch_id, data in enumerate(train_loader()):
#NOTE: used in benchmark
if args.max_iter and total_batch_num == args.max_iter:
return
batch_start = time.time()
- dy_x_data = np.array(
- [x[0].reshape(3, 224, 224) for x in data]).astype('float32')
- if len(np.array([x[1]
- for x in data]).astype('int64')) != batch_size:
- continue
- y_data = np.array([x[1] for x in data]).astype('int64').reshape(
- -1, 1)
-
- img = to_variable(dy_x_data)
- label = to_variable(y_data)
+ img, label = data
label.stop_gradient = True
out = resnet(img)
@@ -390,16 +383,14 @@ def train_resnet():
(eop, batch_id, total_loss / total_sample, \
total_acc1 / total_sample, total_acc5 / total_sample))
resnet.eval()
- eval(resnet, test_reader)
+ eval(resnet, test_loader)
save_parameters = (not args.use_data_parallel) or (
args.use_data_parallel and
fluid.dygraph.parallel.Env().local_rank == 0)
if save_parameters:
- fluid.save_dygraph(resnet.state_dict(),
- 'resnet_params')
+ fluid.save_dygraph(resnet.state_dict(), 'resnet_params')
if __name__ == '__main__':
-
train_resnet()
diff --git a/dygraph/se_resnet/train.py b/dygraph/se_resnet/train.py
index 67b9dacf2e07e19e07d466683769641830a6fd36..0ba5de46f83dfcd3f3e820ce11aedc9da88925ff 100644
--- a/dygraph/se_resnet/train.py
+++ b/dygraph/se_resnet/train.py
@@ -169,8 +169,7 @@ class BottleneckBlock(fluid.dygraph.Layer):
act=None)
self.scale = SqueezeExcitation(
- num_channels=num_filters * 2,
- reduction_ratio=reduction_ratio)
+ num_channels=num_filters * 2, reduction_ratio=reduction_ratio)
if not shortcut:
self.short = ConvBNLayer(
@@ -219,10 +218,7 @@ class SeResNeXt(fluid.dygraph.Layer):
stride=2,
act='relu')
self.pool = Pool2D(
- pool_size=3,
- pool_stride=2,
- pool_padding=1,
- pool_type='max')
+ pool_size=3, pool_stride=2, pool_padding=1, pool_type='max')
elif layers == 101:
cardinality = 32
reduction_ratio = 16
@@ -235,10 +231,7 @@ class SeResNeXt(fluid.dygraph.Layer):
stride=2,
act='relu')
self.pool = Pool2D(
- pool_size=3,
- pool_stride=2,
- pool_padding=1,
- pool_type='max')
+ pool_size=3, pool_stride=2, pool_padding=1, pool_type='max')
elif layers == 152:
cardinality = 64
reduction_ratio = 16
@@ -263,10 +256,7 @@ class SeResNeXt(fluid.dygraph.Layer):
stride=1,
act='relu')
self.pool = Pool2D(
- pool_size=3,
- pool_stride=2,
- pool_padding=1,
- pool_type='max')
+ pool_size=3, pool_stride=2, pool_padding=1, pool_type='max')
self.bottleneck_block_list = []
num_channels = 64
@@ -294,10 +284,11 @@ class SeResNeXt(fluid.dygraph.Layer):
self.pool2d_avg_output = num_filters[len(num_filters) - 1] * 2 * 1 * 1
- self.out = Linear(self.pool2d_avg_output,
- class_dim,
- param_attr=fluid.param_attr.ParamAttr(
- initializer=fluid.initializer.Uniform(-stdv, stdv)))
+ self.out = Linear(
+ self.pool2d_avg_output,
+ class_dim,
+ param_attr=fluid.param_attr.ParamAttr(
+ initializer=fluid.initializer.Uniform(-stdv, stdv)))
def forward(self, inputs):
if self.layers == 50 or self.layers == 101:
@@ -318,6 +309,16 @@ class SeResNeXt(fluid.dygraph.Layer):
return y
+def reader_decorator(reader):
+ def __reader__():
+ for item in reader():
+ img = np.array(item[0]).astype('float32').reshape(3, 224, 224)
+ label = np.array(item[1]).astype('int64').reshape(1)
+ yield img, label
+
+ return __reader__
+
+
def eval(model, data):
model.eval()
@@ -327,15 +328,7 @@ def eval(model, data):
total_acc5 = 0.0
total_sample = 0
for batch_id, data in enumerate(data()):
- dy_x_data = np.array(
- [x[0].reshape(3, 224, 224) for x in data]).astype('float32')
- if len(np.array([x[1] for x in data]).astype('int64')) != batch_size:
- continue
- y_data = np.array([x[1] for x in data]).astype('int64').reshape(
- batch_size, 1)
-
- img = to_variable(dy_x_data)
- label = to_variable(y_data)
+ img, label = data
label.stop_gradient = True
out = model(img)
@@ -389,29 +382,29 @@ def train():
se_resnext = fluid.dygraph.parallel.DataParallel(se_resnext,
strategy)
train_reader = paddle.batch(
- paddle.dataset.flowers.train(use_xmap=False),
+ reader_decorator(paddle.dataset.flowers.train(use_xmap=False)),
batch_size=batch_size,
drop_last=True)
if args.use_data_parallel:
train_reader = fluid.contrib.reader.distributed_batch_reader(
train_reader)
test_reader = paddle.batch(
- paddle.dataset.flowers.test(use_xmap=False), batch_size=32)
+ reader_decorator(paddle.dataset.flowers.test(use_xmap=False)),
+ batch_size=32)
+
+ train_loader = fluid.io.DataLoader.from_generator(capacity=10)
+ train_loader.set_sample_list_generator(train_reader, places=place)
+
+ test_loader = fluid.io.DataLoader.from_generator(capacity=10)
+ test_loader.set_sample_list_generator(test_reader, places=place)
for epoch_id in range(epoch_num):
total_loss = 0.0
total_acc1 = 0.0
total_acc5 = 0.0
total_sample = 0
- for batch_id, data in enumerate(train_reader()):
-
- dy_x_data = np.array([x[0].reshape(3, 224, 224)
- for x in data]).astype('float32')
- y_data = np.array([x[1] for x in data]).astype('int64').reshape(
- batch_size, 1)
-
- img = to_variable(dy_x_data)
- label = to_variable(y_data)
+ for batch_id, data in enumerate(train_loader()):
+ img, label = data
label.stop_gradient = True
out = se_resnext(img)
@@ -454,7 +447,7 @@ def train():
(epoch_id, batch_id, total_loss / total_sample, \
total_acc1 / total_sample, total_acc5 / total_sample))
se_resnext.eval()
- eval(se_resnext, test_reader)
+ eval(se_resnext, test_loader)
se_resnext.train()
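The `reader_decorator` added to both training scripts can be exercised in isolation. This sketch feeds it a toy reader (a hypothetical stand-in for `paddle.dataset.flowers`) and checks the fixed shapes and dtypes that `DataLoader.from_generator` will receive per sample:

```python
import numpy as np

def reader_decorator(reader):
    # same pattern as in the patch: normalize each (image, label) pair
    # into fixed-shape numpy arrays before batching
    def __reader__():
        for item in reader():
            img = np.array(item[0]).astype('float32').reshape(3, 224, 224)
            label = np.array(item[1]).astype('int64').reshape(1)
            yield img, label

    return __reader__

def toy_reader():
    # hypothetical stand-in for paddle.dataset.flowers.train()
    for i in range(2):
        yield np.zeros(3 * 224 * 224, dtype='float32'), i

samples = list(reader_decorator(toy_reader)())
```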